What is a paired t-test?

A paired t-test, also known as a dependent t-test or paired samples t-test, is a statistical method used to compare the means of two related groups to determine whether there is a significant difference between them. This test is particularly useful when the same subjects are measured under two different conditions or at two different points in time, making it different from an independent t-test, which compares two separate groups. The paired t-test is commonly used in experimental and observational studies where measurements are taken before and after an intervention, such as testing the effectiveness of a drug by comparing patients’ health indicators before and after treatment, or assessing the impact of a training program on employees’ performance by measuring their scores before and after training. The fundamental assumption behind this test is that the two sets of observations are not independent but instead related, making it necessary to evaluate the difference between paired observations rather than comparing the raw values themselves.

Mathematically, the paired t-test is based on computing the differences between the paired observations and analyzing whether the mean of these differences ($\bar{d}$) is significantly different from zero. The test statistic ($t$) is calculated using the formula:

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

where $\bar{d}$ represents the mean of the differences, $s_d$ is the standard deviation of these differences, and $n$ is the number of pairs. The standard deviation is essential in assessing the variability of differences and ensuring the statistical robustness of the test. The null hypothesis ($H_0$) in a paired t-test states that there is no significant difference between the two related samples ($\mu_d = 0$), meaning any observed difference is likely due to random chance. The alternative hypothesis ($H_a$) suggests that there is a significant difference between the two sets of measurements ($\mu_d \neq 0$ for a two-tailed test, or $\mu_d > 0$ or $\mu_d < 0$ for one-tailed tests).

To conduct a paired t-test, the process begins with collecting paired data and computing the differences. Then, the mean and standard deviation of these differences are calculated, followed by determining the test statistic. This calculated t-value is compared with a critical value from the t-distribution table, or alternatively, a p-value is obtained. If the absolute value of the test statistic exceeds the critical value, or if the p-value is smaller than the chosen significance level (commonly 0.05), the null hypothesis is rejected, indicating that the difference between the paired samples is statistically significant. A key assumption of the paired t-test is that the differences between paired observations should be normally distributed, though this assumption can often be relaxed when the sample size is sufficiently large due to the Central Limit Theorem. Additionally, the test assumes that the data are measured on an interval or ratio scale and that the pairs are independent of one another. The two observations within a pair are related by design; it is one pair that must not influence another.
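The procedure described above can be sketched in a few lines of Python using only the standard library. The function name `paired_t_statistic` and the blood pressure readings are invented for illustration; they are not taken from a real study. The differences are taken as second minus first, which is a convention chosen here, not a requirement of the test.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, df) for paired samples, using differences second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)        # sample standard deviation (n - 1)
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Hypothetical readings (mmHg), made up purely to exercise the function.
before = [140, 135, 150, 145, 138]
after = [132, 130, 141, 140, 135]

t_stat, df = paired_t_statistic(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t value would then be compared with a critical value for `df` degrees of freedom, or converted to a p-value, as the paragraph above describes.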

A real-world example of a paired t-test could involve evaluating the effect of a new educational technique on student performance. Suppose a teacher administers a pre-test to a group of students before introducing a new teaching method and then gives them a post-test after implementing the method. By applying a paired t-test to analyze the difference in scores, the teacher can determine whether the new method significantly improved student learning. Similarly, in medical research, a paired t-test can assess whether a new medication reduces blood pressure by measuring patients’ blood pressure before and after taking the drug.

In summary, the paired t-test is a powerful statistical tool for analyzing differences in related groups by considering the change in measurements rather than absolute values. Its effectiveness lies in eliminating variability that might arise from differences between individuals, allowing researchers to focus solely on the impact of an intervention or treatment. However, it is crucial to check its assumptions before application, particularly the normality of differences and the independence of the pairs from one another. When used correctly, the paired t-test provides meaningful insights into whether a change has occurred due to a specific factor, making it a widely used technique in fields like psychology, medicine, education, and business analytics.

When to use the test

A paired t-test is used when you want to compare the means of two related groups to determine if there is a significant difference between them. The key aspect of this test is that the two groups are not independent; instead, they are paired or matched in some way. Below are the common scenarios where a paired t-test is appropriate:

1. Before-and-After Measurements (Repeated Measures on the Same Subjects)

One of the most common applications of a paired t-test is when the same subjects are measured before and after an intervention. Since the measurements come from the same individuals, a paired t-test is ideal for detecting changes over time.
Example:

  • Measuring students’ test scores before and after a new teaching method to determine if it improved their performance.
  • Measuring patients’ blood pressure before and after administering a drug to check its effectiveness.

 

2. Comparing Two Conditions on the Same Subjects

A paired t-test is used when the same subjects are tested under two different conditions, and we want to see if there is a significant difference between them.
Example:

  • Measuring reaction times with and without caffeine in the same group of participants.
  • Comparing muscle strength in the right and left arm of the same individuals.

3. Matched Pairs Design (Related or Matched Subjects)

Sometimes, subjects are not identical but are closely matched based on certain characteristics, such as age, gender, or genetic background. This is common in twin studies or when researchers match participants based on similar attributes to reduce variability.
Example:

  • Comparing the performance of identical twins, where one twin is exposed to a treatment and the other is not.
  • Studying spouses’ stress levels before and after a major life event.

 

4. Natural Pairings in Data

A paired t-test is appropriate when there is a natural pairing in the data, where each observation in one group has a direct counterpart in the other group.
Example:

  • Comparing the cholesterol levels of individuals before and after a diet plan.
  • Measuring employee productivity levels before and after implementing a new workflow system.

 

Using the paired t-test

The sections below discuss what is needed to perform the test, checking our data, how to perform the test and statistical details.

What do we need?

For the paired t-test, we need two variables. One variable defines the pairs for the observations. The second variable is a measurement. Sometimes, we already have the paired differences for the measurement variable. Other times, we have separate variables for “before” and “after” measurements  for each pair and need to calculate the differences.
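As a small illustration of this data layout, the sketch below stores one record per pair (an identifier plus the two measurements) and derives the differences. The three records reuse the first three rows of the exam score example in this article; the variable names are illustrative.

```python
# Each record is (pair identifier, first measurement, second measurement).
records = [
    ("Bob", 63, 69),
    ("Nina", 65, 65),
    ("Tim", 56, 62),
]

# One difference per pair: second measurement minus first measurement.
differences = {name: second - first for name, first, second in records}
print(differences)
```

Keeping the identifier next to both measurements makes it harder to mix up which "before" value belongs to which "after" value.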

We also have an idea, or hypothesis, that the mean difference between the paired measurements is zero. Here are three examples:

  • A group of people with dry skin use a medicated lotion on one arm and a non-medicated lotion on their other arm. After a week, a doctor measures the redness on each arm. We want to know if the medicated lotion is better than the non-medicated lotion. We do this by finding out if the arm with medicated lotion has less redness than the other arm. Since we have pairs of measurements for each person, we find the differences. Then we test if the mean difference is zero or not.
  • We measure weights of people in a program to quit smoking. For each person, we have the weight at the start and end of the program. We want to know if the mean weight change for people in the program is zero or not.
  • An instructor gives students an exam and the next day gives students a different exam on the same material. The instructor wants to know if the two exams are equally difficult. We calculate the difference in exam scores for each student. We test if the mean difference is zero or not.

Paired t-test assumptions

To apply the paired t-test to test for differences between paired measurements, the following assumptions need to hold:

  • Subjects must be independent. Measurements for one subject do not affect measurements for any other subject.
  • Each of the paired measurements must be obtained from the same subject. For example, the before-and-after weight for a smoker in the example above must be from the same person.
  • The measured differences are normally distributed.

Paired t-test example

An instructor wants to use two exams in her classes next year. This year, she gives both exams to the students. She wants to know if the exams are equally difficult and wants to check this by looking at the differences between scores. If the mean difference between scores for students is “close enough” to zero, she will make a practical conclusion that the exams are equally difficult. Here is the data:

 

Student    Exam 1 Score    Exam 2 Score    Difference (Exam 2 minus Exam 1)
Bob        63              69              6
Nina       65              65              0
Tim        56              62              6
Kate       100             91              -9
Alonzo     88              78              -10
Jose       83              87              4
Nikhil     77              79              2
Julia      92              88              -4
Tohru      90              85              -5
Michael    84              92              8
Jean       68              69              1
Indra      74              81              7
Susan      87              84              -3
Allen      64              75              11
Paul       71              84              13
Edwina     88              82              -6

 

If you look at the table above, you see that some of the score differences are positive and some are negative. You might think that the two exams are equally difficult. Other people might disagree. The statistical test gives a common way to make the decision, so that everyone makes the same decision on the same data.

Checking the data

Let’s start by answering: Is the paired t-test an appropriate method to evaluate the difference in difficulty between the two exams?

  • Subjects are independent. Each student does their own work on the two exams.
  • Each of the paired measurements is obtained from the same subject. Each student takes both tests.
  • The differences are normally distributed. For now, we will assume this is true. We will test this later.

We decide that we have selected a valid analysis method.

Before jumping into the analysis, we should plot the data. The figure below shows a histogram and summary statistics for the score differences.

From the histogram, we see that there are no very unusual points, or outliers. The data are roughly bell-shaped, so our idea of a normal distribution for the differences seems reasonable.
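For readers without the figure at hand, the following sketch reproduces the same kind of check in plain Python: summary statistics for the 16 score differences from the table, plus a rough text histogram. The bin width of 5 points is an arbitrary choice made here for illustration.

```python
import statistics
from collections import Counter

# Score differences (Exam 2 minus Exam 1) from the table above.
differences = [6, 0, 6, -9, -10, 4, 2, -4, -5, 8, 1, 7, -3, 11, 13, -6]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)   # sample standard deviation
print(f"n = {n}, mean = {mean_diff:.2f}, sd = {sd_diff:.2f}")
print(f"min = {min(differences)}, max = {max(differences)}")

# Group the differences into bins [start, start + bin_width).
bin_width = 5
bins = Counter((d // bin_width) * bin_width for d in differences)
for start in sorted(bins):
    print(f"[{start:>4}, {start + bin_width:>4})  {'*' * bins[start]}")
```

A text histogram like this is only a quick look; it is enough to spot an obviously skewed shape or an isolated extreme value.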

How to perform the paired t-test

We’ll further explain the principles underlying the paired t-test in the Statistical Details section below, but let’s first proceed through the steps from beginning to end. We start by calculating our test statistic. To accomplish this, we need the average difference, the standard deviation of the difference and the sample size. These are shown in Figure 1 above. (Note that the statistics are rounded to two decimal places below. Software will usually display more decimal places and use them in calculations.)

The average score difference is:

$$\bar{d} = 1.31$$

Next, we calculate the standard error for the score difference. The calculation is:

$$\text{Standard Error} = \frac{s_d}{\sqrt{n}} = \frac{7.00}{\sqrt{16}} = \frac{7.00}{4} = 1.75$$

In the formula above, $n$ is the number of students, which is the number of differences. The standard deviation of the differences is $s_d$.

We now have the pieces for our test statistic. We calculate our test statistic as:

$$t = \frac{\text{Average difference}}{\text{Standard Error}} = \frac{1.31}{1.75} = 0.750$$
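The calculation above can be checked with a short script. This is a sketch using only the Python standard library and the exam scores from the table; the variable names are illustrative. Because it works with unrounded intermediate values, the result matches the 0.750 shown above rather than the slightly smaller value obtained by dividing the rounded figures.

```python
import math
import statistics

# Exam scores in table order (Bob, Nina, Tim, ..., Edwina).
exam1 = [63, 65, 56, 100, 88, 83, 77, 92, 90, 84, 68, 74, 87, 64, 71, 88]
exam2 = [69, 65, 62, 91, 78, 87, 79, 88, 85, 92, 69, 81, 84, 75, 84, 82]

differences = [e2 - e1 for e1, e2 in zip(exam1, exam2)]
n = len(differences)

mean_diff = statistics.mean(differences)        # average difference
sd_diff = statistics.stdev(differences)         # sample standard deviation
standard_error = sd_diff / math.sqrt(n)
t_stat = mean_diff / standard_error

print(f"mean difference = {mean_diff:.2f}")
print(f"standard error  = {standard_error:.2f}")
print(f"t               = {t_stat:.3f}")
```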

To make our decision, we compare the test statistic to a value from the t-distribution. This activity involves four steps:

  1. We decide on the risk we are willing to take for declaring a difference when there is not a difference. For the exam score data, we decide that we are willing to take a 5% risk of saying that the unknown mean exam score difference is not zero when in reality it is zero. In statistics-speak, we set the significance level, denoted by α, to 0.05. It’s a good practice to make this decision before collecting the data and before calculating test statistics.
  2. We calculate a test statistic. Our test statistic is 0.750.
  3. We find the value from the t-distribution. Most statistics books have look-up tables for the distribution. You can also find tables online. The most likely situation is that you will use software for your analysis and will not use printed tables.

    To find this value, we need the significance level (α = 0.05) and the degrees of freedom. The degrees of freedom (df) are based on the sample size. For the exam score data, this is:

    df=n−1=16−1=15

    The t value with α = 0.05 and 15 degrees of freedom is 2.131.

  4. We compare the value of our statistic (0.750) to the t value. Because 0.750 < 2.131, we cannot reject our idea that the mean score difference is zero. We make a practical conclusion to consider exams as equally difficult.
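The comparison in step 4 can be written as a small decision rule. In this sketch the function name `paired_t_decision` is made up for illustration, and the critical value 2.131 is simply the two-sided table value quoted above, not something the code computes. Note that the rule uses the absolute value of the test statistic, so a large negative t also leads to rejection.

```python
def paired_t_decision(t_stat, critical_value):
    """Two-sided decision: reject H0 only if |t| exceeds the critical value."""
    if abs(t_stat) > critical_value:
        return "reject H0"
    return "fail to reject H0"


n = 16                   # number of students (pairs)
df = n - 1               # degrees of freedom
t_stat = 0.750           # test statistic from step 2
critical_value = 2.131   # table value for alpha = 0.05 (two-sided), df = 15

decision = paired_t_decision(t_stat, critical_value)
print(f"df = {df}: {decision}")
```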

Statistical details

Let’s look at the exam score data and the paired t-test using statistical terms.

Our null hypothesis is that the population mean of the differences is zero. The null hypothesis is written as:

$$H_0: \mu_d = 0$$

The alternative hypothesis is that the population mean of the differences is not zero. This is written as:

$$H_a: \mu_d \neq 0$$

We calculate the standard error as:

$$\text{Standard Error} = \frac{s_d}{\sqrt{n}}$$

The formula shows the sample standard deviation of the differences as $s_d$ and the sample size as $n$.

The test statistic is calculated as:

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

We compare the test statistic to a t value with our chosen alpha value and the degrees of freedom for our data. In our exam score data example, we set α = 0.05. The degrees of freedom (df) are based on the sample size and are calculated as:

df=n−1=16−1=15

Statisticians write the t value with α = 0.05 and 15 degrees of freedom as:

$$t_{0.05,\,15}$$

The t value with α = 0.05 and 15 degrees of freedom is 2.131. There are two possible results from our comparison:

  • The absolute value of the test statistic is lower than the t value. You fail to reject the hypothesis that the mean difference is zero. The practical conclusion made by the instructor is that the two tests are equally difficult. Next year, she can use both exams and give half the students one exam and half the other exam.
  • The absolute value of the test statistic is higher than the t value. You reject the hypothesis that the mean difference is zero. The practical conclusion made by the instructor is that the tests are not of equal difficulty. She must use the same exam for all students.

Testing for normality

The normality assumption is more important for small sample sizes than for larger sample sizes.

Normal distributions are symmetric, which means they are equal on both sides of the center. Normal distributions do not have extreme values, or outliers. You can check these two features of a normal distribution with graphs. Earlier, we decided that the distribution of exam score differences was “close enough” to normal to go ahead with the assumption of normality. The figure below shows a normal quantile plot for the data and supports our decision.
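The coordinates of a normal quantile plot can be computed without plotting software. The sketch below pairs each sorted difference with the standard normal quantile at plotting position (i + 0.5) / n, which is one common convention and an assumption made here; other software may use slightly different positions. The correlation between the two columns is used as a rough numeric summary of how straight the plot is.

```python
import math
import statistics

# Score differences (Exam 2 minus Exam 1) from the exam example.
differences = [6, 0, 6, -9, -10, 4, 2, -4, -5, 8, 1, 7, -3, 11, 13, -6]

n = len(differences)
ordered = sorted(differences)
standard_normal = statistics.NormalDist()   # mean 0, standard deviation 1
theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson(theoretical, ordered)
for z, d in zip(theoretical, ordered):
    print(f"{z:6.2f}  {d:4d}")
print(f"correlation with normal quantiles: {r:.3f}")
```

Points that fall close to a straight line (a correlation near 1) are consistent with the normality assumption; strong curvature or isolated points far from the line would be a warning sign.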

You can also perform a formal test for normality using software. Figure 3 below shows results of testing for normality with JMP. We test the distribution of the score differences. We cannot reject the hypothesis of a normal distribution. We can go ahead with the paired t-test.

Understanding p-values

Using a visual, you can check to see if your test statistic is a more extreme value in the distribution. The t-distribution is similar to a normal distribution. The figure below shows a t-distribution with 15 degrees of freedom.

Since our test is two-sided and we set α = 0.05, the figure shows that the values −2.131 and 2.131 each “cut off” 2.5% of the distribution, one in each of the two tails. Only 5% of the distribution overall is further out in the tails than these two values.

Figure 5 shows where our result falls on the graph. You can see that the test statistic (0.75) is not far enough “out in the tail” to reject the hypothesis of a mean difference of zero.

Putting it all together with software

To perform the paired t-test in the real world, you are likely to use software most of the time. The figure below shows results for the paired t-test for the exam score data using JMP.

The software shows results for a two-sided test (Prob > |t|) and for one-sided tests. The two-sided test is what we want. Our null hypothesis is that the mean difference between the paired exam scores is zero. Our alternative hypothesis is that the mean difference is not equal to zero.

The software shows a p-value of 0.4650 for the two-sided test. This means that, when the underlying population mean difference is zero, the likelihood of seeing a sample average difference at least as far from zero as 1.31, in either direction, is about 47 chances out of 100. We feel confident in our decision not to reject the null hypothesis. The instructor can go ahead with her plan to use both exams next year, and give half the students one exam and half the other exam.
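To see where such a p-value comes from, the two-sided tail area of the t-distribution can be approximated numerically. The sketch below integrates the t density with composite Simpson's rule using only the standard library. This is an illustrative approximation, not the method any particular statistics package uses, and the function names are made up here.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t|) by integrating the density over [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t           # the density is symmetric about zero


p_exam = two_sided_p_value(0.7498, 15)      # unrounded t for the exam data
p_critical = two_sided_p_value(2.131, 15)   # tail area beyond the critical value
print(f"p-value for the exam data: {p_exam:.3f}")
print(f"tail area beyond 2.131:    {p_critical:.3f}")
```

The second call is a consistency check: the area beyond the critical value should come out close to the chosen significance level of 0.05.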


Frequently Asked Questions

If your sample size is very small, it is hard to test for normality. In this situation, you need to use your understanding of the measurements. For example, for the test scores data, the instructor knows that the underlying distribution of score differences is normally distributed. Even for a very small sample, the instructor would likely go ahead with the t-test and assume normality.

What if you know the underlying measurements are not normally distributed? Or what if your sample size is large and the test for normality is rejected? In this situation, you can use nonparametric analyses. These types of analyses do not depend on an assumption that the data values are from a specific distribution. For the paired t-test, a nonparametric test is the Wilcoxon signed-rank test.
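As a rough sketch of how the Wilcoxon signed-rank test works, the code below computes its rank sums for the exam score differences. It assumes two common conventions: zero differences are dropped, and tied absolute values share their average rank. It stops at the rank sums and does not compute a p-value, which in practice would come from statistical software.

```python
# Score differences (Exam 2 minus Exam 1) from the exam example.
differences = [6, 0, 6, -9, -10, 4, 2, -4, -5, 8, 1, 7, -3, 11, 13, -6]

nonzero = [d for d in differences if d != 0]   # zero differences are dropped
ordered = sorted(nonzero, key=abs)

# Assign ranks to absolute values; tied values share the average rank.
ranks = {}
i = 0
while i < len(ordered):
    j = i
    while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
        j += 1
    ranks[abs(ordered[i])] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
    i = j

w_plus = sum(ranks[abs(d)] for d in nonzero if d > 0)    # ranks of positive differences
w_minus = sum(ranks[abs(d)] for d in nonzero if d < 0)   # ranks of negative differences
print(f"W+ = {w_plus}, W- = {w_minus}")
```

If the two exams were equally difficult, the two rank sums would be expected to be roughly equal; a large imbalance is evidence of a shift in one direction.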

A paired t-test is used for dependent (related) groups, where each subject has two related measurements. In contrast, an independent t-test is used when comparing two independent groups (e.g., two different groups of people receiving different treatments).

 

The test statistic ($t$) is calculated as:

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

Where:

  • $\bar{d}$ = Mean of the differences between paired values
  • $s_d$ = Standard deviation of the differences
  • $n$ = Number of pairs

 

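  • If the p-value is less than 0.05 (the chosen significance level), reject the null hypothesis. This means there is a statistically significant difference between the two paired groups.
  • If the p-value is 0.05 or greater, fail to reject the null hypothesis. This means the data do not provide enough evidence of a difference between the paired groups; it does not prove that the groups are the same.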

 

You can perform a paired t-test using statistical software such as:

  • Excel (Data Analysis ToolPak)
  • SPSS (Analyze → Compare Means → Paired-Samples t-test)
  • R (t.test(x, y, paired = TRUE))
  • Python (scipy.stats.ttest_rel(x, y))
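As an example of the Python option, the sketch below runs `scipy.stats.ttest_rel` on the exam score data from this article. It assumes SciPy is installed. `ttest_rel(a, b)` works on the differences a minus b, so passing Exam 2 first keeps the sign consistent with the differences used in the worked example.

```python
from scipy import stats

# Exam scores in table order (Bob, Nina, Tim, ..., Edwina).
exam1 = [63, 65, 56, 100, 88, 83, 77, 92, 90, 84, 68, 74, 87, 64, 71, 88]
exam2 = [69, 65, 62, 91, 78, 87, 79, 88, 85, 92, 69, 81, 84, 75, 84, 82]

# Two-sided paired t-test on the differences exam2 - exam1.
result = stats.ttest_rel(exam2, exam1)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```

The reported statistic and p-value should agree with the hand calculation and the software output discussed earlier, up to rounding.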

 
