25 Paired Samples t-test

The paired samples t-test, also known as the dependent samples t-test or repeated measures t-test, is a statistical method for comparing the means of two related sets of measurements. It applies when the data consist of matched pairs of similar units or when the same unit is measured at two different times.

25.0.1 Key Features and Applications

The paired sample t-test is commonly used in situations such as:

Comparing the before and after effects of a treatment on the same subjects.
Measuring performance on two different occasions.
Comparing two different treatments on the same subjects in a crossover study.

25.0.2 Assumptions

To properly conduct a paired sample t-test, the data must meet the following assumptions:

Paired Data: The observations are collected in pairs, such as pre-test and post-test measurements or measurements of the same subjects under two different conditions.
Normality: The differences between the paired observations should be approximately normally distributed. This assumption can be checked with a histogram or Q-Q plot of the differences, or with a normality test such as the Shapiro-Wilk test.
Scale of Measurement: The variable being tested should be continuous and measured at least at the interval level.
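
Note that the normality assumption concerns the pairwise differences, not the two raw samples. A minimal sketch of that check using `scipy.stats.shapiro` follows; the measurements are made-up illustrative values, and the 0.05 cutoff is an assumed convention rather than part of the test itself.

```python
from scipy.stats import shapiro

# Hypothetical paired measurements (illustrative values only)
first = [12.1, 14.3, 11.8, 15.2, 13.7, 12.9, 14.8, 13.1]
second = [11.4, 13.9, 11.9, 14.1, 13.0, 12.2, 14.5, 12.0]

# The assumption applies to the pairwise differences
differences = [a - b for a, b in zip(first, second)]

statistic, p_value = shapiro(differences)

# A small p-value suggests the differences depart from normality
if p_value < 0.05:
    print("Evidence against normality of the differences")
else:
    print("No evidence against normality of the differences")
```

With very few pairs the Shapiro-Wilk test has little power, so a plot of the differences is a useful companion to the test.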

25.0.3 Hypotheses

The hypotheses for a paired sample t-test are as follows:

Null Hypothesis (H₀): The mean difference between the paired observations is zero (\(\mu_d = 0\)).
Alternative Hypothesis (H₁): The mean difference between the paired observations is not zero (\(\mu_d \neq 0\)). This can be tailored to a one-tailed test if a specific direction is hypothesized (\(\mu_d > 0\) or \(\mu_d < 0\)).
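
As a rough illustration of the two-tailed versus one-tailed choice, the sketch below uses the `alternative` argument of `scipy.stats.ttest_rel`. The scores are hypothetical, and the difference is taken as first minus second, so `alternative="greater"` corresponds to \(\mu_d > 0\).

```python
from scipy.stats import ttest_rel

# Hypothetical paired scores (illustrative values only)
first = [12.1, 14.3, 11.8, 15.2, 13.7, 12.9]
second = [11.4, 13.9, 11.9, 14.1, 13.0, 12.2]

# Two-tailed: H1 is mu_d != 0
two_sided = ttest_rel(first, second, alternative="two-sided")

# One-tailed: H1 is mu_d > 0, where d = first - second
greater = ttest_rel(first, second, alternative="greater")

print(two_sided.pvalue, greater.pvalue)
```

The t-statistic is the same in both cases; only the p-value changes. When the observed mean difference lies in the hypothesized direction, the one-tailed p-value is half the two-tailed one.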

25.0.4 Formulae

Mean Difference (\(\bar{d}\)):

The mean difference is calculated by taking the average of the differences between all paired observations.

\[ \bar{d} = \frac{1}{n} \sum_{i=1}^n (x_{i1} - x_{i2}) \]

Here, \(x_{i1}\) and \(x_{i2}\) are the measurements from the first and second condition for the \(i\)th pair, and \(n\) is the number of pairs.

Standard Deviation of the Differences (\(s_d\)):

This measures the variability of the differences between the paired observations.

\[ s_d = \sqrt{\frac{\sum_{i=1}^n (d_i - \bar{d})^2}{n-1}} \]

Here, \(d_i = x_{i1} - x_{i2}\) represents the difference for each pair.

t-Statistic:

The t-statistic is calculated to determine if the differences are statistically significant.

\[ t = \frac{\bar{d}}{s_d / \sqrt{n}} \]

This formula represents the ratio of the mean difference to the standard error of the mean difference.
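
The three formulas above can be sketched directly in code. The function name `paired_t_statistic` and the input values are hypothetical, chosen only for illustration; only the Python standard library is used.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (mean difference, s_d, t) for two equal-length paired samples."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # d_i = x_i1 - x_i2 for each pair
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    d_bar = statistics.mean(differences)
    s_d = statistics.stdev(differences)  # sample SD, n - 1 in the denominator
    t = d_bar / (s_d / math.sqrt(n))     # mean difference over its standard error
    return d_bar, s_d, t


# Hypothetical data (illustrative values only)
d_bar, s_d, t = paired_t_statistic([10, 12, 9, 11], [8, 11, 9, 9])
print(d_bar, s_d, t)
```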

25.0.5 Calculation of Degrees of Freedom

The degrees of freedom for the paired sample t-test are \(n - 1\), where \(n\) is the number of pairs.

25.0.6 Interpretation

To decide whether to reject the null hypothesis, compare the calculated t-value with the critical t-value from the t-distribution with \(n - 1\) degrees of freedom at the chosen significance level (\(\alpha\)), typically 0.05. For a two-tailed test, if the absolute value of the t-statistic is greater than the critical value, the null hypothesis is rejected, indicating a statistically significant mean difference between the paired measurements.
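
Instead of reading the critical value from a printed table, it can be computed from the t-distribution. The sketch below uses `scipy.stats.t.ppf`; the helper names `critical_t` and `reject_null` are hypothetical, and the 4 degrees of freedom are just an example value.

```python
from scipy.stats import t as t_dist


def critical_t(alpha, df):
    """Two-tailed critical value: the 1 - alpha/2 quantile of the t-distribution."""
    return t_dist.ppf(1 - alpha / 2, df)


def reject_null(t_statistic, alpha, df):
    """Two-tailed decision rule: reject H0 when |t| exceeds the critical value."""
    return abs(t_statistic) > critical_t(alpha, df)


print(round(critical_t(0.05, 4), 3))  # 2.776
```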

This test is particularly valuable for detecting changes in conditions or treatments when the same subjects are observed under both scenarios, as it effectively accounts for variability between subjects.
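
To see what accounting for between-subject variability means in practice, the sketch below runs both a paired test and, for contrast only, an independent two-sample test on the same hypothetical data. The values are invented so that subjects differ widely from one another while each subject changes by a similar small amount.

```python
from scipy.stats import ttest_ind, ttest_rel

# Hypothetical data: large differences between subjects,
# small and consistent change within each subject
before = [60, 75, 82, 91, 68]
after = [58, 74, 79, 89, 67]

paired = ttest_rel(before, after)       # works on within-subject differences
independent = ttest_ind(before, after)  # ignores the pairing (contrast only)

print(f"paired p-value:      {paired.pvalue:.4f}")
print(f"independent p-value: {independent.pvalue:.4f}")
```

On data like these, the paired test detects the change while the independent test does not, because the spread between subjects swamps the small shift in means. The independent test is not appropriate for paired data; it appears here only to show what the pairing buys.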

25.0.7 Paired Samples t-test Example Problem

A nutritionist wants to test the effectiveness of a new diet program. To do this, they measure the weight of 5 participants before starting the program and again after 6 weeks on the program. The goal is to see if there is a significant change in weight due to the diet.

Participant Weights (kg) Before the Diet: 70, 72, 75, 80, 78
Participant Weights (kg) After the Diet: 68, 70, 74, 77, 76

Hypotheses:

Null Hypothesis (H₀): The mean difference in weight before and after the diet is zero (\(\mu_d = 0\)).
Alternative Hypothesis (H₁): The mean difference in weight before and after the diet is not zero (\(\mu_d \neq 0\)).

Significance Level:

We will use a significance level (\(\alpha\)) of 0.05.

The steps below work through the paired samples t-test by hand, using the weights recorded before and after the diet.

Calculate the differences for each participant:

\[ \begin{aligned} d_1 & = 70 - 68 = 2 \\ d_2 & = 72 - 70 = 2 \\ d_3 & = 75 - 74 = 1 \\ d_4 & = 80 - 77 = 3 \\ d_5 & = 78 - 76 = 2 \end{aligned} \]

Differences: \(d = [2, 2, 1, 3, 2]\)

Calculate the mean difference (\(\bar{d}\)):

\[ \bar{d} = \frac{2 + 2 + 1 + 3 + 2}{5} = \frac{10}{5} = 2 \text{ kg} \]

Calculate the standard deviation of the differences (\(s_d\)):

First, calculate the squared deviations from the mean:

\[ \begin{aligned} (2 - 2)^2 & = 0 \\ (2 - 2)^2 & = 0 \\ (1 - 2)^2 & = 1 \\ (3 - 2)^2 & = 1 \\ (2 - 2)^2 & = 0 \end{aligned} \]

Sum of squared deviations:

\[ 0 + 0 + 1 + 1 + 0 = 2 \]

Now, calculate \(s_d\):

\[ s_d = \sqrt{\frac{2}{4}} = \sqrt{0.5} \approx 0.707 \text{ kg} \]

Calculate the t-statistic:

Use the formula for the t-statistic with \(n = 5\) (number of participants):

\[ t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{2}{0.707 / \sqrt{5}} = \frac{2}{0.707 / 2.236} \approx \frac{2}{0.3162} \approx 6.325 \]
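
The hand calculation above rounds \(s_d\) and \(\sqrt{n}\) to three decimals, so its result can differ from software output in the last digit. A quick check, using only the Python standard library, compares the full-precision value with the one obtained from rounded intermediates:

```python
import math
import statistics

differences = [2, 2, 1, 3, 2]  # differences from the worked example
n = len(differences)

d_bar = statistics.mean(differences)
s_d = statistics.stdev(differences)  # sample SD, n - 1 in the denominator

# Full-precision t-statistic
t_full = d_bar / (s_d / math.sqrt(n))

# Same formula with intermediates rounded to three decimals, as done by hand
t_rounded = d_bar / (round(s_d, 3) / round(math.sqrt(n), 3))

print(f"{t_full:.4f}")     # 6.3246
print(f"{t_rounded:.4f}")  # 6.3253
```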

Degrees of freedom (\(df\)):

\[ df = n - 1 = 5 - 1 = 4 \]

Compare the calculated t-statistic to the critical t-value:

The critical t-value for \(df = 4\) and a two-tailed test with \(\alpha = 0.05\) is approximately 2.776 (from t-distribution tables).
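
An equivalent way to reach the decision is through the p-value. The sketch below computes the two-tailed p-value from the t-statistic and degrees of freedom with `scipy.stats.t.sf`; the t-statistic is entered to four decimals, which is an assumption about the precision carried forward.

```python
from scipy.stats import t as t_dist

t_statistic = 6.3246  # t-statistic from the worked example
df = 4                # degrees of freedom, n - 1

# Two-tailed p-value: probability of |T| >= |t| when H0 is true
p_value = 2 * t_dist.sf(abs(t_statistic), df)

print(f"{p_value:.4f}")  # 0.0032
```

Because the p-value is below \(\alpha = 0.05\), this route leads to the same decision as the critical-value comparison.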

25.0.8 Interpretation

Since the absolute value of the calculated t-statistic (approximately 6.32) exceeds the critical t-value (2.776), we reject the null hypothesis at the 0.05 significance level. The mean difference (before minus after) is positive at 2 kg, so the data indicate a statistically significant decrease in weight over the 6 weeks. With only 5 participants and no control group, this result is consistent with the diet being effective but does not by itself establish that the diet caused the change.

25.0.10 Paired Samples T-Test calculation using R:

# Participant weights before and after the diet
weights_before <- c(70, 72, 75, 80, 78)
weights_after <- c(68, 70, 74, 77, 76)
alpha <- 0.05
# Perform paired samples t-test
t_test_result <- t.test(weights_before, weights_after, paired = TRUE)

# Print the results
t_test_result


    Paired t-test

data:  weights_before and weights_after
t = 6.3246, df = 4, p-value = 0.003198
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
 1.122011 2.877989
sample estimates:
mean difference 
              2

# Extract p-value
p_value <- t_test_result$p.value
# Hypothesis decision
if (p_value < alpha) {
  cat("Reject null hypothesis\n")
} else {
  cat("Do not reject null hypothesis\n")
}

Reject null hypothesis

25.0.11 Paired Samples T-Test calculation using Python:

from scipy.stats import ttest_rel

# Participant weights before and after the diet
weights_before = [70, 72, 75, 80, 78]
weights_after = [68, 70, 74, 77, 76]
alpha = 0.05
# Perform paired samples t-test
t_test_result = ttest_rel(weights_before, weights_after)

# Print the results
t_test_result

TtestResult(statistic=6.324555320336758, pvalue=0.0031982021523353082, df=4)

# Extract P-value
p_value = t_test_result.pvalue
# hypothesis decision
if p_value < alpha:
    print("Reject null hypothesis")
else:
    print("Do not reject null hypothesis")

Reject null hypothesis
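
The R output above also reports a 95 percent confidence interval for the mean difference, which the Python result printed above does not show. A minimal sketch that computes it by hand from the same data, using `scipy.stats.t.ppf` for the critical value:

```python
import math
import statistics

from scipy.stats import t as t_dist

weights_before = [70, 72, 75, 80, 78]
weights_after = [68, 70, 74, 77, 76]

# Differences taken as before minus after, matching the test above
differences = [b - a for b, a in zip(weights_before, weights_after)]
n = len(differences)
d_bar = statistics.mean(differences)
standard_error = statistics.stdev(differences) / math.sqrt(n)

# 95% interval: d_bar +/- t_crit * SE, with n - 1 degrees of freedom
t_crit = t_dist.ppf(0.975, n - 1)
ci_low = d_bar - t_crit * standard_error
ci_high = d_bar + t_crit * standard_error

print(f"({ci_low:.3f}, {ci_high:.3f})")  # (1.122, 2.878)
```

The interval excludes zero, which agrees with rejecting the null hypothesis at the 0.05 level.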
