23 One-Sample T-Test

The one-sample t-test is a statistical procedure used to determine whether the mean of a single sample differs significantly from a known or hypothesized population mean. This test is particularly useful when the population standard deviation is unknown and the sample size is small, which is a common scenario in many practical research applications.

23.0.1 Assumptions

Before conducting a one-sample t-test, certain assumptions must be verified to ensure the validity of the test results:

Normality: The data should be approximately normally distributed. This assumption matters most with small samples. For larger samples, the Central Limit Theorem implies that the sampling distribution of the sample mean is approximately normal regardless of the shape of the population distribution.
Independence: The sampled observations must be independent of each other. This means that the selection of one observation does not influence or alter the selection of other observations.
Scale of Measurement: The data should be measured at least at the interval level, which means that the numerical distances between measurements are defined.

23.0.2 Hypotheses

The hypotheses for a one-sample t-test are structured as follows:

Null Hypothesis (H₀): The population mean is equal to the specified value (\(\mu = \mu_0\)).
Alternative Hypothesis (H₁): The population mean is not equal to the specified value (\(\mu \neq \mu_0\)). The alternative hypothesis can also be directional, stating that the mean is greater than (\(\mu > \mu_0\)) or less than (\(\mu < \mu_0\)) the specified value, depending on the research question.

23.0.3 Formula

The t-statistic is calculated using the formula:

\[t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}\]

Where:

\(\bar{x}\) is the sample mean.
\(\mu_0\) is the hypothesized population mean.
\(s\) is the sample standard deviation.
\(n\) is the sample size.
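The formula translates directly into a few lines of code. The following is a minimal sketch using only the Python standard library; the function name `one_sample_t` and the sample values are hypothetical and chosen purely for illustration.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the one-sample t-statistic of `sample` against the hypothesized mean `mu0`."""
    n = len(sample)
    x_bar = statistics.mean(sample)      # sample mean
    s = statistics.stdev(sample)         # sample standard deviation (n - 1 denominator)
    standard_error = s / math.sqrt(n)    # s / sqrt(n), the denominator of the formula
    return (x_bar - mu0) / standard_error


# Hypothetical measurements, used only to show the call
sample = [5.1, 4.9, 5.3, 5.0, 5.2]
t = one_sample_t(sample, mu0=5.0)
print(f"t = {t:.4f}")  # t = 1.4142
```

Note that `statistics.stdev` divides by \(n - 1\), which matches the sample standard deviation \(s\) in the formula; `statistics.pstdev` divides by \(n\) and would not.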

23.0.4 Calculating Degrees of Freedom

The degrees of freedom for the one-sample t-test are calculated as \(n - 1\). This value is crucial for determining the critical values from the t-distribution, which are needed to assess the significance of the test statistic.

23.0.5 Interpretation

To decide whether to reject the null hypothesis, compare the calculated t-value to the critical t-value from the t-distribution with \(n - 1\) degrees of freedom at the desired significance level (\(\alpha\), commonly 0.05). For a two-tailed test, the decision rules are:

If the absolute value of the calculated t-value is greater than the critical t-value, reject the null hypothesis.
If the absolute value of the calculated t-value is less than or equal to the critical t-value, do not reject the null hypothesis.
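The decision rules can be sketched as a small function. This is an illustration under stated assumptions rather than a complete implementation: the function name `decide` is hypothetical, the critical value is assumed to be looked up in a t-table beforehand, and the two directional branches extend the two-tailed rules to the one-tailed alternatives described in the Hypotheses section.

```python
def decide(t_stat, t_crit, alternative="two-sided"):
    """Apply the critical-value decision rule.

    t_crit is the positive critical value for the chosen alpha and df:
    the alpha/2 upper-tail value for a two-sided test, the alpha value otherwise.
    """
    if alternative == "two-sided":
        reject = abs(t_stat) > t_crit
    elif alternative == "greater":       # H1: mu > mu0
        reject = t_stat > t_crit
    elif alternative == "less":          # H1: mu < mu0
        reject = t_stat < -t_crit
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return "reject H0" if reject else "do not reject H0"


# Illustrative values: with df = 14 and alpha = 0.05, the two-sided
# critical value from a t-table is about 2.145.
print(decide(t_stat=-2.32, t_crit=2.145))  # reject H0
print(decide(t_stat=1.50, t_crit=2.145))   # do not reject H0
```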

23.0.6 One-Sample T-Test Example Problem

A bakery claims that its chocolate chip cookies weigh 60 grams on average. A quality control manager is skeptical of this claim and decides to test it. She randomly selects 15 cookies and finds the following weights in grams:

52, 55, 61, 54, 58, 59, 62, 53, 56, 57, 60, 59, 61, 64, 58

She decides to use a one-sample t-test to see if there’s evidence that the average weight is different from the bakery’s claim. She chooses a significance level of 0.05.

Hypotheses

Null Hypothesis (\(H_0\)): \(\mu = 60\) grams. The average weight of the cookies is 60 grams.
Alternative Hypothesis (\(H_1\)): \(\mu \neq 60\) grams. The average weight of the cookies is not 60 grams.

First, let’s calculate the sample mean (\(\bar{x}\)), sample standard deviation (\(s\)), and the t-statistic.

Calculate the Sample Mean (\(\bar{x}\)):

The sample size \(n\) is 15.

\[ \bar{x} = \frac{\sum x_i}{n} \]

\[ = \frac{52 + 55 + 61 + 54 + 58 + 59 + 62 + 53 + 56 + 57 + 60 + 59 + 61 + 64 + 58}{15} \]

\[ = \frac{869}{15} = 57.93 \text{ grams} \]

Calculate the Sample Standard Deviation (s):

To calculate \(s\), use the formula:

\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} \]

First, compute the deviations from the mean, square each, and then sum them up:

\((52 - 57.93)^2 = 35.1649\)
\((55 - 57.93)^2 = 8.5849\)
\((61 - 57.93)^2 = 9.4249\)
\((54 - 57.93)^2 = 15.4449\)
\((58 - 57.93)^2 = 0.0049\)
\((59 - 57.93)^2 = 1.1449\)
\((62 - 57.93)^2 = 16.5649\)
\((53 - 57.93)^2 = 24.3049\)
\((56 - 57.93)^2 = 3.7249\)
\((57 - 57.93)^2 = 0.8649\)
\((60 - 57.93)^2 = 4.2849\)
\((59 - 57.93)^2 = 1.1449\)
\((61 - 57.93)^2 = 9.4249\)
\((64 - 57.93)^2 = 36.8449\)
\((58 - 57.93)^2 = 0.0049\)

Sum of squared deviations:

\[ \sum (x_i - \bar{x})^2 = 166.9335 \]

Now calculate \(s\):

\[ s = \sqrt{\frac{166.9335}{14}} = 3.45 \text{ grams} \]

Compute the T-Statistic:

Using the t-test formula:

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{57.93 - 60}{3.45 / \sqrt{15}} = -2.32 \]

Determine Degrees of Freedom:

\[ df = n - 1 = 15 - 1 = 14 \]
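The hand calculation can be cross-checked with a short script. The sketch below uses only the Python standard library and follows the same steps in the same order; the variable names are illustrative. It works with unrounded intermediate values, so its sum of squared deviations can differ from a hand calculation in the later decimal places.

```python
import math

weights = [52, 55, 61, 54, 58, 59, 62, 53, 56, 57, 60, 59, 61, 64, 58]
mu0 = 60  # hypothesized mean from the null hypothesis

n = len(weights)
total = sum(weights)
x_bar = total / n                                     # sample mean
sum_sq_dev = sum((x - x_bar) ** 2 for x in weights)   # sum of squared deviations
s = math.sqrt(sum_sq_dev / (n - 1))                   # sample standard deviation
t = (x_bar - mu0) / (s / math.sqrt(n))                # t-statistic
df = n - 1                                            # degrees of freedom

print(f"n = {n}, sum = {total}")
print(f"mean = {x_bar:.2f}")
print(f"sum of squared deviations = {sum_sq_dev:.4f}")
print(f"s = {s:.2f}")
print(f"t = {t:.2f}, df = {df}")
```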

Calculate P-Value for a Two-Tailed Test:

Based on the t-statistic, look up or compute the p-value for \(|t| = 2.32\) with \(df = 14\). This value is approximately \(p = 0.036\).
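When no table or statistics package is at hand, the p-value can be approximated by integrating the density of the t-distribution numerically. The following is a rough sketch using only the Python standard library and Simpson's rule; the function names `t_pdf` and `two_tailed_p` and the step count are assumptions made for illustration, and a statistics library remains the better choice in practice.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value: 1 minus the area between -|t| and |t|.

    The area from 0 to |t| is approximated with Simpson's rule
    (`steps` must be even) and doubled by symmetry.
    """
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2       # Simpson weights: 4 on odd, 2 on even nodes
        total += weight * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total
    return 1 - central_area


p = two_tailed_p(t_stat=-2.318, df=14)
print(f"p = {p:.4f}")
```

For a one-tailed test in the direction of the observed t-statistic, the corresponding p-value is half of the two-tailed value.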

Interpretation

T-Statistic: The negative value of the t-statistic (-2.32) indicates that the sample mean is less than the null hypothesis mean of 60 grams.
P-Value: The p-value of 0.036 is less than the chosen significance level of 0.05. This suggests that there is statistically significant evidence to reject the null hypothesis.
Therefore, based on the sample of 15 cookies, there is sufficient statistical evidence to conclude that the average weight of the bakery’s chocolate chip cookies is different from the claimed 60 grams.
Given the direction indicated by the t-statistic, it suggests that the cookies may, on average, weigh less than the claimed 60 grams.

23.0.7 One-Sample T-Test calculation using Excel

Download the Excel file link here

23.0.8 One-Sample T-Test calculation using R and Python

R

One-Sample T-Test

# Sample data
weights <- c(52, 55, 61, 54, 58, 59, 62, 53, 56, 57, 60, 59, 61, 64, 58)

# Perform one-sample t-test (H0: mean = 60)
result <- t.test(weights, mu = 60)

# Display full test output
cat(
  "Sample Mean         :", round(mean(weights), 2), "\n",
  "Population Mean     :", 60, "\n",
  "t-Statistic         :", round(result$statistic, 4), "\n",
  "Degrees of Freedom  :", result$parameter, "\n",
  "P-value             :", round(result$p.value, 5), "\n",
  "Confidence Interval :", paste(round(result$conf.int, 2), collapse = " to "), "\n"
)

Sample Mean         : 57.93 
 Population Mean     : 60 
 t-Statistic         : -2.318 
 Degrees of Freedom  : 14 
 P-value             : 0.0361 
 Confidence Interval : 56.02 to 59.85

# Interpret the result 
alpha <- 0.05
p <- result$p.value

if (p < alpha) {
  cat("(P-value)",  p, "<", alpha, 
      "(significance level / alpha).",  "\n",
      " Reject the null hypothesis: 
      There is a significant difference.\n")
} else {
  cat("(P-value)",  p, ">=", alpha, 
      "(significance level / alpha).", "\n",
      " Do not Reject the null hypothesis: 
      No significant difference.\n")
}

(P-value) 0.03609761 < 0.05 (significance level / alpha). 
  Reject the null hypothesis: 
      There is a significant difference.

Python

One-Sample T-Test

import numpy as np
from scipy.stats import ttest_1samp

# Sample data
weights = np.array([52, 55, 61, 54, 58, 59, 62, 53, 56, 57, 60, 59, 61, 64, 58])

# Population mean (hypothesized)
mu = 60

# Perform one-sample t-test
t_stat, p_value = ttest_1samp(weights, popmean=mu)

# Summary statistics
sample_mean = np.mean(weights)
sample_size = len(weights)
sample_sd = np.std(weights, ddof=1)  # ddof=1 for sample SD
df = sample_size - 1

# Display formatted results
print(f"""
Sample Size         : {sample_size}
Sample Mean         : {sample_mean:.2f}
Sample SD           : {sample_sd:.2f}
Population Mean     : {mu}
t-Statistic         : {t_stat:.4f}
Degrees of Freedom  : {df}
P-value (two-tailed): {p_value:.5f}
""")


Sample Size         : 15
Sample Mean         : 57.93
Sample SD           : 3.45
Population Mean     : 60
t-Statistic         : -2.3180
Degrees of Freedom  : 14
P-value (two-tailed): 0.03610

# Interpret the result
alpha = 0.05
p = p_value
if p < alpha:
    print(f"(P-value) {p:.4f} < {alpha} (sig level / alpha).")
    print("Reject the null hypothesis:", "\n", "There is a significant difference.\n")
else:
    print(f"(P-value) {p:.4f} >= {alpha} (significance level / alpha).")
    print("Do not reject the null hypothesis:", "\n", "No significant difference.\n")

(P-value) 0.0361 < 0.05 (sig level / alpha).
Reject the null hypothesis: 
 There is a significant difference.
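The R output reports a 95% confidence interval for the mean, while the Python listing does not. A minimal sketch of the same interval in Python is shown below, assuming NumPy and SciPy are installed; it builds the interval from the critical value returned by `scipy.stats.t.ppf`, and the variable names are illustrative.

```python
import numpy as np
from scipy import stats

weights = np.array([52, 55, 61, 54, 58, 59, 62, 53, 56, 57, 60, 59, 61, 64, 58])

n = len(weights)
mean = weights.mean()
se = weights.std(ddof=1) / np.sqrt(n)   # standard error of the mean
t_crit = stats.t.ppf(0.975, n - 1)      # two-sided 95% critical value, df = n - 1

ci_low = mean - t_crit * se
ci_high = mean + t_crit * se
print(f"95% Confidence Interval: {ci_low:.2f} to {ci_high:.2f}")
```

The interval excludes the hypothesized mean of 60 grams, which agrees with rejecting the null hypothesis at the 0.05 level in the two-tailed test.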