One-way ANOVA (Analysis of Variance) is a statistical technique used to compare the means of three or more independent (unrelated) groups to determine if there are any statistically significant differences between the mean scores of these groups. It extends the t-test for comparing more than two groups, providing a way to handle complex comparisons without increasing the risk of committing Type I errors (incorrectly rejecting the null hypothesis).
Purpose
The primary purpose of a one-way ANOVA is to test if at least one group mean is different from the others, which suggests that at least one treatment or condition has an effect that is not common to all groups.
Assumptions
One-way ANOVA makes several key assumptions:
- Independence of Observations: Each group’s observations must be independent of the observations in other groups.
- Normality: Data in each group should be approximately normally distributed.
- Homogeneity of Variances: All groups must have the same variance, often assessed with Levene’s Test of Equality of Variances.
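As a quick sketch of how the homogeneity-of-variances assumption can be checked in practice, the example below runs Levene's test with SciPy (the use of `scipy.stats.levene` and the sample data are illustrative, not part of the assumption itself):

```python
# Sketch: checking homogeneity of variances with Levene's test (SciPy assumed installed).
from scipy import stats

group_a = [11, 15, 18, 19, 22]
group_b = [17, 18, 21, 22, 27]
group_c = [15, 16, 18, 19, 22]

stat, p = stats.levene(group_a, group_b, group_c)
print(f"Levene W = {stat:.3f}, p = {p:.3f}")
# A large p-value (> 0.05) means the equal-variance assumption is not rejected.
```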
Hypotheses
The hypotheses for a one-way ANOVA are formulated as:
- Null Hypothesis (H₀): The means of all groups are equal, implying no effect of the independent variable on the dependent variable across the groups.
- Alternative Hypothesis (H₁): At least one group mean is different from the others, suggesting an effect of the independent variable.
Calculations
The analysis involves several key calculations:
- Total Sum of Squares (SST): Measures the total variability in the dependent variable.
- Sum of Squares Between (SSB): Reflects the variability due to differences between the group means.
- Sum of Squares Within (SSW): Captures the variability of observations within each group.
- Degrees of Freedom (DF): Varies for each sum of squares; DF between = \(k - 1\) (where \(k\) is the number of groups) and DF within = \(N - k\) (where \(N\) is the total number of observations).
- Mean Squares: Each sum of squares is divided by its respective degrees of freedom to obtain mean squares (MSB and MSW).
- F-statistic: The ratio of MSB to MSW, which follows an F-distribution under the null hypothesis.
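The calculations listed above can be sketched in plain Python (a minimal illustration for \(k\) groups; the sample values are illustrative):

```python
# Sketch of the one-way ANOVA calculations for k groups (pure Python).
groups = [
    [11, 15, 18, 19, 22],   # group 1
    [17, 18, 21, 22, 27],   # group 2
    [15, 16, 18, 19, 22],   # group 3
]
k = len(groups)                          # number of groups
N = sum(len(g) for g in groups)          # total number of observations
grand_mean = sum(x for g in groups for x in g) / N

# SSB: variability of the group means around the grand mean
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# SSW: variability of observations around their own group mean
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

msb = ssb / (k - 1)      # mean square between
msw = ssw / (N - k)      # mean square within
f_stat = msb / msw       # F-statistic
print(round(ssb, 3), round(ssw, 3), round(f_stat, 3))
```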
Interpretation
The result of a one-way ANOVA is typically reported as an F-statistic and its corresponding p-value. The F-statistic determines whether the observed variances between means are large enough to be considered statistically significant:
- If the F-statistic is larger than the critical value (or if the p-value is less than the significance level, typically 0.05), the null hypothesis is rejected, indicating significant differences among the means.
- If the F-statistic is smaller than the critical value, the null hypothesis is not rejected, suggesting no significant difference among the group means.
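The p-value side of this decision rule can be sketched with SciPy's F-distribution (the numeric values here are illustrative placeholders):

```python
# Sketch: converting an F-statistic to a p-value with SciPy's F-distribution.
from scipy import stats

f_stat, df1, df2, alpha = 1.605, 2, 12, 0.05   # illustrative values

p_value = stats.f.sf(f_stat, df1, df2)  # upper-tail area beyond the observed F
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p = {p_value:.3f} -> {decision}")
```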
One-Way ANOVA Example Problem
A company wants to know the impact of three different selection methods on employee performance. An HR analyst chose 15 employees at random and recorded the sales volume achieved by each employee, with 5 employees drawn from each selection method. The data obtained are given below.
| Employee | Emp Referral | Job Portals | Consultancy |
|----------|--------------|-------------|-------------|
| 1        | 11           | 17          | 15          |
| 2        | 15           | 18          | 16          |
| 3        | 18           | 21          | 18          |
| 4        | 19           | 22          | 19          |
| 5        | 22           | 27          | 22          |
At the 0.05 level of significance, do the selection methods have different effects on the performance of employees?
Calculations:
To perform a one-way ANOVA test to see if there are significant differences in the performance of employees based on their selection method (Emp Referral, Job Portals, Consultancy), we need to calculate several components including the group means, the overall mean, the sum of squares between groups (SSB), the sum of squares within groups (SSW), and the total sum of squares (SST). Additionally, we’ll calculate the F-statistic and compare it to the critical F-value from an F-distribution table.
Data Organization:
Group A (Emp Referral): \([11, 15, 18, 19, 22]\)
Group B (Job Portals): \([17, 18, 21, 22, 27]\)
Group C (Consultancy): \([15, 16, 18, 19, 22]\)
Calculate the Means for Each Group:
\[ \bar{x}_A = \frac{11 + 15 + 18 + 19 + 22}{5} = 17 \] \[ \bar{x}_B = \frac{17 + 18 + 21 + 22 + 27}{5} = 21 \] \[ \bar{x}_C = \frac{15 + 16 + 18 + 19 + 22}{5} = 18 \]
Calculate the Overall Mean:
\[ \bar{x} = \frac{11 + 15 + 18 + 19 + 22 + 17 + 18 + 21 + 22 + 27 + 15 + 16 + 18 + 19 + 22}{15} = 18.667 \]
Calculate Sum of Squares Between Groups (SSB):
\[ SSB = 5[(\bar{x}_A - \bar{x})^2 + (\bar{x}_B - \bar{x})^2 + (\bar{x}_C - \bar{x})^2] \] \[ = 5[(17 - 18.667)^2 + (21 - 18.667)^2 + (18 - 18.667)^2] \] \[ = 5[(-1.667)^2 + (2.333)^2 + (-0.667)^2] \] \[ = 5[2.778 + 5.444 + 0.444] = 5 \times 8.667 = 43.333 \]
Calculate Sum of Squares Within Groups (SSW):
\[ SSW = \sum_{i=1}^{5} (x_{Ai} - \bar{x}_A)^2 + \sum_{i=1}^{5} (x_{Bi} - \bar{x}_B)^2 + \sum_{i=1}^{5} (x_{Ci} - \bar{x}_C)^2 \] \[ = [(11-17)^2 + (15-17)^2 + (18-17)^2 + (19-17)^2 + (22-17)^2] \] \[ \;\;\;\; + [(17-21)^2 + (18-21)^2 + (21-21)^2 + (22-21)^2 + (27-21)^2] \] \[ \;\;\;\; + [(15-18)^2 + (16-18)^2 + (18-18)^2 + (19-18)^2 + (22-18)^2] \] \[ = [36 + 4 + 1 + 4 + 25] + [16 + 9 + 0 + 1 + 36] + [9 + 4 + 0 + 1 + 16] = 162 \]
Calculate the Total Sum of Squares (SST):
\[ SST = SSB + SSW = 43.333 + 162 = 205.333 \]
Calculate Mean Squares:
Between groups: \[ MSB = \frac{SSB}{k-1} = \frac{43.333}{3-1} = 21.667 \] Within groups: \[ MSW = \frac{SSW}{N-k} = \frac{162}{15-3} = 13.5 \]
Calculate F-statistic:
\[ F = \frac{MSB}{MSW} = \frac{21.667}{13.5} = 1.605 \]
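The hand calculation can be cross-checked with `scipy.stats.f_oneway` (a sketch assuming SciPy is installed):

```python
# Cross-check of the manual ANOVA calculation with scipy.stats.f_oneway.
from scipy import stats

emp_referral = [11, 15, 18, 19, 22]
job_portals  = [17, 18, 21, 22, 27]
consultancy  = [15, 16, 18, 19, 22]

f_stat, p_value = stats.f_oneway(emp_referral, job_portals, consultancy)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")  # F agrees with the manual result
```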
Degrees of Freedom
- Degrees of freedom for the numerator (df1): The number of groups minus one. With three groups (Emp Referral, Job Portals, Consultancy), \(df1 = 3 - 1 = 2\).
- Degrees of freedom for the denominator (df2): The total number of observations minus the number of groups. For 15 employees and 3 groups, \(df2 = 15 - 3 = 12\).
- Significance level (α): Typically set at 0.05, implying a 95% confidence level in the results.
Critical F-value Interpretation
Locate the value in an F-distribution table at the intersection of \(df1 = 2\) and \(df2 = 12\) for \(α = 0.05\). Standard F-tables give a critical value of approximately 3.89 for these degrees of freedom. Since the calculated F-statistic 1.605 is less than 3.89, we fail to reject the null hypothesis, concluding that there is no significant effect of the selection method on employee performance at the 0.05 significance level. In other words, based on these ANOVA results, the different selection methods do not have a statistically significant impact on employee sales performance.
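Rather than consulting a printed table, the critical F-value can also be obtained programmatically (a sketch using SciPy's inverse F-distribution):

```python
# Obtaining the critical F-value for df1 = 2, df2 = 12 at alpha = 0.05.
from scipy import stats

f_crit = stats.f.ppf(1 - 0.05, dfn=2, dfd=12)  # 95th percentile of F(2, 12)
print(round(f_crit, 3))
# The observed F-statistic 1.605 is below f_crit, so H0 is not rejected.
```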
One way ANOVA Test in R
# Prepare the Data
emp_referral <- c(11, 15, 18, 19, 22)
job_portals <- c(17, 18, 21, 22, 27)
consultancy <- c(15, 16, 18, 19, 22)
alpha <- 0.05
# Combining the data into a single data frame
data <- data.frame(
Sales = c(emp_referral, job_portals, consultancy),
Method = factor(rep(c("Emp Referral", "Job Portals", "Consultancy"), each = 5))
)
data
Sales Method
1 11 Emp Referral
2 15 Emp Referral
3 18 Emp Referral
4 19 Emp Referral
5 22 Emp Referral
6 17 Job Portals
7 18 Job Portals
8 21 Job Portals
9 22 Job Portals
10 27 Job Portals
11 15 Consultancy
12 16 Consultancy
13 18 Consultancy
14 19 Consultancy
15 22 Consultancy
# Perform ANOVA Test
result <- aov(Sales ~ Method, data = data)
# Results
summary(result)
Df Sum Sq Mean Sq F value Pr(>F)
Method 2 43.33 21.67 1.605 0.241
Residuals 12 162.00 13.50
# Get the summary of the ANOVA test
summary_result <- summary(result)
# Extract the p-value
p_value <- summary_result[[1]]["Method", "Pr(>F)"]
# hypothesis decision
if (p_value < alpha) {
cat("Reject null hypothesis\n")
} else {
cat("Do not reject null hypothesis\n")
}
Do not reject null hypothesis
One-Way ANOVA Test in Python
Install statsmodels package
!pip3 install statsmodels
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
# Step 1: Prepare the Data
emp_referral = [11, 15, 18, 19, 22]
job_portals = [17, 18, 21, 22, 27]
consultancy = [15, 16, 18, 19, 22]
alpha = 0.05
# Combining the data into a single DataFrame
data = pd.DataFrame({
'Sales': emp_referral + job_portals + consultancy,
'Method': ['Emp Referral'] * 5 + ['Job Portals'] * 5 + ['Consultancy'] * 5
})
data
Sales Method
0 11 Emp Referral
1 15 Emp Referral
2 18 Emp Referral
3 19 Emp Referral
4 22 Emp Referral
5 17 Job Portals
6 18 Job Portals
7 21 Job Portals
8 22 Job Portals
9 27 Job Portals
10 15 Consultancy
11 16 Consultancy
12 18 Consultancy
13 19 Consultancy
14 22 Consultancy
# Step 2: Perform ANOVA Test
model = ols('Sales ~ C(Method)', data=data).fit()
# Step 3: Get the summary to see the results
result = sm.stats.anova_lm(model, typ=2)
print(result)
sum_sq df F PR(>F)
C(Method) 43.333333 2.0 1.604938 0.241176
Residual 162.000000 12.0 NaN NaN
# Extract p-value
p_value = result.loc['C(Method)', 'PR(>F)']
# Hypothesis decision
if p_value < alpha:
print("Reject null hypothesis")
else:
print("Do not reject null hypothesis")
Do not reject null hypothesis
Example Research Articles on ANOVA
- Bozkurt Bostancı, A., & Pullu, M. (2025). The role of organizational transparency levels in schools on teachers’ organizational citizenship behaviors. PERR, 14(2), 51–70. https://doi.org/10.52963/PERR_Biruni_V14.N2.02
- Emir, E., Akça, E., Badau, A., & Badau, D. (2025). The dark side of leisure time: Analysis of the predictive effects between boredom, internet usage habits, and gambling behaviors. Brain Sciences, 15(6), 598. https://doi.org/10.3390/brainsci15060598