Two-Way ANOVA, the two-factor case of factorial ANOVA, extends One-Way ANOVA by comparing means across two categorical independent variables instead of one. It lets researchers study the effects of two factors simultaneously and evaluate whether the two factors interact in their effect on a continuous dependent variable.
Purpose
The primary goals of Two-Way ANOVA are:
To determine if there is a significant effect of each of the two independent variables on the dependent variable. This is analogous to conducting multiple One-Way ANOVAs, each for a different factor, though doing so separately ignores the potential interaction between the factors.
To determine if there is a significant interaction effect between the two independent variables on the dependent variable. An interaction effect occurs when the effect of one independent variable on the dependent variable changes across the levels of the other independent variable.
Assumptions
- Independence of observations: Each subject’s response is independent of the others’.
- Normality: The data for each combination of groups formed by the two factors should be normally distributed.
- Homogeneity of variances: The variances among the groups should be approximately equal.
Components
In a Two-Way ANOVA, the data can be represented in a matrix format where one factor’s levels are on the rows, the other factor’s levels are on the columns, and the cell values are the means (or other statistics) of the dependent variable for the combinations of factor levels.
Hypotheses
There are three sets of null hypotheses in a Two-Way ANOVA:
- Main Effect of Factor A: The means of the different levels of factor A are equal.
- Main Effect of Factor B: The means of the different levels of factor B are equal.
- Interaction Effect of Factors A and B: There is no interaction between factors A and B; the effect of factor A on the dependent variable is the same at all levels of factor B, and vice versa.
Calculation
Two-Way ANOVA involves partitioning the total variance observed in the data into components attributable to each factor and their interaction. The sums of squares for these components are compared to a residual (error) term to produce F-statistics for each hypothesis.
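For reference, the partition can be written out explicitly. The formulas below are a sketch for the balanced case only, under notation that is assumed here rather than taken from the text: factor A has a levels, factor B has b levels, each cell holds n observations, y_ijk is the k-th observation in cell (i, j), and a dot in a subscript marks the index that has been averaged over.

```latex
\begin{align*}
SS_T    &= SS_A + SS_B + SS_{AB} + SS_E \\
SS_A    &= b\,n \sum_{i=1}^{a} \left(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}\right)^2 \\
SS_B    &= a\,n \sum_{j=1}^{b} \left(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}\right)^2 \\
SS_{AB} &= n \sum_{i=1}^{a}\sum_{j=1}^{b}
           \left(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}\right)^2 \\
SS_E    &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} \left(y_{ijk} - \bar{y}_{ij\cdot}\right)^2 \\[6pt]
F_A     &= \frac{SS_A / (a-1)}{SS_E / \bigl(ab(n-1)\bigr)}, \quad
F_B      = \frac{SS_B / (b-1)}{SS_E / \bigl(ab(n-1)\bigr)}, \quad
F_{AB}   = \frac{SS_{AB} / \bigl((a-1)(b-1)\bigr)}{SS_E / \bigl(ab(n-1)\bigr)}
\end{align*}
```

With unequal cell sizes the sums of squares no longer add up this cleanly, and the result depends on which type of sums of squares is used.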
Interpretation
- Main effects: A significant F-statistic for a main effect indicates that the mean of the dependent variable differs across the levels of that factor, averaged over the levels of the other factor.
- Interaction effect: A significant F-statistic for the interaction indicates that the effect of one factor on the dependent variable differs across the levels of the other factor.
If there’s a significant interaction, it’s crucial to interpret the main effects within the context of the interaction, often requiring a more detailed analysis such as simple effects tests or plotting interaction plots to understand the nature of the interaction.
Example Problem on Two-Way ANOVA
Let’s consider a study to evaluate the impact of two factors on plant growth: Fertilizer Type (A, B) and Irrigation Method (X, Y). The objective is to determine the effect of these two factors and their interaction on plant height. Here is the hypothetical data:
- Fertilizer Type A, Irrigation X: Plant heights are 15, 17, 16 cm.
- Fertilizer Type A, Irrigation Y: Plant heights are 14, 15, 15 cm.
- Fertilizer Type B, Irrigation X: Plant heights are 18, 20, 19 cm.
- Fertilizer Type B, Irrigation Y: Plant heights are 22, 21, 23 cm.
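Arranged in the matrix format described under Components, these data reduce to a small table of cell means with marginal means for each factor. The following is a minimal sketch using only the Python standard library; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
from statistics import fmean

# Plant heights (cm) from the example, keyed by (fertilizer, irrigation).
heights = {
    ("A", "X"): [15, 17, 16],
    ("A", "Y"): [14, 15, 15],
    ("B", "X"): [18, 20, 19],
    ("B", "Y"): [22, 21, 23],
}

fertilizers = ["A", "B"]  # rows of the matrix
irrigations = ["X", "Y"]  # columns of the matrix

# Cell means: one value per combination of factor levels.
cell_means = {cell: fmean(values) for cell, values in heights.items()}

# Marginal means: average over all observations at one level of a factor.
row_means = {
    f: fmean(v for i in irrigations for v in heights[(f, i)]) for f in fertilizers
}
col_means = {
    i: fmean(v for f in fertilizers for v in heights[(f, i)]) for i in irrigations
}
grand_mean = fmean(v for values in heights.values() for v in values)

# Print the matrix with fertilizer on rows and irrigation on columns.
print("Fertilizer " + "".join(f"{i:>8}" for i in irrigations) + f"{'Mean':>8}")
for f in fertilizers:
    cells = "".join(f"{cell_means[(f, i)]:8.2f}" for i in irrigations)
    print(f"{f:<11}" + cells + f"{row_means[f]:8.2f}")
print(f"{'Mean':<11}" + "".join(f"{col_means[i]:8.2f}" for i in irrigations)
      + f"{grand_mean:8.2f}")
```

The row and column margins are the quantities compared by the two main-effect tests, while the pattern of the four inner cells is what the interaction test examines.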
The null hypotheses for this Two-Way ANOVA are:
- Null Hypothesis for Fertilizer Type (H0a): There is no difference in mean plant height across the different types of fertilizer.
- Null Hypothesis for Irrigation Method (H0b): There is no difference in mean plant height across the different irrigation methods.
- Null Hypothesis for Interaction (H0ab): There is no interaction effect between fertilizer type and irrigation method on plant height.
Let’s calculate the Two-Way ANOVA for this example.
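Before turning to library output, the sums of squares and F-statistics can be computed by hand. The sketch below uses only the Python standard library and assumes a balanced design (three plants per cell, as in this example); the helper name level_mean and the variable names are illustrative. It stops at the F-statistics because the standard library has no F distribution for p-values.

```python
from statistics import fmean

# Plant heights (cm), keyed by (fertilizer, irrigation).
heights = {
    ("A", "X"): [15, 17, 16],
    ("A", "Y"): [14, 15, 15],
    ("B", "X"): [18, 20, 19],
    ("B", "Y"): [22, 21, 23],
}

fert_levels = sorted({f for f, _ in heights})
irr_levels = sorted({i for _, i in heights})
a, b = len(fert_levels), len(irr_levels)
n = 3  # observations per cell (balanced design assumed)

grand = fmean(v for vals in heights.values() for v in vals)


def level_mean(position, level):
    """Mean of all observations whose key has `level` at `position` (0 or 1)."""
    return fmean(
        v for key, vals in heights.items() if key[position] == level for v in vals
    )


# Sums of squares for the two main effects.
ss_fert = b * n * sum((level_mean(0, f) - grand) ** 2 for f in fert_levels)
ss_irr = a * n * sum((level_mean(1, i) - grand) ** 2 for i in irr_levels)

# Between-cell variation minus both main effects leaves the interaction.
ss_cells = n * sum((fmean(vals) - grand) ** 2 for vals in heights.values())
ss_inter = ss_cells - ss_fert - ss_irr

# Residual (within-cell) variation.
ss_error = sum((v - fmean(vals)) ** 2 for vals in heights.values() for v in vals)

# Degrees of freedom.
df_fert, df_irr = a - 1, b - 1
df_inter = df_fert * df_irr
df_error = a * b * (n - 1)

# Each F-statistic is a mean square divided by the residual mean square.
ms_error = ss_error / df_error
f_fert = (ss_fert / df_fert) / ms_error
f_irr = (ss_irr / df_irr) / ms_error
f_inter = (ss_inter / df_inter) / ms_error

for name, ss, df, f in [
    ("Fertilizer", ss_fert, df_fert, f_fert),
    ("Irrigation", ss_irr, df_irr, f_irr),
    ("Interaction", ss_inter, df_inter, f_inter),
]:
    print(f"{name:<12} SS={ss:6.2f}  df={df}  F={f:6.2f}")
print(f"{'Residual':<12} SS={ss_error:6.2f}  df={df_error}")
```

The resulting sums of squares and F-statistics should agree with the values reported for this example.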
The Two-Way ANOVA results for our hypothetical study on the impact of fertilizer type and irrigation method on plant growth yield the following:
Fertilizer Type: The sum of squares is 80.08, with an F-statistic of 96.1 and a p-value of 0.00001. This indicates a highly significant effect of fertilizer type on plant height, meaning we can reject the null hypothesis that there’s no difference in plant height across the different types of fertilizer.
Irrigation Method: The sum of squares is 2.08, with an F-statistic of 2.5 and a p-value of 0.1525. This suggests that the effect of irrigation method on plant height is not statistically significant at the 0.05 level, and we fail to reject the null hypothesis for the irrigation method.
Interaction between Fertilizer Type and Irrigation Method: The sum of squares for the interaction is 14.08, with an F-statistic of 16.9 and a p-value of 0.003386. This indicates a significant interaction effect between fertilizer type and irrigation method on plant height, meaning the effect of one factor depends on the level of the other factor.
Based on these results:
- There’s a significant difference in plant growth across different fertilizer types.
- There’s no significant difference in plant growth across different irrigation methods.
- The interaction between fertilizer type and irrigation method significantly affects plant growth, suggesting that the best combination of factors for plant growth depends on both the type of fertilizer and the method of irrigation used together, not just one or the other in isolation.
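One way to see what the significant interaction means here is to compare the irrigation effect within each fertilizer type. The sketch below is only a descriptive look at the cell means, not a formal simple-effects test, and its variable names are illustrative.

```python
from statistics import fmean

# Plant heights (cm), keyed by (fertilizer, irrigation).
heights = {
    ("A", "X"): [15, 17, 16],
    ("A", "Y"): [14, 15, 15],
    ("B", "X"): [18, 20, 19],
    ("B", "Y"): [22, 21, 23],
}

cell_means = {cell: fmean(values) for cell, values in heights.items()}

# Irrigation effect (Y minus X) separately within each fertilizer type.
irrigation_effect = {
    f: cell_means[(f, "Y")] - cell_means[(f, "X")] for f in ("A", "B")
}

# If there were no interaction, the two differences would be about equal,
# so their gap is a simple descriptive measure of the interaction.
interaction_contrast = irrigation_effect["B"] - irrigation_effect["A"]

for f, diff in irrigation_effect.items():
    print(f"Fertilizer {f}: mean height under Y minus X = {diff:+.2f} cm")
print(f"Difference between the two effects = {interaction_contrast:+.2f} cm")
```

With fertilizer A, irrigation Y gives slightly shorter plants than X, while with fertilizer B it gives taller plants. The two effects point in opposite directions and partly cancel when averaged, which is consistent with a non-significant irrigation main effect alongside a significant interaction.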
Two-Way ANOVA Test in R
# aov() is part of the stats package, which R loads by default,
# so no additional libraries are needed for this analysis
# Preparing the data
PlantHeight <- c(15, 17, 16, 14, 15, 15, 18, 20, 19, 22, 21, 23)
FertilizerType <- factor(rep(c('A', 'B'), each = 6))
IrrigationMethod <- factor(rep(c('X', 'Y', 'X', 'Y'), each = 3))
# Significance level used when reading the p-values
alpha <- 0.05
data <- data.frame(PlantHeight, FertilizerType, IrrigationMethod)
data
   PlantHeight FertilizerType IrrigationMethod
1           15              A                X
2           17              A                X
3           16              A                X
4           14              A                Y
5           15              A                Y
6           15              A                Y
7           18              B                X
8           20              B                X
9           19              B                X
10          22              B                Y
11          21              B                Y
12          23              B                Y
# Conducting Two-Way ANOVA (the * operator fits both main effects and their interaction)
result <- aov(PlantHeight ~ FertilizerType * IrrigationMethod, data = data)
summary(result)
                                Df Sum Sq Mean Sq F value   Pr(>F)
FertilizerType                   1  80.08   80.08    96.1 9.85e-06 ***
IrrigationMethod                 1   2.08    2.08     2.5  0.15250
FertilizerType:IrrigationMethod  1  14.08   14.08    16.9  0.00339 **
Residuals                        8   6.67    0.83
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Two-Way ANOVA Test in Python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
# Preparing the data
PlantHeight = [15, 17, 16, 14, 15, 15, 18, 20, 19, 22, 21, 23]
FertilizerType = ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B']
IrrigationMethod = ['X', 'X', 'X', 'Y', 'Y', 'Y', 'X', 'X', 'X', 'Y', 'Y', 'Y']
# Combine into DataFrame
data = pd.DataFrame({
'PlantHeight': PlantHeight,
'FertilizerType': FertilizerType,
'IrrigationMethod': IrrigationMethod
})
# Two-way ANOVA model
model = ols('PlantHeight ~ C(FertilizerType) * C(IrrigationMethod)', data=data).fit()
# ANOVA table (Type II)
anova_table = sm.stats.anova_lm(model, typ=2)
# Display results
print("\nTwo-Way ANOVA Results (Type II)\n--------------------------------------")
print(anova_table)
Two-Way ANOVA Results (Type II)
--------------------------------------
                                          sum_sq   df     F    PR(>F)
C(FertilizerType)                      80.083333  1.0  96.1  0.000010
C(IrrigationMethod)                     2.083333  1.0   2.5  0.152502
C(FertilizerType):C(IrrigationMethod)  14.083333  1.0  16.9  0.003386
Residual                                6.666667  8.0   NaN       NaN