
ANOVA

Dr. Trilok Nath Pandey
SCOPE, VIT, Chennai
One-way ANOVA test
• ANOVA – Analysis of Variance
• One-way ANOVA, also known as one-factor ANOVA, is a
test for comparing the means of more than two groups.
• ANOVA test hypotheses:
• Null hypothesis: the means of all groups are equal.
• Alternative hypothesis: at least one group mean differs from
the others.
Assumptions of ANOVA test
• The observations are obtained independently and
randomly from the population defined by the factor levels
• The data of each factor level are normally distributed.
• These normal populations have a common variance.
(Levene’s test can be used to check this.)
How it works
• Assume that we have 3 groups (A, B, C) to compare:
• Compute the common variance, called the variance within
samples (S²within) or residual variance.
• Compute the variance between sample means as follows:
• Compute the mean of each group.
• Compute the variance between the sample means (S²between).

• Produce the F-statistic as the ratio S²between / S²within.
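The variance-ratio idea above can be sketched in R for the simple case of equal group sizes. The three groups and their values are made-up illustration data, not taken from these slides, and the variable names are hypothetical.

```r
# Hypothetical data: three groups (A, B, C) with 4 observations each
A <- c(4, 5, 6, 5)
B <- c(7, 8, 6, 7)
C <- c(9, 10, 11, 10)
groups <- list(A = A, B = B, C = C)
n <- 4  # observations per group (equal group sizes assumed)

# Variance within samples: average of the per-group sample variances
s2_within <- mean(sapply(groups, var))

# Variance between samples: n times the variance of the group means
group_means <- sapply(groups, mean)
s2_between <- n * var(group_means)

# F-statistic as the ratio of the two variances
f_stat <- s2_between / s2_within
f_stat
```

With unequal group sizes the simple averages used here no longer apply; the sums of squares have to be weighted by group size.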


Compute one-way ANOVA
• We want to know if there is any significant difference
between the average weights of plants in the 3
experimental conditions.
• Functions used
• aov()
• summary.aov()
ANOVA - Analysis of Variance

ANOVA test statistic, f = MSB / MSE


Critical Value at α = F(α, k - 1, N - k)
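As a rough sketch, the critical value F(α, k - 1, N - k) can be read from R's F distribution instead of a printed table. The values of k, N and the observed f below are assumptions, chosen to match the degrees of freedom and F value in the R output shown later in these slides.

```r
alpha <- 0.05
k <- 3    # number of groups (assumed)
N <- 30   # total number of observations (assumed)

# Critical value F(alpha, k - 1, N - k): upper-tail quantile of the F distribution
f_crit <- qf(1 - alpha, df1 = k - 1, df2 = N - k)
f_crit

# Equivalent decision via the p-value of an observed statistic f
f_obs <- 4.846  # assumed observed value of f = MSB / MSE
p_value <- pf(f_obs, df1 = k - 1, df2 = N - k, lower.tail = FALSE)
p_value

# Both rules give the same decision: f > critical value  <=>  p-value < alpha
f_obs > f_crit
```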
• Step 1: Calculate the mean for each group.

• Step 2: Calculate the total (grand) mean. With equal group sizes this is the sum
of the group means divided by the number of groups; with unequal group sizes,
use the mean of all observations.

• Step 3: Calculate the SSB.

• Step 4: Calculate the between groups degrees of freedom.

• Step 5: Calculate the SSE.


• Step 6: Calculate the degrees of freedom of errors.

• Step 7: Determine the MSB and the MSE.

• Step 8: Find the f test statistic.

• Step 9: Using the F table for the specified level of significance, α, find the
critical value. This is given by F(α, df1, df2).

• Step 10: If f > F then reject the null hypothesis.
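The ten steps above can be followed by hand in R. This is only a sketch: the data frame, its values and all variable names are hypothetical, and the group sizes are deliberately unequal so that the grand mean is taken over all observations.

```r
# Hypothetical data: 3 groups with 6, 6 and 5 observations
dat <- data.frame(
  y = c(6, 8, 4, 5, 3, 4,  8, 12, 9, 11, 6, 8,  13, 9, 11, 8, 7),
  g = factor(rep(c("g1", "g2", "g3"), times = c(6, 6, 5)))
)
alpha <- 0.05

# Steps 1-2: group means and grand mean (mean of all observations)
group_means <- tapply(dat$y, dat$g, mean)
group_sizes <- tapply(dat$y, dat$g, length)
grand_mean  <- mean(dat$y)

# Steps 3-4: between-groups sum of squares (SSB) and its degrees of freedom
ssb <- sum(group_sizes * (group_means - grand_mean)^2)
df1 <- nlevels(dat$g) - 1

# Steps 5-6: error sum of squares (SSE) and its degrees of freedom
# ave() returns each observation's own group mean
sse <- sum((dat$y - ave(dat$y, dat$g))^2)
df2 <- nrow(dat) - nlevels(dat$g)

# Steps 7-8: mean squares and the f test statistic
msb <- ssb / df1
mse <- sse / df2
f   <- msb / mse

# Steps 9-10: critical value and decision
f_crit    <- qf(1 - alpha, df1, df2)
reject_h0 <- f > f_crit
c(f = f, f_crit = f_crit)
```

The same table can be obtained in one call with summary(aov(y ~ g, data = dat)), which is a convenient check on the hand calculation.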


Example 1: Three types of fertilizers are used on three groups of plants. We
want to check if there is a difference in the mean growth of each group. Using the
data given below, apply a one-way ANOVA test at the 0.05 significance level.
Solution:
H0: μ1 = μ2 = μ3
H1: At least one mean differs from the others
Q. Determine if there is a difference in the mean daily calcium intake
for people with normal bone density, osteopenia, and osteoporosis at
a 0.05 alpha level. The data was recorded as follows:
R code
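# mydat is assumed to be a data frame with a numeric column 'weight'
# and a factor column 'group'

# Compute the analysis of variance
res <- aov(weight ~ group, data = mydat)

# Summary of the analysis (summary() dispatches to summary.aov())
summary(res)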

            Df Sum Sq Mean Sq F value Pr(>F)
group        2  3.766  1.8832   4.846 0.0159 *
Residuals   27 10.492  0.3886
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

As the p-value (0.0159) is less than the significance level 0.05, we can conclude that there are
significant differences between the group means; this is flagged with “*” in the model summary.
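The decision can also be made programmatically by pulling the F value and p-value out of the summary table. The slides do not say where mydat comes from; the sketch below assumes R's built-in PlantGrowth data (columns weight and group) as a stand-in.

```r
# Assumption: PlantGrowth (built into R) stands in for mydat
mydat <- PlantGrowth
res <- aov(weight ~ group, data = mydat)

# summary() returns a list whose first element is the ANOVA table
tab <- summary(res)[[1]]
f_value <- tab[["F value"]][1]   # row 1 is the 'group' term
p_value <- tab[["Pr(>F)"]][1]

alpha <- 0.05
if (p_value < alpha) {
  message("Reject H0: at least one group mean differs (p = ",
          signif(p_value, 3), ")")
} else {
  message("Fail to reject H0 (p = ", signif(p_value, 3), ")")
}
```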
Multiple pairwise-comparison between the means of groups

• In a one-way ANOVA test, a significant p-value indicates
that some of the group means are different, but we don’t
know which pairs of groups are different.

• It’s possible to perform multiple pairwise comparisons to
determine whether the mean difference between specific pairs of
groups is statistically significant.
Tukey multiple pairwise-comparisons

• As the ANOVA test is significant, we can compute Tukey
HSD (Tukey Honest Significant Differences).
• R function: TukeyHSD() performs multiple pairwise
comparisons between the means of groups.
R code
# Tukey HSD: multiple pairwise comparisons
TukeyHSD(res)

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = weight ~ group, data = mydat)

$group
            diff        lwr       upr     p adj
trt1-ctrl -0.371 -1.0622161 0.3202161 0.3908711
trt2-ctrl  0.494 -0.1972161 1.1852161 0.1979960
trt2-trt1  0.865  0.1737839 1.5562161 0.0120064

The output shows that only the difference between trt2 and
trt1 is significant, with an adjusted p-value of 0.012.
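Rather than reading the table by eye, the significant pairs can be filtered from the TukeyHSD() result. A minimal sketch, again assuming the built-in PlantGrowth data as a stand-in for mydat:

```r
# Assumption: PlantGrowth (built into R) stands in for mydat
mydat <- PlantGrowth
res <- aov(weight ~ group, data = mydat)

tk <- TukeyHSD(res)

# tk$group is a matrix with columns diff, lwr, upr and "p adj"
pairs <- tk$group

# Keep only the pairs whose adjusted p-value is below 0.05
sig_pairs <- pairs[pairs[, "p adj"] < 0.05, , drop = FALSE]
sig_pairs
```

Calling plot(tk) draws the same confidence intervals; an interval that does not cross zero corresponds to a significant pair.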
Checking ANOVA Assumptions
• Check the homogeneity of variance assumption
• The residuals versus fits plot can be used to check
the homogeneity of variances.
# Checking homogeneity of variance (residuals vs fitted values)
plot(res, 1)

• Levene’s test
# Levene's test (requires the car package)
library(car)
leveneTest(weight ~ group, data = mydat)
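If the car package is not available, the idea behind Levene's test can be sketched in base R: run a one-way ANOVA on the absolute deviations of each observation from its group center. The sketch below uses the group median as the center (which is leveneTest()'s default) and assumes the built-in PlantGrowth data as a stand-in for mydat; the column name abs_dev is hypothetical.

```r
# Assumption: PlantGrowth (built into R) stands in for mydat
mydat <- PlantGrowth

# Absolute deviation of each observation from its own group median
mydat$abs_dev <- abs(mydat$weight -
                     ave(mydat$weight, mydat$group, FUN = median))

# One-way ANOVA on the deviations: a significant F suggests unequal variances
lev <- anova(lm(abs_dev ~ group, data = mydat))
lev

lev_p <- lev[["Pr(>F)"]][1]
lev_p > 0.05   # TRUE means no evidence against equal variances
```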
Checking ANOVA Assumptions (contd.)
• Check the normality assumption
• Normality plot of residuals. In this plot, the quantiles of the
residuals are plotted against the quantiles of the normal
distribution.

• Shapiro-Wilk test on ANOVA residuals

# Checking normality (normal Q-Q plot of the residuals)
plot(res, 2)

# Extract the residuals and run the Shapiro-Wilk test
res_resi <- residuals(res)
shapiro.test(res_resi)
Non-parametric alternative to one-way ANOVA test
• A non-parametric alternative to one-way ANOVA
is the Kruskal-Wallis rank sum test, which can be used
when ANOVA assumptions are not met.

kruskal.test(weight ~ group, data = mydat)
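To show what "rank sum" means here, the following sketch computes the Kruskal-Wallis statistic by hand from the ranks, including the usual correction for ties, and compares it with kruskal.test(). It assumes the built-in PlantGrowth data as a stand-in for mydat; the variable names are hypothetical.

```r
# Assumption: PlantGrowth (built into R) stands in for mydat
mydat <- PlantGrowth
kw <- kruskal.test(weight ~ group, data = mydat)

# Rank all observations together (tied values get average ranks)
r <- rank(mydat$weight)
N <- length(r)

# Sum of ranks and number of observations per group
rank_sums <- tapply(r, mydat$group, sum)
n_i       <- tapply(r, mydat$group, length)

# Uncorrected statistic H
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N),
# where t is the size of each group of tied values
ties <- as.numeric(table(mydat$weight))
H_corrected <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# p-value from the chi-squared distribution with k - 1 degrees of freedom
p_manual <- pchisq(H_corrected, df = nlevels(mydat$group) - 1,
                   lower.tail = FALSE)
c(H = H_corrected, p = p_manual)
```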
References
• https://online.stat.psu.edu/stat415/lesson/13/13.2
• http://users.sussex.ac.uk/~grahamh/RM1web/F-ratio%20table%202005.pdf
• http://www.stat.yale.edu/Courses/1997-98/101/anovareg.htm
