
3/5/2021 One and two-way ANOVA with R (3)

ANOVA – Analysis of Variances
Testing complex hypotheses as a whole, e.g.:
more than two samples (multiple test problem),
several factors (multiway ANOVA),
elimination of covariates (ANCOVA),
fixed and/or random effects (variance decomposition methods, mixed effects models)

Different application scenarios:
explorative use: Which influence factors are important?
descriptive use: Fitting of models for process description and forecasting.
significance tests.

ANOVA methods are (in most cases) based on linear models.

https://tpetzoldt.github.io/RStatistics/slides-anova.html#(3) 1/1

A practical example

Scientific question
Find a suitable medium for growth experiments with green algae:
Cheap, easy to handle
Suitable for student courses and classroom experiments

Idea
Use a commercial fertilizer with the main nutrients N
and P

Mineral water with trace elements


Does non-sparkling mineral water contain enough HCO3−?
Test how to improve CO2 availability for photosynthesis


Application
7 Different treatments
Fertilizer solution in closed bottles
Fertilizer solution in open bottles (CO2 from air)
Fertilizer + sugar (organic C source)
Fertilizer + additional HCO3− (add CaCO3 to sparkling mineral water)
A standard algae growth medium (“Basal medium”) for comparison
Deionized (“distilled”) water and tap water for comparison


Experimental design

each treatment with 3 replicates
randomized experiment on a shaker
16:8 light:dark-cycle
Measurement directly in the bottles using a self-made turbidity meter
(https://tpetzoldt.github.io/growthlab/doc/versuchsaufbau.html)


Results

Fertilizer – Open Bottle – F. + Sugar – F. + CaCO3 – Basal medium – A. dest – Tap water


The data set


Data set: Growth from day 2 to day 6 (in relative units)

treat       replicate 1  replicate 2  replicate 3
Fertilizer        0.020       -0.217       -0.273
F. open           0.940        0.780        0.555
F.+sugar          0.188       -0.100        0.020
F.+CaCO3          0.245        0.236        0.456
Bas.med.          0.699        0.727        0.656
A.dest           -0.010        0.000       -0.010
Tap water         0.030       -0.070           NA

NA means “not available”, i.e. a missing value.

The crosstable structure is compact and nice for a slide, but not suitable for data analysis. Therefore, we use the long table format instead.


Data in long format


Data set: Growth from day 2 to day 6 (in relative units)

treat       rep  growth
Fertilizer    1   0.020
Fertilizer    2  -0.217
Fertilizer    3  -0.273
F. open       1   0.940
F. open       2   0.780
F. open       3   0.555
F.+sugar      1   0.188
F.+sugar      2  -0.100
F.+sugar      3   0.020
F.+CaCO3      1   0.245
F.+CaCO3      2   0.236
F.+CaCO3      3   0.456
Bas.med.      1   0.699
Bas.med.      2   0.727
Bas.med.      3   0.656
A.dest        1  -0.010
A.dest        2   0.000
A.dest        3  -0.010
Tap water     1   0.030
Tap water     2  -0.070

Advantages
looks “stupid” but is better for data analysis
dependent variable growth and explanatory variable treat are clearly visible
easily extensible to > 1 explanatory variable

The data in R
dat <- data.frame(
  treat = factor(c("Fertilizer", "Fertilizer", "Fertilizer",
                   "F. open", "F. open", "F. open",
                   "F.+sugar", "F.+sugar", "F.+sugar",
                   "F.+CaCO3", "F.+CaCO3", "F.+CaCO3",
                   "Bas.med.", "Bas.med.", "Bas.med.",
                   "A.dest", "A.dest", "A.dest",
                   "Tap water", "Tap water"),
                 levels=c("Fertilizer", "F. open", "F.+sugar",
                          "F.+CaCO3", "Bas.med.", "A.dest", "Tap water")),
  rep = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2),
  growth = c(0.02, -0.217, -0.273, 0.94, 0.78, 0.555, 0.188, -0.1, 0.02,
             0.245, 0.236, 0.456, 0.699, 0.727, 0.656, -0.01, 0, -0.01, 0.03, -0.07)
)

… can be read from a csv file or entered directly in the code.


Visualization
boxplot(growth ~ treat, data=dat)
abline(h=0, lty="dashed", col="grey")

But as we have only 2-3 replicates per box, it is better to plot all values separately using stripchart:
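The stripchart call itself did not survive in this printout, so the following is only a minimal sketch of how such a plot could be produced. The data are re-entered so that the block runs on its own; the graphical arguments (pch, col, vertical) are illustrative assumptions, not necessarily those of the original figure.

```r
# Algae growth data in long format (same values as in the data frame above)
treat_levels <- c("Fertilizer", "F. open", "F.+sugar", "F.+CaCO3",
                  "Bas.med.", "A.dest", "Tap water")
dat <- data.frame(
  treat  = factor(rep(treat_levels, times = c(3, 3, 3, 3, 3, 3, 2)),
                  levels = treat_levels),
  growth = c(0.02, -0.217, -0.273, 0.94, 0.78, 0.555, 0.188, -0.1, 0.02,
             0.245, 0.236, 0.456, 0.699, 0.727, 0.656, -0.01, 0, -0.01,
             0.03, -0.07)
)

# One symbol per measurement; vertical = TRUE puts the treatments on the x axis
stripchart(growth ~ treat, data = dat, vertical = TRUE,
           pch = 16, col = "navy", ylab = "growth")
abline(h = 0, lty = "dashed", col = "grey")
```

With so few points per group, every single measurement stays visible, which a box plot would hide.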


Statistical approach and the Bonferroni law
Questions
Are the treatments different?
Which medium is the best?
Is the best medium significantly better than the
others?

Hypotheses
H0: growth is the same in all treatments
HA: differences between media

Why can’t we just apply several t-tests?
If we have 7 treatments and want to test all against each other, we would need 7 ⋅ (7 − 1)/2 = 21 tests.
If we set α = 0.05, we will get 5% false positives when H0 is true, i.e. on average one of 20 tests is a false positive.
This means that if we do N tests, we may increase the overall α error in the worst case to a value of N ⋅ α.


This is called alpha-error-inflation or the Bonferroni law:

$$\alpha_{total} \le \sum_{i=1}^{N} \alpha_i = N \cdot \alpha$$

If we ignore the Bonferroni law, we end up in statistical fishing, i.e. we get spurious results just by chance.
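To get a feeling for the size of the problem, the numbers for the 7 treatments can be worked out directly. This is a small sketch: the Bonferroni bound is taken from the inequality above, while the "independent tests" figure rests on the simplifying assumption that all 21 tests are independent, which pairwise tests sharing the same groups are not.

```r
alpha <- 0.05
k     <- 7                  # number of treatments
N     <- choose(k, 2)       # number of pairwise tests: 7 * 6 / 2

# Upper bound from the Bonferroni inequality (may exceed 1)
bound <- N * alpha

# Family-wise error rate IF the N tests were independent (an assumption)
fwer_indep <- 1 - (1 - alpha)^N

# Per-test level that keeps the total error at or below alpha (Bonferroni correction)
alpha_per_test <- alpha / N

round(c(N = N, bound = bound, fwer_indep = fwer_indep,
        alpha_per_test = alpha_per_test), 4)
```

Under the independence assumption, the chance of at least one false positive among 21 tests is roughly two in three, far above the nominal 5%.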

Solutions
One approach can be to down-correct the alpha errors so that $\alpha_{total} = 0.05$.
The preferred approach is to use a method that does all tests simultaneously: the ANOVA.


ANOVA: Analysis of variances


Basic Idea
split the total variance into effect(s) and errors:

$$s_y^2 = s_{effect}^2 + s_\varepsilon^2$$

The most surprising thing is that we use variances to compare mean values. The reason for this is that differences of means contribute to the total variance of the whole sample. Sometimes, the variance components are also called variance within samples ($s_\varepsilon^2$) and variance between samples.

The way to separate the variances is a linear model.

Example
We have two brands of Clementines from a shop “E”, that we encode as “EB” and “EP”. We want to know whether the premium brand (“P”) and the basic brand (“B”) have a different weight.
Instead of a t-test we encode “EB” with 1 and “EP” with 2 and fit a linear model.
clem_edeka <- data.frame(
  brand = c("EP", "EB", "EB", "EB", "EB", "EB", "EB", "EB", "EB", "EB", "EB",
            "EB", "EB", "EB", "EP", "EP", "EP", "EP", "EP", "EP", "EP", "EB", "EP"),
  weight = c(88, 96, 100, 96, 90, 100, 92, 92, 102, 99, 86, 89, 99, 89, 75, 80,
             81, 96, 82, 98, 80, 107, 88)
)

# encode the brands numerically: "EB" = 1, "EP" = 2
clem_edeka$code <- as.numeric(factor(clem_edeka$brand))

# plot the data without default axes, then add labelled axes and the fitted line
plot(weight ~ code, data=clem_edeka, axes=FALSE)
m <- lm(weight ~ code, data=clem_edeka)
axis(1, at=c(1,2), labels=c("EB", "EP")); axis(2); box()
abline(m, col="blue")

Total variance

(var_tot <- var(clem_edeka$weight))

## [1] 68.98814

Residual variance (alias “within variance”)
(var_res <- var(residuals(m)))

## [1] 43.25

Between variance or explained variance (as a fraction of the total variance)
1 - var_res / var_tot

## [1] 0.3730807

Exercise:
Perform a t-Test for the two Clementine brands
Compare the p-value of the t-test with the p-value of
an ANOVA
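One possible way to approach the exercise is sketched below. It assumes the classical t-test with pooled variance (var.equal = TRUE); with the default Welch test of t.test the p-values would generally not be identical. The data frame name clem is chosen here for illustration only.

```r
clem <- data.frame(
  brand = c("EP", "EB", "EB", "EB", "EB", "EB", "EB", "EB", "EB", "EB", "EB",
            "EB", "EB", "EB", "EP", "EP", "EP", "EP", "EP", "EP", "EP", "EB", "EP"),
  weight = c(88, 96, 100, 96, 90, 100, 92, 92, 102, 99, 86, 89, 99, 89, 75, 80,
             81, 96, 82, 98, 80, 107, 88)
)

# Classical two-sample t-test with pooled variance
p_t <- t.test(weight ~ brand, data = clem, var.equal = TRUE)$p.value

# One-way ANOVA on the same data, using the brand as a factor
p_anova <- anova(lm(weight ~ brand, data = clem))[["Pr(>F)"]][1]

c(t_test = p_t, anova = p_anova)
all.equal(p_t, p_anova)
```

For two groups, the F statistic of the ANOVA is the square of the pooled-variance t statistic, which is why both tests lead to the same p-value.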


ANOVA in R
Back to the algae growth data. Let’s call the linear model m:

m <- lm(growth ~ treat, data=dat)

We can then print the coefficients of the linear model with summary(m), but the more common way is to use the anova function:

anova(m)

## Analysis of Variance Table
##
## Response: growth
## Df Sum Sq Mean Sq F value Pr(>F)
## treat 6 2.35441 0.39240 25.045 1.987e-06 ***
## Residuals 13 0.20368 0.01567
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The ANOVA table shows the F-tests testing for significance of all factors. In the table above, we have only a single factor.
We see that the treatment had a significant effect.
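To connect the table with the variance decomposition idea, the F value can be recomputed by hand from the sums of squares between and within treatments. This is a sketch for illustration; the data are re-entered so that the block is self-contained, and the helper names (ss_between, ss_within, ...) are not part of the original slides.

```r
treat_levels <- c("Fertilizer", "F. open", "F.+sugar", "F.+CaCO3",
                  "Bas.med.", "A.dest", "Tap water")
dat <- data.frame(
  treat  = factor(rep(treat_levels, times = c(3, 3, 3, 3, 3, 3, 2)),
                  levels = treat_levels),
  growth = c(0.02, -0.217, -0.273, 0.94, 0.78, 0.555, 0.188, -0.1, 0.02,
             0.245, 0.236, 0.456, 0.699, 0.727, 0.656, -0.01, 0, -0.01,
             0.03, -0.07)
)

tab <- anova(lm(growth ~ treat, data = dat))

grand_mean <- mean(dat$growth)
group_mean <- ave(dat$growth, dat$treat)        # group mean, repeated per observation

ss_between <- sum((group_mean - grand_mean)^2)  # "treat" sum of squares
ss_within  <- sum((dat$growth - group_mean)^2)  # residual sum of squares

df_between <- nlevels(dat$treat) - 1            # 7 - 1 = 6
df_within  <- nrow(dat) - nlevels(dat$treat)    # 20 - 7 = 13

# F is the ratio of the two mean squares
f_value <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

c(F = f_value, p = p_value)
```

The values should agree with the "F value" and "Pr(>F)" columns of anova(m).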


Posthoc tests
The test above only showed that the factor “treatment” had a significant effect, but we don’t know which levels of the factor differ. Here we apply a so-called posthoc test.
Different posthoc tests exist; here we use the Tukey HSD test, which is the most common.
The TukeyHSD function has a numerical and a graphical output.

Tukey HSD test


tk <- TukeyHSD(aov(m))
tk


## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = m)
##
## $treat
## diff lwr upr p adj
## F. open-Fertilizer 0.91500000 0.56202797 1.26797203 0.0000103
## F.+sugar-Fertilizer 0.19266667 -0.16030537 0.54563870 0.5211198
## F.+CaCO3-Fertilizer 0.46900000 0.11602797 0.82197203 0.0069447
## Bas.med.-Fertilizer 0.85066667 0.49769463 1.20363870 0.0000231
## A.dest-Fertilizer 0.15000000 -0.20297203 0.50297203 0.7579063
## Tap water-Fertilizer 0.13666667 -0.25796806 0.53130140 0.8837597
## F.+sugar-F. open -0.72233333 -1.07530537 -0.36936130 0.0001312
## F.+CaCO3-F. open -0.44600000 -0.79897203 -0.09302797 0.0102557
## Bas.med.-F. open -0.06433333 -0.41730537 0.28863870 0.9943994
## A.dest-F. open -0.76500000 -1.11797203 -0.41202797 0.0000721
## Tap water-F. open -0.77833333 -1.17296806 -0.38369860 0.0001913
## F.+CaCO3-F.+sugar 0.27633333 -0.07663870 0.62930537 0.1727182
## Bas.med.-F.+sugar 0.65800000 0.30502797 1.01097203 0.0003363
## A.dest-F.+sugar -0.04266667 -0.39563870 0.31030537 0.9994197
## Tap water-F.+sugar -0.05600000 -0.45063473 0.33863473 0.9985686
## Bas.med.-F.+CaCO3 0.38166667 0.02869463 0.73463870 0.0307459
## A.dest-F.+CaCO3 -0.31900000 -0.67197203 0.03397203 0.0879106
## Tap water-F.+CaCO3 -0.33233333 -0.72696806 0.06230140 0.1247914
## A.dest-Bas.med. -0.70066667 -1.05363870 -0.34769463 0.0001792
## Tap water-Bas.med. -0.71400000 -1.10863473 -0.31936527 0.0004507
## Tap water-A.dest -0.01333333 -0.40796806 0.38130140 0.9999997
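With 21 comparisons the table is hard to scan by eye. As a small sketch, the significant pairs can be extracted from the result matrix; the threshold of 0.05 and the sorting are choices made here for illustration, and the data are re-entered to keep the block self-contained.

```r
treat_levels <- c("Fertilizer", "F. open", "F.+sugar", "F.+CaCO3",
                  "Bas.med.", "A.dest", "Tap water")
dat <- data.frame(
  treat  = factor(rep(treat_levels, times = c(3, 3, 3, 3, 3, 3, 2)),
                  levels = treat_levels),
  growth = c(0.02, -0.217, -0.273, 0.94, 0.78, 0.555, 0.188, -0.1, 0.02,
             0.245, 0.236, 0.456, 0.699, 0.727, 0.656, -0.01, 0, -0.01,
             0.03, -0.07)
)

tk  <- TukeyHSD(aov(growth ~ treat, data = dat))
res <- tk$treat                                   # matrix: diff, lwr, upr, p adj

# keep only the pairs with an adjusted p-value below 0.05, smallest first
sig <- res[res[, "p adj"] < 0.05, , drop = FALSE]
sig[order(sig[, "p adj"]), ]

nrow(sig)   # number of significant pairwise differences
```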

Graphical output
par(las = 1) # las = 1 makes the y axis annotation horizontal
par(mar = c(4, 10, 3, 1)) # more space at the left for axis annotation
plot(tk)


ANOVA assumptions and diagnostics
The assumptions of the ANOVA are the same as for the
linear model. In short:
1. Independence of errors
2. Variance homogeneity
3. Approximate normality of errors
Again, graphical methods are preferred. The easiest is plot(m).

plot(m, which=1)


plot(m, which=2)


It is also possible to test variance homogeneity. Instead of an F-test, which can only compare two variances, we need a test that can compare more than two, for example the Fligner-Killeen test:
fligner.test(growth ~ treat, data=dat)

##
## Fligner-Killeen test of homogeneity of variances
##
## data: growth by treat
## Fligner-Killeen:med chi-squared = 4.2095, df = 6, p-value = 0.6483


One-way ANOVA with heterogeneous variances
If variances are not equal, we can use an extension of the Welch test for ≥ 2 samples, in R called oneway.test, instead of the one-way ANOVA:
oneway.test(growth ~ treat, data=dat)

##
## One-way analysis of means (not assuming equal variances)
##
## data: growth and treat
## F = 115.09, num df = 6.0000, denom df = 4.6224, p-value = 6.57e-05
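As a cross-check of how oneway.test relates to the classical analysis, the sketch below calls it with var.equal = TRUE, which should reproduce the F value of the ordinary one-way ANOVA, and compares it with the Welch version. The data are re-entered so that the block runs on its own.

```r
treat_levels <- c("Fertilizer", "F. open", "F.+sugar", "F.+CaCO3",
                  "Bas.med.", "A.dest", "Tap water")
dat <- data.frame(
  treat  = factor(rep(treat_levels, times = c(3, 3, 3, 3, 3, 3, 2)),
                  levels = treat_levels),
  growth = c(0.02, -0.217, -0.273, 0.94, 0.78, 0.555, 0.188, -0.1, 0.02,
             0.245, 0.236, 0.456, 0.699, 0.727, 0.656, -0.01, 0, -0.01,
             0.03, -0.07)
)

welch   <- oneway.test(growth ~ treat, data = dat)                   # unequal variances
classic <- oneway.test(growth ~ treat, data = dat, var.equal = TRUE) # equal variances
f_anova <- anova(lm(growth ~ treat, data = dat))[["F value"]][1]

c(welch   = unname(welch$statistic),
  classic = unname(classic$statistic),
  anova   = f_anova)

# the Welch version pays for its robustness with fewer denominator degrees of freedom
c(welch = unname(welch$parameter[2]), classic = unname(classic$parameter[2]))
```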


Two-way ANOVA
Example from a statistics text book (Crawley 2002)
Effects of diet and coat color on growth of hamsters in grams per time (constructed data set)

Factorial experiment (with replicates)
Each factor combination (cell) contains more than one observation.
Without replication: only one experiment per factor combination. This is possible, but does not allow identifying interaction effects.


Tidy data
hams <- data.frame(No = 1:12,
  growth = c(6.6, 7.2, 6.9, 8.3, 7.9, 9.2,
             8.3, 8.7, 8.1, 8.5, 9.1, 9.0),
  diet = rep(c("A", "B", "C"), each=2),   # 6 values, recycled to 12 rows
  coat = rep(c("light", "dark"), each=6)
)

Data set: Growth of hamsters (in grams)

No  growth  diet  coat
 1     6.6  A     light
 2     7.2  A     light
 3     6.9  B     light
 4     8.3  B     light
 5     7.9  C     light
 6     9.2  C     light
 7     8.3  A     dark
 8     8.7  A     dark
 9     8.1  B     dark
10     8.5  B     dark
11     9.1  C     dark
12     9.0  C     dark


Visualization and ANOVA

ANOVA
m <- lm(growth~coat*diet, data=hams)
anova(m)

## Analysis of Variance Table
##
## Response: growth
## Df Sum Sq Mean Sq F value Pr(>F)
## coat 1 2.61333 2.61333 7.2258 0.03614 *
## diet 2 2.66000 1.33000 3.6774 0.09069 .
## coat:diet 2 0.68667 0.34333 0.9493 0.43833
## Residuals 6 2.17000 0.36167
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Interaction plot
with(hams, interaction.plot(diet, coat, growth, col=c("brown", "orange"), lty=1, lwd=2))


Diagnostics
Assumptions
1. Independence of measurements (within samples)
2. Variance homogeneity of residuals
3. Normal distribution of residuals
Note: a test of the assumptions is only possible after fitting the model.
⇒ Fit the ANOVA model first, then check if it was correct!

Diagnostic tools
Box plot
Plot of residuals vs. mean values
Q-Q-plot of residuals
Fligner-Killeen test (alternative: some people recommend the Levene test)

par(mfrow=c(1, 2))
par(cex=1.2, las=1)
qqnorm(residuals(m))
qqline(residuals(m))

plot(residuals(m)~fitted(m))
abline(h=0)


fligner.test(growth ~ interaction(coat, diet), data=hams)

##
## Fligner-Killeen test of homogeneity of variances
##
## data: growth by interaction(coat, diet)
## Fligner-Killeen:med chi-squared = 10.788, df = 5, p-value = 0.05575


Sequential Holm-Bonferroni method
Also called Holm procedure (Holm 1979)
Easy to use
Can be applied to any multiple test problem
Less conservative than ordinary Bonferroni correction, but …
… still a very conservative approach
see also Wikipedia (https://en.wikipedia.org/wiki/Holm%E2%80%93Bonferroni_method)

Algorithm
1. Select the smallest p out of all n p-values
2. If p ⋅ n < α ⇒ significant, else STOP
3. Set n − 1 → n, remove the smallest p from the list and go to step 1.
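The three steps above can be written as a short loop. The function holm_significant is a hypothetical helper written for illustration, and the p-values are made-up example numbers; in practice the built-in p.adjust with method = "holm" does the same job and is used here only as a cross-check.

```r
# Step-down Holm procedure: returns TRUE for each p-value declared significant
holm_significant <- function(p, alpha = 0.05) {
  n      <- length(p)
  ord    <- order(p)               # positions of the p-values, smallest first
  signif <- logical(n)             # all FALSE initially
  for (i in seq_along(ord)) {
    remaining <- n - i + 1         # number of p-values still in the list
    if (p[ord[i]] * remaining < alpha) {
      signif[ord[i]] <- TRUE       # step 2: significant, continue with the next one
    } else {
      break                        # STOP: all larger p-values stay non-significant
    }
  }
  signif
}

p_raw <- c(0.576, 0.027, 0.0012)   # hypothetical raw p-values

holm_significant(p_raw)
p.adjust(p_raw, method = "holm") < 0.05   # same decision via adjusted p-values
```

Note that the second p-value (0.027) would be significant on its own, but 0.027 ⋅ 2 = 0.054 exceeds α, so the procedure stops there.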


Example
Growth rate per day ($d^{-1}$) of blue-green algae cultures (Pseudanabaena) after adding toxic peptides from another blue-green alga (Microcystis).
The original hypothesis was that Microcystin LR (MCYST) or a derivative of it (Substance A) inhibits growth.
mcyst <- data.frame(
  treat = factor(c(rep("Control", 5),
                   rep("MCYST", 5),
                   rep("Subst A", 5)),
                 levels=c("Control", "MCYST", "Subst A")),
  mu = c(0.086, 0.101, 0.086, 0.086, 0.099,
         0.092, 0.088, 0.093, 0.088, 0.086,
         0.095, 0.102, 0.106, 0.106, 0.106)
)

Approach 1: one-way ANOVA


par(mar=c(4, 8, 2, 1), las=1)
m <- lm(mu ~ treat, data=mcyst)
anova(m)

## Analysis of Variance Table
##
## Response: mu
## Df Sum Sq Mean Sq F value Pr(>F)
## treat 2 0.00053293 2.6647e-04 8.775 0.004485 **
## Residuals 12 0.00036440 3.0367e-05
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(TukeyHSD(aov(m)))


Approach 2: multiple t-Tests with sequential Bonferroni correction
We separate the data set into single subsets:
Control <- mcyst$mu[mcyst$treat == "Control"]
MCYST <- mcyst$mu[mcyst$treat == "MCYST"]
SubstA <- mcyst$mu[mcyst$treat == "Subst A"]

and perform 3 t-Tests:


p1 <- t.test(Control, MCYST)$p.value
p2 <- t.test(Control, SubstA)$p.value
p3 <- t.test(MCYST, SubstA)$p.value

The following shows the raw p-values without correction:



c(p1, p2, p3)

## [1] 0.576275261 0.027378832 0.001190592

and with Holm correction (the default method of p.adjust, stated here explicitly):

p.adjust(c(p1, p2, p3), method = "holm")

## [1] 0.576275261 0.054757664 0.003571775
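The same analysis can be written more compactly with pairwise.t.test from base R. This is a sketch of an alternative, not the approach used in the slides; pool.sd = FALSE is set so that each pair is tested with a separate Welch t-test, as in the t.test calls above, and the data are re-entered so the block runs on its own.

```r
mcyst <- data.frame(
  treat = factor(rep(c("Control", "MCYST", "Subst A"), each = 5),
                 levels = c("Control", "MCYST", "Subst A")),
  mu = c(0.086, 0.101, 0.086, 0.086, 0.099,
         0.092, 0.088, 0.093, 0.088, 0.086,
         0.095, 0.102, 0.106, 0.106, 0.106)
)

# all pairwise t-tests in one call, Holm-adjusted
pw <- pairwise.t.test(mcyst$mu, mcyst$treat,
                      p.adjust.method = "holm", pool.sd = FALSE)
pw$p.value   # lower-triangular matrix of adjusted p-values
```

The result is a matrix with one adjusted p-value per pair of treatments, which scales better than hand-written subsets when there are more than three groups.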


Conclusions
Statistical methods
In case of Holm-corrected t-tests, only a single p-value (MCYST vs. Subst A) remains significant. This indicates that in this case, Holm’s method is more conservative than TukeyHSD (only one compared to two significant effects).
An ANOVA with posthoc test is in general preferred, but the sequential Holm-Bonferroni method can be helpful in special cases.
Moreover, it demonstrates clearly that massive multiple testing needs to be avoided.
⇒ ANOVA is to be preferred, when possible.

Interpretation
Regarding our original hypothesis, we can see that
MCYST and SubstA did not inhibit growth of
Pseudanabaena. In fact SubstA stimulated growth.
This was contrary to our expectations – the
biological reason was then found 10 years later.


More about this can be found in Jähnichen, Petzoldt, and Benndorf (2001), Jähnichen et al. (2007), Jähnichen, Long, and Petzoldt (2011), Zilliges et al. (2011) or Dziallas and Grossart (2011).


ANCOVA

Annette Dobson’s birthweight data. A data set from a statistics textbook (Dobson 2013): birth weight of boys and girls as a function of the pregnancy week.

The data set is found at different places on the internet and in different versions.
Here is the version that is found in an R demo:
demo(lm.glm)


## Birth Weight Data see stats/demo/lm.glm.R
dobson <- data.frame(
  week = c(40, 38, 40, 35, 36, 37, 41, 40, 37, 38, 40, 38,
           40, 36, 40, 38, 42, 39, 40, 37, 36, 38, 39, 40),
  weight = c(2968, 2795, 3163, 2925, 2625, 2847, 3292, 3473, 2628, 3176,
             3421, 2975, 3317, 2729, 2935, 2754, 3210, 2817, 3126, 2539,
             2412, 2991, 2875, 3231),
  sex = gl(2, 12, labels=c("M","F"))
)


Linear regression, ANOVA and ANCOVA
ANCOVA (analysis of covariance) deals with the
comparison of regression lines
Simply speaking, we can distinguish the following:
independent variables have metric scale: linear
regression
independent variables all nominal (factor): ANOVA
independent variables are mixed nominal and metric:
ANCOVA

For the linear models discussed so far, the dependent variable is always metric, while binary or nominal dependent variables can be handled with generalized linear models (GLM).


Annette Dobson’s birthweight data
Why not just use a t-test?
boxplot(weight ~ sex,data=dobson, ylab="weight")

t.test(weight ~ sex, data=dobson, var.equal=TRUE)


##
## Two Sample t-test
##
## data: weight by sex
## t = 0.97747, df = 22, p-value = 0.339
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -126.3753 351.7086
## sample estimates:
## mean in group M mean in group F
## 3024.000 2911.333

The box plot shows much overlap and the difference is not significant, because the t-test ignores important information: the pregnancy week.


ANCOVA makes use of covariates
m <- lm(weight ~ week * sex, data=dobson)
anova(m)

## Analysis of Variance Table
##
## Response: weight
## Df Sum Sq Mean Sq F value Pr(>F)
## week 1 1013799 1013799 31.0779 1.862e-05 ***
## sex 1 157304 157304 4.8221 0.04006 *
## week:sex 1 6346 6346 0.1945 0.66389
## Residuals 20 652425 32621
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


How this works


plot(weight ~ week, data=dobson, col=c("blue","red")[as.numeric(sex)], pch=16)

summary(m)

##
## Call:
## lm(formula = weight ~ week * sex, data = dobson)
##
## Residuals:
## Min 1Q Median 3Q Max
## -246.69 -138.11 -39.13 176.57 274.28
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1268.67 1114.64 -1.138 0.268492
## week 111.98 29.05 3.855 0.000986 ***
## sexF -872.99 1611.33 -0.542 0.593952
## week:sexF 18.42 41.76 0.441 0.663893
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 180.6 on 20 degrees of freedom
## Multiple R-squared: 0.6435, Adjusted R-squared: 0.59
## F-statistic: 12.03 on 3 and 20 DF, p-value: 0.000101

p <- coef(m)
# intercept and slope of the reference level "M" (plotted as blue points)
abline(a=p[1], b=p[2], col="blue")
# adding the sexF terms gives the line for "F" (plotted as red points)
abline(a=p[1]+p[3], b=p[2]+p[4], col="red")

## the result is the same as if we fitted separate linear models
fem <- lm(weight ~ week, data=dobson, subset = sex=="F")
mal <- lm(weight ~ week, data=dobson, subset = sex=="M")
abline(fem, col="black", lty="dashed")
abline(mal, col="black", lty="dashed")
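The claim that the interaction model reproduces the two separate regressions can also be checked numerically instead of graphically. The following sketch re-enters the data and compares intercepts and slopes; the names line_M and line_F are introduced here for illustration only.

```r
dobson <- data.frame(
  week = c(40, 38, 40, 35, 36, 37, 41, 40, 37, 38, 40, 38,
           40, 36, 40, 38, 42, 39, 40, 37, 36, 38, 39, 40),
  weight = c(2968, 2795, 3163, 2925, 2625, 2847, 3292, 3473, 2628, 3176,
             3421, 2975, 3317, 2729, 2935, 2754, 3210, 2817, 3126, 2539,
             2412, 2991, 2875, 3231),
  sex = gl(2, 12, labels = c("M", "F"))
)

m <- lm(weight ~ week * sex, data = dobson)
p <- coef(m)

# lines implied by the ANCOVA model: reference level M, and M plus the sexF terms
line_M <- unname(c(p["(Intercept)"], p["week"]))
line_F <- unname(c(p["(Intercept)"] + p["sexF"], p["week"] + p["week:sexF"]))

# separate regressions per sex
mal <- lm(weight ~ week, data = dobson, subset = sex == "M")
fem <- lm(weight ~ week, data = dobson, subset = sex == "F")

rbind(ancova_M = line_M, separate_M = unname(coef(mal)),
      ancova_F = line_F, separate_F = unname(coef(fem)))
```

The advantage of the joint model is not a different fit, but that it allows testing whether intercepts and slopes differ between the groups.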


Pitfalls of the ANOVA described so far
1. Heterogeneity of variance
p-values can be biased (i.e. misleading or wrong)
use a one-way ANOVA for unequal variances (Welch, 1951); in R: oneway.test

2. Unbalanced case:
unequal number of samples for each factor combination
ANOVA results depend on the order of factors in the model formula.
Classical method: Type II or Type III ANOVA
Modern approach: model selection and likelihood ratio tests


Type II and Type III ANOVA
function Anova (with upper case A) in package car
Help of function Anova: “Type-II tests are calculated according to the principle of marginality, testing each term after all others, except ignoring the term’s higher-order relatives; so-called type-III tests violate marginality, testing each term in the model after all of the others.”
Conclusion: use Type II and don’t try to interpret single terms in case of significant interactions.


Type II and Type III ANOVA: Example

library("car")
m <- lm(growth ~ coat * diet, data = hams)
Anova(m, type="II")

## Anova Table (Type II tests)
##
## Response: growth
## Sum Sq Df F value Pr(>F)
## coat 2.61333 1 7.2258 0.03614 *
## diet 2.66000 2 3.6774 0.09069 .
## coat:diet 0.68667 2 0.9493 0.43833
## Residuals 2.17000 6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Model selection – a paradigm change
Problem:
In complicated models, p-values depend on the number (and sometimes on the order) of included factors and interactions.
The H0-based approach becomes confusing, e.g. because of contradictory p-values.

Alternative approach:
Comparison of different model candidates instead of p-value based testing.
Model with all potential effects → full model,
Omit single factors → reduced models (several!),
No influence factors (only mean value) → null model.
Which model is the best → minimal adequate model?


How can we measure which model is the best?
Compromise between model fit and model complexity (number of parameters, k).
Goodness of fit: Likelihood L (measures how well the data match a given model).
Log likelihood: makes the criterion additive.
AIC (Akaike Information Criterion):

$$AIC = -2 \ln(L) + 2k$$

Alternative: BIC (Bayesian Information Criterion), takes the sample size (n) into account:

$$BIC = -2 \ln(L) + k \cdot \ln(n)$$

The model with the smallest AIC (or BIC) is considered the minimal adequate (i.e. optimal) model.
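Both formulas can be evaluated by hand from the log likelihood of a fitted model and compared with the built-in AIC and BIC functions. The sketch below uses the hamster data with an additive model as an example; note that for a linear model R counts the residual variance as an additional parameter in k.

```r
hams <- data.frame(
  growth = c(6.6, 7.2, 6.9, 8.3, 7.9, 9.2, 8.3, 8.7, 8.1, 8.5, 9.1, 9.0),
  diet   = rep(c("A", "B", "C"), each = 2, times = 2),
  coat   = rep(c("light", "dark"), each = 6)
)

m  <- lm(growth ~ diet + coat, data = hams)
ll <- logLik(m)

k <- attr(ll, "df")   # number of estimated parameters (4 coefficients + variance)
n <- nobs(m)          # sample size

aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + k * log(n)

c(aic_manual = aic_manual, AIC = AIC(m),
  bic_manual = bic_manual, BIC = BIC(m))
```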


Model Selection and Likelihood Ratio Tests
m1 <- lm(growth ~ diet * coat, data=hams)
m2 <- lm(growth ~ diet + coat, data=hams)
anova(m1, m2)

## Analysis of Variance Table
##
## Model 1: growth ~ diet * coat
## Model 2: growth ~ diet + coat
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 6 2.1700
## 2 8 2.8567 -2 -0.68667 0.9493 0.4383

AIC(m1, m2)

## df AIC
## m1 7 27.53237
## m2 5 26.83151

Likelihood ratio test compares two models (anova with > 1 model)
Model with interaction (m1) not significantly better than model without interaction (m2).
Conclusion: take the simpler model


Automatic Model Selection
The full model is supplied to the step function.
The model with the smallest AIC is the minimal adequate model:

m1 <- lm(growth ~ diet * coat, data=hams)
step(m1)

## Start: AIC=-8.52
## growth ~ diet * coat
##
## Df Sum of Sq RSS AIC
## - diet:coat 2 0.68667 2.8567 -9.2230
## <none> 2.1700 -8.5222
##
## Step: AIC=-9.22
## growth ~ diet + coat
##
## Df Sum of Sq RSS AIC
## <none> 2.8567 -9.2230
## - diet 2 2.6600 5.5167 -5.3256
## - coat 1 2.6133 5.4700 -3.4275

##
## Call:
## lm(formula = growth ~ diet + coat, data = hams)
##
## Coefficients:
## (Intercept) dietB dietC coatlight
## 8.1667 0.2500 1.1000 -0.9333
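The AIC values printed by step (about -8.5 and -9.2) differ from those returned by AIC for the same models (about 27.5 and 26.8). A likely source of confusion, sketched below: step relies on extractAIC, which drops additive constants of the log likelihood. For linear models fitted to the same data the difference is the same constant for every model, so the ranking is unaffected.

```r
hams <- data.frame(
  growth = c(6.6, 7.2, 6.9, 8.3, 7.9, 9.2, 8.3, 8.7, 8.1, 8.5, 9.1, 9.0),
  diet   = rep(c("A", "B", "C"), each = 2, times = 2),
  coat   = rep(c("light", "dark"), each = 6)
)

m1 <- lm(growth ~ diet * coat, data = hams)
m2 <- lm(growth ~ diet + coat, data = hams)

aic_full <- c(m1 = AIC(m1), m2 = AIC(m2))                      # full log likelihood
aic_step <- c(m1 = extractAIC(m1)[2], m2 = extractAIC(m2)[2])  # as printed by step()

aic_full - aic_step          # identical offset for both models

n <- nrow(hams)
n * (1 + log(2 * pi)) + 2    # the offset expected for a linear model with n observations
```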


Summary
Linear models form the basis of many statistical
methods
Linear regression
ANOVA, ANCOVA, GLM, GAM, GLMM, . . .
ANOVA/ANCOVA instead of multiple testing

ANOVA is more powerful than multiple tests:
one big experiment needs a smaller total n than many small experiments together
identification of interaction effects
elimination of covariates

Model selection vs. p-value based testing
paradigm shift in statistics: AIC instead of p-value
more reliable, especially for imbalanced or complex designs
but: p-value based tests are sometimes easier to understand


Avoid p-value hacking
Do NOT repeat experiments until a significant p-value is found.
“As debate rumbles on about how and how much poor statistics is to blame for poor reproducibility, Nature asked influential statisticians to recommend one change to improve science. The common theme? The problem is not our maths, but ourselves.”

Five ways to fix statistics. Comment in Nature.
Leek et al. (2017) https://doi.org/10.1038/d41586-017-07522-z
1. Jeff Leek: Adjust for human cognition
2. Blakeley B. McShane & Andrew Gelman:
Abandon statistical significance
3. David Colquhoun: State false-positive risk, too
4. Michèle B. Nuijten: Share analysis plans and
results
5. Steven N. Goodman: Change norms from within


Another blog post that aims to improve understanding:
http://daniellakens.blogspot.de/2017/12/understanding-common-misconceptions.html?m=1

My conclusion: The p-value is still useful, but apply it with great care.


Copyright
This resource was created by tpetzoldt
(github.com/tpetzoldt). It is provided as is without
warranty.
