
Lecture 5: Multiple Comparisons (Post hoc Tests)

Outline

• Contrasts
• Multiple comparisons (Post hoc tests)
• Bonferroni method
• Tukey’s HSD method

Introduction
• The ANOVA F test can only tell us whether a difference exists among the population means
• It cannot tell us where the difference (if any) lies


Introduction

• Imagine we just conduct all pairwise t tests (using α = 0.05 for each test) to compare the group means
• What is the problem with this approach?

It inflates the overall Type I error rate!


Introduction
• Two better approaches: contrasts and multiple
comparison procedures.

Contrasts
• Preferred approach
• Pose specific questions regarding comparisons
among the means
• Questions formulated before data collection

Case study
• Case study in reference book (Evaluation of a New
Educational Product): your company markets
educational materials aimed at parents of young
children. You are planning a new product that is
designed to improve children’s reading
comprehension. Your product is based on new ideas
from educational research, and you would like to
claim that children will acquire better reading
comprehension skills utilizing these new ideas than
with the traditional approach. Your marketing material
will include the results of a study conducted to
compare two versions of the new approach with the
traditional method.


Case study
• The standard method is called Basal, and the two
variations of the new method are called DRTA and
Strat. Education researchers randomly divided 66
children into three groups of 22. Each group was
taught by one of the three methods. The response
(or outcome) variable is a measure of reading
comprehension called COMP that was obtained by a
test taken after the instruction was completed. Can
you claim that the new methods are superior to
Basal?


Case study
• Let’s explain why we should use contrasts in this case
• The details of contrasts are beyond the scope of this lecture


Multiple comparisons (Post hoc tests)
• In many studies, specific questions cannot be formulated in advance of the analysis
• If we reject the ANOVA H0 → which pairs of means differ?
• Use a multiple comparison procedure (the focus of this lecture)
• We restrict our attention to multiple comparison procedures applied after rejecting the one-way ANOVA H0



Case study
• Case study: the same case study (Evaluation of a
New Educational Product) but now let’s assume that
we cannot formulate specific questions regarding the
population means in advance of the analysis.
• First, we run one-way ANOVA (we can verify that the
assumptions of one-way ANOVA are satisfied).
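
A minimal sketch of how this first step could look in R, assuming the data sit in a data frame called eduProduct with columns Comp and Group (the names that appear in the R output later in the lecture); the exact code behind the slides is not shown:

# A minimal sketch, assuming a data frame eduProduct with columns Comp and Group
eduProduct$Group <- factor(eduProduct$Group)   # make sure the teaching method is a factor
fit <- aov(Comp ~ Group, data = eduProduct)    # one-way ANOVA
summary(fit)                                   # F test (the slides report F = 4.48, df = 2 and 63, p = 0.015)
plot(fit, which = 1)                           # residuals vs fitted values: check equal spread
plot(fit, which = 2)                           # normal Q-Q plot of the residuals: check normality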


Case study
• For one-way ANOVA, we found that the means were
not all the same (F = 4.48, df = 2 and 63, p-value =
0.015)
• Because we reject the one-way ANOVA H0 → use a multiple comparison procedure


Multiple comparison procedures


• With three groups, there are 3 pairs of population means
• For each of these pairs, we can write a t statistic for
the difference in means.
• Example: to compare Basal with DRTA (1 with 2):
t_{12} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{1/n_1 + 1/n_2}} = -2.99



Multiple comparison procedures


• To perform a multiple comparison procedure, compute t
statistics for all pairs of means:
t_{ij} = \frac{\bar{x}_i - \bar{x}_j}{s_p \sqrt{1/n_i + 1/n_j}}

• s_p: the pooled estimate of the standard deviation from all the groups
• Here s_p = \sqrt{MSE}, where MSE is the mean square for error from the one-way ANOVA
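
To make the formula concrete, here is a rough R sketch of the Basal vs DRTA comparison from the previous slide, assuming the aov fit from earlier and 22 children per group; sigma() on the fitted model returns the square root of MSE, i.e. s_p:

# A minimal sketch, assuming fit <- aov(Comp ~ Group, data = eduProduct) and n = 22 per group
sp <- sigma(fit)                                              # pooled SD: sqrt(MSE) of the one-way ANOVA
grp_means <- tapply(eduProduct$Comp, eduProduct$Group, mean)  # sample mean of each group
n <- 22
t12 <- unname(grp_means["Basal"] - grp_means["DRTA"]) / (sp * sqrt(1/n + 1/n))
t12                                                           # about -2.99, as on the previous slide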


Multiple comparison procedures


• If |tij| ≥ t** → conclude that µi and µj are different
• The value of t** depends on which multiple
comparison procedure we choose.


Multiple comparison procedures
• Use a smaller comparisonwise error rate for each individual test
• The maximum overall (or familywise) Type I error rate across all comparisons is then α (called αoverall)



Multiple comparison procedures

Multiple comparison procedures include:
• Bonferroni
• Tukey’s HSD
• Other methods

Bonferroni method
• If we want the maximum probability of Type I error for the overall study to be αoverall → use a comparisonwise error rate of αoverall/C (where C is the number of comparisons), as illustrated below
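
For example, with C = 3 pairwise comparisons and αoverall = 0.05, each individual test is run at 0.05/3 ≈ 0.0167. Equivalently, R reports Bonferroni-adjusted p-values (each raw p-value multiplied by C and capped at 1), which are then compared directly with αoverall. A small illustration, using raw p-values chosen so that the adjusted values roughly match the case-study output on the next slide:

# Illustrative raw p-values for the three comparisons (approximate, for illustration only)
raw_p <- c(0.004, 0.095, 0.202)
p.adjust(raw_p, method = "bonferroni")   # each value multiplied by 3: about 0.012, 0.285, 0.606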


Bonferroni method
• R output for the Bonferroni method applied to the
case study:
Pairwise comparisons using t tests with pooled SD

data: eduProduct$Comp and eduProduct$Group

Basal DRTA
DRTA 0.012 -
Strat 0.285 0.606

P value adjustment method: bonferroni
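
Output in this form is what base R's pairwise.t.test() produces; a call along the following lines would generate it (a sketch, not necessarily the exact code used for these slides):

# A sketch of the call behind output of this form
pairwise.t.test(eduProduct$Comp, eduProduct$Group,
                p.adjust.method = "bonferroni")   # pooled SD is the default for unpaired comparisons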



Bonferroni method
• Compare the p-values from the above output with αoverall
• Make a decision and write a conclusion
• Limitation of the Bonferroni method: it is conservative, and its power drops as the number of comparisons grows


Tukey’s HSD method


• The theory is beyond the scope of this course
• We focus on reading the R output and interpreting the results


R output for the Tukey’s HSD method applied to the case study:

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = Comp ~ Group, data = eduProduct)

Linear Hypotheses:
                    Estimate Std. Error t value Pr(>|t|)
DRTA - Basal == 0      5.682      1.904   2.985   0.0111 *
Strat - Basal == 0     3.227      1.904   1.695   0.2150
Strat - DRTA == 0     -2.455      1.904  -1.289   0.4064
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
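
Output headed "Simultaneous Tests for General Linear Hypotheses" comes from the multcomp package; a call roughly like the following would produce it (a sketch; base R's TukeyHSD() is a simpler alternative that reports adjusted p-values together with confidence intervals):

# A sketch using the multcomp package (assumed installed)
library(multcomp)
fit <- aov(Comp ~ Group, data = eduProduct)
summary(glht(fit, linfct = mcp(Group = "Tukey")))   # all pairwise (Tukey) contrasts, single-step adjusted p-values

# Simpler base-R alternative:
TukeyHSD(fit)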



Comparing Bonferroni and Tukey’s HSD method


• Both control the Type I error rate very well
• Both are conservative
• Bonferroni is more powerful when the number of comparisons is small
• Tukey’s HSD has more power when testing a larger number of means


Practical suggestions
• There are many different post-hoc tests. When we have equal sample sizes and the group standard deviations are similar, use Tukey’s HSD (it has good power and tight control over the Type I error rate in this case)
• If we want guaranteed control over the Type I error rate, Bonferroni is an option
• If there is doubt about the assumptions, use a different method


Another example
IQ scores from three groups of undergraduates of different disciplines
are measured and stored in iqdata.csv. Run the ANOVA and identify the
possible source(s) of differences among IQ scores of these three majors.

#Means
iqdata$group: 1
[1] 35.8
---------------------------------------------------------------
iqdata$group: 2
[1] 35.86667
---------------------------------------------------------------
iqdata$group: 3
[1] 48.2


Another example

#Standard Deviation
iqdata$group: 1
[1] 7.79377
---------------------------------------------------------------
iqdata$group: 2
[1] 6.937133
---------------------------------------------------------------
iqdata$group: 3
[1] 2.396426
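
These group-by-group summaries have the shape of what R's by() function prints; a sketch of calls that could produce them, assuming iqdata.csv has columns iq and group:

# A sketch, assuming iqdata.csv contains columns iq and group
iqdata <- read.csv("iqdata.csv")
by(iqdata$iq, iqdata$group, mean)   # group means
by(iqdata$iq, iqdata$group, sd)     # group standard deviations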



Another example

#ANOVA
Df Sum Sq Mean Sq F value Pr(>F)
group 2 1529 764.7 20.02 7.84e-07 ***
Residuals 42 1604 38.2
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1



Another example

#Bonferroni
Pairwise comparisons using t tests with pooled SD

data: iqdata$iq and iqdata$group

1 2
2 1 -
3 6.3e-06 7.0e-06

P value adjustment method: bonferroni


Another example

#Tukey’s HSD
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: aov(formula = iq ~ group, data = iqdata)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
2 - 1 == 0 0.06667 2.25694 0.030 1
3 - 1 == 0 12.40000 2.25694 5.494 <1e-05 ***
3 - 2 == 0 12.33333 2.25694 5.465 <1e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
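
Putting the pieces together, a rough sketch of the full workflow that could reproduce the output for this example (group is treated as a factor so that aov and the Tukey contrasts behave as shown above):

# A sketch of the full workflow for the iqdata example
iqdata <- read.csv("iqdata.csv")
iqdata$group <- factor(iqdata$group)    # three majors as a factor

fit <- aov(iq ~ group, data = iqdata)   # one-way ANOVA
summary(fit)                            # the slides report F = 20.02, p = 7.84e-07

# Bonferroni-adjusted pairwise t tests with pooled SD
pairwise.t.test(iqdata$iq, iqdata$group, p.adjust.method = "bonferroni")

# Tukey contrasts via the multcomp package
library(multcomp)
summary(glht(fit, linfct = mcp(group = "Tukey")))

Both post-hoc procedures point the same way: groups 1 and 2 do not differ noticeably, while group 3 differs from each of them, so group 3 is the likely source of the differences.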

Conclusion

• Contrasts
• Multiple comparisons (Post hoc tests)
• Bonferroni method
• Tukey’s HSD method

