
Statistics for Business and Economics

Ninth Edition, Global Edition

Chapter 15
Analysis of
Variance

Copyright © 2020 Pearson Education Ltd. All Rights Reserved. Slide - 1


Chapter Goals
After completing this chapter, you should be
able to:
• Recognize situations in which to use analysis of variance
• Understand different analysis of variance designs
• Perform a one-way and two-way analysis of variance and
interpret the results
• Conduct and interpret a Kruskal-Wallis test
• Analyze two-factor analysis of variance tests with more
than one observation per cell



Section 15.1 Comparison of Several
Population Means (1 of 2)
• Tests were presented in Chapter 10 for the difference
between two population means
• In this chapter these procedures are extended to tests for
the equality of more than two population means
• The null hypothesis is that the population means are all the
same
• The critical factor is the variability involved in the data
– If the variability around the sample means is small
compared with the variability among the sample
means, we reject the null hypothesis



Section 15.1 Comparison of Several
Population Means (2 of 2)

[Figure: two panels comparing the spread around the sample means]
• Left panel: small variation around the sample means compared to the variation among the sample means
• Right panel: large variation around the sample means compared to the variation among the sample means

Section 15.2 One-Way Analysis of
Variance
• Evaluate the difference among the means of three
or more groups
Examples: Average production for 1st, 2nd, and 3rd shifts
Expected mileage for five brands of tires

• Assumptions
– Populations are normally distributed
– Populations have equal variances
– Samples are randomly and independently drawn



Hypotheses of One-Way ANOVA
• 𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 = ⋯ = 𝜇𝐾
– All population means are equal
– i.e., no variation in means between groups
• 𝐻1 : 𝜇𝑖 ≠ 𝜇𝑗 for at least one pair i, j
– At least one population mean is different
– i.e., there is variation between groups
– Does not mean that all population means are different
(some pairs may be the same)



One-Way ANOVA (1 of 2)
𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 = ⋯ = 𝜇𝐾

𝐻1 : Not all 𝜇𝑖 are the same

All means are the same: the null hypothesis is true (no variation between groups).


One-Way ANOVA (2 of 2)
𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 = ⋯ = 𝜇𝐾
𝐻1 : Not all 𝜇𝑖 are the same
At least one mean is different: the null hypothesis is not true (variation is present between groups).
[Figure: two alternative patterns of unequal group means]


Variability
• The variability of the data is a key factor in testing the equality of means
• In each of the two cases below (A and B), the sample means may look different, but the large variation within groups in case B weakens the evidence that the population means differ

[Figure: case A, small variation within groups; case B, large variation within groups]


Sum of Squares Decomposition (1 of 2)
• Total variation can be split into two parts:
SST = SSW + SSG
SST = Total Sum of Squares
Total Variation = the aggregate dispersion of the individual data
values across the various groups

SSW = Sum of Squares Within Groups
Within-Group Variation = dispersion that exists among the data values within a particular group

SSG = Sum of Squares Between Groups
Between-Group Variation = dispersion between the group sample means

Sum of Squares Decomposition (2 of 2)



Total Sum of Squares (1 of 2)
SST = SSW + SSG
$$\mathrm{SST} = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2$$

Where:
SST = Total sum of squares
K = number of groups (levels or treatments)
$n_i$ = number of observations in group i
$x_{ij}$ = jth observation from group i
$\bar{x}$ = overall sample mean

Total Sum of Squares (2 of 2)
$$\mathrm{SST} = (x_{11} - \bar{x})^2 + (x_{12} - \bar{x})^2 + \cdots + (x_{K n_K} - \bar{x})^2$$


Within-Group Variation (1 of 3)
SST = SSW + SSG

$$\mathrm{SSW} = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

Where:
SSW = Sum of squares within groups
K = number of groups
$n_i$ = sample size from group i
$\bar{x}_i$ = sample mean from group i
$x_{ij}$ = jth observation in group i


Within-Group Variation (2 of 3)
$$\mathrm{SSW} = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

SSW sums the variation within each group and then adds over all groups.

Mean Square Within = SSW / degrees of freedom:

$$\mathrm{MSW} = \frac{\mathrm{SSW}}{n - K}$$


Within-Group Variation (3 of 3)
$$\mathrm{SSW} = (x_{11} - \bar{x}_1)^2 + (x_{12} - \bar{x}_1)^2 + \cdots + (x_{K n_K} - \bar{x}_K)^2$$


Between-Group Variation (1 of 3)
SST=SSW+SSG
$$\mathrm{SSG} = \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2$$

Where:
SSG = Sum of squares between groups
K = number of groups
$n_i$ = sample size from group i
$\bar{x}_i$ = sample mean from group i
$\bar{x}$ = grand mean (mean of all data values)

Between-Group Variation (2 of 3)
$$\mathrm{SSG} = \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2$$

SSG measures the variation due to differences between groups.

Mean Square Between Groups = SSG / degrees of freedom:

$$\mathrm{MSG} = \frac{\mathrm{SSG}}{K - 1}$$


Between-Group Variation (3 of 3)
$$\mathrm{SSG} = n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \cdots + n_K(\bar{x}_K - \bar{x})^2$$
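The decomposition SST = SSW + SSG can be checked numerically. The sketch below is only an illustration: the three groups are hypothetical values (not data from the slides), and the helper names `mean` and `sums_of_squares` are made up for this example.

```python
# Hypothetical data: three groups of unequal size (not from the slides).
groups = [
    [12.0, 15.0, 11.0, 14.0],
    [18.0, 20.0, 17.0],
    [9.0, 8.0, 11.0, 10.0, 12.0],
]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def sums_of_squares(groups):
    """Return (SST, SSW, SSG) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Total: every observation against the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Within: every observation against its own group mean.
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # Between: each group mean against the grand mean, weighted by n_i.
    ssg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return sst, ssw, ssg


sst, ssw, ssg = sums_of_squares(groups)
print(f"SST = {sst:.4f}, SSW = {ssw:.4f}, SSG = {ssg:.4f}")
print("SST equals SSW + SSG:", abs(sst - (ssw + ssg)) < 1e-9)
```

The identity holds for any grouping, including unequal group sizes, because SSG weights each squared deviation by its group's sample size.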



Obtaining the Mean Squares
$$\mathrm{MST} = \frac{\mathrm{SST}}{n - 1}$$

$$\mathrm{MSW} = \frac{\mathrm{SSW}}{n - K}$$

$$\mathrm{MSG} = \frac{\mathrm{SSG}}{K - 1}$$

Where n = sum of the sample sizes from all groups
K = number of populations (groups)


One-Way ANOVA Table

K = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom



One-Factor ANOVA F Test Statistic
𝐻0 : 𝜇1 = 𝜇2 = ⋯ = 𝜇𝐾
𝐻1 : At least two population means are different
• Test statistic

$$F = \frac{\mathrm{MSG}}{\mathrm{MSW}}$$

MSG is the mean square between groups
MSW is the mean square within groups

• Degrees of freedom
– df1 = 𝐾 − 1 (K = number of groups)
– df2 = 𝑛 − 𝐾 (n = sum of sample sizes from all groups)

Interpreting the F Statistic
• The F statistic is the ratio of the between-group estimate of variance to the within-group estimate of variance
– The ratio is never negative, because both estimates are built from squared deviations
– df1 = 𝐾 − 1 will typically be small
– df2 = 𝑛 − 𝐾 will typically be large

Decision Rule:
• Reject 𝐻0 if

$$F > F_{K-1,\,n-K,\,\alpha}$$


One-Factor ANOVA F Test Example
You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance?


One-Factor ANOVA Example: Scatter
Diagram

[Scatter diagram: distance by club, with group means $\bar{x}_1 = 249.2$, $\bar{x}_2 = 226.0$, $\bar{x}_3 = 205.8$ and grand mean $\bar{x} = 227.0$]

One-Factor ANOVA Example
Computations
$\bar{x}_1 = 249.2$, $n_1 = 5$
$\bar{x}_2 = 226.0$, $n_2 = 5$
$\bar{x}_3 = 205.8$, $n_3 = 5$
$\bar{x} = 227.0$, $n = 15$, $K = 3$

$$\mathrm{SSG} = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4716.4$$

$$\mathrm{SSW} = (254 - 249.2)^2 + (263 - 249.2)^2 + \cdots + (204 - 205.8)^2 = 1119.6$$

$$\mathrm{MSG} = \frac{4716.4}{3 - 1} = 2358.2$$

$$\mathrm{MSW} = \frac{1119.6}{15 - 3} = 93.3$$

$$F = \frac{2358.2}{93.3} = 25.275$$

One-Factor ANOVA Example Solution
𝐻0 : 𝜇1 = 𝜇2 = 𝜇3
𝐻1 : 𝜇𝑖 not all equal
α = .05
df1 = 2, df2 = 12

Test Statistic:

$$F = \frac{\mathrm{MSG}}{\mathrm{MSW}} = \frac{2358.2}{93.3} = 25.275$$

Decision:
Reject 𝐻0 at α = 0.05
Conclusion:
There is evidence that at least one 𝜇𝑖 differs from the rest
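The arithmetic of the golf club example can be reproduced from the summary statistics alone. This is a minimal sketch: the group means, sample sizes, and SSW are the values quoted on the slides, while the critical value 3.89 for F with (2, 12) degrees of freedom at α = .05 is an assumption taken from a standard F table rather than from the deck.

```python
# Summary statistics quoted on the slides for the three golf clubs.
means = [249.2, 226.0, 205.8]
sizes = [5, 5, 5]
ssw = 1119.6  # within-group sum of squares, as given on the slide

# Assumption: F(2, 12) critical value at alpha = .05 from a standard F table.
F_CRITICAL = 3.89

n = sum(sizes)
k = len(means)
grand_mean = sum(m * s for m, s in zip(means, sizes)) / n

# Between-group sum of squares from the group means.
ssg = sum(s * (m - grand_mean) ** 2 for m, s in zip(means, sizes))
msg = ssg / (k - 1)   # mean square between groups
msw = ssw / (n - k)   # mean square within groups
f_stat = msg / msw

print(f"SSG = {ssg:.1f}, MSG = {msg:.1f}, MSW = {msw:.1f}, F = {f_stat:.3f}")
print("Reject H0:", f_stat > F_CRITICAL)
```

The raw distances are not needed here because SSG depends only on the group means and sizes, and SSW is taken as given.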



ANOVA -- Single Factor: Excel Output
Excel: data | data analysis | ANOVA: single factor

Multiple Comparisons Between
Subgroup Means
• To test which population means are significantly
different
– e.g.: 𝜇1 = 𝜇2 ≠ 𝜇3
– Done after rejection of equal means in single factor
ANOVA design
• Allows pair-wise comparisons
– Compare absolute mean differences with critical range



Two Subgroups
• When there are only two subgroups, compute the
minimum significant difference (MSD)
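$$\mathrm{MSD} = t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Where $s_p$ is the pooled estimate of the standard deviation (the square root of the pooled variance estimate)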

• Use hypothesis testing methods of Ch. 10
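As a rough illustration of the two-subgroup formula, the sketch below evaluates the MSD for hypothetical inputs. The function name `msd_two_groups`, the pooled standard deviation of 4.0, and the group sizes are all assumptions for the example. The value 2.306 is the upper 2.5% point of Student's t with 8 degrees of freedom, and `s_p` is treated as a standard deviation.

```python
import math


def msd_two_groups(t_crit, s_p, n1, n2):
    """Minimum significant difference for two subgroups.

    t_crit: critical value t_{alpha/2} with n1 + n2 - 2 degrees of freedom
    s_p:    pooled standard deviation (square root of the pooled variance)
    """
    return t_crit * s_p * math.sqrt(1.0 / n1 + 1.0 / n2)


# Hypothetical inputs: s_p = 4.0 and five observations per group,
# so the t critical value uses 5 + 5 - 2 = 8 degrees of freedom.
msd = msd_two_groups(t_crit=2.306, s_p=4.0, n1=5, n2=5)
print(f"MSD = {msd:.3f}")
```

Two sample means would be declared significantly different when their absolute difference exceeds this MSD.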



Multiple Subgroups (1 of 2)
• The minimum significant difference between K subgroups is

$$\mathrm{MSD}(K) = Q\,\frac{s_p}{\sqrt{n}} \quad \text{where } s_p = \sqrt{\mathrm{MSW}}$$

• Q is a factor from Appendix Table 13 for the chosen level of α
• K = number of subgroups, and
• MSW = Mean square within from ANOVA table
• n = sum of the sample sizes from all groups, as before


Multiple Subgroups (2 of 2)
$$\mathrm{MSD}(K) = Q\,\frac{s_p}{\sqrt{n}}$$

Compare each absolute pairwise difference with MSD(K):
$|\bar{x}_1 - \bar{x}_2|$, $|\bar{x}_1 - \bar{x}_3|$, $|\bar{x}_2 - \bar{x}_3|$, etc.

Is $|\bar{x}_i - \bar{x}_j| > \mathrm{MSD}(K)$?

If the absolute mean difference is greater than MSD then there is a significant difference between that pair of means at the chosen level of significance.


Multiple Subgroups: Example
$\bar{x}_1 = 249.2$, $n_1 = 5$
$\bar{x}_2 = 226.0$, $n_2 = 5$
$\bar{x}_3 = 205.8$, $n_3 = 5$

$$\mathrm{MSD}(K) = Q\,\frac{s_p}{\sqrt{n}} = 3.77\,\frac{\sqrt{93.3}}{\sqrt{15}} \approx 9.40$$

(where Q = 3.77 is from Table 13 for α = .05, K = 3 and 12 df)

$|\bar{x}_1 - \bar{x}_2| = 23.2$
$|\bar{x}_1 - \bar{x}_3| = 43.4$
$|\bar{x}_2 - \bar{x}_3| = 20.2$

Since each difference is greater than 9.40, we conclude that all three means are different from one another at the .05 level of significance.
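The pairwise comparison can be sketched in a few lines using only the quantities quoted in the example (group means, MSW = 93.3, n = 15, Q = 3.77). The dictionary layout and variable names are illustrative choices. With MSW = 93.3 the MSD evaluates to about 9.40; a value near 9.39 results if MSW is first rounded to 93. The conclusion is the same either way.

```python
import math
from itertools import combinations

# Values quoted in the example.
means = {1: 249.2, 2: 226.0, 3: 205.8}
msw = 93.3   # mean square within, from the ANOVA table
n = 15       # total number of observations
q = 3.77     # tabulated factor for alpha = .05, K = 3, 12 df

# MSD(K) = Q * s_p / sqrt(n), with s_p = sqrt(MSW).
msd = q * math.sqrt(msw) / math.sqrt(n)

# Compare every absolute pairwise difference with the MSD.
significant = {
    (i, j): abs(means[i] - means[j]) > msd
    for i, j in combinations(means, 2)
}

print(f"MSD(K) = {msd:.3f}")
for (i, j), flag in significant.items():
    diff = abs(means[i] - means[j])
    print(f"|mean{i} - mean{j}| = {diff:.1f} -> significant: {flag}")
```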



Section 15.3 Kruskal-Wallis Test
• Use when the normality assumption for one-way
ANOVA is violated
• Assumptions:
– The samples are random and independent
– Variables have a continuous distribution
– The data can be ranked
– Populations have the same variability
– Populations have the same shape


Kruskal-Wallis Test Procedure (1 of 3)
• Obtain relative rankings for each value
– In event of tie, each of the tied values gets the average
rank
• Sum the rankings for data from each of the K
groups
– Compute the Kruskal-Wallis test statistic
– Evaluate using the chi-square distribution with K − 1
degrees of freedom



Kruskal-Wallis Test Procedure (2 of 3)
• The Kruskal-Wallis test statistic (approximately chi-square with K − 1 degrees of freedom under 𝐻0):

$$W = \frac{12}{n(n+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(n+1)$$

where:
n = sum of sample sizes in all groups
K = number of samples
$R_i$ = sum of ranks in the ith group
$n_i$ = size of the ith group
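The ranking step and the W formula can be sketched as below. The data are hypothetical (chosen to include one tie), and the helper names `average_ranks` and `kruskal_wallis_w` are made up for this example. The code follows the formula as stated on the slide and applies no additional correction for ties.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0  # average of 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def kruskal_wallis_w(groups):
    """W = 12 / (n (n + 1)) * sum(R_i^2 / n_i) - 3 (n + 1)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        r_i = sum(ranks[start:start + len(g)])  # rank sum for this group
        total += r_i ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical data with one tie (the value 23 appears in two groups).
groups = [[23, 41, 54], [45, 55, 60], [18, 23, 30]]
w = kruskal_wallis_w(groups)
print(f"W = {w:.4f}")
```

The result would then be compared with a chi-square critical value with K − 1 degrees of freedom.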



Kruskal-Wallis Test Procedure (3 of 3)
• Complete the test by comparing the calculated W value to a critical $\chi^2$ value from the chi-square distribution with K − 1 degrees of freedom

Decision rule
– Reject 𝐻0 if $W > \chi^2_{K-1,\,\alpha}$
– Otherwise do not reject 𝐻0


Kruskal-Wallis Example (1 of 4)
• Do different departments have different class sizes?



Kruskal-Wallis Example (2 of 4)
• Do different departments have different class
sizes?



Kruskal-Wallis Example (3 of 4)
𝐻0 : Mean𝑀 = Mean𝐸 = Mean𝐵
𝐻1 : Not all population means are equal

• The W statistic is

$$W = \frac{12}{n(n+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(n+1)$$

$$= \frac{12}{15(15+1)} \left( \frac{44^2}{5} + \frac{56^2}{5} + \frac{20^2}{5} \right) - 3(15+1) = 6.72$$


Kruskal-Wallis Example (4 of 4)
• Compare W = 6.72 to the critical value from the chi-square distribution for 3 − 1 = 2 degrees of freedom and 𝛼 = .05:

$$\chi^2_{2,\,0.05} = 5.991$$

Since W = 6.72 > 5.991, reject 𝐻0.

There is sufficient evidence to reject the hypothesis that the population means are all equal
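The example's arithmetic can be verified from the rank sums alone. In this sketch the rank sums (44, 56, 20), the group sizes, and the critical value 5.991 are the figures quoted in the example; the variable names are illustrative.

```python
# Rank sums and group sizes quoted in the example.
rank_sums = [44, 56, 20]
sizes = [5, 5, 5]
CHI2_CRITICAL = 5.991  # chi-square critical value, 2 df, alpha = .05

n = sum(sizes)

# Sanity check: ranks 1..n always sum to n(n + 1) / 2.
ranks_consistent = sum(rank_sums) == n * (n + 1) // 2

w = 12.0 / (n * (n + 1)) * sum(r ** 2 / m for r, m in zip(rank_sums, sizes)) - 3 * (n + 1)

print("Rank sums consistent:", ranks_consistent)
print(f"W = {w:.2f}")
print("Reject H0:", w > CHI2_CRITICAL)
```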



Section 15.4 Two-Way Analysis of
Variance
One Observation per Cell, Randomized Blocks
• Examines the effect of
– Two factors of interest on the dependent variable
▪ e.g., Percent carbonation and line speed on soft drink bottling
process
– Interaction between the different levels of these two
factors
▪ e.g., Does the effect of one particular carbonation level depend
on which level the line speed is set?

Two-Way ANOVA
• Assumptions
– Populations are normally distributed
– Populations have equal variances
– Independent random samples are drawn



Randomized Block Design
Two Factors of interest: A and B
K = number of groups of factor A
H = number of levels of factor B
(sometimes called a blocking variable)



Two-Way Notation
• Let $x_{ij}$ denote the observation in the ith group and jth block
• Suppose that there are K groups and H blocks, for a total of n = KH observations
• Let the overall mean be $\bar{x}$
• Denote the group sample means by $\bar{x}_{i\bullet}$ (i = 1, 2, …, K)
• Denote the block sample means by $\bar{x}_{\bullet j}$ (j = 1, 2, …, H)


Partition of Total Variation
• SST = SSG + SSB + SSE



Two-Way Sums of Squares

Two-Way Mean Squares
• The mean squares are

$$\mathrm{MST} = \frac{\mathrm{SST}}{n - 1}$$

$$\mathrm{MSG} = \frac{\mathrm{SSG}}{K - 1}$$

$$\mathrm{MSB} = \frac{\mathrm{SSB}}{H - 1}$$

$$\mathrm{MSE} = \frac{\mathrm{SSE}}{(K - 1)(H - 1)}$$


Two-Way ANOVA: The F Test
Statistic (1 of 2)
F Test for Groups
𝐻0 : The K population group means are all the same

$$F = \frac{\mathrm{MSG}}{\mathrm{MSE}}$$

Reject 𝐻0 if $F > F_{K-1,\,(K-1)(H-1),\,\alpha}$

F Test for Blocks
𝐻0 : The H population block means are the same

$$F = \frac{\mathrm{MSB}}{\mathrm{MSE}}$$

Reject 𝐻0 if $F > F_{H-1,\,(K-1)(H-1),\,\alpha}$
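The randomized block computations can be sketched as follows. The sums-of-squares formulas are not reproduced in the slide text, so this example assumes the standard randomized block definitions: SSG from group means, SSB from block means, and SSE from the residuals $x_{ij} - \bar{x}_{i\bullet} - \bar{x}_{\bullet j} + \bar{x}$. The data are hypothetical.

```python
# Hypothetical data: rows are K = 3 groups, columns are H = 4 blocks,
# with one observation per cell.
data = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 18.0],
    [9.0, 11.0, 8.0, 12.0],
]
K, H = len(data), len(data[0])
n = K * H

grand = sum(sum(row) for row in data) / n
group_means = [sum(row) / H for row in data]
block_means = [sum(data[i][j] for i in range(K)) / K for j in range(H)]

# Sums of squares (standard randomized block definitions, assumed here).
sst = sum((data[i][j] - grand) ** 2 for i in range(K) for j in range(H))
ssg = H * sum((g - grand) ** 2 for g in group_means)
ssb = K * sum((b - grand) ** 2 for b in block_means)
sse = sum(
    (data[i][j] - group_means[i] - block_means[j] + grand) ** 2
    for i in range(K)
    for j in range(H)
)

# Mean squares and F ratios.
msg = ssg / (K - 1)
msb = ssb / (H - 1)
mse = sse / ((K - 1) * (H - 1))
f_groups = msg / mse
f_blocks = msb / mse

print(f"SST = {sst:.4f}, SSG + SSB + SSE = {ssg + ssb + sse:.4f}")
print(f"F (groups) = {f_groups:.2f}, F (blocks) = {f_blocks:.2f}")
```

Each F ratio would then be compared with the tabulated critical value for its own numerator degrees of freedom and (K − 1)(H − 1) denominator degrees of freedom.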



General Two-Way Table Format

Section 15.5 More Than One
Observation per Cell (1 of 2)
• A two-way design with more than one observation
per cell allows one further source of variation
• The interaction between groups and blocks can
also be identified
• Let
– K = number of groups
– H = number of blocks
– m = number of observations per cell
– KHm = total number of observations

Section 15.5 More Than One
Observation per Cell (2 of 2)

Sums of Squares with Interaction



Two-Way Mean Squares with
Interaction
• The mean squares are

$$\mathrm{MST} = \frac{\mathrm{SST}}{KHm - 1}$$

$$\mathrm{MSG} = \frac{\mathrm{SSG}}{K - 1}$$

$$\mathrm{MSB} = \frac{\mathrm{SSB}}{H - 1}$$

$$\mathrm{MSI} = \frac{\mathrm{SSI}}{(K - 1)(H - 1)}$$

$$\mathrm{MSE} = \frac{\mathrm{SSE}}{KH(m - 1)}$$


Two-Way ANOVA: The F Test
Statistic (2 of 2)
F Test for group effect
𝐻0 : The K population group means are all the same

$$F = \frac{\mathrm{MSG}}{\mathrm{MSE}}$$

Reject 𝐻0 if $F > F_{K-1,\,KH(m-1),\,\alpha}$

F Test for block effect
𝐻0 : The H population block means are the same

$$F = \frac{\mathrm{MSB}}{\mathrm{MSE}}$$

Reject 𝐻0 if $F > F_{H-1,\,KH(m-1),\,\alpha}$

F Test for interaction effect
𝐻0 : The interaction of groups and blocks is equal to zero

$$F = \frac{\mathrm{MSI}}{\mathrm{MSE}}$$

Reject 𝐻0 if $F > F_{(K-1)(H-1),\,KH(m-1),\,\alpha}$


Two-Way ANOVA Summary Table

Features of Two-Way ANOVA F Test
• Degrees of freedom always add up
– $KHm - 1 = (K - 1) + (H - 1) + (K - 1)(H - 1) + KH(m - 1)$
– Total = groups + blocks + interaction + error
• The denominator of each F test is always MSE; only the numerator (MSG, MSB, or MSI) changes
• The sums of squares always add up
– SST = SSG + SSB + SSI + SSE
– Total = groups + blocks + interaction + error
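Both additive identities can be checked numerically for a balanced design. The slide text does not reproduce the sums-of-squares formulas with interaction, so the sketch below assumes the standard balanced two-way definitions. The data are hypothetical, with K = 2 groups, H = 3 blocks, and m = 2 observations per cell.

```python
# Hypothetical data: cells[i][j] holds the m observations for group i, block j.
cells = [
    [[8.0, 10.0], [12.0, 14.0], [9.0, 11.0]],
    [[15.0, 13.0], [16.0, 20.0], [10.0, 14.0]],
]
K, H, m = len(cells), len(cells[0]), len(cells[0][0])


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


all_values = [x for row in cells for cell in row for x in cell]
grand = mean(all_values)
group_means = [mean([x for cell in row for x in cell]) for row in cells]
block_means = [mean([x for i in range(K) for x in cells[i][j]]) for j in range(H)]
cell_means = [[mean(cells[i][j]) for j in range(H)] for i in range(K)]

# Sums of squares (standard balanced two-way definitions, assumed here).
sst = sum((x - grand) ** 2 for x in all_values)
ssg = H * m * sum((g - grand) ** 2 for g in group_means)
ssb = K * m * sum((b - grand) ** 2 for b in block_means)
ssi = m * sum(
    (cell_means[i][j] - group_means[i] - block_means[j] + grand) ** 2
    for i in range(K)
    for j in range(H)
)
sse = sum(
    (x - cell_means[i][j]) ** 2
    for i in range(K)
    for j in range(H)
    for x in cells[i][j]
)

# Degrees of freedom: total versus the sum of the four components.
df_total = K * H * m - 1
df_parts = (K - 1) + (H - 1) + (K - 1) * (H - 1) + K * H * (m - 1)

print("Sums of squares add up:", abs(sst - (ssg + ssb + ssi + sse)) < 1e-9)
print("Degrees of freedom add up:", df_total == df_parts)
```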



Chapter Summary
• Described one-way analysis of variance
– The logic of Analysis of Variance
– Analysis of Variance assumptions
– F test for difference in K means
• Applied the Kruskal-Wallis test when the
populations are not known to be normal
• Described two-way analysis of variance
– Examined effects of multiple factors
– Examined interaction between factors
