
12-1

Chapter Twelve

McGraw-Hill/Irwin © 2005 The McGraw-Hill Companies, Inc., All Rights Reserved.


12-2
Chapter Twelve
Analysis of Variance
GOALS
When you have completed this chapter, you will be able to:

ONE
List the characteristics of the F distribution.
TWO
Conduct a test of hypothesis to determine whether the variances of two
populations are equal.
THREE
Discuss the general idea of analysis of variance.
FOUR
Organize data into a one-way and a two-way ANOVA table.

Goals
12-3
Chapter Twelve continued

Analysis of Variance
GOALS
When you have completed this chapter, you will be able to:

FIVE
Define and understand the terms treatments and blocks.
SIX
Conduct a test of hypothesis among three or more treatment means.
SEVEN
Develop confidence intervals for the difference between treatment means.
EIGHT
Conduct a test of hypothesis to determine if there is a difference among
block means.

Goals
12-4

Characteristics of the F-Distribution

There is a “family” of F distributions. Each member of the family is
determined by two parameters: the numerator degrees of freedom and
the denominator degrees of freedom.

F cannot be negative, and it is a continuous distribution.

The F distribution is positively skewed.

Its values range from 0 to ∞. As F → ∞ the curve approaches the
X-axis but never touches it.
12-5
Test for Equal Variances of Two Populations

For the two-tailed test, the test statistic is given by

F = s_1^2 / s_2^2

where s_1^2 and s_2^2 are the sample variances for the two samples. The
larger sample variance is placed in the numerator.

The degrees of freedom are n_1 - 1 for the numerator and n_2 - 1 for the
denominator.

The null hypothesis is rejected if the computed value of the test
statistic is greater than the critical value.
12-6

Colin, a stockbroker at Critical Securities, reported that the mean
rate of return on a sample of 10 internet stocks was 12.6 percent with
a standard deviation of 3.9 percent.
The mean rate of return on a sample of 8 utility stocks was 10.9 percent
with a standard deviation of 3.5 percent. At the .05 significance level,
can Colin conclude that there is more variation in the internet stocks?
Example 1
12-7

Step 1: The hypotheses are
H0: σ_I^2 ≤ σ_U^2
H1: σ_I^2 > σ_U^2

Step 2: The significance level is .05.

Step 3: The test statistic follows the F distribution.

Example 1 continued
12-8

Step 4: H0 is rejected if F > 3.68 or if p < .05. The degrees of
freedom are n_1 - 1 or 9 in the numerator and n_2 - 1 or 7 in the
denominator.

Step 5: The value of F is computed as follows.

F = (3.9)^2 / (3.5)^2 = 1.2416

The p(F > 1.2416) is .3965.

H0 is not rejected. There is insufficient evidence to show more
variation in the internet stocks.

Example 1 continued
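The arithmetic in Example 1 can be checked with a few lines of Python. This is only a minimal sketch: the function name variance_ratio and the variable names are illustrative assumptions, and the critical value 3.68 is copied from Step 4 rather than computed, since the standard library has no F distribution.

```python
def variance_ratio(s_numerator, s_denominator):
    """Return F = s_numerator^2 / s_denominator^2 from two sample standard deviations."""
    return s_numerator ** 2 / s_denominator ** 2


# Figures from Example 1 (standard deviations in percent).
s_internet, n_internet = 3.9, 10
s_utility, n_utility = 3.5, 8

f_stat = variance_ratio(s_internet, s_utility)
df_num, df_den = n_internet - 1, n_utility - 1

# Critical value for (9, 7) degrees of freedom at the .05 level,
# taken from Step 4 of the example (assumed, not computed here).
F_CRITICAL = 3.68
reject_h0 = f_stat > F_CRITICAL

print(f"F = {f_stat:.4f}, df = ({df_num}, {df_den}), reject H0: {reject_h0}")
```

The larger sample standard deviation is passed first so that it lands in the numerator, which matches how the example builds the ratio.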
12-9

The ANOVA Test of Means

The F distribution is also used for testing whether two or more sample
means came from the same or equal populations. This technique is called
analysis of variance or ANOVA.

The null and alternate hypotheses for four sample means are given as:
H0: µ1 = µ2 = µ3 = µ4
H1: Not all of the means are equal.
12-10

ANOVA requires the following conditions:

The sampled populations follow the normal distribution.

The samples are independent.

The populations have equal standard deviations.

Underlying assumptions for ANOVA
12-11

F = (Estimate of the population variance based on the differences among the sample means)
    / (Estimate of the population variance based on the variation within the samples)

Degrees of freedom for the F statistic in ANOVA:

If there are k populations being sampled, the numerator degrees of
freedom is k – 1.

If there are a total of n observations, the denominator degrees of
freedom is n – k.

ANOVA Test of Means


12-12

ANOVA divides the Total Variation into the variation due to the
treatment, Treatment Variation, and to the error component, Random
Variation.

In the following table,
i stands for the ith observation
X_G is the overall or grand mean
k is the number of treatment groups

ANOVA Test of Means


12-13

ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square        F
Treatments (k)        SST              k-1                  SST/(k-1) = MST    MST/MSE
Error                 SSE              n-k                  SSE/(n-k) = MSE
Total                 TSS              n-1

Treatment variation: SST = Σ_k n_k (X_k − X_G)^2
Random variation:    SSE = Σ_k Σ_i (X_ik − X_k)^2
Total variation:     TSS = Σ_i (X_i − X_G)^2
12-14

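Sum of Squares: One-Way

                      Treatment 1              Treatment 2              Overall
Observations          X_i1                     X_i2
Totals                Σ_i X_i1                 Σ_i X_i2                 Σ_ij X_ij
Means                 X_1 = Σ_i X_i1 / n_1     X_2 = Σ_i X_i2 / n_2     X_G = Σ_ij X_ij / n
Squared deviations    Σ_i (X_i1 − X_G)^2       Σ_i (X_i2 − X_G)^2       TSS = Σ_ij (X_ij − X_G)^2
Treatment terms       n_1 (X_1 − X_G)^2        n_2 (X_2 − X_G)^2        SST = Σ_j n_j (X_j − X_G)^2

SSE = TSS − SST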


12-15

Rosenbaum Restaurants specialize in meals for families. Katy Polsby,
President, recently developed a new meat loaf dinner. Before making it
a part of the regular menu, she decides to test it in several of her
restaurants.

She would like to know if there is a difference in the mean number of
dinners sold per day at the Aynor, Loris, and Lander restaurants. Use
the .05 significance level.

Example 2
12-16

Number of Dinners Sold by Restaurant

Day      Aynor   Loris   Lander
Day 1    13      10      18
Day 2    12      12      16
Day 3    14      13      17
Day 4    12      11      17
Day 5                    17

Example 2 continued
12-17

Step One: State the null hypothesis and the alternate hypothesis.
H0: µAynor = µLoris = µLander
H1: Not all of the means are equal.

Step Two: Select the level of significance. This is given in the
problem statement as .05.

Step Three: Determine the test statistic. The test statistic follows
the F distribution.

Example 2 continued
12-18

Step Four: Formulate the decision rule.
The numerator degrees of freedom, k-1, equal 3-1 or 2. The denominator
degrees of freedom, n-k, equal 13-3 or 10. The critical value of F at 2
and 10 degrees of freedom and the .05 significance level is 4.10. Thus,
H0 is rejected if F > 4.10 or p < α of .05.

Step Five: Select the sample, perform the calculations, and make a
decision. Using the data provided, the ANOVA calculations follow.

Example 2 continued
12-19
Computation of SSE

SSE = Σ_k Σ_i (X_ik − X_k)^2

Aynor # sold   SS(Aynor)       Loris # sold   SS(Loris)      Lander # sold   SS(Lander)
13             (13-12.75)^2    10             (10-11.5)^2    18              (18-17)^2
12             (12-12.75)^2    12             (12-11.5)^2    16              (16-17)^2
14             (14-12.75)^2    13             (13-11.5)^2    17              (17-17)^2
12             (12-12.75)^2    11             (11-11.5)^2    17              (17-17)^2
                                                             17              (17-17)^2
Sum            2.75            Sum            5              Sum             2
X_k            12.75           X_k            11.5           X_k             17

SSE: 2.75 + 5 + 2 = 9.75
X_G: 14.00
12-20
Computation of TSS

TSS = Σ_i (X_i − X_G)^2

Aynor # sold   TSS(Aynor)   Loris # sold   TSS(Loris)   Lander # sold   TSS(Lander)
13             (13-14)^2    10             (10-14)^2    18              (18-14)^2
12             (12-14)^2    12             (12-14)^2    16              (16-14)^2
14             (14-14)^2    13             (13-14)^2    17              (17-14)^2
12             (12-14)^2    11             (11-14)^2    17              (17-14)^2
                                                        17              (17-14)^2
Sum            9.00         Sum            30           Sum             47

TSS: 9.00 + 30 + 47 = 86.00
SSE: 9.75
X_G: 14.00
Example 2 continued
12-21
Computation of SST

SST = Σ_k n_k (X_k − X_G)^2

Restaurant   X_k     SST
Aynor        12.75   4(12.75-14)^2 = 6.25
Loris        11.50   4(11.50-14)^2 = 25.00
Lander       17.00   5(17.00-14)^2 = 45.00
Total                76.25

Shortcut: SST = TSS – SSE = 86 – 9.75 = 76.25

Example 2 continued
12-22

ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square         F
Treatments            76.25            3-1 = 2              76.25/2 = 38.125    38.125/.975 = 39.103
Error                 9.75             13-3 = 10            9.75/10 = .975
Total                 86.00            13-1 = 12

Example 2 continued
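The sums of squares and the F ratio for Example 2 can be reproduced with a short script. The following is a minimal sketch using only the Python standard library; the function name one_way_anova and the dictionary layout are illustrative assumptions, not something the slides prescribe. It stops at the F statistic and leaves the comparison with the critical value of 4.10 to the decision rule stated in Step Four.

```python
from statistics import mean


def one_way_anova(groups):
    """Return one-way ANOVA quantities for a dict of treatment name -> observations."""
    all_values = [x for obs in groups.values() for x in obs]
    grand_mean = mean(all_values)
    n, k = len(all_values), len(groups)

    # Treatment variation: n_k * (treatment mean - grand mean)^2, summed over treatments.
    sst = sum(len(obs) * (mean(obs) - grand_mean) ** 2 for obs in groups.values())
    # Random variation: squared deviations of each observation from its treatment mean.
    sse = sum((x - mean(obs)) ** 2 for obs in groups.values() for x in obs)
    # Total variation: squared deviations of each observation from the grand mean.
    tss = sum((x - grand_mean) ** 2 for x in all_values)

    mst, mse = sst / (k - 1), sse / (n - k)
    return {"SST": sst, "SSE": sse, "TSS": tss,
            "MST": mst, "MSE": mse, "F": mst / mse, "df": (k - 1, n - k)}


# Dinners sold per day, from the Example 2 data table.
dinners = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}

result = one_way_anova(dinners)
print(f"SST = {result['SST']:.2f}, SSE = {result['SSE']:.2f}, TSS = {result['TSS']:.2f}")
print(f"F = {result['F']:.3f} with df = {result['df']}")
```

Computing TSS directly, rather than as SST + SSE, gives an independent check that the two components add up to the total.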
12-23

The p(F > 39.103) is .000018.

Since an F of 39.103 > the critical F of 4.10, and the p of .000018 < α
of .05, the decision is to reject the null hypothesis and conclude that
at least two of the treatment means are not the same.

The mean number of meals sold at the three locations is not the same.

The ANOVA tables on the next two slides are from the Minitab and EXCEL
systems.
Example 2 continued
12-24

Analysis of Variance
Source   DF   SS       MS       F       P
Factor   2    76.250   38.125   39.10   0.000
Error    10   9.750    0.975
Total    12   86.000

Individual 95% CIs For Mean Based on Pooled StDev
Level    N   Mean     StDev   ---------+---------+---------+-------
Aynor    4   12.750   0.957   (---*---)
Loris    4   11.500   1.291   (---*---)
Lander   5   17.000   0.707   (---*---)
                              ---------+---------+---------+-------
Pooled StDev = 0.987               12.5      15.0      17.5

Example 2 continued
12-25
Anova: Single Factor

SUMMARY
Groups Count Sum Average Variance
Aynor 4 51 12.75 0.92
Loris 4 46 11.50 1.67
Lander 5 85 17.00 0.50

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 76.25 2 38.13 39.10 2E-05 4.10
Within Groups 9.75 10 0.98

Total 86.00 12

Example 2 continued
12-26

When I reject the null hypothesis that the means are equal, I want to
know which treatment means differ.

One of the simplest procedures is through the use of confidence
intervals around the difference in treatment means.

Inferences About Treatment Means
12-27

(X_1 − X_2) ± t √( MSE (1/n_1 + 1/n_2) )

where X_1 and X_2 are the two treatment sample means, MSE = SSE/(n - k),
and t is obtained from the t table with (n - k) degrees of freedom.

If the confidence interval around the difference in treatment means
includes zero, there is not a difference between the treatment means.

Confidence Interval for the Difference Between Two Means
12-28
95% confidence interval for the difference in the mean number of meat
loaf dinners sold in Lander and Aynor

Can Katy conclude that there is a difference between the two
restaurants?

(17 − 12.75) ± 2.228 √( .975 (1/4 + 1/5) )
4.25 ± 1.48 ⇒ (2.77, 5.73)
EXAMPLE 3
12-29

Because zero is not in the interval, we conclude that this pair of
means differs.

The mean number of meals sold in Aynor is different from Lander.

Example 3 continued
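The interval in Example 3 can be reproduced in a few lines. This is a sketch under stated assumptions: the function name mean_difference_ci is illustrative, and the t value of 2.228 (10 degrees of freedom, 95% confidence) is copied from the example rather than looked up in code.

```python
from math import sqrt


def mean_difference_ci(mean_1, mean_2, n_1, n_2, mse, t_value):
    """Return (low, high) for (mean_1 - mean_2) +/- t * sqrt(MSE * (1/n_1 + 1/n_2))."""
    diff = mean_1 - mean_2
    margin = t_value * sqrt(mse * (1 / n_1 + 1 / n_2))
    return diff - margin, diff + margin


# Lander (mean 17, n = 5) versus Aynor (mean 12.75, n = 4); MSE = .975 from the ANOVA table.
low, high = mean_difference_ci(17.0, 12.75, 5, 4, mse=0.975, t_value=2.228)

# The pair of means is judged different when zero lies outside the interval.
differs = not (low <= 0 <= high)

print(f"({low:.2f}, {high:.2f}), means differ: {differs}")
```

The same function can be applied to the other pairs of restaurants by swapping in their means and sample sizes; the MSE and t value stay the same because they come from the pooled error term.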
12-30

Sometimes there are other causes of variation. For the two-factor
ANOVA we test whether there is a significant difference in the
treatment effect and whether there is a difference in the blocking
effect (a second treatment variable).

SSB = r Σ_b (X_b − X_G)^2

where r is the number of observations in each block (one per treatment)
X_b is the sample mean of block b
X_G is the overall or grand mean

In the following ANOVA table, all sums of squares are computed as
before, with the addition of the SSB.
Two-Factor ANOVA
12-31

ANOVA Table

Source of Variation   Sum of Squares          Degrees of Freedom   Mean Square              F
Treatments (k)        SST                     k-1                  SST/(k-1) = MST          MST/MSE
Blocks (b)            SSB                     b-1                  SSB/(b-1) = MSB          MSB/MSE
Error                 SSE (TSS – SST – SSB)   (k-1)(b-1)           SSE/[(k-1)(b-1)] = MSE
Total                 TSS                     n-1

Two factor ANOVA table
12-32

Sum of Squares: Two-Way

Rows are blocks (index i) and columns are treatments (index j).

Block mean:       X_i. = Σ_j X_ij / b_i, where b_i is the number of observations in block i
Treatment mean:   X_.j = Σ_i X_ij / n_j, where n_j is the number of observations in treatment j
Grand mean:       X_G = Σ_ij X_ij / n

SSB = Σ_i b_i (X_i. − X_G)^2
SST = Σ_j n_j (X_.j − X_G)^2
TSS = Σ_ij (X_ij − X_G)^2
SSE = TSS – SST – SSB
12-33
The Bieber Manufacturing Co. operates 24 hours a day, five days a week.
The workers rotate shifts each week. Todd Bieber, the owner, is
interested in whether there is a difference in the number of units
produced when the employees work on various shifts. A sample of five
workers is selected and their output recorded on each shift.

At the .05 significance level, can we conclude there is a difference in
the mean production by shift and in the mean production by employee?
Example 4
12-34

Employee    Day Output   Evening Output   Night Output
McCartney   31           25               35
Neary       33           26               33
Schoen      28           24               30
Thompson    30           29               28
Wagner      28           26               27

Example 4 continued
12-35
Treatment Effect

Step 1: State the null hypothesis and the alternate hypothesis.
H0: µ1 = µ2 = µ3
H1: Not all means are equal.

Step 2: Select the level of significance. Given as .05.

Step 3: Determine the test statistic. The test statistic follows the F
distribution.

Step 4: Formulate the decision rule. H0 is rejected if F > 4.46 (the
degrees of freedom are 2 and 8) or if p < .05.

Step 5: Perform the calculations and make a decision.
Example 4 continued
12-36
Block Effect

Step 1: State the null hypothesis and the alternate hypothesis.
H0: µ1 = µ2 = µ3 = µ4 = µ5
H1: Not all means are equal.

Step 2: Select the level of significance. Given as α = .05.

Step 3: Determine the test statistic. The test statistic follows the F
distribution.

Step 4: Formulate the decision rule. H0 is rejected if F > 3.84, df =
(4, 8), or if p < .05.

Step 5: Perform the calculations and make a decision.
Example 4 continued
12-37
Block Sums of Squares

Effects of time of day and worker on productivity

Note: X_G = 28.87

Employee    Day   Evening   Night   Employee mean   SSB
McCartney   31    25        35      30.33           3(30.33-28.87)^2 = 6.42
Neary       33    26        33      30.67           3(30.67-28.87)^2 = 9.68
Schoen      28    24        30      27.33           3(27.33-28.87)^2 = 7.08
Thompson    30    29        28      29.00           3(29.00-28.87)^2 = .05
Wagner      28    26        27      27.00           3(27.00-28.87)^2 = 10.49

SSB = 6.42 + 9.68 + 7.08 + .05 + 10.49 = 33.73
12-38

Compute the remaining sums of squares as before:


TSS = 139.73
SST = 62.53
SSE = 43.47 (139.73-62.53-33.73)
df(block) = 4 (b-1)
df(treatment) = 2 (k-1)
df(error)=8 (k-1)(b-1)

Example 4 continued
12-39

ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square        F
Treatments (k)        62.53            2                    62.53/2 = 31.27    31.27/5.43 = 5.75
Blocks (b)            33.73            4                    33.73/4 = 8.43     8.43/5.43 = 1.55
Error                 43.47            8                    43.47/8 = 5.43
Total                 139.73           14

Example 4 continued
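The two-factor table for Example 4 can be checked the same way. The sketch below uses only the Python standard library and assumes one observation per block and treatment combination, as in the example; the function name two_way_anova and the data layout are illustrative assumptions. It computes the sums of squares from unrounded means, so individual terms can differ slightly in the last digit from hand-rounded slide values.

```python
from statistics import mean


def two_way_anova(table):
    """Two-factor ANOVA without replication.

    table maps each block name to its observations, one per treatment,
    listed in the same treatment order for every block.
    """
    rows = list(table.values())
    b, k = len(rows), len(rows[0])          # number of blocks, number of treatments
    all_values = [x for row in rows for x in row]
    grand_mean = mean(all_values)

    treatment_means = [mean([row[j] for row in rows]) for j in range(k)]
    block_means = [mean(row) for row in rows]

    tss = sum((x - grand_mean) ** 2 for x in all_values)
    sst = b * sum((m - grand_mean) ** 2 for m in treatment_means)  # b observations per treatment
    ssb = k * sum((m - grand_mean) ** 2 for m in block_means)      # k observations per block
    sse = tss - sst - ssb

    mse = sse / ((k - 1) * (b - 1))
    return {"TSS": tss, "SST": sst, "SSB": ssb, "SSE": sse,
            "F_treatment": (sst / (k - 1)) / mse,
            "F_block": (ssb / (b - 1)) / mse}


# Units produced on the Day, Evening, and Night shifts, from the Example 4 table.
output = {
    "McCartney": [31, 25, 35],
    "Neary": [33, 26, 33],
    "Schoen": [28, 24, 30],
    "Thompson": [30, 29, 28],
    "Wagner": [28, 26, 27],
}

anova = two_way_anova(output)
print(f"SST = {anova['SST']:.2f}, SSB = {anova['SSB']:.2f}, SSE = {anova['SSE']:.2f}")
print(f"F(treatment) = {anova['F_treatment']:.2f}, F(block) = {anova['F_block']:.2f}")
```

The resulting F values are then compared with the critical values from the decision rules, 4.46 for the treatment effect and 3.84 for the block effect.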
12-40

Treatment Effect
Since the computed F of 5.75 > the critical F of 4.46, and the p of .03
< α of .05, H0 is rejected. There is a difference in the mean number of
units produced for the different time periods.

Block Effect
Since the computed F of 1.55 < the critical F of 3.84, and the p of .28
> α of .05, H0 is not rejected. There is no significant difference in
the average number of units produced for the different employees.

Example 4 continued
12-41
Minitab output

Two-way ANOVA: Units versus Worker, Shift

Analysis of Variance for Units


Source DF SS MS F P
Worker 4 33.73 8.43 1.55 0.276
Shift 2 62.53 31.27 5.75 0.028
Error 8 43.47 5.43
Total 14 139.73

Example 4 continued
12-42

Anova: Two-Factor Without Replication

SUMMARY     Count   Sum   Average   Variance
Day         5       150   30.0      4.5
Evening     5       130   26.0      3.5
Night       5       153   30.6      11.3

McCartney   3       91    30.33     25.33
Neary       3       92    30.67     16.33
Schoen      3       82    27.33     9.33
Thompson    3       87    29.00     1
Wagner      3       81    27.00     1

ANOVA
Source of Variation   SS       df   MS      F      P-value   F crit
Rows                  62.53    2    31.27   5.75   0.03      4.46
Columns               33.73    4    8.43    1.55   0.28      3.84
Error                 43.47    8    5.43

Total                 139.73   14

Output Using EXCEL
Example 4 continued