
12-1

Chapter

Twelve

McGraw-Hill/Irwin © 2005 The McGraw-Hill Companies, Inc., All Rights Reserved.


12-2
Chapter Twelve
Analysis of Variance
GOALS
When you have completed this chapter, you will be able to:

ONE
List the characteristics of the F distribution.
TWO
Conduct a test of hypothesis to determine whether the variances of two
populations are equal.
THREE
Discuss the general idea of analysis of variance.
FOUR
Organize data into a one-way and a two-way ANOVA table.

Goals
12-3
Chapter Twelve continued

Analysis of Variance
GOALS
When you have completed this chapter, you will be able to:

FIVE
Define and understand the terms treatments and blocks.
SIX
Conduct a test of hypothesis among three or more treatment means.
SEVEN
Develop confidence intervals for the difference between treatment means.
EIGHT
Conduct a test of hypothesis to determine if there is a difference among block
means.

Goals
12-4

Characteristics of the F-Distribution

There is a family of F distributions. Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom.

F cannot be negative, and it is a continuous distribution.

The F distribution is positively skewed.

Its values range from 0 to $\infty$. As $F \to \infty$, the curve approaches the X-axis but never touches it.

Characteristics of F-Distribution
12-5
Test for Equal Variances of Two Populations

For the two-tailed test, the test statistic is given by

$F = \frac{s_1^2}{s_2^2}$

$s_1^2$ and $s_2^2$ are the sample variances for the two samples. The larger sample variance is placed in the numerator.

The degrees of freedom are $n_1 - 1$ for the numerator and $n_2 - 1$ for the denominator.

The null hypothesis is rejected if the computed value of the test statistic is greater than the critical value.
Test for Equal Variances of Two Populations
12-6

Colin, a stockbroker at Critical Securities, reported that the mean rate of return on a sample of 10 internet stocks was 12.6 percent with a standard deviation of 3.9 percent.

The mean rate of return on a sample of 8 utility stocks was 10.9 percent with a standard deviation of 3.5 percent. At the .05 significance level, can Colin conclude that there is more variation in the internet stocks?
Example 1
12-7

Step 1: The hypotheses are

$H_0: \sigma_I^2 \le \sigma_U^2$
$H_1: \sigma_I^2 > \sigma_U^2$

Step 2: The significance level is .05.

Step 3: The test statistic follows the F distribution.

Example 1 continued
12-8

Step 4: H0 is rejected if F > 3.68 or if p < .05. The degrees of freedom are $n_1 - 1$ or 9 in the numerator and $n_2 - 1$ or 7 in the denominator.

Step 5: The value of F is computed as follows.

$F = \frac{(3.9)^2}{(3.5)^2} = 1.2416$

The p(F > 1.2416) is .3965.

H0 is not rejected. There is insufficient evidence to show more variation in the internet stocks.

Example 1 continued
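The arithmetic in Example 1 can be checked with a few lines of Python. This is only a minimal sketch: the function name `variance_ratio` is hypothetical, and the critical value 3.68 is copied from the slide rather than computed.

```python
# Sample standard deviations and sample sizes taken from Example 1.
s_internet, n_internet = 3.9, 10
s_utility, n_utility = 3.5, 8


def variance_ratio(s_num, s_den):
    """F statistic for comparing two variances: ratio of the squared standard deviations."""
    return s_num ** 2 / s_den ** 2


# The larger sample variance (internet stocks) goes in the numerator.
f_stat = variance_ratio(s_internet, s_utility)
df_num = n_internet - 1  # numerator degrees of freedom
df_den = n_utility - 1   # denominator degrees of freedom

F_CRITICAL = 3.68  # .05 critical value for (9, 7) df, copied from the slide

reject_h0 = f_stat > F_CRITICAL
print(f"F = {f_stat:.4f}, df = ({df_num}, {df_den}), reject H0: {reject_h0}")
```

Because the computed F is well below the critical value, the sketch reaches the same decision as the slide: H0 is not rejected.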
12-9

The ANOVA Test of Means

The F distribution is also used for testing whether two or more sample means came from the same or equal populations.

This technique is called analysis of variance or ANOVA.

The null and alternate hypotheses for four sample means are given as:
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: Not all means are equal.
The ANOVA Test of Means
12-10

ANOVA requires the following conditions

The sampled populations follow the normal distribution.

The samples are independent.

The populations have equal standard deviations.
Underlying assumptions for
ANOVA
12-11
$F = \frac{\text{Estimate of the population variance based on the differences among the sample means}}{\text{Estimate of the population variance based on the variation within the samples}}$

Degrees of freedom for the F statistic in ANOVA:

If there are k populations being sampled, the numerator degrees of freedom is k - 1.

If there are a total of n observations, the denominator degrees of freedom is n - k.

ANOVA Test of Means


12-12

ANOVA divides the Total Variation into the variation due to the treatment, Treatment Variation, and to the error component, Random Variation.

In the following table,
i stands for the ith observation
$\bar{X}_G$ is the overall or grand mean
k is the number of treatment groups

ANOVA Test of Means


12-13

ANOVA Table
Source of     Sum of     Degrees of    Mean                 F
Variation     Squares    Freedom       Square
Treatments    SST        k-1           MST = SST/(k-1)      MST/MSE
Error         SSE        n-k           MSE = SSE/(n-k)
Total         TSS        n-1

Treatment variation: $SST = \sum_k n_k (\bar{X}_k - \bar{X}_G)^2$
Random variation: $SSE = \sum_k \sum_i (X_{i,k} - \bar{X}_k)^2$
Total variation: $TSS = \sum_i (X_i - \bar{X}_G)^2$

Anova Table
12-14

Rosenbaum Restaurants specialize in meals for families. Katy Polsby, President, recently developed a new meat loaf dinner. Before making it a part of the regular menu, she decides to test it in several of her restaurants.

She would like to know if there is a difference in the mean number of dinners sold per day at the Aynor, Loris, and Lander restaurants. Use the .05 significance level.

Example 2
12-15

Number of Dinners Sold by Restaurant
Day      Aynor   Loris   Lander
Day 1      13      10      18
Day 2      12      12      16
Day 3      14      13      17
Day 4      12      11      17
Day 5                      17

Example 2 continued
12-16

Step One: State the null hypothesis and the alternate hypothesis.
$H_0: \mu_{Aynor} = \mu_{Loris} = \mu_{Lander}$
$H_1$: Not all means are equal.

Step Two: Select the level of significance. This is given in the problem statement as .05.

Step Three: Determine the test statistic. The test statistic follows the F distribution.

Example 2 continued
12-17

Step Four: Formulate the decision rule.
The numerator degrees of freedom, k-1, equal 3-1 or 2. The denominator degrees of freedom, n-k, equal 13-3 or 10. The value of F at 2 and 10 degrees of freedom is 4.10. Thus, H0 is rejected if F > 4.10 or p < $\alpha$ of .05.

Step Five: Select the sample, perform the calculations,


and make a decision.

Using the data provided, the


ANOVA calculations follow.

Example 2 continued
12-18
Computation of SSE
$SSE = \sum_k \sum_i (X_{i,k} - \bar{X}_k)^2$

Aynor                    Loris                    Lander
# sold   SS(Aynor)       # sold   SS(Loris)       # sold   SS(Lander)
13       (13-12.75)^2    10       (10-11.5)^2     18       (18-17)^2
12       (12-12.75)^2    12       (12-11.5)^2     16       (16-17)^2
14       (14-12.75)^2    13       (13-11.5)^2     17       (17-17)^2
12       (12-12.75)^2    11       (11-11.5)^2     17       (17-17)^2
                                                  17       (17-17)^2
Sum      2.75                     5                        2
Mean     12.75                    11.5                     17

SSE: 2.75 + 5 + 2 = 9.75
$\bar{X}_G$: 14.00
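The within-restaurant sums of squares above can be reproduced directly from the Example 2 data. The following is a minimal sketch; the helper name `within_ss` and the dictionary layout are illustrative choices, not part of the original slides.

```python
from statistics import mean

# Dinners sold per day, from the Example 2 data table.
sales = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}


def within_ss(values):
    """Sum of squared deviations of one sample from its own mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)


# One sum of squares per restaurant, then SSE as their total.
group_ss = {name: within_ss(values) for name, values in sales.items()}
sse = sum(group_ss.values())

for name, ss in group_ss.items():
    print(f"{name}: mean = {mean(sales[name]):.2f}, SS = {ss:.2f}")
print(f"SSE = {sse:.2f}")
```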
12-19
Computation of TSS
$TSS = \sum_i (X_i - \bar{X}_G)^2$

Aynor                    Loris                    Lander
# sold   TSS(Aynor)      # sold   TSS(Loris)      # sold   TSS(Lander)
13       (13-14)^2       10       (10-14)^2       18       (18-14)^2
12       (12-14)^2       12       (12-14)^2       16       (16-14)^2
14       (14-14)^2       13       (13-14)^2       17       (17-14)^2
12       (12-14)^2       11       (11-14)^2       17       (17-14)^2
                                                  17       (17-14)^2
Sum      9.00                     30                       47

TSS: 9.00 + 30 + 47 = 86.00
SSE: 9.75
$\bar{X}_G$: 14.00
Example 2 continued
12-20
Computation of SST
$SST = \sum_k n_k (\bar{X}_k - \bar{X}_G)^2$

Restaurant   Mean     SST
Aynor        12.75    4(12.75-14)^2
Loris        11.50    4(11.50-14)^2
Lander       17.00    5(17.00-14)^2
Sum                   76.25

Shortcut: SST = TSS - SSE
= 86 - 9.75
= 76.25
Example 2 continued
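Both routes to SST, the direct formula and the TSS - SSE shortcut, can be compared in code. This sketch assumes the Example 2 data and takes SSE = 9.75 from the earlier slide as a constant; variable names are illustrative.

```python
from statistics import mean

# Dinners sold per day, from the Example 2 data table.
sales = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}

all_values = [x for values in sales.values() for x in values]
grand_mean = mean(all_values)

# Direct formula: sum over restaurants of n_k * (restaurant mean - grand mean)^2.
sst_direct = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in sales.values())

# Shortcut: total sum of squares minus the error sum of squares.
tss = sum((x - grand_mean) ** 2 for x in all_values)
SSE_FROM_SLIDE = 9.75
sst_shortcut = tss - SSE_FROM_SLIDE

print(f"grand mean = {grand_mean:.2f}, TSS = {tss:.2f}")
print(f"SST direct = {sst_direct:.2f}, SST shortcut = {sst_shortcut:.2f}")
```

The two values agree, which is a quick sanity check that TSS, SST, and SSE were computed consistently.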
12-21

ANOVA Table
Source of     Sum of     Degrees of     Mean                 F
Variation     Squares    Freedom        Square
Treatments    76.25      3-1 = 2        76.25/2 = 38.125     38.125/.975 = 39.103
Error         9.75       13-3 = 10      9.75/10 = .975
Total         86.00      13-1 = 12

Example 2 continued
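The mean squares and the F ratio in the table follow mechanically from the sums of squares. A minimal sketch, with a hypothetical helper `anova_f` and the critical value 4.10 copied from the decision rule rather than computed:

```python
def anova_f(sst, sse, k, n):
    """Return (MST, MSE, F) for a one-way ANOVA with k treatments and n observations."""
    mst = sst / (k - 1)  # treatment mean square
    mse = sse / (n - k)  # error mean square
    return mst, mse, mst / mse


# Sums of squares from Example 2: 3 restaurants, 13 daily observations.
mst, mse, f_stat = anova_f(sst=76.25, sse=9.75, k=3, n=13)

F_CRITICAL = 4.10  # .05 critical value for (2, 10) df, copied from the slide

print(f"MST = {mst:.3f}, MSE = {mse:.3f}, F = {f_stat:.3f}")
print("reject H0" if f_stat > F_CRITICAL else "do not reject H0")
```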
12-22

The p(F > 39.103) is .000018.

Since an F of 39.103 > the critical F of 4.10, the p of .000018 < $\alpha$ of .05, the decision is to reject the null hypothesis and conclude that at least two of the treatment means are not the same.

The mean number of meals sold at the three locations is not the same.

The ANOVA tables on the next two slides are from the Minitab and EXCEL systems.
Example 2 continued
12-23

Analysis of Variance
Source     DF        SS        MS        F        P
Factor      2    76.250    38.125    39.10    0.000
Error      10     9.750     0.975
Total      12    86.000
                                   Individual 95% CIs For Mean
                                   Based on Pooled StDev
Level       N      Mean     StDev  ---------+---------+---------+-------
Aynor       4    12.750     0.957        (---*---)
Loris       4    11.500     1.291   (---*---)
Lander      5    17.000     0.707                         (---*---)
                                   ---------+---------+---------+-------
Pooled StDev =    0.987                  12.5      15.0      17.5

Example 2 continued
12-24
Anova: Single Factor

SUMMARY
Groups Count Sum Average Variance
Aynor 4 51 12.75 0.92
Loris 4 46 11.50 1.67
Lander 5 85 17.00 0.50

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 76.25 2 38.13 39.10 2E-05 4.10
Within Groups 9.75 10 0.98

Total 86.00 12

Example 2 continued
12-25

When I reject the null hypothesis that the means are equal, I want to know which treatment means differ.

One of the simplest procedures is through the use of confidence intervals around the difference in treatment means.
Inferences About
Treatment Means
12-26

$(\bar{X}_1 - \bar{X}_2) \pm t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$

t is obtained from the t table with degrees of freedom (n - k).

MSE = [SSE/(n - k)]

If the confidence interval around the difference in treatment means includes zero, there is not a difference between the treatment means.
Confidence Interval for the
Difference Between Two Means
12-27
95% confidence interval for the difference
in the mean number of meat loaf dinners
sold in Lander and Aynor

Can Katy conclude that


there is a difference
between the two
restaurants?

$(17 - 12.75) \pm 2.228 \sqrt{.975 \left( \frac{1}{5} + \frac{1}{4} \right)}$
$4.25 \pm 1.48 \Rightarrow (2.77, 5.73)$
EXAMPLE 3
12-28

Because zero is not in the interval, we conclude that this pair of means differs.

The mean number


of meals sold in
Aynor is different
from Lander.

Example 3 continued
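The interval in Example 3 can be reproduced with a short function. This is a sketch under stated assumptions: the name `mean_difference_ci` is hypothetical, and the t value 2.228 (95 percent confidence, n - k = 10 degrees of freedom) is copied from the slide rather than looked up in code.

```python
from math import sqrt


def mean_difference_ci(mean_1, mean_2, n_1, n_2, mse, t_value):
    """Confidence interval for the difference between two treatment means."""
    margin = t_value * sqrt(mse * (1 / n_1 + 1 / n_2))
    diff = mean_1 - mean_2
    return diff - margin, diff + margin


T_VALUE = 2.228  # copied from the slide
MSE = 0.975      # error mean square from the Example 2 ANOVA table

# Lander (mean 17, n = 5) versus Aynor (mean 12.75, n = 4).
low, high = mean_difference_ci(17.0, 12.75, 5, 4, MSE, T_VALUE)

# The pair of means differs if zero lies outside the interval.
means_differ = not (low <= 0 <= high)
print(f"({low:.2f}, {high:.2f}), means differ: {means_differ}")
```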
12-29

Sometimes there are other causes of variation. For the two-factor ANOVA we test whether there is a significant difference among the treatment means and whether there is a difference in the blocking effect (a second treatment variable).

$SSB = r \sum_b (\bar{X}_b - \bar{X}_G)^2$

where r is the number of observations in each block (equal to the number of treatments)
$\bar{X}_b$ is the sample mean of block b
$\bar{X}_G$ is the overall or grand mean

In the following ANOVA table, all sums of squares are computed as before, with the addition of the SSB.
Two-Factor ANOVA
12-30

ANOVA Table
Source of     Sum of                     Degrees of     Mean                        F
Variation     Squares                    Freedom        Square
Treatments    SST                        k-1            MST = SST/(k-1)             MST/MSE
Blocks        SSB                        b-1            MSB = SSB/(b-1)             MSB/MSE
Error         SSE = TSS - SST - SSB      (k-1)(b-1)     MSE = SSE/((k-1)(b-1))
Total         TSS                        n-1

Two factor ANOVA table


12-31
The Bieber Manufacturing Co. operates 24 hours a day, five days a week. The workers rotate shifts each week. Todd Bieber, the owner, is interested in whether there is a difference in the number of units produced when the employees work on various shifts. A sample of five workers is selected and their output recorded on each shift.

At the .05 significance level, can we conclude there is a difference in the mean production by shift and in the mean production by employee?
Example 4
12-32

Employee     Day Output   Evening Output   Night Output
McCartney        31             25              35
Neary            33             26              33
Schoen           28             24              30
Thompson         30             29              28
Wagner           28             26              27

Example 4 continued
12-33
Treatment Effect
Step 1: State the null hypothesis and the alternate hypothesis.
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not all means are equal.

Step 2: Select the level of significance. Given as $\alpha$ = .05.

Step 3: Determine the test statistic. The test statistic follows the F distribution.

Step 4: Formulate the decision rule. H0 is rejected if F > 4.46 (the degrees of freedom are 2 and 8) or if p < .05.

Step 5: Perform the calculations and make a decision.
Example 4 continued
12-34
Block Effect
Step 1: State the null hypothesis and the alternate hypothesis.
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
$H_1$: Not all means are equal.

Step 2: Select the level of significance. Given as $\alpha$ = .05.

Step 3: Determine the test statistic. The test statistic follows the F distribution.

Step 4: Formulate the decision rule. H0 is rejected if F > 3.84, df = (4, 8), or if p < .05.

Step 5: Perform the calculations and make a decision.
Example 4 continued
12-35
Block Sums of Squares
Note: $\bar{X}_G$ = 28.87

Effects of time of day and worker on productivity
Employee     Day   Evening   Night   Employee mean   SSB
McCartney     31      25       35        30.33       3(30.33-28.87)^2 = 6.42
Neary         33      26       33        30.67       3(30.67-28.87)^2 = 9.68
Schoen        28      24       30        27.33       3(27.33-28.87)^2 = 7.08
Thompson      30      29       28        29.00       3(29.00-28.87)^2 = .05
Wagner        28      26       27        27.00       3(27.00-28.87)^2 = 10.49
SSB = 6.42 + 9.68 + 7.08 + .05 + 10.49 = 33.73
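The block sum of squares can be recomputed from the raw output figures. In this sketch each employee is a block with three observations (one per shift), so the multiplier is 3, as in the table. Variable names are illustrative, and because the code uses unrounded means, individual terms may differ from the slide in the second decimal while the total agrees.

```python
from statistics import mean

# Units produced per shift (Day, Evening, Night), from the Example 4 data table.
output = {
    "McCartney": [31, 25, 35],
    "Neary": [33, 26, 33],
    "Schoen": [28, 24, 30],
    "Thompson": [30, 29, 28],
    "Wagner": [28, 26, 27],
}

all_values = [x for row in output.values() for x in row]
grand_mean = mean(all_values)

obs_per_block = 3  # one observation per shift for every employee

# SSB: observations per block times the squared deviations of the block means.
ssb = obs_per_block * sum((mean(row) - grand_mean) ** 2 for row in output.values())

print(f"grand mean = {grand_mean:.2f}, SSB = {ssb:.2f}")
```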
12-36

Compute the remaining sums of squares as before:

TSS = 139.73
SST = 62.53
SSE = 43.47 (139.73 - 62.53 - 33.73)
df(block) = 4 (b-1)
df(treatment) = 2 (k-1)
df(error) = 8 (k-1)(b-1)

Example 4 continued
12-37

ANOVA Table
Source of     Sum of     Degrees of    Mean                 F
Variation     Squares    Freedom       Square
Treatments    62.53      2             62.53/2 = 31.27      31.27/5.43 = 5.75
Blocks        33.73      4             33.73/4 = 8.43       8.43/5.43 = 1.55
Error         43.47      8             43.47/8 = 5.43
Total         139.73     14

Example 4 continued
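The whole two-factor table can be rebuilt from the raw data in one pass. This is a minimal sketch assuming the Example 4 layout (shifts as treatments, employees as blocks, one observation per cell); the variable names are illustrative and no critical values or p-values are computed.

```python
# Units produced per shift (Day, Evening, Night), from the Example 4 data table.
output = {
    "McCartney": [31, 25, 35],
    "Neary": [33, 26, 33],
    "Schoen": [28, 24, 30],
    "Thompson": [30, 29, 28],
    "Wagner": [28, 26, 27],
}

rows = list(output.values())
b = len(rows)     # number of blocks (employees)
k = len(rows[0])  # number of treatments (shifts)
values = [x for row in rows for x in row]
grand_mean = sum(values) / len(values)

# Total, treatment, and block sums of squares.
tss = sum((x - grand_mean) ** 2 for x in values)
shift_means = [sum(row[j] for row in rows) / b for j in range(k)]
sst = b * sum((m - grand_mean) ** 2 for m in shift_means)
ssb = k * sum((sum(row) / k - grand_mean) ** 2 for row in rows)

# Error is what remains after removing treatment and block variation.
sse = tss - sst - ssb
mse = sse / ((k - 1) * (b - 1))

f_treatment = (sst / (k - 1)) / mse
f_block = (ssb / (b - 1)) / mse

print(f"TSS = {tss:.2f}, SST = {sst:.2f}, SSB = {ssb:.2f}, SSE = {sse:.2f}")
print(f"F(treatments) = {f_treatment:.2f}, F(blocks) = {f_block:.2f}")
```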
12-38

Treatment Effect
Since the computed F of 5.75 > the critical F of 4.46, the p of .03 < $\alpha$ of .05, H0 is rejected. There is a significant difference in the mean number of units produced for the different time periods.

Block Effect
Since the computed F of 1.55 < the critical F of 3.84, the p of .28 > $\alpha$ of .05, H0 is not rejected since there is no significant difference in the average number of units produced for the different employees.

Example 4 continued
12-39
Minitab output

Two-way ANOVA: Units versus Worker, Shift

Analysis of Variance for Units


Source DF SS MS F P
Worker 4 33.73 8.43 1.55 0.276
Shift 2 62.53 31.27 5.75 0.028
Error 8 43.47 5.43
Total 14 139.73

Example 4 continued
12-40

Anova: Two-Factor Without Replication

SUMMARY Count Sum Average Variance


Day 5 150 30.0 4.5
Evening 5 130 26.0 3.5
Night 5 153 30.6 11.3

McCartney 3 91 30.33 25.33


Neary 3 92 30.67 16.33
Schoen 3 82 27.33 9.33
Thompson 3 87 29.00 1
Wagner 3 81 27.00 1

ANOVA
Source of
Variation SS df MS F P-value F crit
Rows 62.53 2 31.27 5.75 0.03 4.46
Columns 33.73 4 8.43 1.55 0.28 3.84
Error 43.47 8 5.43

Total 139.73 14

Output Using EXCEL
Example 4 continued