
Slides Prepared by

JOHN S. LOUCKS
St. Edward's University

© 2002 South-Western/Thomson Learning


Chapter 13
Analysis of Variance and Experimental Design
An Introduction to Analysis of Variance
Analysis of Variance: Testing for the Equality of k Population Means
Multiple Comparison Procedures
An Introduction to Experimental Design
Completely Randomized Designs
Randomized Block Design

An Introduction to Analysis of Variance

Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means using data obtained from observational or experimental studies.
We want to use the sample results to test the following hypotheses.

H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Ha: Not all population means are equal

If H0 is rejected, we cannot conclude that all population means are different.
Rejecting H0 means only that at least two population means have different values.

Assumptions for Analysis of Variance

For each population, the response variable is normally distributed.
The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
The observations must be independent.

Analysis of Variance: Testing for the Equality of k Population Means
Between-Samples Estimate of Population Variance
Within-Samples Estimate of Population Variance
Comparing the Variance Estimates: The F Test
The ANOVA Table

Between-Samples Estimate of Population Variance
A between-samples estimate of $\sigma^2$ is called the mean square between (MSB).
$$MSB = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$
The numerator of MSB is called the sum of
squares between (SSB).
The denominator of MSB represents the
degrees of freedom associated with SSB.

Within-Samples Estimate of Population Variance
The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square within (MSW).
$$MSW = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$
The numerator of MSW is called the sum of
squares within (SSW).
The denominator of MSW represents the
degrees of freedom associated with SSW.
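
The two estimates can be computed directly from per-sample summary statistics. The sketch below is only an illustration: the function name `variance_estimates` is hypothetical, and the inputs are the sample sizes, means, and variances from the Reed Manufacturing example later in this chapter.

```python
def variance_estimates(sizes, means, variances):
    """Return (MSB, MSW) from per-sample sizes, means, and variances."""
    k = len(sizes)
    n_total = sum(sizes)
    # Overall mean: the sample means weighted by their sample sizes.
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    # Sum of squares between: numerator of MSB.
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    # Sum of squares within: numerator of MSW.
    ssw = sum((n - 1) * v for n, v in zip(sizes, variances))
    return ssb / (k - 1), ssw / (n_total - k)


# Summary statistics from the Reed Manufacturing example.
msb, msw = variance_estimates([5, 5, 5], [55, 68, 57], [26.0, 26.5, 24.5])
print(msb, round(msw, 3))
```

With equal sample sizes the weighted overall mean reduces to the plain average of the sample means, which is the shortcut the worked example uses.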

Comparing the Variance Estimates: The F Test
If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSB/MSW is an F distribution with MSB d.f. equal to $k - 1$ and MSW d.f. equal to $n_T - k$.
If the means of the k populations are not equal, the value of MSB/MSW will be inflated because MSB overestimates $\sigma^2$.
Hence, we will reject H0 if the resulting value of MSB/MSW appears to be too large to have been selected at random from the appropriate F distribution.

Test for the Equality of k Population Means
Hypotheses

H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Ha: Not all population means are equal
Test Statistic
$F = MSB/MSW$
Rejection Rule
Reject H0 if $F > F_\alpha$
where the value of $F_\alpha$ is based on an F distribution with $k - 1$ numerator degrees of freedom and $n_T - k$ denominator degrees of freedom.

Sampling Distribution of MSTR/MSE

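The figure shows the rejection region associated with a level of significance equal to $\alpha$, where $F_\alpha$ denotes the critical value.

[Figure: sampling distribution of MSTR/MSE, with "Do Not Reject H0" to the left of the critical value $F_\alpha$ and "Reject H0" to the right.]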

The ANOVA Table

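Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Squares    F
Treatment     SSTR       k - 1         MSTR       MSTR/MSE
Error         SSE        nT - k        MSE
Total         SST        nT - 1

Here SSTR and MSTR are the between-samples quantities (SSB and MSB), and SSE and MSE are the within-samples quantities (SSW and MSW).
SST divided by its degrees of freedom nT - 1 is simply the overall sample variance that would be obtained if we treated the entire nT observations as one data set.
$$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = SSTR + SSE$$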
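
A short sketch of why the total sum of squares splits into the treatment and error pieces. It uses only the definitions already given, with $\bar{x}_j$ the mean of sample j, $s_j^2$ its variance, and $\bar{\bar{x}}$ the overall mean.

```latex
\begin{aligned}
SST &= \sum_{j=1}^{k}\sum_{i=1}^{n_j}
       \bigl[(x_{ij}-\bar{x}_j) + (\bar{x}_j-\bar{\bar{x}})\bigr]^2 \\
    &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij}-\bar{x}_j)^2
     + \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2
     + 2\sum_{j=1}^{k}(\bar{x}_j-\bar{\bar{x}})\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j) \\
    &= \sum_{j=1}^{k} (n_j-1)s_j^2 + \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 + 0 \\
    &= SSE + SSTR
\end{aligned}
```

The cross term vanishes because the deviations of the observations in a sample from that sample's own mean always sum to zero.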

Example: Reed Manufacturing

Analysis of Variance
J. R. Reed would like to know if the mean number of hours worked per week is the same for the department managers at her three manufacturing plants (Buffalo, Pittsburgh, and Detroit).
A simple random sample of 5 managers from each of the three plants was taken, and the number of hours worked by each manager for the previous week is shown on the next slide.

Example: Reed Manufacturing

Analysis of Variance
                   Plant 1    Plant 2       Plant 3
Observation        Buffalo    Pittsburgh    Detroit
1                  48         73            51
2                  54         63            63
3                  57         66            61
4                  54         64            54
5                  62         74            56
Sample Mean        55         68            57
Sample Variance    26.0       26.5          24.5

Example: Reed Manufacturing

Analysis of Variance
Hypotheses

H0: $\mu_1 = \mu_2 = \mu_3$
Ha: Not all the means are equal
where:
$\mu_1$ = mean number of hours worked per week by the managers at Plant 1
$\mu_2$ = mean number of hours worked per week by the managers at Plant 2
$\mu_3$ = mean number of hours worked per week by the managers at Plant 3

Example: Reed Manufacturing

Analysis of Variance
Mean Square Between

Since the sample sizes are all equal, the overall mean is the average of the sample means:
$\bar{\bar{x}} = (55 + 68 + 57)/3 = 60$
$SSB = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
$MSB = 490/(3 - 1) = 245$
Mean Square Within
$SSW = 4(26.0) + 4(26.5) + 4(24.5) = 308$
$MSW = 308/(15 - 3) = 25.667$

Example: Reed Manufacturing

Analysis of Variance
F - Test

If H0 is true, the ratio MSB/MSW should be near 1 since both MSB and MSW are estimating $\sigma^2$. If Ha is true, the ratio should be significantly larger than 1 since MSB tends to overestimate $\sigma^2$.
Rejection Rule
Assuming $\alpha = .05$, $F_{.05} = 3.89$ (2 d.f. numerator, 12 d.f. denominator). Reject H0 if $F > 3.89$.

Example: Reed Manufacturing

Analysis of Variance
Test Statistic
$F = MSB/MSW = 245/25.667 = 9.55$
Conclusion
$F = 9.55 > F_{.05} = 3.89$, so we reject H0. The mean number of hours worked per week by department managers is not the same at each plant.
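
The critical value 3.89 can be checked without a table in this particular case. The sketch below relies on a fact that is not in the slides: when the numerator has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form $(1 + 2f/d)^{-d/2}$, where d is the denominator degrees of freedom. The function names are hypothetical, and for any other numerator degrees of freedom an F table or a statistics library is needed instead.

```python
def f_upper_tail_2df(f_value, denom_df):
    """P(F > f_value) for an F distribution with 2 numerator d.f. only."""
    return (1 + 2 * f_value / denom_df) ** (-denom_df / 2)


def f_critical_2df(alpha, denom_df):
    """Value f with P(F > f) = alpha, again only for 2 numerator d.f."""
    return (denom_df / 2) * (alpha ** (-2 / denom_df) - 1)


f_stat = 245 / (308 / 12)           # MSB / MSW from the example
critical = f_critical_2df(0.05, 12)
p_value = f_upper_tail_2df(f_stat, 12)

print(round(f_stat, 2))             # 9.55
print(round(critical, 2))           # 3.89
print(p_value < 0.05)               # True, so H0 is rejected
```

Comparing the p-value with $\alpha$ and comparing F with the critical value always lead to the same decision.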

Example: Reed Manufacturing

Analysis of Variance
ANOVA Table
Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Square     F
Treatments    490        2             245        9.55
Error         308        12            25.667
Total         798        14
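
The table entries can be reproduced from the raw observations rather than from the summary statistics. This is a minimal sketch using only the standard library; `one_way_anova` is a hypothetical helper name, and the data are the hours from the Reed Manufacturing table.

```python
from statistics import mean


def one_way_anova(samples):
    """Return sums of squares, mean squares and F for a list of samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    ssw = sum(sum((x - mean(s)) ** 2 for x in s) for s in samples)
    # SST is computed independently so the partition can be checked.
    sst = sum((x - grand_mean) ** 2 for s in samples for x in s)
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return {"SSB": ssb, "SSW": ssw, "SST": sst,
            "MSB": msb, "MSW": msw, "F": msb / msw}


hours = [
    [48, 54, 57, 54, 62],  # Plant 1, Buffalo
    [73, 63, 66, 64, 74],  # Plant 2, Pittsburgh
    [51, 63, 61, 54, 56],  # Plant 3, Detroit
]
table = one_way_anova(hours)
for name, value in table.items():
    print(name, round(value, 3))
```

Because SST is computed on its own here, the output also confirms that the Total row equals the sum of the Treatments and Error rows.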

Multiple Comparison Procedures

Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means.
Fisher's least significant difference (LSD) procedure can be used to determine where the differences occur.

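Fisher's LSD Procedure

Hypotheses
H0: $\mu_i = \mu_j$
Ha: $\mu_i \neq \mu_j$
Test Statistic
$$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{MSW\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$
Rejection Rule
Reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
where the value of $t_{\alpha/2}$ is based on a t distribution with $n_T - k$ degrees of freedom.
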
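Fisher's LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
Hypotheses
H0: $\mu_i = \mu_j$
Ha: $\mu_i \neq \mu_j$
Test Statistic
$\bar{x}_i - \bar{x}_j$
Rejection Rule
Reject H0 if $|\bar{x}_i - \bar{x}_j| > LSD$
where
$$LSD = t_{\alpha/2} \sqrt{MSW\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$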

Example: Reed Manufacturing

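Fisher's LSD
Assuming $\alpha = .05$, $t_{.025} = 2.179$ (12 d.f.), so
$$LSD = 2.179 \sqrt{25.667\left(\frac{1}{5} + \frac{1}{5}\right)} = 6.98$$
Hypotheses (A)
H0: $\mu_1 = \mu_2$
Ha: $\mu_1 \neq \mu_2$
Test Statistic
$|\bar{x}_1 - \bar{x}_2| = |55 - 68| = 13$
Conclusion
Since 13 > 6.98, we reject H0. The mean number of hours worked at Plant 1 is not equal to the mean number worked at Plant 2.
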
Example: Reed Manufacturing

Fisher's LSD
Hypotheses (B)
H0: $\mu_1 = \mu_3$
Ha: $\mu_1 \neq \mu_3$
Test Statistic
$|\bar{x}_1 - \bar{x}_3| = |55 - 57| = 2$
Conclusion
Since 2 < 6.98, we cannot reject H0. There is no significant difference between the mean number of hours worked at Plant 1 and the mean number of hours worked at Plant 3.

Example: Reed Manufacturing

Fisher's LSD
Hypotheses (C)
H0: $\mu_2 = \mu_3$
Ha: $\mu_2 \neq \mu_3$
Test Statistic
$|\bar{x}_2 - \bar{x}_3| = |68 - 57| = 11$
Conclusion
Since 11 > 6.98, we reject H0. The mean number of hours worked at Plant 2 is not equal to the mean number worked at Plant 3.
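
The three comparisons (A), (B), and (C) follow the same recipe, so they can be run in one loop. In this sketch the critical value 2.179 is taken from the slides rather than computed, and the helper name `fisher_lsd` is hypothetical.

```python
from itertools import combinations
from math import sqrt


def fisher_lsd(t_crit, msw, n_i, n_j):
    """Least significant difference for two samples of sizes n_i and n_j."""
    return t_crit * sqrt(msw * (1 / n_i + 1 / n_j))


means = {"Plant 1": 55, "Plant 2": 68, "Plant 3": 57}
n = 5                # managers sampled per plant
msw = 308 / 12       # mean square within from the ANOVA table
t_crit = 2.179       # t value for alpha/2 = .025 and 12 d.f., as on the slides

lsd = fisher_lsd(t_crit, msw, n, n)
print("LSD =", round(lsd, 2))

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    results[(a, b)] = diff > lsd
    verdict = "reject H0" if diff > lsd else "do not reject H0"
    print(f"{a} vs {b}: |difference| = {diff}, {verdict}")
```

With equal sample sizes a single LSD value serves every pair; with unequal sizes it would have to be recomputed for each pair.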

An Introduction to Experimental Design

Statistical studies can be classified as being either experimental or observational.
In an experimental study, one or more factors
are controlled so that data can be obtained
about how the factors influence the variables
of interest.
In an observational study, no attempt is made
to control the factors.
Cause-and-effect relationships are easier to
establish in experimental studies than in
observational studies.

An Introduction to Experimental Design

A factor is a variable that the experimenter has selected for investigation.
A treatment is a level of a factor.
Experimental units are the objects of interest
in the experiment.
A completely randomized design is an
experimental design in which the treatments
are randomly assigned to the experimental
units.
If the experimental units are heterogeneous,
blocking can be used to form homogeneous
groups, resulting in a randomized block
design.
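
The difference between the two designs lies in how the randomization is carried out. The sketch below is one possible way to do it, not a prescribed procedure: both function names and the unit labels are hypothetical, and the block version assumes every treatment is applied once within each block.

```python
import random


def completely_randomized_assignment(units, treatments, seed=None):
    """Randomly assign treatments to units in (near-)equal numbers."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out to the treatments in turn.
    return {unit: treatments[i % len(treatments)]
            for i, unit in enumerate(shuffled)}


def randomized_block_assignment(blocks, treatments, seed=None):
    """Within each block, apply every treatment once, in a random order."""
    rng = random.Random(seed)
    return {block: rng.sample(treatments, len(treatments)) for block in blocks}


cars = [f"car {i}" for i in range(1, 16)]          # 15 experimental units
waxes = ["Type 1", "Type 2", "Type 3"]             # 3 treatments
assignment = completely_randomized_assignment(cars, waxes, seed=1)

autos = [f"automobile {i}" for i in range(1, 6)]   # 5 blocks
blends = ["Blend X", "Blend Y", "Blend Z"]
orders = randomized_block_assignment(autos, blends, seed=1)
```

In the first design chance alone decides which unit gets which treatment; in the second, randomization happens separately inside each block so that every block sees all treatments.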

Completely Randomized Designs

Between-Treatments Estimate of Population Variance
Within-Treatments Estimate of Population Variance
Comparing the Variance Estimates: The F Test
The ANOVA Table
Pairwise Comparisons

Between-Treatments Estimate of Population Variance
In the context of experimental design, the between-samples estimate of $\sigma^2$ is referred to as the mean square due to treatments (MSTR).
It is the same as what we previously called mean square between (MSB).
The formula for MSTR is
$$MSTR = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$
The numerator is called the sum of squares
due to treatments (SSTR).
The denominator k - 1 represents the degrees
of freedom associated with SSTR.

Within-Treatments Estimate of Population Variance
The second estimate of $\sigma^2$, the within-samples estimate, is referred to as the mean square due to error (MSE).
It is the same as what we previously called mean square within (MSW).
The formula for MSE is
$$MSE = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$
The numerator is called the sum of squares due to error (SSE).
The denominator nT - k represents the degrees
of freedom associated with SSE.

ANOVA Table for a
Completely Randomized Design

Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Squares                 F
Treatments    SSTR       k - 1         MSTR = SSTR/(k - 1)     MSTR/MSE
Error         SSE        nT - k        MSE = SSE/(nT - k)
Total         SST        nT - 1

Example: Home Products, Inc.

Home Products, Inc. is considering marketing a long-lasting car wax. Three different waxes (Type 1, Type 2, and Type 3) have been developed.
In order to test the durability of these waxes, 5 new cars were waxed with Type 1, 5 with Type 2, and 5 with Type 3. Each car was then repeatedly run through an automatic carwash until the wax coating showed signs of deterioration. The number of times each car went through the carwash is shown on the next slide.

Example: Home Products, Inc.

                   Wax       Wax       Wax
Observation        Type 1    Type 2    Type 3
1                  48        73        51
2                  54        63        63
3                  57        66        61
4                  54        64        54
5                  62        74        56

Sample Mean        55        68        57
Sample Variance    26.0      26.5      24.5

Example: Home Products, Inc.

Completely Randomized Design


Hypotheses

H0: $\mu_1 = \mu_2 = \mu_3$
Ha: Not all the means are equal
where:
$\mu_1$ = mean number of washes for Type 1 wax
$\mu_2$ = mean number of washes for Type 2 wax
$\mu_3$ = mean number of washes for Type 3 wax

Example: Home Products, Inc.

Completely Randomized Design


Mean Square Between Treatments

Since the sample sizes are all equal, the overall mean is the average of the sample means:
$\bar{\bar{x}} = (\bar{x}_1 + \bar{x}_2 + \bar{x}_3)/3 = (55 + 68 + 57)/3 = 60$
$SSTR = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
$MSTR = 490/(3 - 1) = 245$
Mean Square Error
$SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308$
$MSE = 308/(15 - 3) = 25.667$

Example: Home Products, Inc.

Completely Randomized Design


Rejection Rule

Assuming $\alpha = .05$, $F_{.05} = 3.89$ (2 d.f. numerator and 12 d.f. denominator). Reject H0 if $F > 3.89$.
Test Statistic
$F = MSTR/MSE = 245/25.667 = 9.55$
Conclusion
Since $F = 9.55 > F_{.05} = 3.89$, we reject H0. The mean number of washes is not the same for the three types of wax.

Example: Home Products, Inc.

Completely Randomized Design


ANOVA Table

Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Squares    F
Treatments    490        2             245        9.55
Error         308        12            25.667
Total         798        14

Randomized Block Design

The ANOVA Procedure


Computations and Conclusions

The ANOVA Procedure

The ANOVA procedure for the randomized block design requires us to partition the sum of squares total (SST) into three groups: sum of squares due to treatments, sum of squares due to blocks, and sum of squares due to error.
The formula for this partitioning is

SST = SSTR + SSBL + SSE


The total degrees of freedom, nT - 1, are
partitioned such that k - 1 degrees of freedom
go to treatments,
b - 1 go to blocks, and (k - 1)(b - 1) go to the
error term.
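
A quick check that the three pieces add up to the total degrees of freedom. It assumes each of the k treatments is observed exactly once in each of the b blocks, so that $n_T = kb$.

```latex
\begin{aligned}
(k-1) + (b-1) + (k-1)(b-1)
  &= (k-1) + (b-1) + (kb - k - b + 1) \\
  &= kb - 1 \\
  &= n_T - 1
\end{aligned}
```

For example, with k = 3 treatments and b = 5 blocks, the split is 2 + 4 + 8 = 14 = 15 - 1.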
ANOVA Table for a
Randomized Block Design

Source of     Sum of     Degrees of        Mean
Variation     Squares    Freedom           Squares                       F
Treatments    SSTR       k - 1             MSTR = SSTR/(k - 1)           MSTR/MSE
Blocks        SSBL       b - 1             MSBL = SSBL/(b - 1)
Error         SSE        (k - 1)(b - 1)    MSE = SSE/[(k - 1)(b - 1)]
Total         SST        nT - 1

Example: Eastern Oil Co.

Eastern Oil has developed three new blends
of gasoline and must decide which blend or
blends to produce and distribute. A study of
the miles per gallon ratings of the three blends
is being conducted to determine if the mean
ratings are the same for the three blends.
Five automobiles have been tested using
each of the three gasoline blends and the
miles per gallon ratings are shown on the next
slide.

Example: Eastern Oil Co.

Automobile    Type of Gasoline (Treatment)       Block
(Block)       Blend X    Blend Y    Blend Z      Means
1             31         30         30           30.333
2             30         29         29           29.333
3             29         29         28           28.667
4             33         31         29           31.000
5             26         25         26           25.667
Treatment
Means         29.8       28.8       28.4

Example: Eastern Oil Co.

Randomized Block Design


Mean Square Due to Treatments

The overall sample mean is 29. Thus,
$SSTR = 5[(29.8 - 29)^2 + (28.8 - 29)^2 + (28.4 - 29)^2] = 5.2$
$MSTR = 5.2/(3 - 1) = 2.6$
Mean Square Due to Blocks
$SSBL = 3[(30.333 - 29)^2 + \cdots + (25.667 - 29)^2] = 51.33$
$MSBL = 51.33/(5 - 1) = 12.8$
Mean Square Due to Error
$SSE = SST - SSTR - SSBL = 62 - 5.2 - 51.33 = 5.47$
$MSE = 5.47/[(3 - 1)(5 - 1)] = .68$

Example: Eastern Oil Co.

Randomized Block Design


Rejection Rule

Assuming $\alpha = .05$, $F_{.05} = 4.46$ (2 d.f. numerator and 8 d.f. denominator). Reject H0 if $F > 4.46$.
Test Statistic
F = MSTR/MSE = 2.6/.68 = 3.82
Conclusion
Since 3.82 < 4.46, we cannot reject H0.
There is not sufficient evidence to conclude
that the miles per gallon ratings differ for
the three gasoline blends.
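
The whole randomized block computation can be reproduced from the table of ratings. This is a sketch under the assumption of one observation per treatment in each block; `randomized_block_anova` is a hypothetical helper name, and the critical value 4.46 is the one quoted in the rejection rule.

```python
def randomized_block_anova(table):
    """table[i][j] is the observation for block i under treatment j."""
    b = len(table)       # number of blocks
    k = len(table[0])    # number of treatments
    grand_mean = sum(sum(row) for row in table) / (b * k)
    treatment_means = [sum(row[j] for row in table) / b for j in range(k)]
    block_means = [sum(row) / k for row in table]
    sst = sum((x - grand_mean) ** 2 for row in table for x in row)
    sstr = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssbl = k * sum((m - grand_mean) ** 2 for m in block_means)
    sse = sst - sstr - ssbl
    mstr = sstr / (k - 1)
    mse = sse / ((k - 1) * (b - 1))
    return {"SST": sst, "SSTR": sstr, "SSBL": ssbl, "SSE": sse,
            "MSTR": mstr, "MSE": mse, "F": mstr / mse}


# Miles per gallon: rows are automobiles 1-5, columns are Blends X, Y, Z.
mpg = [
    [31, 30, 30],
    [30, 29, 29],
    [29, 29, 28],
    [33, 31, 29],
    [26, 25, 26],
]
result = randomized_block_anova(mpg)
for name, value in result.items():
    print(name, round(value, 3))
print("reject H0" if result["F"] > 4.46 else "cannot reject H0")
```

Carrying full precision gives F of about 3.80 rather than 3.82, because the slide divides by MSE rounded to .68; the conclusion is the same either way.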

End of Chapter 13
