
Chapter 6

Analysis of Variance
Chapter outline

 One-Way Analysis of Variance
 The Randomized Block Design
 Factorial Experiments: Two Factors
 Multiple Comparisons (optional)

Concepts

 The response variable is the variable of interest to be measured in the experiment. We also refer to the response as the dependent variable. Typically, the response/dependent variable is quantitative in nature.
 Factors are those variables whose effect on the response is of interest to the experimenter. Quantitative factors are measured on a numerical scale, whereas qualitative factors are those that are not (naturally) measured on a numerical scale. Factors are also referred to as independent variables.
Concepts

 Factor levels are the values of the factor used in the experiment.
 The treatments of an experiment are the factor-level combinations used.
 An experimental unit is the object on which the response and factors are observed or measured.
Randomized Design Example

Factor (Training Method)

Factor levels (Treatments):      Level 1     Level 2     Level 3
Experimental units:              3 subjects  3 subjects  3 subjects
Dependent variable (Response):   21 hrs.     17 hrs.     31 hrs.
                                 27 hrs.     25 hrs.     28 hrs.
                                 29 hrs.     20 hrs.     22 hrs.
Example
 In the last decade, stockbrokers have drastically changed the way
they do business. Internet trading has become quite common, and
online trades can cost as little as $7. It is now easier and cheaper to
invest in the stock market than ever before. What are the effects of
these changes? To help answer this question, a financial analyst
randomly sampled 366 American households and asked each to
report the age category of the head of the household and the
proportion of its financial assets that are invested in the stock market.
The age categories are Young (less than 35), Early middle age (35
to 49), Late middle age (50 to 65), Senior (older than 65).
Example
 The analyst was particularly interested in determining
whether the ownership of stocks varied by age. Some of
the data are listed next. Do these data allow the analyst
to determine whether there are differences in stock
ownership among the four age groups?
Concepts

 The variable X is called the response variable, and its values are called responses.
 The unit that we measure is called an experimental unit. In this example, the response variable is the percentage of assets invested in stocks, and the experimental units are the heads of households sampled.
 The criterion by which we classify the populations is called a factor. Each population is called a factor level. In this example, the factor is the age category of the head of the household, and there are four levels.
One-Way Analysis of Variance
 The analysis of variance is a procedure that tests to
determine whether differences exist between two or
more population means.
ANOVA F-Test
 Tests the equality of two or more (k)
population means
 Variables
 One nominal scaled independent variable
 Two or more (k) treatment levels or
classifications
 One interval or ratio scaled dependent
variable
Example
 You select independent random samples of five female and five male high school seniors and record their SAT scores. Can we conclude that the population of female high school students scores higher, on average, than the population of male students?

           Females    Males
           530        490
           560        520
           590        550
           620        560
           650        610

Mean       590        546
S.D.       47.434     45.056
Variance   2250       2030

The key is to compare the difference between the treatment means with the amount of sampling variability.
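One way to carry out this comparison in practice is a one-way ANOVA on the two samples. A minimal sketch in Python, assuming SciPy is available; the data are the SAT scores above:

```python
from scipy import stats

females = [530, 560, 590, 620, 650]
males = [490, 520, 550, 560, 610]

# One-way ANOVA with k = 2 groups: compares the variation between the two
# sample means with the variation within each sample.
f_stat, p_value = stats.f_oneway(females, males)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")  # F is roughly 2.26 for these data
```

With only k = 2 groups the ANOVA F-test is equivalent to a two-sided pooled-variance t-test, so it reaches the same conclusion a two-sample t-test would.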
ANOVA Partitions Total Variation

Total variation = Variation due to treatment + Variation due to random sampling

Variation due to treatment is also called:
 Sum of Squares Among
 Sum of Squares Between
 Sum of Squares Treatment
 Among Groups Variation

Variation due to random sampling is also called:
 Sum of Squares Within
 Sum of Squares Error
 Within Groups Variation
Total Variation

SS(Total) = SST + SSE

[Figure: scatter of the responses x for Groups 1, 2, and 3 about the grand mean, illustrating total variation.]
Treatment Variation

$$SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2$$

[Figure: the group means x̄1, x̄2, x̄3 plotted against the grand mean x̄ for Groups 1, 2, and 3.]
Random (Error) Variation

$$SSE = \sum_{j=1}^{n_1} (x_{1j} - \bar{x}_1)^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)^2 + \cdots + \sum_{j=1}^{n_k} (x_{kj} - \bar{x}_k)^2$$

[Figure: within-group scatter of the responses about their own group means x̄1, x̄2, x̄3.]
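The partitioning can be verified numerically in a few lines; a minimal sketch in Python with NumPy, using the three training-method samples from the earlier Randomized Design Example:

```python
import numpy as np

# Responses (hours) for the three training-method levels from the earlier example.
groups = [np.array([21, 27, 29]), np.array([17, 25, 20]), np.array([31, 28, 22])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

# SST: weighted squared deviations of the group means from the grand mean.
sst = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SSE: squared deviations of each observation from its own group mean.
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
# SS(Total): squared deviations of every observation from the grand mean.
ss_total = ((all_obs - grand_mean) ** 2).sum()

print(sst, sse, ss_total)  # ss_total equals sst + sse (up to rounding)
```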
ANOVA Summary Table

Source of     Degrees of   Sum of       Mean Square
Variation     Freedom      Squares      (Variance)          F
Treatment     k – 1        SST          MST = SST/(k – 1)   MST/MSE
Error         n – k        SSE          MSE = SSE/(n – k)
Total         n – 1        SS(Total) =
                           SST + SSE
ANOVA F-Test Test Statistic

 If H0 is true, the sample means will be close to one another → SST is small.
 If H1 is true, there will be large differences between some of the sample means → SST is large.

Test Statistic
• F = MST / MSE
— MST is the Mean Square for Treatment
— MSE is the Mean Square for Error

Degrees of Freedom
• ν1 = k – 1 numerator degrees of freedom
• ν2 = n – k denominator degrees of freedom
— k = Number of groups
— n = Total sample size
ANOVA F-Test Hypotheses

 H0: μ1 = μ2 = μ3 = ... = μk
— All population means are equal
— No treatment effect
[Figure: identical population curves centered at a common mean, μ1 = μ2 = μ3.]

 Ha: Not all μi are equal
— At least 2 population means are different
— Treatment effect
— Writing Ha as μ1 ≠ μ2 ≠ ... ≠ μk is wrong
[Figure: population curves with μ1 = μ2 ≠ μ3.]
ANOVA F-Test to Compare k Treatment Means

H0: µ1 = µ2 = … = µk
Ha: At least two treatment means differ

Test statistic: F = MST / MSE

Rejection region: F > Fα, where Fα is based on (k – 1) numerator degrees of freedom (associated with MST) and (n – k) denominator degrees of freedom (associated with MSE).
ANOVA F-Test Critical Value

 If the means are equal, F = MST / MSE ≈ 1.
 Only reject H0 for large F!
 The test is always one-tailed (upper tail).

[Figure: F distribution with the rejection region to the right of the critical value F(α; k – 1, n – k); do not reject H0 below it.]
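The critical value F(α; k – 1, n – k) can also be obtained in software rather than from a table; a minimal sketch with SciPy (the 3-treatment, 15-observation numbers match the filling-machine example that follows):

```python
from scipy.stats import f

alpha, k, n = 0.05, 3, 15                 # 3 treatments, 15 observations in total
f_crit = f.ppf(1 - alpha, k - 1, n - k)   # upper-tail critical value
print(round(f_crit, 2))                   # about 3.89
```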
Conditions Required for a Valid ANOVA F-Test: Completely Randomized Design

1. The samples are randomly selected in an independent manner from the k treatment populations. (This can be accomplished by randomly assigning the experimental units to the treatments.)
2. All k sampled populations have distributions that are approximately normal.
3. The k population variances are equal (i.e., σ1² = σ2² = σ3² = ... = σk²).
Are the assumptions required for the test approximately satisfied?
(Illustration: comparing driving distances for several brands of golf balls.)

 The samples of golf balls for each brand are selected randomly and independently.
 The probability distributions of the distances for each brand are normal.
 The variances of the distance probability distributions for each brand are equal.
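These conditions are usually checked informally with diagnostics. A hedged sketch in Python with SciPy, using hypothetical distance samples (the numbers below are placeholders, not the textbook's data): the Shapiro-Wilk test screens each sample for non-normality and Levene's test screens for unequal variances.

```python
from scipy import stats

# Hypothetical driving-distance samples for three brands (placeholder values).
brand_a = [251, 262, 263, 248, 259]
brand_b = [263, 262, 265, 254, 274]
brand_c = [269, 275, 277, 266, 271]

# Normality check within each sample (Shapiro-Wilk).
for name, sample in [("A", brand_a), ("B", brand_b), ("C", brand_c)]:
    stat, p = stats.shapiro(sample)
    print(f"Brand {name}: Shapiro-Wilk p = {p:.3f}")

# Equal-variance check across the samples (Levene's test).
stat, p = stats.levene(brand_a, brand_b, brand_c)
print(f"Levene p = {p:.3f}")
```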
ANOVA F-Test Example
As production manager, you want to see if three filling machines have different mean filling times. You assign 15 similarly trained and experienced workers, 5 per machine, to the machines. At the .05 level of significance, is there a difference in mean filling times?

Mach1    Mach2    Mach3
25.40    23.40    20.00
26.31    21.80    22.20
24.10    23.50    19.75
23.74    22.75    20.60
25.10    21.60    20.40
ANOVA F-Test Solution
 H0: μ1 = μ2 = μ3
 Ha: Not all equal
 α = .05
 ν1 = 2, ν2 = 12
 Critical value: F(.05; 2, 12) = 3.89 (reject H0 if F > 3.89)
Summary Table Solution (from computer output)

Source of     Degrees of    Sum of     Mean Square
Variation     Freedom       Squares    (Variance)    F
Treatment     3 – 1 = 2     47.1640    23.5820       25.60
(Machines)
Error         15 – 3 = 12   11.0532    .9211
Total         15 – 1 = 14   58.2172
ANOVA F-Test Solution
 H0: μ1 = μ2 = μ3
 Ha: Not all equal
 α = .05
 ν1 = 2, ν2 = 12
 Critical value: F(.05; 2, 12) = 3.89

Test statistic: F = MST / MSE = 23.5820 / .9211 = 25.6

Decision: Reject H0 at α = .05.
Conclusion: There is evidence the population means are different.
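The same F-statistic can be reproduced in software; a minimal sketch with SciPy, assuming it is installed, using the filling-time data above:

```python
from scipy import stats

mach1 = [25.40, 26.31, 24.10, 23.74, 25.10]
mach2 = [23.40, 21.80, 23.50, 22.75, 21.60]
mach3 = [20.00, 22.20, 19.75, 20.60, 20.40]

# One-way ANOVA across the three machines.
f_stat, p_value = stats.f_oneway(mach1, mach2, mach3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # F is about 25.60, far above 3.89
```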
ANOVA F-Test Thinking Challenge

You're a trainer for Microsoft Corp. Is there a difference in mean learning times of 12 people using 4 different training methods (α = .05)?

M1    M2    M3    M4
10    11    13    18
 9    16     8    23
 5     9     9    25

Use the following table.
Summary Table (Partially Completed)

Source of     Degrees of   Sum of     Mean Square
Variation     Freedom      Squares    (Variance)    F
Treatment                  348
(Methods)
Error                      80
Total
ANOVA F-Test Solution
 H0: μ1 = μ2 = μ3 = μ4
 Ha: Not all equal
 α = .05
 ν1 = 3, ν2 = 8
 Critical value: F(.05; 3, 8) = 4.07

Test statistic:
Decision:
Conclusion:
Summary Table Solution

Source of     Degrees of    Sum of     Mean Square
Variation     Freedom       Squares    (Variance)    F
Treatment     4 – 1 = 3     348        116           11.6
(Methods)
Error         12 – 4 = 8    80         10
Total         12 – 1 = 11   428
ANOVA F-Test Solution
 H0: μ1 = μ2 = μ3 = μ4
 Ha: Not all equal
 α = .05
 ν1 = 3, ν2 = 8
 Critical value: F(.05; 3, 8) = 4.07

Test statistic: F = MST / MSE = 116 / 10 = 11.6

Decision: Reject H0 at α = .05.
Conclusion: There is evidence the population means are different.
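As a check, the same F-statistic can be computed directly from the raw learning times; a minimal sketch with SciPy:

```python
from scipy import stats

m1, m2, m3, m4 = [10, 9, 5], [11, 16, 9], [13, 8, 9], [18, 23, 25]

# One-way ANOVA across the four training methods.
f_stat, p_value = stats.f_oneway(m1, m2, m3, m4)
print(f"F = {f_stat:.1f}")   # about 11.6, which exceeds the critical value 4.07
```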
The Randomized Block Design
Randomized Block Design
The randomized block design consists of a two-
step procedure:
1. Matched sets of experimental units, called blocks,
are formed, each block consisting of k
experimental units (where k is the number of
treatments). The b blocks should consist of
experimental units that are as similar as possible.
2. One experimental unit from each block is randomly
assigned to each treatment, resulting in a total of
n = bk responses.
Randomized Block Design
 Reduces sampling variability (SSE)
 Matched sets of experimental units (blocks)
 One experimental unit from each block is
randomly assigned to each treatment
Randomized Block Design Total Variation Partitioning

SS(Total) = SST + SSB + SSE
(Total variation is partitioned into variation due to treatments, variation due to blocks, and error variation.)
ANOVA F-Test to Compare k Treatment Means: Randomized Block Design

H0: µ1 = µ2 = … = µk
Ha: At least two treatment means differ

Test statistic: F = MST / MSE

Rejection region: F > Fα, where Fα is based on (k – 1) numerator degrees of freedom and (n – b – k + 1) denominator degrees of freedom.
Conditions Required for a Valid ANOVA F-test:
Randomized Block Design

1. The b blocks are randomly selected, and all k treatments are applied (in random order) to each block.
2. The distributions of observations
corresponding to all bk block-treatment
combinations are approximately normal.
3. All bk block-treatment distributions have equal
variances.
Randomized Block Design F-Test Test Statistic

1. Test Statistic
• F = MST / MSE
— MST is the Mean Square for Treatment
— MSE is the Mean Square for Error

2. Degrees of Freedom
• ν1 = k – 1 (numerator), ν2 = n – k – b + 1 (denominator)
— k = Number of groups
— n = Total sample size
— b = Number of blocks
Randomized Block Design Example
A production manager wants to see if three
assembly methods have different mean assembly
times (in minutes). Five employees were selected
at random and assigned to use each assembly
method. At the .05 level of significance, is there a
difference in mean assembly times?
Employee   Method 1   Method 2   Method 3
1          5.4        3.6        4.0
2          4.1        3.8        2.9
3          6.1        5.6        4.3
4          3.6        2.3        2.6
5          5.3        4.7        3.4
Randomized Block Design F-Test Solution
 H0: μ1 = μ2 = μ3
 Ha: Not all equal
 α = .05
 ν1 = 2, ν2 = 8
 Critical value: F(.05; 2, 8) = 4.46 (reject H0 if F > 4.46)
Summary Table Solution

Source of    Degrees of        Sum of     Mean Square
Variation    Freedom           Squares    (Variance)    F
Treatment    3 – 1 = 2         5.43       2.71          12.9
(Methods)
Block        5 – 1 = 4         10.69      2.67          12.7
(Employee)
Error        15 – 3 – 5 + 1    1.68       .21
             = 8
Total        15 – 1 = 14       17.8
Randomized Block Design F-Test Solution
 H0: μ1 = μ2 = μ3
 Ha: Not all equal
 α = .05
 ν1 = 2, ν2 = 8
 Critical value: F(.05; 2, 8) = 4.46

Test statistic: F = MST / MSE = 2.71 / .21 = 12.9

Decision: Reject H0 at α = .05.
Conclusion: There is evidence the population means are different.
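A randomized block analysis like this one can be run in software as a two-factor ANOVA with no interaction term (treatments plus blocks). A sketch in Python, assuming pandas and statsmodels are installed, using the assembly-time data above:

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

data = pd.DataFrame({
    "employee": [1, 2, 3, 4, 5] * 3,
    "method":   ["M1"] * 5 + ["M2"] * 5 + ["M3"] * 5,
    "time":     [5.4, 4.1, 6.1, 3.6, 5.3,    # Method 1
                 3.6, 3.8, 5.6, 2.3, 4.7,    # Method 2
                 4.0, 2.9, 4.3, 2.6, 3.4],   # Method 3
})

# Treatments (method) plus blocks (employee); no interaction term is fitted.
model = smf.ols("time ~ C(method) + C(employee)", data=data).fit()
print(anova_lm(model))   # F for C(method) is about 12.9, for C(employee) about 12.7
```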
Steps for Conducting an ANOVA for
a Randomized Block Design
1. Be sure the design consists of blocks (preferably,
blocks of homogeneous experimental units) and
that each treatment is randomly assigned to one
experimental unit in each block.
2. If possible, check the assumptions of normality
and equal variances for all block-treatment
combinations. [Note: This may be difficult to do
because the design will likely have only one
observation for each block-treatment combination.]
Steps for Conducting an ANOVA for
a Randomized Block Design
3. Create an ANOVA summary table that specifies
the variability attributable to Treatments,
Blocks, and Error, which leads to the
calculation of the F-statistic to test the null
hypothesis that the treatment means are equal
in the population. Use a statistical software
package or the calculation formulas in
Appendix C to obtain the necessary numerical
ingredients.
Steps for Conducting an ANOVA
for a Randomized Block Design
4. If the F-test leads to the conclusion that the
means differ, use the Bonferroni, Tukey, or
similar procedure to conduct multiple
comparisons of as many of the pairs of means
as you wish. Use the results to summarize the
statistically significant differences among the
treatment means. Remember that, in general,
the randomized block design cannot be used to
form confidence intervals for individual treatment
means.
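Tukey's procedure mentioned in step 4 is available in statsmodels; a hedged sketch, illustrated with the one-way filling-machine data from the earlier example. (For a randomized block design the comparisons should really be based on the block model's MSE, which this convenience function does not use.)

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Filling-time data from the earlier completely randomized example.
times = np.array([25.40, 26.31, 24.10, 23.74, 25.10,
                  23.40, 21.80, 23.50, 22.75, 21.60,
                  20.00, 22.20, 19.75, 20.60, 20.40])
machines = np.repeat(["Mach1", "Mach2", "Mach3"], 5)

# Simultaneous confidence intervals for all pairwise mean differences.
result = pairwise_tukeyhsd(endog=times, groups=machines, alpha=0.05)
print(result)
```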
Steps for Conducting an ANOVA
for a Randomized Block Design
5. If the F-test leads to the nonrejection of the null
hypothesis that the treatment means are equal,
consider the following possibilities:
a. The treatment means are equal–that is, the null
hypothesis is true.
Steps for Conducting an ANOVA for
a Randomized Block Design
b. The treatment means really differ, but other important
factors affecting the response are not accounted for
by the randomized block design. These factors
inflate the sampling variability, as measured by MSE,
resulting in smaller values of the F-statistic. Either
increase the sample size for each treatment or
conduct an experiment that accounts for the other
factors affecting the response. Do not automatically
reach the former conclusion because the possibility
of a Type II error must be considered if you accept
H0.
Steps for Conducting an ANOVA
for a Randomized Block Design

6. If desired, conduct the F-test of the null hypothesis that the block means are equal. Rejection of this hypothesis lends statistical support to using the randomized block design.
Steps for Conducting an ANOVA
for a Randomized Block Design

Note: It is often difficult to check whether the assumptions for a randomized block design are satisfied. There is usually only one observation for each block-treatment combination. When you feel these assumptions are likely to be violated, a nonparametric procedure is advisable.
Factorial Experiments:
Two Factors
Factorial Design

A complete factorial experiment is one in which every factor-level combination is employed–that is, the number of treatments in the experiment equals the total number of factor-level combinations.
Also referred to as a two-way classification.
Factorial Design
To determine the nature of the treatment effect, if
any, on the response in a factorial experiment, we
need to break the treatment variability into three
components: Interaction between Factors A and B,
Main Effect of Factor A, and Main Effect of Factor
B. The Factor Interaction component is used to
test whether the factors combine to affect the
response, while the Factor Main Effect
components are used to determine whether the
factors separately affect the response.
Factorial Design

 Experimental units (subjects) are assigned randomly to treatments
 Subjects are assumed homogeneous
 Two or more factors or independent variables
 Each has two or more treatments (levels)
 Analyzed by two-way ANOVA
Procedure for Analysis of Two-
Factor Factorial Experiment

1. Partition the Total Sum of Squares into the Treatments and Error components. Use either a statistical software package or the calculation formulas in Appendix C to accomplish the partitioning.
Procedure for Analysis of Two-
Factor Factorial Experiment

2. Use the F-ratio of Mean Square for Treatments to Mean Square for Error to test the null hypothesis that the treatment means are equal.
a. If the test results in nonrejection of the null
hypothesis, consider refining the experiment by
increasing the number of replications or introducing
other factors. Also consider the possibility that the
response is unrelated to the two factors.
b. If the test results in rejection of the null hypothesis,
then proceed to step 3.
Procedure for Analysis of Two-
Factor Factorial Experiment

3. Partition the Treatments Sum of Squares into the Main Effect and Interaction Sum of Squares. Use either a statistical software package or the calculation formulas in Appendix C to accomplish the partitioning.
Procedure for Analysis of Two-
Factor Factorial Experiment

4. Test the null hypothesis that factors A and B do not interact to affect the response by computing the F-ratio of the Mean Square for Interaction to the Mean Square for Error.
a. If the test results in nonrejection of the null
hypothesis, proceed to step 5.
b. If the test results in rejection of the null hypothesis,
conclude that the two factors interact to affect the
mean response. Then proceed to step 6a.
Procedure for Analysis of Two-
Factor Factorial Experiment

5. Conduct tests of two null hypotheses that the mean response is the same at each level of factor A and factor B. Compute two F-ratios by comparing the Mean Square for each Factor Main Effect to the Mean Square for Error.
a. If one or both tests result in rejection of the null
hypothesis, conclude that the factor affects the
mean response. Proceed to step 6b.
Procedure for Analysis of Two-
Factor Factorial Experiment

b. If both tests result in nonrejection, an apparent contradiction has occurred. Although the treatment means apparently differ (step 2 test), the interaction (step 4) and main effect (step 5) tests have not supported that result. Further experimentation is advised.
Procedure for Analysis of Two-
Factor Factorial Experiment

6. Compare the means:
a. If the test for interaction (step 4) is significant, use a multiple comparisons procedure to compare any or all pairs of the treatment means.
b. If the test for one or both main effects (step 5) is significant, use a multiple comparisons procedure to compare the pairs of means corresponding to the levels of the significant factor(s).
ANOVA Tests Conducted for Factorial Experiments:
Completely Randomized Design, r Replicates per
Treatment

Test for Treatment Means
H0: No difference among the ab treatment means
Ha: At least two treatment means differ

Test statistic: F = MST / MSE

Rejection region: F > Fα, based on (ab – 1) numerator and (n – ab) denominator degrees of freedom. [Note: n = abr.]
ANOVA Tests Conducted for Factorial Experiments:
Completely Randomized Design, r Replicates per
Treatment

Test for Factor Interaction
H0: Factors A and B do not interact to affect the response mean
Ha: Factors A and B do interact to affect the response mean

Test statistic: F = MS(AB) / MSE

Rejection region: F > Fα, based on (a – 1)(b – 1) numerator and (n – ab) denominator degrees of freedom
ANOVA Tests Conducted for Factorial Experiments:
Completely Randomized Design, r Replicates per
Treatment

Test for Main Effect of Factor A
H0: No difference among the a mean levels of factor A
Ha: At least two factor A mean levels differ

Test statistic: F = MS(A) / MSE

Rejection region: F > Fα, based on (a – 1) numerator and (n – ab) denominator degrees of freedom
ANOVA Tests Conducted for Factorial Experiments:
Completely Randomized Design, r Replicates per
Treatment

Test for Main Effect of Factor B
H0: No difference among the b mean levels of factor B
Ha: At least two factor B mean levels differ

Test statistic: F = MS(B) / MSE

Rejection region: F > Fα, based on (b – 1) numerator and (n – ab) denominator degrees of freedom
Conditions Required for Valid
F-tests in Factorial Experiments

1. The response distribution for each factor-level combination (treatment) is normal.
2. The response variance is constant for all treatments.
3. Random and independent samples of experimental units are associated with each treatment.
ANOVA Data Table

                        Factor B
Factor A    1            2            ...    b
1           x111, x112   x121, x122   ...    x1b1, x1b2
2           x211, x212   x221, x222   ...    x2b1, x2b2
:           :            :            :      :
a           xa11, xa12   xa21, xa22   ...    xab1, xab2

Notation: xijk = observation k for the treatment at level i of factor A and level j of factor B; each cell is a treatment.
Factorial Design Example

                                   Factor 2 (Training Method)
Factor Levels                      Level 1           Level 2           Level 3
Factor 1       Level 1 (High)      15 hr., 11 hr.    10 hr., 12 hr.    22 hr., 17 hr.
(Motivation)   Level 2 (Low)       27 hr., 29 hr.    15 hr., 17 hr.    31 hr., 49 hr.

Each cell is a treatment (factor-level combination).
Advantages
of Factorial Designs

 Saves time and effort
 e.g., compared with using a separate completely randomized design for each variable
 Controls confounding effects by putting other variables into the model
 Can explore interaction between variables
Two-Way ANOVA

 Tests the equality of two or more population means when several independent variables are used
 Same results as separate one-way ANOVA on each variable
 No interaction can be tested
 Used to analyze factorial designs
Interaction

• Occurs when the effects of one factor vary according to the levels of the other factor
• When significant, interpretation of the main effects (A and B) is complicated
• Can be detected
– In a data table, the pattern of cell means in one row differs from another row
– In a graph of cell means, the lines cross (are not parallel)
Graphs of Interaction
Effects of motivation (high or low) and training method (A, B, C) on mean learning time.

[Figure: two panels of average response versus training method (A, B, C), with one line each for High and Low motivation. Left panel ("Interaction"): the lines cross. Right panel ("No Interaction"): the lines are parallel.]
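A quick numerical version of this check is to tabulate the cell means and look for non-parallel row patterns; a sketch in Python with pandas, using the motivation × training-method data from the factorial example:

```python
import pandas as pd

data = pd.DataFrame({
    "motivation": ["High"] * 6 + ["Low"] * 6,
    "method": ["Self-paced", "Self-paced", "Classroom", "Classroom",
               "Computer", "Computer"] * 2,
    "hours": [15, 11, 10, 12, 22, 17,    # High motivation
              27, 29, 15, 17, 31, 49],   # Low motivation
})

# Cell means: roughly parallel row patterns suggest little interaction.
print(data.pivot_table(values="hours", index="motivation",
                       columns="method", aggfunc="mean"))
```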
Two-Way ANOVA Total Variation Partitioning

SS(Total) = SS(A) + SS(B) + SS(AB) + SSE
Two-Way ANOVA Summary Table

Source of          Degrees of        Sum of      Mean
Variation          Freedom           Squares     Square     F
A (Row)            a – 1             SS(A)       MS(A)      MS(A)/MSE
B (Column)         b – 1             SS(B)       MS(B)      MS(B)/MSE
AB (Interaction)   (a – 1)(b – 1)    SS(AB)      MS(AB)     MS(AB)/MSE
Error              n – ab            SSE         MSE
Total              n – 1             SS(Total)   (same as other designs)
Factorial Design Example
Human Resources wants to determine if training time is different based on motivation level and training method. Conduct the appropriate ANOVA tests. Use α = .05 for each test.

                           Training Method
Factor Levels       Self-paced    Classroom    Computer
Motivation    High  15 hr.        10 hr.       22 hr.
                    11 hr.        12 hr.       17 hr.
              Low   27 hr.        15 hr.       31 hr.
                    29 hr.        17 hr.       49 hr.
Treatment Means F-Test Solution
 H0: The 6 treatment means are equal
 Ha: At least 2 differ
 α = .05
 ν1 = 5, ν2 = 6
 Critical value: F(.05; 5, 6) = 4.39 (reject H0 if F > 4.39)
Two-Way ANOVA Summary Table

Source of         Degrees of    Sum of     Mean
Variation         Freedom       Squares    Square     F
Model             5             1201.8     240.35     7.65
Error             6             188.5      31.42
Corrected Total   11            1390.3
Treatment Means F-Test Solution
 H0: The 6 treatment means are equal
 Ha: At least 2 differ
 α = .05
 ν1 = 5, ν2 = 6
 Critical value: F(.05; 5, 6) = 4.39

Test statistic: F = MST / MSE = 7.65

Decision: Reject H0 at α = .05.
Conclusion: There is evidence the population means are different.
Factor Interaction F-Test Solution
 H0: The factors do not interact
 Ha: The factors interact
 α = .05
 ν1 = 2, ν2 = 6
 Critical value: F(.05; 2, 6) = 5.14 (reject H0 if F > 5.14)
Two-Way ANOVA Summary Table

Source of          Degrees of    Sum of     Mean
Variation          Freedom       Squares    Square     F
A (Row)            1             546.75     546.75     17.40
B (Column)         2             531.5      265.75     8.46
AB (Interaction)   2             123.5      61.75      1.97
Error              6             188.5      31.42
Total              11            1390.3
Factor Interaction F-Test Solution
 H0: The factors do not interact
 Ha: The factors interact
 α = .05
 ν1 = 2, ν2 = 6
 Critical value: F(.05; 2, 6) = 5.14

Test statistic: F = MS(AB) / MSE = 1.97

Decision: Do not reject H0 at α = .05.
Conclusion: There is no evidence the factors interact.
Main Factor A F-Test Solution
 H0: No difference between motivation levels
 Ha: Motivation levels differ
 α = .05
 ν1 = 1, ν2 = 6
 Critical value: F(.05; 1, 6) = 5.99 (reject H0 if F > 5.99)
Two-Way ANOVA Summary Table

Source of          Degrees of    Sum of     Mean
Variation          Freedom       Squares    Square     F
A (Row)            1             546.75     546.75     17.40
B (Column)         2             531.5      265.75     8.46
AB (Interaction)   2             123.5      61.75      1.97
Error              6             188.5      31.42
Total              11            1390.3
Main Factor A F-Test Solution
 H0: No difference between motivation levels
 Ha: Motivation levels differ
 α = .05
 ν1 = 1, ν2 = 6
 Critical value: F(.05; 1, 6) = 5.99

Test statistic: F = MS(A) / MSE = 17.4

Decision: Reject H0 at α = .05.
Conclusion: There is evidence that motivation levels differ.
Main Factor B F-Test Solution
 H0: No difference between training methods
 Ha: Training methods differ
 α = .05
 ν1 = 2, ν2 = 6
 Critical value: F(.05; 2, 6) = 5.14 (reject H0 if F > 5.14)
Two-Way ANOVA Summary Table

Source of          Degrees of    Sum of     Mean
Variation          Freedom       Squares    Square     F
A (Row)            1             546.75     546.75     17.40
B (Column)         2             531.5      265.75     8.46
AB (Interaction)   2             123.5      61.75      1.97
Error              6             188.5      31.42
Total              11            1390.3
Main Factor B F-Test Solution
 H0: No difference between training methods
 Ha: Training methods differ
 α = .05
 ν1 = 2, ν2 = 6
 Critical value: F(.05; 2, 6) = 5.14

Test statistic: F = MS(B) / MSE = 8.46

Decision: Reject H0 at α = .05.
Conclusion: There is evidence training methods differ.
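All three of these F-tests can be reproduced at once with a two-way ANOVA in software; a sketch in Python, assuming pandas and statsmodels are installed, using the same motivation × training-method data:

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

data = pd.DataFrame({
    "motivation": ["High"] * 6 + ["Low"] * 6,
    "method": ["Self-paced", "Self-paced", "Classroom", "Classroom",
               "Computer", "Computer"] * 2,
    "hours": [15, 11, 10, 12, 22, 17,    # High motivation
              27, 29, 15, 17, 31, 49],   # Low motivation
})

# Two-way ANOVA with interaction: main effect A (motivation), main effect B
# (method), and the A x B interaction, each tested against MSE with 6 error df.
model = smf.ols("hours ~ C(motivation) * C(method)", data=data).fit()
print(anova_lm(model))
# Expected F-values: motivation about 17.40, method about 8.46, interaction about 1.97
```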
 End of chapter 6
