# Analysis of Variance

1

Chapter Goals
After completing this chapter, you should be able to: 
     Recognize situations in which to use analysis of variance Understand different analysis of variance designs Perform a single-factor hypothesis test and interpret results Conduct and interpret post-analysis of variance pairwise comparisons procedures Set up and perform randomized blocks analysis Analyze two-factor analysis of variance test with replications results

2

Chapter Overview
Analysis of Variance (ANOVA) One-Way ANOVA F-test F-test TukeyKramer test Fisher¶s Least Significant Difference test
3

Randomized Complete Block ANOVA

Two-factor ANOVA with replication

General ANOVA Setting 
Investigator controls one or more independent variables 
Called factors (or treatment variables)  Each factor contains two or more levels (or categories/classifications) 

Observe effects on dependent variable 
Response to levels of independent variable 

Experimental design: the plan used to test hypothesis
4

One-Way Analysis of Variance 
Evaluate the difference among the means of three or more populations
Examples: Accident rates for 1st, 2nd, and 3rd shift Expected mileage for five brands of tires 

Assumptions  Populations are normally distributed  Populations have equal variances  Samples are randomly and independently drawn
5

Completely Randomized Design 
Experimental units (subjects) are assigned randomly to treatments  Only one factor or independent variable 
With two or more treatment levels 

Analyzed by 
One-factor analysis of variance (one-way ANOVA) 

Called a Balanced Design if all factor levels have equal sample size
6

Hypotheses of One-Way ANOVA  

All population means are equal

H0 :

1

!

2

!

3

!. !

k 

i.e., no treatment effect (no variation in means among groups)  

At least one population mean is different  i.e., there is a treatment effect

HA : Not all of the population means are the same 

Does not mean that all population means are different (some pairs may be the same)
7

One-Factor ANOVA
H0 :
1

!

2

!
i

3

!. !

k

HA : Not all

are the same
All Means are the same: The Null Hypothesis is True (No Treatment Effect)

1

!

2

!

3
8

One-Factor ANOVA
(continued)

H0 :

1

!

2

!
i

3

!. !

k

HA : Not all

are the same

At least one mean is different: The Null Hypothesis is NOT true (Treatment Effect is present) or

1

!

2

{

3

1

{

2

{

3

9

Partitioning the Variation 
Total variation can be split into two parts: SST = SSB + SSW
SST = Total Sum of Squares SSB = Sum of Squares Between SSW = Sum of Squares Within

10

Partitioning the Variation
SST = SSB + SSW

(continued)

Total Variation = the aggregate dispersion of the individual data values across the various factor levels (SST) Between-Sample Variation = dispersion among the factor sample means (SSB) Within-Sample Variation = dispersion that exists among the data values within a particular factor level (SSW)
11

Partition of Total Variation 
   Sum of Squares Within Sum of Squares Error Sum of Squares Unexplained Within Groups Variation

Total Variation (SST)

Commonly referred to as:

= 
  

Variation Due to Factor (SSB)

+

Variation Due to Random Sampling (SSW)

Commonly referred to as: Sum of Squares Between Sum of Squares Among Sum of Squares Explained Among Groups Variation
12

Total Sum of Squares
SST = SSB + SSW
k

SST ! §§ ( x ij  x )
Where:
i!1 j!1

ni

2

SST = Total sum of squares k = number of populations (levels or treatments) ni = sample size from population i xij = jth measurement from population i x = grand mean (mean of all data values)
13

Total Variation
2 2

(continued)

SST ! ( x11  x )  ( x12  x )  ...  ( x knk  x )
Response, X

2

X
Group 1 Group 2 Group 3
14

Sum of Squares Between
SST = SSB + SSW
k

SSB ! § ni ( x i  x )
Where:
i!1

2

SSB = Sum of squares between k = number of populations ni = sample size from population i xi = sample mean from population i x = grand mean (mean of all data values)
15

Between-Group Variation
k

SSB ! § ni ( x i  x )
i!1

2

Variation Due to Differences Among Groups

SSB MSB ! k 1
Mean Square Between = SSB/degrees of freedom

Qi

Qj
16

Between-Group Variation (continued)
SSB ! n1 ( x1  x )  n 2 ( x 2  x )  ...  nk ( x k  x )
Response, X
2 2 2

X3
X1
Group 1 Group 2

X2
Group 3

X

17

Sum of Squares Within
SST = SSB + SSW
k nj

SSW ! §
i !1

§
j!1

( x ij  x i )

2

Where:

SSW = Sum of squares within k = number of populations ni = sample size from population i xi = sample mean from population i xij = jth measurement from population i
18

Within-Group Variation
k nj

SSW ! §
i !1

§
j!1

( x ij  x i )2

Summing the variation within each group and then adding over all groups

SSW MSW ! Nk
Mean Square Within = SSW/degrees of freedom

Qi
19

Within-Group Variation
2 2

(continued)

SSW ! ( x11  x1 )  ( x12  x 2 )  ...  ( x knk  x k )
Response, X

2

X3

X1
Group 1 Group 2

X2
Group 3

20

One-Way ANOVA Table
Source of Variation Between Samples Within Samples Total SS SSB SSW SST = SSB+SSW df k-1 N-k N-1 MS MSB = SSB F ratio MSB F= MSW

k-1 SSW MSW = N-k

k = number of populations N = sum of the sample sizes from all populations df = degrees of freedom

21

One-Factor ANOVA F Test Statistic 
Test statistic H 0:
1= 2

= «=

k

HA: At least two population means are different
MSB is mean squares between variances MSW is mean squares within variances 

Degrees of freedom 
 df1 = k ± 1 df2 = N ± k

MSB F! MSW

(k = number of populations) (N = sum of sample sizes from all populations)

22

Interpreting One-Factor ANOVA F Statistic 
The F statistic is the ratio of the between estimate of variance and the within estimate of variance 
  The ratio must always be positive df1 = k -1 will typically be small df2 = N - k will typically be large

The ratio should be close to 1 if H0: 1= 2 = « = k is true The ratio will be larger than 1 if H0: 1= 2 = « = k is false

23

One-Factor ANOVA F Test Example
You want to see if three different golf clubs yield different Club 1 Club 2 Club 3 distances. You randomly select five measurements from trials on 234 200 an automated driving machine for each254 At the .05 club. 218 significance level, is there a difference263 in mean distance? 222 241 235 197 237 227 206 251 216 204

24

One-Factor ANOVA Example: Scatter Diagram
Club 1 254 263 241 237 251 Club 2 234 218 235 227 216 Club 3 200 222 197 206 204
Distance 270 260 250 240 230 220 210

    

X1
    

X2

    

X
X3
25

x1 ! 249.2 x 2 ! 226.0 x 3 ! 205.8 x ! 227.0

200 190 1 2 Club

3

One-Factor ANOVA Example Computations
Club 1 254 263 241 237 251 Club 2 234 218 235 227 216 Club 3 200 222 197 206 204
x1 = 249.2 x2 = 226.0 x3 = 205.8 x = 227.0 n1 = 5 n2 = 5 n3 = 5 N = 15 k=3

SSB = 5 [ (249.2 ± 227)2 + (226 ± 227)2 + (205.8 ± 227)2 ] = 4716.4 SSW = (254 ± 249.2)2 + (263 ± 249.2)2 +«+ (204 ± 205.8)2 = 1119.6 MSB = 4716.4 / (3-1) = 2358.2 MSW = 1119.6 / (15-3) = 93.3

2358.2 F! ! 25.275 93.3
26

One-Factor ANOVA Example Solution
H0 : 1 = 2 = 3 HA: i not all equal E = .05 df1= 2 df2 = 12
Critical Value: FE = 3.885 E = .05

Test Statistic:
MSB 2358.2 F! ! ! 25.275 MSW 93.3

Decision: Reject H0 at E = 0.05 Conclusion: There is evidence that at least one i differs F = 25.275 from the rest 27

0

Do not reject H0

Reject H0

F.05 = 3.885

ANOVA -- Single Factor: Excel Output
EXCEL: tools | data analysis | ANOVA: single factor
SUMMARY Groups Club 1 Club 2 Club 3 ANOVA Source of Variation Between Groups Within Groups Total SS 4716.4 1119.6 5836.0 df 2 12 14 MS 2358.2 93.3
28

Count 5 5 5

Sum 1246 1130 1029

Average 249.2 226 205.8

Variance 108.2 77.5 94.2

F 25.275

P-value 4.99E-05

F crit 3.885

The Tukey-Kramer Procedure 
Tells which population means are significantly different 
e.g.: 1 = 2 { 3  Done after rejection of equal means in ANOVA 

Allows pair-wise comparisons 
Compare absolute mean differences with critical range

1=

2

3

x
29

Tukey-Kramer Critical Range
MSW ¨ 1 1 ¸ ©  ¹ 2 © ni n j ¹ ª º

Critical Range ! qE
where:

qE = Value from standardized range table with k and N - k degrees of freedom for the desired level of E MSW = Mean Square Within ni and nj = Sample sizes from populations (levels) i and j
30

The Tukey-Kramer Procedure: Example
1. Compute absolute mean differences: Club 1 Club 2 Club 3 254 234 200 x1  x 2 ! 249.2  226.0 ! 23.2 263 218 222 241 235 197 x1  x 3 ! 249.2  205.8 ! 43.4 237 227 206 x 2  x 3 ! 226.0  205.8 ! 20.2 251 216 204 2. Find the q value from the table in appendix J with k and N - k degrees of freedom for the desired level of E

q ! 3.77
31

The Tukey-Kramer Procedure: Example
3. Compute Critical Range:
Critical Range ! q MSW ¨ 1 1 ¸ ©  ¹ ! 3.77 93.3 ¨ 1  1 ¸ ! 16.285 © ¹ ©n n ¹ 2 ª i 2 ª5 5º j º

4. Compare: 5. All of the absolute mean differences are greater than critical range. Therefore there is a significant difference between each pair of means at 5% level of significance.
x1  x 2 ! 23.2 x1  x 3 ! 43.4 x 2  x 3 ! 20.2

32

Tukey-Kramer in PHStat

33

Randomized Complete Block ANOVA 
Like One-Way ANOVA, we test for equal population means (for different factor levels, for example)... ...but we want to control for possible variation from a second factor (with two or more levels) Used when more than one factor may influence the value of the dependent variable, but only one is of key interest Levels of the secondary factor are called blocks   

34

Partitioning the Variation 
Total variation can now be split into three parts: SST = SSB + SSBL + SSW
SST = Total sum of squares SSB = Sum of squares between factor levels SSBL = Sum of squares between blocks SSW = Sum of squares within levels

35

Sum of Squares for Blocking
SST = SSB + SSBL + SSW
b

SSBL ! § k( x j  x )
j!1

2

Where:

k = number of levels for this factor b = number of blocks xj = sample mean from the jth block x = grand mean (mean of all data values)
36

Partitioning the Variation 
Total variation can now be split into three parts: SST = SSB + SSBL + SSW
SST and SSB are computed as they were in One-Way ANOVA SSW = SST ± (SSB + SSBL)

37

Mean Squares
SSBL MSBL ! Mean square blocking ! b 1 SSB k 1

MSB ! Mean square between !

SSW MSW ! Mean square within ! (k  1)(b  1)
38

Randomized Block ANOVA Table
Source of Variation Between Blocks Between Samples Within Samples Total SS SSBL SSB SSW SST df b-1 k-1 (k±1)(b-1) N-1
N = sum of the sample sizes from all populations 39 df = degrees of freedom

MS MSBL MSB MSW

F ratio MSBL MSW MSB MSW

k = number of populations b = number of blocks

Blocking Test 
Blocking test: df H0 : 1 = b !1 b1
b2 df2 = (k ± 1)(b ± 1) b3

!

! ...

HA : Not all block means are equal
MSBL F= MSW

Reject H0 if F > FE

40

Main Factor Test 
Main Factor test:

H0 :

df1 = k - 1 ! 1 ! (k2± ! 3 1) ... df2 = 1)(b ±

!

k

HA : Not all population means are equal
MSB MSW

F=

Reject H0 if F > FE

41

Fisher¶s Least Significant Difference Test 
To test which population means are significantly different 
e.g.: 1 = 2  3  Done after rejection of equal means in randomized block ANOVA design 

Allows pair-wise comparisons 
Compare absolute mean differences with critical range
Q1= Q2 Q3
x
42

Fisher¶s Least Significant Difference (LSD) Test
2 MSW b

LSD ! t E/2
where:

tE/2 = Upper-tailed value from Student¶s t-distribution for E/2 and (k -1)(n - 1) degrees of freedom MSW = Mean square within from ANOVA table b = number of blocks k = number of levels of the main factor

43

Fisher¶s Least Significant Difference (LSD) Test (continued)
LSD ! t E/2 2 MSW b
Compare:

Is x i  x j " LSD ?
If the absolute mean difference is greater than LSD then there is a significant difference between that pair of means at the chosen level of significance.

x1  x 2 x1  x 3 x2  x3 etc...
44

Two-Way ANOVA 
Examines the effect of 
Two or more factors of interest on the dependent variable 
e.g.: Percent carbonation and line speed on soft drink bottling process 

Interaction between the different levels of these two factors 
e.g.: Does the effect of one particular percentage of carbonation depend on which level the line speed is set?
45

Two-Way ANOVA 
Assumptions 
Populations are normally distributed  Populations have equal variances  Independent random samples are drawn

(continued)

46

Two-Way ANOVA Sources of Variation
Two Factors of interest: A and B a = number of levels of factor A b = number of levels of factor B N = total number of observations in all cells

47

Two-Way ANOVA Sources of Variation
SST = SSA + SSB + SSAB + SSE
SSA
Variation due to factor A

(continued) Degrees of Freedom: a±1

SST Total Variation

SSB
Variation due to factor B

b±1

SSAB
N-1 Variation due to interaction between A and B

(a ± 1)(b ± 1)

SSE
Inherent variation (Error)

N ± ab
48

Two Factor ANOVA Equations
Total Sum of Squares:
a b nd

SST ! §§§ ( x ijk  x )2
i!1 j !1 k !1 a i !1

Sum of Squares Factor A:

d ( x i  x )2 SS A ! bn §
Sum of Squares Factor B:
b j !1
49

d ( x j  x )2 SSB ! an §

Two Factor ANOVA Equations (continued)
Sum of Squares Interaction Between A and B: SS Sum of Squares Error:

a AB

b

2 d ! n §§ ( x ij  x i  x j  x ) i!1 j !1

a

b

nd 2

SSE ! §§§ ( x ijk  x ij )
i!1 j!1 k !1

50

Two Factor ANOVA Equations (continued)
a b nd ijk

where:

§§§ x
x!
i!1 j!1 k !1
nd ijk

b

abnd

! Grand Mean

§§ x
xi !
j !1 k !1

bnd
xj !
nd

! Mean of each level of factor A
a nd ijk

§§ x
i!1 k !1

and

! Mean of each level of factor B
a = number of levels of factor A b = number of levels of factor B n¶ = number of replications in each cell
51

x ijk ! Mean of each cell x ij ! § k !1 nd

Mean Square Calculations
SS A MS A ! Mean square factor A ! a 1 SSB MSB ! Mean square factor B ! b 1 MS AB SS AB ! Mean square interaction ! (a  1)(b  1) SSE MSE ! Mean square error ! N  ab
52

Two-Way ANOVA: The F Test Statistic
H0:
A1 = A2

=

A3

=

F Test for Factor A Main Effect

HA: Not all

Ai are equal

MS A F! MSE

Reject H0 if F > FE

H0:

B1 =

B2

=

B3

=

F Test for Factor B Main Effect

HA: Not all

Bi are equal

MSB F! MSE

Reject H0 if F > FE

H0: factors A and B do not interact to affect the mean response HA: factors A and B do interact

F Test for Interaction Effect

MS AB F! MSE

Reject H0 if F > FE
53

Two-Way ANOVA Summary Table
Source of Variation Factor A Factor B AB (Interaction) Error Total Sum of Squares SSA SSB Degrees of Freedom a±1 b±1 Mean Squares MSA
= SSA /(a ± 1)

F Statistic MSA MSE MSB MSE MSAB MSE

MSB
= SSB /(b ± 1)

SSAB SSE SST

(a ± 1)(b ± 1)

MSAB
= SSAB / [(a ± 1)(b ± 1)]

N ± ab N±1

MSE =
SSE/(N ± ab)
54

Features of Two-Way ANOVA F Test 
Degrees of freedom always add up 


N-1 = (N-ab) + (a-1) + (b-1) + (a-1)(b-1) Total = error + factor A + factor B + interaction 

The denominator of the F Test is always the same but the numerator is different  The sums of squares always add up 
SST = SSE + SSA + SSB + SSAB 

Total = error + factor A + factor B + interaction

55

Examples: Interaction vs. No Interaction 
No interaction: Interaction is present:
Mean Response

Mean Response

Factor B Level 1 Factor B Level 1 Factor B Level 2 Factor B Level 3 Factor B Level 3 Factor B Level 2

1

Factor A Levels

2

1

Factor A Levels

2
56