
Analysis of Variance

One-Way ANOVA; Post-Hoc Tests


INTRODUCTION TO EXPERIMENTAL DESIGNS

• Basic Concepts: experiments, experimental units, factors, treatments, experimental design, experimental errors
• Basic Principles: Randomization, Replication, Local (Error) Control
• Types of Experimental Designs: Completely Randomized Designs (one-way classification models), Randomized Complete Block Designs (RCBD), Latin squares, split-plot, strip-plot, etc.

• Experiment: a procedure or planned inquiry to discover new facts or knowledge, or to confirm or refute an existing theory or previous results.
• Experimental Design: the plan and actual procedure for carrying out an experiment.
Basic Steps in Designing Experiments

• State the objectives clearly.
  - Classify the objectives as major or minor (e.g. give higher precision to the major objectives).
• Designing the experiment requires defining:
  - Population: the population for which inference is to be made
  - Response (dependent) variables
  - Factor: an independent variable, such as a procedure or an amount of material, whose effect on the response variable we are interested in studying.
• Treatment: an intensity setting (level) of one factor, or a combination of intensity settings of two or more factors.
• Experimental Unit (e.u.): the unit of material to which one level of treatment is applied.
• Sampling Unit (s.u.): a fraction or part of the e.u. from which measurements are taken.
Example:

Consider an experiment to determine the effect of three types of fertilizer on yield.
• Response Variable: yield (number of tillers)
• Factor: Fertilizer, with three levels (types)
• Treatments: Level 1, Level 2, Level 3
• Experimental Unit: Plant
• Sampling Unit: Leaves
One-Way ANOVA

[Figure: pumpkins grown with Fertilizer A and pumpkins grown with Fertilizer B]

• Within each fertilizer group, the pumpkins differ little in size (variation within groups is small).
• Between the two groups, there is a large difference in pumpkin sizes (variation between groups is large).
• CONCLUSION: Fertilizer A is more effective in increasing pumpkin size (yield).
One-Way ANOVA

[Figure: pumpkins grown with Fertilizer A and pumpkins grown with Fertilizer B]

• Within each fertilizer group, pumpkin sizes vary widely (variation within groups is large).
• Between the two groups, the difference in pumpkin sizes is "small" relative to the variation within groups.
• CONCLUSION: There is insufficient evidence to say that Fertilizer A is more effective in increasing pumpkin size (yield).
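
The contrast between the two pumpkin pictures can be sketched numerically. The weights below are hypothetical (they are not taken from the slides), and `f_ratio` is an illustrative helper name. Both scenarios have the same group means, so the between-group variation is identical; only the within-group spread changes, and with it the ratio of the two mean squares.

```python
from statistics import mean

def f_ratio(groups):
    """Return MSR/MSE for a one-way layout (groups may differ in size)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    # Between-group (treatment) sum of squares
    ssr = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group (error) sum of squares
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssr / (k - 1)) / (sse / (n_total - k))

# Hypothetical pumpkin weights (kg): group means are 10 and 6 in both cases.
tight = [[9.8, 10.0, 10.2], [5.8, 6.0, 6.2]]    # small variation within
spread = [[5.0, 10.0, 15.0], [1.0, 6.0, 11.0]]  # large variation within

f_tight = f_ratio(tight)
f_spread = f_ratio(spread)
print(f"small within-group variation: F = {f_tight:.2f}")
print(f"large within-group variation: F = {f_spread:.2f}")
```

A large ratio corresponds to the first picture (clear evidence of a difference), and a ratio near or below 1 corresponds to the second.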
Objective

• We wish to test the equality of several population means:
  - Ho: μ1 = μ2 = ... = μk
  - H1: At least one pair of means is not equal.


Notations
Why analysis of variance?
• The total variation in the data is explained through two components:
  - Variation due to experimental error
  - Variation due to treatments
 Variance:

 Sum of Squares Identity:

SST = SSR + SSE


Computational Formulas
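
A sketch of the standard computational forms for a balanced one-way layout, consistent with the identity SST = SSR + SSE and with the degrees of freedom k - 1, k(n - 1) and nk - 1 used in the ANOVA table. The symbols are assumptions, not necessarily the slides' own notation: x_ij is the j-th observation under treatment i (i = 1, ..., k; j = 1, ..., n), T_i. is the total for treatment i, and T_.. is the grand total.

```latex
\begin{align*}
SST &= \sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij}^{2} - \frac{T_{..}^{2}}{nk} \\
SSR &= \frac{1}{n}\sum_{i=1}^{k} T_{i.}^{2} - \frac{T_{..}^{2}}{nk} \\
SSE &= SST - SSR
\end{align*}
```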
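The ANOVA Table

• We summarize our computations in the following table:

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square              F
Treatment             SSR              k - 1                MSR = SSR/(k - 1)        f = MSR/MSE
Error                 SSE              k(n - 1)             MSE = SSE/[k(n - 1)]
Total                 SST              nk - 1

• Reject Ho if f > f_α[k - 1, k(n - 1)], the upper-α critical value of the F distribution with k - 1 and k(n - 1) degrees of freedom.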
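
As a minimal sketch of how the table is filled in, the hypothetical helper `anova_table` below takes the two sums of squares of a balanced design and derives the remaining entries. The inputs are the values reported in the PhStat output for Example 1 (k = 5 brands, n = 5 subjects each), and the critical value is copied from that output rather than computed.

```python
def anova_table(ssr, sse, k, n):
    """Fill in a balanced one-way ANOVA table from its sums of squares."""
    df_treat, df_error = k - 1, k * (n - 1)
    msr, mse = ssr / df_treat, sse / df_error
    return {
        "Treatment": {"SS": ssr, "df": df_treat, "MS": msr, "F": msr / mse},
        "Error": {"SS": sse, "df": df_error, "MS": mse},
        "Total": {"SS": ssr + sse, "df": n * k - 1},
    }

# Sums of squares as reported for Example 1.
table = anova_table(ssr=79.44, sse=57.6, k=5, n=5)
f_crit = 2.866081  # F crit for alpha = 0.05 with 4 and 20 d.f., from the PhStat output
reject_h0 = table["Treatment"]["F"] > f_crit

for source, row in table.items():
    print(source, row)
print("Reject Ho:", reject_h0)
```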
Post-Hoc Tests

• We use these to determine which of the treatments are significantly different from one another.
• Steps for the Tukey-Kramer Procedure:
  - Arrange the treatments in increasing order of their means.
  - Compute the critical value Q = Q_α(k, N - k), the upper-α point of the studentized range distribution for k treatments and N - k error degrees of freedom.
  - Compute the differences among all pairs of means.


Example 1
• The following data represent the number of hours of pain relief provided by 5 different brands of headache tablets administered to 25 subjects. The 25 subjects were randomly divided into 5 groups and each group was treated with a different brand.

      Tablet
A   B   C   D   E
5   9   3   2   7
4   7   5   3   6
8   8   2   4   9
6   6   3   1   4
3   9   7   4   7

• Perform the analysis of variance, and test the hypothesis at the 0.05 level of significance that the mean number of hours of relief provided by the tablets is the same for all five brands.
PhStat Output

ANOVA

Source of Variation     SS       df   MS      F          P-value   F crit
Between Groups (SSR)    79.44     4   19.86   6.895833   0.00117   2.866081
Within Groups (SSE)     57.6     20   2.88
Total (SST)             137.04   24
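
The sums of squares in this output can be checked directly from the Example 1 data. The sketch below uses the definitional (deviation) form of SSR and SSE; `one_way_anova` is an illustrative helper name, not a PhStat or library function.

```python
# Hours of pain relief by tablet brand, copied from Example 1.
data = {
    "A": [5, 4, 8, 6, 3],
    "B": [9, 7, 8, 6, 9],
    "C": [3, 5, 2, 3, 7],
    "D": [2, 3, 4, 1, 4],
    "E": [7, 6, 9, 4, 7],
}

def one_way_anova(groups):
    """Return (SSR, SSE, SST, F) for a list of samples, one per treatment."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between groups: squared distance of each group mean from the grand mean
    ssr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: squared distance of each observation from its group mean
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    f = (ssr / (k - 1)) / (sse / (n_total - k))
    return ssr, sse, ssr + sse, f

ssr, sse, sst, f = one_way_anova(list(data.values()))
print(f"SSR={ssr:.2f}  SSE={sse:.2f}  SST={sst:.2f}  F={f:.4f}")
```

The computed values agree with the Between Groups, Within Groups and Total rows above, and since F exceeds F crit, Ho is rejected at the 0.05 level.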


PhStat Output

Tukey-Kramer Multiple Comparisons

Group   Sample Mean   Sample Size
1: A    5.2           5
2: B    7.8           5
3: C    4             5
4: D    2.8           5
5: E    6.6           5

Other Data
Level of significance   0.05
Numerator d.f.          5
Denominator d.f.        20
MSW                     2.88
Q Statistic             4.232

Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   2.6                   0.758946638                3.212            Means are not different
Group 1 to Group 3   1.2                   0.758946638                3.212            Means are not different
Group 1 to Group 4   2.4                   0.758946638                3.212            Means are not different
Group 1 to Group 5   1.4                   0.758946638                3.212            Means are not different
Group 2 to Group 3   3.8                   0.758946638                3.212            Means are different
Group 2 to Group 4   5                     0.758946638                3.212            Means are different
Group 2 to Group 5   1.2                   0.758946638                3.212            Means are not different
Group 3 to Group 4   1.2                   0.758946638                3.212            Means are not different
Group 3 to Group 5   2.6                   0.758946638                3.212            Means are not different
Group 4 to Group 5   3.8                   0.758946638                3.212            Means are different
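
The Tukey-Kramer comparisons can be reproduced from the group means, MSW and the Q statistic in this output. The sketch below assumes the usual critical-range form Q * sqrt((MSW/2)(1/n_i + 1/n_j)); the Q value is copied from the output rather than looked up, and `critical_range` is an illustrative helper name.

```python
from itertools import combinations
from math import sqrt

# Values copied from the PhStat output for Example 1.
means = {"A": 5.2, "B": 7.8, "C": 4.0, "D": 2.8, "E": 6.6}
sizes = {"A": 5, "B": 5, "C": 5, "D": 5, "E": 5}
msw = 2.88   # mean square within (MSE), 20 d.f.
q = 4.232    # tabulated studentized-range value for alpha = 0.05, 5 groups, 20 d.f.

def critical_range(q, msw, n_i, n_j):
    """Tukey-Kramer critical range for comparing two group means."""
    return q * sqrt((msw / 2) * (1 / n_i + 1 / n_j))

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    cr = critical_range(q, msw, sizes[a], sizes[b])
    results[(a, b)] = diff > cr  # True when the pair of means differs
    print(f"{a} vs {b}: |diff| = {diff:.1f}, critical range = {cr:.3f}, different = {diff > cr}")
```

Because the group sizes enter the formula pair by pair, the same function also covers designs with unequal sample sizes.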
Example 2
• Six different machines are being considered for use in manufacturing rubber seals. The machines are being compared with respect to tensile strength of the product. A random sample of 4 seals from each machine is used to determine whether the mean tensile strength varies from machine to machine. The following are the tensile strength measurements in kilograms per square centimeter.

                 Machine
1      2      3      4      5      6
17.5   16.4   20.3   14.6   17.5   18.3
16.9   19.2   15.7   16.7   19.2   16.2
15.8   17.7   17.8   20.8   16.5   17.5
18.6   15.4   18.9   18.9   20.5   20.1

• Perform the analysis of variance at the 0.05 level of significance and indicate whether or not the mean tensile strengths differ significantly for the 6 machines.
PhStat Output

Source of Variation   SS        df   MS       F        P-value   F crit
Between Groups        5.3383     5   1.0677   0.3068   0.9024    2.7729
Within Groups         62.6400   18   3.4800
Total                 67.9783   23
Unequal Sample Sizes

• Degrees of freedom: SST → N - 1, SSR → k - 1, SSE → N - k, where N is the total number of observations and k is the number of treatments.
Example 3
• It is suspected that higher-priced automobiles are assembled with greater care than lower-priced automobiles. To investigate this, a large luxury model A, a medium-size sedan B, and a subcompact hatchback C were compared for defects when they arrived at the dealer's showroom. All cars were manufactured by the same company. The numbers of defects for several cars of each of the three models were recorded and are shown below:

  Model
A   B   C
4   5   8
7   1   6
6   3   8
6   5   9
    3   5
    4

• Test the hypothesis at the 0.05 level of significance that the average number of defects is the same for the three models.
PhStat Output

ANOVA
Source of Variation   SS        df   MS        F        P-value   F crit
Between Groups        38.2833    2   19.1417   8.4917   0.0050    3.8853
Within Groups         27.0500   12   2.2542
Total                 65.3333   14
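
With unequal sample sizes, each treatment total is divided by its own n_i. The sketch below applies that totals-based form to Example 3. One assumption is labeled here and in the code: the two short rows of the data table are read as belonging to models B and C (3 and 5) and to model B alone (4), giving sample sizes 4, 6 and 5. This reading matches N = 15 and reproduces the sums of squares reported above.

```python
# Defect counts per car from Example 3.
# Assumption: the short rows "3 5" and "4" belong to B, C and B respectively.
defects = {
    "A": [4, 7, 6, 6],
    "B": [5, 1, 3, 5, 3, 4],
    "C": [8, 6, 8, 9, 5],
}

N = sum(len(v) for v in defects.values())   # total number of observations
k = len(defects)                            # number of treatments
grand_total = sum(sum(v) for v in defects.values())
correction = grand_total ** 2 / N           # (grand total)^2 / N

sst = sum(x * x for v in defects.values() for x in v) - correction
ssr = sum(sum(v) ** 2 / len(v) for v in defects.values()) - correction
sse = sst - ssr

df_ssr, df_sse, df_sst = k - 1, N - k, N - 1
f = (ssr / df_ssr) / (sse / df_sse)
print(f"SSR={ssr:.4f} (df {df_ssr})  SSE={sse:.4f} (df {df_sse})  SST={sst:.4f} (df {df_sst})  F={f:.4f}")
```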
Two-Way ANOVA (w/o replication)

• We wish to test the following hypotheses:
  - Ho: The row means are all equal
  - H1: The row means are not all equal
  - Ho: The column means are all equal
  - H1: The column means are not all equal

ANOVA : PLBautista
Computational Formulas
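
A sketch of the standard computational forms for a two-way layout with one observation per cell. The notation is an assumption, not necessarily the slides' own: x_ij is the observation in row i and column j of an r-by-c table, T_i. and T_.j are the row and column totals, and T_.. is the grand total. The associated degrees of freedom are r - 1 for rows, c - 1 for columns, (r - 1)(c - 1) for error and rc - 1 in total.

```latex
\begin{align*}
SS_{\text{Total}} &= \sum_{i=1}^{r}\sum_{j=1}^{c} x_{ij}^{2} - \frac{T_{..}^{2}}{rc} \\
SS_{\text{Rows}} &= \frac{1}{c}\sum_{i=1}^{r} T_{i.}^{2} - \frac{T_{..}^{2}}{rc} \\
SS_{\text{Columns}} &= \frac{1}{r}\sum_{j=1}^{c} T_{.j}^{2} - \frac{T_{..}^{2}}{rc} \\
SS_{\text{Error}} &= SS_{\text{Total}} - SS_{\text{Rows}} - SS_{\text{Columns}}
\end{align*}
```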

Example 4

• The yields of three types of wheat using four different kinds of fertilizer were recorded and are shown on the next page:

• Test the hypothesis at the 0.05 level of significance that there is no difference in the average yield of wheat when different kinds of fertilizer are used. Also, test the hypothesis that there is no difference in the average yield of the three varieties of wheat.
Example 4

ANOVA

Source of Variation   SS    df   MS    F          P-value    F crit
Rows                  498    3   166   9.222222   0.011523   4.757063
Columns               56     2   28    1.555556   0.285588   5.143253
Error                 108    6   18
Total                 662   11
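
A minimal sketch of how a table like this is computed for a two-way layout without replication. The yields below are hypothetical, since the Example 4 data are not reproduced here. The layout (4 fertilizers as rows, 3 wheat varieties as columns) is an assumption chosen to match the 3 and 2 degrees of freedom reported for Rows and Columns. `two_way_anova` is an illustrative helper name.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication for a list of equal-length rows.

    Returns {source: (SS, df, F)}; F is None for the error row.
    """
    r, c = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    correction = sum(row_totals) ** 2 / (r * c)

    sst = sum(x * x for row in table for x in row) - correction
    ss_rows = sum(t * t for t in row_totals) / c - correction
    ss_cols = sum(t * t for t in col_totals) / r - correction
    sse = sst - ss_rows - ss_cols

    df_error = (r - 1) * (c - 1)
    mse = sse / df_error
    return {
        "Rows": (ss_rows, r - 1, (ss_rows / (r - 1)) / mse),
        "Columns": (ss_cols, c - 1, (ss_cols / (c - 1)) / mse),
        "Error": (sse, df_error, None),
    }

# Hypothetical yields: 4 fertilizers (rows) x 3 wheat varieties (columns).
yields = [
    [10, 12, 14],
    [12, 15, 15],
    [16, 17, 21],
    [14, 16, 18],
]
result = two_way_anova(yields)
for source, (ss, df, f) in result.items():
    print(source, ss, df, f)
```

Each F ratio is then compared with its own critical value, as in the table above: there the Rows F of 9.22 exceeds 4.757063 while the Columns F of 1.56 does not exceed 5.143253.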
