ANALYSIS OF VARIANCE
Jennifer Kensler
ONE SAMPLE T-TEST
Used to test whether the population mean is
different from a specified value.
Example: Is the mean amount of soda in a 20 oz.
bottle different from 20 oz?
STEP 1: FORMULATE THE HYPOTHESES
The population mean is not equal to a specified value.
$H_0: \mu = \mu_0$
$H_a: \mu \neq \mu_0$
The population mean is greater than a specified value.
$H_0: \mu = \mu_0$
$H_a: \mu > \mu_0$
The population mean is less than a specified value.
$H_0: \mu = \mu_0$
$H_a: \mu < \mu_0$
STEP 2: CHECK THE ASSUMPTIONS
The sample is random.
Either the population from which the sample is drawn
is normal, or the sample size is large.
STEPS 3-5
Step 3: Calculate the test statistic:
$$t = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$$
where
$$s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}$$
Step 4: Calculate the p-value based on the
appropriate alternative hypothesis.
Step 5: Write a conclusion.
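The Step 3 calculation for the one sample t-test can be sketched in Python as follows. This is a minimal illustration, not part of the JMP workflow: the function name `one_sample_t` and the fill volumes are assumptions invented for the example, and the p-value in Step 4 would still come from a t distribution with n - 1 degrees of freedom (for example, from JMP).

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    y_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divisor n - 1)
    t = (y_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical fill volumes in ounces, invented for illustration only.
volumes = [20.1, 19.8, 20.3, 20.0, 20.3]
t_stat, df = one_sample_t(volumes, mu0=20.0)
print(round(t_stat, 3), df)  # t is about 1.054 with 4 degrees of freedom
```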
IRIS EXAMPLE
A researcher would like to know whether the mean
sepal width of a variety of irises is different from 3.5
cm.
The researcher measures the sepal width of 50
randomly selected irises.
Step 1: Hypotheses
$H_0: \mu = 3.5$ cm
$H_a: \mu \neq 3.5$ cm
JMP
Steps 2-4:
JMP Demonstration
Analyze > Distribution
Y, Columns: Sepal Width
Test Mean
Specify Hypothesized Mean: 3.5
JMP OUTPUT
Step 5 Conclusion: The mean sepal width is not
significantly different from 3.5 cm.
TWO SAMPLE T-TEST
Two-sample t-tests are used to determine whether
the mean of one group is equal to, larger than, or
smaller than the mean of another group.
Example: Is the mean cholesterol of people taking
drug A lower than the mean cholesterol of people
taking drug B?
STEP 1: FORMULATE THE HYPOTHESES
The population means of the two groups are not
equal.
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \neq \mu_2$
The population mean of group 1 is greater than the
population mean of group 2.
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 > \mu_2$
The population mean of group 1 is less than the
population mean of group 2.
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 < \mu_2$
STEP 2: CHECK THE ASSUMPTIONS
The two samples are random and independent.
Either the populations from which the samples are
drawn are normal, or the sample sizes are large.
The populations have the same standard deviation.
STEPS 3-5
Step 3: Calculate the test statistic
$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where
$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$$
Step 4: Calculate the appropriate p-value.
Step 5: Write a conclusion.
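The pooled two-sample statistic from Step 3 can be sketched in Python as below. The function name `pooled_two_sample_t` and the cholesterol readings are hypothetical, chosen only so the arithmetic is easy to follow; the sketch assumes equal population standard deviations, as stated in Step 2.

```python
import math
import statistics


def pooled_two_sample_t(sample1, sample2):
    """Return (t, degrees of freedom) for the pooled two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = statistics.variance(sample1)  # sample variance of group 1
    s2_sq = statistics.variance(sample2)  # sample variance of group 2
    # Pooled standard deviation s_p.
    s_p = math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    t = diff / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical cholesterol readings, invented for illustration only.
drug_a = [181, 182, 183, 184, 185]
drug_b = [183, 184, 185, 186, 187]
t_stat, df = pooled_two_sample_t(drug_a, drug_b)
print(round(t_stat, 3), df)  # t is about -2.0 with 8 degrees of freedom
```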
TWO SAMPLE EXAMPLE
A researcher would like to know whether the mean
sepal width of setosa irises is different from the
mean sepal width of versicolor irises.
Step 1 Hypotheses:
$H_0: \mu_{\text{setosa}} = \mu_{\text{versicolor}}$
$H_a: \mu_{\text{setosa}} \neq \mu_{\text{versicolor}}$
JMP
Steps 2-4:
JMP Demonstration:
Analyze > Fit Y By X
Y, Response: Sepal Width
X, Factor: Species
JMP OUTPUT
Step 5 Conclusion: There is strong evidence
(p-value < 0.0001) that the mean sepal widths for the
two varieties are different.
PAIRED T-TEST
The paired t-test is used to compare the means of
two dependent samples.
Example:
A researcher would like to determine if background
noise causes people to take longer to complete
math problems. The researcher gives 20 subjects
two math tests, one in complete silence and one
with background noise, and records the time each
subject takes to complete each test.
STEP 1: FORMULATE THE HYPOTHESES
The population mean difference is not equal to zero.
$H_0: \mu_{\text{difference}} = 0$
$H_a: \mu_{\text{difference}} \neq 0$
The population mean difference is greater than zero.
$H_0: \mu_{\text{difference}} = 0$
$H_a: \mu_{\text{difference}} > 0$
The population mean difference is less than zero.
$H_0: \mu_{\text{difference}} = 0$
$H_a: \mu_{\text{difference}} < 0$
STEP 2: CHECK THE ASSUMPTIONS
The sample is random.
The data are matched pairs.
The differences have a normal distribution or the
sample size is large.
STEPS 3-5
Step 3: Calculate the test statistic:
$$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$$
where $\bar{d}$ is the mean of the differences and $s_d$ is
the standard deviation of the differences.
Step 4: Calculate the p-value.
Step 5: Write a conclusion.
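The paired statistic is a one-sample t statistic computed on the within-pair differences. A minimal Python sketch follows; the function name `paired_t` and the completion times are hypothetical values invented for illustration, not data from the study described above.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, degrees of freedom) for a paired t-test on second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)  # mean of the differences
    s_d = statistics.stdev(diffs)   # standard deviation of the differences
    t = (d_bar - 0) / (s_d / math.sqrt(n))
    return t, n - 1


# Hypothetical completion times in minutes, invented for illustration only.
silence = [10, 12, 9, 11, 13]
noise = [11, 14, 10, 13, 13]
t_stat, df = paired_t(silence, noise)
print(round(t_stat, 3), df)  # t is about 3.207 with 4 degrees of freedom
```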
PAIRED T-TEST EXAMPLE
A researcher would like to determine whether a
fitness program increases flexibility. The researcher
measures the flexibility (in inches) of 12 randomly
selected participants before and after the fitness
program.
Step 1: Formulate a Hypothesis
$H_0: \mu_{\text{After} - \text{Before}} = 0$
$H_a: \mu_{\text{After} - \text{Before}} > 0$
PAIRED T-TEST EXAMPLE
Steps 2-4:
JMP Analysis:
Create a new column of After – Before
Analyze > Distribution
Y, Columns: After – Before
Test Mean
Specify Hypothesized Mean: 0
JMP OUTPUT
Step 5 Conclusion: There is no evidence that the
fitness program increases flexibility.
ONE-WAY ANALYSIS OF VARIANCE
ONE-WAY ANOVA
ANOVA is used to determine whether three or more
populations have different distributions.
[Figure: responses by medical treatment (A, B, C)]
ANOVA STRATEGY
The first step is to use the ANOVA F test to
determine if there are any significant differences
among means.
If the ANOVA F test shows that the means are not
all the same, then follow up tests can be performed to
see which pairs of means differ.
ONE-WAY ANOVA MODEL
$$y_{ij} = \mu_i + \varepsilon_{ij}$$
where
$y_{ij}$ is the response of the jth trial on the ith factor level
$\mu_i$ is the mean of the ith group
$\varepsilon_{ij} \sim N(0, \sigma^2)$
$i = 1, \ldots, r$
$j = 1, \ldots, n_i$
In other words, for each group the observed value
is the group mean plus some random variation.
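One way to see what the model says is to simulate from it. The sketch below draws each observation as its group mean plus normal noise with a common standard deviation; the function name `simulate_one_way`, the group means, and the seed are all assumptions chosen for illustration.

```python
import random


def simulate_one_way(group_means, n_per_group, sigma, seed=0):
    """Simulate y_ij = mu_i + eps_ij with eps_ij ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        [mu_i + rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        for mu_i in group_means
    ]


# Three hypothetical groups with different means and a common sigma.
data = simulate_one_way(group_means=[5.0, 6.0, 7.0], n_per_group=20, sigma=1.0)
for label, group in zip("ABC", data):
    print(label, round(sum(group) / len(group), 2))  # sample mean of each group
```

Each printed sample mean should fall near its group mean, with the scatter governed by sigma and the group size.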
ONE-WAY ANOVA HYPOTHESES
Step 1: We test whether there is a difference in the
means.
$H_0: \mu_1 = \mu_2 = \cdots = \mu_r$
$H_a$: The $\mu_i$ are not all equal.
STEP 2: CHECK ANOVA ASSUMPTIONS
The samples are random and independent of each
other.
The populations are normally distributed.
The populations all have the same variance.
The ANOVA F test is reasonably robust to moderate
departures from normality and equal variances.
STEP 3: ANOVA F TEST
Compare the variation within the samples to the
variation between the samples.
[Figure: two panels comparing responses across medical treatments A, B, and C]
ANOVA TEST STATISTIC
$$F = \frac{\text{Variation between groups}}{\text{Variation within groups}} = \frac{MSG}{MSE}$$
Variation within groups small compared with variation
between groups → Large F
Variation within groups large compared with variation
between groups → Small F
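MSG
$$MSG = \frac{SSG}{r - 1} = \frac{n_1 (\bar{y}_{1\cdot} - \bar{y}_{\cdot\cdot})^2 + n_2 (\bar{y}_{2\cdot} - \bar{y}_{\cdot\cdot})^2 + \cdots + n_r (\bar{y}_{r\cdot} - \bar{y}_{\cdot\cdot})^2}{r - 1}$$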
The mean square for groups, MSG, measures the
variability of the sample averages.
SSG stands for the sum of squares for groups.
MSE
$$MSE = \frac{SSE}{n - r} = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2 + \cdots + (n_r - 1) s_r^2}{n - r}$$
where
$$s_i = \sqrt{\frac{\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2}{n_i - 1}}$$
Mean square error, MSE, measures the variability
within the groups.
SSE stands for the sum of squares for error.
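The F statistic can be assembled from the MSG and MSE definitions in a few lines of Python. This is a sketch under stated assumptions: the function name `one_way_f` and the three small groups are hypothetical, and n denotes the total number of observations across all r groups.

```python
import statistics


def one_way_f(groups):
    """Return (F, numerator df, denominator df) for one-way ANOVA."""
    r = len(groups)                   # number of groups
    n = sum(len(g) for g in groups)   # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # SSG: weighted squared deviations of group means from the grand mean.
    ssg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # SSE: pooled within-group variation, (n_i - 1) * s_i^2 summed over groups.
    sse = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    msg = ssg / (r - 1)
    mse = sse / (n - r)
    return msg / mse, r - 1, n - r


# Hypothetical responses for three groups, invented for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_groups, df_error = one_way_f(groups)
print(f_stat, df_groups, df_error)  # F is about 3.0 on 2 and 6 degrees of freedom
```

Here SSG = 6 and SSE = 6, so MSG = 3, MSE = 1, and F = 3.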
STEPS 4-5
Step 4: Calculate the p-value.
Step 5: Write a conclusion.
ANOVA EXAMPLE
A researcher would like to determine if three drugs
provide the same relief from pain.
60 patients are randomly assigned to a treatment
(20 people in each treatment).
Step 1: Formulate the Hypotheses
$H_0: \mu_{\text{Drug A}} = \mu_{\text{Drug B}} = \mu_{\text{Drug C}}$
$H_a$: The $\mu_i$ are not all equal.
STEPS 2-4
JMP demonstration
Analyze > Fit Y By X
Y, Response: Pain
X, Factor: Drug
EXAMPLE 1: JMP OUTPUT AND CONCLUSION
Step 5 Conclusion: There is strong evidence
that the mean pain levels are not the same for all three drugs.
FOLLOW-UP TEST
The p-value of the overall F test indicates that the mean
level of pain is not the same for patients taking drugs A,
B, and C.
We would like to know which pairs of treatments
are different.
One method is to use Tukey’s HSD (honestly
significant difference) test.
TUKEY TESTS
Tukey’s test simultaneously tests
$H_0: \mu_i = \mu_{i'}$
$H_a: \mu_i \neq \mu_{i'}$
for all pairs of factor levels. Tukey’s HSD controls
the overall type I error rate.
JMP demonstration
Oneway Analysis of Pain By Drug
Compare Means > All Pairs, Tukey HSD
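To see why the overall type I error needs controlling, the rough calculation below counts the pairwise comparisons and estimates the chance of at least one false positive if each comparison were run separately at level alpha. It assumes independent tests, which pairwise comparisons are not exactly, so treat the result as an approximation. This is not Tukey's procedure itself, and the function name `unadjusted_familywise_error` is hypothetical.

```python
from math import comb


def unadjusted_familywise_error(r, alpha=0.05):
    """Return (number of pairs, approximate chance of any false positive)."""
    m = comb(r, 2)  # number of pairwise comparisons among r groups
    # Approximation that treats the m tests as independent.
    return m, 1 - (1 - alpha) ** m


m, fwer = unadjusted_familywise_error(r=3)
print(m, round(fwer, 4))  # 3 comparisons, about 0.1426
```

With three drugs there are three pairs, so running unadjusted tests at 0.05 each would give roughly a 14% chance of at least one false positive under this approximation.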
JMP OUTPUT
The JMP output shows that drugs A and C are
significantly different.
ANALYSIS OF COVARIANCE
ANALYSIS OF COVARIANCE (ANCOVA)
Covariates are variables that may affect the
response but cannot be controlled.
Covariates are not of primary interest to the
researcher.
We will look at an example with two covariates. The
model is
$$y_{ij} = \mu_i + \text{covariates} + \varepsilon_{ij}$$
ANCOVA EXAMPLE
Consider the previous example where we tested
whether the patients receiving different drugs
reported different levels of pain. Perhaps age and
gender may influence the efficacy of the drug. We
can use age and gender as covariates.
JMP demonstration
Analyze > Fit Model
Y: Pain
Add: Drug
Age
Gender
JMP OUTPUT
CONCLUSION
The one-sample t-test allows us to test whether the
mean of a group is equal to a specified value.
The two-sample t-test and paired t-test allow us to
determine whether the means of two groups are different.
ANOVA and ANCOVA methods allow us to
determine whether the means of several groups are
statistically different.
SAS AND SPSS
For information about using SAS and SPSS to do
ANOVA:
http://www.ats.ucla.edu/stat/sas/topics/anova.htm
http://www.ats.ucla.edu/stat/spss/topics/anova.htm
REFERENCES
Fisher’s Irises Data (used in the one-sample and
two-sample t-test examples).
Flexibility data (paired t-test example):
Michael Sullivan III. Statistics: Informed Decisions
Using Data. Upper Saddle River, New Jersey:
Pearson Education, 2004: 602.