
Analysis of variance 

Tron Anders Moger


31.10.2006
Comparing more than two groups
• Up to now we have studied situations with
– One observation per object
• One group
• Two groups
– Two or more observations per object
• We will now study situations with one observation
per object, and three or more groups of objects
• The most important question is as usual: Do the
numbers in the groups come from the same
population, or from different populations?
ANOVA
• If you have three groups, you could plausibly do pairwise
comparisons. But if you have 10 groups? That would be 45 pairwise
comparisons: you would get too many false positives!
• You would really like to compare a null
hypothesis of all equal, against some
difference
• ANOVA: ANalysis Of VAriance
One-way ANOVA: Example
• Assume "treatment results" from 13 patients
visiting one of three doctors are given:
– Doctor A: 24,26,31,27
– Doctor B: 29,31,30,36,33
– Doctor C: 29,27,34,26
• H0: The means are equal for all groups (The
treatment results are from the same population of
results)
• H1: The means are different for at least two groups
(They are from different populations)
Comparing the groups
• Averages within groups:
– Doctor A: 27
– Doctor B: 31.8
– Doctor C: 29
• Total average:
  $\bar{x} = \frac{4 \cdot 27 + 5 \cdot 31.8 + 4 \cdot 29}{4 + 5 + 4} = 29.46$
• The group averages alone are not enough: their differences must be
judged against the variation in the data.
• We must compare the variance within the groups to the variance
between the group means.
Variance within and between groups
• Sum of squares within groups:
  $SSW = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$
  $SSW = (24-27)^2 + (26-27)^2 + \ldots + (29-31.8)^2 + \ldots = 94.8$
• Compare it with the sum of squares between groups:
  $SSG = \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2$
  $SSG = 4(27-29.46)^2 + 5(31.8-29.46)^2 + 4(29-29.46)^2 = 52.43$
• Comparing these, we also need to take into
account the number of observations and
sizes of groups
Adjusting for group sizes
• Divide by the number of degrees of
freedom
$MSW = \frac{SSW}{n-K} \qquad MSG = \frac{SSG}{K-1}$

(n: number of observations, K: number of groups)

• Under H0, both are estimates of the population variance of the error
• Test statistic: $\frac{MSG}{MSW}$; reject H0 if this is large
Test statistic thresholds
• If the populations are normal, with the same variance, then we can
show that under the null hypothesis, SSG/σ² and SSW/σ² are Chi-square
distributed with K-1 and n-K d.f., so that the ratio of mean squares
follows an F distribution:
  $\frac{MSG}{MSW} \sim F_{K-1,\,n-K}$
  (the F distribution with K-1 and n-K degrees of freedom)
• Reject at significance level α if
  $\frac{MSG}{MSW} > F_{K-1,\,n-K,\,\alpha}$
  Find this value in the table on p. 871
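Instead of the printed table, the critical value can also be looked up
numerically. A minimal sketch using scipy (with K = 3 and n = 13, the
values of the doctor example on the next slide):

```python
# Upper alpha-quantile of the F distribution, as an alternative to the table on p. 871
from scipy.stats import f

K, n, alpha = 3, 13, 0.05
print(f.ppf(1 - alpha, dfn=K - 1, dfd=n - K))  # about 4.10, the tabulated value
```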
Continuing example
$MSW = \frac{SSW}{n-K} = \frac{94.8}{13-3} = 9.48$

$MSG = \frac{SSG}{K-1} = \frac{52.43}{3-1} = 26.2$

$\frac{MSG}{MSW} = \frac{26.2}{9.48} = 2.76$    Page 871: $F_{3-1,\,13-3,\,0.05} = 4.10$

• Since 2.76 < 4.10, we can NOT reject the null hypothesis in our case.
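For reference, the whole calculation can be reproduced in a few lines of
Python. This is a sketch, not part of the original slides, but the
numbers match the ones above:

```python
import numpy as np
from scipy.stats import f_oneway

groups = [np.array([24, 26, 31, 27]),      # Doctor A
          np.array([29, 31, 30, 36, 33]),  # Doctor B
          np.array([29, 27, 34, 26])]      # Doctor C

grand_mean = np.concatenate(groups).mean()                        # 29.46
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)            # 94.8
ssg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # 52.43

n, K = sum(len(g) for g in groups), len(groups)
print((ssg / (K - 1)) / (ssw / (n - K)))  # F = 2.76

# scipy gives the same F statistic together with its p-value
print(f_oneway(*groups))                  # F = 2.76, p = 0.111
```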
ANOVA table
Source of variation   Sum of squares   Deg. of freedom   Mean squares   F ratio
Between groups        SSG              K-1               MSG            MSG/MSW
Within groups         SSW              n-K               MSW
Total                 SST              n-1

$SST = (24 - 29.46)^2 + (26 - 29.46)^2 + \ldots + (26 - 29.46)^2$

NOTE: $SSG + SSW = SST$
Formulation of the model:
• H0: µ1 = µ2 = … = µK
• Model: Xij = µi + εij
• Let Gi be the difference between the group mean and the overall
population mean. Then:
• Gi = µi − µ, or equivalently µi = µ + Gi
• Giving Xij = µ + Gi + εij
• And H0: G1 = G2 = … = GK = 0
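The model is easy to simulate, which helps build intuition for what H0
says. A minimal sketch (the error standard deviation of 3.0 is an
arbitrary assumption, not from the slides):

```python
# Simulate X_ij = mu + G_i + eps_ij for three groups; under H0 all G_i = 0
import numpy as np

rng = np.random.default_rng(seed=1)
mu = 29.0                    # overall mean, roughly as in the doctor example
G = [0.0, 0.0, 0.0]          # group effects; all zero under H0
sizes = [4, 5, 4]            # group sizes as in the doctor example

samples = [mu + g + rng.normal(0.0, 3.0, size=m)  # eps_ij ~ N(0, 3^2), assumed
           for g, m in zip(G, sizes)]
```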
One-way ANOVA in SPSS
ANOVA: VAR00001

                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   52.431           2    26.215        2.765   .111
Within Groups    94.800           10   9.480
Total            147.231          12

Last column: the p-value, i.e. the smallest value of α at which the
null hypothesis is rejected.
One-way ANOVA in SPSS:
• Analyze - Compare Means - One-way
ANOVA
• Move dependent variable to Dependent list
and group to Factor
• Choose Bonferroni in the Post Hoc window
to get comparisons of all groups
• Choose Descriptive and Homogeneity of
variance test in the Options window
Energy expenditure example:
• Let us say we have measurements of energy
expenditure in three independent groups:
Anorectic, lean and obese
• Want to test H0: Energy expenditure is the
same for anorectic, lean and obese
• Data for anorectic: 5.40, 6.23, 5.34, 5.76,
5.99, 6.55, 6.33, 6.21
SPSS output:
Descriptives: Energy

                                                95% Confidence Interval for Mean
           N    Mean      Std. Dev.   Std. Error   Lower Bound   Upper Bound   Min    Max
Lean       13   8.0662    1.23808     .34338       7.3180        8.8143        6.13   10.88
Obese      9    10.2978   1.39787     .46596       9.2233        11.3723       8.79   12.79
Anorectic  8    5.9762    .44032      .15568       5.6081        6.3444        5.34   6.55
Total      30   8.1783    1.98936     .36321       7.4355        8.9212        5.34   12.79

Test of Homogeneity of Variances: Energy
Levene Statistic   df1   df2   Sig.
2.814              2     27    .078

ANOVA: Energy
                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   79.385           2    39.693        30.288   .000
Within Groups    35.384           27   1.311
Total            114.769          29

Multiple Comparisons (Bonferroni)
Dependent Variable: Energy

                         Mean Difference                        95% Confidence Interval
(I) Group   (J) Group    (I-J)         Std. Error   Sig.    Lower Bound   Upper Bound
Lean        Obese        -2.23162*     .49641       .000    -3.4987       -.9646
            Anorectic    2.08990*      .51441       .001    .7769         3.4029
Obese       Lean         2.23162*      .49641       .000    .9646         3.4987
            Anorectic    4.32153*      .55626       .000    2.9017        5.7414
Anorectic   Lean         -2.08990*     .51441       .001    -3.4029       -.7769
            Obese        -4.32153*     .55626       .000    -5.7414       -2.9017
*. The mean difference is significant at the .05 level.

• See that there is a difference between groups.
• See also between which groups the difference is!
Conclusion:
• There is a significant overall difference in
energy expenditure between the three
groups (p-value<0.001)
• There are also significant differences for all
two-by-two comparisons of groups
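The Bonferroni idea itself is simple: do all pairwise t-tests and
multiply each p-value by the number of comparisons. A rough sketch
(demonstrated on the doctor data, since the slides list only the
anorectic energy values; note that SPSS's post-hoc test pools the
within-group variance across all groups, so its numbers differ slightly
from plain pairwise t-tests):

```python
from itertools import combinations
from scipy.stats import ttest_ind

data = {"Doctor A": [24, 26, 31, 27],
        "Doctor B": [29, 31, 30, 36, 33],
        "Doctor C": [29, 27, 34, 26]}

pairs = list(combinations(data.items(), 2))
for (name1, x1), (name2, x2) in pairs:
    t, p = ttest_ind(x1, x2)  # two-sample t-test, equal variances assumed
    print(name1, "vs", name2, "adjusted p =", min(p * len(pairs), 1.0))
```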
The Kruskal-Wallis test
• ANOVA is based on the assumption of
normality
• There is a non-parametric alternative that does not rely on this
assumption:
– Looking at all observations together, rank them
– Let R1, R2, …,RK be the sums of ranks of each
group
– If some R’s are much larger than others, it
indicates the numbers in different groups come
from different populations
The Kruskal-Wallis test
• The test statistic is
  $W = \frac{12}{n(n+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(n+1)$
• Under the null hypothesis, this has an approximate $\chi^2_{K-1}$
distribution.
• The approximation is OK when each group contains at least 5
observations.
Example: previous data
Doctor A        Doctor B         Doctor C
24 (rank 1)     29 (rank 6.5)    29 (rank 6.5)
26 (rank 2.5)   31 (rank 9.5)    27 (rank 4.5)
31 (rank 9.5)   30 (rank 8)      34 (rank 12)
27 (rank 4.5)   36 (rank 13)     26 (rank 2.5)
                33 (rank 11)
R1 = 17.5       R2 = 48          R3 = 25.5

(We really have too few observations for this test!)
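A sketch of the calculation from the rank sums above, together with
scipy's built-in version (scipy, like SPSS, applies a correction for
ties, so its statistic differs slightly from the plain formula):

```python
from scipy.stats import kruskal

n = 13
R = [17.5, 48.0, 25.5]  # rank sums per group
sizes = [4, 5, 4]

W = 12 / (n * (n + 1)) * sum(r**2 / m for r, m in zip(R, sizes)) - 3 * (n + 1)
print(W)                # about 4.15 without the tie correction

# scipy's tie-corrected statistic should be close to the SPSS values
# on the next slide (Chi-Square 4.195, p = .123)
print(kruskal([24, 26, 31, 27], [29, 31, 30, 36, 33], [29, 27, 34, 26]))
```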
Kruskal-Wallis in SPSS
• Use ”Analyze=>Nonparametric tests=>K
independent samples”
• For our data, we get
Ranks
VAR00002   N    Mean Rank
1          4    4.38
2          5    9.60
3          4    6.38
Total      13

Test Statistics (a,b)
              VAR00001
Chi-Square    4.195
df            2
Asymp. Sig.   .123
a. Kruskal Wallis Test
b. Grouping Variable: VAR00002
For the energy data:
• Same result as for one-way ANOVA!
Ranks
Group      N    Mean Rank
Lean       13   15.62
Obese      9    24.67
Anorectic  8    5.00
Total      30

Test Statistics (a,b)
              Energy
Chi-Square    21.146
df            2
Asymp. Sig.   .000
a. Kruskal Wallis Test
b. Grouping Variable: Group

• Reject H0
When to use what method
• In situations where we have one observation per
object, and want to compare two or more groups:
– Use non-parametric tests if you have enough data
• For two groups: Mann-Whitney U-test (Wilcoxon rank sum)
• For three or more groups use Kruskal-Wallis
– If data analysis indicates that the assumption of normally
distributed independent errors is OK
• For two groups use t-test (equal or unequal variances assumed)
• For three or more groups use ANOVA
When to use what method
• When you in addition to the main
observation have some observations that
can be used to pair or block objects, and
want to compare groups, and assumption of
normally distributed independent errors is
OK:
– For two groups, use paired-data t-test
– For three or more groups, we can use two-way
ANOVA
Two-way ANOVA (without
interaction)
• In two-way ANOVA, data fall into categories in
two different ways: Each observation can be
placed in a table.
• Example: Both doctor and type of treatment
should influence outcome.
• Sometimes we are interested in studying both
categories, sometimes the second category is used
only to reduce unexplained variance (like an
independent variable in regression!). Then it is
called a blocking variable
• Compare means, just as before, but for different
groups and blocks
Data from exercise 17.46:
• Three types of aptitude tests (K=3) were given to prospective
management trainees: Profile fit, Mindbender, and Psych Out
• Each test type is given to members of each of four groups of
subjects (H=4): Poor, Fair, Good, Excellent

                         Test type
Subject type   Profile fit   Mindbender   Psych Out
Poor           65            69           75
Fair           74            72           70
Good           64            68           78
Excellent      83            78           76
Sums of squares for two-way
ANOVA
• Assume K groups, H blocks, and assume
one observation xij for each group i and
each block j, so we have n=KH
observations (independent!).
– Mean for category i: $\bar{x}_{i\cdot}$
– Mean for block j: $\bar{x}_{\cdot j}$
– Overall mean: $\bar{x}$
• Model: Xij=µ+Gi+Bj+εij
Sums of squares for two-way
ANOVA
$SSG = H \sum_{i=1}^{K} (\bar{x}_{i\cdot} - \bar{x})^2 \qquad SSB = K \sum_{j=1}^{H} (\bar{x}_{\cdot j} - \bar{x})^2$

$SSE = \sum_{i=1}^{K} \sum_{j=1}^{H} (x_{ij} - \bar{x}_{i\cdot} - \bar{x}_{\cdot j} + \bar{x})^2 \qquad SST = \sum_{i=1}^{K} \sum_{j=1}^{H} (x_{ij} - \bar{x})^2$

$SSG + SSB + SSE = SST$
ANOVA table for two-way data
Source of variation   Sums of squares   Deg. of freedom   Mean squares             F ratio
Between groups        SSG               K-1               MSG = SSG/(K-1)          MSG/MSE
Between blocks        SSB               H-1               MSB = SSB/(H-1)          MSB/MSE
Error                 SSE               (K-1)(H-1)        MSE = SSE/((K-1)(H-1))
Total                 SST               n-1

Test for between-groups effect: compare $\frac{MSG}{MSE}$ to $F_{K-1,\,(K-1)(H-1)}$
Test for between-blocks effect: compare $\frac{MSB}{MSE}$ to $F_{H-1,\,(K-1)(H-1)}$
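For reference, the same randomized-block analysis can be run through
the regression formulation mentioned at the end of the slides. A sketch
using pandas and statsmodels (library choice is my assumption, not from
the slides) on the exercise 17.46 data:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "score":   [65, 69, 75, 74, 72, 70, 64, 68, 78, 83, 78, 76],
    "subject": ["Poor"] * 3 + ["Fair"] * 3 + ["Good"] * 3 + ["Excellent"] * 3,
    "test":    ["Profile fit", "Mindbender", "Psych Out"] * 4,
})

# Main effects only: one observation per cell leaves (K-1)(H-1) error d.f.
model = ols("score ~ C(subject) + C(test)", data=df).fit()
print(sm.stats.anova_lm(model))  # SS, d.f. and F tests for groups and blocks
```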
Two-way ANOVA (with interaction)
• The setup above assumes that the blocking
variable influences outcomes in the same
way in all categories (and vice versa)
• We can check if there is interaction between
the blocking variable and the categories by
extending the model with an interaction
term
• Needs more than one observation per cell (group/block combination)
• Other advantages: More precise estimates
Data from exercise 17.46 cont’d:
• Each type of test was given three times for
each type of subject
                         Test type
Subject type   Profile fit    Mindbender     Psych Out
Poor           65, 68, 62     69, 71, 67     75, 75, 78
Fair           74, 79, 76     72, 69, 69     70, 69, 65
Good           64, 72, 65     68, 73, 75     78, 82, 80
Excellent      83, 82, 84     78, 78, 75     76, 77, 75
Sums of squares for two-way
ANOVA (with interaction)
• Assume K groups, H blocks, and assume L observations xij1, xij2, …, xijL
for each category i and each block j, so we have n=KHL
observations (independent!).
– Mean for category i: $\bar{x}_{i\cdot\cdot}$
– Mean for block j: $\bar{x}_{\cdot j\cdot}$
– Mean for cell ij: $\bar{x}_{ij\cdot}$
– Overall mean: $\bar{x}$
• Model: Xijl = µ + Gi + Bj + Iij + εijl
Sums of squares for two-way
ANOVA (with interaction)
$SSG = HL \sum_{i=1}^{K} (\bar{x}_{i\cdot\cdot} - \bar{x})^2 \qquad SSB = KL \sum_{j=1}^{H} (\bar{x}_{\cdot j\cdot} - \bar{x})^2$

$SSE = \sum_{i=1}^{K} \sum_{j=1}^{H} \sum_{l=1}^{L} (x_{ijl} - \bar{x}_{ij\cdot})^2 \qquad SST = \sum_{i=1}^{K} \sum_{j=1}^{H} \sum_{l=1}^{L} (x_{ijl} - \bar{x})^2$

$SSI = L \sum_{i=1}^{K} \sum_{j=1}^{H} (\bar{x}_{ij\cdot} - \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot j\cdot} + \bar{x})^2$

$SSG + SSB + SSI + SSE = SST$
ANOVA table for two-way data
(with interaction)
Source of variation   Sums of squares   Deg. of freedom   Mean squares             F ratio
Between groups        SSG               K-1               MSG = SSG/(K-1)          MSG/MSE
Between blocks        SSB               H-1               MSB = SSB/(H-1)          MSB/MSE
Interaction           SSI               (K-1)(H-1)        MSI = SSI/((K-1)(H-1))   MSI/MSE
Error                 SSE               KH(L-1)           MSE = SSE/(KH(L-1))
Total                 SST               n-1

Test for interaction: compare MSI/MSE with $F_{(K-1)(H-1),\,KH(L-1)}$
Test for block effect: compare MSB/MSE with $F_{H-1,\,KH(L-1)}$
Test for group effect: compare MSG/MSE with $F_{K-1,\,KH(L-1)}$
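With replications per cell, the interaction term can be added to the
regression formulation by changing '+' to '*' in the formula. A sketch
on the replicated 17.46 data; it should reproduce the SPSS table shown
two slides below:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

cells = {("Poor", "Profile fit"): [65, 68, 62], ("Poor", "Mindbender"): [69, 71, 67],
         ("Poor", "Psych Out"): [75, 75, 78], ("Fair", "Profile fit"): [74, 79, 76],
         ("Fair", "Mindbender"): [72, 69, 69], ("Fair", "Psych Out"): [70, 69, 65],
         ("Good", "Profile fit"): [64, 72, 65], ("Good", "Mindbender"): [68, 73, 75],
         ("Good", "Psych Out"): [78, 82, 80], ("Excellent", "Profile fit"): [83, 82, 84],
         ("Excellent", "Mindbender"): [78, 78, 75], ("Excellent", "Psych Out"): [76, 77, 75]}
rows = [(s, t, x) for (s, t), xs in cells.items() for x in xs]
df = pd.DataFrame(rows, columns=["subject", "test", "score"])

# '*' expands to both main effects plus the interaction C(subject):C(test)
model = ols("score ~ C(subject) * C(test)", data=df).fit()
print(sm.stats.anova_lm(model))  # compare with the SPSS output below
```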
Two-way ANOVA in SPSS
• Analyze->General Linear Model->
Univariate
• Move dependent variable (Score) to
Dependent Variable
• Move test type and subject type to Fixed
Factor(s)
• Under Options, may check Descriptive
Statistics and Homogeneity Tests, and also
get two-by-two comparisons by checking
Bonferroni under Post Hoc
• Gives you a full model (with interaction)
Some SPSS output:
Levene's Test of Equality of Error Variances (a)
Dependent Variable: Score
F       df1   df2   Sig.
1.472   11    24    .206
Tests the null hypothesis that the error variance of the dependent
variable is equal across groups.
a. Design: Intercept+Subjectty+Testtype+Subjectty*Testtype

• Sig. = .206: equal variances can be assumed

Tests of Between-Subjects Effects
Dependent Variable: Score
Source                 Type IV Sum of Squares   df   Mean Square   F           Sig.
Corrected Model        1032.556 (a)             11   93.869        15.360      .000
Intercept              193306.778               1    193306.778    31632.018   .000
Subjectty              389.000                  3    129.667       21.218      .000
Testtype               57.556                   2    28.778        4.709       .019
Subjectty * Testtype   586.000                  6    97.667        15.982      .000
Error                  146.667                  24   6.111
Total                  194486.000               36
Corrected Total        1179.222                 35
a. R Squared = .876 (Adjusted R Squared = .819)

• See that there is a significant block effect, a significant group
effect, and a significant interaction effect
• Means (in plain words) that the test score is different for subject
types and for the three tests, and that the difference for test type
depends on what block you consider

2. Subjectty
Dependent Variable: Score
                                         95% Confidence Interval
Subjectty   Mean     Std. Error   Lower Bound   Upper Bound
Poor        70.000   .824         68.299        71.701
Fair        71.444   .824         69.744        73.145
Good        73.000   .824         71.299        74.701
Excellent   78.667   .824         76.966        80.367
Two-by-two comparisons
Multiple Comparisons (Bonferroni)
Dependent Variable: Score

                              Mean Difference                       95% Confidence Interval
(I) Testtype   (J) Testtype   (I-J)       Std. Error   Sig.    Lower Bound   Upper Bound
Profile fit    Mindbender     .83         1.009        1.000   -1.76         3.43
               Psych Out      -2.17       1.009        .126    -4.76         .43
Mindbender     Profile fit    -.83        1.009        1.000   -3.43         1.76
               Psych Out      -3.00*      1.009        .020    -5.60         -.40
Psych Out      Profile fit    2.17        1.009        .126    -.43          4.76
               Mindbender     3.00*       1.009        .020    .40           5.60

Multiple Comparisons (Bonferroni)
Dependent Variable: Score

                               Mean Difference                       95% Confidence Interval
(I) Subjectty   (J) Subjectty  (I-J)       Std. Error   Sig.    Lower Bound   Upper Bound
Poor            Fair           -1.44       1.165        1.000   -4.79         1.91
                Good           -3.00       1.165        .100    -6.35         .35
                Excellent      -8.67*      1.165        .000    -12.02        -5.32
Fair            Poor           1.44        1.165        1.000   -1.91         4.79
                Good           -1.56       1.165        1.000   -4.91         1.79
                Excellent      -7.22*      1.165        .000    -10.57        -3.87
Good            Poor           3.00        1.165        .100    -.35          6.35
                Fair           1.56        1.165        1.000   -1.79         4.91
                Excellent      -5.67*      1.165        .000    -9.02         -2.32
Excellent       Poor           8.67*       1.165        .000    5.32          12.02
                Fair           7.22*       1.165        .000    3.87          10.57
                Good           5.67*       1.165        .000    2.32          9.02

Based on observed means.
*. The mean difference is significant at the .05 level.
Notes on ANOVA
• All analysis of variance (ANOVA) methods are
based on the assumptions of normally distributed
and independent errors
• The same problems can be described using the regression
framework. We get exactly the same tests and results (see the
sketch after this list)!
• There are many extensions beyond those
mentioned
• In fact, the book only briefly touches this subject
• More material is needed in order to do two-way
ANOVA on your own
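To illustrate the regression equivalence claimed above, a small sketch:
dummy-coding the group variable and running ordinary least squares on
the doctor data gives the same overall F test as the one-way ANOVA:

```python
import pandas as pd
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "result": [24, 26, 31, 27, 29, 31, 30, 36, 33, 29, 27, 34, 26],
    "doctor": ["A"] * 4 + ["B"] * 5 + ["C"] * 4,
})

fit = ols("result ~ C(doctor)", data=df).fit()  # dummy coding of the groups
print(fit.fvalue, fit.f_pvalue)                 # 2.76 and 0.111, as in the ANOVA
```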
Next time:
• How to design a study?
• Different sampling methods
• Research designs
• Sample size considerations
