Health Management 10
SSW = (24 − 27)² + (26 − 27)² + … + (29 − 31.8)² + … = 94.8
• Compare it with the sum of squares between
groups:
    SSG = Σ_{i=1}^{K} n_i (x̄_i − x̄)²
SSG = 4(27 − 29.46)² + 5(31.8 − 29.46)² + 4(29 − 29.46)² = 52.43
• When comparing these, we also need to take
into account the number of observations and
the sizes of the groups
Adjusting for group sizes
• Divide by the number of degrees of
freedom:
    MSW = SSW / (n − K)        n: number of observations
    MSG = SSG / (K − 1)        K: number of groups
  Both are estimates of the population
  variance of the error under H0
• Test statistic: MSG / MSW — reject H0 if this is large
Test statistic thresholds
• If the populations are normal, with the same
variance, then we can show that under the
null hypothesis, SSG/σ² and SSW/σ² are Chi-
square distributed with K−1 and n−K d.f.
• It follows that
    MSG / MSW ~ F(K−1, n−K)
the F distribution with K−1 and n−K degrees
of freedom
ANOVA: VAR00001
                 Sum of Squares   df   Mean Square     F     Sig.
Between Groups        52.431       2      26.215     2.765   .111
Within Groups         94.800      10       9.480
Total                147.231      12
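The numbers in this table can be re-derived from the group summaries in the worked example (group sizes 4, 5, 4, group means 27, 31.8, 29, and SSW = 94.8). A minimal check in Python using SciPy:

```python
from scipy.stats import f

# Group summaries from the worked example: sizes, group means, and SSW.
sizes = [4, 5, 4]
means = [27.0, 31.8, 29.0]
n, K = sum(sizes), len(sizes)      # n = 13 observations, K = 3 groups
ssw = 94.8                         # within-groups sum of squares (from the slides)

grand = sum(s * m for s, m in zip(sizes, means)) / n
ssg = sum(s * (m - grand) ** 2 for s, m in zip(sizes, means))

msw = ssw / (n - K)                # 9.48
msg = ssg / (K - 1)                # 26.215
F = msg / msw                      # 2.765, as in the SPSS table
p = f.sf(F, K - 1, n - K)          # upper-tail p-value (SPSS "Sig.")
```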
Descriptives: Energy
                                        95% Confidence Interval for Mean
           N    Mean    Std. Deviation  Std. Error  Lower Bound  Upper Bound  Minimum  Maximum
Lean      13   8.0662      1.23808        .34338       7.3180       8.8143      6.13    10.88
Obese      9  10.2978      1.39787        .46596       9.2233      11.3723      8.79    12.79
Anorectic  8   5.9762       .44032        .15568       5.6081       6.3444      5.34     6.55
Total     30   8.1783      1.98936        .36321       7.4355       8.9212      5.34    12.79

Test of Homogeneity of Variances: Energy
Levene Statistic   df1   df2   Sig.
      2.814          2    27   .078

ANOVA: Energy
                 Sum of Squares   df   Mean Square      F      Sig.
Between Groups        79.385       2      39.693     30.288    .000
Within Groups         35.384      27       1.311
Total                114.769      29
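The F statistic for the Energy data follows directly from the sums of squares and degrees of freedom in the ANOVA table; a quick check in Python:

```python
from scipy.stats import f

# Sums of squares and degrees of freedom from the SPSS ANOVA table
# for the Energy data.
ss_between, df_between = 79.385, 2
ss_within, df_within = 35.384, 27

F = (ss_between / df_between) / (ss_within / df_within)   # about 30.29
p = f.sf(F, df_between, df_within)                        # far below 0.001
```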
Multiple Comparisons
[post-hoc comparison table omitted]
Ranking the observations (e.g. the value 33 gets rank 11) gives rank sums
R1 = 17.5, R2 = 48, R3 = 25.5
Kruskal-Wallis in SPSS
• Use "Analyze => Nonparametric tests => K
Independent Samples"
• For our data, we get
[SPSS Ranks and Test Statistics tables omitted]
• Reject H0
When to use what method
• In situations where we have one observation per
object, and want to compare two or more groups:
– Use non-parametric tests if you have enough data
• For two groups: Mann-Whitney U-test (Wilcoxon rank sum)
• For three or more groups use Kruskal-Wallis
– If data analysis indicates that the assumption of
normally distributed independent errors is OK
• For two groups use t-test (equal or unequal variances assumed)
• For three or more groups use ANOVA
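These choices map directly onto SciPy routines; a sketch with made-up example values (not the lecture data):

```python
from scipy import stats

# Made-up samples for illustration only (not the lecture data).
a = [12, 15, 14, 11]
b = [18, 17, 19, 21, 16]
c = [13, 16, 14, 15]

# Two groups, non-parametric: Mann-Whitney U test (Wilcoxon rank sum).
u_stat, p_u = stats.mannwhitneyu(a, b, alternative="two-sided")

# Three or more groups, non-parametric: Kruskal-Wallis.
h_stat, p_h = stats.kruskal(a, b, c)

# Two groups, normal errors: t-test (equal_var=False if unequal variances assumed).
t_stat, p_t = stats.ttest_ind(a, b, equal_var=False)

# Three or more groups, normal errors: one-way ANOVA.
f_stat, p_f = stats.f_oneway(a, b, c)
```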
When to use what method
• When, in addition to the main observation,
you have some observations that can be used
to pair or block objects, want to compare
groups, and the assumption of normally
distributed independent errors is OK:
– For two groups, use paired-data t-test
– For three or more groups, we can use two-way
ANOVA
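For two groups, the paired-data t-test reduces to a one-sample test on the within-pair differences; a sketch with made-up numbers, using SciPy's `ttest_rel`:

```python
from scipy import stats

# Made-up paired measurements (the same six subjects measured twice);
# the pairing means we test the within-subject differences.
before = [72, 68, 75, 71, 69, 74]
after = [74, 70, 75, 73, 72, 75]

t_stat, p_val = stats.ttest_rel(before, after)   # paired-data t-test
```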
Two-way ANOVA (without
interaction)
• In two-way ANOVA, data fall into categories in
two different ways: Each observation can be
placed in a table.
• Example: Both doctor and type of treatment
should influence outcome.
• Sometimes we are interested in studying both
categories, sometimes the second category is used
only to reduce unexplained variance (like an
independent variable in regression!). Then it is
called a blocking variable
• Compare means, just as before, but for different
groups and blocks
Data from exercise 17.46:
• Three types of aptitude tests (K=3) given to
prospective management trainees: Profile fit,
Mindbender, Psych Out
• Each test type is given to members of each
of four groups of subjects (H=4): Poor, Fair,
Good, Excellent
Test type
Subject type Profile fit Mindbender Psych Out
Poor 65 69 75
Fair 74 72 70
Good 64 68 78
Excellent 83 78 76
Sums of squares for two-way
ANOVA
• Assume K groups, H blocks, and assume
one observation x_ij for each group i and
each block j, so we have n = KH
observations (independent!).
– Mean for category i: x̄_i
– Mean for block j: x̄_j
– Overall mean: x̄
• Model: X_ij = µ + G_i + B_j + ε_ij
Sums of squares for two-way
ANOVA
SSG = H Σ_{i=1}^{K} (x̄_i − x̄)²          SSB = K Σ_{j=1}^{H} (x̄_j − x̄)²

SSE = Σ_{i=1}^{K} Σ_{j=1}^{H} (x_ij − x̄_i − x̄_j + x̄)²     SST = Σ_{i=1}^{K} Σ_{j=1}^{H} (x_ij − x̄)²

MSG = SSG / (K − 1),  MSB = SSB / (H − 1),  MSE = SSE / ((K − 1)(H − 1))

Test for between-groups effect: compare MSG / MSE to F(K−1, (K−1)(H−1))
Test for between-blocks effect: compare MSB / MSE to F(H−1, (K−1)(H−1))
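As a check, these formulas can be applied to the exercise 17.46 data above; a sketch in Python (NumPy/SciPy):

```python
import numpy as np
from scipy.stats import f

# Exercise 17.46 data: rows are subject types (blocks, H = 4),
# columns are test types (groups, K = 3).
x = np.array([[65, 69, 75],
              [74, 72, 70],
              [64, 68, 78],
              [83, 78, 76]], dtype=float)
H, K = x.shape
grand = x.mean()
group_means = x.mean(axis=0)     # test-type (group) means
block_means = x.mean(axis=1)     # subject-type (block) means

ssg = H * ((group_means - grand) ** 2).sum()
ssb = K * ((block_means - grand) ** 2).sum()
sst = ((x - grand) ** 2).sum()
sse = sst - ssg - ssb

msg = ssg / (K - 1)
msb = ssb / (H - 1)
mse = sse / ((K - 1) * (H - 1))

F_groups = msg / mse             # compare to F(K-1, (K-1)(H-1))
F_blocks = msb / mse             # compare to F(H-1, (K-1)(H-1))
p_groups = f.sf(F_groups, K - 1, (K - 1) * (H - 1))
p_blocks = f.sf(F_blocks, H - 1, (K - 1) * (H - 1))
```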
Two-way ANOVA (with interaction)
• The setup above assumes that the blocking
variable influences outcomes in the same
way in all categories (and vice versa)
• We can check if there is interaction between
the blocking variable and the categories by
extending the model with an interaction
term
• Need more than one observation per cell
(each group/block combination)
• Other advantages: More precise estimates
Data from exercise 17.46 cont’d:
• Each type of test was given three times for
each type of subject
Test type
Subject type   Profile fit   Mindbender   Psych Out
Poor           65, 68, 62    69, 71, 67   75, 75, 78
Fair           74, 79, 76    72, 69, 69   70, 69, 65
Good           64, 72, 65    68, 73, 75   78, 82, 80
Excellent      83, 82, 84    78, 78, 75   76, 77, 75
Sums of squares for two-way
ANOVA (with interaction)
• Assume K groups, H blocks, and assume L
observations x_ij1, x_ij2, …, x_ijL for each
category i and each block j, so we have
n = KHL observations (independent!).
– Mean for category i: x̄_i
– Mean for block j: x̄_j
– Mean for cell ij: x̄_ij
– Overall mean: x̄
• Model: X_ijl = µ + G_i + B_j + I_ij + ε_ijl
Sums of squares for two-way
ANOVA (with interaction)
SSG = HL Σ_{i=1}^{K} (x̄_i − x̄)²          SSB = KL Σ_{j=1}^{H} (x̄_j − x̄)²

SSI = L Σ_{i=1}^{K} Σ_{j=1}^{H} (x̄_ij − x̄_i − x̄_j + x̄)²

SSE = Σ_{i=1}^{K} Σ_{j=1}^{H} Σ_{l=1}^{L} (x_ijl − x̄_ij)²     SST = Σ_i Σ_j Σ_l (x_ijl − x̄)²
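These sums of squares can be computed directly from the extended exercise 17.46 data above; a sketch in NumPy:

```python
import numpy as np

# Extended exercise 17.46 data: x[j, i, l] with j = subject type (H = 4
# blocks), i = test type (K = 3 groups), l = replicate (L = 3).
x = np.array([
    [[65, 68, 62], [69, 71, 67], [75, 75, 78]],   # Poor
    [[74, 79, 76], [72, 69, 69], [70, 69, 65]],   # Fair
    [[64, 72, 65], [68, 73, 75], [78, 82, 80]],   # Good
    [[83, 82, 84], [78, 78, 75], [76, 77, 75]],   # Excellent
], dtype=float)
H, K, L = x.shape
grand = x.mean()
cell_means = x.mean(axis=2)               # cell means
block_means = x.mean(axis=(1, 2))         # subject-type means
group_means = x.mean(axis=(0, 2))         # test-type means

ssb = K * L * ((block_means - grand) ** 2).sum()   # between blocks (subject type)
ssg = H * L * ((group_means - grand) ** 2).sum()   # between groups (test type)
ssi = L * ((cell_means - block_means[:, None]
            - group_means[None, :] + grand) ** 2).sum()   # interaction
sse = ((x - cell_means[:, :, None]) ** 2).sum()    # error (within cells)
```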
Tests of Between-Subjects Effects
Dependent Variable: Score
Source                Type IV Sum of Squares   df   Mean Square        F       Sig.
Corrected Model             1032.556a          11       93.869       15.360    .000
Intercept                 193306.778            1   193306.778    31632.018    .000
Subjectty                    389.000            3      129.667       21.218    .000
Testtype                      57.556            2       28.778        4.709    .019
Subjectty * Testtype         586.000            6       97.667       15.982    .000
Error                        146.667           24        6.111
Total                     194486.000           36
Corrected Total             1179.222           35
a. R Squared = .876 (Adjusted R Squared = .819)

• Means (in plain words) that test score is different for subject types, for
the three tests, and that the difference for test type depends on what block
you consider

2. Subjectty
Dependent Variable: Score
                                    95% Confidence Interval
Subjectty    Mean    Std. Error   Lower Bound   Upper Bound
Poor        70.000      .824         68.299        71.701
Fair        71.444      .824         69.744        73.145
Good        73.000      .824         71.299        74.701
Excellent   78.667      .824         76.966        80.367
Two-by-two comparisons

Multiple Comparisons
Dependent Variable: Score
                                   Mean                          95% Confidence Interval
(I) Testtype   (J) Testtype   Difference (I-J)  Std. Error   Sig.   Lower Bound  Upper Bound
Profile fit    Mindbender           .83           1.009     1.000      -1.76         3.43
               Psych Out          -2.17           1.009      .126      -4.76          .43
Mindbender     Profile fit         -.83           1.009     1.000      -3.43         1.76
               Psych Out          -3.00*          1.009      .020      -5.60         -.40
Psych Out      Profile fit         2.17           1.009      .126       -.43         4.76
               Mindbender          3.00*          1.009      .020        .40         5.60
Based on observed means.
*. The mean difference is significant at the .05 level.

Multiple Comparisons
Dependent Variable: Score
Bonferroni
                                   Mean                          95% Confidence Interval
(I) Subjectty  (J) Subjectty  Difference (I-J)  Std. Error   Sig.   Lower Bound  Upper Bound
Poor           Fair               -1.44           1.165     1.000      -4.79         1.91
               Good               -3.00           1.165      .100      -6.35          .35
               Excellent          -8.67*          1.165      .000     -12.02        -5.32
Fair           Poor                1.44           1.165     1.000      -1.91         4.79
               Good               -1.56           1.165     1.000      -4.91         1.79
               Excellent          -7.22*          1.165      .000     -10.57        -3.87
Good           Poor                3.00           1.165      .100       -.35         6.35
               Fair                1.56           1.165     1.000      -1.79         4.91
               Excellent          -5.67*          1.165      .000      -9.02        -2.32
Excellent      Poor                8.67*          1.165      .000       5.32        12.02
               Fair                7.22*          1.165      .000       3.87        10.57
               Good                5.67*          1.165      .000       2.32         9.02
Based on observed means.
*. The mean difference is significant at the .05 level.
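The entries in these tables follow from the two-way ANOVA results (MSE = 6.111 with 24 error d.f.; each test-type mean is based on 12 scores). A check of the Psych Out vs. Mindbender row, assuming, as the adjusted p-values suggest, a Bonferroni adjustment over the three test-type comparisons:

```python
from math import sqrt
from scipy.stats import t

# ANOVA results from the Tests of Between-Subjects Effects table.
mse, df_err = 6.111, 24
n_per = 12                        # observations per test type (4 subjects x 3)
se = sqrt(2 * mse / n_per)        # standard error of a difference, about 1.009

diff = 3.00                       # Psych Out - Mindbender mean difference
t_stat = diff / se
p_raw = 2 * t.sf(t_stat, df_err)  # unadjusted two-sided p-value
p_bonf = min(1.0, 3 * p_raw)      # Bonferroni over 3 pairwise comparisons
```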
Notes on ANOVA
• All analysis of variance (ANOVA) methods are
based on the assumptions of normally distributed
and independent errors
• The same problems can be described using the
regression framework. We get exactly the same
tests and results!
• There are many extensions beyond those
mentioned
• In fact, the book only briefly touches on this subject
• More material is needed in order to do two-way
ANOVA on your own
Next time:
• How to design a study?
• Different sampling methods
• Research designs
• Sample size considerations