
Comparing Means

t test, Mann-Whitney,
ANOVA, and Kruskal-Wallis

Dr. Pudji Lestari, dr, MKes
IKM-KP FK UNAIR
Example cases
• Comparing two groups of patients given different
therapies: independent samples, continuous outcome
• Comparing the same subjects before and after
treatment: related/paired samples, continuous outcome
Distributional assumptions

                         Normal               Not normal
Independent
  Two groups             Independent t test   Mann-Whitney
  More than two groups   ANOVA                Kruskal-Wallis
Related/paired
  Two groups             Paired t test        Wilcoxon signed-rank test
Choosing Statistical Procedures
(by measurement scale of the dependent variable and design)

One independent variable, two levels:
• Two independent groups: independent t-test (interval/ratio); Mann-Whitney U (ordinal); Chi-Square (nominal)
• Two dependent groups: dependent t-test (interval/ratio); Wilcoxon (ordinal)
One independent variable, more than 2 levels:
• Multiple independent groups: one-way ANOVA (interval/ratio); Kruskal-Wallis (ordinal); Chi-Square (nominal)
• Multiple dependent groups: repeated-measures ANOVA (interval/ratio); Friedman (ordinal)
Two independent variables (factorial designs):
• Independent groups: two-factor ANOVA (interval/ratio); Chi-Square (nominal)
• Dependent groups: two-factor repeated-measures ANOVA (interval/ratio)


Independent Samples
• Different individuals in the two sample groups
• Sample sizes may be equal or unequal
• Large-sample inference based on Normal
Distribution (Central Limit Theorem)
• Small-sample inference depends on distribution
of individual outcomes (Normal vs non-Normal)
Parameters/Estimates (Independent
Samples)
• Parameter: \mu_1 - \mu_2
• Estimator: \bar{Y}_1 - \bar{Y}_2
• Estimated standard error: \sqrt{S_1^2/n_1 + S_2^2/n_2}
• Shape of sampling distribution:
– Normal if data are normal
– Approximately normal if n1,n2>20
– Non-normal otherwise (typically)
The Normal Distribution: Overview
[Figure: normal curves illustrating a change in the mean and a change in the variance]
Two Sample Problems

• Is \mu_A = \mu_B?  A comparison of the means of the two
  probability distributions (population A vs population B).
• Is \sigma_A^2 = \sigma_B^2?  A comparison of the variances of the
  two probability distributions.

[Figure: paired density curves for populations A and B illustrating each comparison]
Example: Vitamin C for Common Cold
• Outcome: number of colds during the study period for each student
• Group 1, given placebo:
  \bar{y}_1 = 2.2, s_1 = 0.12, n_1 = 155
• Group 2, given ascorbic acid (vitamin C):
  \bar{y}_2 = 1.9, s_2 = 0.10, n_2 = 208

Source: Pauling (1971)


2-Sided Test to Compare Groups
• H_0: \mu_1 - \mu_2 = 0 (no difference in treatment effects)
• H_A: \mu_1 - \mu_2 \neq 0 (difference in treatment effects)
• Test statistic:

  z_{obs} = ((2.2 - 1.9) - 0) / \sqrt{(0.12)^2/155 + (0.10)^2/208}
          = 0.3 / 0.0119 = 25.3

• Decision rule (\alpha = 0.05):
  – Conclude \mu_1 - \mu_2 > 0, since z_{obs} = 25.3 > z_{.025} = 1.96
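
These numbers are easy to verify; a minimal Python sketch (SciPy assumed available) using the summary statistics from the slide:

```python
# Two-sample z-test for the vitamin C example (large samples).
from math import sqrt
from scipy.stats import norm

y1, s1, n1 = 2.2, 0.12, 155   # placebo group summary statistics
y2, s2, n2 = 1.9, 0.10, 208   # vitamin C group summary statistics

se = sqrt(s1**2 / n1 + s2**2 / n2)     # estimated standard error of the difference
z_obs = (y1 - y2 - 0) / se             # test statistic under H0: mu1 - mu2 = 0
p_two_sided = 2 * norm.sf(abs(z_obs))  # two-sided p-value from the normal tail

print(z_obs)  # ~25.3, far beyond z_.025 = 1.96, so reject H0
```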
Test for \mu_1 - \mu_2:
Normal Populations
• Case 1: common variances (\sigma_1^2 = \sigma_2^2 = \sigma^2)
• Null hypothesis: H_0: \mu_1 - \mu_2 = \Delta_0
• Alternative hypotheses:
  – 1-sided: H_A: \mu_1 - \mu_2 > \Delta_0
  – 2-sided: H_A: \mu_1 - \mu_2 \neq \Delta_0
• Test statistic (where S_p^2 is a "pooled" estimate of \sigma^2):

  t_{obs} = ((\bar{y}_1 - \bar{y}_2) - \Delta_0) / \sqrt{S_p^2 (1/n_1 + 1/n_2)}

  S_p^2 = ((n_1 - 1)S_1^2 + (n_2 - 1)S_2^2) / (n_1 + n_2 - 2)
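
A minimal sketch of this pooled test in Python; the two samples here are hypothetical numbers invented for illustration:

```python
# Pooled-variance two-sample t-test (Case 1: equal variances).
import numpy as np
from scipy import stats

# Hypothetical illustrative samples (not from the slides).
group1 = np.array([5.1, 6.2, 5.8, 6.0, 5.5])
group2 = np.array([6.8, 7.1, 6.5, 7.4, 6.9])

n1, n2 = len(group1), len(group2)
sp2 = ((n1 - 1) * group1.var(ddof=1) + (n2 - 1) * group2.var(ddof=1)) / (n1 + n2 - 2)
t_obs = (group1.mean() - group2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))

# scipy's ttest_ind with equal_var=True uses the same pooled estimate:
t_scipy, p_scipy = stats.ttest_ind(group1, group2, equal_var=True)
print(t_obs, t_scipy, p_scipy)  # the two t statistics agree
```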
Test for \mu_1 - \mu_2:
Normal Populations
• Decision rule (based on the t-distribution with \nu = n_1 + n_2 - 2 df):
  – 1-sided alternative
    • If t_{obs} \geq t_{\alpha,\nu} ==> conclude \mu_1 - \mu_2 > \Delta_0
    • If t_{obs} < t_{\alpha,\nu} ==> do not reject H_0
  – 2-sided alternative
    • If t_{obs} \geq t_{\alpha/2,\nu} ==> conclude \mu_1 - \mu_2 > \Delta_0
    • If t_{obs} \leq -t_{\alpha/2,\nu} ==> conclude \mu_1 - \mu_2 < \Delta_0
    • If -t_{\alpha/2,\nu} < t_{obs} < t_{\alpha/2,\nu} ==> do not reject H_0
Test for \mu_1 - \mu_2:
Normal Populations
• Observed significance level (P-value)
• If the P-value \leq \alpha, reject the null hypothesis
Inference for \mu_1 - \mu_2:
Normal Populations
• Case 2: \sigma_1^2 \neq \sigma_2^2
• Don't pool the variances:

  S_{\bar{y}_1 - \bar{y}_2} = \sqrt{S_1^2/n_1 + S_2^2/n_2}

• Use "adjusted" degrees of freedom (Satterthwaite's approximation):

  n^* = (S_1^2/n_1 + S_2^2/n_2)^2 /
        [ (S_1^2/n_1)^2/(n_1 - 1) + (S_2^2/n_2)^2/(n_2 - 1) ]
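
A sketch of the same computation in Python, again on hypothetical data; scipy's equal_var=False option applies this Welch/Satterthwaite procedure:

```python
# Welch t-test with Satterthwaite's adjusted degrees of freedom (Case 2).
import numpy as np
from scipy import stats

# Hypothetical illustrative samples with unequal spread (not from the slides).
group1 = np.array([12.1, 14.3, 13.8, 15.0, 12.9, 14.1])
group2 = np.array([10.2, 16.5, 9.8, 17.1, 11.4])

v1 = group1.var(ddof=1) / len(group1)   # S1^2 / n1
v2 = group2.var(ddof=1) / len(group2)   # S2^2 / n2
n_star = (v1 + v2) ** 2 / (v1**2 / (len(group1) - 1) + v2**2 / (len(group2) - 1))

t_obs = (group1.mean() - group2.mean()) / np.sqrt(v1 + v2)

# equal_var=False tells scipy to apply the same Welch/Satterthwaite procedure.
t_scipy, p_scipy = stats.ttest_ind(group1, group2, equal_var=False)
print(n_star, t_obs, t_scipy, p_scipy)
```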
SPSS Output
[SPSS output screenshots not reproduced]
Post hoc tests
• Equal variances: Bonferroni, LSD, Tukey, Duncan, etc.
• Unequal variances: Games-Howell; Dunnett C and T3; Tamhane
A sketch of one of these procedures follows below.
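
As an illustration of one equal-variance option, a sketch using SciPy's tukey_hsd (available in SciPy 1.8 and later) on the blister-treatment data that appears in the ANOVA example later in this deck:

```python
# Tukey's HSD post hoc test (one of the equal-variance options above).
# Requires SciPy >= 1.8 for stats.tukey_hsd.
from scipy import stats

# Blister-healing data from the ANOVA example later in this deck.
a = [5, 6, 6, 7, 7, 8, 9, 10]
b = [7, 7, 8, 9, 9, 10, 10, 11]
p = [7, 9, 9, 10, 10, 10, 11, 12, 13]

res = stats.tukey_hsd(a, b, p)
print(res)  # pairwise mean differences with adjusted p-values
# Games-Howell (for unequal variances) is not in SciPy; it is available
# in third-party packages such as pingouin (pairwise_gameshowell).
```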
Mann Whitney U Test
• Nonparametric equivalent of
the independent t test

– Two independent groups


– Ordinal measurement of the DV
– The sampling distribution of U is
known and is used to test
hypotheses in the same way as
the t distribution.
Mann Whitney U Test
• To compute the Mann-Whitney U:
  – Rank the scores of both groups together, from highest to lowest.
  – Sum the ranks of the scores in each group.
  – The rank sums for each group are used to make the statistical
    comparison (see the sketch below).

  Income  Rank    No Income  Rank
    25     12        27       10
    32      5        19       17
    36      3        16       20
    40      1        33        4
    22     14        30        7
    37      2        17       19
    20     16        21       15
    18     18        23       13
    31      6        26       11
    29      8        28        9
  Sum      85       Sum      125
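
A sketch reproducing this example with SciPy; note that scipy reports U for the first sample, while SPSS prints the smaller of the two U values:

```python
# Mann-Whitney U test on the income example above.
from scipy import stats

income =    [25, 32, 36, 40, 22, 37, 20, 18, 31, 29]
no_income = [27, 19, 16, 33, 30, 17, 21, 23, 26, 28]

u1, p = stats.mannwhitneyu(income, no_income, alternative='two-sided')
u2 = len(income) * len(no_income) - u1
print(min(u1, u2))  # 30.0, the U value SPSS reports
print(p)            # ~0.14 here (scipy applies a continuity correction;
                    # SPSS's uncorrected asymptotic value is .131)
```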
Non-Directional Hypotheses
• Null Hypothesis: There is no difference
in scores of the two groups (i.e. the sum
of ranks for group 1 is no different than
the sum of ranks for group 2).
• Alternative Hypothesis: There is a
difference between the scores of the
two groups (i.e. the sum of ranks for
group 1 is significantly different from the
sum of ranks for group 2).
Computing the Mann Whitney U Using
SPSS
• Enter data into SPSS spreadsheet; two columns
 1st column: groups; 2nd column: scores
(ratings)
• Analyze  Nonparametric  2 Independent
Samples
• Select the independent variable and move it to
the Grouping Variable box  Click Define Groups
 Enter 1 for group 1 and 2 for group 2
• Select the dependent variable and move it to the
Test Variable box  Make sure Mann Whitney is
selected  Click OK
Interpreting the Output

Ranks
  Equal Rights Attitudes   Income Status      N    Mean Rank   Sum of Ranks
                           Income Producing   10   12.50       125.00
                           No Income          10    8.50        85.00
                           Total              20

Test Statistics(b)
                                   Equal Rights Attitudes
  Mann-Whitney U                   30.000
  Wilcoxon W                       85.000
  Z                                -1.512
  Asymp. Sig. (2-tailed)           .131
  Exact Sig. [2*(1-tailed Sig.)]   .143 (a)
  a. Not corrected for ties.
  b. Grouping Variable: Income Status

The output provides a z-score equivalent of the Mann-Whitney U
statistic. It also gives significance levels for both a one-tailed
and a two-tailed hypothesis.
Shall we continue?
ANOVA requirements:
• Dependent variable:
  - normally distributed data, with no outliers
  - homogeneous variances (Levene test; see the sketch below)
  (If these requirements are not met, use the Kruskal-Wallis test.)
• Independence between groups:
  - observations are independent of one another
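
A minimal sketch of the Levene check in Python, using the blister-healing data from the example that follows:

```python
# Levene's test for homogeneity of variances (an ANOVA requirement).
from scipy import stats

# Blister-healing data from the example below.
a = [5, 6, 6, 7, 7, 8, 9, 10]
b = [7, 7, 8, 9, 9, 10, 10, 11]
p = [7, 9, 9, 10, 10, 10, 11, 12, 13]

stat, pval = stats.levene(a, b, p)
print(stat, pval)  # a large p-value gives no evidence of unequal variances
```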
ANOVA
• Omnibus test: a significant result indicates that there is a
difference somewhere among the observed groups (at least between
two of them)
• Follow up with a post hoc test to learn which groups differ; the
choice of test depends on the homogeneity of the variances
An example ANOVA situation
Subjects: 25 patients with blisters
Treatments: Treatment A, Treatment B, Placebo
Measurement: # of days until blisters heal

Data [and means]:


• A: 5,6,6,7,7,8,9,10 [7.25]
• B: 7,7,8,9,9,10,10,11 [8.875]
• P: 7,9,9,10,10,10,11,12,13 [10.11]

Are these differences significant?


Informal Investigation
Graphical investigation:
• side-by-side box plots
• multiple histograms

Whether the differences between the groups are


significant depends on
• the difference in the means
• the standard deviations of each group
• the sample sizes

ANOVA determines P-value from the F statistic


Side by Side Boxplots
[Figure: side-by-side boxplots of days to heal (roughly 5 to 13) for treatments A, B, and P]
What does ANOVA do?
At its simplest ANOVA tests the following
hypotheses:
H0: The means of all the groups are equal.
Ha: Not all the means are equal
• doesn’t say how or which ones differ.
• Can follow up with “multiple comparisons”

Note: we usually refer to the sub-populations as “groups”


when doing ANOVA.
Assumptions of ANOVA
• each group is approximately normal
· check this by looking at histograms and/or
normal quantile plots, or use assumptions
· can handle some nonnormality, but not severe
outliers
• standard deviations of each group are
approximately equal
· rule of thumb: ratio of largest to smallest
sample st. dev. must be less than 2:1
Normality Check

We should check for normality using:

• assumptions about the population
• histograms for each group
• a normal quantile plot for each group

With such small data sets, there isn't a good way to check normality
from the data alone, but we make the common assumption that physical
measurements of people tend to be normally distributed.
Standard Deviation Check

Variable treatment N Mean Median StDev


days A 8 7.250 7.000 1.669
B 8 8.875 9.000 1.458
P 9 10.111 10.000 1.764

Compare the largest and smallest standard deviations:

• largest: 1.764
• smallest: 1.458
• 1.458 x 2 = 2.916 > 1.764, so the 2:1 rule of thumb is satisfied
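
The same check in a few lines of Python (a sketch; NumPy assumed):

```python
# Rule-of-thumb check: largest sample SD must be under twice the smallest.
import numpy as np

a = [5, 6, 6, 7, 7, 8, 9, 10]
b = [7, 7, 8, 9, 9, 10, 10, 11]
p = [7, 9, 9, 10, 10, 10, 11, 12, 13]

sds = [np.std(g, ddof=1) for g in (a, b, p)]
print([round(s, 3) for s in sds])   # [1.669, 1.458, 1.764] as in the table
print(max(sds) < 2 * min(sds))      # True: the 2:1 rule of thumb holds
```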
Notation for ANOVA
• n = number of individuals all together
• I = number of groups
• \bar{x} = mean for the entire data set

Group i has:
• n_i = # of individuals in group i
• x_{ij} = value for individual j in group i
• \bar{x}_i = mean for group i
• s_i = standard deviation for group i
How ANOVA works (outline)
ANOVA measures two sources of variation in the data and compares
their relative sizes:

• variation BETWEEN groups
  • for each data value, look at the difference between its group
    mean and the overall mean: (\bar{x}_i - \bar{x})^2

• variation WITHIN groups
  • for each data value, look at the difference between that value
    and the mean of its group: (x_{ij} - \bar{x}_i)^2

The ANOVA F-statistic is the ratio of the between-group variation to
the within-group variation:

  F = Between / Within = MSG / MSE

A large F is evidence against H0, since it indicates that there is more
difference between groups than within groups.
How are these computations made?

We want to measure the amount of variation due to BETWEEN-group
variation and WITHIN-group variation.

For each data value, we calculate its contribution to:
• BETWEEN-group variation: (\bar{x}_i - \bar{x})^2
• WITHIN-group variation: (x_{ij} - \bar{x}_i)^2
An even smaller example
Suppose we have three groups
• Group 1: 5.3, 6.0, 6.7
• Group 2: 5.5, 6.2, 6.4, 5.7
• Group 3: 7.5, 7.2, 7.9
We get the following statistics:

SUMMARY
Groups Count Sum Average Variance
Column 1 3 18 6 0.49
Column 2 4 23.8 5.95 0.176667
Column 3 3 22.6 7.533333 0.123333
Excel ANOVA Output
ANOVA
Source of Variation  SS        df  MS        F         P-value   F crit
Between Groups       5.127333   2  2.563667  10.21575  0.008394  4.737416
Within Groups        1.756667   7  0.250952

Total                6.884      9

Reading the df column:
• Between Groups: 1 less than the number of groups
• Within Groups: number of data values - number of groups (equals the
  df for each group added together)
• Total: 1 less than the number of individuals (just like other situations)
Computing ANOVA F statistic
WITHIN BETWEEN
difference: difference
group data - group mean group mean - overall mean
data group mean plain squared plain squared
5.3 1 6.00 -0.70 0.490 -0.4 0.194
6.0 1 6.00 0.00 0.000 -0.4 0.194
6.7 1 6.00 0.70 0.490 -0.4 0.194
5.5 2 5.95 -0.45 0.203 -0.5 0.240
6.2 2 5.95 0.25 0.063 -0.5 0.240
6.4 2 5.95 0.45 0.203 -0.5 0.240
5.7 2 5.95 -0.25 0.063 -0.5 0.240
7.5 3 7.53 -0.03 0.001 1.1 1.188
7.2 3 7.53 -0.33 0.109 1.1 1.188
7.9 3 7.53 0.37 0.137 1.1 1.188
TOTAL 1.757 5.106
TOTAL/df 0.25095714 2.55275

overall mean: 6.44;  F = 2.55275 / 0.25096 ≈ 10.2 (Excel, working from unrounded values, gives F = 10.21575)
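
SciPy reproduces the Excel result directly from the raw data (a sketch):

```python
# One-way ANOVA on the three small groups, reproducing the Excel output.
from scipy import stats

g1 = [5.3, 6.0, 6.7]
g2 = [5.5, 6.2, 6.4, 5.7]
g3 = [7.5, 7.2, 7.9]

f, p = stats.f_oneway(g1, g2, g3)
print(f, p)  # F ~ 10.216, p ~ 0.0084, matching the Excel ANOVA table
```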


Minitab ANOVA Output
Analysis of Variance for days
Source     DF     SS     MS     F      P
treatment   2  34.74  17.37  6.45  0.006
Error      22  59.26   2.69
Total      24  94.00

Reading the DF column:
• treatment: 1 less than the # of groups
• Error: # of data values - # of groups (equals the df for each group
  added together)
• Total: 1 less than the # of individuals (just like other situations)
Minitab ANOVA Output
Analysis of Variance for days
Source     DF     SS     MS     F      P
treatment   2  34.74  17.37  6.45  0.006
Error      22  59.26   2.69
Total      24  94.00

SST = \sum_{obs} (x_{ij} - \bar{x})^2
SSE = \sum_{obs} (x_{ij} - \bar{x}_i)^2
SSG = \sum_{obs} (\bar{x}_i - \bar{x})^2

SS stands for "sum of squares"; ANOVA splits the total SS into its
between-group and within-group parts (SSG + SSE = SST).
Minitab ANOVA Output
Analysis of Variance for days
Source     DF     SS     MS     F      P
treatment   2  34.74  17.37  6.45  0.006
Error      22  59.26   2.69
Total      24  94.00

MSG = SSG / DFG
MSE = SSE / DFE
F = MSG / MSE; the P-value comes from the F(DFG, DFE) distribution.

(P-values for the F statistic are in Table E.)
So how big is F?
Since F is Mean Square Between / Mean Square Within = MSG / MSE,
a large value of F indicates relatively more difference between
groups than within groups (evidence against H0).

To get the P-value, we compare the statistic to the F(I-1, n-I)
distribution:
• I - 1 degrees of freedom in the numerator (# of groups - 1)
• n - I degrees of freedom in the denominator (the rest of the df)
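
A one-line version of this lookup in Python, using the blister example's F and degrees of freedom instead of Table E (a sketch):

```python
# Upper-tail p-value for an observed F under the F(I-1, n-I) distribution.
from scipy import stats

f_obs, df_groups, df_error = 6.45, 2, 22   # blister example: I = 3, n = 25
p = stats.f.sf(f_obs, df_groups, df_error)
print(p)  # ~0.006, matching the Minitab output
```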
Pooled estimate for st. dev.
One of the ANOVA assumptions is that all groups have the same
standard deviation. We can estimate this with a weighted average:

s_p^2 = ((n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + ... + (n_I - 1)s_I^2) / (n - I)

s_p^2 = ((df_1)s_1^2 + (df_2)s_2^2 + ... + (df_I)s_I^2) / (df_1 + df_2 + ... + df_I)

s_p^2 = SSE / DFE = MSE, so MSE is the pooled estimate of variance.

In Summary

SST = \sum_{obs} (x_{ij} - \bar{x})^2 = s^2 (DFT)

SSE = \sum_{obs} (x_{ij} - \bar{x}_i)^2 = \sum_{groups} s_i^2 (df_i)

SSG = \sum_{obs} (\bar{x}_i - \bar{x})^2 = \sum_{groups} n_i (\bar{x}_i - \bar{x})^2

SSE + SSG = SST;  MS = SS / DF;  F = MSG / MSE
Understanding ANOVA visually:
http://www.psych.utah.edu/stat/introstats/anovaflash.html
Kruskal-Wallis (extended Mann-Whitney test)
• Dependent variable: ordinal, or interval/ratio data that do not
meet the parametric requirements
• Independent variable: more than 2 categories; observations
independent of one another (see the sketch below)
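
A minimal sketch in Python, applied to the blister-healing data from the ANOVA example above:

```python
# Kruskal-Wallis test, the nonparametric counterpart of one-way ANOVA.
from scipy import stats

# Blister-healing data from the ANOVA example above.
a = [5, 6, 6, 7, 7, 8, 9, 10]
b = [7, 7, 8, 9, 9, 10, 10, 11]
p = [7, 9, 9, 10, 10, 10, 11, 12, 13]

h, pval = stats.kruskal(a, b, p)
print(h, pval)  # a significant H indicates at least two groups differ
```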
Clinically important and statistically important
