Non-Parametric Tests: B.H. Robbins Scholars Series
Non-parametric tests
Outline
Confidence Intervals
Summary
Introduction
- T-tests: tests for the means of continuous data
  - One sample: H0: µ = µ0 versus HA: µ ≠ µ0
  - Two sample: H0: µ1 − µ2 = 0 versus HA: µ1 − µ2 ≠ 0
- Underlying these tests is the assumption that the data arise
  from a normal distribution
- T-tests do not actually require normally distributed data to
  perform reasonably well in most circumstances
- Parametric methods: assume the data arise from a
  distribution described by a few parameters (e.g., a Normal
  distribution with mean µ and variance σ²).
- Nonparametric methods: do not make parametric assumptions
  (most often based on ranks as opposed to raw values)
- We discuss non-parametric alternatives to the one and two
  sample t-tests.
- Extreme outliers
  - Example: t-test comparing two sets of measurements
    - Sample 1: 1 2 3 4 5 6 7 8 9 10
    - Sample 2: 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    - Sample averages: 5.5 and 13.5, t-test p-value p = 0.000019
  - Example: the same comparison with one extreme value (200) added to Sample 2
    - Sample 1: 1 2 3 4 5 6 7 8 9 10
    - Sample 2: 7 8 9 10 11 12 13 14 15 16 17 18 19 20 200
    - Sample averages: 5.5 and 25.9, t-test p-value p = 0.12
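The effect of the single value 200 can be seen by computing a t statistic and a rank-based statistic on both versions of the data. The sketch below is illustrative only: it assumes Welch's unequal-variance t statistic (the slides do not say which t-test variant produced the p-values), and the helper names `welch_t`, `midranks`, and `rank_sum_first` are made up for this example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch two-sample t statistic (assumed variant), mean(y) minus mean(x)."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se


def midranks(values):
    """Rank values 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_first(x, y):
    """Sum of the pooled-sample ranks that belong to the first sample."""
    return sum(midranks(list(x) + list(y))[: len(x)])


sample1 = list(range(1, 11))        # 1 .. 10
sample2 = list(range(7, 21))        # 7 .. 20
sample2_outlier = sample2 + [200]   # same data plus the extreme value

t_clean = welch_t(sample1, sample2)
t_outlier = welch_t(sample1, sample2_outlier)
w_clean = rank_sum_first(sample1, sample2)
w_outlier = rank_sum_first(sample1, sample2_outlier)

print(f"t statistic: {t_clean:.2f} without the outlier, {t_outlier:.2f} with it")
print(f"rank sum of Sample 1: {w_clean} without the outlier, {w_outlier} with it")
```

The outlier inflates the variance of Sample 2 and shrinks the t statistic, while the ranks of Sample 1 do not move at all, because 200 simply takes the largest rank.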
[Figure: Fecal Calprotectin in two groups (n = 8 and n = 18), with values above the detection limit marked.]
The large-sample efficiency of the Wilcoxon test compared to the t test is 3/π = 0.9549.
Non-parametric methods
One Sample Test: Wilcoxon Signed-Rank
- Sleep Dataset
  - Compare the effects of two soporific drugs.
  - Each subject receives Drug 1 and Drug 2
  - Study question: Is Drug 1 or Drug 2 more effective at
    increasing sleep?
  - Dependent variable: Difference in hours of sleep comparing
    Drug 2 to Drug 1
  - H0: For any given subject, the difference in hours of sleep is
    equally likely to be positive or negative
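Under this H0 every pattern of signs on the ranked absolute differences is equally likely, which is what the signed-rank test exploits. A minimal sketch follows, assuming the usual statistic W+ (sum of the ranks of the positive differences) and an exact two-sided p-value obtained by enumerating all sign patterns. The differences below are hypothetical values chosen for illustration, not the sleep data, and the function names are invented for this example.

```python
from itertools import product


def midranks(values):
    """Rank values 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_exact(diffs):
    """Return (W+, exact two-sided p-value) for paired differences."""
    d = [x for x in diffs if x != 0]          # zero differences carry no sign
    ranks = midranks([abs(x) for x in d])     # rank the absolute differences
    total = sum(ranks)
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    observed = min(w_plus, total - w_plus)
    # Under H0 each of the 2^n sign patterns is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=len(d)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            extreme += 1
    return w_plus, extreme / 2 ** len(d)


# Hypothetical Drug 2 minus Drug 1 differences in hours of sleep.
hypothetical_diffs = [0.5, -0.3, 1.1, 2.0, -0.8, 1.6, 0.9, 2.4]
w_plus, p_value = signed_rank_exact(hypothetical_diffs)
print(f"W+ = {w_plus}, exact two-sided p = {p_value:.4f}")
```

Full enumeration is only practical for small n; for larger samples a normal approximation to W+ is the usual choice.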
Two Sample Test: Wilcoxon–Mann–Whitney
$$z = \frac{W - \mu_W}{\sigma_W}$$
follow a N(0,1) distribution.
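The standardization above can be sketched in a few lines. The surviving slide text does not show how µ_W and σ_W are defined, so this example assumes the standard no-ties formulas µ_W = n1(n1 + n2 + 1)/2 and σ_W = sqrt(n1 n2 (n1 + n2 + 1)/12), takes W to be the rank sum of the first sample, and applies no tie or continuity correction. It uses the male/female measurements from the Confidence Intervals slides; `rank_sum_z` is a name invented here.

```python
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """Rank values 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(x, y):
    """Return (W, z, two-sided p) using the assumed no-ties formulas."""
    n1, n2 = len(x), len(y)
    w = sum(midranks(list(x) + list(y))[:n1])      # rank sum of first sample
    mu_w = n1 * (n1 + n2 + 1) / 2                  # assumed mean under H0
    sigma_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # assumed SD under H0
    z = (w - mu_w) / sigma_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


males = [124, 120, 133]
females = [120, 118, 121, 119]
w, z, p = rank_sum_z(males, females)
print(f"W = {w}, z = {z:.3f}, two-sided p = {p:.2f}")
```

With samples this small the normal approximation is rough, so the result should be read as an illustration of the arithmetic rather than a recommended analysis.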
$$C = \frac{\bar{R} - \frac{n_1 + 1}{2}}{n_2},$$
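The definition of R̄ did not survive in the text here. The sketch below assumes R̄ is the mean pooled-sample rank of the n1 observations in the first group, in which case C equals the proportion of (group 1, group 2) pairs where the group 1 value is larger, counting ties as one half. It checks that reading on the male/female measurements from the Confidence Intervals slides; both function names are invented for this example.

```python
def midranks(values):
    """Rank values 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def concordance_from_ranks(x, y):
    """C computed from the mean rank of the first group (assumed reading)."""
    n1, n2 = len(x), len(y)
    r_bar = sum(midranks(list(x) + list(y))[:n1]) / n1
    return (r_bar - (n1 + 1) / 2) / n2


def concordance_from_pairs(x, y):
    """Proportion of pairs with x above y, ties counted as one half."""
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return score / (len(x) * len(y))


males = [124, 120, 133]
females = [120, 118, 121, 119]
c_ranks = concordance_from_ranks(males, females)
c_pairs = concordance_from_pairs(males, females)
print(c_ranks, c_pairs)
```

Agreement between the two computations supports the assumed meaning of R̄, but it does not confirm that this is how the slides define it.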
[Figure: Fecal Calprotectin, with observations labeled by rank (1 to 18).]
Confidence Intervals
- Confidence intervals for the difference in two medians (two
  samples)
  - Considers all possible differences between sample 1 and sample 2

            Female
    Male    120   118   121   119
    124       4     6     3     5
    120       0     2    -1     1
    133      13    15    12    14

  - An estimate of the median difference (males - females) is the
    median of these 12 differences, with the 3rd and 10th largest
    values giving an (approximate) 95% CI
  - Median estimate = 4.5, 95% CI = [1, 13]
  - Specific formulas found in Altman, pages 40-41
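The table of differences and the resulting interval can be reproduced directly. This is a minimal sketch of the arithmetic on the slide; the positions 3 and 10 are taken from the slide as given, not derived from the Altman formulas.

```python
from statistics import median

males = [124, 120, 133]
females = [120, 118, 121, 119]

# All 3 x 4 = 12 pairwise differences (male minus female), sorted ascending.
diffs = sorted(m - f for m in males for f in females)

estimate = median(diffs)
lower = diffs[3 - 1]    # 3rd ordered value
upper = diffs[10 - 1]   # 10th ordered value

print(diffs)
print(f"median difference = {estimate}, approximate 95% CI = [{lower}, {upper}]")
```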
- Bootstrap
  - General method, not just for medians
  - Non-parametric, does not assume symmetry
  - Iterative method that repeatedly samples from the original data
  - Algorithm for creating a 95% CI for the difference in two
    medians
    1. Sample with replacement from sample 1 and sample 2
    2. Calculate the difference in medians, save result
    3. Repeat Steps 1 and 2 1000 times
  - A (naive) 95% CI is given by the 25th and 975th largest values
    (the 2.5th and 97.5th percentiles) of your 1000 median differences
  - For the male/female data, median estimate = 4.5, 95% CI =
    [-0.5, 14.5], which agrees with the conclusion from a WMW
    rank sum test (p = 0.11).
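The three steps above can be sketched as follows. This is an illustrative implementation of the naive percentile interval, not necessarily the code behind the numbers on the slide: the function name, the fixed seed, and the choice to resample each group at its original size are assumptions, and the interval obtained will vary with the seed.

```python
import random
from statistics import median


def bootstrap_median_diff_ci(x, y, reps=1000, seed=0):
    """Naive percentile 95% CI for median(x) - median(y)."""
    rng = random.Random(seed)            # fixed seed only for repeatability
    diffs = []
    for _ in range(reps):
        bx = rng.choices(x, k=len(x))    # Step 1: resample with replacement
        by = rng.choices(y, k=len(y))
        diffs.append(median(bx) - median(by))  # Step 2: save the difference
    diffs.sort()                         # Step 3 is the loop above
    lower = diffs[reps * 25 // 1000 - 1]     # 25th ordered value if reps = 1000
    upper = diffs[reps * 975 // 1000 - 1]    # 975th ordered value if reps = 1000
    return lower, upper


males = [124, 120, 133]
females = [120, 118, 121, 119]

estimate = median(males) - median(females)
lower, upper = bootstrap_median_diff_ci(males, females)
print(f"median difference = {estimate}, bootstrap 95% CI = [{lower}, {upper}]")
```

With only three and four observations the bootstrap distribution takes few distinct values, so the endpoints are coarse and depend on the random draws.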
Summary