Learning Objectives

Recognize the advantages and disadvantages of nonparametric statistics.
Understand how to use the runs test to test for randomness.
Know when and how to use the Mann-Whitney U test, the Wilcoxon matched-pairs signed rank test, the Kruskal-Wallis test, and the Friedman test.

Parametric vs. Nonparametric Statistics

Parametric statistics are statistical techniques based on assumptions about the population from which the sample data are collected.
  They assume that the data being analyzed are randomly selected from a normally distributed population.
  They require quantitative measurements that yield interval- or ratio-level data.
Nonparametric statistics are based on fewer assumptions about the population and the parameters.
  They are sometimes called "distribution-free" statistics.
  A variety of nonparametric statistics are available for use with nominal or ordinal data.

Advantages of Nonparametric Techniques

Sometimes there is no parametric alternative to the use of nonparametric statistics.
Certain nonparametric tests can be used to analyze nominal data.
Certain nonparametric tests can be used to analyze ordinal data.
The computations for nonparametric statistics are usually less complicated than those for parametric statistics, particularly for small samples.
Probability statements obtained from most nonparametric tests are exact probabilities.

Disadvantages of Nonparametric Statistics

Nonparametric tests can be wasteful of data if parametric tests are available for use with the data.
Nonparametric tests are usually not as widely available or as well known as parametric tests.
For large samples, the calculations for many nonparametric statistics can be tedious.

Mann-Whitney U Test

The Mann-Whitney U test is a nonparametric counterpart of the t test used to compare the means of two independent populations.
Nonparametric counterpart of the t test for independent samples
Does not require normally distributed populations
May be applied to ordinal data
Assumptions
  Independent samples
  At least ordinal data

Mann-Whitney U Test: Sample Size Consideration

Size of sample one: n1
Size of sample two: n2
If both n1 and n2 are ≤ 10, the small sample procedure is appropriate.
If either n1 or n2 is greater than 10, the large sample procedure is appropriate.

Mann-Whitney U Test: Small Sample Example-Demonstration

H0: µ1 = µ2
Ha: µ1 ≠ µ2
α = .05
If the final p-value < .05, reject H0.

Drug A   Drug B
20.10    26.19
19.80    23.88
22.36    25.50
18.75    21.64
21.90    24.85
22.96    25.30
20.75    24.12
         23.45

Pooled and ranked values (group H holds the Drug A column values, group E the Drug B column values):

Compensation   Rank   Group
18.75           1     H
19.80           2     H
20.10           3     H
20.75           4     H
21.64           5     E
21.90           6     H
22.36           7     H
22.96           8     H
23.45           9     E
23.88          10     E
24.12          11     E
24.85          12     E
25.30          13     E
25.50          14     E
26.19          15     E

W1 = 1 + 2 + 3 + 4 + 6 + 7 + 8 = 31
W2 = 5 + 9 + 10 + 11 + 12 + 13 + 14 + 15 = 89
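
The ranking step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `drug_a`, `drug_b`, and `average_ranks` are hypothetical and do not appear in the slides, and tied values (none occur in this data set) are given the average of the ranks they span.

```python
# Illustrative sketch; variable and function names are assumptions, not from the slides.
drug_a = [20.10, 19.80, 22.36, 18.75, 21.90, 22.96, 20.75]
drug_b = [26.19, 23.88, 25.50, 21.64, 24.85, 25.30, 24.12, 23.45]


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


# Rank the pooled sample, then sum the ranks belonging to each group.
ranks = average_ranks(drug_a + drug_b)
w1 = sum(ranks[:len(drug_a)])
w2 = sum(ranks[len(drug_a):])
print(w1, w2)
```

A quick sanity check is that W1 + W2 must equal n(n + 1)/2 for the pooled sample, here 15(16)/2 = 120.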
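
Mann-Whitney U Test: Small Sample Example

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - W_1 = (7)(8) + \frac{(7)(8)}{2} - 31 = 53$$

$$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - W_2 = (7)(8) + \frac{(8)(9)}{2} - 89 = 3$$

Since U2 < U1, U = 3.
p-value = .0011 × 2 (for a two-tailed test) = .0022
Because .0022 < .05, reject H0.

Mann-Whitney U Test: Formulas for Large Sample Case

$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - W_1$$

$$\mu_U = \frac{n_1 n_2}{2}$$

$$\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$

$$Z = \frac{U - \mu_U}{\sigma_U}$$

where: $n_1$ = number in group 1
       $n_2$ = number in group 2
       $W_1$ = sum of the ranks of the values in group 1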
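
As a cross-check on the small-sample arithmetic, the two U values can be computed directly from the rank sums. The function name `mann_whitney_u` is a hypothetical helper introduced here for illustration; the inputs are the sample sizes and rank sums from the drug example (n1 = 7, n2 = 8, W1 = 31, W2 = 89).

```python
# Illustrative sketch; mann_whitney_u is an assumed helper name, not from the slides.
def mann_whitney_u(n1, n2, w1, w2):
    """Return (U1, U2, U) where U is the smaller of the two statistics."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - w1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - w2
    return u1, u2, min(u1, u2)


u1, u2, u = mann_whitney_u(7, 8, 31, 89)
print(u1, u2, u)
```

The two statistics always satisfy U1 + U2 = n1·n2, which gives 53 + 3 = 56 here and is a convenient way to catch ranking mistakes.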
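
PBS and Non-PBS: Calculation of U

$$W_1 = 4 + 7 + 11 + 12 + 13 + 14 + 18 + 19.5 + 22 + 23 + 24 + 25 + 26 + 27 = 245.5$$

$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - W_1 = (14)(13) + \frac{(14)(15)}{2} - 245.5 = 41.5$$

PBS and Non-PBS: Conclusion

$$\mu_U = \frac{n_1 n_2}{2} = \frac{(14)(13)}{2} = 91$$

$$\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(14)(13)(28)}{12}} = 20.6$$

$$Z = \frac{U - \mu_U}{\sigma_U} = \frac{41.5 - 91}{20.6} = -2.40$$

$|Z_{Cal}| = 2.40 > 1.96$, reject H0.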
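
The large-sample calculation can be reproduced with a short Python sketch. The function name `mann_whitney_z` and the list name `ranks_group1` are assumptions made for illustration; the ranks are the fourteen values summed in W1 on the slide, and the sample sizes are n1 = 14 and n2 = 13.

```python
import math


# Illustrative sketch; names are assumptions, not from the slides.
def mann_whitney_z(n1, n2, w1):
    """Large-sample Mann-Whitney: return (U, mean of U, std. dev. of U, Z)."""
    u = n1 * n2 + n1 * (n1 + 1) / 2 - w1
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, mu_u, sigma_u, (u - mu_u) / sigma_u


ranks_group1 = [4, 7, 11, 12, 13, 14, 18, 19.5, 22, 23, 24, 25, 26, 27]
u, mu_u, sigma_u, z = mann_whitney_z(14, 13, sum(ranks_group1))
print(u, mu_u, round(sigma_u, 1), round(z, 2))

# Two-tailed decision at alpha = .05 uses the critical value 1.96.
reject_h0 = abs(z) > 1.96
```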

Wilcoxon Matched-Pairs Signed Rank Test

Wilcoxon Matched-Pairs Signed Rank Test: Sample Size Consideration

n is the number of matched pairs.
If n > 15, T is approximately normally distributed, and a Z test is used.
If n ≤ 15, a special "small sample" procedure is followed.
The paired data are randomly selected.
The underlying distributions are symmetric.

Wilcoxon Matched-Pairs Signed Rank Test: Small Sample Example

Consider the survey by American Demographics that estimated the average annual household spending on healthcare. The U.S. metropolitan average was $1,800.
Suppose six families in Pittsburgh, Pennsylvania, are matched demographically with six families in Oakland, California, and their amounts of household spending on healthcare for last year are obtained.

Wilcoxon Matched-Pairs Signed Rank Test: Small Sample Example

H0: Md = 0
Ha: Md ≠ 0
n = 6
α = 0.05
If T observed ≤ 1, reject H0.

Family Pair   Sample A   Sample B     d    Rank
1             1,950      1,760      190    +4
2             1,840      1,870      -30    -1
3             2,015      1,810      205    +5
4             1,580      1,660      -80    -2
5             1,790      1,340      450    +6
6             1,925      1,765      160    +3

T = minimum(T+, T-)
T+ = 4 + 5 + 6 + 3 = 18
T- = 1 + 2 = 3
T = 3
T = 3 > Tcrit = 1, do not reject H0.
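
The signed-rank procedure in this example can be sketched as follows. The names `sample_a`, `sample_b`, and `wilcoxon_t` are hypothetical. The sketch drops zero differences and averages tied ranks; both are assumptions about conventional practice that the slides do not state, and neither case arises in this data set.

```python
# Illustrative sketch; names and tie/zero handling are assumptions, not from the slides.
sample_a = [1950, 1840, 2015, 1580, 1790, 1925]
sample_b = [1760, 1870, 1810, 1660, 1340, 1765]


def wilcoxon_t(a, b):
    """Return (T+, T-, T) for paired samples, with T = min(T+, T-)."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences dropped
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        # Group equal absolute differences so they share an average rank.
        while end + 1 < len(order) and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus, min(t_plus, t_minus)


t_plus, t_minus, t = wilcoxon_t(sample_a, sample_b)
print(t_plus, t_minus, t)

# Decision rule from the slide: reject H0 only if T <= 1 (n = 6, alpha = 0.05).
reject_h0 = t <= 1
```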

Wilcoxon Matched-Pairs Signed Rank Test: Large Sample Formulas

For large samples, the T statistic is approximately normally distributed and a z score can be used as the test statistic.

$$\mu_T = \frac{n(n+1)}{4}$$

$$\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}}$$

$$Z = \frac{T - \mu_T}{\sigma_T}$$

where: n = number of pairs
       T = total ranks for either + or - differences, whichever is less

Example

H0: Md = 0
Ha: Md ≠ 0
α = .05
If Z < -1.96 or Z > 1.96, reject H0.

City   1979   2011      d    Rank
1      20.3   22.8   -2.5    -8
2      19.5   12.7    6.8    17
3      18.6   14.1    4.5    13
4      20.9   16.1    4.8    15
5      19.9   25.2   -5.3   -16
6      18.6   20.2   -1.6    -4
7      19.6   14.9    4.7    14
8      23.2   21.3    1.9     6.5
9      21.8   18.7    3.1    10
10     20.3   20.9   -0.6    -1
11     19.2   22.6   -3.4   -11.5
12     19.5   16.9    2.6     9
13     18.7   20.6   -1.9    -6.5
14     17.7   18.5   -0.8    -2
15     21.6   23.4   -1.8    -5
16     22.4   21.3    1.1     3
17     20.8   17.4    3.4    11.5
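
T Calculation

$$T_- = 8 + 16 + 4 + 1 + 11.5 + 6.5 + 2 + 5 = 54$$

$$T = \text{minimum}(T_+, T_-) = \text{minimum}(99, 54) = 54$$

$$\mu_T = \frac{n(n+1)}{4} = \frac{(17)(18)}{4} = 76.5$$

$$\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}} = \sqrt{\frac{(17)(18)(35)}{24}} = 21.1$$

$$Z = \frac{T - \mu_T}{\sigma_T} = \frac{54 - 76.5}{21.1} = -1.07$$

Conclusion

$-1.96 \le Z_{Cal} = -1.07 \le 1.96$, do not reject H0.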
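
The large-sample Wilcoxon calculation can be checked with the sketch below. It starts from the signed ranks listed in the 17-city table rather than re-ranking the raw data; the names `signed_ranks` and `wilcoxon_z` are assumptions made for illustration.

```python
import math

# Signed ranks for cities 1..17, copied from the table in the example.
signed_ranks = [-8, 17, 13, 15, -16, -4, 14, 6.5, 10,
                -1, -11.5, 9, -6.5, -2, -5, 3, 11.5]


# Illustrative sketch; wilcoxon_z is an assumed helper name, not from the slides.
def wilcoxon_z(ranks):
    """Return (T+, T-, T, Z) using the large-sample normal approximation."""
    n = len(ranks)
    t_plus = sum(r for r in ranks if r > 0)
    t_minus = -sum(r for r in ranks if r < 0)
    t = min(t_plus, t_minus)
    mu_t = n * (n + 1) / 4
    sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return t_plus, t_minus, t, (t - mu_t) / sigma_t


t_plus, t_minus, t, z = wilcoxon_z(signed_ranks)
print(t_plus, t_minus, t, round(z, 2))

# Two-tailed decision at alpha = .05.
reject_h0 = abs(z) > 1.96
```

Because T+ + T- must equal n(n + 1)/2 = 153, the pair (99, 54) also confirms that the ranks in the table are internally consistent.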

Kruskal-Wallis Test

A nonparametric alternative to one-way analysis of variance
May be used to analyze ordinal data
No assumed population shape
Assumes that the C groups are independent
Assumes random selection of individual items

$$K = \frac{12}{n(n+1)} \sum_{j=1}^{C} \frac{T_j^2}{n_j} - 3(n+1)$$

where: C = number of groups
       n = total number of items
       $T_j$ = total of ranks in a group
       $n_j$ = number of items in a group
       $K \approx \chi^2$, with df = C - 1

Number of Patients per Day per Physician in Three Organizational Categories

Suppose a researcher wants to determine whether the number of physicians in an office produces significant differences in the number of office patients seen by each physician per day.
She takes a random sample of physicians from practices in which (1) there are only two partners, (2) there are three or more partners, or (3) the office is a health maintenance organization (HMO).

Ho: The three populations are identical
Ha: At least one of the three populations is different
α = 0.05
df = C - 1 = 3 - 1 = 2
$\chi^2_{.05,2} = 5.991$
If K > 5.991, reject Ho.

Two Partners   Three or More Partners   HMO
13             24                       26
15             16                       22
20             19                       31
18             22                       27
23             25                       28
               14                       33
               17

Patients per Day Data: Kruskal-Wallis Preliminary Calculations

Two Partners        Three or More Partners   HMO
Patients   Rank     Patients   Rank          Patients   Rank
13          1       24         12            26         14
15          3       16          4            22          9.5
20          8       19          7            31         17
18          6       22          9.5          27         15
23         11       25         13            28         16
                    14          2            33         18
                    17          5
T1 = 29             T2 = 52.5                T3 = 89.5
n1 = 5              n2 = 7                   n3 = 6

n = n1 + n2 + n3 = 5 + 7 + 6 = 18

Patients per Day Data: Kruskal-Wallis Calculations and Conclusion

$$K = \frac{12}{n(n+1)} \sum_{j=1}^{C} \frac{T_j^2}{n_j} - 3(n+1)$$

$$K = \frac{12}{18(18+1)} \left( \frac{29^2}{5} + \frac{52.5^2}{7} + \frac{89.5^2}{6} \right) - 3(18+1)$$

$$K = \frac{12}{18(18+1)} (1{,}897) - 3(18+1) = 9.56$$

$\chi^2_{.05,2} = 5.991$
K = 9.56 > 5.991, reject Ho.
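
The Kruskal-Wallis steps above (pool, rank with averaged ties, total the ranks per group, apply the K formula) can be sketched in Python. The names `groups`, `average_ranks`, and `kruskal_wallis_k` are hypothetical, and the sketch follows the slide's formula without any tie correction.

```python
# Illustrative sketch; names are assumptions, and no tie correction is applied.
groups = [
    [13, 15, 20, 18, 23],            # two partners
    [24, 16, 19, 22, 25, 14, 17],    # three or more partners
    [26, 22, 31, 27, 28, 33],        # HMO
]


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def kruskal_wallis_k(samples):
    """Return (rank totals per group, K) for a list of samples."""
    pooled = [x for sample in samples for x in sample]
    ranks = average_ranks(pooled)
    n = len(pooled)
    totals, offset = [], 0
    for sample in samples:
        totals.append(sum(ranks[offset:offset + len(sample)]))
        offset += len(sample)
    weighted = sum(t * t / len(s) for t, s in zip(totals, samples))
    return totals, 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


rank_totals, k = kruskal_wallis_k(groups)
print(rank_totals, round(k, 2))

# Compare with the chi-square critical value for df = 2 at alpha = .05.
reject_ho = k > 5.991
```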

Friedman Test

A nonparametric alternative to the randomized block design

Assumptions
  The blocks are independent.
  There is no interaction between blocks and treatments.
  Observations within each block can be ranked.

Hypotheses
  Ho: The treatment populations are equal
  Ha: At least one treatment population yields larger values than at least one other treatment population

$$\chi_r^2 = \frac{12}{bc(c+1)} \sum_{j=1}^{c} R_j^2 - 3b(c+1)$$

where: c = number of treatment levels (columns)
       b = number of blocks (rows)
       $R_j$ = total ranks for a particular treatment level
       j = particular treatment level
       $\chi_r^2 \approx \chi^2$, with df = c - 1

Friedman Test:

Ho: The supplier populations are equal
Ha: At least one supplier population yields larger values than at least one other supplier population
α = 0.05
df = c - 1 = 4 - 1 = 3
$\chi^2_{.05,3} = 7.81473$
If $\chi_r^2 > 7.81473$, reject Ho.

            Supplier 1   Supplier 2   Supplier 3   Supplier 4
Monday      62           63           57           61
Tuesday     63           61           59           65
Wednesday   61           62           56           63
Thursday    62           60           57           64
Friday      64           63           58           66
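
Friedman Test:

Ranks within each block (day):

            Supplier 1   Supplier 2   Supplier 3   Supplier 4
Monday      3            4            1            2
Tuesday     3            2            1            4
Wednesday   2            3            1            4
Thursday    3            2            1            4
Friday      3            2            1            4
R_j         14           13           5            18
R_j^2       196          169          25           324

$$\sum_{j=1}^{4} R_j^2 = 196 + 169 + 25 + 324 = 714$$

$$\chi_r^2 = \frac{12}{bc(c+1)} \sum_{j=1}^{c} R_j^2 - 3b(c+1) = \frac{12}{(5)(4)(4+1)} (714) - 3(5)(4+1) = 10.68$$

Since 10.68 > 7.81473, reject Ho.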
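
The Friedman calculation for the supplier data can be sketched as below. The names `measurements` and `friedman_statistic` are hypothetical, and the sketch assumes there are no ties within any block, which holds for this data set; tied values within a row would need averaged ranks instead.

```python
# Illustrative sketch; names are assumptions, and ties within a block are not handled.
measurements = [
    [62, 63, 57, 61],  # Monday:    suppliers 1..4
    [63, 61, 59, 65],  # Tuesday
    [61, 62, 56, 63],  # Wednesday
    [62, 60, 57, 64],  # Thursday
    [64, 63, 58, 66],  # Friday
]


def friedman_statistic(blocks):
    """Return (rank totals per treatment, chi-square_r) for rows of block data."""
    b = len(blocks)      # number of blocks (rows)
    c = len(blocks[0])   # number of treatment levels (columns)
    rank_totals = [0] * c
    for row in blocks:
        # Rank the treatments within this block, smallest value gets rank 1.
        order = sorted(range(c), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_totals[j] += rank
    sum_sq = sum(r * r for r in rank_totals)
    chi_r2 = 12 / (b * c * (c + 1)) * sum_sq - 3 * b * (c + 1)
    return rank_totals, chi_r2


rank_totals, chi_r2 = friedman_statistic(measurements)
print(rank_totals, round(chi_r2, 2))

# Compare with the chi-square critical value for df = 3 at alpha = .05.
reject_ho = chi_r2 > 7.81473
```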