You are on page 1of 51

Slides Prepared by

JOHN S. LOUCKS
St. Edward’s University

© 2005 Thomson/South-Western Slide


1
Chapter 19
Nonparametric Methods
■ Sign Test
■ Wilcoxon Signed-Rank Test
■ Mann-Whitney-Wilcoxon Test
■ Kruskal-Wallis Test
■ Rank Correlation

© 2005 Thomson/South-Western Slide


2
Nonparametric Methods

■ Most of the statistical methods referred to as parametric


require the use of interval- or ratio-scaled data.
■ Nonparametric methods are often the only way to
analyze nominal or ordinal data and draw statistical
conclusions.
■ Nonparametric methods require no assumptions about
the population probability distributions.
■ Nonparametric methods are often called distribution-
free methods.

© 2005 Thomson/South-Western Slide


3
Nonparametric Methods

■ In general, for a statistical method to be classified as


nonparametric, it must satisfy at least one of the
following conditions.
• The method can be used with nominal data.
• The method can be used with ordinal data.
• The method can be used with interval or ratio data
when no assumption can be made about the
population probability distribution.

© 2005 Thomson/South-Western Slide


4
Sign Test

■ A common application of the sign test involves using


a sample of n potential customers to identify a
preference for one of two brands of a product.
■ The objective is to determine whether there is a
difference in preference between the two items being
compared.
■ To record the preference data, we use a plus sign if
the individual prefers one brand and a minus sign if
the individual prefers the other brand.
■ Because the data are recorded as plus and minus
signs, this test is called the sign test.

© 2005 Thomson/South-Western Slide


5
Sign Test: Small-Sample Case

■ The small-sample case for the sign test should be


used whenever n < 20.
■ The hypotheses are
H 0 : p = .50 No preference for one brand
over the other exists.
H a : p ≠ .50 A preference for one brand
over the other exists.
■ The number of plus signs is our test statistic.
■ Assuming H0 is true, the sampling distribution for
the test statistic is a binomial distribution with p = .5.
■ H0 is rejected if the p-value < level of significance, α.

© 2005 Thomson/South-Western Slide


6
Sign Test: Large-Sample Case

■ Using H0: p = .5 and n > 20, the sampling distribution


for the number of plus signs can be approximated by
a normal distribution.
■ When no preference is stated (H0: p = .5), the sampling
distribution will have:
Mean: µ = .50n
Standard Deviation: σ = .25n

■ The test statistic is: x−µ


z= (x is the number
σ of plus signs)

■ H0 is rejected if the p-value < level of significance, α.

© 2005 Thomson/South-Western Slide


7
Sign Test: Large-Sample Case

■ Example: Ketchup Taste Test


As part of a market research study, a
sample of 36 consumers were asked to taste
two brands of ketchup and indicate a
preference. Do the data shown on the next
slide indicate a significant difference in the
consumer preferences for the two brands? A
B

© 2005 Thomson/South-Western Slide


8
Sign Test: Large-Sample Case

■ Example: Ketchup Taste Test

1 preferred Brand A Ketchup


(+ sign recorded)
1 preferred Brand B Ketchup
(_ sign recorded)
6 had no preference A
B
The analysis will be based on
a sample size of 18 + 12 = 30.

© 2005 Thomson/South-Western Slide


9
Sign Test: Large-Sample Case

■ Hypotheses A B
H 0 : p = .50
No preference for one brand over the other exists
H a : p ≠ .50
A preference for one brand over the other exists

© 2005 Thomson/South-Western Slide


10
Sign Test: Large-Sample Case

■ Sampling Distribution for Number of Plus Signs A B

σ = .25n = .25(30) = 2.74

µ = .5(30) = 15

© 2005 Thomson/South-Western Slide


11
Sign Test: Large-Sample Case

■ Rejection Rule A B
Using .05 level of significance:
Reject H0 if p-value < .05
■ Test Statistic
z = (x – µ)/σ = (18 - 15)/2.74 = 3/2.74 = 1.10

■ p-Value
p-Value = 2(.5000 - .3643) = .2714

© 2005 Thomson/South-Western Slide


12
Sign Test: Large-Sample Case

■ Conclusion A B
Because the p-value > α, we cannot reject H0.
There is insufficient evidence in the sample to conclude
that a difference in preference exists for the two brands
of ketchup.

© 2005 Thomson/South-Western Slide


13
Hypothesis Test About a Median

■ We can apply the sign test by:


• Using a plus sign whenever the data in the sample
are above the hypothesized value of the median
• Using a minus sign whenever the data in the
sample are below the hypothesized value of the
median
• Discarding any data exactly equal to the
hypothesized median

© 2005 Thomson/South-Western Slide


14
Hypothesis Test About a Median

■ Example: Trim Fitness Center


A hypothesis test is being conducted
about the median age of female members
of the Trim Fitness Center.
H0: Median Age = 34 years
Ha: Median Age ≠ 34 years

In a sample of 40 female members, 25 are older


than 34, 14 are younger than 34, and 1 is 34. Is there
sufficient evidence to reject H0? Assume α = .05.

© 2005 Thomson/South-Western Slide


15
Hypothesis Test About a Median

■ Mean and Standard Deviation


µ = .5(39) = 19.5
σ = .25n = .25(39) = 3.12
■ Test Statistic
z = (x – µ)/σ = (25 – 19.5)/3.12 = 1.76

■ p-Value
p-Value = 2(.5000 − .4608) = .0784

© 2005 Thomson/South-Western Slide


16
Hypothesis Test About a Median

■ Rejection Rule
Using .05 level of significance:
Reject H0 if p-value < .05
■ Conclusion
Do not reject H0. The p-value for this two-tail
test is .0784. There is insufficient evidence in the
sample to conclude that the median age is not 34 for
female members of Trim Fitness Center.

© 2005 Thomson/South-Western Slide


17
Wilcoxon Signed-Rank Test

■ This test is the nonparametric alternative to the


parametric matched-sample test presented in
Chapter 10.
■ The methodology of the parametric matched-sample
analysis requires:
• interval data, and
• the assumption that the population of differences
between the pairs of observations is normally
distributed.
■ If the assumption of normally distributed differences
is not appropriate, the Wilcoxon signed-rank test can
be used.

© 2005 Thomson/South-Western Slide


18
Wilcoxon Signed-Rank Test

■ Example: Express Deliveries


A firm has decided to select one
of two express delivery services to
provide next-day deliveries to its
district offices.
To test the delivery times of the two services, the
firm sends two reports to a sample of 10 district
offices, with one report carried by one service and the
other report carried by the second service. Do the data
on the next slide indicate a difference in the two
services?

© 2005 Thomson/South-Western Slide


19
Wilcoxon Signed-Rank Test

District Office OverNight NiteFlite


Seattle 32 hrs. 25 hrs.
Los Angeles 30 24
19 15
Boston 16 15
Cleveland 15 13
New York 18 15
Houston 14 15
Atlanta 10 8
St. Louis 7 9
Milwaukee 16 11
Denver

© 2005 Thomson/South-Western Slide


20
Wilcoxon Signed-Rank Test

■ Preliminary Steps of the Test


• Compute the differences between the paired
observations.
• Discard any differences of zero.
• Rank the absolute value of the differences from
lowest to highest. Tied differences are assigned
the average ranking of their positions.
• Give the ranks the sign of the original difference
in the data.
• Sum the signed ranks.
. . . next we will determine whether the
sum is significantly different from zero.

© 2005 Thomson/South-Western Slide


21
Wilcoxon Signed-Rank Test

District Office Differ. |Diff.| Rank Sign. Rank


Seattle 7 10 +10
Los Angeles 6 9 +9
4 7 +7
Boston 1 1.5 +1.5
Cleveland 2 4 +4
New York 3 6 +6
Houston −1 1.5 −1.5
Atlanta 2 4 +4
St. Louis −2 4 −4
Milwaukee 5 8 +8
Denver +44

© 2005 Thomson/South-Western Slide


22
Wilcoxon Signed-Rank Test

■ Hypotheses
H0: The delivery times of the two services are the
same; neither offers faster service than the other.
Ha: Delivery times differ between the two services;
recommend the one with the smaller times.

© 2005 Thomson/South-Western Slide


23
Wilcoxon Signed-Rank Test

■ Sampling Distribution of T for Identical Populations

n(n + 1)(2 n + 1) 10(11)(21)


σT = = = 19.62
6 6

T
µT = 0

© 2005 Thomson/South-Western Slide


24
Wilcoxon Signed-Rank Test

■ Rejection Rule
Using .05 level of significance,
Reject H0 if p-value < .05
■ Test Statistic
z = (T - µT )/σT = (44 - 0)/19.62 = 2.24
■ p-Value
p-Value = 2(.5000 - .4875) = .025

© 2005 Thomson/South-Western Slide


25
Wilcoxon Signed-Rank Test

■ Conclusion
Reject H0. The p-value for this two-tail test is
.025. There is sufficient evidence in the sample to
conclude that a difference exists in the delivery times
provided by the two services.

© 2005 Thomson/South-Western Slide


26
Mann-Whitney-Wilcoxon Test

■ This test is another nonparametric method for


determining whether there is a difference between
two populations.
■ This test, unlike the Wilcoxon signed-rank test, is not
based on a matched sample.
■ This test does not require interval data or the
assumption that both populations are normally
distributed.
■ The only requirement is that the measurement scale
for the data is at least ordinal.

© 2005 Thomson/South-Western Slide


27
Mann-Whitney-Wilcoxon Test

■ Instead of testing for the difference between the


means of two populations, this method tests to
determine whether the two populations are identical.
■ The hypotheses are:

H0: The two populations are identical


Ha: The two populations are not identical

© 2005 Thomson/South-Western Slide


28
Mann-Whitney-Wilcoxon Test

■ Example: Westin Freezers


Manufacturer labels indicate the
annual energy cost associated with
operating home appliances such as
freezers.
The energy costs for a sample of
10 Westin freezers and a sample of 10
Easton Freezers are shown on the next slide. Do the
data indicate, using α = .05, that a difference exists in
the annual energy costs for the two brands of freezers?

© 2005 Thomson/South-Western Slide


29
Mann-Whitney-Wilcoxon Test

Westin Freezers Easton Freezers


$55.10 $56.10
54.70
54.50 54.40
53.20 55.40
53.00 54.10
55.50 56.00
54.90 55.50
55.80 55.00
54.00 54.30
54.20 57.00
55.20

© 2005 Thomson/South-Western Slide


30
Mann-Whitney-Wilcoxon Test

■ Hypotheses
H0: Annual energy costs for Westin freezers
and Easton freezers are the same.
Ha: Annual energy costs differ for the two
brands of freezers.

© 2005 Thomson/South-Western Slide


31
Mann-Whitney-Wilcoxon Test:
Large-Sample Case
■ First, rank the combined data from the lowest to
the highest values, with tied values being assigned
the average of the tied rankings.
■ Then, compute T, the sum of the ranks for the first
sample.
■ Then, compare the observed value of T to the
sampling distribution of T for identical populations.
The value of the standardized test statistic z will
provide the basis for deciding whether to reject H0.

© 2005 Thomson/South-Western Slide


32
Mann-Whitney-Wilcoxon Test:
Large-Sample Case
■ Sampling Distribution of T for Identical Populations
• Mean
µT = n1(n1 + n2 + 1)

• Standard Deviation
σ T = 1 12 n1 n2 (n1 + n2 + 1)

• Distribution Form
Approximately normal, provided
n1 > 10 and n2 > 10

© 2005 Thomson/South-Western Slide


33
Mann-Whitney-Wilcoxon Test

Westin Freezers Rank Easton Freezers Rank


$55.10 12 $56.10 19
8 54.70 9
54.50 2 54.40 7
53.20 1 55.40 14
53.00 15.5 54.10 4
55.50 10 56.00 18
54.90 17 55.50 15.5
55.80 3 55.00 11
54.00 5 54.30 6
54.20 13 57.00 20
Sum of Ranks 86.5
55.20 Sum of Ranks 123.5

© 2005 Thomson/South-Western Slide


34
Mann-Whitney-Wilcoxon Test

■ Sampling Distribution of T for Identical Populations

σ T = 1 12 n1 n2 (n1 + n2 + 1)

= 1 12 (10)(10)(21)
= 13.23

T
µT = ½(10)(21) = 105

© 2005 Thomson/South-Western Slide


35
Mann-Whitney-Wilcoxon Test

■ Rejection Rule
Using .05 level of significance,
Reject H0 if p-value < .05
■ Test Statistic
z = (T - µT )/σT = (86.5 − 105)/13.23 = -1.40
■ p-Value
p-Value = 2(.5000 - .4192) = .1616

© 2005 Thomson/South-Western Slide


36
Mann-Whitney-Wilcoxon Test

■ Conclusion
Do not reject H0. The p-value > α. There is
insufficient evidence in the sample data to conclude
that there is a difference in the annual energy cost
associated with the two brands of freezers.

© 2005 Thomson/South-Western Slide


37
Kruskal-Wallis Test

■ The Mann-Whitney-Wilcoxon test has been extended


by Kruskal and Wallis for cases of three or more
populations.
H0: All populations are identical
Ha: Not all populations are identical

■ The Kruskal-Wallis test can be used with ordinal data


as well as with interval or ratio data.
■ Also, the Kruskal-Wallis test does not require the
assumption of normally distributed populations.

© 2005 Thomson/South-Western Slide


38
Kruskal-Wallis Test

■ Test Statistic

 12 k
Ri2 
W = ∑  − 3(nT + 1)
 nT (nT + 1) i =1 ni 

where: k = number of populations


ni = number of items in sample i
nT = Σni = total number of items in all samples
Ri = sum of the ranks for sample i

© 2005 Thomson/South-Western Slide


39
Kruskal-Wallis Test

■ When the populations are identical, the sampling


distribution of the test statistic W can be approximated
by a chi-square distribution with k – 1 degrees of
freedom.
■ This approximation is acceptable if each of the sample
sizes ni is > 5.
■ The rejection rule is: Reject H0 if p-value < α

© 2005 Thomson/South-Western Slide


40
Rank Correlation

■ The Pearson correlation coefficient, r, is a measure of


the linear association between two variables for
which interval or ratio data are available.
■ The Spearman rank-correlation coefficient, rs , is a
measure of association between two variables when
only ordinal data are available.
■ Values of rs can range from –1.0 to +1.0, where
• values near 1.0 indicate a strong positive
association between the rankings, and
• values near -1.0 indicate a strong negative
association between the rankings.

© 2005 Thomson/South-Western Slide


41
Rank Correlation

■ Spearman Rank-Correlation Coefficient, rs

6∑ di2
rs = 1 −
n( n2 − 1)

where: n = number of items being ranked


xi = rank of item i with respect to one variable
yi = rank of item i with respect to a second variable
di = xi - yi

© 2005 Thomson/South-Western Slide


42
Test for Significant Rank Correlation

■ We may want to use sample results to make an


inference about the population rank correlation ps.
■ To do so, we must test the hypotheses:

H 0 : ps = 0 (No rank correlation exists)


H a : ps ≠ 0 (Rank correlation exists)

© 2005 Thomson/South-Western Slide


43
Rank Correlation

■ Sampling Distribution of rs when ps = 0


• Mean
µrs = 0

• Standard Deviation
1
σ rs =
n−1

• Distribution Form
Approximately normal, provided n > 10

© 2005 Thomson/South-Western Slide


44
Rank Correlation

■ Example: Crennor Investors


Crennor Investors provides
a portfolio management service
for its clients. Two of Crennor’s
analysts ranked ten investments
as shown on the next slide. Use
rank correlation, with α = .10, to
comment on the agreement of the two analysts’
rankings.

© 2005 Thomson/South-Western Slide


45
Rank Correlation

■ Example: Crennor Investors


• Analysts’ Rankings
Investment A B C D E F G H I J
Analyst #1 1 4 9 8 6 3 5 7 2 10
Analyst #2 1 5 6 2 9 7 3 10 4 8

• Hypotheses
H 0 : ps = 0 (No rank correlation exists)
H a : ps ≠ 0 (Rank correlation exists)

© 2005 Thomson/South-Western Slide


46
Rank Correlation

Analyst #1 Analyst #2
Investment Ranking Ranking Differ. (Differ.)2
A 1 1 0 0
B 4 5 -1 1
C 9 6 3 9
D 8 2 6 36
E 6 9 -3 9
F 3 7 -4 16
G 5 3 2 4
H 7 10 -3 9
I 2 4 -2 4
J 10 8 2 4
Sum = 92
© 2005 Thomson/South-Western Slide
47
Rank Correlation

 Sampling Distribution of rs
Assuming No Rank Correlation

1
σ rs = = .333
10 − 1

rs
µr = 0

© 2005 Thomson/South-Western Slide


48
Rank Correlation

■ Rejection Rule
With .10 level of significance:
Reject H0 if p-value < .10
■ Test Statistic
6∑ di2 6(92)
rs = 1 − =1− = 0.4424
n(n 2 − 1) 10(100 − 1)

z = (rs - µr )/σr = (.4424 - 0)/.3333 = 1.33

■ p-Value
p-Value = 2(.5000 - .4082) = .1836

© 2005 Thomson/South-Western Slide


49
Rank Correlation

■ Conclusion
Do no reject H0. The p-value > α. There is not a
significant rank correlation. The two analysts are not
showing agreement in their ranking of the risk
associated with the different investments.

© 2005 Thomson/South-Western Slide


50
End of Chapter 19

© 2005 Thomson/South-Western Slide


51