You are on page 1of 51

# Slides Prepared by

JOHN S. LOUCKS
St. Edward’s University

1
Chapter 19
Nonparametric Methods
■ Sign Test
■ Wilcoxon Signed-Rank Test
■ Mann-Whitney-Wilcoxon Test
■ Kruskal-Wallis Test
■ Rank Correlation

2
Nonparametric Methods

## ■ Most of the statistical methods referred to as parametric

require the use of interval- or ratio-scaled data.
■ Nonparametric methods are often the only way to
analyze nominal or ordinal data and draw statistical
conclusions.
■ Nonparametric methods require no assumptions about
the population probability distributions.
■ Nonparametric methods are often called distribution-
free methods.

3
Nonparametric Methods

## ■ In general, for a statistical method to be classified as

nonparametric, it must satisfy at least one of the
following conditions.
• The method can be used with nominal data.
• The method can be used with ordinal data.
• The method can be used with interval or ratio data
population probability distribution.

4
Sign Test

## ■ A common application of the sign test involves using

a sample of n potential customers to identify a
preference for one of two brands of a product.
■ The objective is to determine whether there is a
difference in preference between the two items being
compared.
■ To record the preference data, we use a plus sign if
the individual prefers one brand and a minus sign if
the individual prefers the other brand.
■ Because the data are recorded as plus and minus
signs, this test is called the sign test.

5
Sign Test: Small-Sample Case

## ■ The small-sample case for the sign test should be

used whenever n < 20.
■ The hypotheses are
H 0 : p = .50 No preference for one brand
over the other exists.
H a : p ≠ .50 A preference for one brand
over the other exists.
■ The number of plus signs is our test statistic.
■ Assuming H0 is true, the sampling distribution for
the test statistic is a binomial distribution with p = .5.
■ H0 is rejected if the p-value < level of significance, α.

6
Sign Test: Large-Sample Case

## ■ Using H0: p = .5 and n > 20, the sampling distribution

for the number of plus signs can be approximated by
a normal distribution.
■ When no preference is stated (H0: p = .5), the sampling
distribution will have:
Mean: µ = .50n
Standard Deviation: σ = .25n

## ■ The test statistic is: x−µ

z= (x is the number
σ of plus signs)

## ■ H0 is rejected if the p-value < level of significance, α.

7
Sign Test: Large-Sample Case

## ■ Example: Ketchup Taste Test

As part of a market research study, a
sample of 36 consumers were asked to taste
two brands of ketchup and indicate a
preference. Do the data shown on the next
slide indicate a significant difference in the
consumer preferences for the two brands? A
B

8
Sign Test: Large-Sample Case

## 1 preferred Brand A Ketchup

(+ sign recorded)
1 preferred Brand B Ketchup
(_ sign recorded)
B
The analysis will be based on
a sample size of 18 + 12 = 30.

9
Sign Test: Large-Sample Case

■ Hypotheses A B
H 0 : p = .50
No preference for one brand over the other exists
H a : p ≠ .50
A preference for one brand over the other exists

10
Sign Test: Large-Sample Case

## σ = .25n = .25(30) = 2.74

µ = .5(30) = 15

11
Sign Test: Large-Sample Case

■ Rejection Rule A B
Using .05 level of significance:
Reject H0 if p-value < .05
■ Test Statistic
z = (x – µ)/σ = (18 - 15)/2.74 = 3/2.74 = 1.10

■ p-Value
p-Value = 2(.5000 - .3643) = .2714

12
Sign Test: Large-Sample Case

■ Conclusion A B
Because the p-value > α, we cannot reject H0.
There is insufficient evidence in the sample to conclude
that a difference in preference exists for the two brands
of ketchup.

13

## ■ We can apply the sign test by:

• Using a plus sign whenever the data in the sample
are above the hypothesized value of the median
• Using a minus sign whenever the data in the
sample are below the hypothesized value of the
median
• Discarding any data exactly equal to the
hypothesized median

14

## ■ Example: Trim Fitness Center

A hypothesis test is being conducted
about the median age of female members
of the Trim Fitness Center.
H0: Median Age = 34 years
Ha: Median Age ≠ 34 years

## In a sample of 40 female members, 25 are older

than 34, 14 are younger than 34, and 1 is 34. Is there
sufficient evidence to reject H0? Assume α = .05.

15

## ■ Mean and Standard Deviation

µ = .5(39) = 19.5
σ = .25n = .25(39) = 3.12
■ Test Statistic
z = (x – µ)/σ = (25 – 19.5)/3.12 = 1.76

■ p-Value
p-Value = 2(.5000 − .4608) = .0784

16

■ Rejection Rule
Using .05 level of significance:
Reject H0 if p-value < .05
■ Conclusion
Do not reject H0. The p-value for this two-tail
test is .0784. There is insufficient evidence in the
sample to conclude that the median age is not 34 for
female members of Trim Fitness Center.

17
Wilcoxon Signed-Rank Test

## ■ This test is the nonparametric alternative to the

parametric matched-sample test presented in
Chapter 10.
■ The methodology of the parametric matched-sample
analysis requires:
• interval data, and
• the assumption that the population of differences
between the pairs of observations is normally
distributed.
■ If the assumption of normally distributed differences
is not appropriate, the Wilcoxon signed-rank test can
be used.

18
Wilcoxon Signed-Rank Test

## ■ Example: Express Deliveries

A firm has decided to select one
of two express delivery services to
provide next-day deliveries to its
district offices.
To test the delivery times of the two services, the
firm sends two reports to a sample of 10 district
offices, with one report carried by one service and the
other report carried by the second service. Do the data
on the next slide indicate a difference in the two
services?

19
Wilcoxon Signed-Rank Test

## District Office OverNight NiteFlite

Seattle 32 hrs. 25 hrs.
Los Angeles 30 24
19 15
Boston 16 15
Cleveland 15 13
New York 18 15
Houston 14 15
Atlanta 10 8
St. Louis 7 9
Milwaukee 16 11
Denver

20
Wilcoxon Signed-Rank Test

## ■ Preliminary Steps of the Test

• Compute the differences between the paired
observations.
• Discard any differences of zero.
• Rank the absolute value of the differences from
lowest to highest. Tied differences are assigned
the average ranking of their positions.
• Give the ranks the sign of the original difference
in the data.
• Sum the signed ranks.
. . . next we will determine whether the
sum is significantly different from zero.

21
Wilcoxon Signed-Rank Test

## District Office Differ. |Diff.| Rank Sign. Rank

Seattle 7 10 +10
Los Angeles 6 9 +9
4 7 +7
Boston 1 1.5 +1.5
Cleveland 2 4 +4
New York 3 6 +6
Houston −1 1.5 −1.5
Atlanta 2 4 +4
St. Louis −2 4 −4
Milwaukee 5 8 +8
Denver +44

22
Wilcoxon Signed-Rank Test

■ Hypotheses
H0: The delivery times of the two services are the
same; neither offers faster service than the other.
Ha: Delivery times differ between the two services;
recommend the one with the smaller times.

23
Wilcoxon Signed-Rank Test

## n(n + 1)(2 n + 1) 10(11)(21)

σT = = = 19.62
6 6

T
µT = 0

24
Wilcoxon Signed-Rank Test

■ Rejection Rule
Using .05 level of significance,
Reject H0 if p-value < .05
■ Test Statistic
z = (T - µT )/σT = (44 - 0)/19.62 = 2.24
■ p-Value
p-Value = 2(.5000 - .4875) = .025

25
Wilcoxon Signed-Rank Test

■ Conclusion
Reject H0. The p-value for this two-tail test is
.025. There is sufficient evidence in the sample to
conclude that a difference exists in the delivery times
provided by the two services.

26
Mann-Whitney-Wilcoxon Test

## ■ This test is another nonparametric method for

determining whether there is a difference between
two populations.
■ This test, unlike the Wilcoxon signed-rank test, is not
based on a matched sample.
■ This test does not require interval data or the
assumption that both populations are normally
distributed.
■ The only requirement is that the measurement scale
for the data is at least ordinal.

27
Mann-Whitney-Wilcoxon Test

## ■ Instead of testing for the difference between the

means of two populations, this method tests to
determine whether the two populations are identical.
■ The hypotheses are:

## H0: The two populations are identical

Ha: The two populations are not identical

28
Mann-Whitney-Wilcoxon Test

## ■ Example: Westin Freezers

Manufacturer labels indicate the
annual energy cost associated with
operating home appliances such as
freezers.
The energy costs for a sample of
10 Westin freezers and a sample of 10
Easton Freezers are shown on the next slide. Do the
data indicate, using α = .05, that a difference exists in
the annual energy costs for the two brands of freezers?

29
Mann-Whitney-Wilcoxon Test

## Westin Freezers Easton Freezers

\$55.10 \$56.10
54.70
54.50 54.40
53.20 55.40
53.00 54.10
55.50 56.00
54.90 55.50
55.80 55.00
54.00 54.30
54.20 57.00
55.20

30
Mann-Whitney-Wilcoxon Test

■ Hypotheses
H0: Annual energy costs for Westin freezers
and Easton freezers are the same.
Ha: Annual energy costs differ for the two
brands of freezers.

31
Mann-Whitney-Wilcoxon Test:
Large-Sample Case
■ First, rank the combined data from the lowest to
the highest values, with tied values being assigned
the average of the tied rankings.
■ Then, compute T, the sum of the ranks for the first
sample.
■ Then, compare the observed value of T to the
sampling distribution of T for identical populations.
The value of the standardized test statistic z will
provide the basis for deciding whether to reject H0.

32
Mann-Whitney-Wilcoxon Test:
Large-Sample Case
■ Sampling Distribution of T for Identical Populations
• Mean
µT = n1(n1 + n2 + 1)

• Standard Deviation
σ T = 1 12 n1 n2 (n1 + n2 + 1)

• Distribution Form
Approximately normal, provided
n1 > 10 and n2 > 10

33
Mann-Whitney-Wilcoxon Test

## Westin Freezers Rank Easton Freezers Rank

\$55.10 12 \$56.10 19
8 54.70 9
54.50 2 54.40 7
53.20 1 55.40 14
53.00 15.5 54.10 4
55.50 10 56.00 18
54.90 17 55.50 15.5
55.80 3 55.00 11
54.00 5 54.30 6
54.20 13 57.00 20
Sum of Ranks 86.5
55.20 Sum of Ranks 123.5

34
Mann-Whitney-Wilcoxon Test

## ■ Sampling Distribution of T for Identical Populations

σ T = 1 12 n1 n2 (n1 + n2 + 1)

= 1 12 (10)(10)(21)
= 13.23

T
µT = ½(10)(21) = 105

35
Mann-Whitney-Wilcoxon Test

■ Rejection Rule
Using .05 level of significance,
Reject H0 if p-value < .05
■ Test Statistic
z = (T - µT )/σT = (86.5 − 105)/13.23 = -1.40
■ p-Value
p-Value = 2(.5000 - .4192) = .1616

36
Mann-Whitney-Wilcoxon Test

■ Conclusion
Do not reject H0. The p-value > α. There is
insufficient evidence in the sample data to conclude
that there is a difference in the annual energy cost
associated with the two brands of freezers.

37
Kruskal-Wallis Test

## ■ The Mann-Whitney-Wilcoxon test has been extended

by Kruskal and Wallis for cases of three or more
populations.
H0: All populations are identical
Ha: Not all populations are identical

## ■ The Kruskal-Wallis test can be used with ordinal data

as well as with interval or ratio data.
■ Also, the Kruskal-Wallis test does not require the
assumption of normally distributed populations.

38
Kruskal-Wallis Test

■ Test Statistic

 12 k
Ri2 
W = ∑  − 3(nT + 1)
 nT (nT + 1) i =1 ni 

## where: k = number of populations

ni = number of items in sample i
nT = Σni = total number of items in all samples
Ri = sum of the ranks for sample i

39
Kruskal-Wallis Test

## ■ When the populations are identical, the sampling

distribution of the test statistic W can be approximated
by a chi-square distribution with k – 1 degrees of
freedom.
■ This approximation is acceptable if each of the sample
sizes ni is > 5.
■ The rejection rule is: Reject H0 if p-value < α

40
Rank Correlation

## ■ The Pearson correlation coefficient, r, is a measure of

the linear association between two variables for
which interval or ratio data are available.
■ The Spearman rank-correlation coefficient, rs , is a
measure of association between two variables when
only ordinal data are available.
■ Values of rs can range from –1.0 to +1.0, where
• values near 1.0 indicate a strong positive
association between the rankings, and
• values near -1.0 indicate a strong negative
association between the rankings.

41
Rank Correlation

6∑ di2
rs = 1 −
n( n2 − 1)

## where: n = number of items being ranked

xi = rank of item i with respect to one variable
yi = rank of item i with respect to a second variable
di = xi - yi

42
Test for Significant Rank Correlation

## ■ We may want to use sample results to make an

inference about the population rank correlation ps.
■ To do so, we must test the hypotheses:

## H 0 : ps = 0 (No rank correlation exists)

H a : ps ≠ 0 (Rank correlation exists)

43
Rank Correlation

## ■ Sampling Distribution of rs when ps = 0

• Mean
µrs = 0

• Standard Deviation
1
σ rs =
n−1

• Distribution Form
Approximately normal, provided n > 10

44
Rank Correlation

## ■ Example: Crennor Investors

Crennor Investors provides
a portfolio management service
for its clients. Two of Crennor’s
analysts ranked ten investments
as shown on the next slide. Use
rank correlation, with α = .10, to
comment on the agreement of the two analysts’
rankings.

45
Rank Correlation

## ■ Example: Crennor Investors

• Analysts’ Rankings
Investment A B C D E F G H I J
Analyst #1 1 4 9 8 6 3 5 7 2 10
Analyst #2 1 5 6 2 9 7 3 10 4 8

• Hypotheses
H 0 : ps = 0 (No rank correlation exists)
H a : ps ≠ 0 (Rank correlation exists)

46
Rank Correlation

Analyst #1 Analyst #2
Investment Ranking Ranking Differ. (Differ.)2
A 1 1 0 0
B 4 5 -1 1
C 9 6 3 9
D 8 2 6 36
E 6 9 -3 9
F 3 7 -4 16
G 5 3 2 4
H 7 10 -3 9
I 2 4 -2 4
J 10 8 2 4
Sum = 92
47
Rank Correlation

 Sampling Distribution of rs
Assuming No Rank Correlation

1
σ rs = = .333
10 − 1

rs
µr = 0

48
Rank Correlation

■ Rejection Rule
With .10 level of significance:
Reject H0 if p-value < .10
■ Test Statistic
6∑ di2 6(92)
rs = 1 − =1− = 0.4424
n(n 2 − 1) 10(100 − 1)

## z = (rs - µr )/σr = (.4424 - 0)/.3333 = 1.33

■ p-Value
p-Value = 2(.5000 - .4082) = .1836

49
Rank Correlation

■ Conclusion
Do no reject H0. The p-value > α. There is not a
significant rank correlation. The two analysts are not
showing agreement in their ranking of the risk
associated with the different investments.