
Chapter 11 Nonparametric Tests

Elementary Statistics
Larson Farber
Section 11.1

The Sign Test


Nonparametric Tests
A nonparametric test is a hypothesis test that does not require
any specific conditions about the shape of the populations or the
value of any population parameters.
Such tests are often called distribution-free tests.
The Sign Test is a nonparametric test that can be used to
test a population median against a hypothesized value, k.

Hypotheses
Left-tailed test: H0: median ≥ k and Ha: median < k

Right-tailed test: H0: median ≤ k and Ha: median > k

Two-tailed test: H0: median = k and Ha: median ≠ k


Sign Test
To use the sign test, first compare each entry in the
sample to the hypothesized median, k.

If the entry is below the median, assign it a − sign.
If the entry is above the median, assign it a + sign.
If the entry is equal to the median, assign it a 0.

Compare the number of + and − signs. (Ignore 0s.) If the
number of + signs and the number of − signs are
approximately equal, the null hypothesis is not likely to
be rejected. If they are not approximately equal,
however, it is likely that the null hypothesis will be
rejected.
Sign Test
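Test Statistic: When n ≤ 25, the test statistic is the
smaller number of + or − signs.

When n > 25, the test statistic is:

$$z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$$

where x is the smaller number of + or − signs and n is the total number of + and − signs.

For n > 25, you are testing the binomial probability that p = 0.50.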
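The rule above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `sign_test_statistic` is hypothetical, and for n > 25 it assumes the usual continuity-corrected normal approximation z = ((x + 0.5) − 0.5n) / (√n / 2), where x is the smaller sign count.

```python
import math

def sign_test_statistic(sample, k):
    """Return (statistic, n) for a sign test of the median against k."""
    plus = sum(1 for v in sample if v > k)    # entries above k get a + sign
    minus = sum(1 for v in sample if v < k)   # entries below k get a - sign
    n = plus + minus                          # entries equal to k are ignored
    x = min(plus, minus)                      # smaller number of + or - signs
    if n <= 25:
        return x, n
    # Assumed normal approximation with continuity correction for n > 25
    z = ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)
    return z, n

# January temperatures from the San Diego application in this section
temps = [58, 62, 55, 55, 53, 52, 52, 59, 55,
         55, 60, 56, 57, 61, 58, 63, 63, 55]
print(sign_test_statistic(temps, 57))
```

Returning n together with the statistic is a design choice: n is needed to look up the critical value, and it differs from the sample size whenever some entries equal k.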
Application
A meteorologist claims that the daily median temperature for
the month of January in San Diego is 57° Fahrenheit. The
temperatures (in degrees Fahrenheit) for 18 randomly selected
January days are listed below. At α = 0.01, is there enough
evidence to reject the meteorologist's claim?
58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
1. Write the null and alternative hypotheses.
H0: median = 57 and Ha: median ≠ 57
2. State the level of significance.
α = 0.01
3. Determine the sampling distribution.
Binomial with p = 0.5
58 62 55 55 53 52 52 59 55
 +  +  −  −  −  −  −  +  −
55 60 56 57 61 58 63 63 55
 −  +  −  0  +  +  +  +  −

There are 8 + signs and 9 − signs. So, n = 8 + 9 = 17.

Since Ha contains the ≠ symbol, this is a two-tailed test.


4. Find the critical value. With n = 17, use Table 8.
The critical value is 2.
5. Find the rejection region. Reject H0 if the test
statistic is less than or equal to 2.
6. Find the test statistic.
The test statistic is the smaller number of + or − signs,
so the test statistic is 8.
7. Make your decision.

The test statistic, 8, does not fall in the critical region. Fail
to reject the null hypothesis.
8. Interpret your decision.
There is not enough evidence to reject the
meteorologist's claim that the median daily
temperature for January in San Diego is 57° Fahrenheit.
The sign test can also be used with
paired data (such as before and after).
Find the difference between
corresponding values and record the
sign. Use the same procedure.
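A short sketch of the paired version, under stated assumptions: the helper name `paired_signs` and the before/after scores are hypothetical, invented only to show the mechanics, and differences are taken as before − after.

```python
def paired_signs(before, after):
    """Count + and - signs of the differences before - after (zeros ignored)."""
    diffs = [b - a for b, a in zip(before, after)]
    plus = sum(1 for d in diffs if d > 0)
    minus = sum(1 for d in diffs if d < 0)
    return plus, minus

# Hypothetical before/after scores, invented purely for illustration
before = [10, 12, 9, 14, 11, 13]
after = [8, 12, 10, 11, 9, 10]

plus, minus = paired_signs(before, after)
n = plus + minus                 # pairs with a zero difference do not count
statistic = min(plus, minus)     # test statistic when n <= 25
print(plus, minus, n, statistic)
```

From here the procedure is the same as for a single sample: compare the statistic with the critical value for n.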
Section 11.2

The Wilcoxon Test


Wilcoxon Signed-Rank Test
The Wilcoxon signed-rank test is a nonparametric test that
can be used to determine whether two dependent samples
were selected from populations with the same distribution.
To find the test statistic, ws:
Find the difference for each pair:
Sample 1 value − Sample 2 value
Find the absolute value of the difference.
Rank the absolute differences from smallest to largest, giving tied values the average of their ranks.
Affix a + or − sign to each of the rankings.
Find the sum of the positive ranks.
Find the sum of the negative ranks.
Select the smaller of the absolute values of the sums.
Application
The table shows the daily headache hours suffered by 12
patients before and after receiving a new drug for seven weeks.
At α = 0.01, is there enough evidence to conclude that the
new drug helped to reduce daily headache hours?

1. Write the null and alternative hypotheses.
H0: The headache hours after using the new drug are
at least as long as before using the drug.
Ha: The new drug reduces headache hours. (Claim)
2. State the level of significance.
α = 0.01
Patient  Before  After  Diff.  Abs.  Rank  Signed rank
1        2.1     2.2    −0.1   0.1   1.5   −1.5
2        3.9     2.8     1.1   1.1   5.0    5.0
3        3.8     2.5     1.3   1.3   6.0    6.0
4        2.5     2.6    −0.1   0.1   1.5   −1.5
5        2.4     1.9     0.5   0.5   3.0    3.0
6        3.6     1.8     1.8   1.8   8.0    8.0
7        3.4     2.0     1.4   1.4   7.0    7.0
8        2.4     1.6     0.8   0.8   4.0    4.0
The sum of the positive ranks is 5 + 6 + 3 + 8 + 7 + 4 = 33.
The sum of the negative ranks is −1.5 + (−1.5) = −3.

The test statistic is the smaller of the absolute values of
these sums, ws = 3.

There are 8 + and − signs, so n = 8. The critical
value is 2. Because ws = 3 is greater than the
critical value, fail to reject the null hypothesis.
There is not enough evidence to conclude the
new drug reduces headache hours.
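The signed-rank steps can be sketched in Python using the eight rows of the headache table above. This is an illustrative sketch only. The names `average_ranks` and `wilcoxon_signed_rank` are hypothetical, zero differences are assumed to be dropped, and differences are rounded to one decimal (an assumption that suits this one-decimal data) so that floating-point noise does not break ties.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of 1-based positions i..j
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def wilcoxon_signed_rank(sample1, sample2, ndigits=1):
    """Return (ws, positive rank sum, negative rank sum, n)."""
    diffs = [round(a - b, ndigits) for a, b in zip(sample1, sample2)]
    diffs = [d for d in diffs if d != 0]            # assumed: zeros are dropped
    ranks = average_ranks([abs(d) for d in diffs])
    pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    neg = sum(-r for d, r in zip(diffs, ranks) if d < 0)
    return min(abs(pos), abs(neg)), pos, neg, len(diffs)

before = [2.1, 3.9, 3.8, 2.5, 2.4, 3.6, 3.4, 2.4]
after = [2.2, 2.8, 2.5, 2.6, 1.9, 1.8, 2.0, 1.6]
ws, pos, neg, n = wilcoxon_signed_rank(before, after)
print(ws, pos, neg, n)
```

The result reproduces the hand calculation: rank sums of 33 and −3, ws = 3, and n = 8.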
Wilcoxon Rank-Sum Test
The Wilcoxon rank-sum test is a nonparametric test that
can be used to determine whether two independent
samples were selected from populations having the same
distribution.

Both sample sizes must be at least 10. Then n1
represents the size of the smaller sample and n2
the size of the larger sample.

When the samples are the same size, it does not matter which is n1.
Wilcoxon Rank-Sum Test
Test statistic:
Combine the data from both samples and rank it.
R = the sum of the ranks for the smaller sample.
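Find the z-score for the value of R:

$$z = \frac{R - \mu_R}{\sigma_R}$$

where

$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} \quad \text{and} \quad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$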
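A minimal Python sketch of the rank-sum statistic follows. It assumes the standard normal approximation, with mean n1(n1 + n2 + 1)/2 and standard deviation √(n1·n2·(n1 + n2 + 1)/12) for R. The function names are hypothetical, and the two samples are invented purely to exercise the code, since this section gives no data for this test.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of 1-based positions i..j
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def rank_sum_z(sample_a, sample_b):
    """Return (z, R), where R is the rank sum of the smaller sample."""
    small, large = sorted([sample_a, sample_b], key=len)   # n1 = smaller sample
    n1, n2 = len(small), len(large)
    ranks = average_ranks(list(small) + list(large))       # rank the combined data
    R = sum(ranks[:n1])
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (R - mu) / sigma, R

# Hypothetical samples of size 10 each, invented purely for illustration
a = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
b = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
z, R = rank_sum_z(a, b)
print(R, round(z, 3))
```

Because Python's sort is stable, equal-sized samples keep their input order, so the first argument is treated as n1, in line with the note that the choice does not matter.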
Section 11.3

The Kruskal-Wallis Test
The Kruskal-Wallis Test
The Kruskal-Wallis test is a nonparametric test that can be
used to determine whether three or more independent
samples were selected from populations having the same
distribution.
H0: There is no difference in the population distributions.
Ha: There is a difference in the population distributions.

Combine the data and rank the values. Then
separate the data according to sample and find
the sum of the ranks for each sample.

Ri = the sum of the ranks for sample i.


The Kruskal-Wallis Test
Given three or more independent samples, the test
statistic H for the Kruskal-Wallis test is:

$$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k}\right) - 3(N+1)$$

where k represents the number of samples, ni is the
size of the ith sample, N is the sum of the sample
sizes, and Ri is the sum of the ranks of the ith
sample.
The sampling distribution is a chi-square distribution with k − 1
degrees of freedom (where k = the number of samples).

Reject the null hypothesis when H is greater than the critical
value. (Always use a right-tailed test.)
Application
You want to compare the hourly pay rates of accountants
who work in Michigan, New York, and Virginia. To do so, you
randomly select 10 accountants in each state and record
their hourly pay rate as shown below. At α = 0.01, can
you conclude that the distributions of accountants' hourly pay
rates in these three states are different?

MI(1)   NY(2)   VA(3)
14.24   21.18   17.02
14.06   20.94   20.63
14.85   16.26   17.47
17.47   21.03   15.54
14.83   19.95   15.38
19.01   17.54   14.90
13.08   14.89   20.48
15.94   18.88   18.50
13.48   20.06   12.80
16.94   21.81   15.57
1. Write the null and alternative hypotheses.
H0: There is no difference in the hourly pay rates in the 3 states.
Ha: There is a difference in the hourly pay rates in the 3 states.
2. State the level of significance. α = 0.01
3. Determine the sampling distribution.
The sampling distribution is chi-square with d.f. = 3 − 1 = 2.
4. Find the critical value.
From Table 6, the critical value is 9.210.
5. Find the rejection region.
Reject H0 if H is greater than 9.210.
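As a cross-check on the table lookup, the critical value for this particular case can be computed in closed form. A chi-square distribution with 2 degrees of freedom has right-tail area exp(−x/2), so the critical value solves exp(−x/2) = α. The shortcut applies only when d.f. = 2. For other degrees of freedom a table or a statistics library is needed.

```python
import math

alpha = 0.01

# With 2 degrees of freedom the chi-square right-tail area is exp(-x / 2),
# so the critical value is the x that makes this area equal to alpha.
critical_value = -2 * math.log(alpha)

print(round(critical_value, 3))
```

The result agrees with the tabulated value of 9.210 to three decimal places.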
Test Statistic
Data    State  Rank
12.80   VA     1
13.08   MI     2
13.48   MI     3
14.06   MI     4
14.24   MI     5
14.83   MI     6
14.85   MI     7
14.89   NY     8
14.90   VA     9
15.38   VA     10
15.54   VA     11
15.57   VA     12
15.94   MI     13
16.26   NY     14
16.94   MI     15
17.02   VA     16
17.47   MI     17.5
17.47   VA     17.5
17.54   NY     19
18.50   VA     20
18.88   NY     21
19.01   MI     22
19.95   NY     23
20.06   NY     24
20.48   VA     25
20.63   VA     26
20.94   NY     27
21.03   NY     28
21.18   NY     29
21.81   NY     30

Michigan salaries are in ranks:
2, 3, 4, 5, 6, 7, 13, 15, 17.5, 22
The sum is 94.5.

New York salaries are in ranks:
8, 14, 19, 21, 23, 24, 27, 28, 29, 30
The sum is 223.

Virginia salaries are in ranks:
1, 9, 10, 11, 12, 16, 17.5, 20, 25, 26
The sum is 147.5.
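Find the test statistic.
R1 = 94.5, R2 = 223, R3 = 147.5
n1 = 10, n2 = 10 and n3 = 10, so N = 30

$$H = \frac{12}{30(30+1)}\left(\frac{94.5^2}{10} + \frac{223^2}{10} + \frac{147.5^2}{10}\right) - 3(30+1) \approx 10.76$$

The test statistic, 10.76, is greater than the critical value, 9.210.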

Make Your Decision

The test statistic 10.76 falls in the rejection region, so
reject the null hypothesis.

Interpret Your Decision

There is enough evidence to conclude that the distributions of
accountants' hourly pay rates in the 3 states are different.
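The whole calculation can be reproduced with a short Python sketch using the pay-rate data above. The function names are hypothetical, and the sketch assumes the standard statistic H = 12/(N(N+1)) · Σ(Ri²/ni) − 3(N+1), with average ranks for ties and no further tie correction.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of 1-based positions i..j
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def kruskal_wallis(samples):
    """Return (H, rank sums) for a list of independent samples."""
    combined = [v for s in samples for v in s]
    ranks = average_ranks(combined)      # rank the combined data
    N = len(combined)
    rank_sums, start = [], 0
    for s in samples:                    # split the ranks back out by sample
        rank_sums.append(sum(ranks[start:start + len(s)]))
        start += len(s)
    total = sum(R * R / len(s) for R, s in zip(rank_sums, samples))
    H = 12 / (N * (N + 1)) * total - 3 * (N + 1)
    return H, rank_sums

mi = [14.24, 14.06, 14.85, 17.47, 14.83, 19.01, 13.08, 15.94, 13.48, 16.94]
ny = [21.18, 20.94, 16.26, 21.03, 19.95, 17.54, 14.89, 18.88, 20.06, 21.81]
va = [17.02, 20.63, 17.47, 15.54, 15.38, 14.90, 20.48, 18.50, 12.80, 15.57]

H, rank_sums = kruskal_wallis([mi, ny, va])
print(rank_sums, round(H, 2))
```

The rank sums come out as 94.5, 223 and 147.5, and H rounds to 10.76, matching the hand calculation.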
Section 11.4

Rank Correlation
Rank Correlation
The Spearman rank correlation coefficient, rs, is a measure of
the strength of the relationship between two variables. The
Spearman rank correlation coefficient is calculated using the
ranks of paired sample data entries. The formula for the
Spearman rank correlation coefficient is

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where n is the number of paired data entries and d is the
difference between the ranks of a paired data entry.
The hypotheses:

H0: ρs = 0 (There is no correlation between the variables.)

Ha: ρs ≠ 0 (There is a significant correlation between the
variables.)
Rank Correlation
Seven candidates applied for a nursing position. The seven
candidates were placed in rank order first by x and then by y.
The results of the rankings are listed below. At α = 0.05,
test the claim that there is a significant correlation between
the variables.

Candidate  x  y
1          2  1
2          4  4
3          1  3
4          5  2
5          7  6
6          3  1
7          6  7

H0: ρs = 0 (There is no correlation between the variables.)

Ha: ρs ≠ 0 (There is a significant correlation between the
variables.)
Application
Candidate  x  y  d = x − y  d²
1          2  1   1         1
2          4  4   0         0
3          1  3  −2         4
4          5  2   3         9
5          7  6   1         1
6          3  1   2         4
7          6  7  −1         1
Σd² = 20

$$r_s = 1 - \frac{6(20)}{7(7^2 - 1)} \approx 0.643$$

Critical value = 0.715

Since the statistic 0.643 does not fall in the rejection region, fail to reject H0. There
is not enough evidence to support the claim that there is a significant correlation.
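The arithmetic for this example can be checked with a few lines of Python. The sketch assumes the standard formula rs = 1 − 6Σd² / (n(n² − 1)), and the function name `spearman_rs` is hypothetical. The x and y ranks are copied from the table above as given.

```python
def spearman_rs(x_ranks, y_ranks):
    """Return (rs, sum of squared rank differences) for paired ranks."""
    n = len(x_ranks)
    d_squared = [(x - y) ** 2 for x, y in zip(x_ranks, y_ranks)]
    total = sum(d_squared)
    return 1 - 6 * total / (n * (n ** 2 - 1)), total

x = [2, 4, 1, 5, 7, 3, 6]
y = [1, 4, 3, 2, 6, 1, 7]

rs, total = spearman_rs(x, y)
print(total, round(rs, 3))
```

The sum of squared differences is 20 and rs rounds to 0.643, in agreement with the table.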