11 Nonparametric Tests

Elementary Statistics

Larson Farber

Section 11.1

Nonparametric Tests

A nonparametric test is a hypothesis test that does not require

any specific conditions about the shape of the populations or the

value of any population parameters.

Nonparametric tests are often called distribution-free tests.

The Sign Test is a nonparametric test that can be used to

test a population median against a hypothesized value, k.

Hypotheses

Left-tailed test: H0: median ≥ k and Ha: median < k

Right-tailed test: H0: median ≤ k and Ha: median > k

Two-tailed test: H0: median = k and Ha: median ≠ k

Sign Test

To use the sign test, first compare each entry in the

sample to the hypothesized median, k.

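If the entry is below the median, assign it a − sign.

If the entry is above the median, assign it a + sign.

If the entry is equal to the median, assign it a 0.

Then compare the number of + and − signs. If the

number of + signs and the number of − signs are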

approximately equal, the null hypothesis is not likely to

be rejected. If they are not approximately equal,

however, it is likely that the null hypothesis will be

rejected.

Sign Test

Test Statistic: When n ≤ 25, the test statistic is the

smaller number of + or − signs. Here n is the total number of + and − signs.

For n > 25, you are testing the binomial probability that p = 0.50.
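For the large-sample case, a minimal sketch of the usual continuity-corrected normal approximation, z = ((x + 0.5) − 0.5n) / (√n / 2), may help. The function name `sign_test_z` and the counts used below are hypothetical and do not come from the slides.

```python
import math


def sign_test_z(x: int, n: int) -> float:
    """Continuity-corrected z-score for the sign test when n > 25.

    x: the smaller of the number of + signs and the number of - signs.
    n: the total number of + and - signs (zeros are excluded).
    Under H0 the count is binomial with p = 0.50, so the mean is 0.5 * n
    and the standard deviation is sqrt(n) / 2.
    """
    return ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)


# Hypothetical counts: 12 of the rarer sign among 36 nonzero signs.
z = sign_test_z(12, 36)
print(round(z, 3))
```

The resulting z-score would then be compared with a critical value from the standard normal table.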

Application

A meteorologist claims that the daily median temperature for

the month of January in San Diego is 57° Fahrenheit. The

temperatures (in degrees Fahrenheit) for 18 randomly selected

January days are listed below. At α = 0.01, can you support the

meteorologist's claim?

58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55

1. Write the null and alternative hypothesis.

H0: median = 57 and Ha: median ≠ 57

2. State the level of significance.

α = 0.01

3. Determine the sampling distribution.

Binomial with p = 0.5

58 62 55 55 53 52 52 59 55
 +  +  −  −  −  −  −  +  −

55 60 56 57 61 58 63 63 55
 −  +  −  0  +  +  +  +  −

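4. Find the critical value. Discarding the entry equal to 57 leaves n = 17. From Table 8,

the critical value is 2.

5. Find the rejection region. Reject H0 if the test

statistic is less than or equal to 2.

6. Find the test statistic. There are 8 + signs and 9 − signs,

so the test statistic is 8.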

7. Make your decision.

The test statistic, 8, does not fall in the critical region. Fail

to reject the null hypothesis.

8. Interpret your decision.

There is not enough evidence to reject the

meteorologist's claim that the median daily

temperature for January in San Diego is 57°F.
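The San Diego example can be checked with a short sketch. It assumes the 18 temperatures and the critical value of 2 given in the example; the helper name `sign_test_counts` is illustrative only.

```python
def sign_test_counts(data, k):
    """Count entries above (+) and below (-) the hypothesized median k.

    Entries equal to k are assigned a 0 and are not counted.
    """
    plus = sum(1 for value in data if value > k)
    minus = sum(1 for value in data if value < k)
    return plus, minus


# The 18 January temperatures from the example.
temps = [58, 62, 55, 55, 53, 52, 52, 59, 55,
         55, 60, 56, 57, 61, 58, 63, 63, 55]

plus, minus = sign_test_counts(temps, 57)
n = plus + minus                # number of nonzero signs
statistic = min(plus, minus)    # test statistic when n <= 25
critical_value = 2              # taken from the example (Table 8)
reject_h0 = statistic <= critical_value

print(plus, minus, n, statistic, reject_h0)  # 8 9 17 8 False
```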

The sign test can also be used with

paired data (such as before and after).

Find the difference between

corresponding values and record the

sign. Use the same procedure.

Section 11.2

Wilcoxon Signed-Rank Test

The Wilcoxon signed-rank test is a nonparametric test that

can be used to determine whether two dependent samples

were selected from populations with the same distribution.

To find the test statistic, ws:

Find the difference for each pair:

Sample 1 value − Sample 2 value

Find the absolute value of the difference.

Rank order these absolute differences. (Tied values share the average of their ranks.)

Affix a + or − sign to each of the rankings.

Find the sum of the positive ranks.

Find the sum of the negative ranks.

Select the smaller of the absolute values of the sums.
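The steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes that pairs with a difference of 0 are dropped and that tied absolute differences share the average of their ranks. The function names and the sample values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values get their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(sample1, sample2):
    """Return (sum of positive ranks, sum of negative ranks, ws)."""
    diffs = [a - b for a, b in zip(sample1, sample2) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative = -sum(r for d, r in zip(diffs, ranks) if d < 0)
    return positive, negative, min(abs(positive), abs(negative))


# Hypothetical paired data (not the headache data from the slides).
before = [10, 12, 9, 15, 11, 14]
after = [8, 13, 9, 11, 12, 10]
print(wilcoxon_signed_rank(before, after))  # (12.0, -3.0, 3.0)
```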

Application

The table shows the daily headache hours suffered by 12

patients before and after receiving a new drug for seven weeks.

At α = 0.01, is there enough evidence to conclude that the

new drug helped to reduce daily headache hours?

1. Write the null and alternative hypothesis.

H0: The headache hours after using the new drug are

at least as long as before using the drug.

Ha: The new drug reduces headache hours. (Claim)

2. State the level of significance.

α = 0.01

Patient  Before  After  Diff.  Abs.  Rank  Signed rank
2        3.9     2.8     1.1   1.1   5.0    5.0
3        3.8     2.5     1.3   1.3   6.0    6.0
4        2.5     2.6    −0.1   0.1   1.5   −1.5
5        2.4     1.9     0.5   0.5   3.0    3.0
6        3.6     1.8     1.8   1.8   8.0    8.0
7        3.4     2.0     1.4   1.4   7.0    7.0
8        2.4     1.6     0.8   0.8   4.0    4.0

The sum of the positive ranks is 5 + 6 + 3 + 8 + 7 + 4 = 33.

The sum of the negative ranks is −1.5 + (−1.5) = −3.

The test statistic is the smaller of the absolute values of

these sums, ws = 3.

The critical value is 2. Because ws = 3 is greater than the

critical value, fail to reject the null hypothesis.

There is not enough evidence to conclude the

new drug reduces headache hours.

Wilcoxon Rank-Sum Test

The Wilcoxon rank-sum test is a nonparametric test that

can be used to determine whether two independent

samples were selected from populations having the same

distribution.

n1 represents the size of the smaller sample and n2

the size of the larger sample.

When the samples are the same size, it does not matter which is n1.

Wilcoxon Rank-Sum Test

Test statistic:

Combine the data from both samples and rank it.

R = the sum of the ranks for the smaller sample.

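Find the z-score for the value of R:

$$z = \frac{R - \mu_R}{\sigma_R}$$

where

$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} \quad \text{and} \quad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$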
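As a rough illustration of the rank-sum z-score, the sketch below assumes the usual normal approximation, in which R has mean n1(n1 + n2 + 1)/2 and standard deviation sqrt(n1 n2 (n1 + n2 + 1)/12). The function names and the tiny data set are hypothetical; samples this small are used only to keep the arithmetic easy to follow.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest); tied values get their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def rank_sum_z(sample_a, sample_b):
    """Return (R, z), where R is the rank sum of the smaller sample."""
    # sorted() is stable, so sample_a is treated as n1 when sizes are equal.
    small, large = sorted([list(sample_a), list(sample_b)], key=len)
    n1, n2 = len(small), len(large)
    ranks = average_ranks(small + large)
    r = sum(ranks[:n1])  # ranks belonging to the smaller sample
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return r, (r - mu) / sigma


# Hypothetical data for illustration only.
r, z = rank_sum_z([4, 5, 6, 7], [1, 2, 3])
print(r, round(z, 3))
```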

Section 11.3

The Kruskal-Wallis Test

The Kruskal-Wallis Test

The Kruskal-Wallis test is a nonparametric test that can be

used to determine whether three or more independent

samples were selected from populations having the same

distribution.

H0: There is no difference in the population distributions.

Ha: There is a difference in the population distributions.

To find the test statistic, combine the data and rank the values. Then

separate the data according to sample and find

the sum of the ranks for each sample.

The Kruskal-Wallis Test

Given three or more independent samples, the test

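statistic H for the Kruskal-Wallis test is:

$$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k}\right) - 3(N+1)$$

where k is the number of samples, $n_i$ is the

size of the i th sample, N is the sum of the sample

sizes, and $R_i$ is the sum of the ranks of the i th

sample.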

The sampling distribution is a chi-square distribution with k − 1

degrees of freedom (where k = the number of samples).

Reject the null hypothesis when H is greater than the critical

number. (Always use a right-tail test.)

Application

You want to compare the hourly pay rates of accountants

who work in Michigan, New York and Virginia. To do so, you

randomly select 10 accountants in each state and record

their hourly pay rate as shown below. At the 0.01 level, can

you conclude that the distributions of accountants' hourly pay

rates in these three states are different?

Michigan  New York  Virginia
14.24     21.18     17.02
14.06     20.94     20.63
14.85     16.26     17.47
17.47     21.03     15.54
14.83     19.95     15.38
19.01     17.54     14.90
13.08     14.89     20.48
15.94     18.88     18.50
13.48     20.06     12.80
16.94     21.81     15.57

1. Write the null and alternative hypothesis.

H0 : There is no difference in the hourly pay rate in the 3 states.

Ha : There is a difference in the hourly pay in the 3 states.

2. State the level of significance. α = 0.01

3. Determine the sampling distribution.

The sampling distribution is chi-square with d.f. = 3 − 1 = 2.

4. Find the critical value.

From Table 6, the critical value is 9.210.

5. Find the rejection region.

Reject H0 if H > 9.210.

Test Statistic

Data   State  Rank
12.80  VA      1
13.08  MI      2
13.48  MI      3
14.06  MI      4
14.24  MI      5
14.83  MI      6
14.85  MI      7
14.89  NY      8
14.90  VA      9
15.38  VA     10
15.54  VA     11
15.57  VA     12
15.94  MI     13
16.26  NY     14
16.94  MI     15
17.02  VA     16
17.47  MI     17.5
17.47  VA     17.5
17.54  NY     19
18.50  VA     20
18.88  NY     21
19.01  MI     22
19.95  NY     23
20.06  NY     24
20.48  VA     25
20.63  VA     26
20.94  NY     27
21.03  NY     28
21.18  NY     29
21.81  NY     30

Michigan salaries are in ranks: 2, 3, 4, 5, 6, 7, 13, 15, 17.5, 22. The sum is 94.5.

New York salaries are in ranks: 8, 14, 19, 21, 23, 24, 27, 28, 29, 30. The sum is 223.

Virginia salaries are in ranks: 1, 9, 10, 11, 12, 16, 17.5, 20, 25, 26. The sum is 147.5.

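6. Find the test statistic.

R1 = 94.5, R2 = 223, R3 = 147.5

n1 = 10, n2 = 10 and n3 = 10, so N = 30

$$H = \frac{12}{30(31)}\left(\frac{94.5^2}{10} + \frac{223^2}{10} + \frac{147.5^2}{10}\right) - 3(31) \approx 10.76$$

The critical value is 9.210 and the test statistic is 10.76.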

The test statistic 10.76 falls in the rejection region, so

reject the null hypothesis.

There is enough evidence at the 0.01 level to conclude that the distributions of hourly pay rates in the 3 states are different.
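The rank sums and the value of H in this example can be reproduced with a short sketch. It assumes the standard form of the statistic, H = 12/(N(N + 1)) · Σ(Ri²/ni) − 3(N + 1), with tied values sharing the average of their ranks. The function names are illustrative, and the critical value 9.210 is the one quoted in the example.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values get their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """Return (list of rank sums, H) for three or more samples."""
    combined = [value for sample in samples for value in sample]
    ranks = average_ranks(combined)
    total_n = len(combined)
    rank_sums, weighted, start = [], 0.0, 0
    for sample in samples:
        r = sum(ranks[start:start + len(sample)])  # rank sum of this sample
        rank_sums.append(r)
        weighted += r * r / len(sample)
        start += len(sample)
    h = 12 / (total_n * (total_n + 1)) * weighted - 3 * (total_n + 1)
    return rank_sums, h


# Hourly pay rates from the example.
michigan = [14.24, 14.06, 14.85, 17.47, 14.83, 19.01, 13.08, 15.94, 13.48, 16.94]
new_york = [21.18, 20.94, 16.26, 21.03, 19.95, 17.54, 14.89, 18.88, 20.06, 21.81]
virginia = [17.02, 20.63, 17.47, 15.54, 15.38, 14.90, 20.48, 18.50, 12.80, 15.57]

rank_sums, h = kruskal_wallis_h([michigan, new_york, virginia])
print(rank_sums, round(h, 2))  # [94.5, 223.0, 147.5] 10.76
print(h > 9.210)               # True: H falls in the rejection region
```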

Section 11.4

Rank Correlation

Rank Correlation

The Spearman rank correlation coefficient, rs, is a measure of

the strength of the relationship between two variables. The

Spearman rank correlation coefficient is calculated using the

ranks of paired sample data entries. The formula for the

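Spearman rank correlation coefficient is

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where n is the number of paired data entries and d is the

difference between the ranks of a paired data entry.

The hypotheses:

H0: ρs = 0 (There is no correlation between the variables.)

Ha: ρs ≠ 0 (There is a significant correlation between the

variables.)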

Rank Correlation

Seven candidates applied for a nursing position. The seven

candidates were placed in rank order first by x and then by y.

The results of the rankings are listed below. Using a 0.05 level

of significance, test the claim that there is a significant

correlation between the variables.

Candidate  x  y
1          2  1
2          4  4
3          1  3
4          5  2
5          7  6
6          3  1
7          6  7

H0: ρs = 0 (There is no correlation between the variables.)

Ha: ρs ≠ 0 (There is a significant correlation between the

variables.)

Application

Candidate  x  y  d = x − y  d²
1          2  1      1      1
2          4  4      0      0
3          1  3     −2      4
4          5  2      3      9
5          7  6      1      1
6          3  1      2      4
7          6  7     −1      1

Σd² = 20

$$r_s = 1 - \frac{6(20)}{7(7^2 - 1)} = 1 - \frac{120}{336} \approx 0.643$$

Critical Value = 0.715

Since the statistic 0.643 does not fall in the rejection region, fail to reject H0. There

is not enough evidence to support the claim that there is a significant correlation.
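The value 0.643 can be reproduced from the rankings with a minimal sketch. It assumes the formula rs = 1 − 6Σd² / (n(n² − 1)) applied directly to the ranks as listed; the function name `spearman_rs` is illustrative, and the critical value 0.715 is the one quoted in the example.

```python
def spearman_rs(x_ranks, y_ranks):
    """Spearman rank correlation from two lists of ranks.

    Uses r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between the ranks of each paired entry.
    """
    n = len(x_ranks)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Rankings of the seven candidates, as listed in the example.
x = [2, 4, 1, 5, 7, 3, 6]
y = [1, 4, 3, 2, 6, 1, 7]

rs = spearman_rs(x, y)
critical_value = 0.715  # taken from the example
print(round(rs, 3))               # 0.643
print(abs(rs) > critical_value)   # False: fail to reject H0
```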

