# Non Parametric Tests

Rohit Vishal Kumar April 24, 2008

## Contents

- 1 Introduction
- 2 Mann-Whitney U Test
  - 2.1 Steps in Computation
  - 2.2 Example
  - 2.3 Remarks
- 3 Kruskal Wallis H Test
- 4 The Sign Test
  - 4.1 Sign test for small samples
  - 4.2 Sign test for large samples
  - 4.3 Remarks
- 5 Wilcoxon Rank Sum T Test
- 6 Wald Wolfowitz Run Test
- 7 Run Test for Randomness
- 8 Advantages of Non-Parametric Test
- 9 Disadvantages of Non-Parametric Test

## 1 Introduction

Most of the hypothesis testing considered in statistics is based on the assumption that the population follows a particular probability distribution. Situations arise in practice in which such assumptions may not be justified, or in which there is doubt that they apply, as in the case where the population may be highly skewed. Because of this, statisticians have devised several tests and methods that are independent of the population distribution and its associated parameters. These are called non-parametric tests.

Non-parametric tests can be used as a shortcut replacement for more complicated tests. They are especially valuable in dealing with non-numeric data, particularly nominal and ordinal data. We shall study some of the common non-parametric tests.

## 2 Mann-Whitney U Test

This test is normally used when the measurement is ordinal and comes from two independent samples. It is used to determine whether the samples are from the same population or not. It is a relatively powerful non-parametric test and is an alternative to Student's t test, especially when the data cannot meet the assumptions of the t test. Either a one-tailed or a two-tailed test can be performed.

### 2.1 Steps in Computation

Step 1. Combine all sample values in an array from the smallest to the largest and assign ranks to all these values. If two or more sample values are identical (i.e. there is a tie), the sample values are each assigned a rank equal to the mean of the ranks that would otherwise be assigned. For example, if two equal values occupy ranks 12 and 13 in the array, then the rank assigned to each of them would be $\frac{1}{2}(12 + 13) = 12.5$.

Step 2. Find the sum of the ranks for each of the samples. Denote these sums by $R_1$ and $R_2$, corresponding to the sample sizes $N_1$ and $N_2$ respectively. For convenience choose $N_1$ as the smaller size if they are unequal, so that $N_1 \le N_2$. A significant difference between the rank sums $R_1$ and $R_2$ implies a significant difference between the samples.

Step 3. To test the difference between the rank sums, use the following statistic, corresponding to sample 1:

$$U = N_1 N_2 + \frac{N_1 (N_1 + 1)}{2} - R_1$$

For sufficiently large samples the standardised value of $U$ (Step 5) approximately follows $N(0,1)$.

NOTE: The assumption of normality is only valid when both $N_1, N_2 > 8$.

Step 4. The sampling distribution of $U$ is symmetrical and has a mean and variance given respectively by

$$\mu_U = \frac{N_1 N_2}{2}, \qquad \sigma_U^2 = \frac{N_1 N_2 (N_1 + N_2 + 1)}{12}$$

which need to be calculated.

Step 5. Compute the Z value as

$$Z = \frac{U - \mu_U}{\sigma_U}$$

then compare it with the relevant value from the Z table and draw the conclusion.
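The five steps can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names `average_ranks` and `mann_whitney_z` are made up for this sketch, and it assumes the first argument is the sample playing the role of sample 1.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from smallest to largest; tied values share the mean rank (Step 1)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def mann_whitney_z(sample_1, sample_2):
    """Return the rank sums, U (for sample 1), its mean, standard deviation and Z."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = average_ranks(list(sample_1) + list(sample_2))
    r1 = sum(ranks[:n1])                      # Step 2: rank sum of sample 1
    r2 = sum(ranks[n1:])                      # Step 2: rank sum of sample 2
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1      # Step 3
    mean_u = n1 * n2 / 2                      # Step 4
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u                   # Step 5
    return {"R1": r1, "R2": r2, "U": u, "mean_U": mean_u, "sd_U": sd_u, "Z": z}
```

The returned Z value is only meaningful under the normal approximation, i.e. when both sample sizes exceed 8 as noted above.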

### 2.2 Example

Given the following data about the strength of cables made from two different alloys, I and II, determine using the Mann-Whitney U Test whether there is a significant difference between the strength of the cables made from alloy I and alloy II.

| Alloy I | Alloy II |
|---------|----------|
| 18.3 | 12.6 |
| 16.4 | 14.1 |
| 22.7 | 20.5 |
| 17.8 | 10.7 |
| 18.9 | 15.9 |
| 25.3 | 19.6 |
| 16.1 | 12.9 |
| 24.2 | 15.2 |
|      | 11.8 |
|      | 14.7 |

Solution. We organise the data into an array starting from the smallest to the largest and give ranks from 1 to 18 as follows:

| Data | Rank | Data | Rank |
|------|------|------|------|
| 10.7 | 1 | 16.4 | 10 |
| 11.8 | 2 | 17.8 | 11 |
| 12.6 | 3 | 18.3 | 12 |
| 12.9 | 4 | 18.9 | 13 |
| 14.1 | 5 | 19.6 | 14 |
| 14.7 | 6 | 20.5 | 15 |
| 15.2 | 7 | 22.7 | 16 |
| 15.9 | 8 | 24.2 | 17 |
| 16.1 | 9 | 25.3 | 18 |

Computing the sum of ranks for Alloy I we have:

| Alloy I | Rank |
|---------|------|
| 18.3 | 12 |
| 16.4 | 10 |
| 22.7 | 16 |
| 17.8 | 11 |
| 18.9 | 13 |
| 25.3 | 18 |
| 16.1 | 9 |
| 24.2 | 17 |
| Total | 106 |

Computing the sum of ranks for Alloy II we have:

| Alloy II | Rank |
|----------|------|
| 12.6 | 3 |
| 14.1 | 5 |
| 20.5 | 15 |
| 10.7 | 1 |
| 15.9 | 8 |
| 19.6 | 14 |
| 12.9 | 4 |
| 15.2 | 7 |
| 11.8 | 2 |
| 14.7 | 6 |
| Total | 65 |

Since the alloy I sample has the smaller sample size, $N_1 = 8$, we assign $R_1 = 106$ and $R_2 = 65$. Then we have:

$$U = (8)(10) + \frac{8(8 + 1)}{2} - 106 = 80 + 36 - 106 = 10$$

We now compute the mean as $\mu_U = N_1 N_2 / 2 = (8)(10)/2 = 40$ and the variance as $\sigma_U^2 = N_1 N_2 (N_1 + N_2 + 1)/12 = (8)(10)(8 + 10 + 1)/12 = 126.67$. Then the standard deviation is $\sigma_U = \sqrt{126.67} = 11.25$. Now computing the Z statistic we have $Z = (U - \mu_U)/\sigma_U = (10 - 40)/11.25 = -2.67$.

Conclusion. Because $|Z_{calc}| = 2.67$ exceeds the two-tailed critical value $Z_{0.05} = 1.96$ (equivalently, $Z_{calc} = -2.67 < -1.96$), we reject the null hypothesis (i.e. that there is no difference in the strength of the two alloys) and conclude that there is a significant difference between the strength of the two alloys.
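The arithmetic of this example can be re-traced with a short script. It is only a check of the figures above, written under the assumption that the data contain no ties (true for these 18 values), so that a value's rank is simply its position in the sorted combined array. The variable names are chosen for this sketch.

```python
from math import sqrt

alloy_1 = [18.3, 16.4, 22.7, 17.8, 18.9, 25.3, 16.1, 24.2]
alloy_2 = [12.6, 14.1, 20.5, 10.7, 15.9, 19.6, 12.9, 15.2, 11.8, 14.7]

# No ties in this data set, so rank = 1-based position in the sorted combined array.
combined = sorted(alloy_1 + alloy_2)
rank_of = {value: position for position, value in enumerate(combined, start=1)}

n1, n2 = len(alloy_1), len(alloy_2)
r1 = sum(rank_of[v] for v in alloy_1)   # rank sum for Alloy I
r2 = sum(rank_of[v] for v in alloy_2)   # rank sum for Alloy II

# Sanity check: the two rank sums must add up to N(N + 1)/2 with N = n1 + n2.
assert r1 + r2 == (n1 + n2) * (n1 + n2 + 1) // 2

u = n1 * n2 + n1 * (n1 + 1) // 2 - r1
mean_u = n1 * n2 / 2
sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u - mean_u) / sd_u

print(r1, r2, u)                      # 106 65 10
print(round(sd_u, 2), round(z, 2))    # 11.25 -2.67
```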

### 2.3 Remarks

1. To test the difference between the rank sums using sample 2, use the following statistic:

   $$U = N_1 N_2 + \frac{N_2 (N_2 + 1)}{2} - R_2$$

   The sampling distribution of $U$ is symmetrical and has a mean and variance given respectively by $\mu_U = N_1 N_2 / 2$ and $\sigma_U^2 = N_1 N_2 (N_1 + N_2 + 1)/12$, and its standardised value approximately follows $N(0,1)$.

2. The Mann-Whitney U test should be avoided if $N_1$ or $N_2$ is $\le 8$. Under such a situation it is better to use the t test.

3. $U_1 + U_2 = N_1 N_2$ and $R_1 + R_2 = (N_1 + N_2)(N_1 + N_2 + 1)/2$. These provide a check for the correctness of the calculation.

## 3 Kruskal Wallis H Test

The U test is a non parametric test for deciding whether or not two samples come from the same population. A generalisation of this for $k$ samples is provided by the Kruskal Wallis H Test. This test may be described as follows. Suppose that we have $k$ samples of size $N_1, N_2, N_3, \cdots, N_k$, with the total size of all samples taken together being given by $N = N_1 + N_2 + N_3 + \cdots + N_k$. Suppose further that the data from all the samples are taken together and ranked, and that the sums of ranks for the $k$ samples are $R_1, R_2, R_3, \cdots, R_k$ respectively. Then the statistic

$$H = \frac{12}{N(N + 1)} \sum_{j=1}^{k} \frac{R_j^2}{N_j} - 3(N + 1)$$

can be shown to follow a $\chi^2$ distribution with $(k - 1)$ degrees of freedom, provided that the $N_j$'s are all at least 5 and that there are no ties in rank.

In case there are too many ties amongst the observations in the sample data, the value of $H$ is smaller than it should be. The corrected value of $H$, denoted by $H_C$, is obtained by dividing the value of $H$ by the correction factor $C$, i.e.

$$H_C = \frac{H}{C} \qquad \text{where} \qquad C = 1 - \frac{\sum (T^3 - T)}{N^3 - N}$$

where $T$ is the number of ties corresponding to each observation and where the sum is taken over all the observations. If there are no ties then $T = 0$ and $C$ reduces to 1, so that no correction is needed. In practice, the correction is usually negligible (i.e. not enough to warrant a change in the decision).

The H test provides a non parametric method in the analysis of variance for one way classification, or one-factor experiments, and generalisations can be made.

## 4 The Sign Test

The sign test is the simplest of all the non parametric tests. Its name comes from the fact that it is based on the direction (the plus or minus signs) of a pair of observations and not on their numerical magnitude. In any sign test problem we count (a) the number of (+)ve signs, (b) the number of (−)ve signs and (c) the number of zeros, i.e. differences which cannot be counted as either positive or negative. In case there is a tie, i.e. both the values are the same, thereby giving zero as the difference, the convention is that we drop that particular pair of observations and work with the remaining pairs.

### 4.1 Sign test for small samples

It is hypothesised that if the differences in signs are purely due to chance then the probability of a (+)ve sign is 1/2 and that of a (−)ve sign is 1/2. If $S$ is the number of times the less frequent sign occurs, then $S$ has a binomial distribution with $p = 1/2$. We take $H_0: p = 0.5$ as the null hypothesis. The critical value for a two sided test at $\alpha = 0.05$ can be conveniently found by the expression:

$$K = \frac{n - 1}{2} - (0.98)\sqrt{n}$$

The null hypothesis $H_0$ is rejected if $S \le K$.

Example. Use the sign test to see if there is a difference between the number of days until the collection of an account receivable before and after a new collection policy is implemented. Use the 0.05 level of significance.

| Before | After | Sign (Before − After) |
|--------|-------|-----------------------|
| 30 | 32 | − |
| 28 | 29 | − |
| 34 | 33 | + |
| 35 | 32 | + |
| 40 | 37 | + |
| 42 | 43 | − |
| 33 | 40 | − |
| 38 | 41 | − |
| 34 | 37 | − |
| 45 | 44 | + |
| 28 | 27 | + |
| 27 | 33 | − |
| 25 | 30 | − |
| 41 | 38 | + |
| 36 | 36 | 0 |

Solution. The plus or minus sign for each pair is shown along with the raw data in the table above. From it we see that there are 8 (−)ve signs, 6 (+)ve signs and 1 zero. As per the convention we drop the pair giving rise to the zero. Then $n = 15 - 1 = 14$ and $S = 6$, as the (+)ve sign is less frequent. Calculating the value of $K$ we have:

$$K = \frac{14 - 1}{2} - (0.98)\sqrt{14} = 6.50 - 3.67 = 2.83$$

Since $S > K$ the null hypothesis is not rejected, and we may conclude that there is no significant difference in the number of days until the collection of an account receivable before and after the introduction of the new policy.

### 4.2 Sign test for large samples

For large samples, generally taken as $n > 25$, the normal approximation to the binomial may be used, if desired with a correction for continuity. The value of Z can be calculated using the formula:

$$Z = \frac{X - np}{\sqrt{np(1 - p)}}$$

where $X$ is the number of times the less frequent sign occurs.

Example. The following data relate to the daily production of cement (in million tons) of a cement plant for 30 days. Use the sign test to test the null hypothesis that the plant's daily average production of cement is 11.2 million tons against the alternative hypothesis that it is less than 11.2 million tons, at the 0.05 level of significance.

Solution. Putting + or − signs after comparing each value with 11.2 we have:

| Observations | Production (million tons) | Signs |
|--------------|---------------------------|-------|
| 1 to 10 | 11.5 11.1 9.3 12.3 10.8 11.6 10.0 10.2 10.7 11.4 | + − − + − + − − − + |
| 11 to 20 | 11.9 8.3 11.2 9.6 11.3 10.2 12.4 9.3 10.0 8.7 | + − 0 − + − + − − − |
| 21 to 30 | 10.4 11.6 9.6 10.4 12.3 9.3 11.4 9.5 10.5 11.5 | − + − − + − + − − + |

| Count | Value |
|-------|-------|
| The number of plus signs | 11 |
| The number of minus signs | 18 |
| Number of zeros | 1 |
| Total sample size | 30 |

Hence we have $X = 11$, $n = 29$ and $p = 1/2$. Substituting the values in the formula we have:

$$Z = \frac{11 - 29(0.5)}{\sqrt{29(0.5)(1 - 0.5)}} = \frac{11 - 14.5}{\sqrt{7.25}} = \frac{-3.5}{\sqrt{7.25}} \approx -1.30$$

Since $|Z_{calc}| = 1.30 \le |Z_{tab,0.05}| = 1.645$ we cannot reject the null hypothesis, and we conclude that the plant's average production of cement may be equal to 11.2 million tons per day.
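Both versions of the sign test can be sketched in a few lines of Python. The helper names (`sign_counts`, `small_sample_sign_test`, `large_sample_sign_z`) are invented for this illustration, and the large-sample function applies the Z formula without a continuity correction, as in the worked example. The data are those of the two examples in this section.

```python
from math import sqrt


def sign_counts(differences):
    """Count the plus signs, minus signs and zeros in a list of differences."""
    plus = sum(1 for d in differences if d > 0)
    minus = sum(1 for d in differences if d < 0)
    zeros = len(differences) - plus - minus
    return plus, minus, zeros


def small_sample_sign_test(differences):
    """Return (S, K, reject) for the two-sided test at the 0.05 level."""
    plus, minus, _ = sign_counts(differences)
    n = plus + minus                      # zero differences are dropped
    s = min(plus, minus)                  # less frequent sign
    k = (n - 1) / 2 - 0.98 * sqrt(n)      # critical value K
    return s, k, s <= k


def large_sample_sign_z(differences, p=0.5):
    """Return Z = (X - np) / sqrt(np(1 - p)), with no continuity correction."""
    plus, minus, _ = sign_counts(differences)
    n = plus + minus
    x = min(plus, minus)
    return (x - n * p) / sqrt(n * p * (1 - p))


before = [30, 28, 34, 35, 40, 42, 33, 38, 34, 45, 28, 27, 25, 41, 36]
after = [32, 29, 33, 32, 37, 43, 40, 41, 37, 44, 27, 33, 30, 38, 36]
s, k, reject = small_sample_sign_test([b - a for b, a in zip(before, after)])
print(s, round(k, 2), reject)   # 6 2.83 False

cement = [11.5, 11.1, 9.3, 12.3, 10.8, 11.6, 10.0, 10.2, 10.7, 11.4,
          11.9, 8.3, 11.2, 9.6, 11.3, 10.2, 12.4, 9.3, 10.0, 8.7,
          10.4, 11.6, 9.6, 10.4, 12.3, 9.3, 11.4, 9.5, 10.5, 11.5]
z = large_sample_sign_z([value - 11.2 for value in cement])
print(round(z, 2))              # -1.3
```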

### 4.3 Remarks

1. The sign test is used for observations that have been randomly selected in pairs, using a paired difference experiment.

2. It is one of the few tests that can be employed when the only information available is that one observation exceeds another or vice versa.

3. According to some statisticians, the sign test should always be used with caution, as the rejection or non-rejection depends on random pairing. In other words, one set of pairings of the same data may lead to rejection whereas another may lead to acceptance.

## 5 Wilcoxon Rank Sum T Test

This test is used for testing dependent samples in which data is collected in matched pairs. (This paired-sample procedure is more widely known as the Wilcoxon signed-rank test.) The test takes into account both the direction of the differences within a pair of observations and the relative magnitude of the differences. It gives more weight to the pairs showing large differences than to pairs showing small differences. To use this test, measurement must at least be ordinally scaled within pairs.

For the Wilcoxon Rank Sum Test, the basic assumption is that the samples are from the same population. If this assumption is true, then the differences between the pairs (either + or −) should be symmetrically distributed around a central value.

Assume that there are $N$ pairs of values $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$. Let $H_0$ be that the $N$ pairs of observations have been drawn from identical (or the same) populations. Compute the list of differences $\delta_j = (x_j - y_j)$. Next, sort the absolute values of the differences $\{|\delta_j|\}$ into ascending order and rank them. Add up the ranks assigned to the positive differences and call the sum $W^+$. Similarly find the sum of the ranks assigned to the negative differences and call it $W^-$. For $N > 25$ the following test statistic is applicable:

$$z = \frac{T - \mu_T}{\sigma_T} \quad \text{follows } N(0,1)$$

where

$$\mu_T = \frac{N(N + 1)}{4}, \qquad \sigma_T = \sqrt{\frac{N(N + 1)(2N + 1)}{24}}, \qquad T = \min\{W^+, W^-\}$$

## 6 Wald Wolfowitz Run Test

The Wald Wolfowitz Run test is a non-parametric test for testing the null hypothesis that the distribution functions of two continuous populations are the same.

Suppose $x_1, x_2, x_3, \ldots, x_{n_1}$ is an ordered sample from a population with the density function $f_1(\cdot)$ and let $y_1, y_2, y_3, \ldots, y_{n_2}$ be an independent ordered sample from another population with density function $f_2(\cdot)$. What we want to test is whether the samples have been drawn from the same population, or from populations with the same density functions, i.e. $f_1(\cdot) = f_2(\cdot)$. Let us combine the two samples and arrange the observations in order of magnitude to give the combined ordered sample as, say, $x_1\, x_2\, x_3\, y_1\, y_2\, y_3\, y_4\, x_4\, x_5 \ldots$

A run is defined as a sequence of letters of one kind surrounded by sequences of letters of the other kind, and the number of elements in a run is usually referred to as the length ($l$) of the run. In the above example we have, in order, a run of x ($l = 3$), a run of y ($l = 4$), a run of x ($l = 2$), etc.

If both the samples come from the same population then there would be a thorough mingling of x's and y's, and consequently the number of runs in the combined sample would be large. On the other hand, if the samples come from two different populations so that their ranges do not overlap, then there would be only two runs, of the type $x_1, x_2, x_3, \ldots, x_{n_1}$ and $y_1, y_2, y_3, \ldots, y_{n_2}$.

In order to test the null hypothesis $H_0: f_1(\cdot) = f_2(\cdot)$, i.e. that the samples have come from the same population, we count the number of runs ($U$) in the combined ordered sample. The null hypothesis is rejected if $U < u_0$, where the value of $u_0$ is determined by considering the distribution of $U$ under $H_0$.

Given that $n_1, n_2$ are the numbers of observations of x and y respectively, under the null hypothesis we have:

$$E(U) = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad Var(U) = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$

and we can use the normal test

$$Z = \frac{U - E(U)}{\sqrt{Var(U)}}$$

which follows $N(0, 1)$ asymptotically. This approximation is a fairly good representation if $n_1, n_2 \ge 10$. Since the alternative hypothesis is 'too few runs', the test is ordinarily one-tailed, with only negative values of $Z$ leading to rejection. The test has a very low power; its relative efficiency compared to the traditional t test for equal variances is zero. Furthermore, it has the least power compared to other nonparametric tests applied to the same data.
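Counting runs and forming the Z statistic of the Wald Wolfowitz test can be sketched as below. The function names `count_runs` and `wald_wolfowitz_z` are hypothetical, and the sketch assumes there are no ties between the two samples (reasonable for continuous populations); how ties across samples should be broken is not covered here.

```python
from math import sqrt


def count_runs(labels):
    """Number of runs: maximal blocks of identical consecutive labels."""
    if not labels:
        return 0
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def wald_wolfowitz_z(x_sample, y_sample):
    """Return (U, E(U), Var(U), Z) for the combined ordered sample."""
    # Tag every observation with its sample of origin, then sort by value.
    # Assumption: no value occurs in both samples.
    tagged = sorted([(v, "x") for v in x_sample] + [(v, "y") for v in y_sample])
    u = count_runs([label for _, label in tagged])
    n1, n2 = len(x_sample), len(y_sample)
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return u, expected, variance, (u - expected) / sqrt(variance)


# The pattern x x x y y y y x x from the text contains three runs.
print(count_runs(list("xxxyyyyxx")))   # 3
```

A strongly negative Z (too few runs) is the evidence against the null hypothesis; the normal approximation is only appropriate when both sample sizes are at least 10.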

parametric methods are readily applicable.

4. Since socio-economic data are not, in general, normally distributed, non parametric tests have found application in various social sciences such as Psychometry, Sociology and Educational Statistics.

5. Non Parametric tests are available to deal with data which are given in ranks or grades.

## 7 Run Test for Randomness

Another application of the 'run' test is in testing the randomness of a given set of observations. Let $x_1, x_2, x_3, \ldots, x_n$ be the set of observations arranged in the order in which they occur, i.e. $x_i$ is the $i$th observation in the outcome of an experiment. Then, for each of the observations, we see if it is below or above the median of the observations, and we write A if the observation is above the median and B if the observation is below the median. Thus we get a sequence of A's and B's of the type A A A B A B B B B A B A B, say. Then, under the null hypothesis $H_0$ that the set of observations is random, the number of runs, denoted by $U$, is a random variable with

$$E(U) = \frac{n + 2}{2}, \qquad Var(U) = \frac{n(n - 2)}{4(n - 1)}$$

For large $n$ (say $> 25$), $U$ may be regarded as asymptotically normal and we may use the normal test.

## 9 Disadvantages of Non-Parametric Test

1. Non Parametric tests can be used only if the measurements are nominal or ordinal. Even in that case, if a parametric test exists it is more powerful than the non parametric test.

2. Non Parametric tests are designed to test statistical hypotheses only and not for estimating parameters.

3. So far, no Non Parametric test is available for testing interactions in the ANOVA model unless specific assumptions are made about the additivity of the model.

This document can be obtained from:

Rohit Vishal Kumar
Reader - (Department of Marketing)
Xavier Institute of Social Service
P.O. Box No: 7, Purulia Road
Ranchi - 834 001, Jharkhand, India
Phone: (91-651) 2241-8694 / 8695 Ext. 405
Email: rohitvishalkumar@yahoo.com
Website: www.iiswbm.edu
Final Print on: April 24, 2008

8