Non Parametric Tests

Published by: Rohit Vishal Kumar on Jan 07, 2009
Copyright:Attribution Non-commercial

Rohit Vishal Kumar  April 24, 2008
Contents
1 Introduction
2 Mann-Whitney U Test
2.1 Steps in Computation
2.2 Example
2.3 Remarks
3 Kruskal Wallis H Test
4 The Sign Test
4.1 Sign test for small samples
4.2 Sign test for large samples
4.3 Remarks
5 Wilcoxon Rank Sum T Test
6 Wald Wolfowitz Run Test
7 Run Test for Randomness
8 Advantages of Non-Parametric Test
9 Disadvantages of Non-Parametric Test
1 Introduction
Most hypothesis testing in statistics rests on the assumption that the population follows a particular probability distribution. Situations arise in practice in which such assumptions may not be justified, or in which there is doubt that they apply, as in the case where the population may be highly skewed. Because of this, statisticians have devised several tests and methods that are independent of the population distribution and its associated parameters. These are called non-parametric tests.

Non-parametric tests can be used as shortcut replacements for more complicated tests. They are especially valuable in dealing with non-numeric data, in particular nominal and ordinal data. We shall study some of the common non-parametric tests.
2 Mann-Whitney U Test
This test is normally used when the measurement is ordinal and comes from two independent samples. It is used to determine whether the samples are from the same population or not. It is a relatively powerful non-parametric test and is an alternative to "Student's T" test, especially when the data cannot meet the assumptions of the T test. Both one-tailed and two-tailed tests can be performed.
2.1 Steps in Computation
Step 1
Combine all sample values in an array from the smallest to the largest and assign ranks to all these values. If two or more sample values are identical (i.e. there is a tie), the sample values are each assigned a rank equal to the mean of the ranks that would otherwise be assigned. For example, if two tied values occupy ranks 12 and 13 in the array, then the rank assigned to each of them would be $\frac{1}{2}(12 + 13) = 12.5$.
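The ranking rule of Step 1 can be sketched in a few lines of Python. This is only an illustration: the function name `average_ranks` and the sample list are hypothetical and are not part of the original notes.

```python
def average_ranks(values):
    """Return 1-based ranks of `values`; tied values get the mean of their ranks."""
    # Indices of `values`, ordered from the smallest value to the largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of sorted positions sharing one value.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Sorted positions pos..end would receive ranks pos+1..end+1;
        # every member of the run is given the mean of those ranks.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical data: the two 20s would occupy ranks 2 and 3, so each gets 2.5.
print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

When there are no ties the function simply returns the positions 1, 2, 3, ... of the values in the sorted array.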
Step 2
Find the sum of the ranks for each of the samples. Denote these sums by $R_1$ and $R_2$, corresponding to the sample sizes $n_1$ and $n_2$ respectively. For convenience choose $n_1$ as the smaller size if they are unequal, so that $n_1 \le n_2$. A significant difference between the rank sums $R_1$ and $R_2$ implies a significant difference between the samples.
Step 3
To test the difference between the rank sums, use the following statistic, corresponding to sample 1:
$$U = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$$
Once standardised (Steps 4 and 5), $U$ approximately follows N(0,1).
NOTE: The assumption of normality is only valid when both $n_1, n_2 > 8$.
Step 4
The sampling distribution of $U$ is symmetrical and has a mean and variance given respectively by the formulas $\mu_U = n_1 n_2 / 2$ and $\sigma_U^2 = n_1 n_2 (n_1 + n_2 + 1) / 12$, which need to be calculated.
Step 5
We compute the $z$ value by converting $U$ as follows: $z = (U - \mu_U) / \sigma_U$. We then compare it with the relevant Z table and draw the conclusion.
2.2 Example
Given the following data about the strength of cables made from two different alloys, I and II, determine using the Mann-Whitney U Test whether there is a significant difference between the strength of the cables made from alloy I and alloy II.

Alloy I    Alloy II
18.3       12.6
16.4       14.1
22.7       20.5
17.8       10.7
18.9       15.9
25.3       19.6
16.1       12.9
24.2       15.2
           11.8
           14.7
Solution
We organise the data into an array from the smallest to the largest and assign ranks from 1 to 18 as follows:

Data    Rank    Data    Rank
10.7    1       16.4    10
11.8    2       17.8    11
12.6    3       18.3    12
12.9    4       18.9    13
14.1    5       19.6    14
14.7    6       20.5    15
15.2    7       22.7    16
15.9    8       24.2    17
16.1    9       25.3    18

Computing the sum of ranks for Alloy I we have:

Alloy I    Rank
18.3       12
16.4       10
22.7       16
17.8       11
18.9       13
25.3       18
16.1       9
24.2       17
Total      106

Computing the sum of ranks for Alloy II we have:

Alloy II   Rank
12.6       3
14.1       5
20.5       15
10.7       1
15.9       8
19.6       14
12.9       4
15.2       7
11.8       2
14.7       6
Total      65

Since the alloy I sample has the smaller size, $n_1 = 8$, we assign $R_1 = 106$ and $R_2 = 65$. Then we have:
$$U = (8)(10) + \frac{8(8 + 1)}{2} - 106 = 10$$
We now compute the mean as $\mu_U = n_1 n_2 / 2 = (8)(10)/2 = 40$ and the variance as $\sigma_U^2 = n_1 n_2 (n_1 + n_2 + 1)/12 = (8)(10)(8 + 10 + 1)/12 = 126.67$. Then the standard deviation is $\sigma_U = \sqrt{126.67} = 11.25$. Now computing the $z$ statistic we have $z = (U - \mu_U)/\sigma_U = (10 - 40)/11.25 = -2.67$.
Conclusion
Because $|z_{calc}| = 2.67 > z_{0.05} = 1.96$, we reject the null hypothesis (i.e. that there is no difference in the strength of the two alloys) and conclude that there is a significant difference between the strength of the two alloys.
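The whole calculation for this example can be re-traced with a short Python sketch. The function name `mann_whitney_z` is a hypothetical helper, not something defined in these notes, and the sketch assumes there are no tied values (which holds for the alloy data).

```python
import math

alloy_1 = [18.3, 16.4, 22.7, 17.8, 18.9, 25.3, 16.1, 24.2]
alloy_2 = [12.6, 14.1, 20.5, 10.7, 15.9, 19.6, 12.9, 15.2, 11.8, 14.7]


def mann_whitney_z(sample_1, sample_2):
    """Return (R1, R2, U, mu, sigma, z) following Steps 1 to 5; assumes no ties."""
    n1, n2 = len(sample_1), len(sample_2)
    # Step 1: with no ties, the rank of a value is its 1-based sorted position.
    combined = sorted(sample_1 + sample_2)
    rank = {value: position + 1 for position, value in enumerate(combined)}
    # Step 2: rank sums of the two samples.
    r1 = sum(rank[v] for v in sample_1)
    r2 = sum(rank[v] for v in sample_2)
    # Step 3: U statistic corresponding to sample 1.
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    # Step 4: mean and standard deviation of U.
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    # Step 5: standardised value.
    z = (u - mu) / sigma
    return r1, r2, u, mu, sigma, z


r1, r2, u, mu, sigma, z = mann_whitney_z(alloy_1, alloy_2)
print(r1, r2, u, mu, round(sigma, 2), round(z, 2))  # 106 65 10.0 40.0 11.25 -2.67
print("reject H0" if abs(z) > 1.96 else "do not reject H0")  # reject H0
```

The printed values agree with the hand calculation: rank sums 106 and 65, U = 10, mean 40, standard deviation 11.25 and z = -2.67.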
2.3 Remarks
1. To test the difference between the rank sums, corresponding to sample 2, use the following statistic:
$$U = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2$$
Once standardised, it approximately follows N(0,1). The sampling distribution of U is symmetrical and has a mean and variance given respectively by the formulas $\mu_U = n_1 n_2 / 2$ and $\sigma_U^2 = n_1 n_2 (n_1 + n_2 + 1)/12$.
2. The Mann-Whitney U test should be avoided if $n_1$ or $n_2$ is $\le 8$. In such a situation it is better to use the T-test.
3. $U_1 + U_2 = n_1 n_2$ and $R_1 + R_2 = (n_1 + n_2)(n_1 + n_2 + 1)/2$, where $U_1$ and $U_2$ are the statistics corresponding to samples 1 and 2. These provide a check for the correctness of the calculation.
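As an illustration, the two checks can be applied to the alloy example of Section 2.2 ($n_1 = 8$, $n_2 = 10$, $R_1 = 106$, $R_2 = 65$). The sketch below assumes the sample-2 statistic is $U_2 = n_1 n_2 + n_2(n_2 + 1)/2 - R_2$; the function name `rank_sum_checks` is hypothetical.

```python
def rank_sum_checks(n1, n2, r1, r2):
    """Return (U1, U2) after verifying the two identities used as checks."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1  # statistic for sample 1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2  # statistic for sample 2
    # Check 1: the two U statistics must add up to n1 * n2.
    assert u1 + u2 == n1 * n2
    # Check 2: the rank sums must add up to the sum of the ranks 1..(n1 + n2).
    assert r1 + r2 == (n1 + n2) * (n1 + n2 + 1) / 2
    return u1, u2


print(rank_sum_checks(8, 10, 106, 65))  # (10.0, 70.0)
```

Here $U_1 + U_2 = 10 + 70 = 80 = (8)(10)$ and $R_1 + R_2 = 171 = (18)(19)/2$, so both checks pass. An arithmetic slip in either rank sum would make one of the assertions fail.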
3 Kruskal Wallis H Test
The U test is a non-parametric test for deciding whether or not two samples come from the same population. A generalisation of this for $k$ samples is provided by the Kruskal Wallis H Test. This test may be described as follows: Suppose that we have $k$ samples of sizes $n_1, n_2, n_3, \cdots, n_k$, with the total size of all samples taken together being given by $n = n_1 + n_2 + n_3 + \cdots + n_k$. Suppose further that the data from all the samples are taken together and ranked, and that the sums of ranks for the $k$ samples are $R_1, R_2, R_3, \cdots, R_k$ respectively. Then the statistic
$$H = \frac{12}{n(n + 1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n + 1)$$
can be shown to follow approximately a $\chi^2$ distribution with $(k - 1)$ degrees of freedom, provided that the $n_j$'s are all at least 5 and that there are no ties in rank.

In case there are too many ties amongst the observations in the sample data, the value of $H$ is smaller than it should be. The corrected value of $H$, denoted by $H_c$, is obtained by dividing the value of $H$ by the correction factor $C$, i.e. $H_c = H/C$ where
$$C = 1 - \frac{\sum (t^3 - t)}{n^3 - n}$$
where $t$ is the number of observations sharing a given tied value and the sum is taken over all groups of tied values. If there are no ties then the sum is 0 and $C$ reduces to 1, so that no correction is needed. In practice, the correction is usually negligible (i.e. not enough to warrant a change in the decision).

The H test provides a non-parametric method in the analysis of variance for one-way classification, or one-factor experiments, and generalisations can be made.
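A minimal Python sketch of the H statistic and its tie correction follows. The function name `kruskal_wallis_h` and the three samples are hypothetical (the figures are made up for illustration only), and $t$ is taken to be the number of observations sharing one tied value.

```python
from collections import Counter


def kruskal_wallis_h(samples):
    """Return (H, H_c) for a list of samples; H_c applies the tie correction."""
    pooled = sorted(x for sample in samples for x in sample)
    n = len(pooled)
    # First 1-based sorted position at which each distinct value appears.
    first_pos = {}
    for pos, value in enumerate(pooled, start=1):
        first_pos.setdefault(value, pos)
    counts = Counter(pooled)  # t for each distinct value
    # A value seen t times occupies ranks first..first+t-1; use their mean.
    rank = {v: first_pos[v] + (counts[v] - 1) / 2 for v in counts}
    # H = 12 / (n(n+1)) * sum(R_j^2 / n_j) - 3(n+1)
    total = sum(sum(rank[x] for x in s) ** 2 / len(s) for s in samples)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # C = 1 - sum(t^3 - t) / (n^3 - n); untied values contribute 0.
    c = 1 - sum(t**3 - t for t in counts.values()) / (n**3 - n)
    return h, h / c


# Made-up data: three samples of five observations each, with two ties.
a = [27, 31, 29, 35, 30]
b = [22, 25, 27, 24, 26]
c = [33, 36, 31, 38, 34]
h, h_c = kruskal_wallis_h([a, b, c])
print(round(h, 3), round(h_c, 3))
```

The resulting value would then be compared with the $\chi^2$ table for $k - 1 = 2$ degrees of freedom. With only a few ties, $H_c$ differs from $H$ only slightly.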
4 The Sign Test
4.1 Sign test for small samples
The sign test is the simplest of all the non-parametric tests. Its name comes from the fact that it is based on the direction (or signs, pluses and minuses) of a pair of observations and not on their numerical magnitude. In any sign test problem we count (a) the number of (+)ve signs, (b) the number of (-)ve signs and (c) the number of 0's (zeros), i.e. pairs which cannot be counted as either positive or negative. In case there is a tie, i.e. both the values are the same, thereby giving zero as the difference, the convention is that we drop that particular pair of observations and work with the remaining pairs.

It is hypothesised that if the differences in sign are purely due to chance, then the probability of a (+)ve sign is 1/2 and that of a (-)ve sign is 1/2. If $S$ is the number of times the less frequent sign occurs, then $S$ has a binomial distribution with $p = 1/2$. We take $H_0: p = 0.5$ as the null hypothesis. The critical value for a two-sided test at $\alpha = 0.05$ can be conveniently found by the expression:
$$K = \frac{n - 1}{2} - (0.98)\sqrt{n}$$
where $n$ is the number of pairs remaining after ties are dropped. The null hypothesis $H_0$ is rejected if $S \le K$.
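The small-sample procedure can be sketched in Python as follows. The function name `sign_test` and the paired figures are hypothetical, invented only to show the mechanics; the symbols $S$ and $K$ denote the count of the less frequent sign and the critical value $K = (n - 1)/2 - 0.98\sqrt{n}$.

```python
import math


def sign_test(before, after):
    """Return (n, S, K, reject) for the two-sided sign test at alpha = 0.05."""
    # Keep only the pairs that differ; tied pairs are dropped by convention.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    plus = sum(1 for d in diffs if d > 0)
    minus = n - plus
    s = min(plus, minus)  # number of times the less frequent sign occurs
    k = (n - 1) / 2 - 0.98 * math.sqrt(n)  # critical value
    return n, s, k, s <= k


# Invented data: 16 pairs, of which 14 decrease, 1 increases and 1 is tied.
before = [10] * 16
after = [9] * 14 + [11, 10]
n, s, k, reject = sign_test(before, after)
print(n, s, round(k, 2), reject)  # 15 1 3.2 True
```

In this invented case the tied pair is dropped, leaving $n = 15$ and $S = 1$. Since $K = 7 - 0.98\sqrt{15} \approx 3.2$ and $S \le K$, the null hypothesis would be rejected.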
Example
Use the sign test to see if there is a difference between the number of days until the collection of accounts receivable before and after a new collection policy is implemented. Use the 0.05 level of significance.
