Non-Parametric Tests
Parametric Statistics
Parametric tests are based upon assumptions that may include the following:
The data have the same variance
The data are normally distributed
Non-Parametric Statistics
They do not assume that the population is distributed in the shape
of a normal curve or any other specified shape.
They are generally easier to carry out and understand.
Sometimes even formal ordering or ranking is not required.
The One Sample Runs Test
A runs test is a statistical analysis that helps determine the
randomness of data by revealing any variables that might affect
data patterns.
A run is a sequence of identical occurrences preceded and
followed by different occurrences or by none at all.
If n1 + n2 ≤ 40 (small sample case), look up the critical values of r directly in the runs table.
Example: A sequence of small glass pieces was inspected for shipping damage. The sequence of acceptable (A) and
damaged (D) pieces was as follows:
D, A, A, A, D, D, D, D, D, A, A, A, A, D, A, A, D, D, D, D, D.
In this case:
n1 = number of acceptable pieces = 9.
n2 = number of damaged pieces = 12.
Grouping identical neighbours into runs:
D, A A A, D D D D D, A A A A, D, A A, D D D D D.
r = number of runs = 7.
H0: the acceptable and damaged pieces occur in random order.
From the runs table with n1 = 9 and n2 = 12, the tabulated critical values of r are (6, 16).
The calculated r = 7 lies within the interval (6, 16), hence we accept H0.
In general, if r lies within the tabulated range, H0 is accepted and the sequence is taken to be random; otherwise H0 is rejected.
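The counting step in the example above can be checked with a short script. This is only a minimal sketch in Python: the function name count_runs is a hypothetical helper, not part of the original slides, and the critical values (6, 16) still have to be read from the runs table.

```python
from itertools import groupby

def count_runs(sequence):
    """Return (number of runs, count of each symbol) for a sequence.

    count_runs is an illustrative helper name, not from the slides.
    """
    # groupby collapses each block of identical neighbours into one group,
    # so the number of groups is the number of runs.
    number_of_runs = sum(1 for _ in groupby(sequence))
    counts = {symbol: sequence.count(symbol) for symbol in set(sequence)}
    return number_of_runs, counts

# Glass inspection sequence from the example (A = acceptable, D = damaged).
glass = "D, A, A, A, D, D, D, D, D, A, A, A, A, D, A, A, D, D, D, D, D".split(", ")

r, counts = count_runs(glass)
print("r =", r, " n1 =", counts["A"], " n2 =", counts["D"])
```

The printed r, n1 and n2 are the three quantities needed to enter the runs table.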
The runs test is a statistical procedure that determines whether a sequence of data
has been generated by a random process or not.
It may be applied to test the randomness of data in a survey that collects data from an ordered population.
Sampling Distribution of the r Statistic
The number of runs (r) is a statistic with its own special sampling
distribution and its own test.
A one-sample runs test, then, is based on the idea that too few or too many
runs show that the items were not chosen randomly.
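If n1 + n2 > 40 (large sample case), the sampling distribution of r is
approximately normal and the test is based on the z statistic.
r Statistics
Mean of r: μr = 2 n1 n2/(n1 + n2) + 1
Standard error of r: σr = √{2 n1 n2 (2 n1 n2 - n1 - n2) / [(n1 + n2)² (n1 + n2 - 1)]}
Test statistic: z = (r - μr)/σr
If |zcal| < ztab, Ho is accepted; otherwise it is rejected.
If Ho is accepted, we conclude that the sample items are
distributed in random order.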
Example
Test the randomness of the following sample using the
0.05 significance level:
A, B, A, A, A, B, B, A, B, B, A, A, B, A, B, A, A, B, B,
B, B, A, B, B, A, A, A, B, A, B, A, A, B, B, A, B, B, A,
A, A, B, B, A, A, B, A, A, A.
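Solution
Here, number of runs (r) = 27
number of A (n1) = 26
number of B (n2) = 22
Since n1 + n2 = 48 > 40, this is the large sample case.
Null hypothesis: Ho: The items A and B are randomly mixed.
Alternative hypothesis: H1: The items are not randomly mixed.
α = 0.05.
μr = 2 n1 n2/(n1 + n2) + 1 = 2(26)(22)/48 + 1 = 24.83
σr = √{2 n1 n2 (2 n1 n2 - n1 - n2) / [(n1 + n2)² (n1 + n2 - 1)]} = √{(1144)(1096)/[(48)²(47)]} = 3.4
Z cal = (r - μr)/σr = (27 - 24.83)/3.4 = 0.64
Z tab = 1.96 (two-tailed, α = 0.05)
Since Z cal < Z tab, we accept Ho.
Conclusion
The items A and B are randomly mixed.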
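The large-sample calculation for this example can be reproduced with the sketch below. It assumes the standard mean and standard error of r for a two-symbol sequence; the function name runs_test_z is hypothetical, and the three strings are the three lines of the sample with the commas removed.

```python
import math
from itertools import groupby

def runs_test_z(sequence):
    """Large-sample one-sample runs test for a two-symbol sequence.

    Returns (r, n1, n2, mean_r, sd_r, z). Illustrative helper, not from the slides.
    """
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("the runs test needs exactly two kinds of symbols")
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    r = sum(1 for _ in groupby(sequence))          # number of runs
    mean_r = 2 * n1 * n2 / (n1 + n2) + 1           # mean of r
    sd_r = math.sqrt(
        2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
        / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    )                                              # standard error of r
    z = (r - mean_r) / sd_r
    return r, n1, n2, mean_r, sd_r, z

# The 48-item sample from the example, one string per line of the slide.
sample = list(
    "ABAAABBABBAABABAABB"
    "BBABBAAABABAABBABBA"
    "AABBAABAAA"
)

r, n1, n2, mean_r, sd_r, z = runs_test_z(sample)
print(r, n1, n2, round(mean_r, 2), round(sd_r, 2), round(z, 2))
```

For a two-tailed test at α = 0.05, Ho is rejected only when |z| exceeds 1.96.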
Example
A sequence of small glass pieces was inspected for shipping damage.
The sequence of acceptable (A) and damaged (D) pieces was as
follows:
D, A, A, A, D, D, D, D, D, A, A, A, A, D, A, A, D, D, D, D, D.
Test the randomness of the damage to the shipment using the
0.05 level of significance.
Example
Barcelona Football Club kept a record of the team's wins (W) and losses
(L) over the last 25 championship matches. The results of this
tally are:
W, W, W, W, W, W, L, W, W, W, W, W, L, W, W, W, W, L, L,
W, W, W, W, W, W.
At the 5% level of significance, is the occurrence of wins and
losses random?
Sign Test
The sign test is a fairly simple procedure that can be used to compare two
populations when the samples consist of paired observations.
It can be used when the assumptions required for the paired-difference test of unit
4 are not valid.
It is based on the direction (plus or minus sign) of the difference within each pair of observations, not on
its numerical magnitude.
For each pair, record whether the first response, say A, exceeds the second
response, say B (i.e. the sign of the difference A - B).
Only pairs without ties are included in the test.
After excluding zero differences, if n < 10 it is the small sample case; otherwise it is the large sample case.
For the large sample case, the test statistic is
z = (p - P)/√(PQ/n), where
P = hypothesized population proportion of plus signs = 0.5
Q = 1 - P = 0.5
p = number of plus signs / n
q = 1 - p = number of minus signs / n
n = number of pairs without ties
If zcal < ztab (using |zcal| for a two-tailed test), Ho is accepted; otherwise it is rejected.
Example
You record the number of accidents per day at a large manufacturing plant for both the day
and evening shifts for n = 100 days. You find that the number of accidents per day for the
evening shift xE exceeded the corresponding number of accidents in the day shift xD on 63 of
the 100 days. Do these results provide sufficient evidence to indicate that more accidents tend
to occur on one shift than on the other?
Solution:
Null hypothesis: Ho: P = 0.5, accidents are equally likely to be higher in the day shift or in the evening shift.
Alternative hypothesis: H1: P > 0.5, more accidents occur in the evening shift than
in the day shift (one-tailed).
Choose α = 0.05.
Under Ho, the test statistic is
z = (p - P)/S.E.(p), where S.E.(p) = √(PQ/n)
= (0.63 - 0.5)/√{(0.5)(0.5)/100}
= 2.60.
From the table, at the 5% level of significance (one-tailed), the value of z = 1.645.
As 2.60 > 1.645, Ho is rejected.
Conclusion: There is evidence that more accidents occur in the
evening shift than in the day shift.
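The arithmetic of the large-sample sign test can be sketched in a few lines of Python. The function name sign_test_z is a hypothetical helper, and the sketch assumes, as the example does, that all 100 days count as untied pairs.

```python
import math

def sign_test_z(n_plus, n, P=0.5):
    """z statistic for the large-sample sign test.

    n_plus: number of plus signs; n: number of pairs without ties.
    Illustrative helper name, not from the slides.
    """
    p = n_plus / n                              # sample proportion of plus signs
    standard_error = math.sqrt(P * (1 - P) / n)  # S.E.(p) under Ho
    return (p - P) / standard_error

# Evening shift exceeded day shift on 63 of 100 days.
z_accidents = sign_test_z(63, 100)
print(round(z_accidents, 2))
```

The resulting z is compared with the one-tailed critical value 1.645 used in the example.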
Example
Use the sign test to see whether there is a difference between the number of days
required to collect an account receivable (an account receivable is the money
owed to a company by its customers for goods or services that have been
delivered or used but not yet paid for) before and after a new collection
policy. Use the 0.05 significance level.
Before  33  36  41  32  39  47  34  29  32  34  40  42  33  36  29
After   35  29  38  34  37  47  36  32  30  34  41  38  37  35  28
Setting of hypothesis
Null Hypothesis: H0: P = 0.5 (There is no significant difference between the number of days
required to collect an account receivable before and after the new collection policy).
Alternative Hypothesis: H1: P ≠ 0.5 (There is a significant difference between the number of
days required to collect an account receivable before and after the new collection policy).
Level of significance
α = 5% = 0.05
Test Statistics
Z = (p - P)/√(PQ/n)
Calculation Table
Before (X)  33  36  41  32  39  47  34  29  32  34  40  42  33  36  29
After (Y)   35  29  38  34  37  47  36  32  30  34  41  38  37  35  28
X - Y       -2   7   3  -2   2   0  -2  -3   2   0  -1   4  -4   1   1
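Number of positive signs = 7
Number of negative signs = 6
Number of pairs without ties, n = 7 + 6 = 13 (the two zero differences are excluded)
Proportion p = 7/(7 + 6) = 0.5385 and q = 1 - p = 1 - 0.5385 = 0.4615
Test Statistics
Z = (p - P)/√(PQ/n) = (0.5385 - 0.5)/√{(0.5)(0.5)/13} = 0.0385/0.1387 = 0.28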
The tabulated value at the 5% level of significance (two-tailed) is 1.96.
Decision
Since Z cal < Z tab, we accept H0.
Conclusion
There is no significant difference between the number of days required to collect an account
receivable before and after the new collection policy.
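The sign counting for this example can be checked with the sketch below. The function name sign_test_from_pairs is hypothetical; it drops tied pairs, counts the signs of X - Y, and applies the same z formula. Using the unrounded proportion 7/13, z comes out near 0.28, well below 1.96.

```python
import math

def sign_test_from_pairs(x, y, P=0.5):
    """Count signs of x - y (ties dropped) and return (n_plus, n_minus, z).

    Illustrative helper name, not from the slides.
    """
    differences = [a - b for a, b in zip(x, y) if a != b]  # exclude ties
    n_plus = sum(1 for d in differences if d > 0)
    n_minus = len(differences) - n_plus
    n = n_plus + n_minus                     # pairs without ties
    p = n_plus / n
    z = (p - P) / math.sqrt(P * (1 - P) / n)
    return n_plus, n_minus, z

# Days to collect an account receivable, from the calculation table.
before = [33, 36, 41, 32, 39, 47, 34, 29, 32, 34, 40, 42, 33, 36, 29]
after = [35, 29, 38, 34, 37, 47, 36, 32, 30, 34, 41, 38, 37, 35, 28]

n_plus, n_minus, z = sign_test_from_pairs(before, after)
print(n_plus, n_minus, round(z, 2))
```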