
To use parametric or non-parametric

Either you have to be at least 30 or 3.

- Abdi-Khalil Edriss -

NON-PARAMETRIC STATISTICS
Nowadays there is a large literature on non-parametric procedures, which differ
from parametric procedures but serve a great deal in research. Non-parametric
statistical procedures do not depend on particular statistical distributions or
assumptions such as normality and constant variance, and for this reason they are
often known as distribution-free procedures. Since populations do not always meet
the assumptions underlying parametric tests, the focus of this chapter is on the
most frequently used non-parametric statistical procedures.

8.1. What is the difference between parametric and non-parametric1 statistics?

In parametric statistics, inferences made using tests such as the t-statistic,
analysis of variance and correlation analysis depend on certain assumptions,
including normality, constant variance and zero mean of the errors.

In non-parametric statistics the procedures are not tied to such distributions
or assumptions, which is perhaps why the field is often called
distribution-free statistics.

1 According to Daniel Wayne, the first use of what we would now call a non-parametric statistical
procedure seems to have been reported in 1710 by John Arbuthnot. Uses of such procedures were
conspicuously sparse until the 1940s. The word non-parametric appeared for the first time in 1942 in a
paper by Wolfowitz.
8.2. What are some of the advantages of non-parametric
statistics?

Since most non-parametric procedures depend on a minimum number of
assumptions, the chance of their being improperly used is small.

For some non-parametric procedures, the computations can be performed quickly
and easily by hand, especially for smaller sample sizes. Using these
procedures therefore saves computation time.

Researchers with minimum preparation in mathematics and statistics
usually find the concepts and methods of non-parametric procedures easy
to understand.

Non-parametric procedures may be applied when the data are measured on
a weak measurement scale, as when only count data or rank data are
available for analysis.

8.3. What are some of the disadvantages of non-parametric statistical procedures?

Because the calculations needed for most non-parametric procedures are
simple and rapid, these procedures are sometimes used when parametric
procedures are more appropriate. Such a practice often wastes information.

Although non-parametric procedures have a reputation for requiring only
simple calculations, the arithmetic in many instances is tedious and
laborious, especially when the samples are large and a computer is not
handy.

8.4. When do we use non-parametric procedures?

When the data have been measured on a scale weaker than that required
for parametric procedures that would otherwise be employed. For
example, the data may consist of count data or rank data, thereby
precluding the use of some otherwise important parametric procedures.

When the assumptions necessary for the valid use of a parametric
procedure are not met. In many instances, the design of a research project
may suggest a certain parametric procedure. Examination of the data,
however, may reveal that one or more assumptions underlying the test are
violated. In that case, a non-parametric procedure is frequently the only
alternative.
When results are needed in a hurry, a computer is not readily available,
and calculations must be done by hand, even though this is sometimes tedious.

8.5. What are some of the most frequently used non-parametric statistical procedures or methods?

These procedures include the Kruskal-Wallis test by rank, the Multiple
Comparisons test, comparison of all treatments with a control, rank
correlation, the Phi-coefficient, the Cramer statistic and the point
bi-serial correlation coefficient. These procedures are discussed below
with numerical examples.

I - METHOD 1: Kruskal-Wallis (KW) Test by Rank

8.6. What is the Kruskal-Wallis test, and what are the procedures involved?

This is perhaps the most widely used non-parametric technique for testing
the null hypothesis that several samples have been drawn from the same or
identical populations. The Kruskal-Wallis one-way analysis of variance
test is performed as follows.

Assumptions
The data for analysis consist of k random samples of sizes n1, n2, n3, … nk.
The observations are independent both within and among samples.
The variable of interest is continuous.
The measurement scale is at least ordinal.
The populations are identical except for a possible difference in location
for at least one population.

Hypothesis
Null hypothesis, Ho: the k population distribution functions have identical
medians.
Alternative hypothesis, Ha: the k populations do not all have the same
median.

Decision Rule
When we are considering three samples, and each sample has 5 or fewer
observations, we compare the computed value of Kruskal-Wallis (KW) Test
Statistic, denoted by H, for significance with the tabulated Kruskal-Wallis
critical values.

However, when the number of samples is larger than 3 or the number of
observations per sample is larger than 5, we cannot use the KW table. We instead
compare the computed value of H for significance with the Chi-square critical
value, χ²(k-1), where k is the number of samples in the study. That is, the KW
test shows that for large sample sizes nj and large k, H is distributed
approximately as Chi-square with k-1 degrees of freedom.

For the number of samples and sample sizes that can be accommodated by the
KW table, we reject Ho if the computed value of H exceeds the critical value
listed in the KW table for the pre-selected significance level α.

Test Statistic
Replace each original observation by its rank relative to all the observations in
the k samples.

If we let N = Σni, i = 1, …, k, be the total number of observations in the k
samples, we assign the rank 1 to the smallest of these, the rank 2 to the next
in size, and so on to the largest, which is given the rank N.

In case of ties we assign the tied observations the average of the ranks that
would be assigned if there were no ties.

If the null hypothesis is true, we expect the k sums of ranks (that is, the sums
of the ranks in each sample) to be about equal when adjusted for unequal
sample sizes.

Use the Kruskal-Wallis test statistic given as

H = [12 / (N(N + 1))] Σ (Ri² / ni) − 3(N + 1)

where Ri is the sum of the ranks assigned to the observations in the ith
treatment under Ho.

NUMERICAL EXAMPLE
In a Rural Credit program, the amount of loan paid back or re-payment in Kwacha to
the lending institutions in the three regions of the country for last year are recorded as
follows in Table 8.1.

Table 8.1: Repayments and ranks of borrowers in the three regions

Northern region  Rank   Central region  Rank   Southern region  Rank
2650              4     4650            16     3430             10
3070              7     5010            18     7720             20
2110              3     4550            15     2070              2
3230              8     3550            11     10480            22
4540             14     4680            17     8380             21
3390              9     3620            13     6870             19
3040              6
1540              1
2870              5
3560             12

Hypothesis:
Ho: in the three regions of the country the median repayment amounts are the
same
Ha: the three regions do not have the same median repayment amount

Test Statistic
The ranks replacing the original observations are shown in Table 8.2, with
total number of observations N = 10 + 6 + 6 = 22 and number of samples k = 3.

Table 8.2: Ranks and rank sums in the three regions

Group (Region)  Ranks                    Sample size ni  Rank sum Ri
Northern        4 7 3 8 14 9 6 1 5 12    n1 = 10         R1 = 69
Central         16 18 15 11 17 13        n2 = 6          R2 = 90
Southern        10 20 2 22 21 19         n3 = 6          R3 = 94
                                         N = Σni = 22

Thus, the KW test statistic H is given by

H = [12 / (22(22 + 1))] (69²/10 + 90²/6 + 94²/6) − 3(22 + 1) = 9.232

Decision Rule
The sample sizes all exceed 5, and therefore we must use the Chi-square
statistic to determine whether the sample medians (roughly speaking, the
average repayment amounts) are significantly different from one region to
another.

From the Chi-square table, the critical value is χ²(k-1, α) = χ²(3-1, 0.01) =
χ²(2, 0.01) = 9.210, and since H = 9.232 exceeds χ²(2, 0.01) = 9.210, we reject
the null hypothesis (Ho) at the given significance level of 0.01. We therefore
conclude that the median repayments of the three regions represented are not
equal. Since the loan repayments differ by region, the lending institution may
institute some kind of repayment policy based on regional income or other
regional differences.

STATA output on the same data, repayment of loan, is as follows.

. kwallis repayment, by(region)

Kruskal-Wallis equality-of-populations rank test

region Obs Rank Sum

Central 6 90.00
Northern 10 69.00
Southern 6 94.00

chi-squared = 9.232 with 2 d.f.
probability = 0.0099

chi-squared with ties = 9.232 with 2 d.f.
probability = 0.0099

Since the KW p-value = 0.0099 < 0.05, we reject the null hypothesis that the
median repayment amounts are equal across the three regions, and conclude that
the median repayment differs significantly by region.

Here is an additional example using grain yield dry weight (t/ha) attributable to four
treatments: 1, 2, 3B and 4. (Data courtesy of the Clinton Foundation and AGRA)
. kwallis graYdryw, by(treatment)

Kruskal-Wallis equality-of-populations rank test

treatm~t Obs Rank Sum

1 75 12030.50
2 76 12275.50
3B 77 12300.00
4 76 9754.00

chi-squared = 7.670 with 3 d.f.
probability = 0.0534

chi-squared with ties = 7.670 with 3 d.f.
probability = 0.0534

Note that the Kruskal-Wallis test is a non-parametric alternative to one-way
ANOVA. It tests the null hypothesis of equality of population medians. The
result (p = 0.0534 > 0.05) suggests that there is no statistically significant
difference between the median grain yield dry weights attributable to the four
treatments.

Furthermore, results attributable to maize variety are shown as follows.
. kwallis graYdryw, by (cropvariety)

Kruskal-Wallis equality-of-populations rank test

cropva~y Obs Rank Sum

DK9089 106 13968.50
Sc719 198 32391.50

chi-squared = 9.044 with 1 d.f.
probability = 0.0026

chi-squared with ties = 9.044 with 1 d.f.
probability = 0.0026

The result (p = 0.0026 < 0.05) implies that the median grain yield dry weights
attributable to the two maize varieties (DK9089 and SC719) are significantly
different from each other. Thus, the varieties make a difference in maize yield.

II - METHOD 2: Multiple Comparisons Test

8.7. What is a multiple comparisons (MC) test?

When a hypothesis testing procedure such as the Kruskal-Wallis test leads us
to reject the null hypothesis, and thus conclude that not all sampled
populations are identical, we naturally ask which populations are
different from each other. Notably, we would like to know whether the
medians M1, M2 and M3 (in our example, the median repayments) all differ
from each other, or whether the difference is between M1 and M2 only,
between M1 and M3 only, or between M2 and M3 only. To perform such
comparisons, we use a non-parametric multiple comparisons (MC) test.

8.8. How do we apply the MC test?

To apply a non-parametric MC test we use what is known as the
experiment-wise error rate. The experiment-wise error rate, which represents a
conservative approach to making multiple comparisons, holds the probability of
making only correct decisions at 1 − α when the null hypothesis of no
difference among populations is true. Note that this approach protects well
against error when Ho is true.

8.9. What are the steps to use the non-parametric MC test?

We first let R̄i be the mean of the ranks of the ith sample and R̄j be the mean
of the ranks of the jth sample. We next select an experiment-wise error
rate α, which we regard as the overall level of significance. Our choice
of α is determined in part by k, the number of samples involved, and is
larger for larger k. Note that when we make multiple comparisons with an
experiment-wise error rate, we usually select a value of α larger than
those customarily encountered in single-comparison inferences. For
example, we may choose 0.15, 0.20 or perhaps 0.25, depending on the size
of k.

Secondly, if we have k samples, there will be a total of k(k-1)/2 pairs of
samples that can be compared a pair at a time. For example, if we have 5
samples, there will be a total of 5(4)/2 = 10 possible pair-wise comparisons
that we can make.

The next step is to find the value of Z in the Z-score table that has
α/(k(k-1)) area to its right, and finally we form the inequality

|R̄i − R̄j| ≥ z(1 − α/(k(k-1))) √[ (N(N + 1)/12)(1/ni + 1/nj) ]

where the R̄'s are rank averages, k is the number of samples, ni and nj are
the sample sizes of the two groups being compared, and N is the number of
observations in all samples combined. A difference |R̄i − R̄j| that is larger
than the right-hand side of the inequality is significant at the α level.

POINTS TO PONDER
This procedure allows us to forget about the direction of the differences between
mean ranks when performing the calculations. The direction of the differences should,
of course, be taken into account in the interpretation of the results.

NUMERICAL EXAMPLE
Refer to the previous data, for which we have computed H = 9.23 and suggested to
reject the null hypothesis (the median of repayment is the same for all the three
regions of the country) at 0.01 level. As a result, we concluded that the median
repayments of rural credit are not the same for all the three regions of the country.

Now, we would like to make all possible comparisons in order to locate just
where the differences occurred. Let us choose an error rate of α = 0.15; with
k = 3 samples involved, there will be 3(2)/2 = 3 comparisons to make. Hence,
α/(k(k-1)) = 0.15/(3(3-1)) = 0.025, and referring to a table of Z-scores, we
find z(0.025) = 1.96.

We then calculate the means of the ranks for the three samples:
R̄1 = 69/10 = 6.9, R̄2 = 90/6 = 15 and R̄3 = 94/6 = 15.67. We can now make
comparisons between the groups or regions.

EXAMPLE 1

Group I and II (Northern and Central Regions)

Calculating the right-hand side of the inequality, using z(0.025) = 1.96,

1.96 √[ (22(22 + 1)/12)(1/10 + 1/6) ]

gives the value 6.57.

Decision Rule: Since |R̄1 − R̄2| = |6.9 − 15| = 8.1 > 6.57, this comparison is
significant. For this rural credit data set, we conclude that borrowers of
rural credit from the Northern region tend to have a lower repayment rate than
their counterparts in the Central region of the country. Note that the sign of
the difference indicates its direction.

EXAMPLE 2

Group I and III (Northern and Southern Regions)

Calculating the right-hand side of the inequality,

1.96 √[ (22(22 + 1)/12)(1/10 + 1/6) ]

gives the value 6.57.
Decision Rule: Since |R̄1 − R̄3| = |6.9 − 15.67| = 8.77 > 6.57, this comparison
is again significant. We therefore conclude that borrowers who received rural
credit in the Northern region tend to have a lower repayment rate than their
counterparts in the Southern region of the country.

EXAMPLE 3

Group II and III (Central and Southern Regions)

Calculating the right-hand side of the inequality,

1.96 √[ (22(22 + 1)/12)(1/6 + 1/6) ]

gives the value 7.35.

Decision Rule: Since |R̄2 − R̄3| = |15 − 15.67| = 0.67 < 7.35, this comparison
is not significant. Hence, we cannot conclude that the repayment rates of
borrowers in the Central and Southern regions differ.
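The three pairwise comparisons above can be reproduced with a short script. A sketch (assuming scipy; the data layout and names are illustrative) that ranks the pooled repayments, forms the rank means and checks each pair against its critical difference:

```python
# Non-parametric multiple comparisons after Kruskal-Wallis, with an
# experiment-wise error rate of alpha = 0.15.
from itertools import combinations
from math import sqrt
from scipy.stats import norm, rankdata

groups = {
    "Northern": [2650, 3070, 2110, 3230, 4540, 3390, 3040, 1540, 2870, 3560],
    "Central":  [4650, 5010, 4550, 3550, 4680, 3620],
    "Southern": [3430, 7720, 2070, 10480, 8380, 6870],
}
alpha, k = 0.15, len(groups)
pooled = [x for g in groups.values() for x in g]
ranks = rankdata(pooled)                     # ranks 1..N over all observations
N = len(pooled)
z = norm.ppf(1 - alpha / (k * (k - 1)))      # area alpha/(k(k-1)) to the right

# mean rank of each group, slicing the pooled rank vector
mean_rank, start = {}, 0
for name, g in groups.items():
    mean_rank[name] = ranks[start:start + len(g)].mean()
    start += len(g)

for a, b in combinations(groups, 2):
    crit = z * sqrt(N * (N + 1) / 12 * (1 / len(groups[a]) + 1 / len(groups[b])))
    diff = abs(mean_rank[a] - mean_rank[b])
    print(f"{a} vs {b}: |diff| = {diff:.2f}, critical = {crit:.2f}, "
          f"significant = {diff > crit}")
```

The printed differences (8.10, 8.77, 0.67) and critical values (6.57, 6.57, 7.35) match the hand calculations.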

STATA output on the same data displays the following results.

. oneway repayment region, scheffe tabulate

Summary of Repayment
Region Mean Std. Dev. Freq.

Central 4343.3333 607.80479 6


Northern 3000 813.75672 10
Southern 6491.6667 3163.8421 6

Total 4318.6364 2220.9014 22

Analysis of Variance
Source SS df MS F Prob > F

Between groups 45724042.4 2 22862021.2 7.51 0.0040


Within groups 57856416.7 19 3045074.56

Total 103580459 21 4932402.81

Bartlett's test for equal variances: chi2( 2) = 17.0267 Prob>chi2 = 0.000

Comparison of Repayment by Region


(Scheffe)
Row Mean-
Col Mean Central Northern

Northern -1343.33
0.350

Southern 2148.33 3491.67


0.130 0.004

Note that Scheffé's test is a (parametric) multiple comparisons test.

Conclusions
[1] Borrowers in the Northern region tend to have a lower repayment rate than
their counterparts in the Central region, but with no statistically significant
difference (p = 0.35 > 0.05) between the two regions.

[2] Borrowers in the Southern region tend to have a higher repayment rate than
their counterparts in the Northern region, with a statistically significant
difference (p = 0.004 < 0.05).

[3] Borrowers of rural credit in the Southern region tend to have a higher
repayment rate than their counterparts in the Central region of the country,
but with no statistically significant difference (p = 0.13 > 0.05).

Lastly, note that the overall result indicates that there is a statistically
significant difference (p = 0.004 < 0.05) between the mean repayments of the
regions.

III - METHOD 3: Comparing all Treatments with a Control

8.10. What does the procedure of comparing all treatments with a control involve in non-parametric statistics?

Sometimes a research situation is such that one of the k treatments is a
control condition. When this is the case, the investigator is frequently
interested only in comparing each treatment with the control condition,
without regard to whether the overall test for a treatment effect is
significant; that is, irrespective of any potential significant differences
between other pairs of treatments. Thus, when interest focuses on comparing
all treatments with a control condition, there will be k-1 comparisons to be
made. The procedure is the same as in the previous example except for the
method of obtaining Z, where α is divided by 2(k-1).

NUMERICAL EXAMPLE
A fertilizer manufacturer conducted an experiment to compare the effects of four
types of fertilizer on the yield of a certain grain. Homogeneous, equal-size
experimental plots of soil were made available for the experiment. These were
randomly assigned to receive one of the four fertilizers, and plots receiving
no fertilizer served as controls. The yields ('000 kg) for each plot are given
in Table 8.3.

Table 8.3: Fertilizer type, plot number and yield of grains

Fertilizer Type Plot No Yield (in coded form)


O (no fertilizer) 1 58 29 37 40 44 37 49 49 38
A 2 68 67 69 58 62 48 62 76 66
B 3 96 90 90 92 99 86 79 96 75
C 4 101 110 90 103 100 91 100 114 94
D 5 124 114 111 113 114 102 114 112 103

Now, we wish to know which fertilizers are superior to no fertilizer. Note that plot 1
is the control plot, where no fertilizer had been applied.

Solution

Step 1

To find the answers, we convert the above data into ranks, rank totals and mean
ranks, and tabulate them as follows (Table 8.4).

Table 8.4: Fertilizer, plot, yield and rank of observations

Fertilizer  Plot                                                         Rank    Mean
Type        No    Yield ('000 kg) and rank of each observation           Total   Rank

O           1     Yield: 58    29   37    40    44    37    49    49   38
                  Rank:  10.5   1   2.5    5     6   2.5   8.5   8.5    4       48.5    5.39

A           2     Yield: 68    67   69    58    62    48    62    76   66
                  Rank:  16    15   17   10.5  12.5    7   12.5   19   14      123.5   13.72

B           3     Yield: 96    90   90    92    99    86    79    96   75
                  Rank:  28.5  23   23    26    30    21    20   28.5  18      218     24.22

C           4     Yield: 101  110   90   103   100    91   100   114   94
                  Rank:  33    37   23   35.5  31.5   25   31.5  42.5  27      286     31.78

D           5     Yield: 124  114  111   113   114   102   114   112  103
                  Rank:  45   42.5  38    40   42.5   34   42.5   39  35.5     359     39.89

Step 2

Looking at the ranks in Table 8.4, the data contain several ties, and thus we
have to adjust for the ties. Since the samples are all the same size, we use
the expression

|R̄i − R̄j| ≥ z(1 − α/(2(k-1))) √[ k(N(N² − 1) − Σ(t³ − t)) / (6N(N − 1)) ]

for the adjustment of the ties and calculate as follows.

Table 8.5: Ranks and counts where ties occurred

Rank position    2   8   10   12   22   28   31   35   42
No. of ties, t   2   2    2    2    3    2    2    2    4     Σt = 21
Ties cubed, t³   8   8    8    8   27    8    8    8   64     Σt³ = 147

Step 3

Since there are five treatments (four fertilizers and no fertilizer), we have
four comparisons to make. To find the appropriate Z-value for α = 0.2, we
compute α/(2(k-1)) = 0.2/(2(4)) = 0.025, and from the Z-score table we obtain
z = 1.96. From the mean ranks, we have R̄O = 5.39, the mean rank of the yields
of the plots to which no fertilizer was applied. The R̄j's, the mean ranks of
the plots receiving fertilizers j = A, B, C and D, are R̄A = 13.72,
R̄B = 24.22, R̄C = 31.78 and R̄D = 39.89.

The right-hand side of the above expression then yields

1.96 √[ 5(45(45² − 1) − (147 − 21)) / (6(45)(45 − 1)) ] = 12.13

Hence, the comparison of yields of plots receiving fertilizer to yields of plots
receiving no fertilizer is shown next.

Table 8.6: Comparisons of applying fertilizer versus no fertilizer

Fertilizer  Plot  |R̄O − R̄j|               Is applying fertilizer better?
A           2     |5.39 − 13.72| = 8.33    No
B           3     |5.39 − 24.22| = 18.83   Yes
C           4     |5.39 − 31.78| = 26.39   Yes
D           5     |5.39 − 39.89| = 34.50   Yes

From Table 8.6, since 8.33 < 12.13 (comparing the left-hand side to the
right-hand side of the inequality), we cannot conclude that fertilizer A is
better than no fertilizer. However, since 18.83, 26.39 and 34.50 are all
greater than 12.13, we can conclude that fertilizers B, C and D all resulted in
higher yields than if no fertilizer were used.
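The whole procedure, including the tie correction, can be sketched in Python using the yields from Table 8.3 (assuming scipy; scipy.stats.rankdata assigns average ranks to ties automatically, and variable names are illustrative):

```python
# Comparing each fertilizer with the no-fertilizer control, with tie correction.
from collections import Counter
from math import sqrt
from scipy.stats import norm, rankdata

control = [58, 29, 37, 40, 44, 37, 49, 49, 38]          # O: no fertilizer
treatments = {
    "A": [68, 67, 69, 58, 62, 48, 62, 76, 66],
    "B": [96, 90, 90, 92, 99, 86, 79, 96, 75],
    "C": [101, 110, 90, 103, 100, 91, 100, 114, 94],
    "D": [124, 114, 111, 113, 114, 102, 114, 112, 103],
}
pooled = control + [x for g in treatments.values() for x in g]
N, k = len(pooled), 1 + len(treatments)                 # N = 45, k = 5
ranks = rankdata(pooled)                                # ties get average ranks

tie_term = sum(t**3 - t for t in Counter(pooled).values())  # Σ(t³ − t) = 126
z = norm.ppf(1 - 0.2 / (2 * (k - 1)))                   # alpha = 0.2 -> z = 1.96
crit = z * sqrt(k * (N * (N**2 - 1) - tie_term) / (6 * N * (N - 1)))

mean_control = ranks[:9].mean()                         # mean rank of control
for i, name in enumerate(treatments):
    mean_trt = ranks[9 + 9 * i: 18 + 9 * i].mean()
    better = abs(mean_control - mean_trt) > crit
    print(f"{name}: |diff| = {abs(mean_control - mean_trt):.2f}, "
          f"critical = {crit:.2f}, better than control = {better}")
```

The script reproduces the critical difference of 12.13 and the four decisions in Table 8.6.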

. oneway yield fertilizertype

Analysis of Variance
Source SS df MS F Prob > F

Between groups 27568.7146 4 6892.17866 111.14 0.0000


Within groups 2418.44444 39 62.011396

Total 29987.1591 43 697.375793

Bartlett's test for equal variances: chi2( 4) = 0.6926 Prob>chi2 = 0.952

. oneway yield fertilizertype, tabulate scheffe

fertilizer Summary of yield


type Mean Std. Dev. Freq.

A 64 7.8898669 9
B 89.222222 8.0121436 9
C 100.33333 8.0777472 9
D 111.88889 6.5085414 9
O 41.5 8.8317609 8

Total 82.295455 26.407874 44

Analysis of Variance
Source SS df MS F Prob > F

Between groups 27568.7146 4 6892.17866 111.14 0.0000


Within groups 2418.44444 39 62.011396

Total 29987.1591 43 697.375793

Bartlett's test for equal variances: chi2( 4) = 0.6926 Prob>chi2 = 0.952

Comparison of yield by fertilizer type


(Scheffe)
Row Mean-
Col Mean A B C D

B 25.2222
0.000

C 36.3333 11.1111
0.000 0.082

D 47.8889 22.6667 11.5556


0.000 0.000 0.064

O -22.5 -47.7222 -58.8333 -70.3889


0.000 0.000 0.000 0.000

Conclusions:
1. Since p = 0.0000 < 0.05 (F = 111.14), we reject the null hypothesis of
equal means (mean yields by fertilizer type); that is, the mean yields differ
by fertilizer type.
2. Bartlett's test (p = 0.952 > 0.05) suggests acceptance of the null
hypothesis of equal variances, i.e. no significant difference in variance
between the groups. This is good news for the validity of the ANOVA.
3. Scheffé's test (a multiple comparisons test) of yield by fertilizer type
indicates the differences between each pair of yield means. There are
significant differences between the control group (O) and fertilizer types A,
B, C and D. Except for the yield differences between C and B (p = 0.082 >
0.05) and between D and C (p = 0.064 > 0.05), all other pairs show a
significant difference in mean yields by fertilizer type (p = 0.000 < 0.05).

IV - METHOD 4: Spearman Rank Correlation

8.11. What does Spearman rank correlation measure?

Unlike the Pearson product-moment correlation coefficient, which measures the
strength of association for bivariate normal populations with continuous
variables, the Spearman rank correlation coefficient has the following
characteristics.

 Random sample of n pairs of numeric or non-numeric observations.
 Each pair of observations represents two measurements taken on the same
object or individual; that is, the unit of association is the same for both
observations.
 The measure of the degree of association utilizes the ranks of the sample,
rather than the values of the observations themselves.

8.12. How do we compute the Spearman rank correlation coefficient?

In preparation for computing the Spearman rank correlation coefficient,
we subject our data to the following procedures.

If the data consist of observations from a bivariate population, we designate
the n pairs of observations (x1, y1), (x2, y2), …, (xn, yn).

Each X is ranked relative to all other observed values of X, from smallest to
largest in order of magnitude. The rank of the ith value of X is denoted by
R(xi), and R(xi) = 1 if xi is the smallest observed value of X.

Each Y is ranked relative to all other observed values of Y, from smallest to
largest in order of magnitude. The rank of the ith value of Y is denoted by
R(yi), and R(yi) = 1 if yi is the smallest observed value of Y.

If ties occur among the X's or among the Y's, each tied value is assigned the
mean of the rank positions for which it is tied.

If the data consist of non-numeric observations, they must be capable of being
ranked as described above.

Hypothesis
A: Two-sided
o Ho: Observations on X and Y are independent.
o Ha: Observations on X and Y are either directly or inversely related.
B: One-sided
o Ho: Observations on X and Y are independent.
o Ha: There is a direct relationship between X and Y.
C: One-sided
o Ho: Observations on X and Y are independent.
o Ha: There is an inverse relationship between X and Y.

Test Statistic
The test statistic is

rs = 1 − 6 Σ di² / (n(n² − 1)), where di = R(xi) − R(yi), i = 1, …, n,

and rs is the Spearman rank correlation coefficient.

NUMERICAL EXAMPLE
Suppose we have the following eight (n = 8) pairs of observations of X and Y:
(xi, yi): (0, 10), (9, 3), (1, 9), (5, 6), (7, 11), (6, 12), (2, 4) and (3, 5),
with ranks as follows.

Table 8.7: Observations, ranks and rank differences

Observations xi                   0    9    1    5    7    6    2    3
Ranks for the x's, R(xi)          1    8    2    5    7    6    3    4
Observations yi                   10   3    9    6    11   12   4    5
Ranks for the y's, R(yi)          6    1    5    4    7    8    2    3
Difference between ranks, di      -5   7    -3   1    0    -2   1    1

Now the sum of the squared differences is

Σdi² = (-5)² + 7² + (-3)² + 1² + 0² + (-2)² + 1² + 1² = 90

and rs = 1 − 6(90)/(8(64 − 1)) = 1 − 540/504 = -0.07.

Decision Rule

Based on the three types of hypotheses, we make the following decisions.

Two-sided: reject Ho at the α level if the computed |rs| is greater than the
tabulated rs value for n and α (use Spearman's rank table).
One-sided (direct relationship): reject Ho at the α level if rs is greater
than the tabulated one-sided critical value.
One-sided (inverse relationship): reject Ho at the α level if rs is less than
the negative of the tabulated one-sided critical value.
Now, since |rs| = 0.07 < r0.05(2),8 = 0.738 (two-sided), we refrain from
rejecting the null hypothesis.
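The hand computation can be checked with scipy.stats.spearmanr, which also returns a two-sided p-value based on a t approximation. A sketch on the eight pairs above:

```python
# Spearman rank correlation on the eight (x, y) pairs.
from scipy.stats import spearmanr

x = [0, 9, 1, 5, 7, 6, 2, 3]
y = [10, 3, 9, 6, 11, 12, 4, 5]

rho, p = spearmanr(x, y)
print(f"rs = {rho:.4f}, two-sided p = {p:.4f}")  # rs = -0.0714, two-sided p = 0.8665
```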

For the same data set, STATA output for spearman correlation is as follows.
. spearman x y, stats(rho obs p) star(0.05)

Number of obs = 8
Spearman's rho = -0.0714

Test of Ho: x and y are independent
Prob > |t| = 0.8665

Since p = 0.8665 > 0.05, we fail to reject the null hypothesis that x and y are
independent. Thus, it can be concluded that x and y are independent; the sample
shows only a weak negative correlation, rs = -0.0714 or -7.14%, between x and y.

POINTS TO PONDER
Watch the decision rule: it depends on whether we use one-sided or two-sided
hypothesis testing with the pre-determined level of significance.

V - METHOD 5: Phi-Coefficient

8.13. What is the Phi-Coefficient designed for?

The Phi-coefficient, denoted by φ, was designed for use with dichotomous
variables, that is, variables that can assume only one of two possible
mutually exclusive values.

Examples are gender (male or female), product quality (defective or
non-defective) and marital status (married or not married).

In practice, it is also used when values of non-dichotomous variables can be
meaningfully grouped into two distinct categories. Students' knowledge of a
subject, for example, may be continuous and measurable through numerical
interval-scale scores based on test performance, yet may be recorded as either
pass or fail, depending on whether their numerical scores fall above or below
some chosen value.

8.14. How do we compute the Phi-coefficient?

The Phi-coefficient is given by the following expression, which is mostly used
in the case of a 2 by 2 contingency table. Let the 2 by 2 table of frequencies
be

a   b
c   d

Then

φ = (ad − bc) / √[(a + b)(c + d)(a + c)(b + d)]

This works if the observations are grouped in two categories as mentioned
before. Otherwise, the Chi-square test, χ², can be applied if the observations
are grouped in a contingency table with r rows and c columns, that is, an
r by c contingency table.

POINTS TO PONDER
The Phi-coefficient is related to Pearson's chi-square statistic, χ². The
relationship is expressed by φ² = χ²/n. To determine whether a computed value
of φ is significant, convert this value to χ² = nφ² and compare the resulting
χ² with the tabulated chi-square values with 1 degree of freedom. Note that
the Phi-coefficient lies in the range of -1 to 1 inclusive.

NUMERICAL EXAMPLE
Suppose that a sample of 125 employees is classified by sex and job satisfaction as
follows:

Table 8.8: Sex and Job Satisfaction


Job Satisfaction
Sex Yes No Total
Male 15 35 50
Female 50 25 75
Total 65 60 125

Computing,

φ = [(15)(25) − (35)(50)] / √[(50)(75)(65)(60)] = -0.3595

We now have a measure of the strength of the association between gender and
job satisfaction for a sample of 125 workers.

Decision rule
To test for significance, the computed χ² = 125(-0.3595)² = 16.16 > χ²(1, 0.05)
= 3.841 (from the chi-square table). Since the computed value is greater than
the tabulated value, we reject the null hypothesis of 'no association between
sex and job satisfaction.'
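The same calculation can be scripted directly from the 2 by 2 cell counts, with the χ² = nφ² significance check described above. A sketch (cell labels follow Table 8.8):

```python
# Phi-coefficient for the sex / job-satisfaction table (Table 8.8),
# with its chi-square significance check.
from math import sqrt
from scipy.stats import chi2

a, b = 15, 35      # Male:   Yes, No
c, d = 50, 25      # Female: Yes, No
n = a + b + c + d  # 125

phi = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
chi_sq = n * phi**2
critical = chi2.ppf(0.95, df=1)   # 3.841

print(f"phi = {phi:.4f}, chi-square = {chi_sq:.2f}, critical = {critical:.3f}")
print("reject Ho" if chi_sq > critical else "fail to reject Ho")
```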

STATA output using tetrachoric correlation is displayed as follows.
(tetrachoric computes estimates of the tetrachoric correlation coefficients of
the binary variables in the variable list; all these variables should be 0, 1,
or missing.) Here the data in Table 8.8 have been entered as 0 and 1 for both
sex and job satisfaction for all 125 employees in order to apply the
tetrachoric correlation command.

. tetrachoric sex jobsatisfaction, stats(rho se obs) star(0.05)

Number of obs = 125
Tetrachoric rho = -0.5410
Std error = 0.1115

Test of Ho: sex and jobsatisfaction are independent
2-sided exact P = 0.0001

Conclusion: Since the computed p-value = 0.0001 < 0.05, we reject the null
hypothesis that sex and job satisfaction are independent. We therefore conclude
that there is a significant correlation or association between sex and job
satisfaction.

Points to Ponder
Tetrachoric correlations assume a latent bivariate normal distribution (X1, X2)
for each pair of variables (v1, v2), with a threshold model for the manifest
variables (vi = 1 if and only if Xi > 0). The means and variances of the latent
variables are not identified, but the correlation, r, of X1 and X2 can be
estimated from the joint distribution of v1 and v2 and is called the
tetrachoric correlation coefficient.

VI - METHOD 6: Cramer Statistic

8.15. What is the Cramer statistic designed for?

A statistic suggested by Cramer provides an appropriate measure of the
strength of association between categorical variables yielding data that may
be displayed in a contingency table of any size.

8.16. How is the Cramer coefficient defined?

The Cramer coefficient is defined as

C = √[ χ² / (n(t − 1)) ], where χ² = Σ (Oi − Ei)² / Ei,

where χ² is the Chi-square statistic, n is the total sample size, t is either
the number of rows or the number of columns in the contingency table,
whichever is smaller, Oi is an observed value and Ei is the corresponding
expected value.

Note that the expected values can easily be computed using Eij = (Oi.)(O.j)/n
for the ith row and jth column entries, where Oi. and O.j are the row and
column totals.

Decision Rule
Reject the null hypothesis of no association if the computed χ² exceeds the
tabulated critical value χ²((r-1)(c-1), α).

NUMERICAL EXAMPLE
The following data have been collected on question ‘How satisfied are you with the
college you attended?’

Table 8.9: College satisfaction data

College     Level of satisfaction and number of respondents
Attended    Very Satisfied  Satisfied  Unsatisfied  Very Unsatisfied  Total
Bunda       30(20.8)        15(13.0)   10(11.8)     5(14.3)           60
Poly        40(29.6)        20(18.5)   15(16.6)     10(20.3)          85
Nursing     10(29.6)        15(18.5)   20(16.6)     40(20.3)          85
Total       80              50         45           55                230

Note that the total sample size is n = 230, and the numbers in parentheses are
the calculated expected values. For example,

E11 = [Oi.O.j]/n = [O1.O.1]/n = [60x80]/230 = 20.8.


E12 = [Oi.O.j]/n = [O1.O.2]/n = [60x50]/230 = 13.0.
.
.
E34 = [Oi.O.j]/n = [O3.O.4]/n = [85x55]/230 = 20.3.

Now, using the Chi-square equation, χ² = 53.178, and since the number of rows is less
than the number of columns (that is, r = 3 < c = 4), we have t − 1 = 3 − 1 = 2. Thus, the
Cramer Coefficient is

    C = √[ 53.178 / (230 × 2) ] = √0.1156 = 0.34,

and the degrees of freedom are (r−1)(c−1) = (3−1)(4−1) = (2)(3) = 6, with tabulated value
χ²6,0.025 = 14.45. Hence, since χ² = 53.178 > 14.45, we reject the null hypothesis and
conclude that there is an association between the college the students attended and
their level of satisfaction; the Cramer coefficient C = 0.34 indicates that the strength
of this association is moderate.
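The expected counts, χ², and Cramer coefficient for Table 8.9 can be sketched in Python as follows, using only the observed counts given in the text:

```python
# Expected counts, chi-square, and the Cramer coefficient for Table 8.9.
observed = [
    [30, 15, 10, 5],    # Bunda
    [40, 20, 15, 10],   # Poly
    [10, 15, 20, 40],   # Nursing
]

n = sum(sum(row) for row in observed)                    # total sample size, 230
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# E_ij = (row total x column total) / n
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

chi_sq = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected)
             for o, e in zip(o_row, e_row))

t = min(len(observed), len(observed[0]))                 # smaller of rows, columns
cramer_c = (chi_sq / (n * (t - 1))) ** 0.5

print(round(chi_sq, 2), round(cramer_c, 2))              # 53.18 0.34
```

For larger problems, scipy.stats.chi2_contingency performs the same expected-count and χ² computation directly from the observed table.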

Also, STATA results (F = 0.27 and p-value = 0.7703 > 0.05) from a one-way ANOVA of the
numbers of students by level of satisfaction show no significant difference in counts
across the satisfaction levels.
. oneway noofstudents levelofsatisfaction, scheffe

Analysis of Variance
Source SS df MS F Prob > F

Between groups 16.6666667 2 8.33333333 0.27 0.7703


Within groups 183.333333 6 30.5555556

Total 200 8 25

Bartlett's test for equal variances: chi2( 2) = 1.3964 Prob>chi2 = 0.497

Comparison of No. Of Students by Level of Satisfaction


(Scheffe)
Row Mean-
Col Mean Satisfie Unsatisf

Unsatisf -1.66667
0.935

V. Unsat -3.33333 -1.66667


0.770 0.935

Comparisons between the levels of satisfaction themselves are not significant (Scheffe
results), as well.

VII - METHOD 7: Point bi-serial coefficient of correlation

8.17. What is Point Bi-serial Coefficient of Correlation designed for?

Not infrequently, we may wish to assess the strength of the relationship


between two variables, one of which is dichotomous and the other measured
on an interval or ratio scale (continuous). For example, we might wish to
measure the strength of the association between the sexes of children (male
and female) and the amount of time they spend playing or studying (time).
Other examples might be geographic area (urban, rural) and amount of
fertilizer used (kg), educational achievement among young adults (high school
graduates, high school dropouts) and income, and college students’
membership (fraternity and sorority) and grade point average.

Thus, subjects in such studies in the population, and consequently those in a
sample drawn from the population, will have two measurements of interest:
one on the dichotomous variable and one on the interval or ratio scale. Unlike
the other correlation coefficient methods we have dealt with, the correlation
between two such variables (dichotomous and interval scale) is measured by
the Point Bi-Serial Correlation Coefficient method.

8.18. How do we calculate Point bi-serial correlation coefficient
measurement?

When the point bi-serial correlation coefficient is of interest, we use B(b1,
b2, …, bn) to designate the dichotomous variable (for example, b = 1 if yes
and b = 0 if no) and A(a1, a2, …, an) to designate the continuous (interval or
ratio scale) variable. To minimize the calculation burden, let one of the two
possible values of b be 1 and the other value be 0.

We use the symbol rpb to designate the point bi-serial correlation coefficient
computed from the sample data. The simplest formula for this is given as

    rpb = n1 n0 (ā1 − ā0) / [ n √( Σ(ai − ā)² ) ],

where n1 is the number of 1's and n0 is the number of 0's observed in a
sample of n subjects (n1 + n0 = n), ā1 is the mean value of a for the n1
subjects, ā0 is the mean value of a for the n0 subjects, ā is the overall
sample mean, and Σ(ai − ā)² is the numerator of the variance of the
sample of all a measurements.

POINTS TO PONDER
Like any other correlation coefficient, the value of rpb ranges between –1 and 1
inclusive.

NUMERICAL EXAMPLE
In a study of the association between income and education, the data in Table 8.10
were obtained on a sample of 25-year old individuals who either completed or did not
complete college.

Table 8.10: Income and college attendance data


Observation   Income per year   Completed College    Observation   Income per year   Completed College
              (Euro), ai        (Yes = 1, No = 0)                  (Euro), ai        (Yes = 1, No = 0)

1 24000 1 16 41000 1
2 15000 1 17 11500 0
3 35000 0 18 13500 0
4 31000 1 19 32000 0
5 20000 0 20 12000 1
6 50000 1 21 25000 1

7 13000 0 22 33000 1
8 10100 1 23 21000 0
9 27000 1 24 32000 0
10 90000 1 25 29000 1
11 13000 0 26 19000 0
12 10000 1 27 21000 0
13 21400 0 28 14000 1
14 16000 0 29 16000 1
15 10000 0 30 30000 1

ā=24516.67, ā1= 28568.75, ā0= 19885.71

For this data set, we obtain the number of Yes's, n1 = 16, the number of No's, n0 = 14,
the average Yes-income, ā1 = 28568.75, and the average No-income, ā0 = 19885.71,
and computing

    rpb = [16(14)(28568.75 − 19885.71)] / [30 × 85880.51] = 7.47 × 0.101 = 0.755 = 75.5%.

Decision Rule
Hence, since the correlation rpb = 0.755 (75.5%), we can conclude that there is a strong
association between income and completing college in this particular study.
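The calculation above can be sketched in Python, using the Table 8.10 data and the formula exactly as printed in this section. Note that some references define rpb with √(n1 n0 / n) in place of n1 n0 / n (the Pearson-equivalent form), which yields a smaller value for these data; the sketch below follows this section's formula.

```python
# Point bi-serial correlation for the Table 8.10 data, following the
# formula as printed in this section:
#   r_pb = n1 * n0 * (a1_bar - a0_bar) / (n * sqrt(sum((ai - a_bar)^2)))
incomes = [24000, 15000, 35000, 31000, 20000, 50000, 13000, 10100, 27000,
           90000, 13000, 10000, 21400, 16000, 10000, 41000, 11500, 13500,
           32000, 12000, 25000, 33000, 21000, 32000, 29000, 19000, 21000,
           14000, 16000, 30000]
completed = [1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0,
             1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1]

n = len(incomes)
ones = [a for a, b in zip(incomes, completed) if b == 1]
zeros = [a for a, b in zip(incomes, completed) if b == 0]
n1, n0 = len(ones), len(zeros)

a_bar = sum(incomes) / n
a1_bar, a0_bar = sum(ones) / n1, sum(zeros) / n0
ss = sum((a - a_bar) ** 2 for a in incomes)   # numerator of the variance

r_pb = n1 * n0 * (a1_bar - a0_bar) / (n * ss ** 0.5)
print(n1, n0, round(r_pb, 3))                 # 16 14 0.755
```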

Did you know.....


Greed governs the world. Greed caused the global financial crisis, built on
financial interest. Interest is the cancer of global, national and local
communities' financial investment. The only way out is to abolish financial
interest, to restore morality, save human dignity and uplift economic
activity among the majority.

========================================================
MENTAL GYMNASTICS
CHAPTER EIGHT
=======================================================

1. Distinguish between the following pairs of terms.


a) Parametric and Non-parametric statistics
b) Point bi-serial correlation and Correlation coefficient
c) Multiple comparison test and Phi-coefficient

2. True, False or Uncertain. Support your answer.


a) Non-parametric statistics complements parametric statistics.
b) Phi-coefficient is used when values of non-dichotomous variables can be
meaningfully grouped into two distinct categories.
c) Chi-square statistics is not different from point bi-serial correlation.

3. Discuss advantages and disadvantages of non-parametric statistical procedures.

4. Why does non-parametric statistical procedure allow sample size less than 30?

5. What is the difference between Pearson Correlation coefficient and Spearman rank
correlation coefficient?

6. Explain formulating hypothesis in parametric and non-parametric procedure, and


give example for each procedure.

7. What does comparing all treatments with a control procedure in non-parametric


statistics involve?

8. Calculate the correlation for the following data, and draw conclusion accordingly.
Sex (male = 0, female = 1)    Income ('000) MK
0                             55
0                             43
1                             38
0                             50
1                             49
0                             42
1                             60
1                             70
1                             65
0                             55

9. Use the data given in the table. Which fertilizers are superior to no fertilizer?
Fertilizer type, plot number and yield of grains

Fertilizer Type Plot No Yield (in coded form)


O (no fertilizer) 1 68 39 47 46 41 30 41 40 58 66
A 2 78 77 59 50 60 45 61 70 69 50
B 3 90 90 81 82 95 87 77 76 87 76
C 4 100 120 100 113 100 90 110 112 90

10. Why do you think populations do not always meet the assumptions underlying
parametric tests? Do some research and build your case.

We have looked West, East, North and South for
quite some time for assistance. But we have
developed nothing and reached nowhere!

We look neither West nor East; neither South nor
North. Let us look inside ourselves and find solutions
to our problems. Otherwise, we will never shift from
begging mode, dismantle dependency and get out
of poverty.
-An Ethiopian farmer who broke the vicious poverty cycle
in his household, advising his fellow farmers

