
NON-PARAMETRIC
Group 7

Debby Handayani Pratiwi

Rahmad Dede Asriadi

Reza Augusti Putri

Yohan Restuadi
What is Non-parametric?
– Also known as distribution-free tests because they are based on fewer assumptions.
– A non-parametric test does not assume anything about the underlying distribution (for example, that the data come from a normal distribution).
– For example, one assumption of the one-way ANOVA is that the data come from a normal distribution. If your data are not normally distributed, you cannot run an ANOVA, but you can run the non-parametric alternative (the Kruskal-Wallis test), as sketched below.
– Non-parametric tests can perform well with non-normal continuous data if you have a sufficiently large sample size (generally 15-20 items in each group).
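As an illustration of that choice between ANOVA and the Kruskal-Wallis test, here is a minimal sketch with made-up data (not from these slides); the helper looks_normal and the group names are mine:

```python
# Illustrative sketch only: hypothetical data, not taken from the slides.
from scipy import stats

group_a = [4.1, 3.9, 5.2, 4.8, 4.4, 12.0, 4.0, 4.6, 5.0, 4.3, 4.7, 4.5, 4.2, 4.9, 5.1]
group_b = [5.0, 5.3, 4.9, 5.6, 5.1, 5.4, 5.2, 5.8, 5.0, 5.5, 5.7, 4.8, 5.3, 5.6, 5.2]
group_c = [6.1, 5.9, 6.4, 6.0, 6.3, 6.2, 5.8, 6.5, 6.1, 6.0, 6.2, 6.4, 5.9, 6.3, 6.1]

def looks_normal(sample, alpha=0.05):
    """Shapiro-Wilk check: a small p-value suggests the data are not normal."""
    _, p = stats.shapiro(sample)
    return p >= alpha

if all(looks_normal(g) for g in (group_a, group_b, group_c)):
    stat, p = stats.f_oneway(group_a, group_b, group_c)      # one-way ANOVA
    print(f"ANOVA: F = {stat:.3f}, p = {p:.4f}")
else:
    stat, p = stats.kruskal(group_a, group_b, group_c)       # non-parametric alternative
    print(f"Kruskal-Wallis: H = {stat:.3f}, p = {p:.4f}")
```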
When to Use it
– Non-parametric tests can be applied when:
 Your data are not normally distributed
 One or more assumptions of a parametric test have been violated
 Your sample size is too small to run a parametric test
 Your data have outliers that cannot be removed
 You want to test the median rather than the mean (you might want to do this if you have a very skewed distribution)
 Your data are measured on any scale (nominal, ordinal, interval, or ratio)
Advantages of Non-parametric tests
Compared to parametric tests, non-parametric tests have several
advantages, including:
 More statistical power when the assumptions of the parametric test have been violated; when the assumptions have not been violated, they can be almost as powerful
 Fewer assumptions (the assumption of normality does not apply)
 Small sample sizes are acceptable
 They can be used for all data types, including nominal variables, interval variables, or data that have outliers or that have been measured imprecisely
Disadvantages of Non-parametric tests
However, they do have their disadvantages. The most notable ones
are:
 Less powerful than parametric tests if assumptions have not been
violated
 More labor-intensive to calculate by hand (for computer
calculations, this is not an issue)
 Critical value tables for many non-parametric tests are not included in many software packages, whereas tables for parametric tests usually are
Non-parametric Methods
– Mann-Whitney-Wilcoxon
– Kruskal-Wallis
Mann-Whitney-Wilcoxon (U test)
– Also known as the Wilcoxon rank-sum test
– It is the non-parametric counterpart of the t test used to compare the means of two independent populations
– The two-tailed hypotheses tested with the Mann-Whitney-Wilcoxon U test are as follows:
 H0 : the two populations are identical
 Ha : the two populations are not identical
– The following assumptions underlie the use of the Mann-Whitney-Wilcoxon U
test
 The samples are independent
 The level of the data is at least ordinal
Kruskal-Wallis
– The Kruskal-Wallis test is used to determine whether c ≥ 3 samples come from the same or different populations.
– The Kruskal-Wallis test is based on the assumption that the c groups are independent and that individual items are selected randomly.
– The hypotheses tested by the Kruskal-Wallis test are:
 H0: the c populations are identical
 Ha: at least one of the c populations is different
Example : Mann-Whitney-Wilcoxon (U test)
– Case Example for Small Samples (n ≤ 20)
For example, the Statistics Team is curious to know whether there is a difference between men's pulses and women's pulses. A sample of men and a sample of women is taken and each person's pulse is measured. The results of each pulse measurement are as follows.

Men's pulse    Women's pulse
90             79
89             82
82             85
89             88
91             85
86             80
85             80
86
84
Hypothesis:
H0: A woman's pulse is the same as a man's pulse
H1: A woman's pulse is different from a man's pulse
The first step is to combine the two samples into one group and rank the observations as follows:

Pulse   Rank    Sex
79      1       Woman
80      2.5     Woman
80      2.5     Woman
82      4.5     Man
82      4.5     Woman
84      6       Man
85      8       Man
85      8       Woman
85      8       Woman
86      10.5    Man
86      10.5    Man
88      12      Woman
89      13.5    Man
89      13.5    Man
90      15      Man
91      16      Man
Next, sum the ranks for each sample:

Men's pulse   Rank    Women's pulse   Rank
90            15      79              1
89            13.5    82              4.5
82            4.5     85              8
89            13.5    88              12
91            16      85              8
86            10.5    80              2.5
85            8       80              2.5
86            10.5
84            6
∑             97.5                    38.5
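The joint ranking above, with tied values receiving average ranks, and the two rank sums can be reproduced with scipy.stats.rankdata; a minimal sketch using the pulse data from this example:

```python
# Rank the combined pulses (ties get average ranks) and sum the ranks per sample.
from scipy.stats import rankdata

men   = [90, 89, 82, 89, 91, 86, 85, 86, 84]   # n1 = 9
women = [79, 82, 85, 88, 85, 80, 80]           # n2 = 7

ranks = rankdata(men + women)                  # joint ranking of all 16 pulses

r_men   = ranks[:len(men)].sum()               # sum of the men's ranks
r_women = ranks[len(men):].sum()               # sum of the women's ranks

print(r_men, r_women)                          # 97.5 38.5
```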
Calculate the U test statistic

After going through the steps above, it is time to calculate the U test statistic. First, calculate U1 from the rank sum of the men's sample (R1 = 97.5, n1 = 9):

U1 = R1 - n1(n1 + 1)/2
U1 = 97.5 - 9(10)/2
U1 = 52.5

Meanwhile, U2 can be obtained from the formula:

U2 = n1·n2 - U1
U2 = (9)(7) - 52.5
U2 = 10.5
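The same values follow in a couple of lines of Python, and scipy.stats.mannwhitneyu can serve as a cross-check (in recent scipy versions the reported statistic is the U belonging to the first sample, 52.5 here, together with a p-value); a minimal sketch:

```python
# U1 and U2 from the rank sum, plus a cross-check with scipy's built-in test.
from scipy.stats import mannwhitneyu

men   = [90, 89, 82, 89, 91, 86, 85, 86, 84]   # n1 = 9, rank sum R1 = 97.5
women = [79, 82, 85, 88, 85, 80, 80]           # n2 = 7

n1, n2, R1 = len(men), len(women), 97.5

U1 = R1 - n1 * (n1 + 1) / 2                    # 97.5 - 45   = 52.5
U2 = n1 * n2 - U1                              # 63   - 52.5 = 10.5
print(U1, U2, min(U1, U2))                     # the smaller value goes to the table

stat, p = mannwhitneyu(men, women, alternative="two-sided")
print(stat, p)                                 # U for the men's sample and a two-sided p-value
```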
Then, of the two values, the smaller one, 10.5, is used for comparison with the Mann-Whitney table.

Conclusion
Because the U test statistic is smaller than the critical U value from the Mann-Whitney table, that is 10.5 < 12, H0 is rejected and H1 is accepted. It can be concluded that there is a difference between men's pulses and women's pulses.
– Case Example for Large Samples (n > 20)

The Statistics Team gets a case in a study of residential density in fishing and agricultural areas, using α = 0.05. The team wondered whether there is a difference in residential density between the fishing area and the agricultural area. The data obtained are shown in the table below, already ranked in the same way as in the example above.

Density of fishermen's houses   Rank    Density of the farmhouse   Rank
4.25                            37      1.75                       1
3.1                             21      2.35                       8
3.25                            25      3.22                       23
3.05                            19      3.4                        29
2.41                            10      2.67                       13
2.15                            6       4.01                       33
2.25                            7       1.9                        3
3.52                            31      2.48                       11
2.03                            5       3.33                       27
1.85                            2       3.26                       26
4.19                            36      2.89                       17
2.86                            15      3.35                       28
4.02                            34      2.87                       16
3.83                            32      2.55                       12
1.92                            4       3.46                       30
3.02                            18
3.23                            24
4.05                            35
3.21                            22
3.09                            20
2.83                            14
2.36                            9
∑                               419                                284
Hypothesis:
H0: The density of fishermen's houses and farmhouses is the same
H1: There is a difference in the density of fishermen's houses and farmhouses

Calculate the U test statistic

Before carrying out the test statistic calculation, do the same steps as in the previous example: rank the combined data and sum the ranks, giving the results in the table above. Then go straight to the calculations. First find U1 from the rank sum of the farmhouse sample (R1 = 284, n1 = 15):

U1 = R1 - n1(n1 + 1)/2
U1 = 284 - 15(16)/2
U1 = 164

Second, U2 can be obtained from the formula:

U2 = n1·n2 - U1
U2 = (15)(22) - 164
U2 = 166
In contrast to the small-sample case, for large samples the Z table is used, so the U value that has been obtained must be converted to a z value:

z = (U1 - n1·n2/2) / √(n1·n2(n1 + n2 + 1)/12)
z = (164 - 165) / √((15)(22)(38)/12)
z = -0.0309

If we instead enter the value of U2, the result is the opposite of that for U1, namely +0.0309, so there is no need to calculate it again; the positive value, 0.0309, is the one that is compared. After obtaining the value of z, the final step is to look up the critical value in the Z table: for a two-tailed test with α = 5% it is 1.96.
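The same z values can be reproduced with the standard large-sample normal approximation (without a tie correction), using the U, n1, and n2 values from this example; a minimal sketch:

```python
# Normal approximation for the Mann-Whitney U statistic in the large-sample case.
import math

n1, n2 = 15, 22
U1, U2 = 164, 166

mu    = n1 * n2 / 2                                  # mean of U under H0: 165
sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)      # standard deviation of U

z1 = (U1 - mu) / sigma                               # about -0.0309
z2 = (U2 - mu) / sigma                               # about +0.0309
print(round(z1, 4), round(z2, 4))

# |z| = 0.0309 < 1.96, so H0 is not rejected at the 5% level (two-tailed).
```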
Conclusion
Because the z test statistic is smaller than the critical value from the Z table, that is 0.0309 < 1.96, H0 is accepted and H1 is rejected. It can be concluded that there is no difference in density between fishermen's houses and farmhouses.
Kruskal-Wallis Test Example
– Crason and team reported data on cortisol levels in three groups of patients giving birth between 38 and 42 weeks' gestation. Group I was observed before an elective cesarean section was carried out. Group II was observed during a cesarean section that had to be chosen because normal delivery was unsuccessful. Group III consists of patients who were able to give birth normally, although some of them chose to give birth by cesarean section. We want to know whether these data provide sufficient evidence of differences in the median cortisol level between the three populations represented. The data are as follows:

Group 1: 262 307 211 323 454 339 304 154 287 356
Group 2: 465 501 455 355 468 362
Group 3: 343 772 207 1048 838 687
Hypotheses
H0: The three populations represented by the data are identical
H1: The three populations do not all have the same median

Test Statistic
Before calculating the test statistic, the first step is to rank the data as follows:

Group 1: 4 7 3 8 14 9 6 1 5 12    ∑ = 69
Group 2: 16 18 15 11 17 13        ∑ = 90
Group 3: 10 20 2 22 21 19         ∑ = 94

Then sum the ranks of each group. The results are:

R1 = 69, R2 = 90 and R3 = 94

From these rank sums the test statistic can be calculated:

H = 12/(n(n+1)) · Σ(Rj²/nj) - 3(n+1)
H = 12/(22·23) · (69²/10 + 90²/6 + 94²/6) - 3(23)
H ≈ 9.232
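A minimal sketch that reproduces this H value, both from the rank sums above and with scipy.stats.kruskal on the raw cortisol data (there are no ties here, so scipy's tie correction changes nothing):

```python
# Kruskal-Wallis H from the rank sums, cross-checked against scipy on the raw data.
from scipy.stats import kruskal

group1 = [262, 307, 211, 323, 454, 339, 304, 154, 287, 356]
group2 = [465, 501, 455, 355, 468, 362]
group3 = [343, 772, 207, 1048, 838, 687]

R     = [69, 90, 94]                  # rank sums from the table above
sizes = [10, 6, 6]                    # group sizes
n     = sum(sizes)                    # 22 observations in total

H = 12 / (n * (n + 1)) * sum(r**2 / m for r, m in zip(R, sizes)) - 3 * (n + 1)
print(round(H, 3))                    # about 9.232

stat, p = kruskal(group1, group2, group3)
print(round(stat, 3), round(p, 4))    # same H and its chi-square p-value
```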
Conclusion
Because all sample sizes are greater than 5, we use the chi-square table to decide whether the sample medians are significantly different. The critical chi-square value for df = k - 1 = 3 - 1 = 2 is 9.210 at α = 0.01. Because H = 9.232 > χ²(0.99; 2) = 9.210, we reject H0 at this significance level and conclude that the medians of the three populations represented are not all the same. The P value for this example lies between 0.01 and 0.005.
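The chi-square critical value and the P-value range quoted above can be checked with scipy.stats.chi2; a minimal sketch:

```python
# Chi-square reference values for the conclusion (df = k - 1 = 2).
from scipy.stats import chi2

critical = chi2.ppf(0.99, df=2)       # 0.99 quantile: about 9.210
p_value  = chi2.sf(9.232, df=2)       # upper-tail probability of H = 9.232

print(round(critical, 3), round(p_value, 4))
# H = 9.232 > 9.210, and the P value lies between 0.01 and 0.005.
```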
