Non Parametric
Group 7
Yohan Restuadi
What is Non-parametric?
– Non-parametric tests are also known as distribution-free tests because they are
based on fewer assumptions.
– A non-parametric test does not assume anything about the underlying
distribution (for example, that the data come from a normal distribution).
– For example, one assumption of the one-way ANOVA is that the data come
from a normal distribution. If your data are not normally distributed, you cannot run
an ANOVA, but you can run its non-parametric alternative (the Kruskal-Wallis
test).
– Non-parametric tests can perform well with non-normal continuous data if you
have a sufficiently large sample size (generally 15-20 items in each group).
When to Use it
– Non-parametric tests can be applied when:
Your data is not normally distributed
One or more assumptions of a parametric test have been
violated
Your sample size is too small to run a parametric test
Your data has outliers that cannot be removed
You want to test for the median rather than the mean (you
might want to do this if you have a very skewed distribution)
Your data can be measured on any scale
Advantages of Non-parametric tests
Compared to parametric tests, non-parametric tests have several
advantages, including:
More statistical power when the assumptions of the parametric tests
have been violated. When the assumptions have not been violated,
they can be almost as powerful.
Fewer assumptions (the assumption of normality does not apply)
Small sample sizes are acceptable
They can be used for all data types, including nominal variables,
interval variables, or data that has outliers or that has been
measured imprecisely
Disadvantages of Non-parametric tests
However, they do have their disadvantages. The most notable ones
are:
Less powerful than parametric tests if assumptions have not been
violated
More labor-intensive to calculate by hand (for computer
calculations, this is not an issue)
Critical value tables for many tests are not included in many
computer software packages, whereas tables for parametric tests
usually are.
Non-parametric Methods
– Mann-Whitney-Wilcoxon
– Kruskal-Wallis
Mann-Whitney-Wilcoxon (U test)
– Also known as the Wilcoxon rank sum test
– It is a non-parametric counterpart of the t test used to compare the means of
two independent populations
– The two-tailed hypotheses being tested with the Mann-Whitney-Wilcoxon U
test are as follows:
H0: the two populations are identical
Ha: the two populations are not identical
– The following assumptions underlie the use of the Mann-Whitney-Wilcoxon U
test:
The samples are independent
The level of the data is at least ordinal
Kruskal-Wallis
– The Kruskal-Wallis test is used to determine whether c ≥ 3
samples come from the same or different populations.
– The Kruskal-Wallis test is based on the assumption that
the c groups are independent and that individual items
are selected randomly.
– The hypotheses tested by the Kruskal-Wallis test follow:
H0: the c populations are identical
Ha: at least one of the c populations is different
Example: Mann-Whitney-Wilcoxon (U test)
– Case Example for Small Samples (n ≤ 20)
The Statistics Team wants to know whether men's and women's pulse rates differ. A sample of men and a
sample of women is taken and each person's pulse is measured. The pulse values and their ranks in the pooled
sample are as follows.
Sample 1 pulse   Rank    Sample 2 pulse   Rank
90               15      79               1
89               13.5    82               4.5
82               4.5     85               8
89               13.5    88               12
91               16      85               8
86               10.5    80               2.5
85               8       80               2.5
86               10.5
84               6
∑                97.5                     38.5
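The ranking step behind the table can be sketched in a few lines of Python. This is only an illustration, assuming the usual convention that tied values share the average of the positions they occupy in the pooled, sorted sample; the names `average_ranks`, `sample_1` and `sample_2` are made up for this sketch, and the first sample is taken to be the one with nine values.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Pulse values copied from the table (sample 1 has 9 values, sample 2 has 7).
sample_1 = [90, 89, 82, 89, 91, 86, 85, 86, 84]
sample_2 = [79, 82, 85, 88, 85, 80, 80]

# Rank the pooled data, then split the ranks back into the two samples.
ranks = average_ranks(sample_1 + sample_2)
rank_sum_1 = sum(ranks[:len(sample_1)])
rank_sum_2 = sum(ranks[len(sample_1):])
print(rank_sum_1, rank_sum_2)
```

The two rank sums should match the ∑ row of the table, and together they should equal N(N+1)/2 for the N = 16 pooled observations, which is a quick check that no rank was skipped.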
Calculate the U test statistic value
With the ranks assigned, calculate the U test statistics. First calculate U1 (here U1 = 52.5), then
obtain U2 from it:
U2 = n1 × n2 - U1
U2 = 9 × 7 - 52.5
U2 = 10.5
The smaller of the two values, 10.5, is the one compared with the critical value in the
Mann-Whitney table.
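The U1 step itself is not written out in the text. As a hedged reconstruction, one standard formula that reproduces U1 = 52.5 from the rank sum of the nine-value sample is shown below; R1 (the rank sum of that sample, 97.5) is notation introduced here, and the original calculation may have used an equivalent form.

```latex
\begin{aligned}
U_1 &= R_1 - \frac{n_1(n_1+1)}{2} = 97.5 - \frac{9 \cdot 10}{2} = 97.5 - 45 = 52.5 \\
U_2 &= n_1 n_2 - U_1 = 9 \cdot 7 - 52.5 = 63 - 52.5 = 10.5
\end{aligned}
```

Whichever form is used, U1 + U2 must equal n1 × n2 = 63, which holds here.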
Conclusion
The U test statistic is smaller than the critical U value from the Mann-Whitney
table (10.5 < 12), so H0 is rejected and H1 is accepted. It can be
concluded that there is a difference between men's and women's pulse rates.
– Case Example for Large Samples (n > 20)
The Statistics Team is studying residential density in fishing and agricultural areas, using α = 0.05.
The team wants to know whether residential density differs between the fishing area and the
agricultural area. The data obtained are shown in the table below, ranked in the same way as in
the example above.

Density of fishermen's houses   Rank    Density of farmhouses   Rank
4.25                            37      1.75                    1
3.1                             21      2.35                    8
3.25                            25      3.22                    23
3.05                            19      3.4                     29
2.41                            10      2.67                    13
2.15                            6       4.01                    33
2.25                            7       1.9                     3
3.52                            31      2.48                    11
2.03                            5       3.33                    27
1.85                            2       3.26                    26
4.19                            36      2.89                    17
2.86                            15      3.35                    28
4.02                            34      2.87                    16
3.83                            32      2.55                    12
1.92                            4       3.46                    30
                                        3.02                    18
                                        3.23                    24
                                        4.05                    35
                                        3.21                    22
                                        3.09                    20
                                        2.83                    14
                                        2.36                    9
∑                               284                             419
Hypothesis:
H0: The density of the fishermen's houses and the farmers' houses is the same
H1: There is a difference in the density of the fishermen's houses and the farmers' houses
U2 = n1 × n2 - U1
U2 = 15 × 22 - 164
U2 = 166
Unlike small samples, large samples use the Z table, so the U value obtained must be converted
to a z value.
Entering U1 gives z = -0.0309. Entering U2 instead gives the same magnitude with the opposite
sign, +0.0309, so there is no need to calculate it again. The positive value, 0.0309, is the one
used for comparison. The final step is to look up the critical value in the Z table: for a
two-tailed test with α = 5%, it is 1.96.
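The conversion from U to z can be sketched as follows. This is a minimal illustration of the usual normal approximation, assuming no correction for ties or continuity, and assuming U1 is obtained from the rank sum of the 15 fishermen's houses (284); the function name `u_to_z` is hypothetical.

```python
import math


def u_to_z(u, n1, n2):
    """Normal approximation for U (no tie or continuity correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


n1, n2 = 15, 22          # fishermen's houses, farmhouses
rank_sum_1 = 284         # rank sum of the fishermen's houses

u1 = rank_sum_1 - n1 * (n1 + 1) / 2   # assumed formula for U1
u2 = n1 * n2 - u1

z1 = u_to_z(u1, n1, n2)
z2 = u_to_z(u2, n1, n2)

z_critical = 1.96        # two-tailed, alpha = 0.05
reject_h0 = abs(z1) > z_critical
print(u1, u2, round(z1, 4), round(z2, 4), reject_h0)
```

Because U1 and U2 lie symmetrically around n1 × n2 / 2, the two z values differ only in sign, which is why only one of them needs to be computed.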
Conclusion
The z test statistic is smaller than the critical value from the Z table
(0.0309 < 1.96), so H0 is not rejected. It can be
concluded that there is no significant difference in the density of the fishermen's and farmers'
houses.
Kruskal-Wallis Test Example
– Crason and team reported data on cortisol levels in three groups of patients giving birth
between 38 and 42 weeks' gestation. Group I was observed before an elective Caesarean
section. Group II was observed at a Caesarean section that had to be performed because
normal labor was unsuccessful. Group III consists of patients who could give birth
normally, although some chose to deliver by Caesarean section. We want to know whether
these data provide sufficient evidence of differences in median cortisol level among the
three populations represented.
The data are as follows:
Group I     262 307 211 323 454 339 304 154 287 356
Group II    465 501 455 355 468 362
Group III   343 772 207 1048 838 687
Hypotheses
H0: The three populations represented by the data are identical
H1: Not all three populations have the same median
Test Statistics
Before calculating the test statistic, the first step is to rank the data as follows:
Group I     4 7 3 8 14 9 6 1 5 12     ∑ = 69
Group II    16 18 15 11 17 13         ∑ = 90
Group III   10 20 2 22 21 19          ∑ = 94
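The text stops at the ranking step. As a hedged sketch of what would typically come next, the standard Kruskal-Wallis statistic H = 12 / (N(N+1)) × Σ(Rj² / nj) - 3(N+1) can be computed from the rank sums above. The function name `kruskal_wallis_h` is made up for this illustration, and no tie correction is applied (the data contain no tied values).

```python
def kruskal_wallis_h(groups):
    """H statistic from (rank_sum, group_size) pairs, without tie correction."""
    n_total = sum(size for _, size in groups)
    weighted = sum(rank_sum ** 2 / size for rank_sum, size in groups)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# (rank sum, number of observations) for the three groups in the table.
groups = [(69, 10), (90, 6), (94, 6)]

h = kruskal_wallis_h(groups)
degrees_of_freedom = len(groups) - 1
print(round(h, 2), degrees_of_freedom)
```

H is then usually compared with a chi-square critical value with c - 1 degrees of freedom at the chosen α; the significance level for this example is not stated in the text.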