
Ph.D. in Nursing

RESEARCH METHODS

Module No: 8 DATA ANALYSIS

Name of the subtopic


8.2 NON PARAMETRIC TEST

Faculty Name;
Date:
Subject Code: School of Nursing
Learning Objectives
• Meaning of non-parametric tests
• Application and purpose
• Models of non-parametric tests
• Types of non-parametric tests
• Parametric and non-parametric significance
• Parametric and non-parametric prediction
• Some concepts related to the statistical methods
• Selected nonparametric tests
• Disadvantages
• To distinguish parametric and nonparametric tests of significance
• To identify situations in which the use of nonparametric tests is appropriate
List of contents
• Introduction
• Meaning of non-parametric tests
• Application and purpose
• Models of non-parametric tests
• Types of non-parametric tests
• Parametric and non-parametric significance
• Parametric and non-parametric prediction
• Some concepts related to the statistical methods
• Selected nonparametric tests
• Disadvantages
• Summary
• References
Introduction
• Nonparametric statistics are statistics that are not based on parameterized families of probability distributions.
• They include both descriptive and inferential statistics. Typical parameters of a distribution are the mean, the variance, and so on. Unlike parametric statistics, nonparametric statistics make no assumptions about the form of the probability distributions of the variables being assessed.
• A parametric model has a fixed number of parameters, whereas in a non-parametric model the number of parameters grows with the amount of training data.
Meaning of non-parametric test
• Statistical tests that are not based on a normal distribution of the data or on any other specific distributional assumption. They are also known as distribution-free tests, and the data are generally ranked or grouped.
• Non-parametric statistics do not assume any particular underlying distribution for the population.
• Non-parametric does not mean that the model lacks parameters, but that the number and nature of the parameters are flexible.
Non-parametric test
• The first meaning of non-parametric covers techniques that do not rely on data belonging to any particular distribution.
• The second meaning covers techniques that do not assume that the structure of a model is fixed. Typically, the model grows in size to accommodate the complexity of the data. In these techniques, individual variables are typically assumed to belong to parametric distributions, and assumptions about the types of connections among variables are also made.
Non-parametric test: meaning
• Non-parametric statistics can also refer to a statistic (a function of a sample that does not depend on any parameter) whose interpretation does not depend on the population fitting any parameterized distribution.
• Order statistics, which are based on the ranks of observations, are one example of such statistics, and they play a central role in many non-parametric approaches.
Why nonparametric tests
• The distribution from which the sample is drawn is unknown.
• The population distribution is not normal, for example because too many variables are involved.
Usage
• Decision making and forecasting.
• Studying populations that take on a ranked order (such as movie reviews receiving one to four stars).
• Simple analysis.
Applications and purpose
• Non-parametric methods are widely used for studying populations that take on a ranked order (such as movie reviews receiving one to four stars).
• The use of non-parametric methods may be necessary when data have a ranking but no clear numerical interpretation, such as when assessing preferences.
• In terms of levels of measurement, non-parametric methods are suited to "ordinal" data.
• As non-parametric methods make fewer assumptions, their applicability is much wider than that of the corresponding parametric methods. In particular, they may be applied in situations where less is known about the application in question. Also, because they rely on fewer assumptions, non-parametric methods are more robust.
Applications and purpose
• Another justification for the use of non-parametric methods is simplicity. In certain cases, even when the use of parametric methods is justified, non-parametric methods may be easier to use. Because of both this simplicity and their greater robustness, non-parametric methods are seen by some statisticians as leaving less room for improper use and misunderstanding.
• The wider applicability and increased robustness of non-parametric tests come at a cost: in cases where a parametric test would be appropriate, non-parametric tests have less power.
• In other words, a larger sample size can be required to draw conclusions with the same degree of confidence.
Non-parametric models
• Non-parametric models differ from parametric models in that the model structure is not specified a priori but is instead determined from data.
• The term non-parametric is not meant to imply that such models completely lack parameters but that the number and nature of the parameters are flexible and not fixed in advance.
• A histogram is a simple nonparametric estimate of a probability distribution.
• Kernel density estimation provides better estimates of the density than histograms.
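
As an illustration of kernel density estimation, here is a minimal sketch using a Gaussian kernel in plain Python. The sample values and the bandwidth of 0.5 are hypothetical choices made only for this example; in practice the bandwidth has to be chosen with care.

```python
from math import exp, pi, sqrt

def gaussian_kde(sample, bandwidth):
    """Return a density function built by centring one Gaussian on each data point."""
    n = len(sample)
    norm = n * bandwidth * sqrt(2 * pi)

    def density(x):
        return sum(exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample) / norm

    return density

# Hypothetical sample (not from the slides).
sample = [1.0, 2.0, 2.5, 4.0]
density = gaussian_kde(sample, bandwidth=0.5)

# A density estimate should integrate to roughly 1 (simple Riemann sum over [-5, 10]).
step = 0.01
area = sum(density(-5 + i * step) for i in range(1501)) * step
print(round(area, 3), round(density(2.0), 3))
```

Note that nothing about the shape of the distribution is fixed in advance: the estimate is assembled entirely from the data points, which is what makes it non-parametric.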

Non-parametric models
• Non-parametric regression and semi-parametric regression methods have been developed based on kernels and splines.
• Data envelopment analysis provides efficiency coefficients similar to those obtained by multivariate analysis, without any distributional assumption.
• KNNs (k-nearest neighbours) classify an unseen instance based on the K points in the training set that are nearest to it.
• A support vector machine (with a Gaussian kernel) is a nonparametric large-margin classifier.
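
The nearest-neighbour idea can be sketched in a few lines. This is an illustrative sketch only: the training points, the labels "low" and "high", and the choice of K = 3 with squared Euclidean distance are all assumptions made for the example.

```python
from collections import Counter

def knn_classify(training, query, k):
    """Label the query by majority vote among its k nearest training points."""
    by_distance = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set: (features, label) pairs.
training = [
    ((1.0, 1.0), "low"), ((1.5, 2.0), "low"), ((2.0, 1.5), "low"),
    ((6.0, 6.5), "high"), ((7.0, 6.0), "high"), ((6.5, 7.0), "high"),
]

print(knn_classify(training, (1.2, 1.4), k=3))
print(knn_classify(training, (6.4, 6.4), k=3))
```

The "model" here is the training set itself, so it grows with the data instead of having a fixed set of parameters.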

Types of Non-parametric test
1. One sample test
• Chi-square test
• One sample sign test
2. Two samples test
• Median test
• Two samples sign test
3. K-samples test
• Median test
• Kruskal Wallis test
Types of Non-parametric test
4. Chi-square test (χ²):
• Used to compare observed and expected data.
  1. Test of goodness of fit
  2. Test of independence
  3. Test of homogeneity
5. Kruskal-Wallis test:
• For testing whether samples originate from the same distribution.
• Used for comparing more than two samples that are independent, or not related.
• Alternative to ANOVA.
Types of Non-parametric test
6. Wilcoxon signed-rank test:
• Used when comparing two related samples or repeated measurements on a single sample to assess whether their population mean ranks differ.
Types of Non-parametric test
7. Median test:
• Used to test the null hypothesis that the medians of the populations from which two samples are drawn are identical.
• The data in each sample are assigned to two groups, one consisting of data whose values are higher than the median value in the two groups combined, and the other consisting of data whose values are at the median or below.
8. Sign test:
• Can be used to test the hypothesis that there is "no difference in medians" between the continuous distributions of two random variables X and Y.
9. Fisher's exact test:
• Test used in the analysis of contingency tables where sample sizes are small.
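
To make the sign test concrete, here is a minimal sketch. It assumes paired observations, drops pairs with zero difference, and computes a two-sided p-value from the binomial distribution with probability 1/2. The before/after values are hypothetical.

```python
from math import comb

def sign_test_p(x, y):
    """Two-sided sign test p-value for paired samples x and y."""
    diffs = [b - a for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    smaller = min(positives, n - positives)
    # Probability of a split at least this uneven under H0 (p = 1/2), one tail.
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical paired scores (not from the slides): 8 increases, 2 decreases.
before = [5, 6, 7, 5, 8, 6, 7, 9, 6, 5]
after = [7, 8, 8, 7, 9, 5, 9, 10, 8, 4]

p_value = sign_test_p(before, after)
print(p_value)
```

Only the direction of each difference is used, not its size, which is why the sign test needs so few assumptions and why it is less powerful than tests that also use the magnitudes.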
Parametric vs non-parametric
• Parametric tests: information about the population is available, or certain assumptions can be made.
  • The population is assumed to be normally distributed.
  • The data are normally distributed.
  • Population variances are the same.
• Non-parametric tests are used when no assumptions are made about the population distribution.
  • Also known as distribution-free tests.
  • Information is still known about the sampling distribution.
Tests of Significance

Design                                     Non-parametric               Parametric
Two groups, paired                         Wilcoxon signed-rank test    Paired t test
Two groups, unpaired                       Mann-Whitney U test          Unpaired t test
More than two groups, repeated measures    Friedman test                Repeated measures ANOVA
More than two groups, independent groups   Kruskal-Wallis test          ANOVA
Parametric and nonparametric tests of significance
Parametric test of significance: used to estimate at least one population parameter from sample statistics.
Assumption: the variable we have measured in the sample is normally distributed in the population to which we plan to generalize our findings.

Nonparametric test: distribution free, with no assumption about the distribution of the variable in the population.
Tests of Significance
• Non-parametric tests of significance: used with small numbers, when a normal distribution cannot be assumed, or when measurement is not interval.
• Chi-square: requires only nominal data; allows the researcher to determine whether frequencies that have been obtained in research differ from those that would have been expected; uses a χ² sampling distribution.
  • Chi-square goodness of fit
  • Chi-square test of independence
Prediction
 Parametric Prediction – using a correlation, if you know score
“x”, you can predict score “y” for one person – Use regression
analysis
 Simple linear regression – allows the prediction from one
variable to another – you must have at least interval level data
 Multiple linear regression – this allows the prediction of one
variable from several other variables. The dependent variable
must be on the interval scale
Prediction
• Non-parametric prediction: measures the extent to which you can reduce the error in predicting the dependent variable as a consequence of having some knowledge of the independent variable, such as predicting income [DV] by education [IV].
  • Kendall's Tau: used with ordinal data and ranking; is better than the Gamma because it takes ties into account.
  • Gamma: used with ordinal data to predict the rank of one variable by knowing the rank on another variable.
  • Lambda: can be used with nominal data; knowledge of the IV allows one to make a better prediction of the DV than if you had no knowledge at all.
Parametric and nonparametric tests of significance

                        Nonparametric tests             Parametric tests
                        Nominal data    Ordinal data    Ordinal, interval, ratio data
One group
Two unrelated groups
Two related groups
K unrelated groups
K related groups
Some concepts related to the statistical
methods.

Multiple comparison

two or more data sets, which should be analyzed

– repeated measurements made on the same individuals

– entirely independent samples


Some concepts related to the statistical methods.

Sample size
number of cases on which data have been obtained

Which of the basic characteristics of a distribution are more sensitive to the sample size?
• central tendency (mean, median, mode): mean
• variability (standard deviation, range, IQR): standard deviation
• skewness, kurtosis
Some concepts related to the statistical
methods.

Degrees of freedom
the number of scores, items, or other units in the data
set, which are free to vary

One- and two tailed tests


one-tailed test of significance used for directional
hypothesis
two-tailed tests in all other situations
Parametric and nonparametric tests of significance

                        Nonparametric tests                           Parametric tests
                        Nominal data                  Ordinal data    Ordinal, interval, ratio data
One group               Chi-square goodness of fit
Two unrelated groups    Chi-square
Two related groups      McNemar's test
K unrelated groups      Chi-square test
K related groups
Selected nonparametric tests
Chi-Square goodness of fit test.

Used to determine whether a variable has a frequency distribution comparable to the one expected:

$$\chi^2 = \sum_{i} \frac{(f_{oi} - f_{ei})^2}{f_{ei}}$$

where f_oi is the observed frequency and f_ei the expected frequency in category i.

The expected frequency can be based on
• theory
• previous experience
• comparison groups
Selected nonparametric tests
Chi-Square goodness of fit test. Example
The average prognosis of total hip replacement in relation to pain reduction in the hip joint is (expected):
excellent - 80%
good - 10%
medium - 5%
bad - 5%
In our study we got a different outcome (observed):
excellent - 95%
good - 2%
medium - 2%
bad - 1%
Do the observed frequencies differ from the expected ones?
Selected nonparametric tests
Chi-Square goodness of fit test. Example

fe1 = 80, fe2 = 10, fe3 = 5, fe4 = 5;
fo1 = 95, fo2 = 2, fo3 = 2, fo4 = 1;

χ² = 14.2, df = 3 (4 − 1)

Critical values of χ² for df = 3:
χ² > 7.815    p < 0.05
χ² > 11.345   p < 0.01
χ² > 16.266   p < 0.001

0.001 < p < 0.01

Null hypothesis is rejected at 5% level
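
The arithmetic behind the statistic in this example can be checked with a short sketch. It treats the percentages as counts out of 100, as the slide does; the function name is chosen for this illustration only.

```python
def chi_square_gof(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Frequencies from the hip replacement example (excellent, good, medium, bad).
expected = [80, 10, 5, 5]
observed = [95, 2, 2, 1]

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1  # number of categories minus one

# Per-category contributions: 2.8125, 6.4, 1.8 and 3.2.
print(round(chi2, 2), df)
```

Most of the statistic comes from the "good" category, where 2 cases were observed against 10 expected.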


Selected nonparametric tests
Chi-Square test.

The chi-square statistic (test) is usually used with an R (row) by C (column) table.

Expected frequencies can be calculated:

$$F_{rc} = \frac{1}{N} f_r f_c$$

where f_r and f_c are the row and column totals and N is the grand total; then

$$\chi^2 = \sum_{i} \sum_{j} \frac{(f_{ij} - F_{ij})^2}{F_{ij}}$$

df = (R − 1)(C − 1)
Selected nonparametric tests
Chi-Square test. Example

Question: are men treated more aggressively for cardiovascular problems than women?

Sample: people who have similar results on initial testing

Response: whether or not a cardiac catheterization was recommended
Independent: sex of the patient
Selected nonparametric tests
Chi-Square test. Example

Result: observed frequencies

Cardiac cath    Male    Female    Row total
No              15      16        31
Yes             45      24        69
Column total    60      40        100
Selected nonparametric tests
Chi-Square test. Example

Result: expected frequencies

Cardiac cath    Male    Female    Row total
No              18.6    12.4      31
Yes             41.4    27.6      69
Column total    60      40        100
Selected nonparametric tests
Chi-Square test. Example

Result:

χ² = 2.52, df = (2 − 1)(2 − 1) = 1

p > 0.05

Null hypothesis is not rejected at 5% level

Conclusion: there is no evidence that the recommendation for cardiac catheterization is related to the sex of the patient
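
The expected frequencies and the statistic for this example can be reproduced with a short sketch. The function name is an illustrative choice; the counts are the observed frequencies from the table above.

```python
def chi_square_independence(table):
    """Return (expected frequencies, chi-square statistic, df) for an R x C table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count for each cell: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df

observed = [
    [15, 16],  # catheterization not recommended: male, female
    [45, 24],  # catheterization recommended: male, female
]

expected, chi2, df = chi_square_independence(observed)
print(expected, round(chi2, 2), df)
```

Every cell differs from its expected value by the same amount (3.6), which is always the case in a 2 × 2 table because the margins are fixed.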
Selected nonparametric tests
Chi-Square test. Underlying assumptions.

Assumption                                                   Implication
Frequency data                                               Cannot be used to analyze differences in scores or their means
Adequate sample size                                         Expected frequencies should not be less than 5
Measures independent of each other                           No subject can be counted more than once
Theoretical basis for the categorization of the variables    Categories should be defined prior to data collection and analysis
Selected nonparametric tests
Fisher's exact test. McNemar test.

• For a 2 × 2 design with a very small sample size, Fisher's exact test should be applied.

• The McNemar test can be used with two dichotomous measures on the same subjects (repeated measurements). It is used to measure change.
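
A minimal sketch of the McNemar statistic follows. It uses the common form without continuity correction, which is an assumption of this example, and the counts are hypothetical. Only the subjects who changed between the two measurements enter the calculation.

```python
def mcnemar_chi2(changed_no_to_yes, changed_yes_to_no):
    """McNemar chi-square (df = 1, no continuity correction) from the two discordant counts."""
    b, c = changed_no_to_yes, changed_yes_to_no
    return (b - c) ** 2 / (b + c)

# Hypothetical paired 2 x 2 table: 15 subjects changed from "no" to "yes",
# 5 changed from "yes" to "no"; unchanged subjects do not affect the statistic.
statistic = mcnemar_chi2(15, 5)
print(statistic)
```

The result is compared with the chi-square distribution with one degree of freedom.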
Parametric and nonparametric tests of significance

                        Nonparametric tests                                                           Parametric tests
                        Nominal data                  Ordinal data
One group               Chi-square goodness of fit    Wilcoxon signed rank test
Two unrelated groups    Chi-square                    Wilcoxon rank sum test, Mann-Whitney test
Two related groups      McNemar's test                Wilcoxon signed rank test
K unrelated groups      Chi-square test               Kruskal-Wallis one way analysis of variance
K related groups                                      Friedman matched samples
Non-parametric Multiple Comparison
• Kruskal-Wallis Test: an alternative to the one-way ANOVA. The scores are ranked and the analysis compares the mean rank in each group. It determines whether there is a difference between groups.
• McNemar Test: an adaptation of the chi-square test that is used with repeated measures at the nominal level.
• Friedman Test: an alternative to the repeated measures ANOVA. Two or more measurements are taken from the same subjects. It answers the question of whether the measurement changes over time.
Selected nonparametric tests
Ordinal data independent groups.

Mann-Whitney U : used to compare two groups

Kruskal-Wallis H: used to compare two or more groups


Non-parametric test
• MANN-WHITNEY TEST

This is a method for the comparison of two independent random samples (x and y). The Mann-Whitney U statistic is computed from the ranks Ri of the observations after the two samples, of sizes n1 and n2, are pooled. U can be interpreted as the number of times observations in one sample precede observations in the other sample in the ranking.
Selected nonparametric tests
Ordinal data independent groups. Mann-Whitney test

Null hypothesis: the two sampled populations are equivalent in location

The observations from both groups are combined and ranked, with the average rank assigned in the case of ties.

If the populations are identical in location, the ranks should be randomly mixed between the two samples
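
The pooling, ranking and counting described for the Mann-Whitney test can be sketched as follows. The rank-sum formula used here is one common textbook form and is an assumption of this example, as are the function names and the data.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1          # first 1-based position of v
        rank_of[v] = first + (ordered.count(v) - 1) / 2
    return [rank_of[v] for v in values]

def mann_whitney_u(x, y):
    """Return (U counting x-before-y pairs, U counting y-before-x pairs)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))  # rank the pooled samples
    rank_sum_x = sum(ranks[:n1])
    u_x_first = n1 * n2 + n1 * (n1 + 1) / 2 - rank_sum_x
    return u_x_first, n1 * n2 - u_x_first

# Hypothetical scores for two independent groups.
group_a = [3, 5, 8, 9]
group_b = [6, 10, 12, 14, 15]

u_a_first, u_b_first = mann_whitney_u(group_a, group_b)
print(u_a_first, u_b_first)
```

Of the 20 possible pairs, a group_a value precedes the group_b value in 18, so the ranks are far from randomly mixed. The smaller of the two values is normally the one looked up in a table of critical values.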
Non-parametric test
• KRUSKAL-WALLIS TEST

• This is a method for comparing several independent random samples and can be used as a non-parametric alternative to the one way ANOVA.
Selected nonparametric tests
Ordinal data independent groups. Kruskal-Wallis test

k-groups comparison, k ≥ 2

Null hypothesis: the k sampled populations are equivalent in location

The observations from all groups are combined and ranked, with the average rank assigned in the case of ties.

If the populations are identical in location, the ranks should be randomly mixed between the k samples
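
A minimal sketch of the Kruskal-Wallis H statistic, built on the pooled ranking described above. It uses the usual formula without the correction for ties (an assumption of this example), and the three groups are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1
        rank_of[v] = first + (ordered.count(v) - 1) / 2
    return [rank_of[v] for v in values]

def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1), with R_j the rank sum of group j."""
    pooled = [v for group in groups for v in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical scores for three independent groups with no overlap at all.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
print(round(h, 2))
```

H is compared with the chi-square distribution with k − 1 degrees of freedom when the groups are not very small.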
Selected nonparametric tests
Ordinal data related groups.

Wilcoxon matched-pairs signed rank test: used to compare two related groups

Friedman matched samples: used to compare two or more related groups
Selected nonparametric tests
Ordinal data 2 related groups Wilcoxon signed rank
test
Two related variables. No assumptions about the shape of
distributions of the variables.
Null hypothesis : Two variables have the same
distribution
Takes into account information about the magnitude of
differences within pairs and gives more weight to pairs
that show large differences than to pairs that show small
differences.
Based on the ranks of the absolute values of the differences
between the two variables.
Non-parametric test
WILCOXON TEST
 This is a method for the comparison of a pair of samples. The
Wilcoxon signed ranks test statistic T+ is the sum of the ranks
of the positive, non-zero differences (Di) between a pair of
samples.
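
The statistic T+ can be computed with a short sketch. The function names and the paired measurements are hypothetical and serve only to illustrate the ranking of absolute differences.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1
        rank_of[v] = first + (ordered.count(v) - 1) / 2
    return [rank_of[v] for v in values]

def wilcoxon_t_plus(first, second):
    """Sum of the ranks of |difference| over the pairs with a positive difference."""
    diffs = [b - a for a, b in zip(first, second) if a != b]  # zero differences dropped
    ranks = average_ranks([abs(d) for d in diffs])
    return sum(rank for d, rank in zip(diffs, ranks) if d > 0)

# Hypothetical paired measurements on four subjects.
before = [10, 12, 9, 15]
after = [12, 11, 9, 19]

# Differences: +2, -1, 0 (dropped), +4; ranks of |d|: 2, 1, 3; T+ = 2 + 3.
t_plus = wilcoxon_t_plus(before, after)
print(t_plus)
```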

Non-parametric tests for association

Correlation
• The Spearman Rank Order Correlation (Rs): "To what extent and how strongly are two variables related?"
• Phi coefficient: can be used with nominal data, although ordinal data are preferable
• Kendall's Q: can be used with nominal data
Non-parametric test
FRIEDMAN'S TEST
This method compares several related samples and can be used as a non-parametric alternative to the two way ANOVA.
The power of this method is low with small samples, but it is the best method for non-parametric two way analysis of variance with sample sizes above five.
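
A minimal sketch of the Friedman statistic follows: each subject's measurements are ranked within that subject, and the rank sums per condition are compared. The formula is the usual one without tie correction (an assumption of this example), and the data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1
        rank_of[v] = first + (ordered.count(v) - 1) / 2
    return [rank_of[v] for v in values]

def friedman_statistic(rows):
    """rows[i][j] is the measurement of subject i under condition j."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):  # rank within each subject
            rank_sums[j] += rank
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

# Hypothetical data: 3 subjects measured under 3 conditions, always increasing.
rows = [[1, 2, 3], [2, 3, 5], [1, 4, 6]]
stat = friedman_statistic(rows)
print(round(stat, 2))
```

With so few subjects the statistic cannot reach conventional significance levels reliably, which reflects the low power with small samples mentioned above.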

Non-parametric test
• Disadvantage

• The disadvantage is that nonparametric tests are less efficient: when the assumptions of the corresponding parametric test hold, the nonparametric test will generally give a higher p-value for the same data set.
Nonparametric Correlations

• The following are three commonly used nonparametric correlation coefficients:
  • Spearman R
  • Kendall tau
  • Gamma coefficient
• Note that the chi-square statistic computed for two-way frequency tables also provides a measure of the relation between the two (tabulated) variables and, unlike the correlation measures listed above, it can be used for variables that are measured on a simple nominal scale.
Nonparametric Correlations

• Spearman R. Spearman R (Siegel & Castellan, 1988) assumes that the variables under consideration were measured on at least an ordinal (rank order) scale, that is, that the individual observations can be ranked into two ordered series. Spearman R can be thought of as the regular Pearson product-moment correlation coefficient, that is, in terms of proportion of variability accounted for, except that Spearman R is computed from ranks.
Non-parametric test

• SPEARMAN'S CORRELATION:
• Spearman's rank correlation provides a distribution-free test of independence between two variables. It is, however, insensitive to some types of dependence.

It is computed from R(x) and R(y), the ranks of a pair of variables (x and y), each containing n observations.
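
As a sketch, Spearman's coefficient can be obtained by ranking each variable and then applying the ordinary Pearson formula to the ranks. The function names and both data sets are hypothetical. The second data set shows the insensitivity mentioned above: y depends completely on x, but not monotonically.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1
        rank_of[v] = first + (ordered.count(v) - 1) / 2
    return [rank_of[v] for v in values]

def spearman_r(x, y):
    """Pearson correlation computed on the ranks R(x) and R(y)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical rankings that mostly disagree.
rho = spearman_r([1, 2, 3, 4, 5], [5, 3, 4, 1, 2])

# y = x squared on a symmetric range: perfectly dependent, yet no monotone trend.
rho_u_shape = spearman_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(round(rho, 3), round(rho_u_shape, 3))
```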
Non-parametric test
• Kendall tau is equivalent to Spearman R with regard to the underlying assumptions. It is also comparable in terms of its statistical power.
• However, Spearman R and Kendall tau are usually not identical in magnitude because their underlying logic as well as their computational formulas are very different.
• Siegel and Castellan (1988) express the relationship of the two measures in terms of the inequality −1 ≤ 3 × Kendall tau − 2 × Spearman R ≤ 1. More importantly, Kendall tau and Spearman R imply different interpretations:
Non-parametric test
 Spearman R can be thought of as the regular Pearson product
moment correlation coefficient, that is, in terms of proportion
of variability accounted for, except that Spearman R is
computed from ranks. Kendall tau, on the other hand,
represents a probability, that is, it is the difference between the
probability that in the observed data the two variables are in
the same order versus the probability that the two variables are
in different orders.
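
The probability interpretation of Kendall tau can be made concrete by counting pairs of observations. This sketch computes the simplest variant (often called tau-a, which makes no adjustment for ties); the choice of variant and the data are assumptions of this example.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Proportion of pairs in the same order minus proportion in different orders."""
    same_order = different_order = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            same_order += 1
        elif product < 0:
            different_order += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (same_order - different_order) / n_pairs

# Hypothetical rankings: 2 of the 10 pairs agree in order, 8 disagree.
tau = kendall_tau_a([1, 2, 3, 4, 5], [5, 3, 4, 1, 2])
print(tau)
```

For these same rankings the Spearman coefficient is −0.8, so the two measures agree in sign but differ in magnitude.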

Non-parametric test
 Gamma. The Gamma statistic (Siegel & Castellan, 1988) is
preferable to Spearman R or Kendall tau when the data contain
many tied observations.
 In terms of the underlying assumptions, Gamma is equivalent
to Spearman R or Kendall tau; in terms of its interpretation and
computation it is more similar to Kendall tau than Spearman R.
 In short, Gamma is also a probability; specifically, it is
computed as the difference between the probability that the
rank ordering of the two variables agree minus the probability
that they disagree, divided by 1 minus the probability of ties.
Thus, Gamma is basically equivalent to Kendall tau, except
that ties are explicitly taken into account.
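
The verbal definition of Gamma translates directly into a pair count. This is an illustrative sketch with hypothetical data that contain several ties; the function name is chosen for the example.

```python
from itertools import combinations

def gamma_statistic(x, y):
    """(P(agree) - P(disagree)) / (1 - P(tie)), with probabilities taken over all pairs."""
    agree = disagree = ties = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            agree += 1
        elif product < 0:
            disagree += 1
        else:
            ties += 1  # tied on x, on y, or on both
    total = agree + disagree + ties
    return (agree / total - disagree / total) / (1 - ties / total)

# Hypothetical ordinal scores: of 10 pairs, 5 agree, 2 disagree and 3 are tied.
x = [1, 1, 2, 3, 4]
y = [2, 1, 1, 3, 2]

gamma = gamma_statistic(x, y)
print(round(gamma, 3))
```

Here Gamma is (0.5 − 0.2) / (1 − 0.3) = 3/7, whereas dividing by all 10 pairs without removing the ties would give 0.3.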

SUMMARY
• In this module we have learned about non-parametric tests. We have discussed their types, purposes, and disadvantages. The next module will discuss the application of statistical software to data analysis. These statistical methods are very important for the nurse researcher.

Thanks

Next Topic>>
Application of
statistical software
for data analysis
