Research Methods: PH.D in Nursing

Ph.D. in Nursing
RESEARCH METHODS
Module No: 8 DATA ANALYSIS
Name of the subtopic

8.2 NON PARAMETRIC TEST
Faculty Name;
Date:
Subject Code: School of Nursing
Learning Objectives
 Meaning of non parametric tests
 Application and purpose
 Models of non parametric tests
 Types of non parametric tests
 parametric and non parametric Significance
 parametric and non parametric prediction
 Some concepts related to the statistical methods
 Selected nonparametric tests
 Disadvantages
 To distinguish parametric and nonparametric tests of significance
 To identify situations in which the use of nonparametric tests is
appropriate
List of contents
 introduction
 Meaning of non parametric tests
 Application and purpose
 Models of non parametric tests
 Types of non parametric tests
 parametric and non parametric Significance
 parametric and non parametric prediction
 Some concepts related to the statistical methods
 Selected nonparametric tests
 Disadvantages
 Summary
 references
Introduction
 Nonparametric statistics are statistics not based on
parameterized families of probability distributions.
 They include both descriptive and inferential statistics. The
typical parameters of a parametric family are the mean, variance, etc. Unlike
parametric statistics, nonparametric statistics make few or no
assumptions about the probability distributions of the variables
being assessed.
 The difference between a parametric model and a non-parametric
model is that the former has a fixed number of parameters,
while the latter grows the number of parameters with the
amount of training data.
Meaning of non-parametric test
Statistical tests which are not based on a normal distribution of data
or on any other distributional assumption. They are also known as
distribution-free tests, and the data are generally ranked or grouped.
 Non-parametric statistics do not assume any underlying
distribution for the data.
 Non-parametric does not mean that the model lacks parameters, but
that the number and nature of the parameters are flexible.
Non-parametric test
The first meaning of non-parametric covers techniques that do
not rely on data belonging to any particular distribution.
The second meaning covers techniques that do not assume that the structure of
a model is fixed. Typically, the model grows in size to
accommodate the complexity of the data. In these techniques,
individual variables are typically assumed to belong to
parametric distributions, and assumptions about the types of
connections among variables are also made.
Non-parametric test: meaning
 The first meaning includes non-parametric statistics (in the sense of a statistic over data,
which is defined to be a function on a sample that has no
dependency on a parameter), whose interpretation does not
depend on the population fitting any parameterized
distribution.
 Order statistics, which are based on the ranks of observations,
are one example of such statistics, and these play a central role
in many non-parametric approaches.
Why Nonparametric Test
 The sample distribution is unknown.
 The population distribution is not normal, for example because too
many variables are involved.
USAGE
 Decision making / forecasting.
 Studying populations that take on a ranked order (such as
movie reviews receiving one to four stars).
 Simple analysis.
Applications and purpose
 Non-parametric methods are widely used for studying populations
that take on a ranked order (such as movie reviews receiving one
to four stars).
 The use of non-parametric methods may be necessary when data
have a ranking but no clear numerical interpretation, such as
when assessing preferences.
 In terms of levels of measurement, non-parametric methods are
suited to "ordinal" data.
 As non-parametric methods make fewer assumptions, their
applicability is much wider than the corresponding parametric
methods. In particular, they may be applied in situations where less
is known about the application in question. Also, due to the
reliance on fewer assumptions, non-parametric methods are more
robust.
Applications and purpose
 Another justification for the use of non-parametric methods is
simplicity. In certain cases, even when the use of parametric
methods is justified, non-parametric methods may be easier to
use. Due both to this simplicity and to their greater robustness,
non-parametric methods are seen by some statisticians as
leaving less room for improper use and misunderstanding.
 The wider applicability and increased robustness of
non-parametric tests come at a cost: in cases where a parametric
test would be appropriate, non-parametric tests have less
power.
 In other words, a larger sample size can be required to draw
conclusions with the same degree of confidence.
Non-parametric models
 Non-parametric models differ from parametric models in that
the model structure is not specified a priori but is instead
determined from data.
 The term non-parametric is not meant to imply that such
models completely lack parameters but that the number and
nature of the parameters are flexible and not fixed in advance.
 A histogram is a simple nonparametric estimate of a
probability distribution.
 Kernel density estimation provides better estimates of the
density than histograms.
Non-parametric models
 Non-parametric regression and semi-parametric regression
methods have been developed based on kernels and splines.
 Data envelopment analysis provides efficiency coefficients
similar to those obtained by multivariate analysis, without any
distributional assumption.
 KNNs (k-nearest neighbours) classify an unseen instance based on the K points in the
training set which are nearest to it.
 A support vector machine (with a Gaussian kernel) is a
nonparametric large-margin classifier.
Types of Non-parametric test
1. One sample test
• Chi-square test
• One sample sign test
2. Two samples test
• Median test
• Two samples sign test
3. K-samples test
• Median test
• Kruskal-Wallis test
4. Chi-square test (χ²):
 Used to compare observed and expected data.
1. Test of goodness of fit
2. Test of independence
3. Test of homogeneity
5. Kruskal-Wallis test:
 For testing whether samples originate from the same
distribution.
 Used for comparing more than two samples that are
independent, or not related.
 Alternative to ANOVA.
6. Wilcoxon signed-rank test:
 Used when comparing two related samples or repeated
measurements on a single sample to assess whether their
population mean ranks differ.
7. Median test:
 Used to test the null hypothesis that the medians of the
populations from which two samples are drawn are identical.
 The data in each sample are assigned to two groups, one consisting
of data whose values are higher than the median value in the
two groups combined, and the other consisting of data whose
values are at the median or below.
8. Sign test:
 Can be used to test the hypothesis that there is "no difference
in medians" between the continuous distributions of two
random variables X and Y.
9. Fisher's exact test:
 Test used in the analysis of contingency tables where sample sizes
are small.
Parametric v Non-parametric
 Parametric tests => have info about population, or can make
certain assumptions
 Assume normal distribution of population.
 Data is distributed normally.
 population variances are the same.
 Non-parametric tests are used when there are no assumptions
made about population distribution
 Also known as distribution free tests.
 But info is known about sampling distribution.
Tests of Significance
                        Non-parametric           Parametric
Two groups
  Paired                Wilcoxon signed rank     Paired t test
  Unpaired              Mann-Whitney U           Unpaired t test
More than two groups
  Repeated measures     Friedman test            Repeated measures ANOVA
  Independent groups    Kruskal-Wallis           ANOVA
Parametric and nonparametric tests of significance
Parametric test of significance - to estimate at least one population
parameter from sample statistics.
Assumption: the variable we have measured in the sample is
normally distributed in the population to which we plan to
generalize our findings.
Nonparametric test - distribution free, no assumption about the
distribution of the variable in the population.
Tests of Significance
 Non-parametric tests of significance – used with small numbers, when a
normal distribution cannot be assumed, or when measurement is not interval
 Chi-square – requires only nominal data – allows the researcher
to determine whether frequencies that have been obtained in
research differ from those that would have been expected –
uses a χ² sampling distribution
 Chi-square goodness of fit
 Chi-square test of independence
Prediction
 Parametric Prediction – using a correlation, if you know score
“x”, you can predict score “y” for one person – Use regression
analysis
 Simple linear regression – allows the prediction from one
variable to another – you must have at least interval level data
 Multiple linear regression – this allows the prediction of one
variable from several other variables. The dependent variable
must be on the interval scale
Prediction
Non-parametric Prediction – measures the extent to which you
can reduce the error in predicting the dependent variable as a
consequence of having some knowledge of the independent
variable, such as predicting income [DV] by education [IV]
Kendall’s Tau – used with ordinal data and ranking - is
better than the Gamma because it takes ties into account
Gamma - used with ordinal data to predict the rank of one
variable by knowing rank on another variable
Lambda – can be used with nominal data – knowledge of the
IV allows one to make a better prediction of the DV than if
you had no knowledge at all
Parametric and nonparametric tests of significance
Tests are classified by level of data (nonparametric tests: nominal data, ordinal data;
parametric tests: ordinal, interval, ratio data) and by design (one group, two unrelated
groups, two related groups, K unrelated groups, K related groups).
Some concepts related to the statistical methods.
Multiple comparison
two or more data sets, which should be analyzed
– repeated measurements made on the same individuals
– entirely independent samples
Some concepts related to the statistical methods.
Sample size
number of cases on which data have been obtained
Which of the basic characteristics of a distribution are more
sensitive to the sample size?
– central tendency (mean, median, mode): mean
– variability (standard deviation, range, IQR): standard deviation
– skewness, kurtosis
Some concepts related to the statistical methods.
Degrees of freedom
the number of scores, items, or other units in the data
set which are free to vary
One- and two-tailed tests
– one-tailed test of significance: used for a directional
hypothesis
– two-tailed tests: in all other situations
Parametric and nonparametric tests of significance
                        Nonparametric tests                              Parametric tests
                        Nominal data                    Ordinal data     Ordinal, interval, ratio data
One group               Chi-square goodness of fit
Two unrelated groups    Chi-square
Two related groups      McNemar's test
K unrelated groups      Chi-square test
K related groups
Selected nonparametric tests
Chi-Square goodness of fit test.
to determine whether a variable has a frequency distribution
comparable to the one expected

$$\chi^2 = \sum_i \frac{(f_{oi} - f_{ei})^2}{f_{ei}}$$

where f_{oi} is the observed frequency and f_{ei} the expected frequency in category i.
The expected frequency can be based on
– theory
– previous experience
– comparison groups
Chi-Square goodness of fit test. Example
The average prognosis of total hip replacement in relation
to pain reduction in the hip joint is (expected):
excellent - 80%
good - 10%
medium - 5%
bad - 5%
In our study we got a different outcome (observed):
excellent - 95%
good - 2%
medium - 2%
bad - 1%
Do the observed frequencies differ from the expected ones?
Chi-Square goodness of fit test. Example
fe1 = 80, fe2 = 10, fe3 = 5, fe4 = 5;
fo1 = 95, fo2 = 2, fo3 = 2, fo4 = 1;
χ² = 14.2, df = 3 (4 - 1)
Critical values of χ² for df = 3:
χ² > 7.815    p < 0.05
χ² > 11.345   p < 0.01
χ² > 16.266   p < 0.001
Here 0.001 < p < 0.01
Null hypothesis is rejected at the 5% level
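The arithmetic of this example can be checked with a short Python sketch. It treats the percentages as counts in a sample of 100 (an assumption; the slide does not state the sample size) and compares the statistic with the standard critical values of the chi-square distribution for df = 3.

```python
# Chi-square goodness-of-fit statistic for the hip replacement example.
# Assumption: the percentages are treated as counts out of 100 patients.
expected = [80, 10, 5, 5]   # excellent, good, medium, bad
observed = [95, 2, 2, 1]

# Sum of (observed - expected)^2 / expected over the categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Standard upper critical values of chi-square for df = 3.
critical = {0.05: 7.815, 0.01: 11.345, 0.001: 16.266}
rejected_at = [alpha for alpha, value in critical.items() if chi2 > value]

print(f"chi2 = {chi2:.2f}, df = {df}")
print("null hypothesis rejected at levels:", rejected_at)
```

With these numbers the statistic exceeds the 0.05 and 0.01 critical values but not the 0.001 value.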

Chi-Square test.
The chi-square statistic (test) is usually used with an R (row)
by C (column) table.
Expected frequencies can be calculated from the row total f_r, the column total f_c
and the overall total N:

$$F_{rc} = \frac{f_r f_c}{N}$$

then

$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - F_{ij})^2}{F_{ij}}$$

df = (R - 1)(C - 1)
Chi-Square test. Example
Question: are men treated more aggressively for
cardiovascular problems than women?
Sample: people who have similar results on initial testing
Response: whether or not a cardiac catheterization was
recommended
Independent: sex of the patient
Result: observed frequencies
                Sex
Cardiac cath    Male    Female    Row total
No              15      16        31
Yes             45      24        69
Column total    60      40        100
Result: expected frequencies
                Sex
Cardiac cath    Male    Female    Row total
No              18.6    12.4      31
Yes             41.4    27.6      69
Column total    60      40        100
Result:
χ² = 2.52, df = 1 [(2 - 1)(2 - 1)]
p > 0.05
Null hypothesis is not rejected at the 5% level
Conclusion: there is no evidence that the recommendation for cardiac catheterization
is related to the sex of the patient
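The expected frequencies and the statistic for the catheterization example can be reproduced with a minimal Python sketch. It uses only the observed counts from the slide; the variable names are illustrative.

```python
# Chi-square test of independence for the 2 x 2 catheterization table.
observed = [
    [15, 16],   # cath not recommended: male, female
    [45, 24],   # cath recommended:     male, female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count in each cell = row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(row_totals))
    for j in range(len(col_totals))
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print([[round(e, 1) for e in row] for row in expected])
print(f"chi2 = {chi2:.2f}, df = {df}")
```

The statistic is below 3.841, the 5% critical value for df = 1, so the null hypothesis of independence is not rejected.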
Chi-Square test. Underlying assumptions.
 Frequency data: cannot be used to analyze differences in scores or
their means
 Adequate sample size: expected frequencies should
not be less than 5
 Measures independent of each other: no subject can be counted
more than once
 Theoretical basis for the categorization of the variables: categories should be defined
prior to data collection and analysis
Fisher’s exact test. McNemar test.
 For a 2 x 2 table with a very small sample size, Fisher's
exact test should be applied
 McNemar test can be used with two dichotomous
measures on the same subjects (repeated
measurements). It is used to measure change
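As a rough illustration of how the McNemar test measures change, the sketch below computes the usual statistic (b - c)² / (b + c) from the two discordant cells of a paired 2 x 2 table. The counts are hypothetical, and no continuity correction is applied.

```python
# McNemar statistic for two dichotomous measures on the same subjects.
# Hypothetical counts: only subjects who changed category enter the statistic.
b = 15   # "no" at first measurement, "yes" at second
c = 5    # "yes" at first measurement, "no" at second

chi2 = (b - c) ** 2 / (b + c)   # referred to chi-square with df = 1
significant_at_5_percent = chi2 > 3.841

print(f"chi2 = {chi2:.2f}, significant at 5%: {significant_at_5_percent}")
```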
Parametric and nonparametric tests of significance
                        Nonparametric tests                                                              Parametric tests
                        Nominal data                  Ordinal data
One group               Chi-square goodness of fit    Wilcoxon signed rank test
Two unrelated groups    Chi-square                    Wilcoxon rank sum test, Mann-Whitney test
Two related groups      McNemar's test                Wilcoxon signed rank test
K unrelated groups      Chi-square test               Kruskal-Wallis one-way analysis of variance
K related groups                                      Friedman matched samples
Non-parametric Multiple Comparison
 Kruskal-Wallis Test – an alternative to the one-way ANOVA.
The scores are ranked and the analyses compare the mean rank
in each group. It determines if there is a difference between
groups.
 McNemar Test – an adaptation of the Chi-square that is used
with repeated measures at the nominal level.
 Friedman Test – an alternative to the repeated measures ANOVA. Two
or more measurements are taken from the same subjects. It
answers the question of whether the measurement changes
over time.
Ordinal data independent groups.
Mann-Whitney U : used to compare two groups
Kruskal-Wallis H: used to compare two or more groups

Non-parametric test
 MANN-WHITNEY TEST
This is a method for the comparison of two independent
random samples (x and y). The Mann-Whitney U statistic is
defined from the ranks Ri of the observations after the samples of
size n1 and n2 are pooled. U can be interpreted as the number of times observations in
one sample precede observations in the other sample in the
ranking.
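The ranking idea can be sketched in Python as follows. The data are hypothetical, and the sketch uses the common rank-sum form U = R - n(n + 1)/2, where R is the rank sum of one sample; the slide's own formula is not reproduced here, so treat this as one standard way of computing U rather than the slide's exact expression.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical measurements for two independent samples.
x = [1.1, 2.3, 3.0]
y = [2.0, 4.5, 5.2, 6.1]

ranks = average_ranks(x + y)           # pool the samples, then rank
n1, n2 = len(x), len(y)
rank_sum_x = sum(ranks[:n1])           # ranks that belong to sample x

u_x = rank_sum_x - n1 * (n1 + 1) / 2   # pairs in which an x value exceeds a y value
u_y = n1 * n2 - u_x                    # pairs in which a y value exceeds an x value
u = min(u_x, u_y)

print(u_x, u_y, u)
```

Here only two of the twelve (x, y) pairs have the x value larger, so the ranks are far from randomly mixed.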
Ordinal data independent groups. Mann-Whitney test
Null hypothesis: Two sampled populations are
equivalent in location
The observations from both groups are combined and
ranked, with the average rank assigned in the case of
ties.
If the populations are identical in location, the ranks
should be randomly mixed between the two samples
Non-parametric test
 KRUSKAL-WALLIS TEST
 This is a method for comparing several independent random
samples and can be used as a non-parametric alternative to the
one way ANOVA.
Ordinal data independent groups. Kruskal-Wallis test
k-groups comparison, k ≥ 2
Null hypothesis: k sampled populations are
equivalent in location
The observations from all groups are combined and
ranked, with the average rank assigned in the case of
ties.
If the populations are identical in location, the ranks
should be randomly mixed between the k samples
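The pooled-ranking procedure described above can be sketched in Python. The data are hypothetical, and the sketch uses the standard form H = 12 / (N(N + 1)) · Σ R_j² / n_j − 3(N + 1) without a correction for ties, which is an assumption since the slide's formula is not shown here.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical scores for k = 3 independent groups.
groups = [
    [6.2, 7.1, 5.8],
    [8.4, 9.0, 7.9],
    [10.1, 9.7, 11.3],
]

pooled = [value for group in groups for value in group]
ranks = average_ranks(pooled)
n_total = len(pooled)

# Sum of (rank sum)^2 / group size over the groups.
weighted = 0.0
start = 0
for group in groups:
    rank_sum = sum(ranks[start:start + len(group)])
    weighted += rank_sum ** 2 / len(group)
    start += len(group)

h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
df = len(groups) - 1   # H is referred to chi-square with k - 1 degrees of freedom

print(f"H = {h:.2f}, df = {df}")
```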
Non-parametric test
KRUSKAL-WALLIS TEST
Ordinal data related groups.
Wilcoxon matched-pairs signed rank test:
used to compare two related groups
Friedman matched samples:
used to compare two or more related groups
Ordinal data, two related groups. Wilcoxon signed rank test
Two related variables. No assumptions about the shape of
distributions of the variables.
Null hypothesis : Two variables have the same
distribution
Takes into account information about the magnitude of
differences within pairs and gives more weight to pairs
that show large differences than to pairs that show small
differences.
Based on the ranks of the absolute values of the differences
between the two variables.
Non-parametric test
WILCOXON TEST
 This is a method for the comparison of a pair of samples. The
Wilcoxon signed ranks test statistic T+ is the sum of the ranks
of the positive, non-zero differences (Di) between a pair of
samples.
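A minimal Python sketch of the T+ computation follows. The paired values are hypothetical; zero differences are dropped before ranking, and tied absolute differences would receive average ranks.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical paired measurements on the same five subjects.
before = [10, 12, 9, 14, 11]
after = [12, 11, 13, 14, 16]

# Differences Di, with zero differences removed.
diffs = [a - b for a, b in zip(after, before) if a != b]

# Rank the absolute differences, then sum the ranks by sign.
ranks = average_ranks([abs(d) for d in diffs])
t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

print(diffs, ranks, t_plus, t_minus)
```

Larger differences receive larger ranks, so they contribute more to T+ than small differences do.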
Non-parametric tests for association
Correlation
The Spearman Rank Order Correlation (Rs) – “To what
extent and how strongly are two variables related?”
Phi coefficient – can be used with nominal data, although
ordinal data are preferable
Yule’s Q – can be used with nominal data
Non-parametric test
FRIEDMAN’S TEST This method compares several related
samples and can be used as a non-parametric alternative to the
two way ANOVA.
The power of this method is low with small samples but it is the
best method for non-parametric two way analysis of variance
with sample sizes above five.
Non-parametric test
 Disadvantage
 The disadvantage is that nonparametric tests are not as
efficient: when the assumptions of the parametric test hold, the
nonparametric test has less power and, for a given data set, will tend to give a
higher p-value.
Nonparametric Correlations
 The following are three types of commonly used
nonparametric correlation coefficients:
 Spearman R
 Kendall tau
 Gamma coefficient
 Note that the chi-square statistic computed for two-way
frequency tables also provides a careful measure of a
relation between the two (tabulated) variables, and unlike
the correlation measures listed above, it can be used for
variables that are measured on a simple nominal scale.
Nonparametric Correlations
 Spearman R. Spearman R (Siegel & Castellan, 1988) assumes
that the variables under consideration were measured on at
least an ordinal (rank order) scale, that is, that the individual
observations can be ranked into two ordered series. Spearman
R can be thought of as the regular Pearson product-moment
correlation coefficient, that is, in terms of proportion of
variability accounted for, except that Spearman R is computed
from ranks.
Non-parametric test
 SPEARMAN'S CORRELATION:
 Spearman's rank correlation provides a distribution-free test of
independence between two variables. It is, however, insensitive
to some types of dependence.
The statistic is computed from R(x) and R(y), the ranks of a pair of variables (x and
y) each containing n observations.
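One common way to obtain the coefficient from R(x) and R(y) is to apply the Pearson correlation to the ranks. The sketch below does this for hypothetical data; it is an illustration under that assumption, not necessarily the exact formula shown on the slide.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical paired observations.
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]

rho = pearson(average_ranks(x), average_ranks(y))
print(f"Spearman R = {rho:.2f}")
```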
Non-parametric test
 Kendall tau is equivalent to Spearman R with regard to the
underlying assumptions. It is also comparable in terms of its
statistical power.
 However, Spearman R and Kendall tau are usually not identical
in magnitude because their underlying logic as well as their
computational formulas are very different.
 Siegel and Castellan (1988) express the relationship of the two
measures in terms of the inequality:
-1 ≤ 3 × Kendall tau - 2 × Spearman R ≤ 1
More importantly, Kendall
tau and Spearman R imply different interpretations:
Non-parametric test
 Spearman R can be thought of as the regular Pearson product
moment correlation coefficient, that is, in terms of proportion
of variability accounted for, except that Spearman R is
computed from ranks. Kendall tau, on the other hand,
represents a probability, that is, it is the difference between the
probability that in the observed data the two variables are in
the same order versus the probability that the two variables are
in different orders.
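The probability interpretation of Kendall tau can be made concrete by counting pairs. The sketch below uses hypothetical data without ties and computes the simplest variant (often called tau-a); the names are illustrative.

```python
from itertools import combinations

# Hypothetical paired observations without ties.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]

concordant = 0   # pairs ordered the same way on both variables
discordant = 0   # pairs ordered in opposite ways
for i, j in combinations(range(len(x)), 2):
    product = (x[i] - x[j]) * (y[i] - y[j])
    if product > 0:
        concordant += 1
    elif product < 0:
        discordant += 1

n_pairs = len(x) * (len(x) - 1) // 2
p_same_order = concordant / n_pairs
p_different_order = discordant / n_pairs
tau = p_same_order - p_different_order

print(concordant, discordant, f"tau = {tau:.2f}")
```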
Non-parametric test
 Gamma. The Gamma statistic (Siegel & Castellan, 1988) is
preferable to Spearman R or Kendall tau when the data contain
many tied observations.
 In terms of the underlying assumptions, Gamma is equivalent
to Spearman R or Kendall tau; in terms of its interpretation and
computation it is more similar to Kendall tau than Spearman R.
 In short, Gamma is also a probability; specifically, it is
computed as the difference between the probability that the
rank ordering of the two variables agree minus the probability
that they disagree, divided by 1 minus the probability of ties.
Thus, Gamma is basically equivalent to Kendall tau, except
that ties are explicitly taken into account.
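The definition of Gamma given above can be followed step by step in a small Python sketch. The data are hypothetical and contain ties on purpose; a pair counts as tied when it is tied on either variable.

```python
from itertools import combinations

# Hypothetical ordinal data with many tied observations.
x = [1, 1, 2, 2, 3]
y = [1, 2, 2, 3, 2]

agree = disagree = tied = 0
for i, j in combinations(range(len(x)), 2):
    product = (x[i] - x[j]) * (y[i] - y[j])
    if product > 0:
        agree += 1       # same rank ordering on both variables
    elif product < 0:
        disagree += 1    # opposite rank ordering
    else:
        tied += 1        # tied on x, on y, or on both

n_pairs = agree + disagree + tied
p_agree = agree / n_pairs
p_disagree = disagree / n_pairs
p_tied = tied / n_pairs

# Difference of probabilities, divided by 1 minus the probability of ties.
gamma = (p_agree - p_disagree) / (1 - p_tied)

print(agree, disagree, tied, f"gamma = {gamma:.2f}")
```

Dividing by 1 minus the probability of ties is the same as restricting the comparison to untied pairs, that is, (agree - disagree) / (agree + disagree).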
SUMMARY
 In this module we have learned about non-parametric tests.
We have discussed the various types, purposes and disadvantages of
non-parametric tests. The next module will discuss the
application of statistical software to data analysis. These
statistical methods are very important for the nurse researcher.
References
 Carol Leslie Macnee (2008). Understanding Nursing
Research: Using Research in Evidence-based Practice.
Lippincott Williams & Wilkins. ISBN 0781775582,
9780781775588.
 Denise Polit et al. (2013). Nursing research: principles and
methods, revised edition. Philadelphia: Lippincott.
 http://www.socialresearchmethods.net/kb/statinf.php
 http://onlinestatbook.com/2/introduction/inferential.html
 http://www.stat.purdue.edu/~wsharaba/stat511/chapter1_print.pdf
 http://fbm.uni-ruse.bg/d/mra/Introduction%20to%20statistical%20methods.pdf
 Murphy, Kevin (2012). Machine Learning: A Probabilistic
Perspective. MIT. p. 16. ISBN 978-0262018029.
 Stuart A., Ord J.K., Arnold S. (1999). Kendall's
Advanced Theory of Statistics: Volume 2A: Classical
Inference and the Linear Model, sixth edition, §20.2–20.3
(Arnold).
 Bagdonavicius, V., Kruopis, J., Nikulin, M.S. (2011).
Non-parametric tests for complete data. ISTE & WILEY: London
& Hoboken. ISBN 978-1-84821-269-5.
 Corder, G. W.; Foreman, D. I. (2014). Nonparametric
Statistics: A Step-by-Step Approach. Wiley. ISBN
978-1118840313.
 Gibbons, Jean Dickinson; Chakraborti, Subhabrata
(2003). Nonparametric Statistical Inference, 4th Ed. CRC
Press. ISBN 0-8247-4052-1.
 Hettmansperger, T. P.; McKean, J. W. (1998). Robust
nonparametric statistical methods. Kendall's Library of
Statistics 5 (First ed.). London: Edward Arnold. New York:
John Wiley & Sons. ISBN 0-340-54937-8. MR 1604954. Also
ISBN 0-471-19479-4.
 Hollander M., Wolfe D.A., Chicken E. (2014). Nonparametric
Statistical Methods. John Wiley & Sons.
 Wasserman, Larry (2007). All of Nonparametric Statistics.
Springer. ISBN 0-387-25145-6.
Thanks
Next Topic>>
Application of
statistical software
for data analysis