

A comparison of parametric and non-parametric statistical tests

Article in BMJ (online) · April 2015


DOI: 10.1136/bmj.h2053


1 author:

Philip Sedgwick
St George's, University of London



BMJ 2015;350:h2053 doi: 10.1136/bmj.h2053 (Published 17 April 2015) Page 1 of 2

ENDGAMES

STATISTICAL QUESTION

A comparison of parametric and non-parametric statistical tests
Philip Sedgwick reader in medical statistics and medical education
Institute for Medical and Biomedical Education, St George’s, University of London, London, UK

Researchers investigated the effectiveness of corticosteroids in reducing respiratory disorders in infants born at 34-36 weeks’ gestation. A randomised placebo controlled trial was performed. The intervention was treatment with betamethasone, 12 mg intramuscularly daily for two consecutive days at 34-36 weeks of pregnancy. Participants were 320 women at 34-36 weeks of pregnancy who were at risk of imminent premature delivery. Women were randomised to the intervention (n=163) or placebo (n=157).[1]

The primary outcome was the incidence of neonatal respiratory disorders, including respiratory distress syndrome and transient tachypnoea. Secondary outcomes included perinatal measurements in the infant, including birth weight, plus Apgar score at five minutes. Statistical hypothesis testing used a two sided alternative and a critical level of significance of 0.05 (5%). Distributional assumptions were verified before statistical testing. The rate of respiratory distress syndrome was higher in the intervention group than in the control group (two (1.4%) v one (0.8%)), although the difference was not significant (P=0.54), as was the rate of transient tachypnoea (34 (24%) v 29 (22%); P=0.77). Babies born to mothers in the intervention group had a higher mean birth weight, although the difference was not significant (2640 (standard deviation 445) v 2627 (452) g; Student’s t test P=0.80). There was no difference between the intervention and control groups in Apgar scores at five minutes (median 9 (interquartile range 9-10) v 9 (9-10); Mann-Whitney U test P=0.77).

The researchers concluded that antenatal treatment with corticosteroids at 34-36 weeks of pregnancy does not reduce the incidence of respiratory disorders in newborn infants.

Which of the following statements, if any, are true?
a) The use of the Student’s t test assumed that birth weight was normally distributed in the population for each treatment group
b) The use of the Mann-Whitney U test assumed that the variance of Apgar scores at five minutes was equal between treatment groups in the population
c) Non-parametric tests can be used to analyse data measured on a continuous or ordinal scale
d) The Student’s t test had greater statistical power than the Mann-Whitney U test to detect a difference in mean birth weight between groups

Answers

Statements a, c, and d are true, whereas b is false.

The aim of the study was to test the effectiveness of corticosteroids in reducing respiratory disorders in infants born at 34-36 weeks’ gestation. A randomised placebo controlled trial was performed. Traditional statistical hypothesis testing was used to establish whether differences existed between treatment groups in the perinatal measurements, therefore confounding the association between treatment and the primary outcome.[2] Two broad categories of statistical methods were used: parametric and non-parametric tests. When using parametric tests it is necessary to make assumptions about the distribution of the data, whereas no such assumptions need to be made when using non-parametric methods. Non-parametric methods are sometimes referred to as distribution-free methods or methods of rank order. When comparing two independent groups, as in the study above, the parametric test that is usually used is the Student’s t test, and the non-parametric tests that can be used are the Mann-Whitney U test or Wilcoxon rank sum test.

The treatment groups were compared in mean birth weight using the Student’s t test, also known as the independent samples t test. Described in a previous question,[3] the Student’s t test compares two independent groups with regard to the mean of a variable measured on a continuous scale. The test is a parametric method; such methods make the assumption that the variable being analysed has a particular distribution in the population, typically a Normal distribution. The Normal distribution, described in a previous question,[4] is a theoretical distribution described by its mean and standard deviation. In particular, the application of the Student’s t test assumed that the distribution of birth weight was normal in the population

p.sedgwick@sgul.ac.uk

For personal use only: See rights and reprints http://www.bmj.com/permissions Subscribe: http://www.bmj.com/subscribe
for each treatment group (a is true). A further assumption was that the variance in birth weight was equal in the treatment groups in the population.

The distributional assumption of normality of birth weight in the population was verified using the sample measurements of birth weight. A formal statistical test could have been used but it is not always accurate. Therefore, it is generally recommended that the assumption is confirmed by inspection of the histogram for the sample measurements of each treatment group. The distribution of a variable can be assumed to be normal if the distributions are not extremely skewed. Nonetheless, the independent samples t test is remarkably robust with regard to departures in normality, particularly if the numbers of participants in the two groups are similar. If the distribution of the sample data is skewed then a transformation (for example, a logarithmic transformation) may make the data suitable for analysis using parametric methods.[5] The assessment of equal variance in birth weight between the treatment groups in the population would have been verified by comparison of the sample standard deviations. A statistical test (for example, Levene’s test, which is provided routinely by statistical software) would have been used. However, as a rule of thumb the variances will be considered equal if the ratio of the larger to the smaller standard deviation is no more than two. However, if the variances are not equal then the software will typically make an adjustment in the application of the test. Although it was not essential for the two treatment groups to have equal variance in birth weight, and therefore not essentially an assumption of the data, it was important that equality of variances was investigated before statistical testing. If the assumptions cannot be verified then non-parametric methods, as described below, should be used.

The Apgar score at five minutes was measured on an ordinal scale; therefore, the distributional assumption of normality could not be made and the Student’s t test could not be used. The Mann-Whitney U test, the non-parametric equivalent of the Student’s t test, was used instead. Non-parametric methods make no assumptions about the distribution of data or equality of variances between groups in the population (b is false). Although non-parametric methods make no assumptions about the distribution of data, the data may have a particular distribution; however, it is typically not of interest in itself. Non-parametric methods still use traditional statistical hypothesis testing; the Mann-Whitney U test involved a null hypothesis that stated that the distribution of the Apgar scores at five minutes was similar for each of the treatment groups in the population. The Mann-Whitney U test was based on ranking the sample values of the Apgar scores regardless of treatment group. Under the null hypothesis, if the distribution of Apgar scores was similar for each treatment group in the population then the average rank of the scores would be expected to be equal for the treatment groups in the sample. The Wilcoxon rank sum test is sometimes used instead of the Mann-Whitney U test. It is also a non-parametric test and the two tests give the same P value, so the same conclusion would be made with respect to statistical hypothesis testing.

As described above, when using parametric methods it must be assumed that the variable being compared between groups is normally distributed in the population. Therefore, parametric methods should be used only to analyse data that are measured on a continuous scale. If the assumptions cannot be verified then non-parametric methods should be applied. Variables measured on an ordinal scale, such as an anxiety rating scale, should be analysed using non-parametric methods only because the distributional assumption of normality cannot be made. Hence, non-parametric tests can be used to analyse data measured on a continuous or ordinal scale (c is true). Sometimes variables measured on an ordinal scale with a potentially large spread in values are analysed using parametric methods, but such testing would probably be meaningless. Hypothesis testing would be based on the sample estimate for the population parameter of the mean difference in the outcome measure. However, ordinal scales consist of categories with natural ordering and measurements are therefore inherently discrete rather than continuous; differences between points on the scale have little, if any, inherent meaning or value.

Non-parametric methods can be used in all circumstances, regardless of whether the distributional assumption of normality in the population can be made for the variable being analysed. However, it would not be sensible to do this. If all the assumptions underlying the parametric test are satisfied, then parametric methods are preferable to non-parametric ones because they will have greater statistical power to detect a difference between treatment groups in an outcome if it exists in the population (d is true). Statistical power has been described in a previous question.[6] Furthermore, non-parametric methods are limited because they are primarily tests that result in dichotomous decisions of significance; unlike parametric methods they typically do not permit statements about the population based on the sample estimate and confidence interval for the population parameter.

The independent samples t test and Mann-Whitney test are used to compare two independent groups as described. When there are three or more independent groups the parametric and non-parametric methods used are analysis of variance and the Kruskal-Wallis test, respectively.[7, 8] The parametric and non-parametric methods used to analyse two related groups (for example, an outcome measured before and after an intervention designed to help weight loss in patients with osteoarthritis of the knee) have been described in previous questions.[9, 10]

Competing interests: None declared.

1 Porto AMF, Coutinho IC, Correia JB, et al. Effectiveness of antenatal corticosteroids in reducing respiratory disorders in late preterm infants: randomised clinical trial. BMJ 2011;342:d1696.
2 Sedgwick P. Understanding statistical hypothesis testing. BMJ 2014;348:g3557.
3 Sedgwick P. Independent samples t test. BMJ 2010;340:c2673.
4 Sedgwick P. The Normal distribution. BMJ 2010;341:c6085.
5 Sedgwick P. Log transformation of data. BMJ 2012;345:e6727.
6 Sedgwick P. The importance of statistical power. BMJ 2013;347:f6282.
7 Sedgwick P. One way analysis of variance. BMJ 2012;344:e2427.
8 Sedgwick P. Non-parametric statistical tests for independent groups: numerical data. BMJ 2012;344:e3354.
9 Sedgwick P. Parametric statistical tests for two related groups: numerical data. BMJ 2014;348:g124.
10 Sedgwick P. Non-parametric statistical tests for two related groups: numerical data. BMJ 2012;344:e2537.

Cite this as: BMJ 2015;350:h2053
© BMJ Publishing Group Ltd 2015
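
The two comparisons discussed in this article can be approximated in Python. What follows is a minimal sketch, not the trial's actual analysis. It assumes NumPy and SciPy are available. For the birth weight comparison it takes the group sizes to be the numbers randomised (163 and 157), which the article does not confirm. Individual birth weights and Apgar scores are not reported, so the data used for Levene's test and the Mann-Whitney U test are simulated and purely illustrative.

```python
import numpy as np
from scipy import stats

# Summary statistics for birth weight (g) as reported in the article.
# ASSUMPTION: the analysed group sizes equal the numbers randomised.
mean_int, sd_int, n_int = 2640.0, 445.0, 163  # intervention
mean_pla, sd_pla, n_pla = 2627.0, 452.0, 157  # placebo

# Rule of thumb from the text: treat variances as equal if the ratio of the
# larger to the smaller standard deviation is no more than two.
sd_ratio = max(sd_int, sd_pla) / min(sd_int, sd_pla)
equal_var = sd_ratio <= 2

# Independent samples (Student's) t test from summary statistics.
# With equal_var=False SciPy applies Welch's adjustment instead.
t_stat, t_p = stats.ttest_ind_from_stats(
    mean_int, sd_int, n_int, mean_pla, sd_pla, n_pla, equal_var=equal_var
)

# SIMULATED data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
bw_int = rng.normal(mean_int, sd_int, n_int)
bw_pla = rng.normal(mean_pla, sd_pla, n_pla)

# Levene's test for equality of variances needs individual measurements.
lev_stat, lev_p = stats.levene(bw_int, bw_pla)

# Ordinal Apgar scores drawn from the same made-up distribution in both
# groups, then compared with the rank-based Mann-Whitney U test.
scores = np.array([8, 9, 10])
probs = [0.2, 0.5, 0.3]
apgar_int = rng.choice(scores, size=n_int, p=probs)
apgar_pla = rng.choice(scores, size=n_pla, p=probs)
u_stat, u_p = stats.mannwhitneyu(apgar_int, apgar_pla, alternative="two-sided")

print(f"SD ratio = {sd_ratio:.3f}, variances treated as equal: {equal_var}")
print(f"t test: t = {t_stat:.3f}, P = {t_p:.2f}")
print(f"Levene (simulated data): P = {lev_p:.2f}")
print(f"Mann-Whitney U (simulated data): U = {u_stat:.0f}, P = {u_p:.2f}")
```

Under the stated group size assumption the t test P value should come out close to the reported 0.80. The Levene and Mann-Whitney results reflect only the simulated values and say nothing about the trial data.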
