
This article was downloaded by: [University of South Florida]

On: 25 March 2015, At: 07:12


Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Statistical Computation and Simulation
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/gscs20

Monte Carlo comparison of seven normality tests
Hadi Alizadeh Noughabi a & Naser Reza Arghami a
a Department of Statistics, Ferdowsi University of Mashhad, Mashhad, Iran
Published online: 08 Dec 2010.

To cite this article: Hadi Alizadeh Noughabi & Naser Reza Arghami (2011) Monte Carlo comparison
of seven normality tests, Journal of Statistical Computation and Simulation, 81:8, 965-972, DOI:
10.1080/00949650903580047

To link to this article: http://dx.doi.org/10.1080/00949650903580047


Taylor & Francis makes every effort to ensure the accuracy of all the information (the
“Content”) contained in the publications on our platform. However, Taylor & Francis,
our agents, and our licensors make no representations or warranties whatsoever as to
the accuracy, completeness, or suitability for any purpose of the Content. Any opinions
and views expressed in this publication are the opinions and views of the authors,
and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content
should not be relied upon and should be independently verified with primary sources
of information. Taylor and Francis shall not be liable for any losses, actions, claims,
proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or
howsoever caused arising directly or indirectly in connection with, in relation to or arising
out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any
substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing,
systematic supply, or distribution in any form to anyone is expressly forbidden. Terms &
Conditions of access and use can be found at http://www.tandfonline.com/page/terms-
and-conditions
Journal of Statistical Computation and Simulation
Vol. 81, No. 8, August 2011, 965–972

Monte Carlo comparison of seven normality tests


Hadi Alizadeh Noughabi* and Naser Reza Arghami
Department of Statistics, Ferdowsi University of Mashhad, Mashhad, Iran

(Received 20 October 2009; final version received 23 December 2009)

This article studies seven different tests of normality. The tests in question are Kolmogorov–Smirnov,
Anderson–Darling, Kuiper, Jarque–Bera, Cramer von Mises, Shapiro–Wilk, and Vasicek. Each test is
described and power comparisons are made by using Monte Carlo computations under various alternatives.
The results are discussed and interpreted separately.

Keywords: test of normality; Monte Carlo simulation; power of test

1. Introduction

To make a statistical inference, several assumptions about the data must be fulfilled. Most statistical methods assume an underlying distribution in the derivation of their results. However, when we assume that our data follow a specific distribution, we take a serious risk. If our assumption is wrong, then the results obtained may be invalid. For example, the confidence levels of the confidence intervals or error probabilities of the tests of hypotheses implemented may be completely off. The consequences of misspecifying the distribution may prove very costly. One way to deal with this problem is to check the distribution assumptions carefully.
The goodness-of-fit tests have been discussed by many authors including D’Agostino and
Stephens [1], Huber-Carol et al. [2], Li and Papadopoulos [3], Thode [4], Zhang and Cheng [5],
Steele and Chaseling [6], Jager and Wellner [7], Raschke [8], Zhao et al. [9], etc.
The normality assumption is indispensable in many statistical procedures, some of which may be
quite sensitive to any departure from normality. Therefore, testing normality is one of the most
studied goodness-of-fit problems.
Many normality tests have been developed by different authors. Since the invention of the
chi-squared goodness-of-fit test for normality by Pearson in 1900, considerable attention has
been given to the problem of testing normality and a fair number of tests can be found in
the literature. Cramer von Mises and Kolmogorov–Smirnov proposed their tests in 1931 and
1933, respectively. After almost two decades, Anderson and Darling suggested their test. Afterwards, Kuiper's and Shapiro and Wilk's tests of normality were introduced. A new normality test
‘Vasicek’s test of normality’ was suggested in 1976. Jarque and Bera proposed their test in 1987.

*Corresponding author. Email: alizadehhadi@ymail.com

ISSN 0094-9655 print/ISSN 1563-5163 online


© 2011 Taylor & Francis
DOI: 10.1080/00949650903580047
http://www.informaworld.com

Detailed discussions on these tests may be found in D’Agostino and Stephens [1], Mardia [10],
and references therein.
Comparison of the normality tests has received attention in the literature. Stephens [11], by
Monte Carlo simulation, presents comparisons of some normality tests (Kolmogorov–Smirnov,
Cramer von Mises, Kuiper, Watson, Anderson–Darling, and Shapiro–Wilk). Vasicek's [12] test
of normality based on entropy is not, of course, included in Stephens’s study, but Vasicek [12]
compared his test with the other tests under some alternatives [exponential, gamma(2), uniform,
beta(2,1), and Cauchy] and showed that for some alternatives his test is most powerful. Bera and Ng
[13] present a graphical alternative to the Q − Q plot for detecting departures from normality using
the score function. They concluded that the estimated score function is informative in performing
exploratory data analysis. Dufour et al. [14] compared different normality tests (Kolmogorov–
Smirnov, Cramer von Mises, Jarque–Bera, Anderson–Darling, Shapiro–Wilk, and D'Agostino)
for residuals of the linear regression model. Their study did not include Vasicek's [12] test.
Esteban et al. [15] proposed three new tests of normality based on the three improved or modified
versions of the Vasicek entropy estimator. They computed critical values of the corresponding test
statistics for sample size 5 ≤ n ≤ 50 by using Monte Carlo experiments. They concluded that the
power of tests depends on alternatives; therefore, they divided the alternatives into four groups,
depending on the support and shape of their densities.
Goria et al. [16] and Choi [17] improved the normality tests based on entropy and compared the
proposed tests with other entropy-based tests, under some alternatives. Farrell and Rogers-Stewart
[18] compared some normality and symmetry tests. Also, Yazici and Yolacan [19] compared the power of normality tests for populations from several distributions (beta, gamma, log-normal, Weibull, and t) and different sample sizes (n = 20, 30, 40, and 50). Meintanis [20] compared
classical normality tests with tests based on the characteristic function. Meintanis did not include the tests of normality based on entropy in his comparisons.
In all of the above studies (with the exception of Esteban et al. [15], who compared only four
entropy-based normality tests, using different entropy estimates, under classified alternatives),
alternatives were not classified and authors considered only some alternatives. In this paper, we
consider seven normality tests and compare them with each other, by Monte Carlo simulations,
under classified alternatives. Our choice of the seven tests is based on popularity (e.g. Kolmogorov–Smirnov, Anderson–Darling) and power (e.g. Shapiro–Wilk, Vasicek).
It turns out (Section 3) that no single test procedure is uniformly more powerful than others.
That is to say that some tests are more powerful than other tests for some alternatives and some
are better for other alternatives. We have classified normality tests based on the alternatives under
which some tests are uniformly most powerful.
In this paper, the methodologies of the tests mentioned earlier are given in Section 2. All the
tests are compared with each other by Monte Carlo simulation in Section 3. The last section
includes some conclusions.

2. Tests of normality

Let X1, . . . , Xn be i.i.d. (independent identically distributed) observations from a continuous distribution F, which is otherwise unknown. We wish to test the hypothesis H0: F is normal against the general alternatives H1: F is not normal. In this section, we shall compare seven tests for the above problem in terms of their powers.
The normality tests in question are Cramer von Mises [21], Kolmogorov [22], Anderson and
Darling [23], Kuiper [24], Shapiro and Wilk [25], Vasicek [12], and Jarque and Bera [26].
The description of each normality test is given in Table 1.

Table 1. Tests of normality.

Throughout, Z_i = Φ((X_(i) − X̄)/S_X), where Φ is the cdf of the standard normal distribution and X_(1) ≤ · · · ≤ X_(n) are the order statistics.

Cramer von Mises:
  CH = 1/(12n) + Σ_{i=1}^{n} [(2i − 1)/(2n) − Z_i]²

Kolmogorov–Smirnov:
  D⁺ = max_{1≤i≤n} (i/n − Z_i),  D⁻ = max_{1≤i≤n} (Z_i − (i − 1)/n),  D = max(D⁺, D⁻)

Kuiper:
  V = D⁺ + D⁻, with D⁺ and D⁻ as in the Kolmogorov–Smirnov test

Shapiro–Wilk:
  W = [Σ_{i=1}^{[n/2]} a_{n−i+1} (X_(n−i+1) − X_(i))]² / Σ_{i=1}^{n} (X_(i) − X̄)²,
  where the coefficients a_i are tabulated in Pearson and Hartley [27]

Anderson–Darling:
  A² = −n − (1/n) Σ_{i=1}^{n} (2i − 1){ln(Z_i) + ln(1 − Z_{n−i+1})}

Vasicek:
  KL_mn = exp(H(m, n))/S_X, where H(m, n) = (1/n) Σ_{i=1}^{n} log[(n/(2m)) (X_(i+m) − X_(i−m))],
  S_X² is the sample variance and m is a positive integer with m ≤ n/2

Jarque–Bera:
  JB = n[c²/6 + (k − 3)²/24], where c is the sample skewness and k the sample kurtosis

Among the tests in Table 1, the Vasicek, Shapiro–Wilk, and Jarque–Bera tests are specific in the sense that the null hypothesis must be normality, while the rest are suitable for any null family of distributions. Also, although the Vasicek, Shapiro–Wilk, and Jarque–Bera tests are exact, the other four tests are approximate in the sense that the actual size of the test is only approximately equal to the nominal size. For further study of these tests, see the references.

3. Simulation study

In this section, power comparisons of tests of normality are made by using Monte Carlo com-
putations. Table 2 gives Type I error probabilities (the actual size of the tests), which (with the
exception of the Vasicek, Shapiro–Wilk, and Jarque–Bera tests) have been obtained by 20,000
simulations.
We compute the powers of the tests based on CH, D, V , W , A2 , KLmn , and JB statistics by
means of Monte Carlo simulations under 20 alternatives. These alternatives were used by Esteban
et al. [15] in their study of power comparisons of several tests for normality. The alternatives can
be divided into four groups, depending on the support and shape of their densities. From the point
of view of applied statistics, natural alternatives to normal distribution are in Groups I and II. For
the sake of completeness, we also consider Groups III and IV. This gives additional insight into the behaviour of the tests.

Table 2. The actual Type I error probabilities of tests of normality (n = 20, nominal α = 0.05, σ = standard deviation).

Test of normality σ = 0.01 σ = 0.25 σ =1 σ =5 σ = 10 σ = 100

Cramer von Mises 0.0510 0.0506 0.050 0.0492 0.0485 0.0509


Kolmogorov–Smirnov 0.0507 0.0492 0.050 0.0505 0.0491 0.0511
Kuiper 0.0517 0.0509 0.050 0.0493 0.0508 0.0487
Shapiro–Wilk 0.050 0.050 0.050 0.050 0.050 0.050
Anderson–Darling 0.0491 0.0505 0.050 0.0506 0.0490 0.0508
Vasicek 0.050 0.050 0.050 0.050 0.050 0.050
Jarque–Bera 0.050 0.050 0.050 0.050 0.050 0.050

Group I: Support (−∞, ∞), symmetric:
• Student t with 1 degree of freedom (i.e. the standard Cauchy);
• Student t with 3 degrees of freedom;
• Standard logistic;
• Standard double exponential.
Group II: Support (−∞, ∞), asymmetric:
• Gumbel with parameters α = 0 (location) and β = 1 (scale);
• Gumbel with parameters α = 0 (location) and β = 2 (scale);
• Gumbel with parameters α = 0 (location) and β = 1/2 (scale).
Group III: Support (0, ∞):
• Exponential with mean 1;
• Gamma with parameters β = 1 (scale) and α = 2 (shape);
• Gamma with parameters β = 1 (scale) and α = 1/2 (shape);
• Lognormal with parameters μ = 0 (scale) and σ = 1 (shape);
• Lognormal with parameters μ = 0 (scale) and σ = 2 (shape);

Table 3. Power comparisons of 0.05 tests based on CH, D, V , W , A2 , KLmn and JB statistics for sample sizes
n = 10,20,30 and 50 under alternatives from Group I.

n Alternatives CH D V W A2 KLmn JB

10 t(1) 0.618 0.580 0.589 0.594 0.618 0.434 0.590


20 t(1) 0.880 0.847 0.865 0.869 0.882 0.745 0.855
30 t(1) 0.965 0.947 0.958 0.960 0.967 0.909 0.953
50 t(1) 0.9974 0.994 0.9967 0.9966 0.9977 0.9882 0.9947
10 t(3) 0.182 0.164 0.163 0.187 0.190 0.098 0.212
20 t(3) 0.309 0.260 0.277 0.340 0.327 0.158 0.374
30 t(3) 0.410 0.345 0.377 0.460 0.436 0.245 0.507
50 t(3) 0.570 0.484 0.538 0.632 0.599 0.370 0.681
10 Logistic 0.080 0.073 0.071 0.082 0.083 0.051 0.094
20 Logistic 0.106 0.087 0.090 0.123 0.113 0.052 0.140
30 Logistic 0.110 0.094 0.099 0.144 0.123 0.056 0.179
50 Logistic 0.138 0.112 0.121 0.192 0.157 0.061 0.250
10 Double exponential 0.158 0.142 0.142 0.150 0.159 0.069 0.175
20 Double exponential 0.270 0.224 0.242 0.264 0.274 0.093 0.294
30 Double exponential 0.365 0.294 0.333 0.360 0.374 0.158 0.390
50 Double exponential 0.537 0.436 0.503 0.523 0.543 0.268 0.541
10 Average 0.2595 0.2398 0.2413 0.2533 0.2625 0.163 0.2678
20 Average 0.3913 0.3545 0.3685 0.399 0.399 0.262 0.4158
30 Average 0.4625 0.4200 0.4418 0.481 0.475 0.342 0.5073
50 Average 0.5606 0.5065 0.5397 0.5859 0.5742 0.4218 0.6167

Table 4. Power comparisons of 0.05 tests based on CH, D, V , W , A2 , KLmn and JB statistics for sample sizes
n = 10, 20, 30 and 50 under alternatives from Group II.

n Alternatives CH D V W A2 KLmn JB

10 Gumbel(0,1) 0.137 0.121 0.117 0.153 0.147 0.106 0.160


20 Gumbel(0,1) 0.249 0.203 0.194 0.313 0.273 0.201 0.303
30 Gumbel(0,1) 0.360 0.290 0.272 0.469 0.402 0.302 0.430
50 Gumbel(0,1) 0.545 0.440 0.418 0.686 0.596 0.482 0.634
10 Gumbel(0,2) 0.136 0.121 0.117 0.150 0.144 0.104 0.159
20 Gumbel(0,2) 0.252 0.203 0.195 0.315 0.276 0.202 0.296
30 Gumbel(0,2) 0.359 0.289 0.269 0.464 0.399 0.300 0.431
50 Gumbel(0,2) 0.542 0.437 0.417 0.690 0.597 0.481 0.636
10 Gumbel(0,1/2) 0.139 0.118 0.117 0.154 0.147 0.104 0.157
20 Gumbel(0,1/2) 0.249 0.203 0.194 0.314 0.275 0.200 0.300
30 Gumbel(0,1/2) 0.360 0.288 0.269 0.464 0.400 0.300 0.426
50 Gumbel(0,1/2) 0.544 0.440 0.417 0.690 0.598 0.488 0.635


10 Average 0.1373 0.120 0.117 0.1523 0.146 0.1047 0.1587
20 Average 0.250 0.203 0.1943 0.3140 0.2747 0.2010 0.2997
30 Average 0.3597 0.289 0.270 0.4657 0.4003 0.3007 0.4290
50 Average 0.5437 0.439 0.4173 0.6887 0.5970 0.4837 0.6350

Table 5. Power comparisons of 0.05 tests based on CH, D, V , W , A2 , KLmn and JB statistics for sample sizes
n = 10, 20, 30 and 50 under alternatives from Group III.

n Alternatives CH D V W A2 KLmn JB

10 Exponential 0.390 0.301 0.360 0.442 0.416 0.424 0.371


20 Exponential 0.724 0.586 0.696 0.836 0.773 0.845 0.679
30 Exponential 0.896 0.784 0.883 0.968 0.934 0.967 0.860
50 Exponential 0.9958 0.963 0.9907 0.9995 0.9971 0.9996 0.9853
10 Gamma(2) 0.210 0.175 0.180 0.239 0.225 0.189 0.225
20 Gamma(2) 0.425 0.326 0.353 0.532 0.467 0.457 0.430
30 Gamma(2) 0.600 0.472 0.516 0.749 0.660 0.662 0.617
50 Gamma(2) 0.836 0.701 0.767 0.949 0.890 0.908 0.855
10 Gamma(1/2) 0.672 0.540 0.662 0.735 0.703 0.791 0.598
20 Gamma(1/2) 0.952 0.879 0.957 0.984 0.970 0.993 0.891
30 Gamma(1/2) 0.9954 0.9823 0.9964 0.9997 0.9983 0.9999 0.9807
50 Gamma(1/2) 1.000 0.9998 1.000 1.000 1.000 1.000 0.9997
10 Lognormal(0,1) 0.554 0.463 0.524 0.603 0.578 0.562 0.549
20 Lognormal(0,1) 0.881 0.778 0.857 0.932 0.904 0.919 0.856
30 Lognormal(0,1) 0.975 0.935 0.967 0.991 0.984 0.988 0.964
50 Lognormal(0,1) 0.9989 0.9951 0.9984 0.9999 0.9997 0.9998 0.9989
10 Lognormal(0,2) 0.896 0.826 0.892 0.920 0.909 0.939 0.848
20 Lognormal(0,2) 0.9978 0.991 0.9976 0.9996 0.9987 0.9998 0.991
30 Lognormal(0,2) 1.000 0.9999 1.000 1.000 1.000 1.000 0.9999
50 Lognormal(0,2) 1.000 1.000 1.000 1.000 1.000 1.000 1.000
10 Lognormal(0,1/2) 0.220 0.182 0.187 0.245 0.233 0.177 0.241
20 Lognormal(0,1/2) 0.427 0.337 0.346 0.517 0.463 0.395 0.476
30 Lognormal(0,1/2) 0.607 0.492 0.505 0.726 0.656 0.588 0.643
50 Lognormal(0,1/2) 0.829 0.715 0.740 0.924 0.871 0.832 0.865
10 Weibull(1/2) 0.855 0.758 0.854 0.894 0.875 0.929 0.784
20 Weibull(1/2) 0.9957 0.9818 0.9962 0.9992 0.9979 0.9998 0.979
30 Weibull(1/2) 0.9998 0.9992 0.9999 1.000 1.000 1.000 0.9987
50 Weibull(1/2) 1.000 1.000 1.000 1.000 1.000 1.000 1.000
10 Weibull(2) 0.079 0.074 0.068 0.084 0.083 0.084 0.079
20 Weibull(2) 0.120 0.103 0.095 0.156 0.132 0.129 0.133
30 Weibull(2) 0.159 0.132 0.117 0.232 0.184 0.186 0.176
50 Weibull(2) 0.259 0.210 0.178 0.416 0.307 0.327 0.285
10 Average 0.4845 0.4149 0.4659 0.5203 0.5028 0.5119 0.4619
20 Average 0.6903 0.6227 0.6622 0.7445 0.7132 0.7172 0.6794
30 Average 0.7790 0.7246 0.7480 0.8332 0.8020 0.7989 0.7799
50 Average 0.8648 0.8230 0.8343 0.9111 0.8831 0.8833 0.8736

• Lognormal with parameters μ = 0 (scale) and σ = 1/2 (shape);


• Weibull with parameters β = 1 (scale) and α = 1/2 (shape);
• Weibull with parameters β = 1 (scale) and α = 2 (shape).

Group IV: Support (0,1):

• Uniform;
• Beta(2,2);
• Beta(0.5,0.5);
• Beta(3,1.5);
• Beta(2,1).

Tables 2–6 were obtained by simulation in the following manner. Under each alternative, we generated 20,000 samples of size 10, 20, 30, and 50. We evaluated
for each sample the statistics (CH, D, V , W , A2 , KLmn , JB) and the power of the corresponding
test was estimated by the frequency of the event ‘the statistic is in the critical region’. Although
the required critical values are given in the corresponding articles, we also obtained them by
simulation, before power simulations. The power estimates are given in Tables 3–6. For each
sample size and alternative, the bold type in these tables indicates the statistics achieving the
maximal power. For Vasicek’s test, the window sizes are m = 2, (3), 4 for sample sizes n = 10,
(20,30), 50, respectively.
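The power-estimation procedure just described can be sketched in code. The Python fragment below is our own illustration, not the authors' implementation: the function names and the smaller replication count are our choices, and we use the Jarque–Bera statistic as the example. As in the article, the critical value is first calibrated by simulation under the null (JB is location-scale invariant, so N(0, 1) suffices).

```python
import numpy as np

def jarque_bera(x):
    """Jarque-Bera statistic as defined in Table 1 of the article."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    m2, m3, m4 = (d**2).mean(), (d**3).mean(), (d**4).mean()
    c, k = m3 / m2**1.5, m4 / m2**2          # sample skewness, kurtosis
    return n * (c**2 / 6 + (k - 3)**2 / 24)

def power_by_simulation(stat, sampler, n=20, alpha=0.05, reps=5000, seed=0):
    """Estimate power: calibrate the upper-alpha critical value under
    N(0, 1) by simulation, then count rejections under the alternative.
    reps is far below the article's 20,000, for speed."""
    rng = np.random.default_rng(seed)
    null_stats = np.sort([stat(rng.standard_normal(n)) for _ in range(reps)])
    crit = null_stats[int(np.ceil((1 - alpha) * reps)) - 1]
    rejections = sum(stat(sampler(rng, n)) > crit for _ in range(reps))
    return rejections / reps

# Power of JB against the standard Cauchy (t with 1 df) at n = 20; cf. Table 3.
power_t1 = power_by_simulation(jarque_bera, lambda rng, n: rng.standard_t(1, n))
```

The same driver works for any of the other statistics once they are coded, which is essentially how the power columns of Tables 3–6 are built up.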
From Tables 3–6, we can see that the tests compared considerably differ in power. By average
powers, we can select the tests that are, on average, most powerful against the alternatives from
the given groups (Table 7).
In Group I, it is seen that the tests JB and A2 have the most power and the test KLmn has the least power. The differences in power between the KLmn test and the other tests are substantial.

Table 6. Power comparisons of 0.05 tests based on CH, D, V , W , A2 , KLmn and JB statistics for sample sizes
n = 10,20,30 and 50 under alternatives from Group IV.

n Alternatives CH D V W A2 KLmn JB

10 Uniform 0.074 0.066 0.081 0.082 0.080 0.172 0.028


20 Uniform 0.144 0.100 0.150 0.200 0.171 0.420 0.006
30 Uniform 0.230 0.145 0.230 0.381 0.299 0.653 0.010
50 Uniform 0.433 0.264 0.418 0.749 0.568 0.921 0.132
10 Beta(2,2) 0.044 0.046 0.048 0.042 0.046 0.079 0.022
20 Beta(2,2) 0.058 0.053 0.064 0.053 0.058 0.132 0.006
30 Beta(2,2) 0.072 0.060 0.080 0.080 0.080 0.187 0.003
50 Beta(2,2) 0.107 0.083 0.121 0.153 0.127 0.318 0.005
10 Beta(1/2,1/2) 0.229 0.162 0.237 0.299 0.268 0.515 0.096
20 Beta(1/2,1/2) 0.509 0.318 0.490 0.727 0.618 0.911 0.058
30 Beta(1/2,1/2) 0.738 0.507 0.707 0.944 0.862 0.992 0.230
50 Beta(1/2,1/2) 0.955 0.802 0.932 0.9993 0.991 1.000 0.848
10 Beta(3,1/2) 0.542 0.418 0.530 0.609 0.576 0.684 0.445
20 Beta(3,1/2) 0.875 0.746 0.879 0.948 0.913 0.977 0.745
30 Beta(3,1/2) 0.979 0.934 0.981 0.997 0.991 0.999 0.922
50 Beta(3,1/2) 0.9997 0.9987 0.9999 1.000 1.000 1.000 0.9974
10 Beta(2,1) 0.115 0.100 0.109 0.130 0.126 0.179 0.083
20 Beta(2,1) 0.232 0.174 0.202 0.306 0.261 0.431 0.096
30 Beta(2,1) 0.359 0.268 0.315 0.515 0.428 0.651 0.136
50 Beta(2,1) 0.606 0.448 0.549 0.838 0.711 0.918 0.310
10 Average 0.2008 0.1584 0.2010 0.2324 0.2192 0.3258 0.1348
20 Average 0.3636 0.2782 0.3570 0.4468 0.4042 0.5742 0.1822
30 Average 0.4756 0.3828 0.4626 0.5834 0.5320 0.6964 0.2602
50 Average 0.6201 0.5191 0.6040 0.7479 0.6794 0.8314 0.4585

Table 7. The best tests in terms of power in the different groups.

Group (alternatives):   I     II    III        IV
Best test(s):           JB    W     W, KLmn    KLmn

In Group II, the test W has the most power and the test V has the least power. For n = 10, the test JB has the most power and the difference in power between W and JB is small.
In Group III, the tests W and KLmn have the most power and the test D has the least power. The differences in power between these two tests and the other tests are substantial.
In Group IV, the test KLmn has the most power and the test JB has the least power. The differences in power between the KLmn test and the other tests are substantial.

4. Conclusions

In this paper, we first described seven tests for normality, namely Kolmogorov–Smirnov,
Anderson–Darling, Kuiper, Jarque–Bera, Cramer von Mises, Shapiro–Wilk, and Vasicek.
The paper also compares the power of these seven tests using Monte Carlo computations for sample sizes n = 10, 20, 30, and 50. Differences in power among the seven tests are considerable, and each of the tests JB, A2, W, and KLmn can be most powerful in the group of tests {CH, D, V, W, A2, KLmn, JB}, depending on the type of alternative. The test KLmn is most
powerful against alternatives with the support (0, 1) (Group IV). The tests JB and A2 are most
powerful against symmetric alternatives with the support (−∞, ∞) (Group I).
The tests W and KLmn are most powerful against alternatives in Group III with the support
(0, ∞). The test W is most powerful against asymmetric alternatives in Group II with the support
(−∞, ∞).
Based on these observations, we can formulate the following recommendations for the
application of the studied tests in practice:
• Use the statistic JB if the assumed alternatives are symmetric and supported on (−∞, ∞).
• Use the statistic KLmn, based on sample entropy, if the assumed alternatives are supported on the bounded interval (0, 1).
• Use the statistic KLmn or W if the assumed alternatives are supported on (0, ∞).
• Use the statistic W if the assumed alternatives are asymmetric and supported on (−∞, ∞).
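The four recommendations amount to a decision rule on the assumed support and symmetry of the alternatives. A minimal sketch (our own convenience wrapper; the function name and string labels are hypothetical, the mapping itself is the authors', Section 4):

```python
def recommended_statistic(support, symmetric=True):
    """Look up the article's recommended test statistic for a given
    alternative class. Labels "(-inf, inf)", "(0, inf)", "(0, 1)" are
    our own convention for the three support classes."""
    if support == "(-inf, inf)":
        return "JB" if symmetric else "W"
    if support == "(0, inf)":
        return "KLmn or W"
    if support == "(0, 1)":
        return "KLmn"
    raise ValueError("unrecognised support class")
```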

Acknowledgements
The authors express their appreciation to an anonymous referee and the Associate Editor whose comments improved this
manuscript. Partial support from Ordered and Spatial Data Center of Excellence of Ferdowsi University of Mashhad is
acknowledged.

References

[1] R.B. D’Agostino and M.A. Stephens, Goodness-of-Fit Techniques, Marcel Dekker, Inc, New York, 1986.
[2] C. Huber-Carol, N. Balakrishnan, M.S. Nikulin, and M. Mesbah, Goodness-of-Fit Tests and Model Validity,
Birkhäuser, Boston, Basel, Berlin, 2002.
[3] G. Li and A. Papadopoulos, A note on goodness of fit test using moments, Statistica 62(1) (2002), pp. 72–86.
[4] H. Thode Jr., Testing for Normality, Marcel Dekker, New York, 2002.
[5] C. Zhang and B. Cheng, Binning methodology for nonparametric goodness-of-fit test, J. Stat. Comput. Simul. 73
(2003), pp. 71–82.

[6] M. Steele and J. Chaseling, Powers of discrete goodness-of-fit test statistics for a uniform null against a selection of
alternative distributions, Commun. Stat. Simul. Comput. 35 (2006), pp. 1067–1075.
[7] L. Jager and J.A. Wellner, Goodness-of-fit tests via phi-divergences, Ann. Stat. 35(5) (2007), pp. 2018–2053.
[8] M.F. Raschke, The biased transformation and its application in goodness-of-fit tests for the beta and gamma
distribution, Commun. Stat. Simul. Comput. 38 (2009), pp. 1870–1890.
[9] J. Zhao, X. Xu, and X. Ding, Some new goodness-of-fit tests based on stochastic sample quantiles, Commun. Stat.
Simul. Comput. 38 (2009), pp. 571–589.
[10] K.V. Mardia, Tests of univariate and multivariate normality, in Handbook of Statistics 4, P.R. Krishnaiah, ed.,
Amsterdam, North-Holland, 1980.
[11] M.A. Stephens, EDF statistics for goodness of fit and some comparisons, J. Am. Stat. Assoc. 69 (1974), pp. 730–737.
[12] O. Vasicek, A test for normality based on sample entropy, J. R. Stat. Soc. B 38 (1976), pp. 54–59.
[13] A.K. Bera and P.T. Ng, Tests for normality using estimated score function, J. Stat. Comput. Simul. 52(3) (1995),
pp. 273–287.
[14] J.M. Dufour, A. Farhat, L. Gardiol, and L. Khalaf, Simulation-based finite sample normality tests in linear regressions,
Econ. J. 1 (1998), pp. 154–173.
[15] M.D. Esteban, M.E. Castellanos, D. Morales, and I. Vajda, Monte Carlo comparison of four normality tests using
different entropy estimates, Commun. Stat. Simul. Comput. 30 (2001), pp. 761–785.
[16] M.N. Goria, N.N. Leonenko, V.V. Mergel, and P.L. Novi Inverardi, A new class of random vector entropy estimators
and its applications in testing statistical hypotheses, Nonparametric Stat. 17(3) (2005), pp. 277–297.
[17] B. Choi, Improvement of goodness of fit test for normal distribution based on entropy and power comparison, J. Stat.
Comput. Simul. 78(9) (2008), pp. 781–788.
[18] P.J. Farrell and K. Rogers-Stewart, Comprehensive study of tests for normality and symmetry: Extending the
Spiegelhalter test, J. Stat. Comput. Simul. 76(9) (2006), pp. 803–816.
[19] B. Yazici and S. Yolacan, A comparison of various tests of normality, J. Stat. Comput. Simul. 77(2) (2007),
pp. 175–183.
[20] S.G. Meintanis, Goodness-of-fit testing by transforming to normality: Comparison between classical and charac-
teristic function-based methods, J. Stat. Comput. Simul. 79(2) (2009), pp. 205–212.
[21] R. von Mises, Wahrscheinlichkeitsrechnung und ihre Anwendung in der Statistik und theoretischen Physik, Deuticke,
Leipzig and Vienna, 1931.
[22] A.N. Kolmogorov, Sulla determinazione empirica di una legge di distribuzione, Giornale dell'Istituto
Italiano degli Attuari 4 (1933), pp. 83–91.
[23] T.W. Anderson and D.A. Darling, A test of goodness of fit, J. Am. Stat. Assoc. 49 (1954), pp. 765–769.
[24] N.H. Kuiper, Tests concerning random points on a circle, Proc. K. Ned. Akad. Wet. A 63 (1962), pp. 38–47.
[25] S.S. Shapiro and M.B. Wilk, An analysis of variance test for normality, Biometrika 52 (1965), pp. 591–611.
[26] C.M. Jarque and A.K. Bera, A test for normality of observations and regression residuals, Int. Stat. Rev. 55 (1987),
pp. 163–172.
[27] E.S. Pearson and H.O. Hartley, Biometrika Tables for Statisticians, Cambridge University Press, London, 1972.
