You are on page 1of 7

This article was downloaded by: [Michigan State University

]
On: 05 January 2015, At: 05:41
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954
Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH,
UK

Journal of the American
Statistical Association
Publication details, including instructions for authors
and subscription information:
http://www.tandfonline.com/loi/uasa20

A Test of Goodness of Fit
a a
T. W. Anderson & D. A. Darling
a
Columbia University and University of Michigan
Published online: 11 Apr 2012.

To cite this article: T. W. Anderson & D. A. Darling (1954) A Test of Goodness of Fit,
Journal of the American Statistical Association, 49:268, 765-769

To link to this article: http://dx.doi.org/10.1080/01621459.1954.10501232

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the
information (the “Content”) contained in the publications on our platform.
However, Taylor & Francis, our agents, and our licensors make no
representations or warranties whatsoever as to the accuracy, completeness,
or suitability for any purpose of the Content. Any opinions and views
expressed in this publication are the opinions and views of the authors, and
are not the views of or endorsed by Taylor & Francis. The accuracy of the
Content should not be relied upon and should be independently verified with
primary sources of information. Taylor and Francis shall not be liable for any
losses, actions, claims, proceedings, demands, costs, expenses, damages,
and other liabilities whatsoever or howsoever caused arising directly or
indirectly in connection with, in relation to or arising out of the use of the
Content.

This article may be used for research, teaching, and private study purposes.
Any substantial or systematic reproduction, redistribution, reselling, loan,
sub-licensing, systematic supply, or distribution in any form to anyone is

expressly forbidden.tandfonline.com/page/terms-and-conditions Downloaded by [Michigan State University] at 05:41 05 January 2015 . Terms & Conditions of access and use can be found at http://www.

n . A. 765 . R-345-20-1. and let u. U. This procedure may be used if one wishes to reject the hypothesis whenever the true distribution differs materially from the hypothetical and especially when it differs in the tails.1)[log Uj + log (1 . Then compute 1 n (2) Wn2 = . the population may be specified by the hypothesis to be normal with mean 1 and variance t. Project No. Air Force. The test. THE PROCEDURE HE problem of statistical inference considered here is to test the T hypothesis that a sample has been drawn from a population with a specified continuous cumulative distribution function F (z). 1. S.). ANDERSON AND D. An il- lustration is given. Contract AF18(600)-442. L (2j . DARLING· Columbia University and University of Michigan Some (large sample) significance points are tabulated for a distribution-free test of goodness of fit which was introduced earlier by the authors. A TEST OF GOODNESS OF FIT T. For example. Un-Hl)] n j=l where the logarithms are the natural logarithms. The test procedure we have proposed earlier [1] is the following: Let Xl ~ X2 ~ • • • ~ X n be the n observations in the sample in order.. The asymptotic significance points are given below: * Work sponsored by Office of Scientific Research. Significancepoints for W n 2 are not available for small sample sizes. using a numerical example used previously Downloaded by [Michigan State University] at 05:41 05 January 2015 by Birnbaum in illustrating the Kolmogorov test. which uses the actual ob- servations without grouping. the corresponding cumulative distribution func- tion is (1) In practice the procedure really tests the hypothesis that the sample has been drawn from a population with a completely specified density function. The authors wish to acknowledge the assistance of Vernon Johns in the computations. W. is sensitive to discrepancies at the tails of the distribution rather than near the median. the hypothesis is to be rejected. since the cumulative distribution function is simply the integral of the density. If this number is too large.=F(x.

05 2.01 3. 1-U7\-i+l.4 which with 7 degrees of freedom is not significant at the 10 per cent level. A NUMERICAL ILLUSTRATION Birnbaum [2] has considered a sample of 40 observations and applied Downloaded by [Michigan State University] at 05:41 05 January 2015 the Kolmogorov statistic to test the hypothesis that the population from which the data came was normal with mean 1 and standard deviation 1/v6. log Uj. By this test he found the data were consistent with the hypothesis. The operation Uj=F(Xi) is simply finding the prob- ability to the left of v6(xj-l) according to the standard normal distri- bution. using 8 categories each with expected frequency 5. shows that x2 =6. Empirical study suggests that the asymptotic value is reached very rapidly. Application to the same data of the usual x 2 criterion of K. 766 AMERICAN STATISTICAL ASSOCIATION JOURNAL.492 .3473 In these two examples we have used the asymptotic percentage points instead of the actual ones based on finite sample size. obtaining W n 2 = 1. the empirical cumulative distribution function .1789. uj=F(xj). DECEMBER 1954 ASYMPTOTIC SIGNIFICANCE POINTS Significance Level Significance Point . Another test procedure uses the Cramer-von Mises w2 criterion given by (3) nw2 =-1 + 2: 7\ ( 2j - Uj--- 1)2 12n i~l 2n The asymptotic distribution of this statistic is given in [1]. and we do not reject the hypothesis.857 2. The computation sheet for this calculation had the following columns: Xi. which is also well below the 10 per cent asymptotic significance point of . 3.10 1. Pearson. and it ap- pears safe to use the asymptotic value for a sample size as large as 40. We have analyzed the same data using (2). v6(xj-1). DERIVATION OF THE CRITERION Several test procedures are based on comparing the specified cumula- tive distribution function F(x) with its sample analogue.158 which is well below the 10 per cent significance point.933 . log (l-U7\-j+l) and -[log uj+log (l-U n-i+l)]. For Birn- baum's data we obtain nw2 = .

it appears difficult to find an "optimum" weight function 1/. the variance is F(x)[I-F{x)]. Birnbaum [3] and Lehmann [4].(11) to be large for u near 0 and 1.H(x)] + n[F(x) . and to this end is willing to sacrifice power against alternatives in which H(x) and F(x) disagree near the median of F(x). weighted by 1/. This function has the effect of weighting the tails heavily since this function is large near u=O and U= 1. Under the null hypothesis (H(x)=F(x)). 1 this criterion is n times the w2 criterion. see. of Xi ~ X Fn(x) = . A TEST OF GOODNESS OF FIT 767 no.(u). For a given value of z. Thus. the true distribution.H(X)]2 = H(x) [1 .F(x) where the test is desired Downloaded by [Michigan State University] at 05:41 05 January 2015 to have sensitivity. In a sense. Fn(x) is a binomial variable. If one wishes the test to have good power against alternatives in which H(x).H(X)]2. For a discussion of the general nature of power of distribu- tion-free tests.u) as a weight function. and small near u=!. The criterion W n 2 is an average of the squared discrepancy fFn(x)-F(x)]2.[F(x)] and the increase in F(x) (and the normalization n). it is distributed in the same way as the proportion of successes in n trials. we would equalize the sampling error over the entire range of z by weighting the deviation by the reciprocal of the standard devi- ation under the null hypothesis. Even if the alternative hypotheses are closely delineated. E[Fn(x)] =H(x) and (5) nE[Fn(x) .(u) =.(u) is some nonnegative weight function chosen by the experi- menter to accentuate the values of Fn(X) . The hypothesis is to be rejected if W n2 is sufficiently large. where the probability of success is H(x). Formula (2) is obtained by writing (4) as . that is.H(X)]2 + n[F(x) . for example.(u) = u(1 . and F(x) disagree near the tails of F(x). It is this weight function (6) which we treat in the present note. by using 1 (6) 1/. When 1/. it seems that one ought to choose 1/.F(X)]2 = nE[Fn(x) . however. n The present writers suggested [1] the use of the criterion (4) where 1/.

F(x)] dF(x) rl-F(x)12 f cc + . By using the fact that ez/ [S(w2+ l) ] ~ez/s. 2 2 (8) Z . + :0" F(x)[1 .!)(4j + L--·--.=0 j! The terms of this series alternate in sign and the (j+ l)st term is less than the jth term.F(x) ]2 f CC .(x) [1 . COMI'UTATION OF THE ASYMPTOTIC SIGNIFICANCE POINTS It was proved in [1] that the limiting characteristic function of W.e-[{4HI) .--·---.. which is not true of (6).5) in [1] cannot be used directly here. 2) = / - 1r 21rit _ cos (2V1 + 8it) and that the inversion of this characteristic function gave for the limiting cumulative distribution of W. ]/(Sz) 1) . then the inte- grand isf(y)e-i Il2• The y-axis was divided into intervals according to the integral e.. .. 4.2 the expression - V2 00 (-. 768 AMERICAN STATISTICAL ASSOCIATION JOURNAL..2 n = -00 F(x) [1 _ F(x)] dF(x) "I F2(X) f"t [Ji'.(x) . DECEMBER 1954 1 [F. j?. and letting F(z) =u(dF(x) =du).F(x)] dF(x). one needs only the O-th term for the first two significance points and the O-th and L-st terms for the third significance point. The formula (2.----·----·.F(X)]2 = f -00 Ji'(x) [1 . 1.... The laborious part of the computation is the evaluation of the integral. F(x)] dF(x) + '"I F. one can easily verify 'that to compute the prob- abilities correctly to four decimal places. thus the error involved in using only j terms of this series is less than the (j+l)st term for j?.2 defined in either (2) or (4) is (7) ep(t) = ~: E(e i IW . W.i ll2 and numerical integration was performed. for that formula requires that fo 1 !/I(u)du < 'Xi. l)ir(j -I.O. Straightforward integration and collec- Downloaded by [Michigan State University] at 05:41 05 January 2015 tion of terms gives (2).(x) . Let [("1j+1)1r!2vz)]w=y....

.. 24 (1953). Z.. . "Asymptotic theory of certain 'good- ness of fit' criteria based on stochastic processes. [4J Lehmann.425-41.= 1. H j(j + 1) 00 1 2 lim Var (W. 47 (1952)." Annals of Mathematical Statis- tics." Journal of the American Statistical Associa- tion.57974. (11"2 . 23-43. Z. W.. IJ.=1 j2(j + 1)2 3 The asymptotic significance points are computed to assure the prob- abilities (significance levels) to be correct to four decimal places. 1-8. Downloaded by [Michigan State University] at 05:41 05 January 2015 REFERENCES [1) Anderson... A. [2) Birnbaum.9) "" . D.. 24 (1953). n.. T.... 193-212. The first two are 00 1 lim E(W. A TEST OF GOODNESS OF FIT 769 The moments of the asymptotic distribution are fairly easy to obtain from formulas given in [1]. W. W. "Numerical tabulation of the distribution of Kolmogorov's statistic for finite sample size... [3) Birnbaum. ..2) E(W 2) = " L: ." Annals of Mathematical Statistics.2) = 2 I: =." Annals of Mathematical Statistics. "The power of rank tests. "Distribution-free tests of fit for continuous distribution functions. 23 (1952). and Darling. E.