STATISTICS
IN PARTICLE PHYSICS
A. G. Frodesen, O. Skjeggestad
DEPARTMENT OF PHYSICS
UNIVERSITY OF BERGEN
H. Tøfte
DEPARTMENT OF COMPUTING SCIENCE
AGDER REGIONAL COLLEGE, KRISTIANSAND
UNIVERSITETSFORLAGET
BERGEN - OSLO - TROMSØ
Distribution offices:

NORWAY
Universitetsforlaget
Box 2977 Tøyen
Oslo 6

UNITED KINGDOM
Global Books Resources Ltd
109 Great Russell Street
London WC1B 3ND

UNITED STATES and CANADA
Columbia University Press
136 South Broadway
Irvington-on-Hudson
New York 10533

Reprinted with Permission from Columbia University Press.
PROBABILITY AND STATISTICS IN PARTICLE PHYSICS by Frodesen, Skjeggestad, and Tøfte, 1979.
Copyright 1979 by Columbia University Press.

Printed in Norway by
Reklametrykk A.s, Bergen

Preface

The present book on probability theory and statistics is intended for graduate students and research workers in experimental high energy and elementary particle physics. The book has originated from the authors' attempts during many years to provide themselves and their students working for a degree in experimental particle physics with practical knowledge of statistical analysis methods and some further insight required for research in this field.

The first drafting of notes started more than ten years ago when the authors were colleagues at the University of Oslo, and working with bubble chamber experiments. At that time no textbook in statistics was known to us which took its examples and applications from high energy physics and could serve as a reference book in daily work and a suitable curriculum for our students. Several advanced books were available in the library which discussed the fundamentals of probability theory and mathematical statistics per se [e.g. Cramér, Kendall and Stuart]. Other, less demanding textbooks incorporated useful examples from many fields, including physics, and had the virtue of acquainting a wider scientific community with the universal methods devised by the science of statistics [Fisher, Johnson and Leone, Ostle, Sverdrup, Wine]. Also available were articles and lecture notes discussing statistical estimation in physics in general and in high energy physics in particular [Annis et al., Böck, Hudson, Jauneau and Morellet, Orear, Solmitz]. The need for more coherent presentations apparently was latent and brought on the market a systematic, relatively theoretical account of statistics as used by physicists [Martin, 1971], as well as two treatises written by experimental particle physicists. The latter authors, however, either intended their book "for students and research workers in science, medicine, engineering and economics" [Brandt, 1970], or addressed their course "to physicists (and experimenters in related sciences) in their task of extracting information from experimental data" [Eadie et al., 1971].

The present text has been written for readers who are supposed to have their main interest in elementary particle physics and who have a need for statistical methods as a tool in their work. This, of course, does not mean that the book can only be comprehended by people whose background is particle physics. However, it is only fair to state that it is a rather specialized book, in which the topics discussed, the disposition and style reflect the need of an experimental particle physicist, and in which examples and applications have been almost exclusively chosen from this field. This fact undoubtedly limits the usefulness of the book to readers from other branches of physics. On the other hand, with the high degree of specialization within the physical and other sciences today, it is, in the opinion of the authors, well worth-while to aim at a more restricted group of readers and to tailor the presentation to meet the specific demands of this group. After all, the statistical methods needed in many disciplines are more or less standard and available in excellent general presentations of mathematical statistics. Often, however, the task of extracting the relevant information from these books can be both time-consuming and troublesome for the non-specialist. Stories are told about physicists who have spent months of their time developing new methods for data analysis, only to find out later that such methods were already described in the statistical literature. A dedicated book like the present can hopefully serve to reduce such instances of wasted time and effort.

The book assumes no previous knowledge beyond basic calculus. The subject of probability theory is entered on an elementary level and given a rather simple and detailed exposition; this is thought to be to the benefit of the student who starts a new course and should get well acquainted with the most common probability concepts and distributions before entering the domain of statistics. The book has been written so that it need not be worked through as a regular course, with extensive reading from the very beginning, but can be studied chapter- or section-wise, should the reader prefer so. The material has, in fact, been organized with an eye to the experimental physicist's practical need, which is likely to be statistical methods for estimation or decision-making. Since an established physicist will probably possess a sufficient background on the fundamentals of probability theory, he can be recommended to start his reading directly at the chapters he is interested in. Cross references are given to indicate where definitions and developed formulae can be found in earlier chapters of the book. It is hoped that the book will prove useful in everyday work. For this purpose the list of contents and the subject index have been made to include physics key words as well as statistical terms to facilitate the use of the book as a reference manual. To make it self-contained, the book has also been supplied with a set of statistical tables in an appendix.

With its emphasis on the various practical aspects of statistical methods and techniques the presentation in this book differs from the general, more theoretical points of view shared by statisticians. As non-professionals in statistics the authors make no claim to originality on the subject. We have felt free to borrow material which, over the years, has acquired a status of "common property" among particle physicists, without mentioning originators or written sources. Our reference policy is otherwise to give only the names of authors in the text where examples have been taken from articles in physics publications, lecture notes etc., and to give the full reference in the bibliography at the end of the book. The bibliography also contains references to textbooks which can be suggested as alternative and further reading.

We would like to express explicitly our indebtedness to M.G. Kendall and A. Stuart, the authors of the three-volume work "The Advanced Theory of Statistics" which we have constantly consulted and found to contain answers to any question.

We are also indebted to Addison-Wesley Publishing Company, Inc., for permission to use material from Table 15.1 in Handbook of Statistical Tables by D. Owen, and to the Biometrika Trustees for permission to reproduce Table 1 from L.R. Verdooren's paper "Extended tables of critical values for Wilcoxon's test statistic", printed in Biometrika.

It is a pleasure to acknowledge the useful help of our many students who over the years contributed their comments on the course. Finally, we wish to thank Mrs. Laila Nest for her patient and careful cooperation with the typing of the manuscript.

December 1978 A.G.F., O.S., H.T.
Contents
1 INTRODUCTION
...
3.6 Linear functions of random variables
3.6.1 Example: Arithmetic mean of independent variables with the same mean and variance
3.7 Change of variables
3.7.1 Example: Dalitz plot variables
3.8 Propagation of errors
3.8.1 A single function
3.8.2 Example: Variance of arithmetic mean
3.8.3 Several functions; matrix notation
3.9 Discrete probability distributions
3.9.1 Modification of formulae
3.9.2 The probability generating function
3.10 Sampling
3.10.1 Universe and sample
3.10.2 Sample properties
3.10.3 Inferences from the sample
3.10.4 The Law of Large Numbers
4 SPECIAL PROBABILITY DISTRIBUTIONS
4.1 The binomial distribution
4.1.1 Definition and properties
4.1.2 Example: Histogramming events (1)
4.1.3 Example: Scanning efficiency (2)
4.2 The multinomial distribution
4.2.1 Definition and properties
4.2.2 Example: Histogramming events (2)
4.3 The Poisson distribution
4.3.1 Definition and properties
4.3.2 The Poisson assumptions
  Example: Bubbles along a track in a bubble chamber
4.3.3 Example: Radioactive emissions
4.4 Relationships between the Poisson and other probability distributions
4.4.1 Example: Distribution of counts from an inefficient counter
4.4.2 Example: Subdivision of a counting interval
4.4.3 Relation between binomial and Poisson distributions
  Example: Forward-backward classification
4.4.4 Relation between multinomial and Poisson distributions
  Example: Histogramming events (3)
4.4.5 The compound Poisson distribution
  Example: Droplet formation along tracks in cloud chamber
4.5 The uniform distribution
4.5.1 The uniform p.d.f.
4.5.2 Example: Uniform random number generators
4.6 The exponential distribution
4.6.1 Definition and properties
4.6.2 Derivation of the exponential p.d.f. from the Poisson assumptions
4.7 The gamma distribution
4.7.1 Definition and properties
4.7.2 Derivation of the gamma p.d.f. from the Poisson assumptions
4.7.3 Example: On-line processing of batched events
4.8 The normal, or Gaussian, distribution
4.8.1 Definition and properties of $N(\mu,\sigma^2)$
4.8.2 The standard normal distribution $N(0,1)$
4.8.3 Probability contents of $N(\mu,\sigma^2)$
4.8.4 Central moments; the characteristic function
4.8.5 Addition theorem for normally distributed variables
4.8.6 Properties of $\bar{x}$ and $s^2$ for sample from $N(\mu,\sigma^2)$
4.8.7 Example: Position and width of resonance peak
4.8.8 The Central Limit Theorem
4.8.9 Example: Gaussian random number generator
4.9 The binormal distribution
4.9.1 Definition and properties
4.9.2 Example: Construction of a binormal random number generator
4.10 The multinormal distribution
4.10.1 Definition and properties
4.10.2 The quadratic form Q
4.11 The Cauchy, or Breit-Wigner, distribution
5 SAMPLING DISTRIBUTIONS
5.1 The chi-square distribution
5.1.1 Definition
5.1.2 Proof for the chi-square p.d.f.
5.1.3 Properties of the chi-square distribution
5.1.4 Probability contents of the chi-square distribution
5.1.5 Addition theorem for chi-square distributed variables
5.1.6 Proof that $(n-1)s^2/\sigma^2$ for sample from $N(\mu,\sigma^2)$ is $\chi^2(n-1)$
5.2 The Student's t-distribution
5.2.1 Definition
5.2.2 Proof for the Student's t p.d.f.
5.2.3 Properties of the Student's t-distribution
5.2.4 Probability contents of the Student's t-distribution
5.3 The F-distribution
5.3.1 Definition
5.3.2 Proof for the F p.d.f.
5.3.3 Properties of the F-distribution
5.3.4 Probability contents of the F-distribution
5.4 Limiting properties - connection between probability distributions
6 COMPARISON OF EXPERIMENTAL DATA WITH THEORY
6.1 Rejection of bad measurements
6.2 Experimental errors on measurements. The resolution function
6.2.1 Example: Gaussian resolution function and exponential p.d.f.
6.2.2 Example: Gaussian resolution function and Gaussian p.d.f.
6.2.3 Example: Breit-Wigner resolution function and Breit-Wigner p.d.f.
6.2.4 Example: Width of a resonance
6.2.5 Experimental determination of resolution function; ideogram
6.3 Systematic effects. Detection efficiency
6.3.1 Example: Truncation of an exponential distribution
6.3.2 Example: Truncation of a Breit-Wigner distribution
6.3.3 Correcting for finite geometry - modifying the p.d.f.
6.3.4 Correcting for unobservable events - weighting of the events
6.4 Superimposed probability densities
6.4.1 Example: Particle beam with background
6.4.2 Example: Resonance peaks in an effective-mass spectrum
7 STATISTICAL INFERENCE FROM NORMAL SAMPLES
7.1 Definitions
7.2 Confidence intervals for the mean
7.2.1 Case with $\sigma^2$ known
7.2.2 Case with $\sigma^2$ unknown
7.3 Confidence intervals for the variance
7.3.1 Case with $\mu$ known
7.3.2 Case with $\mu$ unknown
7.4 Confidence regions for the mean and variance
8 ESTIMATION OF PARAMETERS
9 THE MAXIMUM-LIKELIHOOD METHOD
9.1 The Maximum-Likelihood Principle
9.1.1 Example: Estimate of mean lifetime
9.2 Estimation of parameters in the normal distribution
9.2.1 Estimation of $\mu$; measurements with common error
9.2.2 Estimation of $\mu$; measurements with different errors (weighted mean)
9.2.3 Simultaneous estimation of mean and variance
9.3 Estimation of the location parameter in the Cauchy p.d.f.
9.4 Properties of Maximum-Likelihood estimators
9.4.1 Invariance under parameter transformation
9.4.2 Consistency
9.4.3 Unbiassedness
9.4.4 Sufficiency
9.4.5 Efficiency
9.4.6 Uniqueness
9.4.7 Asymptotic normality of ML estimators
9.4.8 Example: Asymptotic normality of the ML estimator of the mean lifetime
9.5 Variance of Maximum-Likelihood estimators
9.5.1 General methods for variance estimation
9.5.2 Example: Variance of the lifetime estimate
9.5.3 Variance of sufficient ML estimators
9.5.4 Example: Variance of the weighted mean
9.5.5 Example: Errors in the ML estimates of $\mu$ and $\sigma^2$ in $N(\mu,\sigma^2)$
9.5.6 Variance of large-sample ML estimators
9.5.7 Example: Planning of an experiment; (1)
9.5.8 Example: Planning of an experiment; density matrix elements (1)
14 HYPOTHESIS TESTING
14.1 Introductory remarks
14.2 Outline of general methods
14.2.1 Example: Separation of one-$\pi^0$ and multi-$\pi^0$ events
14.2.2 The Neyman-Pearson test for simple hypotheses
14.2.3 Example: Neyman-Pearson test on the Eo mean lifetime
14.2.4 The likelihood-ratio test for composite hypotheses
14.2.5 Example: Likelihood-ratio test on the mean of a normal p.d.f.
14.3 Parametric tests for normal variables
14.3.1 Tests of mean and variance in $N(\mu,\sigma^2)$
14.3.2 Comparison of means in two normal distributions
14.3.3 Comparison of variances in two normal distributions
14.3.4 Summary table
14.3.5 Example: Comparison of results from two different measuring machines
14.3.6 Example: Significance of signal above background
14.3.7 Comparison of means in N normal distributions; scale factor
14.4 Goodness-of-fit tests
14.4.1 Pearson's $\chi^2$ test
14.4.2 Choice of classes for Pearson's $\chi^2$ test
14.4.3 Degrees of freedom in Pearson's $\chi^2$ test
14.4.4 General $\chi^2$ tests for goodness-of-fit
14.4.5 Example: Kinematic analysis of a $V^0$ event (2)
14.4.6 The Kolmogorov-Smirnov test
14.4.7 Example: Goodness-of-fit in a small sample
14.5 Tests of independence
14.5.1 Two-way classification; contingency tables
14.5.2 Example: Independence of momentum components
14.6 Tests of consistency and randomness
14.6.1 Sign test
14.6.2 Run test for comparison of two samples
14.6.3 Example: Consistency between two effective-mass samples
14.6.4 Run test for checking randomness within one sample
14.6.5 Example: Time variation of beam momentum
14.6.6 Run test as a supplement to Pearson's $\chi^2$ test
14.6.7 Example: Comparison of experimental histogram and theoretical distribution
14.6.8 Kolmogorov-Smirnov test for comparison of two samples
14.6.9 Wilcoxon's rank sum test for comparison of two samples
14.6.10 Example: Consistency test for two sets of measurements of the $\pi^0$ lifetime
14.6.11 Kruskal-Wallis rank test for comparison of several samples
14.6.12 The $\chi^2$ test for comparison of histograms
APPENDIX STATISTICAL TABLES
Table A1 The binomial distribution
Table A2 The cumulative binomial distribution
Table A3 The Poisson distribution
Table A4 The cumulative Poisson distribution
Table A5 The standard normal probability density function
Table A6 The cumulative standard normal distribution
Table A7 Percentage points of the Student's t-distribution
Table A8 Percentage points of the chi-square distribution
Table A9 Percentage points of the F-distribution
Table A10 Percentage points of the Kolmogorov-Smirnov statistic
Table A11 Critical values of the run statistic
Table A12 Critical values of the Wilcoxon rank sum statistic
BIBLIOGRAPHY
INDEX
1. Introduction
[...] binomial, Poisson, and exponential distributions. Particular attention is given to the normal, or Gaussian, distribution, because of its key role, theoretically (the Central Limit Theorem) as well as practically (describing outcome of measurements). Chapter 5 deals with a class of sampling distributions which are all related to the normal distribution; the most important of these is the chi-square distribution.

The real world seldom fits exactly into the scheme provided by the ideal mathematical models. In Chapter 6 we indicate how different situations can be handled by truncation of the probability distribution, by folding-in of the experimental resolution, and by correcting for inefficiencies in the detecting apparatus.

Passing to the domain of statistics, we begin in Chapter 7 by introducing the important concept of a confidence interval, applying it to the common practical problem of estimating the parameters of the normal distribution. The general aspects of parameter estimation are discussed in Chapter 8, which gives the formal background for the specific estimation methods described in the three subsequent chapters. The Maximum-Likelihood method (Chapter 9), the Least-Squares method (Chapter 10), and the method of moments (Chapter 11) all produce point estimates of the unknown parameters, and we discuss how measures can be obtained for the uncertainty - or error - in these estimates. A point estimate and its error is equivalent to a particular interval estimate of the parameter. Interval estimation in general is also discussed, based on the simplifying assumptions of infinite sample sizes (in the likelihood approach) and linear models (for the Least-Squares estimation), since these facilitate comparisons with the normal and the chi-square distributions. In Chapter 10 we also consider the important issue of constrained parameter estimation, using the technique with Lagrangian multipliers, and discuss how the Least-Squares estimation can provide measures of goodness-of-fit. Chapter 12 describes a simple case of [...]zation experiments.

The Maximum-Likelihood and the Least-Squares estimation methods both require searching the extremum of a function with respect to the unknown parameters. In Chapter 13 we sketch the principles behind commonly used numerical [...]

The second main application of statistics, the testing of statistical hypotheses, is taken up in Chapter 14. In this area of statistical inference the observations are used for decision-making. With a test we mean a given rule or criterion for arriving at a decision of acceptance or rejection of some formulated hypothesis. After a brief survey of the general principles involved, we concentrate on parametric tests for normally distributed variables, which have considerable practical importance. Distribution-free tests of goodness-of-fit between model prediction and experimental observation, or between different sets of observations, include the common $\chi^2$-test and the Kolmogorov-Smirnov tests; the $\chi^2$-test can also be used to test independence between variables and consistency between sets of observations (histograms). Other simple, less well-known prescriptions are given to test randomness within a sample, and consistency between two or more samples.

The Appendix contains tabulations of the most common probability distributions which are referred to throughout the book, as well as a set of tables with percentage points and critical values for some of the test statistics used exclusively in Chapter 14.

There are two important subjects of wide application in particle physics, which are mutually related like probability theory and statistics but not covered in this book. The first of these concerns the simulation of processes and generation of artificial N-particle reactions in the 3N-4 dimensional Lorentz invariant phase space by the Monte Carlo technique. The second deals with methods for analyzing and finding structures in this multi-dimensional space, given a sample of observed N-particle reactions. Readers who are interested in these topics should consult references on Monte Carlo methods and multi-dimensional data analysis given in the bibliography at the end of the book.
[...] probability $p$ [...] of landing "heads" and a probability $1-p$ of landing "tails". We ask: What is the probability of observing $r$ heads out of $n$ tosses? This is a question in probability theory, and an answer is provided by the binomial distribution law, which states that the probability to obtain $r$ heads and [...]

[...] could be expressed by assigning a confidence level to it. Given the observation of $r$ heads in $n$ tosses it is again a case of statistical inference to determine an interval $[p_1,p_2]$ which is such that it has a certain probability of including the true value of $p$. In general, the larger we take the interval the [...]
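The coin-tossing question above can be made concrete with a few lines of Python. This is only a minimal sketch, assuming the standard form of the binomial law, $P(r) = \binom{n}{r} p^r (1-p)^{n-r}$; the function name `binomial_probability` is chosen here for illustration and does not come from the text.

```python
from math import comb


def binomial_probability(r: int, n: int, p: float) -> float:
    """Probability of exactly r heads in n independent tosses, P(heads) = p.

    Assumes the standard binomial law; returns 0 for impossible r.
    """
    if not 0 <= r <= n:
        return 0.0
    return comb(n, r) * p**r * (1.0 - p) ** (n - r)


# Example: a fair coin tossed 10 times.
n_tosses, p_heads = 10, 0.5
distribution = [binomial_probability(r, n_tosses, p_heads) for r in range(n_tosses + 1)]
print(distribution[5])    # probability of exactly 5 heads: 0.24609375
print(sum(distribution))  # the probabilities over all r add up to one
```

The second printed value illustrates the normalization requirement: summing over all possible outcomes $r = 0, \ldots, n$ gives unity.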
[...] From this definition the probability of the event E is some number satisfying

$$0 \le P(E) \le 1.$$

[...] The requirement that all probabilities should add up to one is now formulated by

$$\sum_{i} P(A_i) = 1 \quad \text{(exclusive and exhaustive)}. \qquad (2.9)$$
Exercise 2.1: Let A, B, C be three non-exclusive sets in the sample space $\Omega$. Find representations of the combinations $(A \cup B) \cap C$ and $(A \cap B) \cap C$.

Exercise 2.2: Show, by using the technique of the Venn diagram, that $(A \cup B) \cap C$ in general is different from $A \cup (B \cap C)$.

2.3.4 Conditional probability
Suppose A and B are subsets of the sample space $\Omega$ and represent the probabilities $P(A)$ and $P(B)$, respectively. Suppose further that we for some reason will be interested only in the elements of A and that we therefore want [...]
[...] Inspecting the different areas in the diagram we write

$$P(A) = [\ldots], \quad P(B) = [\ldots],$$

and [...]

[...]

$$K^+ + p \to K^0 + \pi^+ + p \quad \text{(production)},$$

and detected through the decay

$$K^0 \to \pi^+ + \pi^- \quad \text{(decay, event B)}.$$

The interesting reaction is

$$K^0 + p \to K^0 + p \quad \text{(scatter, event A)},$$

for which we search the probability $P(A)$. We are only able to identify event A if event B is also observed, because the decaying kaon as well as the recoiling proton must be measured to obtain a kinematical fit to the scattering reaction. Writing (eq.(2.10)) [...]

[...] intersection $P(A \cap B)$ represents the fraction of completely identified sequential events, [...]
2.3.6 Independence; multiplication rule
Two sets A and B are said to be independent if the conditional probability of B relative to A is equal to the probability of B,

$$P(B|A) = P(B).$$
[...] In order to discriminate against accidental triggering of the system it is desirable to observe coincidences between the signals from several phototubes. We assume all phototubes to act independently. [...] arrangement of the phototubes. In Fig.2.5(b) the tubes are grouped together three by three, and each group is activated if at least one of the tubes in the group has a signal. The observation of a particle by the detector then requires a coincidence between the signals from the three groups, for which the probability becomes [...]

[...] Let the event of closing relay $i$ be denoted by $E_i$, $i=1,2,3$. Then $P(E_1) = P(E_2) = P(E_3) = a$. Having a current between the terminals corresponds to the event $E = E_1 \cup (E_2 \cap E_3)$, for which we shall find

$$P(E) = P(E_1) + P(E_2)P(E_3) - P(E_1)P(E_2)P(E_3) = a + a^2 - a^3,$$

which can be rewritten as [...] to show an application of the addition rule, eq.(2.7).
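The relay result $P(E) = a + a^2 - a^3$ can be cross-checked by brute force. The sketch below assumes, as the product terms in the formula imply, that the three relays close independently with the same probability $a$; it enumerates all eight open/closed configurations and adds the probabilities of those in which $E = E_1 \cup (E_2 \cap E_3)$ occurs. The function names are illustrative only.

```python
from itertools import product


def current_probability(a: float) -> float:
    """Sum the probabilities of all relay states that let current through.

    Assumption: relays 1, 2, 3 close independently, each with probability a.
    Current flows if relay 1 is closed, or if relays 2 and 3 are both closed.
    """
    total = 0.0
    for e1, e2, e3 in product([True, False], repeat=3):
        if e1 or (e2 and e3):
            weight = 1.0
            for closed in (e1, e2, e3):
                weight *= a if closed else (1.0 - a)
            total += weight
    return total


def closed_form(a: float) -> float:
    """The expression quoted in the text: a + a^2 - a^3."""
    return a + a**2 - a**3


for a in (0.0, 0.3, 0.9, 1.0):
    print(a, current_probability(a), closed_form(a))
```

The two columns agree up to floating-point rounding, which is simply the addition rule $P(E_1 \cup F) = P(E_1) + P(F) - P(E_1 \cap F)$ with $F = E_2 \cap E_3$.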
2.3.9 Example: $\pi^0$ detection
The $\pi^0$ meson is known to decay electromagnetically into two $\gamma$-quanta. Suppose that we study $\pi^0$ decays in some detector, e.g. a heavy liquid bubble chamber, and that the average probability is $a$ for the conversion of a $\gamma$ into an electron-positron pair within the detector. We want first to find the probabilities for seeing two, one, or none of the decay products of a single $\pi^0$.

Clearly the conversion of different $\gamma$-rays can be assumed to occur independently of each other. Therefore, in a simple application of the multiplication rule, eq.(2.13), we can write down the probability to detect both $\gamma$'s from the decaying $\pi^0$ as

$$P(2\gamma) = a^2,$$

and, similarly, the probability for seeing none of them,

$$P(0\gamma) = (1-a)^2.$$

The probability that only one of the $\gamma$'s is visible is also easily written down,

$$P(1\gamma) = 2a(1-a),$$

[...]

$$P(2\gamma) + P(1\gamma) + P(0\gamma) = 1.$$

The same line of thought may be applied to more complex situations where several $\pi^0$'s are involved. For instance, an w0 meson, with a decay into $3\pi^0$, will give rise to up to six detected $\gamma$-rays. However, unless the detector is very efficient, the probability to see many decay products soon becomes very small.

2.3.10 Example: Beam contamination and delta rays
It is quite often a problem in kaon and antiproton exposures in bubble chambers to find the contamination in the beam of lighter particles, pions and muons. Since a light particle can impart more of its energy to an electron than a heavy particle of the same momentum, it is possible to estimate the contamination from a count of beam tracks having delta ray electrons with an energy $E$ exceeding the maximum possible, $E_{max}$, for a delta ray produced by the heavier particle. Hence the presence of a delta ray with $E > E_{max}$ ("large $\delta$") indicates a "light" beam track.

We introduce the following notation:

$N_1$ = number of beam tracks observed with one "large $\delta$",
[...]
$P(1\delta)$ = probability that a light particle produces one "large $\delta$",
$P(2\delta)$ = probability that a light particle produces two "large $\delta$".

Hence the contamination is estimated as [...]

[Figure residue: regions labelled "Scan 1" and "Scan 2"]
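Returning to the $\pi^0$ detection example of Sect. 2.3.9, the three probabilities follow directly from the multiplication rule once the two photon conversions are taken as independent, each with probability $a$. The sketch below is illustrative only: the function names are not from the text, and the generalization to $k$ detected photons out of $n$ (for instance $n = 6$ for a decay into three $\pi^0$'s) assumes the same independence and the same conversion probability for every photon.

```python
from math import comb


def pi0_detection_probabilities(a: float) -> dict:
    """Probabilities of seeing 2, 1 or 0 photons from a single pi0 decay.

    Assumption: each photon converts independently with probability a.
    """
    return {
        2: a * a,                  # both photons convert
        1: 2.0 * a * (1.0 - a),    # exactly one converts (two ways)
        0: (1.0 - a) ** 2,         # neither converts
    }


def prob_k_of_n_photons(k: int, n: int, a: float) -> float:
    """Probability of detecting exactly k out of n independent photons."""
    return comb(n, k) * a**k * (1.0 - a) ** (n - k)


probs = pi0_detection_probabilities(0.3)
print(probs, sum(probs.values()))

# Seeing all six photons from three pi0's becomes unlikely quickly:
for a in (0.9, 0.5, 0.3):
    print(a, prob_k_of_n_photons(6, 6, a))
```

Even with a conversion probability of 0.5 per photon, all six photons are seen in only about 1.6% of the decays, in line with the remark that the probability to see many decay products soon becomes very small.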
[...] individual efficiencies; applying eq.(2.7) we have [...] which, using eq.(2.16), gives [...]

In developing the formulae above several assumptions were involved, some of which have not been stated explicitly and may hardly be fulfilled in practice. For instance, it has been assumed that

(i) the different scans are performed independently (an assumption that is perhaps best met by having different scanners),
(ii) all events have the same probability of being detected (an ideal requirement which is not achieved in practice when complicated topologies occur),
(iii) the number of events is sufficiently large to warrant the use of the concept "probability".

We briefly touch the question of the statistical uncertainties in the scanning efficiencies in connection with our discussion of the binomial distribution (Sect.4.1.3). The whole problem of estimating scanning efficiencies is [...]

[...] and, similarly, the marginal probability of $B_j$,

$$P(B_j) = \sum_{i} P(A_i \cap B_j). \qquad (2.23)$$

In particle physics the concept of marginal probability is inherent in the notion of inclusive reactions. For example, writing

$$a + b \to c + \text{anything}$$

implies that in the reaction of particles $a$ and $b$ one is only interested in observations on the properties of particle $c$, ignoring all the other particles. A more specific example follows.

2.3.13 Example: Topologies of bubble chamber events (2)
An experiment on strange particle production in proton-proton reactions in a bubble chamber has classified the events according to the identified neutral strange particle ($V^0$), criterion A, and the number of charged particles (prongs) seen in the primary reaction, criterion B. Criterion A gives four exclusive possibilities, with observation of one $K^0_s$, one $\Lambda$, two $K^0_s$, or one $K^0_s$ plus one $\Lambda$, respectively, while criterion B gives five (exclusive) possible prong number assignments for the primary reaction. Suppose that the probabilities $P(A_i \cap B_j)$ are given by the observed relative frequencies for the various topologies displayed in the following table:
whereas, for example, the marginal probability for having a 4-prong reaction with any $V^0$ signal is found by adding the numbers in the second column,

2.4 BAYES' THEOREM

We shall give a brief account of Bayes' Theorem, which, although mathematically simple, has a controversial status among the specialists. We first state the theorem, next prove it (Sect.2.4.1), consider an example (Sect.2.4.2), give further comments (Sect.2.4.3) and a second example (Sect.2.4.4).

2.4.1 Statement and proof

Let the sample space $\Omega$ be spanned by the n mutually exclusive and exhaustive subsets $B_i$. If A is also a set belonging to $\Omega$, Bayes' Theorem states that

$$P(B_i|A) = \frac{P(A|B_i)\,P(B_i)}{P(A)}.$$

For P(A) we get an expression from the definition of marginal probability, eq.

2.4.2 Example

Let each of three drawers $B_1, B_2, B_3$ contain two coins; $B_1$ has two gold coins, $B_2$ one gold and one silver, $B_3$ two silver coins. We are to select one drawer at random and pick a coin from it. Supposing that this first coin turns out to be one of gold, what is then the probability that the second coin in the same drawer is also a gold coin?

If A denotes the event of first a gold coin we want to calculate the conditional probability $P(B_1|A)$. Obviously the conditional probabilities $P(A|B_i)$ of getting a gold coin from drawer $B_i$ are

$$P(A|B_1) = 1, \qquad P(A|B_2) = \tfrac{1}{2}, \qquad P(A|B_3) = 0,$$

respectively. Also, since a drawer is selected at random
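The drawer example can be checked numerically. The following is a minimal sketch, assuming that "selected at random" means equal prior probabilities $P(B_i) = 1/3$; the function name `bayes_posterior` is introduced here for illustration only.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return P(B_i|A) for every i, given P(B_i) and P(A|B_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    p_a = sum(joint)  # marginal probability P(A)
    return [j / p_a for j in joint]

# Assumption: each drawer is equally likely to be chosen.
priors = [Fraction(1, 3)] * 3
# P(A|B_i): probability that the first coin drawn is gold.
likelihoods = [Fraction(1), Fraction(1, 2), Fraction(0)]

posterior = bayes_posterior(priors, likelihoods)
print(posterior[0])  # P(B_1|A), the probability that the second coin is also gold
```

Under these assumptions the posterior probability for the two-gold drawer comes out as 2/3 rather than the 1/2 one might guess naively.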
second $V^0$ is ambiguous between a $K^0$ and a $\Lambda$ assignment. The probabilities following from the identification and measurements x have been found to be $P(x|K^0_s) = 0.10$ and $P(x|\Lambda) = 0.50$ for the two possibilities. We want to find the betting odds for the hypothesis that the $2V^0$ event is a $K^0_s K^0_s$ pair against the hypothesis that the event is a $K^0_s \Lambda$ pair. From the table in Sect.2.3.13, ignoring the information on the charged tracks of the primary reaction, we find, using eq.(2.28),

eq.(2.29) looks clearly quite arbitrary, and may lead to logical inconsistencies under some circumstances. Consider, for instance, the decay of unstable particles, which may be described in terms of the mean lifetime $\tau$ of the particles, or the decay constant $\lambda$, where $\lambda = 1/\tau$. If we wish to determine $\tau$, having no prior information on this quantity, Bayes' Postulate suggests that we take the prior distribution $P(\tau)$ = constant. If instead $\lambda$ was chosen to describe the decay, and we had no prior information on this quantity, then Bayes' Postulate would suggest that we use the prior distribution $P(\lambda)$ = constant. But this is quite different from the previous suggestion, because

$$P(\tau) = P(\lambda)\left|\frac{d\lambda}{d\tau}\right| = \lambda^2 P(\lambda).$$
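The inconsistency can be made concrete with a short numerical sketch. It assumes, purely for illustration, that $\lambda$ is uniform on the interval [0.5, 2.0]; the interval and all function names are hypothetical choices, not taken from the text. The induced density in $\tau = 1/\lambda$ is then $\lambda^2 P(\lambda)$, which is properly normalized but far from constant.

```python
def tau_density_from_uniform_lambda(tau, lam_min, lam_max):
    """Density of tau = 1/lambda when lambda is uniform on [lam_min, lam_max]."""
    lam = 1.0 / tau
    if lam < lam_min or lam > lam_max:
        return 0.0
    p_lambda = 1.0 / (lam_max - lam_min)  # constant prior in lambda
    return lam ** 2 * p_lambda            # P(tau) = lambda^2 P(lambda)

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

LAM_MIN, LAM_MAX = 0.5, 2.0  # illustrative range only

def density(tau):
    return tau_density_from_uniform_lambda(tau, LAM_MIN, LAM_MAX)

total = integrate(density, 1.0 / LAM_MAX, 1.0 / LAM_MIN)  # should be close to 1
ratio = density(0.5) / density(2.0)                       # would be 1 for a flat prior in tau
print(total, ratio)
```

A ratio different from one between the two ends of the allowed range shows that a prior which is flat in $\lambda$ cannot also be flat in $\tau$.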
interest. The present chapter, with its minimum of applications, should be regarded as a rather complete reference list of the general definitions and properties which will be applied and developed further in the remainder of the book.

Instead of characterizing the random variable by its probability density function f(x) one may use the cumulative distribution F(x), defined by

$$F(x) \equiv \int_{x_{\min}}^{x} f(x')\,dx'. \qquad (3.3)$$
$= a\,E(g(x))$, where a is a constant.

Fig. 3.1 shows the relative position of the location parameters (mode, median, mean) for a unimodal p.d.f.

$$E(a_1 g_1(x) + a_2 g_2(x)) = a_1 E(g_1(x)) + a_2 E(g_2(x)).$$

Thus E has the properties of a linear operator.

An important application of the expectation operator is to derive the expected value of the square of the difference between g(x) and its expectation E(g(x)). We will then get a measure of the spread or dispersion of g(x) about its central value, which is called the variance of g(x) for the p.d.f. f(x):

We next specify some definite forms of g(x) which are particularly useful.

3.3.1 Expectation values of a function

Let g(x) be some function of x. We define the mathematical expectation

For the spread of x about its mean value we take the variance V(x), or dispersion, of x for the p.d.f. f(x); this number is denoted by $\sigma^2$:
The moments of lowest order are easily derived from the definitions:

$$\mu'_0 = 1, \qquad \mu'_1 = \mu \qquad \text{(algebraic moments)},$$

$$\mu_0 = 1, \qquad \mu_1 = 0, \qquad \mu_2 = \sigma^2 \qquad \text{(central moments)}. \qquad (3.16)$$

Note in particular the relation between the second moments:

This is the same result as eq.(3.12).

Exercise 3.1: Show that, if a is a constant, $V(ax) = a^2 V(x)$.

Prove that the probability for g(x) to be at least as large as any constant value c is limited if E(g(x)) exists, and

$$P(g(x) \ge c) \le \frac{E(g(x))}{c}.$$

In particular, with $g(x) = (x - E(x))^2$ this is equivalent to

$$P(|x - E(x)| \ge \lambda\sigma) \le \frac{1}{\lambda^2},$$

which is called the Bienaymé-Tchebycheff inequality.

This general result on the probability for $|x - E(x)|$ to exceed a given number of standard deviations turns out to be very useful for proofs of limiting properties and convergence theorems; see Sect.3.10.4.
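As an illustration of the Bienaymé-Tchebycheff inequality, the sketch below compares the exact tail probability $P(|x - E(x)| \ge \lambda\sigma)$ with the bound $1/\lambda^2$ for a discrete distribution. The fair six-sided die is an arbitrary example chosen here, and `chebyshev_check` is a hypothetical helper name.

```python
import math
from fractions import Fraction

def chebyshev_check(values, probs, lam):
    """Return (exact tail probability, bound 1/lam^2) for a discrete distribution."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    sigma = math.sqrt(var)
    tail = sum(p for v, p in zip(values, probs) if abs(v - mean) >= lam * sigma)
    return float(tail), 1.0 / lam ** 2

# Illustrative distribution: a fair die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

for lam in (1.0, 1.2, 1.5, 2.0):
    tail, bound = chebyshev_check(values, probs, lam)
    print(lam, tail, bound)
```

The bound is never violated, but it is typically far from tight, which is why the inequality is mainly of use in proofs rather than in numerical work.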
3.3.3 General moments

Because of their simple interpretation $\mu = E(x)$ and $\sigma^2 = E(x-\mu)^2$ are extensively used as parameters in the probability density function.

For theoretical and practical purposes it is also convenient to define expectation values of other powers of x and $(x-\mu)$. With a general definition we shall call the expectation of $x^k$ the k-th moment of f(x) about the origin, or the k-th algebraic moment.

$$\mu = \mu'_1 \quad \text{(first algebraic moment)}, \qquad \sigma^2 = \mu_2 \quad \text{(second central moment)}.$$

The general relations between central moments and algebraic moments are as follows:

$$\mu_k = \sum_{r=0}^{k} \binom{k}{r}\,\mu'_{k-r}\,(-\mu'_1)^r \qquad \text{(algebraic moments known)}, \qquad (3.18)$$

$$\mu'_k = \sum_{r=0}^{k} \binom{k}{r}\,\mu_{k-r}\,(\mu'_1)^r \qquad \text{(central moments known)}.$$
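The conversion from algebraic to central moments, eq.(3.18), is easy to sketch in code. The example distribution (an exponential p.d.f. with unit mean, whose algebraic moments are $k!$) is an assumption made here for illustration, and `central_moment` is a hypothetical helper name.

```python
from math import comb

def central_moment(k, algebraic):
    """Central moment mu_k from the algebraic moments mu'_0 ... mu'_k (eq. 3.18)."""
    mean = algebraic[1]  # mu'_1
    return sum(comb(k, r) * algebraic[k - r] * (-mean) ** r for r in range(k + 1))

# Assumed example: algebraic moments mu'_k = k! of an exponential p.d.f. with mean 1.
algebraic = [1, 1, 2, 6, 24]

central = [central_moment(k, algebraic) for k in range(5)]
print(central)  # mu_0 ... mu_4
```

The first central moment vanishes identically and the second reproduces the variance, as expected from eq.(3.16).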
Clearly a positive value of $\gamma_1$ implies that the distribution f(x) has a tail to the right of the mean value, whereas a negative $\gamma_1$ indicates a tail to the left.

The coefficient of kurtosis, or peakedness, of f(x) is defined by the dimensionless quantity

$$\gamma_2 \equiv \frac{\mu_4}{(\mu_2)^2} - 3 = \frac{E(x-\mu)^4}{\sigma^4} - 3. \qquad (3.21)$$

This definition implies a comparison with the normal or Gaussian p.d.f., which has $\gamma_2 = 0$ (see Sect. 4.8.4). A positive (negative) value of $\gamma_2$ indicates that the distribution is more (less) peaked about the mean than a normal distribution of the same mean and variance.

3.4 THE CHARACTERISTIC FUNCTION

The various moments introduced in the preceding section serve to characterize the distribution under study. For instance, the first algebraic moment defines a mean or "center of gravity" for the distribution, and the second central moment measures the spread of the distribution about this mean.

In addition to the usefulness of the individual moments there is considerable theoretical interest attached to the complete set of moments $\mu'_k$ (or equivalently of $\mu_k$) since this set determines the probability density function completely.

$$\phi(t) \equiv \int_{-\infty}^{\infty} e^{itx} f(x)\,dx = E(e^{itx}). \qquad (3.22)$$

This function then contains the complete set of algebraic moments for the distribution f(x). By taking the Taylor expansion of $e^{itx}$ about the origin and using the linearity property of the expectation operator, we have

$$\phi(t) = \sum_{k=0}^{\infty} \frac{(it)^k}{k!}\,\mu'_k. \qquad (3.23)$$

Since the algebraic moments $\mu'_k$ appear as coefficients in a series expansion of $\phi(t)$ they can be expressed as

$$\mu'_k = \frac{1}{i^k}\left[\frac{d^k\phi(t)}{dt^k}\right]_{t=0}.$$

This relation can be used for the evaluation of the algebraic moments of any order when $\phi(t)$ is known.

If instead we want the central moments we should use the characteristic function in the form

$$\phi_\mu(t) \equiv \int e^{it(x-\mu)} f(x)\,dx = E[e^{it(x-\mu)}] \qquad (3.25)$$

and perform an expansion in a power series about $\mu$. Then, by analogy with eq.(3.23),

$$\phi_\mu(t) = \sum_{k=0}^{\infty} \frac{(it)^k}{k!}\,\mu_k,$$

and the central moments are obtained as

$$\mu_k = \frac{1}{i^k}\left[\frac{d^k\phi_\mu(t)}{dt^k}\right]_{t=0}.$$

The relation between the two forms of the characteristic function, defining the two different sets of moments, is simply

$$\phi_\mu(t) = e^{-it\mu}\,\phi(t).$$

Clearly, if one is interested in a general set of moments about an arbitrary point a, then one should use instead

$$\phi_a(t) = E[e^{it(x-a)}].$$
3.5.1 The joint probability density function

Up to now we have assumed the probability density function to depend on a single random variable. The extension to several variables $x_1, x_2, \ldots, x_n$ consists in considering a joint probability density function $f(x_1, x_2, \ldots, x_n)$. We assume this function to be positive and single-valued at every point $x_1, x_2, \ldots, x_n$ in the n-dimensional space, and that it is properly normalized,

$$\int_{\Omega} f(x_1, x_2, \ldots, x_n)\,dx_1 dx_2 \ldots dx_n = 1,$$

when the integration is over the entire domain of all $x_i$. For short we write

Here $\mu_i$ and $\mu_j$ are the expectations of the variables $x_i$ and $x_j$, respectively, in accordance with eq.(3.36) above.

The covariance matrix is of great importance to physicists. Some of its properties may be stated as follows:

(i) $V(\mathbf{x})$ is symmetric.

(ii) A diagonal element $V_{ii}$ is called the variance $\sigma_i^2$ of the variable $x_i$. $\sigma_i^2$ is a non-negative quantity, and, analogously to eqs.(3.10) - (3.12) for a single random variable, we have

$$\sigma_i^2 \equiv V_{ii} = E(x_i^2) - (E(x_i))^2. \qquad (3.38)$$

(iii) An off-diagonal element $V_{ij}$, where $i \ne j$, is called the covariance of $x_i$ and $x_j$ and is denoted by $\mathrm{cov}(x_i, x_j)$,

$$\mathrm{cov}(x_i, x_j) \equiv V_{ij} = E(x_i x_j) - E(x_i)E(x_j). \qquad (3.39)$$

The covariance may be a positive or negative quantity.

Since this condition must be satisfied for any value of a, there follows a restriction on $\rho$,

$$\rho^2 \le 1,$$

which leads to the inequality (3.41).

3.5.4 Independent variables

The random variables $x_1, x_2, \ldots, x_n$ are said to be mutually independent if their joint probability density function is completely factorizable as

$$f(x_1, x_2, \ldots, x_n) = f_1(x_1) f_2(x_2) \cdots f_n(x_n), \qquad \text{(independence)}. \qquad (3.42)$$

This is just a new formulation of the definition of independence by eq.(2.13)
-
Dividing by V(xr). putting a 2 ~ ( x ~ ) 1 ~ ( r l ) oz and using the definition of the
8(xi,x.)
I
- "(x~)"(x.)
1
(3.45)
ables $x_i$ and $x_j$: If $u = u(x_i)$ and $v = v(x_j)$, then u and v are also independent. The proof of this statement is suggested as an exercise for the reader in Sect.3.7 (Exercise 3.7).

function for $x_1$, thus

and similarly for the other n-1 variables. It will be seen that eq.(3.47) represents an application of the definition of marginal probability, eq.(2.22).

In the case of mutually independent variables for which the p.d.f. factorizes according to eq.(3.42), the marginal distribution becomes
with corresponding expressions for $h_2(x_2)$ etc. Because of the overall normalization condition for f(x), one must have

3.5.6 Example: Scatterplots of kinematic variables

In particle physics probability density functions of two variables are encountered in studies of scatterplots, or two-dimensional displays of kinematic variables. The most common of these are presumably the Dalitz plot and the Chew-Low plot.

For definiteness let us think of a reaction

$$a + b \to 1 + 2 + 3$$

at a total centre-of-mass energy $\sqrt{s}$, and let $s_{ij}$, $t_{ai}$ denote, respectively, the squared effective-mass of particles i,j and the squared four-momentum transfer between particles a,i. Then the kinematically allowed region in a Chew-Low display of $t_{a3}$ versus $s_{12}$ is a closed area bounded by a straight line and a hyperbola, see Fig. 3.2(a). The marginal distribution in $s_{12}$ is the projection on the $s_{12}$ axis, giving the one-dimensional distribution in the squared effective-mass of particles 1 and 2. The other marginal distribution for the Chew-Low plot gives the one-dimensional distribution in the squared four-momentum transfer between the initial particle a and the final 3. For the Dalitz plot of, say, $s_{12}$ versus $s_{13}$, both marginal distributions are squared effective-mass spectra, see Fig. 3.2(b).

Lorentz-invariant phase space predicts the density in the Chew-Low plot to be given by the formula

(The quantity $\frac{1}{2x}\lambda^{1/2}(x^2, y^2, z^2)$ corresponds to the magnitude of the momenta when two particles of masses y and z share the centre-of-mass energy x.) Thus according to the phase space prescription the conditional density, given $s_{12}$, is a constant,

$$f(t_{a3}|s_{12}) = \frac{d^2R_3(t_{a3}|s_{12})}{ds_{12}\,dt_{a3}}, \qquad \text{(independent of } t_{a3}\text{)}.$$
In Fig. 3.2(a) this implies that the density is uniform within the kinematic boundary along lines of constant $s_{12}$. The marginal distribution in $s_{12}$ is obtained by integrating over $t_{a3}$, thus

$$h_1(s_{12}) = \int_{t_{a3}(\min)}^{t_{a3}(\max)} f(t_{a3}|s_{12})\,dt_{a3} = f(t_{a3}|s_{12})\,[t_{a3}(\max) - t_{a3}(\min)].$$

Fig. 3.2. Illustration of joint probability and marginal distributions. The shaded areas correspond to the kinematically allowed physical regions in two dimensions (variables) for the reaction $a + b \to 1 + 2 + 3$; (a) $t_{a3}$ versus $s_{12}$ (Chew-Low plot), (b) $s_{13}$ versus $s_{12}$ (Dalitz plot). In both diagrams the degree of shading indicates the density expected according to Lorentz-invariant phase space, and the one-dimensional projections on the two axes give the spectral shapes for the variables involved.
Since the boundary corresponds to configurations where the final state particles are collinear, the square bracket can be evaluated quite easily to give

space is obtained explicitly as

It is seen that this leads to an elliptical integral.

where the two-particle phase space factor $R_2$ is related to the kinematic function $\lambda$ by

$$R_2(x^2; y^2, z^2) = \frac{\pi}{2x^2}\,\lambda^{1/2}(x^2, y^2, z^2).$$

Exercise 3.5: For the three-particle final state Lorentz-invariant phase space predicts the density within the kinematic boundary of the Dalitz plot to be proportional to

$$\frac{d^2R_3}{ds_{12}\,ds_{13}} = \frac{\pi^2}{4s},$$

i.e. constant. Show that the marginal distribution for $s_{12}$ is given by the expression for $h_1(s_{12})$ in the text.

3.5.7 The joint characteristic function

In analogy with the definition of the characteristic function in the case of a single random variable, we now introduce the joint characteristic function $\phi(t_1, t_2, \ldots, t_n)$ for the joint probability density function

in $x_1$, and similarly for $\phi_2(t_2)$. Thus, for independent variables the joint characteristic function is factorizable*). In general, with n independent variables,

$$\phi(t_1, t_2, \ldots, t_n) = \phi_1(t_1)\,\phi_2(t_2) \cdots \phi_n(t_n), \qquad \text{(independence)}. \qquad (3.51)$$

The joint characteristic function may be used to find the general moments of the different variables. The technique is the same as shown before in the case of a single variable. For simplicity, let us again specialize to just two variables, $x_1$ and $x_2$. When

*) In fact the inverse statement also holds: If the joint characteristic function can be factorized, the variables are independent. Thus eq.(3.51) represents a necessary and sufficient condition for independence.
$$\frac{\partial\phi}{\partial(it_1)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1\,e^{it_1x_1 + it_2x_2} f(x_1, x_2)\,dx_1 dx_2.$$

Putting $t_1 = t_2 = 0$, the right-hand side is nothing but the expectation of $x_1$, and so on. Corresponding expressions for the other variable result from derivations with respect to $it_2$. Also

Thus, with two variables,

The expectation of this sum is easily found by using the linearity property of E:

Thus the expectation of a linear combination of variables $x_i$ is the same linear combination of the individual mean values.

The variance of the linear function is slightly more troublesome to evaluate. We have

This can further be written as follows:

$$= \sum_{i} a_i^2 V(x_i) + \sum_{i \ne j} a_i a_j\,\mathrm{cov}(x_i, x_j),$$

or, finally

$$V\Big(\sum_{i=1}^{n} a_i x_i\Big) = \sum_{i=1}^{n} a_i^2 V(x_i), \qquad \text{(uncorrelated variables)}. \qquad (3.56)$$
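The variance of a linear combination can be verified on a small discrete joint distribution. The sketch below is illustrative only: the joint probabilities and coefficients are arbitrary choices, and the function names are hypothetical. It evaluates $\sum_i \sum_j a_i a_j V_{ij}$ from the covariance matrix and compares it with the variance of $y = \sum_i a_i x_i$ computed directly.

```python
from fractions import Fraction

def covariance_matrix(pmf):
    """V_ij = E(x_i x_j) - E(x_i)E(x_j) for a discrete joint distribution {point: prob}."""
    n = len(next(iter(pmf)))
    mean = [sum(p * x[i] for x, p in pmf.items()) for i in range(n)]
    return [[sum(p * x[i] * x[j] for x, p in pmf.items()) - mean[i] * mean[j]
             for j in range(n)] for i in range(n)]

def variance_of_linear_combination(coeffs, cov):
    """Sum over all i, j of a_i a_j V_ij (diagonal terms are the variances)."""
    n = len(coeffs)
    return sum(coeffs[i] * coeffs[j] * cov[i][j] for i in range(n) for j in range(n))

def direct_variance(coeffs, pmf):
    """Variance of y = sum_i a_i x_i evaluated directly from the joint distribution."""
    ys = {x: sum(a * xi for a, xi in zip(coeffs, x)) for x in pmf}
    mean = sum(pmf[x] * ys[x] for x in pmf)
    return sum(pmf[x] * (ys[x] - mean) ** 2 for x in pmf)

# Illustrative correlated joint distribution of (x1, x2).
pmf = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 4),
       (1, 0): Fraction(1, 8), (0, 1): Fraction(1, 8)}
coeffs = [2, -1]

cov = covariance_matrix(pmf)
print(variance_of_linear_combination(coeffs, cov), direct_variance(coeffs, pmf))
```

Setting the off-diagonal elements of `cov` to zero reproduces the simpler uncorrelated form of eq.(3.56).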
3.6.1 Example: Arithmetic mean of independent variables with the same mean and variance

Let $x_1, x_2, \ldots, x_n$ be n mutually independent random variables having the same mean value $\mu_i = \mu$ and the same variance $\sigma_i^2 = \sigma^2$. We then take as a particular linear combination the arithmetic mean $\bar{x}$, or average of the $x_i$,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$$

This is a special case of eq.(3.53) in which all coefficients $a_i$ are equal, $a_i = 1/n$, and hence we get from eqs.(3.54) and (3.56) the expectation and variance of $\bar{x}$;

$$E(\bar{x}) = \mu, \qquad (3.57)$$

$$V(\bar{x}) = \frac{\sigma^2}{n}. \qquad (3.58)$$

3.7 CHANGE OF VARIABLES

It often happens that the probability density function is known for a certain set of variables and that one wants to find what the distribution will be like when a transformation is made to a new set of variables. For instance, given a spectrum of particle momenta one may want to have the corresponding energy spectrum.

Suppose first that x is a continuous random variable with p.d.f. f(x) and that we know a functional dependence

To ensure a non-negative dependence, we take

and this is the answer to our question.

If the transformation (3.59) is not one-to-one, and several segments [x, x+dx] map onto [y, y+dy], one must sum over all segments,

Eq.(3.60) can easily be extended to cover a transformation from a set of variables, $f(x_1, x_2, \ldots, x_n)$, to a second set of variables $y_1, y_2, \ldots, y_n$. The p.d.f. for the new set of variables is

or, for short, in obvious vector notation,

where J is the Jacobian determinant of the transformation, given by
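A one-dimensional change of variables can be sketched as follows. The sketch assumes the standard result $g(y) = f(x)\,|dx/dy|$ for a one-to-one transformation, which is taken here to be what the text's transformation formula expresses; the uniform density and the mapping $y = x^2$ are illustrative choices, and all names are hypothetical.

```python
import math

def transformed_density(f, x_of_y, dx_dy, y):
    """g(y) = f(x(y)) * |dx/dy| for a one-to-one transformation y = y(x)."""
    return f(x_of_y(y)) * abs(dx_dy(y))

def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def f(x):
    """Illustrative p.d.f.: x uniform on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def g(y):
    """Density of y = x^2, using x = sqrt(y) and dx/dy = 1/(2 sqrt(y))."""
    return transformed_density(f, math.sqrt, lambda v: 0.5 / math.sqrt(v), y)

# P(0.25 <= y <= 1) must equal P(0.5 <= x <= 1) = 0.5.
prob_y = integrate(g, 0.25, 1.0)
print(prob_y)
```

The probability content of corresponding intervals is the same in both variables, even though the density itself changes shape.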
that is, a constant density (compare Exercise 3.5). For new variables choose the linear effective masses $M_{12}, M_{13}$. Then the Jacobian of the transformation is

The density in the new variables is therefore

which is not a constant. Thus the nice feature of constant density is lost in the change from squared to linear effective masses.

Exercise 3.8: In the example above, prove that a transformation to the energy variables $E_1, E_2$ yields a constant probability density.

Let us further assume that the covariance matrix $V(\mathbf{x})$ of $\mathbf{x}$ is known. We need not specify whether the $x_i$'s are independent or not, so the form of $V(\mathbf{x})$

$$y(\mathbf{x}) = y(\boldsymbol{\mu}) + \sum_{i=1}^{n} (x_i - \mu_i)\left.\frac{\partial y}{\partial x_i}\right|_{\mathbf{x}=\boldsymbol{\mu}} + \text{terms of higher order}. \qquad (3.66)$$

Taking the expectation value of this expression each first order term will vanish, so that

$$E(y(\mathbf{x})) = y(\boldsymbol{\mu}) + \text{terms of higher order}. \qquad (3.67)$$

Under the assumption that the quantities $(x_i - \mu_i)$ are small, the remaining terms can be dropped to give the approximate result

$$E(y(\mathbf{x})) \approx y(\boldsymbol{\mu}).$$

Introducing this in the formula for the variance of $y(\mathbf{x})$, eq.(3.35), we get

Now we can find an approximate value of the difference $y(\mathbf{x}) - y(\boldsymbol{\mu})$ from eq.(3.66) by dropping all terms of order higher than one.

where the derivatives are evaluated at $\mathbf{x} = \boldsymbol{\mu}$.

which all depend on the n random variables $x_1, x_2, \ldots, x_n$, thus
The formula (3.72) is known as the law of propagation of errors and is of great importance to physicists. In the general case it is to be regarded as only approximately valid, in view of the assumptions made in deriving it (dropping terms of higher order). We have found an expression for the variance of y valid when $\mathbf{x}$ is in the neighbourhood of $\boldsymbol{\mu}$. Note, however, that for the particular case when y has a linear functional dependence on $\mathbf{x}$ all derivatives of second and higher order vanish identically; eq.(3.72) is then exact for all $\mathbf{x}$.

For n mutually independent variables all covariance terms are zero and eq.(3.72) reduces to

A Taylor expansion about $\mathbf{x} = \boldsymbol{\mu}$ leads to

$$y_k(\mathbf{x}) = y_k(\boldsymbol{\mu}) + \sum_{i=1}^{n} (x_i - \mu_i)\left.\frac{\partial y_k}{\partial x_i}\right|_{\mathbf{x}=\boldsymbol{\mu}} + \ldots, \qquad k = 1, 2, \ldots, m,$$

by analogy with eq.(3.66). Taking the expectation value each of the first order terms drops out, and we have

This relation is also exact only for linear functions of $\mathbf{x}$, and otherwise approximately correct to the extent that higher-order terms can be neglected.

Exercise 3.9: If $y = x_1/x_2$ and $x_1$ and $x_2$ are two independent random variables having variances $V(x_1)$ and $V(x_2)$, respectively, show that $V(y) = x_2^{-4}\,(x_2^2 V(x_1) + x_1^2 V(x_2))$, or $V(y)/y^2 = V(x_1)/x_1^2 + V(x_2)/x_2^2$.

$$V_{kl}(\mathbf{y}) \approx \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial y_k}{\partial x_i}\,\frac{\partial y_l}{\partial x_j}\;E((x_i - \mu_i)(x_j - \mu_j)).$$
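For independent variables the law of propagation of errors can be sketched numerically, here in the first-order form $V(y) \approx \sum_i (\partial y/\partial x_i)^2 V(x_i)$ with the derivatives estimated by central differences. The function name, the step size and the numerical inputs are illustrative assumptions; the quotient $y = x_1/x_2$ is used as the test case.

```python
def propagate_variance(func, means, variances, rel_step=1e-6):
    """First-order variance of func(x_1..x_n) for independent x_i (numerical derivatives)."""
    total = 0.0
    for i, (m, v) in enumerate(zip(means, variances)):
        h = rel_step * (abs(m) if m != 0 else 1.0)
        up = list(means)
        up[i] = m + h
        down = list(means)
        down[i] = m - h
        derivative = (func(*up) - func(*down)) / (2.0 * h)  # dy/dx_i at the means
        total += derivative ** 2 * v
    return total

# Illustrative inputs: x1 = 3.0 with variance 0.04, x2 = 2.0 with variance 0.01.
x1, x2, v1, v2 = 3.0, 2.0, 0.04, 0.01
numeric = propagate_variance(lambda a, b: a / b, [x1, x2], [v1, v2])

# Closed-form relative variance of a quotient of independent variables.
analytic = (x1 / x2) ** 2 * (v1 / x1 ** 2 + v2 / x2 ** 2)
print(numeric, analytic)
```

Both numbers describe the same first-order approximation; neither accounts for the higher-order terms that were dropped in the derivation.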
In analogy with eq.(3.72) we write

3.8.2 Example: Variance of arithmetic mean

Let y be the average of a set of n independent variables $x_1, x_2, \ldots, x_n$, all with the same variance $\sigma^2$.

$\mathbf{x} + \mathbf{a}$. Then, to first order,

$$V(\mathbf{y}) = S\,V(\mathbf{x})\,S^T, \qquad (3.80)$$

where $\mathbf{a}$ is an m-component (column) vector of constants and S an m by n matrix describing the linear part of the transformation.

Exercise 3.11: If $P_r = \binom{n}{r} p^r (1-p)^{n-r}$ (the binomial distribution) and the variable r may take any integer value $0, 1, \ldots, n$, show that $E(r) = np$, $V(r) = np(1-p)$.

Exercise 3.12: Given the probability generating function $G(z) = (zp + q)^n$ where $p + q = 1$, show that $E(r) = np$, $V(r) = npq$. (Compare Exercise 3.11.)

and

$$s^2 \equiv \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2. \qquad (3.88)$$

Here $\bar{x}$ is the sample mean, or arithmetic mean (average), which we have already met; the quantity $s^2$ measures the dispersion of the sample about its mean value and is called the sample variance.
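The sample mean and the sample variance of eq.(3.88), with its factor $1/(n-1)$, are computed as in the following sketch. The data values are made up for illustration and the function names are hypothetical.

```python
def sample_mean(xs):
    """Arithmetic mean of the sample."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance with the (n - 1) denominator of eq.(3.88)."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = sample_mean(xs)
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

# Illustrative measurements only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(data), sample_variance(data))
```

Dividing by n instead of n - 1 would give a systematically smaller number for the same data.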
3.10 SAMPLING

3.10.1 Universe and sample

A probability density function f(x) for a continuous random variable, or equivalently the set of probabilities in the discrete case, describes the properties of a population, or universe. In physics one associates random variables with observations on specified physical systems, and the p.d.f. f(x) summarizes the outcome of all conceivable measurements on such a system if the measurements were repeated infinitely many times under the same experimental conditions. Since an infinite number of observations is of course impossible, even on the simplest system, the concept of a population for a physicist represents an idealization which can never be attained in practice.

An actual experiment will consist of a finite number of observations. A sequence of measurements $x_1, x_2, \ldots, x_n$ on some quantity is said to constitute a sample of size n. A sample is accordingly a subset of the population or universe; we may say that "a sample of size n is drawn from the universe". Physicists would like to think that their measurements are typical, in the sense that repeated experiments with the same number of measurements are likely to give more or less the same result. This corresponds to the notion of random samples.

The two quantities $\bar{x}$ and $s^2$ defined here as functions*) of the random variables $x_i$, are themselves random variables. This is clearly so because a repeated drawing of new samples, all of size n, obtained from the same population will produce new $\bar{x}$, $s^2$. Thus the random variables $\bar{x}$ and $s^2$ will have their own distributions. Obviously, these distributions must depend on the properties of the parent distribution, and on n. The study of samples by the distributions of $\bar{x}$ and $s^2$ forms an important part of probability theory.

Particularly interesting are samples drawn from a normal universe. It turns out in this case that the variables $\bar{x}$ and $s^2$ are independent; this property is unique for the normal distribution. Moreover, the resulting distributions for the two variables become especially simple, $\bar{x}$ being normally distributed, and $s^2$ related to a chi-square distribution. This will be discussed further in Sects.4.8.6 and 5.1.6.

*) A function of one or more random variables that does not depend on any unknown parameter is called a statistic. In accordance with this definition $\bar{x}$, as well as $s^2$, may be called a statistic.

3.10.3 Inferences from the sample

A physicist's motivation for undertaking an experiment and performing measurements of physical quantities is that he wants to find out something about "reality"; thus his interest is in some true distribution, or universe. In fact,
he may be prepared to make inferences about this universe on the basis of his restricted number of observations.

Suppose that measurements on the variable x have given the numbers $x_1, x_2, \ldots, x_n$, constituting a sample of size n. Evidently we hope that the sample in some respect is representative of the underlying universe or population. A measure of the population mean value $\mu$ which suggests itself is $\bar{x}$, the arithmetic mean of the sample, as given by eq.(3.87). We may therefore call $\bar{x}$ an estimate of the population mean $\mu$. Similarly, a measure of the population variance $\sigma^2$ is provided by the quantity $s^2$ describing the dispersion of the sample, eq.(3.88). Hence the notion that $s^2$ is the estimate of the population variance $\sigma^2$. We write

The reason for using (n-1) and not n in the expression for $s^2$ is to ensure that $s^2$ is an unbiased estimator of $\sigma^2$; a discussion on this point is given in Sect.8.4.1. An intuitive explanation why we should take (n-1) instead of n is the following: From the sample alone we do not know exactly what the central value $\mu$ of the population is; we only have an estimate, $\mu \approx \bar{x}$, which is subject to uncertainties. As a measure of the dispersion of the population the quantity $\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is therefore likely to be too small, and we should be better off replacing n in the denominator by a smaller number.

When n becomes very large the sample properties will approach the properties of the population, hence

following section. It can be intuitively understood from the observation that the expectation and variance of the variable $\bar{x}$ are, respectively, $E(\bar{x}) = \mu$ and $V(\bar{x}) = \sigma^2/n$ (eqs.(3.57), (3.58)), implying that the spread of $\bar{x}$ about $\mu$ will become small when n is large. Similar results apply for the mean and variance of $s^2$ (Exercise 3.13). Thus, by choosing sufficiently large samples, any desired accuracy can be obtained in the estimates of the population parameters. This property is called consistency of the estimators; see further Sect.8.3.

Exercise 3.13: Show that, in terms of the central moments, $E(s^2) = \sigma^2$,

3.10.4 The Law of Large Numbers

Convergence theorems play a fundamental role in probability theory and statistics and are thoroughly discussed in treatises on the theoretical foundations of these subjects. Since in this book mathematical rigor is considered of less importance compared to practical implications we shall limit our discussion here to the Law of Large Numbers which was mentioned in the preceding section and which will be referred to in later applications.

Let $x_1, x_2, \ldots$ be a set of independent random variables which have identical distributions with mean value $\mu$. For the first n of these variables the arithmetic mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ will also have mean value $\mu$, regardless of the number n. The (weak) Law of Large Numbers states that, given any positive $\epsilon$, the probability that $\bar{x}$ deviates from $\mu$ by an amount more than $\epsilon$ will be zero in the limit of infinite n,

$$\lim_{n\to\infty} P(|\bar{x} - \mu| > \epsilon) = 0. \qquad (3.91)$$

As stated above the theorem concerns the limiting properties of $\bar{x}$ when n approaches infinity. A stronger version of the theorem says something

A theoretically and practically important class of probability distributions, the sampling distributions related to the normal p.d.f., is treated separately in Chapter 5.
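The (weak) Law of Large Numbers, eq.(3.91), can be illustrated by a small simulation. This is a sketch under stated assumptions: the parent distribution is taken to be uniform on [0, 1] (so that $\mu = 0.5$), the seed and sample sizes are arbitrary, and `mean_deviation` is a hypothetical helper name.

```python
import random

def mean_deviation(n, seed=12345):
    """|x_bar - mu| for n draws from a uniform distribution on [0, 1], where mu = 0.5."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    total = sum(rng.random() for _ in range(n))
    return abs(total / n - 0.5)

deviations = {n: mean_deviation(n) for n in (100, 10_000, 1_000_000)}
for n, dev in deviations.items():
    print(n, dev)
```

Individual runs fluctuate, but the typical deviation shrinks roughly like $\sigma/\sqrt{n}$, in line with $V(\bar{x}) = \sigma^2/n$.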
[Figure: the binomial distribution B(r;n,p) plotted against r; panels for n = 5, p = 0.2; n = 5, p = 0.5; and n = 20, p = 0.2.]

The variance of the binomial variable is therefore

$$V(r) = E(r^2) - (E(r))^2 = n(n-1)p^2 + np - (np)^2 = np(1-p).$$

Exercise 4.1:

Exercise 4.2: successes in n trials.

Exercise 4.3: From the definition of the cumulative binomial distribution by eqs.(4.1), (4.5), show that, for $0 \le x \le n-1$,

$$F(x;n,p) = 1 - F(n-x-1;n,1-p).$$

ively,

$$\gamma_1 = \frac{1-2p}{\sqrt{np(1-p)}}, \qquad \gamma_2 = \frac{1-6p(1-p)}{np(1-p)}.$$

Observe from the expression for $\gamma_1$ that, for finite n, p < 0.5 (p > 0.5) implies that the distribution is positively (negatively) skew and has a tail to the right (left). Note also that both coefficients tend to zero when n becomes large, indicating that the binomial distribution becomes similar to the normal distribution (Sects.3.3.3 and 4.8.4).

histogram. With a total of n independent events the probability for having just r events in bin i and the remaining n-r events distributed over the other bins is given by the binomial distribution law, eq.(4.1). The expected number of events in the i-th bin is E(r) = np from eq.(4.6), and the variance of this number V(r) = np(1-p), eq.(4.7).
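The binomial mean and variance quoted above can be checked by direct summation over the distribution. The sketch assumes the usual binomial form $\binom{n}{r} p^r (1-p)^{n-r}$; the values n = 20, p = 0.2 are illustrative, and the function names are hypothetical.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of r successes in n trials with success probability p."""
    return comb(n, r) * p ** r * (1.0 - p) ** (n - r)

def binomial_mean_variance(n, p):
    """Mean and variance obtained by summing over r = 0 ... n."""
    probs = [binomial_pmf(r, n, p) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(probs))
    var = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))
    return mean, var

n, p = 20, 0.2  # illustrative values
mean, var = binomial_mean_variance(n, p)
print(mean, n * p)            # compare with E(r) = np
print(var, n * p * (1 - p))   # compare with V(r) = np(1-p)
```

The same summation applies to the contents of a single histogram bin, with p the probability for an event to fall in that bin.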
$r = k, k+1, \ldots$

(ii) Show that this probability distribution has the properties

(Hint: The probability generating function is $G(z) = [pz/(1-qz)]^k$.)

Note that this distribution is always positively skew.

distribution eq.(4.1) as a case. In addition it has the following properties:

where $\sum_{i=1}^{k} r_i = n$

Notice that the contents in the two classes are always negatively correlated. In terms of the correlation coefficient from eq.(3.40) one has

which approaches zero when $\mu$ gets large. Asymptotically, when $\mu$ goes towards infinity the Poisson distribution becomes identical to the normal distribution. As will be seen from Fig. 4.3 the similarity between these two distributions is rather close already at $\mu = 20$.

From eq.(4.20) we observe that

$$P(r;\mu) = \frac{\mu}{r}\,P(r-1;\mu),$$

and the probability will therefore increase from r = 0, 1, 2 etc. so long as $r < \mu$. The maximum probability is at $r = [\mu]$, and with an equal, adjacent maximum at $\mu - 1$ if $\mu$ is an integer; see Fig. 4.3.

The Poisson distribution of eq.(4.20) has been tabulated in Appendix Table A3 for values of $\mu$ between 0.1 and 20. Appendix Table A4 gives a similar tabulation of the cumulative Poisson distribution of eq.(4.25).

There exists a useful relationship between the cumulative Poisson sum of eq.(4.25) and the cumulative integral of the chi-square distribution, which we shall discuss in Chapter 5.

$$F(x;\mu) = 1 - \int_{0}^{2\mu} f(u;\nu = 2x+2)\,du. \qquad (4.26)$$

Here $f(u;\nu)$ is the chi-square p.d.f. with $\nu$ degrees of freedom, and the quantity on the right-hand side has been displayed graphically for different $\nu$ in Fig.5.2.

[Figure: the Poisson distribution $P(r;\mu)$ plotted against r.]
while t h e p r o b a b i l i t y t h a t t h e r e is no bubble i n t h e i n t e r v a l ie
";+I
dPrW
-=-
da
g ~ ~ ( 9+
, ) gp7_, ( a ) . (4.29)
I
I
I
Exercise 4.16: The number of beam particles per pulse is assumed to be Poisson distributed. If it is known that the average number of particles per pulse is 16, what is the probability that a pulse will have between 12 and 20 particles? (Answer: 0.7411.) See also Exercise 4.40.
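The quoted answer to Exercise 4.16 can be checked by summing the Poisson probabilities directly. The sketch below assumes that "between 12 and 20" includes both end points; the function names are illustrative and not taken from the text. The normal probability asked for in Exercise 4.40 is computed alongside for comparison.

```python
import math

def poisson_pmf(r, mu):
    """Poisson probability P(r; mu) = mu**r * exp(-mu) / r!, evaluated via logarithms."""
    return math.exp(-mu + r * math.log(mu) - math.lgamma(r + 1))

def poisson_interval(lo, hi, mu):
    """Probability that lo <= r <= hi (inclusive) for a Poisson variable of mean mu."""
    return sum(poisson_pmf(r, mu) for r in range(lo, hi + 1))

def normal_interval(lo, hi, mean, variance):
    """Probability that lo <= x <= hi for a normal variable N(mean, variance)."""
    s = math.sqrt(2.0 * variance)
    return 0.5 * (math.erf((hi - mean) / s) - math.erf((lo - mean) / s))

# Exercise 4.16: Poisson with mean 16; Exercise 4.40: normal with mean = variance = 16.
p_poisson = poisson_interval(12, 20, 16.0)
p_normal = normal_interval(12.0, 20.0, 16.0, 16.0)
print(f"Poisson: {p_poisson:.4f}   normal: {p_normal:.4f}")
```

The normal value is evaluated on the continuous interval [12, 20] without any continuity correction, which is one reason why the two numbers differ noticeably.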
" . r ". , . . , , .
which gives the distribution of the number of bubbles r in intervals of fixed length $\ell$. It is seen that eq.(4.30) gives a Poisson distribution with the parameter $(g\ell)$. Specifically it includes eq.(4.28) as a special case.
It may be appropriate to emphasize that eq.(4.30) describes the frequency distribution for the discrete variable r, with $\ell$ (or, strictly speaking, $g\ell$) as a parameter of the distribution. From the Poisson assumptions one can also turn the problem around and seek the distribution in the continuous variable $\ell$ for specified values of the parameter r. In other words, one can ask for the probability to have a total distance $\ell$ of the track to find exactly r bubbles, given that the average probability to find a bubble is constant along the track and equal to g per unit length. This problem will be investigated later in Sects. 4.6 and 4.7, and as we shall see, it will lead us to a specific class of the gamma distributions, including the well-known exponential distribution law.

The candidates could, however, also be interpreted as background events due to the neutron reactions
$$n + p \to n + p + \pi^0,$$
$$n + p \to n + n + \pi^+.$$
From identified events of the type $n + p \to p + p + \pi^-$ the expected number of background events was estimated to be 4.9. Assuming that the number of background events is Poisson distributed with mean value 4.9, what is the probability to have Q or more background events? Do you consider that this experiment indicates the presence of weak neutral currents?

4.3.3 Example: Radioactive emissions
A frequently cited example on a Poisson process is that of particle emission from a radioactive source. If the particles are emitted from the source at an average rate of $\lambda$ particles per unit time the number of emissions r in fixed time intervals t follows a Poisson law with mean $\lambda t$.
Suppose now that the source is placed in surroundings where the background of radioactive emissions is given by an average rate $\lambda_b$ particles per unit time. Then the number $r_b$ of background emissions in time intervals of length t is Poisson distributed with mean $\lambda_b t$,

4.4 RELATIONSHIPS BETWEEN THE POISSON AND OTHER PROBABILITY DISTRIBUTIONS
The Poisson distribution has some interesting connections to other probability distributions which are useful for physical applications. We will in the following indicate how the same mathematical relationships between the Poisson and other distributions can come out when a physical problem is attacked from different viewpoints which may at first appear rather dissimilar. We study in particular the connections between the Poisson and binomial/multinomial distribution laws which are important for many practical problems. In the final
which is nothing but a Poisson distribution with mean value $p\nu$.
An inefficient counter of the above type has the property that it picks a random sample from the parent population. With a Poisson population the random sample was found to be of the Poisson type. Conversely, the sample will only be Poisson distributed if the population was Poisson.

4.4.2 Example: Subdivision of a counting interval
We assume that a detector registers particles over periods of T seconds and that the number of counts n follows a Poisson law with mean value $\nu = \lambda T$. We want to find the distribution describing the number of counts r in an interval of t seconds, where t < T.
From the specification above the counts occur at a rate of $\lambda$ counts per second. Since the basic process is of the Poisson type the occurrences of events in two non-overlapping time intervals are independent. The probability to have r counts in the interval t and n-r counts in the remaining time T-t is therefore equal to the product of the two Poisson probabilities,
$$P(r) = P(r;\lambda t)\,P(n-r;\lambda(T-t)).$$
This is not the probability distribution we seek, because the normalization is not correct. We need the conditional probability to have r and n-r counts, given a total of n counts. The conditional probability is therefore obtained by dividing P(r) by the Poisson probability for n counts in the time T, thus
$$P(r \text{ in } t,\ n-r \text{ in } T-t \mid n \text{ in } T) = \frac{P(r;\lambda t)\,P(n-r;\lambda(T-t))}{P(n;\lambda T)}.$$
Substituting the explicit Poisson probabilities on the right-hand side and rearranging terms we get
$$\binom{n}{r}\left(\frac{t}{T}\right)^{r}\left(1-\frac{t}{T}\right)^{n-r},$$
which is a binomial distribution in r with parameter t/T.

... thought a "success" would correspond to having the count occur in the time t, while a "failure" would be that it occurred in the remaining time T-t. With an average counting rate n/T per second, the success rate, or the probability for each count occurring in the time t, is $p = \frac{1}{n}(n/T)\,t = t/T$. Thus the probability for r counts in t and n-r counts in T-t is given by the expression above.

4.4.3 Relation between binomial and Poisson distributions. Example: Forward-backward classification
The mathematical relationship between the binomial and the Poisson probabilities of the preceding section can be derived from an alternative point of view, assuming two variables related in a binomial distribution, with their sum obeying a Poisson law. The formulation below involving two variables can easily be generalized to several variables, see the subsequent section.
For the moment, let us suppose that we make a classification of particles in two categories, "forward" and "backward", according to their production angle in an overall centre-of-mass system. The total number of particles n is assumed to be a Poisson variable with mean value $\nu$. For any n the numbers of particles in the forward (f) and backward (b) hemispheres are conditioned through a binomial law, which we write
[...]
Here p and q are the constants describing the fractions of forward and backward particles, respectively. The joint probability distribution for all three variables f, b and n becomes therefore
Suppose that k variables $r_1, r_2, \ldots, r_k$ are dependently distributed according to the multinomial distribution law, eq.(4.16), in such a way that
[...]
Since $\sum_{i=1}^{k} p_i = 1$, $\sum_{i=1}^{k} r_i = n$, this can be organized to give
[...]
from which all moments can be derived. Specifically, the mean and variance come out as
[...]
The compound Poisson distribution is applicable whenever a random process of the Poisson type initiates another. In nature, one can find many examples of chained reactions, where the products from one type of reaction give rise to a second generation of random events. For instance, it has been suggested (by R.K. Adair and H. Kasha) that the formation of droplets along the tracks of charged particles in a cloud chamber provides an example on chained Poisson

according to the addition theorem for Poisson variables (see Sect.4.3.3 and Exercise 4.18) is the Poisson distribution $P(r;n\mu)$. Hence
[...]
(ii) If the constituent variables $r_i$ have the probability generating function g(z), show that the probability generating function of r is given by
[...]
(iii) Show that the probability generating function has the factorization property
$$G(z, t_1 + t_2) = G(z, t_1)\,G(z, t_2)$$
and interpret the result.
Compare Exercise 4.20.
If we make the substitution
$$u = F(x) = \int_{-\infty}^{x} f(x')\,dx',$$
the new variable u will cover the region $0 \le u \le 1$ and be uniformly distributed, because from eq.(3.60) its p.d.f. becomes
$$g(u) = f(x)\left|\frac{dx}{du}\right| = 1.$$
An example on the usefulness of this transformation is given in Sect.10.4.4.

Exercise 4.22: Show that the skewness and kurtosis coefficients for the uniform distribution are $\gamma_1 = 0$ and $\gamma_2 = -1.2$, respectively.

(4.40)
where g is some generating function and k is usually 1 or 2. Each number $x_{i+1}$ in the sequence will therefore be completely determined by its predecessors and the given starting value(s). Thus the sequence is not random, but it will appear to be so for most practical applications. The sequence will always be periodic, with a cycle of numbers which is repeated endlessly. The length of the periodic cycle is determined by the chosen algorithm, and has an upper limit implied by the computer word length*).
Random numbers generated by recursion formulae are in the technical literature called pseudo-random or quasi-random.

*) A specific algorithm producing sequences of period m is given by the linear congruential method as $x_{i+1} = (a x_i + c) \bmod m$, where the constants have been properly adjusted and the user only selects the starting value $x_0$.
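The linear congruential recursion of the footnote can be sketched in a few lines. The constants used below are assumptions made for illustration only: the small set (a = 5, c = 3, m = 16) is chosen because its cycle can be counted by hand, and the larger set is a commonly quoted 32-bit choice; neither is taken from the text.

```python
def lcg_sequence(seed, count, a, c, m):
    """Return `count` integers from x_{i+1} = (a*x_i + c) mod m, starting after `seed`."""
    x = seed % m
    out = []
    for _ in range(count):
        x = (a * x + c) % m
        out.append(x)
    return out

def lcg_uniform(seed, count, a=1664525, c=1013904223, m=2**32):
    """Map the integer sequence to floats in [0, 1). Default constants are an assumed example."""
    return [x / m for x in lcg_sequence(seed, count, a, c, m)]

def cycle_length(seed, a, c, m):
    """Length of the periodic cycle eventually entered when starting from `seed`."""
    first_seen = {}
    x = seed % m
    i = 0
    while x not in first_seen:
        first_seen[x] = i
        x = (a * x + c) % m
        i += 1
    return i - first_seen[x]

# A toy generator whose full cycle of m = 16 numbers is easy to inspect.
print(lcg_sequence(7, 16, a=5, c=3, m=16))
print(cycle_length(7, a=5, c=3, m=16))
print(lcg_uniform(2024, 5))
```

Because each number is fixed by its predecessor, the same seed always reproduces the same sequence, which is convenient when a simulation has to be repeated exactly.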
Exercise 4.25: A random number generator produces numbers $x_i$ which are uniformly distributed between 0 and 1. We seek a generator for a new random number y defined over the interval [A,B], which corresponds to the probability density f(y). Defining a function $y_i = y(x_i)$ by
$$x_i = \int_A^{y(x_i)} f(t)\,dt,$$
show that $y_i$ has the desired distribution. Specifically, show that the exponential distribution $f(y) = \exp(-y)$ for $0 \le y \le \infty$ can be simulated by taking
$$y_i = -\ln(1 - x_i).$$

4.6 THE EXPONENTIAL DISTRIBUTION
4.6.1 Definition and properties

4.6.2 Derivation of the exponential p.d.f. from the Poisson assumptions
Let us go back to our earlier example (Sect.4.3.2) with bubbles along the track of a charged particle in a bubble chamber, with the constant g giving the number of (point-like) bubbles per unit length of the track.
We want now to find an expression for the probability that the first bubble on a track occurs at a distance $\ell$ from the chosen origin. Since "the first bubble in the interval $[\ell, \ell + \Delta\ell]$" is equivalent to having no bubble in $[0, \ell]$ and one bubble in the non-overlapping interval $[\ell, \ell + \Delta\ell]$, the joint probability for the occurrence of these two independent "events" is the product of the probabilities for the individual events. Now we know that the probability for no bubble over the length $\ell$ is $e^{-g\ell}$ (from eq.(4.28), or the Poisson formula with r = 0) and that the probability for one bubble in $\Delta\ell$ is $g\Delta\ell$ (the Poisson assump-
Review the considerations of the last section in this context. In particular, find arguments for the common statement that "the exponential distribution has no memory". This property is very useful in practice since it, for example, allows one to measure the lifetimes of unstable particles starting from an arbitrary time t after the time (t = 0) they were created. See also Exercise 4.28 below.

Exercise 4.29: Decays from a radioactive source of decay constant $\lambda$ are registered by a Geiger counter of resolution time $t_0$ (i.e. a detector which fails in recording an event if it occurs separated in time from the previous event by an amount less than $t_0$). If N' counts are registered over the time t, show that the number of decays N that have actually taken place in the radioactive source is given by
N = [ ... ] N' ≡ aN',
where a > 1.
Assume that the registered number of counts is a Poisson variable of estimated mean value $\hat{\nu} = N'$. Show that the estimated variance in the true number of decays is V(N) = aN > N, whereas with a perfect detector, $t_0 \ll 1/\lambda$, the estimated variance would have been V(N) = N.

4.7 THE GAMMA DISTRIBUTION
4.7.1 Definition and properties
With $\alpha$ and $\beta$ as two positive constants we define the gamma distribution by
$$f(x;\alpha,\beta) = \frac{x^{\alpha-1}\,e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}, \qquad 0 \le x < \infty. \qquad (4.52)$$
This function is seen to be properly normalized, in virtue of the gamma function.
The formula (4.52) produces a variety of shapes for different values of the constant $\alpha$, as indicated in Fig. 4.5. For $\alpha \le 1$ the distribution is J-shaped, while $\alpha > 1$ gives a unimodal distribution with maximum at $x = (\alpha - 1)\beta$. The parameter $\beta$ is only a scale factor.
When $\alpha$ is an integer, $\alpha = k$, $\Gamma(k) = (k-1)!$ and the p.d.f. (4.52) is called the Erlangian distribution; this distribution law can be derived from first principles and is discussed in the following section. The special case $\alpha = 1$ corresponds to the exponential distribution, which was treated already in Sect.4.6.
The special case $\beta = 2$, $\alpha = \nu/2$, where $\nu$ is an integer, is equivalent to the chi-square distribution with $\nu$ degrees of freedom; this important distribution is discussed quite extensively in Chapter 5.
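The special cases just listed can be checked numerically. The sketch below assumes the scale parametrization $f(x;\alpha,\beta) = x^{\alpha-1}e^{-x/\beta}/(\Gamma(\alpha)\beta^{\alpha})$, which is the form consistent with the stated maximum at $x = (\alpha-1)\beta$ and with the chi-square case $\beta = 2$, $\alpha = \nu/2$; the function names and the simple midpoint integration are illustrative choices.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma p.d.f. with shape alpha and scale beta (assumed parametrization, see text above)."""
    if x <= 0.0:
        return 0.0  # the density is taken as zero outside x > 0
    log_f = ((alpha - 1.0) * math.log(x) - x / beta
             - math.lgamma(alpha) - alpha * math.log(beta))
    return math.exp(log_f)

def exponential_pdf(x, beta):
    """Special case alpha = 1: exponential distribution with mean beta."""
    return gamma_pdf(x, 1.0, beta)

def chi_square_pdf(x, nu):
    """Special case beta = 2, alpha = nu/2: chi-square with nu degrees of freedom."""
    return gamma_pdf(x, 0.5 * nu, 2.0)

def integrate(f, a, b, n=20000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Normalization and mode for alpha = 3, beta = 2 (mode expected at (alpha - 1) * beta = 4).
norm = integrate(lambda x: gamma_pdf(x, 3.0, 2.0), 0.0, 80.0)
print(norm, gamma_pdf(4.0, 3.0, 2.0), chi_square_pdf(2.0, 4), exponential_pdf(1.0, 1.0))
```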
4.7.2 Derivation of the gamma p.d.f. from the Poisson assumptions
From a physicist's point of view the importance of the gamma distribution lies mainly in the fact that it, for integer values of the parameter $\alpha$, can be derived from the Poisson assumptions and consequently describes random processes of the Poisson type.
For definiteness, let us take the random variable to describe a time interval. If $\lambda$ is the constant average number of events (decays, accidents, etc.) per unit time, the Poisson prediction for the number of events r in the time t is (eq.(4.30))
coefficients given by, respectively,
$$\gamma_1 = 2/\sqrt{\alpha}, \qquad \gamma_2 = 6/\alpha.$$

Exercise 4.34: (The beta distribution)
While the gamma distribution describes variables which are bounded at one side, the beta distribution

*) This is recognized as a negative binomial distribution with parameter $p = \lambda/(1+\lambda)$ (Exercise 4.5) for which the mean value and variance are $E(s) = k/\lambda$ and $V(s) = k/\lambda + k/\lambda^2$, respectively.

P(T) - If10 0 (u;V-20)du 0.03.
$$G(y) = \int_{-\infty}^{y} g(y')\,dy' = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{y} e^{-\frac{1}{2}y'^2}\,dy', \qquad (4.64)$$
$N(\mu,\sigma^2)$ of eq.(4.59). We want to find the probability that x falls between a lower limit a and an upper limit b. Clearly,
[...]
Fig. 4.7. (a) The standard normal p.d.f. N(0,1). (b) The standard normal distribution.
[...]
standardized variable $(x-\mu)/\sigma$, and we have
[...]
Hence
[...]
correspond to probabilities $P(\mu - 2\sigma \le x \le \mu + 2\sigma) = 0.9545$ and $P(\mu - 3\sigma \le x \le \mu + 3\sigma) = 0.9973$, respectively, (eq.(4.67)). The dotted lines indicate the extension of the regions which correspond to probabilities 0.90 and 0.95, (eq.(4.69)).
Exercise 4.40: If x is normally distributed with mean and variance both equal to 16, what is the probability that $12 \le x \le 20$? Compare Exercise 4.16.

4.8.4 Central moments; the characteristic function
The central moments of the normal distribution of eq.(4.59) can be obtained from the general definition by eq.(3.14). Clearly all odd central moments vanish because of the symmetry property of the normal p.d.f. The even moments can be evaluated in a straightforward way by carrying out integrations of the type
[...]
One finds
$$\mu_{2k} = \frac{(2k)!}{2^k\,k!}\,\sigma^{2k}, \qquad k = 0, 1, 2, \ldots \qquad (4.71)$$
Thus all central moments of the normal distribution can be expressed in terms of the variance $\sigma^2$. The lowest of the even moments are
$$\mu_2 = \sigma^2, \qquad \mu_4 = 3\sigma^4, \qquad \mu_6 = 15\sigma^6.$$

This form of the characteristic function is used to derive many theoretically important properties of the normal distribution (see, for example, the following sections). For the purpose of evaluating central moments of the normal p.d.f. one may consider instead the function (eq.(3.28))
$$\phi_\mu(t) = e^{-i\mu t}\,\phi(t) = e^{-\frac{1}{2}\sigma^2 t^2},$$
which has the series expansion
$$\phi_\mu(t) = \sum_{k=0}^{\infty} \frac{1}{k!}\left(-\tfrac{1}{2}\sigma^2 t^2\right)^k.$$
According to eq.(3.27) the central moments are found by taking the derivatives of $\phi_\mu(t)$ with respect to (it) and putting t = 0. Since only even powers of t appear in the sum, all odd derivatives will vanish when evaluated for t = 0. Thus all odd central moments will be zero and only even moments survive; it will be seen that eq.(4.71) is regained, as expected.
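The even central moments of the normal distribution follow the closed form $\mu_{2k} = (2k)!\,\sigma^{2k}/(2^k k!)$. As a rough check, the sketch below lists the lowest ones and compares them with a direct numerical integration of the normal p.d.f.; the integration range of 12 standard deviations and the midpoint rule are arbitrary choices made for this illustration.

```python
import math

def even_central_moment(k, sigma):
    """Closed form mu_{2k} = (2k)! / (2**k * k!) * sigma**(2k) for a normal variable."""
    return math.factorial(2 * k) / (2 ** k * math.factorial(k)) * sigma ** (2 * k)

def normal_pdf(x, mu, sigma):
    """Normal p.d.f. N(mu, sigma**2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def central_moment_numeric(order, mu, sigma, half_width=12.0, n=40000):
    """Midpoint integration of (x - mu)**order * pdf over mu +/- half_width * sigma."""
    a = mu - half_width * sigma
    h = 2.0 * half_width * sigma / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += (x - mu) ** order * normal_pdf(x, mu, sigma)
    return total * h

sigma = 2.0
for k in (1, 2, 3):
    print(2 * k, even_central_moment(k, sigma), central_moment_numeric(2 * k, 1.0, sigma))
print(3, central_moment_numeric(3, 1.0, sigma))  # odd moments should vanish
```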
We consider the sum
$$y = a_1 x_1 + a_2 x_2,$$
for which
$$\phi_y(t) = \phi_{a_1 x_1}(t)\,\phi_{a_2 x_2}(t) = e^{(a_1\mu_1 + a_2\mu_2)it - \frac{1}{2}(a_1^2\sigma_1^2 + a_2^2\sigma_2^2)t^2}.$$
But this is nothing but a new characteristic function of the form of eq.(4.74); hence y is distributed as $N(a_1\mu_1 + a_2\mu_2,\ a_1^2\sigma_1^2 + a_2^2\sigma_2^2)$.
The proof can clearly be extended to any number of independent, normal variables. So we may formulate the addition theorem for normally distributed variables as follows:
Let $x_1, x_2, \ldots, x_n$ be independent, normally distributed variables such that $x_i$ is $N(\mu_i, \sigma_i^2)$. Then the linear combination $y = \sum_{i=1}^{n} a_i x_i$ is also a normally distributed variable, with mean $\sum_{i=1}^{n} a_i\mu_i$ and variance $\sum_{i=1}^{n} a_i^2\sigma_i^2$.

Furthermore, $\bar{x}$ and $s^2$ for the normal sample are independent variables (see Exercise 5.9). This is a property which is specific for the normal sample. The outstanding position of the normal distribution in this respect is explicit in the following important theorem:
Given that the random variables $x_1, x_2, \ldots, x_n$ are independent, with identical normal distributions, then the two variables (statistics) $\bar{x} = \sum_{i=1}^{n} x_i/n$ and $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2/(n-1)$ are independent. Conversely, if the mean $\bar{x}$ and variance $s^2$ of random samples from a population are independent, that population must be normal.
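The statement that the sample mean and sample variance are independent only for a normal population can be illustrated, though not proved, by simulation. The sketch below estimates the correlation between $\bar{x}$ and $s^2$ over many samples; a vanishing correlation is only a necessary sign of independence. The sample size, the number of samples, the seed and the exponential comparison population are all arbitrary choices made for this illustration.

```python
import random

def sample_mean_and_variance(xs):
    """Return the sample mean and the unbiased sample variance s^2 (divisor n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var

def correlation(us, vs):
    """Sample correlation coefficient of two equally long lists."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    cov = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    su = sum((u - mu) ** 2 for u in us)
    sv = sum((v - mv) ** 2 for v in vs)
    return cov / (su * sv) ** 0.5

def mean_variance_correlation(draw, n=10, n_samples=20000, seed=1):
    """Correlation between the mean and variance of n_samples samples of size n."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(n_samples):
        m, v = sample_mean_and_variance([draw(rng) for _ in range(n)])
        means.append(m)
        variances.append(v)
    return correlation(means, variances)

rho_normal = mean_variance_correlation(lambda rng: rng.gauss(0.0, 1.0))
rho_expo = mean_variance_correlation(lambda rng: rng.expovariate(1.0))
print(f"normal population: {rho_normal:+.3f}   exponential population: {rho_expo:+.3f}")
```

For the normal population the estimated correlation should be compatible with zero, while the skew exponential population gives a clearly positive value.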
Thus, when $n \to \infty$, $\phi(t) \to \exp(-\frac{1}{2}t^2)$. But this is precisely the characteristic function for a normal distribution with mean 0 and variance 1 (see eq.(4.74)). In the limit of very large n, therefore, the variable z has the distribution N(0,1), in accordance with the statement made; this completes the proof for the Central Limit Theorem for this particular case.
The Central Limit Theorem is responsible for many other theorems and statements in the theory of probability and statistics. In fact, we have already seen the theorem in operation earlier in this chapter, in the observed tendency towards normality for increasing n of the (discrete) binomial and Poisson distributions.

4.8.9 Example: Gaussian random number generator
An illustrative example with a practical application of the Central Limit Theorem is the construction of a generator for normally distributed random numbers.
Suppose that there is available an ordinary random number generator which delivers numbers uniformly distributed over the interval [0,1], as described in Sect.4.5.2, with a mean value $\frac{1}{2}$ and a variance $\frac{1}{12}$. Let $x_i$ be the i-th number in a sequence of n consecutive numbers from this generator. Then, according to the Central Limit Theorem, in the limit of large n, the variable
$$z = \frac{\sum_{i=1}^{n} x_i - n/2}{\sqrt{n/12}}$$
will have a distribution which is exactly normal, with mean zero and variance one. A reasonable approximation to the normal distribution is obtained for n values as small as 10. A practical choice is to take n = 12, which gives simply
$$z = \sum_{i=1}^{12} x_i - 6.$$

and $|\rho| \le 1$. As one may suspect from the notation the parameters $\mu_1, \mu_2, \sigma_1^2, \sigma_2^2$ will have the meaning of mean values and variances, respectively, while $\rho$ measures the correlation between the two variables.
To see this, it is convenient to work out the characteristic function for $x_1$ and $x_2$ with the p.d.f. (4.77) from the general definition of eq.(3.50); the result is
$$\phi(t_1,t_2) = \exp\left\{ it_1\mu_1 + it_2\mu_2 + \tfrac{1}{2}\left[(it_1)^2\sigma_1^2 + (it_2)^2\sigma_2^2 + (it_1)(it_2)\,2\rho\sigma_1\sigma_2\right]\right\}. \qquad (4.78)$$
Other interesting features of the binormal distribution are contained in the exercises below; see in particular 4.46-4.48.
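Returning to the generator of Sect.4.8.9, the n = 12 prescription $z = \sum x_i - 6$ can be sketched as follows. Python's `random.Random` stands in for the "ordinary" uniform generator of the text; the seed and the number of generated values are arbitrary choices for this illustration.

```python
import random

def gauss_clt(rng, n=12):
    """Approximate N(0,1) number: (sum of n uniforms - n/2) / sqrt(n/12).

    For n = 12 the denominator is 1, so this reduces to sum - 6.
    """
    total = sum(rng.random() for _ in range(n))
    return (total - 0.5 * n) / (n / 12.0) ** 0.5

rng = random.Random(12345)
zs = [gauss_clt(rng) for _ in range(50000)]
mean = sum(zs) / len(zs)
var = sum((z - mean) ** 2 for z in zs) / (len(zs) - 1)
print(f"mean = {mean:+.4f}, variance = {var:.4f}, max |z| = {max(abs(z) for z in zs):.3f}")
```

One limitation is visible directly from the construction: with n = 12 the generated values can never lie outside the range from -6 to +6, so the extreme tails of the normal distribution are not reproduced.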
In substituting $f(x_1,x_2)$ from eq.(4.77) the factors can here be organized to give
$$h_1(x_1) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\,e^{-\frac{1}{2}(x_1-\mu_1)^2/\sigma_1^2}\left\{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma_2\sqrt{1-\rho^2}}\,e^{-\frac{1}{2}(x_2-C)^2/\sigma_2^2(1-\rho^2)}\,dx_2\right\},$$
where C in the exponent of the integral is independent of the integration variable, $C = \mu_2 + \rho\,(\sigma_2/\sigma_1)(x_1-\mu_1)$. The curly bracket is therefore an integrated p.d.f., which gives just 1. Hence the marginal distribution in $x_1$ is
$$h_1(x_1) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\,e^{-\frac{1}{2}(x_1-\mu_1)^2/\sigma_1^2}.$$

Exercise 4.43: Let $x_1$ and $x_2$ be two independent variables which are uniformly distributed between 0 and 1. Show that two new variables $z_1 = \sqrt{-2\ln x_1}\,\cos(2\pi x_2)$, $z_2 = \sqrt{-2\ln x_1}\,\sin(2\pi x_2)$ will be binormally distributed, with marginal distributions N(0,1) and zero correlation.

Exercise 4.44: Verify that the covariance matrix and its inverse for two variables with the binormal distribution (4.77) are given by, respectively,
$$V = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}, \qquad V^{-1} = \frac{1}{1-\rho^2}\begin{pmatrix} 1/\sigma_1^2 & -\rho/\sigma_1\sigma_2 \\ -\rho/\sigma_1\sigma_2 & 1/\sigma_2^2 \end{pmatrix}.$$

Exercise 4.46: Show that $y = a_1x_1 + a_2x_2 = a^T x$, with $x_1$ and $x_2$ related in the binormal p.d.f., is $N(a_1\mu_1 + a_2\mu_2,\ a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + 2a_1a_2\rho\sigma_1\sigma_2)$. (Hint: Find the characteristic function for y.)
Exercise 4.47: (i) Show that two variables constructed as linear combinations of the variables of a binormal distribution (4.77) will also be binormally distributed. (Hint: Write $y_1 = a_1x_1 + a_2x_2$, $y_2 = b_1x_1 + b_2x_2$ and show that the characteristic function $\phi(t_1,t_2)$ for $y_1$ and $y_2$ is of the form of eq.(4.78).)
(ii) Writing $a = \{a_1,a_2\}$, $b = \{b_1,b_2\}$, show that the new binormal distribution has the marginal distributions $N(a^T\mu,\ a^TVa)$ and $N(b^T\mu,\ b^TVb)$ for $y_1$ and $y_2$, respectively, and that their covariance is $\mathrm{cov}(y_1,y_2) = a^TVb$. Hence $y_1$ and $y_2$ will be independent if, and only if, the transformation makes $a^TVb = 0$.

Exercise 4.48: Let $x_1, x_2$ be related in the binormal p.d.f. of eq.(4.77).
(i) Show that a change of variables to $y_1, y_2$ by the orthogonal transformation
[...]
brings the p.d.f. over to the form
[...]
makes $Q = z_1^2 + z_2^2$ and
$$g(z_1,z_2) = \left[\frac{1}{\sqrt{2\pi}}\exp(-\tfrac{1}{2}z_1^2)\right]\cdot\left[\frac{1}{\sqrt{2\pi}}\exp(-\tfrac{1}{2}z_2^2)\right].$$

helix relative to the same plane. Further, experience has shown that each of these quantities under measurements can be considered as a normally distributed variable, with a spread around the central (true) value as implied by the accuracy of the measuring system; in addition, $1/\rho$ and $\phi$ are correlated. To simulate an experiment by the Monte Carlo technique one therefore needs a prescription for obtaining sets of random, normal variables, such that two of them possess a mutual relationship corresponding to a binormal distribution of specified correlation, corresponding to that between the curvature and the azimuthal angle.
We assume that a generator for Gaussian random numbers is available, which upon call produces a "random number" z such that, in the long run, z will be nearly N(0,1). Any independent normal variable of mean value $\mu$ and standard deviation $\sigma$ is then simply constructed as $\mu + \sigma z$.
Suppose that two dependent variables $x_1$ and $x_2$ are required to have
[...]
distributed with given covariance matrix.

The normal distributions of dimension 1 and 2, as defined by eqs.(4.59) and (4.77), respectively, lead us to search a multinormal or n-dimensional normal
[...]
matrix takes the general form
$$Q = (x - \mu)^T\,V^{-1}\,(x - \mu) \qquad (4.87)$$
$$\lim_{L,L'\to\infty}\int_{-L'}^{L} x\,f(x)\,dx$$
does not exist, although the principal value, defined with $L' = L$, does exist and is equal to zero. Following convention we shall regard the distribution of eq.(4.88) as not possessing a mean. The same convergence situation applies to all other moments $x^k$. Thus we may say that for the Cauchy distribution no moments are defined, since they all diverge.
One way of getting out of the dilemma is to impose a restriction on the domain of the variable x. The integral of f(x) over a finite interval $[-L,+L]$ is equal to $(2/\pi)\tan^{-1}L$. If we therefore redefine our f(x) by this normalization factor and write
$$f(x) = \frac{1}{2\tan^{-1}L}\,\frac{1}{1+x^2}, \qquad -L \le x \le L,$$
the mean is zero and the variance becomes
$$V(x) = \frac{L}{\tan^{-1}L} - 1.$$
This expression for the variance illustrates the state of affairs for the Cauchy p.d.f. (4.88): the tails of this distribution tend so slowly to zero that convergence is prevented. Indeed, when L is permitted to grow indefinitely, the variance can become arbitrarily large, since $V(x) \to \infty$ when $L \to \infty$.
For moderate x values the shape of the Cauchy distribution (4.88) is not very different from the standard normal, as one can see from Fig. 4.9. A quantitative expression of their different tail behaviour is provided by the following table, which gives the fraction of each distribution in both tails beyond the indicated values of |x|. For comparison, the table also shows the corresponding fractions for the double exponential distribution (see Exercise 4.55), which in this respect is seen to have an intermediate behaviour.

Fraction of distribution in tails
Distribution          |x|>1    |x|>2    |x|>3    |x|>4     |x|>6
Standard normal       .3173    .0455    .0027    .00006
Double exponential    .3679    .1353    .0498    .0183     .0025
Cauchy                .5000    .2952    .2048    .1560     .1051

Fig. 4.9. The Cauchy or Breit-Wigner distribution (solid curve) and the standard normal distribution (dashed curve). The half-widths at half-maximum are indicated by arrows of length 1 and ≈ 1.18, respectively.

Exercise 4.52: Show that the Breit-Wigner formula for an s-wave resonance of central value $M_0$ and full width $\Gamma$ at half maximum,
$$f(M;M_0,\Gamma) = \frac{1}{\pi}\,\frac{\frac{1}{2}\Gamma}{(M-M_0)^2 + (\frac{1}{2}\Gamma)^2},$$
corresponds to a Cauchy distribution. Note that with this form, half-maximum occurs for $M = M_0 \pm (\frac{1}{2}\Gamma)$, whereas a Gaussian shape $N(M_0,(\frac{1}{2}\Gamma)^2)$ has half-maximum at $M = M_0 \pm 1.18(\frac{1}{2}\Gamma)$; compare Exercise 4.37.
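The tail fractions quoted in this section for the standard normal, double exponential and Cauchy distributions can be reproduced from closed-form expressions. The sketch below assumes the standard forms N(0,1), $f(x) = \frac{1}{2}e^{-|x|}$ and $f(x) = 1/\pi(1+x^2)$; the function names are illustrative.

```python
import math

def normal_tail(a):
    """Two-sided tail P(|x| > a) for the standard normal N(0,1)."""
    return math.erfc(a / math.sqrt(2.0))

def double_exponential_tail(a):
    """Two-sided tail P(|x| > a) for f(x) = 0.5 * exp(-|x|)."""
    return math.exp(-a)

def cauchy_tail(a):
    """Two-sided tail P(|x| > a) for f(x) = 1 / (pi * (1 + x**2))."""
    return 1.0 - 2.0 / math.pi * math.atan(a)

print(f"{'|x| >':<8}{'normal':>12}{'double exp':>12}{'Cauchy':>12}")
for a in (1, 2, 3, 4, 6):
    print(f"{a:<8}{normal_tail(a):>12.2e}{double_exponential_tail(a):>12.4f}{cauchy_tail(a):>12.4f}")
```

The slow fall-off of the Cauchy column is the numerical counterpart of the statement that its moments do not converge.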
Exercise 4.53: Show that the characteristic function for the Cauchy p.d.f. is
$$\phi(t) = e^{-|t|}.$$
Note that this function has no Taylor expansion around the origin and that therefore the moments of the Cauchy p.d.f. do not exist.

Exercise 4.54: Let $x_1, x_2, \ldots, x_n$ be n independent Cauchy variables, each distributed according to eq.(4.88). Show that $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ has the same distribution.
This result may at first appear somewhat surprising, in view of what has been learned from the Central Limit Theorem. One might perhaps have expected that the arithmetic mean $\bar{x}$ of the n independent variables would become approximately normally distributed for very large n. The apparent discrepancy is due to the fact that the n Cauchy variables do not fulfil the requirement of possessing a finite variance, which was essential in deriving the Central Limit Theorem.

Exercise 4.55: (The double exponential distribution)
Discuss the properties of a variable with the p.d.f.
$$f(x) = \tfrac{1}{2}\,e^{-|x|}, \qquad -\infty < x < \infty.$$
Note in particular that this distribution has tails which drop off more slowly than the standard normal but faster than the Cauchy distribution.

5. Sampling distributions
The previous chapter has dealt rather extensively with the characteristics and properties of some probability distributions which have been found to describe various physical phenomena quite accurately under certain ideal conditions.
The present chapter will be devoted to a study of the properties of three sampling distributions which are related to the normal. The chi-square, the Student's t, and the F-distributions do not have any direct physical analogues, but can be connected to experimental situations where the normal distribution law is supposed to describe the outcome of measurements. The motivation to study these distributions may perhaps not be clear to the reader at the moment, in which case he should proceed to the following chapters and return to this point when it is found necessary. It may suffice to mention that, although some applications of the sampling distributions are found already in Chapter 7, in connection with simple inference problems involving samples from the normal distribution, a full appreciation of the sampling distributions discussed here will first become evident in the last chapter of the book. Thus a number of examples on hypothesis testing indeed presupposes knowledge of the standard sampling distributions as well as some acquaintance with the related non-central sampling distributions, which are introduced in the exercises of the present chapter.
5.1.1 Definition
Let us assume that there is given a set of n mutually independent random variables $x_1,x_2,\ldots,x_n$ which are all normal $N(\mu,\sigma^2)$. We may for instance think of the $x_i$'s as the outcomes of n repeated measurements on the same physical system or n independent observations on the same quantity. Then the $x_i$'s constitute a sample of size n from a population which is normal with mean $\mu$ and variance $\sigma^2$. We define the chi-square sum $\chi^2$ by adding the squares of the standardized normal variables $(x_i-\mu)/\sigma$, viz.
$$\chi^2 = \sum_{i=1}^{n}\left(\frac{x_i-\mu}{\sigma}\right)^2 . \tag{5.1}$$
The variable $\chi^2$ has a probability density function given by
... dent variables making up the $\chi^2$ sum (5.1).

... in words, this sum of squares has a chi-square distribution with $n-1$ degrees of freedom. This may at a first glance seem to contradict the definition of the chi-square distribution in this section, which implies that ... The disagreement is, however, only apparent, for the following reason: In the ... of freedom has been lost, since it has been used to estimate the unknown parameter (the central value) of the distribution.

Remark 2. From a mathematical viewpoint the requirement that all the $x_i$ should be similarly distributed (all $N(\mu,\sigma^2)$) is unnecessarily restrictive. In general, for n independent variables $x_i$ from normal distributions $N(\mu_i,\sigma_i^2)$ the sum of squares $\chi^2 = \sum_{i=1}^{n} y_i^2$ of the standardized quantities $y_i = (x_i-\mu_i)/\sigma_i$ will have a chi-square distribution with n degrees of freedom.
If the variables $y_i$ in the sum $\chi^2 = \sum_{i=1}^{n} y_i^2$ are not $N(0,1)$, but more generally distributed with unit variances and means different from zero (and ...

5.1.2 Proof for the chi-square p.d.f.
... ing the formula by using the change-of-variable technique outlined in Sect.3.7, recalling only that putting $u = x^2$ implies a two-to-one transformation from x to u; compare Exercise 3.6. Here we shall be satisfied with verifying eq.(5.5) for the case $\nu=2$. The case $\nu=3$ can be treated in an analogous way (Exercise 5.1). For the general case of an arbitrary $\nu$ a proof can be given by mathematical induction (Exercise 5.2), or using a more direct method (see for
where $0 \le \rho \le \infty$, $0 \le \phi \le 2\pi$. The transformation from the set $x_1,x_2$ to the set $\rho,\phi$ involves the Jacobian
- e-'p2.p .
Since the relation between the variables u and p is
- -1
The chi-square distribution of eq.(5.5) is shown in Fig. 5.1 for selected values of the parameter $\nu$.

The characteristic function of the chi-square variable u follows from the definition eq.(3.22),
$$\phi(t) = E(e^{itu}) = \int_0^\infty e^{itu} f(u;\nu)\,du .$$
Inserting the p.d.f. of eq.(5.5) and carrying out the integration leads to
$$\phi(t) = (1-2it)^{-\nu/2}. \tag{5.6}$$
When the characteristic function is known one can easily evaluate expectation values by differentiating with respect to (it) and putting t=0, as demonstrated in Sect.3.4. One finds in particular $E(u) = \nu$ and $V(u) = 2\nu$, and for the algebraic moments
$$E(u^k) = \nu(\nu+2)\cdots(\nu+2(k-1)). \tag{5.8}$$
This relation, together with the generally valid formula (3.18), permits the determination of all central moments up to any desired order. In particular, the central moments of orders 3 and 4 are found to be
$$\mu_3 = 8\nu, \qquad \mu_4 = 12\nu(\nu+4).$$
Thus the asymmetry and kurtosis coefficients, from their definition by eqs. (3.20) and (3.21), respectively, become
$$\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = 2\sqrt{\frac{2}{\nu}}, \qquad \gamma_2 = \frac{\mu_4}{\mu_2^{2}} - 3 = \frac{12}{\nu}.$$
These numbers express the tendency seen in Fig. 5.1, that the skewness of the chi-square distribution decreases for increasing $\nu$, while the shape becomes more "bell"-like. Visually the distribution looks "normal" already at $\nu = 20$. In the limit $\nu \to \infty$ the coefficients $\gamma_1$ and $\gamma_2$ are zero, indicating exact symmetry and a peaking equal to that of a normal distribution.

In fact, it can be shown that asymptotically the chi-square distribution does indeed become identical to the normal distribution. To see this it is sufficient to demonstrate that the characteristic functions for the two distributions become equal in the limit of large $\nu$. Guided by the established facts that the mean and variance for the chi-square distribution are given by $\nu$ and $2\nu$, respectively, we form the standardized variable
$$y_1 = \frac{u-\nu}{\sqrt{2\nu}}, \tag{5.11}$$
which has the characteristic function
$$\phi_{y_1}(t) = e^{-it\sqrt{\nu/2}}\left(1 - it\sqrt{2/\nu}\right)^{-\nu/2},$$
where, in the last step, we have inserted the expression of eq.(5.6) for the characteristic function of the variable u. Taking the logarithm and expanding the last term we have
$$\ln\phi_{y_1}(t) = -\tfrac{1}{2}t^2 + O(\nu^{-1/2}).$$
When $\nu$ goes towards infinity, $\phi_{y_1}(t) \to e^{-\frac{1}{2}t^2}$; thus in the limit of infinite $\nu$ the variable $y_1$ has the characteristic function of a standard normal variable. Hence the original variable u, for limiting values of $\nu$, will also be normal, namely $N(\nu,2\nu)$.
Mathematically, the approach of $y_1 = (u-\nu)/\sqrt{2\nu}$ to $N(0,1)$ is rather slow. One can show, see Exercises 5.10, 5.11, that the variable
$$y_2 = \sqrt{2u} - \sqrt{2\nu-1} \tag{5.12}$$
represents a better approximation to $N(0,1)$.
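The slow approach of the standardized chi-square variable to $N(0,1)$ can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the cumulative chi-square distribution by the series for the incomplete gamma function and compares it with two normal approximations, $y_1 = (u-\nu)/\sqrt{2\nu}$ and $y_2 = \sqrt{2u}-\sqrt{2\nu-1}$. The form of $y_2$ (Fisher's approximation) is an assumption about what eq.(5.12) contains, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

def chi2_cdf(x, nu):
    """Cumulative chi-square distribution, computed from the series
    P(a, z) = exp(-z) z^a sum_k z^k / Gamma(a + k + 1) with a = nu/2, z = x/2."""
    if x <= 0.0:
        return 0.0
    a, z = 0.5 * nu, 0.5 * x
    term = math.exp(-z + a * math.log(z) - math.lgamma(a + 1.0))
    total, k = term, 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return total

def skewness_and_kurtosis(nu):
    """Asymmetry and kurtosis coefficients of the chi-square distribution."""
    return 2.0 * math.sqrt(2.0 / nu), 12.0 / nu

def normal_approximations(x, nu):
    """Return (exact, via y1, via y2) values of the cumulative probability at x."""
    phi = NormalDist().cdf
    y1 = (x - nu) / math.sqrt(2.0 * nu)                   # standardized variable
    y2 = math.sqrt(2.0 * x) - math.sqrt(2.0 * nu - 1.0)   # assumed form of eq.(5.12)
    return chi2_cdf(x, nu), phi(y1), phi(y2)

if __name__ == "__main__":
    exact, p1, p2 = normal_approximations(43.77, 30)
    print(f"exact {exact:.4f}, from y1 {p1:.4f}, from y2 {p2:.4f}")
```

For $\nu = 30$ at the upper 5% point the value obtained from $y_2$ is closer to the exact probability than the one obtained from $y_1$.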
5.1.4 Probability contents of the chi-square distribution
In practice one is frequently interested in the cumulative chi-square distribution to calculate confidence intervals or for testing hypotheses involving chi-square distributed variables.
Figure 5.2 gives probability contents of the chi-square p.d.f. for different numbers of degrees of freedom. The figure shows a double-logarithmic display of the quantities $F(\chi^2_\alpha;\nu)$ and $\alpha$,
$$F(\chi^2_\alpha;\nu) \equiv \int_0^{\chi^2_\alpha} f(u;\nu)\,du = 1-\alpha, \tag{5.13}$$
versus $\chi^2_\alpha$ ... for different $\nu$ and specified entries of $F(\chi^2_\alpha;\nu)$.
When the number of degrees of freedom is sufficiently large, $\nu \ge 30$, the probability contents of the chi-square distribution can easily be found using the fact that the variable $y_2$ of eq.(5.12) (or $y_1$ of eq.(5.11)) is approximately standard normal. See Exercise 5.8.
It may be worth noting that the p.d.f. for the variable $F(\chi^2_\alpha;\nu)$ of eq.(5.13) is uniform over the interval [0,1]. (This fact is generally true for any variable defined by the cumulative integral of a p.d.f., see Sect.6.5.1.) We shall see an example of the usefulness of this property in Sect.10.6.4.
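The uniformity of the cumulative integral can be illustrated by simulation. The sketch below is an illustration under stated assumptions: it builds chi-square variables as sums of squared standard normal numbers, uses the closed form of the cumulative distribution that holds for an even number of degrees of freedom, and checks that the resulting values have the mean and variance of a uniform variable on [0,1]. The function names, the seed, and the sample size are arbitrary choices.

```python
import math
import random

def chi2_cdf_even(x, nu):
    """Cumulative chi-square distribution for even nu:
    F = 1 - exp(-x/2) * sum_{k=0}^{nu/2-1} (x/2)^k / k!"""
    z = 0.5 * x
    term, total = 1.0, 1.0
    for k in range(1, nu // 2):
        term *= z / k
        total += term
    return 1.0 - math.exp(-z) * total

def simulate_cumulative_values(nu=4, n=20000, seed=1):
    """Draw n chi-square variables with nu degrees of freedom and return F(u; nu)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        u = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))
        values.append(chi2_cdf_even(u, nu))
    return values

def mean_and_variance(values):
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v

if __name__ == "__main__":
    m, v = mean_and_variance(simulate_cumulative_values())
    print(f"mean {m:.3f} (uniform: 0.500), variance {v:.4f} (uniform: {1/12:.4f})")
```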
satisfying
$$\sum_{i=1}^{n} a_{ij}a_{ik} = \delta_{jk}.$$
The independent variables $y_i$ are all normally distributed, each being $N(0,\sigma^2)$. Now one has
$$\sum_{i=1}^{n}(x_i-\bar{x})^2 = \sum_{i=1}^{n}x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n}y_i^2 - y_n^2 = \sum_{i=1}^{n-1}y_i^2 .$$
Therefore

Hence v is $\chi^2(\nu_1+\nu_2+\ldots+\nu_r)$.
The addition theorem for chi-square variables in fact may appear intuitively correct, because the number of degrees of freedom is nothing but the number of independent terms making up the $\chi^2$ sum.
5.1.6
Exercise 5.3: For the chi-square p.d.f. with $\nu$ degrees of freedom show that $E(u^k) = 2^k\,\Gamma(\tfrac{1}{2}\nu+k)/\Gamma(\tfrac{1}{2}\nu)$ for all positive and negative integers k satisfying $\tfrac{1}{2}\nu+k>0$. Note that this is a more general result than that implied by eq.(5.8), where k is assumed to be a positive integer.
-
general,
v; 2~k~(A(v+k))/~(~~),
4 k -
and t h a t i n p a r t i c u l a r t h e even moments r e s u l t as
v(u*2)-. .(v+2(k-1)).
Verify t h a t , vhen X . 0 , f(u';v.A) reduce. t o t h e ordinary ( c e n t r a l ) chi-aquare
d i s t r i b u t i o n v i t h U degrees of freedom, eq.(5.5).
It can a l s o be s h m t h a t the v a r i a b l e
! J ; ~(v+1)
+~ -
The odd moments can be expressed by t h e f i r s t , !J:,
(u+3). ..(v+2k-1)~:.
Use S t i r l i n g ' s expansion
Inr(x+l) - (ln(27) + ( x + O l n x
1
-a+- --
1
...
360.'
is approximately distributed like a central chi-square variable according to eq.(5.5), but with a parameter $\nu' = (\nu+\lambda)^2/(\nu+2\lambda)$, where $\nu'$ is, in general, fractional. This fact is frequently used to find approximate values of the integral of a non-central chi-square variable, by interpolating in the tables (curves) for the central chi-square distribution.
5.2.1 Definition
Let x be a standard normal variable $N(0,1)$ and u a chi-square variable with $\nu$ degrees of freedom, $\chi^2(\nu)$, and assume that x and u are independent. Define a variable t by
$$t = \frac{x}{\sqrt{u/\nu}} .$$

Remark 2. To motivate a study of the Student's t-variable, recall the by now wellknown properties of the mean $\bar{x}$ and variance $s^2$ of a sample $x_1,x_2,\ldots,x_n$ from $N(\mu,\sigma^2)$:
$\bar{x}$ is $N(\mu,\sigma^2/n)$, where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$,
$(n-1)s^2/\sigma^2$ is $\chi^2(n-1)$, where $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$.
Moreover, $\bar{x}$ and $s^2$ are independent variables (Exercise 5.9). Consequently the two independent variables
$$\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \qquad \text{and} \qquad \frac{(n-1)s^2}{\sigma^2},$$
being, respectively, $N(0,1)$ and $\chi^2(n-1)$, satisfy the requirements specified in the beginning of this section, with the trivial difference that the chi-square variable has $n-1$ degrees of freedom, instead of $\nu$. A variable constructed from these two variables as
$$t = \frac{(\bar{x}-\mu)/(\sigma/\sqrt{n})}{\sqrt{s^2/\sigma^2}} = \frac{\bar{x}-\mu}{s/\sqrt{n}}$$
is therefore a Student's t-variable with $n-1$ degrees of freedom.

... where $-\infty \le t \le \infty$, $0 \le v \le \infty$, the Jacobian of the transformation is ...
In terms of the new variables the joint p.d.f. can be written as ...
Since we are only interested in the variable t we proceed to find the marginal distribution in this variable by integrating over v, ... which is seen to lead to eq.(5.15).
Exercise 5.13: Evaluate the integral of eq.(5.16) and thereby complete the proof for the p.d.f. of the Student's t-variable, eq.(5.15).

$$F(t_\alpha;\nu) \equiv \int_{-\infty}^{t_\alpha} f(t;\nu)\,dt = 1-\alpha, \tag{5.19}$$
where $f(t;\nu)$ is given by eq.(5.15). Appendix Table A7 gives values of $t_\alpha$ corresponding to specified entries for $F(t_\alpha;\nu)$ and for different degrees of freedom. The table includes the special case $\nu=\infty$, which we have seen is identical to the standard normal distribution. In this case the reader, who already from Sect.4.8 is acquainted with the probability contents of $N(0,1)$, will easily establish the connection between Appendix Tables A6 and A7, and thereby be able to make use of the latter; see Exercises 5.17 and 5.18.
Exercise 5.16: Show that the moments $\mu'_k$ for the t-distribution exist if $k < \nu$, and that the even moments are given by
$$E(t^{2r}) = E\left[\left(\frac{x}{\sqrt{u/\nu}}\right)^{2r}\right] = \nu^r\,E(x^{2r})\,E(u^{-r}), \qquad 2r < \nu,$$
because of the independence of the variables x and u; $E(x^{2r})$ and $E(u^{-r})$ are the expectations for $N(0,1)$ and $\chi^2(\nu)$, respectively. Compare Sect.4.8.4 and Exercise 5.3.

5.3.1 Definition
Let $u_1$ and $u_2$ be two independent (central) chi-square variables with, respectively, $\nu_1$ and $\nu_2$ degrees of freedom, i.e. $u_1$ is $\chi^2(\nu_1)$, $u_2$ is $\chi^2(\nu_2)$. Define a variable F by
$$F = \frac{u_1/\nu_1}{u_2/\nu_2} .$$
P(-b 5 t 5 b) =
I
b
f(t;u)dt - y
This v a r i a b l e has t h e p.d.f.
-b
by the use of Appendix Table A7. Write d a m values of b taking y-0.95 f o r
v-1,5,10,M.60,'. Note t h a t f o r \UI, t h e l i m i t s b 4 2 . 0 0 correspond t o the pro-
b a b i l i t y content 0.954 of N(0,l). -,
M.If
degrees o f f m e d a .
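Exercise 5.19: Show that if $t^2$ is taken as a variable instead of t, the p.d.f. is
$$f(t^2;\nu) = \frac{\Gamma(\tfrac{1}{2}(\nu+1))}{\sqrt{\pi\nu}\,\Gamma(\tfrac{1}{2}\nu)}\,(t^2)^{-1/2}\left(1+\frac{t^2}{\nu}\right)^{-(\nu+1)/2}, \qquad 0 \le t^2 \le \infty .$$
This is the same form as a special case of the F-distribution (with $\nu_1=1$) to be discussed in the next paragraph.

Remark 2. In practice one encounters chi-square variables in the form of sample variances for normally distributed variables. For definiteness, let $x_1,x_2,\ldots,x_n$ and $y_1,y_2,\ldots,y_m$ be two independent samples from the same population $N(\mu,\sigma^2)$, for example two series of independent measurements. The sample variances are
$$s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad s_2^2 = \frac{1}{m-1}\sum_{i=1}^{m}(y_i-\bar{y})^2,$$
...
$$F = \frac{s_1^2/\sigma^2}{s_2^2/\sigma^2} = \frac{s_1^2}{s_2^2} . \tag{5.22}$$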
t-p.d.f. i n Seet.5.2.2. From the j o i n t p.d.f. of u, and ut. a transformation t h a t is, t h e v a r i a b l e (v,F) approaches x 2 ( v l ) ; see E m r e i s e (5.25).
i s made t o t h e nn. s e t of v a r i a b l e s For v l * -, v, * t h e F-distribution tends t o normal. The
F -- u,lv,
9 v-u2.
approach t o normality is, however, r a t h e r slw.
-
where 0 5 F 5
eliminating
-. "2/V2
0 5 v 5 -. Applying t h e change of v a r i a b l e technique and
the auxiliary v a r i a b l e v by i n t e g r a t i n g t h e j o i n t p.d.f. over t h i s
(v) The q u a n t i t y z
normal, w i t h approximate me* 1
Exercise 5.28.
vz v,
1 +>I,
4lnF has a d i s t r i b u t i o n which is c l o s e t o
[l-
and variance 4($, + see
5.3.3 Properties of the F-distribution
Figure 5.4 shows a sketch of the F-distribution for a few combinations of the parameters $\nu_1,\nu_2$.
The following features characterize the F-distribution:
where t h e integrand is t h e p.d.f. of eq.(5.21). Appendix Table A9 gives values and thereby v e r i f y t h e statement made under (v) i n Se t 5 3 3, char z is
of x, f o r s p e c i f i e d e n t r i e s of F(xa;v,,v,) and d i f f e r e n t degrees of freedom ap roximately normally d i s t r i b u t e d w i t h mean value t(-F r i ]. and variance-u,
(Y,.'J2).
r +,;(7 5,). . V~
+
onlyi f k < tvrr and i s then given by (ii) I n view of a statemcot i n Exercise 5.1 2 the v a r i a b l e u:l(*] will
u; = E(Fk) =
I'
;;r;;
J
=
k
(2)
E ( ~ , ~ ) E ( ~ =c ~ )
(%]k.%
v,+k) r ( ~ 2 - k )
* YI- ( v , + ~ ) ~ / ( v I + z AHence
). ";I(*)] lv: -
have an approximate c e n t r a l chi-square d i s t r i b u t i o n with parameter
u;l(vl+A) is an approximate c e n t r a l
~ x e ~ 5.26: e i ~Calculate
~ o- (eg.(5.24)) f o r t h e F - d i s t r i b u t i o n with
where F approximately hae e c e n t r a l F-distribution with parameters (u:,u,),
being i n general f r a c t i o n a l .
V:
fixed v,=5 and v2-10, 20, 60, r e s p e c t i v e l y . From Appendix Table 8 9 , f i n d by 5.4 LIMITING PROPERTIES - CONNECrION BErWEEN PROBABILITY DISTRIBUTIONS
i n t e r p o l a t i o n F ( % ; U , , V ~ ) f o r t h e t h r e e combinations of (Vl,vt). What i s t h e The connection between the sampling d i s t r i b u t i o n s of the p r e s e n t
l i m i t i n g value when v, * -?
[Figure: connections between the probability distributions (multinomial, binomial, Poisson, normal)]

In the preceding chapters we have investigated features of probability distributions which are frequently used in physics. Experimental findings can, however, not always be directly compared to the ideal mathematical distributions. Quite often a theoretical model will have to be modified in some way before one can make a meaningful comparison between prediction and observation. The reason for this can be that the theoretical p.d.f. will only describe an experiment performed under certain ideal conditions which are not fulfilled in practice; for instance, the lifetime distribution law $f(t;\lambda) = \lambda e^{-\lambda t}$ for ...

$$f'(x') = e^{\frac{1}{2}(\lambda R)^2}\; G\!\left[\frac{x'}{R}-\lambda R\right]\lambda e^{-\lambda x'}, \qquad 0 \le x' \le \infty, \tag{6.6}$$
where G is the cumulative standard normal distribution introduced in Sect.4.8.2 and tabulated in Appendix Table A6.
In practice an ideal behaviour of the form of eq.(6.5) is expected, for example, for particle lifetimes, transverse momenta and four-momentum transfers. The assumption of a normal-shaped resolution function appears reasonable for many experimental set-ups.
Fig. 6.1 illustrates how the original p.d.f. of eq.(6.5) is modified by eq.(6.4) into different observable p.d.f.'s of eq.(6.6) for different numerical values for the constants $\lambda$ and R.
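The closed form of eq.(6.6) can be checked against a direct numerical folding of the ideal p.d.f. with the resolution function. The sketch below is an illustration under explicit assumptions: the ideal p.d.f. of eq.(6.5) is taken to be $\lambda e^{-\lambda x}$ for $x \ge 0$, the resolution function of eq.(6.4) is taken to be a normal distribution of width R centred on the true value, and the function names and numerical values of $\lambda$ and R are hypothetical.

```python
import math
from statistics import NormalDist

def smeared_closed_form(x_obs, lam, R):
    """Observable p.d.f. in the form of eq.(6.6)."""
    G = NormalDist().cdf
    return math.exp(0.5 * (lam * R) ** 2) * G(x_obs / R - lam * R) * lam * math.exp(-lam * x_obs)

def smeared_numeric(x_obs, lam, R, steps=4000):
    """Fold lam*exp(-lam*x), x >= 0, with a normal resolution of width R (Simpson's rule)."""
    lo = max(0.0, x_obs - 10.0 * R)   # the resolution function is negligible beyond 10 R
    hi = x_obs + 10.0 * R
    if hi <= lo:
        return 0.0

    def integrand(x):
        ideal = lam * math.exp(-lam * x)
        resolution = math.exp(-0.5 * ((x_obs - x) / R) ** 2) / (math.sqrt(2.0 * math.pi) * R)
        return ideal * resolution

    h = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return total * h / 3.0

if __name__ == "__main__":
    for x_obs, lam, R in [(1.0, 1.0, 0.5), (0.2, 2.0, 0.4), (3.0, 0.5, 1.0)]:
        print(x_obs, lam, R, smeared_closed_form(x_obs, lam, R), smeared_numeric(x_obs, lam, R))
```

The two columns agree to the accuracy of the numerical integration, which supports the reconstruction of the formula from the two ingredients.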
$$r(x';x) = \frac{1}{\sqrt{2\pi}\,R}\; e^{-\frac{1}{2}(x'-x)^2/R^2}$$
the integration of eq.(6.1) yields for the resolution transform
$$f'(x') = \frac{1}{\pi}\,\frac{\Gamma+R}{(x'-x_0)^2+(\Gamma+R)^2}, \qquad -\infty \le x' \le \infty . \tag{6.11}$$
detection efficiency becomes as high as possible. Since, however, perfect detection can never be achieved in practice due to high costs, time-consumption etc., all kinds of possible loss and systematic effects that will distort the data must be checked and estimated.

A probability density function $f(x;\theta)$ which appears suggestive to describe the phenomenon under study will sometimes be mathematically defined over non-observable values of the physical variable*). This situation can be handled by truncation of the p.d.f. in the following way. Let us assume that the observable part of the spectrum of x lies between some definite limits A and B. We then require the p.d.f. to be zero outside these limits and write our new p.d.f. as
$$f(x;\theta\,|\,A \le x \le B) = \frac{f(x;\theta)}{\int_A^B f(x;\theta)\,dx}, \qquad A \le x \le B. \tag{6.14}$$

... which in turn can be compared directly with the observations. Although exact, this method may be difficult, if not impossible, to carry out in practice.
Suppose that we make observations on some variable x in order to estimate the parameters of the ideal p.d.f. $f(x;\theta)$. Because of an imperfect detection apparatus the distribution that can be observed is not $f(x;\theta)$, but some distorted distribution $f'(x;\theta)$ which is related to the ideal p.d.f. through the detection efficiency. In general, this efficiency will be dependent on the variable x in which we are interested, as well as on one or more additional variables, y say. The additional variables may also be dependent on x, so that to cover the most general case we write
$$f'(x;\theta) = \frac{\int f(x;\theta)\,D(x,y)\,P(y|x)\,dy}{\iint f(x;\theta)\,D(x,y)\,P(y|x)\,dy\,dx} . \tag{6.15}$$
It is seen that eq.(6.14) for a truncated p.d.f. represents a special case of the last formula.
Method (ii), which is only approximately correct, but perhaps more frequently used in practice, applies correcting weights to the individual observed events, equal to the reciprocal of the detection efficiency. The treatment assumes a subsequent comparison of the distribution of these weighted events with the original p.d.f. Thus the philosophy is now to adjust the data, rather than the theoretical model. If one event is observed at a particular value $x_i$ of the variable we say that the corrected number of events is $w_i$, the weight $w_i$ being equal to the inverse of the detection probability for this particular event,

6.3.2 Example: Truncation of a Breit-Wigner distribution
A second example where truncation is always used in practice is for the Breit-Wigner parameterization of a resonance
$$f(M;M_0,\Gamma) = \frac{\Gamma}{\pi}\left((M-M_0)^2+\Gamma^2\right)^{-1}.$$
When the observations are restricted to a finite mass region $M_A \le M \le M_B$, the truncated p.d.f. according to eq.(6.14) is
$$f(M;M_0,\Gamma\,|\,M_A \le M \le M_B) = \frac{\Gamma}{\left[\arctan\frac{M_B-M_0}{\Gamma}-\arctan\frac{M_A-M_0}{\Gamma}\right]\left((M-M_0)^2+\Gamma^2\right)} .$$
For this p.d.f. the expectation of M is
$$E(M) = M_0 + \frac{\Gamma}{2}\;\frac{\ln\left[\dfrac{(M_B-M_0)^2+\Gamma^2}{(M_A-M_0)^2+\Gamma^2}\right]}{\arctan\frac{M_B-M_0}{\Gamma}-\arctan\frac{M_A-M_0}{\Gamma}} .$$
... value $M_0$, show that the variance for the truncated Breit-Wigner p.d.f. is
$$V(M) = \Gamma^2\left[\frac{m/\Gamma}{\arctan(m/\Gamma)} - 1\right].$$
Note that $\lim_{m\to\infty} V(M) = \infty$; compare Sect.4. 1 7 .
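The variance formula for the truncated Breit-Wigner p.d.f. can be verified by numerical integration. The following is a minimal sketch, assuming that the truncation is symmetric about the central value, $M_0 - m \le M \le M_0 + m$, and that $\Gamma$ enters as in the parameterization $\propto ((M-M_0)^2+\Gamma^2)^{-1}$. The function names and the numerical values of $M_0$, $\Gamma$ and m are hypothetical.

```python
import math

def truncated_bw_pdf(M, M0, gamma, MA, MB):
    """Breit-Wigner p.d.f. truncated to MA <= M <= MB and renormalized."""
    norm = math.atan((MB - M0) / gamma) - math.atan((MA - M0) / gamma)
    return gamma / (norm * ((M - M0) ** 2 + gamma ** 2))

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def variance_numeric(M0, gamma, m):
    """Variance about M0 for the symmetric truncation |M - M0| <= m."""
    MA, MB = M0 - m, M0 + m
    return simpson(lambda M: (M - M0) ** 2 * truncated_bw_pdf(M, M0, gamma, MA, MB), MA, MB)

def variance_formula(gamma, m):
    """Closed form gamma^2 [ (m/gamma)/arctan(m/gamma) - 1 ]."""
    return gamma ** 2 * ((m / gamma) / math.atan(m / gamma) - 1.0)

if __name__ == "__main__":
    M0, gamma, m = 0.77, 0.075, 0.30   # illustrative numbers only
    print(variance_numeric(M0, gamma, m), variance_formula(gamma, m))
```

Increasing m makes the variance grow without bound, in line with the remark that the untruncated distribution has no finite variance.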
D(p.h.41 - e
-tmin17
-e
-tmaxl~ a r o t a t i o n of a l l charged pion d i r e c t i o n s around t h i s a x i s and determine the
t o t a l p r o b a b i l i t y P: t h a t a t l e a s t one of the t r a c k s w i l l correspond t o a lab-
where t .
m1n
is t h e minimal d e t e c t a b l e proper f l i g h t - t i m e corresponding t o t h e oratory angle within t h e cone dn. The event is then assigned a weight
chosen c u t on t h e range, and where t i s the potential flight-time.
man
Both t . and tmaxare i n v e r s e l y proportional t o t h e momentm p, t h e
rmn
" i n t e r e s t i n g " variable. They w i l l a l s o involve t h e "nuisance variables" A and
4 g i v i n g t h e d i r e c t i o n of t h e line-of-flight. I t i s f o r w necessary t o estab- When a l l events from t h e p u r i f i e d sample are p l o t t e d w i t h t h e i r indi-
l i s h a r e l a t i o n s h i p between p and A,+, which f o r a given p expresses the d i s t r i - v i d u a l v e i g h t s the r e s u l t i n g corrected experimental d i s t r i b u t i o n can be compared
with t h e unmodified t h e o r e t i c a l model.
b u t i o n of the angles, ~ ( A . d l p ) . Usually t h i s r e l a t i o n s h i p has t o be i n f e r r e d
from t h e same d a t a which we want t o w e f o r t h e e s t i m a t i o n of t h e unknown para-
meters .: When the dependence p(A.41~) has been e s t a b l i s h e d , i n a f u n c t i o n a l I 6.4 SUPENWOSED PROBABILITY DENSITIES
7.1 DEFINITIONS
Let $x_1,x_2,\ldots,x_n$ be a random sample from a population with a probability density function which depends on a parameter $\theta$ which is not known but

*) We shall in Chapter 8 identify t as an estimator for the parameter $\theta$, or, more generally, an estimator for some function of $\theta$.
... two numbers defining an interval. We realize, however, that since $\bar{x}$ is a random variable, the quantities $\bar{x} - 2\sigma/\sqrt{n}$ and $\bar{x} + 2\sigma/\sqrt{n}$ are also random variables; hence it is justified to call the interval $[\bar{x} - 2\sigma/\sqrt{n},\ \bar{x} + 2\sigma/\sqrt{n}]$ a random interval. We may read eq.(7.4) as a probability statement about $\mu$: Prior to the repeated, independent measurements there is a probability 0.954 that the random interval $[\bar{x} - 2\sigma/\sqrt{n},\ \bar{x} + 2\sigma/\sqrt{n}]$ will include the unknown, but fixed value $\mu$.
Other probability statements can of course be written taking other intervals corresponding to other probabilities. The point is that all statements of this sort, which can be made before any measurements are actually performed, belong to probability theory. Generally we may write
$$P\left(a \le \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \le b\right) = \gamma, \tag{7.5}$$
where the limits a and b for a given $\gamma$ can be found from Appendix Table A6.
As soon as the measured numbers are at hand and we are given a particular set of n observations $x_1,x_2,\ldots,x_n$, we may pass to the domain of statistics, and make inferences about the unknown $\mu$ on the basis of the observations.
For definiteness, let us establish a 95.4% confidence interval for the mean $\mu$ in $N(\mu,\sigma^2)$, given that four independent measurements, with a known, common error $\sigma=3$, have led to the numbers
2.2, 4.3, 1.7, 6.6.
The sample mean is $\bar{x} = 3.7$, and the limits $\bar{x} \pm 2\sigma/\sqrt{n}$ become $3.7 \pm 3.0$. Therefore, an inference to be made from the measurements is that the symmetric 95.4% confidence interval for the mean $\mu$ is the interval [0.7, 6.7].
It should be clear from the reasoning above that it is essential that $\sigma$ is a known number. If $\sigma$ were not known, the confidence limits $(\bar{x} - 2\sigma/\sqrt{n})$ and $(\bar{x} + 2\sigma/\sqrt{n})$ could not have been calculated from the measurements, and hence no inference could have been made about $\mu$ based on $N(0,1)$.
In practice the situation is often that the error on the measurements is not known exactly. However, the size of the sample may sometimes be sufficiently large to allow the approximation of $\sigma^2$ by the observed sample variance $s^2$, and the procedure above can be applied to find confidence intervals for $\mu$. If $\sigma^2$ is not known, and the sample size is small ($n \le 20$), the procedure of the subsequent section should be used.

Exercise 7.1: For the numerical example given in the text, what is the symmetric 90% confidence interval for $\mu$?

Exercise 7.2: Given 6 independent measurements
10.7, 9.7, 13.3, 10.2, 8.9, 11.6
of known error $\sigma=2$. Assuming a normal sample, find symmetric confidence intervals for $\mu$ corresponding to (a) $\gamma=0.90$, (b) $\gamma=0.95$, (c) $\gamma=0.99$.

Exercise 7.3: Given that a normal distribution has variance $\sigma^2$, what is the sample size needed if the symmetric 95.4% confidence interval for $\mu$ shall have a length equal to (a) $\sigma$, (b) $\sigma/2$?

Exercise 7.4: Measurements on the momentum of monoenergetic beam tracks on bubble chamber pictures have led to the following sequence of numbers in units of GeV/c: 18.87, 19.55, 19.32, 18.70, 19.41, 19.37, 18.84, 19.40, 18.78, 18.76. We assume that this sample of size 10 originates from a normal distribution.
If the measuring machine has a known accuracy corresponding to an uncertainty of 300 MeV/c in the momentum determination, find a 95% confidence interval for the beam momentum.
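The interval for the mean with known error can be computed in a few lines. The sketch below is only an illustration: it uses the four measurements 2.2, 4.3, 1.7, 6.6 with $\sigma=3$ quoted in the text, takes the limit on the standardized variable from the inverse cumulative normal distribution in place of Appendix Table A6, and the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean

def mean_interval_known_sigma(data, sigma, gamma):
    """Symmetric confidence interval for the mean of a normal sample
    when the error sigma of a single measurement is known."""
    b = NormalDist().inv_cdf(0.5 * (1.0 + gamma))   # limit b with P(-b <= z <= b) = gamma
    half_width = b * sigma / math.sqrt(len(data))
    centre = mean(data)
    return centre - half_width, centre + half_width

if __name__ == "__main__":
    measurements = [2.2, 4.3, 1.7, 6.6]
    # gamma = 0.9545 corresponds to two standard deviations
    print(mean_interval_known_sigma(measurements, sigma=3.0, gamma=0.9545))
```

With $\gamma = 0.9545$ the limits agree with the interval [0.7, 6.7] obtained in the text from $\bar{x} \pm 2\sigma/\sqrt{n}$.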
... provided by the remark of Sect.5.2.1. We have seen that if $x_1,x_2,\ldots,x_n$ is a random sample from $N(\mu,\sigma^2)$ two variables can be formed, which have wellknown properties, namely
$$\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \ \text{is}\ N(0,1), \qquad \frac{(n-1)s^2}{\sigma^2} \ \text{is}\ \chi^2(n-1),$$
and these variables are independent. Therefore the variable
$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}} \tag{7.6}$$
is a Student's t-variable with (n-1) degrees of freedom. We note that from the construction of t, the unknown parameter $\sigma^2$ drops out, and we are left with a variable which has only $\mu$ as an unknown constituent. It is also worth comparing the previous case: With $\sigma^2$ known, the variable constructed was $(\bar{x}-\mu)/(\sigma/\sqrt{n})$, which is distributed as $N(0,1)$; in the present case where $\sigma^2$ is assumed unknown, the variable needed is $(\bar{x}-\mu)/(s/\sqrt{n})$, which has a Student's t-distribution with (n-1) degrees of freedom.
For the variable constructed by eq.(7.6) we may write down probability statements analogous to eq.(7.5),
$$P\left(a \le \frac{\bar{x}-\mu}{s/\sqrt{n}} \le b\right) = \int_a^b f(t;n-1)\,dt = \gamma,$$
where $f(t;n-1)$ is the Student's t probability density function for (n-1) degrees of freedom, given by eq.(5.15). Since $f(t;n-1)$ has symmetry about t=0 it is customary to choose intervals [a,b] which are symmetric. Values for b in the relation
$$P\left(-b \le \frac{\bar{x}-\mu}{s/\sqrt{n}} \le b\right) = \gamma \tag{7.8}$$
can be deduced from Appendix Table A7 for different numbers of degrees of freedom and for the usually chosen values of the confidence coefficient $\gamma$ (compare Exercise 5.17).
Rewriting the argument in the left-hand side of eq.(7.8) gives a probability statement about the unknown $\mu$,
$$P\left(\bar{x} - b\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + b\frac{s}{\sqrt{n}}\right) = \gamma,$$
which may be compared to eq.(7.4), valid in the previous case when $\sigma^2$ was known. For a specified value of $\gamma$ the corresponding value of b will be dependent on the number of degrees of freedom. The size of the random interval $[\bar{x} - b\,s/\sqrt{n},\ \bar{x} + b\,s/\sqrt{n}]$ for a given $\gamma$ is large for very small values of (n-1), but approaches the size of the corresponding intervals in $N(0,1)$ when the number of degrees of freedom becomes large. This is so because the Student's t-distribution has $N(0,1)$ as a limiting distribution when $n \to \infty$ (compare Sect.5.2.3).
For illustration, let us return to the numerical example of the previous section, with the measurements 2.2, 4.3, 1.7, 6.6 from $N(\mu,\sigma^2)$, where now $\mu$ as well as $\sigma^2$ are unknown. We calculate
$$\bar{x} = 3.7, \qquad s^2 = 5.01 .$$
Searching a confidence interval which can be compared to the symmetric 95.4% (or 2 standard deviation) confidence interval derived for the case when $\sigma^2$ was known, we observe that Appendix Table A7 has entries corresponding to probability contents of 0.025 in the tails of the Student's t-distribution. For 3 degrees of freedom, we find b = 3.182, and the confidence limits are given by the numbers
$$\bar{x} \pm b\frac{s}{\sqrt{n}} = 3.7 \pm 3.56 .$$
Thus the symmetric 95% confidence interval for $\mu$ obtained from the four measurements of unknown experimental precision is the interval [0.14, 7.26]. Notice that this interval is larger than the corresponding 95.4% confidence interval [0.7, 6.7] obtained in the previous example when $\sigma^2$ was assumed known.
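The numerical example with unknown error can be reproduced without tables. The sketch below is an illustration only: instead of reading b from Appendix Table A7 it integrates the Student's t p.d.f. numerically and solves $P(-b \le t \le b) = \gamma$ by bisection. The function names, the step count and the bracketing strategy are arbitrary choices made for this example.

```python
import math
from statistics import mean, stdev

def t_pdf(t, nu):
    """Student's t probability density function for nu degrees of freedom."""
    log_norm = math.lgamma(0.5 * (nu + 1)) - math.lgamma(0.5 * nu)
    return math.exp(log_norm) / math.sqrt(nu * math.pi) * (1.0 + t * t / nu) ** (-0.5 * (nu + 1))

def central_probability(b, nu, steps=2000):
    """P(-b <= t <= b): Simpson's rule on [0, b], doubled by symmetry."""
    h = b / steps
    total = t_pdf(0.0, nu) + t_pdf(b, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 2.0 * total * h / 3.0

def t_limit(gamma, nu):
    """Solve P(-b <= t <= b) = gamma for b by bisection."""
    lo, hi = 0.0, 1.0
    while central_probability(hi, nu) < gamma:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if central_probability(mid, nu) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mean_interval_unknown_sigma(data, gamma):
    """Symmetric confidence interval for the mean when sigma is estimated by s."""
    n = len(data)
    b = t_limit(gamma, n - 1)
    half_width = b * stdev(data) / math.sqrt(n)   # stdev uses the (n-1) denominator
    return mean(data) - half_width, mean(data) + half_width

if __name__ == "__main__":
    print(t_limit(0.95, 3))
    print(mean_interval_unknown_sigma([2.2, 4.3, 1.7, 6.6], 0.95))
```

For 3 degrees of freedom and $\gamma = 0.95$ the limit comes out as b = 3.182, and the interval for the four measurements agrees with [0.14, 7.26].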
E x e r c i s e 7.5: Foi t h e numerical example above, what is t h e symmetric 90% con- For a chosen value of y t h e r e is an i n f i n i t e rider of p o s s i b l e choices f o r the
f i d e n c e i n t e r v a l f o r u1 Compare t h i s r e s u l t w i t h t h a t of Exercise 7.1.
i n t e g r a t i o n l i m i t s a and b f o r the s h chi-square p.d.f.. It i s customary t o
E x e r c i s e 7.6: S i x independent observations from a population ~ ( u . 5 ' ) are given
-
t & e t h e l i m i t s such t h a t t h e t w o t a i l s b e l m a and above b w i l l correspond t o
by the numbers 10.7, 9.7, 13.3, 10.2, 8.9, 11.6. With $\sigma^2$ unknown, find symmetric confidence intervals for $\mu$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.95$, (c) $\gamma = 0.99$. (Compare Exercise 7.2.)

Exercise 7.7: With the observations of Exercise 7.4, what is the symmetric 95% confidence interval for the beam momentum if the accuracy of the measuring instrument is not known prior to the measurements?

7.3 CONFIDENCE INTERVALS FOR THE VARIANCE

As before we will assume that $x_1, x_2, \ldots, x_n$ is a random sample from $N(\mu,\sigma^2)$, but now we want to discuss how we can find confidence intervals for $\sigma^2$ and thereby make inferences about this parameter. Again we must treat separately two cases, first assuming $\mu$ to be known (Sect.7.3.1), and next assuming $\mu$ unknown (Sect.7.3.2).

7.3.1 Case with $\mu$ known

This may correspond to an experimental situation where repeated measurements are performed on a known quantity $\mu$ using a measuring device of unknown precision.

A statistic which has correspondence to the variance $\sigma^2$ is the sum

$$\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2 .$$

As we have seen before (Sect.5.1.1) the variable

equal probabilities $\frac{1}{2}(1-\gamma)$. Calculations of $a$ and $b$ for given $\gamma$ and given number of degrees of freedom can then be done in the ordinary manner, using Appendix Table A8.

Let us again take an example. Suppose that it is requested to say something about the accuracy of a new measuring instrument, and for this purpose a calibrated length is measured several times. The outcomes from 10 independent measurements are the numbers

and the true number, $\mu$, is 1000.

If we demand a 95% confidence interval for $\sigma^2$, Appendix Table A8 shows that for 10 degrees of freedom the integration limits in eq.(7.10) will correspond to equal probabilities (= 0.025) in the two tails of the chi-square p.d.f. provided that we take $a = 3.247$ and $b = 20.483$. The measurements give the squared deviations about the known mean $\mu$ as $\sum_{i=1}^{10}(x_i-\mu)^2$. Hence an inference from the measurements is that the 95% confidence interval for the variance $\sigma^2$ is given by
Exercise 7.8: For the example in the text determine confidence intervals for $\sigma^2$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.99$.

Exercise 7.9: Let 7.3, 6.6, 7.0, 5.1, 7.1, 8.5, 5.9, 6.5, 6.2 be 9 independent measurements from an assumed normal population $N(7,\sigma^2)$. On the basis of these observations, make inferences about the variance $\sigma^2$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.95$, (c) $\gamma = 0.99$.
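The interval construction of Sect.7.3.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ten measurements below are hypothetical (they are not the numbers used in the text), the function name is invented for the sketch, and the chi-square limits are the values quoted in the text for 10 degrees of freedom. It uses the fact that $\sum(x_i-\mu)^2/\sigma^2$ is distributed as $\chi^2(n)$, so that $a \le S/\sigma^2 \le b$ is equivalent to $S/b \le \sigma^2 \le S/a$.

```python
# Chi-square limits for a symmetric 95% interval with 10 degrees of freedom,
# as quoted in the text (Appendix Table A8).
A_95_10DOF = 3.247
B_95_10DOF = 20.483


def variance_interval_known_mean(xs, mu, a, b):
    """Return (lower, upper) confidence limits for sigma^2 when mu is known.

    With S = sum((x - mu)^2), the statement a <= S / sigma^2 <= b
    is rewritten as S / b <= sigma^2 <= S / a.
    """
    s = sum((x - mu) ** 2 for x in xs)
    return s / b, s / a


# Hypothetical measurements of a calibrated length with true value 1000
# (assumed data, not taken from the text).
measurements = [1002.0, 998.5, 1001.0, 999.0, 1003.5,
                997.0, 1000.5, 1001.5, 998.0, 1000.0]

low, high = variance_interval_known_mean(
    measurements, 1000.0, A_95_10DOF, B_95_10DOF)
print(f"95% confidence interval for sigma^2: [{low:.3f}, {high:.3f}]")
```

For other confidence coefficients or sample sizes only the two tabulated limits need to be exchanged.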
is distributed as $\chi^2(n-1)$. Thus the reasoning of the preceding section can be applied to the inference problem, and we may still use the chi-square distribution to obtain confidence intervals for $\sigma^2$. However, whereas in the previous case when $\mu$ was known we used $\chi^2(n)$, the present case with $\mu$ unknown requires $\chi^2(n-1)$.

where $f(u;n-1)$ is the chi-square p.d.f. for $n-1$ degrees of freedom. For a specified confidence coefficient $\gamma$ the limits $a, b$ can be determined in the usual manner, entering Appendix Table A8 for $n-1$ degrees of freedom. The probability statement for $\sigma^2$, analogous to eq.(7.11), reads

If we therefore did not know what the true value $\mu$ were, the 95% confidence interval for $\sigma^2$ obtained for $(10-1) = 9$ degrees of freedom would be (see Appendix Table A8)

Exercise 7.10: For the example in the text deduce confidence intervals for $\sigma^2$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.99$. (Compare Exercise 7.8.)

Exercise 7.11: If 7.3, 6.6, 7.0, 5.1, 7.1, 8.5, 5.9, 6.5, 6.2 are independent observations from $N(\mu,\sigma^2)$, where $\mu$ is unknown, find confidence intervals for $\sigma^2$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.95$, (c) $\gamma = 0.99$. (Compare Exercise 7.9.)

7.4 CONFIDENCE REGIONS FOR THE MEAN AND VARIANCE

Suppose we are to give a joint confidence region for the mean and the variance in $N(\mu,\sigma^2)$ on the basis of the sample $x_1, x_2, \ldots, x_n$. To do this we use the fact that for normal samples, the variables $\bar{x}$ and $s^2$ are independent (Sect.4.8.6). If, for example, a 95% confidence region is desired we can write two probability statements as

and determine the limits $a$ and $b, b'$ from $N(0,1)$ and $\chi^2(n-1)$, respectively. For the independent variables a joint probability statement is obtained by multiplying the two eqs.(7.16), (7.17), giving

The inequalities in eq.(7.18) determine a region in the parameter space, which is indicated by the shaded area in Fig. 7.2. The region is bounded

See further Sect.9.7.5.

8. Estimation of parameters

The general problem of parameter estimation may be sketched as

In this chapter we shall take up various general aspects of parameter estimation by discussing in some detail a few of the criteria that should be fulfilled by good and acceptable estimators. Although these criteria will be applied to the specific point estimation methods described in Chapters 9-11, the discussion in the following sections will mainly be of a formal nature. The student who requires just a superficial knowledge of these rather theoretical features may therefore be satisfied with reading only the first two sections of this chapter.
8.1 DEFINITIONS

The term estimator denotes in the following a function of the observations, or the method or prescription used to find a value for an unknown parameter. By an estimate we mean the numerical value of the parameter obtained with the estimator for a particular set of observations. If the parameter is $\theta$, its estimate is denoted by $\hat{\theta}$. The term statistic was introduced in Sect.7.1 as a function of one or more random variables that does not depend on any unknown parameters. In the general case we will let the statistic $t = t(x_1,x_2,\ldots,x_n)$ be an estimator of the unknown $\theta$ or of some function of $\theta$.

If a continuous or discrete population has the probability distribution $f(x;\theta)$, the likelihood of the observations $x_1,x_2,\ldots,x_n$ for a specific $\theta$ is given by

$$L(x_1,x_2,\ldots,x_n|\theta) = \prod_{i=1}^{n} f(x_i;\theta). \tag{8.1}$$

Observations are random variables. Any function of the observations will also be a random variable, which may take on a variety of values. If a statistic $t = t(x_1,x_2,\ldots,x_n)$ is used as an estimator for the parameter $\theta$ it will therefore give rise to a distribution of estimates $\hat{\theta}$. The individual estimates obtained are of less interest than their overall distribution, because this distribution will reflect the quality of the estimator when it is used many times. We will therefore judge the merits of an estimator from the characteristics of the distribution of its estimates.

A good estimator should in the long run produce estimates which do not systematically deviate from the true parameter value, and its accuracy should increase with the number of observations. Frequently there are several estimators which fulfil these requirements and hence can reasonably be thought of for estimating an unknown parameter. If so, one estimator can be said to be superior to the others if its distribution of estimates shows the best "concentration" about the true parameter value. "Concentration" may for this purpose be expressed by giving the variance as a measure of the spread of the distribution about its central value.

In the forthcoming sections we will discuss the following optimum properties that are desired for good estimators: consistency, unbiassedness, minimum variance, efficiency and sufficiency. Only rarely will the conceivable estimators for a parameter possess all the good properties. One may therefore have to choose between them, and in each specific case decide which of the

for all $n > N$. In words, the estimator is consistent if, given any small quantity $\epsilon$, we can find a sample size $N$ such that, for all larger samples, the probability that $\hat{\theta}_n$ differs from the true value by more than $\epsilon$ is arbitrarily close to zero.

As an example, we know from the Law of Large Numbers (Sect.3.10.4) that the arithmetic mean of a sample of $n$ measurements from a population with mean $\mu$ and finite variance will converge towards $\mu$ as $n$ becomes large,

The sample mean is therefore a consistent estimator of the population mean.

Consistency and unbiassedness are independent estimator qualities, as neither property implies the other. It is generally accepted that consistency is more important than unbiassedness, partly because bias can often be corrected for. A consistent estimator whose asymptotic distribution has a finite mean will always be asymptotically unbiassed.

Exercise 8.1: Show explicitly that the mean $\bar{x}$ of a sample of size $n$ from the normal population $N(\mu,\sigma^2)$ converges in probability to $\mu$.
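The notion of consistency can be illustrated by a small simulation. The sketch below is an assumption-laden illustration rather than a proof: it draws repeated samples from a normal population with arbitrarily chosen $\mu = 0$, $\sigma = 1$, and estimates the probability that the sample mean differs from $\mu$ by more than a chosen $\epsilon$. The function name, the seed and all numerical settings are hypothetical.

```python
import random
import statistics


def fraction_outside(n, mu, sigma, eps, n_experiments, rng):
    """Estimate P(|sample mean - mu| > eps) from repeated simulated samples."""
    outside = 0
    for _ in range(n_experiments):
        sample_mean = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if abs(sample_mean - mu) > eps:
            outside += 1
    return outside / n_experiments


rng = random.Random(12345)  # fixed seed so that the run is reproducible
for n in (10, 100, 1000):
    p = fraction_outside(n, mu=0.0, sigma=1.0, eps=0.2,
                         n_experiments=500, rng=rng)
    print(f"n = {n:5d}   estimated P(|mean - mu| > 0.2) = {p:.3f}")
```

The estimated probability should shrink towards zero as $n$ grows, which is the behaviour the definition of a consistent estimator demands.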
$$t' = \frac{1}{n-a}\sum_{i=1}^{n} x_i$$

where $a$ is any fixed number. Why do we prefer one to the other?

and $b$ is different from zero, the estimator is biassed. The bias term $b(\theta)$ will for all reasonable estimators be of order $\frac{1}{n}$ or smaller compared to $\theta$.

It is rather trivial to see that the sample mean $\bar{x}$ is an unbiassed estimator of the population mean $\mu$ whenever the latter exists. It is also seen that the estimator $t'$ above will be a biassed estimator of $\mu$ for all $a$ different from zero.

where $\mu$ is the population mean. Remembering the fact that the different $x_i$ are independent we get for the expectation value the result $\left(1-\frac{1}{n}\right)\sigma^2$; we see that the bias is $b(\sigma^2) = -\frac{1}{n}\sigma^2$, which decreases with increasing sample size.
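The size of this bias can be checked exactly for a small discrete population. The sketch below assumes that the estimator in question is the sample variance with divisor $n$, $\frac{1}{n}\sum(x_i-\bar{x})^2$ (the form quoted for the ML estimator in Chapter 9), and enumerates all equally likely samples drawn with replacement from a hypothetical three-point population. Exact rational arithmetic is used, so no rounding enters the comparison with $-\sigma^2/n$.

```python
from fractions import Fraction
from itertools import product


def expected_value_of_estimator(population, n, estimator):
    """Exact expectation of an estimator over all equally likely samples of
    size n drawn with replacement from a finite, uniform population."""
    samples = list(product(population, repeat=n))
    return sum(estimator(s) for s in samples) / len(samples)


def variance_divide_by_n(sample):
    """Sample variance with divisor n (assumed form of the biassed estimator)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n


# Hypothetical population with three equally probable values.
population = [Fraction(0), Fraction(1), Fraction(2)]
pop_mean = sum(population) / len(population)
sigma2 = sum((x - pop_mean) ** 2 for x in population) / len(population)

for n in (2, 3, 4):
    expectation = expected_value_of_estimator(population, n, variance_divide_by_n)
    bias = expectation - sigma2
    print(f"n = {n}: E = {expectation}, bias = {bias}, -sigma^2/n = {-sigma2 / n}")
```

The enumeration grows as $3^n$, so it is only meant for very small $n$; its purpose is to make the $-\sigma^2/n$ behaviour of the bias visible without any simulation noise.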
8.4.2 Example: Estimator of the third central moment

This example is presented to illustrate how one from the first intuitive guess can construct the correct form of an unbiassed estimator. Guided by the preceding example we consider the following sum.

Recalling the definition of the third central moment $\mu_3$ (Sect.3.3.3) and using the independence of the $x_i$ one finds for the expectation of the different parts,

collecting terms lead to

8.5 MINIMUM VARIANCE AND EFFICIENCY

The requirements of consistency and unbiassedness do not uniquely determine how to choose a good estimator. One finds, for instance, that both the sample mean and the sample median are consistent and unbiassed estimators of the location of a normal population with known variance. However, as it can be shown that the variance of the mean is smaller than the variance of the median, the mean is regarded as a better estimator of the central value. It seems natural, therefore, to use the spread in the estimates as a measure for the acceptability of an estimator. For most distributions encountered in practice, the second central moment, or the variance, will be a good measure of the concentration of the estimates; this is especially so for the many cases where this distribution is approximately normal.

Under fairly general conditions there exists a lower bound on the variance of the estimates derived from an estimator. This lower bound is easily established when considering the likelihood function defined by eq.(8.1). We shall assume that the first two derivatives of $L(\mathbf{x}|\theta)$ with respect to $\theta$ exist for all $\theta$, and that the range of $x$ is independent of $\theta$. Given an estimator $t$ of some function of $\theta$, say $\tau(\theta)$, we define its bias $b(\theta)$ by the relation (compare eq.(8.4)),

$$\int\!\cdots\!\int \left(t-\tau(\theta)\right)\frac{\partial \ln L}{\partial\theta}\,L\,d\mathbf{x} = \frac{\partial\tau}{\partial\theta} + \frac{\partial b}{\partial\theta}. \tag{8.9}$$
By applying the Schwarz inequality to the integral we obtain the formula (8.10)

The lower limit of the variance implied by eqs.(8.11), (8.13) is called the minimum variance bound, MVB, and an estimator attaining this limit is called an MVB estimator, or more often, an efficient estimator.

From the derivation above it is realized that the variance of an estimator will attain the MVB if the Schwarz inequality applied to eq.(8.9) becomes equality. The necessary and sufficient condition for this is that $(t-\tau(\theta))$ is linearly related to $\partial \ln L/\partial\theta$ for all sets of observations; we write

From eq.(8.13) one then gets a simple formula for the MVB,

simplifying the MVB formula (8.15) to

$$V(t) = \left(1 + \frac{\partial b}{\partial\theta}\right)\Big/ A(\theta).$$

Efficient estimators exist only for the limited class of problem for

will have a variance larger than the MVB. We therefore define the efficiency of an estimator as the ratio between the MVB and the actual variance $V(t)$ of the estimator,

$$\mathrm{Efficiency}(t) = \frac{\mathrm{MVB}}{V(t)}. \tag{8.16}$$

Exercise 8.3: Prove eq.(8.12). (Hint: Differentiate eq.(8.8) with respect to $\theta$.)

8.5.1 Example: Estimator of the mean in the Poisson distribution

The likelihood for $n$ observations $x_1,x_2,\ldots,x_n$ from the Poisson distribution $f(x;\theta) = \frac{1}{x!}\theta^{x}e^{-\theta}$ is

Hence we have for the derivative of $\ln L$ with respect to the unknown $\theta$,

$b(\theta) = 0$. An unbiassed and efficient estimator of the parameter $\theta$ is therefore $\bar{x}$, the sample mean, with variance given by the MVB formula eq.(8.20),

$$V(t) = \frac{\theta}{n}.$$

8.5.2 Example: Estimators of the mean in the normal p.d.f.

We have seen that the sample mean is a consistent and unbiassed estimator of the mean value in any population of finite variance. To examine further the properties of $\bar{x}$ as an estimator of $\mu$ in the normal distribution $N(\mu,\sigma^2)$ with fixed $\sigma^2$ we write

$$V(\bar{x}) = \frac{\sigma^2}{n}. \tag{8.21}$$

This is in accordance with our previous knowledge that the variable $\bar{x}$ is distributed as $N(\mu,\sigma^2/n)$.

An alternative consistent and unbiassed estimator of $\mu$ is the sample median. It can be shown that, when the size of the normal sample gets very large, the median becomes distributed according to $N(\mu,\pi\sigma^2/2n)$; hence the variance of this estimator of $\mu$ is

$$V(\mathrm{median}) = \frac{\pi\sigma^2}{2n}. \tag{8.22}$$

Exercise 8.6: Show that the MVB of $\mu$ in $N(\mu,\sigma^2)$ can also be found from eq.(8.18).

What is $V(t)$?

Again a comparison with eqs.(8.19), (8.20) shows that the statistic $t = \frac{1}{n}\sum_{i=1}^{n}x_i^2$ is an unbiassed and efficient estimator of the variance $\sigma^2$ of $N(0,\sigma^2)$, with variance $V(t) = 2\sigma^4/n$.

If, alternatively, we take $\sigma$ as the parameter to be estimated we find

$t$ is seen to be

(8.24)

which is in agreement with our previous finding.

Note that the results above hold also if the sample originates from a normal distribution having a known mean value $\mu$ different from zero.

Exercise 8.7: Consider the gamma distribution $f(x;\alpha,\beta) = \left(\Gamma(\alpha)\beta^{\alpha}\right)^{-1}x^{\alpha-1}e^{-x/\beta}$. (a) Assuming $\alpha$ to be known, find an efficient estimator and the MVB of $\beta$. (b) Assuming $\beta$ to be known, does any efficient estimator exist for $\alpha$?
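The comparison of the sample mean and the sample median in Sect.8.5.2 can be turned into a number with the definition of efficiency, eq.(8.16). The following sketch assumes the two variance expressions quoted in the text, $\sigma^2/n$ for the mean (the MVB) and $\pi\sigma^2/2n$ for the median in large normal samples; the function names and the values of $\sigma$ and $n$ are hypothetical.

```python
import math


def variance_of_mean(sigma, n):
    """Variance of the sample mean, equal to the MVB for mu in N(mu, sigma^2)."""
    return sigma ** 2 / n


def asymptotic_variance_of_median(sigma, n):
    """Large-sample variance of the sample median in N(mu, sigma^2)."""
    return math.pi * sigma ** 2 / (2 * n)


def efficiency(mvb, variance):
    """Efficiency defined as the ratio MVB / V(t)."""
    return mvb / variance


sigma, n = 1.5, 200  # hypothetical values; the ratio does not depend on them
eff = efficiency(variance_of_mean(sigma, n),
                 asymptotic_variance_of_median(sigma, n))
print(f"asymptotic efficiency of the median: {eff:.4f}")
```

The ratio reduces to $2/\pi$, so for large normal samples the median needs roughly $\pi/2$ times as many observations as the mean to reach the same variance.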
8.6 SUFFICIENCY

8.6.1 One-parameter case

An estimator $t$ is said to be sufficient if it exhausts all information in the observations $x_1,x_2,\ldots,x_n$ regarding the parameter $\theta$. For the normal distribution we have used the sample mean $\bar{x} = \frac{1}{n}\sum_i x_i$ as an estimator of the population mean $\mu$. No extra knowledge on $\mu$ can be gained from other functions

cient estimator for $\mu$. Actually any function of $\bar{x}$ provides a sufficient statistic for $\mu$ in the normal distribution. Therefore, to choose between the different functions one may have to test also the qualities of consistency, unbiassedness, and efficiency.

To be more precise, let us consider the likelihood function when the p.d.f. is $f(x;\theta)$. Suppose that $L$ can be factorized to give

$$L(\mathbf{x}|\theta) = G(t|\theta)\,H(\mathbf{x}), \tag{8.25}$$

where the function*) $G$ involves the statistic $t$ and the parameter $\theta$, and $H$ is independent of $\theta$, being a function of the observations $\mathbf{x}$ only. Since $G$ only has reference to the data via $t = t(x_1,x_2,\ldots,x_n)$ and $H$ is a known number for the given sample, $t$ must supply all the available information in the data regarding $\theta$. It follows that whenever the likelihood function can be written in the form of eq.(8.25), $t$ is a sufficient statistic for the parameter $\theta$.

One can show that a necessary condition for a sufficient statistic to exist is that the p.d.f. belongs to the exponential family, defined by

$$f(x;\theta) = \exp\left[B(\theta)C(x) + D(\theta) + E(x)\right], \tag{8.26}$$

where $B, C, D, E$ are functions of the indicated arguments.

For the restricted class of p.d.f.'s satisfying eq.(8.26) one sees that the factorization requirement of eq.(8.25) implies that a sufficient statistic must be expressed by the function $C(x)$,

$$t = \sum_{i=1}^{n} C(x_i). \tag{8.27}$$

It is easily shown that efficient estimators are always sufficient. To see this we take logarithms on both sides of eq.(8.25) and differentiate with respect to $\theta$, getting

$$\frac{\partial \ln L}{\partial\theta} = \frac{\partial \ln G(t|\theta)}{\partial\theta}. \tag{8.28}$$

We see that the efficiency condition eq.(8.14) is just a special case of the more general eq.(8.28), with

Under the regularity conditions specified for the likelihood function in Sect.8.1 there is among all sufficient estimators $t$ for $\theta$ only one efficient estimator.

If the p.d.f. belongs to the exponential family and the range of $x$ is independent of $\theta$, it can be shown that sufficient statistics always exist for $\theta$. However, there will be just one sufficient statistic which will satisfy eq.(8.29) and thus estimate some function $\tau(\theta)$ with variance equal to the MVB; compare the example of Sect.8.5.3. Furthermore, for large samples, any function of a sufficient statistic will be an MVB estimator.

Exercise 8.8: Verify that the Poisson distribution $P(x;\theta) = \frac{1}{x!}\theta^{x}e^{-\theta}$ belongs to the exponential family. Compare Sect.8.5.1.

Exercise 8.9: Show that the Cauchy p.d.f. $f(x;\theta) = \frac{1}{\pi}\,\frac{1}{1+(x-\theta)^2}$ does not have a sufficient estimator of $\theta$. Compare Exercise 8.4.

We have already seen (Sect.8.5.2) that the sample mean $\bar{x}$ is an efficient estimator of the mean $\mu$ in the normal distribution. According to the general statement of the previous section $\bar{x}$ is then also a sufficient statistic for $\mu$. We want to show this more directly, and observe that, generally
Thus we can make the identification with the functions

If, instead, $\mu$ was fixed and $\sigma^2$ to be estimated we would write the likelihood function as

where $\boldsymbol{\theta} = \{\theta_1,\theta_2,\ldots,\theta_k\}$. Writing out the likelihood function one finds by comparison with the factorization property (8.30) that the $k$ joint sufficient statistics of the $k$ parameters must be expressed by the functions $C$ of the observations.

However, these estimators of $\mu$ and $\sigma^2$ are neither unbiassed nor consistent. Considering instead (compare Sect.5.1.6)

it is seen that these variables define a one-to-one mapping of $t_1, t_2$ onto $r_1, r_2$. The statistics $r_1$ and $r_2$ are therefore also jointly sufficient estimators for the two parameters. Moreover, these are unbiassed estimators of $\mu$ and $\sigma^2$, because

The joint likelihood function for $\mu$ and $\sigma^2$ can be shown to factorize to the form of eq.(8.30), but since

$s^2$ is not a single sufficient estimator of $\sigma^2$ when $\mu$ is unknown, nor is $\bar{x}$ a single sufficient estimator of $\mu$ when $\sigma^2$ is unknown.

The method of parameter estimation known as the Maximum-Likelihood (ML) method is very general and powerful. For estimation problems where a functional dependence can be written down for the observed variables, the ML method is eminently satisfactory for two reasons: it provides estimators with desirable properties, and the estimators are easy to find. The ML theory has a fundamental position in all problems of parameter estimation where the functional form of the p.d.f. is given. We will therefore treat the ML method somewhat at length and discuss its theoretical aspects as well as its practical implications.

In expositions of the ML method aimed for physicists, it is often the asymptotic properties of the ML estimators which are emphasized. For large samples the ML estimates are normally distributed. This nice property makes the determination of variances on ML estimates very simple. The following presentation will also emphasize the asymptotic properties, and these should be fairly easy to extract, without reading the chapter in full. However, it is in the framework of sufficient statistics that the ML estimators have their most important properties. We have already given a somewhat theoretical discussion of some of the fundamental properties of estimators in Chapter 8, and we will find in this chapter that the ML estimators possess most of these good properties.

The reader who wants only a first working knowledge of the ML method, and who wants mainly to know its asymptotic properties, can select as an initial reading the sections 9.1, 9.2, 9.5.1, 9.5.4 - 9.5.6, 9.6.1, 9.6.3, 9.1.1, 9.9. In particular one should note that the very simple graphical solution of the ML estimation problem, described in Sects.9.6.1 and 9.6.3, can be applied in many practical situations involving one or two unknown parameters.
9.1 THE MAXIMUM-LIKELIHOOD PRINCIPLE

Consider a p.d.f. $f(x|\theta)$ with an unknown parameter $\theta$ to be estimated from the set of observations

as the joint conditional probability of the observations $x_1,x_2,\ldots,x_n$ at a fixed $\theta$. Since $f(x|\theta)$ is a probability density function properly normalized to one, we see that when the LF is considered a function of the $x_i$'s the integration over the total sample space $\Omega$ yields

$$\int_{\Omega} L(\mathbf{x}|\theta)\,d\mathbf{x} = 1. \tag{9.2}$$

$\hat{\theta}$ is negative.

Usually $L$ has only one maximum, and $\hat{\theta}$ is unique. If there is more than one maximum, one should look for supplementary information to choose between the solutions.

Since $L$ and the logarithm of $L$ attain their maxima for the same value of $\theta$, the ML solution may be found from the likelihood equation

In many practical problems the solution of eq.(9.6), or more generally eqs.(9.8), has to be found numerically. In fact, the numerical procedure can in many instances be advantageous, since it gives also directly the variances to be associated with the ML estimates; see Sect.9.6.2 for an example.

**) This choice of the "best value of a parameter" as the one that maximizes the conditional probability of $x$ for given $\theta$, is not obvious. From Bayes' Theorem, for instance, a more intuitive choice of a "best value of $\theta$" would be the one which maximizes the joint probability of $x$ and $\theta$; compare Sect.2.4.3.
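A numerical solution of the likelihood equation can be sketched as follows. The example assumes the exponential lifetime p.d.f. $f(t|\tau) = \frac{1}{\tau}e^{-t/\tau}$ treated in Sect.9.1.1, a handful of hypothetical flight-times, and a golden-section search as the maximization method; the search routine is an illustrative choice made for this sketch, not a procedure taken from the text. Since $\ln L$ has a single maximum in $\tau$ for this p.d.f., the numerical result can be compared with the arithmetic mean of the flight-times.

```python
import math


def log_likelihood(tau, times):
    """ln L for the exponential p.d.f. f(t|tau) = exp(-t/tau) / tau."""
    return -len(times) * math.log(tau) - sum(times) / tau


def maximize(func, low, high, tol=1e-9):
    """Golden-section search for the maximum of a unimodal function
    on the interval [low, high]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = low, high
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    while b - a > tol:
        if func(c) > func(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
    return 0.5 * (a + b)


# Hypothetical observed flight-times in arbitrary units (assumed data).
times = [0.8, 2.1, 0.3, 1.7, 4.2, 0.9, 1.1, 2.6]

tau_hat = maximize(lambda tau: log_likelihood(tau, times), 0.01, 20.0)
print(f"numerical ML estimate: {tau_hat:.5f}")
print(f"arithmetic mean:       {sum(times) / len(times):.5f}")
```

For problems with several parameters, or with more than one maximum, a search of this simple kind is not sufficient and a general minimization program would be the natural tool.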
9.1.1 Example: Estimate of mean lifetime

momentum and the length between the production and the decay points, the proper flight-time $t_i$ for each event is determined. For $n$ observed events the LF is, according to the definition eq.(9.1),

$$L(\mathbf{t}|\tau) = \prod_{i=1}^{n}\frac{1}{\tau}\,e^{-t_i/\tau}.$$

Hence the ML estimate of the parameter $\tau$ is equal to the arithmetic mean of the observed flight-times. The solution obtained does correspond to a maximum of the LF, because

Exercise 9.1: Instead of having the lifetime $\tau$ as a parameter in the example above, one can alternatively use the decay constant $\lambda = 1/\tau$ as the parameter with the p.d.f. $f(t|\lambda) = \lambda e^{-\lambda t}$. Show that the ML estimate of $\lambda$ is

Exercise 9.2: Assume that we observe the production and decay of particles within a finite detector. The p.d.f. is then (compare Sect.6.3.1)

Show that the ML estimate for the parameter $\tau$ can be expressed as

9.2 ESTIMATION OF PARAMETERS IN THE NORMAL DISTRIBUTION

To illustrate further the ML estimation of unknown parameters we will consider some useful examples where the LF is written in terms of normal probability density functions.

when $L$ is considered a function of $\mu$. In this case eq.(9.6) reads

Therefore, the ML estimate of the population mean $\mu$ is equal to the sample mean $\bar{x}$.

9.2.2 Estimation of $\mu$; measurements with different errors (weighted mean)

We assume now that the measurements $x_1,x_2,\ldots,x_n$ on the unknown quantity $\mu$ have different, but still known errors. If each measurement $x_i$ is normally distributed with measuring error $\sigma_i$*), the LF is
From eq.(9.8) we should now solve the set of two simultaneous equations

Exercise 9.3: Show that the ML estimate of $\sigma^2$ in $N(\mu,\sigma^2)$ for given $\mu$ is

Exercise 9.4: Show that the estimates $\hat{\mu}$, $\hat{\sigma}^2$ correspond to a maximum of the LF.

determination in Sect.9.1.1 we could have used the decay constant $\lambda$ as parameter instead of the mean lifetime $\tau$.

Let $\hat{\theta}$ be the ML solution for the parameter $\theta$. If we had chosen to estimate a function of $\theta$, say $\tau(\theta)$, then the ML solution for this function would be that value $\hat{\tau}$ for which $\partial L/\partial\tau = 0$. Since $\partial L/\partial\theta = (\partial L/\partial\tau)(\partial\tau/\partial\theta)$ for all $\theta$, and $\partial L/\partial\theta = 0$ for $\theta = \hat{\theta}$, it follows when $\partial\tau/\partial\theta \neq 0$ that $\partial L/\partial\tau = 0$ for $\theta = \hat{\theta}$; hence we must have

$$\hat{\tau} = \tau(\hat{\theta}).$$

The ML estimators are frequently not unbiassed for finite samples. For instance, the ML estimator of $\sigma^2$ in $N(\mu,\sigma^2)$ is

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$$

(see Sect.9.2.3), and this is a biassed estimator of $\sigma^2$, because

The bias is here $-\frac{1}{n}\sigma^2$, which is negligible for large $n$.

Exercise 9.6: Show that the ML estimates of $\mu$ and $\sigma$ of the normal distribution $N(\mu,\sigma^2)$ are $\hat{\mu} = \bar{x}$ and $\hat{\sigma} = \left(\frac{1}{n}\sum_i (x_i-\hat{\mu})^2\right)^{1/2}$, respectively. It has been shown in Sect.8.3 and Sect.8.4 that $\bar{x}$ is a consistent and unbiassed estimator of $\mu$. Show that the estimator of $\sigma$ is biassed, but consistent. Compare the estimate $\hat{\sigma}$ with the estimate of $\sigma^2$ derived in Sect.9.2.3 and note that this provides another example on the invariance of ML estimators.

9.4.4 Sufficiency

It was stated in Sect.8.6 that if the likelihood function can be factorized as

The present remarks on sufficiency for ML estimators also apply to the multi-parameter case. For the case with $k$ unknown parameters it can be proved that $r \le k$ sufficient estimators $t_1,t_2,\ldots,t_r$ can simultaneously have their minimum attainable variance.

9.4.5 Efficiency

The variance of an ML estimator can not be arbitrarily small. The necessary and sufficient condition that an efficient, or minimum variance bound (MVB), estimator $t$ exists for the parameter $\theta$, is that one can write (Sect.8.5)

or

which is explicitly of the form (8.25). This demonstrates that

only one statistic $t$ which will estimate some function $\tau(\theta)$ with variance equal to the MVB. The MVB can be found if one can write

$\hat{\theta}$ with variance equal to the MVB, provided $\ln L$ is twice differentiable in $\theta$ and the range of $x$ is independent of $\theta$.

equal to the MVB, provided that certain regularity conditions hold. This implies that ML estimators are asymptotically efficient, and hence also asymptotically sufficient, since whenever efficiency holds sufficiency holds too.

A Taylor expansion about the true value $\theta = \theta_0$ of $\partial \ln L/\partial\theta$ at the ML estimate $\theta = \hat{\theta}$ gives
9.4.6 Uniqueness
It is easy t o see t h a t , i f an e f f i c i e n t e s t i m a t o r e x i s t s f o r some
/ Therefore. from eq.(8.12).
a
f u n c t i o n T(B), then t h e MI estimate e is unique. Differentiating the efficiency
c o n d i t i o n eq. (8.15) w i t h respect t o 8 and i n s e r t i n g 8 = 8 leads t o
I writing
alnLlaE -
i n v i r t u e of eq.(9.16). Thus every s o l u t i o n of the l i k e l i h o o d equation
0 corresponds t o a maximum of t h e LF. Since, f o r a r e g u l a r function.
t h e r e must be a minimum between successive maxima, i t follows t h a t t h e r e cannot
the r i g h t hand a i d e i s e sum of o independent t e r r a alnf(xi18)/a8. Yhich has a
mean value of zero and a variance given by eq.(9.18).
Theorem t h e q u a n t i t y
Prom t h e Central L i p i t
i n v a r i a n t under traneformation t o t h e decay constant A - 1
-; . The e s t i m a t o r of
r can i n t h i s r e s p e c t be regarded s u p e r i o r t o t h e e s t i m a t o r of A , which is
at 0 - e0 is a s t a n d a r d i r e d v a r i a b l e w i t h asymptotic d i s t r i b u t i o n N(0.1).
(9.20)
The
biassed and n o t e f f i c i e n t .
We s h a l l now e x p l i c i t l y prove t h a t t h e HL eatimate ;-
samples is normally d i s t r i b u t e d about t h e t r u e value T w i t h v a r i a n c e equal t o
f o r large
(8-e0)[~(- ),]
a'ln~ I = (E(-
a21n~1
, (9.23)
ax(t) E ( ~ ~
E exp (. t) i i l x i ) ) .
~ i~t
Example: Asymptotic normality of t h e ML estimator of the mean l i f e t i m e and olay compare with t h e c h a r a c t e r i s t i c function f o r a normal v a r i a b l e with
9.4.8
We heve i n t h e l a a t s e c t i o n s i n d i c a t e d t h a t t h e HL e s t i m s t o r of t h e
mean u and variance a',
1 -tlr
mean l i f e t i m e T in t h e p.d.f. f ( t l T ) -; e has a nllmber of optimum propcr-
t i e s : i t i. unique, c o n s i s t e n t , unbiassed, s u f f i c i e n t , e f f i c i e n t , and a l s o
I t is seen. t h e r e f o r e , t h a t
A
T --x, f o r l a r g e but f i n i t e n, w i l l have s norms1
d i s t r i b u t i o n w i t h mean value r and v a r i a n c e r21n, which i s equal t o t h e HVB.
Accordingly. t h e s t a n d a r d i z e d q u a n t i t y
A
where t h e i n t e g r a t i o n is over a l l n x.'s and
t r u e values of t h e parameters.
B - 181 ,en,. ..,ek)
The formula ahove can b e used t o f i n d t h e co-
represent the
T - T
u=-
~ a r i a n c ematrix from t h e given £(XI?) alone without having any d a t a a v a i l a b l e .
TI&
, e x a l e on t h e use of eq.(9.26)
b is given i n Sect.9.5.2.
is N ( 0 . 1 ) f o r large s a q l e s
With eq.(9.26) an e q u i v a l e n t formula f o r t h e covariance t e r m can be
B as
A
1
r e f e r r e d t o as s m 1 1 sample f o m t a e . be noted t h a t t h e a n a l y t i c a l i n t e g r a t i o n of formulae (9.26). (9.27) can only be
When t h e experimental r e s o l u t i o n has been folded i n t o t h e p . d . f . (see ~ a r r i e dout i n r a t h e r favourable c a s e s . They w i l l lead t o t h e covariances
Sect.6.2) t h e errors c a l c u l a t e d from t h e l i k e l i h o o d f u n c t i o n w i l l c o n t a i n t h e expressed as functions of t h e t r u e (constant) parameters.
s t a t i s t i c a l as well as t h e experimental u n c e r t a i n t i e s . I f t h e r e s o l u t i o n i s mot '
1
Let us n w consider t h e l i k e l i h o o d function L(?lIi) as a f u n c t i o n of
included i n t h e p . d . f . . t h e errors estimated from an i d e a l t h e o r e t i c a l d i s t r i b u - f o r given 5. Since L(?[!) i s normalized t o one over t h e sample space, i t i s
tion obviously only reflect the statistical uncertainty and not the errors in
the measurements. When the estimation of the errors is done directly from the
observed data, as for instance by the graphical method described later in
Sect.9.6, these errors will clearly contain both the experimental and the
statistical uncertainties.

9.5.1 General methods for variance estimation
Let us now regard the likelihood function
$L(\underline{x}|\underline{\theta}) = \prod_{i=1}^{n} f(x_i|\underline{\theta})$
as the joint p.d.f. of the n variables $x_1, x_2, \ldots, x_n$ for the k
parameters $\theta_1, \theta_2, \ldots, \theta_k$. If the estimates can be
written explicitly as functions of the $x_i$'s, i.e.
$\hat\theta_i = \hat\theta_i(x_1, x_2, \ldots, x_n)$, the covariance term between
$\hat\theta_i$ and $\hat\theta_j$ may be defined as

generally not normalized over the parameter space. The variances may then be
evaluated from the alternative formula *)

$$V_{ij}(\hat{\underline{\theta}}) = \frac{\int (\theta_i-\hat\theta_i)(\theta_j-\hat\theta_j)\,L(\underline{x}|\underline{\theta})\,d\underline{\theta}}{\int L(\underline{x}|\underline{\theta})\,d\underline{\theta}} \qquad (9.28)$$

where the integrations are extended over all k parameters. Again, in fortunate
situations, it may be possible to carry out parts of the integrations
analytically, or remove common factors in the denominator and numerator. In the
general case, approximate values of the covariance between any pair of
parameters are found by numerical integrations of the type

where the $\Delta\theta$'s are the proper bin widths for the required
integrations over the different parameters, and

is a common overall normalization factor.
If, in the last formulation, some of the parameters are not inte-

*) Equations (9.28), (9.29) formally consider the LF as providing a measure of
the distribution of the variables $\underline{\theta}$; a discussion of the
conceptual difficulties on this point is deferred to Sect.9.7.

The alternative formula, eq.(9.28), gives

which leads to

Thus, only for infinitely large n do the results of the two procedures, eqs.
(9.26) and (9.28), coincide.
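As a rough numerical illustration of eq.(9.28) for a single parameter, the sketch below evaluates the two integrals by a simple sum over equal bins. It assumes, purely for illustration, an exponential lifetime p.d.f. with n = 50 decay times of sample mean 1; these numbers and the helper name `variance_from_likelihood` are hypothetical and not taken from the text.

```python
import numpy as np

def variance_from_likelihood(log_l, theta_hat, grid):
    """Evaluate eq.(9.28) for one parameter by a sum over equal bins."""
    step = grid[1] - grid[0]
    log_values = log_l(grid)
    # Rescale by the maximum so that the exponentials do not underflow;
    # the common factor cancels between numerator and denominator.
    weights = np.exp(log_values - log_values.max())
    norm = weights.sum() * step
    return ((grid - theta_hat) ** 2 * weights).sum() * step / norm

# Assumed toy input: n exponential decay times with sample mean t_bar.
n, t_bar = 50, 1.0
tau_hat = t_bar  # ML estimate of the mean lifetime

def log_likelihood(tau):
    # ln L for the exponential p.d.f. (1/tau) exp(-t/tau), up to a constant
    return -n * np.log(tau) - n * t_bar / tau

grid = np.linspace(0.2, 8.0, 400_001)
v_alt = variance_from_likelihood(log_likelihood, tau_hat, grid)
v_mvb = tau_hat ** 2 / n  # minimum variance bound for this p.d.f.
print(v_alt, v_mvb)
```

For this toy sample the value obtained from the likelihood function is somewhat larger than the minimum variance bound, and the difference shrinks as n grows, in line with the remark above that the two procedures coincide only for infinitely large n.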
This can be written

where the integrations are from zero to infinity for all the n variables $t_i$.
A straightforward computation gives

This $V(\hat\tau)$ is the same as the MVB derived in Sect.9.4.5 for the efficient
estimate.

Exercise 9.12: Find, using eq.(9.28), the covariance matrix of the simultaneous
ML estimates $\hat\mu = \bar{x}$, $\hat\sigma^2 = \frac{1}{n}\sum_i (x_i-\bar{x})^2$
of the mean and variance in $N(\mu,\sigma^2)$ (Sect.9.2.3).
Hint: The LF can be written

9.5.3 Variance of sufficient ML estimators
The formulae given in Sect.9.5.1 are generally valid for all ML estimators,
irrespective of the sample size. In specific cases more convenient formulae may
be developed, and we turn now to the situation where the ML estimators are
sufficient.
When the p.d.f. $f(x|\theta)$ provides a single sufficient statistic, and
consequently an efficient estimator, for the parameter $\theta$, we have already
seen in Sect.9.4.5 that the variance of the ML estimator is given by the minimum
variance bound,

Thus,

or, for an unbiassed estimator simply

It should be noted that these relations hold for small samples as well as for
large samples, and in particular eq.(9.31) is very useful in practice.
In the multi-parameter case the situation is not so simple. If, however, there
exists a set of k jointly sufficient statistics $t_1, t_2, \ldots, t_k$ for the
k parameters $\theta_1, \theta_2, \ldots, \theta_k$, it can be shown that the
inverse of the covariance matrix of the ML estimates in large samples is given
by

Exercise 9.13: It was shown in Chapter 8 that $\bar{x}$ is an unbiased and
efficient estimator of $\mu$ in the normal distribution. Find the variance of
the ML estimate $\hat\mu = \bar{x}$ from eq.(9.31).

Since the weighted mean is an unbiassed and efficient ML estimator of $\mu$
(compare Exercise 9.8) and

the variance can be found from eq.(9.31).
When the errors $\sigma_i$ are all equal, $\sigma_i = \sigma$, eq.(9.12) leads
to the well-known expression for the error on the mean,
$\Delta\hat\mu = \sigma/\sqrt{n}$.

and hence

and has a variance 2(n-1). Therefore,
$V(\hat\sigma^2) = (\sigma^2/n)^2\,V(n\hat\sigma^2/\sigma^2) = (\sigma^2/n)^2\,2(n-1) = 2\sigma^4(n-1)/n^2$,
which holds strictly for all n. This is close to the result $2\sigma^4/n$
implied by the asymptotic formula (9.33).

The fact that the ML estimate is asymptotically normally distributed about the
true parameter value can, in view of the formal symmetry between variable and
mean value in the normal p.d.f., be formally expressed as the parameter being
asymptotically normally distributed about the ML estimate, with a spread around
this mean value as implied by the MVB. Writing the LF for the one-parameter
case as

we find at once the simple relationship

(9.36)

For large n the variance of $\hat\alpha$ can be calculated from the asymptotic
formula:

$$\int_{-1}^{+1} \frac{1}{f}\left(\frac{\partial f}{\partial\alpha}\right)^2 dx = \int_{-1}^{+1} \frac{x^2}{2(1+\alpha x)}\,dx = \frac{1}{2\alpha^3}\left(\ln(1+\alpha) - \ln(1-\alpha) - 2\alpha\right).$$

Hence, from eq.(9.34),

$$V(\hat\alpha) = \frac{1}{n}\,\frac{2\alpha^3}{\ln(1+\alpha) - \ln(1-\alpha) - 2\alpha}.$$

When $\alpha \ll 1$ this can be written

If, prior to the experiment, $\alpha$ is assumed to be approximately 0.1 and we
wish to determine the parameter with an uncertainty $\Delta\alpha = 0.01$, we
find from eq.(9.40) that more than $3\cdot10^4$ events will be needed.

density matrix elements to be estimated are $\rho_{00}$, $\rho_{1,-1}$, and
$\mathrm{Re}\,\rho_{10}$. The three

is necessary to find the estimates $\hat\rho_{00}$, $\hat\rho_{1,-1}$,
$\mathrm{Re}\,\hat\rho_{10}$ and their errors. We may now ask: Can anything be
said about the covariance matrix of the parameters before data are available?
It can be verified that for the distribution (9.41) there exists no set of
jointly sufficient statistics for the three parameters, or for any combination
of two of them. Nor is there a single sufficient statistic for any of the
parameters alone. For small samples of data, therefore, little can be said
about the errors from the theoretical p.d.f. For large samples, however, the
asymptotic covariance terms may in principle be calculated from eq.(9.35).

in the Jackson reference system, the spin structure can be described in the
so-called dynamic reference system where the "observable" part of the density
matrix is diagonal. For vector particles the density matrix can be diagonalized
by rotating the Jackson reference system a certain angle about the y-axis. The
three independent parameters in the dynamic reference system are called a, 8,
and 8. The decay distribution is of the form

9.6 GRAPHICAL DETERMINATION OF THE MAXIMUM-LIKELIHOOD ESTIMATE AND ITS ERROR
In many practical problems neither the ML estimate nor its variance can be
found analytically. The numerical behaviour of $L(\underline{x}|\underline{\theta})$
as a function of $\underline{\theta}$ can, however, be used to determine the
estimate $\hat{\underline{\theta}}$ as well as its error
$\Delta\hat{\underline{\theta}}$ graphically when the number of parameters is
limited to one or two.

9.6.1 The one-parameter case
When the estimation problem involves a single parameter, one can read off the
ML estimate $\hat\theta$ from the graph as that particular value of $\theta$
for which the curve peaks. Except for rare situations the curve will have a
single maximum, and hence a unique solution for the ML estimate $\hat\theta$. If
more than one maximum shows up within the physically admissible range of
$\theta$, one will usually take $\hat\theta$ as that value of $\theta$ which
corresponds to the highest maximum.
In the case of a single maximum, or one dominant maximum well separated from
other smaller maxima, one deduces the error in the ML estimate $\hat\theta$ by
looking up the values of $\theta$ for which L has fallen by a factor of
$e^{-0.5}$ of its maximum value.
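The graphical prescription of the one-parameter case can be imitated numerically by scanning ln L on a grid. The sketch below is only an illustration under assumed inputs: the exponential lifetime likelihood, the sample size n = 20, the sample mean 1 and the helper name `read_off_estimate` are hypothetical choices, not taken from the text.

```python
import numpy as np

def read_off_estimate(log_l, grid, drop=0.5):
    """Return (theta_hat, lower, upper) from a scan of ln L over `grid`.

    lower and upper are the parameter values where ln L lies `drop` below
    its maximum, i.e. where L has fallen by a factor exp(-drop).
    Assumes a single maximum inside the scanned range.
    """
    values = log_l(grid)
    i_max = int(np.argmax(values))
    target = values[i_max] - drop
    # np.interp needs increasing abscissae: the left branch rises towards
    # the maximum, the right branch is reversed so that it rises as well.
    lower = np.interp(target, values[: i_max + 1], grid[: i_max + 1])
    upper = np.interp(target, values[i_max:][::-1], grid[i_max:][::-1])
    return grid[i_max], lower, upper

# Assumed toy input: n exponential decay times with sample mean t_bar.
n, t_bar = 20, 1.0

def log_l(tau):
    return -n * np.log(tau) - n * t_bar / tau

tau_hat, tau_lo, tau_hi = read_off_estimate(log_l, np.linspace(0.5, 2.5, 20_001))
print(f"tau = {tau_hat:.3f}  -{tau_hat - tau_lo:.3f}  +{tau_hi - tau_hat:.3f}")
```

With a small sample the likelihood curve is not symmetric about its peak, so the upper and lower errors read off in this way differ from each other.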
where the number of observations is finite. In the general case, with an un-

9.6.2 Example: Scanning efficiency (3)
We have on two earlier occasions examined the question of how to

N (unknown) is the total number of events in the film,

scan is, according to the binomial distribution law,

$$P_1(N_1|N,\epsilon_1) = \frac{N!}{N_1!\,(N-N_1)!}\,\epsilon_1^{N_1}(1-\epsilon_1)^{N-N_1}. \qquad (9.44)$$

Then,

which is symmetric in the indices 1 and 2.
The joint probability P may be interpreted as the likelihood
$L(N_1,N_2,N_{12}|N,\epsilon_1,\epsilon_2)$ of the observations $N_1$, $N_2$,
$N_{12}$ for the parameters $N$, $\epsilon_1$, $\epsilon_2$. One can therefore
solve the three simultaneous likelihood equations for these parameters to find
their ML estimates $\hat{N}$, $\hat\epsilon_1$, $\hat\epsilon_2$.
The likelihood equations $\partial\ln L/\partial\epsilon_1 = 0$ and
$\partial\ln L/\partial\epsilon_2 = 0$ give as expected for the efficiencies of
the individual scans

The likelihood equation $\partial\ln L/\partial N = 0$ can, unfortunately, not
be solved analytically. Since, however, we are primarily interested in the
number N rather than the individual efficiencies we can integrate the joint
probability of eq.(9.47) over the "nuisance" variables $\epsilon_1$,
$\epsilon_2$ to obtain a likelihood function involving only the parameter N.
The result of this integration is

This form is not particularly suitable for numerical evaluation when the
observed numbers are large. However, the expression can be formulated in terms
of a recurrence relation,

which is very convenient for numerical calculation. Since the number N of
events in the film must be at least as large as the number of events found from
the two scans, one can take $L(N_1+N_2-N_{12}) = 1$ as a starting value and
proceed stepwise to generate as much of the LF as desired.

Fig. 9.2. The likelihood function L(N) of eq.(9.49) generated with the
numbers $N_1 = 43$, $N_2 = 48$, $N_{12} = 25$, starting from N = 67 (L(66) = 1).

The estimated efficiencies are

$$\hat\epsilon_1 = \frac{43}{81} = 0.53, \qquad \hat\epsilon_2 = \frac{48}{81} = 0.59.$$

Exercise 9.17: With the observations of the example in the text, what is the
total number of events and the scanning efficiencies estimated with the
conventional formulae?
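The stepwise generation of L(N) described in the example can be sketched as follows. The ratio L(N)/L(N-1) used here is a reconstruction and is not quoted from eq.(9.49): it assumes that the joint probability is the multinomial for events found by both scans, by one scan only, or by neither, and that the two efficiencies are integrated uniformly over the interval from 0 to 1. The function name `scan_likelihood` is hypothetical.

```python
def scan_likelihood(n1, n2, n12, n_max):
    """Generate L(N) stepwise for N = N1+N2-N12, ..., n_max.

    Assumed recurrence (from uniform integration over both efficiencies):
        L(N) / L(N-1) = N (N-N1) (N-N2) / [(N-N1-N2+N12) (N+1)^2]
    """
    n_start = n1 + n2 - n12  # smallest possible number of events in the film
    likelihood = {n_start: 1.0}
    for n in range(n_start + 1, n_max + 1):
        ratio = n * (n - n1) * (n - n2) / ((n - n1 - n2 + n12) * (n + 1) ** 2)
        likelihood[n] = likelihood[n - 1] * ratio
    return likelihood

# Numbers of the example: N1 = 43, N2 = 48, N12 = 25, so the start is L(66) = 1.
lf = scan_likelihood(43, 48, 25, 200)
n_hat = max(lf, key=lf.get)  # N at which the generated LF peaks
print(n_hat, round(43 / n_hat, 2), round(48 / n_hat, 2))
```

Under these assumptions the generated function peaks at the value of N that reproduces the efficiencies 43/81 and 48/81 quoted in the example.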
mon. $N_1+N_2-N_{12} = 66$. Thus we put L(66) = 1 and generate new values of
L(N) from eq.(9.49) starting with N = 67. The result of this computation is
shown in Fig. 9.2. From the shape of the LF we conclude that the ML estimate of
the unknown number of events is $\hat{N} = 81$.

shape of the function can conveniently be visualized by plotting level curves
for constant values of $L(\underline{x}|\theta_1,\theta_2)$ in a
$(\theta_1,\theta_2)$ plane, equivalent to drawing intersections between the
surface and a set of parallel planes. In the vicinity of a maximum of the
function these level curves will be a series of smooth, closed contours around
the maximum point $(\hat\theta_1,\hat\theta_2)$ which can thus be localized to a
required accuracy.
With two parameters the LF will often have more than one maximum. Usually,
however, there is little trouble in identifying one particular maximum with the
desired ML solution for the parameter estimates. For example, some of the
maxima may occur in unphysical regions of the parameter space and can therefore
be eliminated right away, or the principal maximum can be overwhelmingly
favourable to the secondary maxima from the numerical magnitude of their
likelihoods. With ill-behaved likelihood functions having two or more maxima of
comparable magnitude the ambiguities of the solution may have to be resolved by
looking for additional information. An example is given by the case study
described in Sect.9.12.
For a regular LF with a single maximum in the parameter region of interest the
errors in the ML estimates of the two parameters can be obtained from the
specific likelihood contour for which $L = L(\max)e^{-1/2}$. The tangents to
this contour parallel to the coordinate axes provide a set of upper and lower
errors for the two parameter estimates, as indicated in Fig. 9.3(a). If the LF
has the shape of a binormal distribution the errors deduced this way are
identical to the standard deviations, as will be demonstrated below.

Fig. 9.3. Graphical determination of the ML estimates and their errors from
the two-parametric likelihood function; (a) the tangential method, (b) the
intersection method (conditional errors).

A second approach determines the "errors" in the parameter estimates from the
intersections between the same contour and the two lines
$\theta_1 = \hat\theta_1$ and $\theta_2 = \hat\theta_2$ as indicated in
Fig. 9.3(b). With this intersection method the "errors" in either parameter are
thus deduced by keeping the other parameter at its estimated value, making
these "errors" in general smaller than the errors obtained by the tangential
method. Since an asymmetric orientation of the contour $L = L(\max)e^{-1/2}$
relative to the coordinate axes reflects a non-zero correlation between the
estimates, these two approaches to error determination will obviously produce
identical results in situations with uncorrelated parameters only.
In the asymptotic limit of infinitely large samples the LF takes the binormal
form

where $\sigma_1^2$ and $\sigma_2^2$ are the variances for the two ML estimates
$\hat\theta_1$ and $\hat\theta_2$, and $\rho$ their correlation coefficient, as
can be verified by calculating the matrix elements
$V^{-1}_{ij}(\hat{\underline{\theta}})$ according to eq.(9.37) and inverting the
resulting matrix. The contour $L = L(\max)e^{-1/2}$ is now given by a quadratic
equation in $\theta_1$ and $\theta_2$,

This is the covariance ellipse for the binormal LF. The ellipse is centred at
$(\hat\theta_1,\hat\theta_2)$, and its principal axes make an angle $\alpha$
relative to the coordinate system, where

$\theta_1 = \hat\theta_1 \pm \sigma_1$ and $\theta_2 = \hat\theta_2 \pm \sigma_2$

coordinate axes will always have distances $\pm\sigma_1$, $\pm\sigma_2$ from the
point $(\hat\theta_1,\hat\theta_2)$; this serves to justify the tangential
method for graphical error determination
(Fig. 9.3(a)). The intersection method for determining the errors, consisting

the covariance ellipse at distances $\pm\sigma_1\sqrt{1-\rho^2}$,
$\pm\sigma_2\sqrt{1-\rho^2}$ from $(\hat\theta_1,\hat\theta_2)$. The last
observation should make it clear why one must be careful in using this method,
since merely quoting the intersection distances as errors will be incomplete.

the second is ignored. The region enclosed by this contour does not represent
a joint 68.3% probability for the two parameters, but corresponds to a much
lower joint probability, in fact less than 40% in the ideal asymptotic case;

Fig. 9.4. Covariance ellipses for binormal likelihood functions with common
maximum $(\hat\theta_1,\hat\theta_2)$ and common variances $\sigma_1^2 = 4$,
$\sigma_2^2 = 1$. The ellipses for different values of the correlation
coefficient $\rho$ all touch the rectangle defined by
$\theta_1 = \hat\theta_1 \pm \sigma_1$, $\theta_2 = \hat\theta_2 \pm \sigma_2$ at
four points; for $\rho = \pm 1$ the ellipses degenerate into the diagonals of
the rectangle.

9.7 INTERVAL ESTIMATION FROM THE LIKELIHOOD FUNCTION
The Maximum-Likelihood Principle produces point estimates of the unknown
parameters. As it is recognized that there should be a margin of uncertainty
associated with an estimate, we have in the previous sections discussed at
length how to evaluate its variance. We have seen that the approach to the
variance determination is not unique, and that somewhat different results are

Instead of giving the result of an experiment in terms of a point estimate
$\hat\theta$ and its error $\Delta\hat\theta$ one can summarize the outcome of
the experiment by performing an interval estimation for the unknown parameter
$\theta$. For such a purpose we introduced the concept of a confidence interval
in Chapter 7. We associated an estimated interval with a probability content
$\gamma$, and called it a $100\gamma$ % confidence interval for the parameter.
The meaning of this was that if the experiment were repeated many times under
the same conditions, then, in the long run, the estimated intervals would
include the true value of the parameter in $100\gamma$ % of these experiments.
To arrive at these inferences about the parameter we had to invert probability
statements about some function, or statistic, of the observable quantities,
whose distributional properties were known. Thus, for instance, inverting a
probability statement about the sample mean $\bar{x}$, known to be distributed
as $N(\mu,\sigma^2/n)$ with $\mu$ unknown, $\sigma^2$ known, we ob-

larger than $\bar{x} - 2\sigma/\sqrt{n}$ and smaller than
$\bar{x} + 2\sigma/\sqrt{n}$; hence we called the random interval
$[\bar{x} - 2\sigma/\sqrt{n},\ \bar{x} + 2\sigma/\sqrt{n}]$ a 95.4% confidence
interval for $\mu$.
In the asymptotic limit, with infinite sample sizes, we can argue in

The random variable $\hat\theta$ here has a probability 0.954 of falling within
distance
the interval $[\hat\theta - 2\sigma,\ \hat\theta + 2\sigma]$ is therefore a 95.4%
confidence interval for $\theta$. We note that the inversion of the probability
statement (9.53) was particularly simple here, due to the algebraic symmetry
between variable and mean value in the normal p.d.f. This asymptotic symmetry
persists in the multi-parameter case, and hence permits a similar reasoning
with inversion of probability statements to obtain confidence intervals
(regions) in the general case with several parameters; see for example the
presentation in Chapter 9 in the book by Eadie et al.
For finite samples we do not know the exact distribution of $\hat\theta$. We
can therefore not write down statements like eq.(9.53), invert them, and next
interpret the results in terms of exact confidence intervals for the unknown
constant as we did above.
In the following we shall make use of the likelihood function to perform an
interval estimation of $\theta$. We resume and extend our point of view from
earlier in this chapter: not only shall we regard the $\theta$ value for which
the LF is maximal as the "most likely" value of the unknown parameter; other
$\theta$ values will be considered less likely of being the true value of the
parameter, in accordance with the fall-off of the likelihood. Thus, for the set
of observations at hand, we shall regard the likelihood function itself as
providing a measure of the intensity of our credence in the various conceivable
values of the unknown $\theta$. This means that we make an interpretation of
the likelihood function as measuring our "degree of belief" in the possible
values $\theta$ can have, based on our particular observations
$x_1, x_2, \ldots, x_n$. Whereas the confidence interval gave a measure for the
probability that the true value of the unknown parameter in the long run would
be included in the estimated interval, an interval estimated from the
likelihood function will measure our belief that the particular set of
observations was generated by a parameter belonging to the estimated interval.

etc. would be proportional to the constants 0.683, 0.954, .... These numbers
therefore provide relative measures of our belief in the specified intervals
for $\theta$. We could write, for example,

$$\text{Rel. belief}\ (\hat\theta - 2\sigma \le \theta \le \hat\theta + 2\sigma) = 0.954, \qquad (9.55)$$

which is an expression of the same formal structure as the inverted probability
statement eq.(9.54) used to define the 95.4% confidence interval for $\theta$.
In the following sections, when we refer to the likelihood function and write

we shall take this probability statement to mean "relative belief" in the
above sense.
It is customary among physicists to refer to all intervals derived from the
likelihood function as confidence intervals. This is in many instances
unfortunate, since the meaning is different from what is usually understood by
a confidence interval. Intervals obtained by a specific prescription from the
likelihood function were originally named fiducial intervals by R.A. Fisher.
Following a suggestion by D.J. Hudson we shall denote all intervals derived
from the likelihood function as likelihood intervals to indicate their origin
and to distinguish them from confidence intervals which have an entirely
different conceptual content.
Finally, let it be mentioned that the use of the likelihood function in
statistical inference is by no means a trivial matter. In fact, there has over
the years been a great deal of controversy among the specialists, due to their
different attitudes to Bayes' Theorem. To indicate how confusion can arise on
the subject, let us recall that the likelihood $L(\underline{x}|\theta)$
expresses the probability to obtain the particular set
$\underline{x} = \{x_1, x_2, \ldots, x_n\}$ of observed values, on the condition
that the value of the parameter is $\theta$. This probability is connected with
the inverse probability $P(\theta|\underline{x})$ for a particular value of
$\theta$, given the observations $\underline{x}$, through Bayes' Theorem, which
states (eq.(2.26))

With an asymptotic LF of Gaussian form or, equivalently, a parabolic lnL
function, intervals for $\theta$ can be constructed to correspond to a
specified probability content $\gamma$, as described for the normal variable in
Sect.4.8.3. In general, two limits $\theta_a$ and $\theta_b$ can be chosen such
as to make

this probability must be the same whether the parameter is expressed directly
as $\theta$ or in an implied form $g(\theta)$, we must have, for all $\theta$,

$$L(\underline{x}|\theta) = L(\underline{x}|g(\theta)).$$
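One practical consequence of this equality can be checked numerically: an interval read off at a fixed drop of ln L transforms like the parameter itself. The sketch below is an illustration under assumed inputs only; the exponential lifetime likelihood with n = 20 and sample mean 1, the choice g = 1/tau (the decay rate) and the helper name `crossing_points` are hypothetical.

```python
import numpy as np

def crossing_points(log_l, grid, drop=0.5):
    """Parameter values where ln L lies `drop` below its maximum on `grid`."""
    values = log_l(grid)
    i_max = int(np.argmax(values))
    target = values[i_max] - drop
    lower = np.interp(target, values[: i_max + 1], grid[: i_max + 1])
    upper = np.interp(target, values[i_max:][::-1], grid[i_max:][::-1])
    return lower, upper

n, t_bar = 20, 1.0  # assumed toy sample: n decay times with mean t_bar

def log_l_tau(tau):
    # parameter expressed directly as the mean lifetime tau
    return -n * np.log(tau) - n * t_bar / tau

def log_l_rate(rate):
    # the same likelihood expressed through g(tau) = 1/tau
    return n * np.log(rate) - n * t_bar * rate

tau_lo, tau_hi = crossing_points(log_l_tau, np.linspace(0.4, 3.0, 260_001))
rate_lo, rate_hi = crossing_points(log_l_rate, np.linspace(0.3, 2.5, 220_001))
print(tau_lo, tau_hi)            # interval obtained directly in tau
print(1 / rate_hi, 1 / rate_lo)  # interval in 1/tau, mapped back to tau
```

Both lines print the same interval to within the grid resolution, which would not in general hold for an interval built from an estimate and a symmetric error.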
and

$$S(\theta) = \frac{\partial\ln L}{\partial\theta}\bigg/\left[E\left(-\frac{\partial^2\ln L}{\partial\theta^2}\right)\right]^{1/2} \qquad (9.63)$$

becomes distributed as N(0,1) when n goes towards infinity. One can therefore,
as suggested by M.S. Bartlett, use $S(\theta)$ to find the ML estimate
$\hat\theta$ as well as any confidence interval for $\theta$. This is
suggestive, since $S(\theta)$ is usually close to being normally distributed
also for finite n.
For small samples an even more applicable function is

and the ML estimate becomes $\hat\tau = \bar{t} = \frac{1}{n}\sum_i t_i$, as we
saw in Sect.9.1.1. The asymmetry coefficient is obtained as
$\gamma_1 = 2/\sqrt{n}$ and the Bartlett functions take the forms

$$S(\tau) = \frac{\hat\tau-\tau}{\tau/\sqrt{n}} \qquad (9.66)$$

and

$$S_\gamma(\tau) = \frac{\hat\tau-\tau}{\tau/\sqrt{n}} - \frac{1}{3\sqrt{n}}\left[\left(\frac{\hat\tau-\tau}{\tau/\sqrt{n}}\right)^2 - 1\right].$$

The first of these functions appeared already in the example of Sect.9.4.8,
where it was shown explicitly to be a standard normal variable for large n. A
probability statement may therefore be written as

If we now want the limits of the corresponding confidence interval for $\tau$
a second-order equation must be solved for each of these limits. Of the two
solutions of each equation we take those which, when $n \to \infty$, coincide
with the limits obtained above with the function $S(\tau)$, since
$S_\gamma(\tau) \to S(\tau)$ when n becomes very large. The result is that
$S_\gamma(\tau)$, for n not too small, provides the 95.4% confidence interval
for $\tau$

This interval is more symmetric about $\hat\tau$ and also shorter than the
interval of eq.(9.69).
The intervals of eqs.(9.69), (9.71) are the results of inversion of
probability statements about specified functions of $\tau$. They express the
probability that the true value of the parameter will be included between
certain limits given by the random variable $\hat\tau = \bar{t}$ and the sample
size n, and therefore correspond to confidence intervals in the sense of
Chapter 7. These intervals will be more symmetric and, in the long run, more
accurate than intervals derived directly from the likelihood function.

statement (9.68) is f(t;T/A) (compare Exercise 9.2). Show that the Bartlett
function defined by eq.(9.66) becomes

9.7.4 Likelihood regions; the two-parameter case
In a situation involving two parameters we shall regard the likelihood
function $L(\underline{x}|\theta_1,\theta_2)$ obtained for a set of
observations as containing all information available on the unknown parameters
and use it to make inferences about them based on probability statements of the
type

We will start with the assumption of asymptotic conditions. In the limit of
infinitely large samples the LF will be the binormal distribution
Curves for constant likelihood will then be ellipses with centre at the ML

obtained by intersecting lnL by parallel planes lnL = lnL(max) - a: they are
the boundaries of joint likelihood regions for the two unknown parameters, and
have probability contents as implied by eq.(9.76),
$\gamma = 1 - e^{-\frac{1}{2}Q_\gamma}$, where $\frac{1}{2}Q_\gamma = a$. In
particular, the covariance ellipse has a probability content
$\gamma = 1 - e^{-1/2} = 0.393$ and thus represents a 39.3% joint likelihood
region for $\theta_1$ and $\theta_2$. The following list of numbers should be
compared with its one-parameter analogue, eq.(9.62),

    lnL = lnL(max) - 0.5   gives a 39.3% joint likelihood region
    lnL = lnL(max) - 2.0   gives a 86.5% joint likelihood region
    lnL = lnL(max) - 4.5   gives a 98.9% joint likelihood region      (9.77)
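The numbers in the list, together with their one-parameter counterparts, can be reproduced with a few lines of code. This is a minimal sketch assuming the ideal asymptotic case, i.e. a Gaussian LF for one parameter and a binormal LF for two; the function names are hypothetical.

```python
import math

def content_one_parameter(a):
    """Content of the interval with ln L >= ln L(max) - a, Gaussian LF.

    A drop of a corresponds to sqrt(2a) standard deviations on either side.
    """
    return math.erf(math.sqrt(a))

def content_two_parameters(a):
    """Content of the region inside the contour ln L = ln L(max) - a,
    binormal LF: gamma = 1 - exp(-a)."""
    return 1.0 - math.exp(-a)

for a in (0.5, 2.0, 4.5):
    print(f"a = {a}: one parameter {content_one_parameter(a):.3f}, "
          f"two parameters {content_two_parameters(a):.3f}")
```

For the same intersection constant a the joint region for two parameters always has a smaller probability content than the corresponding one-parameter interval.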
It is rather trivial to construct likelihood intervals for each of the
parameters considered separately. Writing a probability statement like

$$P(\hat\theta_1 - m\sigma_1 \le \theta_1 \le \hat\theta_1 + m\sigma_1) = \gamma$$

for the first parameter would imply ignoring the second, which means that it
can have any value. For this situation, integrating $L(\theta_1,\theta_2)$ with
the appropriate normalization over all $\theta_2$ gives the marginal
distribution in $\theta_1$, which becomes $N(\hat\theta_1,\sigma_1^2)$ (see the
derivation of eq.(4.81) in Sect.4.9.2). From this distribution one finds the
likelihood intervals for $\theta_1$ in the usual manner. Specifically, the
one-standard deviation (68.3%) likelihood interval
$[\hat\theta_1-\sigma_1,\ \hat\theta_1+\sigma_1]$ for $\theta_1$ becomes the
infinitely long vertical band touching the covariance ellipse, indicated in
Fig. 9.6; similarly, the long horizontal band is the one-standard deviation
likelihood interval $[\hat\theta_2-\sigma_2,\ \hat\theta_2+\sigma_2]$ for
$\theta_2$.

$$\gamma = \left(2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}\,L(\max)\right)^{-1} \int_{\hat\theta_1-\sigma_1}^{\hat\theta_1+\sigma_1}\int_{\hat\theta_2-\sigma_2}^{\hat\theta_2+\sigma_2} d\theta_1\,d\theta_2\,L(\theta_1,\theta_2),$$

which can be reduced to a function of $\rho$ and determined numerically. The
same procedure can obviously be applied to determine the probability
$\gamma(\rho,m)$ of any other rectangle specified by
$\theta_1 = \hat\theta_1 \pm m\sigma_1$, $\theta_2 = \hat\theta_2 \pm m\sigma_2$.
It is left as an exercise to the reader to verify that this probability is
given by the formula

(9.81)

where G is the cumulative standard normal distribution. As expected from the
event of independent parameters ($\rho = 0$, a symmetrically positioned
ellipse), when the joint probability factorizes into the two marginal
probabilities, we therefore find

For an irregular LF, where the existence of a transforming function is an
obviously inadequate assumption, it will not be justified to take the numbers
$\gamma = 1 - e^{-a}$ as approximate measures of the probabilities to be
associated with the joint likelihood regions. This applies, in particular, if
the LF has more than one maximum and the intersection lnL = lnL(max) - a
produces two or more disconnected regions in parameter space. In this situation
the experiment is only poorly summarized by specifying the particular regions,
and it is more informative to display the LF graphically by a whole set of
likelihood contours.

9.7.5 Example: Likelihood region for $\mu$ and $\sigma^2$ in $N(\mu,\sigma^2)$
We have earlier established that the joint ML estimates of the mean and
variance in the normal p.d.f. $N(\mu,\sigma^2)$ are given by
$\hat\mu = \bar{x} = \frac{1}{n}\sum_i x_i$ and
$\hat\sigma^2 = s^2 = \frac{1}{n}\sum_i (x_i-\bar{x})^2$ (Sect.9.2.3) and that,
for large samples, their covariance matrix is diagonal with elements
$V_{11} = \sigma^2/n$ and $V_{22} = 2\sigma^4/n$ (Sect.9.5.5). If in these
elements we replace the parameter $\sigma^2$ by its estimated value
$\hat\sigma^2$ we find that the variable Q of eq.(9.74) is given by

Fig. 9.7. Likelihood region (ellipse) and confidence region (intersected
parabola) for $\mu$ and $\sigma^2$ in $N(\mu,\sigma^2)$.
eq.(9.78), a = -ln(1-0.95) = 2.996; hence its half axes are

It is interesting to compare the elliptic joint likelihood region obtained in
this manner with the findings of Sect.7.4, where we used the independence
property of the statistics $\bar{x}$ and $s^2$ to construct a joint confidence
region for $\mu$ and $\sigma^2$, illustrated by the shaded area of Fig. 7.2. A
sample of size n = 100 with $\bar{x} = 1$, $s^2 = 1$ gives the 95% joint
likelihood ellipse of Fig. 9.7. This ellipse is slightly smaller in area than
the particular 95% joint confidence region shown in the same figure, bounded by
the parabola and the two horizontal lines. These have been determined by
demanding equal tail probabilities $= \frac{1}{2}(1-\sqrt{0.95})$ in the ends of
the N(0,1) distribution (for the variable $(\bar{x}-\mu)/(\sigma/\sqrt{n})$) as
well as the $\chi^2(n-1)$ distribution (for $ns^2/\sigma^2$). This requirement
fixes the values of the constants a, b, b' in the probability statement
eq.(7.18); we find a = 2.237 in the usual way, and b = 67.5, b'

9.7.6 Likelihood regions; the multi-parameter case
To generalize the arguments of Sects.9.7.1 and 9.7.4 to a situation with
several parameters we are now looking for the possibility to formulate
probability statements of the type

$$P(\theta_1^a \le \theta_1 \le \theta_1^b,\ \ldots,\ \theta_k^a \le \theta_k \le \theta_k^b) = \gamma \qquad (9.82)$$

on the basis of a k-dimensional likelihood function
$L = L(\underline{x}|\theta_1,\theta_2,\ldots,\theta_k)$. If all parameters are
considered to lie between two different limits, the span in parameter space
will be a $100\gamma$ % joint likelihood region for all the k parameters. If
some of them are kept constant, say at their estimated values, the region
spanned by the parameters will be a conditional joint likelihood region (if
there are at least two parameters left) or a conditional likelihood interval
(if only one parameter remains). In any case, our purpose is to find a region
for chosen $\gamma$, or conversely, to find the value of $\gamma$ corresponding
to a specified region.
Unless we make the simplifying assumption of asymptotic conditions we shall
not be able to pursue this subject very far. In doing so it should be noted
that with an increasing number of variables (parameters) the approach to
"asymptopia" becomes increasingly slow. Thus larger sample sizes are in general
needed to attain a given accuracy in the approximation when the number of
parameters goes from one to two and beyond. In other words, comparing real life
to the ideal conditions will generally imply rougher approximations in the
multi-parameter case.
A Taylor expansion of lnL about the ML estimate
$\hat{\underline{\theta}} = \{\hat\theta_1, \ldots, \hat\theta_k\}$ reads, in
general,

Here the first derivatives vanish identically. If the sample size is large
enough the second derivatives are the negative of the elements of the inverse

When higher-order terms are neglected this gives the asymptotic LF as the
multinormal distribution in $\underline{\theta}$,

$$L(\underline{\theta}) = L(\max)\exp\left[-\tfrac{1}{2}(\underline{\theta}-\hat{\underline{\theta}})^T V^{-1}(\hat{\underline{\theta}})(\underline{\theta}-\hat{\underline{\theta}})\right] \qquad (9.84)$$

where we have written $V^{-1}(\hat{\underline{\theta}})$ instead of
$V^{-1}(\underline{\theta})$.
Intersecting the hypersurface $L(\underline{\theta})$ by hyperplanes
$L = L(\max)e^{-a}$ will now give contours of constant likelihood which define
hyperellipsoidal regions in parameter space. Since the quadratic form

for multinormal $\underline{\theta}$ is distributed as a chi-square variable
with k degrees of freedom (Sect.4.10.2), we can express a probability by
integrating the p.d.f. of the $\chi^2(k)$ variable between 0 and some value
$Q_\gamma$; we write

$$\int_0^{Q_\gamma} f(Q;\nu=k)\,dQ = F(Q=Q_\gamma;\nu=k) = \gamma, \qquad (9.86)$$
where $F$ is the cumulative integral of the chi-square p.d.f. as defined in Sect.5.1.4. Clearly, $Q \le Q_\gamma$ is here equivalent to having all parameters $\theta_1, \theta_2, \ldots, \theta_k$ simultaneously within the region enclosed by the hyperellipse $Q = Q_\gamma$. This hyperellipse, centred at $\hat\theta$ and obtained by the intersecting hyperplane at distance $a = \tfrac12 Q_\gamma$ below $\ln L(\max)$, will therefore be the boundary of a $100\gamma$ % joint likelihood region for all $k$ parameters. Corresponding values of $\gamma$ can be found from graphs or tabulations of the cumulative chi-square distribution, such as Fig. 5.2 or Appendix Table A8. It is important to observe that, for a fixed value of the intersection constant $a$, the probability $\gamma$ associated with the likelihood region drops very quickly when the number of parameters increases. For example, the choice $a = 0.5$ ($Q_\gamma = 1$), which produced $\gamma = 0.683$ for the one-parameter and $\gamma = 0.393$ for the two-parameter case, leads to chi-square probabilities $\approx 0.20$, $\approx 0.10$, $\approx 0.05$ for $k = 3, 4, 5$, as one can see from Fig. 5.2. Likewise, to obtain a specified probability $\gamma$, one will have to take increasingly large values of $a$ for increasing $k$. Specifically, to have 68.3% joint likelihood regions, the intersecting hyperplanes must be taken for $a$ = 1.15, 1.77, 2.38, and 3.00 when the number of parameters is $k$ = 2, 3, 4, and 5, respectively.

To obtain likelihood regions or intervals which are conditional or independent on some of the parameters we must carry out appropriate integrations of the multinormal LF. The background for our remarks here has been presented in Sect.4.10.1. If some of the parameters, say the first $m$ of them, are without interest and can be ignored, the marginal distribution obtained by integrating the LF over these is a multinormal distribution in the $k-m$ remaining parameters with the same mean values and covariances as they had in the full LF; this marginal distribution will then provide joint likelihood regions for the $k-m$ parameters, independent of the $m$ first, in the manner described above. If, on the other hand, $m$ parameters are kept fixed at their estimated values, the conditional distribution in the remaining variables is also multinormal, of dimension $k-m$, but with new covariance terms as determined by the covariance matrix $V^*$, which results by deleting the appropriate $m$ rows and columns from the original $V^{-1}$ and inverting the resultant matrix. This distribution would then provide joint likelihood regions for the $k-m$ parameters, conditional on the $m$ first. More odd situations, in which some parameters are ignored while others are kept fixed, can also be thought of. We leave it to the interested reader to contemplate this matter further and to work out the relevant formulae. The lesson to be learned from our considerations here is that it is extremely important to state explicitly which parameters have been considered jointly estimated and which have been integrated over or kept constant at their estimated values.

9.8 GENERALIZED LIKELIHOOD FUNCTION

With the assumed functional dependence $f(x|\theta)$ between the observable $x$ and the unknown parameter $\theta$ we have, given $n$ events (observations) $x_1, x_2, \ldots, x_n$, written the likelihood function as

$$L(\mathbf{x}|\theta) = \prod_{i=1}^{n} f(x_i|\theta).$$

If it is possible to write the total number $\nu$ of expected events as a function of $\theta$, say $\nu = \nu(\theta)$, this "information" may be utilized by constructing a generalized likelihood function as

$$\mathcal{L}(\mathbf{x}|\theta) = \frac{\nu^n e^{-\nu}}{n!}\,\prod_{i=1}^{n} f(x_i|\theta).$$

This expression describes the joint probability for observing just $n$ events, and that these give the results $x_1, x_2, \ldots, x_n$, when the number of observed events is assumed to be a Poisson variable with mean value $\nu$.

The advantage of introducing the generalized $\mathcal{L}$ is that the number of observed events $n$ adds an extra constraint in the determination of $\theta$. In problems where the shape of $f(x|\theta)$ is of primary interest one will, however, in general gain fairly little by using $\mathcal{L}$ instead of $L$. The use of $\mathcal{L}$ is suggestive only in cases where the expected number of events $\nu(\theta)$ can be calculated with considerable accuracy. An example is given in the case study of Sect.9.12.

9.9 APPLICATION OF THE MAXIMUM-LIKELIHOOD METHOD TO CLASSIFIED DATA

When the number of observations is very large the numerical evaluation of the likelihood function may become quite laborious, especially if the p.d.f. $f(x|\theta)$ has a complex form. In such situations one may reduce the amount of computation by grouping the data into subsets or classes and write the likelihood function as the product of a smaller number of "averaged" p.d.f.'s.
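As an illustration of the grouping idea, the following is a minimal sketch of a Maximum-Likelihood fit to classified data. It assumes, purely for the example, an exponential p.d.f. truncated to the interval [0, 4] with one parameter `tau`, hypothetical class contents `counts`, and a simple grid scan in place of a proper maximization routine; none of these choices come from the text. Each class probability is the integral of the p.d.f. over the class width, and the logarithm of the likelihood is the sum of $n_i \ln p_i$ up to a constant that does not depend on the parameter.

```python
import math

def bin_probabilities(tau, edges):
    """Class probabilities p_i: integral of the truncated exponential p.d.f. over each class."""
    lo, hi = edges[0], edges[-1]
    norm = math.exp(-lo / tau) - math.exp(-hi / tau)  # normalization on [lo, hi]
    return [(math.exp(-a / tau) - math.exp(-b / tau)) / norm
            for a, b in zip(edges[:-1], edges[1:])]

def binned_log_likelihood(tau, counts, edges):
    """ln L = sum_i n_i ln p_i(tau); the parameter-independent multinomial factor is dropped."""
    probs = bin_probabilities(tau, edges)
    return sum(n * math.log(p) for n, p in zip(counts, probs))

def maximize_on_grid(func, lo, hi, steps):
    """Crude maximization: evaluate func on an equidistant grid and keep the best point."""
    best_x, best_val = lo, func(lo)
    for k in range(1, steps + 1):
        x = lo + (hi - lo) * k / steps
        val = func(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x

# Hypothetical classified data: 8 classes of width 0.5 on [0, 4], n = 1000 events.
edges = [0.5 * k for k in range(9)]
counts = [412, 236, 151, 84, 58, 30, 18, 11]

tau_hat = maximize_on_grid(
    lambda t: binned_log_likelihood(t, counts, edges), 0.5, 2.0, 1500)
print("ML estimate of tau from classified data:", tau_hat)
```

Only 8 terms enter each evaluation of the likelihood instead of 1000, which is the saving in computation the grouping is meant to provide; the price is the loss of information within each class.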
A similar grouping of the data is frequently inherent in the experimental set-up itself, as for instance when a counter combination is used to register particles within a certain angular interval. It is clear that the grouping of the data into categories necessarily implies some loss of information; this loss will

likelihood, for observing just $n_1, n_2, \ldots, n_N$ events in the $N$ classes is then
here vi - n
Ax.
f(xl8)dx. Zvi - -Eni n. Show that. when t h e number of events
i.1
1 pini
"i'
, (9 .RH)
~ e n c et h e HL e s t i m a t i o n become8 equivalent t o a Least-Squares e s t i m a t i m of t h e
parameter; see Sect.lO.5.1.
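$$p_i = p_i(\theta) = \int_{\Delta x_i} f(x|\theta)\,dx .$$

the same physical parameter $\theta$ from two sets of observations $\mathbf{x}$ and $\mathbf{y}$, corresponding to the likelihood functions $L(\mathbf{x}|\theta)$ and $L(\mathbf{y}|\theta)$, respectively. The joint

$$H_{\ell m} = -\frac{\partial^2 \ln L'}{\partial\theta_\ell\,\partial\theta_m}, \qquad \ell, m = 1, 2, \ldots, k \qquad (9.94)$$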
and where the matrix $H'$ is defined by

It is seen that eqs.(9.93) - (9.95) reduce to eq.(9.92) if there is only one parameter, and the special case with all $w_i = 1$ reproduces our earlier asymptotic result, eq.(9.37).

There is always loss of information involved in the weighting procedure, but this will be serious only if very large weights occur. As very large weights may arbitrarily increase the variance $V(\hat\theta)$, one will sometimes improve the precision in the parameter estimates by excluding some events from the sample. In bubble chamber experiments, for example, one can usually avoid the unwanted large weights by a suitable choice of fiducial volume.

Exercise 9.23: (The effect of experimental resolution)
An azimuthal angle $\phi$ is distributed according to the ideal theoretical p.d.f.

$$f(\phi|\alpha) = \frac{1}{2\pi}\,(1 + \alpha\cos\phi), \qquad 0 \le \phi \le 2\pi,$$

(i) If $R \ll 2\pi$, show that the resolution transform can be expressed as

9.12 A CASE STUDY: AN ILL-BEHAVED LIKELIHOOD FUNCTION

We will now give an example, from a low statistics experiment, on an ill-behaved likelihood function in two parameters. We will show that the introduction of the generalized LF improves the shape somewhat, but only to a very small extent. The real cure of the problem can only be obtained after a close look at the physics involved. In fact, for the present problem two experiments described by slightly different p.d.f.'s should be combined to give a well-behaved LF in the two parameters.

The physics question is the following: Is the decay $K^0 \to \pi^+\pi^-\pi^0$ due to the decay of the long-lived $K^0_L$ only, or is it partly due to the decay of the short-lived $K^0_S$? In the latter case, CP is violated, and the reaction can give information on the complex CP-violating parameter $\eta$, defined by the ratio

$$\eta = \frac{\text{Amplitude}\,(K^0_S \to \pi^+\pi^-\pi^0)}{\text{Amplitude}\,(K^0_L \to \pi^+\pi^-\pi^0)} = \mathrm{Re}\,\eta + i\,\mathrm{Im}\,\eta .$$

(It should be noted that the decay $K^0_L \to \pi^+\pi^-\pi^0$ has already been observed and that its decay rate $\Gamma(K^0_L \to \pi^+\pi^-\pi^0)$ is known.)

The random variable (observable) in the problem is the proper flight-time $t$ of the $K^0$ between production and decay, and its p.d.f. is given by
c - 1
t .
maa
~(tln)dt
ran
and
$N_0$ = number of $K^0$'s produced at $t = 0$ (known),
$\lambda_S$, $\lambda_L$ = the total decay rates of $K^0_S$ and $K^0_L$ (known),
$\delta$ = mass difference between $K^0_S$ and $K^0_L$ (known).
-
180
1
L(n) CiN(ti In, (9.97)
i-1
in the ($\mathrm{Re}\,\eta$, $\mathrm{Im}\,\eta$) plane is given in Fig. 9.8(a). The LF has two maxima, one pronounced maximum at $\mathrm{Re}\,\eta = -2.68$, $\mathrm{Im}\,\eta = 0.55$ and one less pronounced maximum at $\mathrm{Re}\,\eta = 0.15$, $\mathrm{Im}\,\eta = -0.06$. The difference in $\ln L$ between the maxima is 2.18.

We next consider the effect of imposing an absolute normalization given by the known decay rate $\Gamma(K^0_L \to \pi^+\pi^-\pi^0)$. We write the generalized likelihood function as
physics of the problem, one solution suggests the existence of a huge CP-violating effect, while the other solution is consistent with no CP-violation.

One might wonder if the ill-behaved LF and the two solutions are just bad luck in this particular experiment, and due to some large statistical fluctuation in the data. To check this point, many artificial samples consisting of 180 events each, were generated by the Monte Carlo method for $\mathrm{Re}\,\eta = \mathrm{Im}\,\eta = 0$ and the likelihood function constructed. The contours of the LF for a typical artificial sample are shown by the full drawn curves in Fig. 9.8(c). In

favouring large negative values of $\mathrm{Re}\,\eta$ and $\mathrm{Im}\,\eta$. Thus, in a band in the ($\mathrm{Re}\,\eta$, $\mathrm{Im}\,\eta$) plane the proper flight-time of the $K^0$ is little sensitive to the values of $\eta$. The shape of the full likelihood contours of Fig. 9.8(c) indicates that the experiment cannot easily distinguish between a broad range of values of the parameter $\eta$. The indeterminacy can, however, be solved by performing a new, but very similar experiment. This rests upon the following observation: If one starts with $\bar{K}^0$ with strangeness equal to $-1$ to study $\bar{K}^0 \to \pi^+\pi^-\pi^0$ the last term in the p.d.f. (9.96) changes sign, whereas everything else is unchanged.

The contours for a typical LF obtained from 180 artificial $\bar{K}^0$ events are shown by the dotted curves in Fig. 9.8(c). The dotted curves correspond simply to the full drawn curves reflected at the origin. We conclude therefore, that if $\eta$ is close to zero a good procedure to determine the parameter would be to combine $K^0$ and $\bar{K}^0$ data. This has in fact also been done, and it is found that $\eta$ is indeed close to zero, and consistent with no CP-violation in $K^0 \to \pi^+\pi^-\pi^0$ decay.

10. The Least-Squares method

In this chapter we shall discuss the estimation of parameters by the Least-Squares (LS) method, probably the estimation method most frequently used in practice.
The popularity of the LS method may partly be ascribed to the fact that it has had a long history during which it has been applied to a number of specific problems as well as to problems of more general nature. Besides this importance gained by tradition, the acceptability of the LS method, as for any systematic estimation principle, depends on the properties of the estimators to which it leads. Unlike the Maximum-Likelihood method the Least-Squares method has no general optimum properties to recommend it, even asymptotically. However, for an important class of problems, where the parameter dependence is linear, the LS method has the virtue that it, even for small samples, produces unbiased estimators of minimum variance.

In the following we consider first the simple case with linear parameter dependence, then we proceed to the non-linear case, and further to situations of increasing complexity involving first linear, and later general constraint equations.
$\theta = \{\theta_1, \theta_2, \ldots, \theta_L\}$ which produces the smallest value for $X^2$ is called the Least-Squares estimate of the parameters.

The weight $w_i$ expresses the accuracy in the measurement $y_i$. In many situations one assumes that all observations are equally accurate. In such cases the LS solution for the parameters is found by determining the minimum of the unweighted sum of squared deviations, i.e. one minimizes the quantity

$$X^2 = \sum_{i=1}^{N} (y_i - f_i)^2 . \qquad (10.2)$$

This is called unweighted LS estimation.

If the errors in the different observations are different but known the weight of the $i$-th observation is usually taken equal to its precision, $w_i = 1/\sigma_i^2$. The quantity to be minimized is then

$$X^2 = \sum_{i=1}^{N} \left(\frac{y_i - f_i}{\sigma_i}\right)^2 . \qquad (10.3)$$

It has in all formulations above been tacitly assumed that the $x_i$ are precise values, with no errors attached to them. Each $x_i$ may have a preassigned value, or it has been measured with an error which is negligible compared to the error of the corresponding $y_i$. Alternatively, the "observational point" $x_i$ can stand for a well-defined region from $x_i$ to $x_i + \Delta x_i$, say. Whatever the meaning of $x_i$ the crucial assumption is that it is possible to make a precise evaluation of the predicted value $f_i$ corresponding to $x_i$, or the region from $x_i$ to $x_i + \Delta x_i$. Similarly, the experimental value $y_i$ may be considered as the outcome of a single measurement, or more measurements (an average, say) at the point $x_i$, eventually in the region from $x_i$ to $x_i + \Delta x_i$. We may well speak of $x$ as an independent, and $y$ as a dependent variable.

Finally, it may be worthwhile to emphasize that the LS estimation method makes no requirement about the distributional properties of the observables. In this sense the LS estimation is distribution-free. On the other
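$$L = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\left(-\tfrac12\left(\frac{y_i-\eta_i}{\sigma_i}\right)^2\right) = \left(\prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma_i}\right)\exp\left(-\tfrac12\sum_{i=1}^{N}\left(\frac{y_i-\eta_i}{\sigma_i}\right)^2\right).$$

According to the Maximum-Likelihood Principle the most probable values of the unknown $\eta_i$'s are those which make $L$ as large as possible. Evidently, $L$ is at maximum when

$$\sum_{i=1}^{N}\left(\frac{y_i-\eta_i}{\sigma_i}\right)^2 = \text{minimum},$$

$$\frac{\partial X^2}{\partial\theta_1} = \sum_{i=1}^{N} (-2)\,(y_i - \theta_1 - x_i\theta_2) = 0, \qquad \frac{\partial X^2}{\partial\theta_2} = \sum_{i=1}^{N} (-2x_i)\,(y_i - \theta_1 - x_i\theta_2) = 0 .$$

This set of linear equations can be written in the form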
$$X^2 = \sum_{i=1}^{N}\left(\frac{y_i - f_i}{\sigma_i}\right)^2 = \sum_{i=1}^{N} \frac{1}{\sigma_i^2}\left(y_i - \sum_{\ell=1}^{L} a_{i\ell}\theta_\ell\right)^2 .$$

By equating all derivatives $\partial X^2/\partial\theta_k$ to zero we get the $L$ conditions

$$\frac{\partial X^2}{\partial\theta_k} = \sum_{i=1}^{N} (-2)\,a_{ik}\,\frac{1}{\sigma_i^2}\left(y_i - \sum_{\ell=1}^{L} a_{i\ell}\theta_\ell\right) = 0, \qquad k = 1, 2, \ldots, L. \qquad (10.13)$$

We shall rephrase the formulae for the linear problem of the last section in terms of matrix notation.

We order the measurements and predictions in two column vectors $\mathbf{y}$ and $\mathbf{f}$, both with $N$ elements, and let $\theta$ be a column vector with the $L$ parameters ($L \le N$),

which can also be written as

and the quantity to be minimized is
linear LS estimate of the parameters, eq.(10.23). From Sect.3.8 we realize

quantities $\hat\theta$. Applying the general formula for error propagation, eq.(3.80), to

Exercise 10.2: Verify eq.(10.24).

A further optimum property of the linear LS estimators is contained in

tions of the observations the LS estimators have the smallest variance.
$$\mathbf{t} = S\mathbf{y}. \qquad (10.26)$$

$$C = SA. \qquad (10.27)$$

Here each of the two terms on the right-hand side is of quadratic form $UVU^T$, which implies non-negative diagonal elements. Only the second term is a function of $S$, and the sum of the two terms will have strictly minimum diagonal elements when the second term has vanishing elements on the diagonal. This occurs when

Therefore, the unbiased, minimum variance estimator for $C\theta$ is

10.2.5 Example: Fitting a parabola

We shall work through an example of a linear Least-Squares fit to measurements of different accuracy, which requires a weighted LS estimation.

We want to fit the parabolic parameterization

The independent measurements define the column vector $\mathbf{y} = \{5, 3, 5, 8\}$ and the diagonal covariance matrix
The resulting matrix is of dimension 3 x 3, it is non-singular, and can be inverted. The inverse becomes

The parameters $\theta_1$ and $\theta_3$ are correlated, with the estimated correlation coefficient

Exercise 10.3: Derive the solution $\hat\theta$ for the problem in the text by solving the normal equations.

Exercise 10.4: Derive the LS solution and its errors for the same problem with all measurement errors equal, $\sigma_i = 2$.
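The weighted fit of the parabola can be sketched numerically as follows. The measurements `y = [5, 3, 5, 8]` are those quoted in the example; the abscissae `x` and the errors `sigma` are not legible in the present copy, so the values used below are assumptions made only to get a runnable illustration, as are the helper names `solve_linear` and `weighted_parabola_fit`. The sketch builds the normal equations $A^T V^{-1} A\,\theta = A^T V^{-1}\mathbf{y}$ for the model $f = \theta_1 + \theta_2 x + \theta_3 x^2$, solves them, and takes the inverse of the normal matrix as the covariance matrix of the estimates.

```python
import math

def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def weighted_parabola_fit(x, y, sigma):
    """Weighted linear LS fit of f = theta1 + theta2*x + theta3*x^2."""
    design = [[1.0, xi, xi * xi] for xi in x]        # matrix A, one row per measurement
    weights = [1.0 / (s * s) for s in sigma]         # diagonal of V^-1
    normal = [[sum(w * row[k] * row[l] for w, row in zip(weights, design))
               for l in range(3)] for k in range(3)]  # A^T V^-1 A
    rhs = [sum(w * row[k] * yi for w, row, yi in zip(weights, design, y))
           for k in range(3)]                          # A^T V^-1 y
    theta = solve_linear(normal, rhs)
    # Columns of the inverse normal matrix, obtained by solving against unit vectors.
    columns = [solve_linear(normal, [1.0 if i == k else 0.0 for i in range(3)])
               for k in range(3)]
    covariance = [[columns[l][k] for l in range(3)] for k in range(3)]
    return theta, covariance

y = [5.0, 3.0, 5.0, 8.0]         # measurements quoted in the example
x = [-0.6, -0.2, 0.2, 0.6]       # ASSUMED abscissae (not legible in the source)
sigma = [2.0, 1.0, 1.0, 2.0]     # ASSUMED measurement errors (not legible in the source)

theta, cov = weighted_parabola_fit(x, y, sigma)
rho_13 = cov[0][2] / math.sqrt(cov[0][0] * cov[2][2])
print("theta =", theta)
print("errors =", [math.sqrt(cov[k][k]) for k in range(3)])
print("correlation(theta1, theta3) =", rho_13)
```

With abscissae and errors chosen symmetrically about zero, as assumed here, the odd moments of $x$ vanish, so $\hat\theta_2$ is uncorrelated with the other two estimates while $\hat\theta_1$ and $\hat\theta_3$ remain correlated.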
I Yy : = 0.01 t 0 . 0 8 .
f 0
I
The accuracy of the estimated parameters can be found from the covari-

We want to find the best combined outcome of the two experiments.

where $A$ is the matrix of coefficients $a_{i\ell}$,
We have in two previous examples considered Least-Squares fits to first-order and second-order polynomials in the variable $x$ (Sects.10.2.1 and 10.2.5, respectively). It is frequently necessary to consider fits to higher-order polynomials of the general form

$$f_i = \sum_{\ell=1}^{L} \theta_\ell\, x_i^{\ell-1} .$$

Since the parameter dependence is still linear, such a problem has an exact solution of the form of eq.(10.23). However, as the power of the polynomial increases the inversion of the matrices involved becomes increasingly intricate. Serious numerical inaccuracies may occur when the degree of the polynomial gets as large as, say, 6 or 7.

$$f_i = \sum_{\ell=1}^{L} \xi_\ell(x_i)\,\omega_\ell, \qquad (10.35)$$

where $\omega_\ell$ are the $L$ new parameters. The matrix $A$ and its transpose $A^T$ now have elements

$$(A)_{i\ell} = (A^T)_{\ell i} = a_{i\ell} = \xi_\ell(x_i).$$

The product matrix $A^T A$ is such that

$$(A^T A)_{k\ell} = \sum_{i=1}^{N} (A^T)_{ki}(A)_{i\ell} = \sum_{i=1}^{N} \xi_k(x_i)\,\xi_\ell(x_i) = \delta_{k\ell}.$$

Hence $A^T A = I_L$, the unit matrix of dimension $L \times L$. The LS solution for the parameter vector $\omega$ simplifies to

$$\hat\omega = A^T \mathbf{y},$$

or, on component form,

$$\hat\omega_\ell = \sum_{i=1}^{N} \xi_\ell(x_i)\, y_i .$$
The covariance matrix for $\hat\omega$ becomes

$$V(\hat\omega) = \sigma^2 I_L,$$

i.e. a diagonal matrix.

We see therefore that the LS estimates of the parameters are easily derived in this case, with the errors on the uncorrelated estimates given directly by the (common) error on the measurements.

10.2.9 Example: Fitting a straight line (2)

Let us illustrate the use of orthogonal polynomials by reconsidering the problem of fitting a straight line through the points $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$, which was first treated in Sect.10.2.1.

The model is now to be written in terms of two parameters $\omega_1, \omega_2$ and two orthogonal functions $\xi_1(x_i)$, $\xi_2(x_i)$. From the definition of the mean value $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$ we realize that orthogonality is ensured if we take

$$\xi_1(x_i) = 1, \qquad \xi_2(x_i) = x_i - \bar{x} .$$

Because of the relaxation of the normalization condition the product matrix $A^T A$ of dimension 2 x 2 is not a multiple of the unit matrix, but rather

$$A^T A = \begin{pmatrix} N & 0 \\ 0 & \sum_i (x_i - \bar{x})^2 \end{pmatrix} .$$

This matrix can trivially be inverted, giving

$$(A^T A)^{-1} = \begin{pmatrix} 1/N & 0 \\ 0 & 1/\sum_i (x_i - \bar{x})^2 \end{pmatrix} .$$

The LS solution for the coefficients $\omega$ is therefore

$$\hat\omega_1 = \frac{1}{N}\sum_{i=1}^{N} y_i, \qquad \hat\omega_2 = \frac{\sum_i (x_i - \bar{x})\, y_i}{\sum_i (x_i - \bar{x})^2} .$$
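A minimal numerical sketch of the straight-line fit with orthogonal functions is given below. It assumes the choice $\xi_1(x) = 1$, $\xi_2(x) = x - \bar{x}$ and equally accurate measurements; the data points and the function names are hypothetical. Because the two functions are orthogonal over the measured points, each coefficient is obtained from a single sum and no matrix inversion is needed. For comparison the same line is also fitted with the ordinary normal equations for $f = \theta_1 + \theta_2 x$, and the two results are related by $\theta_1 = \omega_1 - \omega_2\bar{x}$, $\theta_2 = \omega_2$.

```python
def fit_line_orthogonal(x, y):
    """Unweighted LS fit of y = omega1 * 1 + omega2 * (x - x_mean)."""
    n = len(x)
    x_mean = sum(x) / n
    xi2 = [xi - x_mean for xi in x]   # values of the second orthogonal function
    omega1 = sum(y) / n               # coefficient of xi_1 = 1: the mean of the y_i
    omega2 = sum(a * yi for a, yi in zip(xi2, y)) / sum(a * a for a in xi2)
    return omega1, omega2, x_mean

def fit_line_normal_equations(x, y):
    """Unweighted LS fit of y = theta1 + theta2 * x via the 2 x 2 normal equations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx
    theta1 = (sxx * sy - sx * sxy) / det
    theta2 = (n * sxy - sx * sy) / det
    return theta1, theta2

# Hypothetical measurements with a common error.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

omega1, omega2, x_mean = fit_line_orthogonal(x, y)
theta1, theta2 = fit_line_normal_equations(x, y)
print("orthogonal form :", omega1, omega2)
print("intercept, slope:", omega1 - omega2 * x_mean, omega2)
print("normal equations:", theta1, theta2)
```

The estimates $\hat\omega_1$ and $\hat\omega_2$ are uncorrelated, whereas the intercept and slope of the ordinary parameterization are in general correlated unless $\bar{x} = 0$.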
+
p ( ~ o s $ ~1)cosB
zi = C + p$. t a n k
-
p ( ~ o s $ ~1 ) s i n B + psin$.cosB
, I
psind.sinB
a2xZ
a6 = 2 I[(xi-A)(xi-A)
GBB = 7 + (Yi-B)(yi-~)]
while $G_{BA} = G_{AB} = 0$.

The starting values for the iteration procedure can be found as follows:

For the dip angle we take $\tan\lambda^0$ as the value obtained by a LS straight line fit through the $N$ pairs of points $(s_i', z_i')$, where $z_i'$ is the measured $z$-coordinate expressed in the $(x'y'z')$ system, and $s_i'$ the distance from the origin of this system to the measured point $(x_i', y_i', z_i')$.

For the radius of curvature $\rho^0$ and the rotation angle $\beta^0$ we can take the values obtained by a linear LS fit to a circle through the measured projected points $(X_i, Y_i, 0)$. Writing the equation for the circle as

the fitted values of the parameters $a$ and $b$ give the starting values of

where $\beta$ is the angle describing the relative orientation of the two coordinate systems.

It should also be mentioned that in practice the measured points are not known with full precision, but are connected with errors, $\Delta X_i$, $\Delta Y_i$, $\Delta Z_i$ and correlation terms. The minimization is accordingly carried out for a weighted $X^2$ function, rather than with the unweighted $X^2$ of eq.(10.47). The problem then implies more complicated formulae for the gradient vector $\mathbf{g}$ and the matrix $G$ of second derivatives of $X^2$. As a result of the complete minimization one will in this situation, in addition to the fitted helix parameters, obtain a set of "improved measurements", or fitted values of the coordinates of the $N$ space points.
-
We have emphasized that to obtain the solution for the linear

is an unbiased estimator of $\sigma^2$. Here, as before, $N$ is the number of observations and $L$ the number of parameters estimated*).

known $\eta_i$ are we must be satisfied with adopting their estimated values $\hat\eta_i$ as obtained from the minimization of $X^2$. Inserted in $X^2$ this gives the weighted sum of squared residuals,

Using the normality assumption about the $N$ independent $y_i$ it can be shown, in the case of a linear model with $L$ parameters, that $X^2_{\min}$ can be expressed as a sum of $(N-L)$ independent terms, each term being the square of a standardized

where $I_L$ is the identity matrix of dimension $L \times L$. Thus, $\mathrm{Tr}\,D = N-L$, and eq.

known. Asymptotically, for large $N$, it can be shown, however, that $X^2_{\min}$ is approximately chi-square distributed also in this general case.

Finally, let us stress again that for the estimation problem the Least-Squares Principle involved no assumption about the distributional properties of the observations. The commonly used terms "chi-square (or $\chi^2$-) minimization", "$\chi^2$-fitting", etc., the origin of which is evident from the above considerations, are therefore somewhat misleading and should be avoided. As long as the subject is parameter estimation as such one should instead use the appropriate terminology
"1C-fit". By comparison with a graph (for example, Fig. 5.2) or a table (for example, Appendix Table A8) of chi-square probabilities we can then deduce the chi-square probability

$$P_{\chi^2} = 1 - F(X^2_{\min};\nu),$$

where $F(X^2_{\min};\nu)$ is the cumulative chi-square distribution for $\nu$ degrees of freedom. Since a cumulative integral is itself a variable which is uniformly distributed between 0 and 1 (compare Sects.4.1.1 and 5.1.4) the chi-square probability $P_{\chi^2}$ will also have a uniform distribution over the interval [0,1].

If, in a series of similar minimizations, $P_{\chi^2}$ turns out to have a non-uniform distribution, this indicates that the assumptions specified in Sect.10.4.3 are not fulfilled. One may then suspect the measurements or the model, or both, to be unsatisfactory and should examine this further. For example, if $P_{\chi^2}$ is strongly peaked at very low probabilities this may reveal a contamination of "wrong" events. Similarly, a skew distribution for $P_{\chi^2}$ with an excess on the high (or low) probability side may indicate that the errors in the measurements have systematically been put too high (low).

duals, $X^2_{\min}$, gives a measure of the similarity between the observations and the

tained when the mistake is corrected.

A closer study of the fit can be done by looking at the residuals $\epsilon_i = y_i - \hat\eta_i$, which directly measure the deviations between the observations and the fitted values. To allow for different accuracies it is reasonable to judge an $\epsilon_i$ relatively to the uncertainty, or standard deviation $\sigma(\epsilon_i)$, in this quantity. Thus the examination of the fit should be done in terms of the variables

$$z_i = \frac{\epsilon_i}{\sigma(\epsilon_i)} .$$

$z_i$ is called the stretch function or "pull" for the $i$-th observation. Considering uncorrelated observations and a sufficiently linear estimation problem we have

$$\sigma^2(\epsilon_i) = V_{ii}(\mathbf{y} - \hat{\boldsymbol{\eta}}) = V_{ii}(\mathbf{y}) - 2\,\mathrm{cov}(y_i, \hat\eta_i) + V_{ii}(\hat{\boldsymbol{\eta}})$$
$$= V_{ii}(\mathbf{y}) - V_{ii}(\hat{\boldsymbol{\eta}}). \qquad (10.68)$$

$$z_i = \frac{y_i - \hat\eta_i}{\sqrt{V_{ii}(\mathbf{y}) - V_{ii}(\hat{\boldsymbol{\eta}})}}$$

Clearly the minus sign in the denominator here has its origin in the fact that the two quantities in the numerator are completely (positively) correlated.

The "pull" $z_i$ is anticipated to have a distribution which is fairly close to $N(0,1)$. If, in a particular fit, one of the $z_i$'s deviates very much from the others in magnitude the corresponding data point should be examined, and perhaps abandoned if it looks suspicious (Sect.6.1). This critique of the data is most likely to be useful when the number of degrees of freedom $\nu$ is fairly large. For 1C-fits ($\nu = 1$) one sees that all "pulls" are of the same magnitude, and they contain no more information than the value $X^2_{\min}$.

In the long run, if the shape of the observed distribution of a "pull" $z_i$ is shifted relatively to zero this demonstrates a certain bias in the $i$-th observation. Similarly, if the observed "pull" distribution is substantially broader (narrower) than $N(0,1)$ the error in the $i$-th observation has probably consistently been taken too small (large).

10.5 APPLICATION OF THE LEAST-SQUARES METHOD TO CLASSIFIED DATA

10.5.1 Construction of $X^2$

In practice one often groups the measurements (events) according to some classification scheme, for example by plotting a histogram, before the actual estimation of the parameters.

Let the range of the variable*) $x$ be divided into $N$ mutually exclusive classes (bins) and denote by $n_i$ the number of the $n$ observations $x_1, x_2, \ldots, x_n$ belonging to the $i$-th class. We assume that, with the parameters $\theta = \{\theta_1, \theta_2, \ldots, \theta_L\}$ we know the probability $p_i = p_i(\theta)$ of getting an observation in class $i$. In the case of a continuous variable $x$ we may have to find $p_i$ by integrating a probability density function over the width $\Delta x_i$ of the $i$-th class. The expected number of observations in this class is

$$f_i = n p_i,$$

and the normalization condition $\sum_{i=1}^{N} p_i = 1$ implies

$$\sum_{i=1}^{N} f_i = n .$$

*) In this section $x$ may denote a one-dimensional or a multi-dimensional variable.

For a given $n$ the numbers of observations $n_i$ are multinomially distributed over the $N$ classes with covariance matrix

$$V_{ij}(\mathbf{y}) = n p_i(\delta_{ij} - p_j) .$$

Because of the normalization condition this matrix is singular ($|V| = 0$) and can not be inverted. The Least-Squares Principle as formulated by eq.(10.6) is therefore not applicable to this case. However, if we omit one of the $n_i$, say $n_N$, as it is redundant, the remaining $(N-1)$ $n_i$ will correspond to a covariance matrix $V^*$ which is simply $V(\mathbf{y})$ with its $N$-th row and column deleted. We could then reformulate the Least-Squares Principle for finding the best values of the parameters by demanding the minimum of the quantity

It can be verified by the reader that the inverse of the matrix $V^*$ is
The double sum in $\chi^2$ above can therefore be written as

$$\chi^2 = \sum_{i=1}^{N} \frac{(n_i - np_i)^2}{np_i}. \qquad (10.75)$$

The last expression restores the symmetry in all $N$ classes and corresponds to the formulation of eq.(10.3) with $f_i = np_i$, $\sigma_i = \sqrt{f_i}$. The expression (10.75) could have been written down at once, from the assumption that the number of events $n_i$ is Poisson distributed with mean and variance equal to $np_i$. The algebra above thus demonstrates again the mathematical equivalence between two different points of view, the first considering $N$ (dependent) multinomially distributed variables conditioned on their sum, the second considering $N$ independent Poisson variables; compare Sect.4.4.4.

Generally, in the covariance matrix of eq.(10.72), if the number of classes is large such that all $p_i$'s are small, the off-diagonal terms become negligible, and

$$V_{ii}(\mathbf{y}) = \sigma_i^2 = np_i(1-p_i) \approx np_i = f_i, \qquad (10.76)$$

where $f_i = E(n_i) = np_i$.

constants independent of $\boldsymbol{\theta}$ and leads to a simpler form of eqs.(10.78),

$$\frac{\partial\chi^2}{\partial\theta_\ell} = -2\sum_{i=1}^{N}\frac{n_i - f_i}{f_i}\,\frac{\partial f_i}{\partial\theta_\ell} = 0, \qquad \ell = 1,2,\ldots,L, \qquad (10.79)$$

which may be easier to handle.

The variance of $n_i$ is sometimes approximated by $n_i$ instead of by $f_i$. If, rather than eq.(10.77), we minimize

$$\chi^2 = \sum_{i=1}^{N}\frac{(n_i - f_i)^2}{n_i}, \qquad (10.80)$$

the solution for $\boldsymbol{\theta}$ is more sensitive to statistical fluctuations in the observed data. It can be shown, however, that for large numbers of events the solutions obtained from the two formulations (i.e. eqs.(10.77), (10.80)) coincide. It can further be shown that the formulations correspond to estimators which, for large samples, possess optimum theoretical properties: they are consistent, asymptotically normal, and efficient (i.e. give minimum variance).

Asymptotically, in the limit of large numbers, the $\chi^2$ of eq.(10.73) will be distributed as $\chi^2(N-1)$, as will also the alternative and
have to be fulfilled.

Firstly, it is not allowed to choose the limits of the classes in such a way as to make $\chi^2_{\min}$ as small as possible. This follows from the fact that the statistic $\chi^2_{\min}$ will only be an approximate chi-square variable if the class boundaries for $x$ are not random variables. In most practical work, the grouping is made from computational convenience. The second condition, already mentioned, is that the number of expected observations within each class must be "large", which is necessary to approximate $(n_i - f_i)/\sqrt{f_i}$ (or $(n_i - f_i)/\sqrt{n_i}$) to a standard normal variable. Fortunately, for practical purposes the expected numbers need not be very large: in fact it is customary to require a minimum of five entries in each class. It has been verified, using the equal-width subdivision, that one or two classes may be allowed to have expectations even less

*) Estimating parameters by minimizing eq.(10.77) is known in the literature as the minimum $\chi^2$ method, whereas minimizing eq.(10.80) is sometimes called the modified minimum $\chi^2$ method. In accordance with our remarks at the end of Sect.10.4.3 we discourage the use of these terms. In particular, we shall refer to the estimation by eq.(10.80) as the simplified Least-Squares method.

where

With the Poisson approximation, $\sigma_i^2 \approx f_i$, the exact and the simplified consequence of eq.(10.77) (i.e. eqs.(10.78) and (10.79), respectively) lead to equations of degree $2N$ and $N$ in the parameter $\alpha$. The parameter estimate $\hat\alpha$ and its error must therefore be found by some numerical method.

With the alternative approximation, $\sigma_i^2 \approx n_i$, however, we have, from eq.(10.80) for the simplified LS method,

which is a function of second order in $\alpha$. The solution of the equation $d\chi^2/d\alpha = 0$ can therefore be stated explicitly as
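For a model that is linear in a single parameter, the simplified LS estimate and its error can be written in closed form. The following is a minimal sketch, assuming predicted bin contents of the hypothetical form $f_i = a_i + b_i\alpha$, so that eq.(10.80) becomes a parabola in $\alpha$; the data, the angular model $(1 + \alpha\cos\theta)/2$ and all names are illustrative assumptions, not the worked example of the text.

```python
import math


def simplified_ls_linear(counts, a, b):
    """Minimise sum_i (n_i - a_i - b_i*alpha)^2 / n_i analytically.
    Requires all n_i > 0. Returns (alpha_hat, error)."""
    s_bb = sum(bi * bi / ni for ni, bi in zip(counts, b))
    s_bd = sum(bi * (ni - ai) / ni for ni, ai, bi in zip(counts, a, b))
    alpha_hat = s_bd / s_bb
    # The variance is the inverse of the coefficient of the parabolic term.
    error = 1.0 / math.sqrt(s_bb)
    return alpha_hat, error


def chi2(alpha, counts, a, b):
    """Simplified LS chi-square with the variances approximated by n_i."""
    return sum((ni - ai - bi * alpha) ** 2 / ni
               for ni, ai, bi in zip(counts, a, b))


# Hypothetical data: four equal bins in cos(theta) on [-1, 1],
# density (1 + alpha*cos(theta))/2, so f_i = n*width*(1 + alpha*c_i)/2.
counts = [38, 45, 52, 65]
n = sum(counts)
width = 0.5
centres = [-0.75, -0.25, 0.25, 0.75]
a = [n * width / 2.0 for _ in centres]
b = [n * width * c / 2.0 for c in centres]

alpha_hat, error = simplified_ls_linear(counts, a, b)
print(alpha_hat, error)
```

Because the function is exactly parabolic, moving one quoted error away from the minimum raises $\chi^2$ by one unit.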
Since $\chi^2$ can be expressed as

the variance of the LS estimate $\hat\alpha$ is exactly given by the inverse of the coefficient to the parabolic term, compare Sect.10.9.2. Hence an analytic formula can also be given for the error $\Delta\hat\alpha$,

The reader may find it a rewarding exercise to rephrase the last example

$$\pi + N \to B + N, \qquad B \to \pi_a + \pi_b. \qquad (10.84)$$

If little is known about the interaction a good way of starting an investigation is to study the angular distribution of the decay in the rest frame of the system $B$ as a function of the mass of this system.

Let the decay angle $\Omega = (\theta, \phi)$ be defined as the direction of the $\pi_a$ momentum in the $B$ rest frame relative to the quantization axis. The angular decay distribution can then generally be expressed in terms of the spherical harmonic functions $Y_J^m(\cos\theta,\phi)$, and the density matrix elements $\rho_{mm'}$, as

$$W(\cos\theta,\phi) = \sum_{m,m'} Y_J^m(\cos\theta,\phi)\,\rho_{mm'}\,Y_J^{m'}(\cos\theta,\phi)^*, \qquad (10.85)$$

where $J$ is the spin of $B$. From the properties of the spherical harmonics the distribution can also be written as a linear combination of the $Y_j^m$

In eq.(10.87) only spherical harmonics with even values of $j$ occur, implying that in the two-pion decay of a spin $J$ boson the angular distribution will be a polynomial in $\cos\theta$ of degree at most $2J$; this is known as the maximum complexity theorem.

The expansion coefficients $c_{jm}$ in eq.(10.87) constitute the set of unknown parameters which we want to estimate. Since, for a given $J$, there are ... classify the events in $N$ angular intervals $\Delta\Omega_i$. The probability for the i-th interval is

where the summation goes over all $(J+1)(2J+1)$ combinations of the indices $j, m$. With a total number of $n$ events the predicted number for the i-th interval is

The vector of observed numbers of events in the $N$ intervals is $\mathbf{n} = (n_1, \ldots, n_N)$,

where
If the number of intervals is sufficiently large to justify the approximation that each $n_i$ is an independent Poisson variable with mean and variance equal to $f_i$, the unknown parameters would be found by minimizing eq.(10.77), or rather eq.(10.80) if the approximation $\sigma_i^2 = n_i$ is acceptable. Clearly, with either formulation, a general numerical minimization procedure is called for.

In actual experiments the determination of the coefficients $c_{jm}$ as functions of the mass of the system $B$ can serve different purposes. Firstly, it may be used to determine the spin of the decaying boson from the maximum complexity theorem in the following manner: Starting from the lowest possible spin value one evaluates consecutively the sets of coefficients $c_{jm}$ for increasing values of $J$. From a certain $J$ value on, say from $J = J_{\max}$, the goodness-of-fit will not become significantly better with increasing $J$, and all coefficients $c_{jm}$ for $j > 2J_{\max}$ will be compatible with zero. The value $J_{\max}$ then determines a lower limit for the spin of the boson $B$. Secondly, the behaviour of the coefficients with different $j, m$ may sometimes give evidence for the presence of more than one resonance in a certain mass region. Finally, if the spins of the produced resonances are well-known, the magnitude of the coefficients $c_{jm}$ (or, equivalently, the density matrix elements $\rho_{mm'}$) can supply information on the production process, and be used to test production models.

Exercise 10.13: Derive eq.(10.87) from eq.(10.85). (Hint: Use the property

$$Y_\ell^{m*} = (-1)^m\,Y_\ell^{-m}$$

and the product theorem for spherical harmonics,

$$Y_\ell^{m}\,Y_\ell^{m'} = \sum_j \frac{2\ell+1}{\sqrt{4\pi(2j+1)}}\,\langle \ell,m;\ell,m'\,|\,j,m+m'\rangle\,\langle \ell,0;\ell,0\,|\,j,0\rangle\,Y_j^{m+m'},$$

with the fact that the Clebsch-Gordan coefficients $\langle \ell,0;\ell,0\,|\,j,0\rangle$ vanish for odd $j$.)

10.6 APPLICATION OF THE LEAST-SQUARES METHOD TO WEIGHTED EVENTS

So far we have in this chapter tacitly assumed that the predictions of the theoretical model are directly comparable to the experimental observation. In practice, however, the observations are frequently known to be biassed. As was discussed in Chapter 6, the observational biases can be divided into random (or statistical) errors and systematic errors. We recall from Sect.6.2 that random observational errors can be taken care of by "smearing" the ideal p.d.f. with the experimental resolution function to obtain a modified p.d.f. ("resolution transform") which can then be directly compared to the observations. Further, from Sect.6.3, systematic observational errors can be handled by two basically different approaches. The first of these is

(i) the "exact method", in which one modifies the ideal p.d.f. (i.e. the theoretical model) to give an "observable p.d.f.", which is subsequently compared to the observations. This approach requires that the experimental detection ability is known over the whole range of the observables.

When method (i) is not applicable one has to resort to

(ii) the "approximate method", in which one modifies the raw observations by assigning different weights to the individual observed events. The weight $w_j$ assigned to an event is equal to the inverse of the probability for detecting this event: in other words, if the event was observed one assumes that there would have been $w_j$ events if the detection had been perfect.

Clearly, whenever the theoretical model can be properly modified, by folding-in of the experimental resolution or by the "exact method", there is no further need for changes in the LS estimation procedure as described in the previous sections of this chapter. In fact, the most essential restriction we have put on a theoretical model is that it should give predictions which coincide with the expectation values of the observations. With the "approximate method", however, specific problems arise for the LS parameter estimation; in particular it becomes more difficult to get reliable values of the errors on the estimated parameters.

Let us assume that the observations have been classified into $N$ bins as in Sect.10.5.1, and that it is now not meaningful to compare the predicted number of events $f_i$ directly to the observed number of events $n_i$ in the i-th bin. Using the "approximate method" we would say that with a perfect detection apparatus we should have observed in bin $i$ a number of events equal to

$$\sum_{j=1}^{n_i} w_{ij},$$

where $w_{ij}$ is the inverse of the detection probability for event $j$ within the i-th bin. It is then suggestive to write down the following two alternative expressions to be minimized in the case of weighted events.
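The bookkeeping behind the "approximate method" can be sketched in a few lines. This is only an illustration under stated assumptions: each event is taken to carry a bin index and a detection probability (a hypothetical input format), and the sum of squared weights is accumulated as one common estimate of the variance of the weighted bin content, which is not necessarily the choice made in the expressions of the text.

```python
def weighted_bin_contents(events, n_bins):
    """events: iterable of (bin_index, detection_probability) pairs.
    Returns per-bin sums of weights and sums of squared weights."""
    sum_w = [0.0] * n_bins
    sum_w2 = [0.0] * n_bins
    for bin_index, detection_probability in events:
        w = 1.0 / detection_probability   # weight = inverse detection probability
        sum_w[bin_index] += w
        sum_w2[bin_index] += w * w        # assumed variance estimate of the sum
    return sum_w, sum_w2


# Hypothetical sample: four observed events in three bins.
events = [(0, 0.5), (0, 1.0), (1, 0.25), (2, 0.8)]
sum_w, sum_w2 = weighted_bin_contents(events, 3)
print(sum_w, sum_w2)
```

With all detection probabilities equal to one the weighted contents reduce to the raw counts $n_i$.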
... weights are identical, eq.(10.95) and eq.(10.93) give the same results.

LINEAR LEAST-SQUARES ESTIMATION WITH LINEAR CONSTRAINTS

It frequently happens that the observables $\boldsymbol{\eta}$ in an LS estimation are related through algebraic constraint equations.

While the original measurements ...

$\sum_i y_i - 180^\circ = 2^\circ$. To find the "improved measurements" $\hat\eta_i$ according to the LS Principle we seek the solution of the constrained minimization problem

with respect to the remaining two variables; thus we consider the unconstrained case

$$\chi^2 = \left(\frac{y_1-\eta_1}{\sigma_1}\right)^2 + \left(\frac{y_2-\eta_2}{\sigma_2}\right)^2 + \left(\frac{y_3-(180^\circ-\eta_1-\eta_2)}{\sigma_3}\right)^2 = \text{minimum}. \qquad (10.98)$$

This trivial minimization problem has the solution

and from the constraint equation we find

This is a linear LS problem for the four unknowns $\eta_1, \eta_2, \eta_3, \lambda$. Explicitly, the normal equations become

Multiplying the first three equations by $\sigma_i^2$ and the last by $-1$, we find by adding all expressions an equation for $\lambda$,

which leads to

The estimates for the angles, obtained from the three first equations, are
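The Lagrangian-multiplier solution of the triangle problem is short enough to write out. The following is a minimal sketch, assuming uncorrelated measurements $y_i$ with errors $\sigma_i$ and the function $\sum_i (y_i-\eta_i)^2/\sigma_i^2 + 2\lambda(\sum_i \eta_i - 180^\circ)$; setting its derivatives to zero gives $\lambda = (\sum_i y_i - 180^\circ)/\sum_i \sigma_i^2$ and $\hat\eta_i = y_i - \sigma_i^2\lambda$. The measured values below are hypothetical, chosen so that the angles overshoot $180^\circ$ by $2^\circ$.

```python
def fit_triangle(y, sigma):
    """Constrained LS estimate of three angles (degrees) that must sum to 180."""
    var = [s * s for s in sigma]
    lam = (sum(y) - 180.0) / sum(var)          # Lagrangian multiplier
    eta = [yi - vi * lam for yi, vi in zip(y, var)]
    return eta, lam


# Hypothetical measurements: the sum is 182 degrees, all errors equal.
y = [62.0, 51.0, 69.0]
sigma = [1.0, 1.0, 1.0]
eta, lam = fit_triangle(y, sigma)
print(eta, sum(eta))
```

With equal errors the excess is shared equally, each angle being lowered by one third of the violation; with unequal errors the least precise angle absorbs the largest correction.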
$$\chi^2(\boldsymbol{\theta}) = (\mathbf{y} - A\boldsymbol{\theta})^T V^{-1} (\mathbf{y} - A\boldsymbol{\theta}) = \text{minimum}, \qquad (10.101)$$

$$B\boldsymbol{\theta} - \mathbf{b} = \mathbf{0}.$$

Here the $L$ parameters

minimization for the $L + K$ unknowns $\boldsymbol{\theta}$ and $\boldsymbol{\lambda}$,

If we here equate to zero the derivatives of $\chi^2$ with respect to $\theta_\ell$, $\ell = 1,2,\ldots,L$ and $\lambda_k$, $k = 1,2,\ldots,K$ we get the normal equations, which in vector notation can be written as

These are $L + K$ linear equations for the unknowns. Obviously, eqs.(10.104) are the constraint equations regained, whereas eqs.(10.103) are the analogues of eqs.(10.21) for the unconstrained case, now modified by the $\lambda$-term due to the constraints. It will be seen that eqs.(10.100) of the previous section represent a special case of the formulae above.

Let us introduce the abbreviations

$$V_B = BC^{-1}B^T \qquad (10.107)$$

When the Lagrangian multipliers are substituted back in eq.(10.106) we obtain the solution for the parameters $\boldsymbol{\theta}$,

Equations (10.108) and (10.109) provide an exact solution, since all matrices and vectors are known quantities. It is interesting to observe that the combination $C^{-1}\mathbf{c}$, which is nothing but the solution of the unconstrained minimization, enters in $\hat{\boldsymbol{\lambda}}$ as well as in $\hat{\boldsymbol{\theta}}$. In fact, the parenthesis $(BC^{-1}\mathbf{c} - \mathbf{b})$ measures how much the observations $\mathbf{y}$ violate the constraint equations; compare the example of the previous section. As for $\hat{\boldsymbol{\theta}}$ we see that the effect of the constraint equations has been to correct the solution $C^{-1}\mathbf{c}$ of the unconstrained minimization by an amount proportional to the "violation" term $(BC^{-1}\mathbf{c} - \mathbf{b})$.

We note further that the Lagrangian multipliers $\hat{\boldsymbol{\lambda}}$ as well as the parameters $\hat{\boldsymbol{\theta}}$ have a linear dependence on the observations $\mathbf{y}$ through the vector $\mathbf{c}$. It is seen by application of the expectation operator that
$$V(\hat{\boldsymbol{\theta}}) = \left[C^{-1}A^TV^{-1} - C^{-1}B^TV_B^{-1}BC^{-1}A^TV^{-1}\right] V \left[C^{-1}A^TV^{-1} - C^{-1}B^TV_B^{-1}BC^{-1}A^TV^{-1}\right]^T.$$
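The constrained linear solution can be illustrated with a straight-line fit forced through a given point. This is a minimal sketch under stated assumptions: the abbreviations are taken to be $C = A^TV^{-1}A$ and $\mathbf{c} = A^TV^{-1}\mathbf{y}$, the constraint is written $B\boldsymbol{\theta} = \mathbf{b}$, and the solution is taken in the standard form $\hat{\boldsymbol{\lambda}} = V_B^{-1}(BC^{-1}\mathbf{c} - \mathbf{b})$, $\hat{\boldsymbol{\theta}} = C^{-1}\mathbf{c} - C^{-1}B^T\hat{\boldsymbol{\lambda}}$, with covariance $C^{-1} - (BC^{-1})^TV_B^{-1}(BC^{-1})$. The data and function names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def constrained_line_fit(x, y, sigma, x0, y0):
    """Fit y = t1 + t2*x to uncorrelated data, requiring t1 + t2*x0 = y0."""
    w = [1.0 / s ** 2 for s in sigma]
    C = [[sum(w), sum(wi * xi for wi, xi in zip(w, x))],
         [sum(wi * xi for wi, xi in zip(w, x)),
          sum(wi * xi * xi for wi, xi in zip(w, x))]]
    c = [sum(wi * yi for wi, yi in zip(w, y)),
         sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))]
    C_inv = inv2(C)                       # covariance of the unconstrained fit
    theta_u = [C_inv[k][0] * c[0] + C_inv[k][1] * c[1] for k in range(2)]
    B = [1.0, x0]                         # single constraint, K = 1
    CinvBt = [C_inv[k][0] * B[0] + C_inv[k][1] * B[1] for k in range(2)]
    V_B = B[0] * CinvBt[0] + B[1] * CinvBt[1]
    violation = B[0] * theta_u[0] + B[1] * theta_u[1] - y0
    lam = violation / V_B
    theta = [theta_u[k] - CinvBt[k] * lam for k in range(2)]
    cov = [[C_inv[i][j] - CinvBt[i] * CinvBt[j] / V_B for j in range(2)]
           for i in range(2)]
    return theta, lam, cov, C_inv


# Hypothetical data; the line is required to pass through the origin.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.9]
sigma = [0.1, 0.1, 0.1, 0.1]
theta, lam, cov, C_inv = constrained_line_fit(x, y, sigma, 0.0, 0.0)
print(theta, cov)
```

The diagonal elements of `cov` never exceed those of `C_inv`, in line with the statement that constraints reduce the parameter errors.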
(ii) If the measurements are uncorrelated and have equal errors, i.e. $V(\mathbf{y}) = \sigma^2 I_N$, $D$ simplifies to
In eq.(10.111) $C^{-1}$ is the covariance matrix for the unconstrained parameters, and as the diagonal elements of the term $(BC^{-1})^T V_B^{-1} (BC^{-1})$ are always non-negative, we see that the constraint equations will lead to a reduction of the parameter errors (i.e. the diagonal terms) compared to the unconstrained case. For the off-diagonal terms no similar statement can be made in the general case, as the

10.8 GENERAL LEAST-SQUARES ESTIMATION WITH CONSTRAINTS

We have in the preceding sections discussed how the LS method can be used to estimate unknown parameters in various problems of increasing complexity. We will now turn to the most general situation, where the estimation problem ... the covariance matrix $V(\mathbf{y})$. In addition we have a set of $J$ unmeasurable variables $\boldsymbol{\xi} = \{\xi_1, \xi_2, \ldots, \xi_J\}$. The $N$ measurable and the $J$ unmeasurable variables are related and have to satisfy a set of $K$ constraint equations,

According to the Least-Squares Principle we should adopt as our best estimates of the unknowns $\boldsymbol{\eta}$ and $\boldsymbol{\xi}$ those values for which
$$\chi^2(\boldsymbol{\eta}) = (\mathbf{y} - \boldsymbol{\eta})^T V^{-1}(\mathbf{y})\,(\mathbf{y} - \boldsymbol{\eta}) = \text{minimum}, \qquad \mathbf{f}(\boldsymbol{\eta}, \boldsymbol{\xi}) = \mathbf{0}. \qquad (10.113)$$

The general, constrained LS problem of eqs.(10.113) can be solved by eliminating $K$ unknowns from the constraint equations, substituting in $\chi^2$ and minimizing this function with respect to the $N+J-K$ remaining variables. The elimination method, however, has the disadvantage that it does not give any prescription on which variables one should eliminate from the constraint equations. If these are non-linear, the actual minimization of the function $\chi^2$ may develop quite differently, depending on the elimination made. The Lagrange multiplier method, on the other hand, avoids the preference of any of the unknown variables and treats them all on an equal footing. Accordingly, although this approach implies more variables in the minimization, its feature of symmetry in the variables is considered a greater virtue, and is preferred in practice.

We proceed therefore to solve the problem of eqs.(10.113) by the method of the Lagrangian multipliers. We introduce $K$ additional unknowns $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_K)$

$$\nabla_\eta \chi^2 = -2V^{-1}(\mathbf{y} - \boldsymbol{\eta}) + 2F_\eta^T\boldsymbol{\lambda} = \mathbf{0}, \qquad (N \text{ equations})$$
$$\nabla_\xi \chi^2 = 2F_\xi^T\boldsymbol{\lambda} = \mathbf{0}, \qquad (J \text{ equations})$$
$$\nabla_\lambda \chi^2 = 2\,\mathbf{f}(\boldsymbol{\eta}, \boldsymbol{\xi}) = \mathbf{0}. \qquad (K \text{ equations})$$

Thus, removing the nuisance factors 2, the equations are

$$V^{-1}(\boldsymbol{\eta} - \mathbf{y}) + F_\eta^T\boldsymbol{\lambda} = \mathbf{0}, \qquad (10.117)$$
$$F_\xi^T\boldsymbol{\lambda} = \mathbf{0}, \qquad (10.118)$$
$$\mathbf{f}(\boldsymbol{\eta}, \boldsymbol{\xi}) = \mathbf{0}. \qquad (10.119)$$

The solution of the set of equations (10.117)-(10.119) for the $N+J+K$ unknowns must in the general case*) be found by iterations, producing successively better approximations.

Let us suppose that iteration number $\nu$ has been performed and that it is necessary to find a still better solution. For the ν-th iteration the approximative solution is given by the values $\boldsymbol{\eta}^\nu, \boldsymbol{\xi}^\nu, \boldsymbol{\lambda}^\nu$, corresponding to the function value $(\chi^2)^\nu$. We perform a Taylor expansion of the constraint equations (10.119) in the point $(\boldsymbol{\eta}^\nu, \boldsymbol{\xi}^\nu)$,

*) The set (10.117)-(10.119) reduces of course to eqs.(10.106) for a linear problem with no unmeasurable variables.
When the terms of second and higher orders are neglected this can be written

$$\mathbf{f}^\nu + F_\eta^\nu(\boldsymbol{\eta}^{\nu+1} - \boldsymbol{\eta}^\nu) + F_\xi^\nu(\boldsymbol{\xi}^{\nu+1} - \boldsymbol{\xi}^\nu) = \mathbf{0}, \qquad (10.120)$$

where all superscripts $\nu$ indicate that $\mathbf{f}$, $F_\eta$, $F_\xi$ are to be evaluated at the point $(\boldsymbol{\eta}^\nu, \boldsymbol{\xi}^\nu)$, the ν-th iteration. Equations (10.117) and (10.118) now read

$$V^{-1}(\boldsymbol{\eta}^{\nu+1} - \mathbf{y}) + (F_\eta^\nu)^T\boldsymbol{\lambda}^{\nu+1} = \mathbf{0}, \qquad (10.121)$$
$$(F_\xi^\nu)^T\boldsymbol{\lambda}^{\nu+1} = \mathbf{0}. \qquad (10.122)$$

These equations, together with the expanded constraint equations (10.120), will make it possible to express all unknowns of the (ν+1)-th iteration by the quantities of the preceding iteration.

If we eliminate $\boldsymbol{\eta}^{\nu+1}$ from (10.121) and substitute in (10.120) we get a relation involving only $\boldsymbol{\lambda}^{\nu+1}$ and $\boldsymbol{\xi}^{\nu+1}$,

$$\mathbf{f}^\nu + F_\eta^\nu\left[\left(\mathbf{y} - V(F_\eta^\nu)^T\boldsymbol{\lambda}^{\nu+1}\right) - \boldsymbol{\eta}^\nu\right] + F_\xi^\nu(\boldsymbol{\xi}^{\nu+1} - \boldsymbol{\xi}^\nu) = \mathbf{0},$$

when we introduce the notation

The linearized equations are therefore solved in such a way that the "completely unknown" $\boldsymbol{\xi}^{\nu+1}$ are found first, next the Lagrangian multipliers $\boldsymbol{\lambda}^{\nu+1}$ and finally the "improved measurements" $\boldsymbol{\eta}^{\nu+1}$.

In eqs.(10.126)-(10.128) the matrices $F_\eta$, $F_\xi$, $S$ and the vector $\mathbf{r}$ are evaluated at the point $(\boldsymbol{\eta}^\nu, \boldsymbol{\xi}^\nu)$. We note that in deriving the formulae it has been tacitly assumed that the inverses of the matrices $S$ and $(F_\xi^T S^{-1} F_\xi)$ exist.

With the new values $\boldsymbol{\eta}^{\nu+1}$, $\boldsymbol{\xi}^{\nu+1}$ and $\boldsymbol{\lambda}^{\nu+1}$ we calculate the value of the function $(\chi^2)^{\nu+1}$ for the (ν+1)-th iteration and compare it to the previous value $(\chi^2)^\nu$. With an improved solution the new point $(\boldsymbol{\eta}^{\nu+1}, \boldsymbol{\xi}^{\nu+1})$ is used for a new Taylor expansion of the constraint equations and the process is started over again. The iterations should be continued until a satisfactory solution has been found. Usually one would repeat the calculations until the change in $\chi^2$ between successive steps becomes small. One may also have to check that the differences $\Delta\boldsymbol{\eta}$, $\Delta\boldsymbol{\xi}$ converge properly. General convergence criteria can hardly be given, but must be decided for the separate problems. It is generally valid that, in order to optimize the convergence of an iteration procedure, one should be careful in giving good starting values $\boldsymbol{\eta}^0$, $\boldsymbol{\xi}^0$ for the iterations, since these determine how many steps will be necessary to reach the desired minimum.
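The iteration scheme can be sketched for a small problem. This is an illustration under stated assumptions, not the procedure of the text verbatim: the notation is taken to be $\mathbf{r} = \mathbf{f}^\nu + F_\eta^\nu(\mathbf{y} - \boldsymbol{\eta}^\nu)$ and $S = F_\eta^\nu V (F_\eta^\nu)^T$, and the linearized equations are solved in the standard form $\boldsymbol{\xi}^{\nu+1} = \boldsymbol{\xi}^\nu - (F_\xi^TS^{-1}F_\xi)^{-1}F_\xi^TS^{-1}\mathbf{r}$, $\boldsymbol{\lambda}^{\nu+1} = S^{-1}[\mathbf{r} + F_\xi(\boldsymbol{\xi}^{\nu+1} - \boldsymbol{\xi}^\nu)]$, $\boldsymbol{\eta}^{\nu+1} = \mathbf{y} - VF_\eta^T\boldsymbol{\lambda}^{\nu+1}$, which follows from eqs.(10.120)-(10.122). The hypothetical example has three measured lengths $(a, b, c)$ of a right-angled triangle, one unmeasured angle $\xi$, and two constraints, i.e. a 1C-fit.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def constraints(eta, xi):
    """Hypothetical constraints f(eta, xi) = 0: a = c*cos(xi), b = c*sin(xi)."""
    a, b, c = eta
    return [a - c * math.cos(xi), b - c * math.sin(xi)]


def iterate_fit(y, var, xi0, tol=1e-12, max_iter=50):
    """Iterate the linearized equations; var holds the diagonal of V(y)."""
    eta, xi = list(y), xi0            # starting values: measurements and xi0
    chi2_old = 0.0                    # chi2 vanishes at eta = y
    for iteration in range(1, max_iter + 1):
        f = constraints(eta, xi)
        cos_xi, sin_xi = math.cos(xi), math.sin(xi)
        F_eta = [[1.0, 0.0, -cos_xi], [0.0, 1.0, -sin_xi]]   # df_k / d eta_i
        F_xi = [eta[2] * sin_xi, -eta[2] * cos_xi]           # df_k / d xi
        r = [f[k] + sum(F_eta[k][i] * (y[i] - eta[i]) for i in range(3))
             for k in range(2)]
        S = [[sum(F_eta[k][i] * var[i] * F_eta[l][i] for i in range(3))
              for l in range(2)] for k in range(2)]
        S_inv = inv2(S)
        Sinv_r = [sum(S_inv[k][l] * r[l] for l in range(2)) for k in range(2)]
        Sinv_F = [sum(S_inv[k][l] * F_xi[l] for l in range(2)) for k in range(2)]
        # Unmeasured variable first, then the multipliers, then eta.
        d_xi = (-sum(F_xi[k] * Sinv_r[k] for k in range(2))
                / sum(F_xi[k] * Sinv_F[k] for k in range(2)))
        lam = [Sinv_r[k] + Sinv_F[k] * d_xi for k in range(2)]
        eta = [y[i] - var[i] * sum(F_eta[k][i] * lam[k] for k in range(2))
               for i in range(3)]
        xi += d_xi
        chi2 = sum((y[i] - eta[i]) ** 2 / var[i] for i in range(3))
        residual = max(abs(v) for v in constraints(eta, xi))
        if abs(chi2 - chi2_old) < tol and residual < 1e-10:
            break
        chi2_old = chi2
    return eta, xi, chi2, iteration


# Hypothetical measurements with equal errors of 0.1.
y = [3.1, 3.9, 5.05]
var = [0.01, 0.01, 0.01]
xi0 = math.atan2(y[1], y[0])          # starting value taken from the data
eta, xi, chi2, n_iter = iterate_fit(y, var, xi0)
print(eta, xi, chi2, n_iter)
```

Here the starting value of the unmeasured angle is derived from the measurements themselves, so very few iterations are needed; a poor starting value would lengthen the iteration.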
The distinction between the two types of variables $\boldsymbol{\eta}$ and $\boldsymbol{\xi}$ lies in fact in the choice of starting values. For the measurable variables $\boldsymbol{\eta}$ one
eq.(10.128).

(v) Calculate the new value $(\chi^2)^{\nu+1}$.

(vi) Compare results with previous iteration. Proceed to (i) if new iteration is required. Stop, if solution has been obtained.

When the final step has been made the covariances of the estimates $\hat{\boldsymbol{\eta}}$ and $\hat{\boldsymbol{\xi}}$ should be found; see Sect.10.8.3.

Exercise 10.17: Show that the value of $\chi^2$ for the (ν+1)-th step is

$$(\chi^2)^{\nu+1} = (\boldsymbol{\lambda}^{\nu+1})^T S\,\boldsymbol{\lambda}^{\nu+1} + 2(\boldsymbol{\lambda}^{\nu+1})^T \mathbf{f}^{\nu+1},$$

and six measurable unknowns,

$$\boldsymbol{\eta} = (P_p, \lambda_p, \phi_p, P_\pi, \lambda_\pi, \phi_\pi).$$

The algebraic constraints are the four equations describing momentum and energy

$$f_2 = -P_\Lambda\cos\lambda_\Lambda\sin\phi_\Lambda + P_p\cos\lambda_p\sin\phi_p + P_\pi\cos\lambda_\pi\sin\phi_\pi,$$
$$f_3 = -P_\Lambda\sin\lambda_\Lambda + P_p\sin\lambda_p + P_\pi\sin\lambda_\pi,$$

Since the problem involves 4 constraints and 3 unmeasured unknowns we are therefore dealing with a 1C-fit.

From the definitions of eqs.(10.116) we see that the matrices $F_\eta$ (of dimension $4 \times 6$) and $F_\xi$ (dimension $4 \times 3$), as obtained from the derivatives of the four constraint functions $f_k$ with respect to the measurable and unmeasurable variables, are, respectively,

To start the iterations we take the measurements as the initial $\boldsymbol{\eta}^0$. For $\boldsymbol{\xi}^0$ we take, for example, the value $\boldsymbol{\xi}^0 = \{P_\Lambda^0, \lambda_\Lambda^0, \phi_\Lambda^0\}$, where the components are obtained by demanding the first three constraint functions equal to zero, i.e. momentum conservation satisfied. The fourth constraint function will then in general not be strictly zero. Thus we will have an initial value of the vector $\mathbf{r}$ from eq.(10.124). Inserting the approximations $(\boldsymbol{\eta}^0, \boldsymbol{\xi}^0)$ we can find $F_\eta^0$, $F_\xi^0$ and obtain the matrix $S$ from eq.(10.125),

the process.

If we had measured the coordinates of the origin of the $\Lambda$ in addition to its decay point the line-of-flight of this particle would have been known, equivalent to including $\lambda_\Lambda$ and $\phi_\Lambda$ among the measurable unknowns $\boldsymbol{\eta}$. The only completely unknown variable would then be $P_\Lambda$, the magnitude of the momentum, corresponding to a 3C-fit in this case. See also

10.8.3 Calculation of errors

The errors in the final estimates of the measurable and unmeasurable variables from the general LS fit of Sect.10.8.1 are found by applying the law of error propagation. Let us consider the estimates $\hat{\boldsymbol{\eta}} = \boldsymbol{\eta}^{\nu+1}$ and $\hat{\boldsymbol{\xi}} = \boldsymbol{\xi}^{\nu+1}$ as functions of the measurements $\mathbf{y}$,

In these formulae $\boldsymbol{\eta}$ and $\boldsymbol{\xi}$, as well as the function $\mathbf{f}$ and all matrices $F$ and $S$, are evaluated in the last, i.e. the ν-th, iteration. To the approximation of a linear dependence on $\mathbf{y}$ the covariance matrices for $\hat{\boldsymbol{\eta}}$ and $\hat{\boldsymbol{\xi}}$ are given by eq.(3.80), the law of propagation of errors, as

In these expressions we observe that the matrices $F_\eta$, $F_\xi$ enter only via the three possible combinations of the type $F^T S^{-1} F$. One of these, $F_\xi^T S^{-1} F_\xi$, appeared already as a part of the solution for the unmeasurable variables $\boldsymbol{\xi}$, eq.(10.126). With the abbreviations

we find after a little algebra that eqs.(10.131) lead to
to the approximation of a linear relationship between $\hat{\boldsymbol{\eta}}$ and $\mathbf{y}$, is

$$V(\hat{\boldsymbol{\epsilon}}) = V(\mathbf{y}) + V(\hat{\boldsymbol{\eta}}) - 2\,\mathrm{cov}(\mathbf{y}, \hat{\boldsymbol{\eta}}) = V(\mathbf{y}) - V(\hat{\boldsymbol{\eta}}) = V(\mathbf{y})\,(G - HUH^T)\,V(\mathbf{y}).$$

10.9 CONFIDENCE INTERVALS AND ERRORS FROM THE $\chi^2$ FUNCTION

10.9.1 Basis for the determination of LS confidence intervals

With a theoretical model which is linear in the parameters $\boldsymbol{\theta}$ we have the general expression for the $\chi^2$ function

that the minimization of $\chi^2(\boldsymbol{\theta})$ with respect to $\boldsymbol{\theta}$ led to the LS estimate

Simple algebra then leads to the following relation (see Exercise 10.12):

$$\chi^2(\boldsymbol{\theta}) = \chi^2_{\min} + (\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})^T V^{-1}(\hat{\boldsymbol{\theta}})\,(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}). \qquad (10.136)$$

linearly related to $\mathbf{y}$, will also be normally distributed (compare Sect.4.8.5). Under these conditions, each of the three terms in eq.(10.136) will be chi-square distributed. For example, with $N$ independent observations and $L$ unconstrained parameters, $\chi^2(\boldsymbol{\theta})$ is $\chi^2(N)$, $\chi^2_{\min}$ is $\chi^2(N-L)$, and the quadratic (covariance) form

... $\chi^2(\boldsymbol{\theta})$ surface by planes

the constant $a$ is appropriately chosen. The corresponding probabilities are determined from the chi-square distribution with a number of degrees of freedom equal to the number of independent parameters.

When $\chi^2(\boldsymbol{\theta})$ is not a quadratic function in the parameters, for example if the model is not a linear function of the parameters and/or the covariance matrix is not independent of $\boldsymbol{\theta}$, it is still customary to use the intersection approach (the "graphical method") to establish confidence regions for the parameters. ... chi-square distribution will only be approximately correct in these cases.

$$\chi^2(\theta) = \chi^2_{\min} + \frac{1}{2}\left.\frac{\partial^2\chi^2}{\partial\theta^2}\right|_{\theta=\hat\theta}(\theta - \hat\theta)^2 + \ldots \qquad (10.138)$$

... derivative of the function $\chi^2$ evaluated for $\theta = \hat\theta$.

For a linear LS problem fulfilling the conditions specified in Sect.10.2.3, with a constant covariance matrix $V(\mathbf{y})$ for the observations, the function $\chi^2$ is strictly of second order in the parameter $\theta$. The second derivative of $\chi^2$
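For a strictly parabolic $\chi^2$, the probability content of the region $\chi^2 \le \chi^2_{\min} + a$ follows from the chi-square distribution with as many degrees of freedom as there are parameters. The following is a minimal sketch for one and two parameters, using the closed forms $\mathrm{erf}(\sqrt{a/2})$ and $1 - e^{-a/2}$ of the chi-square cumulative distribution; the function name is hypothetical.

```python
import math


def probability_content(a, n_params):
    """P(chi2 with n_params degrees of freedom <= a), for n_params = 1 or 2."""
    if n_params == 1:
        return math.erf(math.sqrt(a / 2.0))
    if n_params == 2:
        return 1.0 - math.exp(-a / 2.0)
    raise ValueError("only one or two parameters are handled in this sketch")


for a in (1.0, 4.0, 9.0):
    one = 100.0 * probability_content(a, 1)
    two = 100.0 * probability_content(a, 2)
    print(f"a = {a:.0f}: {one:.1f}% (one parameter), {two:.1f}% (two parameters)")
```

The same intersection distance thus corresponds to a markedly smaller probability content when two parameters are estimated jointly.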
to the similar expression obtained for the large-sample ML estimate, eq.(9.36). It should be emphasized that the formulae (10.139) - (10.141) are exact under the conditions specified above.

In the case of a non-linear LS problem, or in general, with a $\chi^2$ function which is not of strictly parabolic form, one can still expect to find the variance - and hence the error - of the estimate $\hat\theta$ from the formula

which will be correct to the approximation that the higher-order terms of eq.(10.138) are small.

Under the assumption of unbiassed and normally distributed observations we can find (exact or approximate) confidence intervals for $\theta$ by seeking the intersections of the (exact or approximate) parabolic function $\chi^2(\theta)$ by the straight lines

Here, in the one-parameter case, the values $a = 1^2$, $2^2$, and $3^2$ for the intersection distance from minimum lead to confidence intervals of probabilities 68.3, 95.4, and 99.7%, respectively, which correspond to one, two, and three standard deviation intervals when $\chi^2(\theta)$ is strictly parabolic in $\theta$. Hence there is a

finds that the elements of the covariance matrix for the LS estimate $\hat{\boldsymbol{\theta}}$ can be expressed as

This formula is also exact to the extent that $\chi^2$ has a quadratic dependence upon $\boldsymbol{\theta}$. It can be compared to the similar expression obtained for the covariances of ML estimates, eq.(9.32).

In the simplest situation with a linear model and a parameter independent covariance matrix for the normally distributed observations, the double sum in eq.(10.144) - which is identical to the covariance form of eq.(10.136) - is a chi-square variable for which the number of degrees of freedom is equal to the number of estimated parameters minus the number of linear constraints, if any. Specifically, with only two independent parameters the intersections between the $\chi^2(\boldsymbol{\theta})$ surface and the parallel planes at distance $a$ from the minimum $\chi^2_{\min}$ will be a set of concentric ellipses which define joint confidence regions for the two parameters, whose probability content is determined by $\chi^2(2)$. Hence the elliptic confidence regions obtained by taking $a = 1^2$, $2^2$, $3^2$ will have associated probabilities 39.3, 86.5, 98.9%, respectively, in complete analogy with the joint likelihood regions for the asymptotic two-parameter case discussed in Sect.9.7.4. In the general multi-parameter case, the intersection between the $\chi^2(\boldsymbol{\theta})$
c l o s e a n a l o g y between t h e i n t e r v a l e s t i m a t i o n from t h e X' f u n c t i o n considered h y p e r s u r f a c e and t h e h y p e r p l a n e a t d i s t a n c e a above t h e minimum w i l l produce a
h e r e and t h e l i k e l i h o o d f u n c t i o n as d e s c r i b e d f a r t h e one-parameter case i n h y p e r e l l i p t i c j o i n t c o n f i d e n c e r e g i o n f o r a l l t h e p a r a m e t e r s , f o r which t h e con-
Sect.9.7.1. f i d e n c e c o e f f i c i e n t i s e x a c t l y g i v e n by t h e c u m u l a t i v e i n t e g r a l , up t o t h e v a l u e
a, of t h e c h i - s q u a r e p . d . f . w i t h a number a f d e g r e e s of freedom e q u a l t o t h e
number of independent parameters. Evidently, the associated probabilities for fixed values $a = 1^2, 2^2, \ldots$ will decrease quickly when the number of parameters increases. Conversely, to have a specified probability content for the joint confidence region, larger values of a must be taken for increasing number of parameters.

From the joint confidence region for all parameters considered simultaneously one can also deduce conditional confidence regions (intervals) for subsets of the parameters, by seeking the intersections between this region and lines for fixed values of the remaining parameters, for example their estimated values. The arguments are the same as in Sects.9.7.4 and 9.7.6 for the asymptotic likelihood function. The specific choice $a = 1^2$ will as before supply the errors in the parameter estimates by the hypersurface circumscribing the joint confidence region, as well as the conditional errors obtained by keeping some parameters at their estimated values. In particular, with two independent parameters the situation is completely analogous to that described in detail for the binormal likelihood function, Sects.9.6.3 and 9.7.4.

Whether the model is linear or not, the errors in the LS estimates are by convention determined from the intersecting hyperplane at one unit above the minimum $\chi^2_{min}$, and confidence regions deduced with the choices $a = 1^2, 2^2, \ldots$.

In all situations when $\chi^2(\underline{\theta})$ is not of second order in the parameters and/or the observations are not normally distributed, a comparison with the chi-square distribution will obviously provide only approximate values for the probabilities associated with the different regions. How good the approximation is will in general depend on the magnitude of the higher-order terms in the series expansion of $\chi^2(\underline{\theta})$ and the validity of the normality assumption for the observations.

11. The method of moments

The parameter estimators constructed by the method of moments (MM) are consistent but in general neither as efficient as the Maximum-Likelihood estimators, nor sufficient. However, although the qualities of efficiency and sufficiency are important, an estimation method should not be judged from its theoretical optimum properties alone, but also for its applicability to practical problems. In particle physics, the moments method because of its feasibility has been widely used in experiments to determine polarization and density matrix elements. The MM estimates can be easily obtained, since their evaluation only involves summations over the experimental sample.

11.1 BASIS FOR THE SIMPLE MOMENTS METHOD

Given a probability density function $f(x|\underline{\theta})$ with unknown parameters $\underline{\theta} = (\theta_1, \theta_2, \ldots, \theta_k)$ we want to estimate these parameters from a set of observations $x_1, x_2, \ldots, x_n$. The r-th algebraic moment of the population is defined by (see eq.(3.13) of Sect.3.3.3)
By equating the different moments of the parent population, which are functions of the unknown $\underline{\theta}$, to the numerical values of the corresponding sample moments we get

A set of equations (11.3) can therefore be found and solved to give the MM estimates $\hat{\underline{\theta}} = (\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_k)$. As a limited number of moments usually will not contain all information about the p.d.f. the MM estimators will in general be less efficient than the Maximum-Likelihood estimators.

The estimator $m'_r$ of eq.(11.2) is an unbiassed estimator of $\mu'_r$, since

which shows that the variance of the sample moment of a given order is dependent on the population moment of twice this order; $V(m'_r)$ may therefore, even when n is large, be of considerable magnitude for higher moments if the p.d.f. has substantial tails. This explains why the simple method of taking the moments of the variable x itself (i.e. eqs.(11.3)) is rather seldom used in practice.

Exercise 11.1: Show that the covariance between $m'_r$ and $m'_s$ is given by
$$\mathrm{cov}(m'_r, m'_s) = n^{-1}\,(\mu'_{r+s} - \mu'_r \mu'_s).$$

Exercise 11.2: Show that the MM estimators of the first algebraic moment and the second central moment of any p.d.f. are
$$\hat{\mu} = \bar{x} \quad \text{and} \quad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2,$$
respectively. Show that the variance in the MM estimate $\hat{\mu}$ is
$$V(\hat{\mu}) = \frac{1}{n}\,(\mu'_2 - \mu'^2_1) = \sigma^2/n.$$

11.2 GENERALIZED MOMENTS METHOD

Instead of using a set of powers of the variable x to estimate the unknown $\underline{\theta}$, one can select a set of independent functions of x and proceed in a similar way to construct estimators for these functions in terms of appropriate averages of the functions evaluated for the sample values $x_1, x_2, \ldots, x_n$ *).

11.2.1 One-parameter case

In the simplest situation, when there is only one unknown parameter, it will suffice to consider a single function g(x). The expectation of g(x), or the first moment of this function, for the p.d.f. $f(x|\theta)$ is defined by (compare eq.(3.6) of Sect.3.3.1)

where for $E(g(x))$ we may insert its estimated value obtained from the sample,

Thus we take

An alternative form, convenient for numerical computation, is obtained after a little algebra,

*) We will in the following allow x to have several components; $x_i$ will therefore denote all measured quantities for the i-th event.
To summarize, eq.(11.7) will produce the MM estimate for the function $\gamma(\theta)$, and eq.(11.10) (or alternatively, eq.(11.11)) an estimate of its variance. These estimates must next be "inverted" to give the desired $\hat{\theta}$ and its error.

Exercise 11.3: Verify that the choice g(x) = x reduces the formulae of the last section to the previously established expressions for the MM estimate of $\mu$ and its variance (Exercise 11.2).

11.2.2 Multi-parameter case

Let us now assume that the estimation problem involves k unknown parameters, and that we have selected a set of k linearly independent functions $g_1(x), g_2(x), \ldots, g_k(x)$. With the p.d.f. $f(x|\underline{\theta})$ the functions have expectation values given by
$$E(g_r(x)) \equiv \gamma_r(\underline{\theta}) = \int_{\Omega} g_r(x)\, f(x|\underline{\theta})\, dx, \qquad r = 1,2,\ldots,k. \qquad (11.12)$$

$$\hat{\gamma}_r = \frac{1}{n}\sum_{i=1}^{n} g_r(x_i), \qquad r = 1,2,\ldots,k, \qquad (11.13)$$

These expressions are of course similar to the formulae written down for the one-parameter case.

For the covariance terms between the $g_r(x)$ we generalize eqs.(11.10)-(11.11) to give the elements of the covariance matrix for $\hat{\underline{\gamma}} = (\hat{\gamma}_1, \hat{\gamma}_2, \ldots, \hat{\gamma}_k)$,

have simple relationships to the parameters $\underline{\theta}$, and which are such that the variances on the estimates will not become too large.

connection with the Maximum-Likelihood approach to the estimation problem. We assume that a sample of n resonance events has been obtained, each event corresponding to two measured quantities $\cos\theta_i$, $\phi_i$. The theoretical distribution for the decay of a vector meson into two pseudoscalar mesons is

where $-1 \le \cos\theta \le +1$, $0 \le \phi \le 2\pi$, and $\rho_{00}$, $\rho_{1,-1}$, $\mathrm{Re}\,\rho_{10}$ are the three unknown parameters.

Guided by the form of the p.d.f. we now define three functions $g_1, g_2, g_3$ of the angular variables as

Calculating the expectation values of these functions for the p.d.f. of eq.(9.41) and equating the expectations to the corresponding sample means we get

Let us for convenience write the linear relationship as
or, explicitly

where $V_{rs}$ is short for $V_{rs}(g)$ from eq.(11.14).

The previous considerations assume that our sample consists of n events that all represent true decays of the specific type $1^- \to 0^- + 0^-$. Most often it is not experimentally possible to obtain a pure sample of resonance events, and it is necessary to perform some kind of background subtraction. One popular way of doing this is to determine the density matrix elements $\rho_a$ using all the events in the resonance region of the effective mass plot, and then to calculate the same quantities $\rho_b$ using the events in two adjacent mass regions. The number $n_b$ of background events within the resonance region can be estimated from the shape of the effective mass spectrum. Under the assumption that this background is well described by the events in the neighbouring regions we can estimate the density matrix elements $\rho$ of the resonance by the approximation

The uncertainty in this quantity is estimated as

where $\Delta\rho_a$, $\Delta\rho_b$ are the errors connected to the estimates of $\rho_a$, $\rho_b$.

$$+\; 2\,\mathrm{Re}\,\rho_{2,-1}\cos 3\phi\,\sin 2\theta\,\sin^2\theta \;+\; \rho_{2,-2}\cos 4\phi\,\sin^4\theta$$

The nine density matrix elements are not all independent since the normalization condition requires $\rho_{00} + 2\rho_{11} + 2\rho_{22} = 1$. Discuss how a set of trial functions can be chosen for this p.d.f., which will lead to MM estimates for the parameters.

Exercise 11.6: Show that the formulae for the decays $1^- \to 0^- + 0^-$ (eq.(9.41)) and $2^+ \to 0^- + 0^-$ (Exercise 11.5) follow from the general formula eq.(10.85). (Hint: The density matrix $\rho$ is Hermitean.)

11.3 MOMENTS METHOD WITH ORTHONORMAL FUNCTIONS

The method outlined in Sect.11.2.2 becomes especially simple if the p.d.f. can be expressed as

where the $\xi_r(x)$ constitute a set of k orthonormal functions, satisfying
$$\int \xi_r(x)\,\xi_s(x)\,dx = \delta_{rs},$$
and

Then eq.(11.12) yields the expectation of $\xi_r(x)$ simply as
Therefore, an unbiassed estimator of $a_r$ is provided by the average value of the function $\xi_r(x)$ over the sample,
$$\hat{a}_r = \bar{\xi}_r(x) = \frac{1}{n}\sum_{i=1}^{n}\xi_r(x_i), \qquad r = 1,2,\ldots,k. \qquad (11.26)$$

Since the functions $\xi_r(x)$ are orthogonal the covariance terms between different $\xi_r$, $\xi_s$ vanish. The error in $\hat{a}_r$ can be found from the approximate formula for the variance, eq.(11.14), which becomes

11.3.1 Example: Polarization of antiprotons (3)

We reconsider the polarization example described under the Maximum-Likelihood method in Sect.9.5.7 and under the Least-Squares method in Sect.10.5.3. The distribution of the angle $\phi$ between the normals of the two scattering planes is given by

where the unknown is $\alpha = P^2$, the square of the polarization. The p.d.f. is not of the form of eq.(11.22). However, in terms of a

estimator of $\alpha'$ is therefore given by (eq.(11.26))
$$\hat{\alpha}' = \frac{1}{n}\sum_{i=1}^{n}\sqrt{3}\,(2y_i - 1),$$
for which the variance is asymptotically (eq.(11.27))
$$V(\hat{\alpha}') = \frac{1}{n-1}\,(1 - \hat{\alpha}'^2).$$

In terms of the original variable $\cos\phi$ the estimate of the parameter $\alpha$ and its variance become

It is interesting to compare this last result with the corresponding result for the variance obtained with the Maximum-Likelihood method. For large n the variance of the ML estimate of $\alpha$ takes the smallest possible value, given by eq.(9.39). Hence the asymptotic efficiency of the moments estimator is
$$\mathrm{Efficiency}(\hat{\alpha}) = \frac{V_{ML}(\hat{\alpha})}{V_{MM}(\hat{\alpha})} = \frac{\dfrac{1}{n}\,\dfrac{2\hat{\alpha}^3}{\ln(1+\hat{\alpha}) - \ln(1-\hat{\alpha}) - 2\hat{\alpha}}}{\dfrac{1}{n-1}\,(3 - \hat{\alpha}^2)}. \qquad (11.31)$$
For large n, expanding the logarithms to lowest order in $\alpha^2$ gives
$$\mathrm{Efficiency}(\hat{\alpha}) \approx 1 - \frac{4}{15}\,\alpha^2. \qquad (11.32)$$
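The comparison between the moments and the ML variances can be checked numerically. The following is only a sketch under stated assumptions: it takes the p.d.f. to be $f(x|\alpha) = \frac{1}{2}(1+\alpha x)$ with $x = \cos\phi$ in $[-1,+1]$ (the form implied by the log-likelihood of eq.(12.7)), for which $E(x) = \alpha/3$, so that the moments estimate is three times the sample mean. The function names are illustrative and do not come from the text.

```python
import math


def moments_estimate(xs):
    """Moments estimate of alpha and its variance.

    Assumes f(x|alpha) = (1 + alpha*x)/2 on [-1, 1], so E(x) = alpha/3
    and alpha_hat = 3 * mean(x); the variance is (3 - alpha_hat^2)/(n - 1).
    """
    n = len(xs)
    alpha_hat = 3.0 * sum(xs) / n
    variance = (3.0 - alpha_hat ** 2) / (n - 1)
    return alpha_hat, variance


def ml_variance(alpha, n):
    """Large-sample ML variance for the same p.d.f. (valid for 0 < |alpha| < 1)."""
    log_term = math.log(1.0 + alpha) - math.log(1.0 - alpha) - 2.0 * alpha
    return 2.0 * alpha ** 3 / (n * log_term)


def asymptotic_efficiency(alpha):
    """Ratio V_ML / V_MM in the limit of large n (valid for 0 < |alpha| < 1)."""
    log_term = math.log(1.0 + alpha) - math.log(1.0 - alpha) - 2.0 * alpha
    return 2.0 * alpha ** 3 / ((3.0 - alpha ** 2) * log_term)


# Compare the exact large-n ratio with the lowest-order expression 1 - 4*alpha^2/15.
for alpha in (0.09, 0.25, 0.5):
    exact = asymptotic_efficiency(alpha)
    approx = 1.0 - 4.0 * alpha ** 2 / 15.0
    print(f"alpha={alpha:4.2f}  exact={exact:.4f}  lowest order={approx:.4f}")
```

For the small values of $\alpha$ met in polarization measurements the two expressions agree closely, which illustrates why little is lost by using the simpler moments estimator in this example.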
where n is the number of events. The estimates are uncorrelated and have variances given by eq.(11.27),

which is nothing but a simple weighted combination of the MM estimates from the individual experiments. The variance of this combined estimate of $\theta$ is

r-th parameter we have from the Central Limit Theorem, when n becomes large,
observations is increased;
- unbiassedness, which means that, regardless of the sample size, the estimator produces estimates that are not systematically shifted from the true parameter value;
- efficiency, which means that the distribution of the estimates has minimum variance about the central value (equal to the true value of the parameter for unbiassed estimators);
- sufficiency, which means that the estimator exhausts all information in the observations regarding the unknown parameter.

In choosing between possible estimators the physicist will also take other factors into consideration. Preferentially,
- the establishing of necessary formulae for computation should be as simple as possible;
- the computer programming should not be too complicated; for example, relevant software should be available for matrix inversion and function optimization;
- the method should make economic use of computer time.

Some of the ideal theoretical properties and the practical demands will frequently be conflicting. In practice one will therefore have to give

where $x = \cos\phi$ is restricted to the interval $[-1,+1]$. For this class of underlying distributions we have generated artificial event samples corresponding to specified values of the parameter $\alpha$ by the "hit-and-miss" Monte Carlo method using a computer equipped with a uniform random number generator. The number generator delivers numbers r which are uniformly distributed between 0 and 1. An event candidate can be constructed from two consecutive numbers $r_1$ and $r_2$ from the generator by defining the angle $\phi_i$ through the relation

and this candidate is accepted and included in the artificial event sample if, for the specified $\alpha$,

We have carried out simulations using two different values of the parameter $\alpha$, $\alpha = 0.09$ (corresponding to a polarization $P = \sqrt{\alpha} = 0.3$) and $\alpha = 0.25$ (P = 0.5). For each $\alpha$, four independent, artificial samples were generated with n = 10, 100, 1000, and 10000, giving altogether 8 independent, simulated experiments. Figure 12.1 shows histograms in $x = \cos\phi$ for the "events" from these 8 "experiments", together with curves showing the underlying "theoretical" distributions $f(x|\alpha)$ with the $\alpha$-values used in the simulations, normalized to the number of events in the "experiments". It is seen that all histograms match well with the "theoretical" distributions, and it appears justified to consider the generated event samples as fairly typical for physical events originating from the distributions. We will therefore, in the following, regard the samples as if they consisted of real, observed events in 8 different experiments, and use them to estimate the unknown parameter $\alpha$ by the usual estimation methods described in Chapters 9-11.
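The hit-and-miss generation described above can be sketched as follows. The two relations defining the candidate and the acceptance test are not legible in this copy, so the sketch assumes a common choice: $\cos\phi_i = 2r_1 - 1$, with the candidate accepted when $r_2(1+|\alpha|) \le 1 + \alpha\cos\phi_i$. The p.d.f. is taken as $f(x|\alpha) = \frac{1}{2}(1+\alpha x)$, the form implied by eq.(12.7). The function name and the seed are illustrative.

```python
import random


def generate_sample(alpha, n, rng):
    """Hit-and-miss sample of n events from f(x|alpha) = (1 + alpha*x)/2.

    Assumed recipe: x = 2*r1 - 1 is uniform in [-1, 1), and the candidate is
    accepted if r2 * (1 + |alpha|) <= 1 + alpha * x.
    """
    events = []
    f_max = 1.0 + abs(alpha)            # largest value of (1 + alpha*x) on [-1, 1]
    while len(events) < n:
        r1, r2 = rng.random(), rng.random()
        x = 2.0 * r1 - 1.0              # candidate value of cos(phi)
        if r2 * f_max <= 1.0 + alpha * x:
            events.append(x)
    return events


rng = random.Random(12345)              # fixed seed so the run is reproducible
sample = generate_sample(0.25, 10000, rng)
mean_x = sum(sample) / len(sample)
print("n =", len(sample), " 3*mean(x) =", round(3.0 * mean_x, 3))
```

Since $E(x) = \alpha/3$ for this p.d.f., three times the sample mean should come out close to the generated $\alpha$, which gives a quick check that the acceptance test has the intended effect.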
from which
$$\ln L = -n\ln 2 + \sum_{i=1}^{n}\ln(1+\alpha x_i). \qquad (12.7)$$

In Fig. 12.2 lnL is shown as a function of $\alpha$ for the 8 experiments. For each the ML estimate $\hat{\alpha}$ corresponds to the peak of the lnL function, and the error in $\hat{\alpha}$ is determined by intersecting the function by a straight line at a distance 0.5 below its maximum value. As can be seen from Fig. 12.2 the lnL function, even for the smallest samples, has an almost symmetric and parabolic shape. The error $\Delta\hat{\alpha}$ can therefore for each experiment be taken as the average of the distances $\Delta\alpha_L$ and $\Delta\alpha_U$ defined by the lower and upper intersection points; see Fig. 12.2(a).
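The graphical procedure can be imitated by scanning lnL of eq.(12.7) on a grid in $\alpha$. The sketch below is an illustration only: the grid step, the seed, and the function names are assumptions, and the test sample is produced with an assumed hit-and-miss recipe for $f(x|\alpha) = \frac{1}{2}(1+\alpha x)$.

```python
import math
import random


def generate_sample(alpha, n, rng):
    """Hit-and-miss sample from f(x|alpha) = (1 + alpha*x)/2 on [-1, 1) (assumed recipe)."""
    events = []
    while len(events) < n:
        x = 2.0 * rng.random() - 1.0
        if rng.random() * (1.0 + abs(alpha)) <= 1.0 + alpha * x:
            events.append(x)
    return events


def log_likelihood(alpha, xs):
    """lnL of eq.(12.7): -n ln 2 + sum_i ln(1 + alpha * x_i)."""
    return -len(xs) * math.log(2.0) + sum(math.log(1.0 + alpha * x) for x in xs)


def ml_scan(xs, step=0.002):
    """Return (alpha_hat, error, lnL_max) from a grid scan of lnL.

    The error is the average of the lower and upper distances to the points
    where lnL has dropped 0.5 below its maximum (to grid accuracy).
    """
    grid = [i * step for i in range(-499, 500)]     # alpha from -0.998 to 0.998
    values = [log_likelihood(a, xs) for a in grid]
    i_max = max(range(len(grid)), key=values.__getitem__)
    level = values[i_max] - 0.5
    lo = i_max
    while lo > 0 and values[lo] > level:
        lo -= 1
    hi = i_max
    while hi < len(grid) - 1 and values[hi] > level:
        hi += 1
    error = 0.5 * (grid[hi] - grid[lo])
    return grid[i_max], error, values[i_max]


xs = generate_sample(0.25, 1000, random.Random(2024))
alpha_hat, error, lnl_max = ml_scan(xs)
print("alpha_hat =", round(alpha_hat, 3), " error =", round(error, 3))
```

A grid scan is chosen here because it mirrors the graphical reading of Fig. 12.2; an iterative maximization would be faster but hides the intersection construction.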
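In addition to the estimated errors obtained by the graphical method Table 12.1 also gives the errors deduced from the large-sample formula (9.39) derived for the present p.d.f. in Sect.9.5.7,
$$V(\hat{\alpha}) = \frac{1}{n}\,\frac{2\hat{\alpha}^3}{\ln(1+\hat{\alpha}) - \ln(1-\hat{\alpha}) - 2\hat{\alpha}}. \qquad (12.8)$$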
$$p_i(\alpha) = \int_{x_i}^{x_i+\Delta x_i} f(x|\alpha)\,dx$$
bin.
Although the ML method with classified data was introduced to save computation, and only is of practical interest when n is large, we have applied it here for demonstration purposes for n as small as 100 (10 classes). The results obtained are given in Table 12.1, together with the errors in the estimates as deduced by the graphical method using the points where the lnL function is 0.5 below its maximum value.
12.2.4 The Least-Squares method

With n events distributed in N bins the usual form for the function to be minimized is

for the 8 experiments, with their minimum values $\chi^2_{min}$ indicated, corresponding to the LS estimates $\hat{\alpha}$. Also indicated in each graph is the straight line at distance 1.0 above the function minimum, which determines the error in each estimate.

Fig. 12.3. $\chi^2$ of eq.(12.11) as a function of the parameter $\alpha$ for the 8 simulated experiments.
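A binned Least-Squares fit of this kind can be sketched as below. Because eq.(12.11) is not legible in this copy, the sketch assumes the Pearson form $\chi^2 = \sum_i (n_i - np_i(\alpha))^2 / (np_i(\alpha))$, with $p_i(\alpha)$ the integral of $f(x|\alpha) = \frac{1}{2}(1+\alpha x)$ over the i-th of N equal-width bins. The function names, the grid step, and the idealized bin contents are assumptions made for illustration.

```python
def expected_counts(alpha, n, n_bins):
    """n * p_i(alpha) for equal-width bins in x over [-1, 1].

    For f(x|alpha) = (1 + alpha*x)/2 the bin integral is exactly
    (width/2) * (1 + alpha * centre), i.e. linear in alpha.
    """
    width = 2.0 / n_bins
    centres = [-1.0 + (i + 0.5) * width for i in range(n_bins)]
    return [n * 0.5 * width * (1.0 + alpha * c) for c in centres]


def chi_square(alpha, counts):
    """Assumed Pearson form: sum_i (n_i - n p_i)^2 / (n p_i)."""
    n = sum(counts)
    expected = expected_counts(alpha, n, len(counts))
    return sum((n_i - e) ** 2 / e for n_i, e in zip(counts, expected))


def ls_scan(counts, step=0.001):
    """Return (alpha_hat, error, chi2_min) from a grid scan of chi-square.

    The error is half the distance between the two points where the function
    has risen 1.0 above its minimum (to grid accuracy).
    """
    grid = [i * step for i in range(-999, 1000)]    # alpha from -0.999 to 0.999
    values = [chi_square(a, counts) for a in grid]
    i_min = min(range(len(grid)), key=values.__getitem__)
    level = values[i_min] + 1.0
    lo = i_min
    while lo > 0 and values[lo] < level:
        lo -= 1
    hi = i_min
    while hi < len(grid) - 1 and values[hi] < level:
        hi += 1
    error = 0.5 * (grid[hi] - grid[lo])
    return grid[i_min], error, values[i_min]


# Idealized "data": bin contents set equal to their expectation for alpha = 0.25.
counts = expected_counts(0.25, 1000, 10)
alpha_hat, error, chi2_min = ls_scan(counts)
print("alpha_hat =", round(alpha_hat, 3), " error =", round(error, 3))
```

With real event counts in place of the idealized bin contents, the same scan gives the minimum and the intersection points that are read off graphically in Fig. 12.3.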
12.2.5 The simplified Least-Squares method

As we saw in Sect.10.5.3, the simplified LS method with

and $np_i(\alpha)$ linear in the parameter $\alpha$, implies that $\chi^2$ is of second order in $\alpha$. Hence the analytical solutions can be written down for the LS estimate and its error (eqs.(10.82), (10.83)).

Table 12.1 gives the numerical results obtained for the estimated parameter and its error with the simplified LS method as well as the ordinary LS method applied to all 8 experiments, including the two experiments with sample size n=10, which do not fulfil the usual requirements on the number of bins and their contents as discussed in Sects.10.5.1, 10.5.2; for the latter the numbers are given in parenthesis.

Table 12.1. Summary of results from estimations

Generated sample            Estimation method     Number   Estimated  Estimated parameter error
(experiment)                                      of bins  parameter  Analytical  Graphical  MVB
                                                  N        value

(a) $\alpha$=0.09, n=10     Moments               -        0.42       0.62        -
                            ML                    -        0.38       0.52        0.47
                            ML, classified data   -        -          -           -          0.54
                            LS, ordinary          (4)      (0.15)     -           (0.53)
                            LS, simplified        (4)      (0.16)     (0.55)      -

(b) $\alpha$=0.09, n=100    Moments               -        0.172      0.175       -
                            ML                    -        0.170      0.171       0.173
                            ML, classified data   10       0.177      -           0.174      0.173
                            LS, ordinary          10       0.175      -           0.172
                            LS, simplified        10       0.188      0.165       -

(c) $\alpha$=0.09, n=1000   Moments               -        0.078      0.055       -
                            ML                    -        0.080      0.054       0.056
                            ML, classified data   50       0.080      -           0.056      0.054
                            LS, ordinary          50       0.075      -           0.054
                            LS, simplified        50       0.109      0.052       -

(d) $\alpha$=0.09, n=10000  Moments               -        0.093      0.0173      -
                            ML                    -        0.093      0.0173      0.0173
                            ML, classified data   100      0.093      -           0.0172     0.0173
                            LS, ordinary          100      0.092      -           0.0175
                            LS, simplified        100      0.095      0.0172      -

(e) $\alpha$=0.25, n=10     Moments               -        0.21       0.44        -
                            ML                    -        0.40       0.52        0.71
                            ML, classified data   -        -          -           -          0.54
                            LS, ordinary          (4)      (0.05)     -           (0.69)
                            LS, simplified        (4)      (0.40)     (0.43)      -

(f) $\alpha$=0.25, n=100    Moments               -        0.215      ...         -
                            ML                    -        0.240      0.170       0.178
                            ML, classified data   10       0.251      -           0.180      0.170
                            LS, ordinary          10       0.250      -           0.180
                            LS, simplified        10       0.224      0.154       -

(g) $\alpha$=0.25, n=1000   Moments               -        0.211      0.055       -
                            ML                    -        0.210      0.054       0.054
                            ML, classified data   50       0.207      -           0.054      0.054
                            LS, ordinary          50       0.200      -           0.057
                            LS, simplified        50       0.215      0.054       -

(h) $\alpha$=0.25, n=10000  Moments               -        0.262      0.0171      -
                            ML                    -        0.258      0.0170      0.0172
                            ML, classified data   100      0.259      -           0.0168     0.0170
                            LS, ordinary          100      0.258      -           0.0170
                            LS, simplified        100      0.260      0.0169      -

12.3 DISCUSSION

12.3.1 The estimated parameters and their errors

From the numerical values of Table 12.1 the following conclusions may be drawn:
- for each generated sample (experiment) the estimated values of the parameter by the five different procedures are generally in good agreement, except when the sample size is very small (n=10);
- within each sample the parameter errors estimated by the different procedures are roughly equal;
- the estimated errors are inversely proportional to the square root of the sample size.

For the larger samples the first result is not surprising, since all
five estimators are consistent and asymptotically unbiassed. For the very small samples with n=10, even the methods which utilize all information in the (unbinned) data, i.e. the moments and the ML method, give numerically different values for the estimated parameters; however, considering the magnitude of the estimated errors, these results are not incompatible.

The dependence of the estimated errors upon the sample size is as expected. For the p.d.f. of the present example, eq.(12.1), the minimum variance bound, MVB, can be evaluated from the fundamental Cramér-Rao inequality (8.11); one finds
$$\mathrm{MVB} = \frac{1}{n}\,\frac{2\alpha^3}{\ln(1+\alpha) - \ln(1-\alpha) - 2\alpha},$$
which is nothing but the ML large-sample variance of eq.(12.8). The corresponding MVB error $\Delta\alpha$ obtained for each generated sample is also given in Table 12.1. It is seen that the errors estimated by the different procedures are close to the MVB error for all sample sizes. This means that all five estimation procedures have a high efficiency also for small n. (However, no estimation procedure for $\alpha$ can be fully efficient for all n, since the p.d.f. of eq.(12.1) does not belong to the exponential family and therefore does not have any sufficient estimator for $\alpha$; Sect.8.6.1.)

Some of the estimated errors in Table 12.1 are somewhat smaller than the corresponding MVB error. This need not disturb us, since the MVB is to be understood as the lower limit of the expected value of the estimated variance, and thus represents no absolute minimum for this quantity. Hence if many new samples were generated, with similar number of events, these could give smaller or larger estimated errors than the actual table value for the given methods, but in such a way that their average value, for any method, would always be at least as large as the MVB error.

12.3.2 Goodness-of-fit

As was emphasized in Chapter 10 the Least-Squares method has an advantage over other parameter estimation procedures in that it can provide a direct measure of the goodness-of-fit between a fitted model and the experimental data, since the minimum value obtained for the optimized function, under certain conditions, has well defined distribution properties. Specifically, if the number of events is not too small, $\chi^2_{min}$ is a chi-square variable with a number of degrees of freedom equal to the number of independent terms in the $\chi^2$ sum minus the number of independent parameters estimated; the corresponding chi-square probability $P_{\chi^2}$ is then the probability for obtaining a higher value of $\chi^2_{min}$ than that observed, and can be found, for example, from the graph of Fig. 5.2.

Table 12.2 gives the minimum value $\chi^2_{min}$ and the deduced chi-square probability $P_{\chi^2}$ for the ordinary and the simplified LS fits to the generated samples of size n ≥ 100. In each fit the number of degrees of freedom is (N-1) - 1 = N-2, where N is the number of bins used.

Table 12.2.

Generated sample            Number of  Ordinary LS               Simplified LS
(experiment)                bins N     $\chi^2_{min}$  $P_{\chi^2}$    $\chi^2_{min}$  $P_{\chi^2}$

(b) $\alpha$=0.09, n=100    10         4.6       0.80            4.5       0.82
(c) $\alpha$=0.09, n=1000   50         38.5      0.82            47.3      0.56
(d) $\alpha$=0.09, n=10000  100        ...       ...             ...       ...
(f) $\alpha$=0.25, n=100    10         11.7      ...             11.8      0.17
(g) $\alpha$=0.25, n=1000   50         ...       ...             ...       ...
(h) $\alpha$=0.25, n=10000  100        39.0      >0.99           39.5      >0.99

The numbers of Table 12.2 show that the chi-square probabilities from the ordinary and the simplified LS methods are similar. The high probabilities indicate that the LS fits are "good". In particular, the exceedingly high probabilities obtained for the large sample experiments in this case are very likely just a reflection of a well-behaved random number generator, which has produced artificial event samples which are extremely close to the ideal continuous distributions of eq.(12.1).

For small samples the distribution of the $\chi^2_{min}$ statistic is not known, and the LS estimation is not associable with a chi-square probability expressing the goodness-of-fit. Similarly, regardless of the sample size, the moments and the ML estimation methods provide no direct measures for the goodness-of-fit. One may of course calculate a corresponding $\chi^2$ value from eq.(12.11) or (12.12) using the fitted parameter values by these methods and a reasonable binning of the data, but the $\chi^2$ statistic constructed this way is generally not simply chi-square distributed, and one will therefore in general not be able to assign a chi-square probability for the goodness-of-fit. Only if there is a very large number of observations, corresponding to a substantial number of events in the separate bins, can the $\chi^2$ statistic as obtained by inserting the ML estimates for the parameters be regarded as approximately chi-square distributed (see Sect.14.4.3), and a chi-square probability for the goodness-of-fit be deduced from, for example, a standard graph of the cumulative chi-square distribution.
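Instead of reading the chi-square probability from a graph, it can be computed directly. The sketch below uses the closed-form upper-tail integral of the chi-square p.d.f. that holds for an even number of degrees of freedom, which covers the fits quoted here (N = 10, 50, 100 bins give N-2 = 8, 48, 98). The function name is illustrative, and the restriction to even degrees of freedom is a simplification made for this example.

```python
import math


def chi2_probability_even(chi2_value, dof):
    """P(chi-square > chi2_value) for an even number of degrees of freedom.

    Uses the finite sum exp(-x/2) * sum_{j=0}^{dof/2 - 1} (x/2)^j / j!,
    which is exact when dof is even.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this sketch only handles positive even dof")
    half = chi2_value / 2.0
    term = 1.0          # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# chi-square minimum 4.6 with N = 10 bins, i.e. N - 2 = 8 degrees of freedom
print(round(chi2_probability_even(4.6, 8), 2))
# probability content of the a = 1 region for two parameters
print(round(1.0 - chi2_probability_even(1.0, 2), 3))
```

The second line reproduces the probability content of the joint region at one unit above the minimum for two parameters, since for two degrees of freedom the integral reduces to $1 - e^{-a/2}$.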
In principle, the numerical value obtained for lnL(max) could also be used to supply information on the goodness-of-fit if the distributional properties of the statistic lnL(max) were known. This is generally not the case. One can, however, construct an approximate probability distribution of lnL(max) corresponding to the specific $\hat{\alpha}$ and n by using the Monte Carlo technique to generate a large number of event samples, all of size n, and determine for these (independent) samples the frequency distribution of the values obtained for lnL(max). Since lnL(max) depends on the parameter $\alpha$, only simulated experiments producing fitted parameter values very close to the specific $\hat{\alpha}$ should be used in deriving this frequency distribution. The probability to obtain a smaller value than the actually observed lnL(max) for the specific $\hat{\alpha}$ can then be estimated as the integrated value from $-\infty$ up to lnL(max) of the derived frequency distribution, thus providing the desired measure of goodness-of-fit.
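The Monte Carlo construction of the lnL(max) distribution can be sketched as follows. Several details are assumptions made for illustration and are not taken from the text: the samples are generated with the true parameter set equal to the observed $\hat{\alpha}$, the hit-and-miss recipe is the usual one for $f(x|\alpha) = \frac{1}{2}(1+\alpha x)$, the ML fit is done by bisection on the derivative of lnL, and the window width, seed, and function names are arbitrary choices.

```python
import math
import random


def generate_sample(alpha, n, rng):
    """Hit-and-miss sample from f(x|alpha) = (1 + alpha*x)/2 on [-1, 1) (assumed recipe)."""
    events = []
    while len(events) < n:
        x = 2.0 * rng.random() - 1.0
        if rng.random() * (1.0 + abs(alpha)) <= 1.0 + alpha * x:
            events.append(x)
    return events


def log_likelihood(alpha, xs):
    """lnL of eq.(12.7)."""
    return -len(xs) * math.log(2.0) + sum(math.log(1.0 + alpha * x) for x in xs)


def ml_fit(xs, bound=0.999):
    """ML estimate by bisection on d(lnL)/d(alpha), restricted to |alpha| <= bound."""
    def score(a):
        return sum(x / (1.0 + a * x) for x in xs)   # decreasing function of a
    lo, hi = -bound, bound
    if score(lo) <= 0.0:
        return lo
    if score(hi) >= 0.0:
        return hi
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lnl_max_distribution(alpha_hat, n, window, n_wanted, rng, max_trials=200000):
    """lnL(max) values from simulated experiments whose fit lies within the window."""
    values = []
    trials = 0
    while len(values) < n_wanted and trials < max_trials:
        trials += 1
        xs = generate_sample(alpha_hat, n, rng)
        fitted = ml_fit(xs)
        if abs(fitted - alpha_hat) <= window:
            values.append(log_likelihood(fitted, xs))
    return sorted(values)


def probability_below(values, observed):
    """Fraction of simulated lnL(max) values smaller than the observed one."""
    return sum(1 for v in values if v < observed) / len(values)


values = lnl_max_distribution(0.39, 10, 0.02, 100, random.Random(7))
p = probability_below(values, -6.68)
print("simulated experiments kept:", len(values), " probability:", p)
```

Only a small fraction of the simulated experiments with n=10 falls inside the narrow window around $\hat{\alpha}$, so many more samples are generated than are finally kept; the resulting probability fluctuates with the seed and the number of experiments retained.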
Figure 12.4 shows the frequency distribution for lnL(max) and the
corresponding cumulative distribution F obtained this way on the basis of 100
independent simulated experiments with n=10 which all gave estimated parameter
values in the interval [0.37,0.41]. We take Fig. 12.4(a) to represent an
approximate distribution of lnL(max) for the two small sample experiments (a)
and (c) from Table 12.1, for which the ML method gave the estimated parameter
value α̂ equal to 0.39 and 0.40, respectively. Since the actual values obtained
for lnL(max) were -6.68 for experiment (a) and -6.79 for experiment (c), we
find from Fig. 12.4(b) that the corresponding estimated ML probabilities become
0.46 and 0.04 for these experiments. Proceeding in a similar manner to obtain
the approximate lnL(max) distributions for the two experiments (b) and (d) in
Table 12.1, both with n=100, we estimate their ML probabilities to be 0.55 and
0.12, respectively; these numbers compare reasonably with the chi-square
probabilities as given in Table 12.2, being ≈0.80 and ≈0.18 for these
experiments.

Fig. 12.4. (a) Distribution of lnL(max) obtained for 100 simulated
experiments with n=10 and α̂ ∈ [0.37,0.41]. (b) The cumulative integral
of the distribution in (a).
13. Minimization procedures
conceivable to apply these techniques for hand calculation. One may conveniently
think of the procedure as a programmed subroutine or algorithm which is called
with assigned values of the variables (parameters) and which returns the function
value and sometimes its derivatives.
The renewed interest in the minimization problem during the last years
has resulted in new minimization procedures as well as improvements and
extensions to old ones. We shall emphasize the principles behind selected
methods and describe how they work. The reader who needs more detailed
information and further theoretical justification should consult more
specialized literature.
Several very good and flexible minimization programmes*) of rather

*) An example is MINUIT of the CERN Program Library.

The minimization procedures we describe can conveniently be divided
into two main classes, the step methods and the gradient methods. The step
methods do not use any information about the derivatives of F(x) when the step
length and step direction have been chosen, whereas the gradient methods do.
Common to many methods is that they need the repetition of a certain
procedure to make the function converge towards a minimum. One must therefore
formulate convergence criteria which will ensure that the process is brought to
an end as soon as the criteria are fulfilled. The repetitions can be stopped,
for example, if the change in the function value for two consecutive iterations
is smaller than a preassigned number.
When a minimum has been obtained for F(x) it remains to find the errors
on the parameters. For the last of the minimization procedures described
here, the Davidon variance algorithm, the covariance matrix is obtained by the
algorithm itself. For the other minimization methods special variance
algorithms must be applied, based upon reasonable assumptions about F(x) and the
ideas of Chapters 9 and 10, whenever these are applicable.
In Sect.13.6 we will discuss the situation which arises when the function
F(x) is constrained through limited allowed regions for the parameters.

We will here consider functions for which an explicit analytic expression
is not specified or is very complicated. The usual way of finding the extrema
of a function by equating all its first derivatives to zero is then not
directly applicable. A reasonable approach in this situation is to perform a
mapping or search over the variable space to locate the minima of F(x). Such a
search can be done in many ways, defining different minimization procedures.
The function to be minimized often has more than one minimum. Since
it seems to be rather difficult to define minimization procedures which will
surely produce the absolute minimum of a function, or the global minimum, we
will at first anticipate the procedures to lead to the nearest local minimum.
To be able to propose "intelligent" minimization methods let us first
study the function F(x) near some arbitrary point c. Intuitively we expect all
the derivatives of a physically meaningful F(x) to exist in the region of
interest. We may therefore perform a Taylor series expansion of F(x) around the
point c and write

$$ F(\mathbf{x}) = F(\mathbf{c}) + \mathbf{g}^T(\mathbf{x}-\mathbf{c}) + \tfrac{1}{2}(\mathbf{x}-\mathbf{c})^T G\,(\mathbf{x}-\mathbf{c}) + \dots \qquad (13.1) $$

where g^T is the transposed gradient vector with elements g_i = ∂F/∂x_i, and the
matrix G has elements G_ij = ∂²F/∂x_i∂x_j with the indices i,j running from 1
to n. The derivatives are to be evaluated at x = c.
In eq.(13.1) F(c) is a constant and therefore supplies no information
about the location of a minimum. In the second term the vector g is expected to
vary considerably over parameter space, being close to zero in the neighbourhood
of a stationary minimum. The components of the product g^T(x-c) will tell us in
which direction F(x) changes most rapidly, but not how far we should go to
possibly reach the minimum. Information about the required step size can be
gained from the third term of eq.(13.1). The matrix G, derived from the second
derivatives of F(x), will usually have a modest variation over parameter space,
being constant for a function F(x) of strictly quadratic form. Clearly, if F(x)
is to possess any minimum value at all there must be some restrictions on the
symmetric n x n matrix G. At a stationary minimum G is positive-definite.
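As an illustration of how the gradient g and the matrix G together fix both a
direction and a step length, the sketch below minimizes the truncated expansion
exactly: setting the derivative of the quadratic model to zero gives the step
x - c = -G⁻¹g. The two-variable test function, with its minimum placed at
(1, -2), and the helper names are assumptions chosen for the example and do not
come from the text.

```python
def grad(x):
    """Gradient of the assumed quadratic
    F = (x1-1)**2 + (x1-1)*(x2+2) + 2*(x2+2)**2."""
    return (2.0 * (x[0] - 1.0) + (x[1] + 2.0),
            (x[0] - 1.0) + 4.0 * (x[1] + 2.0))


# Constant matrix of second derivatives of the same function.
G = ((2.0, 1.0), (1.0, 4.0))


def newton_step(c, g, G):
    """Minimize F(c) + g.(x-c) + 0.5*(x-c).G.(x-c) for two variables.

    Solves G d = -g by Cramer's rule and returns x = c + d. The stationary
    point is a minimum only if G is positive-definite, which is checked.
    """
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    if G[0][0] <= 0.0 or det <= 0.0:
        raise ValueError("G is not positive-definite")
    d0 = (-g[0] * G[1][1] + g[1] * G[0][1]) / det
    d1 = (-g[1] * G[0][0] + g[0] * G[1][0]) / det
    return (c[0] + d0, c[1] + d1)


start = (0.0, 0.0)
print(newton_step(start, grad(start), G))
```

For a strictly quadratic function, where G is constant, a single such step
reaches the minimum from any starting point; for a general F(x) the step is only
as good as the quadratic approximation around c.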
For a specified problem the choice of minimization method should depend
on the information available on the function F(x). In general, the more
information about F(x) actually used in the minimization, the more efficient we
expect the method to be. One can conveniently consider the following situations
in levels of increasing knowledge about F(x).
(i) the minimum of the function is given by F(1,1) = 0,
(ii) the components of the gradient vector g at an arbitrary point (x1,x2) are
given by

$$ g_1 = 400x_1^3 - 400x_1x_2 + 2x_1 - 2, \qquad g_2 = -200x_1^2 + 200x_2 . $$

(iii) the elements of the matrix G are given by

$$ G_{11} = 1200x_1^2 - 400x_2 + 2, \qquad G_{12} = G_{21} = -400x_1, \qquad G_{22} = 200 . $$

The mapping procedure runs into obvious difficulties if the parameter
range is very large (infinite). In this case one can in practice start the
mapping with a reasonable interval for x within the allowed region, and later
shift the range if the smallest value of the function turns out to be at the
boundary of the chosen interval.
13.2 STEP METHODS
The step methods presented below are more or less empirical and do not
have any real theoretical basis. Nevertheless, for many minimization problems
simple step methods perform equally well as the better grounded gradient methods.

13.2.1 Grid search and random search
The most elementary minimization procedure consists in mapping the function
values in a grid over the entire parameter space and keeping the point with
the lowest function value as the best point.
In a one-dimensional grid search one just calculates the function values
F(x) at points equally spaced Δx apart. One of these points must then lie within
½Δx from the true minimum, but since the minimum need not be closest to the point
with the smallest F-value it is for reasonably smooth functions assumed that this
simple grid search in one variable will only localize the minimum to within a
distance Δx.
In two variables a grid search locates the minimum within a rectangle
Δx1Δx2, in three variables within a volume Δx1Δx2Δx3, etc. With more variables
the grid search obviously requires a rapidly increasing number of function
evaluations. For example, to localize a minimum to within 1% of the range of one
variable by this technique, 100 function evaluations are necessary, while with
five variables the number of evaluations required is 10¹⁰. Clearly, therefore, a
simple-minded grid search should only be used for a rather crude mapping over
the parameter

The simple grid search can be characterized as a blind or unintelligent
method since it does not take account of what could be learned about the
function along the way. Assuming a reasonably smooth function the method
certainly involves many redundant function evaluations in regions of the
parameter space where the function values are not small. By performing the grid
search in more stages, however, the method can be made more efficient.
A multi-stage grid search can be done in the following way: in the first
stage a crude grid mapping is made all over the parameter space, confining the
minimum to a restricted volume element. In the second stage a new grid search
is performed within this volume, limiting the minimum to an even smaller region,
and so on.
With many parameters, instead of a systematic grid search all
over parameter space, good results are often obtained by a Monte Carlo search,
choosing points x randomly in that region of parameter space where one expects
the minimum to be. The Monte Carlo mapping is usually made with a Gaussian
random number generator constructing a set of trial points x around some
first-guess value x_0 of specified width. The point giving the lowest function
value is then retained and can be used as a start point for a more advanced
minimization technique.

Exercise 13.2: A function of four variables defined within a finite space is to
be minimized by a grid search. The minimum should be localized to within one per
mille of the range in each variable. Show that in a simple grid search 10¹²
function evaluations are necessary, and that 3·10⁴ evaluations are necessary in
a three-stage grid search.
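A minimal sketch of the multi-stage grid search described above, assuming a
rectangular parameter region and a function smooth enough for the best grid
cell to contain the minimum. The function names and the choice of cell centres
as grid points are illustrative assumptions, not taken from the text.

```python
import itertools


def grid_stage(f, lows, highs, points):
    """Evaluate f at the cell centres of a regular grid.

    Returns the best point, the cell widths and the number of evaluations.
    """
    widths = [(hi - lo) / points for lo, hi in zip(lows, highs)]
    axes = [[lo + (k + 0.5) * w for k in range(points)]
            for lo, w in zip(lows, widths)]
    best = min(itertools.product(*axes), key=f)
    return best, widths, points ** len(lows)


def multistage_grid_search(f, lows, highs, points=10, stages=3):
    """Repeat the grid search inside the best cell of the previous stage."""
    total = 0
    best = None
    for _ in range(stages):
        best, widths, n_eval = grid_stage(f, lows, highs, points)
        total += n_eval
        # Confine the next stage to the cell around the current best point.
        lows = [b - w / 2.0 for b, w in zip(best, widths)]
        highs = [b + w / 2.0 for b, w in zip(best, widths)]
    return best, total
```

With four variables, 10 points per variable and three stages the sketch spends
3·10⁴ function evaluations and ends with cells one per mille of the original
range wide, the situation of Exercise 13.2. A single-stage search with the same
final resolution would need 1000 points per variable, i.e. 10¹² evaluations.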
Fig. 13.2. The one-by-one variation method for finding the minimum of a function
of two variables; (a) weakly correlated variables, (b) strongly correlated
variables.

Although the one-by-one variation method usually does converge, it may
require a large number of steps before the convergence is reached. In cases with
strongly correlated variables the method is unacceptably slow, since the approach
towards the minimum goes by an inefficient zig zag curve crossing the sides of
the "valley" to which the minimum belongs; see Fig. 13.2(b).

13.2.4 The Rosenbrock method

Fig. 13.3. Illustration of the Rosenbrock method for a case with two variables.

points along the "best" line. When a minimum P3 has subsequently been found along
the "best" line a new search is made in a perpendicular direction, giving a
minimum P4. The process is repeated along a new "best" direction defined by the
line joining P2 and P4, and so on until the convergence criteria are fulfilled.
In n dimensions the Rosenbrock recipe is as follows: Let ξ_i, i=1,2,...,n
denote the set of n orthogonal directions. We find the minimum of F(x) along
each of these directions in turn, starting from the point P0 and after completing
the cycle reaching the point P1. Let s_i be the size of the step taken to reach

With this set of orthogonal directions the procedure is repeated for the point
P1, giving P2, and so on.
The Rosenbrock method usually works well if the number of variables is
not too large, but when the number increases its efficiency goes down.

Exercise 13.7: Verify that two different unit vectors ξ_i, ξ_j defined by
eqs.(13.4), (13.5) are orthogonal.

13.2.5 The simplex method
A frequently used step method for minimizing a function of many variables
is the simplex method invented by J.A. Nelder and R. Mead. The method is based
on the evaluation of the function value F(x1,x2,...,xn) at n+1 points forming a
general simplex*), followed by the replacement of the vertex with the highest
function value by a new and better point, if possible. The new point is obtained
by a specific algorithm, and leads to a new simplex better adapted to the func-

P* by the relation

$$ P^* = (1+\alpha)\bar{P} - \alpha P_h , \qquad (13.8) $$

where the reflection coefficient α is a positive constant. Three situations are
possible:
(i) If F(P*) < F(P_l) the reflection has produced a new minimum. To
see if we can do even better we make an expansion along the line
and go beyond P* to a new point P**, defined by

where the expansion coefficient γ is greater than unity. If
F(P**) < F(P_l) we replace P_h by P** and restart the process. If
F(P**) ≥ F(P_l) the expansion has failed and we replace P_h by P*

take as large steps as possible it is rather insensitive to shallow local minima
or fine structures in the function, implying a generally good adaption to the
landscape and a quick contraction to the overall minimum.
Exercise 13.8: The function F(x1,x2) = x1² + x2² is given together with an
initial simplex defined by P1 = (1,1), P2 = (1,-2), P3 = (-1,0). Show, using the
simplex minimization procedure with coefficients α = 1, β = 0.5, γ = 2, that
F(P_h) - F(P_l) < 1 after three iterations, and that the estimated minimum at
this stage is at (3/8, 5/16).
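The simplex procedure can be tried out on the function of Exercise 13.8 with
the sketch below. Only the reflection and the expansion test are visible in this
excerpt; the expansion formula P** = γP* + (1-γ)P̄, the contraction
P** = βP_h + (1-β)P̄ and the shrink step used when a contraction fails are taken
from the standard Nelder-Mead procedure and are assumptions here, as is the
function name `simplex_step`.

```python
def simplex_step(f, pts, alpha=1.0, beta=0.5, gamma=2.0):
    """One Nelder-Mead iteration on a list of vertices (tuples of floats)."""
    pts = sorted(pts, key=f)              # pts[0] = P_l, pts[-1] = P_h
    pl, ph, others = pts[0], pts[-1], pts[:-1]
    # Centroid of all vertices except the highest one.
    pbar = tuple(sum(p[k] for p in others) / len(others)
                 for k in range(len(ph)))

    def combine(a, p, b, q):
        return tuple(a * u + b * v for u, v in zip(p, q))

    pstar = combine(1.0 + alpha, pbar, -alpha, ph)        # reflection (13.8)
    if f(pstar) < f(pl):
        pexp = combine(gamma, pstar, 1.0 - gamma, pbar)   # expansion
        new = pexp if f(pexp) < f(pl) else pstar
    elif f(pstar) <= f(others[-1]):
        new = pstar                       # better than the second highest
    else:
        if f(pstar) < f(ph):
            ph = pstar
        pcon = combine(beta, ph, 1.0 - beta, pbar)        # contraction
        if f(pcon) > f(ph):
            # Failed contraction: shrink all vertices towards P_l.
            return [combine(0.5, p, 0.5, pl) for p in others + [ph]]
        new = pcon
    return others + [new]


F = lambda p: p[0] ** 2 + p[1] ** 2
simplex = [(1.0, 1.0), (1.0, -2.0), (-1.0, 0.0)]
for _ in range(2):
    simplex = simplex_step(F, simplex)
print(min(simplex, key=F))
```

With α = 1, β = 0.5, γ = 2 the first two calls both end in a contraction, and
the lowest vertex is then (3/8, 5/16), the point quoted in the exercise.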
In many cases the analytical expressions for the derivatives of the
function are very complicated or can not be found at all. General programmes

$$ \frac{\partial F}{\partial x_i} = g_i \approx \frac{F(x_1,\dots,x_i+\Delta x_i,\dots,x_n) - F(x_1,\dots,x_i,\dots,x_n)}{\Delta x_i} . \qquad (13.12) $$

out as by-products when estimating the gradient by the symmetric method, i.e.
eq.(13.13), since one may write

For the off-diagonal elements of G we get, using symmetrical steps,

for each off-diagonal element. Since there are n(n-1)/2 independent off-diagonal

small regions near x, symmetrical steps will not be necessary to find G. The
off-diagonal elements may then be calculated from the formula

Exercise 13.11: Verify the formula eq.(13.14) for the diagonal elements of the
matrix G by applying eq.(13.13) twice.

Exercise 13.12: Verify the expressions given for the off-diagonal elements of G,
eqs.(13.15),(13.16).

Exercise 13.13: Consider the quadratic function F(x1,x2) = x1² + x1x2 + 2x2² for
which the matrix of second derivatives is constant and equal to

The steepest descent method is illustrated for a case with two variables
in Fig. 13.5. In this situation the method is equivalent to the one-by-one
variation method except for a rotation of the coordinate axes; in the general
case with more than two dimensions the two methods are not equivalent.
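A small sketch of steepest descent with an exact line minimum in every step,
assuming a strictly quadratic test function so that the minimum along each line
is available in closed form. The function F = x1² + x1x2 + 2x2², the starting
point and all names are chosen for illustration only.

```python
# Constant matrix of second derivatives of F = x1**2 + x1*x2 + 2*x2**2.
G = ((2.0, 1.0), (1.0, 4.0))


def grad(x):
    """Gradient of the assumed quadratic test function."""
    return (2.0 * x[0] + x[1], x[0] + 4.0 * x[1])


def steepest_descent(x, n_iter):
    """Return the list of search directions used and the final point."""
    directions = []
    for _ in range(n_iter):
        g = grad(x)
        d = (-g[0], -g[1])                 # direction of steepest descent
        Gd = (G[0][0] * d[0] + G[0][1] * d[1],
              G[1][0] * d[0] + G[1][1] * d[1])
        curvature = d[0] * Gd[0] + d[1] * Gd[1]
        if curvature == 0.0:               # gradient vanished: at the minimum
            break
        # Exact minimum of F(x + s*d) for a quadratic F.
        s = (d[0] ** 2 + d[1] ** 2) / curvature
        x = (x[0] + s * d[0], x[1] + s * d[1])
        directions.append(d)
    return directions, x


dirs, x_end = steepest_descent((2.0, 1.0), 6)
for d_old, d_new in zip(dirs, dirs[1:]):
    print(d_old[0] * d_new[0] + d_old[1] * d_new[1])   # scalar products
```

The printed scalar products of consecutive search directions vanish up to
rounding, so the path towards the minimum is a sequence of right-angle turns.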
E x e r c i s e 13.14: Given t h e f u n c t i o n F ( x I . x ~ =
) X ~ + X ~ X ~ + X I X : + X ~ +C a l c u l a t e t h e
first and second derivatives in the point (1,1) analytically and numerically,
using (i) Δx1 = Δx2 = 1, (ii) Δx1 = Δx2 = 0.1.
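Numerical derivatives of the kind discussed in this section can be sketched
with symmetric finite differences as below. The exact forms of eqs.(13.13)
to (13.16) are not legible in this excerpt, so the standard central-difference
formulas are used instead. The test function is assumed to be the curved valley
F = 100(x2 - x1²)² + (1 - x1)², which reproduces the analytic gradient and
second-derivative expressions quoted earlier in the chapter.

```python
def rosenbrock(x):
    """Assumed test function with its minimum F(1,1) = 0."""
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2


def numerical_gradient(f, x, h):
    """Symmetric-step estimate of the gradient components dF/dx_i."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g


def numerical_hessian(f, x, h):
    """Symmetric-step estimate of the matrix G of second derivatives."""
    n = len(x)
    G = [[0.0] * n for _ in range(n)]
    f0 = f(x)

    def shifted(i, si, j, sj):
        y = list(x)
        y[i] += si * h
        y[j] += sj * h
        return f(y)

    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        G[i][i] = (f(xp) - 2.0 * f0 + f(xm)) / h ** 2
        for j in range(i + 1, n):
            # Four function evaluations for each off-diagonal element.
            G[i][j] = G[j][i] = (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
                                 - shifted(i, -1, j, 1)
                                 + shifted(i, -1, j, -1)) / (4.0 * h * h)
    return G
```

Repeating the estimate with a large and a small step, as in Exercise 13.14,
shows how quickly the agreement with the analytic derivatives improves as the
step shrinks.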
13.3.2 Method of steepest descent
The steepest descent method can be thought of as a natural improvement
of the one-by-one variation method in situations where the derivatives of F(x)
are known.
From the starting point P0 we here seek a minimum of F(x) along the
direction in parameter space where the function decreases most rapidly. When the
minimum P1 in this direction has been found the process is repeated searching a
second and better minimum in a direction orthogonal to the first, giving P2, and
so on until a satisfactory convergence has been obtained.
The direction ξ of the steepest descent in the point P0 = x0 has components

where the derivatives are evaluated at P0. If the search along the direction ξ
has led to the minimum P1 the new direction η of steepest descent in the point
P1 is perpendicular to the previous direction ξ. To see this, introduce a
variable s on a line along the direction ξ through P0. All points on this line
then satisfy x = x0 + sξ. The minimum point P1 on the line is defined by the
requirement ∂F/∂s = 0. With the derivatives evaluated at P1 we have

which implies that two consecutive directions of search will always be orthogonal.

Fig. 13.5. Illustration of the method of steepest descent
for a case with two variables.

The method of steepest descent ensures a better convergence than the
simple one-by-one variation method. Still it may be rather slow when the
variables have a complicated interdependency in F and the choice of starting
point P0 has not been fortunate.

Exercise 13.15: For the "Rosenbrock curved valley" of Exercise 13.1, show how
different choices of the starting point P0 for the steepest descent method will
lead to minimizations of different efficiency.

13.3.3 The Davidon variance algorithm
The essential feature of Davidon's variance algorithm is that the function
is made to approach its minimum by letting the covariance matrix V(x) = G⁻¹
undergo successive approximations. This means that a simultaneous convergence is
obtained towards the function minimum and the true covariance matrix. In this
If $\rho \ge \varepsilon$, proceed to (iii).
Define $\gamma = -g^T r/\rho$.
If $-\frac{\alpha}{1+\alpha} \le \gamma \le \frac{\alpha}{1-\alpha}$, define $\lambda = \alpha$.
If $-\frac{\beta}{\beta+1} \le \gamma < -\frac{\alpha}{1+\alpha}$, define $\lambda = -\frac{\gamma}{\gamma+1}$.
If $-\frac{\beta}{\beta-1} \le \gamma < -\frac{\beta}{\beta+1}$, define $\lambda = \beta$.
If none of these three, define $\lambda = \frac{\gamma}{\gamma+1}$.
Define $V^*_{ij} = V_{ij} + (\lambda-1)\,r_i r_j/\rho$.
$g = g^*$, $V = V^*$.
Proceed to (i) for a new iteration.

Before starting the iterations the method requires the knowledge of an
starting point for the minimization.

ous types of problems. In fact it can be proven that when F(x) is of quadratic

In minimizing a function with several minima the task usually belongs
to one of the following three categories in descending order of difficulty:
(i) All minima are of interest and should be found. An obvious
although rather primitive way to look for all minima is to make a complete
mapping of F(x) over the entire parameter space, but this may imply a large
number of function evaluations and be prohibited for economical reasons. Another
approach is to perform the minimum search from several starting values. This
may, however,

ever, as if no safe and simple method has been invented yet. For details on this

9.7.1, and 9.7.6, respectively. The numerical problem itself is trivial when the
(i)   f_1(x) = 0
      f_2(x) = 0              constraint equations
(ii)  a_1 ≤ x_1 ≤ b_1
      a_2 ≤ x_2 ≤ b_2         simple constant limits
(iii)                         simple variable limits
Here a_i, b_i, i=1,...,n are constants while all functions
depend on x. The general conditions of type (iv) actually include all conditions
of the types (ii) and (iii), but for practical and illustrative purposes they may
be separated as above.
In the special field of linear programming, where the function F(x) and
its constraints are linear in the parameters, the ideas discussed in the
following are not of practical use. When F(x) is linear it is the constraints
that make it possible for the function to take a minimum value at a finite point
at the boundary of the allowed region. Thus, in the field of linear programming
the constraints are essential to obtain a minimum, whereas in our problems the
constraints are more or less regarded as nuisance.

Fig. 13.6. Error determination for an ill-behaved function (see text).
Constraints can be taken care of by special procedures for
constrained minimization. We will, however, only consider techniques for modi-
- elimination of variables using the constraint equations,
- introduction of Lagrangian multipliers,
- change of variables to eliminate constraints,
- introduction of penalty functions.
In particular, when F(x) is to be minimized under k constraint equations
f_i(x) = 0, i=1,2,...,k, corresponding to case (i) above, one may use the
constraint equations to eliminate variables in F(x). An example on the
elimination approach was given in Sect.10.7.1. Alternatively one can, as
discussed in Chapter 10, introduce Lagrangian multipliers λ = {λ1,λ2,...,λk} and
construct a modified function T(x,λ), where

If, for instance, x_i has to be non-negative, 0 ≤ x_i ≤ ∞, either of the

problems which require 0 ≤ x_i ≤ 1 transformations like

$$ x_i = \sin^2 y_i \quad \text{or} \quad x_i = \frac{e^{y_i}}{e^{y_i} + e^{-y_i}} \qquad (13.20) $$

will remove the constraint. With general constant limits a_i ≤ x_i ≤ b_i an often
used transformation is

$$ x_i = a_i + (b_i - a_i)\sin^2 y_i . \qquad (13.21) $$

The variable changes above will not introduce new minima in x. The
transformations involving sin²y_i actually produce for each minimum in x-space
many equally-spaced minima in y-space. This should, however, not cause
difficulties provided that the minimization procedure used does not involve so
long steps that intermediate hills are crossed.
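The change of variables of eq.(13.21) can be tried out with the short sketch
below. The one-dimensional step-halving search merely stands in for whichever
unconstrained minimization procedure is used, and the test function, the limits
and all names are assumptions made for the example.

```python
import math


def to_x(y, a, b):
    """Eq.(13.21): map an unrestricted y onto the interval a <= x <= b."""
    return a + (b - a) * math.sin(y) ** 2


def minimize_1d(func, y, step=0.5, tol=1e-10):
    """Crude step-halving search in the unrestricted variable y."""
    fy = func(y)
    while step > tol:
        for trial in (y + step, y - step):
            ft = func(trial)
            if ft < fy:
                y, fy = trial, ft
                break
        else:
            step /= 2.0          # no improvement on either side
    return y


# Assumed example: F(x) = (x - 3)**2 with the constant limits 0 <= x <= 2.
a, b = 0.0, 2.0
F = lambda x: (x - 3.0) ** 2
y_min = minimize_1d(lambda y: F(to_x(y, a, b)), 0.3)
print(to_x(y_min, a, b))
```

The unconstrained minimum of this F lies outside the allowed region, so the
search in y settles where sin²y = 1, which is the upper limit x = 2. Because
sin²y is periodic, other starting values of y can end in a different but
equivalent minimum in y-space that maps back to the same x.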
In the following sections we will see how the two techniques involving
change of variables and penalty functions can be applied to the situations
(ii)-(iv) above.

13.6.1 Elimination of constraints by change of variables
An important practical case of constrained minimization occurs when
some or all variables are bounded by constant limits, a_i ≤ x_i ≤ b_i. In a
minimization by a grid search such inequality constraints are easily taken care
of by limiting the search region. With the other minimization procedures one may
eliminate the explicit inequality constraints by changing to a new set of
unrestricted variables. To find the minima one simply expresses F(x) by the new
variables y through a straightforward substitution and minimizes with respect to
y. The minima obtained in y-space are then transformed back to x-space. The
errors in x must be found from an ordinary computation of error propagation.

x1 = e^{y1}, x2 = sin²y2, XI = 711
verify that the minimum can be obtained by an unconstrained minimization in
y-space followed by the transformation of the minimum point back to x-space. If
the covariance of the minimum point in y-space is V(y), show that the covariance
of the minimum in x-space is V(x) = S V(y) S^T, where
    e^{y1}   0   0

Exercise 13.18: The minimum of a function F(x1,x2) subject to the constraint
    0 ≤ x1 ≤ x2 ≤ ∞
can be obtained by an unconstrained minimization with direct substitution of
    x1 = y1², x2 = y1² + y2².
Consider the explicit function F(x1,x2) = x1² - x1x2 + x2². Show analytically
p_i(x) ≥ 0, for i=1,2,...,m. The modified function T(x,r) to be minimized is then
defined by

$$ T(\mathbf{x},r) = F(\mathbf{x}) + r\sum_{i=1}^{m} w_i/p_i(\mathbf{x}) . \qquad (13.23) $$

minimizing a function F(u_P,u_f) of summed squared deviations, subject to the
constraints
the choice of procedure should be made also considering where in the region of
parameter space the minimization is to be started. Quite often a method which
works well in the distant regions is less efficient when approaching the minimum.
This suggests that the different methods be applied in sequence, depending on the
stage reached. Indeed, the general purpose programmes available incorporate
several minimization techniques and allow the user to change from one method
to another in one run.

To make use of an unconstrained minimization procedure with Carroll's
response surface technique we must rewrite the constraints (2) to the form
p_i(x) ≥ 0, i=1,2,...,m. The equivalent set of constraints is
1
P f
From this example it is realized that, depending on which variable is
eliminated from the normalization condition, somewhat different estimates may be
found for the production fractions. The differences will, however, be reflected
by the errors connected to the estimates.
$$ \text{Prob(Type II error)} = \beta = \int_{W-R} f(x|\theta_1)\,dx = 1 - \int_{R} f(x|\theta_1)\,dx . $$

as possible. This vague statement about the choice of the best critical region
can be given a more quantitative form in terms of the Neyman-Pearson test and
the likelihood-ratio test to be discussed later. Before we proceed to discuss
these explicit tests we will illustrate with a specific example from particle
physics some of the concepts introduced above.
against the alternative hypothesis

$$ H_1:\ \bar{p}p \rightarrow \pi^+\pi^+\pi^-\pi^- M , $$

(here M denotes "more than one neutral pion"), and the missing-mass squared m²
is to be used as a test statistic. The critical value m_c² is most reasonably

The power of a test is defined as the probability of rejecting a hypothesis
when it is false. We have for the power of the test of the null hypothesis
H0 against the alternative H1:

be an approximate chi-square variable with one degree of
freedom, it can be used as a test statistic for a goodness-of-fit test; see also
Sect.14.4.5.

14.2.2 The Neyman-Pearson test for simple hypotheses
The Neyman-Pearson lemma states that, for a fixed significance level,
the best critical region R should include those values of x for which f(x|θ1) is
as large as possible relative to f(x|θ0).
It can easily be verified that this choice of R maximizes the power of
the test H0: θ = θ0 against H1: θ = θ1. Consider one measurement x and define
the significance α of the test by the integral of eq.(14.1). The region R should
obviously include all points where f(x|θ0) = 0 and f(x|θ1) > 0, since these
points do not contribute to α. For f(x|θ0) > 0 the power of the test of H0
against H1 may be written (from eq.(14.3))

where ξ is a point within R. Since the last integral is nothing but the
constant α, the power of H0 will be maximal if the region R is chosen such that
the ratio f(x|θ1)/f(x|θ0) is as large as possible. The best critical region
therefore consists of points satisfying the inequality

$$ \int_{R} L(\mathbf{x}|\theta_0)\,d\mathbf{x} = \alpha \qquad (14.7) $$

which replaces eq.(14.1).
The critical region constructed in accordance with the rules eqs.
(14.6), (14.7) will provide the maximum power of a simple null hypothesis against
a simple alternative hypothesis for the given significance level. Equivalently,
this region minimizes the probability of Type II errors.
For the case with many measurements the critical region R in x-space
may be difficult to find since eq.(14.7) constitutes an n-dimensional integral.
In practice one will seek the critical region for some test statistic, expressed
as a function of the x_i. For instance, it will be convenient to seek the
critical region for the sample mean x̄ when testing on a population mean μ, or the
sample variance s² when the test involves the population variance σ². In any
case, to determine the critical region one will have to integrate over the
probability density function for the test statistic actually used.
It should be noted that the Neyman-Pearson test is applicable only in
testing simple hypotheses. For composite hypotheses one can only rarely find a
test which is more powerful than any other test. The latter type of problem
will be discussed in Sect.14.2.4 in connection with the likelihood-ratio test.
The condition given by eq.(14.6) now reads, assuming $n$ observations,

The best critical region in $t$ space is therefore the set of values satisfying the inequality (14.8), where $T_n$ is a constant which can be calculated for any given significance $\alpha$. We see that the sample mean $\bar t$ is a natural test statistic for this test about the population mean $\tau$. However, to find $R$ (or $T_n$) it is necessary to know the p.d.f. $f(\bar t)$ for $\bar t$. This p.d.f. is particularly simple for the two limiting cases, $n = 1$ and $n$ very large. Fixing the significance level to 5% we will now study these extreme cases.

$n = 1$. The p.d.f. for the test statistic $\bar t$ is in this case the same as the p.d.f. for $t$.

With just one observation of the $\Xi^0$ lifetime we will therefore reject the null hypothesis $H_0$ if the observed value $t_{obs}$ is larger than $T_1 = 3.00$. The probability that we accept $H_0$ when the alternative hypothesis $H_1$ is true (i.e. we commit a Type II error) is very large, namely $\beta = 0.78$.

The lower limit $T_n$ of the critical region for $\bar t$ is now implied by the integral

$$\alpha = \int_{T_n}^{\infty} f(\bar t\,|\,\tau = 1)\,d\bar t = 1 - G\big(\sqrt{n}\,(T_n - 1)\big),$$

where $G$ is the cumulative standard normal distribution function. From Appendix Table A6 we find the value of the argument of $G$ to be 1.645; hence

$$T_n = 1 + 1.645/\sqrt{n}.$$

The power of the test $H_0$: $\tau = 1$ against $H_1$: $\tau = 2$ now becomes

$$1 - \beta = 1 - G\big(\tfrac{1}{2}\sqrt{n}\,(T_n - 2)\big) = 1 - G\big(\tfrac{1}{2}(1.645 - \sqrt{n})\big).$$

significance $\alpha = 0.05$, the power of the test $H_0$ against $H_1$ increases very rapidly

cance level is lowered.

$\omega$ cannot exceed the maximum value over the entire space $\Omega$, $\lambda$ must be

renders $L(\hat\omega)$ close to the maximum $L(\hat\Omega)$, and hence $H_0$ will have a large probability of being true. On the other hand, a small value of $\lambda$ will indicate that $H_0$ is unlikely. The variable $\lambda$ is therefore intuitively a reasonable test statistic for $H_0$, and the likelihood-ratio test states that the critical region for $\lambda$ is given by

Exercise 14.1: For the lifetime example, consider the two simple hypotheses

$H_0$: $\tau = 1$, ($\Delta I = 1/2$ rule),
$H_1$: $\tau = 1/4$, ($\Delta I = 3/2$ rule).

For the significance $\alpha = 0.05$, show that the power of the test $H_0$ against $H_1$ with only one observation is 0.19.
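The single-observation numbers of the lifetime example and of Exercise 14.1 can be checked with a few lines of Python. This is a minimal sketch, assuming the lifetime p.d.f. is the exponential $f(t|\tau) = \exp(-t/\tau)/\tau$ with $\tau$ in the units used above; the function names are illustrative and do not come from the text.

```python
import math

def upper_tail_test(tau0, tau1, alpha):
    """One observation, H0: tau = tau0 against H1: tau = tau1 > tau0.

    Assumes the exponential p.d.f. f(t|tau) = exp(-t/tau)/tau, for which the
    likelihood ratio grows with t, so the best critical region is t > T
    with exp(-T/tau0) = alpha.
    """
    T = -tau0 * math.log(alpha)          # critical value
    power = math.exp(-T / tau1)          # P(t > T | H1)
    return T, power

def lower_tail_test(tau0, tau1, alpha):
    """Same test for tau1 < tau0: the best critical region is t < T."""
    T = -tau0 * math.log(1.0 - alpha)    # 1 - exp(-T/tau0) = alpha
    power = 1.0 - math.exp(-T / tau1)    # P(t < T | H1)
    return T, power

# Lifetime example: H0: tau = 1 against H1: tau = 2
T1, power = upper_tail_test(1.0, 2.0, 0.05)
print(f"T1 = {T1:.2f}, beta = {1.0 - power:.2f}")

# Exercise 14.1: H0: tau = 1 against H1: tau = 1/4
T, power = lower_tail_test(1.0, 0.25, 0.05)
print(f"T = {T:.4f}, power = {power:.2f}")
```

Under these assumptions the first call gives the critical value 3.00 and Type II error probability 0.78 quoted above, and the second gives the power 0.19 asked for in the exercise.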
samples. For example, if the null hypothesis fixes the values of $r$ of the parameters it can be shown that, for $H_0$ true, the statistic $-2\ln\lambda$ tends asymptotically to a chi-square variable with $r$ degrees of freedom. The statistic $-2\ln\lambda$ can therefore be used to test $H_0$, taking the critical region at the right-hand tail of the appropriate chi-square distribution.

It should be stressed that the asymptotic behaviour of the likelihood-ratio depends essentially on the same conditions that give the optimum properties of the Maximum-Likelihood estimators.

has a specified value $\mu_0$, given a sample of $n$ observations $x_1, x_2, \ldots, x_n$. The null hypothesis to be tested is then

For both hypotheses the second parameter $\theta_2$ is some positive number. The total

The absolute maximum of the likelihood function is found by inserting $\hat\theta_1$ and $\hat\theta_2$ in eq.(14.17), giving

For $H_0$ true, the maximum of the likelihood function is

Since

thus provide the critical region for $\lambda$ with a given significance $\alpha$; compare eq.

$t^2 > t_{\alpha/2}^2$. Thus the critical region for $t$ corresponding to the significance $\alpha$ consists of the two intervals

$$t < -t_{\alpha/2} \quad \text{and} \quad t > t_{\alpha/2}.$$

Here $t_{\alpha/2}$ is defined by

$$\int_{t_{\alpha/2}}^{\infty} f(t; n-1)\,dt = \tfrac{1}{2}\alpha. \qquad (14.27)$$

the hypothesis $H_0$ is accepted.

Consider, for instance, an experiment with $n = 20$ and a chosen significance $\alpha = 0.05$. For this case Appendix Table A7 gives $t_{.025} = 2.093$. If therefore the observed $|t|$-value computed from eq.(14.25) is larger than 2.093, the null hypothesis should be rejected. The corresponding critical value for $\lambda$ is found by inserting $t_{.025} = 2.093$ into eq.(14.24), giving $\lambda_{.05} = 0.125$.

Let us compare this exact calculation of the critical region for $\lambda$ to

When a Taylor expansion is performed and the sample variance replaced by $\sigma^2$ for

For a chi-square variable with one degree of freedom we find from Appendix Table A8 that a one-sided test for the significance $\alpha = 0.05$ corresponds to the critical value $-2\ln\lambda_{.05} = 3.841$, from which $\lambda_{.05} = 0.147$. Comparing this to the exact value 0.125 obtained above, we see that even for a sample size as small as 20 the asymptotic $-2\ln\lambda$ approximation is reasonable.

To compute the power of the test we have to consider the alternative hypothesis $H_1$: $\theta_1 \neq \mu_0$. The variable $t$ as defined by eq.(14.25) may still be

satisfies

It is seen that $\delta$ gives a measure of how much the alternative deviates from the null hypothesis.

In accordance with the definition of the power of a test the power function is defined as

where $g(\lambda|H_1,\delta)$ is the distribution of $\lambda$ for a given $\delta$ when $H_1$ is true, and $f(t; n-1, \delta)$ is the non-central t-distribution with $n-1$ degrees of freedom. The cumulative distribution of $f(t; n-1, \delta)$ has been tabulated for various parameter combinations in, for instance, "Tables of the non-central t-distribution" by G.J. Resnikoff and G.J. Lieberman. In Fig. 14.4 the power function is shown for

the test of the simple null hypothesis $H_0$: $\tau = 1$ against the composite alternative hypothesis $H_1$: $\tau \neq 1$. Show that with the sample $t_1, t_2, \ldots, t_n$ the likelihood-ratio becomes

$$\lambda = \bar t^{\,n} \exp[-n(\bar t - 1)],$$

where $\bar t = \sum t_i/n$. For $n$ reasonably large, when $\bar t$ is $N(1, 1/n)$, show that $\alpha = 0.05$ corresponds to a critical region $0 < \lambda < \lambda_{.05}$ where

$$\lambda_{.05} = \left(1 + \frac{1.645}{\sqrt{n}}\right)^{n} \exp(-1.645\sqrt{n}).$$

We then have a problem which involves a reasoning quite analogous to that used in establishing confidence intervals for the mean of a normal distribution, which was carried out in some detail in Sect.7.2.
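As a numerical cross-check of the likelihood-ratio example for the mean of a normal distribution ($n = 20$, $\alpha = 0.05$), the exact and the asymptotic critical values of $\lambda$ can be recomputed. The sketch below assumes that eq.(14.24) has the standard form $\lambda = (1 + t^2/(n-1))^{-n/2}$ for this problem; the value $t_{.025} = 2.093$ is the one quoted from Appendix Table A7, and the helper names are illustrative only.

```python
import math
from statistics import NormalDist

def lambda_exact(t_crit, n):
    """Critical lambda from the critical |t|, assuming the standard relation
    lambda = (1 + t^2/(n-1))^(-n/2), taken here to be the form of eq.(14.24)."""
    return (1.0 + t_crit**2 / (n - 1)) ** (-n / 2.0)

def lambda_asymptotic(alpha):
    """Critical lambda from -2 ln(lambda) ~ chi-square with one degree of freedom.

    For one degree of freedom the upper-tail chi-square quantile equals the
    square of the two-sided standard normal quantile.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    chi2_crit = z * z
    return math.exp(-chi2_crit / 2.0), chi2_crit

exact = lambda_exact(2.093, 20)
approx, chi2_crit = lambda_asymptotic(0.05)
print(f"chi-square critical value: {chi2_crit:.3f}")
print(f"lambda_.05 exact: {exact:.4f}, asymptotic: {approx:.4f}")
```

With these assumptions the two critical values agree, to within rounding, with the 0.125 and 0.147 quoted in the comparison above.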
Depending on whether $\sigma^2$ is a known quantity or not, the appropriate test statistic to test $H_0$: $\mu = \mu_0$ is (see Sect. 7.2.1 and 7.2.2 for the arguments)

$$\frac{\bar x - \mu_0}{\sigma/\sqrt{n}}, \quad \text{which is } N(0,1) \quad (\sigma^2 \text{ known}),$$
$$\frac{\bar x - \mu_0}{s/\sqrt{n}}, \quad \text{which is } t(n-1) \quad (\sigma^2 \text{ unknown}).$$

Here N(0,1) is the usual abbreviation for a standard normal variable, while t(n-1) is a Student's t-variable with $n-1$ degrees of freedom; $\bar x$ and $s^2$ are the well-known sample characteristics,

$$\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar x)^2.$$

In testing the mean we are likely to reject $H_0$ (and accept $H_1$) if the observations are such that the sample mean $\bar x$ is either much too small or much too large compared to the hypothetical value $\mu_0$. Therefore a two-sided test is applied. Fixing the significance $\alpha$ we will use the appropriate distribution (i.e. the standard normal or the Student's t, depending on whether $\sigma^2$ is known or not) to determine the two critical values of the test statistic, which correspond to a probability $\tfrac{1}{2}\alpha$ in either tail of the distribution. If it is found that the observed value of the test statistic falls in any of the two tails (i.e. it belongs to the critical region) then we shall reject the hypothesis $H_0$: $\mu = \mu_0$ at the significance level $100\alpha$ %; otherwise the null hypothesis is accepted.

A similar procedure is applicable if we want to test the variance of the normal distribution. The null hypothesis is

$$H_0:\ \sigma^2 = \sigma_0^2, \qquad (14.33)$$

where $\sigma_0^2$ is specified, and the alternative (composite) hypothesis can be

$$H_1:\ \sigma^2 \neq \sigma_0^2. \qquad (14.34)$$

The technique developed earlier (see Sects.7.3.1 and 7.3.2) for establishing a confidence interval for the variance of a normal distribution can now be used. We recall that a variable constructed from the sample variance $s^2$ as $(n-1)s^2/\sigma^2$ will have a chi-square distribution with $n-1$ degrees of freedom. If we should happen to know what the true mean $\mu$ of the population is we may replace $\bar x$ by $\mu$ in the evaluation of $s^2$, and the chi-square variable will have $n$ degrees of freedom. Thus the appropriate test statistic to be used for testing $H_0$: $\sigma^2 = \sigma_0^2$ is

$$\sum_{i=1}^{n} (x_i - \mu)^2/\sigma_0^2, \quad \text{which is } \chi^2(n) \quad (\mu \text{ known}),$$
$$(n-1)s^2/\sigma_0^2 = \sum_{i=1}^{n} (x_i - \bar x)^2/\sigma_0^2, \quad \text{which is } \chi^2(n-1) \quad (\mu \text{ unknown}). \qquad (14.35)$$

In testing the variance we are likely to reject $H_0$ (and accept $H_1$) if $s^2$ from the measurements is either much too small or much too large compared to the value $\sigma_0^2$ specified by $H_0$. Therefore, again, a two-sided test is appropriate. Since the chi-square distribution is unsymmetric the choice of the two critical regions is not unambiguous, but common practice is to take equal probabilities ($=\tfrac{1}{2}\alpha$) at both ends of the distribution.

In general, two-sided tests for the null hypothesis $H_0$ will always be appropriate whenever the alternative hypotheses are of the forms of eqs.(14.31), (14.34). If the alternatives are formulated as, for example, $H_1$: $\mu > \mu_0$, one-sided tests must be applied to test $H_0$.

14.3.2 Comparison of means in two normal distributions

We shall assume that $x_1, x_2, \ldots, x_n$ is a sample of size $n$ from $N(\mu_1, \sigma_1^2)$ and $y_1, y_2, \ldots, y_m$ a sample of size $m$ from $N(\mu_2, \sigma_2^2)$. We want to test the hypothesis that the two normal distributions have the same mean,

$$H_0:\ \mu_1 = \mu_2,$$
$$H_1:\ \mu_1 \neq \mu_2. \qquad (14.37)$$
The formalism covers situations frequently met in practice when it is desired to check whether two series of numbers really are measurements on the same physical quantity. Although simply stated, the null hypothesis is not always easily tested unless some simplifying assumptions can be made about the variances $\sigma_1^2$ and $\sigma_2^2$. We shall consider the following situations:

(i) $\sigma_1^2$ and $\sigma_2^2$ are known,
(ii) $\sigma_1^2$ and $\sigma_2^2$ are unknown, but equal,
(iii) $\sigma_1^2$ and $\sigma_2^2$ are unknown and different.

be N(0,1). To test the null hypothesis of equal means, when $\mu_1 - \mu_2 = 0$, we consequently use the test statistic

$$\frac{\bar x - \bar y}{\sqrt{\sigma_1^2/n + \sigma_2^2/m}},$$

which is N(0,1) under $H_0$. The problem is therefore reduced to a well-known one, being exactly analogous to the testing of the mean of a normal distribution when its variance is known, which was discussed in the preceding section.

(ii) $\sigma_1^2$ and $\sigma_2^2$ are unknown, but equal

If the variances of the populations are not known it is still possible to carry out the desired test, provided that it can be assumed that the two variances are of equal size. To construct a test statistic for this case we recall that with the two samples $x_1, x_2, \ldots, x_n$ from $N(\mu_1, \sigma_1^2)$ and $y_1, y_2, \ldots, y_m$ from $N(\mu_2, \sigma_2^2)$ we can immediately write down a set of four variables which are all independent and have well-known distributions:

$\bar x$, which is $N(\mu_1, \sigma_1^2/n)$,
$\bar y$, which is $N(\mu_2, \sigma_2^2/m)$,
$(n-1)s_1^2/\sigma_1^2$, which is $\chi^2(n-1)$,
$(m-1)s_2^2/\sigma_2^2$, which is $\chi^2(m-1)$.

These variables are the entities from which we must try to build up a new variable that may serve as a test statistic. The two normal variables can be combined in the usual way such that

furthermore independent, and according to Sect.5.2.1 they can be used to form a Student's t-variable by the prescription of eq.(5.14). Hence

$$\frac{(\bar x - \bar y) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n + \sigma_2^2/m}} \Bigg/ \sqrt{\frac{(n-1)s_1^2/\sigma_1^2 + (m-1)s_2^2/\sigma_2^2}{n+m-2}} \qquad (14.39)$$

will be a Student's t-variable with $(n+m-2)$ degrees of freedom.

Note that the complicated expression (14.39) is no test statistic yet, since it includes the parameters $\sigma_1^2$ and $\sigma_2^2$, which were assumed to be unknown. However, if $\sigma_1^2 = \sigma_2^2$, the expression can be simplified further since the population variances drop out, giving

$$\frac{(\bar x - \bar y) - (\mu_1 - \mu_2)}{s\sqrt{1/n + 1/m}}, \qquad (14.40)$$

where $s^2$ is the pooled estimate of the population variance $\sigma^2$ ($= \sigma_1^2 = \sigma_2^2$),

$$s^2 = \frac{1}{n+m-2}\left[(n-1)s_1^2 + (m-1)s_2^2\right].$$

Under $H_0$: $\mu_1 = \mu_2$ this becomes

$$\frac{\bar x - \bar y}{s\sqrt{1/n + 1/m}},$$

which is t(n+m-2). Thus the problem is analogous to the one discussed previously in Sect.14.3.1, when we tested the mean of a normal distribution with an unknown variance.

It is seen from this rather lengthy derivation that unless the two unknown population variances $\sigma_1^2$ and $\sigma_2^2$ are equal, we shall in general not be able to obtain a variable for which we know the explicit distribution. The general case $\sigma_1^2 \neq \sigma_2^2$ can in fact not be treated exactly, and approximate methods must be applied.

The approach described above can be used for the practical purpose of checking consistency between two sets of observations. If the first set $x_1, x_2, \ldots, x_n$ implies an experimental result $(\bar x \pm \Delta x)$ and the second set $y_1, \ldots, y_m$ similarly $(\bar y \pm \Delta y)$, a test for the hypothesis that the two experiments have measured the same physical quantity (i.e. $\mu_1 = \mu_2$) can be based on the test statistic

$$\frac{\bar x - \bar y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}},$$

which is assumed to be N(0,1). In fact, this is the prescription used by most physicists to test if two experimental results are compatible. Thus the assumption of normally distributed observations is implicit whenever the above prescription is used, although this is seldom explicitly stated in practice.

Exercise 14.3: Experimental
~ ~ Experimental
i s e
times are, respectively,
values (as of April 1976) for the E'
-,u
. .
and 2- life-
$H_0$: $\sigma_1^2 = \sigma_2^2$. (14.46)

To test the hypothesis $H_0$: $\mu_1 = \mu_2$ one will therefore in this situation use the test statistic

against the alternative

Conversely, if the alternative were

$\sigma_1^2$ and $\sigma_2^2$ known: $\sqrt{\sigma_1^2/n + \sigma_2^2/m}$

dom.
I
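The compatibility prescription of Sect.14.3.2 and the pooled-variance statistic for equal but unknown variances lend themselves to a short numerical sketch. The Python below is illustrative only: the function names and all input numbers are invented for the example, and the two-sided probability is computed under the assumption of normally distributed results that the text stresses.

```python
import math
from statistics import NormalDist

def compatibility(x, dx, y, dy):
    """(x - y)/sqrt(dx^2 + dy^2), assumed N(0,1) if both results measure the
    same quantity; returns the statistic and its two-sided probability."""
    z = (x - y) / math.hypot(dx, dy)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

def pooled_t(mean_x, s2_x, n, mean_y, s2_y, m):
    """Student's t statistic with n+m-2 degrees of freedom for equal but
    unknown variances, using the pooled variance estimate s^2."""
    s2 = ((n - 1) * s2_x + (m - 1) * s2_y) / (n + m - 2)
    t = (mean_x - mean_y) / math.sqrt(s2 * (1.0 / n + 1.0 / m))
    return t, n + m - 2

# Hypothetical results 10.0 +- 0.3 and 11.0 +- 0.4 (not taken from the text).
z, p = compatibility(10.0, 0.3, 11.0, 0.4)
print(f"z = {z:.2f}, two-sided probability = {p:.3f}")

# Hypothetical samples: means 5.0 and 4.0, variances 2.0 and 3.0, sizes 10 and 12.
t, ndf = pooled_t(5.0, 2.0, 10, 4.0, 3.0, 12)
print(f"t = {t:.2f} with {ndf} degrees of freedom")
```

The t value would still have to be compared with a tabulated Student's t critical value (Appendix Table A7) for the stated number of degrees of freedom; the sketch does not attempt to reproduce that table.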
14.3.5 Example: Comparison of results from two different measuring machines

As a first example on tests involving parameters of normal distributions, suppose that we want to compare measurements on beam tracks in a bubble chamber from two different measuring devices with unspecified precisions.

For definiteness we assume that a monoenergetic beam of momentum $P_0 = 24.90$ GeV/c is incident in the bubble chamber, and that the multiple Coulomb scattering on the tracks can be considered negligible. Then the error on the measured track momentum will be due solely to the inaccuracy of the measuring device. From the geometrical reconstruction method applied to the measured tracks one expects that it is the inverse of the momentum - rather than the momentum itself - which is an approximate normal variable. Suppose that 20 tracks have been measured with the two machines A and B, giving the mean values and standard deviations for the inverse momentum 1/P as displayed in the following table.

Machine   Mean value of 1/P               Standard deviation of 1/P
          in units 10^-3 (GeV/c)^-1       in units 10^-3 (GeV/c)^-1
A         40.12                           0.46
B         40.32                           0.25

Let us first check the supposition that each machine really measures the incident momentum, which corresponds to $1/P_0 = 40.16 \cdot 10^{-3}$ (GeV/c)$^{-1}$. For each machine this implies a test on the mean value of a normal distribution, with the condition that the variance is unknown. With the formulation of the present chapter we write

$$H_0:\ \mu = 1/P_0, \qquad H_1:\ \mu \neq 1/P_0.$$

According to Sect.14.3.1 the appropriate test statistic is

$$t = \frac{\bar x - 1/P_0}{s/\sqrt{n}},$$

which has a Student's t-distribution with 19 degrees of freedom. If we choose a significance of $\alpha = 0.05$ we find from Appendix Table A7 that the critical region for the two-sided test with equal probabilities $\tfrac{1}{2}\alpha$ in the two tails is given by

$$|t| > t_{.025} = 2.09.$$

From the observed numbers in the table above we find that the observations with A and B correspond to

$$t^{A}_{obs} = -0.40, \qquad t^{B}_{obs} = +1.60.$$

Hence both series of observations produce values of $t$ in the acceptance region, and there is no reason to suspect the measurements for either machine.

Secondly, we want to test the hypothesis that the two measuring machines have the same precision. This amounts to testing whether the variances $\sigma_A^2$ and $\sigma_B^2$ are equal, since

The alternative hypothesis may state that A is less precise than B,

In this case we know the mean values of the two (normal) distributions, $\mu_A = \mu_B = 1/P_0$, and, in accordance with Sect.14.3.3, we form the test statistic

which has an F-distribution with (20,20) degrees of freedom. Fixing the significance level at 5% the critical region corresponding to a one-sided test is given by, according to Appendix Table A9,

$$F > F_{.95} = 2.16.$$

From the observed numbers we have the actual number

which falls within the critical region. Hence we must reject the hypothesis of equal precisions at the chosen significance level. Indeed, we find that there is a probability of less than 0.5% that a larger ratio F would be observed, if $H_0$ were true. Hence we are likely to accept the alternative hypothesis in this particular case, and conclude that machine A is less precise than machine B.

and extending over a few bins. We ask the following questions:

(1) What is the probability that the observed effect at a particular mass value ($M = M_0$) is a statistical fluctuation of the background?
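For orientation on questions of this kind, the one-sided Gaussian tail probability of a fluctuation of at least d standard deviations, and the number of such fluctuations expected when many bins are inspected, can be tabulated in a few lines. This is a sketch only: it assumes normally distributed bin contents, and the figure of $4 \cdot 10^5$ bins inspected per year is an illustrative assumption, chosen to be of the size implied by the numbers quoted in this section (about 13 occurrences at $3.2 \cdot 10^{-5}$ per bin). The function names are likewise hypothetical.

```python
from statistics import NormalDist

def tail_probability(d):
    """One-sided probability of an upward fluctuation of at least d
    standard deviations, assuming a normal distribution."""
    return 1.0 - NormalDist().cdf(d)

def expected_fluctuations(d, n_bins):
    """Expected number of bins fluctuating upwards by at least d standard deviations."""
    return n_bins * tail_probability(d)

N_BINS_PER_YEAR = 4.0e5   # assumed for illustration, not a number given in the text

for d in (3, 4, 5):
    p = tail_probability(d)
    n_exp = expected_fluctuations(d, N_BINS_PER_YEAR)
    print(f"{d} standard deviations: p = {p:.2e}, expected per year = {n_exp:.2f}")
```

Under these assumptions a four standard deviation effect is expected about a dozen times per year, whereas a five standard deviation effect is expected far less than once, which is the point made by the five standard deviation convention.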
nificance of an enhancement to take into account the uncertainty in the background estimate.

It is seen that the term V(B) in the denominator of eq.(14.54) tends to lower the number of standard deviations, that is, it reduces the significance of the peak. Since, for given total number of events N and estimated background B, a broad accumulation covering many bins will usually have a larger V(B) than has a narrow peak, it will in general be more difficult to detect a broad resonance than a narrow one.

We proceed next to answer the question (2) raised earlier. We assume that we have already determined the probability $P(d; M = M_0)$ that the d standard deviation effect at the particular mass value $M = M_0$ is a statistical fluctuation

average number of mass combinations per event = 15,
number of combinations in each histogram = 3000,
number of bins per histogram = 40.

This gives an estimated number of bins per year equal to

Since the probability of a fluctuation of minimum $4\sigma$ is $3.2 \cdot 10^{-5}$ in any of these bins we expect a total of $\approx 13$ occurrences per year of "effects" of at least 4 standard deviations in magnitude. This example illustrates why it has become customary to require five or more standard deviations to claim any enhancement in an effective-mass distribution as evidence for a new resonance.

Exercise 14.4: In an experiment to search for and study the rate of elastic antineutrino-electron scattering, viz.

(1) $\bar\nu_e + e^- \to \bar\nu_e + e^-$

Reines, Gurr and Sobel have utilized low-energy antineutrinos from a nuclear reactor. Evidence for the reaction would mainly come from a difference in the counting rates observed when the reactor was ON and when it was OFF. For different electron energy intervals one registered the counts over periods of many days. The observed counting rates with the reactor ON and OFF are given in the table below (columns II and III), together with the estimated errors due to instrumental instabilities in the runs (column IV):

1.5 - 2.0    30.6 ± .69    26.9 ± .67    ± .60    3.7 ± 1.28    2.89
2.0 - 2.5    10.5 ±        ± .38

(i) Assume the number of counts per day to be Poisson variables and fill into columns II and III the errors (standard deviations) on the average numbers of events per day. (The errors for the first energy bin are given for check.)
(ii) Find the difference in the observed counting rates with the reactor ON and with the reactor OFF and determine the error on these differences (column V). (Hint: Assume the instrumental instability to be independent of the number of counts.)
(iii) In accordance with common practice among physicists, express the size of the observed reactor associated rates in units of standard deviations (column VI).
(iv) At a chosen significance level of 0.1%, do the reactor associated rates in the separate energy bins support the hypothesis of a real effect in any bin? (Hint: Assume the reactor associated rate to be a normal variable.)
(v) If the data are lumped into one group covering all energy bins, what is the combined evidence for a real signal of reaction (1)?

14.3.7 Comparison of means in N normal distributions; scale factor

We saw in Sect.14.3.2 how the mean values of two normal distributions can be tested for equality by construction of an appropriate test statistic, which was exactly or approximately N(0,1), or a Student's t-variable, depending on whether the variances of the two normal distributions were known or not. These methods can therefore be used to test whether two experimental results are mutually consistent provided, of course, that the underlying assumption of normally distributed observations is reasonably satisfied.

It is frequently desired to check the internal consistency between more than two experimental results. Suppose, for example, that there are N experiments, each reporting an observed value $x_i$ with error $\Delta x_i$. We want to find out

The alternative hypothesis is any other possibility, where some of the means are not equal to the others, corresponding to bias in some of the experiments.

If the null hypothesis had specified the value $\mu$ of the common population mean, and also the standard deviations $\sigma_i$ of the individual distributions, then the quantity $\sum_{i=1}^{N} (x_i - \mu)^2/\sigma_i^2$ would by definition be a chi-square variable with N degrees of freedom and an appropriate test statistic. However, since $H_0$ does not specify the common population mean it must be estimated from the data. An estimate for $\mu$ is the weighted mean value for all N measurements,

$$\bar x = \sum_{i=1}^{N} w_i x_i \Big/ \sum_{i=1}^{N} w_i,$$

where the weight of the i-th observation is taken equal to the inverse of the square of its error, $w_i = 1/\Delta x_i^2$. An estimate for the error in $\bar x$ is

$$\Delta \bar x = \Big(\sum_{i=1}^{N} w_i\Big)^{-1/2}. \qquad (14.59)$$

(For $H_0$ true and normally distributed observations these are the Maximum-Likelihood estimates of the population parameters, given by eqs.(9.11),(9.12), when the $\sigma_i$ are approximated by the observed errors $\Delta x_i$.)

If $H_0$ is true we expect that the weighted sum of the squared deviations from the weighted mean value

*) In practice, each reported value $x_i$ will often be an average value from a series of observations and the $\Delta x_i$ the corresponding error in this average;

physics become customary to retain the averaging of the weighted observations but to introduce a scale factor to increase the errors. Since one usually does not know which, if any, of the measurements or experiments may be wrong one makes the rather arbitrary assumption that all experiments have underestimated errors, and in the same proportion, so that all errors should be adjusted by some common factor S. The Particle Data Group accordingly define S
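The weighted average of Sect.14.3.7, its error, the consistency chi-square and a scale factor can be sketched in a few lines of Python. The sketch makes two assumptions that go beyond the lines above: the weighted sum of squared deviations is treated as a chi-square variable with N-1 degrees of freedom when $H_0$ is true, and the scale factor is taken in the form $S = \sqrt{\chi^2/(N-1)}$ commonly quoted for the Particle Data Group procedure. The function names and the input numbers are invented for illustration.

```python
import math

def chi2_tail(x, ndf):
    """P(chi-square with ndf degrees of freedom > x), for integer ndf >= 1.

    Starts from the closed form for ndf = 1 or 2 and adds the terms that
    raise the number of degrees of freedom by two at a time.
    """
    if x <= 0.0:
        return 1.0
    h = 0.5 * x
    if ndf % 2 == 0:
        q, k = math.exp(-h), 2
    else:
        q, k = math.erfc(math.sqrt(h)), 1
    while k < ndf:
        q += math.exp(-h) * h ** (k / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def weighted_average(values, errors):
    """Weighted mean with w_i = 1/dx_i^2, its error, chi-square, degrees of
    freedom, tail probability and scale factor S = sqrt(chi2/(N-1))."""
    weights = [1.0 / e**2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / wsum
    err = 1.0 / math.sqrt(wsum)
    chi2 = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    ndf = len(values) - 1
    scale = math.sqrt(chi2 / ndf)
    return mean, err, chi2, ndf, chi2_tail(chi2, ndf), scale

# Hypothetical results of four experiments (not data from the text).
x = [10.1, 9.6, 10.9, 10.3]
dx = [0.2, 0.3, 0.4, 0.2]
mean, err, chi2, ndf, prob, scale = weighted_average(x, dx)
print(f"weighted mean = {mean:.3f} +- {err:.3f}")
print(f"chi2 = {chi2:.2f} for {ndf} degrees of freedom, P = {prob:.3f}, S = {scale:.2f}")
```

A small tail probability P would indicate inconsistent results at the chosen significance level; in that case the quoted error on the mean would be enlarged by the factor S.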
Considering the real and the imaginary parts of the parameter separately, are the results from the different experiments consistent at the 5% level? What are the scale factors for the two series of observations using all measurements? What would the scale factors be if the last experiment was disregarded?

Exercise 14.7: The mean lifetime $\tau$ of the $\pi^0$ meson has been measured by 11 experiments, six of which used nuclear emulsions as detectors and five used counter technique. The following data have been reported:

Nuclear emulsion technique        Counter technique
$\tau$ (in units of $10^{-16}$ sec)        $\tau$ (in units of $10^{-16}$ sec)

on the basis of the sample values is then a typical goodness-of-fit problem.

In testing goodness-of-fit we shall, as before, need a test statistic whose distribution, assuming $H_0$ true, defines a critical region and an acceptance region with probabilities $\alpha$ and $1-\alpha$, respectively. The situation can be different from that of the previous sections in that we may now not formulate an alternative hypothesis $H_1$, since $H_1$ can be the ensemble of all conceivable hypotheses different from $H_0$. Thus $H_1$ is often left unspecified, and the power of the test not taken into account.

The goodness-of-fit test most commonly used is Pearson's $\chi^2$ test, which will be discussed extensively in the following sections. This test is strictly speaking exact for large samples only, and otherwise approximate. A second test for goodness-of-fit, the likelihood-ratio test, is valid for all
probabilities $p_i$ for the individual classes as determined by the underlying distribution. The simple hypothesis we wish to test specifies the class probabilities

and thereby fix the number of classes, was also considered in connection with the

The asymptotic chi-square behaviour of the $X^2$ statistic for the Pearson $\chi^2$ test of goodness-of-fit is, strictly speaking, only proved to be correct if the class division is made without any reference to the observations. This is so often made after the data have been obtained and the general pattern of the observations has emerged and can be taken into account. This practice is justified by the fact that, for infinite $n$, the distribution of $X^2$ will be $\chi^2(N-1)$ for any partition with N classes, provided $H_0$ is true.

Pearson's $\chi^2$ test relies on the approximation of a multinomial to a multinormal distribution, since it assumes an approximate standard normal behaviour

Specifically,

in itself unwanted. In applying this test to compare model and data the physi-

This corresponds to making the model a simple hypothesis, which is subsequently put to test for goodness-of-fit.

For a Least-Squares estimation we know from Sects.10.4.3 and 10.4.4 that the comparison between data and fitted model is made using the chi-square distribution with a number of degrees of freedom equal to the number of independent observations minus the number of independent parameters estimated. This procedure is exact only in the limit of infinitely many observations and with a linear parameter dependence; otherwise it is an approximation. Thus, if there are L parameters in $H_0$ which are estimated by the LS method and N classes subject to an overall normalization condition, Pearson's $\chi^2$ test for goodness-of-fit con-

same conditions with $n$ observations, the actual values obtained for $\chi^2_{obs}$ will therefore be distributed nearly like $\chi^2(N-1)$; in particular, the average value

determination from the hypothetical distribution itself, and essentially two different

(Sect.10.5.2): Either the range is divided into classes of equal width, or it is divided to correspond to classes of equal probability. The equal-width method is arithmetically simpler than the equal-probability method which may require a

Assuming equal-probability partition and sufficiently large samples, it is possible to establish a relation for the optimum number of classes which maximizes an approximate power function for Pearson's $\chi^2$ test (see Kendall and Stuart, Chapter 20, Vol.2). The optimum number of classes is found to increase in proportion to $n^{2/5}$ for fixed power and significance or, equivalently, the optimum expected event number in each class increases as $n^{3/5}$.

corresponding to more expected events per class.

14.4.4 General $\chi^2$ tests for goodness-of-fit

In our considerations so far we have assumed that the test statistic $X^2$ has been expressed in terms of N class probabilities, which are not all independent but must add to unity. This is equivalent to requiring equally many predicted and observed events when summed over all classes, and implies a reduction in the number of degrees of freedom of one unit in the comparison for goodness-of-fit. Quite frequently, however, the model which we wish to test gives definite predictions for the absolute numbers of expected events $f_i$ in the separate classes. Rather than eqs.(14.63),(14.64) the hypothesis is
$$\alpha = \int_{\chi^2_\alpha}^{\infty} f(u;\nu)\,du = 1 - F(u=\chi^2_\alpha;\nu), \qquad (14.70)$$

where $f(u;\nu)$ is the chi-square p.d.f., $F(u;\nu)$ the cumulative integral of the same p.d.f., and $\nu$ the appropriate number of degrees of freedom. If $\chi^2_{\mathrm{obs}}$ as calculated ... a specified hypothesis.

$$P_{\chi^2} = \int_{\chi^2_{\mathrm{obs}}}^{\infty} f(u;\nu)\,du = 1 - F(u=\chi^2_{\mathrm{obs}};\nu). \qquad (14.71)$$

This probability is most conveniently obtained from curves of the cumulative chi-square distribution, such as Fig. 5.2, but can also be found by interpolation in the standard tables with fixed percentage points.

It may be appropriate to stress that although a very bad fit (with a high $\chi^2_{\mathrm{obs}}$ value and low $P_{\chi^2}$) can be a sufficient reason for rejecting a hypothesis, a good fit is in itself inconclusive as long as other hypotheses have not been tried. In fact, instead of using the phrase "we accept the hypothesis" it will perhaps be more appropriate to express the matter as "we fail to reject the hypothesis".

... Depending on whether the positive decay particle is a proton or a pion there are two hypotheses for each $V^0$:

$H_0$: $V^0$ is $\Lambda \to p\pi^-$,
$H_1$: $V^0$ is $K^0 \to \pi^+\pi^-$.

If the correct hypothesis has been used for the fit the variable $\chi^2$ of eq.(14.69) with the fitted quantities obtained in the final iteration of the minimization procedure, will be approximately $\chi^2(3)$. Fixing, for example, the significance level at 1% the critical value, as read off from Appendix Table A8, is $\chi^2_{.01} = 11.345$. Hence, at the 1% level, we shall reject a hypothesis if the observed value $\chi^2_{\mathrm{obs}}$ exceeds $\chi^2_{.01}$, and otherwise accept it.

Specifically, let us consider an event for which relevant numbers are given in the following table. The angles for the $V^0$ have been obtained from the production and decay points with measured (x,y,z) coordinates (-44.4±.14, -1.8±.17, -16.2±.26) and (-28.8±.15, -5.3±.16, -16.0±.26) in cm, respectively. All measured quantities have uncorrelated errors.
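The 1% critical value quoted above for three degrees of freedom can be checked without tables. The following is a minimal sketch, assuming the closed-form upper tail of the chi-square distribution for $\nu = 3$; the function names are illustrative and not taken from the text, and the bisection bounds are arbitrary choices.

```python
import math

def chi2_tail_3dof(x):
    """Upper-tail probability P(chi2 > x) for 3 degrees of freedom (closed form)."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

def critical_value_3dof(alpha, lo=0.0, hi=100.0, tol=1e-9):
    """Bisection for the value whose upper-tail probability equals alpha.

    The tail probability decreases monotonically in x, so the root is bracketed
    as long as alpha lies between tail(hi) and tail(lo) = 1.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_tail_3dof(mid) > alpha:
            lo = mid      # tail still too large: move right
        else:
            hi = mid      # tail small enough: move left
    return 0.5 * (lo + hi)

crit_1pct = critical_value_3dof(0.01)
print(round(crit_1pct, 3))  # should agree with the tabulated value to the quoted precision
```

An observed $\chi^2$ can be passed directly to `chi2_tail_3dof` to obtain the corresponding chi-square probability for a three-constraint fit.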
                               Momentum     Dip angle        Azimuth angle
                                            (radians)        (radians)
Measured quantities            1535±72      0.022±0.006      6.107±0.001
                               1479±60      0.019±0.006      6.111±0.006
                        π      378±18       -0.097±0.016     5.768±0.014
Fitted quantities,      p      1564±72      0.022±0.006      6.106±0.007
hypothesis H0           π      354±11       -0.091±0.016     5.781±0.012
Fitted quantities,      π      1831±51      0.024±0.006      6.124±0.006
hypothesis H1           π      381±18       -0.118±0.016     5.719±0.011

In view of the preassigned 1% significance level and corresponding critical value of the test statistic, we shall from the observed values $\chi^2_{\mathrm{obs}}(H_0) = 3.6$ and $\chi^2_{\mathrm{obs}}(H_1) = 26.7$ accept $H_0$ and reject $H_1$. We thus conclude that this particular $V^0$ is a $\Lambda$.

With the present example the numbers assure, with overwhelming plausibility, that the correct identity has been established for the $V^0$. In other cases the situation may be not so simple. For example, if both hypotheses give $\chi^2_{\mathrm{obs}} < \chi^2_\alpha$ and thus are acceptable at the chosen significance $\alpha$, the $V^0$ is kinematically ...

14.4.6 The Kolmogorov-Smirnov test

Pearson's $\chi^2$ test is undoubtedly the most popular non-parametric test used by physicists. However, other goodness-of-fit tests exist which avoid the binning of individual observations and may be more sensitive to the data. The most important of these tests is probably the Kolmogorov-Smirnov test, which in particular for small samples is superior to the $\chi^2$ test, and has many nice properties when applied to problems in which no parameters are estimated.

... The cumulative distribution for this sample of size n is $S_n(x)$, defined by

$$S_n(x) = \begin{cases} 0, & x < x_1 \\ i/n, & x_i \le x < x_{i+1}, \quad i = 1, \dots, n-1 \\ 1, & x \ge x_n \end{cases} \qquad (14.72)$$

Thus $S_n(x)$ is an increasing step function with a step of height $1/n$ at each of the points $x_1, x_2, \dots, x_n$.

The Kolmogorov-Smirnov test involves a comparison between the observed cumulative distribution function $S_n(x)$ for the sample and the cumulative distribution function $F_0(x)$ which would occur under some theoretical model. We state the null hypothesis as

$$H_0: S_n(x) = F_0(x). \qquad (14.73)$$

For $H_0$ true one expects that the difference between $S_n(x)$ and $F_0(x)$ at any point should be reasonably small. The Kolmogorov-Smirnov test looks at the difference $S_n(x) - F_0(x)$ at all observed points and takes as a test statistic*) the maximum of the absolute value of this quantity, thus

$$D_n = \max |S_n(x) - F_0(x)|. \qquad (14.74)$$

*) ... $D_n^+ = \max(S_n(x) - F_0(x))$ or $D_n^- = \max(F_0(x) - S_n(x))$

... distribution which for large n is given by

$$\lim_{n\to\infty} P\left(D_n < \frac{z}{\sqrt{n}}\right) = 1 - 2\sum_{r=1}^{\infty} (-1)^{r-1} e^{-2r^2z^2}. \qquad (14.75)$$

This relation is approximately valid already at n = 80. For finite n the $D_n$ distributions can be found from recurrence relations.

Appendix Table A10 gives the exact critical values $d_\alpha$ of the test statistic $D_n$ for $n \le 100$ as well as values for the limiting case of n
large (last row), for different values of the significance $\alpha$. It turns out that the approximate values obtained with the limiting entries are always larger than the exact ones. For instance, taking $\alpha = 0.05$ in a case with n = 80 the exact critical value of $D_n$ is $d_{.05} = 0.1496$, while the approximate value becomes $1.3581/\sqrt{80} = 0.1518$. If the null hypothesis $H_0$ is tested at a significance level of 5% it should therefore be rejected if the largest observed deviation between $S_{80}(x)$ and $F_0(x)$ exceeds 0.15.

It will be seen from Appendix Table A10 that, if the sample size is small, rather large differences must be found between the cumulative distributions in order to detect significant deviations between data and hypothesis; hence unless this difference is considerable one shall not be able to falsify $H_0$. Indeed, the numbers of this table can be taken as an illustration of the general difficulty in constructing effective tests for small data samples.

It is worth noting that since $D_n$ for $H_0$ true has a distribution which is universal and independent of the theoretical $F_0(x)$, and furthermore is known for all n, one may use $D_n$ to construct confidence bands for any continuous distribution function $F(x)$. Whatever the true $F(x)$ is we may write a probability statement about $D_n$ as

$$P\left(D_n = \max |S_n(x) - F(x)| > d_\alpha\right) = \alpha, \qquad (14.76)$$

where as before $d_\alpha$ is the critical value of $D_n$ corresponding to the significance $\alpha$. The statement can be inverted to give a confidence statement about $F(x)$,

$$P\left(S_n(x) - d_\alpha < F(x) < S_n(x) + d_\alpha \ \text{for all } x\right) = 1 - \alpha. \qquad (14.77)$$

This means that, at any point x, the cumulative distribution function $F(x)$ will have a probability $(1-\alpha)$ of being larger than $(S_n(x) - d_\alpha)$ but smaller than $(S_n(x) + d_\alpha)$. Therefore, if one constructs a band of width $\pm d_\alpha$ around the empirical cumulative distribution $S_n(x)$ the probability is $(1-\alpha)$ that the true $F(x)$ will lie entirely within this band. This provides an extremely simple and direct method for estimating a cumulative distribution function at a given confidence level. Obviously this inversion of the goodness-of-fit test into a confidence statement about $F(x)$ rests upon the simple way $D_n$ was defined to give a ... the deviation between $S_n(x)$ and $F(x)$ ...

The technique described here can be used, for instance, to plan how large a sample one should have in order to provide $F(x)$ to a required accuracy. For example, let us demand an accuracy of better than 0.20 anywhere on $F(x)$ at a confidence level of 90%. Then, from Appendix Table A10 we see from entries for $\alpha = 0.10$ that $n \ge 35$ will be necessary to have $d_\alpha < 0.20$. With a required accuracy better than 0.05 at the same confidence level we find from the asymptotic entry that the condition on n is $1.22/\sqrt{n} \le 0.05$, which implies $n \ge 600$.

It should be stressed that the considerations above apply only to situations where no unknown parameters are involved. If some of the parameters entering have been estimated using the data the statistic $D_n$ is no longer independent of $F_0(x)$, and the critical values $d_\alpha$ can not be obtained using the tables. However, in some fortunate situations the Kolmogorov-Smirnov test can still be used even in the presence of unknown nuisance parameters, provided appropriate tables over percentage points are available. For example, for the important class of problems where the estimation involves the mean value of an exponential distribution such tables can be found in a recent article by J. Durbin.

14.4.7 Example: Goodness-of-fit in a small sample

To illustrate the use of the Kolmogorov-Smirnov test of goodness-of-fit we consider a typical low-statistics experiment. Suppose that for 30 events one has measured the proper flight-time of neutral kaons decaying into the semileptonic final state $\pi^+e^-\nu$. With the kaons produced in an initially pure strangeness +1 state one can predict the p.d.f. $f_0(t)$ for the flight-time t under the assumption that only the $\bar{K}^0$ component with strangeness -1 contributes to the $\pi^+e^-\nu$ final state. This assumption defines the null hypothesis $H_0$ which we want to test.

Figure 14.6(a) shows the step function $S_{30}(t)$ obtained for the sample of 30 flight-times and the cumulative distribution function $F_0(t)$. From the largest deviation between the experimental and theoretical curves we determine the actual value of the Kolmogorov test statistic of eq.(14.74) as

$$D_{30} = \max |S_{30}(t) - F_0(t)| = 0.17.$$

From Appendix Table A10 we see that at the commonly chosen significance levels up to 10% we shall not be able to reject $H_0$ on the basis of the 30 observations with
this test. Indeed, for $H_0$ true, we find by extrapolation of the table entries for n = 30 that there is a probability of about 0.25 that a larger maximum deviation than 0.17 would be found between the observed cumulative distribution function and the predicted $F_0(t)$.

For comparison, let us use the same observations to test $H_0$ by the $\chi^2$ method. For the flight-times of Fig. 14.6(a) a grouping of the data in the 4 intervals 0-3, 3-5, 5-7, and 7-18 (in units of the $K^0$ mean lifetime) fulfil the ...

Fig. 14.6. Comparison of predicted and experimental distribution of flight times; (a) cumulative distribution (Kolmogorov-Smirnov test), (b) differential distribution (Pearson's $\chi^2$ test). Horizontal axis: time of flight (in units of $0.89 \times 10^{-10}$ sec).

14.5 TESTS OF INDEPENDENCE

Frequently, when data are available in differential form specifying two or more properties or attributes, it is desired to test whether these are independent of each other. The motivation for carrying out a test of independence can sometimes be a profound theoretical conjecture, for example, a prediction for the shape of a spectrum of a kinematical variable. Before a claim is made on scaling behaviour it is then necessary to establish that the spectrum in question remains unchanged when, say, an incident energy is increased. More often the motivation is less subtle. The experimenter may simply want to find out whether observed events are uniformly distributed along a band in a Dalitz plot, whether transverse and longitudinal momenta are uncorrelated, etc.

An assumption of independence in the variables x, y, ... can be stated as a null hypothesis where the joint probability distribution factorizes into separate probability distributions for the individual variables, i.e.

$$H_0: f(x,y,\dots) = f_1(x)\,f_2(y) \cdots . \qquad (14.78)$$

A test problem of this kind can be approached along different lines of thought, some of which consist in rephrasing the problem to make it analogous to those discussed in Sect.14.6 below.

It turns out that the $\chi^2$ test is also adequate for providing answers to problems of the above type; a test statistic can be constructed in analogy with the Pearson statistic of eq.(14.65). To demonstrate this we shall be satisfied with considering a problem in two dimensions only, which will be sufficient for most practical purposes. The extension to higher dimensions may become somewhat awkward regarding notation but is conceptually simple and should be borne in mind by the experimenter working with high-statistics data samples.

14.5.1 Two-way classification; contingency tables
... the products of the appropriate estimated row and column probabilities,

$$\hat{p}_{ij} = \hat{p}_{i\cdot}\,\hat{p}_{\cdot j} = n_{i\cdot}\,n_{\cdot j}/n^2 .$$

A suggestive test statistic for testing independence in a two-way classification is consequently

$$X^2 = \sum_i \sum_j \frac{(n_{ij} - n\hat{p}_{ij})^2}{n\hat{p}_{ij}} . \qquad (14.87)$$
-5
u 1.2 A1
A~
B1
20
20
B2
23
8
B3
13
22
B4
12
26
B5
13
15
E6
11
14
8,
21
7 5
20
B9
9
7
BI0
10 /
n.1.
106 j
-
0
'a
0
22
18
15
25
24
19
35
25
10
25
19
19
22
18 6
8
7 1 8 2 '
I
E
2 0.8 ZI.. 81 71 75 81 88 60 66 65 30 53 670
EE i
0
By carrying out the summation according to eq.(14.87) with the numbers in this table one finds that the data correspond to $\chi^2_{\mathrm{obs}} = 39.8$. The number of degrees of freedom, from eq.(14.88), is equal to (4-1)(10-1) = 27. Hence the chi-square probability for independence in the two momentum components is deduced to be about 5%.
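The arithmetic of such a test of independence can be sketched in a few lines. The example below is a hedged illustration, not the procedure used to produce the numbers in the text: the helper names are invented here, the small 2x2 table is hypothetical, and the tail probability uses the standard finite-sum closed forms for integer degrees of freedom. Expected counts are taken as the products of row and column totals divided by the grand total, as described above.

```python
import math

def chi2_tail(x, nu):
    """P(chi2 > x) for an integer number of degrees of freedom nu >= 1."""
    if x <= 0:
        return 1.0
    if nu % 2 == 0:
        # even nu: exp(-x/2) * sum_{k=0}^{nu/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, nu // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    # odd nu: erfc(sqrt(x/2)) + sqrt(2/pi) exp(-x/2) * sum_r x^(r-1/2) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (nu - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(chi / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * total)

def independence_chi2(table):
    """Pearson-type statistic and degrees of freedom for a two-way table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n   # n * p_i. * p_.j
            x2 += (n_ij - expected) ** 2 / expected
    dof = (len(row_tot) - 1) * (len(col_tot) - 1)
    return x2, dof

# Hypothetical 2x2 table, for illustration only.
x2_demo, dof_demo = independence_chi2([[10, 20], [20, 10]])

# Chi-square probability for the values quoted in the text (39.8 with 27 degrees of freedom).
p_text = chi2_tail(39.8, 27)
print(x2_demo, dof_demo, round(p_text, 3))
```

The same `chi2_tail` function applies to any of the chi-square comparisons in this chapter once the statistic and the number of degrees of freedom are known.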
Fig. 14.8. Scatter diagram of centre-of-mass momentum components of $\Lambda$ hyperons. Horizontal axis: longitudinal momentum $p_L$ (GeV/c).

14.6 TESTS OF CONSISTENCY AND RANDOMNESS

When a set of observations is used to estimate the parameters entering a p.d.f., it is generally tacitly assumed that the observations represent a sample which has been drawn at random from the population or universe. Thus the notion of a random sample presupposes that the observations acquired are typical and representative for the underlying distribution. If the assumption about randomness is not fulfilled the conclusions concerning the properties of the
population may be wrong or misleading. It is therefore of importance to have available some standard procedures for testing whether sets of observations may be regarded as random and free of systematic effects.

Particle physicists frequently find themselves in situations which call for investigation of randomness and consistency. When observations have been obtained through a series of measurements extending over time or space it may be necessary to check that the experimental conditions have remained the same throughout the experiment. Similarly, the data may have been acquired in two or more runs with complicated experimental set-ups, or collected by different laboratories participating in a collaboration experiment. In such situations, before any inferences are made from the combined data, it is important that consistency checks are performed to ensure that systematic differences do not exist between the separate samples. Likewise, before different experimental estimates of some parameter are combined to obtain a "best average" or "pooled estimate", it must be checked that the individual estimates do not depend on particular assumptions which are different for the different estimates. For example, if the values of mass and width of a resonance have been estimated by different groups it will be unjustified to deduce pooled estimates of the resonance parameters if the values reported by the separate groups have been derived from the raw observations using dissimilar assumptions about the resonance shape.

We saw in Sect.14.3 how tests of consistency can be formulated for ... procedures are relatively insensitive to the specific form of the underlying distribution. These procedures possess a property aptly called robustness. This applies, for example, to the tests on population means referenced above, and may justify their use in cases where no dramatic deviation from normal behaviour is expected. On the other hand, tests on population variances are very sensitive to departures from normality, restricting the usefulness of the procedures of Sect.14.3.3 to situations where the observations are manifestly close to normal.

In the following we shall for the investigation of randomness and consistency formulate various tests which avoid making specific assumptions about the form of the populations. These distribution-free tests are therefore generally valid, regardless of the underlying distribution; the distribution of their test statistic is determined by the number of equivalent permutations of elementary equiprobable events and can, at least in principle, for finite samples be derived from purely combinatorial arguments. In abandoning the common normal theory methods for the more general distribution-free procedures one may have to pay a certain price, that of losing "efficiency", or relative power. However, although it is generally true that distribution-free approaches are less efficient than tailored tests based on normality assumptions, theoretical investigations have shown that for populations which are not normal, the distribution-free tests may even be superior.

To test consistency between two or more experimental samples we shall from now on make no further assumption about the underlying distributions except that they are all equal. For observations of the continuous type these tests of homogeneity imply the null hypothesis

$$H_0: f_1(x) = f_2(x) = \dots \qquad (14.89)$$

... fully the information in the data and will be more powerful in detecting possible inconsistencies. The run test has other useful applications; for example it can be used to give a rough check as to whether a set of observations is free from systematic trends. It can also be used to supplement Pearson's $\chi^2$ test for
goodness-of-fit, of which it is independent under some conditions. This is an interesting feature because, in general, different tests on the same data are not independent, and hence the combining of outcomes of different tests is not trivial.

With more than two samples the hypothesis (14.89) can be tested by the Kruskal-Wallis rank test. When the underlying distributions are of the discrete type, the analogous multi-sample hypothesis can be examined by applying the well-known $\chi^2$ test; this is described in Sect.14.6.12.

14.6.1 Sign test

A simple way of recording data is to note only whether each observation is smaller than, or larger than, some specified value. Although this rough method may imply the loss of a considerable amount of information in the observations, it is possible to construct useful tests for this kind of data. These sign tests are based on the binomial distribution law, which describes experiments with only two possible outcomes for individual events.

Let the variable x have a distribution of median value $\mu$. We want to test the simple null hypothesis ... $H_0$ true implies an expected value of r equal to $\frac{1}{2}n$, and very small values of r as well as very large values (near n) are unlikely. To test $H_0$ at the significance $\alpha$ we may therefore use the number r as a test statistic and take the rejection region at the two tails of the binomial distribution $B(r; n, p=\frac{1}{2})$. That is, we shall reject the assumption of a population median equal to $\mu_0$ if, among n observations, the number of times r when x is smaller than $\mu_0$ is such that

$$r \le r_{\alpha/2} \quad \text{or} \quad r \ge r_{1-\alpha/2}. \qquad (14.92)$$

The critical values $r_{\alpha/2}$ and $r_{1-\alpha/2}$ can be determined from tabulations of the cumulative binomial distribution. Since the statistic r is discrete one can, for a probability $\alpha$ in the lower tail, define the critical value $r_\alpha$ as that integer which satisfies the inequality ... or, with the notation of Appendix Table A2, ...

The sign test can be used to test whether a variable when recorded as a function of time remains "constant" and equal to a fixed value, or tends to change with time. Suppose, for instance, that in a bubble chamber exposure one may want to check that the (average) number of beam particles per pulse remains the same during the whole run, and equal to the optimum number requested by the designers of the experiment. If the beam intensity should vary significantly during the exposure, either by falling below the requested value (with the consequence of ... $r_{.90} = 14$, the two values being symmetrically positioned relatively to the expectation value for r, which is here $\frac{1}{2}\cdot 20 = 10$. Therefore, if during the exposure we made a count on 20 randomly selected pictures and found that the number of tracks was smaller than the optimum number $\mu_0$ in more than 6 but in less than 14 pictures, then we would be satisfied with the state of affairs, and believe in the assumption of a constant beam intensity. If, on the other hand, we found that the number of pictures having less than $\mu_0$ tracks was either $\le 6$, or $\ge 14$, we would reject $H_0$ on the basis of this test. If repeated counts on a new sample of 20 pictures gave similar results, or if a closer examination of the number of tracks on each picture suggested a shift
towards, say, a lower intensity, we would presumably adjust the conditions to bring the intensity up to the optimal before continuing the exposure.

The sign test assumes, for $H_0$ true, each of the n observations $x_1, x_2, \dots, x_n$ to have a constant probability for being smaller than the specified value $\mu_0$. In this respect the order of the individual observations is immaterial, and the sequence is random under $H_0$. Instead of classifying the observations relative to a fixed, outside value $\mu_0$, one can design a simple test of randomness by classifying the individual measurements with reference to one of the sample values, for example, relative to the sample median, see Sect.14.6.4.

... consistency does not hold.

To use the number of runs r as a test statistic for the hypothesis $H_0$ we have to find the probability distribution of r assuming $H_0$ to be true. A total of (n+m) quantities can be arranged in (n+m)! different ways. Since, however, the mutual order within the x's, as well as within the y's, has already been fixed, we can only have $(n+m)!/(n!\,m!) = \binom{n+m}{n}$ different permutations and, provided $H_0$ is true, each of these permutations will have the same probability of occurring. To find the probability for a particular number of runs, say r, one must count all permutations giving rise to just r runs. This is a combinatorial problem which can be solved in a straightforward manner. The results of the com... (14.94)
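For small samples the counting argument can be carried out by brute force. The sketch below is an illustration only, assuming nothing beyond the equiprobability of the $\binom{n+m}{n}$ arrangements; the function name is invented here and the enumeration is practical only for small n and m. It also compares the resulting mean with the standard expression $2nm/(n+m) + 1$ for the expected number of runs, quoted here as an outside fact.

```python
from fractions import Fraction
from itertools import combinations

def run_distribution(n, m):
    """Exact distribution of the number of runs r for n x's and m y's.

    Every choice of n positions for the x's among n+m slots is one of the
    C(n+m, n) equally likely arrangements under the null hypothesis.
    """
    total = n + m
    counts = {}
    for x_positions in combinations(range(total), n):
        xs = set(x_positions)
        labels = [i in xs for i in range(total)]
        # a new run starts wherever two neighbouring labels differ
        runs = 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)
        counts[runs] = counts.get(runs, 0) + 1
    n_arrangements = sum(counts.values())
    return {r: Fraction(c, n_arrangements) for r, c in sorted(counts.items())}

dist = run_distribution(3, 2)
mean_runs = sum(r * p for r, p in dist.items())
for r, p in dist.items():
    print(r, p)
print("mean:", mean_runs, "formula:", Fraction(2 * 3 * 2, 3 + 2) + 1)
```

Summing the lower tail of `dist` up to an observed r gives an exact probability that can be compared with a chosen significance level when the samples are too small for a normal approximation.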
For large samples, which in practice often is taken to mean m and n larger than 10, the probability distribution of eq.(14.94) is very close to normal. The appropriate run test statistic in this situation is

$$z = \frac{r - E(r)}{\sqrt{V(r)}}, \qquad (14.97)$$

which is approximately N(0,1).

The run test in the form described above is one of the least powerful distribution-free tests. It is in fact only meaningful when applied to samples of comparable size. If one sample is very much larger than the other then, in the combined ordered series, the observations from the smallest sample will almost certainly be separated from each other by observations from the larger sample; hence the number of runs will tend to become maximum, regardless of whether the assumption of identical parent populations is true or not.

Other tests based on runs have been devised, some of which exploit the ... in the data; one, for example, takes as a test statistic the length of the longest run. For a description of this and other run tests the reader should consult more specialized literature.

14.6.3 Example: Consistency between two effective-mass samples

To study the production of ... in antineutrino induced reactions in a ...

With these numbers the combined ordered series is

...

which means that the observed number of runs is equal to 24. From eq.(14.95) the expected value and variance for the variable r with sample sizes n = 28 and m = 32 are, respectively,

$$E(r) = \frac{2\cdot 28\cdot 32}{28+32} + 1 = 30.87,$$

$$V(r) = \frac{2\cdot 28\cdot 32\,(2\cdot 28\cdot 32 - 28 - 32)}{(28+32)^2\,(28+32-1)} = 14.62 .$$

Using the large sample approximation eq.(14.97) for the test statistic we find that the actual value from the observations is

$$z_{\mathrm{obs}} = \frac{24 - 30.87}{\sqrt{14.62}} = -1.80 .$$

Since the probability for a standard normal variable to be smaller than -1.80 is $G(-1.80) = 1 - G(1.80) = 0.036$, the hypothesis of consistency between the two samples will therefore have to be rejected from the run test at the 5% level.

By inspection of the original sample values it is seen that the striking difference between the two series of measurements is the large number of small values observed by laboratory Y which are missing for laboratory X. As physicists we may try to explain the discrepancy between the data sets as being due to an experimental bias: The events with small values could be "wrong" ...

... mean and variance for r, the number of runs, become, respectively,

$$E(r) = n + 1, \qquad V(r) = \frac{n(n-1)}{2n-1}. \qquad (14.98)$$

For large samples the run statistic is then approximately $N(n, \frac{1}{2}n)$.

14.6.5 Example: Time variation of beam momentum

As an application of the run test on one series of observations, let
e v e n t s , i n which, f o r example, one y from no decay was e r r o n e o u s l y combined w i r h c o n s i d e r measurements on t h e beam momentum i n a b u b b l e chamber e x p e r i m e n t .
a bremsstrahlung air from t h e same y . Therefore, given t h e d a t a s e t s above, i t suppose t h a t measurements on 30 r o l l s of f i l m o r d e r e d a c c o r d i n g t o t h e rime of
i s s u g g e s t i u e to a s k l a b o r a t o r y Y t o look more c l o s e l y a t t h e i r l o r m a s e v e n t s exposure gave t h e f o l l o w i n g v a l u e s f o r t h e a v e r a g e momentum of t h e i n c i d e n t
OD t h e scan t a b l e . t r a c k s (numbers i n GeVlc):
I f one d i s r e g a r d s t h e seven e v e n t s w i t h 5,. i 40 HeV from l a b o r a t o r y Y , 18.90 18.88 18.94 18.91 18.96 19.05 19.06 19.08 19.03 19.10
one f i n d s t h a t t h e d a t a f o r t h e samples o f s i z e 28 and 25 c o r r e s p o n d t o a rul, 19.07 19.12 19.13 19.10 19.15 19.20 19.17 19.14 9 4 19.10
p r o b a b i l i t y of a b o u t 0.17. Hence w i r h t h e r e d w e d
nwober o f e v e n t s the run r e s t 19.11 19.08 19.08 19.07 19.03 18.98 19.00 18.97 18.94 18.95
f i n d s no i n c o m p a t i b i l i t y between t h e two samples even ar t h e 10% l e v e l .
Do t h e s e numbers e v p p o r t t h e h y p o t h e s i s H of a c o n s t a n t beam momentum d u r i n g
I1 -
14.6.4 Run t e s t f o--
r c h e c k i n g j a n d o m n e s 8 w i r h i n one *Ie
- t h e exposure?
Already from an i n s p e c t i o n of t h e cantro2 chart f o r t h e m e a s u r e a e o t s
3 The run t e s t p r o c e d u r e f o r t e s t i n g w h e t h e r two samples have t h e same
j p a r e n t p o p u l a t i o n can r e a d i l y be adopted t o t e s t t h e assumption t h a t a s e r i e s o f 1 above, s h o w i n P i g . 14.9, one would b e i n c l i n e d t o rejeer t h e assumption of
o b s e r v a t i o n s o b t a i n e d s e q u e n t i a l l y , f o r i n s t a n c e by measurements p e r f o m d a t
d i f f e r e n t t i m e s , can he c o n s i d e r e d f r e e from s y a t e m a r i e t r e n d s .
I
,
constancy in t h e beam mowenturn. I n d e e d , t h e e v i d e n c e from t h i s c h a r t is t h a t
t h e momentum h a s f i r s t i n c r e a s e d , t h e n d e c r e a s e d d u r i n g t h e exposure.
The h y p o t h e s i s 1
i of randomness i s t h a t a l l o b s e r v a t i o n s masure t h e same q u a n t i t y , which i n p l i e s 1'
t h a t t h e o r d e r of t h e o b s e r v a t i o n s i s i m m a t e r i a l . !
L e t the e l e m e n t s i n a t i r e - o r d e r e d s e r i e s o f o b s e r v a t i o n s be c l a s s i f i e d
I ;
r e l a t i v e l y t o some v a l u e of t h e s a m p l e , s u c h t h a t a n o b s e r v a t i o n a b o w t h i s v a l u e '
!
is l a b e l l e d by A and an o b s e r v a t i o n below i t by B .
Observations c o i n c i d i n g with
t h e c h o s e n v a l u e can be i g n o r e d . The h y p o t h e s i s H i m p l i e s t h a t a t every p o s i t i o n
i n t h e sequence the roba ability t o have an A i s the same, i.e. t h e ~ r o h a h i l i t ~
f o r an A r e w i n s c o n s t a n t a l o n g the sequence, and l i k e w i s e f o r B .
Ihe r e s u l t i n g
Film mll nwnkr
s e r i e s o f A ' s and 8's is t h e n a p a t t e r n of s y n b o l s w i t h p r o p e r t i e s analogous t o F i g . 14.9. C o n t r o l c h a r t f o r beam measurements
, ,
I
t h e s e r i e s of x's and y ' s of Sect.16.6.2.
We may t h e r e f o r e use t h e formulae f o r
t h e run s t a t i s t i c g i v e n e a r l i e r . We s e a k a n u m e r i c a l measure f o r our d i s b e l i e f i n Ho. The median v a l u e
These become p a r t i c u l a r l y s i w l e i f we choose
t o c l a s s i f y t h e o b s e r v a t i o n s r e l a t i v e l y t o t h e sample median, s i n c e by d e f i n i t i o n
f o r t h e o b s e r v a t i o n s is 19.07 GeVIe, and t h e c l a s s i f i c a t i o n r e l a t i v e t o t h i s
t h e rider o f A ' s a n d B S s w i l l t h e n b e e q u a l .
By p u t t i n g n = m i n eq.(14.95) the v a l u e g i v e s t h e f o l l o w i n g s e r i e s a f t e r t h e two o b s e r v a t i o n s c o i n c i d i n g w i t h t h e
I
median a r e i g n o r e d :
B B B B B B B A B A A A A A A A A A A A A A B B B B B B
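The run-test quantities used in Sects. 14.6.3 to 14.6.5 can be checked numerically. The following is a minimal Python sketch, not the authors' procedure. The function names are hypothetical. The moments follow the expressions quoted from eq.(14.95), and the exact probability uses the standard combinatorial count of arrangements with exactly r runs, which is assumed here to correspond to eq.(14.94).

```python
from math import comb, erf, sqrt


def count_runs(symbols):
    """Number of runs (maximal blocks of identical symbols) in a sequence."""
    symbols = list(symbols)
    if not symbols:
        return 0
    return 1 + sum(a != b for a, b in zip(symbols, symbols[1:]))


def run_moments(n, m):
    """Mean and variance of the number of runs for sample sizes n and m."""
    mean = 2 * n * m / (n + m) + 1
    var = 2 * n * m * (2 * n * m - n - m) / ((n + m) ** 2 * (n + m - 1))
    return mean, var


def run_prob_exact(r, n, m):
    """Exact P(number of runs == r) when all arrangements are equally likely."""
    if r < 2:
        return 0.0
    k, odd = divmod(r, 2)
    if odd:   # r = 2k + 1 runs
        ways = (comb(n - 1, k) * comb(m - 1, k - 1)
                + comb(n - 1, k - 1) * comb(m - 1, k))
    else:     # r = 2k runs
        ways = 2 * comb(n - 1, k - 1) * comb(m - 1, k - 1)
    return ways / comb(n + m, n)


def normal_cdf(z):
    """Cumulative standard normal distribution G(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# A/B series of the beam-momentum example: 7 B, A, B, 13 A, 6 B.
series = "B" * 7 + "AB" + "A" * 13 + "B" * 6
n, m = series.count("A"), series.count("B")
r = count_runs(series)
mean, var = run_moments(n, m)
z = (r - mean) / sqrt(var)
p_exact = sum(run_prob_exact(k, n, m) for k in range(2, r + 1))
print(r, n, m, mean, var, z, normal_cdf(z), p_exact)
```

The same `run_moments` call with n = 28, m = 32 and 24 observed runs gives the statistic of the two-sample example in Sect. 14.6.3.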
This series has a considerable degree of order with only 5 runs among the 28 symbols. From Appendix Table A11 we see that, for n = m = 14, the critical values $r_\alpha$ for the significances $\alpha$ = 0.05, 0.025, 0.01, and 0.005 are, respectively, 10, 9, 8, and 7. Hence the probability to have as little as 5 runs must be considerably smaller than 1/100.

From eq.(14.98) the expectation value and variance for the number of runs r are, respectively,

$$ E(r) = 14 + 1 = 15, \qquad V(r) = \frac{14\cdot 13}{27} = 6.74. $$

If we adopt the large sample approximation in this case the actual value of the approximate N(0,1) test statistic of eq.(14.97) becomes

$$ z_{obs} = \frac{5 - 15}{\sqrt{6.74}} = -3.85. $$

Hence the probability to have 5 or less runs among the 28 symbols with this approximation and the accuracy of Appendix Table A6 is G(-3.85) = 1 - 0.99994 = $6\cdot 10^{-5}$.

Exercise 14.11: For the example above, show that, for $H_0$ true, the exact probability to have 5 or less runs is $5.97\cdot 10^{-5}$.

... limited amount of information in the observations, the run test is particularly ... only when used in conjunction with a $\chi^2$ test on the same data.

For definiteness, consider the three situations sketched in Fig. 14.10. In (a) the prediction from the hypothesis under test roughly follows the observations over the variable range, resulting in a series of deviations between observed and hypothetical values which alternate in sign and hence give a fairly large number of runs. If the hypothetical distribution differs substantially from the observed in location, as in (b), it is clear that there will be a sequence of positive signs followed by a sequence of negative signs. Similarly, if the distributions differ mainly in shape, as in (c), the signs will occur in sequence of negative, positive, negative. Thus in both situations (b) and (c) ...

Fig. 14.10. Observed and hypothetical distributions, (a) comparable in shape and location, (b) differing in location, (c) differing in shape.

$$ u = -2(\ln P_1 + \ln P_2) $$

For this combined test the critical region is taken at the upper tail of the chi-square ...

Fig. 14.11. Experimental and predicted distribution of four-momentum transfer.
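The combined statistic $u = -2(\ln P_1 + \ln P_2)$ can be evaluated as in the sketch below. It rests on assumptions not recoverable from the text here: $P_1$ and $P_2$ are taken to be the probabilities from two independent tests (for instance the $\chi^2$ test and the run test), so that under the hypothesis u follows a chi-square distribution with 4 degrees of freedom, whose upper tail has the closed form $e^{-u/2}(1 + u/2)$. The function names and the input probabilities are hypothetical.

```python
from math import exp, log


def combined_statistic(p1, p2):
    """u = -2 (ln P1 + ln P2) for two test probabilities."""
    return -2.0 * (log(p1) + log(p2))


def chi2_upper_tail_4dof(u):
    """P(chi-square with 4 degrees of freedom > u), closed form."""
    return exp(-u / 2.0) * (1.0 + u / 2.0)


# Hypothetical probabilities from a chi-square test and a run test.
p_chi2, p_run = 0.20, 0.03
u = combined_statistic(p_chi2, p_run)
p_comb = chi2_upper_tail_4dof(u)
print(u, p_comb)
```

A small combined probability places u in the upper tail, which is the critical region named in the text.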
14.6.9 Wilcoxon's rank sum test for comparison of two samples

We have so far given several prescriptions for the comparison of two samples, and will now introduce the Wilcoxon two-sample test, or Wilcoxon's rank sum test, for the same problem. We assume again that we have two ordered samples $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_m$ ($n \le m$); we want to test the hypothesis $H_0$ that the two populations from which these samples originate are identical.

As described for the run test in Sect. 14.6.2 we arrange the (n+m) observations in increasing order of magnitude. In this combined ordered sample each observation is assigned a rank, equal to the order in which the observation occurs in the series. If some observations happen to be identical ("ties") they are all assigned the average value of the ranks these observations would have if they were distinguishable. The Wilcoxon test statistic W is now constructed as the sum of the n ranks for the observations from the x sample. If $H_0$ is true we expect the x and the y observations to be well mixed in the combined series, and hence the value of W should be not "too small" and not "too large". Conversely, if the value for W comes out either "very small" or "very large" this would mean ...

... assumes one-sided tests, the critical value W being defined as that integer value for which

    ...

To obtain the critical values corresponding to a two-sided test at a significance level $100\alpha$ % one reads off the lower critical value $W_l$ as the table entry in the appropriate column for $\alpha/2$. The upper critical value $W_u$ can then be obtained from the symmetry property of the distribution, which implies

    ...

The value of 2c is also given in the table for each n,m combination.

The distribution for W can be shown to tend to normal for n and m large. For large samples one can therefore use Wilcoxon's test with the approximate N(0,1) statistic

    ...

where a "continuity correction" of -1/2 or +1/2 is added to the numerator depending on whether an upper or lower tail probability is being calculated. The normal approximation is good for most practical purposes with n and m both larger than 10. Even if n is smaller than 10 the approximation is fair, provided that m is not too much larger than n (fairly symmetric probability distribution) and the significance $\alpha$ not too small (below 0.01, say).

Exercise 14.13: (Wilcoxon's rank sum test for randomness within one sample) Discuss how the Wilcoxon rank sum test can be used to test whether a series of measurements is free from systematic trends.

Exercise 14.14: (Wilcoxon's rank sum test for independence) A distribution of two variables f(x,y) is such that the variable y can take on only two values. Show how Wilcoxon's two-sample test can be used to test whether x and y are independent variables.

The observations above correspond to the following combined ordered sample.

_0.56_ _0.6_ _0.73_ _0.9_ 1.0 _1.05_ 1.6 1.7 1.9 2.3 2.8

where the contributions from the counter experiments have been underlined. Thus the actual value of the rank sum becomes

$$ W_{obs} = 1+2+3+4+6 = 16 $$

which is below the critical limit for the test with the chosen significance level. Accordingly, from the Wilcoxon rank sum test the two sets of measurements of the lifetime are not consistent at the 5% level.

The ordered sample of the two measurement series corresponds to 4 runs. It will be seen from Appendix Table A11 that the critical number of runs for n = 5, m = 6 and $\alpha$ = 0.05 is $r_{.05}$ = 3. Hence, from the run test we would have no reason to claim that the results of the two sets of measurements are not consistent at a significance 0.05. In fact, the probability to have 4 or less runs is 0.0644. This illustrates that the Wilcoxon test is more capable than the simple run test ...

... rank sum $W_j$ as well as the mean rank $\bar{W}_j = W_j/n_j$, where $n_j$ is the number of observations in the j-th sample. If the assumption of J identical parent populations is correct, i.e. if $H_0$ is true, all samples are expected to have the same mean rank ...

... substantially different, the hypothesis of a common parent population should be rejected if the observed value $H_{obs}$ exceeds the critical value $H_\alpha$ corresponding to the chosen significance $\alpha$. In order to determine these critical limits one must know the probability distribution for H assuming the null hypothesis to be ...

... H for sufficiently large $n_j$ has a chi-square distribution with J-1 degrees of freedom. The $\chi^2$ approximation is generally accepted when either J = 3 and all sample sizes are above 5, or J $\ge$ 4 and all sample sizes above 4.

Exercise 14.15: Justify the statement that H of eq.(14.109) is asymptotically distributed as $\chi^2(J-1)$ if $H_0$ is true.

Exercise 14.16: Show that eq.(14.110) follows from eq.(14.109).

Exercise 14.17: Show that a test with the statistic N/(N-1)·H for J = 2 is equivalent to the Wilcoxon two-sample test.

... The observed number of events in the different bins and the total number of ...

... identical, corresponding to, for each bin number i, a common probability for all J histograms,

$$ H_0:\; p_{i1} = p_{i2} = \cdots = p_{iJ}, \qquad i = 1,2,\ldots,I. \qquad (14.114) $$

Let us denote the common, unknown, probabilities by $p_{i\cdot}$, i = 1,2,...,I. The Maximum-Likelihood estimates for these bin probabilities are the average frequencies observed for each of the bins

    ...

which are seen to satisfy the requirement $\sum_{i=1}^{I} \hat{p}_{i\cdot} = 1$ in virtue of the constraint on the $n_{ij}$, eq.(14.113); thus only I-1 of the estimated bin probabilities are independent. The test statistic for the comparison of all histograms simultaneously is constructed as the sum over all histograms and bins of altogether I·J terms, each term being a squared deviation between an observed and estimated number, divided by the estimated number,

    ...

If $H_0$ is true and the expected event numbers in all histogram bins fulfil the usual normality requirement, this statistic will be approximately chi-square distributed. The number of degrees of freedom is (I-1)(J-1), corresponding to the present number of independent observations (IJ-J) minus the number of independently estimated parameters, I-1.

The assumption of consistency between all J histograms is accepted at the chosen significance level $100\alpha$ % if the calculated value $\chi^2_{obs}$ comes out smaller than the critical value $\chi^2_\alpha$ for the appropriate number of degrees of freedom, and rejected if the opposite occurs. Quite frequently when $\chi^2_{obs} > \chi^2_\alpha$ the overall inconsistency can be traced to a single histogram, say the j-th, having an exceptionally large contribution to $\chi^2_{obs}$. A repeated calculation with the j-th histogram excluded may then show the remaining histograms to be mutually compatible and suggest a critical examination of the data for the odd histogram.

Laboratory    Event topology
              152 131 70 48
              189 161 108 42 25
              105 78 52 32 12

Check that the event samples obtained by the different laboratories are fully compatible.
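The simultaneous comparison of J histograms described above can be sketched as follows. This is only an illustration under stated assumptions: the bin probabilities are estimated as the pooled fraction of events in each bin, the expected number in bin i of histogram j is that estimate times the total number of events in histogram j, and the function name and the event counts are hypothetical.

```python
def histogram_chi2(counts):
    """Chi-square statistic and degrees of freedom for J histograms.

    counts[j][i] is the observed number of events in bin i of histogram j.
    """
    n_hist = len(counts)                      # J
    n_bins = len(counts[0])                   # I
    totals = [sum(row) for row in counts]     # events per histogram
    grand_total = sum(totals)
    # Pooled estimate of the common bin probabilities (assumed ML estimate).
    p_hat = [sum(counts[j][i] for j in range(n_hist)) / grand_total
             for i in range(n_bins)]
    chi2 = 0.0
    for j in range(n_hist):
        for i in range(n_bins):
            expected = totals[j] * p_hat[i]
            chi2 += (counts[j][i] - expected) ** 2 / expected
    dof = (n_bins - 1) * (n_hist - 1)
    return chi2, dof


# Hypothetical event counts for two laboratories and four bins.
example = [[40, 30, 20, 10],
           [82, 58, 41, 19]]
chi2_obs, dof = histogram_chi2(example)
print(chi2_obs, dof)
```

The value `chi2_obs` would then be compared with the tabulated critical value for `dof` degrees of freedom at the chosen significance level.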
Table A1. The binomial distribution

[Columns: p = .01 .02 .03 .05 .10 .15 .20 .25 .30 .40 .50; rows: n and x. The tabulated values are not legible in this copy.]
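Entries of a cumulative binomial table can be recomputed directly. The sketch below assumes that an entry for given n, x and p is the probability of x or fewer successes in n trials; the function names are hypothetical and the column values of p are those read from the table header.

```python
from math import comb

# Column values of p as read from the table header.
P_COLUMNS = [0.01, 0.02, 0.03, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40, 0.50]


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """Probability of x or fewer successes in n trials (assumed table entry)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


def table_row(n, x):
    """One row of the cumulative table, rounded to four decimals."""
    return [round(binomial_cdf(x, n, p), 4) for p in P_COLUMNS]


print(table_row(20, 0))
```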
Table A2. Ihe cumulative binomial d i s t r i b u t i o n (continued) Table A3. The Poisson d i s t r i b u t i o n
p .O' '"2 .a, .OT ." .'5 .a "5 .'O .a0 .'O The t a b l e g i v e s values of -
P(r;u) = r! ure-' for the
n x s p e c i f i e d values of u and r .
10 0 ,1179 .61,* .lT. .,5.5 ,1116 -0111 .n,,r .**,I .Ion" ,0000 .o.oo
.
1
(I
1
t
1
..I11
.OssO
I.0000
,.onoo
1.0000
I.0000
.e.o,
,9929
,999.
,.oaaa
1.0100
I.0000
.mar
.PW0
,0911
.*w,
1.00OO
I.ooBI
.r,ra
.9?15
,9111
.esrr
.~P97
i.oton
.I?,,
,6769
.B011
.*rn
.P111
.eOla
.tr5s
.LO19
,11117
."?W
.WE1
.We1
.os9s
.1061
..,I.
.s*va
.10aZ
.9>>I
.w.1
,0911
.22ll
..,.a
,6171
.mSI
.oa7s
.a151
,107,
.2,r5
..I*+
.1Ow
.laor
,0016
,116"
.Orlo
.I118
,1100
.nono
,000z
.om,
.aorv
."zO,
.n%ll
I
1 1.0060 1.1010 ,.noon ,.nano .ssoa .rQ.l .err9 .Is"Z .rr2, ..>re .,,,a
1 1.0000 1.0100 l_"~.0 ,.nnnn .-99e .Qoar .ssoa ,939, .saar .5ssr .7r,r
9 1.1na11 1.0000 1.OODO ,.01"",.IO"O ,999" .ee>. .r"s, ,9520 .,qr, .r,,.
10 t.anoa ,.naoa om i.aonn i.nono i.oooo . e e ~ r ..esb .em .mrr .maI
II t.oooo I.OOOO 1.no00 I.onno t.nsno i.aonn . . i ~.wen .ew .e.x .,.a,
12 t.onoo >.oooo I . ~ O ~ O I.osnn t.anoa ~ . e a a n m.ooilo .ee*a .em .wee .erer
11 I.OOOO I.0000 ,."*I0 ,.""no 1.00011 1.0001 I.0000 1.*10. ,9997 .QV>V .*.?I
I. I.0000 I.0000 ,."lo" ,.nooo t.ooso ,.oaoa ,.oooo ,.oono i.vooa .sm. ,s.
IS I.0000 I.0000 I.0000 l_""00 I.OO00 1.0000 I.0000 I.0000 1.0000 .we, .**.I
16 1.0000 I.00@0 >.0aoo 1.0000 1.00BO 1.0000 1.0000 i.IIPO0 ,.DO0 1.0100 .P.).,
I7 I.IODO I.OOO. I."OO" 1.14"" 1.1001 1.000" I.0000 1.0000 ,.on00 1.1000 .
.*
el
I8 ,.onlo I.DOIO ,.oo*o ,."d"o 3.1000 I.00BO I.0000 1.0110 1.0000 1.0000 I.0.00
I9 I.O"OO 1.0010 1_00*0 1.0110 1.0000 I.0000 1.0000 i.DDIO I.0000 t.0000 I. 00
?O 1.0000 1.000. I."Ol. ,.nono ,.oano ,.onaa ,.a000 ,.aano ,.oooo 1.00aa I.000~
5 a ,7778 .ao,r .*a," ,777. .a,,s .I,,? .OOl" .oooa .ooo, .&aoa .oano
I
2
1
.WaZ
,9980
,9999
,9116
_016a
.sslb
,11280
.9+,10
,9911
.L&?.
,1729
..Ll*
.11I1
,537,
,1611
.0+11
5
,1111
.
,027'
,2310
.no10
"
,0962
3
,0016
.*o*a
.0,1.
.1001
,000.
,001.
,1000
.aoao
,010,
L 1.*0"0 .PWP ,9992 .09,8 ,9010 .m21 ..*01 ,1111 .010* ,0095 ,0005
I
6
7
I . * ~ o1 . 0 o ~ o
1.0010
1.0011
1.0001
1.0000
. o w
I.0OOD
i.nOon
.99~.
1.0010
.esaa
,0001
,9917
.ales
,7305
.97.5
.atw
.,*a11
.WOP
.~ral .ms
.5*11
.72LI
.1.01
.I1II
.ozer
.OI,.
,153.
.".,,
.na20
,021.
I0
8
9
I.OOOO
I.0000
1.0"10
I.OOOI
1.1000
1.0101
i.nooo
,.""lo
i.ooOP
l.oono
\.oono
,.nono
.wsr
.99se
,.oono
.esro
,9979
,999-
.Vnz
.Q"Z,
.ss.l
.asas
,928,
,970,
.arrs
.was
,9129
,1115
..r.r
.5"1"
.,,."
.orlr
.>I?*
II 3.0000 1.0010 1.OOOO t.aOO"l.OOOO .l)990 .We5 .WP1 ,9551 .1111 .1&.0
I? ,.On00 1.0000 I.D.00 ,."OD" l_O""O >.oaon ,9996 .ssaa ,9825 .a.w ,5100
11 t.aooe 1.0oao ~.oooo I.aooo t.aooo i.aoan .oew .qrw .esra . ~ ? 2 .*rso
I. I.0000 I.DOIO I.0001 ,.""no ,.oaoo 1.00on 1.oOln .99*. .use2 .%56 .r*r.
I5 I.OI.0 1.0010 ,."ma ,.""no I.OODO I.0400 I.00OD I.0000 ,9995 .."a" .*rrr
I6 1.0"00 1.0000 1.0110 1.nnoo ,.nooo i.aooo ,.aaoa ,.oooa ,9999 .o.r, .*,a,
11 1.0000 I.0000 i.DDO0 1.*"110 I.0000 2.0000 1.0000 1.0000 1.0000 ..9.J ..I..
11 1.0010 1.0000 ~ . n o o o i.nnoo r.oooo i.aooo i . o n a o i . a o a o ~.oooa .WT .-sl~
le i.onoo 1.0000 1.0000 ~ . n n n a ~ . o o o o ~.oaoo ~ . o o o o ~ . n o o a i.oooo .sseq ,0910
1D ,..OD0 1.00110 1.1000 ,.nono ,.no00 r.0000 ,.a000 ,.aooo l.OOD0 ,.oooa ,9995
I3 I.l"DD I.0000 1.10110 ,."Dl0 1.1000 1.0000 I.0.00 1.0000 1.0.11 ,.*.1. .99e.
2 1.0006 1.0000 I.lOO0 I.nnoa I.Ooa0 1.0000 I.0000 I.oOO0 3.0000 >.Do00 1.0000
11 t.0110 1.0000 1.nOoo i.oao0 I.oOo0 t.Oooa 1.0ool i.oOO0 1.0000 l . 0 0 0 ~ t.oo00
2. 1.0000 I.0.0. I.00*0 ,.nnoo I.oooo 8.oaoo ,.oooo I.olao ,.om0 ,.oaoo ,.oaoa
15 1.0000 1.1010 r.aooa t.onoo i.oono i.oaao i.aaaa i.aooo i.oooo i.oaao t.saoa
30 0 .13*1 .5.99 .&+la .?I11 .OllL .OO?L .0011 .m1? .0000 .OOOI ,0000
I .ss,e .m.s .r,,, ,5535 ,1037 .o.m .*to5 .noeo ,0003 .oooo ."DO*
2 .we7 .??a> ,0399 ,",t> ..,I. .,5,. .*.a2 ,0106 .mt, .om* ."OOO
1 .999a ,997, .08", ,9,-2 .6.7. .x,7 .I?*, .on. .ow, .ooo, .oom
6
5
6
I.O"O0
3.onao
1.0000
,9397
i.oooa
1.100.
.W9l?
,9998
1.0000
,911.
.w.r
.se-.
,8267
.e2*a
.Srrl
.5Z'5
.,,aa
.Ill,.
.155?
.at75
.a010
.oe,s
.Ills
.,."I
.O,Ol
,0766
,1395
.oo,r
,005,
.OII.
...
.a000
..IO,
02
7 1.0000 I.1000 I.0000 ,999. .sv22 .9>az ,7608 .r,r3 ,281. .a.35 ,0026
8 I.0000 3.0000 I.DOO0 1.1016 .Wall .el?? .",I3 .a736 .Ill5 .09ra .oOsl
s 1.0010 ,.0000 I."OOO I.ll.0 .ss-5 .esa, .1),ev ,8011 .l"." ,1161 ,0211
10
,I
12
1.0016
2.0000
t.onno
I.0000
1.000*
~.oooo
I.0000
1.0(100
,.no00
1.lOl0
,.nnnn
1.0nnn
.
'
9
99
,.oano
i.oooo
,9971
.99e>
.%n
..'*a
.osos
,1941
.s.-3
. ~ v s ~.sr.*
.110&
.a.o,
.*ass
,1915
..,,,
.rlar
,019.
.,*a2
.la*.
,I I . ~ " O ~ 1.0000 1.0000 ,.""no ,.onoo t.oann .qss, .ee,a .rws .,,.r .rsz,
I. I.O"D@ 1.0000 1.0011 ,.nono ,.ooao i.aann .9*9" .9sr, .s",, ."I.* ..*re
15 1.1.00 I.0040 ,."*O" I."""" ,."ODD ,.oonn .9w9 .wsz .s93a .w2s ,7722
06 ,.elOD I.@"OO 1.000" 1.nnn0 r.onoo ,.aooo ,.aooa .9wa .re77 ,9510 ,707,
..,".
I7 I.0000 I.0000 l.n100 l.nOn0 i.nooo i.oonn 1.onOo .weq .ewr ,978" ,1102
,I i.(l"OO I.000" ,.noon ,.""DO 1.0000 I."O." 1.0000 I.O.00 .".PI ..s,, ,1991
3s ,.onlo ,.0000 ,."I00 ,.nnoa ,.oooo ,.oooo ,.oooo ,.ooao i.oo00 ,997, .?sea
,D I.0000 I."OOO I.nnoD ,.."00 ,.000o >.*a"" 1.0100 1.1000 I.OODI .09*1
>I i.onn0 1.0001 1."00" ,.mono ,.oooo i.onan ,.loo0 ,.aono ,.oaon .evg" .*st.,
71 ,.onlo i.Ol"0 i.noo0 ,.""PO ,."no0 i.00oo ,.loo. ,.OOOO ,.an". I.O... ...I I
71 1.0000 i.L)@IO 1.0010 ,.""no i.oaoo i.oano ,.no00 ,.osno ,.onoo ,.oooo .ow,
1. 1.0010 1.0010 ,_nola 1."""" l.oO10 i.oo00 i.noo0 ,...oo 1.0000 1.O.01 ..a
15 1.0010 1.010. 1.000" l.000. ,_onno i.oaon i.oo00 I.0001 I.0.00 >.O".l ,.no00
7 I.OP.4 1.0000 1.0000 ,.nono 1.naoo 1.oooa ,.looo ,.oooa i.oooB ,.oaan ,.noo&
11 1.0100 1.000a I..000 1.001" ,_*no0 1.0000 1.0.*1 I.*000 ,.onoo ,.o.oa ,.nnaa
21 1.11100 l.0000 I.&000 ,_""0".aaoo I.0""" ,.nooo ,.oaoo I.oooa ,.oooo ,.nono
0 1.""00 1.0000 ,_"OD" ,.""no i.oono !.a000 i.oo0o ,.aaoa t.ooo0 ,.oaao ,.oooa
- 10 1.0000 1.001. l."010 1.nnoo 1.0000 i.on00 ,.nooo ,.0aoo I.0000 I.oloO ,.oooa
n ,0008 .0om .on", .OD"* .on06 .eons .onn5 .noor .ona* .om,
I ,005~ .oosa .nn*q .oars .owl .aom . o o ~ .oow ,0029 .non
Z .P?Oll .Om"* .n(lO ,0167 ,0156 .Ole5 .">la ,0125 ,0116 ,0101
I .OLW .or,. .0*1, .O,lP ,0316 _"145 .011* .om5 .02.(.
,
5
,0117.
.1111
.OII*
,01116
.,?a*
.n,ss
.,,LI
,076.
.Ill0
.or29
.Ins*
.orss
.,or7
,066,
.to23
.oa,z
.nsss
.oa07
.osr,
,097,
.os,s
6 ,168 5 0 3 7 3 1 .I292 .I112 ,1271
7 .1.19 .Iblh .,."I .I.,C .
,'
61 .I.)& .,.<I .1*21 .,6,1 .I,%
11 .I121 ,1311 .1151 ,1363 ,1173 +11*1 .I111 .IlW .!Is5 .I396
(1 ,1012 .to10 .In*& .It21 .ll+* ,1167 .It87 ,1207 .I??& ,124,
10 ,0110 ,0710 ."l"O .one9 .olsR .a.a, .ns,r ."Q., .mr, ,099,
.
11 .01111 .OIDI ,0111 -0551 ,0519 ,0011 .a640 ,0667 ,0695 .n7??
I2 .OZO, .o,o, .o,n .a,.r .o,ra .om8 .or,, .0.36 ,0457 .o.*,
II
Is
,015~
,0071
-0017
.OIW
..me6
.onrl
.n~m
,0095
.no*&
.ole6
,0101
.On5l
.",,,
,073 1
.on57
.om
.a,z,
.OOII
. a m
.o,,r
,0069
.nrm
."try
.DOIS
.am
,0357
.no81
.o1*6
.",re
.DOqP
I6 ,0016 .oo,s .""ll .00>* .an?* .oo>o ."a,, .oo37 .oa*, .no.r
1, .0001 .00"B .onn9 .oo,a .ao,r .oo,, .oa,r .ooll .O"l., .a071
111 .0001 .00", .on"& ,000. .onor .noas .no06 ,0007 .ones .eons
19 .0001 ,0001 .on", .oooz .OOO? .ooa2 .ooo> ,0003 .on03 .ooar
20 .0000 .0"*0 .on", .O"Ol .ooa, .oaa, .O""l .ooo, .ooo, .oooz
2, .OPOO ,0000 .onno .oona .anao .onao .nnoo .aaoo .onat ."an,
,0001 .mn, .arm? .on@, .mar .a002 .oao~ .oaaz .oaot .noat
1 .00?1 .nn?l .oal.) ,0017 ,0016 .@OM ,0013 ,0012 .oOll
.oteo .(109~ .one. .owe .oo~r .006(1 .n06) ,0051 .OD% .0a5a
1 ,0269 ,0152 .n237 .o?lr ,0208 ,0195 .Dl83 .olll ,0160 ,0150
& .0511 .PSI7 .OL?I .OI6L .W41 ,0110 .Dl98 .0111 ,0151 ,0317
1 .On111 .OW9 .en16 ,0184 .011Z ,0722 ,0692 ,0663 ,0615 ,1607
6 .llsl .I160 .?I28 .I097 ,1066 .1O1. .lo01 .0912 .&9.1 .OPII
1 ,11711 .I158 ,1311 ,1317 ,1294 ,1271 .IZL1 .1122 ,1197 ,1171
8
9
I0
.!IPS
6
.IoI1
.I102
6
.IPIO
.
,111.
.IoL1
,1112
0
.to81
,1115
,199
+IlOL
.I116
6
+Iltl
+1156
3
.ll.0
.131+
,1115
,1157
.1112
.)>I1
,8172
,1310
,1311
.Ill&
II .01Le .0116 .alOr .0128 .0851 .OL(lO .OW2 ,0921 .a948 .Wlo
11 . ~ 0 5 .@5>0 . o w .osw ,060~ .as?? .osra .oslv .WOI .wra
11 .0115 .PI>+ .m5& ,0176 ,0395 .O.I6 .043l) +0159 .a*&! .0501
1. .01111 .Pls6 .n?lP .01?5 ,0260 .DIS(I .OZ7P .0289 ,0101, .01?L
II .oms .mm .oxto .eta .o>>h .OM, .o~za .oms .oxaz .ol*r
16 ,0050 .On55 .nDLP ,0066 ,0071 .0019 ,0086 ,0093 .a101 .01W
I7 .001. ,1026 .OW9 .DO33 ,0036 ,0040 .OOL1 .0018 .@051 .005l
18 .00,, .POI2 .no,r .on15 .oa,, .00,s .aoll .oozr .002a .nore
I9 .On05 ,0005 .arms .OOOl .a008 .0009 .a010 .OOll .OOl2 .Oat.
20 .OD02 ,0002 ."no? .0001 .00o3 .no** .ooo. .sow .oans .a006
PI .POI, .00", ."an, .ooa, .aoa, .aaal ,0002 .ooo2 .om2 .ooo,
22 .oooo .moo .moo .aaoo .ono~ .a001 .soot .ooo~ .so08 .oom
23 .0000 .0000 .0000 .0000 .0000 .0000 .0000 .0000 .0000 .0000
9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9 10.0
0 .000, .0001 ."no, .o00, .ooo, .a001 .0001 .aoo, .0001 .a000
.
1 ,0010 .Onns .on09 .OOol( .OoOl ,0007 .OOOb .OOOI .OD05 .OOOI
2 .0016 ,0063 .OW0 .DO11 +0014 .DO31 ,0029 ,0027 ,0021 ,0021
J .0110 .a111 .Din .PI15 ,0107 .Dl00 ."a93 .0087 .OOsl ,0076
.0,19 .P,02 .".05 .ole* .025r .o>.o .Oll(l .*?I, , 0 1 0 , .a,.*
5 .osm .*sss . o m .OIM . O L ~ .orso .ml+ .orla .om .nnr
8 ,085, .a112 .Or93 .01aL .OI36 ,070'1 .Ohel .Oh% .Os>l
7 .I145 ,1111 .Inel .L@e. ,1037 .tat0 .OW? .0955 .OVZa .OW3
II .I302 .I286 .l>69 .I251 .I212 .I212 .1191 .1170 .11*11 ,1116
9 ,111 7 ,111 5 .,,,I ,1306 .001. .I293 .11111 .I211 .I261 .,271
IO .II.)B ,1210 .l?ls .111$ .1215 .11.1 .11.5 .II*9 .12% .$?-I
.os9n .IS)> . I ~ V .lors .anal .~oa> .lose . I ~ I ? ,1125 .I,??
I? ,0152 ,0776 .n,s* .01*? .o*rr ,0866 .D8lld .OW8 .m11 ,0940
11 ,0516 .0%9 ,0417 .059l .PLI7 .ObLO ."LI? .OW5 ,0107 .071'
)I .01a1 .0161 ,0110 .Olss ,0439 ,0139 ,0631 ,0179 ,0500 .0511
IS .Oml .W?I .*115 .O?IO +n265 .02.1 ,0297 ,0111 ,0110 .01.1
16 .OBI8 ,0117 .a117 ,0147 ,0151 ,01611 ,0180 .OIOI .WIT
I, ,0061 .Om9 .om5 .P*LI, .OOPII .an95 .o,o> ."ill .0119 ."lZ"
In .PO11 ,0035 ,0039 .Onrl .OW6 ,0051 .@05l .OM0 ,0065 .oall
te .oats .OOII .ant9 .oou .OOZ~ .ooze .~OPI .no>, .oolr .no>?
20 .0001 .00D11 .on09 .on10 .oo!l .OOl? .nOll .00>5 .On97 .OOl*
?I .0011 .0001 .arm* .Ooor ,0005 .OOaa .nos6 ,0007 .Onall .nOW
?I .Doll .0001 .Onor .OoOl .am2 .ooo2 ,0003 .OOOl .O@O. .00"4
13 .oooo .oan~ .nno! .oool .on01 .oau .ooox ,0001 .ooo? .oon2
2. .OD00 .O""O .oono .on00 .oooo . O ~ ~ O .o000 .ooo, .0"0l .son,
1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
0 .11?$ ,1011 _>lli .2Lb* ,2231 ,2019 ,11117 .1151 .I*% .1)51
I .sem .a626 .wsa .5sla .157a .I?W .LV~Z ..*?a .+111 .*tho
2 .900* .81*5 ,1511 .a315 ,8088 .1111 ,1571 ,7106 ,7027 ,6767
1 .97.> ,9662 .a569 .%63 .93&. ,9212 ,9068 .a?!> .87'7
+ .P.)l* .9'l23 .9*91 .9*51 ,981. .97&l ,9701 ,9636 .ST19 ,911,
5 .0990 ,9915 .?Dl8 ,9975 .9PW ,9920 .sag6 .%.a .P$>r
6 .ww . 9 ~.WQL .sw .wst . s w ..vat .ser* .essa ,995s
I I . O ~ O 1.0000 ,9799 .eew .ssss .e*w . w s .wsr .we ..sms
I I.0000 1.0000 I."OOO 1.0000 1.oono I.ono0 .was .9sos .9W" .W."
9 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0
0 .1?14 .)1&8 .lo03 .0907 .0"21 .Ol4l ,0672 .Oboe .0510 ,0191
I ,1796 .15*6 .no+ ,308. .Zlnl .?67r .I+OY .?III .?l46 .lwl
2 .(1ls(l .(1?21 .5960 .56sl ,5618 .*la+ .r936 .*695 .+.a0 .an2
1 .a18e ,8194 .1991 .7787 ,1576 ,1360 ,7111 .eel9 .saga .nip
L ,917'1 .em ,916s .eael .as)? .mrr .la19 .am .ella .~151
5 .wse .P~W .woo .sea> .err0 .srnn .+a, . w r s .srro .slrl
.W.I .*975 .*so6 .911r .9158 .el28 ,979. ,9756 .*)I> .96~5
7 .<91S .91)10 .*97* .9967 .W51 .99" .993L .Wls .9m1 .ell>
I .9ew .es95 ,9991 .swb .was ,9983 .90a1 .sws .eves .*P.>
9 .PW* .99P9 ,0999 .s99. .9997 .r)ll% .sesr .I)ss, .99., ..sns
It I.OOD0 I.1000 I."ODO 1.00.0 .sses .$so9 .see9 .rse. .ssrs .s.,
11 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 .9999 .9999
12 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
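Individual entries of such tables can be cross-checked numerically. The following is a minimal Python sketch, assuming that the preceding tables list Poisson probabilities P(r; μ) = μ^r e^(-μ) / r! and their cumulative sums for the mean values μ given in the column headings. The function names `poisson_pmf` and `poisson_cdf` are illustrative only and do not come from the text.

```python
import math


def poisson_pmf(r, mu):
    """Poisson probability of observing exactly r events for mean mu."""
    return mu**r * math.exp(-mu) / math.factorial(r)


def poisson_cdf(r, mu):
    """Cumulative Poisson probability of observing r events or fewer."""
    return sum(poisson_pmf(k, mu) for k in range(r + 1))


if __name__ == "__main__":
    # Print a few rows for mu = 1.1, rounded to four decimals as in the tables.
    for r in range(5):
        print(r, f"{poisson_pmf(r, 1.1):.4f}", f"{poisson_cdf(r, 1.1):.4f}")
```

Rounding the results to four decimals gives values that can be compared directly with a legible table entry; any disagreement would point to a misprint or a misread digit rather than to the formula.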
The table gives values of $t_\alpha$ for different degrees of freedom $\nu$ such as to produce specified values $F(t_\alpha;\nu)$.
The table gives values of $G(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-\frac{1}{2}x^2}\,dx$ for $0 \le y \le 4.99$.
$G(-y) = 1 - G(y)$.
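As a cross-check on tabulated values, the cumulative standard normal distribution can be evaluated with the error function. The sketch below assumes that G(y) denotes the integral of the standard normal density from minus infinity to y; the function name `G` simply mirrors the symbol used in the caption, and the use of `math.erf` is a choice made for this illustration.

```python
import math


def G(y):
    """Cumulative standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


if __name__ == "__main__":
    # The symmetry relation G(-y) = 1 - G(y) holds for any y.
    for y in (0.0, 1.0, 1.96, 3.0):
        print(y, f"{G(y):.4f}", f"{G(-y):.4f}", f"{1.0 - G(y):.4f}")
```

Because of the symmetry relation, only non-negative arguments need to be tabulated; values for negative y follow from the same function.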
Table A9. Percentage points of the F-distribution
* Multiply these numbers by 100.
ν1: 1 2 3 4 5 6 7 8 9 10 12 15 20 30 40 60 120 ∞
a .so L.5. L.12 ..Is 4.11 k.05 1.01 3.98 3.95 3.9* 3.P l.W 1.87 1.1. 3.82 3.80 1.n 1.76
?.,I
.PS 1.71 6.9. 6.59 s.39 6.26 6.16 6.03 a.o* 6.00 5.96 5.e) 5.8. 5.80 5.75 5.12 3.69 5.6s 5.6,
.m ~ 1 . ~ 2 10.e5 e.se 9.60 9.36 9.20 e.07 a.qa 8.90 8.s' 8.7% s.*r s.sr ..+a *.at 8.16 s.n e.as
.99 z1.10 18.00 I6.W IS_eI 15.5, 15.2, li.98 lb.80 Ik.66 ,655 L1.37 11.10 Ia.02 11.~. 11.15 11.65 12.56 !I.'*
.?q% 31.11 i6.2E 2L.26 21.15 ??.a* 2L.97 21.W 21.15 21-11 20.97 10.10 iB.L* 20.17 is..P 19.75 19.61 19.LI 80.32
,999 ,'.I. at.25 ss.,s r,.** FI.7, 50.53 -9.6s es.00 is.*, *l.OS *7.., rr.7s *s.,o *i.*, *s.os .1.,5 r... a *,.05
5 .90 6.06 1.78 i.L? 3.5' 1.*5 3-*d 3.37 1.11 3.30 3.27 1.?i 1.)) 1.11 3.1* 1.1a 1.11 1.1D
5
.9rs
.Q-
6.6,
ID.OI
16.i6
(I..,
5.79
11.21
5.rl
7.16
12.06
5.19
7.w
11.w
5.05
7.15
10.97
L.Q5
6.98
to.67
a.88
6.85
ra.ra
*.bZ
6.76
1o.r-
*.I7
6.68
..I'
6.62
1 0 . ~ 6 1o.o-
a.68
6 . ~ 2
9.m
L.6Z
&.*,
q.12
L.56
6 . n
9.55
1.50
6.2,
s.la
a,.(,
6.1,
+.a1
6.12
9.20
r.m
6.0,
9.11
r.36
6.02
9.02
,995 2 . 8 111.31 1a.51 15.56 1*.9L 1*.51 11.10 11.'6 3 . 7 3 11.18 11.15 12.90 I2.66 11.51 >?.LO 1 . 7 II.1*
.PW 7 . 7 . 2 3 . 0 11.09 19.75 ZS.114 11.16 27.6' 11.2* 26.91 Z5.91
?$.L? 25.39 ?+.a7 1a.60 21.11 il.OL 23.7.
6 .90 1.78 I.&* 1.>9 3,)s 1.11 1.05 1.01 2.91 Z.96 1+'* 2.90 2.87 2 . ~?.no 2.78
.
-95
.o-
5.99
a.et
11.75
5.1.
7.26
10.92
i.76
*.an
9.18
b.51
s.2,
".I?
L.19
5.99
e.19
L.20
s.81
a.rr
*.I!
s.70
8.26
s.15
s.so
e.10
1.10
5.52
1.98
L.OS
5.bs
7.81
1.00
5.n
1_9*
5.2,
I.s7
5.17
1.~1
s.",
3.17
5.01
2.7%
3.76
..vr
2.7.
3.70
*.s"
2.72
3.67
*.s~
1.12 7 . n 7.m 1.1, T . I ~ 7.06 h . 9 ~ *.en
.W5 ib.L3 I2.W I2.DI 11.66 11.07 IO.79 iO.57 10.19 10.25 10.03 9
,
'
s *.<9 9.15 9.2' 9.11 7.00 8.88
.q99 35.51 l7.00 21.10 11_9? 10.81 W.01 19.L6 19.03 IP.69 IB.rl 7 . 9 6 7 . 1 la.67 la.'* Is.ll ir.9e ,$.is
7 .W 3.5- 3.26 3.07 ?.w 2.88 2.75 2.72 2.70 2.67 7.63 2.w 2.56 2.5. 2.5, ?..a >.a,
. .?5
.OF
5.:' '.I* 6.15 r.12 1.*1
2.83
1.11
2.78
3.79 1.71 3.60 3.6. 1.57 1.59 1.L- 1.18 I.>* 3.10 1.27 3.23
1.01 6.51 5 . ~ 9 5 . 5 ~ 5.19 5.1? r.99 '.PO r . 8 ~ *.IL L . ~ I L.FI c.,. i.31 L.Z~ i.2. ..i*
.i)F 12.25 9.51 I.L5 7.85 I,** 1.1* 6.99 6.8% 6.72 6.62 ..&I 6.3, 1.16 5.99 5.91 5.aP 5 . ~5 . 6 5~
-9-5 16.11 IZ.60 10.18 10.05 P.52 ').I@ P.(IP a.a* 8.5) 8.38 8.18 7.9, i.,r I.., r..2 7.1, ,.Is ,.ns
.eve 1 9 . s ~ ?,.as LO.^ . I ~ 5 . 5 i IS.OZ . a 1 . 3 I . 11.11 I,.>? 12.91 12.5, 12.11 i~.i? 11.91 11.70
~~- ~~~ -
-
Table A9. Percentage points of the F-distribution (continued)
ν1: 1 2 3 4 5 6 7 8 9 10 12 15 20 30 40 60 120 ∞
ν2 F
Li .W 3.18 1.81 Z.6) 2.6. Z.>9 2.13 2.2. 2.2. 1.11 2.L- ..I5 >_I. 2.06 2.OL 1.9'1 I.** 1.9, I.9"
.s a.75 ,.w 3 . e 3.76 3.11 3.00 2.9, 2.*5 2.m 2.75 2.69 7.6, 2.5. 2.67 ?A, ?.,e 2.3. 2.30
..ess
,975 6.5s
e.,3
LI.75
5.10
6.9,
1.51
4.67
9.95
7.2,
b.tl
S..,
6.w
1 . l ~ 1.11
5.w
6.07
.ma
3.76
3.61
5.51
1.5,
4.5.
5.3%
>.a4
1.39
5.20
1.37
r.,.
5.09
1.2.
*.I6
..$l
1.1.
*..I
e.72
>."I
3.86
L.5)
I+'L
i.,"
b.17
2.91
3.6%
b.23
1.85
1.5.
*.I?
2.7-
3.r5
L.01
7.72
3.-
1.90
.se., ,a,*. 82.9, In."o 9.n *.a* a.,a a..* 7.7, 7.1. 7.zs 7.0. 6.7, s..o 6.09 5.0, 5.76 5.59 5.e
81 .so 1.1. 2.16 2.5. z.+> 1.15 1.28 z.21 1.20 2.16 2.1. 2.10 >.01 2.01 I.'$ I.*, 1.*0 L.8" 1.95
..5 ..67 1.01 >.a1 1.). 3.03 2.92 z.81 2.77 t.71 2.67 2.6. 1.7, 2.66 Z.II2.- 2.30 1.15 2.2,
,975 ...I *..I e . 3 ~ L.OO 1.v 1.a. 3.m 1.11 1.15 1.15 ,.or 2.95 2.w. 2.w 1 . 7 ~ 2.66 2.m
.se ..a, a.,. S.7. 5.2, ..eL ..*I ..a. 4.3. ..,9 ..I. 1.96 ,.a? 3.6. 3.51 ,.L, ,.1. 1.25 1.11
.w5 LI.37 I.39 6.03 6.2, 5.79 5.6, 5.15 5.0. 1.- L.82 +.SL a.97 L . D ~ I.07 3.11 1.11 3.b5
...
.0.9 ,,.PI 12.3, 10.2, 9.0, 8.15 7.8. 1..9 1.2, 6.98 s..o 0.52 6.7, 5.9, 5.5, 5.*r s.,o 5.1. 6.97
3. .90
.*5
,975
1.10
e.10
o
2.13
3.1.
6.86
1.12
3.-
A,?.
2
,.,,
1.19
. ~2.1,
?.e6
3.66
.21
2.85
1.50
2.19
2.76
T.18
2.15
2.70
3.29
I . ~ I
1.&5
1.2,
2.10
2.60
3.35
t.02
1.51
9.05
'.OK
?.a5
i.ea
2.m
2.M
i.9,
l.ll
2.73
1.89
z.2,
2.67
i.aa
2 . r ~ 2.ls
2.61
i.8,
2.55
i.ao
2.n
I-**
..*
,905
8.01
I,...
...I
7.92
5.5.
6.6.
5.".
(1.0.
..I9
5.5.
..LL
5.1e
6.2.
5.01
4.8.
L.86
..ox
6.72
3.9.
..h. *..,
1.80 ..ar
..,5
3.5,
L.0.
1.15
,.'C
3.27
3.76
,.ld
,.A*
2.09
1.55
3.""
I...
.w9 11.1. 11.7. v.1, 8.61 7.92 l..? 7.0. 6.e. 8 . 5 6..0 6 . 3 5.85 5.56 5.75 5.LO 6."' 1.7, ..eo
IS ..I 1.07 1.10 ?.re z.ls 2.l. 2.2l 2.LI 2.U Z.os 2.06 2.02 1.97 1.92 I.&? 1.85 1.W l.79 1.76
.95 L
.
% 3.b" 1.Z9 i.nc 2.90 2.7. 2.7) 1.e 2.5e Z.9 2.ra 2.a 2.31 2.75 2.20 2.11 2.1, i.0'
,975 8.20 6.77 *+I5 >..o 1.58 1.11 1.29 >.20 1.Iz >.Oe I.96 ?.a6 2.76 I.*' 2.5q 1.51 ?..a ?.+"
.er ,.as 6.36 I..2 4.19 i.5s r.32 r.lr 6.00 1.W >.eD 1.11 1.57 1.17 1.21 1.11 1.05 2.9s ?.I7
..a5 lb.IO 7.70 6.). 5.~0 5.>1 5.07 L.85 a.67 ..LI 6.42 ..IS r.07 1.81 I.bP 3.51 3.r" 2.17 3.11
.se9 11.5s I,.,. 9.3. ..?5 7.5, s.., 6.7. r..s a.zs 6.08 5.8, <.s* 5.25 ..ss ..no ..a. ..A7 '.,I
I6 .90
.95
,915
.s9
...
1.15
$.,Z
8.53
9
2.67
3.6,
Q.69
6.23
i.ib
3.2.
4.08
5.2e
7.13
,.a,
3.7,
6.77
2.11
2.85
,.% 6.20
+.a.
3.3.
2.18
2.7.
1.11
2.66
3.n
a.03
2.0-
2.5*
3.12
3.8-
2.06
2.5.
3.05
3.78
2.03
?.'9
2.99
3.69
L.99
2..2
2.8-
3.55
1.'.
2.35
?.7?
>.<L
1.'-
2.n
?.fie
3.76
1."'
2.19
7.77
3.10
1."
2.,5
2.5,
3.02
I.78
2.1,
?.a-
Z.93
1.75
Z.06
2.38
2.86
2.72
2.01
2.32
>.75
.9*5 10.5. 1.11 6.30 5.6. 5.21 L.9, 6.49 4.52 ..la 6.17 &.ID 1.92 1.11 3.G. 3.r. 2.33 2.22 3.1,
.w. I.., 1 i0.e. 9.0. 7.9' 1.21 6.8, 6..6 6.1- 5.98 5.8, 5.55 5.27 ..pe '.TO r.5. r.19 r.2, r.o*
I, .90
.e5
,..,
.A5
2.b.
3.59
?..I
1.10
?.,I
7.90
1.m
>.el
*.IS
2.70
?.LO
2.6,
Z.O6
1.55
2.0,
2.69
1.00
2.L5
I.%
I.%
,.PI
P.11
1.M
2.?1
I.*,
I.>%
1.11
2.10
1.75
Z.06
1.71
i.01
1.*9
1.96
,975 6.0. L.62 ..a, ,.cr 1.4. 1
.2
. 3.1. 1.06 2.9. *?
. 2.82 ,.I, 2.6, 7.50 .'.1 2.1s 2.11 2-25
.eo $..O h.LL 5.t8 4.h7 a.3. A.kO 3.93 3.79 3.e8 >.39 3.+b 3.3, 3.16 3.0" 2.W Z."> 2.75 2.65
.995 10.11 7.15 6.16 S.50 5.07 r.7. +.56 L.39 L.25 +.I. 1.97 1.19 1.61 >.&I 3.31 3.28 2.l" 2.79
.ess 15.72 LO.66 7 , 7 . 6 7.02 1.1. 6.11 5.9. 5.15 5.5e 5.12 5.05 ..?(I "'.L 3 , ..I8 '.DL 1.35
I* .s. 3.0, 2.62 1..* 2.7' 2.20 2.11 ?.*a 2.e. 2.00 1,s" 1.93 l.8') 1.8. i.7" 1.15 ,.I? 1.1- 1.66
.e5
.el3
.es
...I
5.9*
*.?9
3.55
6.56
6.0,
3.36
3.-
9.09
?.Q,
3.6,
6.5, a.25
2.7,
,.,a
2.M
3.22
..o,
,.,*
2.58
3.86
2.5,
,.*I
3.7,
2..*
2.w
3.60
2.6,
>.a7
3.5,
2.3.
2.77
3.37
z.27
>,*>
3.23
2.t9
2.56
3.08
2.1,
,..a
2.-2
2.06
2.3"
?.a'
2.02
2.32
2.75
1.97
2.26
Z.b6
,.9z
>,,9
2.57
.PQ~
.ess
t0.21
8r.m
1.21
10.39
6.0,
a,..
5.1, L.~L
6.8,
L.W
s.35
L.L.
6.02
..a
5.76
..I.
5.56
a.01
5.39
,.I*
r.13
I.*.
r .
1.10
~l.5-
1.10
*.la
3.20
r.ts
3.10
r.00
7.~9
1.8.
?.rr
i.r?
9 0 2.9s 2.61 >.LO 7.11 2.111 1.11 2.06 1.W 1.9" 1.96 1.9, I.@* t.81 i.76 1.7, %.TO 1.67 3.63
?.r+
.e.
.s,.
.9e
4.3s
s.*2
8.1-
a.5,
5.9,
,.>%
,.90
5.01
z.qo
3.56
r.50
3.31
+.I7
,.,,
2.63
3.96
2.-
3.05
1.77
z..e
2.96
3.61
?..z
2.m
1.51
2.38
z.a2
3.r3
2.31
2.72
I.>*
2.m
,A?
3.15
2.m
2.5,
3.00
z.07
2.29
2.e.
2.0,
2.3,
2.76
1.9"
z.2,
I.67
1.93
z.z*
i.58
i.ee
z.,,
I..?
.015 10.01 1.09 3.92 5.z7 6.15 r.54 r.3. ..ll a.0. 3.93 3.7s 1.29 0
.
1 1.21 1.1, 3.0" 2.89 2.78
.*s. . . . . L ,o.,s ..>a 7.2. 6.62 a.,o 5.85 5.5. 5.3- 5.21 4.97 ..TO .A, ..I. 3.99 1.8. >.6D 1.51
Table A9. Percentage points of the F-distribution (continued)
ν1: 1 2 3 4 5 6 7 8 9 10 12 15 20 30 40 60 120 ∞
ν2 F
20 .90 2.9, 2.59 1.311 1.)$ ?.I6 2.09 2.0, 1.00 1.96 I.* 1.80 I.". 1.79 I.,. I.,, t.40 1.6. ,.s,
.s5 4.15 3.49 3.10 2.7, z.60 1.51 2.&5 2.19 2.35 2.211 z.10 2.12 ?.OL 1.W !.'5 I.% i.8r
.97? 5.87 ..a6 3.86 1.29 3.13 1.01 2.9, >.a* i.77 2.68 >.51 2.66 2.39 1.29 1.22 2.11 2.09
_s9 (1.10 5.85 6.91 1..' *.LO 3.81 1.10 3.56 3.46 3.37 3.11 1.00 2.0. 2.78 Z.69 2.61 1.52 2.62
,995 9.9. 6.99 5.12 5 . ) ~ s.76 *.*I b.09 i.96 3.85 i.68 1.50 1.32 3.82 I.02 2.92 Z.sl 2.69
_999 s.sS a.10 e..e 6.w 5.69 Z.I* 5 2 1 5.08 6.82 a.56 *.2- ..no 3.16 3.m 1.11 3 . 1 ~
CI .en
.v5
2.96
L.12
2.5,
I.',
2.36
1.67
*.,,
2.e.
2.1,.
2.68
>.OF
2.57
2.02
?.a9
1.9" t.95
2.17
8.W
2.12
,.a,
2.25
1.8,
1.11
1.78
2.10
1.72
2.01
1.6-
1.96
,.fib
1.V
1.52
1.81
h.59
1-91
.P15
.P')
,995
5.C3
a,"?
+.(I
6.12
5.7"
6.89
1.82
.#,
1.73
..,,
9."-
1.25
..o.
r.60
1.09
,."I
*.?s
z.97
3.6.
4.21
1.1I1
3.5,
r.01
2.80
,.a0
3.m
2.73
3.3,
1.17
,.,,
2.6.
1.60
7.51
3.0,
,.r>
?.LZ
2.m
i.>r
I.>!
z.72
3.n~
2.25
2.e.
2.95
2.I.
2.57
2.a.
2.11
2..*
i.7,
2.0.
Z.36
r.n~
,999 $1.59 9.77 ?.PI 6.95 6.32 5.88 5.56 5.11 5.11 L.95 ..I0 ...I ..!I 3.3" 3.7. 1.51 !.at 3.25
ZZ .ea 1.95 1.56 2.35 2.7, 2.13 2.M 2.01 1.97 1.93 i.90 I.Ba l.el 1.76 I.?" 1.67 I.*. 1.60 l.57
-95 a.10 3.L. 1.05 ?.R> 2.66 1.55 2.66 i.LO 2
.
3
' 2.10 2.21 2.15 2.07 1.01 1.9- 1.19 1.W 1.75
7 5.79 L.38 3.78 >.a' 3-22 1.05 2.Ql 2.W Z.76 2.70 2.60 2.50 2.39 2.77 2.21 2.1. 2.0. Z.0"
.99 7.95 5.12 *.a2 r.31 ?.99 1.16 3.59 3.65 3.15 1.16 3.12 2.98 1.81 2.67 2.58 2.50 i.lO 1.1,
.9q5 9.73 &.a1 5.65 5.n? L.61 b.12 b.11 1.W 3.81 1.70 1.5. 1-16 3.18 2.'" 2.W 2.77 Z.11. 2.55
.9Pe IL.38 °.$I 7.90 *..I L.19 5.76 Z.LI 5.19 ..*9 a.11 1.51 ..11 6.06 1.11 1.61 3.LI 1.12 >.IS
23 .iD 2
.
9' 2.55 1.11 2.7) 2.il 2.05 1.W 1.95 1.W 1.89 &.a* i.80 1.7. I.r9 >.as 1.62 1.59 1.55
-95 r.28 1.w 3.01 2.80 >.a1 2.53 2.L1 2.37 1.32 2.27 2.10 2.11 1.05 1.06 I.91 I.86 I.81 1.76
.P15 5.15 L.15 1.15 1.LI i.L8 3.02 2.90 2.81 2.73 P.67 2.57 7.*7 2.16 1.71 2.18 Z.11 i.OI I.*'
-09 7.08 5.66 6.7& .,Z* >.*a 3.7, 3.5+ ,.+I 3.20 3.21 3.07 ?.Q, 2.m ?.M 2.5. 2..5 2.35 2.Z6
,995 9.83 6.73 5.58 4.P5 L.5' 1.26 L.05 1.88 1.75 3.61 1.11 1.1. 1.IZ Z.02 1.82 2.77 >.an &re
.ws I*.** e.*r 7.67 6.60 6.08 5.65 523 5.09 1.8- +.,a *.*a L.E 3.w >.*8 1.53 1.11 1.~2 3.05
2, .so 2.9, Z.S* 1.33 ?.I9 2.b0 2.04 1.98 i.s* 1.9) 8 . M ).el 1.7" 1.73 I..? ,.a& ,.at 1.57 1.53
.95 4.X 3.60 1.01 2.7s 2.w 2.51 I.'I 2.16 1.10 2.2s 2.18 7.18 2.0, 8.9. 1.89 1.1. 1.7" 1-73
,915 5.72 6.12 1.7Z 3.18 3.LS 2.99 2.117 1.71 2.70 2.64 2.51 2.L. 2.31 1.21 1.15 L.03 i.01 1.0.
.W 7.12 5.6) *.11 r.7) 1.90 1.61 1.50 3.16 1.2e 1.11 3.01 7.89 1.1. 2.58 2.69 2.L. ?.It i.LI
.Q95 9.55 6.SLi 5.52 *.19 L.L9 L.?O 3.99 1.8, 1.69 3.59 1.w 3.25 3.0. 2.ll 2.11 2.66 2.55 L.*>
.P99 i4.01 9.W 1.$5 6.59 5.98 5.55 5.23 L.99 L.80 I.6. L.1' L.ll 3.87 3.59 3.65 >.?* >.lc 2.97
25 .90 2.91 1.51 2.11 1.11 1.09 1.01 1.91 1.91 L.89 1.81 1.82 1.77 1.72 1.66 1.63 1.59 1.11 I.5L
.s6 6.2. 3.3s 2.99 2.76 2.~0 *.r9 I.*a 2.14 2.2e 1.11 2.1. p.09 2.01 1.92 1.87 I..? 1.77 1.n
,912 5.69 6.29 1.69 3.75 1.13 1.97 ?.a5 2.75 Z.68 2.61 1.11 ?..I 2.30 2.18 2.22 ?.a7 i.e. L.91
.e* 7.77 5.57 ..68 *.,a 3.85 3.63 3.*6 3.32 3.22 3.13 2.99 ?.a5 2.70 t.5. 2.6% 2.38 2.27 z.,,
,995 9.W &.SO 5.L6 6.R. L.*l L.15 3.9. 3.6. 3.5. 3.37 3.20 3.0, r.li 2.72 2.61 2.5- >.la
.l)w 11.e" 9.22 I.rr s.rq 5.8a s.rs 5.15 4.91 ..TI r.56 r.>t ..os 3.79 1.51 1.17 3.22 3.0- r.3.
16 .90 2.91 1.11 1.11 2.>7 2.08 2.0, 1.96 1.92 1.88 I... 8.08 1.76 1.11 1.65 1.h1 1.51 1.5b 1.50
.95 6.2, 1.17 2.98 1.1. 1.59 1
.2 2.3') 2.12 2.27 1.12 2.15 2.07 I.99 1.90 1.85
.el$
.se
3.66
7.72
..n
5.53
3.6, 1.11
..I.
3.10 2.9, 2.81 2.73 2.65 2.59 2.49 .?.m
,* 2.r. 2.36 z.oe
I.10
2.03 1.9~
1.75 1.6-
i.a8
,995 *..I e.11
6.6.
5.11
3.82 3.59 3.62 3.29 3.18 3.09 2.96 z.66 2.50 2.62 2.3, 2.2, z.~,
4.V *+>I( ..I0 I.89 3.73 3.60 1.M 1.11 1.15 2.97 i.77 1.67 2.56 Z.L5 2.3,
.w 13.7. 9.12 7.16 6.61 5.m 5.38 5.07 4.8, r.sr +.+a 4.2. 3.99 3.12 I... 1.30 3.15 2.9s r.az
27 .90 2.90 2.3, 1.10 ?.I7 2.07 2.00 1.93 i.9, 1.87 1.115 1.110 1.75 1.70 8.6. 1.60 1.57 l.sl I...
.es 6.21 3.s 2.w 1.71 2.51 1.w 2.37 2.u 1.25 2.20 1.1, p.06 1.9, 3.0. 1. 1.79 1.7, ,.&,
,975 5.63 L.2. 1.65 >.?I 3.08 2.92 2.80 2.11 2A3 1.57 1..1 2.?4 1.25 2.8, 2.07 2.0. 1.93 i.ar
.w 1.a 5.as L.~O 6.1, 1.18 3.5t 3 . 3 ~ 3.26 1.15 3.06 2.9) 2.1" 2.63 2 . r ~ 2.1. 2.29 2.20 z.10
.sP5 9.3. 6.** 5.36 ..I& ..,a L.06 1.85 1.69 1.56 >.a5 1.11 1.11 2.Vl 2.73 2.63 2.52 2.r) L.Zs
.99* 33.61 9.02 7.27 6.13 5.73 5.31 5.00 a.76 a.57 6.68 6.87 >.V2 3.66 3.38 3.23 3.0s 2.92 2.7s
b28 .so
.s~
.s15
9, 1
2.es
..IO
5.~1
2
2.50
1.3.
6.22
3
1.29
2..5
1.61
r
1.~6
1.1,
3.99
5
?.Oh
1.56
1.06
6
2.00
2.15
i.90
I
1.91
2.36
2.78
II
1.90
1.29
2.6Q
9
1.87
LO
I.*&
i . ~ +I.,?
Z.al 1.55
I2
I.7-
2.12
1.45
15
I.7L
1.0'
2.36
ZO
1.69
1.96
Z.23
1s
1.61
1.a7
?.I1
*O
I.%
1.82
2.05
0
1.56
1.7r
,.Pa
120
1.52
1.1,
1.91
>.'a
1.65
1.11
a
.sq 1.6. 5.~5 e.11 *.01 3.75 1.51 1.36 1.21 3.12 1.01 Z.90 7.71 2.60 ?.a4 2.15 Z.ZL 2.17 2.06
.9e5 9.2. .
.
LL 1.1. *.TO b.10 *.oi 1.81 3.~5 1.52 1.~1 1.15 2.01 r.ae 2.w 1.59 2.68 2.n r.zs
,999 11.50 8.9, 7.19 6.25 5.aa r.2r e.9, ..as r.50 *.,s r.,, ,.a* 3.60 3.37 >.,a 3.02 2.86 i.63
29 .')O
.P.
1.89
..I8
2.50
1.11
2.2.
>.a3
,.,'
2.70
2.06
1.55
1.99
1.63
1.9,
p.35
1.89
1.28
1.m
1.12
1.8,
?.nu
1.7s
2.10
I.,,
>.O?
,.he
>.PI
$.*I
I.s5
i.21
1.81
1.15
1.75
1.51
i.70
,..,
I.*.
,975 5.59 6.20 ).a1 1.27 1.01 2.w 2.76 1.67 1.19 1.51 1.*1 ?.ll 2.21 2.09 2.01 1.96 i.89 1-SI
.P 7.e0 5.e. s.51 *.n* 3.71 3.50 1.33 3.20 3.0- 3.00 2.87 7.7, 2.51 >..I 1.11 1.71 2.1. 2.01
...5 9.13 6.ao 3.18 *.&c L.26 1.9C 1.11 3.6, >.La 1.18 3.2, 3.0. 2.M 2.66 1.50 2.L5 1.11 2.21
.99.) 11.19 8.15 1.11 6.19 5.19 5.ll L.87 a.6' L.LS *.Iq L.F l.an i.3 >.El 3.11 2.q7 2.8, ?.a*
30 .eo 2.m z..9 2.a ,.I' 2.05 1.9- t.9, ,.an 1.s ,.ez 1.7, L?,. ,.67 1.6, 1.57 1.5' b.50 1.66
.PS
.*,?
.ss
a,,,
5.5,
7.56
1.12
..IS
5.39
i.02
3.59
<.5,
,.L9
2.75
..OF
2.53
3.0,
3.70
Z.'i
2.8,
,..7
2.11
2.75
3.m
?.<I
2.65
3.,7
2.2,
2.57
3.07
2.16
2.5,
2.98
i.OP
z..,
2.8..
p.01
2.3,
>.TO
i.91
2.70
2.55 2.m
1.8.
,.a7
1.19
z.0,
2.30
,.%
1.11
2.2,
3.69
1.87
Z.,&
i.62
L.79
*."L
5 9.18 6.15 1.1- r.62 L.23 3.9% 1.11 1.51 1.L5 3.3. 1.18 3.01 2.8: ?.L1 1.52 2.*? 2.30 2.18
,993 13.21) 8.71 7.05 6.11 5.51 5.12 1.81 r.58 i.37 '
2
.
L ..OD 1.75 I.*9 3.Z 1.07 1.'2 2.76 i.59
LO .PI 2.w 1.11 2.73 2.00 2.00 i.9: 1.87 1.83 1.79 1.76 1.71 1.66 1.6, I.?* i.li I.*? L.+? 1.29
.P5 1.08 3.29 2.sC 2.6, 2.*1 1.11 2.25 2.11 2.12 1.011 1.00 I.9Z 1.8. I.11 1.69 i.6. 5.58 1.5,
+m 2 . a ~ ..)I >.LO 1.~1 2.90 2.7. ?.a* 2.53 2.15 1.39 2.19 2.1. 1.07 1.9. i.88 1.30 l . 7 ~ #.re
.Ps 1.11 5.ie L.31 ,.a3 1.51 3.29 1.11 2.99 2.8- I.-* Z.66 7.5) 1.11 2.m 2.)) i.aZ i.?2 1.80
.9e5
.*Pa
a.81
11.61
s.01
8.15
L.98
6.60
r.37
5.70
3.9-
5.33
3.71
6.73 *..'
1.51 1.15
L.2,
1.22
s.01
>.I2
1.117
?.Q5
1.6s
2.71
>.LO
2.00
3.15
i.rO
2.41
2.10
2.71
2.1-
2.51
L.UL
L.ll
t.93
L.L1
60 _PO 2.79 2.3- ?.,a 2.e. ,.95 ,.a7 ,.a? 3.7. 1.7. 1.7, ,.a* 1.6" 1.5. ,..R I.&* ,.a0 2.35 I.??
.*5 *.so 3.,5 Z.76 2.5, 2.37 2.25 *.,r 2.10 2.0. 1.99 1.32 ,.as 1.7- 1.65 1.59 t.5, 2.-7 1.w
.915 3.29 3.93 1.N 1.0, 2.1' 2.63 2.51 ?..I 2.13 2.27 1.11 2.06 1.'. i.12 1.11 1.61 1.53 1-66
.es 7.0" ..?a ..,3 3.65 3.3. 3.,2 2.97 2.82 2.72 2.63 Z.5" 9.35 2.20 7.0, t.9. 1.8. L.73 1.w
,995 8.'') 5.19 L.71 +_!a 1.76 3.W 1.23 1.11 2.01 2.90 2.7. 1.57 1.19 7.19 2.01 I_% ,.XI I.&'
,999 Lx.97 7.76 6.>7 5.31 a.76 *.>7 a.0- >.e9 >.% 3 . 3 1 3.08 2.83 2.q5 ?.st ?.25 Z.08 1.m
120 .en 2.7% 2.35 2.1, b.99 1.m L.*? 2.77 >.7? ,.6* 1.65 1.60 1.55 ,.*8 ,.L. 1.3, 1.32 ,.a 1.19
.er 3.5z ,.@7 2.m 2.'5 2.a 2.t7 2.09 2.02 1.96 I.*, ,.a, t.75 1.M z.75 ,.$O I.', b.35 1.25
.SIS 5.15 >.a0 3.23 ?.19 ?.b1 2.51 2
.
3
' 2.30 2.22 2.16 1.05 i.W i.sZ 1.6' 1.61 1.51 %..I i.Ji
.se 6.85 r.79 3 . e ~ i.ra 1.17 2.9s 2.79 i.ar 2.56 2.rr 2.1. ,.I* 2.03 1.r6 1.76 1.66 1.5, 1.38
,995 s.18 3.5. 6.50 3.w 1.55 I.2L 3.09 2.93 2.81 1.71 2.5. 1.17 2.19 1.91 1.87 1.75 I.LI I.*>
.W9 kt.>* 7.32 3.79 +.s5 ..+* ..Ok 3.77 3.S 3.>8 3.26 >.OZ ?.,a z.53 2.26 2.11 L.+< 1.76 >.5+
9) .90 2.71 1.30 2.011 1.W 1.85 1.77 1.12 1.67 1.61 1.60 I.15 >.a9 I.LZ I.,& 1.10 8.1. 1.17 1.00
.91 1.W 3.00 2.60 7.37 P.21 2.10 2.0, i.9. 1.88 i.sl i.75 i.67 1.57 1.r- 1.1- 1.11 1.11 l.0"
,975 5.01 3.6- 1.11 2.79 2.Sl 1.*1 2.29 2.19 1.11 2.05 l_Or 1.113 1.11 1.97 !.A11 1.1- 1.27 1.00
.9s 6.61 *.61 3.18 1.17 1.02 ?.*a 2.66 2.51 ?A1 2.11 2.i8 ?.O1 1.81 1.76 1.59 1.67 1.32 i.00
.ees ?.an 5.30 1.2s 3.12 1.35 3.09 2.90 P
.
7
. 2.61 1.5Z 1.16 2.19 2.00 1.7- 1.61 1.51 1.36 1.00
.PW 1O.e) 6.9, 5.*1 L.61 ..I0 I.?. > . + I 1.21 1.10 2.W 1.11 1.51 2.27 1.99 1.8. 8.66 1.+5 3.00
The table gives critical values $r_\alpha$ of the run statistic $r$ with probability distribution $p(r)$ defined in Sect. 14.6.2 for sample sizes $n$, $m$ up to 15 ($n \le m$). The table assumes a one-sided test of significance $\alpha$ with critical region in the lower tail of the probability distribution.

Table A10. Percentage points of the Kolmogorov-Smirnov statistic*
$P(D_n < d_\alpha) = 1 - \alpha$,
where the Kolmogorov-Smirnov test statistic $D_n$ for the sample of size $n$ is the largest deviation between the observed cumulative distribution and the theoretical cumulative distribution.
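The Kolmogorov-Smirnov statistic described here, the largest deviation between the observed and the theoretical cumulative distribution, can be computed directly from a sample. The following Python sketch is one possible implementation under the assumption that the observed cumulative distribution is the usual step function rising by 1/n at each ordered observation; the names `ks_statistic`, `sample` and `cdf` are hypothetical and not taken from the text.

```python
def ks_statistic(sample, cdf):
    """Largest deviation between the empirical step function and cdf.

    sample: iterable of observations
    cdf:    callable returning the theoretical cumulative distribution
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The step function jumps from i/n to (i+1)/n at x, so the
        # deviation must be checked on both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


if __name__ == "__main__":
    # Three observations tested against a uniform distribution on [0, 1].
    print(ks_statistic([0.7, 0.1, 0.4], lambda x: x))
```

The resulting value would then be compared with the tabulated critical value for the same sample size n and the chosen significance.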
n=, n- R
m a.ml oms a.080 0.025 oar 0.10 1W owl 000s +at0 OQ15 005 0.10 2 8 m
I, i l 111
I!, 5, I,"
5% in i4i
i, iil 16,
$0 "I llil
D (il in8
81 lili 175
6% li!) 182
Bi 71 In!+
A, 7, ID"
n = 9 n = lo
I. O.WI DWS (1.010 0.02s e.05 0.10 2W orat w r wlo 0.028 oar o.loaR m
1 SS 60 39 82 80 70 175
10 33 68 81 ni 69 13 I80 85 rl il 78 8l 87 210 I0
II fii 01 13 88 !? ?R 18% 07 ra )r 81 ao n t a m II
II (13 "6 71 ,., RO 198 68 m 79 a4 8s 8 i nro 11
IS 69 06 OR 13 70 83 PO7 72 18 IIP 88 9% 08 240 I3
I4 00 A7 I! 16 RI 86 P l l l 74 RI 85 DI 90 101 250 14
I5 82 8" 73 10 84 "0 821, 78 84 88 4 DO I" ROO 15
EADIE, W.T., DRIJARD, D., JAMES, F.E., ROOS, M., and SADOULET, B.,
Statistical Methods in Experimental Physics,
North-Holland Publishing Company, Amsterdam-London, 1971.
JANOSSY, L., Theory and Practice for the Evaluation of Measurements,
Oxford at the Clarendon Press, Oxford University Press, 1965.
JOHNSON, N.L. and LEONE, F.C.,
Statistics and Experimental Design in Engineering and the Physical Sciences, Volume I,
John Wiley & Sons, Inc., New York-London-Sydney, 1964.
MARTIN, B.R., Statistics for Physicists,
Academic Press, London-New York, 1971.
PARRATT, L.G., Probability and Experimental Errors in Science,
John Wiley & Sons, Inc., New York-London, 1961.
WINE, R.L., Statistics for Scientists and Engineers,
Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1964.

Statistical tables

FISHER, R.A. and YATES, F.,
Statistical Tables for Biological, Agricultural and Medical Research,
Oliver and Boyd, Edinburgh, 1938.
MILLER, L.H., Table of percentage points of Kolmogorov statistics, J. Amer. Statist. Ass. 51, 111 (1956).
OWEN, D.B., Handbook of Statistical Tables,
Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1962.
RESNIKOFF, G.J. and LIEBERMAN, G.J.,
Tables of the non-central t-distribution,
Stanford University Press, 1957.
VERDOOREN, L.R., Extended tables of critical values for Wilcoxon's test statistic, Biometrika 50, 177 (1963).
PEARSON, E.S. and HARTLEY, H.O. (editors),
Biometrika Tables for Statisticians, Volumes I and II,
Cambridge, 1970 and 1972.

Articles, reports, and lecture notes on statistical estimation in particle physics

ANNIS, M., CHESTON, W., and PRIMAKOFF, H., On statistical estimation in physics, Rev. Mod. Phys. 25, 818 (1953).
Gordon and Breach Science Publishers, New York-London-Paris, 1970.
KNOP, R.E., Errors in estimation of total events, Rev. Sci. Instrum. 5, 1518 (1971).
OREAR, J., Notes on statistics for physicists, UCRL-8417 (1958).
ROSENFELD, A.H. and HUMPHREY, W.E., Analysis of bubble chamber data, Ann. Rev. Nucl. Sci. 13, 103 (1963).
SHEPPEY, G.C., Minimization and curve fitting, in Programming Techniques, CERN 68-5 (1968).
SOLMITZ, F., Analysis of experiments in particle physics, Ann. Rev. Nucl. Sci. 14, 375 (1964).

Articles referred to in the text

ADAIR, R.K. and KASHA, H., Analysis of some results of quark searches, Phys. Rev. Letters 23, 1355 (1969).
BARTLETT, M.S., On the statistical estimation of mean life-times, Phil. Mag. 14, 249 (1953), and Estimation of mean lifetimes from multiplate cloud chamber tracks, Phil. Mag. 5, 1407 (1953).
DAVID, F.N., A $\chi^2$ "smooth" test for goodness-of-fit, Biometrika 34, 299 (1947).
DAVIDON, W.C., Variance algorithm for minimization, Comp. J. 10, 606 (1960).
DURBIN, J., Kolmogorov-Smirnov tests when parameters are estimated with applications to tests of exponentiality and tests on spacings, Biometrika 9, 5 (1975).
EVANS, D.A. and BARKAS, W.H., Exact treatment of search statistics, Nucl. Inst. Meth. 56, 289 (1967).
GELFAND, I.M. and TSETLIN, M.L., The principles of nonlocal search in automatic optimization systems, Soviet Phys. Dokl. 6, 192 (1961).
GOLDSTEIN, A.A. and PRICE, J.F., On descent from local minima, Math. Comput. 25, 569 (1971).
KRUSKAL, W.H. and WALLIS, W.A., The use of ranks in one-criterion variance analysis, J. Amer. Statist. Ass. 5, 583 (1952).
NELDER, J.A. and MEAD, R., A simplex method for function minimization, Comp. J. 7, 348 (1965).
PARTICLE DATA GROUP, Review of particle properties, Rev. Mod. Phys. 9, No. 2, Part II, April 1976.
REINES, F., GURR, H.S., and SOBEL, H.W., Detection of $\bar{\nu}_e$-e scattering, Phys. Rev. Letters 37, 315 (1976).
ROSENBROCK, H., An automatic method for finding the greatest or least value of a function, Comp. J. 2, 175 (1960).
WILCOXON, F., Individual comparisons by ranking methods, Biometrics Bulletin 1, 80 (1945).
nln- :rll,nl xelt r s e . . O B 21,. 31, r,,
, a , * a m r ,.,#K,h,T,",.
6akA"r
.'LLI*I..... ,,k
I,).
,h
Il *om* r ,.T",r.T,o*
201.202
,pe-?or.
rr~owr.nal.-u. 91
I 115~119 GLflllll I 8 l : i r l i i l l l T l O N 9 5 YR
oewm ?5.?8 BfiU!i'i I I ) Y k O Y 1HLLlllfM 267-268
..,,
.......................
.",or
O F H C E A I COHI OLINII IPOlSliOU BR lihll<iSlnN i N O H M a l ) R h N O O M NUfiIIER OCNERfiTOH IL),,T1.*(ilA* n 6. l l * *T,,,O" ,801 -0,. lo? r1.l
O E N t l L I IZCO HII't:UliEOMI:TUII: 74 11.1 1 1 4 .*XOT @.4"illr,, 6, 61
OFONCIHIC bl?~>O liQ1151.1"N I I I E T I I I I ~ T I O N PSEE NORHI)I DISTRIBLITIOH L*5T-50,allcs L3llnnll"" nr "C*"
""IShEXION, "TI",. 7s O I Y I R " I LOnPOllHD I U I S S O H DISTRIBUTION eB <,*,TIN8 " I F 1 I I I B u T I D H OF 111
,~"I.EROFIIH,.I",C 70 CltWII*, 1 2 TCST "I.i . ,mant l , "S ,,a :a, OF 1 " " C T I " N 33
""I T I U D n l a l 7 2 731 86-18, L ' l O L l F B U I FHFEOOM I N a22 1 0 . a 1 1 2 1 1 1 ~.es
~ :v*. ir. nb Y * N K , O ~ YanlaeLE 33
N U L T I H O R H A I 120 173 I O H C O N F I I I I I S O N 0, +11ErO.R.N9 455-457 5111a10(11 L L W I a L a I I 1 T I S : 6 2 ? 6 I . 2 ) . ? I ? B A I b L t ss
UEOAIIYE 8IHONIOI. 7 0 . 9 0 POR OL1011HL59-0#-1 1 T 4 2 1 4 1 4 .r191 -11, a*,- ,,I Y t r R H r C O l e e 2 0 0 , 21s
NON-CFNTPIL CNI SOLIeRE 129. 119 FOR l N o r P # NIICHCE 119-433 C", S C I a L , "10*6*1 1 1 1 - 8 :n" I I r E T I m E . EEL EXPONENTIAL OISTRlBUTIOW
HDN-CZUTT16L r 1 4 5 . 14s IiFUIRhI IIED b I I P F l D E O N E T I I I C I I r S T R I O U T I O H 7. MEOil)N 31
.................
I I G L T I . 111 1 W C l l l U. : 0 1 . . * 1 . 111 1 1 ,
NOH-CENTRAL T 1 4 0 , 1 1 4 bbNI:WL 12tTe i ikCIII4ODrm lUNCTlOH 249 ,2111.110L Or ,LA. .n2,n*r, '1 I)' a\ r S r I M l T 0 R OF W I I N I N HORHIIL I ) I S T I I I B U T I O N
NOR**. LO1 1 1 4 '.LO*,,",,: "lsrli1X"TlDH 69.70
POISSOH l i ~ 7 8 .8181. ?R 611110N193 D F ~ l l T2 8 8 - 2 8 7 . 142-113 ,*,k ,$.I * , A . L , n , . l . I , . , , , a,:,., I
5 T I N 0 ( 1 H D MOHMAL 101-103 O(1OIIUTliS O F - F I T T T S T S 3771 1 1 4 - 4 2 8
E T l , l l E U i ~ S T 140-1.4 5, Nl.hhL 1 ' 421-121
UWIFOUI B S ~ P I hUI.1(OBOROV-~WlHNOV 4 1 4 - 4 2 8
1 148.149 I'EOUSONS X 415-471. 42". 4.4-4.B
I l U U D L i CXI.O"FNTIOL O l s T l l l B U T l O N 124 m&IsLrNr ~ L ~ H D D357 S ,as
0fiAFHICl)L FROCCDUHC 10H P.RMTTER ESTrMeTIOH
Ilb-JI2
I R O N LIKELIHOOD V l l U C T l O N 721-228. 212-216.
:,.,*-24. NOH-L I W E I R N O E L 275-281
FRO8 X ) rUNCTlOW l l d ~ l l s NOH-LIHr(lR "DOEL Y I T H CONSTRbIHTS 307-316
I SIMPLIFIED 261. 2 V S - 1 9 4
UWYEIOHTED 2 6 0
LIhST-SOUARES P R I N C r P I E 259-260
.. ..
" O l l l T I T D I.,
.I .. \I, L,1",! 110" llll)L5F0n*
LllFLlHOOD nnmcrr ;r*raarl*o r r r r r t o r is
OF CVLHT 26 *"*T*1. "'1.a"" i2l-lrl
OF OBBERVATIDWS 1BO
L l X E L l H O D O COUhTION 197
LIKELIHOOD FUNCTION LBO. 1%
FOP CLASSIFIED DeT1 2.9--251
TOIIIw1.Ii
O,P.,kb
<,-' . ,.,
L,.,
t<ITklnl).lL
, .>
I?, $ 2 7
111
R F I . 8 T I O H TO NSBI)TIYI:
PB
IINOMIhI O l i i l H l B U T l O N ..
., , , -.
A s 9UPPLEMFNT TO P E a R I O N ' S XI TEST 1 1 4 - 4 4 8
P O L I ) R I I I ) I I D N IIB-217. 2 9 5 - 2 7 6 . 318-.1?9. FOR CONSISTI:NCI BETWEEN TYO 51)IPLES 418.142
355z.5 T O R R 4 H D O H W S S U l T H I N ONE SllnPLE 412-4.1
POLINOeIaL r l T T l H B 2 6 2 - 2 6 3 , 2 7 2 7 / 5
P0PULI)TIOH 58
P O S T E R I O R Y R O B l B l l l T I 26
POWER FUNCTIOW 393 nrnN se
PDYER OF TEST 380 P'RllPERTlES FOR HORI1I)L VIRrbBLES 1 0 9
PRECISION OF OBSE*VIIIO" 200. 2 6 0 , 262 SllL 5 8
PrtlDI1 PROBABILITY 26 V&RlANCE 59
PI1OB(IBILITI SEE $LEO HORHI)L 9 W P I . E
ADDITION RUI.E I I S***lC STICE 8
EOHDlTlOHlL , , - I , SOMI-LING DISTRIBUTIONS 121--150
D E F I N I T I O N OF 7-B I C e L E F 1 C T D R 11,
*(IROIHL(L 2,
NULTIPLICITIOW RULE 15-86 .. b r A N H I U B ErrlCII:"C"
SLaTTERI.I.OTS 4 3 - 4 6
10-22. bB-b?, 222.125