
PROBABILITY AND

STATISTICS
IN PARTICLE PHYSICS

A. G. Frodesen, O. Skjeggestad
DEPARTMENT OF PHYSICS
UNIVERSITY OF BERGEN

H. Tøfte
DEPARTMENT OF COMPUTING SCIENCE
AGDER REGIONAL COLLEGE, KRISTIANSAND

UNIVERSITETSFORLAGET
BERGEN - OSLO - TROMSØ
Distribution offices:

NORWAY
Universitetsforlaget
Box 2977 Tøyen
Oslo 6

UNITED KINGDOM
Global Books Resources Ltd
109 Great Russell Street
London WC1B 3ND

UNITED STATES and CANADA
Columbia University Press
136 South Broadway
Irvington-on-Hudson
New York 10533

Reprinted with Permission from Columbia University Press.
PROBABILITY AND STATISTICS IN PARTICLE PHYSICS by
Frodesen, Skjeggestad, and Tøfte, 1979.
Copyright 1979 by Columbia University Press.

Printed in Norway by
Reklametrykk A.s, Bergen

Preface

The present book on probability theory and statistics is intended for
graduate students and research workers in experimental high energy and elementary
particle physics. The book has originated from the authors' attempts during many
years to provide themselves and their students working for a degree in experimental
particle physics with practical knowledge of statistical analysis methods and
some further insight required for research in this field.

The first drafting of notes started more than ten years ago when the
authors were colleagues at the University of Oslo, and working with bubble chamber
experiments. At that time no textbook in statistics was known to us which
took its examples and applications from high energy physics and could serve as a
reference book in daily work and a suitable curriculum for our students. Several
advanced books were available in the library which discussed the fundamentals of
probability theory and mathematical statistics per se [e.g. Cramér, Kendall and
Stuart]. Other, less demanding textbooks incorporated useful examples from many
fields, including physics, and had the virtue of acquainting a wider scientific
community with the universal methods devised by the science of statistics [Fisher,
Johnson and Leone, Ostle, Sverdrup, Wine]. Also available were articles and lecture
notes discussing statistical estimation in physics in general and in high
energy physics in particular [Annis et al., Böck, Hudson, Jauneau and Morellet,
Orear, Solmitz]. The need for more coherent presentations apparently was latent
and brought on the market a systematic, relatively theoretical account of statistics
as used by physicists [Martin, 1971], as well as two treatises written by
experimental particle physicists. The latter authors, however, either intended
their book "for students and research workers in science, medicine, engineering
and economics" [Brandt, 1970], or addressed their course "to physicists (and
experimenters in related sciences) in their task of extracting information from
experimental data" [Eadie et al., 1971].

The present text has been written for readers who are supposed to have
their main interest in elementary particle physics and who have a need for
statistical methods as a tool in their work. This, of course, does not mean that the
book can only be comprehended by people whose background is particle physics.
However, it is only fair to state that it is a rather specialized book, in which
the topics discussed, the disposition and style reflect the need of an experimental
particle physicist, and in which examples and applications have been almost
exclusively chosen from this field. This fact undoubtedly limits the usefulness
of the book to readers from other branches of physics. On the other hand, with
the high degree of specialization within the physical and other sciences today,
it is, in the opinion of the authors, well worth-while to aim at a more
restricted group of readers and to tailor the presentation to meet the specific
demands of this group. After all, the statistical methods needed in many disciplines
are more or less standard and available in excellent general presentations
of mathematical statistics. Often, however, the task of extracting the relevant
information from these books can be both time-consuming and troublesome for the
non-specialist. Stories are told about physicists who have spent months of their
time developing new methods for data analysis, only to find out later that such
methods were already described in the statistical literature. A dedicated book
like the present can hopefully serve to reduce such instances of wasted time and
effort.

The book assumes no previous knowledge beyond basic calculus. The subject
of probability theory is entered on an elementary level and given a rather
simple and detailed exposition; this is thought to be to the benefit of the student
who starts a new course and should get well acquainted with the most common
probability concepts and distributions before entering the domain of statistics.
The book has been written so that it need not be worked through as a regular
course, with extensive reading from the very beginning, but can be studied
chapter- or section-wise, should the reader prefer so. The material has, in fact,
been organized with an eye to the experimental physicist's practical need, which
is likely to be statistical methods for estimation or decision-making. Since an
established physicist will probably possess a sufficient background on the
fundamentals of probability theory, he can be recommended to start his reading
directly at the chapters he is interested in. Cross references are given to indicate
where definitions and developed formulae can be found in earlier chapters of the
book. It is hoped that the book will prove useful in everyday work. For this
purpose the list of contents and the subject index have been made to include
physics key words as well as statistical terms to facilitate the use of the book
as a reference manual. To make it self-contained, the book has also been supplied
with a set of statistical tables in an appendix.

With its emphasis on the various practical aspects of statistical
methods and techniques the presentation in this book differs from the general,
more theoretical points of view shared by statisticians. As non-professionals
in statistics the authors make no claim to originality on the subject. We have
felt free to borrow material which, over the years, has acquired a status of
"common property" among particle physicists, without mentioning originators or
written sources. Our reference policy is otherwise to give only the names of
authors in the text where examples have been taken from articles in physics
publications, lecture notes etc., and to give the full reference in the bibliography
at the end of the book. The bibliography also contains references to
textbooks which can be suggested as alternative and further reading.

We would like to express explicitly our indebtedness to M.G. Kendall
and A. Stuart, the authors of the three-volume work "The Advanced Theory of
Statistics" which we have constantly consulted and found to contain answers to
any question.

We are also indebted to Addison-Wesley Publishing Company, Inc., for
permission to use material from Table 15.1 in Handbook of Statistical Tables
by D. Owen, and to the Biometrika Trustees for permission to reproduce Table 1
from L.R. Verdooren's paper "Extended tables of critical values for Wilcoxon's
test statistic", printed in Biometrika.

It is a pleasure to acknowledge the useful help of our many students
who over the years contributed their comments on the course. Finally, we wish
to thank Mrs. Laila Nest for her patient and careful cooperation with the typing
of the manuscript.

December 1978                                        A.G.F., O.S., H.T.
I often say
that when you can measure
what you are speaking about,
and express it in numbers,
you know something about it;
but when you cannot measure it,
when you cannot express it in numbers,
your knowledge is
of a meagre and unsatisfactory kind.

Lord Kelvin 1883

Contents

1 INTRODUCTION

2 PROBABILITY AND STATISTICS
  2.1 Definition of probability
  2.2 Random variables. Sample space
  2.3 Calculus of probabilities
    2.3.1 Definitions
    2.3.2 Example: Topologies of bubble chamber events (1)
    2.3.3 Addition rule
    2.3.4 Conditional probability
    2.3.5 Example: K'P scattering cross section
    2.3.6 Independence; multiplication rule
    2.3.7 Example: Relay networks
    2.3.8 Example: Efficiency of a Čerenkov counter
    2.3.9 Example: π⁰ detection
    2.3.10 Example: Beam contamination and δ-rays
    2.3.11 Example: Scanning efficiency (1)
    2.3.12 Marginal probability
    2.3.13 Example: Topologies of bubble chamber events (2)
  2.4 Bayes' Theorem
    2.4.1 Statement and proof
    2.4.2 Example
    2.4.3 Comments
    2.4.4 Example: Betting odds
    2.4.5 Bayes' Postulate

3 GENERAL PROPERTIES OF PROBABILITY DISTRIBUTIONS
  3.1 The probability density function
  3.2 The cumulative distribution function
  3.3 Properties of the probability density function
    3.3.1 Expectation values of a function
    3.3.2 Mean value and variance of a random variable
    3.3.3 General moments
  3.4 The characteristic function
  3.5 Distributions of more than one random variable
    3.5.1 The joint probability density function
    3.5.2 Expectation values
    3.5.3 The covariance matrix; correlation coefficients
    3.5.4 Independent variables
    3.5.5 Marginal and conditional distributions
    3.5.6 Example: Scatterplots of kinematic variables
    3.5.7 The joint characteristic function
  3.6 Linear functions of random variables
    3.6.1 Example: Arithmetic mean of independent variables
          with the same mean and variance
  3.7 Change of variables
    3.7.1 Example: Dalitz plot variables
  3.8 Propagation of errors
    3.8.1 A single function
    3.8.2 Example: Variance of arithmetic mean
    3.8.3 Several functions; matrix notation
  3.9 Discrete probability distributions
    3.9.1 Modification of formulae
    3.9.2 The probability generating function
  3.10 Sampling
    3.10.1 Universe and sample
    3.10.2 Sample properties
    3.10.3 Inferences from the sample
    3.10.4 The Law of Large Numbers

4 SPECIAL PROBABILITY DISTRIBUTIONS
  4.1 The binomial distribution
    4.1.1 Definition and properties
    4.1.2 Example: Histogramming events (1)
    4.1.3 Example: Scanning efficiency (2)
  4.2 The multinomial distribution
    4.2.1 Definition and properties
    4.2.2 Example: Histogramming events (2)
  4.3 The Poisson distribution
    4.3.1 Definition and properties
    4.3.2 The Poisson assumptions
          Example: Bubbles along a track in a bubble chamber
    4.3.3 Example: Radioactive emissions
  4.4 Relationships between the Poisson and other probability
      distributions
    4.4.1 Example: Distribution of counts from an inefficient counter
    4.4.2 Example: Subdivision of a counting interval
    4.4.3 Relation between binomial and Poisson distributions
          Example: Forward-backward classification
    4.4.4 Relation between multinomial and Poisson distributions
          Example: Histogramming events (3)
    4.4.5 The compound Poisson distribution
          Example: Droplet formation along tracks in cloud chamber
  4.5 The uniform distribution
    4.5.1 The uniform p.d.f.
    4.5.2 Example: Uniform random number generators
  4.6 The exponential distribution
    4.6.1 Definition and properties
    4.6.2 Derivation of the exponential p.d.f. from the Poisson
          assumptions
  4.7 The gamma distribution
    4.7.1 Definition and properties
    4.7.2 Derivation of the gamma p.d.f. from the Poisson assumptions
    4.7.3 Example: On-line processing of batched events
  4.8 The normal, or Gaussian, distribution
    4.8.1 Definition and properties of N(μ,σ²)
    4.8.2 The standard normal distribution N(0,1)
    4.8.3 Probability contents of N(μ,σ²)
    4.8.4 Central moments; the characteristic function
    4.8.5 Addition theorem for normally distributed variables
    4.8.6 Properties of x̄ and s² for sample from N(μ,σ²)
    4.8.7 Example: Position and width of resonance peak
    4.8.8 The Central Limit Theorem
    4.8.9 Example: Gaussian random number generator
  4.9 The binormal distribution
    4.9.1 Definition and properties
    4.9.2 Example: Construction of a binormal random number generator
  4.10 The multinormal distribution
    4.10.1 Definition and properties
    4.10.2 The quadratic form Q
  4.11 The Cauchy, or Breit-Wigner, distribution

5 SAMPLING DISTRIBUTIONS
  5.1 The chi-square distribution
    5.1.1 Definition
    5.1.2 Proof for the chi-square p.d.f.
    5.1.3 Properties of the chi-square distribution
    5.1.4 Probability contents of the chi-square distribution
    5.1.5 Addition theorem for chi-square distributed variables
    5.1.6 Proof that (n-1)s²/σ² for sample from N(μ,σ²) is χ²(n-1)
  5.2 The Student's t-distribution
    5.2.1 Definition
    5.2.2 Proof for the Student's t p.d.f.
    5.2.3 Properties of the Student's t-distribution
    5.2.4 Probability contents of the Student's t-distribution
  5.3 The F-distribution
    5.3.1 Definition
    5.3.2 Proof for the F p.d.f.
    5.3.3 Properties of the F-distribution
    5.3.4 Probability contents of the F-distribution
  5.4 Limiting properties - connection between probability distributions

6 COMPARISON OF EXPERIMENTAL DATA WITH THEORY
  6.1 Rejection of bad measurements
  6.2 Experimental errors on measurements. The resolution function
    6.2.1 Example: Gaussian resolution function and exponential p.d.f.
    6.2.2 Example: Gaussian resolution function and Gaussian p.d.f.
    6.2.3 Example: Breit-Wigner resolution function and Breit-Wigner
          p.d.f.
    6.2.4 Example: Width of a resonance
    6.2.5 Experimental determination of resolution function; ideogram
  6.3 Systematic effects. Detection efficiency
    6.3.1 Example: Truncation of an exponential distribution
    6.3.2 Example: Truncation of a Breit-Wigner distribution
    6.3.3 Correcting for finite geometry - modifying the p.d.f.
    6.3.4 Correcting for unobservable events - weighting of the events
  6.4 Superimposed probability densities
    6.4.1 Example: Particle beam with background
    6.4.2 Example: Resonance peaks in an effective-mass spectrum

7 STATISTICAL INFERENCE FROM NORMAL SAMPLES
  7.1 Definitions
  7.2 Confidence intervals for the mean
    7.2.1 Case with σ² known
    7.2.2 Case with σ² unknown
  7.3 Confidence intervals for the variance
    7.3.1 Case with μ known
    7.3.2 Case with μ unknown
  7.4 Confidence regions for the mean and variance

8 ESTIMATION OF PARAMETERS
  8.1 Definitions
  8.2 Properties of estimators
  8.3 Consistency
  8.4 Unbiassedness
    8.4.1 Example: o z as an estimator of ' 0
    8.4.2 Example: Estimator of the third central moment
  8.5 Minimum variance and efficiency
    8.5.1 Example: Estimator of the mean in the Poisson distribution
    8.5.2 Example: Estimators of the mean in the normal p.d.f.
    8.5.3 Example: Estimators of σ² and σ in the normal p.d.f.
  8.6 Sufficiency
    8.6.1 One-parameter case
    8.6.2 Example: Single sufficient statistics for the normal p.d.f.
    8.6.3 Extension to several parameters
    8.6.4 Example: Jointly sufficient estimators for μ and σ² in
          N(μ,σ²)

9 THE MAXIMUM-LIKELIHOOD METHOD
  9.1 The Maximum-Likelihood Principle
    9.1.1 Example: Estimate of mean lifetime
  9.2 Estimation of parameters in the normal distribution
    9.2.1 Estimation of μ; measurements with common error
    9.2.2 Estimation of μ; measurements with different errors
          (weighted mean)
    9.2.3 Simultaneous estimation of mean and variance
  9.3 Estimation of the location parameter in the Cauchy p.d.f.
  9.4 Properties of Maximum-Likelihood estimators
    9.4.1 Invariance under parameter transformation
    9.4.2 Consistency
    9.4.3 Unbiassedness
    9.4.4 Sufficiency
    9.4.5 Efficiency
    9.4.6 Uniqueness
    9.4.7 Asymptotic normality of ML estimators
    9.4.8 Example: Asymptotic normality of the ML estimator of the
          mean lifetime
  9.5 Variance of Maximum-Likelihood estimators
    9.5.1 General methods for variance estimation
    9.5.2 Example: Variance of the lifetime estimate
    9.5.3 Variance of sufficient ML estimators
    9.5.4 Example: Variance of the weighted mean
    9.5.5 Example: Errors in the ML estimates of μ and σ² in N(μ,σ²)
    9.5.6 Variance of large-sample ML estimators
    9.5.7 Example: Planning of an experiment; (1)
    9.5.8 Example: Planning of an experiment; density matrix
          elements (1)
  9.6 Graphical determination of the Maximum-Likelihood estimate
      and its error
    9.6.1 The one-parameter case
    9.6.2 Example: Scanning efficiency (3)
    9.6.3 The two-parameter case; the covariance ellipse
  9.7 Interval estimation from the likelihood function
    9.7.1 Likelihood intervals, the one-parameter case
    9.7.2 Confidence intervals from the Bartlett functions
    9.7.3 Example: Confidence intervals for the mean lifetime
    9.7.4 Likelihood regions, the two-parameter case
    9.7.5 Example: Likelihood region for μ and σ² in N(μ,σ²)
    9.7.6 Likelihood regions, the multi-parameter case
  9.8 Generalized likelihood function
  9.9 Application of the Maximum-Likelihood method to classified data
  9.10 Combining experiments by the Maximum-Likelihood method
  9.11 Application of the Maximum-Likelihood method to weighted events
  9.12 A case study: an ill-behaved likelihood function

10 THE LEAST-SQUARES METHOD
  10.1 Basis for the Least-Squares method
    10.1.1 The Least-Squares Principle
    10.1.2 Connection between the LS and the ML estimation methods
  10.2 The linear Least-Squares model
    10.2.1 Example: Fitting a straight line (1)
    10.2.2 The normal equations
    10.2.3 Matrix notation
    10.2.4 Properties of the linear LS estimator
    10.2.5 Example: Fitting a parabola
    10.2.6 Example: Combining two experiments
    10.2.7 General polynomial fitting
    10.2.8 Orthogonal polynomials
    10.2.9 Example: Fitting a straight line (2)
  10.3 The non-linear Least-Squares model
    10.3.1 Newton's method
    10.3.2 Example: Helix parameters in track reconstruction
  10.4 Least-Squares fit
    10.4.1 "Improved measurements" (fitted variables) and residuals
    10.4.2 Estimating σ² in the linear model
    10.4.3 The normality assumption; degrees of freedom
    10.4.4 Goodness-of-fit
    10.4.5 Stretch functions, or "pulls"
  10.5 Application of the Least-Squares method to classified data
    10.5.1 Construction of χ²
    10.5.2 Choice of classes
    10.5.3 Example: Polarization of antiprotons (2)
    10.5.4 Example: Angular momentum analysis (1)
  10.6 Application of the Least-Squares method to weighted events
  10.7 Linear Least-Squares estimation with linear constraints
    10.7.1 Example: Angles in a triangle
    10.7.2 Linear LS model with linear constraints;
           Lagrangian multipliers
  10.8 General Least-Squares estimation with constraints
    10.8.1 The iteration procedure
    10.8.2 Example: Kinematic analysis of a V⁰ event (1)
    10.8.3 Calculation of errors
  10.9 Confidence intervals and errors from the χ² function
    10.9.1 Basis for the determination of LS confidence intervals
    10.9.2 LS errors and confidence intervals, the one-parameter case
    10.9.3 LS errors and confidence regions, the multi-parameter case

11 THE METHOD OF MOMENTS
  11.1 Basis for the simple moments method
  11.2 Generalized moments method
    11.2.1 One-parameter case
    11.2.2 Multi-parameter case
    11.2.3 Example: Density matrix elements (2)
  11.3 Moments method with orthonormal functions
    11.3.1 Example: Polarization of antiprotons (3)
    11.3.2 Example: Angular momentum analysis (2)
    11.3.3 Confidence intervals for MM estimates
  11.4 Combining MM estimates from different experiments

12 A SIMPLE CASE STUDY WITH APPLICATION OF DIFFERENT PARAMETER
   ESTIMATION METHODS
  12.1 Simulation of a polarization experiment
  12.2 Application of different estimation methods
    12.2.1 The method of moments
    12.2.2 The Maximum-Likelihood method
    12.2.3 The Maximum-Likelihood method for classified data
    12.2.4 The Least-Squares method
    12.2.5 The simplified Least-Squares method
  12.3 Discussion
    12.3.1 The estimated parameter values and their errors
    12.3.2 Goodness-of-fit

13 MINIMIZATION PROCEDURES
  13.1 General remarks
  13.2 Step methods
    13.2.1 Grid search and random search
    13.2.2 Minimum along a line; the success-failure method
    13.2.3 The coordinate variation method
    13.2.4 The Rosenbrock method
    13.2.5 The simplex method
  13.3 Gradient methods
    13.3.1 Numerical calculation of derivatives
    13.3.2 Method of steepest descent
    13.3.3 The Davidon variance algorithm
  13.4 Multiple minima
  13.5 Evaluation of errors
  13.6 Minimization with constraints
    13.6.1 Elimination of constraints by change of variables
    13.6.2 Penalty functions; Carroll's response surface technique
    13.6.3 Example: Determination of resonance production
  13.7 Concluding remarks

14 HYPOTHESIS TESTING
  14.1 Introductory remarks
  14.2 Outline of general methods
    14.2.1 Example: Separation of one-π⁰ and multi-π⁰ events
    14.2.2 The Neyman-Pearson test for simple hypotheses
    14.2.3 Example: Neyman-Pearson test on the Eo mean lifetime
    14.2.4 The likelihood-ratio test for composite hypotheses
    14.2.5 Example: Likelihood-ratio test on the mean of a
           normal p.d.f.
  14.3 Parametric tests for normal variables
    14.3.1 Tests of mean and variance in N(μ,σ²)
    14.3.2 Comparison of means in two normal distributions
    14.3.3 Comparison of variances in two normal distributions
    14.3.4 Summary table
    14.3.5 Example: Comparison of results from two different
           measuring machines
    14.3.6 Example: Significance of signal above background
    14.3.7 Comparison of means in N normal distributions; scale factor
  14.4 Goodness-of-fit tests
    14.4.1 Pearson's χ² test
    14.4.2 Choice of classes for Pearson's χ² test
    14.4.3 Degrees of freedom in Pearson's χ² test
    14.4.4 General χ² tests for goodness-of-fit
    14.4.5 Example: Kinematic analysis of a V⁰ event (2)
    14.4.6 The Kolmogorov-Smirnov test
    14.4.7 Example: Goodness-of-fit in a small sample
  14.5 Tests of independence
    14.5.1 Two-way classification; contingency tables
    14.5.2 Example: Independence of momentum components
  14.6 Tests of consistency and randomness
    14.6.1 Sign test
    14.6.2 Run test for comparison of two samples
    14.6.3 Example: Consistency between two effective-mass samples
    14.6.4 Run test for checking randomness within one sample
    14.6.5 Example: Time variation of beam momentum
    14.6.6 Run test as a supplement to Pearson's χ² test
    14.6.7 Example: Comparison of experimental histogram and
           theoretical distribution
    14.6.8 Kolmogorov-Smirnov test for comparison of two samples
    14.6.9 Wilcoxon's rank sum test for comparison of two samples
    14.6.10 Example: Consistency test for two sets of measurements
            of the π⁰ lifetime
    14.6.11 Kruskal-Wallis rank test for comparison of several samples
    14.6.12 The χ² test for comparison of histograms

APPENDIX STATISTICAL TABLES
  Table A1 The binomial distribution
  Table A2 The cumulative binomial distribution
  Table A3 The Poisson distribution
  Table A4 The cumulative Poisson distribution
  Table A5 The standard normal probability density function
  Table A6 The cumulative standard normal distribution
  Table A7 Percentage points of the Student's t-distribution
  Table A8 Percentage points of the chi-square distribution
  Table A9 Percentage points of the F-distribution
  Table A10 Percentage points of the Kolmogorov-Smirnov statistic
  Table A11 Critical values of the run statistic
  Table A12 Critical values of the Wilcoxon rank sum statistic

BIBLIOGRAPHY

INDEX

1. Introduction

The term statistics is given several precise definitions in the
dictionaries and is also used with different meanings in everyday language.
It can be used as synonymous with data, or taken to mean the entire scientific
discipline concerned with the methodology of extracting information from data.
Often the word is given a meaning between these two extremes, and stands for a
compilation of figures pertaining to various interests, or for specified methods
or computational procedures applied to sets of numerical observations of similar
kind. Used in different contexts the word statistics is associated with the
collecting and summarizing of observations in experiments, with measuring the
variation in such data, and with more elaborate investigations; these may
include comparison with other data or model predictions, and parameter
estimation and hypothesis testing on the basis of observed data.
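
The narrower senses of the word mentioned above (summarizing a set of observations, measuring their variation, and quoting an estimate with its uncertainty) can be made concrete with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the text: the helper name `summarize`, the simulated "decay times", the true mean of 2.0 and the random seed.

```python
import math
import random
import statistics


def summarize(observations):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(observations)
    mean = statistics.fmean(observations)
    spread = statistics.stdev(observations)  # uses the n - 1 denominator
    return mean, spread, spread / math.sqrt(n)


# Hypothetical data: 1000 simulated decay times with a true mean of 2.0
# (arbitrary units). The seed only makes the run reproducible.
rng = random.Random(1979)
TRUE_MEAN = 2.0
times = [rng.expovariate(1.0 / TRUE_MEAN) for _ in range(1000)]

mean, spread, error = summarize(times)
print(f"estimate = {mean:.3f} +- {error:.3f}")
print(f"spread of single observations = {spread:.3f}")
```

The sample mean plays the role of the estimate, the sample standard deviation measures the variation in the data, and the standard error is one simple measure of the uncertainty in the estimate.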
The ultimate goal of the physical sciences is to uncover the fundamental
laws governing the phenomena in the material world. In pursuing this
ambitious task it is widely recognized that statistical reasoning and statistical
analysis methods are becoming increasingly important, even indispensable
in many fields. Accordingly, training in statistics has become more or less
mandatory for students aiming for an academic career on the experimental side
of these fields. Many supervisors recommend a course in mathematical statistics
to their students. However, as each discipline has developed its own
character and methodology for experimentation, registration, and interrelation
of observed facts, as well as its own conceptual and theoretical framework,
it is by now usually agreed that the training in data handling and statistics
should be made an integrated part of the discipline for practical and pedagogical
reasons. Thorough knowledge of means and methods is essential in all
scientific research. The education in statistics should therefore go beyond
the teaching of cook-book recipes to be applied in a more or less automatic
manner to standard problems, and aim at providing some understanding of the
general principles involved as well as of the specific assumptions underlying
the various methods, since these determine their applicability and hence their
implicit limitations.

An experiment is often motivated by the need to confront current
theoretical ideas with new experimental facts to reduce speculative model-building
and confine intellectual and other effort in a certain direction. In
experimental high energy physics utilizing the facilities of man-made particle
accelerators a typical experiment is an impressive undertaking which may represent
the combined effort of large teams of technicians, engineers and physicists
from several universities or other research institutions, often from many
countries, and which may last over a period of many years. Probability arguments
and statistical reasoning are employed throughout the experiment, and may in
fact enter already at the initial planning stage with estimating the size of the
experiment, in terms of the number of events needed to attain a specified
precision. The proposal for the experiment will include other estimates for costs in
money and man-power, the proposed experimental layout and design. During the
long period when the experiment is assembled on the floor, extensive Monte Carlo
simulations are carried out to follow the paths of produced particles through
the experimental set-up and estimate the acceptances and efficiencies of the
different detectors. The actual running period with beam on target may last for
a few weeks' time up to several months. In electronic experiments the interesting
event candidates are, during this period, directly selected through triggering
systems, checked and initially analyzed by on-line computers before the relevant
information from all parts of the detection system is output event-by-event on
magnetic tape. In experiments with bubble chambers as detection device the films
from the exposure must first be scanned for events with the selected track
topologies, measured, kinematically analyzed, and checked again on the scanning
table before the event is ready for the data summary tape. Following the data

The interplay between theory and experiment through probability and
statistics can be sketched as follows:

A theoretical model predicts a certain correspondence between an
observable quantity x and some parameter θ which is not known experimentally and
which is not directly accessible to measurement; the value of θ may, or may not,
be predicted by the model. The purpose of the experiment is to "determine" the
value of θ by performing measurements of the observable x. From the set of
observed numbers x₁, x₂, ..., xₙ statistical methods will tell us how to obtain an
estimate for the parameter, as well as a measure of uncertainty in this quantity.
Parameter estimation on the basis of observations is the most important
application of statistics in physics. The second main application is that of hypothesis
testing, which can consist in, for example, finding out whether the parameter
value as predicted by the model is consistent with the value inferred from the
experimental observations.

Theoretical models produce answers to questions of the type: "For a
given value of the parameter θ, what is the expected distribution for the
observable x?". The experiment has to do with the inverse problem, corresponding to
answering questions like: "Given the observations x₁, x₂, ..., xₙ, what is the
value of θ?". Fundamentally different as these questions are, they are
nevertheless intimately connected, illustrating the complementary relationship between
probability theory and statistics, represented by, respectively, the theoretician
and the experimentalist.

The subsequent four chapters of the book are concerned with probability
theory, since an account of this subject must be considered an essential
prerequisite for a meaningful introduction to statistics. We begin in Chapter 2 by
defining different probability concepts and establish rules for the combination
of probabilities. In Chapter 3 we consider the properties of probability
distributions for random variables in general; we introduce some useful concepts to
acquisition comes next a phase in which the collected data are examined for
characterize distributions of a single variable (among them: mean value, variance,
internal consistency, possible error sources located, and biases corrected for.
and other expectation values or moments) end several variables (in particular:
I When the observed data have passed all consistency checks and are well understood
the covariance matrix), develop various formulae pertaining to functions of
the final stage involves the interpretation of the experimental findings in the
random variables (rules for error propagation), and give some remarks on
light of theoretical models. At times, when the data can not be explained by the
sampling. In Chapter 4 we focus on special probability distributions which
existing models, the experimental outcome may call for revision of current ideas,
often serve as mathematical models in experimental situations, notably the
inexceptional andhistoric cases even lead to the discovery of fundamental laws.
binomial, Poisson, and exponential distributions. Particular attention is given to the normal, or Gaussian, distribution, because of its key role, theoretically (the Central Limit Theorem) as well as practically (describing outcome of measurements). Chapter 5 deals with a class of sampling distributions which are all related to the normal distribution; the most important of these is the chi-square distribution.

The real world seldom fits exactly into the scheme provided by the ideal mathematical models. In Chapter 6 we indicate how different situations can be handled by truncation of the probability distribution, by folding-in of the experimental resolution, and by correcting for inefficiencies in the detecting apparatus.

Passing to the domain of statistics, we begin in Chapter 7 by introducing the important concept of a confidence interval, applying it to the common practical problem of estimating the parameters of the normal distribution. The general aspects of parameter estimation are discussed in Chapter 8, which gives the formal background for the specific estimation methods described in the three subsequent chapters. The Maximum-Likelihood method (Chapter 9), the Least-Squares method (Chapter 10), and the method of moments (Chapter 11) all produce point estimates of the unknown parameters, and we discuss how measures can be obtained for the uncertainty - or error - in these estimates. A point estimate and its error is equivalent to a particular interval estimate of the parameter. Interval estimation in general is also discussed, based on the simplifying assumptions of infinite sample sizes (in the likelihood approach) and linear models (for the Least-Squares estimation), since these facilitate comparisons with the normal and the chi-square distributions. In Chapter 10 we also consider the important issue of constrained parameter estimation, using the technique with Lagrangian multipliers, and discuss how the Least-Squares estimation can provide measures of goodness-of-fit. Chapter 12 describes a simple case study with application of the different estimation methods on simulated polarization experiments.

The Maximum-Likelihood and the Least-Squares estimation methods both require searching the extremum of a function with respect to the unknown parameters. In Chapter 13 we sketch the principles behind commonly used numerical procedures for locating the minima of general functions of many variables.

The second main application of statistics, the testing of statistical hypotheses, is taken up in Chapter 14. In this area of statistical inference the observations are used for decision-making. With a test we mean a given rule or criterion for arriving at a decision of acceptance or rejection of some formulated hypothesis. After a brief survey of the general principles involved, we concentrate on parametric tests for normally distributed variables, which have considerable practical importance. Distribution-free tests of goodness-of-fit between model prediction and experimental observation, or between different sets of observations, include the common $\chi^2$-test and the Kolmogorov-Smirnov tests; the $\chi^2$-test can also be used to test independence between variables and consistency between sets of observations (histograms). Other simple, less well-known prescriptions are given to test randomness within a sample, and consistency between two or more samples.

The Appendix contains tabulations of the most common probability distributions which are referred to throughout the book, as well as a set of tables with percentage points and critical values for some of the test statistics used exclusively in Chapter 14.

There are two important subjects of wide application in particle physics, which are mutually related like probability theory and statistics but not covered in this book. The first of these concerns the simulation of processes and generation of artificial N-particle reactions in the 3N-4 dimensional Lorentz invariant phase space by the Monte Carlo technique. The second deals with methods for analyzing and finding structures in this multi-dimensional space, given a sample of observed N-particle reactions. Readers who are interested in these topics should consult references on Monte Carlo methods and multi-dimensional data analysis given in the bibliography at the end of the book.


2. Probability and statistics

The theory of probability is a branch of pure mathematics. From a certain set of axioms and definitions one builds up the theory by deductions. In contrast, statistics is a branch of applied mathematics which is essentially inductive. Nevertheless statistics is intimately connected with probability theory as the following considerations may show.

Suppose it is known that when tossing a coin it has an a priori probability $p$ of landing "heads" and a probability $1-p$ of landing "tails". We ask: What is the probability of observing $r$ heads out of $n$ tosses? This is a question in probability theory, and an answer is provided by the binomial distribution law, which states that the probability to obtain $r$ heads and $(n-r)$ tails is given by the number

$$ \binom{n}{r}\, p^r (1-p)^{n-r}. $$

A completely different situation exists if one has no a priori knowledge on the probability $p$ and decides to perform an experiment to "determine" this parameter. A simple experiment would consist in tossing the coin repeatedly and counting how many times the outcome heads would occur. It would then be a question of statistics to ask what the parameter $p$ is like, given that in $n$ tosses, $r$ heads were observed. A reasonable answer to this question is to say that "the most likely value" of the parameter is given by the ratio of the number of heads observed to the total number of tosses,

$$ \hat{p} = \frac{r}{n}. \quad ^{*)} $$

The experiment has then given a point estimate $p = \hat{p}$ for the unknown parameter; we could also say that the value of $p$ was inferred from the observations made by the experiment. However, from the very nature of the experiment, we could not be completely sure that the value $\hat{p}$ thus obtained would be identical to the true value of the parameter $p$. Indeed we would feel that if the experiment were repeated, with new sequences of tosses, then presumably different estimates $\hat{p}$ would be obtained.

Instead of stating the result of the experiment in terms of a single number we could give an interval estimate for the parameter $p$. We would then finalize our experiment by giving two numbers $p_1$, $p_2$, hoping that

$$ p_1 \le p \le p_2 $$

represents a true statement about $p$. The faith we attach to the statement could be expressed by assigning a confidence level to it. Given the observation of $r$ heads in $n$ tosses it is again a case of statistical inference to determine an interval $[p_1, p_2]$ which is such that it has a certain probability of including the true value of $p$. In general, the larger we take the interval the more certain we would be that this interval really includes the true value of $p$, but at the same time a large interval means less precise knowledge on $p$. On the other hand, a small interval corresponds to a better precision in the determination of $p$, but the statement that $[p_1, p_2]$ includes $p$ then has a greater chance of not being true.

Since the actual calculation of the limits $p_1$ and $p_2$ of a certain confidence interval requires some assumption about the probability for getting just $r$ heads out of $n$ tosses we realize that in order to make this statistical inference it is necessary to know the functional form of the binomial distribution law.

In the following sections we shall state, but not prove, the rules that govern the calculus of probabilities, and illustrate these with some simple examples. Two of the examples, Sects. 2.3.10 and 2.3.11, also include statistical inference.

*) The symbol ^ over a quantity is used to denote an estimate.

2.1 DEFINITION OF PROBABILITY

Statisticians do not seem to agree about the best way to define probability. We will adopt a rather simple approach, customary among physicists, and define probability in terms of the limit of relative frequency of occurrence. Thus, if in a sequence of $n$ trials of an experiment the outcome of a specified class, or the event $E$, occurs $r$ times, then the probability of $E$ is operationally defined as the limiting value

$$ P(E) = \frac{r}{n} \quad \text{when } n \to \infty. \tag{2.1} $$

From this definition the probability of the event $E$ is some number satisfying

$$ 0 \le P(E) \le 1, \tag{2.2} $$

where $P(E) = 0$ if the event never occurs when the experiment is performed, and $P(E) = 1$ if it always occurs.

2.2 RANDOM VARIABLES. SAMPLE SPACE

To illustrate what is meant by a random variable consider the simple experiment of rolling a die. Since prior to a throw, its outcome can not be predicted with complete certainty, the number of dots that is observed is called a random variable. In this case the outcome will be one number in the sequence 1,2,...,6, hence the sample space consists of the collection of integer numbers between 1 and 6. Since the occurrence of one outcome, 1 say, excludes the other possibilities 2,3,...,6 these events are said to be exclusive.

If the random variable $X$ can only take on a finite number of values, as in the example above, we call $X$ a discrete random variable. We associate with each possible outcome $x_i$ of the experiment a probability $P_i$. Since there must be some outcome of every trial the sum of all $P_i$ for all conceivable outcomes must be equal to one,

$$ \sum_i P_i = 1. \tag{2.3} $$

Exclusive events, for which eq.(2.3) is satisfied, are said to be exhaustive.

If the random variable $X$ can have a continuum of values within any finite interval it is called a continuous random variable. An example is the length $X$ obtained from measurements on a bar. We would then associate a probability $P(x \le X < x+dx)$ to the event of getting a measured length in the interval $[x, x+dx]$, and define the probability density function $f(x)$ for the continuous random variable $X$ by the equation

$$ P(x \le X < x+dx) = f(x)\,dx. \tag{2.4} $$

The requirement that all probabilities should add up to one is now formulated by

$$ \int_{\Omega} f(x)\,dx = 1, $$

where the integration goes over all possible outcomes $x$ defining the sample space $\Omega$.

2.3 CALCULUS OF PROBABILITIES

We shall now introduce some useful concepts and state some basic rules for calculation of probabilities according to set theory.

2.3.1 Definitions

The concept of a set is used to denote a collection of objects with some common properties. An object that belongs to a set $A$ is said to be an element of $A$. If every element of the set $B$ is also an element of the set $A$ we say that $B$ is a subset of $A$.

Let $A$ be an arbitrary set of elements in the sample space $\Omega$. The complement $\bar{A}$ is then defined as the set of all elements in $\Omega$ that do not belong to the set $A$.

The union $A \cup B$ of two sets $A$ and $B$ is defined as the set of elements that belong to $A$ or $B$, or both.

The intersection $A \cap B$ is defined as the set of elements that belong to both sets $A$ and $B$.

Two sets $A$ and $B$ are said to be exhaustive if any element of $\Omega$ belongs to the union $A \cup B$,

$$ A \cup B = \Omega. \quad \text{(exhaustive sets)} \tag{2.5} $$
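The set definitions given so far (subset, complement, union, intersection, exhaustive sets) can be tried out directly with Python's built-in sets. The sketch below uses the die of Sect. 2.2 as sample space; the two events `A` and `B` and the helper names are arbitrary choices made for this illustration and do not come from the text.

```python
# Sample space for one throw of a die (Sect. 2.2).
OMEGA = frozenset(range(1, 7))

# Hypothetical events, chosen for illustration only.
A = frozenset({2, 4, 6})   # an even number of dots
B = frozenset({4, 5, 6})   # more than three dots


def complement(s, space=OMEGA):
    """All elements of the sample space that do not belong to s."""
    return space - s


def is_exhaustive(s, t, space=OMEGA):
    """True if every element of the sample space lies in the union of s and t."""
    return (s | t) == space


union = A | B            # elements in A or B, or both
intersection = A & B     # elements in both A and B

print(sorted(union))                    # [2, 4, 5, 6]
print(sorted(intersection))             # [4, 6]
print(sorted(complement(A)))            # [1, 3, 5]
print(frozenset({6}) <= A)              # True: {6} is a subset of A
print(is_exhaustive(A, B))              # False: 1 and 3 are in neither set
print(is_exhaustive(A, complement(A)))  # True
```

A set and its complement always come out exhaustive, whereas two arbitrary events such as `A` and `B` here generally do not.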
Two sets $A$ and $B$ are mutually exclusive if they have no elements in common, that is

$$ A \cap B = \emptyset. \quad \text{(exclusive sets)} \tag{2.6} $$

According to these definitions the set $A$ and its complement $\bar{A}$ are two exclusive and exhaustive sets.

A convenient way of visualizing the concepts introduced above, and also of illustrating the algebra of sets, is by means of Venn diagrams. Let the square of Fig. 2.1 represent the sample space $\Omega$ and the areas enclosed by the circles represent the sets $A$ and $B$. Then the complement $\bar{A}$, the union $A \cup B$, and the intersection $A \cap B$ are given by the different shaded regions in the diagram.

Fig. 2.1. Venn diagram for the complement $\bar{A}$, the union $A \cup B$, and the intersection $A \cap B$.

More complicated combinations of sets can also be conveniently depicted in Venn diagrams.

Exercise 2.1: Let A, B, C be three non-exclusive sets in the sample space $\Omega$. Find representations of the combinations $(A \cup B) \cap C$ and $(A \cap B) \cap C$.

Exercise 2.2: Show, by using the technique of the Venn diagram, that $(A \cup B) \cap C$ in general is different from $A \cup (B \cap C)$.

2.3.2 Example: Topologies of bubble chamber events (1)

To illustrate the definitions introduced in the previous section, let us consider the topologies of high-energy proton-proton interactions in a bubble chamber. The reactions will have final states with 2,4,6,... charged particles (prongs) and 0,1,2,... associated neutral strange particles ($V^0$'s). The sample space $\Omega$ will be the collection of all conceivable topologies, that is, all combinations of number of prongs and $V^0$'s that can occur at this particular energy.

For definiteness, let the set $A$ be the collection of all events associated with at least one $V^0$. The complement $\bar{A}$ will then be the collection of events which are not associated with $V^0$ signals. Similarly, the set $B$ can represent all 2-prong events; the complementary set $\bar{B}$ will then represent all events with more than 2 prongs. The union $A \cup B$ will be the collection of all events that have at least one $V^0$ or 2 prongs, or both. The intersection $A \cap B$ represents events with 2 prongs and at least one $V^0$.

Exercise 2.3: Draw Venn diagrams for this example.

2.3.3 Addition rule

From the Venn diagram in Fig. 2.1 one may write

$$ P(A \cup B) = P(A) + P(B) - P(A \cap B), \tag{2.7} $$

which is known as the rule of addition for probabilities. If, in particular, the sets $A$ and $B$ are mutually exclusive, i.e. the circles in Fig. 2.1 do not overlap, the addition rule takes the simple form

$$ P(A \cup B) = P(A) + P(B) \quad \text{(exclusive sets)}. \tag{2.8} $$

If the sets $A_1, A_2, \ldots, A_n$ are exclusive and exhaustive,

$$ P(A_1 \cup A_2 \cup \ldots \cup A_n) = \sum_{i=1}^{n} P(A_i) = 1 \quad \text{(exclusive and exhaustive)}. \tag{2.9} $$

2.3.4 Conditional probability

Suppose $A$ and $B$ are subsets of the sample space $\Omega$ and represent the probabilities $P(A)$ and $P(B)$, respectively. Suppose further that we for some reason will be interested only in the elements of $A$ and that we therefore want


to redefine the sample space to the subset $A$ only. How can we then express the probability of the subset $B$ relatively to the "new" sample space $A$? This new probability is called the conditional probability of $B$ relative to $A$; it is written $P(B|A)$, which should be read "probability of $B$ given $A$". Another way of expressing the conditional probability, with reference to many applications, is to say that $P(B|A)$ gives the probability that the event $B$ occurs under the condition that the event $A$ has already occurred.

It seems intuitively clear that the conditional probability must have something to do with the "overlap" of $A$ and $B$, or the intersection $A \cap B$. In fact, the conditional probability $P(B|A)$ is defined through the equation

$$ P(A \cap B) = P(B|A)\, P(A). \tag{2.10} $$

To illustrate the meaning of conditional probability, we look at the Venn diagram of Fig. 2.2. The sample space $\Omega$ has a total of $N$ elements and the number of elements in the subsets $A$ and $B$ is $N_A$ and $N_B$, respectively, while $A$ and $B$ have $N_C$ elements in common.

Fig. 2.2. Venn diagram to illustrate conditional probability.

Inspecting the different areas in the diagram we write

$$ P(A) = \frac{N_A}{N}, \qquad P(B) = \frac{N_B}{N}, $$

and

$$ P(A \cap B) = \frac{N_C}{N}, \quad \text{(overlap region relatively to square)} $$

$$ P(B|A) = \frac{N_C}{N_A}, \quad \text{(overlap region relatively to circle A)} $$

$$ P(A|B) = \frac{N_C}{N_B}. \quad \text{(overlap region relatively to circle B)} $$

It is seen that these probabilities satisfy eq.(2.10) as well as the similar equation for the conditional probability $P(A|B)$,

$$ P(A \cap B) = P(A|B)\, P(B). \tag{2.11} $$

It may be worth stressing that all probabilities, in fact, are conditional probabilities, because the occurrence of any event always depends on some conditions. But since these conditions often are the same for all events constituting the sample they are considered to be trivial and therefore not stated explicitly.

2.3.5 Example: $K^0 p$ scattering cross section

As an illustration on the implicit use of conditional probability, consider an experiment to determine the sign of the mass difference between the longlived and the shortlived $K^0$, which can be deduced from the $K^0 p$ and $\bar{K}^0 p$ scattering cross sections. To evaluate these quantities one can measure the probability that a neutral kaon produced in a hydrogen bubble chamber will interact with a proton within the chamber. The calculated probability for $K^0 p$ scattering will, however, depend on the probability for observing the decay of a $K^0$.

Assume the $K^0$'s to be produced via the reaction

$$ K^+ + p \to K^0 + \pi^+ + p, \quad \text{(production)} $$

and detected through the decay

$$ K^0 \to \pi^+ + \pi^-, \quad \text{(decay, event B)}. $$

The interesting reaction is

$$ K^0 + p \to K^0 + p, \quad \text{(scatter, event A)}, $$

for which we search the probability $P(A)$. We are only able to identify event $A$ if event $B$ is also observed, because the decaying kaon as well as the recoiling proton must be measured to obtain a kinematical fit to the scattering reaction. Writing (eq.(2.10))

$$ P(A) = \frac{P(A \cap B)}{P(B|A)}, $$

we note that the intersection $P(A \cap B)$ represents the fraction of completely identified sequential events,

$$ K^+ p \to K^0 \pi^+ p, \qquad K^0 p \to K^0 p, \qquad K^0 \to \pi^+ \pi^-, $$

relatively to the total number of $K^0$ producing reactions $K^+ p \to K^0 \pi^+ p$ with $K^0$ seen or unseen in the bubble chamber. Thus the probability $P(A \cap B)$ can be found by counting events.

On the other hand $P(B|A)$, the conditional probability for a seen decay of the $K^0$, given that a scattering has occurred, can be calculated from the identified, complete events taking into account the geometric detection efficiency of the bubble chamber, the mean lifetime of the $K^0$, and the branching ratio for the charged decay mode. Note that $P(B|A)$ is different from the probability $P(B)$ to observe the decay of an unscattered $K^0$, due to the change in the $K^0$ momentum and direction in the scattering process. Compare the illustration, Fig. 2.3, where the sequential events are shown, together with indications of the potential paths for the scattered $K^0$ as well as for an unscattered $K^0$.
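The counting argument behind conditional probability, and the way eq.(2.10) is solved for $P(A)$ in the $K^0 p$ example, can be checked with a few lines of exact arithmetic. This is a minimal sketch: the function name and the element counts are hypothetical and serve only to illustrate the relations $P(A \cap B) = P(B|A)P(A) = P(A|B)P(B)$.

```python
from fractions import Fraction


def venn_probabilities(n_total, n_a, n_b, n_c):
    """Probabilities from element counts in a Venn diagram with two sets.

    n_total: elements in the sample space; n_a, n_b: elements in A and B;
    n_c: elements common to A and B.
    """
    if n_a <= 0 or n_b <= 0 or not 0 <= n_c <= min(n_a, n_b):
        raise ValueError("inconsistent counts")
    if n_a + n_b - n_c > n_total:
        raise ValueError("A and B do not fit into the sample space")
    return {
        "P(A)": Fraction(n_a, n_total),
        "P(B)": Fraction(n_b, n_total),
        "P(A and B)": Fraction(n_c, n_total),
        "P(B|A)": Fraction(n_c, n_a),
        "P(A|B)": Fraction(n_c, n_b),
    }


# Hypothetical counts, for illustration only.
p = venn_probabilities(n_total=100, n_a=40, n_b=30, n_c=10)

# Both factorizations of the intersection agree.
assert p["P(B|A)"] * p["P(A)"] == p["P(A and B)"]
assert p["P(A|B)"] * p["P(B)"] == p["P(A and B)"]

# Solving for P(A) from the intersection and the conditional probability.
p_a = p["P(A and B)"] / p["P(B|A)"]
print(p_a)  # 2/5
```

Using `Fraction` keeps the identities exact, so the two assertions test the algebra rather than floating-point rounding.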

Fig. 2.3. Sketch of sequential events observed in a hydrogen bubble chamber. An incoming $K^+$ meson gives rise to a $K^0$ via the reaction $K^+ p \to K^0 \pi^+ p$. The $K^0$ scatters off a proton, $K^0 p \to K^0 p$, and decays into a pair of pions, $K^0 \to \pi^+ \pi^-$.

2.3.6 Independence; multiplication rule

Two sets $A$ and $B$ are said to be independent if the conditional probability of $B$ relative to $A$ is equal to the probability of $B$,

$$ P(B|A) = P(B) \quad \text{(independence)}. \tag{2.12} $$

This means that the occurrence of the event $B$ is not dependent on the (previous) occurrence of event $A$. From our earlier definition of conditional probability, eq.(2.10), we see that an alternative formulation of independence for two sets is

$$ P(A \cap B) = P(A) \cdot P(B) \quad \text{(independence)}. \tag{2.13} $$

In other words, the probability for the occurrence of both events $A$ and $B$ is equal to the product of the probabilities for the two separate events.

Eq.(2.13) provides a necessary and sufficient condition for independence of the two sets $A$ and $B$. It is often referred to as the rule of multiplication for probabilities of independent events.

2.3.7 Example: Relay networks

In Fig. 2.4 the probability for the closing of each relay in the circuit is some given number $\alpha$. Assuming that all relays act independently we want to find the probability for the flow of a current between the terminals.

Fig. 2.4. Relay network

Let the event of closing relay $i$ be denoted by $E_i$, $i = 1,2,3$. Then $P(E_1) = P(E_2) = P(E_3) = \alpha$. Having a current between the terminals corresponds to the event $E = E_1 \cup (E_2 \cap E_3)$, for which we shall find the probability. Applying the addition rule for probabilities, eq.(2.7), leads to

$$ P(E) = P(E_1) + P(E_2 \cap E_3) - P(E_1 \cap E_2 \cap E_3), $$

and using the multiplication rule eq.(2.13) for the independent events $E_i$ we obtain

$$ P(E) = P(E_1) + P(E_2)P(E_3) - P(E_1)P(E_2)P(E_3) = \alpha + \alpha^2 - \alpha^3, $$

which can be rewritten as [...] to show an application of the addition rule, eq.(2.7).

2.3.8 Example: Efficiency of a Čerenkov counter

Suppose the Čerenkov light from particles traversing a Čerenkov counter along its axis is detected by a concentric arrangement of phototubes illustrated in Fig. 2.5.

Fig. 2.5. Arrangements of 9 phototubes in a Čerenkov counter

In order to discriminate against accidental triggering of the system it is desirable to observe coincidences between the signals from several phototubes. We assume all phototubes to act independently.

If the event $E$ of having a signal from one phototube corresponds to a probability $P(E) = \varepsilon = 0.93$, the probability for the detection of the Čerenkov light by all 9 independent phototubes in Fig. 2.5(a) is

$$ \varepsilon^9 = (0.93)^9 \approx 0.52. $$

The efficiency of the detector can be largely improved by a different arrangement of the phototubes. In Fig. 2.5(b) the tubes are grouped together three by three, and each group is activated if at least one of the tubes in the group has a signal. The observation of a particle by the detector then requires a coincidence between the signals from the three groups, for which the probability becomes

$$ \left[ 1 - (1-\varepsilon)^3 \right]^3 = \left[ 1 - (0.07)^3 \right]^3 \approx 0.999. $$
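The relay network and the two phototube arrangements are small enough to evaluate numerically. The sketch below assumes, as in the text, that a current flows when $E_1 \cup (E_2 \cap E_3)$ occurs and that all elements act independently; it checks the closed form $\alpha + \alpha^2 - \alpha^3$ against a brute-force sum over the eight relay configurations. The function names are illustrative only.

```python
from itertools import product


def relay_closed_form(alpha):
    """P(E) for E = E1 or (E2 and E3), each relay closing with probability alpha."""
    return alpha + alpha**2 - alpha**3


def relay_by_enumeration(alpha):
    """Same probability, summed over all 8 open/closed relay configurations."""
    total = 0.0
    for e1, e2, e3 in product((True, False), repeat=3):
        if e1 or (e2 and e3):
            weight = 1.0
            for closed in (e1, e2, e3):
                weight *= alpha if closed else 1.0 - alpha
            total += weight
    return total


def all_tubes(eps, n_tubes=9):
    """Probability that every one of n_tubes independent phototubes fires."""
    return eps**n_tubes


def grouped_tubes(eps, n_groups=3, tubes_per_group=3):
    """Coincidence of groups, each firing if at least one of its tubes fires."""
    p_group = 1.0 - (1.0 - eps) ** tubes_per_group
    return p_group**n_groups


print(relay_closed_form(0.5), relay_by_enumeration(0.5))  # 0.625 0.625
print(f"{all_tubes(0.93):.3f}")      # 0.520
print(f"{grouped_tubes(0.93):.3f}")  # 0.999
```

Grouping the tubes replaces a ninefold coincidence by a threefold coincidence of nearly certain signals, which is why the efficiency rises from about 0.52 to about 0.999.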
2.3.9 Example: $\pi^0$ detection

The $\pi^0$ meson is known to decay electromagnetically into two $\gamma$-quanta. Suppose that we study $\pi^0$ decays in some detector, e.g. a heavy liquid bubble chamber, and that the average probability is $\alpha$ for the conversion of a $\gamma$ into an electron-positron pair within the detector. We want first to find the probabilities for seeing two, one, or none of the decay products of a single $\pi^0$.

Clearly the conversion of different $\gamma$-rays can be assumed to occur independently of each other. Therefore, in a simple application of the multiplication rule, eq.(2.13), we can write down the probability to detect both $\gamma$'s from the decaying $\pi^0$ as

$$ P(2\gamma) = \alpha^2, $$

and, similarly, the probability for seeing none of them,

$$ P(0\gamma) = (1-\alpha)^2. $$

The probability that only one of the $\gamma$'s is visible is also easily written down,

$$ P(1\gamma) = 2\alpha(1-\alpha). $$

The three probabilities can be added to give

$$ P(2\gamma) + P(1\gamma) + P(0\gamma) = 1, $$

as they should. We also note that the probability to see at least one $\gamma$ from a decaying $\pi^0$ is

$$ P(\ge 1\gamma) = P(2\gamma) + P(1\gamma) = 2\alpha - \alpha^2. $$

The same line of thought may be applied to more complex situations where several $\pi^0$'s are involved. For instance, an $\omega^0$ meson, with a decay into $3\pi^0$, will give rise to up to six detected $\gamma$-rays. However, unless the detector is very efficient, the probability to see many decay products soon becomes very small.

2.3.10 Example: Beam contamination and delta rays

It is quite often a problem in kaon and antiproton exposures in bubble chambers to find the contamination in the beam of lighter particles, pions and muons. Since a light particle can impart more of its energy to an electron than a heavy particle of the same momentum, it is possible to estimate the contamination from a count of beam tracks having delta ray electrons with an energy $E$ exceeding the maximum possible, $E_{max}$, for a delta ray produced by the heavier particle. Hence the presence of a delta ray with $E > E_{max}$ ("large $\delta$") indicates a "light" beam track.

We introduce the following notation:

$N$ = total number of beam tracks observed,
$N_1$ = number of beam tracks observed with one "large $\delta$",
$N_2$ = number of beam tracks observed with two "large $\delta$",
$N_\pi$ = number of "light" beam tracks (unknown),
$P(1\delta)$ = probability that a light particle produces one "large $\delta$",
$P(2\delta)$ = probability that a light particle produces two "large $\delta$".

Then, by definition, in the limit of large numbers,

$$ P(1\delta) = \frac{N_1}{N_\pi}, \qquad P(2\delta) = \frac{N_2}{N_\pi}. $$

The law of independence, eq.(2.13), implies that

$$ P(2\delta) = \left[ P(1\delta) \right]^2. $$

Thus, combining the expressions above one sees that it is possible to estimate the number of "light" beam tracks as

$$ \hat{N}_\pi = \frac{N_1^2}{N_2}. $$

Hence the contamination is estimated as

$$ \frac{\hat{N}_\pi}{N} = \frac{N_1^2}{N\, N_2}. $$
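Both examples reduce to a few lines of arithmetic. The sketch below assumes that the contamination estimate takes the form $\hat{N}_\pi = N_1^2 / N_2$, which is what follows from $P(1\delta) = N_1/N_\pi$, $P(2\delta) = N_2/N_\pi$ and the independence relation $P(2\delta) = [P(1\delta)]^2$. The function names, the conversion probability and the track counts are hypothetical values chosen for illustration.

```python
def pi0_gamma_probabilities(alpha):
    """Return P(2 gamma), P(1 gamma), P(0 gamma) seen, for conversion probability alpha."""
    p2 = alpha**2
    p1 = 2.0 * alpha * (1.0 - alpha)
    p0 = (1.0 - alpha) ** 2
    return p2, p1, p0


def estimate_light_tracks(n1, n2):
    """Estimated number of light beam tracks from tracks with one and two large deltas."""
    if n2 <= 0:
        raise ValueError("need at least one track with two large deltas")
    return n1**2 / n2


def estimate_contamination(n_total, n1, n2):
    """Estimated fraction of light particles among all beam tracks."""
    return estimate_light_tracks(n1, n2) / n_total


# Hypothetical conversion probability.
p2, p1, p0 = pi0_gamma_probabilities(0.3)
print(round(p2 + p1 + p0, 12))   # 1.0
print(round(p2 + p1, 12))        # 0.51, equal to 2*alpha - alpha**2

# Hypothetical track counts.
print(estimate_light_tracks(n1=60, n2=9))                    # 400.0
print(estimate_contamination(n_total=10000, n1=60, n2=9))    # 0.04
```

Because $N_2$ enters in the denominator and is typically a small count, the estimate is sensitive to statistical fluctuations in the number of tracks with two large delta rays.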

2.3.11 Example: Scanning efficiency (1)

In experimental high energy physics employing track detection methods (bubble chamber, nuclear emulsion, spark chamber) the experimenter must rely on scanners who search for specified types of events. The interesting events occur at random among large quantities of extraneous information, and it is rather unlikely that the scanner will be able to register all events of the specified types. The scanning process is therefore probably less than 100% effective. To estimate the probability that the interesting events are recorded one can perform two, or more, completely independent scans of the material.

Suppose that from two independent scans of a given sample of bubble chamber pictures one divides the interesting events into four exclusive classes, where

$N_{12} + N_1$ events were found in the first scan,
$N_{12} + N_2$ events were found in the second scan,
$N_{12}$ events were found in both scans,
$N - (N_1 + N_2 + N_{12})$ events are undetected by the two scans.

$N$ is the unknown number of interesting events in the film. $N_1$ is the number of events that are found in the first scan but not in the second, and similarly for $N_2$. A diagrammatic representation of the situation is given in Fig. 2.6.

Fig. 2.6. Venn diagram for illustration of scanning efficiency.

From our definition of probability, eq.(2.1), the efficiencies of the individual scans are, respectively, assuming large numbers,

$$ P(1) = \frac{N_{12} + N_1}{N}, \qquad P(2) = \frac{N_{12} + N_2}{N}. \tag{2.14} $$

The probability that an event has been found in both scans is given by the intersection (see Fig. 2.6)

$$ P(1 \cap 2) = \frac{N_{12}}{N}. \tag{2.15} $$

Since the scans were assumed to be independent, the probability that an event will be observed in both scans is also given by the multiplication rule, eq.(2.13),

$$ P(1 \cap 2) = P(1)\, P(2). \tag{2.16} $$

Combining the relations (2.14) - (2.16) we find an estimate for the total number of events in the film, by taking

$$ \hat{N} = \frac{(N_{12} + N_1)(N_{12} + N_2)}{N_{12}}. \tag{2.17} $$

Hence, inserting this expression in eq.(2.14) we obtain the estimated efficiencies of the individual scans as

$$ \hat{P}(1) = \frac{N_{12}}{N_{12} + N_2}, \qquad \hat{P}(2) = \frac{N_{12}}{N_{12} + N_1}. $$
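The double-scan estimate is easy to evaluate once the three counts are known. The sketch below assumes the estimate $\hat{N} = (N_{12}+N_1)(N_{12}+N_2)/N_{12}$, obtained by equating the fraction $N_{12}/N$ of events found in both scans to the product of the two individual efficiencies $(N_{12}+N_1)/N$ and $(N_{12}+N_2)/N$. The function names and the event counts are hypothetical.

```python
def estimate_total_events(n1, n2, n12):
    """Estimated total number of events from two independent scans.

    n1: found in scan 1 only, n2: found in scan 2 only, n12: found in both.
    """
    if n12 <= 0:
        raise ValueError("need at least one event found in both scans")
    return (n12 + n1) * (n12 + n2) / n12


def estimate_scan_efficiencies(n1, n2, n12):
    """Estimated efficiencies of scan 1 and scan 2."""
    n_hat = estimate_total_events(n1, n2, n12)
    return (n12 + n1) / n_hat, (n12 + n2) / n_hat


# Hypothetical counts, for illustration only.
n1, n2, n12 = 20, 30, 60
print(estimate_total_events(n1, n2, n12))   # 120.0
eff1, eff2 = estimate_scan_efficiencies(n1, n2, n12)
print(round(eff1, 4), round(eff2, 4))       # 0.6667 0.75
```

Note that the efficiency of the first scan reduces to $N_{12}/(N_{12}+N_2)$, the fraction of the second scan's events that the first scan also found; the estimate for each scan therefore depends on the outcome of the other.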
It is worth noting that because $N$ has to be estimated using the information from both scans the estimated efficiency for the first scan depends on the result of the second scan, and vice versa.

From the total number of events found in the two scans, the overall scanning efficiency is given by the union,

$$\epsilon = \frac{N_1+N_2+N_{12}}{N}.$$

Thus, taking the estimated value for $N$ from eq.(2.17) we arrive at the following expression for the estimated combined efficiency of the two scans,

$$\epsilon = \frac{N_{12}\,(N_1+N_2+N_{12})}{(N_1+N_{12})(N_2+N_{12})}.$$

The overall efficiency can alternatively be expressed in terms of the individual efficiencies; applying eq.(2.7) we have

$$\epsilon = \epsilon_1 + \epsilon_2 - \epsilon_{12},$$

which, using eq.(2.16), gives

$$\epsilon = \epsilon_1 + \epsilon_2 - \epsilon_1\epsilon_2.$$

In developing the formulae above several assumptions were involved, some of which have not been stated explicitly and may hardly be fulfilled in practice. For instance, it has been assumed that

(i) the different scans are performed independently (an assumption that is perhaps best met by having different scanners),
(ii) all events have the same probability of being detected (an ideal requirement which is not achieved in practice when complicated topologies occur),
(iii) the number of events is sufficiently large to warrant the use of the concept "probability".

We briefly touch the question of the statistical uncertainties in the scanning efficiencies in connection with our discussion of the binomial distribution (Sect.4.1.3). The whole problem of estimating scanning efficiencies is taken up again in Sect.9.6.2 where we adopt the Maximum-Likelihood approach to the estimation problem.

2.3.12 Marginal probability

The elements of a set may frequently be classified according to more than one criterion. The term marginal probability is then used whenever some of the criteria are being ignored in the classification. Thus if the classifications according to two criteria A and B are $A_1,A_2,\ldots,A_m$ and $B_1,B_2,\ldots,B_n$, respectively, where

$$\sum_{i=1}^{m} P(A_i) = \sum_{j=1}^{n} P(B_j) = 1,$$

then the marginal probability of $A_i$ is

$$P(A_i) = \sum_{j=1}^{n} P(A_i \cap B_j) \tag{2.22}$$

and, similarly, the marginal probability of $B_j$,

$$P(B_j) = \sum_{i=1}^{m} P(A_i \cap B_j). \tag{2.23}$$

In particle physics the concept of marginal probability is inherent in the notion of inclusive reactions. For example, writing

$$a + b \to c + \text{anything}$$

implies that in the reaction of particles a and b one is only interested in observations on the properties of particle c, ignoring all the other particles. A more specific example follows.

2.3.13 Example: Topologies of bubble chamber events (2)

An experiment on strange particle production in proton-proton reactions in a bubble chamber has classified the events according to the identified neutral strange particle ($V^0$), criterion A, and the number of charged particles (prongs) seen in the primary reaction, criterion B. Criterion A gives four exclusive possibilities, with observation of one $K^0_s$, one $\Lambda$, two $K^0_s$, or one $K^0_s$ plus one $\Lambda$, respectively, while criterion B gives five (exclusive) possible prong number assignments for the primary reaction. Suppose that the probabilities $P(A_i \cap B_j)$ are given by the observed relative frequencies for the various topologies displayed in the following table:
The marginal probability to have a reaction with, say, only one $K^0_s$, irrespective of the number of prongs, is here found by adding the entries in the first row of the table, whereas, for example, the marginal probability for having a 4-prong reaction with any $V^0$ signal is found by adding the numbers in the second column,

$$P(B_2) = P(\text{4-prong}) = 0.448.$$

It will be seen that the marginal probabilities add to unity,

$$\sum_i P(A_i) = \sum_j P(B_j) = 1.$$

2.4 BAYES' THEOREM

We shall give a brief account of Bayes' Theorem, which, although mathematically simple, has a controversial status among the specialists. We first state the theorem, next prove it (Sect.2.4.1), consider an example (Sect.2.4.2), give further comments (Sect.2.4.3) and a second example (Sect.2.4.4).

2.4.1 Statement and proof

Let the sample space $\Omega$ be spanned by the $n$ mutually exclusive and exhaustive subsets $B_i$. If $A$ is also a set belonging to $\Omega$, Bayes' Theorem states that

$$P(B_i|A) = \frac{P(A|B_i)\,P(B_i)}{\sum_{j=1}^{n} P(A|B_j)\,P(B_j)}. \tag{2.24}$$

The proof of this theorem is nothing but an application of the definitions introduced in the previous sections. From the definition of conditional probability we can write down two expressions for the intersection $P(A \cap B_i)$,

$$P(A \cap B_i) = P(A|B_i)\,P(B_i) = P(B_i|A)\,P(A)$$

(compare eqs.(2.10), (2.11)), hence

$$P(B_i|A) = \frac{P(A|B_i)\,P(B_i)}{P(A)}.$$

For $P(A)$ we get an expression from the definition of marginal probability, eq.(2.22), which can be rewritten using eq.(2.11),

$$P(A) = \sum_{j=1}^{n} P(A \cap B_j) = \sum_{j=1}^{n} P(A|B_j)\,P(B_j).$$

Substituting for $P(A)$ in $P(B_i|A)$ above is seen to give eq.(2.24).

2.4.2 Example

Let each of three drawers $B_1, B_2, B_3$ contain two coins; $B_1$ has two gold coins, $B_2$ one gold and one silver, $B_3$ two silver coins. We are to select one drawer at random and pick a coin from it. Supposing that this first coin turns out to be one of gold, what is then the probability that the second coin in the same drawer is also a gold coin?

If $A$ denotes the event of first a gold coin we want to calculate the conditional probability $P(B_1|A)$. Obviously the conditional probabilities $P(A|B_i)$ of getting a gold coin from drawer $B_i$, are

$$P(A|B_1) = 1, \qquad P(A|B_2) = \tfrac{1}{2}, \qquad P(A|B_3) = 0,$$
respectively. Also, since a drawer is selected at random

$$P(B_1) = P(B_2) = P(B_3) = \tfrac{1}{3}.$$

Hence, from eq.(2.24),

$$P(B_1|A) = \frac{1\cdot\tfrac{1}{3}}{1\cdot\tfrac{1}{3} + \tfrac{1}{2}\cdot\tfrac{1}{3} + 0\cdot\tfrac{1}{3}} = \frac{2}{3}.$$

Thus, although the probability was only 1/3 for selecting drawer $B_1$, the observation that the first coin drawn was a gold coin effectively doubles the probability that the drawer $B_1$ had been selected.

This example may serve to justify the following notation: The $P(B_i)$ are the prior probabilities for the drawers, while $P(B_1|A)$ is the posterior probability for the drawer $B_1$. The $P(A|B_i)$ is called the likelihood of the event $A$, given $B_i$.

2.4.3 Comments

Bayes' Theorem as it has been stated by eq.(2.24) is nothing but a simple and logical consequence of the rules for calculation of probabilities. When it comes to the application of Bayes' Theorem to estimation problems, however, different schools of statisticians have different interpretations of the theorem with far-reaching philosophical implications, and the apparently irreconcilable opinions among specialists have led to a vast literature on the subject.

Let us, in eq.(2.24), make a change in notation and write $x$ instead of $A$, $\theta_i$ instead of $B_i$. We may think of $x$ as a random variable, to be measured by an experiment; the $\theta_i$ may represent different hypotheses about a parameter $\theta$, whose actual numerical value is not known. The set of hypotheses $\theta_i$ satisfies the condition for exclusive and exhaustive sets,

$$\sum_{i=1}^{n} P(\theta_i) = 1.$$

Bayes' Theorem relates these prior probabilities for the hypotheses to their posterior probabilities $P(\theta_i|x)$, given the measurement $x$,

$$P(\theta_i|x) = \frac{P(x|\theta_i)\,P(\theta_i)}{\sum_{j=1}^{n} P(x|\theta_j)\,P(\theta_j)}, \tag{2.26}$$

where the conditional probability $P(x|\theta_i)$ is the likelihood of obtaining the measurement $x$ for a specified hypothesis $\theta_i$.

From eq.(2.26) any posterior probability $P(\theta_i|x)$ can only be evaluated if prior probabilities $P(\theta_i)$ are specified. The numerical value of $P(\theta_i)$ is a measure of our prior degree of belief that $\theta_i$ is a true hypothesis about the unknown $\theta$. The measurement of $x$ implies a change in our belief that $\theta_i$ is true, $P(\theta_i|x)$ giving a numerical value for our new faith in the statement that $\theta_i$ is true. If, on the basis of the observation $x$, we had to choose one hypothesis from the set $\theta_1,\theta_2,\ldots,\theta_n$ we would probably choose the one with the largest posterior probability $P(\theta_i|x)$.

A consequence of Bayes' Theorem is that two physicists, from the same observation $x$, may find different distributions of posterior probabilities, depending on their formulation of the prior probabilities. From their individual experience and prior knowledge they may arrive at different posterior knowledge, and hence they may, quite legitimately, reach different conclusions about the parameter $\theta$.

To avoid the element of subjectivism in the inference about $\theta$ it would be necessary to agree on the prior probabilities $P(\theta_i)$. Thus all physicists would have to combine their prior knowledge, deciding on an "objective" prior distribution of the $P(\theta_i)$. This distribution would, however, still be conditioned in the sense that every $P(\theta_i)$ would be dependent on all knowledge accumulated up to the present *).

*) Recall our earlier remark that all probabilities indeed are conditional.

The preceding considerations reflect the attitude of Bayesian statisticians. The anti-Bayesians, on the other hand, prefer to abstain from posterior probabilities, being satisfied with distilling and presenting their data as indicative as possible, but leaving the (subjective) conclusions to be drawn by the reader.

In many cases the prior probabilities $P(\theta_i)$ are only incompletely
known. If the denominator of eq.(2.26) can not be found, Bayes' Theorem takes the weaker form

$$P(\theta_i|x) \propto P(x|\theta_i)\,P(\theta_i). \tag{2.27}$$

Although one can not in this case completely determine the posterior probabilities, eq.(2.27) can nevertheless be useful for calculation of relative posterior probabilities. With the observation $x$, the relative posterior probabilities for $\theta_i$ and $\theta_j$ define the betting odds for the hypothesis $\theta_i$ against the hypothesis $\theta_j$,

$$\text{Betting odds of } \theta_i \text{ against } \theta_j = \frac{P(\theta_i|x)}{P(\theta_j|x)} = \frac{P(x|\theta_i)\,P(\theta_i)}{P(x|\theta_j)\,P(\theta_j)}. \tag{2.28}$$

2.4.4 Example: Betting odds

Let us go back to the experiment described in Sect.2.3.13 and assume that the numbers in the last two rows of the table indicate a priori probabilities for the observation of $K^0_sK^0_s$ and $K^0_s\Lambda$ pairs, respectively.

Suppose further that a new event with two $V^0$'s has been found in the film, and that the subsequent measurement and kinematical analysis of this event shows that while one of the $V^0$'s is uniquely identified as a $K^0_s$, the second $V^0$ is ambiguous between a $K^0_s$ and a $\Lambda$ assignment. The probabilities following from the identification and measurements $x$ have been found to be $P(x|K^0_s) = 0.10$ and $P(x|\Lambda) = 0.50$ for the two possibilities. We want to find the betting odds for the hypothesis that the $2V^0$ event is a $K^0_sK^0_s$ pair against the hypothesis that the event is a $K^0_s\Lambda$ pair. From the table in Sect.2.3.13, ignoring the information on the charged tracks of the primary reaction, we find, using eq.(2.28),

$$\frac{P(K^0_sK^0_s|x)}{P(K^0_s\Lambda|x)} = \frac{0.10\;P(K^0_sK^0_s)}{0.50\;P(K^0_s\Lambda)},$$

where $P(K^0_sK^0_s)$ and $P(K^0_s\Lambda)$ are the marginal probabilities of the last two rows of the table.

2.4.5 Bayes' Postulate

From our previous discussion of Bayes' Theorem it is clear that no statement can be made about the parameter $\theta$ on the basis of the observation $x$ if the prior knowledge $P(\theta_i)$ is completely missing. In practice, however, meaningful posterior statements about $\theta$ can frequently be made also when the prior knowledge is scanty; this is particularly so if the $P(\theta_i)$ are roughly of the same magnitude and the likelihoods $P(x|\theta_i)$ are very dissimilar for the different hypotheses $\theta_i$.

Bayes' Postulate says that if the distribution of prior probabilities is completely unknown, one may take

$$P(\theta_1) = P(\theta_2) = \cdots = P(\theta_n). \tag{2.29}$$

Then Bayes' Theorem will express the posterior probabilities by the likelihoods alone,

$$P(\theta_i|x) \propto P(x|\theta_i).$$

As we have argued before, given the choice between different hypotheses $\theta_i$, we would choose the one with the largest $P(\theta_i|x)$. The resemblance to the Maximum-Likelihood method at this point is, however, only artificial, because in the present context $\theta$ is a parameter, whereas in the Maximum-Likelihood approach to the estimation problem $\theta$ is regarded as a variable.

Choosing all prior probabilities equal according to Bayes' Postulate, eq.(2.29), looks clearly quite arbitrary, and may lead to logical inconsistencies under some circumstances. Consider, for instance, the decay of unstable particles, which may be described in terms of the mean lifetime $\tau$ of the particles, or the decay constant $\lambda$, where $\lambda = 1/\tau$. If we wish to determine $\tau$, having no prior information on this quantity, Bayes' Postulate suggests that we take the prior distribution $P(\tau)$ = constant. If instead $\lambda$ was chosen to describe the decay, and we had no prior information on this quantity, then Bayes' Postulate would suggest that we use the prior distribution $P(\lambda)$ = constant. But this is quite different from the previous suggestion, because

$$P(\tau) = P(\lambda)\left|\frac{d\lambda}{d\tau}\right| = \lambda^2\,P(\lambda).$$

As we shall see later in our discussion of the Maximum-Likelihood method the prior probabilities need not represent a problem; see Sect.9.4.1.
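The drawer example of Sect.2.4.2 uses equal priors, which is exactly the situation covered by Bayes' Postulate, so it makes a convenient numerical check of eqs.(2.24) and (2.28). The Python sketch below is only an illustration; the function names `posterior` and `betting_odds` are our own and are not taken from the text.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Posterior probabilities P(B_i|A) from priors P(B_i) and likelihoods P(A|B_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A|B_i) P(B_i)
    norm = sum(joint)                                      # P(A), the marginal probability
    return [j / norm for j in joint]

def betting_odds(priors, likelihoods, i, j):
    """Relative posterior probability of hypothesis i against hypothesis j."""
    return (likelihoods[i] * priors[i]) / (likelihoods[j] * priors[j])

# Three drawers chosen at random: gold-gold, gold-silver, silver-silver.
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(1), Fraction(1, 2), Fraction(0)]  # P(first coin is gold | drawer)

post = posterior(priors, likelihoods)
odds_12 = betting_odds(priors, likelihoods, 0, 1)
print(post, odds_12)
```

Because the priors are equal they cancel in both functions, and the posterior probabilities are proportional to the likelihoods alone.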
3. General properties of probability distributions

In this chapter we shall be concerned with the formulation of some basic concepts and definitions from probability theory which are also important to applications in statistics. Without specializing our assumptions on the actual form of the probability distribution, we shall discuss a set of general features which are common to most types of probability distributions to be studied later. A number of specific distributions are treated in more detail in the subsequent chapters, which also include examples of practical interest. The present chapter, with its minimum of applications, should be regarded as a rather complete reference list of the general definitions and properties which will be applied and developed further in the remainder of the book.

3.1 THE PROBABILITY DENSITY FUNCTION

From the very outset we will assume that the random variables to be considered are of the continuous type.

For the moment it will suffice to consider a single, continuous random variable $X$. We assume that $X$ can have any value over a certain domain $\Omega$. The probability that the random variable comes out with a value in the particular interval $[x, x+dx]$ is written

$$P(x \le X \le x+dx) = f(x)\,dx.$$

Here $f(x)$ must represent "probability per unit length" and is called the probability density function for $x$. Since the total probability that the variable will have some value in $\Omega$ must be equal to one, the normalization condition is

$$\int_\Omega f(x)\,dx = 1.$$

The way $f(x)$ is introduced guarantees it to have a non-negative dependence on $x$. We shall also assume that the functional dependence on $x$ is unique, that is, $f(x)$ is a single-valued function of $x$. In our further applications $f(x)$ is anticipated to be sufficiently regular, so as to allow differentiation with respect to $x$. In short, $f(x)$ is assumed to be a well-behaved function.

The probability density function will be our main topic in the following. We shall often use the notion "p.d.f.", or just "distribution", when no confusion is possible.

From this point we shall also simplify notation and denote the random variable itself as well as its specific values (observations) by lower case letters.

3.2 THE CUMULATIVE DISTRIBUTION FUNCTION

Instead of characterizing the random variable by its probability density function $f(x)$ one may use the cumulative distribution *) $F(x)$, defined by

$$F(x) \equiv \int_{x_{min}}^{x} f(x')\,dx' \tag{3.3}$$

where $x_{min}$ is the lower limit value of $x$. Since $f(x)$ is always non-negative $F(x)$ is clearly a monotonic increasing function of $x$ over the interval $x_{min} \le x \le x_{max}$. One has

$$F(x_{min}) = 0, \qquad F(x_{max}) = 1.$$

*) Statisticians often prefer the inverse definition and put $f(x) = dF/dx$, referring to $F(x)$ as the "distribution function".

3.3 PROPERTIES OF THE PROBABILITY DENSITY FUNCTION

The probability density function $f(x)$ contains all information about the random variable $x$. Various properties of $f(x)$ serve to characterize the distribution. For instance, a mode of the distribution is a value of $x$ which maximizes the p.d.f. If there is only one mode the function is unimodular. The median of the p.d.f. is defined as the value of $x$ for which

$$F(x_{median}) = \int_{x_{min}}^{x_{median}} f(x)\,dx = \frac{1}{2}.$$
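The definitions of the cumulative distribution, the median and the mode can be tried out numerically. The sketch below assumes, purely for illustration, a chi-square p.d.f. with 5 degrees of freedom on the domain $x \ge 0$; the integration rule (trapezoidal), the bisection search and all function names are our own choices, not prescriptions from the text.

```python
import math

K = 5  # degrees of freedom of the illustrative chi-square p.d.f. (an assumption)

def pdf(x, k=K):
    """Chi-square probability density for k degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def cdf(x, n=4000):
    """F(x): integral of the p.d.f. from x_min = 0 up to x (trapezoidal rule)."""
    if x <= 0.0:
        return 0.0
    h = x / n
    total = 0.5 * (pdf(0.0) + pdf(x))
    total += sum(pdf(i * h) for i in range(1, n))
    return total * h

def median(lo=0.0, hi=60.0, iterations=40):
    """Solve F(x) = 1/2 by bisection; F is monotonic, so the root is unique."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mode(lo=0.0, hi=20.0, steps=20000):
    """Value of x on a regular grid which maximizes the p.d.f."""
    grid = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(grid, key=pdf)

print(cdf(60.0), median(), mode())
```

For this skew distribution the mode lies below the median, which in turn lies below the mean value of 5.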
Particularly useful are the mean as a measure of the central value and the variance as a measure of the spread of the distribution. These concepts are introduced below.

Fig. 3.1 shows the relative position of the location parameters (mode, median, mean) for a unimodular p.d.f.

Fig. 3.1. Location parameters for a unimodular probability density function. (The curve corresponds to a chi-square p.d.f. for 5 degrees of freedom.)

3.3.1 Expectation values of a function

Let $g(x)$ be some function of $x$. We define the mathematical expectation value (or expected value, or expectation) of the function $g(x)$ for the p.d.f. $f(x)$ by

$$E(g(x)) \equiv \int_\Omega g(x)\,f(x)\,dx \tag{3.6}$$

where the integration is over the entire domain of the variable. Thus $f(x)$ serves as a weighting function for $g(x)$, and the resulting quantity $E(g(x))$ is a number, i.e. a constant, independent of $x$. We see that the expectation $E(g(x))$ is a measure of the mean or central value of the function $g(x)$.

From the definition of the expectation value one easily derives the following relations:

$$E(a) = a,$$
$$E(a\,g(x)) = a\,E(g(x)), \quad \text{where } a \text{ is a constant},$$
$$E(a_1 g_1(x) + a_2 g_2(x)) = a_1 E(g_1(x)) + a_2 E(g_2(x)).$$

Thus $E$ has the properties of a linear operator.

An important application of the expectation operator is to derive the expected value of the square of the difference between $g(x)$ and its expectation $E(g(x))$. We will then get a measure of the spread or dispersion of $g(x)$ about its central value, which is called the variance of $g(x)$ for the p.d.f. $f(x)$:

$$V(g(x)) \equiv E\big[(g(x) - E(g(x)))^2\big] = \int_\Omega \big(g(x) - E(g(x))\big)^2 f(x)\,dx.$$

We next specify some definite forms of $g(x)$ which are particularly useful.

3.3.2 Mean value and variance of a random variable

As a simple application we put $g(x) = x$ in the general definition eq.(3.6). We then have the expectation of the random variable itself, which is called the mean value of $x$ for the p.d.f. $f(x)$; this number we denote by $\mu$:

$$\mu \equiv E(x) = \int_\Omega x\,f(x)\,dx. \tag{3.8}$$

For the spread of $x$ about its mean value we take the variance $V(x)$, or dispersion, of $x$ for the p.d.f. $f(x)$; this number is denoted by $\sigma^2$:

$$\sigma^2 \equiv V(x) = E\big((x-\mu)^2\big) = \int_\Omega (x-\mu)^2 f(x)\,dx. \tag{3.9}$$

Clearly $\sigma^2$ is a non-negative quantity; $\sigma$ is called the standard deviation of $x$ for the p.d.f. $f(x)$.

Using the linearity property of $E$ we can establish a useful relation between the expectation of $x^2$ and the parameters $\sigma^2$, $\mu$. From eq.(3.9) we find

$$\sigma^2 = E(x^2 - 2\mu x + \mu^2) = E(x^2) - 2\mu E(x) + \mu^2 = E(x^2) - \mu^2.$$
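The expectation operator is easy to sketch numerically, which also lets one check its linearity and the relation between $E(x^2)$, $\sigma^2$ and $\mu$. The example below assumes, for illustration only, the p.d.f. $f(x) = e^{-x}$ on $x \ge 0$, truncated at $x = 50$ where the neglected tail is of order $e^{-50}$; the Simpson integration and the helper name `expectation` are our own choices.

```python
import math

def expectation(g, f, lo, hi, n=20000):
    """E(g) = integral of g(x) f(x) over [lo, hi], composite Simpson rule (n even)."""
    h = (hi - lo) / n
    total = g(lo) * f(lo) + g(hi) * f(hi)
    for i in range(1, n):
        x = lo + i * h
        total += (4 if i % 2 else 2) * g(x) * f(x)
    return total * h / 3

# Illustrative p.d.f. (an assumption): f(x) = exp(-x), domain truncated to [0, 50].
f = lambda x: math.exp(-x)
LO, HI = 0.0, 50.0

mu = expectation(lambda x: x, f, LO, HI)                 # mean value
var = expectation(lambda x: (x - mu) ** 2, f, LO, HI)    # variance, from its definition
ex2 = expectation(lambda x: x * x, f, LO, HI)            # E(x^2)
lin = expectation(lambda x: 3 * x + 2, f, LO, HI)        # linearity: should be 3*mu + 2

print(mu, var, ex2 - mu ** 2, lin)
```

Both ways of computing the variance agree to within the integration error, as the linearity of $E$ requires.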
It may be instructive to note that one has
Exercise 3.1: Show that, if $a$ is a constant, $V(ax) = a^2 V(x)$.

Exercise 3.2: (The Bienaymé-Tchebycheff inequality) Let $g(x)$ be a non-negative function of the random variable $x$ with p.d.f. $f(x)$ and variance $\sigma^2$. Prove that the probability for $g(x)$ to be at least as large as any constant value $c$ is limited if $E(g(x))$ exists, and

$$P(g(x) \ge c) \le \frac{1}{c}\,E(g(x)).$$

In particular, with $g(x) = (x - E(x))^2$ and $c = \lambda^2\sigma^2$ this is equivalent to

$$P(|x - E(x)| \ge \lambda\sigma) \le \frac{1}{\lambda^2},$$

which is called the Bienaymé-Tchebycheff inequality.

This general result on the probability for $|x - E(x)|$ to exceed a given number of standard deviations turns out to be very useful for proofs of limiting properties and convergence theorems; see Sect.3.10.4.

3.3.3 General moments

Because of their simple interpretation $\mu = E(x)$ and $\sigma^2 = E((x-\mu)^2)$ are extensively used as parameters in the probability density function. For theoretical and practical purposes it is also convenient to define expectation values of other powers of $x$ and $(x-\mu)$. With a general definition we shall call the expectation of $x^k$ the k-th moment of f(x) about the origin, or the k-th algebraic moment,

$$\mu_k' \equiv E(x^k) = \int_\Omega x^k f(x)\,dx.$$

In a similar way the expectation of $(x-\mu)^k$ is called the k-th moment of f(x) about the mean or the k-th central moment,

$$\mu_k \equiv E\big((x-\mu)^k\big) = \int_\Omega (x-\mu)^k f(x)\,dx.$$

The mean value and the variance are therefore special examples of more generally defined moments. In particular, the mean is equal to the first moment of $f(x)$ about the origin, whereas the variance is equal to the second moment of $f(x)$ about the mean,

$$\mu = \mu_1' \quad \text{(first algebraic moment)},$$
$$\sigma^2 = \mu_2 \quad \text{(second central moment)}.$$

The moments of lowest order are easily derived from the definitions:

$$\mu_0' = 1, \qquad \mu_1' = \mu, \qquad \mu_2' = \sigma^2 + \mu^2 \qquad \text{(algebraic moments)}, \tag{3.15}$$

$$\mu_0 = 1, \qquad \mu_1 = 0, \qquad \mu_2 = \sigma^2 \qquad \text{(central moments)}. \tag{3.16}$$

Note in particular the relation between the second moments:

$$\mu_2 = \mu_2' - (\mu_1')^2.$$

This is the same result as eq.(3.12).

The general relations between central moments and algebraic moments are as follows:

$$\mu_k = \sum_{r=0}^{k} \binom{k}{r}\,\mu_{k-r}'\,(-\mu_1')^r \qquad \text{(algebraic moments known)}, \tag{3.18}$$

$$\mu_k' = \sum_{r=0}^{k} \binom{k}{r}\,\mu_{k-r}\,(\mu_1')^r \qquad \text{(central moments known)}.$$

Here $\mu_1' = \mu$, in accordance with eq.(3.15).

The moments of higher order become of interest when one wants to study the behaviour of $f(x)$ for large $|x-\mu|$. For a symmetric distribution all odd central moments vanish. Any odd central moment which is not zero may therefore be taken as a measure of the asymmetry or skewness of the distribution. The simplest of these measures is the third central moment $\mu_3$, but its numerical value will depend to the third order on the units of the variable $x$. To have an absolute and dimensionless measure of the asymmetry one, therefore, defines the asymmetry coefficient, or skewness, by

$$\gamma_1 \equiv \frac{\mu_3}{(\mu_2)^{3/2}} = \frac{E\big((x-\mu)^3\big)}{\sigma^3}.$$

Clearly a positive value of $\gamma_1$ implies that the distribution $f(x)$ has a tail to the right of the mean value, whereas a negative $\gamma_1$ indicates a tail to the left.
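The conversion from algebraic to central moments, eq.(3.18), and the skewness built from the result can be sketched in a few lines of Python. As an illustrative input we assume the p.d.f. $f(x) = e^{-x}$ on $x \ge 0$, whose algebraic moments are $\mu_k' = k!$; the function name is our own.

```python
from math import comb

def central_from_algebraic(alg):
    """Central moments mu_k from algebraic moments mu'_k (alg[k] = mu'_k, alg[0] = 1)."""
    mu = alg[1]  # the mean is the first algebraic moment
    return [
        sum(comb(k, r) * alg[k - r] * (-mu) ** r for r in range(k + 1))
        for k in range(len(alg))
    ]

# Algebraic moments mu'_0 .. mu'_4 of the illustrative p.d.f. exp(-x): k!
algebraic = [1, 1, 2, 6, 24]

central = central_from_algebraic(algebraic)
skewness = central[3] / central[2] ** 1.5  # third central moment over sigma cubed

print(central, skewness)
```

The positive skewness reflects the long tail of this distribution to the right of its mean.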
The coefficient of kurtosis, or peakedness, of $f(x)$ is defined by the dimensionless quantity

$$\gamma_2 \equiv \frac{\mu_4}{(\mu_2)^2} - 3 = \frac{E\big((x-\mu)^4\big)}{\sigma^4} - 3. \tag{3.21}$$

This definition implies a comparison with the normal or Gaussian p.d.f., which has $\gamma_2 = 0$ (see Sect.4.8.4). A positive (negative) value of $\gamma_2$ indicates that the distribution is more (less) peaked about the mean than a normal distribution of the same mean and variance.

3.4 THE CHARACTERISTIC FUNCTION

The various moments introduced in the preceding section serve to characterize the distribution under study. For instance, the first algebraic moment defines a mean or "center of gravity" for the distribution, and the second central moment measures the spread of the distribution about this mean.

In addition to the usefulness of the individual moments there is considerable theoretical interest attached to the complete set of moments $\mu_k$ (or equivalently of $\mu_k'$) since this set determines the probability density function completely.

We shall now introduce the characteristic function $\phi(t)$ for the probability density function $f(x)$. By definition $\phi(t)$ is the Fourier transform of $f(x)$,

$$\phi(t) \equiv \int_{-\infty}^{\infty} e^{itx} f(x)\,dx = E(e^{itx}). \tag{3.22}$$

This function then contains the complete set of algebraic moments for the distribution $f(x)$. By taking the Taylor expansion of $e^{itx}$ about the origin and using the linearity property of the expectation operator, we have

$$\phi(t) = E\Big(\sum_{k=0}^{\infty} \frac{(itx)^k}{k!}\Big) = \sum_{k=0}^{\infty} \frac{(it)^k}{k!}\,\mu_k'. \tag{3.23}$$

Since the algebraic moments $\mu_k'$ appear as coefficients in a series expansion of $\phi(t)$ they can be expressed as

$$\mu_k' = \frac{1}{i^k}\,\frac{d^k\phi(t)}{dt^k}\bigg|_{t=0}.$$

This relation can be used for the evaluation of the algebraic moments of any order when $\phi(t)$ is known.

If instead we want the central moments we should use the characteristic function in the form

$$\phi_\mu(t) \equiv \int_{-\infty}^{\infty} e^{it(x-\mu)} f(x)\,dx = E\big[e^{it(x-\mu)}\big] \tag{3.25}$$

and perform an expansion in a power series about $\mu$. Then, by analogy with eq.(3.23),

$$\phi_\mu(t) = \sum_{k=0}^{\infty} \frac{(it)^k}{k!}\,\mu_k$$

and the central moments are obtained as

$$\mu_k = \frac{1}{i^k}\,\frac{d^k\phi_\mu(t)}{dt^k}\bigg|_{t=0}. \tag{3.27}$$

The relation between the two forms of the characteristic function, defining the two different sets of moments, is simply

$$\phi_\mu(t) = e^{-it\mu}\,\phi(t).$$

Clearly, if one is interested in a general set of moments about an arbitrary point $a$, then one should use instead

$$\phi_a(t) \equiv E\big[e^{it(x-a)}\big] = e^{-ita}\,\phi(t)$$

and derive the desired moments from this function.

With the knowledge of $\phi(t)$ the moments can be evaluated to any order; hence the properties of the parent distribution are derived. Furthermore, the probability density function itself can be found explicitly, since, from eq.(3.22),

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\,\phi(t)\,dt. \tag{3.30}$$
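To see the characteristic function at work, the sketch below evaluates $\phi(t) = E(e^{itx})$ by numerical integration and recovers the first two algebraic moments from finite-difference derivatives at $t = 0$. The p.d.f. $f(x) = e^{-x}$ on $x \ge 0$ (truncated at $x = 50$), the Simpson rule and the step size are all illustrative assumptions; for this p.d.f. the characteristic function is known in closed form, $\phi(t) = 1/(1 - it)$, which provides a cross-check.

```python
import cmath
import math

def char_fn(t, f, lo, hi, n=20000):
    """phi(t) = E(exp(i t x)) over [lo, hi], composite Simpson rule (n even)."""
    h = (hi - lo) / n
    total = f(lo) * cmath.exp(1j * t * lo) + f(hi) * cmath.exp(1j * t * hi)
    for i in range(1, n):
        x = lo + i * h
        total += (4 if i % 2 else 2) * f(x) * cmath.exp(1j * t * x)
    return total * h / 3

# Illustrative p.d.f. (an assumption): f(x) = exp(-x), domain truncated to [0, 50].
f = lambda x: math.exp(-x)
phi = lambda t: char_fn(t, f, 0.0, 50.0)

step = 1e-3
d1 = (phi(step) - phi(-step)) / (2 * step)                 # first derivative at t = 0
d2 = (phi(step) - 2 * phi(0.0) + phi(-step)) / step ** 2   # second derivative at t = 0

m1 = (d1 / 1j).real        # first algebraic moment, phi'(0) / i
m2 = (d2 / 1j ** 2).real   # second algebraic moment, phi''(0) / i^2

print(m1, m2, phi(0.7), 1 / (1 - 0.7j))
```

The values m1 and m2 approximate the mean and $E(x^2)$ of the assumed distribution; higher moments could be obtained the same way, although finite differences lose accuracy quickly with the order.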
Remark: In the literature one frequently finds reference to the moment-generating function $M(t)$, which is used to evaluate moments of distributions. Using this function is in many respects equivalent to our use of the characteristic function. Thus, the moment-generating function about the mean $M_\mu(t)$, defined by

$$M_\mu(t) \equiv \int_{-\infty}^{\infty} e^{t(x-\mu)} f(x)\,dx = E\big(e^{t(x-\mu)}\big), \tag{3.31}$$

provides the central moments through the formula

$$\mu_k = \frac{d^k M_\mu(t)}{dt^k}\bigg|_{t=0},$$

which is of course equivalent to eq.(3.27). However, from a theoretical viewpoint it may be advantageous to work with the characteristic function rather than with the moment-generating function. For instance, when $\phi(t)$ is known, $f(x)$ is explicitly given by eq.(3.30), whereas a similar formula can not be written with $M(t)$.

3.5 DISTRIBUTIONS OF MORE THAN ONE RANDOM VARIABLE

3.5.1 The joint probability density function

Up to now we have assumed the probability density function to depend on a single random variable. The extension to several variables $x_1,x_2,\ldots,x_n$ consists in considering a joint probability density function $f(x_1,x_2,\ldots,x_n)$. We assume this function to be positive and single-valued at every point $x_1,x_2,\ldots,x_n$ in the n-dimensional space, and that it is properly normalized,

$$\int_\Omega f(x_1,x_2,\ldots,x_n)\,dx_1 dx_2 \cdots dx_n = 1$$

when the integration is over the entire domain of all $x_i$. For short we write

$$\int_\Omega f(\mathbf{x})\,d\mathbf{x} = 1.$$

3.5.2 Expectation values

The generalization of eq.(3.6) gives the expectation value of a general function $g(x_1,x_2,\ldots,x_n) \equiv g(\mathbf{x})$ as

$$E(g(\mathbf{x})) = \int_\Omega g(\mathbf{x})\,f(\mathbf{x})\,d\mathbf{x}. \tag{3.34}$$

We can generally define the variance of the function $g(\mathbf{x})$ by

$$V(g(\mathbf{x})) = E\big[(g(\mathbf{x}) - E(g(\mathbf{x})))^2\big].$$

3.5.3 The covariance matrix; correlation coefficients

Specializing $g(\mathbf{x}) = x_i$ we get the expectation or mean value of $x_i$ as

$$\mu_i \equiv E(x_i) = \int_\Omega x_i\,f(\mathbf{x})\,d\mathbf{x}. \tag{3.36}$$

We are next led to handle the extension of the definition of the variance by eq.(3.9) to the case of several variables. The covariance matrix $V(\mathbf{x})$ of $\mathbf{x}$ is defined by its elements

$$V_{ij} \equiv E\big((x_i-\mu_i)(x_j-\mu_j)\big).$$

Here $\mu_i$ and $\mu_j$ are the expectations of the variables $x_i$ and $x_j$, respectively, in accordance with eq.(3.36) above.

The covariance matrix is of great importance to physicists. Some of its properties may be stated as follows:

(i) $V(\mathbf{x})$ is symmetric.

(ii) A diagonal element $V_{ii}$ is called the variance $\sigma_i^2$ of the variable $x_i$. $\sigma_i^2$ is a non-negative quantity, and, analogously to eqs.(3.10) - (3.12) for a single random variable,
I
we have Since this condition must be satisfied for any value of a. there follovs a

o?
1
- V..
I1
- ~ ( ~ -2 ) (ECxi))'. (3.38)
restriction on p,

3.5.4
pZ 5 1,

Independent variables
which leads to the inequality (3.41).

(iii) An off-diagonal element V.. where ilj, is called the Tbe random variables xl,y,...,x
11' ere said to be mutuatty independent
coicnrimce of x . and x. and is denoted by cov(xi.x.). if their joint probability density function is completely factorizable as
I' I
COV(X.,X.)
L 1
Z V..
11
- E(x.x.)
1 I
- E(x~)E(x~). (3.39)
I f(xl,x2,. ..,x,) ' fx(xr)ft(x2). ..fn(xn), (independence). (3.42) !

Tha covariance may be a positive or negative quantity. I This is just a neu f o m l a t i o n of the definition of independence by eq.(2.13)

A frequently used measure on the correlation between two variables


I applied to the case of continuous variables.
Independent variables have the property that their covariance, and
xi.xj is the c o ~ m t a t i acoefficient p(xi.x.) defined by
1 hence their correlation, vanish. To see this, consider the expectation of the
product of two mutually independent variables x ,r.:
1 1
-I xixjfi(xi)f.(x.)dxidx.
I 1
which satisfies
xifi(xi)dx..
1 I
n
x.f.(x.)dx.
I 1 I I
- E(xi)-E(x.)
1
1
(3.43)

Two random variables having


tively (negatively) correlated.
( x
1
~
When
- )1 - 1
p(xi,x.)
are said to be completely posi-
- 0 the variables are m c o ~ r e -
provided that f. and f. are both properly no-lized.
3
Fron the definition of
the covariance of two variables and of the correlation coefficient, eqs.(3.39)

1 and (3.40). respectively. it follows that


Zoted.
To prove that the correlation coefficient is a number between -1 and I cov(xi.x.)
1
- 0. p(xi.x,)
I
- 0, (independent variables). (3.44)
+ 1 we make use of a theorem on the variance of e linear corbination of variables,
It may be noted that although independent variables necessarily ere
which is proved for a more general case in Seet.3.6. The theorem (eq.(3.55))
uneorrelated, the opposite statement is not generally true.
implies that for the sum xl+ruz the variance is given by
The result derived above by eq.(3.43) means that the erpectation of
the product of two mutually independent variables is equal to the product of
the expectations of the individual variables. This is in fact just a special
By definition the variance is a non-negative quantity, hence case of a more general statement about the expectation value of a function
which is factorizable in the two independent variables. To see this, let us
write

-
Dividing by V(xr). putting a 2 ~ ( x ~ ) 1 ~ ( r l ) oz and using the definition of the
8(xi,x.)
I
- "(x~)"(x.)
1
(3.45)

correlation coefficient p, the condition can be written as


where x. and x. are assumed to be mutually independent. Using the definition
I
of independence one readily show that
$$E\big(u(x_i)\,v(x_j)\big) = E\big(u(x_i)\big)\cdot E\big(v(x_j)\big), \qquad \text{(independence)} \tag{3.46}$$

again assuming $f_i$ and $f_j$ to be separately normalized. This general property frequently simplifies the calculation of various expectation values.

Finally, one may note the following fact about two independent variables $x_i$ and $x_j$: If $u=u(x_i)$ and $v=v(x_j)$, then $u$ and $v$ are also independent. The proof of this statement is suggested as an exercise for the reader in Sect. 3.7 (Exercise 3.7).

Exercise 3.3: Given $f(x_1,x_2) = \ldots\,x_1^2 + \ldots\,x_2^2$ and $\Omega$ defined by $x_1^2 + x_2^2 \le 1$ (dependent variables), show that $\rho(x_1,x_2) = 0$ (no correlation).

3.5.5 Marginal and conditional distributions

A projection of the probability density function $f(\mathbf{x})$ onto a subspace is called a marginal density function or a marginal distribution for $f(\mathbf{x})$. Consider the n-dimensional p.d.f. $f(x_1,x_2,\ldots,x_n)$. Integrating over all variables except one, say $x_1$, gives the marginal distribution in this variable; we write

$$h_1(x_1) = \int_{x_2(\min)}^{x_2(\max)}\cdots\int_{x_n(\min)}^{x_n(\max)} f(x_1,x_2,\ldots,x_n)\,dx_2\cdots dx_n, \tag{3.47}$$

and similarly for the other n-1 variables. It will be seen that eq.(3.47) represents an application of the definition of marginal probability, eq.(2.22).

In the case of mutually independent variables for which the p.d.f. factorizes according to eq.(3.42), the marginal distribution becomes

$$h_1(x_1) = f_1(x_1)\int f_2(x_2)\,dx_2\cdots\int f_n(x_n)\,dx_n,$$

with corresponding expressions for $h_2(x_2)$ etc. Because of the overall normalization condition for $f(\mathbf{x})$, one must have

$$f(x_1,x_2,\ldots,x_n) = h_1(x_1)\,h_2(x_2)\cdots h_n(x_n).$$

Thus the concept of independence as stated in Sect.3.5.4 can be formulated by saying that mutually independent random variables have a joint probability density function which is factorizable into its marginal density functions.

We next introduce the concept of a conditional density function or a conditional distribution for $f(\mathbf{x})$. Consider again the n-dimensional p.d.f. $f(x_1,x_2,\ldots,x_n)$. The conditional probability density in all variables except $x_1$ is then defined as the ratio between $f(x_1,x_2,\ldots,x_n)$ and the marginal density function for $x_1$, thus

$$f(x_2,\ldots,x_n|x_1) = \frac{f(x_1,x_2,\ldots,x_n)}{h_1(x_1)}. \tag{3.48}$$

Here it is understood that $x_1$ is kept fixed, and $f(x_2,\ldots,x_n|x_1)$ is then a function of the remaining variables $x_2,\ldots,x_n$. Similar definitions apply for the other variables. It will be seen that eq.(3.48) corresponds to the previous definition of conditional probability by eq.(2.10).

With the conditional p.d.f. of eq.(3.48) one may define conditional expectation values of functions and variables in a manner which is quite analogous to what has appeared before. For example, the conditional expectation of $u(x_2,\ldots,x_n|x_1)$, given $x_1$, is

$$E(u|x_1) = \int\cdots\int u(x_2,\ldots,x_n|x_1)\,f(x_2,\ldots,x_n|x_1)\,dx_2\cdots dx_n.$$

3.5.6 Example: Scatterplots of kinematic variables

In particle physics probability density functions of two variables are encountered in studies of scatterplots, or two-dimensional displays of kinematic variables. The most common of these are presumably the Dalitz plot and the Chew-Low plot.

For definiteness let us think of a reaction

$$a + b \to 1 + 2 + 3$$

at a total centre-of-mass energy $\sqrt{s}$, and let $s_{ij}$, $t_{ai}$ denote, respectively, the squared effective-mass of particles i,j and the squared four-momentum transfer between particles a,i. Then the kinematically allowed region in a Chew-Low display of $t_{a3}$ versus $s_{12}$ is a closed area bounded by a straight line and a hyperbola, see Fig. 3.2(a). The marginal distribution in $s_{12}$ is the projection on the $s_{12}$ axis, giving the one-dimensional distribution in the squared effective-mass of particles 1 and 2. The other marginal distribution for the Chew-Low plot gives the one-dimensional distribution in the squared four-momentum transfer between the initial particle a and the final particle 3. For the Dalitz plot of, say, $s_{12}$ versus $s_{13}$, both marginal distributions are squared effective-mass spectra, see Fig. 3.2(b).

Lorentz-invariant phase space predicts the density in the Chew-Low plot to be given by the formula

where the kinematic function $\lambda$ is defined by

$$\lambda(x,y,z) = x^2 + y^2 + z^2 - 2xy - 2yz - 2zx.$$

(The quantity $\frac{1}{2x}\lambda^{1/2}(x^2,y^2,z^2)$ corresponds to the magnitude of the momenta when two particles of masses y and z share the centre-of-mass energy x.) Thus according to the phase space prescription the conditional density, given $s_{12}$, is a constant,

$$f(t_{a3}|s_{12}) = \frac{d^2R_3}{ds_{12}\,dt_{a3}}, \qquad \text{(independent of } t_{a3}\text{)}.$$
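As a numerical companion to the parenthetical remark on the kinematic function, the sketch below evaluates the two-body momentum $\frac{1}{2x}\lambda^{1/2}(x^2,y^2,z^2)$ and checks that the two particle energies add up to the centre-of-mass energy. It assumes the standard form $\lambda(x,y,z) = x^2+y^2+z^2-2xy-2yz-2zx$; the function names and the numerical values (in GeV) are hypothetical and chosen for illustration only.

```python
import math


def kallen(x, y, z):
    """Kinematic function lambda(x, y, z), assumed here in its standard form."""
    return x * x + y * y + z * z - 2.0 * (x * y + y * z + z * x)


def two_body_momentum(e_cm, m1, m2):
    """Momentum of either particle when masses m1 and m2 share the energy e_cm."""
    if e_cm < m1 + m2:
        raise ValueError("centre-of-mass energy is below threshold")
    lam = kallen(e_cm ** 2, m1 ** 2, m2 ** 2)
    # Guard against a tiny negative value from rounding exactly at threshold.
    return math.sqrt(max(lam, 0.0)) / (2.0 * e_cm)


# Illustrative (hypothetical) values in GeV.
E_CM, M1, M2 = 2.0, 0.140, 0.938
p = two_body_momentum(E_CM, M1, M2)

# Energy conservation: sqrt(p^2 + m1^2) + sqrt(p^2 + m2^2) should equal E_CM.
total_energy = math.hypot(p, M1) + math.hypot(p, M2)
print(p, total_energy)
```

The energy check is a convenient way to confirm that the chosen form of $\lambda$ is the intended one.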

In Fig. 3.2(a) this implies that the density is uniform within the kinematic boundary along lines of constant $s_{12}$. The marginal distribution in $s_{12}$ is obtained by integrating over $t_{a3}$, thus

$$h_1(s_{12}) = \int_{t_{a3}(\min)}^{t_{a3}(\max)} f(t_{a3}|s_{12})\,dt_{a3} = f(t_{a3}|s_{12})\,\big[t_{a3}(\max) - t_{a3}(\min)\big].$$

Since the boundary corresponds to configurations where the final state particles are collinear, the square bracket can be evaluated quite easily to give

Hence the marginal distribution in $s_{12}$ according to Lorentz-invariant phase space is obtained explicitly as

(the proportionality sign is a warning that $h_1$ is not properly normalized due to the conventional definition adopted for $R_3$).

If instead we were interested in the marginal distribution in the other variable $t_{a3}$, we would have to integrate the conditional density $f(s_{12}|t_{a3})$ for fixed $t_{a3}$, thus

$$h_2(t_{a3}) = \int_{s_{12}(\min)}^{s_{12}(\max)} f(s_{12}|t_{a3})\,ds_{12}.$$

It is seen that this leads to an elliptical integral.

Exercise 3.4: Show that the marginal distribution $h_1(s_{12})$ can be normalized using the relation

$$\int_{s_{12}(\min)}^{s_{12}(\max)} h_1(s_{12})\,ds_{12} = R_3(s;m_1^2,m_2^2,m_3^2), \qquad s_{12}(\min) = (m_1+m_2)^2,\quad s_{12}(\max) = (\sqrt{s}-m_3)^2,$$

where the two-particle phase space factor $R_2$ is related to the kinematic function $\lambda$ by

$$R_2(x^2;y^2,z^2) = \frac{\pi}{2x^2}\,\lambda^{1/2}(x^2,y^2,z^2).$$

Exercise 3.5: For the three-particle final state Lorentz-invariant phase space predicts the density within the kinematic boundary of the Dalitz plot to be proportional to

$$\frac{d^2R_3}{ds_{12}\,ds_{13}} = \frac{\pi^2}{4s},$$

i.e. constant. Show that the marginal distribution for $s_{12}$ is given by the expression for $h_1(s_{12})$ in the text.

Fig. 3.2. Illustration of joint probability and marginal distributions. The shaded areas correspond to the kinematically allowed physical regions in two dimensions (variables) for the reaction $a + b \to 1 + 2 + 3$; (a) $t_{a3}$ versus $s_{12}$ (Chew-Low plot), (b) $s_{13}$ versus $s_{12}$ (Dalitz plot). In both diagrams the degree of shading indicates the density expected according to Lorentz-invariant phase space, and the one-dimensional projections on the two axes give the spectral shapes for the variables involved.

3.5.7 The joint characteristic function

In analogy with the definition of the characteristic function in the case of a single random variable, we now introduce the joint characteristic function $\phi(t_1,t_2,\ldots,t_n)$ for the joint probability density function $f(x_1,x_2,\ldots,x_n)$, by the definition

$$\phi(t_1,t_2,\ldots,t_n) = E\big(e^{i(t_1x_1+t_2x_2+\cdots+t_nx_n)}\big) = \int\cdots\int e^{i(t_1x_1+\cdots+t_nx_n)}\,f(x_1,\ldots,x_n)\,dx_1\cdots dx_n.$$

Let us examine the case with just two variables, when the characteristic function is

$$\phi(t_1,t_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{it_1x_1+it_2x_2}\,f(x_1,x_2)\,dx_1dx_2.$$

If $x_1$ and $x_2$ are independent variables, i.e. if $f(x_1,x_2) = f_1(x_1)\cdot f_2(x_2)$, we can write

$$\phi(t_1,t_2) = \int_{-\infty}^{\infty} e^{it_1x_1}f_1(x_1)\,dx_1\int_{-\infty}^{\infty} e^{it_2x_2}f_2(x_2)\,dx_2 = \phi(t_1)\,\phi(t_2),$$

where $\phi(t_1)$ is the characteristic function for $f_1(x_1)$, the marginal distribution in $x_1$, and similarly for $\phi(t_2)$. Thus, for independent variables the joint characteristic function is factorizable*). In general, with n independent variables,

$$\phi(t_1,t_2,\ldots,t_n) = \phi(t_1)\,\phi(t_2)\cdots\phi(t_n), \qquad \text{(independence)}. \tag{3.51}$$

*) In fact the inverse statement also holds: If the joint characteristic function can be factorized, the variables are independent. Thus eq.(3.51) represents a necessary and sufficient condition for independence.

The joint characteristic function may be used to find the general moments of the different variables. The technique is the same as shown before in the case of a single variable. For simplicity, let us again specialize to just two variables, $x_1$ and $x_2$. When
the characteristic function is differentiated with respect to $(it_1)$, one obtains

$$\frac{\partial\phi}{\partial(it_1)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1\,e^{it_1x_1+it_2x_2}\,f(x_1,x_2)\,dx_1dx_2.$$

Putting $t_1=t_2=0$, the right-hand side is nothing but the expectation of $x_1$,

$$E(x_1) = \left[\frac{\partial\phi}{\partial(it_1)}\right]_{t_1=t_2=0}.$$

Similarly, after two differentiations with respect to $(it_1)$, for $t_1=t_2=0$,

$$E(x_1^2) = \left[\frac{\partial^2\phi}{\partial(it_1)^2}\right]_{t_1=t_2=0},$$

and so on. Corresponding expressions for the other variable result from differentiations with respect to $(it_2)$. Also

$$E(x_1x_2) = \left[\frac{\partial^2\phi}{\partial(it_1)\,\partial(it_2)}\right]_{t_1=t_2=0}.$$

Thus, with two variables,

$$E(x_1^m x_2^n) = \left[\frac{\partial^{m+n}\phi}{\partial(it_1)^m\,\partial(it_2)^n}\right]_{t_1=t_2=0}.$$

The extension to more variables is clearly straightforward.

3.6 LINEAR FUNCTIONS OF RANDOM VARIABLES

We shall go back to our definitions of the expectation and variance of a general function and investigate the consequences if $g(x_1,x_2,\ldots,x_n)$ is a linear function of the random variables. We put

$$g = \sum_{i=1}^{n} a_i x_i. \tag{3.53}$$

The expectation of this sum is easily found by using the linearity property of E:

$$E\Big(\sum_{i=1}^{n} a_i x_i\Big) = \sum_{i=1}^{n} a_i E(x_i) = \sum_{i=1}^{n} a_i\mu_i. \tag{3.54}$$

Thus the expectation of a linear combination of variables $x_i$ is the same linear combination of the individual mean values.

The variance of the linear function is slightly more troublesome to evaluate. We have

$$V\Big(\sum_{i=1}^{n} a_i x_i\Big) = E\Big(\Big(\sum_{i=1}^{n} a_i(x_i-\mu_i)\Big)^2\Big).$$

This can further be written as follows:

$$V\Big(\sum_{i=1}^{n} a_i x_i\Big) = \sum_{i=1}^{n} a_i^2 V(x_i) + \sum_{i\ne j} a_i a_j\,\mathrm{cov}(x_i,x_j),$$

or, finally

$$V\Big(\sum_{i=1}^{n} a_i x_i\Big) = \sum_{i=1}^{n} a_i^2 V(x_i) + 2\sum_{i<j} a_i a_j\,\mathrm{cov}(x_i,x_j).$$

Thus the variance of the linear combination consists of two parts: the first part is just the sum of the variances of the individual variables, weighted by the square of their coefficients in the linear combination, the second part is a sum of all the covariance terms that can be made up for the variables.

We note that for the particular case of uncorrelated variables the variance of the linear function reduces to the simple relation

$$V\Big(\sum_{i=1}^{n} a_i x_i\Big) = \sum_{i=1}^{n} a_i^2 V(x_i), \qquad \text{(uncorrelated variables)}. \tag{3.56}$$
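The expectation and variance of a linear combination can be checked numerically. The sketch below evaluates $\sum_i a_i\mu_i$ and $\sum_i\sum_j a_i a_j\,\mathrm{cov}(x_i,x_j)$ (with $\mathrm{cov}(x_i,x_i) = V(x_i)$) and compares them with a small simulation of two correlated normal variables. The coefficients, means and covariance matrix are hypothetical values chosen for illustration, and the function names are not from the text.

```python
import math
import random


def linear_combination_moments(a, mu, cov):
    """Mean and variance of g = sum_i a_i x_i for given means and covariance matrix."""
    n = len(a)
    mean = sum(a[i] * mu[i] for i in range(n))
    var = sum(a[i] * a[j] * cov[i][j] for i in range(n) for j in range(n))
    return mean, var


# Hypothetical example: g = 2*x1 - x2 with correlated x1, x2.
A = [2.0, -1.0]
MU = [1.0, 3.0]
COV = [[4.0, 1.5], [1.5, 1.0]]
mean_exact, var_exact = linear_combination_moments(A, MU, COV)


def simulate(n_events, seed=12345):
    """Sample mean and variance of g from simulated correlated normal variables."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_events):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = MU[0] + 2.0 * z1                                      # V(x1) = 4
        x2 = MU[1] + 0.75 * z1 + math.sqrt(1.0 - 0.75 ** 2) * z2   # V(x2) = 1, cov = 1.5
        values.append(A[0] * x1 + A[1] * x2)
    mean = sum(values) / n_events
    var = sum((g - mean) ** 2 for g in values) / (n_events - 1)
    return mean, var


mean_mc, var_mc = simulate(100_000)
print(mean_exact, var_exact, mean_mc, var_mc)
```

Setting the off-diagonal elements of the covariance matrix to zero reproduces the uncorrelated case of eq.(3.56).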
3.6.1 Example: Arithmetic mean of independent variables with the same mean and variance

Let $x_1,x_2,\ldots,x_n$ be n mutually independent random variables having the same mean value $\mu_i = \mu$ and the same variance $\sigma_i^2 = \sigma^2$. We then take as a particular linear combination the arithmetic mean $\bar{x}$, or average of the $x_i$,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$$

This is a special case of eq.(3.53) in which all coefficients $a_i$ are equal, $a_i = 1/n$, and hence we get from eqs.(3.54) and (3.56) the expectation and variance of $\bar{x}$:

$$E(\bar{x}) = \mu, \tag{3.57}$$

$$V(\bar{x}) = \frac{\sigma^2}{n}. \tag{3.58}$$

3.7 CHANGE OF VARIABLES

It often happens that the probability density function is known for a certain set of variables and that one wants to find what the distribution will be like when a transformation is made to a new set of variables. For instance, given a spectrum of particle momenta one may want to have the corresponding energy spectrum.

Suppose first that x is a continuous random variable with p.d.f. $f(x)$ and that we know a functional dependence

$$y = y(x). \tag{3.59}$$

We ask: What is the p.d.f. $g(y)$ for the new variable y?

For a one-to-one correspondence between the old variable x and the new variable y, an interval $[x,x+dx]$ is mapped onto $[y,y+dy]$ and we require

$$f(x)\,dx = g(y)\,dy.$$

To ensure a non-negative density, we take

$$g(y) = f(x)\left|\frac{dx}{dy}\right|, \tag{3.60}$$

and this is the answer to our question.

If the transformation (3.59) is not one-to-one, and several segments $[x,x+dx]$ map onto $[y,y+dy]$, one must sum over all segments,

$$g(y) = \sum_k f(x_k)\left|\frac{dx}{dy}\right|_{x=x_k}.$$

Eq.(3.60) can easily be extended to cover a transformation from a set of variables $x_1,x_2,\ldots,x_n$ with p.d.f. $f(x_1,x_2,\ldots,x_n)$ to a second set of variables $y_1,y_2,\ldots,y_n$. The p.d.f. for the new set of variables is

$$g(y_1,y_2,\ldots,y_n) = f(x_1,x_2,\ldots,x_n)\,|J|,$$

or, for short, in obvious vector notation,

$$g(\mathbf{y}) = f(\mathbf{x})\,|J|,$$

where J is the Jacobian determinant of the transformation, given by

$$J = \frac{\partial(x_1,x_2,\ldots,x_n)}{\partial(y_1,y_2,\ldots,y_n)}.$$

Exercise 3.6: If $f(x) = (2\pi)^{-1/2}\exp(-\tfrac{1}{2}x^2)$ (the standard normal p.d.f.), show that the variable $y=x^2$ has a p.d.f. given by $g(y) = (2\pi)^{-1/2}\,y^{-1/2}\exp(-\tfrac{1}{2}y)$ (the chi-square distribution with one degree of freedom).

Exercise 3.7: Let $x_1$ and $x_2$ be two independent variables with p.d.f.'s $f_1(x_1)$ and $f_2(x_2)$, respectively, and let $y_1$ be a function of $x_1$ alone, $y_2$ a function of $x_2$ alone. Show that $y_1$ and $y_2$ are also independent.
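The rule for a transformation that is not one-to-one can be tried out on the case $y = x^2$ of Exercise 3.6, where the two branches $x = \pm\sqrt{y}$ both contribute with $|dx/dy| = 1/(2\sqrt{y})$. The sketch below sums the two branches, compares the result with the closed form quoted in the exercise, and adds a small simulation of the probability $P(y \le 1)$. It is a numerical illustration under these assumptions, not a proof, and the function names are chosen here for convenience.

```python
import math
import random


def f(x):
    """Standard normal p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def g_from_branches(y):
    """p.d.f. of y = x^2, summing f(x)|dx/dy| over the branches x = +sqrt(y), -sqrt(y)."""
    root = math.sqrt(y)
    jacobian = 1.0 / (2.0 * root)
    return sum(f(x_k) * jacobian for x_k in (root, -root))


def g_closed(y):
    """Closed form quoted in Exercise 3.6 (chi-square, one degree of freedom)."""
    return y ** -0.5 * math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi)


# Simulation: fraction of events with y = x^2 <= 1, to be compared with
# P(|x| <= 1) = erf(1/sqrt(2)) for a standard normal x.
rng = random.Random(1)
N_EVENTS = 100_000
fraction_mc = sum(1 for _ in range(N_EVENTS) if rng.gauss(0.0, 1.0) ** 2 <= 1.0) / N_EVENTS

for y in (0.1, 1.0, 2.5):
    print(y, g_from_branches(y), g_closed(y))
print(fraction_mc, math.erf(1.0 / math.sqrt(2.0)))
```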
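3.7.1 Example: Dalitz plot variables

Consider a three-particle final state for which Lorentz-invariant phase space predicts in terms of the squared invariant masses $M_{12}^2$, $M_{13}^2$,

$$\frac{d^2R_3}{dM_{12}^2\,dM_{13}^2} = \frac{\pi^2}{4s},$$

that is, a constant density (compare Exercise 3.5). For new variables choose the linear effective masses $M_{12}$, $M_{13}$. Then the Jacobian of the transformation is

$$J = \frac{\partial(M_{12}^2,M_{13}^2)}{\partial(M_{12},M_{13})} = 4M_{12}M_{13}.$$

The density in the new variables is therefore

$$\frac{d^2R_3}{dM_{12}\,dM_{13}} = \frac{\pi^2}{s}\,M_{12}M_{13},$$

which is not a constant. Thus the nice feature of constant density is lost in the change from squared to linear effective masses.

Exercise 3.8: In the example above, prove that a transformation to the energy variables $E_1$, $E_2$ yields a constant probability density.

3.8 PROPAGATION OF ERRORS

We have in the preceding paragraphs studied various functions of random variables. Clearly any such function or new variable which is defined by a functional dependence of random variables is itself a random variable. We shall now study properties of variables of this type.

3.8.1 A single function

Let $x_1,x_2,\ldots,x_n$ be the original random variables and put

$$y = y(x_1,x_2,\ldots,x_n).$$

Let us further assume that the covariance matrix $V(\mathbf{x})$ of $\mathbf{x}$ is known. We need not specify whether the $x_i$'s are independent or not, so the form of $V(\mathbf{x})$ (diagonal or not) is inessential.

To estimate the variance of y we perform a Taylor expansion of y about the mean value $\boldsymbol{\mu} = (\mu_1,\mu_2,\ldots,\mu_n)$ of $\mathbf{x}$. Writing out the terms of order zero and one we have

$$y(\mathbf{x}) = y(\boldsymbol{\mu}) + \sum_{i=1}^{n}(x_i-\mu_i)\left.\frac{\partial y}{\partial x_i}\right|_{\mathbf{x}=\boldsymbol{\mu}} + \text{terms of higher order}. \tag{3.66}$$

Taking the expectation value of this expression each first order term will vanish, so that

$$E(y(\mathbf{x})) = y(\boldsymbol{\mu}) + \text{terms of higher order}. \tag{3.67}$$

Under the assumption that the quantities $(x_i-\mu_i)$ are small, the remaining terms can be dropped to give the approximate result

$$E(y(\mathbf{x})) \approx y(\boldsymbol{\mu}).$$

Introducing this in the formula for the variance of $y(\mathbf{x})$, eq.(3.35), we get

$$V(y(\mathbf{x})) = E\big((y(\mathbf{x}) - E(y(\mathbf{x})))^2\big) \approx E\big((y(\mathbf{x}) - y(\boldsymbol{\mu}))^2\big). \tag{3.69}$$

Now we can find an approximate value of the difference $y(\mathbf{x}) - y(\boldsymbol{\mu})$ from eq.(3.66) by dropping all terms of order higher than one,

$$y(\mathbf{x}) - y(\boldsymbol{\mu}) \approx \sum_{i=1}^{n}(x_i-\mu_i)\left.\frac{\partial y}{\partial x_i}\right|_{\mathbf{x}=\boldsymbol{\mu}}.$$

Substituting this back in eq.(3.69) we obtain the following approximation for the variance of $y(\mathbf{x})$,

$$V(y(\mathbf{x})) \approx \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial y}{\partial x_i}\frac{\partial y}{\partial x_j}\,E\big((x_i-\mu_i)(x_j-\mu_j)\big). \tag{3.71}$$

The expectation values here are nothing but the elements of the covariance matrix of $\mathbf{x}$. So we get the final result

$$V(y(\mathbf{x})) \approx \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial y}{\partial x_i}\frac{\partial y}{\partial x_j}\,V_{ij}(\mathbf{x}), \tag{3.72}$$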
where the derivatives are evaluated at $\mathbf{x}=\boldsymbol{\mu}$.

The formula (3.72) is known as the law of propagation of errors and is of great importance to physicists. In the general case it is to be regarded as only approximately valid, in view of the assumptions made in deriving it (dropping terms of higher order). We have found an expression for the variance of y valid when $\mathbf{x}$ is in the neighbourhood of $\boldsymbol{\mu}$. Note, however, that for the particular case when y has a linear functional dependence on $\mathbf{x}$ all derivatives of second and higher order vanish identically; eq.(3.72) is then exact for all $\mathbf{x}$.

For n mutually independent variables all covariance terms are zero and eq.(3.72) reduces to

$$V(y(\mathbf{x})) \approx \sum_{i=1}^{n}\left(\frac{\partial y}{\partial x_i}\right)^2 V(x_i), \qquad \text{(independent variables)}. \tag{3.73}$$

This relation is also exact only for linear functions of $\mathbf{x}$, and otherwise approximately correct to the extent that higher-order terms can be neglected.

Exercise 3.9: If $y = x_1/x_2$, where $x_1$ and $x_2$ are two independent random variables having variances $V(x_1)$ and $V(x_2)$, respectively, show that $V(y) \approx x_2^{-4}\big(x_2^2V(x_1) + x_1^2V(x_2)\big)$, and $V(y)/y^2 \approx V(x_1)/x_1^2 + V(x_2)/x_2^2$.

3.8.2 Example: Variance of arithmetic mean

Let y be the average of a set of n independent variables $x_1,x_2,\ldots,x_n$, all with the same variance $\sigma^2$,

$$y = \frac{1}{n}\sum_{i=1}^{n} x_i.$$

Then $\partial y/\partial x_i = 1/n$ for all i, higher derivatives of y vanish and eq.(3.73) holds strictly, giving

$$V(y) = \sum_{i=1}^{n}\frac{1}{n^2}\,\sigma^2 = \frac{\sigma^2}{n}.$$

This is the same result as eq.(3.58) derived in the example of Sect.3.6.1.

3.8.3 Several functions; matrix notation

We need a generalization of the law of propagation of errors which will cover the important case when there is a set of functions $y_1,y_2,\ldots,y_m$ which all depend on the n random variables $x_1,x_2,\ldots,x_n$, thus

$$y_k = y_k(x_1,x_2,\ldots,x_n), \qquad k = 1,2,\ldots,m.$$

A Taylor expansion about $\mathbf{x}=\boldsymbol{\mu}$ leads to

$$y_k(\mathbf{x}) = y_k(\boldsymbol{\mu}) + \sum_{i=1}^{n}(x_i-\mu_i)\left.\frac{\partial y_k}{\partial x_i}\right|_{\mathbf{x}=\boldsymbol{\mu}} + \ldots, \qquad k = 1,2,\ldots,m,$$

by analogy with eq.(3.66). Taking the expectation value each of the first order terms drops out, and we have

$$E(y_k(\mathbf{x})) \approx y_k(\boldsymbol{\mu}),$$

which holds exactly in the close neighbourhood of $\boldsymbol{\mu}$. By analogy with eqs.(3.69)-(3.71) we find for the covariance between, say, $y_k$ and $y_l$,

$$V_{kl}(\mathbf{y}) \approx \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial y_k}{\partial x_i}\frac{\partial y_l}{\partial x_j}\,E\big((x_i-\mu_i)(x_j-\mu_j)\big).$$

In analogy with eq.(3.72) we write

$$V_{kl}(\mathbf{y}) \approx \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial y_k}{\partial x_i}\frac{\partial y_l}{\partial x_j}\,V_{ij}(\mathbf{x}), \tag{3.75}$$

which is the general formulation of the law of error propagation. It is understood that the derivatives should be evaluated for $\mathbf{x}=\boldsymbol{\mu}$, and the formula is only valid to the extent that terms of second order and higher can be neglected.

The covariance terms $V_{kl}(\mathbf{y})$ define the covariance matrix $V(\mathbf{y})$ for the dependent variables $\mathbf{y}$. Eq.(3.75) in fact provides the basis for error calculation in physics. The errors on the variables $\mathbf{y}$ are given by the square root of the diagonal terms of $V(\mathbf{y})$. In general a diagonal term $V_{kk}(\mathbf{y})$ will contain covariance terms $V_{ij}(\mathbf{x})$ of the original variables, because, granted a sufficient linearity,
$$V_{kk}(\mathbf{y}) \approx \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial y_k}{\partial x_i}\frac{\partial y_k}{\partial x_j}\,V_{ij}(\mathbf{x}).$$

If, however, the original variables $x_i$ are uncorrelated, the sum reduces to

$$V_{kk}(\mathbf{y}) \approx \sum_{i=1}^{n}\left(\frac{\partial y_k}{\partial x_i}\right)^2 V_{ii}(\mathbf{x}).$$

Hence, in terms of the errors $\sigma$,

$$\sigma^2(y_k) \approx \sum_{i=1}^{n}\left(\frac{\partial y_k}{\partial x_i}\right)^2\sigma^2(x_i), \qquad \text{(uncorrelated } \mathbf{x}\text{)}. \tag{3.78}$$

This expression is commonly referred to as the law of error propagation, but as we have emphasized, it represents only a special case.

Note that even if the original variables $\mathbf{x}$ are uncorrelated the covariance matrix for the new variables $\mathbf{y}$ may well have off-diagonal elements different from zero.

Finally let us, for convenience, summarize in matrix notation. With $\mathbf{x}$ and $\mathbf{y}$ as column vectors having, respectively, n and m elements we write

$$\mathbf{y} = \mathbf{a} + S\mathbf{x} + \text{higher order terms}. \tag{3.79}$$

Here $\mathbf{a}$ is an m-component (column) vector of constants and S an m by n matrix describing the linear part of the transformation $\mathbf{x} \to \mathbf{y}$. Then, to first order,

$$\frac{\partial y_k}{\partial x_i} = S_{ki},$$

and the kl-th element of the covariance matrix $V(\mathbf{y})$ can be expressed as (eq.(3.75)),

$$V_{kl}(\mathbf{y}) = \sum_{i=1}^{n}\sum_{j=1}^{n} S_{ki}\,V_{ij}(\mathbf{x})\,S_{lj}.$$

Thus, the law of propagation of errors takes the form

$$V(\mathbf{y}) = S\,V(\mathbf{x})\,S^{T}, \tag{3.80}$$

where the superscript denotes the transposed matrix.

Exercise 3.10: If the variables $1/p$, $\lambda$, $\phi$ have been measured with errors $\Delta(1/p)$, $\Delta\lambda$, $\Delta\phi$, and with no correlations, what are the errors associated with the derived quantities $p_x = p\cos\lambda\cos\phi$, $p_y = p\cos\lambda\sin\phi$, $p_z = p\sin\lambda$? What are the correlations in the new variables?

3.9 DISCRETE PROBABILITY DISTRIBUTIONS

3.9.1 Modification of formulae

A random variable which can take on only discrete values we denote by r. Its probability distribution is given by the set of probabilities $p_r$, normalized such that

$$\sum_r p_r = 1,$$

where the summation goes over all r.

For a discrete probability distribution the definitions of expectation values are analogous to the definitions introduced earlier in the case of continuous variables. The expectation and variance of r, for instance, are

$$E(r) = \sum_r r\,p_r, \tag{3.82}$$

$$V(r) = \sum_r \big(r - E(r)\big)^2 p_r.$$

Exercise 3.11: If $p_r = \binom{n}{r}p^r(1-p)^{n-r}$ (the binomial distribution) and the variable r may take any integer value $0,1,\ldots,n$, show that $E(r) = np$, $V(r) = np(1-p)$.

3.9.2 The probability generating function

Dealing with discrete probabilities one can make use of the probability generating function, defined by

$$G(z) = \sum_r p_r z^r.$$

Its usefulness comes from the properties of the derivatives evaluated at the point $z=1$. Since

$$G'(z) = \sum_r r\,p_r z^{r-1}, \qquad G''(z) = \sum_r r(r-1)\,p_r z^{r-2},$$

etc., we deduce that
$$G'(1) = E(r), \qquad G''(1) = E\big(r(r-1)\big).$$

For the most common expectation values one has therefore the convenient expressions

$$E(r) = G'(1), \qquad V(r) = G''(1) + G'(1) - \big(G'(1)\big)^2.$$

Exercise 3.12: Given the probability generating function $G(z) = (zp + q)^n$ where $p + q = 1$, show that $E(r) = np$, $V(r) = npq$. (Compare Exercise 3.11.)

3.10 SAMPLING

3.10.1 Universe and sample

A probability density function $f(x)$ for a continuous random variable, or equivalently the set of probabilities in the discrete case, describes the properties of a population, or universe. In physics one associates random variables with observations on a specified physical system, and the p.d.f. $f(x)$ summarizes the outcome of all conceivable measurements on such a system if the measurements were repeated infinitely many times under the same experimental conditions. Since an infinite number of observations is of course impossible, even on the simplest system, the concept of a population for a physicist represents an idealization which can never be attained in practice.

An actual experiment will consist of a finite number of observations. A sequence of measurements $x_1,x_2,\ldots,x_n$ on some quantity is said to constitute a sample of size n. A sample is accordingly a subset of the population or universe; we may say that "a sample of size n is drawn from the universe". Physicists would like to think that their measurements are typical, in the sense that repeated experiments with the same number of measurements are likely to give more or less the same result. This corresponds to the notion of random samples.

3.10.2 Sample properties

Let the sample of size n be $x_1,x_2,\ldots,x_n$. This set of n independent, random variables possesses certain properties, which we may hope resemble those of the population. Two quantities to characterize the sample are

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{3.87}$$

and

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2. \tag{3.88}$$

Here $\bar{x}$ is the sample mean, or arithmetic mean (average), which we have already met; the quantity $s^2$ measures the dispersion of the sample about its mean value and is called the sample variance.

The two quantities $\bar{x}$ and $s^2$, defined here as functions*) of the random variables $x_i$, are themselves random variables. This is clearly so because a repeated drawing of new samples, all of size n, obtained from the same population will produce new $\bar{x}$, $s^2$. Thus the random variables $\bar{x}$ and $s^2$ will have their own distributions. Obviously, these distributions must depend on the properties of the parent distribution, and on n. The study of samples by the distributions of $\bar{x}$ and $s^2$ forms an important part of probability theory.

*) A function of one or more random variables that does not depend on any unknown parameter is called a statistic. In accordance with this definition $\bar{x}$, as well as $s^2$, may be called a statistic.

Particularly interesting are samples drawn from a normal universe. It turns out in this case that the variables $\bar{x}$ and $s^2$ are independent; this property is unique for the normal distribution. Moreover, the resulting distributions for the two variables become especially simple, $\bar{x}$ being normally distributed, and $s^2$ related to a chi-square distribution. This will be discussed further in Sects.4.8.6 and 5.1.6.

3.10.3 Inferences from the sample

A physicist's motivation for undertaking an experiment and performing measurements of physical quantities is that he wants to find out something about "reality"; thus his interest is in some true distribution, or universe. In fact,
he may be prepared to make inferences about this universe on the basis of his restricted number of observations.

Suppose that measurements on the variable x have given the numbers $x_1,x_2,\ldots,x_n$, constituting a sample of size n. Evidently we hope that the sample in some respect is representative of the underlying universe or population. A measure of the population mean value $\mu$ which suggests itself is $\bar{x}$, the arithmetic mean of the sample, as given by eq.(3.87). We may therefore call $\bar{x}$ an estimate of the population mean $\mu$. Similarly, a measure of the population variance $\sigma^2$ is provided by the quantity $s^2$ describing the dispersion of the sample, eq.(3.88). Hence the notion that $s^2$ is the estimate of the population variance $\sigma^2$. We write

$$\hat{\mu} = \bar{x}, \qquad \hat{\sigma}^2 = s^2.$$

The reason for using (n-1) and not n in the expression for $s^2$ is to ensure that $s^2$ is an unbiassed estimator of $\sigma^2$; a discussion on this point is given in Sect.8.4.1. An intuitive explanation why we should take (n-1) instead of n is the following: From the sample alone we do not know exactly what the central value $\mu$ of the population is; we only have an estimate, $\mu \approx \bar{x}$, which is subject to uncertainties. As a measure of the dispersion of the population the quantity $\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$ is therefore likely to be too small, and we should be better off replacing n in the denominator by a smaller number.

When n becomes very large the sample properties will approach the properties of the population, hence

$$\lim_{n\to\infty}\bar{x} = \mu, \qquad \lim_{n\to\infty}s^2 = \sigma^2.$$

This is the contents of general convergence theorems. In particular, the fact that the sample mean has the population mean as its limiting value is a consequence of the Law of Large Numbers, which will be discussed briefly in the following section. It can be intuitively understood from the observation that the expectation and variance of the variable $\bar{x}$ are, respectively, $E(\bar{x})=\mu$ and $V(\bar{x})=\sigma^2/n$ (eqs.(3.57), (3.58)), implying that the spread of $\bar{x}$ about $\mu$ will become small when n is large. Similar results apply for the mean and variance of $s^2$ (Exercise 3.13). Thus, by choosing sufficiently large samples, any desired accuracy can be obtained in the estimates of the population parameters. This property is called consistency of the estimators; see further Sect.8.3.

Exercise 3.13: Show that, in terms of the central moments,

$$E(s^2) = \sigma^2 = \mu_2, \qquad V(s^2) = \frac{1}{n}\Big(\mu_4 - \frac{n-3}{n-1}\,\mu_2^2\Big).$$

3.10.4 The Law of Large Numbers

Convergence theorems play a fundamental role in probability theory and statistics and are thoroughly discussed in treatises on the theoretical foundations of these subjects. Since in this book mathematical rigor is considered of less importance compared to practical implications we shall limit our discussion here to the Law of Large Numbers which was mentioned in the preceding section and which will be referred to in later applications.

Let $x_1,x_2,\ldots$ be a set of independent random variables which have identical distributions with mean value $\mu$. For the first n of these variables the arithmetic mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n}x_i$ will also have mean value $\mu$, regardless of the number n. The (weak) Law of Large Numbers states that, given any positive $\varepsilon$, the probability that $\bar{x}$ deviates from $\mu$ by an amount more than $\varepsilon$ will be zero in the limit of infinite n,

$$\lim_{n\to\infty} P(|\bar{x}-\mu| > \varepsilon) = 0. \tag{3.91}$$

As stated above the theorem concerns the limiting properties of $\bar{x}$ when n approaches infinity. A stronger version of the theorem says something about the behaviour of $\bar{x}$ for any value of n exceeding some finite value, say N. Given two positive $\varepsilon$ and $\delta$, an N exists such that

$$P(|\bar{x}-\mu| > \varepsilon) < \delta$$

for all n>N.
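The content of the Law of Large Numbers can be made concrete with a small simulation. The sketch below repeatedly draws samples of size n from a uniform population on [0, 1) (so that $\mu = 0.5$) and records how often the sample mean deviates from $\mu$ by more than a fixed $\varepsilon$. The choice of population, the value $\varepsilon = 0.05$, the number of repetitions and the function name are assumptions made for illustration; the simulation suggests the behaviour but proves nothing.

```python
import random


def exceed_fraction(n, eps, n_repeats, rng):
    """Fraction of repeated samples of size n whose mean deviates from mu by more than eps."""
    mu = 0.5  # mean of the uniform population on [0, 1)
    exceed = 0
    for _ in range(n_repeats):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - mu) > eps:
            exceed += 1
    return exceed / n_repeats


rng = random.Random(2024)
EPS = 0.05
fractions = {n: exceed_fraction(n, EPS, 1000, rng) for n in (10, 100, 1000)}

for n, fraction in fractions.items():
    print(n, fraction)
```

The printed fractions estimate $P(|\bar{x}-\mu| > \varepsilon)$ and should fall towards zero as n grows.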
The Law of Large Numbers can easily be adapted to the case when the x's have different mean values. It will be observed that nothing has been said about the variance of the distributions. In fact, the theorem remains true even if the variances do not exist. If we assume that the variance of each $x_i$ exists and is equal to $\sigma^2$ the proof for the theorem follows as an immediate consequence of the Bienaymé-Tchebycheff inequality (Exercise 3.2); when applied to the variable $\bar{x}$ of variance $\sigma^2/n$ this inequality can be written

$$P(|\bar{x}-\mu| \ge \varepsilon) \le \frac{\sigma^2}{n\varepsilon^2}.$$

Thus, for a given $\varepsilon$, the probability to have $|\bar{x}-\mu| > \varepsilon$ can be made arbitrarily small by choosing n large enough.

In cases where the variances exist a much more precise statement can be made about the behaviour when n becomes large. It turns out that $\bar{x}$ tends to be normally distributed in these situations, regardless of the distributional shapes of the individual x's. This is the contents of the Central Limit Theorem, which will be discussed in Sect.4.8.8.

4. Special probability distributions

In this chapter we shall be concerned with an examination of those probability distributions which are, probably, most frequently encountered in practice. These distributions represent good approximations to "real life", and/or have particular theoretical importance. Fortunately, they all possess relative mathematical simplicity. In establishing the properties of the different ideal distributions we will make use of the general definitions and theorems from the preceding chapter and, whenever possible, provide illustrations by common, physical examples. The mathematical connection between the different probability distributions is worked out in some detail*); their physical connection is also pointed out in some cases.

Included in this chapter is a number of exercises. Some of these are rather formal and serve to fill a gap in the proof of a statement in the text, which usually requires little more than just a computational effort. Other exercises point out specific properties as well as useful relationships between the different distributions. A few exercises introduce other probability distributions which are related to those discussed in the text. These may be less well-known to particle physicists, but may have a direct, often quite simple mathematical or physical content. Finally, some exercises are included which provide the theoretical background for applications found in the later chapters of the book.

A theoretically and practically important class of probability distributions, the sampling distributions related to the normal p.d.f., is treated separately in Chapter 5.

*) For a graphical illustration of the relationships between the probability distributions, see Fig. 5.4 at the end of Chapter 5.
4.1 THE BINOMIAL DISTRIBUTION

We begin our discussion of probability distributions by considering first a few examples of distributions of random variables of the discrete type. The simplest situation one can think of involves a single, discrete variable which describes an experiment with only two possible outcomes.

4.1.1 Definition and properties

Let us denote the two exclusive outcomes of a random experiment by $A$ and $\bar{A}$; $A$ is called a "success" and $\bar{A}$ a "failure". For each experiment or trial let $p$ ($0 \le p \le 1$) be the probability that a success occurs, and $q = 1 - p$ the probability for a failure. Then, in a sequence of $n$ independent trials, the probability to have a total of $r$ successes and $n-r$ failures is

$$B(r;n,p) = \binom{n}{r} p^r (1-p)^{n-r}, \qquad r = 0,1,\ldots,n. \tag{4.1}$$

This is the binomial (or Bernoulli) distribution for the variable $r$ with the parameters $n$ and $p$. The binomial coefficient

$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$

takes account of the fact that the order of the individual $r$ successes and $n-r$ failures is immaterial.

It is readily checked that the probabilities of eq.(4.1) are properly normalized, since they add to unity when summed over all $r$,

$$\sum_{r=0}^{n} B(r;n,p) = \bigl(p + (1-p)\bigr)^n = 1. \tag{4.3}$$

The binomial probabilities are invariant under the interchange $(r,p) \leftrightarrow (n-r,1-p)$,

$$B(r;n,p) = B(n-r;n,1-p).$$

The binomial distribution has been tabulated in Appendix Table A1 for 11 different values of $p$ between 0.01 and 0.50, and for $n$ up to 20. Appendix Table A2 gives a corresponding tabulation of the cumulative binomial distribution,

$$F(x;n,p) = \sum_{r=0}^{x} B(r;n,p). \tag{4.5}$$

The binomial distribution is symmetric when $p = q$ (= 0.5), and otherwise skew. Figure 4.1 shows illustrations of the distribution for two values of the constant $p$ (= 0.2, 0.5) and three different $n$ (= 5, 10, 20). It is seen that the distribution gets increasingly symmetric for higher values of $n$. When $n$ becomes large the binomial distribution takes an approximate normal (Gaussian) shape. Compare Exercise 4.3.

The mean value and variance for a variable which is distributed according to the binomial law, eq.(4.1), can be found as follows. First, the mean value of the distribution is obtained from the definition of the expectation value of a discrete variable, eq.(3.82),

$$E(r) = \sum_{r=0}^{n} r\,B(r;n,p) = \sum_{r=0}^{n} r\,\frac{n!}{r!\,(n-r)!}\,p^r (1-p)^{n-r}.$$

Since the first term drops out the summation limit can be changed from $r=0$ to $r=1$. Writing out the binomial coefficient and extracting the factor $np$ outside the sum we get

$$E(r) = np \sum_{r=1}^{n} \frac{(n-1)!}{(r-1)!\,(n-r)!}\,p^{r-1}(1-p)^{n-r}.$$

With the substitutions $s=r-1$, $m=n-1$ the sum becomes

$$\sum_{s=0}^{m} \frac{m!}{s!\,(m-s)!}\,p^{s}(1-p)^{m-s} = 1$$

from eq.(4.3). Hence the mean value of a variable with a binomial distribution is

$$\mu = E(r) = np. \tag{4.6}$$

To find the variance from eq.(3.83) we observe that one can write $E(r^2) = E(r(r-1) + r) = E(r(r-1)) + E(r)$, where

$$E(r(r-1)) = \sum_{r=0}^{n} r(r-1)\,\frac{n!}{r!\,(n-r)!}\,p^r(1-p)^{n-r} = n(n-1)p^2 \sum_{r=2}^{n} \frac{(n-2)!}{(r-2)!\,(n-r)!}\,p^{r-2}(1-p)^{n-r} = n(n-1)p^2.$$

The variance of the binomial variable is therefore

$$V(r) = E(r^2) - (E(r))^2 = n(n-1)p^2 + np - (np)^2 = np(1-p) = npq. \tag{4.7}$$

One is often interested in the quantity $r/n$, the relative number of successes in $n$ trials. For this variable the mean and variance are given by

$$E\!\left(\frac{r}{n}\right) = \frac{1}{n}E(r) = p, \tag{4.8}$$

$$V\!\left(\frac{r}{n}\right) = \left(\frac{1}{n}\right)^{2} V(r) = \frac{p(1-p)}{n}. \tag{4.9}$$

Exercise 4.1: From the definition of the cumulative binomial distribution by eqs.(4.1),(4.5), show that $F(n;n,p) = 1$ and that, for $0 \le x \le n-1$,

$$F(x;n,p) = 1 - F(n-x-1;n,1-p).$$

Exercise 4.2: Show from its definition by eq.(3.84) that the probability generating function for the binomial distribution is $G(z) = (zp + q)^n$.

Exercise 4.3: Show from the definitions eqs.(3.20), (3.21) that the asymmetry and kurtosis coefficients of the binomial distribution are given by, respectively,

$$\gamma_1 = \frac{1-2p}{\sqrt{np(1-p)}}, \qquad \gamma_2 = \frac{1-6p(1-p)}{np(1-p)}.$$

Observe from the expression for $\gamma_1$ that, for finite $n$, $p < 0.5$ ($p > 0.5$) implies that the distribution is positively (negatively) skew and has a tail to the right (left). Note also that both coefficients tend to zero when $n$ becomes large, indicating that the binomial distribution becomes similar to the normal distribution (Sects.3.3.3 and 4.8.4).
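The properties of the binomial law can be checked numerically. The following is a minimal Python sketch, not part of the original text: it evaluates the binomial probabilities directly and compares the summed mean and variance with $np$ and $np(1-p)$. The function names and the values $n = 10$, $p = 0.2$ are illustrative assumptions.

```python
from math import comb


def binomial_pmf(r, n, p):
    """B(r; n, p): probability of r successes in n independent trials."""
    return comb(n, r) * p**r * (1.0 - p) ** (n - r)


def binomial_moments(n, p):
    """Mean and variance obtained by explicit summation over r = 0..n."""
    probs = [binomial_pmf(r, n, p) for r in range(n + 1)]
    mean = sum(r * w for r, w in enumerate(probs))
    var = sum(r * r * w for r, w in enumerate(probs)) - mean**2
    return mean, var


if __name__ == "__main__":
    n, p = 10, 0.2  # illustrative values, not from the text
    total = sum(binomial_pmf(r, n, p) for r in range(n + 1))
    mean, var = binomial_moments(n, p)
    print(f"sum of probabilities = {total:.6f}")
    print(f"mean     = {mean:.6f}   np      = {n * p:.6f}")
    print(f"variance = {var:.6f}   np(1-p) = {n * p * (1 - p):.6f}")
```

The same function can be used to confirm the interchange symmetry, for instance by comparing `binomial_pmf(3, 10, 0.2)` with `binomial_pmf(7, 10, 0.8)`.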

Fig. 4.1. The binomial distribution $B(r;n,p)$ for indicated values of the parameters $n$, $p$ (panels for $p$ = 0.2 and 0.5 with $n$ = 5, 10, 20; horizontal axis $r$).

4.1.2 Example: Histogramming events (1)

As an application of the binomial distribution suppose that we for some reason are interested in just one particular bin of a compound histogram. Then $A$ (success) may correspond to getting an entry in this particular bin, say bin number $i$, and $\bar{A}$ (failure) corresponds to an entry in any other bin of the histogram. With a total of $n$ independent events the probability for having just $r$ events in bin $i$ and the remaining $n-r$ events distributed over the other bins is given by the binomial distribution law, eq.(4.1). The expected number of events in the $i$-th bin is $E(r) = np$ from eq.(4.6), and the variance of this number $V(r) = np(1-p)$, eq.(4.7).

The probability $p$ for a success is some constant whose exact value may not be known prior to the experiment. When the experiment has been performed, giving $r$ events in the $i$-th bin out of a total of $n$ events, we may adopt for $p$ its estimated value

$$\hat{p} = \frac{r}{n}.$$

The number of successes $r$ has an estimated variance

$$\hat{V}(r) = n\hat{p}(1-\hat{p}) = r\left(1 - \frac{r}{n}\right)$$

and a standard deviation

$$\hat{\sigma}(r) = \sqrt{r\left(1 - \frac{r}{n}\right)}.$$

The last result implies that the error on the number of events $r$ in the $i$-th bin is not $\sqrt{r}$, but a smaller quantity. Only in the limit when $p \to 0$, usually corresponding to a large number of bins, is $\sigma_r = \sqrt{r}$. In fact, this asymptotic limit reflects the condition for a variable with a Poisson distribution.

4.1.3 Example: Scanning efficiency (2)

We shall take up again the problem with the scanning efficiencies which was introduced in Sect.2.3.11.

Suppose for the moment that we know the efficiencies $\epsilon_1$ and $\epsilon_2$ of two individual, independent scans. The overall scanning efficiency $\epsilon$ is then given by

$$\epsilon = \epsilon_1 + \epsilon_2 - \epsilon_1\epsilon_2 \tag{4.11}$$

(see eq.(2.21), which is the same, except for an obvious change in notation). We want to find expressions for the errors to be associated with the quantities $\epsilon_1$, $\epsilon_2$, $\epsilon$.

The scanning process itself is of binomial nature, because either an event is registered by the scanner, or it is not. We can therefore for the individual scans apply the formulae of the binomial distribution. Thus the standard deviation of the scanning efficiency for the $i$-th scan is obtained from eq.(4.9),

$$\sigma(\epsilon_i) = \sqrt{\frac{\epsilon_i(1-\epsilon_i)}{N}}, \qquad i = 1,2, \tag{4.12}$$

where $N$ is the total number of events contained in the film.

For the overall scanning efficiency the error is found by applying the law of propagation of errors to eq.(4.11). If one assumes that $\epsilon_1$ and $\epsilon_2$ are independent variables, the variance of $\epsilon$ is given by

$$V(\epsilon) = \left(\frac{\partial\epsilon}{\partial\epsilon_1}\right)^{2} V(\epsilon_1) + \left(\frac{\partial\epsilon}{\partial\epsilon_2}\right)^{2} V(\epsilon_2) = (1-\epsilon_2)^2\,V(\epsilon_1) + (1-\epsilon_1)^2\,V(\epsilon_2). \tag{4.13}$$

Hence eqs.(4.11)-(4.13) lead to the following expression for the standard deviation of the overall scanning efficiency,

$$\sigma(\epsilon) = \sqrt{\frac{(1-\epsilon_1)(1-\epsilon_2)(\epsilon_1+\epsilon_2-2\epsilon_1\epsilon_2)}{N}}. \tag{4.14}$$

In the error formulae (4.12) and (4.14) $N$ is the true number of events contained in the film. Usually $N$ is not known exactly and has to be estimated from the number of events found in the two independent scans.

Strictly speaking, the assumptions stated above are not fulfilled in practice. Specifically, the efficiencies $\epsilon_1$, $\epsilon_2$ are not independently determined, since they, as well as the total number of events $N$, have to be estimated from the number of events found in the two independent scans, as indicated by the formulae derived in Sect.2.3.11. Thus a better estimate of the errors would require a more elaborate treatment of the error propagation starting from the observed quantities $N_1$, $N_2$, $N_{12}$. This is presumably only seldom tried in practice, since the systematic errors associated with the scanning procedure in most cases are assumed to be more important than the pure statistical error in the overall efficiency.

Exercise 4.4: (The geometric distribution)

(i) Show that, if $p$ is the probability for having a success in each binomial trial, the probability that the first success occurs in the $r$-th trial is

$$P_1(r) = p(1-p)^{r-1},$$

and verify that $P_1(r)$ with $r = 1,2,\ldots$ gives a correctly normalized probability distribution for the occurrence of the first success.

(ii) Show that this geometric distribution has the probability generating function

$$G(z) = \frac{pz}{1-qz}.$$

(iii) By the use of $G(z)$, show that the mean and variance of this distribution are given by, respectively,

$$E(r) = 1/p, \qquad V(r) = (1-p)/p^2,$$

and that the asymmetry and kurtosis coefficients are

$$\gamma_1 = \frac{2-p}{\sqrt{1-p}}, \qquad \gamma_2 = \frac{p^2-6p+6}{1-p}.$$

The geometric distribution is illustrated in the upper part of Fig. 4.2 for a few values of the parameter $p$.
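Returning to the scanning example of Sect.4.1.3, the error formulae can be evaluated numerically. The sketch below is an illustration only. It assumes the binomial error $\sigma(\epsilon_i) = \sqrt{\epsilon_i(1-\epsilon_i)/N}$ for each scan and propagates it through $\epsilon = \epsilon_1 + \epsilon_2 - \epsilon_1\epsilon_2$ with $\epsilon_1$, $\epsilon_2$ treated as independent; the function names and the input values are hypothetical.

```python
from math import sqrt


def overall_efficiency(e1, e2):
    """Overall efficiency of two independent scans: e1 + e2 - e1*e2."""
    return e1 + e2 - e1 * e2


def sigma_single(e, n_events):
    """Binomial standard deviation of a single scan efficiency."""
    return sqrt(e * (1.0 - e) / n_events)


def sigma_overall(e1, e2, n_events):
    """Closed-form standard deviation of the overall efficiency."""
    return sqrt((1 - e1) * (1 - e2) * (e1 + e2 - 2 * e1 * e2) / n_events)


def sigma_overall_propagated(e1, e2, n_events):
    """Same quantity obtained term by term from the propagation of errors."""
    v1 = sigma_single(e1, n_events) ** 2
    v2 = sigma_single(e2, n_events) ** 2
    return sqrt((1 - e2) ** 2 * v1 + (1 - e1) ** 2 * v2)


if __name__ == "__main__":
    e1, e2, n_events = 0.90, 0.80, 1000  # hypothetical inputs
    print("overall efficiency:", overall_efficiency(e1, e2))
    print("sigma (closed form):", sigma_overall(e1, e2, n_events))
    print("sigma (propagated): ", sigma_overall_propagated(e1, e2, n_events))
```

Both routes give the same number, which is a quick consistency check on the algebra. As the text notes, $N$ itself is usually only an estimate, so the result should be read as a statistical error under idealized assumptions.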

Exercise 4.5: (The negative binomial distribution)

(i) Generalizing the situation from the preceding exercise, show that the probability for obtaining the $k$-th success in the $r$-th trial is given by the negative binomial (or Pascal) distribution

$$P_k(r) = \binom{r-1}{k-1} p^k (1-p)^{r-k}, \qquad r = k, k+1, \ldots.$$

(ii) Show that this probability distribution has the properties

$$E(r) = \frac{k}{p}, \qquad V(r) = \frac{k(1-p)}{p^2}, \qquad \gamma_1 = \frac{2-p}{\sqrt{k(1-p)}}, \qquad \gamma_2 = \frac{p^2-6p+6}{k(1-p)}.$$

(Hint: The probability generating function is $G(z) = [pz/(1-qz)]^k$.)
Note that this distribution is always positively skew.

(iii) Introduce the number of failures, $s = r-k$, and show that the probability distribution for having $s$ failures when the $k$-th success occurs can be written

$$P_k(s) = \binom{s+k-1}{s} p^k (1-p)^s, \qquad s = 0,1,2,\ldots.$$

Verify that this distribution has the mean value shifted (reduced) by the amount $k$, but that all central moments, and hence $V(s)$, $\gamma_1$, $\gamma_2$, are as above.

The negative binomial distribution is shown in Fig. 4.2 for a few combinations of the parameters $p$ and $k$. The probabilities for $k=1$ in the upper part of the figure correspond to the geometric distribution of Exercise 4.4.

Exercise 4.6: (The hypergeometric distribution (1))

(i) Suppose that of $N$ elements, $a$ have the attribute $A$ and the remaining $N-a$ the attribute $\bar{A}$. Show that, when $n$ elements are picked at random and without replacement from the total $N$ elements, the probability that the random sample of size $n$ will contain $r$ elements with the attribute $A$ and $n-r$ elements with the attribute $\bar{A}$, is given by the hypergeometric distribution,

$$P(r;N,n,a) = \frac{\binom{a}{r}\binom{N-a}{n-r}}{\binom{N}{n}}, \qquad r = 0,1,\ldots,\min(a,n).$$

(ii) Show that when $N \gg n$, this distribution reduces to the ordinary binomial distribution of eq.(4.1) with $p = a/N$, in accordance with common sense expectation.

The conditions above can be generalized to a classification in more than two categories; see Exercise 4.8.

Fig. 4.2. The negative binomial distribution (Exercise 4.5) for indicated values of the parameters $k$, $p$. For $k=1$ one has the geometric distribution (Exercise 4.4).
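The limiting behaviour stated in Exercise 4.6(ii) can be made plausible numerically before it is proved. The following Python sketch is an illustration under assumed values: it compares the hypergeometric probabilities with the binomial ones for $p = a/N$ while $N$ grows at fixed sample size. The helper `max_difference` and the chosen numbers are hypothetical.

```python
from math import comb


def hypergeometric_pmf(r, N, n, a):
    """Probability of r elements with attribute A in a sample of size n drawn
    without replacement from N elements, a of which carry attribute A."""
    if r < 0 or r > min(a, n) or n - r > N - a:
        return 0.0
    return comb(a, r) * comb(N - a, n - r) / comb(N, n)


def binomial_pmf(r, n, p):
    """Binomial probability of r successes in n trials."""
    return comb(n, r) * p**r * (1.0 - p) ** (n - r)


def max_difference(N, n, frac):
    """Largest |hypergeometric - binomial| over r, with a = frac*N and p = a/N."""
    a = round(frac * N)
    p = a / N
    return max(
        abs(hypergeometric_pmf(r, N, n, a) - binomial_pmf(r, n, p))
        for r in range(n + 1)
    )


if __name__ == "__main__":
    for N in (20, 200, 2000, 20000):  # population sizes chosen for illustration
        print(N, max_difference(N, n=5, frac=0.3))
```

The printed differences shrink as $N$ grows, in line with the expectation that sampling without replacement becomes indistinguishable from independent trials when the population is much larger than the sample.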
4.2 THE MULTINOMIAL DISTRIBUTION

4.2.1 Definition and properties

The generalization of the binomial conditions to the case with more than two possible outcomes of an experimental trial leads to the multinomial distribution law.

Let the possible outcomes define a set of categories or classes $A_1, A_2, \ldots, A_k$. For each trial the probability of an outcome in the specific class $A_i$ is $p_i$. Then since every trial must give some outcome, the probabilities must add to unity,

$$\sum_{i=1}^{k} p_i = 1.$$

Now we assume that the outcomes of different trials are independent. After $n$ independent trials the probability of having a final result with $r_1, r_2, \ldots, r_k$ outcomes in the different classes is given by

$$M(\mathbf{r};n,\mathbf{p}) = \frac{n!}{r_1!\,r_2!\cdots r_k!}\;p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}. \tag{4.16}$$

Eq.(4.16) gives the multinomial distribution of the variables $\mathbf{r} = \{r_1, r_2, \ldots, r_k\}$ for the parameters $n$ and $\mathbf{p} = \{p_1, p_2, \ldots, p_k\}$. The $r_i$ are not all independent, since

$$\sum_{i=1}^{k} r_i = n.$$

Evidently the multinomial distribution eq.(4.16) includes the binomial distribution eq.(4.1) as a special case. In addition it has the following properties:

(i) The mean value for class $A_i$ is $E(r_i) = np_i$.

(ii) The variance for class $A_i$ is $V(r_i) = np_i(1-p_i)$.

(iii) The covariance for classes $A_i$, $A_j$ is $\mathrm{cov}(r_i,r_j) = -np_ip_j$.

(iv) When $n$ becomes large the multinomial distribution tends to a multinormal distribution.

To prove properties (i) and (ii), notice that the probability for getting $r_i$ outcomes in the class $i$ in a total of $n$ trials is

$$\frac{n!}{r_i!\,(n-r_i)!}\;p_i^{r_i}(1-p_i)^{n-r_i}.$$

This is an example of the binomial case: either a trial gives an outcome in the class $A_i$ (the probability for this occurrence being $p_i$) or it does not (probability $1-p_i$). Therefore the formulae from the binomial distribution eqs.(4.6), (4.7) can be taken over to give the expectation and variance in one particular class.

To prove property (iii), consider the two classes $A_i$ and $A_j$ together. The probability for $r_i$ outcomes in $A_i$ and $r_j$ outcomes in $A_j$, and with the outcomes of the remaining $n-r_i-r_j$ trials distributed over all other classes, is

$$\frac{n!}{r_i!\,r_j!\,(n-r_i-r_j)!}\;p_i^{r_i}\,p_j^{r_j}\,(1-p_i-p_j)^{n-r_i-r_j}.$$

It is easy to show that the expectation value of the product $r_ir_j$ with the p.d.f. of eq.(4.16) is

$$E(r_ir_j) = n(n-1)p_ip_j. \tag{4.18}$$

With this result one gets the covariance from eq.(3.39),

$$\mathrm{cov}(r_i,r_j) = E(r_ir_j) - E(r_i)\,E(r_j) = n(n-1)p_ip_j - (np_i)(np_j) = -np_ip_j,$$

which was stated above.

Notice that the contents in the two classes are always negatively correlated. In terms of the correlation coefficient from eq.(3.40) one has

$$\rho(r_i,r_j) = \frac{-np_ip_j}{\sqrt{np_i(1-p_i)\,np_j(1-p_j)}} = -\sqrt{\frac{p_ip_j}{(1-p_i)(1-p_j)}}.$$

Exercise 4.7: Prove eq.(4.18).
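Properties (i)-(iii) of the multinomial distribution can be verified by brute force for a small case. The sketch below is illustrative only: it enumerates every possible outcome for a chosen $n$ and probability vector, sums the moments with the multinomial weights, and returns the means and the covariance matrix. The function names and the values $n = 6$, $\mathbf{p} = (0.5, 0.3, 0.2)$ are assumptions made for the example.

```python
from itertools import product
from math import factorial


def multinomial_pmf(counts, probs):
    """Multinomial probability for counts (r_1..r_k) with class probabilities probs."""
    coeff = factorial(sum(counts))
    for r in counts:
        coeff //= factorial(r)  # stays an exact integer at every step
    weight = float(coeff)
    for r, p in zip(counts, probs):
        weight *= p**r
    return weight


def multinomial_moments(n, probs):
    """Means and covariance matrix by exhaustive enumeration of all outcomes."""
    k = len(probs)
    mean = [0.0] * k
    second = [[0.0] * k for _ in range(k)]
    for counts in product(range(n + 1), repeat=k):
        if sum(counts) != n:
            continue  # only outcomes with r_1 + ... + r_k = n are allowed
        w = multinomial_pmf(counts, probs)
        for i in range(k):
            mean[i] += counts[i] * w
            for j in range(k):
                second[i][j] += counts[i] * counts[j] * w
    cov = [[second[i][j] - mean[i] * mean[j] for j in range(k)] for i in range(k)]
    return mean, cov


if __name__ == "__main__":
    n, probs = 6, (0.5, 0.3, 0.2)  # illustrative values
    mean, cov = multinomial_moments(n, probs)
    print("means:      ", mean)
    print("variances:  ", [cov[i][i] for i in range(len(probs))])
    print("cov(r1, r2):", cov[0][1], " expected:", -n * probs[0] * probs[1])
```

Exhaustive enumeration is only practical for small $n$ and $k$, but it requires no assumption beyond the definition of the distribution.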
4.2.2 Example: Histogramming events (2)

As an example of the multinomial distribution we consider $n$ events distributed among $k$ bins in a histogram. Then $p_i$ is the probability that an event will fall in bin number $i$, and $r_i$ is the number of events in this bin. The mean value and variance of the variable $r_i$ are, respectively,

$$E(r_i) = np_i, \qquad V(r_i) = np_i(1-p_i).$$

If $p_i \ll 1$, corresponding to a distribution with many bins in general, we have $V(r_i) \approx np_i \approx r_i$, and the standard deviation becomes $\sigma(r_i) \approx \sqrt{r_i}$.

The covariance for the number of events in bins $i$ and $j$ is the negative number

$$\mathrm{cov}(r_i,r_j) = -np_ip_j.$$

Thus the correlation between two bins is only negligible if the probabilities for the two bins are small, corresponding to a histogram with many bins.

The multinomial classification considers $n$, the total number of events, a fixed number (parameter). If instead we regard the number of events $r_i$ in the different bins as independent, random variables of the Poisson type, they will have standard deviations $\sigma(r_i) = \sqrt{r_i}$, and the total number of events $n = \sum_{i=1}^{k} r_i$ will then be a random variable of the Poisson type; see Exercise 4.18 and Sect.4.4.4.

Exercise 4.8: (The generalized hypergeometric distribution (2))

(i) A population consists of $N$ elements which can be classified into $k$ different categories. There are $a_1$ elements in the first category, $a_2$ in the second, and so on, $\sum_{i=1}^{k} a_i = N$. Show that, if a random sample of size $n$ is drawn from this population (that is, if $n$ elements are picked at random among the $N$ elements and without replacement), the probability distribution for obtaining $r_1, r_2, \ldots, r_k$ elements of the different categories is

$$P(r_1,r_2,\ldots,r_k) = \frac{\binom{a_1}{r_1}\binom{a_2}{r_2}\cdots\binom{a_k}{r_k}}{\binom{N}{n}},$$

where $\sum_{i=1}^{k} r_i = n$.

(ii) Show that if the population is very large compared to the size of the sample ($N \gg n$), this distribution reduces to the multinomial distribution of eq.(4.16) with $p_i = a_i/N$, $i = 1,2,\ldots,k$.

4.3 THE POISSON DISTRIBUTION

4.3.1 Definition and properties

With binomial conditions it sometimes happens that the rate $p$ of "successes" is very small. In a long series of $n$ trials the total number of successes $np$ may, however, still be considerable. It is therefore appealing to examine mathematically the limiting case of the binomial distribution, when $p \to 0$, $n \to \infty$ in such a way that the product $np$ remains constant and equal to $\mu$, say. From Stirling's formula for the factorial of a large number, $n! \approx \sqrt{2\pi n}\,n^n e^{-n}$, we have under these conditions for the terms of the binomial distribution

$$\frac{n!}{r!\,(n-r)!}\,p^r(1-p)^{n-r} \approx \frac{1}{r!}\,\frac{\sqrt{2\pi n}\;n^n e^{-n}}{\sqrt{2\pi(n-r)}\;(n-r)^{n-r}e^{-(n-r)}}\left(\frac{\mu}{n}\right)^{r}\left(1-\frac{\mu}{n}\right)^{n-r} \;\to\; \frac{1}{r!}\,\mu^r e^{-\mu}.$$

Thus, we can write

$$P(r;\mu) = \frac{1}{r!}\,\mu^r e^{-\mu}, \qquad r = 0,1,2,\ldots. \tag{4.20}$$

This is the Poisson distribution for the discrete variable $r$, with the parameter (mean value) $\mu$.

The distribution of eq.(4.20) is evidently correctly normalized. The probability generating function (eq.(3.84)) becomes

$$G(z) = \sum_{r=0}^{\infty} z^r\,\frac{\mu^r e^{-\mu}}{r!} = e^{\mu(z-1)}.$$

From the general expressions eqs.(3.85), (3.86) the expectation and variance of the distribution come out as

$$E(r) = G'(1) = \mu, \qquad V(r) = G''(1) + G'(1) - (G'(1))^2 = \mu.$$

Thus the Poisson distribution has the property that the mean equals the variance,

$$E(r) = V(r) = \mu. \tag{4.22}$$

The Poisson distribution is very asymmetric for small $\mu$ and has a tail to the right of the mean. Quantitatively, the asymmetry is expressed by the positive skewness coefficient (eq.(3.20))

$$\gamma_1 = \frac{1}{\sqrt{\mu}},$$

which approaches zero when $\mu$ gets large. Asymptotically, when $\mu$ goes towards infinity the Poisson distribution becomes identical to the normal distribution. As will be seen from Fig. 4.3 the similarity between these two distributions is rather close already at $\mu = 20$.

From eq.(4.20) we observe that

$$P(r;\mu) = \frac{\mu}{r}\,P(r-1;\mu),$$

and the probability will therefore increase from $r = 0,1,2$ etc. so long as $r < \mu$. The maximum probability is at $r = [\mu]$ (the integer part of $\mu$), and with an equal, adjacent maximum at $\mu-1$ if $\mu$ is an integer; see Fig. 4.3.

Fig. 4.3. The Poisson distribution $P(r;\mu)$ for different mean values $\mu$.

The Poisson distribution of eq.(4.20) has been tabulated in Appendix Table A3 for values of $\mu$ between 0.1 and 20. Appendix Table A4 gives a similar tabulation of the cumulative Poisson distribution

$$F(x;\mu) = \sum_{r=0}^{x} P(r;\mu). \tag{4.25}$$

The tables of the Poisson distribution involve only one parameter and are easier to work with than the corresponding tables of the two-parameter binomial distribution. Because of the limiting relationship, the tables of the Poisson distribution represent a convenient approximation of the binomial tables when $p$ is small and $n$ sufficiently large ($\mu = np$).

There exists a useful relationship between the cumulative Poisson sum of eq.(4.25) and the cumulative integral of the chi-square distribution, which we shall discuss in Chapter 5,

$$F(x;\mu) = 1 - \int_{0}^{2\mu} f(u;\nu = 2x+2)\,du. \tag{4.26}$$

Here $f(u;\nu)$ is the chi-square p.d.f. with $\nu$ degrees of freedom, and the quantity on the right-hand side has been displayed graphically for different $\nu$ in Fig.5.2.
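The limiting relationship between the binomial and Poisson distributions described in Sect.4.3.1 can be illustrated numerically. In the sketch below, offered as an example under assumed values, the product $np$ is held fixed at $\mu$ while $n$ increases, and the largest difference between the binomial and Poisson probabilities is reported. The function `max_deviation`, the cut-off `r_max` and the value $\mu = 2$ are hypothetical choices.

```python
from math import comb, exp, factorial


def poisson_pmf(r, mu):
    """Poisson probability mu^r exp(-mu) / r! (intended here for moderate r)."""
    return mu**r * exp(-mu) / factorial(r)


def binomial_pmf(r, n, p):
    """Binomial probability of r successes in n trials."""
    return comb(n, r) * p**r * (1.0 - p) ** (n - r)


def max_deviation(n, mu, r_max=30):
    """Largest |B(r; n, mu/n) - P(r; mu)| for r = 0..min(n, r_max)."""
    p = mu / n
    return max(
        abs(binomial_pmf(r, n, p) - poisson_pmf(r, mu))
        for r in range(min(n, r_max) + 1)
    )


if __name__ == "__main__":
    mu = 2.0  # illustrative mean value
    for n in (10, 100, 1000, 10000):
        print(n, max_deviation(n, mu))
```

The deviation falls roughly in proportion to $1/n$, which is why the one-parameter Poisson tables are a practical substitute for the binomial tables when $p$ is small and $n$ is large.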


From the graph of Fig.5.2 one can therefore read off directly the value of $F(x;\mu)$ on the curve for $\nu = 2x+2$ at the value $u = 2\mu$.

Exercise 4.9: Derive the mean value and the variance of the Poisson distribution by carrying out the summations for $E(r)$ and $E(r^2)$ with the probability distribution (4.20).

Exercise 4.10: Show that the Poisson distribution has the probability generating function $G(z) = \exp(\mu(z-1))$.

Exercise 4.11: Show that the asymmetry and kurtosis coefficients of the Poisson distribution are given by $\gamma_1 = 1/\sqrt{\mu}$ and $\gamma_2 = 1/\mu$, respectively. Both coefficients tend to zero when $\mu$ gets large, indicating an increasing similarity to the normal distribution (Sect.4.8.1).

Exercise 4.12: Show that the Poisson distribution has algebraic moments connected by the relation

$$\mu'_{k+1} = \ldots$$

Exercise 4.13: Show that the Poisson distribution has the characteristic function $\phi(t) = \exp(\mu(e^{it}-1))$.

4.3.2 The Poisson assumptions
Example: Bubbles along a track in a bubble chamber

In the previous section the Poisson distribution was introduced as a limiting case of the binomial distribution. To get more insight into its meaning and applicability as well as the basic assumptions underlying a Poisson law we discuss next in this and the following section two examples of physical processes which are well described by this law. In fact, starting from first principles we shall now derive the Poisson distribution formula.

Let us here think of the distribution of bubbles formed along the paths of charged particles in a bubble chamber. We assume for the moment that the size of the bubbles can be ignored and that the mean number of bubbles per unit length along the track is constant. We denote this average bubble density by $g$, and consider a small length $\Delta\ell$ of the track. The three basic Poisson assumptions may then be formulated as follows:

(i) There is at most one bubble in the interval $[\ell,\ell+\Delta\ell]$.

(ii) The probability for finding one bubble in this interval is proportional to $\Delta\ell$.

(iii) The occurrence of a bubble in the interval $[\ell,\ell+\Delta\ell]$ is independent of the occurrence of bubbles in any other non-overlapping interval.

From assumptions (i) and (ii) the probability that there is one bubble in the interval $[\ell,\ell+\Delta\ell]$ is

$$P_1(\Delta\ell) = g\,\Delta\ell,$$

while the probability that there is no bubble in the interval is

$$P_0(\Delta\ell) = 1 - P_1(\Delta\ell) = 1 - g\,\Delta\ell.$$

Assumption (iii) implies that the occurrence of no bubble in $\Delta\ell$ is independent of the presence of no bubbles over the distance $\ell$, and hence a factorization of the probability for the occurrence of no bubbles over the length $\ell+\Delta\ell$,

$$P_0(\ell+\Delta\ell) = P_0(\ell)\cdot P_0(\Delta\ell).$$

Combining the last two expressions we may write

$$\frac{P_0(\ell+\Delta\ell) - P_0(\ell)}{\Delta\ell} = -g\,P_0(\ell).$$

When $\Delta\ell \to 0$ the left-hand side of this expression is the derivative of $P_0(\ell)$ with respect to $\ell$, hence we get

$$\frac{dP_0(\ell)}{d\ell} = -g\,P_0(\ell). \tag{4.27}$$

The differential equation (4.27) can, because of the assumed constancy of $g$, easily be solved with the boundary condition $P_0(0) = 1$, giving

$$P_0(\ell) = e^{-g\ell}. \tag{4.28}$$
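The factorization argument can be followed numerically. In the minimal sketch below (an illustration, not the authors' procedure) the length $\ell$ is cut into many independent sub-intervals of size $\Delta\ell$, each empty with probability $1 - g\,\Delta\ell$, and the product of these probabilities is compared with $e^{-g\ell}$. The function name and the values of $g$ and $\ell$ are hypothetical.

```python
from math import exp


def prob_no_bubble(length, g, steps):
    """Probability of no bubble over `length`, built from `steps` independent
    sub-intervals of size dl, each empty with probability 1 - g*dl."""
    dl = length / steps
    return (1.0 - g * dl) ** steps


if __name__ == "__main__":
    g, length = 2.0, 1.5  # hypothetical bubble density and track length
    for steps in (10, 100, 1000, 100000):
        print(steps, prob_no_bubble(length, g, steps), exp(-g * length))
```

As the sub-intervals shrink, the product approaches the exponential, which is the content of the limit $\Delta\ell \to 0$ taken in the derivation.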
This formula gives the probability that there are no bubbles over the length $\ell$.

Let us next find the probability for observing $r$ bubbles within the length $\ell$. Since there can be at most one bubble in the small interval $[\ell,\ell+\Delta\ell]$ we may write

$$P_r(\ell+\Delta\ell) = P_r(\ell)\cdot P_0(\Delta\ell) + P_{r-1}(\ell)\cdot P_1(\Delta\ell),$$

where the first term on the right implies all $r$ bubbles in $\ell$ and the second term implies $(r-1)$ bubbles in $\ell$ and one bubble in $\Delta\ell$. Introducing the probabilities $P_0(\Delta\ell)$ and $P_1(\Delta\ell)$ from assumptions (i) and (ii) we get after rearranging

$$\frac{P_r(\ell+\Delta\ell) - P_r(\ell)}{\Delta\ell} = -g\,P_r(\ell) + g\,P_{r-1}(\ell).$$

When $\Delta\ell \to 0$, the left-hand side is a derivative, and leads to the differential equation

$$\frac{dP_r(\ell)}{d\ell} = -g\,P_r(\ell) + g\,P_{r-1}(\ell). \tag{4.29}$$

The solution of this equation is

$$P_r(\ell) = \frac{1}{r!}\,(g\ell)^r e^{-g\ell}, \tag{4.30}$$

which gives the distribution of the number of bubbles $r$ in intervals of fixed length $\ell$. It is seen that eq.(4.30) gives a Poisson distribution with the parameter $(g\ell)$. Specifically it includes eq.(4.28) as a special case.

It may be appropriate to emphasize that eq.(4.30) describes the frequency distribution for the discrete variable $r$, with $\ell$ (or, strictly speaking, $g\ell$) as a parameter of the distribution. From the Poisson assumptions one can also turn the problem around and seek the distribution in the continuous variable $\ell$ for specified values of the parameter $r$. In other words, one can ask for the probability to have a total distance $\ell$ of the track to find exactly $r$ bubbles, given that the average probability to find a bubble is constant along the track and equal to $g$ per unit length. This problem will be investigated later in Sects.4.6 and 4.7, and as we shall see, it will lead us to a specific class of the gamma distributions, including the well-known exponential distribution law.

Let it also be pointed out that it is essential in the derivation of the formulae above that the average bubble density $g$ per unit length is a constant, independent of $\ell$. If $g = g(\ell)$ the resulting distribution of bubbles will not be Poisson but some other, generally unknown, distribution; see also Exercise 4.35.

Exercise 4.14: Verify that eq.(4.30) is the solution of eq.(4.29).

Exercise 4.15: Suppose that the average bubble density of minimum ionizing particles of charge $\pm e$ in a bubble chamber is 9 bubbles per cm. What is the probability that a relativistic particle will produce only 1 bubble per cm, equal to the bubble density expected from a quark of charge $\pm\frac{1}{3}e$? If the minimum ionization were two times larger, i.e. 18 bubbles per cm, what would then the probability be for observing a track with 1/9 of this average bubble density?

Exercise 4.16: The number of beam particles per pulse is assumed to be Poisson distributed. If it is known that the average number of particles per pulse is 16, what is the probability that a pulse will have between 12 and 20 particles? (Answer: 0.7411.) See also Exercise 4.40.

Exercise 4.17: In an experimental search for weak neutral currents 9 candidates were observed for the neutrino reactions

$$\ldots$$

The candidates could, however, also be interpreted as background events due to the neutron reactions

$$n + p \to n + p + \pi^0,$$
$$n + p \to n + n + \pi^+.$$

From identified events of the type $n + p \to p + p + \pi^-$ the expected number of background events was estimated to be 4.9. Assuming that the number of background events is Poisson distributed with mean value 4.9, what is the probability to have 9 or more background events? Do you consider that this experiment indicates the presence of weak neutral currents?

4.3.3 Example: Radioactive emissions

A frequently cited example of a Poisson process is that of particle emission from a radioactive source. If the particles are emitted from the source at an average rate of $\lambda_x$ particles per unit time the number of emissions $r_x$ in fixed time intervals $t$ follows a Poisson law with mean $\lambda_x t$,

$$P(r_x;\lambda_x t) = \frac{1}{r_x!}\,(\lambda_x t)^{r_x} e^{-\lambda_x t}.$$
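For a counting experiment of this kind, the probability of observing a number of counts within a given window follows by summing Poisson terms. The sketch below is an illustration under stated assumptions: the probabilities are built with the recursion $P(r) = (\mu/r)\,P(r-1)$, the window limits are taken as inclusive, and the rate and counting time are hypothetical values chosen so that the mean is 16, the case quoted in Exercise 4.16.

```python
from math import exp


def poisson_probs(mu, r_max):
    """Poisson probabilities for r = 0..r_max from P(r) = (mu / r) * P(r - 1)."""
    probs = [exp(-mu)]
    for r in range(1, r_max + 1):
        probs.append(probs[-1] * mu / r)
    return probs


def prob_between(mu, r_low, r_high):
    """Probability of observing r_low <= r <= r_high counts (limits inclusive)."""
    probs = poisson_probs(mu, r_high)
    return sum(probs[r_low : r_high + 1])


if __name__ == "__main__":
    rate, t = 4.0, 4.0  # hypothetical rate and counting time, mean = rate * t = 16
    print(f"P(12 <= r <= 20) = {prob_between(rate * t, 12, 20):.4f}")
```

Using the recursion avoids large factorials and powers, so the same helper remains usable for larger mean values than a direct evaluation of $\mu^r/r!$ would allow.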
Suppose now that the source is placed in surroundings where the background of radioactive emissions is given by an average rate $\lambda_b$ particles per unit time. Then the number $r_b$ of background emissions in time intervals of length $t$ is Poisson distributed with mean $\lambda_b t$,

$$P(r_b;\lambda_b t) = \frac{1}{r_b!}(\lambda_b t)^{r_b} e^{-\lambda_b t}.$$

Accessible for measurement is the sum of background and source emissions. Let the number of counts in an interval $t$ be denoted by $r$. The distribution for the variable $r$ can then be expressed as

$$P(r;\lambda_x t,\lambda_b t) = \sum_{r_b=0}^{r} P(r-r_b;\lambda_x t)\,P(r_b;\lambda_b t) = \sum_{r_b=0}^{r} \frac{1}{(r-r_b)!\,r_b!}(\lambda_x t)^{r-r_b}(\lambda_b t)^{r_b}\,e^{-(\lambda_x+\lambda_b)t},$$

or,

$$P(r;\lambda_x t,\lambda_b t) = \frac{1}{r!}\big((\lambda_x+\lambda_b)t\big)^{r} e^{-(\lambda_x+\lambda_b)t} \equiv P\big(r;(\lambda_x+\lambda_b)t\big). \qquad (4.31)$$

Thus the observed emissions are also Poisson distributed, with a rate equal to the sum of the rates of the source and background.

With this we have arrived at the addition theorem for Poisson distributed variables. The generalization to more variables is obvious: The sum of any number of independent Poisson variables is itself a Poisson variable with mean value equal to the sum of the individual means. See also Exercise 4.18 below.

Exercise 4.18: Let $r_1, r_2, \ldots, r_n$ be a set of independent Poisson variables with mean values $\mu_1, \mu_2, \ldots, \mu_n$. Show, using the characteristic function technique of Sect.3.4 and the result of Exercise 4.13, that $r = \sum_{i=1}^{n} r_i$ is also a Poisson variable with mean value $\mu = \sum_{i=1}^{n} \mu_i$.

4.4 RELATIONSHIPS BETWEEN THE POISSON AND OTHER PROBABILITY DISTRIBUTIONS

The Poisson distribution has some interesting connections to other probability distributions which are useful for physical applications. In the following we indicate how the same mathematical relationships between the Poisson and other distributions can come out when a physical problem is attacked from different viewpoints which may at first appear rather dissimilar. We study in particular the connections between the Poisson and binomial/multinomial distribution laws, which are important for many practical problems. In the final section we indicate an extension to the compound Poisson distribution, adequate for the description of chained reactions.

4.4.1 Example: Distribution of counts from an inefficient counter

Suppose that charged particles are counted by some electronic device which has a probability $p < 1$ for registering a particle traversing it; the registration probability is the same for all particles. We assume further that the number of particles $n$ entering and traversing the device in fixed time intervals $t$ is Poisson distributed with mean value $\nu$. We want to find the distribution for the number of registrations $r$ over time intervals of length $t$.

To have $r$ counts by the inefficient device at least $r$ particles must have traversed it. The probability we seek is therefore obtained by adding all probabilities that will give $r$ registrations. For a given $n$ the probability of getting $r$ counts is given by the binomial law. Thus, when all conditional probabilities are added with $n$ running from $r$ to infinity, we get

$$P(r) = \sum_{n=r}^{\infty} B(r;n,p)\,P(n;\nu) = \sum_{n=r}^{\infty} \frac{n!}{r!\,(n-r)!}\,p^{r}(1-p)^{n-r}\,\frac{\nu^{n}}{n!}\,e^{-\nu},$$

or,

$$P(r) = \frac{1}{r!}(p\nu)^{r} e^{-p\nu}, \qquad r = 0, 1, \ldots, \qquad (4.32)$$

which is nothing but a Poisson distribution with mean value $p\nu$.
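The summation leading to eq.(4.32) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the example values $\nu = 4$ and $p = 0.3$, and the truncation point `n_max` are all assumptions made for illustration.

```python
import math

def poisson(r, mean):
    """Poisson probability P(r; mean)."""
    return mean**r * math.exp(-mean) / math.factorial(r)

def binomial(r, n, p):
    """Binomial probability B(r; n, p)."""
    return math.comb(n, r) * p**r * (1.0 - p)**(n - r)

def registered_counts(r, nu, p, n_max=150):
    """Truncated sum of B(r; n, p) * P(n; nu) over n = r, ..., n_max."""
    return sum(binomial(r, n, p) * poisson(n, nu) for n in range(r, n_max + 1))

nu, p = 4.0, 0.3  # assumed example values
for r in range(6):
    # The two columns should agree: summed series versus Poisson of mean p*nu.
    print(r, registered_counts(r, nu, p), poisson(r, p * nu))
```

The agreement of the two columns illustrates that an inefficient counter draws a Poisson sample of reduced mean from a Poisson population.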
An inefficient counter of the above type has the property that it picks a random sample from the parent population. With a Poisson population the random sample was found to be of the Poisson type. Conversely, the sample will only be Poisson distributed if the population was Poisson.

4.4.2 Example: Subdivision of a counting interval

We assume that a detector registers particles over periods of $T$ seconds and that the number of counts $n$ follows a Poisson law with mean value $\nu = \lambda T$. We want to find the distribution describing the number of counts $r$ in an interval of $t$ seconds, where $t < T$.

From the specification above the counts occur at a rate of $\lambda$ counts per second. Since the basic process is of the Poisson type the occurrences of events in two non-overlapping time intervals are independent. The probability to have $r$ counts in the interval $t$ and $n-r$ counts in the remaining time $T-t$ is therefore equal to the product of the two Poisson probabilities,

$$P(r) = P(r;\lambda t)\,P(n-r;\lambda(T-t)).$$

This is not the probability distribution we seek, because the normalization is not correct. We need the conditional probability to have $r$ and $n-r$ counts, given a total of $n$ counts. The conditional probability is therefore obtained by dividing $P(r)$ by the Poisson probability for $n$ counts in the time $T$, thus

$$P(r \text{ in } t,\; n-r \text{ in } T-t \mid n \text{ in } T) = \frac{P(r;\lambda t)\,P(n-r;\lambda(T-t))}{P(n;\lambda T)}.$$

Substituting the explicit Poisson probabilities on the right-hand side and rearranging terms we get

$$P(r \text{ in } t,\; n-r \text{ in } T-t \mid n \text{ in } T) = \frac{n!}{r!\,(n-r)!}\left(\frac{t}{T}\right)^{r}\left(1-\frac{t}{T}\right)^{n-r} \equiv B\!\left(r;n,\frac{t}{T}\right). \qquad (4.33)$$

This is a binomial distribution law for the variable $r$, with parameters $n$ and $t/T$.

The result found is identical to what we would have obtained with a less sophisticated approach, considering the total number of counts $n$ a fixed number, equal to the number of independent binomial "trials". In this line of thought a "success" would correspond to having the count occur in the time $t$, while a "failure" would be that it occurred in the remaining time $T-t$. With an average counting rate $n/T$ per second, the success rate, or the probability for each count occurring in the time $t$, is $p = \frac{1}{n}(n/T)\,t = t/T$. Thus the probability for $r$ counts in $t$ and $n-r$ counts in $T-t$ is given by the expression above.

4.4.3 Relation between binomial and Poisson distributions
Example: Forward-backward classification

The mathematical relationship between the binomial and the Poisson probabilities of the preceding section can be derived from an alternative point of view, assuming two variables related in a binomial distribution, with their sum obeying a Poisson law. The formulation below involving two variables can easily be generalized to several variables; see the subsequent section.

For the moment, let us suppose that we make a classification of particles in two categories, "forward" and "backward", according to their production angle in an overall centre-of-mass system. The total number of particles $n$ is assumed to be a Poisson variable with mean value $\nu$. For any $n$ the numbers of particles in the forward ($f$) and backward ($b$) hemispheres are conditioned through a binomial law, which we write

$$B(f;n,p) = \frac{n!}{f!\,b!}\,p^{f} q^{b}.$$

Here $p$ and $q$ are the constants describing the fractions of forward and backward particles, respectively. The joint probability distribution for all three variables $f$, $b$ and $n$ becomes therefore

$$P(f,b,n) = \frac{n!}{f!\,b!}\,p^{f} q^{b}\cdot\frac{\nu^{n}}{n!}\,e^{-\nu}.$$

Using the facts that $p+q=1$ and $f+b=n$ this can be written

$$P(f,b,n) = \frac{1}{f!}(\nu p)^{f} e^{-\nu p}\cdot\frac{1}{b!}(\nu q)^{b} e^{-\nu q},$$

which is nothing but a product of two Poisson probabilities for the variables $f$ and $b$ with mean values $\nu p$ and $\nu q$, respectively,

$$P(f,b,n) = P(f;\nu p)\,P(b;\nu q).$$
The explicit $n$-dependence therefore drops out on the right-hand side.

The last formula could of course have been written down immediately if we had assumed $f$ and $b$ to be two independent Poisson variables.

It will be seen that the present and the preceding section are mathematically equivalent, in the sense that they involve the same Poisson and binomial factors, but in different order.

Exercise 4.19: An attempt is made to measure the Goldhaber asymmetry coefficient

$$\gamma = \frac{f-b}{f+b}.$$

(i) Show that, if $f$ and $b$ are considered two independent Poisson variables, the variance of $\gamma$ is approximately

$$V(\gamma) \approx \frac{4fb}{(f+b)^{3}}.$$

(Hint: Use the law of error propagation, eq.(3.77).)
(ii) Show that, if $n$ is considered fixed, with $f$ and $b$ conditioned in a binomial law with constants $n$, $p$, $q$, the variance is given by the exact expression

$$V(\gamma) = \frac{4pq}{n},$$

which coincides with the result from (i), provided $p$ and $q$ are replaced by their estimated values, $p = \hat{p} = f/n$, $q = \hat{q} = b/n$.

4.4.4 Relation between multinomial and Poisson distributions
Example: Histogramming events (3)

Suppose that $k$ variables $r_1, r_2, \ldots, r_k$ are dependently distributed according to the multinomial distribution law, eq.(4.16), in such a way that their sum $n$ is a Poisson variable with mean value $\nu$. The joint distribution is then equal to the product of the multinomial and the Poisson probabilities,

$$\frac{n!}{r_1!\,r_2!\cdots r_k!}\,p_1^{r_1} p_2^{r_2}\cdots p_k^{r_k}\cdot\frac{\nu^{n}}{n!}\,e^{-\nu}.$$

Since $\sum_{i=1}^{k} p_i = 1$ and $\sum_{i=1}^{k} r_i = n$, this can be organized to give

$$\prod_{i=1}^{k}\frac{1}{r_i!}(\nu p_i)^{r_i} e^{-\nu p_i}.$$

Thus the joint distribution is equal to a product of $k$ Poisson distributions, the $i$-th variable $r_i$ having the mean value $\nu p_i$.

This result can be applied to considerations over the contents of a histogram with $k$ bins. If the total number of entries $n$ is a Poisson variable, it is appropriate to regard the numbers of events in each bin, $r_i$, $i = 1, 2, \ldots, k$, as independent Poisson variables. Thus, for each bin, $E(r_i) = V(r_i) = r_i$, as stated in Sect.4.2.2.

4.4.5 The compound Poisson distribution
Example: Droplet formation along tracks in cloud chamber

Let $r_i$ be a set of $n$ independent Poisson variables with common mean value $\mu$, and let also $n$ be Poisson distributed with mean value $\nu$. We want to find the distribution of the sum $r = \sum_{i=1}^{n} r_i$.

From the definition of marginal probability the probability $P(r)$ we seek will be the sum of all probabilities that produce exactly $r$ "events", hence

$$P(r) = \sum_{n=0}^{\infty} P\Big(r=\sum_{i=1}^{n} r_i;\mu\Big)\,P(n;\nu).$$

Here $P(n;\nu)$ is the Poisson distribution for the variable $n$, and $P(r=\sum r_i;\mu)$ is the joint probability distribution for the $n$ constituent variables $r_i$, which according to the addition theorem for Poisson variables (see Sect.4.3.3 and Exercise 4.18) is the Poisson distribution $P(r;n\mu)$. Hence

$$P(r) = \sum_{n=0}^{\infty}\frac{(n\mu)^{r}}{r!}\,e^{-n\mu}\cdot\frac{\nu^{n}}{n!}\,e^{-\nu}. \qquad (4.36)$$

This is the compound Poisson distribution.

The compound Poisson distribution has the probability generating function

$$G(z) = \exp\!\big(\nu\,(e^{\mu(z-1)}-1)\big),$$

from which all moments can be derived. Specifically, the mean and variance come out as

$$E(r) = \nu\mu, \qquad V(r) = \nu\mu(1+\mu).$$
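For the compound Poisson distribution the generating function gives the mean $\nu\mu$ and the variance $\nu\mu(1+\mu)$. As a hedged numerical illustration, the sketch below sums the series $\sum_n P(r;n\mu)P(n;\nu)$ directly and compares the first two moments with these expressions. The function names, the example values $\mu = 2$, $\nu = 3$, and the truncation limits are assumptions chosen for this example only.

```python
import math

def poisson(r, mean):
    """Poisson probability P(r; mean), evaluated in log space; P(0; 0) = 1."""
    if mean == 0.0:
        return 1.0 if r == 0 else 0.0
    return math.exp(r * math.log(mean) - mean - math.lgamma(r + 1))

def compound_poisson(r, mu, nu, n_max=60):
    """Truncated sum over n = 0, ..., n_max of P(r; n*mu) * P(n; nu)."""
    return sum(poisson(r, n * mu) * poisson(n, nu) for n in range(n_max + 1))

mu, nu = 2.0, 3.0  # assumed example values
probs = [compound_poisson(r, mu, nu) for r in range(201)]
mean = sum(r * p for r, p in enumerate(probs))
var = sum((r - mean) ** 2 * p for r, p in enumerate(probs))
print(mean, nu * mu)            # the two numbers should agree
print(var, nu * mu * (1 + mu))  # the two numbers should agree
```

The variance exceeds the mean, in contrast to the simple Poisson law, which is the signature of a chained process.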
The compound Poisson distribution is applicable whenever a random process of the Poisson type initiates another. In nature, one can find many examples of chained reactions, where the products from one type of reaction give rise to a second generation of random events. For instance, it has been suggested (by R.K. Adair and H. Kasha) that the formation of droplets along the tracks of charged particles in a cloud chamber provides an example of chained Poisson processes and should be described by the compound Poisson distribution rather than by the simple Poisson law. The physical argument is that the charged particle on its passage through matter will experience a series of elementary scatterings, the number of which, over fixed lengths, will be Poisson distributed, and further that the number of droplets produced in each scattering event is also adequately described by a Poisson law. The number of droplets $r$ over fixed lengths will then be given by eq.(4.36), where $\mu$ gives the mean number of droplets per elementary scattering and $\nu$ the mean number of scatterings over the fixed length.

Exercise 4.20: In the example in the text, let $\ell$ denote the fixed length over which the droplets are counted, and put $\nu = \lambda\ell$ where $\lambda$ is the average number of scatterings per unit length. Writing the probability generating function in the form $G(z,\ell) = \exp\!\big(\lambda\ell\,(e^{\mu(z-1)}-1)\big)$, show that, with $\ell = \ell_1 + \ell_2$,

$$G(z,\ell_1+\ell_2) = G(z,\ell_1)\,G(z,\ell_2).$$

This factorization property implies that the number of droplets over each of the two non-overlapping lengths $\ell_1$ and $\ell_2$ will also be distributed according to the compound Poisson law. The result can obviously be generalized to any number of non-overlapping lengths.

Exercise 4.21: (The general compound Poisson distribution)
(i) The conditions of the previous section can be generalized to situations where the second generation of the branching process is describable by any (i.e. not necessarily Poisson) distribution. Specifically, let the number of primaries $n$ be Poisson distributed (as before) with mean value $\nu = \lambda t$. Each primary gives rise to a number of secondaries $r_i$ which are distributed with a common mean value $\mu$. Show that the distribution of the total number of secondaries $r = \sum_i r_i$ is given by the general compound Poisson distribution

$$P(r) = \sum_{n=0}^{\infty} P\Big(r=\sum_{i=1}^{n} r_i;\mu\Big)\,\frac{(\lambda t)^{n}}{n!}\,e^{-\lambda t},$$

where $P(r=\sum_{i=1}^{n} r_i;\mu)$ is the joint distribution of the $r_i$, given $n$.
(ii) If the constituent variables $r_i$ have the probability generating function $g(z)$, show that the probability generating function of $r$ is given by

$$G(z,t) = \exp\!\big(\lambda t\,(g(z)-1)\big).$$

(iii) Show that the probability generating function has the factorization property

$$G(z,t_1+t_2) = G(z,t_1)\,G(z,t_2)$$

and interpret the result.
Compare Exercise 4.20.

4.5 THE UNIFORM DISTRIBUTION

4.5.1 The uniform p.d.f.

In passing from discrete to continuous random variables the simplest situation one can think of is that there is a single variable $x$ for which the probability density is constant over the region where $x$ is defined. We write

$$f(x) = \frac{1}{b-a}, \qquad a \le x \le b, \qquad (4.39)$$

which gives the uniform probability density function.

The expectation and variance of $x$ with the uniform p.d.f. become, from eqs.(3.8) and (3.9),

$$E(x) = \int_a^b x f(x)\,dx = \tfrac{1}{2}(a+b), \qquad (4.40)$$

$$V(x) = \int_a^b (x-E(x))^{2} f(x)\,dx = \tfrac{1}{12}(b-a)^{2}. \qquad (4.41)$$

Since $f(x)$ is symmetric about its mean all odd central moments vanish; the even central moments can be found by trivial integration,

$$\mu_{2k} = \frac{1}{2k+1}\left(\frac{b-a}{2}\right)^{2k}.$$

The cumulative distribution for the uniform p.d.f. is

$$F(x) = \int_a^x f(x')\,dx' = \frac{x-a}{b-a}, \qquad a \le x \le b. \qquad (4.43)$$

Fig. 4.4 illustrates $f(x)$ and $F(x)$ for the uniform distribution.
Fig. 4.4. The p.d.f. $f(x)$ and the cumulative distribution $F(x)$ for a uniform distribution between $x = a$ and $x = b$.

The uniform distribution, although extremely simple, is nevertheless very useful; for example, any distribution of a continuous variable can be transformed into a uniform distribution.

To see this, recall the definition of the cumulative distribution function $F(x)$ for an arbitrary continuous p.d.f. $f(x)$,

$$F(x) = \int_{-\infty}^{x} f(x')\,dx'.$$

If we make the substitution

$$u = F(x),$$

the new variable $u$ will cover the region $0 \le u \le 1$ and be uniformly distributed, because from eq.(3.60) its p.d.f. becomes

$$g(u) = f(x)\left|\frac{dx}{du}\right| = \frac{f(x)}{|dF/dx|} = 1.$$

An example of the usefulness of this transformation is given in Sect.10.4.4.

Exercise 4.22: Show that the skewness and kurtosis coefficients for the uniform distribution are $\gamma_1 = 0$ and $\gamma_2 = -1.2$, respectively.

Exercise 4.23: Show that monoenergetic pions which decay in flight, $\pi \to \mu + \nu$, produce neutrinos with a uniform energy distribution in the laboratory system. (Hint: The decay is isotropic in the pion rest system.)

Exercise 4.24: Let $x$ be a continuous random variable which is uniformly distributed between 0 and 1. Show that the variable $u = -2\ln x$ has the p.d.f. $g(u) = \tfrac{1}{2}\exp(-\tfrac{1}{2}u)$ (the chi-square distribution with 2 degrees of freedom).

4.5.2 Example: Uniform random number generators

As an example of an approximately uniform distribution one can consider the frequency distribution of a sample of numbers obtained from a random number generator.

Truly random numbers can be constructed by rolling dice or dealing cards, or be generated by special mechanical machines. Such methods are, however, slow and only of limited practical use.

Large samples of numbers "chosen at random" are frequently needed, for instance in Monte Carlo simulations and integrations. Many algorithms have been invented to produce them by computers. These algorithms give prescriptions on how to derive sequences of numbers $x_1, x_2, x_3, \ldots$ which are "evenly" and "randomly" distributed in a given interval. The numbers are obtained by a recurring arithmetic process,

$$x_{i+1} = g(x_i, x_{i-1}, \ldots, x_{i-k+1}),$$

where $g$ is some generating function and $k$ is usually 1 or 2. Each number $x_{i+1}$ in the sequence will therefore be completely determined by its predecessors and the given starting value(s). Thus the sequence is not random, but it will appear to be so for most practical applications. The sequence will always be periodic, with a cycle of numbers which is repeated endlessly. The length of the periodic cycle is determined by the chosen algorithm, and has an upper limit implied by the computer word length *).

Random numbers generated by recursion formulae are in the technical literature called pseudo-random or quasi-random.

*) A specific algorithm producing sequences of period $m$ is given by the linear congruential method as $x_{i+1} = (a x_i + c) \bmod m$, where the constants have been properly adjusted and the user only selects the starting value of the sequence.
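The linear congruential recursion $x_{i+1} = (a x_i + c) \bmod m$ is short enough to sketch directly. The code below is illustrative only: the helper names `lcg` and `period` and the toy constants $a = 5$, $c = 3$, $m = 16$ are assumptions made for this example and are far too small for any practical generator.

```python
def lcg(seed, a, c, m):
    """Yield the linear congruential sequence x_{i+1} = (a*x_i + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

def period(seed, a, c, m):
    """Count the steps between two occurrences of the first repeated value."""
    seen = {}
    for step, x in enumerate(lcg(seed, a, c, m)):
        if x in seen:
            return step - seen[x]
        seen[x] = step

# Toy constants, assumed for illustration only.
a, c, m = 5, 3, 16
gen = lcg(0, a, c, m)
print([next(gen) for _ in range(m)])  # integers in the range 0 .. m-1
print(period(0, a, c, m))             # cycle length, which cannot exceed m
# Dividing each integer by m maps the sequence onto the interval [0, 1).
```

Since every term is fixed by its predecessor, the cycle length can never exceed $m$; with badly chosen constants it can be much shorter.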
Exercise 4.25: A random number generator produces numbers $x_i$ which are uniformly distributed between 0 and 1. We seek a generator for a new random number $y$ defined over the interval $[A,B]$, which corresponds to the probability density $f(y)$. Defining a function $y_i = y(x_i)$ by

$$x_i = \int_A^{y(x_i)} f(t)\,dt,$$

show that $y_i$ has the desired distribution. Specifically, show that the exponential distribution $f(y) = \exp(-y)$ for $0 \le y \le \infty$ can be simulated by taking

$$y_i = -\ln(1-x_i).$$

4.6 THE EXPONENTIAL DISTRIBUTION

4.6.1 Definition and properties

The exponential p.d.f. is defined as

$$f(x;\theta) = \frac{1}{\theta}\,e^{-x/\theta}, \qquad 0 \le x \le \infty,$$

for positive values of the scale parameter $\theta$. The mean and variance for this distribution are given by, respectively,

$$E(x) = \theta,$$

$$V(x) = \theta^{2}. \qquad (4.46)$$

The asymmetry and kurtosis coefficients are constants, and independent of $\theta$,

$$\gamma_1 = 2, \qquad \gamma_2 = 6.$$

The exponential distribution describes a variety of physical phenomena. It can easily be derived from the Poisson assumptions for individual "random" events, as we shall see in the next section. Mathematically, the exponential distribution is a special case of the more general gamma distribution discussed in Sect.4.7.

Exercise 4.26: Show that the algebraic moments of the exponential distribution are given by

$$E(x^{k}) = k!\,\theta^{k}.$$

4.6.2 Derivation of the exponential p.d.f. from the Poisson assumptions

Let us go back to our earlier example (Sect.4.3.2) with bubbles along the track of a charged particle in a bubble chamber, with the constant $g$ giving the number of (point-like) bubbles per unit length of the track.

We want now to find an expression for the probability that the first bubble on a track occurs at a distance $\ell$ from the chosen origin. Since "the first bubble in the interval $[\ell,\ell+\Delta\ell]$" is equivalent to having no bubble in $[0,\ell]$ and one bubble in the non-overlapping interval $[\ell,\ell+\Delta\ell]$, the joint probability for the occurrence of these two independent "events" is the product of the probabilities for the individual events. Now we know that the probability for no bubble over the length $\ell$ is $e^{-g\ell}$ (from eq.(4.28), or the Poisson formula with $r=0$) and that the probability for one bubble in $\Delta\ell$ is $g\Delta\ell$ (the Poisson assumptions). The probability for the simultaneous events is therefore $(e^{-g\ell})\cdot(g\Delta\ell)$. The probability density, i.e. the probability per unit length, for finding the first bubble at the position $\ell$ becomes

$$f(\ell;g) = g\,e^{-g\ell}. \qquad (4.48)$$

The distribution law (4.48) can also be interpreted as the p.d.f. for the distance $\ell$ between two consecutive bubbles on the track. Hence the probability for finding intervals of length $\le \ell$ between two adjacent bubbles is given by the cumulative integral

$$F(\ell) = \int_0^{\ell} f(\ell';g)\,d\ell' = 1 - e^{-g\ell}. \qquad (4.49)$$

The probability for intervals $> \ell$ is

$$1 - F(\ell) = e^{-g\ell}.$$

The mean size of the intervals between two consecutive bubbles is, of course, $E(\ell) = g^{-1}$.

If the bubbles are of non-negligible size the formula (4.48) still gives the distribution of the distance between adjacent bubble centres. Assuming the bubbles to be of equal size with diameter $d$ the distribution of the gap lengths $x = \ell - d$ between adjacent bubbles can also be expressed by the exponential
law, as

$$f(x;g) = g\,e^{-gx}, \qquad 0 \le x \le \infty. \qquad (4.51)$$

The parameter $g$ will then have the interpretation of the inverse of the mean gap length.

Exercise 4.27: For processes occurring randomly in time the probability density for the events (decays, arrivals of particles into a detector, etc.) is expressed as

$$f(t;\tau) = \frac{1}{\tau}\,e^{-t/\tau}, \qquad 0 \le t \le \infty,$$

where the parameter $\tau$ measures the mean lifetime, or the average time between two consecutive events. Equivalently, in terms of the decay constant $\lambda = 1/\tau$ the p.d.f. is

$$f(t;\lambda) = \lambda\,e^{-\lambda t}, \qquad 0 \le t \le \infty.$$

Review the considerations of the last section in this context. In particular, find arguments for the common statement that "the exponential distribution has no memory". This property is very useful in practice since it, for example, allows one to measure the lifetimes of unstable particles starting from an arbitrary time $t$ after the time ($t=0$) they were created. See also Exercise 4.28 below.

Exercise 4.28: A neutrino beam is produced by decays of high-energy pions, $\pi \to \mu + \nu$. If the pions have momentum $p_\pi$ and the available decay region is of length $\ell$, what fraction of the pions will produce neutrinos? As a numerical example, take $p_\pi = 10.0$ GeV/c, $\ell = 80$ m, $m_\pi = 0.1396$ GeV/c$^2$, $c\tau = 7.8$ m. (Answer: 13.3%.)

Exercise 4.29: Decays from a radioactive source of decay constant $\lambda$ are registered by a Geiger counter of resolution time $t_0$ (i.e. a detector which fails in recording an event if it occurs separated in time from the previous event by an amount less than $t_0$). If $N'$ counts are registered over the time $t$, show that the number of decays $N$ that have actually taken place in the radioactive source is given by an expression of the form

$$N = aN',$$

where $a > 1$ is a correction factor depending on $\lambda$, $t$ and $t_0$.
Assume that the registered number of counts is a Poisson variable of estimated mean value $N'$. Show that the estimated variance in the true number of decays is $V(N) = aN > N$, whereas with a perfect detector, $t_0 \ll 1/\lambda$, the estimated variance would have been $V(N) = N$.

Exercise 4.30: (The hyperexponential distribution)
A superposition of exponential probability distributions goes under the name of the hyperexponential distribution. In terms of a time variable $t$ the p.d.f. is

$$f(t;p,\lambda) = \sum_i p_i \lambda_i \exp(-\lambda_i t), \qquad 0 \le t \le \infty,$$

where $p_i$ denotes the proportion of the $i$-th process ($\sum_i p_i = 1$), for which the decay constant is $\lambda_i$. Discuss the properties of this p.d.f.
This distribution describes the effect of exponentials operating in parallel. When exponential distributions of similar type operate in series, the resulting distribution is an Erlangian distribution; see Sect.4.7.2.

4.7 THE GAMMA DISTRIBUTION

4.7.1 Definition and properties

With $\alpha$ and $\beta$ as two positive constants we define the gamma distribution by

$$f(x;\alpha,\beta) = \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \qquad 0 \le x \le \infty. \qquad (4.52)$$

This function is seen to be properly normalized, in virtue of the gamma function,

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx,$$

which has the property

$$\Gamma(\alpha+1) = \alpha\,\Gamma(\alpha).$$

The formula (4.52) produces a variety of shapes for different values of the constant $\alpha$, as indicated in Fig. 4.5. For $\alpha \le 1$ the distribution is J-shaped, while $\alpha > 1$ gives a unimodal distribution with maximum at $x = (\alpha-1)\beta$. The parameter $\beta$ is only a scale factor.

When $\alpha$ is an integer, $\alpha = k$, $\Gamma(k) = (k-1)!$ and the p.d.f. (4.52) is called the Erlangian distribution; this distribution law can be derived from first principles and is discussed in the following section. The special case $\alpha = 1$ corresponds to the exponential distribution, which was treated already in Sect.4.6.

The special case $\beta = 2$, $\alpha = \nu/2$, where $\nu$ is an integer, is equivalent to the chi-square distribution with $\nu$ degrees of freedom; this important distribution is discussed quite extensively in Chapter 5.
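Some of the stated properties of the gamma p.d.f., $f(x;\alpha,\beta) = x^{\alpha-1}e^{-x/\beta}/(\beta^{\alpha}\Gamma(\alpha))$, can be checked numerically. The sketch below is a rough illustration under assumed example values ($\alpha = 3$, $\beta = 1.5$); the helper names and the simple midpoint integration are choices made here, not taken from the text.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma p.d.f.: x^(alpha-1) exp(-x/beta) / (beta^alpha Gamma(alpha))."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def integrate(func, lo, hi, steps=20000):
    """Midpoint-rule integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

alpha, beta = 3.0, 1.5  # assumed example values
norm = integrate(lambda x: gamma_pdf(x, alpha, beta), 0.0, 60.0)
print(norm)  # normalization integral, expected to be close to 1

# alpha = 1 reduces to the exponential p.d.f. (1/beta) exp(-x/beta).
x = 0.7
print(gamma_pdf(x, 1.0, beta), math.exp(-x / beta) / beta)

# For alpha > 1 the density peaks at x = (alpha - 1) * beta.
mode = (alpha - 1) * beta
print(gamma_pdf(mode, alpha, beta) > gamma_pdf(mode - 0.1, alpha, beta),
      gamma_pdf(mode, alpha, beta) > gamma_pdf(mode + 0.1, alpha, beta))
```

The upper integration limit of 60 (forty scale lengths) is an arbitrary cut-off beyond which the density is negligible for these parameter values.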
Fig. 4.5. Shapes of the gamma distribution for different parameter values.

Exercise 4.31: Show that the mean and variance of the gamma distribution are given by, respectively,

$$E(x) = \alpha\beta, \qquad V(x) = \alpha\beta^{2}.$$

Exercise 4.32: Show that the gamma distribution has the characteristic function

$$\phi(t) = (1 - i\beta t)^{-\alpha}.$$

Exercise 4.33: Show that the gamma distribution has asymmetry and kurtosis coefficients given by, respectively,

$$\gamma_1 = 2/\sqrt{\alpha}, \qquad \gamma_2 = 6/\alpha.$$

Exercise 4.34: (The beta distribution)
While the gamma distribution describes variables which are bounded at one side, the beta distribution

$$f(x;\mu,\nu) = \frac{\Gamma(\mu+\nu)}{\Gamma(\mu)\,\Gamma(\nu)}\,x^{\mu-1}(1-x)^{\nu-1}, \qquad 0 \le x \le 1,$$

can be used to describe variables which are limited on two sides. The parameters $\mu$ and $\nu$ are both positive integers; the quantity $B(\mu,\nu) \equiv \Gamma(\mu)\Gamma(\nu)/\Gamma(\mu+\nu)$ is often called the beta function.
(i) Show that the mean and variance of the beta distribution are, respectively,

$$E(x) = \frac{\mu}{\mu+\nu}, \qquad V(x) = \frac{\mu\nu}{(\mu+\nu)^{2}(\mu+\nu+1)}.$$

(ii) Show that the beta distribution is symmetric about $x = \tfrac{1}{2}$ when $\mu = \nu$, and reduces to the uniform p.d.f. for $\mu = \nu = 1$.
(iii) Sketch the beta distribution for the lowest combinations of the parameters.

4.7.2 Derivation of the gamma p.d.f. from the Poisson assumptions

From a physicist's point of view the importance of the gamma distribution lies mainly in the fact that it, for integer values of the parameter $\alpha$, can be derived from the Poisson assumptions and consequently describes random processes of the Poisson type.

For definiteness, let us take the random variable to describe a time interval. If $\lambda$ is the constant average number of events (decays, accidents, etc.) per unit time, the Poisson prediction for the number of events $r$ in the time $t$ is (eq.(4.30))

$$P(r;\lambda t) = \frac{1}{r!}(\lambda t)^{r} e^{-\lambda t}.$$

We want to find the distribution law for the time $t$ at which the $k$-th event occurs. Since the probability for $0, 1, \ldots, k-1$ events in the time $t$ is

$$\sum_{r=0}^{k-1}\frac{1}{r!}(\lambda t)^{r} e^{-\lambda t},$$

the probability that there are at least $k$ events in the time $t$ is

$$F(t) = 1 - \sum_{r=0}^{k-1}\frac{1}{r!}(\lambda t)^{r} e^{-\lambda t}. \qquad (4.55)$$

It can be shown by mathematical induction that the sum in eq.(4.55) can be written as an integral,

$$\sum_{r=0}^{k-1}\frac{1}{r!}(\lambda t)^{r} e^{-\lambda t} = \frac{1}{(k-1)!}\int_{\lambda t}^{\infty} z^{k-1} e^{-z}\,dz, \qquad\text{so that}\qquad F(t) = \frac{1}{(k-1)!}\int_{0}^{\lambda t} z^{k-1} e^{-z}\,dz.$$

Replacing $z$ by $\lambda z$ leads to

$$F(t) = \int_0^{t}\frac{\lambda^{k} z^{k-1} e^{-\lambda z}}{(k-1)!}\,dz. \qquad (4.56)$$

This is the cumulative integral of a density which is seen to be of the form of eq.(4.52), with the parameters $\alpha = k$ (integer), $\beta = \lambda^{-1}$, i.e. an Erlangian p.d.f.; we write

$$f(t;k,\lambda) = \frac{\lambda^{k} t^{k-1} e^{-\lambda t}}{(k-1)!}, \qquad 0 \le t \le \infty. \qquad (4.57)$$
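The equivalence between the Poisson sum of eq.(4.55) and the cumulative integral of eq.(4.56) can be verified numerically. The sketch below assumes the Erlangian density $\lambda^{k}t^{k-1}e^{-\lambda t}/(k-1)!$ of eq.(4.57); the function names, the example values $k = 4$, $\lambda = 2$, $t = 1.5$, and the midpoint integration are illustrative assumptions.

```python
import math

def at_least_k(k, lam, t):
    """Eq.(4.55): probability of at least k Poisson events in the time t."""
    return 1.0 - sum((lam * t) ** r * math.exp(-lam * t) / math.factorial(r)
                     for r in range(k))

def erlang_pdf(t, k, lam):
    """Erlangian p.d.f. of eq.(4.57)."""
    return lam ** k * t ** (k - 1) * math.exp(-lam * t) / math.factorial(k - 1)

def erlang_cdf(k, lam, t, steps=20000):
    """Eq.(4.56): midpoint-rule integral of the Erlangian p.d.f. from 0 to t."""
    h = t / steps
    return h * sum(erlang_pdf((i + 0.5) * h, k, lam) for i in range(steps))

k, lam, t = 4, 2.0, 1.5  # assumed example values
print(at_least_k(k, lam, t), erlang_cdf(k, lam, t))  # the two should agree
```

For $k = 1$ the same comparison reduces to $1 - e^{-\lambda t}$, the cumulative exponential distribution.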
Since the cumulative distribution of eq.(4.56) gives the probability that there are at least $k$ events in the time $t$, the p.d.f. $f(t;k,\lambda)$ will give the probability density in $t$ for the occurrence of the $k$-th event, when the events occur independently and at an average rate of $\lambda$ per unit time. We observe in particular that the p.d.f. describing the occurrence of the first event (a decay, say) is $f(t;1,\lambda) = \lambda e^{-\lambda t}$, as it should be. The formula (4.57) can also be interpreted as giving the distribution for the time elapsed between $(k+1)$ consecutive events, corresponding to $k$ "time gaps". In accordance with this interpretation the expectation value for $t$ is

$$E(t) = \frac{k}{\lambda} = k\cdot\frac{1}{\lambda}; \qquad (4.58)$$

in other words, the mean value of $t$ is $k$ times larger than the average length of the time gap between two consecutive events.

To summarize, the Erlangian p.d.f. describes the (time) distribution of exponentially distributed events occurring in series. (When exponentials operate in parallel the result is the hyperexponential distribution, Exercise 4.30.)

Exercise 4.35: Show that the mean value for the p.d.f. (4.57) is $E(t) = k/\lambda$ (eq.(4.58)) and that the variance is $V(t) = k/\lambda^{2}$.

4.7.3 Example: On-line processing of batched events

As an application of the previous arguments, let us consider a queueing problem involving the collecting and processing of data in an electronic experiment. The direct measurements from a number of sequential events are stored in an intermediate buffer and subsequently transferred to a central computing unit for processing. If the buffer runs full before the computer is ready with the previous batch of events the transfer of the data is prohibited; the buffer is then simply reset to zero and the data collecting is started anew. How large a fraction of the events will in the long run get lost with this way of operating the system?

The essential quantities in this problem are the event rate, the buffer capacity and the computing speed. We will assume that the events occur independently in time, at an average rate of $\lambda$ events per second. The buffer has a capacity to store $k$ events, and the computer processes the $k$ events in $T$ seconds. We assume that the transfer time between the buffer and the central memory is negligible and that the time required for resetting the buffer before a new collecting period can also be ignored.

With this formulation we recognize that the probability density function for the time $t$ of a full buffer is given by the Erlangian formula (4.57). The fraction $F(T)$ of rejected events then corresponds to the probability of getting the $k$-th event before the time $T$, when the computer is still working with the previous batch of events. Thus,

$$F(T) = \int_0^{T} f(t;k,\lambda)\,dt.$$

Exercise 4.36: (Connection between the Erlangian, Poisson, and negative binomial distributions)
In the previous sections it has frequently been emphasized that the physical assumption underlying the Poisson distribution law is that the average event rate is constant. Suppose now that this condition is not fulfilled,
and t h a t t h e a v e r a g e i s d e s c r i b e d by o n E r l a n g i a n formula
For a nllmericat example, suppose t h a t t h e a v e r a g e e v e n t r a t e corre-
ik k-1 -Au
f(vik.A) = u e o I u I - sponds t o A-0.5 8-', t h e b u f f e r c a p a c i t y k-10, and t h e computer speed s u c h t h a t
one e v e n t i s p r o c e s s e d p e r second, or T=lO s f o r one b a t c h o f 10 e v e n t s . With
Given L, a d i s c r e t e v a r i a b l e s i s supposed t o be P o i s s o n d i s t r i b u t e d w i t h mean
value u . Show t h a t t h e r e s u l t i n g p r o b a b i l i t y d i s t r i b u t i o n f o r r i s t h i s c h o i c e of c o n s t a n t s the E r l a n g i a n p . d . f . f(t;k-10, A=)) i s i d e n t i c a l t o a

= ('+:-')(&)"(l - A)' c h i - s q u a r e p.d.f. v i t h V-2k-20 d e g r e e s of freedom. The c u m u l a t i v e c h i - s q u a r e


d i s t r i b u t i o n o f F i g . 5.2 from C h a p t e r 5 can t h e n be used t o read o f f t h e v a l u e
0
of F(T); we f i n d

-
'This i s r e c o g n i z e d us n n e g a t i v e b i n o m i a l d i s t r i b u t i o n w i t h p a r a m e t e r p = A / ( l + l )
( E x e r c i s e 4.5) Ior whicl, i h e mean v a l u e end v a r i a n c e a r e E ( s ) = k / A and
v(s)=klA+k/A2, r e s p e c t i v e l y . P(T) - If10

0
(u;V-20)du 0.03.

Thus o n l y a b o u t 3% of t h e e v e n t s w i l l n o t b e used i n t h i s case.
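The numerical value of the lost fraction can be checked without tables. The following is a minimal sketch, not part of the original text: it uses the fact that the $k$-th event arrives before $T$ exactly when a Poisson variable with mean $\lambda T$ takes a value of at least $k$. The function name `erlang_cdf` is an illustrative choice.

```python
import math

def erlang_cdf(t, k, lam):
    """Probability that the k-th event occurs before time t.

    Uses the identity P(t_k <= t) = P(N >= k), where N is Poisson
    distributed with mean lam * t.
    """
    mean = lam * t
    # P(N <= k - 1): sum of the first k Poisson probabilities.
    below_k = sum(math.exp(-mean) * mean**r / math.factorial(r) for r in range(k))
    return 1.0 - below_k

# Constants of the numerical example: lambda = 0.5 per second, k = 10, T = 10 s.
lost_fraction = erlang_cdf(10.0, 10, 0.5)
print(f"fraction of lost events: {lost_fraction:.3f}")
```

With these constants the result lies close to the value 0.03 read off from the chi-square curve.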


4.8 THE NORMAL, OR GAUSSIAN, DISTRIBUTION
We discuss next the normal, or Gaussian, probability density function which plays a fundamental role in probability theory and statistics. We consider it first as a function of only one variable; the extension to two and more variables is made in Sects. 4.9 and 4.10, respectively.

4.8.1 Definition and properties of $N(\mu,\sigma^2)$

The normal p.d.f. in one dimension has the general form

$$N(\mu,\sigma^2) \equiv f(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right], \qquad -\infty \le x \le \infty . \qquad (4.59)$$

The expectation and the variance of $x$ with this distribution can be found by trivial integration from the definitions eqs.(3.8) and (3.9),

$$E(x) = \int_{-\infty}^{\infty} x\,f(x;\mu,\sigma^2)\,dx = \mu, \qquad V(x) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x;\mu,\sigma^2)\,dx = \sigma^2 .$$

Hence the parameters $\mu$ and $\sigma^2$ of $N(\mu,\sigma^2)$ have the usual meaning of the mean value and variance (or dispersion) of a distribution.

The normal p.d.f. $N(\mu,\sigma^2)$ is symmetric about $\mu$, and hence the median coincides with the mean. The p.d.f. also has its mode (maximum) at $x=\mu$, while the two points of inflection occur at distances $\pm\sigma$ from this value. Figure 4.6 shows different normal distributions with a common mean.

Exercise 4.37: Show that the half-width of $N(\mu,\sigma^2)$ at half-maximum is equal to $\sqrt{2\ln 2}\,\sigma = 1.18\sigma$.

Exercise 4.38: Show that if $x$ is $N(\mu,\sigma^2)$, then $ax$ is $N(a\mu,a^2\sigma^2)$.
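The properties of $N(\mu,\sigma^2)$ quoted in this section can be confirmed numerically. The sketch below is an illustration only: it integrates the p.d.f. with a simple midpoint rule and checks the half-width at half-maximum of Exercise 4.37. The integration range, the step count, and the example values $\mu=2$, $\sigma=3$ are arbitrary assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal p.d.f. N(mu, sigma^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def numerical_moments(mu, sigma, half_range=10.0, steps=20000):
    """Midpoint-rule estimates of the normalization, mean and variance."""
    lower = mu - half_range * sigma
    step = 2.0 * half_range * sigma / steps
    points = [lower + (i + 0.5) * step for i in range(steps)]
    norm = sum(normal_pdf(x, mu, sigma) for x in points) * step
    mean = sum(x * normal_pdf(x, mu, sigma) for x in points) * step
    variance = sum((x - mean) ** 2 * normal_pdf(x, mu, sigma) for x in points) * step
    return norm, mean, variance

norm, mean, variance = numerical_moments(mu=2.0, sigma=3.0)

# Half-width at half-maximum in units of sigma (Exercise 4.37).
half_width = math.sqrt(2.0 * math.log(2.0))
ratio_at_half_width = normal_pdf(2.0 + half_width * 3.0, 2.0, 3.0) / normal_pdf(2.0, 2.0, 3.0)

print(norm, mean, variance, half_width, ratio_at_half_width)
```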

Fig. 4.6. The normal p.d.f. $N(\mu,\sigma^2)$ for different values of the standard deviation $\sigma$.

4.8.2 The standard normal distribution $N(0,1)$

The general normal distribution $N(\mu,\sigma^2)$ can be transformed to a convenient form by the standard transformation

$$y = \frac{x-\mu}{\sigma},$$

which implies a shift of the origin to the mean value and an appropriate change of scale. This gives the standard normal p.d.f.

$$N(0,1) \equiv g(y) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac12 y^2}, \qquad -\infty \le y \le \infty, \qquad (4.63)$$

which has mean value zero and standard deviation one. This p.d.f. is tabulated in Appendix Table A5.

The cumulative standard normal distribution is given by

$$G(y) = \int_{-\infty}^{y} g(y')\,dy' = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{y} e^{-\frac12 y'^2}\,dy', \qquad (4.64)$$

and is tabulated in Appendix Table A6. This function, which in the literature is often called the standard normal distribution function (compare the footnote in Sect.3.2), has the property

$$G(-y) = 1 - G(y). \qquad (4.65)$$

Figure 4.7 shows the standard normal p.d.f. and its cumulative distribution.
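Where the tables are not at hand, $G(y)$ can be evaluated from the error function through the standard identity $G(y)=\tfrac12[1+\mathrm{erf}(y/\sqrt2)]$. The following sketch is an illustration and not part of the original text. The helper `interval_probability` is a hypothetical name; it uses $P(a \le x \le b) = G((b-\mu)/\sigma) - G((a-\mu)/\sigma)$ for a variable distributed as $N(\mu,\sigma^2)$.

```python
import math

def G(y):
    """Cumulative standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def interval_probability(a, b, mu, sigma):
    """P(a <= x <= b) for x distributed as N(mu, sigma^2)."""
    return G((b - mu) / sigma) - G((a - mu) / sigma)

# Symmetry property G(-y) = 1 - G(y), checked at an arbitrary point.
symmetry_gap = G(-1.3) - (1.0 - G(1.3))

# Probability contents of the symmetric 1, 2 and 3 standard deviation intervals.
contents = [2.0 * G(n_sigma) - 1.0 for n_sigma in (1, 2, 3)]
print([round(p, 4) for p in contents])
```

The same function also covers intervals that are not symmetric about the mean, since only the two standardized limits enter.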
Exercise 4.39: Verify that if $x$ is $N(\mu,\sigma^2)$ the variable $y=(x-\mu)/\sigma$ is $N(0,1)$ (eq.(4.63)). Show that $u=y^2$ has the p.d.f. $f(u)=(2\pi)^{-1/2}u^{-1/2}e^{-u/2}$ (the chi-square distribution with one degree of freedom).

Fig. 4.7. (a) The standard normal p.d.f. $N(0,1)$. (b) The standard normal distribution.

The cumulative standard normal distribution $G(y)$ is used in practice to determine the probability contents of a given interval for a normally distributed value, or vice versa, to determine an interval corresponding to a given probability.

Let $x$ be a random variable which is distributed according to the p.d.f. $N(\mu,\sigma^2)$ of eq.(4.59). We want to find the probability that $x$ falls between a lower limit $a$ and an upper limit $b$. Clearly,

$$P(a \le x \le b) = P(x \le b) - P(x \le a).$$

On the right-hand side the inequalities may be expressed in terms of the standardized variable $(x-\mu)/\sigma$, and we have

$$P(x \le b) = P\left(\frac{x-\mu}{\sigma} \le \frac{b-\mu}{\sigma}\right) = G\left(\frac{b-\mu}{\sigma}\right),$$

and similarly for the limit $a$. Hence

$$P(a \le x \le b) = G\left(\frac{b-\mu}{\sigma}\right) - G\left(\frac{a-\mu}{\sigma}\right), \qquad (4.66)$$

where $G$ is the cumulative standard normal distribution of eq.(4.64). Given $\mu$, $\sigma^2$ and the interval $[a,b]$ the actual function values of $G$ can be found from, for example, Fig. 4.7(b) or Appendix Table A6. Using the property of eq.(4.65) we find the probability contents corresponding to the (symmetric) one-, two-, and three-standard deviation intervals about the mean $\mu$:

$$\begin{aligned}
P(\mu-\sigma \le x \le \mu+\sigma) &= 2G(1)-1 = 0.6827,\\
P(\mu-2\sigma \le x \le \mu+2\sigma) &= 2G(2)-1 = 0.9545,\\
P(\mu-3\sigma \le x \le \mu+3\sigma) &= 2G(3)-1 = 0.9973.
\end{aligned} \qquad (4.67)$$

See the illustration of Fig. 4.8.

Fig. 4.8. Probability contents of the normal probability distribution $N(\mu,\sigma^2)$. The shaded area corresponds to the symmetric one-standard deviation interval about the mean value, for which the probability is $P(\mu-\sigma \le x \le \mu+\sigma) = 0.6827$; the symmetric regions covering the two- and three-standard deviation intervals correspond to probabilities $P(\mu-2\sigma \le x \le \mu+2\sigma) = 0.9545$ and $P(\mu-3\sigma \le x \le \mu+3\sigma) = 0.9973$, respectively (eq.(4.67)). The dotted lines indicate the extension of the regions which correspond to probabilities 0.90 and 0.95 (eq.(4.69)).

Suppose next that we have the opposite situation and want to determine the size of an interval which is to correspond to a given probability, say 0.90. Equation (4.66) is now

$$0.90 = G\left(\frac{b-\mu}{\sigma}\right) - G\left(\frac{a-\mu}{\sigma}\right), \qquad (4.68)$$

and with given parameters $\mu$, $\sigma$ there are obviously infinitely many solutions for the unknowns $a$, $b$. With the additional requirement that the interval $[a,b]$ should be symmetric about $\mu$, however, there is only one unknown, say $b$. With the use of eq.(4.65) we can then write eq.(4.66) as

$$0.90 = 2G\left(\frac{b-\mu}{\sigma}\right) - 1 .$$

Interpolation in Appendix Table A6 then gives for the value of the argument of $G$

$$\frac{b-\mu}{\sigma} = 1.645 .$$

Thus, given $N(\mu,\sigma^2)$ the symmetric interval about $\mu$ corresponding to the probability 0.90 is

$$[a,b] = [\mu-1.645\sigma,\ \mu+1.645\sigma] .$$

For any given probability $\gamma$ between 0 and 1 one can proceed in a similar manner to determine symmetric intervals around $\mu$. For the common choices $\gamma=0.90$, 0.95, 0.99, 0.999 often referred to, one has

$$\begin{aligned}
P(\mu-1.645\sigma \le x \le \mu+1.645\sigma) &= 0.90,\\
P(\mu-1.960\sigma \le x \le \mu+1.960\sigma) &= 0.95,\\
P(\mu-2.576\sigma \le x \le \mu+2.576\sigma) &= 0.99,\\
P(\mu-3.291\sigma \le x \le \mu+3.291\sigma) &= 0.999.
\end{aligned} \qquad (4.69)$$

The first two of these intervals are indicated in Fig. 4.8. For practical purposes it may be useful to observe and remember that, for a normal variable, the symmetric interval covering a probability 0.95 very nearly coincides with the two-standard deviation interval.

Exercise 4.40: If $x$ is normally distributed with mean and variance both equal to 16, what is the probability that $12 \le x \le 20$? Compare Exercise 4.16.

4.8.4 Central moments; the characteristic function

The central moments of the normal distribution of eq.(4.59) can be obtained from the general definition by eq.(3.14). Clearly all odd central moments vanish because of the symmetry property of the normal p.d.f. The even moments can be evaluated in a straightforward way by carrying out the integrations explicitly. One finds

$$\mu_{2k} = \frac{(2k)!}{2^k\,k!}\,\sigma^{2k}, \qquad k = 0,1,2,\ldots \qquad (4.71)$$

Thus all central moments of the normal distribution can be expressed in terms of the variance $\sigma^2$. The lowest of the even moments are

$$\mu_2 = \sigma^2, \qquad \mu_4 = 3\sigma^4, \qquad \mu_6 = 15\sigma^6. \qquad (4.72)$$

The asymmetry and kurtosis coefficients are both zero,

$$\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = 0, \qquad \gamma_2 = \frac{\mu_4}{\mu_2^2} - 3 = 0, \qquad (4.73)$$

since the definition of the kurtosis coefficient by eq.(3.21) was made to correspond to $\gamma_2=0$ for the normal p.d.f.

From the general definition of eq.(3.22) one finds the characteristic function for the normal p.d.f. of eq.(4.59),

$$\phi(t) \equiv E(e^{itx}) = e^{it\mu - \frac12\sigma^2 t^2}. \qquad (4.74)$$

This form of the characteristic function is used to derive many theoretically important properties of the normal distribution (see, for example, the following sections). For the purpose of evaluating central moments of the normal p.d.f. one may consider instead the function (eq.(3.28))

$$\phi_\mu(t) \equiv E\left(e^{it(x-\mu)}\right) = e^{-\frac12\sigma^2 t^2},$$

which has the series expansion

$$\phi_\mu(t) = \sum_{k=0}^{\infty} \frac{1}{k!}\left(-\tfrac12\sigma^2 t^2\right)^k = \sum_{k=0}^{\infty} \frac{\sigma^{2k}}{2^k\,k!}\,(it)^{2k} .$$

According to eq.(3.27) the central moments are found by taking the derivatives of $\phi_\mu(t)$ with respect to $(it)$ and putting $t=0$,

$$\mu_k = \left.\frac{\partial^k \phi_\mu(t)}{\partial (it)^k}\right|_{t=0} .$$

Since only even powers of $t$ appear in the sum, all odd derivatives will vanish when evaluated for $t=0$. Thus all odd central moments will be zero and only even moments survive; it will be seen that eq.(4.71) is regained, as expected.

Exercise 4.41: Show that the central moments of the normal distribution satisfy the relation

$$\mu_{2k+2} = \sigma^2\mu_{2k} + \sigma^3\,\frac{d\mu_{2k}}{d\sigma}, \qquad k = 0,1,\ldots$$

Exercise 4.42: Show that the algebraic moments of the normal p.d.f. are connected through the relation …

4.8.5 Addition theorem for normally distributed variables

It frequently happens that one wants to know the distribution of a variable which is a function of normally distributed variables. In particular, one can have a linear function of normal variables. It turns out that a random variable which is a linear combination of independent, normally distributed random variables is itself normally distributed with mean and variance equal to the sums of, respectively, the means and variances of the constituent terms.

To see this, let $x_1$ and $x_2$ be two independent variables with normal distributions $N(\mu_1,\sigma_1^2)$ and $N(\mu_2,\sigma_2^2)$, respectively. Then, with $a_1$ and $a_2$ as constants, we know that $a_1x_1$ and $a_2x_2$ are independent, normal variables distributed as, respectively, $N(a_1\mu_1,a_1^2\sigma_1^2)$ and $N(a_2\mu_2,a_2^2\sigma_2^2)$ (compare Exercise 4.38). According to eq.(4.74) their characteristic functions are

$$\phi_{a_1x_1}(t) = e^{ia_1\mu_1 t - \frac12 a_1^2\sigma_1^2 t^2}, \qquad \phi_{a_2x_2}(t) = e^{ia_2\mu_2 t - \frac12 a_2^2\sigma_2^2 t^2} .$$

We consider the sum

$$y = a_1x_1 + a_2x_2 .$$

Because of the independence of the two terms the characteristic function for $y$ is equal to the product of the characteristic functions for $a_1x_1$ and $a_2x_2$ (this is the result of Sect.3.5.7). From eq.(3.51), therefore,

$$\phi_y(t) = \phi_{a_1x_1}(t)\cdot\phi_{a_2x_2}(t) = e^{(a_1\mu_1+a_2\mu_2)it - \frac12(a_1^2\sigma_1^2+a_2^2\sigma_2^2)t^2} .$$

But this is nothing but a new characteristic function of the form of eq.(4.74); hence $y$ is distributed as $N(a_1\mu_1+a_2\mu_2,\ a_1^2\sigma_1^2+a_2^2\sigma_2^2)$.

The proof can clearly be extended to any number of independent, normal variables. So we may formulate the addition theorem for normally distributed variables as follows:

    Let $x_1,x_2,\ldots,x_n$ be independent, normally distributed variables such that $x_i$ is $N(\mu_i,\sigma_i^2)$. Then the linear combination $y=\sum_{i=1}^{n} a_ix_i$ is also a normally distributed variable, with mean $\sum_{i=1}^{n} a_i\mu_i$ and variance $\sum_{i=1}^{n} a_i^2\sigma_i^2$.

4.8.6 Properties of $\bar{x}$ and $s^2$ for a sample from $N(\mu,\sigma^2)$

We assume that $n$ independent random variables are all normally distributed with mean $\mu$ and variance $\sigma^2$. The variables $x_1,x_2,\ldots,x_n$ are said to be a random sample of size $n$ drawn from the normal population (or universe) $N(\mu,\sigma^2)$. The sample has a mean and a variance given by the general formulae eqs.(3.87)-(3.88),

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i-\bar{x})^2 .$$

From the preceding section we are now able to state the distributional properties of the sample mean $\bar{x}$. With the notation of Sect.4.8.5, $\bar{x}$ is a linear function of the independent, normal variables $x_i$ with all coefficients equal, $a_i = 1/n$. Hence, according to the addition theorem for normally distributed variables, $\bar{x}$ will also be a normal variable, with mean $\sum_{i=1}^{n} a_i\mu_i = \mu$ and variance $\sum_{i=1}^{n} a_i^2\sigma_i^2 = \sigma^2/n$; thus

$$\bar{x} \ \text{is} \ N(\mu,\sigma^2/n) .$$

It will be demonstrated in Sect.5.1.6 that the sample variance $s^2$ for the normal sample is related to the chi-square distribution with $n-1$ degrees of freedom, $\chi^2(n-1)$. Specifically,

$$\frac{(n-1)s^2}{\sigma^2} \ \text{is} \ \chi^2(n-1) .$$

Furthermore, $\bar{x}$ and $s^2$ for the normal sample are independent variables (see Exercise 5.9). This is a property which is specific for the normal sample. The outstanding position of the normal distribution in this respect is explicit in the following important theorem:

    Given that the random variables $x_1,x_2,\ldots,x_n$ are independent, with identical normal distributions, then the two variables (statistics) $\bar{x}=\sum_{i=1}^{n} x_i/n$ and $s^2=\sum_{i=1}^{n}(x_i-\bar{x})^2/(n-1)$ are independent. Conversely, if the mean $\bar{x}$ and variance $s^2$ of random samples from a population are independent, that population must be normal.

An application of this theorem is given in the following section.
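The sampling properties of $\bar{x}$ and $s^2$ stated above can be illustrated by a small simulation. This is a sketch under arbitrary assumptions (population $N(1,4)$, samples of size $n=5$, a fixed seed), not a proof: zero correlation between $\bar{x}$ and $s^2$ is only a necessary consequence of their independence.

```python
import math
import random

def sample_mean_and_variance(values):
    """Sample mean and unbiased sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance

def average(values):
    return sum(values) / len(values)

def correlation(u, v):
    """Linear correlation coefficient between two equally long lists."""
    mean_u, mean_v = average(u), average(v)
    cov = average([(a - mean_u) * (b - mean_v) for a, b in zip(u, v)])
    var_u = average([(a - mean_u) ** 2 for a in u])
    var_v = average([(b - mean_v) ** 2 for b in v])
    return cov / math.sqrt(var_u * var_v)

# Assumed illustration values: population N(mu, sigma^2), samples of size n.
mu, sigma, n, n_samples = 1.0, 2.0, 5, 20000
rng = random.Random(1)

means, variances = [], []
for _ in range(n_samples):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    m, v = sample_mean_and_variance(sample)
    means.append(m)
    variances.append(v)

mean_of_means = average(means)
variance_of_means = average([(m - mean_of_means) ** 2 for m in means])  # expect sigma^2 / n
mean_scaled_variance = average(variances) * (n - 1) / sigma**2          # expect n - 1
corr_mean_variance = correlation(means, variances)                      # expect about 0

print(mean_of_means, variance_of_means, mean_scaled_variance, corr_mean_variance)
```

Replacing `rng.gauss` by a skewed generator such as `rng.expovariate` gives a clearly non-zero correlation, in line with the converse part of the theorem.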


4.8.7 Example: Position and width of resonance peak

To illustrate the implication of the foregoing let us consider as an example the determination of the position and width of a resonance peak in an effective mass spectrum.

For simplicity we assume that the background problem does not exist. From the observation of $n$ events with effective masses $M_1,M_2,\ldots,M_n$ in the appropriate region of the spectrum the true mass $M_0$ and width $\Gamma$ of the peak can be estimated by, respectively, the arithmetic mean of the sample values

$$\hat{M}_0 = \bar{M} = \frac{1}{n}\sum_{i=1}^{n} M_i ,$$

and the square root of the sample variance, given by

$$\hat{\Gamma}^2 = s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (M_i-\bar{M})^2 .$$

According to the statement in the preceding section the variables $\bar{M}$ and $s^2$ will be independent if, and only if, the $M_i$ constitute a normal sample. Hence the above estimates $\hat{M}_0$ and $\hat{\Gamma}$ will only be independent if the resonance has a strictly normal shape. Any other form of the underlying distribution, for instance a Breit-Wigner shape, implies that the estimates of the peak mass and width will necessarily be dependent. Hence the understanding of the uncertainty in the estimates of these quantities will only be straightforward when the normal peak shape is assumed: the estimates are then uncorrelated and have errors simply given by the square root of the diagonal elements in the covariance matrix. For all other assumptions about the shape of the true resonance peak a more elaborate treatment is required to determine the correlation in the estimates of peak mass and width.

In the discussion above we have tacitly assumed an ideal situation where the measured $M_i$ are precise values, without any experimental uncertainties attached to them. A more realistic experimental situation with a finite inherent error in any measurement can be treated as described in some detail in Sect.6.2.

4.8.8 The Central Limit Theorem

We shall next consider the Central Limit Theorem, probably the most important theorem in probability theory, with far-reaching theoretical and practical implications.

We learned already in Sect.3.6 that if $x_1,x_2,\ldots,x_n$ is a set of independent random variables, such that each $x_i$ has a mean value $\mu_i$ and a variance $\sigma_i^2$, then the sum of these variables, $\sum x_i$, will be distributed with mean and variance equal to, respectively, $\sum\mu_i$ and $\sum\sigma_i^2$. The distributions of the separate $x_i$ were unspecified except for their means and variances, and so was also the distribution of their sum $\sum x_i$.

The Central Limit Theorem expresses that the distributional properties of the sum $\sum x_i$ will be completely known provided that the number of terms in the sum is very large. In the limit when $n$ goes towards infinity, $\sum_{i=1}^{n} x_i$ will be normally distributed, with mean value $\sum_{i=1}^{n}\mu_i$ and variance $\sum_{i=1}^{n}\sigma_i^2$.

In a condensed form, we may state the Central Limit Theorem as follows:

    Let $x_1,x_2,\ldots,x_n$ be a set of $n$ independent random variables, and each $x_i$ be distributed with mean value $\mu_i$ and a finite variance $\sigma_i^2$. Then the variable $\left(\sum_{i=1}^{n} x_i - \sum_{i=1}^{n}\mu_i\right)\big/\left(\sum_{i=1}^{n}\sigma_i^2\right)^{1/2}$ has a limiting distribution which is $N(0,1)$.

The remarkable thing about this powerful theorem is that the assumptions about the $x_i$ are so unrestrictive. In fact, it is essentially sufficient to specify that each $x_i$ should have a finite variance, since then necessarily also the mean will be finite. (Strictly, for variables that are not identically distributed a mild additional condition is needed, ensuring that no single term dominates the sum.)

We will prove the Central Limit Theorem only for the special case when all $n$ independent $x_i$ have the same mean and variance. The $x_i$ then correspond to a sample from a population with mean $\mu$ and variance $\sigma^2<\infty$. We construct the standardized variable

$$z = \frac{\sum_{i=1}^{n} x_i - n\mu}{\sigma\sqrt{n}} = \sum_{i=1}^{n}\frac{x_i-\mu}{\sigma\sqrt{n}}, \qquad (4.76)$$

for which we want to establish the characteristic function $\phi_z(t)$. Each of the $x_i$ has a characteristic function $\phi_x(t)$ which contains the algebraic moments of the $x_i$ by the series expansion of eq.(3.23). Writing out the terms up to second order moments we have

$$\phi_x(t) = 1 + (it)\mu + \tfrac12(it)^2(\sigma^2+\mu^2) + \ldots$$

Since all $x_i$, and hence the terms $(x_i-\mu)/\sigma\sqrt{n}$ in eq.(4.76), by assumption are independent, the characteristic function for $z$ is equal to the product of $n$ characteristic functions for each of these terms (Sect.3.5.7). Hence

$$\phi_z(t) = \left[e^{-it\mu/\sigma\sqrt{n}}\,\phi_x\!\left(\frac{t}{\sigma\sqrt{n}}\right)\right]^n .$$

Taking the logarithm and expressing $\phi_x$ in terms of the series expansion above we get

$$\ln\phi_z(t) = -\frac{it\mu\sqrt{n}}{\sigma} + n\ln\left[1 + \frac{it\mu}{\sigma\sqrt{n}} - \frac{t^2(\sigma^2+\mu^2)}{2\sigma^2 n} + \ldots\right],$$

and with the expansion $\ln(1+x) = x - \tfrac12x^2 + \ldots$ this can further be written

$$\ln\phi_z(t) = -\frac{it\mu\sqrt{n}}{\sigma} + n\left[\frac{it\mu}{\sigma\sqrt{n}} - \frac{t^2(\sigma^2+\mu^2)}{2\sigma^2 n} + \frac{t^2\mu^2}{2\sigma^2 n} + O(n^{-3/2})\right],$$

which simplifies to

$$\ln\phi_z(t) = -\tfrac12 t^2 + O(n^{-1/2}) .$$

Thus, when $n\to\infty$, $\phi_z(t)\to\exp(-\tfrac12t^2)$. But this is precisely the characteristic function for a normal distribution with mean 0 and variance 1 (see eq.(4.74)). In the limit of very large $n$, therefore, the variable $z$ has the distribution $N(0,1)$, in accordance with the statement made; this completes the proof for the Central Limit Theorem for this particular case.

The Central Limit Theorem is responsible for many other theorems and statements in the theory of probability and statistics. In fact, we have already seen the theorem in operation earlier in this chapter, in the observed tendency towards normality for increasing $n$ of the (discrete) binomial and Poisson distributions.

It is an empirical fact that a variety of phenomena seems to be well described by the normal distribution law. Thus, for example, repeated measurements on the same physical system by a certain apparatus may show a distribution of outcomes which approaches the normal if the number of observations becomes sufficiently large. This is by no means obvious, since even the simplest physical experiment involves many different sources of errors, which can be of instrumental origin as well as have a subjective character. The Central Limit Theorem affords a theoretical explanation of these empirical findings. The total effect registered can be considered as the sum of a "true effect" and a "total error", which is in turn the combined effect of a number of mutually independent elementary errors. When the number of error sources becomes large the total error will be normally distributed, and hence also the total effect observed.

Quite frequently one can envisage that the "true effect" under study is composed of several partial contributions. For example, multiparticle production in high-energy reactions is thought of as being due to different fundamental mechanisms. A dynamical variable describing these reactions will then have contributions from different physical effects. If these effects are independent and many in number, the resultant will be a variable which is normally distributed, irrespective of the spectral forms from the individual elementary effects. In a certain sense the Central Limit Theorem can therefore be said to obscure the fundamental effects in nature.

4.8.9 Example: Gaussian random number generator

An illustrative example with a practical application of the Central Limit Theorem is the construction of a generator for normally distributed random numbers.

Suppose that there is available an ordinary random number generator which delivers numbers uniformly distributed over the interval $[0,1]$, as described in Sect.4.5.2, with a mean value $\tfrac12$ and a variance $\tfrac{1}{12}$. Let $x_i$ be the $i$-th number in a sequence of $n$ consecutive numbers from this generator. Then, according to the Central Limit Theorem, in the limit of large $n$, the variable

$$z = \frac{\sum_{i=1}^{n} x_i - \frac{n}{2}}{\sqrt{n/12}}$$

will have a distribution which is exactly normal, with mean zero and variance one. A reasonable approximation to the normal distribution is obtained for $n$ values as small as 10. A practical choice is to take $n=12$, which gives simply

$$z = \sum_{i=1}^{12} x_i - 6 .$$

This variable, constructed for repeated sequences of size 12, produces a distribution of numbers between $-6$ and $+6$, which differs only slightly from $N(0,1)$ and will be sufficiently accurate for most purposes.

An alternative method to obtain normally distributed random numbers is the following: If $x_1$ and $x_2$ is a pair of numbers from a generator producing a uniform distribution between 0 and 1, any derived quantity constructed as $z_1=\sqrt{-2\ln x_1}\,\cos(2\pi x_2)$ or $z_2=\sqrt{-2\ln x_1}\,\sin(2\pi x_2)$ will in the long run be of standard normal form; see Exercise 4.43.

4.9 THE BINORMAL DISTRIBUTION

In generalizing the normal p.d.f. to more than one variable it may be useful to investigate first the case with two variables in some detail. This particular case demonstrates the special features of the normal distribution in a transparent way, and will facilitate the further generalization to many dimensions.

4.9.1 Definition and properties

The joint distribution for two random variables $x_1$ and $x_2$ is called binormal or two-dimensional normal if, for $-\infty \le x_1,x_2 \le \infty$,

$$f(x_1,x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x_1-\mu_1)^2}{\sigma_1^2} - 2\rho\,\frac{(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2}\right]\right\}. \qquad (4.77)$$

In order that $f(x_1,x_2)$ be non-negative and real, both $\sigma$'s are required positive, and $|\rho| \le 1$. As one may suspect from the notation the parameters $\mu_1,\mu_2,\sigma_1^2,\sigma_2^2$ will have the meaning of mean values and variances, respectively, while $\rho$ measures the correlation between the two variables.

To see this, it is convenient to work out the characteristic function for $x_1$ and $x_2$ with the p.d.f. (4.77) from the general definition of eq.(3.50); the result is

$$\phi(t_1,t_2) = \exp\left\{it_1\mu_1 + it_2\mu_2 + \tfrac12\left[(it_1)^2\sigma_1^2 + (it_2)^2\sigma_2^2 + (it_1)(it_2)\,2\rho\sigma_1\sigma_2\right]\right\}. \qquad (4.78)$$

The algebraic moments of the variables $x_1$ and $x_2$ can then be obtained from this function, by calculating partial derivatives with respect to $(it_1)$ and $(it_2)$, and evaluating the expressions for $t_1=t_2=0$, as was shown in Sect.3.5.7. We find, for example,

$$E(x_1) = \left.\frac{\partial\phi}{\partial(it_1)}\right|_{t_1=t_2=0} = \mu_1, \qquad E(x_1^2) = \left.\frac{\partial^2\phi}{\partial(it_1)^2}\right|_{t_1=t_2=0} = \sigma_1^2+\mu_1^2,$$

and similarly for the other variable; these are the usual relations between the lowest moments of a random variable (see Sect.3.3.3). For the product of $x_1$ and $x_2$ the expectation is

$$E(x_1x_2) = \left.\frac{\partial^2\phi}{\partial(it_1)\,\partial(it_2)}\right|_{t_1=t_2=0} = \mu_1\mu_2 + \rho\sigma_1\sigma_2 .$$

From eqs.(3.39), (3.40) the correlation coefficient $\rho(x_1,x_2)$ between the two variables therefore becomes

$$\rho(x_1,x_2) = \frac{E(x_1x_2) - E(x_1)E(x_2)}{\sigma_1\sigma_2} = \rho .$$

Hence the parameter $\rho$ in the p.d.f. (4.77) can be identified as the correlation coefficient between $x_1$ and $x_2$.
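The two recipes of Sect.4.8.9 for producing normally distributed random numbers from uniform ones can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: Python's `random.Random` stands in for the uniform generator, the function names are hypothetical, and the first uniform number is mapped to the interval $(0,1]$ so that the logarithm is always defined.

```python
import math
import random

def gauss_from_twelve_uniforms(rng):
    """Approximate N(0,1) number: sum of 12 uniform numbers minus 6."""
    return sum(rng.random() for _ in range(12)) - 6.0

def gauss_pair_from_two_uniforms(rng):
    """Pair of N(0,1) numbers from two uniform numbers x1, x2."""
    x1 = 1.0 - rng.random()  # lies in (0, 1], which keeps log(x1) finite
    x2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(x1))
    return radius * math.cos(2.0 * math.pi * x2), radius * math.sin(2.0 * math.pi * x2)

def mean_and_variance(values):
    """Mean and (population) variance of a list of numbers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance

rng = random.Random(12345)
sum_values = [gauss_from_twelve_uniforms(rng) for _ in range(20000)]
pair_values = [z for _ in range(10000) for z in gauss_pair_from_two_uniforms(rng)]

print(mean_and_variance(sum_values))
print(mean_and_variance(pair_values))
```

The first method can never return a value outside $[-6,+6]$, so its tails are truncated; the second has no such limit.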
If $\rho=0$ we see from eqs.(4.77) and (4.78) that the p.d.f. and the characteristic function both factorize into two separate parts, one for each variable, giving

$$f(x_1,x_2) = \left[\frac{1}{\sqrt{2\pi}\,\sigma_1}e^{-\frac12(x_1-\mu_1)^2/\sigma_1^2}\right]\cdot\left[\frac{1}{\sqrt{2\pi}\,\sigma_2}e^{-\frac12(x_2-\mu_2)^2/\sigma_2^2}\right]$$

and

$$\phi(t_1,t_2) = \left[e^{it_1\mu_1+\frac12(it_1)^2\sigma_1^2}\right]\cdot\left[e^{it_2\mu_2+\frac12(it_2)^2\sigma_2^2}\right]. \qquad (4.80)$$

According to Sects. 3.5.4 and 3.5.7 the variables $x_1$ and $x_2$ are therefore also independent. This is a special feature of the normal distribution because, in general, when two variables have zero correlation they need not be independent.

For the general case, when $\rho\neq0$, the marginal distribution in one variable is found by integrating the binormal p.d.f. (4.77) over the second variable (this is the definition of a marginal probability density given in Sect.3.5.5). Thus the marginal distribution in $x_1$ is obtained by evaluating the integral

$$h_1(x_1) = \int_{-\infty}^{\infty} f(x_1,x_2)\,dx_2 .$$

In substituting $f(x_1,x_2)$ from eq.(4.77) the factors can here be organized to give

$$h_1(x_1) = \frac{1}{\sqrt{2\pi}\,\sigma_1}e^{-\frac12(x_1-\mu_1)^2/\sigma_1^2}\left\{\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}\,\sigma_2\sqrt{1-\rho^2}}\,e^{-\frac12(x_2-C)^2/\sigma_2^2(1-\rho^2)}\,dx_2\right\},$$

where $C$ in the exponent of the integral is independent of the integration variable $x_2$, $C = \mu_2+\rho\,\sigma_2/\sigma_1\,(x_1-\mu_1)$. The curly bracket is therefore an integrated p.d.f., which gives just 1.

… the projections of the binormal distribution perpendicular to the coordinate axes are always of normal shape, independent of the correlation between the variables.

Let us next turn to the conditional distributions which, according to Sect.3.5.5, are defined as the ratio between the joint p.d.f. and the marginal distributions (eq.(3.48)). For example, the conditional distribution $f(x_2|x_1)$ for $x_2$, given $x_1$, is equal to $f(x_1,x_2)/h_1(x_1)$. From the expressions above it will be seen that the curly bracket corresponds to the integral of $f(x_2|x_1)$ over $x_2$. Hence the conditional distribution in $x_2$ for given $x_1$ is

$$f(x_2|x_1) = \frac{1}{\sqrt{2\pi}\,\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{\left[x_2-\mu_2-\rho\,\frac{\sigma_2}{\sigma_1}(x_1-\mu_1)\right]^2}{2\sigma_2^2(1-\rho^2)}\right\}. \qquad (4.82)$$

A corresponding expression is obtained for the other conditional density $f(x_1|x_2)$. Thus any intersection of the two-dimensional surface by a plane perpendicular to a coordinate axis will produce a normal curve. From eq.(4.82) it is interesting to observe that although the mean value of the conditional distribution in $x_2$ for given $x_1$ does depend on $x_1$, the variance does not. Hence the plane intersections for different values of $x_1$ will be a set of normal curves which all have variance $\sigma_2^2(1-\rho^2)$, but with the mean value $\mu_2+\rho\,\sigma_2/\sigma_1\,(x_1-\mu_1)$ growing linearly with $x_1$.

Other interesting features of the binormal distribution are contained in the exercises below; see in particular 4.46-4.48.

Exercise 4.43: Let $x_1$ and $x_2$ be two independent variables which are uniformly distributed between 0 and 1. Show that two new variables $z_1=\sqrt{-2\ln x_1}\,\cos(2\pi x_2)$, $z_2=\sqrt{-2\ln x_1}\,\sin(2\pi x_2)$ will be binormally distributed, with marginal distributions $N(0,1)$ and zero correlation.

Exercise 4.44: Verify that the covariance matrix and its inverse for two vari-
Hence t h e m a r g i n a l d i s t r i b u t i o n i n XI is
able8 w i t h the b i n o m l d i s t r i b u t i o n (4.77) a r e g i v e n by, r e s p e c r i v e l y .

( i ) Shov t h a t t h e b i n o r m a l p.d.f. can be e x p r e s s e d as


~ I Z )= ( Z " ) - ~ [ J V I J - ~eqr-j(2-E)T~-'izk)~
where z = Ixl,x2) and g = (u~,url.

S i n i l a r l y , t h e m a r g i n a l d i s t r i b u t i o n i n xr becomes e q u a l t o N(~2.o:) . Thus


(ii) Show that the marginal distribution in $x_1$ is given by a formula of the form of eq.(4.77b), but with $V$ replaced by the "submatrix" obtained by deleting the second row and column from $V$.

(iii) Show that the conditional distribution in $x_2$, given $x_1$, is also given by a formula of the form (4.77b), but with $V$ replaced by $V^*$, where $V^*$ is the "submatrix" obtained by deleting the first row and column from $V^{-1}$ and inverting the result.

Exercise 4.45: Verify that the characteristic function for two variables with a binormal distribution is given by eq.(4.78), which in vector notation reads

$$\phi(\mathbf{t}) = \exp\left\{(i\mathbf{t})^T\boldsymbol{\mu} + \tfrac{1}{2}(i\mathbf{t})^T V (i\mathbf{t})\right\}.$$

Exercise 4.46: Show that $y = a_1x_1 + a_2x_2 = \mathbf{a}^T\mathbf{x}$, with $x_1$ and $x_2$ related in the binormal p.d.f., is $N(a_1\mu_1 + a_2\mu_2,\; a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + 2a_1a_2\rho\sigma_1\sigma_2) = N(\mathbf{a}^T\boldsymbol{\mu},\; \mathbf{a}^T V\mathbf{a})$. (Hint: Find the characteristic function for $y$.)

Exercise 4.47: (i) Show that two variables constructed as linear combinations of the variables of a binormal distribution (4.77) will also be binormally distributed. (Hint: Write $y_1 = a_1x_1 + a_2x_2$, $y_2 = b_1x_1 + b_2x_2$ and show that the characteristic function $\phi(t_1,t_2)$ for $y_1$ and $y_2$ is of the form of eq.(4.78).)
(ii) Writing $\mathbf{a} = \{a_1,a_2\}$, $\mathbf{b} = \{b_1,b_2\}$, show that the new binormal distribution has the marginal distributions $N(\mathbf{a}^T\boldsymbol{\mu},\, \mathbf{a}^T V\mathbf{a})$ and $N(\mathbf{b}^T\boldsymbol{\mu},\, \mathbf{b}^T V\mathbf{b})$ for $y_1$ and $y_2$, respectively, and that their covariance is $\mathrm{cov}(y_1,y_2) = \mathbf{a}^T V\mathbf{b}$. Hence $y_1$ and $y_2$ will be independent if, and only if, the transformation makes $\mathbf{a}^T V\mathbf{b} = 0$.

Exercise 4.48: Let $x_1, x_2$ be related in the binormal p.d.f. of eq.(4.77).
(i) Show that a change of variables to $y_1, y_2$ by the orthogonal transformation

[transformation not legible in the source]

brings the p.d.f. over to the form

[equation not legible in the source]

which shows that $y_1$ and $y_2$ will be independent and normal. Here, each of the squares $y_i^2$ occurs with a coefficient determined by the latent roots (eigenvalues) of the matrix $V(\mathbf{x})$.
(ii) Show that a subsequent scale transformation from $y_1, y_2$ to $z_1, z_2$ [explicit expressions not legible in the source] makes $Q = z_1^2 + z_2^2$ and

$$f(z_1,z_2) = \left[\frac{1}{\sqrt{2\pi}}\exp(-\tfrac{1}{2}z_1^2)\right]\cdot\left[\frac{1}{\sqrt{2\pi}}\exp(-\tfrac{1}{2}z_2^2)\right].$$

This shows that $z_1$ and $z_2$ will be independent and standard normal. The quantity $Q$, now in the form of a sum of two squared, independent $N(0,1)$ variables, is seen to be $\chi^2(2)$, in view of the definition of a chi-square variable (see Sect.5.1.1).

4.9.2 Example: construction of a binormal random number generator

For Monte Carlo simulations of particle production and subsequent detection in an experimental set-up it is sometimes desirable to have available a method for generating variables which are normally distributed and internally correlated. It is known, for example, that a charged particle moving in a uniform magnetic field describes a helix curve with axis along the field direction, which can be parametrized in terms of three quantities $1/p$, $\lambda$, and $\phi$: $1/p$ measures the curvature of the helix projection in a plane perpendicular to the field, $\phi$ gives the azimuthal angle in this plane, and $\lambda$ the dip angle of the helix relative to the same plane. Further, experience has shown that each of these quantities under measurement can be considered as a normally distributed variable, with a spread around the central (true) value as implied by the accuracy of the measuring system; in addition, $1/p$ and $\phi$ are correlated. To simulate an experiment by the Monte Carlo technique one therefore needs a prescription for obtaining sets of random, normal variables, such that two of them possess a mutual relationship corresponding to a binormal distribution of specified correlation, corresponding to that between the curvature and the azimuthal angle.

We assume that a generator for Gaussian random numbers is available, which upon call produces a "random number" $z$ such that, in the long run, $z$ will be nearly $N(0,1)$. Any independent normal variable of mean value $\mu$ and standard deviation $\sigma$ is then simply constructed as $\mu + \sigma z$.

Suppose that two dependent variables $x_1$ and $x_2$ are required to have the correlation coefficient $\rho$. We then use the Gaussian random number generator to obtain two independent standard normal variables $z_1$ and $z_2$, and write

[transformation of eq.(4.83) not legible in the source]

This transformation represents the solution to our problem; the joint p.d.f. for $x_1$ and $x_2$ becomes

$$f(x_1,x_2) = \frac{1}{2\pi\sqrt{1-\rho^2}}\exp\left\{-\frac{x_1^2 - 2\rho x_1x_2 + x_2^2}{2(1-\rho^2)}\right\}.$$

The variables $x_1$ and $x_2$ are therefore binormally distributed with correlation coefficient $\rho$, as required, and each variable has marginal distribution $N(0,1)$. Clearly, if non-zero mean values and standard deviations different from 1 are needed, this can be achieved by appropriate location shifts and scalings. Thus the case with parameters $\mu_1, \mu_2, \sigma_1^2, \sigma_2^2$ is obtained by replacing the left-hand sides in eq.(4.83) by $(x_1-\mu_1)/\sigma_1$ and $(x_2-\mu_2)/\sigma_2$, respectively. The reader will recognize this transformation as the inverse of the general transformation in Exercise 4.48 of the previous section.
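The prescription of Sect.4.9.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, Python's `random.Random.gauss` stands in for the "Gaussian random number generator", and the linear transformation used, $x_1 = z_1$, $x_2 = \rho z_1 + \sqrt{1-\rho^2}\,z_2$, is one common choice that yields $N(0,1)$ marginals with correlation $\rho$; it is not necessarily the form of eq.(4.83).

```python
import math
import random


def binormal_pair(rng, rho, mu1=0.0, mu2=0.0, sigma1=1.0, sigma2=1.0):
    """Return one correlated normal pair (x1, x2).

    Assumed transformation (one possible choice, not necessarily eq. 4.83):
        x1 = z1,  x2 = rho*z1 + sqrt(1 - rho^2)*z2
    followed by the location shifts and scalings described in the text.
    """
    z1 = rng.gauss(0.0, 1.0)  # independent standard normal numbers
    z2 = rng.gauss(0.0, 1.0)
    x1 = z1
    x2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
    return mu1 + sigma1 * x1, mu2 + sigma2 * x2


def generate(n, rho, seed=12345, **params):
    """Generate n pairs with a reproducible seed."""
    rng = random.Random(seed)
    return [binormal_pair(rng, rho, **params) for _ in range(n)]


def sample_correlation(pairs):
    """Sample correlation coefficient of a list of (x1, x2) pairs."""
    n = len(pairs)
    m1 = sum(p[0] for p in pairs) / n
    m2 = sum(p[1] for p in pairs) / n
    s11 = sum((p[0] - m1) ** 2 for p in pairs)
    s22 = sum((p[1] - m2) ** 2 for p in pairs)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in pairs)
    return s12 / math.sqrt(s11 * s22)


if __name__ == "__main__":
    pairs = generate(20000, rho=0.6, mu1=1.0, sigma1=2.0)
    print("sample correlation:", sample_correlation(pairs))
```

With a few thousand pairs the sample correlation should settle close to the requested $\rho$, which gives a quick check of the generator before it is used in a simulation.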
4.10 THE MULTINORMAL DISTRIBUTION

4.10.1 Definition and properties

The normal distributions of dimension 1 and 2, as defined by eqs.(4.59) and (4.77), respectively, lead us to search for a multinormal or n-dimensional normal distribution, for which the p.d.f. should be of an exponential type and the exponent have a general quadratic dependence on n variables $x_i$,

$$f(\mathbf{x}) = C\,\exp\Big\{-\tfrac{1}{2}\sum_{i,j=1}^{n} c_{ij}\,(x_i-\mu_i)(x_j-\mu_j)\Big\}.$$

This will be an allowed p.d.f. provided the coefficients $c_{ij}$, symmetric in the indices, are such that the integral over the n-dimensional real space exists, and $C$ ensures a proper overall normalization.

With an eye to the two-dimensional case, for which vector notation was introduced in Exercise 4.44, it is at once realized that a more compact form of eq.(4.83) is

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}}\exp\left\{-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T V^{-1}(\mathbf{x}-\boldsymbol{\mu})\right\}, \qquad (4.84)$$

where $\mathbf{x} = \{x_1,x_2,\ldots,x_n\}$, $\boldsymbol{\mu} = \{\mu_1,\mu_2,\ldots,\mu_n\}$ and $V$ is the symmetric $n\times n$ covariance matrix for $\mathbf{x}$. The denominator of the overall normalization factor includes one $(2\pi)^{1/2}$ for each dimension of $\mathbf{x}$, and the square root of the determinant of $V$. Clearly we must have $|V| \neq 0$, and $V$ (and $V^{-1}$) must be positive definite, for the integral of $f(\mathbf{x})$ to exist.

The characteristic function for the p.d.f. (4.84) is

$$\phi(\mathbf{t}) = \exp\left\{(i\mathbf{t})^T\boldsymbol{\mu} + \tfrac{1}{2}(i\mathbf{t})^T V (i\mathbf{t})\right\}, \qquad (4.85)$$

from which the moments can be calculated in a straightforward manner as described in Sect.3.5.7. For example, $E(x_r) = \mu_r$, $V(x_r) = V_{rr}$ and $\mathrm{cov}(x_r,x_s) = V_{rs}$ for any $r,s = 1,2,\ldots,n$. With $V_{rr} = \sigma_r^2$ and the correlation coefficient between the variables $x_r$ and $x_s$ given by $\rho_{rs} = \rho(x_r,x_s) = V_{rs}/(V_{rr}V_{ss})^{1/2}$ the covariance matrix takes the general form

$$V = \begin{pmatrix}\sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \cdots & \rho_{1n}\sigma_1\sigma_n\\ \rho_{12}\sigma_1\sigma_2 & \sigma_2^2 & \cdots & \rho_{2n}\sigma_2\sigma_n\\ \vdots & \vdots & \ddots & \vdots\\ \rho_{1n}\sigma_1\sigma_n & \rho_{2n}\sigma_2\sigma_n & \cdots & \sigma_n^2\end{pmatrix}.$$

If $V$ is a diagonal matrix, implying that all $x_i$ are uncorrelated, the inverse matrix $V^{-1}$ will also be diagonal. The exponent in eq.(4.84) will then have no terms mixing the different $x_i$, so that the joint p.d.f. $f(\mathbf{x})$ will be factorizable into n one-dimensional normal p.d.f.'s for the separate components $x_1,x_2,\ldots,x_n$, showing that they are all independent. In fact, the variables in a multinormal distribution are independent if, and only if, the covariance matrix is diagonal.

Similar reasoning to that applied for the binormal distribution shows that any projection of the $f(\mathbf{x})$ from eq.(4.84) to a space of lower dimension gives a new distribution which is of the same form, but with a covariance matrix obtained from the original $V$ by deleting the rows and columns corresponding to the variables projected away. In particular, integrating over all variables except $x_i$ gives the marginal distribution in this variable, which is $N(\mu_i,\sigma_i^2)$. Also, intersecting $f(\mathbf{x})$ by a plane perpendicular to one of the coordinate axes, say $x_i$, produces a conditional distribution which is an $(n-1)$-dimensional normal p.d.f.; the matrix $V = V^*$ in this conditional distribution is obtained by deleting the i-th row and column from the original $V^{-1}$ and inverting the resultant submatrix.

Other properties of the normal distribution in two dimensions are found to extend to the multi-dimensional case. Thus any linear function of the $x_i$ from a multinormal distribution is itself a (one-dimensional) normally distributed variable (see Exercise 4.49); more generally, a set of variables made up as linear combinations of the multinormal $x_i$ will also be multinormally distributed (Exercise 4.50). This is a vast generalization of the addition theorem for normal variables from Sect.4.8.5.

Exercise 4.49: Let $y$ be a linear function of the $x_i$ with the multinormal distribution (4.84), $y = \sum_{i=1}^{n} a_ix_i = \mathbf{a}^T\mathbf{x}$. Show that $y$ is $N(\mathbf{a}^T\boldsymbol{\mu},\, \mathbf{a}^T V\mathbf{a})$.

Exercise 4.50: Let $y_1,y_2,\ldots,y_m$ be a set of linear functions of the $x_i$ with the multinormal p.d.f. (4.84), such that $\mathbf{y} = S\mathbf{x}$, where the matrix $S$ is of dimension $m\times n$ ($m < n$). Show that the characteristic function $\phi(t_1,t_2,\ldots,t_m)$ has a form which implies that the $y_i$ are multinormally distributed with a vector of mean values $S\boldsymbol{\mu}$ and covariance matrix $SVS^T$.

4.10.2 The quadratic form Q

In the multinormal p.d.f. the exponent has been expressed as $-\tfrac{1}{2}Q$, where $Q$ is the quadratic form, or covariance form,

$$Q = (\mathbf{x}-\boldsymbol{\mu})^T V^{-1}(\mathbf{x}-\boldsymbol{\mu}), \qquad (4.87)$$

in which $\boldsymbol{\mu}$ and $V$ are, respectively, the mean value and covariance matrix for the n-component variable $\mathbf{x}$. For the special cases with $n = 1$ and $n = 2$ it has been indicated earlier that the variable $Q$ has a distribution which is that of a chi-square variable with, respectively, 1 and 2 degrees of freedom (compare Exercises 4.39 and 4.48). It turns out that $Q$ also for a general value of n has a chi-square distribution with n degrees of freedom and thus is a function of one parameter only. This is quite remarkable, in view of the formal structure of $Q$ as a function involving many parameters.

When the covariance matrix is non-singular, as we have assumed, it is always possible to find a linear transformation to a new set of variables $y_i$ from the $x_i$, which is such that it brings $Q$ to a form with a sum of squares in $y_i$, and with no terms mixing the different $y_i$. In particular, an orthogonal transformation will produce $Q$ as a sum of squares of n independent $y_i$, where the coefficients are given by the latent roots of the covariance matrix $V(\mathbf{x})$; in common language, this transformation serves to "diagonalize the covariance matrix". A further transformation to a new set of scaled variables, say $z_i$, will express $Q$ as a sum of n squares of independent, standard normal $z_i$ and, in accordance with the definition of Sect.5.1.1, $Q$ is therefore $\chi^2(n)$, a chi-square variable with n degrees of freedom.

If the covariance matrix is singular, $|V| = 0$, its inverse does not exist, and hence eq.(4.84) will lose its usual meaning. There is then at least one linear relation between the $x_i$, or equivalently, one or more of the $x_i$ is redundant. In this case we shall take eq.(4.84) to mean the multinormal distribution in a space of dimension $(n-r)$, where r is the number of linear redundancies, implying the covariance matrix to be eliminated for the rows and columns corresponding to the redundant $x_i$'s. Similarly, the quadratic form $Q$ can be considered as composed of an equivalent reduced number of terms, making it in this case distributed as $\chi^2(n-r)$ *). This property of quadratic forms of normally distributed variables will be frequently referred to later in this book; since, however, the proof is rather formal and sophisticated its details will not be given here.

*) The number of independent variables, $n-r$, is often called the rank of the quadratic form.

Exercise 4.51: (Construction of a multinormal random number generator) Given n independent variables $z_i$, all $N(0,1)$. Discuss the problem of finding n variables $x_i$, linearly related to $z_i$, such that the $x_i$ become multinormally distributed with given covariance matrix.

4.11 THE CAUCHY, OR BREIT-WIGNER, DISTRIBUTION

The Cauchy distribution has the form

$$f(x) = \frac{1}{\pi}\,\frac{1}{1+x^2}, \qquad -\infty < x < \infty, \qquad (4.88)$$

and is an allowed probability density function since the integral of $f(x)$ over all $x$ is equal to one. This distribution is of interest to particle physicists because it produces the Breit-Wigner shape. Unfortunately, however, the function $f(x)$ as defined above is mathematically awkward. Difficulties arise immediately if one tries to evaluate the expectation value of $x$ with the p.d.f. $f(x)$, since the integral

$$\int_{-\infty}^{\infty} x f(x)\,dx$$

is not completely convergent. This means that the limiting value

$$\lim_{L,L'\to\infty}\int_{-L}^{L'} x f(x)\,dx$$

does not exist, although the principal value, defined with $L' = L$, does exist and is equal to zero. Following convention we shall regard the distribution of eq.(4.88) as not possessing a mean. The same convergence situation applies to all other moments $x^k$. Thus we may say that for the Cauchy distribution no moments are defined, since they all diverge.

One way of getting out of the dilemma is to impose a restriction on the domain of the variable $x$. The integral of $f(x)$ over a finite interval $[-L,+L]$ is equal to $(2/\pi)\tan^{-1}L$. If we therefore redefine our $f(x)$ by this normalization factor and write

$$f'(x) = \frac{1}{2\tan^{-1}L}\,\frac{1}{1+x^2}, \qquad -L \le x \le L,$$

this will be an acceptable p.d.f. which is properly normalized and for which all moments exist. From its symmetric form, all odd moments of $f'(x)$ vanish identically; in particular, $E(x) = 0$. We also find

$$V(x) = E(x^2) = \frac{L}{\tan^{-1}L} - 1 .$$

This expression for the variance illustrates the state of affairs for the Cauchy p.d.f. (4.88): the tails of this distribution tend so slowly to zero that convergence is prevented. Indeed, when $L$ is permitted to grow indefinitely, the variance can become arbitrarily large, since $V(x) \to \infty$ when $L \to \infty$.

For moderate $x$ values the shape of the Cauchy distribution (4.88) is not very different from the standard normal, as one can see from Fig. 4.9. A quantitative expression of their different tail behaviour is provided by the following table, which gives the fraction of each distribution in both tails beyond the indicated values of $|x|$. For comparison, the table also shows the corresponding fractions for the double exponential distribution (see Exercise 4.55), which in this respect is seen to have an intermediate behaviour.

                       Fraction of distribution in tails
Distribution           |x| > 1   |x| > 2   |x| > 3   |x| > 4   |x| > 6
Standard normal         .3173     .0455     .0027     .00006
Double exponential      .3679     .1353     .0498     .0183     .0025
Cauchy                  .5000     .2952     .2048     .1560     .1051

Fig. 4.9. The Cauchy or Breit-Wigner distribution (solid curve) and the standard normal distribution (dashed curve). The half-widths at half-maximum are indicated by arrows of length 1 and ≈ 1.18, respectively.

Exercise 4.52: Show that the Breit-Wigner formula for an s-wave resonance of central value $M_0$ and full width $\Gamma$ at half maximum,

$$f(M) = \frac{1}{\pi}\,\frac{\tfrac{1}{2}\Gamma}{(M-M_0)^2 + (\tfrac{1}{2}\Gamma)^2},$$

corresponds to a Cauchy distribution. Note that with this form, half-maximum occurs for $M = M_0 \pm (\tfrac{1}{2}\Gamma)$, whereas a Gaussian shape $N(M_0,(\tfrac{1}{2}\Gamma)^2)$ has half-maximum at $M = M_0 \pm 1.18(\tfrac{1}{2}\Gamma)$; compare Exercise 4.37.
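The tail fractions tabulated in Sect.4.11 follow from closed-form expressions and can be reproduced with a short Python sketch. The function names are illustrative only; the sketch assumes the Cauchy density $1/[\pi(1+x^2)]$ of eq.(4.88), the double exponential density $\tfrac{1}{2}e^{-|x|}$ of Exercise 4.55, and the standard normal. The last function evaluates the variance $L/\tan^{-1}L - 1$ of the Cauchy density truncated to $[-L,+L]$, to show how it grows without bound.

```python
import math


def tail_standard_normal(x):
    """Fraction of N(0,1) in both tails, |z| > x."""
    return math.erfc(x / math.sqrt(2.0))


def tail_double_exponential(x):
    """Fraction of f(x) = (1/2) exp(-|x|) in both tails beyond x."""
    return math.exp(-x)


def tail_cauchy(x):
    """Fraction of f(x) = 1 / (pi (1 + x^2)) in both tails beyond x."""
    return 1.0 - (2.0 / math.pi) * math.atan(x)


def truncated_cauchy_variance(L):
    """Variance of the Cauchy density restricted to [-L, +L] and renormalized."""
    return L / math.atan(L) - 1.0


if __name__ == "__main__":
    for x in (1, 2, 3, 4, 6):
        print(x,
              round(tail_standard_normal(x), 5),
              round(tail_double_exponential(x), 4),
              round(tail_cauchy(x), 4))
    for L in (10.0, 100.0, 1000.0):
        print("L =", L, " variance =", truncated_cauchy_variance(L))
```

The slow, roughly $2/(\pi x)$ decay of the Cauchy tail fraction, compared with the exponential and Gaussian fall-off of the other two, is what prevents its moments from converging.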
Exercise 4.53: Show that the characteristic function for the Cauchy p.d.f. is $\phi(t) = e^{-|t|}$. Note that this function has no Taylor expansion around the origin and that therefore the moments of the Cauchy p.d.f. do not exist.

Exercise 4.54: Let $x_1,x_2,\ldots,x_n$ be n independent Cauchy variables, each distributed according to eq.(4.88). Show that $\bar{x} = \frac{1}{n}\sum_{i=1}^{n}x_i$ has the same distribution.
This result may at first appear somewhat surprising, in view of what has been learned from the Central Limit Theorem. One might perhaps have expected that the arithmetic mean $\bar{x}$ of the n independent variables would become approximately normally distributed for very large n. The apparent discrepancy is due to the fact that the n Cauchy variables do not fulfil the requirement of possessing a finite variance, which was essential in deriving the Central Limit Theorem.

Exercise 4.55: (The double exponential distribution) Discuss the properties of a variable with the p.d.f. $f(x) = \tfrac{1}{2}e^{-|x|}$, $-\infty < x < \infty$. Note in particular that this distribution has tails which drop off more slowly than the standard normal but faster than the Cauchy distribution.

5. Sampling distributions

The previous chapter has dealt rather extensively with the characteristics and properties of some probability distributions which have been found to describe various physical phenomena quite accurately under certain ideal conditions.

The present chapter will be devoted to a study of the properties of three sampling distributions which are related to the normal. The chi-square, the Student's t, and the F-distributions do not have any direct physical analogues, but can be connected to experimental situations where the normal distribution is supposed to describe the outcome of measurements. The motivation to study these distributions may perhaps not be clear to the reader at the moment, in which case he should proceed to the following chapters and return to this point when it is found necessary. It may suffice to mention that, although some applications of the sampling distributions are found already in Chapter 7, in connection with simple inference problems involving samples from the normal distribution, a full appreciation of the sampling distributions discussed here will first become evident in the last chapter of the book. Thus a number of examples on hypothesis testing indeed presuppose knowledge of the standard sampling distributions as well as some acquaintance with the related non-central sampling distributions, which are introduced in the exercises of the present chapter.

5.1.1 Definition

Let us assume that there is given a set of n mutually independent random variables $x_1,x_2,\ldots,x_n$ which are all normal $N(\mu,\sigma^2)$. We may for instance think of the $x_i$'s as the outcomes of n repeated measurements on the same physical system or n independent observations on the same quantity. Then the $x_i$'s constitute a sample of size n from a population which is normal with mean $\mu$ and variance $\sigma^2$. We define the chi-square sum $\chi^2$ by adding the squares of the standardized normal variables $(x_i-\mu)/\sigma$, viz.

$$\chi^2 = \sum_{i=1}^{n}\left(\frac{x_i-\mu}{\sigma}\right)^2 . \qquad (5.1)$$

The variable $\chi^2$ has a probability density function given by

$$f(\chi^2;n) = \frac{1}{2^{n/2}\,\Gamma(\tfrac{1}{2}n)}\,(\chi^2)^{\frac{1}{2}n-1}\,e^{-\frac{1}{2}\chi^2}, \qquad 0 \le \chi^2 < \infty, \qquad (5.2)$$

which is called the chi-square distribution with n degrees of freedom. In eq.(5.2), $\Gamma$ is the gamma function, for which $\Gamma(x+1) = x\Gamma(x)$, $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$, $\Gamma(1) = 1$.

Notice that the number of degrees of freedom is the number of independent variables making up the $\chi^2$ sum (5.1).

Remark 1. The sample $x_1,x_2,\ldots,x_n$ from $N(\mu,\sigma^2)$ has a sample variance

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2,$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n}x_i$ is the sample mean. It was stated in Sect.4.8.6 and will be proved in Sect.5.1.6 that the quantity $(n-1)s^2/\sigma^2$ is distributed as a chi-square variable with $n-1$ degrees of freedom. This is equivalent to saying that

$$\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{\sigma}\right)^2 \ \text{is}\ \chi^2(n-1); \qquad (5.3)$$

in words, this sum of squares has a chi-square distribution with $n-1$ degrees of freedom. This may at a first glance seem to contradict the definition of the chi-square distribution in this section, which implies that

$$\sum_{i=1}^{n}\left(\frac{x_i-\mu}{\sigma}\right)^2 \ \text{is}\ \chi^2(n). \qquad (5.4)$$

The disagreement is, however, only apparent, for the following reason: In the expression (5.4), $\mu$ is considered a known quantity, given independently of the $x_i$; in the expression (5.3), on the other hand, $\bar{x}$ is a derived quantity, namely the arithmetic mean of the $x_i$. When this average has been calculated and adopted as an estimate of the true, but unknown parameter $\mu$, only $n-1$ of the squares in the sum $s^2$ are independent quantities. Loosely speaking, one degree of freedom has been lost, since it has been used to estimate the unknown parameter (the central value) of the distribution.

Remark 2. From a mathematical viewpoint the requirement that all the $x_i$ should be similarly distributed (all $N(\mu,\sigma^2)$) is unnecessarily restrictive. In general, for n independent variables $x_i$ from normal distributions $N(\mu_i,\sigma_i^2)$ the sum of squares $\chi^2 = \sum_{i=1}^{n}y_i^2$ of the standardized quantities $y_i = (x_i-\mu_i)/\sigma_i$ will have a chi-square distribution with n degrees of freedom.
If the variables $y_i$ in the sum $\chi^2 = \sum_{i=1}^{n}y_i^2$ are not $N(0,1)$, but more generally distributed with unit variances and means different from zero (and not necessarily equal), the variable $\chi^2$ will have a non-central chi-square distribution with n degrees of freedom. See Exercise 5.12.

5.1.2 Proof for the chi-square p.d.f.

In our discussion of the standard, or central, chi-square distribution we shall for convenience write $u$ instead of $\chi^2$. Also we shall use the Greek symbol $\nu$ for n, to be in accordance with our convention for parameters entering a p.d.f. Thus the chi-square distribution for $u$ is written

$$f(u;\nu) = \frac{1}{2^{\nu/2}\,\Gamma(\tfrac{1}{2}\nu)}\,u^{\frac{1}{2}\nu-1}\,e^{-\frac{1}{2}u}, \qquad 0 \le u < \infty. \qquad (5.5)$$

We shall not prove eq.(5.5) for the general case of arbitrary $\nu$. For the most trivial case when $\nu = 1$ the reader should have no difficulties in verifying the formula by using the change-of-variable technique outlined in Sect.3.7, recalling only that putting $u = \left(\frac{x-\mu}{\sigma}\right)^2$ implies a two-to-one transformation from $x$ to $u$; compare Exercise 3.6. Here we shall be satisfied with verifying eq.(5.5) for the case $\nu = 2$. The case $\nu = 3$ can be treated in an analogous way (Exercise 5.1). For the general case of an arbitrary $\nu$ a proof can be given by mathematical induction (Exercise 5.2), or using a more direct method (see for instance Kendall and Stuart, Chapter 11, Vol.1).

For $\nu = 2$ we have

$$u = \left(\frac{x_1-\mu}{\sigma}\right)^2 + \left(\frac{x_2-\mu}{\sigma}\right)^2,$$

where $x_1$ has a p.d.f. $f(x_1) = (2\pi\sigma^2)^{-1/2}\exp\left(-\tfrac{1}{2}\left(\frac{x_1-\mu}{\sigma}\right)^2\right)$ and similarly for $x_2$. The
joint p.d.f. is equal to the product of the p.d.f.'s of the two independent variables $x_1, x_2$ (compare Sect.3.5.4). We define two new variables $\rho$ and $\phi$ by

$$\frac{x_1-\mu}{\sigma} = \rho\cos\phi, \qquad \frac{x_2-\mu}{\sigma} = \rho\sin\phi,$$

where $0 \le \rho \le \infty$, $0 \le \phi \le 2\pi$. The transformation from the set $x_1,x_2$ to the set $\rho,\phi$ involves the Jacobian

$$|J| = \left|\frac{\partial(x_1,x_2)}{\partial(\rho,\phi)}\right| = \sigma^2\rho .$$

Hence the joint p.d.f. in terms of the new variables becomes

$$f(\rho,\phi) = f(x_1)\,f(x_2)\cdot|J| = \frac{1}{2\pi}\,e^{-\frac{1}{2}\rho^2}\cdot\rho ,$$

and is independent of $\phi$. The marginal distribution in $\rho$ obtained by integrating over $\phi$ (Sect.3.5.5) becomes simply

$$f(\rho) = e^{-\frac{1}{2}\rho^2}\cdot\rho .$$

Since the relation between the variables $u$ and $\rho$ is

$$u = \rho^2,$$

the p.d.f. for $u$ is given by

$$f(u) = f(\rho)\left|\frac{d\rho}{du}\right| = \tfrac{1}{2}\,e^{-\frac{1}{2}u},$$

which is seen to be $f(u;2)$, the chi-square distribution with two degrees of freedom.

5.1.3 Properties of the chi-square distribution

The chi-square distribution of eq.(5.5) is shown in Fig. 5.1 for selected values of the parameter $\nu$.

Fig. 5.1. The chi-square distribution for different degrees of freedom $\nu$.

For $\nu \le 2$ the chi-square distribution is monotonically decreasing with increasing $u$; indeed $\nu = 1$ implies an infinite ordinate at $u = 0$. For $\nu > 2$ the distribution has a maximum value (mode) at $u = \nu - 2$. It is seen that the chi-square distributions correspond to a special class of the more general gamma distribution (Sect.4.7.1).

The characteristic function for the chi-square distribution is found from the definition eq.(3.22),

$$\phi(t) = E(e^{itu}) = \int_0^{\infty} e^{itu} f(u;\nu)\,du .$$

Inserting the p.d.f. of eq.(5.5) and carrying out the integration leads to

$$\phi(t) = (1-2it)^{-\frac{1}{2}\nu}. \qquad (5.6)$$
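The shape properties quoted above for the chi-square p.d.f. of eq.(5.5) can be checked numerically. The following Python sketch is an illustration only: the function names, the trapezoidal integration and the grid search for the mode are choices made here, not part of the text. It evaluates $f(u;\nu)$, confirms that it integrates to one, that $f(u;2) = \tfrac{1}{2}e^{-u/2}$, and that the maximum for $\nu > 2$ lies at $u = \nu - 2$.

```python
import math


def chi2_pdf(u, nu):
    """Chi-square p.d.f. f(u; nu) of eq. (5.5) for u >= 0."""
    if u < 0.0:
        return 0.0
    if u == 0.0:
        if nu < 2:
            return math.inf  # infinite ordinate at u = 0 when nu = 1
        return 0.5 if nu == 2 else 0.0
    log_f = ((0.5 * nu - 1.0) * math.log(u) - 0.5 * u
             - 0.5 * nu * math.log(2.0) - math.lgamma(0.5 * nu))
    return math.exp(log_f)


def trapezoid(func, a, b, n):
    """Composite trapezoidal rule with n equal steps on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


def grid_mode(nu, u_max=50.0, step=0.01):
    """Locate the maximum of f(u; nu) on a grid (intended for nu > 2)."""
    best_u, best_f = 0.0, -1.0
    for i in range(1, int(u_max / step) + 1):
        u = i * step
        f = chi2_pdf(u, nu)
        if f > best_f:
            best_u, best_f = u, f
    return best_u


if __name__ == "__main__":
    print("integral for nu=4:", trapezoid(lambda u: chi2_pdf(u, 4), 0.0, 60.0, 20000))
    print("mode for nu=6:", grid_mode(6))
```

The upper integration limit of 60 is an assumption that is adequate for small $\nu$; for larger $\nu$ it would have to be extended, since the distribution moves to the right with mean $\nu$.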
When the characteristic function is known one can easily evaluate expectation values by differentiating with respect to $(it)$ and putting $t = 0$, as demonstrated in Sect.3.4. One finds in particular

$$E(u) = \left[\frac{\partial\phi}{\partial(it)}\right]_{t=0} = \nu, \qquad E(u^2) = \left[\frac{\partial^2\phi}{\partial(it)^2}\right]_{t=0} = \nu(\nu+2), \qquad V(u) = E(u^2) - (E(u))^2 = 2\nu .$$

Thus the chi-square distribution has mean value $\nu$ and variance $2\nu$.

In general the algebraic moment of order k, or the k-th moment about the origin, for the chi-square distribution is given by

$$E(u^k) = \left[\frac{\partial^k\phi}{\partial(it)^k}\right]_{t=0} = \nu(\nu+2)\cdots(\nu+2(k-1)) = 2^k\,\frac{\Gamma(\tfrac{1}{2}\nu+k)}{\Gamma(\tfrac{1}{2}\nu)} . \qquad (5.8)$$

This relation, together with the generally valid formula (3.18), permits the determination of all central moments up to any desired order. In particular, the central moments of orders 3 and 4 are found to be

$$\mu_3 = 8\nu, \qquad \mu_4 = 12\nu(\nu+4).$$

Thus the asymmetry and kurtosis coefficients, from their definition by eqs.(3.20) and (3.21), respectively, become

$$\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = 2\sqrt{\frac{2}{\nu}}, \qquad \gamma_2 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{12}{\nu} .$$

These numbers express the tendency seen in Fig. 5.1, that the skewness of the chi-square distribution decreases for increasing $\nu$, while the shape becomes more "bell"-like. Visually the distribution looks "normal" already at $\nu = 20$. In the limit $\nu \to \infty$ the coefficients $\gamma_1$ and $\gamma_2$ are zero, indicating exact symmetry and a peaking equal to that of a normal distribution.

In fact, it can be shown that asymptotically the chi-square distribution does indeed become identical to the normal distribution. To see this it is sufficient to demonstrate that the characteristic functions for the two distributions become equal in the limit of large $\nu$. Guided by the established facts that the mean and variance for the chi-square distribution are given by $\nu$ and $2\nu$, respectively, we form the standardized variable

$$y_1 = \frac{u-\nu}{\sqrt{2\nu}} .$$

The characteristic function for this variable is

$$\phi_{y_1}(t) = E(e^{ity_1}) = E\left[\exp\left(it\,\frac{u-\nu}{\sqrt{2\nu}}\right)\right] = \exp\left(-\frac{it\nu}{\sqrt{2\nu}}\right)E\left[\exp\left(\frac{itu}{\sqrt{2\nu}}\right)\right],$$

which can be rewritten as

$$\phi_{y_1}(t) = \exp\left(-it\sqrt{\tfrac{1}{2}\nu}\right)\left(1 - it\sqrt{\frac{2}{\nu}}\right)^{-\frac{1}{2}\nu},$$

where, in the last step, we have inserted the expression of eq.(5.6) for the characteristic function of the variable $u$. Taking the logarithm and expanding the last term we have

$$\ln\phi_{y_1}(t) = -it\sqrt{\tfrac{1}{2}\nu} - \tfrac{1}{2}\nu\,\ln\left(1 - it\sqrt{\frac{2}{\nu}}\right) = -\tfrac{1}{2}t^2 + O(\nu^{-1/2}).$$

When $\nu$ goes towards infinity, $\phi_{y_1}(t) \to e^{-\frac{1}{2}t^2}$; thus in the limit of infinite $\nu$ the variable $y_1$ has the characteristic function of a standard normal variable. Hence the original variable $u$, for limiting values of $\nu$, will also be normal, namely $N(\nu,2\nu)$.

Mathematically, the approach of $y_1 = (u-\nu)/\sqrt{2\nu}$ to $N(0,1)$ is rather slow. One can show, see Exercises 5.10, 5.11, that the variable

$$y_2 = \sqrt{2u} - \sqrt{2\nu-1}$$

represents a better approximation to $N(0,1)$.
5.1.4 Probability contents of the chi-square distribution

In practice one is frequently interested in the cumulative chi-square distribution to calculate confidence intervals or for testing hypotheses involving chi-square distributed variables.
Figure 5.2 gives probability contents of the chi-square p.d.f. for different numbers of degrees of freedom. The figure shows a double-logarithmic display of the quantities $F(\chi^2_\alpha;\nu)$ and $\alpha$ versus $\chi^2_\alpha$, as implied by the relation

$$F(\chi^2_\alpha;\nu) = \int_0^{\chi^2_\alpha} f(u;\nu)\,du = 1 - \alpha. \qquad (5.13)$$

Appendix Table A8 gives values of $\chi^2_\alpha$ for different $\nu$ and specified entries of $F(\chi^2_\alpha;\nu)$.
When the number of degrees of freedom is sufficiently large, $\nu \gtrsim 30$, the probability contents of the chi-square distribution can easily be found using the fact that the variable $y_2$ of eq.(5.12) (or $y_1$ of eq.(5.11)) is approximately standard normal. See Exercise 5.8.
It may be worth noting that the p.d.f. for the variable $F(\chi^2_\alpha;\nu)$ of eq.(5.13) is uniform over the interval [0,1]. (This fact is generally true for any variable defined by the cumulative integral of a p.d.f., see Sect.6.5.1.) We shall see an example of the usefulness of this property in Sect.10.6.4.
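The large-$\nu$ recipe above can be tried out numerically. The following is a minimal sketch, not part of the original text: it compares the two normal approximations $y_1 = (u-\nu)/\sqrt{2\nu}$ and $y_2 = \sqrt{2u}-\sqrt{2\nu-1}$ with the exact cumulative chi-square distribution for the numbers quoted in Exercise 5.8 ($\chi^2 = 40.3$, $\nu = 30$). The function names are illustrative, and the exact value is obtained from the closed-form sum that holds only for an even number of degrees of freedom, which is an assumption of this sketch.

```python
import math


def normal_cdf(z):
    """Cumulative standard normal distribution G(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chi2_cdf_even(x, nu):
    """Exact F(x; nu) for even nu: 1 - exp(-x/2) * sum_{k<nu/2} (x/2)^k / k!."""
    if nu % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = 0.5 * x
    term = 1.0   # k = 0 term of the sum
    total = 1.0
    for k in range(1, nu // 2):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def y1(u, nu):
    """Standardized variable of eq.(5.11)."""
    return (u - nu) / math.sqrt(2.0 * nu)


def y2(u, nu):
    """Improved approximation of eq.(5.12)."""
    return math.sqrt(2.0 * u) - math.sqrt(2.0 * nu - 1.0)


x, nu = 40.3, 30
exact = chi2_cdf_even(x, nu)
approx_y1 = normal_cdf(y1(x, nu))
approx_y2 = normal_cdf(y2(x, nu))
print(f"exact {exact:.4f}   via y1 {approx_y1:.4f}   via y2 {approx_y2:.4f}")
```

With these inputs the $y_2$ value should lie closer to the exact probability than the $y_1$ value, in line with the figures 0.900, 0.908 and 0.903 quoted in Exercise 5.8.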

Fig. 5.2. Probability contents of the chi-square distribution.

5.1.5 Addition theorem for chi-square distributed variables

It has previously been shown that a linear combination of independent, normal variables is itself a normally distributed variable (the addition theorem for normally distributed variables, Sect.4.8.5). A similar theorem holds for a sum of independent chi-square variables, and may be stated as follows:
Let $u_1,u_2,\ldots,u_r$ be a set of independent variables having chi-square distributions with $\nu_1,\nu_2,\ldots,\nu_r$ degrees of freedom, respectively. Then the sum $v=u_1+u_2+\ldots+u_r$ is also a chi-square distributed variable, with $\nu=\nu_1+\nu_2+\ldots+\nu_r$ degrees of freedom.
This theorem is proved readily by noting that the characteristic function for the variable $v$ gets the same form as the characteristic function for an individual $u_i$. Because of the assumed independence, eq.(3.51) applies, and gives

$$\phi_v(t) = \prod_{i=1}^{r}\phi_{u_i}(t) = \prod_{i=1}^{r}(1-2it)^{-\nu_i/2} = (1-2it)^{-(\nu_1+\nu_2+\ldots+\nu_r)/2}.$$

Hence $v$ is $\chi^2(\nu_1+\nu_2+\ldots+\nu_r)$.
The addition theorem for chi-square variables in fact may appear intuitively correct, because the number of degrees of freedom is nothing but the number of independent terms making up the $\chi^2$ sum.

5.1.6 Proof that $(n-1)s^2/\sigma^2$ for sample from $N(\mu,\sigma^2)$ is $\chi^2(n-1)$

Before leaving the chi-square distribution we want to prove the important fact about the variance of a normal sample which was briefly mentioned in Sect.3.10.2, and emphasized in Sect.4.8.6 as well as in the beginning of this chapter. Specifically, if $x_1,x_2,\ldots,x_n$ is a sample from $N(\mu,\sigma^2)$, with mean $\bar{x} = \frac1n\sum_{i=1}^{n}x_i$ and variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$, we shall prove that the variable

$$\frac{(n-1)s^2}{\sigma^2}$$

is chi-square distributed with $n-1$ degrees of freedom.
From the independent variables $x_i$, which are all $N(\mu,\sigma^2)$, we make a change to a new set of variables $y_i$ by Helmert's transformation,

$$y_1 = \frac{x_1-x_2}{\sqrt{1\cdot 2}},\quad y_2 = \frac{x_1+x_2-2x_3}{\sqrt{2\cdot 3}},\quad \ldots,\quad y_{n-1} = \frac{x_1+x_2+\ldots+x_{n-1}-(n-1)x_n}{\sqrt{(n-1)n}},\quad y_n = \frac{x_1+x_2+\ldots+x_n}{\sqrt{n}} = \sqrt{n}\,\bar{x}.$$

This is an orthonormal transformation, the coefficients $a_{ij}$ in $y_i = \sum_{j=1}^{n}a_{ij}x_j$ satisfying

$$\sum_{i=1}^{n} a_{ij}a_{ik} = \delta_{jk}.$$

The variables $y_i$ are independent and normally distributed, the first $n-1$ of them each being $N(0,\sigma^2)$. Now one has

$$\sum_{i=1}^{n}(x_i-\bar{x})^2 = \sum_{i=1}^{n}x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n}y_i^2 - y_n^2 = \sum_{i=1}^{n-1}y_i^2.$$

Therefore

$$\frac{(n-1)s^2}{\sigma^2} = \sum_{i=1}^{n-1}\left(\frac{y_i}{\sigma}\right)^2,$$

where the right-hand side involves a sum of squares of independent standard normal variables, as required for the definition of a chi-square variable. Since the number of independent terms is $n-1$, the variable $(n-1)s^2/\sigma^2$ is $\chi^2(n-1)$.
The proof that $\bar{x}$ and $s^2$ for a normal sample are also independent variables is left as an exercise for the reader (Exercise 5.9).

Exercise 5.1: Verify eq.(5.5) for the case $\nu=3$. (Hint: Use the procedure in the text for $\nu=2$, putting [...].)

Exercise 5.2: Prove eq.(5.5) by mathematical induction.

Exercise 5.3: For the chi-square p.d.f. with $\nu$ degrees of freedom show that $E(u^r) = 2^r\,\Gamma(\tfrac12\nu+r)/\Gamma(\tfrac12\nu)$ for all positive and negative integers $r$ satisfying $\nu+2r>0$. Note that this is a more general result than that implied by eq.(5.8), where $k$ is assumed to be a positive integer.

Exercise 5.4: Show that the chi-square distribution has the characteristic function of eq.(5.6).

Exercise 5.5: Verify eqs.(5.8)-(5.10).

Exercise 5.6: Let $x_1,x_2,\ldots,x_r$ be $r$ independent variables which are all uniformly distributed over the interval [0,1]. Show that $u=-2\ln(x_1x_2\cdots x_r)$ is $\chi^2(2r)$. (Hint: See Exercise 4.24.)

Exercise 5.7: From Fig. 5.2, write down values of $F(\chi^2;\nu)$ taking $\chi^2$ equal to the mode of the chi-square distribution for $\nu=3,4,5,10,20,30$ degrees of freedom. What is the limiting value when $\nu \to \infty$?
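The result of Sect.5.1.6 lends itself to a quick simulation check. The sketch below is an illustration only: it draws repeated samples of size $n$ from $N(\mu,\sigma^2)$, forms $(n-1)s^2/\sigma^2$ for each, and compares the average and the spread of these values with the mean $n-1$ and variance $2(n-1)$ expected for $\chi^2(n-1)$. The sample size, the parameter values, the seed and the helper name are arbitrary choices made here.

```python
import random
import statistics


def scaled_sample_variance(n, mu, sigma, rng):
    """Return (n-1) s^2 / sigma^2 for one sample of size n from N(mu, sigma^2)."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(xs)  # unbiased sample variance, divisor n-1
    return (n - 1) * s2 / sigma ** 2


rng = random.Random(12345)        # fixed seed so the run is reproducible
n, mu, sigma = 6, 10.0, 2.0       # hypothetical values chosen for illustration
values = [scaled_sample_variance(n, mu, sigma, rng) for _ in range(20000)]

mean_w = statistics.mean(values)
var_w = statistics.variance(values)
print(f"mean {mean_w:.3f} (expected {n - 1}),  variance {var_w:.3f} (expected {2 * (n - 1)})")
```

Agreement of the first two moments does not prove the distributional statement, but a clear disagreement would signal an error in the construction.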
Exercise 5.8: From Appendix Table A8, verify that $P(\chi^2 \le 40.3;\nu=30) = 0.900$. Constructing two approximate normal variables by eqs.(5.11), (5.12) in the text, viz. $y_1=(u-\nu)/\sqrt{2\nu}$ and $y_2=\sqrt{2u}-\sqrt{2\nu-1}$, show that the corresponding probabilities obtained from Appendix Table A6 are 0.908 and 0.903, respectively. Repeating the problem for the same $\nu$ and a higher $\chi^2$, for instance $\chi^2=53.7$, show that the $y_2$ approximation becomes increasingly better than the $y_1$ approximation.

Exercise 5.9: (Independence of $\bar{x}$ and $s^2$ for normal sample)
Show that the mean $\bar{x} = \frac1n\sum_{i=1}^{n}x_i$ and the variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$ for a sample from $N(\mu,\sigma^2)$ are independent variables. (Hint: Make use of the identity

$$\sum_{i=1}^{n}\left(\frac{x_i-\mu}{\sigma}\right)^2 = \frac{(n-1)s^2}{\sigma^2} + \left(\frac{\sqrt{n}(\bar{x}-\mu)}{\sigma}\right)^2$$

to show that the joint characteristic function of the variables $(n-1)s^2/\sigma^2$ and $(\sqrt{n}(\bar{x}-\mu)/\sigma)^2$ factorizes into their individual characteristic functions. According to Sect.3.5.7 these variables are therefore independent, as will be the variables $s^2$ and $\bar{x}$.)

Exercise 5.10: (The chi-distribution)
(i) Show that the variable $\chi=\sqrt{u}$ has the probability density function

$$f(\chi;\nu) = \frac{1}{2^{\frac12\nu-1}\Gamma(\tfrac12\nu)}\,\chi^{\nu-1}e^{-\frac12\chi^2}, \qquad 0 \le \chi \le \infty.$$

(ii) Show that the algebraic moments for this p.d.f. become in general,

$$\mu'_k = 2^{\frac12 k}\,\Gamma(\tfrac12(\nu+k))/\Gamma(\tfrac12\nu),$$

and that in particular the even moments result as

$$\mu'_{2k} = \nu(\nu+2)\cdots(\nu+2(k-1)).$$

The odd moments can be expressed by the first, $\mu'_1$,

$$\mu'_{2k+1} = (\nu+1)(\nu+3)\cdots(\nu+2k-1)\,\mu'_1.$$

Use Stirling's expansion

$$\ln\Gamma(x+1) = \tfrac12\ln(2\pi) + (x+\tfrac12)\ln x - x + \frac{1}{12x} - \frac{1}{360x^3} + \ldots$$

to show that [...].
(iii) With the preceding results and the general relationship between algebraic and central moments, eq.(3.18), show that the lowest central moments for this distribution are [...]. Here the convergence of [...] towards zero is faster than [...], and similarly for [...].

Exercise 5.11: Use the results of Exercise 5.10 to show that the variable $y_2=\sqrt{2u}-\sqrt{2\nu-1}$ has a distribution with [...]. This shows that $y_2$ is distributed about the mean 0 to order $\nu^{-3/2}$ with a variance 1 to order $\nu^{-1}$. By comparison with eqs.(5.10) it is seen that $y_2$ tends towards normality much faster than $u$.

Exercise 5.12: (The non-central chi-square distribution)
(i) If $y_1,y_2,\ldots,y_\nu$ is a set of independent variables which are $N(\mu_i,1)$, the variable $u' \equiv \sum_{i=1}^{\nu}y_i^2$ is a non-central chi-square variable with $\nu$ degrees of freedom and non-central parameter $\lambda = \sum_{i=1}^{\nu}\mu_i^2$; for short, $u'$ is $\chi'^2(\nu,\lambda)$. Show that $u'$ has the characteristic function

$$\phi(t) = (1-2it)^{-\nu/2}\exp\left(\frac{it\lambda}{1-2it}\right).$$

Note that if all $\mu_i=0$, $\lambda=0$ and $\phi(t)$ reduces to the form of eq.(5.6).
(ii) Show that the mean and the variance for $u'$ are given by $(\nu+\lambda)$ and $(2\nu+4\lambda)$, respectively.
(iii) It can be shown (see for instance Kendall and Stuart, Chapter 24, Vol.2) that the p.d.f. for $u'$ is given by

$$f(u';\nu,\lambda) = \frac{e^{-\frac12(u'+\lambda)}}{2^{\frac12\nu}}\sum_{r=0}^{\infty}\frac{\lambda^r\,(u')^{\frac12\nu+r-1}}{2^{2r}\,r!\,\Gamma(\tfrac12\nu+r)}.$$

Verify that, when $\lambda=0$, $f(u';\nu,\lambda)$ reduces to the ordinary (central) chi-square distribution with $\nu$ degrees of freedom, eq.(5.5).
It can also be shown that the variable

$$u'\,\frac{\nu+\lambda}{\nu+2\lambda}$$

is approximately distributed like a central chi-square variable according to eq.(5.5), but with a parameter $\nu' = (\nu+\lambda)^2/(\nu+2\lambda)$ where $\nu'$ is, in general, fractional. This fact is frequently used to find approximate values of the integral of a non-central chi-square variable, by interpolating in the tables (curves) for the central chi-square distribution.


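5.2 THE STUDENT'S T-DISTRIBUTION

5.2.1 Definition

Let $x$ be a standard normal variable $N(0,1)$ and $u$ a chi-square variable with $\nu$ degrees of freedom, $\chi^2(\nu)$, and assume that $x$ and $u$ are independent. Define a variable $t$ by

$$t = \frac{x}{\sqrt{u/\nu}}, \qquad -\infty \le t \le \infty,\ \nu > 0. \qquad (5.14)$$

This variable then has a p.d.f. given by

$$f(t;\nu) = \frac{\Gamma(\tfrac12(\nu+1))}{\sqrt{\pi\nu}\,\Gamma(\tfrac12\nu)}\left(1+\frac{t^2}{\nu}\right)^{-\frac12(\nu+1)}, \qquad (5.15)$$

which is called the Student's t-distribution with $\nu$ degrees of freedom.

Remark 1. If $x$ is not an $N(0,1)$ variable, but more generally $N(\mu,1)$, and $u$ is $\chi^2(\nu)$ as above, with $x$ and $u$ independent, the variable $t' = x/\sqrt{u/\nu}$ has a non-central t-distribution. See Exercise 5.20.

Remark 2. To motivate a study of the Student's t-variable, recall the by now well-known properties of the mean $\bar{x}$ and variance $s^2$ of a sample $x_1,x_2,\ldots,x_n$ from $N(\mu,\sigma^2)$:

$\bar{x}$ is $N(\mu,\sigma^2/n)$, where $\bar{x} = \frac1n\sum_{i=1}^{n}x_i$,
$(n-1)s^2/\sigma^2$ is $\chi^2(n-1)$, where $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$.

Moreover, $\bar{x}$ and $s^2$ are independent variables (Exercise 5.9). Consequently the two independent variables

$$\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \quad\text{and}\quad \frac{(n-1)s^2}{\sigma^2},$$

being, respectively, $N(0,1)$ and $\chi^2(n-1)$, satisfy the requirements specified in the beginning of this section, with the trivial difference that the chi-square variable has $n-1$ degrees of freedom, instead of $\nu$. A variable constructed from these two variables as

$$t = \frac{(\bar{x}-\mu)/(\sigma/\sqrt{n})}{\sqrt{\dfrac{(n-1)s^2}{\sigma^2}\Big/(n-1)}} = \frac{\bar{x}-\mu}{s/\sqrt{n}}$$

is therefore a Student's t-variable with $n-1$ degrees of freedom.

5.2.2 Proof for the Student's t p.d.f.

To prove that the variable $t$ as defined by eq.(5.14) has the p.d.f. of eq.(5.15) one may proceed as follows:
The joint p.d.f. of $x$ and $u$, because of their independence, is given by the product of the individual p.d.f.'s,

$$f(x,u;\nu) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac12 x^2}\cdot\frac{1}{2^{\frac12\nu}\Gamma(\tfrac12\nu)}\,u^{\frac12\nu-1}e^{-\frac12 u}.$$

Transforming to a new set of variables,

$$t = \frac{x}{\sqrt{u/\nu}}, \qquad v = u,$$

where $-\infty \le t \le \infty$, $0 \le v \le \infty$, the Jacobian of the transformation is

$$J = \frac{\partial(x,u)}{\partial(t,v)} = \sqrt{\frac{v}{\nu}}.$$

In terms of the new variables the joint p.d.f. can be written as

$$f(t,v;\nu) = \frac{1}{\sqrt{2\pi\nu}}\,\frac{1}{2^{\frac12\nu}\Gamma(\tfrac12\nu)}\,v^{\frac12(\nu-1)}\exp\left[-\tfrac12 v\left(1+\frac{t^2}{\nu}\right)\right].$$

Since we are only interested in the variable $t$ we proceed to find the marginal distribution in this variable by integrating over $v$,

$$f(t;\nu) = \int_0^\infty f(t,v;\nu)\,dv, \qquad (5.16)$$

which is seen to lead to eq.(5.15).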
$$F(t_\alpha;\nu) = \int_{-\infty}^{t_\alpha} f(t;\nu)\,dt = 1-\alpha, \qquad (5.19)$$

where $f(t;\nu)$ is given by eq.(5.15). Appendix Table A7 gives values of $t_\alpha$ corresponding to specified entries for $F(t_\alpha;\nu)$ and for different degrees of freedom.
We note that this includes the special case $\nu=\infty$, which we have seen is identical to the standard normal distribution. The reader, who should already be acquainted with the probability contents of $N(0,1)$, will easily establish the connection between Appendix Tables A6 and A7, and thereby be able to make use of the latter; see Exercises 5.17 and 5.18.

Exercise 5.13: Evaluate the integral of eq.(5.16) and thereby complete the proof for the p.d.f. of the Student's t, eq.(5.15).
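Probability contents of the kind asked for in Exercise 5.17 can also be obtained without tables by integrating the p.d.f. of eq.(5.15) numerically. The following is a minimal sketch under that assumption; the function names and the choice of composite Simpson integration with a fixed number of steps are illustrative and are not taken from the text.

```python
import math


def student_t_pdf(t, nu):
    """Student's t p.d.f. of eq.(5.15), evaluated via logarithms for stability."""
    log_norm = (math.lgamma(0.5 * (nu + 1)) - math.lgamma(0.5 * nu)
                - 0.5 * math.log(math.pi * nu))
    return math.exp(log_norm - 0.5 * (nu + 1) * math.log1p(t * t / nu))


def central_probability(b, nu, steps=2000):
    """P(-b <= t <= b) by composite Simpson integration (steps must be even)."""
    h = 2.0 * b / steps
    total = student_t_pdf(-b, nu) + student_t_pdf(b, nu)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * student_t_pdf(-b + i * h, nu)
    return total * h / 3.0


for nu in (1, 5, 10, 30, 60):
    print(nu, round(central_probability(2.0, nu), 4))
```

For $\nu=1$ the result can be compared with the closed form $(2/\pi)\arctan b$ of the Cauchy distribution, and for very large $\nu$ with the normal probability content 0.954 for $b=2.00$ mentioned in Exercise 5.17.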
Exercise 5.16: Show that the moments $\mu'_k$ for the t-distribution exist if $k<\nu$, and that the even moments are given by

$$E(t^{2r}) = E\left[\left(\frac{x}{\sqrt{u/\nu}}\right)^{2r}\right] = \nu^r\,E(x^{2r})\cdot E(u^{-r}), \qquad 2r<\nu,$$

because of the independence of the variables $x$ and $u$; $E(x^{2r})$ and $E(u^{-r})$ are the expectations for $N(0,1)$ and $\chi^2(\nu)$, respectively. Compare Sect.4.8.4 and Exercise 5.3.

Exercise 5.17: Show that, given $\gamma$, one can for the Student's t-distribution calculate the limits $\pm b$ in the relation

$$P(-b \le t \le b) = \int_{-b}^{b} f(t;\nu)\,dt = \gamma$$

by the use of Appendix Table A7. Write down values of $b$ taking $\gamma=0.95$ for $\nu=1,5,10,30,60,\infty$. Note that for $\nu=\infty$, the limits $b=\pm 2.00$ correspond to the probability content 0.954 of $N(0,1)$.

Exercise 5.18: Calculate $\sigma = \sqrt{\nu/(\nu-2)}$ (eq.(5.17)) and find values of

$$P(-\sigma \le t \le \sigma) = \int_{-\sigma}^{\sigma} f(t;\nu)\,dt$$

for $\nu=3,5,10,30,60,\infty$.

Exercise 5.19: Show that if $t^2$ is taken as a variable instead of $t$, the p.d.f. is

$$f(t^2;\nu) = \frac{\Gamma(\tfrac12(\nu+1))}{\sqrt{\pi\nu}\,\Gamma(\tfrac12\nu)}\,(t^2)^{-\frac12}\left(1+\frac{t^2}{\nu}\right)^{-\frac12(\nu+1)}.$$

This is the same form as a special case of the F-distribution (with $\nu_1=1$) to be discussed in the next paragraph.

Exercise 5.20: (The non-central t-distribution)
Let $x$ be a normal variable with unit variance but with mean different from zero, i.e. $x$ is $N(\delta,1)$. If the variable $u$ is $\chi^2(\nu)$ and $x$ and $u$ are independent, the variable $t' \equiv x/\sqrt{u/\nu}$ for $-\infty \le t' \le \infty$ has a p.d.f. given by ($\lambda=\delta^2$) [...], which is called the non-central t-distribution with $\nu$ degrees of freedom and non-central parameter $\delta$. Verify that this distribution specializes to the usual (central) t-distribution of eq.(5.15) when $\delta=0$.
When the non-central t-distribution is regarded as a function of $t'^2$ it is a special case of the non-central F-distribution (for $\nu_1=1$) which is introduced in Exercise 5.29.

5.3 THE F-DISTRIBUTION

5.3.1 Definition

Let $u_1$ and $u_2$ be two independent (central) chi-square variables with, respectively, $\nu_1$ and $\nu_2$ degrees of freedom, i.e. $u_1$ is $\chi^2(\nu_1)$, $u_2$ is $\chi^2(\nu_2)$. Define a variable $F$ by

$$F = \frac{u_1/\nu_1}{u_2/\nu_2}, \qquad 0 \le F \le \infty;\ \nu_1,\nu_2>0. \qquad (5.20)$$

This variable has the p.d.f.

$$f(F;\nu_1,\nu_2) = \frac{\Gamma(\tfrac12(\nu_1+\nu_2))}{\Gamma(\tfrac12\nu_1)\Gamma(\tfrac12\nu_2)}\left(\frac{\nu_1}{\nu_2}\right)^{\frac12\nu_1}\frac{F^{\frac12\nu_1-1}}{\left(1+\nu_1F/\nu_2\right)^{\frac12(\nu_1+\nu_2)}}, \qquad (5.21)$$

which is called the F-distribution with $(\nu_1,\nu_2)$ degrees of freedom.

Remark 1. If $u_1$ is not a central chi-square variable as was assumed above, but instead a non-central chi-square variable with $\nu_1$ degrees of freedom, then, with $u_2$ as $\chi^2(\nu_2)$ and $u_1$ and $u_2$ independent, the variable $F' = (u_1/\nu_1)/(u_2/\nu_2)$ has a non-central F-distribution. See Exercise 5.29.

Remark 2. In practice one encounters chi-square variables in the form of sample variances for normally distributed variables. For definiteness, let $x_1,x_2,\ldots,x_n$ and $y_1,y_2,\ldots,y_m$ be two independent samples from the same population $N(\mu,\sigma^2)$, for example two series of independent measurements. The sample variances are

$$s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2, \quad\text{where } \bar{x} = \frac1n\sum_{i=1}^{n}x_i,$$
$$s_2^2 = \frac{1}{m-1}\sum_{i=1}^{m}(y_i-\bar{y})^2, \quad\text{where } \bar{y} = \frac1m\sum_{i=1}^{m}y_i.$$

Then $(n-1)s_1^2/\sigma^2$ and $(m-1)s_2^2/\sigma^2$ are two independent chi-square variables with, respectively, $n-1$ and $m-1$ degrees of freedom, satisfying the requirements specified above for an F-variable; taking the ratio according to the definition of eq.(5.20) we find
$$F = \frac{\dfrac{(n-1)s_1^2}{\sigma^2}\Big/(n-1)}{\dfrac{(m-1)s_2^2}{\sigma^2}\Big/(m-1)} = \frac{s_1^2}{s_2^2}. \qquad (5.22)$$

This variable then has an F-distribution with $(n-1,m-1)$ degrees of freedom; for obvious reasons $F$ is here called the variance ratio.

5.3.2 Proof for the F p.d.f.

To prove that the variable $F$ defined by eq.(5.20) has the p.d.f. of eq.(5.21) one can follow the reasoning that was used to establish the Student's t-p.d.f. in Sect.5.2.2. From the joint p.d.f. of $u_1$ and $u_2$, a transformation is made to the new set of variables

$$F = \frac{u_1/\nu_1}{u_2/\nu_2}, \qquad v = u_2,$$

where $0 \le F \le \infty$, $0 \le v \le \infty$. Applying the change of variable technique and eliminating the auxiliary variable $v$ by integrating the joint p.d.f. over this variable, the reader should be able to verify eq.(5.21).

5.3.3 Properties of the F-distribution

Figure 5.4 shows a sketch of the F-distribution for a few combinations of the parameters $\nu_1,\nu_2$.
The following features characterize the F-distribution:

(i) $f(F;\nu_1,\nu_2)$ is monotonically decreasing if $\nu_1 \le 2$, while it has a maximum for $\nu_1 > 2$, the mode being

$$F_{\rm mode} = \frac{\nu_2(\nu_1-2)}{\nu_1(\nu_2+2)}.$$

(ii) The distribution is skew and has a mean value and variance given by

$$E(F) = \frac{\nu_2}{\nu_2-2},\ \ \nu_2>2; \qquad V(F) = \frac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)},\ \ \nu_2>4. \qquad (5.24)$$

Note that whereas the mode, when it exists, is always $<1$, the mean value is $>1$.

(iii) For $\nu_1=1$ the F-distribution specializes to

$$f(F;1,\nu_2) = \frac{\Gamma(\tfrac12(\nu_2+1))}{\sqrt{\pi\nu_2}\,\Gamma(\tfrac12\nu_2)}\,F^{-\frac12}\left(1+\frac{F}{\nu_2}\right)^{-\frac12(\nu_2+1)},$$

which is nothing but a Student's t-distribution with $\nu_2$ degrees of freedom, when the latter is regarded as a function of $t^2=F$; compare Exercise 5.19.

(iv) The limiting properties are such that for $\nu_1$ fixed, $\nu_2 \to \infty$,

$$f(F;\nu_1,\nu_2) \to \frac{(\tfrac12\nu_1)^{\frac12\nu_1}}{\Gamma(\tfrac12\nu_1)}\,F^{\frac12\nu_1-1}e^{-\frac12\nu_1F},$$

that is, the variable $(\nu_1F)$ approaches $\chi^2(\nu_1)$; see Exercise 5.25. For $\nu_1 \to \infty$, $\nu_2 \to \infty$ the F-distribution tends to normal. The approach to normality is, however, rather slow.

(v) The quantity $z \equiv \tfrac12\ln F$ has a distribution which is close to normal, with approximate mean $\tfrac12\left[\frac{1}{\nu_2}-\frac{1}{\nu_1}\right]$ and variance $\tfrac12\left(\frac{1}{\nu_1}+\frac{1}{\nu_2}\right)$, see Exercise 5.28.

Fig. 5.4. The F-distribution for selected degrees of freedom $(\nu_1,\nu_2)$.
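Features (i) and (ii) of Sect.5.3.3 can be illustrated with a small simulation. The sketch below is not from the text: it evaluates the standard expressions for the mode, mean and variance of the F-distribution and compares the latter two with a sample of F-variables built according to the definition of eq.(5.20). The chi-square variables are generated as gamma variables with shape $\tfrac12\nu$ and scale 2, which is an implementation choice made here; the function names, seed and degrees of freedom are likewise arbitrary.

```python
import random
import statistics


def f_mode(nu1, nu2):
    """Mode of the F-distribution; it exists only for nu1 > 2."""
    if nu1 <= 2:
        raise ValueError("no mode for nu1 <= 2 (p.d.f. decreases monotonically)")
    return nu2 * (nu1 - 2) / (nu1 * (nu2 + 2))


def f_mean(nu1, nu2):
    """Mean value of the F-distribution; requires nu2 > 2."""
    if nu2 <= 2:
        raise ValueError("mean requires nu2 > 2")
    return nu2 / (nu2 - 2)


def f_variance(nu1, nu2):
    """Variance of the F-distribution; requires nu2 > 4."""
    if nu2 <= 4:
        raise ValueError("variance requires nu2 > 4")
    return 2 * nu2 ** 2 * (nu1 + nu2 - 2) / (nu1 * (nu2 - 2) ** 2 * (nu2 - 4))


def sample_f(nu1, nu2, rng):
    """One F-variable from eq.(5.20), with chi-square(nu) drawn as gamma(nu/2, scale 2)."""
    u1 = rng.gammavariate(0.5 * nu1, 2.0)
    u2 = rng.gammavariate(0.5 * nu2, 2.0)
    return (u1 / nu1) / (u2 / nu2)


rng = random.Random(2718)     # fixed seed for reproducibility
nu1, nu2 = 5, 20              # hypothetical degrees of freedom
sample = [sample_f(nu1, nu2, rng) for _ in range(50000)]
sample_mean = statistics.mean(sample)
sample_var = statistics.variance(sample)

print(f"mode {f_mode(nu1, nu2):.4f} < 1 < mean {f_mean(nu1, nu2):.4f}")
print(f"simulated mean {sample_mean:.4f}, simulated variance {sample_var:.4f}, "
      f"expected variance {f_variance(nu1, nu2):.4f}")
```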
5.3.4 Probability contents of the F-distribution

The cumulative F-distribution is defined by

$$F(x_\alpha;\nu_1,\nu_2) = \int_0^{x_\alpha} f(F;\nu_1,\nu_2)\,dF = 1-\alpha,$$

where the integrand is the p.d.f. of eq.(5.21). Appendix Table A9 gives values of $x_\alpha$ for specified entries of $F(x_\alpha;\nu_1,\nu_2)$ and different degrees of freedom $(\nu_1,\nu_2)$.
In practice, the probability contents of the F-distribution are used in connection with hypothesis testing of variances for normal samples; for an example, see Sect.14.3.3.

Exercise 5.21: If the variable $F$ has an F-distribution with $(\nu_1,\nu_2)$ degrees of freedom, what is the distribution for the variable $1/F$?

Exercise 5.22: In Sect.5.3.2, carry out the details of the proof for the p.d.f. of the $F$ variable.

Exercise 5.23: Verify the formulae for $E(F)$ and $V(F)$ given in the text, eqs.(5.24).

Exercise 5.24: Show that the moment $\mu'_k$ for the F-distribution exists only if $k<\tfrac12\nu_2$, and is then given by

$$\mu'_k = E(F^k) = E\left[\left(\frac{u_1/\nu_1}{u_2/\nu_2}\right)^k\right] = \left(\frac{\nu_2}{\nu_1}\right)^k E(u_1^k)\,E(u_2^{-k}) = \left(\frac{\nu_2}{\nu_1}\right)^k\frac{\Gamma(\tfrac12\nu_1+k)\,\Gamma(\tfrac12\nu_2-k)}{\Gamma(\tfrac12\nu_1)\,\Gamma(\tfrac12\nu_2)},$$

since the expectations $E(u_1^k)$ and $E(u_2^{-k})$ are evaluated for $\chi^2(\nu_1)$ and $\chi^2(\nu_2)$, respectively. Compare Exercise 5.3.

Exercise 5.25: Verify the special forms for the F-distribution stated under (iii) and (iv) in Sect.5.3.3.

Exercise 5.26: Calculate $\sigma$ (eq.(5.24)) for the F-distribution with fixed $\nu_1=5$ and $\nu_2=10, 20, 60$, respectively. From Appendix Table A9, find by interpolation $F(\sigma;\nu_1,\nu_2)$ for the three combinations of $(\nu_1,\nu_2)$. What is the limiting value when $\nu_2 \to \infty$?

Exercise 5.27: When $\nu_1 \to \infty$, $\nu_2 \to \infty$, what are the limiting values for the mean and the variance of the F-distribution? Compare the corresponding entries of Appendix Table A9.

Exercise 5.28: (The z-distribution)
(i) Put $z \equiv \tfrac12\ln F$. Show that the variable $z$ has the p.d.f.

$$f(z;\nu_1,\nu_2) = \frac{2\,\Gamma(\tfrac12(\nu_1+\nu_2))}{\Gamma(\tfrac12\nu_1)\Gamma(\tfrac12\nu_2)}\left(\frac{\nu_1}{\nu_2}\right)^{\frac12\nu_1}\frac{e^{\nu_1z}}{\left(1+\nu_1e^{2z}/\nu_2\right)^{\frac12(\nu_1+\nu_2)}}, \qquad -\infty \le z \le \infty.$$

(ii) Show that the characteristic function for $z$ is such that [...], and thereby verify the statement made under (v) in Sect.5.3.3, that $z$ is approximately normally distributed with mean value $\tfrac12\left[\frac{1}{\nu_2}-\frac{1}{\nu_1}\right]$ and variance $\tfrac12\left(\frac{1}{\nu_1}+\frac{1}{\nu_2}\right)$.

Exercise 5.29: (The non-central F-distribution)
(i) Let $u'_1$ be a non-central chi-square variable with $\nu_1$ degrees of freedom and non-central parameter $\lambda$ as defined in Exercise 5.12, and let $u_2$ be a (central) chi-square variable with $\nu_2$ degrees of freedom. If $u'_1$ and $u_2$ are independent the variable $F' = (u'_1/\nu_1)/(u_2/\nu_2)$, where $0 \le F' \le \infty$, has a p.d.f. given by

$$f(F';\nu_1,\nu_2,\lambda) = e^{-\frac12\lambda}\sum_{r=0}^{\infty}\frac{(\tfrac12\lambda)^r}{r!}\,\frac{\Gamma(\tfrac12(\nu_1+\nu_2)+r)}{\Gamma(\tfrac12\nu_1+r)\,\Gamma(\tfrac12\nu_2)}\left(\frac{\nu_1}{\nu_2}\right)^{\frac12\nu_1+r}\frac{F'^{\,\frac12\nu_1+r-1}}{\left(1+\nu_1F'/\nu_2\right)^{\frac12(\nu_1+\nu_2)+r}},$$

which is called the non-central F-distribution with $(\nu_1,\nu_2)$ degrees of freedom and non-central parameter $\lambda$. Show that, for $\lambda=0$, $f(F';\nu_1,\nu_2,\lambda)$ reduces to the ordinary (central) F-distribution of eq.(5.21).
(ii) In view of a statement in Exercise 5.12 the variable $u'_1(\nu_1+\lambda)/(\nu_1+2\lambda)$ will have an approximate central chi-square distribution with parameter $\nu'_1 = (\nu_1+\lambda)^2/(\nu_1+2\lambda)$. Hence $\left[u'_1(\nu_1+\lambda)/(\nu_1+2\lambda)\right]/\nu'_1 = u'_1/(\nu_1+\lambda)$ is an approximate central chi-square variable divided by its number of degrees of freedom. Show that $F'$ can be written

$$F' = \frac{\nu_1+\lambda}{\nu_1}\,F,$$

where $F$ approximately has a central F-distribution with parameters $(\nu'_1,\nu_2)$, $\nu'_1$ being in general fractional.

5.4 LIMITING PROPERTIES - CONNECTION BETWEEN PROBABILITY DISTRIBUTIONS

The connection between the sampling distributions of the present chapter and some of the probability distributions discussed in the previous chapter is illustrated by Fig. 5.5.
Note the central position of the normal distribution as a limiting case of the three sampling distributions (chi-square, F, Student's t) as well as of the three discrete distributions (multinomial, binomial, Poisson).
Fig. 5.5. Relations between probability distributions.

Exercise 5.30: Find appropriate positions in Fig. 5.5 of the exponential, the gamma, and the Cauchy distributions.

6. Comparison of experimental data with theory

In the preceding chapters we have investigated features of probability distributions which are frequently used in physics. Experimental findings can, however, not always be directly compared to the ideal mathematical distributions. Quite often a theoretical model will have to be modified in some way before one can make a meaningful comparison between prediction and observation. The reason for this can be that the theoretical p.d.f. will only describe an experiment performed under certain ideal conditions which are not fulfilled in practice; for instance, the lifetime distribution law $f(t;\lambda) = \lambda e^{-\lambda t}$ for $0 \le t \le \infty$ assumes that a particle detector of infinite dimensions is available. It may also be necessary to correct for experimental uncertainties and various types of systematic effects.
The purpose of the present chapter, in spite of its rather ambitious headline, is only to outline some of the preparatory work that may be necessary before the experimental data can be used to elicit information on unknown parameters, or agreement checked with other experiments or theoretical models. The parameter estimation problem is discussed quite extensively in Chapters 8-11, whereas a treatment of various tests of goodness-of-fit is postponed to Chapter 14.

6.1 REJECTION OF BAD MEASUREMENTS

It often happens during the accumulation of data that one of the measurements differs substantially from the others. In such a situation one may perhaps suspect that some mistake has occurred, for instance that an erroneous figure has been recorded by the observer, and there should then be no objection to simply discarding this single observation.
The situation is frequently not so clear, as when there are several observations which seem to deviate considerably from the majority. It is often a matter of taste to judge which measurements should be kept and which should be
This integral can be evaluated to give the following form,

$$f'(x') = e^{\frac12(\lambda R)^2}\cdot G\left(\frac{x'}{R}-\lambda R\right)\cdot\lambda e^{-\lambda x'}, \qquad -\infty \le x' \le \infty, \qquad (6.6)$$

where $G$ is the cumulative standard normal distribution introduced in Sect.4.8.2 and tabulated in Appendix Table A6.
In practice an ideal behaviour of the form of eq.(6.5) is expected, for example, for particle lifetimes, transverse momenta and four-momentum transfers. The assumption of a normal-shaped resolution function appears reasonable for many experimental set-ups.
Fig. 6.1 illustrates how the original p.d.f. of eq.(6.5) is modified by eq.(6.4) into different observable p.d.f.'s of eq.(6.6) for different numerical values for the constants $\lambda$ and $R$.
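The closed form of eq.(6.6) can be cross-checked against a direct numerical folding. The sketch below assumes, as the surrounding text indicates, that the ideal p.d.f. is the exponential $\lambda e^{-\lambda x}$ for $x \ge 0$ and that the resolution function is a Gaussian of width $R$ centred on the true value; the function names, the Simpson integration and the cut-off of the integration range at ten resolution widths beyond the observed value are choices made for this illustration.

```python
import math


def normal_cdf(z):
    """Cumulative standard normal distribution G(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smeared_exponential(x_obs, lam, R):
    """Closed form of eq.(6.6): exponential p.d.f. folded with a Gaussian of width R."""
    return (math.exp(0.5 * (lam * R) ** 2)
            * normal_cdf(x_obs / R - lam * R)
            * lam * math.exp(-lam * x_obs))


def smeared_numeric(x_obs, lam, R, steps=20000):
    """Direct folding: integrate lam*exp(-lam*x) * Gaussian(x_obs - x; R) over x >= 0."""
    upper = max(x_obs, 0.0) + 10.0 * R   # Gaussian factor is negligible beyond this point

    def integrand(x):
        gauss = math.exp(-0.5 * ((x_obs - x) / R) ** 2) / (math.sqrt(2.0 * math.pi) * R)
        return lam * math.exp(-lam * x) * gauss

    h = upper / steps                    # composite Simpson rule, steps is even
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


lam, R = 1.0, 0.5                        # hypothetical damping constant and resolution width
for x_obs in (-0.3, 0.0, 1.0, 3.0):
    print(x_obs, smeared_exponential(x_obs, lam, R), smeared_numeric(x_obs, lam, R))
```

Note that the observable p.d.f. is non-zero also for negative $x'$, since the resolution can shift an observation below the physical boundary at zero.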

Fig. 6.1. Logarithmic display showing the effect of Gaussian resolution functions on exponential p.d.f.'s for different values of the damping constant $\lambda$ and the resolution width $R$. The curves for $R = 0$ (straight lines) correspond to the unmodified, original p.d.f.'s of eq.(6.5).

6.2.2 Example: Gaussian resolution function and Gaussian p.d.f.

Let the original p.d.f. for the variable $x$ be normal with mean value $x_0$ and standard deviation $\Gamma$,

$$f(x;x_0,\Gamma) = \frac{1}{\sqrt{2\pi}\,\Gamma}\,e^{-\frac12(x-x_0)^2/\Gamma^2}.$$

If the resolution function is also normal of width $R$ over the entire spectrum,

$$r(x';x) = \frac{1}{\sqrt{2\pi}\,R}\,e^{-\frac12(x'-x)^2/R^2},$$

the integration of eq.(6.1) yields for the resolution transform

$$f'(x') = \frac{1}{\sqrt{2\pi}\,\sqrt{\Gamma^2+R^2}}\,e^{-\frac12(x'-x_0)^2/(\Gamma^2+R^2)}.$$

Hence, the p.d.f. for the observable $x'$ will also be normal, with the mean value of the true distribution, but with a variance equal to the sum of the variances of the original distribution and the resolution function.
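The statement that the variances add for a Gaussian p.d.f. observed with a Gaussian resolution can be illustrated by a small simulation. This is a sketch only: true values are drawn from $N(x_0,\Gamma^2)$, each is smeared by an independent normal measurement error of width $R$, and the variance of the smeared values is compared with $\Gamma^2+R^2$. The parameter values, the seed and the variable names are arbitrary choices made here.

```python
import random
import statistics

rng = random.Random(2024)               # fixed seed for reproducibility
x0, gamma_true, R = 1.5, 0.3, 0.4       # hypothetical mean, true width, resolution width

# True values x, and observed values x' = x + measurement error
true_values = [rng.gauss(x0, gamma_true) for _ in range(50000)]
observed = [x + rng.gauss(0.0, R) for x in true_values]

mean_obs = statistics.mean(observed)
var_obs = statistics.variance(observed)
expected_var = gamma_true ** 2 + R ** 2

print(f"observed mean {mean_obs:.4f} (true mean {x0})")
print(f"observed variance {var_obs:.4f} (expected {expected_var:.4f})")
```

The same simulated sample could be used to try out the inverse step, in which the resolution width is subtracted in quadrature from the observed width.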
6.2.3 Example: Breit-Wigner resolution function and Breit-Wigner p.d.f.

Suppose that the ideal parameterization is given by a Cauchy, or Breit-Wigner, formula

$$f(x;x_0,\Gamma) = \frac{\Gamma}{\pi}\,\frac{1}{(x-x_0)^2+\Gamma^2}, \qquad -\infty \le x \le \infty. \qquad (6.9)$$

For an analytical evaluation of the observable distribution the most convenient form of the resolution function is now another function of the same type, say

$$r(x';x) = \frac{R}{\pi}\,\frac{1}{(x'-x)^2+R^2}. \qquad (6.10)$$

Then eq.(6.1) leads to the resolution transform

$$f'(x') = \frac{\Gamma+R}{\pi}\,\frac{1}{(x'-x_0)^2+(\Gamma+R)^2}, \qquad -\infty \le x' \le \infty. \qquad (6.11)$$

In other words, the observable distribution will also be a Breit-Wigner curve, with width equal to the sum of the resonance width and the resolution width.

Exercise 6.1: With a Gaussian resolution function and a Breit-Wigner p.d.f., how would you find the observable p.d.f.?

6.2.4 Example: Width of a resonance

The two preceding examples suggest that for studies of peaks in effective mass spectra it may be convenient to employ the same functional form (Gaussian or Breit-Wigner) for the resolution function and for the p.d.f. describing the physical effect. When this is possible the width of the resulting observable peak $\Gamma_{\rm obs}$ is simply related to the true resonance width $\Gamma$ and the resolution width $R$. Thus, if both the true resonance peak and the resolution function have normal shapes, the width of the resonance is given by

$$\Gamma = \sqrt{\Gamma_{\rm obs}^2 - R^2}, \quad\text{(two Gaussian shapes)}. \qquad (6.12)$$

For the Breit-Wigner case, the relationship between the widths is linear,

$$\Gamma = \Gamma_{\rm obs} - R, \quad\text{(two Breit-Wigner shapes)}. \qquad (6.13)$$

In practice, therefore, the two parameterizations may lead to different estimates of the true resonance width. The difference between the results will depend on the relative magnitude of the measured width and resolution width, being small when this ratio is large (high resolution). It is recommended to apply both sets of parameterizations; in the case of a poor agreement it may be necessary to study more closely the validity of the assumptions. In any case, rather than using a rough approximation, a superior approach would be to use the experimentally observed shape of the resolution function and perform a numerical integration of eq.(6.1).

Fig. 6.2. Five measurements of different accuracy are represented by normal p.d.f.'s $N(0,\Delta x_i^2)$ (dashed curves) and correspond to the resolution function $r(x';x)$ shown by the full-drawn curve.

6.2.5 Experimental determination of the resolution function; ideogram

It is often difficult to give a good analytical approximation for the resolution function. This is sometimes the case when the uncertainty in the variable varies from one measurement to another. One may, for instance, think of different bubble chamber events contributing to the same bin of a histogram, where the errors on the individual events can be largely different. In such cases an average resolution function can be determined experimentally by plotting the data in an ideogram in the following manner.
Let the measurement $x_i$ have an experimental error $\Delta x_i$. One will then assign a normal probability distribution to this measurement, with a standard deviation corresponding to the experimental error. When this is done for all measurements in a certain region of the histogram, and the (normalized) Gaussians are subsequently centered about some common, arbitrarily chosen value, the added contributions give the shape of the experimental resolution function; see illustration by Fig. 6.2.

6.3 SYSTEMATIC EFFECTS. DETECTION EFFICIENCY

In many experiments the detectors used to register the signals do not have the same sensitivity for all types of reactions. The detection efficiency may, for example, depend on the location of the interaction point in the detector, on the emission angle of the particles, on their momenta, etc. The experimental bias introduced by imperfect detection ability can for many experiments be very serious, because it leads to loss of information and less reliable conclusions. One therefore tries to design the experiment in such a way that the detection efficiency becomes as high as possible. Since, however, perfect detection can never be achieved in practice due to high costs, time-consumption

where $F$ denotes the cumulative distribution for the ideal p.d.f. Thus the truncation implies a renormalization of the ideal distribution over the observable region of the variable.
The truncation of a theoretical p.d.f. can formally be regarded as a special case of a more general handling of experimental bias which follows from incomplete detection ability. Detection inefficiency can in principle be handled according to two different approaches which differ in basic philosophy, consisting in, respectively,
(i) modifying the ideal p.d.f. (exact method)
(ii) weighting of the observed events (approximate method)
Essentially, the idea behind the first method is to apply the correction to the theoretical model, leaving the data as they were observed, whereas with the second method one keeps the theoretical model unchanged and adjusts the experimental data.
Let us discuss the treatment according to method (i) in more detail. We shall modify the ideal theoretical description to obtain an observable p.d.f. which in turn can be compared directly with the observations. Although exact, this method may be difficult, if not impossible, to carry out in practice.
e t r . . a l l kinds of p o s s i b l e l o s s and systematic e f f e c t s t h a t w i l l d i s t o r t t h e Suppose t h a t we make observations on some v a r i a b l e x i n o r d e r t o
d a t a must be checked and estimated. estimate t h e parameters of t h e i d e a l p.d.f. f(x;B) Because of an imperfect
d e t e c t i o n apparatus the d i s t r i b u t i o n t h a t can be observed is not
A p r o b a b i l i t y d e n s i t y function f ( x ; l ) which appears suggestive t o -
f(x;Q), b u t
some d i s t o r t e d d i s t r i b u t i o n f'(x;Q)
d e s c r i b e t h e phenomenon under study w i l l sometimes be mathematically defined - which is r e l a t e d t o t h e i d e a l p.d.f.
over "on-observable values of the physical variable*). This a i t u a t i a n can h e through t h e d e t e c t i o n e f f i c i e n c y . I n general, t h i s e f f i c i e n c y w i l l b e depen-
handled by truncation of t h e p.d.f. i n t h e f o l l w i n g way. Let us assume t h a t dent on t h e v a r i a b l e x i n which we are i n t e r e s t e d , as well as on one or mre
t h e observable p a r t of t h e s p e c t r m of x l i e s between some d e f i n i t e l i m i t s A and a d d i t i o n a l v a r i a b l e s , y say. The a d d i t i o n a l v a r i a b l e s may a l s o be dependent on
8. We then r e q u i r e t h e p.d.f. t o be zero o u t s i d e t h e s e l i m i t s and w r i t e our X, so t h a t t o cover t h e most general ease we w r i t e

new p.d.f. as
f'(x;B) - jf(x;B)~(x.~)~(~lxld~

f (x;?)D(x.y)P(yl xldydx
(6.15)

Here D(x.y) i s t h e d e t e c t i o n efficiency. and P(ylx) t h e conditional d i n t r i b u t i o n


of y, given x. Since t h e dependence on y i s assumed t o be of no i n t e r e s t t o
our problem t h i s v a r i a b l e i s i n t e g r a t e d over i n t h e numerator. The i n t e g r a t i o n
* I n t h e remainder of t h i s s e c t i o n we ass- t h a t , i f necessary, t h e experi- over y as well as r i n t h e denominator ensures t h a t t h e d i a t o r t e d p.d.f. ~(x;Q)
mental r e s o l u t i o n has already been folded i n t o f(x;!).
i s c o r r e c t l y normalized over the observable region f o r r. a p p l i c a t i o n of t h e procedures o u t l i n e d above. The t h r e e f i r s t examples deal
or acceptance, i a a property of t h e
The d e t e c t i o n e f f i c i e n c y D(a,y), with method (i), whereas t h e f o u r t h a p p l i e s t h e weighting method ( i i ) .
d e t e c t i n g system and can i n p r i n c i p l e be ass-d k n m or measurable. The
c o n d i t i o n a l p r o b a b i l i t y ~ ( y / x ) ,on t h e o t h e r hand, depends on some physical h y 6.3.1 Example: Truncation of an exponential d i s t r i b u t i o n
As a f i r s t example on t h e modification of an i d e a l p r o b a b i l i t y d i s t r i -
pothesis t h a t may, o r may not be kn-
known a p r i o r i P ( Y ~ X-t)
prior t o t h e experiuent. I f i t is not
be i n f e r r e d from t h e d a t a a t the expence of t h e pre-
c i s i o n i n t h e e s t i m a t i o n of t h e unkn-x.
bution by t r u n c a t i o n , l e t us take t h e d i s t r i b u t i o n law
A ) A - describ-
ing unstable p a r t i c l e s with decay canatant A f o r l i f e t i m e s s a t i s f y i n g 0 I t 5 -.
I n t h e general case, w i t h e x p l i c i t y-dependence f o r D(x,yl and ~ ( y l x ) I f t h e g e o w t r y of the apparatus a l l o u s t h e d e t e c t i o n of an event only i f
and with i n t e g r a t i o n l i m i t s f o r y depending on x, t h e e v a l u a t i o n of t h e dis- t . 5
m n
t 5 t
max we should, according t o eq.(6.14) take f o r t h e truncated p.d.f.
t o r t e d p.d.f. can t h e r e f o r e represent a formidable task. I f , h m e v e r , t h e de-
t e c t i o n e f f i c i e n c y should happen t o depend on x only, and t h e limits f o r t h e r
i n t e g r a t i o n are independent of x, t h e observable p.d.f. is e a s i e r t o f i n d , s i n c e
eq.(6.15) then reduces t o t .
ym"

which i s properly normalized over t h e region of d e t e c t a b l e f l i g h t - t i m e s and


hence describes t h e observable events.

I t is seen t h a t eq.(6.14) f o r s truncated p.d.f. represents a s p e c i a l case of 6.3.2 Exaorple: Truncation of a Breit-Wigner d i s t r i b u t i o n
the l a s t formula.
Method ( i i ) , which is only approximately c o r r e c t , b u t perhaps more the Breit-Wigner parameterization of a resonance f(M;Mo,r)
When t h e observations are r e s t r i c t e d t o a f i n i t e mass region M
-r
A second example where t r u n c a t i o n is always used i n p r a c t i c e is f o r
;((M-M~)~+~~)".
-' M < b$, the
f r e q v e n t l y used i n p r a c t i c e , a p p l i e s c o r r e c t i n g weights t o t h e individual ob-
A
served events, equal t o the r e c i p r o c a l of t h e d e t e c t i o n e f f i c i e n c y . The t r e a t - truncated p.d.f. according t o eq.(6.14) is
ment assumes a subsequent coaparison of the d i s t r i b ~ t i o uof t h e s e weighted
events w i t h the o r i g i n a l p.d.f.. Thus t h e philosophy i s now t o a d j u s t t h e data,
r a t h e r than the t h e o r e t i c a l model. I f one event is observed a t e p a r t i c u l a r
value x . of t h e variabie we say t h a t t h e c o r r e c t e d nunbrr of events is u i , t h e
weight w; being equal t o the inverse of t h e d e t e c t i o n p r o b a b i l i t y f o r t h i s
p a r t i c u l a r event, For t h i s p.d.f. t h e expectation of M i s

We w i l l see l a t e r t h a t the introduction of weighted events gives r i s e


t o s p e c i f i c problems which require a t t e n t i o n f o r the ManimumLikelihood
which w i l l be d i f f e r e n t from No, t h e c e n t r a l value of the resonance peak, unless
and t h e Least-Squarer methods f o r parameter estimation, Sects.9.11 and 10.6,
t h e i n t e r v a l [HA.%] is ~ y m m t r i earound No.
respectively.
I n t h e f o l l a r i n g s e c t i o n s we s h a l l now give some examples on the
Exercise 6 . 2 : With a symmetric mass i n t e r v a l [H -m. H +ml around t h e peak

v(M) -
value M , show t h a t t h e variance f o r t h e truneatgd B r e P t - ~ i g n e r p.d.f. i s
&n/r)/arctan(m/r)- 1 . Note t h a t lim V(M) = -; coopare Sect.4. 1 7 .
w
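The truncation of the exponential law in Sect.6.3.1 can be checked numerically. The following is a minimal sketch only: the function names and the numerical values chosen for the decay constant and the flight-time limits are assumptions made for illustration and do not come from the text. It evaluates the truncated p.d.f. and verifies with a simple midpoint rule that it integrates to one over the detectable region.

```python
import math


def truncated_exponential_pdf(t, lam, t_min, t_max):
    """Exponential p.d.f. renormalized over [t_min, t_max]; zero outside."""
    if t < t_min or t > t_max:
        return 0.0
    # F(t_max) - F(t_min) for the ideal law, with F(t) = 1 - exp(-lam * t)
    norm = math.exp(-lam * t_min) - math.exp(-lam * t_max)
    return lam * math.exp(-lam * t) / norm


def integrate(func, a, b, steps=20000):
    """Midpoint-rule estimate of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


# Illustrative (assumed) values: decay constant and detectable flight-time window
lam, t_min, t_max = 2.0, 0.1, 1.5
area = integrate(
    lambda t: truncated_exponential_pdf(t, lam, t_min, t_max), t_min, t_max
)
print(area)  # should be very close to 1
```

The same pattern applies to any ideal p.d.f. whose cumulative distribution is known in closed form; only the normalization term changes.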

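The weighting method (ii) of Sect.6.3 can be sketched in a few lines. The example below is hypothetical throughout: the observations, the per-event detection efficiencies and the helper names are invented for illustration, and in a real analysis the efficiency of each event would come from the detector geometry. Each observed event enters the histogram with the reciprocal of its detection efficiency instead of with unit weight.

```python
def event_weight(efficiency):
    """Weight of one observed event: reciprocal of its detection efficiency."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("detection efficiency must lie in (0, 1]")
    return 1.0 / efficiency


def weighted_histogram(values, efficiencies, edges):
    """Sum of event weights per bin; edges is an increasing list of bin edges."""
    counts = [0.0] * (len(edges) - 1)
    for x, eff in zip(values, efficiencies):
        for k in range(len(counts)):
            if edges[k] <= x < edges[k + 1]:
                counts[k] += event_weight(eff)
                break
    return counts


# Hypothetical observations and their (assumed known) detection efficiencies
xs = [0.2, 0.4, 1.1, 1.6, 1.7]
effs = [1.0, 0.5, 0.5, 0.25, 0.25]
hist = weighted_histogram(xs, effs, [0.0, 1.0, 2.0])
print(hist)  # [3.0, 10.0]
```

Five observed events are thus counted as 13 corrected events. Events with a small efficiency carry large weights, which is one reason why weighted samples need special care in the estimation methods referred to in the text.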
6.3.3 Example: Correcting for finite geometry - modifying the p.d.f.

Let us assume that we want to determine some parameters $\theta$ related to the spectrum of the momentum $p$ of neutral particles which are observed in a detector through their decays into charged products. For definiteness, let us think of $K^0 \to \pi^+\pi^-$ in a bubble chamber. With a limited detection volume the probability to detect and identify a produced $K^0$ will depend on the location of the production point as well as on the momentum and direction of the $K^0$. Fast particles are more likely to escape the bubble chamber, as will be those whose direction implies a short path length. One will also often expect a lower detection probability for $K^0$'s decaying very close to the production point since such events can be difficult to distinguish from other topologies. To correct for the loss of short-range $K^0$ we require the projected path length to be larger than a certain minimum value. If the $K^0$ proper flight-time follows a distribution law $e^{-t/\tau}$ where $\tau$ is the $K^0$ mean lifetime, the detection probability is given by

$$D(p,\lambda,\phi) = e^{-t_{min}/\tau} - e^{-t_{max}/\tau},$$

where $t_{min}$ is the minimal detectable proper flight-time corresponding to the chosen cut on the range, and where $t_{max}$ is the potential flight-time.

Both $t_{min}$ and $t_{max}$ are inversely proportional to the momentum $p$, the "interesting" variable. They will also involve the "nuisance variables" $\lambda$ and $\phi$ giving the direction of the line-of-flight. It is for us necessary to establish a relationship between $p$ and $\lambda,\phi$, which for a given $p$ expresses the distribution of the angles, $P(\lambda,\phi|p)$. Usually this relationship has to be inferred from the same data which we want to use for the estimation of the unknown parameters $\theta$. When the dependence $P(\lambda,\phi|p)$ has been established, in a functional form or by a numerical mapping, the observable distribution for the momentum can be found from eq.(6.15), which in this case becomes

$$f'(p;\theta) = N\,f(p;\theta)\iint D(p,\lambda,\phi)\,P(\lambda,\phi|p)\,d\lambda\,d\phi,$$

where $N$ is a normalization constant.

6.3.4 Example: Correcting for unobservable events - weighting of the events

Suppose some theory relates the physical constant $\theta$ to the production angle for a pion measured relative to the incoming antiproton direction in the reaction $\bar{p}p \to \pi^+\pi^+\pi^-\pi^-\pi^0$. When observations are made in a bubble chamber the events where one of the charged pions has been emitted along the direction of the magnetic field will often not be successfully analyzed because of an unmeasurable momentum. Such events will therefore largely be missing in the sample. If all events with a charged particle in an "unmeasurable" cone $d\Omega$ around the magnetic field direction are excluded from the sample, the total loss of events can be corrected for using the remaining sample, under the assumption that these events are free from experimental bias.

The reaction $\bar{p}p \to \pi^+\pi^+\pi^-\pi^-\pi^0$ is charge symmetric and has a rotational symmetry around the collision axis. For each observed event we therefore make a rotation of all charged pion directions around this axis and determine the total probability $P_i$ that at least one of the tracks will correspond to a laboratory angle within the cone $d\Omega$. The event is then assigned a weight

$$w_i = \frac{1}{1 - P_i}.$$

When all events from the purified sample are plotted with their individual weights the resulting corrected experimental distribution can be compared with the unmodified theoretical model.

6.4 SUPERIMPOSED PROBABILITY DENSITIES

An experimentally observable quantity often has a possible origin in several processes. The overall probability density can then be thought of as a sum of terms where each term gives the contribution from one process; we write

$$f(x;\alpha,\theta) = \sum_j \alpha_j\,f_j(x;\theta_j),$$

where $f_j(x;\theta_j)$ denotes the distribution expected from the j-th contribution alone and $\alpha_j$ the probability for this process. For incoherent processes,

$$\sum_j \alpha_j = 1$$

to keep the overall $f(x;\alpha,\theta)$ properly normalized.

In many situations one will be mainly interested in only one or a few of the contributions, and will consider the rest as some type of background effect. Sometimes this background is well understood. Unfortunately, however, the background sources are frequently more or less unknown, so that the mathematical description of this part of the total amplitude may be rather uncertain. The conclusions that can be drawn about the interesting physical effect will then be correspondingly uncertain. In experimental set-ups one will therefore in general try to keep the background rate, or noise level, as small as possible.

6.4.1 Example: Particle beam with background

A particle beam of mean momentum $p_0$ and standard deviation $\Delta p$ is of normal shape between the limits $p_A$ and $p_B$. From outside sources the beam is contaminated by a background of particles uniformly distributed in momentum. The total momentum p.d.f. can then be written

$$f(p;\alpha_1,\alpha_2,p_0,\Delta p) = \alpha_1\,e^{-\frac{1}{2}\left(\frac{p-p_0}{\Delta p}\right)^2} + \alpha_2, \qquad p_A \le p \le p_B.$$

Exercise 6.3: In the example above determine the relative magnitude of the beam and the background rates.

6.4.2 Example: Resonance peaks in an effective-mass spectrum

The resonances $\eta(549)$ and $\omega(783)$ are abundantly produced in the reaction $\pi^+ p \to \pi^+ p\,\pi^+\pi^-\pi^0$ at intermediate energies and are detected as enhancements in the neutral $\pi^+\pi^-\pi^0$ system. An effective-mass plot for this combination shows nice peaks at about 550 and 780 MeV over a smooth background and may therefore be used to estimate the masses and widths of the $\eta$ and $\omega$ mesons. In terms of the true three-pion effective-mass $M$ the overall p.d.f. must be composed of two Breit-Wigner parts $BW_\eta$ and $BW_\omega$ describing the resonances, and some function $B$ describing the background. In an obvious notation the ideal p.d.f. is

If the observations are restricted to a finite mass interval and the experimental resolution taken into account, the overall p.d.f. which should be compared with the observed distribution is
7. Statistical inference from normal samples

The previous chapters of this book have mainly dealt with probability theory. In particular, Chapters 3, 4 and 5 were devoted to supply the reader with some general background as well as detailed knowledge on specific probability distributions, which either describe physical phenomena directly, or turn out to be useful for the handling and interpretation of experimental data. In Chapter 6 we saw how it might be necessary to modify the ideal theoretical distributions to prepare the ground for a later comparison between theory and experiment.

We shall now make the transition from probability theory to the domain of statistics. We do this by discussing in this chapter some simple examples on statistical inference about the parameters in the normal probability distribution. The normal p.d.f. is assumed to give an adequate description of some population, or universe, for instance the outcomes of infinitely many measurements on the same physical quantity. The numerical values of the parameters in the p.d.f. may, however, not be completely known. From a restricted number of observations, assumed to be representative for the universe, it is possible to make inferences about the unknown quantities. We will build our examples around the notion of a confidence interval, which we shall meet again later in Chapters 9-11 when we come to the different methods which are generally applicable for estimating unknown parameters.

We begin this chapter by giving first in Sect.7.1 a set of formal definitions and some general remarks on the problem of making inferences about an underlying population from a given sample. From Sect.7.2 onwards we shall specifically assume that the sample originates from a normal parent population.

7.1 DEFINITIONS

Let $x_1, x_2, \ldots, x_n$ be a random sample from a population with a probability density function which depends on a parameter $\theta$ which is not known but whose numerical value we want to estimate from the sample. If $t = t(x_1,x_2,\ldots,x_n)$ is a function of the sample variables which does not depend on any unknown parameters, we call $t$ a statistic; we assume here that $t$ has some correspondence to $\theta$*). Let $f(t)$ be the probability density function for the variable $t$. We will assume that, according to a given prescription, it is possible to determine two values $t_a$ and $t_b$ such that the integral of $f(t)$ between the limits $t_a$ and $t_b$ is equal to some fixed number $\gamma$, $0 \le \gamma \le 1$. If, for the parameter $\theta$, the probability is such that

$$P(t_a \le \theta \le t_b) = \gamma, \qquad (7.1)$$

the closed interval $[t_a,t_b]$ is called a $100\gamma$ % confidence interval for $\theta$. The number $\gamma$ is called the confidence coefficient, and the numbers $t_a$ and $t_b$ the confidence limits. See Fig. 7.1.

*) We shall in Chapter 8 identify $t$ as an estimator for the parameter $\theta$, or, more generally, an estimator for some function of $\theta$.

Fig. 7.1. Illustration of the concepts of confidence coefficient and confidence limits.

If we are given one particular sample of size $n$, say $n$ measurements, the limits $t_a$ and $t_b$ are definite numbers which can be calculated from the measurements. From this particular sample, therefore, a certain interval $[t_a,t_b]$ is deduced, and this interval will either include the true value $\theta$ or it will not. A second sample, also of size $n$, will usually lead to a different interval, which may, or may not include the parameter $\theta$. The relevant statement for one particular sample is therefore, either

$$P(t_a \le \theta \le t_b) = 1, \quad \text{if } \theta \text{ is within } [t_a,t_b],$$

or

$$P(t_a \le \theta \le t_b) = 0, \quad \text{if } \theta \text{ is not within } [t_a,t_b].$$

The meaning of the probability statement eq.(7.1) is the following: There is a probability $\gamma$ that the random interval $[t_a,t_b]$ will cover the true value $\theta$. If a large number of samples of size $n$ are examined, that is, if the experiment is repeated many times under the same conditions, then $\theta$ will be included in the calculated intervals $[t_a,t_b]$ in $100\gamma$ % of these experiments. In other words, it is expected that, in the long run, the calculated limits $t_a$ and $t_b$ are such that the statement

$$t_a \le \theta \le t_b \qquad (7.2)$$

will be true in $100\gamma$ % of the cases. Thus the confidence coefficient reflects the reliability one may attach to the inequality statement (7.2).

Whenever it is desired to make a statement about an unknown parameter $\theta$ in terms of a confidence interval one is confronted with a dilemma. Choosing a wide interval corresponds to having a large probability that the unknown parameter indeed does belong to the interval, but a rather vague statement is then expressed about the parameter itself. On the other hand, giving a narrow interval would imply a more precise estimate of the parameter, but then our statement is less likely to be true. It appears that in this situation most people prefer the former alternative and make their assertions taking $\gamma$, the confidence coefficient, as a number in the neighbourhood of 1. Common choices are 0.90, 0.95, 0.99.

7.2 CONFIDENCE INTERVALS FOR THE MEAN

In many situations it is reasonable to assume that the result of a measurement of some quantity $\mu$ is a random variable $x$ that has a normal distribution about the mean value $\mu$. We will now assume that $\mu$ is an unknown quantity, about which we are supposed to say something on the basis of a series of $n$ independent measurements, made under the same conditions. In other words, we are to make an inference about $\mu$ from a random sample of size $n$ from the normal population $N(\mu,\sigma^2)$. Here the variance $\sigma^2$ gives a measure of the accuracy in the measurements, and we must treat separately the two conceivable cases, when $\sigma^2$ is a known number (Sect.7.2.1), and when $\sigma^2$ is unknown (Sect.7.2.2).

7.2.1 Case with $\sigma^2$ known

Our starting point is a random sample of size $n$ from a normal population $N(\mu,\sigma^2)$, for instance $n$ independent observations $x_1, x_2, \ldots, x_n$ with common error $\sigma$. When inferences are to be made about the unknown population mean $\mu$ it is suggestive to consider the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$; this statistic is distributed as $N(\mu,\sigma^2/n)$, and as we have seen repeatedly (see for instance Sect.4.8.6), the variable

$$y = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

has a distribution which is standard normal $N(0,1)$. We can therefore apply the considerations of Sect.4.8.3 concerning the probability contents of a normal distribution. For instance, a probability content equivalent to $\pm 2$ standard deviations is implied by writing

$$P\left(-2 \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le 2\right) = \int_{-2}^{2} g(y)\,dy = 0.954, \qquad (7.3)$$

where $g(y)$ is the standard normal p.d.f. of eq.(4.63). Eq.(7.3) is a probability statement, expressing that the probability is 0.954 that the random variable $y = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ will have some value between -2 and +2.
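The probability content quoted in eq.(7.3) can be checked in two ways: analytically through the error function, and by repeating the "experiment" many times on a computer. The sketch below is illustrative only; the function names, the population values $\mu = 5$, $\sigma = 3$, the sample size and the number of trials are assumptions chosen for the demonstration.

```python
import math
import random


def normal_content(k):
    """Probability that a standard normal variable lies within [-k, +k]."""
    return math.erf(k / math.sqrt(2.0))


def fraction_within(mu, sigma, n, k, trials, seed=1):
    """Fraction of simulated samples of size n for which
    y = (xbar - mu) / (sigma / sqrt(n)) falls within [-k, +k]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        y = (xbar - mu) / (sigma / math.sqrt(n))
        if -k <= y <= k:
            hits += 1
    return hits / trials


content = normal_content(2.0)
print(round(content, 3))  # 0.954

# Assumed population and sample size, for illustration only
frac = fraction_within(mu=5.0, sigma=3.0, n=4, k=2.0, trials=20000)
print(frac)  # fluctuates around the analytic value
```

The simulated fraction agrees with the analytic content only within its statistical fluctuation, which is the long-run interpretation of the confidence coefficient given in Sect.7.1.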
We can rewrite the probability statement in the form

$$P\left(\bar{x} - 2\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 2\frac{\sigma}{\sqrt{n}}\right) = 0.954. \qquad (7.4)$$

This relation apparently considers $\mu$ as a variable and $\bar{x} - 2\frac{\sigma}{\sqrt{n}}$, $\bar{x} + 2\frac{\sigma}{\sqrt{n}}$ as two numbers defining an interval. We realize, however, that since $\bar{x}$ is a random variable, the quantities $\bar{x} - 2\frac{\sigma}{\sqrt{n}}$ and $\bar{x} + 2\frac{\sigma}{\sqrt{n}}$ are also random variables; hence it is justified to call the interval $\left[\bar{x} - 2\frac{\sigma}{\sqrt{n}},\ \bar{x} + 2\frac{\sigma}{\sqrt{n}}\right]$ a random interval. We may read eq.(7.4) as a probability statement about $\mu$: Prior to the repeated, independent measurements there is a probability 0.954 that the random interval $\left[\bar{x} - 2\frac{\sigma}{\sqrt{n}},\ \bar{x} + 2\frac{\sigma}{\sqrt{n}}\right]$ will include the unknown, but fixed value $\mu$. Other probability statements can of course be written taking other intervals corresponding to other probabilities. The point is that all statements of this sort, which can be made before any measurements are actually performed, belong to probability theory. Generally we may write

$$P\left(a \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le b\right) = \int_a^b g(y)\,dy = \gamma, \qquad (7.5)$$

where the limits $a$ and $b$ for a given $\gamma$ can be found from Appendix Table A6.

As soon as the measured numbers are at hand and we are given a particular set of $n$ observations $x_1, x_2, \ldots, x_n$, we may pass to the domain of statistics, and make inferences about the unknown $\mu$ on the basis of the observations.

For definiteness, let us establish a 95.4% confidence interval for the mean $\mu$ in $N(\mu,\sigma^2)$, given that four independent measurements, with a known, common error $\sigma = 3$, have led to the numbers

2.2, 4.3, 1.7, 6.6.

The sample mean is

$$\bar{x} = \tfrac{1}{4}(2.2 + 4.3 + 1.7 + 6.6) = 3.7.$$

The confidence limits corresponding to a symmetric 95.4% confidence interval for $\mu$ are then given by

$$\bar{x} - 2\frac{\sigma}{\sqrt{n}} = 3.7 - 2\cdot\frac{3}{\sqrt{4}} = 0.7, \qquad \bar{x} + 2\frac{\sigma}{\sqrt{n}} = 3.7 + 2\cdot\frac{3}{\sqrt{4}} = 6.7.$$

Therefore, an inference to be made from the measurements is that the symmetric 95.4% confidence interval for the mean $\mu$ is the interval [0.7, 6.7].

It should be clear from the reasoning above that it is essential that $\sigma$ is a known number. If $\sigma$ were not known, the confidence limits $\left(\bar{x} - 2\frac{\sigma}{\sqrt{n}}\right)$ and $\left(\bar{x} + 2\frac{\sigma}{\sqrt{n}}\right)$ could not have been calculated from the measurements, and hence no inference could have been made about $\mu$ based on $N(0,1)$.

In practice the situation is often that the error on the measurements is not known exactly. However, the size of the sample may sometimes be sufficiently large to allow the approximation of $\sigma^2$ by the observed sample variance $s^2$, and the procedure above can be applied to find confidence intervals for $\mu$. If $\sigma^2$ is not known, and the sample size is small ($n \le 20$), the procedure of the subsequent section should be used.

Exercise 7.1: For the numerical example given in the text, what is the symmetric 90% confidence interval for $\mu$?

Exercise 7.2: Given 6 independent measurements 10.7, 9.7, 13.3, 10.2, 8.9, 11.6 of known error $\sigma = 2$. Assuming a normal sample, find symmetric confidence intervals for $\mu$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.95$, (c) $\gamma = 0.99$.

Exercise 7.3: Given that a normal distribution has variance $\sigma^2$, what is the sample size needed if the symmetric 95.4% confidence interval for $\mu$ shall have a length equal to (a) $\sigma$, (b) $\sigma/2$?

Exercise 7.4: Measurements on the momentum of monoenergetic beam tracks on bubble chamber pictures have led to the following sequence of numbers in units of GeV/c: 18.87, 19.55, 19.32, 18.70, 19.41, 19.37, 18.84, 19.40, 18.78, 18.76. We assume that this sample of size 10 originates from a normal distribution. If the measuring machine has a known accuracy corresponding to an uncertainty of 300 MeV/c in the momentum determination, find a 95% confidence interval for the beam momentum.

7.2.2 Case with $\sigma^2$ unknown

We turn to the problem of finding a confidence interval for the mean $\mu$ of a normal distribution when we are not so fortunate as to know the variance $\sigma^2$.

The tools needed to handle this situation have in fact already been provided by the remark of Sect.5.2.1. We have seen that if $x_1, x_2, \ldots, x_n$ is a random sample from $N(\mu,\sigma^2)$ two variables can be formed, which have well-known properties, namely

$$\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}, \ \text{which is } N(0,1), \qquad \frac{(n-1)s^2}{\sigma^2}, \ \text{which is } \chi^2(n-1),$$

and these variables are independent. Therefore the variable

$$t = \frac{(\bar{x} - \mu)/(\sigma/\sqrt{n})}{\sqrt{\dfrac{(n-1)s^2}{\sigma^2}\Big/(n-1)}} = \frac{\bar{x} - \mu}{s/\sqrt{n}} \qquad (7.6)$$

is a Student's t-variable with $(n-1)$ degrees of freedom. We note that from the construction of $t$, the unknown parameter $\sigma^2$ drops out, and we are left with a variable which has only $\mu$ as an unknown constituent. It is also worth comparing the previous case: With $\sigma^2$ known, the variable constructed was $\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$, which is distributed as $N(0,1)$; in the present case where $\sigma^2$ is assumed unknown, the variable needed is $\frac{\bar{x} - \mu}{s/\sqrt{n}}$, which has a Student's t-distribution with $(n-1)$ degrees of freedom.

For the variable constructed by eq.(7.6) we may write down probability statements analogous to eq.(7.5),

$$P\left(a \le \frac{\bar{x} - \mu}{s/\sqrt{n}} \le b\right) = \int_a^b f(t;n-1)\,dt = \gamma, \qquad (7.7)$$

where $f(t;n-1)$ is the Student's t probability density function for $(n-1)$ degrees of freedom, given by eq.(5.15). Since $f(t;n-1)$ has symmetry about $t = 0$ it is customary to choose intervals $[a,b]$ which are symmetric. Values for $b$ in the relation

$$P\left(-b \le \frac{\bar{x} - \mu}{s/\sqrt{n}} \le b\right) = \int_{-b}^{b} f(t;n-1)\,dt = \gamma \qquad (7.8)$$

can be deduced from Appendix Table A7 for different number of degrees of freedom and for the usually chosen values of the confidence coefficient $\gamma$ (compare Exercise 5.17).

Rewriting the argument in the left-hand side of eq.(7.8) gives a probability statement about the unknown $\mu$,

$$P\left(\bar{x} - b\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + b\frac{s}{\sqrt{n}}\right) = \gamma, \qquad (7.9)$$

which may be compared to eq.(7.4), valid in the previous case when $\sigma^2$ was known. For a specified value of $\gamma$ the corresponding value of $b$ will be dependent on the number of degrees of freedom. The size of the random interval $\left[\bar{x} - b\frac{s}{\sqrt{n}},\ \bar{x} + b\frac{s}{\sqrt{n}}\right]$ for a given $\gamma$ is large for very small values of $(n-1)$, but approaches the size of the corresponding intervals in $N(0,1)$ when the number of degrees of freedom becomes large. This is so because the Student's t-distribution has $N(0,1)$ as a limiting distribution when $n \to \infty$ (compare Sect.5.2.3).

For illustration, let us return to the numerical example of the previous section, with the measurements 2.2, 4.3, 1.7, 6.6 from $N(\mu,\sigma^2)$, where now $\mu$ as well as $\sigma^2$ are unknown. We calculate

$$\bar{x} = 3.7, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = 5.01, \qquad \frac{s}{\sqrt{n}} = 1.12.$$

Searching a confidence interval which can be compared to the symmetric 95.4% (or 2 standard deviation) confidence interval derived for the case when $\sigma^2$ was known, we observe that Appendix Table A7 has entries corresponding to probability contents of 0.025 in the tails of the Student's t-distribution. For 3 degrees of freedom, we find $b = 3.182$, and the confidence limits are given by the numbers

$$\bar{x} - b\frac{s}{\sqrt{n}} = 3.7 - 3.182\cdot 1.12 = 0.14, \qquad \bar{x} + b\frac{s}{\sqrt{n}} = 3.7 + 3.182\cdot 1.12 = 7.26.$$

Thus the symmetric 95% confidence interval for $\mu$ obtained from the four measurements of unknown experimental precision is the interval [0.14, 7.26]. Notice that this interval is larger than the corresponding 95.4% confidence interval [0.7, 6.7] obtained in the previous example when $\sigma^2$ was assumed known.
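The two numerical examples for the mean can be reproduced with a few lines of code. This is a sketch under stated assumptions: the function names are invented for illustration, and the limits $k = 2$ (for $N(0,1)$) and $b = 3.182$ (Student's t, 3 degrees of freedom) are simply taken from the text rather than computed from the distributions.

```python
import math


def mean_interval_known_sigma(sample, sigma, k):
    """Interval xbar -/+ k*sigma/sqrt(n), valid when sigma is known."""
    n = len(sample)
    xbar = sum(sample) / n
    half_width = k * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def mean_interval_unknown_sigma(sample, b):
    """Interval xbar -/+ b*s/sqrt(n), with s^2 the sample variance
    (divisor n-1) and b the Student's t limit for n-1 degrees of freedom."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    half_width = b * math.sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


data = [2.2, 4.3, 1.7, 6.6]  # the four measurements used in the text

known = mean_interval_known_sigma(data, sigma=3.0, k=2.0)
unknown = mean_interval_unknown_sigma(data, b=3.182)

print([round(v, 2) for v in known])    # [0.7, 6.7]
print([round(v, 2) for v in unknown])  # [0.14, 7.26]
```

The second interval is wider although the sample standard deviation (about 2.24) is smaller than the known error 3 of the first case; the widening comes entirely from the larger limit $b$ required when only 3 degrees of freedom are available.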
E x e r c i s e 7.5: Foi t h e numerical example above, what is t h e symmetric 90% con- For a chosen value of y t h e r e is an i n f i n i t e rider of p o s s i b l e choices f o r the
f i d e n c e i n t e r v a l f o r u1 Compare t h i s r e s u l t w i t h t h a t of Exercise 7.1.
i n t e g r a t i o n l i m i t s a and b f o r the s h chi-square p.d.f.. It i s customary t o
Exercise 7.6: Six independent observations from a population $N(\mu,\sigma^2)$ are given by the numbers 10.7, 9.7, 13.3, 10.2, 8.9, 11.6. With $\sigma^2$ unknown, find symmetric confidence intervals for $\mu$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.95$, (c) $\gamma = 0.99$. (Compare Exercise 7.2.)

Exercise 7.7: With the observations of Exercise 7.4, what is the symmetric 95% confidence interval for the beam momentum if the accuracy of the measuring instrument is not known prior to the measurements?

7.3 CONFIDENCE INTERVALS FOR THE VARIANCE

As before we will assume that $x_1,x_2,\ldots,x_n$ is a random sample from $N(\mu,\sigma^2)$, but now we want to discuss how we can find confidence intervals for $\sigma^2$ and thereby make inferences about this parameter. Again we must treat separately two cases, first assuming $\mu$ to be known (Sect.7.3.1), and next assuming $\mu$ unknown (Sect.7.3.2).

7.3.1 Case with $\mu$ known

This may correspond to an experimental situation where repeated measurements are performed on a known quantity $\mu$ using a measuring device of unknown precision.

A statistic which has correspondence to the variance $\sigma^2$ is the sum $\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2$. As we have seen before (Sect.5.1.1) the variable

$$\sum_{i=1}^{n}(x_i-\mu)^2/\sigma^2$$

will be distributed as $\chi^2(n)$. For the chi-square distribution with $n$ degrees of freedom it is possible to find two numbers, $a$ and $b$, such that, for $0 \le \gamma \le 1$,

$$P\left(a \le \sum_{i=1}^{n}\frac{(x_i-\mu)^2}{\sigma^2} \le b\right) = \int_a^b f(u;n)\,du = \gamma. \tag{7.10}$$

Here $f(u;n)$ is the chi-square p.d.f. for $n$ degrees of freedom, given by eq.(5.5). The probability statement can be rewritten as

$$P\left(\frac{\sum_{i=1}^{n}(x_i-\mu)^2}{b} \le \sigma^2 \le \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{a}\right) = \gamma. \tag{7.11}$$

For a given $\gamma$ the numbers $a$ and $b$ are not unique; one may take the limits such that the two tails below $a$ and above $b$ will correspond to equal probabilities $\frac{1}{2}(1-\gamma)$. Calculations of $a$ and $b$ for given $\gamma$ and given number of degrees of freedom can then be done in the ordinary manner, using Appendix Table A8.

Let us again take an example. Suppose that it is requested to say something about the accuracy of a new measuring instrument, and for this purpose a calibrated length is measured several times. The outcomes from 10 independent measurements are the numbers

[...]

and the true number, $\mu$, is 1000. If we demand a 95% confidence interval for $\sigma^2$, Appendix Table A8 shows that for 10 degrees of freedom the integration limits in eq.(7.10) will correspond to equal probabilities (= 0.025) in the two tails of the chi-square p.d.f. provided that we take $a = 3.247$ and $b = 20.483$. The measurements give the squared deviations about the known mean $\mu$ as $\sum_{i=1}^{10}(x_i-\mu)^2 = [\ldots]$. Hence an inference from the measurements is that the 95% confidence interval for the variance $\sigma^2$ is given by

[...] (7.12)

It may be noted that in this case with an unsymmetric p.d.f. the interval constructed by letting the tails represent equal probabilities does not correspond to the shortest possible confidence interval for a given $\gamma$. The required computation to obtain a minimal confidence interval, given $\gamma$, is so tedious that it is rarely done in practice.

Exercise 7.8: For the example in the text determine confidence intervals for $\sigma^2$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.99$.

Exercise 7.9: Let 7.3, 6.6, 7.0, 5.1, 7.1, 8.5, 5.9, 6.5, 6.2 be 9 independent measurements from an assumed normal population $N(7,\sigma^2)$. On the basis of these observations, make inferences about the variance $\sigma^2$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.95$, (c) $\gamma = 0.99$.
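The equal-tail construction of Sect.7.3 can be sketched numerically. The following is a minimal illustration, not part of the original text: it uses only the Python standard library, obtains the chi-square limits by bisection instead of reading Appendix Table A8, and the function names (`chi2_cdf`, `chi2_quantile`, `variance_interval`) are hypothetical. With `mu` given it uses deviations about the known mean and $n$ degrees of freedom; with `mu=None` it uses deviations about the sample mean and $n-1$ degrees of freedom.

```python
import math

def chi2_cdf(x, k):
    """Chi-square c.d.f. for k degrees of freedom, from the power series
    of the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    a, z = 0.5 * k, 0.5 * x
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_quantile(p, k):
    """Value u with chi2_cdf(u, k) = p, found by bisection."""
    lo, hi = 0.0, float(k)
    while chi2_cdf(hi, k) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def variance_interval(xs, gamma, mu=None):
    """Equal-tail confidence interval (lower, upper) for sigma^2.
    mu given  -> deviations about mu, n degrees of freedom;
    mu = None -> deviations about the sample mean, n - 1 degrees of freedom."""
    n = len(xs)
    if mu is None:
        centre, dof = sum(xs) / n, n - 1
    else:
        centre, dof = mu, n
    ssq = sum((x - centre) ** 2 for x in xs)
    a = chi2_quantile(0.5 * (1.0 - gamma), dof)   # lower limit a
    b = chi2_quantile(0.5 * (1.0 + gamma), dof)   # upper limit b
    return ssq / b, ssq / a

# Data of Exercise 7.9 (mean assumed known and equal to 7).
data = [7.3, 6.6, 7.0, 5.1, 7.1, 8.5, 5.9, 6.5, 6.2]
lower, upper = variance_interval(data, 0.95, mu=7.0)
```

Calling `chi2_quantile(0.025, 10)` and `chi2_quantile(0.975, 10)` should reproduce the limits $a$ and $b$ quoted in the text for 10 degrees of freedom, which gives a quick check of the routine before it is applied to data.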

7.3.2 Case with $\mu$ unknown

When we search a confidence interval for the variance $\sigma^2$ but do not know the mean value of the population $N(\mu,\sigma^2)$ the sample $x_1,x_2,\ldots,x_n$ provides the arithmetic mean $\bar{x}$, and we can make use of the fact (compare Sect.5.1.6) that the variable

$$\sum_{i=1}^{n}(x_i-\bar{x})^2/\sigma^2$$

is distributed as $\chi^2(n-1)$. Thus the reasoning of the preceding section can be applied to the inference problem, and we may still use the chi-square distribution to obtain confidence intervals for $\sigma^2$. However, whereas in the previous case when $\mu$ was known we used $\chi^2(n)$, the present case with $\mu$ unknown requires $\chi^2(n-1)$.

Instead of eq.(7.10) we shall now have

$$P\left(a \le \sum_{i=1}^{n}\frac{(x_i-\bar{x})^2}{\sigma^2} \le b\right) = \int_a^b f(u;n-1)\,du = \gamma, \tag{7.13}$$

where $f(u;n-1)$ is the chi-square p.d.f. for $n-1$ degrees of freedom. For a specified confidence coefficient $\gamma$ the limits $a$, $b$ can be determined in the usual manner, entering Appendix Table A8 for $n-1$ degrees of freedom. The probability statement for $\sigma^2$, analogous to eq.(7.11), reads

$$P\left(\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{b} \le \sigma^2 \le \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{a}\right) = \gamma. \tag{7.14}$$

For the numerical example of the previous section the 10 measurements give

$$\bar{x} = [\ldots], \qquad \sum_{i=1}^{10}(x_i-\bar{x})^2 = [\ldots].$$

If we therefore did not know what the true value $\mu$ were, the 95% confidence interval for $\sigma^2$ obtained for $(10-1) = 9$ degrees of freedom would be (see Appendix Table A8)

[...] (7.15)

We see, by comparison with eq.(7.12) of the preceding section, that the 95% confidence interval for $\sigma^2$ becomes wider when the number of degrees of freedom is reduced from 10 to 9. That is, our knowledge about $\sigma^2$ has become less precise in the present case because we also had to adopt the sample mean $\bar{x}$ as an estimate for the unknown population mean $\mu$.

Exercise 7.10: For the example in the text deduce confidence intervals for $\sigma^2$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.99$. (Compare Exercise 7.8.)

Exercise 7.11: If 7.3, 6.6, 7.0, 5.1, 7.1, 8.5, 5.9, 6.5, 6.2 are independent observations from $N(\mu,\sigma^2)$, where $\mu$ is unknown, find confidence intervals for $\sigma^2$ corresponding to (a) $\gamma = 0.90$, (b) $\gamma = 0.95$, (c) $\gamma = 0.99$. (Compare Exercise 7.9.)

Exercise 7.12: If, in Exercise 7.4, the momentum of the beam particles was known to be 79.08 GeV/c, but the measuring device had an unknown accuracy, find a 90% confidence interval for the error. How would you determine a 68% confidence interval for the error?

7.4 CONFIDENCE REGIONS FOR THE MEAN AND VARIANCE

Suppose we are to give a joint confidence region for the mean and the variance in $N(\mu,\sigma^2)$ on the basis of the sample $x_1,x_2,\ldots,x_n$. To do this we use the fact that for normal samples, the variables $\bar{x}$ and $s^2$ are independent (Sect.4.8.6). If, for example, a 95% confidence region is desired we can write two probability statements as

$$P\left(-a \le \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \le a\right) = \sqrt{0.95}, \tag{7.16}$$

$$P\left(a' \le \sum_{i=1}^{n}\frac{(x_i-\bar{x})^2}{\sigma^2} \le b'\right) = \sqrt{0.95}, \tag{7.17}$$

and determine the limits $a$ and $a'$, $b'$ from $N(0,1)$ and $\chi^2(n-1)$, respectively. For
the independent v a r i a b l e s a joint probability

..
I atatemeat is obtained by multiplying t h e OE eqs.(7.16], (7.171. 1 8. Estimation of parameters
giving
I

i n eq.(7.18)
The inequalities determine a region i n the parlmeter
is indicated by the shaded area i n Pig. 7.2. The region i 8 bounded
space
The general problem of parameter est-tion may be sketched a.

From a r e s t r i c t e d n l d e r of obeervations. assumed t o c o n s t i t u t e a


random sanptc, one wants t o gain some knovledge about the underlying ~ o p u l a t i o n
o r m i v e r s e from which the sample emanated. The mathematical form of t h e parent
d i s t r i b u t i o n may be well-defined, but i t involves a c e r t a i n number of parameters,
whose numerical values are not kn-. The measurements should t h e r e f o r e be used
to e x t r a c t t h e l a r g r s t poeeible mount of informetion about t h e par-ters,
s p e c i f i c a l l y , the e x p r i o l e n t a l sample should supply numbers which could be s a i d
t o represent the n m e r i c a l value8 of t h e parameters.
We have already i n Cbepter 7 seen examples on i n t a m a t estimation of
the parameters i n the normal d i i t r i b u t i o n . I n the forthcoming chapter* we s h a l l
study d i f f e r e n t methods f o r estimating &am parameters i n general p r o b a b i l i t y
F i g . 7.2, confidence region(shaded) f o r the mean !J and variance 0
' of ~(lr.5'). distributions. These methods (the H s r h L i k e l i h o o d - , least-Square.- and mo-
ments methods) a11 produce p d n t 8stimatss.th.t is, some d e f i n i t e numbor. f o r th.
parameters. The methods a l s o give mcasures f o r the uncertainties i n t h e e s t i -
-
by t h e two s t r a i g h t l i n e s oz ie,(xi-T)z/b' and u2 -
i ~ l ( r i - ~ ) 2 / b respec-
expressing the dependence between a' and p ,
, mates and thereby express the confidence t h a t can be attached t o the n-rical
results. I

a' -
tively, and the
"(u - See f u r t h e r Sect.9.7.5. I n t h i n chapter we s h a l l take up various general aspects of par-ter
estimation by discussing i n some d e t a i l a few of t h e c r i t e r i a t h a t should be
f u l f i l l e d by good and acceptable estimators. Although these c r i t e r i a rill be
applied t o the a p e c i f i e point eetimrtion method. described i n Chaptern 9-11 t h e i
I
discussion in t h e following sections w i l l mainly be of a f o m l nsLure. 'the
student who require8 j u s t e s u p e r f i c i a l knowledge of these r a t h e r theor.tic.1
features may therefore be s a t i s f i e d with reading only the f i r s t N o sectimm of
t h i s chapter.
8.1 DEFINITIONS

The term estimator denotes in the following a function of the observations, or the method or prescription used to find a value for an unknown parameter. By an estimate we mean the numerical value of the parameter obtained with the estimator for a particular set of observations. If the parameter is $\theta$, its estimate is denoted by $\hat{\theta}$. The term statistic was introduced in Sect.7.1 as a function of one or more random variables that does not depend on any unknown parameters. In the general case we will let the statistic $t = t(x_1,x_2,\ldots,x_n)$ be an estimator of the unknown $\theta$ or of some function of $\theta$.

If a continuous or discrete population has the probability distribution $f(x;\theta)$, the likelihood of the observations $x_1,x_2,\ldots,x_n$ for a specific $\theta$ is given by

$$L(x_1,x_2,\ldots,x_n|\theta) = \prod_{i=1}^{n} f(x_i;\theta). \tag{8.1}$$

The product expresses the joint conditional probability for obtaining the measurements $x_1,x_2,\ldots,x_n$, given $\theta$. We shall later also think of $L(x_1,x_2,\ldots,x_n|\theta)$ as a function of $\theta$, calling it a likelihood function. We will use the symbol $L$ or $L(\mathbf{x}|\theta)$ to denote the likelihood (i.e. the number expressing the joint probability) as well as the likelihood function (i.e. the function of $\theta$). A third interpretation regards $L$ as a joint probability density function of observable quantities $x_i$, for a given $\theta$. Hopefully, in the following, the precise meaning of the symbol $L$, or $L(\mathbf{x}|\theta)$, will be clear from the context in which it appears.

8.2 PROPERTIES OF ESTIMATORS

Observations are random variables. Any function of the observations will also be a random variable, which may take on a variety of values. If a statistic $t = t(x_1,x_2,\ldots,x_n)$ is used as an estimator for the parameter $\theta$ it will therefore give rise to a distribution of estimates $\hat{\theta}$. The individual estimates obtained are of less interest than their overall distribution, because this distribution will reflect the quality of the estimator when it is used many times. We will therefore judge the merits of an estimator from the characteristics of the distribution of its estimates.

A good estimator should in the long run produce estimates which do not systematically deviate from the true parameter value, and its accuracy should increase with the number of observations. Frequently there are several estimators which fulfil these requirements and hence can reasonably be thought of for estimating an unknown parameter. If so, one estimator can be said to be superior to the others if its distribution of estimates shows the best "concentration" about the true parameter value. "Concentration" may for this purpose be expressed by giving the variance as a measure of the spread of the distribution about its central value.

In the forthcoming sections we will discuss the following optimum properties that are desired for good estimators: consistency, unbiassedness, minimum variance, efficiency and sufficiency. Only rarely will the conceivable estimators for a parameter possess all the good properties. One may therefore have to choose between them, and in each specific case decide which of the ideal properties can be abandoned, taking other considerations into account as well.

8.3 CONSISTENCY

It is intuitively clear that a desirable property of an estimator is that its estimates converge towards the true parameter value when the number of observations is increased. Such a property is called consistency.

Mathematically, the estimator $t$ can be said to be consistent if it converges in probability to $\theta$. If the estimate $\hat{\theta}_n$ is obtained from a sample of size $n$, then, given any positive $\varepsilon$ and $\eta$, an $N$ should exist such that

$$P\left(|\hat{\theta}_n - \theta| > \varepsilon\right) < \eta \tag{8.2}$$

for all $n > N$. In words, the estimator is consistent if, given any small quantity $\varepsilon$, we can find a sample size $N$ such that, for all larger samples, the probability that $\hat{\theta}_n$ differs from the true value by more than $\varepsilon$ is arbitrarily close to zero.

As an example, we know from the Law of Large Numbers (Sect.3.10.4) that the arithmetic mean of a sample of $n$ measurements from a population with mean $\mu$ and finite variance will converge towards $\mu$ as $n$ becomes large,
$$\lim_{n\to\infty} P\left(|\bar{x}-\mu| > \varepsilon\right) = 0.$$

The sample mean is therefore a consistent estimator of the population mean.

Exercise 8.1: Show explicitly that the mean $\bar{x}$ of a sample of size $n$ from the normal population $N(\mu,\sigma^2)$ converges in probability to $\mu$.

8.4 UNBIASSEDNESS

Consistency is a property that describes the behaviour of an estimator when the sample size increases to infinity, but says nothing about its behaviour for finite data sets. For example, in the last section we just saw that the sample mean $\bar{x}$ is a consistent estimator of the population mean $\mu$, but so would also be

$$t' = \frac{1}{n-a}\sum_{i=1}^{n} x_i,$$

where $a$ is any fixed number. Why do we prefer one to the other?

Unbiassedness is an estimator property defined for finite sets of observations. This property is possessed by estimators whose estimates are not systematically shifted from the true parameter value but centered around this value for all sample sizes. As a measure of the centre of a distribution one can use the conventional mean value. Mathematically, the property unbiassedness is therefore defined if, for all sample sizes $n$, the expectation value of the estimator $t$ is equal to the true parameter value $\theta$,

$$E(t) = \theta. \tag{8.3}$$

If

$$E(t) = \theta + b(\theta) \tag{8.4}$$

and $b$ is different from zero, the estimator is biassed. The bias term $b(\theta)$ will for all reasonable estimators be of order $\frac{1}{n}$ or smaller compared to $\theta$.

It is rather trivial to see that the sample mean $\bar{x}$ is an unbiassed estimator of the population mean $\mu$ whenever the latter exists. It is also seen that the estimator $t'$ above will be a biassed estimator of $\mu$ for all $a$ different from zero.

Consistency and unbiassedness are independent estimator qualities, as neither property implies the other. It is generally accepted that consistency is more important than unbiassedness, partly because bias can often be corrected for. A consistent estimator whose asymptotic distribution has a finite mean will always be asymptotically unbiassed.

8.4.1 Example: $s^2$ as an estimator of $\sigma^2$

We will show that the statistic $s^2 = \frac{1}{n-1}\sum_i (x_i-\bar{x})^2$, the sample variance, is an unbiassed estimator of the population variance $\sigma^2$. We write

$$\sum_i (x_i-\bar{x})^2 = \sum_i \left[(x_i-\mu)-(\bar{x}-\mu)\right]^2 = \sum_i (x_i-\mu)^2 - n(\bar{x}-\mu)^2,$$

where $\mu$ is the population mean. Remembering the fact that the different $x_i$ are independent we get for the expectation value

$$E\left(\sum_i (x_i-\bar{x})^2\right) = n\sigma^2 - n\cdot\frac{\sigma^2}{n} = (n-1)\sigma^2.$$

Hence

$$E(s^2) = \sigma^2,$$

which should be compared to eq.(8.3). It is seen that the presence of the factor $\frac{1}{n-1}$ instead of the more intuitive $\frac{1}{n}$ in the definition ensures that the sample variance $s^2$ is an unbiassed estimator of $\sigma^2$.

The statistic $S^2 = \frac{1}{n}\sum_i (x_i-\bar{x})^2$ is a biassed estimator of $\sigma^2$, because

$$E(S^2) = \frac{n-1}{n}\,\sigma^2 = \left(1-\frac{1}{n}\right)\sigma^2 \neq \sigma^2.$$

We see that the bias is $b(\sigma^2) = -\frac{1}{n}\sigma^2$, which decreases with increasing sample size.
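The statements of Sect.8.4.1 on the two variance estimators can be verified exactly for a small discrete population by enumerating every possible sample. The sketch below is an illustration added here, not taken from the book: the population (values 0, 1, 3 with probabilities 1/2, 1/3, 1/6), the sample size and all function names are hypothetical choices, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction
from itertools import product

def expectation(statistic, values, probs, n):
    """Exact E[statistic] over all samples of size n drawn independently
    from a discrete population (values[i] has probability probs[i])."""
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]
        total += weight * statistic([values[i] for i in idx])
    return total

def sum_sq_dev(xs):
    """Sum of squared deviations about the sample mean."""
    mean = Fraction(sum(xs), len(xs))
    return sum((x - mean) ** 2 for x in xs)

def s2(xs):
    """Sample variance with divisor n - 1."""
    return sum_sq_dev(xs) / (len(xs) - 1)

def S2(xs):
    """Variance statistic with divisor n."""
    return sum_sq_dev(xs) / len(xs)

# Hypothetical population, chosen only for illustration.
values = [0, 1, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
mu = sum(p * v for p, v in zip(probs, values))
sigma2 = sum(p * (v - mu) ** 2 for p, v in zip(probs, values))

n = 4
bias_s2 = expectation(s2, values, probs, n) - sigma2   # bias with divisor n - 1
bias_S2 = expectation(S2, values, probs, n) - sigma2   # bias with divisor n
```

Under these assumptions `bias_s2` should vanish identically, while `bias_S2` should equal $-\sigma^2/n$ for the chosen $n$; repeating the run with another `n` shows the bias of the $1/n$ statistic shrinking with sample size.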
8.4.2 Example: Estimator of the third central moment

This example is presented to illustrate how one from the first intuitive guess can construct the correct form of an unbiassed estimator.

Guided by the preceding example we consider the following sum,

$$\sum_i (x_i-\bar{x})^3 = \sum_i (x_i-\mu)^3 - 3(\bar{x}-\mu)\sum_i (x_i-\mu)^2 + 3(\bar{x}-\mu)^2\sum_i (x_i-\mu) - n(\bar{x}-\mu)^3.$$

Recalling the definition of the third central moment $\mu_3$ (Sect.3.3.3) and using the independence of the $x_i$ one finds for the expectation of the different parts,

$$E\left[\sum_i (x_i-\mu)^3\right] = n\mu_3, \qquad E\left[(\bar{x}-\mu)\sum_i (x_i-\mu)^2\right] = \mu_3,$$

$$E\left[(\bar{x}-\mu)^2\sum_i (x_i-\mu)\right] = \frac{1}{n}\mu_3, \qquad E\left[n(\bar{x}-\mu)^3\right] = \frac{1}{n}\mu_3.$$

Collecting terms leads to

$$E\left[\sum_i (x_i-\bar{x})^3\right] = n\mu_3 - 3\mu_3 + \frac{3}{n}\mu_3 - \frac{1}{n}\mu_3 = \frac{(n-1)(n-2)}{n}\,\mu_3.$$

By comparison with eq.(8.3) it is seen that an unbiassed estimator of $\mu_3$ is

$$t = \frac{n}{(n-1)(n-2)}\sum_i (x_i-\bar{x})^3.$$

Exercise 8.2: In the binomial distribution $B(r;n,p) = \binom{n}{r}p^r(1-p)^{n-r}$, where $r = 0,1,\ldots,n$, show that the unbiassed estimators of $p$ and $p^2$ are, respectively, $r/n$ and $r(r-1)/n(n-1)$.

8.5 MINIMUM VARIANCE AND EFFICIENCY

The requirements of consistency and unbiassedness do not uniquely determine how to choose a good estimator. One finds, for instance, that both the sample mean and the sample median are consistent and unbiassed estimators of the location of a normal population with known variance. However, as it can be shown that the variance of the mean is smaller than the variance of the median, the mean is regarded as a better estimator of the central value. It seems natural, therefore, to use the spread in the estimates as a measure for the acceptability of an estimator. For most distributions encountered in practice, the second central moment, or the variance, will be a good measure of the concentration of the estimates; this is especially so for the many cases where this distribution is approximately normal.

Under fairly general conditions there exists a lower bound on the variance of the estimates derived from an estimator. This lower bound is easily established when considering the likelihood function defined by eq.(8.1). We shall assume that the first two derivatives of $L(\mathbf{x}|\theta)$ with respect to $\theta$ exist for all $\theta$, and that the range of $x$ is independent of $\theta$. Given an estimator $t$ of some function of $\theta$, say $T(\theta)$, we define its bias $b(\theta)$ by the relation (compare eq.(8.4)),

$$E(t) = \int\!\cdots\!\int t\,L\,d\mathbf{x} = T(\theta) + b(\theta). \tag{8.5}$$

With the assumption that the range of $x$ is independent of $\theta$ the differentiation of eq.(8.5) with respect to $\theta$ gives

$$\int\!\cdots\!\int t\,\frac{\partial \ln L}{\partial\theta}\,L\,d\mathbf{x} = \frac{\partial T}{\partial\theta} + \frac{\partial b}{\partial\theta}. \tag{8.6}$$

Since $L$ is the joint probability of the observations,

$$\int\!\cdots\!\int L\,d\mathbf{x} = 1, \tag{8.7}$$

we find by differentiating this relation with respect to $\theta$

$$\int\!\cdots\!\int \frac{\partial \ln L}{\partial\theta}\,L\,d\mathbf{x} = 0. \tag{8.8}$$
Multiplying eq.(8.8) by $T(\theta)$ and subtracting the result from eq.(8.6) leads to

$$\int\!\cdots\!\int \left(t-T(\theta)\right)\frac{\partial \ln L}{\partial\theta}\,L\,d\mathbf{x} = \frac{\partial T}{\partial\theta} + \frac{\partial b}{\partial\theta}. \tag{8.9}$$

By applying the Schwarz inequality to the integral we obtain the formula

$$\int\!\cdots\!\int \left(t-T(\theta)\right)^2 L\,d\mathbf{x}\;\cdot\int\!\cdots\!\int \left(\frac{\partial \ln L}{\partial\theta}\right)^2 L\,d\mathbf{x} \;\ge\; \left(\frac{\partial T}{\partial\theta} + \frac{\partial b}{\partial\theta}\right)^2, \tag{8.10}$$

which can be written as

$$V(t) \ge \left(\frac{\partial T}{\partial\theta} + \frac{\partial b}{\partial\theta}\right)^2 \Big/ E\left[\left(\frac{\partial \ln L}{\partial\theta}\right)^2\right]. \tag{8.11}$$

For an unbiassed estimator the first integral in eq.(8.10) is the variance $V(t)$; repeating the steps with $T(\theta)+b(\theta)$ in place of $T(\theta)$ shows that the bound holds for $V(t)$ also when $b(\theta)$ is different from zero.

This fundamental inequality for the variance of an estimator is often referred to as the Cramér-Rao inequality. When $L$ satisfies the aforementioned regularity conditions one can prove the following relation,

$$E\left[\left(\frac{\partial \ln L}{\partial\theta}\right)^2\right] = E\left[-\frac{\partial^2 \ln L}{\partial\theta^2}\right]. \tag{8.12}$$

Thus an alternative form of the Cramér-Rao inequality, which is sometimes easier to evaluate, is

$$V(t) \ge \left(\frac{\partial T}{\partial\theta} + \frac{\partial b}{\partial\theta}\right)^2 \Big/ E\left[-\frac{\partial^2 \ln L}{\partial\theta^2}\right]. \tag{8.13}$$

The lower limit of the variance implied by eqs.(8.11), (8.13) is called the minimum variance bound, MVB, and an estimator attaining this limit is called an MVB estimator, or more often, an efficient estimator.

From the derivation above it is realized that the variance of an estimator will attain the MVB if the Schwarz inequality applied to eq.(8.9) becomes an equality. The necessary and sufficient condition for this is that $(t-T(\theta))$ is linearly related to $\frac{\partial \ln L}{\partial\theta}$ for all sets of observations; we write

$$\frac{\partial \ln L}{\partial\theta} = A(\theta)\left[t - T(\theta) - b(\theta)\right]. \tag{8.14}$$

From eq.(8.13) one then gets a simple formula for the MVB,

$$V(t) = \left(\frac{\partial T}{\partial\theta} + \frac{\partial b}{\partial\theta}\right)\Big/ A(\theta). \tag{8.15}$$

Efficient estimators exist only for the limited class of problems for which the likelihood function satisfies the condition (8.14). Most estimators will have a variance larger than the MVB. We therefore define the efficiency of an estimator as the ratio between the MVB and the actual variance $V(t)$ of the estimator,

$$\text{Efficiency}(t) = \frac{\text{MVB}}{V(t)}. \tag{8.16}$$

It is interesting to note that for distributions where the regularity conditions do not hold the minimum attainable variance may be smaller or larger than the MVB.

For the particular case where $t$ is an estimator of the parameter $\theta$ itself, we have $\frac{\partial T}{\partial\theta} = 1$, and the Cramér-Rao inequalities (8.11), (8.13) become, respectively,

$$V(t) \ge \left(1 + \frac{\partial b}{\partial\theta}\right)^2 \Big/ E\left[\left(\frac{\partial \ln L}{\partial\theta}\right)^2\right], \tag{8.17}$$

$$V(t) \ge \left(1 + \frac{\partial b}{\partial\theta}\right)^2 \Big/ E\left[-\frac{\partial^2 \ln L}{\partial\theta^2}\right]. \tag{8.18}$$

The efficiency condition eq.(8.14) then reads

$$\frac{\partial \ln L}{\partial\theta} = A(\theta)\left[t - \theta - b(\theta)\right], \tag{8.19}$$

simplifying the MVB formula (8.15) to

$$V(t) = \left(1 + \frac{\partial b}{\partial\theta}\right)\Big/ A(\theta). \tag{8.20}$$

Exercise 8.3: Prove eq.(8.12). (Hint: Differentiate eq.(8.8) with respect to $\theta$.)

8.5.1 Example: Estimator of the mean in the Poisson distribution

The likelihood for $n$ observations $x_1,x_2,\ldots,x_n$ from the Poisson distribution $f(x;\theta) = \frac{1}{x!}\theta^x e^{-\theta}$ is

$$L(\mathbf{x}|\theta) = \prod_{i=1}^{n}\frac{1}{x_i!}\,\theta^{x_i}e^{-\theta}.$$

Hence we have for the derivative of $\ln L$ with respect to the unknown $\theta$,

$$\frac{\partial \ln L}{\partial\theta} = \sum_{i=1}^{n}\left(\frac{x_i}{\theta} - 1\right) = \frac{n}{\theta}\left(\bar{x}-\theta\right),$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n}x_i$. Thus $\frac{\partial \ln L}{\partial\theta}$ is of the form of eq.(8.19), with $A(\theta) = \frac{n}{\theta}$, $t = \bar{x}$, $b(\theta) = 0$. An unbiassed and efficient estimator of the parameter $\theta$ is therefore $\bar{x}$, the sample mean, with variance given by the MVB formula eq.(8.20),

$$V(\bar{x}) = \frac{\theta}{n}.$$

Exercise 8.4: Explain why the Cauchy p.d.f. $f(x;\theta) = \frac{1}{\pi}\left(1+(x-\theta)^2\right)^{-1}$ does not have any efficient estimator of $\theta$. Show that the Cramér-Rao lower bound is $2/n$, where $n$ is the sample size.

Exercise 8.5: In the binomial distribution for which $L(r|p) = \binom{n}{r}p^r(1-p)^{n-r}$, show that the unbiassed estimator $t = r/n$ of $p$ (Exercise 8.2) is also efficient. What is $V(t)$?

8.5.2 Example: Estimators of the mean in the normal p.d.f.

We have seen that the sample mean is a consistent and unbiassed estimator of the mean value in any population of finite variance. To examine further the properties of $\bar{x}$ as an estimator of $\mu$ in the normal distribution $N(\mu,\sigma^2)$ with fixed $\sigma^2$ we write

$$\frac{\partial \ln L}{\partial\mu} = \frac{\partial}{\partial\mu}\left[-\sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2\sigma^2} + \text{const}\right] = \frac{n}{\sigma^2}\left(\bar{x}-\mu\right).$$

Comparing with eq.(8.19) we see that $\bar{x}$ is an efficient estimator of $\mu$, and that the variance from eq.(8.20) is

$$V(\bar{x}) = \frac{\sigma^2}{n}. \tag{8.21}$$

This is in accordance with our previous knowledge that the variable $\bar{x}$ is distributed as $N(\mu,\sigma^2/n)$.

An alternative consistent and unbiassed estimator of $\mu$ is the sample median. It can be shown that, when the size of the normal sample gets very large, the median becomes distributed according to $N(\mu,\pi\sigma^2/2n)$; hence the variance of this estimator of $\mu$ is

$$V(\text{median}) = \frac{\pi\sigma^2}{2n}. \tag{8.22}$$

This variance is larger than the MVB of eq.(8.21). The asymptotic efficiency of the median as an estimator of the mean in the normal distribution is therefore from eq.(8.16),

$$\text{Efficiency (median)} = \frac{\sigma^2}{n}\Big/\frac{\pi\sigma^2}{2n} = \frac{2}{\pi} = 0.64. \tag{8.23}$$

Exercise 8.6: Show that the MVB of $\mu$ in $N(\mu,\sigma^2)$ can also be found from eq.(8.18).

8.5.3 Example: Estimators of $\sigma^2$ and $\sigma$ in the normal p.d.f.

We next consider the normal distribution $N(0,\sigma^2)$. To find an estimator of $\sigma^2$ we write

$$\frac{\partial \ln L}{\partial\sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}x_i^2 = \frac{n}{2\sigma^4}\left(\frac{1}{n}\sum_{i=1}^{n}x_i^2 - \sigma^2\right).$$

Again a comparison with eqs.(8.19), (8.20) shows that the statistic $t = \frac{1}{n}\sum_{i=1}^{n}x_i^2$ is an unbiassed and efficient estimator of the variance $\sigma^2$ of $N(0,\sigma^2)$, with variance $V(t) = 2\sigma^4/n$.

If, alternatively, we take $\sigma$ as the parameter to be estimated we find

$$\frac{\partial \ln L}{\partial\sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n}x_i^2 = \frac{n}{\sigma^3}\left(\frac{1}{n}\sum_{i=1}^{n}x_i^2 - \sigma^2\right),$$

which is not of the form of eq.(8.19). Hence there exists no efficient estimator for the standard deviation $\sigma$ of the normal p.d.f. $N(0,\sigma^2)$. In the framework and formulation of Sect.8.5.1 there exists, however, an efficient estimator $t = \frac{1}{n}\sum_{i=1}^{n}x_i^2$ for a function of $\sigma$, $T(\sigma) = \sigma^2$; from eq.(8.15) the variance of $t$ is seen to be

$$V(t) = \frac{2\sigma^4}{n}, \tag{8.24}$$

which is in agreement with our previous finding.

Note that the results above hold also if the sample originates from a normal distribution having a known mean value $\mu$ different from zero.

Exercise 8.7: Consider the gamma distribution $f(x;\alpha,\beta) = \left(\Gamma(\alpha)\beta^{\alpha}\right)^{-1}x^{\alpha-1}e^{-x/\beta}$. (a) Assuming $\alpha$ to be known, find an efficient estimator and the MVB of $\beta$. (b) Assuming $\beta$ to be known, does any efficient estimator exist for $\alpha$?
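The Poisson example of Sect.8.5.1 lends itself to a direct numerical check of the two forms of the Cramér-Rao bound. The sketch below is an added illustration under stated assumptions: it sums the per-observation expectations $E[(\partial\ln f/\partial\theta)^2]$ and $E[-\partial^2\ln f/\partial\theta^2]$ over $x = 0,\ldots,x_{\max}$ (the truncation at `x_max` is an assumption, harmless for moderate $\theta$), and the function names are hypothetical. For $n$ independent observations the information is $n$ times the per-observation value, so the bound for an unbiassed estimator is the reciprocal of that product.

```python
import math

def poisson_pmf(x, theta):
    """Poisson probability theta^x exp(-theta) / x!, evaluated via logarithms."""
    return math.exp(x * math.log(theta) - theta - math.lgamma(x + 1))

def poisson_information(theta, x_max=400):
    """Per-observation E[(d ln f/d theta)^2] and E[-d^2 ln f/d theta^2],
    summed over x = 0..x_max (truncated sum; an assumption of this sketch)."""
    sq, curv = 0.0, 0.0
    for x in range(x_max + 1):
        p = poisson_pmf(x, theta)
        score = x / theta - 1.0        # d ln f / d theta
        sq += p * score ** 2
        curv += p * x / theta ** 2     # -d^2 ln f / d theta^2
    return sq, curv

def mvb_unbiased(theta, n):
    """Minimum variance bound for an unbiassed estimator of theta
    based on n independent Poisson observations."""
    sq, _ = poisson_information(theta)
    return 1.0 / (n * sq)

theta, n = 3.5, 20                     # hypothetical values for illustration
sq, curv = poisson_information(theta)
bound = mvb_unbiased(theta, n)
variance_of_mean = theta / n           # V(x-bar) for Poisson observations
```

If the reasoning of the section holds, `sq` and `curv` agree (the relation between the squared first derivative and the second derivative), and `bound` coincides with `variance_of_mean`, which is what it means for the sample mean to be efficient in this case.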
8.6 SUFFICIENCY

8.6.1 One-parameter case

An estimator $t$ is said to be sufficient if it exhausts all information in the observations $x_1,x_2,\ldots,x_n$ regarding the parameter $\theta$. For the normal distribution we have used the sample mean $\bar{x} = \frac{1}{n}\sum_i x_i$ as an estimator of the population mean $\mu$. No extra knowledge on $\mu$ can be gained from other functions of the observations, such as $\sum_i |x_i|$, $\sum_i x_i^2$, etc.; the statistic $\bar{x}$ is then a sufficient estimator for $\mu$. Actually any function of $\bar{x}$ provides a sufficient statistic for $\mu$ in the normal distribution. Therefore, to choose between the different functions one may have to test also the qualities of consistency, unbiassedness, and efficiency.

To be more precise, let us consider the likelihood function when the p.d.f. is $f(x;\theta)$. Suppose that $L$ can be factorized to give

$$L(\mathbf{x}|\theta) = G(t|\theta)\,H(\mathbf{x}), \tag{8.25}$$

where the function*) $G$ involves the statistic $t$ and the parameter $\theta$, and $H$ is independent of $\theta$, being a function of the observations $\mathbf{x}$ only. Since $G$ only has reference to the data via $t = t(x_1,x_2,\ldots,x_n)$ and $H$ is a known number for the given sample, $t$ must supply all the available information in the data regarding $\theta$. It follows that whenever the likelihood function can be written in the form of eq.(8.25), $t$ is a sufficient statistic for the parameter $\theta$.

One can show that a necessary condition for a sufficient statistic to exist is that the p.d.f. belongs to the exponential family, defined by

$$f(x;\theta) = \exp\left[B(\theta)C(x) + D(\theta) + E(x)\right], \tag{8.26}$$

where $B$, $C$, $D$, $E$ are functions of the indicated arguments.

For the restricted class of p.d.f.'s satisfying eq.(8.26) one sees that the factorization requirement of eq.(8.25) implies that a sufficient statistic must be expressed by the function $C(x)$,

$$t = \sum_{i=1}^{n} C(x_i). \tag{8.27}$$

*) In the non-regular situation where the range of $x$ may depend on $\theta$, one must check that the function $G$ is the conditional p.d.f. for $t$, given $\theta$.

It is easily shown that efficient estimators are always sufficient. To see this we take logarithms on both sides of eq.(8.25) and differentiate with respect to $\theta$, getting

$$\frac{\partial \ln L}{\partial\theta} = \frac{\partial \ln G(t|\theta)}{\partial\theta}. \tag{8.28}$$

We see that the efficiency condition eq.(8.14) is just a special case of the more general eq.(8.28), with

$$\frac{\partial \ln G(t|\theta)}{\partial\theta} = A(\theta)\left[t - T(\theta) - b(\theta)\right]. \tag{8.29}$$

Under the regularity conditions specified for the likelihood function in Sect.8.1 there is among all sufficient estimators $t$ for $\theta$ only one efficient estimator. If the p.d.f. belongs to the exponential family and the range of $x$ is independent of $\theta$, it can be shown that sufficient statistics always exist for $\theta$. However, there will be just one sufficient statistic which will satisfy eq.(8.29) and thus estimate some function $T(\theta)$ with variance equal to the MVB; compare the example of Sect.8.5.3. Furthermore, for large samples, any function of a sufficient statistic will be an MVB estimator.

Exercise 8.8: Verify that the Poisson distribution $P(x|\theta) = \frac{1}{x!}\theta^x e^{-\theta}$ belongs to the exponential family. Compare Sect.8.5.1.

Exercise 8.9: Show that the Cauchy p.d.f. $f(x;\theta) = \frac{1}{\pi}\left(1+(x-\theta)^2\right)^{-1}$ does not have a sufficient estimator of $\theta$. Compare Exercise 8.4.

8.6.2 Example: Single sufficient statistics for the normal p.d.f.

We have already seen (Sect.8.5.2) that the sample mean $\bar{x}$ is an efficient estimator of the mean $\mu$ in the normal distribution. According to the general statement of the previous section $\bar{x}$ is then also a sufficient statistic for $\mu$. We want to show this more directly, and observe that, generally

$$\sum_{i=1}^{n}(x_i-\mu)^2 = \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2.$$
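Therefore, when $\sigma^2$ is known, the likelihood function for $\mu$ can be written

$$L(\mathbf{x}|\mu) = \left[\left(\frac{n}{2\pi\sigma^2}\right)^{1/2}\exp\left(-\frac{n(\bar{x}-\mu)^2}{2\sigma^2}\right)\right]\cdot\left[n^{-1/2}(2\pi\sigma^2)^{-(n-1)/2}\exp\left(-\sum_{i=1}^{n}\frac{(x_i-\bar{x})^2}{2\sigma^2}\right)\right].$$

$L$ is now factorized to the form of eq.(8.25), with the first bracket to be identified with $G(\bar{x}|\mu)$, only dependent on the observations through the statistic $\bar{x}$. Thus $\bar{x}$ is a sufficient statistic for $\mu$. We see explicitly that $G(\bar{x}|\mu)$ is the p.d.f. for the variable $\bar{x}$, namely $N(\mu,\sigma^2/n)$. The second bracket involves the separate sample values, and corresponds to $H(\mathbf{x})$ of eq.(8.25).

It is easy to see that, given $\sigma^2$, the normal p.d.f. is of the exponential form of eq.(8.26) for the parameter $\mu$, because

$$f(x;\mu) = \exp\left[\frac{\mu}{\sigma^2}\,x - \frac{\mu^2}{2\sigma^2} - \tfrac{1}{2}\ln(2\pi\sigma^2) - \frac{x^2}{2\sigma^2}\right].$$

Thus we can make the identification with the functions

$$B(\mu) = \frac{\mu}{\sigma^2}, \quad C(x) = x, \quad D(\mu) = -\frac{\mu^2}{2\sigma^2} - \tfrac{1}{2}\ln(2\pi\sigma^2), \quad E(x) = -\frac{x^2}{2\sigma^2}.$$

If, instead, $\mu$ was fixed and $\sigma^2$ to be estimated we would write the likelihood function as

$$L(\mathbf{x}|\sigma^2) = \left[(2\pi\sigma^2)^{-n/2}\exp\left(-\sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2\sigma^2}\right)\right]\cdot\left[\,1\,\right].$$

This is again of the factorized form of eq.(8.25), and shows that $\sum_i (x_i-\mu)^2$ is now a sufficient statistic for $\sigma^2$. One observes that by inclusion of appropriate factors the first bracket can be identified with a chi-square p.d.f. with $n$ degrees of freedom for the variable $\sum_i (x_i-\mu)^2/\sigma^2$.

It is seen that, estimating $\sigma^2$, we could take for the functions of the exponential family, eq.(8.26),

$$B(\sigma^2) = -\frac{1}{2\sigma^2}, \quad C(x) = (x-\mu)^2, \quad D(\sigma^2) = -\tfrac{1}{2}\ln(2\pi\sigma^2), \quad E(x) = 0.$$

8.6.3 Extension to several parameters

If the p.d.f. has $k$ parameters $\theta_1,\theta_2,\ldots,\theta_k$, there will exist a set of $r$ jointly sufficient statistics $t_1,t_2,\ldots,t_r$, where $r$ may be smaller than, equal to, or larger than $k$, provided that the likelihood function can be factorized in the following manner,

$$L(\mathbf{x}|\boldsymbol{\theta}) = G(t_1,t_2,\ldots,t_r|\boldsymbol{\theta})\,H(\mathbf{x}). \tag{8.30}$$

This is seen to be a generalization of eq.(8.25).

To have a set of $k$ jointly sufficient statistics for the $k$ parameters, the p.d.f. must belong to the exponential family. The generalization of eq.(8.26) to the multi-parameter case is

$$f(x;\boldsymbol{\theta}) = \exp\left[\sum_{j=1}^{k}B_j(\boldsymbol{\theta})C_j(x) + D(\boldsymbol{\theta}) + E(x)\right], \tag{8.31}$$

where $\boldsymbol{\theta} = \{\theta_1,\theta_2,\ldots,\theta_k\}$. Writing out the likelihood function one finds by comparison with the factorization property (8.30) that the $k$ joint sufficient statistics of the $k$ parameters must be expressed by the functions $C$ of the observations,

$$t_j = \sum_{i=1}^{n}C_j(x_i), \qquad j = 1,2,\ldots,k. \tag{8.32}$$

8.6.4 Example: Jointly sufficient estimators for $\mu$ and $\sigma^2$ in $N(\mu,\sigma^2)$

To demonstrate that two jointly sufficient estimators exist for $\mu$ and $\sigma^2$ in the normal distribution it suffices to show that the p.d.f. belongs to the exponential family of eq.(8.31). We write

$$f(x;\mu,\sigma^2) = \exp\left[\frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2 - \frac{\mu^2}{2\sigma^2} - \tfrac{1}{2}\ln(2\pi\sigma^2)\right],$$

which is of the form of eq.(8.31), with

$$B_1(\mu,\sigma^2) = \frac{\mu}{\sigma^2}, \quad B_2(\mu,\sigma^2) = -\frac{1}{2\sigma^2}, \quad C_1(x) = x, \quad C_2(x) = x^2,$$

$$D(\mu,\sigma^2) = -\frac{\mu^2}{2\sigma^2} - \tfrac{1}{2}\ln(2\pi\sigma^2), \quad E(x) = 0.$$

According to eq.(8.32) two jointly sufficient statistics for $\mu$ and $\sigma^2$ are given by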
$$t_1 = \sum_{i=1}^{n}x_i, \qquad t_2 = \sum_{i=1}^{n}x_i^2.$$

However, these estimators of $\mu$ and $\sigma^2$ are neither unbiassed nor consistent. Considering instead (compare Sect.5.1.6)

$$r_1 = \bar{x} = \frac{t_1}{n}, \qquad r_2 = s^2 = \frac{1}{n-1}\left(t_2 - \frac{t_1^2}{n}\right),$$

it is seen that these variables define a one-to-one mapping of $t_1,t_2$ onto $r_1,r_2$. The statistics $r_1$ and $r_2$ are therefore also jointly sufficient estimators for the two parameters. Moreover, these are unbiassed estimators of $\mu$ and $\sigma^2$, because

$$E(\bar{x}) = \mu, \qquad E(s^2) = \sigma^2.$$

The joint likelihood function for $\mu$ and $\sigma^2$ can be shown to factorize to the form of eq.(8.30), but since

[...]

$s^2$ is not a single sufficient estimator of $\sigma^2$ when $\mu$ is unknown, nor is $\bar{x}$ a single sufficient estimator of $\mu$ when $\sigma^2$ is unknown.

9. The Maximum-Likelihood method

The method of parameter estimation known as the Maximum-Likelihood (ML) method is very general and powerful. For estimation problems where a functional dependence can be written down for the observed variables, the ML method is eminently satisfactory for two reasons: it provides estimators with desirable properties, and the estimators are easy to find. The ML theory has a fundamental position in all problems of parameter estimation where the functional form of the p.d.f. is given. We will therefore treat the ML method somewhat at length and discuss its theoretical aspects as well as its practical implications.

In expositions of the ML method aimed at physicists, it is often the asymptotic properties of the ML estimators which are emphasized. For large samples the ML estimates are normally distributed. This nice property makes the determination of variances on ML estimates very simple. The following presentation will also emphasize the asymptotic properties, and these should be fairly easy to extract, without reading the chapter in full. However, it is in the framework of sufficient statistics that the ML estimators have their most important properties. We have already given a somewhat theoretical discussion of some of the fundamental properties of estimators in Chapter 8, and we will find in this chapter that the ML estimators possess most of these good properties.

The reader who wants only a first working knowledge of the ML method, and who wants mainly to know its asymptotic properties, can select as an initial reading the sections 9.1, 9.2, 9.5.1, 9.5.4 - 9.5.6, 9.6.1, 9.6.3, 9.7.1, 9.9. In particular one should note that the very simple graphical solution of the ML estimation problem, described in Sects.9.6.1 and 9.6.3, can be applied in many practical situations involving one or two unknown parameters.

I
9.1 THE MAXIMUM-LIKELIHOOD PRINCIPLE
Consider a p.d.f. $f(x|\theta)$ with an unknown parameter $\theta$ to be estimated from the set of observations*) $x_1, x_2, \ldots, x_n$. We have already in Chapter 8 encountered the likelihood function (LF)

$$L(\mathbf{x}|\theta) = \prod_{i=1}^{n} f(x_i|\theta) \tag{9.1}$$

as the joint conditional probability of the observations $x_1, x_2, \ldots, x_n$ at a fixed $\theta$. Since $f(x|\theta)$ is a probability density function properly normalized to one, we see that when the LF is considered a function of the $x_i$'s the integration over the total sample space $\Omega$ yields

$$\int_\Omega L(\mathbf{x}|\theta)\, d\mathbf{x} = 1 \tag{9.2}$$

for all $\theta$.

In the LF one may regard the observed $x_i$'s as constants and the parameter $\theta$ as a variable. According to the Maximum-Likelihood Principle we should choose as an estimate of the unknown parameter $\theta$ that particular $\hat\theta$ within the admissible range of $\theta$ which renders $L$ as large as possible**). This means that the estimate $\hat\theta$ is such that

$$L(\mathbf{x}|\hat\theta) \ge L(\mathbf{x}|\theta) \tag{9.3}$$

for all conceivable values of $\theta$. If $L$ is twice differentiable with respect to $\theta$, the value $\hat\theta$ may be obtained by solving the equation

$$\frac{\partial L}{\partial\theta} = \frac{\partial}{\partial\theta}\prod_{i=1}^{n} f(x_i|\theta) = 0 \tag{9.4}$$

with the condition that the second derivative evaluated at $\theta = \hat\theta$ is negative,

$$\left.\frac{\partial^2 L}{\partial\theta^2}\right|_{\theta=\hat\theta} < 0 . \tag{9.5}$$

Usually $L$ has only one maximum, and $\hat\theta$ is unique. If there is more than one maximum, one should look for supplementary information to choose between the solutions.

Since $L$ and the logarithm of $L$ attain their maxima for the same value of $\theta$, the ML solution may be found from the likelihood equation

$$\frac{\partial \ln L}{\partial\theta} = \frac{\partial}{\partial\theta}\sum_{i=1}^{n} \ln f(x_i|\theta) = 0 . \tag{9.6}$$

This sum is often easier to handle than the product in eq.(9.4). We then require

$$\left.\frac{\partial^2 \ln L}{\partial\theta^2}\right|_{\theta=\hat\theta} < 0 . \tag{9.7}$$

For the general case with several unknown parameters $\boldsymbol\theta = \{\theta_1, \theta_2, \ldots, \theta_k\}$ we have to solve the set of k likelihood equations

$$\frac{\partial \ln L}{\partial\theta_i} = 0, \qquad i = 1, 2, \ldots, k, \tag{9.8}$$

to find the ML estimates $\hat{\boldsymbol\theta} = \{\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_k\}$. A sufficient condition that the LF is at a local maximum is that the quadratic matrix $U(\hat{\boldsymbol\theta})$ with elements

$$U_{ij}(\hat{\boldsymbol\theta}) = \left.\frac{\partial^2 \ln L}{\partial\theta_i\,\partial\theta_j}\right|_{\boldsymbol\theta=\hat{\boldsymbol\theta}}$$

is negative definite.

In many practical problems the solution of eq.(9.6), or more generally eqs.(9.8), has to be found numerically. In fact, the numerical procedure can in many instances be advantageous, since it also gives directly the variances to be associated with the ML estimates; see Sect.9.6.2 for an example.

*) Each observation often corresponds to more than one measured variable; for instance, x can be a set of two angular quantities specifying a spatial direction (an example is given in Sect.9.5.8). In general, $x_i$ will denote the set of measured quantities for event i.

**) This choice of the "best value of a parameter" as the one that maximizes the conditional probability of x for given $\theta$ is not obvious. From Bayes' Theorem, for instance, a more intuitive choice of a "best value of $\theta$" would be the one which maximizes the joint probability of x and $\theta$; compare Sect.2.4.3.
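As an illustration of the numerical route mentioned in this section, the following minimal sketch maximizes ln L for the exponential p.d.f. treated in Sect.9.1.1 and checks the sign of the second derivative at the maximum. The simulated sample, the search interval and the helper names (`log_likelihood`, `maximize`) are assumptions made for the illustration only; the golden-section search is simply one convenient choice of maximizer, not a method prescribed by the text.

```python
import math
import random

def log_likelihood(tau, times):
    """ln L for the exponential p.d.f. f(t|tau) = exp(-t/tau) / tau."""
    return sum(-math.log(tau) - t / tau for t in times)

def maximize(func, lo, hi, tol=1e-10):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if func(c) > func(d):
            b = d          # maximum lies in [a, d]
        else:
            a = c          # maximum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

# Hypothetical data: 500 simulated flight-times with true mean lifetime 2.0
random.seed(1)
true_tau = 2.0
times = [random.expovariate(1.0 / true_tau) for _ in range(500)]

# Numerical ML estimate; the interval [0.1, 20] is an assumed admissible range
tau_hat = maximize(lambda tau: log_likelihood(tau, times), 0.1, 20.0)
sample_mean = sum(times) / len(times)

# Finite-difference second derivative of ln L at the maximum (should be negative)
h = 1e-3
d2 = (log_likelihood(tau_hat + h, times)
      - 2.0 * log_likelihood(tau_hat, times)
      + log_likelihood(tau_hat - h, times)) / h ** 2

print("numerical ML estimate:", tau_hat)
print("arithmetic mean      :", sample_mean)
print("second derivative    :", d2)
```

For this p.d.f. the numerical maximum should agree with the arithmetic mean of the flight-times, which is the analytical solution of the likelihood equation.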
9.1.1 Example: Estimate of mean lifetime

Let us assume that we observe the production as well as the decay of neutral kaons in an infinite detector. The p.d.f. is then

$$f(t|\tau) = \frac{1}{\tau}\, e^{-t/\tau},$$

where $\tau$ is the $K^0$ mean lifetime to be estimated. From measurements of the $K^0$ momentum and the length between the production and the decay points, the proper flight-time $t_i$ for each event is determined. For n observed events the LF is, according to the definition eq.(9.1),

$$L(\mathbf{t}|\tau) = \prod_{i=1}^{n} \frac{1}{\tau}\, e^{-t_i/\tau},$$

and from eq.(9.6) the ML estimate $\hat\tau$ of $\tau$ is found by solving the equation

$$\frac{\partial \ln L}{\partial\tau} = \sum_{i=1}^{n}\left(-\frac{1}{\tau} + \frac{t_i}{\tau^2}\right) = 0 .$$

This gives

$$\hat\tau = \frac{1}{n}\sum_{i=1}^{n} t_i = \bar{t} .$$

Hence the ML estimate of the parameter $\tau$ is equal to the arithmetic mean of the observed flight-times. The solution obtained does correspond to a maximum of the LF, because

$$\left.\frac{\partial^2 \ln L}{\partial\tau^2}\right|_{\tau=\hat\tau} = -\frac{n}{\hat\tau^2} < 0,$$

in accordance with the requirement of eq.(9.7).

Exercise 9.1: Instead of having the lifetime $\tau$ as a parameter in the example above, one can alternatively use the decay constant $\lambda = 1/\tau$ as the parameter with the p.d.f. $f(t|\lambda) = \lambda e^{-\lambda t}$. Show that the ML estimate of $\lambda$ is

$$\hat\lambda = \frac{1}{\bar{t}} .$$

Exercise 9.2: Assume that we observe the production and decay of particles within a finite detector. The p.d.f. is then (compare Sect.6.3.1)

$$f(t_i|\tau) = \frac{\frac{1}{\tau}\, e^{-t_i/\tau}}{1 - e^{-T_i/\tau}}, \qquad 0 \le t_i \le T_i .$$

Show that the ML estimate for the parameter $\tau$ can be expressed as

$$\hat\tau = \bar{t} + \frac{1}{n}\sum_{i=1}^{n} \frac{T_i\, e^{-T_i/\hat\tau}}{1 - e^{-T_i/\hat\tau}},$$

where $T_i$ is the potential flight-time for the i-th event. Note that $\hat\tau$ enters on the right hand side in the correction term to the arithmetic mean $\bar{t}$. The solution can be obtained by an iteration procedure, taking $\bar{t}$ as the starting value.

9.2 ESTIMATION OF PARAMETERS IN THE NORMAL DISTRIBUTION

To illustrate further the ML estimation of unknown parameters we will consider some useful examples where the LF is written in terms of normal probability density functions.

9.2.1 Estimation of $\mu$; measurements with common error

Let $x_1, x_2, \ldots, x_n$ be n independent measurements on the same unknown quantity $\mu$, and assume that the measurement error is $\sigma$, common to all observations. We assume the x's to constitute a sample of size n drawn from a normal population $N(\mu,\sigma^2)$, where $\mu$ is unknown, $\sigma^2$ known.

To estimate the parameter $\mu$ by the Maximum-Likelihood method we demand the maximum of

$$L(\mathbf{x}|\mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right)$$

when L is considered a function of $\mu$. In this case eq.(9.6) reads

$$\frac{\partial \ln L}{\partial\mu} = \sum_{i=1}^{n} \frac{x_i-\mu}{\sigma^2} = 0$$

and the solution is

$$\hat\mu = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x} . \tag{9.10}$$

Therefore, the ML estimate of the population mean $\mu$ is equal to the sample mean $\bar{x}$.

9.2.2 Estimation of $\mu$; measurements with different errors (weighted mean)

We assume now that the measurements $x_1, x_2, \ldots, x_n$ on the unknown quantity $\mu$ have different, but still known errors. If each measurement $x_i$ is
normally distributed with measuring error $\sigma_i$*), the LF is

$$L(\mathbf{x}|\mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma_i^2}\right).$$

The ML estimate for $\mu$ becomes in this situation

$$\hat\mu = \frac{\sum_{i=1}^{n} x_i/\sigma_i^2}{\sum_{i=1}^{n} 1/\sigma_i^2}, \tag{9.11}$$

which is called the weighted mean of the observations. We note that the measurements are given weight in inverse proportion to the square of their errors, i.e. they are weighted in proportion to their precisions $1/\sigma_i^2$. In the case that the measurements have the same error, $\sigma_i = \sigma$, it is seen that eq.(9.11) reduces to eq.(9.10), as it should.

It will be shown in Sect.9.5.4 that the variance on the estimate $\hat\mu$ of eq.(9.11) is given by

$$V(\hat\mu) = \frac{1}{\sum_{i=1}^{n} 1/\sigma_i^2} . \tag{9.12}$$

*) Note that the $x_i$ originate from different parent populations, and hence do not constitute a sample in the usual sense.

Exercise 9.3: Show that the ML estimate of $\sigma^2$ in $N(\mu,\sigma^2)$ for given $\mu$ is

$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\mu)^2 .$$

9.2.3 Simultaneous estimation of mean and variance

It is frequently not realistic to consider the errors connected to the measurements as known quantities. As an example, suppose we have measured the range of monoenergetic particles in some material. The measurements for n events are $x_1, x_2, \ldots, x_n$, and this sample is supposed to originate from a normal population $N(\mu,\sigma^2)$. Here $\mu$ is the true, but unknown mean range, and $\sigma$ the combined straggling and measuring error, which is also not known, but assumed to be common for all measurements.

To estimate both $\mu$ and $\sigma^2$ by the ML method we write

$$\ln L = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n} (x_i-\mu)^2 .$$

From eq.(9.8) we should now solve the set of two simultaneous equations

$$\frac{\partial \ln L}{\partial\mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n} (x_i-\mu) = 0$$

and

$$\frac{\partial \ln L}{\partial\sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n} (x_i-\mu)^2 = 0,$$

from which the ML estimates are

$$\hat\mu = \bar{x}, \qquad \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\bar{x})^2 .$$

We note that the estimate of $\mu$ is equal to $\bar{x}$, as we found in Sect.9.2.1. The ML estimate of $\sigma^2$ is biassed, the relation to the unbiassed estimator $s^2 = \frac{1}{n-1}\sum (x_i-\bar{x})^2$ of $\sigma^2$ being $\hat\sigma^2 = \frac{n-1}{n}\, s^2$; compare Sect.8.4.1.

The errors on $\hat\mu$ and $\hat\sigma^2$ will be derived in Sect.9.5.5.

Exercise 9.4: Show that the estimates $\hat\mu$, $\hat\sigma^2$ correspond to a maximum of the LF.

9.3 ESTIMATION OF THE LOCATION PARAMETER IN THE CAUCHY P.D.F.

In the examples of the previous sections the ML solution for the parameters turned out to be unique. As an example where eq.(9.6) may have several solutions, consider the estimation of the parameter $\theta$ in the Cauchy distribution

$$f(x|\theta) = \frac{1}{\pi}\,\frac{1}{1+(x-\theta)^2} .$$

With the measurements $x_1, x_2, \ldots, x_n$ we have the LF

$$L(\mathbf{x}|\theta) = \prod_{i=1}^{n} \frac{1}{\pi}\,\frac{1}{1+(x_i-\theta)^2}$$

and eq.(9.6) becomes

$$\sum_{i=1}^{n} \frac{2(x_i-\theta)}{1+(x_i-\theta)^2} = 0 .$$

This is an equation of degree (2n-1) in $\theta$, and up to (2n-1) different solutions exist, at most n of which correspond to maxima of the LF. Usually the best value, corresponding to the highest maximum of L, is near to the sample median. The median of the sample may therefore be taken as the starting value in an iterative search for the maximum of L.

9.4 PROPERTIES OF MAXIMUM-LIKELIHOOD ESTIMATORS

The use of the ML method presupposes that the p.d.f. is specified except for the unknown parameters. It may often not be possible to find explicit analytical expressions for the estimators, but by maximizing the likelihood function using iteration or interpolation procedures the estimates themselves can be obtained, irrespective of the analytical form of the estimators. Moreover, as will be shown in the following, the ML estimators possess nearly all the theoretically best estimator properties.

9.4.1 Invariance under parameter transformation

In practice it is often rather arbitrary what physical quantity is chosen as the parameter $\theta$ to be estimated. For example, in the lifetime determination in Sect.9.1.1 we could have used the decay constant $\lambda$ as parameter instead of the mean lifetime $\tau$. Let $\hat\theta$ be the ML solution for the parameter $\theta$. If we had chosen to estimate a function of $\theta$, say $\tau(\theta)$, then the ML solution for this function would be that value $\hat\tau$ for which $\partial L/\partial\tau = 0$. Since $\partial L/\partial\theta = (\partial L/\partial\tau)(\partial\tau/\partial\theta)$ for all $\theta$, and $\partial L/\partial\theta = 0$ for $\theta = \hat\theta$, it follows when $\partial\tau/\partial\theta \ne 0$ that $\partial L/\partial\tau = 0$ for $\theta = \hat\theta$; hence we must have

$$\hat\tau = \tau(\hat\theta) . \tag{9.13}$$

Eq.(9.13) expresses the invariance of ML estimates under parameter transformations. We see that it makes no difference to the result for the estimate $\hat\theta$ which variable, $\theta$ or the function $\tau(\theta)$, is used to maximize the LF. Thus the ML method is free from the arbitrariness discussed in connection with Bayes' Postulate, Sect.2.4.5.

An example of the invariance property has already been demonstrated by Sect.9.1.1 and Exercise 9.1.

9.4.2 Consistency

It can be shown that under very general conditions the ML estimators are consistent. This means that the ML estimates will converge towards the true parameter values when the sample size increases. This applies to the single-parameter as well as to the multi-parameter case. However, it may sometimes happen that the LF has two or more suprema due to a specific form of the p.d.f. which essentially makes the parameter indeterminable.

9.4.3 Unbiassedness

In favourable situations the ML estimators turn out to be unbiassed, irrespective of the size of the sample. This means that for any n, the estimates will be distributed with a mean equal to the true parameter value. For example, it is easily verified that the ML estimator $\hat\tau = \frac{1}{n}\sum t_i$ of $\tau$ in the p.d.f. $f(t|\tau) = \frac{1}{\tau}\, e^{-t/\tau}$ (Sect.9.1.1) is unbiassed, since

$$E(\hat\tau) = \frac{1}{n}\sum_{i=1}^{n} E(t_i) = \tau,$$

in accordance with the requirement for an unbiassed estimator, eq.(8.3).

The ML estimators are frequently not unbiassed for finite samples. For instance, the ML estimator of $\sigma^2$ in $N(\mu,\sigma^2)$ is $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\bar{x})^2$ (see Sect.9.2.3), and this is a biassed estimator of $\sigma^2$, because

$$E(\hat\sigma^2) = \frac{n-1}{n}\,\sigma^2 .$$

The bias is here $-\sigma^2/n$, which is negligible for large n.

From the invariance property of ML estimators, we know that for a function $\tau(\theta)$ of the parameter, $\hat\tau = \tau(\hat\theta)$, but for expectation values in general we have

$$E(\tau(\hat\theta)) \ne \tau(E(\hat\theta)) .$$

Therefore, although $\theta$ may have an unbiassed estimator, $\tau(\theta)$ need not.

In the asymptotic limit of infinite samples ML estimators are unbiassed.
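The finite-sample bias of the ML variance estimate in the normal distribution can be checked with a small simulation. The sketch below is only an illustration under assumed values (sample size n = 5, true mean 10, true variance 4, 20000 repeated experiments); the helper name `ml_variance` is hypothetical. It compares the average of the ML estimates with the expectation (n-1)/n times the true variance.

```python
import random

def ml_variance(sample):
    """ML estimate of the variance: (1/n) * sum((x_i - mean)^2)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n

# Assumed illustration values, not taken from the text
random.seed(12345)
n = 5                   # size of each sample
sigma2 = 4.0            # true variance
sigma = sigma2 ** 0.5
n_experiments = 20000   # number of repeated experiments

estimates = [
    ml_variance([random.gauss(10.0, sigma) for _ in range(n)])
    for _ in range(n_experiments)
]
mean_estimate = sum(estimates) / n_experiments
expected = (n - 1) / n * sigma2   # expectation of the ML estimate

print("average ML estimate of the variance:", mean_estimate)
print("(n-1)/n * true variance            :", expected)
print("true variance                      :", sigma2)
```

With a sample as small as n = 5 the average of the estimates falls visibly below the true variance; increasing n in the sketch makes the difference shrink as 1/n.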

Exercise 9.5: Show that the ML estimator of the decay constant $\lambda$ in the p.d.f. $f(t|\lambda) = \lambda e^{-\lambda t}$ is biassed. Is the estimator consistent?

Exercise 9.6: Show that the ML estimates of $\mu$ and $\sigma$ of the normal distribution $N(\mu,\sigma^2)$ are $\bar{x}$ and $\hat\sigma = \left(\frac{1}{n}\sum (x_i-\bar{x})^2\right)^{1/2}$, respectively. It has been shown in Sect.8.3 and Sect.8.4 that $\bar{x}$ is a consistent and unbiassed estimator of $\mu$. Show that the estimator of $\sigma$ is biassed, but consistent. Compare the estimate $\hat\sigma$ with the estimate of $\sigma^2$ derived in Sect.9.2.3 and note that this provides another example on the invariance of ML estimators.

9.4.4 Sufficiency

It was stated in Sect.8.6 that if the likelihood function can be factorized as

$$L(\mathbf{x}|\theta) = G(t|\theta)\, H(\mathbf{x}), \tag{8.25}$$

where t is a function of the observations, $t = t(x_1, x_2, \ldots, x_n)$, then t is a sufficient statistic for $\theta$. This entails that t contains all information in the measurements regarding the parameter $\theta$. A condition for a sufficient statistic to exist is that the p.d.f. $f(x|\theta)$ can be written in the exponential form

$$f(x|\theta) = \exp\left[B(\theta)\,C(x) + D(\theta) + E(x)\right],$$

where B, C, D, E are functions of the indicated arguments.

Since estimating $\theta$ from $L(\mathbf{x}|\theta)$ of eq.(8.25) is tantamount to estimating $\theta$ from $G(t|\theta)$, it follows that the estimate depends on the observations through the sufficient statistic t alone. This also means that if there exists a sufficient estimator for the parameter $\theta$, the likelihood equation will produce it. It can be shown that sufficient estimators will have the minimum attainable variance. This means that if a sufficient estimator exists, the ML method produces the estimate with minimum attainable variance. This fact is probably the strongest argument in favour of the ML method.

The likelihood function of our example from Sect.9.1.1 can be written

$$L(\mathbf{t}|\tau) = \tau^{-n} \exp\left(-\frac{1}{\tau}\sum_{i=1}^{n} t_i\right),$$

which is explicitly of the form (8.25). This demonstrates that $\frac{1}{n}\sum t_i$, or any one-to-one function of $\bar{t}$, is a sufficient estimator of $\tau$. We have already shown that the ML estimator of $\tau$ is just $\bar{t}$.

The present remarks on sufficiency for ML estimators also apply to the multi-parameter case. For the case with k unknown parameters it can be proved that $r \le k$ sufficient estimators $t_1, t_2, \ldots, t_r$ can simultaneously have their minimum attainable variance.

9.4.5 Efficiency

The variance of an ML estimator can not be arbitrarily small. The necessary and sufficient condition that an efficient, or minimum variance bound (MVB), estimator t exists for the parameter $\theta$, is that one can write (Sect.8.5)

$$\frac{\partial \ln L}{\partial\theta} = A(\theta)\,\bigl(t - \theta - b(\theta)\bigr), \tag{8.19}$$

where $b(\theta)$ is the bias of the estimator. Since the ML estimate $\hat\theta$ for $\theta$ is obtained by equating $\partial \ln L/\partial\theta$ to zero, it follows that the likelihood equation produces the efficient estimator whenever there is one.

The MVB for an efficient estimator t of $\theta$ is given by the Cramér-Rao limit (compare eqs.(8.17), (8.18), (8.20)),

$$V(t) = \frac{\left(1 + \partial b/\partial\theta\right)^2}{E\left(-\partial^2 \ln L/\partial\theta^2\right)} = \frac{1 + \partial b/\partial\theta}{A(\theta)} .$$

For example, to find an efficient estimator for $\tau$ in the p.d.f. $f(t|\tau) = \frac{1}{\tau}\, e^{-t/\tau}$ (Sect.9.1.1), it must be possible to write $\partial \ln L/\partial\tau = A(\tau)\,(\hat\tau - \tau - b(\tau))$. We find

$$\frac{\partial \ln L}{\partial\tau} = -\frac{n}{\tau} + \frac{1}{\tau^2}\sum_{i=1}^{n} t_i = \frac{n}{\tau^2}\,(\bar{t} - \tau),$$

which is of the form of eq.(8.19), with $A(\tau) = n/\tau^2$, $b(\tau) = 0$. Therefore, the ML estimator $\hat\tau = \frac{1}{n}\sum t_i$ is an unbiassed and efficient estimator of $\tau$, with variance $V(\hat\tau) = 1/A(\tau) = \tau^2/n$.

As stated in Sect.8.6 there is among all sufficient estimators of $\theta$ only one statistic t which will estimate some function $\tau(\theta)$ with variance equal to the MVB. The MVB can be found if one can write

be more than one maxi- i o t h e present ease, end hence t h e s o l u t i o n e i a unique.


The proof of uniqueness can i n f a c t be extended t o a l l cesea where a s i n g l e
hen (compare eqs. (8. l l ) , (8.13),(8.15)), sufficient s t a t i s t i c exists. I f t h e r e is no s u f f i c i e n t s t a t i s t i c . then t h e HL
e s t i m a t o r i s not n e c e s s a r i l y unique.
For k parameters, when t h e r e is s s e t of k j o i n t l y s u f f i c i e n t s t a t i -
s t i c s , t h e s o l u t i o n of the s e t of l i k e l i h o o d equatione i a unique, provided t h a t
the usual r e g u l a r i t y conditions are s a t i s f i e d .
For example, from eq.(8.14) it i s e a s i l y seen t h a t t h e r e can be no
Harever, t h e r e i s an unbiassed and e f f i -
c i e n t estimator t of r(o) -
e f f i c i e n t e s t i m a t o r of o i n e ( 0 , 0 2 ) .
02, namely t -:
i!lxi, f o r which V(t) - 2oUln
9.4.7 Asymptotic normality of HL e s t i m t o r s
Let us consider again the one-parameter case w i t h p.d.f.
given n observations ~ 1 ~ x 2 . x,. ....
f(xl8)
(see Sect.8.5.3).
. We w i l l o u t l i n e t h e proof t h a t f o r l a r g e
A s w i l l be s h a m i n Sect.9.4.7 the HL e s t i m a t e 8 of 8, f o r l a r g e n.
i s approximately normally d i s t r i b u t e d about t h e t r u e value 8 = E0 w i t h variance "alue 8 -
samples the e s t i m a t e 0 i s asymptotically normally d i s t r i b u t e d sbout t h e t r u e

-
8" with variance equal t o the MVB, provided 1nL i s twiec d i f f e r -
equal t o t h e m, provided t h a t c e r t a i n r e g u l a r i t y conditions hold. This e n t i a b l e i n 8 and t h e range of x is independent of 8.
implies t h a t HL estimators are asymptotically e f f i c i e n t , and hence a l s o asymp- A Taylor expansion about t h e t r u e value
"
e - eo of alnLIa8 at the HL
t o t i c a l l y s u f f i c i e n t , s i n c e whenever e f f i c i e n c y holds s u f f i c i e n c y holds too. estimate 0 = I2 gives

I n t h e multi-parameter ease i t can be s h a m t h a t , f o r l a r g e samples,


t h e ML e s t i m a t e s ill, under ordinary r e g u l a r i t y conditions, tend t o a multi-
normal d i s t r i b u t i o n .

E x e r c i s e 9.7: Explain b y t h e p.d.f. f ( t 11) = AeXt


has no e f f i c i e n t e s t b h
where 8' i s some value between 6 and 8 .
t o r f o r A i t s e l f , but only for-the function r(A) = 111. What is t h e HL e s t b - It has previously been e s t a b l i s h e d i n Sect.8.5 t h a t , when L is s u f f i -
t o r A ? Shov t h a t the m of A is 12/n.

Exercise 9.8: Show t h a t the weighted mean of eq.(9.11)


and e f f i c i e n t e s t i m a t o r of the man U.
~ r o v i d e san unbiassed I c i e n t l y regular.

9.4.6 Uniqueness
It is easy t o see t h a t , i f an e f f i c i e n t e s t i m a t o r e x i s t s f o r some
/ Therefore. from eq.(8.12).
a
f u n c t i o n T(B), then t h e MI estimate e is unique. Differentiating the efficiency
c o n d i t i o n eq. (8.15) w i t h respect t o 8 and i n s e r t i n g 8 = 8 leads t o
I writing

alnLlaE -
i n v i r t u e of eq.(9.16). Thus every s o l u t i o n of the l i k e l i h o o d equation
0 corresponds t o a maximum of t h e LF. Since, f o r a r e g u l a r function.
t h e r e must be a minimum between successive maxima, i t follows t h a t t h e r e cannot
the right hand side is a sum of n independent terms $\partial \ln f(x_i|\theta)/\partial\theta$, which has a mean value of zero and a variance given by eq.(9.18). From the Central Limit Theorem the quantity

$$u = \frac{\partial \ln L/\partial\theta}{\left[E\left(-\partial^2 \ln L/\partial\theta^2\right)\right]^{1/2}} \tag{9.20}$$

at $\theta = \theta_0$ is a standardized variable with asymptotic distribution N(0,1). The quantity

$$v = \frac{\left.\left(-\partial^2 \ln L/\partial\theta^2\right)\right|_{\theta^*}}{E\left(-\partial^2 \ln L/\partial\theta^2\right)},$$

in view of the Law of Large Numbers, is such that, when $n \to \infty$,

$$v \to 1, \tag{9.22}$$

since $\theta^*$ lies between $\hat\theta$ and $\theta_0$, and the ML estimate $\hat\theta$ converges towards $\theta_0$ (consistency). Introducing the asymptotic value of v in eq.(9.17) we obtain

$$(\hat\theta - \theta_0)\left[E\left(-\frac{\partial^2 \ln L}{\partial\theta^2}\right)\right]^{1/2} = \frac{\partial \ln L/\partial\theta}{\left[E\left(-\partial^2 \ln L/\partial\theta^2\right)\right]^{1/2}}, \tag{9.23}$$

where the expressions should be evaluated at $\theta = \theta_0$. The quantity on the right is recognized as u from eq.(9.20), which is asymptotically normal with zero mean and unit variance. Therefore the estimate $\hat\theta$ must be asymptotically normally distributed about the true value with variance equal to the MVB,

$$V(\hat\theta) = \frac{1}{E\left(-\partial^2 \ln L/\partial\theta^2\right)} . \tag{9.24}$$

For the case with k parameters $\theta_1, \theta_2, \ldots, \theta_k$ one can prove that the estimates $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_k$ are asymptotically multinormally distributed about the true parameter values and with covariances given by

$$V^{-1}_{ij}(\hat{\boldsymbol\theta}) = E\left(-\frac{\partial^2 \ln L}{\partial\theta_i\,\partial\theta_j}\right), \qquad i,j = 1, 2, \ldots, k. \tag{9.25}$$

9.4.8 Example: Asymptotic normality of the ML estimator of the mean lifetime

We have in the last sections indicated that the ML estimator of the mean lifetime $\tau$ in the p.d.f. $f(t|\tau) = \frac{1}{\tau}\, e^{-t/\tau}$ has a number of optimum properties: it is unique, consistent, unbiassed, sufficient, efficient, and also invariant under transformation to the decay constant $\lambda = 1/\tau$. The estimator of $\tau$ can in this respect be regarded superior to the estimator of $\lambda$, which is biassed and not efficient.

We shall now explicitly prove that the ML estimate $\hat\tau = \bar{t}$ for large samples is normally distributed about the true value $\tau$ with variance equal to the MVB. The function $f(t|\tau)$ is twice differentiable with respect to $\tau$ and the range of variation for t is independent of $\tau$, so the conditions for asymptotic normality are fulfilled. Writing x for the variable instead of t, the p.d.f. $f(x|\tau)$ has the characteristic function

$$\phi_x(t) = (1 - i t \tau)^{-1} .$$

The characteristic function $\phi_{\bar{x}}(t)$ for the variable $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ with the p.d.f. $L(\mathbf{x}|\tau) = \prod_{i=1}^{n} f(x_i|\tau)$ is

$$\phi_{\bar{x}}(t) = E\left(e^{i t \bar{x}}\right) = E\left(\exp\Bigl(i\,\frac{t}{n}\sum_{i=1}^{n} x_i\Bigr)\right).$$

Since the observations are independent this may be written

$$\phi_{\bar{x}}(t) = \left[\phi_x\!\left(\frac{t}{n}\right)\right]^{n} = \left(1 - \frac{i t \tau}{n}\right)^{-n} .$$

When n goes towards infinity, $\phi_{\bar{x}}(t) \to \exp(i t \tau)$, which according to Sect.4.8.4 is the characteristic function for a normal variable with mean value $\tau$ and variance zero. Hence the absolute limiting distribution for the ML estimate $\hat\tau = \bar{x}$ may be interpreted as an infinitely sharp peak at $\tau$.

If we want to say something about the distribution for n large, but not necessarily infinite, we must study the expansion of $\phi_{\bar{x}}(t)$ in more detail. We have

$$\left(1 - \frac{i t \tau}{n}\right)^{-n} = 1 + i t \tau + \tfrac{1}{2}(i t)^2\left(\tau^2 + \frac{\tau^2}{n}\right) + \ldots$$

and may compare with the characteristic function for a normal variable with mean $\mu$ and variance $\sigma^2$,

$$\exp\left(i t \mu - \tfrac{1}{2}\, t^2 \sigma^2\right) = 1 + i t \mu + \tfrac{1}{2}(i t)^2\left(\mu^2 + \sigma^2\right) + \ldots$$

It is seen, therefore, that $\hat\tau = \bar{x}$, for large but finite n, will have a normal distribution with mean value $\tau$ and variance $\tau^2/n$, which is equal to the MVB. Accordingly, the standardized quantity

$$u = \frac{\hat\tau - \tau}{\tau/\sqrt{n}}$$

is N(0,1) for large samples.

9.5 VARIANCE OF MAXIMUM-LIKELIHOOD ESTIMATORS

How to make the best determination of the errors in the Maximum-Likelihood estimates will depend on the properties of the probability density function and on the size of the experimental sample. Since the variances for the estimated parameters are so important, we will explain very explicitly under what conditions the different methods for variance determination should be used.

Some formulae are valid only for large samples, which means that they are exact only in the limit when n goes towards infinity. However, many of these so-called large sample formulae give good approximations also when n is finite but reasonably large. Other formulae are valid for samples of all sizes; these are often referred to as small sample formulae.

When the experimental resolution has been folded into the p.d.f. (see Sect.6.2) the errors calculated from the likelihood function will contain the statistical as well as the experimental uncertainties. If the resolution is not included in the p.d.f., the errors estimated from an ideal theoretical distribution obviously only reflect the statistical uncertainty and not the errors in the measurements. When the estimation of the errors is done directly from the observed data, as for instance by the graphical method described later in Sect.9.6, these errors will clearly contain both the experimental and the statistical uncertainties.

9.5.1 General methods for variance estimation

Let us now regard the likelihood function $L(\mathbf{x}|\boldsymbol\theta) = \prod_{i=1}^{n} f(x_i|\boldsymbol\theta)$ as the joint p.d.f. of the n variables $x_1, x_2, \ldots, x_n$ for the k parameters $\theta_1, \theta_2, \ldots, \theta_k$. If the estimates can be written explicitly as functions of the $x_i$'s, i.e. $\hat\theta_i = \hat\theta_i(x_1, x_2, \ldots, x_n)$, the covariance term between $\hat\theta_i$ and $\hat\theta_j$ may be defined as

$$V_{ij}(\hat{\boldsymbol\theta}) = \int (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)\, L(\mathbf{x}|\boldsymbol\theta)\, d\mathbf{x}, \tag{9.26}$$

where the integration is over all n $x_i$'s and $\boldsymbol\theta = (\theta_1, \theta_2, \ldots, \theta_k)$ represent the true values of the parameters. The formula above can be used to find the covariance matrix from the given $f(x|\boldsymbol\theta)$ alone without having any data available. An example on the use of eq.(9.26) is given in Sect.9.5.2.

With eq.(9.26) an equivalent formula for the covariance term can be obtained where the integrations are to be done with $\hat{\boldsymbol\theta}$ as variables. The joint probability $L(\mathbf{x}|\boldsymbol\theta)\,d\mathbf{x}$ with n variables must be transformed to the joint probability $L'(\hat{\boldsymbol\theta}|\boldsymbol\theta)\,d\hat{\boldsymbol\theta}$ with k variables. Introducing the Jacobian for the chosen transformation and integrating out a number of (n-k) dummy variables the result is

$$V_{ij}(\hat{\boldsymbol\theta}) = \int (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)\, L'(\hat{\boldsymbol\theta}|\boldsymbol\theta)\, d\hat{\boldsymbol\theta}, \tag{9.27}$$

where the new likelihood function L' includes the Jacobian. The transformation from the variables $\mathbf{x}$ to the variables $\hat{\boldsymbol\theta}$ is often complicated, and it may be easier to find the covariance from the former expression eq.(9.26). It should be noted that the analytical integration of formulae (9.26), (9.27) can only be carried out in rather favourable cases. They will lead to the covariances expressed as functions of the true (constant) parameters.

Let us now consider the likelihood function $L(\mathbf{x}|\boldsymbol\theta)$ as a function of $\boldsymbol\theta$ for given $\mathbf{x}$. Since $L(\mathbf{x}|\boldsymbol\theta)$ is normalized to one over the sample space, it is generally not normalized over the parameter space. The variances may then be evaluated from the alternative formula*)

$$V_{ij}(\hat{\boldsymbol\theta}) = \frac{\int (\theta_i - \hat\theta_i)(\theta_j - \hat\theta_j)\, L(\mathbf{x}|\boldsymbol\theta)\, d\boldsymbol\theta}{\int L(\mathbf{x}|\boldsymbol\theta)\, d\boldsymbol\theta}, \tag{9.28}$$

where the integrations are extended over all k parameters. Again, in fortunate situations, it may be possible to carry out parts of the integrations analytically, or remove common factors in the denominator and numerator.

*) Equations (9.28), (9.29) formally consider the LF as providing a measure of the distribution of the variables $\boldsymbol\theta$; a discussion of the conceptual difficulties on this point is deferred to Sect.9.7.

In the general case, approximative values of the covariance between any pair of parameters
is found by numerical integrations of the type

$$V_{ij}(\hat{\boldsymbol\theta}) \approx \frac{1}{C}\sum (\theta_i - \hat\theta_i)(\theta_j - \hat\theta_j)\, L(\mathbf{x}|\boldsymbol\theta)\, \Delta\theta_1 \Delta\theta_2 \cdots \Delta\theta_k, \tag{9.29}$$

where the $\Delta\theta$'s are the proper bin widths for the required integrations over the different parameters, and

$$C = \sum L(\mathbf{x}|\boldsymbol\theta)\, \Delta\theta_1 \Delta\theta_2 \cdots \Delta\theta_k$$

is a common overall normalization factor.

If, in the last formulation, some of the parameters are not integrated over but kept fixed at their estimated values, this will correspond to obtaining conditional errors and covariances for the remaining parameters. This procedure, which gives smaller errors than the preceding approach, is computationally simpler and is, in fact, the standard used in some popular optimization programmes (see Chapter 13). Whenever such a procedure is adopted, this should be explicitly stated to avoid misinterpretation of the numerical results.

9.5.2 Example: Variance of the lifetime estimate

Using eq.(9.26) the variance of the ML estimate $\hat\tau = \frac{1}{n}\sum_{i=1}^{n} t_i$ for the one-parameter p.d.f. $f(t|\tau) = \frac{1}{\tau}\, e^{-t/\tau}$ is

$$V(\hat\tau) = \int (\hat\tau - \tau)^2\, L(\mathbf{t}|\tau)\, d\mathbf{t} .$$

This can be written

$$V(\hat\tau) = \int \cdots \int \left(\frac{1}{n}\sum_{i=1}^{n} t_i - \tau\right)^2 \prod_{i=1}^{n} \frac{1}{\tau}\, e^{-t_i/\tau}\, dt_i,$$

where the integrations are from zero to infinity for all the n variables $t_i$. A straightforward computation gives

$$V(\hat\tau) = \frac{\tau^2}{n} .$$

This $V(\hat\tau)$ is the same as the MVB derived in Sect.9.4.5 for the efficient estimator $\bar{t}$.

The alternative formula, eq.(9.28), gives

$$V(\hat\tau) = \frac{\int_0^\infty (\tau - \hat\tau)^2\, \tau^{-n}\, e^{-n\hat\tau/\tau}\, d\tau}{\int_0^\infty \tau^{-n}\, e^{-n\hat\tau/\tau}\, d\tau},$$

which leads to

$$V(\hat\tau) = \hat\tau^2\, \frac{n+6}{(n-2)(n-3)} .$$

Thus, only for infinitely large n do the results of the two procedures, eqs.(9.26) and (9.28), coincide.

Exercise 9.9: Consider the p.d.f. $f(t|\lambda) = \lambda e^{-\lambda t}$. Show, using the approximation of eq.(9.28), that the variance of the ML estimate $\hat\lambda$ is

$$V(\hat\lambda) = \hat\lambda^2\, \frac{n+2}{n^2} .$$

Compare this with the result of Exercise 9.7.

Exercise 9.10: Show, using eq.(9.26), that the variance $V(\hat\mu)$ of the ML estimate $\hat\mu = \bar{x}$ of the mean in the normal distribution $N(\mu,\sigma^2)$ is equal to the MVB (Sect.8.5.2). What is the result obtained from eq.(9.28)?

Exercise 9.11: From eq.(9.26), find the variance $V(\hat\sigma^2)$ of the ML estimate $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\mu)^2$ of the parameter $\sigma^2$ in $N(\mu,\sigma^2)$. (Compare Exercise 9.3.)

Exercise 9.12: Find, using eq.(9.28), the covariance matrix of the simultaneous ML estimates $\hat\mu = \bar{x}$, $\hat\sigma^2 = \frac{1}{n}\sum (x_i-\bar{x})^2$ of the mean and variance in $N(\mu,\sigma^2)$ (Sect.9.2.3). Hint: The LF can be written

$$L = (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{n\left[\hat\sigma^2 + (\bar{x}-\mu)^2\right]}{2\sigma^2}\right).$$

9.5.3 Variance of sufficient ML estimators

The formulae given in Sect.9.5.1 are generally valid for all ML estimators, irrespective of the sample size. In specific cases more convenient formulae may be developed, and we turn now to the situation where the ML estimators are sufficient.

When the p.d.f. $f(x|\theta)$ provides a single sufficient statistic, and consequently an efficient estimator, for the parameter $\theta$, we have already seen in Sect.9.4.5 that the variance of the ML estimator is given by the minimum variance bound,

i Exercise 9.13:

mate u -
It was ah- in Chapter 8 that ;is an unbiased and efficient
estimator of v in the norms1 distribution. Pind the variance of the ML eeti-
f from eq.(9.31).

Exercise 9.14: The distribution of gap lengthn x between adjacent bubble.


formed along the tracks of charged particles in a bubble c h d e r is given by
where b(B) is the bias of the estimator. tlarever, it is not necessary to evalu-
ate the expectation value of -a21nLlae2 in this case. From the efficiency eon-
f(.lg) - g =-gx. oi.<-.
The mean gap length l/g is often used as a measure of the ionization. Show
dition
A(l/g)/(l/g) -
that the relative statistical uncertainty for n measured gap lengths is
116.
-
alnL
ae
- (t-e-b(0)) (8.19)
9.5.4 Example: Variance of the weighted mean
one finds In Sect.9.2.2 we found that the HL estimate for the u n L n w n mean v of
the normally distributed observations xl,x,,...,n with errors 01,02,...,0 is

given by the weighted m a n of the x .


1'

Thus,

or, for an unbiassed estimator simply Since the weighted mean is an unbiassed and efficient t4L estimator of J! (compare
Exercise 9.8) and

It should be noted that these relations hold for small samples as well as for
large samples, and in particular eq.(9.31) is very useful in practice. the variance can be found from eq. (9.31).
In the multi-parameter ease the situation is not so simple. If, hov-
ever, there exists a set of k jointly sufficient statistics t,,t,, ....tk for the
k parameters e1.e2.....ek, it can be s h n m that the inverse of the covariance
matrix of the I 5 estimates in t m g e somptes is given by
When the errors oi are all equal, a.
known expression for the error on the mean. A;
-- o, eq.(9.12)
016.
leads to the well-
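As a numerical illustration of this example, the following sketch assumes the standard forms of the two relations involved: the weighted mean $\hat\mu = \sum_i (x_i/\sigma_i^2) / \sum_i (1/\sigma_i^2)$ and, via eq.(9.31), the variance $V(\hat\mu) = 1/\sum_i (1/\sigma_i^2)$. The function name and the sample numbers are hypothetical and serve only to show that equal errors reproduce $\sigma/\sqrt{n}$.

```python
import math

def weighted_mean(xs, sigmas):
    """Return the weighted mean and its standard error.

    Assumes independent, normally distributed measurements xs with
    known errors sigmas (hypothetical helper, not taken from the text).
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    wsum = sum(weights)                       # equals -d2(lnL)/d(mu)2
    mu_hat = sum(w * x for w, x in zip(weights, xs)) / wsum
    return mu_hat, math.sqrt(1.0 / wsum)      # variance = 1 / wsum

# Made-up measurements, all with the same error sigma = 0.4
xs = [10.2, 9.8, 10.5, 10.1]
mu_hat, err = weighted_mean(xs, [0.4] * len(xs))
print(mu_hat, err)   # err reduces to sigma / sqrt(n) for equal errors
```

With unequal errors the same function gives more weight to the precise measurements, and the quoted error is always smaller than the smallest individual $\sigma_i$.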

Eq.(9.32) is a natural generalization of eq.(9.31). In situations where we do not know k jointly sufficient statistics for the k parameters and where the number of observations is not large, the covariance matrix of the ML estimates may still be found by one of the general methods described in Sect.9.5.1.

9.5.5 Example: Errors in the ML estimates of $\mu$ and $\sigma^2$ in $N(\mu,\sigma^2)$

In Exercise 9.12 we suggested that the approximate relation eq.(9.28) be used to find the covariance matrix of the ML estimates $\hat\mu$ and $\hat\sigma^2$ of the mean and variance in the normal distribution $N(\mu,\sigma^2)$. Instead of carrying out the integration of the likelihood function over the parameter space a much easier way to obtain the covariance is to use the fact that $\bar{x}$ and $\frac{1}{n}\sum_i (x_i-\bar{x})^2$
are two jointly sufficient statistics for the parameters in the normal p.d.f. (compare Sect.8.6.4). It is therefore appropriate to apply eq.(9.32) in the present case, provided n is large. We have now

$$\ln L = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2,$$

so that the elements of the inverse of the covariance matrix $V(\hat\mu,\hat\sigma^2)$ become, from eq.(9.32),

$$(V^{-1})_{11} = -\frac{\partial^2\ln L}{\partial\mu^2} = \frac{n}{\sigma^2}, \qquad (V^{-1})_{22} = -\frac{\partial^2\ln L}{\partial(\sigma^2)^2} = \frac{n}{2\sigma^4}, \qquad (V^{-1})_{12} = -\frac{\partial^2\ln L}{\partial\mu\,\partial\sigma^2} = 0,$$

where it is understood that the expressions should be evaluated for $\mu = \hat\mu$, $\sigma^2 = \hat\sigma^2$. Accordingly, the inverse of the covariance matrix is

$$V^{-1}(\hat\mu,\hat\sigma^2) = \begin{pmatrix} n/\hat\sigma^2 & 0 \\ 0 & n/(2\hat\sigma^4) \end{pmatrix},$$

which can be inverted to give

$$V(\hat\mu,\hat\sigma^2) = \begin{pmatrix} \hat\sigma^2/n & 0 \\ 0 & 2\hat\sigma^4/n \end{pmatrix}. \qquad (9.33)$$

It is interesting to note that this asymptotic covariance matrix for the ML estimates $\hat\mu$ and $\hat\sigma^2$ is diagonal. This need not surprise us since we know that $\bar{x}$ $(=\hat\mu)$ and $s^2$ $(=\frac{n}{n-1}\hat\sigma^2)$ from normal samples are independent variables (Sect.4.8.6). Also, since $\bar{x}$ is $N(\mu,\sigma^2/n)$, it is clear that the ML estimate $\hat\mu$ will, for all n, have a variance $\sigma^2/n$, the MVB. From Sect.5.1.6 we know that $(n-1)s^2/\sigma^2$, and hence $n\hat\sigma^2/\sigma^2$, is distributed according to $\chi^2(n-1)$ and has a variance $2(n-1)$. Therefore,

$$V(\hat\sigma^2) = \left(\frac{\sigma^2}{n}\right)^2 V\!\left(\frac{n\hat\sigma^2}{\sigma^2}\right) = \left(\frac{\sigma^2}{n}\right)^2 2(n-1) = \frac{2\sigma^4(n-1)}{n^2},$$

which holds strictly for all n. This is close to the result $2\sigma^4/n$ implied by the asymptotic formula (9.33).

9.5.6 Variance of large-sample ML estimators

It was shown in Sect.9.4.7 that the ML estimate, under very general conditions, will be asymptotically normally distributed about the true value and with variance equal to the MVB. The MVB formulae eqs.(9.31) and (9.32) can then be reformulated in terms of the p.d.f. and used, on the planning stage of the experiment, to predict the errors that must be expected.

For the one-parameter p.d.f. $f(x|\theta)$ the negative of the second derivative of $\ln L$ with respect to $\theta$ can generally be expressed as

and the expectation, or average, of this variable is

For the first part of the integrand on the right-hand side we have

and, provided the integration limits for x are independent of $\theta$, therefore

Substituting this expression in eq.(9.31) we find the following convenient formula for the average of the inverse of the variance,

$$\overline{V^{-1}(\hat\theta)} = n\int \frac{1}{f}\left(\frac{\partial f}{\partial\theta}\right)^2 dx. \qquad (9.34)$$

In the multi-parameter case the corresponding formula for the covariance terms, expressed by the p.d.f., is

$$\overline{V^{-1}_{ij}(\hat{\boldsymbol\theta})} = n\int \frac{1}{f}\,\frac{\partial f}{\partial\theta_i}\,\frac{\partial f}{\partial\theta_j}\,dx. \qquad (9.35)$$

The fact that the ML estimate is asymptotically normally distributed about the true parameter value can, in view of the formal symmetry between variable and mean value in the normal p.d.f., be formally expressed as the
parameter being asymptotically normally distributed about the ML estimate, with a spread around this mean value as implied by the MVB. Writing the LF for the one-parameter case as

we find at once the simple relationship

(9.36)

which should be compared to eq.(9.31).

Clearly, a normal-shaped LF corresponds to a parabolic dependence of $\ln L$ on $\theta$, and a constant second derivative.

In the multi-parameter case the LF has asymptotically a multinormal shape. The covariance matrix for $\hat{\boldsymbol\theta}$ is found by inversion of the matrix with elements given by the constants

(9.37)

9.5.7 Example: Planning of an experiment; polarization (1)

In the preceding sections several methods have been presented on how to find the variances of ML estimates. Some of these methods can only be used after the data have been collected, while others may be applied already on the planning stage, prior to the data taking.

Suppose we want to study the polarization of antiprotons in antiproton-proton elastic scattering from a double scattering experiment measuring the angle $\phi$ between the normals of the scattering planes. The p.d.f. is

$$f(x|\alpha) = \tfrac{1}{2}(1+\alpha x), \qquad -1 \le x \le 1, \qquad (9.38)$$

where $x = \cos\phi$, $\alpha = P^2$. We ask: How many events will be needed to obtain a prescribed accuracy on the estimate of $\alpha$?

For large n the variance of $\hat\alpha$ can be calculated from the asymptotic formula, eq.(9.34). We first evaluate the integral

$$\int_{-1}^{+1}\frac{1}{f}\left(\frac{\partial f}{\partial\alpha}\right)^2 dx = \frac{1}{2}\int_{-1}^{+1}\frac{x^2}{1+\alpha x}\,dx = \frac{1}{2\alpha^3}\bigl(\ln(1+\alpha)-\ln(1-\alpha)-2\alpha\bigr).$$

Hence, from eq.(9.34),

$$V(\hat\alpha)^{-1} = \frac{n}{2\alpha^3}\bigl(\ln(1+\alpha)-\ln(1-\alpha)-2\alpha\bigr).$$

When $\alpha \ll 1$ this can be written

$$V(\hat\alpha) \approx \frac{3}{n}. \qquad (9.40)$$

If, prior to the experiment, $\alpha$ is assumed to be approximately 0.1 and we wish to determine the parameter with an uncertainty $\Delta\alpha = 0.01$, we find from eq.(9.40) that about $3\cdot10^4$ events will be needed.

Exercise 9.15: In the preceding example, what can be said about $V(\hat\alpha)$ if n is not very large?

9.5.8 Example: Planning of an experiment; density matrix elements (1)

We will now look at a specific example to illustrate how the ML method can be applied to problems where the p.d.f. has a physical origin rather than being of a standard mathematical type.

Consider a meson resonance with spin parity $J^P = 1^-$ decaying into two pseudoscalar mesons with $J^P = 0^-$. The angular distribution of the decay particles, described in the Jackson reference system by the polar angle $\theta$ and the azimuth angle $\phi$, is

(9.41)

where f is properly normalized, since

For each event there are two measured quantities, $\cos\theta_i$ and $\phi_i$. The density matrix elements to be estimated are $\rho_{00}$, $\rho_{1,-1}$, and $\mathrm{Re}\,\rho_{10}$. The three simultaneous likelihood equations are of order n in the parameters, and the ML estimates can therefore not be found analytically. Hence a numerical procedure is necessary to find the estimates $\hat\rho_{00}$, $\hat\rho_{1,-1}$, $\mathrm{Re}\,\hat\rho_{10}$ and their errors. We may
now ask: Can anything be said about the covariance matrix of the parameters before data are available? It can be verified that for the distribution (9.41) there exists no set of jointly sufficient statistics for the three parameters, or for any combination of two of them. Nor is there a single sufficient statistic for any of the parameters alone. For small samples of data, therefore, little can be said about the errors from the theoretical p.d.f. For large samples, however, the asymptotic covariance terms may in principle be calculated from eq.(9.35).

Alternatively to the above description of the density matrix elements in the Jackson reference system, the spin structure can be described in the so-called dynamic reference system where the "observable" part of the density matrix is diagonal. For vector particles the density matrix can be diagonalized by rotating the Jackson reference system a certain angle $\theta$ about the y-axis. The three independent parameters in the dynamic reference system are called $\alpha$, $\beta$, and $\theta$. The decay distribution is of the form

(9.42)

where $\mathbf{e}$ is a unit vector along the decay direction, and $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are unit vectors along the axes of the dynamic reference system. In terms of the polar angles in the Jackson frame and the rotation angle $\theta$ this distribution can be expressed as

(9.43)

The ML properties of this p.d.f. are the same as those of the p.d.f. eq.(9.41). In particular the solutions for the ML estimates $\hat\alpha$, $\hat\beta$, and $\hat\theta$ have to be found numerically by some optimization procedure.

We shall in Sect.11.2.3 discuss how the parameters in eq.(9.41) can alternatively be obtained by the moments method for parameter estimation.

Exercise 9.16: Show that neither of the two marginal distributions obtained by integrating the p.d.f. (9.41) over $\phi$ and $\theta$, respectively, will provide any sufficient statistic or any set of jointly sufficient statistics for the unknown parameter(s). Show that the same situation applies for the p.d.f. (9.43).

9.6 GRAPHICAL DETERMINATION OF THE MAXIMUM-LIKELIHOOD ESTIMATE AND ITS ERROR

In many practical problems neither the ML estimate nor its variance can be found analytically. The numerical behaviour of $L(\mathbf{x}|\theta)$ as a function of $\theta$ can, however, be used to determine the estimate $\hat\theta$ as well as its error $\Delta\hat\theta$ graphically when the number of parameters is limited to one or two.

9.6.1 The one-parameter case

When the estimation problem involves a single parameter, one can simply plot the likelihood L for the given observations as a function of $\theta$ and read off the ML estimate $\hat\theta$ from the graph as that particular value of $\theta$ for which the curve peaks. Except for rare situations the curve will have a single maximum, and hence a unique solution for the ML estimate $\hat\theta$. If more than one maximum shows up within the physically admissible range of $\theta$, one will usually take $\hat\theta$ as that value of $\theta$ which corresponds to the highest maximum.

Fig. 9.1. Graphical determination of the ML estimate $\hat\theta$ and its error from the one-parametric likelihood function; (a) a symmetric, normal (Gaussian) LF, (b) an unsymmetric LF.

In the case of a single maximum, or one dominant maximum well separated from other smaller maxima, one deduces the error in the ML estimate $\hat\theta$ by looking up the values of $\theta$ for which L has fallen to a fraction $e^{-0.5}$ of its maximum value L(max), as indicated by Fig. 9.1. For a strictly normal LF, shown in Fig. 9.1(a), the two values $\theta = \hat\theta \pm \Delta\hat\theta$ will correspond to $\mu \pm \sigma$ for a variable distributed according to $N(\mu,\sigma^2)$. The true value of the parameter
then has a probability 0.683 of being larger than $\hat\theta - \Delta\hat\theta$ and smaller than $\hat\theta + \Delta\hat\theta$.

A strictly normal-shaped LF is never attained in real experiments, where the number of observations is finite. In the general case, with an unsymmetric LF, the two values of $\theta$ for which $L = L(\mathrm{max})\,e^{-1/2}$ will give unequal errors on the upper and lower sides of $\hat\theta$; see Fig. 9.1(b). The probability that the true value of $\theta$ lies between $\hat\theta - \Delta\hat\theta_a$ and $\hat\theta + \Delta\hat\theta_b$ will still be approximately equal to 0.68. This will be discussed further in Sect.9.7.1.

9.6.2 Example: Scanning efficiency (3)

We have on two earlier occasions examined the question of how to determine the efficiency in a scanning procedure when two scans have been performed, for instance to locate specified types of events in a batch of bubble chamber films. In Sect.2.3.11 we saw how the efficiencies of the individual scans as well as the overall scanning efficiency, and thereby the total number of events, could be determined from the number of events detected in two independent scans. Error formulae were derived in Sect.4.1.3. The reasoning in these sections rested on the assumption that the numbers of events were large, and also that the efficiencies were not too small.

We will now see how the Maximum-Likelihood method can be used to estimate the scanning efficiencies and the total number of events from the double-scan data. The procedure, introduced by D.A. Evans and W.H. Barkas, extracts more information from the data at hand than does the conventional method described earlier. Contrary to the earlier method the ML approach is also applicable to experiments with low statistics.

We shall assume, as before, that the film has been subjected to two independent scans, and that in each scan all events have the same probability of being detected. The two scans have led to the following results,

$n_{12}$ events were found in scan 1 as well as in scan 2,
$n_1$ events were found in scan 1 but not in scan 2,
$n_2$ events were found in scan 2 but not in scan 1.

Then,

$N_1 = n_1 + n_{12}$ is the number of events found in scan 1,
$N_2 = n_2 + n_{12}$ is the number of events found in scan 2,
$N_{12} = n_1 + n_2 + n_{12}$ is the total number of events found,
$N$ (unknown) is the total number of events in the film,
$N - N_{12}$ is the number of undetected events in the film.

The scanning process is binomial, since an event is either found by the scanner, or it is not. Therefore, if $\varepsilon_1$ is the probability to observe an event in scan 1, the probability for finding just $N_1$ out of N events in this scan is, according to the binomial distribution law,

$$P_1(N_1|N,\varepsilon_1) = \frac{N!}{N_1!\,(N-N_1)!}\,\varepsilon_1^{N_1}(1-\varepsilon_1)^{N-N_1}. \qquad (9.44)$$

For the second scan, the observed $N_2$ events can be divided into two groups, (I) the $n_{12}$ events that have already been found in scan 1, and (II) the $n_2$ events that have not been recorded in the previous scan. For each of these groups we can apply the binomial distribution law. Thus, for group I the probability to observe $n_{12}$ out of $N_1$ events is

$$P_{\mathrm{I}}(n_{12}|N_1,\varepsilon_2) = \frac{N_1!}{n_{12}!\,(N_1-n_{12})!}\,\varepsilon_2^{n_{12}}(1-\varepsilon_2)^{N_1-n_{12}}. \qquad (9.45)$$

For group II the probability for seeing just $n_2$ from a total of $(N-N_1)$ events that were undetected in scan 1, is

$$P_{\mathrm{II}}(n_2|N-N_1,\varepsilon_2) = \frac{(N-N_1)!}{n_2!\,(N-N_1-n_2)!}\,\varepsilon_2^{n_2}(1-\varepsilon_2)^{N-N_1-n_2}. \qquad (9.46)$$

In the expressions for $P_1$, $P_{\mathrm{I}}$ and $P_{\mathrm{II}}$ above the quantities $N$, $\varepsilon_1$, $\varepsilon_2$ are unknown. The joint probability for seeing $N_1$ events in scan 1, $N_2$ events in scan 2, and $n_{12}$ events in common in the two scans, is the product of the three probabilities,

Organizing factors this can be written
which is symmetric in the indices 1 and 2.

The joint probability P may be interpreted as the likelihood $L(N_1,N_2,N_{12}|N,\varepsilon_1,\varepsilon_2)$ of the observations $N_1$, $N_2$, $N_{12} = N_1+N_2-n_{12}$ (with $n_{12}$ the number of events common to both scans) for the parameters $N$, $\varepsilon_1$, $\varepsilon_2$. One can therefore solve the three simultaneous likelihood equations for these parameters to find their ML estimates $\hat{N}$, $\hat\varepsilon_1$, $\hat\varepsilon_2$.

The likelihood equations $\partial\ln L/\partial\varepsilon_1 = 0$ and $\partial\ln L/\partial\varepsilon_2 = 0$ give, as expected for the efficiencies of the individual scans,

$$\hat\varepsilon_1 = \frac{N_1}{\hat{N}}, \qquad \hat\varepsilon_2 = \frac{N_2}{\hat{N}}.$$

The likelihood equation $\partial\ln L/\partial N = 0$ can, unfortunately, not be solved analytically. Since, however, we are primarily interested in the number N rather than the individual efficiencies we can integrate the joint probability of eq.(9.47) over the "nuisance" variables $\varepsilon_1$, $\varepsilon_2$ to obtain a likelihood function involving only the parameter N. The result of this integration is

This form is not particularly suitable for numerical evaluation when the observed numbers are large. However, the expression can be formulated in terms of a recurrence relation,

(9.49)

which is very convenient for numerical calculation. Since the number N of events in the film must be at least as large as the number of events found from the two scans, one can take $L(N-1) = L(N_{12}) = 1$ as a starting value and proceed stepwise to generate as much of the LF as desired.

Fig. 9.2. The likelihood function L(N) of eq.(9.49) generated with the numbers $N_1 = 43$, $N_2 = 48$, $n_{12} = 25$ (events in common), starting from N = 67 (L(66) = 1).

The estimated efficiencies are

$$\hat\varepsilon_1 = \frac{43}{81} = 0.53, \qquad \hat\varepsilon_2 = \frac{48}{81} = 0.59.$$

Exercise 9.17: With the observations of the example in the text, what is the total number of events and the scanning efficiencies estimated with the conventional formulae?
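The stepwise generation of L(N) by a recurrence relation can be sketched numerically. The sketch below assumes that eq.(9.49) has the form obtained when both efficiencies are integrated uniformly over the range 0 to 1, which gives $L(N) \propto N!\,(N-N_1)!\,(N-N_2)!\,/\,[(N-N_{12})!\,((N+1)!)^2]$ and hence $L(N)/L(N-1) = N(N-N_1)(N-N_2)/[(N-N_{12})(N+1)^2]$. This ratio is an assumption derived here, and the function and variable names are hypothetical; $N_{12}$ denotes the number of distinct events found, $N_1 + N_2$ minus the events in common.

```python
def scan_likelihood(n1_tot, n2_tot, n_common, n_max):
    """Generate L(N) for N = N12 .. n_max from the assumed recurrence.

    n1_tot, n2_tot : events found in scan 1 and scan 2
    n_common       : events found in both scans
    The normalization is arbitrary: L(N12) is set to 1.
    """
    n_found = n1_tot + n2_tot - n_common          # N12, distinct events found
    likelihood = {n_found: 1.0}                   # starting value
    for n in range(n_found + 1, n_max + 1):
        ratio = (n * (n - n1_tot) * (n - n2_tot)) / ((n - n_found) * (n + 1) ** 2)
        likelihood[n] = likelihood[n - 1] * ratio
    return likelihood

# Numbers quoted for Fig. 9.2: 43 and 48 events, 25 in common
lf = scan_likelihood(43, 48, 25, 200)
n_hat = max(lf, key=lf.get)                       # N with the largest L(N)
print(n_hat, 43 / n_hat, 48 / n_hat)
```

Under these assumptions the maximum falls at N = 81, consistent with the efficiencies 43/81 and 48/81 quoted above. Because only ratios of successive values are used, no large factorials need to be evaluated.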

For a numerical example, suppose that the two independent scans have yielded $N_1 = 43$ and $N_2 = 48$ events, respectively, with $n_{12} = 25$ events in common. The total number of events recorded by the scans is then $N_{12} = N_1+N_2-n_{12} = 66$. Thus we put L(66) = 1 and generate new values of L(N) from eq.(9.49) starting with N = 67. The result of this computation is shown in Fig. 9.2. From the shape of the LF we conclude that the ML estimate of the unknown number of events is $\hat{N} = 81$.

9.6.3 The two-parameter case; the covariance ellipse

For the two-parameter case the likelihood function $L(\mathbf{x}|\theta_1,\theta_2)$ becomes a three-dimensional surface which is less trivial to display. However, the shape of the function can conveniently be visualized by plotting level curves for constant values of $L(\mathbf{x}|\theta_1,\theta_2)$ in a $(\theta_1,\theta_2)$ plane, equivalent to drawing intersections between the surface and a set of parallel planes. In the vicinity of a maximum of the function these level curves will be a series of smooth, closed contours around the maximum point $(\hat\theta_1,\hat\theta_2)$ which can thus be localized to a required accuracy.
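The level-curve idea can be tried out on a small grid. The following is a minimal sketch, assuming a normal sample with the two parameters $\theta_1 = \mu$ and $\theta_2 = \sigma$; the data values, grid ranges and function names are made up for illustration. It locates the maximum of $\ln L$ on the grid and collects the grid points inside the level curve $\ln L = \ln L(\mathrm{max}) - 0.5$.

```python
import math

def log_likelihood(mu, sigma, data):
    """ln L for independent normal observations (constant term dropped)."""
    return sum(-math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2 for x in data)

def scan_grid(data, mu_values, sigma_values):
    """Evaluate ln L on a rectangular grid of (mu, sigma) points."""
    return {(m, s): log_likelihood(m, s, data)
            for m in mu_values for s in sigma_values}

data = [4.1, 5.3, 4.8, 5.9, 5.2, 4.4, 5.0, 5.6]        # made-up sample
mu_values = [4.0 + 0.01 * i for i in range(201)]        # 4.00 .. 6.00
sigma_values = [0.30 + 0.01 * j for j in range(121)]    # 0.30 .. 1.50

grid = scan_grid(data, mu_values, sigma_values)
(mu_hat, sigma_hat), lnl_max = max(grid.items(), key=lambda kv: kv[1])

# Grid points enclosed by the level curve ln L = ln L(max) - 0.5
inside = [point for point, value in grid.items() if value >= lnl_max - 0.5]
mu_lo = min(p[0] for p in inside)
mu_hi = max(p[0] for p in inside)
print(mu_hat, sigma_hat, mu_lo, mu_hi)
```

The extreme values `mu_lo` and `mu_hi` are the positions of the tangents to the contour parallel to the $\sigma$ axis. Repeating the selection with other offsets from the maximum gives further level curves.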
With two parameters the LF will often have more than one maximum. Usually, however, there is little trouble in identifying one particular maximum with the desired ML solution for the parameter estimates. For example, some of the maxima may occur in unphysical regions of the parameter space and can therefore be eliminated right away, or the principal maximum can be overwhelmingly favoured over the secondary maxima from the numerical magnitude of their likelihoods. With ill-behaved likelihood functions having two or more maxima of comparable magnitude the ambiguities of the solution may have to be resolved by looking for additional information. An example is given by the case study described in Sect.9.12.

For a regular LF with a single maximum in the parameter region of interest the errors in the ML estimates of the two parameters can be obtained from the specific likelihood contour for which $L = L(\mathrm{max})\,e^{-1/2}$. The tangents to this contour parallel to the coordinate axes provide a set of upper and lower errors for the two parameter estimates, as indicated in Fig. 9.3(a). If the LF has the shape of a binormal distribution the errors deduced this way are identical to the standard deviations, as will be demonstrated below.

Fig. 9.3. Graphical determination of the ML estimates and their errors from the two-parametric likelihood function; (a) the tangential method, (b) the intersection method (conditional errors).

A second approach determines the "errors" in the parameter estimates from the intersections between the same contour and the two lines $\theta_1 = \hat\theta_1$ and $\theta_2 = \hat\theta_2$, as indicated in Fig. 9.3(b). With this intersection method the "errors" in either parameter are thus deduced by keeping the other parameter at its estimated value, making these "errors" in general smaller than the errors obtained by the tangential method. Since an asymmetric orientation of the contour $L = L(\mathrm{max})\,e^{-1/2}$ relative to the coordinate axes reflects a non-zero correlation between the estimates, these two approaches to error determination will obviously produce identical results in situations with uncorrelated parameters only.

In the asymptotic limit of infinitely large samples the LF takes the binormal form

$$L(\theta_1,\theta_2) = L(\mathrm{max})\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{\theta_1-\hat\theta_1}{\sigma_1}\right)^2 - 2\rho\,\frac{(\theta_1-\hat\theta_1)(\theta_2-\hat\theta_2)}{\sigma_1\sigma_2} + \left(\frac{\theta_2-\hat\theta_2}{\sigma_2}\right)^2\right]\right\}, \qquad (9.50)$$

where $\sigma_1^2$ and $\sigma_2^2$ are the variances for the two ML estimates $\hat\theta_1$ and $\hat\theta_2$, and $\rho$ their correlation coefficient, as can be verified by calculating the matrix elements $V^{-1}_{ij}(\hat{\boldsymbol\theta})$ according to eq.(9.37) and inverting the resulting matrix. The contour $L = L(\mathrm{max})\,e^{-1/2}$ is now given by a quadratic equation in $\theta_1$ and $\theta_2$,

$$\frac{1}{1-\rho^2}\left[\left(\frac{\theta_1-\hat\theta_1}{\sigma_1}\right)^2 - 2\rho\,\frac{(\theta_1-\hat\theta_1)(\theta_2-\hat\theta_2)}{\sigma_1\sigma_2} + \left(\frac{\theta_2-\hat\theta_2}{\sigma_2}\right)^2\right] = 1. \qquad (9.51)$$

This is the covariance ellipse for the binormal LF. The ellipse is centred at $(\hat\theta_1,\hat\theta_2)$, and its principal axes make an angle $\alpha$ relative to the coordinate system, where

$$\tan 2\alpha = \frac{2\rho\,\sigma_1\sigma_2}{\sigma_1^2-\sigma_2^2}. \qquad (9.52)$$

Figure 9.4 shows a few examples of covariance ellipses with a common centre and common values of $\sigma_1$, $\sigma_2$ but with different $\rho$. As can be verified from eq.(9.51), the ellipses with such properties are all inscribed in the rectangle defined by the straight lines $\theta_1 = \hat\theta_1 \pm \sigma_1$ and $\theta_2 = \hat\theta_2 \pm \sigma_2$. In other words, regardless of the value of $\rho$, the tangents to an ellipse parallel to the coordinate axes will always have distances $\pm\sigma_1$ or $\pm\sigma_2$ from the point $(\hat\theta_1,\hat\theta_2)$; this serves to justify the tangential method for graphical error determination
(Fig. 9.3(a)). The intersection method for determining the errors, consisting in drawing lines $\theta_1 = \hat\theta_1$ and $\theta_2 = \hat\theta_2$, is here seen to give intersections with the covariance ellipse at distances $\pm\sigma_1\sqrt{1-\rho^2}$, $\pm\sigma_2\sqrt{1-\rho^2}$ from $(\hat\theta_1,\hat\theta_2)$. The last observation should make it clear why one must be careful in using this method, since merely quoting the intersection distances as errors will be incomplete, and perhaps even misleading, without a specific statement expressing that these errors are indeed conditional.

It can be shown that determining the errors by the tangents to the contour $L = L(\mathrm{max})\,e^{-1/2}$ corresponds to having a probability 0.683 for including the true value of one of the parameters in the interval $[\hat\theta_i-\Delta\hat\theta_i,\ \hat\theta_i+\Delta\hat\theta_i]$ when the second is ignored. The region enclosed by this contour does not represent a joint 68.3% probability for the two parameters, but corresponds to a much lower joint probability, in fact less than 40% in the ideal asymptotic case; this will be discussed further in Sect.9.7.4.

Fig. 9.4. Covariance ellipses for binormal likelihood functions with common maximum $(\hat\theta_1,\hat\theta_2)$ and common variances $\sigma_1^2 = 4$, $\sigma_2^2 = 1$. The ellipses for different values of the correlation coefficient $\rho$ all touch the rectangle defined by $\theta_1 = \hat\theta_1 \pm \sigma_1$, $\theta_2 = \hat\theta_2 \pm \sigma_2$ at four points; for $\rho = \pm 1$ the ellipses degenerate into the diagonals of the rectangle.

9.7 INTERVAL ESTIMATION FROM THE LIKELIHOOD FUNCTION

The Maximum-Likelihood Principle produces point estimates of the unknown parameters. As it is recognized that there should be a margin of uncertainty associated with an estimate, we have in the previous sections discussed at length how to evaluate its variance. We have seen that the approach to the variance determination is not unique, and that somewhat different results are obtained from the various methods. Therefore, in quoting a result $\hat\theta \pm \Delta\hat\theta$ for the ML estimate one should indicate how the error was obtained.

Instead of giving the result of an experiment in terms of a point estimate $\hat\theta$ and its error $\Delta\hat\theta$ one can summarize the outcome of the experiment by performing an interval estimation for the unknown parameter $\theta$. For such a purpose we introduced the concept of a confidence interval in Chapter 7. We associated an estimated interval with a probability content $\gamma$, and called it a $100\gamma\,\%$ confidence interval for the parameter. The meaning of this was that if the experiment were repeated many times under the same conditions, then, in the long run, the estimated intervals would include the true value of the parameter in $100\gamma\,\%$ of these experiments. To arrive at these inferences about the parameter we had to invert probability statements about some function, or statistic, of the observable quantities, whose distributional properties were known. Thus, for instance, inverting a probability statement about the sample mean $\bar{x}$, known to be distributed as $N(\mu,\sigma^2/n)$ with $\mu$ unknown, $\sigma^2$ known, we obtained a statement expressing that the probability was 0.954 that $\mu$ would be larger than $\bar{x} - 2\sigma/\sqrt{n}$ and smaller than $\bar{x} + 2\sigma/\sqrt{n}$; hence we called the random interval $[\bar{x} - 2\sigma/\sqrt{n},\ \bar{x} + 2\sigma/\sqrt{n}]$ a 95.4% confidence interval for $\mu$.

In the asymptotic limit, with infinite sample sizes, we can argue in a similar manner to obtain confidence intervals for the unknown parameter $\theta$. We know then that the ML estimate $\hat\theta$ has the property of being normally distributed about the true parameter value $\theta$, with a variance given by the minimum variance bound, and this allows us to write down probability statements like

$$P(\theta - 2\sigma \le \hat\theta \le \theta + 2\sigma) = 0.954. \qquad (9.53)$$

The random variable $\hat\theta$ here has a probability 0.954 of falling within distance $\pm 2\sigma$ from the true but unknown $\theta$, where $\sigma$ is implied by the MVB. Rewriting the


inequalities we get the expression

$$P(\hat\theta - 2\sigma \le \theta \le \hat\theta + 2\sigma) = 0.954, \qquad (9.54)$$

stating that the probability is 0.954 that the interval $[\hat\theta-2\sigma,\ \hat\theta+2\sigma]$ will cover the constant $\theta$. According to the definition of the concept in Chapter 7 the interval $[\hat\theta-2\sigma,\ \hat\theta+2\sigma]$ is therefore a 95.4% confidence interval for $\theta$. We note that the inversion of the probability statement (9.53) was particularly simple here, due to the algebraic symmetry between variable and mean value in the normal p.d.f. This asymptotic symmetry persists in the multi-parameter case, and hence permits a similar reasoning with inversion of probability statements to obtain confidence intervals (regions) in the general case with several parameters; see for example the presentation in Chapter 9 in the book by Eadie et al.

For finite samples we do not know the exact distribution of $\hat\theta$. We can therefore not write down statements like eq.(9.53), invert them, and next interpret the results in terms of exact confidence intervals for the unknown constant as we did above.

In the following we shall make use of the likelihood function to perform an interval estimation of $\theta$. We resume and extend our point of view from earlier in this chapter: not only shall we regard the $\theta$ value for which the LF is maximal as the "most likely" value of the unknown parameter; other $\theta$ values will be considered less likely to be the true value of the parameter, in accordance with the fall-off of the likelihood. Thus, for the set of observations at hand, we shall regard the likelihood function itself as providing a measure of the intensity of our credence in the various conceivable values of the unknown $\theta$. This means that we make an interpretation of the likelihood function as measuring our "degree of belief" in the possible values $\theta$ can have, based on our particular observations $x_1,x_2,\ldots,x_n$. Whereas the confidence interval gave a measure for the probability that the true value of the unknown parameter in the long run would be included in the estimated interval, an interval estimated from the likelihood function will measure our belief that the particular set of observations was generated by a parameter belonging to the estimated interval.

As soon as the likelihood function has been given the intuitive meaning of expressing degree of belief in possible values of $\theta$, it is only natural to associate an interval in $\theta$ with a numerical measure of our belief, in proportion to the integral of the LF over this interval. Let us for simplicity assume a normal-shaped LF of mean $\hat\theta$ and variance $\sigma^2$. Then our belief associated with the specific intervals $[\hat\theta-\sigma,\ \hat\theta+\sigma]$, $[\hat\theta-2\sigma,\ \hat\theta+2\sigma]$, etc. would be proportional to the constants 0.683, 0.954, .... These numbers therefore provide relative measures of our belief in the specified intervals for $\theta$. We could write, for example,

$$\text{Rel. belief}\,(\hat\theta - 2\sigma < \theta < \hat\theta + 2\sigma) = 0.954, \qquad (9.55)$$

which is an expression of the same formal structure as the inverted probability statement eq.(9.54) used to define the 95.4% confidence interval for $\theta$. In the following sections, when we refer to the likelihood function and write

we shall take this probability statement to mean "relative belief" in the above sense.

It is customary among physicists to refer to all intervals derived from the likelihood function as confidence intervals. This is in many instances unfortunate, since the meaning is different from what is usually understood by a confidence interval. Intervals obtained by a specific prescription from the likelihood function were originally named fiducial intervals by R.A. Fisher. Following a suggestion by D.J. Hudson we shall denote all intervals derived from the likelihood function as likelihood intervals to indicate their origin and to distinguish them from confidence intervals which have an entirely different conceptual content.

Finally, let it be mentioned that the use of the likelihood function in statistical inference is by no means a trivial matter. In fact, there has over the years been a great deal of controversy among the specialists, due to their different attitudes to Bayes' Theorem. To indicate how confusion can arise on the subject, let us recall that the likelihood $L(\mathbf{x}|\theta)$ expresses the
probability to obtain the particular set $\mathbf{x} = \{x_1,x_2,\ldots,x_n\}$ of observed values, on the condition that the value of the parameter is $\theta$. This probability is connected with the inverse probability $P(\theta|\mathbf{x})$ for a particular value of $\theta$, given the observations $\mathbf{x}$, through Bayes' Theorem, which states (eq.(2.26))

where $P(\theta)$ is the prior probability for $\theta$. In addition to knowing the likelihood $L(\mathbf{x}|\theta)$ of the observations $\mathbf{x}$ in our experiment we must have some prior knowledge of the parameter, expressed by $P(\theta)$, in order to obtain the new, posterior probability $P(\theta|\mathbf{x})$, which in this scheme forms the basis for statistical inference about $\theta$. Thus, we may want to undertake an experiment to get some information on the parameter $\theta$, but unless some previous knowledge already exists on $\theta$, or is simply guessed at, it is not even in principle possible to gain in knowledge by carrying out the experiment. Clearly, the requirement of having previous knowledge in order to learn something new is philosophically disputable. The reader who is interested in underlying philosophies and the relationship between Bayesian and other approaches to interval estimation should consult Kendall and Stuart, Chapters 20 and 21, Vol. 2.

9.7.1 Likelihood intervals; the one-parameter case

With an asymptotic LF of Gaussian form or, equivalently, a parabolic $\ln L$ function, intervals for $\theta$ can be constructed to correspond to a specified probability content $\gamma$, as described for the normal variable in Sect.4.8.3. In general, two limits $\theta_a$ and $\theta_b$ can be chosen such as to make

where G is the cumulative standard normal distribution. Particularly convenient are the choices which make the intervals symmetric about $\hat\theta$, since these are the shortest intervals which have a given probability $\gamma$ and also produce equal tail probabilities $\frac{1}{2}(1-\gamma)$ in the ends of the distribution *); these can be obtained by simply intersecting the LF or $\ln L$ function by straight lines. Thus, the probability statement

corresponds to obtaining the symmetric and central m standard deviation likelihood interval $[\hat\theta-m\sigma,\ \hat\theta+m\sigma]$ of probability $\gamma$. The interval is constructed by intersecting the Gaussian $L(\theta)$ by the straight line $L = L(\mathrm{max})\,e^{-a}$, or what amounts to the same, by intersecting the parabola $\ln L$ by the straight line at distance a below maximum, where $a = \frac{1}{2}m^2$. Specifically,

a = 0.5 gives a 68.3% likelihood interval for $\theta$,
a = 2.0 gives a 95.4% likelihood interval for $\theta$, (9.62)
Under very general conditions, when the number of observations
a = 4.5 " " 99.7% " I* I*.

I becomes infinitely large, the likelihood function ~ ( ~ 1 0=ii,f(xil@)


) gets I
independent of the sample values x l , x 2 , ...,% and rakes the shape of a normal This is illustrated in Fig. 9.5(a).
distribution in 6 with mean value ^e and variance o 2 as implied by the M V B . We The above procedure can also be applied when the LF is nor of a normal
write shape (and, equivalently, 1nL is not parabolic). Let us assume that L (XI@) is
8-
, , a continuous unimodular function of 8 and that there exists some transfornation
L(?/O) * L(B) = ~ ( m a x )e - " , (9.57)
g = g(B) of the variable @ which transforms the function L (~10)into a Gaussian
8 -
of unit variance and mean value g. In terms of the new variable g the LF is

* We use the word "symmetric" t o characterize the interval limits relative to


8, and "central" to indicate that the interval has been chosen to give equal
probabilities in the two ends of the distribution.
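The correspondence between the intersection constant a and the probability content listed in eq. (9.62) can be checked numerically. The sketch below assumes a strictly Gaussian LF, for which $\gamma = 2G(m) - 1$ with $m = \sqrt{2a}$; the function names are illustrative and not taken from the text.

```python
import math

def content_from_a(a):
    """Probability content of the symmetric interval obtained by cutting
    a Gaussian likelihood at lnL(max) - a (assumes an ideal normal LF)."""
    m = math.sqrt(2.0 * a)                 # a = m^2 / 2
    return math.erf(m / math.sqrt(2.0))    # equals 2*G(m) - 1

def a_from_content(gamma, tol=1e-12):
    """Invert content_from_a by bisection; valid for 0 < gamma < 1."""
    lo, hi = 0.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if content_from_a(mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for a in (0.5, 2.0, 4.5):
        print(f"a = {a:3.1f}  ->  gamma = {content_from_a(a):.4f}")
```

For a LF that is only approximately Gaussian these numbers are approximate probability contents, as the text goes on to discuss.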

In accordance with the invariance property of ML estimates (Sect. 9.4.1), the ML solution for g is $\hat g = g(\hat\theta)$.

From $L_g(\mathbf{x}|g)$ one can find likelihood intervals for g in exactly the same way as in the case of a normal-shaped LF. If the likelihood interval for g is $[g_a, g_b]$, the corresponding likelihood interval $[\theta_a, \theta_b]$ for the original parameter $\theta$ can be found by taking the values $\theta_a$, $\theta_b$ as implied by the transformation, $g_a = g(\theta_a)$, $g_b = g(\theta_b)$. Thus we should have to make an inverse transformation to obtain $\theta_a$ and $\theta_b$ explicitly.

The elaborate process of performing a transformation and a subsequent inverse transformation is in fact unnecessary. Since the likelihood itself gives the joint probability for obtaining the observations $x_1, x_2, \ldots, x_n$, and this probability must be the same whether the parameter is expressed directly as $\theta$ or in an implied form $g(\theta)$, we must have, for all $\theta$,

$$L_\theta(\mathbf{x}|\theta) = L_g(\mathbf{x}|g).$$

It follows that since, for example, a 95.4% likelihood interval for g is found from the intersections of $L_g(\mathbf{x}|g)$ with the line $L_g = L_g(\max)e^{-2}$, the corresponding 95.4% likelihood interval for $\theta$ can simply be obtained by finding the intersections of $L_\theta(\mathbf{x}|\theta)$ with $L_\theta = L_\theta(\max)e^{-2.0}$. Similarly, intersecting the non-normal LF by any straight line at a multiple $e^{-a}$ from maximum will directly produce likelihood intervals of probability content as given for the case with an ideal, normal LF.

Strictly speaking, the last assertion can only be approximately correct. This is so because the assumptions underlying the argumentation above will not be satisfied in the general case. The existence of the transforming function $g(\theta)$ is not granted and will, in fact, only be fulfilled to some approximation, depending on the functional form of the p.d.f. and the sample values, which determine the actual shape of the LF. For practical work this need not disturb us; as long as the graph of the lnL function has a single maximum in the region of interest and does not deviate too much from a parabola we may find its intersections with the straight lines to obtain likelihood intervals of approximate probability contents as given by eq. (9.62); see Fig. 9.5(b).

Fig. 9.5. Likelihood intervals for a one-parameter lnL function, obtained from intersection with straight lines lnL = lnL(max) - a; (a) a symmetric, parabolic lnL function, (b) an unsymmetric lnL function.

Because of its simplicity, the intersection procedure is the one most frequently used for interval estimation by physicists. An alternative procedure, equally justified from the intuitive point of view that the likelihood function gives a measure of our belief in the possible values of the unknown parameter, consists in explicitly integrating the LF. For example, if we should choose to have equal probabilities in the two tails of the LF, we could divide the total range for $\theta$, $\theta_A \le \theta \le \theta_B$, into a number of cells $\Delta\theta_i$ and determine numerically two values $\theta_a$ and $\theta_b$ such that

$$\int_{\theta_A}^{\theta_a} L(\mathbf{x}|\theta)\,d\theta \Big/ \int_{\theta_A}^{\theta_B} L(\mathbf{x}|\theta)\,d\theta = \tfrac{1}{2}(1-\gamma)$$

and

$$\int_{\theta_b}^{\theta_B} L(\mathbf{x}|\theta)\,d\theta \Big/ \int_{\theta_A}^{\theta_B} L(\mathbf{x}|\theta)\,d\theta = \tfrac{1}{2}(1-\gamma).$$
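The integration procedure can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: it uses the exponential lifetime law as the p.d.f., a made-up sample of eight decay times, and an arbitrarily chosen integration range and number of cells; none of these values come from the text.

```python
import math

def log_likelihood(tau, times):
    """lnL for the exponential law f(t; tau) = exp(-t/tau) / tau."""
    return -len(times) * math.log(tau) - sum(times) / tau

def central_interval_by_integration(times, gamma, lo, hi, cells=20000):
    """Central likelihood interval: equal fractions (1 - gamma)/2 of the
    integrated LF are left in each tail of the range [lo, hi]."""
    width = (hi - lo) / cells
    centres = [lo + (i + 0.5) * width for i in range(cells)]
    ln_l = [log_likelihood(c, times) for c in centres]
    peak = max(ln_l)
    weights = [math.exp(v - peak) for v in ln_l]   # LF up to a constant factor
    tail = 0.5 * (1.0 - gamma) * sum(weights)

    acc = 0.0
    for c, w in zip(centres, weights):             # scan upwards from lo
        acc += w
        if acc >= tail:
            lower = c
            break
    acc = 0.0
    for c, w in zip(reversed(centres), reversed(weights)):  # scan down from hi
        acc += w
        if acc >= tail:
            upper = c
            break
    return lower, upper

if __name__ == "__main__":
    sample = [0.3, 1.1, 0.7, 2.4, 0.9, 1.6, 0.2, 3.1]   # hypothetical data
    print(central_interval_by_integration(sample, 0.683, 0.2, 20.0))
```

The result depends on the chosen range whenever the LF has not fallen to a negligible value at its ends, so the range should be stated together with the interval.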

This would correspond to taking $[\theta_a, \theta_b]$ as a 100$\gamma$ % central likelihood interval for $\theta$.

Although sufficient for most practical purposes, the methods described above may sometimes not provide an adequate summary of the experiment. This can be the case, for instance, if the LF is very skew and produces large likelihood intervals which are unsymmetrically positioned relative to the most likely value of the parameter. In such a situation one will usually state the point estimate $\hat\theta$ in addition to the interval estimate $[\theta_a, \theta_b]$. In more extreme cases, when the LF is particularly ill-shaped, as is sometimes experienced in low-statistics experiments, it is more informative to present a graph of the whole LF.

Except for the ideal situation with a strictly normal LF we shall not, in general, be ensured that the intervals obtained directly from the LF (or, equivalently, from lnL) are the shortest intervals for $\theta$ corresponding to the given probability $\gamma$. If we are interested in obtaining intervals which give as accurate information on the parameter as possible, we should look for some explicit transformation into a new variable which is closer to the normal approximation than is the LF, and which will therefore, in the long run, produce tighter intervals, consistent with the minimum variance bound. An example of such a transformation is discussed in the subsequent sections.

9.7.2 Confidence intervals from the Bartlett functions

It was shown in Sect. 9.4.7 that the standardized variable

$$S(\theta) = \frac{\partial \ln L}{\partial\theta} \Big/ \left[E\!\left(-\frac{\partial^2 \ln L}{\partial\theta^2}\right)\right]^{1/2} \qquad (9.63)$$

becomes distributed as N(0,1) when n goes towards infinity. One can therefore, as suggested by M. S. Bartlett, use $S(\theta)$ to find the ML estimate $\hat\theta$ as well as any confidence interval for $\theta$. This is suggestive, since $S(\theta)$ is usually close to being normally distributed also for finite n.

For small samples an even more applicable function is

$$S_\gamma(\theta) = S(\theta) - \tfrac{1}{6}\gamma_1\left[S^2(\theta) - 1\right], \qquad (9.64)$$

where the last term is a skewness correction to the ordinary Bartlett function of eq. (9.63). The asymmetry coefficient $\gamma_1$ is defined (see Sect. 3.3.3) in terms of the second and third central moments $\mu_2$ and $\mu_3$ of $\partial \ln L/\partial\theta$,

$$\gamma_1 = \mu_3/\mu_2^{3/2}.$$

In these expressions the expectations are to be evaluated for the joint p.d.f. $L(\mathbf{x}|\theta)$.

Exercise 9.18: Derive eq. (9.65).

9.7.3 Example: Confidence intervals for the mean lifetime

To illustrate the use of the Bartlett functions from the previous section let us consider again the ideal lifetime distribution law $f(t;\tau) = \frac{1}{\tau}e^{-t/\tau}$. For n observations,

$$\ln L = -n\ln\tau - \frac{1}{\tau}\sum_{i=1}^{n} t_i,$$

and the ML estimate becomes $\hat\tau = \bar t = \frac{1}{n}\sum_{i=1}^{n} t_i$, as we saw in Sect. 9.1.1. The asymmetry coefficient is obtained as $\gamma_1 = 2/\sqrt{n}$ and the Bartlett functions take the forms

$$S(\tau) = \frac{\hat\tau - \tau}{\tau/\sqrt{n}} \qquad (9.66)$$

and

$$S_\gamma(\tau) = \frac{\hat\tau - \tau}{\tau/\sqrt{n}} - \frac{1}{3\sqrt{n}}\left[\left(\frac{\hat\tau - \tau}{\tau/\sqrt{n}}\right)^2 - 1\right]. \qquad (9.67)$$
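As a numerical illustration of eqs. (9.66) and (9.67), the following Python sketch evaluates both Bartlett functions for the exponential law and solves $S = \pm m$ for $\tau$ by bisection. The function names, the bracketing interval and the sample values n = 25, $\hat\tau$ = 1 are assumptions made for illustration only; the bracketing assumes $\sqrt{n} > 2m$ and $m \ge 1/2$, for which both functions decrease monotonically in $\tau$ over the bracket.

```python
import math

def bartlett_s(tau, tau_hat, n):
    """Ordinary Bartlett function for the exponential law, eq. (9.66)."""
    return (tau_hat - tau) / (tau / math.sqrt(n))

def bartlett_s_gamma(tau, tau_hat, n):
    """Skewness-corrected Bartlett function, eq. (9.67), gamma_1 = 2/sqrt(n)."""
    s = bartlett_s(tau, tau_hat, n)
    return s - (s * s - 1.0) / (3.0 * math.sqrt(n))

def solve_decreasing(func, target, lo, hi, tol=1e-10):
    """Bisection for func(tau) = target, assuming func decreases on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) > target:
            lo = mid          # root lies at larger tau
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bartlett_interval(func, tau_hat, n, m=2.0):
    """Limits of tau for which -m <= func(tau) <= m."""
    root_n = math.sqrt(n)
    if root_n <= 2.0 * m:
        raise ValueError("sample too small for this simple bracketing")
    lo = tau_hat / (1.0 + 2.0 * m / root_n)   # here S = +2m
    hi = tau_hat / (1.0 - 2.0 * m / root_n)   # here S = -2m
    f = lambda tau: func(tau, tau_hat, n)
    return solve_decreasing(f, +m, lo, hi), solve_decreasing(f, -m, lo, hi)

if __name__ == "__main__":
    print(bartlett_interval(bartlett_s, 1.0, 25))
    print(bartlett_interval(bartlett_s_gamma, 1.0, 25))
```

For the ordinary function the limits can also be written in closed form, since $S(\tau) = \pm m$ gives $\tau = \hat\tau/(1 \pm m/\sqrt{n})$.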
The first of these functions appeared already in the example of Sect. 9.4.8, where it was shown explicitly to be a standard normal variable for large n. A probability statement may therefore be written as

$$P\left(-2 \le \frac{\hat\tau - \tau}{\tau/\sqrt{n}} \le 2\right) = 0.954, \qquad (9.68)$$

which can be inverted to read

$$P\left(\frac{\hat\tau}{1 + 2/\sqrt{n}} \le \tau \le \frac{\hat\tau}{1 - 2/\sqrt{n}}\right) = 0.954.$$

This gives the 95.4% confidence interval for $\tau$,

$$\left[\frac{\hat\tau}{1 + 2/\sqrt{n}},\ \frac{\hat\tau}{1 - 2/\sqrt{n}}\right]. \qquad (9.69)$$

To order $1/\sqrt{n}$ this interval is symmetric about $\hat\tau$.

Taking instead the function $S_\gamma(\tau)$ the analogue to the probability statement (9.68) is

$$P\left(-2 \le S_\gamma(\tau) \le 2\right) = 0.954.$$

If we now want the limits of the corresponding confidence interval for $\tau$ a second-order equation must be solved for each of these limits. Of the two solutions of each equation we take those which, when $n \to \infty$, coincide with the limits obtained above with the function $S(\tau)$, since $S_\gamma(\tau) \to S(\tau)$ when n becomes very large. The result is that $S_\gamma(\tau)$, for n not too small, provides the 95.4% confidence interval for $\tau$ [...]

This interval is more symmetric about $\hat\tau$ and also shorter than the interval of eq. (9.69).

The intervals of eqs. (9.69), (9.71) are the results of inversion of probability statements about specified functions of $\tau$. They express the probability that the true value of the parameter will be included between certain limits given by the random variable $\hat\tau = \bar t$ and the sample size n, and therefore correspond to confidence intervals in the sense of Chapter 7. These intervals will be more symmetric and, in the long run, more accurate than intervals derived directly from the likelihood function.

Exercise 9.19: Using the one-standard deviation limits for the two Bartlett functions $S(\tau)$ and $S_\gamma(\tau)$ in the text, show that the 68.3% confidence intervals obtained for $\tau$ are identical.

Exercise 9.20: In a laboratory exercise to measure the mean lifetime $\tau$ of $\Lambda$ hyperons produced in a bubble chamber a student has for 14 event candidates measured the following proper flight times in units of $10^{-10}$ seconds:
2.4, 1.7, 1.9, 4.5, 0.5, 10.2, 2.7, 0.8, 1.4, 1.6, 1.0, 1.2, 2.8, 0.7.
Assuming the chamber to have infinite dimensions and perfect detection conditions, what is the ML estimate for $\tau$? Plot the lnL function from these assumptions and determine the 68.3% and 95.4% likelihood intervals. Compare these with the corresponding intervals obtained from the Bartlett functions $S(\tau)$ and $S_\gamma(\tau)$.

Exercise 9.21: Consider the p.d.f.
$$f(t;T|\lambda) = \frac{\lambda e^{-\lambda t}}{1 - e^{-\lambda T}} \quad \text{for } 0 \le t \le T,$$
(compare Exercise 9.2). Show that the Bartlett function defined by eq. (9.66) becomes [...]

9.7.4 Likelihood regions; the two-parameter case

In a situation involving two parameters we shall regard the likelihood function $L(\mathbf{x}|\theta_1,\theta_2)$ obtained for a set of observations as containing all information available on the unknown parameters and use it to make inferences about them based on probability statements of the type

$$P(\theta_1^a \le \theta_1 \le \theta_1^b,\ \theta_2^a \le \theta_2 \le \theta_2^b) = \gamma. \qquad (9.72)$$

We will start with the assumption of asymptotic conditions. In the limit of infinitely large samples the LF will be the binormal distribution
$$L(\theta_1,\theta_2) = L(\max)\, e^{-\frac{1}{2}Q}, \qquad (9.73)$$

where

$$Q = \frac{1}{1-\rho^2}\left[\left(\frac{\theta_1 - \hat\theta_1}{\sigma_1}\right)^2 - 2\rho\,\frac{(\theta_1 - \hat\theta_1)(\theta_2 - \hat\theta_2)}{\sigma_1\sigma_2} + \left(\frac{\theta_2 - \hat\theta_2}{\sigma_2}\right)^2\right]. \qquad (9.74)$$

Curves for constant likelihood will then be ellipses with centre at the ML estimates $\hat\theta_1, \hat\theta_2$, as shown in Fig. 9.6. Specifically, we recognize Q = 1 as giving the covariance ellipse of eq. (9.51).

The quadratic form Q of the two normal variables $\theta_1$ and $\theta_2$ has the remarkable feature that its distributional properties are independent of all five constants $\hat\theta_1, \hat\theta_2, \sigma_1, \sigma_2, \rho$; in fact, Q is distributed as a chi-square variable with 2 degrees of freedom (compare Exercise 4.48). For such a variable we can express a probability by the cumulative integral

$$P(Q \le Q_\gamma) = \int_0^{Q_\gamma} f(Q;\nu=2)\,dQ = \gamma, \qquad (9.75)$$

where $f(Q;\nu=2) = \frac{1}{2}\exp(-\frac{1}{2}Q)$, according to Sects. 5.1.1 and 5.1.2. The integration can therefore be performed explicitly, giving

$$\gamma = 1 - e^{-\frac{1}{2}Q_\gamma}. \qquad (9.76)$$

Clearly, the condition $Q \le Q_\gamma$ is equivalent to having both variables $\theta_1, \theta_2$ at the same time within the region enclosed by the ellipse $Q = Q_\gamma$. Writing

$$P(\theta_1,\theta_2 \text{ within ellipse } Q = Q_\gamma \text{ about } \hat\theta_1,\hat\theta_2) = \gamma$$

we have arrived at a probability statement of the type (9.72). The ellipse $Q = Q_\gamma$ centred at the ML estimates $\hat\theta_1, \hat\theta_2$ has a probability $\gamma$ of covering both $\theta_1$ and $\theta_2$ simultaneously and therefore represents a 100$\gamma$ % joint likelihood region for the two parameters.

If the two-parameter LF is of a strictly binormal shape we shall have a simple interpretation of the elliptic likelihood contours obtained by intersecting lnL by parallel planes lnL = lnL(max) - a: they are the boundaries of joint likelihood regions for the two unknown parameters, and have probability contents as implied by eq. (9.76), $\gamma = 1 - e^{-\frac{1}{2}Q_\gamma} = 1 - e^{-a}$, where $\frac{1}{2}Q_\gamma = a$. In particular, the covariance ellipse has a probability content $\gamma$ = 0.393 and thus represents a 39.3% joint likelihood region for $\theta_1$ and $\theta_2$. The following list of numbers should be compared with its one-parameter analogue, eq. (9.62),

a = 0.5 gives a 39.3% joint likelihood region for $\theta_1$ and $\theta_2$,
a = 2.0 gives a 86.5% joint likelihood region for $\theta_1$ and $\theta_2$,      (9.77)
a = 4.5 gives a 98.9% joint likelihood region for $\theta_1$ and $\theta_2$.

Fig. 9.6. Likelihood regions for a binormal likelihood function. The ellipses are obtained by intersecting the lnL function by planes at distances a below the maximum point $(\hat\theta_1, \hat\theta_2)$ and define joint likelihood regions for $\theta_1$ and $\theta_2$, of probabilities as given by eq. (9.77). The vertical band of width $2\sigma_1$ around $\hat\theta_1$ touching the covariance ellipse Q = 1 is a 68.3% likelihood interval for $\theta_1$, irrespective of $\theta_2$; similarly, the horizontal band is a 68.3% likelihood interval for $\theta_2$, ignoring $\theta_1$. Within the covariance ellipse the 68.3% conditional likelihood intervals of length $2\sigma_i\sqrt{1-\rho^2}$ for either parameter are indicated.
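The numbers in eq. (9.77) follow directly from $\gamma = 1 - e^{-a}$ and can be set beside their one-parameter analogues. The short sketch below assumes ideal Gaussian and binormal likelihood functions; the function names are illustrative.

```python
import math

def one_parameter_content(a):
    """Content of the interval cut at lnL(max) - a for a Gaussian LF."""
    return math.erf(math.sqrt(a))

def two_parameter_content(a):
    """Content of the ellipse cut at lnL(max) - a for a binormal LF."""
    return 1.0 - math.exp(-a)

def a_for_joint_region(gamma):
    """Intersection constant giving a two-parameter region of content gamma."""
    return -math.log(1.0 - gamma)

if __name__ == "__main__":
    for a in (0.5, 2.0, 4.5):
        print(f"a = {a:3.1f}: one parameter {one_parameter_content(a):.3f}, "
              f"two parameters {two_parameter_content(a):.3f}")
```

The same cut therefore carries a noticeably smaller probability content when it is read as a joint statement about two parameters.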
From the preceding arguments it is also evident how a joint likelihood region of specified probability $\gamma$ can be found for two parameters, given a binormal LF. One chooses the elliptic region Q = 2a bounded by the contour lnL = lnL(max) - a, where the number a is deduced from eq. (9.76) for the specified $\gamma$,

$$a = -\ln(1 - \gamma). \qquad (9.78)$$

Thus, for example, if we were interested in a 68.3% joint likelihood region for the two parameters, this would correspond to taking a = 1.15 and the ellipse Q = 2.30.

Although appealing, choosing elliptic areas as joint likelihood regions for two parameters does not represent the only possibility. For instance, the rectangle circumscribing the covariance ellipse gives another convenient joint likelihood region for $\theta_1$ and $\theta_2$. It corresponds to a probability statement of the type of eq. (9.72),

$$P(\hat\theta_1 - \sigma_1 \le \theta_1 \le \hat\theta_1 + \sigma_1,\ \hat\theta_2 - \sigma_2 \le \theta_2 \le \hat\theta_2 + \sigma_2) = \gamma(\rho). \qquad (9.79)$$

Evidently, the probability content of the rectangle must be larger than that of the covariance ellipse; in fact, it is found to depend on the correlation $\rho$. With the LF of eq. (9.50) the probability (strictly speaking: our relative belief) of having $\theta_1$ and $\theta_2$ within distances $\pm\sigma_1$ and $\pm\sigma_2$ from the estimated values is given by the integral

$$\gamma(\rho) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}\;L(\max)} \int_{\hat\theta_1-\sigma_1}^{\hat\theta_1+\sigma_1}\int_{\hat\theta_2-\sigma_2}^{\hat\theta_2+\sigma_2} d\theta_1\,d\theta_2\,L(\theta_1,\theta_2),$$

which can be reduced to a function of $\rho$ and determined numerically. The same procedure can obviously be applied to determine the probability $\gamma(\rho,m)$ of any other rectangle specified by $\theta_1 = \hat\theta_1 \pm m\sigma_1$, $\theta_2 = \hat\theta_2 \pm m\sigma_2$. It is left as an exercise to the reader to verify that this probability is given by the formula [...] where G is the cumulative standard normal distribution. As expected from symmetry reasons, $\gamma(-\rho,m) = \gamma(\rho,m)$. The following table summarizes the numerical values of the probability content for a set of ellipses as well as their circumscribing rectangles for different values of $\rho$.

[Table: probability of ellipse; probability of circumscribing rectangle, $\gamma(\rho,m)$]

The table values show that $\gamma(\rho,m)$ is a rather sensitive increasing function of $\rho$, particularly when $|\rho| \to 1$ and m is small, but becomes almost independent of $\rho$ when $m \ge 3$. The numbers imply that, given the size of a circumscribing rectangle, it represents a probability which depends on the magnitude of the correlation between the parameters. On the other hand, all ellipses which can be inscribed in the rectangle, corresponding to any value of the correlation, represent one common value for the joint probability. These findings may appear somewhat surprising, since one's first impression, say from glancing at Fig. 9.4, is likely to be otherwise and, in particular, that the probability associated with ellipses inscribed in a given rectangle should depend on their size.

It is rather trivial to construct likelihood intervals for each of the parameters considered separately. Writing a probability statement like

$$P(\hat\theta_1 - m\sigma_1 \le \theta_1 \le \hat\theta_1 + m\sigma_1) = \gamma \qquad (9.81)$$

for the first parameter would imply ignoring the second, which means that it can have any value. For this situation, integrating $L(\theta_1,\theta_2)$ with the appropriate normalization over all $\theta_2$ gives the marginal distribution in $\theta_1$, which becomes $N(\hat\theta_1,\sigma_1^2)$ (see the derivation of eq. (4.81) in Sect. 4.9.2). From this distribution one finds the likelihood intervals for $\theta_1$ in the usual manner. Specifically, the one-standard deviation (68.3%) likelihood interval $[\hat\theta_1 - \sigma_1,\ \hat\theta_1 + \sigma_1]$ for $\theta_1$ becomes the infinitely long vertical band touching the covariance ellipse, indicated in Fig. 9.6; similarly, the long horizontal band is the one-standard deviation likelihood interval $[\hat\theta_2 - \sigma_2,\ \hat\theta_2 + \sigma_2]$ for $\theta_2$. In the event of independent parameters ($\rho$ = 0, a symmetrically positioned ellipse), when the joint probability factorizes into the two marginal probabilities, we therefore find [...]

In a similar manner one can establish conditional likelihood intervals for one parameter when the other parameter is kept at a fixed value. For instance, specifying the first parameter to its estimated value $\theta_1 = \hat\theta_1$, the second has the conditional distribution $N(\hat\theta_2, \sigma_2^2(1-\rho^2))$ (compare eq. (4.82)), showing that $[\hat\theta_2 - \sigma_2\sqrt{1-\rho^2},\ \hat\theta_2 + \sigma_2\sqrt{1-\rho^2}]$ in this case is a 68.3% conditional likelihood interval for this parameter.

Our discussion has so far been based on the assumption of an ideal, binormal likelihood function. In real life, with finite samples, the two-parameter LF will not comply with this requirement. As in the one-parameter case we can, however, assume that the departure from asymptotic conditions is not too severe. If the LF has a single maximum and is sufficiently regular in the region of interest we shall again take for granted the existence of some transforming function which brings the LF over to the ideal binormal shape, thereby enabling us to reason about transformed likelihood regions as we did in Sect. 9.7.1 for the one-parameter intervals. Consequently, by intersecting the lnL surface by planes lnL = lnL(max) - a we shall take the region enclosed by the contour as an approximate 100$\gamma$ % joint likelihood region for the two parameters $\theta_1$ and $\theta_2$ where, as before for the ideal binormal case, $\gamma = 1 - e^{-a}$.

For an irregular LF, where the existence of a transforming function is an obviously inadequate assumption, it will not be justified to take the numbers $\gamma = 1 - e^{-a}$ as approximate measures of the probabilities to be associated with the joint likelihood regions. This applies, in particular, if the LF has more than one maximum and the intersection lnL = lnL(max) - a produces two or more disconnected regions in parameter space. In this situation the experiment is only poorly summarized by specifying the particular regions, and it is more informative to display the LF graphically by a whole set of likelihood contours.

9.7.5 Example: Likelihood region for $\mu$ and $\sigma^2$ in $N(\mu,\sigma^2)$

We have earlier established that the joint ML estimates of the mean and variance in the normal p.d.f. $N(\mu,\sigma^2)$ are given by $\hat\mu = \bar x = \frac{1}{n}\sum x_i$ and $\hat\sigma^2 = s^2 = \frac{1}{n}\sum (x_i - \bar x)^2$ (Sect. 9.2.3) and that, for large samples, their covariance matrix is diagonal with elements $V_{11} = \sigma^2/n$ and $V_{22} = 2\sigma^4/n$ (Sect. 9.5.5). If in these elements we replace the parameter $\sigma^2$ by its estimated value $\hat\sigma^2$ we find that the variable Q of eq. (9.74) is given by

$$Q = \frac{n(\mu - \hat\mu)^2}{\hat\sigma^2} + \frac{n(\sigma^2 - \hat\sigma^2)^2}{2\hat\sigma^4},$$

which describes ellipses centred at $(\hat\mu, \hat\sigma^2)$ in the $(\mu,\sigma^2)$ plane, with half axes proportional to $\hat\sigma/\sqrt{n}$ and $\hat\sigma^2\sqrt{2/n}$. In particular, the ellipse which gives the 95% joint likelihood region for $\mu$ and $\sigma^2$ corresponds to taking Q = 2a, with a determined by eq. (9.78).

Fig. 9.7. Likelihood region (ellipse) and confidence region (intersected parabola) for $\mu$ and $\sigma^2$ in $N(\mu,\sigma^2)$.
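The size of such a joint likelihood ellipse is easily evaluated. The sketch below is a minimal illustration assuming the asymptotic, diagonal covariance matrix quoted above with $\sigma^2$ replaced by $s^2$; the function name and the sample values n = 100, $s^2$ = 1 in the usage lines are chosen for illustration.

```python
import math

def joint_ellipse_half_axes(n, s2, gamma=0.95):
    """Half axes of the joint likelihood ellipse for (mu, sigma^2),
    assuming V11 = s2/n, V22 = 2*s2^2/n and zero correlation."""
    a = -math.log(1.0 - gamma)                     # eq. (9.78)
    q = 2.0 * a                                    # ellipse Q = 2a
    half_axis_mu = math.sqrt(q * s2 / n)           # sqrt(Q * V11)
    half_axis_var = math.sqrt(q * 2.0 * s2 * s2 / n)   # sqrt(Q * V22)
    return a, half_axis_mu, half_axis_var

if __name__ == "__main__":
    a, d_mu, d_var = joint_ellipse_half_axes(n=100, s2=1.0)
    print(f"a = {a:.3f}, half axes: {d_mu:.3f} (mu), {d_var:.3f} (sigma^2)")
```

Both half axes shrink as $1/\sqrt{n}$, so quadrupling the sample halves the linear size of the region.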
From eq. (9.78), a = -ln(1 - 0.95) = 2.996; hence its half axes are $2.45\,\hat\sigma/\sqrt{n}$ and $2.45\,\hat\sigma^2\sqrt{2/n}$.

It is interesting to compare the elliptic joint likelihood region obtained in this manner with the findings of Sect. 7.4, where we used the independence property of the statistics $\bar x$ and $s^2$ to construct a joint confidence region for $\mu$ and $\sigma^2$, illustrated by the shaded area of Fig. 7.2. A sample of size n = 100 with $\bar x$ = 1, $s^2$ = 1 gives the 95% joint likelihood ellipse of Fig. 9.7. This ellipse is slightly smaller in area than the particular 95% joint confidence region shown in the same figure, bounded by the parabola and the two horizontal lines. These have been determined by demanding equal tail probabilities $\frac{1}{2}(1 - \sqrt{0.95})$ in the ends of the N(0,1) distribution (for the variable $(\bar x - \mu)/(\sigma/\sqrt{n})$) as well as the $\chi^2(n-1)$ distribution (for the variable $ns^2/\sigma^2$). This requirement fixes the values of the constants a, b, b' in the probability statement eq. (7.18); we find a = 2.237 in the usual way, and b = 67.5, b' = 130.5 by approximating the chi-square variable to a normal variable of the same mean and variance.

9.7.6 Likelihood regions; the multi-parameter case

To generalize the arguments of Sects. 9.7.1 and 9.7.4 to a situation with several parameters we are now looking for the possibility to formulate probability statements of the type

$$P(\theta_1^a \le \theta_1 \le \theta_1^b,\ \ldots,\ \theta_k^a \le \theta_k \le \theta_k^b) = \gamma \qquad (9.82)$$

on the basis of a k-dimensional likelihood function $L = L(\mathbf{x}|\theta_1,\theta_2,\ldots,\theta_k)$. If all parameters are considered to lie between two different limits, the span in parameter space will be a 100$\gamma$ % joint likelihood region for all the k parameters. If some of them are kept constant, say at their estimated values, the region spanned by the remaining parameters will be a conditional joint likelihood region (if there are at least two parameters left) or a conditional likelihood interval (if only one parameter remains). In any case, our purpose is to find a region for chosen $\gamma$ or, conversely, to find the value of $\gamma$ corresponding to a specified region.

Unless we make the simplifying assumption of asymptotic conditions we shall not be able to pursue this subject very far. In doing so it should be noted that with an increasing number of variables (parameters) the approach to "asymptopia" becomes increasingly slow. Thus larger sample sizes are in general needed to attain a given accuracy in the asymptotic approximation when the number of parameters goes from one to two and beyond. In other words, comparing real life to the ideal conditions will generally imply rougher approximations in the multi-parameter case.

A Taylor expansion of lnL about the ML estimate $\boldsymbol\theta = \hat{\boldsymbol\theta}$ reads, in general,

$$\ln L(\boldsymbol\theta) = \ln L(\hat{\boldsymbol\theta}) + \sum_{i=1}^{k}\left.\frac{\partial \ln L}{\partial\theta_i}\right|_{\hat{\boldsymbol\theta}}(\theta_i - \hat\theta_i) + \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k}\left.\frac{\partial^2 \ln L}{\partial\theta_i\,\partial\theta_j}\right|_{\hat{\boldsymbol\theta}}(\theta_i - \hat\theta_i)(\theta_j - \hat\theta_j) + \ldots$$

Here the first derivatives vanish identically. If the sample size is large enough the second derivatives are the negative of the elements of the inverse of the covariance matrix for the ML estimates (eq. (9.32)). Hence we may write

$$\ln L(\boldsymbol\theta) = \ln L(\max) - \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k} V^{-1}_{ij}(\boldsymbol\theta)\,(\theta_i - \hat\theta_i)(\theta_j - \hat\theta_j) + \ldots$$

When higher-order terms are neglected this gives the asymptotic LF as the multinormal distribution in $\boldsymbol\theta$,

$$L(\boldsymbol\theta) = L(\max)\,\exp\left[-\tfrac{1}{2}(\boldsymbol\theta - \hat{\boldsymbol\theta})^T V^{-1}(\hat{\boldsymbol\theta})\,(\boldsymbol\theta - \hat{\boldsymbol\theta})\right], \qquad (9.84)$$

where we have written $V^{-1}(\hat{\boldsymbol\theta})$ instead of $V^{-1}(\boldsymbol\theta)$.

Intersecting the hypersurface $L(\boldsymbol\theta)$ by hyperplanes $L = L(\max)e^{-a}$ will now give contours of constant likelihood which define hyperellipsoidal regions in parameter space. Since the quadratic form

$$Q = (\boldsymbol\theta - \hat{\boldsymbol\theta})^T V^{-1}(\hat{\boldsymbol\theta})\,(\boldsymbol\theta - \hat{\boldsymbol\theta})$$

for multinormal $\boldsymbol\theta$ is distributed as a chi-square variable with k degrees of freedom (Sect. 4.10.2), we can express a probability by integrating the p.d.f. of the $\chi^2(k)$ variable between 0 and some value $Q_\gamma$; we write

$$P(Q \le Q_\gamma) = \int_0^{Q_\gamma} f(Q;\nu=k)\,dQ = F(Q = Q_\gamma;\nu=k) = \gamma. \qquad (9.86)$$
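The probability content of eq. (9.86) can be evaluated without tables. The sketch below computes the cumulative chi-square distribution from the standard series for the regularized incomplete gamma function and inverts it by bisection; the function names and the search range for a are assumptions of this illustration, and the strictly multinormal LF of eq. (9.84) is taken for granted.

```python
import math

def chi2_cdf(q, k):
    """Cumulative chi-square distribution F(q; nu = k), computed as the
    regularized lower incomplete gamma function P(k/2, q/2) by its series."""
    if q <= 0.0:
        return 0.0
    s, x = 0.5 * k, 0.5 * q
    term = 1.0 / s
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= x / (s + n)
        total += term
    return total * math.exp(-x + s * math.log(x) - math.lgamma(s))

def content_for_a(a, k):
    """Content of the k-parameter region cut at lnL(max) - a (Q = 2a)."""
    return chi2_cdf(2.0 * a, k)

def a_for_content(gamma, k, a_max=50.0):
    """Intersection constant a giving a joint region of content gamma."""
    lo, hi = 0.0, a_max
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if content_for_a(mid, k) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for k in range(1, 6):
        print(k, round(content_for_a(0.5, k), 3), round(a_for_content(0.683, k), 2))
```

Values read from graphs of the cumulative chi-square distribution may differ slightly from the numbers such a calculation returns.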
where $F$ is the cumulative integral of the chi-square p.d.f. as defined in Sect. 5.1.4. Clearly, $Q \le Q_\gamma$ is here equivalent to having all parameters $\theta_1,\theta_2,\ldots,\theta_k$ simultaneously within the region enclosed by the hyperellipse $Q = Q_\gamma$. This hyperellipse, centred at $\hat{\boldsymbol\theta}$ and obtained by the intersecting hyperplane at distance $a = \tfrac{1}{2}Q_\gamma$ below $\ln L(\max)$, will therefore be the boundary of a $100\gamma\,\%$ joint likelihood region for all $k$ parameters. Corresponding values of $\gamma$ can be found from graphs or tabulations of the cumulative chi-square distribution, such as Fig. 5.2 or Appendix Table A8. It is important to observe that, for a fixed value of the intersection constant $a$, the probability $\gamma$ associated with the likelihood region drops very quickly when the number of parameters increases. For example, the choice $a = 0.5$ ($Q_\gamma = 1$), which produced $\gamma = 0.683$ for the one-parameter and $\gamma = 0.393$ for the two-parameter case, leads to chi-square probabilities $\gamma \approx 0.20$, $\approx 0.10$, $\approx 0.05$ for $k = 3, 4, 5$, as one can see from Fig. 5.2. Likewise, to obtain a specified probability $\gamma$, one will have to take increasingly large values of $a$ for increasing $k$. Specifically, to have 68.3% joint likelihood regions, the intersecting hyperplanes must be taken for $a$ = 1.15, 1.77, 2.38, and 3.00 when the number of parameters is $k$ = 2, 3, 4, and 5, respectively.

To obtain likelihood regions or intervals which are conditional or independent on some of the parameters we must carry out appropriate integrations of the multinormal LF. The background for our remarks here has been presented in Sect. 4.10.1. If some of the parameters, say the first $\ell$ of them, are without interest and can be ignored, the marginal distribution obtained by integrating the LF over these is a multinormal distribution in the $k-\ell$ remaining parameters with the same mean values and covariances as they had in the full LF; this marginal distribution will then provide joint likelihood regions for the $k-\ell$ parameters, independent of the $\ell$ first, in the manner described above. If, on the other hand, $m$ parameters are kept fixed at their estimated values, the conditional distribution in the remaining variables is also multinormal, of dimension $k-m$, but with new covariance terms as determined by the covariance matrix $V^*$, which results by deleting the appropriate $m$ rows and columns from the original $V^{-1}$ and inverting the resultant matrix. This distribution would then provide joint likelihood regions for the $k-m$ parameters, conditional on the $m$ first. More odd situations, in which some parameters are ignored while others are kept fixed, can also be thought of. We leave it to the interested reader to contemplate this matter further and to work out the relevant formulae. The lesson to be learned from our considerations here is that it is extremely important to state explicitly which parameters have been considered jointly estimated and which have been integrated over or kept constant at their estimated values.

9.8 GENERALIZED LIKELIHOOD FUNCTION

With the assumed functional dependence $f(x|\theta)$ between the observable $x$ and the unknown parameter $\theta$ we have, given $n$ events (observations) $x_1,x_2,\ldots,x_n$, written the likelihood function as $L(\mathbf{x}|\theta) = \prod_{i=1}^{n} f(x_i|\theta)$. If it is possible to write the total number $\nu$ of expected events as a function of $\theta$, say $\nu = \nu(\theta)$, this "information" may be utilized by constructing a generalized likelihood function as

$$\mathcal{L}(\mathbf{x}|\theta) = \frac{\nu^n}{n!}\,e^{-\nu}\prod_{i=1}^{n} f(x_i|\theta). \tag{9.87}$$

This expression describes the joint probability for observing just $n$ events, and that these give the results $x_1,x_2,\ldots,x_n$, when the number of observed events is assumed to be a Poisson variable with mean value $\nu$.

The advantage of introducing the generalized $\mathcal{L}$ is that the number of observed events $n$ adds an extra constraint in the determination of $\theta$. In problems where the shape of $f(x|\theta)$ is of primary interest one will, however, in general gain fairly little by using $\mathcal{L}$ instead of $L$. The use of $\mathcal{L}$ is suggestive only in cases where the expected number of events $\nu(\theta)$ can be calculated with considerable accuracy. An example is given in the case study of Sect. 9.12.

9.9 APPLICATION OF THE MAXIMUM-LIKELIHOOD METHOD TO CLASSIFIED DATA

When the number of observations is very large the numerical evaluation of the likelihood function may become quite laborious, especially if the p.d.f. $f(x|\theta)$ has a complex form. In such situations one may reduce the amount of computation by grouping the data into subsets or classes and write the likelihood function as the product of a smaller number of "averaged" p.d.f.'s.
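As an illustration of the generalized likelihood function of Sect. 9.8, the sketch below adds the Poisson term for the observed number of events to the ordinary log-likelihood. It is only a toy under stated assumptions: the exponential p.d.f., the relation `expected_events(theta) = C * theta`, the constant `C` and the data values are all hypothetical and chosen so that both maxima have a closed form.

```python
import math

def log_likelihood(xs, theta, pdf):
    """Ordinary ln L = sum_i ln f(x_i | theta)."""
    return sum(math.log(pdf(x, theta)) for x in xs)

def generalized_log_likelihood(xs, theta, pdf, nu):
    """ln of the generalized LF: Poisson term for n observed events plus ln L."""
    n = len(xs)
    v = nu(theta)
    poisson_term = n * math.log(v) - v - math.lgamma(n + 1)  # ln(v^n e^-v / n!)
    return poisson_term + log_likelihood(xs, theta, pdf)

# Hypothetical toy model (not from the text): exponential p.d.f. in x,
# expected number of events proportional to theta.
def exp_pdf(x, theta):
    return theta * math.exp(-theta * x)

C = 5.0  # assumed proportionality constant

def expected_events(theta):
    return C * theta

xs = [0.12, 0.35, 0.80, 1.10, 0.05, 0.60, 0.27, 1.90]  # made-up observations
n, s = len(xs), sum(xs)

# For this toy model both maxima follow from setting the derivative to zero:
theta_shape = n / s                # maximizes ln L (shape information only)
theta_general = 2 * n / (C + s)    # maximizes the generalized ln L

print("shape-only estimate :", theta_shape)
print("generalized estimate:", theta_general)
```

The two estimates differ because the generalized form also requires the expected number of events to be compatible with the observed `n`; how much weight this extra constraint deserves depends on how accurately the expected number is known.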
A similar grouping of the data is frequently inherent in the experimental set-up itself, as for instance when a counter combination is used to register particles within a certain angular interval. It is clear that the grouping of the data into categories necessarily implies some loss of information; this loss will, however, be modest if the variation of the distribution is small over each interval.

Let the total number of events $n$ be grouped into $N$ classes for different intervals of the variable $x$. The joint probability to have $n_1$ events in class 1, $n_2$ events in class 2, and so on, is given by the multinomial distribution law,

$$L(n_1,n_2,\ldots,n_N|\theta) = n!\prod_{i=1}^{N}\frac{1}{n_i!}\,p_i^{\,n_i}, \tag{9.88}$$

where $p_i$ is the probability for the class $i$. This probability can be found by integrating the p.d.f. over the width $\Delta x_i$ of the $i$-th interval,

$$p_i = p_i(\theta) = \int_{\Delta x_i} f(x|\theta)\,dx.$$

Since $L$ depends on $\theta$ only through the $p_i$, the ML estimate for the parameter is found by seeking the maximum value of the expression

$$\ln L(n_1,n_2,\ldots,n_N|\theta) = \sum_{i=1}^{N} n_i \ln p_i(\theta)$$

in the usual manner.

It is obvious that when the number of classes is large, corresponding to small intervals $\Delta x_i$, this method is equivalent to the ordinary procedure for unclassified data. It is also evident that if the p.d.f. varies significantly over each interval, corresponding to few classes, large $\Delta x_i$ and/or a rapidly varying function, the average $p_i$ will not be a good approximation for the probability of each class. In any case, the grouping of the data into classes implies a loss of information about the parameter and should therefore be avoided if possible.

Exercise 9.22: In the classification above, assume that, for class $i$, the number of events $n_i$ is a Poisson variable with mean value $\nu_i$. The joint probability, or likelihood, for observing just $n_1,n_2,\ldots,n_N$ events in the $N$ classes is then

$$L = \prod_{i=1}^{N}\frac{1}{n_i!}\,\nu_i^{\,n_i}\,e^{-\nu_i},$$

where $\nu_i = n\int_{\Delta x_i} f(x|\theta)\,dx$, $\sum\nu_i = \sum n_i = n$. Show that, when the number of events in each class is large, $n_i \gg 1$,

$$\ln L = \sum_{i=1}^{N}\bigl(n_i(\ln n_i - 1) - \ln(n_i!)\bigr) - \tfrac{1}{2}\chi^2,$$

where $\chi^2$ denotes the sum over the classes of the squared deviations $(n_i-\nu_i)^2$, each divided by the corresponding variance. The maximization of $\ln L$ as a function of $\theta$ is now the same as minimizing $\chi^2$. Hence the ML estimation becomes equivalent to a Least-Squares estimation of the parameter; see Sect. 10.5.1.

9.10 COMBINING EXPERIMENTS BY THE MAXIMUM-LIKELIHOOD METHOD

Consider two independent experiments with the purpose of determining the same physical parameter $\theta$ from two sets of observations $\mathbf{x}$ and $\mathbf{y}$, corresponding to the likelihood functions $L(\mathbf{x}|\theta)$ and $L(\mathbf{y}|\theta)$, respectively. The joint probability of all observations is

$$L(\mathbf{x},\mathbf{y}|\theta) = L(\mathbf{x}|\theta)\,L(\mathbf{y}|\theta), \tag{9.89}$$

and the ML estimate of $\theta$ from the composite experiment is found by maximizing this combined likelihood in the usual manner. Thus the combined estimate $\hat\theta$ can be found whenever the likelihood functions of the two experiments are known. In particular for low statistics experiments, when the LF's are far from being of Gaussian shape, this procedure is better than taking some weighted average of the ML estimates from the individual experiments.

In the limiting cases when the LF's are of approximate Gaussian shape, the combination of several experiments becomes very simple, because the combined LF will be but a product of normal p.d.f.'s. Hence the formulae from Sect. 9.2.2 for the weighted mean can be applied.

The prescription for combining independent experiments by eq.(9.89) follows as a consequence from Bayes' Theorem. From Sect. 2.4 we recall that if $\pi(\theta)$ represents the prior knowledge about $\theta$, the posterior knowledge $\pi(\theta|\mathbf{x})$
about $\theta$, given the observations $\mathbf{x}$, is given by (compare eq.(2.26))

$$\pi(\theta|\mathbf{x}) = \frac{L(\mathbf{x}|\theta)\,\pi(\theta)}{\int L(\mathbf{x}|\theta)\,\pi(\theta)\,d\theta}.$$

Before the second set of observations $\mathbf{y}$ is made $\pi(\theta|\mathbf{x})$ is the new prior knowledge, and the new posterior knowledge becomes

$$\pi(\theta|\mathbf{x},\mathbf{y}) = \frac{L(\mathbf{y}|\theta)\,\pi(\theta|\mathbf{x})}{\int L(\mathbf{y}|\theta)\,\pi(\theta|\mathbf{x})\,d\theta}.$$

For the combined observations we must also have

$$\pi(\theta|\mathbf{x},\mathbf{y}) = \frac{L(\mathbf{x},\mathbf{y}|\theta)\,\pi(\theta)}{\int L(\mathbf{x},\mathbf{y}|\theta)\,\pi(\theta)\,d\theta},$$

and the two expressions are seen to lead to

$$L(\mathbf{x},\mathbf{y}|\theta) = L(\mathbf{x}|\theta)\,L(\mathbf{y}|\theta).$$

The present remarks will obviously hold also for the multi-parameter case.

9.11 APPLICATION OF THE MAXIMUM-LIKELIHOOD METHOD TO WEIGHTED EVENTS

It has so far in this chapter been tacitly assumed that the p.d.f. used to construct the likelihood function gives an adequate description of the experimental data. We know, however, that quite frequently the theoretical, ideal p.d.f. will cover unphysical regions for the measurable variable, or the experimental situation will be such that the data are distorted compared to the ideal expectation. We saw in Chapter 6 that it is possible, in favourable cases, to modify the ideal p.d.f. to include corrections for these effects, and construct a modified p.d.f. which will be directly comparable to the raw experimental data. Specifically, we saw that one can take into account

- random observational errors, due to measuring uncertainties, handled by "smearing" the ideal distribution by the experimental resolution function (Sect. 6.2),
- systematic effects, due for example to imperfect detection ability (Sect. 6.3).

For both types of effects, the new, corrected p.d.f. should then be used in the construction of the LF, and the parameters estimated as before. An example on the effect of folding in the experimental resolution is given by Exercise 9.23 at the end of this section.

When we have not succeeded in correcting the p.d.f. for experimental biasses, we are in a less fortunate position and have to rely on the approximate method (ii) of Sect. 6.3, consisting in applying weight factors to the observed events before comparison is made with the ideal model.

Suppose that we have a biassed sample of observations, due to an uneven detection efficiency. Essentially this means that at the value $x_i$, where we have observed one event, there should have been $w_i$ events, where $w_i$ is the reciprocal of the detection probability for the observed event. A fairly obvious construction of an approximate likelihood function in this situation is

$$L'(\mathbf{x}|\boldsymbol\theta) = \prod_{i=1}^{n} f(x_i|\boldsymbol\theta)^{w_i}, \tag{9.90}$$

or, equivalently,

$$\ln L'(\mathbf{x}|\boldsymbol\theta) = \sum_{i=1}^{n} w_i \ln f(x_i|\boldsymbol\theta) = \sum_{i=1}^{n} w_i \ln f_i, \tag{9.91}$$

which can be maximized with respect to $\boldsymbol\theta$ in the usual way.

It can be shown that this approximate method will lead to estimates $\hat{\boldsymbol\theta}$ which are asymptotically normally distributed about the true parameter values. Since, however, $L'$ is not in a simple way related to the true (unknown) likelihood function of the problem, the usual procedures to determine the errors are no more valid. In particular, taking the inverse of the second derivatives of $-\ln L'$ would underestimate the errors, since it implies the observation of $\sum w_i > n$ events.

In the one-parameter case a crude procedure for the determination of the error would be to take the error as deduced from $L'$ by the ordinary methods (for example, the graphical method) and multiply it by a factor equal to the square root of the average weight of the events. This would correspond to having the variance
which simplifies to the usual large sample formula (9.36) if all weights are equal to one.

For the multi-parameter case with $k$ unknowns it has been shown by F. Solmitz that the variance matrix for $\hat{\boldsymbol\theta}$ is asymptotically given by

$$V(\hat{\boldsymbol\theta}) = H^{-1}H'H^{-1}, \tag{9.93}$$

where $H^{-1}$ is the variance matrix to be deduced from $L'$,

$$H_{\ell m} = -\frac{\partial^2 \ln L'}{\partial\theta_\ell\,\partial\theta_m}, \qquad \ell, m = 1,2,\ldots,k, \tag{9.94}$$

and where the matrix $H'$ is defined by

$$H'_{\ell m} = \sum_{i=1}^{n} w_i^2\,\frac{\partial \ln f_i}{\partial\theta_\ell}\,\frac{\partial \ln f_i}{\partial\theta_m}. \tag{9.95}$$

It is seen that eqs.(9.93) - (9.95) reduce to eq.(9.92) if there is only one parameter, and the special case with all $w_i = 1$ reproduces our earlier asymptotic result, eq.(9.37).

There is always loss of information involved in the weighting procedure, but this will be serious only if very large weights occur. As very large weights may arbitrarily increase the variance $V(\hat{\boldsymbol\theta})$, one will sometimes improve the precision in the parameter estimates by excluding some events from the sample. In bubble chamber experiments, for example, one can usually avoid the unwanted large weights by a suitable choice of fiducial volume.

Exercise 9.23: (The effect of experimental resolution)
An azimuthal angle $\phi$ is distributed according to the ideal theoretical p.d.f.

$$f(\phi|\alpha) = \frac{1}{2\pi}(1 + \alpha\cos\phi), \qquad 0 \le \phi \le 2\pi,$$

where $\alpha$ is unknown. Measurements on $\phi$ are associated with Gaussian distributed random errors, i.e. the resolution function is of the form $r(\phi';\phi) \sim \exp\bigl(-\tfrac{1}{2}(\phi'-\phi)^2/R^2\bigr)$ where $R$ is the resolution width (compare eq.(6.4)).

(i) If $R \ll 2\pi$, show that the resolution transform can be expressed as

$$f'(\phi';R|\alpha) = \frac{1}{2\pi}\bigl(1 + \alpha\exp(-\tfrac{1}{2}R^2)\cos\phi'\bigr), \qquad 0 \le \phi' \le 2\pi.$$

(ii) Show that the ML estimates $\alpha^*$ and $\alpha'^*$ for the unknown parameter obtained with the uncorrected p.d.f. $f(\phi|\alpha)$ and the "smeared" p.d.f. $f'(\phi';R|\alpha)$, respectively, are related by

$$\alpha'^* = \alpha^*\exp(\tfrac{1}{2}R^2).$$

Thus, random observational errors on the data will affect the ML estimate of the parameter. With reasonably good resolution, however, the change is small. For example, a resolution width of 10° will cause only a 1.5% correction.

9.12 A CASE STUDY: AN ILL-BEHAVED LIKELIHOOD FUNCTION

We will now give an example, from a low statistics experiment, on an ill-behaved likelihood function in two parameters. We will show that the introduction of the generalized LF improves the shape somewhat, but only to a very small extent. The real cure of the problem can only be obtained after a close look at the physics involved. In fact, for the present problem two experiments described by slightly different p.d.f.'s should be combined to give a well-behaved LF in the two parameters.

The physics question is the following: Is the decay $K^0 \to \pi^+\pi^-\pi^0$ due to the decay of the long-lived $K^0_L$ only, or is it partly due to the decay of the short-lived $K^0_S$? In the latter case, CP is violated, and the reaction can give information on the complex CP-violating parameter $\eta$, defined by the ratio

$$\eta = \frac{\text{Amplitude}\,(K^0_S \to \pi^+\pi^-\pi^0)}{\text{Amplitude}\,(K^0_L \to \pi^+\pi^-\pi^0)} = \mathrm{Re}\,\eta + i\,\mathrm{Im}\,\eta.$$

(It should be noted that the decay $K^0_L \to \pi^+\pi^-\pi^0$ has already been observed and that its decay rate $\Gamma(K^0_L \to \pi^+\pi^-\pi^0)$ is known.)

The random variable (observable) in the problem is the proper flight-time $t$ of the $K^0$ between production and decay, and its p.d.f. is given by

for $t_{\min} < t < t_{\max}$. $C$ is a normalization factor, given by

$$C^{-1} = \int_{t_{\min}}^{t_{\max}} N(t|\eta)\,dt,$$

and

$N_0$ = number of $K^0$'s produced at $t = 0$ (known),
$\lambda_S$, $\lambda_L$ = the total decay rates of $K^0_S$ and $K^0_L$ (known),
$\delta$ = mass difference between $K^0_L$ and $K^0_S$ (known).

For 180 observed events a contour plot of the likelihood function

$$L(\eta) = \prod_{i=1}^{180} C_i\,N(t_i|\eta) \tag{9.97}$$

in the ($\mathrm{Re}\,\eta$, $\mathrm{Im}\,\eta$) plane is given in Fig. 9.8(a). The LF has two maxima, one pronounced maximum at $\mathrm{Re}\,\eta = -2.68$, $\mathrm{Im}\,\eta = 0.55$ and one less pronounced maximum at $\mathrm{Re}\,\eta = 0.15$, $\mathrm{Im}\,\eta = -0.06$. The difference in $\ln L$ between the maxima is 2.18.

We next consider the effect of imposing an absolute normalization given by the known decay rate $\Gamma(K^0_L \to \pi^+\pi^-\pi^0)$. We write the generalized likelihood function as

$$\mathcal{L}(\eta) = \frac{\nu^{180}}{180!}\,e^{-\nu}\,L(\eta), \tag{9.98}$$

where $\nu$ is the expected number of events, which can be calculated when the detection efficiency is known.

The contours of the generalized likelihood function are shown in Fig. 9.8(b). One still finds two maxima; they are slightly better separated, but their difference in $\ln L$ is now only about 0.24. There is, therefore, no real justification for favouring one solution above the other. Referring to the physics of the problem, one solution suggests the existence of a huge CP-violating effect, while the other solution is consistent with no CP-violation.

Fig. 9.8. Maximum likelihood points and likelihood contours for the experiment described in the text. (a) The LF of eq.(9.97). (b) The generalized LF of eq.(9.98). (c) Monte Carlo generated LF's assuming decays of initially produced $K^0$ (full curves) and $\bar{K}^0$ (dotted curves).

One might wonder if the ill-behaved LF and the two solutions are just bad luck in this particular experiment, and due to some large statistical fluctuation in the data. To check this point, many artificial samples consisting of 180 events each were generated by the Monte Carlo method for $\mathrm{Re}\,\eta = \mathrm{Im}\,\eta = 0$ and the likelihood function constructed. The contours of the LF for a typical artificial sample are shown by the full drawn curves in Fig. 9.8(c). In addition to the maximum at $\mathrm{Re}\,\eta \approx \mathrm{Im}\,\eta \approx 0$ we observe a pronounced tail in the LF favouring large negative values of $\mathrm{Re}\,\eta$ and $\mathrm{Im}\,\eta$. Thus, in a band in the ($\mathrm{Re}\,\eta$, $\mathrm{Im}\,\eta$) plane the proper flight-time of the $K^0$ is little sensitive to the values of $\eta$. The shape of the full likelihood contours of Fig. 9.8(c) indicates that the experiment cannot easily distinguish between a broad range of values of the parameter $\eta$. The indeterminacy can, however, be solved by performing a new, but very similar experiment. This rests upon the following observation: If one starts with K-zero with strangeness equal to $-1$ to study $\bar{K}^0 \to \pi^+\pi^-\pi^0$ the last term in the p.d.f. (9.96) changes sign, whereas everything else is unchanged.
The contours for a typical LF obtained from 180 artificial $\bar{K}^0$ events are shown by the dotted curves in Fig. 9.8(c). The dotted curves correspond simply to the full drawn curves reflected at the origin. We conclude therefore, that if $\eta$ is close to zero a good procedure to determine the parameter would be to combine $K^0$ and $\bar{K}^0$ data. This has in fact also been done, and it is found that $\eta$ is indeed close to zero, and consistent with no CP-violation in $K^0 \to \pi^+\pi^-\pi^0$ decay.

10. The Least-Squares method

In this chapter we shall discuss the estimation of parameters by the Least-Squares (LS) method, probably the estimation method most frequently used in practice.

The popularity of the LS method may partly be ascribed to the fact that it has had a long history during which it has been applied to a number of specific problems as well as to problems of more general nature. Besides this importance gained by tradition, the acceptability of the LS method, as for any systematic estimation principle, depends on the properties of the estimators to which it leads. Unlike the Maximum-Likelihood method the Least-Squares method has no general optimum properties to recommend it, even asymptotically. However, for an important class of problems, where the parameter dependence is linear, the LS method has the virtue that it, even for small samples, produces unbiassed estimators of minimum variance.

In the following we consider first the simple case with linear parameter dependence, then we proceed to the non-linear case, and further to situations of increasing complexity involving first linear, and later general constraint equations.

10.1 BASIS FOR THE LEAST-SQUARES METHOD

10.1.1 The Least-Squares Principle

Briefly, the basis for the LS estimation method may be stated as follows:
At the observational points $x_1,x_2,\ldots,x_N$ we are given a set of $N$ independent, experimental values $y_1,y_2,\ldots,y_N$. The true values $\eta_1,\eta_2,\ldots,\eta_N$ of the observables are not known, but we assume that some theoretic model exists, which predicts the true value associated with each $x_i$ through some functional dependence

$$f_i = f_i(\theta_1,\theta_2,\ldots,\theta_L;\,x_i),$$

where $\theta_1,\theta_2,\ldots,\theta_L$ is a set of parameters, $L \le N$. According to the Least-Squares Principle the best values of the unknown parameters are those which make

$$\chi^2 = \sum_{i=1}^{N} w_i\,(y_i - f_i)^2 = \text{minimum}, \tag{10.1}$$

where $w_i$ is the weight ascribed to the $i$-th observation. The set of parameters $\hat{\boldsymbol\theta} = \{\hat\theta_1,\hat\theta_2,\ldots,\hat\theta_L\}$ which produces the smallest value for $\chi^2$ is called the Least-Squares estimate of the parameters.

The weight $w_i$ expresses the accuracy in the measurement $y_i$. In many situations one assumes that all observations are equally accurate. In such cases the LS solution for the parameters is found by determining the minimum of the unweighted sum of squared deviations, i.e. one minimizes the quantity

$$\chi^2 = \sum_{i=1}^{N} (y_i - f_i)^2. \tag{10.2}$$

This is called unweighted LS estimation.

If the errors in the different observations are different but known the weight of the $i$-th observation is usually taken equal to its precision, $w_i = 1/\sigma_i^2$. The quantity to be minimized is then

$$\chi^2 = \sum_{i=1}^{N} \left(\frac{y_i - f_i}{\sigma_i}\right)^2. \tag{10.3}$$

Quite frequently when the observations are known to be of different accuracy one cannot claim a precise knowledge about the errors. Instead one must estimate the error in the individual measurements. Suppose for example, that the measurement $y_i$ gives the number of events in a class $i$. One could then use the approximation $\sigma_i^2 \approx f_i$, equivalent to considering the true $\eta_i$ a Poisson variable, with mean value and variance equal to $f_i$, i.e. one puts

$$\chi^2 = \sum_{i=1}^{N} \frac{(y_i - f_i)^2}{f_i}. \tag{10.4}$$

When $f_i$ is a complicated function one can also, for computational convenience, see the approximation $\sigma_i^2 \approx y_i$ be used, with

$$\chi^2 = \sum_{i=1}^{N} \frac{(y_i - f_i)^2}{y_i}. \tag{10.5}$$

This can be called a simplified LS estimation.

If the observations are correlated, with errors and covariance terms given by the (symmetric) covariance matrix $V(\mathbf{y})$, the Least-Squares Principle for finding the best values of the unknown parameters is formulated as

$$\chi^2 = \sum_{i=1}^{N}\sum_{j=1}^{N} (y_i - f_i)\,V^{-1}_{ij}\,(y_j - f_j) = \text{minimum}. \tag{10.6}$$

It has in all formulations above been tacitly assumed that the $x_i$ are precise values, with no errors attached to them. Each $x_i$ may have a preassigned value, or it has been measured with an error which is negligible compared to the error of the corresponding $y_i$. Alternatively, the "observational point" $x_i$ can stand for a well-defined region from $x_i$ to $x_i+\Delta x_i$, say. Whatever the meaning of $x_i$ the crucial assumption is that it is possible to make a precise evaluation of the predicted value $f_i$ corresponding to $x_i$, or the region from $x_i$ to $x_i+\Delta x_i$. Similarly, the experimental value $y_i$ may be considered as the outcome of a single measurement, or more measurements (an average, say) at the point $x_i$, eventually in the region from $x_i$ to $x_i+\Delta x_i$. We may well speak of $x$ as an independent, and $y$ as a dependent variable.

Finally, it may be worthwhile to emphasize that the LS estimation method makes no requirement about the distributional properties of the observables. In this sense the LS estimation is distribution-free. On the other hand, if the observables are normally distributed, the minimum value $\chi^2_{\min}$ will, under certain conditions, be distributed as a chi-square variable; it will then be possible to give a quantitative measure of the overall fit between observations and model based on the properties of the chi-square distribution. This may explain why chi-square (or $\chi^2$) minimization in common parlance is used as synonymous with Least-Squares estimation.

10.1.2 Connection between the LS and the ML estimation methods

Let us now assume that we want to gain information on the true values $\eta_i$ of the observable, on the basis of the observed numbers $y_i$. We can then show that the Least-Squares and the Maximum-Likelihood Principles are equivalent under certain conditions. If we assume that the individual measurements $y_i$ are normally distributed about their true, unknown values $\eta_i$ with variance $\sigma_i^2$, i.e. the variable $y_i$ is $N(\eta_i,\sigma_i^2)$, then the likelihood for observing the series $y_1,y_2,\ldots,y_N$ is

$$L = \prod_{i=1}^{N}\frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\Bigl(-\tfrac{1}{2}\Bigl(\frac{y_i-\eta_i}{\sigma_i}\Bigr)^2\Bigr) = \Bigl(\prod_{i=1}^{N}\frac{1}{\sqrt{2\pi}\,\sigma_i}\Bigr)\exp\Bigl(-\tfrac{1}{2}\sum_{i=1}^{N}\Bigl(\frac{y_i-\eta_i}{\sigma_i}\Bigr)^2\Bigr).$$

According to the Maximum-Likelihood Principle the most probable values of the unknown $\eta_i$'s are those which make $L$ as large as possible. Evidently, $L$ is at maximum when

$$\sum_{i=1}^{N}\Bigl(\frac{y_i-\eta_i}{\sigma_i}\Bigr)^2 = \text{minimum},$$

which is equivalent to the Least-Squares Principle, provided the weights are identified with the precisions in the individual, independent measurements, $w_i = 1/\sigma_i^2$.

10.2 THE LINEAR LEAST-SQUARES MODEL

The LS estimation of unknown parameters becomes especially attractive if the theoretical model implies a linear dependence on the parameters and the weights are independent of the parameters. As we shall see in the following the LS method then provides an exact solution for the parameters, which can be expressed in a simple closed form. Moreover, the LS estimator in the linear case possesses the theoretical optimum properties of uniqueness, unbiassedness and minimum variance.

10.2.1 Example: Fitting a straight line (1)

As a simple example on an unweighted LS estimation, suppose that we are given a set of experimental points $(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)$ and that we want to find the "best" straight line passing through these points. We try the parameterization

$$f_i = \theta_1 + x_i\theta_2. \tag{10.7}$$

If we assume that all errors can be neglected the LS solution is obtained by determining the minimum of the unweighted sum of squared deviations, eq.(10.2),

$$\chi^2 = \sum_{i=1}^{N}(y_i - f_i)^2 = \sum_{i=1}^{N}(y_i - \theta_1 - x_i\theta_2)^2.$$

When the derivatives of $\chi^2$ with respect to $\theta_1$ and $\theta_2$ are put equal to zero we get the two equations

$$\frac{\partial\chi^2}{\partial\theta_1} = \sum_{i=1}^{N}(-2)(y_i - \theta_1 - x_i\theta_2) = 0, \qquad \frac{\partial\chi^2}{\partial\theta_2} = \sum_{i=1}^{N}(-2x_i)(y_i - \theta_1 - x_i\theta_2) = 0. \tag{10.8}$$

This set of linear equations can be written in the form

$$N\theta_1 + \sum_{i=1}^{N}x_i\,\theta_2 = \sum_{i=1}^{N}y_i, \qquad \sum_{i=1}^{N}x_i\,\theta_1 + \sum_{i=1}^{N}x_i^2\,\theta_2 = \sum_{i=1}^{N}x_iy_i. \tag{10.9}$$

The solution for the parameters becomes

$$\hat\theta_1 = \frac{\sum x_i^2\sum y_i - \sum x_iy_i\sum x_i}{N\sum x_i^2 - (\sum x_i)^2}, \qquad \hat\theta_2 = \frac{N\sum x_iy_i - \sum x_i\sum y_i}{N\sum x_i^2 - (\sum x_i)^2}, \tag{10.10}$$

where the summations are over all observations.

10.2.2 The normal equations

Let us now see how the independent observations $(y_1\pm\sigma_1),(y_2\pm\sigma_2),\ldots,(y_N\pm\sigma_N)$ at the points $x_1,x_2,\ldots,x_N$ can be fitted by the weighted LS method to a linear model with $L$ parameters,

$$f_i = f_i(\theta_1,\theta_2,\ldots,\theta_L;\,x_i) = \sum_{\ell=1}^{L} a_{i\ell}\,\theta_\ell, \qquad i = 1,2,\ldots,N, \tag{10.11}$$

where $L \le N$. Here $a_{i\ell}$ is the coefficient appearing with the $\ell$-th parameter;
~ x e r c i s e10.1: I n the example o f Sect.lO.2.1, ass- t h a t the o b s e r v a t i o n s
u s u a l l y a i Q i s a f u n c t i o n o f t h e r.'s.
we "ow seek the parameter v a l u e s which minimize t h e q u a n t i t y o f e q .
...,
~ , . Y ~ , . . . . Y N have errors o ~ , Q t , oN.,Find t h e weighted LS a o l u t i o n f o r t h e
parameters R I and 82 Of t h e s t r a i g h t l ~ n e f = 81 + x 8 ~ . N m r i c a l l y , t a k e t h e
four p o i n t s (x..y.) a s (0,2).(2,5),(5,7),(8,10) and ass- a l l yi's t o have a
(10.31, r e l a t i v e error'of'loz.

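$$\chi^2 = \sum_{i=1}^{N}\left(\frac{y_i - f_i}{\sigma_i}\right)^2 = \sum_{i=1}^{N}\frac{1}{\sigma_i^2}\Bigl(y_i - \sum_{\ell=1}^{L} a_{i\ell}\theta_\ell\Bigr)^2 . \tag{10.12}$$

By equating all derivatives

$$\frac{\partial\chi^2}{\partial\theta_k} = \sum_{i=1}^{N}(-2)\,a_{ik}\,\frac{1}{\sigma_i^2}\Bigl(y_i - \sum_{\ell=1}^{L} a_{i\ell}\theta_\ell\Bigr)$$

to zero we get the L conditions

$$\sum_{i=1}^{N}\frac{a_{ik}}{\sigma_i^2}\Bigl(y_i - \sum_{\ell=1}^{L} a_{i\ell}\hat\theta_\ell\Bigr) = 0, \qquad k = 1,\ldots,L, \tag{10.13}$$

which can also be written as

$$\sum_{\ell=1}^{L}\Bigl(\sum_{i=1}^{N}\frac{a_{ik}a_{i\ell}}{\sigma_i^2}\Bigr)\hat\theta_\ell = \sum_{i=1}^{N}\frac{a_{ik}\,y_i}{\sigma_i^2}, \qquad k = 1,\ldots,L. \tag{10.14}$$

We see that eqs.(10.13), (10.14) are the generalizations of eqs.(10.8), (10.9) for the simple unweighted straight-line fit of the previous section.

It may be illuminating to write eqs.(10.14) out:

$$\begin{aligned}
\hat\theta_1\sum_i\frac{a_{i1}a_{i1}}{\sigma_i^2} + \hat\theta_2\sum_i\frac{a_{i1}a_{i2}}{\sigma_i^2} + \cdots + \hat\theta_L\sum_i\frac{a_{i1}a_{iL}}{\sigma_i^2} &= \sum_i\frac{a_{i1}y_i}{\sigma_i^2},\\
\hat\theta_1\sum_i\frac{a_{i2}a_{i1}}{\sigma_i^2} + \hat\theta_2\sum_i\frac{a_{i2}a_{i2}}{\sigma_i^2} + \cdots + \hat\theta_L\sum_i\frac{a_{i2}a_{iL}}{\sigma_i^2} &= \sum_i\frac{a_{i2}y_i}{\sigma_i^2},\\
&\;\;\vdots\\
\hat\theta_1\sum_i\frac{a_{iL}a_{i1}}{\sigma_i^2} + \hat\theta_2\sum_i\frac{a_{iL}a_{i2}}{\sigma_i^2} + \cdots + \hat\theta_L\sum_i\frac{a_{iL}a_{iL}}{\sigma_i^2} &= \sum_i\frac{a_{iL}y_i}{\sigma_i^2}.
\end{aligned} \tag{10.15}$$

These are the normal equations for the L unknown parameters. Since there are L inhomogeneous linear equations for the L unknowns the normal equations provide an exact and unique solution.

When the number of parameters is small, L ≤ 3, and the number of observations is not too large the normal equations are easily solved "by hand". If there are more than three parameters, or there are many observations, most people would probably use a computer for the calculations.

10.2.3 Matrix notation

We shall rephrase the formulae for the linear problem of the last section in terms of matrix notation.

We order the measurements and predictions in two column vectors $\mathbf{y}$ and $\mathbf{f}$, both with N elements, and let $\boldsymbol\theta$ be a column vector with the L parameters (L ≤ N),

$$\mathbf{y} = (y_1, y_2, \ldots, y_N)^T, \qquad \mathbf{f} = (f_1, f_2, \ldots, f_N)^T, \qquad \boldsymbol\theta = (\theta_1, \theta_2, \ldots, \theta_L)^T. \tag{10.16}$$

The errors in $\mathbf{y}$ are given in a diagonal N by N matrix (diagonal, since the observations were assumed independent),

$$V(\mathbf{y}) = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_N^2), \tag{10.17}$$

and the coefficients are organized in a matrix A with N rows and L columns,

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1L}\\ a_{21} & a_{22} & \cdots & a_{2L}\\ \vdots & & & \vdots\\ a_{N1} & a_{N2} & \cdots & a_{NL}\end{pmatrix}. \tag{10.18}$$

The linear dependence of the theoretical predictions on the parameters, eq.(10.11), is then expressed as

$$\mathbf{f} = A\boldsymbol\theta, \tag{10.19}$$

and the quantity to be minimized is

$$\chi^2 = (\mathbf{y} - A\boldsymbol\theta)^T V^{-1} (\mathbf{y} - A\boldsymbol\theta). \tag{10.20}$$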
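As a small illustration of the matrix notation, the sketch below evaluates $\chi^2$ once as the explicit weighted sum over observations and once as the quadratic form $(\mathbf{y} - A\boldsymbol\theta)^T V^{-1}(\mathbf{y} - A\boldsymbol\theta)$, and checks that the two agree. The straight-line model, the data values and the function names are assumptions chosen for the example, not taken from the text.

```python
def chi2_sum(ys, sigmas, A, theta):
    """chi^2 as the explicit weighted sum over the observations."""
    total = 0.0
    for y, s, row in zip(ys, sigmas, A):
        f = sum(a * t for a, t in zip(row, theta))   # f_i = sum_l a_il theta_l
        total += ((y - f) / s) ** 2
    return total


def chi2_matrix(ys, V_inv, A, theta):
    """chi^2 as the quadratic form (y - A theta)^T V^-1 (y - A theta)."""
    r = [y - sum(a * t for a, t in zip(row, theta)) for y, row in zip(ys, A)]
    n = len(r)
    return sum(r[i] * V_inv[i][j] * r[j] for i in range(n) for j in range(n))


xs = [0.0, 2.0, 5.0, 8.0]                  # hypothetical abscissae
ys = [2.0, 5.0, 7.0, 10.0]                 # hypothetical measurements
sigmas = [0.2, 0.5, 0.7, 1.0]              # hypothetical errors
A = [[1.0, x] for x in xs]                 # straight line: a_i1 = 1, a_i2 = x_i
V_inv = [[(1.0 / s**2 if i == j else 0.0) for j in range(len(ys))]
         for i, s in enumerate(sigmas)]    # inverse of the diagonal V(y)
theta = [2.0, 1.0]                         # trial parameter values
print(chi2_sum(ys, sigmas, A, theta), chi2_matrix(ys, V_inv, A, theta))
```

The quadratic form keeps working unchanged if `V_inv` has off-diagonal terms, which is the situation of correlated measurements.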

By putting the derivatives of $\chi^2$ with respect to $\boldsymbol\theta$ equal to zero we have the short-hand equivalent of eqs.(10.15),

$$\nabla_{\theta}\,\chi^2 = -2\,(A^T V^{-1}\mathbf{y} - A^T V^{-1} A\,\boldsymbol\theta) = 0, \tag{10.21}$$

from which

$$(A^T V^{-1} A)\,\hat{\boldsymbol\theta} = A^T V^{-1}\mathbf{y}. \tag{10.22}$$

This compact form should be compared to the normal equations (10.15). Provided the matrix $(A^T V^{-1} A)$ is non-singular, it can be inverted and the solution for $\boldsymbol\theta$ written in the closed form

$$\hat{\boldsymbol\theta} = (A^T V^{-1} A)^{-1} A^T V^{-1}\mathbf{y}. \tag{10.23}$$

One observes that to find the solution $\hat{\boldsymbol\theta}$ of eq.(10.20) it is sufficient that the covariance matrix V = V(y) is known up to a multiplicative factor. This is also seen directly from eq.(10.23).

The equations written down and solved in matrix notation are in fact more general than indicated, since they hold also when V(y) is a matrix with non-zero covariance terms. This corresponds to giving up the assumption introduced in the beginning, when it was stated that the measurements $\mathbf{y}$ should be independent. Relaxing this requirement corresponds of course to the more general formulation of the Least-Squares Principle by eq.(10.6).

We must next find out what can be said about the uncertainties in the linear LS estimate of the parameters, eq.(10.23). From Sect.3.8 we realize that the errors in the observed quantities $\mathbf{y}$ are carried over to the derived quantities $\hat{\boldsymbol\theta}$. Applying the general formula for error propagation, eq.(3.80), to eq.(10.23) we find that the covariance matrix for the LS estimate $\hat{\boldsymbol\theta}$ is

$$V(\hat{\boldsymbol\theta}) = \bigl[(A^T V^{-1} A)^{-1} A^T V^{-1}\bigr]\, V\, \bigl[(A^T V^{-1} A)^{-1} A^T V^{-1}\bigr]^T = (A^T V^{-1} A)^{-1}. \tag{10.24}$$

Clearly, to find $V(\hat{\boldsymbol\theta})$ the matrix V = V(y) must be known completely.

Equations (10.23) and (10.24) provide the solution to the linear LS problem for the unknown parameters $\boldsymbol\theta$. It is worth noting that the (symmetric) matrix $(A^T V^{-1} A)^{-1}$, which constitutes the covariance matrix $V(\hat{\boldsymbol\theta})$, appears as a separate part in the expression for $\hat{\boldsymbol\theta}$. Thus, no extra calculation is necessary to determine the errors on $\hat{\boldsymbol\theta}$, as the matrix $(A^T V^{-1} A)^{-1}$ has already been found in obtaining the solution $\hat{\boldsymbol\theta}$.

Exercise 10.2: Verify eq.(10.24).

10.2.4 Properties of the linear LS estimator

It should be stressed that the LS estimation problem in the general case of a linear functional dependence on the parameters, as discussed above, has been given an exact solution by the closed form of eq.(10.23). The matrix algebra involves no approximations and the solution is unique as long as the matrices are non-singular.

We have found that the Least-Squares Principle has produced $\hat{\boldsymbol\theta}$ as linear estimators, since $\hat{\boldsymbol\theta}$ come out with a linear dependence on the measurements $\mathbf{y}$. From the definition of unbiassedness, eq.(8.3), the $\hat{\boldsymbol\theta}$ are also unbiassed estimators, as can be seen by applying the expectation operator to $\hat{\boldsymbol\theta}$:

$$E(\hat{\boldsymbol\theta}) = (A^T V^{-1} A)^{-1} A^T V^{-1} E(\mathbf{y}) = (A^T V^{-1} A)^{-1} A^T V^{-1} A\,\boldsymbol\theta = \boldsymbol\theta,$$

when we use the fact that $E(\mathbf{y}) = \mathbf{f} = A\boldsymbol\theta$.

A further optimum property of the linear LS estimators is contained in the Gauss-Markov theorem: Among all unbiassed estimators which are linear functions of the observations the LS estimators have the smallest variance.

To prove the Gauss-Markov theorem, consider a vector $\mathbf{t}$ of estimators, linear in the observations $\mathbf{y}$,
$$\mathbf{t} = S\mathbf{y}. \tag{10.26}$$

These estimators have expectation $E(\mathbf{t}) = S\,E(\mathbf{y}) = SA\boldsymbol\theta$, where $\boldsymbol\theta$ is the original vector of parameters. If the new estimators are to be unbiassed for a set of linear functions of the parameters, say for $C\boldsymbol\theta$, then also $E(\mathbf{t}) = C\boldsymbol\theta$ for all $\boldsymbol\theta$, hence we must have

$$C = SA. \tag{10.27}$$

We want to find out when the covariance matrix $V(\mathbf{t})$ for the new estimators,

$$V(\mathbf{t}) = S V S^T,$$

has minimal diagonal terms. For this purpose we consider the identity (V = V(y))

$$S V S^T = \bigl(C(A^T V^{-1} A)^{-1} A^T V^{-1}\bigr)\, V\, \bigl(C(A^T V^{-1} A)^{-1} A^T V^{-1}\bigr)^T + \bigl(S - C(A^T V^{-1} A)^{-1} A^T V^{-1}\bigr)\, V\, \bigl(S - C(A^T V^{-1} A)^{-1} A^T V^{-1}\bigr)^T,$$

which holds because of the condition (10.27). Here each of the two terms on the right-hand side is of quadratic form $UVU^T$, which implies non-negative diagonal elements. Only the second term is a function of S, and the sum of the two terms will have strictly minimum diagonal elements when the second term has vanishing elements on the diagonal. This occurs when

$$S = C(A^T V^{-1} A)^{-1} A^T V^{-1}.$$

Therefore, the unbiassed, minimum variance estimator for $C\boldsymbol\theta$ is

$$\hat{\mathbf{t}} = C(A^T V^{-1} A)^{-1} A^T V^{-1}\mathbf{y} = C\hat{\boldsymbol\theta},$$

where $\hat{\boldsymbol\theta}$ is the LS estimator for $\boldsymbol\theta$. In particular, with C equal to the unit matrix, $\hat{\boldsymbol\theta}$ itself is the unbiassed linear estimator of $\boldsymbol\theta$ with minimum variance.

10.2.5 Example: Fitting a parabola

We shall work through an example of a linear Least-Squares fit to measurements of different accuracy, which requires a weighted LS estimation.

We want to fit the parabolic parameterization

$$f(\theta_1, \theta_2, \theta_3; x) = \theta_1 + x\theta_2 + x^2\theta_3$$

to the following observations:

Thus the problem involves three unknown parameters and four observations.

From eq.(10.18), (10.19) we can write down the matrix A of dimension 4 × 3 from the observations.

The independent measurements define the column vector $\mathbf{y} = \{5, 3, 5, 8\}$ and the diagonal covariance matrix

Because of its diagonal form the inverse of V can easily be written down. We find therefore for the product
The resulting matrix is of dimension 3 × 3, it is non-singular, and can be inverted. The inverse becomes

From the formula (10.23) we find the LS solution $\hat{\boldsymbol\theta}$ by carrying out the multiplication of the matrices,

Thus the LS solution for the parabola through the given data points is

The accuracy of the estimated parameters can be found from the covariance matrix $V(\hat{\boldsymbol\theta})$, which according to eq.(10.24) is nothing but the matrix $(A^T V^{-1} A)^{-1}$ above. Hence the estimates of the errors are given by the square roots of the diagonal elements of this matrix,

The parameters $\theta_1$ and $\theta_3$ are correlated, with the estimated correlation coefficient

Exercise 10.3: Derive the solution $\hat{\boldsymbol\theta}$ for the problem in the text by solving the normal equations.

Exercise 10.4: Derive the LS solution and its errors for the same problem with all measurement errors equal, $\sigma_i = 2$.

10.2.6 Example: Combining two experiments

To test the ΔS/ΔQ = 1 rule in weak interactions one can measure the complex quantity

$$x = \frac{\text{Amplitude}\,(K^0 \to \pi^+ \ell^- \bar\nu)}{\text{Amplitude}\,(K^0 \to \pi^- \ell^+ \nu)},$$

which gives a measure of the "violation" of the rule. Let the true values of the observables Re x and Im x be denoted by $\eta_1$ and $\eta_2$. Experiment A has measured both variables, and obtained the results

with errors given by

Experiment B measured $\eta_1$ only, with the result

$$y_1^B = 0.01 \pm 0.08 .$$

We want to find the best combined outcome of the two experiments.

We have two unknowns in this problem, $\eta_1$ and $\eta_2$, and three measurements, $y_1^A$, $y_2^A$ and $y_1^B$. In accordance with our formulation of the linear LS problem we put
The measurements are collected in a column vector $\mathbf{y}$ with 3 elements, and the covariance matrix extended to dimension 3 by 3.

Carrying out the matrix operations we find

and

It is worth noting that due to the correlations the estimate of Im x (= $\eta_2$) has changed, although this quantity was only measured in one experiment.

10.2.7 General polynomial fitting

We have in two previous examples considered Least-Squares fits to first-order and second-order polynomials in the variable x (Sects.10.2.1 and 10.2.5, respectively). It is frequently necessary to consider fits to higher-order polynomials of the general form

$$f(\boldsymbol\theta; x) = \sum_{\ell=1}^{L} x^{\ell-1}\theta_\ell . \tag{10.33}$$

Since the parameter dependence is still linear, such a problem has an exact solution of the form of eq.(10.23). However, as the power of the polynomial increases the inversion of the matrices involved becomes increasingly intricate. Serious numerical inaccuracies may occur when the degree of the polynomial gets as large as, say, 6 or 7.

10.2.8 Orthogonal polynomials

It is possible to avoid serious rounding-off errors in the matrix inversion by rewriting a linear model of the form of eq.(10.33) in terms of orthogonal polynomials of x. The effect is to produce matrices of diagonal form which can easily be inverted.

Let us consider a case where the measurements are uncorrelated and have the same error σ. The covariance matrix and its inverse are then multiples of the unit matrix $I_N$ of dimension N × N, $V(\mathbf{y}) = \sigma^2 I_N$, $V^{-1}(\mathbf{y}) = \frac{1}{\sigma^2} I_N$. With the linear model

$$f_i = \sum_{\ell=1}^{L} a_{i\ell}\theta_\ell \tag{10.11}$$

the solution of the equivalent unweighted LS problem is given by

$$\hat{\boldsymbol\theta} = (A^T A)^{-1} A^T \mathbf{y},$$

where A is the matrix of coefficients $a_{i\ell}$.

Suppose now that we have a set of L polynomials $\xi_\ell(x)$, which are orthogonal over the observations,

$$\sum_{i=1}^{N} \xi_k(x_i)\,\xi_\ell(x_i) = \delta_{k\ell}, \qquad k, \ell = 1, 2, \ldots, L. \tag{10.34}$$

We will take as our new model

$$f_i = \sum_{\ell=1}^{L} \xi_\ell(x_i)\,\omega_\ell, \tag{10.35}$$

where $\omega_\ell$ are the L new parameters. The matrix A and its transpose $A^T$ now have elements

$$(A)_{i\ell} = (A^T)_{\ell i} = a_{i\ell} = \xi_\ell(x_i).$$

The product matrix $A^T A$ is such that

$$(A^T A)_{k\ell} = \sum_{i=1}^{N} (A^T)_{ki}(A)_{i\ell} = \sum_{i=1}^{N} \xi_k(x_i)\,\xi_\ell(x_i) = \delta_{k\ell}.$$

Hence $A^T A = I_L$, the unit matrix of dimension L × L. The LS solution for the parameter vector $\boldsymbol\omega$ simplifies to

$$\hat{\boldsymbol\omega} = A^T \mathbf{y},$$

or, on component form,

$$\hat\omega_\ell = \sum_{i=1}^{N} \xi_\ell(x_i)\, y_i, \qquad \ell = 1, 2, \ldots, L.$$

The covariance matrix for $\hat{\boldsymbol\omega}$ becomes

$$V(\hat{\boldsymbol\omega}) = \sigma^2 (A^T A)^{-1} = \sigma^2 I_L,$$

i.e. a diagonal matrix.

We see therefore that the LS estimates of the parameters are easily derived in this case, with the errors on the uncorrelated estimates given directly by the (common) error on the measurements.

10.2.9 Example: Fitting a straight line (2)

Let us illustrate the use of orthogonal polynomials by reconsidering the problem of fitting a straight line through the points $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$, which was first treated in Sect.10.2.1.

The model is now to be written in terms of two parameters $\omega_1, \omega_2$ and two orthogonal functions $\xi_1(x_i), \xi_2(x_i)$. From the definition of the mean value

$$\bar x = \frac{1}{N}\sum_{i=1}^{N} x_i$$

we realize that orthogonality is ensured if we take

$$\xi_1(x_i) = 1, \qquad \xi_2(x_i) = x_i - \bar x .$$

This choice is, however, not quite in accordance with the formulation of the previous section, since the polynomials are not normalized to one, but give

$$\sum_{i=1}^{N}\xi_1^2(x_i) = N, \qquad \sum_{i=1}^{N}\xi_2^2(x_i) = \sum_{i=1}^{N}(x_i - \bar x)^2 .$$

The parameterization for the straight line is

$$f_i = \omega_1 + (x_i - \bar x)\,\omega_2,$$

and the matrix A, of dimension N by 2,

$$A = \begin{pmatrix} 1 & x_1 - \bar x\\ 1 & x_2 - \bar x\\ \vdots & \vdots\\ 1 & x_N - \bar x \end{pmatrix}.$$

Because of the relaxation of the normalization condition the product matrix $A^T A$ of dimension 2 × 2 is not a multiple of the unit matrix, but rather

$$A^T A = \begin{pmatrix} N & 0\\ 0 & \sum_i (x_i - \bar x)^2 \end{pmatrix}.$$

This matrix can trivially be inverted, giving

$$(A^T A)^{-1} = \begin{pmatrix} 1/N & 0\\ 0 & 1/\sum_i (x_i - \bar x)^2 \end{pmatrix}.$$

The LS solution for the coefficients $\boldsymbol\omega$ is therefore

$$\hat\omega_1 = \frac{1}{N}\sum_i y_i, \qquad \hat\omega_2 = \frac{\sum_i (x_i - \bar x)\,y_i}{\sum_i (x_i - \bar x)^2},$$

where the summations go over all measurements.

Exercise 10.5: Show that the result obtained in this section is consistent with the solution found in Sect.10.2.1.

Exercise 10.6: If the (independent) observations $y_i$ have a common error σ, what are the errors on the parameters for the straight line using (a) the model $f_i = \theta_1 + x_i\theta_2$ (Sect.10.2.1), (b) the model $f_i = \omega_1 + (x_i - \bar x)\omega_2$?

10.3 THE NON-LINEAR LEAST-SQUARES MODEL
We turn now to a more general problem when the predicted values $f_i$ have a non-linear dependence on the parameters. It is then not possible to write down an exact solution as was done for the linear case. Instead one has to perform the minimization by an iterative procedure to find increasingly better approximations for the unknown parameters.

Several iteration methods to locate the minimum values of a general function are outlined in Chapter 13. Here we describe only one method, which goes back to Isaac Newton.

10.3.1 Newton's method

According to the Least-Squares Principle the best estimates of the unknown parameters are the values which minimize the quantity

$$\chi^2 = (\mathbf{y} - \mathbf{f})^T V^{-1}(\mathbf{y})\,(\mathbf{y} - \mathbf{f}),$$

where $\mathbf{y}$ is the vector of measurements with covariance matrix $V(\mathbf{y})$, and $\mathbf{f}$ the vector of predicted values,

$$\mathbf{f} = \mathbf{f}(\boldsymbol\theta),$$

which is now non-linear in $\boldsymbol\theta$.

Suppose that we have found a set of parameter values

$$\boldsymbol\theta^{\nu} = (\theta_1^{\nu}, \theta_2^{\nu}, \ldots, \theta_L^{\nu}),$$

which corresponds to the ν-th iteration with the function value $\chi^2_\nu$, and that a better approximation is needed. We then evaluate the derivatives of $\chi^2$ with respect to the parameters at the point $\boldsymbol\theta = \boldsymbol\theta^{\nu}$ (remember that these would vanish if the minimum had been obtained!). In the case of independent measurements, when $V^{-1}_{ii}(\mathbf{y}) = 1/\sigma_i^2$, we can write

$$g_\ell \equiv \frac{\partial\chi^2}{\partial\theta_\ell} = \sum_{i=1}^{N} (-2)\,\frac{1}{\sigma_i^2}\,(y_i - f_i)\,\frac{\partial f_i}{\partial\theta_\ell}, \qquad \ell = 1, 2, \ldots, L, \tag{10.41}$$

where $f_i$ and $\partial f_i/\partial\theta_\ell$ are evaluated for $\boldsymbol\theta = \boldsymbol\theta^{\nu}$. If we can find an increment $\Delta\boldsymbol\theta^{\nu}$ to $\boldsymbol\theta^{\nu}$ which will make the gradient vector $\mathbf{g}$ equal to zero,

$$\mathbf{g}(\boldsymbol\theta^{\nu} + \Delta\boldsymbol\theta^{\nu}) = 0,$$

the problem has been solved, because then the corrected value

$$\boldsymbol\theta^{\nu+1} = \boldsymbol\theta^{\nu} + \Delta\boldsymbol\theta^{\nu}$$

will be the desired solution (provided, of course, that the extremum of $\chi^2$ is a minimum).

To find the correction $\Delta\boldsymbol\theta^{\nu}$ to the approximate parameter vector we expand the $g_\ell$ of eqs.(10.41) about $\boldsymbol\theta^{\nu}$ and keep terms up to first order. Then we demand

$$g_\ell(\boldsymbol\theta^{\nu}) + \sum_{k=1}^{L} \frac{\partial g_\ell}{\partial\theta_k}\,\Delta\theta_k^{\nu} = 0, \qquad \ell = 1, 2, \ldots, L, \tag{10.43}$$

where the derivatives are evaluated for $\boldsymbol\theta = \boldsymbol\theta^{\nu}$. Equations (10.43) constitute a set of L inhomogeneous, linear equations which can be solved for the unknown $\Delta\theta_\ell^{\nu}$, ℓ = 1, 2, ..., L in the usual way. The coefficient appearing before $\Delta\theta_k^{\nu}$ in the ℓ-th equation is

$$G_{\ell k} = \frac{\partial g_\ell}{\partial\theta_k} = \frac{\partial^2\chi^2}{\partial\theta_\ell\,\partial\theta_k}, \tag{10.44}$$

and may be found either analytically or numerically, depending on the problem. In eq.(10.44) it is understood that all expressions are evaluated for $\boldsymbol\theta = \boldsymbol\theta^{\nu}$. Obviously $G_{\ell k} = G_{k\ell}$, implying that G is a symmetric matrix.

In vector notation the solution for the correction $\Delta\boldsymbol\theta^{\nu}$ is (compare the formulae of Sects.10.2.2 and 10.2.3)

$$\Delta\boldsymbol\theta^{\nu} = -G^{-1}\mathbf{g}. \tag{10.45}$$

This relation assumes that G is non-singular, but is valid also when $\chi^2$ is constructed for correlated measurements.

The new parameter values $\boldsymbol\theta^{\nu+1} = \boldsymbol\theta^{\nu} + \Delta\boldsymbol\theta^{\nu}$ are next used to find the corresponding $\chi^2_{\nu+1}$, and if this quantity is smaller than $\chi^2_\nu$, a better solution has been obtained. The procedure is then repeated taking $\boldsymbol\theta^{\nu+1}$ as a new approximate solution, a new correction $\Delta\boldsymbol\theta^{\nu+1}$ is determined, and so on. The iterations are continued until the improvement in $\chi^2$ between two consecutive iterations becomes smaller than some preset number.

If it is found that a new iteration gives $\chi^2_{\nu+1} > \chi^2_\nu$ the minimum value has been passed over. One may then redefine the correction to the ν-th step by taking a smaller value, say half of the original correction $\Delta\boldsymbol\theta^{\nu}$, and repeat the procedure with this step.
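The iteration just described can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' program: it uses a hypothetical two-parameter model $f = \theta_1 e^{-\theta_2 x}$, independent measurements, analytic first and second derivatives for $\mathbf{g}$ and G, and step halving whenever $\chi^2$ increases. All function names are invented for the example.

```python
import math


def model(theta, x):
    """Hypothetical non-linear model f = theta1 * exp(-theta2 * x)."""
    return theta[0] * math.exp(-theta[1] * x)


def chi2(theta, xs, ys, sigmas):
    return sum(((y - model(theta, x)) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))


def gradient_and_G(theta, xs, ys, sigmas):
    """Gradient g and second-derivative matrix G of chi^2 (analytic)."""
    g = [0.0, 0.0]
    G = [[0.0, 0.0], [0.0, 0.0]]
    for x, y, s in zip(xs, ys, sigmas):
        w = 1.0 / s**2
        e = math.exp(-theta[1] * x)
        r = y - theta[0] * e                       # y_i - f_i
        d = [e, -theta[0] * x * e]                 # df/dtheta_l
        d2 = [[0.0, -x * e],                       # d2f/(dtheta_l dtheta_k)
              [-x * e, theta[0] * x * x * e]]
        for l in range(2):
            g[l] += -2.0 * w * r * d[l]
            for k in range(2):
                G[l][k] += 2.0 * w * (d[l] * d[k] - r * d2[l][k])
    return g, G


def newton_fit(theta, xs, ys, sigmas, tol=1e-12, max_iter=50):
    """Repeat theta -> theta + dtheta, with dtheta = -G^-1 g."""
    theta = list(theta)
    c_old = chi2(theta, xs, ys, sigmas)
    for _iteration in range(max_iter):
        g, G = gradient_and_G(theta, xs, ys, sigmas)
        det = G[0][0] * G[1][1] - G[0][1] * G[1][0]    # G must be non-singular
        step = [-(G[1][1] * g[0] - G[0][1] * g[1]) / det,
                -(G[0][0] * g[1] - G[1][0] * g[0]) / det]
        for _halving in range(30):                     # halve step if chi^2 grew
            trial = [t + d for t, d in zip(theta, step)]
            c_new = chi2(trial, xs, ys, sigmas)
            if c_new <= c_old:
                break
            step = [0.5 * d for d in step]
        else:
            return theta, c_old                        # no improving step found
        improvement = c_old - c_new
        theta, c_old = trial, c_new
        if improvement < tol:
            break
    return theta, c_old


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0 * math.exp(-0.5 * x) for x in xs]           # noise-free toy data
sigmas = [1.0] * len(xs)
theta_hat, chi2_min = newton_fit([8.0, 0.4], xs, ys, sigmas)
print(theta_hat, chi2_min)
```

With noise-free toy data the minimum of $\chi^2$ is zero at the generating parameter values, so the fit should return them; with real data the starting values matter, since G need not be positive definite far from the minimum.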
It will be seen that if $\chi^2$ is of second order in the parameters, i.e. if the theoretical value f is linear in them, the matrix G will become constant, independent of $\boldsymbol\theta$, and the exact minimum is found in the first iteration.

10.3.2 Example: Helix parameters in track reconstruction

As an example on an iterative $\chi^2$ minimization we will discuss the determination of track parameters for a charged particle moving in a magnetic field, for instance in a bubble chamber or a streamer chamber. We will for simplicity assume that the particle has a constant momentum over the track length considered (neglect ionization loss) and that the magnetic field is uniform. The particle trajectory in space will then be a helix with axis parallel to the direction of the magnetic field. Our problem is to determine the best parameters of this helix on the basis of a series of measured points along the path of the particle.

Let us assume that the N points in space $(X_i, Y_i, Z_i)$, i = 1, 2, 3, ..., N have been accurately measured with respect to a fixed coordinate system (xyz) with z axis along the direction of the magnetic field. Let us further for the moment assume that the starting point (A, B, C) of the track has been precisely measured. We take this point as the origin of a second coordinate system (x'y'z') with z' axis parallel to the z direction, and with the y' axis along the direction of the tangent to the track in the xy plane, see Fig. 10.1. In this relative coordinate system the helix can be parametrized as

$$x' = \rho(\cos\phi - 1), \qquad y' = \rho\sin\phi, \qquad z' = \rho\phi\tan\lambda,$$

where ρ is the radius of curvature of the projected (circular) path in the x'y' plane, and λ the angle between this plane and the tangent to the helix (the "dip" angle). In the fixed coordinate system the helix is expressed as

$$\begin{aligned}
x &= A + \rho(\cos\phi - 1)\cos\beta - \rho\sin\phi\sin\beta,\\
y &= B + \rho(\cos\phi - 1)\sin\beta + \rho\sin\phi\cos\beta,\\
z &= C + \rho\phi\tan\lambda,
\end{aligned}$$

where β is the angle describing the relative orientation of the two coordinate systems. With φ values corresponding to the measured points, evaluated as indicated later, we have therefore a set of "predicted" points for i = 1, 2, ..., N,

$$\begin{aligned}
x_i &= A + \rho(\cos\phi_i - 1)\cos\beta - \rho\sin\phi_i\sin\beta,\\
y_i &= B + \rho(\cos\phi_i - 1)\sin\beta + \rho\sin\phi_i\cos\beta,\\
z_i &= C + \rho\phi_i\tan\lambda,
\end{aligned} \tag{10.46}$$

and the LS solution for the unknown parameters is obtained by seeking the minimum of the (unweighted) expression

$$\chi^2 = \sum_{i=1}^{N}\bigl[(X_i - x_i)^2 + (Y_i - y_i)^2 + (Z_i - z_i)^2\bigr]. \tag{10.47}$$

The minimization is most conveniently done in terms of the parameters ρ, β and tan λ. With the formulation of Sect.10.3.1 the gradient vector $\mathbf{g}$ and the matrix G can be calculated from eqs.(10.41), (10.44). One obtains for the components of $\mathbf{g}$,

$$\begin{aligned}
g_\rho &= \frac{\partial\chi^2}{\partial\rho} = -\frac{2}{\rho}\sum_i\bigl[(X_i - x_i)(x_i - A) + (Y_i - y_i)(y_i - B) + (Z_i - z_i)(z_i - C)\bigr],\\
g_\beta &= \frac{\partial\chi^2}{\partial\beta} = 2\sum_i\bigl[(X_i - x_i)(y_i - B) - (Y_i - y_i)(x_i - A)\bigr],\\
g_\lambda &= \frac{\partial\chi^2}{\partial\tan\lambda} = -2\rho\sum_i (Z_i - z_i)\,\phi_i .
\end{aligned}$$

The matrix G has the following non-vanishing elements,

$$\begin{aligned}
G_{\rho\rho} &= \frac{\partial^2\chi^2}{\partial\rho^2} = \frac{2}{\rho^2}\sum_i\bigl[(x_i - A)^2 + (y_i - B)^2 + (z_i - C)^2\bigr],\\
G_{\rho\beta} = G_{\beta\rho} &= \frac{2}{\rho}\sum_i\bigl[(X_i - x_i)(y_i - B) - (Y_i - y_i)(x_i - A)\bigr],\\
G_{\rho\lambda} = G_{\lambda\rho} &= 2\sum_i\bigl[(z_i - C) - (Z_i - z_i)\bigr]\phi_i,\\
G_{\beta\beta} &= \frac{\partial^2\chi^2}{\partial\beta^2} = 2\sum_i\bigl[(X_i - A)(x_i - A) + (Y_i - B)(y_i - B)\bigr],\\
G_{\lambda\lambda} &= \frac{\partial^2\chi^2}{\partial(\tan\lambda)^2} = 2\rho^2\sum_i \phi_i^2,
\end{aligned}$$

while $G_{\beta\lambda} = G_{\lambda\beta} = 0$.

The starting values for the iteration procedure can be found as follows:

For the dip angle we take $\tan\lambda^0$ as the value obtained by a LS straight line fit through the N pairs of points $(s_i', z_i')$, where $z_i'$ is the measured z-coordinate expressed in the (x'y'z') system, and $s_i'$ the distance from the origin of this system to the measured point $(x_i', y_i', z_i')$.

For the radius of curvature $\rho^0$ and the rotation angle $\beta^0$ we can take the values obtained by a linear LS fit to a circle through the measured projected points $(X_i, Y_i, 0)$. Writing the equation for the circle as

$$\bigl(X - (A - a)\bigr)^2 + \bigl(Y - (B - b)\bigr)^2 = a^2 + b^2,$$

the fitted values of the parameters a and b give the starting values

$$\rho^0 = \sqrt{a^2 + b^2}, \qquad \beta^0 = \tan^{-1}\frac{b}{a}.$$

The azimuthal angle $\phi_i$ can then be expressed for the measured points as

$$\phi_i = \tan^{-1}\frac{Y_i - (B - b)}{X_i - (A - a)} - \beta^0, \qquad i = 1, 2, \ldots, N.$$

When the starting values have been obtained the gradient vector $\mathbf{g}^0$ and the matrix $G^0$ are evaluated according to the expressions above. The corrections $\Delta\rho^0$, $\Delta\beta^0$, $\Delta\tan\lambda^0$ are then evaluated from eq.(10.45) and the starting values replaced by the new approximation

$$\rho^1 = \rho^0 + \Delta\rho^0, \qquad \beta^1 = \beta^0 + \Delta\beta^0, \qquad \tan\lambda^1 = \tan\lambda^0 + \Delta\tan\lambda^0 .$$

A new iteration can then be made with this set and the process repeated until a satisfactory stationary value of $\chi^2$ is obtained. The iterations are stopped when the sum of the absolute values of the correction terms gets below a preset value, or after a preset maximum number of iterations has been performed.

In practice the assumption of a known origin of the track is not made. Instead one allows the coordinates A, B, C to vary and makes a helix fit with the parameter set A, C, ρ, β, tan λ, or B, C, ρ, β, tan λ. The starting values for A (or B) and C are then taken as the coordinates of the first measured point, and the starting values for ρ and β must be found by a non-linear LS fit of a circle to the measured projected points.

It should also be mentioned that in practice the measured points are not known with full precision, but are connected with errors $\Delta X_i$, $\Delta Y_i$, $\Delta Z_i$ and correlation terms. The minimization is accordingly carried out for a weighted $\chi^2$ function, rather than with the unweighted $\chi^2$ of eq.(10.47). The problem then implies more complicated formulae for the gradient vector $\mathbf{g}$ and the matrix G of second derivatives of $\chi^2$. As a result of the complete minimization one will in this situation, in addition to the fitted helix parameters, obtain a set of "improved measurements", or fitted values of the coordinates of the N space points.

Fig. 10.1. Track reconstruction with helix parameters (see text). Labels in the figure: helix axis; projection of the helix in the xy plane; projection of a measured point in the xy plane, equivalent to $(x_i', y_i', z_i')$ in the (x'y'z') system.
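The starting-value step of the helix fit can be illustrated with a short sketch. It assumes that the track origin (A, B) is known and that the projected circle passes through it, so that the circle can be written as $(X-A)^2 + (Y-B)^2 + 2a(X-A) + 2b(Y-B) = 0$, which is linear in a and b; this particular form, the toy parameter values and all function names are assumptions made for the example. The dip-angle start value is not fitted here; the generating value of tan λ is simply reused.

```python
import math


def helix_point(A, B, C, rho, beta, tan_lambda, phi):
    """Predicted point on the helix in the fixed system, eq.(10.46)."""
    xp = rho * (math.cos(phi) - 1.0)          # x' in the track system
    yp = rho * math.sin(phi)                  # y' in the track system
    x = A + xp * math.cos(beta) - yp * math.sin(beta)
    y = B + xp * math.sin(beta) + yp * math.cos(beta)
    z = C + rho * phi * tan_lambda
    return x, y, z


def circle_start_values(points, A, B):
    """Start values (rho0, beta0, a, b) from a linear LS circle fit.

    Assumed circle form, with centre (A - a, B - b):
        (X - A)^2 + (Y - B)^2 + 2a(X - A) + 2b(Y - B) = 0.
    """
    Suu = Suv = Svv = Sup = Svp = 0.0
    for X, Y, _Z in points:
        u, v = 2.0 * (X - A), 2.0 * (Y - B)
        p = -((X - A) ** 2 + (Y - B) ** 2)    # model: u*a + v*b = p
        Suu += u * u
        Suv += u * v
        Svv += v * v
        Sup += u * p
        Svp += v * p
    det = Suu * Svv - Suv * Suv               # 2x2 normal matrix
    a = (Svv * Sup - Suv * Svp) / det
    b = (Suu * Svp - Suv * Sup) / det
    return math.hypot(a, b), math.atan2(b, a), a, b


def azimuth(point, A, B, a, b, beta0):
    """Azimuthal angle phi_i of a measured point, relative to beta0."""
    X, Y, _Z = point
    return math.atan2(Y - (B - b), X - (A - a)) - beta0


# Toy track: exact points generated from hypothetical helix parameters.
A, B, C = 1.0, 2.0, 3.0
rho, beta, tan_lambda = 50.0, 0.3, 0.2
phis = [0.1 * k for k in range(1, 7)]
measured = [helix_point(A, B, C, rho, beta, tan_lambda, phi) for phi in phis]

rho0, beta0, a, b = circle_start_values(measured, A, B)
phi0 = [azimuth(p, A, B, a, b, beta0) for p in measured]
predicted = [helix_point(A, B, C, rho0, beta0, tan_lambda, phi) for phi in phi0]
chi2 = sum((m - q) ** 2 for P, Q in zip(measured, predicted)
           for m, q in zip(P, Q))             # unweighted chi^2, eq.(10.47)
print(rho0, beta0, chi2)
```

For exact points the circle fit reproduces the generating radius and rotation angle, so the unweighted $\chi^2$ vanishes up to rounding; with measured points these values would only serve as the start of the Newton iteration.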

10.4 LEAST-SQUARES FIT

10.4.1 "Improved measurements" (fitted variables) and residuals

In the preceding sections we have regarded the Least-Squares Principle as providing a prescription for how to find the best values of the unknown parameters $\boldsymbol\theta$, which are supposed to be connected to the true observables $\boldsymbol\eta$ through some functional dependence.

In many situations the basic unknowns are in fact the observables $\boldsymbol\eta$ themselves; this was for instance the case in the example of Sect.10.2.6. Characteristic for many situations is that the $\boldsymbol\eta$ are directly measurable. We may take the observations $\mathbf{y}$ with covariance matrix $V(\mathbf{y})$ as our initial estimates of the true, but unknown $\boldsymbol\eta$. According to the Least-Squares Principle we should then adopt as our best estimates of $\boldsymbol\eta$ those values which minimize the quantity

$$\chi^2 = (\mathbf{y} - \boldsymbol\eta)^T V^{-1}(\mathbf{y})\,(\mathbf{y} - \boldsymbol\eta) = \boldsymbol\epsilon^T V^{-1}(\mathbf{y})\,\boldsymbol\epsilon,$$

where $\boldsymbol\epsilon$ is the difference between the measured and true values, and may be called the error due to measurement,

$$\boldsymbol\epsilon = \mathbf{y} - \boldsymbol\eta .$$

When the LS minimization has been performed the final estimates $\hat{\boldsymbol\eta}$ of the true $\boldsymbol\eta$ are called the "improved measurements", or fitted variables.

The residuals $\hat{\boldsymbol\epsilon}$ of the LS estimation are defined as the differences between the original measurements and the "improved measurements" from the fit,

$$\hat{\boldsymbol\epsilon} = \mathbf{y} - \hat{\boldsymbol\eta} .$$

In other words, the residual is the estimated error due to measurement.

The minimum value obtained for $\chi^2$ can be called the weighted sum of squared residuals in the fit,

$$\chi^2_{\min} = \hat{\boldsymbol\epsilon}^T V^{-1}(\mathbf{y})\,\hat{\boldsymbol\epsilon} .$$

If the covariance matrix $V(\mathbf{y})$ is known only up to a constant multiplicative factor $\sigma^2$, i.e. if

$$V(\mathbf{y}) = \sigma^2 V_0(\mathbf{y}),$$

where $V_0(\mathbf{y})$ is known, we define in a similar manner a quantity which is usually referred to as the residual sum of squares,

$$Q^2_{\min} = \hat{\boldsymbol\epsilon}^T V_0^{-1}(\mathbf{y})\,\hat{\boldsymbol\epsilon} .$$

The connection between the two quantities $\chi^2_{\min}$ and $Q^2_{\min}$ is

$$\chi^2_{\min} = \frac{Q^2_{\min}}{\sigma^2} .$$

The definitions above make no reference to any specific type of model. In the subsequent sections we shall derive, or merely state, some results which are exact for the linear model, and approximately valid for any general model in which the parameter dependence is not too far from linear.

Exercise 10.7: For the linear parabola fit of Sect.10.2.5, find the "improved measurements" $\hat{\boldsymbol\eta}$ and the residuals $\hat{\boldsymbol\epsilon}$. What is the weighted sum of squared residuals $\chi^2_{\min}$ in this case?

Exercise 10.8: (i) For the linear LS model of Sect.10.2.3, in which $\boldsymbol\eta = A\boldsymbol\theta$, show that the "improved measurements" and their covariance matrix are given by, respectively,

$$\hat{\boldsymbol\eta} = A(A^T V^{-1} A)^{-1} A^T V^{-1}\mathbf{y}, \qquad V(\hat{\boldsymbol\eta}) = A(A^T V^{-1} A)^{-1} A^T .$$

(ii) Show that the residuals can be expressed as $\hat{\boldsymbol\epsilon} = D\boldsymbol\epsilon$, where $\boldsymbol\epsilon$ is the error due to measurement and

$$D \equiv I_N - A(A^T V^{-1} A)^{-1} A^T V^{-1} .$$

(iii) Show that the weighted sum of squared residuals for the linear model reduces to

$$\chi^2_{\min} = \mathbf{y}^T V^{-1}\hat{\boldsymbol\epsilon} .$$

Exercise 10.9: With uncorrelated measurements of common error σ and the orthogonal polynomial formulation of Sect.10.2.8, show that the residual sum of squares can be calculated from the simple formula

$$Q^2_{\min} = \sum_{i=1}^{N} y_i^2 - \sum_{\ell=1}^{L} \hat\omega_\ell^2 .$$

10.4.2 Estimating σ² in the linear model

We have emphasized that to obtain the solution $\hat{\boldsymbol\theta}$ (or $\hat{\boldsymbol\eta}$) of the linear LS problem the covariance matrix $V(\mathbf{y}) = \sigma^2 V_0(\mathbf{y})$ of the measurements need only be known up to the multiplicative factor $\sigma^2$. However, to find the errors on the estimates the matrix $V(\mathbf{y})$ must be completely known.
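The statement that the solution needs V(y) only up to a factor, while the errors need it completely, can be checked numerically. The sketch below assumes a straight-line model with a diagonal covariance matrix and hypothetical data; `fit_line` is an illustrative helper, not a routine from the text. Scaling all variances by the same factor leaves the estimates unchanged and scales their covariance matrix by that factor.

```python
def fit_line(xs, ys, variances):
    """Weighted LS straight-line fit; returns (theta, covariance matrix)."""
    ws = [1.0 / v for v in variances]
    S = sum(ws)
    Sx = sum(w * x for w, x in zip(ws, xs))
    Sy = sum(w * y for w, y in zip(ws, ys))
    Sxx = sum(w * x * x for w, x in zip(ws, xs))
    Sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    det = S * Sxx - Sx * Sx                      # 2x2 normal matrix
    theta = [(Sxx * Sy - Sx * Sxy) / det, (S * Sxy - Sx * Sy) / det]
    cov = [[Sxx / det, -Sx / det], [-Sx / det, S / det]]
    return theta, cov


xs = [0.0, 2.0, 5.0, 8.0]          # hypothetical abscissae
ys = [2.1, 4.8, 7.3, 9.9]          # hypothetical measurements
V0 = [1.0, 2.0, 1.0, 4.0]          # known relative variances (diagonal of V0)
scale = 4.0                        # plays the role of the factor sigma^2

theta_a, cov_a = fit_line(xs, ys, V0)
theta_b, cov_b = fit_line(xs, ys, [scale * v for v in V0])
print(theta_a, theta_b)            # same estimates
print(cov_a[0][0], cov_b[0][0])    # variances differ by the factor `scale`
```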
It can be shown that, with $V_0(\mathbf{y})$ known, the unknown $\sigma^2$ can be estimated from the residual sum of squares $Q^2_{\min}$, since

$$s^2 = \frac{Q^2_{\min}}{N-L} \tag{10.56}$$

is an unbiassed estimator of $\sigma^2$. Here, as before, $N$ is the number of observations and $L$ the number of parameters estimated*).

*) If the $L$ parameters are constrained in $K$ linear algebraic equations the unbiassed estimator of $\sigma^2$ is given by $s^2 = Q^2_{\min}/(N-L+K)$; see Exercise 10.16.

We shall prove that $s^2$ of eq.(10.56) is an unbiassed estimator of $\sigma^2$ for the simplest case where the uncorrelated measurements have a common error. Then

$$V(\mathbf{y}) = \sigma^2 I_N,$$

where $I_N$ is the unit matrix of dimension $N \times N$. The residuals $\hat{\boldsymbol\epsilon}$ can now be expressed in terms of the errors due to measurement $\boldsymbol\epsilon$. Writing $\mathbf{y} = A\boldsymbol\theta + \boldsymbol\epsilon$ we have (see Exercise 10.8)

$$\hat{\boldsymbol\epsilon} = \mathbf{y} - A\hat{\boldsymbol\theta} = \left[I_N - A(A^TA)^{-1}A^T\right](A\boldsymbol\theta + \boldsymbol\epsilon) = D\boldsymbol\epsilon$$

when we identify

$$D \equiv I_N - A(A^TA)^{-1}A^T.$$

The matrix $D$ is idempotent, satisfying $D^T = D$, $DD = D$. The residual sum of squares therefore becomes

$$Q^2_{\min} = \hat{\boldsymbol\epsilon}^T\hat{\boldsymbol\epsilon} = (D\boldsymbol\epsilon)^T(D\boldsymbol\epsilon) = \boldsymbol\epsilon^T D\boldsymbol\epsilon = \sum_i D_{ii}\epsilon_i^2 + \sum_{i\neq j} D_{ij}\epsilon_i\epsilon_j. \tag{10.60}$$

If we take the expectation value of this expression the contributions from the last sum will vanish, since we have assumed uncorrelated measurements, for which $E(\epsilon_i\epsilon_j) = 0$ when $i \neq j$. Hence

$$E(Q^2_{\min}) = \sum_i D_{ii}E(\epsilon_i^2) = \sigma^2\,\mathrm{tr}\,D. \tag{10.61}$$

From the definition of $D$ we find, by commuting the matrices $A(A^TA)^{-1}$ and $A^T$ under the trace operation,

$$\mathrm{tr}\,D = \mathrm{tr}\,I_N - \mathrm{tr}\left[(A^TA)^{-1}A^TA\right] = \mathrm{tr}\,I_N - \mathrm{tr}\,I_L,$$

where $I_L$ is the identity matrix of dimension $L \times L$. Thus, $\mathrm{tr}\,D = N-L$, and eq.(10.61) leads to

$$E(Q^2_{\min}) = \sigma^2(N-L).$$

By comparison with the definition of an unbiassed estimator (eq.(8.3)) we see, therefore, that $Q^2_{\min}/(N-L)$ will be an unbiassed estimator of $\sigma^2$, as was stated above.

Exercise 10.10: Prove that, for any $N \times N$ matrix $G$ and a general covariance matrix $V$,

$$E(\boldsymbol\epsilon^T G\boldsymbol\epsilon) = \mathrm{tr}(VG)$$

in virtue of the definition of the covariances, $E(\epsilon_i\epsilon_j) = V_{ij}$.

Exercise 10.11: Generalize the proof in the text and show that $s^2 = Q^2_{\min}/(N-L)$ is an unbiassed estimator of $\sigma^2$ also if the measurements are correlated and have unequal errors. (Hint: with $V(\mathbf{y}) = \sigma^2 V_0(\mathbf{y})$ and $V_0(\mathbf{y})$ known, the residual sum of squares can be expressed as

$$Q^2_{\min} = \hat{\boldsymbol\epsilon}^T V_0^{-1}\hat{\boldsymbol\epsilon} = (D\boldsymbol\epsilon)^T V_0^{-1}(D\boldsymbol\epsilon) = \boldsymbol\epsilon^T(D^T V_0^{-1} D)\boldsymbol\epsilon,$$

where

$$D = I_N - A(A^T V_0^{-1}A)^{-1}A^T V_0^{-1}.$$

This matrix has the property that $D^T V_0^{-1} D = V_0^{-1}D = D^T V_0^{-1}$, so that with the result of the preceding exercise, $E(Q^2_{\min}) = \sigma^2\,\mathrm{tr}\,D = \sigma^2(N-L)$.)

10.4.3 The normality assumption; degrees of freedom

For linear models the LS method produces estimators of the unknown parameters $\boldsymbol\theta$ which are unbiassed and have minimum variance (Sect.10.2.4) and, as we saw in the previous section, it also provides an unbiassed estimator of $\sigma^2$. In terms of the error due to measurement $\boldsymbol\epsilon$ the only assumption made in deriving these properties is that $E(\boldsymbol\epsilon) = 0$, which is required for the proof of unbiassedness. To prove that $Q^2_{\min}/(N-L)$ is an unbiassed estimator of $\sigma^2$ we also assumed in Sect.10.4.2 that the measurements were uncorrelated, with $E(\epsilon_i\epsilon_j) = 0$ for $i \neq j$, but this restriction is not essential (Exercise 10.11). Thus, except for the first and second moments, no distributional assumptions have been made about the $\epsilon_i$'s; in other words, the optimal properties are distribution-free.

Let us now make the further assumption that the uncorrelated $\epsilon_i$'s are normally distributed with mean value 0 and variance $\sigma_i^2$. As uncorrelated, normal variables are independent (Sect.4.10.1) this amounts to assuming that the $N$ measurements $y_i$ are independent and normally distributed with variance $\sigma_i^2$ about the true values $\eta_i$. If the observables $\eta_i$ were known the assumptions above imply that the quantity

$$X^2 = \sum_{i=1}^{N}\left(\frac{y_i-\eta_i}{\sigma_i}\right)^2$$

would be a sum of $N$ independent squared standard normal variables, and would by definition (Sect.5.1.1) be a chi-square variable with $N$ degrees of freedom.

Since, however, we do not know the precise values of the true, unknown $\eta_i$ we must be satisfied with adopting their estimated values $\hat\eta_i$ as obtained from the minimization of $X^2$. Inserted in $X^2$ this gives the weighted sum of squared residuals,

$$X^2_{\min} = \sum_{i=1}^{N}\left(\frac{y_i-\hat\eta_i}{\sigma_i}\right)^2. \tag{10.64}$$

Using the normality assumption about the $N$ independent $y_i$ it can be shown, in the case of a linear model with $L$ parameters, that $X^2_{\min}$ can be expressed as a sum of $(N-L)$ independent terms, each term being the square of a standardized normally distributed variable. Hence $X^2_{\min}$ is $\chi^2(N-L)$, a chi-square variable with $(N-L)$ degrees of freedom*).

*) If the $\epsilon_i$'s have mean values different from zero $X^2_{\min}$ will have a non-central chi-square distribution; compare Exercise 5.12.

It has been tacitly assumed so far that the $L$ parameters are independent. If the parameters are internally related in $K$ linear algebraic equations only $(L-K)$ of them are independent, giving $(N-(L-K))$ independent terms in the $X^2_{\min}$ of eq.(10.64). In this situation $X^2_{\min}$ is distributed as $\chi^2(N-L+K)$. Thus the number of degrees of freedom in the constrained as well as in the unconstrained linear fit is equal to the difference between the number of independent measurements and the number of independent parameters.

If the measurements are not independent it can be shown that the above statements will still be true provided that the measurements are multinormally distributed about the true values. When the error due to measurements $\boldsymbol\epsilon = \mathbf{y} - \boldsymbol\eta$ is multinormally distributed with mean value $\mathbf{0}$ and constant (non-singular) covariance matrix $V$, the weighted sum of squared residuals

$$X^2_{\min} = (\mathbf{y}-\hat{\boldsymbol\eta})^T V^{-1}(\mathbf{y}-\hat{\boldsymbol\eta})$$

will be chi-square distributed as before, with $(N-L+K)$ degrees of freedom in the case when there are $K$ constraint equations restricting the $L$ parameters.

It is essentially the optimum properties of linear estimators that make $X^2_{\min}$ a chi-square variable for normally distributed variables. In a non-linear LS estimation the situation is not so simple; the LS estimator is then in general biassed and not of minimum variance, and the exact distribution of $X^2_{\min}$ is not known. Asymptotically, for large $N$, it can be shown, however, that $X^2_{\min}$ is approximately chi-square distributed also in this general case.

Finally, let us stress again that for the estimation problem the Least-Squares Principle involved no assumption about the distributional properties of the observations. The commonly used terms "chi-square (or $\chi^2$-) minimization", "$\chi^2$-fitting", etc., the origin of which is evident from the above considerations, are therefore somewhat misleading and should be avoided. As long as the subject is parameter estimation as such one should instead use the appropriate terminology with "Least-Squares...".

As we shall see immediately below the normality assumption about the observations is essential only when it comes to the question of judging how reliable the estimated parameter values are.

Exercise 10.12: For the linear LS model of Sect.10.2.3, show that, in general,

$$X^2 = X^2_{\min} + (\hat{\boldsymbol\theta}-\boldsymbol\theta)^T A^T V^{-1} A\,(\hat{\boldsymbol\theta}-\boldsymbol\theta).$$

For normally distributed observations each of the three terms is chi-square distributed, as $\chi^2(N)$, $\chi^2(N-L)$, and $\chi^2(L)$, respectively.
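The trace argument of Sect.10.4.2 and the resulting count of $(N-L)$ degrees of freedom can be checked numerically. The sketch below is only an illustration under assumptions of our own choosing: a straight-line model with $L = 2$, six equally spaced abscissae, a common error $\sigma = 0.5$, and hypothetical helper names (`design_matrix`, `projector_D`, `mean_residual_sum`) that do not come from the text. It builds $D = I_N - A(A^TA)^{-1}A^T$, evaluates its trace, and averages $\boldsymbol\epsilon^T D\boldsymbol\epsilon$ over simulated measurement errors.

```python
import random


def design_matrix(xs):
    """Straight-line model eta = theta1 + theta2 * x, so L = 2 (illustrative choice)."""
    return [[1.0, x] for x in xs]


def projector_D(A):
    """Return D = I_N - A (A^T A)^{-1} A^T for an N x 2 design matrix A."""
    n = len(A)
    # Elements of the symmetric 2 x 2 matrix A^T A and its explicit inverse.
    s00 = sum(r[0] * r[0] for r in A)
    s01 = sum(r[0] * r[1] for r in A)
    s11 = sum(r[1] * r[1] for r in A)
    det = s00 * s11 - s01 * s01
    inv = [[s11 / det, -s01 / det], [-s01 / det, s00 / det]]
    D = []
    for i in range(n):
        row = []
        for j in range(n):
            h = sum(A[i][a] * inv[a][b] * A[j][b]
                    for a in range(2) for b in range(2))
            row.append((1.0 if i == j else 0.0) - h)
        D.append(row)
    return D


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def mean_residual_sum(D, sigma, n_trials, seed=1):
    """Average of Q2_min = eps^T D eps over simulated normal errors."""
    rng = random.Random(seed)
    n = len(D)
    total = 0.0
    for _ in range(n_trials):
        eps = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += sum(eps[i] * D[i][j] * eps[j]
                     for i in range(n) for j in range(n))
    return total / n_trials


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # assumed abscissae
N, L, sigma = len(xs), 2, 0.5          # assumed common error
D = projector_D(design_matrix(xs))
tr_D = trace(D)
mean_q2 = mean_residual_sum(D, sigma, n_trials=20000)
print("tr D =", tr_D, " expected N - L =", N - L)
print("mean Q2_min =", mean_q2, " expected sigma^2 (N - L) =", sigma ** 2 * (N - L))
```

The trace comes out as $N-L$ up to rounding, and the simulated mean of $Q^2_{\min}$ should lie close to $\sigma^2(N-L)$. Normal errors are used here only for convenience; the expectation value itself does not depend on that choice.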
10.4.4 Goodness-of-fit

The fact that the weighted sum of squared residuals $X^2_{\min}$, for normally distributed measurements, has a known (i.e. chi-square) distribution has an important practical consequence. It implies that the $X^2_{\min}$ value obtained in a particular minimization can be used to give a quantitative measure of how close the overall agreement is between the fitted quantities $\hat{\boldsymbol\eta}$ and the measurements $\mathbf{y}$. In other words, $X^2_{\min}$ will provide a measure of the goodness-of-fit.

For definiteness, let us assume that the problem involves $\nu$ degrees of freedom; in common parlance we are then dealing with a "$\nu$-constrained fit", or a "$\nu$C-fit". By comparison with a graph (for example, Fig. 5.2) or a table (for example, Appendix Table A8) of chi-square probabilities we can then deduce the probability corresponding to the contents of the chi-square p.d.f. $f(u;\nu)$ between the values $u = X^2_{\min}$ and $u \to \infty$. Obviously this chi-square probability $P_{\chi^2}$ then gives the probability for obtaining, in a new minimization with similar measurements and the same model, a higher value for $X^2_{\min}$. A small value of $X^2_{\min}$ corresponds to a large $P_{\chi^2}$, or a "good" fit, while a very large value implies a small $P_{\chi^2}$, or a "bad" fit.

We have

$$P_{\chi^2} = \int_{X^2_{\min}}^{\infty} f(u;\nu)\,du = 1 - F(X^2_{\min};\nu), \tag{10.66}$$

where $F(X^2_{\min};\nu)$ is the cumulative chi-square distribution for $\nu$ degrees of freedom. Since a cumulative integral is itself a variable which is uniformly distributed between 0 and 1 (compare Sects.4.1.1 and 5.1.4) the chi-square probability $P_{\chi^2}$ will also have a uniform distribution over the interval [0,1].

If, in a series of similar minimizations, $P_{\chi^2}$ turns out to have a non-uniform distribution, this indicates that the assumptions specified in Sect.10.4.3 are not fulfilled. One may then suspect the measurements or the model, or both, to be unsatisfactory and should examine this further. For example, if $P_{\chi^2}$ is strongly peaked at very low probabilities this may reveal a contamination of "wrong" events. Similarly, a skew distribution for $P_{\chi^2}$ with an excess on the high (or low) probability side may indicate that the errors in the measurements have systematically been put too high (low).

It is a unique feature of the LS method that assertions can be made about the quality of the fit, and hence of the estimated parameters, from the final numerical value of the optimized quantity. The other estimation methods do not provide this possibility. We shall come back to the question of goodness-of-fit in connection with the subject of hypothesis testing in Chapter 14, where in particular we shall discuss how one can make goodness-of-fit statements when the unknown parameters have been estimated by the ML method (Sect.14.4.3).

10.4.5 Stretch functions, or "pulls"

As discussed in the previous section the weighted sum of squared residuals, $X^2_{\min}$, gives a measure of the similarity between the observations and the fitted values. Specifically, a very large $X^2_{\min}$, or equivalently, a low chi-square probability $P_{\chi^2}$, expresses a rather poor overall agreement between data and fitted model, and may lead one to suspect that the model is unsatisfactory.

Quite frequently, however, one finds that an unexpectedly large value of $X^2_{\min}$ does not necessarily have to be ascribed to a wrong model or hypothesis, as it can simply be due to a large contribution from one, or a few, of the $N$ points. Thus, rather than immediately abandoning the model because of a "bad" $X^2_{\min}$, a quick look at the data points is recommended. Sometimes this inspection shows at once that one of the input values was incorrect, and a satisfactory fit can be obtained when the mistake is corrected.

A closer study of the fit can be done by looking at the residuals $\hat\epsilon_i = y_i - \hat\eta_i$, which directly measure the deviations between the observations and the fitted values. To allow for different accuracies it is reasonable to judge an $\hat\epsilon_i$ relative to the uncertainty, or standard deviation $\sigma(\hat\epsilon_i)$, in this quantity. Thus the examination of the fit should be done in terms of the variables

$$z_i = \frac{\hat\epsilon_i}{\sigma(\hat\epsilon_i)};$$

$z_i$ is called the stretch function or "pull" for the $i$-th observation. Considering uncorrelated observations and a sufficiently linear estimation problem we have
$$\sigma^2(\hat\epsilon_i) = V_{ii}(\mathbf{y}-\hat{\boldsymbol\eta}) = V_{ii}(\mathbf{y}) - 2\,\mathrm{cov}(y_i,\hat\eta_i) + V_{ii}(\hat{\boldsymbol\eta}) = V_{ii}(\mathbf{y}) - V_{ii}(\hat{\boldsymbol\eta}). \tag{10.68}$$

Hence, the $i$-th "pull" can be expressed as

$$z_i = \frac{y_i - \hat\eta_i}{\sqrt{\sigma^2(y_i) - \sigma^2(\hat\eta_i)}}.$$

Clearly the minus sign in the denominator here has its origin in the fact that the two quantities in the numerator are completely (positively) correlated.

The "pull" $z_i$ is anticipated to have a distribution which is fairly close to $N(0,1)$. If, in a particular fit, one of the $z_i$'s deviates very much from the others in magnitude the corresponding data point should be examined, and perhaps abandoned if it looks suspicious (Sect.6.1). This critique of the data is most likely to be useful when the number of degrees of freedom $\nu$ is fairly large. For 1C-fits ($\nu = 1$) one sees that all "pulls" are of the same magnitude, and they contain no more information than the value $X^2_{\min}$.

In the long run, if the shape of the observed distribution of a "pull" $z_i$ is shifted relative to zero this demonstrates a certain bias in the $i$-th observation. Similarly, if the observed "pull" distribution is substantially broader (narrower) than $N(0,1)$ the error in the $i$-th observation has probably consistently been taken too small (large).

10.5 APPLICATION OF THE LEAST-SQUARES METHOD TO CLASSIFIED DATA

10.5.1 Construction of $X^2$

In practice one often groups the measurements (events) according to some classification scheme, for example by plotting a histogram, before the actual estimation of the parameters.

Let the range of the variable*) $x$ be divided into $N$ mutually exclusive classes (bins) and denote by $n_i$ the number of the $n$ observations $x_1, x_2, \ldots, x_n$ belonging to the $i$-th class. We assume that, with the parameters $\boldsymbol\theta = \{\theta_1, \theta_2, \ldots, \theta_L\}$, we know the probability $p_i = p_i(\boldsymbol\theta)$ of getting an observation in class $i$. In the case of a continuous variable $x$ we may have to find $p_i$ by integrating a probability density function over the width $\Delta x_i$ of the $i$-th class. The expected number of observations in this class is

$$f_i = n p_i,$$

and the normalization condition $\sum_{i=1}^{N} p_i = 1$ implies

$$\sum_{i=1}^{N} f_i = n.$$

*) In this section $x$ may denote a one-dimensional or a multi-dimensional variable.

For a given $n$ the numbers of observations $n_i$ are multinomially distributed over the $N$ classes with covariance matrix $V(\mathbf{n})$, whose elements are

$$V_{ii} = n p_i(1-p_i), \qquad V_{ij} = -n p_i p_j \quad (i \neq j). \tag{10.72}$$

Because of the normalization condition this matrix is singular ($|V| = 0$) and can not be inverted. The Least-Squares Principle as formulated by eq.(10.6) is therefore not applicable to this case. However, if we omit one of the $n_i$, say $n_N$, as it is redundant, the remaining $(N-1)$ $n_i$ will correspond to a covariance matrix $V^*$ which is simply $V(\mathbf{n})$ with its $N$-th row and column deleted. We could then reformulate the Least-Squares Principle for finding the best values of the parameters by demanding the minimum of the quantity

$$X^2 = \sum_{i=1}^{N-1}\sum_{j=1}^{N-1}(n_i - f_i)\,(V^{*-1})_{ij}\,(n_j - f_j). \tag{10.73}$$

It can be verified by the reader that the inverse of the matrix $V^*$ is

$$(V^{*-1})_{ij} = \frac{1}{n}\left(\frac{\delta_{ij}}{p_i} + \frac{1}{p_N}\right).$$
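The inverse of $V^*$ quoted in the text, and the way the quadratic form over $(N-1)$ classes collapses to a symmetric sum over all $N$ classes, can be checked numerically. The following is a minimal sketch with made-up class probabilities and counts; the function names `check_inverse`, `reduced_quadratic_form` and `pearson_sum` are hypothetical and chosen only for this illustration.

```python
def check_inverse(n, probs):
    """Largest deviation of V* V*^{-1} from the unit matrix.

    V*_ij = n p_i (delta_ij - p_j) for i, j < N, and the candidate inverse is
    (V*^{-1})_ij = (delta_ij / p_i + 1 / p_N) / n.
    """
    m = len(probs) - 1          # the N-th class is omitted
    p_last = probs[-1]
    worst = 0.0
    for i in range(m):
        for k in range(m):
            s = 0.0
            for j in range(m):
                v_ij = n * probs[i] * ((1.0 if i == j else 0.0) - probs[j])
                inv_jk = ((1.0 / probs[j] if j == k else 0.0) + 1.0 / p_last) / n
                s += v_ij * inv_jk
            worst = max(worst, abs(s - (1.0 if i == k else 0.0)))
    return worst


def reduced_quadratic_form(counts, probs):
    """Double sum over the first N-1 classes with the inverse of V*."""
    n = sum(counts)
    p_last = probs[-1]
    d = [c - n * p for c, p in zip(counts[:-1], probs[:-1])]
    total = 0.0
    for i, di in enumerate(d):
        for j, dj in enumerate(d):
            inv_ij = ((1.0 / probs[i] if i == j else 0.0) + 1.0 / p_last) / n
            total += di * inv_ij * dj
    return total


def pearson_sum(counts, probs):
    """Symmetric sum over all N classes of (n_i - f_i)^2 / f_i with f_i = n p_i."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


probs = [0.1, 0.2, 0.3, 0.4]     # assumed class probabilities, summing to one
counts = [12, 18, 33, 37]        # assumed observed numbers, n = 100
print(check_inverse(sum(counts), probs))
print(reduced_quadratic_form(counts, probs), pearson_sum(counts, probs))
```

The two printed values of $X^2$ agree to rounding accuracy, whichever class is treated as the redundant one.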
The double sum in $X^2$ above can therefore be written as

$$X^2 = \sum_{i=1}^{N-1}\frac{(n_i-f_i)^2}{f_i} + \frac{1}{f_N}\left[\sum_{i=1}^{N-1}(n_i-f_i)\right]^2 = \sum_{i=1}^{N}\frac{(n_i-f_i)^2}{f_i}. \tag{10.75}$$

The last expression restores the symmetry in all $N$ classes and corresponds to the formulation of eq.(10.3) with $f_i = n p_i$, $\sigma_i^2 = f_i$. The expression (10.75) could have been written down at once, from the assumption that the number of events $n_i$ is Poisson distributed with mean and variance equal to $n p_i$. The algebra above thus demonstrates again the mathematical equivalence between two different points of view, the first considering $N$ (dependent) multinomially distributed variables conditioned on their sum, the second considering $N$ independent Poisson variables; compare Sect.4.4.4.

Generally, in the covariance matrix of eq.(10.72), if the number of classes is large such that all $p_i$'s are small, the off-diagonal terms become negligible, and

$$V_{ii}(\mathbf{n}) = \sigma_i^2 = n p_i(1-p_i) \approx n p_i = f_i = E(n_i). \tag{10.76}$$

Therefore, whenever the LS estimates of the parameters are found by minimizing

$$X^2 = \sum_{i=1}^{N}\frac{(n_i-f_i)^2}{f_i} \tag{10.77}$$

the method implies a Poisson approximation for the individual $n_i$ (mean and variance $= f_i$). Equating the derivatives of $X^2$ to zero now gives the following set of equations,

$$\frac{\partial X^2}{\partial\theta_k} = -2\sum_{i=1}^{N}\left[\frac{n_i-f_i}{f_i} + \frac{(n_i-f_i)^2}{2f_i^2}\right]\frac{\partial f_i}{\partial\theta_k} = 0, \qquad k = 1,2,\ldots,L. \tag{10.78}$$

Even for simple models it may be difficult to solve eqs.(10.78) analytically, and often a numerical minimization of eq.(10.77) is used instead. It can be shown, however, that for a large number of events $n$, the influence from the second term in the parentheses of eqs.(10.78) becomes small. Neglecting this term corresponds to regarding the $f_i$ in the denominator of eq.(10.77) as constants independent of $\boldsymbol\theta$ and leads to a simpler form of eqs.(10.78),

$$\frac{\partial X^2}{\partial\theta_k} = -2\sum_{i=1}^{N}\frac{n_i-f_i}{f_i}\,\frac{\partial f_i}{\partial\theta_k} = 0, \qquad k = 1,2,\ldots,L, \tag{10.79}$$

which may be easier to handle.

The variance of $n_i$ is sometimes approximated by $n_i$ instead of by $f_i$. If, rather than eq.(10.77), we minimize

$$X^2 = \sum_{i=1}^{N}\frac{(n_i-f_i)^2}{n_i} \tag{10.80}$$

the solution for $\boldsymbol\theta$ is more sensitive to statistical fluctuations in the observed data. It can be shown, however, that for large numbers of events the solutions obtained from the two formulations (i.e. eqs.(10.77), (10.80)) coincide. It can further be shown that the formulations correspond to estimators which, for large samples, possess optimum theoretical properties: they are consistent, asymptotically normal, and efficient (i.e. give minimum variance).

Asymptotically, in the limit of large numbers, the $X^2$ of eq.(10.73) will be distributed as $\chi^2(N-1)$, as will also the alternative and approximate expressions for $X^2$ above. Compared to the situation of Sect.10.4.3 one degree of freedom has been lost because of the normalization condition $\sum_{i=1}^{N} n_i = n$, which leaves only $(N-1)$ of the observations independent.

When the minimization has been performed the minimum value $X^2_{\min}$ can therefore be used to give an approximate measure of the goodness-of-fit, by comparison with the chi-square distribution with $(N-1-L)$ degrees of freedom. Here, as before, $L$ is the number of (independent) parameters estimated. From eqs.(10.77) and (10.80) it is clear that $X^2_{\min}$ is $\chi^2(N-1-L)$ to the extent that the numbers are large enough to justify the assumption that $(n_i-f_i)/\sqrt{f_i}$ and
$(n_i-f_i)/\sqrt{n_i}$ are approximate standard normal variables*).

*) Estimating parameters by minimizing eq.(10.77) is known in the literature as the minimum $\chi^2$ method, whereas minimizing eq.(10.80) is sometimes called the modified minimum $\chi^2$ method. In accordance with our remarks at the end of Sect.10.4.3 we discourage the use of these terms. In particular, we shall refer to the estimation by eq.(10.80) as the simplified Least-Squares method.

10.5.2 Choice of classes

There is to date no generally accepted prescription for how to subdivide the variable range into categories or classes. In many experiments, however, the set-up of the apparatus itself implies a grouping of the data, for example defined by the angular acceptance of an electronic counter. When the grouping of the data is not a priori given, essentially two different approaches are used for the subdivision of the range of the variable,

(i) the equal-width method, where the range is divided into classes of equal width, and

(ii) the equal-probability method, where the range is divided into classes of equal expected probability.

We have already assumed that the number $N$ of classes is large to justify the approximation $\sigma_i^2 \approx f_i$ in eq.(10.77), or $\sigma_i^2 \approx n_i$ in eq.(10.80). If $X^2_{\min}$ is further to be used to measure the quality of the fit, a few more conditions have to be fulfilled.

Firstly, it is not allowed to choose the limits of the classes in such a way as to make $X^2_{\min}$ as small as possible. This follows from the fact that the statistic $X^2_{\min}$ will only be an approximate chi-square variable if the class boundaries for $x$ are not random variables. In most practical work, the grouping is made from computational convenience. The second condition, already mentioned, is that the number of expected observations within each class must be "large", which is necessary to approximate $(n_i-f_i)/\sqrt{f_i}$ (or $(n_i-f_i)/\sqrt{n_i}$) to a standard normal variable. Fortunately, for practical purposes the expected numbers need not be very large: in fact it is customary to require a minimum of five entries in each class. It has been verified, using the equal-width subdivision, that one or two classes may be allowed to have expectations even less than five, provided the number of degrees of freedom is sufficiently large, say at least six. Since the probabilities frequently drop off at the ends of the variable range, one may often have to use wider classes at the extremes of the range to get sufficiently large expected numbers in these classes.

10.5.3 Example: Polarization of antiprotons (2)

Let us use the simple polarization example described under the Maximum-Likelihood method in Sect.9.5.7 to illustrate how the different Least-Squares formulations of Sect.10.5.1 lead to minimizations of varying complexity.

We assume that $n$ double-scattering events have been observed and plotted in a histogram as a function of $x \equiv \cos\phi$, where $\phi$ is the angle between the normals of the two scattering planes. The histogram has $N$ bins, and the $i$-th bin, extending from $x_i$ to $x_i + \Delta x_i$, contains $n_i$ observed events. The expected frequency for this bin is found by integrating the p.d.f. of eq.(9.38) over the bin width; hence the expected number of events is

$$f_i = n\int_{x_i}^{x_i+\Delta x_i}\tfrac{1}{2}(1+\alpha x)\,dx = n(a_i + b_i\alpha),$$

where

$$a_i = \tfrac{1}{2}\Delta x_i, \qquad b_i = \tfrac{1}{2}\Delta x_i\left(x_i + \tfrac{1}{2}\Delta x_i\right).$$

With the Poisson approximation, $\sigma_i^2 \approx f_i$, the exact and the simplified consequence of eq.(10.77) (i.e. eqs.(10.78) and (10.79), respectively) lead to equations of degree $2N$ and $N$ in the parameter $\alpha$. The parameter estimate $\hat\alpha$ and its error must therefore be found by some numerical method.

With the alternative approximation, $\sigma_i^2 \approx n_i$, however, we have, from eq.(10.80) for the simplified LS method,

$$X^2 = \sum_{i=1}^{N}\frac{\left[n_i - n(a_i + b_i\alpha)\right]^2}{n_i},$$

which is a function of second order in $\alpha$. The solution of the equation $dX^2/d\alpha = 0$ can therefore be stated explicitly as

$$\hat\alpha = \frac{\sum_{i=1}^{N} b_i\left(1 - n a_i/n_i\right)}{n\sum_{i=1}^{N} b_i^2/n_i}.$$
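The explicit solution of the simplified LS method can be sketched in a few lines of code. This is an illustration only: it assumes that the p.d.f. of eq.(9.38) has the form $\tfrac{1}{2}(1+\alpha x)$ on $-1 \le x \le 1$, that every bin contains at least one event, and it uses hypothetical function names (`bin_coefficients`, `simplified_ls_alpha`). The error is taken from the curvature of the parabola $X^2(\alpha)$, i.e. as the inverse square root of the coefficient of $\alpha^2$.

```python
import math


def bin_coefficients(edges):
    """Coefficients a_i, b_i with f_i = n (a_i + b_i alpha).

    Assumes the p.d.f. (1 + alpha x) / 2 on [-1, 1]; integrating over a bin
    [lo, hi] gives a_i = (hi - lo) / 2 and b_i = (hi^2 - lo^2) / 4.
    """
    a, b = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        a.append(0.5 * (hi - lo))
        b.append(0.25 * (hi * hi - lo * lo))
    return a, b


def simplified_ls_alpha(counts, edges):
    """Closed-form minimum of sum_i (n_i - n (a_i + b_i alpha))^2 / n_i."""
    n = sum(counts)
    a, b = bin_coefficients(edges)
    curvature = sum(bi * bi / ni for bi, ni in zip(b, counts))
    numerator = sum(bi * (1.0 - n * ai / ni) for ai, bi, ni in zip(a, b, counts))
    alpha_hat = numerator / (n * curvature)
    # X^2 = X^2_min + n^2 * curvature * (alpha - alpha_hat)^2, hence the error:
    error = 1.0 / (n * math.sqrt(curvature))
    return alpha_hat, error


def x2(alpha, counts, edges):
    """The simplified LS quantity itself, for cross-checking the minimum."""
    n = sum(counts)
    a, b = bin_coefficients(edges)
    return sum((ni - n * (ai + bi * alpha)) ** 2 / ni
               for ai, bi, ni in zip(a, b, counts))


# Illustrative input: ten equal-width bins and made-up observed numbers.
edges = [-1.0 + 0.2 * k for k in range(11)]
counts = [70, 82, 85, 90, 99, 101, 108, 117, 120, 128]
alpha_hat, error = simplified_ls_alpha(counts, edges)
print("alpha_hat =", alpha_hat, "+/-", error)
```

Because $X^2$ is exactly parabolic in $\alpha$ under this approximation, no iteration is needed; feeding the function with the expected numbers $n(a_i + b_i\alpha_0)$ in place of observed counts returns $\alpha_0$, which is a convenient consistency check.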
Since $X^2$ can be expressed as

$$X^2 = X^2_{\min} + \left(n^2\sum_{i=1}^{N}\frac{b_i^2}{n_i}\right)(\alpha-\hat\alpha)^2$$

the variance of the LS estimate $\hat\alpha$ is exactly given by the inverse of the coefficient to the parabolic term, compare Sect.10.9.2. Hence an analytic formula can also be given for the error $\Delta\hat\alpha$,

$$\Delta\hat\alpha = \left(n^2\sum_{i=1}^{N}\frac{b_i^2}{n_i}\right)^{-1/2}.$$

The reader may find it a rewarding exercise to rephrase the last example in terms of the linear model formulation of Sect.10.2.3, and verify that the expressions for $\hat\alpha$ and $\Delta\hat\alpha$ above correspond to the general formulae eqs.(10.23) and (10.24), respectively; compare also Exercise 10.12.

10.5.4 Example: Angular momentum analysis (1)

Suppose that in a preliminary study of the pion-nucleon interaction we want to examine the production and subsequent two-pion decay of a boson $B$, viz.

$$\pi + N \to B + N, \qquad B \to \pi_a + \pi_b. \tag{10.84}$$

If little is known about the interaction a good way of starting an investigation is to study the angular distribution of the decay in the rest frame of the system $B$ as a function of the mass of this system.

Let the decay angle $\Omega = (\theta,\phi)$ be defined as the direction of the $\pi_a$ momentum in the $B$ rest frame relative to the quantization axis. The angular decay distribution can then generally be expressed in terms of the spherical harmonic functions $Y_J^m(\cos\theta,\phi)$, and the density matrix elements $\rho_{mm'}$, as

$$W(\cos\theta,\phi) = \sum_{m,m'} Y_J^m(\cos\theta,\phi)\,\rho_{mm'}\,Y_J^{m'}(\cos\theta,\phi)^*, \tag{10.85}$$

where $J$ is the spin of $B$. From the properties of the spherical harmonics the distribution can also be written as a linear combination of the $Y_j^m$,

$$W(\cos\theta,\phi) = \sum_{j=0,2,\ldots}^{2J}\;\sum_{m=-j}^{j} c_{jm}\,Y_j^m(\cos\theta,\phi). \tag{10.87}$$

In eq.(10.87) only spherical harmonics with even values of $j$ occur, implying that in the two-pion decay of a spin $J$ boson the angular distribution will be a polynomial in $\cos\theta$ of degree at most $2J$; this is known as the maximum complexity theorem.

The expansion coefficients $c_{jm}$ in eq.(10.87) constitute the set of unknown parameters which we want to estimate. Since, for a given $J$, there are $(J+1)(2J+1)$ terms in the double sum this gives the number of elements in the parameter vector $\mathbf{c}$. The $c_{jm}$ are not all independent, since they must satisfy the normalization condition

$$\int_{4\pi} W(\Omega)\,d\Omega = 1. \tag{10.88}$$

For the chosen mass region of the two-pion system $B$ in (10.84) we classify the events in $N$ angular intervals $\Delta\Omega_i$. The probability for the $i$-th interval is

$$p_i = \int_{\Delta\Omega_i} W(\Omega)\,d\Omega = \sum_{j,m} c_{jm}\int_{\Delta\Omega_i} Y_j^m(\cos\theta,\phi)\,d\Omega,$$

where the summation goes over all $(J+1)(2J+1)$ combinations of the indices $j$, $m$. With a total number of $n$ events the predicted number for the $i$-th interval is

$$f_i = n p_i.$$

The vector of observed numbers of events in the $N$ intervals is $\mathbf{n} = (n_1, n_2, \ldots, n_N)$, where

$$\sum_{i=1}^{N} n_i = n.$$
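The count of $(J+1)(2J+1)$ expansion coefficients can be confirmed by simply listing the allowed index pairs. The short sketch below does this for a few spin values; the function name `expansion_indices` is a hypothetical choice made for this illustration, and the indexing convention (even $j$ from 0 to $2J$, $m$ from $-j$ to $j$) is the one stated in the text.

```python
def expansion_indices(J):
    """All (j, m) pairs with j even, 0 <= j <= 2J, and -j <= m <= j."""
    return [(j, m) for j in range(0, 2 * J + 1, 2) for m in range(-j, j + 1)]


for J in range(4):
    pairs = expansion_indices(J)
    # Each even j contributes 2j + 1 values of m; the total is (J+1)(2J+1).
    print("J =", J, " number of c_jm =", len(pairs),
          " formula (J+1)(2J+1) =", (J + 1) * (2 * J + 1))
```

Such a list is also a convenient way to fix the ordering of the elements of the parameter vector before a numerical minimization is set up.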

If the number of intervals is sufficiently large to justify the approximation that each $n_i$ is an independent Poisson variable with mean and variance equal to $f_i$, the unknown parameters would be found by minimizing eq.(10.77), or rather eq.(10.80) if the approximation $\sigma_i^2 \approx n_i$ is acceptable. Clearly, with either formulation, a general numerical minimization procedure is called for.

In actual experiments the determination of the coefficients $c_{jm}$ as functions of the mass of the system $B$ can serve different purposes. Firstly, it may be used to determine the spin of the decaying boson from the maximum complexity theorem in the following manner: Starting from the lowest possible spin value one evaluates consecutively the sets of coefficients $c_{jm}$ for increasing values of $J$. From a certain $J$ value on, say from $J = J_{\max}$, the goodness-of-fit will not become significantly better with increasing $J$, and all coefficients $c_{jm}$ for $j > 2J_{\max}$ will be compatible with zero. The value $J_{\max}$ then determines a lower limit for the spin of the boson $B$. Secondly, the behaviour of the coefficients with different $j$, $m$ may sometimes give evidence for the presence of more than one resonance in a certain mass region. Finally, if the spins of the produced resonances are well-known, the magnitude of the coefficients $c_{jm}$ (or, equivalently, the density matrix elements $\rho_{mm'}$) can supply information on the production process, and be used to test production models.

(i) the "exact method", in which one modifies the ideal p.d.f. (i.e. the theoretical model) to give an "observable p.d.f.", which is subsequently compared to the observations. This approach requires that the experimental detection ability is known over the whole range of the observables.

When method (i) is not applicable one has to resort to

(ii) the "approximate method", in which one modifies the raw observations by assigning different weights to the individual observed events. The weight $w_i$ assigned to an event is equal to the inverse of the probability for detecting this event; in other words, if the event was observed one assumes that there would have been $w_i$ events if the detection had been perfect.

Clearly, whenever the theoretical model can be properly modified, by folding-in of the experimental resolution or by the "exact method", there is no
E x e r c i s e 10.13: D e r i v e eq.(10.87) from e q . ( 1 0 . 8 5 ) . (Hint: Use t h e p r o p e r t y
f u r t h e r need f o r changes i n t h e LS e s t i m a t i o n p r o c e d u r e as described i n t h e
(-I)?;
ymymTs
a t 4"
I<a,m;e.m'lj,m+m'>~/&
.
-
and t h e p r o d u c t theorem f o r s p h e r i c a l harmnnics,
~ ~ ' c a . n ; e , n ( ~ . ~ ,
previous s e c t i o n s of t h i s c h a p t e r . I n f a c t , t h e mosr p s s e n t i a l restriction we
have p u t o n a t h e o r e t i c a l model i s that i t s h o u l d g i v e p r e d i c r i , l o s which coio-
1 c i d e w i t h the e x p e c t a t i o n v a l u e s of the o b s e r v a t i o n s . w i t h t h e " a p p r o x i m a t e
w i t h t h e f a c t t h a t t h e Clebsch-Gordan c a e f f i c i e n r s < L , o ; k , o l j , o > v a n i s h f o r .dd
j.) method", however, s p e c i f i c problems a r i s e f a r t h e LS p a r a m e t e r e s t i m a t i o n , i n
p a r t i c u l a r i t becomes more d i f f i c u l t t o g e t r e l i a b l e v a l u e s o f the e r r o r s o n the
10.6 APPLICATION OF THE LEAST-SQUARES METHOD TO WEIGHTED EVENTS estimated parameters.
SO f a r we have i n c h i s c h a p t e r t a c i t l y assumed t h a t the p r e d i c t i o n s of Let us assume t h a t t h e o b s e r v a t i o n s lhave b e e n c l a s s i f i e d i n t o N b i n s as
t h e t h e o r e t i c a l model a r e d i r e c t l y comparable t o t h e e x p e r i m e n t a l o b s e r v a t i o n . i n Secf.lo.5.1, and c h a t i t i s "ow n o r meaningful t o compare the p r e d i c t e d o m -
I n p r a c t i c e , however, t h e o b s e r v a t i o n s a r e f r e q u e n t l y known t o be b i a s s e d . AS b e r o f e v e n t s f i d i r e c t l y t o t h e observed number of events n i i n the i - t h b i n
was d i s c u s s e d i n C h a p t e r 6, t h e o b s e r v a t i o n a l b i a s e s can he d i v i d e d i n t o random Using t h e ' " a p p ~ a x i m a t emethod" we would t h a t w i t h a ~ e r f e c td e t e c t i o n appar-
(or s t a t i s t i c a l ) errors and s y s t e m a t i c e r r o r s . We r e c a l l from S e c t . 6 . Z t h a t atus we s h o u l d have ~ b s e r v e d i n b i n i a number o f e v e n t s e q u a l t o
random o b s e r v a t i o n a l errors can be t a k e n care of by "smearing" t h e i d e a l p . d . f .
w i t h t h e experimental r e s o l u t i o n function t o o b t a i n a m d i f i e d p.d.f. ("resolu-
t i o n transform") which can t h e n be d i r e c t l y compared t o t h e o b s e r v a t i o n s .
where w.. i s t h e i n v e r s e o f t h e d e t e c t i o n p ~ o b a b i l i r yf o r e v e n t j w i t h i n t h e
F u r t h e r , from S e c t . 6 . 3 . s y s t e m a t i c o b s e r v a t i o n a l errors can b e h a n d l e d by two LJ
i - r h b i n . I t i s then ~ . u g g ~ c t itvo~w r i t e down the Following two a l t e r n a t i v e
b a s i c a l l y d i f f e r e n t approaches. The f i r s t of t h e s e i s e x p r e s s i o n s to be minimized i n t h e case of weighted e v e n t s .
These expressions are the analogues of, respectively, eqs.(10.77) and (10.80). Again, if $E(n_i') = f_i$ and the normality assumption is reasonably fulfilled, $\chi^2_{min}$ is expected to give an approximate measure of the goodness-of-fit.

If the weights $w_{ij}$ are not too large and not too different, either of the alternative expressions (10.93) and (10.94) works satisfactorily. Experience shows, however, that curious results may sometimes be obtained for the parameter estimates and their errors if some of the events have very large weights. One should therefore be cautious and verify that the weights of the individual events do not deviate too much from the average. If the weights are large mainly in one or a few bins one may improve the reliability of the parameter estimates by simply omitting these bins in the minimization. In general, the inclusion of events with large weights tends to increase the estimated errors on the parameters.

Finally, let it be mentioned that an approach between the "exact" and the "approximate" methods consists in using the expression

$$ \chi^2 = \sum_{i=1}^{N} \frac{(n_i - f_i')^2}{\cdots} \qquad (10.95) $$

with

This would correspond to adopting the philosophy behind the "exact method", where the observations are kept unchanged and the model adjusted to account for the imperfect detection. However, the procedure is no longer exact since the modification of the ideal model rests on an estimate of the average detection efficiency $e_i$ for the i-th bin, as obtained from the observed events in the bin. Since an error is inherent in this estimate it is likely that the use of eq.(10.95) implies a reduced reliability of the parameter estimates. Clearly, the method is most reliable for small weights, or nearly equal weights. If all weights are identical eq.(10.95) and eq.(10.93) give the same results.

10.7 LINEAR LEAST-SQUARES ESTIMATION WITH LINEAR CONSTRAINTS

It frequently happens that the observables $\eta$ in an LS estimation are related through algebraic constraint equations. While the original measurements are subject to experimental inaccuracies and may not strictly satisfy the constraints, the "improved measurements" $\hat\eta$, the estimates of the true, unknown values, should do so.

In general, two different approaches can be used to solve a minimization where the observables are restricted by algebraic constraints. The first is the elimination method, in which one makes use of the constraint equations to eliminate a number of the variables and subsequently performs the minimization in terms of the reduced number of unknowns. The second approach is the method of Lagrangian multipliers, in which one increases the number of unknowns in the minimization by adding a set of Lagrangian multipliers, one for each constraint equation. Thus both methods reformulate the constrained $\chi^2$ minimization to an (ordinary) unconstrained minimization, which can be carried out according to the procedures outlined earlier in this chapter.

In the following sections we will first (in Sect.10.7.1) see how the two methods work on a simple example before we proceed (Sect.10.7.2) to develop the general formulation of the method of Lagrangian multipliers for a linear $\chi^2$ minimization with linear constraint equations. As we shall see an exact solution is obtained for this problem, with closed formulae expressing a modification of the unconstrained linear LS solution.

10.7.1 Example: Angles in a triangle

To illustrate the two approaches to a linear LS estimation under linear constraints, let us consider the following simple example: The three angles of a triangle have been measured independently, giving the results

We assume for simplicity that all measurements have an error $\sigma = 1°$. We want the LS estimates of the true values $\eta_1, \eta_2, \eta_3$ of the angles, which must satisfy the requirement that their sum be equal to 180°.

We observe that the measurements $y_i$ fail to satisfy the constraint by the amount $\sum_i y_i - 180° = 2°$. To find the "improved measurements" $\hat\eta_i$
according to the LS Principle we seek the solution of the constrained minimization problem

$$ \chi^2 = \sum_{i=1}^{3}\left(\frac{y_i - \eta_i}{\sigma}\right)^2 = \text{minimum}, \qquad \sum_{i=1}^{3}\eta_i = 180°. \qquad (10.97) $$

Adopting first the elimination method we use the constraint equation to eliminate one of the unknowns, say $\eta_3$, substitute in $\chi^2$ and minimize with respect to the remaining two variables; thus we consider the unconstrained case

$$ \chi^2 = \left(\frac{y_1-\eta_1}{\sigma}\right)^2 + \left(\frac{y_2-\eta_2}{\sigma}\right)^2 + \left(\frac{y_3-(180°-\eta_1-\eta_2)}{\sigma}\right)^2 = \text{minimum}. \qquad (10.98) $$

This trivial minimization problem has the solution

$$ \hat\eta_1 = y_1 - \tfrac{1}{3}(y_1+y_2+y_3-180°) = y_1 - \tfrac{2}{3}°, \qquad \hat\eta_2 = y_2 - \tfrac{1}{3}(y_1+y_2+y_3-180°) = y_2 - \tfrac{2}{3}°, $$

and from the constraint equation we find

$$ \hat\eta_3 = 180° - \hat\eta_1 - \hat\eta_2 = y_3 - \tfrac{2}{3}°. $$

We find therefore, not unexpectedly, that the "improved measurements" are obtained by subtracting the measured "excess" of 2° equally from all three observations.

If, instead, we adopt the method of Lagrangian multipliers we will reformulate the problem (10.97) to the equivalent form with the Lagrangian multiplier $\lambda$,

$$ \chi^2 = \sum_{i=1}^{3}\left(\frac{y_i - \eta_i}{\sigma}\right)^2 + 2\lambda\left(\sum_{i=1}^{3}\eta_i - 180°\right) = \text{minimum}. \qquad (10.99) $$

This is a linear LS problem for the four unknowns $\eta_1, \eta_2, \eta_3, \lambda$. Explicitly, the normal equations become

$$ -\frac{2}{\sigma^2}(y_i - \eta_i) + 2\lambda = 0, \quad i = 1,2,3, \qquad 2(\eta_1+\eta_2+\eta_3-180°) = 0. \qquad (10.100) $$

Multiplying the first three equations by $-\tfrac{1}{2}$ and the last by $1/(2\sigma^2)$ we find by adding all expressions an equation for $\lambda$,

$$ \frac{1}{\sigma^2}\left(\sum_{i=1}^{3} y_i - 180°\right) - 3\lambda = 0, $$

which leads to

$$ \hat\lambda = \frac{1}{3\sigma^2}\left(\sum_{i=1}^{3} y_i - 180°\right). $$

The estimates for the angles, obtained from the three first equations, are

$$ \hat\eta_i = y_i - \sigma^2\hat\lambda = y_i - \tfrac{1}{3}\left(\sum_{j=1}^{3} y_j - 180°\right), \quad i = 1,2,3. $$

These relations, symmetric in all indices from 1 to 3, show in a transparent way how the original measurements are corrected by subtracting equal amounts of the total measured "excess" of 2° from each of the measurements.

Exercise 10.14: Show that the covariance matrix for the estimates $\hat\eta$ can be expressed as

$$ V(\hat\eta) = \sigma^2 \begin{pmatrix} 2/3 & -1/3 & -1/3 \\ -1/3 & 2/3 & -1/3 \\ -1/3 & -1/3 & 2/3 \end{pmatrix}, $$

showing that the "improved measurements" become correlated but have smaller errors (diagonal terms) than the original measurements. Compare eq.(10.112) of the next section.
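
The two approaches of this example can be checked numerically. The following Python sketch is an illustration only: the measured angles are hypothetical values, chosen merely so that their sum exceeds 180° by 2° as in the text, and the function names are our own. It assumes equal, uncorrelated errors sigma on the three measurements.

```python
def fit_by_elimination(y, total=180.0):
    """Eliminate eta3 = total - eta1 - eta2 and solve the two normal equations.

    Setting the derivatives of eq.(10.98) to zero gives
        2*eta1 +   eta2 = y1 - y3 + total
          eta1 + 2*eta2 = y2 - y3 + total
    (sigma cancels because all errors are assumed equal).
    """
    y1, y2, y3 = y
    a = y1 - y3 + total
    b = y2 - y3 + total
    eta1 = (2.0 * a - b) / 3.0
    eta2 = (2.0 * b - a) / 3.0
    return [eta1, eta2, total - eta1 - eta2]


def fit_by_lagrange(y, sigma=1.0, total=180.0):
    """Solve the four normal equations of the Lagrangian form, eq.(10.99)."""
    excess = sum(y) - total                 # violation of the constraint
    lam = excess / (3.0 * sigma ** 2)       # Lagrangian multiplier
    eta = [yi - sigma ** 2 * lam for yi in y]
    return eta, lam


# Hypothetical measurements in degrees (not taken from the text); sum is 182.
y = [89.0, 34.0, 59.0]
eta_elim = fit_by_elimination(y)
eta_lagr, lam = fit_by_lagrange(y, sigma=1.0)
print(eta_elim, eta_lagr, lam)
```

With equal errors both functions return each measurement reduced by one third of the excess, so the two methods agree and the fitted angles sum to 180°.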
10.7.2 Linear LS model with linear constraints; Lagrangian multipliers

We will consider a general linear LS estimation problem with linear constraint equations, in the form

$$ \chi^2(\theta) = (y - A\theta)^T V^{-1} (y - A\theta) = \text{minimum}, \qquad B\theta - b = 0. \qquad (10.101) $$

Here the L parameters $\theta$ are restricted through the K constraint equations expressed by the matrix B of dimension K x L, and the K-component vector b. The other symbols are the same as used before for the unconstrained linear case.

We introduce a K-component vector $\lambda = \{\lambda_1, \lambda_2, \ldots, \lambda_K\}$ of Lagrangian multipliers and reformulate the problem (10.101) to an unconstrained linear minimization for the L + K unknowns $\theta$ and $\lambda$,

$$ \chi^2(\theta,\lambda) = (y - A\theta)^T V^{-1} (y - A\theta) + 2\lambda^T (B\theta - b) = \text{minimum}. \qquad (10.102) $$

If we here equate to zero the derivatives of $\chi^2$ with respect to $\theta_l$, l=1,2,...,L and $\lambda_k$, k=1,2,...,K we get the normal equations, which in vector notation can be written as

$$ \nabla_\theta \chi^2 = -2A^T V^{-1}(y - A\theta) + 2B^T\lambda = 0, \qquad (10.103) $$
$$ \nabla_\lambda \chi^2 = 2(B\theta - b) = 0. \qquad (10.104) $$

These are L + K linear equations for the unknowns. Obviously, eqs.(10.104) are the constraint equations regained, whereas eqs.(10.103) are the analogues of eqs.(10.21) for the unconstrained case, now modified by the $\lambda$-term due to the constraints. It will be seen that eqs.(10.100) of the previous section represent a special case of the formulae above.

Let us introduce the abbreviations

$$ C = A^T V^{-1} A, \qquad c = A^T V^{-1} y, \qquad (10.105) $$

which bring the simultaneous linear equations to the form

$$ C\theta + B^T\lambda = c, \qquad B\theta = b. \qquad (10.106) $$

If the inverse of C exists we can premultiply the first of these equations by $BC^{-1}$ and substitute for $B\theta$ from the last equation. This gives an equation involving the unknowns $\lambda$ only,

$$ BC^{-1}B^T\lambda = BC^{-1}c - b. $$

Writing

$$ V_B = BC^{-1}B^T \qquad (10.107) $$

and assuming that this symmetric matrix has an inverse, the solution for the Lagrangian multipliers becomes

$$ \hat\lambda = V_B^{-1}(BC^{-1}c - b). \qquad (10.108) $$

When the Lagrangian multipliers are substituted back in eq.(10.106) we obtain the solution for the parameters $\theta$,

$$ \hat\theta = C^{-1}c - C^{-1}B^T V_B^{-1}(BC^{-1}c - b). \qquad (10.109) $$

Equations (10.108) and (10.109) provide an exact solution, since all matrices and vectors are known quantities. It is interesting to observe that the combination $C^{-1}c$, which is nothing but the solution of the unconstrained minimization, enters in $\hat\lambda$ as well as in $\hat\theta$. In fact, the parenthesis $(BC^{-1}c - b)$ measures how much the observations y violate the constraint equations; compare the example of the previous section. As for $\hat\theta$ we see that the effect of the constraint equations has been to correct the solution $C^{-1}c$ of the unconstrained minimization by an amount proportional to the "violation" term $(BC^{-1}c - b)$.

We note further that the Lagrangian multipliers $\hat\lambda$ as well as the parameters $\hat\theta$ have a linear dependence on the observations y through the vector c. It is seen by application of the expectation operator that

$$ E(c) = A^T V^{-1} E(y) = A^T V^{-1} A\theta = C\theta. $$

Hence the expectation of the term $(BC^{-1}c - b)$ vanishes in virtue of the constraint equations, giving
$$ E(\hat\theta) = \theta, \qquad (10.110) $$

which shows that the parameter estimates are unbiassed, as they were in the unconstrained case.

The linear dependence on y allows us to establish the covariance matrix for the $\hat\theta$ by applying the usual law of error propagation. From eq.(10.109), recalling that $c = A^T V^{-1} y$, we have

$$ V(\hat\theta) = \left[C^{-1}A^TV^{-1} - C^{-1}B^TV_B^{-1}BC^{-1}A^TV^{-1}\right] V \left[C^{-1}A^TV^{-1} - C^{-1}B^TV_B^{-1}BC^{-1}A^TV^{-1}\right]^T. $$

Using the definitions of C and $V_B$ together with the fact that these matrices, and hence also their inverse matrices $C^{-1}$ and $V_B^{-1}$, are symmetric, this expression can be simplified to

$$ V(\hat\theta) = C^{-1}\left(I_L - B^T V_B^{-1} B C^{-1}\right). \qquad (10.111) $$

In eq.(10.111) $C^{-1}$ is the covariance matrix for the unconstrained parameters, and as the diagonal elements of the term $(BC^{-1})^T V_B^{-1} (BC^{-1})$ are always non-negative, we see that the constraint equations will lead to a reduction of the parameter errors (i.e. the diagonal terms) compared to the unconstrained case. For the off-diagonal terms no similar statement can be made in the general case, as the covariances between different parameters can be smaller or larger than in the unconstrained case.

The "improved measurements" $\hat\eta = A\hat\theta$ and their errors are given by the formulae

$$ \hat\eta = A\hat\theta, \qquad V(\hat\eta) = A\,V(\hat\theta)\,A^T = A C^{-1}\left(I_L - B^T V_B^{-1} B C^{-1}\right) A^T, \qquad (10.112) $$

which should be compared to the corresponding expressions for the unconstrained problem, given in Exercise 10.8.

Exercise 10.15: Show that the covariance matrix for the estimates of the Lagrangian multipliers of eq.(10.108) corresponds to $V(\hat\lambda) = V_B^{-1}$, and that the estimated parameters and Lagrangian multipliers are uncorrelated, $cov(\hat\theta,\hat\lambda) = 0$.

Exercise 10.16: (i) With the notation of Sect.10.7.2, show that the residuals for the linear LS fit with linear constraints can be written as

$$ \hat\epsilon = D\epsilon, $$

where $\epsilon$ is the vector of errors due to measurement, and

$$ D = I_N - AC^{-1}A^TV^{-1} + AC^{-1}B^TV_B^{-1}BC^{-1}A^TV^{-1}. $$

This matrix differs from the corresponding D for the unconstrained case (see Exercise 10.8) only through the last term.
(ii) If the measurements are uncorrelated and have equal errors, i.e. $V(y) = \sigma^2 I_N$, D simplifies to

$$ D = I_N - A(A^TA)^{-1}A^T + A(A^TA)^{-1}B^T\left[B(A^TA)^{-1}B^T\right]^{-1}B(A^TA)^{-1}A^T, $$

which represents the extension of eq.(10.59) valid for the unconstrained case. Show that this D is also an idempotent matrix satisfying $D^T = D$, $D^TD = D$. Follow the reasoning of Sect.10.4.2 and verify that $E(Q^2_{min}) = \sigma^2(N-L+K)$, showing that $Q^2_{min}/(N-L+K)$ is an unbiassed estimator of $\sigma^2$.
(iii) Generalize the proof that $Q^2_{min}/(N-L+K)$ is an unbiassed estimator of $\sigma^2$ to the case when the measurements have correlation terms and not necessarily equal errors. (Hint: See Exercise 10.11.)

10.8 GENERAL LEAST-SQUARES ESTIMATION WITH CONSTRAINTS

We have in the preceding sections discussed how the LS method can be used to estimate unknown parameters in various problems of increasing complexity. We will now turn to the most general situation, where the estimation problem involves observable quantities as well as unobservable unknowns, which are connected through a set of general, i.e. non-linear, algebraic restrictions.

We shall in the following section develop the formulation of the iterative procedure using the method of Lagrangian multipliers, without making any reference to a special physical problem. The reader may find it useful to have in mind the kinematic analysis of a reaction (an example follows in Sect.10.8.2), where the momentum and energy conservation laws constitute a set of restrictions relating the various momenta and angles for the particle combination defining the kinematic hypothesis. Some of the quantities have been measured to a certain accuracy (say, the momenta and angles of curved tracks in a bubble chamber), and some are completely unknown (the variables for an unseen particle, momenta for short, straight tracks, etc.). The purpose of the LS estimation is to investigate the kinematic hypothesis; for a successful minimization the constraint equations will supply estimates for the unmeasured variables as well as "improved measurements" for the measured quantities.
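
As a numerical illustration of the closed formulae of Sect.10.7.2, the sketch below evaluates eqs.(10.108), (10.109) and (10.111) for small matrices in plain Python. It is a minimal sketch under stated assumptions: the helper functions and the test data (three hypothetical angle measurements with unit errors, fitted directly with A equal to the unit matrix, and the single constraint that the angles sum to 180°) are our own, and the constraint is written as B theta = b.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse(M):
    """Gauss-Jordan inversion with partial pivoting (adequate for small matrices)."""
    n = len(M)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def constrained_linear_ls(A, V, y, B, b):
    """Return (theta_hat, lambda_hat, V(theta_hat)) for chi2 minimum with B theta = b."""
    AtVinv = matmul(transpose(A), inverse(V))
    C = matmul(AtVinv, A)                        # C = A^T V^-1 A
    c = matmul(AtVinv, [[v] for v in y])         # c = A^T V^-1 y (column vector)
    Cinv = inverse(C)
    Bt = transpose(B)
    BCinv = matmul(B, Cinv)
    VBinv = inverse(matmul(BCinv, Bt))           # V_B = B C^-1 B^T, eq.(10.107)
    theta_unc = matmul(Cinv, c)                  # unconstrained solution C^-1 c
    violation = [[u[0] - bk] for u, bk in zip(matmul(B, theta_unc), b)]
    lam = matmul(VBinv, violation)               # eq.(10.108)
    CinvBt = matmul(Cinv, Bt)
    shift = matmul(CinvBt, lam)
    theta = [t[0] - s[0] for t, s in zip(theta_unc, shift)]   # eq.(10.109)
    red = matmul(matmul(CinvBt, VBinv), BCinv)
    n = len(Cinv)
    cov = [[Cinv[i][j] - red[i][j] for j in range(n)] for i in range(n)]  # eq.(10.111)
    return theta, [l[0] for l in lam], cov


# Hypothetical data (not from the text): three angles in degrees, unit errors.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
V = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [89.0, 34.0, 59.0]
B = [[1.0, 1.0, 1.0]]
b = [180.0]
theta_hat, lambda_hat, cov_theta = constrained_linear_ls(A, V, y, B, b)
print(theta_hat, lambda_hat, cov_theta)
```

For this choice of A, V and B the result reproduces the triangle example: each angle is lowered by one third of the 2° excess, and the covariance matrix has diagonal terms 2/3 and off-diagonal terms -1/3, as stated in Exercise 10.14.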

10.8.1 The iteration procedure

We will, as before, let $\eta$ be a vector of N observables, for which we have the first approximation values (measurements) y, with errors contained in the covariance matrix V(y). In addition we have a set of J unmeasurable variables $\xi = \{\xi_1, \xi_2, \ldots, \xi_J\}$. The N measurable and the J unmeasurable variables are related and have to satisfy a set of K constraint equations,

$$ f_k(\eta_1, \ldots, \eta_N, \xi_1, \ldots, \xi_J) = 0, \qquad k = 1, 2, \ldots, K. $$

According to the Least-Squares Principle we should adopt as our best estimates of the unknowns $\eta$ and $\xi$ those values for which

$$ \chi^2(\eta) = (y - \eta)^T V^{-1}(y)\,(y - \eta) = \text{minimum}, \qquad f(\eta,\xi) = 0. \qquad (10.113) $$

The general, constrained LS problem of eqs.(10.113) can be solved by eliminating K unknowns from the constraint equations, substituting in $\chi^2$ and minimizing this function with respect to the N+J-K remaining variables. The elimination method, however, has the disadvantage that it does not give any prescription on which variables one should eliminate from the constraint equations. If these are non-linear, the actual minimization of the function $\chi^2$ may develop quite differently, depending on the elimination made. The Lagrange multiplier method, on the other hand, avoids the preference of any of the unknown variables and treats them all on an equal footing. Accordingly, although this approach implies more variables in the minimization, its feature of symmetry in the variables is considered a greater virtue, and is preferred in practice.

We proceed therefore to solve the problem of eqs.(10.113) by the method of the Lagrangian multipliers. We introduce K additional unknowns $\lambda = (\lambda_1, \ldots, \lambda_K)$ and rephrase the problem by requiring

$$ \chi^2(\eta,\xi,\lambda) = (y - \eta)^T V^{-1} (y - \eta) + 2\lambda^T f(\eta,\xi) = \text{minimum}. \qquad (10.114) $$

We now have a total of N+J+K unknowns. When the derivatives of $\chi^2$ with respect to all unknowns are put equal to zero we get the following set of equations, written in vector form,

$$ \nabla_\eta \chi^2 = -2V^{-1}(y - \eta) + 2F_\eta^T\lambda = 0, \quad \text{(N equations)} $$
$$ \nabla_\xi \chi^2 = 2F_\xi^T\lambda = 0, \quad \text{(J equations)} \qquad (10.115) $$
$$ \nabla_\lambda \chi^2 = 2f(\eta,\xi) = 0, \quad \text{(K equations)} $$

where the matrices $F_\eta$ (of dimension K x N) and $F_\xi$ (dimension K x J) are defined by

$$ (F_\eta)_{ki} = \frac{\partial f_k}{\partial \eta_i}, \qquad (F_\xi)_{kj} = \frac{\partial f_k}{\partial \xi_j}. \qquad (10.116) $$

Thus, removing the nuisance factors 2, the equations are

$$ V^{-1}(\eta - y) + F_\eta^T\lambda = 0, \qquad (10.117) $$
$$ F_\xi^T\lambda = 0, \qquad (10.118) $$
$$ f(\eta,\xi) = 0. \qquad (10.119) $$

The solution of the set of equations (10.117)-(10.119) for the N+J+K unknowns must in the general case*) be found by iterations, producing successively better approximations.

Let us suppose that iteration number $\nu$ has been performed and that it is necessary to find a still better solution. For the $\nu$-th iteration the approximative solution is given by the values $\eta^\nu, \xi^\nu, \lambda^\nu$, corresponding to the function value $(\chi^2)^\nu$. We perform a Taylor expansion of the constraint equations (10.119) in the point $(\eta^\nu, \xi^\nu)$,

$$ f_k(\eta^\nu,\xi^\nu) + \sum_{i=1}^{N}\left(\frac{\partial f_k}{\partial \eta_i}\right)^{\nu}(\eta_i^{\nu+1} - \eta_i^{\nu}) + \sum_{j=1}^{J}\left(\frac{\partial f_k}{\partial \xi_j}\right)^{\nu}(\xi_j^{\nu+1} - \xi_j^{\nu}) + \ldots = 0, \qquad k = 1, \ldots, K. $$

*) The set (10.117)-(10.119) reduces of course to eqs.(10.106) for a linear problem with no unmeasurable variables.
When the terms of second and higher orders are neglected this can be written

$$ f^\nu + F_\eta^\nu(\eta^{\nu+1} - \eta^\nu) + F_\xi^\nu(\xi^{\nu+1} - \xi^\nu) = 0, \qquad (10.120) $$

where all superscripts $\nu$ indicate that $f, F_\eta, F_\xi$ are to be evaluated at the point $(\eta^\nu, \xi^\nu)$, the $\nu$-th iteration. Equations (10.117) and (10.118) now read

$$ V^{-1}(\eta^{\nu+1} - y) + (F_\eta^T)^\nu\lambda^{\nu+1} = 0, \qquad (10.121) $$
$$ (F_\xi^T)^\nu\lambda^{\nu+1} = 0. \qquad (10.122) $$

These equations, together with the expanded constraint equations (10.120), will make it possible to express all unknowns of the ($\nu$+1)-th iteration by the quantities of the preceding iteration.

If we eliminate $\eta^{\nu+1}$ from (10.121) and substitute in (10.120) we get a relation involving only $\lambda^{\nu+1}$ and $\xi^{\nu+1}$,

$$ f^\nu + F_\eta^\nu\left[y - V(F_\eta^T)^\nu\lambda^{\nu+1} - \eta^\nu\right] + F_\xi^\nu(\xi^{\nu+1} - \xi^\nu) = 0. $$

This can be written as

$$ S\lambda^{\nu+1} = r + F_\xi^\nu(\xi^{\nu+1} - \xi^\nu) \qquad (10.123) $$

when we introduce the notation

$$ r = f^\nu + F_\eta^\nu(y - \eta^\nu), \qquad (10.124) $$
$$ S = F_\eta^\nu V (F_\eta^T)^\nu. \qquad (10.125) $$

Clearly, S is a symmetric matrix of dimension K x K. Multiplying eq.(10.123) from the left by $S^{-1}$ gives an expression for $\lambda^{\nu+1}$ that can be substituted in eq.(10.122), which in turn leads to an equation with only $\xi^{\nu+1}$ unknown,

$$ (F_\xi^T)^\nu S^{-1}\left[r + F_\xi^\nu(\xi^{\nu+1} - \xi^\nu)\right] = 0. $$

Solving this equation for $\xi^{\nu+1}$ and substituting back into eq.(10.123) we find $\lambda^{\nu+1}$, and finally eq.(10.121) will give $\eta^{\nu+1}$. Thus we find in succession

$$ \xi^{\nu+1} = \xi^\nu - (F_\xi^T S^{-1} F_\xi)^{-1} F_\xi^T S^{-1} r, \qquad (10.126) $$
$$ \lambda^{\nu+1} = S^{-1}\left[r + F_\xi(\xi^{\nu+1} - \xi^\nu)\right], \qquad (10.127) $$
$$ \eta^{\nu+1} = y - V F_\eta^T\lambda^{\nu+1}. \qquad (10.128) $$

The linearized equations are therefore solved in such a way that the "completely unknown" $\xi^{\nu+1}$ are found first, next the Lagrangian multipliers $\lambda^{\nu+1}$ and finally the "improved measurements" $\eta^{\nu+1}$.

In eqs.(10.126)-(10.128) the matrices $F_\eta, F_\xi, S$ and the vector r are evaluated at the point $(\eta^\nu, \xi^\nu)$. We note that in deriving the formulae it has been tacitly assumed that the inverse of the matrices S and $(F_\xi^T S^{-1} F_\xi)$ exists.

With the new values $\eta^{\nu+1}$, $\xi^{\nu+1}$ and $\lambda^{\nu+1}$ we calculate the value of the function $(\chi^2)^{\nu+1}$ for the ($\nu$+1)-th iteration and compare it to the previous value $(\chi^2)^\nu$. With an improved solution the new point $(\eta^{\nu+1}, \xi^{\nu+1})$ is used for a new Taylor expansion of the constraint equations and the process is started over again. The iterations should be continued until a satisfactory solution has been found. Usually one would repeat the calculations until the change in $\chi^2$ between successive steps becomes small. One may also have to check that the differences $\Delta\eta$, $\Delta\xi$ converge properly. General convergence criteria can hardly be given, but must be decided for the separate problems. It is generally valid that in order to optimize the convergence of an iteration procedure one should be careful in giving good starting values $\eta^0, \xi^0$ for the iterations, since these determine how many steps will be necessary to reach the desired minimum.

The distinction between the two types of variables $\eta$ and $\xi$ lies in fact in the choice of starting values. For the measurable variables $\eta$ one should start with $\eta^0 = y$, the measured values; for the unmeasurable variables the starting values $\xi^0$ should be evaluated from the most convenient constraint equations, inserting the measurements y for $\eta$.

To summarize, an iteration will consist of the following items:

(i) Evaluate the vector r from eq.(10.124) and the matrix S from eq.(10.125).
(ii) Find the new vector $\xi^{\nu+1}$ of unmeasurable variables from eq.(10.126).
(iii) Find the new vector $\lambda^{\nu+1}$ of Lagrangian multipliers from eq.(10.127).
(iv) Find the new vector $\eta^{\nu+1}$ of measurable quantities from eq.(10.128).
(v) Calculate the new value $(\chi^2)^{\nu+1}$.
(vi) Compare results with previous iteration. Proceed to (i) if new iteration is required. Stop, if solution has been obtained.

When the final step has been made the covariances of the estimates $\hat\eta$ and $\hat\xi$ should be found; see Sect.10.8.3.

Exercise 10.17: Show that the value of $\chi^2$ for the ($\nu$+1)-th step is

$$ (\chi^2)^{\nu+1} = (\lambda^{\nu+1})^T S \lambda^{\nu+1} + 2(\lambda^{\nu+1})^T f^{\nu+1}, $$

where the matrix S is evaluated for the $\nu$-th iteration.

10.8.2 Example: Kinematic analysis of a V⁰ event

Let us apply the previous formulation to the kinematic analysis of a V⁰ as seen in a bubble chamber. We suppose that the two tracks have been measured and that the track reconstruction has provided for each track a first approximation for the kinematic variables 1/P (inverse momentum), $\lambda$ (dip angle), and $\phi$ (azimuth angle) as well as a covariance matrix for these variables. We want to analyze this event for the kinematic hypothesis

$$ \Lambda \rightarrow p\,\pi^-. $$

If the origin of the $\Lambda$ is unspecified the magnitude of the momentum as well as the direction of the decaying particle are unknowns. The problem therefore involves three completely unknown variables,

$$ \xi = \{P_\Lambda, \lambda_\Lambda, \phi_\Lambda\}, $$

and six measurable unknowns,

$$ \eta = \{P_p, \lambda_p, \phi_p, P_\pi, \lambda_\pi, \phi_\pi\}. $$

The algebraic constraints are the four equations describing momentum and energy conservation, which are obtained by equating the following expressions to zero:

$$ f_1 = -P_\Lambda\cos\lambda_\Lambda\cos\phi_\Lambda + P_p\cos\lambda_p\cos\phi_p + P_\pi\cos\lambda_\pi\cos\phi_\pi, $$
$$ f_2 = -P_\Lambda\cos\lambda_\Lambda\sin\phi_\Lambda + P_p\cos\lambda_p\sin\phi_p + P_\pi\cos\lambda_\pi\sin\phi_\pi, $$
$$ f_3 = -P_\Lambda\sin\lambda_\Lambda + P_p\sin\lambda_p + P_\pi\sin\lambda_\pi, $$
$$ f_4 = -\sqrt{P_\Lambda^2 + m_\Lambda^2} + \sqrt{P_p^2 + m_p^2} + \sqrt{P_\pi^2 + m_\pi^2}. $$

Since the problem involves 4 constraints and 3 unmeasured unknowns we are therefore dealing with a 1C-fit.

From the definitions of eqs.(10.116) we see that the matrices $F_\eta$ (of dimension 4 x 6) and $F_\xi$ (dimension 4 x 3), as obtained from the derivatives of the four constraint functions $f_k$ with respect to the measurable and unmeasurable variables, are, respectively,

and

To start the iterations we take the measurements as the initial values for $\eta$,

$$ \eta^0 = \{P_p^0, \lambda_p^0, \phi_p^0, P_\pi^0, \lambda_\pi^0, \phi_\pi^0\}. $$

For $\xi$ we take, for example, the value $\xi^0 = \{P_\Lambda^0, \lambda_\Lambda^0, \phi_\Lambda^0\}$, where the components are obtained by demanding the first three constraint functions equal to zero, i.e. momentum conservation satisfied. The fourth constraint function will then in general not be strictly zero. Thus we will have an initial value of the vector r from eq.(10.124).

Inserting the approximations $(\eta^0, \xi^0)$ we can find $F_\eta^0$, $F_\xi^0$ and obtain the matrix S from eq.(10.125),

$$ S = F_\eta^0\,V(y)\,(F_\eta^0)^T. $$

Inverting this matrix we can next find the values of $\xi^1, \lambda^1, \eta^1$ in succession from eqs.(10.126)-(10.128), calculate $\chi^2$ with these first estimates, and continue the process.

If we had measured the coordinates of the origin of the $\Lambda$ in addition to its decay point the line-of-flight of this particle would have been known, equivalent to including $\lambda_\Lambda$ and $\phi_\Lambda$ among the measurable unknowns $\eta$. The only completely unknown variable would then be $P_\Lambda$, the magnitude of the momentum, corresponding to a 3C-fit in this case. See also

In these formulae $\eta$ and $\xi$, as well as the function f and all matrices F and S, are evaluated in the last, i.e. the $\nu$-th, iteration. To the approximation of a linear dependence on y the covariance matrices for $\hat\eta$ and $\hat\xi$ are given by eq.(3.80), the law of propagation of errors, as

$$ V(\hat\eta) = \frac{\partial g}{\partial y}\,V(y)\left(\frac{\partial g}{\partial y}\right)^T, \qquad V(\hat\xi) = \frac{\partial h}{\partial y}\,V(y)\left(\frac{\partial h}{\partial y}\right)^T, $$

where the derivatives of g and h with respect to y denote matrices of dimensions N x N and J x N, respectively. From eqs.(10.130) we can express these quantities as

In these expressions we observe that the matrices $F_\eta$, $F_\xi$ enter only via the three
p o s s i b l e c m b i m t i o n s of t h e type F S IF.
T - n 5
One of t h e s e , F ~ S - ' P a p p e a r e d al-
5 5:
10.8.3 C a l c u l a t i o n of errors ready as a p a r t o f t h e s o l u t i o n f o r t h e unmeasurable v a r i a b l e s 4, eq.(10.126).
The e r r o r s i n t h e f i n a l e s t i m a t e s of t h e measurable and unmeasurable With t h e a b b r e v i a t i o n s
v a r i a b l e s from t h e g e n e r a l LS f i t of Sect.lO.8.1 a r e found by a p p l y i n g t h e law
"tl
of e r r o r p r o p a g a t i o n . L e t us c o n s i d e r t h e e s t i m a t e s = g"tl and 5 =Ias
f u n c t i o n s of t h e measurements y,

we f i n d a f t e r a l i t t l e a l g e b r a t h a t e q s . 0 0 . 1 3 1 ) lead to

where t h e a c t u a l forms of t h e f u n c t i o n s g and 1are found from e q s . ( 1 0 . 1 2 6 ) -


(10.128), u s i n g e q . (10.124) f o r r,

These e r r o r f o r m l a e imply t h a t t h e errors on t h e f i t t e d q u a n t i t i e s n


A

i n general are s m a l l e r t h a n t h e errors i n t h e o b s e r v a t i o n s y, and t h a t t h e f i t -


ted q u a n t i t i e s w i l l be c o r r e l a t e d , even if t h e m a s u r e m e n t s were independent.
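The last statement can be illustrated on a problem much simpler than the $V^0$ fit. The following Python sketch is only an illustration under stated assumptions: three independently measured angles of a triangle are fitted under the single linear constraint that they sum to 180 degrees, so there are no unmeasured variables and no iteration is needed. The function name `constrained_fit` and the numbers are hypothetical and not taken from the text.

```python
def constrained_fit(y, var, total=180.0):
    """LS fit of independent measurements y (variances var) under sum(eta) = total.

    With the constraint matrix B = (1, 1, ..., 1) the Lagrange-multiplier solution is
        eta_hat    = y - V B^T (B V B^T)^-1 (B y - total)
        V(eta_hat) = V - V B^T (B V B^T)^-1 B V
    where V = diag(var).
    """
    n = len(y)
    s = sum(var)                      # B V B^T, a scalar for one constraint
    lam = (sum(y) - total) / s        # Lagrange multiplier
    eta = [yi - vi * lam for yi, vi in zip(y, var)]
    cov = [[(var[i] if i == j else 0.0) - var[i] * var[j] / s
            for j in range(n)] for i in range(n)]
    chi2 = (sum(y) - total) ** 2 / s  # chi-square of the fit (one constraint)
    return eta, cov, chi2


# Hypothetical measurements, in degrees, each with unit variance.
y = [62.0, 59.0, 61.0]
var = [1.0, 1.0, 1.0]
eta, cov, chi2 = constrained_fit(y, var)

print("fitted angles:", eta)
print("fitted variances:", [cov[i][i] for i in range(3)])
print("covariance of angles 1 and 2:", cov[0][1])
print("chi-square:", chi2)
```

In this toy case the fitted variances come out smaller than the measurement variances and the off-diagonal covariances are non-zero, although the measurements were taken as independent.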
Exercise 10.18: Show that the covariance matrix for the residuals $\boldsymbol{\epsilon} = \boldsymbol{y} - \hat{\boldsymbol{\eta}}$, to the approximation of a linear relationship between $\hat{\boldsymbol{\eta}}$ and $\boldsymbol{y}$, is
$$V(\boldsymbol{\epsilon}) = V(\boldsymbol{y}) + V(\hat{\boldsymbol{\eta}}) - 2\,\mathrm{cov}(\boldsymbol{y}, \hat{\boldsymbol{\eta}}) = V(\boldsymbol{y}) - V(\hat{\boldsymbol{\eta}}) = V(\boldsymbol{y})\,(G - H U H^T)\,V(\boldsymbol{y}).$$

10.9 CONFIDENCE INTERVALS AND ERRORS FROM THE $\chi^2$ FUNCTION

10.9.1 Basis for the determination of LS confidence intervals
With a theoretical model which is linear in the parameters $\boldsymbol{\theta}$ we have the general expression for the $\chi^2$ function
$$\chi^2(\boldsymbol{\theta}) = (\boldsymbol{y} - A\boldsymbol{\theta})^T V^{-1} (\boldsymbol{y} - A\boldsymbol{\theta}),$$
where $V^{-1}$ is the inverse of the covariance matrix $V(\boldsymbol{y})$ for the $N$ observed quantities $\boldsymbol{y}$. When $V(\boldsymbol{y})$ is independent of the parameters we found (in Sect.10.2.3) that the minimization of $\chi^2(\boldsymbol{\theta})$ with respect to $\boldsymbol{\theta}$ led to the LS estimate
$$\hat{\boldsymbol{\theta}} = (A^T V^{-1} A)^{-1} A^T V^{-1} \boldsymbol{y},$$
and, from the law of error propagation, the covariances of these estimates were found to be
$$V(\hat{\boldsymbol{\theta}}) = (A^T V^{-1} A)^{-1}.$$
Simple algebra then leads to the following relation (see Exercise 10.12):
$$\chi^2(\boldsymbol{\theta}) = \chi^2_{\min} + (\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})^T V^{-1}(\hat{\boldsymbol{\theta}})\,(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}). \qquad (10.136)$$

Equation (10.136) provides the basis for the determination of confidence intervals (regions) with the LS method. When the observations $\boldsymbol{y}$ are multinormally distributed about the true values $\boldsymbol{\eta} = A\boldsymbol{\theta}$, the estimates $\hat{\boldsymbol{\theta}}$, which are linearly related to $\boldsymbol{y}$, will also be normally distributed (compare Sect.4.8.5). Under these conditions, each of the three terms in eq.(10.136) will be chi-square distributed. For example, with $N$ independent observations and $L$ unconstrained parameters, $\chi^2(\boldsymbol{\theta})$ is $\chi^2(N)$, $\chi^2_{\min}$ is $\chi^2(N-L)$, and the quadratic (covariance) form $(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})^T V^{-1}(\hat{\boldsymbol{\theta}})(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})$ is $\chi^2(L)$. In the case of constrained parameters the last term will be distributed as $\chi^2(L-K)$, where $K$ is the number of linear constraints. In general, the confidence regions we are seeking will be found by intersecting the $\chi^2(\boldsymbol{\theta})$ surface by planes
$$\chi^2(\boldsymbol{\theta}) = \chi^2_{\min} + a,$$
where the constant $a$ is appropriately chosen. The corresponding probabilities are determined from the chi-square distribution with a number of degrees of freedom equal to the number of independent parameters.
When $\chi^2(\boldsymbol{\theta})$ is not a quadratic function in the parameters, for example if the model is not a linear function of the parameters and/or the covariance matrix is not independent of $\boldsymbol{\theta}$, it is still customary to use the intersection approach (the "graphical method") to establish confidence regions for the parameters. However, since the exact distribution of the quantities is then not known, the associated probabilities as deduced from a comparison with the chi-square distribution will only be approximately correct in these cases.

10.9.2 LS errors and confidence intervals, the one-parameter case
We assume that the LS estimation problem involves a single parameter $\theta$. When the $\chi^2$ function is Taylor expanded around the extremum (minimum) point $\theta = \hat{\theta}$, where the first derivative vanishes, we have in general
$$\chi^2(\theta) = \chi^2_{\min} + \frac{1}{2} \left.\frac{\partial^2 \chi^2}{\partial \theta^2}\right|_{\theta=\hat{\theta}} (\theta - \hat{\theta})^2 + \ldots \qquad (10.138)$$
Here the second derivative of $\chi^2$ must be positive to ensure that the extremum is a minimum value. Clearly, to the extent that the higher-order terms can be neglected the second derivative specifies the "width" of $\chi^2(\theta)$ around the minimum point $\hat{\theta}$, that is, it determines how "precisely" the minimum has been located. Thus the error in the LS estimate $\hat{\theta}$ must in general be related to the second derivative of the function $\chi^2$ evaluated for $\theta = \hat{\theta}$.
For a linear LS problem fulfilling the conditions specified in Sect.10.2.3, with a constant covariance matrix $V(\boldsymbol{y})$ for the observations, the function $\chi^2$ is strictly of second order in the parameter $\theta$. The second derivative of $\chi^2$ with respect to $\theta$ is then a constant and the series expansion (10.138) terminates with the second term, so that we have the parabolic dependence
$$\chi^2(\theta) = \chi^2_{\min} + \frac{1}{2} \frac{\partial^2 \chi^2}{\partial \theta^2} (\theta - \hat{\theta})^2. \qquad (10.139)$$
Since, from eq.(10.136) we also have
$$\chi^2(\theta) = \chi^2_{\min} + \frac{(\theta - \hat{\theta})^2}{V(\hat{\theta})}, \qquad (10.140)$$
it is seen from the two last equations by identification of the coefficients of the quadratic terms that
$$V(\hat{\theta}) = 2 \left(\frac{\partial^2 \chi^2}{\partial \theta^2}\right)^{-1}. \qquad (10.141)$$
This expression for the variance of the LS estimate $\hat{\theta}$ can be compared to the similar expression obtained for the large-sample ML estimate, eq.(9.36). It should be emphasized that the formulae (10.139)-(10.141) are exact under the conditions specified above.
In the case of a non-linear LS problem, or in general, with an $\chi^2$ function which is not of strictly parabolic form, one can still expect to find the variance - and hence the error - of the estimate $\hat{\theta}$ from the formula
$$V(\hat{\theta}) \approx 2 \left(\left.\frac{\partial^2 \chi^2}{\partial \theta^2}\right|_{\theta=\hat{\theta}}\right)^{-1},$$
which will be correct to the approximation that the higher-order terms of eq.(10.138) are small.
Under the assumption of unbiassed and normally distributed observations we can find (exact or approximate) confidence intervals for $\theta$ by seeking the intersections of the (exact or approximate) parabolic function $\chi^2(\theta)$ by the straight lines
$$\chi^2(\theta) = \chi^2_{\min} + a.$$
Here, in the one-parameter case, the values $a = 1^2$, $2^2$, and $3^2$ for the intersection distance from minimum lead to confidence intervals of probabilities 68.3, 95.4, and 99.7%, respectively, which correspond to one, two, and three standard deviation intervals when $\chi^2(\theta)$ is strictly parabolic in $\theta$. Hence there is a close analogy between the interval estimation from the $\chi^2$ function considered here and the likelihood function as described for the one-parameter case in Sect.9.7.1.

10.9.3 LS errors and confidence regions, the multi-parameter case
The generalization of the preceding section to the situation with more than one parameter is quite straightforward. The Taylor expansion of the $\chi^2$ function around the minimum value $\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}$ reads, in general,
$$\chi^2(\boldsymbol{\theta}) = \chi^2_{\min} + \frac{1}{2} \sum_i \sum_j \left.\frac{\partial^2 \chi^2}{\partial \theta_i\, \partial \theta_j}\right|_{\boldsymbol{\theta}=\hat{\boldsymbol{\theta}}} (\theta_i - \hat{\theta}_i)(\theta_j - \hat{\theta}_j) + \ldots \qquad (10.144)$$
By comparison with eq.(10.136), which is valid for the ideal case with a linear dependence and constant covariance matrix for the observations, one finds that the elements of the covariance matrix for the LS estimate $\hat{\boldsymbol{\theta}}$ can be expressed as
$$V^{-1}_{ij}(\hat{\boldsymbol{\theta}}) = \frac{1}{2} \left.\frac{\partial^2 \chi^2}{\partial \theta_i\, \partial \theta_j}\right|_{\boldsymbol{\theta}=\hat{\boldsymbol{\theta}}}.$$
This formula is also exact to the extent that $\chi^2$ has a quadratic dependence upon $\boldsymbol{\theta}$. It can be compared to the similar expression obtained for the covariances of ML estimates, eq.(9.32).
In the simplest situation with a linear model and a parameter independent covariance matrix for the normally distributed observations, the double sum in eq.(10.144) - which is identical to the covariance form of eq.(10.136) - is a chi-square variable for which the number of degrees of freedom is equal to the number of estimated parameters minus the number of linear constraints, if any. Specifically, with only two independent parameters the intersections between the $\chi^2(\boldsymbol{\theta})$ surface and the parallel planes at distance $a$ from the minimum $\chi^2_{\min}$ will be a set of concentric ellipses which define joint confidence regions for the two parameters, whose probability content is determined by $\chi^2(2)$. Hence the elliptic confidence regions obtained by taking $a = 1^2$, $2^2$, $3^2$ will have associated probabilities 39.3, 86.5, 98.9%, respectively, in complete analogy with the joint likelihood regions for the asymptotic two-parameter case discussed in Sect.9.7.4. In the general multi-parameter case, the intersection between the $\chi^2(\boldsymbol{\theta})$ hypersurface and the hyperplane at distance $a$ above the minimum will produce a hyperelliptic joint confidence region for all the parameters, for which the confidence coefficient is exactly given by the cumulative integral, up to the value $a$, of the chi-square p.d.f. with a number of degrees of freedom equal to the
number of independent parameters. Evidently, the associated probabilities for fixed values $a = 1^2, 2^2, \ldots$ will decrease quickly when the number of parameters increases. Conversely, to have a specified probability content for the joint confidence region, larger values of $a$ must be taken for increasing number of parameters.
From the joint confidence region for all parameters considered simultaneously one can also deduce conditional confidence regions (intervals) for subsets of the parameters, by seeking the intersections between this region and lines for fixed values of the remaining parameters, for example their estimated values. The arguments are the same as in Sects.9.7.4 and 9.7.6 for the asymptotic likelihood function. The specific choice $a = 1^2$ will as before supply the errors in the parameter estimates by the hypersurface circumscribing the joint confidence region, as well as the conditional errors obtained by keeping some parameters at their estimated values. In particular, with two independent parameters the situation is completely analogous to that described in detail for the binormal likelihood function, Sects.9.6.3 and 9.7.4.
Irrespective of whether the actual minimization of $\chi^2(\boldsymbol{\theta})$ involves a linear model or not, the errors in the LS estimates are by convention determined from the intersecting hyperplane at one unit above the minimum $\chi^2_{\min}$, and confidence regions deduced with the choices $a = 1^2, 2^2, \ldots$. In all situations when $\chi^2(\boldsymbol{\theta})$ is not of second order in the parameters and/or the observations are not normally distributed, a comparison with the chi-square distribution will obviously provide only approximate values for the probabilities associated with the different regions. How good the approximation is will in general depend on the magnitude of the higher-order terms in the series expansion of $\chi^2(\boldsymbol{\theta})$ and the validity of the normality assumption for the observations.

11. The method of moments

The parameter estimators constructed by the method of moments (MM) are consistent but in general neither as efficient as the Maximum-Likelihood estimators, nor sufficient. However, although the qualities of efficiency and sufficiency are important an estimation method should not be judged from its theoretical optimum properties alone, but also for its applicability to practical problems. In particle physics, the moments method because of its feasibility has been widely used in experiments to determine polarization and density matrix elements. The MM estimates can be easily obtained, since their evaluation only involves calculation of expectation values and averages of specified functions over the experimental sample.

11.1 BASIS FOR THE SIMPLE MOMENTS METHOD
Given a probability density function $f(x|\boldsymbol{\theta})$ with unknown parameters $\boldsymbol{\theta} = \{\theta_1, \theta_2, \ldots, \theta_k\}$ we want to estimate these parameters from a set of observations $x_1, x_2, \ldots, x_n$. The $r$-th algebraic moment of the population is defined by (see eq.(3.13) of Sect.3.3.3)
$$\mu'_r(\boldsymbol{\theta}) = \int_\Omega x^r f(x|\boldsymbol{\theta})\, dx, \qquad (11.1)$$
where $\Omega$ is the domain of $x$. A reasonable estimator of $\mu'_r(\boldsymbol{\theta})$ is then the arithmetic mean of the $r$-th power of the observations $x_i$,
$$m'_r = \frac{1}{n} \sum_{i=1}^n x_i^r. \qquad (11.2)$$
By equating the different moments of the parent population, which are functions of the unknown $\boldsymbol{\theta}$, to the numerical values of the corresponding sample moments we set
$$\mu'_r(\boldsymbol{\theta}) = m'_r, \qquad r = 1, 2, \ldots, k. \qquad (11.3)$$
A set of equations (11.3) can therefore be found and solved to give the MM estimates $\hat{\boldsymbol{\theta}} = \{\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_k\}$. As a limited number of moments usually will not contain all information about the p.d.f. the MM estimators will in general be less efficient than the Maximum-Likelihood estimators.
The estimator $m'_r$ of eq.(11.2) is an unbiassed estimator of $\mu'_r$, since
$$E(m'_r) = \frac{1}{n} \sum_{i=1}^n E(x_i^r) = \mu'_r,$$
in accordance with the requirement (8.3) for an unbiassed estimator.
To evaluate the variance of $m'_r$ we observe that $m'_r$ is a linear combination of the quantities $x_i^r$. The variance of this linear combination is
$$V(m'_r) = \frac{1}{n^2} \sum_{i=1}^n V(x_i^r) = \frac{1}{n} V(x^r),$$
where we use the fact that the $x_i$ are independent. Hence
$$V(m'_r) = \frac{1}{n} \left(\mu'_{2r} - \mu'^{\,2}_r\right),$$
which shows that the variance of the sample moment of a given order is dependent on the population moment of twice this order; $V(m'_r)$ may therefore, even when $n$ is large, be of considerable magnitude for higher moments if the p.d.f. has substantial tails. This explains why the simple method of taking the moments of the variable $x$ itself (i.e. eqs.(11.3)) is rather seldom used in practice.

Exercise 11.1: Show that the covariance between $m'_r$ and $m'_s$ is given by
$$\mathrm{cov}(m'_r, m'_s) = n^{-1} \left(\mu'_{r+s} - \mu'_r \mu'_s\right).$$
Exercise 11.2: Show that the MM estimators of the first algebraic moment and the second central moment of any p.d.f. are $\hat{\mu} = \bar{x}$ and $\hat{\sigma}^2 = \frac{1}{n} \sum (x_i - \bar{x})^2$, respectively. Show that the variance in the MM estimate $\hat{\mu}$ is
$$V(\hat{\mu}) = \frac{1}{n} \left(\mu'_2 - \mu'^{\,2}_1\right) = \sigma^2/n \approx \hat{\sigma}^2/n.$$

11.2 GENERALIZED MOMENTS METHOD
Instead of using a set of powers of the variable $x$ to estimate the unknown $\boldsymbol{\theta}$, one can select a set of independent functions of $x$ and proceed in a similar way to construct estimators for these functions in terms of appropriate averages of the functions evaluated for the sample values $x_1, x_2, \ldots, x_n$ *).

11.2.1 One-parameter case
In the simplest situation, when there is only one unknown parameter, it will suffice to consider a single function $g(x)$. The expectation of $g(x)$, or the first moment of this function, for the p.d.f. $f(x|\theta)$ is defined by (compare eq.(3.6) of Sect.3.3.1)
$$E(g(x)) \equiv \gamma(\theta) = \int_\Omega g(x) f(x|\theta)\, dx. \qquad (11.6)$$
An estimator of $\gamma(\theta)$ is now the average of $g(x)$ over the sample,
$$\hat{\gamma} = \bar{g} = \frac{1}{n} \sum_{i=1}^n g(x_i). \qquad (11.7)$$
The variance of this estimator is
$$V(\hat{\gamma}) = \frac{1}{n} V(g(x)), \qquad (11.8)$$
where for $V(g(x))$ we may insert its estimated value obtained from the sample,
$$s^2 = \frac{1}{n-1} \sum_{i=1}^n \left(g(x_i) - \bar{g}\right)^2. \qquad (11.9)$$
Thus we take
$$V(\hat{\gamma}) = \frac{1}{n(n-1)} \sum_{i=1}^n \left(g(x_i) - \bar{g}\right)^2. \qquad (11.10)$$
An alternative form, convenient for numerical computation, is obtained after a little algebra,
$$V(\hat{\gamma}) = \frac{1}{n-1} \left[\frac{1}{n} \sum_{i=1}^n g^2(x_i) - \bar{g}^2\right]. \qquad (11.11)$$

*) We will in the following allow $x$ to have several components; $x_i$ will therefore denote all measured quantities for the $i$-th event.
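The one-parameter recipe can be sketched in a few lines of Python. The sketch assumes that eq.(11.7) is the sample average of $g(x)$ and that eqs.(11.10) and (11.11) are the two equivalent forms of the estimated variance of that average, with divisor $n(n-1)$; the function name `mm_estimate` and the sample values are hypothetical.

```python
def mm_estimate(sample, g):
    """Moments estimate of gamma = E[g(x)] and its estimated variance.

    Returns the sample mean of g (assumed form of eq. 11.7) and the variance
    of that mean computed in two algebraically equivalent ways (assumed forms
    of eqs. 11.10 and 11.11).
    """
    n = len(sample)
    values = [g(x) for x in sample]
    mean = sum(values) / n
    # Direct form: sum of squared deviations divided by n(n-1).
    var_direct = sum((v - mean) ** 2 for v in values) / (n * (n - 1))
    # Computational form: (mean of squares minus square of mean) / (n-1).
    var_alt = (sum(v * v for v in values) / n - mean ** 2) / (n - 1)
    return mean, var_direct, var_alt


# Hypothetical sample; g(x) = x gives the estimate of the mean (cf. Exercise 11.3).
sample = [0.2, -0.5, 0.9, 0.1, -0.3, 0.7]
gamma_hat, v_direct, v_alt = mm_estimate(sample, lambda x: x)
print(gamma_hat, v_direct, v_alt)
```

The two variance forms agree up to rounding. The second needs only running sums of $g$ and $g^2$, which is why it is convenient when the sample is processed event by event.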
To summarize, eq.(11.7) will produce the MM estimate for the function $\gamma(\theta)$, and eq.(11.10) (or alternatively, eq.(11.11)) an estimate of its variance. These estimates must next be "inverted" to give the desired $\hat{\theta}$ and its error.

Exercise 11.3: Verify that the choice $g(x) = x$ reduces the formulae of the last section to the previously established expressions for the MM estimate of $\mu$ and its variance (Exercise 11.2).

11.2.2 Multi-parameter case
Let us now assume that the estimation problem involves $k$ unknown parameters, and that we have selected a set of $k$ linearly independent functions $g_1(x), g_2(x), \ldots, g_k(x)$. With the p.d.f. $f(x|\boldsymbol{\theta})$ the functions have expectation values given by
$$E(g_r(x)) \equiv \gamma_r(\boldsymbol{\theta}) = \int_\Omega g_r(x) f(x|\boldsymbol{\theta})\, dx, \qquad r = 1, 2, \ldots, k. \qquad (11.12)$$
These expectation values are functions of the unknown $\boldsymbol{\theta}$, and for their estimators we may take
$$\hat{\gamma}_r = \bar{g}_r = \frac{1}{n} \sum_{i=1}^n g_r(x_i), \qquad r = 1, 2, \ldots, k. \qquad (11.13)$$
These expressions are of course similar to the formulae written down for the one-parameter case.
For the covariance terms between the $\bar{g}_r$ we generalize eqs.(11.10)-(11.11) to give the elements of the covariance matrix for $\hat{\boldsymbol{\gamma}} = \{\hat{\gamma}_1, \hat{\gamma}_2, \ldots, \hat{\gamma}_k\}$,
$$V_{rs}(\hat{\boldsymbol{\gamma}}) = \frac{1}{n(n-1)} \sum_{i=1}^n \left(g_r(x_i) - \bar{g}_r\right)\left(g_s(x_i) - \bar{g}_s\right) = \frac{1}{n-1} \left[\frac{1}{n} \sum_{i=1}^n g_r(x_i) g_s(x_i) - \bar{g}_r \bar{g}_s\right]. \qquad (11.14)$$
In practice one will employ functions $g_r(x)$ for which the expectations have simple relationships to the parameters $\boldsymbol{\theta}$, and which are such that the variances on the estimates $\hat{\boldsymbol{\theta}}$ will not become too large.

11.2.3 Example: Density matrix elements (2)
We shall see how the generalized moments method can be used to find the density matrix elements in the example which was introduced in Sect.9.5.8 in connection with the Maximum-Likelihood approach to the estimation problem. We assume that a sample of $n$ resonance events has been obtained, each event corresponding to two measured quantities $\cos\theta_i$, $\phi_i$. The theoretical distribution for the decay of a vector meson into two pseudoscalar mesons is given by eq.(9.41), where $-1 \le \cos\theta \le +1$, $0 \le \phi \le 2\pi$, and $\rho_{00}$, $\rho_{1,-1}$, $\mathrm{Re}\,\rho_{10}$ are the three unknown parameters.
Guided by the form of the p.d.f. we now define three functions $g_1$, $g_2$, $g_3$ of the angular variables. Calculating the expectation values of these functions for the p.d.f. of eq.(9.41) and equating the expectations to the corresponding sample means we get a set of linear equations in the density matrix elements, where the summations are extended over all observed events.
The set of linear equations can easily be inverted to give the MM estimates of the density matrix elements. To find the errors in these estimates it is convenient to write out the linear relationship between the estimates and the sample means.
When the variance matrix $V(\bar{\boldsymbol{g}})$ has been calculated from the measurements using eq.(11.14), the law of propagation of errors, eq.(3.80), applies and gives the variance matrix $V(\hat{\boldsymbol{\rho}})$ of the density matrix elements, in which $V_{rs}$ is short for $V_{rs}(\bar{\boldsymbol{g}})$ from eq.(11.14).
The previous considerations assume that our sample consists of $n$ events that all represent true decays of the specific type $1^- \to 0^- + 0^-$. Most often it is not experimentally possible to obtain a pure sample of resonance events, and it is necessary to perform some kind of background subtraction. One popular way of doing this is to determine the density matrix elements $\rho_a$ using all the events in the resonance region of the effective mass plot, and then to calculate the same quantities $\rho_b$ using the events in two adjacent mass regions. The number $n_b$ of background events within the resonance region can be estimated from the shape of the effective mass spectrum. Under the assumption that this background is well described by the events in the neighbouring regions we can estimate the density matrix elements $\rho$ of the resonance by subtracting the background contribution. The uncertainty in this quantity is estimated from $\Delta\rho_a$, $\Delta\rho_b$, the errors connected to the estimates of $\rho_a$, $\rho_b$.

Exercise 11.4: The two-body decay $\frac{3}{2}^+ \to \frac{1}{2}^+ + 0^-$ can in the Jackson reference system be described by the p.d.f.
$$f = \frac{3}{4\pi} \left[\frac{1}{6}(1 + 4\rho_{33}) + \frac{1}{2}(1 - 4\rho_{33}) \cos^2\theta - \frac{2}{\sqrt{3}}\, \mathrm{Re}\,\rho_{31} \sin 2\theta \cos\phi - \frac{2}{\sqrt{3}}\, \mathrm{Re}\,\rho_{3,-1} \sin^2\theta \cos 2\phi\right],$$
where $\theta$ is the polar angle, $0 \le \theta \le \pi$, and $\phi$ the azimuth angle, $0 \le \phi \le 2\pi$. Show that the three density matrix elements $\rho_{33}$, $\mathrm{Re}\,\rho_{31}$, $\mathrm{Re}\,\rho_{3,-1}$ and their errors can be obtained by the moments method using the trial function of eq.(11.15).

Exercise 11.5: The decay $2^+ \to 0^- + 0^-$ has the distribution
$$\ldots + 2\,\mathrm{Re}\,\rho_{2,-1} \cos 3\phi\, \sin 2\theta\, \sin^2\theta + \rho_{2,-2} \cos 4\phi\, \sin^4\theta.$$
The nine density matrix elements are not all independent since the normalization condition requires $\rho_{00} + 2\rho_{11} + 2\rho_{22} = 1$. Discuss how a set of trial functions can be chosen for this p.d.f., which will lead to MM estimates for the parameters.

Exercise 11.6: Show that the formulae for the decays $1^- \to 0^- + 0^-$ (eq.(9.41)) and $2^+ \to 0^- + 0^-$ (Exercise 11.5) follow from the general formula eq.(10.85). (Hint: The density matrix $\rho$ is Hermitean.)

11.3 MOMENTS METHOD WITH ORTHONORMAL FUNCTIONS
The method outlined in Sect.11.2.2 becomes especially simple if the p.d.f. can be expressed as
$$f(x|\boldsymbol{\theta}) = 1 + \sum_{r=1}^k \theta_r\, \xi_r(x), \qquad (11.22)$$
where the $\xi_r(x)$ constitute a set of $k$ orthonormal functions, satisfying
$$\int_\Omega \xi_r(x)\, \xi_s(x)\, dx = \delta_{rs}, \qquad (11.23)$$
and
$$\int_\Omega \xi_r(x)\, dx = 0. \qquad (11.24)$$
Then eq.(11.12) yields the expectation of $\xi_r(x)$ simply as
$$E(\xi_r(x)) = \theta_r. \qquad (11.25)$$
Therefore, an unbiassed estimator of $\theta_r$ is provided by the average value of the function $\xi_r(x)$ over the sample,
$$\hat{\theta}_r = \bar{\xi}_r = \frac{1}{n} \sum_{i=1}^n \xi_r(x_i), \qquad r = 1, 2, \ldots, k. \qquad (11.26)$$
Since the functions $\xi_r(x)$ are orthogonal the covariance terms between different $\hat{\theta}_r$, $\hat{\theta}_s$ vanish. The error in $\hat{\theta}_r$ can be found from the approximate formula for the variance, eq.(11.14), which becomes
$$V(\hat{\theta}_r) = \frac{1}{n-1} \left[\frac{1}{n} \sum_{i=1}^n \xi_r^2(x_i) - \hat{\theta}_r^2\right]. \qquad (11.27)$$

11.3.1 Example: Polarization of antiprotons (3)
We reconsider the polarization example described under the Maximum-Likelihood method in Sect.9.5.7 and under the Least-Squares method in Sect.10.5.3. The distribution of the angle $\phi$ between the normals of the two scattering planes is given by
$$f(\cos\phi\,|\,\alpha) = \tfrac{1}{2}(1 + \alpha \cos\phi), \qquad -1 \le \cos\phi \le 1,$$
where the unknown is $\alpha = P^2$, the square of the polarization.
The p.d.f. is not of the form of eq.(11.22). However, in terms of a new variable
$$y = \tfrac{1}{2}(1 + \cos\phi), \qquad 0 \le y \le 1,$$
we can write the p.d.f. in the required form
$$f'(y|\alpha') = 1 + \alpha'\, \xi(y)$$
if the new parameter is taken to be $\alpha' = \alpha/\sqrt{3}$ and
$$\xi(y) = \sqrt{3}\,(2y - 1).$$
The function $\xi(y)$ satisfies the two conditions of eqs.(11.23), (11.24). Hence the results of the previous section can be applied directly. An unbiassed estimator of $\alpha'$ is therefore given by (eq.(11.26))
$$\hat{\alpha}' = \bar{\xi} = \frac{1}{n} \sum_{i=1}^n \sqrt{3}\,(2y_i - 1),$$
for which the variance is asymptotically (eq.(11.27))
$$V(\hat{\alpha}') = \frac{1}{n-1} \left(1 - \hat{\alpha}'^{\,2}\right).$$
In terms of the original variable $\cos\phi$ the estimate of the parameter $\alpha$ and its variance become
$$\hat{\alpha} = \frac{3}{n} \sum_{i=1}^n \cos\phi_i, \qquad V(\hat{\alpha}) = \frac{1}{n-1} \left(3 - \hat{\alpha}^2\right). \qquad (11.30)$$
It is interesting to compare this last result with the corresponding result for the variance obtained with the Maximum-Likelihood method. For large $n$ the variance of the ML estimate of $\alpha$ takes the smallest possible value, given by eq.(9.39). Hence the asymptotic efficiency of the moments estimator is
$$\mathrm{Efficiency}(\hat{\alpha}) = \frac{V_{ML}(\hat{\alpha})}{V_{MM}(\hat{\alpha})} = \frac{2\alpha^3}{\ln(1+\alpha) - \ln(1-\alpha) - 2\alpha} \cdot \frac{n-1}{n\,(3 - \alpha^2)}, \qquad (11.31)$$
which for large $n$ and $\alpha \ll 1$ gives
$$\mathrm{Efficiency}(\hat{\alpha}) = 1 - \frac{4}{15}\, \alpha^2. \qquad (11.32)$$
Thus, for small polarizations $P$, the MM estimator is nearly fully efficient. Numerically, the asymptotic efficiency is 0.99997 for $P = 0.10$, and 0.998 for $P = 0.30$. See further Chapter 12.

Exercise 11.7: Show that a more general expression than eq.(11.30) is
$$V(\hat{\alpha}) = \frac{1}{n} \left(3 - \alpha^2\right).$$
This variance formula is valid for all sample sizes.
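The efficiency figures quoted for the polarization example can be checked numerically. The Python sketch below assumes the p.d.f. $\frac{1}{2}(1 + \alpha\cos\phi)$ with $\alpha = P^2$, takes the large-$n$ limit so that the factor $(n-1)/n$ is set to one, and evaluates the ratio of the ML variance to the MM variance. The function names are hypothetical.

```python
import math


def mm_efficiency(alpha):
    """Large-n efficiency V_ML / V_MM of the moments estimator of alpha.

    Assumes f(x) = (1 + alpha*x)/2 on -1 <= x <= 1, for which
    V_MM = (3 - alpha^2)/n and
    V_ML = 2*alpha^3 / (n * (ln(1+alpha) - ln(1-alpha) - 2*alpha)).
    """
    # ln(1+a) - ln(1-a) = 2*atanh(a); written this way for readability.
    log_term = 2.0 * (math.atanh(alpha) - alpha)
    return 2.0 * alpha ** 3 / ((3.0 - alpha ** 2) * log_term)


def small_alpha_efficiency(alpha):
    """Leading-order approximation 1 - (4/15)*alpha^2, valid for alpha << 1."""
    return 1.0 - 4.0 * alpha ** 2 / 15.0


for polarization in (0.10, 0.30):
    alpha = polarization ** 2
    print(polarization, mm_efficiency(alpha), small_alpha_efficiency(alpha))
```

For these two polarizations the full expression and the small-$\alpha$ approximation agree to within about $2 \times 10^{-5}$, and both round to the values quoted in the text.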
11.3.2 Example: Angular momentum analysis
We go back to the example of Sect.10.5.4 and write the formula (10.87) for the angular distribution of a two-pion decay of a spin $J$ boson as an expansion with coefficients $c_{JM}$ in the functions $Y_J^M(\Omega)$, the well-known spherical harmonic functions. All conditions specified in Sect.11.3 for the use of orthogonal functions are satisfied, except for the trivial difference in variable region. Hence the results of Sect.11.3 apply, giving the unbiassed estimates of the expansion coefficients $c_{JM}$ as the average values of the $Y_J^M$ over the experimental sample (eq.(11.26)), where $n$ is the number of events. The estimates are uncorrelated and have variances given by eq.(11.27).
It will be seen that estimating by the moments method with orthogonal functions, as described here, is computationally much simpler than the Least-Squares estimation of Sect.10.5.4.

11.3.3 Confidence intervals for MM estimates
Working with orthonormal functions has the further advantage that confidence intervals can be obtained for the parameters independently. For the $r$-th parameter we have from the Central Limit Theorem, when $n$ becomes large,
$$\frac{\hat{\theta}_r - \theta_r}{\sigma_r/\sqrt{n}} \to N(0,1) \quad \text{when } n \to \infty, \qquad (11.37)$$
where $\hat{\theta}_r$ is the MM estimate of $\theta_r$ from eq.(11.26), and where $\sigma_r^2$ is the (unknown) variance in the distribution of $\xi_r(x)$ around $\theta_r$. For large $n$, $\sigma_r/\sqrt{n}$ may be approximated by $s_r/\sqrt{n}$, where $s_r^2$ is the sample variance of $\xi_r(x)$ from eq.(11.27); hence
$$\frac{\hat{\theta}_r - \theta_r}{s_r/\sqrt{n}} \to N(0,1) \quad \text{when } n \to \infty. \qquad (11.38)$$
The standard normal distribution can therefore, when $n$ is not too small, be used to derive approximate confidence intervals for each $\theta_r$ from the observed $\hat{\theta}_r$, $s_r$.

11.4 COMBINING MM ESTIMATES FROM DIFFERENT EXPERIMENTS
Suppose that two experiments have estimated the parameters $\boldsymbol{\theta}$ by the moments method using the same sets of orthogonal functions. The experiments have been based on $n_1$ and $n_2$ events, respectively, and have given the estimates $\hat{\boldsymbol{\theta}}^{(1)}$ and $\hat{\boldsymbol{\theta}}^{(2)}$. The result of the combined experiment for the $r$-th component in $\boldsymbol{\theta}$ is (eq.(11.26))
$$\hat{\theta}_r = \frac{n_1 \hat{\theta}_r^{(1)} + n_2 \hat{\theta}_r^{(2)}}{n_1 + n_2},$$
which is nothing but a simple weighted combination of the MM estimates from the individual experiments. The variance of this combined estimate of $\theta_r$ follows from eq.(11.27) applied to the combined sample of $n_1 + n_2$ events.
If the two experiments have used the more general moments method of Sect.11.2.2 the MM estimates of each experiment will be associated with a non-diagonal covariance matrix. The combined estimate and its error must then be found by a more elaborate procedure, for example by a Least-Squares approach as described for the example of Sect.10.2.6.



12. A simple case study with application of different parameter estimation methods

In practical problems to determine numerical values of physical constants on the basis of sets of experimental data the physicist may have the opportunity to choose between different estimation methods. His actual choice of estimators will usually depend on statistical as well as other criteria. General statistical properties that should be possessed by good estimators were considered in Chapter 8. These optimum properties are:

- consistency, which means that the estimator gives rise to estimates that converge towards the true parameter value when the number of observations is increased;
- unbiassedness, which means that, regardless of the sample size, the estimator produces estimates that are not systematically shifted from the true parameter value;
- efficiency, which means that the distribution of the estimates has minimum variance about the central value (equal to the true value of the parameter for unbiassed estimators);
- sufficiency, which means that the estimator exhausts all information in the observations regarding the unknown parameter.

In choosing between possible estimators the physicist will also take other factors into consideration. Preferentially,

- the establishing of necessary formulae for computation should be as simple as possible;
- the computer programming should not be too complicated; for example, relevant software should be available for matrix inversion and function optimization;
- the method should make economic use of computer time.

Some of the ideal theoretical properties and the practical demands will frequently be conflicting. In practice one will therefore have to give priority to certain factors and sacrifice others which are assumed less important.

In this chapter we will consider a particularly simple example with simulated polarization experiments where the estimation involves a single parameter. The purpose is, firstly, to recall how the parameter estimate and its error can be obtained by the different estimation methods described earlier and, secondly, to demonstrate that these estimators produce mutually consistent results.

12.1 SIMULATION OF POLARIZATION EXPERIMENTS

We will take for a case study the simple one-parameter example of Sects.9.5.7, 10.5.3, and 11.1.1, describing the polarization of antiprotons in $\bar{p}p$ elastic scattering. The theoretical distribution for the angle $\phi$ between the scattering normals is given by

    $f(x|\alpha) = \tfrac{1}{2}(1 + \alpha x)$,    (12.1)

where $x = \cos\phi$ is restricted to the interval [-1,+1]. For this class of underlying distributions we have generated artificial event samples corresponding to specified values of the parameter α by the "hit-and-miss" Monte Carlo method using a computer equipped with a uniform random number generator. The number generator delivers numbers r which are uniformly distributed between 0 and 1. An event candidate can be constructed from two consecutive numbers $r_1$ and $r_2$ from the generator by defining the angle $\phi$ through the relation

and this candidate is accepted and included in the artificial event sample if, for the specified α,

We have carried out simulations using two different values of the parameter α, α=0.09 (corresponding to a polarization $P = \sqrt{\alpha} = 0.3$) and α=0.25 (P=0.5). For each α, four independent, artificial samples were generated with n=10, 100, 1000, and 10000, giving altogether 8 independent, simulated experiments. Figure 12.1 shows histograms in $x = \cos\phi$ for the "events" from these 8 "experiments", together with curves showing the underlying "theoretical" distributions $f(x|\alpha)$ with the α-values used in the simulations, normalized to the number of events in the "experiments". It is seen that all histograms match well with the "theoretical" distributions, and it appears justified to consider the generated event samples as fairly typical for physical events originating from the distributions. We will therefore, in the following, regard the samples as if they consisted of real, observed events in 8 different experiments, and use them to estimate the unknown parameter α by the usual estimation methods described in Chapters 9-11.
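The hit-and-miss generation described in Sect.12.1 can be sketched as follows. This is a minimal illustration, not the authors' program: the candidate mapping x = 2r1 - 1 and the acceptance test against the largest value of the p.d.f. are the standard acceptance-rejection choices and are assumptions here, and the function names and the seed are hypothetical.

```python
import random


def pdf(x, alpha):
    """f(x|alpha) = (1 + alpha*x)/2 on [-1, +1], as in eq.(12.1)."""
    return 0.5 * (1.0 + alpha * x)


def generate_sample(alpha, n, rng):
    """Generate n values x = cos(phi) by the hit-and-miss method (assumed form)."""
    f_max = pdf(1.0, abs(alpha))            # largest value of the p.d.f. on [-1, +1]
    events = []
    while len(events) < n:
        r1, r2 = rng.random(), rng.random() # two consecutive uniform numbers in [0, 1)
        x = 2.0 * r1 - 1.0                  # candidate, uniform on [-1, +1)
        if r2 * f_max <= pdf(x, alpha):     # accept with probability f(x)/f_max
            events.append(x)
    return events


rng = random.Random(12345)                  # fixed seed, chosen arbitrarily
sample = generate_sample(0.25, 10000, rng)
mean_x = sum(sample) / len(sample)          # should be near alpha/3 for a large sample
```

Scaling r2 by the maximum of the p.d.f. keeps the acceptance rate as high as possible; comparing r2 directly with f(x) would also give correctly distributed events, only with more rejected candidates.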

Fig. 12.1. Histograms in $x = \cos\phi$ for 8 Monte Carlo simulated polarization experiments generated with the indicated values of the parameter α. The dashed straight lines correspond to the theoretical distribution $f(x|\alpha)$ of eq.(12.1) normalized to the generated number of events, n.

12.2 APPLICATION OF DIFFERENT ESTIMATION METHODS

We proceed now to see how the data of the simulated experiments can be used to obtain estimates of the unknown parameter and its error. In the following sections we briefly recapitulate how $\hat\alpha$ and $\Delta\hat\alpha$, or $V(\hat\alpha)$, can be found by the different estimators using explicit formulae and graphical methods. The numerical results obtained by the different methods for the 8 experiments are all summarized in Table 12.1, and will be discussed further in Sect.12.3 below.

12.2.1 The method of moments

The present example was treated in Sect.11.3.1 using the moments method with orthogonal functions. The MM estimate of α is simply given by

and for large samples the error in $\hat\alpha$ can be found from the asymptotic expression for the variance (eq.(11.30)),

For small samples the error can be obtained from the more general variance formula (Exercise 11.7)

The last formula has been used to obtain the errors for the generated samples with n not exceeding 100, although one may expect the numerical result for $V(\hat\alpha)$ to be rather sensitive to the particular sample values $x_i$, given n.
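The moments estimate and its error can be illustrated with a short sketch. The formulas used here are derived directly from the p.d.f. f(x|α) = (1 + αx)/2, for which E(x) = α/3 and V(x) = 1/3 - α²/9; they are assumed, not confirmed, to coincide with the book's own expressions. The small-sample error below simply uses the unbiassed sample variance and is not necessarily the formula of Exercise 11.7. The sample values are hypothetical.

```python
import math


def mm_estimate(xs):
    """Moments estimate: E(x) = alpha/3, so alpha_hat = 3 * mean(x)."""
    return 3.0 * sum(xs) / len(xs)


def mm_error_asymptotic(alpha_hat, n):
    """Large-sample error: V(alpha_hat) = 9 V(x) / n = (3 - alpha^2) / n."""
    return math.sqrt((3.0 - alpha_hat ** 2) / n)


def mm_error_sample(xs):
    """Error from the unbiassed sample variance of x (an assumed small-sample form)."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return 3.0 * math.sqrt(s2 / n)


# Hypothetical sample of n = 10 values of x = cos(phi)
xs = [0.9, -0.2, 0.5, -0.7, 0.3, 0.1, -0.4, 0.8, 0.6, -0.1]
alpha_hat = mm_estimate(xs)
error_large_n = mm_error_asymptotic(alpha_hat, len(xs))
error_small_n = mm_error_sample(xs)
```

For n=1000 and an estimate of 0.078 the asymptotic expression gives an error close to 0.055, which agrees with the corresponding moments entry of Table 12.1.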

12.2.2 The Maximum-Likelihood method

The likelihood function is given by

    $L(x_1,\ldots,x_n|\alpha) = \prod_{i=1}^{n} \tfrac{1}{2}(1 + \alpha x_i)$,

from which

    $\ln L = -n\ln 2 + \sum_{i=1}^{n} \ln(1 + \alpha x_i)$.    (12.7)

In Fig. 12.2 lnL is shown as a function of α for the 8 experiments. For each experiment the ML estimate $\hat\alpha$ corresponds to the peak of the lnL function, and the error in $\hat\alpha$ is determined by intersecting the function by a straight line at a distance 0.5 below its maximum value. As can be seen from Fig. 12.2 the lnL function, even for the smallest samples, has an almost symmetric and parabolic shape. The error $\Delta\hat\alpha$ can therefore for each experiment be taken as the average of the distances $\Delta\alpha_L$ and $\Delta\alpha_U$ defined by the lower and upper intersection points; see Fig. 12.2(a).

In addition to the estimated errors obtained by the graphical method Table 12.1 also gives the errors deduced from the large-sample formula (9.39) derived for the present p.d.f. in Sect.9.5.7,
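The ML procedure of this section can be sketched numerically. The maximum of lnL is located by bisection on its derivative, the "graphical" error is taken from the points where lnL has dropped by 0.5, and the large-sample error is computed from the expected information of f(x|α) = (1 + αx)/2. The information integral below was worked out for this sketch and is only presumed to agree with formula (9.39); the restriction of α to [-0.999, 0.999], the helper names, and the sample generation are likewise assumptions.

```python
import math
import random


def log_likelihood(alpha, xs):
    """lnL of eq.(12.7)."""
    return -len(xs) * math.log(2.0) + sum(math.log(1.0 + alpha * x) for x in xs)


def score(alpha, xs):
    """d(lnL)/d(alpha); strictly decreasing in alpha."""
    return sum(x / (1.0 + alpha * x) for x in xs)


def ml_estimate(xs, lo=-0.999, hi=0.999):
    """Maximum of lnL on [lo, hi] by bisection on the score."""
    if score(lo, xs) <= 0.0:
        return lo
    if score(hi, xs) >= 0.0:
        return hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if score(mid, xs) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def crossing(xs, a_hat, bound, target):
    """Point between a_hat and bound where lnL = target (bound if never reached)."""
    if log_likelihood(bound, xs) > target:
        return bound
    inside, outside = a_hat, bound
    for _ in range(60):
        mid = 0.5 * (inside + outside)
        if log_likelihood(mid, xs) > target:
            inside = mid
        else:
            outside = mid
    return 0.5 * (inside + outside)


def graphical_error(xs, a_hat):
    """Average of the lower and upper distances to the lnL(max) - 0.5 points."""
    target = log_likelihood(a_hat, xs) - 0.5
    lower = crossing(xs, a_hat, -0.999, target)
    upper = crossing(xs, a_hat, 0.999, target)
    return 0.5 * (upper - lower)


def large_sample_error(alpha, n):
    """1/sqrt(n*I), I = integral of x^2 / (2(1 + alpha x)) over [-1, +1]."""
    if abs(alpha) < 1e-2:                       # series avoids cancellation near 0
        info = 1.0 / 3.0 + alpha ** 2 / 5.0 + alpha ** 4 / 7.0
    else:
        info = math.log((1 + alpha) / (1 - alpha)) / (2 * alpha ** 3) - 1 / alpha ** 2
    return 1.0 / math.sqrt(n * info)


# Hypothetical sample: 1000 hit-and-miss events generated with alpha = 0.25
rng = random.Random(7)
xs = []
while len(xs) < 1000:
    x = 2.0 * rng.random() - 1.0
    if rng.random() * 0.625 <= 0.5 * (1.0 + 0.25 * x):
        xs.append(x)

a_hat = ml_estimate(xs)
err_graphical = graphical_error(xs, a_hat)
err_analytical = large_sample_error(a_hat, len(xs))
```

Evaluated at α=0.25 and n=100 the large-sample expression gives an error close to 0.170, the value listed in the MVB column of Table 12.1 for experiment (f).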

Fig. 12.2. lnL of eq.(12.7) as a function of the parameter α for the 8 simulated experiments.

12.2.3 The Maximum-Likelihood method for classified data

When the events are classified in N bins with $n_i$ observed events in the i-th bin the ML solution for the unknown parameter α is found by maximizing the expression (compare Sect.9.9)

    $\ln L(n_1,n_2,\ldots,n_N|\alpha) = \sum_{i=1}^{N} n_i \ln p_i(\alpha)$,

where $p_i(\alpha)$ is the probability for the i-th bin, given α. For the bin extending from $x_i$ to $x_i + \Delta x_i$ we have

    $p_i(\alpha) = \int_{x_i}^{x_i+\Delta x_i} \tfrac{1}{2}(1 + \alpha x)\,dx = a_i + \alpha b_i$,    (12.9a)

where

    $a_i \equiv \tfrac{1}{2}\Delta x_i, \qquad b_i \equiv \tfrac{1}{2}\Delta x_i\,(x_i + \tfrac{1}{2}\Delta x_i)$.    (12.9b)

Hence we can write

    $\ln L(n_1,n_2,\ldots,n_N|\alpha) = \sum_{i=1}^{N} n_i \ln[1 + \alpha(x_i + \tfrac{1}{2}\Delta x_i)] + \text{const}$,    (12.10)

where the argument for the logarithm is taken at the central value of the i-th bin.

Although the ML method with classified data was introduced to save computation, and only is of practical interest when n is large, we have applied it here for demonstration purposes for n as small as 100 (10 classes). The results obtained are given in Table 12.1, together with the errors in the estimates as deduced by the graphical method using the points where the lnL function is 0.5 below its maximum value.

12.2.4 The Least-Squares method

With n events distributed in N bins the usual form for the function to be minimized is

    $\chi^2 = \sum_{i=1}^{N} \dfrac{(n_i - n p_i(\alpha))^2}{n p_i(\alpha)}$,    (12.11)

where $p_i(\alpha)$, the probability for the i-th bin, is given by eqs.(12.9a,b). The $\chi^2$ is shown graphically in Fig. 12.3 as a function of α for the 8 experiments, with their minimum values $\chi^2_{min}$ indicated, corresponding to the LS estimates $\hat\alpha$. Also indicated in each graph is the straight line at distance 1.0 above the function minimum, which determines the error in each estimate.

Fig. 12.3. $\chi^2$ of eq.(12.11) as a function of the parameter α for the 8 simulated experiments.
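The binned quantities of Sects.12.2.3 and 12.2.4 can be put together in a short sketch. The bin probabilities follow eqs.(12.9a,b). The ordinary chi-square is written in the usual Pearson form with n p_i in the denominator, and the simplified version is assumed to replace that denominator by the observed n_i, which makes chi-square quadratic in α and gives a closed-form minimum. These forms, the function names, and the idealized bin contents are assumptions of the sketch; the closed-form estimate was derived for it and is not copied from eqs.(10.82), (10.83).

```python
import math


def bin_coefficients(n_bins):
    """a_i and b_i of eq.(12.9b) for equal bins covering [-1, +1]."""
    dx = 2.0 / n_bins
    a = [0.5 * dx for _ in range(n_bins)]
    b = [0.5 * dx * (-1.0 + i * dx + 0.5 * dx) for i in range(n_bins)]
    return a, b


def chi2_ordinary(alpha, counts):
    """Sum of (n_i - n p_i)^2 / (n p_i), with p_i = a_i + alpha * b_i."""
    n = sum(counts)
    a, b = bin_coefficients(len(counts))
    total = 0.0
    for n_i, a_i, b_i in zip(counts, a, b):
        expected = n * (a_i + alpha * b_i)
        total += (n_i - expected) ** 2 / expected
    return total


def simplified_ls(counts):
    """Closed-form minimum of sum (n_i - n p_i)^2 / n_i; requires all n_i > 0."""
    n = sum(counts)
    a, b = bin_coefficients(len(counts))
    num = sum(b_i * (n_i - n * a_i) / n_i for n_i, a_i, b_i in zip(counts, a, b))
    den = sum(b_i ** 2 / n_i for n_i, b_i in zip(counts, b))
    alpha_hat = num / (n * den)
    error = 1.0 / (n * math.sqrt(den))       # chi-square rises by 1.0 at alpha_hat +- error
    return alpha_hat, error


# Idealized "data": exact expectations for alpha = 0.25, n = 1000, 10 bins
a, b = bin_coefficients(10)
counts = [1000.0 * (a_i + 0.25 * b_i) for a_i, b_i in zip(a, b)]
alpha_simplified, error_simplified = simplified_ls(counts)
chi2_at_truth = chi2_ordinary(0.25, counts)
```

With real, integer bin contents the ordinary chi-square would be scanned or minimized numerically in α, and its error read off where the function has risen by 1.0 above its minimum, as described for Fig. 12.3.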
12.2.5 The simplified Least-Squares method

As we saw in Sect.10.5.3, the simplified LS method with

and $n p_i(\alpha)$ linear in the parameter α, implies that $\chi^2$ is of second order in α. Hence the analytical solutions can be written down for the LS estimate and its error (eqs.(10.82), (10.83)).

Table 12.1 gives the numerical results obtained for the estimated parameter and its error with the simplified LS method as well as the ordinary LS method applied to all 8 experiments, including the two experiments with sample size n=10, which do not fulfil the usual requirements on the number of bins and their contents as discussed in Sects.10.5.1, 10.5.2; for the latter the numbers are given in parenthesis.

Table 12.1. Summary of results from estimations

Generated sample     Estimation method      Number of   Estimated      Estimated parameter error
(experiment)                                bins N      parameter      Analytical   Graphical   MVB
                                                        value
---------------------------------------------------------------------------------------------------
(a) α=0.09, n=10     Moments                -           0.42           0.62         -
                     ML                     -           0.38           0.52         0.47
                     ML, classified data    -           -              -            -           0.54
                     LS, ordinary           (4)         (0.15)         -            (0.53)
                     LS, simplified         (4)         (0.16)         (0.55)       -

(b) α=0.09, n=100    Moments                -           0.172          0.175        -
                     ML                     -           0.170          0.171        0.173
                     ML, classified data    10          0.177          -            0.174       0.173
                     LS, ordinary           10          0.175          -            0.172
                     LS, simplified         10          0.188          0.165        -

(c) α=0.09, n=1000   Moments                -           0.078          0.055        -
                     ML                     -           0.080          0.054        0.056
                     ML, classified data    50          0.080          -            0.056       0.054
                     LS, ordinary           50          0.075          -            0.054
                     LS, simplified         50          0.109          0.052        -

(d) α=0.09, n=10000  Moments                -           0.093          0.0173       -
                     ML                     -           0.093          0.0173       0.0173
                     ML, classified data    100         0.093          -            0.0172      0.0173
                     LS, ordinary           100         0.092          -            0.0175
                     LS, simplified         100         0.095          0.0172       -

(e) α=0.25, n=10     Moments                -           0.21           0.44         -
                     ML                     -           0.40           0.52         0.71
                     ML, classified data    -           -              -            -           0.54
                     LS, ordinary           (4)         (0.05)         -            (0.69)
                     LS, simplified         (4)         (0.40)         (0.43)       -

(f) α=0.25, n=100    Moments                -           0.215          [illegible]  -
                     ML                     -           0.240          0.170        0.178
                     ML, classified data    10          0.251          -            0.180       0.170
                     LS, ordinary           10          0.250          -            0.180
                     LS, simplified         10          0.224          0.154        -

(g) α=0.25, n=1000   Moments                -           0.211          0.055        -
                     ML                     -           0.210          0.054        0.054
                     ML, classified data    50          0.207          -            0.054       0.054
                     LS, ordinary           50          0.200          -            0.057
                     LS, simplified         50          0.215          0.054        -

(h) α=0.25, n=10000  Moments                -           0.262          0.0171       -
                     ML                     -           0.258          0.0170       0.0172
                     ML, classified data    100         0.259          -            0.0168      0.0170
                     LS, ordinary           100         0.258          -            0.0170
                     LS, simplified         100         0.260          0.0169       -

12.3 DISCUSSION

12.3.1 The estimated parameters and their errors

From the numerical values of Table 12.1 the following conclusions may be drawn:

- for each generated sample (experiment) the estimated values of the parameter by the five different procedures are generally in good agreement, except when the sample size is very small (n=10);
- within each sample the parameter errors estimated by the different procedures are roughly equal;
- the estimated errors are inversely proportional to the square root of the sample size.

For the larger samples the first result is not surprising, since all
five estimators are consistent and asymptotically unbiassed. For the very small samples with n=10, even the methods which utilize all information in the (unbinned) data, i.e. the moments and the ML method, give numerically different values for the estimated parameters; however, considering the magnitude of the estimated errors, these results are not incompatible.

The dependence of the estimated errors upon the sample size is as expected. For the p.d.f. of the present example, eq.(12.1), the minimum variance bound, MVB, can be evaluated from the fundamental Cramér-Rao inequality (8.11); one finds

which is nothing but the ML large-sample variance of eq.(12.8). The corresponding MVB error Δα obtained for each generated sample is also given in Table 12.1. It is seen that the errors estimated by the different procedures are close to the MVB error for all sample sizes. This means that all five estimation procedures have a high efficiency also for small n. (However, no estimation procedure for α can be fully efficient for all n, since the p.d.f. of eq.(12.1) does not belong to the exponential family and therefore does not have any sufficient estimator for α; Sect.8.6.1.)

Some of the estimated errors in Table 12.1 are somewhat smaller than the corresponding MVB error. This need not disturb us, since the MVB is to be understood as the lower limit of the expected value of the estimated variance, and thus represents no absolute minimum for this quantity. Hence if many new samples were generated, with similar number of events, these could give smaller or larger estimated errors than the actual table value for the given methods, but in such a way that their average value, for any method, would always be at least as large as the MVB error.

12.3.2 Goodness-of-fit

As was emphasized in Chapter 10 the Least-Squares method has an advantage over other parameter estimation procedures in that it can provide a direct measure of the goodness-of-fit between a fitted model and the experimental data, since the minimum value obtained for the optimized function, under certain conditions, has well defined distribution properties. Specifically, if the number of events is not too small, $\chi^2_{min}$ is a chi-square variable with a number of degrees of freedom equal to the number of independent terms in the $\chi^2$ sum minus the number of independent parameters estimated; the corresponding chi-square probability $P_{\chi^2}$ is then the probability for obtaining a higher value of $\chi^2_{min}$ than that observed, and can be found, for example, from the graph of Fig. 5.2.

Table 12.2 gives the minimum value $\chi^2_{min}$ and the deduced chi-square probability $P_{\chi^2}$ for the ordinary and the simplified LS fits to the generated samples of size n ≥ 100. In each fit the number of degrees of freedom is (N-1) - 1 = N-2, where N is the number of bins used.

Table 12.2.

Generated sample      Number of   Ordinary LS                Simplified LS
(experiment)          bins N      χ²(min)      P(χ²)         χ²(min)      P(χ²)
-------------------------------------------------------------------------------
(b) α=0.09, n=100     10          4.6          0.80          4.5          0.82
(c) α=0.09, n=1000    50          38.5         0.82          47.3         0.56
(d) α=0.09, n=10000   100         [illegible]  [illegible]   [illegible]  [illegible]
(f) α=0.25, n=100     10          11.7         [illegible]   11.8         0.17
(g) α=0.25, n=1000    50          [illegible]  [illegible]   [illegible]  [illegible]
(h) α=0.25, n=10000   100         39.0         >0.99         39.5         >0.99

The numbers of Table 12.2 show that the chi-square probabilities from the ordinary and the simplified LS methods are similar. The high probabilities indicate that the LS fits are "good". In particular, the exceedingly high probabilities obtained for the large sample experiments in this case are very likely just a reflection of a well-behaved random number generator, which has produced artificial event samples which are extremely close to the ideal continuous distributions of eq.(12.1).

For small samples the distribution of the $\chi^2_{min}$ statistic is not known, and the LS estimation is not associable with a chi-square probability expressing the goodness-of-fit. Similarly, regardless of the sample size, the moments and the ML estimation methods provide no direct measures for the goodness-of-fit. One may of course calculate a corresponding $\chi^2$ value from eq.(12.11) or (12.12) using the fitted parameter values by these methods and a reasonable binning of the data, but the $\chi^2$ statistic constructed this way is generally not simply chi-square distributed, and one will therefore in general not be able to assign a chi-square probability for the goodness-of-fit. Only if there is a very large number of observations, corresponding to a substantial number of events in the separate bins, can the $\chi^2$ statistic as obtained by inserting the ML estimates for the parameters be regarded as approximately chi-square distributed (see Sect.14.4.3), and a chi-square probability for the goodness-of-fit be deduced from, for example, a standard graph of the cumulative chi-square distribution.
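Instead of reading the chi-square probability from a graph, it can be computed directly. The sketch below uses the closed-form upper-tail probability of the chi-square distribution, which holds only for an even number of degrees of freedom; that covers the fits discussed here, where N-2 equals 8, 48, or 98. The function name is hypothetical.

```python
import math


def chi2_probability_even_dof(chi2_min, dof):
    """Upper-tail probability P(chi2 > chi2_min) for an even number of degrees of freedom.

    Uses P = exp(-x/2) * sum_{k=0}^{dof/2 - 1} (x/2)^k / k!, valid for even dof only.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form requires a positive, even dof")
    half = 0.5 * chi2_min
    term = 1.0                      # k = 0 term of the sum
    total = 1.0
    for k in range(1, dof // 2):
        term *= half / k            # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# Experiment (b) of Table 12.2: chi2_min = 4.6 with N - 2 = 8 degrees of freedom
p_b = chi2_probability_even_dof(4.6, 8)
# Experiment (h): chi2_min = 39.0 with N - 2 = 98 degrees of freedom
p_h = chi2_probability_even_dof(39.0, 98)
```

For an odd number of degrees of freedom a general incomplete-gamma routine would be needed instead.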
In principle, the numerical value obtained for lnL(max) could also be used to supply information on the goodness-of-fit if the distributional properties of the statistic lnL(max) were known. This is generally not the case. One can, however, construct an approximate probability distribution of lnL(max) corresponding to the specific $\hat\alpha$ and n by using the Monte Carlo technique to generate a large number of event samples, all of size n, and determine for these (independent) samples the frequency distribution of the values obtained for lnL(max). Since lnL(max) depends on the parameter α, only simulated experiments producing fitted parameter values very close to the specific $\hat\alpha$ should be used in deriving this frequency distribution. The probability to obtain a smaller value than the actually observed lnL(max) for the specific $\hat\alpha$ can then be estimated as the integrated value from $-\infty$ up to lnL(max) of the derived frequency distribution, thus providing the desired measure of goodness-of-fit.

Figure 12.4 shows the frequency distribution for lnL(max) and the corresponding cumulative distribution F obtained this way on the basis of 100 independent simulated experiments with n=10 which all gave estimated parameter values in the interval [0.37,0.41]. We take Fig. 12.4(a) to represent an approximate distribution of lnL(max) for the two small sample experiments (a) and (e) from Table 12.1, for which the ML method gave the estimated parameter value $\hat\alpha$ equal to 0.39 and 0.40, respectively. Since the actual values obtained for lnL(max) were -6.68 for experiment (a) and -6.79 for experiment (e), we find from Fig. 12.4(b) that the corresponding estimated ML probabilities become 0.46 and 0.04 for these experiments. Proceeding in a similar manner to obtain the approximate lnL(max) distributions for the two experiments (b) and (f) in Table 12.1, both with n=100, we estimate their ML probabilities to be 0.55 and 0.12, respectively; these numbers compare reasonably with the chi-square probabilities as given in Table 12.2, being ≈0.80 and ≈0.18 for these experiments.

Fig. 12.4. (a) Distribution of lnL(max) obtained for 100 simulated experiments with n=10 and $\hat\alpha \in [0.37,0.41]$. (b) The cumulative integral of the distribution in (a).
13. Minimization procedures

There e x i s t s a v a r i e t y of problems which r e q u i r e t h e o p t i m i z a t i o n of


general nature are p r e s e n t l y a v a i l a b l e as more o r l e s s standard equipment of
large ~ o m p u t e ri n s t a l l a t i o n s . Due t o the f i n i t e work leogth of d i g i t a l computers
,..er may sometimes have t o worry about numerical inaccuracies a r i s i n g f r w
rounding-aff errors, underflow and overflow. D i f f i c u l t i e s of t h i s kind require
!
special a t t e n t i o n and can be handled by various e l a b o r a t e methods. A discussion
some f u n c t i o n with r e s p e c t t o a s e t of parameters. The Haxi-Likelihood and
t h e Least-Squares methods have i n co-n t h a t t h e o p t i m values of t h e unknown of these more t e c h n i c a l p o i n t s i s , however, o u t s i d e the scope of the p r e s e n t
parameters are determined by seeking the extremum of a f u n c t i o n f o r which an text.
e x p l i c i t dependence on the parameters has been w r i t t e n down. Since maximizing
t h e l i k e l i h o o d f u n c t i o n i s e q u i v a l e n t t o minimizing i t s n e g a t i v e value we may 13.1 LXNEKAL REMARKS
formulate the Ml. as w e l l as t h e LS methods f o r parameter e s t i m a t i o n as p r o b l e m 10 t h i s =hapter t h e r e a l function t o be minimized is denoted by F(g).

of minimizing a f m c t i o n with r e s p e c t t o i t s v a r i a b l e s . the v a r i a b l e _x = ..


Ix, , a r . . . X 1 corresponding t o the n parameters.

The n u n e r i e a l minimization procedures t o be o u t l i n e d i n t h i s chapter A minimum p a i n t of F ( x ) i s defined as a p o i n t where F(_x) '~(5')


are of r a t h e r general a p p l i c a b i l i t y , t h e only assumption being, f o r some methods. f o r a l l points _x near $. From elementary calculus we know t h a t a minimum point

an approximate q u a d r a t i c behaviour of the f u n c t i o n i n the i w e d i a t e v i c i n i t y of mast be of one of the following types:


i t s minimum. In p a r t i c u l a r , when minimizing the negative logarithm of t h e l i k e -
(i) a stationary point, where a l l t h e d e r i v a t i v e s aF/axi are zero,
lihood f u n c t i o n or t h e sum of squared d e v i a t i o n s by t h e Least-Squares method.
t h e f u n c t i o n is o f t e n t o a good approximation q u a d r a t i c i n the v a r i a b l e s near ( i i ) a cusp, where s o w of the d e r i v a t i v e s are zero and o t h e r s do not
i t s minimum. exist,
E f f e c t i v e use of the techniques described i n the following presupposes
( i i i ) an edge point, t h a t i s , a ~ o i n lt y i n g an the boundary of t h e
t h a t t h e procedures have been programed f o r high-speed computers.
Although much
of t h e b a s i c theory d a t e s back t o Newton's time o r even e a r l i e r , i t i s now h a r d l y allowed v a r i a b l e region.

conceivable t o apply these techniques f o r hand c a l c u l a t i o n . One m y conveniently We w i l l h e r e consider functions f o r which an e n p l i c i t a n a l y t i c erpres-
t h i n k of t h e procedure as a progr-d subroutine o r algorithm which is c a l l e d The usual way of f i n d i n g the e r
sion is not s p e c i f i e d or i s very complicated.
w i t h assigned values of t h e v a r i a b l e s (parameters) and which r e t u r n s the function
trema of a function by equating a l l i t s f i r s t d e r i v a t i v e s t o zero i s then not
v a l u e and sometimes i t s d e r i v a t i v e s . d i r e c t l y applicable. A reasonable approach i n t h i s s i t u a t i o n i s t o perform a
The renewed i n t e r e s t i n the rninimiratian problem during the l a s t y e a r s napping or aearch over t h e v a r i a b l e space t o l o c a t e the minima of F(x_). Such a
has r e s u l t e d i n new minimization procedures as well as improvements and exten- search can be done i n many ways, defining d i f f e r e n t minimization procedures.
s i o n s t o o l d ones. We s h a l l emphasize the p r i n c i p l e s behind s e l e c t e d methods and The f m c t i o n t o be minimized o f t e n has more than one minimum. Since
d e s c r i b e how they work. The r e a d e r who needs more d e t a i l e d information end
it seems t o be r a t h e r d i f f i c u l t t o define minimization procedures which w i l l
f u r t h e r t h e o r e t i c a l j u s t i f i c a t i o n should c o n s u l t more s p e c i a l i z e d l i t e r a t u r e . surely produce the absolute minimum of a function, o r the gtoboi m i n i m , we w i l l
Several very good and f l e x i b l e minimization progra-s*) of r a t h e r a t f i r s t a n t i c i p a t e the procedures t o l e a d t o the n e a r e s t tocat m i n i m .
To be able t o propose " i n t e l l i g e n t " minimization methods l e t us f i r s t
*) An example is HIWIT of the CERN Program Library.
I n t u i t i v e l y we expect a l l
study the f u n c t i o n F(4) near some a r b i t r a r y p o i n t c _ .
the d e r i v a t i v e s of a p h y s i c a l l y meaningful F(5) t o e x i s t i n the region of i n t e r -
The minimization procedures we d e s c r i b e can conveniently be divided
est. We may t h e r e f o r e perform a Taylor s e r i e s expansion of F(5) around t h e p o i n t
i n t o two main c l a s s e s , t h e step methods and the g m d i e n t methods. The s t e p
c
- and w r i t e
i do n o t use any information about t h e d e r i v a t i v e s of F(g) when the s t e p
length and s t e p d i r e c t i o n have been chosen, whereas t h e g r a d i e n t methods do.
Common t o many methods i s t h a t they need t h e r e p e t i t i o n of a c e r t a i n
T . One must t h e r e f o r e
where g is t h e transposed g r a d i e n t v e c t o r with elements g.=aF/axi, and t h e procedure t o make the f u n c t i o n converge t w a r d s a minimm.
matrix G has elements G. . = a 2 ~ / a x i a x .with t h e i n d i c e s i , j running from 1 t o n. formulate convergence c r i t e r i a which w i l l ensure t h a t the process i s brought t o
11 1
The d e r i v a t i v e s are t o be evaluated a t 5 = 5. an end as soon as t h e c r i t e r i a are f u l f i l l e d . The r e p e t i t i o n s can be stopped.
I n eq.(13.1) F(S) i s a c o n s t a n t and t h e r e f o r e s u p p l i e s no information f a r e x a n p l e , i f t h e change i n t h e function v a l u e f o r two consecutive i t e r a t i o n s
about t h e l o c a t i o n of a minimum.
I n t h e second term the vector g i s expected t o i s smaller than a preassigned number.
vary considerably over parameter space, being c l o s e t o zero i n t h e neighbrxlrhaod When a minimum has been obtained f o r F(5) i t remains t o f i n d t h e er-
of a s t a t i o n a r y minimum. The components of t h e product gT (5-5) w i l l t e l l us i n rors on the For the l a s t of t h e minimization procedures described
which d i r e c t i o n F(5) changes most r a p i d l y , b u t not how f a r we should go t o poss- here, the Davidan variance algorithm, t h e covariance matrix i s obtained by the
i b l y reach the minimum. Information about t h e required s t e p s i z e can be gained algorithm i t s e l f . For t h e o t h e r minimization methods s p e c i a l v a r i a n c e algo-
0 from t h e t h i r d term of eq.(13.1). The m a t r i x G , derived from t h e second deriva- rithms must be applied, based upon reasonable a s s u q t i o n s about F(5) and the
I !
! t i v e s of F ( 5 ) , w i l l u s u a l l y have a modest v a r i a t i o n over parameter space, being ideas of Chapters 9 and 10, whenever these are a p p l i c a b l e .
I :
c o n s t a n t f o r a f u n c t i o n F(x) of s t r i c t l y q u a d r a t i c form.
Clearly, i f F(5) i s t o I n Sect.13.6 we w i l l d i s c u s s t h e s i t u a t i o n which a r i s e s when the func-
possess any minimum value a t a l l t h e r e must be some r e s t r i c t i o n s on t h e s y m n e t r i ~ tion F(5) is c o n s t r a i n e d through l i m i t e d allowed regions f o r t h e parameters.
n x n m a t r i x 6. At a s t a t i o n a r y minimum G i s p o s i t i v e - d e f i n i t e .
For a specified problem the choice of minimization method should depend on the information available on the function $F(\mathbf{x})$. In general, the more information about $F(\mathbf{x})$ actually used in the minimization, the more efficient we expect the method to be. One can conveniently consider the following situations in levels of increasing knowledge about $F(\mathbf{x})$:

(i) only $F(\mathbf{x})$ is known,

(ii) $F(\mathbf{x})$ as well as its first derivatives are known,

(iii) $F(\mathbf{x})$, its first and second derivatives are known and reasonably continuous.

We shall in the following consider minimization methods which assume different knowledge about $F(\mathbf{x})$. If the derivatives are not known analytically they may have to be obtained numerically.

Fig. 13.1. "Rosenbrock's curved valley", $F(x_1,x_2) = 100(x_2-x_1^2)^2 + (1-x_1)^2$. (Exercise 13.1).
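As a small numerical companion to the remarks on the gradient $\mathbf{g}$ and the matrix $G$, the sketch below codes the function of Fig. 13.1 together with the derivative formulas quoted in Exercise 13.1, and checks that $G$ is positive-definite at the minimum (1,1) but not everywhere in parameter space. The function names are illustrative only, and the positive-definiteness test assumes the 2×2 case (Sylvester's criterion on the leading minors).

```python
def rosenbrock(x1, x2):
    """Rosenbrock's curved valley, as given in the caption of Fig. 13.1."""
    return 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2


def rosenbrock_gradient(x1, x2):
    """Gradient components g1, g2 as stated in Exercise 13.1 (ii)."""
    g1 = 400.0 * x1 ** 3 - 400.0 * x1 * x2 + 2.0 * x1 - 2.0
    g2 = -200.0 * x1 ** 2 + 200.0 * x2
    return (g1, g2)


def rosenbrock_hessian(x1, x2):
    """Matrix G of second derivatives as stated in Exercise 13.1 (iii)."""
    g11 = 1200.0 * x1 ** 2 - 400.0 * x2 + 2.0
    g12 = -400.0 * x1
    return ((g11, g12), (g12, 200.0))


def is_positive_definite_2x2(G):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    (a, b), (c, d) = G
    return a > 0 and a * d - b * c > 0


# At the minimum the function and its gradient vanish and G is positive-definite.
print(rosenbrock(1.0, 1.0), rosenbrock_gradient(1.0, 1.0))
print(is_positive_definite_2x2(rosenbrock_hessian(1.0, 1.0)))
# Away from the valley floor, e.g. at (0, 1), G is not positive-definite.
print(is_positive_definite_2x2(rosenbrock_hessian(0.0, 1.0)))
```

The second check illustrates why a method relying on $G$ cannot take its positive-definiteness for granted far from the minimum.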
Exercise 13.1: An often used function for testing the efficiency of minimization procedures is "Rosenbrock's curved valley"

$$F(x_1,x_2) = 100(x_2-x_1^2)^2 + (1-x_1)^2.$$

This function defines a narrow parabolic valley as indicated by the contour diagram of Fig. 13.1. Show that

(i) the minimum of the function is given by $F(1,1) = 0$,

(ii) the components of the gradient vector $\mathbf{g}$ at an arbitrary point $(x_1,x_2)$ are given by

$$g_1 = 400x_1^3 - 400x_1x_2 + 2x_1 - 2, \qquad g_2 = -200x_1^2 + 200x_2,$$

(iii) the elements of the matrix $G$ are given by

$$G_{11} = 1200x_1^2 - 400x_2 + 2, \qquad G_{12} = G_{21} = -400x_1, \qquad G_{22} = 200.$$

13.2 STEP METHODS

The step methods presented below are more or less empirical and do not have any real theoretical basis. Nevertheless, for many minimization problems simple step methods perform equally well as the better grounded gradient methods.

13.2.1 Grid search and random search

The most elementary minimization procedure consists in mapping the function values in a grid over the entire parameter space and keeping the point with the lowest function value as the best point.

In a one-dimensional grid search one just calculates the function values $F(x)$ at points equally spaced $\Delta x$ apart. One of these points must then lie within $\tfrac{1}{2}\Delta x$ from the true minimum, but since the minimum need not be closest to the point with the smallest $F$-value it is for reasonably smooth functions assumed that this simple grid search in one variable will only localize the minimum to within a distance $\Delta x$.

In two variables a grid search locates the minimum within a rectangle $\Delta x_1\Delta x_2$, in three variables within a volume $\Delta x_1\Delta x_2\Delta x_3$, etc. With more variables the grid search obviously requires a rapidly increasing number of function evaluations. For example, to localize a minimum to within 1% of the range of one variable by this technique, 100 function evaluations are necessary, while with five variables the number of evaluations required is $10^{10}$. Clearly, therefore, a simple-minded grid search should only be used for a rather crude mapping over the parameter space when more than two or three parameters are involved. The best point found in the crude scan can then be used as a starting point for a more refined minimization method. Indeed, whenever complicated functions are involved an introductory grid mapping is recommended to find good starting values for a subsequent iteration method.

The mapping procedure runs into obvious difficulties if the parameter range is very large (infinite). In this case one can in practice start the mapping with a reasonable interval for $x$ within the allowed region, and later shift the range if the smallest value of the function turns out to be at the boundary of the chosen interval.

The simple grid search can be characterized as blind or unintelligent, since it does not take account of what could be learned about the function along the way. Assuming a reasonably smooth function the method certainly involves many redundant function evaluations in regions of the parameter space where the function values are not small. By performing the grid search in more stages, however, the method can be made more efficient.

A multi-stage grid search can be done in the following way: in the first stage a crude grid mapping is made all over the parameter space, confining the minimum to a restricted volume element. In the second stage a new grid search is performed within this volume, limiting the minimum to an even smaller region, and so on.

With many parameters, instead of a systematic grid search all over parameter space, good results are often obtained by a Monte Carlo search, choosing points $\mathbf{x}$ randomly in that region of parameter space where one expects the minimum to be. The Monte Carlo mapping is usually made with a Gaussian random number generator constructing a set of trial points $\mathbf{x}$ around some first-guess value $\mathbf{x}_p$ of specified width. The point giving the lowest function value is then retained and can be used as a start point for a more advanced minimization technique.

Exercise 13.2: A function of four variables defined within a finite space is to be minimized by a grid search. The minimum should be localized to within one per mille of the range in each variable. Show that in a simple grid search $10^{12}$ function evaluations are necessary, and that $3\cdot10^{4}$ evaluations are necessary in a three-stage grid search.
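The multi-stage grid search lends itself to a short sketch. The code below is one possible reading of the recipe, not a definitive implementation: each stage evaluates the function at the centres of $n$ cells per variable, and the next stage is confined to the cell around the best point. The cell-centre convention, the function names and the four-variable test function are all assumptions made for illustration.

```python
import itertools


def grid_search(f, bounds, n):
    """One stage: evaluate f at the centres of the n**d cells of the box `bounds`."""
    widths = [(hi - lo) / n for lo, hi in bounds]
    axes = [[lo + (k + 0.5) * w for k in range(n)]
            for (lo, _hi), w in zip(bounds, widths)]
    best_x, best_f, n_eval = None, float("inf"), 0
    for x in itertools.product(*axes):
        fx = f(x)
        n_eval += 1
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f, widths, n_eval


def multistage_grid_search(f, bounds, n, stages):
    """Repeat the grid stage, each time shrinking the box to the best cell."""
    total = 0
    for _ in range(stages):
        best_x, best_f, widths, n_eval = grid_search(f, bounds, n)
        total += n_eval
        bounds = [(x - w / 2, x + w / 2) for x, w in zip(best_x, widths)]
    return best_x, best_f, total


# Hypothetical smooth test function of four variables with a known minimum.
TARGET = (0.3141, 0.5926, 0.5358, 0.9793)


def bowl(x):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


best_x, best_f, total = multistage_grid_search(bowl, [(0.0, 1.0)] * 4, n=10, stages=3)
print(total)    # three stages of 10**4 evaluations each
print(best_x)   # each coordinate lies within one per mille of the range of TARGET
```

With ten points per variable and three stages the evaluation count is the $3\cdot10^{4}$ of Exercise 13.2. Confining each stage to a single cell presumes that the best grid point lies in the cell containing the minimum, which holds for this smooth test function but, as noted in the text, is not guaranteed in general.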



13.2.2 Minimum along a line; the success-failure method

In the one-parameter case and in variation methods varying only one parameter at a time the problem is to search for a minimum in one direction. The following prescription for finding the minimum along a line, the success-failure method, has been given by H.H. Rosenbrock.

Let $F(x)$ be a function of the variable (parameter) $x$ and suppose that a first guess of the minimum is $x = x_0$. For a given step of length $s$ we evaluate $F(x+s)$ and proceed according to the new function value obtained:

(i) If $F(x+s) \le F(x)$ call the trial a success, replace $x$ by $x+s$ and $s$ by $\alpha s$, where the expansion factor $\alpha$ satisfies $\alpha > 1$.

(ii) If $F(x+s) > F(x)$ call the trial a failure, replace $s$ by $\beta s$, where the contraction factor $\beta$ satisfies $-1 < \beta < 0$.

The procedure is repeated until the difference between the function values of the two last successes is less than a specified amount. Good empirical values for the expansion and contraction factors are $\alpha = 3.0$ and $\beta = -0.4$.

It should be noted that when a success is immediately followed by a failure the middle one of the last three points corresponds to a smaller function value than the outer two, and hence a local minimum must lie somewhere between the outer points. This suggests the formulation of an alternative convergence criterion: If the distance $\Delta x$ between the outer points is smaller than a pre-assigned value the procedure should be stopped; the best point then corresponds to the last success and the true minimum will be less than a distance $\Delta x$ from this point.

In this situation when a success is immediately followed by a failure one can, in fact, do even better. Since experience indicates that functions often have an approximate quadratic behaviour near a minimum we predict the minimum of $F$ to coincide with the minimum of the parabola passing through the last three points $x_1$, $x_2$ (success), $x_3$ (failure). Denoting the corresponding function values by $F_1$, $F_2$, $F_3$ the interpolated minimum is at $x_0$, given by

$$x_0 = \frac{1}{2}\,\frac{(x_2^2-x_3^2)F_1 + (x_3^2-x_1^2)F_2 + (x_1^2-x_2^2)F_3}{(x_2-x_3)F_1 + (x_3-x_1)F_2 + (x_1-x_2)F_3}. \tag{13.2}$$

The success-failure method combined with a parabolic interpolation appears in practice to be very efficient and is therefore well suited to find the minimum of a function of only one variable.

Exercise 13.3: Prove eq.(13.2).

Exercise 13.4: Show that an alternative form for the interpolated value $x_0$ of eq.(13.2) is

$$x_0 = \frac{1}{2}\,\frac{(F_1-F_2)\,\alpha\,(2x_1+2s+\alpha s) + (F_3-F_2)(2x_1+s)}{(F_1-F_2)\,\alpha + (F_3-F_2)}.$$

Exercise 13.5: The function $F(x) = x^2-x-1$ with the true minimum $F = -1.25$ for $x_0 = 0.5$ is to be minimized by the success-failure method using the start point $x_1 = 1$, an initial step size $s = 1$, and the constants $\alpha = 3.0$, $\beta = -0.4$. Show that after four function evaluations (three steps) the minimum is bracketed between $x = -0.6$ and $x = +2$ ($\Delta x = 2.6$), and verify that the interpolated minimum occurs at the correct point $x_0 = 0.5$. Ignoring the interpolation, show that after seven function evaluations (six steps) the minimum is bracketed between $x = -0.168$ and $x = 1.08$ ($\Delta x = 1.248$).

Exercise 13.6: The results of the first few steps in minimizing some function $F(x)$ according to the success-failure method are given below ($\alpha = 2.0$, $\beta = -0.3$):

Step i    0       1       2       3
x        -1.5    -1       0       2
F(x)      4.75    2       1      17
s         0.5     1.0     2.0    -0.6

Show that the best estimate of the minimum is $x_0 = -1/3$.

13.2.3 The coordinate variation method

An intuitively simple method for finding a minimum of a function of several variables, $F(\mathbf{x}) = F(x_1,x_2,\ldots,x_n)$, is to seek a minimum along each coordinate axis in turn. This is the idea behind the one-by-one variation method or the single-parameter-, or the coordinate variation method, which works as follows:

Starting at a point $P_0$ one first finds a minimum $P_1$ parallel to the $x_1$ axis, for example by the success-failure method. Next, from this minimum one finds a minimum $P_2$ along the $x_2$ axis, and so on for all coordinate axes. Of

course, when the minimum along $x_n$ has been obtained after one full cycle or stage, the function is no longer at a minimum in the other coordinates. One therefore starts searching along $x_1$ again, and completes a new stage. The process is continued until some specified convergence criterion is fulfilled. A reasonable stopping criterion could be that the difference between the best function values in two subsequent full cycles should be less than some prescribed amount.

The one-by-one variation method is illustrated in Figs. 13.2(a) and (b) for two two-dimensional problems involving, respectively, weakly and strongly correlated variables. Curves have been drawn for constant values of $F(x_1,x_2)$. Starting at $P_0$ the successive minima along the lines parallel to the $x_1$, $x_2$ axes are found at $P_1, P_2, \ldots$

Fig. 13.2. The one-by-one variation method for finding the minimum of a function of two variables; (a) weakly correlated variables, (b) strongly correlated variables.

Although the one-by-one variation method usually does converge, it may require a large number of steps before the convergence is reached. In cases with strongly correlated variables the method is unacceptably slow, since the approach towards the minimum goes by an inefficient zig-zag curve crossing the sides of the "valley" to which the minimum belongs; see Fig. 13.2(b).

13.2.4 The Rosenbrock method

Rosenbrock's algorithm is an extension and improvement of the one-by-one variation method and has been found very useful in practice.

The principle of this method can be illustrated for the case of two variables by referring to Fig. 13.3. Starting at the point $P_0$ the minima $P_1$ and $P_2$ are first found by searching along the coordinate axes $x_1$ and $x_2$, respectively. The idea is now to search next along a "best" line, defined by the direction from $P_0$ to $P_2$. This line corresponds to the direction of the overall improvement in the first stage and is accordingly expected to be a good direction for further search. The coordinate system is therefore rotated so that one axis, $x_1$ say, points along the "best" line. When a minimum $P_3$ has subsequently been found along the "best" line a new search is made in a perpendicular direction, giving a minimum $P_4$. The process is repeated along a new "best" direction defined by the line joining $P_2$ and $P_4$, and so on until the convergence criteria are fulfilled.

Fig. 13.3. Illustration of the Rosenbrock method for a case with two variables.

In $n$ dimensions the Rosenbrock recipe is as follows: Let $\boldsymbol{\xi}_i$, $i=1,2,\ldots,n$ denote the set of $n$ orthogonal directions. We find the minimum of $F(\mathbf{x})$ along each of these directions in turn, starting from the point $P_0$ and after completing the cycle reaching the point $P_1$. Let $s_i$ be the size of the step taken to reach the minimum along $\boldsymbol{\xi}_i$. The $n$ steps of the first stage can then be characterized

by the $n$ linearly independent vectors

$$\mathbf{a}_i = \sum_{j=i}^{n} s_j\boldsymbol{\xi}_j, \qquad i = 1,2,\ldots,n. \tag{13.3}$$

It is seen that $\mathbf{a}_1$ is the sum of all steps in the stage, i.e. the vector connecting $P_0$ and $P_1$, $\mathbf{a}_2$ is the sum of all steps except the first, etc. Obviously the result of all $n$ steps could have been obtained with one single step in the direction $\boldsymbol{\xi}_1 = \mathbf{a}_1/|\mathbf{a}_1|$, the "best" direction. According to the Gram-Schmidt orthogonalization method we can construct a whole set of $n$ orthogonal directions by defining the other vectors $\boldsymbol{\xi}_2,\ldots,\boldsymbol{\xi}_n$ as

$$\boldsymbol{\xi}_i = \mathbf{b}_i/|\mathbf{b}_i|, \qquad i = 2,3,\ldots,n, \tag{13.4}$$

where

$$\mathbf{b}_i = \mathbf{a}_i - \sum_{j=1}^{i-1} (\mathbf{a}_i\cdot\boldsymbol{\xi}_j)\,\boldsymbol{\xi}_j, \qquad i = 2,3,\ldots,n. \tag{13.5}$$

With this set of orthogonal directions the procedure is repeated for the point $P_1$, giving $P_2$, and so on.

The Rosenbrock method usually works well if the number of variables is not too large, but when the number increases its efficiency goes down.

Exercise 13.7: Verify that two different unit vectors $\boldsymbol{\xi}_i$, $\boldsymbol{\xi}_j$ defined by eqs.(13.4), (13.5) are orthogonal.

13.2.5 The simplex method

A frequently used step method for minimizing a function of many variables is the simplex method invented by J.A. Nelder and R. Mead. The method is based on the evaluation of the function value $F(x_1,x_2,\ldots,x_n)$ at $n+1$ points forming a general simplex*), followed by the replacement of the vertex with the highest function value by a new and better point, if possible. The new point is obtained by a specific algorithm, and leads to a new simplex better adapted to the function.

*) A simplex is defined as the simplest n-dimensional geometrical figure specified by giving its $n+1$ vertices. It is a triangle for $n=2$, a tetrahedron for $n=3$, etc.

Let us assume that the points $P_1,P_2,\ldots,P_{n+1}$ defining the current simplex are given. For short we write $F(P_i)$ for the function value at $P_i$.

We start the procedure by evaluating all $F(P_i)$ and determine the "highest" point $P_h$ and the "lowest" point $P_l$ in the simplex, where $P_h$ and $P_l$ are such that

$$F(P_h) = \max\{F(P_1),\ldots,F(P_{n+1})\}, \qquad F(P_l) = \min\{F(P_1),\ldots,F(P_{n+1})\}. \tag{13.6}$$

Next we define the centroid ("centre-of-mass") $\bar{P}$ of all points in the simplex except $P_h$,

$$\bar{P} = \frac{1}{n}\sum_{i \ne h} P_i. \tag{13.7}$$

The recipe is now to replace $P_h$ by a new point with lower functional value lying on the line through $P_h$ and $\bar{P}$. Three operations can be used, reflection, contraction and expansion.

The first trial is done by reflecting $P_h$ about $\bar{P}$, defining a new point $P^*$ by the relation

$$P^* = (1+\alpha)\bar{P} - \alpha P_h, \tag{13.8}$$

where the reflection coefficient $\alpha$ is a positive constant. Three situations are possible:

(i) If $F(P^*) < F(P_l)$ the reflection has produced a new minimum. To see if we can do even better we make an expansion along the line and go beyond $P^*$ to a new point $P^{**}$, defined by

$$P^{**} = \gamma P^* + (1-\gamma)\bar{P}, \tag{13.9}$$

where the expansion coefficient $\gamma$ is greater than unity. If $F(P^{**}) < F(P_l)$ we replace $P_h$ by $P^{**}$ and restart the process. If $F(P^{**}) \ge F(P_l)$ the expansion has failed and we replace $P_h$ by $P^*$ before restarting.

(ii) If $F(P_l) \le F(P^*) < F(P_h)$ the reflection has given a better point than the previous highest value. $P_h$ is replaced by $P^*$ and the process is restarted.
(iii) If $F(P^*) \ge F(P_h)$ the reflection has failed and $P^*$ is unacceptable. With a contraction along the line we try a new point $P^{**}$ between $P_h$ and $\bar{P}$, such that

$$P^{**} = \beta P_h + (1-\beta)\bar{P}, \tag{13.10}$$

where the contraction coefficient $\beta$ lies between 0 and 1. If $F(P^{**}) < F(P_h)$, $P_h$ is replaced by $P^{**}$ and the process restarted. If $F(P^{**}) > \min\{F(P_h),F(P^*)\}$ (the contracted point is worse than the better of $P_h$ and $P^*$) the whole attempt has failed. The simplex is then contracted towards the lowest point by replacing all $P_i$'s by $\tfrac{1}{2}(P_i+P_l)$ before restarting the procedure.

How the simplex method works is illustrated for the two-dimensional case by Fig. 13.4, where contour lines have been drawn for constant values of $F(x_1,x_2)$. A simplex (triangle in this case) with vertices $P_1,P_2,P_3$ is shown together with the calculated points $\bar{P}$, $P^*$, $P^{**}$ in one iteration. The point $P_h = P_1$ will in this iteration be replaced by $P^*$ and the next simplex will have the vertices $P_1 = P^*, P_2, P_3$.

The procedure with reflections, expansions and contractions runs into difficulties if the new point, $P^*$ or $P^{**}$, falls too close to $\bar{P}$, since the simplex then collapses into a surface of smaller dimension from which it can never recover.

Recommended values for the coefficients $\alpha$, $\beta$ and $\gamma$ are, respectively, 1, 0.5 and 2. Initial values of the simplex may be chosen at random or evaluated from a given point according to a prescribed algorithm. For this method a convenient convergence test is to calculate the difference $F(P_h) - F(P_l)$ for each iteration and to stop when this difference falls below a preset value. Finally, after convergence, $\bar{P}$ and $F(\bar{P})$ are calculated, and the minimum point is taken as $P_l$ or $\bar{P}$, whichever produces the lowest $F$-value.

The simplex method does not directly produce the covariance matrix at the estimated minimum. It is, however, a rather effective method: it requires few function evaluations, one or two per iteration; each search is made in an "intelligent" direction pointing from the highest function value to the average of the better values of the simplex; furthermore, as the method is designed to take as large steps as possible it is rather insensitive to shallow local minima or fine structures in the function, implying a generally good adaption to the landscape and a quick contraction to the overall minimum.

Exercise 13.8: The function $F(x_1,x_2) = x_1^2 + x_2^2$ is given together with an initial simplex defined by $P_1 = (1,1)$, $P_2 = (1,-2)$, $P_3 = (-1,0)$. Show, using the simplex minimization procedure with coefficients $\alpha = 1$, $\beta = 0.5$, $\gamma = 2$, that $F(P_h) - F(P_l) < 1$ after three iterations, and that the estimated minimum at this stage is at (3/8, 5/16).

Fig. 13.4. Illustration of the simplex method for a case with two variables.

13.3 GRADIENT METHODS

The gradient methods utilize the derivatives of $F(\mathbf{x})$ in the minimization process to predict new trial points relatively far away from the last point. They have a better theoretical foundation than the simple step methods and are also more complicated, but do not always produce better results.

In this section we will present two gradient methods which both use the first derivatives of $F(\mathbf{x})$ and which have been widely used in practice, the classical steepest descent method invented by Cauchy, and Davidon's variance algorithm. A third gradient method, Newton's method, which uses also the second derivatives of $F(\mathbf{x})$ to calculate the step sizes, was described already in Sect.10.3.1.

13.3.1 Numerical calculation of derivatives

In many cases the analytical expressions for the derivatives of the function are very complicated or can not be found at all. General programs minimizing by the gradient methods should therefore be supplied with algorithms for determining the derivatives of a function from finite differences.

The gradient $\mathbf{g}(\mathbf{x})$ of $F(\mathbf{x})$ can be estimated numerically at each point $\mathbf{x}$ of parameter space by performing repeated calculations of $F(\mathbf{x})$ varying one parameter at a time. For the $i$-th component of $\mathbf{g}$, with a positive increment $\Delta x_i$, we may take

$$g_i = \frac{\partial F}{\partial x_i} \approx \frac{F(x_1,\ldots,x_i+\Delta x_i,\ldots,x_n) - F(x_1,\ldots,x_i,\ldots,x_n)}{\Delta x_i}, \tag{13.11}$$

or, alternatively,

$$g_i = \frac{\partial F}{\partial x_i} \approx \frac{F(x_1,\ldots,x_i,\ldots,x_n) - F(x_1,\ldots,x_i-\Delta x_i,\ldots,x_n)}{\Delta x_i}. \tag{13.12}$$

Either alternative requires $n+1$ function evaluations to give $\mathbf{g}$. In choosing the size of the increments one should keep in mind the finite precision of the computer and avoid too small $\Delta x_i$. A third and better alternative for estimating $\mathbf{g}$ is to take the average of the expressions above,

$$g_i \approx \frac{F(x_1,\ldots,x_i+\Delta x_i,\ldots,x_n) - F(x_1,\ldots,x_i-\Delta x_i,\ldots,x_n)}{2\Delta x_i}, \tag{13.13}$$

which, however, implies $2n$ evaluations of the function to obtain $\mathbf{g}$. For functions having a nearly quadratic dependence on the parameters the last formula gives approximately correct derivatives, independent of the size of the increment.

The numerical evaluation of the second derivatives $G_{ij} = \partial^2F/\partial x_i\partial x_j$ is even more time-consuming. However, the diagonal elements of the matrix $G$ come out as by-products when estimating the gradient by the symmetric method, i.e. eq.(13.13), since one may write

$$G_{ii} = \frac{\partial^2F}{\partial x_i^2} \approx \frac{F(x_1,\ldots,x_i+\Delta x_i,\ldots,x_n) + F(x_1,\ldots,x_i-\Delta x_i,\ldots,x_n) - 2F(\mathbf{x})}{(\Delta x_i)^2}. \tag{13.14}$$

For the off-diagonal elements of $G$ we get, using symmetrical steps,

$$G_{ij} = \frac{\partial^2F}{\partial x_i\partial x_j} \approx \frac{F(x_i+\Delta x_i,x_j+\Delta x_j) + F(x_i-\Delta x_i,x_j-\Delta x_j) - F(x_i+\Delta x_i,x_j-\Delta x_j) - F(x_i-\Delta x_i,x_j+\Delta x_j)}{4\Delta x_i\Delta x_j}, \tag{13.15}$$

where we have simplified notation and only specified the dependence on the $x_i$, $x_j$ variables. Equation (13.15) shows that four new function evaluations are required for each off-diagonal element. Since there are $n(n-1)/2$ independent off-diagonal elements in a symmetric $n \times n$ matrix we see that $2n(n-1)$ extra function evaluations have to be done to estimate $G$.

If the second derivatives can be assumed approximately constant over small regions near $\mathbf{x}$, symmetrical steps will not be necessary to find $G$. The off-diagonal elements may then be calculated from the formula

$$G_{ij} = \frac{\partial^2F}{\partial x_i\partial x_j} \approx \frac{F(x_i+\Delta x_i,x_j+\Delta x_j) + F(x_i,x_j) - F(x_i+\Delta x_i,x_j) - F(x_i,x_j+\Delta x_j)}{\Delta x_i\Delta x_j}, \tag{13.16}$$

rather than from eq.(13.15). With eq.(13.16) only one new function evaluation is needed per off-diagonal element of $G$ in addition to those required for the first derivatives.

Exercise 13.9: Show that the error in the $i$-th component of the gradient vector $\mathbf{g}$ as calculated by the unsymmetrical expressions eqs.(13.11), (13.12) to the lowest order in the Taylor expansion is $\tfrac{1}{2}\Delta x_i(\partial^2F/\partial x_i^2)$.

Exercise 13.10: Consider the simple function $F(x) = x^2$ with slope 2 in the point $x = 1$. Calculate numerically the gradient at $x = 1$ by the three expressions given in the text, eqs.(13.11), (13.12), (13.13) in turn, using (i) $\Delta x = 1$, (ii) $\Delta x = 0.1$.

Exercise 13.11: Verify the formula eq.(13.14) for the diagonal elements of the matrix $G$ by applying eq.(13.13) twice.
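The finite-difference formulas of this section translate directly into code. The following is a minimal sketch, assuming the function takes a sequence of parameter values; the names `gradient`, `hessian` and the `scheme` argument are illustrative choices. The gradient uses eqs.(13.11) to (13.13), the diagonal second derivatives the symmetric three-point form of eq.(13.14), and the off-diagonal ones the symmetric four-point form of eq.(13.15).

```python
def gradient(f, x, dx, scheme="central"):
    """Finite-difference gradient of f at x with increments dx."""
    g = []
    for i in range(len(x)):
        up = list(x)
        up[i] += dx[i]
        dn = list(x)
        dn[i] -= dx[i]
        if scheme == "forward":        # eq. (13.11)
            g.append((f(up) - f(x)) / dx[i])
        elif scheme == "backward":     # eq. (13.12)
            g.append((f(x) - f(dn)) / dx[i])
        else:                          # eq. (13.13), symmetric steps
            g.append((f(up) - f(dn)) / (2 * dx[i]))
    return g


def hessian(f, x, dx):
    """Matrix of second derivatives from symmetric finite differences."""
    n = len(x)

    def shifted(steps):
        """Evaluate f with x[k] moved by sign*dx[k] for each (k, sign) in steps."""
        y = list(x)
        for k, sign in steps.items():
            y[k] += sign * dx[k]
        return f(y)

    G = [[0.0] * n for _ in range(n)]
    f0 = f(x)
    for i in range(n):
        # Diagonal element, eq. (13.14).
        G[i][i] = (shifted({i: 1}) + shifted({i: -1}) - 2 * f0) / dx[i] ** 2
        for j in range(i + 1, n):
            # Off-diagonal element, eq. (13.15): four new evaluations.
            G[i][j] = G[j][i] = (
                shifted({i: 1, j: 1}) + shifted({i: -1, j: -1})
                - shifted({i: 1, j: -1}) - shifted({i: -1, j: 1})
            ) / (4 * dx[i] * dx[j])
    return G


# Exercise 13.10 with increment 1: forward, backward and symmetric estimates.
square = lambda x: x[0] ** 2
print(gradient(square, [1.0], [1.0], "forward"),
      gradient(square, [1.0], [1.0], "backward"),
      gradient(square, [1.0], [1.0], "central"))    # [3.0] [1.0] [2.0]

# Exercise 13.13: for a quadratic function the second derivatives come out exact.
quad = lambda x: x[0] ** 2 + x[0] * x[1] + 2 * x[1] ** 2
print(hessian(quad, [1.0, 1.0], [1.0, 1.0]))        # [[2.0, 1.0], [1.0, 4.0]]
```

For the parabola the unsymmetrical estimates are off by exactly the increment, as expected from Exercise 13.9, while the symmetric estimate is exact whatever the step size.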
Exercise 13.12: Verify the expressions given for the off-diagonal elements of $G$, eqs.(13.15), (13.16).

Exercise 13.13: Consider the quadratic function $F(x_1,x_2) = x_1^2 + x_1x_2 + 2x_2^2$ for which the matrix of second derivatives is constant and equal to

$$G = \begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix}.$$

Show that all numerical expressions for the second derivatives calculated for the point (1,1) with step sizes $\Delta x_1 = \Delta x_2 = 1$ in this case produce exact results.

Exercise 13.14: Given the function $F(x_1,x_2)$ = X~+X~X~+XIX:+X~+ Calculate the first and second derivatives in the point (1,1) analytically and numerically, using (i) $\Delta x_1 = \Delta x_2 = 1$, (ii) $\Delta x_1 = \Delta x_2 = 0.1$.

13.3.2 Method of steepest descent

The steepest descent method can be thought of as a natural improvement of the one-by-one variation method in situations where the derivatives of $F(\mathbf{x})$ are known.

From the starting point $P_0$ we here seek a minimum of $F(\mathbf{x})$ along the direction in parameter space where the function decreases most rapidly. When the minimum $P_1$ in this direction has been found the process is repeated, searching a second and better minimum in a direction orthogonal to the first, giving $P_2$, and so on until a satisfactory convergence has been obtained.

The direction $\boldsymbol{\xi}$ of the steepest descent in the point $P_0 = \mathbf{x}_0$ has components proportional to $-\partial F/\partial x_i$, where the derivatives are evaluated at $P_0$. If the search along the direction $\boldsymbol{\xi}$ has led to the minimum $P_1$ the new direction $\boldsymbol{\eta}$ of steepest descent in the point $P_1$ is perpendicular to the previous direction $\boldsymbol{\xi}$. To see this, introduce a variable $s$ on a line along the direction $\boldsymbol{\xi}$ through $P_0$. All points on this line then satisfy $\mathbf{x} = \mathbf{x}_0 + s\boldsymbol{\xi}$. The minimum point $P_1$ on the line is defined by the requirement $\partial F/\partial s = 0$. With the derivatives evaluated at $P_1$ we have

$$\frac{\partial F}{\partial s} = \sum_{i=1}^{n} \frac{\partial F}{\partial x_i}\,\xi_i = 0,$$

which implies that two consecutive directions of search will always be orthogonal.

The steepest descent method is illustrated for a case with two variables in Fig. 13.5. In this situation the method is equivalent to the one-by-one variation method except for a rotation of the coordinate axes; in the general case with more than two dimensions the two methods are not equivalent.

Fig. 13.5. Illustration of the method of steepest descent for a case with two variables.

The method of steepest descent ensures a better convergence than the simple one-by-one variation method. Still it may be rather slow when the variables have a complicated interdependency in $F$ and the choice of starting point $P_0$ has not been fortunate.

Exercise 13.15: For the "Rosenbrock curved valley" of Exercise 13.1, show how different choices of the starting point $P_0$ for the steepest descent method will lead to minimizations of different efficiency.

13.3.3 The Davidon variance algorithm

The essential feature of Davidon's variance algorithm is that the function is made to approach its minimum by letting the covariance matrix $V(\mathbf{x}) = G^{-1}$ undergo successive approximations. This means that a simultaneous convergence is obtained towards the function minimum and the true covariance matrix. In this


process the variances are never calculated directly, as must be done with other methods, and thereby one saves the evaluation of second derivatives and the inversion of the resulting matrix. Instead successive approximations are made for $V(\mathbf{x})$ using only the function $F(\mathbf{x})$ and its gradient $\mathbf{g}$.

In the following we merely give the prescriptions of the method without any comments on its theoretical foundation. For complete proofs the reader is referred to the original paper by W.C. Davidon.

Before starting the iterations the method requires the knowledge of an approximate minimum point $\mathbf{x}_0$ with corresponding values $F_0$, $\mathbf{g}_0$ known, and a first estimate $V_0$ of the covariance matrix. For the latter it will suffice to have the diagonal elements only, and if these are completely unknown one may simply start with $V_0$ as the unit matrix; in general, the better the estimate of $V_0$ the faster the convergence. Initially given is also a set of step constants $\alpha$, $\beta$, which from computational experience may be reasonably taken as $\alpha = 10^{-1}$, $\beta = 10$, and a convergence constant, $\varepsilon = 0.1$, say.

Suppose that an iteration has yielded the values $\mathbf{x}$, $F$, $\mathbf{g}$ and $V$ and that a further iteration is needed to obtain a new set, $\mathbf{x}^*$, $F^*$, $\mathbf{g}^*$, $V^*$. The algorithm then consists of the following items:

(i) Define $\mathbf{x}^* = \mathbf{x} - V\mathbf{g}$, and compute $F^* = F(\mathbf{x}^*)$ and $\mathbf{g}^* = \mathbf{g}(\mathbf{x}^*)$.

(ii) Define a residual vector $\mathbf{r} = V\mathbf{g}^*$ and a number $\rho = \mathbf{g}^*\cdot\mathbf{r}$ ($\rho$ will vanish if the exact minimum is found, because then $\mathbf{g}^* = 0$; $\rho$ is a measure of the perpendicular distance to the minimum).
If $\rho < \varepsilon$, stop. The final estimate for the minimum is then $\mathbf{x}_0 = \mathbf{x}^*$.
If $\rho \ge \varepsilon$, proceed to (iii).

(iii) Define $\gamma = -\mathbf{g}\cdot\mathbf{r}/\rho$.
If $-\frac{\alpha}{1+\alpha} \le \gamma \le \frac{\alpha}{1-\alpha}$, define $\lambda = \alpha$.
If $-\frac{\beta}{\beta+1} \le \gamma < -\frac{\alpha}{1+\alpha}$, define $\lambda = -\frac{\gamma}{\gamma+1}$.
If $-\frac{\beta}{\beta-1} \le \gamma < -\frac{\beta}{\beta+1}$, define $\lambda = \beta$.
If none of these three, define $\lambda = \frac{\gamma}{\gamma+1}$.
Define $V^*_{ij} = V_{ij} + (\lambda - 1)\,r_i r_j/\rho$.

(iv) If $F^* < F$, define $\mathbf{x} = \mathbf{x}^*$, $F = F^*$, $\mathbf{g} = \mathbf{g}^*$, $V = V^*$.
If $F^* \ge F$, define $V = V^*$.
Proceed to (i) for a new iteration.

It should be stressed that this minimization algorithm gives an exact matrix $V$ if $F(\mathbf{x})$ is a quadratic function.

Experience shows that the method leads to a fast convergence for various types of problems. In fact it can be proven that when $F(\mathbf{x})$ is of quadratic form in the $n$ parameters the true minimum of the function and the exact covariance matrix are always found within $n$ iterations; see Exercise 13.16.

Exercise 13.16: The function $F(x) = x^2$ is to be minimized by the Davidon variance algorithm from the starting value (approximate minimum) $x_0 = 2$, $V_0 = 1$, with the constants $\alpha$, $\beta = 10$ and $\varepsilon = 0.1$ as given in the text. Perform the calculations of the steps (i) - (iv) in the text and verify that the method finds the true function minimum and exact variance after only one full stage. Make a sketch of the function and the quantities calculated.

13.4 MULTIPLE MINIMA

One of the most basic problems in the field of minimization is that of deciding whether the correct minimum has been found. It has in the previous sections been more or less implicitly assumed that the different iterative operations will eventually lead to just one local minimum of the function, and that this minimum is identical to the desired point. It is, however, not at all evident that the procedure will end up with the global (lowest) minimum of the function, nor is there any guarantee that the minimum found is the nearest to the starting point for the minimization.

In minimizing a function with several minima the task usually belongs to one of the following three categories in descending order of difficulty:

(i) All minima are of interest and should be found. An obvious although rather primitive way to look for all minima is to make a complete mapping of $F(\mathbf{x})$ over the entire parameter space, but this may imply a large number of function evaluations and be prohibited for economical reasons. Another approach is to perform the minimum search from several starting values. This may, however, lead to arbitrary results unless one has some prior knowledge as to the location of the possible minima. For the general case there seems to be no exhaustive and practical guide on how to find all minima of a complicated function of many variables.
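The second approach mentioned under (i), repeating the minimum search from several starting values, can be sketched as follows. This is only an illustration under stated assumptions: the crude one-dimensional step search `local_minimize`, the merging tolerance `merge_tol` and the test function `F` are all hypothetical and not taken from the text.

```python
def local_minimize(f, x, step=0.1, tol=1e-6):
    """Crude 1-D step search: walk downhill, halve the step when stuck."""
    fx = f(x)
    while step > tol:
        for trial in (x + step, x - step):
            ft = f(trial)
            if ft < fx:          # accept the first downhill move
                x, fx = trial, ft
                break
        else:                    # neither neighbour is lower: refine the step
            step *= 0.5
    return x, fx


def multistart(f, starts, merge_tol=1e-3):
    """Run the local search from every start and keep the distinct minima."""
    minima = []
    for x0 in starts:
        x, fx = local_minimize(f, x0)
        if all(abs(x - m[0]) > merge_tol for m in minima):
            minima.append((x, fx))
    return sorted(minima, key=lambda m: m[1])   # lowest function value first


# Hypothetical test function with two minima of different depth.
def F(x):
    return (x * x - 1.0) ** 2 + 0.3 * x


minima = multistart(F, starts=[-2.0, -0.5, 0.5, 2.0])
for x, fx in minima:
    print(f"minimum at x = {x:.4f}, F = {fx:.4f}")
```

Which minima are found depends entirely on the chosen starting values, which is the arbitrariness the text warns about: a minimum whose region of attraction contains no starting point is never seen.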
(ii) Only the global minimum is of interest. Many authors have described procedures to search for the global minimum of a function. It seems, however, as if no safe and simple method has been invented yet. For details on this advanced subject the reader is referred to more specialized literature, for example I.M. Gelfand and M.L. Tsetlin, and A.A. Goldstein and J.F. Price.

(iii) Only one minimum corresponds to a physical solution and is of interest. In fact, in many physics problems approximate values of some of the parameters may often be known in advance. An appealing approach is then first to fix these parameters at their assumed values and minimize the function with respect to the remaining parameters. The parameter set corresponding to the minimum for this simplified problem is next used as a starting point for a complete minimization involving all the parameters.

13.5 EVALUATION OF ERRORS

Implicit in the previous considerations is that our primary interest lies in the particular set of parameter values which minimizes the function $F(\mathbf{x})$ rather than the function minimum itself. It remains now to discuss how one can evaluate the uncertainties (errors) associated with these parameter values.

With the Davidon variance algorithm the error determination becomes especially simple, since the covariance matrix is obtained directly in the minimization process as part of the algorithm itself. The errors, derived from the diagonal elements of this matrix, are even exact for functions of quadratic form.

For the other minimization procedures the parameter errors can be found for specially constructed $F$'s, using the ideas of Chapters 9 and 10. For instance, for a Least-Squares minimization with one parameter a one-standard deviation confidence interval is derived from the points where the sum of squares is 1.0 above its minimum value $F_0$. Similarly, in a Maximum-Likelihood estimation of a single parameter the error corresponding to a one-standard deviation confidence interval is determined from the points where the negative log likelihood is 0.5 above minimum. Thus for both methods the computational problem involved in the determination of the errors is to find the parameter values that correspond to the function value

$$F = F_0 + a,$$

where the constant $a$ is chosen so as to give the desired interval. With the ML method it is important to remember that the intervals obtained in this way have the meaning of likelihood intervals, as discussed for the one-, two- and multi-parameter cases in Sect. 9.7 (see in particular Sects. 9.7.1 and 9.7.6). The numerical problem itself is trivial when the function, as is often the case, is approximately quadratic around its minimum. If the minimum region is not sufficiently quadratic the determination of the parameter values that make $F = F_0 + a$ is not simple.

A special technique for the handling of ill-behaved functions, used for example in the MINUIT minimization program, can be sketched as follows:

Suppose that a crude estimate of the covariance matrix exists and that the minimum value $F_0$ of the non-parabolic function $F(\mathbf{x})$ has been located at $\mathbf{x}^0$. We want to find the error $\sigma(x_i)$ in the parameter value $x_i^0$ corresponding to an increase in $F$ by the amount $a$ from the minimum when only the parameter $x_i$ is varied and all the others are kept fixed at their values at the minimum, $x_1^0, \ldots, x_{i-1}^0, x_{i+1}^0, \ldots, x_n^0$.

Let us refer to Fig. 13.6, where the minimum point of the function is called A. If the crudely estimated covariance matrix implies an error $\sigma_0$ in $x_i^0$, the parabola $F(x_i) = F_0 + a\,\sigma_0^{-2}(x_i - x_i^0)^2$ intersects the straight line $F = F_0 + a$ at the point B. The function value for $x_i = $ B corresponds to a lower point B'. A second parabola with minimum at A, passing through B', intersects the straight line at the point C, with a corresponding function value C', above the line. A third parabola through the points A, B', C' gives an intersection with the straight line at D and a new point D' on the curve, which also is below the line. A new parabola through B', C', D' gives an intersection with the line at E, and this time the function value E' coincides with E within some given tolerance. Hence the required point has been found and $\sigma(x_i)$ can be determined.

This method can handle rather pathological functions, but will then be time-consuming.

13.6 MINIMIZATION WITH CONSTRAINTS

The parameters of the function to be minimized are frequently restricted to a limited region in parameter space through constraint equations or inequalities due to physics requirements. In our discussion of minimization so far the parameters have been assumed to be without restrictions. We will now see how different types of constrained problems may be handled.
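Returning briefly to the error determination of Sect. 13.5, the search for the points where $F = F_0 + a$ can be illustrated by a small sketch. It is a simplified variant, not the MINUIT procedure itself: every parabola is drawn with its minimum at A through the latest point on the curve (the first two steps of the description in the text), and the three-point parabolas are not reproduced. The function name `crossing`, the tolerance, and the example (a negative log likelihood for an exponential lifetime with hypothetical values `n = 20`, `t_mean = 1.0`) are assumptions made for illustration.

```python
import math


def crossing(F, x0, F0, a, sigma0, tol=1e-6, max_iter=50):
    """Find x on one side of the minimum x0 where F(x) = F0 + a.

    sigma0 is a crude, signed first guess for the distance from x0
    (negative for the lower side).  Each step draws a parabola with its
    minimum at (x0, F0) through the latest point on the curve and moves
    to where that parabola reaches F0 + a.
    """
    d = sigma0
    for _ in range(max_iter):
        rise = F(x0 + d) - F0
        if rise <= 0.0:
            raise ValueError("F does not rise above F0; x0 is not a minimum")
        if abs(rise - a) < tol * a:
            return x0 + d
        d *= math.sqrt(a / rise)   # rescale the parabola to hit F0 + a
    raise RuntimeError("no convergence; F may be too far from parabolic")


# Hypothetical example: n exponential decay times with sample mean t_mean.
n, t_mean = 20, 1.0


def nll(tau):
    """Negative log likelihood of the mean lifetime tau."""
    return n * (math.log(tau) + t_mean / tau)


tau0 = t_mean                      # ML estimate of the mean lifetime
F0 = nll(tau0)
sigma0 = tau0 / math.sqrt(n)       # crude parabolic error estimate
upper = crossing(nll, tau0, F0, 0.5, +sigma0)
lower = crossing(nll, tau0, F0, 0.5, -sigma0)
print(f"tau = {tau0:.3f} +{upper - tau0:.3f} -{tau0 - lower:.3f}")
```

For a function that is far from parabolic this simple rescaling may oscillate instead of converging, which is why the sketch raises an error after `max_iter` steps.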
The general problem in this section is to minimize the function $F(\mathbf{x})$ subject to one or more of the following conditions:

(i) $f_1(\mathbf{x}) = 0$, $f_2(\mathbf{x}) = 0$, ... (constraint equations)
(ii) $a_1 \le x_1 \le b_1$, $a_2 \le x_2 \le b_2$, ... (simple constant limits)
(iii) $u_1(\mathbf{x}) \le x_1 \le w_1(\mathbf{x})$, $u_2(\mathbf{x}) \le x_2 \le w_2(\mathbf{x})$, ... (simple variable limits)
(iv) $u_1(\mathbf{x}) \le v_1(\mathbf{x}) \le w_1(\mathbf{x})$, $u_2(\mathbf{x}) \le v_2(\mathbf{x}) \le w_2(\mathbf{x})$, ... (implicit variable limits)

In (i) - (iv) $a_i$, $b_i$, $i = 1, 2, \ldots, n$, are constants while all functions depend on $\mathbf{x}$. The general conditions of type (iv) actually include all conditions of the types (ii) and (iii), but for practical and illustrative purposes they may be separated as above.

Fig. 13.6. Error determination for an ill-behaved function (see text).

In the special field of linear programming, where the function $F(\mathbf{x})$ and its constraints are linear in the parameters, the ideas discussed in the following are not of practical use. When $F(\mathbf{x})$ is linear it is the constraints that make it possible for the function to take a minimum value at a finite point at the boundary of the allowed region. Thus, in the field of linear programming the constraints are essential to obtain a minimum, whereas in our problems the constraints are more or less regarded as a nuisance.
Constraints can be taken care of by special procedures for constrained minimization. We will, however, only consider techniques for modifying the function $F(\mathbf{x})$ in such a way that ordinary, unconstrained, minimization procedures may be applied. Well-known techniques are

- elimination of variables using the constraint equations,
- introduction of Lagrangian multipliers,
- change of variables to eliminate constraints,
- introduction of penalty functions.

In particular, when $F(\mathbf{x})$ is to be minimized under $k$ constraint equations $f_i(\mathbf{x}) = 0$, $i = 1, 2, \ldots, k$, corresponding to case (i) above, one may use the constraint equations to eliminate variables in $F(\mathbf{x})$. An example of the elimination approach was given in Sect. 10.7.1. Alternatively one can, as discussed in Chapter 10, introduce Lagrangian multipliers $\boldsymbol{\lambda} = \{\lambda_1, \lambda_2, \ldots, \lambda_k\}$ and construct a modified function $T(\mathbf{x}, \boldsymbol{\lambda})$, where

$$T(\mathbf{x}, \boldsymbol{\lambda}) = F(\mathbf{x}) + \sum_{i=1}^{k} \lambda_i f_i(\mathbf{x}),$$

which is then minimized with respect to $\mathbf{x}$ and $\boldsymbol{\lambda}$. The method with the Lagrangian multipliers obviously suffers from the drawback of an increased number of parameters in the minimization, but has otherwise virtues of its own.

In the following sections we will see how the two techniques involving change of variables and penalty functions can be applied to the situations (ii) - (iv) above.

13.6.1 Elimination of constraints by change of variables

An important practical case of constrained minimization occurs when some or all variables are bounded by constant limits, $a_i \le x_i \le b_i$. In a minimization by a grid search such inequality constraints are easily taken care of by limiting the search region. With the other minimization procedures one may eliminate the explicit inequality constraints by changing to a new set of unrestricted variables. To find the minima one simply expresses $F(\mathbf{x})$ by the new variables $\mathbf{y}$ through a straightforward substitution and minimizes with respect to $\mathbf{y}$. The minima obtained in $\mathbf{y}$-space are then transformed back to $\mathbf{x}$-space. The errors in $\mathbf{x}$ must be found from an ordinary computation of error propagation.

If, for instance, $x_i$ has to be non-negative, $0 \le x_i \le \infty$, either of the following variable changes can be applied:

$$x_i = y_i^2 \quad \text{or} \quad x_i = e^{y_i}.$$

For problems which require $0 \le x_i \le 1$ transformations like

$$x_i = \sin^2 y_i \quad \text{or} \quad x_i = \frac{e^{y_i}}{e^{y_i} + e^{-y_i}} \qquad (13.20)$$

will remove the constraint. With general constant limits $a_i \le x_i \le b_i$ an often used transformation is

$$x_i = a_i + (b_i - a_i)\sin^2 y_i. \qquad (13.21)$$

The variable changes above will not introduce new minima in $\mathbf{x}$. The transformations involving $\sin^2 y_i$ actually produce for each minimum in $\mathbf{x}$-space many equally-spaced minima in $\mathbf{y}$-space. This should, however, not cause difficulties provided that the minimization procedure used does not involve so long steps that intermediate hills are crossed.

Exercise 13.17: The function $F(x_1, x_2, x_3)$ is to be minimized under the constraints

$$0 \le x_1 \le \infty, \qquad 0 \le x_2 \le 1.$$

With the substitutions

$$x_1 = e^{y_1}, \qquad x_2 = \sin^2 y_2, \qquad x_3 = y_3$$

verify that the minimum can be obtained by an unconstrained minimization in $\mathbf{y}$-space followed by the transformation of the minimum point back to $\mathbf{x}$-space. If the covariance of the minimum point in $\mathbf{y}$-space is $V(\mathbf{y})$, show that the covariance of the minimum in $\mathbf{x}$-space is $V(\mathbf{x}) = S\,V(\mathbf{y})\,S^T$, where

$$S = \begin{pmatrix} e^{y_1} & 0 & 0 \\ 0 & 2\sin y_2 \cos y_2 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Exercise 13.18: The minimum of a function $F(x_1, x_2)$ subject to the constraint

$$0 \le x_1 \le x_2$$

can be obtained by an unconstrained minimization with direct substitution of

$$x_1 = y_1^2, \qquad x_2 = y_1^2 + y_2^2.$$

Consider the explicit function $F(x_1, x_2) = x_1^2 - x_1 x_2 + x_2^2$. Show analytically
that the minimum is at $x_1 = x_2 = 0$.

13.6.2 Penalty functions; Carroll's response surface technique

When one has simple variable limits, $u_i(\mathbf{x}) \le x_i \le w_i(\mathbf{x})$, or implicit variable limits, $u_i(\mathbf{x}) \le v_i(\mathbf{x}) \le w_i(\mathbf{x})$, it may be difficult to find transformations which will eliminate the constraints. However, it is possible to modify the function $F(\mathbf{x})$ by adding a so-called penalty function to it so that the function value becomes large at the boundary of the allowed region. In this way one "fools" the minimization procedure to search only inside the allowed region provided that a permissible starting point was chosen. The penalty function should have an effect just on the boundary, to ensure that no new minima are introduced and that no minima are lost.

To be specific, consider the minimization of $F(\mathbf{x})$ with the constraint $p(\mathbf{x}) \ge 0$. We can then define a modified function $T(\mathbf{x})$ such that

$$T(\mathbf{x}) = \begin{cases} F(\mathbf{x}), & p(\mathbf{x}) \ge 0, \\ F(\mathbf{x}) + c\,p^2(\mathbf{x}), & p(\mathbf{x}) < 0, \end{cases}$$

and minimize $T(\mathbf{x})$ as before. The constant $c$ should be chosen large compared to $F(\mathbf{x})$, and $F(\mathbf{x})$ should be continuous at the boundary. This method can easily be extended to more constraints.

One sees from the example above a useful feature of introducing the "penalty" $c\,p^2(\mathbf{x})$ in $T(\mathbf{x})$. Without the penalty function the constraint $p(\mathbf{x}) \ge 0$ could be used as a test criterion during the minimization of $F(\mathbf{x})$, but then a point for which $p(\mathbf{x}) < 0$ would supply no information about the direction for the continued search. When the "penalty" is incorporated in $T(\mathbf{x})$ the knowledge gained on the gradient will help speed up the return to the allowed region of parameter space.

Another popular way of introducing penalty functions is by Carroll's response surface technique. Suppose the constraints are redefined to the form $p_i(\mathbf{x}) \ge 0$, for $i = 1, 2, \ldots, m$. The modified function $T(\mathbf{x}, r)$ to be minimized is then defined by

$$T(\mathbf{x}, r) = F(\mathbf{x}) + r \sum_{i=1}^{m} c_i/p_i(\mathbf{x}), \qquad (13.23)$$

where the sum represents the "penalty". The $c_i$'s are positive weights of the individual "penalties", while $r$ ($\ge 0$) weights the sum relatively to $F(\mathbf{x})$. The minimization is then handled as a succession of unconstrained minimizations for different values of $r$, starting with a (finite) $r > 0$ and reducing it stepwise to zero.

Exercise 13.19: The function $F(x)$ is to be minimized within the region $a \le x \le b$ using Carroll's response surface technique. Show that the function $T(x, r)$ of eq.(13.23) with equal weights $c_i$ can be written

$$T(x, r) = F(x) + r\left(\frac{1}{x - a} + \frac{1}{b - x}\right).$$

13.6.3 Example: Determination of resonance production

An effective mass spectrum of the $\pi^+\pi^-$ system in the reaction $\pi^- p \to \pi^- p\,\pi^+\pi^-$ shows enhancements corresponding to the production of $\rho(765)$ and $f(1260)$. The experimental spectrum is to be compared with the theoretical distribution

$$f(M; \boldsymbol{\alpha}) = \alpha_\rho\,BW_\rho(M) + \alpha_f\,BW_f(M) + \alpha_B\,B(M),$$

where $BW_\rho(M)$, $BW_f(M)$ are Breit-Wigner functions describing the resonances and $B(M)$ a background term, all of which are normalized and known. The production fractions $\alpha_\rho$, $\alpha_f$, $\alpha_B$ are the unknown parameters, which must satisfy the following constraints:

$$0 \le \alpha_\rho \le 1, \qquad 0 \le \alpha_f \le 1, \qquad 0 \le \alpha_B \le 1, \qquad \alpha_\rho + \alpha_f + \alpha_B = 1.$$

The last (normalization) condition can be used to eliminate $\alpha_B$ from $f(M; \boldsymbol{\alpha})$, leaving two unknown parameters $\alpha_\rho$ and $\alpha_f$. If now the fractions $\alpha_\rho$, $\alpha_f$ are to be estimated, for example by the Least-Squares method, this corresponds to minimizing a function $F(\alpha_\rho, \alpha_f)$ of summed squared deviations, subject to the constraints

$$0 \le \alpha_\rho \le 1, \qquad 0 \le \alpha_f \le 1, \qquad 0 \le 1 - \alpha_\rho - \alpha_f \le 1. \qquad (2)$$
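Before the example is continued, the response surface technique of eq.(13.23) can be sketched in one dimension, in the form of Exercise 13.19. The sketch rests on several assumptions that are not in the text: the helper names, the crude step search used as the unconstrained minimizer, the tenfold reduction of $r$ per stage, and an explicit safeguard that returns infinity outside the allowed region so that a finite step cannot jump over the boundary.

```python
import math


def carroll_T(F, constraints, r, weights=None):
    """Return T(x) = F(x) + r * sum_i c_i / p_i(x) for constraints p_i(x) >= 0."""
    c = weights if weights is not None else [1.0] * len(constraints)

    def T(x):
        p = [p_i(x) for p_i in constraints]
        if min(p) <= 0.0:
            # Safeguard assumed for this sketch: reject points on or outside
            # the boundary, where the penalty terms change sign.
            return math.inf
        return F(x) + r * sum(c_i / p_i for c_i, p_i in zip(c, p))

    return T


def step_search(f, x, step=0.1, tol=1e-7):
    """Crude unconstrained 1-D minimizer: walk downhill, halve step when stuck."""
    fx = f(x)
    while step > tol:
        for trial in (x + step, x - step):
            ft = f(trial)
            if ft < fx:
                x, fx = trial, ft
                break
        else:
            step *= 0.5
    return x


def minimize_constrained(F, constraints, x_start, r_start=1.0, r_stop=1e-8):
    """Succession of unconstrained minimizations of T(x, r) for decreasing r."""
    x, r = x_start, r_start
    while r >= r_stop:
        x = step_search(carroll_T(F, constraints, r), x)  # restart from last minimum
        r *= 0.1
    return x


# Hypothetical problem: the unconstrained minimum (x = 2) lies outside a <= x <= b.
a, b = 0.0, 1.0
box = [lambda x: x - a, lambda x: b - x]
x_min = minimize_constrained(lambda x: (x - 2.0) ** 2, box, x_start=0.5)
print(f"constrained minimum near x = {x_min:.5f}")
```

The starting point must lie inside the allowed region. As $r$ is reduced the minimum of $T$ moves towards the boundary $x = b$ from the inside, but never reaches it exactly for any finite $r$.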
To make use of an unconstrained minimization procedure with Carroll's response surface technique we must rewrite the constraints (2) to the form $p_i(\boldsymbol{\alpha}) \ge 0$, $i = 1, 2, \ldots, m$. The equivalent set of constraints is

$$\alpha_\rho \ge 0, \qquad \alpha_f \ge 0, \qquad 1 - \alpha_\rho \ge 0, \qquad 1 - \alpha_f \ge 0, \qquad 1 - \alpha_\rho - \alpha_f \ge 0. \qquad (3)$$

One constraint, $1 - \alpha_\rho - \alpha_f \le 1$, in (2) is implicitly given by the others, leaving five constraints in (3). Thus the modified function of eq.(13.23) with equal weights $c_i$ can be written

$$T(\alpha_\rho, \alpha_f, r) = F(\alpha_\rho, \alpha_f) + r\left(\frac{1}{\alpha_\rho} + \frac{1}{\alpha_f} + \frac{1}{1 - \alpha_\rho} + \frac{1}{1 - \alpha_f} + \frac{1}{1 - \alpha_\rho - \alpha_f}\right).$$

From this example it is realized that, depending on which variable is eliminated from the normalization condition, somewhat different estimates may be found for the production fractions. The differences will, however, be reflected by the errors connected to the estimates.

13.7 CONCLUDING REMARKS

The merit of a minimization method is often judged on its ability to handle problems of large complexity. The sophisticated methods designed to search in "intelligent" directions with variable step sizes may in this sense be considered superior to the more simple-minded step methods. However, which of the many minimization procedures is the best for practical use depends to a large extent on the actual function to be minimized. It is therefore difficult to give specific recommendations. With simple functions of only a few parameters the simple methods are often found to produce satisfactory results. With complicated functions involving many parameters the more refined techniques are necessary, and the choice of procedure should be made also considering where in the region of parameter space the minimization is to be started. Quite often a method which works well in the distant regions is less efficient when approaching the minimum. This suggests that the different methods be applied in sequence, depending on the region reached. Indeed, the general purpose programs available incorporate several minimization techniques and allow the user to change from one method to another in one run.
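As a closing illustration for this chapter, the prescriptions (i) - (iv) of the Davidon variance algorithm of Sect. 13.3.3 can be written out as a short sketch. It follows the steps as given in the text; the function name `davidon`, the default value of `alpha`, the iteration cap `max_iter` and the use of plain Python lists for vectors and matrices are assumptions made for illustration. It is applied here to the case of Exercise 13.16, $F(x) = x^2$ with $x_0 = 2$, $V_0 = 1$.

```python
def davidon(F, grad, x, V, alpha=0.1, beta=10.0, eps=0.1, max_iter=100):
    """Sketch of the variance algorithm; returns (x_min, F_min, V)."""
    n = len(x)
    Fx, g = F(x), grad(x)
    for _ in range(max_iter):
        # (i) trial point x* = x - V g, with F* and g*
        xs = [x[i] - sum(V[i][j] * g[j] for j in range(n)) for i in range(n)]
        Fs, gs = F(xs), grad(xs)
        # (ii) residual vector r = V g* and the number rho = g* . r
        r = [sum(V[i][j] * gs[j] for j in range(n)) for i in range(n)]
        rho = sum(gs[i] * r[i] for i in range(n))
        if rho < eps:
            return xs, Fs, V
        # (iii) choose lambda from gamma, then update the covariance matrix
        gamma = -sum(g[i] * r[i] for i in range(n)) / rho
        if -alpha / (1 + alpha) <= gamma <= alpha / (1 - alpha):
            lam = alpha
        elif -beta / (beta + 1) <= gamma < -alpha / (1 + alpha):
            lam = -gamma / (gamma + 1)
        elif -beta / (beta - 1) <= gamma < -beta / (beta + 1):
            lam = beta
        else:
            lam = gamma / (gamma + 1)
        Vs = [[V[i][j] + (lam - 1) * r[i] * r[j] / rho for j in range(n)]
              for i in range(n)]
        # (iv) accept the new point only if the function value has decreased;
        #      the covariance estimate is updated in either case
        if Fs < Fx:
            x, Fx, g = xs, Fs, gs
        V = Vs
    raise RuntimeError("no convergence within max_iter iterations")


# Exercise 13.16: F(x) = x^2 from x0 = 2, V0 = 1.
x_min, F_min, V = davidon(lambda x: x[0] ** 2,
                          lambda x: [2.0 * x[0]],
                          x=[2.0], V=[[1.0]])
print(x_min, F_min, V)
```

In this one-dimensional case the first stage leaves the point unchanged but corrects the variance from 1 to 0.5, which is the inverse of the second derivative of $x^2$; the next trial point is then the exact minimum and the iteration stops in step (ii).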
14. Hypothesis testing

14.1 INTRODUCTORY REMARKS

In the preceding chapters main emphasis has been on the area of statistical inference which deals with parameter estimation, that is, point estimation and interval estimation of the parameters of a distribution. This chapter will be devoted to another domain of statistical inference, namely that of testing statistical hypotheses. This subject is closely related to parameter point estimation and to the determination of confidence intervals. While the estimation problem generally amounts to seeking estimates of unknown parameters, our concern under hypothesis testing will be to decide whether, from a statistical point of view, some given mathematical model with pre-assigned or estimated values of the parameters is acceptable in light of the observations.

Suppose for instance that we have measured the proper decay time of a sample of $\Xi^0$ hyperons and from these measurements estimated the $\Xi^0$ mean lifetime. We may then ask the question: Does this estimate agree with the prediction of the $\Delta I = \frac{1}{2}$ rule that the $\Xi^0$ lives twice as long as the $\Xi^-$? In different words, do the observations throw doubt upon or even disprove the validity of the $\Delta I = \frac{1}{2}$ rule?

Instead of giving answers "yes" or "no" to questions of the above type it is customary to rephrase the problem in terms of a test of a statistical hypothesis which follows as a consequence of the assumed physical law. In the situation above let $\tau_0$ be the mean lifetime of the $\Xi^0$ as implied by the $\Delta I = \frac{1}{2}$ rule and the known $\Xi^-$ lifetime, and let $\tau$ be the value indicated by an experiment. We may then explicitly want to test if $\tau$ is equal to $\tau_0$ within the experimental errors; we write

$$H_0: \ \tau = \tau_0$$

and call $H_0$ the null hypothesis. The other possibility, $\tau$ is different from $\tau_0$, can be called the alternative hypothesis to $H_0$, and is written

$$H_1: \ \tau \ne \tau_0.$$

When the experiment has been performed the actually observed lifetime $\tau_{obs}$ should enable us to express a probability or "likelihood" that the null hypothesis $H_0$ is correct. Whether the null hypothesis should be rejected as false will in general depend on the alternative hypothesis to which it is compared.

It will be our concern in this chapter to discuss how various hypotheses can be put to test. If the hypothesis under test involves only specified values of certain parameters, as $H_0$ in the case considered above, the test is called a parametric test. The formulation of such tests follows closely the lines of thought for the construction of confidence intervals for unknown parameters as discussed in Chapter 7.

Non-parametric tests deal with questions like: Is an experimental distribution of a given shape? Or, are two given experimental distributions of the same form? In the former case the null hypothesis can be that the observations come from, say, a normal population while the alternative hypothesis can be that they originate from any other parent distribution; for this kind of problem one can formulate goodness-of-fit tests for $H_0$. In the comparison of experimental distributions the null hypothesis need not make any specific assumption about the shapes of the parental distributions, except that these are equal. If the test formulation is made independent of the form of the underlying distributions, and therefore is valid for all distributional forms, it is called a distribution-free test.

Our discussion in the following will be limited to a rather elementary survey of the general principles involved, which should, however, be sufficient to tackle many practical problems met in the everyday life of an experimental physicist.

It should be stressed from the outset that an experimentalist will often, on the basis of his own observations only, not be able to prove or disprove a fundamental theoretical idea or hypothesis which motivated his experiment. This is particularly true in particle physics, which commonly calls for higher statistics than can be obtained in a single experiment. In essence the individual experiment determines the probability to obtain the observed result assuming the hypothesis to be true, and quite often the experimentalist will have to leave the problem at this stage. Further measurements or other types of
experiments may have to be awaited before a final statement or decision can be made. For instance, repeated experiments may be necessary to get a sufficiently low overall probability to justify the rejection of some theoretical hypothesis. The decision is not taken until it has been demonstrated with overwhelming plausibility that the theoretical idea was wrong.

This is a somewhat different situation compared to the practice in industry, insurance companies etc., where quite often decisions have to be taken even though the risk of making the wrong decision can be rather high. High-risk decisions also occur in particle physics, but then on a more elementary level. For example, for an individual particle reaction a selection is made between the kinematically permitted hypotheses on the basis of the probabilities of the various fits. The physicist, however, will usually have the possibility to correct his final sample of accepted events for the possible "background" of incorrectly assigned kinematical hypotheses. Therefore, although the decisions about the individual events are taken with a rather large risk of being wrong, one tries in the end to correct for the wrong decisions made.

14.2 OUTLINE OF GENERAL METHODS

To introduce the concepts involved in hypothesis testing suppose that we have two hypotheses completely specified by two different values of a parameter $\theta$ entering a p.d.f. $f(x|\theta)$ *). The null hypothesis $H_0$ assumes $\theta = \theta_0$, and the alternative hypothesis $H_1$ assumes $\theta = \theta_1$. When a hypothesis is completely specified, as both $H_0$ and $H_1$ in this case, it is called a simple hypothesis. A hypothesis is called composite if any parameter is not specified completely. For instance, an alternative $H_1$: $\theta > \theta_0$ is a composite hypothesis.

We shall now define criteria for when to accept the null hypothesis $H_0$

of rejection or the critical region for $H_0$, while $(W - R)$ is called the acceptance region for $H_0$. The two regions are separated by the critical value $x_c$; see the illustration in Fig. 14.1. The implication is that if the observed value $x_{obs}$ falls in $R$ (i.e. $x_{obs}$ exceeds $x_c$ in Fig. 14.1) we shall reject $H_0$. Otherwise we shall accept it.

Fig. 14.1. Illustration of critical region $R$ (rejection region) and acceptance region $W - R$ for the test statistic $x$.

The preassigned probability $\alpha$ that the observation $x$ will belong to the region $R$ is called the significance, or the size of the test, and determines the significance level at $100\alpha$ %.

From this definition there is obviously a probability $\alpha$ that the observed value $x_{obs}$ will fall into $R$ also when $H_0$ is true. Therefore, in $100\alpha$ % of all decisions $H_0$ will be rejected when it should, in fact, have been accepted. The mistake we do by rejecting $H_0$ when it is true is called a Type I error, or an
(and reject H I ) . and when t o a c c e p t t h e a l t e r n a t i v e h y p o t h e s i s HI ( r e j e c t i n g n o ) , error of the f i r s t k i d . S i n c e we want t o cormnit such an error a s r a r e l y as
on t h e b a s i s of a given o b s e r v a t i o n n p o s s i b l e a low n m e r i c a l v a l u e should be taken f o r a .
0bs'
A s s m i n g the n u l l h y p o t h e s i s Ho t o be t r u e we can f i n d a r e g i o n R i n There i s , h a r e v e r , a n o t h e r p o s s i b l e mistake which can occur, namely
t h e sample space W f o r the o b s e r v a t i o n n such t h a t t h e p r o b a b i l i t y t h a t x belongs t h a t we a c c e p t Ho as t r u e when i t i s , i n f a c t , f a l s e . T h i s i s c a l l e d a Type II
t o R is e q u a l t o any preassigned n m e r i c a l v a l u e . The r e g i o n R i s c a l l e d the e m m , or an error of the eecond kind; t h e p r o b a b i l i t y o f i t s occurrence 6 de-
pends on t h e a l t e r n a t i v e h y p o t h e s i s HI. With reference t o t h e i l l u s t r a t i o n s i n
7 I n t h i s s e c t i o n x can be a d i r e c t l y o b s e r v a b l e q u a n t i t y or more g e n e r a l l y Fig. 14.2 we can i n an obvious n o t a t i o n s t a t e t h e above d e f i n i t i o n s as f o l l w s ,
some f u n c t i o n of t h e o b s e r v a b l e s , corresponding t o the n o t i o n of a s t a t i s t i c
(compare Chapters 7 and 8 ) .
$$\text{Prob(Type I error)} = \alpha = \int_R f(x|\theta_0)\,dx = \int_{x_c}^{\infty} f(x|\theta_0)\,dx, \tag{14.1}$$

$$\text{Prob(Type II error)} = \beta = \int_{W-R} f(x|\theta_1)\,dx = \int_{-\infty}^{x_c} f(x|\theta_1)\,dx. \tag{14.2}$$

Fig. 14.2. Illustration of Type I error $\alpha$ and Type II error $\beta$.

The power of a test is defined as the probability of rejecting a hypothesis when
it is false. We have for the power of the test of the null hypothesis $H_0$
against the alternative $H_1$:

$$\text{Power} = 1-\beta = \int_R f(x|\theta_1)\,dx = \int_{x_c}^{\infty} f(x|\theta_1)\,dx. \tag{14.3}$$

From the previous considerations it seems reasonable to choose the critical
region $R$ such that, for a specified significance level, the power gets as high
as possible. This vague statement about the choice of the best critical region
can be given a more quantitative form in terms of the Neyman-Pearson test and
the likelihood-ratio test to be discussed later. Before we proceed to discuss
these explicit tests we will illustrate with a specific example from particle
physics some of the concepts introduced above.

14.2.1 Example: Separation of one-$\pi^0$ and multi-$\pi^0$ events

A well-known situation involving decisions between different hypotheses occurs
in the identification of particle reactions based on kinematical fitting of
measured events.

Let us consider antiproton-proton annihilations into four-pronged events in a
hydrogen bubble chamber. A large fraction of the events do not satisfy the
kinematical constraints imposed by the energy and momentum conservation laws if
one assumes that only four charged pions were created in the annihilation
process. In these events probably one or more neutral particles were produced,
and the missing-mass as deduced from the measured visible particles (tracks) is
consistent with the production of at least one $\pi^0$. From the measured
missing-mass values one wants to separate these events into two groups, one
sample consisting of events where there is a single $\pi^0$ (the "1C-channel"),
the second sample consisting of events in which two or more $\pi^0$'s were
produced (the "no-fit channel").

In the language of hypothesis testing the problem is phrased as follows: The
individual events are to be put to test for the null hypothesis

$$H_0:\ \bar{p}p \to \pi^+\pi^+\pi^-\pi^-\pi^0$$

against the alternative hypothesis

$$H_1:\ \bar{p}p \to \pi^+\pi^+\pi^-\pi^- M$$

(here $M$ denotes "more than one neutral pion"), and the missing-mass squared
$m^2$ is to be used as a test statistic. The critical value $m_c^2$ is most
reasonably taken somewhere beyond the squared $\pi^0$ mass, corresponding to a
one-sided test for $H_0$. If the missing-mass squared for an event comes out
smaller than $m_c^2$ the hypothesis $H_0$ is accepted; we then call the event a
one-$\pi^0$ event. If $m^2 > m_c^2$ the hypothesis $H_0$ is rejected and the
alternative $H_1$ accepted, corresponding to a multi-$\pi^0$ event.
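
The error probabilities of eqs.(14.1)-(14.3) for a one-sided cut of this kind can be evaluated by direct numerical integration. The Python sketch below is only an illustration under stated assumptions: the two Gaussian shapes, their parameters and the cut value are invented for the example and are not taken from the experiment, and the function names are hypothetical.

```python
import math

def gauss_pdf(x, mean, sigma):
    """Normal p.d.f., used here only as an assumed stand-in shape."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(pdf, lo, hi, steps=20000):
    """Trapezoidal integral of pdf between lo and hi."""
    h = (hi - lo) / steps
    total = 0.5 * (pdf(lo) + pdf(hi))
    for i in range(1, steps):
        total += pdf(lo + i * h)
    return total * h

def error_probabilities(pdf_h0, pdf_h1, x_c, x_min, x_max):
    """Upper-tail critical region x > x_c on the finite range [x_min, x_max]."""
    alpha = integrate(pdf_h0, x_c, x_max)   # eq.(14.1): reject H0 although it is true
    beta = integrate(pdf_h1, x_min, x_c)    # eq.(14.2): accept H0 although H1 is true
    return alpha, beta, 1.0 - beta          # eq.(14.3): power = 1 - beta

# Assumed shapes: H0 centred at 0, H1 centred at 3, both with unit width.
f0 = lambda x: gauss_pdf(x, 0.0, 1.0)
f1 = lambda x: gauss_pdf(x, 3.0, 1.0)

alpha, beta, power = error_probabilities(f0, f1, x_c=1.645, x_min=-10.0, x_max=13.0)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Moving `x_c` to the right lowers `alpha` and raises `beta`, which is the conflict discussed in the text.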
In choosing the critical value $m_c^2$ we are faced with a dilemma. Suppose, for
example, that the observed missing-mass spectrum for the total event sample has
the shape of Fig. 14.3(a). If the critical value is set at a low mass, not much
larger than $m_{\pi^0}^2$, we shall be ensured that relatively few true
multi-$\pi^0$ events will wrongly be taken for one-$\pi^0$ events, but at the
same time many true one-$\pi^0$ events will then be lost from the final
one-$\pi^0$ sample. On the other hand, if the critical value is taken at a high
mass, the loss of true one-$\pi^0$ events will be small, but the contamination
of multi-$\pi^0$ events in the one-$\pi^0$ sample will with this choice be
considerable. Clearly, the decision of a critical value requires a two-way
consideration of the conflicting demands of a smallest possible loss of true
one-$\pi^0$ events and a smallest possible contamination of multi-$\pi^0$ events
in the final one-$\pi^0$ sample.

This example illustrates the general situation when it is desired to minimize
the probability $\alpha$ for rejecting $H_0$ when it is true, i.e. to commit a
Type I error (corresponding to losing true one-$\pi^0$ events), and
simultaneously minimize the probability $\beta$ for accepting $H_0$ when it
really is false, i.e. to commit a Type II error (contamination of multi-$\pi^0$
events). To keep the significance $\alpha$ at a low value and at the same time
have an optimum of the power $(1-\beta)$ of the test a compromise is called for.
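
The compromise can be made visible by tabulating the power against the significance. The sketch below is a minimal illustration, assuming two normal shapes for the test statistic under $H_0$ and $H_1$; these shapes and their parameters are invented and do not represent the measured missing-mass distributions. It relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later); `cut_and_power` is a hypothetical helper name.

```python
from statistics import NormalDist

def cut_and_power(h0, h1, alpha):
    """Upper-tail test: critical value, Type II error and power for a given alpha."""
    x_c = h0.inv_cdf(1.0 - alpha)   # P(x > x_c | H0) = alpha
    beta = h1.cdf(x_c)              # P(x < x_c | H1)
    return x_c, beta, 1.0 - beta

# Assumed, purely illustrative distributions of the test statistic.
h0 = NormalDist(mu=0.0, sigma=1.0)
h1 = NormalDist(mu=3.0, sigma=1.0)

significances = (0.01, 0.02, 0.05, 0.10, 0.15)
table = [(a, *cut_and_power(h0, h1, a)) for a in significances]
for a, x_c, beta, power in table:
    print(f"alpha={a:.2f}  x_c={x_c:6.3f}  beta={beta:.4f}  power={power:.4f}")
```

Lowering the significance pushes the critical value upwards and reduces the power, so the two error probabilities cannot be minimized at the same time.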
Fig. 14.3. (a) Distribution of the missing-mass squared $m^2$ in $\bar{p}p$
annihilations into four charged plus at least one neutral pion; data from an
experiment with 1.2 GeV/c antiprotons incident in a hydrogen bubble chamber.
(b) Probability density functions in $m^2$ for one-$\pi^0$ events (full curve)
and multi-$\pi^0$ events (dotted curve). Horizontal axis: missing-mass squared
$m^2$ (GeV$^2$).

With reference to Fig. 14.3(b) suppose that we know the distribution
$f(m^2|H_0)$ in the variable $m^2$ for true one-$\pi^0$ events and the
corresponding distribution $f(m^2|H_1)$ for the true multi-$\pi^0$ events. For
any chosen significance $\alpha$ we can then, by numerical integration of
$f(m^2|H_0)$ according to eq.(14.1), determine the critical value $m_c^2$ and
subsequently, by integration of $f(m^2|H_1)$, determine $\beta$ from eq.(14.2).
The result of this computation for the power considered as a function of the
significance is that $(1-\beta)$ from the value zero increases very rapidly with
increasing values of $\alpha$ up to about 0.1. For any significance greater
than, say, 0.15 the power is practically equal to 1. Thus, depending on whether
it is the purity or the size of the one-$\pi^0$ sample which is of greatest
importance one would in this case probably fix the significance level somewhere
between 2 and 10%, corresponding to taking the critical value $m_c^2$ in the
region 0.2 to 0.1 GeV$^2$.

For completeness it should be mentioned that in the present example a better
test statistic exists for the test of the hypothesis $H_0$ against the
alternative $H_1$. When the uncertainties on the measured angles and momenta are
known a Least-Squares minimization can be performed as described in Sects.10.8,
10.8.1 with constraints from energy and momentum conservation assuming an
unmeasured $\pi^0$. The $\chi^2$ function then contains more information than
the missing-mass alone, and as $\chi^2_{min}$ will be an approximate chi-square
variable with one degree of freedom, it can be used as a test statistic for a
goodness-of-fit test; see also Sect.14.4.5.

14.2.2 The Neyman-Pearson test for simple hypotheses

The Neyman-Pearson lemma states that, for a fixed significance level, the best
critical region $R$ should include those values of $x$ for which
$f(x|\theta_1)$ is as large as possible relative to $f(x|\theta_0)$.

It can easily be verified that this choice of $R$ maximizes the power of the
test $H_0: \theta = \theta_0$ against $H_1: \theta = \theta_1$. Consider one
measurement $x$ and define the significance $\alpha$ of the test by the integral
of eq.(14.1). The region $R$ should obviously include all points where
$f(x|\theta_0) = 0$ and $f(x|\theta_1) > 0$, since these points do not
contribute to $\alpha$. For $f(x|\theta_0) > 0$ the power of the test of $H_0$
against $H_1$ may be written (from eq.(14.3))

$$1-\beta = \int_R f(x|\theta_1)\,dx = \int_R \frac{f(x|\theta_1)}{f(x|\theta_0)}\,f(x|\theta_0)\,dx = \frac{f(\xi|\theta_1)}{f(\xi|\theta_0)} \int_R f(x|\theta_0)\,dx,$$

where $\xi$ is a point within $R$. Since the last integral is nothing but the
constant $\alpha$, the power of the test will be maximal if the region $R$ is
chosen such that the ratio $f(\xi|\theta_1)/f(\xi|\theta_0)$ is as large as
possible. The best critical region therefore consists of points satisfying the
inequality

$$\frac{f(x|\theta_1)}{f(x|\theta_0)} > k, \tag{14.5}$$

where $k$ is determined by the preassigned significance $\alpha$.

For the case with a series of measurements $x_1,x_2,\ldots,x_n$ the p.d.f.
$f(x|\theta)$ is replaced by the joint distribution function, the likelihood
function

$$L(\mathbf{x}|\theta) = \prod_{i=1}^{n} f(x_i|\theta).$$

The criterion (14.5) now becomes

$$\frac{L(\mathbf{x}|\theta_1)}{L(\mathbf{x}|\theta_0)} > k, \tag{14.6}$$

with the condition

$$\int_R L(\mathbf{x}|\theta_0)\,d\mathbf{x} = \alpha, \tag{14.7}$$

which replaces eq.(14.1).

The critical region constructed in accordance with the rules eqs.(14.6), (14.7)
will provide the maximum power of a simple null hypothesis against a simple
alternative hypothesis for the given significance level. Equivalently, this
region minimizes the probability of Type II errors.

For the case with many measurements the critical region $R$ in
$\mathbf{x}$-space may be difficult to find since eq.(14.7) constitutes an
n-dimensional integral. In practice one will seek the critical region for some
test statistic, expressed as a function of the $x_i$. For instance, it will be
convenient to seek the critical region for the sample mean $\bar{x}$ when
testing on a population mean $\mu$, or the sample variance $s^2$ when the test
involves the population variance $\sigma^2$. In any case, to determine the
critical region one will have to integrate over the probability density function
for the test statistic actually used.

It should be noted that the Neyman-Pearson test is applicable only in testing
simple hypotheses. For composite hypotheses one can only rarely find a test
which is more powerful than any other test. The latter type of problem will be
discussed in Sect.14.2.4 in connection with the likelihood-ratio test.

14.2.3 Example: Neyman-Pearson test on the $\Xi^0$ mean lifetime

For the $\Xi^0$ lifetime problem introduced in Sect.14.1, consider the two
following simple hypotheses about the mean value $\tau$, both with the
functional form $f(t|\tau) = \frac{1}{\tau}\exp(-t/\tau)$ (to get simple
relations the lifetime is here expressed in units of the theoretical prediction
from the $\Delta I = 1/2$ rule),

$$H_0:\ \tau = 1, \qquad H_1:\ \tau = 2.$$
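
For these two lifetime hypotheses the ratio entering the Neyman-Pearson criterion, eq.(14.6), is easy to evaluate numerically. The sketch below is only an illustration: the decay times are made-up numbers in the reduced units of the text, and the function names are hypothetical. It shows that the ratio depends on the data only through the sum (or mean) of the observed times and grows with it.

```python
import math

def exp_likelihood(times, tau):
    """Joint p.d.f. of independent decay times for mean lifetime tau."""
    return math.prod(math.exp(-t / tau) / tau for t in times)

def likelihood_ratio(times, tau0=1.0, tau1=2.0):
    """L(t | tau1) / L(t | tau0), the left-hand side of eq.(14.6)."""
    return exp_likelihood(times, tau1) / exp_likelihood(times, tau0)

# Invented samples: the first two share the same mean, the third has a larger one.
sample_a = [0.5, 1.5, 1.0]
sample_b = [0.2, 2.0, 0.8]
sample_c = [2.0, 3.0, 1.0]

for name, sample in (("a", sample_a), ("b", sample_b), ("c", sample_c)):
    mean_t = sum(sample) / len(sample)
    print(f"sample {name}: mean = {mean_t:.2f}, ratio = {likelihood_ratio(sample):.4f}")
```

Because the ratio is a monotonically increasing function of the sample mean, a cut on the ratio is equivalent to a cut on the mean lifetime of the sample.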

The condition given by eq.(14.6) now reads, assuming n observations,

$$\frac{\prod_{i=1}^{n} \frac{1}{2}\exp(-t_i/2)}{\prod_{i=1}^{n} \exp(-t_i)} = 2^{-n}\exp\Big(\tfrac{1}{2}\sum_{i=1}^{n} t_i\Big) > k.$$

In terms of $\bar{t} = \frac{1}{n}\sum_{i=1}^{n} t_i$ this can be written,

$$\bar{t} > \frac{2}{n}\ln(2^n k) \equiv T_n. \tag{14.8}$$

The best critical region in $\bar{t}$ space is therefore the set of values
satisfying the inequality (14.8), where $T_n$ is a constant which can be
calculated for any given significance $\alpha$. We see that the sample mean
$\bar{t}$ is a natural test statistic for this test about the population mean
$\tau$. However, to find $R$ (or $T_n$) it is necessary to know the p.d.f.
$f(\bar{t})$ for $\bar{t}$. This p.d.f. is particularly simple for the two
limiting cases, $n=1$ and $n$ very large. Fixing the significance level to 5% we
will now study these extreme cases.

(i) $n = 1$. The p.d.f. for the test statistic $\bar{t}$ is in this case the
same as the p.d.f. for $t$. The significance $\alpha$ determines the lower limit
$T_1$ of the critical region for $\bar{t}$ through the definition, eq.(14.1),

$$0.05 = \alpha = \int_{T_1}^{\infty} e^{-t}\,dt.$$

Hence the lower limit is

$$T_1 = -\ln 0.05 = 3.00.$$

The power of the test $H_0: \tau = 1$ against $H_1: \tau = 2$ becomes

$$1-\beta = \int_{T_1}^{\infty} \tfrac{1}{2}e^{-t/2}\,dt = e^{-T_1/2} = 0.22.$$

With just one observation of the $\Xi^0$ lifetime we will therefore reject the
null hypothesis $H_0$ if the observed value $t_{obs}$ is larger than
$T_1 = 3.00$. The probability that we accept $H_0$ when the alternative
hypothesis $H_1$ is true (i.e. we commit a Type II error) is very large, namely
$\beta = 0.78$.

(ii) $n$ large. In this case the p.d.f. for the test statistic $\bar{t}$ can be
approximated by a normal p.d.f. with mean $\tau$ and variance $\tau^2/n$ (this
was shown in Sect.9.4.8). The lower limit $T_n$ of the critical region for
$\bar{t}$ is now implied by the integral

$$0.05 = \alpha = \int_{T_n}^{\infty} N(1, 1/n)\,d\bar{t} = 1 - G\big((T_n - 1)\sqrt{n}\big),$$

where $G$ is the cumulative standard normal distribution function. From Appendix
Table A6 we find the value of the argument of $G$ to be 1.645; hence

$$T_n = 1 + \frac{1.645}{\sqrt{n}}.$$

The power of the test $H_0: \tau = 1$ against $H_1: \tau = 2$ now becomes

$$1-\beta = \int_{T_n}^{\infty} N(2, 4/n)\,d\bar{t} = 1 - G\Big(\frac{(T_n - 2)\sqrt{n}}{2}\Big) = 1 - G\Big(\frac{1.645 - \sqrt{n}}{2}\Big).$$

Thus, both the critical value $T_n$ and the power $(1-\beta)$ depend on the
number of observations $n$. Numerically, with $n=100$ we find $T_{100} = 1.16$
and $(1-\beta) = 0.99999$. The fact that the limiting value of the power is
unity expresses that the test $H_0$ against $H_1$ is consistent.
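
The numbers quoted in points (i) and (ii) can be checked with a few lines of Python. This is a sketch under the same assumptions as the text (exact exponential p.d.f. for one observation, normal approximation for the sample mean when n is large); the function names are hypothetical, and `statistics.NormalDist` (Python 3.8 or later) stands in for Appendix Table A6.

```python
import math
from statistics import NormalDist

def single_observation(alpha, tau0=1.0, tau1=2.0):
    """n = 1: exact exponential p.d.f., critical region t > T1."""
    t1 = -tau0 * math.log(alpha)      # solves exp(-T1/tau0) = alpha
    power = math.exp(-t1 / tau1)      # P(t > T1 | tau1)
    return t1, power

def large_sample(alpha, n, tau0=1.0, tau1=2.0):
    """n large: sample mean treated as N(tau, tau^2/n), critical region mean > Tn."""
    z = NormalDist().inv_cdf(1.0 - alpha)             # 1.645 for alpha = 0.05
    t_n = tau0 * (1.0 + z / math.sqrt(n))
    power = 1.0 - NormalDist(tau1, tau1 / math.sqrt(n)).cdf(t_n)
    return t_n, power

for alpha in (0.05, 0.01):
    t1, p1 = single_observation(alpha)
    t100, p100 = large_sample(alpha, n=100)
    print(f"alpha={alpha}: T1={t1:.2f}, power(n=1)={p1:.2f}, "
          f"T100={t100:.2f}, power(n=100)={p100:.5f}")
```

For $\alpha = 0.05$ this gives $T_1 \approx 3.00$ with $\beta \approx 0.78$ for a single observation, and $T_{100} \approx 1.16$ with a power very close to unity, in line with the values quoted above.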
From the points (i) and (ii) we see, therefore, that for the fixed significance
$\alpha = 0.05$, the power of the test $H_0$ against $H_1$ increases very
rapidly with the number of observations. If instead we had fixed the
significance at $\alpha = 0.01$ we would have found $(1-\beta) = 0.10$ for one
observation, and $(1-\beta) = 0.99994$ for $n = 100$. Thus the power of the test
$H_0$ against $H_1$ is weakened when the significance level is lowered.

Exercise 14.1: For the $\Xi^0$ lifetime example, consider the two simple
hypotheses
    $H_0: \tau = 1$ ($\Delta I = 1/2$ rule),
    $H_1: \tau = 1/4$ ($\Delta I = 3/2$ rule).
For the significance $\alpha = 0.05$, show that the power of the test $H_0$
against $H_1$ with only one observation is 0.19.

14.2.4 The likelihood-ratio test for composite hypotheses

The Neyman-Pearson test is only applicable when the null hypothesis $H_0$ and
the alternative $H_1$ are both simple. In situations where either $H_0$ or
$H_1$, or both, are composite hypotheses one may apply the likelihood-ratio
test.

We will consider a p.d.f. $f(x|\boldsymbol{\theta})$ with parameters
$\boldsymbol{\theta} = \{\theta_1,\theta_2,\ldots,\theta_k\}$ which belong to
the parameter space $\Omega$. We assume that $H_0$ puts some conditions on at
least one of the parameters. If $H_0$ is true the parameters are therefore
restricted to lie in a subspace $\omega$ of the total space $\Omega$. On the
basis of a sample of size $n$ from $f(x|\boldsymbol{\theta})$ we want to test
the hypothesis

$$H_0:\ \boldsymbol{\theta} \in \omega.$$

Given the observations $x_1,x_2,\ldots,x_n$ the likelihood function is

$$L(\mathbf{x}|\boldsymbol{\theta}) = \prod_{i=1}^{n} f(x_i|\boldsymbol{\theta}).$$

The maximum of this function over the total space $\Omega$ is denoted by
$L(\hat{\Omega})$. Within the subspace $\omega$, specified by the restrictions
from $H_0$, the maximum of the likelihood function is $L(\hat{\omega})$. We then
define the likelihood-ratio by the quotient

$$\lambda = \frac{L(\hat{\omega})}{L(\hat{\Omega})}. \tag{14.10}$$

This ratio is obviously non-negative, and since the maximum of $L$ within the
subspace $\omega$ cannot exceed the maximum value over the entire space
$\Omega$, $\lambda$ must be a quantity between 0 and 1.

The likelihood-ratio $\lambda$ is a function of the observations. If $\lambda$
turns out to be in the neighbourhood of 1 the null hypothesis $H_0$ is such that
it renders $L(\hat{\omega})$ close to the maximum $L(\hat{\Omega})$, and hence
$H_0$ will have a large probability of being true. On the other hand, a small
value of $\lambda$ will indicate that $H_0$ is unlikely. The variable $\lambda$
is therefore intuitively a reasonable test statistic for $H_0$, and the
likelihood-ratio test states that the critical region for $\lambda$ is given by

$$0 < \lambda < \lambda_\alpha.$$

Here $\lambda_\alpha$ must be adjusted to correspond to the chosen significance
$\alpha$, where

$$\alpha = \int_0^{\lambda_\alpha} g(\lambda|H_0)\,d\lambda, \tag{14.12}$$

and $g(\lambda|H_0)$ is the p.d.f. for $\lambda$ under the assumption that $H_0$
is true.

If the p.d.f. $g(\lambda|H_0)$ is not known it is still possible - as will
become clear from an example in the next section - to use the likelihood-ratio
test provided that one knows the distribution of some function of $\lambda$,
which has a monotonic behaviour in $\lambda$. Let $y = y(\lambda)$ be a
monotonic function of $\lambda$ and, assuming $H_0$ true, let the p.d.f. for $y$
be $h(y|H_0)$. Then the relation

$$\alpha = \int_0^{\lambda_\alpha} g(\lambda|H_0)\,d\lambda = \int_{y(0)}^{y(\lambda_\alpha)} h(y|H_0)\,dy \tag{14.13}$$

gives the critical region in terms of the variable $y$, namely the interval
between $y(0)$ and $y(\lambda_\alpha)$, and this again can be inverted to give
$\lambda_\alpha$.

It frequently occurs that it is not possible to put the likelihood-ratio
$\lambda$ in a unique and exact correspondence to a statistic whose distribution
is known exactly. Under such circumstances it will be more complicated to
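construct a valid test for $H_0$, and approximative procedures must be tried.
Fortunately a satisfactory solution often exists when one is dealing with large
samples. For example, if the null hypothesis fixes the values of $r$ of the
parameters it can be shown that, for $H_0$ true, the statistic $-2\ln\lambda$
tends asymptotically to a chi-square variable with $r$ degrees of freedom. The
statistic $-2\ln\lambda$ can therefore be used to test $H_0$, taking the
critical region at the right-hand tail of the appropriate chi-square
distribution.

It should be stressed that the asymptotic behaviour of the likelihood-ratio
depends essentially on the same conditions that give the optimum properties of
the Maximum-Likelihood estimators.

14.2.5 Example: Likelihood-ratio test on the mean of a normal p.d.f.

To get acquainted with the terminology and idea behind the likelihood-ratio test
let us consider a common practical problem. Suppose we want to test whether the
mean $\theta_1$ of a normal p.d.f.

$$f(x|\theta_1,\theta_2) = \frac{1}{\sqrt{2\pi\theta_2}}\exp\Big(-\frac{(x-\theta_1)^2}{2\theta_2}\Big)$$

has a specified value $\mu_0$, given a sample of $n$ observations
$x_1,x_2,\ldots,x_n$. The null hypothesis to be tested is then

$$H_0:\ \theta_1 = \mu_0,$$

and the alternative is

$$H_1:\ \theta_1 \neq \mu_0.$$

For both hypotheses the second parameter $\theta_2$ is some positive number. The
total parameter space $\Omega$ is here the set of all points in the positive
half plane spanned by $\theta_1$ and $\theta_2$, where
$-\infty < \theta_1 < \infty$ and $\theta_2 > 0$. Under the null hypothesis
$H_0$ the parameter $\theta_1$ is fixed to take the value $\mu_0$; thus the
subspace $\omega$ is the line defined by $\theta_1 = \mu_0$ for positive values
of $\theta_2$.

The likelihood function is

$$L(\mathbf{x}|\theta_1,\theta_2) = (2\pi\theta_2)^{-n/2}\exp\Big(-\frac{1}{2\theta_2}\sum_{i=1}^{n}(x_i-\theta_1)^2\Big). \tag{14.17}$$

The ML estimators of $\theta_1$ and $\theta_2$ over the entire parameter space
$\Omega$ are as found in Sect.9.2.3,

$$\hat{\theta}_1 = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat{\theta}_2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2.$$

The absolute maximum of the likelihood function is found by inserting
$\hat{\theta}_1$ and $\hat{\theta}_2$ in eq.(14.17), giving

$$L(\hat{\Omega}) = (2\pi\hat{\theta}_2)^{-n/2}\,e^{-n/2}.$$

Within the subspace $\omega$ the ML estimator of $\theta_2$ is found by
inserting $\theta_1 = \mu_0$ in eq.(14.17) and maximizing with respect to
$\theta_2$; one finds

$$\hat{\hat{\theta}}_2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu_0)^2. \tag{14.21}$$

We note that the ML estimators of $\theta_2$ are different in the two spaces
$\omega$ and $\Omega$. For $H_0$ true, the maximum of the likelihood function is

$$L(\hat{\omega}) = (2\pi\hat{\hat{\theta}}_2)^{-n/2}\,e^{-n/2}.$$

From the definition, eq.(14.10), the likelihood-ratio is, therefore,

$$\lambda = \Big(\frac{\hat{\theta}_2}{\hat{\hat{\theta}}_2}\Big)^{n/2} = \Big(\frac{\sum_i (x_i-\bar{x})^2}{\sum_i (x_i-\mu_0)^2}\Big)^{n/2}.$$

Since

$$\sum_{i=1}^{n}(x_i-\mu_0)^2 = \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu_0)^2,$$

we may write

$$\lambda = \Big(1 + \frac{t^2}{n-1}\Big)^{-n/2}, \tag{14.24}$$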
where the variable $t$ is defined by

$$t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2. \tag{14.25}$$

With reference to Sect.5.2.1, remark 2, we see that this is a Student's
t-variable with $n-1$ degrees of freedom. The variable has the p.d.f.
$f(t;\nu)$ of eq.(5.15) with $\nu = n-1$.

The p.d.f. $g(\lambda)$ for $\lambda$ may now be derived from the
t-distribution, and thus provide the critical region for $\lambda$ with a given
significance $\alpha$; compare eq.(14.13). However, this variable transformation
is not necessary, because $\lambda$ is a monotonic function of $t^2$. Since
$H_0$ false corresponds to $\lambda$ small and $t^2$ very large, we see that a
critical region defined by $0 < \lambda < \lambda_\alpha$ is equivalent to
$t^2 > t^2_{\alpha/2}$. Thus the critical region for $t$ corresponding to the
significance $\alpha$ consists of the two intervals

$$t < -t_{\alpha/2} \quad \text{and} \quad t > t_{\alpha/2}. \tag{14.26}$$

Here $t_{\alpha/2}$ is defined by

$$\int_{-\infty}^{-t_{\alpha/2}} f(t;n-1)\,dt = \int_{t_{\alpha/2}}^{\infty} f(t;n-1)\,dt = \tfrac{1}{2}\alpha. \tag{14.27}$$

With the critical region consisting of two separate parts we are here dealing
with a two-sided test. If the observed value $t_{obs}$ satisfies one of the
inequalities (14.26) the hypothesis $H_0$ is rejected. If the observed value
satisfies

$$-t_{\alpha/2} \le t_{obs} \le t_{\alpha/2}$$

the hypothesis $H_0$ is accepted.

Consider, for instance, an experiment with $n = 20$ and a chosen significance
$\alpha = 0.05$. For this case Appendix Table A7 gives $t_{.025} = 2.093$. If
therefore the observed $|t|$-value computed from eq.(14.25) is larger than
2.093, the null hypothesis should be rejected. The corresponding critical value
for $\lambda$ is found by inserting $t_{.025} = 2.093$ into eq.(14.24), giving
$\lambda_{.05} = 0.125$.

Let us compare this exact calculation of the critical region for $\lambda$ to
the approximation introduced at the end of the preceding section. According to
the statement made, $-2\ln\lambda$ is approximately chi-square distributed with
a number of degrees of freedom equal to the number of parameters which were
fixed by the hypothesis $H_0$. In the present case $H_0$ fixes one parameter,
and eqs.(14.24), (14.25) give

$$-2\ln\lambda = n\ln\Big(1 + \frac{t^2}{n-1}\Big) = n\ln\Big(1 + \frac{n(\bar{x}-\mu_0)^2}{(n-1)s^2}\Big).$$

When a Taylor expansion is performed and the sample variance replaced by
$\sigma^2$ for $n$ large we get the asymptotic expression

$$-2\ln\lambda \approx \frac{(\bar{x}-\mu_0)^2}{\sigma^2/n}.$$

For a chi-square variable with one degree of freedom we find from Appendix Table
A8 that a one-sided test for the significance $\alpha = 0.05$ corresponds to the
critical value $-2\ln\lambda_{.05} = 3.841$, from which $\lambda_{.05} = 0.147$.
Comparing this to the exact value 0.125 obtained above, we see that even for a
sample size as small as 20 the asymptotic $-2\ln\lambda$ approximation is
reasonable.

To compute the power of the test we have to consider the alternative hypothesis
$H_1: \theta_1 \neq \mu_0$. The variable $t$ as defined by eq.(14.25) may still
be used as a test statistic, but if $H_1$ is true, the statistic has a
non-central Student's t-distribution (compare Sect.5.2.1, remark 1 and Exercise
5.20) with non-central parameter

$$\delta = \frac{\theta_1-\mu_0}{\sigma/\sqrt{n}}, \qquad \sigma^2 = \theta_2.$$

It is seen that $\delta$ gives a measure of how much the alternative deviates
from the null hypothesis.

In accordance with the definition of the power of a test the power function is
defined as

$$1-\beta(\delta) = \int_0^{\lambda_\alpha} g(\lambda|H_1,\delta)\,d\lambda = 1 - \int_{-t_{\alpha/2}}^{t_{\alpha/2}} f(t;n-1,\delta)\,dt,$$

where $g(\lambda|H_1,\delta)$ is the distribution of $\lambda$ for a given
$\delta$ when $H_1$ is true, and $f(t;n-1,\delta)$ is the non-central
t-distribution with $n-1$ degrees of freedom. The
cumulative distribution of $f(t;n-1,\delta)$ has been tabulated for various
parameter combinations in, for instance, "Tables of the non-central
t-distribution" by G.J. Resnikoff and G.J. Lieberman. In Fig. 14.4 the power
function is shown for different values of $\alpha$ and $n$. The following
features emerge from the figure:

(i) The power function is symmetric about $\delta = 0$, as already suggested by
the proposition we are checking.

(ii) The power is at its minimum for $\delta = 0$ when it equals the
significance, and increases with increasing magnitude of $|\delta|$.

(iii) The power increases with increasing significance.

(iv) For a given $\delta$ the power increases with the number of observations.

These results would obviously also be found if, instead of testing on $t$, we
had considered an upper-tail test with the statistic $t^2$, which has a
non-central F-distribution if the alternative hypothesis is true.

Fig. 14.4. Power function for Student's t test on the mean value; curves are
given for sample sizes $n = 20$ and $n = \infty$, and for significances
$\alpha = 0.05$ (full curves) and $\alpha = 0.02$ (dotted curves).

Exercise 14.2: For the $\Xi^0$ lifetime example introduced in Sect.14.2.3,
consider the test of the simple null hypothesis $H_0: \tau = 1$ against the
composite alternative hypothesis $H_1: \tau \neq 1$. Show that with the sample
$t_1,t_2,\ldots,t_n$ the likelihood-ratio becomes

$$\lambda = \bar{t}^{\,n}\exp\big(-n(\bar{t}-1)\big),$$

where $\bar{t} = \frac{1}{n}\sum_{i=1}^{n} t_i$. For $n$ reasonably large, when
$\bar{t}$ is $N(1,1/n)$, show that $\alpha = 0.05$ corresponds to a critical
region $0 < \lambda < \lambda_{.05}$ where

$$\lambda_{.05} = \Big(1 + \frac{1.645}{\sqrt{n}}\Big)^{n}\exp(-1.645\sqrt{n}),$$

and that $\lambda_{.05}$ approaches $\exp(-\tfrac{1}{2}\cdot 1.645^2) \approx 0.26$
when $n \to \infty$.

14.3 PARAMETRIC TESTS FOR NORMAL VARIABLES

We shall now concentrate on various test problems which involve the parameters
of the normal distribution. The first of these is the following: Given a set of
observations, assumed to represent a random sample from a normal population, we
want to test whether these observations are in agreement with some particular
value of the mean or variance of the normal distribution. Simple tests of this
type are treated in Sect.14.3.1. In the subsequent sections we will be concerned
with constructing tests for the comparison of parameters of two normal
distributions. In Sect.14.3.7 we discuss how the mean values in more than two
normal distributions can be compared.

The main point in all the examples considered below is to construct an
appropriate test statistic, that is, some function involving the sample
variables but no unspecified parameters, and which has a known distribution
function. This distribution will enable us to determine a critical region for
the test statistic in question, the size of the region being determined by the
chosen significance level of the test.

14.3.1 Tests of mean and variance in $N(\mu,\sigma^2)$

Let $x_1,x_2,\ldots,x_n$ be a random sample from the normal population
$N(\mu,\sigma^2)$. We want first to see how the mean value of the distribution
can be put to test and write the null hypothesis as

$$H_0:\ \mu = \mu_0,$$

where $\mu_0$ is some specific number. The alternative is the composite
hypothesis

$$H_1:\ \mu \neq \mu_0. \tag{14.31}$$

We then have a problem which involves a reasoning quite analogous to that used
in establishing confidence intervals for the mean of a normal distribution,
which
was carried out in some detail in Sect.7.2. Depending on whether $\sigma^2$ is a
known quantity or not, the appropriate test statistic to test
$H_0: \mu = \mu_0$ is (see Sects.7.2.1 and 7.2.2 for the arguments)

$$\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}, \quad \text{which is } N(0,1) \qquad (\sigma^2 \text{ known}),$$

$$\frac{\bar{x}-\mu_0}{s/\sqrt{n}}, \quad \text{which is } t(n-1) \qquad (\sigma^2 \text{ unknown}).$$

Here $N(0,1)$ is the usual abbreviation for a standard normal variable, while
$t(n-1)$ is a Student's t-variable with $n-1$ degrees of freedom; $\bar{x}$ and
$s^2$ are the well-known sample characteristics,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2.$$

In testing the mean we are likely to reject $H_0$ (and accept $H_1$) if the
observations are such that the sample mean $\bar{x}$ is either much too small or
much too large compared to the hypothetical value $\mu_0$; therefore a two-sided
test is applied. Fixing the significance $\alpha$ we will use the appropriate
distribution (i.e. the standard normal or the Student's t, depending on whether
$\sigma^2$ is known or not) to determine the two critical values of the test
statistic, which correspond to a probability $\tfrac{1}{2}\alpha$ in either tail
of the distribution. If it is found that the observed value of the test
statistic falls in any of the two tails (i.e. it belongs to the critical region)
then we shall reject the hypothesis $H_0: \mu = \mu_0$ at the significance level
$100\alpha$ %; otherwise the null hypothesis is accepted.

A similar procedure is applicable if we want to test the variance of the normal
distribution. The null hypothesis is

$$H_0:\ \sigma^2 = \sigma_0^2, \tag{14.33}$$

where $\sigma_0^2$ is specified, and the alternative (composite) hypothesis can
be

$$H_1:\ \sigma^2 \neq \sigma_0^2. \tag{14.34}$$

The technique developed earlier (see Sects.7.3.1 and 7.3.2) for establishing a
confidence interval for the variance of a normal distribution can now be used.
We recall that a variable constructed from the sample variance $s^2$ as
$(n-1)s^2/\sigma^2$ will have a chi-square distribution with $n-1$ degrees of
freedom. If we should happen to know what the true mean $\mu$ of the population
is we may replace $\bar{x}$ by $\mu$ in the evaluation of $s^2$, and the
resulting chi-square variable will then have $n$ degrees of freedom. Thus the
appropriate test statistic to be used for testing $H_0: \sigma^2 = \sigma_0^2$
is

$$\sum_{i=1}^{n}(x_i-\mu)^2/\sigma_0^2, \quad \text{which is } \chi^2(n) \qquad (\mu \text{ known}),$$

$$(n-1)s^2/\sigma_0^2 = \sum_{i=1}^{n}(x_i-\bar{x})^2/\sigma_0^2, \quad \text{which is } \chi^2(n-1) \qquad (\mu \text{ unknown}). \tag{14.35}$$

In testing the variance we are likely to reject $H_0$ (and accept $H_1$) if
$s^2$ from the measurements is either much too small or much too large compared
to the value $\sigma_0^2$ specified by $H_0$. Therefore, again, a two-sided test
is appropriate. Since the chi-square distribution is unsymmetric the choice of
the two critical regions is not unambiguous, but common practice is to take
equal probabilities ($=\tfrac{1}{2}\alpha$) at both ends of the distribution.

In general, two-sided tests for the null hypothesis $H_0$ will always be
appropriate whenever the alternative hypotheses are of the forms of
eqs.(14.31), (14.34). If the alternatives are formulated as, for example,

$$H_1:\ \mu > \mu_0 \quad \text{or} \quad H_1:\ \sigma^2 > \sigma_0^2,$$

one-sided tests must be applied to test $H_0$.

14.3.2 Comparison of means in two normal distributions

We shall assume that $x_1,x_2,\ldots,x_n$ is a sample of size $n$ from
$N(\mu_1,\sigma_1^2)$ and $y_1,y_2,\ldots,y_m$ a sample of size $m$ from
$N(\mu_2,\sigma_2^2)$. We want to test the hypothesis that the two normal
distributions have the same mean,

$$H_0:\ \mu_1 = \mu_2,$$

against the alternative that the means are different

$$H_1:\ \mu_1 \neq \mu_2. \tag{14.37}$$
The formalism covers situations frequently met in practice when it is desired to check whether two series of numbers really are measurements on the same physical quantity. Although simply stated, the null hypothesis is not always easily tested unless some simplifying assumptions can be made about the variances $\sigma_1^2$ and $\sigma_2^2$. We shall consider the following situations:

    (i)   $\sigma_1^2$ and $\sigma_2^2$ are known,
    (ii)  $\sigma_1^2$ and $\sigma_2^2$ are unknown, but equal,
    (iii) $\sigma_1^2$ and $\sigma_2^2$ are unknown and different.

(i) $\sigma_1^2$ and $\sigma_2^2$ are known

We know that the arithmetic means $\bar{x}$ and $\bar{y}$ will be normally distributed as $N(\mu_1,\sigma_1^2/n)$ and $N(\mu_2,\sigma_2^2/m)$, respectively. According to the addition theorem for normal variables (Sect.4.8.5), the difference $\bar{x}-\bar{y}$ will also be normal, with mean $(\mu_1-\mu_2)$ and variance $(\sigma_1^2/n+\sigma_2^2/m)$. Thus the variable

    $\frac{(\bar{x}-\bar{y})-(\mu_1-\mu_2)}{\sqrt{\sigma_1^2/n+\sigma_2^2/m}}$

will be N(0,1). To test the null hypothesis of equal means, when $\mu_1-\mu_2=0$, we consequently use the test statistic

    $\frac{\bar{x}-\bar{y}}{\sqrt{\sigma_1^2/n+\sigma_2^2/m}}$, which is N(0,1) under $H_0$.    (14.38)

The problem is therefore reduced to a well-known one, being exactly analogous to the testing of the mean of a normal distribution when its variance is known, which was discussed in the preceding section.

(ii) $\sigma_1^2$ and $\sigma_2^2$ are unknown, but equal

If the variances of the populations are not known it is still possible to carry out the desired test, provided that it can be assumed that the two variances are of equal size. To construct a test statistic for this case we recall that with the two samples $x_1, x_2, \ldots, x_n$ from $N(\mu_1,\sigma_1^2)$ and $y_1, y_2, \ldots, y_m$ from $N(\mu_2,\sigma_2^2)$ we can immediately write down a set of four variables which are all independent and have well-known distributions:

    $\bar{x}$, which is $N(\mu_1,\sigma_1^2/n)$,
    $\bar{y}$, which is $N(\mu_2,\sigma_2^2/m)$,
    $(n-1)s_1^2/\sigma_1^2$, which is $\chi^2(n-1)$,
    $(m-1)s_2^2/\sigma_2^2$, which is $\chi^2(m-1)$.

These variables are the entities from which we must try to build up a new variable that may serve as a test statistic. The two normal variables can be combined in the usual way such that

    $\frac{(\bar{x}-\bar{y})-(\mu_1-\mu_2)}{\sqrt{\sigma_1^2/n+\sigma_2^2/m}}$ is N(0,1).

Adding the two chi-square variables gives a new chi-square variable with a number of degrees of freedom equal to (n-1)+(m-1) = n+m-2 (the addition theorem for chi-square variables, Sect.5.1.5), thus

    $(n-1)s_1^2/\sigma_1^2 + (m-1)s_2^2/\sigma_2^2$ is $\chi^2(n+m-2)$.

The standard normal and the chi-square variables constructed this way are furthermore independent, and according to Sect.5.2.1 they can be used to form a Student's t-variable by the prescription of eq.(5.14). Hence

    $\frac{(\bar{x}-\bar{y})-(\mu_1-\mu_2)}{\sqrt{\sigma_1^2/n+\sigma_2^2/m}} \Big/ \sqrt{\frac{(n-1)s_1^2/\sigma_1^2+(m-1)s_2^2/\sigma_2^2}{n+m-2}}$    (14.39)

will be a Student's t-variable with (n+m-2) degrees of freedom.

Note that the complicated expression (14.39) is no test statistic yet, since it includes the parameters $\sigma_1^2$ and $\sigma_2^2$, which were assumed to be unknown. However, if $\sigma_1^2 = \sigma_2^2$, the expression can be simplified further since the population variances drop out, giving

    $\frac{(\bar{x}-\bar{y})-(\mu_1-\mu_2)}{s\sqrt{1/n+1/m}}$,    (14.40)
where $s^2$ is the pooled estimate of the population variance $\sigma^2$ ($=\sigma_1^2=\sigma_2^2$),

    $s^2 = \frac{(n-1)s_1^2+(m-1)s_2^2}{n+m-2} = \frac{1}{n+m-2}\left(\sum_{i=1}^{n}(x_i-\bar{x})^2+\sum_{i=1}^{m}(y_i-\bar{y})^2\right)$.    (14.41)

Therefore, to test the hypothesis of equal means in the case of unknown, but equal variances, we shall take as our test statistic

    $\frac{\bar{x}-\bar{y}}{s\sqrt{1/n+1/m}}$,    (14.42)

which is t(n+m-2). Thus the problem is analogous to the one discussed previously in Sect.14.3.1, when we tested the mean of a normal distribution with an unknown variance.

It is seen from this rather lengthy derivation that unless the two unknown population variances $\sigma_1^2$ and $\sigma_2^2$ are equal, we shall in general not be able to obtain a variable for which we know the explicit distribution. The general case $\sigma_1^2 \neq \sigma_2^2$ can in fact not be treated exactly, and approximate methods must be applied.

(iii) $\sigma_1^2$ and $\sigma_2^2$ are unknown, and different

Given two samples from assumed normal distributions, one does not usually have any exact knowledge about the population variances $\sigma_1^2$ and $\sigma_2^2$, nor does one have any particular reason to believe that they are of equal size. Thus the conditions specified in (i) or (ii) above are not fulfilled in general.

When the samples are reasonably large, it is fair to say that the sample variances are good estimates of the population variances. Replacing $\sigma_1^2$, $\sigma_2^2$ by their estimates $s_1^2$, $s_2^2$ in the standard normal variable of case (i) gives

    $\frac{(\bar{x}-\bar{y})-(\mu_1-\mu_2)}{\sqrt{s_1^2/n+s_2^2/m}}$.

To test the hypothesis $H_0: \mu_1=\mu_2$ one will therefore in this situation use the test statistic

    $\frac{\bar{x}-\bar{y}}{\sqrt{s_1^2/n+s_2^2/m}}$,

which is approximately standard normal if $H_0$ is true.

The approach described above can be used for the practical purpose of checking consistency between two sets of observations. If the first set $x_1, x_2, \ldots, x_n$ implies an experimental result $(\bar{x} \pm \Delta\bar{x})$ and the second set $y_1, y_2, \ldots, y_m$ similarly $(\bar{y} \pm \Delta\bar{y})$, a test for the hypothesis that the two experiments have measured the same physical quantity (i.e. $\mu_1=\mu_2$) can be based on the test statistic

    $\frac{\bar{x}-\bar{y}}{\sqrt{(\Delta\bar{x})^2+(\Delta\bar{y})^2}}$,

which is assumed to be N(0,1). In fact, this is the prescription used by most physicists to test if two experimental results are compatible. Thus the assumption of normally distributed observations is implicit whenever the above prescription is used, although this is seldom explicitly stated in practice.

Exercise 14.3: Experimental values (as of April 1976) for the $\Xi^0$ and $\Xi^-$ lifetimes are, respectively,

    $\tau_0 = (2.96 \pm 0.12) \times 10^{-10}$ sec,
    $\tau_- = (1.652 \pm 0.023) \times 10^{-10}$ sec.

Show that the null hypothesis $H_0: \tau_0 = 2\tau_-$ (prediction of the $\Delta I = \frac{1}{2}$ rule) can be accepted at a significance level of 0.5% against the alternative hypothesis $H_1: \tau_0 \neq 2\tau_-$.

14.3.3 Comparison of variances in two normal distributions

For the two normal samples, $x_1, x_2, \ldots, x_n$ from $N(\mu_1,\sigma_1^2)$ and $y_1, y_2, \ldots, y_m$ from $N(\mu_2,\sigma_2^2)$, we can also be interested in testing the hypothesis of equal variances,

    $H_0: \sigma_1^2 = \sigma_2^2$,    (14.46)

against the alternative

    $H_1: \sigma_1^2 > \sigma_2^2$.    (14.47)

For the moment we assume that both $\mu_1$ and $\mu_2$ are unknown. Since we know that the population variances are estimated by the sample variances it is reasonable to try to establish a test statistic using the by now well-known properties that

    $(n-1)s_1^2/\sigma_1^2$ is $\chi^2(n-1)$,
    $(m-1)s_2^2/\sigma_2^2$ is $\chi^2(m-1)$.
With two independent chi-square variables it is natural to construct an F-variable by dividing each of them by its number of degrees of freedom and then take their ratio,

    $\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$,

which, according to the definition in Sect.5.3.1, gives an F-variable with (n-1,m-1) degrees of freedom. For the test of $H_0$ against $H_1$ we shall therefore use the simple test statistic

    $F = \frac{s_1^2}{s_2^2} = \frac{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}{\frac{1}{m-1}\sum_{i=1}^{m}(y_i-\bar{y})^2}$    (14.49)

and take the critical region at the upper tail of the appropriate F-distribution. Conversely, if the alternative were

    $H_1: \sigma_2^2 > \sigma_1^2$,

we would take the critical region in the lower part of the distribution.

If the population means $\mu_1$ and $\mu_2$ were known one could reason in a similar manner; it is readily seen that the appropriate test statistic in this case is obtained by replacing in eq.(14.49) the sample averages $\bar{x}$ and $\bar{y}$ by the population means $\mu_1$ and $\mu_2$, giving

    $F = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu_1)^2}{\frac{1}{m}\sum_{i=1}^{m}(y_i-\mu_2)^2}$.

This statistic will be distributed as an F-variable with (n,m) degrees of freedom.

14.3.4 Summary table

Table 14.1 summarizes relevant details about different tests involving the mean and variance of normal distributions under various specified conditions discussed in the previous sections and exemplified in the following.

Table 14.1. Tests for the mean and variance of normal distributions

    H0                               Conditions                                      Test statistic                                                       Test distribution
    $\mu = \mu_0$                    $\sigma^2$ known                                $(\bar{x}-\mu_0)/(\sigma/\sqrt{n})$                                  N(0,1)
    $\mu = \mu_0$                    $\sigma^2$ unknown                              $(\bar{x}-\mu_0)/(s/\sqrt{n})$                                       t(n-1)
    $\sigma^2 = \sigma_0^2$          $\mu$ known                                     $\sum_i (x_i-\mu)^2/\sigma_0^2$                                      $\chi^2(n)$
    $\sigma^2 = \sigma_0^2$          $\mu$ unknown                                   $(n-1)s^2/\sigma_0^2$                                                $\chi^2(n-1)$
    $\mu_1 = \mu_2$                  $\sigma_1^2$ and $\sigma_2^2$ known             $(\bar{x}-\bar{y})/\sqrt{\sigma_1^2/n+\sigma_2^2/m}$                 N(0,1)
    $\mu_1 = \mu_2$                  $\sigma_1^2 = \sigma_2^2$, unknown              $(\bar{x}-\bar{y})/(s\sqrt{1/n+1/m})$                                t(n+m-2)
    $\mu_1 = \mu_2$                  $\sigma_1^2 \neq \sigma_2^2$, unknown           $(\bar{x}-\bar{y})/\sqrt{s_1^2/n+s_2^2/m}$                           not exactly known, approximately N(0,1)
    $\sigma_1^2 = \sigma_2^2$        $\mu_1$ and $\mu_2$ unknown                     $s_1^2/s_2^2$                                                        F(n-1,m-1)
    $\sigma_1^2 = \sigma_2^2$        $\mu_1$ and $\mu_2$ known                       $[\sum_i (x_i-\mu_1)^2/n]/[\sum_i (y_i-\mu_2)^2/m]$                  F(n,m)
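The statistics collected in Table 14.1 are simple enough to be sketched in a few lines of Python. The function names and the two small data sets below are illustrative assumptions and do not come from the text. Only the standard library is used, so critical values of the t, chi-square and F distributions must still be read from tables, as is done throughout this section; only the normal critical value is computed here.

```python
import math
from statistics import NormalDist, mean, variance  # variance() divides by n-1


def mean_statistic(x, mu0, sigma=None):
    """(xbar - mu0)/(sigma/sqrt(n)) if sigma is known [N(0,1)],
    otherwise (xbar - mu0)/(s/sqrt(n)) [t(n-1)]."""
    spread = sigma if sigma is not None else math.sqrt(variance(x))
    return (mean(x) - mu0) / (spread / math.sqrt(len(x)))


def variance_statistic(x, sigma0_sq, mu=None):
    """sum (x_i - mu)^2 / sigma0^2 if mu is known [chi2(n)],
    otherwise (n-1) s^2 / sigma0^2 [chi2(n-1)]."""
    centre = mu if mu is not None else mean(x)
    return sum((xi - centre) ** 2 for xi in x) / sigma0_sq


def two_sample_mean_statistic(x, y, var_x=None, var_y=None, pooled=False):
    """Cases (i), (ii) and (iii) of the comparison of two means."""
    n, m = len(x), len(y)
    diff = mean(x) - mean(y)
    if var_x is not None and var_y is not None:      # (i): N(0,1)
        return diff / math.sqrt(var_x / n + var_y / m)
    s1_sq, s2_sq = variance(x), variance(y)
    if pooled:                                       # (ii): t(n+m-2)
        s_sq = ((n - 1) * s1_sq + (m - 1) * s2_sq) / (n + m - 2)
        return diff / math.sqrt(s_sq * (1 / n + 1 / m))
    return diff / math.sqrt(s1_sq / n + s2_sq / m)   # (iii): approx. N(0,1)


def variance_ratio_statistic(x, y):
    """s1^2 / s2^2, an F(n-1, m-1) variable under H0 (means unknown)."""
    return variance(x) / variance(y)


def two_sided_normal_critical_value(alpha):
    """Critical value leaving alpha/2 in each tail of N(0,1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Invented data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0]
z_known = two_sample_mean_statistic(x, y, var_x=1.0, var_y=4.0)
t_pooled = two_sample_mean_statistic(x, y, pooled=True)
z_approx = two_sample_mean_statistic(x, y)
f_ratio = variance_ratio_statistic(x, y)
z_crit = two_sided_normal_critical_value(0.05)
```

The observed statistic is then compared with the critical value of the distribution named in the last column of the table; for the two N(0,1) cases this is `z_crit`, and for the remaining cases the tabulated value.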
14.3.5 Example: Comparison of results from two different measuring machines

As a first example on tests involving parameters of normal distributions, suppose that we want to compare measurements on beam tracks in a bubble chamber from two different measuring devices with unspecified precisions. For definiteness we assume that a monoenergetic beam of momentum $P_0 = 24.90$ GeV/c is incident in the bubble chamber, and that the multiple Coulomb scattering on the tracks can be considered negligible. Then the error on the measured track momentum will be due solely to the inaccuracy of the measuring device. From the geometrical reconstruction method applied to the measured tracks one expects that it is the inverse of the momentum - rather than the momentum itself - which is an approximate normal variable. Suppose that 20 tracks have been measured with the two machines A and B, giving the mean values and standard deviations for the inverse momentum 1/P as displayed in the following table.

    Machine    $\overline{(1/P)} = \frac{1}{20}\sum_{i=1}^{20} 1/P_i$    s
               in units $10^{-3}$ (GeV/c)$^{-1}$                         in units $10^{-3}$ (GeV/c)$^{-1}$
    A          40.12                                                     0.46
    B          40.32                                                     0.25

Let us first check the supposition that each machine really measures the incident momentum, which corresponds to $1/P_0 = 40.16 \cdot 10^{-3}$ (GeV/c)$^{-1}$. For each machine this implies a test on the mean value of a normal distribution, with the condition that the variance is unknown. With the formulation of the present chapter we write

    $H_0: \mu = 1/P_0$,    $H_1: \mu \neq 1/P_0$.

According to Sect.14.3.1 the appropriate test statistic is

    $t = \frac{\overline{(1/P)} - 1/P_0}{s/\sqrt{20}}$,

which has a Student's t-distribution with 19 degrees of freedom. If we choose a significance of $\alpha = 0.05$ we find from Appendix Table A7 that the critical region for the two-sided test with equal probabilities $\frac{1}{2}\alpha$ in the two tails is given by

    $|t| > t_{.025} = 2.09$.

From the observed numbers in the table above we find that the observations with A and B correspond to

    $t^A_{obs} = -0.40$,    $t^B_{obs} = +1.60$.

Hence both series of observations produce values of t in the acceptance region, and there is no reason to suspect the measurements for either machine.

Secondly, we want to test the hypothesis that the two measuring machines have the same precision. This amounts to testing whether the variances $\sigma_A^2$ and $\sigma_B^2$ are equal,

    $H_0: \sigma_A^2 = \sigma_B^2$.

The alternative hypothesis may state that A is less precise than B,

    $H_1: \sigma_A^2 > \sigma_B^2$.

In this case we know the mean values of the two (normal) distributions, $\mu_A = \mu_B = 1/P_0$, and, in accordance with Sect.14.3.3, we form the test statistic

    $F = \frac{\frac{1}{20}\sum_{i=1}^{20}(1/P_i - 1/P_0)_A^2}{\frac{1}{20}\sum_{i=1}^{20}(1/P_i - 1/P_0)_B^2}$,

which has an F-distribution with (20,20) degrees of freedom. Fixing the significance level at 5% the critical region corresponding to a one-sided test is given by, according to Appendix Table A8,

    $F > F_{.95} = 2.16$.

From the observed numbers we have the actual number $F_{obs}$, which falls within the critical region. Hence we must reject the hypothesis of equal precisions at the chosen significance level. Indeed, we find that there is a probability of less than 0.5% that a larger ratio F would be observed, if $H_0$ were true. Hence we are likely to accept the alternative hypothesis in this case, and conclude that machine A is less precise than machine B.

14.3.6 Example: Significance of signal above background

It often happens that an experimental set-up to study a particular effect implies the registration of various background effects which tend to obscure the "real" effect under study. Whenever the background becomes appreciable compared to the total signal it will be increasingly difficult to decide whether an apparent effect is real or just represents a statistical fluctuation of the background. We are therefore interested in finding criteria for assigning "reality" to various phenomena and do so, for example, by attaching a significance level to the observed enhancement in an effective-mass distribution or the dip seen in a four-momentum-transfer spectrum, etc.

To be specific, let us consider the effective-mass histogram in Fig.14.5, where an accumulation of events is observed centered at a mass value $M=M_0$ and extending over a few bins. We ask the following questions:

    (1) What is the probability that the observed effect at the particular mass value ($M=M_0$) is a statistical fluctuation of the background?
    (2) What is the probability to observe such a fluctuation at any position in the spectrum?

Fig. 14.5. Experimental effective-mass plot with enhancement

To answer question (1), let the total number of events in the interval $[M_1, M_2]$ ("the bump region") be denoted by N and the number of background events in the same region be denoted by B. If we assume that all events in the bump region are due to the background only, we will formulate a null hypothesis as

    $H_0: E(N) = B$.

It sometimes happens that the background distribution is known from some theory or model. In this fortunate situation B and its variance V(B) can be calculated directly. More often the background is not known, but is approximated by a polynomial of appropriate order and fitted to the observed spectrum on both sides of the peak or bump. Another procedure, which may be equally well justified, is to estimate the number B by first forgetting about the bump region and drawing smooth curves through the data points on each side of the bump, and then finally interpolate over the bump region. The error in such an eye-estimate of B may be taken as half the difference between the largest and the smallest "reasonable" values of the estimated number B of background events.

The total number of events N in the bump region may be regarded as a Poisson variable. Under the assumption that the enhancement is a fluctuation, equivalent to assuming $H_0$ true, the variance of the total number is then

    $V(N) = B$.

If B has been estimated by ignoring the enhancement region, as for instance from a polynomial fit or by the hand-drawn curves as described above, it is correct to regard N and B as uncorrelated variables. Then the variance on their difference becomes

    $V(N-B) = V(N) + V(B) = B + V(B)$.
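As a numerical illustration of this variance, the following minimal sketch expresses the excess N - B in units of its standard deviation. The counts N = 130, B = 100 and V(B) = 25 are invented for illustration, and the function names are assumptions made here, not notation from the text.

```python
import math


def excess_variance(background, background_variance):
    """V(N - B) = V(N) + V(B), with V(N) = B for a Poisson N under H0
    and with N and B taken as uncorrelated."""
    return background + background_variance


def excess_in_standard_deviations(n_observed, background, background_variance):
    """Excess of events above background divided by its standard deviation."""
    return (n_observed - background) / math.sqrt(
        excess_variance(background, background_variance)
    )


# Invented counts, for illustration only.
with_background_error = excess_in_standard_deviations(130, 100, 25.0)
without_background_error = excess_in_standard_deviations(130, 100, 0.0)
```

With these numbers the excess corresponds to 3 standard deviations when the background is taken as exactly known, but to fewer once its uncertainty is included, which shows how V(B) reduces the apparent size of an effect.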
If the number of events is not too small, it is further justified to approximate the Poisson variable N to a normal variable. To test the hypothesis $H_0$ it is therefore suggestive to adopt the test statistic

    $d = \frac{N-B}{\sqrt{B+V(B)}}$,    (14.54)

which will be an approximate standard normal variable if $H_0$ is true. The test problem is therefore analogous to that of testing the mean of a normal distribution, which was considered in Sect.14.3.1.

We are now in a position to state an answer to our question (1): The probability of observing a positive fluctuation at least as large as the bump observed around the mass value $M=M_0$ is simply given by

    $P(d; M=M_0) = 1 - G(d)$,

where G has the usual meaning of the cumulative standard normal distribution.

It is seen that d measures the excess of events above the background in units of standard deviations. Common practice is to express the significance of an enhancement by quoting the number of standard deviations as defined by eq.(14.54). Evidently, small values of d are compatible with the assumption that the background is responsible for the bump, while very large values of d may be regarded as support for an alternative interpretation, for instance the presence of a resonance.

It should be stressed that it is very important in determining the significance of an enhancement to take into account the uncertainty in the background estimate. It is seen that the term V(B) in the denominator of eq.(14.54) tends to lower the number of standard deviations, that is, it reduces the significance of the peak. Since, for given total number of events N and estimated background B, a broad accumulation covering many bins will usually have a larger V(B) than has a narrow peak, it will in general be more difficult to detect a broad resonance than a narrow one.

We proceed next to answer the question (2) raised earlier. We assume that we have already determined the probability $P(d; M=M_0)$ that the d standard deviation effect at the particular mass value $M=M_0$ is a statistical fluctuation and that we want to find the probability P(d) that a bump of d standard deviations appears at any place in the histogram. If the bump extends over k bins and the total number of bins in the histogram is n, the central value of the bump can be located in (n-k+1) different bins over the plot. Therefore the probability to observe a fluctuation of at least d standard deviations anywhere in the histogram is

    $P(d) = 1 - (1 - P(d; M=M_0))^{n-k+1}$.

If d is large this may be written as

    $P(d) \approx (n-k+1)\,P(d; M=M_0)$.

Thus, the probability to observe a highly significant peak of d standard deviations anywhere in an effective-mass plot increases with the number of bins in the plot and decreases with the width of the peak. For reactions with many particles in the final state where many mass combinations can be formed, the chance for observing large effects somewhere in any of the effective-mass distributions may certainly become quite appreciable.

To exemplify the last point, let us estimate the number of "many-$\sigma$ fluctuations" expected per year in effective-mass plots from bubble chamber experiments performed all over the world. The following numbers apply to the year 1970:

    total number of events measured                  = $2 \cdot 10^6$,
    average number of mass combinations per event    = 15,
    number of combinations in each histogram         = 3000,
    number of bins per histogram                     = 40.

This gives an estimated number of bins per year equal to

    $\frac{2 \cdot 10^6 \cdot 15}{3000} \cdot 40 = 4 \cdot 10^5$.

Since the probability of a fluctuation of minimum $4\sigma$ is $3.2 \cdot 10^{-5}$ in any of these bins we expect a total of $\approx 13$ occurrences per year of "effects" of at least 4 standard deviations in magnitude. This example illustrates why it has become customary to require five or more standard deviations to claim any enhancement in an effective-mass distribution as evidence for a new resonance.

Exercise 14.4: In an experiment to search for and study the rate of elastic antineutrino-electron scattering, viz.

    (1)    $\bar{\nu}_e + e^- \to \bar{\nu}_e + e^-$

Reines, Gurr and Sobel have utilized low-energy antineutrinos from a nuclear reactor. Evidence for the reaction would mainly come from a difference in the counting rates observed when the reactor was ON and when it was OFF. For different electron energy intervals one registered the counts over periods of many days. The observed counting rates with the reactor ON and OFF are given in the table below (columns II and III), together with the estimated errors due to instrumental instabilities in the runs (column IV):

    I            II                  III                 IV            V                 VI
    Electron     Counts/day          Counts/day          Stability     Counts/day        Standard
    energy       (over 64.6 days)    (over 60.7 days)    error         associated        deviations
    (MeV)        with reactor ON     with reactor OFF    in any run    with reactor

    1.5 - 2.0    30.6 ± .69          26.9 ± .67          ± .60         3.7 ± 1.28        2.89
    2.0 - 2.5    10.5 ±                                  ± .38

(i) Assume the number of counts per day to be Poisson variables and fill into columns II and III the errors (standard deviations) on the average numbers of events per day. (The errors for the first energy bin are given for check.)
(ii) Find the difference in the observed counting rates with the reactor ON and with the reactor OFF and determine the error on these differences (column V). (Hint: Assume the instrumental instability to be independent of the number of counts.)
(iii) In accordance with common practice among physicists, express the size of the observed reactor associated rates in units of standard deviations (column VI).
(iv) At a chosen significance level of 0.1%, do the reactor associated rates in the separate energy bins support the hypothesis of a real effect in any bin? (Hint: Assume the reactor associated rate to be a normal variable.)
(v) If the data are lumped into one group covering all energy bins, what is the combined evidence for a real signal of reaction (1)?

14.3.7 Comparison of means in N normal distributions; scale factor

We saw in Sect.14.3.2 how the mean values of two normal distributions could be tested for equality by construction of an appropriate test statistic, which was exactly or approximately N(0,1), or a Student's t-variable, depending on whether the variances of the two normal distributions were known or not. These methods can therefore be used to test whether two experimental results are mutually consistent provided, of course, that the underlying assumption of normally distributed observations is reasonably satisfied.

It is frequently desired to check the internal consistency between more than two experimental results. Suppose, for example, that there are N experiments, each reporting an observed value $x_i$ with error $\Delta x_i$. We want to find out whether these N measurements*) are compatible within the errors. This we formulate as a test problem where the null hypothesis under test assumes that the $x_i$ come from normal populations which all have the same mean value,

    $H_0: \mu_1 = \mu_2 = \ldots = \mu_N$.    (14.57)

The alternative hypothesis is any other possibility, where some of the means are not equal to the others, corresponding to bias in some of the experiments.

If the null hypothesis had specified the value $\mu$ of the common population mean, and also the standard deviations $\sigma_i$ of the individual distributions, then the quantity $\sum_{i=1}^{N}(x_i-\mu)^2/\sigma_i^2$ would by definition be a chi-square variable with N degrees of freedom and an appropriate test statistic. However, since $H_0$ does not specify the common population mean it must be estimated from the data. An estimate for $\mu$ is the weighted mean value for all N measurements,

    $\bar{x} = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}$,    (14.58)

where the weight of the i-th observation is taken equal to the inverse of the square of its error, $w_i = 1/(\Delta x_i)^2$. An estimate for the error in $\bar{x}$ is

    $\Delta\bar{x} = \left(\sum_{i=1}^{N} w_i\right)^{-1/2}$.    (14.59)

*) In practice, each reported value $x_i$ will often be an average value from a series of measurements, and the $\Delta x_i$ the corresponding error in this average.
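The weighted mean of eq.(14.58) and its error from eq.(14.59) can be sketched as follows. This is a minimal illustration, assuming uncorrelated measurements with non-zero errors; the function name and the two sample measurements are invented here and are not taken from the text.

```python
import math


def weighted_mean(values, errors):
    """Return (weighted mean, error of the mean) with weights w_i = 1/(dx_i)^2."""
    weights = [1.0 / err ** 2 for err in errors]
    total_weight = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total_weight
    return mean, 1.0 / math.sqrt(total_weight)


# Two invented measurements, 10.0 +- 1.0 and 12.0 +- 2.0.
combined_value, combined_error = weighted_mean([10.0, 12.0], [1.0, 2.0])
```

The more precise measurement dominates the average, and the combined error is smaller than the smallest individual error.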
(For Ho t r u e and normally d i s t r i b u t e d o b s e r v a t i o n s t h e s e are the Harimm-Likeli- p h y s i c s become customary t o r e t a i n t h e averaging of the weighted o b s e r v a t i o n s
hood e s t i m a t e s of the p o p u l a t i o n parameters, given by eqs.(9.11).(9.12), when but to i n t r o d u c e a scale f a c t o ~t o i n c r e a s e t h e e r r o r s . Since one u s u a l l y does
I
t h e 0. are approximated by t h e observed errors Axi.) ,,.t know which, i f any, of t h e measurements or experiments may be vrong one
I I f Ho i s t r u e we e x p e c t t h a t t h e weighted sum of t h e squared devia- -kes t h e r a t h e r a r b i t r a r y assvmption t h a t a l l experiments have u r d e r e s t i m a t e d
t i o n s from t h e weighted mean v a l u e errors. and i n the same p r o p o r t i o n , so t h a t a l l errors shovld be a d j u s t e d
by some comon f a c t o r S. The P a r t i c l e Data Group accordingly d e f i n e S

w i l l be a n approximate chi-square v a r i a b l e v i t h N-1 degrees of freedom; hence X2


!
i s a s u g g e s t i v e t e s t s t a t i s t i c f o r t h e h y p o t h e s i s of e q u a l p o p u l a t i o n means. If me above procedure i s used f o r the c a l c u l a t i o n of s i f a l l experiments have
t h e p o p u l a t i o n means are n o t e q u a l
o b t a i n v a l u e s of x2
(i.e. i f Ho is n o t t r u e ) we are l i k e l y t o
which are, on the average, t o o l a r g e .
errors of about the same s i z e . The new xibs obtained v i t h t h e a d j u s t e d errors
Consequently we s h a l l If the experiments have widely v a r y i n g errors
.ill then of course be j u s t N-1.
use a one-sided t e s t f o r e q u a l i t y of t h e means, with c r i t i c a l r e g i o n a t the upper
the P a r t i c l e Data Group recornend a s l i g h t m o d i f i c a t i o n which c o n s i s t s i n d i s r e -
t a i l of t h e x 2 ( ~ - 1 ) d i s t r i b u t i o n .
garding the l e s s p r e c i s e experiments i n the e v a l u a t i o n of S, thereby o b t a i n i n g a
Whenever t h e observed v a l u e xibs
as c a l c u l a t e d from eq.cl4.60) comes
,,ale f a c t o r which i s s e n s i t i v e t o t h e most p r e c i s e measurements.
out with a magnitude comparable to the mean value $N-1$ for the $\chi^2(N-1)$ distribution, this is taken as support for the assumption of consistent measurements. One will then adopt the weighted mean value $\bar{x}$ and the error $\Delta\bar{x}$ in this quantity from eqs.(14.58),(14.59) as the best estimates for the true value $\mu$ and its error $\sigma$.

If the $\chi^2$ test on the mean values fails at the chosen significance level, i.e. $\chi^2_{\rm obs}$ exceeds the critical value $\chi^2_\alpha(N-1)$, the reason can sometimes be that one, or a few, of the measurements deviate strongly from all others. Such deviations are conveniently detected by plotting the data in an ideogram as described in Sect.6.2.5. One assigns to each measurement a normalized Gaussian curve of mean $x_i$ and width $\Delta x_i$ and sums all the curves. If the resulting envelope has a single fairly smooth peak centered at $\bar{x}$ the measurements are probably reasonably consistent, but if some secondary peak shows up clearly separated from the peak at $\bar{x}$ this indicates that the measurements responsible for the secondary peak are incompatible with the rest. If no natural explanation can be found for the odd behaviour it is recommended to reject the deviating measurements and repeat the calculations with the reduced number of observations.

Note that the scaling procedure for errors in no way affects the weighted average value $\bar{x}$ from eq.(14.58), but only amounts to increasing the error in this quantity by the factor $S$ compared to the unscaled error from eq.(14.59).

It should also be stressed that it is of great importance that the individual error estimates $\Delta x_i$ have been checked before they are used to calculate a weighted mean from eq.(14.58) and a numerical value for the test statistic of eq.(14.60). For instance, it is recommended to make sure whenever possible that these errors are not smaller than those implied by the minimum variance bound.

Exercise 14.5: Five bubble chamber experiments have estimated the mass of the $\Omega^-$ hyperon as follows: (1673.0 ± 8.0) MeV, (1673.3 ± 1.0) MeV, (1671.8 ± 0.8) MeV, (1674.2 ± 1.6) MeV, (1671.9 ± 1.2) MeV. Are these results consistent at a significance level of 5%? An old measurement using nuclear emulsions estimated the mass of a negative hyperon at (1620 ± 25) MeV. Is this result consistent with the more recent bubble chamber data for the $\Omega^-$?
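The consistency check described above can be sketched numerically for the five bubble chamber values of Exercise 14.5. This is only an illustration: eqs.(14.58)-(14.60) are not reproduced in this section, so the code assumes the usual forms (weights $1/\Delta x_i^2$ for the mean, and a sum of squared normalized deviations for the test statistic). The function names are hypothetical, and the 5% critical value for 4 degrees of freedom is taken from standard chi-square tables.

```python
import math

# Omega^- mass measurements of Exercise 14.5: (value, error) in MeV
measurements = [(1673.0, 8.0), (1673.3, 1.0), (1671.8, 0.8),
                (1674.2, 1.6), (1671.9, 1.2)]


def weighted_mean(data):
    """Weighted mean and its error, with weights 1/error^2
    (assumed form of eqs. 14.58 and 14.59)."""
    weights = [1.0 / err ** 2 for _, err in data]
    total = sum(weights)
    mean = sum(w * x for w, (x, _) in zip(weights, data)) / total
    return mean, 1.0 / math.sqrt(total)


def chi2_about_mean(data, mean):
    """Sum of squared normalized deviations from the weighted mean
    (assumed form of the test statistic, eq. 14.60)."""
    return sum(((x - mean) / err) ** 2 for x, err in data)


mean, mean_error = weighted_mean(measurements)
chi2_obs = chi2_about_mean(measurements, mean)
ndf = len(measurements) - 1

# Standard tabulated 5% point of the chi-square distribution with 4 d.o.f.
CHI2_CRIT_5PCT_4DOF = 9.488
consistent = chi2_obs < CHI2_CRIT_5PCT_4DOF

print(f"mean = {mean:.2f} +- {mean_error:.2f} MeV")
print(f"chi2 = {chi2_obs:.2f} for {ndf} degrees of freedom")
print("consistent at the 5% level:", consistent)
```

The same two functions can be reused for the second part of the exercise by comparing the emulsion value with the weighted mean obtained here.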
When $\chi^2_{\rm obs}$ is large compared to the expectation value, but not so large that the hypothesis of equal mean values must be abandoned, it has in particle

Exercise 14.6: Six different experiments have measured the complex CP violation parameter $\eta_{+-0}$ in the decay $K^0 \to \pi^+\pi^-\pi^0$. The values reported for the real and imaginary parts of this parameter are the following:
    Experiment        Re $\eta_{+-0}$        Im $\eta_{+-0}$

Considering the real and the imaginary parts of the parameter separately, are the results from the different experiments consistent at the 5% level? What are the scale factors for the two series of observations using all measurements? What would the scale factors be if the last experiment was disregarded?

Exercise 14.7: The mean lifetime $\tau$ of the $\pi^0$ meson has been measured by 11 experiments, six of which used nuclear emulsions as detectors and five used counter technique. The following data have been reported:

    Nuclear emulsion technique                              Counter technique
    $\tau$ (in units of $10^{-16}$ sec)   Number of events    $\tau$ (in units of $10^{-16}$ sec)
    1.9 ± 0.5                         76                  1.05 ± 0.18
    2.3 ± 1.1                         45                  0.730 ± 0.105
    2.8 ± 0.9                         88                  0.6 ± 0.2
    1.7 ± 0.5                         75                  0.56 ± 0.06
    1.6 ± 0.6                         67                  0.9 ± 0.068
    1.0 ± 0.5                         232

(i) Verify that the errors given by the emulsion experiments are larger than the theoretically smallest possible errors. (Hint: The MVB for $\tau$ in the p.d.f. $f(t|\tau) = \frac{1}{\tau}\exp(-t/\tau)$ is $\tau^2/n$.)
(ii) Show that the emulsion experiments are internally consistent.
(iii) Are the counter experiments internally consistent?
(iv) Do the two techniques lead to consistent results?

14.4 GOODNESS-OF-FIT TESTS

Up to this point we have in this chapter considered various types of parametric tests, in which the task has been to decide whether, in the light of a set of observations, the numerical values of certain parameters describing a probability distribution are compatible with some specified values defining the parametric hypothesis.

We will now proceed to discuss more general problems involving non-parametric hypotheses, with which we shall be concerned for the remainder of the chapter. We start by considering tests of goodness-of-fit. For definiteness, let $x_1, x_2, \ldots, x_n$ be sample values for a random variable $x$ whose true probability function $f(x)$, continuous or discrete, is not known and let $f_0(x)$ be some particular specified distribution. To test the simple hypothesis

    $H_0: f(x) = f_0(x)$    (14.62)

on the basis of the sample values is then a typical goodness-of-fit problem.

In testing goodness-of-fit we shall, as before, need a test statistic whose distribution, assuming $H_0$ true, defines a critical region and an acceptance region with probabilities $\alpha$ and $1-\alpha$, respectively. The situation can be different from that of the previous sections in that we may now not formulate an alternative hypothesis $H_1$, since $H_1$ can be the ensemble of all conceivable hypotheses different from $H_0$. Thus $H_1$ is often left unspecified, and the power of the test not taken into account.

The goodness-of-fit test most commonly used is Pearson's $\chi^2$ test, which will be discussed extensively in the following sections. This test is strictly speaking exact for large samples only, and otherwise approximate. A second test for goodness-of-fit, the likelihood-ratio test, is valid for all sample sizes, but since its test statistic has a distribution which is not known in general, this test is less applicable to practical problems; moreover, since the likelihood-ratio test for large samples can be shown to be equivalent to Pearson's $\chi^2$ test, it will not be considered in this book. Instead we shall discuss the Kolmogorov-Smirnov test, which is particularly useful in the case of small samples when the conditions for Pearson's $\chi^2$ test are not satisfied.

14.4.1 Pearson's $\chi^2$ test

In this and subsequent sections, the presentation resembles that of Sects. 10.4 and 10.5 in concepts and notation. However, while in these sections the topic was parameter estimation (by the Least-Squares approach), the estimation problem as such will be of subordinate importance to us here. In fact, to emphasize that our present concern is the goodness-of-fit between observed data and some theoretical model we shall for the moment assume that the data have not been used to infer the numerical values of parameters in the model. In other words, the distribution defining $H_0$ is presently assumed to be specified independently of the observations which constitute the basis for the test of fit. The necessary modifications for a situation where the data have been
used to specify parameters in $H_0$ will be considered in Sects. 14.4.3 and 14.4.4 below.

We assume now that $n$ observations on the variable $x$ belong to $N$ mutually exclusive classes, such as successive intervals in a histogram, non-overlapping regions in a two-dimensional plot, etc. The numbers of events $n_1, n_2, \ldots, n_N$ in the different classes will then be multinomially distributed, with probabilities $p_i$ for the individual classes as determined by the underlying distribution. The simple hypothesis we wish to test specifies the class probabilities according to some prescription,

    $H_0: p_1 = p_{01},\ p_2 = p_{02},\ \ldots,\ p_N = p_{0N}$,    (14.63)

where

    $\sum_{i=1}^{N} p_{0i} = 1$.    (14.64)

Equivalently, the hypothesis specifies the numbers predicted for the different classes, given that the total number in all classes is $n$. To test whether the set of predicted numbers $np_{0i}$ is compatible with the set of observed numbers $n_i$ we take as our test statistic the quantity

    $\chi^2 = \sum_{i=1}^{N} \frac{(n_i - np_{0i})^2}{np_{0i}}$    (14.65)

or, in an equivalent form which is easier to compute,

    $\chi^2 = \frac{1}{n}\sum_{i=1}^{N} \frac{n_i^2}{p_{0i}} - n$.    (14.66)

When $H_0$ is true this statistic is approximately chi-square distributed with $N-1$ degrees of freedom.

To qualify this statement we observe that each term in eq.(14.65) is the square of a quantity $(n_i - np_{0i})/\sqrt{np_{0i}}$ where the numerator measures the difference between the observed and hypothetical value for class $i$; if $H_0$ is true this difference has expectation value zero. Also, the number of observations in any class $i$ can be considered a Poisson variable, with mean value equal to its variance, and for $H_0$ true this mean is equal to $np_{0i}$. If, further, $np_{0i}$ is large enough to approximate a Poisson variable to a normal variable, then $(n_i - np_{0i})/\sqrt{np_{0i}}$ is approximately $N(0,1)$. The statistic $\chi^2$ constructed as a sum of squares of such variables will accordingly be an approximate chi-square variable with a number of degrees of freedom equal to the number of independent terms in the sum, which is here $N-1$ due to the normalization condition.

If $H_0$ is true, and the experiment is repeated many times under the same conditions with $n$ observations, the actual values obtained for $\chi^2_{\rm obs}$ will therefore be distributed nearly like $\chi^2(N-1)$; in particular, the average value for $\chi^2_{\rm obs}$ will be approximately $N-1$ and the variance approximately $2(N-1)$. If, on the other hand, $H_0$ is not true, the expectation for each $n_i$ is not $np_{0i}$, and the sum of squared terms $(n_i - np_{0i})/\sqrt{np_{0i}}$ will tend to become on the average larger than if $H_0$ were true. Hence it seems reasonable to reject $H_0$ if $\chi^2_{\rm obs}$ becomes too large and to adopt a one-sided test for $H_0$, taking the critical region at the upper tail of the appropriate chi-square distribution. See also Exercise 14.8 below.

It has been the practice at times to reject $H_0$ for very small as well as very large values of $\chi^2_{\rm obs}$, i.e. to use a two-sided test, the argument being that a very small $\chi^2_{\rm obs}$ might indicate some bias in the data towards the hypothetical values. Although such a bias may certainly cause an improbable, small value for $\chi^2_{\rm obs}$, it appears on the other hand even more unlikely to obtain a low $\chi^2_{\rm obs}$ value using a wrong hypothesis; the two-sided test thus seems less justified and has been abandoned.

Since the Pearson $\chi^2$ test is insensitive to the signs of the differences $(n_i - np_{0i})$ it is recommended, also in the cases when there is no reason to suspect the hypothesis from the $\chi^2_{\rm obs}$ value, to examine the hypothetical and observed numbers and look for systematic trends over the variable range. Quite frequently a hypothesis which corresponds to an acceptable chi-square probability can be ruled out from the pattern of signs in the deviations between data and model; see the example in Sect.14.6.7.

Exercise 14.8: (Expectation values of the $\chi^2$ statistic)
(i) Show that, for any underlying multinomial distribution with class probabilities $p_1, p_2, \ldots, p_N$ the expectation value of Pearson's test statistic eq.(14.66) is

    $E(\chi^2) = \sum_{i=1}^{N} \frac{1}{p_{0i}}\left[\,p_i(1-p_i) + n(p_i - p_{0i})^2\,\right]$.
In particular, if $H_0$ is true (i.e. if $p_i = p_{0i}$ for all $i$), one has, for any sample size $n$,

    $E(\chi^2|H_0) = N-1$.

This represents a generalization of the already established asymptotic result, when $\chi^2$ is $\chi^2(N-1)$ and therefore has mean value $N-1$.
(ii) Show, by minimizing $E(\chi^2)$ with respect to the $p_i$ under the constraint $\sum p_i = 1$, that as $n \to \infty$, $E(\chi^2)$ will be minimal if $p_i = p_{0i}$. (Hint: Use the Lagrangian multiplier method.) Hence, for any hypothesis $H_1$ specifying a set of class probabilities differing from $H_0$ one will have, asymptotically,

    $E(\chi^2|H_1) > N-1$,

suggesting that the critical region for Pearson's $\chi^2$ test should be at the upper tail of the chi-square distribution.
(iii) If all class probabilities are equal under $H_0$, i.e. if all $p_{0i} = 1/N$, show that, for any sample size,

    $E(\chi^2|H_1) = (N-1) + (n-1)\left(N\sum_{i=1}^{N} p_i^2 - 1\right)$,

which is always larger than $N-1$ for hypotheses differing from $H_0$.

14.4.2 Choice of classes for Pearson's $\chi^2$ test

The problem of how to divide the variable range into classes or bins, and thereby fix the number of classes, was also considered in connection with the Least-Squares method of parameter estimation, Sect.10.5.2.

The asymptotic chi-square behaviour of the $\chi^2$ statistic for the Pearson $\chi^2$ test of goodness-of-fit is, strictly speaking, only proved to be correct if the class division is made without any reference to the observations. This is so because the formulation above assumed the observations to be randomly (multinomially) distributed over a set of predefined classes, each corresponding to a specified probability. In practice, however, the choice of class boundaries is often made after the data have been obtained and the general pattern of the observations has emerged and can be taken into account. This practice is justified by the fact that, for infinite $n$, the distribution of $\chi^2$ will be $\chi^2(N-1)$ for any partition with $N$ classes, provided $H_0$ is true.

Pearson's $\chi^2$ test relies on the approximation of a multinomial to a multinormal distribution, since it assumes an approximate standard normal behaviour of all terms $(n_i - np_{0i})/\sqrt{np_{0i}}$, and this requires sufficiently large numbers of expected events within each of the $N$ classes. Clearly, the grouping of individual observations into a class and representing them by some common average value of the variable necessarily implies a certain loss of information which is in itself unwanted. In applying this test to compare model and data the physicist may therefore experience the choice between Scylla and Charybdis: he should use relatively few classes to comply with the normality requirement, and many classes to reduce the loss of information from the data.

To solve this dilemma it is usually recommended to group the observations as little as possible and in such a way that the number of events within all classes will be in accordance with the normality requirement, which is customarily specified as a minimum of 5 expected events in each class. If the number of degrees of freedom is not too small, say at least 6, one or two classes may be allowed to have even less than 5 expected events.

For observations on a discrete variable the class division will usually present no special problem, since the class boundaries will more or less suggest themselves and possible necessary groupings of events, particularly in the tails of the distribution, can be made to correspond to the minimum requirements mentioned above.

For a continuous variable there is no natural suggestion for the class determination from the hypothetical distribution itself, and essentially two different approaches can be thought of for the subdivision of the variable range (Sect.10.5.2): Either the range is divided into classes of equal width, or it is divided to correspond to classes of equal probability. The equal-width method is arithmetically simpler than the equal-probability method which may require a somewhat higher level of computational sophistication and therefore, apparently, enjoys less popularity among physicists. The equal-probability method can, however, be shown to be theoretically advantageous.

Assuming equal-probability partition and sufficiently large samples, it is possible to establish a relation for the optimum number of classes which maximizes an approximate power function for Pearson's $\chi^2$ test (see Kendall and Stuart, Chapter 20, Vol.2). The optimum number of classes is found to increase in proportion to $n^{2/5}$ for fixed power and significance or, equivalently, the optimum expected event number in each class increases as $n^{3/5}$. Specifically, maximizing at a power level $1-\beta = \frac{1}{2}$ for $n = 200$ one finds that the optimum class number $N$ is 31 (27) for a significance level of 5% (1%), corresponding to between 6 and 8 expected events per class, just barely in accordance with the normality requirement. For higher power levels the optimum number of classes is lowered, corresponding to more expected events per class.
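The equal-probability division and the two forms of Pearson's statistic, eqs.(14.65) and (14.66), can be sketched as follows. The hypothesis used here (an exponential distribution with unit mean), the simulated sample, the choice of 20 classes and all function names are assumptions made purely for illustration; they are not taken from the text. The class boundaries are fixed from $H_0$ alone, without reference to the observations.

```python
import bisect
import math
import random


def equal_probability_edges(inverse_cdf, n_classes):
    """Interior boundaries giving every class probability 1/n_classes under H0."""
    return [inverse_cdf(k / n_classes) for k in range(1, n_classes)]


def class_counts(sample, edges):
    """Number of observations in each of the len(edges) + 1 classes."""
    counts = [0] * (len(edges) + 1)
    for x in sample:
        counts[bisect.bisect_right(edges, x)] += 1
    return counts


def pearson_chi2(counts, probs):
    """Pearson statistic as a sum of (n_i - n p_0i)^2 / (n p_0i), eq. (14.65)."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def pearson_chi2_short(counts, probs):
    """Equivalent computational form (1/n) sum n_i^2 / p_0i - n, eq. (14.66)."""
    n = sum(counts)
    return sum(c * c / p for c, p in zip(counts, probs)) / n - n


def inverse_cdf(u):
    """Inverse cumulative distribution of the hypothetical H0: exponential, mean 1."""
    return -math.log(1.0 - u)


rng = random.Random(2024)  # fixed seed; the data are simulated, not measured
sample = [rng.expovariate(1.0) for _ in range(200)]

n_classes = 20  # 200 / 20 = 10 expected events per class, above the minimum of 5
edges = equal_probability_edges(inverse_cdf, n_classes)
counts = class_counts(sample, edges)
probs = [1.0 / n_classes] * n_classes

chi2_obs = pearson_chi2(counts, probs)
print("class contents:", counts)
print(f"chi2 = {chi2_obs:.2f}, to be compared with chi-square({n_classes - 1})")
```

With equal class probabilities the expected content is the same in every class, so the normality requirement is either met everywhere or nowhere, which makes the choice of the number of classes easy to check.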


Whether the equal-probability method generally improves the power of Pearson's $\chi^2$ test compared to the equal-width method is, however, not at all clear. Indeed, one may suspect that the degree of "fit" will be most critical at the extremes of the variable range, and under such circumstances the equal-probability method may well result in a loss of sensitivity in these regions.

14.4.3 Degrees of freedom in Pearson's $\chi^2$ test

Tests for goodness-of-fit are frequently used in conjunction with estimation problems. Typically, some theory or model is available which includes a certain number of parameters that are either known to an unsatisfactory precision, or not known at all. This model may be said to constitute a composite hypothesis, since it is not completely specified, but spans a certain region of parameter space. If then the data are used to obtain estimates for the unknowns the effect is to reduce the allowed region of parameter space to a single point. This corresponds to making the model a simple hypothesis, which is subsequently put to test for goodness-of-fit.

For a Least-Squares estimation we know from Sects.10.4.3 and 10.4.4 that the comparison between data and fitted model is made using the chi-square distribution with a number of degrees of freedom equal to the number of independent observations minus the number of independent parameters estimated. This procedure is exact only in the limit of infinitely many observations and with a linear parameter dependence; otherwise it is an approximation. Thus, if there are $L$ parameters in $H_0$ which are estimated by the LS method and $N$ classes subject to an overall normalization condition, Pearson's $\chi^2$ test for goodness-of-fit consists in comparing the fitted (minimum) value $\chi^2_{\rm min}$ to the chi-square distribution with $(N-1-L)$ degrees of freedom.

Whenever unknown parameters are to be inferred from the data it is generally recommended to use the Maximum-Likelihood method for this estimation and to use the $\chi^2$ statistic of eqs.(14.65) or (14.66) with the estimated predicted frequencies $p_{0i} = \hat{p}_{0i}$ for testing the goodness-of-fit. The comparison will then be made to the chi-square distribution with a number of degrees of freedom which is smaller than $N-1$. Two possibilities can be thought of:

(i) The data were grouped in $N$ classes and the $L$ unknown parameters estimated by the multinomial Maximum-Likelihood method as described in Sect.9.9. - The $\chi^2$ statistic can then be shown to have asymptotically a $\chi^2(N-1-L)$ distribution, just as if the parameters were estimated by the Least-Squares method.

(ii) The $L$ unknown parameters were estimated by the ordinary Maximum-Likelihood method from the original ungrouped observations. - In this situation it is a little more problematic to perform the goodness-of-fit test, because the $\chi^2$ statistic no longer has a simple chi-square distribution. Assuming again $N$ terms in the $\chi^2$ sum it can be shown that $\chi^2$ will have a distribution which is bounded by two chi-square distributions with $(N-1)$ and $(N-1-L)$ degrees of freedom, respectively. When $N$ is a large number the difference between the two distributions may be ignored, and the critical value for $\chi^2$ at a given significance can be found from the corresponding $\chi^2(N-1)$ distribution. For $N$ small, however, it may be necessary to check that the calculated $\chi^2$ exceeds the critical value for both distributions $\chi^2(N-1)$ and $\chi^2(N-1-L)$ before rejecting the fit.

14.4.4 General $\chi^2$ tests for goodness-of-fit

In our considerations so far we have assumed that the test statistic $\chi^2$ has been expressed in terms of $N$ class probabilities, which are not all independent but must add to unity. This is equivalent to requiring equally many predicted and observed events when summed over all classes, and implies a reduction in the number of degrees of freedom of one unit in the comparison for goodness-of-fit. Quite frequently, however, the model which we wish to test gives definite predictions for the absolute numbers of expected events $f_i$ in the separate classes. Rather than eqs.(14.63),(14.64) the hypothesis is

    $H_0: f_1 = f_{01},\ f_2 = f_{02},\ \ldots,\ f_N = f_{0N}$,    (14.67)

and the natural test statistic is then

    $\chi^2 = \sum_{i=1}^{N} \frac{(n_i - f_{0i})^2}{f_{0i}}$,    (14.68)

which, for $H_0$ true, is assumed to be approximately $\chi^2(N)$ when all $N$ classes have sufficiently many events. If $H_0$ involves parameters whose numerical values are inferred from the data by the usual estimation methods, the number of degrees of freedom will have to be reduced as described in the previous section.

For more complicated problems, where there are constraint equations relating measurable and unmeasurable quantities (parameters), and also correlation terms between the measurements, a $\chi^2$ test for goodness-of-fit can be based
on a test statistic of the type

    $\chi^2 = (y - \eta)^T\, V^{-1}(y)\, (y - \eta)$    (14.69)

where $\eta$ is the vector of fitted quantities defining $H_0$ and $y$ the observations with covariance matrix $V(y)$. In particular, if a Least-Squares estimation has been used, the number of degrees of freedom is $(K-J)$, where $K$ is the number of constraints and $J$ the number of unmeasurables in the problem.

For a chosen significance level $100\alpha\,\%$ of the $\chi^2$ test the lower limit $\chi^2_\alpha$ of the critical region is generally given by the relation

    $\alpha = \int_{\chi^2_\alpha}^{\infty} f(u;\nu)\,du = 1 - F(u=\chi^2_\alpha;\nu)$    (14.70)

where $f(u;\nu)$ is the chi-square p.d.f., $F(u;\nu)$ the cumulative integral of the same p.d.f., and $\nu$ the appropriate number of degrees of freedom. If $\chi^2_{\rm obs}$ as calculated with the observations exceeds the critical value $\chi^2_\alpha$ implied by eq.(14.70) (and obtained, for example, from a tabulation with fixed percentage points such as Appendix Table A8) the hypothesis $H_0$ will have to be rejected at the chosen significance level; if $\chi^2_{\rm obs}$ comes out smaller than $\chi^2_\alpha$ there is no reason to reject $H_0$ on the basis of the $\chi^2$ test.

Common practice among physicists is to convert the actually observed value $\chi^2_{\rm obs}$ for the fit to an equivalent chi-square probability $P_{\chi^2}$ as implied by the relation

    $P_{\chi^2} = \int_{\chi^2_{\rm obs}}^{\infty} f(u;\nu)\,du = 1 - F(u=\chi^2_{\rm obs};\nu)$.    (14.71)

This probability is most conveniently obtained from curves of the cumulative chi-square distribution, such as Fig. 5.2, but can also be found by interpolation in the standard tables with fixed percentage points.

It may be appropriate to stress that although a very bad fit (with a high $\chi^2_{\rm obs}$ value and low $P_{\chi^2}$) can be a sufficient reason for rejecting a hypothesis, a good fit is in itself inconclusive as long as other hypotheses have not been tried. In fact, instead of using the phrase "we accept the hypothesis" it will perhaps be more appropriate to express the matter as "we fail to reject the hypothesis". A $\chi^2$ test for goodness-of-fit may, however, provide additional support for a hypothesis which is already plausible for other reasons. Thus, when a physicist says he accepts a certain hypothesis he probably has been convinced from physical arguments rather than purely statistical considerations.

14.4.5 Example: Kinematic analysis of a $V^0$ event (2)

To illustrate the use of the $\chi^2$ test for goodness-of-fit we shall go back to the example of Sect.10.8.2 and see how this test can be used to decide the identity of neutral strange particles observed as $V^0$ events in a bubble chamber. Depending on whether the positive decay particle is a proton or a pion there are two hypotheses for each $V^0$:

    $H_0$: $V^0$ is $\Lambda \to p + \pi^-$,
    $H_1$: $V^0$ is $K^0 \to \pi^+ + \pi^-$.

We assume now that the flight direction of the $V^0$ has been determined from the measured production and decay points, and that the momenta and angles of both decay particles have been measured. The measurable variables $y$ then constitute an 8-component vector, involving two angles for the $V^0$, and momentum and two angles for each of the two decay particles. The only unmeasurable variable in the problem is (the magnitude of) the $V^0$ momentum. Since there are four equations for energy and momentum conservation this corresponds to a 3C-fit for the constrained Least-Squares estimation of all kinematic variables in a specified hypothesis.

If the correct hypothesis has been used for the fit the variable $\chi^2$ of eq.(14.69) with the fitted quantities $\eta$ obtained in the final iteration of the minimization procedure, will be approximately $\chi^2(3)$. Fixing, for example, the significance level at 1% the critical value, as read off from Appendix Table A8, is $\chi^2_{.01} = 11.345$. Hence, at the 1% level, we shall reject a hypothesis if the observed value $\chi^2_{\rm obs}$ exceeds $\chi^2_{.01}$, and otherwise accept it.

Specifically, let us consider an event for which relevant numbers are given in the following table. The angles for the $V^0$ have been obtained from the production and decay points with measured $(x,y,z)$ coordinates (-44.4±.14, -1.8±.17, -16.2±.26) and (-28.8±.15, -5.3±.16, -16.0±.26) in cm, respectively. All measured quantities have uncorrelated errors.
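As an alternative to reading tables or curves, the relations (14.70) and (14.71) can be evaluated numerically. The sketch below is an illustration only: the function names are hypothetical, the tail probability uses the standard closed-form series for an integer number of degrees of freedom, and the critical value is found by simple bisection. The observed value used at the end is an assumed example.

```python
import math


def chi2_probability(chi2_value, ndf):
    """Upper-tail probability of eq. (14.71) for integer ndf >= 1."""
    if chi2_value <= 0.0:
        return 1.0
    half = 0.5 * chi2_value
    if ndf % 2 == 0:
        # ndf = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, ndf // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # ndf = 2m+1: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{m} (x/2)^(k-1/2) / Gamma(k+1/2)
    m = (ndf - 1) // 2
    total = sum(half ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def chi2_critical(alpha, ndf, upper=1000.0):
    """Critical value of eq. (14.70), found by bisection on the tail probability."""
    lower = 0.0
    for _ in range(200):
        middle = 0.5 * (lower + upper)
        if chi2_probability(middle, ndf) > alpha:
            lower = middle
        else:
            upper = middle
    return 0.5 * (lower + upper)


ndf = 3                      # a 3C-fit
critical = chi2_critical(0.01, ndf)
print(f"1% critical value for {ndf} degrees of freedom: {critical:.3f}")

chi2_obs = 3.6               # assumed example of an observed value
print(f"P(chi2 > {chi2_obs}) = {chi2_probability(chi2_obs, ndf):.3f}")
```

Comparing an observed value with the critical value and comparing its probability with the significance level are equivalent ways of applying the same test.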
                                     Momentum      Dip angle        Azimuth angle
                                                   (radians)        (radians)

Measured quantities                  1535±72       0.022±0.006      6.107±0.001
                                     1479±60       0.019±0.006      6.111±0.006
                            π        378±18       -0.097±0.016      5.768±0.014

Fitted quantities,          p        1564±72       0.022±0.006      6.106±0.007
hypothesis H0               π        354±11       -0.091±0.016      5.781±0.012

Fitted quantities,          π        1831±51       0.024±0.006      6.124±0.006
hypothesis H1               π        381±18       -0.118±0.016      5.719±0.011

In view of the preassigned 1% significance level and corresponding critical value of the test statistic, we shall from the observed values $\chi^2_{obs}(H_0) = 3.6$ and $\chi^2_{obs}(H_1) = 26.7$ accept $H_0$ and reject $H_1$. We thus conclude that this particular $V^0$ is a $\Lambda$.

With the present example the numbers assure, with overwhelming plausibility, that the correct identity has been established for the $V^0$. In other cases the situation may be not so simple. For example, if both hypotheses give $\chi^2_{obs} < \chi^2_{\alpha}$ and thus are acceptable at the chosen significance $\alpha$, the $V^0$ is kinematically ambiguous. If this ambiguity can not be resolved by ionization or other criteria, the practice among physicists is either to accept only the hypothesis with lowest $\chi^2_{obs}$ (highest chi-square probability $P_{\chi^2}$), rejecting the other, or to accept both hypotheses, including the $V^0$ in the two samples of $\Lambda$ and $K^0$ events with weights in accordance with the $P_{\chi^2}$ values for the two fits. If both hypotheses give $\chi^2_{obs} > \chi^2_{.01}$ the event is rejected, and both final samples corrected for a 1% loss of true events.

14.4.6 The Kolmogorov-Smirnov test

Pearson's $\chi^2$ test is undoubtedly the most popular non-parametric test used by physicists. However, other goodness-of-fit tests exist which avoid the binning of individual observations and may be more sensitive to the data. The most important of these tests is probably the Kolmogorov-Smirnov test, which in particular for small samples is superior to the $\chi^2$ test, and has many nice properties when applied to problems in which no parameters are estimated.

Given n independent observations on the variable x we form an ordered sample by arranging the observations in ascending order of magnitude,

$$ x_1 \le x_2 \le \ldots \le x_n . $$

The cumulative distribution for this sample of size n is $S_n(x)$, defined by

$$ S_n(x) = \begin{cases} 0, & x < x_1 \\ i/n, & x_i \le x < x_{i+1}, \quad i = 1, 2, \ldots, n-1 \\ 1, & x \ge x_n . \end{cases} $$

Thus $S_n(x)$ is an increasing step function with a step of height $1/n$ at each of the points $x_1, x_2, \ldots, x_n$.

The Kolmogorov-Smirnov test involves a comparison between the observed cumulative distribution function $S_n(x)$ for the sample and the cumulative distribution function $F_0(x)$ which would occur under some theoretical model. We state the null hypothesis as

$$ H_0: \; S_n(x) = F_0(x) . \tag{14.73} $$

For $H_0$ true one expects that the difference between $S_n(x)$ and $F_0(x)$ at any point should be reasonably small. The Kolmogorov-Smirnov test looks at the difference $S_n(x) - F_0(x)$ at all observed points and takes as a test statistic*) the maximum of the absolute value of this quantity, thus

$$ D_n = \max |S_n(x) - F_0(x)| . \tag{14.74} $$

*) Alternative formulations of related tests use the statistics $D_n^+ = \max (S_n(x) - F_0(x))$ or $D_n^- = \max (F_0(x) - S_n(x))$.

It can be shown that, provided no parameter in $F_0(x)$ has been determined from the data, and assuming $H_0$ true, the variable $D_n$ has a distribution which is independent of $F_0(x)$, i.e. $D_n$ is distribution-free. This holds irrespective of the sample size.

To be able to use $D_n$ as a test statistic for testing $H_0$ one must know its distribution function. It was shown by Kolmogorov that $D_n$ has a cumulative distribution which for large n is given by

$$ \lim_{n \to \infty} P\left( D_n < \frac{z}{\sqrt{n}} \right) = 1 - 2 \sum_{r=1}^{\infty} (-1)^{r-1} e^{-2 r^2 z^2} . \tag{14.75} $$

This relation is approximately valid already at $n = 80$. For finite n the $D_n$ distributions can be found from recurrence relations.

Appendix Table A10 gives the exact critical values $d_\alpha$ of the test statistic $D_n$ for $n \le 100$ as well as values for the limiting case of n
large (last row), for different values of the significance $\alpha$. It turns out that the approximate values obtained with the limiting entries are always larger than the exact ones. For instance, taking $\alpha = 0.05$ in a case with $n = 80$ the exact critical value of $D_{80}$ is $d_{.05} = 0.1496$, while the approximate value becomes $1.3581/\sqrt{80} = 0.1518$. If the null hypothesis $H_0$ is tested at a significance level of 5% it should therefore be rejected if the largest observed deviation between $S_{80}(x)$ and $F_0(x)$ exceeds 0.15.

It will be seen from Appendix Table A10 that, if the sample size is small, rather large differences must be found between the cumulative distributions in order to detect significant deviations between data and hypothesis; hence unless this difference is considerable one shall not be able to falsify $H_0$. Indeed, the numbers of this table can be taken as an illustration of the general difficulty in constructing effective tests for small data samples.

It is worth noting that since $D_n$ for $H_0$ true has a distribution which is universal and independent of the theoretical $F_0(x)$, and furthermore is known for all n, one may use $D_n$ to construct confidence bands for any continuous distribution function $F(x)$. Whatever the true $F(x)$ is we may write a probability statement about $D_n$ as

$$ P\left( D_n = \max |S_n(x) - F_0(x)| > d_\alpha \right) = \alpha , \tag{14.76} $$

where as before $d_\alpha$ is the critical value of $D_n$ corresponding to the significance $\alpha$. The statement can be inverted to give a confidence statement about $F(x)$,

$$ P\left( S_n(x) - d_\alpha < F(x) < S_n(x) + d_\alpha , \ \text{all } x \right) = 1 - \alpha . \tag{14.77} $$

This means that, at any point x, the cumulative distribution function $F(x)$ will have a probability $(1 - \alpha)$ of being larger than $(S_n(x) - d_\alpha)$ but smaller than $(S_n(x) + d_\alpha)$. Therefore, if one constructs a band of width $\pm d_\alpha$ around the empirical cumulative distribution $S_n(x)$ the probability is $(1 - \alpha)$ that the true $F(x)$ will lie entirely within this band. This provides an extremely simple and direct method for estimating a cumulative distribution function at a given confidence level. Obviously this inversion of the goodness-of-fit test into a confidence statement about $F(x)$ rests upon the simple way $D_n$ was defined to give the deviation between $S_n(x)$ and $F(x)$.

The technique described here can be used, for instance, to plan how large a sample one should have in order to provide $F(x)$ to a required accuracy. For example, let us demand an accuracy of better than 0.20 anywhere on $F(x)$ at a confidence level of 90%. Then, from Appendix Table A10 we see from the entries for $\alpha = 0.10$ that $n \ge 35$ will be necessary to have $d_{.10} < 0.20$. With a required accuracy of better than 0.05 at the same confidence level we find from the asymptotic entry that the condition on n is $1.22/\sqrt{n} \le 0.05$, which implies $n \ge 600$.

It should be stressed that the considerations above apply only to situations where no unknown parameters are involved. If some of the parameters entering have been estimated using the data the statistic $D_n$ is no longer independent of $F_0(x)$, and the critical values $d_\alpha$ can not be obtained using the tables. However, in some fortunate situations the Kolmogorov-Smirnov test can still be used even in the presence of unknown nuisance parameters, provided appropriate tables over percentage points are available. For example, for an important class of problems where the estimation involves the mean value of an exponential distribution such tables can be found in a recent article by J. Durbin.

14.4.7 Example: Goodness-of-fit in a small sample

To illustrate the use of the Kolmogorov-Smirnov test of goodness-of-fit we consider a typical low-statistics experiment. Suppose that for 30 events one has measured the proper flight-time of neutral kaons decaying into the semileptonic final state $\pi^+ e^- \nu$. With the kaons produced in an initially pure strangeness +1 state one can predict the p.d.f. $f_0(t)$ for the flight-time t under the assumption that only the $\bar{K}^0$ component with strangeness -1 contributes to the $\pi^+ e^- \nu$ final state. This assumption defines the null hypothesis $H_0$ which we want to test.

Figure 14.6(a) shows the step function $S_{30}(t)$ obtained for the sample of 30 flight-times together with the cumulative distribution function $F_0(t)$. From the largest deviation between the experimental and theoretical curves we determine the actual value of the Kolmogorov test statistic of eq.(14.74) as

$$ D_{30} = \max |S_{30}(t) - F_0(t)| = 0.17 . $$

From Appendix Table A10 we see that at the commonly chosen significance levels up to 10% we shall not be able to reject $H_0$ on the basis of the 30 observations with
this test. Indeed, for $H_0$ true, we find by extrapolation of the table entries for $n = 30$ that there is a probability of about 0.25 that a larger maximum deviation than 0.17 would be found between the observed cumulative distribution function and the predicted $F_0(t)$.

For comparison, let us use the same observations to test $H_0$ by the $\chi^2$ method. For the flight-times of Fig. 14.6(a) a grouping of the data in the 4 intervals 0-3, 3-5, 5-7, and 7-18 (in units of the $K^0$ mean lifetime) fulfils the recommendation given earlier for Pearson's $\chi^2$ test with nearly equal probabilities in all bins and at least 5 entries in any bin, see Fig. 14.6(b). Assuming the $X^2$ statistic to be sufficiently well approximated by a chi-square variable under these circumstances one finds a chi-square probability of about 0.40 ($\chi^2_{obs} = 3.0$ and 3 degrees of freedom).

Fig. 14.6. Comparison of predicted and experimental distribution of flight times; (a) cumulative distribution (Kolmogorov-Smirnov test), (b) differential distribution (Pearson's $\chi^2$ test). Horizontal axis: time of flight (in units of $0.89 \times 10^{-10}$ sec).

14.5 TESTS OF INDEPENDENCE

Frequently, when data are available in differential form specifying properties or attributes, it is desired to test whether these are independent of each other. The motivation for carrying out a test of independence can sometimes be a profound theoretical conjecture, for example, a prediction for the shape of a spectrum of a kinematical variable. Before a claim is made on scaling behaviour it is then necessary to establish that the spectrum in question remains unchanged when, say, an incident energy is increased. More often the motivation is less subtle. The experimenter may simply wish to find out whether observed events are uniformly distributed along a band in a Dalitz plot, whether transverse and longitudinal momenta are uncorrelated, etc.

An assumption of independence in the variables x, y, ... can be stated as a null hypothesis where the joint probability distribution factorizes into separate probability distributions for the individual variables, i.e.

$$ H_0: \; f(x, y, \ldots) = f_1(x) \cdot f_2(y) \cdots . \tag{14.78} $$

A test problem of this kind can be approached along different lines of thought, some of which consist in rephrasing the problem to make it analogous to those discussed in Sect.14.6 below.

It turns out that the $\chi^2$ test is also adequate for providing answers to questions of the above type; a test statistic can be constructed in analogy with the Pearson statistic of eq.(14.65). To demonstrate this we shall be satisfied with considering a problem in two dimensions only, which will be sufficient for most practical purposes. The extension to higher dimensions may become somewhat awkward regarding notation but is conceptually simple and should be borne in mind by the experimenter working with high-statistics data samples.

14.5.1 Two-way classification; contingency tables

Suppose that observations can be classified according to two different attributes or properties A and B, and that there are I categories for the first attribute, $A_1, A_2, \ldots, A_I$, and J categories for the second, $B_1, B_2, \ldots, B_J$. Let the number of observations with attributes $A_i$ and $B_j$ be denoted by $n_{ij}$ and let the total number of observations be n. With the notation
$$ n_{i.} = \sum_{j=1}^{J} n_{ij} , \qquad n_{.j} = \sum_{i=1}^{I} n_{ij} , $$

the normalization condition is

$$ \sum_{i=1}^{I} n_{i.} = \sum_{j=1}^{J} n_{.j} = n . $$

The numbers can conveniently be written in a contingency table, as in Fig. 14.7. A similar table*) can be written for the cell probabilities $P(A_i \cap B_j) \equiv p_{ij}$. If these probabilities were specified by some theory or model as definite numbers $p_{ij} = p^0_{ij}$, corresponding to a simple hypothesis, the agreement between the observed and predicted distributions could be checked by taking the sum

$$ X^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(n_{ij} - n p^0_{ij})^2}{n p^0_{ij}} $$

as a test statistic for an ordinary Pearson $\chi^2$ test of goodness-of-fit with (IJ-1) degrees of freedom.

*) A contingency table for probabilities was given already in Sect.2.3.12.

Fig. 14.7. Contingency table for two-way classification.

We assume now that our interest is not in the goodness-of-fit as such, but rather in the problem of deciding, on the basis of the data available, whether the A and B classifications can be said to be independent. The independence will imply that the conditional probability for attribute $B_j$, given $A_i$, is the same whatever $A_i$, and vice versa. Equivalently, the probability for having simultaneously properties $A_i$ and $B_j$ is equal to the product of probabilities for the separate occurrences. Thus we can state our (composite) null hypothesis as

$$ H_0: \; P(A_i \cap B_j) = P(A_i) \cdot P(B_j) \qquad \text{all } i, j $$

(see Sect.2.3.6 for the general definition of independence). With a somewhat simplified notation we may write

$$ H_0: \; p_{ij} = p_{i.} \, p_{.j} \qquad \text{all } i, j \tag{14.82} $$

where the marginal probabilities for attributes $A_i$ and $B_j$ are given by

$$ p_{i.} \equiv P(A_i) = \sum_{j=1}^{J} p_{ij} , \qquad p_{.j} \equiv P(B_j) = \sum_{i=1}^{I} p_{ij} . \tag{14.83} $$

These marginal probabilities must satisfy the condition for exhaustive sets

$$ \sum_{i=1}^{I} p_{i.} = \sum_{j=1}^{J} p_{.j} = 1 , $$

which implies that only I-1 of the row probabilities and J-1 of the column probabilities are independent. These probabilities may be estimated from the observations as

$$ \hat{p}_{i.} = n_{i.}/n , \qquad \hat{p}_{.j} = n_{.j}/n . \tag{14.85} $$

The individual cell probabilities may therefore, if $H_0$ is true, be estimated by the products of the appropriate estimated row and column probabilities,

$$ \hat{p}_{ij} = \hat{p}_{i.} \, \hat{p}_{.j} = n_{i.} n_{.j} / n^2 . $$

A suggestive test statistic for testing independence in a two-way classification is consequently

$$ X^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(n_{ij} - n_{i.} n_{.j}/n)^2}{n_{i.} n_{.j}/n} $$

or, in a form more convenient for computation,

$$ X^2 = n \left( \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{n_{ij}^2}{n_{i.} n_{.j}} - 1 \right) . $$

Under $H_0$, $X^2$ will be approximately chi-square distributed, provided the numbers of events in the different cells are sufficiently large. The number of
degrees of freedom is equal to the number of independent observations minus the number of independently estimated unknowns, that is,

$$ (IJ - 1) - [(I - 1) + (J - 1)] = (I - 1)(J - 1) . \tag{14.88} $$

Exercise 14.9: Show that the $\hat{p}_{i.}$ (and similarly the $\hat{p}_{.j}$) of eq.(14.85) are the Maximum-Likelihood estimates of the row (column) probabilities. (Hint: write down the likelihood for obtaining the observations $n_{1.}, n_{2.}, \ldots, n_{I.}$, and use the Lagrangian multiplier method to find the I probabilities $p_{1.}, p_{2.}, \ldots, p_{I.}$ which maximize the log likelihood function under the constraint $\sum_i p_{i.} = 1$.)

Exercise 14.10: In an I x I contingency table, show that the hypothesis of complete symmetry, $H_0: p_{ij} = p_{ji}$, $i, j = 1, 2, \ldots, I$, can be tested by the statistic

$$ X^2 = \sum_{i<j} \frac{(n_{ij} - n_{ji})^2}{n_{ij} + n_{ji}} $$

which, under $H_0$, is asymptotically distributed as $\chi^2(\tfrac{1}{2} I(I-1))$.

14.5.2 Example: Independence of momentum components

As an example on a test of independence in a two-way classification, consider the data of Fig. 14.8 which shows in the form of a scatter diagram the distribution of centre-of-mass momentum components for a sample of 670 $\Lambda$ hyperons produced in pp collisions at 19 GeV/c.

We want to test, on the basis of this information, whether the longitudinal and transverse momentum components can be regarded as independent variables in the underlying distribution. The following contingency table summarizes the data according to a chosen subdivision with 4 categories for the transverse momentum (classification A) and 10 categories for the longitudinal momentum (classification B):
I I

-5
u 1.2 A1
A~
B1

20

20
B2

23
8
B3

13

22
B4

12

26
B5

13

15
E6

11

14
8,

21
7 5

20
B9

9
7
BI0

10 /
n.1.

106 j

-
0
'a
0
22

18
15

25
24

19
35

25
10

25
19

19
22

18 6
8

7 1 8 2 '
I
E
$n_{.j}$    81   71   75   81   88   60   66   65   30   53    670

EE i
0
By carrying out the summation according to eq.(14.87) with the numbers in this table one finds that the data correspond to $X^2_{obs} = 39.8$. The number of degrees of freedom, from eq.(14.88), is equal to (4-1)(10-1) = 27. Hence the chi-square probability for independence in the two momentum components is deduced to be about 5%.
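The arithmetic of this example can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `independence_x2` and `chi2_sf` are invented here, the small 2 x 3 table of counts is made up and is not the Λ-hyperon data, and the chi-square tail probability is evaluated with the standard finite series for integer degrees of freedom rather than read from a table. The statistic is computed both as the sum of squared deviations from the estimated cell contents $n_{i.} n_{.j}/n$ and in the usual computational form $n(\sum\sum n_{ij}^2/(n_{i.} n_{.j}) - 1)$, with $(I-1)(J-1)$ degrees of freedom as in eq.(14.88).

```python
import math


def independence_x2(table):
    """Return (X2, X2 in computational form, degrees of freedom) for a two-way table."""
    n_row = [sum(row) for row in table]          # marginal counts n_i.
    n_col = [sum(col) for col in zip(*table)]    # marginal counts n_.j
    n = sum(n_row)                               # total number of observations
    x2 = 0.0
    s = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            expected = n_row[i] * n_col[j] / n   # estimated cell content under H0
            x2 += (n_ij - expected) ** 2 / expected
            s += n_ij ** 2 / (n_row[i] * n_col[j])
    x2_alt = n * (s - 1.0)                       # form more convenient for computation
    ndf = (len(n_row) - 1) * (len(n_col) - 1)    # (I-1)(J-1), eq.(14.88)
    return x2, x2_alt, ndf


def chi2_sf(x, ndf):
    """Upper-tail chi-square probability P(chi2 > x) for integer ndf >= 1."""
    if x <= 0:
        return 1.0
    if ndf % 2 == 0:
        # even ndf: exp(-x/2) * sum_{r=0}^{ndf/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, ndf // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # odd ndf: erfc(chi/sqrt 2) + 2*phi(chi) * sum_{r=1}^{(ndf-1)/2} chi^(2r-1)/(1*3*...*(2r-1))
    chi = math.sqrt(x)
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (ndf - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * phi * total


# Hypothetical counts, for illustration only (not the data of the example).
x2, x2_alt, ndf = independence_x2([[10, 20, 30], [20, 20, 20]])
print(x2, x2_alt, ndf, chi2_sf(x2, ndf))

# Tail probability for the values quoted in the text: X2_obs = 39.8 with 27 degrees of freedom.
print(chi2_sf(39.8, 27))
```

The last line should come out close to the "about 5%" quoted above; the same `chi2_sf` helper can be used to check other chi-square probabilities quoted in this chapter, for instance the value of about 0.40 for $\chi^2_{obs} = 3.0$ with 3 degrees of freedom.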
Fig. 14.8. Scatter diagram of centre-of-mass momentum components of $\Lambda$ hyperons. Horizontal axis: longitudinal momentum $p_L$ (GeV/c).

14.6 TESTS OF CONSISTENCY AND RANDOMNESS

When a set of observations is used to estimate the parameters entering a p.d.f., it is generally tacitly assumed that the observations represent a sample which has been drawn at random from the population or universe. Thus the notion of a random sample presupposes that the observations acquired are typical and representative for the underlying distribution. If the assumption about randomness is not fulfilled the conclusions concerning the properties of the
population may be wrong or misleading. It is therefore of importance to have available some standard procedures for testing whether sets of observations may be regarded as random and free of systematic effects.

Particle physicists frequently find themselves in situations which call for investigation of randomness and consistency. When observations have been obtained through a series of measurements extending over time or space it may be necessary to check that the experimental conditions have remained the same throughout the experiment. Similarly, the data may have been acquired in two or more runs with complicated experimental set-ups, or collected by different laboratories participating in a collaboration experiment. In such situations, before any inferences are made from the combined data, it is important that consistency checks are performed to ensure that systematic differences do not exist between the separate samples. Likewise, before different experimental estimates of some parameter are combined to obtain a "best average" or "pooled estimate", it must be checked that the individual estimates do not depend on particular assumptions which are different for the different estimates. For example, if the values of mass and width of a resonance have been estimated by different groups it will be unjustified to deduce pooled estimates of the resonance parameters if the values reported by the separate groups have been derived from the raw observations using dissimilar assumptions about the resonance shape.

We saw in Sect.14.3 how tests of consistency can be formulated for observations which are normally distributed. Indeed, given the task of testing compatibility between experimental results, most physicists would without hesitance apply the procedures of Sects.14.3.2 and 14.3.7. Only seldom is the normality of the observations explicitly demonstrated to justify the use of these classical tests. It appears that normality is often taken for granted, although it is, admittedly, a very specific assumption about the nature of the universe from which the observations originate, and certainly not an indisputable fact.

Fortunately, theoretical studies have shown that certain test procedures are relatively insensitive to the specific form of the underlying distribution. These procedures possess a property aptly called robustness. This applies, for example, to the tests on population means referenced above, and may justify their use in cases where no dramatic deviation from normal behaviour is expected. On the other hand, tests on population variances are very sensitive to departures from normality, restricting the usefulness of the procedures of Sect.14.3.3 to situations where the observations are manifestly close to normal.

In the following we shall for the investigation of randomness and consistency formulate various tests which avoid making specific assumptions about the form of the populations. These distribution-free tests are therefore generally valid, regardless of the underlying population; the distribution of their statistic is determined by the number of equivalent permutations of elementary equiprobable events and can, at least in principle, for finite samples be derived from purely combinatorial arguments. In abandoning the common normal theory methods for the more general distribution-free procedures one may have to pay a certain price, that of losing "efficiency", or relative power. However, although it is generally true that distribution-free approaches are less efficient than tailored tests based on normality assumptions, theoretical investigations have shown that for populations which are not normal, the distribution-free tests may even be superior.

To test consistency between two or more experimental samples we shall from now on make no further assumption about the underlying distributions except that they are all equal. For observations of the continuous type these tests of homogeneity imply the null hypothesis

$$ H_0: \; f_1(x) = f_2(x) = \ldots \tag{14.89} $$

where $f(x)$ is left unspecified. We will describe three different tests which can be used when the comparison involves only two samples. Of these, the run test is particularly simple, since it requires little more than just counting, while the Kolmogorov-Smirnov and the Wilcoxon rank sum two-sample tests are somewhat more sophisticated and may require some computational effort. The recommendation for practical work is, if inconsistency is suspected, to apply the run test first to see whether this simple test is capable of rejecting $H_0$; if the run test is inconclusive, the other tests should be applied in turn. These tests exploit more fully the information in the data and will be more powerful in detecting possible inconsistencies. The run test has other useful applications; for example it can be used to give a rough check as to whether a set of observations is free from systematic trends. It can also be used to supplement Pearson's $\chi^2$ test for goodness-of-fit, of which it is independent under some conditions. This is an
interesting feature because, in general, different tests on the same data are not independent, and hence the combining of outcomes of different tests is not trivial.

With more than two samples the hypothesis (14.89) can be tested by the Kruskal-Wallis rank test. When the underlying distributions are of the discrete type, the analogous multi-sample hypothesis can be examined by applying the well-known $\chi^2$ test; this is described in Sect.14.6.12.

14.6.1 Sign test

A simple way of recording data is to note only whether each observation is smaller than, or larger than, some specified value. Although this rough method may imply the loss of a considerable amount of information in the observations, it is possible to construct useful tests for this kind of data. These sign tests are based on the binomial distribution law, which describes experiments with only two possible outcomes for individual events.

Let the variable x have a distribution of median value $\mu$. We want to test the simple null hypothesis

$$ H_0: \; \mu = \mu_0 \tag{14.90} $$

against the composite alternative

$$ H_1: \; \mu \ne \mu_0 . \tag{14.91} $$

Suppose that of n observations on x, r observations are smaller than $\mu_0$, while n - r are larger than this value. Clearly, if $H_0$ is true, the distribution of the variables r and n - r will correspond to the binomial law with equal probabilities for the two outcomes $x < \mu_0$ and $x > \mu_0$ for the individual observations. Thus, the critical values $r_{\alpha/2}$ and $r_{1-\alpha/2}$ can be determined from tabulations of the cumulative binomial distribution. Since the statistic r is discrete one can, for a probability $\alpha$ in the lower tail, define the critical value $r_\alpha$ as that integer which satisfies the inequality

or, with the notation of Appendix Table A2.

The sign test can be used to test whether a variable when recorded as a function of time remains "constant" and equal to a fixed value, or tends to change with time. Suppose, for instance, that in a bubble chamber exposure one may want to check that the (average) number of beam particles per pulse remains the same during the whole run, and equal to the optimum number requested by the designers of the experiment. If the beam intensity should vary significantly during the exposure, either by falling below the requested value (with the consequence of loss of useful events) or by rising well above it (too dirty pictures), one would adjust the experimental conditions and monitor the beam in such a way that the intensity is brought back to the optimum value.

Numerically, let us take a sample size 20 and choose a significance level of 20%. By looking up the entries for $n = 20$, $p = 0.50$ in Appendix Table A2 we see that $r = 6$ corresponds to a probability which is smaller than $\alpha/2 = 0.10$
Ho
v a l w (F(7;20,0.50) -
(oince € ( 6 ; 2 0 , 0 . 5 0 ) = 0 . 0 5 7 7 ) , w h i l e r - 7 g i v e s a p r o b a b i l i t y h i g h e r than t h i s
0.1 316). Hence t h e l o v e r c r i t i c a l v a l u e i s ra12= r
S i m i l a r l y , the "pper ~ r i t i c a lv a l u e i s r1-a,2 =
=6

r,90 = 1 4 , t h e two v a l u e s b e i n g s y n r
.
t r u e i m p l i e s a n e x p e c t e d v a l u e of r e q u a l t o i n , and very s m l l v a l u e s of r as
w e l l as very l a r g e v a l u e s (near n) are u n l i k e l y . To t e s t H a t t h e s i g n i f i c a n c e a e t r i c a l l y positioned r e l a t i v e l y t o t h e e x p e c t a t i o n v a l u e f a r r , which is h e r e
a we may t h e r e f o r e use the n u d e r r as a t e s t s t a t i s t i c and t a k e t h e r e j e c t i o n 1.20- 10. - T h e r e f o r e . i f d u r i n g t h e e x p o s u r e we made a count on 20 randomly se-
r e g i o n a t t h e two t a i l s of t h e binomial d i s t r i b u t i o n B ( r ; n , p = l ) . That is, ye l e c t e d picturesand found t h a t t h e n u d e r of t r a c k s was s m a l l e r t h a n t h e optimum
s h a l l r e j e c t t h e assumption o f a p o p u l a r i o n median e q u a l t o uo i f , among n nulrber u, i n mre t h a n 6 b u t i n less t h a n 14 p i c t u r e s , t h e n we would b e s a t i s f i e d
o b s e r v a t i o n s , t h e number of t i m e s r v h e n x i s s m a l l e r t h a n J
! i s such t h a t with the s t a t e of a f f a i r s , and b e l i e v e i n t h e assumption of a c o n s t a n t beam i n t e n -
0
Sity. I f , on t h e o r h e r hand, we found t h a t t h e n u d e r of p i c t u r e s h a v i n g l e s s
(14.92)
! than yo t r a c k s was either I 6 , or 1 1 4 , we would r e j e c t H on t h e b a s i s o f t h i s
test. If repeated counts on a new sample of 20 pictures gave similar results, or if a closer examination of the number of tracks on each picture suggested a shift towards, say, a lower intensity, we would presumably adjust the conditions to bring the intensity up to the optimal value before continuing the exposure.

The sign test assumes, for $H_0$ true, each of the $n$ observations $x_1,x_2,\ldots,x_n$ to have a constant probability for being smaller than the specified value $\mu_0$. In this respect the order of the individual observations is immaterial, and the sequence is random under $H_0$. Instead of classifying the observations relative to a fixed, outside value $\mu_0$, one can design a simple test of randomness by classifying the individual measurements with reference to one of the sample values, for example, relative to the sample median, see Sect. 14.6.4.

14.6.2 Run test for comparison of two samples

To introduce the run test, suppose that we have two series of observations which have been arranged in terms of increasing (or decreasing) magnitude, giving the two ordered samples $x_1,x_2,\ldots,x_n$ and $y_1,y_2,\ldots,y_m$. Without loss of generality we assume for simplicity in the following that $n \le m$, since the case with $n>m$ can be considered by merely interchanging the $x$ and $y$ notation. We then combine the two ordered samples and arrange all the $(n+m)$ observations in increasing order of magnitude, the result being for instance a series of the form

The assumption of a common parent population implies that the $x$'s and $y$'s in this series should be well mixed, and if such a pattern is obtained it will be taken as support for $H_0$, while a pattern with, say, a preponderance of $x$'s to the left of the chain and $y$'s to the right will indicate systematic differences between the two sets of data.

A run is defined as a sequence of symbols of the same kind. Thus the chain above starts with a run of two $x$'s, then follows a run of one $y$, and so on; altogether six runs are shown. The null hypothesis suggests that the number of runs $r$ will be fairly large for the particular values of $n$ and $m$. If the number of runs comes out very small compared to the maximum possible one might suspect that the probability for a definite outcome of one measurement has not remained the same in the $x$ and $y$ series. If, on the other hand, the number of runs is very large it is possible that the two series have not been obtained independently. Thus $r$ very small or $r$ very large could indicate that the assumption of consistency does not hold.

To use the number of runs $r$ as a test statistic for the hypothesis $H_0$ we have to find the probability distribution of $r$ assuming $H_0$ to be true. A total of $(n+m)$ quantities can be arranged in $(n+m)!$ different ways. Since, however, the mutual order within the $x$'s as well as within the $y$'s has already been fixed, we can only have $(n+m)!/(n!\,m!) = \binom{n+m}{n}$ different permutations and, provided $H_0$ is true, each of these permutations will have the same probability of occurring. To find the probability for a particular number of runs, say $r$, one must count all permutations giving rise to just $r$ runs. This is a combinatorial problem which can be solved in a straightforward manner. The result of the computation is that the probability distribution for $r$ is given by ($2 \le r \le n+m$)

$$ P(r) = \begin{cases} \dfrac{2\binom{n-1}{r/2-1}\binom{m-1}{r/2-1}}{\binom{n+m}{n}}, & r \text{ even},\\[2ex] \dfrac{\binom{n-1}{(r-1)/2}\binom{m-1}{(r-3)/2}+\binom{n-1}{(r-3)/2}\binom{m-1}{(r-1)/2}}{\binom{n+m}{n}}, & r \text{ odd}. \end{cases} \tag{14.94} $$

This distribution has mean and variance given by

$$ E(r) = \frac{2nm}{n+m}+1, \qquad V(r) = \frac{2nm\,(2nm-n-m)}{(n+m)^2\,(n+m-1)}. \tag{14.95} $$

To test $H_0$ it is customary to take the critical region only at the lower tail of the run distribution. The argument for adopting a left-tail test is that it appears rather improbable to obtain a very large number of runs if $H_0$ is not true. A shift in location, or a difference in dispersion between the two parent populations, would most likely lead to less randomness and greater order in the combined series of $x$'s and $y$'s. Since all reasonable ("smooth") alternatives to $H_0$ therefore in all likelihood would give rise to predominantly small values of $r$, the critical region should be restricted to the lower tail of the distribution.

Critical values of the run statistic are given in Appendix Table A11 for sample sizes up to 15 and for 4 different significance levels. Since $r$ is a discrete variable the critical value $r_\alpha$ corresponding to the significance $\alpha$ is
taken as that integer which satisfies the inequality

$$ \sum_{r'=2}^{r_\alpha} P(r') \le \alpha < \sum_{r'=2}^{r_\alpha+1} P(r'). $$

The run test is simple and easy to apply when tabulations are at hand. For example, with two samples of size $n=6$, $m=8$ one finds by looking up the appropriate entries in Appendix Table A11 that the assumption of consistency between the two samples will have to be rejected at the significance level 5% (or 1%) if the number of runs observed is $\le 4$ (or $\le 3$). However, it is clear that of all the information contained in the data, only little is actually used by this test. This means that the run test may fail to reject a hypothesis of consistency, while other, more elaborate tests which explore the data more thoroughly may produce evidence for rejection.

For large samples, which in practice often is taken to mean $m$ and $n$ larger than 10, the probability distribution of eq.(14.94) is very close to normal. The appropriate run test statistic in this situation is

$$ z = \frac{r - E(r)}{\sqrt{V(r)}}, \tag{14.97} $$

which is approximately $N(0,1)$.

The run test in the form described above is one of the least powerful distribution-free tests. It is in fact only meaningful when applied to samples of comparable size. If one sample is very much larger than the other then, in the combined ordered series, the observations from the smallest sample will almost certainly be separated from each other by observations from the larger sample; hence the number of runs will tend to its maximum, regardless of whether the assumption of identical parent populations is true or not.

Other tests based on runs have been devised, some of which exploit the information in the data more fully; one, for example, takes as a test statistic the length of the longest run. For a description of this and other run tests the reader should consult more specialized literature.

14.6.3 Example: Consistency between two effective-mass samples

To study the production of $\pi^0$ in antineutrino induced reactions in a liquid bubble chamber two laboratories have measured electron-positron pairs, or $\gamma$'s, pointing towards the reaction point and calculated the effective-mass $m_{\gamma\gamma}$ of pairs of $\gamma$'s. One wants to test if the results from the two laboratories are consistent, given the following ordered samples of effective-masses (numbers in MeV):

With these numbers the combined ordered series is

which means that the observed number of runs is equal to 24. From eq.(14.95) the expected value and variance for the variable $r$ with sample sizes $n=28$ and $m=32$ are, respectively,

$$ E(r) = \frac{2\cdot28\cdot32}{28+32}+1 = 30.87, \qquad V(r) = \frac{2\cdot28\cdot32\,(2\cdot28\cdot32-28-32)}{(28+32)^2\,(28+32-1)} = 14.61. $$

Using the large sample approximation eq.(14.97) for the test statistic we find that the actual value from the observations is

$$ z_{\text{obs}} = \frac{24-30.87}{\sqrt{14.61}} = -1.80. $$

Since the probability for a standard normal variable to be smaller than $-1.80$ is $G(-1.80) = 1 - G(1.80) = 0.036$, the hypothesis of consistency between the two
samples will therefore have to be rejected from the run test at the 5% level.

By inspection of the original sample values it is seen that the striking difference between the two series of measurements is the large number of small $m_{\gamma\gamma}$ values observed by laboratory Y which are missing for laboratory X. As physicists we may try to explain the discrepancy between the data sets as being due to an experimental bias: the events with small values of $m_{\gamma\gamma}$ could be "wrong" events, in which, for example, one $\gamma$ from $\pi^0$ decay was erroneously combined with a bremsstrahlung pair from the same $\gamma$. Therefore, given the data sets above, it is natural to ask laboratory Y to look more closely at their low-mass events on the scan table.

If one disregards the seven events with $m_{\gamma\gamma} < 40$ MeV from laboratory Y, one finds that the data for the samples of size 28 and 25 correspond to a run probability of about 0.17. Hence with the reduced number of events the run test finds no incompatibility between the two samples even at the 10% level.

14.6.4 Run test for checking randomness within one sample

The run test procedure for testing whether two samples have the same parent population can readily be adapted to test the assumption that a series of observations obtained sequentially, for instance by measurements performed at different times, can be considered free from systematic trends. The hypothesis of randomness is that all observations measure the same quantity, which implies that the order of the observations is immaterial.

Let the elements in a time-ordered series of observations be classified relative to some value of the sample, such that an observation above this value is labelled by A and an observation below it by B. Observations coinciding with the chosen value can be ignored. The hypothesis $H_0$ implies that at every position in the sequence the probability to have an A is the same, i.e. the probability for an A remains constant along the sequence, and likewise for B. The resulting series of A's and B's is then a pattern of symbols with properties analogous to the series of $x$'s and $y$'s of Sect. 14.6.2. We may therefore use the formulae for the run statistic given earlier. These become particularly simple if we choose to classify the observations relative to the sample median, since by definition the number of A's and B's will then be equal. By putting $n = m$ in eq.(14.95) the mean and variance for the number of runs become, respectively,

$$ E(r) = n+1, \qquad V(r) = \frac{n(n-1)}{2n-1}. \tag{14.98} $$

For large samples the run statistic is then approximately $N(n,\tfrac12 n)$.

14.6.5 Example: Time variation of beam momentum

As an application of the run test on one series of observations, let us consider measurements on the beam momentum in a bubble chamber experiment. Suppose that measurements on 30 rolls of film ordered according to the time of exposure gave the following values for the average momentum of the incident tracks (numbers in GeV/c):

18.90 18.88 18.94 18.91 18.96 19.05 19.06 19.08 19.03 19.10
19.07 19.12 19.13 19.10 19.15 19.20 19.17 19.14 19.14 19.10
19.11 19.08 19.08 19.07 19.03 18.98 19.00 18.97 18.94 18.95

Do these numbers support the hypothesis $H_0$ of a constant beam momentum during the exposure?

Already from an inspection of the control chart for the measurements above, shown in Fig. 14.9, one would be inclined to reject the assumption of constancy in the beam momentum. Indeed, the evidence from this chart is that the momentum has first increased, then decreased during the exposure.

Fig. 14.9. Control chart for beam measurements (horizontal axis: film roll number).

We seek a numerical measure for our disbelief in $H_0$. The median value for the observations is 19.07 GeV/c, and the classification relative to this value gives the following series after the two observations coinciding with the median are ignored:

B B B B B B B A B A A A A A A A A A A A A A B B B B B B
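The run count of the series above, and the corresponding lower-tail probability, can be checked mechanically. The following Python sketch counts the runs in a string of A/B labels and sums the exact run probabilities, using the standard closed form of the run distribution, which is assumed here to be what eq.(14.94) expresses. The function names `count_runs`, `run_pmf` and `run_cdf` are illustrative only and do not come from the text.

```python
from math import comb


def count_runs(symbols):
    """Number of maximal blocks of identical consecutive symbols."""
    if not symbols:
        return 0
    return 1 + sum(1 for a, b in zip(symbols, symbols[1:]) if a != b)


def run_pmf(r, n, m):
    """P(exactly r runs) for n symbols of one kind and m of the other,
    all orderings being equally probable (assumed form of eq. 14.94)."""
    if r < 2 or n < 1 or m < 1:
        return 0.0
    total = comb(n + m, n)
    if r % 2 == 0:
        k = r // 2          # each kind of symbol forms k runs
        ways = 2 * comb(n - 1, k - 1) * comb(m - 1, k - 1)
    else:
        k = (r - 1) // 2    # one kind forms k + 1 runs, the other k
        ways = (comb(n - 1, k) * comb(m - 1, k - 1)
                + comb(n - 1, k - 1) * comb(m - 1, k))
    return ways / total


def run_cdf(r, n, m):
    """P(number of runs <= r), the lower-tail probability used by the test."""
    return sum(run_pmf(j, n, m) for j in range(2, r + 1))


# The classified beam-momentum series: 7 B, 1 A, 1 B, 13 A, 6 B.
series = "B" * 7 + "A" + "B" + "A" * 13 + "B" * 6
n, m = series.count("A"), series.count("B")
r = count_runs(series)
p_lower = run_cdf(r, n, m)
print(n, m, r, p_lower)
```

Summing the exact probabilities is practical here because the sample is small; for larger samples the normal approximation of eq.(14.97) would be the more usual choice.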
This series has a considerable degree of order with only 5 runs among the 28 symbols. From Appendix Table A11 we see that, for $n = m = 14$, the critical values $r_\alpha$ for the significances $\alpha = 0.05$, 0.025, 0.01, and 0.005 are, respectively, 10, 9, 8, and 7. Hence the probability to have as little as 5 runs must be considerably smaller than ½%.

From eq.(14.98) the expectation value and variance for the number of runs $r$ are, respectively,

$$ E(r) = 14+1 = 15, \qquad V(r) = \frac{14\cdot 13}{2\cdot 14-1} = 6.74. $$

If we adopt the large sample approximation in this case the actual value of the approximate $N(0,1)$ test statistic of eq.(14.97) becomes

$$ z_{\text{obs}} = \frac{5-15}{\sqrt{6.74}} = -3.85. $$

Hence the probability to have 5 or less runs among the 28 symbols with this approximation and the accuracy of Appendix Table A6 is $G(-3.85) = 1-0.99994 = 6\cdot10^{-5}$.

Exercise 14.11: For the example above, show that, for $H_0$ true, the exact probability to have 5 or less runs is $5.97\cdot10^{-5}$.

14.6.6 Run test as a supplement to Pearson's $\chi^2$ test

The Pearson $\chi^2$ test for goodness-of-fit is based on a test statistic which is a sum of terms, each involving the square of the deviation between observed and predicted value in the different classes. From the construction of the test statistic this test therefore has the defect that knowledge regarding the signs of the deviations in the individual classes gets lost, and so does the order in which these deviations occur. The $\chi^2$ test is, in other words, insensitive to the pattern of signs in the deviations. This pattern evidently contains some useful information about the correspondence between experiment and prediction, and should be possible to explore by a run test, which appears to suggest itself as an attractive supplement to the $\chi^2$ test. In fact, since it utilizes a limited amount of information in the observations, the run test is particularly useful only when used in conjunction with a $\chi^2$ test on the same data.

For definiteness, consider the three situations sketched in Fig. 14.10. In (a) the prediction from the hypothesis under test roughly follows the observations over the variable range, resulting in a series of deviations between observed and hypothetical values which alternate in sign and hence give a fairly large number of runs. If the hypothetical distribution differs substantially from the observed in location, as in (b), it is clear that there will be a sequence of positive signs followed by a sequence of negative signs. Similarly, if the distributions differ mainly in shape, as in (c), the signs will occur in a sequence of negative, positive, negative. Thus in both situations (b) and (c) the signs tend to being equal over large parts of the variable range, in contrast to the more random pattern expected if the two distributions had greater similarity. Since, therefore, alternatives to the true hypothesis most likely will lead to few runs, the critical region for the run test must be at the lower tail of the run distribution.

Fig. 14.10. Observed and hypothetical distributions, (a) comparable in shape and location, (b) differing in location, (c) differing in shape.

Let us assume that the observations are grouped in classes for the $\chi^2$ test, and that the total number of expected events under $H_0$ is the same as the total number of observed events. Suppose further that there are $n$ classes where the deviation between observed and predicted value is positive, and $m$ classes where it is negative. We count the number of runs in the sequence of signs and can find the run probability in the usual manner for the given $n$, $m$.

The implicit assumption in this procedure is that the order in which the $n+m$ signs occur is immaterial if $H_0$ is true. In any class the probability to have a positive deviation is the same as the probability for a negative deviation. This will be the case if the conditions for the $\chi^2$ test are satisfied, since then the number in any class is normally distributed. It has been verified, however, from an extensive series of random sampling experiments (F.N. David) that the procedure is valid even when the probability of obtaining a positive deviation in a class is four times that of obtaining a negative.
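The bookkeeping described above, from class contents to the triple ($n$, $m$, number of runs), can be sketched in a few lines of Python. The observed and expected class contents below are invented numbers used purely for illustration, and the function name `sign_runs` is hypothetical. Classes in which the deviation is exactly zero are simply skipped, which is an assumption the text does not address.

```python
def sign_runs(observed, expected):
    """Return (n_plus, n_minus, runs) for the signs of observed - expected,
    taken in the order of the classes."""
    signs = []
    for obs, exp in zip(observed, expected):
        if obs > exp:
            signs.append("+")
        elif obs < exp:
            signs.append("-")
        # a class with obs == exp carries no sign and is skipped (assumption)
    runs = 0
    previous = None
    for s in signs:
        if s != previous:   # a new run starts whenever the sign changes
            runs += 1
            previous = s
    return signs.count("+"), signs.count("-"), runs


# Invented class contents; both sets add up to 60 events, as the text requires.
observed = [14, 12, 11, 9, 6, 4, 3, 1]
expected = [10.0, 10.5, 10.0, 9.5, 8.0, 6.0, 4.0, 2.0]

n_plus, n_minus, runs = sign_runs(observed, expected)
print(n_plus, n_minus, runs)
```

For these made-up numbers the sign pattern is three positive deviations followed by five negative ones, i.e. $n = 3$, $m = 5$ and only two runs, the kind of ordered pattern sketched for situation (b) of Fig. 14.10.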
It can be shown that, for a simple hypothesis $H_0$, the run test is asymptotically independent of the $\chi^2$ test, whereas if parameters in $H_0$ are estimated from the data (except the overall normalization) the two tests are not independent and hence it is less meaningful to apply both. When the two tests can be regarded as independent they can be combined into a single test. Let $P_1$ be the probability of a value of $\chi^2$ larger than that observed and $P_2$ the probability of a value of $r$ not larger than that observed. If the sample size is sufficiently large to approximate both probabilities to continuous variables, uniformly distributed between 0 and 1, then the variable

$$ u = -2(\ln P_1 + \ln P_2) \tag{14.99} $$

will be $\chi^2(4)$ (see Exercise 5.6) and an appropriate test statistic. For this combined test the critical region is taken at the upper tail of the chi-square distribution.

14.6.7 Example: Comparison of experimental histogram and theoretical distribution

To illustrate the use of the run test as a supplement to the Pearson $\chi^2$ test for goodness-of-fit we shall refer to Fig. 14.11. The histogram shows the observed distribution of the squared four-momentum transfer $t$ from the target proton to a negative pion in the annihilation

$$ \bar{p} + p \to \pi^+ + \pi^+ + \pi^- + \pi^- $$

at 1.2 GeV/c antiproton momentum. The smooth curve gives the distribution expected from a multi-Regge model of a particular form and specifies the simple null hypothesis $H_0$ to be tested on the basis of the data. The area under the curve has been normalized to the number of events ($n = 990$) in the histogram.

From inspection of the diagram one gets the impression that the observed spectrum is somewhat more concentrated around $t = 0$ than the theoretical distribution.

Fig. 14.11. Experimental and predicted distribution of four-momentum transfer (horizontal axis: four-momentum transfer squared $t$, in (GeV/c)$^2$).

To test the null hypothesis quantitatively one could use the class subdivision implied by the binning of the histogram. However, in view of the requirement of not too few events in each class, it is reasonable to group together several bins at the upper part of the spectrum. If this is done above 1.6 (GeV/c)$^2$ there will be a total of 24 classes, and only one class, the first bin, has an expected number of events below the recommended minimum of 5. The expected number $np_{0i} = f_{0i}$ within each class can be obtained by numerical integration of the curve; the result of the computation is

$$ \chi^2_{\text{obs}} = \sum_{i=1}^{24} \frac{(n_i - f_{0i})^2}{f_{0i}} = \sum_{i=1}^{24} \frac{n_i^2}{f_{0i}} - n = 30.6. $$

Because of the normalization constraint on the theoretical model the $\chi^2$ test has $\nu = 24-1 = 23$ degrees of freedom. The observed value $\chi^2_{\text{obs}}$ then corresponds to a chi-square probability $P_{\chi^2} = P_1 = 0.14$. Hence, from the $\chi^2$ test there is little reason to suspect the hypothesis.

It will be seen from Fig. 14.11 that there are 17 bins where the histogram lies above the theoretical curve and 7 bins where the opposite occurs, with an observed number of runs equal to 5. The probability to have no more than 5
runs a m n g t h e 7+17 s M o 1 s i s . from eq.(14.94). o n l y P2 0.0034. Hence from significance level, t h e c r i t i c a l values i n t h e w o - s q l e
Hence, f a r a given
t h e run t e s t we would c e r t a i n l y r e j e c t t h e p r o p o s i t i o n H
t e s t are always l a r g e r t h a n t h e c r i t i c a l v a l u e s f o r t h e one-sample goodness-of-
For the combined test the actual value of the approximate $\chi^2(4)$ statistic $u$ of eq.(14.99) is

$$ u_{\rm obs} = -2(\ln 0.14 + \ln 0.0034) = 15.3 , $$

which, from Appendix Table A8, corresponds to a combined probability < 0.005.

14.6.8 Kolmogorov-Smirnov test for comparison of two samples

The consistency between two experimental distributions of a continuous variable can also be checked by applying the Kolmogorov-Smirnov two-sample test. This is a distribution-free test which involves the comparison of the two cumulative sample distributions, analogous to the Kolmogorov-Smirnov goodness-of-fit test described earlier for the comparison of one sample and a specified distribution defining $H_0$.

Let $S_m(x)$ and $S_n(x)$ be the cumulative distributions for the two (ordered) samples of size $m$ and $n$, respectively (compare eq.(14.72) of Sect.14.4.6). The Kolmogorov-Smirnov two-sample test statistic $D_{mn}$ is the maximum deviation between these two step functions over the entire variable range,

$$ D_{mn} = \max_x \left| S_m(x) - S_n(x) \right| . $$

Critical values of this statistic can be found, for instance, from Biometrika Tables for Statisticians, Vol.II, for sample sizes up to 25. In the limiting case, it can be shown that, for identical parent populations, i.e. for $H_0$ true,

$$ \lim_{m,n\to\infty} P\left( \sqrt{\frac{mn}{m+n}}\, D_{mn} > z \right) = 2 \sum_{r=1}^{\infty} (-1)^{r-1} e^{-2r^2z^2} . $$

This formula is analogous to eq.(14.75) for the one-sample case, and the probability distribution for the two-sample statistic $D_{mn}$ is therefore related to the distribution for the one-sample statistic $D_n$. For $m$ and $n$ not too small, critical values $D_\alpha$ of $D_{mn}$ can consequently be obtained from the corresponding critical values $d_\alpha$ of $D_n$, which have been tabulated in Appendix Table A10. The relationship is

$$ D_\alpha = d_\alpha \sqrt{\frac{m+n}{m}} . \tag{14.102} $$

The critical values for the two-sample test are thus larger than those for the one-sample goodness-of-fit test, in accordance with common sense expectation. However, when one sample is very much larger than the other, corresponding to very small steps in its cumulative distribution function, the critical values for the two-sample test are only slightly larger than the table values for the smallest sample size. In the extreme case, with $m \gg n$, the two-sample comparison is evidently identical to the goodness-of-fit test for one sample.

[Fig. 14.12. Cumulative distributions of two effective-mass samples. Horizontal axis: effective mass (MeV).]

For an application of the Kolmogorov-Smirnov two-sample test, let us turn back to the example of Sect.14.6.3. The cumulative distributions for the two effective-mass samples have been plotted in Fig. 14.12; the largest vertical separation between the step functions is $D_{\rm obs}$ (the numerical value is not legible in this copy).
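The two-sample statistic and the conversion of a tabulated one-sample critical value can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the toy samples are invented, and the scale factor $\sqrt{(m+n)/m}$ is the form assumed here for eq.(14.102). The numbers 0.2499 and n = 28 are those quoted in this section; m = 32 is inferred from the quoted critical value of 0.34.

```python
import math
from bisect import bisect_right


def ks_two_sample_statistic(xs, ys):
    """Largest vertical distance between the two cumulative step functions."""
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    m, n = len(xs_sorted), len(ys_sorted)
    d_max = 0.0
    # Both step functions change only at observed values, so it is
    # enough to compare them just after each jump.
    for v in xs_sorted + ys_sorted:
        s_m = bisect_right(xs_sorted, v) / m  # fraction of xs <= v
        s_n = bisect_right(ys_sorted, v) / n  # fraction of ys <= v
        d_max = max(d_max, abs(s_m - s_n))
    return d_max


def two_sample_critical_value(d_alpha, n, m):
    """Scale a one-sample critical value d_alpha, tabulated for size n.

    Assumption: the conversion factor is sqrt((m + n) / m).
    """
    return d_alpha * math.sqrt((m + n) / m)


# Hypothetical toy samples, not data from the text.
d_obs = ks_two_sample_statistic([1.0, 3.0], [2.0, 4.0])
d_crit = two_sample_critical_value(0.2499, n=28, m=32)
print(d_obs, round(d_crit, 2))  # 0.5 0.34
```

The hypothesis of consistency would be rejected when the observed deviation exceeds the scaled critical value.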
From Appendix Table A10 the critical value for a one-sample Kolmogorov-Smirnov test at a 5% significance level with $n = 28$ is $d_{.05} = 0.2499$. Hence, from eq.(14.102), the critical value for the two-sample test is

$$ D_{.05} = 0.2499 \cdot \sqrt{\frac{60}{32}} = 0.34 . $$

Since the observed largest deviation $D_{\rm obs}$ exceeds the critical value $D_{.05}$, we must reject the hypothesis of consistency between the two samples at the 5% level on the basis of the Kolmogorov-Smirnov test. This is of course not at all surprising, since even the run test would reject our assumption at the 5% significance level, as we saw in Sect.14.6.3.

Exercise 14.12: Repeat the Kolmogorov-Smirnov test for the example in text using the reduced data samples as explained in Sect.14.6.3. Show that this test, in contrast to the run test, rejects the hypothesis of consistency at the 5% level, also for the revised samples.

14.6.9 Wilcoxon's rank sum test for comparison of two samples

We have so far given several prescriptions for the comparison of two samples, and will now introduce the Wilcoxon two-sample test, or Wilcoxon's rank sum test, for the same problem. We assume again that we have two ordered samples $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_m$ ($n \le m$); we want to test the hypothesis $H_0$ that the two populations from which these samples originate are identical.

As described for the run test in Sect.14.6.2 we arrange the $(n+m)$ observations in increasing order of magnitude. In this combined ordered sample each observation is assigned a rank, equal to the order in which the observation occurs in the series. If some observations happen to be identical ("ties") they are all assigned the average value of the ranks these observations would have if they were distinguishable. The Wilcoxon test statistic $W$ is now constructed as the sum of the $n$ ranks for the observations from the $x$ sample. If $H_0$ is true we expect the $x$ and the $y$ observations to be well mixed in the combined series, and hence the value of $W$ should be not "too small" and not "too large". Conversely, if the value for $W$ comes out either "very small" or "very large" this would mean that the $x$ observations occur predominantly in one end of the combined ordered series, indicating that the hypothesis $H_0$ is less likely to be correct. Hence it seems reasonable to adopt a two-sided test for $H_0$.

Assuming $H_0$ to be true the smallest possible value $W_{\min}$ for the rank sum $W$ of the $x$ sample is obtained when all $x$'s are to the left of the combined ordered series and all $y$'s to the right. Then $W$ is nothing but the sum of the $n$ first integer numbers, or $W_{\min} = n(n+1)/2$. Similarly, the largest possible value $W_{\max}$ will occur when all the $x$'s are to the right and all $y$'s to the left; hence $W_{\max}$ is the sum of all integers from $(m+1)$ to $(m+n)$, or $W_{\max} = n(n+1)/2 + nm$.

The distribution of probabilities $p(W)$ for $W$ between $W_{\min}$ and $W_{\max}$ can be obtained in a similar manner as indicated for the run statistic. The distribution is symmetric, and has mean and variance equal to

$$ \bar W = E(W) = \tfrac{1}{2}\, n(n+m+1) , \qquad V(W) = \tfrac{1}{12}\, nm(n+m+1) . $$

Critical values for the statistic $W$ are reproduced in Appendix Table A12 for sample sizes up to 25 and for 6 different significances $\alpha$. The table assumes one-sided tests, the critical value $W_\alpha$ being defined as that integer value for which

$$ P(W \le W_\alpha) \le \alpha < P(W \le W_\alpha + 1) . $$

To obtain the critical values corresponding to a two-sided test at a significance level $100\alpha\%$ one reads off the lower critical value $W_\ell$ as the table entry in the appropriate column for $\alpha/2$. The upper critical value $W_u$ can then be obtained from the symmetry property of the distribution, which implies

$$ W_u = 2\bar W - W_\ell . $$

The value of $2\bar W$ is also given in the table for each $n,m$ combination.

The distribution for $W$ can be shown to tend to normal for $n$ and $m$ large. For large samples one can therefore use Wilcoxon's test with the approximate N(0,1) statistic
$$ z = \frac{W - \bar W \pm \frac{1}{2}}{\sqrt{V(W)}} , $$

where a "continuity correction" of $-\frac{1}{2}$ or $+\frac{1}{2}$ is added to the numerator depending on whether an upper or lower tail probability is being calculated. The normal approximation is good for most practical purposes with $n$ and $m$ both larger than 10. Even if $n$ is smaller than 10 the approximation is fair, provided that $m$ is not too much larger than $n$ (fairly symmetric probability distribution) and the significance $\alpha$ not too small (below 0.01, say).

Exercise 14.13: (Wilcoxon's rank sum test for randomness within one sample) Discuss how the Wilcoxon rank sum test can be used to test whether a series of measurements is free from systematic trends.

Exercise 14.14: (Wilcoxon's rank sum test for independence) A distribution of two variables $f(x,y)$ is such that the variable $y$ can take on only two values. Show how Wilcoxon's two-sample test can be used to test whether $x$ and $y$ are independent variables.

14.6.10 Example: Consistency test for two sets of measurements of the $\pi^0$ lifetime

The mean lifetime of the $\pi^0$ meson has been measured by several experiments which utilize essentially two different techniques. One set of experiments has used nuclear emulsions as detecting device, the other has used counter detectors (see Exercise 14.7). Ignoring for the moment the different accuracies of the experiments the results can be summarized by the following numbers giving the measured mean lifetime in units of $10^{-16}$ seconds:

    Counter technique:  0.56, 0.6, 0.73, 0.9, 1.05
    Nuclear emulsions:  1.0, 1.6, 1.7, 1.9, 2.3, 2.8

The question is, do the different techniques provide consistent results at a significance level of 5%?

We want here to apply a two-sided Wilcoxon rank sum test for the hypothesis of equal population means on the basis of the two samples of sizes $n = 5$, $m = 6$. From Appendix Table A12 the lower critical limit for the $W$ statistic with these numbers is $W_\ell = W_{.025} = 18$. The upper critical limit becomes $W_u = 2\bar W - W_\ell = 60 - 18 = 42$, which is also given in the table. Hence we shall reject the hypothesis of a common population mean at the 5% level if we find a rank sum for the smallest sample which is $\le 18$ or $\ge 42$.

The observations above correspond to the following combined ordered sample,

    _0.56_  _0.6_  _0.73_  _0.9_  1.0  _1.05_  1.6  1.7  1.9  2.3  2.8

where the contributions from the counter experiments have been underlined. Thus the actual value of the rank sum becomes

$$ W_{\rm obs} = 1 + 2 + 3 + 4 + 6 = 16 , $$

which is below the critical limit for the test with the chosen significance level. Accordingly, from the Wilcoxon rank sum test the two sets of measurements of the lifetime are not consistent at the 5% level.

The ordered sample of the two measurement series corresponds to 4 runs. It will be seen from Appendix Table A11 that the critical number of runs for $n = 5$, $m = 6$ and $\alpha = 0.05$ is $r_{.05} = 3$. Hence, from the run test we would have no reason to claim that the results of the two sets of measurements are not consistent at a significance 0.05. In fact, the probability to have 4 or less runs is 0.0644. This illustrates that the Wilcoxon test is more capable than the simple run test in rejecting a hypothesis.

14.6.11 Kruskal-Wallis rank test for comparison of several samples

When more than two data samples are to be checked for consistency, the two-sample tests described in the preceding sections may be applied for a pair-wise comparison of any two samples. With $J$ samples this would imply a total of $\frac{1}{2}J(J-1)$ two-sample comparisons to be carried out. Also, one may compare each of the $J$ samples with "the average" of all samples by the same two-sample procedures. Obviously, the number of such comparisons will soon become large, making the process a lengthy one if the number of samples is not relatively small. By random fluctuations, even if all samples do originate from the same parent population, one is liable to find at least two samples that appear to be inconsistent, and hence the risk of rejecting a true hypothesis may become considerable. Moreover, this approach will not give a measure of the overall agreement between all samples.

An efficient method for the simultaneous comparison of any number of samples is the Kruskal-Wallis rank test. Suppose that the complete set of $N$ observations from $J$ samples is arranged according to magnitude, such that each observation is assigned a rank between 1 and $N$. For each sample one finds the rank sum $W_j$ as well as the mean rank $\bar W_j = W_j/n_j$, where $n_j$ is the number of observations in the $j$-th sample. If the assumption of $J$ identical parent populations
is correct, i.e. if $H_0$ is true, all samples are expected to have the same mean rank, $\frac{1}{2}(N+1)$, with the same unbiassed variance, $\frac{1}{12}N(N+1)$. A suggestive test statistic with weighted contributions from the different samples is

$$ H = \frac{12}{N(N+1)} \sum_{j=1}^{J} n_j \left( \bar W_j - \frac{N+1}{2} \right)^2 , \tag{14.109} $$

which can be rewritten in the following form, convenient for computation,

$$ H = \frac{12}{N(N+1)} \sum_{j=1}^{J} \frac{W_j^2}{n_j} - 3(N+1) . \tag{14.110} $$

Since $H$ will be zero if the $\bar W_j$ come out equal, and large if the $\bar W_j$ are substantially different, the hypothesis of a common parent population should be rejected if the observed value $H_{\rm obs}$ exceeds the critical value $H_\alpha$ corresponding to the chosen significance $\alpha$. In order to determine these critical limits one must know the probability distribution for $H$ assuming the null hypothesis to be true. In principle, these (discrete) probabilities can be obtained from purely combinatorial arguments (assuming no "tied ranks") for any set of samples $n_1, n_2, \ldots, n_J$, starting with small numbers. Unfortunately, unless the numbers are tiny the amount of required work soon becomes formidable, and tabulating according to many arguments also becomes impracticable. Accordingly, Kruskal and Wallis give tables of critical values corresponding to significances between 1 and 10% for 3 samples of size not exceeding 5.

In practice one makes use of the fact that for $H_0$ true, the statistic $H$ for sufficiently large $n_j$ has a chi-square distribution with $J-1$ degrees of freedom. The $\chi^2$-approximation is generally accepted when either $J = 3$ and all sample sizes are above 5, or $J \ge 4$ and all sample sizes above 4.

Exercise 14.15: Justify the statement that $H$ of eq.(14.109) is asymptotically distributed as $\chi^2(J-1)$ if $H_0$ is true.

Exercise 14.16: Show that eq.(14.110) follows from eq.(14.109).

Exercise 14.17: Show that a test with the statistic $N/(N-1) \cdot H$ for $J = 2$ is equivalent to the Wilcoxon two-sample test.

14.6.12 The $\chi^2$ test for comparison of histograms

In the preceding discussion of rank test procedures for the comparison of experimental samples it has been tacitly assumed that the underlying distributions are continuous, in order that the rank assignments be meaningful. The foregoing procedures are therefore not applicable for testing the consistency of samples which are known to originate from discrete populations, neither will they apply for the comparison of samples consisting of "pooled" observations, where individual measurements have been grouped in categories prior to comparison.

To be specific, let us think of the common situation when a set of histograms is to be checked for consistency. The observations have been classified in $I$ bins, equally chosen for all $J$ histograms, and correspond to $J$ independent multinomial distributions, the $j$-th of which having the following set of bin probabilities,

$$ p_{1j}, p_{2j}, \ldots, p_{Ij} ; \qquad \sum_{i=1}^{I} p_{ij} = 1 . $$

The observed number of events in the different bins and the total number of events in the $j$-th histogram are given as

$$ n_{1j}, n_{2j}, \ldots, n_{Ij} ; \qquad \sum_{i=1}^{I} n_{ij} = n_{.j} . $$

For all histograms the overall number of observations is $n$,

$$ \sum_{j=1}^{J} n_{.j} = \sum_{j=1}^{J} \sum_{i=1}^{I} n_{ij} = n . \tag{14.113} $$

The hypothesis we wish to test is that all parent distributions are identical, corresponding to, for each bin number $i$, a common probability for all $J$ histograms,

$$ H_0 : \; p_{i1} = p_{i2} = \cdots = p_{iJ} , \qquad i = 1, 2, \ldots, I . \tag{14.114} $$
Let us denote the common, unknown, probabilities by $p_{i.}$, $i = 1, 2, \ldots, I$. The Maximum-Likelihood estimates for these bin probabilities are the average frequencies observed for each of the bins,

$$ \hat p_{i.} = \frac{1}{n} \sum_{j=1}^{J} n_{ij} , \qquad i = 1, 2, \ldots, I , $$

which are seen to satisfy the requirement $\sum_{i=1}^{I} \hat p_{i.} = 1$ in virtue of the constraint on the $n_{ij}$, eq.(14.113); thus only $I-1$ of the estimated bin probabilities are independent. The test statistic for the comparison of all histograms simultaneously is constructed as the sum over all histograms and bins of altogether $J \cdot I$ terms, each term being a squared deviation between an observed and estimated number, divided by the estimated number,

$$ \chi^2 = \sum_{j=1}^{J} \sum_{i=1}^{I} \frac{(n_{ij} - n_{.j}\hat p_{i.})^2}{n_{.j}\hat p_{i.}} . $$

If $H_0$ is true and the expected event numbers in all histogram bins fulfil the usual normality requirement, this statistic will be approximately chi-square distributed. The number of degrees of freedom is $(I-1)(J-1)$, corresponding to the present number of independent observations $(IJ-J)$ minus the number of independently estimated parameters, $I-1$.

The assumption of consistency between all $J$ histograms is accepted at the chosen significance level $100\alpha\%$ if the calculated value $\chi^2_{\rm obs}$ comes out smaller than the critical value $\chi^2_\alpha$ for the appropriate number of degrees of freedom, and rejected if the opposite occurs. Quite frequently when $\chi^2_{\rm obs} > \chi^2_\alpha$ the overall inconsistency can be traced to a single histogram, say the $j$-th, having an exceptionally large contribution to $\chi^2_{\rm obs}$. A repeated calculation with the $j$-th histogram excluded may then show the remaining histograms to be mutually compatible and suggest a critical examination of the data for the odd histogram.

Exercise 14.18: Show that the problem of testing consistency between histograms is equivalent to testing independence in a two-way classification.

Exercise 14.19: Four laboratories participating in a collaboration experiment have scanned their bubble chamber films for five different event topologies and have obtained the following number of events:

    Laboratory        Event topology
    (row and column labels are not legible in this copy)
        ...           152   131    70    48    (row incomplete)
        ...           189   161   108    42    25
        ...           105    78    52    32    12
    (the remaining row is not legible in this copy)

Check that the event samples obtained by the different laboratories are fully compatible.
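The rank statistics of Sects. 14.6.9 and 14.6.11 and the histogram statistic of Sect. 14.6.12 lend themselves to a short numerical sketch. The Python below is an illustration only: all function names are hypothetical, the Kruskal-Wallis statistic is coded in the standard computational form (assumed here to be that of eq.(14.110)), the lifetime values are those quoted in Sect. 14.6.10, and the histogram counts are invented.

```python
def average_ranks(values):
    """Ranks 1..N by increasing value; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


def rank_sums(samples):
    """Rank sum W_j of each sample within the combined ordered sample."""
    ranks = average_ranks([v for s in samples for v in s])
    sums, start = [], 0
    for s in samples:
        sums.append(sum(ranks[start:start + len(s)]))
        start += len(s)
    return sums


def kruskal_wallis_h(samples):
    """H = 12/(N(N+1)) * sum(W_j^2 / n_j) - 3(N+1)."""
    n_total = sum(len(s) for s in samples)
    total = sum(w * w / len(s) for w, s in zip(rank_sums(samples), samples))
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


def histogram_chi2(counts):
    """Chi-square and degrees of freedom for J histograms with I common bins.

    counts[j][i] is the number of events in bin i of histogram j.
    Assumes every bin holds at least one event in the pooled data.
    """
    n_bins = len(counts[0])
    n_all = sum(sum(h) for h in counts)
    # Common bin probabilities estimated from the pooled histograms.
    p_hat = [sum(h[i] for h in counts) / n_all for i in range(n_bins)]
    chi2 = 0.0
    for h in counts:
        n_j = sum(h)
        for i in range(n_bins):
            expected = n_j * p_hat[i]
            chi2 += (h[i] - expected) ** 2 / expected
    return chi2, (n_bins - 1) * (len(counts) - 1)


# Lifetime values quoted in Sect. 14.6.10.
counter = [0.56, 0.6, 0.73, 0.9, 1.05]
emulsion = [1.0, 1.6, 1.7, 1.9, 2.3, 2.8]
n, m = len(counter), len(emulsion)
w_obs = rank_sums([counter, emulsion])[0]  # rank sum of the smaller sample
w_upper = n * (n + m + 1) - 18             # symmetry: twice the mean minus W_l
h_obs = kruskal_wallis_h([counter, emulsion])

# Hypothetical counts for two histograms with two bins each.
chi2_obs, dof = histogram_chi2([[10, 20], [20, 10]])
print(w_obs, w_upper, round(h_obs, 3), round(chi2_obs, 3), dof)
```

For two samples the value of H coincides with the square of the uncorrected normal statistic for W, which gives a quick consistency check between the two rank tests.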
APPENDIX
Statistical Tables

Table A1. The binomial distribution

The table gives values of $B(r;n,p) = \binom{n}{r} p^r (1-p)^{n-r}$ for specified values of $n$, $p$ and $r$, where $0 \le r \le n$.
The table only has entries for $p \le 0.50$, but it can be used to find values of $B(r;n,p)$ for $p > 0.50$ by means of the relation
$B(r;n,p) = B(n-r;n,1-p)$.
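The tabulated quantity and the reflection rule for p > 0.50 can be checked numerically. The following is a minimal sketch; the function name `binomial_pmf` is hypothetical and the arguments are chosen only for illustration.

```python
import math


def binomial_pmf(r, n, p):
    """B(r; n, p) = C(n, r) * p**r * (1 - p)**(n - r)."""
    return math.comb(n, r) * p ** r * (1.0 - p) ** (n - r)


# A value with p > 0.50, obtained directly and via the reflection rule
# B(r; n, p) = B(n - r; n, 1 - p), which only needs an entry with p <= 0.50.
direct = binomial_pmf(2, 5, 0.7)
reflected = binomial_pmf(5 - 2, 5, 1.0 - 0.7)
print(round(direct, 4), round(reflected, 4))
```

Both numbers agree to rounding, as the relation requires.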
[Table A1 entries, with columns p = .01, .02, .03, .05, .10, .15, .20, .25, .30, .40, .50, are not legible in this copy.]


Table A2. The cumulative binomial distribution

The table gives values of $F(x;n,p) = \sum_{r=0}^{x} \binom{n}{r} p^r (1-p)^{n-r}$ for specified values of $n$, $p$ and $x$, where $0 \le x \le n$.
The table only has entries for $p \le 0.50$, but it can be used to find values of $F(x;n,p)$ for $p > 0.50$ by means of the relation
$F(x;n,p) = 1 - F(n-x-1;n,1-p)$.
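A short sketch of the cumulative sum and of the way a table restricted to p <= 0.50 can serve for larger p. The function names are hypothetical, and the complement rule used for p > 0.50 is the standard identity F(x;n,p) = 1 - F(n-x-1;n,1-p), assumed here to be the relation the table description intends.

```python
import math


def binomial_cdf(x, n, p):
    """F(x; n, p): probability of at most x successes in n trials."""
    return sum(math.comb(n, r) * p ** r * (1.0 - p) ** (n - r)
               for r in range(x + 1))


def binomial_cdf_via_table(x, n, p):
    """Evaluate F(x; n, p) using only arguments with p <= 0.50."""
    if p <= 0.5:
        return binomial_cdf(x, n, p)
    if x >= n:
        return 1.0  # all n trials are covered
    # Complement rule: at most x successes = at least n - x failures.
    return 1.0 - binomial_cdf(n - x - 1, n, 1.0 - p)


print(round(binomial_cdf(0, 25, 0.01), 4))            # 0.7778
print(round(binomial_cdf_via_table(3, 10, 0.7), 4))
```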

[Table A2 entries, with columns p = .01, .02, .03, .05, .10, .15, .20, .25, .30, .40, .50, are not legible in this copy.]

Table A3. The Poisson distribution

The table gives values of $P(r;\mu) = \frac{1}{r!}\, \mu^r e^{-\mu}$ for the specified values of $\mu$ and $r$.
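The Poisson probabilities can be evaluated directly from the formula above. This is a minimal sketch with a hypothetical function name; the arguments are chosen only for illustration.

```python
import math


def poisson_pmf(r, mu):
    """P(r; mu) = mu**r * exp(-mu) / r!"""
    return mu ** r * math.exp(-mu) / math.factorial(r)


# Probability of observing exactly 8 events when the mean is 8.0.
print(round(poisson_pmf(8, 8.0), 4))  # 0.1396
```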
10 0 ,1179 .61,* .lT. .,5.5 ,1116 -0111 .n,,r .**,I .Ion" ,0000 .o.oo

.
1
(I
1
t
1
..I11
.OssO
I.0000
,.onoo
1.0000
I.0000
.e.o,
,9929
,999.
,.oaaa
1.0100
I.0000
.mar
.PW0
,0911
.*w,
1.00OO
I.ooBI
.r,ra
.9?15
,9111
.esrr
.~P97
i.oton
.I?,,
,6769
.B011
.*rn
.P111
.eOla
.tr5s
.LO19
,11117
."?W
.WE1
.We1
.os9s
.1061
..,I.
.s*va
.10aZ
.9>>I
.w.1
,0911
.22ll
..,.a
,6171
.mSI
.oa7s
.a151
,107,
.2,r5
..I*+
.1Ow
.laor
,0016
,116"
.Orlo
.I118
,1100
.nono
,000z
.om,
.aorv
."zO,
.n%ll
I
1 1.0060 1.1010 ,.noon ,.nano .ssoa .rQ.l .err9 .Is"Z .rr2, ..>re .,,,a
1 1.0000 1.0100 l_"~.0 ,.nnnn .-99e .Qoar .ssoa ,939, .saar .5ssr .7r,r
9 1.1na11 1.0000 1.OODO ,.01"",.IO"O ,999" .ee>. .r"s, ,9520 .,qr, .r,,.
10 t.anoa ,.naoa om i.aonn i.nono i.oooo . e e ~ r ..esb .em .mrr .maI
II t.oooo I.OOOO 1.no00 I.onno t.nsno i.aonn . . i ~.wen .ew .e.x .,.a,
12 t.onoo >.oooo I . ~ O ~ O I.osnn t.anoa ~ . e a a n m.ooilo .ee*a .em .wee .erer
11 I.OOOO I.0000 ,."*I0 ,.""no 1.00011 1.0001 I.0000 1.*10. ,9997 .QV>V .*.?I
I. I.0000 I.0000 ,."lo" ,.nooo t.ooso ,.oaoa ,.oooo ,.oono i.vooa .sm. ,s.
IS I.0000 I.0000 I.0000 l_""00 I.OO00 1.0000 I.0000 I.0000 1.0000 .we, .**.I
16 1.0000 I.00@0 >.0aoo 1.0000 1.00BO 1.0000 1.0000 i.IIPO0 ,.DO0 1.0100 .P.).,
I7 I.IODO I.OOO. I."OO" 1.14"" 1.1001 1.000" I.0000 1.0000 ,.on00 1.1000 .
.*
el
I8 ,.onlo I.DOIO ,.oo*o ,."d"o 3.1000 I.00BO I.0000 1.0110 1.0000 1.0000 I.0.00
I9 I.O"OO 1.0010 1_00*0 1.0110 1.0000 I.0000 1.0000 i.DDIO I.0000 t.0000 I. 00
?O 1.0000 1.000. I."Ol. ,.nono ,.oano ,.onaa ,.a000 ,.aano ,.oooo 1.00aa I.000~

5 a ,7778 .ao,r .*a," ,777. .a,,s .I,,? .OOl" .oooa .ooo, .&aoa .oano
I
2
1
.WaZ
,9980
,9999
,9116
_016a
.sslb
,11280
.9+,10
,9911
.L&?.
,1729
..Ll*
.11I1
,537,
,1611
.0+11
5
,1111
.
,027'

,2310
.no10
"
,0962
3
,0016
.*o*a
.0,1.
.1001
,000.
,001.
,1000
.aoao
,010,
L 1.*0"0 .PWP ,9992 .09,8 ,9010 .m21 ..*01 ,1111 .010* ,0095 ,0005
I
6
7
I . * ~ o1 . 0 o ~ o
1.0010
1.0011
1.0001
1.0000
. o w
I.0OOD
i.nOon
.99~.

1.0010
.esaa
,0001
,9917
.ales
,7305
.97.5
.atw
.,*a11
.WOP
.~ral .ms
.5*11
.72LI
.1.01
.I1II
.ozer
.OI,.
,153.
.".,,
.na20
,021.

I0
8
9
I.OOOO
I.0000
1.0"10
I.OOOI
1.1000
1.0101
i.nooo
,.""lo
i.ooOP
l.oono
\.oono
,.nono
.wsr
.99se
,.oono
.esro
,9979
,999-
.Vnz
.Q"Z,
.ss.l
.asas
,928,
,970,
.arrs
.was
,9129
,1115
..r.r
.5"1"
.,,."
.orlr
.>I?*
II 3.0000 1.0010 1.OOOO t.aOO"l.OOOO .l)990 .We5 .WP1 ,9551 .1111 .1&.0
I? ,.On00 1.0000 I.D.00 ,."OD" l_O""O >.oaon ,9996 .ssaa ,9825 .a.w ,5100
11 t.aooe 1.0oao ~.oooo I.aooo t.aooo i.aoan .oew .qrw .esra . ~ ? 2 .*rso
I. I.0000 I.DOIO I.0001 ,.""no ,.oaoo 1.00on 1.oOln .99*. .use2 .%56 .r*r.
I5 I.OI.0 1.0010 ,."ma ,.""no I.OODO I.0400 I.00OD I.0000 ,9995 .."a" .*rrr
I6 1.0"00 1.0000 1.0110 1.nnoo ,.nooo i.aooo ,.aaoa ,.oooa ,9999 .o.r, .*,a,
11 1.0000 I.0000 i.DDO0 1.*"110 I.0000 2.0000 1.0000 1.0000 1.0000 ..9.J ..I..
11 1.0010 1.0000 ~ . n o o o i.nnoo r.oooo i.aooo i . o n a o i . a o a o ~.oooa .WT .-sl~
le i.onoo 1.0000 1.0000 ~ . n n n a ~ . o o o o ~.oaoo ~ . o o o o ~ . n o o a i.oooo .sseq ,0910
1D ,..OD0 1.00110 1.1000 ,.nono ,.no00 r.0000 ,.a000 ,.aooo l.OOD0 ,.oooa ,9995
I3 I.l"DD I.0000 1.10110 ,."Dl0 1.1000 1.0000 I.0.00 1.0000 1.0.11 ,.*.1. .99e.
2 1.0006 1.0000 I.lOO0 I.nnoa I.Ooa0 1.0000 I.0000 I.oOO0 3.0000 >.Do00 1.0000
11 t.0110 1.0000 1.nOoo i.oao0 I.oOo0 t.Oooa 1.0ool i.oOO0 1.0000 l . 0 0 0 ~ t.oo00
2. 1.0000 I.0.0. I.00*0 ,.nnoo I.oooo 8.oaoo ,.oooo I.olao ,.om0 ,.oaoo ,.oaoa
15 1.0000 1.1010 r.aooa t.onoo i.oono i.oaao i.aaaa i.aooo i.oooo i.oaao t.saoa
30 0 .13*1 .5.99 .&+la .?I11 .OllL .OO?L .0011 .m1? .0000 .OOOI ,0000
I .ss,e .m.s .r,,, ,5535 ,1037 .o.m .*to5 .noeo ,0003 .oooo ."DO*
2 .we7 .??a> ,0399 ,",t> ..,I. .,5,. .*.a2 ,0106 .mt, .om* ."OOO
1 .999a ,997, .08", ,9,-2 .6.7. .x,7 .I?*, .on. .ow, .ooo, .oom
6
5
6
I.O"O0
3.onao
1.0000
,9397
i.oooa
1.100.
.W9l?
,9998
1.0000
,911.
.w.r
.se-.
,8267
.e2*a
.Srrl
.5Z'5
.,,aa
.Ill,.
.155?
.at75
.a010
.oe,s
.Ills
.,."I
.O,Ol
,0766
,1395
.oo,r
,005,
.OII.
...
.a000

..IO,
02
7 1.0000 I.1000 I.0000 ,999. .sv22 .9>az ,7608 .r,r3 ,281. .a.35 ,0026
8 I.0000 3.0000 I.DOO0 1.1016 .Wall .el?? .",I3 .a736 .Ill5 .09ra .oOsl
s 1.0010 ,.0000 I."OOO I.ll.0 .ss-5 .esa, .1),ev ,8011 .l"." ,1161 ,0211
10
,I
12
1.0016
2.0000
t.onno
I.0000
1.000*
~.oooo
I.0000
1.0(100
,.no00
1.lOl0
,.nnnn
1.0nnn
.
'
9
99
,.oano
i.oooo
,9971
.99e>
.%n
..'*a
.osos
,1941
.s.-3
. ~ v s ~.sr.*
.110&
.a.o,
.*ass
,1915
..,,,
.rlar
,019.
.,*a2
.la*.
,I I . ~ " O ~ 1.0000 1.0000 ,.""no ,.onoo t.oann .qss, .ee,a .rws .,,.r .rsz,
I. I.O"D@ 1.0000 1.0011 ,.nono ,.ooao i.aann .9*9" .9sr, .s",, ."I.* ..*re
15 1.1.00 I.0040 ,."*O" I."""" ,."ODD ,.oonn .9w9 .wsz .s93a .w2s ,7722
06 ,.elOD I.@"OO 1.000" 1.nnn0 r.onoo ,.aooo ,.aooa .9wa .re77 ,9510 ,707,

..,".
I7 I.0000 I.0000 l.n100 l.nOn0 i.nooo i.oonn 1.onOo .weq .ewr ,978" ,1102
,I i.(l"OO I.000" ,.noon ,.""DO 1.0000 I."O." 1.0000 I.O.00 .".PI ..s,, ,1991
3s ,.onlo ,.0000 ,."I00 ,.nnoa ,.oooo ,.oooo ,.oooo ,.ooao i.oo00 ,997, .?sea
,D I.0000 I."OOO I.nnoD ,.."00 ,.000o >.*a"" 1.0100 1.1000 I.OODI .09*1
>I i.onn0 1.0001 1."00" ,.mono ,.oooo i.onan ,.loo0 ,.aono ,.oaon .evg" .*st.,
71 ,.onlo i.Ol"0 i.noo0 ,.""PO ,."no0 i.00oo ,.loo. ,.OOOO ,.an". I.O... ...I I
71 1.0000 i.L)@IO 1.0010 ,.""no i.oaoo i.oano ,.no00 ,.osno ,.onoo ,.oooo .ow,
1. 1.0010 1.0010 ,_nola 1."""" l.oO10 i.oo00 i.noo0 ,...oo 1.0000 1.O.01 ..a
15 1.0010 1.010. 1.000" l.000. ,_onno i.oaon i.oo00 I.0001 I.0.00 >.O".l ,.no00
7 I.OP.4 1.0000 1.0000 ,.nono 1.naoo 1.oooa ,.looo ,.oooa i.oooB ,.oaan ,.noo&
11 1.0100 1.000a I..000 1.001" ,_*no0 1.0000 1.0.*1 I.*000 ,.onoo ,.o.oa ,.nnaa
21 1.11100 l.0000 I.&000 ,_""0".aaoo I.0""" ,.nooo ,.oaoo I.oooa ,.oooo ,.nono
0 1.""00 1.0000 ,_"OD" ,.""no i.oono !.a000 i.oo0o ,.aaoa t.ooo0 ,.oaao ,.oooa
- 10 1.0000 1.001. l."010 1.nnoo 1.0000 i.on00 ,.nooo ,.0aoo I.0000 I.oloO ,.oooa
n ,0008 .0om .on", .OD"* .on06 .eons .onn5 .noor .ona* .om,
I ,005~ .oosa .nn*q .oars .owl .aom . o o ~ .oow ,0029 .non
Z .P?Oll .Om"* .n(lO ,0167 ,0156 .Ole5 .">la ,0125 ,0116 ,0101
I .OLW .or,. .0*1, .O,lP ,0316 _"145 .011* .om5 .02.(.
,
5
,0117.
.1111
.OII*
,01116
.,?a*
.n,ss
.,,LI
,076.
.Ill0
.or29
.Ins*
.orss
.,or7
,066,
.to23
.oa,z
.nsss
.oa07
.osr,
,097,
.os,s
6 ,168 5 0 3 7 3 1 .I292 .I112 ,1271
7 .1.19 .Iblh .,."I .I.,C .
,'
61 .I.)& .,.<I .1*21 .,6,1 .I,%
11 .I121 ,1311 .1151 ,1363 ,1173 +11*1 .I111 .IlW .!Is5 .I396
(1 ,1012 .to10 .In*& .It21 .ll+* ,1167 .It87 ,1207 .I??& ,124,
10 ,0110 ,0710 ."l"O .one9 .olsR .a.a, .ns,r ."Q., .mr, ,099,

.
11 .01111 .OIDI ,0111 -0551 ,0519 ,0011 .a640 ,0667 ,0695 .n7??
I2 .OZO, .o,o, .o,n .a,.r .o,ra .om8 .or,, .0.36 ,0457 .o.*,
II

Is
,015~
,0071
-0017
.OIW
..me6
.onrl
.n~m
,0095
.no*&
.ole6
,0101
.On5l
.",,,
,073 1

.on57
.om
.a,z,
.OOII
. a m
.o,,r
,0069
.nrm
."try
.DOIS
.am
,0357
.no81
.o1*6
.",re
.DOqP
I6 ,0016 .oo,s .""ll .00>* .an?* .oo>o ."a,, .oo37 .oa*, .no.r
1, .0001 .00"B .onn9 .oo,a .ao,r .oo,, .oa,r .ooll .O"l., .a071
111 .0001 .00", .on"& ,000. .onor .noas .no06 ,0007 .ones .eons
19 .0001 ,0001 .on", .oooz .OOO? .ooa2 .ooo> ,0003 .on03 .ooar
20 .0000 .0"*0 .on", .O"Ol .ooa, .oaa, .O""l .ooo, .ooo, .oooz
2, .OPOO ,0000 .onno .oona .anao .onao .nnoo .aaoo .onat ."an,

8.1 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9 9.0

,0001 .mn, .arm? .on@, .mar .a002 .oao~ .oaaz .oaot .noat
1 .00?1 .nn?l .oal.) ,0017 ,0016 .@OM ,0013 ,0012 .oOll
.oteo .(109~ .one. .owe .oo~r .006(1 .n06) ,0051 .OD% .0a5a
1 ,0269 ,0152 .n237 .o?lr ,0208 ,0195 .Dl83 .olll ,0160 ,0150
& .0511 .PSI7 .OL?I .OI6L .W41 ,0110 .Dl98 .0111 ,0151 ,0317
1 .On111 .OW9 .en16 ,0184 .011Z ,0722 ,0692 ,0663 ,0615 ,1607
6 .llsl .I160 .?I28 .I097 ,1066 .1O1. .lo01 .0912 .&9.1 .OPII
1 ,11711 .I158 ,1311 ,1317 ,1294 ,1271 .IZL1 .1122 ,1197 ,1171
8
9
I0
.!IPS
6
.IoI1
.I102
6
.IPIO
.
,111.

.IoL1
,1112
0
.to81
,1115
,199
+IlOL
.I116
6
+Iltl
+1156
3
.ll.0
.131+
,1115
,1157
.1112
.)>I1
,8172
,1310
,1311
.Ill&
II .01Le .0116 .alOr .0128 .0851 .OL(lO .OW2 ,0921 .a948 .Wlo
11 . ~ 0 5 .@5>0 . o w .osw ,060~ .as?? .osra .oslv .WOI .wra
11 .0115 .PI>+ .m5& ,0176 ,0395 .O.I6 .043l) +0159 .a*&! .0501
1. .01111 .Pls6 .n?lP .01?5 ,0260 .DIS(I .OZ7P .0289 ,0101, .01?L
II .oms .mm .oxto .eta .o>>h .OM, .o~za .oms .oxaz .ol*r
16 ,0050 .On55 .nDLP ,0066 ,0071 .0019 ,0086 ,0093 .a101 .01W
I7 .001. ,1026 .OW9 .DO33 ,0036 ,0040 .OOL1 .0018 .@051 .005l
18 .00,, .POI2 .no,r .on15 .oa,, .00,s .aoll .oozr .002a .nore
I9 .On05 ,0005 .arms .OOOl .a008 .0009 .a010 .OOll .OOl2 .Oat.
20 .OD02 ,0002 ."no? .0001 .00o3 .no** .ooo. .sow .oans .a006
PI .POI, .00", ."an, .ooa, .aoa, .aaal ,0002 .ooo2 .om2 .ooo,
22 .oooo .moo .moo .aaoo .ono~ .a001 .soot .ooo~ .so08 .oom
2, .oona .oona .anno .oooa .*no0 .oano .a000 .on00 .anoa .onno

9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9 10.0

0 .000, .0001 ."no, .o00, .ooo, .a001 .0001 .aoo, .0001 .a000

.
1 ,0010 .Onns .on09 .OOol( .OoOl ,0007 .OOOb .OOOI .OD05 .OOOI
2 .0016 ,0063 .OW0 .DO11 +0014 .DO31 ,0029 ,0027 ,0021 ,0021
J .0110 .a111 .Din .PI15 ,0107 .Dl00 ."a93 .0087 .OOsl ,0076
.0,19 .P,02 .".05 .ole* .025r .o>.o .Oll(l .*?I, , 0 1 0 , .a,.*
5 .osm .*sss . o m .OIM . O L ~ .orso .ml+ .orla .om .nnr
8 ,085, .a112 .Or93 .01aL .OI36 ,070'1 .Ohel .Oh% .Os>l
7 .I145 ,1111 .Inel .L@e. ,1037 .tat0 .OW? .0955 .OVZa .OW3
II .I302 .I286 .l>69 .I251 .I212 .I212 .1191 .1170 .11*11 ,1116
9 ,111 7 ,111 5 .,,,I ,1306 .001. .I293 .11111 .I211 .I261 .,271
IO .II.)B ,1210 .l?ls .111$ .1215 .11.1 .11.5 .II*9 .12% .$?-I
.os9n .IS)> . I ~ V .lors .anal .~oa> .lose . I ~ I ? ,1125 .I,??
I? ,0152 ,0776 .n,s* .01*? .o*rr ,0866 .D8lld .OW8 .m11 ,0940
11 ,0516 .0%9 ,0417 .059l .PLI7 .ObLO ."LI? .OW5 ,0107 .071'
)I .01a1 .0161 ,0110 .Olss ,0439 ,0139 ,0631 ,0179 ,0500 .0511
IS .Oml .W?I .*115 .O?IO +n265 .02.1 ,0297 ,0111 ,0110 .01.1
16 .OBI8 ,0117 .a117 ,0147 ,0151 ,01611 ,0180 .OIOI .WIT
I, ,0061 .Om9 .om5 .P*LI, .OOPII .an95 .o,o> ."ill .0119 ."lZ"
In .PO11 ,0035 ,0039 .Onrl .OW6 ,0051 .@05l .OM0 ,0065 .oall
te .oats .OOII .ant9 .oou .OOZ~ .ooze .~OPI .no>, .oolr .no>?
20 .0001 .00D11 .on09 .on10 .oo!l .OOl? .nOll .00>5 .On97 .OOl*
?I .0011 .0001 .arm* .Ooor ,0005 .OOaa .nos6 ,0007 .Onall .nOW
?I .Doll .0001 .Onor .OoOl .am2 .ooo2 ,0003 .OOOl .O@O. .00"4
13 .oooo .oan~ .nno! .oool .on01 .oau .ooox ,0001 .ooo? .oon2
2. .OD00 .O""O .oono .on00 .oooo . O ~ ~ O .o000 .ooo, .0"0l .son,
Table A3. The Poisson distribution (continued)

Table A4. The cumulative Poisson distribution

The table gives values of $\sum_{r=0}^{n} \frac{\mu^r}{r!} e^{-\mu}$ for the specified values of $\mu$ and $n$.

n \ μ    0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0

0       .9048   .8187   .7408   .6703   .6065   .5488   .4966   .4493   .4066   .3679
1       .9953   .9825   .9631   .9384   .9098   .8781   .8442   .8088   .7725   .7358
2       .9998   .9989   .9964   .9921   .9856   .9769   .9659   .9526   .9371   .9197
3      1.0000   .9999   .9997   .9992   .9982   .9966   .9942   .9909   .9865   .9810
4      1.0000  1.0000  1.0000   .9999   .9998   .9996   .9992   .9986   .9977   .9963
5      1.0000  1.0000  1.0000  1.0000  1.0000  1.0000   .9999   .9998   .9997   .9994
6      1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000   .9999
7      1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000

1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
0 .11?$ ,1011 _>lli .2Lb* ,2231 ,2019 ,11117 .1151 .I*% .1)51
I .sem .a626 .wsa .5sla .157a .I?W .LV~Z ..*?a .+111 .*tho
2 .900* .81*5 ,1511 .a315 ,8088 .1111 ,1571 ,7106 ,7027 ,6767
1 .97.> ,9662 .a569 .%63 .93&. ,9212 ,9068 .a?!> .87'7
+ .P.)l* .9'l23 .9*91 .9*51 ,981. .97&l ,9701 ,9636 .ST19 ,911,
5 .0990 ,9915 .?Dl8 ,9975 .9PW ,9920 .sag6 .%.a .P$>r
6 .ww . 9 ~.WQL .sw .wst . s w ..vat .ser* .essa ,995s
I I . O ~ O 1.0000 ,9799 .eew .ssss .e*w . w s .wsr .we ..sms
I I.0000 1.0000 I."OOO 1.0000 1.oono I.ono0 .was .9sos .9W" .W."
9 1.0000 1.0010 i.nnnn i.aooo i.oooa ,.moo i . o o a o i.oooo i.aoaa 1.0aaa

2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0
0 .1?14 .)1&8 .lo03 .0907 .0"21 .Ol4l ,0672 .Oboe .0510 ,0191
I ,1796 .15*6 .no+ ,308. .Zlnl .?67r .I+OY .?III .?l46 .lwl
2 .(1ls(l .(1?21 .5960 .56sl ,5618 .*la+ .r936 .*695 .+.a0 .an2
1 .a18e ,8194 .1991 .7787 ,1576 ,1360 ,7111 .eel9 .saga .nip
L ,917'1 .em ,916s .eael .as)? .mrr .la19 .am .ella .~151
5 .wse .P~W .woo .sea> .err0 .srnn .+a, . w r s .srro .slrl
.W.I .*975 .*so6 .911r .9158 .el28 ,979. ,9756 .*)I> .96~5
7 .<91S .91)10 .*97* .9967 .W51 .99" .993L .Wls .9m1 .ell>
I .9ew .es95 ,9991 .swb .was ,9983 .90a1 .sws .eves .*P.>
9 .PW* .99P9 ,0999 .s99. .9997 .r)ll% .sesr .I)ss, .99., ..sns
It I.OOD0 I.1000 I."ODO 1.00.0 .sses .$so9 .see9 .rse. .ssrs .s.,
I, I.0000 I.0000 ,.0"00 ,.DO00 1.0000 1.0000 I.0.00 1.0000 .P.99 .*s.s
XI 1.a0a0 1.0090 >.nsnn t.ooao >.000s ~ . * o o ~L.OO.O >.oooo 1.80bo I.~OOO

3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.0
0 .1150 .0608 .Ox69 .0311 .P102 .0?71 .W'l .0?11 ,0202 .PIe3
I .to$? .B782 .>$a6 ,8359 .tZS7 .lt$Z ,807. .Oe$z .0916
2 ..OIP .I799 .%9. .1191 .3201 .1011 .?a%. .?*a9 .2511 ,1311
3 .6025 .5"03 .536* .5!52 .49+Z .a735 ..532 .+>I5
4 .Wli ,7626 .1L1I .1151 .I064 .(111I ,6678 .LO+ ,6110
5 .so11 .@era . s w .WOZ .IW6 .ar.l .mn~ .m56 .ma
6 .set2 .%%L .PIW .s.a .sw .sz.? .WI .we, .ass.i .~ss,
I .Wl(l .W>l ,9102 .9769 .97ll .9692 .%48 ,959. .%a6 .91I9
8 .~s> .ter3 .OVX . m ~.WOI .*an, .vea .m.o . W I ~ .v~eb
e .sses .see? .ssn .sm .see, .oew .ss52 .99+2 .09m .wts
to . ~ P W .eess .sssr .ewz .sesa .1)98, .e$m ,9981 ..)s77 .ss~~
!I .s99'l .9W9 .9991 .99W .9091 .WVh .*9-5 .W9l .999l ,9991
)2 i.oooo 1.1000 ~.nnon ..ew .sew .009r .ssss .wss .ew .ses~
I3 I.ODO0 I.DDltO 1.0100 1.0000 3.0000 1.0000 I.0000 1.0000 ,9993 .Ww
I. I.0000 I.OO"0 ,.onno t.ono0 8.0000 ,.*ooo 1.000. I.0000 I.000. I.....
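
The entries of the Poisson tables follow directly from the defining sum, so a damaged entry can be recomputed rather than read off. The following is a minimal sketch, assuming that Table A3 lists the individual Poisson probabilities and Table A4 their cumulative sums; the function names `poisson_pmf` and `poisson_cdf` are illustrative and do not come from the book.

```python
import math


def poisson_pmf(n, mu):
    """Poisson probability mu**n * exp(-mu) / n!, evaluated in log space."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


def poisson_cdf(n, mu):
    """Sum of the Poisson probabilities for r = 0, 1, ..., n."""
    return sum(poisson_pmf(r, mu) for r in range(n + 1))


if __name__ == "__main__":
    # Print one block of columns (mu = 0.1 ... 1.0) in the layout of the table.
    mus = [round(0.1 * k, 1) for k in range(1, 11)]
    print("n    " + "  ".join(f"{mu:6.1f}" for mu in mus))
    for n in range(8):
        row = "  ".join(f"{poisson_cdf(n, mu):6.4f}" for mu in mus)
        print(f"{n:<4d} {row}")
```

Working in log space with `math.lgamma` avoids overflow of the factorial for the larger values of n; for the small μ and n tabulated here a direct evaluation would serve equally well.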
Table A4. The cumulative Poisson distribution (continued)

Table A5. The standard normal probability density function

The table gives values of $g(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}$ for $0 \le x \le 4.99$.

Table A6. The cumulative standard normal distribution

The table gives values of $G(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-\frac{1}{2}x^2}\,dx$ for $0 \le y \le 4.99$.
$G(-y) = 1 - G(y)$.

Table A7. Percentage points of the Student's t-distribution

The table gives values of $t_\alpha$ for different degrees of freedom $\nu$ such as to produce specified values $F(t_\alpha;\nu)$ of the cumulative distribution.
$F(-t;\nu) = 1 - F(t;\nu)$. $\nu = \infty$ corresponds to the standard normal distribution.
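
The functions g(x) and G(y) tabulated in Tables A5 and A6 can be evaluated with the Python standard library alone. The sketch below is an illustration, not the procedure used to produce the tables; it relies on the identity G(y) = ½[1 + erf(y/√2)], which holds for the standard normal distribution.

```python
import math


def g(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def G(y):
    """Cumulative standard normal distribution, written in terms of erf."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


if __name__ == "__main__":
    for v in (0.0, 1.0, 1.96, 3.0):
        # G(-v) is printed next to 1 - G(v) to illustrate the symmetry relation.
        print(f"{v:4.2f}  g={g(v):.4f}  G={G(v):.4f}  G(-v)={G(-v):.4f}  1-G(v)={1.0 - G(v):.4f}")
```

Because of the symmetry G(-y) = 1 - G(y), only non-negative arguments need to be tabulated.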


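Table A8. Percentage points of the chi-square distribution

The table gives values of $\chi^2_\alpha$ for different degrees of freedom $\nu$ such as to produce specified values $F(\chi^2_\alpha;\nu)$, where

$$F(\chi^2_\alpha;\nu) = \int_0^{\chi^2_\alpha} \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, u^{\nu/2-1}\, e^{-\frac{1}{2}u}\, du = 1 - \alpha .$$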
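
A chi-square percentage point can be recovered numerically from the cumulative integral of the chi-square density. The following is a sketch under stated assumptions: the cumulative distribution is evaluated with the standard power series for the regularized incomplete gamma function, and the percentage point is found by bisection. The names `chi2_cdf` and `chi2_point` are illustrative only.

```python
import math


def chi2_cdf(x, nu):
    """Cumulative chi-square distribution F(x; nu) from the incomplete-gamma series."""
    if x <= 0.0:
        return 0.0
    a = 0.5 * nu
    z = 0.5 * x
    # Series: sum over n of z**n / (a (a+1) ... (a+n)), starting with 1/a.
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_point(alpha, nu):
    """Value x such that F(x; nu) = 1 - alpha, found by bisection."""
    target = 1.0 - alpha
    lo, hi = 0.0, float(nu) + 10.0
    while chi2_cdf(hi, nu) < target:  # enlarge the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, nu) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for nu in (1, 2, 5, 10):
        print(nu, round(chi2_point(0.05, nu), 3))
```

For ν = 2 the integral has the closed form 1 - exp(-x/2), which gives a convenient check of the numerical result.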

Table A9. Percentage points of the F-distribution

The table gives values of $x_F$ for different degrees of freedom $(\nu_1,\nu_2)$ such as to produce specified values $F(x_F;\nu_1,\nu_2)$ of the cumulative distribution.

The table only has "upper tail" entries corresponding to values F = 0.90, 0.95, 0.975, 0.99, 0.995, 0.999, but it can be used to obtain "lower tail" percentage points corresponding to the complementary F = 0.10, 0.05, 0.025, 0.01, 0.005, 0.001, by means of the relation

$$x_{1-F}(\nu_1,\nu_2) = \frac{1}{x_F(\nu_2,\nu_1)} .$$

* Multiply these numbers by 100.
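
A lower-tail percentage point of the F-distribution follows from an upper-tail entry by interchanging the degrees of freedom and taking the reciprocal, since 1/X follows the F-distribution with (ν2, ν1) degrees of freedom when X follows it with (ν1, ν2). The sketch below only carries out this arithmetic; the function name is hypothetical, and the input 4.05 is the F = 0.90 entry read from the row ν2 = 4 under the column ν1 = 5.

```python
def lower_tail_point(upper_point_swapped):
    """Lower-tail point for (nu1, nu2) from the upper-tail point for (nu2, nu1).

    If x_up is the point with cumulative probability F for degrees of
    freedom (nu2, nu1), the point with cumulative probability 1 - F for
    (nu1, nu2) is 1 / x_up.
    """
    if upper_point_swapped <= 0.0:
        raise ValueError("percentage points of the F-distribution are positive")
    return 1.0 / upper_point_swapped


if __name__ == "__main__":
    # Upper 0.90 point for (nu1, nu2) = (5, 4), taken from the table.
    x_up = 4.05
    # Lower 0.10 point for the interchanged degrees of freedom (4, 5).
    print(round(lower_tail_point(x_up), 3))
```

Note that the degrees of freedom must be interchanged before the reciprocal is taken; using the entry for the same (ν1, ν2) gives the wrong point unless ν1 = ν2.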
Table A9. Percentage points of the F-distribution (continued)

ν2   F     ν1 =  1  2  3  4  5  6  7  8  9  10  12  15  20  30  40  60  120  ∞

a .so L.5. L.12 ..Is 4.11 k.05 1.01 3.98 3.95 3.9* 3.P l.W 1.87 1.1. 3.82 3.80 1.n 1.76
?.,I
.PS 1.71 6.9. 6.59 s.39 6.26 6.16 6.03 a.o* 6.00 5.96 5.e) 5.8. 5.80 5.75 5.12 3.69 5.6s 5.6,
.m ~ 1 . ~ 2 10.e5 e.se 9.60 9.36 9.20 e.07 a.qa 8.90 8.s' 8.7% s.*r s.sr ..+a *.at 8.16 s.n e.as
.99 z1.10 18.00 I6.W IS_eI 15.5, 15.2, li.98 lb.80 Ik.66 ,655 L1.37 11.10 Ia.02 11.~. 11.15 11.65 12.56 !I.'*
.?q% 31.11 i6.2E 2L.26 21.15 ??.a* 2L.97 21.W 21.15 21-11 20.97 10.10 iB.L* 20.17 is..P 19.75 19.61 19.LI 80.32
,999 ,'.I. at.25 ss.,s r,.** FI.7, 50.53 -9.6s es.00 is.*, *l.OS *7.., rr.7s *s.,o *i.*, *s.os .1.,5 r... a *,.05
5 .90 6.06 1.78 i.L? 3.5' 1.*5 3-*d 3.37 1.11 3.30 3.27 1.?i 1.)) 1.11 3.1* 1.1a 1.11 1.1D
5
.9rs
.Q-
6.6,
ID.OI
16.i6
(I..,
5.79

11.21
5.rl
7.16
12.06
5.19
7.w
11.w
5.05
7.15
10.97
L.Q5
6.98
to.67
a.88
6.85
ra.ra
*.bZ
6.76
1o.r-
*.I7
6.68
..I'
6.62
1 0 . ~ 6 1o.o-
a.68
6 . ~ 2
9.m
L.6Z
&.*,
q.12
L.56
6 . n
9.55
1.50
6.2,
s.la
a,.(,
6.1,
+.a1
6.12
9.20
r.m
6.0,
9.11
r.36
6.02
9.02
,995 2 . 8 111.31 1a.51 15.56 1*.9L 1*.51 11.10 11.'6 3 . 7 3 11.18 11.15 12.90 I2.66 11.51 >?.LO 1 . 7 II.1*
.PW 7 . 7 . 2 3 . 0 11.09 19.75 ZS.114 11.16 27.6' 11.2* 26.91 Z5.91
?$.L? 25.39 ?+.a7 1a.60 21.11 il.OL 23.7.
6 .90 1.78 I.&* 1.>9 3,)s 1.11 1.05 1.01 2.91 Z.96 1+'* 2.90 2.87 2 . ~?.no 2.78
.
-95

.o-
5.99
a.et
11.75
5.1.
7.26
10.92
i.76
*.an
9.18
b.51
s.2,
".I?
L.19
5.99
e.19
L.20
s.81
a.rr
*.I!
s.70
8.26
s.15
s.so
e.10
1.10
5.52
1.98
L.OS
5.bs
7.81
1.00
5.n
1_9*
5.2,
I.s7
5.17
1.~1
s.",
3.17
5.01
2.7%
3.76
..vr
2.7.
3.70
*.s"
2.72
3.67
*.s~
1.12 7 . n 7.m 1.1, T . I ~ 7.06 h . 9 ~ *.en
.W5 ib.L3 I2.W I2.DI 11.66 11.07 IO.79 iO.57 10.19 10.25 10.03 9
,
'
s *.<9 9.15 9.2' 9.11 7.00 8.88
.q99 35.51 l7.00 21.10 11_9? 10.81 W.01 19.L6 19.03 IP.69 IB.rl 7 . 9 6 7 . 1 la.67 la.'* Is.ll ir.9e ,$.is
7 .W 3.5- 3.26 3.07 ?.w 2.88 2.75 2.72 2.70 2.67 7.63 2.w 2.56 2.5. 2.5, ?..a >.a,
. .?5
.OF
5.:' '.I* 6.15 r.12 1.*1
2.83
1.11
2.78
3.79 1.71 3.60 3.6. 1.57 1.59 1.L- 1.18 I.>* 3.10 1.27 3.23
1.01 6.51 5 . ~ 9 5 . 5 ~ 5.19 5.1? r.99 '.PO r . 8 ~ *.IL L . ~ I L.FI c.,. i.31 L.Z~ i.2. ..i*
.i)F 12.25 9.51 I.L5 7.85 I,** 1.1* 6.99 6.8% 6.72 6.62 ..&I 6.3, 1.16 5.99 5.91 5.aP 5 . ~5 . 6 5~
-9-5 16.11 IZ.60 10.18 10.05 P.52 ').I@ P.(IP a.a* 8.5) 8.38 8.18 7.9, i.,r I.., r..2 7.1, ,.Is ,.ns
.eve 1 9 . s ~ ?,.as LO.^ . I ~ 5 . 5 i IS.OZ . a 1 . 3 I . 11.11 I,.>? 12.91 12.5, 12.11 i~.i? 11.91 11.70

Table A9. Percentage points of the F-distribution (continued)

ν2   F     ν1 =  1  2  3  4  5  6  7  8  9  10  12  15  20  30  40  60  120  ∞

Li .W 3.18 1.81 Z.6) 2.6. Z.>9 2.13 2.2. 2.2. 1.11 2.L- ..I5 >_I. 2.06 2.OL 1.9'1 I.** 1.9, I.9"
.s a.75 ,.w 3 . e 3.76 3.11 3.00 2.9, 2.*5 2.m 2.75 2.69 7.6, 2.5. 2.67 ?A, ?.,e 2.3. 2.30

..ess
,975 6.5s
e.,3
LI.75
5.10
6.9,
1.51
4.67
9.95
7.2,
b.tl
S..,
6.w
1 . l ~ 1.11
5.w
6.07
.ma
3.76
3.61

5.51
1.5,
4.5.
5.3%
>.a4
1.39
5.20
1.37
r.,.
5.09
1.2.
*.I6
..$l
1.1.
*..I
e.72
>."I
3.86
L.5)
I+'L
i.,"
b.17
2.91
3.6%
b.23
1.85
1.5.
*.I?
2.7-
3.r5
L.01
7.72
3.-
1.90
.se., ,a,*. 82.9, In."o 9.n *.a* a.,a a..* 7.7, 7.1. 7.zs 7.0. 6.7, s..o 6.09 5.0, 5.76 5.59 5.e

81 .so 1.1. 2.16 2.5. z.+> 1.15 1.28 z.21 1.20 2.16 2.1. 2.10 >.01 2.01 I.'$ I.*, 1.*0 L.8" 1.95
..5 ..67 1.01 >.a1 1.). 3.03 2.92 z.81 2.77 t.71 2.67 2.6. 1.7, 2.66 Z.II2.- 2.30 1.15 2.2,
,975 ...I *..I e . 3 ~ L.OO 1.v 1.a. 3.m 1.11 1.15 1.15 ,.or 2.95 2.w. 2.w 1 . 7 ~ 2.66 2.m
.se ..a, a.,. S.7. 5.2, ..eL ..*I ..a. 4.3. ..,9 ..I. 1.96 ,.a? 3.6. 3.51 ,.L, ,.1. 1.25 1.11
.w5 LI.37 I.39 6.03 6.2, 5.79 5.6, 5.15 5.0. 1.- L.82 +.SL a.97 L . D ~ I.07 3.11 1.11 3.b5

...
.0.9 ,,.PI 12.3, 10.2, 9.0, 8.15 7.8. 1..9 1.2, 6.98 s..o 0.52 6.7, 5.9, 5.5, 5.*r s.,o 5.1. 6.97

3. .90
.*5
,975
1.10

e.10
o
2.13
3.1.
6.86
1.12
3.-
A,?.
2
,.,,
1.19
. ~2.1,
?.e6
3.66
.21
2.85
1.50
2.19
2.76
T.18
2.15
2.70
3.29
I . ~ I
1.&5
1.2,
2.10
2.60
3.35
t.02
1.51
9.05
'.OK

?.a5
i.ea
2.m
2.M
i.9,
l.ll
2.73
1.89
z.2,
2.67
i.aa
2 . r ~ 2.ls
2.61
i.8,

2.55
i.ao
2.n
I-**
..*
,905
8.01
I,...
...I
7.92
5.5.
6.6.
5.".
(1.0.
..I9
5.5.
..LL
5.1e
6.2.
5.01
4.8.
L.86
..ox
6.72
3.9.
..h. *..,
1.80 ..ar
..,5
3.5,
L.0.
1.15
,.'C
3.27
3.76
,.ld
,.A*
2.09
1.55
3.""
I...
.w9 11.1. 11.7. v.1, 8.61 7.92 l..? 7.0. 6.e. 8 . 5 6..0 6 . 3 5.85 5.56 5.75 5.LO 6."' 1.7, ..eo

IS ..I 1.07 1.10 ?.re z.ls 2.l. 2.2l 2.LI 2.U Z.os 2.06 2.02 1.97 1.92 I.&? 1.85 1.W l.79 1.76
.95 L
.
% 3.b" 1.Z9 i.nc 2.90 2.7. 2.7) 1.e 2.5e Z.9 2.ra 2.a 2.31 2.75 2.20 2.11 2.1, i.0'
,975 8.20 6.77 *+I5 >..o 1.58 1.11 1.29 >.20 1.Iz >.Oe I.96 ?.a6 2.76 I.*' 2.5q 1.51 ?..a ?.+"
.er ,.as 6.36 I..2 4.19 i.5s r.32 r.lr 6.00 1.W >.eD 1.11 1.57 1.17 1.21 1.11 1.05 2.9s ?.I7
..a5 lb.IO 7.70 6.). 5.~0 5.>1 5.07 L.85 a.67 ..LI 6.42 ..IS r.07 1.81 I.bP 3.51 3.r" 2.17 3.11
.se9 11.5s I,.,. 9.3. ..?5 7.5, s.., 6.7. r..s a.zs 6.08 5.8, <.s* 5.25 ..ss ..no ..a. ..A7 '.,I

I6 .90
.95
,915
.s9
...
1.15

$.,Z
8.53
9
2.67
3.6,
Q.69
6.23
i.ib
3.2.
4.08
5.2e
7.13
,.a,
3.7,
6.77
2.11
2.85
,.% 6.20
+.a.
3.3.
2.18
2.7.
1.11
2.66
3.n
a.03
2.0-
2.5*
3.12
3.8-
2.06
2.5.
3.05
3.78
2.03
?.'9
2.99
3.69
L.99
2..2
2.8-
3.55
1.'.
2.35
?.7?
>.<L
1.'-
2.n
?.fie
3.76
1."'
2.19
7.77
3.10
1."
2.,5
2.5,
3.02
I.78
2.1,
?.a-
Z.93
1.75
Z.06
2.38
2.86
2.72
2.01
2.32
>.75
.9*5 10.5. 1.11 6.30 5.6. 5.21 L.9, 6.49 4.52 ..la 6.17 &.ID 1.92 1.11 3.G. 3.r. 2.33 2.22 3.1,
.w. I.., 1 i0.e. 9.0. 7.9' 1.21 6.8, 6..6 6.1- 5.98 5.8, 5.55 5.27 ..pe '.TO r.5. r.19 r.2, r.o*
I, .90
.e5
,..,
.A5
2.b.
3.59
?..I
1.10
?.,I
7.90
1.m
>.el
*.IS
2.70
?.LO
2.6,
Z.O6
1.55
2.0,
2.69
1.00
2.L5
I.%
I.%
,.PI
P.11
1.M
2.?1
I.*,
I.>%
1.11
2.10
1.75
Z.06
1.71
i.01
1.*9
1.96
,975 6.0. L.62 ..a, ,.cr 1.4. 1
.2
. 3.1. 1.06 2.9. *?
. 2.82 ,.I, 2.6, 7.50 .'.1 2.1s 2.11 2-25
.eo $..O h.LL 5.t8 4.h7 a.3. A.kO 3.93 3.79 3.e8 >.39 3.+b 3.3, 3.16 3.0" 2.W Z."> 2.75 2.65
.995 10.11 7.15 6.16 S.50 5.07 r.7. +.56 L.39 L.25 +.I. 1.97 1.19 1.61 >.&I 3.31 3.28 2.l" 2.79
.ess 15.72 LO.66 7 , 7 . 6 7.02 1.1. 6.11 5.9. 5.15 5.5e 5.12 5.05 ..?(I "'.L 3 , ..I8 '.DL 1.35

I* .s. 3.0, 2.62 1..* 2.7' 2.20 2.11 ?.*a 2.e. 2.00 1,s" 1.93 l.8') 1.8. i.7" 1.15 ,.I? 1.1- 1.66
.e5
.el3
.es
...I
5.9*
*.?9
3.55
6.56
6.0,
3.36
3.-
9.09
?.Q,
3.6,
6.5, a.25
2.7,
,.,a
2.M
3.22
..o,
,.,*
2.58

3.86
2.5,
,.*I
3.7,
2..*
2.w
3.60
2.6,
>.a7
3.5,
2.3.
2.77
3.37
z.27
>,*>
3.23
2.t9
2.56
3.08
2.1,
,..a
2.-2
2.06
2.3"
?.a'
2.02
2.32
2.75
1.97
2.26
Z.b6
,.9z
>,,9
2.57
.PQ~
.ess
t0.21
8r.m
1.21
10.39
6.0,
a,..
5.1, L.~L
6.8,
L.W
s.35
L.L.
6.02
..a
5.76
..I.
5.56
a.01
5.39
,.I*
r.13
I.*.
r .
1.10
~l.5-
1.10
*.la
3.20
r.ts
3.10
r.00
7.~9
1.8.
?.rr
i.r?
9 0 2.9s 2.61 >.LO 7.11 2.111 1.11 2.06 1.W 1.9" 1.96 1.9, I.@* t.81 i.76 1.7, %.TO 1.67 3.63
?.r+
.e.
.s,.
.9e
4.3s
s.*2
8.1-
a.5,
5.9,
,.>%
,.90
5.01
z.qo
3.56
r.50
3.31
+.I7
,.,,
2.63
3.96
2.-
3.05
1.77
z..e
2.96
3.61
?..z
2.m
1.51
2.38
z.a2
3.r3
2.31
2.72
I.>*
2.m
,A?
3.15
2.m
2.5,
3.00
z.07
2.29
2.e.
2.0,
2.3,
2.76
1.9"
z.2,
I.67
1.93
z.z*
i.58
i.ee
z.,,
I..?
.015 10.01 1.09 3.92 5.z7 6.15 r.54 r.3. ..ll a.0. 3.93 3.7s 1.29 0
.
1 1.21 1.1, 3.0" 2.89 2.78
.*s. . . . . L ,o.,s ..>a 7.2. 6.62 a.,o 5.85 5.5. 5.3- 5.21 4.97 ..TO .A, ..I. 3.99 1.8. >.6D 1.51
2
Table A9. Percentage points of the F-distribution (continued)

ν2   F     ν1 =  1  2  3  4  5  6  7  8  9  10  12  15  20  30  40  60  120  ∞
20 .90 2.9, 2.59 1.311 1.)$ ?.I6 2.09 2.0, 1.00 1.96 I.* 1.80 I.". 1.79 I.,. I.,, t.40 1.6. ,.s,
.s5 4.15 3.49 3.10 2.7, z.60 1.51 2.&5 2.19 2.35 2.211 z.10 2.12 ?.OL 1.W !.'5 I.% i.8r
.97? 5.87 ..a6 3.86 1.29 3.13 1.01 2.9, >.a* i.77 2.68 >.51 2.66 2.39 1.29 1.22 2.11 2.09
_s9 (1.10 5.85 6.91 1..' *.LO 3.81 1.10 3.56 3.46 3.37 3.11 1.00 2.0. 2.78 Z.69 2.61 1.52 2.62
,995 9.9. 6.99 5.12 5 . ) ~ s.76 *.*I b.09 i.96 3.85 i.68 1.50 1.32 3.82 I.02 2.92 Z.sl 2.69
_999 s.sS a.10 e..e 6.w 5.69 Z.I* 5 2 1 5.08 6.82 a.56 *.2- ..no 3.16 3.m 1.11 3 . 1 ~

CI .en
.v5
2.96
L.12
2.5,
I.',
2.36
1.67
*.,,
2.e.
2.1,.
2.68
>.OF
2.57
2.02
?.a9
1.9" t.95
2.17
8.W
2.12
,.a,
2.25
1.8,
1.11
1.78
2.10
1.72
2.01
1.6-
1.96
,.fib
1.V
1.52
1.81
h.59
1-91
.P15
.P')
,995
5.C3
a,"?
+.(I
6.12
5.7"
6.89
1.82
.#,
1.73
..,,
9."-
1.25
..o.
r.60
1.09
,."I
*.?s
z.97
3.6.
4.21
1.1I1
3.5,
r.01
2.80
,.a0
3.m
2.73
3.3,
1.17
,.,,
2.6.

1.60
7.51
3.0,
,.r>
?.LZ
2.m
i.>r
I.>!
z.72
3.n~
2.25
2.e.
2.95
2.I.
2.57
2.a.
2.11
2..*
i.7,
2.0.
Z.36
r.n~
,999 $1.59 9.77 ?.PI 6.95 6.32 5.88 5.56 5.11 5.11 L.95 ..I0 ...I ..!I 3.3" 3.7. 1.51 !.at 3.25

ZZ .ea 1.95 1.56 2.35 2.7, 2.13 2.M 2.01 1.97 1.93 i.90 I.Ba l.el 1.76 I.?" 1.67 I.*. 1.60 l.57
-95 a.10 3.L. 1.05 ?.R> 2.66 1.55 2.66 i.LO 2
.
3
' 2.10 2.21 2.15 2.07 1.01 1.9- 1.19 1.W 1.75
7 5.79 L.38 3.78 >.a' 3-22 1.05 2.Ql 2.W Z.76 2.70 2.60 2.50 2.39 2.77 2.21 2.1. 2.0. Z.0"
.99 7.95 5.12 *.a2 r.31 ?.99 1.16 3.59 3.65 3.15 1.16 3.12 2.98 1.81 2.67 2.58 2.50 i.lO 1.1,
.9q5 9.73 &.a1 5.65 5.n? L.61 b.12 b.11 1.W 3.81 1.70 1.5. 1-16 3.18 2.'" 2.W 2.77 Z.11. 2.55
.9Pe IL.38 °.$I 7.90 *..I L.19 5.76 Z.LI 5.19 ..*9 a.11 1.51 ..11 6.06 1.11 1.61 3.LI 1.12 >.IS
23 .iD 2
.
9' 2.55 1.11 2.7) 2.il 2.05 1.W 1.95 1.W 1.89 &.a* i.80 1.7. I.r9 >.as 1.62 1.59 1.55
-95 r.28 1.w 3.01 2.80 >.a1 2.53 2.L1 2.37 1.32 2.27 2.10 2.11 1.05 1.06 I.91 I.86 I.81 1.76
.P15 5.15 L.15 1.15 1.LI i.L8 3.02 2.90 2.81 2.73 P.67 2.57 7.*7 2.16 1.71 2.18 Z.11 i.OI I.*'
-09 7.08 5.66 6.7& .,Z* >.*a 3.7, 3.5+ ,.+I 3.20 3.21 3.07 ?.Q, 2.m ?.M 2.5. 2..5 2.35 2.Z6
,995 9.83 6.73 5.58 4.P5 L.5' 1.26 L.05 1.88 1.75 3.61 1.11 1.1. 1.IZ Z.02 1.82 2.77 >.an &re
.ws I*.** e.*r 7.67 6.60 6.08 5.65 523 5.09 1.8- +.,a *.*a L.E 3.w >.*8 1.53 1.11 1.~2 3.05

2, .so 2.9, Z.S* 1.33 ?.I9 2.b0 2.04 1.98 i.s* 1.9) 8 . M ).el 1.7" 1.73 I..? ,.a& ,.at 1.57 1.53
.95 4.X 3.60 1.01 2.7s 2.w 2.51 I.'I 2.16 1.10 2.2s 2.18 7.18 2.0, 8.9. 1.89 1.1. 1.7" 1-73
,915 5.72 6.12 1.7Z 3.18 3.LS 2.99 2.117 1.71 2.70 2.64 2.51 2.L. 2.31 1.21 1.15 L.03 i.01 1.0.
.W 7.12 5.6) *.11 r.7) 1.90 1.61 1.50 3.16 1.2e 1.11 3.01 7.89 1.1. 2.58 2.69 2.L. ?.It i.LI
.Q95 9.55 6.SLi 5.52 *.19 L.L9 L.?O 3.99 1.8, 1.69 3.59 1.w 3.25 3.0. 2.ll 2.11 2.66 2.55 L.*>
.P99 i4.01 9.W 1.$5 6.59 5.98 5.55 5.23 L.99 L.80 I.6. L.1' L.ll 3.87 3.59 3.65 >.?* >.lc 2.97
25 .90 2.91 1.51 2.11 1.11 1.09 1.01 1.91 1.91 L.89 1.81 1.82 1.77 1.72 1.66 1.63 1.59 1.11 I.5L
.s6 6.2. 3.3s 2.99 2.76 2.~0 *.r9 I.*a 2.14 2.2e 1.11 2.1. p.09 2.01 1.92 1.87 I..? 1.77 1.n
,912 5.69 6.29 1.69 3.75 1.13 1.97 ?.a5 2.75 Z.68 2.61 1.11 ?..I 2.30 2.18 2.22 ?.a7 i.e. L.91
.e* 7.77 5.57 ..68 *.,a 3.85 3.63 3.*6 3.32 3.22 3.13 2.99 ?.a5 2.70 t.5. 2.6% 2.38 2.27 z.,,
,995 9.W &.SO 5.L6 6.R. L.*l L.15 3.9. 3.6. 3.5. 3.37 3.20 3.0, r.li 2.72 2.61 2.5- >.la
.l)w 11.e" 9.22 I.rr s.rq 5.8a s.rs 5.15 4.91 ..TI r.56 r.>t ..os 3.79 1.51 1.17 3.22 3.0- r.3.
16 .90 2.91 1.11 1.11 2.>7 2.08 2.0, 1.96 1.92 1.88 I... 8.08 1.76 1.11 1.65 1.h1 1.51 1.5b 1.50
.95 6.2, 1.17 2.98 1.1. 1.59 1
.2 2.3') 2.12 2.27 1.12 2.15 2.07 I.99 1.90 1.85
.el$
.se
3.66
7.72
..n
5.53
3.6, 1.11
..I.
3.10 2.9, 2.81 2.73 2.65 2.59 2.49 .?.m
,* 2.r. 2.36 z.oe
I.10
2.03 1.9~
1.75 1.6-
i.a8
,995 *..I e.11
6.6.
5.11
3.82 3.59 3.62 3.29 3.18 3.09 2.96 z.66 2.50 2.62 2.3, 2.2, z.~,
4.V *+>I( ..I0 I.89 3.73 3.60 1.M 1.11 1.15 2.97 i.77 1.67 2.56 Z.L5 2.3,
.w 13.7. 9.12 7.16 6.61 5.m 5.38 5.07 4.8, r.sr +.+a 4.2. 3.99 3.12 I... 1.30 3.15 2.9s r.az
27 .90 2.90 2.3, 1.10 ?.I7 2.07 2.00 1.93 i.9, 1.87 1.115 1.110 1.75 1.70 8.6. 1.60 1.57 l.sl I...
.es 6.21 3.s 2.w 1.71 2.51 1.w 2.37 2.u 1.25 2.20 1.1, p.06 1.9, 3.0. 1. 1.79 1.7, ,.&,
,975 5.63 L.2. 1.65 >.?I 3.08 2.92 2.80 2.11 2A3 1.57 1..1 2.?4 1.25 2.8, 2.07 2.0. 1.93 i.ar
.w 1.a 5.as L.~O 6.1, 1.18 3.5t 3 . 3 ~ 3.26 1.15 3.06 2.9) 2.1" 2.63 2 . r ~ 2.1. 2.29 2.20 z.10
.sP5 9.3. 6.** 5.36 ..I& ..,a L.06 1.85 1.69 1.56 >.a5 1.11 1.11 2.Vl 2.73 2.63 2.52 2.r) L.Zs
.99* 33.61 9.02 7.27 6.13 5.73 5.31 5.00 a.76 a.57 6.68 6.87 >.V2 3.66 3.38 3.23 3.0s 2.92 2.7s

Table A9. Percentage points of the F-distribution (continued)

b28 .so
.s~
.s15
9, 1

2.es
..IO
5.~1
2

2.50
1.3.
6.22
3

1.29
2..5
1.61
r

1.~6
1.1,
3.99
5

?.Oh
1.56
1.06
6

2.00
2.15
i.90
I

1.91
2.36
2.78
II

1.90
1.29
2.6Q
9

1.87
LO

I.*&
i . ~ +I.,?
Z.al 1.55
I2

I.7-
2.12
1.45
15

I.7L
1.0'
2.36
ZO

1.69
1.96
Z.23
1s

1.61
1.a7
?.I1
*O

I.%
1.82
2.05
0

1.56
1.7r
,.Pa
120

1.52
1.1,
1.91
>.'a
1.65
1.11
a

.sq 1.6. 5.~5 e.11 *.01 3.75 1.51 1.36 1.21 3.12 1.01 Z.90 7.71 2.60 ?.a4 2.15 Z.ZL 2.17 2.06
.9e5 9.2. .
.
LL 1.1. *.TO b.10 *.oi 1.81 3.~5 1.52 1.~1 1.15 2.01 r.ae 2.w 1.59 2.68 2.n r.zs
,999 11.50 8.9, 7.19 6.25 5.aa r.2r e.9, ..as r.50 *.,s r.,, ,.a* 3.60 3.37 >.,a 3.02 2.86 i.63

29 .')O
.P.
1.89
..I8
2.50
1.11
2.2.
>.a3
,.,'
2.70
2.06
1.55
1.99
1.63
1.9,
p.35
1.89
1.28
1.m
1.12
1.8,
?.nu
1.7s
2.10
I.,,
>.O?
,.he
>.PI
$.*I
I.s5
i.21
1.81
1.15
1.75
1.51
i.70
,..,
I.*.
,975 5.59 6.20 ).a1 1.27 1.01 2.w 2.76 1.67 1.19 1.51 1.*1 ?.ll 2.21 2.09 2.01 1.96 i.89 1-SI
.P 7.e0 5.e. s.51 *.n* 3.71 3.50 1.33 3.20 3.0- 3.00 2.87 7.7, 2.51 >..I 1.11 1.71 2.1. 2.01
...5 9.13 6.ao 3.18 *.&c L.26 1.9C 1.11 3.6, >.La 1.18 3.2, 3.0. 2.M 2.66 1.50 2.L5 1.11 2.21
.99.) 11.19 8.15 1.11 6.19 5.19 5.ll L.87 a.6' L.LS *.Iq L.F l.an i.3 >.El 3.11 2.q7 2.8, ?.a*
30 .eo 2.m z..9 2.a ,.I' 2.05 1.9- t.9, ,.an 1.s ,.ez 1.7, L?,. ,.67 1.6, 1.57 1.5' b.50 1.66
.PS
.*,?
.ss
a,,,
5.5,
7.56
1.12
..IS
5.39
i.02
3.59
<.5,
,.L9
2.75
..OF
2.53
3.0,
3.70
Z.'i
2.8,
,..7
2.11
2.75
3.m
?.<I
2.65
3.,7
2.2,
2.57
3.07
2.16
2.5,
2.98
i.OP
z..,
2.8..
p.01
2.3,
>.TO
i.91
2.70
2.55 2.m
1.8.
,.a7
1.19
z.0,
2.30
,.%
1.11

2.2,
3.69
1.87
Z.,&
i.62
L.79
*."L
5 9.18 6.15 1.1- r.62 L.23 3.9% 1.11 1.51 1.L5 3.3. 1.18 3.01 2.8: ?.L1 1.52 2.*? 2.30 2.18
,993 13.21) 8.71 7.05 6.11 5.51 5.12 1.81 r.58 i.37 '
2
.
L ..OD 1.75 I.*9 3.Z 1.07 1.'2 2.76 i.59

LO .PI 2.w 1.11 2.73 2.00 2.00 i.9: 1.87 1.83 1.79 1.76 1.71 1.66 1.6, I.?* i.li I.*? L.+? 1.29
.P5 1.08 3.29 2.sC 2.6, 2.*1 1.11 2.25 2.11 2.12 1.011 1.00 I.9Z 1.8. I.11 1.69 i.6. 5.58 1.5,
+m 2 . a ~ ..)I >.LO 1.~1 2.90 2.7. ?.a* 2.53 2.15 1.39 2.19 2.1. 1.07 1.9. i.88 1.30 l . 7 ~ #.re
.Ps 1.11 5.ie L.31 ,.a3 1.51 3.29 1.11 2.99 2.8- I.-* Z.66 7.5) 1.11 2.m 2.)) i.aZ i.?2 1.80
.9e5
.*Pa
a.81
11.61
s.01
8.15
L.98
6.60
r.37
5.70
3.9-
5.33
3.71
6.73 *..'
1.51 1.15
L.2,
1.22
s.01
>.I2
1.117
?.Q5
1.6s
2.71
>.LO
2.00
3.15
i.rO
2.41
2.10
2.71
2.1-
2.51
L.UL
L.ll
t.93
L.L1
60 _PO 2.79 2.3- ?.,a 2.e. ,.95 ,.a7 ,.a? 3.7. 1.7. 1.7, ,.a* 1.6" 1.5. ,..R I.&* ,.a0 2.35 I.??
.*5 *.so 3.,5 Z.76 2.5, 2.37 2.25 *.,r 2.10 2.0. 1.99 1.32 ,.as 1.7- 1.65 1.59 t.5, 2.-7 1.w
.915 3.29 3.93 1.N 1.0, 2.1' 2.63 2.51 ?..I 2.13 2.27 1.11 2.06 1.'. i.12 1.11 1.61 1.53 1-66
.es 7.0" ..?a ..,3 3.65 3.3. 3.,2 2.97 2.82 2.72 2.63 Z.5" 9.35 2.20 7.0, t.9. 1.8. L.73 1.w
,995 8.'') 5.19 L.71 +_!a 1.76 3.W 1.23 1.11 2.01 2.90 2.7. 1.57 1.19 7.19 2.01 I_% ,.XI I.&'
,999 Lx.97 7.76 6.>7 5.31 a.76 *.>7 a.0- >.e9 >.% 3 . 3 1 3.08 2.83 2.q5 ?.st ?.25 Z.08 1.m

120 .en 2.7% 2.35 2.1, b.99 1.m L.*? 2.77 >.7? ,.6* 1.65 1.60 1.55 ,.*8 ,.L. 1.3, 1.32 ,.a 1.19
.er 3.5z ,.@7 2.m 2.'5 2.a 2.t7 2.09 2.02 1.96 I.*, ,.a, t.75 1.M z.75 ,.$O I.', b.35 1.25
.SIS 5.15 >.a0 3.23 ?.19 ?.b1 2.51 2
.
3
' 2.30 2.22 2.16 1.05 i.W i.sZ 1.6' 1.61 1.51 %..I i.Ji
.se 6.85 r.79 3 . e ~ i.ra 1.17 2.9s 2.79 i.ar 2.56 2.rr 2.1. ,.I* 2.03 1.r6 1.76 1.66 1.5, 1.38
,995 s.18 3.5. 6.50 3.w 1.55 I.2L 3.09 2.93 2.81 1.71 2.5. 1.17 2.19 1.91 1.87 1.75 I.LI I.*>
.W9 kt.>* 7.32 3.79 +.s5 ..+* ..Ok 3.77 3.S 3.>8 3.26 >.OZ ?.,a z.53 2.26 2.11 L.+< 1.76 >.5+

9) .90 2.71 1.30 2.011 1.W 1.85 1.77 1.12 1.67 1.61 1.60 I.15 >.a9 I.LZ I.,& 1.10 8.1. 1.17 1.00
.91 1.W 3.00 2.60 7.37 P.21 2.10 2.0, i.9. 1.88 i.sl i.75 i.67 1.57 1.r- 1.1- 1.11 1.11 l.0"
,975 5.01 3.6- 1.11 2.79 2.Sl 1.*1 2.29 2.19 1.11 2.05 l_Or 1.113 1.11 1.97 !.A11 1.1- 1.27 1.00
.9s 6.61 *.61 3.18 1.17 1.02 ?.*a 2.66 2.51 ?A1 2.11 2.i8 ?.O1 1.81 1.76 1.59 1.67 1.32 i.00
.ees ?.an 5.30 1.2s 3.12 1.35 3.09 2.90 P
.
7
. 2.61 1.5Z 1.16 2.19 2.00 1.7- 1.61 1.51 1.36 1.00
.PW 1O.e) 6.9, 5.*1 L.61 ..I0 I.?. > . + I 1.21 1.10 2.W 1.11 1.51 2.27 1.99 1.8. 8.66 1.+5 3.00

Table A10. Percentage points of the Kolmogorov-Smirnov statistic*

The table gives values $d_\alpha$ for specified values of $\alpha$ such that

$$P(D_n < d_\alpha) = 1 - \alpha,$$

where the Kolmogorov-Smirnov test statistic $D_n$ for the sample of size n is the largest deviation between the observed cumulative distribution and the theoretical cumulative distribution.

* Abridged and adapted from Table 15.1 in Donald B. Owen: Handbook of Statistical Tables, 1962, Addison-Wesley Publishing Company, Inc., Reading, Mass., by permission of the publisher.

Table A11. Critical values of the run statistic

The table gives critical values $r_\alpha$ of the run statistic r with probability distribution p(r) defined in Sect. 14.6.2 for sample sizes n,m up to 15 (n ≤ m). The table assumes a one-sided test of significance $\alpha$ with critical region in the lower tail of the probability distribution.
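
The Kolmogorov-Smirnov statistic of Table A10, the largest deviation between the observed and the theoretical cumulative distribution, can be computed from a sample as sketched below. This is an illustration under stated assumptions: `cdf` is any user-supplied theoretical cumulative distribution, and the uniform distribution on [0, 1] used in the example is purely hypothetical.

```python
def ks_statistic(sample, cdf):
    """Largest deviation between the empirical and the theoretical cumulative distribution."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical distribution jumps from i/n to (i+1)/n at x,
        # so the deviation is checked on both sides of the step.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


if __name__ == "__main__":
    # Hypothetical example: three observations tested against a uniform distribution on [0, 1].
    print(ks_statistic([0.1, 0.4, 0.7], lambda x: x))
```

The computed value is then compared with the tabulated percentage point for the same sample size n and the chosen α.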
Table A12. Critical values of the Wilcoxon rank sum statistic*

The table gives critical values of the Wilcoxon two-sample test statistic W defined in Sect. 14.6.9 for sample sizes n,m up to 25 (n ≤ m). The table assumes a one-sided test of significance $\alpha$ as quoted in boldface at the head of the columns, and critical region in the lower tail of the probability distribution. For a two-sided test of significance $\alpha$, the lower critical value $W_l$ is found as the table entry for $\frac{1}{2}\alpha$, and the upper critical value is deduced as $W_u = 2\bar{W} - W_l$, where $2\bar{W}$ is also given in the table.
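
The upper critical value of a two-sided Wilcoxon test follows from the lower one by simple arithmetic. In the sketch below the function name is hypothetical, and the closed form 2W̄ = n(n + m + 1) is an assumption that is not stated in the caption; it does agree with the legible 2W̄ entries of the table (for example 210 for n = m = 10). The lower critical value 78 used in the example is the entry for n = m = 10 under 0.025.

```python
def upper_critical_value(w_lower, n, m):
    """Upper critical value W_u = 2*W_bar - W_l for sample sizes n <= m.

    Assumes 2*W_bar = n * (n + m + 1), i.e. twice the mean of the rank sum
    of the smaller sample.
    """
    if n > m:
        raise ValueError("the table is arranged with n <= m")
    two_w_bar = n * (n + m + 1)
    return two_w_bar - w_lower


if __name__ == "__main__":
    # Two-sided test of significance 0.05 for n = m = 10:
    # the lower critical value is the table entry for 0.025.
    print(upper_critical_value(78, 10, 10))
```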

n=, n- R
m   0.001  0.005  0.010  0.025  0.05  0.10  2W̄    0.001  0.005  0.010  0.025  0.05  0.10  2W̄   m

I, i l 111
I!, 5, I,"
5% in i4i
i, iil 16,
$0 "I llil
D (il in8
81 lili 175
6% li!) 182
Bi 71 In!+
A, 7, ID"

n = 9                                             n = 10
m   0.001  0.005  0.010  0.025  0.05  0.10  2W̄    0.001  0.005  0.010  0.025  0.05  0.10  2W̄   m
1 SS 60 39 82 80 70 175
10 33 68 81 ni 69 13 I80 85 rl il 78 8l 87 210 I0
II fii 01 13 88 !? ?R 18% 07 ra )r 81 ao n t a m II
II (13 "6 71 ,., RO 198 68 m 79 a4 8s 8 i nro 11
IS 69 06 OR 13 70 83 PO7 72 18 IIP 88 9% 08 240 I3
I4 00 A7 I! 16 RI 86 P l l l 74 RI 85 DI 90 101 250 14
I5 82 8" 73 10 84 "0 821, 78 84 88 4 DO I" ROO 15

* Reproduced with changes in notation from Table 1 in L.R. Verdooren: "Extended tables of critical values for Wilcoxon's test statistic", Biometrika, 50 (1963) 177-186, by permission of the publisher.
Bibliography

General books on probability and statistics

BRADLEY, J.V., Distribution-Free Statistical Tests,
    Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1968.
BREIMAN, L., Statistics With a View Toward Applications,
    Houghton Mifflin Company, Boston, 1973.
CRAMÉR, H., Mathematical Methods of Statistics,
    Princeton University Press, Princeton, 1966.
FISHER, R.A., Statistical Methods for Research Workers, Thirteenth Edition,
    Oliver and Boyd, Edinburgh-London, 1958.
HOGG, R.V. and CRAIG, A.T.,
    Introduction to Mathematical Statistics, Second Edition,
    The Macmillan Company, New York,
    Collier-Macmillan Limited, London, 1967.
KENDALL, M.G. and STUART, A.,
    The Advanced Theory of Statistics, In Three Volumes,
    Vol. 1 Distribution Theory, Second Edition 1963;
    Vol. 2 Inference and Relationship, Second Edition 1967;
    Vol. 3 Design and Analysis, and Time-Series, 1966;
    Charles Griffin & Company Limited, London.
LEHMANN, E.L., Nonparametrics: Statistical Methods Based on Ranks,
    Holden-Day, Inc., San Francisco, 1975.
LINDLEY, D.V., Introduction to Probability and Statistics, In Two Volumes,
    Part 1 Probability, Part 2 Inference,
    Cambridge University Press, London, 1965.
MOOD, A.M., GRAYBILL, F.A., and BOES, D.C.,
    Introduction to the Theory of Statistics, Third Edition,
    McGraw-Hill Book Company, Inc., New York, 1974.
OSTLE, B., Statistics in Research. Basic Concepts and Techniques
    for Research Workers,
    The Iowa State College Press, Ames, Iowa, 1956.
SVERDRUP, E., Lov og tilfeldighet. Den praktiske statistikks metode og teknikk,
    I to bind (in Norwegian);
    Bind I En elementær innføring, 2. utgave 1973;
    Bind II En matematisk videreføring, 1964;
    Universitetsforlaget, Oslo.
    Laws and Chance Variations. Basic Concepts of Statistical
    Inference, In Two Volumes, (English translation);
    Vol. I Elementary Introduction;
    Vol. II More Advanced Treatment;
    North-Holland Publishing Company, Amsterdam, 1967.
WALPOLE, R.E., Introduction to Statistics, Second Edition,
    Macmillan Publishing Co., Inc., New York,
    Collier Macmillan Publishers, London, 1974.


Books on statistics applied to physics

BRANDT, S.,
Statistical and Computational Methods in Data Analysis,
Second Revised Edition,
North-Holland Publishing Company, Amsterdam.
Elsevier North-Holland, Inc., New York, 1976.
COOPER, B.E.,
Statistics for Experimentalists,
Pergamon Press, Oxford-London-Edinburgh-New York-Toronto-Sydney-Paris-Braunschweig, 1969.
EADIE, W.T., DRIJARD, D., JAMES, F.E., ROOS, M., and SADOULET, B.,
Statistical Methods in Experimental Physics,
North-Holland Publishing Company, Amsterdam-London, 1971.
JANOSSY, L., Theory and Practice for the Evaluation of Measurements,
Oxford at the Clarendon Press, Oxford University Press, 1965.
JOHNSON, N.L. and LEONE, F.C.,
Statistics and Experimental Design in Engineering and the Physical Sciences, Volume I,
John Wiley & Sons, Inc., New York-London-Sydney, 1964.
MARTIN, B.R., Statistics for Physicists,
Academic Press, London-New York, 1971.
PARRATT, L.G., Probability and Experimental Errors in Science,
John Wiley & Sons, Inc., New York-London, 1961.
WINE, R.L., Statistics for Scientists and Engineers,
Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1964.

Statistical tables

FISHER, R.A. and YATES, F.,
Statistical Tables for Biological, Agricultural and Medical Research,
Oliver and Boyd, Edinburgh, 1938.
MILLER, L.H., Table of percentage points of Kolmogorov statistics,
J. Amer. Statist. Ass. 51, 111 (1956).
OWEN, D.B., Handbook of Statistical Tables,
Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1962.
RESNIKOFF, G.J. and LIEBERMAN, G.J.,
Tables of the non-central t-distribution,
Stanford University Press, 1957.
VERDOOREN, L.R., Extended tables of critical values for Wilcoxon's test statistic,
Biometrika 50, 177 (1963).
PEARSON, E.S. and HARTLEY, H.O. (editors),
Biometrika Tables for Statisticians, Volumes I and II,
Cambridge, 1970 and 1972.

Articles, reports, and lecture notes on statistical estimation in particle physics

ANNIS, M., CHESTON, W., and PRIMAKOFF, H., On statistical estimation in physics,
Rev. Mod. Phys. 25, 818 (1953).
HUDSON, D.J., Lectures on Elementary Statistics and Probability,
CERN 63-29 (1963), and
Statistics Lectures II: Maximum Likelihood and Least Squares Theory, CERN 64-18 (1964).
JAMES, F., Function Minimization, CERN 72-21 (1972).
JAUNEAU, L., and MORELLET, D., Notions of statistics and applications, in
Methods in Subnuclear Physics, Volume IV Part 3, Data Handling,
edited by M. Nikolić,
Gordon and Breach Science Publishers, New York-London-Paris, 1970.
KNOP, R.E., Errors in estimation of total events, Rev. Sci. Instrum. 5, 1518 (1971).
OREAR, J., Notes on statistics for physicists, UCRL-8417 (1958).
ROSENFELD, A.H. and HUMPHREY, W.E., Analysis of bubble chamber data,
Ann. Rev. Nucl. Sci. 13, 103 (1963).
SHEPPEY, G.C., Minimization and curve fitting, in Programming Techniques,
CERN 68-5 (1968).
SOLMITZ, F., Analysis of experiments in particle physics,
Ann. Rev. Nucl. Sci. 14, 375 (1964).

Articles referred to in the text

ADAIR, R.K. and KASHA, H., Analysis of some results of quark searches,
Phys. Rev. Letters 23, 1355 (1969).
BARTLETT, M.S., On the statistical estimation of mean life-times, Phil. Mag. 14, 249 (1953), and
Estimation of mean lifetimes from multiplate cloud chamber tracks, Phil. Mag. 5, 1407 (1953).
DAVID, F.N., A χ² "smooth" test for goodness-of-fit, Biometrika 34, 299 (1947).
DAVIDON, W.C., Variance algorithm for minimization, Comp. J. 10, 606 (1960).
DURBIN, J., Kolmogorov-Smirnov tests when parameters are estimated with applications
to tests of exponentiality and tests on spacings, Biometrika 9, 5 (1975).
EVANS, D.A. and BARKAS, W.H., Exact treatment of search statistics,
Nucl. Inst. Meth. 56, 289 (1967).
GELFAND, I.M. and TSETLIN, M.L., The principles of nonlocal search in automatic
optimization systems, Soviet Phys. Dokl. 6, 192 (1961).
GOLDSTEIN, A.A. and PRICE, J.F., On descent from local minima,
Math. Comput. 25, 569 (1971).
KRUSKAL, W.H. and WALLIS, W.A., The use of ranks in one-criterion variance analysis,
J. Amer. Statist. Ass. 5, 583 (1952).
NELDER, J.A. and MEAD, R., A simplex method for function minimization,
Comp. J. 7, 348 (1965).
PARTICLE DATA GROUP, Review of particle properties,
Rev. Mod. Phys. 9, No. 2, Part II, April 1976.
REINES, F., GURR, H.S., and SOBEL, H.W., Detection of ν̄e-e scattering,
Phys. Rev. Letters 37, 315 (1976).
ROSENBROCK, H., An automatic method for finding the greatest or least value of a function,
Comp. J. 2, 175 (1960).
WILCOXON, F., Individual comparisons by ranking methods,
Biometrics Bulletin 1, 80 (1945).
