
ERRORS OF OBSERVATION AND THEIR TREATMENT

by J. TOPPING,
PH.D., F.INST.P.
Formerly Vice-Chancellor, Brunel University


FOURTH EDITION

CHAPMAN AND HALL

SCIENCE PAPERBACKS


First published 1955
Reprinted once
Second edition 1957
Reprinted twice
Third edition 1962
Reprinted five times
Fourth edition 1972
Chapman and Hall Ltd., 11 New Fetter Lane, London EC4P 4EE

PREFACE
This little book is written in the first place for students in technical colleges taking the National Certificate Courses in Applied Physics; it is hoped it will appeal also to students of physics, and perhaps chemistry, in the sixth forms of grammar schools and in the universities. For wherever experimental work in physics, or in science generally, is undertaken the degree of accuracy of the measurements, and of the results of the experiments, must be of the first importance. Every teacher of experimental physics knows how "results" given to three or four decimal places are often in error in the first place; students suffer from "delusions of accuracy." At a higher level too, more experienced workers sometimes claim a degree of accuracy which cannot be justified. Perhaps a consideration of the topics discussed in this monograph will stimulate in students an attitude to experimental results at once more modest and more profound.

The mathematical treatment throughout has been kept as simple as possible. It has seemed advisable, however, to explain the statistical concepts at the basis of the main considerations, and it is hoped that Chapter 2 contains as elementary an account of the leading statistical ideas involved as is possible in such small compass. It is a necessary link between the simple introduction to the nature and estimation of errors given in Chapter 1, and the theory of errors discussed in Chapter 3. Proofs have usually been omitted but references to other works are given in the text. There is also a list of books for further reading.

I am much indebted to other writers, which will be obvious, and to many groups of students, particularly those at The Polytechnic, Regent Street, London, who bore patiently with my attempts to get them to write every experimental result as x ± y. I am also much indebted to friends and old students who have helped me with the provision of suitable data; and I am specially grateful to Mr. Norman Clarke, F.Inst.P., Deputy Secretary of The Institute of Physics, who has kindly read the manuscript and made many helpful suggestions.

The author of a book of this kind must always hope that not too many errors, accidental or personal, remain.

J. TOPPING
Acton
July, 1955.

© 1972 J. Topping

Printed in Great Britain by Latimer Trend and Co. Ltd., Whitstable

SBN 412 21040 1

This paperback edition is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's prior consent in any form of binding or cover other than that in which it is published and without a similar condition including this condition being imposed on the subsequent purchaser.

All rights reserved. No part of this book may be reprinted, or reproduced or utilized in any form or by any electronic, mechanical or other means, now known or hereafter invented, including photocopying and recording, or in any information storage and retrieval system, without permission in writing from the Publisher.


Distributed in the U.S.A. by HARPER & ROW PUBLISHERS, INC. BARNES & NOBLE IMPORT DIVISION

PREFACE TO THIRD EDITION
Opportunity has been taken to make one or two corrections and a few slight additions. I am grateful to all those who have written and made suggestions. It is pleasing that the book has found acceptance in universities and other institutions, both in this country and overseas.

J. TOPPING
Brunel College of Technology, London, W.3.
October, 1961.
PREFACE TO FOURTH EDITION
With the adoption by Britain of the system of S.I. units appropriate changes have been made throughout the book. Some other small revisions have also been made.

J. TOPPING
October, 1971.

CONTENTS

Chapter 1. ERRORS OF OBSERVATION
1 Accidental and systematic errors; 2 Errors and fractional errors; 3 Estimate of error; 4 Estimate of the error in compound quantities; 5 Error in a product; 6 Error in a quotient; 7 Use of the calculus; 8 Error in a sum or difference.

Chapter 2. SOME STATISTICAL IDEAS
9 Frequency distributions; 10 The mean; 11 Relative frequency; 12 The median; 13 Frequency curves; 14 Measures of dispersion; 15 The range; 16 The mean deviation; 17 The standard deviation; 18 Evaluation of standard deviation, σ; 19 Sheppard's correction; 20 Charlier's checks; 21 The mean and standard deviation of a sum; 22 Certain special frequency distributions; 23 The binomial distribution; 24 The Poisson distribution; 25 The normal distribution; 26 Relation between a normal and a binomial distribution; 27 The mean deviation of a normal distribution; 28 Area under the normal error curve; 29 Sampling, standard error of the mean; 30 Bessel's formulae; 31 Peters' formulae; 32 Fitting of a normal curve; 33 Other frequency distributions.

Chapter 3. THEORY OF ERRORS
34 The normal or Gaussian law of error; 35 Applicability of the normal law of error; 36 Normal error distributions; 37 Standard error of a sum or difference; 38 Standard error of a product; 39 Standard error of a compound quantity; 40 Method of least squares; 41 Weighted mean; 42 Standard error of weighted mean; 43 Internal and external consistency; 44 Other applications of the method of least squares, solution of linear equations; 45 Solution of linear equations involving observed quantities; 46 Curve fitting; 47 Line of regression; 48 Accuracy of coefficients; 49 Other curves.

REFERENCES
BIBLIOGRAPHY
INDEX
CHAPTER 1

ERRORS OF OBSERVATION

"And so we see that the poetry fades out of the problem, and by the time the serious application of exact science begins we are left with only pointer readings."
EDDINGTON


1. Accidental and systematic errors
Although physics is an exact science, the pointer readings of the physicist's instruments do not give the exact values of the quantities measured. All measurements in physics and in science generally are inaccurate in some degree, so that what is sometimes called the "accurate" value or the "actual" value of a physical quantity, such as a length, a time interval or a temperature, cannot be found. However, it seems reasonable to assume that the "accurate" value exists, and we shall be concerned to estimate limits between which this value lies. The closer these limits the more accurate the measurement. In short, as the "accurate" value is denied us, we shall endeavour to show how the "most accurate" value indicated by a set of measurements can be found, and how its accuracy can be estimated.

Of course the aim of every experimentalist is not necessarily to make the error in his measurements as small as possible; a cruder result may serve his purposes well enough, but he must be assured that the errors in his measurements are so small as not to affect the conclusions he infers from his results.

The difference between the observed value of any physical quantity and the "accurate" value is called the error of observation. Such errors follow no simple law and in general arise from many causes. Even a single observer using the same piece of apparatus several times to measure a certain quantity will not always record exactly the same value. This may result from some lack of precision or uniformity of the instrument or instruments used, or from the variability of the observer, or from some small changes in other physical factors which control the measurement.

Errors of observation are usually grouped as accidental and systematic, although it is sometimes difficult to distinguish between them and many errors are a combination of the two types.


Accidental errors are usually due to the observer and are often revealed by repeated observations; they are disordered in their incidence and variable in magnitude, positive and negative values occurring in repeated measurements in no ascertainable sequence. On the other hand, systematic errors may arise from the observer or the instrument; they are usually the more troublesome, for repeated observations do not necessarily reveal them and even when their existence or nature has been established they are sometimes difficult to eliminate or determine; they may be constant or may vary in some regular way. For instance, if a dial gauge is used in which the pivot is not exactly centred, readings which are accurately observed will be subject to a periodic systematic error. Again, measurements of the rise of a liquid in a tube using a scale fixed to the tube will be consistently too high if the tube is not accurately vertical; in this case the systematic errors are positive and proportional to the height of liquid. Further, measuring devices may be faulty in various ways; even the best possible instruments are limited in precision and it is important that the observer should appreciate their imperfections.

Errors peculiar to a particular observer are often termed personal errors; we sometimes speak of the "personal equation." Errors of this kind are well authenticated in astronomical work. Bessel, for instance, examined the estimates of time passages in astronomical observations and found that systematic differences existed amongst the leading astronomers of his time. That similar differences exist amongst students and observers today will be familiar to teachers and scientific workers alike.

More fundamentally there is the "error" introduced by the very process of observation itself which influences in some measure the phenomenon observed. In atomic physics this is specially important and is enshrined in the uncertainty principle due to Heisenberg, but in macroscopic phenomena with which we shall be mainly concerned it can be neglected. Further, there are many phenomena in physics, often included under the general term "noise," where fluctuations arise due to atomic or sub-atomic particles which set a natural limit to the accuracy of measurements. These fluctuations are, however, usually very much smaller than the errors which arise from other causes. For instance, molecular bombardments of the suspension of a suitably sensitive galvanometer produce irregular deflexions which can be recorded. An example of such deflexions obtained using a torsion balance is shown in Fig. 1.

[Figure: trace of deflexion against time; the scale bar marks 30 sec.]

Fig. 1. Record of the deflexions of a supersensitive torsion balance showing irregular fluctuations in time due to the Brownian motion of the instrument. (From an investigation by E. Kappler.)
2. Errors and fractional errors
If a quantity $x_0$ units is measured and recorded as $x$ units we shall call $x - x_0$ the error in $x_0$, and denote it usually by $e$. It might be positive or negative but we shall assume throughout that its numerical value is small compared with that of $x_0$; this is usually written $|e| \ll |x_0|$. We can write

$$x = x_0 + e = x_0(1 + f)$$

where $f = e/x_0$ and is known as the fractional error in $x_0$. Also $100e/x_0$ is called the percentage error in $x_0$. Of course $e/x_0$ can be written as $e/x$ approximately when $|e| \ll |x_0|$, that is, when $|f| \ll 1$. We note that

$$\frac{x}{x_0} = 1 + f$$

and

$$\frac{x_0}{x} = \frac{1}{1 + f} \approx 1 - f \quad \text{if } |f| \ll 1.$$

Also,

$$\frac{e}{x_0} = \frac{e}{x}(1 + f) \approx \frac{e}{x}.$$
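The definitions of Section 2 can be checked numerically. The short sketch below uses made-up values for $x_0$ and $x$ (they are assumptions for illustration only; in practice $x_0$ is never known) and compares the exact ratios with the approximations that hold when $|f| \ll 1$.

```python
# Illustrative numbers only: in practice the "accurate" value x0 is unknown.
x0 = 50.00   # assumed accurate value
x = 50.20    # recorded value

e = x - x0          # error
f = e / x0          # fractional error
percent = 100 * f   # percentage error

print(f"error e = {e:.2f}")
print(f"fractional error f = {f:.4f}, percentage error = {percent:.2f}%")

# Because |f| << 1, e/x is nearly the same as e/x0, and x0/x is nearly 1 - f.
print(f"e/x  = {e / x:.6f}  compared with e/x0  = {f:.6f}")
print(f"x0/x = {x0 / x:.6f}  compared with 1 - f = {1 - f:.6f}")
```

The discrepancies in the last two comparisons are of order $f^2$, which is why they may be ignored when the fractional error is small.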

3. Estimate of error
If only a single measurement is made any estimate of the error may be widely wrong. There may be a large personal or accidental error, as well as that due to the lack of precision of the particular instrument used. To take a simple example, a student timing the oscillations of a simple pendulum may count 49 swings as 50, and the stop-watch he uses may be accurate to perhaps 0·1 second. Also his "reaction time," in using the stop-watch, may be different from that of another student, but perhaps in a timing such as this the effect of the "reaction time" may be neglected as it is reasonable to assume it is the same at the beginning as at the end of the time interval. In this case, assuming he counts the swings correctly the error in the measurement is dictated by the accuracy of the stop-watch. The fractional error in the timing may be reduced by increasing the number of swings timed but this tends to increase the error in counting the number of swings, unless some counting device is used.

The following times were obtained by ten different students using the same pendulum and the same watch: 37·2, 36·2, 36·7, 36·8, 37·0, 35·4, 36·9, 36·7, 37·2, 36·8 seconds for 20 swings and 73·8, 74·0, 73·6, 74·4, 74·2, 73·0, 74·1, 74·3, 74·0, 74·7 seconds for 40 swings.

To obviate or reveal accidental errors, repeated measurements of the same quantity are made by the same observer, whenever possible. (If the phenomenon is unique the measurements cannot be repeated; we cannot repeat the measurements of an eclipse, for example. Of course in a fundamental sense all measurements are unique; they all refer to a particular instant.)

A set of repeated measurements might be 10·1, 10·0, 10·0, 10·2, 12·3, 10·1, 10·1, 10·0, 10·1, 10·2. It seems quite possible that some mistake was made in recording 12·3 and it is reasonable to reject it on these grounds. (Results should however not be rejected indiscriminately or without due thought. Abnormal or unusual results, followed up rather than discarded, have in the hands of a great scientist often led to important discoveries.) The above measurements indicate that the measured quantity lies between 10·0 and 10·2, and as the arithmetic mean of the nine measurements is 10 + (1/9)(0·8) ≈ 10·1, we can say that the measured quantity is 10·1 ± 0·1 to indicate the scatter of the measurements about the mean. In fact, in stating that the value of the measured quantity is 10·1 the numerical error is likely to be less than 0·1, and indeed it is possible on certain reasonable assumptions to calculate the probability that the error is not greater than some assigned magnitude. We shall discuss this more fully later.

We note here that if a quantity $x_0$ units is measured $n$ times and recorded as $x_1, x_2, \ldots, x_n$ units, we can write $x_r = x_0 + e_r$, where $e_r$ is the error in the measurement $x_r$. The arithmetic mean $\bar{x}$ of the $n$ measurements is

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = x_0 + \frac{e_1 + e_2 + \cdots + e_n}{n}$$

and as some of the errors $e_1, e_2, \ldots, e_n$ may be positive and some negative the value of $(e_1 + e_2 + \cdots + e_n)/n$ may be very small. In any case it cannot exceed numerically the greatest value of the separate errors. Thus if $e$ is the largest numerical error in any of the measurements we have

$$\frac{|e_1 + e_2 + \cdots + e_n|}{n} \le e$$

and consequently

$$|\bar{x} - x_0| \le e.$$

Hence in general $\bar{x}$ will be near to $x_0$ and may be taken as the "best" value of the measured quantity which the measurements provide. In general the larger the value of $n$ the nearer $\bar{x}$ approaches $x_0$. We shall discuss this more fully later.

It will be noticed that it is not possible to find $e_1, e_2, \ldots, e_n$ or $e$ since $x_0$ is not known. It is usual therefore to examine the scatter or dispersion of the measurements not about $x_0$ but about $\bar{x}$. If we write $x_r = \bar{x} + d_r$, then $d_r$ denotes the deviation of $x_r$ from $\bar{x}$: it is sometimes called the residual of $x_r$. We have that

$$e_r - d_r = \bar{x} - x_0$$

and whereas

$$e_1 + e_2 + \cdots + e_n = n(\bar{x} - x_0),$$

$$d_1 + d_2 + \cdots + d_n = 0.$$

We note then that by repeated measurements of the same quantity accidental errors of the observer may be corrected in some degree, but systematic errors peculiar to him cannot thus be obviated nor can any lack of accuracy in the instrument itself.
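As a small numerical illustration of the mean and the residuals, the sketch below takes the ten repeated readings discussed in this section, rejects the suspect value 12·3, and checks that the residuals about the mean sum to zero. The value `x0_assumed` is purely hypothetical: it stands in for the unknowable "accurate" value so that the bound $|\bar{x} - x_0| \le e$ can be seen at work.

```python
from statistics import mean

# Ten repeated readings; 12.3 is the suspect value that is rejected.
readings = [10.1, 10.0, 10.0, 10.2, 12.3, 10.1, 10.1, 10.0, 10.1, 10.2]
suspect = 12.3
kept = [r for r in readings if r != suspect]

x_bar = mean(kept)                      # arithmetic mean of the nine readings
residuals = [r - x_bar for r in kept]   # d_r = x_r - x_bar

print(f"mean of {len(kept)} readings = {x_bar:.2f}")
print(f"sum of residuals = {sum(residuals):.1e}")   # zero apart from rounding

# Hypothetical "accurate" value, used only to illustrate |x_bar - x0| <= e.
x0_assumed = 10.05
errors = [r - x0_assumed for r in kept]
largest_error = max(abs(err) for err in errors)
print(f"|x_bar - x0| = {abs(x_bar - x0_assumed):.3f}, "
      f"largest single error = {largest_error:.3f}")
```

Whatever value is assumed for $x_0$, the error of the mean can never exceed the largest single error, and the residuals always sum to zero.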

To take an obvious example, with a scale wrongly marked as 9 instead of 10 an observer might record measurements of 9·0, 9·0, 9·1, ... instead of 10·0, 10·0, 10·1, ... and the arithmetic mean of the former readings, however many there be, will have a "systematic" error of 1·0. This example is not so absurd as at first sight it may seem, for every instrument records $m$ instead of $M$ where $M - m$ or $(M - m)/M$ is a measure of the accuracy of the instrument.

Here the distinction that is drawn between the terms accuracy and precision might be noted.* Accuracy refers to the closeness of the measurements to the "actual" value or the "real" value of the physical quantity, whereas the term precision is used to indicate the closeness with which the measurements agree with one another quite independently of any systematic error involved. Using the notation introduced earlier in this section we say that a set of measurements $x_1, x_2, \ldots, x_n$ are of high precision if the residuals $d_r$ are small whatever the value of $\bar{x} - x_0$, whereas the accuracy of the measurements is high if the errors $e_r$ are small in which case $\bar{x} - x_0$ is small too. Accuracy therefore includes precision but the converse is not necessarily true. Sometimes more is known of the precision of an instrument than of its accuracy.

* I am grateful to Dr. H. S. Peiser who kindly brought to my notice Dr. Churchill Eisenhart's article in Photogrammetric Engineering, Vol. XVIII, No. 3, June 1952.

It is important that the observer should know what degree of accuracy he can achieve with the instrument he is using. Indeed this information helps him to decide whether the instrument is the right one to use, whether it is appropriate to the particular end he has in view. How accurately can a length of about 50 cm be measured using a metre rule? Is this sufficiently accurate for the purpose? If not, should a cathetometer be used? These are the sort of questions which any scientist must ask and answer. The answers to these questions dictate the experimental apparatus he uses. It is of little avail measuring one quantity with an accuracy of 1 in 1000, say, if another quantity which affects the result equally can only be measured with an accuracy of 1 in 100.

It will be clear that the assessment of the possible error in any measured quantity is of fundamental importance in science. It involves estimating (a) the accidental error, (b) the systematic or personal error of the observer, and (c) the systematic error of the instrument. Of these the accidental error is usually assessed by applying certain statistical concepts and techniques as we shall explain in Chapter 2. The other two errors, (b) and (c), are sometimes merely neglected or are assumed to be smaller than (a) which is not always true. Indeed systematic errors are often the most serious source of inaccuracy in experimental or observational work and scientists have to devote much time and care to their elimination and estimation. Various devices are used depending upon the nature of the measurements and of the apparatus used. There is no common or infallible rule. Certain systematic errors may be eliminated by making the measurements in special ways or in particular combinations; others may be estimated by comparing the measurements with those obtained when a quite different method or different instrument is employed. On the other hand the personal error of an observer is usually treated as constant and is determined by direct comparison with an automatic recording device or with the results of other observers. Once determined it is used to correct all the readings made by the observer. Various other corrections may be applied to the readings to take account of estimated systematic errors. In all cases the aim is so to correct the readings as to ensure that all the errors remaining are accidental. Sometimes the experiments are so designed and the measurements so randomized that any remaining systematic errors acquire the nature of accidental errors. This device is used particularly in many biological experiments.

Of course, even when all this has been done systematic errors sometimes remain. Birge has pointed out, for instance, that "atomic weight determinations prior to 1920 were afflicted with unknown and unsuspected systematic errors that averaged fully ten times the assumed experimental errors." Also, the history of the measurement of fundamental physical constants such as the velocity of light, the electronic charge and Planck's constant records how various systematic errors have been revealed and corrected. If one instructive example may be selected, in 1929 the accepted value of the electronic charge was 4·7700 × 10⁻¹⁰ e.s.u.; later it was revised to 4·8025 × 10⁻¹⁰ e.s.u., the difference between these two values being much greater than the accepted accidental error of the earlier value. Students are recommended to read some of the accounts, particularly the original papers, of experimental work of high accuracy.

For example, E. C. Bullard(1) has discussed some gravity measurements made in East Africa. After examining very carefully both the accidental and systematic errors involved he concludes that the measurements form a consistent set with a probable error of about 0·00001 m s⁻². However, he cautiously adds: "While it is believed that the discussion of the errors includes all those large enough to be of any practical importance it must be remembered that many apparently irreproachable gravity measurements have in the past been found to be subject to unexpected and unexplained errors, and until the source of these discrepancies has been found it would be unwise to be dogmatic about the errors in the present work."

4. Estimate of the error in compound quantities
Once the error in a measured quantity has been estimated it is a fairly simple matter to calculate the value of the consequential error in some other quantity on which it depends. If $y$ is some function of a measured quantity $x$, the error in $y$ due to some error in $x$ can be found by using some simple mathematical techniques which we shall now explain.

5. Error in a product
If a quantity $Q$ is expressed as the product $ab$, where $a$ and $b$ are measured quantities having fractional errors $f_1$ and $f_2$ respectively, we can write

$$Q = ab = a_0(1 + f_1) \times b_0(1 + f_2) \approx a_0 b_0(1 + f_1 + f_2)$$

so that the fractional error in $Q$ is $f_1 + f_2$ approximately, that is the sum of the fractional errors in the two quantities of which $Q$ is the product.

EXAMPLE
If the measured lengths of the sides of a rectangle have errors of 3% and 4%, the error in the calculated value of the area is 7% approximately, if the errors in the sides have the same sign, or 1% if they have opposite signs.

The above result can easily be extended. First we note that by putting $a = b$ it follows that the fractional error in $a^2$ is twice the fractional error in $a$. Alternatively, the fractional error in $a$ is one-half the fractional error in $a^2$, that is, the fractional error in $N^{1/2}$ is one-half the fractional error in $N$. Thus if we write

$$N = N_0(1 + f)$$

we must have

$$N^{1/2} \approx N_0^{1/2}\left(1 + \tfrac{1}{2}f\right).$$

This follows of course from the result

$$(1 + f)^{1/2} \approx 1 + \tfrac{1}{2}f$$

which can be established independently.

Again, if $Q = abc\ldots$ the fractional error in $Q$ is approximately the sum of the fractional errors in $a, b, c, \ldots$. In particular, the fractional error in $a^n$ equals approximately $n$ times the fractional error in $a$. This is true for all values of $n$, positive or negative.

6. Error in a quotient
If a quantity $Q$ is expressed as the quotient $a/b$ where again $a$ and $b$ have fractional errors $f_1$ and $f_2$ respectively, we have

$$Q = \frac{a}{b} = \frac{a_0(1 + f_1)}{b_0(1 + f_2)} = \frac{a_0}{b_0}(1 + f_1)(1 - f_2 + \cdots) \approx \frac{a_0}{b_0}(1 + f_1 - f_2).$$

Thus the fractional error in $a/b$ is approximately the difference of the fractional errors in $a$ and $b$.

Also if $Q = (abc\ldots)/(lmn\ldots)$ the fractional error in $Q$ is the sum of the fractional errors in $a, b, c, \ldots$ less the sum of the fractional errors in $l, m, n, \ldots$.
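The product, power and quotient rules of Sections 5 and 6 are first-order approximations, and it can be instructive to see how close they are. The sketch below compares each rule with the exact fractional change; the function names are illustrative, and the rectangle figures (3% and 4%) are those of the example in Section 5.

```python
def product_error(f1, f2):
    """Return (exact, first-order) fractional error in Q = a*b."""
    return (1 + f1) * (1 + f2) - 1, f1 + f2


def quotient_error(f1, f2):
    """Return (exact, first-order) fractional error in Q = a/b."""
    return (1 + f1) / (1 + f2) - 1, f1 - f2


def power_error(f, n):
    """Return (exact, first-order) fractional error in a**n."""
    return (1 + f) ** n - 1, n * f


# Rectangle with sides in error by 3% and 4%: same signs, then opposite signs.
for f1, f2 in [(0.03, 0.04), (0.03, -0.04)]:
    exact, approx = product_error(f1, f2)
    print(f"product  f1={f1:+.2f}, f2={f2:+.2f}: "
          f"exact {100 * exact:+.2f}%, rule {100 * approx:+.2f}%")

exact, approx = quotient_error(0.03, 0.04)
print(f"quotient: exact {100 * exact:+.3f}%, rule {100 * approx:+.3f}%")

exact, approx = power_error(0.02, 0.5)   # square root of a quantity 2% in error
print(f"sqrt:     exact {100 * exact:+.3f}%, rule {100 * approx:+.3f}%")
```

In each case the rule differs from the exact figure only by terms of second order in the fractional errors.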

EXAMPLE 1
If the current $i$ amperes in a circuit satisfying Ohm's law is calculated from the relation $i = E/R$ where $E$ volts is the e.m.f. in the circuit and $R$ ohms is the resistance, then the fractional error in $i$ due to fractional errors $f_E$ and $f_R$ in $E$ and $R$ respectively is $f_E - f_R$ approximately.

EXAMPLE 2
If $g$ is calculated from the simple pendulum formula $g = 4\pi^2 l/T^2$, and we write $l = l_0(1 + f_1)$, $T = T_0(1 + f_2)$ where $f_1$, $f_2$ are the fractional errors in $l$ and $T$ respectively, we have

$$g = \frac{4\pi^2 l_0}{T_0^2}\,\frac{1 + f_1}{(1 + f_2)^2} \approx \frac{4\pi^2 l_0}{T_0^2}(1 + f_1 - 2f_2)$$

so that

$$g \approx g_0(1 + f_1 - 2f_2).$$

Thus the fractional error in $g$ is $f_1 - 2f_2$. Since $f_1$ and $f_2$ may be positive or negative, it follows that the numerical value of the fractional error in $g$ may be as large as $|f_1| + 2|f_2|$ or as small as $\bigl|\,|f_1| - 2|f_2|\,\bigr|$. If, as is often the case in practice, $f_1$ and $f_2$ are not known exactly but may have any value between certain limits, the greatest value of the fractional error can be estimated. For instance, suppose $f_1$ lies between $-F_1$ and $+F_1$, whilst $f_2$ lies between $-F_2$ and $+F_2$, then $f_1 - 2f_2$ can be as large numerically as $F_1 + 2F_2$. Indeed we can write

$$|f_1 - 2f_2| \le F_1 + 2F_2$$

which gives the greatest fractional error in $g$. Instead of this greatest value of the fractional error, however, a smaller quantity, sometimes called the "most probable" value, is often quoted. It is given by the square root of the sum of the squares of the greatest values of the separate fractional errors, that is,

$$(F_1^2 + 4F_2^2)^{1/2},$$

which clearly is greater than $F_1$ or $2F_2$, but less than $F_1 + 2F_2$.

7. Use of the calculus
The calculus can be used in the estimation of errors. For suppose $x$ is a measured quantity and $y$ is a quantity calculated from the formula $y = f(x)$. If $\delta x$ is the error in $x$ the corresponding error in $y$ is $\delta y$ where

$$\lim_{\delta x \to 0} \frac{\delta y}{\delta x} = \frac{dy}{dx}.$$

Therefore

$$\frac{\delta y}{\delta x} \approx \frac{dy}{dx}$$

if $\delta x$ is small enough, that is, the error in $y$ is given approximately by

$$\delta y \approx \frac{dy}{dx}\,\delta x.$$

As an example, suppose $y$ is the area of a circle of radius $x$; then

$$y = \pi x^2 \quad \text{and} \quad \frac{dy}{dx} = 2\pi x.$$

Hence if $\delta x$ is small

$$\delta y \approx 2\pi x\,\delta x.$$

Thus the error in the calculated value of the area due to a small error $\delta x$ in the measured value of the radius is $2\pi x\,\delta x$, and this may be positive or negative depending on the sign of $\delta x$. It is, of course, the area of the annulus having radii $x$ and $x + \delta x$. The fractional error in the area is

$$\frac{\delta y}{y} = \frac{2\pi x\,\delta x}{\pi x^2} = \frac{2\,\delta x}{x}$$

that is, approximately twice the fractional error in the radius, in accordance with the result proved in Section 5.
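The two estimates for the pendulum formula, the greatest value $F_1 + 2F_2$ and the "most probable" value $(F_1^2 + 4F_2^2)^{1/2}$, can be compared directly. In the sketch below the limits `F1` and `F2` and the nominal values `l0` and `T0` are hypothetical, chosen only for illustration; the brute-force search over the extreme sign combinations confirms that the linear bound is attained and that the exact formula agrees with it to first order.

```python
import math
from itertools import product

# Hypothetical limits on the fractional errors in l and T.
F1, F2 = 0.001, 0.002

greatest = F1 + 2 * F2
most_probable = math.sqrt(F1**2 + 4 * F2**2)

# The linear expression f1 - 2*f2 is largest at a corner of the error box.
corners = list(product((-F1, F1), (-F2, F2)))
worst_linear = max(abs(f1 - 2 * f2) for f1, f2 in corners)


def g(l, T):
    """Simple pendulum formula g = 4*pi^2*l/T^2."""
    return 4 * math.pi**2 * l / T**2


# Exact fractional change in g at the same corners (hypothetical l0, T0).
l0, T0 = 1.0, 2.0
worst_exact = max(abs(g(l0 * (1 + f1), T0 * (1 + f2)) / g(l0, T0) - 1)
                  for f1, f2 in corners)

print(f"greatest      F1 + 2*F2          = {greatest:.6f}")
print(f"most probable sqrt(F1^2 + 4F2^2) = {most_probable:.6f}")
print(f"worst case, linear rule          = {worst_linear:.6f}")
print(f"worst case, exact formula        = {worst_exact:.6f}")
```

The "most probable" value always falls between the larger of $F_1$, $2F_2$ and the sum $F_1 + 2F_2$, as stated in the text.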

Of course, most simple examples including that above hardly need the use of the calculus, a little algebra is all that is necessary as we have shown earlier, but the calculus does facilitate the solution of more complicated problems.

Also, if a quantity $Q$ is a function of several measured quantities $x, y, z, \ldots$, the error in $Q$ due to errors $\delta x, \delta y, \delta z, \ldots$ in $x, y, z, \ldots$ respectively is given by

$$\delta Q \approx \frac{\partial Q}{\partial x}\,\delta x + \frac{\partial Q}{\partial y}\,\delta y + \frac{\partial Q}{\partial z}\,\delta z + \cdots$$

The first term $\dfrac{\partial Q}{\partial x}\,\delta x$ is the error in $Q$ due to an error $\delta x$ in $x$ only (that is, corresponding to $\delta y, \delta z, \ldots$ all being zero), and similarly the second term $\dfrac{\partial Q}{\partial y}\,\delta y$ is the error in $Q$ due to an error $\delta y$ in $y$ only. This result is often referred to as the principle of superposition of errors.

Further, if we suppose that $\delta x, \delta y, \delta z, \ldots$ can have any value between $-e_1$ and $+e_1$, $-e_2$ and $+e_2$, $-e_3$ and $+e_3$, $\ldots$ respectively, then the "most probable" value of $\delta Q$ is given by

$$(\delta Q)^2 = \left(\frac{\partial Q}{\partial x}\,e_1\right)^2 + \left(\frac{\partial Q}{\partial y}\,e_2\right)^2 + \left(\frac{\partial Q}{\partial z}\,e_3\right)^2 + \cdots$$

that is, $\delta Q$ is the square root of the sum of the squares of the greatest errors due to an error in each variable separately.

Taking as an example the simple pendulum formula used earlier we have

$$g = 4\pi^2 l/T^2$$

therefore

$$\delta g \approx \frac{\partial g}{\partial l}\,\delta l + \frac{\partial g}{\partial T}\,\delta T = \frac{4\pi^2}{T^2}\,\delta l - \frac{8\pi^2 l}{T^3}\,\delta T$$

and

$$\frac{\delta g}{g} \approx \frac{\delta l}{l} - \frac{2\,\delta T}{T}$$

that is $f_1 - 2f_2$, as we found in Section 6. This result is obtained more simply by taking logarithms first, so that

$$\log g = \log 4\pi^2 + \log l - 2\log T$$

and hence

$$\frac{\delta g}{g} \approx \frac{\delta l}{l} - \frac{2\,\delta T}{T}.$$

EXAMPLE 1
A quantity $y$ is expressed in terms of a measured quantity $x$ by the relation $y = 4x - (2/x)$. What is the percentage error in $y$ corresponding to an error of 1% in $x$?

We have

$$\frac{dy}{dx} = 4 + \frac{2}{x^2}$$

so that

$$\delta y \approx \left[4 + \frac{2}{x^2}\right]\delta x$$

and the percentage error in $y$ is

$$\frac{\delta y}{y} \times 100 \approx \frac{100\,[4 + (2/x^2)]\,\delta x}{4x - (2/x)} = \frac{100\,(4x^2 + 2)\,\delta x}{x\,(4x^2 - 2)}.$$

Thus if $\delta x/x = 1/100$, the percentage error in $y$ is $(4x^2 + 2)/(4x^2 - 2)$. It might be noted that this percentage error varies with $x$ and is approximately 1% when $x$ is large. Further, it is obviously large when $4x^2 - 2$ is small, that is, when $x$ lies near to $\pm 2^{-1/2}$; this is not surprising since $y = 0$ when $x = \pm 2^{-1/2}$. Actually using $\delta y \approx [4 + (2/x^2)]\,\delta x$, we find that when $x = \pm 2^{-1/2}$, $\delta y \approx 8\,\delta x$.

EXAMPLE 2
A quantity $y$ is calculated from the formula $y = Cx/(1 + x^2)$. If the error in measuring $x$ is $e$ units, find the corresponding error in $y$ and show that for all values of $x$ it does not exceed $Ce$ units.

Now

$$\frac{dy}{dx} = C\,\frac{(1 + x^2) - x(2x)}{(1 + x^2)^2} = C\,\frac{1 - x^2}{(1 + x^2)^2}.$$
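The principle of superposition of errors lends itself to a short numerical sketch. The helper `propagate` below is a hypothetical function, not something taken from the text: it estimates each partial derivative by a central difference and returns both the greatest error (the sum of the separate contributions) and the "most probable" error (the root of the sum of their squares). The pendulum length, period and error limits are assumed values for illustration.

```python
import math


def propagate(func, values, limits, h=1e-6):
    """Return (greatest, most_probable) error in func(*values).

    values: measured quantities; limits: greatest numerical error in each.
    Partial derivatives are estimated by central differences.
    """
    contributions = []
    for i, (v, limit) in enumerate(zip(values, limits)):
        step = h * max(abs(v), 1.0)
        up = list(values)
        down = list(values)
        up[i] = v + step
        down[i] = v - step
        partial = (func(*up) - func(*down)) / (2 * step)
        contributions.append(abs(partial) * limit)
    greatest = sum(contributions)
    most_probable = math.sqrt(sum(c * c for c in contributions))
    return greatest, most_probable


def g_pendulum(l, T):
    return 4 * math.pi**2 * l / T**2


# Hypothetical measurements: l = 1.000 +/- 0.001 m, T = 2.007 +/- 0.002 s.
l, T, dl, dT = 1.000, 2.007, 0.001, 0.002
g0 = g_pendulum(l, T)
greatest, probable = propagate(g_pendulum, [l, T], [dl, dT])
print(f"g = {g0:.3f} m/s^2")
print(f"greatest fractional error      = {greatest / g0:.5f}")
print(f"'most probable' fractional err = {probable / g0:.5f}")

# Example 1 of this section: y = 4x - 2/x with a 1% error in x at x = 2.
x = 2.0
dy, _ = propagate(lambda x: 4 * x - 2 / x, [x], [0.01 * x])
percent = 100 * dy / (4 * x - 2 / x)
print(f"percentage error in y at x = 2: {percent:.4f}")
```

For the pendulum the greatest fractional error reproduces $\delta l/l + 2\,\delta T/T$, and for Example 1 the result agrees with $(4x^2 + 2)/(4x^2 - 2)$ evaluated at $x = 2$.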

If the error $\delta x$ in $x$ is $e$ units, the error in $y$ is

$$\delta y \approx C\,\frac{1 - x^2}{(1 + x^2)^2}\,e.$$

This can be written

$$\delta y \approx CeE$$

where

$$E = \frac{1 - x^2}{(1 + x^2)^2}.$$

How does $E$ vary with $x$? When $x = 0$, $E = 1$; when $x = 1$, $E = 0$, and when $x > 1$, $E < 0$. Also $E \to 0$ as $x \to \infty$. Further, $E$ does not change in value when $x$ is replaced by $-x$ and hence its graph is symmetrical about the axis of $E$ and must have the form shown in Fig. 2. $E$ has a maximum value when $x = 0$ and minimum values when $x = \pm 3^{1/2}$. Thus $E$ lies between 1 (when $x = 0$) and $-\tfrac{1}{8}$ (when $x = \pm 3^{1/2}$). Hence the error in $y$ lies between $Ce$ and $-Ce/8$.

Fig. 2. Graph of $E = (1 - x^2)/(1 + x^2)^2$

EXAMPLE 3
The viscosity of water, $\eta$, is calculated using Poiseuille's formula giving the quantity of water $Q$ flowing through a cylindrical tube of length $l$ and radius $a$ in time $t$ under a pressure difference $p$. Find the error in $\eta$ due to errors in the measured quantities $Q$, $l$, $a$ and $p$.

We have

$$\eta = \frac{\pi p a^4 t}{8 l Q}$$

therefore

$$\log \eta = \log(\pi/8) + \log p + 4\log a + \log t - \log l - \log Q$$

and

$$\frac{\delta\eta}{\eta} = \frac{\delta p}{p} + \frac{4\,\delta a}{a} + \frac{\delta t}{t} - \frac{\delta l}{l} - \frac{\delta Q}{Q}.$$

Thus the fractional error in $\eta$ equals a combination of the fractional errors in $p$, $a$, $t$, $l$ and $Q$. The term $4\,\delta a/a$ is usually the most important since $a$ is very small and hence for an accurate determination of $\eta$ special attention must be paid to the accuracy of measurement of the radius $a$. Indeed to ensure an accuracy of 1% in $\eta$, the radius must be measured with an accuracy of at least 1 in 400.

8. Error in a sum or difference
If a quantity $Q$ is expressed as the sum of two quantities $a$ and $b$, having errors $e_1$ and $e_2$ we have

$$Q = a + b = a_0 + b_0 + e_1 + e_2.$$

So writing the error in $Q$ as $e$ and the fractional error in $Q$ as $f$ we have

$$e = e_1 + e_2$$

and

$$f = \frac{e_1 + e_2}{a_0 + b_0} = \frac{a_0 f_1 + b_0 f_2}{a_0 + b_0}$$

where $f_1$ and $f_2$ are the fractional errors in $a$ and $b$ respectively.

If $e_1$ and $e_2$ are known, $e$ and $f$ can be calculated, but as we have noted earlier all that is usually known is that $e_1$ may have any value between $-E_1$ and $+E_1$ say, whilst $e_2$ may have any value between $-E_2$ and $+E_2$.
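Two statements in the examples above are easy to confirm numerically: that $E = (1 - x^2)/(1 + x^2)^2$ never leaves the range $-\tfrac{1}{8}$ to 1, and that a radius known to 1 in 400 contributes 1% to the fractional error in $\eta$ through the term $4\,\delta a/a$. The sketch below is only a check under stated assumptions; the function names and the scanning grid are illustrative choices.

```python
import math


def E(x):
    """Error factor for y = C*x/(1 + x^2)."""
    return (1 - x**2) / (1 + x**2) ** 2


def eta_fractional_error(fp, fa, ft, fl, fQ):
    """First-order fractional error in eta = pi*p*a^4*t/(8*l*Q)."""
    return fp + 4 * fa + ft - fl - fQ


# Scan E over -10 <= x <= 10 in steps of 0.001.
xs = [i / 1000 for i in range(-10000, 10001)]
values = [E(x) for x in xs]
print(f"largest E on the grid  = {max(values):.4f}")
print(f"smallest E on the grid = {min(values):.4f}")
print(f"E at x = sqrt(3)       = {E(math.sqrt(3)):.4f}")

# Radius known to 1 in 400, all other quantities taken as exact.
radius_only = eta_fractional_error(0, 1 / 400, 0, 0, 0)
print(f"fractional error in eta from the radius alone = {radius_only:.4f}")
```

Because the radius enters as $a^4$, its fractional error is multiplied by four, which is why it usually dominates the error in $\eta$.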

What then are the limits between which $e$ and $f$ may lie? Clearly $e$ can have any value between $-(E_1 + E_2)$ and $+(E_1 + E_2)$, or $|e|$ lies between 0 and $E_1 + E_2$. It is usual to take the "most probable" value of $|e|$ as $\sqrt{E_1^2 + E_2^2}$ (see Section 6), which is less than $E_1 + E_2$ but bigger than either $E_1$ or $E_2$.

On the other hand the fractional error $f$ depends on the values of $a_0$ and $b_0$ as well as on $f_1$ and $f_2$, and varies between wide limits. We can write

$$f = f_1 + \frac{b_0}{a_0 + b_0}(f_2 - f_1) = f_2 + \frac{a_0}{a_0 + b_0}(f_1 - f_2)$$

It follows that if $a_0$ is numerically much bigger than $b_0$ then $f$ equals $f_1$ approximately, or if $a_0$ is numerically much smaller than $b_0$ then $f$ equals $f_2$ approximately. Again if $a_0$ and $b_0$ are approximately equal and of the same sign $f$ is approximately $\tfrac{1}{2}(f_1 + f_2)$, whereas if $a_0$ and $b_0$ are approximately equal but of opposite sign (so that $a_0 + b_0$ is small) $f$ may be large; indeed in general it will be large unless it should happen that $f_1 - f_2$ is very small. Also, the "most probable" value of $|f|$ may be taken to be

$$\frac{\sqrt{E_1^2 + E_2^2}}{|a_0 + b_0|}$$

EXAMPLE 1

If two lengths $l_1$ and $l_2$ are measured as 10.0 and 9.0 with possible errors of 0.1 in each case find (i) the greatest error and (ii) the greatest fractional error in the values of $l_1 + l_2$ and $l_1 - l_2$.

Now

$$l_1 + l_2 = (10.0 \pm 0.1) + (9.0 \pm 0.1) = 19.0 \pm 0.2$$

The greatest error in $l_1 + l_2$ is 0.2 and the greatest fractional error is $0.2/19.0 = 0.01$. Also,

$$l_1 - l_2 = 1.0 \pm 0.2$$

so that the greatest error in $l_1 - l_2$ is 0.2 and the greatest fractional error is as high as $0.2/1.0 = 0.2$. We note that the "most probable" fractional error in $l_1 - l_2$ is $\sqrt{(0.1)^2 + (0.1)^2}/1.0 = 0.14$.

EXAMPLE 2

The viscosity $\eta$ of a liquid is measured by a rotation viscometer. The cylinders are of radii $a$ and $b$, and a torque $G$ is applied to the rotating cylinder so that

$$\eta = \frac{G}{4\pi\Omega}\left(\frac{1}{a^2} - \frac{1}{b^2}\right)$$

where $\Omega$ is the angular velocity of rotation. Calculate the fractional error in $\eta$ given that $a = 0.04$ m, $b = 0.05$ m, that the greatest error in measuring both $a$ and $b$ is 0.0001 m and that the error in $G/\Omega$ may be neglected.

Now

$$\delta\eta = \frac{G}{4\pi\Omega}\left(-\frac{2\,\delta a}{a^3} + \frac{2\,\delta b}{b^3}\right)$$

so that writing $\delta a/a = f_1$ and $\delta b/b = f_2$ we have

$$\frac{\delta\eta}{\eta} = \frac{2(a^2 f_2 - b^2 f_1)}{b^2 - a^2} = \frac{2(16 f_2 - 25 f_1)}{9}$$

where the greatest values of $f_1$ and $f_2$ are $0.01/4$ and $0.01/5$ respectively. Hence $|\delta\eta/\eta|$ may be as large as

$$\frac{2}{9}\left(16 \times \frac{0.01}{5} + 25 \times \frac{0.01}{4}\right)$$

that is, 0.021 or about 2%. The "most probable" value of $\delta\eta/\eta$ may be taken as

$$\frac{2\sqrt{(16F_2)^2 + (25F_1)^2}}{9} = \frac{2}{9}(0.07) = 0.016$$

where $F_1$ and $F_2$ denote the greatest values of $f_1$ and $f_2$. If there were also errors in $G$ and $\Omega$ the fractional error in $G/\Omega$ would have to be added to the value of $|\delta\eta/\eta|$ calculated above.
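The arithmetic of Example 2 can be checked with a short sketch. The function name `fractional_error` and the variable names are chosen here for illustration only; the data are those of the example, and the "most probable" value is taken, as in the text, to be the two contributions combined in quadrature.

```python
import math

# Data of Example 2: radii in metres and the greatest error in each radius.
a, b = 0.04, 0.05
delta = 0.0001

f1 = delta / a  # greatest fractional error in a, i.e. 0.01/4
f2 = delta / b  # greatest fractional error in b, i.e. 0.01/5


def fractional_error(fa, fb):
    """Fractional error in eta, 2(a^2 fb - b^2 fa)/(b^2 - a^2), for signed fa, fb."""
    return 2 * (a**2 * fb - b**2 * fa) / (b**2 - a**2)


# Greatest error: the two terms reinforce when the errors have opposite signs.
greatest = abs(fractional_error(-f1, f2))

# "Most probable" error: the two terms combined in quadrature.
most_probable = 2 * math.hypot(a**2 * f2, b**2 * f1) / (b**2 - a**2)

print(f"greatest: {greatest:.3f}, most probable: {most_probable:.3f}")
```

Changing `delta`, `a` or `b` shows how quickly the error grows as the two radii approach one another, since the difference $b^2 - a^2$ appears in the denominator.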

EXAMPLE 3

In Heyl's method the gravitational constant $G$ is calculated from the formula

$$G = \frac{4\pi^2 I}{A_1 - A_2} \cdot \frac{T_2^2 - T_1^2}{T_1^2 T_2^2}$$

where $T_1$ and $T_2$ are times of oscillation, $I$ is the moment of inertia of the system about the axis of suspension and $A_1$, $A_2$ are constants. Estimate the fractional error in $G$ due to errors in $T_1$ and $T_2$ of 0.1 s when $T_1 = 1750$ s and $T_2 = 2000$ s.

Now since

$$G = \frac{4\pi^2 I}{A_1 - A_2} \times \frac{(T_2 - T_1)(T_2 + T_1)}{T_1^2 T_2^2}$$

using the method of Section 7 we get

$$\frac{\delta G}{G} = \frac{\delta(T_2 - T_1)}{T_2 - T_1} + \frac{\delta(T_2 + T_1)}{T_2 + T_1} - \frac{2\,\delta T_1}{T_1} - \frac{2\,\delta T_2}{T_2}$$

The first term on the right-hand side may be as large as $\tfrac{2}{10}\left(\tfrac{1}{250}\right)$ and is the most important term. Putting $\delta T_2 = -\delta T_1 = 1/10$ we get the largest value of $\delta G/G$ is

$$\frac{2}{10}\left(\frac{1}{250} + \frac{1}{1750} - \frac{1}{2000}\right) \approx \frac{1}{1200}$$

The values of $T_1$ and $T_2$ quoted above are those given by Heyl(2) but the values of $\delta T_1$ and $\delta T_2$ are hypothetical. Heyl did not give an estimate of the error in the value of $G$, but it is interesting to note that the value of $G$ he adopted as a result of experiments with gold, platinum and glass spheres was $6.670 \times 10^{-8}$ cm³ g⁻¹ s⁻² "with a precision, as measured by the average departure from the mean, of 0.005."

EXERCISES 1

1. If $y = x^2/(1 + x^2)$ find $\delta y/y$ when (a) $x = 3$, $\delta x = 0.1$ and (b) $x = 2$, $\delta x = 0.05$.

2. If $y = \sin(2\omega t + \alpha)$ find the fractional error in $y$ due to an error of 0.1% in $t$ when (i) $t = \pi/2\omega$, (ii) $t = \pi/\omega$. Find also the values of $t$ for which the fractional error in $y$ is least.

3. Find the fractional error in $e^x$ corresponding to an error $\delta x$ in $x$. If $x = 0.012$ is correct to two significant figures show that $e^x$ may be calculated for this value of $x$ correct to four significant figures.

4. The mass $m$ grammes of an electron moving with velocity $v$ m s⁻¹ is $m_0/\sqrt{1 - (v^2/c^2)}$ where $c$ m s⁻¹ is the velocity of light. Show that the fractional error in $m$ is approximately $v^2/c^2$ times the fractional error in $v$, if $v/c$ is small compared with unity.

5. If $i = k \tan\theta$ find the value of $\theta$ for which (i) the error in $i$ is least, and (ii) the fractional error in $i$ is least, for a given error $e$ in measuring $\theta$.

6. The diameter of a capillary tube is found by weighing and measuring the length of a thread of mercury inserted into the tube. Estimate the error in the calculated diameter of the tube in terms of the errors in the length, mass and density of the mercury thread.

7. Given $T^2 = h + (100/h)$ find the value of $h$ for which the error in $T$ is least if the error in $h$ is a constant $e$. Find also the value of $h$ for which the fractional error in $T$ is least.

8. Using Kater's pendulum $g$ is given by

$$g = 4\pi^2(h_1 + h_2)/T^2$$

The length of $h_1 + h_2$ is measured as $1.0423 \pm 0.00005$ m and the time of oscillation $T$ as $2.048 \pm 0.0005$ s. Calculate $g$ and the greatest fractional error. If $h_1 + h_2$ is measured as $1.042 \pm 0.0005$ m, how accurately must $T$ be measured in order that the error in $g$ may be less than 1 in 1000?

9. The surface tension $\gamma$ of a liquid of density $\rho$ is found by inserting the liquid into a U-tube of which the two limbs have radii $r_1$ and $r_2$ respectively. The difference of height $h$ in the two limbs is measured and $\gamma$ is calculated from the formula

$$\gamma\left(\frac{1}{r_1} - \frac{1}{r_2}\right) = \tfrac{1}{2} g \rho h$$

Estimate the fractional error in $\gamma$ if $h = 1.06$ cm, $r_1 = 0.07$ cm, $r_2 = 0.14$ cm, and the error in each of these measurements is not greater than 0.005 cm.

10. The deflexion $d$ of a beam under certain conditions is given by $d = 4Wl^3/3\pi E a^4$. Find (i) the maximum fractional error, (ii) the "most probable" fractional error in the value of Young's modulus $E$ calculated from this formula if the error in $d$ is ±0.1%, the error in $l$ is ±0.05% and the error in $a$ is ±0.1%.

11. A coil of $n$ turns of radius $r$ carries a current $I$; the magnetic field at a point on its axis at a distance $x$ from the centre is $H = 2\pi n r^2 I (r^2 + x^2)^{-3/2}$. If the error in measuring $x$ is $e$, find the corresponding error in the value of the field. If $e$ is a constant find for what value of $x$ the error in $H$ is greatest.

12. A large resistance $R$ is measured by discharging a condenser of capacitance $C$ charged to potential $V_0$; the time $t$ taken for the potential to fall to $V$ is noted. $R$ is given by $t/C \log(V_0/V)$. Find the fractional error in $R$ due to errors in $t$, $C$, $V_0$ and $V$. If $V_0$ and $V$ are measured by a ballistic galvanometer, then $V_0/V = d_0/d$ where $d_0$ and $d$ are the corresponding deflexions of the instrument. Assuming the errors in $d_0$ and $d$ are equal in magnitude, show that the greatest value of the corresponding fractional error in $R$ is

$$f_0\left(1 + \frac{V_0}{V}\right) \Big/ \log\frac{V_0}{V}$$

where $f_0$ is the fractional error in $V_0$. Find the value of $V_0/V$ for which this is least.

CHAPTER 2

SOME STATISTICAL IDEAS

"The experimental scientist does not regard statistics as an excuse for doing bad experiments."
L. HOGBEN

9. Frequency distributions

Numerical data, including scientific measurements as well as industrial and social statistics, are often represented graphically to aid their appreciation. The first step in dealing with such data, if they are sufficiently numerous, is to arrange them in some convenient order; this is often done by grouping them into classes according to their magnitude or according to suitable intervals of a variable on which they depend. For instance, the percentage marks obtained in an examination by a number of students could be grouped by counting the number of students who had marks between 0 and 9, 10 and 19, ... 90 and 99, thus dividing them into 10 classes. The data could then be tabulated as shown in Table 1, in which the marks of a sample of 120 students have been used.

The number of data in each class is usually called the frequency for that class. Table 1 shows what is called the frequency distribution. The pairs of numbers written in the columns headed "class," for example, 0 and 9, 10 and 19 and so on, are usually called the lower and upper class limits. The width of any class is the difference between the first number specifying that class and the first number specifying the next, that is, 10 for each of the classes shown in Table 1. For some groupings, however, the widths of the classes may be unequal.

Table 1

Class     Frequency     Class     Frequency
0-9           2         50-59        32
10-19         5         60-69        25
20-29         6         70-79        10
30-39        14         80-89         2
40-49        22         90-99         2
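The grouping described above can be sketched in a few lines. The function `frequency_distribution` and the list `marks` are illustrative assumptions: the raw marks are invented for the purpose and are not the 120 marks summarised in Table 1.

```python
from collections import Counter


def frequency_distribution(marks, width=10, top=100):
    """Count whole-number marks into classes 0-9, 10-19, ... of the given width."""
    counts = Counter((m // width) * width for m in marks)
    # Keys are (lower class limit, upper class limit); values are frequencies.
    return {(lo, lo + width - 1): counts.get(lo, 0) for lo in range(0, top, width)}


# Hypothetical raw marks, invented for illustration only.
marks = [7, 15, 23, 38, 41, 44, 52, 55, 58, 59, 63, 67, 71, 86, 99]

table = frequency_distribution(marks)
for (lo, hi), freq in table.items():
    print(f"{lo:2d}-{hi:2d}  {freq}")
```

Passing `width=5` gives classes of half the width, which is a convenient way of seeing how the choice of class width alters the appearance of the distribution.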

The classification shown in Table 1 obviously helps us to appreciate the distribution of marks amongst the students; we can see at a glance, for instance, how many students have fewer than 40 marks and how many have 70 or more. But a graphical representation can make it possibly even clearer. The data are plotted in Fig. 3, where the marks are represented along the horizontal axis and the frequencies along the vertical axis. The points obtained by plotting the frequency against the mid-value of the corresponding class, namely, 4.5, 14.5, 24.5, ... 94.5 are joined by straight lines and the resulting figure is known as a frequency polygon.

Fig. 3. Frequency polygon of examination marks.

A different method is used in Fig. 4 where a series of rectangles are constructed of width equal to the class width and of area equal to the frequency of the corresponding class. The figure obtained is called a histogram; the rectangles in Fig. 4 have areas equal to 2, 5, 6, 14, 22, 32, 25, 10, 2, 2 units respectively, and the total area of the histogram in this case is 120 units equal to the total number of students.

Fig. 4. Histogram of examination marks.

10. The mean

Since the area of each rectangle in a histogram represents the frequency in the corresponding class, the heights of the rectangles are proportional to the frequencies when the classes have equal widths. In this case the mean height of the rectangles is proportional to the mean frequency. The mean frequency is usually not of particular significance, but what is often important is the mean of the data. This is defined as follows: if $f_1, f_2, \dots f_n$ are the frequencies in the various classes of which $x_1, x_2, \dots x_n$ are the mid-values of the variable, then the mean value of the variable is given by

$$\bar{x} = \frac{f_1x_1 + f_2x_2 + \dots + f_nx_n}{f_1 + f_2 + \dots + f_n} \qquad (1)$$

This is the weighted mean of $x_1, x_2, \dots x_n$, the weights being the frequencies in the corresponding classes. It can be written as

$$\sum_{s=1}^{n} f_sx_s \Big/ \sum_{s=1}^{n} f_s \quad \text{or} \quad [fx]/[f]$$

and is often denoted by $\bar{x}$.

To evaluate $\bar{x}$ using the expression (1) directly can sometimes be laborious, but the arithmetic can be minimized by using the following simple device. Let us write

$$x_s = x_s' + m$$

where $m$ is some constant. Then

$$f_1x_1 + f_2x_2 + \dots + f_nx_n = f_1(x_1' + m) + f_2(x_2' + m) + \dots + f_n(x_n' + m)$$
$$= f_1x_1' + f_2x_2' + \dots + f_nx_n' + m(f_1 + f_2 + \dots + f_n)$$

therefore

$$\frac{f_1x_1 + f_2x_2 + \dots + f_nx_n}{f_1 + f_2 + \dots + f_n} = \frac{f_1x_1' + f_2x_2' + \dots + f_nx_n'}{f_1 + f_2 + \dots + f_n} + m$$

or

$$\bar{x} = \bar{x}' + m \qquad (2)$$

where $\bar{x}'$ is the mean of the quantities $x_s'$. By choosing $m$ conveniently we can make the evaluation of $\bar{x}'$ simpler than the evaluation of $\bar{x}$. It is clear that $\bar{x}'$ will be small if $m$ is chosen near to $\bar{x}$; $m$ is often called the working mean or the assumed mean.

As a simple example columns one and two of Table 2 give values of $x_s$ and the corresponding values of $f_s$. In column three are given the values of $f_sx_s$, so that on addition the mean value of $x$ is given by

$$\bar{x} = \frac{172}{44} = 3\tfrac{10}{11} \approx 3.9$$

found directly. However, it is simpler to proceed as follows: an examination of the data suggests that the mean is somewhere between 3.5 and 4.5, and so taking $m = 3.5$ the values of $x_s'$ are tabulated in column four of the table and the values of $f_sx_s'$, calculated as shown in the last column.

Table 2

x_s     f_s     f_s x_s     x_s' = x_s - 3.5     f_s x_s'
0.5       1        0.5            -3                 -3
1.5       5        7.5            -2                -10
2.5       7       17.5            -1                 -7
3.5       9       31.5             0                  0
4.5      10       45.0             1                 10
5.5       8       44.0             2                 16
6.5       4       26.0             3                 12
sum      44      172.0                               18

It follows that $\bar{x}' = 18/44$ and hence

$$\bar{x} = 3.5 + \frac{18}{44} = 3\tfrac{10}{11}$$

It is clear that in this way the amount of arithmetic, and consequently the likelihood of error, is reduced. In many practical cases the economy is considerable.

11. Relative frequency

If the classes specified by $x_1, x_2, \dots x_n$ occur with frequencies $f_1, f_2, \dots f_n$, the relative frequency with which $x_1$ occurs is $f_1/(f_1 + f_2 + \dots + f_n)$, or generally the relative frequency with which $x_s$ occurs is $f_s/\sum f_s$. If we denote the relative frequency of $x_s$ by $r_s$, the mean of the observations can be written as

$$\bar{x} = \sum_{s=1}^{n} r_sx_s$$

We note that $r_s$ is represented in a histogram by the area of the rectangle corresponding to the class of which $x_s$ is the mid-value divided by the total area of the histogram. If the scales are so chosen that the total area is unity then each rectangle represents the corresponding relative frequency.

12. The median

If a set of observations are arranged in ascending or descending order of magnitude the observation in the middle of the set is called the median. More precisely, if the number of observations is odd, say $2n + 1$, the median is the $(n + 1)$th value; if the number of observations is even, say $2n$, the middle values of the set are the $n$th and $(n + 1)$th, the arithmetic mean of which is taken as the median.

For example, a set of nine numbers which when arranged in ascending order of magnitude are 7, 9, 10, 11, 12, 13, 15, 18, 20 has the median 12. If, however, the last value 20 had not been present the middle terms of the set would have been 11 and 12 and the median would have been taken as 11.5.
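The working-mean device of equation (2) and the two median cases above lend themselves to a quick numerical check. The sketch below uses the data of Table 2 and the nine numbers of the median example; the function names are illustrative assumptions, and the median is found with the standard-library function `statistics.median`.

```python
from statistics import median

# Table 2: mid-values x_s and frequencies f_s.
xs = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
fs = [1, 5, 7, 9, 10, 8, 4]


def mean_direct(xs, fs):
    """Weighted mean from expression (1)."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


def mean_working(xs, fs, m):
    """Equation (2): the working mean m plus the mean deviation-sum from m."""
    return m + sum(f * (x - m) for x, f in zip(xs, fs)) / sum(fs)


direct = mean_direct(xs, fs)       # 172/44
via_m = mean_working(xs, fs, 3.5)  # 3.5 + 18/44

# Median for an odd and for an even number of observations.
values = [7, 9, 10, 11, 12, 13, 15, 18, 20]
median_odd = median(values)        # the middle (5th) value
median_even = median(values[:-1])  # mean of the two middle values

print(round(direct, 3), round(via_m, 3), median_odd, median_even)
```

Any value of `m` gives the same mean; choosing it near the mean merely keeps the products $f_sx_s'$ small, which mattered for hand calculation.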

Sometimes, however, the data are grouped into classes as for example in Table 1. In this case the total number of students is 120; if the marks were set out in ascending order of magnitude the middle values would be the 60th and the 61st. To find the median we therefore find the marks of the fictitious "individual" number 60.5 in the set. To do this we note that 49 students have marks under 50 and 49 + 32 = 81 students have marks under 60. Student 60.5 is therefore in the class 50-59, and the median is taken to be

$$50 + \frac{60.5 - 49}{32} \times 10$$

that is, 50 + 3.6 = 53.6.

13. Frequency curves

It is clear that in general the shape of a histogram depends on the widths of the classes chosen in grouping the data. For instance when the data used in Table 1 are grouped in classes of half the width the results are as shown in Table 3. The corresponding histogram is shown in Fig. 5 which should be compared with the original histogram shown in Fig. 4. The total area equals in each case the total frequency which, of course, is unchanged by any change of class width, so that although they have different shapes they have equal areas. The area of ABCDEF in Fig. 5 is (15 + 17) units, the same as the area of the corresponding rectangle ABCD in Fig. 4. This is true for all corresponding portions of the two histograms.

Table 3

Class limits   Frequency   Pair total     Class limits   Frequency   Pair total
0-4                …                      50-54             15
5-9                …            2         55-59             17           32
10-14              …                      60-64             14
15-19              …            5         65-69             11           25
20-24              …                      70-74              …
25-29              …            6         75-79              …           10
30-34              …                      80-84              …
35-39              …           14         85-89              …            2
40-44             12                      90-94              …
45-49             10           22         95-99              …            2

Fig. 5. Histogram of examination marks.

In practice the width of the classes is dictated by the nature of the data, in particular by their number as well as their accuracy. For example, in dealing with the marks shown in Table 1, the smallest possible class width is 1 mark since fractional marks are not given, but in fact a class width of less than 5 marks is probably meaningless as few examiners would claim to be able to mark within 5%. On the other hand the maximum class width is 100, in which case the histogram would consist of a single rectangle from which no useful information could be derived. However, classes either too wide or too narrow do not reveal the general trend of the data.

Again, if the heights of a group of men were measured to the nearest 0.5 cm, all the men of recorded height 180.5 cm could be put in the class 180.5 and the class-width would be 0.5 cm. This class would include all the men with heights between 180.25 and 180.75 cm, which would be the class-boundaries. On the other hand, using a class-width of 2 cm the class 162-163.5 might include all the men with recorded heights of 162, 162.5, 163, 163.5 cm; strictly in that case it would include all those with heights between 161.75 cm and 163.75 cm, which are known as the class-boundaries, and the mid-value of the class would be 162.75 cm. On the other hand, in the class 162-164 there might be included the men with recorded heights of 162.5, 163, 163.5 cm and half of those with recorded heights of 162 and 164 cm;

the mid-value of the class would then be 163 cm. It is important that there should be no doubt where the boundaries of the classes are and that there should be no overlap or gaps between successive classes.

It will be noted that the mean of any number of observations depends in general on the width of the classes into which the observations are grouped, since the mean is defined (see Section 10) in terms of the mid-values of the classes. For example, it is found that the mean of the marks given in Table 1 is 51.25. However, if these marks are grouped as shown in Table 3 the mean is found to be 51.17. Again, using only one class covering the whole range, namely, 0-99, the mean is 49.5. However, provided the class width is not too large the effect of grouping is small and is usually neglected.

With a very large number of data grouped into classes of small width it is clear that the outline of the histogram approximates to a continuous curve, as is illustrated in Fig. 6. Such a curve is known as a frequency curve.

Fig. 6. Frequency curve (frequency plotted against the variable).

The area under a frequency curve or under any part of it has special significance. The scales of the frequency curve are often chosen so that the total area under the curve is unity. This total area being finite does not give the total frequency which is infinite, but the area under any part of the curve gives the relative frequency for the corresponding range of the variable. The value of the variable for which the ordinate divides the area under the curve into two equal parts is therefore the median. On the other hand the mode corresponds to the maximum value of the frequency.

We shall take some examples of frequency curves later. We merely note here that any finite number of observations might be regarded as a sample taken from an infinite number distributed in the manner indicated by the frequency curve. The histogram of the sample is an approximation to the frequency curve, the degree of approximation depending in general on the number of observations and on the class width. The histogram tends more and more closely to the frequency curve as the number of observations increases and the class width decreases, provided the unit of area is so chosen that the total area under the histogram is always the same as that under the frequency curve, which as stated above is usually taken as unity.

If the equation of the frequency curve is known, that is, $f$ is expressed as some function of the variable $x$, the mean value of the variable can usually be found by expressing it in terms of integrals, that is,

$$\bar{x} = \int_a^b f x \, dx \Big/ \int_a^b f \, dx$$

where $a$ and $b$ define the range of $x$. For example, consider the distribution such that the frequency $f$ is given by

$$f = \frac{2}{\pi(x^2 + 1)}$$

from $x = -1$ to $x = +1$, the factor $2/\pi$ being chosen so that the area under the frequency curve is unity, that is,

$$\int_{-1}^{1} f \, dx = \frac{2}{\pi}\int_{-1}^{1} \frac{dx}{x^2 + 1} = \frac{2}{\pi}\Big[\tan^{-1} x\Big]_{-1}^{1} = 1$$

The mean value of the variable

$$= \int_{-1}^{1} f x \, dx = \frac{1}{\pi}\Big[\log(x^2 + 1)\Big]_{-1}^{1}$$

which is zero. This is obvious from the symmetry of the curve about the axis $x = 0$. The median is also $x = 0$ and so is the mode.

14. Measures of dispersion

A very important characteristic of a set of data is the dispersion or scatter of the data about some value such as the mean.

The numbers 40, 50, 60, 70, 80 of which the mean is 60 have a dispersion or scatter of 20 on each side of the mean, that is, they are within the limits 60 ± 20. Similarly, the numbers 7, 5, 8, 10, 12, 6 have a mean of 8 and are within the limits 8 - 3 and 8 + 4.

Different sets of data have, in general, different means and different dispersions. The data corresponding to the frequency curve A in Fig. 7 have a smaller dispersion than those represented by the curve D, whilst the mean of the data A is greater than that of the data D.

Fig. 7. Frequency distributions with different means and dispersions.

Various parameters are used to measure the dispersion. We shall mention only three, namely, the range, the mean deviation and the standard deviation.

15. The range

The range of a frequency distribution is defined as the difference between the least and greatest values of the variable. This is a very simple measure of dispersion and has, of course, limitations because of its simplicity. It gives the complete interval of the variable over which the data are distributed and so includes those data, usually at the ends of the interval, which occur very infrequently. It may be much more useful in certain cases to know the portion of the range of the variable within which a given fraction, say 50%, of the data lie. For example the range of the marks given in Table 1 is 0-99 but 93/120 or 78% of the marks lie between 30 and 69 inclusive.

16. The mean deviation

If $x_1, x_2, \dots x_n$ are a set of data of which $\bar{x}$ is the mean, the deviations of the data from the mean* are $(x_1 - \bar{x}), (x_2 - \bar{x}), \dots (x_n - \bar{x})$ respectively, which we will write as $d_1, d_2, \dots d_n$. Some of these deviations are positive and some negative; in fact their sum is zero, that is,

$$d_1 + d_2 + \dots + d_n = x_1 + x_2 + \dots + x_n - n\bar{x} = 0$$

The mean deviation (sometimes called the mean absolute deviation) of a set of data is defined as the mean of the numerical values of the deviations. Therefore the mean deviation is

$$\frac{|d_1| + |d_2| + \dots + |d_n|}{n} = \frac{1}{n}\sum_{s=1}^{n} |d_s|$$

For example, the numbers 7, 5, 8, 10, 12, 6 have a mean of 8 and a mean deviation of $(1/6)(1 + 3 + 0 + 2 + 4 + 2) = 2$.

If the data $x_1, x_2, \dots x_n$ have frequencies $f_1, f_2, \dots f_n$ respectively, the mean deviation is given by

$$\frac{f_1|d_1| + f_2|d_2| + \dots + f_n|d_n|}{f_1 + f_2 + \dots + f_n} = \frac{\sum f_s|x_s - \bar{x}|}{\sum f_s} \qquad (3)$$

* Deviations are sometimes taken from the median.

EXAMPLE

Consider the data given in Table 2. The mean was found to be $3\tfrac{10}{11} \approx 3.9$. In Table 4 below values of $|d_s|$ are given in column three and values of $f_s|d_s|$ in column four. The mean deviation is $58.0/44 = 1.32$.

Table 4

x_s     f_s     |d_s| = |x_s - 3.9|     f_s |d_s|
0.5       1            3.4                  3.4
1.5       5            2.4                 12.0
2.5       7            1.4                  9.8
3.5       9            0.4                  3.6
4.5      10            0.6                  6.0
5.5       8            1.6                 12.8
6.5       4            2.6                 10.4
sum      44                                58.0

17. The standard deviation

The most important measure of dispersion is the standard deviation, usually denoted by $\sigma$. It is defined in terms of the squares of the deviations from the mean as follows: if $d_1, d_2, \dots d_n$ are the deviations of the data $x_1, x_2, \dots x_n$ from the mean $\bar{x}$, then*

$$\sigma^2 = (d_1^2 + d_2^2 + \dots + d_n^2)/n$$

that is,

$$\sigma = \left[\frac{1}{n}\sum_{s=1}^{n} d_s^2\right]^{1/2} = \left[\frac{1}{n}\sum_{s=1}^{n} (x_s - \bar{x})^2\right]^{1/2} \qquad (4)$$

The standard deviation is the root mean square deviation of the data, measured from the mean.

* The reader is asked to note the modification introduced in Section 3.

EXAMPLE

Find the standard deviation of the numbers 5, 6, 7, 8, 10, 12.

The mean of these numbers is 8, and so the deviations are -3, -2, -1, 0, 2, 4 respectively. Of course, the sum of the deviations is zero. Now

$$\sigma^2 = (1/6)(9 + 4 + 1 + 0 + 4 + 16) = 34/6 = 5.67$$

therefore $\sigma = 2.38$.

If $x_1, x_2, \dots x_n$ have frequencies $f_1, f_2, \dots f_n$ respectively then

$$\sigma^2 = \frac{f_1d_1^2 + f_2d_2^2 + \dots + f_nd_n^2}{f_1 + f_2 + \dots + f_n} = \sum_{s=1}^{n} f_s(x_s - \bar{x})^2 \Big/ \sum_{s=1}^{n} f_s \qquad (5)$$

$\sigma^2$ is known as the variance of the data.

If the frequencies are known for all values of $x$ over a continuous range $(a, b)$, then the summations can be replaced by integrals and we have

$$\sigma^2 = \int_a^b f(x - \bar{x})^2 \, dx \Big/ \int_a^b f \, dx$$

It might be noted that a coefficient of variation is sometimes used. It is defined as $100\,\sigma/\bar{x}$ and is expressed as a percentage. In the above example the coefficient of variation is therefore

$$100(2.38)/8 \approx 30\%$$

EXAMPLE

If the frequency of the variable $x$ is given by $f = Ce^{-x/a}$ for all positive values of $x$, find the mean and standard deviation of the distribution.

The mean value of $x$ is given by

$$\bar{x} = \int_0^\infty C x e^{-x/a} \, dx \Big/ \int_0^\infty C e^{-x/a} \, dx$$

which reduces to $\bar{x} = a$. Also the standard deviation $\sigma$ is given by

$$\sigma^2 = \int_0^\infty C(x - a)^2 e^{-x/a} \, dx \Big/ \int_0^\infty C e^{-x/a} \, dx$$

which reduces to $\sigma = a$.
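The mean deviation of formula (3) and the standard deviation of formulas (4) and (5) can be computed side by side. The following is a minimal sketch using only the data quoted in the text (Table 2 and the numbers 5, 6, 7, 8, 10, 12); the function names are illustrative assumptions.

```python
import math


def weighted_mean(xs, fs):
    """Mean of values xs occurring with frequencies fs."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


def mean_deviation(xs, fs=None):
    """Formula (3): mean of the numerical values of the deviations from the mean."""
    fs = fs or [1] * len(xs)
    m = weighted_mean(xs, fs)
    return sum(f * abs(x - m) for x, f in zip(xs, fs)) / sum(fs)


def standard_deviation(xs, fs=None):
    """Formulas (4) and (5): root mean square deviation from the mean."""
    fs = fs or [1] * len(xs)
    m = weighted_mean(xs, fs)
    return math.sqrt(sum(f * (x - m) ** 2 for x, f in zip(xs, fs)) / sum(fs))


# Data of Table 2, as used in Table 4.
xs = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
fs = [1, 5, 7, 9, 10, 8, 4]
md_table = mean_deviation(xs, fs)

# The six numbers of the standard-deviation example.
numbers = [5, 6, 7, 8, 10, 12]
md = mean_deviation(numbers)
sd = standard_deviation(numbers)
cv = 100 * sd / weighted_mean(numbers, [1] * len(numbers))  # coefficient of variation, %

print(round(md_table, 2), md, round(sd, 2), round(cv))
```

The divisor here is the total frequency, as in formula (4); the modification mentioned in the footnote is not applied.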

18. Evaluation of standard deviation

In cases more complicated than the simple example above (Example 1), considerable economy in the arithmetic can be made by using a suitable working mean $m$, instead of $\bar{x}$, in finding $\sigma^2$. As shown earlier

$$\sum f_s(x_s - m) = \sum f_sx_s - m\sum f_s = (\bar{x} - m)\sum f_s$$

The summation on the left-hand side, involving deviations from the working mean $m$, can thus be used to find $\bar{x}$ (see Section 10). Further,

$$\sum f_s(x_s - \bar{x})^2 = \sum f_s[(x_s - m) + (m - \bar{x})]^2$$
$$= \sum f_s(x_s - m)^2 + 2(m - \bar{x})\sum f_s(x_s - m) + (m - \bar{x})^2\sum f_s$$
$$= \sum f_s(x_s - m)^2 - (m - \bar{x})^2\sum f_s$$

Hence, dividing by $\sum f_s$ and using equation (5) above,

$$\sigma^2 = \mu^2 - (m - \bar{x})^2 \qquad (6)$$

where $\mu^2 = \sum f_s(x_s - m)^2/\sum f_s$. We note that $\sum f_s(x_s - m)^2$ is the sum of the squares of the deviations from the working mean $m$. It is usually easier, in finding $\sigma^2$, to choose a suitable working mean $m$ and calculate $\mu^2 = \sum f_s(x_s - m)^2/\sum f_s$ and then use $\sigma^2 = \mu^2 - (m - \bar{x})^2$.

The special case when $m = 0$ is of some importance. Also, one further interpretation of equation (6) might well be noted here: the least value of $\mu^2$ is $\sigma^2$ and this arises when $m = \bar{x}$, that is, the sum of the squares of the deviations of any set of data, measured from any value $m$, is least when $m$ equals the mean of the data.

EXAMPLE

Calculate the mean and standard deviation of the following observations, where $f$ denotes the frequency of the observation $x$.

x    0    1    2    3    4    5    6    7    8    9    10
f   10   40   72   85   78   55   32   18    7    2     1

Using a working mean of 3 the necessary working is tabulated below, where $d$ now denotes the deviation from the working mean, that is, $d = x - m$ with $m = 3$. The last two columns are calculated for checking purposes as explained later in Section 20.

Table 5

x       f      d      fd      fd²     f(d + 1)    f(d + 1)²
0      10     -3     -30       90       -20          40
1      40     -2     -80      160       -40          40
2      72     -1     -72       72         0           0
3      85      0       0        0        85          85
4      78      1      78       78       156         312
5      55      2     110      220       165         495
6      32      3      96      288       128         512
7      18      4      72      288        90         450
8       7      5      35      175        42         252
9       2      6      12       72        14          98
10      1      7       7       49         8          64
sum   400            228     1492       628        2348

Hence

$$\bar{x} - m = \bar{x} - 3 = 228/400 = 0.57$$

so that $\bar{x} = 3.57$. Also $\mu^2 = 1492/400 = 3.73$ and hence from equation (6)

$$\sigma^2 = \frac{1492}{400} - (0.57)^2 = 3.73 - 0.32 = 3.41$$

therefore $\sigma = 1.85$. We note that in this case the coefficient of variation equals

$$(1.85/3.57) \times 100\% = 52\%$$

19. Sheppard's correction

If we regard the data given in Table 5 as grouped into classes of unit width Sheppard's correction should be applied in the evaluation of $\sigma^2$. As we have stated earlier (Section 13) the effect of the class width on the mean is usually negligible, but Sheppard has shown

that to allow for the class width $c$ the value of $\sigma^2$ as calculated above should under certain conditions be reduced by $c^2/12$. Since $c = 1$ in this case, the correction is 0.08 and

$$\sigma^2 = 3.73 - 0.08 - 0.32 = 3.33$$

and hence $\sigma = 1.82$.

20. Charlier's checks

As a check on the arithmetic involved in the calculation of $\bar{x}$ and $\sigma$ we can use the results

$$\sum f(d + 1) = \sum fd + \sum f$$
$$\sum f(d + 1)^2 = \sum fd^2 + 2\sum fd + \sum f$$

which are known as Charlier's checks. In the above example $\sum f(d + 1)$ and $\sum f(d + 1)^2$ have been calculated independently as is shown in the last two columns of Table 5. The calculated values agree with the values obtained using Charlier's checks, for these give

$$\sum f(d + 1) = 228 + 400 = 628$$

and

$$\sum f(d + 1)^2 = 1492 + 2(228) + 400 = 2348$$

These checks are only partial but are usually worthwhile.

21. The mean and standard deviation of a sum

It is sometimes necessary to find the mean and standard deviation of the sum of two or more sets of quantities, or of two or more functions. It is possible to obtain expressions for the mean and standard deviation of the sum in terms of those of the components forming the sum.

Suppose $x_1, x_2, \dots x_n$ and $x_1', x_2', \dots x_m'$ are two sets of values of a variable $x$; suppose the mean values of the two sets are $\bar{x}$ and $\bar{x}'$ respectively. The mean value of the sum or aggregate of the two sets of values is

$$(x_1 + x_2 + \dots + x_n + x_1' + x_2' + \dots + x_m')/(n + m) = (n\bar{x} + m\bar{x}')/(n + m)$$

Similarly if $\sigma$ and $\sigma'$ are the standard deviations of the two sets of values, the standard deviation of the aggregate of values can be found, in terms of $\bar{x}$, $\bar{x}'$, $\sigma$ and $\sigma'$, by using equation (6) of Section 18.

Another important example of a sum, however, is where a variable $z$ is expressed as the sum of two or more independent variables $x$, $y$, say $z = x + y$ or more generally $z = ax + by$ where $a$ and $b$ are constants. Suppose $x$ has the values $x_1, x_2, \dots x_n$ and $y$ has the values $y_1, y_2, \dots y_m$. Then if $z = ax + by$ it follows that $z$ has the values, $mn$ in number, given by $ax_i + by_j$ with $i = 1, 2, 3, \dots n$ and $j = 1, 2, 3, \dots m$. If the means of $z$, $x$, $y$ are denoted by $\bar{z}$, $\bar{x}$, $\bar{y}$ and the standard deviations by $\sigma_z$, $\sigma_x$, $\sigma_y$ respectively, it can be shown that

$$\bar{z} = a\bar{x} + b\bar{y} \quad \text{and} \quad \sigma_z^2 = a^2\sigma_x^2 + b^2\sigma_y^2$$

In particular if $z = x + y$ we have $\bar{z} = \bar{x} + \bar{y}$ and $\sigma_z^2 = \sigma_x^2 + \sigma_y^2$.

For example, suppose $x$ has the values 10, 11, 12 and $y$ has the values 6, 8, 10 so that $\bar{x} = 11$ and $\bar{y} = 8$. Then if $z = x + y$, $z$ has the nine values obtained by adding the three values of $x$ to each of the three values of $y$, namely

16   17   18
18   19   20
20   21   22

It follows that $\bar{z} = 19$ verifying $\bar{z} = \bar{x} + \bar{y}$. Also,

$$\sigma_x^2 = (1^2 + 0 + 1^2)/3 = 2/3$$
$$\sigma_y^2 = (2^2 + 0 + 2^2)/3 = 8/3$$

whilst

$$\sigma_z^2 = (3^2 + 2^2 + 2 \cdot 1^2 + 0 + 2 \cdot 1^2 + 2^2 + 3^2)/9 = 10/3$$

verifying that $\sigma_z^2 = \sigma_x^2 + \sigma_y^2$.

Again, if $z = x - y$ the values of $z$ are

4   5   6
2   3   4
0   1   2

so that $\bar{z} = 3$ verifying $\bar{z} = \bar{x} - \bar{y}$. Also,

$$\sigma_z^2 = (3^2 + 2^2 + 2 \cdot 1^2 + 0 + 2 \cdot 1^2 + 2^2 + 3^2)/9 = 10/3$$

so that again $\sigma_z^2 = \sigma_x^2 + \sigma_y^2$.
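The results quoted for $z = ax + by$ can be verified "from first principles" by forming all $mn$ values of $z$, as in the example just given. The sketch below does this with exact fractions; the helper names `mean`, `variance` and `combine` are illustrative assumptions, and the case $z = 2x - y$ is added only as a further check of the general formula.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    """Exact arithmetic mean."""
    return Fraction(sum(values), len(values))


def variance(values):
    """Exact mean square deviation from the mean (divisor n)."""
    m = mean(values)
    return sum((Fraction(v) - m) ** 2 for v in values) / len(values)


def combine(x, y, a, b):
    """All m*n values a*x_i + b*y_j, every x paired with every y."""
    return [a * xi + b * yj for xi, yj in product(x, y)]


x = [10, 11, 12]
y = [6, 8, 10]

z_sum = combine(x, y, 1, 1)    # z = x + y
z_diff = combine(x, y, 1, -1)  # z = x - y
z_gen = combine(x, y, 2, -1)   # z = 2x - y

for z, (a, b) in [(z_sum, (1, 1)), (z_diff, (1, -1)), (z_gen, (2, -1))]:
    print(mean(z), a * mean(x) + b * mean(y),
          variance(z), a**2 * variance(x) + b**2 * variance(y))
```

Exact fractions are used so that the two sides of each identity can be compared for equality rather than to within a rounding tolerance.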

6.17. 11.21 (b) the quantities given in (a) when they occur with frequenciesl 2. 1. x 5.5 plates led to the following results. Find the mean and the median of the following.20.12. 3. is unity. 19. x 20 30 40 50 60 70 80 f 53 75 95 100 70 35 22 Find also the median. 9. 8. 2 .x). (a) 10.23. 47 \ . Find the mean deviation and the standard deviation of the distrihution given by f = t7T sin t7TX from x = 0 to x = 4.9. Find the mean and standard deviation of the two sets of observations taken together.14. Radius of curvature (em) Relative intensity 5'4 o 5'5 11 5·6 36'5 5·7 40·5 5·8 5·9 31 9·5 6'0 0 B. 46 o 1 2 1 2 3 4 5 6 f x 3 8 12 18 20 18 16 12 7 8 9 8 4 10 1 f 2 345 5 12 678 26 20 4 Deduce the mean and standard deviation of the sum of the two distributions. 10. and for the range x = 0 to x = 1. Find also the standard deviation. 2 SOME STATISTICAL IDEAS EXERCISES 2 1.9. The mean of 500 observations is 4 ·12 and the standard deviation is 0 ·IB. for which f and x are both positive. 11. 12. Find also the standard deviation for the range x = 0 to x = 2. Observations on the scattering of electrons in 0. 0 0 0 0 0 0 \ SOME STATISTICAL IDEAS CHAP. 15. and find the mean value of x over this range.x from x = 1 to x = 2.3. Find the standard deviation of the distribution given in question 4.16. The mean of 100 observations is 2· 96 and the standard deviation is O' 12.17. Find the mean value of the data given (a) in Table 1. Calculate the mean and mean deviation of these readings. 2. and the mean of a further 50 observations is 2· 93 with a standard deviation of O' 16. 16. Apply Charlier's checks and Sheppard's correction for grouping.21. If nl quantities have a mean x and standard deviation CTI and another n2 quantities have a mean y and standard deviation CTz. Using (i) 40 and (ii) 50 as the assumed mean find the mean of the variable x which occurs with frequency f as follows. One hundred of these observations are found to have a mean of 4·20 and a standard deviation of 0·24. 
Choose the constant C so that the area under the curve. 12 respectively.6. Thirty observations of an angle B are distributed as follows f being the frequency of the corresponding value. Draw the frequency curve (the triangular distribution) given by f = x from x = 0 to x = 1 and f = 2 . What is the mean value of x? Find the median for the range x = 0 to x = 2. 5. 12. (J J 50 36' 20" 1 ji 50 36' 21" 2 50 36' 23" 4 50 36' 24" 6 50 36' 25" 8 500 36' 26" 5 500 36' 27" 3 50 36' 28" 1 4. 14. 13. Sketch the frequency curve given by f = Cx2(2 . 4.19.20.CHAP. (a) Find the mean deviation of the following. Find the mean and standard deviation of the other 400 observations.1 3.18.25 (b) Find the mean deviation and the standard deviation of the data given in (a). if their frequencies are respectively. Find the mean radius of curvature of the tracks. Find the mean and standard deviations of each of the following distributions. 14.5. which give the relative intensity of electrons in tracks of given radii of curvature. 6.22. 7. 10. (b) in Table 3.

show that the aggregate of $(n_1 + n_2)$ quantities has a variance given by

$$\frac{n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2} + \frac{n_1 n_2(\bar{x} - \bar{y})^2}{(n_1 + n_2)^2}$$

15. If the variable x has the values 0, 1, 2, 3, 4 and the variable y has the values -1, 1, 3, find from first principles the mean and standard deviations of the function given by (i) $x + y$, (ii) $2x - y$, (iii) $3x + 4y$, and (iv) $3y - 2x$. Verify the general results quoted in Section 21.

22. Certain special frequency distributions

The distributions of industrial and social statistics, of scientific observations and of various games of chance are manifold. However, many of them approach closely to one or other of three specially important types of frequency distribution which can be derived and expressed mathematically using the theory of probability. These are known as the binomial, Poisson and normal distributions. There are many others.

23. The binomial distribution

On tossing a coin there are two possibilities; we get either heads or tails. If we toss a coin a large number of times, say n, we shall find that the number of times we get tails is $\tfrac{1}{2}n$ approximately. In one trial of 1000 tosses heads were obtained 490 times and tails 510 times. We therefore say that the probability of getting heads on any one toss is $\tfrac{1}{2}$, as is the probability of getting tails.

Of course, if n is small, the number of heads in any one trial may well differ considerably from $\tfrac{1}{2}n$. For instance, if we toss a coin 10 times, we may find we get heads 3 times. If we toss the coin another 10 times, we may get heads 8 times; indeed there may be no heads at all. The question arises: if we toss a coin n times, what is the probability of getting heads m times? (of course $m \le n$).

It can be shown that on tossing a coin n times the probabilities of getting heads 0, 1, 2, ..., n times are given by the successive terms in the binomial expansion of $(\tfrac{1}{2} + \tfrac{1}{2})^n$. For example, the probabilities of getting heads 0, 1, 2, ..., 10 times on tossing a coin 10 times are given by the successive terms of the expansion of $(\tfrac{1}{2} + \tfrac{1}{2})^{10}$, that is,

$$\frac{1}{2^{10}},\quad 10 \times \frac{1}{2^{10}},\quad \frac{10 \times 9}{1 \times 2} \times \frac{1}{2^{10}},\quad \ldots$$

namely,

$$\frac{1}{2^{10}}\,(1,\ 10,\ 45,\ 120,\ 210,\ 252,\ 210,\ 120,\ 45,\ 10,\ 1)$$

These probabilities are represented diagrammatically in Fig. 8. The sum of the probabilities is, of course, unity.

[Fig. 8. Binomial distribution: histogram of $2^{10} \times$ probability against values of n for $(\tfrac{1}{2} + \tfrac{1}{2})^{10}$.]

Thus on tossing a coin 10 times the probability of getting heads 3 times is

$$\frac{10 \times 9 \times 8}{1 \times 2 \times 3}\left(\frac{1}{2}\right)^7\left(\frac{1}{2}\right)^3 = \frac{15}{128}$$

which is 1 in 8 roughly.

More generally James Bernoulli (1654-1705) showed that if the probability of a certain event happening is p and the probability that it will not happen is q (so that $p + q = 1$), the probabilities that it will happen on 0, 1, 2, ..., n out of n occasions are given by the successive terms of the binomial expansion of $(q + p)^n$, that is,

$$q^n + nq^{n-1}p + \frac{n(n-1)}{1 \times 2}\,q^{n-2}p^2 + \cdots + p^n$$

A distribution of data having relative frequencies given by the terms of the expansion of $(q + p)^n$ where $p + q = 1$ is known as a binomial distribution. In practice, of course, relative frequencies conforming approximately to these theoretical values are only likely to be obtained if a large number of trials is involved.

It is instructive to represent graphically the frequency distributions corresponding to the terms of the expansion of $(q + p)^n$ for different
values of p (and q) and of n. Further it can be proved that the mean of a binomial distribution is $np$ and the standard deviation is $(npq)^{1/2}$. We note that in such distributions the variable n is not continuous; it can only assume positive integral values.

EXAMPLE 1

Verify that the data given below, where f denotes the frequency with which an event occurs n times, form a binomial distribution and find the mean and standard deviation.

  n    0    1    2    3    4    5
  f    1   10   40   80   80   32

Since the first and last terms of the expansion of $(q + p)^n$ are $q^n$ and $p^n$, if the above are distributed binomially then $(p/q)^5$ must equal 32 or $p/q = 2$, that is, $q = \tfrac{1}{3}$ and $p = \tfrac{2}{3}$. Expanding $(\tfrac{1}{3} + \tfrac{2}{3})^5$ we get

$$\frac{1}{243}\,(1 + 10 + 40 + 80 + 80 + 32)$$

which shows that the above distribution is binomial. Hence the mean equals $np = 5(\tfrac{2}{3}) = 3\tfrac{1}{3}$ and the standard deviation equals $(npq)^{1/2} = (5 \times \tfrac{2}{3} \times \tfrac{1}{3})^{1/2} = \tfrac{1}{3}(10)^{1/2}$.

To find the mean and standard deviation independently we can use an assumed mean of 3; the working is given below.

   n      f      d     fd    fd^2
   0      1     -3     -3      9
   1     10     -2    -20     40
   2     40     -1    -40     40
   3     80      0      0      0
   4     80      1     80     80
   5     32      2     64    128
  sum   243            81    297

Hence the mean $= 3 + \frac{81}{243} = 3\tfrac{1}{3}$ and the standard deviation

$$= \left[\frac{297}{243} - \left(\frac{1}{3}\right)^2\right]^{1/2} = \left(\frac{270}{243}\right)^{1/2} = \tfrac{1}{3}(10)^{1/2}$$

EXAMPLE 2

A large batch of articles produced by a certain machine are examined by taking samples of 5 articles. It is found that the numbers of samples containing 0, 1, 2, 3, 4, 5 defective articles are 58, 32, 7, 2, 1, 0 respectively. Show that this distribution is approximately binomial and deduce the percentage of articles in the batch that are defective.

If a large batch of the products contains a fraction p that are defective, the probability of choosing at random a defective article may be taken to be p. Consequently if we choose at random a sample of n articles, the probability that this sample will contain s articles which are defective is given by the (s + 1)th term of the expansion of $(q + p)^n$.

The mean of the above distribution is

$$(0 + 32 + 14 + 6 + 4 + 0)/100 = 0.56$$

If we equate this to $np$ with $n = 5$ we get $p = 0.112$. Taking for simplicity $p = \tfrac{1}{9}$ the expansion of $(q + p)^n$ in this case is

$$\left(\frac{8}{9} + \frac{1}{9}\right)^5 = 9^{-5}\,(8^5 + 5 \times 8^4 + 10 \times 8^3 + 10 \times 8^2 + 5 \times 8 + 1)$$

$$= \frac{32768 + 20480 + 5120 + 640 + 40 + 1}{59049}$$

$$= 0.5549 + 0.3468 + 0.0867 + 0.0108 + 0.0007 + 0.0000$$

Hence we should expect the 100 samples to be distributed as follows: 55, 35, 9, 1, 0, 0, which agree well with the numbers actually found. The distribution is therefore approximately binomial and the percentage of defectives in the batch is likely to be about 11%.

24. The Poisson distribution

The form of the binomial distribution varies considerably depending upon the values of p and n. One important practical case is when p is very small, that is, the probability of the event happening is very small, but n is large, so large that $np$ is not insignificant. It can be shown that if p is small whilst n is large and such that $np = m$, the binomial expansion of $(q + p)^n$ approximates closely to the series

$$e^{-m}\left(1 + \frac{m}{1} + \frac{m^2}{1 \times 2} + \frac{m^3}{1 \times 2 \times 3} + \cdots + \frac{m^n}{n!}\right)$$

where $e = 2.71828$ correct to 5 decimal places. This is known as the Poisson series, and any distribution which corresponds to the
successive terms of this series is called a Poisson distribution. It is so named after the French mathematician Poisson (1781-1840) who in 1837 developed the theory.

What characterizes a distribution of the Poisson type is that the relative frequency r with which an event happens n times varies according to the following table:

  n          0    1    2                  3                           ...    s
  e^m r      1    m    m^2/(1 x 2)        m^3/(1 x 2 x 3)             ...    m^s/s!

For a true probability distribution the sum of the probabilities must be unity. For a Poisson distribution the sum of the relative frequencies is

$$e^{-m}\left(1 + m + \frac{m^2}{1 \times 2} + \cdots + \frac{m^s}{s!}\right) = e^{-m} \times e^{+m} \text{ (approximately if s is large enough)} = 1 \text{ (approximately)}$$

To the same degree of approximation it can be shown that the mean of the distribution is m and the standard deviation is $m^{1/2}$. In Fig. 9 are drawn histograms of Poisson distributions corresponding to (a) $m = 1$ and (b) $m = 4$.

[Fig. 9. Poisson distributions having (a) mean = 1, (b) mean = 4.]

Data conforming approximately to Poisson distributions occur widely in science, particularly in biology. They also arise in many industrial problems. The indispensable conditions are that the event should happen rarely (p small) and that the number of trials should be large, or strictly so large that the mean number of occurrences is appreciable.

EXAMPLE 1

The following table gives the frequencies f of the number n of successes in a set of 500 trials. Find the mean of the distribution and verify that the distribution is roughly of the Poisson type. Verify also that the variance equals the mean approximately.

  n      0    1     2     3     4     5     6    7    8    9    sum
  f     24   77   110   112    84    50    24   12    5    2    500
  nf     0   77   220   336   336   250   144   84   40   18   1505

The mean of the number of successes may be found by calculating the values of nf as shown in the table, whence the mean equals 1505/500 = 3.01. The terms of a Poisson series with this value of m are

$$e^{-3.01}\left(1 + 3.01 + \frac{3.01^2}{2} + \frac{3.01^3}{6} + \cdots\right)$$

On multiplying by 500 the successive terms become 24.6, 74.2, 111.6, 112.0, 84.3, 50.8, 25.4, 11.0, 4.0, 1.4 which agree very well with the values of f given in the table above.

The variance is found to be 3.05. The working, using an assumed mean of 3, is given below, where $d = n - 3$.

   n      d      fd    fd^2
   0     -3     -72    216
   1     -2    -154    308
   2     -1    -110    110
   3      0       0      0
   4      1      84     84
   5      2     100    200
   6      3      72    216
   7      4      48    192
   8      5      25    125
   9      6      12     72
  sum             5   1523
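The assumed-mean working and the Poisson comparison in Example 1 can be checked numerically. The following is only a sketch in Python, which is not part of the original text; the variable names are illustrative, and the frequencies are those of the table in Example 1.

```python
import math

# Observed frequencies f of n = 0, 1, ..., 9 successes (Example 1, 500 trials).
observed = [24, 77, 110, 112, 84, 50, 24, 12, 5, 2]
total = sum(observed)

# Assumed-mean working with d = n - 3, as in the table of the example.
assumed = 3
sum_fd = sum(f * (n - assumed) for n, f in enumerate(observed))
sum_fd2 = sum(f * (n - assumed) ** 2 for n, f in enumerate(observed))
mean = assumed + sum_fd / total
variance = sum_fd2 / total - (sum_fd / total) ** 2

# Expected Poisson frequencies: total * e^(-m) * m^n / n!
expected = [total * math.exp(-mean) * mean ** n / math.factorial(n)
            for n in range(len(observed))]

print(f"mean = {mean:.2f}, variance = {variance:.4f}")
for n, (f, e) in enumerate(zip(observed, expected)):
    print(n, f, round(e, 1))
```

The closeness of the mean and the variance is the quick test for a Poisson type used in the example; the expected frequencies give the fuller comparison.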

Hence, the mean $= 3 + \frac{5}{500} = 3.01$ and the variance $= \frac{1523}{500} - (0.01)^2$, that is, $3.0459 \approx 3.05$.

EXAMPLE 2

The number of dust nuclei in a small sample of air can be estimated by using a dust counter. The following values were found in a set of 400 samples.

  Number of particles    0    1    2    3    4    5    6    7    8
  Frequency             23   56   88   95   73   40   17    5    3

The mean number of particles is found to be $1170/400 \approx 2.9$. The terms of the Poisson series with $m = 2.9$ are

$$e^{-2.9}\left(1 + 2.9 + \frac{2.9^2}{2} + \frac{2.9^3}{6} + \cdots\right)$$

that is,

$$0.055 + 0.160 + 0.231 + 0.224 + 0.163 + 0.094 + 0.046 + 0.019 + 0.007 + \cdots$$

On multiplying by 400 the successive terms become 22, 64, 92, 90, 65, 38, 18, 8, 3, which agree well with the values of the frequency given above.

It might be noted concerning these measurements that the number of dust nuclei in the air is large, and the probability of any one nucleus being within the small sample examined is very small. The degree of departure from the Poisson distribution has been discussed by Scrase.(3)

EXAMPLE 3

A large batch of articles produced by a certain machine are examined by taking samples of 5 articles. It is found that the numbers of samples containing 0, 1, 2, 3, 4, 5 defective articles are 58, 32, 7, 2, 1, 0 respectively. Show that this is approximately a Poisson distribution (see Example 2, Section 23).

The mean number of defective articles is $0.556 \approx 5/9$ and the terms of the Poisson series with $m = 5/9$ are

$$e^{-5/9}\left(1 + \frac{5}{9} + \frac{(5/9)^2}{1 \times 2} + \frac{(5/9)^3}{1 \times 2 \times 3} + \cdots\right)$$

that is,

$$0.574 + 0.319 + 0.089 + 0.016 + 0.002 + \cdots$$

On multiplying by 100 the successive terms become 57, 32, 9, 2, 0, 0, which agree well with the actual values.

25. The normal distribution

The normal distribution was first derived by Demoivre in 1733 when dealing with problems associated with the tossing of coins. It was also obtained independently by Laplace and Gauss later. It is therefore sometimes referred to as the Gaussian distribution or the Gaussian law of errors, because of its early application to the distribution of accidental errors in astronomical and other scientific data. Its basic importance in physics and statistics cannot be overemphasized.*

* "The role of the normal distribution in statistics is not unlike that of the straight line in geometry."(4)

The equation of what is known as the normal error curve is of the form

$$y = A\,e^{-h^2(x-m)^2}$$

where A, h, m are constants. The shape of the curve is shown in Fig. 10. There is a maximum value when $x = m$ and the curve is symmetrical about the line $x = m$. When $x = m$, $y = A$ and when $x = m \pm a/h$, $y = A\,e^{-a^2}$ which is as small as $A/100$ when $a = 2.15$ approximately.

[Fig. 10. Normal error curve, $y = A\,e^{-h^2(x-m)^2}$ and $h^2 = 1/(2\sigma^2)$; the abscissae $m - \sigma$, $m$, $m + \sigma$ are marked.]

The whole area under the curve is given by

$$\int_{-\infty}^{\infty} A\,e^{-h^2(x-m)^2}\,dx$$
and this equals $(A\sqrt{\pi})/h$. If A is taken equal to $h/\sqrt{\pi}$ the area under the curve is unity and the equation of the curve then is

$$y = \frac{h}{\sqrt{\pi}}\,e^{-h^2(x-m)^2} \qquad (7)$$

This is the frequency curve of what is called a normal or Gaussian distribution. Strictly, such a distribution is one for which the relative frequency of the observations having a value between x and $x + \delta x$ is $y\,\delta x$ where y is given by equation (7). Another way of expressing this is to say that for a normal distribution the probability that an observation will lie between x and $x + \delta x$ is

$$\frac{h}{\sqrt{\pi}}\,e^{-h^2(x-m)^2}\,\delta x$$

The value of y above, equation (7), is therefore sometimes called the probability density or the relative frequency density of the distribution.

It will be noted that any normal distribution is determined by two parameters h and m. We can show that m is the mean of the distribution (this is obvious from the symmetry of the curve in Fig. 10) and that h, sometimes called the precision constant, is related to the standard deviation $\sigma$ of the distribution by the relation $2\sigma^2 h^2 = 1$. Writing $m = \bar{x}$ and $h^2 = 1/(2\sigma^2)$ we have

$$y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\bar{x})^2/2\sigma^2}$$

This then represents a normal distribution of which the mean is $\bar{x}$ and the standard deviation is $\sigma$. The shape of the curve for the same value of $\bar{x}$ and for different values of $\sigma$ is shown in Fig. 11. When $x = \bar{x} \pm \sigma$ we have

$$y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-1/2} = \frac{0.607}{\sigma\sqrt{2\pi}}$$

whilst when $x = \bar{x} \pm 2\sigma$ we have

$$y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-2} = \frac{0.135}{\sigma\sqrt{2\pi}}$$

[Fig. 11. Normal error curves $y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\bar{x})^2/2\sigma^2}$ for different values of $\sigma$.]

26. Relation between a normal and a binomial distribution

It can be verified that the histogram of a binomial distribution corresponding to $(q + p)^n$ approximates very closely, when n is large, to a normal error curve. If the terms of the expansion of $(q + p)^n$ are plotted as ordinates against integral values of x from 0 to n the points are found to lie approximately, if n is large enough, on the normal curve

$$y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-m)^2/2\sigma^2}$$

with m equal to the mean $np$ of the binomial distribution, and $\sigma^2$ equal to the variance $npq$. Indeed the normal distribution can be derived mathematically from the binomial. In Fig. 12 are drawn the histogram corresponding to $(\tfrac{2}{3} + \tfrac{1}{3})^{10}$, and the normal curve with $m = 10/3$ and $\sigma^2 = 20/9$.

Of course there is one very important difference between the two distributions. The binomial is discontinuous consisting of a set of discrete values corresponding to integral values of n, whereas the normal distribution is continuous extending over all values of the variable. A distribution of discrete values may, of course, approximate closely to a normal distribution.
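The closeness of the two distributions can be examined numerically for the case $(\tfrac{2}{3} + \tfrac{1}{3})^{10}$ mentioned in the text. The following is a minimal sketch in Python, which is not part of the original text; the function names are illustrative only.

```python
import math

n, p = 10, 1 / 3
q = 1 - p
m = n * p          # mean np = 10/3
var = n * p * q    # variance npq = 20/9


def binomial_term(s):
    """Term of (q + p)^n giving the probability of s occurrences."""
    return math.comb(n, s) * q ** (n - s) * p ** s


def normal_density(x):
    """Ordinate of the normal curve with mean m and variance var."""
    return math.exp(-(x - m) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


# Tabulate the binomial ordinate and the normal ordinate at each integer.
for s in range(n + 1):
    print(s, round(binomial_term(s), 4), round(normal_density(s), 4))
```

Even with n as small as 10 the two columns lie close together near the mean; the agreement is poorer in the tails, where the skewness of the binomial with $p \ne q$ is most noticeable.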

[Fig. 12. Histogram of the binomial distribution $(\tfrac{2}{3} + \tfrac{1}{3})^{10}$ and (dotted curve) the normal error curve with $m = 10/3$ and $\sigma^2 = 20/9$.]

27. The mean deviation of a normal distribution

In Section 16 we defined the mean deviation of any set of data. If we apply this definition to a normal distribution we find that the mean deviation, denoted by $\eta$, is given by

$$\eta = \frac{\sigma}{\sqrt{\tfrac{1}{2}\pi}} \approx 0.80\sigma$$

Thus for a normal distribution $\eta$ equals $4\sigma/5$ approximately. The extent to which any other distribution departs from normality is sometimes estimated by calculating the ratio $\eta/\sigma$ and comparing it with 4/5. For example, if we use the distribution shown in Table 5 (Section 18) we find that $\Sigma f|d| = 592$, whence $\eta = 592/400 = 1.48$. But $\sigma = 1.85$ and hence $\eta/\sigma = 1.48/1.85 = 0.80$.

28. Area under the normal error curve

We have stated earlier that for a normal distribution the probability that an observation will lie between x and $x + \delta x$ is

$$\frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\bar{x})^2/2\sigma^2}\,\delta x$$

where $\bar{x}$ and $\sigma$ are the mean and standard deviation respectively of the distribution. This expression is represented by the area under the curve given by

$$y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\bar{x})^2/2\sigma^2}$$

between the ordinates x and $x + \delta x$. Further, the probability that an observation will lie between two values $x_1$ and $x_2$ is represented by the area under the curve between the ordinates $x_1$ and $x_2$, and equals

$$\frac{1}{\sigma\sqrt{2\pi}}\int_{x_1}^{x_2} e^{-(x-\bar{x})^2/2\sigma^2}\,dx$$

Writing $(x - \bar{x})/\sigma = t$ this expression becomes

$$\frac{1}{\sqrt{2\pi}}\int_{t_1}^{t_2} e^{-\frac{1}{2}t^2}\,dt$$

where $t_1 = (x_1 - \bar{x})/\sigma$ and $t_2 = (x_2 - \bar{x})/\sigma$. It can therefore be expressed as the difference of two integrals of the type

$$\frac{1}{\sqrt{2\pi}}\int_0^T e^{-\frac{1}{2}t^2}\,dt$$

This integral has been evaluated for different values of T and is given in mathematical tables.

It might be noted that the function given by

$$\frac{2}{\sqrt{\pi}}\int_0^T e^{-t^2}\,dt$$

is known as the error function, because of its importance in the theory of errors, and is denoted by $\operatorname{erf}(T)$. Hence

$$\frac{1}{\sqrt{2\pi}}\int_0^T e^{-\frac{1}{2}t^2}\,dt = \frac{1}{\sqrt{\pi}}\int_0^{T/\sqrt{2}} e^{-t^2}\,dt = \tfrac{1}{2}\operatorname{erf}(T/\sqrt{2})$$

and

$$\frac{1}{\sqrt{2\pi}}\int_{-T}^{T} e^{-\frac{1}{2}t^2}\,dt = \frac{2}{\sqrt{\pi}}\int_0^{T/\sqrt{2}} e^{-t^2}\,dt = \operatorname{erf}(T/\sqrt{2})$$
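The relation between the area under the normal error curve and the error function can be verified numerically. The sketch below is in Python, which is not part of the original text; it uses the standard library function `math.erf`, and the trapezoidal check and the function names are illustrative assumptions.

```python
import math


def central_area(T):
    """Area under (1/sqrt(2*pi)) * exp(-t^2/2) between -T and T, via erf."""
    return math.erf(T / math.sqrt(2))


def central_area_numeric(T, steps=20000):
    """The same area by the trapezoidal rule, as an independent check."""
    h = 2 * T / steps
    ys = [math.exp(-0.5 * (-T + i * h) ** 2) for i in range(steps + 1)]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1])) / math.sqrt(2 * math.pi)


# Compare the two evaluations for a few values of T.
for T in (0.6745, 1, 2, 3):
    print(T, round(central_area(T), 4), round(central_area_numeric(T), 4))
```

The two columns should agree to the figures printed, which confirms that tables of the error function may be used directly for probabilities of a normal distribution once the deviation is expressed in units of $\sigma$.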

We give in Table 6 the value of

$$\frac{1}{\sqrt{2\pi}}\int_{-T}^{T} e^{-\frac{1}{2}t^2}\,dt$$

for a few important values of T.

Table 6

  T          erf(T/√2)
  0          0
  0.6745     0.5
  1          0.6827
  2          0.9545
  3          0.9973
  ∞          1

This quantity is represented by the shaded area shown in Fig. 13. The total area under this curve is unity.

[Fig. 13. Graph of $y = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}t^2}$. Shaded area equals $\tfrac{1}{2}$ if $T = 0.6745$.]

From this table it follows that in a normal distribution the probability that an observation lies within $3\sigma$ of the mean is 0.9973. In other words, the range of a distribution conforming to the normal type is effectively about $6\sigma$. Again, the probability that an observation will lie within the range $\bar{x} \pm 1.96\sigma$ is 0.95, so this is sometimes called the 95% zone. Similarly the range $\bar{x} \pm 3.09\sigma$ is the 99.8% zone, that is, the probability that an observation will lie outside this zone is 1 in 500.

Further, it can be shown that the area under this curve from $t = -T$ to $t = T$ is $\tfrac{1}{2}$ if $T = 0.6745$. This last result can be interpreted as follows: in a normal distribution the probability that an observation will lie between $\bar{x} \pm 0.6745\sigma$ is $\tfrac{1}{2}$. In other words, the quantity $0.6745\sigma$ is the deviation from the mean which is just as likely to be exceeded as not, and is sometimes called the probable error. It is not a good term and is falling into disuse.

29. Sampling, standard error of the mean

It is usual to regard a finite set of data as a sample taken from a much larger, or infinite, set known as the population. The term population is used generally in this way, and not limited to the biological examples to which it was first applied. For example, 100 measurements of a physical quantity, such as a length or a time interval, may be thought of as a sample selected from the very large number of measurements which could be made and which are referred to as the population. The importance in statistics of this concept of population cannot be over-emphasized.

Unlike the binomial and Poisson distributions, in which the variable assumes only integral values, the variable in a normal distribution ranges continuously from $-\infty$ to $+\infty$, and an infinite population or number of observations is implied. In practice, however, the number of observations is limited and it is important to know whether such a finite sample approximates to the normal type and, if so, to be able to find the parameters of the normal law which best fit the observations.

A finite number n of observations will conform to a normal distribution if the frequency of the observations that lie between x and $x + \delta x$ is

$$\frac{nh}{\sqrt{\pi}}\,e^{-h^2(x-m)^2}\,\delta x$$

for all values of x. The practical problem is to test this and to find the values of m and h appropriate to the given set of observations. It is clear that a perfect fit is unlikely, but it can be shown that the most probable value of m is the mean of the observations and the most probable value of $h^2$ is $1/(2s^2)$ where s is the standard deviation of the observations.

We have pointed out earlier that for an infinite population of the normal type $m = \bar{x}$ and $h^2 = 1/(2\sigma^2)$ where $\bar{x}$ and $\sigma$ refer to that infinite population. Further, if we select at random* a sample of n data from an infinite normal population it is obvious that, in general, the mean of the sample will not be the same as the mean of the whole population. Of course, if n is large the two means may not differ very much but if n is small they may. It is possible to express mathematically the manner in which the means of different samples of given size are distributed. In fact it can be shown that this distribution is itself normal and such that its mean equals the true mean of the whole population, whilst its standard deviation equals $\sigma/n^{1/2}$ when n is the number of data in each sample and $\sigma$ is the standard deviation of the whole population. Fig. 11 illustrates how the distribution of the means of samples retains its normal form but decreases in dispersion as the size of the samples increases.

* Random sampling is defined as such that in selecting an individual from a population each individual in the population has the same chance of being chosen.

In dealing with any set of data it is therefore usual to find the mean $\bar{x}$ and to write it in the form $\bar{x} \pm \alpha$ where $\alpha = \sigma/n^{1/2}$ and is called the standard error of the mean. Similarly the standard error of the standard deviation is $\sigma/\sqrt{2n}$, so that the standard deviation may be written as

$$\sigma\left[1 \pm \frac{1}{\sqrt{2n}}\right] \qquad (8)$$

The usefulness and significance of the standard error of the mean may be illustrated by taking a particular example. Suppose the mean of a set of 100 measurements of a certain physical quantity is 2.341 and that the standard deviation is 0.18. Then the standard error of the mean is $0.18/100^{1/2} = 0.018$, and we write the result as $2.341 \pm 0.018$. Using Table 6 of Section 28 we infer that if a much larger number of measurements had been made the mean would be likely to lie between $2.341 - 0.018$ and $2.341 + 0.018$ with a probability of 0.68. Further, the probability of the mean lying between $2.341 - 0.036$ and $2.341 + 0.036$ is 0.95. These probabilities vary with the value of n and hence it is important to quote the number of observations, especially if it is small, namely,

$$\bar{x} = 2.341 \pm 0.018 \quad \text{(100 observations)}$$

Alternatively, the uncertainty in the value of $\sigma$ can be indicated by using expression (8) and writing the standard error of the mean as $0.018(1 \pm 0.07)$.

It should be noted that the quantity $0.6745\sigma/\sqrt{n}$ (cf. Section 28) which is called the probable error of the mean is sometimes quoted but the use of the standard error is generally recommended and is now widely adopted.

EXAMPLE 1

The mean height of 200 men is $1.705 \pm 0.002$ m and the mean height of another 300 men is $1.752 \pm 0.001$ m. Find the mean height of the 500 men and its standard error.

The mean height of the 500 men is

$$(1.705 \times 2 + 1.752 \times 3)/5 = 1.70 + \frac{0.01 + 0.156}{5} = 1.733 \text{ m}$$

Also, the variance of their 500 heights about this new mean may be found by using relation (6) as follows.

Variance of the 200 heights about the new mean
$= 200(0.002)^2 + (0.028)^2 = 0.0008 + 0.000784 = 0.001584$

Variance of the 300 heights about the new mean
$= 300(0.001)^2 + (0.019)^2 = 0.0003 + 0.000361 = 0.000661$

Variance of the 500 heights about the mean
$= \tfrac{2}{5}(0.001584) + \tfrac{3}{5}(0.000661) = 0.00103$

Standard error of the mean
$= \sqrt{0.00103/500} = \sqrt{0.00000206} = 0.0014$

Mean height of the 500 men $= 1.733 \pm 0.001$ m.

EXAMPLE 2

A set of 400 data has a mean of 2.62. Test whether this can be regarded as a random sample drawn from a normal population with mean 2.42 and standard deviation 1.24.

The standard error of the mean of a random sample of 400 taken from the normal population $= 1.24/\sqrt{400} = 0.062$. But the difference between the two means is 0.20 which is more than three times the standard error of the mean. Hence the sample is not likely to be drawn from the given population.

30. Bessel's formula

We have not yet discussed how the standard deviation $\sigma$ of an infinite population can be derived from the standard deviation s of a sample of n data. It can be shown that the best estimate we can make of $\sigma^2$ is given by $[n/(n-1)]s^2$. Thus, if the sample is $x_1, x_2, \ldots, x_n$ the best estimate of $\sigma^2$ is given by

$$\frac{1}{n-1}\sum_{r=1}^{n}(x_r - \bar{x})^2 = \frac{1}{n-1}\sum_{r=1}^{n} d_r^2$$

where $\bar{x}$ is the mean of the sample. This formula differs from that for $s^2$ in that $(n - 1)$ replaces n. This is known as Bessel's correction. It is insignificant when n is large.

If the sample consists of the values $x_1, x_2, \ldots, x_n$ with frequencies $f_1, f_2, \ldots, f_n$ respectively, the best estimate of $\sigma^2$ is given by

$$\sum_{r=1}^{n} f_r(x_r - \bar{x})^2/(N - 1) = \sum_{r=1}^{n} f_r d_r^2/(N - 1)$$

where $N = \sum f_r$ and $\bar{x}$ is the mean of the sample. The standard error of the mean is, however, $\sigma/\sqrt{N}$. Further, it can be shown that the standard error of the standard deviation is $\sigma/\sqrt{2(N-1)}$.

EXAMPLE

Returning to the data discussed in Section 18, the mean $\bar{x}$ was found to be 3.57. If we apply Bessel's correction the best estimate of $\sigma^2$ is given by

$$\sigma^2 = \frac{1492 - 400(0.57)^2}{399} = \frac{1492 - 130}{399} = \frac{1362}{399} = 3.41$$

or $\sigma = 1.85$ (correct to two decimal places, the correction leaves unchanged the value of $\sigma$, in this case). Using Sheppard's correction we get $\sigma^2 = 3.41 - 0.08 = 3.33$ and hence $\sigma = 1.82$. The standard error of the mean is therefore $1.82/400^{1/2} = 0.091$. We can write

$$\bar{x} = 3.57 \pm 0.09$$

The uncertainty in this value is $1/\sqrt{798}$, that is, about 3.5%, so that the standard error of the mean is $0.091(1 \pm 0.035)$.

31. Peters' formula

We have stated earlier in Section 27 that the mean (absolute) deviation, $\eta$, of a normal distribution equals $\sigma(2/\pi)^{1/2}$, that is, $4\sigma/5$ approximately. When we have to deal with a sample of N data drawn from a normal distribution, the mean deviation of the distribution is given by

$$\eta = \sum_{s=1}^{n} f_s|d_s| \Big/ \sqrt{N(N-1)}$$

Hence the standard deviation $\sigma$ of the distribution is estimated by

$$\sigma = \eta\sqrt{\tfrac{1}{2}\pi} = \sqrt{\frac{\pi}{2N(N-1)}} \times \sum f_s|d_s| = 1.25\,\sum f_s|d_s| \Big/ \sqrt{N(N-1)}$$

Also the standard error of the mean of the distribution is

$$\frac{\sigma}{N^{1/2}} = \frac{1}{N}\sqrt{\frac{\pi}{2(N-1)}} \times \sum f_s|d_s| = 1.25\,\sum f_s|d_s| \Big/ \left[N\sqrt{N-1}\right]$$

These are known as Peters' formulae. As they involve $|d_s|$ and not $d_s^2$, they are more readily computed than the formulae due to Bessel used in the previous paragraph.

If we take the example discussed in Section 18 we have from Table 5 that $\Sigma f_s|d_s| = 592$.
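Bessel's and Peters' estimates of $\sigma$ can be set side by side using only the sums quoted in the text for the data of Section 18. The following is a sketch in Python, which is not part of the original text; the variable names are illustrative, and $\sqrt{\pi/2}$ is used in full rather than the rounded factor 1.25, so the last figure may differ slightly from a hand calculation.

```python
import math

# Sums quoted in the text for the data of Section 18.
N = 400
sum_fd2_assumed = 1492   # sum of f*d^2 taken about the assumed mean 3
offset = 0.57            # sample mean 3.57 minus the assumed mean 3
sum_f_abs_d = 592        # sum of f*|d| quoted from Table 5

# Bessel: the sum of squares about the sample mean is divided by N - 1.
sum_sq_about_mean = sum_fd2_assumed - N * offset ** 2
sigma_bessel = math.sqrt(sum_sq_about_mean / (N - 1))

# Peters: sigma estimated from the absolute deviations.
sigma_peters = math.sqrt(math.pi / 2) * sum_f_abs_d / math.sqrt(N * (N - 1))

# Standard error of the mean from each estimate.
se_mean_bessel = sigma_bessel / math.sqrt(N)
se_mean_peters = sigma_peters / math.sqrt(N)

print(round(sigma_bessel, 2), round(sigma_peters, 2))
print(round(se_mean_bessel, 3), round(se_mean_peters, 3))
```

Neither estimate here includes Sheppard's correction for grouping, so both are to be compared with the uncorrected value of $\sigma$.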

Hence $\sigma = 1.25 \times 592/\sqrt{400 \times 399} = 1.85$ in agreement with the value obtained in Sections 18 and 30.

32. Fitting of a normal curve

It is possible to find the equation of the normal curve

$$y = \frac{N}{\sigma\sqrt{2\pi}}\,e^{-(x-\bar{x})^2/2\sigma^2}$$

which best fits a given frequency distribution by finding the mean $\bar{x}$ and the standard deviation $\sigma$ of the given distribution in the way explained above. The total area under this curve is N which may be taken equal to the total number of data, that is, the sum of the frequencies.

To test how well the curve fits the data it is useful to calculate the values of $y\,\delta x$ with $\delta x$ equal to the width of the classes in which the data are grouped and x equal to the mid-value of each class. These values can then be compared with the given frequencies.

As an example let us consider the data discussed above (Sections 18 and 30), of which the mean $\bar{x}$ was found to be 3.57 and the standard deviation $\sigma$ was 1.82. Since $N = 400$ we calculate

$$\frac{400}{1.82\sqrt{2\pi}}\,e^{-(x-3.57)^2/6.66}\,\delta x$$

for x = 0, 1, 2, 3, ..., 10 with $\delta x = 1$. The values obtained are given as f(calc.) in the following table which also shows the observed values f(obs.).

  x           0    1    2    3    4    5    6    7    8    9   10
  f(calc.)   13   33   60   84   85   65   36   15    5    1    0
  f(obs.)    10   40   72   85   78   55   32   18    7    2    1

These results are represented graphically in Fig. 14. The agreement seems reasonably good. We should, as always, not expect reasonable agreement unless the number of observations is large.

[Fig. 14. Histogram and the corresponding normal curve $y = \frac{N}{\sigma\sqrt{2\pi}}\,e^{-(x-\bar{x})^2/2\sigma^2}$.]

It is possible to test "the goodness of fit" by using a special technique, known as the $\chi^2$-test, which enables the significance of the departure of the data from the assumed type of distribution (in the above example, the normal distribution) to be assessed. We cannot go into this here: reference should be made to a text-book of statistics.

33. Other frequency distributions

Although we have restricted our considerations to the binomial, Poisson and normal distributions it should be emphasized that there are many others. Some of these, such as the triangular and the median distributions, have been included in the examples. Further, the binomial distribution is a special case of a more general multinomial distribution which has many forms. Also the normal distribution which is represented by

$$y = A\,e^{-(x-m)^2/2\sigma^2}$$

is such that

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x - m}{\sigma^2}$$

and is a special case of more general distributions for which

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x - a}{b_0 + b_1 x + b_2 x^2} \qquad (10)$$

These are known as Pearson's types.(6) Many of these other distributions are approximately realized in practice.

It might also be noted here that sometimes what is quite an asymmetrical frequency curve, referred to as skew, becomes
84.CHAP. n is the number of insulators with a breakdown voltage less than EkV.124. 68 0 3 7 15 26 55 78 92 96 99 100 Find the mean and standard deviation. Calculate from first principles the mean x and the standa deviation a of the two distributions given in question 1 Verify that x = n p and a2 = npq. and fit a normal curve to it. Rutherford and Geiger(5) using the scintillation method counted the number of ex-particlesemitted per unit of time by polonium.). 2 approximately a normal curve when the frequency is plotted agai the logarithm of the variable rather than against the variable itse Such a distribution is often called a lognormal distribution. Verify that th variance equals the mean. (b) q = p = t and n = 5. nOl / 5. 9. 8. Draw the normal cur and the histogram of the distribution. plotting such distributions it is convenient to use logarithmic pape having a logarithmic scale for the variable. Make a table of the values of the probability of getting 6 + x heads. Moreove the mean of the variable is given by the point on the straight li corresponding to the cumulative relative frequency of O· 5. Draw the corresponding histogram.Q.74. 2. Their results are given below. n EXERCISES 3 1. The results of certain measurements on the breakdown voltages of one hundred insulators are given below.104. Compare this with the observed values.134--144 Frequency 1 2 9 22 33 22 8 2 1 Draw the normal curve and the histogram of the distribution. however. E Number 0/ insulators. 6.64. Fit a normal curve to the following distribution of intelligence quotients (I. when 12 coins are tossed. Find the ratio "Jla for the data.Q. Find a and b. / is the number of times N ex-particles were observed. limits 54. 2 SOME STATISTICAL IDEAS SOME STATISTICAL IDEAS CHAP. Arithmetic probability paper is su that if the cmnulative relative frequency of a normal distribution plotted against the variable a straight line is obtained. 
whi] the standard deviation equals the projection on the variable ax of the part of the straight line between the points corresponding t cumulative relative frequencies of O:16 and O·5. I. 3. Draw the histograms of the binomial distributions correspon ing to (q + p)n when (a) q = p = t and n = 6. 4. Write 21 29 2345 22 110 120 130 140 150 160 170 180 190 200 210 12 5 1 down the binomial distribution corresponding (t + t)IO. and verify that the distribution is roughly normal. Logarlthml probability paper has. (Check your results by using arithmetic probability paper. Show that the following distribution is roughly of the Poisson type. and find its variance.) 69 . N 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 / 57 203 383 525 532 408 273 139 45 27 10 4 0 1 1 lO. a logarithmic scale for the variabl and consequently a lognormal distribution is represented by straight line. Show graphically that an equation of the form y = a e-bJl2 fits these values very closely if a and b are chosen suitably.94. Show that the mean number of ex-particles emitted is 3· 87 and find the Poisson distribution corresponding to this mean. 7. for values of x from 0 to 6. Find the frequencies of the Poisson distribution of which th mean is 2. s.114. Two types of what known as probability paper are also available for plotting no and lognormal distributions. Breakdown voltages. The number of stoppages on 400 consecutive shifts in an industrial undertaking were recorded as follows: Stoppages per shift 0 1 2 3 4 5 Number 0/ shifts 245 119 30 4 2 0 Show that the distribution is approximately of the Poisson type.

and find the mean value of the velocity-component u. Find approximately ~ 97 98 99 100 101 102 103 104 105 106 107 108 109 01 1 8 10 25 32 42 42 27 15 4 3 3 19. Plot the frequency curve for the distribution given by 1 2/[7T(X2 1)] from x = . 17.CHAP. 15. w parallel to the co-ordinate axes is (hm/1T)3/2 e-hmc2 ou ov ow Find the mean value of c and of 71 c2• . x 25 26 27 28 29 30 31 32 33 34 1 1 5 17 49 85 52 25 11 4 1 18. and find its variance. According to the kinetic theory the velocities of the molecules of a gas are such that the component velocities in anyone direction are distributed normally (Maxwellian distribution). 70 I + where c has components u. (b) Peters' forrnule. (b) between 800 an 1100 hours. lying between u and u + ou is A e-hmu2ou where A. The diameters of the spores of lycopodium can be found by aa interference method. centimetres. find the mean and its standard error. 2 11. Sketch the normal curve and the histogram of the data. Tests made on electrical contacts to examine how often the contacts failed to interrupt the circuit due to their welding together yielded the following results. Assuming the two sampl were taken from the same normal population. Use (a) Besse formula. and it is found th they have an average life of 950 burning hours with a standa deviation of 150 hours. Verify that the distribution is approximately normal. distribution is approximately Poissonian. 21. measured parallel to the axis of x. v. A sample of 20 data was found to have a mean of 80'1 wi standard deviation 2· 2. and (iv) 10'0 and 10·5. 2 SOME STATISTICAL IDEAS SOME STATISTICAL IDEAS CHAP. hand m are constants.1 to x = 1. 14. Number of welds per test 0 1 2 3 4 5 Frequency 10 13 13 8 4 2· Show that the distribution is roughly Poissonian. use the 60 da to estimate the mean of the population and its standard err. If the value of a quantity x and its standard error are given 10'23 ± 0'12. and (c) between 1100 and 1250 hours. 
Thus the probability that a molecule has a velocity-component. The results of an experiment are give below. What number of lamps may be expect. Find the standard deviation. The median law is defined by 1= (l/2a)e-1xlla for all valu of x. oltages Number tubes 12. (ii) 10'11 and 10. Sketch the frequency curve and show that the area und it is unity. Results of a count of warp breakages during particular lengths of cloth are given below. One thousand electric lamps are tested. what is the probability that the accurate val of x will lie between (i) 10· 11 and 10' 35. Using the result quoted in question 21.4' (iii) 10'11 and 10'31. 13. 22. Assuming that the following distribution is approxima normal. Warp breaks per length of cloth 0 1 2 Frequency 15 26 21 the weaving of Show that the 3 19 4 8 5 3 6 0 20.4. Show that A = (hm/1T)t. to have a life of (a) less than 650 hours. Another sample of 40 had a mean 81'4 with standard deviation 3 . k X diameter 14 15 16 17 18 19 20 21 22 Number 01 spores 1 1 8 24 48 58 35 16 8 Find the mean diameter of the spores and the standard erron Represent the results by a histogram and draw the correspon ing normal error curve. 16. where k = 5880 when the diameters are measured . show that the probability that a molecule has a velocity lying between c and c oc + standard deviation of this distribution by using the values of from 0 to 1 at intervals of 0·2. The results of examining 213 discharge tubes and noting the voltage associated with a particular operating characteristic are given below.

CHAPTER 3

THEORY OF ERRORS

"Everybody believes in the exponential law of errors, the experimenters because they think it can be proved by mathematics, and the mathematicians because they believe it has been established by observation."(7)

34. The normal or Gaussian law of error

The function

$y = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\bar{x})^2/2\sigma^2}$

which defines a normal frequency distribution is often called the Gaussian law of error. This "law" states that measurements of a given quantity which are subject to accidental errors are distributed normally about the mean of the observations. More precisely, the law infers that any set of measurements of a given quantity may be regarded as a sample taken from a very large population (the aggregate of all the observations that could be made if the instruments and time allowed) and that this population is normal.

The two parameters $\bar{x}$ and $\sigma$ characterizing this normal distribution can be estimated by calculating the mean and standard deviation of the given set of measurements. Indeed the mean of these measurements is the best estimate of the value of the measured quantity, whilst the standard deviation provides the best estimate of the accuracy so obtained. The value of the measured quantity is usually written as $\bar{x} \pm \alpha$ where $\alpha = \sigma/\sqrt{n}$ is the standard error of the mean and n is the number of measurements. (As we have mentioned earlier some writers use not the standard error but the probable error of the mean which is $0.6745\sigma/\sqrt{n}$.)

EXAMPLE

Certain values of the velocity of light obtained using different methods are given below in kilometres per second. Find the mean value and its standard error.

    Velocity      d      d^2
    299782        2        4
    299798       18      324
    299786        6       36
    299774       -6       36
    299771       -9       81
    299776       -4       16
    sum           7      497

A working mean of 299780 may be used. In the second column of the above table are given the values of the residuals d, relative to the working mean, and in the third column the squares, d^2.

Mean value = 299780 + 7/6 = 299781

$\sigma^2 = \frac{497 - 6(7/6)^2}{5} = 98$

therefore $\sigma = 9.9$

Standard error of the mean = $9.9/\sqrt{6} = 4$

therefore velocity of light = 299781 ± 4 km s$^{-1}$ (6 observations)

As there are only six values the uncertainty in the value of the standard error is considerable, namely, $1/\sqrt{10}$, as explained in Section 30, that is, 0.32 approximately, so that the standard error might be written as 4(1 ± 0.32).

35. Applicability of the normal law of error

Some groups of observational data in science satisfy closely the normal error law, but it is by no means universally true. Perhaps this is not surprising since any theoretical derivation of the law is based on special assumptions which may or may not correspond to observational conditions.

This so-called Gaussian law of error was first deduced theoretically by Laplace in 1783. He started from the assumptions that the deviation of any one of a set of observations from the mean of the set is the resultant of a large number of very small deviations due to independent causes, and that positive and negative deviations of the same size are equally probable. (Cases can be quoted, however, where a large number of independent causes, each giving rise to a small deviation, do not result in deviations from the mean conforming to the normal law.(8))
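The arithmetic of the velocity-of-light example can be checked with a short script. This is only a sketch of the working-mean method described in the text; the variable names are illustrative and not part of the original.

```python
import math

# Velocity-of-light values from the worked example, in km/s.
velocities = [299782, 299798, 299786, 299774, 299771, 299776]
working_mean = 299780  # the working mean used in the text

n = len(velocities)
d = [v - working_mean for v in velocities]   # residuals from the working mean
sum_d = sum(d)                               # 7 in the text
sum_d2 = sum(x * x for x in d)               # 497 in the text

mean = working_mean + sum_d / n
# Variance about the true mean, with n - 1 in the denominator.
variance = (sum_d2 - n * (sum_d / n) ** 2) / (n - 1)
sigma = math.sqrt(variance)
std_error = sigma / math.sqrt(n)
# Fractional uncertainty of the standard error, 1/sqrt(2(n - 1)) (Section 30).
rel_uncertainty = 1 / math.sqrt(2 * (n - 1))

print(f"mean = {mean:.0f} km/s, sigma = {sigma:.1f}, standard error = {std_error:.1f}")
print(f"standard error known to about {100 * rel_uncertainty:.0f} per cent")
```

With only six observations the last line is a reminder that the quoted standard error is itself uncertain by roughly a third.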

Subsequently, Gauss gave a proof based on the postulate that the most probable value of any number of equally good observations is their arithmetic mean.

Both Laplace and Gauss discussed examples of measurements which suggested a wide applicability of the normal law to the distribution of accidental errors and its truth was generally accepted. But more recently its validity has been challenged on various grounds. In 1901, Karl Pearson showed that some series of measurements which had been deemed to support the normal law showed significant departures from it. Later, with collaborators, he made six series of observations resembling those of the determination of right ascension and declination using a transit circle, and he found again substantial departures from the normal law. Jeffreys(9) re-examined Pearson's data and confirmed the main result. In addition he has shown that the data satisfy a law of Pearson's type VII.

It is now generally agreed that the normal law of error is not universally valid. Nevertheless it is used widely unless there is evidence to show that some quite different probability distribution applies. Jeffreys(6) concludes: "The normal law of error cannot be theoretically proved. Its justification is that in representing many types of observations it is apparently not far wrong, and is much more convenient to handle than others that might or do represent them better."

Of course the applicability of the normal or any other law can be examined using the $\chi^2$-test but the physicist, unlike the biologist, seldom uses this, perhaps because of the small number of observations with which he usually has to deal. However, certain large groups of physical measurements do exist and some have been carefully examined. It is worthy of note that even when the parent population is not normally distributed, the distribution of the means of samples of a given size is usually closer to a normal distribution than the population itself.

36. Normal error distributions

A much discussed example of observational data satisfying the normal error law was given originally by Bessel.(10) He provided the following data concerning the errors involved in measuring the right ascension of stars. In the first column are given the magnitude of the error of the observation in seconds of time, and in the second column the frequency of its occurrence in a total of 300 determinations. It will be noted that positive and negative errors are grouped together, and so we can only assume they are equally divided. Consequently the mean of the distribution is zero.

    Limits of error   Frequency f    f x^2    Calculated frequency
    0.0-0.1              114         0.285        103
    0.1-0.2               84         1.890         85
    0.2-0.3               53         3.312         57
    0.3-0.4               24         2.940         32
    0.4-0.5               14         2.835         15
    0.5-0.6                6         1.815          6
    0.6-0.7                3         1.268          2
    0.7-0.8                1         0.562          0
    0.8-0.9                1         0.722          0
    0.9-                   0         0              0
    sum                  300        15.629        300

The third column of the table gives the product of the frequency and the square of the deviation x. We measure x from the mean to the mid-point of each class, that is, 0.05, 0.15, ..., 0.95. Hence we have

$\sigma^2 = 15.629/299 = 0.05227$

Applying Sheppard's correction we get

$\sigma^2 = 0.05227 - (0.1)^2/12 = 0.05144$ or $\sigma = 0.227$

Hence the normal error function corresponding to the data is

$y = \frac{300}{0.227\sqrt{2\pi}} e^{-x^2/0.1029} = 527\, e^{-9.72x^2}$

In column four of the table above are given the calculated values of f = 0.2y for the values of x corresponding to the mid-points of each class. The factor 2 is necessary to include negative as well as positive values of x and the factor 0.1 because of the class width. The normal error curve and the histogram of the data are drawn in Fig. 15.
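The fit to Bessel's data can be reproduced step by step. The sketch below follows the procedure in the text (class mid-points, divisor N - 1, Sheppard's correction, then f = 0.2y); the variable and function names are illustrative assumptions.

```python
import math

# Bessel's grouped data: frequency of errors in classes of width 0.1 s,
# positive and negative errors taken together.
freq = [114, 84, 53, 24, 14, 6, 3, 1, 1, 0]
width = 0.1
mid = [width * (i + 0.5) for i in range(len(freq))]  # 0.05, 0.15, ..., 0.95

total = sum(freq)                                   # 300 determinations
sum_fx2 = sum(f * x * x for f, x in zip(freq, mid))

var_raw = sum_fx2 / (total - 1)
var = var_raw - width ** 2 / 12                     # Sheppard's correction
sigma = math.sqrt(var)

def normal_curve(x):
    """Normal error function scaled to the total number of observations."""
    return total / (sigma * math.sqrt(2 * math.pi)) * math.exp(-x * x / (2 * var))

# Factor 2 covers negative as well as positive x; factor `width` is the class width.
calculated = [2 * width * normal_curve(x) for x in mid]

print(f"sigma = {sigma:.3f}")
for x, f, c in zip(mid, freq, calculated):
    print(f"x = {x:.2f}  observed = {f:3d}  calculated = {c:6.1f}")
```

Printing the calculated values to one decimal place, rather than rounding to integers, makes it easier to see how close each class is to the observed frequency.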

Fig. 15. Histogram of Bessel's data and the corresponding normal curve.

It is clear that the agreement is very good, indeed a little too good. Hansmann(11) has, however, examined the data and shown that there are two types of Pearson curve that give even a better fit than the normal curve. Jeffreys(9) has also discussed the data and in particular examined how closely they follow the normal error law; he comments: "This agreement is therefore surprisingly good, because it suggests that this series has been selected because it gives an unusually good agreement with the law and that others that may have disagreed violently with it may have been suppressed. This danger of selection makes it undesirable to make use of published series to test the law, unless there is some definite reason to believe that no suppression has taken place."

There are other difficulties in attempting to test the law, at least for the physicist, for example, that observational data in physics are seldom sufficiently numerous. But even when the number of observations is large they are sometimes invalidated in various ways. Michelson, Pease and Pearson obtained 2885 values of the velocity of light from observations made on about 165 different occasions extending over about two years. The data are symmetrically distributed about the mean value of 299774 km s$^{-1}$ but the distribution of the residuals is not normal. Birge(12) points out that a very good fit can be obtained if one takes the sum of two normal error curves, with standard deviations of 5 and 15 km s$^{-1}$ respectively, which suggests, as one possibility, two groups of observations of unequal degrees of accuracy. He adds: "The point is that if a large assortment of observations are not of equal reliability their residuals cannot be expected to follow a normal error curve and I am more and more convinced that the deviations from such a curve found so commonly in large groups of physical measurements are due usually just to this cause."

Several attempts to provide suitable measurements have therefore been made. Besides those of Karl Pearson mentioned earlier, Birge(13) made a series of 500 cross-hair settings "on a very wide but symmetrical solar spectrum line, under conditions as favourable as possible to equal reliability for all observations." His results are shown in Fig. 16 where the abscissa unit is 0.001 mm and the ordinate y' represents the number of residuals of magnitude v. The smooth curve drawn in Fig. 16 corresponds to the normal error curve

$y = \frac{h}{\sqrt{\pi}} e^{-h^2x^2}$

with h, as calculated from the observations, equal to 0.1965 and y' = 500y. It is clear that over the whole range the observations follow the Gaussian curve very closely.

Fig. 16. The distribution of residuals for 500 measurements of a spectral line, compared with the Gaussian error curve evaluated by second moments. Abscissa represents residual v, in 0.001 mm units. Ordinate represents the number of residuals of magnitude v. (Reproduced from Physical Review.)

A similar series of observations, 1026 in number, have been made by Bond who viewed an illuminated slit using a travelling

microscope slightly out of focus. The conditions resembled those in the measurement of a spectrum line. Bond's data were originally given as an example of the normal error law, which they seemed to satisfy very well, but Jeffreys(9) subsequently examined them and found systematic departures from the normal law.

37. Standard error of a sum or difference

We discussed in Chapter 1 (Section 7) the resultant error in a quantity which is a function of a number of quantities all subject to experimental error. It is now necessary to consider the estimation of the standard error of any quantity when the standard errors of the quantities on which it depends are known. For instance, suppose a quantity is a function of two measured quantities x and y, say, z = f(x, y). If the quantities x are distributed about their mean $\bar{x}$ with a standard error $\alpha$ and the quantities y are distributed about their mean $\bar{y}$ with a standard error $\beta$, how are the quantities z, obtained by evaluating f(x, y) for the measured values of x and y, distributed?

To answer this question we use an important property of the normal error law, sometimes referred to as its reproductive property. It may be stated as follows: Any deviation, which can be expressed as the sum of a set of deviations each of which satisfies the normal law, itself satisfies the normal law.

Suppose a deviation D can be written as a linear function of independent deviations $d_1, d_2, \ldots, d_n$ in the form

$D = k_1d_1 + k_2d_2 + \ldots + k_nd_n$

where $k_1, k_2, \ldots, k_n$ are constants. Suppose further that each of the deviations $d_i$ satisfies the normal law so that the probability that $d_i$ lies between x and x + δx is given by

$\frac{1}{\sigma_i\sqrt{2\pi}} e^{-x^2/2\sigma_i^2}\,\delta x$

Then it can be shown(14) that the probability that D lies between x and x + δx is given by

$\frac{1}{\sigma\sqrt{2\pi}} e^{-x^2/2\sigma^2}\,\delta x$

where

$\sigma^2 = k_1^2\sigma_1^2 + k_2^2\sigma_2^2 + \ldots + k_n^2\sigma_n^2$    (11)

In other words, D satisfies a normal law determined by the standard deviation $\sigma$. Certain particular cases are specially important.

(i) If $D = d_1 + d_2$ we get $\sigma^2 = \sigma_1^2 + \sigma_2^2$. Also if $D = d_1 - d_2$, then again $\sigma^2 = \sigma_1^2 + \sigma_2^2$. We note that $\sigma$ is greater than either $\sigma_1$ or $\sigma_2$ but less than $\sigma_1 + \sigma_2$. More generally, if $D = d_1 + d_2 + \ldots + d_n$ we get $\sigma^2 = \sigma_1^2 + \sigma_2^2 + \ldots + \sigma_n^2$. If $\sigma_1 = \sigma_2 = \ldots = \sigma_n$ then $\sigma = \sigma_1\sqrt{n}$. But if $d_1 = d_2 = \ldots = d_n$ the deviations are not independent and so the above result does not hold. However, in this case $D = nd_1$, and hence $\sigma = n\sigma_1$.

(ii) If D is the arithmetic mean of $d_1, d_2, \ldots, d_n$ so that $D = (d_1 + d_2 + \ldots + d_n)/n$ we get $\sigma^2 = (\sigma_1^2 + \sigma_2^2 + \ldots + \sigma_n^2)/n^2$. When $\sigma_1 = \sigma_2 = \ldots = \sigma_n$ this reduces to $\sigma^2 = \sigma_1^2/n$ or $\sigma = \sigma_1/\sqrt{n}$, which accounts for the value of the standard error of the mean quoted in Section 29.

(iii) It follows from cases (i) and (ii) that if a set of $n_1$ data has standard deviation $\sigma_1$, and another set of $n_2$ data has standard deviation $\sigma_2$, then the standard error of the sum or difference of the means of the two sets of data is

$\left(\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)^{1/2}$

It follows that if two quantities are given as $a \pm \alpha$ and $b \pm \beta$ where $\alpha$ and $\beta$ are the standard errors of a and b respectively, the sum of the two quantities is $a + b \pm \sqrt{\alpha^2 + \beta^2}$ and the difference of the two quantities is $a - b \pm \sqrt{\alpha^2 + \beta^2}$. We have verified these results earlier (cf. Section 21) in a few simple cases.

EXAMPLE 1

A sample of 250 observations has a standard deviation of 3 and another sample of 400 observations of the same quantity has a standard deviation of 5. Calculate the standard error of the mean

of each sample and the standard error of the difference of the means.

The standard error of the mean of the first sample is $3/\sqrt{250} = 0.19$

The standard error of the mean of the second sample is $5/\sqrt{400} = 0.25$

The standard error of the difference of the two means

$= \sqrt{\left(\frac{3}{\sqrt{250}}\right)^2 + \left(\frac{5}{\sqrt{400}}\right)^2} = 0.31$

Thus if the two means have values $m_1$ and $m_2$ the difference of the means can be written as

$m_1 - m_2 \pm 0.31$

The ratio $|m_1 - m_2|/0.31$ is obviously of some practical importance. From Table 6 (Section 26) we note that if $|m_1 - m_2| > 3(0.31)$ it may be concluded that the two samples are not likely to belong to the same population; the probability that they so belong is less than 0.3%.

EXAMPLE 2

Determinations of e/m for the electron by two different groups of methods give the following results: 1.75880 ± 0.00042 and 1.75959 ± 0.00036, the unit being $10^{11}$ C kg$^{-1}$. Do the two groups of measurement differ systematically?

The difference between the two mean values is

$10^{-5}[79 \pm \sqrt{42^2 + 36^2}]$

that is, 0.00079 ± 0.00055. Thus the difference of the means is about 1.4 times its standard error, and so the existence of a systematic error is not ruled out. However, Birge comments: "In view of the many serious sources of systematic error that have been revealed, from time to time, in almost every method of determining e/m, I think that the small remaining discrepancy should not be taken too seriously."

It might be added here that in trying to assess the significance of the difference of the means of two samples, especially small samples, use may be made of what is usually called the t-distribution. We have stated earlier (Section 29) that the mean $\bar{x}$ of a sample of n taken from a normal population, of mean m and standard deviation $\sigma$, is such that the deviation $\bar{x} - m$ is distributed normally with mean zero and standard deviation $\sigma/\sqrt{n}$. This implies that the ratio $(\bar{x} - m) : \sigma/\sqrt{n}$ is distributed normally about zero with unit standard deviation. However, when the standard deviation $\sigma$ is estimated from a given sample of n as explained in Section 30, the distribution of the ratio $t = (\bar{x} - m)\sqrt{n}/\sigma$ is not normal, but it has been evaluated and is given in many books(15) on statistics. From the table of t can be obtained, for any value of n, the probability that a value of t will be exceeded in random sampling.

38. Standard error of a product

Suppose z = xy where x and y are measured quantities of which the means are $\bar{x}$ and $\bar{y}$ respectively. Suppose the standard error of x is $\alpha$ and of y is $\beta$. If we write $x = \bar{x} + x'$ and $y = \bar{y} + y'$ the mean of x' is zero and its standard error is $\alpha$, whilst the mean of y' is zero and its standard error is $\beta$. We have

$z = (\bar{x} + x')(\bar{y} + y') = \bar{x}\bar{y} + \bar{x}y' + \bar{y}x' + x'y'$

The mean of z, denoted by $\bar{z}$, is therefore $\bar{x}\bar{y}$. Further, the deviation of z from $\bar{x}\bar{y}$ is $\bar{x}y' + \bar{y}x'$ approximately, and hence using equation (11), Section 37, the standard error of z is $\gamma$ where

$\gamma^2 = (\bar{x}\beta)^2 + (\bar{y}\alpha)^2$

We can write

$z = \bar{x}\bar{y} \pm \gamma$

and we note that

$(\gamma/\bar{x}\bar{y})^2 = (\alpha/\bar{x})^2 + (\beta/\bar{y})^2$

$\alpha/\bar{x}$ is what might be called the fractional standard error of x, so that the above result indicates that the square of the fractional standard error of the product equals the sum of the squares of the fractional standard errors of the factors forming the product.
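The rules of Sections 37 and 38 reduce to a few lines of arithmetic. The sketch below reproduces Examples 1 and 2 and checks the fractional form of the product rule; the function names are illustrative, and the values used for the product check (4.0 ± 0.1 and 2.5 ± 0.05) are hypothetical, chosen only to exercise the formula.

```python
import math

def se_of_mean(sigma, n):
    """Standard error of the mean of n observations with standard deviation sigma."""
    return sigma / math.sqrt(n)

def se_sum_or_difference(alpha, beta):
    """Standard error of a + b or a - b for independent a, b."""
    return math.hypot(alpha, beta)

def se_product(x, alpha, y, beta):
    """Standard error of the product xy: gamma^2 = (x beta)^2 + (y alpha)^2."""
    return math.hypot(x * beta, y * alpha)

# Example 1: samples of 250 (sigma = 3) and 400 (sigma = 5).
se1 = se_of_mean(3, 250)
se2 = se_of_mean(5, 400)
se_diff = se_sum_or_difference(se1, se2)
print(f"{se1:.2f}  {se2:.2f}  {se_diff:.2f}")

# Example 2: two determinations of e/m, in units of 10^11 C/kg.
diff = 1.75959 - 1.75880
se_em = se_sum_or_difference(0.00042, 0.00036)
ratio = diff / se_em
print(f"difference = {diff:.5f} +/- {se_em:.5f}, ratio = {ratio:.1f}")

# Hypothetical product: the fractional errors add in quadrature.
x, alpha, y, beta = 4.0, 0.1, 2.5, 0.05
gamma = se_product(x, alpha, y, beta)
fractional = math.hypot(alpha / x, beta / y)
```

Using `math.hypot` for the root-sum-of-squares keeps each rule to one line and avoids writing the squares out by hand.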

For example, suppose the sides of a rectangle are given as 20.00 ± 0.04 cm and 10.00 ± 0.03 cm. The area of the rectangle can be written as 200.00 ± $\gamma$ cm$^2$ where

$\left(\frac{\gamma}{200}\right)^2 = \left(\frac{0.04}{20}\right)^2 + \left(\frac{0.03}{10}\right)^2$

therefore $\gamma^2 = 0.16 + 0.36 = 0.52$

that is $\gamma = 0.72$

39. Standard error of a compound quantity

Collecting together and extending the results of the previous two paragraphs we have:

If a number of measured quantities have means $m_1, m_2, \ldots, m_n$ with standard errors $\alpha_1, \alpha_2, \ldots, \alpha_n$ respectively then the standard error of

(i) the sum $m_1 + m_2$ is $\sqrt{\alpha_1^2 + \alpha_2^2}$
(ii) the difference $m_1 - m_2$ is $\sqrt{\alpha_1^2 + \alpha_2^2}$
(iii) the multiple $km_1$ is $k\alpha_1$
(iv) the product $m_1m_2$ is $\sqrt{m_1^2\alpha_2^2 + m_2^2\alpha_1^2}$
(v) the product $m_1m_2m_3$ is $\alpha$ where $(\alpha/m_1m_2m_3)^2 = (\alpha_1/m_1)^2 + (\alpha_2/m_2)^2 + (\alpha_3/m_3)^2$
(vi) the power $m_1^p$ is $\alpha$ where $\alpha/m_1^p = p\alpha_1/m_1$ or $\alpha = (pm_1^{p-1})\alpha_1$
(vii) any function of $m_1, m_2, \ldots, m_n$, namely, $f(m_1, m_2, \ldots, m_n)$, is $\alpha$ where

$\alpha^2 = \left(\frac{\partial f}{\partial m_1}\right)^2\alpha_1^2 + \left(\frac{\partial f}{\partial m_2}\right)^2\alpha_2^2 + \ldots + \left(\frac{\partial f}{\partial m_n}\right)^2\alpha_n^2$

Of course (i) to (vi) are special cases of (vii).

EXAMPLE

The radius r of a cylinder is given as 2.1 ± 0.1 cm and the length l as 6.4 ± 0.2 cm. Find the volume of the cylinder and its standard error.

The volume V cm$^3$ of the cylinder is given by

$V = \pi(2.1)^2 6.4 \pm \alpha$

where

$\left(\frac{\alpha}{V}\right)^2 = \left(\frac{2\alpha_1}{r}\right)^2 + \left(\frac{\alpha_2}{l}\right)^2$

and $\alpha_1$ and $\alpha_2$ are the standard errors of r and l respectively

therefore

$\left(\frac{\alpha}{V}\right)^2 = \left(\frac{0.2}{2.1}\right)^2 + \left(\frac{0.2}{6.4}\right)^2 = \frac{4}{441} + \frac{1}{1024} = 0.0091 + 0.0010 = 0.0101$

that is $\alpha/V = 0.10$

Hence $V = \pi(2.1)^2 6.4(1 \pm 0.10) = 88.7 \pm 8.9$

EXAMPLE

The refractive index n of the glass of a prism is given by

$n = \sin\tfrac{1}{2}(A + D)/\sin\tfrac{1}{2}A$

where A is the angle of the prism and D is the angle of minimum deviation. The results of several measurements of A and D are given as A = 60° 5.2' ± 0.2' and D = 46° 36.6' ± 0.4'. Find the refractive index of the prism for the particular wavelength of light used.

Taking $\tfrac{1}{2}(A + D)$ = 53° 20.9' and $\tfrac{1}{2}A$ = 30° 2.6' we have n = 0.802280/0.500655 = 1.60246. To find the standard error we have

$\frac{\partial n}{\partial A} = \frac{\tfrac{1}{2}\sin\tfrac{1}{2}A\cos\tfrac{1}{2}(A + D) - \tfrac{1}{2}\sin\tfrac{1}{2}(A + D)\cos\tfrac{1}{2}A}{\sin^2\tfrac{1}{2}A} = -\frac{\sin\tfrac{1}{2}D}{2\sin^2\tfrac{1}{2}A} = -\frac{\sin 23° 18.3'}{1 - \cos 60° 5.2'} = -0.79$

-3'5 2 0 168 -3'0 12 0'5 148 -2'5 25 1·0 129 -2. Given that elm = (1'7592 ± 0'0005) x 10 It C kg.0 126 3·0 10 v = 0·4107 ± 0·0003 m. alb. using the relation h3 . where x denotes the deviation in seconds of time from a value near the mean of the observations and y denotes the number of observations having a deviation x. Wha are the limits of uncertainty in this standard error? 84 ' = me4/8rdJR~c. and EXERCISES 7. a. 36. 1//= l/v+ l/u.00064. Fifteen measurements of the surface tension of water have a mean value of 0·072 52 N rn" with a standard deviation of 0. 10. If m = 0·850 Hence the refractive index of the glass of the prism = 1·60246 ± 0·00008 6. 43. calculate/and u = 0·3513 ± 0·0002 m its standard error.39 (b) 40.40. 39. 3 Also oD - on _ 1: cos -!(A sin-!A + D) cos 53° 20·9' 2 sin 30° 2·6' = 0·597 = 0. If 1.a)la and log.60 1·00 But and O:A O:B = = ± 0·2' 0·4' IX = = ± 0·000058 radian ± ± 0·000 116 radian Hence the standard 0:2=(0'79 error of n is given by 10-4)2+(0'60 therefore 0: = 0·69 = ± 0·83 x 0·58 x x 10-8 X x 1·16 x 10-4)2 10-4 3. Assuming the two sets 0 measurements belong to the same normaf population. ± 0'08 and b = 2·54 ± 0'12 find a + b. Given that c = (2'99776 ± 0·000 04) x 108 m S-I Roo = 10973730 ± 5 rn>' m = (9·1091 ± 0·0005) x 10-3' kg e = (1·60210 ± 0·00052) x 10-19 C 12 Eo = (8·854 18 ± 0·000 04) x 10kg " m" 3 s'e' find Planck's constant h. 41. 4 I ± 0'012 find m2. 85 . Another thirty had a mean value of 9·802 m·s-1 with a standard deviation of 0·022. 40. The refractive index n of the glass of a double convex lens is calculated from the formula 1//= (/1 . m3 and 11m.). 39. r z = 1-490 1. 38. = 0'312 ± 0·001 m. 38. find tb mean of the fifty measurements and its standard error. Plot the curve and draw the histogram of the observations on the same figure. The results of determinations of a physical quantity by two different methods were quoted as 15 + x units. 
Twenty measurements of the acceleration due to gravity were found to have a mean value of 9·811 m s -2 with a standard deviation of 0·014. 3 THEORY OF ERRORS THEORY OF ERRORS CHAP.2.CHAP.0 43 1'5 78 -1'5 74 2·5 33 -1. find 11. 37. 8. Is there evidence that the two sets of values differ systematically from one another? 5. 4. x y x y -0. If a = 1·16 (b . Find the standard error of the mean and its uncertainty. . If r.41. 42. 38. 40. 0·005 m and / = 0·501 ± 0·002 m. 37. Fit a normal error curve to these observations. 41.1 e = (J ·60203 ± 0·000 34) x 10-19 C calculate the electronic mass m. 39.5 150 3'5 2 9. where 100 x had the following values: (a) 37. ab. Given that the accepted value of the surface tension at the temperature of the laboratory is 0·073 05 N m -I show that there is likely to be a systematic error in the measurements.1)(1/r.Ilr. Certain observations of the right ascension of Polaris have been grouped(16) as follows.
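Rule (vii) of Section 39 lends itself to a numerical check. The sketch below estimates each partial derivative by a central difference instead of differentiating by hand, which is an assumption of this illustration and not the method used in the text, and applies it to the cylinder and prism examples of that section. The helper names `propagate` and `arcmin` are hypothetical.

```python
import math

def propagate(f, means, errors, rel_step=1e-6):
    """Standard error of f(m1, ..., mn) from rule (vii).

    Partial derivatives are approximated by central differences.
    """
    total = 0.0
    for i, (m, a) in enumerate(zip(means, errors)):
        h = rel_step * max(1.0, abs(m))
        up = list(means)
        down = list(means)
        up[i] = m + h
        down[i] = m - h
        dfdm = (f(*up) - f(*down)) / (2 * h)
        total += (dfdm * a) ** 2
    return math.sqrt(total)

# Cylinder: r = 2.1 +/- 0.1 cm, l = 6.4 +/- 0.2 cm.
def volume(r, l):
    return math.pi * r ** 2 * l

V = volume(2.1, 6.4)
alpha_V = propagate(volume, [2.1, 6.4], [0.1, 0.2])

# Prism: angles given in degrees and minutes of arc, converted to radians.
def arcmin(degrees, minutes):
    return math.radians(degrees + minutes / 60)

def refractive_index(A, D):
    return math.sin((A + D) / 2) / math.sin(A / 2)

A = arcmin(60, 5.2)
D = arcmin(46, 36.6)
n = refractive_index(A, D)
alpha_n = propagate(refractive_index, [A, D], [arcmin(0, 0.2), arcmin(0, 0.4)])

print(f"V = {V:.1f} +/- {alpha_V:.1f} cm^3")
print(f"n = {n:.5f} +/- {alpha_n:.5f}")
```

The errors on the angles must be in radians before they are multiplied by the derivatives, which is why the 0.2' and 0.4' are converted in the same way as the angles themselves.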

40. Method of least squares

The principle of least squares, which was first formulated by Legendre, may be expressed as follows: the most probable value of any observed quantity is such that the sum of the squares of the deviations of the observations from this value is least.

If $x_1, x_2, \ldots, x_n$ are observed values of any given quantity, then according to the principle of least squares the most probable value of this quantity is X chosen so that

$(x_1 - X)^2 + (x_2 - X)^2 + \ldots + (x_n - X)^2$

is least. Now if $\bar{x}$ is the mean of $x_1, x_2, \ldots, x_n$ so that $\Sigma x_s = n\bar{x}$, that is, $\Sigma(x_s - \bar{x}) = 0$ we have

$\Sigma(x_s - X)^2 = \Sigma[(x_s - \bar{x}) + (\bar{x} - X)]^2 = \Sigma(x_s - \bar{x})^2 + n(\bar{x} - X)^2$

which is clearly least when $X = \bar{x}$. Thus applying the principle of least squares we derive the result that the most probable value of a measured quantity is the arithmetic mean of the observations.

Both Laplace and Gauss using different techniques established the principle of least squares mathematically. We have mentioned earlier that Gauss, assuming that the mean of the observations is the most probable value of a measured quantity, deduced the normal error law. Conversely, we can use the normal error law to deduce that the most probable value is the mean. The "proof" proceeds as follows:

The probability of the occurrence of an observation $x_s$ is given by

$(h/\sqrt{\pi})\, e^{-h^2(x_s - X)^2}$

where X is the magnitude of the measured quantity, if we assume that all the observations belong to the same Gaussian population specified by the precision constant h. Hence the probability of the occurrence of the observations $x_1, x_2, \ldots, x_n$, being the product of the probabilities of the occurrence of each separately, equals

$(h/\sqrt{\pi})^n\, e^{-h^2\Sigma(x_s - X)^2}$

Thus the probability is a maximum if $\Sigma(x_s - X)^2$ is a minimum, in keeping with the principle of least squares. As we have shown above this occurs when X is the mean of $x_1, x_2, \ldots, x_n$.

41. Weighted mean

It may be, of course, that the measurements $x_1, x_2, \ldots, x_n$ occur with frequencies $f_1, f_2, \ldots, f_n$ respectively; then, applying the principle of least squares, the most probable value of the measured quantity is X, such that

$f_1(x_1 - X)^2 + f_2(x_2 - X)^2 + \ldots + f_n(x_n - X)^2$

is least. If now we write $\bar{x} = \Sigma f_sx_s/\Sigma f_s$, we have $\Sigma f_s(x_s - \bar{x}) = 0$ and hence

$\Sigma f_s(x_s - X)^2 = \Sigma f_s[(x_s - \bar{x}) + (\bar{x} - X)]^2 = \Sigma f_s(x_s - \bar{x})^2 + (\bar{x} - X)^2\Sigma f_s$

This is clearly least when $X = \bar{x}$. Thus the most probable value of the measured quantity is the weighted mean of the observations, that is,

$(f_1x_1 + f_2x_2 + \ldots + f_nx_n)/(f_1 + f_2 + \ldots + f_n)$

the mean of the observations each "weighted" according to the frequency of its occurrence.

In such an expression $f_s$ is known as the weight of the observation $x_s$. If all the weights are multiplied by the same constant factor the value of the weighted mean is unchanged. In the above derivation the weight equals the frequency of occurrence of the observation, but other significances are given to the weight often depending upon the personal assessment of the observer. Sometimes the weight attributed to any observation is determined, as explained later, by the accuracy of the observation.

More generally, the observations $x_1, x_2, \ldots, x_n$ may belong to different Gaussian populations, that is, the accuracy of measurement may be different in each case. We can then write the probability of the observation $x_s$ as

$(h_s/\sqrt{\pi})\, e^{-h_s^2(x_s - X)^2}$

and hence the probability of the occurrence of the observations $x_1, x_2, \ldots, x_n$ as

$h_1h_2\ldots h_n\,\pi^{-n/2}\, e^{-\Sigma h_s^2(x_s - X)^2}$

This is greatest when $\Sigma h_s^2(x_s - X)^2$ is least, which occurs when X is given by the weighted mean

$\Sigma h_s^2x_s/\Sigma h_s^2$

If we write $h_s^2 = 1/2\sigma_s^2$, as quoted in Section 25, the most probable value of a measured quantity deducible from the observations $x_1, x_2, \ldots, x_n$ is given by

$\Sigma(x_s/\sigma_s^2)/\Sigma(1/\sigma_s^2)$

This reduces, of course, to the arithmetic mean if $\sigma_1 = \sigma_2 = \ldots = \sigma_n$.

Combining this result with that obtained above, if the observations $x_1, x_2, \ldots, x_n$ occur with frequencies $f_1, f_2, \ldots, f_n$, then the most probable value of the measured quantity is

$\Sigma h_s^2f_sx_s/\Sigma h_s^2f_s$ or $\Sigma(f_sx_s/\sigma_s^2)/\Sigma(f_s/\sigma_s^2)$

or if $x_s$ is the mean of $f_s$ observations,

$\Sigma(x_s/\alpha_s^2)/\Sigma(1/\alpha_s^2)$

where $\alpha_s = \sigma_s/\sqrt{f_s}$ and represents the standard error of the mean (cf. Section 29) corresponding to the observations $x_s$. Thus each observation $x_s$ can be given a weight proportional to the reciprocal of the square of its standard error. Alternatively, since the probable error is a constant multiple of the standard error a weight proportional to the reciprocal of the square of the probable error may be used.

42. Standard error of weighted mean

Quite generally, it is clear that if the observations $x_1, x_2, \ldots, x_n$ are given the weights $w_1, w_2, \ldots, w_n$ respectively the weighted mean is

$\bar{x} = \Sigma w_sx_s/\Sigma w_s$

As proved earlier, the sum $\Sigma w_s(x_s - X)^2$ is least when X equals this weighted mean. The proof is exactly as given above.

To estimate the accuracy of the weighted mean we find the quantity $\sigma^2$ given by

$\sigma^2 = \frac{\Sigma w_s(x_s - \bar{x})^2}{n - 1}$

This is the variance with which the quantities $\sqrt{w_s}(x_s - \bar{x})$ are distributed about zero. It can be shown that the standard error of the weighted mean is given by $\sigma/(\Sigma w_s)^{1/2}$, that is,

$\left[\frac{\Sigma w_s(x_s - \bar{x})^2}{(n - 1)\Sigma w_s}\right]^{1/2}$

If all the weights are equal this reduces to

$\left[\frac{\Sigma(x_s - \bar{x})^2}{(n - 1)n}\right]^{1/2}$

in keeping with the usual formula (Section 30).

EXAMPLE 1

Find the most probable value of a quantity of which the following are observed values: 9.4, 9.5, 9.6, 9.7, 9.2, 9.4, 9.3, 9.5, 9.3, 9.4.

Treating the observations as of equal weight, the most probable value is given by the arithmetic mean, which equals 9 + (1/10)(4.3) = 9.43.

To find the standard error of the mean we calculate

$\sigma^2 = \Sigma(x_s - \bar{x})^2/(n - 1)$

Using 9.4 as the mean we get $\sigma^2$ = 0.28/9 = 0.031. If we used 9.43 as the mean we would get $\sigma^2$ = 0.031 - (0.03)$^2$ = 0.030.

The standard error of the mean = $\sqrt{0.030}/\sqrt{10}$ = 0.055.

The mean is therefore 9.43 ± 0.06, but in view of the accuracy of the given values one might be inclined to write the mean as 9.4 ± 0.1. This would be misleading, however, especially if the number of observations were not quoted. It has to be remembered that the accuracy of the standard error is not very high when the number of observations is small. For as we have stated earlier (Section 30) the standard error of the standard deviation is $\sigma/\sqrt{2(n - 1)}$, so that the standard error of the mean, $\sigma/\sqrt{n}$, is sometimes written as

$\frac{\sigma}{\sqrt{n}}\left\{1 \pm \frac{1}{\sqrt{2(n - 1)}}\right\}$

which in this case equals 0.055(1 ± 0.24).
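The factor $1/\sqrt{2(n - 1)}$ is easy to tabulate, and doing so shows how slowly the standard error itself becomes reliable. The sketch below is illustrative only; the function names and the output format are assumptions, modelled on the way results are quoted in the text.

```python
import math

def relative_uncertainty_of_se(n):
    """Fractional uncertainty of a standard error estimated from n observations."""
    return 1 / math.sqrt(2 * (n - 1))

def format_mean(mean, se, n):
    """Quote a mean with its standard error and the uncertainty of that error."""
    u = relative_uncertainty_of_se(n)
    return f"{mean:.3f} ± {se:.3f}(1 ± {u:.2f}) ({n} observations)"

print(format_mean(9.43, 0.055, 10))

# How the uncertainty of the standard error falls as n grows.
for n in (6, 10, 20, 50, 200):
    print(n, round(relative_uncertainty_of_se(n), 2))
```

Even with 50 observations the standard error is known only to about 10 per cent, which supports quoting the number of observations alongside the result.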

It is therefore usually recommended that in computations of this kind two doubtful figures should be retained and the standard (or probable) error expressed to two significant figures. Accordingly the above result should be written 9.430 ± 0.055 (10 observations) or 9.430 ± 0.055(1 ± 0.24). Jeffreys has written "if anybody wants to reduce a good set of observations to meaninglessness he can hardly do better than to round the uncertainty to one figure and suppress the number of observations."

EXAMPLE 2

The observations $x_s$ and their weights $w_s$ are given below. Find the most probable value of the measured quantity and the standard error.

    x_s      w_s    w_s(x_s - 10)    d_s      d_s^2     w_s d_s^2
    10.25     1        0.25         -0.06    0.0036     0.0036
    10.32     2        0.64          0.01    0.0001     0.0002
    10.43     4        1.72          0.12    0.0144     0.0576
    10.27     3        0.81         -0.04    0.0016     0.0048
    10.16     2        0.32         -0.15    0.0225     0.0450
    sum      12        3.74                             0.1112

The most probable value is the weighted mean given by $\bar{x} = \Sigma w_sx_s/\Sigma w_s$, that is, 10 + 3.74/12 = 10.312. Hence from the above table, taking 10.31 as the mean so that $d_s = x_s - 10.31$, we have

$\sigma^2$ = 0.1112/4 = 0.0278

therefore $\sigma$ = 0.167

and the standard error of the mean = $0.167/\sqrt{12}$ = 0.048.

We can write the measured quantity as 10.31 ± 0.05 or preferably 10.312 ± 0.048 (5 observations).

EXAMPLE 3

Values of the ratio e/m for the electron as determined by different methods are given below along with the probable error of each value. Find the most probable value of e/m and the standard error.

    (e/m) x 10^-11     Probable error
    1.76110            10.0 x 10^-4
    1.75900             9.0 x 10^-4
    1.75982             4.0 x 10^-4
    1.75820            13.0 x 10^-4
    1.75870             8.0 x 10^-4

We will take the weight of each determination as inversely proportional to the square of the probable error; these values $w_s$ are given in column 1 below. In column 2 are the values of $x_s$, where $x_s = (e/m) \times 10^{-11} - 1.75800$, corresponding to a working mean of 1.75800. The necessary working is given below.

    w_s    x_s x 10^5   w_s x_s x 10^5   d_s x 10^5   d_s^2 x 10^8   w_s d_s^2 x 10^8
     1.0      310           310             151           228             228
     1.2      100           120             -59            35              42
     6.3      182          1147              23             5              32
     0.6       20            12            -139           193             116
     1.6       70           112             -89            79             126
    sum 10.7               1701                                            544

Hence omitting the factor $10^{11}$ we have:

weighted mean = 1.75800 + 0.01701/10.7 = 1.75959

Therefore, writing $d_s = (e/m) \times 10^{-11} - 1.75959$ we get as shown above

$\sigma^2$ = 0.00000544/4 = 0.00000136

Standard error of the mean = $\sqrt{0.00000136/10.7}$ = 0.00036

Hence e/m = (1.75959 ± 0.00036) × $10^{11}$ C kg$^{-1}$.

43. Internal and external consistency

If the weight $w_s$ assigned to the observation $x_s$ is proportional to $1/\alpha_s^2$, where $\alpha_s$ is the standard error of $x_s$, the standard error $\alpha$ of the weighted mean $\bar{x}$ may be found as explained in Section 42, and is given by

$\alpha^2 = \frac{\sum_{s=1}^{n} w_sd_s^2}{(n - 1)\sum_{s=1}^{n} w_s}$

where $d_s = x_s - \bar{x}$, and n is the number of observations.

It is given by

$$\alpha^2 = \frac{\sum (x_s - \bar{x})^2/\alpha_s^2}{(n - 1)\sum 1/\alpha_s^2} \qquad (12)$$

But it is possible to derive another expression for the standard error as follows. We have $\bar{x} = \sum k_s x_s$ where $k_s = w_s/\sum w_s$, and hence, using equation (11) of Section 37, the standard error of $\bar{x}$ is $\alpha$ where

$$\alpha^2 = \sum k_s^2 \alpha_s^2 = \sum w_s^2 \alpha_s^2/(\sum w_s)^2.$$

If $w_s$ is proportional to $1/\alpha_s^2$ this reduces to

$$\alpha^2 = \frac{\sum 1/\alpha_s^2}{(\sum 1/\alpha_s^2)^2} = \frac{1}{\sum 1/\alpha_s^2} \qquad (13)$$

Now expressions (12) and (13) are apparently quite different. We note that expression (13) depends entirely on the standard errors of the separate observations, whereas expression (12) depends also upon the differences between the observations. The latter is therefore a function of what might be called, following Birge, the "external consistency" of the observations, whereas expression (13) depends upon the "internal" consistency. We shall denote the standard errors obtained using expressions (12) and (13) by $\alpha_e$ and $\alpha_i$ respectively. We note that

$$Z = \frac{\alpha_e}{\alpha_i} = \left[\frac{\sum (x_s - \bar{x})^2/\alpha_s^2}{n - 1}\right]^{1/2}.$$

For samples taken from an infinite normal population it may be shown that $\alpha_e$ and $\alpha_i$ are equal, or more strictly, that the ratio $\alpha_e/\alpha_i$ is unity with standard error $1/\sqrt{2(n - 1)}$.

In practice it will be found that $Z$ does not equal unity. This is not surprising, for it depends on the values of $\alpha_s$, some or all of which may be considerably inaccurate, especially if they have been calculated from few observations. Further, $Z$ depends as well on the values of $(x_s - \bar{x})^2$ and so will be affected by any systematic errors present. If, however, this ratio is found not to differ from unity significantly, having in mind the uncertainties in the standard errors used, the observations as a whole may be regarded as consistent. In such a case it is perhaps safer to choose the larger of the values $\alpha_e$ and $\alpha_i$ as the standard error of the weighted mean.

If, on the other hand, $Z$ differs from unity by an amount much greater than is to be expected on the basis of statistical fluctuations, it may be concluded that systematic errors are likely to be present. In such a situation the weights based on the standard errors should be discarded and replaced by others. The assignment of new weights must be somewhat arbitrary, depending inevitably upon the judgment of the individual concerned and his knowledge of the experimental conditions.

In Example 3 above we have shown that equation (12) leads to the result $\alpha_e = 0.00036$. If we use equation (13) we have

$$\sum 1/\alpha_s^2 = r^2 \times 10^8 (0.010 + 0.012 + 0.063 + 0.006 + 0.016) = r^2 \times 10^8 \times 0.107$$

where $r = 0.6745$, the ratio of the probable error to the standard error. Hence

$$\alpha_i = r^{-1} \times 10^{-4}/\sqrt{0.107} = 0.00045.$$

Consequently $Z = 36/45 = 0.8$, from which we may conclude that the values are consistent, for with n = 10 the uncertainty in $Z$ is about 0.24. We might therefore safely take $\alpha$ equal to 0.00045, that is,

$e/m = (1.75959 \pm 0.00045) \times 10^{7}$ e.m.u. g$^{-1}$.

EXERCISES 5

1. Show that the mean of $p$ equally good observations has a weight $p$ times that of one of them.

2. The refractive index of a glass prism was found successively to be 1.53, 1.57, 1.54, 1.54, 1.50, 1.51, 1.55, 1.54, 1.56 and 1.53. Find the mean and its standard error if the observations are given (a) equal weights, (b) weights 1, …, 1 respectively.

3. Find the area under the curve $y = x^2$ from $x = 0$ to $x = 8$ by calculating $y$ when $x$ = 0, 2, 4, 6, 8 and giving these ordinates (i) equal weights, (ii) weights 1, 2, 2, 2, 1 respectively, and (iii) weights 1, 4, 2, 4, 1 respectively.

4. Ten observations of a quantity have a mean of 9.52 and standard error 0.08; another 20 observations of the same quantity have a mean of 9.49 and standard error 0.05. Find the mean and standard error of the 30 observations using weights (i) proportional to the number of observations, and (ii) inversely proportional to the square of the standard errors.

5. Test the following results for external and internal consistency:
(a) 10 ± 1, 11 ± 2, 12 ± 1 and 15 ± 2.
(b) 10.1 ± 1.2, 11.4 ± 2.4, 12.2 ± 1.3 and 14.9 ± 1.7.

6. Two determinations of the Faraday constant are: 96497.6 ± 6.… and 96506.6 ± 7.7 C mol$^{-1}$. Find the weighted mean of the values and its standard error.

7. Two groups of determinations of $e/m$ for the electron give the following values: 1.75880 ± 0.00042 and 1.75959 ± 0.0003, the unit being $10^{11}$ C kg$^{-1}$. Find the difference of these two values and its standard error. Find also the weighted mean of the two values and its standard error.

8. Four determinations of Planck's constant $h$ were rounded off and given with their probable errors as

6.557 ± 0.006, 6.554 ± 0.007, 6.546 ± 0.010 and 6.544 ± 0.009,

the unit being $10^{-34}$ J s. Find the most probable value of $h$ and its standard error. Test for internal and external consistency. Repeat the calculations using the original values

6.5568 ± 0.0063, 6.5539 ± 0.0072, 6.5464 ± 0.0095 and 6.5443 ± 0.0091.

9. A precision method of finding the diffusivity $p$ of nickel resulted in the following values with their probable errors:

  0.0042184 ± 0.0000021 s^-1
      42068 ±        78
      42213 ±        18
      42093 ±        35
      42281 ±        57
      42148 ±        71
      42135 ±        30

Find the most probable value of $p$ and its standard error.

10. Values of the velocity of light in kilometres per second as corrected by Birge(12) are given below:

  Author                          Epoch     Corrected result   Probable error
  Rosa-Dorsey                     1906.0        299784              10
  Mercier                         1923.0        299782              30
  Michelson                       1926.5        299798              15
  Mittelstadt                     1928.0        299786              10
  Michelson, Pease and Pearson    1932.5        299774               4
  Anderson                        1936.8        299771              10
  Huttel                          1937.0        299771              10
  Anderson                        1940.0        299776               6

Find the weighted mean and its standard error. Test for internal and external consistency.

11. Twelve precision values of $e/m$ for the electron obtained by seven essentially different methods are given below, the unit being $10^{11}$ C kg$^{-1}$.

  No.    e/m        Probable error x 10^4
   1    1.75913          3.7
   2    1.74797          5.0
   3    1.75914          5.0
   4    1.75815          6.0
   5    1.76048          5.8
   6    1.75700          7.0
   7    1.76006          4.0
   8    1.76110         10.0
   9    1.75900          9.0
  10    1.75982          4.0
  11    1.75820         13.0
  12    1.75870          8.0

Find the weighted mean and standard error of (i) all twelve values, and (ii) of the first six (the "spectroscopic" values). Test for internal and external consistency in each case.

12. Given that the probability of the occurrence of the set of measurements $x_1, x_2, \ldots, x_n$ is proportional to

$$P = (h/\sqrt{\pi})^n\, e^{-h^2 \sum (x_s - \bar{x})^2}$$

where $\bar{x}$ is the arithmetic mean of the measurements, find the value of $h$ for which $P$ is a maximum.
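The weighted-mean working of Examples 2 and 3 of Section 42, and the consistency ratio $Z$ of Section 43, can be checked numerically. The following is only a minimal sketch in Python: the function name `weighted_mean` and the variable names are illustrative and do not come from the text, and the weights for Example 3 are taken as $1/\alpha_s^2$ with $\alpha_s$ equal to the probable error divided by 0.6745.

```python
import math

def weighted_mean(values, weights):
    """Weighted mean and its standard error sigma / sqrt(sum w),
    where sigma^2 = sum w (x - mean)^2 / (n - 1)."""
    n = len(values)
    sum_w = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / sum_w
    sigma2 = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / (n - 1)
    return mean, math.sqrt(sigma2 / sum_w)

# Example 2: observations with assigned weights.
x2 = [10.25, 10.32, 10.43, 10.27, 10.16]
w2 = [1, 2, 4, 3, 2]
mean2, se2 = weighted_mean(x2, w2)           # roughly 10.312 and 0.048

# Example 3: e/m values (unit 10^11 C/kg) with their probable errors.
R = 0.6745                                    # probable error / standard error
em = [1.76110, 1.75900, 1.75982, 1.75820, 1.75870]
pe = [10.0e-4, 9.0e-4, 4.0e-4, 13.0e-4, 8.0e-4]
weights = [(R / p) ** 2 for p in pe]          # w_s = 1 / alpha_s^2

mean3, alpha_e = weighted_mean(em, weights)   # expression (12), external
alpha_i = 1.0 / math.sqrt(sum(weights))       # expression (13), internal
z = alpha_e / alpha_i                         # consistency ratio Z

print(f"Example 2: {mean2:.3f} +/- {se2:.3f}")
print(f"Example 3: {mean3:.5f}, alpha_e = {alpha_e:.5f}, "
      f"alpha_i = {alpha_i:.5f}, Z = {z:.2f}")
```

Because the external error of expression (12) depends only on the ratios of the weights, the factor 0.6745 affects $\alpha_i$ and $Z$ but not $\alpha_e$. The same function may serve for the exercises above, provided probable errors are first converted to standard errors.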

44. Other applications of the method of least squares; solution of linear equations
Legendre applied the method of least squares to the following problem. Suppose we have a number of linear equations in the two variables $x$ and $y$ of the form

$$a_s x + b_s y = k_s$$

where $a_s$, $b_s$, $k_s$ are constants. Suppose there are $n$ equations where $n > 2$. Values of $x$ and $y$ can be found that satisfy any two of the equations. These values may not satisfy all the equations, that is, the equations may not be consistent. In this case the question arises: what are the values of $x$ and $y$ that satisfy all the equations as closely as possible?

The method of least squares can be applied to the solution of this problem. For, writing

$$a_s x + b_s y - k_s = e_s,$$

we can choose $x$ and $y$ such that the sum of the squares of the "errors" $e_s$ is least. Thus

$$\sum_{s=1}^{n} (a_s x + b_s y - k_s)^2$$

must be a minimum. Differentiating partially with respect to $x$ and $y$ we get as necessary conditions for a minimum:

$$\sum_{s=1}^{n} a_s(a_s x + b_s y - k_s) = 0$$
$$\sum_{s=1}^{n} b_s(a_s x + b_s y - k_s) = 0$$

From these two equations the values of $x$ and $y$ can be found. These are the "most probable" values. The above equations are usually written in the form

$$[aa]x + [ab]y - [ak] = 0$$
$$[ab]x + [bb]y - [bk] = 0$$

where $[ab]$ denotes the summation $\sum_{s=1}^{n} a_s b_s$. They are known as the normal equations. In determinantal notation we can write

$$\frac{x}{\begin{vmatrix} [ak] & [ab] \\ [bk] & [bb] \end{vmatrix}} = \frac{y}{\begin{vmatrix} [aa] & [ak] \\ [ab] & [bk] \end{vmatrix}} = \frac{1}{\begin{vmatrix} [aa] & [ab] \\ [ab] & [bb] \end{vmatrix}}$$

If different weights can be assigned to the given equations and $w_s$ is the weight of the equation $a_s x + b_s y = k_s$, then the normal equations become

$$[waa]x + [wab]y - [wak] = 0$$
$$[wab]x + [wbb]y - [wbk] = 0$$

It is clear from these equations that if each of the original equations is multiplied throughout by the square root of its weight the equations can be regarded as of equal weight.

EXAMPLE 1
Find the most probable values of $x$ and $y$ from the equations

2x + y = 5.1, x - y = 1.1, 4x - y = 7.2 and x + 4y = 5.9

We first construct the following table from the coefficients $a_s$, $b_s$, $k_s$ of the given equations. With more complicated examples certain arithmetical checks are introduced.

  a_s   b_s   k_s    a_s^2   a_s b_s   b_s^2   a_s k_s   b_s k_s
   2     1    5.1      4        2        1      10.2       5.1
   1    -1    1.1      1       -1        1       1.1      -1.1
   4    -1    7.2     16       -4        1      28.8      -7.2
   1     4    5.9      1        4       16       5.9      23.6
  sum 8  3   19.3     22        1       19      46.0      20.4

Thus the normal equations are

22x + y = 46.0
x + 19y = 20.4

Solving we get

$x = \dfrac{874.0 - 20.4}{418 - 1} = \dfrac{853.6}{417} = 2.047$ and $y = \dfrac{448.8 - 46.0}{417} = \dfrac{402.8}{417} = 0.966$.

The most probable values are x = 2.05 and y = 0.97.
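The formation and solution of the normal equations for two unknowns can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `solve_normal_equations` and its optional `weights` argument are assumptions made here, not notation from the text.

```python
def solve_normal_equations(rows, weights=None):
    """rows holds (a, b, k) for each equation a*x + b*y = k.
    Returns the sums ([aa], [ab], [bb], [ak], [bk]) and the
    least-squares values (x, y); weights default to unity."""
    if weights is None:
        weights = [1.0] * len(rows)
    aa = sum(w * a * a for w, (a, b, k) in zip(weights, rows))
    ab = sum(w * a * b for w, (a, b, k) in zip(weights, rows))
    bb = sum(w * b * b for w, (a, b, k) in zip(weights, rows))
    ak = sum(w * a * k for w, (a, b, k) in zip(weights, rows))
    bk = sum(w * b * k for w, (a, b, k) in zip(weights, rows))
    det = aa * bb - ab * ab            # determinant of the normal equations
    x = (ak * bb - ab * bk) / det
    y = (aa * bk - ab * ak) / det
    return (aa, ab, bb, ak, bk), (x, y)

# Example 1: 2x + y = 5.1, x - y = 1.1, 4x - y = 7.2, x + 4y = 5.9
equations = [(2, 1, 5.1), (1, -1, 1.1), (4, -1, 7.2), (1, 4, 5.9)]
sums, (x, y) = solve_normal_equations(equations)
print(sums)                            # the sums 22, 1, 19, 46.0, 20.4 (to rounding)
print(round(x, 3), round(y, 3))        # roughly 2.047 and 0.966
```

Passing a list of weights multiplies each term of the sums by $w_s$, which is equivalent to multiplying each equation by the square root of its weight before forming the unweighted sums.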

EXAMPLE 2
Find the most probable values of $x$ and $y$ from the equations

2x + y = 5.1, x - y = 1.1 and 4x - y = 7.2

given that these equations have weights 1, 3 and 2 respectively.

We construct the following table:

  w_s   a_s   b_s   k_s    w a^2   w ab   w b^2   w ak    w bk
   1     2     1    5.1      4       2      1     10.2     5.1
   3     1    -1    1.1      3      -3      3      3.3    -3.3
   2     4    -1    7.2     32      -8      2     57.6   -14.4
  sum                       39      -9      6     71.1   -12.6

Thus the normal equations are

39x - 9y = 71.1
-9x + 6y = -12.6

Solving we get x = 2.05 and y = 0.97. The most probable values are therefore x = 2.05 and y = 0.97.

EXAMPLE 3
The lengths AB, BC, CD, DE along a straight line are measured as 24.1, 35.8, 30.3 and 33.8 cm, but it is known that AD is accurately 90 cm and BE is 100 cm. Find the most probable values of AB, BC, CD, DE.

Let the lengths of AB, BC, CD, DE be $x_1, x_2, x_3, x_4$ cm respectively. Then writing

$x_1 = 24.1 + e_1$, $x_2 = 35.8 + e_2$, $x_3 = 30.3 + e_3$, $x_4 = 33.8 + e_4$

we have

$$e_1 + e_2 + e_3 = -0.2 \qquad (16)$$
$$e_2 + e_3 + e_4 = 0.1 \qquad (17)$$

We choose $e_1, e_2, e_3, e_4$ so that $e_1^2 + e_2^2 + e_3^2 + e_4^2$ is a minimum, therefore

$$e_1\,de_1 + e_2\,de_2 + e_3\,de_3 + e_4\,de_4 = 0 \qquad (18)$$

But from equations (16) and (17) we have

$$de_1 + de_2 + de_3 = 0 \qquad (19)$$
$$de_2 + de_3 + de_4 = 0 \qquad (20)$$

On multiplying equations (19) and (20) by the undetermined multipliers* $\lambda_1$ and $\lambda_2$ and adding we get

$$\lambda_1\,de_1 + (\lambda_1 + \lambda_2)\,de_2 + (\lambda_1 + \lambda_2)\,de_3 + \lambda_2\,de_4 = 0$$

Comparing this with equation (18) we have

$$\frac{e_1}{\lambda_1} = \frac{e_2}{\lambda_1 + \lambda_2} = \frac{e_3}{\lambda_1 + \lambda_2} = \frac{e_4}{\lambda_2}$$

so that $e_3 = e_2$ and $e_4 = e_2 - e_1$. Hence on substituting in equations (16) and (17) we get

$e_1 + 2e_2 = -0.2$
$3e_2 - e_1 = 0.1$

whence $e_2 = -0.02 = e_3$, $e_1 = -0.16$ and $e_4 = 0.14$. Hence the most probable values are $x_1 = 23.94$, $x_2 = 35.78$, $x_3 = 30.28$, $x_4 = 33.94$.

* In mathematics this method of solution is often referred to as the method of undetermined multipliers.

45. Solution of linear equations involving observed quantities
In Section 44 we considered the problem of finding the most probable values of $x$ and $y$ from $n$ (> 2) linear equations $a_s x + b_s y = k_s$. Now let us suppose that the constants $a_s$, $b_s$ are known accurately but the constants $k_s$ are subject to accidental errors of observation. Also let us suppose the equations have been multiplied by the square roots of their weights, so that the weight of each may be taken as unity.

Then, as before, the most probable values of $x$ and $y$ are solutions of the normal equations

$$[aa]x + [ab]y = [ak]$$
$$[ab]x + [bb]y = [bk]$$

Gauss(17) and others have discussed the problem of estimating the weights that may be assigned to the values of $x$ and $y$ so obtained. Denoting these values by $x_0$ and $y_0$ and writing

$$a_s x_0 + b_s y_0 - k_s = d_s$$

where $d_1, d_2, \ldots, d_n$ are the residuals when the most probable values $x_0$ and $y_0$ are substituted in the given equations, it has been shown that the standard error $\alpha$ to be expected in any expression $a_s x_0 + b_s y_0 - k_s$ is given by*

$$\alpha^2 = [dd]/(n - 2)$$

Further, if $\alpha_x$ and $\alpha_y$ denote the standard errors in $x$ and $y$ respectively, then it can be shown that

$$\frac{\alpha_x^2}{[bb]} = \frac{\alpha_y^2}{[aa]} = \frac{\alpha^2}{\Delta}$$

where $\Delta$ is the determinant

$$\begin{vmatrix} [aa] & [ab] \\ [ab] & [bb] \end{vmatrix}$$

* If there are $m$ unknowns $x, y, \ldots$ the denominator is $(n - m)$.

If, as an example, we consider the equations solved in Example 1, Section 44, namely,

2x + y = 5.1, x - y = 1.1, 4x - y = 7.2 and x + 4y = 5.9

the normal equations are

22x + y = 46.0
x + 19y = 20.4

leading to x = 2.05 and y = 0.97. The residuals are therefore -0.03, -0.02, 0.03 and 0.03, so that $[dd] = 0.0031$ and $\alpha^2 = 0.00155$. Hence

$$\frac{\alpha_x^2}{19} = \frac{\alpha_y^2}{22} = \frac{0.00155}{417}$$

therefore $\alpha_x = 0.008$ and $\alpha_y = 0.009$. We can write x = 2.05 ± 0.01 and y = 0.97 ± 0.01.

As a further example, let us suppose that by direct observation it has been found independently that

x = 1.00 ± 0.10 and y = 0.90 ± 0.07

and suppose that a third independent observation leads to

x + 2y = 3.00 ± 0.07

What are the most probable values of $x$ and $y$?

We will give the equations weights inversely proportional to the squares of their standard errors, that is, 1, 2, 2. So multiplying the equations by 1, $\sqrt{2}$, $\sqrt{2}$ respectively we get

$x = 1.00$, $y\sqrt{2} = 0.90\sqrt{2}$ and $x\sqrt{2} + 2y\sqrt{2} = 3.00\sqrt{2}$

each of which can now be regarded as having unit weight. We therefore have

  a_s       b_s        k_s          a_s^2   a_s b_s   b_s^2   a_s k_s   b_s k_s
  1         0          1.00           1        0        0      1.00       0
  0         sqrt(2)    0.90 sqrt(2)   0        0        2      0          1.80
  sqrt(2)   2 sqrt(2)  3.00 sqrt(2)   2        4        8      6.00      12.00
  sum                                 3        4       10      7.00      13.80

The normal equations are

3x + 4y = 7.00
4x + 10y = 13.80

leading to x = 1.057 and y = 0.957. Using x = 1.06 and y = 0.96 we get

$[dd] = 0.0036 + 0.0072 + 0.0008 = 0.0116$

Hence

$$\frac{\alpha_x^2}{10} = \frac{\alpha_y^2}{3} = \frac{0.0116}{14}$$

therefore $\alpha_x = 0.09$ and $\alpha_y = 0.05$. Hence

x = 1.06 ± 0.09 and y = 0.96 ± 0.05

46. Curve fitting
Another useful application of the method of least squares is the fitting of a curve, or a theoretical formula, to a set of experimental data.

Suppose $y_1, y_2, \ldots, y_n$ are the values of a measured quantity (or combination of measured quantities) corresponding to the values $x_1, x_2, \ldots, x_n$ of another quantity $x$. Let us assume, for simplicity, that there are experimental errors in the values of $y_s$ but not in the values of $x_s$. Conditions closely approaching these, where the errors in one of the variables may be neglected, often occur in practice. We assume too there exists a linear relation between $x$ and $y$, namely,

$$y = ax + b$$

On substituting $x = x_s$, the value of $ax_s + b$ will not in general equal $y_s$, that is, there will be an "error" of amount $ax_s + b - y_s$.

To obtain the linear relation (or the line) which best fits the data we choose $a$ and $b$ such that the sum of the squares of the "errors" is least, that is, $\sum (ax_s + b - y_s)^2$ is least.

The conditions obtained by differentiating partially with respect to $a$ and $b$ are

$$\sum x_s(ax_s + b - y_s) = 0 \quad \text{and} \quad \sum (ax_s + b - y_s) = 0$$

Hence

$$a[xx] + b[x] = [xy] \qquad (21)$$
$$a[x] + bn = [y] \qquad (22)$$

which give

$$a = \frac{n[xy] - [x][y]}{n[xx] - [x][x]} \qquad (23)$$

and

$$b = \frac{[y][xx] - [x][xy]}{n[xx] - [x][x]} \qquad (24)$$

We note that if $b = 0$ we have $y = ax$ where

$$a = \frac{[xy]}{[xx]} = \frac{[y]}{[x]} \qquad (25)$$

EXAMPLE
Fit a linear law to the values of $x$ and $y$ given in the following table, assuming the values of $x$ are accurate.

   x      y       xy      xx
   0     4.6      0        0
   1     7.1      7.1      1
   2     9.5     19.0      4
   3    11.5     34.5      9
   4    13.7     54.8     16
   5    15.9     79.5     25
   6    18.6    111.6     36
   7    20.9    146.3     49
   8    23.5    188.0     64
   9    25.4    228.6     81
  sum 45 150.7  869.4    285

If the relation is $y = ax + b$ then from equations (23) and (24) we have

$$a = \frac{8694 - 45 \times 150.7}{2850 - 45^2} = \frac{1912.5}{825} = 2.32$$

and

$$b = \frac{150.7 \times 285 - 45 \times 869.4}{825} = \frac{3826.5}{825} = 4.64$$

therefore

$$y = 2.32x + 4.64$$

It is instructive to plot the values of $x$ and $y$, and to draw the straight line that fits the points best.
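Equations (23) and (24) translate directly into code. The following Python sketch, in which the function name `fit_line` is merely illustrative, reproduces the sums of the table and the constants of the fitted line under the stated assumption that the values of $x$ are accurate.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = a*x + b from equations (23) and (24)."""
    n = len(xs)
    sx = sum(xs)                                   # [x]
    sy = sum(ys)                                   # [y]
    sxx = sum(x * x for x in xs)                   # [xx]
    sxy = sum(x * y for x, y in zip(xs, ys))       # [xy]
    denom = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / denom
    b = (sy * sxx - sx * sxy) / denom
    return a, b

xs = list(range(10))
ys = [4.6, 7.1, 9.5, 11.5, 13.7, 15.9, 18.6, 20.9, 23.5, 25.4]
a, b = fit_line(xs, ys)
print(f"y = {a:.2f}x + {b:.2f}")
```

The unrounded values are $a = 1912.5/825$ and $b = 3826.5/825$; rounding to two decimal places gives the line quoted in the example.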

47. Line of regression
It is of interest to note that the second of the normal equations, equation (22) above, can be written as

$$a\frac{[x]}{n} + b = \frac{[y]}{n}$$

This indicates that the point $([x]/n, [y]/n)$ lies on the line $y = ax + b$, that is, the line passes through the point $(\bar{x}, \bar{y})$ where $\bar{x}$ and $\bar{y}$ are the arithmetic means of the values of $x$ and $y$ respectively. In the example given above $\bar{x} = 4.5$ and $\bar{y} = 15.07$, and it can be verified that these co-ordinates satisfy very closely the equation $y = 2.32x + 4.64$.

It is therefore useful to write $x = \bar{x} + X$ and $y = \bar{y} + Y$, whence $Y = aX$ and hence from equation (25) above $a = [XY]/[XX]$. The relation between $x$ and $y$ can therefore be written

$$y - \bar{y} = \frac{[XY]}{[XX]}(x - \bar{x})$$

This is known as the line of regression of $y$ on $x$. If we write

$$[XX] = n\sigma_x^2, \quad [YY] = n\sigma_y^2 \quad \text{and} \quad r^2 = \frac{[XY]^2}{[XX][YY]}$$

we have

$$\frac{y - \bar{y}}{\sigma_y} = r\,\frac{x - \bar{x}}{\sigma_x} \quad \text{or} \quad \frac{Y}{\sigma_y} = r\,\frac{X}{\sigma_x}$$

The quantity $r$ is known as the coefficient of correlation of $x$ and $y$. We note that $r$ has the same sign as $[XY]$ and hence the same sign as the gradient $a$.

Of course, it may happen that both sets of quantities $x$ and $y$ are liable to experimental error, or indeed that the linear relation between them is only approximately true. In such a case we can proceed as follows. Assuming that the values of $x_s$ are accurate we can derive the line of regression of $y$ on $x$ in the form

$$\frac{y - \bar{y}}{\sigma_y} = r\,\frac{x - \bar{x}}{\sigma_x} \quad \text{or} \quad \frac{Y}{\sigma_y} = r\,\frac{X}{\sigma_x}$$

Secondly, we can assume that the values of $y_s$ are accurate and derive the line of regression of $x$ on $y$ in the form

$$\frac{x - \bar{x}}{\sigma_x} = r\,\frac{y - \bar{y}}{\sigma_y} \quad \text{or} \quad \frac{X}{\sigma_x} = r\,\frac{Y}{\sigma_y}$$

These equations are identical if $r = \pm 1$. In general $r$ is not equal to unity and the lines are not coincident; we choose the bisector of the acute angle between the lines as the line that best fits the data.

It can be shown that $r^2 \le 1$. When $r^2 = 1$ all the points $(x_s, y_s)$ lie on the coincident lines of regression and there is perfect correlation between $x$ and $y$. If $r = 0$ the two lines of regression are parallel to the axes of $x$ and $y$, that is, in general $x$ and $y$ are independent of one another.

For the example discussed above the value of $r$ may be found as follows, using $\bar{x} = 4.5$ and $\bar{y} = 15.1$.

   x      X       XX       y       Y        YY        XY
   0    -4.5    20.25     4.6    -10.5    110.25     47.25
   1    -3.5    12.25     7.1     -8.0     64.00     28.00
   2    -2.5     6.25     9.5     -5.6     31.36     14.00
   3    -1.5     2.25    11.5     -3.6     12.96      5.40
   4    -0.5     0.25    13.7     -1.4      1.96      0.70
   5     0.5     0.25    15.9      0.8      0.64      0.40
   6     1.5     2.25    18.6      3.5     12.25      5.25
   7     2.5     6.25    20.9      5.8     33.64     14.50
   8     3.5    12.25    23.5      8.4     70.56     29.40
   9     4.5    20.25    25.4     10.3    106.09     46.35
  sum 45   0    82.50   150.7     -0.3    443.71    191.25

Hence

$$r = 191.25/\sqrt{82.50 \times 443.71} = 0.999.$$

We note also

$$[XY]/[XX] = 191.25/82.50 = 2.32$$

so that the relation between $x$ and $y$ is

$$y - 15.1 = 2.32(x - 4.5) \quad \text{or} \quad y = 2.32x + 4.66$$

in keeping with the equation found earlier.

48. Accuracy of coefficients
It is clearly of some importance to be able to estimate the accuracy of the values of $a$ and $b$ derived by the method outlined above. This can be done by applying the results discussed in Section 45. For, using the values $y_1, y_2, \ldots, y_n$ and $x_1, x_2, \ldots, x_n$ and the most probable values of $a$ and $b$, we can calculate the residuals $d_s = ax_s + b - y_s$.

The mean square error $\alpha^2$ in the expressions $ax_s + b - y_s$ is then given by

$$\alpha^2 = [dd]/(n - 2)$$

Also, if $\alpha_a$ and $\alpha_b$ denote the standard errors in the values $a$ and $b$ we have, using the normal equations (21) and (22),

$$\frac{\alpha_a^2}{n} = \frac{\alpha_b^2}{[xx]} = \frac{\alpha^2}{\Delta}$$

where $\Delta$ is the determinant

$$\begin{vmatrix} [xx] & [x] \\ [x] & n \end{vmatrix} = n[xx] - [x]^2$$

In the example considered above the normal equations are

285a + 45b = 869.4
45a + 10b = 150.7

leading to a = 2.32 and b = 4.64. Hence we have

$$\frac{\alpha_a^2}{10} = \frac{\alpha_b^2}{285} = \frac{\alpha^2}{825}$$

where $\alpha^2 = [dd]/8$ and $[dd] = 0.3500$ as shown below.

   x      y       d        dd
   0     4.6     0.04    0.0016
   1     7.1    -0.14    0.0196
   2     9.5    -0.22    0.0484
   3    11.5     0.10    0.0100
   4    13.7     0.22    0.0484
   5    15.9     0.34    0.1156
   6    18.6    -0.04    0.0016
   7    20.9    -0.02    0.0004
   8    23.5    -0.30    0.0900
   9    25.4     0.12    0.0144
  sum 45 150.7           0.3500

Hence $\alpha_a = 0.023$ and $\alpha_b = 0.123$. We can therefore write

$$y = (2.32 \pm 0.02)x + 4.64 \pm 0.12$$

An alternative method has been given by Bond.(18) If we write the relation $y = ax + b$ in the form $Y = aX$ where $Y = y - \bar{y}$ and $X = x - \bar{x}$, we have shown above that $a = [XY]/[XX]$. Hence if the values of $x_s$ are accurate and the standard error of the values of $Y_s$ (and therefore of the values of $y_s$) is $\alpha$ we can write formally

$$a = [X(Y \pm \alpha)]/[XX]$$

which, using standard error (i) of Section 39, leads to

$$a = \frac{[XY]}{[XX]} \pm \frac{\alpha\sqrt{[XX]}}{[XX]} = \frac{[XY]}{[XX]} \pm \frac{\alpha}{\sqrt{[XX]}}$$

Therefore the standard error of $a$ is $\alpha_a = \alpha/\sqrt{[XX]}$. This result is in keeping with that given above since

$$[XX] = [xx] - n\bar{x}^2 = \frac{n[xx] - [x]^2}{n}$$

and hence

$$\alpha_a^2 = \frac{\alpha^2}{[XX]} = \frac{n\alpha^2}{n[xx] - [x]^2} = \frac{n\alpha^2}{\Delta}$$

Further, the standard error of $\bar{y}$ is $\alpha/\sqrt{n}$, and since $b = \bar{y} - a\bar{x}$ we can write

$$b = \bar{y} \pm \alpha/\sqrt{n} - (a \pm \alpha_a)\bar{x}$$

so that

$$\alpha_b^2 = \frac{\alpha^2}{n} + \alpha_a^2\bar{x}^2 = \alpha^2\left(\frac{1}{n} + \frac{\bar{x}^2}{[XX]}\right) = \frac{\alpha^2[xx]}{n[xx] - [x]^2}$$

as given above.

It is perhaps more convenient to write the equation in the form

$$Y \pm \alpha/\sqrt{n} = (a \pm \alpha_a)X.$$
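The two routes to the standard errors of $a$ and $b$ described in this section (the determinant $\Delta$, and Bond's formula $\alpha_a = \alpha/\sqrt{[XX]}$) can be compared numerically. The Python sketch below is illustrative only, with function and variable names chosen here; it uses the unrounded least-squares values of $a$ and $b$, so $[dd]$ comes out slightly below the 0.3500 obtained in the text from the rounded constants.

```python
import math

def line_fit_errors(xs, ys):
    """Fit y = a*x + b and estimate the standard errors of a and b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    delta = n * sxx - sx * sx                     # determinant of (21), (22)
    a = (n * sxy - sx * sy) / delta
    b = (sy * sxx - sx * sxy) / delta
    dd = sum((a * x + b - y) ** 2 for x, y in zip(xs, ys))   # [dd]
    alpha2 = dd / (n - 2)                         # mean square error
    alpha_a = math.sqrt(n * alpha2 / delta)       # alpha_a^2 / n    = alpha^2 / delta
    alpha_b = math.sqrt(sxx * alpha2 / delta)     # alpha_b^2 / [xx] = alpha^2 / delta
    big_xx = sxx - sx * sx / n                    # [XX], sums about the mean
    alpha_a_bond = math.sqrt(alpha2 / big_xx)     # Bond's alternative form
    return a, b, alpha_a, alpha_b, alpha_a_bond

xs = list(range(10))
ys = [4.6, 7.1, 9.5, 11.5, 13.7, 15.9, 18.6, 20.9, 23.5, 25.4]
a, b, alpha_a, alpha_b, alpha_a_bond = line_fit_errors(xs, ys)
print(f"a = {a:.2f} +/- {alpha_a:.3f}, b = {b:.2f} +/- {alpha_b:.3f}")
```

Both expressions for $\alpha_a$ agree exactly, as the algebra above requires, since $[XX] = \Delta/n$.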

wxy] .aX s: Thus in the example considered above we have found tha a = [XY]/ [XX] = 191. It might be added that if the observed values x. . .....~I i:. the equations (21) and (22) above become a[wxx] a [w. at> a2. 3 where y = y . the relation may be of the form y = therefore ee~ ao + alx + a2x2 I ••• + Gmx'" = O'3580/(8 x 82' 50) = 0'00054 therefore eeo = 0·023 as before. . (Y. and hence as shown belo [DD] = 0'3580.8·0 -2'5 .. Other curves It often happens that the relation between the two variables x and y is not linear. the values of the constants ao.x] so that and a b = D DD + b[wx] = + blw] = [wxy] [wy] -4'5 -10'5 -3'5 .4'5) is least.• . 2. More generally.2).1·4 0·8 0'5 1'5 3'5 5·8 2'5 8·4 3'5 4'5 10·3 sum Hence ee2 = -0'06 0'0036 0'12 0·0144 0'20 0'0400 -0'12 0·0144 -0'24 0·0576 -0'36 O'lf96 0·02 0·0004 0 0 0·28 0'0784 -0. [w][wxx] ..3·6 -0. 109 a2x~)2 15'!) ± 0·07 = (2'32 108 ± 0'02)(x .5 . ...ao s ~ I alx..alx. = Hence the relati won /0' 3580 ~ (y.a2x.amx.. •• " nand n> m + I.y..14 0·0196 0·3580 [w](wxy] . eeo = ee/v[XX] and ee2 = [DD]/(n . . ._ = Also the standard errors of a and b are given by ee2 ee2 . If n corresponding values of x and yare known.::. If y = ao + alx + a2x2 we choose ao. X = x .[wx](wy] [w][wxx] . A particular example will illustrate the method. 3. (xs' yJ with s = I..l wx F ee2 .[wxF [wy][wxx] [w] [wxx] [wxH.. Also..)2 .!!. IS Vn 'Y 80""" = 0·067 Y ± 0'07 = (2'32 ± 0'02)X This can be written in terms of x and y as follows (y - is least.25/82'50 =2· 32. say. involving m + I constants.. 3 THEORY OF ERRORS THEORY OF ERRORS CHAP.ao .X.-.=_b_= [w] where ee2 = [wdd]/(n [wxx] 2). ai' a2 so that .[wxP O'3580/8 49.CHAP. .5·6 -1'5 . being the value of Y. X y therefore Y = (2'32 ± 0·02)x 0'02)x = (2'32 ± + 4·66 ± v[(0'07)2 + (0'09)2] + 4'66 ± 0'12 in keeping with the result found earlier. am may be chosen such that the sum of the squares given by n . D.. Ys are given the weight w.

Differentiating partially with respect to $a_0$, $a_1$, $a_2$ respectively we obtain the necessary conditions:

$$\sum (y_s - a_0 - a_1 x_s - a_2 x_s^2) = 0$$
$$\sum x_s(y_s - a_0 - a_1 x_s - a_2 x_s^2) = 0$$
$$\sum x_s^2(y_s - a_0 - a_1 x_s - a_2 x_s^2) = 0$$

The normal equations are therefore

$$[y] - na_0 - [x]a_1 - [xx]a_2 = 0$$
$$[xy] - [x]a_0 - [xx]a_1 - [xxx]a_2 = 0$$
$$[xxy] - [xx]a_0 - [xxx]a_1 - [xxxx]a_2 = 0$$

the solutions of which give the most probable values of $a_0$, $a_1$ and $a_2$. The normal equations might more conveniently be written

$$s_0 a_0 + s_1 a_1 + s_2 a_2 = t_0$$
$$s_1 a_0 + s_2 a_1 + s_3 a_2 = t_1$$
$$s_2 a_0 + s_3 a_1 + s_4 a_2 = t_2$$

where $s_k = \sum_{r=1}^{n} x_r^k$ and $t_k = \sum_{r=1}^{n} x_r^k y_r$.

The general form of these equations is obvious. When the number of constants is large, the solution of the normal equations may be laborious. Special methods of solution have been devised.(19,20)

Again, if we assume that the values of $x_s$ are accurate and the values of $y_s$ are subject to experimental error, the standard errors of the values of $a_0$, $a_1$, $a_2$ can be estimated by the method outlined in Section 45 above.

EXAMPLE
Fit a parabolic curve to the following data:

  x    0     0.2    0.4    0.6    0.8    1.0    1.2
  y   1.0    2.4    6.6   14.2   25.7   40.1   57.5

We assume that $y = a_0 + a_1 X + a_2 X^2$, where to simplify the arithmetic we take $X = 5x - 3$, and tabulate the working as follows.

    X      y     X^2    X^3    X^4     Xy      X^2 y    y(calc)     d       d^2
   -3     1.0     9     -27     81    -3.0      9.0      1.05     -0.05   0.0025
   -2     2.4     4      -8     16    -4.8      9.6      2.22      0.18   0.0324
   -1     6.6     1      -1      1    -6.6      6.6      6.69     -0.09   0.0081
    0    14.2     0       0      0     0        0       14.46     -0.26   0.0676
    1    25.7     1       1      1    25.7     25.7     25.54      0.16   0.0256
    2    40.1     4       8     16    80.2    160.4     39.93      0.17   0.0289
    3    57.5     9      27     81   172.5    517.5     57.62     -0.12   0.0144
  sum 0 147.5    28       0    196   264.0    728.8                       0.1795

Hence the normal equations are

$7a_0 + 28a_2 = 147.5$
$28a_1 = 264.0$
$28a_0 + 196a_2 = 728.8$

leading to $a_0 = 14.463$, $a_1 = 9.428$ and $a_2 = 1.652$. Hence we have

$$y = 14.463 + 9.428X + 1.652X^2 = 14.463 + 9.428(5x - 3) + 1.652(5x - 3)^2 = 1.05 - 2.42x + 41.3x^2$$

The uncertainties in the values of $a_0$, $a_1$, $a_2$ can be estimated as follows. We first find the residuals $d$ given by $y - a_0 - a_1 X - a_2 X^2$ for the various values of $X$; these are tabulated above. We then calculate $d^2$ and find

$$\alpha^2 = [dd]/(n - 3) = 0.1795/4$$

If the standard errors of $a_0$, $a_1$, $a_2$ are denoted by $\alpha_0$, $\alpha_1$, $\alpha_2$ respectively, we have

$$\frac{\alpha_0^2}{\begin{vmatrix} 28 & 0 \\ 0 & 196 \end{vmatrix}} = \frac{\alpha_1^2}{\begin{vmatrix} 7 & 28 \\ 28 & 196 \end{vmatrix}} = \frac{\alpha_2^2}{\begin{vmatrix} 7 & 0 \\ 0 & 28 \end{vmatrix}} = \frac{\alpha^2}{\begin{vmatrix} 7 & 0 & 28 \\ 0 & 28 & 0 \\ 28 & 0 & 196 \end{vmatrix}}$$
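The whole of this working, including the determinant ratios just written down, can be reproduced by machine. The Python sketch below is an illustration under the same assumptions as the example (accurate $x$, the substitution $X = 5x - 3$); the helper names `det3` and `fit_parabola` are chosen here and are not from the text. It solves the three normal equations by Cramer's rule, which is adequate for so small a system.

```python
import math

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_parabola(X, y):
    """Least-squares y = a0 + a1*X + a2*X^2 with standard errors."""
    n = len(X)
    s = [sum(x ** k for x in X) for k in range(5)]                   # s_0 .. s_4
    t = [sum((x ** k) * v for x, v in zip(X, y)) for k in range(3)]  # t_0 .. t_2
    m = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    delta = det3(m)
    coeffs = []
    for j in range(3):                      # Cramer's rule, column j replaced by t
        mj = [row[:] for row in m]
        for r in range(3):
            mj[r][j] = t[r]
        coeffs.append(det3(mj) / delta)
    a0, a1, a2 = coeffs
    dd = sum((v - a0 - a1 * x - a2 * x * x) ** 2 for x, v in zip(X, y))
    alpha2 = dd / (n - 3)
    # Principal 2 x 2 minors of the normal-equation determinant.
    minors = [s[2] * s[4] - s[3] ** 2, s[0] * s[4] - s[2] ** 2, s[0] * s[2] - s[1] ** 2]
    errors = [math.sqrt(alpha2 * mn / delta) for mn in minors]
    return coeffs, errors

x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
y = [1.0, 2.4, 6.6, 14.2, 25.7, 40.1, 57.5]
X = [-3, -2, -1, 0, 1, 2, 3]                # X = 5x - 3, written out exactly
coeffs, errors = fit_parabola(X, y)
print([round(c, 3) for c in coeffs])        # close to 14.46, 9.43, 1.65
print([round(e, 3) for e in errors])        # close to 0.12, 0.04, 0.02
```

Working with unrounded constants gives $a_0$ nearer 14.462 than the 14.463 obtained by hand, and $[dd]$ slightly larger than 0.1795; the differences are well inside the standard errors.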

Evaluating the determinants, we get

$$\frac{\alpha_0^2}{112} = \frac{\alpha_1^2}{12} = \frac{\alpha_2^2}{4} = \frac{0.1795/4}{336}$$

therefore $\alpha_0 = \pm 0.122$, $\alpha_1 = \pm 0.040$, $\alpha_2 = \pm 0.023$. Hence

$$y = 14.463 \pm 0.122 + (9.428 \pm 0.040)X + (1.652 \pm 0.023)X^2$$

The simplification of this expression is complex, since $\alpha_0$, $\alpha_1$, $\alpha_2$ are not independent. If, however, we treat them as independent we get

$$y = 1.05 \pm 0.27 - (2.42 \pm 0.72)x + (41.3 \pm 0.6)x^2$$

The standard error 0.27 is overestimated, but the others are satisfactory.

EXERCISES 6

1. From the equations 3x + y = 2.9, x - 2y = 0.9 and 2x - 3y = 1.9 find the most probable values of x and y.

2. The equations x - 3y = -5.6, 4x + y = 8.1 and 2x - y = 0.5 have weights 1, 2, 3 respectively. Find the most probable values of x and y. Find also the standard errors of x and y.

3. Find the most probable values of x, y and z that satisfy the equations x + y + z = 4.01, 2x - y + z = 1.04, x + 3y - 2z = 5.02 and 3x + y = 4.97, assuming the equations have equal weight.

4. The three angles of a plane triangle are measured and found to be A = 48° 5' 10", B = 60° 25' 24" and C = 70° 42' 7". Find the most probable values of A, B and C assuming that the measurements have (i) equal weights, and (ii) weights 1, 2, and 3 respectively.

5. Independent sets of observations led to the results u = 1.23 ± 0.06, v = 2.17 ± 0.08 and u + v = 3.50 ± 0.12. Find the most probable values of u and v and their standard errors.

6. Find the most probable position of the point of which the measured distances from the points (1, 0), (3, 1) and (-1, 2) are respectively 3.1, 2.2 and 3.2 units. Estimate the uncertainties in the co-ordinates of the point.

7. Plot the values of x and y given in the example discussed in Section 46 and draw the best straight line through them. Find the equation of the line drawn. Find the values of a and b using only the values (i) x = 0, …, 5 and the corresponding values of y, and (ii) x = 5, …, 9 and the corresponding values of y.

8. Fit a linear relation to the following data and estimate the errors in the values of the constants obtained.

  x    10     12     13     17     19     20
  y   11.0    7.6    6.2   -0.1   -3.2   -5.0

9. Two quantities D and d are measured as follows:

  d (in.)   1/2    3/4     1     1 1/4   1 1/2   1 3/4    2
  D (in.)   1.19   1.31   1.42   1.52    1.64    1.76    1.87

Show that they satisfy approximately a relation of the form D = ad + b. Find the most probable values of a and b and estimate the errors in those values.

10. Values of the surface tension of water, $\gamma$ N m$^{-1}$, at different temperatures, t °C, are given below.

  t             10      20      30      40      50      60
  γ x 10^3     74.22   72.75   71.18   69.56   67.91   66.18

If $\gamma = a - b\theta$, where $\theta$ is the temperature on the Kelvin scale, find the most probable values of a and b.

11. Values of the surface tension of bromobenzene, $\gamma$ N m$^{-1}$, at different temperatures, t °C, are given below.

  t             15      42      49      78      90      105     125
  γ x 10^3     37.92   34.92   33.78   30.73   29.30   27.62   25.30

Fit a law of the form $\gamma = \gamma_0[1 - (\theta/670)]^n$ where $\theta$ is the temperature in degrees Kelvin.

12. Fit a parabolic curve to the data given in the example discussed in Section 49, but excluding the value x = 1.2. Use the calculated expression for y to find y when x = 1.2.

13. Using the values of the velocity of light c given in question 10 of Exercises 5, use the method of least squares to show that c is related to the epoch D by the relation

c = 299777.27 ± 2.03 - (0.381 ± 0.234)(D - 1930)

May it be concluded that there is experimental evidence of a linear variation of c with time?

14. Values of the viscosity of water, $\eta$ N s m$^{-2}$, at different temperatures, t °C, are given below.

  t             10      20      30      40      50      60      70
  η x 10^3     1.308   1.005   0.801   0.656   0.549   0.469   0.406

Fit a law of the form $\eta^{-1} = a + bt + ct^2$.

15. Assuming that $\eta = Ae^{k/T}$ where T is the temperature on the Kelvin scale, use the data given in question 14 above to find the most probable values of A and k.

REFERENCES

(1) BULLARD, E. C., Phil. Trans. A, 235, pp. 445-531 (1936).
(2) HEYL, P. R., Bur. Stand. J. Res. (Washington), 5, p. 1243 (1930).
(3) SCRASE, F. J., Quart. J. R. Meteorol. Soc., 61, pp. 368-378 (1935).
(4) AITKEN, A. C., Statistical Mathematics, p. 72 (Edinburgh: Oliver and Boyd Ltd., 1952).
(5) RUTHERFORD, E. and GEIGER, H., Phil. Mag., 20, pp. 698-707 (1910).
(6) JEFFREYS, H., Phil. Trans. A, 237, pp. 231-271 (1938).
(7) WHITTAKER, E. T. and ROBINSON, G., Calculus of Observations, p. 179 (London: Blackie and Son Ltd., 1929).
(8) See ref. (7).
(9) JEFFREYS, H., Theory of Probability (Oxford: Clarendon Press, 1948).
(10) BESSEL, F. W., Astron. Nachr., 15, Nr. 358-359.
(11) HANSMANN, G. H., Biometrika, 26, pp. 128-195 (1934).
(12) BIRGE, R. T., Rep. Progr. Phys., 8, p. 95 (1941).
(13) BIRGE, R. T., Phys. Rev., 40, pp. 207-227 (1932).
(14) See ref. (7).
(15) WEATHERBURN, C. E., Mathematical Statistics, p. 188 (London: Cambridge University Press, 1952).
(16) See ref. (7).
(17) See ref. (7), p. 243 and later.
(18) BOND, W. N., Probability and Random Errors, p. 96 (London: Edward Arnold and Co., 1935).
(19) HARTREE, D. R., Numerical Analysis (Oxford: Clarendon Press, 1952).
(20) MILNE, W. E., Numerical Calculus (Princeton University Press, 1950).

BIBLIOGRAPHY

The following is a list of works on statistics and the theory of errors which, it is hoped, might serve as some guide to students through a very extensive literature. The works range from the most elementary to the very advanced.

AITKEN, A. C., Statistical Mathematics (Edinburgh: Oliver and Boyd Ltd.).
BOND, W. N., Probability and Random Errors (London: Edward Arnold and Co., 1935).
BRADDICK, H. J. J., The Physics of Experimental Method (London: Chapman and Hall Ltd.).
DEMING and BIRGE, Rev. Mod. Phys., 6, 119 (1934).
JEFFREYS, H., Theory of Probability (Oxford: Clarendon Press, 1948).
LEVY and ROTH, Elements of Probability (Oxford University Press).
LYON, A. J., Dealing with Data (Oxford: Pergamon Press, 1970).
MORONEY, M. J., Facts from Figures (London: Penguin Books Ltd., 1951).
SMART, W. M., Combination of Observations (Cambridge University Press, 1958).
WEATHERBURN, C. E., Mathematical Statistics (Cambridge University Press).
WHITTAKER and ROBINSON, Calculus of Observations (London: Blackie and Son Ltd., 1929).

INDEX

Accuracy
Accuracy of coefficients
Aitken
Approximations
Area under error curve
Average, see Mean
Bernoulli
Bessel
Bessel's correction
Bessel's formula
Binomial distribution
Birge
Bond
Bullard
Charlier's checks
Class
  boundaries
  limits
  widths
Coefficient
  of correlation
  of variation
Consistency, internal and external
Correction, Sheppard's
Correlation, coefficient of
Cumulative frequency
Demoivre, A.
Density, probability
Deviation
  mean
  standard
Dispersion
Distribution
  binomial
  Cauchy
  lognormal
  median
  multinomial
  normal
  Pearson's types
  Poisson
  triangular
Dust counter
Eddington
Electronic charge
Equations, normal
Error
  accidental
  curve
  function
  in a product
  in a quotient
  in a sum or difference
  probable
  standard
Errors
  accidental
  estimate of
  fractional
  Gaussian law of
  percentage
  personal
  superposition of
  systematic
  theory of
Fitting
  of a normal curve
  of a parabolic curve
  of a straight line
Frequency
  curves
  distributions
  polygon
  relative
Gauss
Gaussian law
Gravitation, constant of
Gravity
Grouping, effect of
Hansmann

Hartree
Heisenberg
Heyl
Histogram
Hogben
Infinite population
Jeffreys
Kinetic theory of gases
Laplace
Least squares
Legendre
Linear equations, solution of
Lognormal distribution
Mean
  assumed or working
  deviation
  of a sum
  standard error of
  weighted
Median
Median distribution
Method of least squares
Milne
Multinomial law
"Noise"
Normal distribution
  fitting of
Normal equations
Normal error
  curve
  distribution
  law
  law, applicability of
Pearson
Pearson types of curve
Personal
  equation
  error
Peters' formulae
Poisson
  distribution
  series
Population
Precision
Precision constant
Probability
  density
  paper
Probable error
Product
  error in a
  standard error of
Quotient
  error in a
  standard error of
Random
Range
Ratio η/σ
Regression, line of
Relative frequency
Residual
Root mean square deviation
Rutherford
Sample
Sampling, random
Scatter
Scrase
Sheppard's corrections
Skew
Solution of linear equations
Standard deviation
  evaluation of
  of a sum
Standard error
  of compound quantities
  of mean
  of product
  of standard deviation
  of sum or difference
  of weighted mean
t-distribution
Tests of normality
Theory of errors
Triangular distribution
Uncertainty principle
Variance
Variation, coefficient of
Weatherburn
Weight
Weighted mean
Weights of observations
Whittaker and Robinson
χ²-test