THE MEASUREMENT OF VARIABLE QUANTITIES

BY

FRANZ BOAS, PH.D.
Professor of Anthropology, Columbia University

ARCHIVES OF PHILOSOPHY, PSYCHOLOGY AND SCIENTIFIC METHODS
EDITED BY J. McKEEN CATTELL AND FREDERICK J. E. WOODBRIDGE

No. 5, JUNE, 1906

Columbia University Contributions to Philosophy and Psychology, Vol. XIV

NEW YORK
THE SCIENCE PRESS


PREFACE

THE present treatise contains the introduction to a course on the statistical treatment of biological and psychological measurements, which I have given for ten years at Columbia University. The form selected for the demonstration of the principles of measurement of variables was chosen on account of the limited mathematical preparation of students who have devoted themselves to the study of anthropology, biology, and psychology, which made it necessary to avoid, so far as feasible, all application of the calculus.

While the book was in the hands of the printer, the "Wahrscheinlichkeitsrechnung und Kollektivmasslehre," by Heinrich Bruns, was published. It to a great extent follows methods similar to those used by me, and is much more comprehensive. Nevertheless I have decided to publish my treatment of the subject, because it seems adapted to the peculiar needs of American students.

FRANZ BOAS.

NEW YORK, May, 1906.

CONTENTS

I. Introductory; Constants and Variables
II. Comparison between Limited Series of Observations and the Unlimited Series of Variables
    A. Properties of Averages
    B. Comparison of Limited and Unlimited Series
III. Distribution of Variables and of Chance Variations

THE MEASUREMENT OF VARIABLE QUANTITIES

I. INTRODUCTORY; CONSTANTS AND VARIABLES

§ 1

IN the quantitative study of nature we distinguish two separate classes of objects and phenomena: constants and variables. The former give the same quantitative results whenever they are measured under exactly the same conditions; the latter give different quantitative results when measured at different times, because the governing conditions are complex and never quite the same. When we measure the length of a metal rod, we consider it as the same identical object whenever measured, and therefore we assume that it must always have the same length, provided the variable conditions affecting its length remain the same. Only when changes occur that modify permanently its inner structure, and by which its identity is changed, do we speak of a change in the length of the rod. If our refined measurements give different results for the length of the rod, we ascribe these to lack of control of conditions, and we call them 'errors of observation.'¹ If, on the other hand, we cast a number of rods in the same mold and as nearly as possible under the same conditions, the slight differences of conditions in casting will become permanent characteristics of the several rods, and the errors of manufacture of each individual specimen will make them a series, intended to represent the same kind of an object, but, owing to individual differences, variable. In the same way, if we wish to determine the weight of a cubic centimeter of pure iron, the weight appears as a constant, because both a cubic centimeter and pure iron are identically the same in all cases. Each individual experiment will therefore be an approach to this constant weight, affected by errors of observation according to changing physical condition, to inaccuracies in the size of the cubic centimeter, and to impurities in the iron. On the other hand, the same series of iron cubes are, when considered individually and objectively, variable representatives of what is intended to be a cubic centimeter of pure iron.

¹ The term 'error of observation' is applied to constants, and is generally used in a restricted sense, signifying the error due to the technique of measurement and determining the limits within which differences, if they should occur, can no longer be recognized.

It appears from these examples, that according to the principle of identity we consider a phenomenon which is completely defined as always the same, and therefore as a constant. As soon as any of the constituent or controlling elements are no longer completely defined, the phenomena are no longer identical, but separate individualities. Differences of the results of measurements are called, in the first group, 'differences due to errors of observation'; in the second group, 'variates'; and the phenomena are called, respectively, 'constants' and 'variables.'

While in some groups of phenomena a complete definition can be given which compels us to consider a repetition of a phenomenon as identical with the original one, in others no such definitions are possible, and the individual repetitions always possess independent elements which are not contained in their common definition. These phenomena must always be considered as variables, and their common definition is that of a class embracing the individual phenomena.

The identity of two objects or phenomena is inferred partly from considerations that have no relation to measurements, and we conclude that, on account of their identity, the measurements of the two objects or phenomena must be the same; but we also conclude conversely that when the measurements are not the same the objects or phenomena cannot be identical, and also that when they have the same measurements, that is, when they are constant, they are identical. This last conclusion is, of course, empirical and open to refutation by new facts. For this reason it may be that with increasing knowledge objects or phenomena which once appeared as constants may come to be considered as variables, because what seemed at one time as quantitatively the same is proved to be different; or what seemed at one time identical is proved to contain different elements. In other words, certain parts of the error of observation may be proved to be due to individual differences between observations. The discovery of variations in latitude and of new elements which are found in very small quantities mixed with other elements illustrates this point.

Strictly speaking, no two measurements are absolutely the same. If, in our definition of the phenomenon measured, the differences between repetitions are taken into account, it will be a variable; if they are disregarded, and only the elements common to all repetitions are included, it will be a constant. The example given before illustrates that a cubic centimeter of pure iron is a constant, but that the individual cubes measured may also be considered as variates. A cubic centimeter of pure iron, under given physical conditions, is identically the same whenever measured, and therefore completely defined. The individual cube does not quite correspond to our definition, and, in so far as its individual peculiarities are considered as vitiating our definition, they are errors of observation. When these individual peculiarities are considered as part of the definition, the same cubes would become representatives of the series of cubic centimeters of iron as they exist, and in this sense the individual peculiarities would be considered as variates.

In empirical studies, identity or diversity is often inferred from sameness or diversity of measured values. We have seen that absolute sameness of measurement does not exist. It is therefore a matter of judgment whether a quantity shall be considered as constant or as variable. While considerable differences will always lead to the conclusion that our definition is incomplete, in other words, that the quantity measured is variable, no fixed lowest limit can be given under which variations must be disregarded, as long as they are discernible by means of the standard used in making the measurement. Our decision will always depend upon the question whether we consider each quantity individually, or as adequately represented by what is common to all the quantities measured. Thus, when we compare a number of fairly uniform centimeter rods, the measurable deviations will be considered as errors of observation, and by abstraction we define the length of the centimeter as the type of our rods. When, on the other hand, differences are found to be so great that the centimeter can no longer be considered as the type of the series, each rod must be considered as an individual, and the rods present a series of variates. It will be noticed that in comparisons between two series quantitative variations in each series will be the more negligible, the greater the differences between the series. Thus, in comparing a series of inaccurate lengths of centimeters and meters, we are more ready to recognize the two types and to consider their deviations as errors, than in a case where we compare inaccurate lengths of one centimeter and of two centimeters.

It appears from what has been said that the essential difference between constants and variables consists in the fact that the constant is an individual considered as a complete representation of a class,

the definition of the class and of the individual being the same; while a variable is a series of individuals, each individual considered as different from the other, but such that we form the abstraction of the existence of a class of phenomena, the individual cases of which are distributed according to a certain law. In the former case, differences of individual measurements are considered as due to subordinate causes; in the second case, these subordinate causes and the causes determining the type are given equal importance.

Two variables will appear to us as the same when the series of measurements representing each is the same. In this case we may infer that both variables represent the same class and the same groups of imperfectly known modifying conditions. When we find two variables that fulfil these conditions, we consider them as identical.

§ 2

Since a variable consists of a series of individuals constituting a class, the measurement of a variable, to be complete, must consist of a series of measurements of all the individuals of the class. While in nature the number of cases constituting a variable series is limited, we conceive the series in forming the abstract law as though it consisted of an unlimited number of individuals. This may be done, because the distribution of variates fully defines the variable, and we may imagine any completely defined phenomenon to repeat without end. The distribution of cases in the ideal series, that is to say, in the complete infinitely long series of individuals, may be expressed by an algebraical function $f(X)$, which may be such that $f(X_1)$ is the relative frequency of a measurement $X_1$ compared with the total number of measurements; and each series of measurements is considered as following this law. If, for instance, among 20 observations the measurement $X_1$ occurs 4 times and $X_2$ occurs 3 times, their relative frequencies are 4/20 and 3/20. Thus the measurement of the variable may be reduced to the determination of those elements which determine the general law of distribution. When these elements are determined from a single limited series of observations, they may be expected to differ from those representing the abstract type of distribution which would be obtained, could the whole infinitely large number of individuals be observed. Since the single series are considered as identical with the type, we call the differences between the individual values and the ideal values 'errors' in the same way as we call 'errors' the differences between one centimeter and individual rods, intended to be one centimeter in length, but differing from this value on account of imperfections of manufacture.

Thus, from our actual observations we obtain a table of measurements from which may be derived an algebraical law representing this function with greater or less accuracy. Two questions, therefore, arise at the beginning of our investigation: the one, how to determine this algebraical function; the other, how far the algebraical function thus obtained may be assumed to correspond to the function representing the unlimited series.

It is obvious that the members of one and the same class can not vary more than a certain limited amount, which is determined by the definition of the class. If the variations contained in the class had an unlimited range, the class itself would be unlimited. According to our definition all the measurements are to belong to the same class; in other words, in no case do variations occur of such size that the measurement could no longer belong to the class in question. Therefore variates must always remain within certain finite limits, or the frequency of variates beyond these limits is zero.

§ 3

The character of the function expressing the distribution of variates may be determined by arranging the measurements in order, beginning with the lowest and proceeding to the highest. The measurements may be recorded with the greatest possible accuracy, which may be made so great that the observations may be given in form of a list of measurements of which each occurs only once, or at most a few times, and which will be crowded where the frequencies of the measured values are great, far apart where the frequencies are low. A clearer impression of the character of the distribution may be obtained by counting the number of measurements that occur within convenient intervals. For instance, statures of 905 nine-year-old boys in Toronto have been taken. By recording the number of cases that occur between groups of 10 mm. the following series results:

[Table: stature in mm., by intervals of 10 mm. (1055-1065, 1065-1075, 1075-1085, 1085-1095, ... 1195-1205), with the number of cases in each interval.]
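The counting of cases within intervals of 10 mm. described above can be sketched in a few lines of Python. This is only an illustration: the statures below are hypothetical values, not the Toronto measurements, and the function name `class_frequencies`, the interval offset, and the choice of half-open intervals (lower limit included, upper limit excluded) are assumptions made for the example.

```python
from collections import Counter

def class_frequencies(measurements, width=10, offset=5):
    """Count how many measurements fall in each interval of `width` mm.

    Intervals are [lower, lower + width) with lower = offset + k * width,
    which yields classes such as 1055-1065, 1065-1075, ... for the defaults.
    """
    counts = Counter()
    for x in measurements:
        lower = ((x - offset) // width) * width + offset
        counts[lower] += 1
    return dict(sorted(counts.items()))

# Hypothetical statures in mm (illustrative values only).
statures = [1058, 1061, 1067, 1070, 1074, 1079, 1083, 1088, 1090, 1101]
table = class_frequencies(statures)
for lower, n in table.items():
    print(f"{lower}-{lower + 10} mm: {n}")
```

A measurement that falls exactly on a class limit must be assigned to one side by convention; the sketch assigns it to the upper class.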

x f+l . . assume in this formula. The equation given with the factor zr . all the coefficients. if we have. (A. disappear.xj - : points from ..(Ap . 2p + 1 points for X_f+l the 2p 4. . ) . . X p+1 )(^.from It follows that.x_p+ for points .x_p+l for the ) (x to x^x. to A^. (X. that in the region between the known points the function changes continuously and that no periodical or We irregular features of the law of distribution occur in the intermediate regions. X_f to X X p r+ j .INTRODUCTORY (1) .th interpolation for point.jy . . _ to X_ .X_.> : : x^j. x A. while the one that does not disappear will equal 1.AV. to (A.A.:. for the 2}> points..x_f)(x. the following amounts must be added to the 2/.> -f 1 we compare any For a term points. before takes a simpler form for two succeeding interpolations 2p and 2. 2p -f 1 (X. A^ +J (A'. for X . . .JTJ . from A"_^.JTJI A.X^A^ +I - J - from -3T_^. .^JT^Y.~ .. (X. -2 z' + (^ .) because in this equation. .)' (X^-X^X^-X^X~XJ 4 ( . with the ex- (X A'r ). .XA. .x_p)(x. for ception of the one not containing ' X= X r.) (x.1st interpolation. ^ and for . _ A.

a )z_p+r+v .^. .1-2-3-.(r-p-l) The sums as follows in these values from z_ t : to 2 ...-.ajz..(r+p) '1-2-3.2p- l-2-3. if as zero and call the varying furthermore we assume the point X values x.-. We have.8 THE MEASUREMENT OF VARIABLE QUANTITIES If we take the values of A" equidistant... l then Their difference (a. p the interpolation for the 2p 1 points from x_ to x and p p f r the points a-_ if we designate by 2 the sum to x and Vip+i p+1 p+l) of the products (2) and (2*) for all values of r 2fJrl : V + U' L ~- TT L 1-2-3- _ --2/> --. if U of the function for the interpolation of 2p points.(r-p-l)' ' 1-2-3-. If the rth differences are expressed by the form l . therefore.-.ar r . 2p(2p-l).(r-p) 1-2-3.-(2p) 1r ^-... we find for the last two values (x (2) +p l)(x +p 2) (x p) 1 nr 1-2-3. their distances as units.. from x_ +1 to x . 2p _p and xf+1 also we call 2p the value It will be seen that the members for the points agree with these forms. + + (a r+1 .. p two terms are the 2pth differences between the and from z_ to z This can be shown p+l p+l ..(r-p) and 1-2-3. = a 2_ + n s .

we find (x +p l)(g +P 2) (* p) A . and also Therefore . decrease by units and are. The increase r the amounts coefficients and are. arising from a' = r(r 1) ~T2~ r(r -~ l)(r2) If we write the differences between the values z in the follow: ing form '-' +1 A-- A:- A:!. therefore.INTRODUCTORY Since for 9 Alp = 1. for the rth difference r. but with alternating signs.. A:. it follows at once that the coefficient* a. the rth difference (/ l)/l-2. therefore. In short. the coefficients are the sums of arithmetical progressions of increasing order whole numbers. Sr In the same way it can be -shown that 2. for by 2 a.-n 1-2 -3.:. *-H ^2 A ' z' A: A..

*-* 1 A' 4- "fo1 1) A + A -i O < 1 L O ^ afo.0057+ +0.55 0.0000 +0.(x-p)(x-i) 1-2 -3-.0072+ +6.02250 0.0039+ +0.10 0.0117+ 1.004Q 0. we determine intermediate values of the function that a number of by assuming observed points between certain definite limits defines the function In this table the arguments from 0. 0.0106+ +0.06188-0..0057 +0.80 0..03188 +0. (2p-f 1) -e ' By introducing for p successively values from upward.10 THE MEASUREMENT OF VARIABLE QUANTITIES P j_ 1-2 -3 (%>+!) 2 l-2-3.0021+0. -L ^ ' O 1 1-2-3-4 1-2-3-45 Or 48 For various values of x the follows l : coefficients of interpolation are as *(*-!) 4 x(z l)(z 6 i) 0.60 0.01188 0..40 0.0020+ +0.04000 0.0 are given on the right-hand side and are to be used with the corresponding sign.5 are given on the left-hand side and are to be used with the sign on the left hand of the columns.!)(*-}) A 1 O Q ' ./ _ 2!_+A O ^> .06000 -0..05250 0.05 -0. .0085+ +0.95 0.0078 +0.65 0.00 0.0000 ~48 +Q.0116+ +0.0080 +0.75 0.35 0.70 0.90 0. Those from 0..30 0.05688 5 0..00 0. we find .0060 +0.0 to 0.5 1 to 1.0036 0.0112+ +0.0097+ +0.OOOO+ +0..0074 +0.45 0.06250- +0. ~~ V.85 0.50 4 In the method of interpolation described in 3.20 0.0070 +0.2p (x + p-l)(x + p^2). .04688 0.15 + 0.00000 0.

adequately within these limits. This method is open to the objection, that not all the observations are given equal weight, those near the middle region appearing with great weight, while those beyond the limits of the range of interpolation are entirely neglected, and that the approach to the true distribution will be best in the middle region of the selected interval.

It is possible to overcome this objection by demanding that each observation shall be given equal weight. This can be done by determining the averages of the powers of the variates. We may determine the average frequency of the variates, the average value of the variates, the averages of their squares, cubes, fourth powers, etc. If the function representing the unlimited series is determined, then the averages of these powers are also determined algebraically, and the averages of the observed limited series will approach the theoretical values more or less accurately. Therefore, if a function can be determined, whose average powers correspond to the calculated average powers of the observed series, it would seem that the problem can be solved in a more satisfactory manner than can be done by interpolation. In applying this method we assume that any two functions which have the same average values of their powers are the same. This point requires, however, further investigation.

If we designate the measurements again by $X_1, X_2, X_3, \dots$, and if we indicate the process of averaging by brackets, we have to determine the averages $[X], [X^2], [X^3], \dots$ Then we must find the function $f(X)$ for which the averages of the powers $X^r$ in the infinitely long series correspond to the observed averages.

We will assume again that between certain limits, $X_1$ and $X_2$, the function can be represented by a series

$$f(X) = c_1 + c_2 X + c_3 X^2 + c_4 X^3 + \dots$$

Then the total number of cases between the limits $X_1$ and $X_2$ will be¹

$$c_1 (X_2 - X_1) + c_2 \frac{X_2^2 - X_1^2}{2} + c_3 \frac{X_2^3 - X_1^3}{3} + \dots$$

¹ For readers not familiar with the elements of calculus, the following explanation will perhaps be sufficient. The number of cases between two points $x$ and $x + dx$ may be measured by the area bounded by the ordinates $f(x)$, $f(x + dx)$, the line $dx$ and the corresponding segment of the curve (Fig. 1). When $dx$ is taken very small, this area will be very nearly $f(x)\,dx$. The area bounded by the curve, the terminal ordinates $f(x_1)$, $f(x_2)$, and the line $x_1 x_2$, will be equal to the sum of all the values $f(x)\,dx$ between the limits $x_1$ and $x_2$, which may be written $\int_{x_1}^{x_2} f(x)\,dx$. We may also consider the area thus bounded between the point $x_1$ and a variable point $x$ as a function of $x$, say $\varphi(x)$. Then we can write $\varphi(x + dx) - \varphi(x) = f(x)\,dx$, or

$$\frac{\varphi(x + dx) - \varphi(x)}{dx} = f(x).$$

If we assume, for instance, $\varphi(x) = x^n$, when $n$ is an integer we find

$$f(x) = \frac{(x + dx)^n - x^n}{dx},$$

and expanded

$$f(x) = n x^{n-1} + \frac{n(n-1)}{1 \cdot 2}\, x^{n-2}\, dx + \dots$$

Since we can make $dx$ as small as we please, $f(x) = n x^{n-1}$. Conversely we may conclude, that when the terminal ordinate is $x^n$, the area from a certain initial point must be $x^{n+1}/(n+1)$, as may be shown by substitution in the preceding formula. If we calculate this area first to the point $x_1$, then to $x_2$, we obtain the area bounded by $x_1$ and $x_2$:

$$\int_{x_1}^{x_2} x^n\, dx = \frac{x_2^{n+1} - x_1^{n+1}}{n+1}.$$

This is the formula applied in the text.

1 may be determined See pp. we find : It follows at once that when in two functions the values of a are the same between the limits ATj and v and the functions between these limits can be expressed by a limited series containing only X powers of x. . et stq. In that case we have the two equations from which follows This can be true only when the series of c . cn will be the same. It appears also that the constants of the function from (4) by successive elimination. a*. aj. 26. and desig- nate these powers byajj. the functions c lf c 2 because the constants between these limits are also the same. as . initial 1 point.INTRODUCTORY 13 If we determine the powers with A".
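The first step of the method of § 4, the calculation of the averages of the powers of an observed series, may be illustrated by a short Python sketch. It covers only the observed averages $[X], [X^2], [X^3]$; it does not carry out the determination of the constants $c$ by elimination. The series and the function name `average_powers` are hypothetical and serve only as an example.

```python
from fractions import Fraction

def average_powers(values, max_power):
    """Return the averages [X], [X^2], ..., [X^max_power] of an observed series."""
    n = len(values)
    return [sum(Fraction(v) ** r for v in values) / n
            for r in range(1, max_power + 1)]

# Two hypothetical series containing the same variates in a different order.
series_a = [1, 2, 2, 3]
series_b = [2, 3, 1, 2]
# A third hypothetical series with the same average but a wider spread.
series_c = [0, 2, 2, 4]

powers_a = average_powers(series_a, 3)
powers_b = average_powers(series_b, 3)
powers_c = average_powers(series_c, 3)

print(powers_a == powers_b)        # same distribution, same average powers
print(powers_a[0] == powers_c[0])  # the averages agree
print(powers_a[1] == powers_c[1])  # the averages of the squares differ
```

Exact fractions are used so that equal averages compare as equal without rounding. The third series shows why the average alone does not characterize a distribution: it is the higher powers that distinguish it from the first.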

II. COMPARISON BETWEEN LIMITED SERIES OF OBSERVATIONS AND THE UNLIMITED SERIES OF VARIABLES

A. Properties of Averages

§ 5

IN the method outlined in § 4 it is presupposed that we know the exact values of the averages of the powers of $X$. However, in a number of limited series we do not expect to obtain uniformly the same relative frequency for each variate, because there will be accidental differences due to the limited number of cases, and a strict correspondence between the distribution of the limited series and of the unlimited series does not exist. For this reason we can not find the exact values of the averages of the powers of $X$, but only approximations which will vary according to the accidental peculiarities of each series. The question arises, therefore, whether we can determine the characteristics of the distribution of the averages of powers of $X$. In investigating this problem we shall use the method suggested in § 4 and try to determine the averages of the powers of those values which express the distribution of the limited averages around the general average. We have called the general average of the $p$th powers of our unlimited series $a_p$. The corresponding special averages of limited series each containing $n$ observations may be called $a'_p$. Then we shall determine the values of $[(a'_p - a_p)^r]$, the powers of the differences between the special averages and the general average of the function. We shall direct particular attention to the values of $[(a' - a)^r]$.

Before we take up this subject, it will be well to discuss a few general properties of averages. The average of the sum of two variables is equal to the sum of their averages.

$$\frac{(y_1 + z_1) + (y_2 + z_2) + \dots + (y_N + z_N)}{N} = \frac{y_1 + y_2 + \dots + y_N}{N} + \frac{z_1 + z_2 + \dots + z_N}{N}$$

$$[y + z] = [y] + [z]. \qquad (5)$$

The average of the product of two independent variables is equal to the product of their averages.

$$[yz] = \frac{y_1 z_1 + y_2 z_2 + \dots + y_N z_N}{N}$$

The members of this sum may be so grouped that all the values $y_r$ that have the same numerical value are grouped together. Then each value $y_r$ will appear as the factor of a sum, which will have the average value of $m[z]$, where $m$ equals the number of occurrences of the respective value $y_r$. Thus every value of $y_r$ appears as the factor of a certain value $m[z]$. We find, therefore,

$$[yz] = [y][z]. \qquad (6)$$

We may now take up the discussion of the values $[(a' - a)^r]$, and we will begin with the consideration of $[(a' - a)]$. For a special series of $n$ observations, the special average is

$$a' = \frac{X_1 + X_2 + \dots + X_n}{n},$$

therefore

$$a' - a = \frac{(X_1 - a) + (X_2 - a) + \dots + (X_n - a)}{n}.$$

We will write $X - a = x$. Since according to definition $a = [X]$, we have $[X - a] = 0$; therefore $[x] = 0$.
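Equations (5) and (6) can be verified on a small example by listing every combination of two variables. The following Python sketch assumes two hypothetical variables with a few equally frequent values each; independence is represented by pairing every value of $y$ with every value of $z$. A final comparison pairs $y$ with itself to show that (6) need not hold when the variables are not independent.

```python
from fractions import Fraction
from itertools import product

def average(values):
    """Average of a finite series, computed exactly."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

# Hypothetical values, each assumed equally frequent.
y = [Fraction(v) for v in (1, 2, 6)]
z = [Fraction(v) for v in (0, 4)]

# Independence: every value of y occurs together with every value of z.
pairs = list(product(y, z))
avg_sum = average(a + b for a, b in pairs)
avg_prod = average(a * b for a, b in pairs)

print(avg_sum == average(y) + average(z))    # equation (5)
print(avg_prod == average(y) * average(z))   # equation (6)

# Not independent: y paired with itself.
avg_square = average(a * a for a in y)
print(avg_square == average(y) ** 2)         # False for this series
```

The grouping argument of the text corresponds to the way `product` repeats each value of $y$ once for every value of $z$.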

Then

$$a' - a = \frac{x_1}{n} + \frac{x_2}{n} + \dots + \frac{x_n}{n}.$$

Each of the $n$ expressions of the form $x/n$ averages zero; hence $[a' - a] = 0$.

We will proceed in the same manner to evaluate $[(a' - a)^2]$. We can write

$$(a' - a)^2 = \left(\frac{x_1}{n} + \frac{x_2}{n} + \dots + \frac{x_n}{n}\right)^2 = \frac{x_1^2}{n^2} + \frac{x_2^2}{n^2} + \dots + \frac{x_n^2}{n^2} + \frac{2 x_1 x_2}{n^2} + \frac{2 x_1 x_3}{n^2} + \dots$$

According to (6), $[x_1 x_2] = [x_1][x_2]$. Since each of these factors averages zero, all their products will be zero. The general average of the expression $x^2$ depends solely on the character of the function which determines the distribution of $X$, and is therefore the same for each of the $n$ squares, the sum total of which is $n[x^2]/n^2$. Thus we find

$$[(a' - a)^2] = \frac{[x^2]}{n}.$$

We will also estimate the mean of the cubes and fourth powers of the differences between the special averages and the general average, viz., $[(a' - a)^3]$ and $[(a' - a)^4]$. As before,

$$(a' - a)^3 = \left(\frac{x_1}{n} + \frac{x_2}{n} + \dots + \frac{x_n}{n}\right)^3.$$

hence If we call and we have (7) n _ iy We will designate the values of V/n Then we may write (7*) [('-.0. Here again all which contain a first those products in the expansion of the polynomial power of x will be zero. Therefore.7- .COMPARISON BETWEEN SERIES In expanding 17 x /n 3 first . this expression we obtain n terms of the form All the other terms will contain squares of x multiplied by powers of x &W-MW-.
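The results $[a' - a] = 0$ and $[(a' - a)^2] = \sigma^2 / n$ of equation (7) can be checked exactly for a small case by listing every possible series of $n$ observations. The Python sketch below assumes a hypothetical variable with four equally frequent values and series of $n = 3$ independent observations; the names are illustrative only.

```python
from fractions import Fraction
from itertools import product

def mean(values):
    """Average of a finite series, computed exactly."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

# Hypothetical variable: four equally frequent variates.
population = [Fraction(v) for v in (1, 2, 3, 6)]
n = 3

a = mean(population)                              # general average
sigma2 = mean((x - a) ** 2 for x in population)   # [x^2], with x = X - a

# Every ordered series of n independent observations, all equally likely.
samples = list(product(population, repeat=n))
errors = [mean(s) - a for s in samples]           # a' - a for each series

mean_error = mean(errors)
mean_sq_error = mean(e ** 2 for e in errors)

print(mean_error)                    # 0
print(mean_sq_error == sigma2 / n)   # True
```

Because all $4^3 = 64$ series are enumerated, the agreement is exact rather than approximate; with larger $n$ a random sample of series would have to take the place of the full list.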

however.2.g(1. express.' .r*-i. we find.18 THE MEASUREMENT OF VARIABLE QUANTITIES These averages. . is.. for the number of equal terms . n(n n members of the polynomial expression. all those products which contain the first power of an x will be zero.r ----. products will be the greater. and for u will be p./2 ).-/. Therefore only those values need be considered for which u -J. the maximum value In this case all the p exponents are the same. x must not appear with an exponent less th#n 2. every [<] =< i For any series containing only different values of r rz r u-v 2P ~ r . therefore. We is It We n n n write If we expand this polynomial term we can [(n . will now proceed to evaluate the higher powers of (a' a). .Lr 2p n L (1 2 r. the mean values a that may be expected in a of the powers of the differences a' series of 71 observations and may. therefore.. (1-2. r r -ir-^~ x xtu ri~ r2~ut 1 r u-A ) I .2. ._ n) 2 ' J ]= . which may be used to determine the characteristics of the distribution of a.1 is greater than any value lr these As long as n . r > 1 .. because the average value of every x is zero. the x greater u we find.. convenient to treat the even powers and the odd powers sepawill first determine rately. 2 among tain the a total of therefore. be called expected errors. - n(n ' 1) (n u -f 1) > (1. at the same time. the values of x represent the same function.) In our particular case... v X Since all fr r ir r 2 (Xt^t. . Since. u + 1) different combinations which con(n 1) same product l members for which the If in this series there are lv 12 v exponnumber their will be ents are the same.

. Therefore the smaller L..l)(2j-2)(2j-3) 1-2 1-2 2- l)n(n-l). must be According to the remark made on p. the more nearly will all the values a be If n is assumed sufficiently large. L.. if we introduce again a- = 1/n (8) ={(2^- contain a less For any value u < p the product of n(n u + 1) will (h 1) number of members than occur when every member .. If we designate their the deviations in any class must be limited. and the factors of our product containing the will be divided by 1/71*"". (. 5 nearly of the same order. all the small as compared with n.COMPARISON BETWEEN SERIES The average value of each x* is 19 a* .+ 1) T-2 ~T^T7T" and. has the exponent 2 powers of e.p the approximation . limit by L. since all the values I (1.-.2 1 n 2* J2p(2j-. e. For large values of . 2 These approximations are pared with n. The same (9-) consideration shows that for large values of n [('-) *+'l = 0. e4 .3-l}f*. terms containing n in the denominator may be neglected. r=2 [ n*|_ therefore.. Now it can be shown that the values of t^ 4.. sufficient as long as p is small is when comnot correct. and thus we (9) find [(a' -a)*] = {(2p-lX2p-3). 3.

i_ _ J-. ^] are 2p+1 For odd powers [(' a) ]..* Ztrt y=--.. Since the aver- For even powers [(' age of the last of these averages equals zero.3) 2 .. 0-1/2 V2-7T r+" e' -** (x + bx* -f cx*)dx = be + 3ce = 0.. or by neglecting terms of orders higher than 1/1/n. (1 2 -. the values of not changed by introducing elements of the order 1/l/n. (11) = (2p + l)(2P Vn we may write 1 . 3 lAr .. values of this order originate 2 through the combination of (p 2) elements of the order x .1) .> a) ]. one element of the order x* and one element of the order x. because the even responding to the exponential function. -. 2... . the values of the order 1 / i/n origi2 nate by the combination of (p 1) elements of the order x and one 3 element of the order x . n 2* (1 -.1 2)(1 2) .20 THE MEASUREMENT OF VARIABLE QUANTITIES This series of averages of powers corresponds to the average powers of the exponential formula 1 ~ (q'-a? We shall next determine the function which arises when we con- sider terms of the order l/i/?i.7. <ri/27r (ar + to 4 -f- cx 6 )dx = 36e 4 -j- 1 5ce6 C = -^. (p ..s= which must be its (\+bx + c^\ moments give values Then we have cor- 0-J/27T type..5 r f v _ In order to determine the function corresponding to these average powers. 2 [a. _. 2 4 /+ 1 e 2<r * /_... . -l).

elements of the order 1/n the denominator n. originate through [(a' . a ) 2/>] are found by including in the expression (8) terms with Besides these. We will also consider the form that our function takes.3) - 3 1 }e"' 1-2-3-4-71 1 + "The' function giving these average odd and even powers has the form (12) " ' fjJC " 1-r OC 4 3 le"5-3 . provided the members of the order 1/u are considered.a) -] = 2 {(2p - l)(2p . 4 element x . and two elements x Thus we obtain - [(a' .COMPARISON BETWEEN SERIES 21 and 1 -*' 1- The odd moments of this function have the values ' _ ~ i/n and therefore agree with the powers demanded in (11). These leave The first two terms of even powers the odd powers as found before. and also through the combination of 3 . (p 3) elements 2 jc . s the combination of (p 2) elements X and one.

be shown that differences between averages of higher powers do not materially alter the function expressed by these average powers... 3 l}d^ = 0.{(4m {(4m 3) (4m .2r .3) - - .27^1)} x {(2m . 4- + .3-l}' By consecutive elimination of the r -^ first 1 first elements.+ {(2m + l)(2m 1) .4) (2< . we find for the tth element of the remaining equation the coefficient and by subsequent elimination of the r tth last elements we find for the element of the first .. .22 THE MEASUREMENT OF VARIABLE QUANTITIES As soon as p is large when compared with n.. however... -f . . functions of the forms Supposing we have two el/ Z.2<) (2m . and 1 We will assume that their 2m 1 first moments are the same. . .2) x and by taking t remaining Aquation the coefficient . {(2m - l)(2m .2t -f . 4. If we call a a =d s and the difference between the 2mth moments 1 </+! d d2 e 2 + 3 Irf/ + . . = 0. (2m . while the higher moments differ.5) (2m - I)}c?2n e 2m = 0.. .(* 2)} l(*-l)(2. (2m + 8 2m {(2m-l)(2m-3).3) .3) = r and r = 2m r.. 5 3 -f -c?/ + .2t . It can. 3}rf2ro e 2 '" + 3d2 c 2 + 5 3d4 e 4 + 7 5 3d/ + -. the approximations are no longer satisfactory. - l)(4m .2)(2< . . .

. but only the approximate value a' derived by averaging 2 It is.. ( . 9 not known..a}} 2 . it can be shown in the same way that their influences are small. determined by the whole infinitely long In any actual series of n observations <r* is relation between cr and <r'.2/-)(2w - 2r - 2) 4 2}.. necessary to establish the the n values (X' a') ..COMPARISON BETWEEN SERIES we have for the coefficient of 23 d^i* . into consideration the differences between still higher moments. we need consider only the last one.2) .3) . If is a we take unless large very very small.. the differences a o.. 2.(a . where In this equation 2 o- is series of observations. may be called errors of observation. Accord- ing to (7*) we have ~ . [(a LV of] = = . therefore. Thus we find 2r 2m+2r {2r(2r is 2r-l) . which is multiplied in the process of elimination by {(2m - l)(2m . (2r - 1)}. 6 Since the particular average a is an imperfect representation of the average value of the function. Since in these consecutive operations the values of the right-hand side are zero. e- n .1 1) } {(2m - 2r) (2m 2r 2) 4 -2 } Since m assumed to be a large value. according to the terminology of 1..a) . 4 {2r(2r 2} {(2m . the values d which are are the changes in a due to the difference in the higher moment e a s and small value. except the last one.

Since

$$\sigma'^2 = \sigma^2 - [(a' - a)^2]$$

and

$$[(a' - a)^2] = \frac{\sigma^2}{n}, \tag{7}$$

therefore

$$\sigma^2 = \frac{n}{n-1}\,\sigma'^2,$$

and by substitution in (7*)

$$\sqrt{[(a' - a)^2]} = \frac{\sigma'}{\sqrt{n-1}}. \tag{13}$$

This equation may be expressed in words as follows: The expected mean square error of the average is proportional to the mean square deviation and inversely proportional to the square root of the number of cases less one.

As an example the measurements of the proportions of length and breadth of head of twenty-five Indians of the interior of British Columbia may be given. The first column in the following table contains the measurements X; the second, their frequencies F(X); the third, the values of (X - a'); the fourth, (X - a')^2; the fifth, these squares multiplied by their frequencies; and at the foot of the columns, the computation of the mean square deviation of the series. The average obtained from the series is 83.4.

n = 25; Average = 83.4.

Mean square deviation = 2.98; mean square error = 2.98/4.9 = 0.608.

That is to say, we may expect that, on the average, the square root of the mean square difference between the average of the typical series and our special series will have the value 0.608, either positive or negative. In this manner the constant in (10) may be determined, and by means of a table, giving the values of the function

$$\frac{1}{\epsilon\sqrt{2\pi}}\; e^{-\frac{x^2}{2\epsilon^2}},$$

the frequency of any particular value of a' may be determined.

If we do not make the approximation from which the exponential formula was derived, it appears that the values $\sigma_3$, $\sigma_4$ must be investigated. These values are not known, and their relation to $\sigma'_3$, $\sigma'_4$ must be determined. The method may be illustrated by the discussion of the value of $\sigma'_3$. We proceed as before (p. 23), but go back to the formula (7) (p. 17), and write for every term

$$X' - a' = (X' - a) - (a' - a).$$

The cubic expansion of the average gives

$$\sigma'_3 = \frac{1}{n}\sum_{i=1}^{n}\left\{(X'_i - a)^3 - 3(X'_i - a)^2(a' - a) + 3(X'_i - a)(a' - a)^2 - (a' - a)^3\right\}.$$

The four terms on the right-hand side of this equation may be averaged singly. For the second term we must determine the average of the expression of the form

$$(X'_1 - a)^2\,\{(X'_1 - a) + (X'_2 - a) + \cdots + (X'_n - a)\},$$

whose factors are not entirely

We will as powers call. . According to (7) Therefore Ko. and ' we have Therefore for a certain X l .X'n 2 f- a)}~\ Therefore the whole term will equal 3(a' a) . for In the third term the sum [{(X( any particular a' equals n(a 3 + (X' -a) + ---.26 THE MEASUREMENT OF VARIABLE QUANTITIES all independent of one another. * -/J=^-- 33 Q Oo. 3. If we arrange these so that have the same value are grouped together. we can write X^ that Since we have not made any conditions for X X 2. .of(a' - )] = and for all possible X a) a). X n. the average of each of them will be a. heretofore. It and the deviations from the average x =X a. ?i _3 n (14) 7 seems desirable to discuss at this point a few properties of the of a function.

' ~ 2 . for instance. . since we have to take the averages only of the powers of For the table given the small integers z. which we We =d and then c 4.z =a z = x -f a*. instead of the fraction x. 4 (15) +6a'[V] +4a[x] In calculating the values of will also call c [of] it is convenient to measure A" will call c. ] -f 1 applying formula (16) the calculation of [V] may be much simplified.COMPARISON BETWEEN SERIES It follows that 27 M- ] <] + =a W. as above. and. [*]--*. By in 6 we may assume. from an arbitrary unit situated near a a. d.

Then the original table may be arranged as follows:

exponential law. If we imagine the series of observations recorded in intervals of if the number of cases in the interval of A" <//2 and A* rf/2 and is /(JT d/4). The question therefore arises as to how far the measurements grouped in such intervals agree with the theoretical series or what corrections have to be made. was In many cases this is the only feasible method of taking the measurements. the greater n 9 most convenient way of representing the distribution of the measurements of a variable is by counting the number of measurements 4hat occur within convenient inIt stated on p. the average obtained under these conditions .COMPARISON BETWEEN SERIES If we call write <r' r 29 o-. ar must be distributed the more nearly according to the r is. 5 that the tervals. = o- r -f e r and consider e r as a small value. we have. we can <r and j r = For o- the value will be Here <r r and <7 2r are known only approximately and we may esti- mate more strictly H By substitution in (17) /2r <r It is o-'J that easy to prove in the same way as has been done for a' a. being designated by ad/ In order 3 to compare this average with the average obtained when .

30 THE MEASUREMEKT OF VARIABLE QUANTITIES d are used. this We ' must always be the case (see can show in the same way tHat 2 2/0 p.-2/_ 2 -i. +c + r c2 -f c4 + . By the same formula *2/6 we find 2/3 = all ' ' ' + C-o^O + C o2/2 +C 22/4 +C + and by adding these values.) x ana therefore ' ' (y 4 + 3U + -2/_ 3 jri +y 2/ 3 > +-X - + 2/. intervals we may write The second part of our series is determined by the value of The negative element on be expressed by -2/_ 2 the right-hand side of this equation may + 2/0 + 2/ +2 + S/+4---( Then provided the interpolation formula 2) is valid where the values c designate certain constants. + 2/. 5). . s t x (. + 2/_ + 3 y_! + 2/1 + y +.. According to our definition of vari- X ability.2 -f 2/o + 2/2 + 2/4 + = + 2/-i + 2/i + + It follows from this that the difference between these two values is . .2/ -f S/ 2 + 2/4 +) This presupposes that the frequencies for large positive or negative values of will be very small.+ + 2/2 + 2/4 + = ( c_.

. We mula can calculate applicable./.COMPARISON BETWEEN SERIES zero. provided that the interpolation formula adequately repre^entH the frequency function between the limits of any A" and A" + <l. provided the interpolation for- is + d s It has been shown before that the second term of the d- sum is zero. in the same way. 31 and that (20) a_ d = ad r . + and by continued subdivision (23) . therefore -ii + By continued division we find \//2 l6' = o "j/4 + & 64 and (21) - ff2 + In the same way we find (22) < for and < = <.

For example, for the table on p. 24 we found for interval 1 the value of $\sigma^2 = 8.91$. The corrected value would be, therefore,

$$\sigma^2 - \tfrac{1}{12}d^2 = 8.91 - 0.08 = 8.83.$$

The actual results for the calculation of the average and of the mean square variation may be seen from a grouping in larger intervals of the material given on pp. 5 and 6.

1055-1075

COMPARISON BETWEEN SERIES 1045-1095 i 33 3 .
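The two numerical results quoted for the table on p. 24 (the error of the average, 2.98/4.9 = 0.608, and the corrected mean square, 8.91 - 0.08 = 8.83) can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, and the correction term d^2/12 is inferred from the printed arithmetic for a unit interval, not quoted from the text.

```python
import math


def error_of_average(mean_square_deviation: float, n: int) -> float:
    """Expected error of the average, sigma' / sqrt(n - 1), following (13)."""
    return mean_square_deviation / math.sqrt(n - 1)


def corrected_mean_square(mean_square: float, interval: float) -> float:
    """Mean square corrected for grouping (assumed form: sigma^2 - d^2 / 12)."""
    return mean_square - interval ** 2 / 12


# Values quoted in the text: sigma' = 2.98, n = 25, sigma^2 = 8.91, d = 1.
print(round(error_of_average(2.98, 25), 3))
print(round(corrected_mean_square(8.91, 1.0), 2))
```

A wider interval d makes the correction grow with the square of d, which is why the grouping in larger intervals is worth comparing separately.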

To solve this question a number of simple definitions and propositions relating to the theory of probabilities are required.

1. Probability p is called the ratio between the frequency f of an event and the total number of cases n in which the event may occur,

$$p = \frac{f}{n}. \tag{24}$$

Since the frequency f can never be less than zero nor more than n, it follows that p is always a proper fraction. If I throw one 10,000 times among 60,000 casts with an ordinary die, the probability p of the throw one is 10,000/60,000. It follows from this definition that the frequency is equal to the product of the probability and the number of cases,

$$f = pn. \tag{25}$$

2. If one event (A) has the probability $p_1$, another (B) the probability $p_2$, the probability that either the event A or the event B will occur is

$$P = p_1 + p_2. \tag{26}$$

In a number of cases n the event A occurs with the frequency $f_1$, the event B with the frequency $f_2$. If the two events are entirely independent, that is, if they can not occur together, their combined frequency, that is, the frequency with which either one or the other event may be expected, will be $f_1 + f_2$, and their probability, therefore, $(f_1 + f_2)/n = p_1 + p_2$.

3. If one event has the probability $p_1$, another entirely independent event the probability $p_2$, the probability that both will occur jointly is

$$P = p_1 p_2. \tag{27}$$

In a number of cases n the two events A and B are to occur. There are $f_1$ cases in which event A occurs. Among these, event B may occur $f_1 p_2$ times, so that $f_1 p_2$ gives us the frequency of all the cases in which both events occur. Their probability is, therefore, $f_1 p_2 / n = p_1 p_2$.

Provided a certain event has the probability p, this probability being its relative frequency in an infinitely long series, then the

probability that the event does not occur will be 1 - p. This value may be called q. If we indicate the occurrence of the event by 1, its non-occurrence by 0, then in a series of n observations the following groups of combinations may occur.

I. The event does not occur at all among n cases:

000 ... 000

II. The event occurs once, and does not occur n - 1 times:

100 ... 000
010 ... 000
001 ... 000
...
000 ... 100
000 ... 010
000 ... 001

III. The event occurs twice, and does not occur n - 2 times:

110 ... 000
101 ... 000
...
011 ... 000
etc.

In the first case, every non-occurrence has the probability q; consequently the probability that the event does not occur n times is, according to (27), $q^n$. In Group II, the single case of occurrence has the probability p. The n - 1 cases of non-occurrence have each the probability q; consequently every combination has the probability $pq^{n-1}$. In the same way we find that in Group III each combination has the probability $p^2 q^{n-2}$, and for Group r,

$$p^r q^{n-r}. \tag{28}$$

Since, for our particular purpose, the order of occurrence and non-occurrence is irrelevant, so that we want to know how often among a group of n observations the event occurs r times, we must consider the various combinations singly. The total probability of r occurrences among n cases is found, according to (26), by adding the probabilities of all the cases in each group. It is obvious that there is only one case when the occurrence does not

take place at all. In Group II, the case of occurrence may be in n positions, namely, 1st, 2d, 3d, etc., to the nth, so that the probability of one occurrence out of n observations will be $npq^{n-1}$. In Group III, the first occurrence may be found in all n positions possible; there will be n - 1 positions for the second occurrence. There are, therefore, in all n(n - 1) cases. It is, however, obvious that here those cases have been counted twice in which the first and the second occurrences have exchanged positions. For instance, the case where the first occurrence is in first place, the second in second place, is identical with the case where the first occurrence is in second place, the second in first place. There are, therefore, only one-half of the total combinations obtained, namely, $n(n-1)/1\cdot2$. We can continue in this manner and find that for the next group the number of different combinations is $n(n-1)(n-2)/1\cdot2\cdot3$, and generalized for Group r

$$\frac{n(n-1)(n-2)\cdots(n-r+1)}{1\cdot2\cdot3\cdots r},$$

and, since the probability of each is, according to (28), $p^r q^{n-r}$, we have the total probability of a combination of r occurrences and n - r non-occurrences,

$$P_{r,n} = \frac{n(n-1)\cdots(n-r+1)}{1\cdot2\cdots r}\,p^r q^{n-r}. \tag{29}$$

A better proof of this formula may be given in the following manner. Provided $F_{r,n}$ is the frequency of the combination of r occurrences among n observations, then we may determine its frequency among n + 1 observations. In the one added case the event shall not occur. The one added case may be in the first, second, ... (n + 1)st position, and for each of these positions $F_{r,n}$ combinations will occur. Thus will originate $(n+1)F_{r,n}$ combinations. In every one of these, however, the place of the one additional case of non-occurrence can be taken by the n - r cases of non-occurrence in the original series, so that, including the additional case, it appears, from what has been said before, that there are in each group n - r + 1 cases in which occurrences and

non-occurrences follow in the same order. The total number of combinations is, therefore,

$$F_{r,n+1} = \frac{n+1}{n-r+1}\,F_{r,n}.$$

The group which has only one combination contains the case in which the event occurs every time, that is, if we take n = r, $F_{r,r} = 1$. For $F_{r,r+1}$ we find $(r+1)/1$, for $F_{r,r+2}$ we find $(r+1)(r+2)/1\cdot2$, and for $F_{r,n}$

$$F_{r,n} = \frac{(r+1)(r+2)\cdots n}{1\cdot2\cdots(n-r)}.$$

By multiplying numerator and denominator of the value found for $F_{r,n}$ by $1\cdot2\cdots r$ and cancelling the common factors we find, therefore,

$$F_{r,n} = \frac{n(n-1)\cdots(n-r+1)}{1\cdot2\cdots r}.$$

Since, according to (28), the probability of each combination of Group r is $p^r q^{n-r}$, we find for the total probability of r occurrences among n observations

$$P_{r,n} = \frac{n(n-1)\cdots(n-r+1)}{1\cdot2\cdots r}\,p^r q^{n-r}. \tag{29}$$

This probability is equal to the (r + 1)th term of the binomial expansion $(q + p)^n$; for this reason the law expressing the probability of r occurrences in a series of n observations of an event which has the probability p is called the binomial law.

It is important to investigate a few of the characteristics of the binomial law. First of all we will determine for what value of r the

probability reaches a maximum, that is, for what value of r

$$P_{r,n} > P_{r-1,n} \quad\text{and}\quad P_{r,n} > P_{r+1,n}.$$

We have

$$P_{r,n} - P_{r-1,n} = \frac{n(n-1)\cdots(n-r+2)}{1\cdot2\cdots(r-1)}\,p^{r-1}q^{n-r}\left(\frac{n-r+1}{r}\,p - q\right),$$

therefore $P_{r,n} - P_{r-1,n} > 0$ when $(n-r+1)p > rq$, or $r < np - q + 1$; and

$$P_{r,n} - P_{r+1,n} = \frac{n(n-1)\cdots(n-r+1)}{1\cdot2\cdots r}\,p^{r}q^{n-r-1}\left(q - \frac{n-r}{r+1}\,p\right),$$

therefore $P_{r,n} - P_{r+1,n} > 0$ when $(r+1)q > (n-r)p$, or $r > np - q$; that is to say, r is the integer between np - q and np - q + 1. If np is an integer, this value of r is np. If np is not an integer, it is the nearest integer under or over np. Thus we have found that the greatest probability belongs to that frequency of the event which corresponds most accurately to its probability.

§ 12. In order to gain a better insight into the distribution of those frequencies which differ from the typical frequency, we will compare

those values which are equidistant from the maximum frequency, which, for convenience sake, may be considered an integer np. We have

$$P_{np+s,n} = \frac{n(n-1)\cdots(nq-s+1)}{1\cdot2\cdots(np+s)}\,p^{np+s}q^{nq-s}, \qquad P_{np-s,n} = \frac{n(n-1)\cdots(nq+s+1)}{1\cdot2\cdots(np-s)}\,p^{np-s}q^{nq+s},$$

and therefore

$$\frac{P_{np+s,n}}{P_{np-s,n}} = \frac{(nq+s)(nq+s-1)\cdots(nq-s+1)}{(np+s)(np+s-1)\cdots(np-s+1)}\cdot\frac{p^{2s}}{q^{2s}}.$$

If we assume s as so small in comparison to np and nq that the fractions s/np and s/nq may be disregarded, the value is nearly equal to one, that is, the distribution will be nearly symmetrical. If we consider s as so small in comparison to np and nq, that the higher powers of s/np and s/nq may be disregarded, we have

$$\frac{P_{np+s,n}}{P_{np-s,n}} = 1 + \frac{s}{nq} - \frac{s}{np} = 1 - \frac{s(q-p)}{npq}.$$

It appears from this equation that the degree of asymmetry will be the same for large values of np and nq as long as the proportion

It also appears that the more nearly p and q are equal the greater will be the symmetry of the whole series.. a P9 ' It follows from this that the sum of all the probabilities of the event * . We have then 1)] [(__ s 1) + ( _ 2) + .. nq~ . ..2.+ !)]+ _ a[(a-l) + (a.. We than s will next estimate the probability of the event occurring less times among ?i times..40 of s THE MEASUREMENT OF VARIABLE QUANTITIES and npq remains the same. We may also determine the same proportion when we consider the next higher powers of s/nq and s/np. or. .(-._!)[>+ (s. .2)-f-.3 Thus we find -* P l+ n | rj.(n. if are disregarded (31) onpq/ 13 .((* + It can easily be shown that the third terms in numerator and de- 1 nominator ( 1.'. n(n-l)..2)+.8 +l) 1-2.(.a + +(.

14 We will finally compare the frequency at the point np + with .. +1 up._ np+s? _ i + 2) If we assume that (** - ?)/"/> and W. the last product will be a fraction as soon as n It follows that from a certain value of n is taken sufficiently large. that at the point np + 2 . we have _ We will call . Since 9 is a fraction.COMPARISON BETWEEN SERIES occurring 0.. 2 (np + .(tig-. 41 2 times It is easy to show that the numerator decreases with increasing n. 1.) l). ")/ n ? are 80 sma11 that their higher powers may be disregarded.+ l) . the product (np/q)' q becomes smaller and smaller therefore the value 2j(PJ will also become smaller and smaller with increas: n ing values of n.

42 THE MEASUREMENT OF VARIABLE QUANTITIES and assume s^ = rd. . all th^e Since. 2npq = d. which implies also an n the third and fourth terms may be disregarded. all rip the probabilities for the points np + rd -f 1. .d(d + 1 l)(2d + l)(2r 1 2 3 Znpq + 1) d(d + 1 l)(p 2 -)- . by designating is ' Therefore if d 2P(32) (> -f l)cZ by P r. according to limits of rri can be made as small as terms beyond sufficiently large we desire.9) We will add up + rd + 2.q) o)~| I.-f 1 3( ^ 2npq \'2npq The ratio between the second and third terms of this expression -(-1) rf(r + p) ' 3(r+p) 2r and the value of this ratio lies between d : 3/> (for r 0) and d : |- (for r = oo). sufficiently large. In the same way the ratio between the second and be shown to be between last terms can taken sufficiently large. + Qnpq [d\2r 1) d(r ^ + ^>) 1- 2?. + (r + l)d. we have designating the sum of the values P by np+(r-i-l)d t13. the sum of the probabilities between rd and and. we can say If we assume d = c\/npq.

we find that

[...]

In the same way we find for a second series

[...]

and since this equation must hold for any value of r and c,

$$P_{np+rc\sqrt{npq}}\,\sqrt{npq} = P'_{n'p'+rc\sqrt{n'p'q'}}\,\sqrt{n'p'q'}. \tag{33}$$

In other words, for sufficiently large values of np and nq the probabilities for points removed from the most probable point by equal multiples of $\sqrt{npq}$ are inversely proportional to this value. According to (32) we have, by substituting for d,

[...]

According to (33) this value is constant, and therefore equal for series in which $\sqrt{npq}$ has the same value. In other words, for sufficiently large values of np and nq the total probability between two limits is always the same when the limits are the same multiples of $\sqrt{npq}$. This value is, therefore, the standard by which the probabilities of deviations from the normal frequency are measured, and may be designated by $\sigma$, and is called the standard deviation.

We may now summarize the results so far obtained. We found that when a limited series of n observations is taken, which is a representative of an infinitely long series in which a certain event occurs with the probability p, then the most probable frequency of the event will be np. Deviations from this frequency are found with frequencies corresponding to the binomial law. By a series of approximations it has been found that, if np and nq are large values, the distribution of frequencies will be very nearly symmetrical around the value np, and the frequencies of points far removed from the value np will be negligible. In this case frequencies of deviations from np will be, in various series, inversely proportional to $\sqrt{npq}$, when

the deviations are the same multiples of $\sqrt{npq}$; and the total probability of finding any deviation between two limits will be the same as long as the limits remain the same multiples of $\sqrt{npq}$. If, therefore, a table is computed for the binomial terms, the probability of any deviation, and also the probability of finding any deviation inside of certain limits, can be determined. For large values of $\sqrt{npq}$ a single table will be sufficient.

For our particular purpose it is desired to know how probable it will be that the frequency of an event will lie within certain limits deviating from the desired value np. We may wish to determine, for instance, how often, in a number of series of n observations, we may expect a frequency that lies between np - x and np + x. For large values of np and nq it has been calculated that the probability of finding a frequency between np - x and np + x is

0.683 for $x = \sqrt{npq}$,
0.954 for $x = 2\sqrt{npq}$,
0.997 for $x = 3\sqrt{npq}$.

It appears, therefore, that in the last case we may be almost certain that the true value will not differ from the empirical value obtained from n observations by more than $3\sqrt{npq}$. We conclude that among 1,000 such series the frequency can be expected only about three times to fall outside of these limits. It is, therefore, reasonably certain that the value p, which we can not determine by direct observation, lies between these limits. Since the probability of this difference is determined by the measure $\sqrt{npq}$, we may call this value also the standard error and designate it, as has been done in the preceding pages, by $\epsilon$.

The following table of probabilities of deviations exhibits the degree of accuracy of the approximation discussed in §§ 11-14 when the value of n is not large:

1000 Probabilities. Approximation.042 0.'.6 11 1 No. P T5 .998 0.132 100 0.732 . l/njjfl 3 No. of Occurrences.132 0.866 0.970 0. according to both formulas. If we 1 calculate the probabili- of positive and negative groups separately. IH tUiS I I Formula.042 6-14 1-19 0.870 0.998 90-110 80-120 0. Binomial Formula.. Approximation. . the approximation gives a symmetrical formula.732 0.COMPARISON BETWEEN SERIES 7 4. while the binomial formula is the more asym- According to what has been said metrical the ties smaHer the value of n. of Occurrences. before. these differences appear clearly.. n=100 Probabilities. 10 0. \*pq-9.
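The comparison between the binomial formula and the approximation can be repeated numerically. The following Python sketch evaluates the binomial law (29) directly, locates its most probable frequency, and compares the exact probability of a frequency within k multiples of the standard deviation with the value given by the exponential law. The function names and the values n = 400, p = 0.3 are illustrative assumptions, not taken from the table.

```python
import math


def binomial_probability(r: int, n: int, p: float) -> float:
    """P(r, n) of formula (29): probability of r occurrences among n cases."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)


def most_probable_frequency(n: int, p: float) -> int:
    """Value of r with the greatest probability, found by direct search."""
    return max(range(n + 1), key=lambda r: binomial_probability(r, n, p))


def probability_within(k: float, n: int, p: float) -> float:
    """Exact probability that the frequency lies within k * sqrt(npq) of np."""
    centre = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return sum(binomial_probability(r, n, p)
               for r in range(n + 1) if abs(r - centre) <= k * sigma)


def exponential_approximation(k: float) -> float:
    """Probability within k standard deviations under the exponential law."""
    return math.erf(k / math.sqrt(2))


n, p = 400, 0.3  # illustrative values, not taken from the text
r_max = most_probable_frequency(n, p)
print(r_max, n * p - (1 - p), n * p + p)  # r_max lies between the two limits
for k in (1, 2, 3):
    print(k, round(probability_within(k, n, p), 3),
          round(exponential_approximation(k), 3))
```

Because the binomial frequencies are whole numbers, the exact sum includes every frequency up to the limit, so it runs slightly above the continuous approximation for small k; the gap narrows as n grows.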

If we call

$$[X^r]_n = \sum_{X=0}^{n} X^r\,\frac{n(n-1)\cdots(n-X+1)}{1\cdot2\cdots X}\,p^X q^{n-X},$$

designating the number of cases from which the average is derived by an inferior mark, we have

$$[X^r]_n = np\left\{1 + (r-1)[X]_{n-1} + \frac{(r-1)(r-2)}{1\cdot2}[X^2]_{n-1} + \cdots + [X^{r-1}]_{n-1}\right\}.$$

In this manner the successive values of $[X^r]$ may be calculated. Thus the following values are found:

$$\begin{aligned}
[X] &= np\\
[X^2] &= np + n(n-1)p^2\\
[X^3] &= np + 3n(n-1)p^2 + n(n-1)(n-2)p^3\\
[X^4] &= np + 7n(n-1)p^2 + 6n(n-1)(n-2)p^3 + n(n-1)(n-2)(n-3)p^4\\
[X^5] &= np + 15n(n-1)p^2 + 25n(n-1)(n-2)p^3 + 10n(n-1)(n-2)(n-3)p^4 + n(n-1)(n-2)(n-3)(n-4)p^5\\
[X^6] &= np + 31n(n-1)p^2 + 90n(n-1)(n-2)p^3 + 65n(n-1)(n-2)(n-3)p^4 + 15n(n-1)(n-2)(n-3)(n-4)p^5 + n(n-1)(n-2)(n-3)(n-4)(n-5)p^6
\end{aligned} \tag{34}$$
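The average powers of the binomial law can be checked against one another numerically. The sketch below computes $[X^r]$ in three ways: by direct summation over the binomial terms, by a recursion that expresses the average for n cases through the averages for n - 1 cases, and, for the fourth power, by the closed form with the coefficients 1, 7, 6, 1 that appear in (34). The function names and the test values n = 10, p = 0.3 are hypothetical, and the recursion is written in the standard form, which is assumed to be the one intended by the text.

```python
import math


def raw_moment_direct(r: int, n: int, p: float) -> float:
    """[X^r] for n cases, summed directly over the binomial law."""
    q = 1 - p
    return sum(x ** r * math.comb(n, x) * p ** x * q ** (n - x)
               for x in range(n + 1))


def raw_moment_recursive(r: int, n: int, p: float) -> float:
    """[X^r]_n = n p * sum over k of C(r-1, k) [X^k]_(n-1), with [X^0] = 1."""
    if r == 0:
        return 1.0
    if n == 0:
        return 0.0
    return n * p * sum(math.comb(r - 1, k) * raw_moment_recursive(k, n - 1, p)
                       for k in range(r))


def falling(n: int, k: int) -> int:
    """Product n (n - 1) ... (n - k + 1)."""
    return math.prod(n - i for i in range(k))


def fourth_moment_closed_form(n: int, p: float) -> float:
    """[X^4] written out with the coefficients 1, 7, 6, 1."""
    return (n * p + 7 * falling(n, 2) * p ** 2
            + 6 * falling(n, 3) * p ** 3 + falling(n, 4) * p ** 4)


n, p = 10, 0.3  # illustrative values
print(raw_moment_direct(4, n, p))
print(raw_moment_recursive(4, n, p))
print(fourth_moment_closed_form(n, p))
```

The three printed values should agree up to rounding, which is a convenient check on any coefficient that is hard to read in the printed formula.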

we obtain [OK- [6)1 -"(?)+*(*)'If ?i is assumed sufficiently large we find thus the approxiroationi [Ol-W .COMPARISON BETWEEN SEMES 47 From these values the average of the power* around the average value of [A'] can easily be found by applying ( 1 6) or the corresponding formula. developed from and we find (35) If \ve determine these values in relation to the total number of cases n.
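The later pages cite (35) for two results: the mean square deviation of the binomial law is npq, and its mean cube deviation is npq(q - p). Both can be verified by summing the powers of the deviations from np over the binomial terms. The following is a minimal Python sketch; the function name and the values n = 20, p = 0.2 are assumptions chosen only for illustration.

```python
import math


def central_moment(r: int, n: int, p: float) -> float:
    """Average r-th power of the deviations from np under the binomial law."""
    q = 1 - p
    mean = n * p
    return sum((x - mean) ** r * math.comb(n, x) * p ** x * q ** (n - x)
               for x in range(n + 1))


n, p = 20, 0.2  # illustrative values
q = 1 - p
print(central_moment(2, n, p), n * p * q)            # mean square deviation
print(central_moment(3, n, p), n * p * q * (q - p))  # mean cube deviation
```

The mean cube deviation vanishes when p = q, which is the symmetrical case discussed in the text.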

The considerations relating to special averages made in § 5 can easily be modified so as to express the sum of a number of special values representing a function. If a is the general average of a function and $S_n$ the sum of n observed values, the expansion of the corresponding term (see pp. 18, et seq.) leads to the formula

$$P(S_n) = \frac{1}{\sigma\sqrt{n}\,\sqrt{2\pi}}\; e^{-\frac{(S_n - na)^2}{2n\sigma^2}}. \tag{37}$$

Now we will consider as our function the theoretical distribution of chance occurrences of the probability p in a series containing n observations. We have found that these are distributed according to the binomial law, that their average is np (34), and their mean square deviation npq (35). If we consider the sum S of m such series, each consisting of n observations, we find, according to (37),

$$P(S - mnp) = \frac{1}{\sqrt{mnpq}\,\sqrt{2\pi}}\; e^{-\frac{(S - mnp)^2}{2mnpq}}.$$

The deviations S - mnp may, however, be also considered as the theoretical distribution of chance occurrences of the probability p in a series containing nm observations, and as such follow the binomial law. From this it follows that the binomial law for a series containing a sufficiently large number of cases may be adequately expressed by the exponential formula, in which we substitute for $\sigma$ the value $\sqrt{npq}$, the mean square of the deviations of the terms of the binomial law from their average.

It is easy to show that the various approximations which we developed in §§ 11-14 hold good for the exponential formula. The probabilities are distributed symmetrically, because their values depend upon $x^2$ alone. Between two given limits, $x_1$ and $x_2$, we find the number of cases

$$\frac{1}{\sqrt{npq}\,\sqrt{2\pi}}\int_{x_1}^{x_2} e^{-\frac{x^2}{2npq}}\,dx.$$

The value of this integral will be the same whenever $x_1$ and $x_2$ are the same multiple of $\sqrt{npq}$. The exponential law gives us a means of calculating the frequencies inside of certain multiples of $\sigma$. A full and convenient table of the values of these frequencies has been published by W. F.

Sheppard, 'New Tables of the Probability Integral' (Biometrika, Vol. II, pp. 174, et seq.). For our present purposes the following data are sufficient. For values of $x/\sigma$, the probability of occurrences of a deviation between $-\infty$ and $x/\sigma$ will be as follows:

x/σ    Probability      x/σ    Probability      x/σ    Probability
 .00     .5000           .70     .7580          1.80     .9641
 .05     .5199           .75     .7734          1.90     .9713
 .10     .5398           .80     .7881          2.00     .9772
 .15     .5596           .85     .8023          2.20     .9861
 .20     .5793           .90     .8159          2.40     .9918
 .25     .5987           .95     .8289          2.60     .9953
 .30     .6179          1.00     .8413          2.80     .9974
 .35     .6368          1.10     .8643          3.00     .99865
 .40     .6554          1.20     .8849          3.20     .99931
 .45     .6736          1.30     .9032          3.40     .99966
 .50     .6915          1.40     .9192          3.60     .99984
 .55     .7088          1.50     .9332          3.80     .99993
 .60     .7257          1.60     .9452          4.00     .99997
 .65     .7422          1.70     .9554

The probabilities for negative values of $x/\sigma$ may be found by subtracting the values given in the table from 1.

The asymmetry of distribution may here be introduced by the use of (12). When we consider that, according to (35), $\sigma_3 = npq(q - p)$, we find

[...]

This agrees with the asymmetry found at the end of § 14 (p. 45).
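The entries of such a table can be recomputed from the error function, which is one way to check a doubtful figure in the printed copy. The Python sketch below is only illustrative; the function name is hypothetical, and the relation between the probability integral and `math.erf` is the standard one, not something stated in the text.

```python
import math


def probability_below(x_over_sigma: float) -> float:
    """Probability of a deviation between minus infinity and x/sigma."""
    return 0.5 * (1 + math.erf(x_over_sigma / math.sqrt(2)))


for t in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"{t:4.2f}  {probability_below(t):.5f}")

# Negative deviations: subtract the tabulated value from 1.
print(f"{probability_below(-1.0):.4f}  {1 - probability_below(1.0):.4f}")
```

The last line prints the same value twice, which illustrates the rule for negative values of x/σ given above.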

We the law of distribution must always be of such character that the frequency is zero for values far removed from the bulk of the observa- The empirical investigation of great numbers of variables resenting many different kinds of phenomena has demonstrated tions. and the agreement between the two will be the better. Any deviation from the average " x = (<l + y + ') ( d + y ") + . therefore. the will then A The phenomenon be are functions expressing the effect of each cause upon unknown. repthat in a great r many cases their law of distribution conforms quite nearly . each increasing by one small unit. . the greater the number of causes. n.III. the number of contributory elements being very great. . merely a simplification of the existence of many small causes which act according to various laws and \vith varying probabilities. of causes. We can imagine such a result to be brought about by the action of a great many contributory causes which affect the values of our observations. if n is This assumption is large. the value of our measurement ability of each cause Provided that there are n such causes. the . the same as though there were chance deviations from a certain expected result. the conditions will be same as those discussed in the preceding chapter and the result must be a distribution corresponding to the binomial law. (cF + rv ) where the values A 4- d are the averages for each individual con- . and that the prob- coming into action be p. We assume that there are si considerable number. . have also seen that whatever the law of distribution may be. DISTRIBUTION OF VARIABLES AND OF CHANCE VARIATIONS 16 We have shown that the averages of observed series of variables have certain definite relations to the averages of the unlimited series. if n is small to the exponential law. each of which has a slight influence upon the observed phenomenon. more general proof of the applicability o'f the exponential law will be given here. 
w ith the exponential law discussed before that the distribution is.

since all the values of sr may be assumed to be of the same order (see p. 19). For cases in which these consmall as compared p ditions are not fulfilled asymmetrical distributions may be expected. In this case our approximation which results in the ex{>onential for- by 'the 2/y is a large number.are all of the same order of since a2 is a finite value may be indicated . The value a r r is represented in this dM average of the corresponding values of the component functions. by writing /l ^ n ln \Ve will call Since the individual contributory causes are distributed according to a great variety of laws. present form our problem assumes the same form as that of the distribution of special averages of one function which has l>een dis- In its cussed in 5-9 (pp. and as good as long as 7i with n. provided n Sp a large number and also in general If n is large and r >2 this value will approach zero.VARIABLES AND CHAKCE VARIATIONS 51 tributary cause. If these are entirely independent. their order If the values <P and y. it seems justifiable to assume that any group of n the same is 2p functions expressing these laws is on the average as any other group of n 2p functions. 14 f dseq. and y the deviations from the average for each of these contributory causes. mula will hold is long as .). [y'] ) + 2 ( [<r ] + [/''] ) + .

but. 21). owing to the independence of proximation If asymmetries of these constants such results can not be expected. . ficial.. no longer Skew shown that whenever V7 [V] is large as compared with the average A. the constants of will change according to these functions. ordinarily to the binomial law. shows that [or*] higher degrees of asymmetry are considered. the order 1 / i/n are considered we may call ' Kl-n'pq. since this must be true also of the variable when subject to all the contributory causes which w e have assumed to be of the same order. It has often been asymmetry assumed that skew curves will follow the apthe binomial of law. or Therefore. can be found. r-s-i L *q 3J I _ r 2 ~| -t/ n =p q. according to (12). when (p. exists. it follows that the average limit of the contributory It can also be r functions must be less than A/n. entirely artitaken into consideration. Accordto our definition of the values of the variates must be ing variability. if !/[#*] large is great . etc. s3 and n and p is. and. in (35). TBerefore. has been shown to be the mean cube deviation for In this case a certain form of the binomial law the binomial law. n cannot be and we must expect skew distributions. Since these values depend upon the will also exert their influence. character of the component functions. the values s 5 s6 . 4 depends upon s 3. . Then. this relation distributions do not correspond. therefore. which. limited and. which will correspond to the skewness of the function. If the skewness of the order 1/n is however. skew distributions may be expected.as compared with A. and [x ] upon s3 and s4 . The relation between s2 .52 THE MEASUREMENT OF VARIABLE QUANTITIES Formula (12) which determines the skewness of the curve.


QA 276 B637. Boas, Franz. The measurement of variable quantities. Physical & Applied Sci. University of Toronto Library.
