State College of Agriculture at Cornell University
Cornell University Library

HA 29.Y82 1922
An introduction to the theory of statistics
3 1924 013 993 187

The original of this book is in the Cornell University Library.

There are no known copyright restrictions in the United States on the use of the text.

http://www.archive.org/details/cu31924013993187

AN INTRODUCTION TO THE THEORY OF STATISTICS.


AN INTRODUCTION TO THE THEORY OF STATISTICS

BY

G. UDNY YULE, C.B.E., M.A., F.R.S.,

UNIVERSITY LECTURER IN STATISTICS, CAMBRIDGE; AND SOMETIME HONORARY SECRETARY OF THE ROYAL STATISTICAL SOCIETY OF LONDON; MEMBER OF THE INTERNATIONAL STATISTICAL INSTITUTE; FELLOW OF THE ROYAL ANTHROPOLOGICAL INSTITUTE.

With 53 Figures and Diagrams.

SIXTH EDITION, ENLARGED.

LONDON:
CHARLES GRIFFIN AND COMPANY, LIMITED,
EXETER STREET, STRAND, W.C. 2.
1922.

[All Rights Reserved.]

Printed in Great Britain by Neill & Co., Ltd., Edinburgh.

PREFACE TO THE SIXTH EDITION.

That the large Fifth Edition of the Introduction, printed just after the close of the War, has been exhausted in three years is gratifying evidence that, after ten years of service, the volume still continues to hold its own. The present edition has been enlarged by a considerable addition to the Supplement on testing Goodness of Fit, dealing with the application of this method to Association and Contingency Tables, and including also results only recently published. A brief Supplement has been inserted, giving the deduction of the regressions by the use of the differential calculus, and the Supplementary List of References has been extended and brought up to date. In view of the high costs of printing and composition still ruling, it was thought better to maintain the present form, with the additional matter as Supplements, rather than to revise the text, since the latter course would inevitably have necessitated a heavy increase in the selling price. Opportunity has also been taken to correct some minor errors in the text and the answers to questions. The Index has been revised to cover all new matter.

G. U. Y.
June 1922.

PREFACE TO THE FIFTH EDITION.

The rapid exhaustion of the Fourth Edition, by the end of last year, is evidence that, in spite of the slight revision which has been possible in the last few years, the Introduction to the Theory of Statistics has continued to be of service. The present edition has been enlarged by the inclusion of two supplementary notes, on the Law of Small Chances and on the test of Goodness of Fit of an observed to a theoretical distribution respectively, and also of an extensive list of additional references. Their inclusion as supplements, instead of the incorporation of the matter in the text, was decided on as any extensive revision of the text, owing to post-war conditions, would have been disproportionately slow and costly. As the index has been revised to cover all the new matter, it is hoped that this will cause little inconvenience.

Owing to a serious impairment of vision, reading and writing are at present difficult to me, and I have to express my great indebtedness to my friend and frequent collaborator, Captain M. Greenwood, R.A.M.C.(T.), of the Lister Institute, for the drafting of these notes and the compilation of the list of references.

G. U. Y.
April 1919.

PREFACE TO THE FIRST EDITION.

The following chapters are based on the courses of instruction given during my tenure of the Newmarch Lectureship in Statistics at University College, London, in the sessions 1902-1909. The variety of illustrations and examples has, however, been increased to render the book more suitable for the use of biologists and others besides those interested in economic and vital statistics, and some of the more difficult parts of the subject have been treated in greater detail than was possible in a sessional course of some thirty lectures. For the rest, the chapters follow closely the arrangement of the course, the three parts into which the volume is divided corresponding approximately to the work of the three terms. To enable the student to proceed further with the subject, fairly detailed lists of references to the original memoirs have been given at the end of each chapter: exercises have also been added for the benefit, more especially, of the student who is working without the assistance of a teacher.

The volume represents an attempt to work out a systematic introductory course on statistical methods (the methods available for discussing, as distinct from collecting, statistical data) suited to those who possess only a limited knowledge of mathematics: an acquaintance with algebra up to the binomial theorem, together with such elements of co-ordinate geometry as are generally included therewith, is all that is now assumed. I hope that it may prove of some service to the students of the diverse sciences in which statistical methods are now employed.

My most grateful thanks are due to Mr R. H. Hooker not only for reading the greater part of the manuscript, and the proofs, and for making many criticisms and suggestions which have been of the greatest service, but also for much friendly help and encouragement without which the preparation of the volume, often delayed and interrupted by the pressure of other work, might never have been completed: my debt to Mr Hooker is indeed greater than can well be expressed in a formal preface. My thanks are also due to Mr H. D. Vigor for some assistance in checking the arithmetic, and my acknowledgments to Professor Edgeworth for the example used in § 5 of Chap. XVII. to illustrate the influence of the form of the frequency distribution on the probable error of the median.

I can hardly hope that all errors in the text or in the mass of arithmetic involved in examples and exercises have been eliminated, and will feel indebted to any reader who directs my attention to any such mistakes, or to any omissions, ambiguities, or obscurities.

G. U. Y.
December 1910.

CONTENTS.

INTRODUCTION.
1-3. The introduction of the terms "statistics," "statistical," into the English language; 4-6. The change in meaning of these terms during the nineteenth century; 7-9. The present use of the terms; 10. Definitions of "statistics," "statistical methods," "theory of statistics," in accordance with present usage.

PART I. THE THEORY OF ATTRIBUTES.

CHAPTER I.
NOTATION AND TERMINOLOGY.
1-2. Statistics of attributes and statistics of variables: fundamental character of the former; 3-5. Classification by dichotomy; 6-7. Notation for single attributes and for combinations; 8. The class-frequency; 9. Positive and negative attributes, contraries; 10. The order of a class; 11. The aggregate; 12. The arrangement of classes by order and aggregate; 13-14. Sufficiency of the tabulation of the ultimate class-frequencies; 16-17. Or, better, of the positive class-frequencies; 18. The class-frequencies chosen in the census for tabulation of statistics of infirmities; 19. Inclusive and exclusive notations and terminologies.

CHAPTER II.
CONSISTENCE.
1-3. The field of observation or universe, and its specification by symbols; 4. Derivation of complex from simple relations by specifying the universe; 5-6. Consistence; 7-10. Conditions of consistence for one and for two attributes; 11-14. Conditions of consistence for three attributes.

CHAPTER III.
ASSOCIATION.
1-4. The criterion of independence; 5-10. The conception of association and testing for the same by the comparison of percentages; 11-12. Numerical equality of the differences between the four second-order frequencies and their independence values; 13. Coefficients of association; 14. Necessity for an investigation into the causation of an attribute A being extended to include non-A's.

CHAPTER IV.
PARTIAL ASSOCIATION.
1-2. Uncertainty in interpretation of an observed association; 3-5. Source of the ambiguity: partial associations; 6-8. Illusory association due to the association of each of two attributes with a third; 9. Estimation of the partial associations from the frequencies of the second order; 10-12. The total number of associations for a given number of attributes; 13-14. The case of complete independence.

CHAPTER V.
MANIFOLD CLASSIFICATION.
1. The general principle of a manifold classification; 2-4. The table of double entry or contingency table and its treatment by fundamental methods; 5-8. The coefficient of contingency; 9-10. Analysis of a contingency table by tetrads; 11-13. Isotropic and anisotropic distributions; 14-15. Homogeneity of the classifications dealt with in the preceding chapters: heterogeneous classifications.

PART II. THE THEORY OF VARIABLES.

CHAPTER VI.
THE FREQUENCY-DISTRIBUTION.
1. Introductory; 2. Necessity for classification of observations: the frequency-distribution; 3. Illustrations; 4. Method of forming the table; 5. Magnitude of class-intervals; 6. Position of intervals; 7. Process of classification; 8. Treatment of intermediate observations; 9. Tabulation; 10. Tables with unequal intervals; 11. Graphical representation of the frequency-distribution; 12. Ideal frequency-distributions; 13. The symmetrical distribution; 14. The moderately asymmetrical distribution; 15. The extremely asymmetrical or J-shaped distribution; 16. The U-shaped distribution.

CHAPTER VII.
AVERAGES.
1. Necessity for quantitative definition of the characters of a frequency-distribution; 2. Measures of position (averages) and of dispersion; 3. The dimensions of an average the same as those of the variable; 4. Desirable properties for an average to possess; 5. The commoner forms of average; 6-13. The arithmetic mean: its definition, calculation, and simpler properties; 14-18. The median: its definition, calculation, and simpler properties; 19-20. The mode: its definition and relation to mean and median; 21. Summary comparison of the preceding forms of average; 22-26. The geometric mean: its definition, simpler properties, and the cases in which it is specially applicable; 27. The harmonic mean: its definition and calculation.

CHAPTER VIII.
MEASURES OF DISPERSION, ETC.
1. Inadequacy of the range as a measure of dispersion; 2-13. The standard deviation: its definition, calculation, and properties; 14-19. The mean deviation: its definition, calculation, and properties; 20-24. The quartile deviation or semi-interquartile range; 25. Measures of relative dispersion; 26. Measures of asymmetry or skewness; 27-30. The method of grades or percentiles.

CHAPTER IX.
CORRELATION.
1-3. The correlation table and its formation; 4-5. The correlation surface; 6-7. The general problem; 8-9. The line of means of rows and the line of means of columns: their relative positions in the case of independence and of varying degrees of correlation; 10-14. The correlation-coefficient, the regressions, and the standard-deviations of arrays; 15-16. Numerical calculations; 17. Certain points to be remembered in calculating and using the coefficient.

CHAPTER X.
CORRELATION: ILLUSTRATIONS AND PRACTICAL METHODS.
1. Necessity for careful choice of variables before proceeding to calculate r; 2-8. Illustration i.: Causation of pauperism; 9-10. Illustration ii.: Inheritance of fertility; 11-13. Illustration iii.: The weather and the crops; 14. Correlation between the movements of two variables: (a) Non-periodic movements: Illustration iv.: changes in infantile and general mortality; 15-17. (b) Quasi-periodic movements: Illustration v.: the marriage-rate and foreign trade; 18. Elementary methods of dealing with cases of non-linear regression; 19. Certain rough methods of approximating to the correlation-coefficient; 20-22. The correlation-ratio.

CHAPTER XI.
MISCELLANEOUS THEOREMS INVOLVING THE USE OF THE CORRELATION-COEFFICIENT.
1. Introductory; 2. Standard-deviation of a sum or difference; 3-5. Influence of errors of observation and of grouping on the standard-deviation; 6-7. Influence of errors of observation on the correlation-coefficient (Spearman's theorems); 8. Mean and standard-deviation of an index; 9. Correlation between indices; 10. Correlation-coefficient for a two x two-fold table; 11. Correlation-coefficient for all possible pairs of N values of a variable; 12. Correlation due to heterogeneity of material; 13. Reduction of correlation due to mingling of uncorrelated with correlated material; 14-17. The weighted mean; 18-19. Application of weighting to the correction of death-rates, etc., for varying sex and age-distributions; 20. The weighting of forms of average other than the arithmetic mean.

CHAPTER XII.
PARTIAL CORRELATION.
1-2. Introductory explanation; 3. Direct deduction of the formulae for two variables; 4. Special notation for the general case: generalised regressions; 5. Generalised correlations; 6. Generalised deviations and standard-deviations; 7-8. Theorems concerning the generalised product-sums; 9. Direct interpretation of the generalised regressions; 10-11. Reduction of the generalised standard-deviation; 12. Reduction of the generalised regression; 13. Reduction of the generalised correlation-coefficient; 14. Arithmetical work: Example i., Example ii.; 15. Geometrical representation of correlation between three variables by means of a model; 16. The coefficient of n-fold correlation; 17. Expression of regressions and correlations of lower in terms of those of higher order; 18. Limiting inequalities between the values of correlation-coefficients necessary for consistence; 19. Fallacies.

PART III. THEORY OF SAMPLING.

CHAPTER XIII.
SIMPLE SAMPLING OF ATTRIBUTES.
1. The problem of the present Part; 2. The two chief divisions of the theory of sampling; 3. Limitation of the discussion to the case of simple sampling; 4. Definition of the chance of success or failure of a given event; 5. Determination of the mean and standard-deviation of the number of successes in n events; 6. The same for the proportion of successes in n events: the standard-deviation of simple sampling as a measure of unreliability, or its reciprocal as a measure of precision; 7. Verification of the theoretical results by experiment; 8. More detailed discussion of the assumptions on which the formula for the standard-deviation of simple sampling is based; 9-10. Biological cases to which the theory is directly applicable; 11. Standard-deviation of simple sampling when the numbers of observations in the samples vary; 12. Approximate value of the standard-deviation of simple sampling, and relation between mean and standard-deviation, when the chance of success or failure is very small; 13. Use of the standard-deviation of simple sampling, or standard error, for checking and controlling the interpretation of statistical results.

CHAPTER XIV.
SIMPLE SAMPLING CONTINUED: EFFECT OF REMOVING THE LIMITATIONS OF SIMPLE SAMPLING.
1. Warning as to the assumption that three times the standard error gives the range for the majority of fluctuations of simple sampling of either sign; 2. Warning as to the use of the observed for the true value of p in the formula for the standard error; 3. The inverse standard error, or standard error of the true proportion for a given observed proportion: equivalence of the direct and inverse standard errors when n is large; 4-8. The importance of errors other than fluctuations of "simple sampling" in practice: unrepresentative or biassed samples; 9-10. Effect of divergences from the conditions of simple sampling: (a) effect of variation in p and q for the several universes from which the samples are drawn; 11-12. (b) Effect of variation in p and q from one sub-class to another within each universe; 13-14. (c) Effect of a correlation between the results of the several events; 15. Summary.

CHAPTER XV.
THE BINOMIAL DISTRIBUTION AND THE NORMAL CURVE.
1-2. Determination of the frequency-distribution for the number of successes in n events: the binomial distribution; 3. Dependence of the form of the distribution on p, q, and n; 4-5. Graphical and mechanical methods of forming representations of the binomial distribution; 6. Direct calculation of the mean and the standard-deviation from the distribution; 7-8. Necessity of deducing, for use in many practical cases, a continuous curve giving approximately, for large values of n, the terms of the binomial series; 9. Deduction of the normal curve as a limit to the symmetrical binomial; 10-11. The value of the central ordinate; 12. Comparison with a binomial distribution for a moderate value of n; 13. Outline of the more general conditions from which the curve can be deduced by advanced methods; 14. Fitting the curve to an actual series of observations; 15. Difficulty of a complete test of fit by elementary methods; 16. The table of areas of the normal curve and its use; 17. The quartile deviation and the "probable error"; 18. Illustrations of the application of the normal curve and of the table of areas.

CHAPTER XVI.
NORMAL CORRELATION.
1-3. Deduction of the general expression for the normal correlation surface from the case of independence; 4. Constancy of the standard-deviations of parallel arrays and linearity of the regression; 5. The contour lines: a series of concentric and similar ellipses; 6. The normal surface for two correlated variables regarded as a normal surface for uncorrelated variables rotated with respect to the axes of measurement: arrays taken at any angle across the surface are normal distributions with constant standard-deviation: distribution of and correlation between linear functions of two normally correlated variables are normal: principal axes; 7. Standard-deviations round the principal axes; 8-11. Investigation of Table III., Chapter IX., to test normality: linearity of regression, constancy of standard-deviation of arrays, normality of distribution obtained by diagonal addition, contour lines; 12-13. Isotropy of the normal distribution for two variables; 14. Outline of the principal properties of the normal distribution for n variables.

CHAPTER XVII.
THE SIMPLER CASES OF SAMPLING FOR VARIABLES: PERCENTILES AND MEAN.
1-2. The problem of sampling for variables: the conditions assumed; 3. Standard error of a percentile; 4. Special values for the percentiles of a normal distribution; 5. Effect of the form of the distribution generally; 6. Simplified formula for the case of a grouped frequency-distribution; 7. Correlation between errors in two percentiles of the same distribution; 8. Standard error of the interquartile range for the normal curve; 9. Effect of removing the restrictions of simple sampling, and limitations of interpretation; 10. Standard error of the arithmetic mean; 11. Relative stability of mean and median in sampling; 12. Standard error of the difference between two means; 13. The tendency to normality of a distribution of means; 14. Effect of removing the restrictions of simple sampling; 15. Statement of the standard errors of standard-deviation, coefficient of variation, correlation-coefficient, and regression; 16. Restatement of the limitations of interpretation if the sample be small.

APPENDIX I. Tables for facilitating Statistical Work.
APPENDIX II. Short List of Works on the Mathematical Theory of Statistics, and the Theory of Probability.

SUPPLEMENTS.
I. Direct Deduction of the Formulae for Regressions.
II. The Law of Small Chances.
III. Goodness of Fit.

ADDITIONAL REFERENCES.
ANSWERS TO, AND HINTS ON THE SOLUTION OF, THE EXERCISES GIVEN.
INDEX.


THEORY OF STATISTICS.

INTRODUCTION.

1-3. The introduction of the terms "statistics," "statistical," into the English language; 4-6. The change in meaning of these terms during the nineteenth century; 7-9. The present use of the terms; 10. Definitions of "statistics," "statistical methods," "theory of statistics," in accordance with present usage.

1. The words "statist," "statistics," "statistical," appear to be all derived, more or less indirectly, from the Latin status, in the sense that it acquired in mediaeval Latin of a political state.

2. The first term is, however, of much earlier date than the two others. The word "statist" is found, for instance, in Hamlet (1602),[1] Cymbeline (1610 or 1611),[2] and in Paradise Regained (1671).[3] The earliest occurrence of the word "statistics" yet noted is in The Elements of Universal Erudition, by Baron J. F. von Bielfeld, translated by W. Hooper, M.D. (3 vols., London, 1770). One of its chapters is entitled Statistics, and contains a definition of the subject as "The science that teaches us what is the political arrangement of all the modern states of the known world."[4] "Statistics" occurs again with a rather wider definition in the preface to A Political Survey of the Present State of Europe, by E. A. W. Zimmermann,[5] issued in 1787. "It is about forty years ago," says Zimmermann, "that that branch of political knowledge, which has for its object the actual and relative power of the several modern states, the power arising from their natural advantages, the industry and civilisation of their inhabitants, and the wisdom of their governments, has been formed, chiefly by German writers, into a separate science. ... By the more convenient form it has now received ... this science, distinguished by the new-coined name of statistics, is become a favourite study in Germany" (p. ii); and the adjective is also given (p. v), "To the several articles contained in this work, some respectable statistical writers have added a view of the principal epochas of the history of each country."

[1] Act v., sc. 2.
[2] Act ii., sc. 4.
[3] Bk. iv.
[4] I cite from Dr W. F. Willcox, Quarterly Publications of the American Statistical Association, vol. xiv., 1914, p. 287.
[5] Zimmermann's work appears to have been written in English, though he was a German, Professor of Natural Philosophy at Brunswick.

3. Within the next few years the words were adopted by several writers, notably by Sir John Sinclair, the editor and organiser of the first Statistical Account of Scotland,[6] to whom, indeed, their introduction has been frequently ascribed. In the circular letter to the Clergy of the Church of Scotland issued in May 1790,[7] he states that in Germany "'Statistical Inquiries,' as they are called, have been carried to a very great extent," and adds an explanatory footnote to the phrase "Statistical Inquiries": "or inquiries respecting the population, the political circumstances, the productions of a country, and other matters of state." In the "History of the Origin and Progress"[8] of the work, he tells us, "Many people were at first surprised at my using the new words, Statistics and Statistical, as it was supposed that some term in our own language might have expressed the same meaning. But in the course of a very extensive tour, through the northern parts of Europe, which I happened to take in 1786, I found that in Germany they were engaged in a species of political enquiry, to which they had given the name of Statistics; ... as I thought that a new word might attract more public attention, I resolved on adopting it, and I hope that it is now completely naturalised and incorporated with our language." This hope was certainly justified, but the meaning of the word underwent rapid development during the half century or so following its introduction.

[6] Twenty-one vols., 1791-99.
[7] Statistical Account, vol. xx., Appendix to "The History of the Origin and Progress ..." given at the end of the volume.
[8] Loc. cit., p. xiii.

4. "Statistics" (statistik),[9] as the term is used by German writers of the eighteenth century, by Zimmermann and by Sir John Sinclair, meant simply the exposition of the noteworthy characteristics of a state, the mode of exposition being, almost inevitably at that time, preponderantly verbal. The conciseness and definite character of numerical data were recognised at a comparatively early period, more particularly by English writers, but trustworthy figures were scarce. After the commencement of the nineteenth century, however, the growth of official data was continuous, and numerical statements, accordingly, began more and more to displace the verbal descriptions of earlier days. "Statistics" thus insensibly acquired a narrower signification, viz., the exposition of the characteristics of a State by numerical methods. It is difficult to say at what epoch the word came definitely to bear this quantitative meaning, but the transition appears to have been only half accomplished even after the foundation of the Royal Statistical Society in 1834. The articles in the first volume of the Journal, issued in 1838-9, are for the most part of a numerical character, but the official definition has no reference to method. "Statistics," we read, "may be said, in the words of the prospectus of this Society, to be the ascertaining and bringing together of those facts which are calculated to illustrate the condition and prospects of society."[10] It is, however, admitted that "the statist commonly prefers to employ figures and tabular exhibitions."

[9] The Abriss der Statswissenschaft der Europäischen Reiche (1749) of Gottfried Achenwall, Professor of Politics at Göttingen, is the volume in which the word "statistik" appears to be first employed, but the adjective "statisticus" occurs at a somewhat earlier date in works written in Latin.
[10] Jour. Stat. Soc., vol. i., p. 1.

5. Once, however, the first change of meaning was accomplished, further changes followed. From the name of a science or art of state-description by numerical methods, the word was transferred to those series of figures with which it operated, as we speak of vital statistics, poor-law statistics, and so forth. But similar data occur in many connections; in meteorology, for instance, in anthropology, etc. Such collections of numerical data were also termed "statistics," and consequently, at the present day, the word is held to cover a collection of numerical data, analogous to those which were originally formed for the study of the state, on almost any subject whatever. We not only read of rainfall "statistics," but of "statistics" showing the growth of an organisation for recording rainfall.[11] We find a chapter headed "Statistics" in a book on psychology,[12] and the author writing of "statistics concerning the mental characteristics of man," and of "statistics of children, under the headings bright, average, dull."[13] We are informed that, in a book on Latin verse, the characteristics of the Virgilian hexameter "are examined carefully with statistics."[14]

[11] Symons' British Rainfall for 1899, p. 15.
[12] E. W. Scripture, The New Psychology, 1897, chap. ii.
[13] Op. cit., p. 18.
[14] Athenaeum, Oct. 3, 1903.

6. The development in meaning of the adjective "statistical" was naturally similar. The methods applied to the study of numerical data concerning the state were still termed "statistical methods," even when applied to data from other sources. Thus we read of the inheritance of genius being treated "in a statistical manner,"[15] and we have now "a journal for the statistical study of biological problems."[16] Such phrases as "the statistical investigation of the motion of molecules"[17] have become part of the ordinary language of physicists. We find a work entitled "the principles of statistical mechanics,"[18] and the Bakerian lecture for 1909, by Sir J. Larmor, was on "the statistical and thermodynamical relations of radiant energy."

[15] Francis Galton, Hereditary Genius (Macmillan, 1869), preface.
[16] Biometrika, Cambridge Univ. Press, the first number issued in 1901.
[17] Clerk Maxwell, "Theory of Heat" (1871), and "On Boltzmann's Theorem," Camb. Phil. Trans., vol. xii. (1878).
[18] By J. Willard Gibbs (Macmillan, 1902).

7. It is unnecessary to multiply such instances to show that the words "statistics," "statistical," no longer bear any necessary reference to "matters of state." They are applied indifferently in physics, biology, anthropology, and meteorology, as well as in the social sciences. Diverse though these cases are, there must be some community of character between them, or the same terms and the same methods would not be applied. What, then, is this common character?

8. Let us turn to social science, as the parent of the methods termed "statistical," for a moment, and consider its characteristics as compared, say, with physics or chemistry. One characteristic stands out so markedly that attention has been repeatedly directed to it by "statistical" writers as the source of the peculiar difficulties of their science: the observer of social facts cannot experiment, but must deal with circumstances as they occur, apart from his control. Now the object of experiment is to replace the complex systems of causation usually occurring in nature by simple systems in which only one causal circumstance is permitted to vary at a time. This simplification being impossible, the observer has, in general, to deal with highly complicated cases of multiple causation: cases in which a given result may be due to any one of a number of alternative causes or to a number of different causes acting conjointly.

9. A little consideration will show, however, that this is also precisely the characteristic of the observations in other fields to which statistical methods are applied. The meteorologist, for example, is in almost precisely the same position as the student of social science. He can experiment on minor points, but the records of the barometer, thermometer, and rain gauge have to be treated as they stand. With the biologist, matters are in somewhat better case. He can and does apply experimental methods to a very large extent, but frequently cannot approximate closely to the experimental ideal; the internal circumstances of animals and plants too easily evade complete control. Hence a large field (notably the study of variation and heredity) is left, in which statistical methods have either to aid or to replace the methods of experiment. The physicist and chemist, finally, stand at the other extremity of the scale. Theirs are the sciences in which experiment has been brought to its greatest perfection. But even so, statistical methods still find application. The methods available for eliminating the effect of disturbing circumstances, though continually improved, are not, and cannot be, perfect. The observer, as well as the observing instrument, is a source of error; the effects of changes of temperature, or of moisture, of draughts and vibration, cannot be completely eliminated. Further, in the problems of molecular physics, referred to in the last sentences of § 6, multiplicity of causes is of the essence of the case. The motion of an atom or of a molecule in the middle of a swarm is dependent on that of every other atom or molecule in the swarm.

10. In the light of this discussion, we may accordingly give the following definitions:

By statistics we mean quantitative data affected to a marked extent by a multiplicity of causes.

By statistical methods we mean methods specially adapted to the elucidation of quantitative data affected by a multiplicity of causes.

By theory of statistics we mean the exposition of statistical methods.

The insertion in the first definition of some such words as "to a marked extent" is necessary, since the term "statistics" is not usually applied to data which are affected only by a relatively small residuum of disturbing causes. At the same time, "statistical methods" are applicable to all such cases, whether the influence of many causes be large or not.

REFERENCES.

The History of the Words "Statistics," "Statistical."

(1) John, V., Der Name Statistik; Berne, 1883. A translation in Jour. Roy. Stat. Soc. for same year.
(2) Yule, G. U., "The Introduction of the Words 'Statistics,' 'Statistical,' into the English Language," Jour. Roy. Stat. Soc., 1905, p. 391.

The History of Statistics in General.

(3) John, V., Geschichte der Statistik, bis auf Quetelet; Enke, Stuttgart, 1884. (All published; the author died in 1900. By far the best history of statistics down to the early years of the nineteenth century.)
(4) Mohl, Robert von, Geschichte und Litteratur der Staatswissenschaften, 3 vols.; Enke, 1855-58. (For history of statistics see principally latter half of vol. iii.)
like those of the physicist. in the problems tion. 1'° John.) — INTKODUCTION. Teil. Weiss. Erlangen. G. By theory of statistics we methods. Soc. The observer himself.' Statistical. absolutely perfect. we may accordingly give the In the first place. of pressure. Ixviii. vol. the efieots of changes of temperature. following definitions : By By statistics we mean quantitative data affected to a marked extent by a multiplicity of causes.

(Vol. 1865. There is no detailed history in English. 1888. 2nd edn. Jena. C. and the biographical articles in Palgrave's Diiiionary of For its importance as regards the English Political Economy are useful.g. History of (8) Official Statistics. H. Meitzen's Geschichte.. Hoepli. i. (Gives an exceedingly useful outline of the history J. 2 vols. Theorie und Technik der Statistik Cnew edn.. American translation by R.) Several works on theory of statistics include short histories. but the article 189)). 1899. together with the Observations on the Bills of Mortality more probably by Captain John Graunt. The Economic Writings of Sir William Petty. H.. 1903. A.) gives a very slight sketch. \ Bektillon. . Macmillan.. P. 1890). Milano.. school of political arithmetic. : From the purely mathematical side the following is important TODH0NTEK. Westergaard's Die Grundzuge der Theorie der Statistik (Fischer. and P. Somewhat (7) slight information is given in the general works cited. e. Gatjaolio. History of Theory of Statistics..—— 6 (5) TTIT^ORT OF STATISTICS. Falkner. 2 vols. Cambridge University Press.) . A History of the Mathematical Theory of Probability from the time of Pascal to that of Laplace . of official statistics in different countries.. reference may also be made to (6) Hull. Teoria generate ddla stalistica. 1895. Oours dUmentaire de statislique Soci^td d'dditions scientifiques. Antonio. I. Parte storica. "Statistics" in the Encyclopcedia Britannica (11th edn.

PART I. THE THEORY OF ATTRIBUTES.

CHAPTER I.

NOTATION AND TERMINOLOGY.

1-2. Statistics of attributes and statistics of variables: fundamental character of the former; 3-5. Classification by dichotomy; 6-7. Notation for single attributes and for combinations; 8. The class-frequency; 9. Positive and negative attributes, contraries; 10. The order of a class; 11. The aggregate; 12. The arrangement of classes by order and aggregate; 13-14. Sufficiency of the tabulation of the ultimate class-frequencies; 15-17. Or, better, of the positive class-frequencies; 18. The class-frequencies chosen in the census for tabulation of statistics of infirmities; 19. Inclusive and exclusive notations and terminologies.

1. The methods of statistics, as defined in the Introduction, deal with quantitative data alone. The quantitative character may, however, arise in two different ways.

In the first place, the observer may note only the presence or absence of some attribute in a series of objects or individuals, and count how many do or do not possess it. Thus, in a given population, we may count the number of the blind and seeing, the dumb and speaking, or the insane and sane. The quantitative character, in such cases, arises solely in the counting.

In the second place, the observer may note or measure the actual magnitude of some variable character for each of the objects or individuals observed. He may record, for instance, the ages of persons at death, the prices of different samples of a commodity, the statures of men, the numbers of petals in flowers. The observations in these cases are quantitative ab initio.

2. The methods applicable to the former kind of observations, which may be termed statistics of attributes, are also applicable to the latter, or statistics of variables. A record of statures of men, for example, may be treated by simply counting all measurements as tall that exceed a certain limit, neglecting the magnitude of excess or defect, and stating the numbers of tall and short (or, more strictly, not-tall) on the basis of this classification. Similarly, the methods that are specially adapted to the treatment of statistics of variables, making use of each value recorded, are available to a greater extent than might at first sight seem possible for dealing with statistics of attributes. For example, we may treat the presence or absence of the attribute as corresponding to the changes of a variable which can only possess two values, say 0 and 1. Or, we may assume that we have really to do with a variable character which has been crudely classified, as suggested above, and we may be able, by auxiliary hypotheses as to the nature of this variable, to draw further conclusions. But the methods and principles developed for the case in which the observer only notes the presence or absence of attributes are the simplest and most fundamental, and are best considered first. This and the next three chapters (Chapters I.-IV.) are accordingly devoted to the Theory of Attributes.

3. The objects or individuals that possess the attribute, and those that do not possess it, may be said to be members of two distinct classes, the observer classifying the objects or individuals observed. In the simplest case, where attention is paid to one attribute alone, only two mutually exclusive classes are formed. If several attributes are noted, the process of classification may, however, be continued indefinitely. Those that do and do not possess the first attribute may be reclassified according as they do or do not possess the second, the members of each of the sub-classes so formed according as they do or do not possess the third, and so on, every class being divided into two at each step. Thus the members of the population of any district may be classified into males and females; the members of each sex into sane and insane; the insane males, sane males, insane females, and sane females into blind and seeing. If we were dealing with a number of peas (Pisum sativum) of different varieties, they might be classified as tall or dwarf, with green seeds or yellow seeds, with wrinkled seeds or round seeds, so that we would have eight classes: tall with round green seeds, tall with round yellow seeds, tall with wrinkled green seeds, tall with wrinkled yellow seeds, and four similar classes of dwarf plants.

4. It may be noticed that the fact of classification does not necessarily imply the existence of either a natural or a clearly defined boundary between the two classes. The boundary may be wholly arbitrary, e.g. where prices are classified as above or below some special value, barometer readings as above or below some particular height. The division may also be vague and uncertain: sanity and insanity, sight and blindness, pass into each other by such fine gradations that judgments may differ as to the class in which a given individual should be entered. The possibility of uncertainties of this kind should always be borne in mind in considering statistics of attributes: whatever the nature of the classification, however, natural or artificial, definite or uncertain, the final judgment must be decisive; any one object or individual must be held either to possess the given attribute or not.

5. A classification of the simple kind considered, in which each class is divided into two sub-classes and no more, has been termed by logicians classification, or, to use the more strictly applicable term, division by dichotomy (cutting in two). The classifications of most statistics are not dichotomous, for most usually a class is divided into more than two sub-classes, but dichotomy is the fundamental case. In Chapter V. the relation of dichotomy to more elaborate (manifold, instead of twofold or dichotomous) processes of classification, and the methods applicable to some such cases, are dealt with briefly.

6. For theoretical purposes it is necessary to have some simple notation for the classes formed, and for the numbers of observations assigned to each.

The capitals $A, B, C, \ldots$ will be used to denote the several attributes. An object or individual possessing the attribute $A$ will be termed simply $A$. The class, all the members of which possess the attribute $A$, will be termed the class $A$. It is convenient to use single symbols also to denote the absence of the attributes $A, B, C, \ldots$ We shall employ the Greek letters $\alpha, \beta, \gamma, \ldots$ Thus if $A$ represents the attribute blindness, $\alpha$ represents sight, i.e. non-blindness; if $B$ stands for deafness, $\beta$ stands for hearing. Generally "$\alpha$" is equivalent to "non-$A$," or an object or individual not possessing the attribute $A$; the class $\alpha$ is equivalent to the class none of the members of which possess the attribute $A$.

7. Combinations of attributes will be represented by juxtapositions of letters. Thus if, as above, $A$ represents blindness, $B$ deafness, $AB$ represents the combination blindness and deafness. If the presence and absence of these attributes be noted, the four classes so formed, viz. $AB$, $A\beta$, $\alpha B$, $\alpha\beta$, include respectively the blind and deaf, the blind but not-deaf, the deaf but not-blind, and the neither blind nor deaf. If a third attribute be noted, e.g. insanity, denoted say by $C$, the class $ABC$ includes those who are at once deaf, blind, and insane, $AB\gamma$ those who are deaf and blind but not insane, and so on.

Any letter or combination of letters like $A$, $AB$, $\alpha B$, $AB\gamma$, by means of which we specify the characters of the members of a class, may be termed a class symbol.
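The repeated division by dichotomy of §3 and the class symbols of §§6-7 can be sketched in a few lines of Python. This is only an illustration: the function name `class_symbols` and the `GREEK` lookup table are assumptions made for the example and are not part of the text.

```python
from itertools import product

# Illustrative mapping (an assumption of this sketch): a capital marks the
# presence of an attribute, the matching Greek letter marks its absence.
GREEK = {"A": "α", "B": "β", "C": "γ", "D": "δ"}

def class_symbols(attributes):
    """Return every class symbol obtained by dichotomy on each attribute in turn."""
    choices = [(letter, GREEK[letter]) for letter in attributes]
    # Either way of writing one letter may be combined with either way
    # of writing another, so we take the Cartesian product.
    return ["".join(combo) for combo in product(*choices)]

symbols = class_symbols("ABC")
print(symbols)       # the eight classes formed when three attributes are noted
print(len(symbols))
```

With three attributes the sketch yields eight symbols, matching the eight classes of peas described in §3.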


(2) Hence. the number of A's to the number of ^'s that are B together with the number of ^'s that are not B . (A) = (AB) + (A/3) = (. i. {AB) {Afi) (AC) (Ay) (aC) (ay) (BC) (By) ifiO) (!)• (aB) Order 3.'s together with the number of a's. gate of frequencies of the second order. 1. In such a complete table for the case of three attributes. (AB) = (ABC) + (ABy) = etc. classes of the rath order in the case of n attributes. = (AB) + (Aji) + (aB) + (a.e. statement. than If four attributes had the eight frequencies of the third order. and 8 of the third. be arranged so that frequencies of the same order and frequencies belonging to the same aggregate are kept together. in no case necessary to give such a complete is. may be . the whole number of observations denoted by the letter being reckoned as a frequency of order zero.4(7) + (Ay) = etc. The classes specified by all the attributes noted in any case. N {A) (a) (B) (/3) (G) (r) 2. i.j3) = etc. 12. It however.e. (aBG) (aBy) (a/3C) {Apy) 13.— I. and so on. no more need be given. for the case of three attributes. (ABC) (ABy) (AjiC). Thus the frequencies for the case of three attributes should be grouped as given below . twenty-seven distinct frequencies are given 1 of order zero.) = (B) + (P) = BU. since no attributes are N specified : Order Order Order 0. in tabulating. 12 of the second. instead of enumerating all the frequencies as under (1). any class-frequency can always he expressed in terms of classfrequencies of higher order. been noted it would be sufficient to give the sixteen frequencies of the foiirth order. Class-frequencies should. The whole number of observations must clearly be equal to the number of ^. Thus : — . — — 11 —NOTATION AND TERMINOLOGY. 6 of the first order. and the twelve classes of the second order which can be formed where three attributes have been noted may be grouped into three such aggregates. N=^(A) + (a.

g. Similarly.086 {AB) {AC) {BC) 338 143 135 57 286 {ABC) The number n attributes.By) (a/36') 453 (a^y) 78 670 65 8310 equal to the grand The whole number total: i\r= 10. find the frequencies the positive classes. 10. i.e.g. the frequency of any second-order class.000.) number of school children were examined for the presence or absence of certain defects of which three chief descriptions were noted. C low — B nutrition. or the of ultimate frequencies in the general case of number of classes in an aggregate of the nth is given by considering that each letter of the class-symbol be written in two ways (. The complete results are N (-4) iB) (C) 14. and that either way of writing one letter may be combined with either way of writing another. including the whole number of obser- vations N. =2". is 2x2x2x2 . e. is given by the total of the two third-order frequencies. e. Hence we may say that it is never necessary to env/merate more than the ultimate frequencies. the class-symbols for which contain the same letter The frequency {ABC) -f {ABy) + {AjiG) + {Afiy) = {A) = 877. B or ft. the number of order. {A) is given by the the four third-order frequencies.000 877 1. All the others can be obtained from these by simple addition. of Given the following ultimate frequencies. total of of observations N is of any first-order class.— —— — 12 THEORY OF STATISTICS. A termed the ultimate classes and their frequencies the ultimate frequencies. {ABC) {ABy) {AfiC) {Afiy) 57 281 86 {aBG) {a. the classsymbols for which both contain the same pair of letters {ABC) + {ABy) = {AB) = 338. . (See reference 5 at the end of the chapter. C or y). nerve signs. . . Example i. {AB). may classes.4 or u. . Hence the whole number of ways in which the class-symbol may be written. A development defects.

The ultimate frequencies form one natural set in terms of which the data are completely given, but any other set containing the same number of algebraically independent frequencies, viz. $2^n$, may be chosen instead.

15. The positive class-frequencies, including under this head the total number of observations $N$, form one such set. They are algebraically independent; no one positive class-frequency can be expressed wholly in terms of the others. Their number is, moreover, $2^n$, as may be readily seen from the fact that if the Greek letters are struck out of the symbols for the ultimate classes, they become the symbols for the positive classes, with the exception of $\alpha\beta\gamma\ldots$, for which $N$ must be substituted. Otherwise the number is made up as follows:

$$
\begin{array}{lll}
\text{Order 0.} & \text{(The whole number of observations)} & 1 \\
\text{Order 1.} & \text{(The number of attributes noted)} & n \\
\text{Order 2.} & \text{(The number of combinations of } n \text{ things 2 together)} & \dfrac{n(n-1)}{1 \cdot 2} \\
\text{Order 3.} & \text{(The number of combinations of } n \text{ things 3 together)} & \dfrac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3}
\end{array}
$$

and so on. But the series

$$1 + n + \frac{n(n-1)}{1 \cdot 2} + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3} + \ldots$$

is the binomial expansion of $(1 + 1)^n$ or $2^n$; therefore the total number of positive classes is $2^n$.

16. The set of positive class-frequencies is a most convenient one for both theoretical and practical purposes. Compare, for instance, the two forms of statement, in terms of the ultimate and the positive classes respectively, as given in Example i., § 13. The latter gives directly the whole number of observations and the totals of $A$'s, $B$'s, and $C$'s. The former gives none of these fundamentally important figures without the performance of more or less lengthy additions. Further, the latter gives the second-order frequencies $(AB)$, $(AC)$, and $(BC)$, which are necessary for discussing the relations subsisting between $A$, $B$, and $C$, but are only indirectly given by the frequencies of the ultimate classes.

17. The expression of any class-frequency in terms of the positive frequencies is most easily obtained by a process of step-by-step substitution; thus

$$
\begin{aligned}
(\alpha\beta) &= (\alpha) - (\alpha B) \\
 &= N - (A) - (B) + (AB)
\end{aligned}
\tag{3}
$$

$$
\begin{aligned}
(\alpha\beta\gamma) &= (\alpha\beta) - (\alpha\beta C) \\
 &= N - (A) - (B) + (AB) - (\alpha C) + (\alpha BC) \\
 &= N - (A) - (B) - (C) + (AB) + (AC) + (BC) - (ABC)
\end{aligned}
\tag{4}
$$
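The step-by-step substitution of §17 always ends in an alternating sum over the positive frequencies, as in (3) and (4). A minimal sketch of that end result follows, using the positive frequencies of Example i, § 13; the function name `ultimate_frequency` and the convention of naming a class by its present attributes are assumptions of this example rather than anything prescribed by the text.

```python
from itertools import combinations

# Positive class-frequencies from Example i, § 13.
positive = {"N": 10000, "A": 877, "B": 1086, "C": 286,
            "AB": 338, "AC": 143, "BC": 135, "ABC": 57}

def ultimate_frequency(positive, present, attributes="ABC"):
    """Frequency of the class having exactly the attributes in `present`.

    Every absent attribute is eliminated in turn, which produces an
    alternating sum of positive frequencies, as in equations (3) and (4).
    """
    absent = [a for a in attributes if a not in present]
    total = 0
    for k in range(len(absent) + 1):
        for extra in combinations(absent, k):
            key = "".join(sorted(set(present) | set(extra))) or "N"
            total += (-1) ** k * positive[key]
    return total

print(ultimate_frequency(positive, "AB"))  # the class ABγ
print(ultimate_frequency(positive, "A"))   # the class Aβγ
print(ultimate_frequency(positive, ""))    # the class αβγ, equation (4)
```

Run on these figures, the three calls return the ultimate frequencies 281, 453, and 8310 given in Example i.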

Arithmetical work, however, should be executed from first principles, and not by quoting formulae like the above.

Example ii. Check the work of Example i., § 13, by finding the frequencies of the ultimate classes from the frequencies of the positive classes.

$$
\begin{aligned}
(AB\gamma) &= (AB) - (ABC) = 338 - 57 = 281 \\
(A\beta\gamma) &= (A\gamma) - (AB\gamma) = (A) - (AC) - (AB\gamma) \\
 &= 877 - 143 - 281 = 453 \\
(\alpha\beta\gamma) &= (\beta\gamma) - (A\beta\gamma) = N - (B) - (C) + (BC) - (A\beta\gamma) \\
 &= 10{,}000 - 1086 - 286 + 135 - 453 \\
 &= 10{,}135 - 1825 = 8310
\end{aligned}
$$

and so on.

18. Examples of statistics of precisely the kind now under consideration are afforded by the census returns, e.g. of 1891 or 1901, for England and Wales, of persons suffering from different "infirmities," any individual who is deaf and dumb, blind or mentally deranged (lunatic, imbecile, or idiot) being required to be returned as such on the schedule. The classes chosen for tabulation are, however, neither the positive nor the ultimate classes, but the following (neglecting minor distinctions amongst the mentally deranged and the returns of persons who are deaf but not dumb): dumb, blind, mentally deranged; dumb and blind but not deranged; dumb and deranged but not blind; blind and deranged but not dumb; blind, dumb, and deranged. If, in the symbolic notation, deaf-mutism be denoted by $A$, blindness by $B$, and mental derangement by $C$, the class-frequencies thus given are $(A)$, $(B)$, $(C)$, $(AB\gamma)$, $(A\beta C)$, $(\alpha BC)$, $(ABC)$ (cf. Census of England and Wales, 1891, vol. iii., tables 15 and 16, p. lvii.; Census of 1901, Summary Tables, table xlix.). This set of frequencies does not appear to possess any special advantages.

19. The symbols of our notation are, it should be remarked, used in an inclusive sense, the symbol $A$, for example, signifying an object or individual possessing the attribute $A$ with or without others. This seems to be the only natural use of the symbol, but at least one notation has been constructed on an exclusive basis (cf. ref. 5), the symbol $A$ denoting that the object or individual possesses the attribute $A$, but not $B$ or $C$ or $D$, or whatever other attributes have been noted. An exclusive notation is apt to be relatively cumbrous and also ambiguous, for the reader cannot know what attributes a given symbol excludes until he has seen the whole list of attributes of which note has been taken, and this list he must bear in mind. The statement that the symbol $A$ is used exclusively cannot mean, obviously, that the object referred to possesses only the attribute $A$ and no others whatever; it merely excludes the other attributes noted in the particular investigation. Adjectives, as well as the symbols which may represent them, are naturally used in an inclusive sense, and care should therefore be taken, when classes are verbally described, that the description is complete, and states what, if anything, is excluded as well as what is included, in the same way as our notation. The terminology of the English census has not, in this respect, been quite clear. The "Blind" includes those who are "Blind and Dumb," or "Blind, Dumb, and Lunatic," and so forth. But the heading "Blind and Dumb," in the table relating to "combined infirmities," is used in the sense "Blind and Dumb, but not Lunatic or Imbecile," and so on for the others. In the first table the headings are inclusive, in the second exclusive.

REFERENCES.

(1) Jevons, W. Stanley, "On a General System of Numerically Definite Reasoning," Memoirs of the Manchester Lit. and Phil. Soc., 1870. Reprinted in Pure Logic and other Minor Works; Macmillan, 1890. (The method used in these chapters is that of Jevons, with the notation slightly modified to that employed in the next three memoirs cited.)

(2) Yule, G. U., "On the Association of Attributes in Statistics," Phil. Trans. Roy. Soc., Series A, vol. cxciv., 1900, p. 257.

(3) Yule, G. U., "On the Theory of Consistence of Logical Class-frequencies and its Geometrical Representation," Phil. Trans. Roy. Soc., Series A, vol. cxcvii., 1901, p. 91.

(4) Yule, G. U., "Notes on the Theory of Association of Attributes in Statistics," Biometrika, vol. ii., 1903, p. 121. (The first three sections of (4) are an abstract of (2) and (3). The remarks made as regards the tabulation of class-frequencies at the end of (2) should be read in connection with the remarks made at the beginning of (3) and in this chapter: cf. footnote on p. 94 of (3).)

Material has been cited from, and reference made to the notation used in

(5) Warner, F., and others, "Report on the Scientific Study of the Mental and Physical Conditions of Childhood"; published by the Committee, Parkes Museum, 1895.

(6) Warner, F., "Mental and Physical Conditions among Fifty Thousand Children, etc.," Jour. Roy. Stat. Soc., vol. lix., 1896, p. 125.

EXERCISES.

1. (Figures from ref. (5).) The following are the numbers of boys observed with certain classes of defects amongst a number of school-children. $A$ denotes development defects; $B$, nerve signs; $C$, low nutrition.

[Table of class-frequencies missing.]

2. (Figures from ref. (5).) The following are the frequencies of the positive classes for the girls in the same investigation:

[Table of class-frequencies missing.]

In strictness. of age. in' Ejujlish. Conditions of consistence for three field 4. The expression the universe of discourse. The of observation or univeree and its specification by symbols Derivation of complex from simple relations by specifying the universe 5-6. and the common symbol can be understood. the symbol ought to be written if. no term is required to denote the material to which the work is so confined the limits are specified. — — — attributes. 1-3.') = Number — . insane English males over 00 living ill 1901. the symbols : : — ( U — — B TI) \UA. But for theoretical purposes some term is almost essential to avoid circumlocution. or material. space. . or simply the universe. for instance. Any statistical inquiry is necessarily confined to a certain time. Conditions of consistence for one and for two attributes 11-14. 2. like any class. may be limited to England.. English male over 60 blindness. 17 2 . living in 1901. denote the combination of attributes. general. {UAB)= blind blind and insane English males over 60 living in 1901. we should strictly use living in 1901. A insanity. The universe. to take the illustration of § 1.. [. and that is sufficient. We know that such attributes must exist.UB) = . may be adopted as familiar and convenient. to English males in 1901. e. those implied by the predicates It is not. however. 1. For actual work on any given subject. or even to English males over 60 years of age in 1901. of English males over 60 living in 1901.. male. used in this sense by writers on logic. An investigation on the prevalence of insanity. necessary to introduce a special letter into the classsymbols to denote the attributes common to all members of the universe. to England in 1901.— — CHAPTER IL CONSISTENCE. Consistence 7-10. may be considered as specified by an enumeration of the attributes common to all its members. and so on. say.g. over 60 years.

A or B ox case U AB {ABG) = {ABGD) + {ABG^) and so on. § 13. to denote the common general relations (2). e. however. the instead of the simpler symbols If (A) (JB) (AB). = <UAB) + (UAm + (UaB) + {Ual3) = etc. Clearly. using attributes of all the members of the universe and {U) consequently the total number of observations If. again. Chap. Similarly. as y. /3. The more complex may be derived from the simpler relations between class-frequencies very readily by the process of specifying the universe..) = If-(A). should in strictness be written U in the form (CO (UA) = (UAB) + (UAI3) = {UAC) + (?7^-y) = eto.— 18 — THEORY OF STATISTICS. by specifying the universe as {al3) = (/3)-{A/3) = If-{A}-{B) + (AB). (UA£) = (UABCr) + (UABy)== etc. writing in the latter of the universe. =(UA) + (Ua) = (I7B) + (U/3) = eto. Thus starting from the simple equation may : {a.5 within the iimiverse C. we might have used any other symbol instead of to denote the attributes common to all the members or ABC. 3. Hence any the attribute or comhirvatwn equatio7i the common to all the class-syniboU in an may be of attributes regarded as specifying universe within which equation holds good.ff. class-frequencies 5. Any observed within one and the same universe which have been or might have been may IJe said to be . I.(C) + (AB) + (AG) + (BG) ." The equation {AG) = {ABG) + {Aj3G) be read "The number of A's is equal to the number of ^'s that are B together with the number of . Specifying the universe.4's that are not-. Thus the equation just written may be read in words: "The number of objects or individuals in the universe ABC is equal to the number of D'b together with the number of not-i>'s within the same universe. we have.(B) ." i.(ABC). (a/3y) we have = {y)-(Ay)-{By) + (ABy) = ]f-{A).

= -57.— II. (2) the positive class-frequencies. being derived from the ultimate frequencies by simple addition. all others must be so. It follows from what we have just said that there is only one condition of consistence for the ultimate frequencies. 7. and so verify the fact that they are positive. 6. for instance. then. however. the conditions may be . Apart from this. If the figures. that' they must all exceed zero. we should accordingly calculate the values of all the unstated frequencies. Hence we need only calculate the values of the ultimate class-frequencies in terms of those given. 19 consistent with one another. in different places or on different material. but others are by no means of an intuitive character. They conform with one another. They might have been observed at different times. any one frequency of the set may vary anywhere between and oo without becoming inconsistent with the others. Generally. As we saw in the last chapter. the data are given N {A) {B) (C) 1000 525 312 470 (AB) {AC) (BC) (ABC) 42 147 86 25 Yet they there is nothing obviously wrong with the figures. If the ultimate class-frequencies are positive. —CONSISTENCE. Otherwise they are consistent. are certainly inconsistent. — They imply. for the case of n attributes. there must have been some miscount or misprint.470 + 42 + 147 + 86 . Clearly no class-frequency can be negative. there are two sets of 2" algebraically independent frequencies of practical importance. (1) the ultimate.312 . a negative value for {ajSy) — = 1000 . The conditions of consistence are some of them simple. (aySy) in fact. viz. viz. we may say that any given class-frequencies are inconsistent if they imply negative values for any of the unstated frequencies.525 . and do not in any way conflict.25. For the positive class-frequencies. Suppose. and verify the fact that they exceed zero. consequently. are alleged to be the result of an actual inquiry in a. 
To test the consistence of any set of 2" algebraically independent frequencies. but they cannot have been observed in one and the same universe. definite universe. be limited by a simple consideration. = 1000-1307 + 275-25. This procedure may.



expressed symbolically by expanding the ultimate in terms of the positive frequencies, and writing each such expansion not less than zero. We will consider the cases of one, two, and three attributes in turn.

8. If only one attribute be noted, say $A$, the positive frequencies are $N$ and $(A)$. The ultimate frequencies are $(A)$ and $(\alpha)$, where

$$(\alpha) = N - (A).$$

The conditions of consistence are therefore simply

$$(A) \nless 0, \qquad N - (A) \nless 0,$$

or, more conveniently expressed,

$$
\begin{array}{ll}
(a) & (A) \nless 0 \\
(b) & (A) \ngtr N
\end{array}
\tag{1}
$$

These conditions are obvious: the number of $A$'s cannot be less than zero, nor exceed the whole number of observations.

9. If two attributes be noted there are four ultimate frequencies $(AB)$, $(A\beta)$, $(\alpha B)$, $(\alpha\beta)$. The following conditions are given by expanding each in terms of the frequencies of positive classes:

$$
\begin{array}{lll}
(a) & (AB) \nless 0 & \text{or } (AB) \text{ would be negative} \\
(b) & (AB) \nless (A) + (B) - N & \text{or } (\alpha\beta) \text{ would be negative} \\
(c) & (AB) \ngtr (A) & \text{or } (A\beta) \text{ would be negative} \\
(d) & (AB) \ngtr (B) & \text{or } (\alpha B) \text{ would be negative}
\end{array}
\tag{2}
$$

(a), (c), and (d) are obvious; (b) is perhaps a little less obvious, and is occasionally forgotten. It is, however, of precisely the same type as the other three. None of these conditions are really of a new form, but may be derived at once from (1) (a) and (1) (b) by specifying the universe as $B$ or as $\beta$ respectively. The conditions (2) are therefore really covered by (1).

10. But a further point arises as regards such a system of limits as is given by (2). The conditions (a) and (b) give lower or minor limits to the value of $(AB)$; (c) and (d) give upper or major limits. If either major limit be less than either minor limit the conditions are impossible, and it is necessary to see whether $(A)$ and $(B)$ can take such values that this may be the case. Expressing the condition that the major limits must be not less than the minor, we have

$$
\begin{array}{ll}
(A) \nless 0 & (B) \nless 0 \\
(A) \ngtr N & (B) \ngtr N
\end{array}
$$

These are simply the conditions of the form (1). If, therefore, $(A)$ and $(B)$ fulfil the conditions (1), the conditions (2) must be possible. The conditions (1) and (2) therefore give all the conditions of consistence for the case of two attributes, conditions of an extremely simple and obvious kind.
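The limits that conditions (2) place on the frequency (AB) can be collected in a small function. This is a sketch under stated assumptions: the function name `ab_limits` is invented for the example, and the figures N = 1000, (A) = 700, (B) = 600 are illustrative values chosen here, not data from the text.

```python
def ab_limits(n, a, b):
    """Minor and major limits on (AB) implied by conditions (2).

    n, a and b stand for N, (A) and (B). Conditions (1) are checked first,
    since (2) can only be satisfied when (A) and (B) themselves obey (1).
    """
    if not (0 <= a <= n and 0 <= b <= n):
        raise ValueError("(A) and (B) must each lie between 0 and N")
    lower = max(0, a + b - n)   # conditions (2)(a) and (2)(b)
    upper = min(a, b)           # conditions (2)(c) and (2)(d)
    return lower, upper

# Illustrative figures only (an assumption of this sketch).
print(ab_limits(1000, 700, 600))
```

Condition (2)(b) is the one that bites here: with 700 A's and 600 B's among 1000 observations, at least 300 individuals must be both.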
11. Now consider the case of three attributes. There are eight ultimate frequencies. Expanding the ultimate in terms of the positive frequencies, and expressing the condition that each expansion is not less than zero, we have
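The eight expansions can be generated mechanically rather than written out by hand. The sketch below applies the test of §6 to the data of §5; it is an illustration only, and the function names `ultimate_frequencies` and `is_consistent`, together with the convention of keying each ultimate class by the attributes it possesses, are assumptions of the example.

```python
from itertools import combinations

def ultimate_frequencies(positive, attributes="ABC"):
    """Expand every ultimate frequency in terms of the positive frequencies.

    Keys of the result list the attributes present, so "" is the class αβγ.
    """
    result = {}
    for order in range(len(attributes) + 1):
        for present in combinations(attributes, order):
            absent = [a for a in attributes if a not in present]
            total = 0
            # Alternating sum obtained by eliminating each absent attribute.
            for k in range(len(absent) + 1):
                for extra in combinations(absent, k):
                    key = "".join(sorted(present + extra)) or "N"
                    total += (-1) ** k * positive[key]
            result["".join(present)] = total
    return result

def is_consistent(positive):
    """Class-frequencies are consistent when no ultimate frequency is negative."""
    return all(v >= 0 for v in ultimate_frequencies(positive).values())

# The data of §5, which look plausible but imply a negative (αβγ).
data = {"N": 1000, "A": 525, "B": 312, "C": 470,
        "AB": 42, "AC": 147, "BC": 86, "ABC": 25}
print(ultimate_frequencies(data))
print(is_consistent(data))
```

For these figures the class αβγ comes out at -57, so the set is reported as inconsistent, in agreement with the hand calculation of §5.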


... $(AB)$ and $(AC)$, the conditions (4) give limits for the third, viz. $(BC)$. They thus replace, for statistical purposes, the ordinary rules of syllogistic inference. From data of the syllogistic form, they would, of course, lead to the same conclusion, though in a somewhat cumbrous fashion; one or two cases are suggested as exercises for the student (Questions 6 and 7). The following will serve as illustrations of the statistical uses of the conditions:

Example i. Given that (A) = (B) = (C) = ½N, and that 80 per cent. of the A's are B, 75 per cent. of A's are C, find the limits to the percentage of B's that are C.

The data are

\[ \frac{2(AB)}{N} = 0.8, \qquad \frac{2(AC)}{N} = 0.75, \]

and the conditions give

\[
\begin{aligned}
(a)\quad & \frac{2(BC)}{N} \geq 1 - 0.8 - 0.75 \\
(b)\quad & \frac{2(BC)}{N} \geq 0.8 + 0.75 - 1 \\
(c)\quad & \frac{2(BC)}{N} \leq 1 - 0.8 + 0.75 \\
(d)\quad & \frac{2(BC)}{N} \leq 1 + 0.8 - 0.75
\end{aligned}
\]

(a) gives a negative limit and (d) a limit greater than unity; hence they may be disregarded. From (b) and (c) we have

\[ \frac{2(BC)}{N} \geq 0.55, \qquad \frac{2(BC)}{N} \leq 0.95, \]

that is to say, not less than 55 per cent. nor more than 95 per cent. of the B's can be C.

Example ii. If a report give the following frequencies as actually observed, show that there must be a misprint or mistake of some sort, and that possibly the misprint consists in the dropping of a 1 before the 85 given as the frequency (BC).

    N      1000
    (A)     510        (AB)    189
    (B)     490        (AC)    140
    (C)     427        (BC)     85

From (4) (a) we have

\[ (BC) \geq 510 + 490 + 427 - 1000 - 189 - 140 = 98. \]

But 85 < 98, therefore it cannot be the correct value of (BC). If we read 185 for 85 all the conditions are fulfilled.
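The check made in Example ii can be sketched in Python. The approach assumed here is the one described in § 11: each of the eight ultimate frequencies is expanded in terms of the positive frequencies, with the unknown (ABC) left free, and the data are consistent only if some value of (ABC) keeps every expansion non-negative. The function names `abc_bounds` and `is_consistent` are hypothetical.

```python
def abc_bounds(n, a, b, c, ab, ac, bc):
    """Range of (ABC) keeping all eight ultimate frequencies non-negative."""
    # (ABC) itself, and the three classes with exactly one attribute,
    # e.g. (A beta gamma) = (A) - (AB) - (AC) + (ABC) >= 0.
    lower = max(0, ab + ac - a, ab + bc - b, ac + bc - c)
    # The three classes with exactly two attributes, e.g.
    # (AB gamma) = (AB) - (ABC) >= 0, and the class with none,
    # N - (A) - (B) - (C) + (AB) + (AC) + (BC) - (ABC) >= 0.
    upper = min(ab, ac, bc, n - a - b - c + ab + ac + bc)
    return lower, upper


def is_consistent(n, a, b, c, ab, ac, bc):
    """True if some value of (ABC) satisfies every condition."""
    lower, upper = abc_bounds(n, a, b, c, ab, ac, bc)
    return lower <= upper


# Figures of Example ii, as printed and with 185 read for 85.
print(is_consistent(1000, 510, 490, 427, 189, 140, 85))
print(is_consistent(1000, 510, 490, 427, 189, 140, 185))
```

With the printed value 85 the upper bound on (ABC) falls below zero, which is the same failure as that of condition (4) (a) in the example.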


Example iii. In a certain set of 1000 observations (A) = 45, (B) = 23, (C) = 14. Show that whatever the percentages of B's that are A and of C's that are A, it cannot be inferred that any B's are C.

The conditions (a) and (b) give the lower limit of (BC), which is required. We find

\[
\begin{aligned}
(a)\quad & \frac{(BC)}{N} \geq -\frac{(AB)}{N} - \frac{(AC)}{N} - 0.918 \\
(b)\quad & \frac{(BC)}{N} \geq \frac{(AB)}{N} + \frac{(AC)}{N} - 0.045
\end{aligned}
\]

The first limit is clearly negative. The second must also be negative, since (AB)/N cannot exceed 0.023 nor (AC)/N 0.014. Hence we cannot conclude that there is any limit to (BC) greater than 0. This result is indeed immediately obvious when we consider that, even if all the B's were A, and of the remaining 22 A's 14 were C's, there would still be 8 A's that were neither B nor C.

14. The student should note the result of the last example, as it illustrates the sort of result at which one may often arrive by applying the conditions (4) to practical statistics. For given values of N, (A), (B), (C), (AB), and (AC), it will often happen that any value of (BC) not less than zero (or, more generally, not less than either of the lower limits (2) (a) and (2) (b)) will satisfy the conditions (4), and hence no true inference of a lower limit is possible. The argument of the type "So many A's are B and so many B's are C that we must expect some A's to be C" must be used with caution.
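The caution of § 14 can be explored numerically. The sketch below computes the limits to (BC) for given N, (A), (B), (C), (AB), and (AC). It assumes that the limits are those obtained by requiring every ultimate frequency to be non-negative and eliminating (ABC), together with the two-attribute limits (2) for the pair B, C; the function name `bc_limits` is hypothetical.

```python
def bc_limits(n, a, b, c, ab, ac):
    """Return (lower, upper) limits to (BC) from the conditions of consistence."""
    lower = max(
        0,                         # (BC) cannot be negative
        b + c - n,                 # limit (2)(b) applied to B and C
        a + b + c - n - ab - ac,   # the class with none of A, B, C
        ab + ac - a,               # the class of A's that are neither B nor C
    )
    upper = min(
        b,                         # (BC) cannot exceed (B)
        c,                         # nor (C)
        b - ab + ac,               # the class of B's that are neither A nor C
        c + ab - ac,               # the class of C's that are neither A nor B
    )
    return lower, upper


# Example i, taking N = 200 so that (A) = (B) = (C) = 100.
print(bc_limits(200, 100, 100, 100, 80, 75))

# Example iii: no admissible (AB), (AC) raises the lower limit above zero.
print(max(bc_limits(1000, 45, 23, 14, ab, ac)[0]
          for ab in range(24) for ac in range(15)))
```

In the second call the lower limit stays at zero for every admissible pair of values, so no inference that some B's are C is possible.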

REFERENCES.

(1) De Morgan, A., Formal Logic, 1847 (chapter viii., "On the Numerically Definite Syllogism").
(2) Boole, G., Laws of Thought, 1854 (chapter xix., "Of Statistical Conditions").
The above are the classical works with respect to the general theory of numerical consistence. The student will find both difficult to follow on account of their special notation, and, in the case of Boole's work, the special method employed.
(3) Yule, G. U., "On the Theory of Consistence of Logical Class-frequencies and its Geometrical Representation," Phil. Trans., A, vol. cxcvii. (1901), p. 91. (Deals at length with the theory of consistence for any number of attributes, using the notation of the present chapters.)

EXERCISES.

1. (For this and similar estimates cf. "Report by Miss Collet on the Statistics of Employment of Women and Girls" [C. 7564] 1894.) If, in the urban district of Bury, 817 per thousand of the women between 20 and 25 years of age were returned as "occupied" at the census of 1891, and 263 per thousand as married or widowed, what is the lowest proportion per thousand of the married or widowed that must have been occupied?

2. If, in a series of houses actually invaded by small-pox, 70 per cent. of the inhabitants are attacked and 85 per cent. have been vaccinated, what is the lowest percentage of the vaccinated that must have been attacked?

3. Given that 50 per cent. of the inmates of a workhouse are men, 60 per cent. are "aged" (over 60), 80 per cent. non-able-bodied, 35 per cent. aged men, 45 per cent. non-able-bodied men, and 42 per cent. non-able-bodied and aged, find the greatest and least possible proportions of non-able-bodied aged men.

4. (Material from ref. 5 of Chap. I.) The following are the proportions per 10,000 of boys observed, with certain classes of defects amongst a number of school-children. A = development defects, B = nerve signs, D = mental dulness.

    N   = 10,000        (D)  = 789
    (A) =    877        (AB) = 338
    (B) =  1,086        (BD) = 455

Show that some dull boys do not exhibit development defects, and state how many at least do not do so.

5. The following are the corresponding figures for girls:

    N   = 10,000        (D)  = 689
    (A) =    682        (AB) = 248
    (B) =    850        (BD) = 363

Show that some defectively developed girls are not dull, and state how many at least must be so.

6. Take the syllogism "All A's are B, all B's are C, therefore all A's are C," express the premisses in terms of the notation of the preceding chapters, and deduce the conclusion by the use of the general conditions of consistence.

7. Do the same for the syllogism "All A's are B, no B's are C, therefore no A's are C."

8. Given that (A) = (B) = (C) = ½N, and that (AB)/N = (AC)/N = p, find what must be the greatest or least values of p in order that we may infer that (BC)/N exceeds any given value, say q.

9. Show that if

\[ \frac{(A)}{N} = x, \qquad \frac{(B)}{N} = 2x, \qquad \frac{(C)}{N} = 3x, \qquad \text{and} \qquad \frac{(AB)}{N} = \frac{(AC)}{N} = \frac{(BC)}{N} = y, \]

the value of neither x nor y can exceed 1/4.

CHAPTER III.

ASSOCIATION.

1-4. The criterion of independence. 5-10. The conception of association and testing for the same by the comparison of percentages. 11-12. Numerical equality of the differences between the four second-order frequencies and their independence values. 13. Coefficients of association. 14. Necessity for an investigation into the causation of an attribute A being extended to include non-A's.
1. If there is no sort of relationship, of any kind, between two attributes A and B, we expect to find the same proportion of A's amongst the B's as amongst the non-B's. We may anticipate, for instance, the same proportion of abnormally wet seasons in leap years as in ordinary years, the same proportion of male to total births when the moon is waxing as when it is waning, the same proportion of heads whether a coin be tossed with the right hand or the left. Two such unrelated attributes may be termed independent, and we have accordingly as the criterion of independence for A and B

\[ \frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)} \qquad (1) \]

If this relation hold good, the corresponding relations

\[ \frac{(\alpha B)}{(B)} = \frac{(\alpha\beta)}{(\beta)}, \qquad \frac{(AB)}{(A)} = \frac{(\alpha B)}{(\alpha)}, \qquad \frac{(A\beta)}{(A)} = \frac{(\alpha\beta)}{(\alpha)} \]

must also hold. For it follows at once from (1) that

\[ \frac{(B) - (AB)}{(B)} = \frac{(\beta) - (A\beta)}{(\beta)}, \]

that is

\[ \frac{(\alpha B)}{(B)} = \frac{(\alpha\beta)}{(\beta)}, \]

and the other two identities may be similarly deduced.

2. The criterion may, however, be put into a somewhat different and theoretically more convenient form. The equation (1) expresses (AB) in terms of (B), (β), and a second-order frequency (Aβ); eliminating this second-order frequency we have

\[ \frac{(AB)}{(B)} = \frac{(AB) + (A\beta)}{(B) + (\beta)} = \frac{(A)}{N}, \]

i.e. in words, "the proportion of A's amongst the B's is the same as in the universe at large." The student should learn to recognise this equation at sight in any of the forms

\[
(a)\ \frac{(AB)}{(B)} = \frac{(A)}{N} \qquad
(b)\ \frac{(AB)}{(A)} = \frac{(B)}{N} \qquad
(c)\ (AB) = \frac{(A)(B)}{N} \qquad
(d)\ \frac{(AB)}{N} = \frac{(A)}{N} \cdot \frac{(B)}{N}
\qquad (2)
\]

The equation (d) gives the important fundamental rule: If the attributes A and B are independent, the proportion of AB's in the universe is equal to the proportion of A's multiplied by the proportion of B's.

The advantage of the forms (2) over the form (1) is that they give expressions for the second-order frequency in terms of the frequencies of the first order and the whole number of observations alone; the form (1) does not.

Example i. If there are 144 A's and 384 B's in 1024 observations, how many AB's will there be, A and B being independent?
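The rule expressed by the forms (2) is easily tried on the figures of Example i. The short Python sketch below uses exact fractions; the function name `independence_value` is an assumption made for this illustration.

```python
from fractions import Fraction


def independence_value(n, a, b):
    """Value of (AB) under independence, form (2)(c): (A)(B)/N."""
    return Fraction(a * b, n)


n, a, b = 1024, 144, 384
ab = independence_value(n, a, b)
print(ab)

# Form (2)(d): the proportion of AB's is the product of the proportions.
assert ab / n == Fraction(a, n) * Fraction(b, n)
# Forms (2)(a) and (2)(b): proportions within B and within A.
assert ab / b == Fraction(a, n)
assert ab / a == Fraction(b, n)
```

Exact fractions are used so that the equalities of the four forms hold without rounding error.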


and therefore there must be 21 per cent. (more or less closely, cf. §§ 7, 8 below) of AB's in the universe to justify the conclusion that A and B are independent.

3. It follows from § 1 that if the relation (2) holds for any one of the four second-order frequencies, e.g. (AB), similar relations must hold for the remaining three. Thus we have directly from (1)

\[ \frac{(A\beta)}{(\beta)} = \frac{(AB) + (A\beta)}{(B) + (\beta)} = \frac{(A)}{N}, \]

giving

\[ (A\beta) = \frac{(A)(\beta)}{N}. \]

And again,

\[ \frac{(\alpha B)}{(B)} = \frac{(\alpha\beta)}{(\beta)} = \frac{(\alpha B) + (\alpha\beta)}{(B) + (\beta)} = \frac{(\alpha)}{N}, \]

which gives

\[ (\alpha B) = \frac{(\alpha)(B)}{N}, \qquad (\alpha\beta) = \frac{(\alpha)(\beta)}{N}. \]

Example iii. In Example i. above, what would be the number of αβ's, A and B being independent?

\[ (\alpha) = 1024 - 144 = 880, \qquad (\beta) = 1024 - 384 = 640, \]

\[ \therefore \quad (\alpha\beta) = \frac{880 \times 640}{1024} = 550. \]
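The theorem of § 3 can be illustrated by tabulating all four second-order frequencies for the data of Examples i and iii on the supposition of independence. This is only a sketch; the function name `independence_table` and the dictionary keys are hypothetical labels.

```python
from fractions import Fraction


def independence_table(n, a, b):
    """Second-order frequencies when A and B are independent."""
    alpha, beta = n - a, n - b  # numbers of non-A's and non-B's
    return {
        "AB": Fraction(a * b, n),
        "A_beta": Fraction(a * beta, n),
        "alpha_B": Fraction(alpha * b, n),
        "alpha_beta": Fraction(alpha * beta, n),
    }


table = independence_table(1024, 144, 384)
for name, value in table.items():
    print(name, value)

# The four classes must together account for every observation.
assert sum(table.values()) == 1024
```

The last line is a useful arithmetical check in practice: the four values must add up to N whatever the values of (A) and (B).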

The theorem is an important one, and the result may be deduced more directly from first principles, replacing (AB) by its value (A)(B)/N in the expansions

\[
\begin{aligned}
(\alpha B) &= (B) - (AB) \\
(A\beta) &= (A) - (AB) \\
(\alpha\beta) &= N - (A) - (B) + (AB).
\end{aligned}
\]

This is left as an exercise for the student.

4. Finally, the criterion of independence may be expressed in yet a third form, viz. in terms of the second-order frequencies alone. If A and B are independent, it follows at once from equation (2) and the work of the preceding section that

\[ (AB)(\alpha\beta) = \frac{(A)(B)(\alpha)(\beta)}{N^{2}}. \]

And evidently (αB)(Aβ) is equal to the same fraction.

Therefore

\[
(a)\ (AB)(\alpha\beta) = (\alpha B)(A\beta) \qquad
(b)\ \frac{(AB)}{(\alpha B)} = \frac{(A\beta)}{(\alpha\beta)} \qquad
(c)\ \frac{(AB)}{(A\beta)} = \frac{(\alpha B)}{(\alpha\beta)}
\qquad (3)
\]

The equation (b) may be read "The ratio of A's to α's amongst the B's is equal to the ratio of A's to α's amongst the β's," and (c) similarly.

This form of criterion is a convenient one if all the four second-order frequencies are given, enabling one to recognise almost at a glance whether or not the two attributes are independent.

Example iv. If the second-order frequencies have the following values, are A and B independent or not?

    (AB) = 110     (αB) = 90     (Aβ) = 290     (αβ) = 510

Clearly (AB)(αβ) > (αB)(Aβ), so A and B are not independent.

5. Suppose now that A and B are not independent, but related in some way or other, however complicated. Then if (AB) > (A)(B)/N, A and B are said to be positively associated, or sometimes simply associated. If, on the other hand, (AB) < (A)(B)/N, A and B are said to be negatively associated or, more briefly, disassociated.

The student should notice that these words are not used exactly in their ordinary senses, but in a technical sense. When A and B are said to be associated, it is not meant merely that some A's are B's, but that the number of A's which are B's exceeds the number to be expected if A and B are independent. Similarly, when A and B are said to be negatively associated or disassociated, it is not meant that no A's are B's, but that the number of A's which are B's falls short of the number to be expected if A and B are independent. "Association" cannot be inferred from the mere fact that some A's are B's, however great that proportion; this principle is fundamental, and should be always borne in mind.

6. The greatest possible value of (AB) for given values of N, (A), and (B) is either (A) or (B) (whichever is the less). When (AB) attains either of these values, A and B may be said to be completely or perfectly associated. The lowest possible value of (AB), on the other hand, is either zero or (A) + (B) - N (whichever is the greater). When (AB) falls to either of these values, A and B may be said to be completely disassociated. Complete association is generally understood to correspond to one or other of the cases, "All A's are B" or "All B's are A," or it may be more narrowly defined as corresponding only to the case when both these statements were true. Complete disassociation may be similarly taken as corresponding to one or other of the cases, "No A's are B," or "no α's are β," or more narrowly to the case when both these statements are true. The greater the divergence of (AB) from the value (A)(B)/N towards the limiting value in either direction, the greater, we may say, is the intensity of association or of disassociation, so that we may speak of attributes being more or less, highly or slightly associated. This conception of degrees of association, degrees which may in fact be measured by certain formulae (cf. § 13), is important.

7. When the association is very slight, i.e. where (AB) only differs from (A)(B)/N by a few units or by a small proportion, it may be that such association is not really significant of any definite relationship. To give an illustration, suppose that a coin is tossed a number of times, and the tosses noted in pairs; then 100 pairs may give such results as the following (taken from an actual record):

    First toss heads and second heads      26
      "    "   heads  "    "    tails      18
      "    "   tails  "    "    heads      27
      "    "   tails  "    "    tails      29

If we use A to denote "heads" in the first toss, B "heads" in the second, we have from the above (A) = 44, (B) = 53. Hence (A)(B)/N = 44 × 53 / 100 = 23.32, while actually (AB) is 26. Hence there is a positive association, in the given record, between the result of the first throw and the result of the second. But it is fairly certain, from the nature of the case, that such association cannot indicate any real connection between the results of the two throws; it must therefore be due merely to such a complex system of causes, impossible to analyse, as leads, for example, to differences between small samples drawn from the same material. The conclusion is confirmed by the fact that, of a number of such records, some give a positive association (like the above), but others a negative association.

8. An event due, like the above occurrence of positive association, to an extremely complex system of causes of the general nature of which we are aware, but of the detailed operation of which we are ignorant, is sometimes said to be due to chance, or better to the chances or fluctuations of sampling. A little consideration will suggest that such associations due to the fluctuations of sampling must be met with in all classes of statistics. To quote, for instance, from § 1, the two illustrations there given of independent attributes, we know that in any actual record we would not be likely to find exactly the same proportion of abnormally wet seasons in leap years as in ordinary years, nor exactly the same proportion of male births when the moon is waxing as when it is waning. But so long as the divergence from independence is not well-marked we must regard such attributes as practically independent, or dependence as at least unproved.

The discussion of the question, how great the divergence must be before we can consider it as "well-marked," must be postponed to the chapters dealing with the theory of sampling. At present the attention of the student can only be directed to the existence of the difficulty, and to the serious risk of interpreting a "chance association" as physically significant.

9. The definition of § 5 suggests that we are to test the existence or the intensity of association between two attributes by a comparison of the actual value of (AB) with its independence-value (as it may be termed) (A)(B)/N. The procedure is from the theoretical standpoint perhaps the most natural, but it is usual, in practice, to adopt a method of comparing proportions, e.g. the proportion of A's amongst the B's with the proportion in the universe at large. Such proportions are usually expressed in the form of percentages or proportions per thousand. A large number of such comparisons are available for the purpose, as indicated by the inequalities (4) below, which all hold good for the case of positive association between A and B. The first two, (a) and (b), follow at once from the definition of § 5; (c) and (d) follow from (a) and (b), on multiplying across and expanding (A) and N in the first case, (B) and N in the second. The deduction of the remainder is left to the student.
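The tests of §§ 4 to 7 lend themselves to a brief numerical sketch. The Python below classifies a set of four second-order frequencies by the sign of the difference of the cross-products, as in (3), and compares (AB) with its independence-value for the coin-tossing record of § 7. The function names are assumptions, and the sketch says nothing as to whether an association is more than a fluctuation of sampling.

```python
from fractions import Fraction


def association_sign(ab, a_beta, alpha_b, alpha_beta):
    """Classify by the sign of (AB)(alpha beta) - (alpha B)(A beta)."""
    diff = ab * alpha_beta - alpha_b * a_beta
    if diff > 0:
        return "positive"
    if diff < 0:
        return "negative"
    return "independent"


def independence_value(ab, a_beta, alpha_b, alpha_beta):
    """(A)(B)/N computed from the four second-order frequencies."""
    n = ab + a_beta + alpha_b + alpha_beta
    return Fraction((ab + a_beta) * (ab + alpha_b), n)


# Example iv.
print(association_sign(110, 290, 90, 510))

# Coin-tossing record of section 7: (AB) = 26 against 23.32 expected.
coins = (26, 18, 27, 29)
print(association_sign(*coins), float(independence_value(*coins)))
```

The coin record comes out as positively associated although no real connection is possible, which is precisely the difficulty raised in §§ 7 and 8.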

If (AB)/(B) = 0.70 and (Aβ)/(β) = 0.40, this would mean a considerable positive association between A and B. But if it were only stated that

\[ \frac{(AB)}{(B)} = 0.70, \qquad \frac{(A)}{N} = 0.67, \]

the association would appear to be small. Yet the two statements are equivalent if (B)/N = 0.9, for then we have

\[ \frac{(A)}{N} = 0.7 \times 0.9 + 0.4 \times 0.1 = 0.67. \]

The meaning of (a) or (b), in fact, cannot be fully realised unless the value of (B)/N (or (A)/N in the second case) is known, and therefore (c) is to be preferred to (a), and (d) to (b). An exception may, however, be made in cases where the proportion of B's (or A's) in the universe is very small, so that (A)/N approaches closely to (Aβ)/(β) or (B)/N to (αB)/(α) (cf. Example vi. below).

There still remains the choice between (a) and (b), or between (c) and (d). This must be decided with reference to the second principle, i.e. with regard to the more important aspect of the problem under discussion, the exact question to be answered, or the hypothesis to be tested, as illustrated by the examples below. Where no definite question has to be answered or hypothesis tested both pairs of proportions may be tabulated, as in Example vi. again.

Example v. Association between sex and death. (Material from 64th Annual Report Reg. General. [Cd. 1230] 1903.)

    Males in England and Wales, 1901        15,773,000
    Females  "     "     "     "            16,848,000
    Of the Males died                          285,618
    Of the Females died                        265,967

We may denote the number of males by (A), the number of deaths by (B); then the natural comparison is between (AB)/(A) and (αB)/(α), i.e. the proportion of males that died and the proportion of females. We find

\[ \frac{(AB)}{(A)} = \frac{285{,}618}{15{,}773{,}000} = 0.0181, \qquad \frac{(\alpha B)}{(\alpha)} = \frac{265{,}967}{16{,}848{,}000} = 0.0158. \]

Therefore (AB)/(A) > (αB)/(α), and there is positive association between male-sex and death. It is usual to express proportions of deaths, births, marriages, etc., to the population as rates per thousand, so that the above figures would be written:

    Death-rate among Males        18.1 per thousand.
      "    "     "   Females      15.8  "     "

A comparison of the death-rate among males with the death-rate for the whole population would be equally valid, but it should be remembered that the latter depends on the sex-ratio as well as on the causes that determine the death-rates amongst males and females. The above figures give:

    Death-rate among males               18.1 per thousand.
      "    "   for whole population      16.9  "     "

This brings out the difference between the death-rates of males and of the whole population, but is not so clear an indication of the difference between males and females, which is the point to be investigated.

A comparison of the form (4) (c) is again valid for testing the association, but the form is not desirable, illustrating very well the remarks made above. Statisticians are concerned with death-rates, and not with the sex-ratios of the living and the dead. The student should learn, however, to recognise such forms of statement as the following, as equivalent to the above:

    Proportion of males amongst those that died in the year            518 per thousand.
    Proportion of males amongst those that did not die in the year     483  "     "

"Since (AB)/(B) > (Aβ)/(β), it follows, as before, that there is positive association between A and B."

Example vi. Deaf-mutism and Imbecility. (Material from Census of 1901. Summary Tables. [Cd. 1523.])

    Total population of England and Wales          32,528,000
    Number of the imbecile (or feeble-minded)          48,882
    Number of deaf-mutes                               15,246
    Number of imbecile deaf-mutes                         451

Required, to find whether deaf-mutism is associated with imbecility.

We may denote the number of the imbecile by (A), of deaf-mutes by (B). One of the comparisons (a) or (b) may very well be used in this case, seeing that (A)/N and (B)/N differ very little from (Aβ)/(β) and (αB)/(α) respectively. The question whether to give the preference to (a) or to (b) depends on the nature of the investigation we wish to make. If it is desired to exhibit the conditions among deaf-mutes (a) may be used:

    Proportion of imbeciles among deaf-mutes = (AB)/(B)          29.6 per thousand.
    Proportion of imbeciles in the whole population = (A)/N       1.5  "     "

If, on the other hand, it is desired to exhibit the conditions amongst the imbecile, (b) will be preferable:

    Proportion of deaf-mutes amongst the imbecile = (AB)/(A)      9.2 per thousand.
    Proportion of deaf-mutes in the whole population = (B)/N      0.5  "     "

Either comparison exhibits very clearly the high degree of association between the attributes. It may be pointed out, however, that census data as to such infirmities are very untrustworthy.

Example vii. Eye-colour of father and son (material due to Sir Francis Galton, as given by Professor Karl Pearson, Phil. Trans., A, vol. cxcv. (1900), p. 138; the classes 1, 2, and 3 of the memoir treated as light).

    Fathers with light eyes and sons with light eyes (AB)            471
       "     "     "    "    "    "    "  not light eyes (Aβ)        151
    Fathers with not light eyes and sons with light eyes (αB)        148
       "     "     "    "    "    "    "  not light eyes (αβ)        230

Required to find whether the colour of the son's eyes is associated with that of the father's. In cases of this kind the father is reckoned once for each son; e.g. a family in which the father was light-eyed, two sons light-eyed and one not, would be reckoned as giving two to the class AB and one to the class Aβ.

The best comparison here is:

    Percentage of light-eyed amongst the sons of light-eyed fathers          76 per cent.
    Percentage of light-eyed amongst the sons of not-light-eyed fathers      39  "    "

But the following is equally valid:

    Percentage of light-eyed amongst the fathers of light-eyed sons          76 per cent.
    Percentage of light-eyed amongst the fathers of not-light-eyed sons      40  "    "
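The two pairs of percentages of Example vii can be recomputed from the four second-order frequencies. The sketch below is illustrative only; the function name `comparison_percentages` and the labels of the results are hypothetical.

```python
def comparison_percentages(ab, a_beta, alpha_b, alpha_beta):
    """Percentages for the two comparisons of Example vii, rounded to units."""
    a = ab + a_beta              # fathers with light eyes
    alpha = alpha_b + alpha_beta  # fathers with not-light eyes
    b = ab + alpha_b             # sons with light eyes
    beta = a_beta + alpha_beta   # sons with not-light eyes

    def pct(part, whole):
        return round(100 * part / whole)

    return {
        "light sons of light fathers": pct(ab, a),
        "light sons of not-light fathers": pct(alpha_b, alpha),
        "light fathers of light sons": pct(ab, b),
        "light fathers of not-light sons": pct(a_beta, beta),
    }


for label, value in comparison_percentages(471, 151, 148, 230).items():
    print(label, value)
```

Either pair shows the same positive association; which pair is tabulated depends on the question to be answered.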

The reason why the former comparison is preferable is, that we usually wish to estimate the character of offspring from that of the parents, and define heredity in terms of the resemblance of offspring to parents. We do not, as a rule, want to make use of the power of estimating the character of parents from that of their offspring, nor do we define heredity in terms of the resemblance of parents to offspring. Both modes of statement, however, indicate equally clearly the tendency to resemblance between father and son.

11. The values that the four second-order frequencies take in the case of independence, viz.

\[ \frac{(A)(B)}{N}, \qquad \frac{(\alpha)(B)}{N}, \qquad \frac{(A)(\beta)}{N}, \qquad \frac{(\alpha)(\beta)}{N}, \]

are of such great theoretical importance, and of so much use as reference-values for comparing with the actual values of the frequencies (AB), (αB), (Aβ), and (αβ), that it is often desirable to employ single symbols to denote them. We shall use the symbols (AB)₀, (αB)₀, (Aβ)₀, (αβ)₀. If δ denote the excess of (AB) over (AB)₀, then we have

\[ (\alpha B) = (B) - (AB) = (B) - (AB)_0 - \delta = \frac{[N - (A)](B)}{N} - \delta = (\alpha B)_0 - \delta. \]

Similarly it may be shown that

\[ (A\beta) = (A\beta)_0 - \delta, \qquad (\alpha\beta) = (\alpha\beta)_0 + \delta. \]

Therefore, quite generally,

\[ (AB) - (AB)_0 = (\alpha\beta) - (\alpha\beta)_0 = (\alpha B)_0 - (\alpha B) = (A\beta)_0 - (A\beta). \]

Supposing, for example, N = 100, (A) = 60, (B) = 45, then

\[ (AB)_0 = 27, \qquad (\alpha B)_0 = 18, \qquad (A\beta)_0 = 33, \qquad (\alpha\beta)_0 = 22. \]

If, now, (AB) = 35, then (αB) = 45 - 35 = 10, (Aβ) = 60 - 35 = 25, (αβ) = 100 - 60 - 45 + 35 = 30, and we have

\[ 35 - 27 = 30 - 22 = 18 - 10 = 33 - 25 = 8. \]

Similarly, if A and B be disassociated and (AB) = say 19, the student will find that (αB) = 26, (Aβ) = 41, (αβ) = 14, and

\[ 19 - 27 = 14 - 22 = 18 - 26 = 33 - 41 = -8. \]

12. The value of this common difference δ may be expressed in a form that it is useful to note. We have by definition

\[ \delta = (AB) - (AB)_0 = (AB) - \frac{(A)(B)}{N}. \]

Bring the terms on the right to a common denominator, and express all the frequencies of the numerator in terms of those of the second order; then we have

\[
\delta = \frac{1}{N}\Big\{ (AB)\big[(AB) + (\alpha B) + (A\beta) + (\alpha\beta)\big] - \big[(AB) + (A\beta)\big]\big[(AB) + (\alpha B)\big] \Big\}
= \frac{1}{N}\Big\{ (AB)(\alpha\beta) - (\alpha B)(A\beta) \Big\}.
\]

That is to say, the common difference is equal to 1/Nth of the difference of the "cross products" (AB)(αβ) and (αB)(Aβ); e.g., taking the examples of § 11, we have

\[ \delta = \frac{1}{100}\{35 \times 30 - 25 \times 10\} = 8 \qquad \text{and} \qquad \delta = \frac{1}{100}\{19 \times 14 - 26 \times 41\} = -8. \]

It is evident that the difference of the cross-products may be very large if N be large, although δ is really very small. In using the difference of the cross-products to test mentally the sign of the association in a case where all the four second-order frequencies are given, this should be remembered: the difference should be compared with N, or it will be liable to suggest a higher degree of association than actually exists.

Example viii. The following data were observed for hybrids of Datura (W. Bateson and Miss Saunders, Report to the Evolution Committee of the Royal Society, 1902):

    Flowers violet, fruits prickly (AB)      47
       "      "       "    smooth (Aβ)       12
    Flowers white,  fruits prickly (αB)      21
       "      "       "    smooth (αβ)        3

Investigate the association between colour of flower and character of fruit.

Since 3 × 47 = 141, 12 × 21 = 252, i.e. (AB)(αβ) < (αB)(Aβ), there is clearly a negative association; 252 - 141 = 111, and at first sight this considerable difference is apt to suggest a considerable association. But the magnitude of δ is only 111/83 = 1.3, so that in point of fact the association is small, so small that no stress can be laid on it as indicating anything but a fluctuation of sampling. Working out the percentages we have:

    Percentage of violet-flowered plants with prickly fruits      80 per cent.
    Percentage of white-flowered plants with prickly fruits       88  "    "

13. While the methods used in the preceding pages suffice for most practical purposes, it is often very convenient to measure the intensities of association in different cases by means of some formula or "coefficient," so devised as to be zero when the attributes are independent, +1 when they are completely associated, and -1 when they are completely disassociated, in the sense of § 6. If we use the term "complete association" in the wider sense there defined, we have, grouping the frequencies in a small table in a way that is sometimes convenient, the three cases of complete association:

(A) = (B) = (AB), so that all A's are B and also all B's are A. The three corresponding cases of complete disassociation are (4), (5), (6).

The coefficient is only mentioned here to direct the attention of the student to the possibility of forming such a measure of association, a measure which serves a similar purpose in the case of attributes to that served by certain other coefficients in the cases of manifold classification (cf. Chap. V.) and of variables (cf. Chap. IX., and the references to Chaps. X. and XVI.). For further illustrations of the use of this coefficient the reader is referred to the reference (1) at the end of this chapter; for the modified form of the coefficient, possessing the same properties but certain advantages, to ref. (3); and for a mode of deducing another coefficient, based on theorems in the theory of variables, which has come into more general use, though in the opinion of the present writer its use is of doubtful advantage, to ref. (4). Reference should also be made to the coefficient described in § 10 of Chap. XI. The question of the best coefficient to use as a measure of association is still the subject of controversy: for a discussion the student is referred to refs. (3), (5), and (6).

14. In concluding this chapter, it may be well to repeat, for the sake of emphasis, that (cf. § 5) the mere fact of 80, 90, or 99 per cent. of A's being B implies nothing as to the association of A with B; in the absence of information, we can but assume that 80, 90, or 99 per cent. of α's may also be B. In order to apply the criterion of independence for two attributes A and B, it is necessary to have information concerning α's and β's as well as A's and B's, or concerning a universe that includes both α's and A's, β's and B's. Hence an investigation as to the causal relations of an attribute A must not be confined to A's, but must be extended to α's (unless, of course, the necessary information as to α's is already obtainable): no comparison is otherwise possible. It would be no use to obtain with great pains the result (cf. Example vi.) that 29.6 per thousand of deaf-mutes were imbecile unless we knew that the proportion of imbeciles in the whole population was only 1.5 per thousand; nor would it contribute anything to our knowledge of the heredity of deaf-mutism to find out the proportion of deaf-mutes amongst the offspring of deaf-mutes unless the proportions amongst the offspring of normal individuals were also investigated or known.

REFERENCES.

(1) Yule, G. U., "On the Association of Attributes in Statistics," Phil. Trans. Roy. Soc., Series A, vol. cxciv., 1900, p. 257. (Deals fully with the theory of association: the association coefficient of § 13 suggested.)
(2) Yule, G. U., "Notes on the Theory of Association of Attributes in Statistics," Biometrika, vol. ii., 1903, p. 121. (Contains an abstract of the principal portions of (1) and other matter.)

(3) Yule, G. U., "On the Methods of Measuring the Association between Two Attributes," Jour. Roy. Stat. Soc., vol. lxxv., 1912, pp. 579-642. (A critical survey of the various coefficients that have been suggested for measuring association and their properties: a modified form of the coefficient of § 13 given which possesses marked advantages.)
(4) Pearson, Karl, "On the Correlation of Characters not Quantitatively Measurable," Phil. Trans. Roy. Soc., Series A, vol. cxcv., 1900, p. 1. (Deals with the problem of measurement of intensity of association from the standpoint of the theory of variables, giving a method which has since been largely used: only the advanced student will be able to follow the work. For a criticism see ref. 3.)
(5) Pearson, Karl, and David Heron, "On Theories of Association," Biometrika, vol. ix., 1913, pp. 159-332. (A reply to criticisms in ref. 3.)
(6) Greenwood, M., and G. U. Yule, "The Statistics of Anti-typhoid and Anti-cholera Inoculations, and the interpretation of such statistics in general," Proc. Roy. Soc. of Medicine, vol. viii., 1915, p. 113. (Cited for the discussion of association coefficients in § 4, and the conclusion that none of these coefficients are of much value for comparative purposes in interpreting statistics of the type considered.)
(7) Lipps, G. F., "Die Bestimmung der Abhängigkeit zwischen den Merkmalen eines Gegenstandes," Berichte d. math.-phys. Klasse d. kgl. sächsischen Gesellschaft d. Wissenschaften, Leipzig, Feb. 1905. (Deals with the general theory of the dependence between two characters, however classified; the coefficient of association of § 13 is again suggested independently.)

EXERCISES.

1. At the census of England and Wales in 1901 there were (to the nearest 1000) 15,729,000 males and 16,799,000 females; 3497 males were returned as deaf-mutes from childhood, and 3072 females. State proportions exhibiting the association between deaf-mutism from childhood and sex. How many of each sex for the same total number would have been deaf-mutes if there had been no association?

2. Show, as briefly as possible, whether A and B are independent, positively associated, or negatively associated in each of the following cases:

    (a)  N    = 5000     (A)  = 2350     (B)  = 3100     (AB) = 1600
    (b)  (A)  =  490     (AB) =  294     (α)  =  570     (αB) =  380
    (c)  (AB) =  256     (αB) =  768     (Aβ) =   48     (αβ) =  144

3. (Figures derived from Darwin's Cross- and Self-fertilisation of Plants.) The table below gives the numbers of plants of certain species that were above or below the average height, stating separately those that were derived from cross-fertilised and from self-fertilised parentage. Investigate the association between height and cross-fertilisation of parentage, and draw attention to any special points you notice.

4. (Figures from same source as Example vii., but material differently grouped; classes 7 and 8 of the memoir treated as "dark.") Investigate the association between darkness of eye-colour in father and son from the following data:

    Fathers with dark eyes and sons with dark eyes (AB)                50
       "     "     "    "   "    "    "  not-dark eyes (Aβ)            79
    Fathers with not-dark eyes and sons with dark eyes (αB)            89
       "     "     "    "    "    "    "  not-dark eyes (αβ)          782

Also tabulate for comparison the frequencies that would have been observed had there been no heredity, i.e. the values of (AB)₀, (Aβ)₀, etc. (§ 11).

5. (Figures from same source as above.) Investigate the association between eye colour of husband and eye colour of wife ("assortative mating") from the data given below.

    Husbands with light eyes and wives with light eyes (AB)            309
       "      "     "    "    "    "     "  not-light eyes (Aβ)        214
    Husbands with not-light eyes and wives with light eyes (αB)        132
       "      "     "    "     "    "     "  not-light eyes (αβ)       119

Also tabulate for comparison the frequencies that would have been observed had there been strict independence between eye colour of husband and eye colour of wife, i.e. the values of (AB)₀, etc., as in question 4.

6. (Figures from the Census of England and Wales, 1891, vol. iii.: the data cannot be regarded as trustworthy.) The figures given below show the number of males in successive age groups, together with the number of the blind (A), of the mentally-deranged (B), and the blind mentally-deranged (AB). Trace the association between blindness and mental derangement from childhood to old age, tabulating the proportions of insane amongst the whole population and amongst the blind, and also the association coefficient Q of § 13. Give a short verbal statement of your results.

CHAPTER IV.

PARTIAL ASSOCIATION.

1-2. Uncertainty in interpretation of an observed association. 3-5. Source of the ambiguity: partial associations. 6-8. Illusory association due to the association of each of two attributes with a third. 9. Estimation of the partial associations from the frequencies of the second order. 10-12. The total number of associations for a given number of attributes. 13-14. The case of complete independence.

1. If we find that in any given case (AB) > or < (A)(B)/N, all that is known is that there is a relation of some sort or kind between A and B. The result by itself cannot tell us whether the relation is direct, whether possibly it is only due to "fluctuations of sampling" (cf. Chap. III. §§ 7-8), or whether it is of any other particular kind that we may happen to have in our minds at the moment. Any interpretation of the meaning of the association is necessarily hypothetical, and the number of possible alternative hypotheses is in general considerable.

2. The commonest of all forms of alternative hypothesis is of this kind: it is argued that the relation between the two attributes A and B is not direct, but due, in some way, to the association of A with C and of B with C. An illustration or two will make the matter clearer:

(1) An association is observed between "vaccination" and "exemption from attack by small-pox," i.e. more of the vaccinated than of the unvaccinated are exempt from attack. It is argued that this does not imply a protective effect of vaccination, but is wholly due to the fact that most of the unvaccinated are drawn from the lowest classes, living in very unhygienic conditions. Denoting vaccination by A, exemption from attack by B, hygienic conditions by C, the argument is that the observed association between A and B is due to the associations of both with C.

(2) It is observed, at a general election, that a greater proportion of the candidates who spent more money than their opponents won their elections than of those who spent less. It is argued that this does not mean an influence of expenditure on the result of elections, but is due to the fact that Conservative principles generally carried the day, and that the Conservatives generally spent more than the Liberals. Denoting winning by A, spending more than the opponent by B, and Conservative by C, the argument is the same as the above (cf. Question 9 at the end of the chapter).

(3) An association is observed between the presence of some attribute in the father and its presence in the son, and also between the presence of the attribute in the grandfather and its presence in the grandson. Denoting the presence of the attribute in son, father, and grandfather by A, B, and C, the question arises whether the association between A and C may not be due solely to the associations between A and B, B and C, respectively.

3. The ambiguity in such cases evidently arises from the fact that the universe of observation, in each case, contains not merely objects possessing the third attribute alone, or objects not possessing it, but both. If the universe were restricted to either class alone the given ambiguity would not arise, though of course others might remain. Thus, in the first illustration, if the statistics of vaccination and attack were drawn from one narrow section of the population living under approximately the same hygienic conditions, and an association were still observed between vaccination and exemption from attack, the supposed argument would be refuted. The fact would prove that the association between vaccination and exemption could not be wholly due to the association of both with hygienic conditions. Again, in the second illustration, if we confine our attention to the "universe" of Conservatives (instead of dealing with candidates of both parties together), and compare the percentages of Conservatives winning elections when they spend more than their opponents and when they spend less, we shall avoid the possible fallacy. If the percentage is greater in the former case than in the latter, it cannot be for the reasons suggested in § 2. The biological case of the third illustration should be similarly treated. If the association between A and C be observed for those cases in which all the parents, say, possess the attribute, or else all do not, and it is still sensible, then the association first observed between A and C for the whole universe cannot have been due solely to the observed associations between A and B, B and C.

4. The associations observed between the attributes A and B in the universe of C's and the universe of γ's may be termed partial associations, to distinguish them from the total associations observed between A and B in the universe at large. In terms of the definition of § 5 of Chap. III., A and B will be said to be positively associated in the universe of C's (cf. § 4 of Chap. III.) when

    (ABC) > (AC)(BC)/(C)                                   (1)

and negatively associated in the converse case. As in the simpler case, the association is most simply tested by a comparison of percentages or proportions (§ 9, Chap. III.), although for some purposes a "coefficient of association" of some kind may be useful. Confining our attention to the more fundamental method, if A and B are positively associated within the universe of C's, we must have, to quote only the four most convenient comparisons (cf. (4) (a)-(d), Chap. III., p. 31),

    (a)  (ABC)/(AC) > (BC)/(C)
    (b)  (ABC)/(AC) > (αBC)/(αC)
    (c)  (ABC)/(BC) > (AC)/(C)                             (2)
    (d)  (ABC)/(BC) > (AβC)/(βC)

and vice versa. The remarks of § 10, Chap. III., as to the choice of the comparison to be used, apply of course equally to the present case. These inequalities may easily be rewritten for any other case by making the proper substitutions in the symbols; thus to obtain the inequalities for testing the association between A and C in the universe of B's, B must be written for C, β for γ, and vice versa, throughout, it being remembered that the order of the letters in the class-symbol is immaterial.

5. Though we shall confine ourselves in the present work to the detailed discussion of the case of three attributes, it should be noticed that precisely similar conceptions and formulae to the above apply in the general case where more than three attributes have been noted, or where the relations of more than three have to be taken into account. If, when it is observed that A and B are still associated within the universe of C's, it is argued that this is due to the association of both A and B with D, the argument may be tested by still further limiting the field of observation to the universe CD. If A and B are positively associated within the universe of CD's, then

    (ABCD) > (ACD)(BCD)/(CD),

and the association cannot be wholly ascribed to the presence and absence of D as suggested. If it be then argued that the presence and absence of E is the source of association, the process may be repeated as before, the association of A and B being tested for the universe CDE, and so on as far as practicable. Partial associations thus form the basis of discussion for any case, however complicated. The two following examples will serve as illustrations for the case of three attributes.

Example i. (Material from ref. 5 of Chap. I.) The following are the proportions per 10,000 of boys observed with certain classes of defects, amongst a number of school children. (A) denotes the number with development defects, (B) with nerve-signs, (D) the number of the "dull."

    N      10,000        (AB)     338
    (A)       877        (AD)     338
    (B)     1,086        (BD)     455
    (D)       789        (ABD)    153

The Report from which the figures are drawn concludes that "the connecting link between defects of body and mental dulness is the coincident defect of brain which may be known by observation of abnormal nerve-signs." Discuss this conclusion.

The phrase "connecting link" is a little vague, but it may mean that the mental defects indicated by nerve-signs B may give rise to development-defects A, and also to mental-dulness D; A and D being thus common effects of the same cause B (or another attribute necessarily indicated by B), and not directly influencing each other. The case is thus similar to that of the first illustration of § 2 (liability to small-pox and to non-vaccination being held to be common effects of the same circumstances), and may be similarly treated by investigation of the partial associations between A and D for the universes B and β. As the ratios (A)/N, (B)/N, (D)/N are small, comparisons of the form (4) (a) or (b) of Chap. III. (p. 31), or (2) (a) (b) above, may very well be used (cf. the remarks in § 10 of the same chapter, pp. 31-2).

The following figures illustrate, then, the association between A and D for the whole universe, the B-universe and the β-universe:

For the entire material:
    Proportion of the dull = (D)/N = 789/10,000 = 7.9 per cent.
    Proportion of the defectively developed who were dull = (AD)/(A) = 338/877 = 38.5 per cent.

For those exhibiting nerve-signs:
    Proportion of the dull = (BD)/(B) = 455/1086 = 41.9 per cent.
    Proportion of the defectively developed who were dull = (ABD)/(AB) = 153/338 = 45.3 per cent.

For those not exhibiting nerve-signs:
    Proportion of the dull = (βD)/(β) = 334/8914 = 3.7 per cent.
    Proportion of the defectively developed who were dull = (AβD)/(Aβ) = 185/539 = 34.3 per cent.

The results are extremely striking; the association between A and D is very high indeed both for the material as a whole (the universe at large) and for those not exhibiting nerve-signs (the β-universe), but it is very small for those who do exhibit nerve-signs (the B-universe). This result does not appear to be in accord with the conclusion of the Report, as we have interpreted it, for the association between A and D in the β-universe should in that case have been very low instead of very high.

Example ii. Eye-colour of grandparent, parent and child. (Material from Sir Francis Galton's Natural Inheritance (1889), table 20, p. 216. The table only gives particulars for 78 large families with not less than 6 brothers or sisters, so that the material is hardly entirely representative, but serves as a good illustration of the method.) The original data are treated as in Example vii. of the last chapter (p. 34). Denoting a light-eyed child by A, parent by B, grandparent by C, every possible line of descent is taken into account. Thus, taking the following two lines of the table,

         Children                Parents               Grandparents
    A.            α.        B.            β.        C.            γ.
    Light-eyed.   Not       Light-eyed.   Not       Light-eyed.   Not
                  light-                  light-                  light-
                  eyed.                   eyed.                   eyed.
        4          5            1          1            1          3
        3          4            1          1            4          0

the first would give 4 x 1 x 1 = 4 to the class ABC, 4 x 1 x 3 = 12 to the class ABγ, 4 to AβC, 12 to Aβγ, 5 to αBC, 15 to αBγ, 5 to αβC, and 15 to αβγ; the second would give 3 x 1 x 4 = 12 to the class ABC, 12 to AβC, 16 to αBC, 16 to αβC, and none to the remainder. The class-frequencies so derived from the whole table are,

    (ABC)   1928        (αBC)   303
    (ABγ)    596        (αBγ)   225
    (AβC)    552        (αβC)   395
    (Aβγ)    508        (αβγ)   501
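Every class-frequency of lower order follows from these eight ultimate frequencies by simple addition. The following is a minimal sketch of that bookkeeping in Python; the dictionary layout and the helper name `freq` are illustrative assumptions and form no part of the original text.

```python
# Ultimate (third-order) class-frequencies of Example ii.
# Key: (child light-eyed?, parent light-eyed?, grandparent light-eyed?)
ULTIMATE = {
    (True, True, True): 1928,    # (ABC)
    (True, True, False): 596,    # (ABγ)
    (True, False, True): 552,    # (AβC)
    (True, False, False): 508,   # (Aβγ)
    (False, True, True): 303,    # (αBC)
    (False, True, False): 225,   # (αBγ)
    (False, False, True): 395,   # (αβC)
    (False, False, False): 501,  # (αβγ)
}


def freq(a=None, b=None, c=None):
    """Frequency of a class; None leaves that attribute unspecified.

    freq() is N, freq(a=True) is (A), freq(b=False, c=True) is (βC), etc.
    (Hypothetical helper, introduced only for this illustration.)
    """
    wanted = (a, b, c)
    return sum(
        count
        for key, count in ULTIMATE.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    )


print("N    =", freq())
print("(A)  =", freq(a=True))
print("(AB) =", freq(a=True, b=True))
print("(βC) =", freq(b=False, c=True))
```

Any comparison of proportions, such as (ABC)/(BC) against (AB)/(B), can then be formed as a ratio of two such calls.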

The following comparisons indicate the association between grandparents and parents, parents and children, and grandparents and grandchildren, respectively:

Grandparents and Parents.
    Proportion of light-eyed amongst the children of light-eyed grandparents = (BC)/(C) = 2231/3178 = 70.2 per cent.
    Proportion of light-eyed amongst the children of not-light-eyed grandparents = (Bγ)/(γ) = 821/1830 = 44.9 per cent.

Parents and Children.
    Proportion of light-eyed amongst the children of light-eyed parents = (AB)/(B) = 2524/3052 = 82.7 per cent.
    Proportion of light-eyed amongst the children of not-light-eyed parents = (Aβ)/(β) = 1060/1956 = 54.2 per cent.

In both the above cases we are really dealing with the association between parent and offspring, and consequently the intensity of association is, as might be expected, approximately the same; in the next case it is naturally lower:

Grandparents and Grandchildren.
    Proportion of light-eyed amongst the grandchildren of light-eyed grandparents = (AC)/(C) = 2480/3178 = 78.0 per cent.
    Proportion of light-eyed amongst the grandchildren of not-light-eyed grandparents = (Aγ)/(γ) = 1104/1830 = 60.3 per cent.

We proceed now to test the partial associations between grandparents and grandchildren, as distinct from the total associations given above, in order to throw light on the real nature of the resemblance. There are two such partial associations to be tested: (1) where the parents are light-eyed, (2) where they are not-light-eyed. The following are the comparisons:

Grandparents and Grandchildren: Parents light-eyed.
    Proportion of light-eyed amongst the grandchildren of light-eyed grandparents = (ABC)/(BC) = 1928/2231 = 86.4 per cent.
    Proportion of light-eyed amongst the grandchildren of not-light-eyed grandparents = (ABγ)/(Bγ) = 596/821 = 72.6 per cent.

Grandparents and Grandchildren: Parents not-light-eyed.
    Proportion of light-eyed amongst the grandchildren of light-eyed grandparents = (AβC)/(βC) = 552/947 = 58.3 per cent.
    Proportion of light-eyed amongst the grandchildren of not-light-eyed grandparents = (Aβγ)/(βγ) = 508/1009 = 50.3 per cent.

In both cases the partial association is quite well-marked and positive; the total association between grandparents and grandchildren cannot, then, be due wholly to the total associations between grandparents and parents, parents and children, respectively. There is an ancestral heredity, as it is termed, as well as a parental heredity.

We need not discuss the partial association between children and parents, as it is comparatively of little consequence. It may be noted, however, as regards the above results, that the most important feature may be brought out by stating three ratios only. If A and B are positively associated, (AB)/(B) > (A)/N. If A and C are positively associated in the universe of B's, (ABC)/(BC) > (AB)/(B). Hence (A)/N, (AB)/(B), and (ABC)/(BC) form an ascending series. Thus we have from the given data:

    Proportion of light-eyed amongst children in general = (A)/N = 71.6 per cent.
    Proportion of light-eyed amongst the children of light-eyed parents = (AB)/(B) = 82.7 per cent.
    Proportion of light-eyed amongst the children of light-eyed parents and grandparents = (ABC)/(BC) = 86.4 per cent.

If the great-grandparents, etc., etc., were also known, the series might be continued, giving (ABCD)/(BCD), (ABCDE)/(BCDE), and so forth. The series would probably ascend continuously though with smaller intervals, A and D being positively associated in the universe of BC's, A and E in the universe of BCD's, etc.

The above examples will serve to illustrate the practical application of partial associations to concrete cases. The general nature of the fallacies involved in interpreting associations between two attributes as if they were necessarily due to the most obvious form of direct causation is more clearly exhibited by the following theorem:

6. If A and B are independent within the universe of C's and also within the universe of γ's, they will nevertheless be associated within the universe at large, unless C is independent of either A or B or both.
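A small numerical check may make the statement of the theorem concrete. The sketch below, in Python with exact fractions, builds a pooled universe in which A and B are independent inside C and inside γ, and compares (AB) with (A)(B)/N; it also evaluates the product form N[(AC) - (AC)_0][(BC) - (BC)_0] / {(C)[N - (C)]} to which the difference reduces algebraically. The counts and the function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def pooled_counts(n_c, pa_c, pb_c, n_g, pa_g, pb_g):
    """Pool a C sub-universe and a γ sub-universe.

    n_c, n_g   : sizes of the two sub-universes
    pa_*, pb_* : proportions of A's and of B's inside each (as Fractions)
    A and B are made independent inside each sub-universe by construction.
    """
    ac, bc = n_c * pa_c, n_c * pb_c   # (AC), (BC)
    ag, bg = n_g * pa_g, n_g * pb_g   # (Aγ), (Bγ)
    abc = ac * bc / n_c               # (ABC) under independence within C
    abg = ag * bg / n_g               # (ABγ) under independence within γ
    return {"N": n_c + n_g, "A": ac + ag, "B": bc + bg, "C": n_c,
            "AC": ac, "BC": bc, "AB": abc + abg}


def excess_over_independence(f):
    """(AB) - (A)(B)/N in the pooled universe."""
    return f["AB"] - f["A"] * f["B"] / f["N"]


def product_form(f):
    """N[(AC) - (AC)_0][(BC) - (BC)_0] / ((C)[N - (C)])."""
    ac0 = f["A"] * f["C"] / f["N"]
    bc0 = f["B"] * f["C"] / f["N"]
    return (f["N"] * (f["AC"] - ac0) * (f["BC"] - bc0)
            / (f["C"] * (f["N"] - f["C"])))


# C is associated with both A and B: an association appears on pooling.
mixed = pooled_counts(100, Fraction(8, 10), Fraction(7, 10),
                      100, Fraction(4, 10), Fraction(4, 10))
# C is independent of A (same proportion of A's in C and γ): none appears.
flat = pooled_counts(100, Fraction(1, 2), Fraction(7, 10),
                     100, Fraction(1, 2), Fraction(4, 10))

print(excess_over_independence(mixed), product_form(mixed))
print(excess_over_independence(flat), product_form(flat))
```

With these hypothetical figures the pooled record shows an excess of (AB) over its independence value in the first case and none in the second, although A and B are independent inside both sub-universes throughout.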

The two data give

    (ABC) = (AC)(BC)/(C)
    (ABγ) = (Aγ)(Bγ)/(γ) = [(A) - (AC)][(B) - (BC)] / [N - (C)]          (3)

Adding them together we have

    (AB) = [N(AC)(BC) - (A)(C)(BC) - (B)(C)(AC) + (A)(B)(C)] / {(C)[N - (C)]}

Write, as in § 11 of Chap. III. (p. 35),

    (AB)_0 = (A)(B)/N,    (AC)_0 = (A)(C)/N,    (BC)_0 = (B)(C)/N,

subtract (AB)_0 from both sides of the above equation, simplify, and we have

    (AB) - (AB)_0 = N[(AC) - (AC)_0][(BC) - (BC)_0] / {(C)[N - (C)]}       (4)

This proves the theorem; for the right-hand side will not be zero unless either (AC) = (AC)_0 or (BC) = (BC)_0.

7. The result indicates that, while no degree of heterogeneity in the universe can influence the association between A and B if all other attributes are independent of either A or B or both, an illusory or misleading association may arise in any case where there exists in the given universe a third attribute C with which both A and B are associated (positively or negatively). If both associations are of the same sign, the resulting illusory association between A and B will be positive; if of opposite sign, negative. The three illustrations of § 2 are all of the first kind. In (1) it is argued that the positive associations between vaccination and hygienic conditions, exemption from attack and hygienic conditions, give rise to an illusory positive association between vaccination and exemption from attack. In (2) it is argued that the positive associations between conservative and winning, conservative and spending more, give rise to an illusory positive association between winning and spending more. In (3) the question is raised whether the positive association between grandparent and grandchild may not be due solely to the positive associations between grandparent and parent, parent and child.

Misleading associations of this kind may easily arise through the mingling of records, e.g. respecting the two sexes, which a careful worker would keep distinct. Take the following case, for example. Suppose there have been 200 patients in a hospital, 100 males and 100 females, suffering from some disease. Suppose, further, that the death-rate for males (the case mortality) has been 30 per cent., for females 60 per cent. A new treatment is tried on 80 per cent. of the males and 40 per cent. of the females, and the results published without distinction of sex. The three attributes, with the relations of which we are here concerned, are death, treatment and male sex. The data show that more males were treated than females, and more females died than males; therefore the first attribute is associated negatively, the second positively, with the third. It follows that there will be an illusory negative association between the first two, death and treatment. If the treatment were completely inefficient we would, in fact, have the following results:

                                    Males.   Females.   Total.
    Treated and died                  24        24        48
    Treated and did not die           56        16        72
    Not treated and died               6        36        42
    Not treated and did not die       14        24        38
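The figures for this hospital case follow mechanically from the stated assumptions, and the size of the illusory association can be read off by pooling the two sexes. A minimal sketch in Python, assuming (as the text supposes) that the treatment has no effect whatever; the function and key names are hypothetical.

```python
from fractions import Fraction


def mixed_record(groups):
    """Pool several groups into one treated/untreated record.

    Each group is (patients, share treated, death-rate). The treatment is
    assumed completely inefficient, so the same death-rate is applied to
    the treated and the untreated of a group.
    """
    table = {"treated_died": 0, "treated_survived": 0,
             "untreated_died": 0, "untreated_survived": 0}
    for patients, share_treated, death_rate in groups:
        treated = patients * share_treated
        untreated = patients - treated
        table["treated_died"] += treated * death_rate
        table["treated_survived"] += treated * (1 - death_rate)
        table["untreated_died"] += untreated * death_rate
        table["untreated_survived"] += untreated * (1 - death_rate)
    return table


males = (100, Fraction(80, 100), Fraction(30, 100))
females = (100, Fraction(40, 100), Fraction(60, 100))
pooled = mixed_record([males, females])

treated = pooled["treated_died"] + pooled["treated_survived"]
untreated = pooled["untreated_died"] + pooled["untreated_survived"]
rate_treated = pooled["treated_died"] / treated
rate_untreated = pooled["untreated_died"] / untreated

print("death-rate, treated:  ", float(rate_treated))
print("death-rate, untreated:", float(rate_untreated))
```

In the pooled record the treated show the lower death-rate although, by construction, the treatment does nothing within either sex.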

                                                      Male line.   Female line.   Mixed record.
    Parents with attribute and children with              25             1              13
    Parents with attribute and children without           25             9              17
    Parents without attribute and children with           25             9              17
    Parents without attribute and children without        25            81              53
    (All figures per cent.)

Here 13/30 = 43 per cent. of the offspring of parents with the attribute possess the attribute themselves, but only 17/70 = 24 per cent. of the offspring of parents without the attribute. The association between attribute in parent and attribute in offspring is, however, due solely to the association of both with male sex. The student will see that if records for male-female and female-male lines were mixed, the illusory association would be negative, and that if all four lines were combined there would be no illusory association at all.

8. Illusory associations may also arise in a different way through the personality of the observer or observers. If the observer's attention fluctuates, he may be more likely to notice the presence of A when he notices the presence of B, and vice versa; in such a case A and B (so far as the record goes) will both be associated with the observer's attention C, and consequently an illusory association will be created. Again, if the attributes are not well defined, one observer may be more generous than another in deciding when to record the presence of A and also the presence of B, and even one observer may fluctuate in the generosity of his marking. In this case the recording of A and the recording of B will both be associated with the generosity of the observer in recording their presence, C, and an illusory association between A and B will consequently arise, as before.

9. It is important to notice that, though we cannot actually determine the partial associations unless the third-order frequency (ABC) is given, we can make some conjecture as to their sign from the values of the second-order frequencies. Suppose, for instance, that

    (ABC) = (AC)(BC)/(C) + δ1
    (ABγ) = (Aγ)(Bγ)/(γ) + δ2                                            (5)

so that δ1 and δ2 are positive or negative according as A and B are positively or negatively associated in the universes of C and γ respectively. Then we have by addition

    (AB) = (AC)(BC)/(C) + (Aγ)(Bγ)/(γ) + δ1 + δ2                          (6)

Hence if the value of (AB) exceed the value given by the first two terms (i.e. if δ1 + δ2 be positive), A and B must be positively associated either in the universe of C's, the universe of γ's, or both. If, on the other hand, (AB) fall short of the value given by the first two terms, A and B must be negatively associated in the universe of C's, the universe of γ's, or both. Finally, if (AB) be equal to the value of the first two terms, A and B must be positively associated in the one partial universe and negatively in the other, or else independent in both.

The expression (6) may often be used in the following form, obtained by dividing through by, say, (B):

    (AB)/(B) = [(AC)/(C)] [(BC)/(B)] + [(Aγ)/(γ)] [(Bγ)/(B)] + (δ1 + δ2)/(B)      (7)

In using this expression we make use solely of proportions or percentages, and judge of the sign of the partial associations between A and B accordingly. A concrete case, as in Example iii. below, is perhaps clearer than the general formula.

Example iii. (Figures compiled from Supplement to the Fifty-fifth Annual Report of the Registrar-General [C. 8503], 1897.) The following are the death-rates per thousand per annum, and the proportions over 65 years of age, of occupied males in general, farmers, textile workers, and glass workers (over 15 years of age in each case) during the decade 1891-1900 in England and Wales.

or both. or old farmers. Calculate what would be the death-rate for each occupation on the supposition that the death-rates for occupied males in general (11 '5. but also iu the relative proportions of the two sexes. 53 to apply the principle of equation (7). on the other hand. if it falls short.S) iV=5008 = 3584 3052 (i?) =:3052 (C) = 3178 = 2524 (AC) = 2480 -' (£C) = 2231 . {A) B. however. . and glass working still more so (13-0<16"6). It is hardly necessary to observe that as age is a variable quantity. C. appears to be unhealthy (14'6 15-9). It is evident that age-distributions vary so largely from one occupation to another that total death-rates are liable to be very misleading so misleading. liable to occur in comparisons of local death-rates. then. unhealthy. 102-3 x -016 = 13-0. the nature of the fallacy involved in assuming that crude deathrates are measures of healthiness. Textile workers Glass workers . in fact. Textile working. better than a more complex one. The simpler procedure brings out. but on the relative numbers at every single year of age. §§ 17-19. 11-5 x -868 11-5 x -966 11-5 x -984 -t- -f -h 102-3 x •132 = 23-5. owing to variations not only in the relative proportions of the old. 102-3) apply to each of its separate. calculated rate for farmers largely exceeds the actual rate farming. light-eyed grand- (The figures are those of Example A. and see whether the total death-rate so calculated exceeds or falls short If it exceeds the actual rate. . . Thus we have the following calculated death-rates : Farmers. — Eye-colour . the above procedure for calculating the comparative death-rates is extremely rough. in grandparent. the of the actual death-rate. be The death-rate for either young farmers a healthy occupation. must on the whole. . . the high death-rate observed is due solely to the large proportion of the aged. —PARTIAL ASSOCIATION.— IV. 
only death-rates for narrow limits of age Similar fallacies are (5 or 10 year age-classes) are worked out.] The < — Example iv. XI. [See also Chap. over 65). light-eyed parent (4. that they are not tabulated at all by the Registrar-General . light-eyed child parent. age-groups (under 65. occupation must on the whole be healthy . 102-3 x •034 = 14-6. parent and child. must be less than for occupied males in general (the last is actually the case) . ii.) . the actual low total death-rates are due merely to low proportions of the aged. The death-rate of those engaged in any occupation depends not only on the mere proportions over and under 65. as one would expect.

Given only the above data, investigate whether there is a partial association between child and grandparent.

If there were no partial association we would have

    (AC) = (AB)(BC)/(B) + (Aβ)(βC)/(β)
         = (2524 x 2231)/3052 + (1060 x 947)/1956
         = 1845.0 + 513.2 = 2358.2.

Actually (AC) = 2480; there must, then, be partial association either in the B-universe, the β-universe, or both. In the absence of any reason to the contrary, it would be natural to suppose there is a partial association in both; i.e. that there is a partial association with the grandparent whether the line of descent passes through "light-eyed" or "not-light-eyed" parents, but this could not be proved without a knowledge of the class-frequency (ABC).

10. The total possible number of associations to be derived from n attributes grows so rapidly with the value of n that the evaluation of them all for any case in which n is greater than four becomes almost unmanageable. For three attributes there are 9 possible associations: three totals, three partials in positive universes, and three partials in negative universes. For four attributes, the number of possible associations rises to 54, for there are 6 pairs to be formed from four attributes, and we can find 9 associations for each pair (1 total, 4 partials with the universe specified by one attribute, and 4 partials with the universe specified by two). For five attributes the student will find that there are no less than 270, and for six attributes 1215 associations.

As suggested by Examples i. and ii. above, however, it is not necessary in any actual case to investigate all the associations that are theoretically possible; the nature of the problem indicates those that are required. In Example i., for instance, the total and partial associations between A and D were alone investigated; the associations between A and B, B and D were not essential for answering the question that was asked. In Example ii., again, the three total associations and the partial association between A and C were worked out, but the partial associations between A and B, B and C were omitted as unnecessary. Practical considerations of this kind will always lessen the amount of necessary labour.
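The counts quoted in § 10 can be checked by direct enumeration: choose a pair of attributes, then let every remaining attribute be either unspecified, present, or absent in the universe. The sketch below, in Python, does this and compares the result with the closed form n(n-1)/2 x 3^(n-2), which is an inference from that counting argument and is not stated in the text; the function names are hypothetical.

```python
from itertools import combinations, product
from math import comb


def count_associations(n):
    """Enumerate every total and partial association among n attributes."""
    attributes = range(n)
    count = 0
    for pair in combinations(attributes, 2):
        others = [x for x in attributes if x not in pair]
        # Each remaining attribute is unspecified (None), required
        # present (True), or required absent (False) in the universe.
        for _universe in product((None, True, False), repeat=len(others)):
            count += 1
    return count


def closed_form(n):
    """Pairs of attributes times the 3**(n - 2) possible universes."""
    return comb(n, 2) * 3 ** (n - 2)


for n in range(3, 7):
    print(n, count_associations(n), closed_form(n))
```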

11. It might appear, at first sight, that theoretical considerations would enable us to lessen it still further. As we saw in Chapter I., all class-frequencies can be expressed in terms of those of the positive classes, of which there are 2^n in the case of n attributes. For given values of the n + 1 frequencies N, (A), (B), (C), ... of order lower than the second, assigned values of the positive class-frequencies of the second and higher orders must therefore correspond to determinate values of all the possible associations. But the number of these positive class-frequencies of the second and higher orders is only 2^n - n - 1; therefore the number of algebraically independent associations that can be derived from n attributes is only 2^n - n - 1. For successive values of n this gives:

    n                  2     3     4     5     6
    2^n - n - 1        1     4    11    26    57

[...] and the corresponding differences for the frequencies (AB), (AC), and (BC). The quantity in the bracket on the left represents the unknown association; the quantities in the brackets on the right represent, say, the four known associations. Clearly, the relation is not of such a simple kind that the term on the left can be, in general, mentally evaluated. Hence in considering the choice and number of associations to be actually tabulated, regard must be had to practical considerations rather than to theoretical relations.

13. The particular case in which all the 2^n - n - 1 given associations are zero is worth some special investigation. It follows, in the first place, that all other possible associations must be zero, i.e. that a state of complete independence, as we may term it, exists. Suppose, for instance, that we are given

    (AB) = (A)(B)/N,    (AC) = (A)(C)/N,    (BC) = (B)(C)/N,
    (ABC) = (AC)(BC)/(C) = (A)(B)(C)/N^2.

Then it follows at once that we have also

    (ABC) = (AB)(BC)/(B) = (AB)(AC)/(A),

i.e. A and C are independent in the universe of B's, and B and C in the universe of A's. Again,

    (ABγ) = (AB) - (ABC) = (A)(B)/N - (A)(B)(C)/N^2 = (A)(B)(γ)/N^2 = (Aγ)(Bγ)/(γ).

Therefore A and B are independent in the universe of γ's. Similarly, it may be shown that A and C are independent in the universe of β's, B and C in the universe of α's.

In the next place it is evident from the above that relations of the general form (to write the equation symmetrically)

    (ABC)/N = [(A)/N] [(B)/N] [(C)/N]                                    (8)

must hold for every class-frequency. This relation is the general form of the equation of independence, (2) (d), Chap. III. (p. 26).

14. It must be noted, however, that (8) is not a criterion for the complete independence of A, B, and C in the sense that the equation

    (AB)/N = [(A)/N] [(B)/N]

is a criterion for the complete independence of A and B. If we are given N, (A), and (B), and the last relation quoted holds good, we know that similar relations must hold for (Aβ), (αB), and (αβ). If N, (A), (B), and (C) be given, and the equation (8) hold good, we can draw no conclusion without further information; the data are insufficient. There are eight algebraically independent class-frequencies in the case of three attributes, while N, (A), (B), (C) are only four: the equation (8) must therefore be shown to hold good for four frequencies of the third order before the conclusion can be drawn that it holds good for the remainder, i.e. that a state of complete independence subsists. The direct verification of this result is left for the student.

Quite generally, if N, (A), (B), (C), ... be given, the relation

    (ABC...)/N = [(A)/N] [(B)/N] [(C)/N] ...                             (9)

must be shown to hold good for 2^n - n - 1 of the nth order classes before it may be assumed to hold good for the remainder. It is only because 2^n - n - 1 = 1 when n = 2 that the relation

    (AB)/N = [(A)/N] [(B)/N]

may be treated as a criterion for the independence of A and B. If all the n (n > 2) attributes are completely independent, the relation (9) holds good; but it does not follow that if the relation (9) hold good they are all independent.

REFERENCES.

(1) Yule, G. U., "On the Association of Attributes in Statistics," Phil. Trans. Roy. Soc., Series A, vol. cxciv., 1900, p. 257. (Deals fully with the theory of partial as well as of total association, with numerous illustrations: a notation suggested for the partial coefficients.)

(2) Yule, G. U., "Notes on the Theory of Association of Attributes in Statistics," Biometrika, vol. ii., 1903, p. 121. (Cf. especially §§ 4 and 5, on the theory of complete independence, and the fallacies due to mixing of records.)

EXERCISES.

1. Take the following figures for girls corresponding to those for boys in Example i., p. 45, and discuss them similarly, but not necessarily using exactly the same comparisons, to see whether the conclusion that "the connecting link between defects of body and mental dulness is the coincident defect of brain which may be known by observation of abnormal nerve signs" seems to hold good.

A, development defects; B, nerve signs; D, mental dulness.

A, light eye-colour in husband; B, in wife; C, in son.

CHAPTER V.

MANIFOLD CLASSIFICATION.

1. The general principle of a manifold classification. 2-4. The table of double-entry or contingency table and its treatment by fundamental methods. 5-8. The coefficient of contingency. 9-10. Analysis of a contingency table by tetrads. 11-13. Isotropic and anisotropic distributions. 14-15. Homogeneity of the classifications dealt with in this and the preceding chapters: heterogeneous classifications.

1. Classification by dichotomy is, as was briefly pointed out in Chap. I. § 5, a simpler form of classification than usually occurs in the tabulation of practical statistics. It may be regarded as a special case of a more general form in which the individuals or objects observed are first divided under, say, s heads, A1, A2, ... As, each of the classes so obtained then subdivided under t heads, B1, B2, ... Bt, each of these under u heads, C1, C2, ... Cu, and so on, thus giving rise to s x t x u x ... ultimate classes altogether.

2. The general theory of such a manifold as distinct from a twofold or dichotomous classification, in the case of n attributes or characters ABC ... N, would be extremely complex: in the present chapter the discussion will be confined to the case of two characters, A and B, only. If the classification of the A's be s-fold and of the B's t-fold, the frequencies of the st classes of the second order may be most simply given by forming a table with s columns headed A1 to As, and t rows headed B1 to Bt. The number of the objects or individuals possessing any combination of the two characters, say Am and Bn, i.e. the frequency of the class AmBn, is entered in the compartment common to the mth column and the nth row, the st compartments thus giving all the second-order frequencies. The totals at the ends of rows and the feet of columns give the first-order frequencies, i.e. the numbers of Am's and Bn's, and finally the grand total at the right-hand bottom corner gives the whole number of observations. Tables I. and II. below will serve as illustrations of such tables of double-entry or contingency tables, as they have been termed by Professor Pearson (ref. 1).

3. In Table I. the division is 3 x 3-fold: the houses in England and Wales are divided into those which are in (1) London, (2) other urban districts, (3) rural districts, and the houses in each of these divisions are again classified into (1) inhabited houses, (2) uninhabited but completed houses, (3) houses that are "building," i.e. in course of erection. Thus from the first row we see that there were in London, in round numbers, 616,000 houses, of which 571,000 were inhabited, 40,000 uninhabited, and 5000 in course of erection; from the first column, there were 6,260,000 inhabited houses in England and Wales, of which 571,000 were in London, 4,064,000 in other urban districts, and 1,625,000 in rural districts.

Table I. Houses in England and Wales. (Census of 1901. Summary Table X.) (000's omitted.)

Table II. is read similarly to the last. Taking the first row, it tells us that there were 2811 men with blue eyes noted, of whom 1768 had fair hair, 807 brown hair, 189 black hair, and 47 red hair. Similarly, from the first column, there were 2829 men with fair hair, of whom 1768 had blue eyes, 946 grey or green eyes, and 115 brown eyes. The tables are a generalised form of the fourfold (2 x 2-fold) tables in § 13, Chap. III.

4. For the purpose of discussing the nature of the relation between the $A$'s and the $B$'s, any such table may be treated on the principles of the preceding chapters by reducing it in different ways to 2 x 2-fold form. It then becomes possible to trace the association between any one or more of the $A$'s and any one or more of the $B$'s, either in the universe at large or in universes limited by the omission of one or more of the $A$'s, of the $B$'s, or of both. Taking Table I., for example, let us trace the association between the erection of houses and the urban character of a district. Adding together the first two rows, i.e. pooling London and the other urban districts together, and similarly adding the first two columns, so as to make no distinction between inhabited and uninhabited houses as long as they are completed, we find:

Proportion of all houses which are in course of erection in urban districts: 50/5010 = 10 per thousand.
Proportion of all houses which are in course of erection in rural districts: 12/1761 = 7 per thousand.

There is therefore, as might be expected, a distinct positive association, a larger proportion of houses being in course of erection in urban than in rural districts. If, as another illustration, it be desired to trace the association between the "uninhabitedness" of houses and the urban character of the district, the procedure will be rather different. Rows 1 and 2 may be added together as before, but column 3 may be omitted altogether, as the houses which are only in course of erection do not enter into the question. We then have:

Proportion of all houses which are uninhabited in urban districts: 325/4960 = 66 per thousand.
Proportion of all houses which are uninhabited in rural districts: 124/1749 = 71 per thousand.

The association is therefore negative, the proportion of houses uninhabited being greater in rural than in urban districts.
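The two reductions of Table I. to fourfold form can be retraced mechanically. The sketch below is illustrative only: the London and rural rows are those quoted in the text, while the "other urban" row is an assumption, back-calculated from the pooled figures 50, 5010, 325 and 4960; the helper names `pool` and `per_thousand` are hypothetical.

```python
# Table I (000's omitted). Rows: London, other urban, rural.
# Columns: inhabited, uninhabited, building.
# The "other urban" row is an assumption, back-calculated from the pooled
# figures quoted in the text (50/5010 and 325/4960).
TABLE_I = [
    [571, 40, 5],
    [4064, 285, 45],
    [1625, 124, 12],
]


def pool(table, row_groups, col_groups):
    """Condense a table by summing the listed groups of rows and of columns."""
    return [
        [sum(table[r][c] for r in rows for c in cols) for cols in col_groups]
        for rows in row_groups
    ]


def per_thousand(part, whole):
    """Proportion expressed per thousand, rounded to the nearest unit."""
    return round(1000 * part / whole)


# Erection of houses v. urban character: pool rows 1 and 2, pool columns 1 and 2.
built = pool(TABLE_I, [(0, 1), (2,)], [(0, 1), (2,)])
urban_building = per_thousand(built[0][1], sum(built[0]))
rural_building = per_thousand(built[1][1], sum(built[1]))

# Uninhabitedness v. urban character: pool rows 1 and 2, omit column 3.
empty = pool(TABLE_I, [(0, 1), (2,)], [(0,), (1,)])
urban_empty = per_thousand(empty[0][1], sum(empty[0]))
rural_empty = per_thousand(empty[1][1], sum(empty[1]))

print(urban_building, rural_building)  # 10 7  (positive association)
print(urban_empty, rural_empty)        # 66 71 (negative association)
```

Omitting a column, rather than pooling it, limits the universe to completed houses, which is why the second comparison uses 4960 and 1749 as denominators instead of 5010 and 1761.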

The eye- and hair-colour data of Table II. may be treated in a precisely similar fashion. If, e.g., we desire to trace the association between a lack of pigmentation in eyes and in hair, rows 1 and 2 may be pooled together as representing the least pigmentation of the eyes, and columns 2, 3, and 4 may be pooled together as representing hair with a more or less marked degree of pigmentation. We then have:

Proportion of light-eyed with fair hair: 2714/5943 = 46 per cent.
Proportion of brown-eyed with fair hair: 115/857 = 13 per cent.

The association is therefore well-marked. For comparison we may trace the corresponding association between the most marked degree of pigmentation in eyes and hair, i.e. brown eyes and black hair. Here we must add together rows 1 and 2 as before, and columns 1, 2, and 4, the column for red being really misplaced, as red represents a comparatively slight degree of pigmentation. The figures are:

Proportion of brown-eyed with black hair: 288/857 = 34 per cent.
Proportion of light-eyed with black hair: 935/5943 = 16 per cent.

The association is again positive and well-marked, but the difference between the two percentages is rather less than in the last case.

5. The mode of treatment adopted in the preceding section rests on first principles, and, if fully carried out, it gives the most detailed information possible with regard to the relations of the two attributes. At the same time a distinct need is felt in practical work for some more summary method, a method which will enable a single and definite answer to be given to such a question as: Are the $A$'s on the whole distinctly dependent on the $B$'s; and if so, is this dependence very close, or the reverse? The subject of coefficients of association, which affords the answer to this question in the case of a dichotomous classification, was only dealt with briefly and incidentally, for it is still the subject of some controversy; further, where there are only four classes of the second order to be considered the matter is not nearly so complex as where the number is, say, twenty-five or more, and the need for any summary coefficient is not so often nor so keenly felt. The ideas on which Professor Pearson's general measure of dependence, the "coefficient of contingency," is based, are, moreover, quite simple and fundamental, and the mode of calculation is therefore given in full in the following section.

The advanced student should refer to the original memoir (ref. 1) for a completer treatment of the theory of the coefficient, and of its relation to the theory of variables.

6. Generalising slightly the notation of the preceding chapters, let the frequency of $A_m$'s be denoted by $(A_m)$, the frequency of $B_n$'s by $(B_n)$, and the frequency of objects or individuals possessing both characters by $(A_mB_n)$. Then, if the $A$'s and $B$'s be completely independent in the universe at large, we must have for all values of $m$ and $n$

$$(A_mB_n) = \frac{(A_m)(B_n)}{N} = (A_mB_n)_0 \tag{1}$$

If, however, $A$ and $B$ are not completely independent, $(A_mB_n)$ and $(A_mB_n)_0$ will not be identical for all values of $m$ and $n$. Let the difference be given by

$$\delta_{mn} = (A_mB_n) - (A_mB_n)_0 \tag{2}$$

A coefficient such as we are seeking may evidently be based in some way on these values of $\delta$. It will not do, however, simply to add them together, for the sum of all the values of $\delta$, some of which are negative and others positive, must be zero in any case, the sum of both the $(AB)$'s and the $(AB)_0$'s being equal to the whole number of observations $N$. It is necessary, therefore, to get rid of the signs, and this may be done in two simple ways: (1) by neglecting them and forming the arithmetical instead of the algebraical sum of the differences $\delta$, or (2) by squaring the differences and then summing the squares. The first process is the shorter, but the second the better, as it leads to a coefficient easily treated by algebraical methods, which the first process does not: as the student will see later, squaring is very usefully and very frequently employed for the purpose of eliminating algebraical signs. Suppose, then, that every $\delta$ is calculated, and also the ratio of its square to the corresponding value of $(AB)_0$, and that the sum of all such ratios is, say, $\chi^2$; or, in symbols, using $\Sigma$ to denote "the sum of all quantities like":

$$\chi^2 = \Sigma\left\{\frac{\delta_{mn}^2}{(A_mB_n)_0}\right\} \tag{3}$$

Being the sum of a series of squares, $\chi^2$ is necessarily positive, and if $A$ and $B$ be independent it is zero, because every $\delta$ is zero. If, then, we form a coefficient $C$ given by the relation

$$C = \sqrt{\frac{\chi^2}{N + \chi^2}} \tag{4}$$

this coefficient is zero if the characters $A$ and $B$ are completely independent, and approaches more and more nearly towards unity as $\chi^2$ increases. In general, no sign should be attached to the root, for the coefficient simply shows whether the two characters are or are not independent, and nothing more, but in some cases a conventional sign may be used. Thus in Table II. slight pigmentation of eyes and of hair appear to go together, and the contingency may be regarded as definitely positive. If slight pigmentation of eyes had been associated with marked pigmentation of hair, the contingency might have been regarded as negative. $C$ is Professor Pearson's mean square contingency coefficient.

Note. Professor Pearson (ref. 1) terms $\delta$ a sub-contingency; $\chi^2$ the square contingency; the ratio $\chi^2/N$, which he denotes by $\phi^2$, the mean square contingency; and the sum of all the $\delta$'s of one sign only, on which a different coefficient can be based, the mean contingency.

7. The coefficient, in the simple form (4), has one disadvantage, viz. that coefficients calculated on different systems of classification are not comparable with each other. It is clearly desirable for practical purposes that two coefficients calculated from the same data classified in two different ways should be, at least approximately, identical. With the present coefficient this is not the case: if certain data be classified in, say, (1) 6 x 6-fold, (2) 3 x 3-fold form, the coefficient in the latter form tends to be the least. The greatest possible value of the coefficient is, in fact, only unity if the number of classes be infinitely great; for any finite number of classes the limiting value of $C$ is the smaller the smaller the number of classes. This may be briefly illustrated as follows. Replacing $\delta_{mn}$ in equation (3) by its value in terms of $(A_mB_n)$ and $(A_mB_n)_0$ we have

$$\chi^2 = \Sigma\left\{\frac{(A_mB_n)^2}{(A_mB_n)_0}\right\} - N \tag{5}$$

and therefore, denoting the expression in brackets by $S$,

$$C = \sqrt{\frac{S - N}{S}} \tag{6}$$

Now suppose we have to deal with a $t \times t$-fold classification in which $(A_m) = (B_m)$ for all values of $m$; and suppose, further, that the association between $A_m$ and $B_m$ is perfect, so that $(A_mB_m) = (A_m) = (B_m)$ for all values of $m$, the remaining frequencies of the second order being zero; all the frequency is then concentrated in the diagonal compartments of the table, and each contributes $N$ to the sum $S$. The total value of $S$ is accordingly $tN$, and the value of $C$ is

$$C = \sqrt{\frac{t - 1}{t}}$$

This is the greatest possible value of $C$ for a symmetrical $t \times t$-fold classification, and therefore, in such a table, for $t = 2$, $C$ cannot exceed 0.707.

$(1768)^2/1169$

The association in such a group is positive, negative, or zero according as the ratio $(A_mB_n)/(A_{m+1}B_n)$ is greater than, less than, or equal to the ratio $(A_mB_{n+1})/(A_{m+1}B_{n+1})$. The whole of the contingency table can be analysed into a series of elementary groups of four frequencies like the above, each one overlapping its neighbours so that an $rs$-fold table contains $(r-1)(s-1)$ such "tetrads," and the associations in them all can be very quickly determined by simply tabulating the ratios like $(A_mB_n)/(A_{m+1}B_n)$, $(A_mB_{n+1})/(A_{m+1}B_{n+1})$, etc., or perhaps better, the proportions $(A_mB_n)/\{(A_mB_n) + (A_{m+1}B_n)\}$, etc., for every pair of columns or of rows, as may be most convenient. Taking the figures of Table II. as an illustration, and working from the rows, the proportions run as follows:

For rows 1 and 2:
1768/2714 = 0.651    807/2194 = 0.368    189/935 = 0.202    47/100 = 0.470

For rows 2 and 3:
946/1061 = 0.892    1387/1825 = 0.760    746/1034 = 0.721    53/69 = 0.768

In both cases the first three ratios form descending series, but the fourth ratio is greater than the second. The signs of the associations in the six tetrads are accordingly:

+  +  -
+  +  -

The negative sign in the two tetrads on the right is striking, the more so as other tables for hair- and eye-colour, arranged in the same way, exhibit just the same characteristic. But the peculiarity will be removed at once if the fourth column be placed immediately after the first: if this be done, i.e. if "red" be placed between "fair" and "brown" instead of at the end of the colour-series, the sign of the association in all the elementary tetrads will be the same. The colours will then run fair, red, brown, black, and this would seem to be the more natural order, considering the depth of the pigmentation.

11. A distribution of frequency of such a kind that the association in every elementary tetrad is of the same sign possesses several useful and interesting properties, as shown in the following theorems. It will be termed an isotropic distribution.

(1) In an isotropic distribution the sign of the association is the same not only for every elementary tetrad of adjacent frequencies, but for every set of four frequencies in the compartments common to two rows and two columns, e.g. $(A_mB_n)$, $(A_{m+p}B_n)$, $(A_mB_{n+q})$, $(A_{m+p}B_{n+q})$.

For suppose that the sign of association in the elementary tetrads is positive, so that

$$(A_mB_n)(A_{m+1}B_{n+1}) > (A_{m+1}B_n)(A_mB_{n+1}) \tag{1}$$

and similarly,

$$(A_{m+1}B_n)(A_{m+2}B_{n+1}) > (A_{m+2}B_n)(A_{m+1}B_{n+1}) \tag{2}$$

Then multiplying up and cancelling we have

$$(A_mB_n)(A_{m+2}B_{n+1}) > (A_{m+2}B_n)(A_mB_{n+1}) \tag{3}$$

That is to say, the association is still positive though the two columns $A_m$ and $A_{m+2}$ are no longer adjacent.

(2) An isotropic distribution remains isotropic in whatever way it may be condensed by grouping together adjacent rows or columns. Thus from (1) and (3) we have, adding,

$$(A_mB_n)\left[(A_{m+1}B_{n+1}) + (A_{m+2}B_{n+1})\right] > (A_mB_{n+1})\left[(A_{m+1}B_n) + (A_{m+2}B_n)\right],$$

that is to say, the sign of the elementary association is unaffected by throwing the $(m+1)$th and $(m+2)$th columns into one.

(3) As the extreme case of the preceding theorem, we may suppose both rows and columns grouped and regrouped until only a 2 x 2-fold table is left; we then have the theorem: If an isotropic distribution be reduced to a fourfold distribution in any way whatever, by addition of adjacent rows and columns, the sign of the association in such fourfold table is the same as in the elementary tetrads of the original table.

The case of complete independence is a special case of isotropy. For if $(A_mB_n) = (A_m)(B_n)/N$ for all values of $m$ and $n$, the association is evidently zero for every tetrad. Therefore the distribution remains independent in whatever way the table be grouped, or in whatever way the universe be limited by the omission of rows or columns. The expression "complete independence" is therefore justified.

From the work of the preceding section we may say that Table II. is not isotropic as it stands, but may be regarded as a disarrangement of an isotropic distribution. It is best to rearrange such a table in isotropic order, as otherwise different reductions to fourfold form may lead to associations of different sign, though of course they need not necessarily do so.

12. The following will serve as an illustration of a table that is not isotropic, and cannot be rendered isotropic by any rearrangement of the order of rows and columns.
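Before that example, the tetrad analysis of Table II. and theorem (3) can be checked numerically. This is a sketch under stated assumptions: the cell values of Table II. are reconstructed from the ratios quoted in the text, and the function names `tetrad_signs`, `is_isotropic` and `fourfold_signs` are hypothetical.

```python
# Table II, reconstructed (an assumption) from the ratios quoted in the text.
# Rows: blue, grey or green, brown eyes. Columns: fair, brown, black, red hair.
TABLE_II = [
    [1768, 807, 189, 47],
    [946, 1387, 746, 53],
    [115, 438, 288, 16],
]


def sign(x):
    return "+" if x > 0 else "-" if x < 0 else "0"


def tetrad_signs(table):
    """Sign of the association in every elementary tetrad of adjacent cells."""
    return [
        [
            sign(table[i][j] * table[i + 1][j + 1] - table[i][j + 1] * table[i + 1][j])
            for j in range(len(table[0]) - 1)
        ]
        for i in range(len(table) - 1)
    ]


def is_isotropic(table):
    """True when every elementary tetrad has the same sign."""
    return len({s for row in tetrad_signs(table) for s in row}) == 1


def fourfold_signs(table):
    """Signs of all 2 x 2 condensations made by one row cut and one column cut."""
    n_rows, n_cols = len(table), len(table[0])
    signs = []
    for rc in range(1, n_rows):
        for cc in range(1, n_cols):
            a = sum(table[i][j] for i in range(rc) for j in range(cc))
            b = sum(table[i][j] for i in range(rc) for j in range(cc, n_cols))
            c = sum(table[i][j] for i in range(rc, n_rows) for j in range(cc))
            d = sum(table[i][j] for i in range(rc, n_rows) for j in range(cc, n_cols))
            signs.append(sign(a * d - b * c))
    return signs


# Place "red" between "fair" and "brown": column order fair, red, brown, black.
reordered = [[row[0], row[3], row[1], row[2]] for row in TABLE_II]

print(tetrad_signs(TABLE_II))        # [['+', '+', '-'], ['+', '+', '-']]
print(is_isotropic(reordered))       # True
print(set(fourfold_signs(reordered)))  # {'+'}
```

The cross-product difference used for each tetrad is the same inequality as (1) above, so a positive value corresponds to a positive elementary association.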

(Data of Sir F. the Frequencies of Different Combinations of Eye-colov/rs in Father and Son.) (1900). Blue. Dark grey. 3. grey. 1. Blue-green.— 70 THEORY OF Table Showing STATISTICS. 138 . IV. vol. cxcv. A. from Karl Pearson. Father's Ete-oolour. 4. p. classifioation condensed.. Brown. 2. Galton. Trans. Phil. . hazel.

. . . and accordingly its origin demands explanation. Were such a table treated by the method of the contingency coefficient, or a similar summary method, alone, the peculiarity might not be remarked.

13. It may be noted, in concluding this part of the subject, that in the case of complete independence the distribution of frequency in every row is similar to the distribution in the row of totals, and the distribution in every column similar to that in the column of totals; for in, say, the column $A_n$ the frequencies are given by the relations

$$(A_nB_1) = \frac{(A_n)}{N}(B_1), \qquad (A_nB_2) = \frac{(A_n)}{N}(B_2), \qquad (A_nB_3) = \frac{(A_n)}{N}(B_3),$$

and so on. This property is of special importance in the theory of variables.

14. The classifications both of this and of the preceding chapters have one important characteristic in common, viz. that they are, so to speak, "homogeneous," the principle of division being the same for all the sub-classes of any one class. Thus $A$'s and $\alpha$'s are both subdivided into $B$'s and $\beta$'s, $A_1$'s, $A_2$'s . . . $A_s$'s into $B_1$'s, $B_2$'s . . . $B_t$'s, and so on. Clearly this is necessary in order to render possible those comparisons on which the discussions of associations and contingencies depend. If we only know that amongst the $A$'s there is a certain percentage of $B$'s, and amongst the $\alpha$'s a certain percentage of $C$'s, there are no data for any conclusion.

Many classifications are, however, essentially of a heterogeneous character, e.g. biological classifications into orders, genera, and species; the classifications of the causes of death in vital statistics, and of occupations in the census. To take the last case as an illustration, the first "order" in the list of occupations is "General or Local Government of the Country," subdivided under the headings (1) National Government, (2) Local Government. The next order is "Defence of the Country," with the sub-headings (1) Army, (2) Navy and Marines, not (1) National and (2) Local Government again: the sub-heads are necessarily distinct. Similarly, the third order is "Professional Occupations and their Subordinate Services," with the fresh sub-heads (1) Clerical, (2) Legal, (3) Medical, (4) Teaching, (5) Literary and Scientific, (6) Engineers and Surveyors, (7) Art, Music, Drama, (8) Exhibitions, Games, etc. The number of sub-heads under each main heading is, in such a case, arbitrary and variable, and different for each main heading; but so long as the classification remains purely heterogeneous, however complex it may become, by repetition, there is no opportunity for any discussion of causation within the limits of the matter so derived. It is only when a homogeneous division is in some way introduced that we can begin to speak of associations and contingencies.

15. This may be done in various ways according to the nature of the case. Thus the relative frequencies of different botanical families, genera, or species may be discussed in connection with the topographical characters of their habitats (desert, marsh, or moor), and we may observe statistical associations between given genera and situations of a given topographical type. The causes of death may be classified according to sex, or age, or occupation, and it then becomes possible to discuss the association of a given cause of death with one or other of the two sexes, with a given age-group, or with a given occupation. Again, the classifications of deaths and of occupations are repeated at successive intervals of time; and if they have remained strictly the same, it is also possible to discuss the association of a given occupation or a given cause of death with the earlier or later year of observation, i.e. to see whether the numbers of those engaged in the given occupation or succumbing to the given cause of death have increased or decreased. But in such circumstances the greatest care must be taken to see that the necessary condition as to the identity of the classifications at the two periods is fulfilled, and unfortunately it very seldom is fulfilled. All practical schemes of classification are subject to alteration and improvement from time to time, and these alterations, however desirable in themselves, render a certain number of comparisons impossible. Even where a classification has remained verbally the same, it is not necessarily really the same; thus, in the case of the causes of death, improved methods of diagnosis may transfer many deaths from one heading to another without any change in the incidence of the disease, and so bring about a virtual change in the classification. In any case, heterogeneous classification should be regarded only as a partial process, incomplete until a homogeneous division is introduced either directly or indirectly, e.g. by repetition.

REFERENCES.

Contingency.

(1) Pearson, Karl, "On the Theory of Contingency and its Relation to Association and Normal Correlation," Drapers' Company Research Memoirs, Biometric Series i.; Dulau & Co., London, 1904. (The memoir in which the coefficient of contingency is proposed.)

(2) Lipps, G. F., "Die Bestimmung der Abhängigkeit zwischen den Merkmalen eines Gegenstandes," Berichte der math.-phys. Klasse der kgl. Sächsischen Gesellschaft der Wissenschaften; Leipzig, 1905. (A general discussion of the problems of association and contingency.)

(3) Pearson, Karl, "On a Coefficient of Class Heterogeneity or Divergence," Biometrika, vol. v. p. 198, 1906. (An application of the contingency coefficient to the measurement of heterogeneity, e.g. in different districts of a country, by treating the observed frequencies of some quality $A_1$, $A_2$ . . . $A_n$ in the different districts as rows of a contingency table and working out the coefficient: the same principle is also applicable to the comparison of a single district with the rest of the country.)

Isotropy.

(4) Yule, G. U., "On a Property which holds good for all Groupings of a Normal Distribution of Frequency for Two Variables, with applications to the Study of Contingency Tables for the Inheritance of Unmeasured Qualities," Proc. Roy. Soc., Series A, vol. lxxvii., 1906, p. 324. (On the property of isotropy and some applications.)

(5) Yule, G. U., "On the Influence of Bias and of Personal Equation in Statistics of Ill-defined Qualities," Jour. of the Anthrop. Inst., vol. xxxvi., 1906, p. 325. (Includes an investigation as to the influence of bias and of personal equation in creating divergences from isotropy in contingency tables.)

Contingency Tables of two Rows only.

(6) Pearson, Karl, "On a New Method of Determining Correlation between a Measured Character A and a Character B of which only the Percentage of Cases wherein B exceeds (or falls short of) a given Intensity is recorded for each Grade of A," Biometrika, vol. vii., 1909, p. 96. (Deals with a measure of dependence for a common type of table, e.g. a table showing the numbers of candidates who passed or failed at an examination, for each year of age. The table of such a type stands between the contingency tables for unmeasured characters and the correlation table (chap. ix.) for variables. Pearson's method is based on that adopted for the correlation table, and assumes a normal distribution of frequency (chap. xv.) for B.)

(7) Pearson, Karl, "On a New Method of Determining Correlation, when one Variable is given by Alternative and the other by Multiple Categories," Biometrika, vol. vii., 1910, p. 248. (The similar problem for the case in which the variable is replaced by an unmeasured quality.)

EXERCISES.

(1) (Data from Karl Pearson, "On the Inheritance of the Mental and Moral Characters in Man," Jour. of the Anthrop. Inst., vol. xxxiii., and Biometrika, vol. iii.) Find the coefficient of contingency (coefficient of mean square contingency) for the two tables below, showing the resemblance between brothers for athletic capacity and between sisters for temper. Show that neither table is even remotely isotropic. (As stated in § 7, the coefficient of contingency should not as a rule be used for tables smaller than 5 x 5-fold: these small tables are given to illustrate the method, while avoiding lengthy arithmetic.)

Athletic Capacity. First Brother.

PART II. THE THEORY OF VARIABLES.

CHAPTER VI.

THE FREQUENCY-DISTRIBUTION.

1. Introductory. 2. Necessity for classification of observations: the frequency distribution. 3. Illustrations. 4. Method of forming the table. 5. Magnitude of class-interval. 6. Position of intervals. 7. Process of classification. 8. Treatment of intermediate observations. 9. Tabulation. 10. Tables with unequal intervals. 11. Graphical representation of the frequency-distribution. 12. Ideal frequency-distributions. 13. The symmetrical distribution. 14. The moderately asymmetrical distribution. 15. The extremely asymmetrical or J-shaped distribution. 16. The U-shaped distribution.

1. The methods described in Chaps. I.-V. are applicable to all observations, whether qualitative or quantitative; we have now to proceed to the consideration of specialised processes, definitely adapted to the treatment of quantitative measurements, but not as a rule available (with some important exceptions, as suggested by Chap. I. § 2) for the discussion of purely qualitative observations. Since numerical measurement is applied only in the case of a quantity that can present more than one numerical value, that is, a varying quantity, or more shortly a variable, this section of the work may be termed the theory of variables. As common examples of such variables that are subject to statistical treatment may be cited birth- or death-rates, prices, wages, barometer readings, rainfall records, and measurements or enumerations (e.g. of glands, spines, or petals) on animals or plants.

2. If some hundreds or thousands of values of a variable have been noted merely in the arbitrary order in which they happened to occur, the mind cannot properly grasp the significance of the record: the observations must be ranked or classified in some way before the characteristics of the series can be comprehended, and those comparisons, on which arguments as to causation depend, can be made with other series. The dichotomous classification, considered in Chaps. I.-IV., is too crude: if the values are merely classified as $A$'s or $\alpha$'s according as they exceed or fall short of some fixed value, a large part of the information given by the original record is lost. A manifold classification, however (cf. Chap. V.), avoids the crudity of the dichotomous form, since the classes may be made as numerous as we please, and numerical measurements lend themselves with peculiar readiness to a manifold classification, for the class limits can be conveniently and precisely defined by assigned values of the variable. For convenience, the values of the variable chosen to define the successive classes should be equidistant, so that the numbers of observations in the different classes (the class-frequencies) may be comparable. Thus for measurements of stature the interval chosen for classifying (the class-interval, as it may be termed) might be 1 inch, or 2 centimetres, the numbers of individuals being counted whose statures fall within each successive inch, or each successive 2 centimetres, of the scale; returns of birth- or death-rates might be grouped to the nearest unit per thousand of the population; returns of wages might be classified to the nearest shilling, or, if desired to obtain a more condensed table, by intervals of five shillings or ten shillings, and so on. When the variation is discontinuous, as for example in enumerations of numbers of children in families or of petals on flowers, the unit is naturally taken as the class-interval unless the range of variation is very great. The manner in which the observations are distributed over the successive equal intervals of the scale is spoken of as the frequency-distribution of the variable.

3. A few illustrations will make clearer the nature of such frequency-distributions, and the service which they render in summarising a long and complex record:

(a) Table I. In this illustration the mean annual death-rates, expressed as proportions per thousand of the population per annum, of the 632 registration districts of England and Wales, for the decade 1881-90, have been classified to the nearest unit; i.e. the numbers of districts have been counted in which the death-rate was over 12.5 but under 13.5, over 13.5 but under 14.5, and so on. The frequency-distribution is shown by the following table.

Table I. Showing the Numbers of Registration Districts in England and Wales with Different Mean Death-rates per Thousand of the Population per Annum for the Ten Years 1881-90. (Material from the Supplement to the 55th Annual Report of the Registrar-General for England and Wales [C. 7769], 1895.)

Mean Annual Death-rate.
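The grouping "to the nearest unit" used for Table I. can be sketched in a few lines. The death-rates below are hypothetical values chosen only to show the mechanism, not figures from the Registrar-General's returns; the function name `frequency_distribution` and its parameters are likewise assumptions. The sketch further assumes that no observation falls exactly on a class-limit.

```python
from collections import Counter
from math import floor


def frequency_distribution(values, interval=1.0, mid_origin=0.0):
    """Count observations in equal class-intervals.

    The mid-values of the classes are mid_origin, mid_origin + interval, ...,
    so with the defaults a value over 12.5 but under 13.5 is counted at 13.0.
    Assumes no value lies exactly on a class-limit.
    """
    counts = Counter()
    for v in values:
        k = floor((v - mid_origin) / interval + 0.5)
        counts[mid_origin + k * interval] += 1
    return dict(sorted(counts.items()))


# Hypothetical death-rates per thousand (illustrative only).
rates = [12.9, 13.2, 13.8, 14.1, 14.4, 15.2, 13.6, 14.0]
table = frequency_distribution(rates)

for mid_value, frequency in table.items():
    print(f"{mid_value - 0.5:.1f}-{mid_value + 0.5:.1f}: {frequency}")
# 12.5-13.5: 2
# 13.5-14.5: 5
# 14.5-15.5: 1
```

Keying the result by mid-value mirrors the convention that every observation in a class is afterwards treated as if it were equal to the mid-value of its class-interval.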

Table II. Showing the Numbers of Married Women, in certain Quaker Families, Dying at Different Ages. (Cited from Proc. Roy. Soc., vol. lxvii. (1900), p. 172, On the Correlation between Duration of Life and Number of Offspring, by Miss M. Beeton, Karl Pearson, and G. U. Yule.)

Age at Death, Years.

from 6 to 20. without serious error. i. It may.. intervals is. or 12-5-13-5. we have an approximate value for the interval. as if they were equal to the mid-value : — if the death-rate of every district in of Table I. As already remarked. of rays range lisual. 13. or at least take place by discrete steps which are small in comparison with the whole range of variation. (2) The position or origin of the intervals must then be determined. and the : — observations classification are classified accordingly. there is no such natural class-interval. Magnitvde 'of Class-Interval. etc. so that the mid. A preliminary inspection of the. tinuous. and its choice is a matter for judgment. Position of Intervals. (4) The process of being finished. or. one unit was chosen in the case of Tables I. (6) for convenience and brevity we desire to make the interval as large as These conditions will possible. or 14 rays like To expand slightly the brief description given in §. THE FKKQUENCY-DISTKIBUTION. number of classes lies less than. tables the preceding are formed in the following way (1) The magnitude of the class-interval. the death-rate of every district in the second class 14'0. a table is drawn up on the general lines of Tables I. thirty makes a somewhat unwieldy table. generally be fulfilled if the interval be so chosen that the whole of the class-interval.. and III. in cases where the variation proceeds by discrete steps of considerable magnitude as compared with the range of variation. 14-15. The actual value should be the nearest integer or simple fraction. e.values are integers. (3) This choice having been made. Some remarks may be made on each of these heads.g. record should accordingly be made and the highest and lowest values be picked out. 14-5-15-5.-IIL. 79 The numbers being the most 4. and a number over.e. Dividing the diflference between these by.g. etc. The position or starting-point of the 6. the complete scale of intervals is fixed. 
we must decide whether to take as intervals 12-13. 5. A number of classes ten leads in general to very appreciable inaccuracy. The two conditions which guide the choice are these (a) we desire to be able to treat all the values assigned to any one class. — .. as the first class between 15 and 25. and II. as a rule. is first fixed . as in Tables I.2. were exactly 13-0. more or less indifferent. say. five units in the case of Table II. say. 13-14. five and twenty. and so on . the number of units to each interval. subject to the first condition. e. in Table I. 13-5-14-5. but in general it is fixed either so that the limits of intervals are integers. showing the total numbers of observations in each class-interval. The But if the variation be conunit will in general have to serve. there is very little choice as regards the magnitude of the class^interval. say.VI. — 12.

It may, however, be chosen, for simplicity in classification, so that no limit corresponds exactly to any recorded value (cf. § 8 below). In some exceptional cases, moreover, the observations exhibit a marked clustering round certain values, e.g. tens, or tens and fives. This is generally the case, for instance, in age returns, owing to the tendency to state a round number where the true age is unknown. Under such circumstances, the values round which there is a marked tendency to cluster should preferably be made mid-values of intervals, in order to avoid sensible error in the assumption that the mid-value is approximately representative of the values in the class. Thus, in the case of ages, since the clustering is chiefly round tens, "25 and under 35," "35 and under 45," etc., is a better grouping than "20 and under 30," "30 and under 40," and so on (cf. the Census of England and Wales, 1911, vol. vii., and also ref. 5, in which a different view is taken). When there is any probability of a clustering of this kind occurring, it is as well to subject the raw material to a close examination before finally fixing the classification.

7. Classification. The scale of intervals having been fixed, the observations may be classified. If the number of observations is not large, it will be sufficient to mark the limits of successive intervals in a column down the left-hand side of a sheet of paper, and transfer the entries of the original record to this sheet by marking a 1 on the line corresponding to any class for each entry assigned thereto. It saves time in subsequent totalling if each fifth entry in a class is marked by a diagonal across the preceding four, or by leaving a space. The disadvantage in this process is that it offers no facilities for checking: if a repetition of the classification leads to a different result, there is no means of tracing the error. If the number of observations is at all considerable and accuracy is essential, it is accordingly better to enter the values observed on cards, one to each observation. These are then dealt out into packs according to their classes, and the whole work checked by running through the pack corresponding to each class, and verifying that no cards have been wrongly sorted.

8. In some cases difficulties may arise in classifying, owing to the occurrence of observed values corresponding to class-limits. Thus, in compiling Table I., some districts will have been noted with death-rates entered in the Registrar-General's returns as 16.5, 17.5, or 18.5, any one of which might at first sight have been apparently assigned indifferently to either of two adjacent classes. In such a case, however, where the original figures for numbers of deaths and population are available, the difficulty may be readily surmounted by working out the rate to another place of decimals: if the rate stated to be 16.50 proves to be 16.502, it will be sorted to the class 16.5–17.5; if 16.498, to the class 15.5–16.5. Death-rates that work out to half-units exactly do not occur in this example, and so there is no real difficulty. In the case of Table II., again, there is no difficulty: if the year of birth and death alone are given, the age at death is only calculable to the nearest unit; if the actual day of birth and death be cited, half-years still cannot occur in the age at death, because there is an odd number of days in the year. The difficulty may always be avoided if it be borne in mind in fixing the limits to class-intervals, these being carried to a further place of decimals, or a smaller fraction, than the values in the original record. Thus if statures are measured to the nearest centimetre, the class-intervals may be taken as 150.5–151.5, 151.5–152.5, etc.; if to the nearest eighth of an inch, the intervals may be 59 15/16 to 60 15/16, 60 15/16 to 61 15/16, and so on. If the difficulty is not evaded in any of these ways, it is usual to assign one-half of an intermediate observation to each adjacent class, with the result that half-units occur in the class-frequencies (cf. Tables VII., p. 90, X., p. 96, and XI., p. 96). The procedure is rough, but probably good enough for practical purposes; it would be slightly better, but a good deal more laborious, to assign the intermediate observations to the adjacent classes in proportion to the numbers of other observations falling into the two classes.

9. Tabulation. As regards the actual drafting of the final table, there is little to be said, except that care should be taken to express the class-limits clearly, and, if necessary, to state the manner in which the difficulty of intermediate values has been met or evaded. The class-limits are perhaps best given as in Tables I. and II., but may be more briefly indicated by the mid-values of the class-intervals. Thus Table I. might have been given in the form:

Death-rate per 1000. [Table not recovered.]

Stature in Inches. [Table not recovered.]
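The sorting rule of § 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify`, the four death-rates, and the choice of 12.5 as the lowest class-limit are hypothetical, not figures from Table I. Values are held as exact fractions so that a rate of exactly 17.5 is recognised as lying on a class-limit and is shared, one-half to each adjacent class.

```python
from collections import Counter
from fractions import Fraction


def classify(values, lowest_limit, width):
    """Tally values into classes of equal width, counted from lowest_limit.

    Class k covers lowest_limit + k*width up to lowest_limit + (k+1)*width.
    A value lying exactly on a class-limit contributes one-half to each of
    the two adjacent classes (the rough rule described in the text).
    """
    start, step = Fraction(lowest_limit), Fraction(width)
    tally = Counter()
    for value in values:
        k, remainder = divmod(Fraction(value) - start, step)
        if remainder == 0:
            tally[k - 1] += Fraction(1, 2)
            tally[k] += Fraction(1, 2)
        else:
            tally[k] += 1
    return tally


# Hypothetical death-rates, written as strings so that they are held exactly.
rates = ["16.502", "16.498", "17.5", "15.9"]
tally = classify(rates, "12.5", "1")
for k in sorted(tally):
    print(f"{12.5 + k} to {13.5 + k}: {tally[k]}")
```

Carrying the class-limits to a further decimal place than the record, as the text recommends, makes the half-and-half branch unnecessary, since no recorded value can then coincide with a limit.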

Table IV. Showing the Annual Value and Number of Dwelling-houses in Great Britain assessed to Inhabited House Duty in 1885-6. (Cited from Jour. Roy. Stat. Soc., vol. l., 1887, p. 610.)

Table V., giving the distribution of head-breadths for 1000 men, will serve as an example.

Table V. Showing the Frequency-distribution of Head-breadths for Students at Cambridge. Measurements taken to the nearest tenth of an inch. (Cited from W. R. Macdonell, Biometrika, i., 1902, p. 220.)

Head-breadth in Inches. [Table not recovered.]


interval. The [...] is not the same, as suggested by the histogram. The area shown by the frequency-polygon over any interval with an ordinate $y_2$ (fig. 3) is only correct if the tops of the three successive ordinates $y_1$, $y_2$, $y_3$ lie on a line, i.e. if $y_2 = \tfrac{1}{2}(y_1 + y_3)$, the areas of the two little triangles shaded in the figure being equal. If $y_2$ fall short of this value, the area shown by the polygon is too great; if $y_2$ exceed it, the area shown by the polygon is too small; and if, for this reason, the frequency-polygon tends to become very misleading at any part of the range, it is better to use the histogram. In the mortality distribution of Table I., for instance, the frequency rises so sharply

to the maximum that a histogram is, on the whole, the better representation of the distribution of frequency, and in such a distribution as that of Table IV. the use of the histogram is almost imperative.

12. If the class-interval be made smaller and smaller, and at the same time the number of observations be proportionately increased, so that the class-frequencies may remain finite, the polygon and the histogram will approach more and more closely to a smooth curve. Such an ideal limit to the frequency-polygon or histogram is termed a frequency-curve. In this ideal frequency-curve the area between any two ordinates whatever is strictly proportional to the number of observations falling between the corresponding values of the variable. Thus the number of observations falling between the values $x_1$ and $x_2$ of the variable in fig. 4 will be proportional to the area of the shaded strip in the figure; the number of observed values greater than $x_2$ will similarly be given by the area of the curve to the right of the ordinate through $x_2$, and so on. When, in any actual case, the number of observations is considerable (say a thousand at least) the run of the class-frequencies is generally sufficiently smooth to give a good notion of the form of the ideal distribution; with small numbers the frequencies may present all kinds of irregularities, which, most probably, have very little significance (cf. Chap. XV., § 15, and § 18, Ex. iv.). The forms presented by smoothly running sets of numerous observations present an almost endless variety, but amongst these we notice a small number of comparatively simple types, from which many at least of the more complex distributions may be conceived as compounded. For elementary purposes it is sufficient to consider these fundamental simple types as four in number: the symmetrical distribution, the moderately asymmetrical distribution, the extremely asymmetrical or J-shaped distribution, and the U-shaped distribution.

13. The symmetrical distribution, the class-frequencies decreasing to zero symmetrically on either side of a central maximum. Fig. 5 illustrates the ideal form of the distribution. Being a special case of the more general type described under the second heading, this form of distribution is comparatively rare under any circumstances, and very exceptional indeed in economic statistics. It occurs more frequently in the case of biometric, more especially anthropometric, measurements, from which the following illustrations are drawn, and is important in much theoretical work. Table VI. shows the frequency-distribution of statures for adult males in the British Isles, from data published by a British Association Committee in 1883, the figures being given separately

Table VI. Showing the Frequency-distributions of Statures for Adult Males born in England, Ireland, Scotland, and Wales. Final Report of the Anthropometric Committee to the British Association. (Report, 1883, p. 256.) As Measurements are stated to have been taken to the nearest 1/8th of an Inch, the Class-Intervals are here presumably 56 15/16 to 57 15/16, 57 15/16 to 58 15/16, and so on (cf. § 9). See Fig. 6.

Height without shoes. Inches. [Table not recovered.]

Fig. 5. An ideal symmetrical Frequency-distribution.

Table VII. gives two similar distributions from more recent investigations, relating respectively to sons over 18 years of age, with parents living, in Great Britain, and to students at Cambridge. The polygons are shown in figs. 7 and 8. Both these distributions are more irregular than that of fig. 6, but, roughly speaking, they may all be held to be approximately symmetrical.

Table VII. Showing the Frequency-distribution of Statures for (1) 1078 English Sons (Karl Pearson, Biometrika, ii., 1903, p. 415); (2) for 1000 Male Students at Cambridge (W. R. Macdonell, Biometrika, i., 1902, p. 220). See Figs. 7 and 8. [Table not recovered.]

Fig. 7. Frequency-distribution of Stature for 1078 "English Sons." (Table VII.) Horizontal axis: stature in inches.

Fig. 8. Frequency-distribution of Stature for 1000 Cambridge Students. (Table VII.) Horizontal axis: stature in inches.

14. The moderately asymmetrical distribution, the class-frequencies decreasing with markedly greater rapidity on one side of the maximum than on the other, as in fig. 9 (a) or (b). This is the most common of all smooth forms of frequency-distribution, illustrations occurring in statistics from almost every source. The distribution of death-rates in the registration districts of England and Wales, given in Table I., p. 77, is a somewhat rough example of the type. The distribution of rates of pauperism in the same districts (Table VIII. and fig. 10) is smoother and more like the type (a) of fig. 9. The frequency attains a maximum for districts with 2¾ to 3¼ per cent. of the population in receipt of relief, and then tails off slowly to unions with 6, 7, and 8 per cent. of pauperism.

Fig. 9. Ideal distributions of the moderately asymmetrical form.

Table VIII. Showing the Number of Registration Districts in England and Wales with Different Percentages of the Population in receipt of Poor-law Relief on the 1st January 1891. (Yule, Jour. Roy. Stat. Soc., vol. lix., 1896, p. 347, q.v. for distributions for earlier years.) See Fig. 10.

Percentage of the Population in receipt of Relief. [Table not recovered.]


Table IX. Showing the Frequency-distribution of Weights for Adult Males born in England, Ireland, Scotland, and Wales. (Loc. cit., Table VI.) Weights were taken to the nearest pound, consequently the true Class-Intervals are 89.5–99.5, 99.5–109.5, etc. (§ 9).

Table X. Showing the Frequency-distribution of Fecundity, i.e. the Ratio of the Number of Yearling Foals produced to the Number of Coverings, for Brood-mares (Race-horses) Covered Eight Times at Least. (Pearson, Lee, and Moore, Phil. Trans., A, vol. cxcii. (1899), p. 803.) See Fig. 12.


[...] in such a way as to suggest that the ideal curve is tangential to the base. Cases of greater asymmetry, suggesting an ideal curve that meets the base (at one end) at a finite angle, even a right angle, as in fig. 9 (b), are less frequent, but occur occasionally. The distribution of deaths from diphtheria, according to age, affords one such example of a more asymmetrical kind. The actual figures for this case are given in Table XII., and illustrated by fig. 14; and it will be seen that the frequency of deaths reaches a maximum for children aged "3 and under 4," the number rising very rapidly to the maximum, and thence falling so slowly that there is still an appreciable frequency for persons over 60 or 70 years of age.

Table XII. Showing the Numbers of Deaths from Diphtheria at Different Ages in England and Wales during the Ten Years 1891-1900. (Supplement to 65th Annual Report of the Registrar-General, 1891-1900, p. 3.) See Fig. 14.

[...] only, they would have run 49,479, 23,348, 4,092, and so on, thus suggesting a maximum number of deaths at the beginning of life, i.e. a distribution of the present type. It is only the analysis of the deaths in the earlier years of life by one-year intervals which shows that the frequency reaches a true maximum in the fourth year, and therefore the distribution is of the moderately asymmetrical type. In practical cases no hard and fast line can always be drawn between the moderately and extremely asymmetrical types, any more than between the moderately asymmetrical and the symmetrical type.

Fig. 15. An ideal Distribution of the extreme Asymmetrical Form.

In economic statistics this form of distribution is particularly characteristic of the distribution of wealth in the population at large, as illustrated, e.g., by income tax and house valuation returns, by returns of the size of agricultural holdings, and so on (cf. ref. 4). The distributions may possibly be a very extreme case of the last type; but if the maximum is not absolutely at the lower end of the range, it is very close indeed thereto. Official returns do not, however, usually give the necessary analysis of the frequencies at the lower end of the range to enable the exact position of the maximum to be determined; and for this reason the data on which Table XIII. is founded, though of course very unreliable, are of some interest. It will be seen from the table and fig. 16 that with the given classification the distribution appears clearly assignable to the present type, the number of estates between zero and £100 annual value being more than six times as great as the number between £100 and £200 in annual value, and the frequency continuously falling as the value increases. A close analysis of the first class suggests, however, that the greatest frequency does not occur actually at zero, but that there is a true maximum for estates of about £1 15 in annual value. The distribution might therefore be more correctly assigned to the second type, but the position of the greatest frequency indicates a degree of asymmetry that is high even compared with the asymmetry of fig. 14: the distribution of numbers of deaths from diphtheria would more closely resemble the distribution of estate-values if the maximum occurred in the fourth and fifth weeks of life instead of in the fourth year. The figures of Table IV., p. 83, showing the annual value and number of dwelling-houses, afford a good illustration of this form of distribution, but marred by the unequal intervals so common in official returns.

Table XIII. Showing the Numbers and Annual Values of the Estates of those who had taken part in the Jacobite Rising of 1715. (Compiled from Cosin's Names of the Roman Catholics, Nonjurors, and others who refused to take the Oaths to his late Majesty King George, etc.; London, 1745. Figures of very doubtful absolute value. See a note in Southey's Commonplace Book, vol. i. p. 573, quoted from the Memoirs of T. Hollis.) See Fig. 16.

Annual Value in £100. [Table not recovered.]

Fig. 16. Frequency-distribution of the Annual Values of certain Estates in England in 1715: 2476 Estates. (Table XIII.) Horizontal axis: annual value in £100.

Table XIV. Showing the Frequencies of Different Numbers of Petals for Three Series of Ranunculus bulbosus. (H. de Vries, Ber. dtsch. bot. Ges., Bd. xii., 1894, q.v. for details.) See Fig. 17.

[...] at the ends of the range and a minimum towards the centre. The ideal form of the distribution is illustrated by fig. 18. This is a rare but interesting form of distribution, as it stands in somewhat marked contrast to the preceding forms. Table XV. and fig. 19 illustrate an example based on a considerable number of observations, viz. the distribution of degrees of cloudiness, or estimated percentage of the sky covered by cloud, at Breslau during the years 1876-85. A sky completely, or almost completely, overcast at the time of observation is the most common, a practically clear sky comes next, and intermediates are more rare. This form of distribution appears to be sometimes exhibited by the percentages of offspring possessing a certain attribute when one at least of the parents also possesses the attribute. The remarks of Sir Francis Galton in Natural Inheritance suggest such a form for the distribution of "consumptivity" amongst the offspring of consumptives, but the figures are not in a decisive shape. Table XVI. gives the distribution for an analogous case, viz. the distribution of deaf-mutism amongst the offspring of parents one of whom at least was a deaf-mute. In general less than one-fifth of the children are deaf-mutes: at the other end of the range the cases in which over 80 per cent. of the children are deaf-mutes are nearly three times as many as those in which the percentage lies between 60 and 80. The numbers are, however, too small to form a very satisfactory illustration.

Fig. 18. An ideal Distribution of the U-shaped Form.

Table XV. Showing the Frequencies of Estimated Intensities of Cloudiness at Breslau during the Ten Years 1876-85. (See ref. 2.) See Fig. 18.

Fig. 19. Frequency-distribution of Degrees of Cloudiness at Breslau, 1876-85: 3653 observations. (Table XV.) Horizontal axis: cloudiness.

Table XVI. Showing the Percentages of Deaf-mutes among Children of Parents one of whom at least was a Deaf-mute, for Marriages producing Five Children or more. (Compiled from material in Marriages of the Deaf in America, ed. E. A. Fay, Volta Bureau, Washington, 1898.)

Percentage of Deaf-mutes. [Table not recovered.]

REFERENCES.

(1) Pearson, Karl, "Skew Variation in Homogeneous Material," Phil. Trans. Roy. Soc., Series A, vol. clxxxvi. (1895), pp. 343-414.
(2) Pearson, Karl, "Supplement to a Memoir on Skew Variation," Phil. Trans. Roy. Soc., Series A, vol. cxcvii. (1901), pp. 443-459.
(3) Pearson, Karl, "Cloudiness: Note on a Novel Case of Frequency," Proc. Roy. Soc., vol. lxii. (1897), p. 287.
(4) Pareto, Vilfredo, Cours d'économie politique; 2 vols., Lausanne, 1896-7. See especially tome ii., livre iii., chap. i., "La courbe des revenus."

The first three memoirs above are mathematical memoirs on the theory of ideal frequency-curves, the first being the fundamental memoir, the second and third supplementary. The elementary student may, however, refer to them with advantage, without attempting to follow the mathematics, on account of the large collection of frequency-distributions which is given, and from which some of the illustrations in the preceding chapter have been cited. He may also note that each of our rough empirical types may be divided into several sub-types, the theoretical division into types being made on different grounds. The fourth work is cited on account of the author's discussion of the distribution of wealth in a community, to which reference was made in § 16.

In connection with the remarks in § 6, on the grouping of ages, reference may be made to the following, in which a different conclusion is drawn as to the best grouping:

(5) Young, Allyn A., "A Discussion of Age Statistics," Census Bulletin 13, Bureau of the Census, Washington, U.S.A., 1904.

Reference should also be made to the Census of England and Wales, 1911, vol. vii., "Ages and Condition as to Marriage," especially the Report by Mr George King on the graduation of ages.

EXERCISES.

1. If the diagram fig. 6 is redrawn to scales of 300 observations per interval to the inch and 4 inches of stature to the inch, what is the scale of observations to the square inch? If the scales are 100 observations per interval to the centimetre and 2 inches of stature to the centimetre, what is the scale of observations to the square centimetre?

2. If fig. 1 is redrawn to scales of 25 observations per interval to the inch and 2 per cent. to the inch, what is the scale of observations to the square inch? If the scales are ten observations per interval to the centimetre and 1 per cent. to the centimetre, what is the scale of observations to the square centimetre?

3. If a frequency-polygon be drawn to represent the data of Table I., what number of observations will the polygon show between death-rates of 16.5 and 17.5 per thousand, instead of the true number 59?

4. If a frequency-polygon be drawn to represent the data of Table V., what number of observations will the polygon show between head-breadths 5.95 and 6.05, instead of the true number 236?
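The remark made earlier in the chapter, that the frequency-polygon shows the correct area over an interval only when $y_2 = \tfrac{1}{2}(y_1 + y_3)$, can be put in numbers. Assuming equal class-intervals and a polygon drawn through the mid-points of the tops of the histogram rectangles, the area under the polygon over the middle interval works out to $(y_1 + 6y_2 + y_3)/8$, in units of observations. The sketch below is an illustration only: the function name `polygon_count` and the three sets of frequencies are hypothetical, not figures from Tables I. or V.

```python
def polygon_count(y_prev, y_mid, y_next):
    """Observations shown by the frequency-polygon over the middle interval.

    The polygon joins the mid-points of the tops of adjacent rectangles, so
    over the middle interval it consists of two trapezia of half-interval
    width, with areas (y_prev + 3*y_mid)/8 and (3*y_mid + y_next)/8.
    """
    return (y_prev + 6 * y_mid + y_next) / 8


# Hypothetical class-frequencies for three successive intervals.
on_a_line = polygon_count(10, 20, 30)   # tops collinear: polygon is exact
at_a_peak = polygon_count(10, 30, 10)   # y_mid exceeds the mean of neighbours
in_a_dip = polygon_count(30, 10, 30)    # y_mid falls short of it

print(on_a_line, at_a_peak, in_a_dip)
```

At a peak the polygon understates the class-frequency and in a dip it overstates it, which agrees with the statement in the text.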

CHAPTER VII.

AVERAGES.

1. Necessity for quantitative definition of the characters of a frequency-distribution. 2. Measures of position (averages) and of dispersion. 3. The dimensions of an average the same as those of the variable. 4. Desirable properties for an average to possess. 5. The commoner forms of average. 6-13. The arithmetic mean: its definition, calculation, and simpler properties. 14-18. The median: its definition, calculation, and simpler properties. 19-20. The mode: its definition and relation to mean and median. 21. Summary comparison of the preceding forms of average. 22-26. The geometric mean: its definition, simpler properties, and the cases in which it is specially applicable. 27. The harmonic mean: its definition and calculation.

1. In § 2 of the last chapter it was pointed out that a classification of the observations in any long series is the first step necessary to make the observations comprehensible, and to render possible those comparisons with other series which are essential for any discussion of causation. Very little experience, however, would show that classification alone is not an adequate method, seeing that it only enables qualitative or verbal comparisons to be made. The next step that it is desirable to take is the quantitative definition of the characters of the frequency-distribution, so that quantitative comparisons may be made between the corresponding characters of two or more series. It might seem at first sight that very difficult cases of comparison could arise in which, for example, we had to contrast a symmetrical distribution with a "J-shaped" distribution. As a matter of practice, however, we seldom have to deal with such a case; distributions drawn from similar material are, in general, of similar form. When we have to compare the frequency-distributions of stature in two races of man, of the death-rates in English registration districts in two successive decades, of the numbers of petals in two races of the same species of Ranunculus, we have only to compare with each other two distributions of the same or nearly the same type.

2. Confining our attention, then, to this simple case, there are two fundamental characteristics in which such distributions may differ: (1) they may differ markedly in position, i.e. in the values of the variable round which they centre, as in fig. 20, A, or (2) they may centre round the same value, but differ in the range of variation or dispersion, as it is termed, as in fig. 20, B. Of course the distributions may differ in both characters at once, as in fig. 20, C, but the two properties may be considered independently. Measures of the first character, position, are generally known as averages; measures of the second are termed measures of dispersion. In addition to these two principal and fundamental characters, we may also take a third of some interest but of much less importance, viz. the degree of asymmetry of the distribution. The present chapter deals only with averages; measures of dispersion are considered in Chapter VIII., and measures of asymmetry are also briefly discussed at the end of that chapter.

3. In whatever way an average is defined, it may be as well to note, it is merely a certain value of the variable, and is therefore necessarily of the same dimensions as the variable: i.e. if the variable be a length, its average is a length; if the variable be a percentage, its average is a percentage, and so on. But there are several different ways of approximately defining the position of a frequency-distribution, that is, there are several different forms of average, and the question therefore arises, By what criteria are we to judge the relative merits of different forms? What are, in fact, the desirable properties for an average to possess?

4. (a) In the first place, it almost goes without saying that an average should be rigidly defined, and not left to the mere estimation of the observer. An average that was merely estimated would depend too largely on the observer as well as the data.

(b) An average should be based on all the observations made. If not, it is not really a characteristic of the whole distribution.

(c) It is desirable that the average should possess some simple and obvious properties to render its general nature readily comprehensible: an average should not be of too abstract a mathematical character.

(d) It is, of course, desirable that an average should be calculated with reasonable ease and rapidity. Other things being equal, the easier calculated is the better of two forms of average. At the same time too great weight must not be attached to mere ease of calculation, to the neglect of other factors.

(e) It is desirable that the average should be as little affected as may be possible by what we have termed fluctuations of sampling. If different samples be drawn from the same material, however carefully they may be taken, the averages of the different samples will rarely be quite the same, but one form of average may show much greater differences than another. Of the two forms, the more stable is the better. The full discussion of this condition must, however, be postponed to a later section of this work (Chap. XVII.).

(f) Finally, by far the most important desideratum is this, that the measure chosen shall lend itself readily to algebraical treatment. If, e.g., two or more series of observations on similar material are given, the average of the combined series should be readily expressed in terms of the averages of the component series: if a variable may be expressed as the sum of two or more others, the average of the whole should be readily expressed in terms of the averages of its parts. A measure for which simple relations of this kind cannot be readily determined is likely to prove of somewhat limited application.

5. There are three forms of average in common use, the arithmetic mean, the median, and the mode, the first named being by far the most widely used in general statistical work. To these may be added the geometric mean and the harmonic mean, more rarely used, but of service in special cases. We will consider these in the order named.

6. The arithmetic mean. The arithmetic mean of a series of values of a variable $X_1, X_2, X_3, \dots, X_N$, $N$ in number, is the quotient of the sum of the values by their number. That is to say, if $M$ be the arithmetic mean,

$M = \frac{1}{N}(X_1 + X_2 + X_3 + \dots + X_N)$,

or, to express it more briefly by using the symbol $\Sigma$ to denote "the sum of all quantities like,"

$M = \frac{1}{N}\Sigma(X)$ . . . (1)

The word mean or average alone, without qualification, is very generally used to denote this particular form of average: that is to say, when anyone speaks of "the mean" or "the average" of a series of observations, it may, as a rule, be assumed that the arithmetic mean is meant. It is evident that the arithmetic mean fulfils the conditions laid down in (a) and (b) of § 4, for it is rigidly defined and based on all the observations made. Further, it fulfils condition (c), for its general nature is readily comprehensible. If the wages-bill for $N$ workmen is £$P$, the arithmetic mean wage, $P/N$ pounds, is the amount that each would receive if the whole sum available were divided equally between them: conversely, if we are told that the mean wage is £$M$, we know this means that the wages-bill is $N \cdot M$ pounds. Similarly, if $N$ families possess a total of $C$ children, the mean number of children per family is $C/N$, the number that each family would possess if the children were shared uniformly. Conversely, if the mean number of children per family is $M$, the total number of children in $N$ families is $N \cdot M$. The arithmetic mean expresses, in fact, a simple relation between the whole and its parts.

7. As regards simplicity of calculation, the mean takes a high position. In the cases just cited, it will be noted that the mean is actually determined without even the necessity of determining or noting all the individual values of the variable: to get the mean wage we need not know the wages of every hand, but only the wages-bill; to get the mean number of children per family we need not know the number in each family, but only the total. If this total is not given, but we have to deal with a moderate number of observations, so few (say 30 or 40) that it is hardly worth while compiling the frequency-distribution, the arithmetic mean is calculated directly as suggested by the definition, i.e. all the values observed are added together and the total divided by the number of observations. But if the number of observations be large, this direct process becomes a little lengthy. It may be shortened considerably by forming the frequency-table and treating all the values in each class as if they were identical with the mid-value of the class-interval, a process which in general gives an approximation that is quite sufficiently exact for practical purposes if the class-interval has been taken moderately small (cf. Chap. VI. § 5). In this process each class-frequency is multiplied by the mid-value of the interval, the products added together, and the total divided by the number of observations. If $f$ denote the frequency of any class, and $X$ the mid-value of the corresponding class-interval, the value of the mean so obtained may be written

$M = \frac{1}{N}\Sigma(f \cdot X)$ . . . (2)

8. But this procedure is still further abbreviated in practice by the following artifices: (1) The class-interval is treated as the unit of measurement throughout the arithmetic; (2) the difference between the mean and the mid-value of some arbitrarily chosen class-interval is computed instead of the absolute value of the mean. If $A$ be the arbitrarily chosen value and

$X = A + \xi$ . . . (3)

then

$\Sigma(f \cdot X) = \Sigma(f \cdot A) + \Sigma(f \cdot \xi)$,

or, since $A$ is a constant,

$M = A + \frac{1}{N}\Sigma(f \cdot \xi)$ . . . (4)

The calculation of $\Sigma(f \cdot X)$ is therefore replaced by the calculation of $\Sigma(f \cdot \xi)$. The advantage of this is that the class-frequencies need only be multiplied by small integral numbers, for $A$ being the mid-value of a class-interval, and $X$ the mid-value of another, and the class-interval being treated as a unit, the $\xi$'s must be a series of integers proceeding from zero at the arbitrary origin $A$. To keep the values of $\xi$ as small as possible, $A$ should be chosen near the middle of the range. It may be mentioned here that $\Sigma(\xi)$, or $\Sigma(f \cdot \xi)$ for the grouped distribution, is sometimes termed the first moment of the distribution about the arbitrary origin $A$: we shall not, however, make use of this term.

9. The process is illustrated by the following example, using the frequency-distribution of Table VIII., Chap. VI. The arbitrary origin $A$ is taken at 3.5 per cent., the middle of the sixth class-interval from the top of the table, and a little nearer than the middle of the range to the estimated position of the mean. The consequent values of $\xi$ are then written down as in column (3) of the table, against the corresponding frequencies, the values starting, of course, from zero opposite 3.5 per cent. Each frequency $f$ is then multiplied by its $\xi$ and the products entered in another column (4). The positive and negative products are totalled separately, giving totals -776 and +509 respectively, whence $\Sigma(f \cdot \xi) = -267$. Dividing this by $N$, viz. 632, we have the difference of $M$ from $A$ in class-intervals, viz. -0.42 intervals, that is, -0.21 per cent. Hence the mean is 3.5 - 0.21 = 3.29 per cent.

Calculation of the Mean: Example i. Calculation of the Arithmetic Mean of the Percentages of the Population in receipt of Relief, from the Figures of Table VIII., Chap. VI., p. 93. [Table not recovered.]
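The arithmetic of Example i. can be checked from the totals quoted in the text alone (origin 3.5 per cent., class-interval 0.5 per cent., products totalling -776 and +509, N = 632). The sketch below does that, and then applies the same short-cut to a small frequency-table to confirm that it agrees with the direct formula (2). The function name `short_cut_mean` and the five-class table are hypothetical illustrations; they are not the figures of Table VIII.

```python
def short_cut_mean(mid_values, frequencies, origin, interval):
    """Mean by equation (4): M = A + interval * sum(f * xi) / N.

    xi is the deviation of each mid-value from the arbitrary origin A,
    measured in class-intervals, and is therefore a small integer.
    """
    n = sum(frequencies)
    xi = [round((x - origin) / interval) for x in mid_values]
    sum_f_xi = sum(f * d for f, d in zip(frequencies, xi))
    return origin + interval * sum_f_xi / n


# Example i., using only the totals stated in the text.
sum_f_xi_example = -776 + 509          # = -267
mean_example_i = 3.5 + 0.5 * sum_f_xi_example / 632
print(round(mean_example_i, 2))

# Hypothetical table: the short-cut and the direct method must agree.
mids = [1.0, 1.5, 2.0, 2.5, 3.0]
freqs = [2, 5, 9, 6, 3]
direct = sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)
short = short_cut_mean(mids, freqs, origin=2.0, interval=0.5)
print(direct, short)
```

The factor `interval` is what turns a difference expressed in class-intervals into one expressed in units of the variable.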

The class-interval is half a unit, and accordingly the quotient 267/632 is halved in order to obtain an answer in units. Care must also be taken to give the right sign to the quotient.

10. As the process is an important one we give a second illustration from the figures of Table VI., Chap. VI. In this case the class-interval is a unit (1 inch), so the value of $M - A$ is given directly by dividing $\Sigma(f\xi)$ by $N$. The student must notice that, measures having been made to the nearest eighth of an inch, the mid-values of the intervals are $57\tfrac{7}{16}$, $58\tfrac{7}{16}$, etc., and not 57.5, 58.5, etc.

Calculation of the Mean: Example ii. Calculation of the Arithmetic Mean Stature of Male Adults in the British Isles from the Figures of Chap. VI., Table VI., p. 88.

[Table: body not recoverable from the scan.]

It is evident that an absolute check on the arithmetic of any such calculation may be effected by taking a different arbitrary origin for the deviations: the figures of col. (4) will all be changed, but the value ultimately obtained for the mean must be the same. The student should note that a classification by unequal intervals is, at best, a hindrance to this simple form of calculation, and the use of an indefinite interval for the extremity of the distribution renders the exact calculation of the mean impossible (cf. Chap. VI. § 10).

11. We return again below (§ 13) to the question of the ...

... interval. In distributions of such a type the intervals must be made very small indeed to secure an approximately accurate value for the mean. The student should test for himself the effect of different groupings in two or three different cases, so as to get some idea of the degree of inaccuracy to be expected.

12. If a diagram has been drawn representing the frequency-distribution, the position of the mean may conveniently be indicated by a vertical through the corresponding point on the base. Thus fig. 21 (a reproduction of fig. 10) shows the frequency-polygon for our first illustration, and the vertical MM indicates the mean. In a moderately asymmetrical distribution at all of this form the mean lies, as in the present example, on the side of the greatest frequency towards the longer "tail" of the distribution: fig. 22 shows similarly the position of the mean in an ideal distribution. In a symmetrical distribution the mean coincides with the centre of symmetry. The student should mark the position of the mean in the diagram of every frequency distribution that he draws, and so accustom himself to thinking of the mean, not as an abstraction, but always in relation to the frequency-distribution of the variable concerned.

[Fig. 22. Mean M, Median Mi, and Mode Mo, of the ideal moderately asymmetrical distribution.]

13. The following examples give important properties of the arithmetic mean, and at the same time illustrate the facility of its algebraic treatment:

(a) The sum of the deviations from the mean, taken with their proper signs, is zero. This follows at once from equation (4): for if $M$ and $A$ are identical, evidently $\Sigma(f\xi)$ must be zero.

(b) If a series of $N$ observations of a variable $X$ consist of, say, two component series, the mean of the whole series can be readily expressed in terms of the means of the two components. For if we denote the values in the first series by $X_1$ and in the second series by $X_2$,

$$\Sigma(X) = \Sigma(X_1) + \Sigma(X_2),$$

that is, if there be $N_1$ observations in the first series and $N_2$ in the second, and the means of the two series be $M_1$, $M_2$ respectively,

$$N\,M = N_1 M_1 + N_2 M_2 \qquad (5)$$

For example, we find from the data of Table VI., Chap. VI.,

Mean stature of the 346 men born in Ireland = 67.78 in.
Mean stature of the 741 men born in Wales = 66.62 in.

Hence the mean stature of the 1087 men born in the two countries is given by the equation

$$1087\,M = (346 \times 67.78) + (741 \times 66.62).$$

That is, $M = 66.99$ inches.

It is evident that the form of the relation (5) is quite general: if there are $r$ series of observations $X_1, X_2, \ldots, X_r$, the mean $M$ of the whole series is related to the means $M_1, M_2, \ldots, M_r$ of the component series by the equation

$$N\,M = N_1 M_1 + N_2 M_2 + \ldots + N_r M_r \qquad (6)$$

For the convenient checking of arithmetic, it is useful to note that, if the same arbitrary origin $A$ for the deviations $\xi$ be taken in each case, we must have, denoting the component series by the subscripts 1, 2, ... $r$ as before,

$$\Sigma(f\xi) = \Sigma(f_1\xi_1) + \Sigma(f_2\xi_2) + \ldots + \Sigma(f_r\xi_r) \qquad (7)$$

The agreement of these totals accordingly checks the work.

As an important corollary to the general relation (6), it may be noted that the approximate value for the mean obtained from any frequency distribution is the same whether we assume (1) that all the values in any class are identical with the mid-value of the class-interval, or (2) that the mean of the values in the class is identical with the mid-value of the class-interval.

(c) The mean of all the sums or differences of corresponding observations in two series (of equal numbers of observations) is equal to the sum or difference of the means of the two series. This follows almost at once. For if $X = X_1 \pm X_2$,

$$\Sigma(X) = \Sigma(X_1) \pm \Sigma(X_2).$$

That is, if $M$, $M_1$, $M_2$ be the respective means,

$$M = M_1 \pm M_2 \qquad (8)$$

Evidently the form of this result is again quite general, so that if $X = X_1 \pm X_2 \pm \ldots \pm X_r$,

$$M = M_1 \pm M_2 \pm \ldots \pm M_r \qquad (9)$$

As a useful illustration of equation (8), consider the case of measurements of any kind that are subject (as indeed all measures must be) to greater or less errors. The actual measurement $X$ in any such case is the algebraic sum of the true measurement $X_1$ and an error $X_2$. The mean of the actual measurements $M$ is therefore the sum of the true mean $M_1$ and the arithmetic mean of the errors $M_2$. If, and only if, the latter be zero, will the observed mean be identical with the true mean. Errors of grouping (§ 11) are a case in point.

14. The Median. The median may be defined as the middlemost or central value of the variable when the values are ranged in order of magnitude, or as the value such that greater and smaller values occur with equal frequency. In the case of a frequency-curve, the median may be defined as that value of the variable the vertical through which divides the area of the curve into two equal parts, as the vertical through Mi in fig. 22.

The median, like the mean, fulfils the conditions (b) and (c) of § 4, seeing that it is based on all the observations made, and that it possesses the simple property of being the central or middlemost value, so that its nature is obvious. But the definition does not necessarily lead in all cases to a determinate value. If there be an odd number of different values of $X$ observed, say $2n+1$, the $(n+1)$th in order of magnitude is the only value fulfilling the definition. But if there be an even number, say $2n$ different values, any value between the $n$th and $(n+1)$th fulfils the conditions. In such a case it appears to be usual to take the mean of the $n$th and $(n+1)$th values as the median, but this is a convention supplementary to the definition. It should also be noted that in the case of a discontinuous variable the second form of the definition in general breaks down: if we range the values in order there is always a middlemost value (provided the number of observations be odd), but there is not, as a rule, any value such that greater and less values occur with equal frequency. Thus in Table III., § 3 of Chap. VI., we see that 45 per cent. of the poppy capsules had 12 or fewer stigmatic rays, 55 per cent. had 13 or more; similarly 61 per cent. had 13 or fewer rays, 39 per cent. had 14 or more. There is no number of rays

such that the frequencies in excess and defect are equal. In the case of the buttercups of Table XIV. (Chap. VI. § 15) there is no number of petals that even remotely fulfils the required condition. An analogous difficulty may arise, it may be remarked, even in the case of an odd number of observations of a continuous variable if the number of observations be small and several of the observed values identical. The median is therefore a form of average of most uncertain meaning in cases of strictly discontinuous variation, for it may be exceeded by 5, 10, 15, or 20 per cent. only of the observed values, instead of by 50 per cent.: its use in such cases is to be deprecated, and is perhaps best avoided in any case, whether the variation be continuous or discontinuous, in which small series of observations have to be dealt with.

15. When a table showing the frequency-distribution for a long series of observations of a continuous variable is given, no difficulty arises, as a sufficiently approximate value of the median can be readily determined by simple interpolation on the hypothesis that the values in each class are uniformly distributed throughout the interval. Thus, taking the figures in our first illustration of the method of calculating the mean, the total number of observations (registration districts) is 632, of which the half is 316. Looking down the table, we see that there are 227 districts with not more than 2.75 per cent. of the population in receipt of relief, and 100 more with between 2.75 and 3.25 per cent. But only 89 are required to make up the total of 316; hence the value of the median is taken as

$$Mi = 2.75 + \frac{89}{100} \times 0.5 = 2.75 + 0.445 = 3.195 \text{ per cent.}$$

The mean being 3.29, the median is slightly less; its position is indicated by Mi in fig. 21.

The value of the median stature of males may be similarly calculated from the data of the second illustration. The work may be indicated thus:

Half the total number of observations (8585) = 4292.5
Total frequency under $66\tfrac{15}{16}$ inches = 3589
Difference = 703.5
Frequency in next interval = 1329

Therefore median $= 66\tfrac{15}{16} + \dfrac{703.5}{1329} = 67.47$ inches.
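The interpolation described in § 15 can be sketched as a small Python function. The name `interpolated_median` and its argument names are illustrative assumptions; the figures are those quoted in the two worked cases above, and the uniform spread of values within the class is the hypothesis stated in the text.

```python
from fractions import Fraction

def interpolated_median(lower_limit, width, cum_below, freq_in_class, n):
    """Median by linear interpolation within the class that contains N/2."""
    needed = Fraction(n) / 2 - cum_below      # observations still required
    return lower_limit + width * needed / freq_in_class

# Pauperism: 632 districts, 227 not exceeding 2.75 per cent.,
# 100 more in the next interval of half a unit.
pauperism = interpolated_median(Fraction(11, 4), Fraction(1, 2), 227, 100, 632)
print(float(pauperism))                       # 3.195

# Stature: 8585 men, 3589 under 66 15/16 inches,
# 1329 in the next interval of one inch.
stature = interpolated_median(
    Fraction(66) + Fraction(15, 16), Fraction(1), 3589, 1329, 8585
)
print(round(float(stature), 2))               # 67.47
```

The lower limit passed in must be the true lower boundary of the class (2.75 and 66 15/16 here), not its mid-value.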

The difference between median and mean in this case is therefore only about one-hundredth of an inch, the smallness of the difference arising from the approximate symmetry of the distribution. In an absolutely symmetrical distribution it is evident that mean and median must coincide.

16. Graphical interpolation may, if desired, be substituted for arithmetical interpolation. Taking, again, the figures of Example i., the number of districts with pauperism not exceeding 2.25 is 138; not exceeding 2.75, 227; not exceeding 3.25, 327; and not exceeding 3.75, 417. Plot the numbers of districts with pauperism not exceeding each value $X$ to the corresponding ...

[Figure: numbers of districts (vertical scale 100 to 400) plotted against $X$.]

... however, involve the crude assumption that the frequency is uniformly distributed over the interval in which the median lies.

17. A comparison of the calculations for the mean and for the median respectively will show that on the score of brevity of calculation the median has a distinct advantage. When, however, the ease of algebraical treatment of the two forms of average is compared, the superiority lies wholly on the side of the mean. As was shown in § 13, when several series of observations are combined into a single series, the mean of the resultant distribution can be simply expressed in terms of the means of the components. The expression of the median of the resultant distribution in terms of the medians of the components is, however, not merely complex and difficult, but impossible: the value of the resultant median depends on the forms of the component distributions, and not on their medians alone. If two symmetrical distributions of the same form and with the same numbers of observations, but with different medians, be combined, the resultant median must evidently (from symmetry) coincide with the resultant mean, i.e. lie halfway between the means of the components. But if the two components be asymmetrical, or (whatever their form) if the degrees of dispersion or numbers of observations in the two series be different, the resultant median will not coincide with the resultant mean, nor with any other simply assignable value. It is impossible, therefore, to give any theorem for medians analogous to equations (5) and (6) for means. It is equally impossible to give any theorem analogous to equations (8) and (9) of § 13. The median of the sum or difference of pairs of corresponding observations in two series is not, in general, equal to the sum or difference of the medians of the two series; the median value of a measurement subject to error is not necessarily identical with the true median, even if the median error be zero, i.e. if positive and negative errors be equally frequent.

18. These limitations render the applications of the median in any work in which theoretical considerations are necessary comparatively circumscribed. On the other hand, the median may have an advantage over the mean for special reasons. (a) It is very readily calculated; a factor to which, however, as already stated, too much weight ought not to be attached. (b) It is readily obtained, without the necessity of measuring all the objects to be observed, in any case in which they can be arranged by eye in order of magnitude. If, for instance, a number of men be ranked in order of stature, the stature of the middlemost is the median, and he alone need be measured. (On the other hand

it is useless in the cases cited at the end of § 6; the median wage cannot be found from the total of the wages-bill, and the total of the wages-bill is not known when the median is given.) (c) It is sometimes useful as a makeshift, when the observations are so given that the calculation of the mean is impossible, owing, e.g., to a final indefinite class, as in Table IV. (Chap. VI. § 10). (d) The median may sometimes be preferable to the mean, owing to its being less affected by abnormally large or small values of the variable. The stature of a giant would have no more influence on the median stature of a number of men than the stature of any other man whose height is only just greater than the median. If a number of men enjoy incomes closely clustering round a median of £500 a year, the median will be no more affected by the addition to the group of a man with the income of £50,000 than by the addition of a man with an income of £5000, or even £600. If observations of any kind are liable to present occasional greatly outlying values of this sort (whether real, or due to errors or blunders), the median will be more stable and less affected by fluctuations of sampling than the arithmetic mean. (In general the mean is the less affected.) The point is discussed more fully later (Chap. XVII.). (e) It may be added that the median is, in a certain sense, a particularly real and natural form of average, for the object or individual that is the median object or individual on any one system of measuring the character with which we are concerned will remain the median on any other method of measurement which leaves the objects in the same relative order. Thus a batch of eggs representing eggs of the median price, when prices are reckoned at so much per dozen, will remain a batch representing the median price when prices are reckoned at so many eggs to the shilling.

19. The Mode. The mode is the value of the variable corresponding to the maximum of the ideal frequency-curve which gives the closest possible fit to the actual distribution.

It is evident that in an ideal symmetrical distribution mean, median and mode coincide with the centre of symmetry. If, however, the distribution be asymmetrical, as in fig. 22, the three forms of average are distinct, Mo being the mode, Mi the median, and M the mean. Clearly, the mode is an important form of average in the cases of skew distributions, though the term is of recent introduction (Pearson, ref. 11). It represents the value which is most frequent or typical, the value which is in fact the fashion (la mode). But a difficulty at once arises on attempting to determine this value for such distributions as occur in practice. It is no use giving merely the mid-value of the class-interval into which the greatest frequency falls, for this is entirely dependent

on the choice of the scale of class-intervals. It is no use making the class-intervals very small to avoid error on that account, for the class-frequencies will then become small and the distribution irregular. What we want to arrive at is the mid-value of the interval for which the frequency would be a maximum, if the intervals could be made indefinitely small and at the same time the number of observations be so increased that the class-frequencies should run smoothly. As the observations cannot, in a practical case, be indefinitely increased, it is evident that some process of smoothing out the irregularities that occur in the actual distribution must be adopted, in order to ascertain the approximate value of the mode. But there is only one smoothing process that is really satisfactory, in so far as every observation can be taken into account in the determination, and that is the method of fitting an ideal frequency-curve of given equation to the actual figures. The value of the variable corresponding to the maximum of the fitted curve is then taken as the mode, in accordance with our definition. Mo in fig. 21 is the value of the mode so determined for the distribution of pauperism, the value 2.99 being, as it happens, very nearly coincident with the centre of the interval in which the greatest frequency lies. The determination of the mode by this, the only strictly satisfactory method, must, however, be left to the more advanced student.

20. At the same time there is an approximate relation between mean, median, and mode that appears to hold good with surprising closeness for moderately asymmetrical distributions, approaching the ideal type of fig. 9, and it is one that should be borne in mind as giving, roughly at all events, the relative values of these three averages for a great many cases with which the student will have to deal. It is expressed by the equation

$$\text{Mode} = \text{Mean} - 3\,(\text{Mean} - \text{Median}).$$

That is to say, the median lies one-third of the distance from the mean towards the mode (compare figs. 21 and 22). For the distribution of pauperism we have, taking the mean to three places of decimals,

Mean = 3.289
Median = 3.195
Difference = 0.094

Hence approximate mode $= 3.289 - 3 \times 0.094 = 3.007$, or 3.01 to the second place of decimals, which is sufficient accuracy for the final result, though three decimal places must be retained for the calculation. The true mode, found by fitting an ideal

distribution, is 2.99. As further illustrations of the closeness with which the relation may be expected to hold in different cases, we give below the results for the distributions of pauperism in the unions of England and Wales in the years 1850, 1860, 1870, 1881, and 1891 (the last being the illustration taken above), and also the results for the distribution of barometer heights at Southampton (Table XI., Chap. VI. § 14), and similar distributions at four other stations.

Comparison of the Approximate and True Modes in the Case of Five Distributions of Pauperism (Percentages of the Population in receipt of Relief) in the Unions of England and Wales. (Yule, Jour. Roy. Stat. Soc., vol. lix., 1896.)

[Table, first column "Year": body not recoverable from the scan.]
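The approximate relation of § 20 is simple enough to check directly. The sketch below, with the assumed function name `approximate_mode`, applies it to the mean and median quoted in the text for the distribution of pauperism; the result may be set beside the fitted value 2.99 given above.

```python
def approximate_mode(mean, median):
    """Empirical relation of section 20: Mode = Mean - 3 (Mean - Median)."""
    return mean - 3 * (mean - median)

mode = approximate_mode(3.289, 3.195)
print(round(mode, 3))   # 3.007
print(round(mode, 2))   # 3.01
```

As the text notes, the mean and median should be carried to three places in the calculation, since the difference between them is multiplied by three.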

... it is simply calculated, its value is always determinate, its algebraic treatment is particularly easy, and in most cases it is rather less affected than the median by errors of sampling. The median is, it is true, somewhat more easily calculated from a given frequency-distribution than is the mean; it is sometimes a useful makeshift, and in a certain class of cases it is more and not less stable than the mean; but its use is undesirable in cases of discontinuous variation, its value may be indeterminate, and its algebraic treatment is difficult and often impossible. The mode, finally, is a form of average hardly suitable for elementary use, owing to the difficulty of its determination, but at the same time it represents an important value of the variable. The arithmetic mean should invariably be employed unless there is some very definite reason for the choice of another form of average, and the elementary student will do very well if he limits himself to its use. Objection is sometimes taken to the use of the mean in the case of asymmetrical frequency-distributions, on the ground that the mean is not the mode, and that its value is consequently misleading. But no one in the least degree familiar with the manifold forms taken by frequency-distributions would regard the two as in general identical; and while the importance of the mode is a good reason for stating its value in addition to that of the mean, it cannot replace the latter. The objection, it may be noted, would apply with almost equal force to the median, for, as we have seen (§ 20), the difference between mode and median is usually about two-thirds of the difference between mode and mean.

22. The Geometric Mean. The geometric mean $G$ of a series of values $X_1, X_2, X_3, \ldots, X_N$ is defined by the relation

$$G = (X_1 . X_2 . X_3 \ldots X_N)^{1/N} \qquad (10)$$

The definition may also be expressed in terms of logarithms,

$$\log G = \frac{1}{N}\,\Sigma(\log X) \qquad (11)$$

that is to say, the logarithm of the geometric mean of a series of values is the arithmetic mean of their logarithms.

The geometric mean of a given series of quantities is always less than their arithmetic mean; the student will find a proof in most text-books of algebra, and in ref. 10. The magnitude of the difference depends largely on the amount of dispersion of the variable in proportion to the magnitude of the mean (cf. Chap. VIII., Question 8). It is necessarily zero, it should be noticed, if even a single value of $X$ is zero, and it may become imaginary if negative values occur. Excluding these cases, the value of the

geometric mean is always determinate and is rigidly defined. The computation is a little long, owing to the necessity of taking logarithms: it is hardly necessary to give an example, as the method is simply that of finding the arithmetic mean of the logarithms of $X$ (instead of the values of $X$) in accordance with equation (11). If there are many observations, a table should be drawn up giving the frequency-distribution of $\log X$, and the mean should be calculated as in Examples i. and ii. of §§ 9 and 10. The geometric mean has never come into general use as a representative average, partly, no doubt, on account of its rather troublesome computation, but principally on account of its somewhat abstract mathematical character (cf. § 4 (c)): the geometric mean does not possess any simple and obvious properties which render its general nature readily comprehensible.

23. At the same time, as the following examples show, the mean possesses some important properties, and is readily treated algebraically in certain cases.

(a) If the series of observations $X$ consist of $r$ component series, there being $N_1$ observations in the first, $N_2$ in the second, and so on, the geometric mean $G$ of the whole series can be readily expressed in terms of the geometric means $G_1$, $G_2$, etc., of the component series. For evidently we have at once (as in § 13 (b))

$$N \log G = N_1 \log G_1 + N_2 \log G_2 + \ldots + N_r \log G_r \qquad (12)$$

(b) The geometric mean of the ratios of corresponding observations in two series is equal to the ratio of their geometric means. For if $X = X_1 / X_2$, then $\log X = \log X_1 - \log X_2$, and summing for all pairs of $X_1$'s and $X_2$'s,

$$G = G_1 / G_2 \qquad (13)$$

(c) Similarly, if a variable $X$ is given as the product of any number of others, i.e. if $X = X_1 . X_2 . X_3 \ldots X_r$, where $X_1, X_2, \ldots, X_r$ denote corresponding observations in $r$ different series, the geometric mean $G$ of $X$ is expressed in terms of the geometric means $G_1, G_2, \ldots, G_r$ of $X_1, X_2, \ldots, X_r$ by the relation

$$G = G_1 . G_2 . G_3 \ldots G_r \qquad (14)$$

That is to say, the geometric mean of the product is the product of the geometric means.
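The three properties (12) to (14) can be verified numerically. The following Python sketch computes the geometric mean through logarithms, as in equation (11); the function name `geometric_mean` and all the figures are hypothetical, chosen only so that the results are easy to check by hand.

```python
import math

def geometric_mean(values):
    """Equation (11): log G is the arithmetic mean of the logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical paired observations, chosen only for illustration.
x1 = [2.0, 8.0, 4.0]
x2 = [1.0, 2.0, 4.0]

g1, g2 = geometric_mean(x1), geometric_mean(x2)
ratios = [a / b for a, b in zip(x1, x2)]
products = [a * b for a, b in zip(x1, x2)]

assert math.isclose(geometric_mean(ratios), g1 / g2)      # equation (13)
assert math.isclose(geometric_mean(products), g1 * g2)    # equation (14)

# Equation (12): pooling two component series of unequal size.
a, b = [2.0, 8.0], [4.0, 4.0, 4.0]
pooled = geometric_mean(a + b)
lhs = len(a + b) * math.log(pooled)
rhs = len(a) * math.log(geometric_mean(a)) + len(b) * math.log(geometric_mean(b))
assert math.isclose(lhs, rhs)
```

All values must be positive, in line with the remark in § 22 that the geometric mean is zero if any value is zero and may become imaginary if negative values occur.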

24. The use of the geometric mean finds its simplest application in estimating the numbers of a population midway between two epochs (say two census years) at which the population is known. If nothing is known concerning the increase of the population save that the numbers recorded at the first census were $P_0$ and at the second census $n$ years later $P_n$, the most reasonable assumption to make is that the percentage increase in each year has been the same, so that the populations in successive years form a geometric series, $P_0 r$ being the population a year after the first census, $P_0 r^2$ two years after the first census, and so on, and

$$P_n = P_0\, r^n \qquad (15)$$

The population midway between the two censuses is therefore

$$P_{n/2} = P_0\, r^{n/2} = (P_0 . P_n)^{1/2} \qquad (16)$$

[Fig. 24. Showing the Populations of certain rural counties of England (Hereford, Cumberland, Dorset) for each Census year from 1801 to 1901.]

i. be used with discretion. and similar results will often be found for districts in which the population is not increasing very rapidly. 14. the curves are frequently concave towards the base. it cannot be assumed for the Counties if it be assumed for the Counties. a peculiarly convenient form of average in dealing with ratios. of prices. for example. For if in one part of the area considered the initial population is Pq and the common ratio R." as they are termed.126 THEORY OF STATISTICS. the population in year n is given by : — This ^oes not represent a constant rate of increase unless = r. The student is referred to refs. a curve over any considerable period of time representing the growth of population as in fig. If then. and from which there is much emigration. Let : R ^ 0' -^ 0' -^ 01 • • . however. 15 for a discussion of methods that may be used for the consistent estimation of populations under such circumstances. whether the population were In the diagram it will be seen that increasing or decreasing. This result must. the assumption is not self-consistent in any case in which the rate of increase is not uniform over the entire area and almost any area can be analysed into parts which are not similar in this respect. a constant percentage rate of increase be assumed for England and Wales as a whole.e. " index-numbers. i.e. the geometric mean of the numbers given by the two censuses. 25. or even usually. The property of the geometric mean illustrated by equation (13) renders it. and in the remainder of the area the initial population is ^q and the common ratio r. 24 would be continuously convex to the base. Further. constant if it were so. in some respects. it cannot be assumed for the country as a whole. The rate of increase of population is not necessarily.

... form of average of the $P$'s for any given year will afford an indication of the general level of prices for that year, provided the commodities chosen are sufficiently numerous and representative. The question is, what form of average to choose. If the geometric mean be chosen, and $G_{10}$, $G_{20}$ denote the geometric means of the $P$'s for the years 1 and 2 respectively, we have

[Equation illegible in the scan.]

... deviation $G/r$ will be regarded as the equivalent of a deviation $r.G$, instead of a deviation $-x$ as the equivalent of a deviation $+x$. If a distribution take the simplest possible form when relative deviations are regarded as equivalents, the frequency of deviations between $G/s$ and $G/r$ will be equal to the frequency of deviations between $r.G$ and $s.G$. The frequency-curve will then be symmetrical round $\log G$ if plotted to $\log X$ as base, and if there be a single mode, $\log G$ will be that mode, a logarithmic or geometric mode, as it might be termed: $G$ will not be the mode if the distribution be plotted in the ordinary way to values of $X$ as base. The theory of such a distribution has been discussed by more than one author (refs. 2, 8, 9). The general applicability of the assumption made does not, however, appear to have been very widely tested, and the reasons assigned have not sufficed to bring the geometric mean into common use. It may be noted that, as the geometric mean is always less than the arithmetic mean, the fundamental assumption which would justify the use of the former clearly does not hold where the (arithmetic) mode is greater than the arithmetic mean, as in Tables X. and XI. of the last chapter.

27. The Harmonic Mean. The harmonic mean of a series of quantities is the reciprocal of the arithmetic mean of their reciprocals, that is, if $H$ be the harmonic mean,

$$\frac{1}{H} = \frac{1}{N}\,\Sigma\!\left(\frac{1}{X}\right) \qquad (18)$$

The following illustration, the result of which is required for an example in a later chapter (Chap. XIII. § 11), will serve to show the method of calculation. The table gives the number of litters of mice, in certain breeding experiments, with given numbers ($X$) in the litter. (Data from A. D. Darbishire, Biometrika, iii. pp. 30, 31.)

[Table, first column "Number in litter": body not recoverable from the scan.]

Whence, $1/H = 0.2831$, $H = 3.532$. The arithmetic mean is 4.587, or more than a unit greater.

If the prices of a commodity at different places or times are stated in the form "so much for a unit of money," and an average price obtained by taking the arithmetic mean of the quantities sold for a unit of money, the result is equivalent to the harmonic mean of prices stated in the ordinary way. Thus retail prices of eggs were quoted before the War as "so many to the shilling." Supposing we had 100 returns of retail prices of eggs, 50 returns showing twelve eggs to the shilling, 30 fourteen to the shilling, and 20 ten to the shilling; then the mean number per shilling would be 12.2, equivalent to a price of 0.984d. per egg. But if the prices had been quoted in the form usual for other commodities, we should have had 50 returns showing a price of 1d. per egg, 30 showing a price of 0.857d., and 20 a price of 1.2d.: arithmetic mean 0.997d., a slightly greater value than the harmonic mean of 0.984. The official returns of prices in India were, until 1907, given in the form of "Sers (2.057 lbs.) per rupee." The average annual price of a commodity was based on half-monthly prices stated in this form, and "index-numbers" were calculated from such annual averages. In the issues of "Prices and Wages in India" for 1908 and later years the prices have been stated in terms of "rupees per maund (82.286 lbs.)." The change, it will be seen, amounts to a replacement of the harmonic by the arithmetic mean price.

The harmonic mean of a series of quantities is always lower than the geometric mean of the same quantities, and, a fortiori, lower than the arithmetic mean, the amount of difference depending largely on the magnitude of the dispersion relatively to the magnitude of the mean. (Cf. Question 9, Chap. VIII.)

REFERENCES.

General.

(1) Fechner, G. T., "Ueber den Ausgangswerth der kleinsten Abweichungssumme, dessen Bestimmung, Verwendung und Verallgemeinerung," Abh. d. kgl. sächsischen Gesellschaft d. Wissenschaften, vol. xviii. (also numbered xi. of the Abh. d. math.-phys. Classe); Leipzig (1878). (The average defined as the origin from which the dispersion, measured in one way or another, is a minimum: geometric mean dealt with incidentally.)

(2) Fechner, G. T., Kollektivmasslehre, herausgegeben von G. F. Lipps; Engelmann, Leipzig, 1897. (Posthumously published: deals with frequency-distributions, their forms, averages, and measures of dispersion in general: includes much of the matter of (1).)

(3) Zizek, Franz, Die statistischen Mittelwerthe; Duncker und Humblot, Leipzig, 1908: English translation, Statistical Averages, translated with additional notes, etc., by W. M. Persons; Holt & Co., New York, 1913. (Non-mathematical, but useful to the economic student for references cited.)

The Geometric Mean.

(4) Jevons, W. Stanley, A Serious Fall in the Value of Gold ascertained, and its Social Effects set forth; Stanford, London, 1863. Reprinted in Investigations in Currency and Finance; Macmillan, London, 1884. (The geometric mean applied to the measurement of price changes.)

(5) Jevons, W. Stanley, "On the Variation of Prices and the Value of the Currency since 1782," Jour. Roy. Stat. Soc., vol. xxviii., 1865. Also reprinted in volume cited above.

(6) Edgeworth, F. Y., "On the Method of ascertaining a Change in the Value of Gold," Jour. Roy. Stat. Soc., vol. xlvi., 1883, p. 714. (Some criticism of the reasons assigned by Jevons for the use of the geometric mean.)

(7) Galton, Francis, "The Geometric Mean in Vital and Social Statistics," Proc. Roy. Soc., vol. xxix., 1879, p. 365.

(8) McAlister, Donald, "The Law of the Geometric Mean," ibid., p. 367. (The law of frequency to which the use of the geometric mean would be appropriate.)

(9) Kapteyn, J. C., Skew Frequency-curves in Biology and Statistics; Noordhoff, Groningen, 1903. (Contains, amongst other forms, a generalisation of McAlister's law.)

(10) Crawford, G. E., "An Elementary Proof that the Arithmetic Mean of any number of Positive Quantities is greater than the Geometric Mean," Proc. Edin. Math. Soc., vol. xviii., 1899-1900.

See also refs. 1 and 2.

The Mode.

(11) Pearson, Karl, "Skew Variation in Homogeneous Material," Phil. Trans. Roy. Soc., Series A, vol. clxxxvi., 1895, p. 343. (Definition of mode, p. 345.)

(12) Yule, G. U., "Notes on the History of Pauperism in England and Wales, etc.: Supplementary Note on the Determination of the Mode," Jour. Roy. Stat. Soc., vol. lix., 1896, p. 343. (The note deals with elementary methods of approximately determining the mode: the one-third rule and one other.)

(13) Pearson, Karl, "On the Modal Value of an Organ or Character," Biometrika, vol. i., 1902, p. 260. (A warning as to the inadequacy of mere inspection for determining the mode.)

Estimates of Population.

(14) Waters, A. C., "A Method for estimating Mean Populations in the last Intercensal Period," Jour. Roy. Stat. Soc., vol. lxiv., 1901, p. 293.

(15) Waters, A. C., Estimates of Population; Supplement to Annual Report of the Registrar-General for England and Wales (Cd. 2618, 1907, p. cxvii.).

For the methods actually used, see the Reports of the Registrar-General for England and Wales for 1907, pp. cxxxii-cxxxiv., and for 1910, pp. xi-xii. Cf. Snow, ref. 11, Chap. XII., for a different method based on the symptoms of growth such as numbers of births or of houses.

Index-numbers.

These were incidentally referred to in § 25. The general theory of index-numbers and the different methods in which they may be formed are not considered in the present work. The student will find copious references to the literature in the following:

(16) Edgeworth, F. Y., "Reports of the Committee appointed for the ...

EXERCISES.

1. Verify the following means and medians from the data of Table VI., Chap. VI., p. 88.

Stature in Inches for Adult Males in

            England.   Scotland.   Wales.   Ireland.
Mean         67.31      68.55      66.62     67.78
Median       67.35      68.48      66.56     67.69

In the calculation of the means, use the same arbitrary origin as in Example ii., and check your work by the method of § 13 (b).

2. Find the mean weight of adult males in the United Kingdom from the data in the last column of Table IX., Chap. VI., p. 95. Also find the median weight, and hence the approximate mode, by the method of § 20.

3. Similarly, find the mean, median, and approximate value of the mode for the distribution of fecundity in race-horses, Table X., Chap. VI., p. 96.

4. Using a graphical method, find the median annual value of houses assessed to inhabited house duty in the financial year 1885-6 from the data of Table IV., Chap. VI., p. 83.

5. (Data from Sauerbeck, Jour. Roy. Stat. Soc., March 1909.) The figures in columns 1 and 2 of the small table below show the index-numbers (or percentages) of prices of certain animal foods in the years 1898 and 1908, on their average prices during the years 1867-77. In column 3 have been added the ratios of the index-numbers in 1908 to the index-numbers in 1898, the latter being taken as 100.

Find the average ratio of prices in 1908 to prices in 1898, taken as 100:
(1) From the arithmetic mean of the ratios in col. 3.
(2) From the ratio of the arithmetic means of cols. 1 and 2.
(3) From the ratio of the geometric means of cols. 1 and 2.
(4) From the geometric mean of the ratios in col. 3.
Note that, by § 25, the last two methods must give the same result.

6. (Data from census of 1901.) The table below shows the population of the rural sanitary districts of Essex, the urban sanitary districts (other than the borough of West Ham), and the borough of West Ham, at the censuses of 1891 and 1901. Estimate the total population of the county at a date midway between the two censuses, (1) on the assumption that the percentage rate of increase is constant for the county as a whole, (2) on the assumption that the percentage rate of increase is constant in each group of districts and the borough of West Ham.

CHAPTER VIII.

MEASURES OF DISPERSION, ETC.

1. Inadequacy of the range as a measure of dispersion. 2-13. The standard deviation: its definition, calculation, and properties. 14-19. The mean deviation: its definition, calculation, and properties. 20-24. The quartile deviation or semi-interquartile range. 25. Measures of relative dispersion. 26. Measures of asymmetry or skewness. 27-30. The method of grades or percentiles.

1. The simplest possible measure of the dispersion of a series of values of a variable is the actual range, i.e. the difference between the greatest and least values observed. While this is frequently quoted, it is as a rule the worst of all possible measures for any serious purpose. There are seldom real upper and lower limits to the possible values of the variable, very large or very small values being only more or less infrequent: the range is therefore subject to meaningless fluctuations of considerable magnitude according as values of greater or less infrequency happen to have been actually observed. Note, for instance, the figures of Table IX., Chap. VI., p. 95, showing the frequency distributions of weights of adult males in the several parts of the United Kingdom. In Wales, one individual was observed with a weight of over 280 lbs., the next heaviest being under 260 lbs. The addition of the one very exceptional individual has increased the range by some 30 lbs., or about one-fifth. A measure subject to erratic alterations by casual influences in this way is clearly not of much use for comparative purposes. Moreover, the measure takes no account of the form of the distribution within the limits of the range; it might well happen that, of two distributions covering precisely the same range of variation, the one showed the observations for the most part closely clustered round the average, while the other exhibited an almost even distribution of frequency over the whole range. Clearly we should not regard two such distributions as exhibiting the same dispersion, though they exhibit the same range. Some sort of measure of dispersion is therefore required, based, like the averages discussed in the last chapter, on all the observations made, so that no single observation can have an unduly preponderant effect on its magnitude; indeed, the measure should possess all the properties laid down as desirable for an average in § 4 of Chap. VII. There are three such measures in common use: the standard deviation, the mean deviation, and the quartile deviation or semi-interquartile range, of which the first is the most important.

The Standard Deviation.

2. The standard deviation is the square root of the arithmetic mean of the squares of all deviations, deviations being measured from the arithmetic mean of the observations. If the standard deviation be denoted by $\sigma$, and a deviation from the arithmetic mean by $x$, as in the last chapter, then the standard deviation is given by the equation

$$\sigma^2 = \frac{1}{N}\,\Sigma(x^2). \tag{1}$$

To square all the deviations may seem at first sight an artificial procedure, but it must be remembered that it would be useless to take the mere sum of the deviations, in order to obtain a measure of dispersion, since this sum is necessarily zero if deviations be taken from the mean. In order to obtain some quantity that shall vary with the dispersion it is necessary to average the deviations by a process that treats them as if they were all of the same sign, and squaring is the simplest process for eliminating signs which leads to results of algebraical convenience.

3. A quantity analogous to the standard deviation may be defined in more general terms. Let $A$ be any arbitrary value of $X$, and let $\xi$ (as in Chap. VII. § 8) denote the deviation of $X$ from $A$; i.e. let $\xi = X - A$. Then we may define the root-mean-square deviation $s$ from the origin $A$ by the equation

$$s^2 = \frac{1}{N}\,\Sigma(\xi^2). \tag{2}$$

In terms of this definition the standard deviation is the root-mean-square deviation from the mean. There is a very simple relation between the standard deviation and the root-mean-square deviation from any other origin. Let

$$M - A = d \tag{3}$$

so that $\xi = x + d$. Then

$$\xi^2 = x^2 + 2x\,d + d^2,$$
$$\Sigma(\xi^2) = \Sigma(x^2) + 2d\,\Sigma(x) + N\,d^2.$$

But the sum of the deviations from the mean is zero, and accordingly the second term vanishes; therefore

$$s^2 = \sigma^2 + d^2. \tag{4}$$

Hence the root-mean-square deviation is least when deviations are measured from the mean, i.e. the standard deviation is the least possible root-mean-square deviation.

Generally, $\Sigma(\xi^2)$, or $\Sigma(f\,\xi^2)$ if we are dealing with a grouped distribution and $f$ is the frequency of $\xi$, is sometimes termed the second moment of the distribution about $A$, just as $\Sigma(\xi)$ or $\Sigma(f\,\xi)$ is termed the first moment (cf. Chap. VII. § 8), and $\Sigma(f\,\xi^m)$ is termed the $m$th moment; we shall not make use of the term in the present work.

4. If $\sigma$ and $d$ are the two sides of a right-angled triangle, $s$ is the hypotenuse. If, then, $MH$ be the vertical through the mean of a frequency-distribution (fig. 25), and $MS$ be set off equal to the standard deviation (on the same scale in which the variable is plotted along the base), $SA$ will be the root-mean-square deviation from the point $A$. This construction gives a concrete idea of the way in which the root-mean-square deviation depends on the origin from which deviations are measured. It will be seen that for small values of $d$ the difference of $s$ from $\sigma$ will be very minute, since $A$ will lie very nearly on the circle drawn through $M$ with centre $S$ and radius $SM$: slight errors in the mean due to approximations in calculation will not, therefore, appreciably affect the value of the standard deviation.

[Fig. 25.]

5. If we have to deal with relatively few, say thirty or forty, ungrouped observations, the method of calculating the standard deviation is perfectly straightforward. It is illustrated by the figures given below for the estimated average earnings of agricultural labourers in 38 rural unions. The values (earnings) are first of all totalled and the total divided by $N$ to give the arithmetic mean $M$, viz. 15s. 11d., to the nearest penny. The earnings being estimates, it is not necessary to take the average to any higher degree of accuracy. Having found the mean, the difference of each observation from the mean is next written down as in col. 3, one penny being taken as the unit: the signs are not entered, as they are not wanted, but the work should be checked by totalling the positive and negative differences separately. [The positive total is 300 and the negative 290, thus checking the value for the mean, viz. 15s. 11d. + 10/38.] Finally, each difference is squared, and the squares entered in col. 4; tables of squares are useful for such work if any of the differences to be squared are large (see list of Tables, p. 356). The sum of the squares is 16,018. Treating the value taken for the mean as sensibly accurate, we have

$$\sigma^2 = \frac{16018}{38} = 421.5263, \qquad \sigma = 20.53\text{d}.$$

If we wish to be more precise we can reduce to the true mean by the use of equation (4), as follows:

$$s^2 = \frac{16018}{38} = 421.5263,$$
$$d = \frac{10}{38} = 0.2632, \qquad d^2 = 0.0693.$$

Hence

$$\sigma^2 = s^2 - d^2 = 421.4570, \qquad \sigma = 20.529\text{d}.$$

Evidently this reduction, in the given case, is unnecessary, illustrating the fact mentioned at the end of § 4, that small errors in the mean have little effect on the value found for the standard deviation. The first value is correct within a very small fraction of a penny.
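The reduction to the true mean by equation (4) can be checked numerically. The sketch below is a minimal illustration in Python; it uses only the totals quoted in the text (38 observations, a sum of squares of 16,018, and positive and negative totals of 300 and 290). The function name `sd_from_working_origin` is an assumption made for the example and does not come from the text.

```python
from math import sqrt

def sd_from_working_origin(sum_sq, sum_dev, n):
    """Standard deviation from deviations taken about a working origin A.

    sum_sq  -- sum of the squared deviations from A
    sum_dev -- algebraic sum of the deviations from A
    n       -- number of observations
    Applies equation (4): s^2 = sigma^2 + d^2, where d = M - A.
    """
    s2 = sum_sq / n      # mean-square deviation about A
    d = sum_dev / n      # distance of the true mean from A
    return sqrt(s2 - d * d)

# Totals quoted in the text for Example i (unit: one penny).
n = 38
sum_sq = 16018
sum_dev = 300 - 290      # positive total less negative total

# Value obtained by treating 15s. 11d. as if it were the exact mean.
s = sqrt(sum_sq / n)
# Value after reduction to the true mean.
sigma = sd_from_working_origin(sum_sq, sum_dev, n)

print(round(s, 3), round(sigma, 3))
```

The two printed values differ only in the third decimal place, which is the point made at the end of § 4.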

CALCULATION OF THE STANDARD DEVIATION: Example i. Calculation of Mean and Standard Deviation for a Short Series of Observations ungrouped. Estimated Average Weekly Earnings of Agricultural Labourers in Thirty-eight Rural Unions, in 1892-3. (W. Little, Labour Commission: Report, vol. v., part i., 1894.)

The figures dealt with in this illustration are estimates of the weekly earnings of the agricultural labourers, i.e. they include allowances for gifts in kind, such as coal, potatoes, cider, etc. The estimated weekly money wages are, however, also given in the same Report, and we are thus enabled to make an interesting comparison of the dispersions of the two. It might be expected that earnings would vary less than wages, as his earnings and not the mere money wages he receives are the important matter to the labourer, and as a fact we find

Standard deviation of weekly earnings   20.5d.
Standard deviation of weekly wages      26.0d.

The arithmetic mean wage is 13s. 5d.

6. If we have to deal with a grouped frequency-distribution, the same artifices and approximations are used as in the calculation of the mean (Chap. VII. §§ 8, 9, 10). The mid-value of one of the class-intervals is chosen as the arbitrary origin $A$ from which to measure the deviations $\xi$, the class-interval is treated as a unit throughout the arithmetic, and all the observations within any one class-interval are treated as if they were identical with the mid-value of the interval. If, as before, we denote the frequency in any one interval by $f$, these $f$ observations contribute $f\,\xi^2$ to the sum of the squares of deviations and we have

$$s^2 = \frac{1}{N}\,\Sigma(f\,\xi^2).$$

The standard deviation is then calculated from equation (4).

7. The whole of the work proceeds naturally as an extension of that necessary for calculating the mean, and we accordingly use the same illustrations as in the last chapter. Thus in Example ii. below, cols. 1, 2, 3, and 4 are the same as those we have already given in Example i. of Chap. VII. for the calculation of the mean. Column 5 gives the figures necessary for calculating the standard deviation, and is derived directly from col. 4 by multiplying the figures of that column again by $\xi$. Thus 90 × 5 = 450, 192 × 4 = 768, and so on. The work is therefore done very rapidly. The remaining steps of the arithmetic are given below the table; the student must be careful to remember the final conversion, if necessary, from the class-interval as unit to the natural unit of measurement. In this case the value found is 2.48 class-intervals, and the class-interval being half a unit, that is 1.24 per cent.
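The grouped procedure of §§ 6 and 7 may be sketched in Python as follows. The frequencies below are hypothetical and are not those of Table VIII.; the function name `grouped_mean_sd` and its parameters are likewise assumptions made for illustration. The sketch measures deviations from a chosen mid-value in class-interval units, applies equation (4), and converts back to the natural unit at the end.

```python
from math import sqrt

def grouped_mean_sd(mid_values, freqs, origin, width):
    """Mean and standard deviation of a grouped frequency-distribution.

    Deviations xi are measured from `origin` (a class mid-value) with the
    class-interval `width` as unit; the results are returned in natural units.
    """
    n = sum(freqs)
    xi = [(m - origin) / width for m in mid_values]
    sum_f_xi = sum(f * x for f, x in zip(freqs, xi))        # sum of f.xi
    sum_f_xi2 = sum(f * x * x for f, x in zip(freqs, xi))   # sum of f.xi^2
    d = sum_f_xi / n              # M - A, in class-intervals
    s2 = sum_f_xi2 / n            # mean-square deviation about A
    sigma = sqrt(s2 - d * d)      # equation (4), still in class-intervals
    # Final conversion from the class-interval to the natural unit.
    return origin + d * width, sigma * width

# Hypothetical distribution with class-intervals of half a unit.
mids = [1.0, 1.5, 2.0, 2.5, 3.0]
freqs = [2, 5, 10, 5, 2]
mean, sd = grouped_mean_sd(mids, freqs, origin=2.5, width=0.5)
print(mean, sd)
```

Any class mid-value may serve as the origin; choosing one near the mean merely keeps the products small when the work is done by hand.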

CALCULATION OF THE STANDARD DEVIATION: Example ii. Calculation of the Standard Deviation of the Percentages of the Population in receipt of Relief, in addition to the Mean, from the figures of Table VIII. of Chap. VI. (Cf. the work for the mean alone, p. 111.)

Means and Standard Deviations of the Distributions of Pauperism (Percentage of the Population in receipt of Poor-law Relief) in the Unions of England and Wales since 1850. (From Yule, Jour. Roy. Stat. Soc., vol. lix., 1896, figures slightly amended.)

CALCULATION OF THE STANDARD DEVIATION: Example iii. Calculation of the Standard Deviation of Stature of Male Adults in the British Isles from the figures of Table VI., p. 88. (Cf. p. 112 for the calculation of the mean alone.)

[...] give a more definite and concrete meaning to the standard deviation, and also to check arithmetical work to some extent: sufficiently, that is to say, to guard against very gross blunders. It must not be expected to hold for short series of observations: in Example i., for instance, the actual range is a good deal less than six times the standard deviation.

11. The standard deviation is the measure of dispersion which it is most easy to treat by algebraical methods, resembling in this respect the arithmetic mean amongst measures of position. The majority of illustrations of its treatment must be postponed to a later stage (Chap. XI.), but the work of § 3 has already served as one example, and we may take another by continuing the work of § 13 (b), Chap. VII. In that section it was shown that if a series of observations of which the mean is $M$ consist of two component series, of which the means are $M_1$ and $M_2$ respectively,

$$N\,M = N_1 M_1 + N_2 M_2,$$

$N_1$ and $N_2$ being the numbers of observations in the two component series, and $N = N_1 + N_2$ the number in the entire series. Similarly, the standard deviation $\sigma$ of the whole series may be expressed in terms of the standard deviations $\sigma_1$ and $\sigma_2$ of the components and their respective means. Let

$$M_1 - M = d_1, \qquad M_2 - M = d_2.$$

Then the mean-square deviations of the component series about the mean $M$ are, by equation (4), $\sigma_1^2 + d_1^2$ and $\sigma_2^2 + d_2^2$ respectively. Therefore, for the whole series,

$$N\,\sigma^2 = N_1(\sigma_1^2 + d_1^2) + N_2(\sigma_2^2 + d_2^2). \tag{5}$$

If the numbers of observations in the component series be equal and the means be coincident, we have as a special case

$$\sigma^2 = \tfrac{1}{2}(\sigma_1^2 + \sigma_2^2), \tag{6}$$

so that in this case the square of the standard deviation of the whole series is the arithmetic mean of the squares of the standard deviations of its components.

It is evident that the form of the relation (5) is quite general: if a series of observations consists of $r$ component series with standard deviations $\sigma_1, \sigma_2, \ldots, \sigma_r$, and means diverging from the general mean of the whole series by $d_1, d_2, \ldots, d_r$, the standard deviation $\sigma$ of the whole series is given (using $m$ to denote any subscript) by the equation

$$N\,\sigma^2 = \Sigma(N_m\,\sigma_m^2) + \Sigma(N_m\,d_m^2). \tag{7}$$
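Equation (7) lends itself to a direct numerical check. The following Python sketch, with two short hypothetical series, combines the numbers, means and standard deviations of the components and compares the result with the standard deviation computed from the pooled observations. The helper name `combined_sd` is an assumption for the example; `statistics.pstdev` is used because it divides by N, in agreement with equation (1).

```python
from math import sqrt
from statistics import fmean, pstdev

def combined_sd(parts):
    """Standard deviation of a whole series from its components.

    `parts` is a list of (n_m, mean_m, sd_m) tuples; the result follows
    equation (7): N.sigma^2 = sum(N_m.sigma_m^2) + sum(N_m.d_m^2).
    """
    n = sum(p[0] for p in parts)
    general_mean = sum(n_m * mean_m for n_m, mean_m, _ in parts) / n
    total = sum(
        n_m * (sd_m ** 2 + (mean_m - general_mean) ** 2)
        for n_m, mean_m, sd_m in parts
    )
    return sqrt(total / n)

# Two hypothetical component series.
a = [2.0, 4.0, 4.0, 6.0]
b = [7.0, 9.0, 11.0]

parts = [(len(s), fmean(s), pstdev(s)) for s in (a, b)]
pooled = combined_sd(parts)      # from the components, equation (7)
direct = pstdev(a + b)           # from all the observations together
print(pooled, direct)
```

The two values agree to within rounding error, whatever the sizes and means of the components.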

Again, as in § 13 of Chap. VII., it is convenient to note, for the checking of arithmetic, that if the same arbitrary origin be used for the calculation of the standard deviations in a number of component distributions we must have

$$\Sigma(f\,\xi^2) = \Sigma(f_1\,\xi^2) + \Sigma(f_2\,\xi^2) + \ldots + \Sigma(f_r\,\xi^2). \tag{8}$$

12. As another useful illustration, let us find the standard deviation of the first $N$ natural numbers. The mean in this case is evidently $(N+1)/2$. Further, as is shown in any elementary Algebra, the sum of the squares of the first $N$ natural numbers is

$$\frac{N(N+1)(2N+1)}{6}.$$

The standard deviation $\sigma$ is therefore given by the equation

$$\sigma^2 = \tfrac{1}{6}(N+1)(2N+1) - \tfrac{1}{4}(N+1)^2,$$

that is,

$$\sigma^2 = \tfrac{1}{12}(N^2 - 1). \tag{9}$$

This result is of service if the relative merit of, or the relative intensity of some character in, the different individuals of a series is recorded not by means of measurements, e.g. marks awarded on some system of examination, but merely by means of their respective positions when ranked in order as regards the character, in the same way as boys are numbered in a class. With $N$ individuals there are always $N$ ranks, as they are termed, whatever the character, and the standard deviation is therefore always that given by equation (9).

Another useful result follows at once from equation (9), namely, the standard deviation of a frequency-distribution in which all values of $X$ within a range $\pm l/2$ on either side of the mean are equally frequent, values outside these limits not occurring, so that the frequency-distribution may be represented by a rectangle. The base $l$ may be supposed divided into a very large number $N$ of equal elements, and the standard deviation reduces to that of the first $N$ natural numbers when $N$ is made indefinitely large. The single unit then becomes negligible compared with $N$, and consequently

$$\sigma^2 = \frac{l^2}{12}. \tag{10}$$

13. It will be seen from the preceding paragraphs that the standard deviation possesses the majority at least of the properties which are desirable in a measure of dispersion as in an average (Chap. VII. § 4). It is rigidly defined; it is based on all the observations made; it is calculated with reasonable ease; it lends itself readily to algebraical treatment; and we may add, though the student will have to take the statement on trust for the present, that it is, as a rule, the measure least affected by fluctuations of

sampling. On the other hand, it may be said that its general nature is not very readily comprehended, and that the process of squaring deviations and then taking the square root of the mean seems a little involved. The student will, however, soon surmount this feeling after a little practice in the calculation and use of the constant, and will realise, as he advances further, the advantages that it possesses. Such root-mean-square quantities, it may be added, frequently occur in other branches of science. The standard deviation should always be used as the measure of dispersion, unless there is some very definite reason for preferring another measure, just as the arithmetic mean should be used as the measure of position. It may be added here that the student will meet with the standard deviation under many different names, of which we have adopted the most recent (due to Pearson, ref. 2): many of the earlier names are hardly adapted to general use, as they bear evidence of their derivation from the theory of errors of observation. Thus the terms "mean error" (Gauss), "error of mean square" (Airy), and "mean square error" have all been used in the same sense. The square of the standard deviation, and also twice the square, have been termed the "fluctuation" (Edgeworth); the standard deviation multiplied by the square root of 2, the "modulus" (Airy): the student will see later the reason for the adoption of the factor. The reciprocal of the modulus has been termed the "precision" (Lexis).

The Mean Deviation.

14. The mean deviation of a series of values of a variable is the arithmetic mean of their deviations from some average, taken without regard to their sign. The deviations may be measured either from the arithmetic mean or from the median, but the latter is the natural origin to use. Just as the root-mean-square deviation is least when deviations are measured from the arithmetic mean, so the mean deviation is least when deviations are measured from the median. For suppose that, for some origin exceeded by $m$ values out of $N$, the mean deviation has a value $\Delta$. Let the origin be displaced by an amount $c$ until it is just exceeded by $m - 1$ of the values only, i.e. until it coincides with the $m$th value from the upper end of the series. By this displacement of the origin the sum of deviations in excess of the mean is reduced by $m\,c$, while the sum of deviations in defect of the mean is increased by $(N - m)c$. The new mean deviation is therefore

$$\Delta + \frac{(N - m)c - m\,c}{N} = \Delta + \frac{1}{N}(N - 2m)c.$$

The new mean deviation is accordingly less than the old so long as $m > \tfrac{1}{2}N$. That is to say, if $N$ be even, the mean deviation is constant for all origins within the range between the $N/2$th and the $(N/2 + 1)$th observations, and this value is the least: if $N$ be odd, the mean deviation is lowest when the origin coincides with the $(N+1)/2$th observation. The mean deviation is therefore a minimum when deviations are measured from the median or, if the latter be indeterminate, from an origin within the range in which it lies.

15. The calculation of the mean deviation either from the mean or from the median for a series of ungrouped observations is very simple. Take the figures of Example i. (p. 137) as an illustration. We have already found the mean (15s. 11d. to the nearest penny), and the deviations from the mean are written down in column 3. Adding up this column without respect to the sign of the deviations we find a total of 590. The mean deviation from the mean is therefore 590/38 = 15.53d. The mean deviation from the median is calculated in precisely the same way, but the median replaces the mean as the origin from which deviations are measured. The median is 15s. 6d. The deviations in pence run 63, 57, 50, 36, and so on; their sum is 570; and, accordingly, the mean deviation from the median is 15d., exactly.

16. In the case of a grouped frequency-distribution, the sum of deviations should be calculated first from the centre of the class-interval in which the mean (or median) lies, and then reduced to the mean as origin. Thus in the case of Example ii., the mean is 3.29 per cent., and lies in the class-interval centring round 3.5 per cent. We have already found that the sum of deviations in defect of 3.5 per cent. is 776, and of deviations in excess 509: total (without regard to sign) 1285, the unit of measurement being, of course, as before, the class-interval. If the number of observations below the mean is $N_1$ and above the mean $N_2$, and $M - A = d$, as before, we have to add $N_1 d$ to the sum found and subtract $N_2 d$. In the present case $N_1 = 327$ and $N_2 = 305$, while $d = -0.42$ class-intervals; therefore

$$d(N_1 - N_2) = -0.42 \times 22 = -9.2,$$

and the sum of deviations from the mean is 1285 − 9.2 = 1275.8. Hence the mean deviation from the mean is 1275.8/632 = 2.019 class-intervals, or 1.01 per cent.

17. The mean deviation from the median should be found in precisely similar fashion, but the mid-value of the interval in which the median (instead of the mean) lies should, for convenience, be taken as origin. Thus in Example ii. the median is (Chap. VII. § 15) 3.195 per cent. Hence 3.0 per cent. should be taken as the origin. The deviation-sum with 3.0 as origin is found to be 1263, $N_1 = 327$, $N_2 = 305$, $d = +0.39$ intervals, and the correction is +0.39 × 22 = +8.6. Hence the mean deviation from the median is 2.012 intervals, or again 1.01 per cent. The value is really smaller than that of the mean deviation from the arithmetic mean, but the difference is too slight to affect the second place of decimals.

It should be noted that, as in the case of the standard deviation, this method of calculation implies the assumption that all the values of $X$ within any one class-interval may be treated as if they were the mid-value of that interval. This is, of course, an approximation, but as a rule gives results of amply sufficient accuracy for practice if the class-interval be kept reasonably small (cf. again Chap. VI. § 5). We have left it as an exercise to the student to find the correction to be applied if the values in each interval are treated as if they were evenly distributed over the interval, instead of concentrated at its centre (Question 7).

18. The mean deviation, it will be seen, can be calculated rather more rapidly than the standard deviation, though in the case of a grouped distribution the difference in ease of calculation is not great. It is not, on the other hand, a convenient magnitude for algebraical treatment; for example, the mean deviation of a distribution obtained by combining several others cannot in general be expressed in terms of the mean deviations of the component distributions, but depends upon their forms. As a rule, it is more affected by fluctuations of sampling than is the standard deviation, but may be less affected if large and erratic deviations lying somewhat beyond the bulk of the distribution are liable to occur. This may happen, for example, in some forms of experimental work, and in such cases the use of the mean deviation may be slightly preferable to that of the standard deviation.

19. It is a useful empirical rule for the student to remember that for symmetrical or only moderately asymmetrical distributions, approaching the ideal forms of figs. 5 and 9, the mean deviation is usually very nearly four-fifths of the standard deviation. Thus for the distribution of pauperism we have

$$\frac{\text{mean deviation}}{\text{standard deviation}} = \frac{1.01}{1.24} = 0.81.$$

In the case of the distribution of male statures in the British Isles, Example iii., the ratio found is 0.80. For a short series of observations like the wage statistics of Example i. a regular result could hardly be expected: the actual ratio is 15.0/20.5 = 0.73.
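The property established in § 14, that the mean deviation is least when measured from the median, can be illustrated with a short Python sketch. The seven observations below are hypothetical, and the function name `mean_deviation` is an assumption made for the example.

```python
from statistics import fmean, median

def mean_deviation(values, origin):
    """Arithmetic mean of the deviations from `origin`, taken without regard to sign."""
    return sum(abs(v - origin) for v in values) / len(values)

# Hypothetical short series of ungrouped observations.
data = [3, 5, 6, 8, 9, 14, 25]

md_from_mean = mean_deviation(data, fmean(data))      # origin: arithmetic mean
md_from_median = mean_deviation(data, median(data))   # origin: median

# Trying every observed value as origin shows none does better than the median.
trial = {origin: mean_deviation(data, origin) for origin in data}
print(md_from_mean, md_from_median, min(trial.values()))
```

With an odd number of observations, as here, the minimum falls on the middle observation itself; with an even number any origin between the two middle observations gives the same least value.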

It was pointed out in § 10 that in distributions of the simple forms referred to, a range of six times the standard deviation contains over 99 per cent. of all the observations. If the mean deviation be employed as the measure of dispersion, we must substitute a range of 7½ times this measure.

The Quartile Deviation or Semi-interquartile Range.

20. If a value $Q_1$ of the variable be determined of such magnitude that one-quarter of all the values observed are less than $Q_1$ and three-quarters greater, then $Q_1$ is termed the lower quartile. Similarly, if a value $Q_3$ be determined such that three-quarters of all the values observed are less than $Q_3$ and one-quarter only greater, then $Q_3$ is termed the upper quartile. The two quartiles and the median divide the observed values of the variable into four classes of equal frequency. If $Mi$ be the value of the median, in a symmetrical distribution

$$Mi - Q_1 = Q_3 - Mi,$$

and the difference may be taken as a measure of dispersion. But as no distribution is rigidly symmetrical, it is usual to take as the measure

$$Q = \frac{Q_3 - Q_1}{2},$$

and $Q$ is termed the quartile deviation, or better, the semi-interquartile range: it is not a measure of the deviation from any particular average. The old name probable error should be confined to the theory of sampling (Chap. XV. § 17).

21. In the case of a short series of ungrouped observations the quartiles are determined, like the median, by inspection. In the wage statistics of Example i., for instance, there are 38 observations, and 38/4 = 9.5. What is the lower quartile? The student may be tempted to take it halfway between the ninth and tenth observations from the bottom of the list; but this would be wrong, for then there would be nine observations only below the value chosen instead of 9.5. The quartile must be taken as given by the tenth observation itself, which may be regarded as divided by the quartile, and falling half above it and half below. Therefore

Lower quartile $Q_1$ = 14s. 10d.
Upper quartile $Q_3$ = 16s. 11d.

and

$$Q = \frac{Q_3 - Q_1}{2} = \frac{25}{2} = 12.5\text{d}.$$

22. In the case of a grouped distribution, the quartiles, like the median, are determined by simple arithmetical or by

graphical interpolation (cf. Chap. VII. §§ 15, 16). Thus for the distribution of pauperism, Example ii., we have

632 ÷ 4                                  = 158
Total frequency under 2.25 per cent.     = 138
Difference                               =  20
Frequency in interval 2.25 to 2.75       =  89

Whence

$$Q_1 = 2.25 + \frac{20}{89} \times 0.5 = 2.362 \text{ per cent.}$$

Similarly we find

$$Q_3 = 4.130 \text{ per cent.}$$

Hence

$$Q = \frac{4.130 - 2.362}{2} = 0.884 \text{ per cent.}$$

It is left to the student to check the value by graphical interpolation.

23. For distributions approaching the ideal forms of figs. 5 and 9, the semi-interquartile range is usually about two-thirds of the standard deviation. Thus for Example ii. we find $Q/\sigma = 0.884/1.24 = 0.71$. The distribution of statures, Example iii., gives the ratio 0.68. The short series of wage statistics in Example i. could not be expected to give a result in very strict conformity with the rule, but the actual ratio, viz. 0.61, does not diverge greatly. It follows from this ratio that a range of nine times the semi-interquartile range, approximately, is required to cover the same proportion of the total frequency (99 per cent. or more) as a range of six times the standard deviation.

24. Of the three measures of dispersion, the semi-interquartile range has the most clear and simple meaning. It is calculated, like the median, with great ease, and the quartiles may be found, if necessary, by measuring two individuals only. If, e.g., the dispersion as well as the average stature of a group of men is required to be determined with the least possible expenditure of time, they may be simply ranked in order of height, and the three men picked out for measurement who stand in the centre and one-quarter from either end of the rank. This measure of dispersion may also be useful as a makeshift if the calculation of the standard deviation has been rendered difficult or impossible owing to the employment of an irregular classification of the frequency or of an indefinite terminal class. Such uses are, however, a little exceptional, and, generally speaking, the

semi-interquartile range as a measure of dispersion is not to be recommended, unless simplicity of meaning is of primary importance, owing to the lack of algebraical convenience which it shares with the median. It has, however, been largely used in the past, particularly for anthropometric work. Further, it is obvious that the quartile, like the median, may become indeterminate, and that the use of this measure of dispersion is undesirable in cases of discontinuous variation: the student should refer again to the discussion of the similar disadvantage in the case of the median, Chap. VII. § 14.

Measures of Relative Dispersion.

25. As was pointed out in Chapter VII. § 26, if relative size is regarded as influencing not only the average, but also deviations from the average, the geometric mean seems the natural form of average to use, and deviations should be measured by their ratios to the geometric mean. As already stated, however, this method of measuring deviations, with its accompanying employment of the geometric mean, has never come into general use. It is a much more simple matter to allow for the influence of size by taking the ratio of the measure of absolute dispersion (e.g. standard deviation, mean deviation, or quartile deviation) to the average (mean or median) from which the deviations were measured. Pearson has termed the quantity

$$v = 100\,\frac{\sigma}{M},$$

i.e. the percentage ratio of the standard deviation to the arithmetic mean, the coefficient of variation (ref. 7), and has used it, for example, in comparing the relative variations of corresponding organs or characters in the two sexes: the ratio of the quartile deviation to the median has also been suggested (Verschaeffelt, ref. 8). Such a measure of relative dispersion is evidently a mere number, and its magnitude is independent of the units of measurement employed.

Measures of Asymmetry or Skewness.

26. If we have to compare a series of distributions of varying degrees of asymmetry, or skewness, as Pearson has termed it, some numerical measure of this character is desirable. Such a measure of skewness should obviously be independent of the units in which we measure the variable (e.g. the skewness of the distribution of the weights of a given set of men should not be dependent on our choice of the pound, the stone, or the kilogramme as the unit of weight), and the measure should accordingly be a mere number. Thus the difference between the deviations of the two quartiles on either side of the median indicates the existence of skewness, but to measure the degree of skewness we should take the ratio of this

100 -^ above. (11) This would not be a bad measure if we were using the quartile deviation as a measure of dispersion its lowest value is zero. but. and VII. skewness to be positive if the longer tail of the distribution runs in the direction of ' high values of X.. as a fact. only one generally recognised measure of skewness. or value of the variable wliich has 50 per cent. -^ — — — . 9. 5 per cent.. The deciles. some quantity of the same dimensions. e. 9) : skewness = This standard deviation -mode — mean=-t ^--.g. be replaced approximately by 3(mean . No upper limit to the ratio is apparent from the formula. or values of the variable which divide the total frequency into ten equal parts. as they are sometimes termed) be ranged in order of magnitude. it may be noted that the numerator of the above fraction may. form a natural and convenient series of percentiles to use. or 10 per cent. of the observed values is mode and mean —We P P . the value does not exceed unity for frequency-distributions resembling generally the ideal distributions As the mode is a difficult form of average to determine of fig. however. e. The fifth decile. and while its highest possible value is 2. percentiles. and the preceding paragraphs of this chapter. they suffice by themselves to show the general form This is Sir Francis Galton's method of of the distribution. 27. in the case of frequency-distributions of the forms referred to. A similar measure might be based on the mean deviations in excess and in defect of the mean. taking the interquartile range. by elementary methods. in which coincide.median).. then If a series of percentiles be determined for short intervals. and that is Pearson's measure (ref. . Chap.g.. (12) ^ ' evidently zero for a symmetrical distribution. when the distribution is symmetrical . The Method of Percentiles. skewness = (^^-^^l^i^^^ = «I±%^^ . it would rarely in practice attain higher numerical values than ±1. 
for summarising such statistics as we have been considering. the semiOur measure would then be. may conclude this chapter by describing briefly a method that has been largely used in the past in lieu of the methods dealt with in Chapters VI. and a value of the variable be determined such that a percentage p of the total frequency lies below it and is termed a percentile.— 150 difference to — THEORY OF STATISTICS. VII. If the values of the variable (variates. There is. § 20). than (11) for moderate degrees of asymmetry. The measure (12) is much more sensitive (c/. . .

ETC. 26. may be determined either by arithmetical or by graphical interpolation. : 151 above it and 50 per cent. The figures of the original table are added up step by step from the top. as the method is precisely § 22). § 15. It is hardly necessary to give an illustration of the former process. showing the number of Districts of England and Wales in which the Pauperism on 1st January 1891 did not exceed any given percentage of the population (same data as Fig. excluding the cases in which. and above. 28.VIII. —MEASURES OF DISPERSION. § 24). the tsoo- 1300izoo- Percentage of the popiUaUon. the same as for median and quartiles (Chap. like the former constants. below. VII. as will be seen from the figure. in/ receipt of relief Fig. they become indeterminate (c/. : —Curve graphical curve used for obtaining the deciles by the graphical method in the case of the distribution of pauperism (Example ii. and finally turns over again and : becomes quite flat as the frequencies tail off to zero. of course on a very much reduced scale. This curve. 92) determination of Deciles. rises slowly at first when the frequencies are small. 10. Fig. p. is the median the two quartiles lie between the second and third and the seventh and eighth deciles respectively. The deciles . The deciles. and ordinates are then erected to a horizontal base to represent on some scale these integrated frequencies a smooth curve is then drawn through the tops of the ordinates so obtained. then more rapidly as they increase. so as to give the total frequency not exceeding the upper limit of each class-interval. 26 shows. above). like the median and quartiles.

and erecting at each point so obtained a vertical proportional to the corresponding percentile. measured. 27. 28. 26. the value of which is approximately 2 '88 per cent. This gives the curve of fig. the capacity of the different boys in a class as regards some school subject cannot be directly iii. 30. 26 redrawn so as to give the Pauperism corresponding to each grade Galton's "Ogive. 92. 10. The construction is indicated on the figure for the fourth decile." : Ogive form. but it may not be very difficult for the master to . The method of percentiles has some advantages as a method of representation. An extension of the method to the treatment of non-measurable characters has also become of some importance. For example. — The curve of Fig. as the meaning of the various percentiles is so simple and readily understood.152 THEORY OF STiTISTICS. 27. may be readily obtained from such a curve by dividing the terminal ordinate into ten equal parts. p. which was obtained by merely redrafting fig. 29. and projecting the points so obtained horizontally across to the curve and then vertically down to the base. The curve is of so-called O 10 20 30 10 so eO 70 so 90 100 10 20 30 W so 60 70 SO 90 lOO Grades Fib. The ogive curve for the distribution of statures It will be noticed that the ogive curve does not bring out the asymmetry of the distribution of pauperism nearly so clearly as the frequencypolygon.) is (Example shown for comparison in fig. as Sir Francis Galton has termed them). 26 may be drawn in a different way by taking a horizontal base divided into ten or a hundred equal parts (grades. The curve of fig. fig.

and thence to other constants. p. in the case of a measurable character. 28. 89. he cannot pass back to the frequency-distribution. : 153 arrange them in order of merit as regards this character if the boys are then " numbered up " in order. with any degree of accuracy. It should be noted that rank in this sense is not quite the same as grade if a boy is tenth. But if. tor aduit males in the British Isles. In all cases of published work. The method of ranks.a Vm. 10 ZO 30 100 FlO. same data as Fig. from the 'bottom in a class of a hundred his grade is 9 "5. as the Given the application of other methods to the data is barred. grades. to a sufficiently high degree of approximation. but the method is in principle the same with that of grades or percentiles).— Ogive Curve for Stature. ETC. serious inconvenience may be caused. but entirely to replace the table giving the frequencydistribution. 10 1 20 1 3f) 1 40 50 1 60 1 70 1 80 i 90 1 100 it- H 74 -7 -70 -68 -66 -6q -62 60- -60 O 40 SO GO 70 SO 90 Stature corresponding to each. the percentiles are used not merely as . or percentiles in such a case may be a very serviceable auxiliary. O 7t—\ 12. say. they are absolutely fundamental. the number of each boy. the remarks in § 12. But given only the percentiles. or at least so few of them as the nine deciles. the reader can calculate not only the percentiles. —MEASURES OF DISPERSION. but any form of average or measure of dispersion that has yet been proposed. of course. the figures of the frequency-distribution should be given . constants illustrative of certain aspects of the frequency-distribution. . therefore. it is better if possible to obtain a numerical measure.. 6. grade. or his rank. though. table showing the frequency-distribution. serves as some sort of index to his capacity {cf.
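The measures of §§ 25-28 may be illustrated by a short computational sketch. It is not part of the original text: the function names, the use of Python's `statistics` module, and the small data set in the usage note are all assumptions made for illustration. The percentile routine assumes, as in the arithmetical interpolation for the median, that observations are evenly distributed over each class-interval.

```python
import statistics


def coefficient_of_variation(values):
    """v = 100 * sigma / M: percentage ratio of the standard deviation to the mean."""
    return 100.0 * statistics.pstdev(values) / statistics.fmean(values)


def quartile_skewness(q1, median, q3):
    """Measure (11): ((Q3 - Mi) - (Mi - Q1)) / Q, Q being the semi-interquartile range."""
    semi_iqr = (q3 - q1) / 2.0
    return (q1 + q3 - 2.0 * median) / semi_iqr


def pearson_skewness_approx(values):
    """Measure (12) with the numerator (mean - mode) replaced by 3 * (mean - median)."""
    mean = statistics.fmean(values)
    return 3.0 * (mean - statistics.median(values)) / statistics.pstdev(values)


def grouped_percentile(lower_limit, upper_limits, frequencies, p):
    """Value below which a percentage p of the total frequency lies.

    lower_limit  : lower limit of the first class-interval
    upper_limits : upper limit of each class-interval, in ascending order
    frequencies  : frequency in each class-interval
    Linear interpolation is used within the class containing the percentile.
    """
    target = sum(frequencies) * p / 100.0
    cumulative = 0.0
    lo = lower_limit
    for hi, f in zip(upper_limits, frequencies):
        if f > 0 and cumulative + f >= target:
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f
        lo = hi
    return upper_limits[-1]


def deciles(lower_limit, upper_limits, frequencies):
    """The nine deciles of a grouped distribution."""
    return [grouped_percentile(lower_limit, upper_limits, frequencies, 10 * k)
            for k in range(1, 10)]
```

For instance, `quartile_skewness(2, 5, 8)` vanishes for quartiles placed symmetrically about the median, and `grouped_percentile(0, [1, 2, 3, 4], [10, 20, 40, 30], 50)` returns the median of a hypothetical four-class distribution.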

REFERENCES.

General.

(1) Fechner, G. T., "Ueber den Ausgangswerth der kleinsten Abweichungssumme, dessen Bestimmung, Verwendung und Verallgemeinerung," Abh. d. kgl. sächsischen Gesellschaft d. Wissenschaften, vol. xviii. (also numbered vol. xi. of the Abh. d. math.-phys. Classe); Leipzig, 1878, p. 1.

Standard Deviation.

(2) Pearson, Karl, "Contributions to the Mathematical Theory of Evolution (i. On the Dissection of Asymmetrical Frequency-curves)," Phil. Trans. Roy. Soc., Series A, vol. clxxxv., 1894, p. 71. (Introduction of the term "standard deviation," p. 80.)

Mean Deviation.

(3) Laplace, Pierre Simon, Marquis de, Théorie analytique des probabilités: 2me supplément, 1818. (Proof that the mean deviation is a minimum when taken about the median.)
(4) Trachtenberg, M. I., "A Note on a Property of the Median," Jour. Roy. Stat. Soc., vol. lxxviii., 1915, p. 454. (A very simple proof of the same property.)

Method of Percentiles, including Quartiles, etc.

(5) Galton, Francis, "Statistics by Intercomparison, with Remarks on the Law of Frequency of Error," Phil. Mag., vol. xlix. (4th Series), 1875, pp. 33-46.
(6) Galton, Francis, Natural Inheritance; Macmillan, 1889. (The method of percentiles is used throughout, with the quartile deviation as the measure of dispersion.)

Relative Dispersion.

(7) Pearson, Karl, "Regression, Heredity, and Panmixia," Phil. Trans. Roy. Soc., Series A, vol. clxxxvii., 1896, p. 253. (Introduction of "coefficient of variation," pp. 276-7.)
(8) Verschaeffelt, E., "Ueber graduelle Variabilität von pflanzlichen Eigenschaften," Ber. deutsch. bot. Ges., Bd. xii., 1894, pp. 350-55.

Skewness.

(9) Pearson, Karl, "Skew Variation in Homogeneous Material," Phil. Trans. Roy. Soc., Series A, vol. clxxxvi., 1895, p. 343. (Introduction of term, p. 370.)

Calculation of Mean, Standard-deviation, or of the General Moments of a Grouped Distribution.

We have given a direct method that seems the simplest and best for the elementary student. A process of successive summation that has some advantages can, however, be used instead. The student will find a convenient description with illustrations in

(10) Elderton, W. Palin, Frequency Curves and Correlation; C. & E. Layton, London, 1906.

EXERCISES.

1. Verify the following from the data of Table VI., Chap. VI., continuing the work from the stage reached for Qu. 1, Chap. VII.

(or median), in a grouped frequency-distribution, is found to be S. Find the correction to be applied to this sum, in order to reduce it to the mean (or median) as origin, on the assumption that the observations are evenly distributed over each class-interval. Take the number of observations below the interval containing the mean (or median) to be $n_1$, in that interval $n_2$, and above it $n_3$; and the distance of the mean (or median) from the arbitrary origin to be d.

7. Show that the values of the mean deviation (from the mean and from the median respectively) for Example ii., found by the use of this formula, do not differ from the values found by the simpler method of §§ 16 and 17 in the second place of decimals.

8. (W. Scheibner, "Ueber Mittelwerthe," Berichte der kgl. sächsischen Gesellschaft d. Wissenschaften, 1873, p. 564, cited by Fechner, ref. 2 of Chap. VII.: the second form of the relation is given by Duncker (Die Methode der Variationsstatistik; Leipzig, 1899) as an empirical one.) Show that if deviations are small compared with the mean, so that $(x/M)^3$ may be neglected in comparison with $x/M$, we have approximately the relation

$$G = M\left(1 - \frac{\sigma^2}{2M^2}\right),$$

where G is the geometric mean, M the arithmetic mean, and $\sigma$ the standard deviation: and consequently to the same degree of approximation $M^2 - G^2 = \sigma^2$.

9. (Scheibner, loc. cit., Qu. 8.) Similarly, show that if deviations are small compared with the mean, we have approximately

$$H = M\left(1 - \frac{\sigma^2}{M^2}\right),$$

H being the harmonic mean.
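The two approximate relations of Scheibner may be checked numerically before a proof is attempted. The following is only a sketch: the function name and the sample figures are assumptions, chosen so that deviations are small compared with the mean, and a numerical check is no substitute for the demonstration asked for.

```python
import statistics


def scheibner_check(values):
    """Compare the exact geometric and harmonic means with the approximations
    G = M(1 - sigma^2 / 2M^2) and H = M(1 - sigma^2 / M^2)."""
    m = statistics.fmean(values)
    variance = statistics.pvariance(values)  # sigma^2, taken about the mean
    g = statistics.geometric_mean(values)
    h = statistics.harmonic_mean(values)
    return {
        "M": m,
        "sigma_squared": variance,
        "G_exact": g,
        "G_approx": m * (1.0 - variance / (2.0 * m * m)),
        "H_exact": h,
        "H_approx": m * (1.0 - variance / (m * m)),
        "M2_minus_G2": m * m - g * g,  # should be close to sigma^2
    }


# Hypothetical observations with deviations small compared with the mean.
result = scheibner_check([98, 99, 100, 101, 102])
for name, value in result.items():
    print(f"{name:>14}: {value:.6f}")
```

With wider scatter relative to the mean the neglected terms become appreciable, and the agreement deteriorates accordingly.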

CHAPTER IX.

CORRELATION.

1-3. The correlation table and its formation. 4-5. The correlation surface. 6-7. The general problem. 8-9. The line of means of rows and the line of means of columns: their relative positions in the case of independence and of varying degrees of correlation. 10-14. The correlation coefficient, the regressions, and the standard-deviations of arrays. 15-16. Numerical calculations. 17. Certain points to be remembered in calculating and using the coefficient.

1. In chapters VI.-VIII. we considered the frequency-distribution of a single variable, and the more important constants that may be calculated to describe certain characters of such distributions. We have now to proceed to the case of two variables, and the consideration of the relations between them.

2. If the corresponding values of two variables be noted together, the methods of classification employed in the preceding chapters may be applied to both, and a table of double entry or contingency-table (Chap. V.) be formed, exhibiting the frequencies of pairs of values lying within given class-intervals. Six such tables are given below as illustrations for the following variables: Table I., two measurements on a shell (Pecten). Table II., ages of husbands and wives in England and Wales in 1901. Table III., statures of fathers and their sons (British). Table IV., fertility of mothers and their daughters (British peerage). Table V., the rate of discount and the ratio of reserves to deposits in American banks. Table VI., the proportion of male to total births, and the total numbers of births, in the registration districts of England and Wales. Each row in such a table gives the frequency-distribution of the first variable for cases in which the second variable lies within the limits stated on the left of the row. Similarly, every column gives the frequency-distribution of the second variable for cases in which the value of the first variable lies within the limits stated at the head of the column. As "columns" and "rows" are distinguished only by the accidental circumstance

[Tables I.-VI.: the six correlation tables described in § 2. Their entries are not recoverable in this copy.]

of the one set running vertically and the other horizontally, and the difference has no statistical significance, the word array has been suggested as a convenient term to denote either a row or a column. If the values of X in one array are associated with values of Y between the limits $Y_n - \delta$ and $Y_n + \delta$, $Y_n$ may be termed the type of the array. (Pearson, ref. 6.) The special kind of contingency tables with which we are now concerned are called correlation tables, to distinguish them from tables based on unmeasured qualities and so forth.

3. Nothing need be added to what was said in Chapter VI. as regards the choice of magnitude and position of class-intervals. When these have been fixed, the table is readily compiled by taking a large sheet ruled with rows and columns properly headed in the same way as the final table and entering a dot, stroke, or small cross in the corresponding compartment for each pair of recorded observations. If facility of checking be of great importance, each pair of recorded values may be entered on a separate card and these dealt into little packs on a board ruled in squares, or into a divided tray; each pack can then be run through to see that no card has been mis-sorted. The difficulty as to the intermediate observations (values of the variables corresponding to divisions between class-intervals) will be met in the same way as before if the value of one variable alone be intermediate, the unit of frequency being divided between two adjacent compartments. If both values of the pair be intermediates, the observation must be divided between four adjacent compartments, and thus quarters as well as halves may occur in the table, as, e.g., in Table III. In this case the statures of fathers and sons were measured to the nearest quarter-inch and subsequently grouped by 1-inch intervals: a pair in which the recorded stature of the father is 60.5 in. and that of the son 62.5 in. is accordingly entered as 0.25 to each of the four compartments under the columns 59.5-60.5, 60.5-61.5, and the rows 61.5-62.5, 62.5-63.5. Workers will generally form their own methods for entering such fractional frequencies during the process of compiling, but one convenient method is to use a small x to denote a unit and a dot for a quarter; the four dots should be placed in the position of the four points of the x and joined when complete. It is best to choose the limits of class-intervals, where possible, in such a way as to avoid fractional frequencies.

4. The distribution of frequency for two variables may be represented by a surface or solid in the same way as the frequency-distribution of a single variable may be represented by a plane figure. We may imagine the surface to be obtained by erecting

at the centre of every compartment of the correlation-table a vertical of length proportionate to the frequency in that compartment, and joining up the tops of the verticals. If the compartments were made smaller and smaller while the class-frequencies remained finite, the irregular figure so obtained would approximate more and more closely towards a continuous curved surface, a frequency-surface, corresponding to the frequency-curves for single variables of Chapter VI. The volume of the frequency-solid over any area drawn on its base gives the frequency of pairs of values falling within that area, just as the area of the frequency-curve over any interval of the base-line gives the frequency of observations within that interval. Models of actual distributions may be constructed by drawing the frequency-distributions for all arrays of the one variable, to the same scale, on sheets of cardboard, and erecting the cards vertically on a base-board at equal distances apart, or by marking out a base-board in squares corresponding to the compartments of the correlation-table, and erecting on each square a rod of wood of height proportionate to the frequency. Such solid representations of frequency-distributions for two variables are sometimes termed stereograms.

5. It is impossible, however, to group the majority of frequency-surfaces, in the same way as the frequency-curves, under a few simple types: the forms are too varied. The simplest ideal type is one in which every section of the surface is a symmetrical curve, the first type of Chap. VI. (fig. 5, p. 89). Like the symmetrical distribution for the single variable, this is a very rare form of distribution in economic statistics, but approximate illustrations may be drawn from anthropometry. Fig. 29 shows the ideal form of the surface, somewhat truncated, and fig. 30 the distribution of Table III., which approximates to the same type: the difference in steepness is, of course, merely a matter of scale. The maximum frequency occurs in the centre of the whole distribution, and the surface is symmetrical round the vertical through the maximum, equal frequencies occurring at equal distances from the mode on opposite sides. The next simplest type of surface corresponds to the second type of frequency-curve, the moderately asymmetrical. Most, if not all, of the distributions of arrays are asymmetrical, and like the distribution of fig. 9, p. 92: the surface is consequently asymmetrical, and the maximum does not lie in the centre of the distribution. This form is fairly common, and illustrations might be drawn from a variety of sources: economics, meteorology, anthropometry, etc. The data of Table II. will serve as an example. The total distributions and the distributions of the majority of the arrays

Fig. 30. Frequency Surface for Stature of Father and Stature of Son (data of Table III.).

Fig. 31. Frequency Surface for the Rate of Discount and Ratio of Reserves to Deposits.

are asymmetrical, the skewness being positive for the rows at the top of the table (the mode being lower than the mean), and negative for the rows at the foot, the more central rows being nearly symmetrical. The maximum frequency lies towards the upper end of the table in the compartment under the row and column headed "30-". The frequency falls off very rapidly towards the lower ages, and slowly in the direction of old age. Outside these two forms, it seems impossible to delimit empirically any simple types. Tables V. and VI. are given simply as illustrations of two very divergent forms. Fig. 31 gives a graphical representation of the former by the method corresponding to the histogram of Chapter VI., the frequency in each compartment being represented by a square pillar. The distribution of frequency is very characteristic, and quite different from that of any of the Tables I., II., III., or IV.

6. It is clear that such tables may be treated by any of the methods discussed in Chapter V., which are applicable to all contingency-tables, however formed. The distribution may be investigated in detail by such methods as those of § 4, or tested for isotropy (§ 11), or the coefficient of contingency can be calculated (§§ 5-8). In applying any of these methods, however, it is desirable to use a coarser classification than is suited to the methods to be presently discussed, and it is not necessary to retain the constancy of the class-interval. The classification should, on the contrary, be arranged simply with a view to avoiding many scattered units or very small frequencies. A few examples should be worked as exercises by the student (Question 3).

7. But the coefficient of contingency merely tells us whether, and if so, how closely, the two variables are related, and much more information than this can be obtained from the correlation-table, seeing that the measures of Chapters VII. and VIII. can be applied to the arrays as well as to the total distributions. If the two variables are independent, the distributions of all parallel arrays are similar (Chap. V. § 13), hence their averages and dispersions, e.g. means and standard deviations, must be the same. In general they are not the same, and the relation between the mean or standard deviation of the array and its type requires investigation. Of the two constants, the mean is, in general, the more important, and our attention will for the present be confined to it. The majority of the questions of practical statistics relate solely to averages: the most important and fundamental question is whether, on an average, high values of the one variable show any tendency to be associated with high (or with low) values of the other. If possible, we also desire to know how great a divergence of the one variable from its average value is associated

with a unit divergence of the other, and to obtain some idea as to the closeness with which this relation is usually fulfilled.

8. Suppose a diagram (fig. 32) to be drawn representing the values of means of arrays. Let OX, OY be the scales of the two variables, i.e. the scales at the head and side of the table, 01, 12, etc., being successive class-intervals. Let $M_1$ be the mean value of X, and $M_2$ the mean value of Y. If the two variables be absolutely independent, the distributions of frequency in all parallel arrays are similar (Chap. V. § 13), and the means of arrays must lie on the vertical and horizontal lines $M_1M$, $M_2M$, the

fig. 32, nor coincident as in fig. 33, but standing at an acute angle with one another as RR (means of rows) and CC (means of columns) in figs. 36-8. The complete problem of the statistician, like that of the physicist, is to find formulae or equations which will suffice to describe approximately these curves.

[Figs. 32 and 33.]

9. In the general case this may be a difficult problem, but, in the first place, it often suffices, as already pointed out, to know merely whether on an average high values of the one variable show any tendency to be associated with high or with low values of the other, a purpose which will be served very fairly by fitting a straight line; and further, in a large number of cases, it is found either (1) that the means of arrays lie very approximately round straight lines, or (2) that they lie so irregularly (possibly owing only to paucity of observations) that the real nature of the curve is not clearly indicated, and a straight line will do almost as well as any more elaborate curve. (Cf. figs. 36-38.) In such cases (and they are relatively more frequent than might be supposed) the fitting of straight lines to the means of arrays determines all the most important characters of the distribution. We might fit such lines by a simple graphical method, plotting the points representing means of arrays on a diagram like those of figures 36-38, and "fitting" lines to them, say, by means of a stretched black thread shifted about till it appeared to run as near as

might be to all the points. But such a method is hardly satisfactory, more especially if the points are somewhat scattered; it leaves too much room for guesswork, and different observers obtain very different results. Some method is clearly required which will enable the observer to determine equations to the two lines for a given distribution, however irregularly the means may lie, as simply and definitely as he can calculate the means and standard deviations.

10. Consider the simplest case in which the means of rows lie exactly on a straight line RR (fig. 34). Let $M_2$ be the mean value of Y, and let RR cut $M_2x$, the horizontal through $M_2$, in M. Then it may be shown that the vertical through M must cut OX in $M_1$, the mean of X. For, let the slope of RR to the vertical, i.e. the tangent of the angle $M_1MR$ or ratio of kl to lM, be $b_1$, and let deviations from My, Mx be denoted by x and y. Then for any one row of type y in which the number of observations is n, $\Sigma(x) = n\,b_1 y$, and therefore for the whole table, since $\Sigma(ny) = 0$, $\Sigma(x) = b_1\Sigma(ny) = 0$. $M_1$ must therefore be the mean of X, and M may accordingly be termed the mean of the whole distribution. Knowing that RR passes through M, it remains only to determine $b_1$. This may conveniently be done in terms of the mean product p of all pairs of associated deviations x and y, i.e.

$$p = \frac{1}{N}\Sigma(xy) \tag{1}$$

For any one row we have

$$\Sigma(xy) = y\,\Sigma(x) = n\,b_1 y^2.$$

Therefore for the whole table

$$\Sigma(xy) = b_1\Sigma(ny^2) = N b_1\sigma_y^2, \qquad\text{or}\qquad b_1 = \frac{p}{\sigma_y^2} \tag{2}$$

Similarly, if CC be the line on which lie the means of columns and $b_2$ its slope to the horizontal,

$$b_2 = \frac{p}{\sigma_x^2} \tag{3}$$

These two equations are usually written in a slightly different form. Let

$$r = \frac{p}{\sigma_x\sigma_y} \tag{4}$$

Then

$$b_1 = r\frac{\sigma_x}{\sigma_y}, \qquad b_2 = r\frac{\sigma_y}{\sigma_x} \tag{5}$$

Or we may write the equations to RR and CC:

$$x = r\frac{\sigma_x}{\sigma_y}\,y, \qquad y = r\frac{\sigma_y}{\sigma_x}\,x \tag{6}$$

These equations may, of course, be expressed, if desired, in terms of the absolute values of the variables X and Y instead of the deviations x and y.

11. The meaning of the above expressions when the means of rows and columns do not lie exactly on straight lines is very readily obtained. If the values of x and $b_1y$ be noted for all pairs of associated deviations, we have for the sum of the squares of the differences, giving $b_1$ its value from (5),

$$\Sigma(x - b_1y)^2 = N\sigma_x^2(1 - r^2) \tag{7}$$

If $b_1$ be given any other value, say $(r + \delta)\sigma_x/\sigma_y$, then

$$\Sigma(x - b_1y)^2 = N\sigma_x^2(1 - r^2 + \delta^2).$$

This is necessarily greater than the value (7); hence $\Sigma(x - b_1y)^2$ has the lowest possible value when $b_1$ is put equal to $r\sigma_x/\sigma_y$. Further, for any one row in which the number of observations is n, the deviation of the mean of the row from RR is d (fig. 35), and the standard deviation is $s_x$,

$$\Sigma(x - b_1y)^2 = n s_x^2 + n d^2.$$

Therefore for the whole table,

$$\Sigma(x - b_1y)^2 = \Sigma(n s_x^2) + \Sigma(n d^2).$$

But the first of the two sums on the right is unaffected by the slope or position of RR, hence, the left-hand side being a minimum, the second sum on the right must be a minimum also. That is to say, when $b_1$ is put equal to $r\sigma_x/\sigma_y$, the sum of the squares of the distances of the row-means from RR, each multiplied by the corresponding frequency, is the lowest possible. Similar theorems hold good, of course, with respect to the line CC. If $b_2$ be given the value $r\sigma_y/\sigma_x$, $\Sigma(y - b_2x)^2$ is a minimum, and also $\Sigma(ne^2)$ (fig. 35). Hence we may regard the equations (6) as being, either (a) equations for estimating each individual x from its associated y (and y from its associated x) in such a way as to make the sum of the squares of the errors of estimate the least possible; or (b) equations for estimating the mean of the x's associated with a given type of y (and the mean of the y's associated with a given type of x) in such a way as to make the sum of the squares of the errors of estimate the least possible, when every mean is counted once for each observation on which it is based.

[Figure: axis label "Age of Wife".]
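The relations (1) to (7) lend themselves to a direct numerical check. The sketch below is an illustration only: the function names and the five pairs of observations are hypothetical, and the arithmetic is arranged for clarity rather than for the grouped tables treated later in the chapter.

```python
import math


def correlation_constants(pairs):
    """Mean product p, coefficient r and the slopes b1, b2 of equations (1)-(5)."""
    n = len(pairs)
    mean_x = sum(big_x for big_x, _ in pairs) / n
    mean_y = sum(big_y for _, big_y in pairs) / n
    # Deviations x, y from the mean of the whole distribution.
    dev = [(big_x - mean_x, big_y - mean_y) for big_x, big_y in pairs]
    sigma_x = math.sqrt(sum(x * x for x, _ in dev) / n)
    sigma_y = math.sqrt(sum(y * y for _, y in dev) / n)
    p = sum(x * y for x, y in dev) / n              # equation (1)
    return {
        "n": n, "dev": dev, "sigma_x": sigma_x, "sigma_y": sigma_y, "p": p,
        "r": p / (sigma_x * sigma_y),               # equation (4)
        "b1": p / sigma_y ** 2,                     # equation (2)
        "b2": p / sigma_x ** 2,                     # equation (3)
    }


def sum_of_squares(dev, b):
    """Sum of (x - b*y)^2 over all pairs of associated deviations."""
    return sum((x - b * y) ** 2 for x, y in dev)


# Hypothetical observations (X, Y).
c = correlation_constants([(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)])
at_b1 = sum_of_squares(c["dev"], c["b1"])
equation_7 = c["n"] * c["sigma_x"] ** 2 * (1 - c["r"] ** 2)
delta = 0.1
shifted = sum_of_squares(c["dev"], (c["r"] + delta) * c["sigma_x"] / c["sigma_y"])
print(c["r"], c["b1"], c["b2"])
print(at_b1, equation_7, shifted)
```

Any value of `delta` other than zero increases the sum of squares by $N\sigma_x^2\delta^2$, which is the minimum property established in § 11.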

are associated with large values of y, and conversely (as in Tables I.-IV.), negative if small values of x are associated with large values of y and conversely (as in Table V.). The numerical value cannot exceed ±1, for the sum of the series of squares in equation (7) is then zero, and the sum of a series of squares cannot be negative. If $r = \pm 1$, it follows that all the observed pairs of deviations are subject to the relation $x/y = \pm\sigma_x/\sigma_y$.

[Figure: axis label "Father's stature".]

any definite trend, upward or downward, to right or to left. Two variables for which r is zero are, however, conveniently spoken of as uncorrelated. Figs. 36, 37, 38 are drawn from the data of Tables II., III., and IV., for which r has the values +0.91, +0.51, and +0.21 respectively, the correlation being positive in each case. The student should study such tables and diagrams closely, and endeavour to accustom himself to estimating the value of r from the general appearance of the table. Table VI. and fig. 39 will serve as an illustration of a case in which the variables are almost uncorrelated but by no means independent, r being very small (-0.014), but the coefficient of contingency C (for grouping of qu. 3) 0.47.

Fig. 38. Correlation between number of a Mother's Children and number of her Daughter's Children (Table IV.): means of rows shown by circles and means of columns by crosses: r = +0.21.

13. The two quantities $b_1$ and $b_2$ are termed the coefficients of regression, or simply the regressions, $b_1$ being the regression of x on y, or deviation in x corresponding on the average to a unit change in the type of y, and $b_2$ being

similarly the regression of y on x. Whilst the coefficient of correlation is always a pure number, the regressions are only pure numbers if the two variables have the same dimensions, as their magnitudes depend on the ratio of $\sigma_x/\sigma_y$, and consequently on the units in which x and y are measured. They are both necessarily of the same sign (the sign of r). Since r is ... in Tables I.-IV. ...

[Fig. 39: axis label "Proportion of Male Births per 1000 Births".]

Hence the sons of fathers of deviation x from the mean of all fathers have an average deviation of only 0.52x from the mean of all sons, i.e. they step back or "regress" towards the general mean, and 0.52 may be termed the "ratio of regression." In general, however, the idea of a "stepping back" or "regression" towards a more or less stationary mean is quite inapplicable, obviously so where the variables are different in kind, as in Tables V. and VI., and the term "coefficient of regression" should be regarded simply as a convenient name for the coefficients $b_1$ and $b_2$. RR and CC are generally termed the "lines of regression," and equations (6) the "regression equations." The expressions "characteristic lines," "characteristic equations" (Yule, ref. 8) would perhaps be better. Where the actual means of arrays appear to be given, to a satisfactory degree of approximation, by straight lines, we may say that the regression is linear. It is not safe, however, to assume that such linearity extends beyond the limits of observation.

14. The two standard deviations

$$s_x = \sigma_x\sqrt{1 - r^2}, \qquad s_y = \sigma_y\sqrt{1 - r^2}$$

are of considerable importance. It follows from (7) that $s_x$ is the standard deviation of $(x - b_1y)$, and similarly $s_y$ is the standard deviation of $(y - b_2x)$. Hence we may regard $s_x$ and $s_y$ as the standard errors (root mean square errors) made in estimating x from y and y from x by the respective characteristic relations $x = b_1y$, $y = b_2x$. $s_x$ may also be regarded as a kind of average standard deviation of a row about RR, and $s_y$ as an average standard deviation of a column about CC. In an ideal case, where the regression is truly linear and the standard deviations of all parallel arrays are equal, a case to which the distribution of Table III. is a rough approximation, $s_x$ is the standard deviation of the x-array and $s_y$ the standard deviation of the y-array (cf. Chap. X. § 19 (3)). Hence $s_x$ and $s_y$ are sometimes termed the "standard deviations of arrays."

15. Proceeding now to the arithmetical work, the only new expression that has to be calculated in order to determine r, $b_1$, $b_2$, $s_x$ and $s_y$ is the product sum $\Sigma(xy)$ or the mean product p. As in the cases of means and standard deviations, the form of the arithmetic is slightly different according as the observations are few and ungrouped, or sufficient to justify the formation of a correlation-table. In the first case, as in Example i. below, the work is quite straightforward.

Example i., Table VII. The variables are (1) the estimated

Table VII. Theory of Correlation: Example i.

average weekly earnings of agricultural labourers in 38 English Poor-law unions of an agricultural type (the data of Example i., Chap. VIII., p. 137); (2) Y, the percentage of the population in receipt of Poor-law relief on the 1st January 1891 in each of the same unions (B return).

The means of each of the variables are calculated in the ordinary way, and then the deviations x and y from the mean are written down (columns 4 and 5): care must be taken to give each deviation the correct sign. These deviations are then squared (columns 6 and 7) and the standard deviations found as before (Chap. VIII., p. 136). Finally, every x is multiplied by the associated y and the product entered in column 8 or column 9 according to its sign. These columns are then added up separately and the algebraic sum of the totals gives $\Sigma(xy) = -666.04$: therefore the mean product $p = \Sigma(xy)/N = -17.53$, and

$$r = \frac{p}{\sigma_x\sigma_y} = \frac{-17.53}{20.5 \times 1.29} = -0.66.$$

There is therefore a well-marked relation exhibited by these data between the earnings of agricultural labourers in a district and the percentage of the population in receipt of Poor-law relief. A penny is rather a small unit in which to measure deviations in the average earnings, so for the regressions we may alter the unit of x to a shilling, making $\sigma_x = 1.71$. In terms of these units,

$$b_1 = r\frac{\sigma_x}{\sigma_y} = -0.87, \qquad b_2 = r\frac{\sigma_y}{\sigma_x} = -0.50.$$

The regression equations are therefore

$$x = -0.87y, \qquad y = -0.50x.$$

For practical purposes it is more convenient to express the equations in terms of the absolute values of the variables rather than the deviations: therefore, replacing x by (X - 15.94) and y by (Y - 3.67) and simplifying, we have

$$X = 19.13 - 0.87Y \qquad (a)$$
$$Y = 11.64 - 0.50X \qquad (b)$$

the units being 1s. for the earnings and 1 per cent. for the pauperism. The standard errors made in using these equations to estimate earnings from pauperism and pauperism from earnings respectively are

$$\sigma_x\sqrt{1-r^2} = 15.4\text{d.} = 1.28\text{s.}, \qquad \sigma_y\sqrt{1-r^2} = 0.97 \text{ per cent.}$$
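The arithmetic of Example i. can be retraced from the summary figures quoted in the text alone. The following is a minimal sketch in Python, not part of the original work: the function names are hypothetical, and the inputs are the printed values (N = 38, product sum -666.04, standard deviations 20.5d. and 1.29 per cent., means 15.94s. and 3.67 per cent.). Because it carries r unrounded, its results may differ from the printed ones in the last figure.

```python
import math

def correlation_from_product_sum(n, sum_xy, sd_x, sd_y):
    """Mean product p and correlation r from the product sum of deviations.

    sum_xy must be in the same units as sd_x * sd_y (here pence x per cent.).
    """
    p = sum_xy / n
    r = p / (sd_x * sd_y)
    return p, r

def regression_equations(r, mean_x, sd_x, mean_y, sd_y):
    """Regression coefficients, intercepts and standard errors of estimate."""
    b1 = r * sd_x / sd_y            # regression of x on y
    b2 = r * sd_y / sd_x            # regression of y on x
    root = math.sqrt(1.0 - r * r)
    return {
        "b1": b1,
        "b2": b2,
        "intercept_x": mean_x - b1 * mean_y,   # X = intercept_x + b1 * Y
        "intercept_y": mean_y - b2 * mean_x,   # Y = intercept_y + b2 * X
        "se_x": sd_x * root,
        "se_y": sd_y * root,
    }

# Figures quoted in Example i.; earnings s.d. is 20.5 pence.
p, r = correlation_from_product_sum(38, -666.04, 20.5, 1.29)

# For the regressions the unit of earnings is changed to the shilling (12 pence).
eq = regression_equations(r, mean_x=15.94, sd_x=20.5 / 12.0, mean_y=3.67, sd_y=1.29)

print(f"p = {p:.2f}, r = {r:.2f}")
print(f"X = {eq['intercept_x']:.2f} + ({eq['b1']:.2f}) Y")
print(f"Y = {eq['intercept_y']:.2f} + ({eq['b2']:.2f}) X")
print(f"standard errors: {eq['se_x']:.2f}s., {eq['se_y']:.2f} per cent.")
```

Changing the unit of x from pence to shillings alters $b_1$ and $b_2$ but leaves r untouched, which is why the correlation may be computed in pence and the regressions in shillings.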

Equation (b) tells us, for instance, that a rise of 2s. in earnings in passing from one district to another means on the average a fall of 1 in the percentage in receipt of relief. A natural conclusion would be that this means a direct effect of the higher earnings in diminishing the necessity for relief, but such a conclusion cannot be accepted offhand. Equation (a) indicates, for instance, that every rise of a unit in the percentage relieved corresponds to a fall of 0.87 shillings, or 10½d., in earnings: this might mean that the giving of relief tends to depress wages. Which is the correct interpretation of the facts?

In drawing the lines RR and CC given by the regression equations (a) and (b), it is as well to determine a point at each end of both lines, and then to check the work by seeing that they meet in the mean of the whole distribution. Thus RR is determined from (a) by the points Y = 0, X = 19.13 and Y = 6, X = 13.91: CC is determined from (b) by the points X = 12, Y = 5.64 and X = 21, Y = 1.14. Marking in these points, and drawing the lines, they will be found to meet in the mean, X = 15.94, Y = 3.67. The diagram gives a very clear idea of the distribution; clearly the regression is as nearly linear as may be with so very scattered a distribution, and there are no very exceptional observations. The most exceptional districts are Brixworth and St Neots with rather low earnings but very low pauperism, and Glendale and Wigton with the highest earnings but a pauperism well above the lowest, over 2 per cent.

16. When a classified correlation-table is to be dealt with, the procedure is of precisely the same kind as was used in the calculation of a standard deviation, the same artifices being used to shorten the work. That is to say, (1) the product-sum is calculated in the first instance with respect to an arbitrary origin, and is afterwards reduced to the value it would have with respect to the mean; (2) the arbitrary origin is taken at the centre of a class-interval; (3) the class-interval is treated as the unit of measurement throughout the arithmetic. Let deviations from the arbitrary origin be denoted by $\xi$, $\eta$, and let $\bar{\xi}$, $\bar{\eta}$ be the co-ordinates of the mean. Then

$$\xi = x + \bar{\xi}, \qquad \eta = y + \bar{\eta},$$
$$\xi\eta = xy + \bar{\xi}y + \bar{\eta}x + \bar{\xi}\bar{\eta}.$$

Therefore, summing, since the second and third sums on the right vanish, being the sums of deviations from the mean,

$$\Sigma(\xi\eta) = \Sigma(xy) + N\bar{\xi}\bar{\eta},$$

or bringing $\Sigma(xy)$ to the left,

$$\Sigma(xy) = \Sigma(\xi\eta) - N\bar{\xi}\bar{\eta}.$$

That is, in terms of mean-products, using p' to denote the mean-product for the arbitrary origin,

$$p = p' - \bar{\xi}\bar{\eta}.$$

In any case where the origin from which deviations have been measured is not the mean, this correction must be used. It will sometimes give a sensible correction even for work in the form of

Example i., and in that case, of course, the standard deviations will also require reduction to the mean.

As the arithmetical process of calculating the correlation coefficient from a grouped table is of great importance, we give two illustrations, the first economic, the second biological.

Example ii., Table VIII. The two variables are (1) X, the percentage of males over 65 years of age in receipt of Poor-law relief in 235 unions of a mainly rural character in England and Wales; (2) Y, the ratio of the numbers of persons given relief "outdoors" (in their own homes) to one "indoors" (in the workhouse). The figures refer to a one-day count (1st August 1890, No. 36, 1890), and the table is one of a series that were drawn up with the view to discussing the influence of administrative methods on pauperism. (Economic Journal, vol. vi., 1896, p. 613.)

The arbitrary origin for X was taken at the centre of the fourth column, or at 17.5 per cent.; for Y at the centre of the fourth row, or 3.5. The following are the values found for the constants of the single distributions:

$\bar{\xi}$ = -0.1532 intervals = -0.77 per cent., whence $M_x$ = 16.73 per cent.; $\sigma_x$ = 1.29 intervals = 6.45 per cent.
$\bar{\eta}$ = +0.36 intervals or units, whence $M_y$ = 3.86; $\sigma_y$ = 2.98 units.

To calculate $\Sigma(\xi\eta)$, the value of $\xi\eta$ is first written in every compartment of the table against the corresponding frequency, treating the class-interval as the unit: these are the figures in heavy type in Table VIII. In making these entries the sign of the product may be neglected, but it must be remembered that this sign will be positive in the upper left-hand and lower right-hand quadrants, negative in the two others. The frequencies are then collected as shown in columns 2 and 3 of Table VIIIa., being grouped according to the value and sign of $\xi\eta$. Thus for $\xi\eta$ = 1, the total frequency in the positive quadrants is 13 + 8.5 = 21.5, in the negative 14 + 6 = 20: for $\xi\eta$ = 2, 10 + 4.5 + 1 + 4.5 = 20 in the positive quadrants, 5 + 2 + 1 + 3.5 = 11.5 in the negative, and so on. When columns 2 and 3 are completed, they should first of all be checked to see that no frequency has been dropped, which may be readily done by adding together the totals of these two columns together with the frequency in row 4 and column 4 of Table VIII. (the row and column for which $\xi\eta$ = 0), being careful not to count twice the frequency in the compartment common to the two: this grand total must clearly be equal to the total number of observations N, or 235 in the present case. The algebraic sum of the frequencies in each line of columns 2 and 3 is then entered in column 4.
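The reduction from an arbitrary origin to the mean, $p = p' - \bar{\xi}\bar{\eta}$, can be illustrated on a very small grouped table. The sketch below is in Python and is not taken from the original: the function name is hypothetical and the 3 x 3 table of frequencies is invented purely for illustration (it is not Table VIII.). Deviations are measured in class-intervals from an origin placed in the central row and column, as the text prescribes.

```python
import math

def product_moment_from_table(freq, xi_vals, eta_vals):
    """Correlation from a grouped table using an arbitrary origin.

    freq[i][j] is the frequency in the row with deviation eta_vals[i]
    and the column with deviation xi_vals[j], both in class-intervals.
    """
    n = sum(sum(row) for row in freq)
    sum_xi = sum(f * xi for row in freq for f, xi in zip(row, xi_vals))
    sum_eta = sum(f * eta for row, eta in zip(freq, eta_vals) for f in row)
    sum_xi2 = sum(f * xi * xi for row in freq for f, xi in zip(row, xi_vals))
    sum_eta2 = sum(f * eta * eta for row, eta in zip(freq, eta_vals) for f in row)
    sum_xi_eta = sum(f * xi * eta
                     for row, eta in zip(freq, eta_vals)
                     for f, xi in zip(row, xi_vals))

    xi_bar, eta_bar = sum_xi / n, sum_eta / n
    p_prime = sum_xi_eta / n                 # mean product about the arbitrary origin
    p = p_prime - xi_bar * eta_bar           # reduced to the mean
    sd_xi = math.sqrt(sum_xi2 / n - xi_bar ** 2)     # still in class-intervals
    sd_eta = math.sqrt(sum_eta2 / n - eta_bar ** 2)
    return {"n": n, "xi_bar": xi_bar, "eta_bar": eta_bar,
            "p_prime": p_prime, "p": p, "r": p / (sd_xi * sd_eta)}

# Invented frequencies, for illustration only.
xi_vals = [-1, 0, 1]
eta_vals = [-1, 0, 1]
freq = [[4, 2, 1],
        [2, 5, 2],
        [1, 3, 6]]

result = product_moment_from_table(freq, xi_vals, eta_vals)
print(result)
```

Both standard deviations are left in class-intervals here because p has been calculated in those units; converting to absolute units is only needed later, for the regressions.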

Table VIII. Theory of Correlation: Example ii. Old-age Pauperism and Proportion of Out-relief. Number relieved Outdoors to One Indoors. (The frequencies are the figures printed in ordinary type. The numbers in heavy type are the deviation-products $\xi\eta$.)

Table VIIIa. Calculation of the Product Sum $\Sigma(\xi\eta)$.

The regression of X on Y is accordingly $+0.34 \times 6.45/2.98 = 0.74$, and the regression equation $x = 0.74y$, or

$$X = 13.9 + 0.74Y,$$

the standard error made in using the equation for estimating X from Y being $\sigma_x\sqrt{1-r^2} = 6.07$. This is the equation of greatest practical interest, telling us that, as we pass from one district to another, a rise of 1 in the ratio of the numbers relieved in their own homes to the numbers relieved in the workhouse corresponds on an average to a rise of 0.74 in the percentage in receipt of relief. The result is such as to create a presumption in favour of the view that the giving of out-relief tends to increase the numbers relieved, and this can be taken as a working hypothesis for further investigation. The student should work out the second regression equation, and check both by calculating the means of the principal rows and columns, and drawing a diagram like figs. 36, 37, and 38.

Example iii., Table IX. (Unpublished data; measurements by G. U. Yule.) The two variables are (1) X, the length of a mother-frond of duckweed (Lemna minor); (2) Y, the length of the daughter-frond. The mother-frond was measured when the daughter-frond separated from it, and the daughter-frond when its first daughter-frond separated. Measures were taken from camera drawings made with the Zeiss-Abbé camera under a low power, the actual magnification being 24 : 1. The units of length in the tabulated measurements are millimetres on the drawings.

The arbitrary origin for both X and Y was taken at 105 mm. The following are the values found for the constants of the single distributions:

$\bar{\xi}$ = -1.058 intervals = -6.3 mm. on drawing; $M_1$ = 98.7 mm. on drawing = 4.11 mm. actual.
$\sigma_x$ = 2.828 intervals = 17.0 mm. on drawing = 0.707 mm. actual.
$\bar{\eta}$ = -0.203 intervals = -1.2 mm. on drawing; $M_2$ = 103.8 mm. on drawing = 4.32 mm. actual.
$\sigma_y$ = 3.084 intervals = 18.5 mm. on drawing = 0.771 mm. actual.

The values of $\xi\eta$ are entered in every compartment of the table as before, and the frequencies then collected, according to the magnitude and sign of $\xi\eta$, in columns 2 and 3 of Table IXa. The entries in these two columns are next checked by adding to the totals the frequency in the row and column for which $\xi\eta$ is zero, and seeing that it gives the total number of observations (266). The numbers in column 4 are given by deducting the entries in column 3 from those in column 2. The totals so obtained are multiplied by $\xi\eta$ (column 1) and the products entered.
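The passage from class-interval units to absolute units, and thence to a regression equation in the absolute values of the variables, may be sketched as follows. This Python fragment is illustrative only: the function names are hypothetical, the class-interval widths (5 per cent. for X in Example ii., 6 mm. on the drawing in Example iii.) are inferred from the quoted conversions rather than stated outright in the text, and r = 0.34 is taken from the quoted product 0.34 x 6.45/2.98.

```python
import math

def to_absolute(origin, width, mean_dev, sd_intervals, scale=1.0):
    """Convert a mean deviation and s.d. in class-intervals to absolute units.

    origin: value at the arbitrary origin; width: class-interval width;
    scale: divisor for a further change of unit (e.g. magnification 24).
    """
    mean = (origin + mean_dev * width) / scale
    sd = sd_intervals * width / scale
    return mean, sd

def regression_line(mean_dep, sd_dep, mean_ind, sd_ind, r):
    """Slope, intercept and standard error for estimating dep from ind."""
    slope = r * sd_dep / sd_ind
    intercept = mean_dep - slope * mean_ind
    std_error = sd_dep * math.sqrt(1.0 - r * r)
    return slope, intercept, std_error

# Example ii.: pauperism (X, per cent.) on out-relief ratio (Y).
mean_x, sd_x = to_absolute(17.5, 5.0, -0.1532, 1.29)   # width 5 is an inference
mean_y, sd_y = to_absolute(3.5, 1.0, 0.36, 2.98)
slope, intercept, std_error = regression_line(mean_x, sd_x, mean_y, sd_y, r=0.34)
print(f"X = {intercept:.1f} + {slope:.2f} Y, standard error {std_error:.2f}")

# Example iii.: mother-frond length, drawing units to actual millimetres.
mean_1, sd_1 = to_absolute(105.0, 6.0, -1.058, 2.828, scale=24.0)  # width 6 is an inference
print(f"M1 = {mean_1:.2f} mm., s.d. = {sd_1:.3f} mm. (actual)")
```

The correlation coefficient itself needs no conversion, since it is a pure number; only the means and standard deviations entering the regression equation must be brought to a common unit.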

Table IXa.

Table IX. Theory of Correlation: Illustration iii. Correlation between (1) mother-frond and (2) daughter-frond in Lemna minor. [Unpublished data; G. U. Yule.] (The frequencies are the figures printed in ordinary type. The numbers in heavy type are the deviation-products $\xi\eta$.)


The regression of daughter-frond on mother-frond is 0.69 (a value which will not be altered by altering the units of measurement for both mother- and daughter-fronds, as such an alteration will affect both standard deviations equally). Hence the regression equation giving the average actual length (in millimetres) of daughter-fronds for mother-fronds of actual length X is

$$Y = 1.48 + 0.69X.$$

We again leave it to the student to work out the second regression equation giving the average length of mother-fronds for daughter-fronds of length Y, and to check the whole work by a diagram showing the lines of regression and the means of arrays for the central portion of the table.

The student should be careful to remember the following points in working:
(1) To give p' and $\bar{\xi}\bar{\eta}$ their correct signs in finding the true mean deviation-product p.
(2) To express $\sigma_x$ and $\sigma_y$ in terms of the class-interval as a unit, in the value of $r = p/\sigma_x\sigma_y$, for these are the units in terms of which p has been calculated.
(3) To use the proper units for the standard deviations (not class-intervals in general) in calculating the coefficients of regression: in forming the regression equation in terms of the absolute values of the variables, for example, as above, the work will be wrong unless means and standard deviations are expressed in the same units.

Further, it must always be remembered that correlation coefficients, like all other statistical measures, are subject to fluctuations of sampling (cf. Chap. III., §§ 7, 8). If we write on cards a series of pairs of strictly independent values of x and y and then work out the correlation coefficient for samples of, say, 40 or 50 cards taken at random, we are very unlikely ever to find r = 0 absolutely, but will find a series of positive and negative values centring round 0. No great stress can therefore be laid on small, or even on moderately large, values of r as indicating a true correlation if the numbers of observations be small. For instance, if N = 36, a value of r = ±0.5 may be merely a chance result (though a very infrequent one); if N = 100, r = ±0.3 may similarly be a mere fluctuation of sampling, though again an infrequent one. If N = 900, a value of r = ±0.1 might occur as a fluctuation of sampling of the same degree of infrequency. The student must therefore be careful in interpreting his coefficients. (See Chap. XVII., § 15.)

17. Finally, it should be borne in mind that any coefficient, e.g. the coefficient of correlation or the coefficient of contingency, gives

only a part of the information afforded by the original data or the correlation table. The correlation table itself, or the original data if no correlation table has been compiled, should always be given, unless considerations of space or of expense absolutely preclude the adoption of such a course.

REFERENCES.

The theory of correlation was first developed on definite assumptions as to the form of the distribution of frequency, the so-called "normal distribution" (Chap. XVI.) being assumed. In (1) Bravais introduced the product-sum, but not a single symbol for a coefficient of correlation. Sir Francis Galton, in (2), (3), and (4), developed the practical method, determining his coefficient (Galton's function, as it was termed at first) graphically. Edgeworth developed the theoretical side further in (5), and Pearson introduced the product-sum formula in (6), both memoirs being written on the assumption of a "normal" distribution of frequency (cf. Chap. XVI.). The method used in the preceding chapter is based on (7) and (8).

(1) Bravais, A., "Analyse mathématique sur les probabilités des erreurs de situation d'un point," Acad. des Sciences: Mémoires présentés par divers savants, IIe série, t. ix., 1846, p. 255.
(2) Galton, Francis, "Regression towards Mediocrity in Hereditary Stature," Jour. Anthrop. Inst., vol. xv., 1886, p. 246.
(3) Galton, Francis, "Family Likeness in Stature," Proc. Roy. Soc., vol. xl., 1886, p. 42.
(4) Galton, Francis, "Correlations and their Measurement," Proc. Roy. Soc., vol. xlv., 1888, p. 135.
(5) Edgeworth, F. Y., "On Correlated Averages," Phil. Mag., 5th Series, vol. xxxiv., 1892, p. 190.
(6) Pearson, Karl, "Regression, Heredity, and Panmixia," Phil. Trans. Roy. Soc., Series A, vol. clxxxvii., 1896, p. 253.
(7) Yule, G. U., "On the Significance of Bravais' Formulae for Regression, etc., in the case of Skew Correlation," Proc. Roy. Soc., vol. lx., 1897, p. 477.
(8) Yule, G. U., "On the Theory of Correlation," Jour. Roy. Stat. Soc., vol. lx., 1897, p. 812.

Reference may also be made here to:

(9) Darbishire, A. D., "Some Tables for illustrating Statistical Correlation," Mem. and Proc. of the Manchester Lit. and Phil. Soc., vol. li., 1907. (Tables and diagrams illustrating the meaning of values of the correlation coefficient from 0 to 1 by steps of a twelfth.)
(10) Edgeworth, F. Y., "On a New Method of reducing Observations relating to several Quantities," Phil. Mag., 5th Series, vol. xxiv., 1887, p. 222, and vol. xxv., 1888, p. 184. (A method of treating correlated variables differing entirely from that described in the preceding chapter, and based on the use of the median: the method involves the use of trial and error to some extent. For some illustrations see F. Y. Edgeworth and A. L. Bowley, Jour. Roy. Stat. Soc., vol. lxv., 1902, p. 341 et seq.)

References to memoirs on the theory of non-linear regression are given at the end of Chapter X.

EXERCISES.

1. Find the correlation-coefficient and the equations of regression for the following values of X and Y.

[As a matter of practice it is never worth calculating a correlation-coefficient for so few observations: the figures are given solely as a short example on which the student can test his knowledge of the work.]

2. The following figures show, for the districts of Example i., the ratios of the numbers of paupers in receipt of outdoor relief to the numbers in receipt of relief in the workhouse. Find the correlations between the out-relief ratio and (1) the estimated earnings of agricultural labourers; (2) the percentage of the population in receipt of relief.

V. Rows singly up 20 then 20-28. (2) three bottom rows. 3 -1- .. etc. . group all up to 494 '5 and all over 521 '5." III. Group together (1) two top rows. 1 of Chap. 44-56. 1 -^ 2. For cols. leaving centre of table as it stands. Regroup by ten-year intervals (15-. 28-44. Rows. for son. the coefficient of mean square contingency is 465. . so as to avoid small scattered frequencies at the extremities of the tables and also excessive arithmetic : I. VI. 9-hlO.] IV. 0.. . leaving central cols. etc. 25-. If a 3-inch grouping be used (58'5-61'5. llH-12. 58'5-60'5. 56 upwards. For cols. Regroup by 2-inch intervals. etc. (4) four last columns. (3) two first columns.• 190 THEOKT OF STATISTICS. making the last group "65 and over. . — etc. 4. 13 and upwards. ref. for both father and [Both results cited son). for father. . group 1-1-2. 8-H4. In calculating the coefficient of contingency (coefficient of mean square contingency) use the following groupings. 59'6-61"5.. 11 and upwards. II. .) for both husband and wife. from Pearson. .. 35-. : ..

CHAPTER X.

CORRELATION: ILLUSTRATIONS AND PRACTICAL METHODS.

1. Necessity for careful choice of variables before proceeding to calculate r. 2-8. Illustration i.: Causation of pauperism. 9-10. Illustration ii.: Inheritance of fertility. 11-13. Illustration iii.: The weather and the crops. 14. Correlation between the movements of two variables: (a) Non-periodic movements: Illustration iv.: Changes in infantile and general mortality. 15-17. (b) Quasi-periodic movements: Illustration v.: The marriage-rate and foreign trade. 18. Elementary methods of dealing with cases of non-linear regression. 19. The correlation ratio. 20-22. Certain rough methods of approximating to the correlation coefficient.

1. The student (especially the student of economic statistics, to whom this chapter is principally addressed) should be careful to note that the coefficient of correlation, like an average or a measure of dispersion, only exhibits in a summary and comprehensible form one particular aspect of the facts on which it is based, and the real difficulties arise in the interpretation of the coefficient when obtained. The value of the coefficient may be consistent with some given hypothesis, but it may be equally consistent with others; and not only are care and judgment essential for the discussion of such possible hypotheses, but also a thorough knowledge of the facts in all other possible aspects. Further, care should be exercised from the commencement in the selection of the variables between which the correlation shall be determined. The variables should be defined in such a way as to render the correlations as readily interpretable as possible, and, if several are to be dealt with, they should afford the answers to specific and definite questions. Unfortunately, the field of choice is frequently very much limited, by deficiencies in the available data and so forth, and consequently practical possibilities as well as ideal requirements have to be taken into account. No general rules can be laid down, but the following are given as illustrations of the sort of points that have to be considered.

2. Illustration i. It is required to throw some light on the variations of pauperism in the unions (unions of parishes) of England. (Cf. Yule, ref. 2.) One table (Table VIII.) bearing on a part of this question, viz. the influence of the giving of out-relief on the proportion of the aged in receipt of relief, was given in Chap. IX. (p. 183). The question was treated by correlating the percentage of the aged relieved in different districts with the ratio of numbers relieved outdoors to the numbers in the workhouse. Is such a method the best possible?

On the whole, it would seem better to correlate changes in pauperism with changes in various possible factors. If we say that a high rate of pauperism in some district is due to lax administration, we presumably mean that as administration became lax, pauperism rose, or that if administration were more strict, pauperism would decrease; if we say that the high pauperism is due to the depressed condition of industry, we mean that when industry recovers, pauperism will fall. When we say, in fact, that any one variable is a factor of pauperism, we mean that changes in that variable are accompanied by changes in the percentage of the population in receipt of relief, either in the same or the reverse direction. It will be better, therefore, to deal with changes in pauperism and possible factors. The next question is what factors to choose.

3. The possible factors may be grouped under three heads:
(a) Administration. Changes in the method or strictness of administration of the law.
(b) Environment. Changes in economic conditions (wages, prices, employment), social conditions (residential or industrial character of the district, density of population, nationality of population), or moral conditions (as illustrated, e.g., by the statistics of crime).
(c) Age Distribution. The percentage of the population between given age-limits in receipt of relief increases very rapidly with old age, the actual figures given by one of the only two then existing returns of the age of paupers being: 2 per cent. under age 16, 1 per cent. over 16 but under 65, 20 per cent. over 65. (Return 36, 1890.)

It is practically impossible to deal with more than three factors, one from each of the above groups, or four variables altogether, including the pauperism itself. What shall we take, then, as representative variables, and how shall we best measure "pauperism"?

4. Pauperism. The returns give (a) cost, (b) numbers relieved. It seems better to deal with (b) (as in the illustration of Table

VIII., Chap. IX.), as numbers are more important than cost from the standpoint of the moral effect of relief on the population. The returns, however, generally include both lunatics and vagrants in the totals of persons relieved; and as the administrative methods of dealing with these two classes differ entirely from the methods applicable to ordinary pauperism, it seems better to alter the official total by excluding them. Returns are available giving the numbers in receipt of relief on 1st January and 1st July; there does not seem to be any special reason for taking the one return rather than the other, but the return for 1st January was actually used. The percentage of the population in receipt of relief on 1st January 1871, 1881, and 1891 (the three census years), less lunatics and vagrants, was therefore tabulated for each union. (The investigation was carried out in 1898.)

5. Administration. The most important point here, and one that lends itself readily to statistical treatment, is the relative proportion of indoor and outdoor relief (relief in the workhouse and relief in the applicant's home). The first question is, again, shall we measure this proportion by cost or by numbers? The latter seems, as before, the simpler and more important ratio for the present purpose, though some writers have preferred the statement in terms of expenditure (e.g. Mr Charles Booth, Aged Poor: Condition, 1894). If we decide on the statement in terms of numbers, we still have the choice of expressing the proportion (1) as the ratio of numbers given out-relief to numbers in the workhouse, or (2) as the percentage of numbers given out-relief on the total number relieved. The former method was chosen, partly on the simple ground that it had already been used in an earlier investigation, partly on the ground that the use of the ratio separates the higher proportions of out-relief more clearly from each other, and these differences seem to have significance. Thus a union with a ratio of 15 outdoor paupers to one indoor seems to be materially different from one with a ratio of, say, 10 to 1; but if we take, instead of the ratios, the percentages of outdoor to total paupers, the figures are 94 per cent. and 91 per cent. respectively, which are so close that they will probably fall into the same array. The ratio of numbers in receipt of outdoor relief to the numbers in the workhouse, in every union, was therefore tabulated for 1st January in the census years 1871, 1881, 1891.

6. Environment. This is the most difficult factor of all to deal with. In Mr Booth's work the factors tabulated were (1) persons per acre; (2) percentage of population living two or more to a room, i.e. "overcrowding"; (3) rateable value per head (Aged Poor: Condition). The data relating to overcrowding were first collected

at the census of 1891, and are not available for earlier years. Some trial was made of rateable value per head, but with not very satisfactory results. For any given year, and for a group of unions of somewhat similar character, e.g. rural, the rateable value per head appears to be highly (negatively) correlated with the pauperism, but changes in the two are not very highly correlated: probably the movements of assessments are sluggish and irregular, especially in the case of falling assessments in rural unions, and do not correspond at all accurately with the real changes in the value of agricultural land. After some consideration, it was decided to use a very simple index to the changing fortunes of a district, viz. the movement of the population itself. If the population of a district is increasing at a rate above the average, this is prima facie evidence that its industries are prospering; if the population is decreasing, or not increasing as fast as the average, this strongly suggests that the industries are suffering from a temporary lack of prosperity or permanent decay. The population of every union was therefore tabulated for the censuses of 1871, 1881, 1891.

7. Age Distribution. As already stated, the figures that are known clearly indicate a very rapid rise of the percentage relieved after 65 years of age. The percentage of the population over 65 years of age was therefore worked out for every union and tabulated from the same three censuses. This is not, of course, at all a complete index to the composition of the population as affecting the rate of pauperism, which is sensibly dependent on the proportion of the two sexes, and the numbers of children as well. As the percentage in receipt of relief was, however, 20 per cent. for those over 65, and only 1-2 per cent. for those under that age, it is evidently a most important index. (A more complete method might have been used by correcting the observed rate of pauperism to the basis of a standard population with given numbers of each age and sex. Cf. below, Chap. XI., pp. 223-25.)

8. The changes in each of the four quantities that had been tabulated for every union were then measured by working out the ratios for the intercensal decades 1871-81 and 1881-91, taking the value in the earlier year as 100 in each case. The percentage ratios so obtained were taken as the four variables. Further, as the conditions are and were very different for rural and for urban unions, it seemed very desirable to separate the unions into groups according to their character. But this cannot be done with any exactness: the majority of unions are of a mixed character, consisting, say, of a small town with a considerable extent of the surrounding country. It might seem best to base the classification on returns of occupations, e.g. the proportions of the population

engaged in agriculture, but the statistics of occupations are not given in the census for individual unions. Finally, it was decided to use a classification by density of population, the grouping used being: Rural, 0.3 person per acre or less; Mixed, more than 0.3 but not more than 1 person per acre; Urban, more than 1 person per acre. The metropolitan unions were also treated by themselves. The limit 0.3 for rural unions was suggested by the density of those agricultural unions the conditions in which were investigated by the Labour Commission (the unions of Table VII., Chap. IX.): the average density of these was 0.25, and 34 of the 38 were under 0.3. The lower limit of density for urban unions, 1 per acre, was suggested by a grouping of Mr Booth's (group xiv.): of course 1 person per acre is not a density associated with an urban district in the ordinary sense of the term, but a country district cannot reach this density unless it include a small town or portion of a town, i.e. unless a large proportion of its inhabitants live under urban conditions. The method by which the relations between four variables are discussed is fully described in Chapter XII.: at the present stage it can only be stated that the discussion is based on the correlations between all the possible pairs (6) that can be formed from the four variables.

9. Illustration ii. The subject of investigation is the inheritance of fertility in man. (Cf. Pearson and others, ref. 3.) One table, from the memoir cited, was given as an example in the last chapter (Table IV., Chap. IX.). Fertility in man (i.e. the number of children born to a given pair) is very largely influenced by the age of husband and wife at marriage (especially the latter), and by the duration of marriage. It is desired to find whether it is also influenced by the heritable constitution of the parents, i.e. whether, allowance being made for the effect of such disturbing causes as age and duration of marriage, fertility is itself a heritable character. The effect of duration of marriage may be largely eliminated by excluding all marriages which have not lasted, say, 15 years at least. This will rather heavily reduce the number of records available, but will leave a sufficient number for discussion. It would be desirable to eliminate the effect of late marriages in the same way by excluding all cases in which, say, husband was over 30 years of age or wife over 25 (or even less) at the time of marriage. But, unfortunately, this is impossible: the age of the wife, the most important factor, is only exceptionally given in peerages, family histories, and similar works, from which the data must be compiled. All marriages must therefore be included, whatever the age of the parents at marriage, and the

effect of the varying age at marriage must be estimated afterwards.

10. But the correlation between (1) number of children of a woman and (2) the number of children of her daughter will be further affected according as we include in the record all her available daughters or only one. Suppose, for instance, the number of children in the first generation is 5 (say the mother and her brothers and sisters), and that she has three daughters with 0, 2, and 4 children respectively: are we to enter all three pairs (5, 0), (5, 2), (5, 4) in the correlation-table, or only one pair? If the latter, which pair? For theoretical simplicity the second process is distinctly the best (though it still further limits the available data). If it be adopted, some regular rule will have to be made for the selection of the daughter whose fertility shall be entered in the table, so as to avoid bias: the first daughter married for whom data are given, and who fulfils the conditions as to duration of marriage, may, e.g., be taken in every case. (For a much more detailed discussion of the problem, and the allied problems regarding the inheritance of fertility in the horse, the student is referred to the original.)

11. Illustration iii. The subject for investigation is the relation between the bulk of a crop (wheat and other cereals, turnips and other root crops, hay, etc.) and the weather. (Cf. Hooker, ref. 7.) Produce-statistics for the more important crops of Great Britain have been issued by the Board of Agriculture since 1885: the figures are based on estimates of the yield furnished by official local estimators all over the country. Estimates are published for separate counties and for groups of counties (divisions). But the climatic conditions vary so much over the United Kingdom that it is better to deal with a smaller area, more homogeneous from the meteorological standpoint. On the other hand, the area should not be too small; it should be large enough to present a representative variety of soil. The group of eastern counties, consisting of Lincoln, Hunts., Cambridge, Norfolk, Suffolk, Essex, Bedford, and Hertford, was selected as fulfilling these conditions. The group includes the county with the largest acreage of each of the ten crops investigated, with the single exception of permanent grass.

12. The produce of a crop is dependent on the weather of a long preceding period, and it is naturally desired to find the influence of the weather at all successive stages during this period, and to determine, for each crop, which period of the year is of most critical importance as regards weather. It must be remembered, however, that the times of both sowing and

harvest are themselves very largely dependent on the weather, and consequently, on an average of many years, the limits of the critical period will not be very well defined. If, therefore, we correlate the produce of the crop (X) with the characteristics of the weather (Y) during successive intervals of the year, it will be as well not to make these intervals too short. It was accordingly decided to take successive groups of 8 weeks, overlapping each other by 4 weeks, i.e. weeks 1-8, 5-12, etc. Correlation coefficients were thus obtained at 4-weeks intervals, but based on 8 weeks' weather.

13. It remains to be decided what characteristics of the weather are to be taken into account. The rainfall is clearly one factor of great importance, temperature is another, and these two will afford quite enough labour for a first investigation. The weekly rainfalls were averaged for eight stations within the area, and the average taken as the first characteristic of the weather. Temperatures were taken from the records of the same stations. The average temperatures, however, do not give quite the sort of information that is required: at temperatures below a certain limit (about 42° Fahr.) there is very little growth, and the growth increases in rapidity as the temperature rises above this point (within limits). It was therefore decided to utilise the figures for "accumulated temperatures above 42° Fahr.," i.e. the total number of day-degrees above 42° during each of the 8-weekly periods, as the second characteristic of the weather: these "accumulated temperatures," moreover, show much larger variations than mean temperatures. The student should refer to the original for the full discussion as to data, etc. The method of treating the correlations between three variables, based on the three possible correlations between them, is described in Chapter XII.

14. Problems of a somewhat special kind arise when dealing with the relations between simultaneous values of two variables which have been observed during a considerable period of time, for the more rapid movements will often exhibit a fairly close consilience, while the slower changes show no similarity. The two following examples will serve as illustrations of two methods which are generally applicable to such cases.

Illustration iv. Fig. 41 exhibits the movements of (1) the infantile mortality (deaths of infants under 1 year of age per 1000 births in the same year); (2) the general mortality (deaths at all ages per 1000 living) in England and Wales during the period 1838-1904. A very cursory inspection of the figure shows that when the infantile mortality rose from one year to the next the general mortality also rose, as a rule; and similarly, when the infantile mortality fell, the general mortality also fell. There were, in fact, only five or six exceptions to this rule during the whole period under review. The correlation between the annual values of the two mortalities would nevertheless not be very high, as the general mortality has been falling more or less steadily since

1875 or thereabouts, while the infantile mortality attained almost a record value in 1899. During a long period of time the correlation between annual values may, indeed, very well vanish, for the two mortalities are affected by causes which are to a large extent different in the two cases. To exhibit, therefore, the closeness of the relation between infantile and general mortality, for such causes as show marked changes between one year and the next, it will be best to proceed by correlating the annual changes, and not the annual values. The work would be arranged in the following form (only sufficient years being given to exhibit the principle of the process), and the correlation worked out between the figures of cols. 3 and 5.
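The difference-method just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two series below are made-up figures, not the England and Wales rates, and the helper names `pearson` and `first_differences` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Product-sum correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def first_differences(series):
    """Change from each year to the next (the entries of cols. 3 and 5)."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Made-up annual rates (assumption): both move together from year to year,
# but the second also carries a steady downward drift.
infantile = [150, 162, 148, 155, 143, 158, 146, 163]
general = [22.0, 22.9, 21.3, 21.6, 20.2, 20.9, 19.5, 20.4]

r_values = pearson(infantile, general)
r_changes = pearson(first_differences(infantile), first_differences(general))
print(f"annual values:  r = {r_values:.2f}")
print(f"annual changes: r = {r_changes:.2f}")
```

With these invented figures the correlation of the annual changes comes out much higher than that of the annual values, which is the behaviour the text describes.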
Fig. 41. Infantile and General Mortality in England and Wales, 1838-1904.

... proceeding to the second differences, i.e. by working out the successive differences of the differences in col. 3 and in col. 5 before correlating. It may even be desirable to proceed to third, fourth or higher differences before correlating.

... rise from one year to the next when the other rises, and a fall when the other falls. The movement of both variables is, however, of a much more regular kind than that of mortality, resembling a series of "waves" superposed on a steady general trend, and it is the "waves" in the two variables, the short-period movements, not the slower trends, which are so clearly related.

16. It is not difficult, moreover, to separate the short-period oscillations, more or less approximately, from the slower movement. Suppose the marriage-rate for each year replaced by the average of an odd number of years of which it is the centre, the number being as near as may be the same as the period of the "waves," e.g. nine years. If these short-period averages were plotted on the diagram instead of the rates of the individual years, we should evidently obtain a smoother curve which would clearly exhibit the trend and be practically free from the conspicuous waves. The excess or defect of each annual rate above or below the trend, if plotted separately, would therefore give the "waves" apart from the slower changes. The figures for foreign trade may be treated in the same way as the marriage-rate, and we can accordingly work out the correlation between the waves or rapid fluctuations, undisturbed by the movements of longer period, however great they may be. The arithmetic may be carried out in the form of the following table, and the correlation worked out in the ordinary way between the figures of columns 4 and 7.
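The process of § 16 may be sketched as follows. The series are synthetic (a straight-line trend plus a nine-year wave), chosen only to show the mechanics; they are not the marriage-rate or trade figures, and the function names are hypothetical.

```python
from math import pi, sin, sqrt


def pearson(xs, ys):
    """Product-sum correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def deviations_from_trend(series, window=9):
    """Excess or defect of each year over the average of `window` years
    centred on it. The first and last window // 2 years have no centred
    average and are dropped."""
    if window % 2 == 0:
        raise ValueError("window must be an odd number of years")
    half = window // 2
    out = []
    for i in range(half, len(series) - half):
        trend = sum(series[i - half:i + half + 1]) / window
        out.append(series[i] - trend)
    return out


# Synthetic data (assumption): opposite slow trends, common 9-year waves.
years = range(40)
marriage = [16.0 - 0.10 * t + 0.8 * sin(2 * pi * t / 9) for t in years]
trade = [10.0 + 0.30 * t + 3.0 * sin(2 * pi * t / 9) for t in years]

r_raw = pearson(marriage, trade)
r_waves = pearson(deviations_from_trend(marriage), deviations_from_trend(trade))
print(f"annual values:           r = {r_raw:.2f}")
print(f"deviations from 9-yr mean: r = {r_waves:.2f}")
```

Because the opposed trends dominate the raw figures, the correlation of the annual values is negative here, while the correlation of the deviations from the nine-year means is close to unity.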

Fig. 43. Fluctuations in (1) Marriage-rate and (2) Foreign Trade (Exports + Imports per head) in England and Wales: the Curves show Deviations from 9-year means. Data of R. H. Hooker, Jour. Roy. Stat. Soc., 1901.

... deviation of the marriage-rate in any one year with the deviation of the exports and imports of the year before, or two years before, instead of the same year; if a sufficient number of years be taken, an estimate may be made, by interpolation, of the time-difference that would make the correlation a maximum if it were possible to obtain the figures for exports and imports for periods other than calendar years. Thus Mr Hooker finds (ref. 5) that on an average of the years 1861-95 the correlation would be a maximum between the marriage-rate and the foreign trade of about one-third of a year earlier. The method is an extremely useful one, and is obviously applicable to any similar case. The student should refer to the paper by Mr Hooker, cited. Reference may also be made to ref. 10, in which several diagrams are given similar to fig. 43, and the nature of the relationship between the marriage-rate and such factors as trade, unemployment, etc., is discussed, it being suggested that the relation is even more complex than appears from the above. The same method of separating the short-period oscillations was used at an earlier date by Poynting in ref. 16, to which the student is referred for a discussion of the method.

18. It was briefly mentioned in § 9 of the last chapter that the treatment of cases when the regression was non-linear was,
in general, somewhat difficult. Such cases lie strictly outside the scope of the present volume, but it may be pointed out that if a relation between $X$ and $Y$ be suggested, either by theory or by previous experience, it may be possible to throw that relation into the form

$$Y = A + B\,\phi(X),$$

where $A$ and $B$ are the only unknown constants to be determined. If a correlation-table be then drawn up between $Y$ and $\phi(X)$ instead of $Y$ and $X$, the regression will be approximately linear. Thus in Table V. of the last chapter, if $X$ be the rate of discount and $Y$ the percentage of reserves on deposits, a diagram of the curves of regression, or curves on which the means of arrays lie, suggests that the relation between $X$ and $Y$ is approximately of the form

$$X(Y - B) = A,$$

$A$ and $B$ being constants; that is,

$$XY = A + BX.$$

Or, if we make $XY$ a new variable, say $Z$,

$$Z = A + BX.$$

Hence, if we draw up a new correlation-table between $X$ and $Z$ the regression will probably be much more closely linear. If the relation between the variables be of the form

$$Y = AB^{X}$$

we have

$$\log Y = \log A + X \log B,$$

and hence the relation between $\log Y$ and $X$ is linear. Similarly, if the relation be of the form

$$X^{n}Y = A$$

we have

$$\log Y = \log A - n \log X,$$

and so the relation between $\log Y$ and $\log X$ is linear. By means of such artifices for obtaining correlation-tables in which the regression is linear, it may be possible to do a good deal in difficult cases whilst using elementary methods only. The advanced student should refer to ref. 17 for a different method of treatment.

19. The only strict method of calculating the correlation coefficient is that described in Chapter IX. from the formula

$$r = \frac{\Sigma(xy)}{N\,\sigma_x\,\sigma_y}.$$

Approximations to this value may, however, be found in various ways, for the most part dependent either (1) on the formulæ for the two regressions $r\,\sigma_x/\sigma_y$ and $r\,\sigma_y/\sigma_x$, or (2) on the formulæ for the standard deviations of the arrays $\sigma_x\sqrt{1-r^2}$ and $\sigma_y\sqrt{1-r^2}$. Such approximate methods are not recommended for ordinary use, as they will lead to different results in different hands, but a few may be given here, as being occasionally useful for estimating the value of the correlation in cases where the data are not given in such a shape as to permit of the proper calculation of the coefficient.

(1) The means of rows and columns are plotted on a diagram, and lines fitted to the points by eye, say by shifting about a stretched black thread until it seems to run as near as may be to all the points. If $b_1$, $b_2$ be the slopes of these two lines to the vertical and the horizontal respectively,

$$r = \sqrt{b_1 b_2}.$$

Hence the value of r may be estimated from any such diagram as figs. 36-40 in Chapter IX., in the absence of the original table. Further, if a correlation-table be not grouped by equal intervals, it may be difficult to calculate the product sum, but it may still be possible to plot approximately a diagram of the two lines of regression, and so determine roughly the value of r. Similarly, if only the means of two rows and two columns, or of one row and one column in addition to the means of the two variables, are known, it will still be possible to estimate the slopes of RR and CC, and hence the correlation coefficient.
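The relation between r and the two slopes can be checked numerically. The sketch below is an assumption-laden illustration: it obtains the slopes by least squares rather than by eye, uses made-up observations, and the function name `regression_slopes` is hypothetical. It shows only that the square root of the product of the two slopes reproduces the product-sum value.

```python
from math import copysign, sqrt


def regression_slopes(xs, ys):
    """Return (b1, b2, r): the slope of the regression of x on y, the slope
    of the regression of y on x, and the product-sum correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    b1 = sxy / syy  # x on y: slope measured to the vertical
    b2 = sxy / sxx  # y on x: slope measured to the horizontal
    return b1, b2, sxy / sqrt(sxx * syy)


# Made-up observations (assumption).
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9]

b1, b2, r_product_sum = regression_slopes(xs, ys)
# The two slopes always share a sign, and r takes that sign.
r_from_slopes = copysign(sqrt(b1 * b2), b1)
print(f"r from slopes = {r_from_slopes:.4f}, product-sum r = {r_product_sum:.4f}")
```

When the lines are fitted by eye, as in the text, the two slopes are only estimates, and the value of r so obtained is correspondingly rough.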

(2) The means of one set of arrays only, say the rows, are calculated, and also the two standard-deviations $\sigma_x$ and $\sigma_y$. The means are then plotted on a diagram, using the standard-deviation of each variable as the unit of measurement, and a line fitted by eye. The slope of this line to the vertical is r. If the standard deviations be not used as the units of measurement in plotting, the slope of the line to the vertical is $r\,\sigma_x/\sigma_y$, and hence r will be obtained by dividing the slope by the ratio of the standard-deviations.

This method, or some variation of it, is often useful as a makeshift when the data are too incomplete to permit of the proper calculation of the correlation, only one line of regression and the ratio of the dispersions of the two variables being required: the ratio of the quartile deviations, or other simple measures of dispersion, will serve quite well for rough purposes in lieu of the ratio of standard-deviations. As a special case, we may note that if the two dispersions are approximately the same, the slope of RR to the vertical is r. Plotting the medians of arrays on a diagram with the quartile deviations as units, and measuring the slope of the line, was the method of determining the correlation coefficient ("Galton's function") used by Sir Francis Galton, to whom the introduction of such a coefficient is due. (Refs. 2-4 of Chap. IX. p. 188.)

(3) If $s_x$ be the standard-deviation of errors of estimate like $x - b_1 y$, we have from Chap. IX. § 11

$$s_x^2 = \sigma_x^2(1 - r^2),$$

and hence

$$r = \sqrt{1 - \frac{s_x^2}{\sigma_x^2}}.$$

But if the dispersions of arrays do not differ largely, and the regression is nearly linear, the value of $s_x$ may be estimated from the average of the standard-deviations of a few rows, and r determined or rather estimated accordingly. Thus in Table III., Chap. IX., the standard-deviations of the ten columns headed 62.5-63.5, 63.5-64.5, etc., are:

2.56 2.11 2.55 2.24 2.23
2.60 2.26 2.26 2.45 2.33

Mean 2.359

The standard-deviation of the stature of all sons is 2.75: hence approximately

$$r = \sqrt{1 - \left(\frac{2.359}{2.75}\right)^2} = 0.514.$$

This is the same as the value found by the product-sum method to the second decimal place. It would be better to take an average by counting the square of each standard-deviation once for each observation in the column (or "weighting" it with the number of observations in the column), but in the present case this would only lead to a very slightly different result, viz. $\sigma = 2.362$, $r = 0.512$.
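The arithmetic of this example is easily verified. The short Python sketch below uses only the figures quoted in the text; the variable and function names are illustrative, and the weighted average is taken as given (2.362), since the column frequencies are not reproduced here.

```python
from math import sqrt

# Standard-deviations of the ten columns of Table III., as quoted in the text.
array_sds = [2.56, 2.11, 2.55, 2.24, 2.23, 2.60, 2.26, 2.26, 2.45, 2.33]
sigma_sons = 2.75  # standard-deviation of the stature of all sons


def r_from_array_sd(s, sigma):
    """r estimated from s^2 = sigma^2 (1 - r^2); the sign of r is not given
    by this relation and must be settled from the data."""
    return sqrt(1 - (s / sigma) ** 2)


mean_sd = sum(array_sds) / len(array_sds)
r_simple = r_from_array_sd(mean_sd, sigma_sons)
r_weighted = r_from_array_sd(2.362, sigma_sons)  # weighted value from the text

print(f"mean s.d. = {mean_sd:.3f}, r = {r_simple:.3f}")
print(f"weighted s.d. = 2.362, r = {r_weighted:.3f}")
```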

20. The Correlation Ratio. The method clearly would not give an approximation to the correlation coefficient, however, in the case of such tables as V. and VI. of Chap. IX., in which the means of successive arrays do not lie closely round straight lines. In such cases it would always tend to give a value for r markedly higher than that given by the product-sum method. The product-sum method gives in fact a value based on the standard-deviation round the line of regression; the method used above gives a value dependent on the standard-deviation round a line which sweeps through all the means of arrays, and the second standard-deviation is necessarily less than the first. We reach, therefore, a generalised coefficient which measures the approach towards a curvilinear line of regression of any form. Let $s_{ax}$ denote the standard-deviation of any array of $X$'s, and let $n$, as before, be the number of observations in this array (Chap. IX., § 11), and further let

$$\sigma_{ax}^2 = \Sigma(n\,s_{ax}^2)/N \qquad (1)$$

Then $\sigma_{ax}$ is an average of the standard-deviations of the arrays obtained as suggested at the end of the last section. Now let

$$\sigma_{ax}^2 = \sigma_x^2(1 - \eta_{xy}^2) \qquad (2)$$

or

$$\eta_{xy}^2 = 1 - \frac{\sigma_{ax}^2}{\sigma_x^2} \qquad (3)$$

Then $\eta_{xy}$ is termed by Professor Pearson a correlation-ratio (ref. 18). As there are clearly two correlation-ratios for any one table, it should be distinguished as the correlation-ratio of $X$ on $Y$: it measures the approach of values of $X$ associated with given values of $Y$ to a single-valued relationship of any form. The calculation would be exceedingly laborious if we had actually to evaluate $\sigma_{ax}$, but this may be avoided and the work greatly simplified by the following consideration. If $M_x$ denote the mean of all $X$'s, $m_x$ the mean of an array, then we have by the general relation given in § 11 of Chap. VIII. (p. 142)

$$N\sigma_x^2 = \Sigma(n\,s_{ax}^2) + \Sigma\{n(m_x - M_x)^2\}.$$

Or, using $\sigma_{mx}$ to denote the standard-deviation of $m_x$,

$$\sigma_x^2 = \sigma_{ax}^2 + \sigma_{mx}^2 \qquad (4)$$

Hence, substituting in (3)

$$\eta_{xy} = \frac{\sigma_{mx}}{\sigma_x} \qquad (5)$$

The correlation-ratio of $X$ on $Y$ is therefore determined when we have found, in addition to the standard-deviation of $X$, the standard-deviation of the means of its arrays.

21. The correlation-ratio of $X$ on $Y$ cannot be less than the correlation-coefficient for $X$ and $Y$, and $\eta_{xy}^2 - r^2$ is a measure of the divergence of the regression of $X$ on $Y$ from linearity. For

if $d$ denote, as in Chap. IX., the deviation of the mean of an array of $X$'s from the line of regression, we have by the relation of Chap. IX., § 11, and the formula cited therefrom on p. 172,

$$\sigma_x^2(1 - r^2) = \sigma_{ax}^2 + \sigma_d^2 \qquad (6)$$

Substituting for $\sigma_{ax}^2$ from (2), we have

$$\sigma_d^2 = \sigma_x^2(\eta_{xy}^2 - r^2) \qquad (7)$$

But $\sigma_d^2$ is necessarily positive, and therefore $\eta_{xy}$ is not less than r. The magnitude of $\sigma_d$, and therefore of $\eta^2 - r^2$, measures the divergence of the actual line through the means of arrays from the line of regression. It should be noted that, owing to the fluctuations of sampling, r and $\eta$ are almost certain to differ slightly, even though the regression may be truly linear. The observed value of $\eta^2 - r^2$ must be compared with the values that may arise owing to fluctuations of sampling alone, before a definite significance can be ascribed to it (cf. Pearson, ref. 19; Blakeman, ref. 22; and p. 352 below).

22. The following table illustrates the form of the arithmetic for the calculation of the correlation-ratio of son's stature on father's stature (Table III. of Chap. IX., p. 160). In the first column is given the type of the array (stature of father); in the second, the mean stature of sons for that array; in the third, the difference of the mean of the array from the mean stature of all sons. In the fourth column these differences are squared, and in the sixth they are multiplied by the frequency of the array, two decimal places only having been retained as sufficient for the present purpose. The sum-total of the last column divided by the number of observations (1078) gives $\sigma_{mx}^2 = 2.058$, or $\sigma_{mx} = 1.43$. As the standard-deviation of the sons' stature is 2.75 in. (cf. Chap. IX., question 3), $\eta_{xy} = 0.52$. Before taking the differences for the third column of such a table, it is as well to check the means of the arrays by recalculating from them the mean of the whole distribution, i.e. multiplying each array-mean by its frequency, summing, and dividing by the number of observations. The form of the arithmetic may be varied, if desired, by working from zero as origin, instead of taking differences from the true mean. The square of the mean must then be subtracted from $\Sigma(n\,m_x^2)/N$ to give $\sigma_{mx}^2$.

Calculation of the Correlation-Ratio: Example. Son's Stature on Father's Stature: Data of Table III., Chap. IX., p. 160.

If the second correlation-ratio for this table be worked out in the same way, the value will be found to be the same to the second place of decimals: the two correlation-ratios for this table are, therefore, very nearly identical, and only slightly greater than the correlation-coefficient (0.51). Both regressions, it follows from the last section, are very nearly linear, a result confirmed by the diagram of the regression lines (fig. 37, p. 174). On the other hand, it is evident from fig. 39, p. 176, that we should expect the two correlation-ratios for Table VI. of the same chapter to differ considerably from each other and from the correlation. The values found are $\eta_{xy} = 0.14$, $\eta_{yx} = 0.38$ ($r = -0.014$): $\eta_{xy}$ is comparatively low as proportions of male births differ little in the successive arrays, but $\eta_{yx}$ is higher since the line of regression of $Y$ on $X$ is sharply curved. The confirmation of these values is left to the student. For Table VIII., p. 183, the two ratios are $\eta_{xy} = 0.46$, $\eta_{yx} = 0.39$ ($r = 0.34$). The student should notice that the correlation-ratio only affords a satisfactory test when the number of observations is sufficiently large for a grouped correlation table to be formed. In the case of a short series of observations such as that given in Table VII., p. 178, the method is inapplicable.
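As a rough illustration of the correlation-ratio, the Python sketch below computes it as the standard-deviation of the array means divided by the standard-deviation of all the observations, and compares it with the correlation-coefficient. The data are invented so that the regression is sharply curved; the function names and the dictionary layout (each given value of Y mapped to its array of X's) are assumptions made for the example.

```python
from math import sqrt


def correlation_ratio(arrays):
    """Correlation-ratio of X on Y. `arrays` maps each value (type) of Y
    to the list of X's observed with it."""
    all_x = [x for xs in arrays.values() for x in xs]
    n_total = len(all_x)
    grand_mean = sum(all_x) / n_total
    var_x = sum((x - grand_mean) ** 2 for x in all_x) / n_total
    # Variance of the array means, each weighted by its frequency.
    var_means = sum(
        len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2 for xs in arrays.values()
    ) / n_total
    return sqrt(var_means / var_x)


def pearson(xs, ys):
    """Product-sum correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented arrays (assumption): X is roughly the square of Y.
arrays = {
    -2: [3.9, 4.2, 4.0],
    -1: [1.1, 0.8, 1.0],
    0: [0.1, -0.2, 0.0],
    1: [0.9, 1.2, 1.0],
    2: [4.1, 3.8, 4.0],
}

pairs = [(y, x) for y, xs in arrays.items() for x in xs]
r = pearson([x for _, x in pairs], [y for y, _ in pairs])
eta = correlation_ratio(arrays)
print(f"r = {r:.3f}, eta = {eta:.3f}, eta^2 - r^2 = {eta**2 - r**2:.3f}")
```

Here r is nearly zero while the correlation-ratio is nearly unity, so that the difference of their squares is large, as is to be expected when the means of arrays lie on a curve and not on a straight line.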

" iSirf. E. Soc. {Cf.TISIICS. 403-413. 1906. Roy. is employed. (16) PoYNTiNG. 1906.' " JSiomeiWte. Dulau & Co." ibid. . Stat. Soc. p. 88. 1899.. D. H. vol... 0. . 1905. Hbkok. (Detailed theory of the same extended method. {Of. 847. 1907. -with an Inquiry as to their probable Causes. p." if«m. Ixii.. Karl." Jour.. Cave.) the Correlation of the Marriage-rate with Trade. vol. v. ." Jour. Soc. Ixiv.) Correlation of the Weather and the Crops. Bengal." Phil. Research Memoirs : Studies in National Deterioration. and v. 1902. 340-355. "On the Influence of the Time-factor on the Correlation between the Barometric Heights at Stations more than 1000 miles apart.. by graphical interpolation. Illustration ii. Statistics. Hooker. R. vol. 1914.. p.. 249. Soc. H. 486. and toI. 1905. p. " Jour."0n the Correlations of Areas of Matured C'rop% and the Rainfall. P. 1896. "Genetic Inheritance of Fertility in Man and of (reproductive) Selection Fecundity in Thoroughbred Racehorses. Soc. "The vol. vol. On tJie Relation of Fertility in Man to Social Status. Beatrice M." Biometrika. Jioy. (The extension of the difference-method by the use of successive differences. pp.Browne-Cave. p. p.. " Drapers' Co. "On the Changes in the Marriage and Birth Kates in England and Wales during the past Half Century. London. vol. and in the Cotton and Silk Imports into Great Britain. x." (5) Hooker. Ixix.. Series 257.. Economic Jour. vol. Alice Lee.. vol. . .. (Uses the methods of Illustrations iv. "On Illustration v. x.. x.. Bramlet Mooric. Ixviii. Jacob. pp. ' pp.. but the instantaneous average is obtained by an interpolated logarithmic curve.) Norton. Illustration i. 613." "The Elimination of Spurious Correlation due to Position in Time or Space. vol. {Of. J. REFERENCES. March. Illustrative Applications. vi. . G. "Comparaison numerique de courbes statistiques. cxoii. R.. U. pp. L. 1914 . "Numerical Illustrations of the Variate-difference Correlation M. used. .) ) ) . 255 and 306. U. U. Roy. p." Froc. 
(1) Yule. J. Stat.. R. 1895. of total Pauperism with Proportion ot 603... H. Roy. " On the (7) (8) (9) ' (10) (11) (12) (13) (14) (15) Correlation of Successive Observations : illustrated by Corn-prices. . ii." p.eit\iod. lUusti-ation iii. New York. tion due to Position in Time or Space. and : L. de la sociSU de statistique de Paris. and Karl Pearson. vol..) chiefly during the last "An (3) Peakson. pp. (The difference-method of Illustration iv. G.) (6) Hooker..) " Nochmals iiber The Elimination of Spurious CorrelaAnderson. 1901. (The method of Illustration iv. 1899. 1914. 208 THEORY OF STA. Trans.. H.) Yule. M. 269-279. Ixx. (Applications to financial statistics an instantaneous average. (2) Yule. 1910. S. 1. .. but obtaining the instantaneous average in the latter case. G.. 696. analogous to that of illustration v..method. Roy." Biometrika. principally to Economic and Practical Methods.. (The method of Jour. " St[jdent. 1904. vol. vol. p. Statistical Studies in the New York Money Market Macmillan Co. Asiatic Soc." Jour. in England Inyestigation into the Causes of Changes in Pauperism two Interoensal Decades. Stat. 179-180." I. A. F. " On the Correlation Out-relief.. Ixxiv. (4) Cave. "A Comparison of the Fluctuations in the Price of Wheat. .

1884. Harris. but is cited because the method of Illustration v." Jour." linear Regression.) X. "On a General Theory of the Method of False (A method of curve fitting by Position. (18) (19) (20) (21) Peakson.) J.. 1914. Stat. 34. ix. p. Karl. p. the use of trial solutions. 209 Roy. Ixxvii.") II. On the Gteneral Theory of Skew Oorrelation and NonBibmetric Series. (24) Slutsky. 559. 6th Series. 1901. pp. 6th Series. and Curve or Line fitting generally. p. 446-472r 14 . Short Method of Calculating the Coefficient of Correlation in the case of Integral Y3. Pbakson. vol. ii. 1913." Biometrika." PAiZ. " On the Systematic Fitting of Curves to Observations and Measurements." Biometrileo. on the process of averaging employed. "On xxi." Biometrika. Aethue. Pearson.. (This paper was written before the invention of the correlation coefficient. (17) Pbakson. " Drapers' Oo. 78-84.) (22) Testa for Linearity of Regression in FrequencyJ..tes. 1902. vol.. Karl. Biometrika. Kakl. vol. Abbreviated Methods of Calculation.. 254. Mag. (The second part is useful for the fitting of curves in cases of non -linear regression. London." PMl. "On the Combinations is Calculation of Intra-class and Inter-class Coefficients of Correlation possible from Class-moments when the Number of large. June 1903. Blakeman." Biometrika. is used to separate the periodic from the secular movement : see especially § ix. Soc. —COEKKLATION: Stat. 1.. (The "correlation ratio. "On the Criterion of Goodness of Fit of the Regression Lines and the best Method of Fitting them to the Data. 214. i. p. " On Restricted Lines and Planes of Closest Fit to Systems (23) Snow. 265.. Karl. " On Lines and Planes of Closest Fit to Systems of Points in Space. (Not an approximation. 1911. Research Memoirs Dulau & Co. and vol. iv. p. Roy. Mag. vii. " On a Correction to be made to the Correlation Ratio. Soc. E. 1909. 367..) Theory of Correlation in the case of Non-linear Kegression. ILLUSTRATIONS AND METHOD. Pbakson. J. E. C. xlvii. p. 
viii. Arthtje.Tia.. See also references to Chapter (25) XVI. " A (26) Harris.. 332. Karl. distributions... vol. vol.. ii. 1911." : . vol. vol. 1905. 1905.." Fhil. p. vol. of Pointe in any number of Dimensions. Mag. p. pp. vol. but a true short method.

CHAPTER XI.

MISCELLANEOUS THEOREMS INVOLVING THE USE OF THE CORRELATION-COEFFICIENT.

1. Introductory; 2. Standard-deviation of a sum or difference; 3-5. Influence of errors of observation and of grouping on the standard-deviation; 6-7. Influence of errors of observation on the correlation-coefficient (Spearman's theorems); 8. Mean and standard-deviation of an index; 9. Correlation between indices; 10. Correlation-coefficient for a two- × two-fold table; 11. Correlation-coefficient for all possible pairs of N values of a variable; 12. Correlation due to heterogeneity of material; 13. Reduction of correlation due to mingling of uncorrelated with correlated material; 14-17. The weighted mean; 18-19. Application of weighting to the correction of death-rates, etc., for varying sex and age-distributions; 20. The weighting of forms of average other than the arithmetic mean.

1. It has already been pointed out that a statistical measure, if it is to be widely useful, should lend itself readily to algebraical treatment. The arithmetic mean and the standard-deviation derive their importance largely from the fact that they fulfil this requirement better than any other averages or measures of dispersion; and the following illustrations, while giving a number of results that are of value in one branch or another of statistical work, suffice to show that the correlation-coefficient can be treated with the same facility. This might indeed be expected, seeing that the coefficient is derived, like the mean and standard-deviation, by a straightforward process of summation.

2. To find the Standard-deviation of the sum or difference of corresponding values of two variables $X_1$ and $X_2$. Let $z$, $x_1$, $x_2$ denote deviations of the several variables from their arithmetic means. Then if $Z = X_1 \pm X_2$, evidently

$$z = x_1 \pm x_2.$$

Squaring both sides of the equation and summing,

$$\Sigma(z^2) = \Sigma(x_1^2) + \Sigma(x_2^2) \pm 2\Sigma(x_1 x_2).$$

That is, if r be the correlation between $x_1$ and $x_2$, and $\sigma$, $\sigma_1$, $\sigma_2$ the respective standard-deviations,

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 \pm 2r\,\sigma_1\sigma_2 \qquad (1)$$

If $x_1$ and $x_2$ are uncorrelated, we have the important special case

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 \qquad (2)$$

The student should notice that in this case the standard-deviation of the sum of corresponding values of the two variables is the same as the standard-deviation of their difference. The same process will evidently give the standard-deviation of a linear function of any number of variables. For the sum of a series of variables $X_1$, $X_2$, ... $X_n$ we must have

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 + \ldots + \sigma_n^2 + 2r_{12}\,\sigma_1\sigma_2 + 2r_{13}\,\sigma_1\sigma_3 + \ldots + 2r_{23}\,\sigma_2\sigma_3 + \ldots,$$

$r_{12}$ being the correlation between $X_1$ and $X_2$, $r_{23}$ the correlation between $X_2$ and $X_3$, and so on.

3. Influence of Errors of Observation on the Standard-deviation. The results of § 2 may be applied to the theory of errors of observation. Let us suppose that, if any value of $X$ be observed a large number of times, the arithmetic mean of the observations is approximately the true value, the arithmetic mean error being zero. Then, the arithmetic mean error being zero for all values of $X$, the error, say $\delta$, is uncorrelated with $X$. In this case if $x_1$ be an observed deviation from the arithmetic mean, $x$ the true deviation, we have from the preceding

$$\sigma_{x_1}^2 = \sigma_x^2 + \sigma_\delta^2 \qquad (3)$$

The effect of errors of observation is, consequently, to increase the standard-deviation above its true value. The student should notice that the assumption made does not imply the complete independence of $X$ and $\delta$: he is quite at liberty to suppose that errors fluctuate more, for example, with large than with small values of $X$, as might very probably happen. In that case the contingency-coefficient between $X$ and $\delta$ would not be zero, although the correlation-coefficient might still vanish as supposed.

4. Influence of Grouping on the Standard-deviation. The consequence of grouping observations to form the frequency distribution is to introduce errors that are, in effect, errors of

measurement. Instead of assigning to any observation its true value $X$, we assign to it the value $X_1$ corresponding to the centre of the class-interval, thereby making an error $\delta$, where

$$X_1 = X + \delta.$$

To deduce from this equation a formula showing the nature of the influence of grouping on the standard-deviation we must know the correlation between the error $\delta$ and $X$ or $X_1$. If the original distribution were a histogram, $X_1$ and $\delta$ would be uncorrelated, the mean value of $\delta$ being zero for every value of $X_1$: further, the square of the standard-deviation of $\delta$ would be $c^2/12$, where $c$ is the class-interval (Chap. VIII. § 12, eqn. (10)). Hence, if $\sigma_1$ be the standard-deviation of the grouped values $X_1$ and $\sigma$ the standard-deviation of the true values $X$,

$$\sigma^2 = \sigma_1^2 + \frac{c^2}{12}.$$

But the true frequency distribution is rarely or never a histogram, and trial on any frequency distribution approximating to the symmetrical or slightly asymmetrical forms of fig. 5, p. 89, or fig. 9 (a), p. 92, shows that grouping tends to increase rather than reduce the standard-deviation. If we assume, as in § 3, that the correlation between $\delta$ and $X$, instead of $\delta$ and $X_1$, is appreciably zero and that the square of the standard-deviation of $\delta$ may be taken as $c^2/12$, as before (the values of $\delta$ being to a first approximation uniformly distributed over the class-interval when all the intervals are considered together), then we have

$$\sigma^2 = \sigma_1^2 - \frac{c^2}{12} \qquad (4)$$

This is a formula of correction for grouping (Sheppard's correction, refs. 1 to 4) that is very frequently used, and that trial (ref. 1) shows to give very good results for a curve approximating closely to the form of fig. 5, p. 89. The strict proof of the formula lies outside the scope of an elementary work: it is based on two assumptions: (1) that the distribution of frequency is continuous, (2) that the frequency tapers off gradually to zero in both directions. The formula would not give accurate results in the case of such a distribution as that of fig. 9 (b), p. 92, or fig. 14, p. 97, neither is it applicable at all to the more divergent forms such as those of figs. 15, et seq.

5. If certain observations be repeated so that we have in every case two measures $x_1$ and $x_2$ of the same deviation $x$, it is possible to obtain the true standard-deviation $\sigma_x$ if the further assumption is legitimate that the errors $\delta_1$ and $\delta_2$ are uncorrelated with each other. On this assumption

$$\Sigma(x_1 x_2) = \Sigma(x^2),$$

(Formula (5) is part of Spearman's formula for the correction of the correlation-coefficient, cf. § 7.)

6. Influence of Errors of Observation on the Correlation-coefficient. Let $x_1$, $y_1$ be the observed deviations from the arithmetic means, $x$, $y$ the true deviations, and $\delta$, $\epsilon$ the errors of observation. Of the four quantities $x$, $y$, $\delta$, $\epsilon$ we will suppose $x$ and $y$ alone to be correlated. On this assumption

$$\Sigma(x_1 y_1) = \Sigma(xy). \qquad (6)$$

It follows at once that

$$\frac{r_{x_1 y_1}}{r_{xy}} = \frac{\sigma_x\,\sigma_y}{\sigma_{x_1}\,\sigma_{y_1}},$$

and consequently, the observed standard-deviations being greater than the true ones, the observed correlation is less than the true correlation. This difference, it should be noticed, no mere increase in the number of observations can in any way lessen.

7. Spearman's Theorems. If, however, the observations of both $x$ and $y$ be repeated, as assumed in § 5, so that we have two measures $x_1$ and $x_2$, $y_1$ and $y_2$ of every value of $x$ and $y$, the true value of the correlation can be obtained by the use of equations (5) and (6), on assumptions similar to those made above. For we have

$$r_{xy}^2 = \frac{\Sigma(x_1 y_1)\,\Sigma(x_2 y_2)}{\Sigma(x_1 x_2)\,\Sigma(y_1 y_2)} = \frac{\Sigma(x_1 y_2)\,\Sigma(x_2 y_1)}{\Sigma(x_1 x_2)\,\Sigma(y_1 y_2)} \qquad (7)$$

Or, if we use all the four possible correlations between observed values of $x$ and observed values of $y$,

$$r_{xy} = \frac{\left(r_{x_1 y_1}\, r_{x_2 y_2}\, r_{x_1 y_2}\, r_{x_2 y_1}\right)^{1/4}}{\left(r_{x_1 x_2}\, r_{y_1 y_2}\right)^{1/2}} \qquad (8)$$

Equation (8) is the original form in which Spearman gave his correction formula (refs. 6, 7). It will be seen to imply the assumption that, of the six quantities $x$, $y$, $\delta_1$, $\delta_2$, $\epsilon_1$, $\epsilon_2$, $x$ and $y$ alone are correlated. The correction given by the second part of equation (7), also suggested by Spearman, seems, on the whole, to be safer, for it eliminates the assumption that the errors in $x$ and in $y$, in the same series of observations, are uncorrelated.
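The effect of errors of observation, and the correction obtained from repeated measures, can be sketched by simulation. The block below is only an illustration under assumed conditions: normally distributed true values with a true correlation of 0.6, independent errors of standard-deviation 0.7, and a fixed random seed. None of these figures comes from the text, and the function names are invented.

```python
import math
import random

def corr(a, b):
    """Product-moment correlation-coefficient of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def simulate_repeated_measures(n=20000, true_r=0.6, error_sd=0.7, seed=1):
    """Return true x, y and two independent erroneous measures of each."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        x.append(u)
        y.append(true_r * u + math.sqrt(1 - true_r ** 2) * v)

    def observe(series):
        return [t + rng.gauss(0, error_sd) for t in series]

    return x, y, observe(x), observe(x), observe(y), observe(y)

def corrected_r_same_series(x1, x2, y1, y2):
    """First part of equation (7): numerator from measures of the same series."""
    return math.sqrt(corr(x1, y1) * corr(x2, y2) / (corr(x1, x2) * corr(y1, y2)))

def corrected_r_cross_series(x1, x2, y1, y2):
    """Second part of equation (7): numerator from measures of different series."""
    return math.sqrt(corr(x1, y2) * corr(x2, y1) / (corr(x1, x2) * corr(y1, y2)))

x, y, x1, x2, y1, y2 = simulate_repeated_measures()
print("true r      ", corr(x, y))
print("observed r  ", corr(x1, y1))
print("corrected r ", corrected_r_same_series(x1, x2, y1, y2))
print("corrected r'", corrected_r_cross_series(x1, x2, y1, y2))
```

Since the standard-deviations of the four observed series cancel, the ratios of product-sums in (7) are the same as the ratios of correlation-coefficients used here.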

An insufficient though partial test of the correctness of the assumptions of § 7 may be made by correlating $x_1 - x_2$ with $y_1 - y_2$: this correlation should vanish. Evidently, however, it may vanish from symmetry without thereby implying that all the correlations of the errors are zero.

8. Mean and Standard-deviation of an Index. (Ref. 11.) The means and standard-deviations of non-linear functions of two or more variables can in general only be expressed in terms of the means and standard-deviations of the original variables to a first approximation, on the assumption that deviations are small compared with the mean values of the variables. Thus let it be required to find the mean and standard-deviation of a ratio, or index $Z = X_1/X_2$, in terms of the constants for $X_1$ and $X_2$. Let $I$ be the mean of $Z$, and $M_1$ and $M_2$ the means of $X_1$ and $X_2$. Then

$$I = \frac{1}{N}\,\Sigma\!\left(\frac{X_1}{X_2}\right) = \frac{1}{N}\,\frac{M_1}{M_2}\,\Sigma\!\left(1 + \frac{x_1}{M_1}\right)\!\left(1 + \frac{x_2}{M_2}\right)^{-1}.$$

Expand the second bracket by the binomial theorem, assuming that $x_2/M_2$ is so small that powers higher than the second can be neglected. Then to this approximation

$$I = \frac{1}{N}\,\frac{M_1}{M_2}\,\Sigma\!\left(1 + \frac{x_1}{M_1}\right)\!\left(1 - \frac{x_2}{M_2} + \frac{x_2^2}{M_2^2}\right).$$

That is, if $r$ be the correlation between $x_1$ and $x_2$, and if $v_1 = \sigma_1/M_1$, $v_2 = \sigma_2/M_2$,

$$I = \frac{M_1}{M_2}\left(1 - r\,v_1 v_2 + v_2^2\right). \qquad (9)$$

If $s$ be the standard-deviation of $Z$ we have

$$s^2 + I^2 = \frac{1}{N}\,\Sigma\!\left(\frac{X_1^2}{X_2^2}\right) = \frac{1}{N}\,\frac{M_1^2}{M_2^2}\,\Sigma\!\left(1 + \frac{x_1}{M_1}\right)^{2}\!\left(1 + \frac{x_2}{M_2}\right)^{-2}.$$

Expanding the second bracket again by the binomial theorem, and neglecting terms of all orders above the second,

$$s^2 + I^2 = \frac{M_1^2}{M_2^2}\left(1 + v_1^2 - 4\,r\,v_1 v_2 + 3\,v_2^2\right),$$

or from (9)

$$s^2 = \frac{M_1^2}{M_2^2}\left(v_1^2 - 2\,r\,v_1 v_2 + v_2^2\right). \qquad (10)$$
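The approximations (9) and (10) may be compared with a direct simulation. This is a hedged sketch: the means 100 and 50, the standard-deviations 5 and 2, the correlation 0.5, the normal form of the variables and the seed are all assumptions chosen so that the deviations are small compared with the means.

```python
import math
import random

def index_mean_sd(m1, m2, sd1, sd2, r):
    """Approximate mean (9) and standard-deviation (10) of the index X1/X2."""
    v1, v2 = sd1 / m1, sd2 / m2          # coefficients of variation
    mean = (m1 / m2) * (1 - r * v1 * v2 + v2 ** 2)
    sd = (m1 / m2) * math.sqrt(v1 ** 2 - 2 * r * v1 * v2 + v2 ** 2)
    return mean, sd

def simulate_index(m1, m2, sd1, sd2, r, n=50000, seed=2):
    """Empirical mean and standard-deviation of X1/X2 for correlated normal X1, X2."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = m1 + sd1 * a
        x2 = m2 + sd2 * (r * a + math.sqrt(1 - r * r) * b)
        ratios.append(x1 / x2)
    mean = sum(ratios) / n
    sd = math.sqrt(sum((z - mean) ** 2 for z in ratios) / n)
    return mean, sd

approx_mean, approx_sd = index_mean_sd(100.0, 50.0, 5.0, 2.0, 0.5)
sim_mean, sim_sd = simulate_index(100.0, 50.0, 5.0, 2.0, 0.5)
print(approx_mean, sim_mean)
print(approx_sd, sim_sd)
```

With coefficients of variation of 5 and 4 per cent. the neglected terms are of the fourth order and the two pairs of values should agree closely.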

9. Correlation between Indices. (Ref. 11.) The following problem affords a further illustration of the use of the same method. Required to find approximately the correlation between two ratios $Z_1 = X_1/X_3$, $Z_2 = X_2/X_3$, $X_1$, $X_2$ and $X_3$ being uncorrelated.

Let the means of the two ratios or indices be $I_1$, $I_2$ and the standard-deviations $s_1$, $s_2$; these are given approximately by (9) and (10) of the last section. The required correlation $\rho$ will be given by

$$N\,\rho\,s_1 s_2 = \Sigma\!\left(\frac{X_1}{X_3} - I_1\right)\!\left(\frac{X_2}{X_3} - I_2\right) = \Sigma\!\left(\frac{X_1 X_2}{X_3^2}\right) - N\,I_1 I_2.$$

Neglecting terms of higher order than the second as before and remembering that all correlations are zero, we have

$$\rho\,s_1 s_2 = \frac{M_1 M_2}{M_3^2}\left(1 + 3v_3^2\right) - I_1 I_2 = \frac{M_1 M_2}{M_3^2}\,v_3^2,$$

where, in the last step, a term of the order $v_3^4$ has again been neglected. Substituting from (10) for $s_1$ and $s_2$, we have finally

$$\rho = \frac{v_3^2}{\sqrt{(v_1^2 + v_3^2)(v_2^2 + v_3^2)}}. \qquad (11)$$

This value of $\rho$ is obviously positive, being equal to 0.5 if $v_1 = v_2 = v_3$; and hence even if $X_1$ and $X_2$ are independent, the indices formed by taking their ratios to a common denominator $X_3$ will be correlated. The value of $\rho$ is termed by Professor Pearson the "spurious correlation." Thus if measurements be taken, say, on three bones of the human skeleton, and the measurements grouped in threes absolutely at random, there will, nevertheless, be a positive correlation, probably approaching 0.5, between the indices formed by the ratios of two of the measurements to the third. To give another illustration, if two individuals both observe the same series of magnitudes quite independently, there may be little, if any, correlation between their absolute errors. But if the errors be expressed as percentages of the magnitude observed, there may be considerable correlation.
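The "spurious correlation" of equation (11) is easily exhibited by forming indices from independent quantities. The sketch below assumes three independent normal variables with means 100, 60 and 40 and a common coefficient of variation of 5 per cent.; these values, the seed and the function names are illustrative assumptions only.

```python
import math
import random

def corr(a, b):
    """Product-moment correlation-coefficient of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def spurious_rho(v1, v2, v3):
    """Equation (11): correlation of X1/X3 and X2/X3 for uncorrelated X1, X2, X3."""
    return v3 ** 2 / math.sqrt((v1 ** 2 + v3 ** 2) * (v2 ** 2 + v3 ** 2))

def simulated_index_correlation(n=20000, cv=0.05, seed=3):
    """Correlation between two indices sharing a common, independent denominator."""
    rng = random.Random(seed)
    z1, z2 = [], []
    for _ in range(n):
        x1 = rng.gauss(100, 100 * cv)
        x2 = rng.gauss(60, 60 * cv)
        x3 = rng.gauss(40, 40 * cv)
        z1.append(x1 / x3)
        z2.append(x2 / x3)
    return corr(z1, z2)

print(spurious_rho(0.05, 0.05, 0.05))   # theoretical value for equal v's
print(simulated_index_correlation())    # value found from random triplets
```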

It does not follow of necessity that the correlations between indices or ratios are misleading. If the indices are uncorrelated, there will be a similar "spurious" correlation between the absolute measurements $Z_1 X_3 = X_1$ and $Z_2 X_3 = X_2$, and the answer to the question whether the correlation between indices or that between absolute measures is misleading depends on the further question whether the indices or the absolute measures are the quantities directly determined by the causes under investigation (cf. ref. 13).

The case considered, where $X_1$, $X_2$, $X_3$ are uncorrelated, is only a special one; for the general discussion cf. ref. 11. For an interesting study of actual illustrations cf. ref. 14.

10. The Correlation-coefficient for a two- x two-fold Table. The correlation-coefficient is in general only calculated for a table with a considerable number of rows and columns, such as those given in Chapter IX. In some cases, however, a theoretical value is obtainable for the coefficient, which holds good even for the limiting case when there are only two values possible for each variable (e.g. 0 and 1) and consequently two rows and two columns (cf. one illustration in § 11, and for others the references given in questions 11 and 12). It is therefore of some interest to obtain an expression for the coefficient in this case in terms of the class-frequencies. Using the notation of Chapters I.-IV., the table may be written in the form

    Values of              Values of First Variable.
    Second Variable.       X_1        X_2        Total.
    Y_1                    (AB)       (αB)       (B)
    Y_2                    (Aβ)       (αβ)       (β)
    Total.                 (A)        (α)        N

Taking the centre of the table as arbitrary origin and the class-interval as unit, the co-ordinates of the mean are

$$\xi = \tfrac{1}{2}\,\frac{(\alpha) - (A)}{N}, \qquad \eta = \tfrac{1}{2}\,\frac{(\beta) - (B)}{N}.$$

The standard-deviations $\sigma_1$, $\sigma_2$ are given by

$$\sigma_1^2 = 0.25 - \xi^2 = \frac{(A)(\alpha)}{N^2}, \qquad \sigma_2^2 = 0.25 - \eta^2 = \frac{(B)(\beta)}{N^2}.$$

Finally,

$$\Sigma(xy) = \tfrac{1}{4}\{(AB) + (\alpha\beta) - (A\beta) - (\alpha B)\} - N\,\xi\eta.$$

Writing $(AB) - (A)(B)/N = \delta$ (as in Chap. III. §§ 11-12) and replacing $\xi$, $\eta$ by their values, this reduces to

$$\Sigma(xy) = \delta.$$

Whence

$$r = \frac{N\,\delta}{\sqrt{(A)(\alpha)(B)(\beta)}}. \qquad (12)$$

This value of $r$ can be used as a coefficient of association, but, unlike the association-coefficient of Chap. III. § 13, which is unity if either $(AB) = (A)$ or $(AB) = (B)$, $r$ only becomes unity if $(AB) = (A) = (B)$. This is the only case in which both frequencies $(\alpha B)$ and $(A\beta)$ can vanish so that $(AB)$ and $(\alpha\beta)$ correspond to the frequencies of two points $X_1 Y_1$, $X_2 Y_2$ on a line. Obviously this alone renders the numerical values of the two coefficients quite incomparable with each other. But further, while the association coefficient is the same for all tables derived from one another by multiplying rows or columns by arbitrary coefficients, the correlation coefficient (12) is greatest when $(A) = (\alpha)$ and $(B) = (\beta)$, i.e. when the table is symmetrical, and its value is lowered when the symmetrical table is rendered asymmetrical by increasing or reducing the number of A's or B's. For moderate degrees of association, the association coefficient gives much the larger values. The two coefficients possess, in fact, essentially different properties, and are different measures of association in the same sense that the geometric and arithmetic means are different forms of average, or the interquartile range and the standard-deviation different measures of dispersion. The student is again referred to ref. 3 of Chap. III. for a general discussion of various measures of association, including these and others, that have been proposed.
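The contrast between the coefficient (12) and the association-coefficient can be shown on small tables. In the sketch below the association-coefficient is taken to be the form Q = {(AB)(αβ) - (Aβ)(αB)} / {(AB)(αβ) + (Aβ)(αB)}; the frequencies are invented for the illustration and the function names are hypothetical.

```python
import math

def fourfold_r(ab, a_beta, alpha_b, alpha_beta):
    """Equation (12): r = N*delta / sqrt((A)(alpha)(B)(beta))."""
    n = ab + a_beta + alpha_b + alpha_beta
    a, alpha = ab + a_beta, alpha_b + alpha_beta      # totals (A), (alpha)
    b, beta = ab + alpha_b, a_beta + alpha_beta       # totals (B), (beta)
    delta = ab - a * b / n                            # (AB) - (A)(B)/N
    return n * delta / math.sqrt(a * alpha * b * beta)

def association_q(ab, a_beta, alpha_b, alpha_beta):
    """Association-coefficient Q (assumed form, see the lead-in)."""
    cross_1, cross_2 = ab * alpha_beta, a_beta * alpha_b
    return (cross_1 - cross_2) / (cross_1 + cross_2)

# A moderate, symmetrical association: Q is much the larger of the two.
print(fourfold_r(30, 20, 20, 30), association_q(30, 20, 20, 30))

# (AB) = (A) but (AB) != (B): Q is unity, r is not.
print(fourfold_r(20, 0, 30, 50), association_q(20, 0, 30, 50))

# (AB) = (A) = (B): both coefficients are unity.
print(fourfold_r(50, 0, 0, 50), association_q(50, 0, 0, 50))
```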

11. The Correlation-coefficient for all possible pairs of N values of a Variable. In certain cases a correlation-table is formed by combining N observations in pairs in all possible ways. If, for example, a table is being formed to illustrate, say, the correlation between brothers for stature, and there are three brothers in one family with statures 5 ft. 9, 5 ft. 10 and 5 ft. 11, these are regarded as giving the six pairs

    5 ft.  9 with 5 ft. 10        5 ft. 10 with 5 ft. 11
    5 ft.  9 with 5 ft. 11        5 ft. 11 with 5 ft.  9
    5 ft. 10 with 5 ft.  9        5 ft. 11 with 5 ft. 10

which may be entered into the table. The entire table will be formed from the aggregate of such subsidiary tables, each due to one family. Let it be required to find the correlation-coefficient, however, for a single subsidiary table, due to a family with $N$ members, the number of pairs being therefore $N(N-1)$. As each observed value of the variable occurs once in combination with every other value, i.e. $N-1$ times, the means and standard-deviations of the totals of the correlation-table are the same as for the original observations, say $M$ and $\sigma$. If $x_1, x_2, x_3, \ldots$ be the observed deviations, the product-sum may be written

$$
\begin{aligned}
&x_1x_2 + x_1x_3 + x_1x_4 + \ldots \\
{}+{}&x_2x_1 + x_2x_3 + x_2x_4 + \ldots \\
{}+{}&x_3x_1 + x_3x_2 + x_3x_4 + \ldots \\
{}+{}&\ldots \\
={}&x_1\{\Sigma(x) - x_1\} + x_2\{\Sigma(x) - x_2\} + x_3\{\Sigma(x) - x_3\} + \ldots \\
={}&-x_1^2 - x_2^2 - x_3^2 - \ldots = -N\sigma^2,
\end{aligned}
$$

whence, there being $N(N-1)$ pairs,

$$r = \frac{-N\sigma^2}{N(N-1)\,\sigma^2} = -\frac{1}{N-1}. \qquad (13)$$

For $N = 2, 3, 4, \ldots$ this gives the successive values of $r = -1, -\tfrac{1}{2}, -\tfrac{1}{3}, \ldots$ It is clear that the first value is right, for two values $x_1$, $x_2$ only determine the two points $(x_1, x_2)$ and $(x_2, x_1)$, and the slope of the line joining them is negative.

The student should notice that a corresponding negative association will arise between the first and second member of the pair if all possible pairs are formed in a mixture of A's and α's. Looking at the association, in fact, from the standpoint of § 10, the equation (13) still holds, even if the variables can only assume two values, e.g. 0 and 1. This result is utilised in § 14 of Chapter XIV.
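Equation (13) can be verified by forming the pairs explicitly. A minimal sketch follows; the statures are those of the example converted to inches, while the other sets of values are arbitrary illustrations.

```python
import math
from itertools import permutations

def all_pairs_correlation(values):
    """Correlation-coefficient of the table formed from every ordered pair of distinct members."""
    pairs = list(permutations(values, 2))     # N(N-1) ordered pairs
    n = len(pairs)
    firsts = [p[0] for p in pairs]
    seconds = [p[1] for p in pairs]
    mf, ms = sum(firsts) / n, sum(seconds) / n
    sxy = sum((a - mf) * (b - ms) for a, b in pairs)
    sxx = sum((a - mf) ** 2 for a in firsts)
    syy = sum((b - ms) ** 2 for b in seconds)
    return sxy / math.sqrt(sxx * syy)

brothers = [69, 70, 71]                        # 5 ft. 9, 5 ft. 10, 5 ft. 11 in inches
print(all_pairs_correlation(brothers))         # -1/(3-1)
print(all_pairs_correlation([1, 2, 4, 8]))     # -1/(4-1)
print(all_pairs_correlation([0, 0, 1, 1, 1]))  # two-valued case, -1/(5-1)
```

The result depends only on the number of members, not on their values, as the algebra above requires.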

12. Correlation due to Heterogeneity of Material. The following theorem offers some analogy with the theorem of Chap. IV. § 6 for attributes: If X and Y are uncorrelated in each of two records, they will nevertheless exhibit some correlation when the two records are mingled, unless the mean value of X in the second record is identical with that in the first record, or the mean value of Y in the second record is identical with that in the first record, or both.

This follows almost at once, for if $M_1$, $M_2$ are the mean values of $X$ in the two records, $K_1$, $K_2$ the mean values of $Y$, $N_1$, $N_2$ the numbers of observations, and $M$, $K$ the means when the two records are mingled, the product-sum of deviations about $M$, $K$ is

$$N_1(M_1 - M)(K_1 - K) + N_2(M_2 - M)(K_2 - K).$$

Evidently the first term can only be zero if $M = M_1$ or $K = K_1$. But the first condition gives

$$\frac{N_1M_1 + N_2M_2}{N_1 + N_2} = M_1,$$

that is, $M_1 = M_2$. Similarly, the second condition gives $K_1 = K_2$. Both the first and second terms can, therefore, only vanish if $M_1 = M_2$ or $K_1 = K_2$. Correlation may accordingly be created by the mingling of two records in which $X$ and $Y$ vary round different means. (For a more general form of the theorem cf. ref. 20.)

13. Reduction of Correlation due to mingling of uncorrelated with correlated pairs. Suppose that $n_1$ observations of $x$ and $y$ give a correlation-coefficient

$$r_1 = \frac{\Sigma(xy)}{n_1\,\sigma_x\sigma_y}.$$

Now let $n_2$ pairs be added to the material, the means and standard-deviations of $x$ and $y$ being the same as in the first series of observations, but the correlation zero. The value of $\Sigma(xy)$ will then be unaltered, and we will have

$$r_2 = \frac{\Sigma(xy)}{(n_1 + n_2)\,\sigma_x\sigma_y}.$$

Whence

$$\frac{r_2}{r_1} = \frac{n_1}{n_1 + n_2}. \qquad (14)$$

Suppose, for example, that a number of bones of the human skeleton have been disinterred during some excavations, and a correlation $r_2$ is observed between pairs of bones presumed to come from the same skeleton, this correlation being rather lower than might have been expected, and subject to some uncertainty owing to doubts as to the allocation of certain bones. If $r_1$ is the value that would be expected from other records, the difference might be accounted for on the hypothesis that, in a proportion $(r_1 - r_2)/r_1$ of all the pairs, the bones do not really belong to the same skeleton, and have been virtually paired at random. (For a more general form of the theorem cf. again ref. 20.)
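Both results lend themselves to a short arithmetical check. The sketch below uses two invented records of four pairs each, uncorrelated within themselves but varying round different means, and then applies equation (14) to hypothetical values of the expected and observed correlations; none of the figures comes from the text.

```python
import math

def corr(pairs):
    """Product-moment correlation-coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

def random_pair_fraction(r_expected, r_observed):
    """From (14): proportion n2/(n1+n2) of pairs virtually paired at random."""
    return (r_expected - r_observed) / r_expected

# Two records, each with zero correlation, but with different means (§ 12).
record_1 = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
record_2 = [(x + 10, y + 10) for x, y in record_1]

print(corr(record_1), corr(record_2))   # zero in each record taken alone
print(corr(record_1 + record_2))        # high correlation when mingled

# Hypothetical values: expected r1 = 0.8, observed r2 = 0.6 (§ 13).
print(random_pair_fraction(0.8, 0.6))
```

In the mingled record the product-sum is 4 x 5 x 5 + 4 x 5 x 5 = 200, exactly the expression given in § 12, against a sum of squares of 208 for each variable.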

14. The Weighted Mean. The arithmetic mean $M$ of a series of values of a variable $X$ was defined as the quotient of the sum of those values by their number $N$, or

$$M = \Sigma(X)/N.$$

If, on the other hand, we multiply each several observed value of $X$ by some numerical coefficient or weight $W$, the quotient of the sum of such products by the sum of the weights is defined as a weighted mean of $X$, and may be denoted by $M'$ so that

$$M' = \Sigma(W.X)/\Sigma(W).$$

The distinction between "weighted" and "unweighted" means is, it should be noted, very often formal rather than essential, for the "weights" may be regarded as actual, estimated, or virtual frequencies. The weighted mean then becomes simply an arithmetic mean, in which some new quantity is regarded as the unit. Thus if we are given the means $M_1, M_2, M_3, \ldots, M_r$ of $r$ series of observations, but do not know the number of observations in every series, we may form a general average by taking the arithmetic mean of all the means, viz. $\Sigma(M)/r$, treating the series as the unit. But if we know the number of observations in every series it will be better to form the weighted mean $\Sigma(NM)/\Sigma(N)$, weighting each mean in proportion to the number of observations in the series on which it is based. The second form of average would be quite correctly spoken of as a weighted mean of the means of the several series; at the same time it is simply the arithmetic mean of all the series pooled together, i.e. the arithmetic mean obtained by treating the observation and not the series as the unit.

15. To give an arithmetical illustration, if a commodity is sold at different prices in different markets, it will be better to form an average price, not by taking the arithmetic mean of the several market prices, treating the market as the unit, but by weighting each price in proportion to the quantity sold at that price, if known, i.e. treating the unit of quantity as the unit of frequency.

Thus if wheat has been sold in market A at an average price of 29s. 1d. per quarter, in market B at an average price of 27s. 7d., and in market C at an average price of 28s. 4d., we may, if no statement is made as to the quantities sold at these prices (as very often happens in the case of statements as to market prices), take the arithmetic mean (28s. 4d.) as the general average. But if we know that 23,930 qrs. were sold at A, only 26 qrs. at B, and 3933 qrs. at C, it will be better to take the weighted mean

$$\frac{(29s.\ 1d. \times 23{,}930) + (27s.\ 7d. \times 26) + (28s.\ 4d. \times 3933)}{27{,}889} = 29s.\ 0d.$$

to the nearest penny. This is appreciably higher than the arithmetic mean price, which is lowered by the undue importance attached to the small markets B and C.

In the case of index-numbers for exhibiting the changes in average prices from year to year (cf. Chap. VII. § 25), it may make a sensible difference whether we take the simple arithmetic mean of the index-numbers for different commodities in any one year as representing the price-level in that year, or weight the index-numbers for the several commodities according to their importance from some point of view; and much has been written as to the weights to be chosen. If, for example, our standpoint be that of some average consumer, we may take as the weight for each commodity the sum which he spends on that commodity in an average year, so that the frequency of each commodity is taken as the number of shillings or pounds spent thereon instead of simply as unity.

Rates or ratios like the birth-, death-, or marriage-rates of a country may be regarded as weighted means. For, treating the rate for simplicity as a fraction, and not as a rate per 1000 of the population,

$$\text{Birth-rate of whole country} = \frac{\text{total births}}{\text{total population}} = \frac{\Sigma(\text{birth-rate in each district} \times \text{population in that district})}{\Sigma(\text{population of each district})},$$

i.e. the rate for the whole country is the mean of the rates in the different districts, weighting each in proportion to its population. We use the weighted and unweighted means of such rates as illustrations in § 17 below.

16. It is evident that any weighted mean will in general differ from the unweighted mean of the same quantities, and it is required to find an expression for this difference. If $r$ be the correlation between weights and variables, $\sigma_w$ and $\sigma_x$ the standard-deviations, and $\bar{w}$ the mean weight, we have at once

$$\Sigma(W.X) = N(M.\bar{w} + r\,\sigma_w\sigma_x),$$

whence

$$M' = M + r\,\sigma_x\,\frac{\sigma_w}{\bar{w}}. \qquad (15)$$

That is to say, if the weights and variables are positively correlated, the weighted mean is the greater; if negatively, the less.
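The wheat example and equation (15) can be worked through as follows. The prices and quantities are those quoted in § 15, converted to pence; the function names are invented for this sketch.

```python
import math

def weighted_mean(values, weights):
    """M' = sum(W.X) / sum(W)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def weighted_mean_via_correlation(values, weights):
    """Equation (15): M' = M + r * sigma_x * sigma_w / (mean weight)."""
    n = len(values)
    m = sum(values) / n
    w_bar = sum(weights) / n
    sd_x = math.sqrt(sum((x - m) ** 2 for x in values) / n)
    sd_w = math.sqrt(sum((w - w_bar) ** 2 for w in weights) / n)
    cov = sum((x - m) * (w - w_bar) for x, w in zip(values, weights)) / n
    r = cov / (sd_x * sd_w)
    return m + r * sd_x * sd_w / w_bar

def shillings_pence(pence):
    """Convert pence to (shillings, pence), rounded to the nearest penny."""
    return divmod(round(pence), 12)

prices = [29 * 12 + 1, 27 * 12 + 7, 28 * 12 + 4]   # markets A, B, C in pence per quarter
quarters = [23930, 26, 3933]                        # quantities sold

print(shillings_pence(sum(prices) / len(prices)))       # unweighted mean
print(shillings_pence(weighted_mean(prices, quarters)))  # weighted mean
print(weighted_mean_via_correlation(prices, quarters))   # same value from (15)
```

Here price and quantity are positively correlated over the three markets, so the weighted mean is the greater, in accordance with (15).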

In some cases $r$ is very small, and then weighting makes little difference, but in others the difference is large and important, $r$ having a sensible value and $\sigma_x\sigma_w/\bar{w}$ a large value.

17. The difference between weighted and unweighted means of death-rates, birth-rates or other rates on the population in different districts is, for instance, nearly always of importance. Thus we have the following figures for rates of pauperism (Jour. Stat. Soc., vol. lix. (1896), p. 349):

[Table of rates of pauperism, January 1: figures not recoverable.]

For the birth-rate, on the other hand, assuming that $\sigma_w/\bar{w}$ is approximately the same for the decade 1881-90 as in 1891, the standard-deviation of the rates being 4.08,

$$r = \frac{32.34 - 30.34}{4.08} \times \frac{459}{564} = +0.40.$$

The closeness of the numerical values of $r$ in the two cases is, of course, accidental.

18. The principle of weighting finds one very important application in the treatment of such rates as death-rates, which are largely affected by the age and sex-composition of the population. Neglecting, for simplicity, the question of sex, suppose the numbers of deaths are noted in a certain district for, say, the age-groups 0-, 10-, 20-, -40, etc., in which the fractions of the whole population are $p_1, p_2, p_3$, etc., where $\Sigma(p) = 1$. Let the death-rates for the corresponding age-groups be $d_1, d_2, d_3$, etc. Then the ordinary or crude death-rate for the district is

$$D = \Sigma(d.p). \qquad (16)$$

For some other district taken as a basis of comparison, perhaps the country as a whole, the death-rates and fractions of the population in the several age-groups may be $\delta_1, \delta_2, \delta_3, \ldots$, $\pi_1, \pi_2, \pi_3, \ldots$, and the crude death-rate

$$\Delta = \Sigma(\delta.\pi). \qquad (17)$$

Now $D$ and $\Delta$ may differ either because the $d$'s and $\delta$'s differ, or because the $p$'s and $\pi$'s differ, or both. It may happen that really both districts are about equally healthy, and the death-rates approximately the same for all age-classes, but, owing to a difference of weighting, the first average may be markedly higher than the second, or vice versa. If the first district be a rural district and the second urban, for instance, there will be a larger proportion of the old in the former, and it may possibly have a higher crude death-rate than the second, in spite of lower death-rates in every class. The comparison of crude death-rates is therefore liable to lead to erroneous conclusions. The difficulty may be got over by averaging the age-class death-rates in the district not with the weights $p_1, p_2, p_3, \ldots$ given by its own population, but with the weights $\pi_1, \pi_2, \pi_3, \ldots$ given by the population of the standard district. The corrected death-rate for the district will then be

$$D' = \Sigma(d.\pi), \qquad (18)$$

and $D'$ and $\Delta$ will be comparable as regards age-distribution.

There is obviously no difficulty in taking sex into account as well as age if necessary. The death-rates must be noted for each sex separately in every age-class and averaged with a system of weights based on the standard population. The method is also of importance for comparing death-rates in different classes of the population, e.g. those engaged in given occupations, as well as in different districts, and is used for both these purposes in the Decennial Supplements to the Reports of the Registrar General for England and Wales (ref. 16).

19. Difficulty may arise in practical cases from the fact that for the districts or classes which it is desired to compare with the standard population, the death-rates $d_1, d_2, d_3, \ldots$ are not known, but only the crude rates $D$ and the fractional populations of the age-classes $p_1, p_2, p_3, \ldots$ The difficulty may be partially obviated (cf. Chap. IV. § 9, pp. 51-3) by forming what may be termed a potential or standard death-rate $\Delta'$ for the class or district, $\Delta'$ being given by

$$\Delta' = \Sigma(\delta.p), \qquad (19)$$

i.e. the rates of the standard population averaged with the weights of the district population. It is the crude death-rate that there would be in the district if the rate in every age-class were the same as in the standard population. An approximate corrected death-rate for the district or class is then given by

$$D'' = D \times \frac{\Delta}{\Delta'}. \qquad (20)$$

$D''$ is not necessarily, nor generally, the same as $D'$. It can only be the same if

$$\frac{\Sigma(d.\pi)}{\Sigma(\delta.\pi)} = \frac{\Sigma(d.p)}{\Sigma(\delta.p)}.$$

This will hold good if, e.g., the death-rates in the standard population and the district stand to one another in the same ratio in all age-classes, i.e.

$$\delta_1/d_1 = \delta_2/d_2 = \delta_3/d_3 = \text{etc.}$$

This method of correction is used in the Annual Summaries of the Registrar General for England and Wales.

Both methods of correction, that of § 18 and that of the present section, are of great and growing importance. They are obviously applicable to other rates besides death-rates, e.g. birth-rates (cf. refs. 17, 18). Further, they may readily be extended into quite different fields.
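The two methods of correction, equations (18) and (20), can be set side by side on a small example. The age-class death-rates and population fractions below are invented purely for illustration (three age-classes, with the district rates exactly 1.2 times the standard rates), and the function names are hypothetical.

```python
def weighted_rate(rates, fractions):
    """Sum of age-class rates weighted by the fractions of population in each class."""
    return sum(rate * frac for rate, frac in zip(rates, fractions))

def corrected_rate_direct(d, pi):
    """Equation (18): district rates averaged with the standard population weights."""
    return weighted_rate(d, pi)

def corrected_rate_indirect(crude_d, p, delta, pi):
    """Equation (20): D'' = D * Delta / Delta', with Delta' = sum(delta.p) from (19)."""
    standard_crude = weighted_rate(delta, pi)     # Delta, equation (17)
    potential = weighted_rate(delta, p)           # Delta', equation (19)
    return crude_d * standard_crude / potential

# Hypothetical standard population: death-rates and age-distribution.
delta = [0.020, 0.005, 0.030]
pi = [0.3, 0.5, 0.2]

# Hypothetical district: every rate 1.2 times the standard, but an older population.
d = [0.024, 0.006, 0.036]
p = [0.2, 0.4, 0.4]

crude = weighted_rate(d, p)                       # D, equation (16)
print(crude, weighted_rate(delta, pi))            # crude rates D and Delta
print(corrected_rate_direct(d, pi))               # D'
print(corrected_rate_indirect(crude, p, delta, pi))  # D''
```

Because the district rates here stand in a constant ratio to the standard rates, the two corrected rates coincide; with rates not in a constant ratio they would in general differ.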

Thus it has been suggested (ref. 19) that corrected average heights or corrected average weights of the children in different schools might be obtained on the basis of a standard school population of given age and sex composition, or indeed of given composition as regards hair and eye-colour as well.

20. In §§ 14-17 we have dealt only with the theory of the weighted arithmetic mean, but it should be noted that any form of average can be weighted. Thus a weighted median can be formed by finding the value of the variable such that the sum of the weights of lesser values is equal to the sum of the weights of greater values. A weighted mode could be formed by finding the value of the variable for which the sum of the weights was greatest, allowing for the smoothing of casual fluctuations. Similarly, a weighted geometric mean could be calculated by weighting the logarithms of every value of the variable before taking the arithmetic mean, i.e.

$$\log G_w = \frac{\Sigma(W.\log X)}{\Sigma(W)}.$$

REFERENCES.

Effect of Grouping Observations.

(1) Sheppard, W. F., "On the Calculation of the Average Square, Cube, etc., of a large number of Magnitudes," Jour. Roy. Stat. Soc., vol. lx., 1897, p. 698.
(2) Sheppard, W. F., "On the Calculation of the most probable Values of Frequency Constants for Data arranged according to Equidistant Divisions of a Scale," Proc. Lond. Math. Soc., vol. xxix., p. 353. (The result given in eqn. (4) for the correction of the standard-deviation is Sheppard's result.)
(3) Sheppard, W. F., "The Calculation of Moments of a Frequency-distribution," Biometrika, vol. v., 1907, p. 450.
(4) Pearson, Karl, and others [editorial], "On an Elementary Proof of Sheppard's Formulae for correcting Raw Moments, and on other allied points," Biometrika, vol. iii., 1904, p. 308.
(5) Pearson, Karl, "On the Influence of 'Broad Categories' on Correlation," Biometrika, vol. ix., 1913, pp. 116-139.

Effect of Errors of Observation on the Correlation-coefficient.

(6) Spearman, C., "The Proof and Measurement of Association between Two Things," Amer. Jour. of Psychology, vol. xv., 1904, p. 88. (Formula (8).)
(7) Spearman, C., "Demonstration of Formulae for True Measurement of Correlation," Amer. Jour. of Psychology, vol. xviii., 1907, p. 161. (Proof of formula (8), but on different lines to that given in the text, which was communicated to Spearman in 1908, and published by Brown and by Spearman in (8) and (10).)
(8) Spearman, C., "Correlation calculated from Faulty Data," British Jour. of Psychology, vol. iii., 1910, p. 271.

(9) Jacob, S. M., "On the Correlations of Areas of Matured Crops and the Rainfall," etc., Mem. Asiatic Soc. Bengal, vol. ii., 1910, p. 347. (§ 7 contains remarks on the effects of errors on the correlations and regressions, with especial reference to this problem.)
(10) Brown, W., "Some Experimental Results in Correlation," Proceedings of the Sixth International Congress of Psychology, Geneva, August 1909.

Correlations between Indices.

(11) Pearson, Karl, "On a Form of Spurious Correlation which may arise when Indices are used in the Measurement of Organs," Proc. Roy. Soc., vol. lx., 1897, p. 489. (§§ 8, 9.)
(12) Galton, Francis, "Note to the Memoir by Prof. Karl Pearson on Spurious Correlation," ibid., p. 498.
(13) Yule, G. U., "On the Interpretation of Correlations between Indices or Ratios," Jour. Roy. Stat. Soc., vol. lxxiii., 1910, p. 644.
(14) Brown, J. W., M. Greenwood, and Frances Wood, "A Study of Index-Correlations," Jour. Roy. Stat. Soc., vol. lxxvii., 1914, pp. 317-46.

The Weighted Mean.

(15) Pearson, Karl, "Note on Reproductive Selection," Proc. Roy. Soc., vol. lix., 1896, p. 301. (Eqn. (15).)

Correction of Death-rates, etc.

(16) Tatham, John, Supplement to the Fifty-fifth Annual Report of the Registrar-General for England and Wales: Introductory Letters to Pt. I. and Pt. II. (Cd. 7769, 1895; 8503, 1897). Also Supplement to Sixty-fifth Report: Introductory Letter to Pt. II. (Cd. 2619, 1908).
(17) Newsholme, A., and T. H. C. Stevenson, "The Decline of Human Fertility in the United Kingdom and other Countries, as shown by Corrected Birth-rates," Jour. Roy. Stat. Soc., vol. lxix., 1906, p. 34.
(18) Yule, G. U., "On the Changes in the Marriage and Birth Rates in England and Wales during the past Half Century," etc., Jour. Roy. Stat. Soc., vol. lxix., 1906, p. 88.
(19) Heron, David, "The Influence of Defective Physique and Unfavourable Home Environment on the Intelligence of School Children," Eugenics Laboratory Memoirs, viii., Dulau & Co., London, 1910.

Miscellaneous.

(20) Pearson, Karl, Alice Lee, and L. Bramley-Moore, "Genetic (reproductive) Selection: Inheritance of Fertility in Man and of Fecundity in Thoroughbred Racehorses," Phil. Trans. Roy. Soc., Series A, vol. cxcii., 1899, p. 257. (A number of theorems of general application are given in the introductory part of this memoir, some of which have been utilised in §§ 12-13 of the preceding chapter.)

EXERCISES.

1. Find the values obtained for the standard-deviations in Examples ii. (p. 139) and iii. (p. 141) of Chapter VIII. on applying Sheppard's correction for grouping.

2. Show that if a range of six times the standard-deviation covers at least 18 class-intervals (cf. Chap. VI. § 5), Sheppard's correction will make a difference of less than 0.5 per cent. in the rough value of the standard-deviation.

3. (Data from the decennial supplements to the Annual Reports of the Registrar-General for England and Wales.)

The following particulars are found for 36 small registration districts in which the number of births in a decade ranged between 1500 and 2500:

[Table, first column headed "Decade": figures not recoverable.]

... holds for all values of $x_1$, $x_2$ and $x_3$ (which are, in our usual notation, deviations from their respective arithmetic means), find the correlations between $x_1$, $x_2$ and $x_3$ in terms of their standard-deviations and the values of $a$, $b$ and $c$.

10. What is the effect on a weighted mean of errors in the weights or the quantities weighted, such errors being uncorrelated with each other, with the weights, or with the variables: (1) if the arithmetic mean values of the errors are zero; (2) if the arithmetic mean values of the errors are not zero?

11. (Pearson, "On a Generalised Theory of Alternative Inheritance," Phil. Trans., A, vol. cciii., 1904, p. 53.) If we consider the correlation between number of recessive couplets in parent and in offspring, in a Mendelian population breeding at random (such as would ultimately result from an initial cross between a pure dominant and a pure recessive), the correlation is found to be 1/3 for a total number of couplets $n$. If $n = 1$, the only possible numbers of recessive couplets are 0 and 1, and the correlation table between parent and offspring reduces to the form

[Table of parent against offspring: figures not recoverable.]

CHAPTER XII.

PARTIAL CORRELATION.

1-2. Introductory explanation; 3. Direct deduction of the formulae for two variables; 4. Special notation for the general case: generalised regressions; 5. Generalised correlations; 6. Generalised deviations and standard-deviations; 7-8. Theorems concerning the generalised product-sums; 9. Direct interpretation of the generalised regressions; 10-11. Reduction of the generalised standard-deviation; 12. Reduction of the generalised regression; 13. Reduction of the generalised correlation-coefficient; 14. Arithmetical work: Example i., Example ii.; 15. Geometrical representation of correlation between three variables by means of a model; 16. The coefficient of n-fold correlation; 17. Expression of regressions and correlations of lower in terms of those of higher order; 18. Limiting inequalities between the values of correlation-coefficients necessary for consistence; 19. Fallacies.

1. In Chapters IX.-XI. the theory of the correlation-coefficient for a single pair of variables has been developed and its applications illustrated. But in the case of statistics of attributes we found it necessary to proceed from the theory of simple association for a single pair of attributes to the theory of association for several attributes, in order to be able to deal with the complex causation characteristic of statistics; and similarly the student will find it impossible to advance very far in the discussion of many problems in correlation without some knowledge of the theory of multiple correlation, or correlation between several variables. In such a problem as that of illustration i., Chap. X., for instance, it might be found that changes in pauperism were highly correlated (positively) with changes in the out-relief ratio, and also with changes in the proportion of old; and the question might arise how far the first correlation was due merely to a tendency to give out-relief more freely to the old than the young, i.e. to a correlation between changes in out-relief and changes in proportion of old. The question could not at the present stage be answered by working out the correlation-coefficient between the last pair of variables, for we have as yet no guide as to how far a correlation between the variables 1 and 2 can be accounted for by correlations between 1 and 3 and 2 and 3.

Again, in the case of illustration iii., Chap. X., a marked positive correlation might be observed between, say, the bulk of a crop and the rainfall during a certain period, and practically no correlation between the crop and the accumulated temperature during the same period; and the question might arise whether the last result might not be due merely to a negative correlation between rain and accumulated temperature, the crop being favourably affected by an increase of accumulated temperature if other things were equal, but failing as a rule to obtain this benefit owing to the concomitant deficiency of rain. In the problem of inheritance in a population, the corresponding problem is of great importance, as already indicated in Chapter IV. It is essential for the discussion of possible hypotheses to know whether an observed correlation between, say, grandson and grandparent can or cannot be accounted for solely by observed correlations between grandson and parent, parent and grandparent.

2. Problems of this type, in which it is necessary to consider simultaneously the relations between at least three variables, and possibly more, may be treated by a simple and natural extension of the method used in the case of two variables. The latter case was discussed by forming linear equations between the two variables, assigning such values to the constants as to make the sum of the squares of the errors of estimate as low as possible: the more complicated case may be discussed by forming linear equations between any one of the $n$ variables involved, taking each in turn, and the $n-1$ others, again assigning such values to the constants as to make the sum of the squares of the errors of estimate a minimum. If the variables are $X_1, X_2, X_3, \ldots, X_n$, the equation will be of the form

$$X_1 = a + b_2.X_2 + b_3.X_3 + \ldots + b_n.X_n.$$

If in such a generalised regression or characteristic equation we find a sensible positive value for any one coefficient such as $b_2$, we know that there must be a positive correlation between $X_1$ and $X_2$ that cannot be accounted for by mere correlations of $X_1$ and $X_2$ with $X_3$, $X_4$, or $X_n$, for the effects of changes in these variables are allowed for in the remaining terms on the right. The magnitude of $b_2$ gives, in fact, the mean change in $X_1$ associated with a unit change in $X_2$ when all the remaining variables are kept constant. The correlation between $X_1$ and $X_2$ indicated by $b_2$ may be termed a partial correlation, as corresponding with the partial association of Chapter IV., and it is required to deduce from the values of the coefficients $b$, which may be termed partial regressions, partial coefficients of correlation giving the correlation between $X_1$ and $X_2$ or other pair of variables when the remaining variables $X_3 \ldots X_n$ are kept constant, so far as this may be done with a linear equation.

2 a minimum. Then it is required to determine a^ and ij. Sj. though possibly. In Chapter IX. Suppose any value whatever to be assigned to Jjj.e. It will first. § 14). and a series of values of a-^ to be tried. for which s^. so far as this may be done with a linear equation.. introducing a notation that can be conveniently adapted to more. If therefore the values of «j 2 were plotted to the values of a^ on a diagram... in a regression-equation. and the meaning of the coefficient in the more general case was subsequently investigated. zero. we write associated pairs Put more briefly. Such a process is not conveniently applicable when a number of variables are to be taken into account.2 is in using regression-equation (a) . however.. With this explanatory introduction. The best value of Oj. the value of the coefficient r ^^pjcr-^a-^ was deduced on the special assumption that the means of all arrays were strictly collinear. Let us take the arithmetic means of the variables aS origins of measurement. a curve would be obtained more or less like that of fig..XII.^ attained its so that «j.j could never become negative. for all deviations x^ and x„. if the root-mean-square value of the errors of estimate (cf. the least possible. be as well to revert briefly to the case of two variables. —PARTIAL CORRELATION. in the regression-equation X^ . required. 231 lation giving the correlation between Xj and variables when the remaining variables X^ or other pair of are kept constant. so as to make the sum of the squares of the errors of estimate a minimum. IX.j being calculatted for each. we may now proceed to the algebraic theory of such generalised regression-equations and of multiple correlation in general. and the problem has to be faced directly i. it is required to make Sj. and would continuously decrease as this best value was approached . if any. to determine the coefficients and constant term. 
Evidently a^j would be very large for values of aj that erred greatly either in excess or defect of the best value (for the given value of Sjj). will take this problem first for the case of two variables. or when changes in these variables are corrected or allowed for. Chap. For examples of such generalised regression-equations the student may turn to the illustrations worked out below (pp. the value of Sj. and let x^. 3. 44. x^ denote deviations of the two variables from their respective means. 239-247).j -a-^ + b-^^-x^Y. to obtain the greatest possible simplicity of treatment. but exceptionally. X^ : We *i = «i + V^2 • • • • («) of so as to make S(a.

S(a. Let Then if ttj and (aj + 8) be two such values. § 10). If.x^) or.i - «! + b-^^. now. ^x^{x^~b^^. ij^. again neglecting terms in 8^. IX.aj + b^^. minimum value.232 THEORY OF STATISTICS. the value of a-^ is the best for the assigned evidently. b^^ is to be assigned result of the best value. the corresponding valiies of Sj j are equal. we must have. S(a. This is the direct proof of the that no constant term need be introduced on' the right a regression-equation when written in terms of deviations from the arithmetic mean. by similar reasoning. but it can be calculated with much more exactness from the condition that if a\ a"i be two values close above and below the best. that is. = (c) breaking up the sum. whatever the value of Jjj. ij^ + S. That is. neglecting value of 6i2-^'^*> the term in 8^. or that the two lines of regression must pass through the mean (Chap. could be approximately estimated from such a diagram .x^) = <h = 0. say o-j j. *'^- "'^•^ S(V) . the equation gives.sc^)^ — 2{x-^ . We may therefore omit any constant term.aj + 8 + b^^-""!)^ when 8 is very small.i . for slightly differing values.
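The two-variable argument of § 3 can be checked numerically. The following is a minimal sketch in Python, assuming a small set of made-up paired observations (the data and all variable names are illustrative assumptions, not taken from the text). It computes $b_{12}$ from equation (d) and confirms that the errors of estimate satisfy conditions (b) and (c), and that their root-mean-square value agrees with $\sigma_1\sqrt{1 - r_{12}^2}$.

```python
from math import sqrt

# Hypothetical paired observations (illustrative only, not from the text).
X1 = [2.0, 4.0, 5.0, 7.0, 9.0]
X2 = [1.0, 3.0, 2.0, 5.0, 6.0]
N = len(X1)

# Deviations from the arithmetic means, as in section 3.
m1, m2 = sum(X1) / N, sum(X2) / N
x1 = [v - m1 for v in X1]
x2 = [v - m2 for v in X2]

# Product-sums and sums of squares.
s12 = sum(a * b for a, b in zip(x1, x2))
s11 = sum(a * a for a in x1)
s22 = sum(b * b for b in x2)

# Equation (d): the least-square value of the regression of x1 on x2.
b12 = s12 / s22

# Errors of estimate with no constant term (a1 = 0).
resid = [a - b12 * b for a, b in zip(x1, x2)]

cond_b = sum(resid)                               # condition (b)
cond_c = sum(b * e for b, e in zip(x2, resid))    # condition (c)

sigma1 = sqrt(s11 / N)
r12 = s12 / sqrt(s11 * s22)
sigma1_2 = sqrt(sum(e * e for e in resid) / N)    # minimum value of s_1.2

print(b12, cond_b, cond_c, sigma1_2, sigma1 * sqrt(1 - r12 ** 2))
```

Both `cond_b` and `cond_c` should vanish apart from rounding error, whatever data are supplied, since they are simply restatements of the least-square conditions.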

The value of $b_{12}$ given by equation (d) is the value found by the previous indirect method of Chapter IX. Evidently all the remaining results of Chapter IX. follow from this, and notably we have for $\sigma_{1.2}$, the minimum value of $s_{1.2}$, the standard-deviation of errors of estimate

$$\sigma_{1.2} = \sigma_1\sqrt{1 - r_{12}^2}.$$

From the fact that $b_{12}$ is determined so as to make the value of $\sum(x_1 - b_{12}x_2)^2$ the least possible, the method of determination is sometimes called the method of least squares.

4. Now apply the same method to the regression-equation for n variables. Writing the equation in terms of deviations, it follows from reasoning precisely similar to that given above that no constant term need be entered on the right-hand side. For the partial regression-coefficients (the coefficients of the x's on the right) a special notation will be used in order that the exact position of each coefficient may be rendered quite definite. The first subscript affixed to the letter b (which will always be used to denote a regression) will be the subscript of the x on the left (the dependent variable), and the second will be the subscript of the x to which it is attached; these may be called the primary subscripts. After the primary subscripts, and separated from them by a point, are placed the subscripts of all the remaining variables on the right-hand side as secondary subscripts. The regression-equation will therefore be written in the form

$$x_1 = b_{12.34\ldots n}\,x_2 + b_{13.24\ldots n}\,x_3 + \ldots + b_{1n.23\ldots(n-1)}\,x_n. \tag{1}$$

The order in which the secondary subscripts are written is, it should be noted, quite indifferent, but the order of the primary subscripts is material; e.g. $b_{12.3\ldots n}$ and $b_{21.3\ldots n}$ denote quite distinct coefficients, $x_1$ being the dependent variable in the first case and $x_2$ in the second. A coefficient with p secondary subscripts may be termed a regression of the pth order. The regressions $b_{12}$, $b_{21}$, $b_{13}$, $b_{31}$, etc., in the case of two variables may be regarded as of order zero, and may be termed total as distinct from partial regressions.

5. In the case of two variables, the correlation-coefficient $r_{12}$ may be regarded as defined by the equation

$$r_{12} = (b_{12}\,b_{21})^{\frac12}.$$

We shall generalise this equation in the form

$$r_{12.34\ldots n} = (b_{12.34\ldots n}\,b_{21.34\ldots n})^{\frac12}. \tag{2}$$

This is at present a pure definition of a new symbol.

It remains to be shown that $r_{12.34\ldots n}$ may really be regarded as, and possesses all the properties of, a correlation-coefficient; the name may, however, be applied to it, pending the proof. A correlation-coefficient with p secondary subscripts will be termed a correlation of order p. Evidently, in the case of a correlation-coefficient, the order in which both primary and secondary subscripts are written is indifferent, for the right-hand side of equation (2) is unaltered by writing 2 for 1 and 1 for 2. The correlations $r_{12}$, $r_{13}$, etc., may be regarded as of order zero, and spoken of as total, as distinct from partial, correlations.

6. If the regressions $b_{12.34\ldots n}$, $b_{13.24\ldots n}$, etc., be assigned the "best" values, as determined by the method of least squares, the difference between the actual value of $x_1$ and the value assigned by the right-hand side of the regression-equation (1), that is, the error of estimate, will be denoted by $x_{1.23\ldots n}$; i.e., as a definition, we have

$$x_{1.23\ldots n} = x_1 - b_{12.34\ldots n}\,x_2 - b_{13.24\ldots n}\,x_3 - \ldots - b_{1n.23\ldots(n-1)}\,x_n, \tag{3}$$

where $x_1, x_2, \ldots x_n$ are assigned any one set of observed values. Such an error (or residual, as it is sometimes called) denoted by a symbol with p secondary suffixes, will be termed a deviation of the pth order. Finally, we will define a generalised standard-deviation $\sigma_{1.23\ldots n}$ by the equation

$$N\sigma^2_{1.23\ldots n} = \sum(x^2_{1.23\ldots n}), \tag{4}$$

N being, as usual, the number of observations. A standard-deviation denoted by a symbol with p secondary suffixes will be termed a standard-deviation of the pth order, the standard-deviations $\sigma_1$, $\sigma_2$, etc., being regarded as of order zero, the standard-deviations $\sigma_{1.2}$, $\sigma_{2.1}$, etc. (cf. eqn. (d) of § 3) of the first order, and so on.

7. From the reasoning of § 3 it follows that the "least-square" values of the partial regressions $b_{12.34\ldots n}$, etc., will be given by equations of the form

$$\sum(x_1 - b_{12.34\ldots n}x_2 - \ldots - b_{1n.23\ldots(n-1)}x_n)^2 = \sum\{x_1 - (b_{12.34\ldots n} + \delta)x_2 - \ldots - b_{1n.23\ldots(n-1)}x_n\}^2,$$

$\delta$ being very small. That is, neglecting the term in $\delta^2$,

$$\sum x_2(x_1 - b_{12.34\ldots n}x_2 - b_{13.24\ldots n}x_3 - \ldots - b_{1n.23\ldots(n-1)}x_n) = 0,$$

or, more briefly, in terms of the notation of equation (3),

$$\sum(x_2\,x_{1.23\ldots n}) = 0. \tag{5}$$

There are a large number of these equations, (n - 1) for determining the coefficients $b_{12.34\ldots n}$, etc., (n - 1) again for determining the coefficients $b_{21.34\ldots n}$, etc., and so on.

The equations of the form (5) are sometimes termed the normal equations. If the student will follow the process by which (5) was obtained, he will see that when the condition is expressed that $b_{12.34\ldots n}$ shall possess the "least-square" value, $x_2$ enters into the product-sum with $x_{1.23\ldots n}$; when the same condition is expressed for $b_{13.24\ldots n}$, $x_3$ enters into the product-sum, and so on. Taking each regression in turn, in fact, every x the suffix of which is included in the secondary suffixes of $x_{1.23\ldots n}$ enters into the product-sum. The normal equations of the form (5) are therefore equivalent to the theorem:

The product-sum of any deviation of order zero with any deviation of higher order is zero, provided the subscript of the former occur among the secondary subscripts of the latter.

8. But it follows from this that

$$\sum(x_{1.34\ldots n}\,x_{2.34\ldots n}) = \sum x_{1.34\ldots n}(x_2 - b_{23.4\ldots n}x_3 - \ldots - b_{2n.34\ldots(n-1)}x_n) = \sum(x_{1.34\ldots n}\,x_2).$$

Similarly,

$$\sum(x_{1.34\ldots n}\,x_{2.34\ldots n}) = \sum(x_1\,x_{2.34\ldots n}).$$

Similarly again,

$$\sum(x_{1.34\ldots n}\,x_{2.34\ldots(n-1)}) = \sum(x_{1.34\ldots n}\,x_2),$$

and so on. Therefore, quite generally,

$$\sum(x_{1.34\ldots n}\,x_{2.34\ldots n}) = \sum(x_{1.34\ldots n}\,x_{2.34\ldots(n-1)}) = \ldots = \sum(x_{1.34\ldots n}\,x_2). \tag{6}$$

Comparing all the equal product-sums that may be obtained in this way, we see that the product-sum of any two deviations is unaltered by omitting any or all of the secondary subscripts of either which are common to the two, and, conversely, the product-sum of any deviation of order p with a deviation of order p + q, the p subscripts being the same in each case, is unaltered by adding to the secondary subscripts of the former any or all of the q additional subscripts of the latter.

It follows therefore from (5) that any product-sum is zero if all the subscripts of the one deviation occur among the secondary subscripts of the other. As the simplest case, we may note that $x_1$ is uncorrelated with $x_{2.1}$, and $x_2$ uncorrelated with $x_{1.2}$.

The theorems of this and of the preceding paragraph are of fundamental importance, and should be carefully remembered.
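The theorems of §§ 7 and 8 lend themselves to a direct numerical check in the simplest non-trivial case of three variables. The sketch below, in Python, assumes a small set of made-up observations; the helper names `deviations` and `product_sum` are hypothetical. It finds the least-square values of $b_{12.3}$ and $b_{13.2}$ by solving the two normal equations directly, forms the deviations $x_{1.23}$, $x_{1.3}$ and $x_{2.3}$, and evaluates the product-sums that the theorems assert to be zero or equal.

```python
# Hypothetical observations of three variables (illustrative only, not from the text).
X1 = [2.0, 4.0, 5.0, 7.0, 9.0, 3.0]
X2 = [1.0, 3.0, 2.0, 5.0, 6.0, 4.0]
X3 = [5.0, 1.0, 4.0, 2.0, 6.0, 3.0]

def deviations(values):
    """Deviations of the values from their arithmetic mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def product_sum(u, v):
    """Sum of products of paired deviations."""
    return sum(a * b for a, b in zip(u, v))

x1, x2, x3 = deviations(X1), deviations(X2), deviations(X3)
s12, s13, s23 = product_sum(x1, x2), product_sum(x1, x3), product_sum(x2, x3)
s22, s33 = product_sum(x2, x2), product_sum(x3, x3)

# Least-square b12.3 and b13.2: solve the two normal equations
#   b12.3 * s22 + b13.2 * s23 = s12
#   b12.3 * s23 + b13.2 * s33 = s13
det = s22 * s33 - s23 ** 2
b12_3 = (s12 * s33 - s13 * s23) / det
b13_2 = (s13 * s22 - s12 * s23) / det

# Deviation of the second order, equation (3).
x1_23 = [a - b12_3 * b - b13_2 * c for a, b, c in zip(x1, x2, x3)]

# Deviations of the first order, x3 being the secondary subscript.
b13, b23 = s13 / s33, s23 / s33
x1_3 = [a - b13 * c for a, c in zip(x1, x3)]
x2_3 = [b - b23 * c for b, c in zip(x2, x3)]

normal_eq_2 = product_sum(x2, x1_23)     # equation (5): should vanish
normal_eq_3 = product_sum(x3, x1_23)     # equation (5): should vanish
ps_full = product_sum(x1_3, x2_3)        # sum(x1.3 * x2.3)
ps_drop_right = product_sum(x1_3, x2)    # sum(x1.3 * x2), equal by (6)
ps_drop_left = product_sum(x1, x2_3)     # sum(x1 * x2.3), equal by (6)
ps_zero = product_sum(x2_3, x1_23)       # zero: subscripts 2, 3 occur in x1.23

print(normal_eq_2, normal_eq_3, ps_full, ps_drop_right, ps_drop_left, ps_zero)
```

The equalities hold for any data for which the determinant `det` does not vanish, since they are algebraic consequences of the normal equations rather than properties of the particular figures chosen.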

9. We have now from §§ 7 and 8

$$\sigma^2_{1.23\ldots n} = \sigma^2_{1.23\ldots(n-1)}\,(1 - r^2_{1n.23\ldots(n-1)}). \tag{9}$$

This is again the relation of the familiar form

$$\sigma^2_{1.n} = \sigma^2_1(1 - r^2_{1n})$$

with the secondary suffixes 23 ... (n - 1) added throughout. It is clear from (9) that $r_{1n.23\ldots(n-1)}$, like any correlation of order zero, cannot be numerically greater than unity. It also follows at once that if we have been estimating $x_1$ from $x_2, x_3, \ldots x_{n-1}$, $x_n$ will not increase the accuracy of estimate unless $r_{1n.23\ldots(n-1)}$ (not $r_{1n}$) differ from zero. This condition is somewhat interesting, as it leads to rather unexpected results. For example, if $r_{12} = +0.8$, $r_{13} = +0.4$, $r_{23} = +0.5$, it will not be possible to estimate $x_1$ with any greater accuracy from $x_2$ and $x_3$ than from $x_2$ alone, for the value of $r_{13.2}$ is zero (see below, § 13).

11. It should be noted that, in equation (9), any other subscript can be eliminated in the same way as subscript n from the suffix of $\sigma_{1.23\ldots n}$, so that a standard-deviation of order p can be expressed in p ways in terms of standard-deviations of the next lower order. This is useful as affording an independent check on arithmetic. Further, $\sigma_{1.23\ldots(n-1)}$ can be expressed in the same way in terms of $\sigma_{1.23\ldots(n-2)}$, and so on, so that we must have

$$\sigma^2_{1.23\ldots n} = \sigma^2_1(1 - r^2_{12})(1 - r^2_{13.2})(1 - r^2_{14.23}) \ldots (1 - r^2_{1n.23\ldots(n-1)}). \tag{10}$$

This is an extremely convenient expression for arithmetical use; the arithmetic can again be subjected to an absolute check by eliminating the subscripts in a different, say the inverse, order. Apart from the algebraic proof, it is obvious that the values must be identical; for if we are estimating one variable from n others, it is clearly indifferent in what order the latter are taken into account.

12. Any regression of order p may be expressed in terms of regressions of order p - 1. For we have

$$\sum(x_{1.34\ldots n}\,x_{2.34\ldots n}) = \sum(x_{1.34\ldots(n-1)}\,x_{2.34\ldots n})$$
$$= \sum x_{1.34\ldots(n-1)}(x_2 - b_{2n.34\ldots(n-1)}x_n - \text{terms in } x_3 \text{ to } x_{n-1})$$
$$= \sum(x_{1.34\ldots(n-1)}\,x_{2.34\ldots(n-1)}) - b_{2n.34\ldots(n-1)}\sum(x_{1.34\ldots(n-1)}\,x_{n.34\ldots(n-1)}).$$

Replacing $b_{2n.34\ldots(n-1)}$ by $b_{n2.34\ldots(n-1)}\,\sigma^2_{2.34\ldots(n-1)}/\sigma^2_{n.34\ldots(n-1)}$, we have

$$b_{12.34\ldots n}\,\sigma^2_{2.34\ldots n} = \sigma^2_{2.34\ldots(n-1)}\,(b_{12.34\ldots(n-1)} - b_{1n.34\ldots(n-1)}\,b_{n2.34\ldots(n-1)}),$$

or, from (9),

$$b_{12.34\ldots n} = \frac{b_{12.34\ldots(n-1)} - b_{1n.34\ldots(n-1)}\,b_{n2.34\ldots(n-1)}}{1 - b_{2n.34\ldots(n-1)}\,b_{n2.34\ldots(n-1)}}. \tag{11}$$

The student should note that this is an expression of the form

$$b_{12.n} = \frac{b_{12} - b_{1n}\,b_{n2}}{1 - b_{2n}\,b_{n2}}$$

with the subscripts 34 ... (n - 1) added throughout.

The coefficient $b_{12.34\ldots n}$ may therefore be regarded as determined from a regression-equation of the form

$$x_{1.34\ldots(n-1)} = b_{12.34\ldots n}\,x_{2.34\ldots(n-1)} + b_{1n.23\ldots(n-1)}\,x_{n.34\ldots(n-1)},$$

i.e. it is the partial regression of $x_{1.34\ldots(n-1)}$ on $x_{2.34\ldots(n-1)}$, $x_{n.34\ldots(n-1)}$ being given. As any other secondary suffix might have been eliminated in lieu of n, we might also regard it as the partial regression of $x_{1.45\ldots n}$ on $x_{2.45\ldots n}$, $x_{3.45\ldots n}$ being given, and so on.

13. From equation (11) we may readily obtain a corresponding equation for correlations. For (11) may be written

$$b_{12.34\ldots n} = \frac{r_{12.34\ldots(n-1)} - r_{1n.34\ldots(n-1)}\,r_{2n.34\ldots(n-1)}}{1 - r^2_{2n.34\ldots(n-1)}}\cdot\frac{\sigma_{1.34\ldots(n-1)}}{\sigma_{2.34\ldots(n-1)}}.$$

Hence, writing down the corresponding expression for $b_{21.34\ldots n}$ and taking the square root,

$$r_{12.34\ldots n} = \frac{r_{12.34\ldots(n-1)} - r_{1n.34\ldots(n-1)}\,r_{2n.34\ldots(n-1)}}{(1 - r^2_{1n.34\ldots(n-1)})^{\frac12}(1 - r^2_{2n.34\ldots(n-1)})^{\frac12}}. \tag{12}$$

This is, similarly, the expression for three variables

$$r_{12.3} = \frac{r_{12} - r_{13}\,r_{23}}{(1 - r^2_{13})^{\frac12}(1 - r^2_{23})^{\frac12}}$$

with the secondary subscripts added throughout, and $r_{12.34\ldots n}$ can be assigned interpretations corresponding to those of $b_{12.34\ldots n}$ above. Evidently equation (12) permits of an absolute check on the arithmetic in the calculation of all partial coefficients of an order higher than the first, for any one of the secondary suffixes of $r_{12.34\ldots n}$ can be eliminated so as to obtain another equation of the same form as (12), and the value obtained for $r_{12.34\ldots n}$ by inserting the values of the coefficients of lower order in the expression on the right must be the same in each case.

14. The equations now obtained provide all that is necessary for the arithmetical solution of problems in multiple correlation. The best mode of procedure on the whole, having calculated all the correlations and standard-deviations of order zero, is (1) to calculate the correlations of higher order by successive applications of equation (12); (2) to calculate any required standard-deviations by equation (10); (3) to calculate any required regressions by equation (8): the use of equation (11) for calculating the regressions of successive orders directly from each other is comparatively clumsy. We will give two illustrations.
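The successive application of equation (12) recommended in § 14 is naturally expressed as a recursion. The following Python sketch is offered only as an illustration: the function name `partial_r` and the dictionary layout for the total correlations are assumptions of this example, and the four-variable correlations are made-up figures. It checks the case quoted in § 10 ($r_{12} = 0.8$, $r_{13} = 0.4$, $r_{23} = 0.5$, giving $r_{13.2} = 0$) and the "absolute check" of § 13, that eliminating the secondary suffixes in a different order yields the same value.

```python
from math import sqrt

def partial_r(r, i, j, given=()):
    """Correlation between variables i and j, the variables in `given`
    being held constant, by repeated use of equation (12).
    `r` maps frozenset({a, b}) to the total (order-zero) correlation."""
    if not given:
        return r[frozenset((i, j))]
    *rest, n = given          # eliminate the last secondary subscript, n
    rest = tuple(rest)
    r_ij = partial_r(r, i, j, rest)
    r_in = partial_r(r, i, n, rest)
    r_jn = partial_r(r, j, n, rest)
    return (r_ij - r_in * r_jn) / sqrt((1 - r_in ** 2) * (1 - r_jn ** 2))

# The three-variable case quoted in section 10.
r3 = {frozenset((1, 2)): 0.8, frozenset((1, 3)): 0.4, frozenset((2, 3)): 0.5}
r13_2 = partial_r(r3, 1, 3, (2,))

# Hypothetical total correlations for four variables (illustrative only).
r4 = {
    frozenset((1, 2)): 0.5, frozenset((1, 3)): 0.4, frozenset((1, 4)): 0.3,
    frozenset((2, 3)): 0.2, frozenset((2, 4)): 0.1, frozenset((3, 4)): 0.3,
}
# The same second-order coefficient reached by two different routes.
r12_34_via_4 = partial_r(r4, 1, 2, (3, 4))   # subscript 4 eliminated first
r12_34_via_3 = partial_r(r4, 1, 2, (4, 3))   # subscript 3 eliminated first

print(r13_2, r12_34_via_4, r12_34_via_3)
```

The recursion recomputes lower-order coefficients many times over; for hand-sized problems this is harmless, but a tabular arrangement such as that used in the examples of the text avoids the repetition.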

The first illustration is for three and the second for four variables. The introduction of more variables does not involve any difference in the form of the arithmetic, but rapidly increases the amount.

Example i. The first illustration we shall take will be a continuation of example i., Chapter IX., in which the correlation was worked out between (1) the average earnings of agricultural labourers and (2) the percentage of the population in receipt of Poor-law relief in a group of 38 rural districts. In Question 2 of the same chapter are given (3) the ratios of the numbers in receipt of outdoor relief to the numbers relieved in the workhouse, in the same districts. Required to work out the partial correlations, regressions, etc., for these three variables.

Using as our notation $X_1$ = average earnings, $X_2$ = percentage of population in receipt of relief, $X_3$ = out-relief ratio, the first constants determined are

$M_1 = 15.9$ shillings, $\sigma_1 = 1.71$ shillings
$M_2 = 3.67$ per cent., $\sigma_2 = 1.29$ per cent.
$M_3 = 5.79$, $\sigma_3 = 3.09$

$r_{12} = -0.66$, $r_{13} = -0.13$, $r_{23} = +0.60$

To obtain the partial correlations, equation (12) is used direct in its simplest form

$$r_{12.3} = \frac{r_{12} - r_{13}\,r_{23}}{(1 - r^2_{13})^{\frac12}(1 - r^2_{23})^{\frac12}}.$$

The work is best done systematically and the results collected in tabular form, especially if logarithms are used, as many of the logarithms occur repeatedly. First it will be noted that the logarithms of $(1 - r^2)^{\frac12}$ occur in all the denominators; these had, accordingly, better be worked out at once and tabulated (col. 2 of the table below). In col. 3 is entered the product term of the numerator.

Subtracting the logarithms of the denominators from those of the numerators we have the logarithms of the correlations of the first-order. It is also as well to calculate at once, for reference in the calculation of standard-deviations of the second-order, the values of $\log\sqrt{1 - r^2}$ for the first-order coefficients (col. 9).

Having obtained the correlations we can now proceed to the regressions. If we wish to find all the regression-equations, we shall have six regressions to calculate from equations of the form

$$b_{12.3} = r_{12.3}\,\sigma_{1.3}/\sigma_{2.3}.$$

These will involve all the six standard-deviations of the first order $\sigma_{1.2}$, $\sigma_{1.3}$, $\sigma_{2.1}$, $\sigma_{2.3}$, $\sigma_{3.1}$, $\sigma_{3.2}$. But the standard-deviations of the first-order are not in themselves of much interest, and the standard-deviations of the second-order are so, as being the standard-errors or root-mean-square errors of estimate made in using the regression-equations of the second-order. We may save needless arithmetic, therefore, by replacing the standard-deviations of the first-order by those of the second, omitting the former entirely, and transforming the above equation for $b_{12.3}$ to the form

$$b_{12.3} = r_{12.3}\,\sigma_{1.23}/\sigma_{2.13}.$$

This transformation is a useful one and should be noted by the student. The values of each $\sigma$ may be calculated twice independently by the formulae of the form

$$\sigma_{1.23} = \sigma_1(1 - r^2_{12})^{\frac12}(1 - r^2_{13.2})^{\frac12} = \sigma_1(1 - r^2_{13})^{\frac12}(1 - r^2_{12.3})^{\frac12}$$

so as to check the arithmetic; the work is rapidly done if the values of $\log\sqrt{1 - r^2}$ have been tabulated. The values found are

$\log\sigma_{1.23} = 0.06146$, $\sigma_{1.23} = 1.15$
$\log\sigma_{2.13} = \bar{1}.84584$, $\sigma_{2.13} = 0.70$
$\log\sigma_{3.12} = 0.34571$, $\sigma_{3.12} = 2.22$

From these and the logarithms of the r's we have

$\log b_{12.3} = 0.08116$, $b_{12.3} = -1.21$
$\log b_{13.2} = \bar{1}.36174$, $b_{13.2} = +0.23$
$\log b_{21.3} = \bar{1}.64993$, $b_{21.3} = -0.45$
$\log b_{23.1} = \bar{1}.33917$, $b_{23.1} = +0.22$
$\log b_{31.2} = \bar{1}.93024$, $b_{31.2} = +0.85$
$\log b_{32.1} = 0.33891$, $b_{32.1} = +2.18$

That is, the regression-equations are

(1) $x_1 = -1.21\,x_2 + 0.23\,x_3$
(2) $x_2 = -0.45\,x_1 + 0.22\,x_3$
(3) $x_3 = +0.85\,x_1 + 2.18\,x_2$
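The arithmetic of Example i. can be retraced without logarithms. The sketch below, in Python, starts from the constants of order zero quoted in the example and applies equation (12), the formula for the second-order standard-deviations, and the transformed formula $b_{12.3} = r_{12.3}\,\sigma_{1.23}/\sigma_{2.13}$; the helper name `partial` and the variable names are assumptions of this illustration. To two decimal places the results agree with the values tabulated in the text.

```python
from math import sqrt

# Constants of order zero quoted in Example i.
r12, r13, r23 = -0.66, -0.13, 0.60
s1, s2, s3 = 1.71, 1.29, 3.09

def partial(r_ab, r_ac, r_bc):
    """Equation (12) in its simplest form: correlation of a and b, c constant."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# Partial correlations of the first order.
r12_3 = partial(r12, r13, r23)
r13_2 = partial(r13, r12, r23)
r23_1 = partial(r23, r12, r13)

# Standard-deviations of the second order.
s1_23 = s1 * sqrt((1 - r12 ** 2) * (1 - r13_2 ** 2))
s2_13 = s2 * sqrt((1 - r12 ** 2) * (1 - r23_1 ** 2))
s3_12 = s3 * sqrt((1 - r13 ** 2) * (1 - r23_1 ** 2))

# The same standard-deviation by the alternative route, as a check.
s1_23_check = s1 * sqrt((1 - r13 ** 2) * (1 - r12_3 ** 2))

# Regressions of the first order.
b12_3 = r12_3 * s1_23 / s2_13
b13_2 = r13_2 * s1_23 / s3_12
b21_3 = r12_3 * s2_13 / s1_23
b23_1 = r23_1 * s2_13 / s3_12
b31_2 = r13_2 * s3_12 / s1_23
b32_1 = r23_1 * s3_12 / s2_13

for name, value in [("r12.3", r12_3), ("r13.2", r13_2), ("r23.1", r23_1),
                    ("s1.23", s1_23), ("s2.13", s2_13), ("s3.12", s3_12),
                    ("b12.3", b12_3), ("b13.2", b13_2), ("b21.3", b21_3),
                    ("b23.1", b23_1), ("b31.2", b31_2), ("b32.1", b32_1)]:
    print(name, round(value, 2))
```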

Transferring the origins to zero, the regression-equations of Example i. become

(1) Earnings: $X_1 = +19.0 - 1.21\,X_2 + 0.23\,X_3$
(2) Pauperism: $X_2 = +9.55 - 0.45\,X_1 + 0.22\,X_3$
(3) Out-relief ratio: $X_3 = -15.7 + 0.85\,X_1 + 2.18\,X_2$

The units are throughout one shilling for the earnings $X_1$, 1 per cent. for the pauperism $X_2$, and 1 for the out-relief ratio $X_3$.

The first and second regression-equations are those of most practical importance. The argument has been advanced that the giving of out-relief tends to lower earnings, and the total coefficient ($r_{13} = -0.13$) between earnings ($X_1$) and out-relief ($X_3$), though very small (cf. Chap. IX. § 17), does not seem inconsistent with such a hypothesis. The partial correlation coefficient ($r_{13.2} = +0.44$) and the regression-equation (1), however, indicate that in unions with a given percentage of the population in receipt of relief ($X_2$) the earnings are highest where the proportion of out-relief is highest; and this is, in so far, against the hypothesis of a tendency to lower wages. It remains possible, of course, that out-relief may adversely affect the possibility of earning, e.g. by limiting the employment of the old.

As regards pauperism, the argument might be advanced that the observed correlation ($r_{23} = +0.60$) between pauperism and out-relief was in part due to the negative correlation ($r_{13} = -0.13$) between earnings and out-relief. Such a hypothesis would have little to support it in view of the smallness and doubtful significance of $r_{13}$, and is definitely contradicted by the positive partial correlation $r_{23.1} = +0.69$, and the second regression-equation.

The third regression-equation shows that the proportion of out-relief is on the whole highest where earnings are highest and pauperism greatest. It should be noticed, however, that a negative ratio is clearly impossible, and consequently the relation cannot be strictly linear; but the third equation gives possible (positive) average ratios for all the combinations of pauperism and earnings that actually occur.

Example ii. (Four variables.) As an illustration of the form of the work in the case of four variables, we will take a portion of the data from another investigation into the causation of pauperism, viz. that described in the first illustration of Chapter X., to which the student should refer for details. The variables are the ratios of the values in 1891 to the values in 1881 (taken as 100) of:

1. The percentage of the population in receipt of relief.
2. The ratio of the numbers given outdoor relief to the numbers relieved in the workhouse.
3. The percentage of the population over 65 years of age.


4. The population itself,

in the metropolitan group of 32 unions, and the fundamental constants (means, standard-deviations and correlations) are as follows:

Table I.

Table II.
The correlations of the first order (Table II. col. 4) are obtained. The first-order coefficients are then regrouped in sets of three, with the same secondary suffix (Table III. col. 1), and these are treated precisely in the same way as the coefficients of order zero. In this way, it will be seen, the value of each coefficient of the second order is arrived at in two ways independently, and so the arithmetic is checked: $r_{12.34}$ occurs in the first and fourth lines, for instance, $r_{13.24}$ in the second and seventh, and so on. Of course slight differences may occur in the last digit if a sufficient number of digits is not retained, and for this reason the intermediate work should be carried to a greater degree of accuracy than is necessary in the final result; thus four places of decimals were retained throughout in the intermediate work of this example, and three in the final result. If he carries out an independent calculation, the student may differ slightly from the logarithms given in this and the following work, if more or fewer figures are retained.

Having obtained the correlations, the regressions can be calculated from the third-order standard-deviations by equations of the form (as in the last example),

$$b_{12.34} = r_{12.34}\,\frac{\sigma_{1.234}}{\sigma_{2.134}},$$

so the standard-deviations of lower orders need not be evaluated.

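Using equations of the form

$$\sigma_{1.234} = \sigma_1(1 - r^2_{12})^{\frac12}(1 - r^2_{13.2})^{\frac12}(1 - r^2_{14.23})^{\frac12}$$

we find

$\log\sigma_{1.234} = 1.35740$, $\sigma_{1.234} = 22.8$
$\log\sigma_{2.134} = 1.50597$, $\sigma_{2.134} = 32.1$
$\log\sigma_{3.124} = 0.65773$, $\sigma_{3.124} = 4.55$
$\log\sigma_{4.123} = 1.32914$, $\sigma_{4.123} = 21.3$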

All the twelve regressions of the second order can be readily calculated, given these standard-deviations and the correlations, but we may confine ourselves to the equation giving the changes in pauperism ($X_1$) in terms of other variables as the most important. It will be found to be

$$x_1 = 0.325\,x_2 + 1.383\,x_3 - 0.383\,x_4,$$

or, transferring the origins and expressing the equation in terms of percentage-ratios,

$$X_1 = -31.1 + 0.325\,X_2 + 1.383\,X_3 - 0.383\,X_4,$$

or, again, in terms of percentage-changes (ratio - 100):

Percentage change in pauperism = + 1.4 per cent.
 + 0.325 times the change in out-relief ratio
 + 1.383 times the change in proportion of old
 - 0.383 times the change in population.
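The passage from percentage-ratios to percentage-changes is a matter of simple substitution, which may be set out as follows. The symbols $c_1, \ldots c_4$ for the percentage-changes are introduced here for illustration only and are not the notation of the text. Writing each ratio as $X_i = 100 + c_i$:

```latex
\begin{aligned}
100 + c_1 &= -31.1 + 0.325\,(100 + c_2) + 1.383\,(100 + c_3) - 0.383\,(100 + c_4) \\
c_1 &= -31.1 - 100 + 100\,(0.325 + 1.383 - 0.383) + 0.325\,c_2 + 1.383\,c_3 - 0.383\,c_4 \\
    &= -131.1 + 132.5 + 0.325\,c_2 + 1.383\,c_3 - 0.383\,c_4 \\
    &= +1.4 + 0.325\,c_2 + 1.383\,c_3 - 0.383\,c_4 .
\end{aligned}
```

The constant term + 1.4 per cent. is thus the change in pauperism to be expected, on the average, in a union where out-relief ratio, proportion of old, and population all remained unchanged.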

These results render the interpretation of the total coefficients, which might be equally consistent with several hypotheses, more clear and definite. The questions would arise, for instance, whether the correlation of changes in pauperism with changes in out-relief might not be due to correlation of the latter with the other factors introduced, and whether the negative correlation with changes in population might not be due solely to the correlation of the latter with changes in the proportion of old. As a matter of fact, the partial correlations of changes in pauperism with changes in out-relief and in proportion of old are slightly less than the total correlations, but the partial correlation with changes in population is numerically greater, the figures being

$r_{12} = +0.52$, $r_{12.34} = +0.46$
$r_{13} = +0.41$, $r_{13.24} = +0.28$
$r_{14} = -0.14$, $r_{14.23} = -0.36$

So far, then, as we have taken the factors of the case into account, there appears to be a true correlation between changes in pauperism and changes in out-relief, proportion of old, and population, the latter serving, of course, as some index to changes in general prosperity. The relative influences of the three factors are indicated by the regression-equation above. [For the full discussion of the case cf. Jour. Roy. Stat. Soc., vol. lxii., 1899.]

15. The correlation between pauperism and labourers' earnings exhibited by the figures of Example i. was illustrated by a diagram (fig. 40, p. 180), in which scales of "pauperism" and "earnings" were taken along two axes at right angles, and every observed pair of values was entered by marking the corresponding point with a small circle: the diagram was completed by drawing in the lines of regression. In precisely the same way the correlation between three variables may be represented by a model showing the distribution of points in space; for any set of observed values $X_1$, $X_2$, $X_3$ may be regarded as determining a point in space, just as any pair of values $X_1$ and $X_2$ may be regarded as determining a point in a plane. Fig. 45 is drawn from such a model, constructed from the data of Example i. Four pieces of wood are fixed together like the bottom and three sides of a box. Supposing the open side to face the observer, a scale of pauperism is drawn vertically upwards along the left-hand angle at the back of the "box," the scale starting from zero, as very small values of pauperism occur; a scale of out-relief ratio is taken along the angle between the back and bottom of the box, starting from zero at the left; finally, the scale of earnings is drawn out towards the observer along the angle between the left-hand side and the bottom, but as earnings lower than 12s. do not occur, the scale may start from 12s. at the corner. Suitable scales are: pauperism, 1 in. = 1 per cent.; out-relief ratio, 1 in. = 1 unit; earnings, 1 in. = 1s.; and the inside measures of the model may then be 17 in. x 10 in. x 8 in. high, the dimensions of the model constructed. Given these three scales, any set of observed values determine a point within the "box." The earnings and out-relief ratio for some one union are noted first, and the corresponding point marked on the baseboard; a steel wire is then inserted vertically in the base at this point and cut off at the height corresponding, on the scale chosen, to the pauperism in the same union, being finally capped with a small ball or knob to mark the "point" clearly. The model shows very well the general tendency of the pauperism to be the higher the lower the wages and the higher the out-relief, for the highest points lie towards the back and right-hand side of the model.

Fig. 45. Model illustrating the Correlation between three Variables: (1) Pauperism (percentage of the population in receipt of Poor-law relief); (2) Out-relief ratio (numbers given relief in their homes to one in the Workhouse); (3) Average Weekly Earnings of agricultural labourers (data pp. 178 and 189). A, front view; B, view of model tilted till the plane of regression for pauperism on the two remaining variables is seen as a straight line.

If some representation of all three equations of regression were to be inserted in the model, the result would be rather confusing; so the most important equation, viz. the second, giving the average rate of pauperism in terms of the other variables, may be chosen. This equation represents a plane: the lines in which it cuts the right- and left-hand sides of the "box" should be marked, holes drilled at equal intervals on these lines on the opposite sides of the box (the holes facing each other), and threads stretched through these holes, thus outlining the plane as shown in the figure. In the actual model the correlation-diagrams (like fig. 40) corresponding to the three pairs of variables were drawn on the back, sides, and base: they represent, of course, the elevations and plan of the points. The student possessing some skill in handicraft would find it worth while to make such a model for some case of interest to himself, and to study on it thoroughly the nature of the plane of regression, and the relations of the partial and total correlations.

16. If

we

write
<^.23
I.

= o"i(l - ^iixi
. .
. .

«))

(13)

ajj

is the correlation between be shown that R^^ nj and the expression on the right-hand side of the regressionequation, say 61,23 ....«) where it

may

*1.28..

.

«

= "12.34. ..n- ^2"'' °13M. • n

•''s'''

'

"•

+ "l».2S.

.

.

(n-I|

(14)

For we have

$$\Sigma(x_1\,e_{1.23\ldots n}) = \Sigma\,x_1(x_1 - x_{1.23\ldots n}) = N(\sigma_1^2 - \sigma^2_{1.23\ldots n}),$$

and also

$$\Sigma(e^2_{1.23\ldots n}) = \Sigma(x_1 - x_{1.23\ldots n})^2 = N(\sigma_1^2 - \sigma^2_{1.23\ldots n}),$$

whence the correlation between $x_1$ and $e_{1.23\ldots n}$ is

$$\frac{(\sigma_1^2 - \sigma^2_{1.23\ldots n})^{1/2}}{\sigma_1},$$

which is the value of $R_{1(23\ldots n)}$ given by (13). The value of $R$ is accordingly a useful datum as indicating how closely $x_1$ can be expressed in terms of a linear function of $x_2, x_3, \ldots x_n$, and the values of the regressions may be regarded as determined by the condition that $R$ shall be a maximum. Its value is essentially positive, as the product-sum $\Sigma(x_1\,e_{1.23\ldots n})$ is positive. $R$ may be termed a coefficient of $(n-1)$-fold (or double, triple, etc.) correlation; for $n$ variables there are $n$ such correlations, but in the limiting case of two variables the two are identical. The value of $R$ may be readily calculated, either from $\sigma_{1.23\ldots n}$ and $\sigma_1$ or directly from the equation

$$1 - R^2_{1(23\ldots n)} = (1 - r_{12}^2)(1 - r^2_{13.2})(1 - r^2_{14.23}) \ldots (1 - r^2_{1n.23\ldots(n-1)}). \qquad (15)$$

It is obvious from this equation that since every bracket on the right is not greater than unity,

$$1 - R^2_{1(23\ldots n)} \le 1 - r_{12}^2.$$

Hence $R_{1(23\ldots n)}$ cannot be numerically less than $r_{12}$. For the same reason, rewriting (15) in every possible form, $R_{1(23\ldots n)}$ cannot be numerically less than $r_{12}, r_{13}, \ldots r_{1n}$, i.e. any one of the possible constituent coefficients of order zero. Further, for similar reasons, $R_{1(23\ldots n)}$ cannot be numerically less than any possible constituent coefficient of any higher order. That is to say, $R_{1(23\ldots n)}$ is not numerically less than the greatest of all the possible constituent coefficients, and is usually, though not always, markedly greater. Thus in Example i., $R_{2(13)}$ (the coefficient of double correlation between pauperism on the one hand, out-relief and labourers' earnings on the other) is 0.839, and the numerically greatest of the possible constituent coefficients is $r_{12.3} = -0.73$. Again, in Example ii., $R_{1(234)}$ is 0.626, and the numerically greatest of the possible constituent coefficients is $r_{12.4} = +0.573$.
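As an illustration of equation (15), the following minimal Python sketch builds $R$ from a chain of constituent coefficients ($r_{12}$, $r_{13.2}$, $r_{14.23}$, ...) and checks that $R$ is not numerically less than any of them. The function name and the numerical values are hypothetical and are not taken from the examples of the text.

```python
import math


def multiple_correlation(constituents):
    """Return R from equation (15): 1 - R^2 is the product of (1 - r^2)
    taken over one constituent coefficient of each order
    (r12, r13.2, r14.23, ...)."""
    product = 1.0
    for r in constituents:
        if not -1.0 <= r <= 1.0:
            raise ValueError("a correlation must lie between -1 and +1")
        product *= 1.0 - r * r
    return math.sqrt(1.0 - product)


# Hypothetical chain of constituent coefficients (illustrative values only).
chain = [0.60, -0.50, 0.30]
R = multiple_correlation(chain)
largest = max(abs(r) for r in chain)

# R is essentially positive and not less than the numerically greatest
# constituent coefficient.
print(round(R, 4), R >= largest)
```

Adding a further variable can only multiply the right-hand side of (15) by a factor not greater than unity, which is why $R$ never decreases as variables are added.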

The student should notice that $R$ is necessarily positive. Further, even if all the variables $X_1, X_2, \ldots X_n$ were strictly uncorrelated in the original universe as a whole, we should expect $r_{12}$, $r_{13.2}$, $r_{14.23}$, etc., to exhibit values (whether positive or negative) differing from zero in a limited sample. Hence, $R$ will not tend, on an average of such samples, to be zero, but will fluctuate round some mean value. This mean value will be the greater the smaller the number of observations in the sample, and also the greater the number of variables. When only a small number of observations are available it is, accordingly, little use to deal with a large number of variables. As a limiting case, it is evident that if we deal with $n$ variables and possess only $n$ observations, all the partial correlations of the highest possible order will be unity.

17. It is obvious that as equations (11) and (12) enable us to express regressions and correlations of higher orders in terms of those of lower orders, we must similarly be able to express the coefficients of lower in terms of those of higher orders. Such expressions are sometimes useful for theoretical work. Using the same method of expansion as in previous cases, we have

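$$0 = \Sigma(x_{1.23\ldots n}\,x_{2.34\ldots(n-1)}) = \Sigma(x_{1.34\ldots(n-1)}\,x_{2.34\ldots(n-1)}) - b_{12.34\ldots n}\,\Sigma(x^2_{2.34\ldots(n-1)}) - b_{1n.23\ldots(n-1)}\,\Sigma(x_{n.34\ldots(n-1)}\,x_{2.34\ldots(n-1)}),$$

that is,

$$b_{12.34\ldots(n-1)} = b_{12.34\ldots n} + b_{1n.23\ldots(n-1)}\,b_{n2.34\ldots(n-1)}.$$

In the same way,

$$b_{n2.34\ldots(n-1)} = b_{n2.13\ldots(n-1)} + b_{n1.23\ldots(n-1)}\,b_{12.34\ldots(n-1)},$$

whence

$$b_{12.34\ldots(n-1)} = \frac{b_{12.34\ldots n} + b_{1n.23\ldots(n-1)}\,b_{n2.13\ldots(n-1)}}{1 - b_{1n.23\ldots(n-1)}\,b_{n1.23\ldots(n-1)}}. \qquad (16)$$

The corresponding equation for the correlations is

$$r_{12.34\ldots(n-1)} = \frac{r_{12.34\ldots n} + r_{1n.23\ldots(n-1)}\,r_{2n.13\ldots(n-1)}}{(1 - r^2_{1n.23\ldots(n-1)})^{1/2}\,(1 - r^2_{2n.13\ldots(n-1)})^{1/2}}, \qquad (17)$$

which is similarly the equation

$$r_{12} = \frac{r_{12.n} + r_{1n.2}\,r_{2n.1}}{(1 - r^2_{1n.2})^{1/2}\,(1 - r^2_{2n.1})^{1/2}}$$

with the secondary suffixes $34 \ldots (n-1)$ added throughout.

18. Equations (12) and (17) imply that certain limiting inequalities must hold between the correlation-coefficients in the expression on the right in each case in order that real values (values between ±1) may be obtained for the correlation-coefficient on the left. These inequalities correspond precisely with those "conditions of consistence" between class-frequencies with which we dealt in Chapter II., but we propose to treat them only briefly here. Writing (12) in its simplest form for $r_{12.3}$, we must have $r^2_{12.3} < 1$, or

$$(r_{12} - r_{13}r_{23})^2 < (1 - r_{13}^2)(1 - r_{23}^2),$$

that is,

$$r_{12}^2 + r_{13}^2 + r_{23}^2 - 2r_{12}r_{13}r_{23} < 1 \qquad (18)$$

if the three $r$'s are consistent with each other. If we take $r_{12}$, $r_{13}$ as known, this gives as limits for $r_{23}$

$$r_{12}r_{13} \pm \sqrt{1 - r_{12}^2 - r_{13}^2 + r_{12}^2 r_{13}^2}.$$

Similarly, writing (17) in its simplest form for $r_{12}$ in terms of $r_{12.3}$, $r_{13.2}$, $r_{23.1}$, we must have

$$r^2_{12.3} + r^2_{13.2} + r^2_{23.1} + 2r_{12.3}\,r_{13.2}\,r_{23.1} < 1, \qquad (19)$$

and therefore, if $r_{12.3}$ and $r_{13.2}$ are given, $r_{23.1}$ must lie between the limits

$$-r_{12.3}\,r_{13.2} \pm \sqrt{1 - r^2_{12.3} - r^2_{13.2} + r^2_{12.3}\,r^2_{13.2}}.$$

The following table gives the limits of the third coefficient in a few special cases, for the three coefficients of zero order and of the first order respectively:

[Table: values of the two given coefficients, with the corresponding limits of the third, for coefficients of zero order and of the first order.]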
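The limits implied by inequalities (18) and (19) are easily computed for any pair of given coefficients. The Python sketch below is an illustration only: the function names are hypothetical, and it uses the fact that $1 - a^2 - b^2 + a^2b^2 = (1-a^2)(1-b^2)$ to write the quantity under the root as a product.

```python
import math


def limits_zero_order(r12, r13):
    """Limits of r23 implied by inequality (18), given r12 and r13."""
    half_width = math.sqrt((1.0 - r12 ** 2) * (1.0 - r13 ** 2))
    centre = r12 * r13
    return centre - half_width, centre + half_width


def limits_first_order(r12_3, r13_2):
    """Limits of r23.1 implied by inequality (19), given r12.3 and r13.2."""
    half_width = math.sqrt((1.0 - r12_3 ** 2) * (1.0 - r13_2 ** 2))
    centre = -r12_3 * r13_2  # note the change of sign relative to (18)
    return centre - half_width, centre + half_width


# A few special cases for the two given coefficients (illustrative values).
root_half = math.sqrt(0.5)
for a, b in [(0.0, 0.0), (root_half, root_half), (root_half, -root_half), (0.9, 0.9)]:
    lo0, hi0 = limits_zero_order(a, b)
    lo1, hi1 = limits_first_order(a, b)
    print(f"given {a:+.3f}, {b:+.3f}: "
          f"zero order {lo0:+.3f} to {hi0:+.3f}; "
          f"first order {lo1:+.3f} to {hi1:+.3f}")
```

When both given coefficients are zero the limits are -1 and +1 in either case, so that nothing whatever can be inferred about the third coefficient.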

The student should notice that the set of three coefficients of order zero and value unity are only consistent if either one only, or all three, are positive, i.e. the only consistent sets are +1, +1, +1 and -1, -1, +1. On the other hand, the set of three coefficients of the first order and value unity are only consistent if one only, or all three, are negative: the only consistent sets are -1, -1, -1 and +1, +1, -1. The values of the two given $r$'s need to be very high if even the sign of the third can be inferred; if the two are equal, they must be at least equal to $\sqrt{0.5}$ or .707. Finally, it may be noted that no two values for the known coefficients ever permit an inference of the value zero for the third; the fact that 1 and 2, 1 and 3 are uncorrelated, pair and pair, permits no inference of any kind as to the correlation between 2 and 3, which may lie anywhere between +1 and -1.

19. We do not think it necessary to add to this chapter a detailed discussion of the nature of fallacies on which the theory of multiple correlation throws much light. The general nature of such fallacies is the same as for the case of attributes, and was discussed fully in Chap. IV., §§ 1-8. It suffices to point out the principal sources of fallacy which are suggested at once by the form of the partial correlation

$$r_{12.3} = \frac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} \qquad (a)$$

and from the form of the corresponding expression for $r_{12}$ in terms of the partial coefficients

$$r_{12} = \frac{r_{12.3} + r_{13.2}\,r_{23.1}}{\sqrt{(1 - r^2_{13.2})(1 - r^2_{23.1})}}. \qquad (b)$$

From the form of the numerator of (a) it is evident (1) that even if $r_{12}$ be zero, $r_{12.3}$ will not be zero unless either $r_{13}$ or $r_{23}$, or both, are zero. If $r_{13}$ and $r_{23}$ are of the same sign the partial correlation will be negative; if of opposite sign, positive. Thus the quantity of a crop might appear to be unaffected, say, by the amount of rainfall during some period preceding harvest: this might be due merely to a correlation between rain and low temperature, the partial correlation between crop and rainfall being positive, and important. We may thus easily misinterpret a coefficient of correlation which is zero. (2) $r_{12.3}$ may be, indeed often is, of opposite sign to $r_{12}$, and this may lead to still more serious errors of interpretation. From the form of the numerator of (b), on the other hand, we see that, conversely, $r_{12}$ will not be zero even though $r_{12.3}$ is zero, unless either $r_{13.2}$ or $r_{23.1}$ is zero. This corresponds to the theorem of Chap. IV., § 6, and indicates a source of fallacies similar to those there discussed.

20. We have seen (§ 9) that $r_{12.3}$ is the correlation between $x_{1.3}$ and $x_{2.3}$, and that we might determine the value of this partial correlation by drawing up the actual correlation table for the two residuals in question. Suppose, however, that instead of drawing up a single table we drew up a series of tables for values of $x_{1.3}$ and $x_{2.3}$ associated with values of $x_3$ lying within successive class-intervals of its range. In general the value of $r_{12.3}$ would not be the same (or approximately the same) for all such tables, but would exhibit some systematic change as the value of $x_3$ increased. Hence $r_{12.3}$ should be regarded, in general, as of the nature of an average correlation: the cases in which it measures the correlation between $x_{1.3}$ and $x_{2.3}$ for every value of $x_3$ (cf. Chap. XVI.) are probably exceptional. The process for determining partial associations (cf. Chap. IV.) is, it will be remembered, thorough and complete, as we always obtain the actual tables exhibiting the association between, say, A and B in the universe of C's and the universe of γ's: that these two associations may differ materially is illustrated by Example i. of Chap. IV. (pp. 45-6). It might sometimes serve as a useful check on partial-correlation work to reclassify the observations by the fundamental methods of that chapter. For the general case an extension of the method of the "correlation-ratio" (Chap. X., § 20) might be useful, though exceedingly laborious. It is actually employed in the paper cited in ref. 7 and the theory more fully developed in ref. 8.

REFERENCES.

The theory of correlation for several variables was developed by Edgeworth and Pearson (refs. 1 and 2) from the standpoint of the "normal" distribution of frequency (cf. Chap. XVI.). The preceding chapter is written from the standpoint of refs. 3 and 4, and with the notation and method of ref. 5.

Theory.

(1) Edgeworth, F. Y., "On Correlated Averages," Phil. Mag., 5th Series, vol. xxxiv., 1892, p. 194.
(2) Pearson, Karl, "Regression, Heredity, and Panmixia," Phil. Trans. Roy. Soc., Series A, vol. clxxxvii., 1896, p. 253.
(3) Yule, G. U., "On the Significance of Bravais' Formulae for Regression, etc., in the case of Skew Correlation," Proc. Roy. Soc., vol. lx., 1897, p. 477.
(4) Yule, G. U., "On the Theory of Correlation," Jour. Roy. Stat. Soc., vol. lx., 1897, p. 812.
(5) Yule, G. U., "On the Theory of Correlation for any number of Variables treated by a New System of Notation," Proc. Roy. Soc., Series A, vol. lxxix., 1907, p. 182.
(6) Hooker, R. H., and G. U. Yule, "Note on Estimating the Relative Influence of Two Variables upon a Third," Jour. Roy. Stat. Soc., vol. lxix., 1906, p. 197.
(7) Brown, J. W., M. Greenwood, and Frances Wood, "A Study of Index-Correlations," Jour. Roy. Stat. Soc., vol. lxxvii., 1914, pp. 317-46. (The partial or "solid" correlation-ratio is used.)
(8) Isserlis, L., "On the Partial Correlation-Ratio, Pt. I. Theoretical," Biometrika, vol. x., 1914, pp. 391-411.

Illustrative Applications of Economic Interest.

(9) Yule, G. U., "An Investigation into the Causes of Changes in Pauperism in England, etc.," Jour. Roy. Stat. Soc., vol. lxii., 1899, p. 249.
(10) Hooker, R. H., "The Correlation of the Weather and the Crops," Jour. Roy. Stat. Soc., vol. lxx., 1907, p. 1.
(11) Snow, E. C., "The Application of the Method of Multiple Correlation to the Estimation of Post-censal Populations," Jour. Roy. Stat. Soc., vol. lxxiv., 1911, p. 576.

EXERCISES.

1. (Ref. 10.) The following means, standard-deviations, and correlations are found for $X_1$ = seed-hay crops in cwts. per acre, $X_2$ = spring rainfall in inches, $X_3$ = accumulated temperature above 42° F. in spring, in a certain district of England during 20 years.

    M1 = 28.02    σ1 = 4.42    r12 = +0.80
    M2 =  4.91    σ2 = 1.10    r13 = -0.40
    M3 = 594      σ3 = 85      r23 = -0.56

Find the partial correlations and the regression-equation for hay-crop on spring rainfall and accumulated temperature.

2. (The following figures must be taken as an illustration only: the data on which they were based do not refer to uniform times or areas.)
$X_1$ = deaths of infants under 1 year per 1000 births in same year (infantile mortality).
$X_2$ = proportion per thousand of married women occupied for gain.
$X_3$ = death-rate of persons over 5 years of age per 10,000.
$X_4$ = proportion per thousand of population living 2 or more to a room (overcrowding).
Taking the figures below for 30 urban areas in England and Wales, find the partial correlations and the regression-equation for infantile mortality on the other factors.

    M1 = 164    σ1 =  20.0    r12 = +0.49    r23 = +0.15
    M2 = 168    σ2 =  74.9    r13 = +0.78    r24 = -0.37
    M3 = 143    σ3 =  22.4    r14 = +0.20    r34 = +0.23
    M4 = 205    σ4 = 130.0

3. If all the correlations of order zero are equal, say = $r$, what are the values of the partial correlations of successive orders? Under the same condition, what is the limiting value of $r$ if all the equal correlations are negative and $n$ variables have been observed?

4. What is the correlation between $x_{1.2}$ and $x_{2.1}$?

5. Write down from inspection the values of the partial correlations for the three variables $X_1$, $X_2$, and $X_3 = a.X_1 + b.X_2$. Check the answer to Qu. …, Chap. XI., by working out the partial correlations.

6. If the relation $a.x_1 + b.x_2 + c.x_3 = 0$ holds for all sets of values of $x_1$, $x_2$, and $x_3$, what must the partial correlations be? Check the answer to Qu. …, Chap. XI., by working out the partial correlations.
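The first-order formulas (a) and (b) of § 19, on which work of the kind set in these exercises depends, can be sketched in a few lines of Python. The function names and the numerical values are hypothetical; the values are chosen only to show that a zero total correlation $r_{12}$ may conceal a partial correlation $r_{12.3}$ that is not zero, and that (b) recovers $r_{12}$ from the three partial coefficients.

```python
import math


def partial_r(r12, r13, r23):
    """Formula (a): partial correlation r12.3 from the three
    correlations of order zero."""
    return (r12 - r13 * r23) / math.sqrt((1.0 - r13 ** 2) * (1.0 - r23 ** 2))


def total_r(r12_3, r13_2, r23_1):
    """Formula (b): correlation r12 of order zero from the three
    partial correlations of the first order."""
    return (r12_3 + r13_2 * r23_1) / math.sqrt(
        (1.0 - r13_2 ** 2) * (1.0 - r23_1 ** 2)
    )


# Hypothetical correlations of order zero (not data from the exercises).
r12, r13, r23 = 0.0, 0.5, 0.5

r12_3 = partial_r(r12, r13, r23)  # 1 and 2, eliminating 3
r13_2 = partial_r(r13, r12, r23)  # 1 and 3, eliminating 2
r23_1 = partial_r(r23, r12, r13)  # 2 and 3, eliminating 1

# r12 is zero, yet r12.3 is negative because r13 and r23 have the same sign.
print(round(r12_3, 4), round(r13_2, 4), round(r23_1, 4))

# Formula (b) returns the original r12 (zero, apart from rounding error).
print(abs(total_r(r12_3, r13_2, r23_1)) < 1e-12)
```

The same `partial_r` function serves for all three partial coefficients merely by permuting its arguments, which is the permutation of suffixes used throughout the chapter.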

PART III. THEORY OF SAMPLING.

CHAPTER XIII.

SIMPLE SAMPLING OF ATTRIBUTES.

1. The problem of the present Part; 2. The two chief divisions of the theory of sampling; 3. Limitation of the discussion to the case of simple sampling; 4. Definition of the chance of success or failure of a given event; 5. Determination of the mean and standard-deviation of the number of successes in n events; 6. The same for the proportion of successes in n events: the standard-deviation of simple sampling as a measure of unreliability, or its reciprocal as a measure of precision; 7. Verification of the theoretical results by experiment; 8. More detailed discussion of the assumptions on which the formula for the standard-deviation of simple sampling is based; 9-10. Biological cases to which the theory is directly applicable; 11. Standard-deviation of simple sampling when the numbers of observations in the samples vary; 12. Approximate value of the standard-deviation of simple sampling, and relation between mean and standard-deviation, when the chance of success or failure is very small; 13. Use of the standard-deviation of simple sampling, or standard error, for checking and controlling the interpretation of statistical results.

1. On several occasions in the preceding chapters it has been pointed out that small differences between statistical measures like percentages, averages, measures of dispersion and so forth cannot in general be assumed to indicate the action of definite and assignable causes. Small differences may easily arise from indefinite and highly complex causation such as determines the fluctuating proportions of heads and tails in tossing a coin, of black balls in drawing samples from a bag containing a mixture of black and white balls, or of cards bearing measurements within some given class-interval in drawing cards, say, from an anthropometric record. In 100 throws of a coin, for example, we may have noted 56 heads and only 44 tails, but we cannot conclude that the coin is biassed: on repeating our throws we may get only 48 heads and 52 tails. Similarly, if on measuring the statures of 1000 men in each of two nations we find that the mean stature is slightly greater for nation A than for nation B, we cannot necessarily conclude that the real mean stature is greater in the case of nation A: possibly if the observations were repeated on different samples of 1000 men the ratio might be reversed.

2. The theory of such fluctuations may be termed the theory of sampling, and there are two chief sections of the theory corresponding to the theory of attributes and the theory of variables respectively. In tossing a coin we only classify the results of the tosses as heads or tails; in drawing balls from a mixture of black and white balls, we only classify the balls drawn as black or as white. These cases correspond to the theory of attributes, and the general case may be represented as the drawing of a sample from a universe containing both A's and α's, the number or proportion of A's in successive samples being observed. If, on the other hand, we put in a bag a number of cards bearing different values of some variable X and draw sample batches of cards, we can form averages and measures of dispersion for the successive batches, and these averages and measures of dispersion will vary slightly from one batch to another. If associated measures of two variables X and Y are recorded on each card, we can also form correlation-coefficients for the different batches, and these will vary in a similar manner. These cases correspond to the theory of variables, and it is the function of the theory of sampling for such cases to inform us as to the fluctuations to be expected in the averages, measures of dispersion, correlation-coefficients, etc., in successive samples. In the present and the three following chapters the theory of sampling is dealt with for the case of attributes alone. The theory is of great importance and interest, not only from its applications to the checking and control of statistical results, but also from the theoretical forms of frequency-distribution to which it leads. Finally, in Chapter XVII., one or two of the more important cases of the theory of sampling for variables are briefly treated, the greater part of the theory, owing to its difficulty, lying somewhat outside the limits of this work.

3. The theory of sampling attains its greatest simplicity if every observation contributed to the sample may be regarded as independent of every other. This condition of independence holds good, e.g., for the tossing of a coin or the throwing of a die: the result of any one throw or toss does not affect, and is unaffected by, the results of the preceding and following tosses. It does not hold good, on the other hand, for the drawing of balls from a bag: if a ball be drawn from a bag containing 3 black and 3 white balls, the remainder may be either 2 black and 3 white, or 2 white and 3 black, according as the first ball was black or white. The result of drawing a second ball is therefore dependent on the result of drawing the first. The disturbance can only be eliminated by drawing from a bag containing a number of balls that is infinitely large compared with the total number drawn, or by returning each ball to the bag before drawing the next. In this chapter our attention will be confined to the case of independent sampling, as in coin-tossing or dice-throwing, the simplest cases of an artificial kind suitable for theoretical study and experimental verification. For brevity, we may refer to such cases of sampling as simple sampling: the implied conditions are discussed more fully in § 8 below.

4. If we may regard an ideal coin as a uniform, homogeneous circular disc, there is nothing which can make it tend to fall more often on the one side than on the other; we may expect, therefore, that in any long series of throws the coin will fall with either face uppermost an approximately equal number of times, or with, say, heads uppermost approximately half the times. Similarly, if we may regard the ideal die as a perfect homogeneous cube, it will tend, in any long series of throws, to fall with each of its six faces uppermost an approximately equal number of times, or with any given face uppermost one-sixth of the whole number of times. These results are sometimes expressed by saying that the chance of throwing heads (or tails) with a coin is 1/2, and the chance of throwing six (or any other face) with a die is 1/6. To avoid speaking of such particular instances as coins or dice, we shall in future, using terms which have become conventional, refer to an event the chance of success of which is $p$ and the chance of failure $q$. Obviously $p + q = 1$.

5. Suppose we take $N$ samples with $n$ events in each. What will be the values towards which the mean and standard-deviation of the number of successes in a sample will tend? The mean is given at once, for there are $N.n$ events, of which approximately $pNn$ will be successes, and the mean number of successes in a sample will therefore tend towards $pn$. As regards the standard-deviation, consider first the single event ($n = 1$). The single event may give either no successes or one success, and will tend to give the former $qN$, the latter $pN$, times in $N$ trials. Take this frequency-distribution and work out the standard-deviation of the number of successes for the single event, as in the case of an arithmetical example:

    Successes ξ    Frequency f    fξ    fξ²
        0              qN          0     0
        1              pN          pN    pN
      Total            N           pN    pN

We have therefore $M = p$, and

$$\sigma_1^2 = p - p^2 = pq.$$

But the number of successes in a group of $n$ such events is the sum of successes for the single events of which it is composed, and, all the events being independent, we have therefore, by the usual rule for the standard-deviation of the sum of independent variables (Chap. XI., § 2, equation (2)), $\sigma_n$ being the standard-deviation of the number of successes in $n$ events,

$$\sigma_n^2 = npq. \qquad (1)$$

This is an equation of fundamental importance in the theory of sampling. The student should particularly bear in mind that the standard-deviation of the number of successes, due to fluctuations of simple sampling alone, in a group of $n$ events varies, not directly as $n$, but as the square root of $n$.

6. In lieu of recording the absolute number of successes in each sample of $n$ events, we might have recorded the proportion of such successes, i.e. $1/n$th of the number in each sample. As this would amount to merely dividing all the figures of the original record by $n$, the mean proportion of successes, or rather the value towards which the mean tends to approach, must be $p$, and the standard-deviation of the proportion of successes $s_n$ be given by

$$s_n^2 = \sigma_n^2/n^2 = pq/n. \qquad (2)$$

The standard-deviation of the proportion of successes in samples of such independent events varies therefore inversely as the square root of the number on which the proportion is calculated. Now if we regard the observed proportion in any one sample as a more or less unreliable determination of the true proportion in a very large sample from the same material, the standard-deviation of sampling may fairly be taken as a measure of the unreliability of the determination: the greater the standard-deviation, the greater the fluctuations of the observed proportion, although the true proportion is the same throughout. The reciprocal of the standard-deviation ($1/s$), on the other hand, may be regarded as a measure of reliability, or, as it is sometimes termed, precision, and consequently the reliability or precision of an observed proportion varies as the square root of the number of observations on which it is based. This is again a very important rule with many practical applications, but the limitations of the case to which it applies, and the exact conditions from which it has been deduced, should be borne in mind. We return to this point again below (§ 8 and Chap. XIV.).

7. Experiments in coin tossing, dice throwing, and so forth have been carried out by various persons in order to obtain experimental verification of these results. The following will serve as illustrations, but the student is strongly recommended to carry out a few series of such experiments personally, in order to acquire confidence in the use of the theory. It may be as well to remark that if ordinary commercial dice are to be used for the trials, care should be taken to see that they are fairly true cubes, and the marks not cut very deeply. Cheap dice are generally very much out of truth, and if the marks are deeply cut the balance of the die may be sensibly affected. A convenient mode of throwing a number of dice, suggested, we believe, by the late Professor Weldon, is to roll them down an inclined gutter of corrugated paper, so that they roll across the corrugations.

(1) (W. F. R. Weldon, cited by Professor F. Y. Edgeworth, Encycl. Brit., 11th edn., vol. xxii., p. 394. Totals of the columns in the table there given.) Twelve dice were thrown 4096 times; a throw of 4, 5, or 6 points reckoned a success, therefore $p = q = 0.5$. Theoretical mean $M = 6$; theoretical value of the standard-deviation $\sigma = 1.732$.

[Table: observed frequency-distribution of the number of successes.]
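The theoretical values quoted for experiment (1) follow directly from equations (1) and (2). A minimal Python sketch (the function names are hypothetical) evaluates them, and also shows the rule that quadrupling the number of observations halves the standard-deviation of the proportion:

```python
import math


def number_of_successes(n, p):
    """Mean and standard-deviation of the NUMBER of successes in n
    independent events: mean = np, sigma^2 = npq (equation (1))."""
    q = 1.0 - p
    return n * p, math.sqrt(n * p * q)


def proportion_of_successes(n, p):
    """Mean and standard-deviation of the PROPORTION of successes in n
    independent events: mean = p, s^2 = pq/n (equation (2))."""
    q = 1.0 - p
    return p, math.sqrt(p * q / n)


# Experiment (1): twelve dice, a throw of 4, 5 or 6 reckoned a success.
mean, sd = number_of_successes(12, 0.5)
print(mean, round(sd, 3))  # 6.0 1.732

# Precision varies as the square root of the number of observations:
# four times the observations, half the standard-deviation.
_, s_100 = proportion_of_successes(100, 0.5)
_, s_400 = proportion_of_successes(400, 0.5)
print(s_100 / s_400)
```

The ratio printed on the last line is 2 (to rounding error), whatever the value of $p$, since $pq$ cancels.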

(2) … Mean $M = 2.000$, standard-deviation $\sigma = 1.296$. Actual proportion of successes $2.00/12 = 0.1667$, agreeing with the theoretical value to the fourth place of decimals. Of course such very close agreement is accidental, and not to be always expected.

(3) (G. U. Yule.) The following may be taken as an illustration based on a smaller number of observations. Three dice were thrown 648 times, and the numbers of 5's or 6's noted at each throw. $p = 1/3$, $q = 2/3$. Theoretical mean 1. Standard-deviation, 0.816. Frequency-distribution observed:

[Table: observed frequency-distribution of the number of successes.]
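An experiment of the kind described in (3) may be imitated on a computer. The following Python sketch is an illustration under stated assumptions, not a reproduction of the original record: the pseudo-random generator stands in for true dice, the seed is arbitrary, and the function name is hypothetical. It throws three dice 648 times, counts the 5's and 6's at each throw, and compares the observed mean and standard-deviation with the theoretical values $np$ and $\sqrt{npq}$.

```python
import math
import random
import statistics


def throw_dice(n_dice, n_throws, success_faces, rng):
    """Return the number of successes observed at each of n_throws throws
    of n_dice dice, a success being any face in success_faces."""
    counts = []
    for _ in range(n_throws):
        successes = sum(
            1 for _ in range(n_dice) if rng.randint(1, 6) in success_faces
        )
        counts.append(successes)
    return counts


rng = random.Random(1922)  # arbitrary seed, used only for reproducibility
counts = throw_dice(3, 648, {5, 6}, rng)

n, p = 3, 1.0 / 3.0
theoretical_mean = n * p
theoretical_sd = math.sqrt(n * p * (1.0 - p))

observed_mean = statistics.fmean(counts)
observed_sd = statistics.pstdev(counts)  # standard-deviation about the mean

print("theoretical:", round(theoretical_mean, 3), round(theoretical_sd, 3))
print("observed:   ", round(observed_mean, 3), round(observed_sd, 3))
```

The observed values will differ a little from the theoretical ones, and differently for each seed; such differences are themselves fluctuations of simple sampling.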

… identically similar throughout the experiment, so that the chance of throwing "heads" with the coins or, say, "six" with the dice was the same throughout: we did not commence an experiment with dice loaded in one way and later on take a fresh set of dice loaded in another way. Consequently if formula (2) is to hold good in our practical case of sampling there must not be a difference in any essential respect, i.e. in any character that can affect the proportion observed, between the localities from which the observations are drawn, nor, if the observations have been made at different epochs, must any essential change have taken place during the period over which the observations are spread. Where the causation of the character observed is more or less unknown, it may, of course, be difficult or impossible to say what differences or changes are to be regarded as essential, but, where we have more knowledge, the condition laid down enables us to exclude certain cases at once from the possible applications of formula (1) or (2). Thus it is obvious that the theory of simple sampling cannot apply to the variations of the death-rate in localities with populations of different age and sex compositions, nor to death-rates in a mixture of healthy and unhealthy districts, nor to death-rates in successive years during a period of continuously improving sanitation. In all such cases variations due to definite causes are superposed on the fluctuations of sampling.

(b) In the second place, we have also tacitly assumed not only that we were using the same set of coins or dice throughout, so that the chances $p$ and $q$ were the same at every trial, but also that all the coins and dice in the set used were identically similar, so that the chances $p$ and $q$ were the same for every coin or die. Consequently, if our formulae are to apply in the practical case of sampling, the conditions that regulate the appearance of the character observed must not only be the same for every sample, but also for every individual in every sample. This is again a very marked limitation. To revert to the case of death-rates, formulae (1) and (2) would not apply to the numbers of persons dying in a series of samples of 1000 persons, even if these samples were all of the same age and sex composition, and living under the same sanitary conditions, unless, further, each sample only contained persons of one sex and one age. For if each sample included persons of both sexes and different ages, the condition would be broken, the chance of death during a given period not being the same for the two sexes, nor for the young and the old. The groups would not be homogeneous in the sense required by the conditions from which our formulae have been deduced. Similarly, if we were observing hair-colours, our formulae would not apply if the samples were compounded by always taking one person from district A, another from district B, and so on, these districts not being similar as regards the distribution of hair-colour.

The above conditions were only tacitly assumed in our previous work, and consequently it has been necessary to emphasise them specially. The third condition was explicitly stated.

(c) The individual "events," or appearances of the character observed, must be completely independent of one another, like the throws of a die, or sensibly so, like the drawings of balls from a bag containing a number of balls that is very large compared with the number drawn. Reverting to the illustration of a death-rate, our formulae would not apply even if the sample populations were composed of persons of one age and one sex, if we were dealing, for example, with deaths from an infectious or contagious disease. For if one person in a certain sample has contracted the disease in question, he has increased the possibility of others doing so, and hence of dying from the disease. The same thing holds good for certain classes of deaths from accident, e.g. railway accidents due to derailment, and explosions in mines: if such an accident is fatal to one person it is probably fatal to others also, and consequently the annual returns show large and more or less erratic variations.

When we speak of simple sampling in the following pages, the term is intended to imply the fulfilment of all the conditions (a), (b), and (c), all the samples and all the individual contributions to each sample being taken under precisely the same conditions, and the individual "events" or appearances of the character being quite independent.

It may be as well expressly to note that we need not make any assumption as to the conditions that determine $p$ unless we have to estimate $\sqrt{npq}$ a priori. If we draw a sample and observe in it the actual proportion of, say, A's, draw another sample under precisely the same conditions and observe the proportion of A's in the two samples together, add to these a third sample, and so on, we will find that $p$ approaches, not continuously, but with some fluctuations, closer and closer to some limiting value. It is this limiting value which is to be used in our formulae: the value of $p$ that would be observed in a very large sample. The standard-deviation of the number of sixes thrown with $n$ dice, on this understanding, may be $\sqrt{npq}$, even if the dice be out of truth or loaded so that $p$ is no longer 1/6: the standard-deviation of the number of black balls in samples of $n$ drawn from an infinitely large mixture of black and white balls in equal proportions may be $\sqrt{npq}$ even if $p$ is, for some reason, not 1/2 owing to the black balls, say, tending to slip through our fingers. (Cf. Chap. …, § 4.)

9. It is evident that these conditions very much limit the field of practical cases of an economic or sociological character to which formulae (1) and (2) can apply without considerable modification. The formulae appear, however, to hold to a high degree of approximation in certain biological cases, notably in the proportions of offspring of different types obtained on crossing hybrids, and, with some limitations, to the proportions of the two sexes at birth. It is possible, accordingly, that in these cases all the necessary conditions are fulfilled, but this is not a necessary inference from the mere applicability of the formulae (cf. Chap. XIV., § 15). In the case of the sex-ratio at birth, it seems doubtful whether the rule applies to the frequency of the sexes in individual families of given numbers (ref. 9), but it does apply fairly closely to the sex-ratios of births in different localities, and still more closely to the ratios in one locality during successive periods. That is to say, if we note the number of males in a series of groups of $n$ births each, the standard-deviation of that number is approximately $\sqrt{npq}$, where $p$ is the chance of a male birth; or, otherwise, $\sqrt{pq/n}$ is the standard-deviation of the proportion of male births. We are not able to assign an a priori value to the chance $p$ as in the case of dice-throwing, but it is quite sufficiently accurate for practical purposes to use the proportion of male births actually observed if that proportion be based on a moderately large number of observations.

10. In Table VI. of Chap. IX. (p. 163) was given a correlation-table between the total numbers of births in the registration-districts of England and Wales during the decade 1881-90 and the proportion of male births. The table below gives some similar figures, based on the same data, for a few isolated groups of districts containing not less than 30 to 40 districts each. In both tables the drop in dispersion as we pass from the small to the large districts is extremely striking. The actual standard-deviations, and the standard-deviations of simple sampling corresponding to the mid-numbers of births, are given at the foot of the table, and it will be seen that the two agree, on the whole, with surprising closeness, considering the small numbers of observations. The actual standard-deviation is, however, the larger of the two in every case but one. The corresponding standard-deviations for Table VI. of Chap. IX. are given in Qu. 7 at the end of this chapter, and show the same general agreement with the standard-deviations of simple sampling; the actual standard-deviations are, however, again, as a rule, slightly in excess of the theoretical values.

Table showing Frequencies of Registration Districts in England and Wales with Different Ratios of Male to Total Births during the Decade 1881-90, for Groups of Districts with the Numbers of Births in the Decade lying between Certain Limits. [Data based on Decennial Supplement to Fifty-fifth Annual Report of the Registrar-General for England and Wales.]
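The standard-deviations of simple sampling attached to a table of this kind are expressed per 1000 births, and may be found by substituting the proportions per 1000 for $p$ and $q$ in equation (2). The Python sketch below (with a hypothetical function name) uses the figures quoted in the text for the first column, 508 males per 1000 births and a mid-number of 2000 births, and confirms that the result is the same as 1000 times the value given by equation (2) with $p = 0.508$.

```python
import math


def sd_per_thousand(males_per_1000, n_births):
    """Standard-deviation of simple sampling of the proportion of male
    births, expressed per 1000 births, for groups of n_births births."""
    others_per_1000 = 1000.0 - males_per_1000
    return math.sqrt(males_per_1000 * others_per_1000 / n_births)


s = sd_per_thousand(508, 2000)
print(round(s, 1))  # 11.2

# The same value from equation (2) with p as a fraction, multiplied by 1000.
p = 0.508
s_check = 1000.0 * math.sqrt(p * (1.0 - p) / 2000)
print(abs(s - s_check) < 1e-9)
```

The value falls as the number of births in the group rises, which is the drop in dispersion from the small to the large districts that the table exhibits.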

The student should note that in both cases the standard-deviations given are standard-deviations of the proportion of male births per 1000 of all births, that is, 1000 times the values given by equation (2). These values are given by simply substituting the proportions per 1000 for $p$ and $q$ in the formula. Thus for the first column of Table I. the proportion of males is 508 per 1000 births, the mid-number of births 2000, and therefore

$$s = \left(\frac{508 \times 492}{2000}\right)^{1/2} = 11.2.$$

11. In the above illustration the difficulty due to the wide variation in the number of births $n$ in different districts has been surmounted by grouping these districts in limited class intervals, and assuming that it would be sufficiently accurate for practical purposes to treat all the districts in one class as if the sex-ratios had been based on the mid-numbers of births. Given a sufficiently large number of observations, such a process does well enough, though it is not very good. But if the number of observations does not exceed, perhaps, 50 or 60 altogether, grouping is obviously out of the question, and some other procedure must be adopted. Suppose, then, that a series of samples have been taken from the same material, $f_1$ samples containing $n_1$ individuals or observations each, $f_2$ containing $n_2$, $f_3$ containing $n_3$, and so on. What would be the standard-deviation of the observed proportions in these samples? Evidently the square of the standard-deviation in the first group would be $pq/n_1$, in the second $pq/n_2$, and so on; therefore, as the means tend to the same values in all the groups, we must have for the whole series of $N$ samples

$$N s^2 = pq\left(\frac{f_1}{n_1} + \frac{f_2}{n_2} + \frac{f_3}{n_3} + \dots\right).$$

But if $H$ be the harmonic mean of $n_1$, $n_2$, $n_3$, ...,

$$\frac{1}{H} = \frac{1}{N}\left(\frac{f_1}{n_1} + \frac{f_2}{n_2} + \frac{f_3}{n_3} + \dots\right),$$

and accordingly

$$s^2 = \frac{pq}{H}. \qquad (3)$$

That is to say, where the number of observations varies from one sample to another, the harmonic mean number of observations in a sample must be substituted for $n$ in equation (2).

Thus the following percentages (taken to the nearest unit) of albinos were obtained in 131 litters from hybrids of Japanese waltzing mice by albinos, crossed inter se (A. D. Darbishire, Biometrika, iii. p. 30):

[Table of percentages.]
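The use of the harmonic mean in equation (3) may be sketched numerically as follows. This is a minimal illustration only: the function names are our own, and the litter sizes and frequencies in `litters` are hypothetical values chosen for the sketch, not Darbishire's data. The first calculation repeats the sex-ratio figure quoted in the text (508 per 1000, mid-number of births 2000).

```python
import math

def sd_of_proportion(p, n):
    """Standard-deviation of simple sampling for a proportion, equation (2)."""
    return math.sqrt(p * (1.0 - p) / n)

def harmonic_mean_size(sizes_and_counts):
    """Harmonic mean H of the sample sizes: 1/H = (1/N) * sum(f_i / n_i)."""
    total_samples = sum(f for _, f in sizes_and_counts)
    reciprocal_sum = sum(f / n for n, f in sizes_and_counts)
    return total_samples / reciprocal_sum

# Figure from the text: 508 males per 1000 births, mid-number of births 2000.
sd_per_1000 = 1000 * sd_of_proportion(0.508, 2000)

# Hypothetical litters as (size n_i, number of litters f_i); illustrative only.
litters = [(4, 10), (6, 25), (8, 40), (10, 20)]
H = harmonic_mean_size(litters)

# Equation (3): substitute H for n, here with an assumed chance p = 0.25.
sd_varying = sd_of_proportion(0.25, H)

print(f"s per 1000 births: {sd_per_1000:.1f}")
print(f"harmonic mean size: {H:.2f}, s = {sd_varying:.3f}")
```

The harmonic mean is always less than the arithmetic mean of the same sizes when the sizes differ, so substituting the arithmetic mean for $n$ would understate the standard-deviation.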

The frequency-distribution of the number of deaths per army corps per annum was

The "standard-deviation of simple sampling" may be regarded as a measure of the magnitude of such errors, and may be called accordingly the standard error. Three principal cases of comparison may be distinguished.

Case I. It is desired to know whether the deviation of a certain observed number or proportion from an expected theoretical value is possibly due to errors of sampling. In this case the observed difference is to be compared with the standard error of the theoretical number or proportion, for the number of observations contained in the sample.

Example i. In the first illustration of § 7, 25,145 throws of a 4, 5, or 6 were made in lieu of the 24,576 expected (out of 49,152 throws altogether). The excess is 569 throws. Is this excess possibly due to mere fluctuations of sampling? The standard error is

$$\sigma = \sqrt{\tfrac{1}{2} \times \tfrac{1}{2} \times 49152} = 110.9.$$

The deviation observed is 5.1 times the standard error, and, practically speaking, could not occur as a fluctuation of simple sampling. It may perhaps indicate a slight bias in the dice.

The problem might, of course, have been attacked equally well from the standpoint of the proportion in lieu of the absolute number of 4's, 5's, or 6's thrown. This proportion is 0.5116 instead of the theoretical 0.5000, difference in excess 0.0116. The standard error of the proportion is

$$s = \sqrt{\tfrac{1}{2} \times \tfrac{1}{2} \times \frac{1}{49152}} = 0.00226,$$

and the difference observed bears the same ratio to the standard error as before, as of course it must.

Example ii. (Data from the Second Report of the Evolution Committee of the Royal Society, 1905, p. 72.) Certain crosses of Pisum sativum gave 5321 yellow and 1804 green seeds. The expectation is 25 per cent. of green seeds, or 1781. Can the divergence from the exact theoretical result have arisen owing to errors of sampling only? The numerical difference from the expected result is 23. The standard error is

$$\sigma = \sqrt{0.25 \times 0.75 \times 7125} = 36.6.$$

Hence the divergence from theory is only some 3/5 of the standard error, and may very well have arisen owing simply to fluctuations of sampling.
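The two comparisons of Case I may be checked with a short calculation. The sketch below uses only the counts quoted in Examples i and ii; the function and variable names are our own and carry no authority beyond this illustration.

```python
import math

def se_number(p, n):
    """Standard error of the number of successes in n trials with chance p."""
    return math.sqrt(n * p * (1.0 - p))

def se_proportion(p, n):
    """Standard error of the proportion of successes in n trials."""
    return math.sqrt(p * (1.0 - p) / n)

# Example i: 25,145 successes (a 4, 5, or 6) in 49,152 throws, chance 1/2.
throws, successes = 49152, 25145
excess = successes - 0.5 * throws
se_dice = se_number(0.5, throws)
ratio_dice = excess / se_dice

# The same comparison made on the proportion gives the same ratio.
ratio_dice_prop = (successes / throws - 0.5) / se_proportion(0.5, throws)

# Example ii: 1804 green seeds out of 5321 + 1804, expectation 25 per cent.
seeds, green = 5321 + 1804, 1804
se_peas = se_number(0.25, seeds)
ratio_peas = (green - 0.25 * seeds) / se_peas

print(f"dice: excess {excess:.0f}, standard error {se_dice:.1f}, ratio {ratio_dice:.1f}")
print(f"peas: standard error {se_peas:.1f}, ratio {ratio_peas:.2f}")
```

A ratio above about three, as for the dice, lies outside the ordinary range of fluctuations of simple sampling; a ratio below one, as for the seeds, lies well within it.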

Working from the observed proportion of green seeds, viz. 0.2532 instead of the theoretical 0.25, we have

$$s = \sqrt{0.25 \times 0.75 / 7125} = 0.0051,$$

and similarly the divergence from theory is only some 3/5 of the standard error, as before.

It should be noted that this method must not be used as a test of association by comparing the difference of $(AB)$ from $(A)(B)/N$ with a standard error calculated from the latter value as a "theoretical number," for it is not a theoretical number given a priori as in the above illustrations, and $(A)$ and $(B)$ are themselves liable to errors of sampling. If we formed an association-table between the results of tossing two coins $N$ times, $\sigma = \sqrt{N \cdot \tfrac{1}{4} \cdot \tfrac{3}{4}}$ would be the standard error for the divergence of $(AB)$ from the a priori value $N/4$, not the standard error for differences of $(AB)$ from $(A)(B)/N$, $(A)$ and $(B)$ being the numbers of heads thrown in the case of the first and the second coin respectively.

Case II. Two samples from distinct materials or different universes give proportions of A's $p_1$ and $p_2$, the numbers of observations in the samples being $n_1$ and $n_2$ respectively. (a) Can the difference between the two proportions have arisen merely as a fluctuation of simple sampling, the two universes being really similar as regards the proportion of A's therein? (b) If the difference indicated were a real one, might it vanish, owing to fluctuations of sampling, in other samples taken in precisely the same way? This case corresponds to the testing of an association which is indicated by a comparison of the proportion of A's amongst B's and β's.

(a) We have no theoretical expectation in this case as to the proportion of A's in the universe from which either sample has been taken. Let us find, however, whether the observed difference between $p_1$ and $p_2$ may not have arisen solely as a fluctuation of simple sampling, the proportion of A's being really the same in both cases, and given, let us say, by the (weighted) mean proportion in our two samples together, i.e. by

$$p_0 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$

(the best guide that we have). Let $\epsilon_1$, $\epsilon_2$ be the standard errors in the two samples, then

$$\epsilon_1^2 = p_0 q_0 / n_1, \qquad \epsilon_2^2 = p_0 q_0 / n_2.$$

If the samples are simple samples in the sense of the previous work, then the mean difference between $p_1$ and $p_2$ will be zero,

and the standard error of the difference $\epsilon_{12}$, the samples being independent, will be given by

$$\epsilon_{12}^2 = p_0 q_0 \left(\frac{1}{n_1} + \frac{1}{n_2}\right). \qquad (5)$$

If the observed difference is less than some three times $\epsilon_{12}$ it may have arisen as a fluctuation of simple sampling only.

(b) If, on the other hand, the proportions of A's are not the same in the material from which the two samples are drawn, but $p_1$ and $p_2$ are the true values of the proportions, the standard errors of sampling in the two cases are

$$\epsilon_1^2 = p_1 q_1 / n_1, \qquad \epsilon_2^2 = p_2 q_2 / n_2,$$

and consequently

$$\epsilon_{12}^2 = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}. \qquad (6)$$

If the difference between $p_1$ and $p_2$ does not exceed some three times this value of $\epsilon_{12}$, it may be obliterated by an error of simple sampling on taking fresh samples in the same way from the same material.

Further, the student should note that the value of $\epsilon_{12}$ given by equation (6) is frequently employed, in lieu of that given by equation (5), for testing the significance of an observed difference. The justification of this usage we indicate briefly later (Chap. XIV. § 3). Here it is sufficient to state that, if $n$ be large, equation (6) gives approximately the standard-deviation of the true values of the difference for a given observed value, and hence, if the observed difference is greater than some three times the value of $\epsilon_{12}$ given by (6), it is hardly possible that the true value of the difference can be zero. The difference between the values of $\epsilon_{12}$ given by (5) and (6) is indeed, as a rule, of more theoretical than practical importance, for they do not differ largely unless $p_1$ and $p_2$ differ largely, and in that case either formula will place the difference outside the range of fluctuations of sampling.

Example iii. The following data were given in Qu. 3 of Chap. III. for plants of Lobelia fulgens obtained by cross- and self-fertilisation respectively:

                                  Height
    Parentage             Above Average   Below Average
    Cross-fertilised           17              17
    Self-fertilised            12              22

The figures indicate an association between tallness and cross-fertilisation of parentage. Is this association significant of some real difference, or may it have arisen solely as an "error of sampling"?

The proportion of plants above average height in the two classes (cross- and self-fertilised) together is 29/68. The standard-deviation of the differences due to simple sampling between the proportions of "tall" plants in two samples of 34 observations each is therefore

$$\epsilon_{12} = \left(\frac{29}{68} \times \frac{39}{68} \times \frac{2}{34}\right)^{1/2} = 0.120,$$

or 12.0 per cent. The actual proportions observed are 50 per cent. and 35 per cent., difference 15 per cent. As this difference is only slightly in excess of the standard error of the difference, for samples of 34 observations drawn from identical material, no definite significance could be attached to it if it stood alone. The student will notice, however, that all the other cases cited from Darwin in the question referred to show an association of the same sign, but rather more marked. Hence the difference observed may be a real one, or perhaps the real difference may be greater and may be partially masked by a fluctuation of sampling. If 50 per cent. and 35 per cent. were the true proportions in the two classes, the standard error of the percentage difference would be, by equation (6),

$$\epsilon_{12} = \left(\frac{50 \times 50}{34} + \frac{35 \times 65}{34}\right)^{1/2} = 11.9 \text{ per cent.},$$

and consequently the actual difference might not infrequently be completely masked by fluctuations of sampling, so long as experiments were only conducted on the same small scale.

Example iv. (Data from J. Gray, Memoir on the Pigmentation Survey of Scotland, Jour. of the Royal Anthropological Institute, vol. xxxvii., 1907.) The following are extracted from the tables relating to hair-colour of girls at Edinburgh and Glasgow:

                  Of Medium Hair-colour   Total observed   Per cent. Medium
    Edinburgh             4,008                9,743             41.1
    Glasgow              17,529               39,764             44.1

Can the difference observed in the percentage of girls of medium hair-colour have arisen solely through fluctuations of sampling? In the two towns together the percentage of girls with medium hair-colour is 43.5 per cent. If this were the true percentage, the standard error of sampling for the difference between percentages observed in samples of the above sizes would be

$$\epsilon_{12} = (43.5 \times 56.5)^{1/2} \times \left(\frac{1}{9743} + \frac{1}{39764}\right)^{1/2} = 0.56 \text{ per cent.}$$
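The two standard errors of a difference, equations (5) and (6), may be compared on the figures of Examples iii and iv by a calculation such as the following. It is a sketch only, with function names of our own choosing; it works from the raw counts, so that the self-fertilised proportion enters as 12/34 rather than the rounded 35 per cent.

```python
import math

def se_diff_pooled(p0, n1, n2):
    """Equation (5): both samples assumed drawn with the same proportion p0."""
    return math.sqrt(p0 * (1.0 - p0) * (1.0 / n1 + 1.0 / n2))

def se_diff_separate(p1, n1, p2, n2):
    """Equation (6): p1 and p2 treated as the true, different proportions."""
    return math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)

# Example iii: 17 of 34 cross-fertilised and 12 of 34 self-fertilised plants tall.
p0_plants = (17 + 12) / (34 + 34)
se5_plants = se_diff_pooled(p0_plants, 34, 34)
se6_plants = se_diff_separate(17 / 34, 34, 12 / 34, 34)

# Example iv: girls of medium hair-colour, Edinburgh and Glasgow.
p_edin, n_edin = 4008 / 9743, 9743
p_glas, n_glas = 17529 / 39764, 39764
p0_girls = (4008 + 17529) / (n_edin + n_glas)
se5_girls = se_diff_pooled(p0_girls, n_edin, n_glas)
se6_girls = se_diff_separate(p_edin, n_edin, p_glas, n_glas)
ratio_girls = (p_glas - p_edin) / se5_girls

print(f"plants: (5) {100 * se5_plants:.1f}%, (6) {100 * se6_plants:.1f}%")
print(f"girls:  (5) {100 * se5_girls:.2f}%, (6) {100 * se6_girls:.2f}%")
```

For the small samples of plants the two formulae give about 12.0 and 11.9 per cent.; for the large samples of girls both give a standard error of about 0.56 per cent.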

The actual difference is 3.0 per cent., or over 5 times this, and could not have arisen through the chances of simple sampling. If we assume that the difference is a real one and calculate the standard error by equation (6), we arrive at the same value, viz. 0.56 per cent. With such large samples the difference could not, accordingly, be obliterated by the fluctuations of simple sampling alone.

Case III. Two samples are drawn from distinct material or different universes, as in the last case, giving proportions of A's $p_1$ and $p_2$, but in lieu of comparing the proportion $p_1$ with $p_2$ it is compared with the proportion of A's in the two samples together, viz. $p_0$, where, as before,

$$p_0 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}.$$

Required to find whether the difference between $p_1$ and $p_0$ can have arisen as a fluctuation of simple sampling, $p_0$ being the true proportion of A's in both samples.

This case corresponds to the testing of an association which is indicated by a comparison of the proportion of A's amongst the B's with the proportion of A's in the universe. The general treatment is similar to that of Case II., but the work is complicated owing to the fact that errors in $p_1$ and $p_0$ are not independent.

If $\epsilon_{01}$ be the standard error of the difference between $p_1$ and $p_0$, we have at once

$$\epsilon_{01}^2 = \epsilon_1^2 + \epsilon_0^2 - 2 r_{01} \epsilon_0 \epsilon_1,$$

$r_{01}$ being the correlation between errors of simple sampling in $p_1$ and $p_0$. But, from the above equation relating $p_0$ to $p_1$ and $p_2$, writing it in terms of deviations in $p_0$, $p_1$ and $p_2$, multiplying by the deviation in $p_1$ and summing, we have, since errors in $p_1$ and $p_2$ are uncorrelated,

$$r_{01} \epsilon_0 \epsilon_1 = \frac{n_1}{n_1 + n_2} \epsilon_1^2.$$

Therefore finally

$$\epsilon_{01}^2 = \frac{p_0 q_0}{n_1 + n_2} \cdot \frac{n_2}{n_1}.$$

Unless the difference between $p_0$ and $p_1$ exceed, say, some three times this value of $\epsilon_{01}$, it may have arisen solely by the chances of simple sampling.
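The standard error of Case III may be sketched in the same manner, applied to the figures already given for the Lobelia plants and for the hair-colour survey. The function name is our own. The last lines check the limiting behaviour of the formula when the first sample is very small compared with the second, using arbitrary illustrative values for the proportion and the sample sizes.

```python
import math

def se_part_vs_whole(p0, n1, n2):
    """Standard error of the difference between the proportion in a sample of
    n1 observations and the proportion in the n1 + n2 observations together."""
    return math.sqrt(p0 * (1.0 - p0) * n2 / (n1 * (n1 + n2)))

# Cross-fertilised plants (34) against all plants (68), pooled proportion 29/68.
se_plants = se_part_vs_whole(29 / 68, 34, 34)

# Edinburgh girls (9743) against both towns together (49,507).
p0_girls = (4008 + 17529) / (9743 + 39764)
se_girls = se_part_vs_whole(p0_girls, 9743, 39764)

# When n1 is very small compared with n2 the result approaches the ordinary
# standard error for a sample of n1 observations (illustrative values only).
p_demo, n_small, n_huge = 0.3, 10, 10**9
se_limit = se_part_vs_whole(p_demo, n_small, n_huge)
se_simple = math.sqrt(p_demo * (1.0 - p_demo) / n_small)

print(f"plants: {100 * se_plants:.1f}%   girls: {100 * se_girls:.2f}%")
print(f"limit check: {se_limit:.6f} against {se_simple:.6f}")
```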

It will be observed that if $n_1$ be very small compared with $n_2$, $\epsilon_{01}$ approaches, as it should, the standard error for a sample of $n_1$ observations.

We omit, in this case, the allied problem whether, if the difference between $p_1$ and $p_0$ indicated by the samples were real, it might be wiped out in other samples of the same size by fluctuations of simple sampling alone. The solution is a little complex as we no longer have $\epsilon_0^2 = p_0 q_0 / (n_1 + n_2)$.

Example v. Taking the data of Example iii., suppose that we compare the proportion of tall plants amongst the offspring resulting from cross-fertilisations (viz. 50 per cent.) with the proportion amongst all offspring (viz. 29/68, or 42.6 per cent.). As, in this case, both the subsamples have the same number of observations, $n_1 = n_2 = 34$, and

$$\epsilon_{01} = \left(\frac{29}{68} \times \frac{39}{68} \times \frac{1}{68}\right)^{1/2} = 0.060,$$

or 6 per cent. As in the working of Example iii., the observed difference is only 1.25 times the standard error of the difference, and consequently it may have arisen as a mere fluctuation of sampling.

Example vi. Taking now the figures of Example iv., suppose that we had compared the proportion of girls of medium hair-colour in Edinburgh with the proportion in Glasgow and Edinburgh together. The former is 41.1 per cent., the latter 43.5 per cent., difference 2.4 per cent. The standard error of the difference between the percentages observed in the subsample of 9743 observations and the entire sample of 49,507 observations is therefore

$$\epsilon_{01} = (43.5 \times 56.5)^{1/2} \left(\frac{39764}{9743 \times 49507}\right)^{1/2} = 0.45 \text{ per cent.}$$

The actual difference is over five times this (the ratio must, of course, be the same as in Example iv.), and could not have occurred as a mere error of sampling.

REFERENCES.

The theory of sampling, for the cases dealt with in this chapter, is generally treated by first determining the frequency-distribution of the number of successes in a sample. This frequency-distribution is not considered till Chapter XV., and the student will be unable to follow much of the literature until he has read that chapter.

Experimental results of dice throwing, coin tossing, etc.

(1) QUETELET, A., Lettres sur la théorie des probabilités; Bruxelles, 1846 (English translation by O. G. Downes; C. & E. Layton, London, 1849). See especially letter xiv. and the table on p. 374 of the French, p. 255 of the English, edition.

. (6) PoissoN. vol. pp. vol. (19) Whittaker. 1829. (11) Edgeworth.. Series 6. 273 Fischer. (Pp... 1914. Karl. (§ 12). Ixi. H. (2) (3) —SIMPLE SAMPLING OF ATTRIBUTES. Boy. Boy. Macmillan. H. Yule. U.) L. V. pp. Soc. F. IT. Ixix.) Lexis. 1902. "Methods jubilee volume. p. General (5) : and applications to sex-ratio of births. Camb. Stat." Proc. (Sections 2 to on the binomial distribution. 1895.." Biometrika. of the Manchester Lit. (especially Part II." Phil. 25-35. W.. VON. F. S. vol.. 18 . 1897-8 (especially partii. W. "Tables of Poisson's Exponential Binomial Limit. Jena. xxviii. E. data regarding the distribution of sexes in families on p.. 1885. 425. Mag. S. xxii... Soc. Series A.) ) XIII. London. "On the Sex-ratios of Births in the Eegistration Districts of England and Wales. 181..'' Proc. D. G. reprints of some of Professor Lexis' earlier papers in a form convenient for Oesellschaft reference. vol. {Cf. vol. 1910. H. Das Gesetz der kleinen Zahlen. 351. Dabbishikb.. Stat. Edgbwoktii. F. 1888. and Phil. vols." 6 Trans. D. clxxxvi. . §11.) (13) As regards the sex-ratio. 1914. reference may also be made to papers in vols. Soc. Zwr Theorie der Massenerscheinungen in der miTischlichen : Freiburg. 576. etc. vol. p. Soc. Teubner. 239. (15) (16) BoRTKBwiTSOH. the of Statistics. 1907. vol. "Miscellaneous Applications of the Calculus of Probabilities. John. p.. and for illustrating Statistical Correlation. "On the Error of Counting with a Haemacytometer. E. (7) Lexis. V. 1914. Fischer.) (8) (9) Edgeworth. 1890. p... Phil. and H. 280 . 119).. vol.) (18) SoPBR. (12) Vigor. . 3rd edn. with a note by H. 1881-90. "Some Tables Mem. (17) Rutherford. Pearson. li. 1898. p. der Theorie der Statistik . Y. " Sur la proportion des naissances des filles et des gargons. and G." Biometrika. (Contains. (The frequency of particles emitted during a small interval of time follows the law of small chances: the law deduced by Bateman in ignorance of previous work. (4) ). Weldon. 698. p. 
or on "Probability. Yule.. pp. xx. vol. vol. and vi. X. 1877. "Skew Variation in Homogeneous Material. xvii. Y." Joitr. D. 343. Paris. Ix. and Woods. 1903. The law of small chances (14) PoissoN. 205-7.. "Fluctuations of Sampling in Mendelian Ratios. "The probability variations in the distribution of a particles." Jour. 390 et seg. Abliandlungen zwr Theorie der BevSlkerungs und MoraUtatistik ." Eleventh Edition. Geiger. D. p. with new matter. 36-71. Soc. Bateman. Venn. A. vol. Student.." JoKr. Soc. 1906. Jiecherches sur la pvobaiiliti des jugemerds.. (Use of the harmonic mean as in Stat. p. des Sciences. Article on the " Law of Error" in the Tenth Edition of the Sncyelopcedia Britannica. "On Poisson's Law of Small Numbers. 264. Leipzig. vol. Phil." Mimoires de I'Acad. Die Orwndziige Jena. (Principally theoretical the statistical illustrations very slight. The Logic of Chamee. X. (10) to which reference was made in § 9. . Soy.... "^iomeirifca. Lucy. p. ix. Wbstergaard. .. of Biometrika by Heron. Roy. 1907.. Ixi. 1837.. Y.

4 total of columns of all the 13 tables given. 1. or 6 being reckoned : as a "success. 5. EXERCISES.274 THEORY OF STATISTICS. Frequency. 4." Successes. 1 Successes.) Compare the actual with the theoretical mean and standard-deviation for the following record of 6500 throws of 12 dice. (Ref. 1 2 14 103 302 711 1231 1411 .

4. The proportion of successes in the data of Qu. 1 is 0.5097. Find the standard-deviation of the proportion with the given number of throws, and state whether you would regard the excess of successes as probably significant of bias in the dice.

5. In the 4096 drawings on which Qu. 2 is based 2030 balls were black and 2066 white. Is this divergence probably significant of bias?

6. If a frequency-distribution such as those of Questions 1, 2, and 3 be given, show how $n$ and $p$, if unknown, may be approximately determined from the mean and standard-deviation of the distribution. Find $n$ and $p$ in this way from the data of Qu. 1 and Qu. 3.

7. Verify the following results for Table VI. of Chapter IX., p. 163, and compare the results of the different grouping of the table on p. 263. In calculating the actual standard-deviation, use Sheppard's correction for grouping (p. 212).

CHAPTER XIV.

SIMPLE SAMPLING CONTINUED: EFFECT OF REMOVING THE LIMITATIONS OF SIMPLE SAMPLING.

1. Warning as to the assumption that three times the standard error gives the range for the majority of fluctuations of simple sampling of either sign; 2. Warning as to the use of the observed for the true value of $p$ in the formula for the standard error; 3. The inverse standard error, or standard error of the true proportion for a given observed proportion: equivalence of the direct and inverse standard errors when $n$ is large; 4-8. The importance of errors other than fluctuations of "simple sampling" in practice: unrepresentative or biassed samples; 9-10. Effect of divergences from the conditions of simple sampling: (a) effect of variation in $p$ and $q$ for the several universes from which the samples are drawn; 11-12. (b) Effect of variation in $p$ and $q$ from one sub-class to another within each universe; 13-14. (c) Effect of a correlation between the results of the several events; 15. Summary.

1. There are two warnings as regards the methods adopted in the examples in the concluding section of the last chapter which the student should note, as they may become of importance when the number of observations is small.

In the first place, he should remember that, while we have taken three times the standard error as giving the limits within which the great majority of errors of sampling of either sign are contained, the limits are not, as a rule, strictly the same for positive and for negative errors. As is evident from the examples of actual distributions in § 7, Chap. XIII., the distribution of errors is not strictly symmetrical unless $p = q = 0.5$. No theoretical rule as to the limits can be given, but it appears from the examples referred to and from the calculated distributions in Chap. XV. § 3, that a range of three times the standard error includes the great majority of the deviations in the direction of the longer "tail" of the distribution, while the same range on the shorter side may extend beyond the limits of the distribution altogether. If, therefore, $p$ be less than 0.5, our assumed range may be greater than is possible for negative errors, or if $p$ be greater than 0.5, greater than is possible for positive errors. The assumption is not, however, likely as a rule to lead to a serious mistake; as stated at the commencement of this paragraph, the point is of importance only when $n$ is small, for when $n$ is large the distribution tends to become sensibly symmetrical even for values of $p$ differing considerably from 0.5. (Cf. Chap. XV., for the properties of the limiting form of distribution.)

2. In the second place, the student should note that, where we were unable to assign any a priori value to $p$, we have assumed that it is sufficiently accurate to replace $p$ in the formula for the standard error by the proportion actually observed, say $\pi$. Where $n$ is large so that the standard error of $p$ becomes small relatively to the product $pq$ the assumption is justifiable, and no serious error is possible. If, however, $n$ be small, the use of the observed value $\pi$ may lead to an under- or over-estimation of the standard error which cannot be neglected. To get some rough idea of the possible importance of such effects, the approximate standard error $\epsilon$ may first be calculated as usual from the observed proportion $\pi$, and then fresh values recalculated, replacing $\pi$ by $\pi \pm 3\epsilon$. It should be remembered that the maximum value of the product $pq$ is given by $p = q = 0.5$, and hence these values, if within the limits of fluctuations of sampling, will give one limiting value for the standard error. Thus in Example iii. of Chap. XIII. the observed proportion of tall plants is 29/68, or, say, 43 per cent. The standard error of this proportion is 6 per cent., and a true proportion of 50 per cent. is therefore well within the limits of fluctuations of sampling. The maximum value of the standard error is therefore

$$\left(\frac{50 \times 50}{68}\right)^{1/2} = 6.06 \text{ per cent.}$$

On the other hand, the standard error is unlikely to be lower than that based on a proportion of 43 - 18 = 25 per cent., or

$$\left(\frac{25 \times 75}{68}\right)^{1/2} = 5.25 \text{ per cent.}$$

The procedure is by no means exact, but may serve to give a useful warning.

3. The two difficulties mentioned in §§ 1 and 2 arise when $n$, the number of cases in the sample, is small. The interpretation of the value of the standard error is also more limited in this case than when $n$ is large. Suppose a large number of observations to be made, by means of samples of $n$ observations each, on different masses of material, or in different universes, for each of which the true value of $p$ is known. On these data we could form a correlation-table between the true proportion $p$ in a given universe and the observed proportion $\pi$ in a sample of $n$ observations drawn therefrom. What we have found from the work of the last chapter is that the standard-deviation of an array of $\pi$'s associated with a certain true value $p$, in this table, is $(pq/n)^{1/2}$; but the question may be asked: What is the standard-deviation of the array at right angles to this, i.e. the array of $p$'s associated with a certain observed proportion $\pi$? In other words, given an observed proportion $\pi$, what is the standard-deviation of the true proportions? This is the inverse of the problem with which we have been dealing, and it is a much more difficult problem. On general principles, however, we can see that if $n$ be large, the two standard-deviations will tend, on the whole, to be nearly the same, while if $n$ be small the standard-deviation of the array of $\pi$'s will tend to be appreciably the greater of the two. For if $\pi = p + \delta$, $\delta$ is uncorrelated with $p$, and therefore if $\sigma_p$ be the standard-deviation of $p$ in all the universes from which samples are drawn, $\sigma_\pi$ the standard-deviation of observed proportions in the samples, and $\sigma_\delta$ the standard-deviation of the differences,

$$\sigma_\pi^2 = \sigma_p^2 + \sigma_\delta^2.$$

But $\sigma_\delta^2$ varies inversely as $n$. Hence if $n$ become very large, $\sigma_\delta$ becomes very small, $\sigma_\pi$ becomes sensibly equal to $\sigma_p$, and therefore the standard-deviations of the arrays, on an average, are also sensibly equal. If $n$ be large, therefore, $[\pi(1-\pi)/n]^{1/2}$ may be taken as giving, with sufficient exactness, the standard-deviation of the true proportion $p$ for a given observed proportion $\pi$. But if $n$ be small, $\sigma_\delta$ cannot be neglected in comparison with $\sigma_p$, $\sigma_\pi$ is therefore appreciably greater than $\sigma_p$, and the standard-deviation of the array of $\pi$'s is, on an average of all arrays, correspondingly greater than the standard-deviation of the array of $p$'s (the statement is not true for every pair of corresponding arrays, especially for extreme values of $p$ near 0 and 1). Further, it should be noticed that, while the regression of $\pi$ on $p$ is unity (i.e. the mean of the array of $\pi$'s is identical with $p$, the type of the array), the regression of $p$ on $\pi$ is less than unity. If we assume, for example, that a tabulation of all possible chances, observed for every conceivable subject, would give a distribution of $p$ ranging uniformly between 0 and 1, or indeed grouped symmetrically in any way round 0.5, any observed value $\pi$ greater than 0.5 will probably correspond to a true value of $p$ slightly lower than $\pi$, and conversely. We have already referred to the use of the inverse standard error in § 13 of Chap. XIII. (Case II., p. 269). If we determine, for example, the standard error of the difference between two observed proportions by equation (6) of that chapter, this may be taken, provided $n$ be large, as approximately the standard-deviation of true differences for the given observed difference.

4. The use of standard errors must be exercised with care. It is very necessary to remember the limited assumptions on which the theory of simple sampling is based, and to bear in mind that it covers those fluctuations alone which exist when all the assumed conditions are fulfilled. The formulæ obtained for the standard errors of proportions and of their differences have no bearing except on the one question, whether an observed divergence of a certain proportion from a certain other proportion that might be observed in a more extended series of observations, or that has actually been observed in some other series, might or might not be due to fluctuations of simple sampling alone. Their use is thus quite restricted, for in many cases of practical sampling this is not the principal question at issue. The principal question in many such cases concerns quite a different point, viz. whether the observed proportion $\pi$ in the sample may not diverge from the proportion $p$ existing in the universe from which it was drawn, owing to the nature of the conditions under which the sample was taken, $\pi$ tending to be definitely greater or definitely less than $p$. Such divergence between $\pi$ and $p$ might arise in two distinct ways: (1) owing to variations of classification in sorting the A's and α's, the characters not being well defined, a source of error which we need not further discuss, but one which may lead to serious results [cf. ref. 5 of Chap. V.]; (2) owing to either A's or α's tending to escape the attentions of the sampler. To give an illustration from artificial chance, if on drawing samples from a bag containing a very large number of black and white balls the observed proportion of black balls was $\pi$, we could not necessarily infer that the proportion of black balls in the bag was approximately $\pi$, even though the standard error were small, and we knew that the proportions in successive samples were subject to the law of simple sampling. For the black balls might be, say, much more highly polished than the white ones, so as to tend to escape the fingers of the sampler, or they might be represented by a number of lively black insects sheltering amongst white stones: in neither case would the ratio of black balls to white, or of insects to stones, be represented in their proper proportions. Clearly, in any parallel case, inferences as to the material from which the sample is drawn are of a very doubtful and uncertain kind, and it is this uncertainty whether the chance of inclusion in the sample is the same for A's and α's, far more than the mere divergences between different samples drawn in the same way, which renders many statistical results based on samples so dubious.

5. Thus in collecting returns as to family income and expenditure from working-class households, the families with lower incomes are almost certain to be under-represented; they largely "escape the sampler's fingers" from their simple lack of ability to keep the necessary accounts. It is almost impossible to say, however, to what extent they are under-represented, or to form any estimate as to the possible error when two such samples taken by different persons at different times, or in different places, are compared. Again, if estimates as to crop-production are formed on the basis of a limited number of voluntary returns, the estimates are likely to err in excess, as the persons who make the returns will probably include an undue proportion of the more intelligent farmers whose crops will tend to be above average. Whilst voluntary returns are in this way liable to lead to more or less unrepresentative samples, compulsory sampling does not evade the difficulty. Compulsion could not ensure equally accurate and trustworthy returns from illiterate and well-educated workmen, from intelligent and unintelligent farmers. The following of some definite rule in drawing the sample may also produce unrepresentative samples: if samples of fruit were taken solely from the top layers of baskets exposed for sale, the results might be unduly favourable; if from the bottom layer, unduly unfavourable. In such cases we can see that any sample, taken in the way supposed, is likely to be definitely biassed, in the sense that it will not tend to include, even in the long run, equal proportions of the A's and α's in the original material.

6. In other cases there may be no obvious reason for presuming such bias, but, on the other hand, no certainty that it does not exist. Thus if we noted the hair-colours of the children in, say, one school in ten in a large town, the question would arise whether this method would tend to give an unbiassed sample of all the children. No assured answer could be given: conjectures on the matter would be based in part on the way in which the schools were selected, e.g. the volunteering of teachers for the work might in itself introduce an element of bias. Again, if say 10,000 herrings were measured as landed at various North Sea ports, and the question were raised whether the sample was likely to be an unbiassed sample of North Sea herrings, no assured answer could be given. There may be no definite reason for expecting definite bias in either case, but it may exist, and no mere examination of the sample itself can give any information as to whether it exists or no.
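The point of §§ 4-6, that a small standard error gives no protection against a biassed method of drawing the sample, may be illustrated by a small simulation of the bag of black and white balls. The sketch rests on assumptions of our own: the bag holds equal numbers of the two colours, a black ball that is grasped is retained only half the time (the value of `keep_black` is purely illustrative), and the seed is fixed merely to make the run repeatable.

```python
import math
import random

def biased_sample_proportion(true_p, n, keep_black, rng):
    """Proportion of black balls among n retained balls, when a black ball
    that is grasped slips away with probability 1 - keep_black."""
    black = 0
    kept = 0
    while kept < n:
        is_black = rng.random() < true_p
        if is_black and rng.random() >= keep_black:
            continue  # the black ball escapes the sampler's fingers
        kept += 1
        black += is_black
    return black / n

rng = random.Random(12345)
true_p, n, keep_black = 0.5, 10000, 0.5

pi_obs = biased_sample_proportion(true_p, n, keep_black, rng)
se = math.sqrt(pi_obs * (1.0 - pi_obs) / n)

# Proportion the biassed procedure tends to in the long run (1/3 here).
long_run = true_p * keep_black / (true_p * keep_black + (1.0 - true_p))

print(f"observed {pi_obs:.3f}, standard error {se:.4f}")
print(f"long-run value of the procedure {long_run:.3f}, true proportion {true_p}")
```

Under these assumptions the observed proportion settles near 1/3 with a standard error below one-half per cent., yet it differs from the true proportion of 1/2 by many times that standard error; nothing in the sample itself reveals the discrepancy.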

6. Such an examination may be of service, however, as indicating one possible source of bias, viz. great heterogeneity in the original material. If, for example, in the first illustration, the hair-colours of the children differed largely in the different schools (much more largely than would be accounted for by fluctuations of simple sampling) it would be obvious that one school would tend to give an unrepresentative sample, and questionable therefore whether the five, ten or fifteen schools observed might not also have given an unrepresentative sample. Similarly, if the herrings in different catches varied largely, it would, again, be difficult to get a representative sample for a large area. But while the dissimilarity of subsamples would then be evidence as to the difficulty of obtaining a representative sample, the similarity of subsamples would, of course, be no evidence that the sample was representative, for some very different material which should have been represented might have been missed or overlooked.

7. The student must therefore be very careful to remember that even if some observed difference exceed the limits of fluctuation in simple sampling, it does not follow that it exceeds the limits of fluctuation due to what the practical man would regard, and quite rightly regard, as the chances of sampling. Further, he must remember that if the standard error be small, it by no means follows that the result is necessarily trustworthy: the smallness of the standard error only indicates that it is not untrustworthy owing to the magnitude of fluctuations of simple sampling. It may be quite untrustworthy for other reasons: owing to bias in taking the sample, for instance, or owing to definite errors in classifying the A's and a's. On the other hand, of course, it should also be borne in mind that an observed proportion is not necessarily incorrect, but merely to a greater or less extent untrustworthy if the standard error be large. Similarly, if an observed proportion $\pi_1$ in a sample drawn from one universe be greater than an observed proportion $\pi_2$ in a sample drawn from another universe, but $\pi_1 - \pi_2$ is considerably less than three times the standard error of the difference, it does not, of course, follow that the true proportions for the given universes, $p_1$ and $p_2$, are most probably equal. On the contrary, $p_1$ most likely exceeds $p_2$; the standard error only warns us that this conclusion is more or less uncertain, and that possibly $p_2$ may even exceed $p_1$.

8. Let us now consider the effect, on the standard-deviation of sampling, of divergences from the conditions of simple sampling which were laid down in § 8 of Chap. XIII.

9. First suppose the condition (a) to break down, so that there is some essential difference between the localities from which, or the conditions under which, samples are drawn, or that some essential change has taken place during the period of sampling.

We may represent such circumstances in a case of artificial chance by supposing that for the first $f_1$ throws of $n$ dice the chance of success for each die is $p_1$, for the next $f_2$ throws $p_2$, for the next $f_3$ throws $p_3$, and so on, the chance of success varying from time to time, just as the chance of death, even for individuals of the same age and sex, varies from district to district. Suppose, now, that the records of all these throws are pooled together. The mean number of successes per throw of the $n$ dice is given by

$$M = \frac{n}{N}\,(f_1 p_1 + f_2 p_2 + f_3 p_3 + \dots) = n p_0,$$

where $N = \Sigma(f)$ is the whole number of throws and $p_0$ is the mean value $\Sigma(fp)/N$ of the varying chance $p$. To find the standard-deviation of the number of successes at each throw consider that the first set of throws contributes to the sum of the squares of deviations an amount

$$f_1\left[n p_1 q_1 + n^2 (p_1 - p_0)^2\right],$$

$n p_1 q_1$ being the square of the standard-deviation for these throws, and $n(p_1 - p_0)$ the difference between the mean number of successes for the first set and the mean for all the sets together. Hence the standard-deviation $\sigma$ of the whole distribution is given by the sum of all quantities like the above, or

$$N\sigma^2 = n\,\Sigma(fpq) + n^2\,\Sigma f(p - p_0)^2 .$$

Let $\sigma_p$ be the standard-deviation of $p$, then the last sum is $N n^2 \sigma_p^2$, and substituting $1 - p$ for $q$, we have

$$\sigma^2 = n p_0 - n p_0^2 - n\sigma_p^2 + n^2\sigma_p^2 = n p_0 q_0 + n(n-1)\sigma_p^2 . \qquad (1)$$

This is the formula corresponding to equation (1) of Chap. XIII.: if we deal with the standard-deviation of the proportion of successes, instead of that of the absolute number, we have, dividing through by $n^2$, the formula corresponding to equation (2) of Chap. XIII., viz.

$$s^2 = \frac{p_0 q_0}{n} + \frac{n-1}{n}\,\sigma_p^2 . \qquad (2)$$

10. If $n$ be large and $s_0$ be the standard-deviation calculated from the mean proportion of successes $p_0$, equation (2) is sensibly of the form

$$s^2 = s_0^2 + \sigma_p^2,$$

Table showing Frequencies of Registration Districts in England and Wales with Different Proportions of Deaths in Childbirth (including Deaths from Puerperal Fever) per 1000 Births in the same Year. Data from same source, for the same Groups of Districts as in the Table of Chap. XIII. § 10. Decade 1881-90.

[...] the table on p. 263, in § 10 of Chap. XIII., [...] certain registration districts of England. It will be seen that in the first group of small districts there appears to be a significant standard-deviation of some 6 units in the proportion of male births per thousand, but in the more urban districts this falls to 1 or 2 units; in one case only does $s$ fall short of $s_0$. In the table on p. 283 are given some different data relating to the deaths of women in childbirth in the same groups of districts, and in this case the effect of definite causes is relatively larger, as one might expect. The values of $\sqrt{s^2 - s_0^2}$ suggest an almost uniform significant standard-deviation $\sigma_p = 0.8$ in the deaths of women per thousand births, five out of the eight values being very close to this average. The figures of this case also bring out clearly one important consequence of (2), viz. that if we make $n$ large $s$ becomes sensibly equal to $\sigma_p$, while if we make $n$ small $s^2$ becomes more nearly equal to $p_0 q_0/n$. Hence if we want to know the significant standard-deviation of the proportion $p$ (the measure of its fluctuation owing to definite causes) $n$ should be made as large as possible; if, on the other hand, we want to obtain good illustrations of the theory of simple sampling $n$ should be made small. If $n$ be very large the actual standard-deviation may evidently become almost indefinitely large compared with the standard-deviation of sampling. Thus during the 20 years 1855-74 the death-rate in England and Wales fluctuated round a mean value of 22.2 per thousand with a standard-deviation of 0.86. Taking the mean population as roughly 21 millions, the standard-deviation of sampling is approximately

$$\sqrt{\frac{22.2 \times 977.8}{21{,}000{,}000}} = 0.032 .$$

This is only about one twenty-seventh of the actual value.

11. Now consider the effect of altering the second condition of simple sampling, given in § 8 (b) of Chapter XIII., viz. the condition that the chances $p$ and $q$ shall be the same for every die or coin in the set, or the circumstances that regulate the appearance of the character observed the same for every individual or every sub-class in each of the universes from which samples are drawn. Suppose that in the group of $n$ dice thrown the chances for $m_1$ dice are $p_1\ q_1$; for $m_2$ dice, $p_2\ q_2$; and so on, the chances varying for different dice, but being constant throughout the experiment. The case differs from the last, as in that the chances were the same for every die at any one throw, but varied from one throw to another: now they are constant from throw to throw, but differ from one die to another as they would in any ordinary set of badly made dice. Required to find the effect of these differing chances.

For the mean number of successes we evidently have

$$M = m_1 p_1 + m_2 p_2 + m_3 p_3 + \dots = n p_0,$$

$p_0$ being the mean chance $\Sigma(mp)/n$. To find the standard-deviation of the number of successes at each throw, it should be noted that this may be regarded as made up of the number of successes in the $m_1$ dice for which the chances are $p_1\ q_1$, together with the number of successes amongst the $m_2$ dice for which the chances are $p_2\ q_2$, and so on: and these numbers of successes are all independent. Hence

$$\sigma^2 = m_1 p_1 q_1 + m_2 p_2 q_2 + m_3 p_3 q_3 + \dots = \Sigma(mpq).$$

Substituting $1 - p$ for $q$, as before, and using $\sigma_p$ to denote the standard-deviation of $p$,

$$\sigma^2 = n p_0 q_0 - n\sigma_p^2, \qquad (3)$$

or if $s$ be, as before, the standard-deviation of the proportion of successes,

$$s^2 = \frac{p_0 q_0}{n} - \frac{\sigma_p^2}{n} . \qquad (4)$$

12. The effect of the chances varying for the individual dice or other "events" is therefore to lower the standard-deviation, as calculated from the mean proportion $p_0$, and the effect may conceivably be considerable. To take a limiting case, if $p$ be zero for half the events and unity for the remainder, $p_0 = q_0 = \tfrac{1}{2}$, and $\sigma_p = \tfrac{1}{2}$, so that $s$ is zero. To take another illustration, still somewhat extreme, if the values of $p$ are uniformly distributed over the whole range between 0 and 1, $p_0 = q_0 = \tfrac{1}{2}$ as before, but $\sigma_p^2 = 1/12 = 0.0833$ (Chap. VIII. § 12, p. 143). Hence $s^2 = 0.1667/n$, $s = 0.408/\sqrt{n}$, instead of $0.5/\sqrt{n}$, the value of $s$ if the chances are $\tfrac{1}{2}$ in every case. In most practical cases, however, the effect will be much less. Thus the standard-deviation of sampling for a death-rate of, say, 18 per thousand in a population of uniform age and one sex is $(18 \times 982)^{1/2}/\sqrt{n} = 133/\sqrt{n}$. In a population of the age composition of that of England and Wales, however, the death-rate is not, of course, uniform, but varies from a high value in infancy (say 150 per thousand), through very low values (2 to 4 per thousand) in childhood to continuously increasing values in old age; the standard-deviation of the rate within such a population is roughly about 30 per thousand. But the effect of this variation on the standard-deviation of simple sampling is quite small, for, as calculated from equation (4),

$$s^2 = \frac{1}{n}\,(18 \times 982 - 900), \qquad s = 130/\sqrt{n},$$

as compared with $133/\sqrt{n}$.
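Equations (1) and (3) can be checked by exact arithmetic. The Python sketch below is an illustration only and not part of the original text: the function names and the particular chances and frequencies are hypothetical choices. It builds the exact distribution of the number of successes in each of the two cases and compares its variance with the formula.

```python
from fractions import Fraction as F

def successes_pmf(chances):
    """Exact distribution of the number of successes among independent
    events with the given individual chances of success."""
    pmf = [F(1)]
    for p in chances:
        nxt = [F(0)] * (len(pmf) + 1)
        for k, w in enumerate(pmf):
            nxt[k] += w * (1 - p)   # this event fails
            nxt[k + 1] += w * p     # this event succeeds
        pmf = nxt
    return pmf

def variance(pmf):
    mean = sum(k * w for k, w in enumerate(pmf))
    return sum(k * k * w for k, w in enumerate(pmf)) - mean * mean

def mean_and_var_of_chance(ps, weights):
    """Weighted mean p0 and squared standard-deviation of the chances."""
    total = sum(weights)
    p0 = sum(w * p for p, w in zip(ps, weights)) / total
    var_p = sum(w * (p - p0) ** 2 for p, w in zip(ps, weights)) / total
    return p0, var_p

# Case (a), sections 9-10: the chance is the same for every die at one
# throw but differs between sets of throws (assumed values).
n = 6
ps_a, fs_a = [F(1, 4), F(1, 2)], [3, 1]
N = sum(fs_a)
pooled = [sum(f * successes_pmf([p] * n)[k] for p, f in zip(ps_a, fs_a)) / N
          for k in range(n + 1)]
p0_a, var_p_a = mean_and_var_of_chance(ps_a, fs_a)
direct_a = variance(pooled)
formula_a = n * p0_a * (1 - p0_a) + n * (n - 1) * var_p_a   # equation (1)

# Case (b), sections 11-12: the chances differ from die to die but are
# constant from throw to throw (assumed values).
ps_b = [F(1, 10), F(3, 10), F(1, 2), F(9, 10)]
m = len(ps_b)
p0_b, var_p_b = mean_and_var_of_chance(ps_b, [1] * m)
direct_b = variance(successes_pmf(ps_b))
formula_b = m * p0_b * (1 - p0_b) - m * var_p_b             # equation (3)

print(direct_a, formula_a)
print(direct_b, formula_b)
```

With these assumed values the direct and the formula variances agree exactly in both cases; the first exceeds, and the second falls short of, the simple-sampling value calculated from the mean chance.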

13. We have finally to pass to the third condition (c) of § 8, Chap. XIII., and to discuss the effect of a certain amount of dependence between the several "events" in each sample. We shall suppose, for simplicity, that the two other conditions (a) and (b) are fulfilled, the chances $p$ and $q$ being the same for every event at every trial, and constant throughout the experiment. The problem is again most simply treated on the lines of § 5 of the last chapter. The standard-deviation for each event is $(pq)^{1/2}$ as before, but the events are no longer independent: instead, therefore, of the simple expression $\sigma^2 = n\,pq$, we must have (cf. Chap. XI. § 2)

$$\sigma^2 = n\,pq + 2pq\,(r_{12} + r_{13} + \dots + r_{23} + \dots),$$

where, of course, $r_{12}$, $r_{13}$, etc. are the correlations between the results of the first and second, first and third events, and so on: correlations for variables (number of successes) which can only take the values 0 and 1, but may nevertheless, of course, be treated as ordinary variables (cf. Chap. XI. § 10). There are $n(n-1)/2$ correlation-coefficients, and if, therefore, $r$ is the arithmetic mean of the correlations we may write

$$\sigma^2 = npq\,[1 + r(n-1)] . \qquad (5)$$

The standard-deviation of simple sampling will therefore be increased or diminished according as the average correlation between the results of the single events is positive or negative, and the effect may be considerable, as $\sigma$ may be reduced to zero or increased to $n(pq)^{1/2}$. For the standard-deviation of the proportion of successes in each sample we have the equation

$$s^2 = \frac{pq}{n}\,[1 + r(n-1)] . \qquad (6)$$

It should be noted that, as the means and standard-deviations of our variables are all identical, $r$ is the correlation-coefficient for a table formed by taking all possible pairs of results in the $n$ events of each sample.
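Equation (5) can be illustrated with a simple artificial model. The sketch below is only one possible construction and is an assumption made for illustration; the text does not specify any mechanism of dependence. With chance `r` all `n` events share a single outcome, and otherwise they are independent, which makes the correlation between any two events exactly `r` (the model cannot give a negative correlation). The function names are hypothetical.

```python
from fractions import Fraction as F
from math import comb

def pmf_common_cause(n, p, r):
    """Number of successes when, with chance r, all n events share one
    outcome (all succeed with chance p, else all fail) and, with chance
    1 - r, the n events are independent with chance p each."""
    q = 1 - p
    pmf = [(1 - r) * comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
    pmf[0] += r * q   # the shared outcome is a failure
    pmf[n] += r * p   # the shared outcome is a success
    return pmf

def variance(pmf):
    mean = sum(k * w for k, w in enumerate(pmf))
    return sum(k * k * w for k, w in enumerate(pmf)) - mean * mean

n, p, r = 10, F(1, 3), F(1, 20)          # assumed values
direct = variance(pmf_common_cause(n, p, r))
formula = n * p * (1 - p) * (1 + r * (n - 1))   # equation (5)
print(direct, formula)
```

Even the small correlation assumed here raises the squared standard-deviation from 20/9 to 29/9, in agreement with the formula.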

It should also be noted that the case when $r$ is positive covers the departure from the rules of simple sampling discussed in §§ 9-10: for if we draw successive samples from different records, this introduces the positive correlation at once, even although the results of the events at each trial are quite independent of one another. Similarly, the case discussed in §§ 11-12 is covered by the case when $r$ is negative: for if the chances are not the same for every event at each trial, and the chance of success for some one event is above the average, the mean chance of success for the remainder must be below it. The cases (a), (b) and (c) are, however, best kept distinct, since a positive or negative correlation may arise for reasons quite different from those discussed in §§ 9-12.

14. As a simple illustration, consider the important case of sampling from a limited universe, e.g. of drawing $n$ balls in succession from the whole number $w$ in a bag containing $pw$ white balls and $qw$ black balls. On repeating such drawings a large number of times, we are evidently equally likely to get a white ball or a black ball for the first, second, or $n$th ball of the sample: the correlation-table formed from all possible pairs of every sample will therefore tend in the long run to give just the same form of distribution as the correlation-table formed from all possible pairs of the $w$ balls in the bag. But from Chap. XI. § 11 we know that the correlation-coefficient for this table is $-1/(w-1)$, whence

$$\sigma^2 = n\,pq\left(1 - \frac{n-1}{w-1}\right) = n\,pq\,\frac{w-n}{w-1} .$$

If $n = 1$, we have the obviously correct result that $\sigma = (pq)^{1/2}$, as in drawing from unlimited material: if, on the other hand, $n = w$, $\sigma$ becomes zero as it should, and the formula is thus checked for simple cases. For drawing 2 balls out of 4, $\sigma$ becomes $0.816\,(npq)^{1/2}$; for drawing 5 balls out of 10, $0.745\,(npq)^{1/2}$; in the case of drawing half the balls out of a very large number, it approximates to $(0.5\,npq)^{1/2}$, or $0.707\,(npq)^{1/2}$.

In the case of contagious or infectious diseases, or of certain forms of accident that are apt, if fatal at all, to result in wholesale deaths, $r$ is positive, and if $n$ be large (as it usually is in such cases) a very small value of $r$ may easily lead to a very great increase in the observed standard-deviation.
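The limited-universe formula of § 14 can be compared with a direct count of all possible samples. The following sketch is illustrative only, with hypothetical function names and an assumed bag of 10 balls of which 4 are white; it enumerates the exact distribution of the number of white balls drawn without replacement and also reproduces the reduction factors quoted in the text.

```python
from fractions import Fraction as F
from math import comb, sqrt

def variance_by_counting(w, white, n):
    """Exact variance of the number of white balls when n balls are drawn
    without replacement from w balls of which `white` are white."""
    total = comb(w, n)
    lo, hi = max(0, n - (w - white)), min(n, white)
    pmf = {k: F(comb(white, k) * comb(w - white, n - k), total)
           for k in range(lo, hi + 1)}
    mean = sum(k * pr for k, pr in pmf.items())
    return sum(k * k * pr for k, pr in pmf.items()) - mean**2

def variance_by_formula(w, white, n):
    p = F(white, w)
    return n * p * (1 - p) * F(w - n, w - 1)   # n pq (w - n)/(w - 1)

counted = variance_by_counting(10, 4, 5)
predicted = variance_by_formula(10, 4, 5)
print(counted, predicted)

# Reduction factors applied to (npq)^(1/2) in the text.
factor_2_of_4 = sqrt(F(4 - 2, 4 - 1))
factor_5_of_10 = sqrt(F(10 - 5, 10 - 1))
print(round(factor_2_of_4, 3), round(factor_5_of_10, 3))
```

The counted and predicted values coincide, and the two factors round to the 0.816 and 0.745 given above.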

It is difficult to give a really good example from actual statistics, as the conditions are hardly ever constant from one year to another, but the following will serve to illustrate the point. During the twenty years 1887-1906 there were 2107 deaths from explosions of firedamp or coal-dust in the coal-mines of the United Kingdom, or an average of 105 deaths per annum. From § 12 of Chap. XIII. it follows that this should be the square of the standard-deviation of simple sampling, or the standard-deviation itself approximately 10.3. But the square of the actual standard-deviation is 7178, or its value 84.7, the numbers of deaths ranging between 14 (in 1903) and 317 (in 1894). This large standard-deviation, to judge from the figures, is partly, though not wholly, due to a general tendency to decrease in the numbers of deaths from explosions in spite of a large increase in the number of persons employed; but even if we ignore this, the magnitude of the standard-deviation can be accounted for by a very small value of the correlation $r$, expressive of the fact that if an explosion is sufficiently serious to be fatal to one individual, it will probably be fatal to others also. For if $\sigma_0$ denote the standard-deviation of simple sampling, $\sigma$ the standard-deviation of sampling given by equation (5), we have

$$r = \frac{\sigma^2 - \sigma_0^2}{(n-1)\,\sigma_0^2} .$$

Whence, from the above data, taking the numbers of persons employed underground at a rough average of 560,000,

$$r = \frac{7178 - 105}{560000 \times 105} = +0.00012 .$$

15. Summarising the preceding paragraphs, §§ 9-14, we see that if the chances $p$ and $q$ differ for the various universes, districts, years, materials, or whatever they may be from which the samples are drawn, the standard-deviation observed will be greater than the standard-deviation of simple sampling, as calculated from the average values of the chances: if the average chances are the same for each universe from which a sample is drawn, but vary from individual to individual or from one sub-class to another within the universe, the standard-deviation observed will be less than the standard-deviation of simple sampling as calculated from the mean values of the chances: finally, if $p$ and $q$ are constant, but the events are no longer independent, the observed standard-deviation will be greater or less than the simplest theoretical value according as the correlation between the results of the single events is positive or negative. These conclusions further emphasise the need for caution in the use of standard errors.

If we find that the standard-deviation in some case of sampling exceeds the standard-deviation of simple sampling, two interpretations are possible: either that $p$ and $q$ are different in the various universes from which samples have been drawn (i.e. that the variations are more or less definitely significant in the sense of § 13, Chap. XIII.), or that the results of the events are positively correlated inter se. If the actual standard-deviation fall short of the standard-deviation of simple sampling two interpretations are again possible, either that the chances $p$ and $q$ vary for different individuals or sub-classes in each universe, while approximately constant from one universe to another, or that the results of the events are negatively correlated inter se. Even if the actual standard-deviation approaches closely to the standard-deviation of simple sampling, it is only a conjectural and not a necessary inference that all the conditions of "simple sampling" as defined in § 8 of the last chapter are fulfilled. Possibly, for example, there may be a positive correlation $r$ between the results of the different events, masked by a variation of the chances $p$ and $q$ in sub-classes of each universe.

Sampling which fulfils the conditions laid down in § 8 of Chap. XIII., simple sampling as we have called it, is generally spoken of as random sampling. We have thought it better to avoid this term, as the condition that the sampling shall be random (haphazard) is not the only condition tacitly assumed.

REFERENCES.

Cf. generally the references to Chap. XIII., to which may be added:

(1) Pearson, Karl, "On certain Properties of the Hypergeometrical Series, and on the fitting of such Series to Observation Polygons in the Theory of Chance," Philosophical Magazine, 5th Series, vol. xlvii., 1899, p. 236. (An expansion of one section of ref. 10 of Chap. XIII., dealing with the first problem of our § 14, i.e. drawing samples from a bag containing a limited number of white and black balls, from the standpoint of the frequency-distribution of the number of white or black balls in the samples.)

(2) Greenwood, M., "On Errors of Random Sampling in certain Cases not suitable for the Application of a 'Normal Curve of Frequency,'" Biometrika, vol. ix., 1913, pp. 69-90. (If an event has succeeded $p$ times in $n$ trials, what are the chances of 0, 1, 2, ... $m$ successes in $m$ subsequent trials? Tables for small samples.)

EXERCISES.

1. Referring to Question 7 of Chap. XIII., work out the values of the significant standard-deviation $\sigma_p$ (as in § 10) for each row or group of rows there given, but taking row 5 with rows 6 and 7.

2. For all the districts in England and Wales included in the same table (Table VI., Chap. IX.) the standard-deviation of the proportion of male births per 1000 of all births is 7.46 and the mean proportion of male births 509.2. The harmonic mean number of births in a district is 5070. Find the significant standard-deviation $\sigma_p$.

3. If for one half of $n$ events the chance of success is $p$ and the chance of failure $q$, whilst for the other half the chance of success is $q$ and the chance of failure $p$, what is the standard-deviation of the number of successes, the events being all independent?

4. The following are the deaths from small-pox during the 20 years 1882-1901 in England and Wales: 182 [...]

CHAPTER XV.

THE BINOMIAL DISTRIBUTION AND THE NORMAL CURVE.

1-2. Determination of the frequency-distribution for the number of successes in n events: the binomial distribution; 3. Dependence of the form of the distribution on p, q and n; 4-5. Graphical and mechanical methods of forming representations of the binomial distribution; 6. Direct calculation of the mean and the standard-deviation from the distribution; 7-8. Necessity of deducing, for use in many practical cases, a continuous curve giving approximately, for large values of n, the terms of the binomial series; 9. Deduction of the normal curve as a limit to the symmetrical binomial; 10-11. The value of the central ordinate; 12. Comparison with a binomial distribution for a moderate value of n; 13. Outline of the more general conditions from which the curve can be deduced by advanced methods; 14. Fitting the curve to an actual series of observations; 15. Difficulty of a complete test of fit by elementary methods; 16. The table of areas of the normal curve and its use; 17. The quartile deviation and the "probable error"; 18. Illustrations of the application of the normal curve and of the table of areas.

1. In Chapters XIII. and XIV. the standard-deviation of the number of successes in $n$ events was determined for the several more important cases, and the applications of the results indicated. For the simpler cases of artificial chance it is possible, however, to go much further, and determine not merely the standard-deviation but the entire frequency-distribution of the number of "successes." This we propose to do for the case of "simple sampling," in which the events are completely independent, and the chances $p$ and $q$ the same for each event and constant throughout the trials. The case corresponds to the tossing of ideally perfect coins (homogeneous circular discs), or the throwing of ideally perfect dice (homogeneous cubes).

2. If we deal with one event only, we expect in $N$ trials, $Nq$ failures and $Np$ successes. Suppose we now combine with the results of this first event the results of a second. The two events are quite independent, and therefore, according to the rule of independence, of the $Nq$ failures of the first event $(Nq)q$ will be associated (on an average) with failures of the second event, and $(Nq)p$ with successes of the second event (cf. row 2 of the scheme on p. 292).

[Scheme, p. 292: rows 1 to 5, showing how the results of successive events are combined.]

Similarly of the $Np$ successful first events, $(Np)q$ will be associated (on an average) with failures of the second event and $(Np)p$ with successes. In trials of two events we would therefore expect approximately $Nq^2$ cases of no success, $2Npq$ cases of one success and one failure, and $Np^2$ cases of two successes, as in row 3 of the scheme. The results of a third event may be combined with those of the first two in precisely the same way. Of the $Nq^2$ cases in which both the first two events failed, $(Nq^2)q$ will be associated (on an average) with failure of the third also, $(Nq^2)p$ with success of the third. Of the $2Npq$ cases of one success and one failure, $(2Npq)q$ will be associated with failure of the third event and $(2Npq)p$ with success, and similarly for the $Np^2$ cases in which both the first two events succeeded. The result is that in $N$ trials of three events we should expect $Nq^3$ cases of no success, $3Npq^2$ cases of one success, $3Np^2q$ cases of two successes, and $Np^3$ cases of three successes, as in row 5 of the scheme. The scheme is continued for the results of a fourth event, and it is evident that all the results are included under a very simple rule: the frequencies of 0, 1, 2 ... successes are given

for one event by the binomial expansion of $N(q+p)$,
for two events by the binomial expansion of $N(q+p)^2$,
for three events by the binomial expansion of $N(q+p)^3$,
for four events by the binomial expansion of $N(q+p)^4$,

and so on. Quite generally, in fact, in $N$ trials of $n$ events the frequencies of 0, 1, 2 ... successes are given by the successive terms in the binomial expansion of $N(q+p)^n$, viz.

$$N\left\{q^n + n\,q^{n-1}p + \frac{n(n-1)}{1.2}\,q^{n-2}p^2 + \frac{n(n-1)(n-2)}{1.2.3}\,q^{n-3}p^3 + \dots + p^n\right\}.$$

This is the first theoretical expression that we have obtained for the form of a frequency-distribution.

3. The general form of the distributions given by such binomial series will have been evident from the experimental examples given in Chapter XIII., i.e. they are distributions of greater or less asymmetry, tailing off in either direction from the mode. The distribution is, however, of so much importance that it is worth while considering the form in greater detail. This form evidently depends (1) on the values of $q$ and $p$, (2) on the value of the exponent $n$. If $p$ and $q$ are equal, evidently the distribution must be symmetrical, for $p$ and $q$ may be interchanged without altering the value of any term, and consequently terms equidistant from either end of the series are equal.

If $p$ and $q$ are unequal, on the other hand, the distribution is asymmetrical, and the more asymmetrical, for the same value of $n$, the greater the inequality of the chances. The following table shows the calculated distributions for $n = 20$ and values of $p$, proceeding by 0.1, from 0.1 to 0.5. When $p = 0.1$, cases of two successes are the [...]

A. Terms of the Binomial Series $10{,}000\,(q+p)^{20}$ for Values of $p$ from 0.1 to 0.5. (Figures given to the nearest unit.)
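The terms described in the caption of Table A can be computed directly from the binomial expansion. The sketch below is not taken from the original; the function name and the layout of the printout are illustrative assumptions. It uses exact fractions and rounds only when printing, as the caption prescribes.

```python
from fractions import Fraction as F
from math import comb

def binomial_terms(N, n, p):
    """Successive terms of N (q + p)^n, i.e. the expected frequencies of
    0, 1, 2, ... n successes in N trials of n events."""
    q = 1 - p
    return [N * comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

# One column per value of p = 0.1, 0.2, ... 0.5 (keyed by tenths).
table = {tenths: binomial_terms(10_000, 20, F(tenths, 10))
         for tenths in range(1, 6)}

print("successes " + " ".join(f"p=0.{t}".rjust(7) for t in table))
for k in range(21):
    print(f"{k:9d} " + " ".join(f"{round(table[t][k]):7d}" for t in table))

most_frequent_at_p01 = max(range(21), key=lambda k: table[1][k])
```

Each column sums to 10,000 before rounding, and the column for p = 0.5 is symmetrical about 10 successes.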

If $p = q$, the effect of increasing $n$ is to raise the mean and increase the dispersion. If $p$ is not equal to $q$, however, not only does an increase in $n$ raise the mean and increase the dispersion, but it also lessens the asymmetry; the greater $n$, for the same value of $p$ and $q$, the less the asymmetry. Thus if we compare the first distribution of the above table with that given by $n = 100$, we have the following:

B. Terms of the Binomial Series $10{,}000\,(0.9 + 0.1)^{100}$. (Figures given to the nearest unit.)

[...] binomial distributions. It will have been noted that any one term, say the $r$th, in one series is obtained by taking $q$ times the $r$th term together with $p$ times the $(r-1)$th term of the preceding series. Now if $AP$, $CR$ (fig. 46) be two verticals, and a third, $BQ$, be erected between them, cutting $PR$ in $Q$, so that $AB : BC :: q : p$, then

$$BQ = p \cdot AP + q \cdot CR .$$

(This follows at once on joining $AR$ and considering the two segments into which $BQ$ is divided.) Consider then some binomial, say for the case $p = \tfrac{1}{4}$, $q = \tfrac{3}{4}$. Draw a series of verticals (the heavy verticals of fig. 47) at any convenient distance apart on a horizontal base line, and erect other verticals (the lighter verticals) dividing the distance between them in the ratio of $q : p$. Next, choosing a vertical scale, draw the binomial polygon for the simplest case $n = 1$; in the diagram $N$ has been taken $= 4096$, so that $0b = 3072$, $1c = 1024$, and the polygon is $abcd$. The polygons for higher values of $n$ may now be constructed graphically. Mark the points where $ab$, $bc$, $cd$ respectively cut the intermediate verticals and project them horizontally to the right on to the thick verticals. This gives the polygon $ab'c'd'e$ for $n = 2$. For $0b' = q \cdot 0b$, $1c' = p \cdot 0b + q \cdot 1c$, and so on. Similarly, if the points where $ab'$, $b'c'$, etc. cut the intermediate verticals are projected horizontally on to the thick verticals, we have the polygon $ab''c''d''e''f''$ for $n = 3$. The process may be continued

indefinitely, though it will be found difficult to maintain any high degree of accuracy after the first few constructions.
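The rule on which the graphical construction rests (each term is q times the corresponding term, together with p times the preceding term, of the series before it) is easily carried out numerically. The following sketch is an illustration with a hypothetical function name; it starts from the polygon for n = 1 used in the diagram, with N = 4096 and p = 1/4.

```python
from fractions import Fraction as F

def next_polygon(terms, p):
    """Ordinates for n + 1 events from those for n events:
    new term r = q * (old term r) + p * (old term r - 1)."""
    q = 1 - p
    padded = [0] + list(terms) + [0]   # zero ordinates beyond either end
    return [q * padded[r + 1] + p * padded[r] for r in range(len(terms) + 1)]

p = F(1, 4)
n1 = [3072, 1024]            # 0b and 1c of the diagram, N = 4096
n2 = next_polygon(n1, p)     # ordinates of the polygon for n = 2
n3 = next_polygon(n2, p)     # ordinates of the polygon for n = 3
print([int(v) for v in n2], [int(v) for v in n3])
```

The ordinates obtained for n = 2 and n = 3 are the terms of 4096(q + p)^2 and 4096(q + p)^3, and each set sums to 4096.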

The apparatus consists of a funnel opening into a shallow space (a fraction of an inch in depth) between a sheet of glass and a back-board. This space is broken up by successive rows of wedges like 1, 2 3, 4 5 6, etc., which will divide up into streams any granular material such as shot or mustard seed which is poured through the funnel when the apparatus is held at a slope. At the foot these wedges are replaced by vertical strips, in the spaces between which the material can collect. Consider the stream of material that comes from the funnel and meets the wedge 1. This wedge is set so as to throw $q$ parts of the stream to the left and $p$ parts to the right (of the observer). The wedges 2 and 3 are set so as to divide the resultant streams in the same proportions. Thus wedge 2 throws $q^2$ parts of the original material to the left and $qp$ to the right, wedge 3 throws $pq$ parts of the original material to the left and $p^2$ to the right. The streams passing these wedges are therefore in the ratio of $q^2 : 2qp : p^2$. The next row of wedges is again set so as to divide these streams in the same proportions as before, and the four streams that result will bear the proportions $q^3 : 3q^2p : 3qp^2 : p^3$.

Fig. 48. The Pearson-Galton Binomial Apparatus.

. as may be desired. This kind of apparatus was originally devised by Sir Francis Gal ton (ref. will give the streams proportions q^ iq^p Gq^^ iqp^ p\ and these streams will accumulate between the strips and give a representation of the binomial by a kind of histogram. 1) in a form that gives roughly the symmetrical binomial. The values of the mean and standard-deviation of a binomial distribution may be found from the terms of the series directly. —BINOMIAL : : : DISTSIBUTION AND NORMAL CUETE. (the calculation was in fact given as an exercise in Question 8. That is.. npl q"-' + {n-l)q"--p + ^ (n - l)(n - 2) f^ 'q"-y+ . The final set. XIII . it may be omitted for convenience.g^-'p n. (4) Frequency/. so that they (Eef. a stream of shot being allowed to fall through rows of nails. Chap.. 6. and treat the problem as if it were an arithsuccesses: as metical example.p.. But this snm is . VIII.cf-^p 2m(»-l)?'>-2p2 ~ J V '^j^ 2 m(9s-l)(7"-V r. The apparatus was generalised by Professor Pearson. and the resultant streams being collected in partitioned spaces..{n-l){n-2) ? 1.) could be adjusted to give any ratio of q -. 13. XIII. we are treating iV as and the mean is therefore given by the sum of the terms col. as by the method of Chap. Of course as many rows of wedges may be provided as shown.). 1 is of course unity. XV.3 ^ n(n-l){n-2) ^ 1. tions q^ bear the proporthe heads of the vertical strips. taking the arbitrary origin at iV is a factor all through. at : : and the four streams that result 3q^p 3qp^ p^.2 2 i" unity. VII. and Question 6. who used rows of wedges fixed to movable slides. 1 below. the mean M is np. : : (1) (2) (3) |. (3). — 1 /f — n.q"-^p n. Chap.<• ] = np(q -l-j>)"~' = np. g" Dev. /|. in The sum of col. will 299 as before.2 ' ^ 3n(n-l){n-2) 1. i. Arrange the terms under each other as in col.e.2. as well as by the method of Chap.

The square of the standard-deviation is given by the sum of the terms in col. (4) less the square of the mean, that is,

$$\sigma^2 = np\left\{q^{n-1} + 2(n-1)\,q^{n-2}p + 3\,\frac{(n-1)(n-2)}{1.2}\,q^{n-3}p^2 + \dots\right\} - n^2p^2 .$$

But the series in the bracket is the binomial series $(q+p)^{n-1}$ with the successive terms multiplied by 1, 2, 3, .... It therefore gives the difference of the mean of the said binomial from $-1$, and its sum is therefore $(n-1)p + 1$. Therefore

$$\sigma^2 = np\,\{(n-1)p + 1\} - n^2p^2 = np - np^2 = npq.$$

7. The terms of the binomial series thus afford a means of completely describing a certain class of frequency-distributions, i.e. of giving not merely the mean and standard-deviation in each case, but of describing the whole form of the distribution. If $N$ samples of $n$ cards each be drawn from an indefinitely large record of cards marked with $A$ or $a$, the proportion of $A$-cards in the record being $p$, then the successive terms of the series $N(q+p)^n$ give the frequencies to be expected in the long run of 0, 1, 2, ... $A$-cards in the sample, the actual frequencies only deviating from these by errors which are themselves fluctuations of sampling. The three constants $N$, $p$, $n$, therefore, determine the average or smoothed form of the distribution to which actual distributions will more or less closely approximate.

8. Considered, however, as a formula which may be generally useful for describing frequency-distributions, the binomial series suffers from a serious limitation, viz. that it only applies to a strictly discontinuous distribution like that of the number of $A$-cards drawn from a record containing $A$'s and $a$'s, or the number of heads thrown in tossing a coin. The question arises whether we can pass from this discontinuous formula to an equation suitable for representing a continuous distribution of frequency. Such an equation becomes, indeed, almost a necessity for certain cases with which we have already dealt. Consider, for example, the frequency-distribution of the number of male births in batches of 10,000 births, the mean number being, say, 5100. The distribution will be given by the terms of the series $(0.49 + 0.51)^{10{,}000}$, and the standard-deviation is, in round numbers, 50 births. The distribution will therefore extend to some 150 births or more on either side of the mean number, and in order to obtain it we should have to calculate some 300 terms of a binomial series with an exponent of 10,000! This would not only be practically impossible without the use of certain methods of approximation, but it would give the distribution in quite unnecessary detail: as a matter of practice, we would not have compiled a frequency-distribution by single male births, but would certainly have grouped our observations, taking probably 10 births as the class-interval.

The terms of the series are The frequency of m successes is \n ^(i)"| OT \ n-m is and the frequency of m+ 1 n successes The multiplying it by {n-m)/{m+l). It is possible to iind such a continuous limit to the binomial series for any values of p and q.. the value is \2k ^' = ^(^)^|ITt^ (k-x+l) . . for simplicity...>m+ 1 n-1 or m< 2 Suppose. to replace the binomial series by some continuous curve. —BINOMIAL : DISTRIBUTION AND NORMAL CURVE. 9. and the binomial is symmetrical.XV. then the frequency of k successes is the greatest.. we would not have compiled a frequency-distribution by single male births. 301 unnecessary detail as a matter of practice.. and its value is ^^-^(^r^ The polygon ordinate. y~ (k)(k-l){k-2) {k+l){k + 2){k+Z) (-D(-S(--D----(-^) "(-D(-l)(-l) • • " . taking probably 10 births as the class-interval. We want. {k + x) (2) and therefore y.. • (1) tails off symmetrically on either side of this greatest Consider the frequency oi k + x successes . say equal to 2^ . (-'-i^X-S . but in the present work we will confine ourselves to the simplest case in which p = q = Q'b. the curve being such that the area between any two ordinates y-^ and y^ will give the frequency of observations between the corresponding values of the variable x-^ and x^.. that n is even. therefore greater than the former so long as derived from this by latter frequency is ~m. having approximately the same ordinates. but would certainly have grouped our observations. . therefore.

Now let us approximate by assuming, as suggested in § 8, that $k$ is very large, and indeed large compared with $x$, so that $(x/k)^2$ may be neglected compared with $(x/k)$. This assumption does not involve any difficulty, for we need not consider values of $x$ much greater than three times the standard-deviation or $3\sqrt{k/2}$, and the ratio of this to $k$ is $3/\sqrt{2k}$, which is necessarily small if $k$ be large. On this assumption we may apply the logarithmic series

$$\log_e(1+\delta) = \delta - \frac{\delta^2}{2} + \frac{\delta^3}{3} - \frac{\delta^4}{4} + \dots$$

to every bracket in the fraction (3), and neglect all terms beyond the first. To this degree of approximation,

$$\log_e\frac{y_x}{y_0} = -\frac{2}{k}\bigl(1 + 2 + 3 + \dots + (x-1)\bigr) - \frac{x}{k} = -\frac{x(x-1)}{k} - \frac{x}{k} = -\frac{x^2}{k}.$$

Therefore, finally,

$$y_x = y_0\,e^{-x^2/k} = y_0\,e^{-x^2/2\sigma^2} \qquad (4)$$

where, in the last expression, the constant $k$ has been replaced by the standard-deviation $\sigma$, for $\sigma^2 = k/2$.

The curve represented by this equation is symmetrical about the point $x = 0$, which gives the greatest ordinate $y = y_0$. Mean, median, and mode therefore coincide, and the curve is, in fact, that drawn in fig. 5, p. 89, and taken as the ideal form of the symmetrical frequency-distribution in Chap. VI. The curve is generally known as the normal curve of errors or of frequency, or the law of error.

10. A normal curve is evidently defined completely by giving the values of $y_0$ and $\sigma$ and assigning the origin of $x$. If we desire to make a normal curve fit some given distribution as near as may be, the last two data are given by the standard-deviation and the mean respectively; the value of $y_0$ will be given by the fact that the areas of the two distributions, or the numbers of observations which these areas represent, must be the same. This condition does not, however, lead in any simple and elementary algebraic way to an expression for $y_0$, though such a value could be found arithmetically to any desired degree of approximation. For it is evident that (1) any alteration in $y_0$ produces a proportionate alteration in the area of the curve, e.g. doubling $y_0$ doubles every ordinate $y_x$, and therefore doubles the area; (2) any alteration in $\sigma$ produces a proportionate alteration in the area, for the values of $y_x$ are the same for the same values of $x/\sigma$, and consequently doubling $\sigma$ doubles the distance of every ordinate from the mean, and therefore doubles the area. The area, or the number of observations $N$, is therefore proportional to $y_0\sigma$, or we must have

$$N = a \times y_0\,\sigma$$

where $a$ is a numerical constant. The value of $a$ may be found approximately by taking $y_0$ and $\sigma$ both equal to unity, calculating the values of the ordinates $y_x$ for equidistant values of $x$, and taking the area, or number of observations $N$, as given by the sum of the ordinates multiplied by the interval. The table below gives the values of $y$ for values of $x$ proceeding by fifths of a unit; the values are, of course, the same for positive and negative values of $x$. For the whole curve the sum of the ordinates will be found to be 12.53318, the interval being 0.2 units; the area is therefore, approximately, 2.50664.

Ordinates of the Curve $y = e^{-x^2/2}$. (For references to more extended tables, see list on pp. 357-8.)
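The arithmetic of § 10 can be repeated mechanically. The following is a minimal sketch, not part of the original text: it sums the ordinates of $y = e^{-x^2/2}$ at intervals of 0.2 and multiplies by the interval. The function name `ordinate_sum` and the cut-off at $x = \pm 10$ are assumptions made for illustration; because the ordinates here are not rounded to tabular accuracy, the sum may differ from the text's 12.53318 in the last figures.

```python
import math

def ordinate_sum(step=0.2, limit=10.0):
    """Sum y = exp(-x^2/2) at x = 0, ±step, ±2*step, ... out to ±limit.

    `limit` is an assumed cut-off; ordinates beyond it are negligible.
    """
    n = int(round(limit / step))
    return sum(math.exp(-((i * step) ** 2) / 2.0) for i in range(-n, n + 1))

step = 0.2
total = ordinate_sum(step)      # sum of the ordinates
area = total * step             # approximate area, i.e. the constant a

print(f"sum of ordinates = {total:.5f}")
print(f"area (constant a) = {area:.5f}")
print(f"sqrt(2*pi)        = {math.sqrt(2.0 * math.pi):.5f}")
```

Printing the square root of $2\pi$ alongside allows the approximate area to be compared with the value to which it is very close.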

Stirling (1730). If $n$ be large, we have, to a high degree of approximation,

$$n! = \sqrt{2n\pi}\;\frac{n^n}{e^n}.$$

Applying Stirling's theorem to the factorials in equation (1) we have

$$y_0 = \frac{N}{\sqrt{\pi k}} = \frac{N}{\sqrt{2\pi}\,\sigma} \qquad (5)$$

The complete expression for the normal curve is therefore

$$y = \frac{N}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/2\sigma^2} \qquad (6)$$

The exponent may be written $x^2/c^2$, where $c = \sqrt{2}\,\sigma$, and this is the origin of the use of $\sqrt{2} \times \sigma$ (the "modulus") as a measure of dispersion, of $1/(\sqrt{2}\,\sigma)$ as a measure of "precision," and of $2\sigma^2$ as "the fluctuation" (cf. Chap. VIII. § 13). The use of the factor 2 or $\sqrt{2}$ becomes meaningless if the distribution be not normal.

Another rule cited in Chap. VIII., viz. that the mean deviation is approximately 4/5 of the standard-deviation, is strictly true for the normal curve only. For this distribution the mean deviation $= \sigma\sqrt{2/\pi} = 0.79788\,\sigma$: the proof cannot be given within the limitations of the present work. The rule that a range of 6 times the standard-deviation includes the great majority of the observations and that the quartile deviation is about 2/3 of the standard-deviation were also suggested by the properties of this curve (see below, §§ 16, 17).

12. In the proof of § 9 the assumption was made that $k$ (the half of the exponent of the binomial) was very large compared with $x$ (any deviation that had to be considered). In point of fact, however, the normal curve gives the terms of the symmetrical binomial surprisingly closely even for moderate values of $n$. Thus if $n = 64$, $k = 32$, and the standard-deviation is 4. Deviations $x$ have therefore to be considered up to $\pm 12$ or more, which is over 1/3 of $k$. As will be seen, however, from the annexed table, the ordinates of the normal curve agree with those of the binomial to the nearest unit (in 10,000 observations) up to $x = \pm 15$. The closeness of approximation is partly due to the fact that, in applying the logarithmic series to the fraction on the right of equation (3), the terms of the second order in expansions of corresponding brackets in numerator and denominator cancel each other: these terms, therefore, do not



accumulate, but only the terms of the third order. There is only one second-order term that has been neglected, viz. that due to the last bracket in the denominator. Even for much lower values of $n$ than that chosen for the illustration, e.g. 10 or 12 (cf. Qu. 4 at the end of this chapter), the normal curve still gives a very fair approximation.

Table showing (1) Ordinates of the Binomial Series $10{,}000\left(\tfrac12 + \tfrac12\right)^{64}$, and (2) Corresponding Ordinates of the Normal Curve $y = \dfrac{10{,}000}{4\sqrt{2\pi}}\,e^{-x^2/32}$.
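The two columns described in the table heading can be recomputed directly. The sketch below is an illustration only, assuming Python 3.8 or later for `math.comb`; the function names `binomial_ordinate` and `normal_ordinate` are introduced here and do not come from the text. It prints, for each term from the 32nd to the 47th, the binomial ordinate, the normal ordinate, and their difference, so that the size of the discrepancies can be inspected.

```python
import math

N = 10_000                     # number of observations, as in the table heading
n = 64                         # exponent of the binomial
k = n // 2                     # 32, the central (greatest) term
sigma = math.sqrt(n * 0.25)    # standard-deviation sqrt(npq) = 4

def binomial_ordinate(x):
    """Frequency of k + x successes in N * (1/2 + 1/2)^n."""
    return N * math.comb(n, k + x) / 2 ** n

def normal_ordinate(x):
    """Ordinate of the fitted normal curve at deviation x from the mean."""
    y0 = N / (sigma * math.sqrt(2 * math.pi))
    return y0 * math.exp(-x * x / (2 * sigma ** 2))

rows = [(x, binomial_ordinate(x), normal_ordinate(x)) for x in range(0, 16)]

print("term  binomial    normal    diff")
for x, b, g in rows:
    print(f"{k + x:>4} {b:9.1f} {g:9.1f} {b - g:+7.1f}")
```

Only deviations on one side of the mean are listed, since both distributions are symmetrical about the central term.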


Nelumbium, Pearl, American Naturalist, Nov. 1906). The question arises, therefore, why, in such cases, the distribution should be approximately normal, a form of distribution which we have only shown to arise if the variable is the sum of a large number of elements, each of which can take the values 0 and 1 (or other two constant values), these values occurring independently, and with equal frequency. In the first place, it should be stated that the conditions of the deduction given in § 9 were made a little unnecessarily restricted,


more general still, without introducing the conception of the binomial at all, by founding the curve on more or less complex cases of the theory of sampling for variables instead of for attributes. If a variable is the sum (or, within limits, some slightly more complicated function) of a large number of other variables, then the distribution of the compound or resultant variable is normal, provided that the elementary variables are independent, or nearly so (cf. ref. 6). The forms of the frequency-distributions of the elementary variables affect the final distribution less and less as their number is increased: only if their number is moderate, and the distributions all exhibit a comparatively high degree of asymmetry of uniform sign, will the same sign of asymmetry be sensibly evident in the distribution of the compound variable. On this sort of hypothesis, the expectation of normality in the case of stature may be based on the fact that it is a highly compound character, depending on the sizes of the bones of the head, the vertebral column, and the legs, the thickness of the intervening cartilage, and the curvature of the spine; the elements of which it is composed being at least to some extent independent, i.e. by no means perfectly correlated with each other, and their frequency-distributions exhibiting no very high degree of asymmetry of one and the same sign. The comparative rarity of normal distributions in economic statistics is probably due in part to the fact that in most cases, while the entire causation is certainly complex, relatively few causes have a largely predominant influence (hence also the frequent occurrence of irregular distributions in this field of work), and in part also to a high degree of asymmetry in the distributions of the elements on which the compound variable depends.

Errors of observation may in general be regarded as compounded of a number of elements, due to various causes, and it was in this connection that the normal curve was first deduced, and received its name of the curve of errors, or law of error.

14. If it be desired to compare some actual distribution with the normal distribution, the two distributions should be superposed on one diagram, as in fig. 49, though, of course, on a much larger scale. When the mean and standard-deviation of the actual distribution have been determined, $y_0$ is given by equation (5); the fit will probably be slightly closer if the standard-deviation is adjusted by Sheppard's correction (Chap. XI. § 4). The normal curve is then most readily drawn by plotting a scale showing fifths of the standard-deviation along the base line of the frequency diagram, taking the mean as origin, and marking over these points the ordinates given by the figures of the table on p. 303, multiplied in each case by $y_0$. The curve can be drawn freehand, or by aid of a curve ruler, through the tops of the ordinates so determined. The logarithms of $y$ in the table on p. 303 are given to facilitate the multiplication. The only point in which the student is likely to find any difficulty is in the use of the scales: he must be careful to remember that the standard-deviation must be expressed in terms of the class-interval as a unit in order to obtain for $y_0$ a number of observations per interval comparable with the frequencies of his table.
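The plotting procedure of § 14 can be sketched numerically. This is an illustration under stated assumptions, not the author's own working: the function `fitted_ordinates` and its parameters are hypothetical names, the ordinates are computed from $e^{-t^2/2}$ rather than read from the printed table, and the stature constants (8585 observations, mean 67.46 in., standard-deviation 2.57 in.) are used with an assumed class-interval of 1 inch.

```python
import math

def fitted_ordinates(n_obs, mean, sd, class_interval=1.0, step=0.2, max_dev=3.0):
    """Return (y0, points) for drawing a normal curve over a frequency diagram.

    sd is first expressed in class-intervals, so that y0 is a number of
    observations per interval. Points are placed at fifths of the
    standard-deviation (step = 0.2) out to max_dev standard-deviations.
    """
    sd_units = sd / class_interval
    y0 = n_obs / (sd_units * math.sqrt(2 * math.pi))
    steps = int(round(max_dev / step))
    points = []
    for i in range(-steps, steps + 1):
        t = i * step                                  # deviation in units of sd
        points.append((mean + t * sd, y0 * math.exp(-t * t / 2)))
    return y0, points

y0, points = fitted_ordinates(8585, 67.46, 2.57)
print(f"y0 = {y0:.1f} observations per interval")
for x, y in points[::5]:                              # every whole standard-deviation
    print(f"x = {x:6.2f} in.   y = {y:7.1f}")
```

If the class-interval were, say, 2 inches, passing `class_interval=2.0` would double $y_0$, which is the point of the caution about scales at the end of the section.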

The process may be varied by keeping the normal curve drawn to one scale, and redrawing the actual distribution so as to make the area, mean, and standard-deviation the same. Thus suppose a diagram of a normal curve was printed once for all to a scale, say, of $y_0 = 5$ inches, $\sigma = 1$ inch, and it were required to fit the distribution of stature to it. Since the standard-deviation is 2.57 inches of stature, the scale of stature is 1 inch = 2.57 inches of stature, or 0.389 inch = 1 inch of stature; this scale must be drawn on the base of the normal-curve diagram, being so placed that the mean falls at 67.46. As regards the scale of frequency-per-interval, this is given by the fact that the whole area of the polygon showing the actual distribution must be equal to the area of the normal curve, that is $5\sqrt{2\pi} = 12.53$ square inches. If, therefore, the scale required is $n$ observations per interval to the inch, we have, the number of observations being 8585,

$$n \times 2.57 = \frac{8585}{12.53},$$

which gives $n = 266.6$.

Though the second method saves curve drawing, the first, on the whole, involves the least arithmetic and the simplest plotting.
15. Any plotting of a diagram, or the equivalent arithmetical comparison of actual frequencies with those given by the fitted normal distribution, affords, of course, in itself, only a rough test, of a practical kind, of the normality of the given distribution. The question whether all the observed differences between actual and calculated frequencies, taken together, may have arisen merely as fluctuations of sampling, so that the actual distribution may be regarded as strictly normal, neglecting such errors, is a question of a kind that cannot be answered in an elementary work (cf. ref. 22). At present the student is in a position to compare the divergences of actual from calculated frequencies with fluctuations of sampling in the case of single class-intervals, or single groups of class-intervals only. If the expected theoretical frequency in a certain interval is $f$, the standard error of sampling is $\sqrt{f(N-f)/N}$; and if the divergence of the observed from the theoretical frequency exceed some three times this standard error, the divergence is unlikely to have occurred as a mere fluctuation of sampling. It should be noted, however, that the ordinate of the normal curve at the middle of an interval does not give accurately the area of that interval, or the number of observations within it: it would only do so if the curve were sensibly straight. To deal strictly with problems as to fluctuations of sampling in the frequencies of single intervals or groups of intervals, we require, accordingly, some convenient means of obtaining the number of observations, in a given normal distribution, lying between any two values of the variable.

16. If an ordinate be erected at a distance $x/\sigma$ from the mean, in a normal curve, it divides the whole area into two parts, the ratio of which is evidently, from the mode of construction of the curve, independent of the values of $y_0$ and of $\sigma$. The calculation of these fractions of area for given values of $x/\sigma$, though a long and tedious matter, can thus be done once for all, and a table giving the results is useful for the purpose suggested in § 15 and in many other ways. References to complete tables are cited at the end of this work (list of tables, pp. 357-8), the short table below being given only for illustrative purposes. The table shows the greater fraction of the area lying on one side of any given ordinate; e.g. 0.53983 of the whole area lies on one side of an ordinate at $0.1\sigma$ from the mean, and 0.46017 on the other side. It will be seen that an ordinate drawn at a distance from the mean equal to the standard-deviation cuts off some 16 per cent. of the whole area on one side; some 68 per cent. of the area will therefore be contained between ordinates at $\pm\sigma$. An ordinate at twice the standard-deviation cuts off only 2.3 per cent., and therefore some 95.4 per cent. of the whole area lies within a range of $\pm 2\sigma$. At three times the standard-deviation the fraction of area cut off is reduced to 135 parts in 100,000, leaving 99.7 per cent. within a range of $\pm 3\sigma$. This is the basis of our rough rule that a range of 6 times the standard-deviation will in general include the great bulk of the observations: the rule is founded on, and is only strictly true for, the normal distribution. For other forms of distribution it need not hold good, though experience suggests that it more often holds than not. The binomial distribution, especially if $p$ and $q$ be unequal, only becomes approximately normal when $n$ is large, and this limitation must be remembered in applying the table given, or similar more complete tables, to cases in which the distribution is strictly binomial.
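The fractions of area quoted in § 16 can be reproduced without a printed table. The sketch below is an assumption-laden illustration: it uses the error function `math.erf` from the Python standard library in place of the tables the text refers to, and the helper names `greater_fraction`, `fraction_within`, and `frequency_between` are introduced here for convenience.

```python
import math

def greater_fraction(z):
    """Greater fraction of the area on one side of an ordinate at z = x/sigma."""
    return 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))

def fraction_within(z):
    """Fraction of the area between ordinates at -z and +z."""
    return 2 * greater_fraction(z) - 1

def frequency_between(n_obs, mean, sd, lower, upper):
    """Number of observations of a normal distribution between two values."""
    def below(v):
        return 0.5 * (1 + math.erf((v - mean) / (sd * math.sqrt(2))))
    return n_obs * (below(upper) - below(lower))

print(f"ordinate at 0.1 sigma: {greater_fraction(0.1):.5f} / {1 - greater_fraction(0.1):.5f}")
for z in (1, 2, 3):
    cut_off = 1 - greater_fraction(z)
    print(f"{z} sigma: cut off {cut_off:.5f}, within range {fraction_within(z):.4f}")
```

`frequency_between` answers the question raised at the end of § 15; for instance, `frequency_between(8585, 67.46, 2.57, 61.94, 62.94)` gives the number of statures expected between those two limits under the fitted normal curve.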

Table showing the Greater Fraction of the Area of a Normal Curve to One Side of an Ordinate of Abscissa $x/\sigma$. (For references to more extended tables, see list on pp. 357-8.)

unreliability of observed statistical results, and the term probable error is given to this quantity. It should be noted that the word "probable" is hardly used in its usual sense in this connection: the probable error is merely a quantity such that we may expect greater and less errors of simple sampling with about equal frequency, provided always that the distribution of errors is normal. On the whole, the use of the "probable error" has little advantage compared with the standard error, and consequently little stress is laid on it in the present work; but the term is in constant use, and the student must be familiar with it. It is true that the "probable error" has a simpler and more direct significance than the standard error, but this advantage is lost as soon as we come to deal with multiples of the probable error. Further, the best modern tables of the ordinates and area of the normal curve are given in terms of the standard-deviation or standard error, not in terms of the probable error, and the multiplication of the former by 0.6745, to obtain the probable error, is not justified unless the distribution is normal. For very large samples the distribution is approximately normal, even though $p$ and $q$ are unequal; but this is not so for small samples, such as often occur in practice. In the case of small samples the use of the "probable error" is consequently of doubtful value, while the standard error retains its significance as a measure of dispersion. The "probable error," it may be mentioned, is often stated after an observed proportion with the ± sign before it; a percentage given as 20.5 ± 2.3 signifying "20.5 per cent., with a probable error of 2.3 per cent."

If an error or deviation in, say, a certain proportion $p$ only just exceed the probable error, it is as likely as not to occur in simple sampling: if it exceed twice the probable error (in either direction), it is likely to occur as a deviation of simple sampling about 18 times in 100 trials, or the odds are about 4.6 to 1 against its occurring at any one trial. For a range of three times the probable error the odds are about 22 to 1, and for a range of four times the probable error 142 to 1. Until a deviation exceeds, then, 4 times the probable error, we cannot feel any great confidence that it is likely to be "significant." It is simpler to work with the standard error and take ± 3 times the standard error as the critical range: for this range the odds are about 370 to 1 against such a deviation occurring in simple sampling at any one trial.

18. The following are a few miscellaneous examples of the use of the normal curve and the table of areas.

Example i. A hundred coins are thrown a number of times. How often approximately in 10,000 throws may (1) exactly 65 heads, (2) 65 heads or more, be expected?

The standard-deviation is $\sqrt{0.5 \times 0.5 \times 100} = 5$. Taking the distribution as normal, $y_0 = 797.9$. The mean number of heads being 50, $65 - 50 = 3\sigma$. The frequency of a deviation of $3\sigma$ is given at once by the table (p. 303) as $797.9 \times 0.0111\ldots = 8.86$, or nearly 9 throws in 10,000. A throw of 65 heads will therefore be expected about 9 times.

The frequency of throws of 65 heads or more is given by the area table (p. 310), but a little caution must now be used, owing to the discontinuity of the distribution. A throw of 65 heads is equivalent to a range of 64.5-65.5 on the continuous scale of the normal curve, the division between 64 and 65 coming at 64.5. $64.5 - 50 = +2.9\sigma$, and a deviation of $+2.9\sigma$ or more will only occur, as given by the table, 187 times in 100,000 throws, or, say, 19 times in 10,000.

Example ii. Taking the data of the stature-distribution of fig. 49 (mean 67.46, standard-deviation 2.57 in.), what proportion of all the individuals will be within a range of ± 1 inch of the mean?

1 inch $= 0.389\sigma$. Simple interpolation in the table of p. 310 gives 0.65129 of the area below this deviation, or a more extended table the more accurate value 0.65136. Within a range of $\pm 0.389\sigma$ the fraction of the whole area is therefore 0.30272, or the statures of about 303 per thousand of the given population will lie within a range of ± 1 inch from the mean.

Example iii. In a case of crossing a Mendelian recessive by a heterozygote the expectation of recessive offspring is 50 per cent. (1) How often would 30 recessives or more be expected amongst 50 offspring owing simply to fluctuations of sampling? (2) How many offspring would have to be obtained in order to reduce the probable error to 1 per cent.?

The standard error of the percentage of recessives for 50 observations is $50\sqrt{1/50} = 7.07$. Thirty recessives in fifty is a deviation of 5 from the mean, or, if we take thirty as representing 29.5 or more, 4.5 from the mean; expressed as a percentage of the batch this is a deviation of 9 per cent., that is, $9/7.07 = 1.273\sigma$. A positive deviation of this amount or more occurs about 102 times in 1000, so that 30 recessives or more would be expected in about one-tenth of the batches of 50 offspring. We have assumed normality for rather a small value of $n$, but the result is sufficiently accurate for practical purposes.

As regards the second part of the question we are to have

$$0.6745 \times \frac{50}{\sqrt{n}} = 1,$$

$n$ being the number of offspring. This gives $n = 1137$ to the nearest unit.
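Several of the figures in Examples i to iii can be checked with a few lines of arithmetic. The following is a hedged sketch rather than the author's method: it evaluates the normal curve through `math.erfc` instead of the printed tables, so the last digits may differ slightly from values obtained by interpolation, and the helper `upper_tail` and the variable names are assumptions introduced here. It covers Example i, Example ii, and the second part of Example iii.

```python
import math

def upper_tail(z):
    """Fraction of the normal area lying beyond an ordinate at +z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Example i: 100 coins, 10,000 throws.
n_coins, throws = 100, 10_000
sd = math.sqrt(0.5 * 0.5 * n_coins)                       # 5
y0 = throws / (sd * math.sqrt(2 * math.pi))               # central ordinate
exactly_65 = y0 * math.exp(-(((65 - 50) / sd) ** 2) / 2)  # ordinate at 3 sigma
at_least_65 = throws * upper_tail((64.5 - 50) / sd)       # area beyond 2.9 sigma

# Example ii: statures within 1 inch of the mean, sd = 2.57 in.
within_one_inch = 1 - 2 * upper_tail(1 / 2.57)

# Example iii (2): offspring needed for a probable error of 1 per cent.
offspring_needed = (0.6745 * 50) ** 2

print(f"y0 = {y0:.1f}")
print(f"exactly 65 heads:   about {exactly_65:.2f} throws in 10,000")
print(f"65 heads or more:   about {at_least_65:.1f} throws in 10,000")
print(f"within 1 inch:      {within_one_inch:.4f} of the population")
print(f"offspring required: {round(offspring_needed)}")
```

The half-unit adjustment in `at_least_65` (64.5 rather than 65) is the allowance for discontinuity described in Example i.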

Example iv. The diagram of fig. 49 shows that the number of statures recorded in the group "62 in. and less than 63" is markedly less than the theoretical value. Could such a difference occur owing to fluctuations of simple sampling, and if so, how often might it happen?

The actual frequency recorded is 169. To obtain the theoretical frequency we may either take it as given roughly by the ordinate in the centre of the interval, or, better, use the integral table. Remembering that statures were only recorded to the nearest 1/8 in., the true limits of the interval are 61 15/16 to 62 15/16, or 61.94-62.94, mid-value 62.44. This is a deviation from the mean (67.46) of 5.02. Calculating the ordinate of the normal curve directly we find the frequency 197.8. This is certainly, as is evident from the form of the curve, a little too small. The interval actually lies between deviations of 4.52 in. and 5.52 in., that is, $1.759\sigma$ and $2.148\sigma$. The corresponding fractions of area are 0.96071 and 0.98418; difference, or fraction of area between the two ordinates, 0.02347. Multiplying this by the whole number of observations (8585) we have the theoretical frequency 201.5. The difference of theoretical and observed frequencies is therefore 32.5. But the proportion of observations which should fall into the given class is 0.023, the proportion falling into other classes 0.977, and the standard error of the class frequency is accordingly $\sqrt{0.023 \times 0.977 \times 8585} = 14.0$. As the actual deviation is only 2.32 times this, it could certainly have occurred as a fluctuation of sampling.

The question how often it might have occurred can only be answered if we assume the distribution of fluctuations of sampling to be approximately normal. It is true that $p$ and $q$ are very unequal, but then $n$ is very large (8585), so large that the difference of the chances is fairly small compared with $\sqrt{npq}$ (about one-fifteenth). Hence we may take the distribution of errors as roughly normal to a first approximation, though a first approximation only. The tables give 0.990 of the area below a deviation of $2.32\sigma$, so we would expect an equal or greater deficiency to occur about 10 times in 1000 trials, or once in a hundred.

REFERENCES.

The Binomial Machine.

(1) Galton, Francis, Natural Inheritance; Macmillan & Co., London, 1889. (Mechanical method of forming a binomial or normal distribution, chap. v., p. 63; for Pearson's generalised machine, see below, ref. 13.)

ignoring the binomial and analogous distributions.. Uhakliee. 443. and from a somewhat analogous series derived from the case of sampling from limited material. vol. 101." Jovr. F... Ixxvi. vol.. 7. Wm. For the early classical memoirs on the normal curve or law of error hy Laplace. Article on the "Law of Error" in the Eneydopcedia Britannica. Series A. 1901... Boy. Ixix. a Class of Normal Functions occurring in Statistics. V. 1898 . 1879. p. Boy.. 1908. " On the Distribution of Deaths with Age when the Causes (17) of Death act cumulatively. p. Soc. Series 3. ibid. " Nuove Applicazioni del Caloolo delle Probability alio Studio del Fenomeni Statistiei e Distribuzione dei Matrimoni secondo I'Eth degli Sposi.. L. vol. "On the Representation of Statistical Frequency by a Curve. Trans.. vol. Y. Soy. Soc. Boy. or Law of Great Numbers. i-xiv. Noordhoff. ). vol. Stat. Y. 1882. vol. vol. and similar Frequency-distributions. London. A Rejoinder. Series A. F.. Boy. xxviii. 280. cxcii. F. "On the Representation of Statistics by Mathematical Formnlse. Leipzig. pp. 367. clxixvi. . 497. 10th edn.) Kapteyn. (6) (6) (7) (8) (9) (10) (11) (12) (13) ref. cxovii. T. Karl. Soc. p.. For the generalised binomial machine. J.. C. Soc. 1903. F." Jour. 702-706. 1907. Lipps." Mem. 36-65.. "The Law of Error." Jowr. Edgeworth. vol. 113-141 (and an appendix. Trans. (16) Sheppard. G. Boy.. Biometrika. The memoir deals with curves derived from the general binomial. Izxxi. F. LniGi. J. Edgbworth. "The Law of the Geometric Mean. Soc. (2) (3) Cttnningham. pp." Phil. (Includes a geometrical treatment of the normal curve. and 14. Edgkworth. Groningen . and 13 are of fundamental importance. cf. and others. see § 1. Y. "The Generalised Law of Error. U.. Frequency Curves. 1906. Dawson & Sons. p. Soc. dei Lincei. x. Soc.. 1913. 1899 and vol.. C.. Ixi. see Todhunter's History (Introduction ref. 1900.." . Lund) . F. 310. 8. Boy. Gauss. Macalister. Donald. Ixiii. vol. delta Classe di Seieme morali. 
Skew Frequency Curves in Biology and Statistics. Engelmann. 1904. 102. E. "The i»-Functions. F.. "An Experimental Test of the Normal Law of Error. not printed in the Oambridge Phil.. vol. XX.. vol. Karl. W. Y. pp.. Stat. X. For a derivation of the same curves from a modified standpoint. Fechner. "Skew Variation in Homogeneous Material. vol. p." Proc. Ixx. Stat. p. Chap. Pearson. "Das Fehlergesetz und seine Verallgemoinerungen durch Fechner und Pearson " . "Researches into the Theory of Probability" [CommuHicatitms from the Astronomical Observatory. 1895. Edgbwokth. xxix. W." Cambridge Phil.: 314 THEORY OF STATISTICS.. Kollektivmasslehre (herausgegeben von G. vol. (4) ... 1898. Soc. 169. Edgeworth. Trans. 7)." PhU. 343. iv. p. Nixon." Proc.) YnLE. p. 1902.. (16) Perozzo. 1897. 18. G. Y. " On the Application of the Theory of Error to Cases of Normal Distribution and Normal Correlation.. Stat.. Lund. 1906. Sujjplement to the memoir.. Series A. etc. of which 6. vol. p. Trans. The literature of this subject is too extensive to enable us to do more than cite a few of the more recent memoirs. Beale Accad. Boy. The student will find other citations in 6. (14) Pearson." Jour. Ixii. 1905.

On the Representation of Statistics by Mathematical Formulae. x. (A binomial distribution with negative index... 1905. and compare them with those of the normal curve. 1899. Ixxiii. Soy. Compare the values of the semi-interquartile range for the stature distributions of male adults in the United Kingdom and Cambridge students. vol. Stat.. Series A. Stat. Mag. p. 85-143. F. Karl. —BINOMIAL DISTKIBUTION AND NORMAL CURVE. in the Case of a Correlated System of Variables. the second the full memoir. Y. 1. 26. Testing the Fit of an Observed to a Theoretical or another Observed Distribution." Phil. vol." part ii. iv. Fernando de.e. " Contributions to the Mathematical Theory of Evolution (on the Dissection of Asymmetrical Frequency Curves). and a normal curve of the same area. vol. Pearson. EXERCISES. Karl. section vi. Draw a diagram showing the distribution of statures of Cambridge students (Chap.. p. 1906. 2. [Note it follows that if two normal distributions of the same area and standard-deviation are superposed so that the difference between the means is small compared with the standard-deviation.. VI. vol. Kabl." PhU. On some Applications of the Theory of Chance to (20) Pbakson. " On the Probability that Two Independent Distributions of Frequency are really Samples from the same Population. 1900. on the assumption that the distribution is normal. Racial Differentiation. Jour. p. clxxxv." Biometrika. 125. 5th Series. and standard-deviation superposed thereon. i.. 6th Series. 6. the distrib ution 1. (2) as calculated from the standard-deviation.. p. p.. 3. formed by adding superposed terms is a symmetrical binomial of degree n+\. (18) Peakson. pp. p.. Show that if np be a whole number. Soc. Rome.) Jo-wr. Calculate the ordinates of the binomial 1024 (0•5-^0•5)"'. and the related curve. 157. 1911." Biomttrika.J. 1910. vi.. Mag. Ixii. 13. Also memoir under the same title in the Transactions of the Reale Accademia dei Lincei. vol. Roy. Earl. (1) as found directly. 
and (3) cited in § 7 of Chapter XIII.. Edge-worth. the mean of the binomial coincides with the greatest term. vol. Table VII. mean. ] 4. (21) Helgtjbko. is such that it can be reasonably supposed to have arisen from random sampling. XV. 1894. (23) Pearson. Calculate the theoretical distributions for the three experimental oases (1). a special case of one of Pearson's curves. cited in (2). of that memoir dealing with the problem of dissection.. . vol. 1901. 250 . viii.) See also the memoir by Charlier. Show that if two symmetrical binomial distributions of degree n (and of the same number of observations) are so superposed that the rth term of the one coincides with the (r-H)th term of the other. Soc. The Resolution of a Distribution compounded of two Normal Curves into its Components. vol. : . (The first is a short note. "Per la risoluzione delle curve dimorfiche. 1914. i. Soc. ref. 230. also Biometrika. 110. (22) "On the Criterion that a given System of Deviations from the Probable.. 5. " ' ' (19) Trans.. (2). p. vol. the compound curve is very nearly normal. 71." PAtZ. 315 Roy.

7. Taking the mean stature for the British Isles as 67.46 in. (the distribution of fig. 49), the mean for Cambridge students as 68.85 in., and the common standard-deviation as 2.56 in., what percentage of Cambridge students exceed the British mean in stature, assuming the distribution normal?

8. As stated in Chap. XIII., Example ii., certain crosses of Pisum sativum based on 7125 seeds gave 25.32 per cent. of green seeds instead of the theoretical proportion 25 per cent., the standard error being 0.51 per cent. In what percentage of experiments based on the same number of seeds might an equal or greater percentage be expected to occur owing to fluctuations of sampling alone?

9. In similar experiments, what number of seeds must be obtained to make the "probable error" of the proportion 1 per cent.?

10. In what proportion of similar experiments based on (1) 100 seeds, (2) 1000 seeds, might (a) 30 per cent. or more, (b) 35 per cent. or more, of green seeds, be expected to occur, if ever?

11. If skulls are classified as dolichocephalic when the length-breadth index is under 75, mesocephalic when the same index lies between 75 and 80, and brachycephalic when the index is over 80, find approximately (assuming that the distribution is normal) the mean and standard-deviation of a series in which 58 per cent. are stated to be dolichocephalic, 38 per cent. mesocephalic, and 4 per cent. brachycephalic.

CHAPTER XVI.

NORMAL CORRELATION.

1-3. Deduction of the general expression for the normal correlation surface from the case of independence; 4. Constancy of the standard-deviations of parallel arrays and linearity of the regression; 5. The contour lines: a series of concentric and similar ellipses; 6. The normal surface for two correlated variables regarded as a normal surface for uncorrelated variables rotated with respect to the axes of measurement: arrays taken at any angle across the surface are normal distributions with constant standard-deviation: distribution of and correlation between linear functions of two normally correlated variables are normal: principal axes; 7. Standard-deviations round the principal axes; 8-11. Investigation of Table III., Chap. IX., to test normality: linearity of regression, constancy of standard-deviation of arrays, normality of distribution obtained by diagonal addition, contour lines; 12-13. Isotropy of the normal distribution for two variables; 14. Outline of the principal properties of the normal distribution for n variables.

1. The expression that we have obtained for the "normal" distribution of a single variable may readily be made to yield a corresponding expression for the distribution of frequency of pairs of values of two variables. This normal distribution for two variables, or "normal correlation surface," is of great historical importance, as the earlier work on correlation is, almost without exception, based on the assumption of such a distribution; though when it was recognised that the properties of the correlation-coefficient could be deduced, as in Chap. IX., without reference to the form of the distribution of frequency, a knowledge of this special type of frequency-surface ceased to be so essential. But the generalised normal law is of importance in the theory of sampling: it serves to describe very approximately certain actual distributions (e.g. of measurements on man); and if it can be assumed to hold good, some of the expressions in the theory of correlation, notably the standard-deviations of arrays (and, if more than two variables are involved, the partial correlation-coefficients), can be assigned more simple and definite meanings than in the general case. The student should, therefore, be familiar with the more fundamental properties of the distribution.

2. Consider first the case in which the two variables are completely independent. Let the distributions of frequency for the two variables $x_1$ and $x_2$, singly, be

$$y_1 = y_1'\, e^{-x_1^2/2\sigma_1^2}, \qquad y_2 = y_2'\, e^{-x_2^2/2\sigma_2^2} \tag{1}$$

Then, assuming independence, the frequency-distribution of pairs of values must, by the rule of independence, be given by

$$y_{12} = y_{12}'\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2}\right)} \tag{2}$$

where

$$y_{12}' = \frac{N}{2\pi\sigma_1\sigma_2} \tag{3}$$

Equation (2) gives a normal correlation surface for one special case, the correlation-coefficient being zero. If we put $x_2$ = a constant, we see that every section of the surface by a vertical plane parallel to the $x_1$ axis, i.e. the distribution of any array of $x_1$'s, is a normal distribution, with the same mean and standard-deviation as the total distribution of $x_1$'s, and a similar statement holds for the array of $x_2$'s; these properties must hold good, of course, as the two variables are assumed independent (cf. Chap. V. § 13). The contour lines of the surface, that is to say, lines drawn on the surface at a constant height, are a series of similar ellipses with major and minor axes parallel to the axes of $x_1$ and $x_2$ and proportional to $\sigma_1$ and $\sigma_2$, the equations to the contour lines being of the general form

$$\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2} = c^2 \tag{4}$$

Pairs of values of $x_1$ and $x_2$ related by an equation of this form are, therefore, equally frequent.

3. To pass from this special case of independence to the general case of two correlated variables, remember (Chap. XII. § 8) that if

$$x_{1.2} = x_1 - b_{12}x_2, \qquad x_{2.1} = x_2 - b_{21}x_1,$$

$x_1$ and $x_{2.1}$, as also $x_2$ and $x_{1.2}$, are uncorrelated. If they are not merely uncorrelated but completely independent, and if the distribution of each of the deviations singly be normal, we must have for the frequency-distribution of pairs of deviations of $x_1$ and $x_{2.1}$,

$$y_{12} = y_{12}'\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_1^2} + \frac{x_{2.1}^2}{\sigma_{2.1}^2}\right)} \tag{5}$$

But

$$\frac{x_1^2}{\sigma_1^2} + \frac{x_{2.1}^2}{\sigma_{2.1}^2} = \frac{x_1^2}{\sigma_{1.2}^2} + \frac{x_2^2}{\sigma_{2.1}^2} - 2r_{12}\frac{x_1x_2}{\sigma_{1.2}\,\sigma_{2.1}}$$

(Evidently we would also have arrived at precisely the same expression if we had taken the distribution of frequency for $x_2$ and $x_{1.2}$, and reduced the exponent in the same way.) We have, therefore, the general expression for the normal correlation surface for two variables

$$y_{12} = y_{12}'\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_{1.2}^2} + \frac{x_2^2}{\sigma_{2.1}^2} - 2r_{12}\frac{x_1x_2}{\sigma_{1.2}\,\sigma_{2.1}}\right)} \tag{6}$$

Further, since $x_1$ and $x_{2.1}$, $x_2$ and $x_{1.2}$, are independent, we must have

$$y_{12}' = \frac{N}{2\pi\,\sigma_1\sigma_{2.1}} = \frac{N}{2\pi\,\sigma_2\sigma_{1.2}} = \frac{N}{2\pi\,\sigma_1\sigma_2(1-r_{12}^2)^{1/2}} \tag{7}$$

4. If we assign to $x_2$ some fixed value, say $h_2$, we have the distribution of the array of $x_1$'s of type $h_2$,

$$y_{12} = y_{12}'\, e^{-h_2^2/2\sigma_2^2}\; e^{-\frac{1}{2\sigma_{1.2}^2}\left(x_1 - r_{12}\frac{\sigma_1}{\sigma_2}h_2\right)^2}$$

This is a normal distribution of standard-deviation $\sigma_{1.2}$, with a mean deviating by $r_{12}\frac{\sigma_1}{\sigma_2}h_2$ from the mean of the whole distribution of $x_1$'s. As $h_2$ represents any value whatever of $x_2$, we see (1) that the standard-deviations of all arrays of $x_1$ are the same, and equal to $\sigma_{1.2}$; (2) that the regression of $x_1$ on $x_2$ is strictly linear. Similarly, if we assign to $x_1$ any value $h_1$, we will find (1) that the standard-deviations of all arrays of $x_2$ are the same; (2) that the regression of $x_2$ on $x_1$ is strictly linear.

5. The contour lines are, as in the case of independence, a series of concentric and similar ellipses; the major and minor axes are, however, no longer parallel to the axes of $x_1$ and $x_2$, but make a certain angle with them. Fig. 50 illustrates the calculated form of the contour lines for one case, RR and CC being the lines of regression. As each line of regression cuts every array of $x_1$ or of $x_2$ in its mean, and as the distribution of every array is symmetrical about its mean, RR must bisect every horizontal chord and CC every vertical chord, as illustrated by the two chords shown by dotted lines: it also follows that RR cuts all the ellipses in the points of contact of the horizontal tangents to the ellipses, and CC in the points of contact of the vertical tangents. The surface or solid itself, somewhat truncated, is shown in fig. 29, p. 166.

[Fig. 50. Principal Axes and Contour Lines of the Normal Correlation Surface. Legend: axes of measurement; M, mean of the whole surface and also the summit of the surface; RR, CC, lines of means; contour lines and axes of the normal correlation surface.]

6. Since, as we see from fig. 50, a normal surface for two correlated variables may be regarded merely as a certain surface for which r is zero turned round through some angle, and since for every angle through which it is turned the distributions of all $x_1$ arrays and $x_2$ arrays are normal, it follows that every section of a normal surface by a vertical plane is a normal curve, i.e. the distributions of arrays taken at any angle across the surface are normal. It also follows that, since the total distributions of $x_1$ and $x_2$ must be normal for every angle through which the surface is turned, the distributions of totals given by slices or arrays taken at any angle across a normal surface must be normal distributions. But these would give the distributions of functions like $a\,x_1 + b\,x_2$, and consequently (1) the distribution of any linear function of two normally distributed variables $x_1$ and $x_2$ must also be normal; (2) the correlation between any two linear functions of two normally distributed variables must be normal correlation.

To find the angle $\theta$ through which the surface has been turned, from the position for which the correlation is zero to the position for which the coefficient has some assigned value r, we must use a little trigonometry. The major and minor axes of the ellipses are sometimes termed the principal axes. If $\xi_1$, $\xi_2$ be the co-ordinates referred to the principal axes (the $\xi_1$-axis being the $x_1$ axis in its new position) we have for the relation between $\xi_1$, $\xi_2$, $x_1$, $x_2$, the angle $\theta$ being taken as positive for a rotation of the $x_1$-axis which will make it, if continued through 90°, coincide in direction and sense with the $x_2$-axis,

$$\xi_1 = x_1\cos\theta + x_2\sin\theta, \qquad \xi_2 = x_2\cos\theta - x_1\sin\theta \tag{8}$$

But, since $\xi_1$, $\xi_2$ are uncorrelated, $\sum(\xi_1\xi_2) = 0$. Hence, multiplying together equations (8) and summing,

$$0 = (\sigma_2^2 - \sigma_1^2)\sin 2\theta + 2r_{12}\,\sigma_1\sigma_2\cos 2\theta$$

$$\tan 2\theta = \frac{2r_{12}\,\sigma_1\sigma_2}{\sigma_1^2 - \sigma_2^2} \tag{9}$$
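The properties deduced in §§ 3-4 can be checked numerically. The following Python sketch is not part of the original text: the function names and the illustrative constants are assumptions chosen for the example. It evaluates the ordinate of the normal correlation surface from equations (6) and (7), and returns the mean and standard-deviation of an array of $x_1$'s as stated in § 4.

```python
import math

def partial_sds(s1, s2, r):
    """Return (sigma_{1.2}, sigma_{2.1}), the standard-deviations of arrays."""
    k = math.sqrt(1.0 - r * r)
    return s1 * k, s2 * k

def normal_surface(x1, x2, s1, s2, r, n=1.0):
    """Ordinate y_12 of the normal correlation surface, equations (6) and (7)."""
    s12, s21 = partial_sds(s1, s2, r)
    y0 = n / (2.0 * math.pi * s1 * s21)          # central ordinate, equation (7)
    phi = (x1 / s12) ** 2 + (x2 / s21) ** 2 - 2.0 * r * x1 * x2 / (s12 * s21)
    return y0 * math.exp(-0.5 * phi)             # equation (6)

def array_of_x1(h2, s1, s2, r):
    """Mean and standard-deviation of the array of x1's of type x2 = h2 (section 4)."""
    s12, _ = partial_sds(s1, s2, r)
    return r * s1 / s2 * h2, s12

if __name__ == "__main__":
    s1, s2, r = 2.0, 3.0, 0.6                    # illustrative values only
    mean, sd = array_of_x1(1.5, s1, s2, r)
    print("array mean and standard-deviation:", mean, sd)
    # Within the array, ordinates should fall off as a normal curve of s.d. `sd`.
    ratio = normal_surface(mean + 1.0, 1.5, s1, s2, r) / normal_surface(mean, 1.5, s1, s2, r)
    print("ratio of ordinates:", ratio, "normal curve:", math.exp(-0.5 / sd ** 2))
```

The array mean depends linearly on $h_2$ and the array standard-deviation does not depend on it at all, which is the content of the two conclusions of § 4.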

It should be noticed that if we define the principal axes of any distribution for two variables as being a pair of axes at right angles for which the variables $\xi_1$, $\xi_2$ are uncorrelated, equation (9) gives the angle that they make with the axes of measurement whether the distribution be normal or no.

7. The two standard-deviations, say $\Sigma_1$ and $\Sigma_2$, about the principal axes are of some interest, for evidently from § 2 the major and minor axes of the contour-ellipses are proportional to these two standard-deviations. They may be most readily determined as follows. Squaring the two transformation equations (8), summing and adding, we have

$$\Sigma_1^2 + \Sigma_2^2 = \sigma_1^2 + \sigma_2^2 \tag{10}$$

Referring the surface to the axes of measurement, we have for the central ordinate by equation (7)

$$y_{12}' = \frac{N}{2\pi\,\sigma_1\sigma_2(1-r_{12}^2)^{1/2}}$$

Referring it to the principal axes, by equation (3)

$$y_{12}' = \frac{N}{2\pi\,\Sigma_1\Sigma_2}$$

But these two values of the central ordinate must be equal, therefore

$$\Sigma_1\Sigma_2 = \sigma_1\sigma_2(1-r_{12}^2)^{1/2} \tag{11}$$

(10) and (11) are a pair of simultaneous equations from which $\Sigma_1$ and $\Sigma_2$ may be very simply obtained in any arithmetical case. Care must, however, be taken to give the correct signs to the square root in solving. $\Sigma_1 + \Sigma_2$ is necessarily positive, and $\Sigma_1 - \Sigma_2$ also if r is positive, the major axes of the ellipses lying along $\xi_1$; but if r be negative, $\Sigma_1 - \Sigma_2$ is also negative. It should be noted that, while we have deduced (11) from a simple consideration depending on the normality of the distribution, it is really of general application (like equation 10), and may be obtained at somewhat greater length from the equations for transforming co-ordinates.

8. As stated in Chap. XV. § 13, the frequency-distribution for any variable may be expected to be approximately normal if that variable may be regarded as the sum (or, within limits, some slightly more complex function) of a large number of other variables, provided that these elementary component variables are independent, or nearly so. Similarly, the correlation between two variables may be expected to be approximately normal if each of the two variables may be regarded as the sum, or some slightly more complex function, of a large number of elementary component variables, the intensity of correlation depending on the proportion of the components common to the two variables. Stature is a highly compound character of this kind, and we have seen that, in one instance at least, the distribution of stature for a number of adults is given approximately by the normal curve. We can now utilise Table III., Chap. IX., p. 160, showing the correlation between stature of father and son, to test, as far as we can by elementary methods, whether the normal surface will fit the distribution of the same character in pairs of individuals: we leave it to the student to test, as far as he can do so by simple graphical methods, the approximate normality of the total distributions for this table.

The first important property of the normal distribution is the linearity of the regression. This was well illustrated in fig. 37, p. 174, and the closeness of the regression to linearity was confirmed by the values of the correlation-ratios (p. 206), viz., 0.52 in each case as compared with a correlation of 0.51. Subject to some investigation as to the possibility of the deviations that do occur arising as fluctuations of simple sampling, when drawing samples from a record for which the regression is strictly linear, we may conclude that the regression is appreciably linear.

9. The second important property of the normal distribution for two variables is the constancy of the standard-deviation for all parallel arrays. We gave in Chap. X., p. 204, the standard-deviations of ten of the columns of the present table, from the column headed 62.5-63.5 onwards; these were: 2.56, ...

strictly normal. to the small numbers of observations in any array. viz. III. the distributions of arrays are very irregular. a fact.. IX. table we cannot find the totals of such diagonal arrays exactly. § 6). but the totals of arrays at an angle of 45° will be given with sufficient accuracy for our present purpose by the totals of lines Eeferring again to Table of diagonally adjacent compartments. and their normality cannot be tested we can only say that they do not in any very satisfactory way But we can test the exhibit any marked or regular asymmetry. Owing. the following distribution : 0-25 . as a rough test suggests that they distributions of all might have done 10.. and not parallel to either From an ordinary correlationaxis of measurement (c/. and forming the totals of such diagonals (running up from left to right). we find. Chap. allied property of a normal correlation-table. however.— 324 THEORY OF STATISTICS. that the totals of arrays must give a normal distribution even if the arrays be taken diagonally across the surface. starting at the top left-hand : corner of the table. but. so. Next we note that the arrays of a normal surface should themselves be normal.

certainly there is no marked asymmetry. the distribution may be regarded as appreciably normal. so far as the graphical test goes. Drawing a diagram and a normal curve we have 51 . : 100 80 eo k 4C 20 . and. = 0-51. fig. the distribution is rather irregular but the fit is fair . One of the greatest divergences of the actual distribution from the normal curve occurs in the almost central interval with frequency 78 the difference between the observed and calculated frequencies is here 12 units. sin ^=cos 6=1/ J2 fitting = 3'361).XVr. so that it may well have occurred as a fluctuation of simple sampling. = 2-75. — NORMAL 0-3 CORRELATION. but the standard error is 9'1. rj2 325 and inserting find o-{ <rj = 2-72.

... square root (Chap. XIII. § 12), and this implies a standard error of about 5 units at the centre of the table, 3 units for a frequency of 9, or 2 units for a frequency of 4: such fluctuations might cause wide divergences in the corresponding contour lines.

Using the suffix 1 to denote the constants relating to the distribution of stature for fathers, and 2 the same constants for the sons,

N = 1078, $M_1$ = 67.70, $M_2$ = 68.66, $\sigma_1$ = 2.72, $\sigma_2$ = 2.75, $r_{12}$ = 0.51.

Hence we have from equation (7) $y_{12}'$ = 26.7, and the complete expression for the fitted normal surface is

$$y_{12} = 26.7\, e^{-\frac{1}{2}\left(\frac{x_1^2}{5.47} + \frac{x_2^2}{5.60} - \frac{x_1x_2}{5.43}\right)}$$

The equation to any contour ellipse will be given by equating the index of e to a constant, but it is very much easier to draw the ellipses if we refer them to their principal axes. To do this we must first determine $\theta$. From (9), $\tan 2\theta = -46.49$, whence $2\theta$ = 91° 14′, $\theta$ = 45° 37′, the principal axes standing very nearly at an angle of 45° with the axes of measurement, owing to the two standard-deviations being very nearly equal. They should be set off on the diagram, not with a protractor, but by taking $\tan\theta$ from the tables (1.022) and calculating points on each axis on either side of the mean.

To obtain $\Sigma_1$ and $\Sigma_2$ we have from (10) and (11)

$$\Sigma_1^2 + \Sigma_2^2 = 14.961, \qquad 2\Sigma_1\Sigma_2 = 12.868$$

Adding and subtracting these equations from each other and taking the square root, $\Sigma_1 + \Sigma_2$ = 5.275, $\Sigma_1 - \Sigma_2$ = 1.447, whence $\Sigma_1$ = 3.36, $\Sigma_2$ = 1.91; owing to the principal axes standing nearly at 45° the first value is sensibly the same as that found for $\sigma_\xi$ in § 10. The equations to the contour ellipses, referred to the principal axes, may therefore be written in the form

$$\frac{\xi_1^2}{(3.36)^2} + \frac{\xi_2^2}{(1.91)^2} = c^2,$$

the major and minor axes being 3.36 × c and 1.91 × c respectively. To find c for any assigned value of the frequency y we have

$$y_{12} = y_{12}'\, e^{-\frac{1}{2}c^2}, \qquad c^2 = \frac{2(\log y_{12}' - \log y_{12})}{\log e}$$

Supposing that we desire to draw the three contour-ellipses for y = 5, 10 and 20, we find c = 1.83, 1.40 and 0.76, or the following values for the major and minor axes of the ellipses: semi-major axes, 6.15, 4.70, 2.55; semi-minor axes, 3.50, 2.67, 1.45. The ellipses drawn with these axes are shown in fig. 52, very much reduced from the original drawing, one of the squares shown representing a square inch on the original.

[Fig. 52. Contour Lines for the Frequencies 5, 10 and 20 of the distribution of Table III., Chap. IX., and corresponding Contour Ellipses of the fitted Normal Surface. $P_1P_1$, $P_2P_2$, principal axes; M, mean. Horizontal axis: Stature of Father (inches).]

The actual contour lines for the same frequencies are shown by the irregular polygons superposed on the ellipses, the points on these polygons having been obtained by simple graphical interpolation between the frequencies in each row and each column, diagonal interpolation between the frequencies in a row and the frequencies in a column not being used. It will be seen that the fit of the two lower contours is, on the whole, fair, especially considering the high standard errors. In the case of the central contour, y = 20, the fit looks very poor to the eye, but if the ellipse be compared carefully with the table, the figures suggest that here again we have only to deal with the effects of fluctuations of sampling. For father's stature = 66 in., son's stature = 70 in., there is a frequency of 18.75, and an increase in this much less than the standard error would bring the actual contour outside the ellipse. Again, for father's stature = 68 in., son's stature = 71 in., there is a frequency of 19, and an increase of a single unit would give a point on the actual contour below the ellipse. Taking the results as a whole, the fit must be regarded as quite as good as we could expect with such small frequencies.

It is perhaps of historical interest to note that Sir Francis Galton, working without a knowledge of the theory of normal correlation, suggested that the contour lines of a similar table for the inheritance of stature seemed to be closely represented by a series of concentric and similar ellipses (ref. 2): the suggestion was confirmed when he handed the problem, in abstract terms, to a mathematician, Mr J. D. Hamilton Dickson (ref. 4), asking him to investigate "the Surface of Frequency of Error that would result from these data, and the various shapes and other particulars of its sections that were made by horizontal planes" (ref. 3, p. 102).

12. The normal distribution of frequency for two variables is an isotropic distribution, to which all the theorems of Chap. V. §§ 11-12 apply. For if we isolate the four compartments of the correlation-table common to the rows and columns centring round values of the variables $x_1$, $x_2$, $x_1'$, $x_2'$, we have for the ratio of the cross-products (frequency of $x_1x_2$ multiplied by frequency of $x_1'x_2'$, divided by frequency of $x_1x_2'$ multiplied by frequency of $x_1'x_2$),

$$e^{\frac{r_{12}}{\sigma_{1.2}\,\sigma_{2.1}}(x_1' - x_1)(x_2' - x_2)}$$

Assuming that $x_1' - x_1$ has been taken of the same sign as $x_2' - x_2$, the exponent is of the same sign as $r_{12}$. Hence the association for this group of four frequencies is also of the same sign as $r_{12}$, the ratio of the cross-products being unity, or the association zero, if $r_{12}$ is zero. In a normal distribution, the association is therefore of the same sign, the sign of $r_{12}$, for every tetrad of frequencies in the compartments common to two rows and two columns; that is to say, the distribution is isotropic. It follows that every grouping of a normal distribution is isotropic whether the class-intervals are equal or unequal, large or small, and the sign of the association for a normal distribution grouped down to 2 × 2-fold form must always be the same whatever the axes of division chosen.

These theorems are of importance in the applications of the theory of normal correlation to the treatment of qualitative characters which are subjected to a manifold classification. The contingency tables for such characters are sometimes regarded as groupings of a normal distribution of frequency, and the coefficient of correlation is determined on this hypothesis by a rather lengthy procedure (ref. 14). Before applying this procedure it is well, therefore, to see whether the distribution of frequency may be regarded as approximately isotropic, or reducible to isotropic form by some alteration in the order of rows and columns (Chap. V. §§ 9-10). If only reducible to isotropic form by some rearrangement, this rearrangement should be effected before grouping the table to 2 × 2-fold form for the calculation of the correlation coefficient by the process referred to. If the table is not reducible to isotropic form by any rearrangement, the process of calculating the coefficient of correlation on the assumption of normality is to be avoided. Clearly, even if the table be isotropic it need not be normal, but at least the test for isotropy affords a rapid and simple means for excluding certain distributions which are not even remotely normal. Table II. of Chap. V. might possibly be regarded as a grouping of normally distributed frequency if rearranged as suggested in § 10 of the same chapter (it would be worth the investigator's while to proceed further and compare the actual distribution with a fitted normal distribution), but Table IV. could not be regarded as normal, and could not be rearranged so as to give a grouping of normally distributed frequency.

13. If the frequencies in a contingency-table be not large, and also if the contingency or correlation be small, the influence of casual irregularities due to fluctuations of sampling may render it difficult to say whether the distribution may be regarded as essentially isotropic or no. In such cases some further condensation of the table by grouping together adjacent rows and columns, or some process of "smoothing" by averaging the frequencies in adjacent compartments, may be of service.
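The arithmetic of § 11 and the isotropy property of § 12 can be verified with a short calculation. The Python sketch below is an illustration added for the reader, not part of the original text; the function names are assumptions, and the only data used are the constants quoted in § 11 for Table III., Chap. IX. The use of `atan2` to fix the quadrant of $2\theta$ is a convention of this sketch, under which the $\xi_1$-axis lies along the major axis.

```python
import math

# Constants quoted in section 11 (suffix 1: fathers, suffix 2: sons).
N, S1, S2, R = 1078, 2.72, 2.75, 0.51

def surface(x1, x2, s1=S1, s2=S2, r=R, n=N):
    """Ordinate of the fitted normal surface, equations (6) and (7)."""
    k = math.sqrt(1.0 - r * r)
    s12, s21 = s1 * k, s2 * k                     # sigma_{1.2}, sigma_{2.1}
    phi = (x1 / s12) ** 2 + (x2 / s21) ** 2 - 2.0 * r * x1 * x2 / (s12 * s21)
    return n / (2.0 * math.pi * s1 * s21) * math.exp(-0.5 * phi)

def principal_axes(s1=S1, s2=S2, r=R):
    """Return (theta in radians, Sigma_1, Sigma_2) from equations (9), (10), (11)."""
    # tan(2 theta) = 2 r s1 s2 / (s1^2 - s2^2); atan2 keeps the right quadrant.
    theta = 0.5 * math.atan2(2.0 * r * s1 * s2, s1 * s1 - s2 * s2)
    total = s1 * s1 + s2 * s2                               # equation (10)
    twice_product = 2.0 * s1 * s2 * math.sqrt(1.0 - r * r)  # twice equation (11)
    plus = math.sqrt(total + twice_product)                 # Sigma_1 + Sigma_2
    minus = math.sqrt(total - twice_product)                # Sigma_1 - Sigma_2
    return theta, 0.5 * (plus + minus), 0.5 * (plus - minus)

def contour_scale(y, y0):
    """Constant c of the contour ellipse on which the frequency is y."""
    return math.sqrt(2.0 * math.log(y0 / y))

def cross_ratio(x1, x2, x1p, x2p):
    """Ratio of cross-products for the tetrad (x1, x1') by (x2, x2')."""
    return surface(x1, x2) * surface(x1p, x2p) / (surface(x1, x2p) * surface(x1p, x2))

if __name__ == "__main__":
    theta, big1, big2 = principal_axes()
    y0 = surface(0.0, 0.0)
    print("theta (degrees):", math.degrees(theta))
    print("Sigma_1, Sigma_2:", big1, big2)
    print("central ordinate:", y0)
    for y in (5, 10, 20):
        c = contour_scale(y, y0)
        print("y =", y, "c =", c, "semi-axes:", big1 * c, big2 * c)
    print("cross-ratio:", cross_ratio(-0.5, -1.0, 0.5, 1.0))
```

Because the sketch carries unrounded intermediate values, its semi-axes may differ from the figures quoted in § 11 by about a unit in the second decimal place. The cross-ratio exceeds unity whenever $x_1' - x_1$ and $x_2' - x_2$ have the same sign and $r_{12}$ is positive, which is the isotropy property of § 12.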

isotropic : Table I.). say with four rows and four columns the table below exhibits such a grouping. Chap. is obviously not strictly isotropic as it stands we have seen. however. the limits of rows and of columns having been so fixed as to include not less than 200 observations in each array. The frequencies in adjacent compartments. that it appears to be normal. — (condensed from Table III.: 330 THEOKY OF STATISTICS.. . for instance. of Chapter IX. IX. correlation-table for stature in father and son (Table III. and it should consequently be within such limits. Son's Stature (inches).). within the limits of fluctuations of sampling. We can apply a rough test by regrouping the table in a much coarser form. may be of service.

i.23 . genof every order is a normal distribution. . a great variety of ways.1).1... x^-^^i ^tc.. X2..13 .3 "^n.3.X . .i °^ with x^ and x^^. x^. (n-2)ntr„. x„ hj 2/12 . ... . or diverging so slightly from isotropy that an alteration of the frequencies. (n-1) . . allotting any primary subscript to the second deviation (except the subscript of the first). + ---+.. —'i^{n-Vln. Our assumption. oil 0^1. . .o"wi • (14) The expression (13) for the exponent <^ may be reduced to a general form corresponding to that given for two variables. .. a„ is normally distributed. — nO^ilS . ..12 .. . T . n .„_i. '^.12 .. 14.1 ^»)=3+3^+3^+ . %i> ^312' 6tc. eralising the deduction of § 6. (15) ^^i2..i (13) \ ' and Z' 12 •••» = /. „. . will render the distribution isotropic. then.. .. commencing with any deviation of the first order. if the uncorrelated deviations x^. . Further. O^in . Denoting the frequency of the combination of deviations Xj. isotropic. . Before concluding this chapter we may note briefly some of the principal properties of the normal distribution of frequency for any number of variables. if in (13) any fixed .— XVI. -2 "3.n 0^1. and so on. "1..lZ. . . — —NORMAL x — 331 CORRELATION. ••••'"-" .. are normally distributed amounts to the assumption that all deviations of any order and with any suffixes are normally distributed. . be completely independent (c/... we must have in the notation of Chapter XII.12 0-„. ---^ —— + . ...l (n-ll Several important results may be deduced directly from the form Clearly this might have been written in (13) for the exponent..e.or 4- or two other condensations : of the original table to 3- x 4-fold form he will probably find them either isotropic. . just as in § 3 we arrived at precisely the same final form for the exponent whether we started with the two deviations Kj and aig..23 . x^. . n — .in-ZI~. referring the student for proofs to the original memoirs. . xt ^. The student should form one 3. viz.1. that the deviations x-^. 
7/ yi2 n =1/' y 12 » - ^-J*(«ii^ "n) • (12) where 0(^1^2. x^ . well within the margin of possible fluctuations of sampling. that any linear function of x^. . in the general normal distribution for n variables every array It will also follow.. § 3 of the present chapter). +"r Ol "^l .

to 3:4.. : 0"l. whatever the particular The latter fixed valves assigned to the remaining deviations. . renders the meaning of partial correlation coefficients much more definite in the case of normal correlation than in the general case. .il 2. "Family Likeness In Stature. les : probabilit& des erreurs de Mimoires preseviis par divers Soc.13.n . strictly . it will be seen. . as we have Similarly. assigned to x-^. .n .^i represents merely the average correlation. the correlation coeflScient being r^^-^. «S + . Maomillan & Co..nZ . (1) t/ie correlation between any two deviations x^^ and x^^.3 in the general case ri2.S4. (2) the correlation between the said deviations is r^^j. Franois. In the general case r„.. . ix.24.W . '*2 +„ n3. («-l) . Thus in the case of three variables which are normally correlated. (15) for <^. °'l.. . p. n/<'"2. /tg. to increasing values of x^. . p.n.^. .2S • • • nn. Ifaiural Inheritance . say h^. "^S. increasing or decreasing as the Finally.. normal correlation. and then throw 1^ into the form of a perfect square (as in § 4 for the case of two variables). .. is normal correlation .. to all the deviations except x^.23. we obtain a normal distribution for x-^ in which the mean is displaced by values be assigned to correlation between and x-^ and x^. .^j is constant for all the subgroups corresponding to particular assigned values of the other variables. vol. Ray. xl. des Sciences savants. would probably exhibit some continuous change.i "^n. and so on. linear.. REFERENCES. so to speak. . in the expression case might be. if any fixed values be seen. (1) Bbavais.332 THEORY OF STATISTICS. 1889.i and x^^: in the normal case r„. Francis. II« s^rie.12. 1846. iTjjj all the following deviations. 1886. say. <''l. But this is a linear function of h^. between a..a3. n (n-1) ." Proo. if we assign any given value to x^. a.. on reducing ajg ^2 to the second order we shall 'find that the correlation between Xj. 
using k to denote any group of secondary suffixes.i ^^^ %i '^ normal correlation.„...2 " . etc.1231 s^nd all the following deviations." Acad. the on expanding x^.n «. the correlation between the associated values of x-^ and x^ is ri2.. we assign fixed values... 42. General. is. ..2s is ... " Analyse inathimatique sur situation d'un point.3. conclusion. (2) Galton. That is to say. n2. we have to note that if. if actually worked out for the various sub-groups corresponding.i3 n) ^tc. The expressions r-^^ n are of course the partial regressions . (8) Galton.. A3. .^ "^2.. . 255.. etc. therefore in the case of normal correlation the regression of any one variable on any or all of the others o"i.

G. 271. ii. Trans. Dulau & Co. Karl. "Empirical Studies in the Theory of Measurement. Series A. III. 1907. "On Lines and Planes of Closest Fit to Systems of Points in Space. cc. vol. 248. London. Trans. W. F. vol... 96.1906. Trans. p. (2). L. Ixxix. ) Dulau & Co. U. Various Methods and their Relation to Normal Correlation. "On the Calculation of the Double-integral expressing Normal Correlation. "On the Generalised Probable Error in Multiple Normal Correlation. Pearson. S (16) Pearson.." Proc Soy. Soc. G. p. Soc. Java: (17) . treated by a Applications to the Theory of Attributes. 101. Kakl. Soe. 1907. of which only the Percentage of Cases wherein exceeds (or falls short of) a given Intensity is recorded for each grade of A. "On the Correlation of Characters not Quantitatively Measurable. 1896. 1. Heredity. Series A. Proe. p. Mag. xix. Soy. Soc.." FMl.. Stat. p. Trans.. Boy.) ) ." Biometrika. 1908. (The suggestion of a " rank " method see Pearson's criticism and improved formula in (18) and Spearman's reply on some points in (20)." Drapers' Company Besearch Memoirs. Appendix to F. vol. vol. "On the Theory of Correlation. p. vol. 333 Dickson. of Psychology. J. p. Say. : : of Psychology. Mag. 1910. Soy.. (Based on the assumption of normal correlation." PM7. Karl. (10) (11) 1897. 23. 6th Series. Biometric Series I. vol. C. Pearson. 63. p. "Correlation calculated from Faulty Data. vol. 559. C. and Panmixia. Soc. 1901.ee. Yule.. p.. (Methods based on correlation of ranks difference methods. vol.. Trans." Phil.. 1909.. "On a New Method of Determining Correlation. cxcii. vol. Y. Kakl. London. 812. Pearson.'' 5r»<. Biometric Series IV. and Alice I. 1886. Series A. Thorndike. p.. "On Further Methods of Determining Correlation. "On the Influence of Natural Selection on the Variability and Correlation of Organs. " On the Application of the Thooi-y of Error to Cases of Normal Distribution and Normal Correlation. ii." 5r«7. . vol. 1904." Archives of Psychology (New York). Kael. 
"On the Theory of Correlation for any number of Variables New System of Notation. cxov. p. " On a New Method of Determining Correlation between a Measured Character A and a Character S. (6) Pearson. vi. Karl. vol. "Regression. D.. 253. 89. Series A. Correlated Averages.." Cwmbridge Phil. 1892. Jour. (4) (6) —NORMAL COERELATION. W.. clxxxvii.. Series A.) (20) Spearman. vii." Phil. etc. E. XVI. p... Soc. 1907.) 1902.. vol. (On the fitting of " principal axes" and the corresponding planes in the case of more (8) Edgewokth. 5th Series. vol. Soy. Yule. Soy. p. vol." Phil. "A Footrule for Measuring Correlation. " On the Theory of Contingency and its Relation to Association and Normal Correlation. 1898. p. 190." Biometrilca.. Pearson. 1910. 182." Biometrika. (21) iii. (19) Spearman. 3 of Chap. ." Jour. Ix. when one Variable is given by Alternative and the other by Multiple Categories. 1. p. {Cf. p. Hamilton. 59." PAiZ. (12) Sheppard. Karl. xl." (18) Drapers' Company Besearch MemA)irs. vii. (14) Pearson. (7) Pearson. (15) Pearson. 1900. Kabl. criticism in ref.. Soc. F. "On (9) than two variables. xxxiv. (13) Sheppard. Karl. 1900.. U.. vol. See also the memoir (12) by Sheppard..

. i. Hence show that if the pairs of observed values of a^ and Xj are represented by points on a plane. .e. Show that = = A B . is numerically greatest without regard to sign. at the medians. (A proof will be found in ref. and that the maximum value of the correlation is if . Show that these axes make an angle of 45° with the principal axes.) A fourfold table is formed from a normal correlation table. of the squares of the distances of the points from this line is a minimum the line is the major principal axis. ref 12. and with reference to other axes something. 10.-„.(-«-.).) 2.— 334 THEORY OF STATISTICS. and 5. EXERCISES. so that (^)=(o) (-») (3)=N/2. The coefficient of correlation with reference to the principal axes being zero. sum 4. (Slieppard. 3. there must be some pair of axes at right angles for which the correlation is a maximum. the 1. and a straight line drawn through the mean. Deduce equation (11) from the equations for transformation of co-ordinates without assuming the normal distribution. taking the points of division between and a.

CHAPTER XVII.

THE SIMPLER CASES OF SAMPLING FOR VARIABLES: PERCENTILES AND MEAN.

1-2. The problem of sampling for variables: the conditions assumed; 3. Standard error of a percentile; 4. Special values for the percentiles of a normal distribution; 5. Effect of the form of the distribution generally; 6. Simplified formula for the case of a grouped frequency-distribution; 7. Correlation between errors in two percentiles of the same distribution; 8. Standard error of the interquartile range for the normal curve; 9. Effect of removing the restrictions of simple sampling, and limitations of interpretation; 10. Standard error of the arithmetic mean; 11. Relative stability of mean and median in sampling; 12. Standard error of the difference between two means; 13. The tendency to normality of a distribution of means; 14. Effect of removing the restrictions of simple sampling; 15. Statement of the standard errors of standard-deviation, coefficient of variation, correlation coefficient and regression, correlation-ratio and criterion for linearity of regression; 16. Restatement of the limitations of interpretation if the sample be small.

1. In Chapters XIII.-XVI. we have been concerned solely with the theory of sampling for the case of attributes and the frequency-distributions appropriate to that case. We now proceed to consider some of the simpler theorems for the case of variables (cf. Chap. XIII. § 2). Suppose that we have a bag containing a practically infinite number of tickets or cards bearing the recorded values of some variable X, and that we draw a ticket from this bag, note the value that it bears, draw another, and so on until we have drawn n cards (a number small compared with the whole number in the bag). Let us continue this process until we have N such samples of n cards each, and then work out the mean, standard-deviation, median, etc., for each of the samples. No one of these measures will prove to be absolutely the same for every sample, and our problem is to determine the standard-deviation that each such measure will exhibit.

2. In solving this problem, we must be careful to define precisely the conditions which are assumed to subsist, so as to realise the limitations of any solution obtained. These conditions

were discussed very fully for the case of attributes (Chap. XIII. § 8), and we would refer the student to the discussion then given. Here it is sufficient to state the assumptions briefly, using the letters (a), (b) and (c) to denote the corresponding assumptions indicated by the same letters in the section cited.

(a) We assume that we are drawing from precisely the same record throughout the experiment, so that the chance of drawing a card with any given value of $X$, or a value within any assigned limits, is the same at each sampling.

(b) We assume not only that we are drawing from the same record throughout, but that each of our cards at each drawing may be regarded quite strictly as drawn from the same record (or from identically similar records): e.g. if our card-record is contained in a series of bundles, we must not make it a practice to take the first card from bundle number 1, the second card from bundle number 2, and so on, or else the chance of drawing a card with a given value of $X$, or a value within assigned limits, may not be the same for each individual card at each drawing.

(c) We assume that the drawing of each card is entirely independent of that of every other, so that the value of $X$ recorded on card 1, at each drawing, is uncorrelated with the value of $X$ recorded on card 2, 3, 4, and so on. It is for this reason that we spoke of the record, in § 1, as containing a practically infinite number of cards, for otherwise the successive drawings at each sampling would not be independent: if the bag contain ten tickets only, bearing the numbers 1 to 10, and we draw the card bearing 1, the average of the following cards drawn will be higher than the mean of all cards drawn; if, on the other hand, we draw the 10, the average of the following cards will be lower than the mean of all cards, i.e. there will be a negative correlation between the number on the card taken at any one drawing and the card taken at any other drawing. Without making the number of cards in the bag indefinitely large, we can, as already pointed out for the case of attributes (Chap. XIII. § 3), eliminate this correlation by replacing each card before drawing the next.

Sampling conducted under these conditions we shall, as before, speak of as simple sampling. We do not, it should be noticed, make the further assumption that the sample is unbiassed, i.e. that the chance of inclusion in the sample is independent of the value of $X$ recorded on the card (cf. the last paragraph in § 8, Chap. XIII., and the discussion in §§ 4-8, Chap. XIV.). This assumption is unnecessary. If it be true, the interpretation of our results becomes simpler and more straightforward, for we can substitute for such phrases as "the standard-deviation of $X$ in a very large sample," "the form of the frequency-distribution
in a very large sample," the phrases "the standard-deviation of $X$ in the original record," "the form of the frequency-distribution in the original record": but in very many, perhaps the majority of, practical cases the very question at issue is the nature of the relation between the distribution of the sample and the distribution of the record from which it is drawn. As has already been emphasised in the passages to which reference is made above, no examination of samples drawn under the same conditions can give any evidence on this head.

3. Standard Error of a Percentile. Let us consider first the fluctuations of sampling for a given percentile, as the problem is intimately related to that of Chaps. XIII.-XIV.

Let $X_p$ be a value of $X$ such that $pN$ of the values of $X$ in an indefinitely large sample drawn under the same conditions lie above it and $qN$ below it. If we note the proportions of observations above $X_p$ in samples of $n$ drawn from the record, we know that these observed values will tend to centre round $p$ as mean, with a standard-deviation $\sqrt{pq/n}$. If now at each drawing, as well as observing the proportion of $X$'s above $X_p$, say $p + \delta$, for the sample, we also proceed to note the adjustment $\epsilon$ required in $X_p$ to make the proportion of observations above $X_p + \epsilon$ in the sample $p$, the standard-deviation of $\epsilon$ will bear to the standard-deviation of $\delta$ the same ratio that $\epsilon$ on an average bears to $\delta$. But this ratio is quite simply determinable if the number of observations in the sample is sufficiently large to justify us in assuming that $\delta$ is small, so small that we may regard the element of the frequency curve (for a very large sample) over which $X_p + \epsilon$ ranges as approximately a rectangle. If this assumption be made, and we denote the standard-deviation of $X$ in a very large sample by $\sigma$, and the ordinate of the frequency curve at $X_p$ when drawn with unit area and unit standard-deviation by $y_p$,

$$\epsilon = \frac{\sigma}{y_p}\,\delta.$$

Therefore for the standard-deviation of $\epsilon$, or of the percentile corresponding to a proportion $p$, we have

$$\sigma_{X_p} = \frac{\sigma}{y_p}\sqrt{\frac{pq}{n}} \qquad (1)$$

4. If the frequency-distribution for the very large sample be a normal curve, the values of $y_p$ for the principal percentiles may be taken from the published tables. A table calculated by Mr Sheppard (Table III., p. 9, in Tables for Statisticians and Biometricians, ref. 16, in Appendix I.) gives the values directly, or the student can estimate the values roughly by a combined use of the area and ordinate tables for the normal curve given in Chapter XV., remembering to divide the ordinates given in that table by $\sqrt{2\pi}$ so as to make the area unity.

                        Value of y_p
    Median              0.3989423
    Deciles 4 and 6     0.3863425
    Deciles 3 and 7     0.3476926
    Deciles 2 and 8     0.2799619
    Deciles 1 and 9     0.1754983
    Quartiles           0.3177766

Inserting these values of $y_p$ in equation (1), we have the following values for the standard errors of the median, deciles, etc., and the values given in the second column for their probable errors (Chap. XV. § 17), which the student may sometimes find useful:

                        Standard error is      Probable error is
                        sigma/sqrt(n)          sigma/sqrt(n)
                        multiplied by          multiplied by
    Median              1.25331                0.84535
    Deciles 4 and 6     1.26804                0.85528
    Deciles 3 and 7     1.31800                0.88897
    Deciles 2 and 8     1.42877                0.96369
    Deciles 1 and 9     1.70942                1.15298
    Quartiles           1.36263                0.91908

It will be seen that the influence of fluctuations of sampling on the several percentiles increases as we depart from the median: the standard error of the quartiles is nearly one-tenth greater than that of the median, and the standard error of the first or ninth deciles more than one-third greater.

5. Consider further the influence of the form of the frequency-distribution on the standard error of the median, as this is an important form of average. For a distribution with a given number of observations and a given standard-deviation the standard error varies inversely as $y_p$. Hence for a distribution in which $y_p$ is small, for example a U-shaped distribution like that of fig. 18 or fig. 19, the standard error of the median will be relatively high, and it will, in so far, be an undesirable form of average to employ. On the other hand, in the case of a distribution which has a high peak in the centre, so as to exhibit a value of $y_p$ large compared with the standard-deviation, the standard error of the median will be relatively low.
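The multipliers tabulated in § 4 follow directly from equation (1), and the student may find it instructive to recompute them. The following is a minimal sketch only, assuming a normal distribution and using `statistics.NormalDist` from the Python standard library in place of Sheppard's tables; the function name `percentile_se_multiplier` is a label chosen here for illustration and is not from the text.

```python
from math import sqrt
from statistics import NormalDist


def percentile_se_multiplier(p: float) -> float:
    """Factor k such that the standard error of the percentile is
    k * sigma / sqrt(n) for a normal distribution, as in equation (1)."""
    unit_normal = NormalDist()            # unit area, unit standard-deviation
    x_p = unit_normal.inv_cdf(1.0 - p)    # value with a proportion p above it
    y_p = unit_normal.pdf(x_p)            # ordinate of the curve at that value
    return sqrt(p * (1.0 - p)) / y_p


# The probable error is the standard error multiplied by about 0.6745.
PROBABLE_ERROR_FACTOR = NormalDist().inv_cdf(0.75)

rows = [("Median", 0.5), ("Deciles 4 and 6", 0.4), ("Deciles 3 and 7", 0.3),
        ("Deciles 2 and 8", 0.2), ("Deciles 1 and 9", 0.1), ("Quartiles", 0.25)]
for label, p in rows:
    k = percentile_se_multiplier(p)
    print(f"{label:16s} {k:.5f} {k * PROBABLE_ERROR_FACTOR:.5f}")
```

Since the normal curve is symmetrical, it is immaterial whether $p$ is reckoned as the proportion above or below the percentile.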

We can create such a "peaked" distribution by superposing a normal curve with a small standard-deviation on a normal curve with the same mean and a relatively large standard-deviation. To give some idea of the reduction in the standard error of the median that may be effected by a moderate change in the form of the distribution, let us find for what ratio of the standard-deviations of two such curves, having the same area, the standard error of the median reduces to $\sigma/\sqrt{n}$, where $\sigma$ is of course the standard-deviation of the compound distribution.

Let $\sigma_1$, $\sigma_2$ be the standard-deviations of the two distributions, and let there be $n/2$ observations in each. Then

$$\sigma^2 = \tfrac{1}{2}\left(\sigma_1^2 + \sigma_2^2\right) \qquad (a)$$

On the other hand, the value of $y_p$ is

$$y_p = \frac{\sigma}{2\sqrt{2\pi}}\left(\frac{1}{\sigma_1} + \frac{1}{\sigma_2}\right) = \frac{\sigma(\sigma_1 + \sigma_2)}{2\sqrt{2\pi}\,\sigma_1\sigma_2} \qquad (b)$$

Hence the standard error of the median is

$$\frac{\sqrt{2\pi}\,\sigma_1\sigma_2}{(\sigma_1 + \sigma_2)\sqrt{n}} \qquad (c)$$

(c) is equal to $\sigma/\sqrt{n}$ if

$$(\sigma_1 + \sigma_2)\sqrt{\sigma_1^2 + \sigma_2^2} = 2\sqrt{\pi}\,\sigma_1\sigma_2,$$

that is, writing $\sigma_2/\sigma_1 = \rho$, if

$$(1 + \rho)\sqrt{1 + \rho^2} = 2\sqrt{\pi}\,\rho,$$

or

$$\rho^4 + 2\rho^3 + (2 - 4\pi)\rho^2 + 2\rho + 1 = 0.$$

This equation may be reduced to a quadratic and solved by taking $\rho + \dfrac{1}{\rho}$ as a new variable. The roots found give $\rho = 2.2360\ldots$, or $0.4472\ldots$, the one root being merely the reciprocal of the other. The standard error of the median will therefore be $\sigma/\sqrt{n}$, in such a compound distribution, if the standard-deviation of the one normal curve is, in round numbers, about 2¼ times that of the other. If the ratio be greater, the standard error of the median will be less than $\sigma/\sqrt{n}$.
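The reduction of the quartic to a quadratic can be followed numerically. The sketch below is an illustration, not part of the original working: it puts $u = \rho + 1/\rho$, so that dividing the quartic by $\rho^2$ gives $u^2 + 2u - 4\pi = 0$, and then checks the result against equations (a) and (c). The helper `median_se_ratio` is a name introduced here, and it takes $\sigma_1 = 1$, $\sigma_2 = \rho$ for convenience.

```python
from math import pi, sqrt

# With u = rho + 1/rho, the quartic divided by rho**2 is u**2 + 2*u - 4*pi = 0.
u = -1.0 + sqrt(1.0 + 4.0 * pi)              # positive root of the quadratic in u

# rho then satisfies rho**2 - u*rho + 1 = 0, whose roots are reciprocals.
rho_large = (u + sqrt(u * u - 4.0)) / 2.0
rho_small = (u - sqrt(u * u - 4.0)) / 2.0


def median_se_ratio(rho: float) -> float:
    """Standard error of the median divided by sigma/sqrt(n) for the
    compound curve, taking sigma_1 = 1 and sigma_2 = rho."""
    sigma = sqrt((1.0 + rho * rho) / 2.0)               # equation (a)
    se_times_sqrt_n = sqrt(2.0 * pi) * rho / (1.0 + rho)  # equation (c)
    return se_times_sqrt_n / sigma


print(rho_large, rho_small)           # the two reciprocal roots
print(median_se_ratio(rho_large))     # equals 1 at the critical ratio
print(median_se_ratio(3.0))           # below 1 when the ratio is greater
```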

The distribution for which the standard error of the median is exactly equal to $\sigma/\sqrt{n}$ is shown in fig. 53: it will be seen that it is by no means a very striking form of distribution; at a hasty glance it might almost be taken as normal. In the case of distributions of a form more or less similar to that shown, it is evident that we cannot at all safely estimate by eye alone the relative standard error of the median as compared with $\sigma/\sqrt{n}$.

[Fig. 53.]

6. In the case of a grouped frequency-distribution, if the number of observations is sufficient to make the class-frequencies run fairly smoothly, i.e. to enable us to regard the distribution as nearly that of a very large sample, the standard error of any percentile can be calculated very readily indeed, for we can eliminate $\sigma$ from equation (1). Let $f_p$ be the frequency-per-class-interval at the given percentile: simple interpolation will give us the value with quite sufficient accuracy for practical purposes, and if the figures run irregularly they may be smoothed. Let $\sigma$ be the value of the standard-deviation expressed in class-intervals, and let $n$ be the number of observations as before. Then since $y_p$ is the ordinate of the frequency-distribution when drawn with unit standard-deviation and unit area, we must have

$$y_p = \frac{\sigma}{n}\,f_p.$$

But this gives at once for the standard error expressed in terms of the class-interval as unit

$$\sigma_{X_p} = \frac{\sqrt{npq}}{f_p} \qquad (2)$$

As an example in which we can compare the results given by the two different formulae (1) and (2), take the distribution of stature used as an illustration in Chaps. VII. and VIII., and in §§ 13, 14 of Chap. XV. The number of observations is 8585, and the standard-deviation 2.57 in., the distribution being approximately normal: $\sigma/\sqrt{n} = 0.027737$, and, multiplying by the factor 1.253 given in the table in § 4, this gives 0.0348 as the standard error of the median, on the assumption of normality of the distribution. Using the direct method of equation (2), we find the median to be 67.47 (Chap. VII. § 15), which is very nearly at the centre of the interval with a frequency 1329. Taking this as being, with sufficient accuracy for our present purpose, the frequency per interval at the median, the standard error is

$$\frac{\sqrt{8585}}{2 \times 1329} = 0.0349.$$

As we should expect, the value is practically the same as that obtained from the value of the standard-deviation on the assumption of normality.

Let us find the standard error of the first and ninth deciles as another illustration. On the assumption that the distribution is normal, these standard errors are the same, and equal to $0.027737 \times 1.70942 = 0.0474$. Using the direct method, we find by simple interpolation the approximate frequencies per interval at the first and ninth deciles respectively to be 590 and 570, giving standard errors of 0.0471 and 0.0488, mean 0.0479, slightly in excess of that found on the assumption that the frequency is given by the normal curve.

The student should notice that the class-interval is, in this case, identical with the unit of measurement, and consequently the answer given by equation (2) does not require to be multiplied by the magnitude of the interval. In the case of the distribution of pauperism (Chap. VII., Example i.), the fact that the class-interval is not a unit must be remembered. The frequency at the median (3.195 per cent.) is approximately 96, and this gives for the standard error of the median by (2) (the number of observations being 632) 0.1309 intervals, that is 0.0655 per cent.
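The arithmetic of equation (2) in the examples just given can be retraced as follows. This is a sketch for checking the figures only; `percentile_se_grouped` and its `interval_width` argument are names introduced here. The interval of 0.5 per cent. used for the pauperism distribution is an inference from the statement that 0.1309 intervals amount to 0.0655 per cent., and is not stated in so many words above.

```python
from math import sqrt


def percentile_se_grouped(n, p, freq_per_interval, interval_width=1.0):
    """Standard error of a percentile by equation (2): sqrt(n*p*q) / f_p,
    converted from class-intervals to the unit of measurement."""
    q = 1.0 - p
    return sqrt(n * p * q) / freq_per_interval * interval_width


# Stature: 8585 observations, class-interval equal to the unit (1 inch).
se_median_stature = percentile_se_grouped(8585, 0.5, 1329)
se_first_decile = percentile_se_grouped(8585, 0.1, 590)
se_ninth_decile = percentile_se_grouped(8585, 0.9, 570)

# Pauperism: 632 observations, frequency 96 per interval at the median;
# an interval of 0.5 per cent. is assumed here (see the note above).
se_median_pauperism = percentile_se_grouped(632, 0.5, 96, interval_width=0.5)

print(round(se_median_stature, 4))
print(round(se_first_decile, 4), round(se_ninth_decile, 4))
print(round(se_median_pauperism, 4))
```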

7. In finding the standard error of the difference between two percentiles in the same distribution, the student must be careful to note that the errors in two such percentiles are not independent. Consider the two percentiles, for which the values of $p$ and $q$ are $p_1$, $q_1$, and $p_2$, $q_2$ respectively, the first-named being the lower of the two percentiles. These two percentiles divide the whole area of the frequency curve into three parts, the areas of which are proportional to $q_1$, $1 - q_1 - p_2$, and $p_2$. Further, since the errors in the first percentile are directly proportional to the errors in $q_1$, and the errors in the second percentile are directly proportional but of opposite sign to the errors in $p_2$, the correlation between errors in the two percentiles will be the same as the correlation between errors in $q_1$ and $p_2$ but of opposite sign. But if there be a deficiency of observations below the lower percentile, producing an error $\delta_1$ in $q_1$, the missing observations will tend to be spread over the two other sections of the curve in proportion to their respective areas, and will therefore tend to produce an error

$$\delta_2 = -\frac{p_2}{p_1}\,\delta_1$$

in $p_2$. If then $r$ be the correlation between errors in $q_1$ and $p_2$, and $\epsilon_1$ and $\epsilon_2$ their respective standard errors, we have

$$r\,\frac{\epsilon_2}{\epsilon_1} = -\frac{p_2}{p_1}.$$

Or, inserting the values of the standard errors,

$$r = -\sqrt{\frac{q_1 p_2}{p_1 q_2}}.$$

The correlation between the percentiles is the same in magnitude but opposite in sign: it is obviously positive, as we should expect, and consequently

$$\text{correlation between errors in two percentiles} = +\sqrt{\frac{q_1 p_2}{p_1 q_2}} \qquad (3)$$

If the two percentiles approach very close together, $q_1$ and $q_2$, $p_1$ and $p_2$ become sensibly equal to one another, and the correlation becomes unity.

8. Let us apply the above value of the correlation between percentiles to find the standard error of the semi-interquartile range for the normal curve. Inserting $q_1 = p_2 = \tfrac{1}{4}$, $q_2 = p_1 = \tfrac{3}{4}$, we find $r = \tfrac{1}{3}$. Hence the standard error of the interquartile range is, applying the ordinary formula for the standard-deviation of a difference, $2/\sqrt{3}$ times the standard error of either quartile, or the standard error of the semi-interquartile range $1/\sqrt{3}$ times the standard error of a quartile.
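The factors $2/\sqrt{3}$ and $1/\sqrt{3}$ may be checked from equation (3) and the usual formula for the standard-deviation of a difference of two correlated variables. The sketch below assumes, as for the normal curve, that the two quartiles have the same standard error; the function name is a label introduced here, and the quartile multiplier 1.36263 is the one tabulated in § 4.

```python
from math import sqrt


def percentile_error_correlation(p_lower: float, p_upper: float) -> float:
    """Equation (3). Each p is the proportion of the curve lying above the
    percentile; the lower percentile (larger p) is given first."""
    q_lower, q_upper = 1.0 - p_lower, 1.0 - p_upper
    return sqrt(q_lower * p_upper / (p_lower * q_upper))


# Lower quartile: p = 3/4 above it. Upper quartile: p = 1/4 above it.
r = percentile_error_correlation(0.75, 0.25)

# Variance of a difference of two quantities with equal standard error s:
# s**2 + s**2 - 2*r*s**2, so the ratio to s is sqrt(2 - 2*r).
interquartile_factor = sqrt(2.0 - 2.0 * r)
semi_interquartile_factor = interquartile_factor / 2.0

QUARTILE_MULTIPLIER = 1.36263   # from the table in section 4
print(r, interquartile_factor, semi_interquartile_factor)
print(semi_interquartile_factor * QUARTILE_MULTIPLIER)  # in units of sigma/sqrt(n)
```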

Taking the value of the standard error of a quartile from the table in § 4, we have, finally,

$$\text{standard error of the semi-interquartile range in a normal distribution} = 0.78672\,\frac{\sigma}{\sqrt{n}} \qquad (4)$$

Of course the standard-deviation of the interquartile, or semi-interquartile, range can readily be worked out in any particular case, using equation (2) and the value of the correlation given above: it is best to work out such standard errors from first principles, applying the usual formula for the standard-deviation of the difference of two correlated variables (Chap. XI. § 2, equation (1)).

9. If there is any failure of the conditions of simple sampling, the formulae of the preceding sections cease, of course, to hold good. We need not, however, enter again into a discussion of the effect of removing the several restrictions, for the effect on the standard error of $p$ was considered in detail in §§ 9-14 of Chap. XIV., and the standard error of any percentile is directly proportional to the standard error of $p$ (cf. § 3).

Further, the student may be reminded that the standard error of any percentile measures solely the fluctuations that may be expected in that percentile owing to the errors of simple sampling alone: it has no bearing, therefore, save on the one question, whether an observed divergence of the percentile, from a certain value that might be expected to be yielded by a more extended series of observations or that had actually been observed in some other series, might or might not be due to fluctuations of simple sampling alone. It cannot and does not give any indication of the possibility of the sample being biassed or unrepresentative of the material from which it has been drawn, nor can it give any indication of the magnitude or influence of definite errors of observation, errors which may conceivably be of greater importance than errors of sampling. In the case of the distribution of statures, for instance, the standard error almost certainly gives quite a misleading idea as to the accuracy attained in determining the average stature for the United Kingdom: the sample is not representative, the several parts of the kingdom not contributing in their true proportions. The student should refer again to the discussion of these points in §§ 4-8 of Chap. XIV.

Finally, we may note that the standard error of a percentile cannot be evaluated unless the number of observations is fairly large: large enough to determine $f_p$ (eqn. 2) with reasonable accuracy, or
to test whether we may treat the distribution as approximately normal (cf. also § 16 below).

(As regards the theory of sampling for the median and percentiles generally, cf. Laplace, Supplement II. (standard error of the median), ref. 15; Edgeworth, refs. 5, 6, 7; Sheppard, ref. 27: the preceding sections have been based on the work of Edgeworth and Sheppard.)

10. Standard Error of the Arithmetic Mean. Let us now pass to a fresh problem, and determine the standard error of the arithmetic mean.

This is very readily obtained. Suppose we note separately at each drawing the value recorded on the first, second, third ... and $n$th card of our sample. The standard-deviation of the values on each separate card will tend in the long run to be the same, and identical with the standard-deviation $\sigma$ of $X$ in an indefinitely large sample, drawn under the same conditions. Further, the value recorded on each card is (as we assume) uncorrelated with that on every other. The standard-deviation of the sum of the values recorded on the $n$ cards is therefore $\sqrt{n}\,\sigma$, and the standard-deviation of the mean of the sample is consequently $1/n$th of this, or

$$\sigma_m = \frac{\sigma}{\sqrt{n}} \qquad (5)$$

This is a most important and frequently cited formula, and the student should note that it has been obtained without any reference to the size of the sample or to the form of the frequency-distribution. It is therefore of perfectly general application, if $\sigma$ be known. We can verify it against our formula for the standard-deviation of sampling in the case of attributes. The standard-deviation of the number of successes in a sample of $m$ observations is $\sqrt{m\,pq}$: the standard-deviation of the total number of successes in $n$ samples of $m$ observations each is therefore $\sqrt{nm\,pq}$: dividing by $n$ we have the standard-deviation of the mean number of successes in the $n$ samples, viz., $\sqrt{m\,pq}/\sqrt{n}$, agreeing with equation (5).

11. For a normal curve the standard error of the mean is to the standard error of the median approximately as 100 to 125 (cf. § 4), and in general the standard errors of the two stand in a somewhat similar ratio for a distribution not differing largely from the normal form. For the distribution of statures used as an illustration in § 6 the standard error of the median was found to be 0.0349: the standard error of the mean is only 0.0277. The distribution being very approximately normal, the ratio of the two standard errors, viz., 1.26, assumes almost exactly the theoretical magnitude.
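Equation (5), and the ratio of roughly 100 to 125 between the standard errors of mean and median for a normal curve, can be illustrated by carrying out the sampling experiment of § 1 on a machine. What follows is a simulation sketch under stated assumptions, not a proof: the record is taken to be normal with $\sigma = 1$, each sample has 101 cards, 4000 samples are drawn, and the seed is arbitrary. All of these are choices made here for illustration.

```python
import random
from math import sqrt
from statistics import fmean, median, pstdev


def simulate(n: int, samples: int, sigma: float = 1.0, seed: int = 1922):
    """Draw `samples` samples of n values from a normal record and return the
    standard-deviations of the sample means and of the sample medians."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(samples):
        draw = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(fmean(draw))
        medians.append(median(draw))
    return pstdev(means), pstdev(medians)


sd_mean, sd_median = simulate(n=101, samples=4000)
print(sd_mean, 1.0 / sqrt(101))    # observed value against sigma / sqrt(n)
print(sd_median / sd_mean)         # should lie in the neighbourhood of 1.25
```

The observed figures fluctuate from one seed to another, as any sampling experiment must; only their approximate agreement with the theoretical values is to be expected.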

In the case of the asymmetrical distribution of rates of pauperism, also used as an illustration in § 6, the standard error of the median was found to be 0.0655 per cent. The standard error of the mean is only 0.0493 per cent., which bears to the standard error of the median a ratio of 1 to 1.33.

As such cases as these seem on the whole to be the more common and typical, we stated in Chap. VII. § 18 that the mean is in general less affected than the median by errors of sampling. At the same time we also indicated the exceptional cases in which the median might be the more stable: cases in which the mean might, for example, be affected considerably by small groups of widely outlying observations, or in which the frequency-distribution assumed a form resembling fig. 53, but even more exaggerated as regards the height of the central "peak" and the relative length of the "tails." Such distributions are not uncommon in some economic statistics, and they might be expected to characterise some forms of experimental error. If, in these cases, the greater stability of the median is sufficiently marked to outweigh its disadvantages in other respects, the median may be the better form of average to use. Fig. 53 represents a distribution in which the standard errors of the mean and of the median are the same. Further, in some experimental cases it is conceivable that the median may be less affected by definite experimental errors, the average of which does not tend to be zero, than is the mean: this is, of course, a point quite distinct from that of errors of sampling.

12. If two quite independent samples of $n_1$ and $n_2$ observations respectively be drawn from a record, evidently $\epsilon_{12}$, the standard error of the difference of their means, is given by

$$\epsilon_{12}^2 = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right) \qquad (6)$$

If an observed difference exceed three times the value of $\epsilon_{12}$ given by this formula it can hardly be ascribed to fluctuations of sampling. If, in a practical case, the value of $\sigma$ is not known a priori, we must substitute an observed value, and it would seem natural to take as this value the standard-deviation in the two samples thrown together. If, however, the standard-deviations of the two samples themselves differ more than can be accounted for on the basis of fluctuations of sampling alone (see below, § 15), we evidently cannot assume that both samples have been drawn from the same record: the one sample must have been drawn from a record or a universe exhibiting a greater standard-deviation than the other.
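The use of equation (6), with $\sigma$ replaced by the standard-deviation of the two samples thrown together, may be sketched as follows. The function names and the small sets of figures are hypothetical and serve only to show the arithmetic; the rule of three times the standard error is the one stated above.

```python
from math import sqrt


def pooled_sd(sample_a, sample_b) -> float:
    """Standard-deviation of the two samples thrown together."""
    both = list(sample_a) + list(sample_b)
    centre = sum(both) / len(both)
    return sqrt(sum((x - centre) ** 2 for x in both) / len(both))


def se_difference_of_means(sigma: float, n1: int, n2: int) -> float:
    """Equation (6): standard error of the difference of two means."""
    return sigma * sqrt(1.0 / n1 + 1.0 / n2)


def exceeds_sampling_fluctuation(sample_a, sample_b) -> bool:
    """True if the observed difference of means exceeds three times the
    standard error given by equation (6)."""
    se = se_difference_of_means(pooled_sd(sample_a, sample_b),
                                len(sample_a), len(sample_b))
    diff = abs(sum(sample_a) / len(sample_a) - sum(sample_b) / len(sample_b))
    return diff > 3.0 * se


# Hypothetical figures: the second sample is the first shifted by 5 units.
first = [1, 2, 3, 4, 5] * 20
second = [x + 5 for x in first]
print(exceeds_sampling_fluctuation(first, second))
print(exceeds_sampling_fluctuation(first, first))
```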

If two samples be drawn quite independently from different universes, indefinitely large samples from which exhibit the standard-deviations $\sigma_1$ and $\sigma_2$, the standard error of the difference of their means will be given by

$$\epsilon_{12}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \qquad (7)$$

This is, indeed, the formula usually employed for testing the significance of the difference between two means in any case: seeing that the standard error of the mean depends on the standard-deviation only, and not on the mean, of the distribution, we can inquire whether the two universes from which samples have been drawn differ in mean apart from any difference in dispersion.

If two quite independent samples be drawn from the same universe, but instead of comparing the mean of the one with the mean of the other we compare the mean $M_1$ of the first with the mean $M_0$ of both samples together, the use of (6) or (7) is not justified, for errors in the mean of the one sample are correlated with errors in the mean of the two together. Following precisely the lines of the similar problem in § 13, Chap. XIII., case III., we find that this correlation is $\sqrt{n_1/(n_1 + n_2)}$, and hence

$$\epsilon_{01}^2 = \sigma^2\,\frac{n_2}{n_1(n_1 + n_2)} \qquad (8)$$

(For a complete treatment of this problem in the case of samples drawn from two different universes cf. ref. 22.)

13. The distribution of means of samples drawn under the conditions of simple sampling will always be more symmetrical than the distribution of the original record, and the symmetry will be the greater the greater the number of observations in the sample. Further, the distribution of means (and therefore also of the differences between means) tends to become not merely symmetrical but normal. We can only illustrate, not prove, the point here; but if the student will refer to § 13, Chap. XV., he will see that the genesis of the normal curve in this case is in accordance with what we then stated, viz. that the distribution tends to be normal whenever the variable may be regarded as the sum (or some slightly more complex function) of a number of other variables. In the present instance this condition is strictly fulfilled. The mean of the sample of $n$ observations is the sum of the values in the sample each divided by $n$, and we should expect the distribution to be the more nearly normal the larger $n$. As an illustration of the approach to symmetry even for small values of
$n$, we may take the following case. If the student will turn to the calculated binomials, given as illustrations of the forms of binomial distributions in Chap. XV. § 3, he will find there the distribution of the number of successes for twenty events when $q = 0.9$, $p = 0.1$: the distribution is extremely skew, starting at zero, rising to high frequencies for 1 and 2 successes, and thence tailing off to 20 cases of 7 successes in 10,000 throws, 4 cases of 8 successes and 1 case of 9 successes. But now find the distribution for the mean number of successes in groups of five throws, under the same conditions. This will be equivalent to finding the distribution of the number of successes for 100 such events, and then dividing the observed number of successes by five, the last process making no difference to the form of the distribution, but only to its scale. But the distribution of the number of successes for 100 events when $q = 0.9$, $p = 0.1$ is also given in Chap. XV. § 3, and it will be seen that, while it is appreciably asymmetrical, the divergence from symmetry is comparatively small: the distribution has gained very greatly in symmetry though only five observations have been taken to the sample.

We may therefore reasonably assume, if our sample is large, that the distribution of means is approximately a normal distribution, and we may calculate, on that assumption, the frequency with which any given deviation from a theoretical value or a value observed in some other series, in an observed mean, will arise from fluctuations of simple sampling alone.

The warning is necessary, however, that the approach to normality is only rapid if the condition that the several drawings for each sample shall be independent is strictly fulfilled. If the observations are not independent, but are to some extent positively correlated with each other, even a fairly large sample may continue to reflect any asymmetry existing in the original distribution (cf. ref. 32 and the record of sampling there cited).

If the original distribution be normal, the distribution of means, even of small samples, is strictly normal. This follows at once from the fact that any linear function of normally distributed variables is itself normally distributed (Chap. XVI. § 6). The distribution will not in general, however, be normal if the deviation of the mean of each sample is expressed in terms of the standard-deviation of that sample (cf. ref. 30).

14. Let us consider briefly the effect on the standard error of the mean if the conditions of simple sampling as laid down in § 2 cease to apply.

(a) If we do not draw from the same record all the time, but first draw a series of samples from one record, then another series from another record with a somewhat different mean and
standard-deviation, and so on, or if we draw the successive samples from essentially different parts of the same record, the standard error of the mean will be greatly increased.

For suppose we draw $k_1$ samples from the first record, for which the standard-deviation (in an indefinitely large sample) is $\sigma_1$, and the mean differs by $d_1$ from the mean of all the records together (as ascertained by large samples in numbers proportionate to those now taken); $k_2$ samples from the second record, for which the standard-deviation is $\sigma_2$, and the mean differs by $d_2$ from the mean of all the records together, and so on. Then for the samples drawn from the first record the standard error of the mean will be $\sigma_1/\sqrt{n}$, but the distribution will centre round a value differing by $d_1$ from the mean for all the records together; and so on for the samples drawn from the other records. Hence, if $\sigma_m$ be the standard error of the mean, and $N$ the total number of samples,

$$N\sigma_m^2 = \frac{\Sigma(k\sigma^2)}{n} + \Sigma(kd^2).$$

But the standard-deviation $\sigma_0$ for all the records together is given by

$$N\sigma_0^2 = \Sigma(k\sigma^2) + \Sigma(kd^2).$$

Hence, writing $\Sigma(kd^2) = N s_m^2$,

$$\sigma_m^2 = \frac{\sigma_0^2}{n} + \frac{n-1}{n}\,s_m^2 \qquad (9)$$

This equation corresponds precisely to equation (2) of § 9, Chap. XIV. The standard error of the mean, if our samples are drawn from different records or from essentially different parts of the entire record, may be increased indefinitely as compared with the value it would have in the case of simple sampling. If, for example, we take the statures of samples of $n$ men in a number of different districts of England, and the standard-deviation of all the statures observed is $\sigma_0$, the standard-deviation of the means for the different districts will not be $\sigma_0/\sqrt{n}$, but will have some greater value, dependent on the real variation in mean stature from district to district.

(b) If we are drawing from the same record throughout, but always draw the first card from one part of that record, the second card from another part, and so on, and these parts differ more or less, the standard error of the mean will be decreased. For if, in large samples drawn from the subsidiary parts of the record from which the several cards are taken, the standard-deviations are $\sigma_1, \sigma_2, \ldots, \sigma_n$, and the means differ by $d_1, d_2, \ldots, d_n$ from the mean for a large sample from the entire record, we have

$$\sigma_m^2 = \frac{\Sigma(\sigma^2)}{n^2}, \qquad \sigma_0^2 = \frac{\Sigma(\sigma^2)}{n} + \frac{\Sigma(d^2)}{n} = \frac{\Sigma(\sigma^2)}{n} + s_m^2.$$

Hence

$$\sigma_m^2 = \frac{\sigma_0^2}{n} - \frac{s_m^2}{n} \qquad (10)$$
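Equations (9) and (10) can be checked against the sums from which they were derived. The sketch below uses small hypothetical sets of values for $k$, $\sigma$ and $d$; the figures and function names are illustrative only. In case (a) the deviations are chosen so that $\Sigma(kd) = 0$, as they must be when reckoned from the mean of all the records together.

```python
def case_a_direct(k, sigma, d, n):
    """Case (a): variance of the mean, sigma_0**2 and s_m**2 from the sums."""
    total = sum(k)                                      # N, the number of samples
    sum_k_var = sum(ki * si ** 2 for ki, si in zip(k, sigma))
    sum_k_dev = sum(ki * di ** 2 for ki, di in zip(k, d))
    var_mean = (sum_k_var / n + sum_k_dev) / total
    var_all = (sum_k_var + sum_k_dev) / total
    return var_mean, var_all, sum_k_dev / total


def case_b_direct(sigma, d):
    """Case (b): one card from each of n parts of the record."""
    n = len(sigma)
    sum_var = sum(s ** 2 for s in sigma)
    s_m_sq = sum(x ** 2 for x in d) / n
    return sum_var / n ** 2, sum_var / n + s_m_sq, s_m_sq


def equation_9(var_all, s_m_sq, n):
    return var_all / n + (n - 1) / n * s_m_sq


def equation_10(var_all, s_m_sq, n):
    return var_all / n - s_m_sq / n


# Hypothetical case (a): two records, samples of n = 4 cards.
vm_a, v0_a, sm_a = case_a_direct(k=[2, 3], sigma=[1.0, 2.0], d=[3.0, -2.0], n=4)
# Hypothetical case (b): four parts of one record, one card from each.
vm_b, v0_b, sm_b = case_b_direct(sigma=[1.0, 2.0, 1.0, 2.0], d=[-1.0, 1.0, -3.0, 3.0])

print(vm_a, equation_9(v0_a, sm_a, 4))
print(vm_b, equation_10(v0_b, sm_b, 4))
```

In the first case the variance of the mean exceeds $\sigma_0^2/n$, and in the second it falls short of it, as the two equations require.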

The last equation again corresponds precisely with that given for the same departure from the rules of simple sampling in the case of attributes (Chap. XIV. § 11, eqn. 4). If, to vary our previous illustration, we had measured the statures of men in each of $n$ different districts, and then proceeded to form a set of samples by taking one man from each district for the first sample, one man from each district for the second sample, and so on, the standard-deviation of the means of the samples so formed would be appreciably less than the standard error of simple sampling $\sigma_0/\sqrt{n}$. As a limiting case, it is evident that if the men in each district were all of precisely the same stature, the means of all the samples so compounded would be identical: in such a case, in fact, $\sigma_0 = s_m$, and consequently $\sigma_m = 0$.

To give another illustration, if the cards from which we were drawing samples had been arranged in order of the magnitude of $X$ recorded on each, we would get a much more stable sample by drawing one card from each successive $n$th part of the record than by taking the sample according to our previous rules, e.g. shaking them up in a bag and taking out cards blindfold, or using some equivalent process.

The result is perhaps of some practical interest. It shows that, if we are actually taking samples from a large area, different districts of which exhibit markedly different means for the variable under consideration, and are limited to a sample of $n$ observations; if we break up the whole area into $n$ sub-districts, each as homogeneous as possible, and take a contribution to the sample from each, we will obtain a more stable mean by this orderly procedure than will be given, for the same number of observations, by any process of selecting the districts from which samples shall be taken by chance. There may, however, be a greater risk of biassed error. The conclusions seem in accord with common-sense.

(c) Finally, suppose that, while our conditions (a) and (b) of § 2 hold good, the magnitude of the variable recorded on one card drawn is no longer independent of the magnitude recorded on
another card, e.g. that if the first card drawn at any sampling bears a high value, the next and following cards of the same sample are likely to bear high values also. Under these circumstances, if $r_{12}$ denote the correlation between the values on the first and second cards, and so on,

$$\sigma_m^2 = \frac{\sigma^2}{n} + \frac{2\sigma^2}{n^2}\,(r_{12} + r_{13} + \dots + r_{23} + \dots).$$

There are $n(n-1)/2$ correlations; and if, therefore, $r$ is the arithmetic mean of them all, we may write

$$\sigma_m^2 = \frac{\sigma^2}{n}\,[1 + r(n-1)] \qquad (11)$$

As the means and standard-deviations of $x_1, x_2, \dots, x_n$ are all identical, $r$ may more simply be regarded as the correlation coefficient for a table formed by taking all possible pairs of the $n$ values in every sample. If this correlation be positive, the standard error of the mean will be increased, and for a given value of $r$ the increase will be the greater, the greater the size of the samples. If $r$ be negative, on the other hand, the standard error will be diminished. Equation (11) corresponds precisely to equation (6) of § 13, Chap. XIV. As was pointed out in that chapter, the case when $r$ is positive covers the case discussed under (a): for if we draw successive samples from different records, such a positive correlation is at once introduced, although the drawings of the several cards at each sampling are quite independent of one another. Similarly, the case discussed under (b) is covered by the case of negative correlation: for if each card is always drawn from a separate and distinct part of the record, the correlation between any two $x$'s will on the average be negative; if some one card be always drawn from a part of the record containing low values of the variable, the others must on an average be drawn from parts containing relatively high values. It is as well, however, to keep the cases (a), (b), and (c) distinct, since a positive or negative correlation may arise for reasons quite different from those considered under (a) and (b).

15. With this discussion of the standard error of the arithmetic mean we must bring the present work to a close. To indicate briefly our reasons for not proceeding further with the discussion of standard errors, we must remind the student that in order to express the standard error of the mean we require to know, in addition to the mean itself, the standard-deviation about the mean, or, in other words, the mean (deviation)$^2$ with respect to the mean.
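The effect of equation (11) is easy to tabulate. The following is a minimal sketch only, assuming that $\sigma$, $n$ and the mean pair-correlation $r$ are already known; the function name `se_mean_correlated` is introduced here for illustration and is not part of the text.

```python
import math


def se_mean_correlated(sigma, n, r=0.0):
    """Standard error of the mean of n values, following equation (11).

    sigma : standard-deviation of a single observation
    n     : number of observations in each sample
    r     : mean correlation between pairs of observations in one sample
            (r = 0 gives the simple-sampling value sigma / sqrt(n))
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    variance = (sigma ** 2 / n) * (1.0 + r * (n - 1))
    if variance < 0:
        # The mean correlation of n values cannot fall below -1/(n - 1).
        raise ValueError("r is too negative for a sample of this size")
    return math.sqrt(variance)


if __name__ == "__main__":
    # Same sigma and n, different mean correlations between the cards.
    for r in (-0.05, 0.0, 0.05, 0.2):
        print(r, round(se_mean_correlated(10.0, 20, r), 3))
```

With $r = 0$ the function returns the simple-sampling value; a positive $r$ raises it and a negative $r$ lowers it, as the text describes.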

Similarly, to express the standard error of the standard-deviation we require to know, in addition to the standard-deviation itself, the mean (deviation)$^4$ with respect to the mean. Either, then, we must find this quantity for the given distribution, and this would entail entering on a field of work which hitherto we have intentionally avoided; or we must, if that be possible, assume the distribution to be of such a form that we can express the mean (deviation)$^4$ in terms of the mean (deviation)$^2$. This can be done, as a fact, for the normal distribution, but the proof would again take us rather beyond the limits that we have set ourselves. To deal with the standard error of the correlation coefficient would take us still further afield, and the proof would be laborious and difficult, if not impossible, without the use of the differential and integral calculus. We must content ourselves, therefore, with a simple statement of the standard errors of some of the more important constants.

Standard-deviation. If the distribution be normal,

$$\text{standard error of the standard-deviation in a normal distribution} = \frac{\sigma}{\sqrt{2n}} \qquad (12)$$

This is generally given as the standard error in all cases; it is, however, by no means exact: the general expression is

$$\text{standard error of the standard-deviation in a distribution of any form} = \sqrt{\frac{\mu_4 - \mu_2^2}{4\,\mu_2\,n}} \qquad (13)$$

where $\mu_4$ is the mean (deviation)$^4$, deviations being, of course, measured from the mean, and $\mu_2$ the mean (deviation)$^2$ or the square of the standard-deviation; $n$ is assumed sufficiently large to make the errors in the standard-deviation small compared with that quantity itself. Equation (13) may in some cases give values considerably greater (twice as great or more) than (12). (Cf. ref. 17.) Equation (12) gives the standard error not merely of standard-deviations of order zero, but of standard-deviations of any order (ref. 33). It will be noticed, on reference to equation (4) above, that the standard error of the standard-deviation is less than that of the semi-interquartile range for a normal distribution.

For a normal distribution, again, we have

$$\text{standard error of the coefficient of variation } v = \frac{v}{\sqrt{2n}}\left\{1 + 2\left(\frac{v}{100}\right)^2\right\}^{1/2} \qquad (14)$$

The expression in the bracket is usually very nearly unity, for a normal distribution, and in that case may be neglected.

Correlation coefficient. If the distribution be normal,

$$\text{standard error of the correlation coefficient for a normal distribution} = \frac{1 - r^2}{\sqrt{n}} \qquad (15)$$

This is the value always given: the use of a more general formula which would entail the use of higher moments does not appear to have been attempted. Equation (15) gives the standard error of a coefficient of any order, total or partial (ref. 33). For the standard error of the correlation-coefficient for a fourfold table (Chap. XI. § 10), see ref. 34: the formula (15) does not apply. As regards the case of small samples, cf. refs. 10, 28, and 31.

Coefficient of regression. If the distribution be normal,

$$\text{standard error of the coefficient of regression } b_{12} \text{ for a normal distribution} = \frac{\sigma_{1.2}}{\sigma_2\sqrt{n}} \qquad (16)$$

This formula again applies to a regression coefficient of any order, total or partial; i.e. in terms of our general notation, $k$ denoting any collection of secondary subscripts other than 1 or 2,

$$\text{standard error of } b_{12.k} \text{ for a normal distribution} = \frac{\sigma_{1.2k}}{\sigma_{2.k}\sqrt{n}}$$

Correlation ratio. The general expression for the standard error of the correlation-ratio is a somewhat complex expression (cf. Professor Pearson's original memoir on the correlation-ratio, ref. 18, Chap. X.). In general, however, it may be taken as given sufficiently closely by the above expression for the standard error of the correlation coefficient, that is to say,

$$\text{standard error of correlation-ratio approximately} = \frac{1 - \eta^2}{\sqrt{n}} \qquad (17)$$

As was pointed out in Chap. X. § 21, the value of $\zeta = \eta^2 - r^2$ is a test for linearity of regression. Very approximately (Blakeman, ref. 1),

$$\text{standard error of } \zeta = 2\sqrt{\frac{\zeta}{n}}\,\sqrt{(1-\eta^2)^2 - (1-r^2)^2 + 1} \qquad (18)$$

For rough work the value of the second square root may be taken as nearly unity, and we have then the simple expression,

$$\text{standard error of } \zeta \text{ roughly} = 2\sqrt{\frac{\zeta}{n}} \qquad (19)$$
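The standard errors stated above for the standard-deviation, the correlation coefficient and $\zeta$ can be collected in a few lines of code. What follows is a sketch under stated assumptions rather than a definitive implementation: the function names are invented for illustration, the formulae are used as read from expressions (12), (13), (15), (18) and (19), and the moments $\mu_2$, $\mu_4$ are supposed to be supplied by the user.

```python
import math


def se_sd_normal(sigma, n):
    """Expression (12): standard error of the s.d. in a normal distribution."""
    return sigma / math.sqrt(2 * n)


def se_sd_general(mu2, mu4, n):
    """Expression (13): general form, from the 2nd and 4th moments about the mean."""
    return math.sqrt((mu4 - mu2 ** 2) / (4 * mu2 * n))


def se_correlation(r, n):
    """Expression (15): standard error of r, normal distribution assumed."""
    return (1 - r ** 2) / math.sqrt(n)


def se_zeta(eta_sq, r_sq, n, rough=False):
    """Expressions (18) and (19): standard error of zeta = eta^2 - r^2.

    Assumes eta_sq >= r_sq, so that zeta is not negative.
    """
    zeta = eta_sq - r_sq
    first_root = 2 * math.sqrt(zeta / n)
    if rough:
        return first_root  # second square root taken as unity
    return first_root * math.sqrt((1 - eta_sq) ** 2 - (1 - r_sq) ** 2 + 1)


if __name__ == "__main__":
    # For a normal distribution mu4 = 3 * mu2**2, and (13) reduces to (12).
    print(se_sd_normal(2.0, 100), se_sd_general(4.0, 48.0, 100))
    print(se_correlation(0.6, 100))
```

The first printed pair illustrates the remark in the text that (12) is the special case of (13) for the normal distribution, where $\mu_4 = 3\mu_2^2$.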

To convert any standard error to the probable error multiply by the constant 0.674489 . . . .

16. We need hardly restate once more the warnings given in Chap. XIV., and repeated in § 9 above, that a standard error can give no evidence as to the biassed or representative character of a sample, nor as to the magnitude of errors of observation, but we may, in conclusion, again emphasise the warnings given in §§ 1-3, Chap. XIV., as to the use of standard errors when the number of observations in the sample is small.

In the first place, if the sample be small, we cannot in general assume that the distribution of errors is approximately normal: it would only be normal in the case of the median (for which $p$ and $q$ are equal) and in the case of the mean of a normal distribution. Consequently, if $n$ be small, the rule that a range of three times the standard error includes the majority of the fluctuations of simple sampling of either sign does not strictly apply, and the "probable error" becomes of doubtful significance.

Secondly, it will be noted that the values of the constants ($\sigma$ and $y_p$) entering into expressions (1), (2), (4), and (5), i.e. the values that would be given for these constants by an indefinitely large sample drawn under the same conditions, or the values that they possess in the original record if the sample is unbiassed, are assumed to be known a priori. But this is only the case in dealing with the problems of artificial chance: in practical cases we have to use the values given us by the sample itself. If this sample is based on a considerable number of observations, the procedure is safe enough, but if it be only a small sample we may possibly misestimate the standard error to a serious extent. Following the procedure suggested in Chap. XIV., some rough idea as to the possible extent of under-estimation or over-estimation may be obtained, e.g. in the case of the mean, by first working out the standard error of $\sigma$ on the assumption that the values for the necessary moments are correct, and then replacing $\sigma$ in the expression for the standard error of the mean by $\sigma \pm$ three times its standard error so obtained.

Finally, it will be remembered that unless the number of observations is large, we cannot interpret the standard error of any constant in the inverse sense, i.e. the standard error ceases to measure with reasonable accuracy the standard-deviation of true values of the constant round the observed value (Chap. XIV. § 3). If the sample be large, the direct and inverse standard errors are approximately the same.
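The rough check suggested above for the mean, and the conversion to the probable error, may be sketched as follows. This is an illustration only: it assumes a normal distribution, so that the standard error of $\sigma$ is $\sigma/\sqrt{2n}$, and the names `probable_error` and `se_mean_range` are hypothetical.

```python
import math

# Constant quoted in the text for converting a standard error to a probable error.
PROBABLE_ERROR_FACTOR = 0.674489


def probable_error(standard_error):
    """Probable error corresponding to a given standard error."""
    return PROBABLE_ERROR_FACTOR * standard_error


def se_mean_range(sigma, n):
    """Low, central and high values for the standard error of the mean.

    sigma is replaced by sigma -/+ three times its own standard error,
    the latter taken as sigma / sqrt(2n) (normal distribution assumed).
    """
    se_sigma = sigma / math.sqrt(2 * n)
    low = max(sigma - 3 * se_sigma, 0.0) / math.sqrt(n)
    central = sigma / math.sqrt(n)
    high = (sigma + 3 * se_sigma) / math.sqrt(n)
    return low, central, high


if __name__ == "__main__":
    # A small sample gives a wide range, a large sample a narrow one.
    for n in (10, 50, 1000):
        print(n, [round(v, 4) for v in se_mean_range(2.0, n)])
```

For small $n$ the three values differ widely, which is the possible misestimate of the standard error that the text warns against.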

i. "Tables for Facilitating the Computation of Probable Errors. "An Abac to determine the Probable Errors of Correlation Coefficients. R. Ixxi. Y. (12) (13) (14) (15) (16) Winifred.. 5th Series. F. Edgewoeth. 6) 7) 8) Theory of Errors of Observation and the First Principles of Statistics.. iv. "The Frequency Distribution of the Values of the Correlation Coefficient in Samples frqm an Indefinitely largo Fopnlation. 1916. 14. L. F. Y. the Probable Errors of Frequency Constants. L. 7." Biometrika." Proc." Phil. A.. x.. xiv. Thiorie des protabiliUs. ' . D. Laplaob.. 1814. p. xxiv.) — 354 THEORY OF STATISTICS. p. W. 411." Biometrika. 1887. Hekon. Coefficients of association : 34. 13.. 26. Raymond. (With four supplements. p.. 26. London. 12. vii. v. Ixxii. 386. and Karl Pearson. 35. vol. 1906. for the case of three variables. 30. 1906. vol.. 23. Edgewoeth. Mag. 1886.. 1906. Tests for Linearity of Regression in Frequency 1 Distributions. Y. vol. (A proof. " Problems in Probabilities. 1908. " normal coefficient. Layton.. 9) Addendum. 23. 19..) Pearl. of the result given in (33). 651 and . A. . Trans. 36. vol. "The Calculation of Probable Errors of Certain Constants of the Normal Curve. The following is a classification of some of the memoirs in the list below : General : 18. ref. 1885. Jour. Pibeeb Simon. Fit of Theory to Observation.. Roy.. 1905. 165. 20. 6. vii." Cambridge Phil. C. F. J. Series A. 371. Stat. Edgewoeth." Biomstrika. Coefficient of correlation (product-sum and partial correlations) : : : 10. (A diagram giving the probabl" error for any number of observations up to 1000. vol. p. Address to Section F of the British Association. p. REFERENCES. Edgewoeth.. p.. vol. 5th Series. 81. ." vol. Boy. p. Marquis de.. other methods.. xxii. " ^iomeirifo. 381." Biometrika. p. Averages and percentiles 5. As regards the conditions under which it becomes valid to assume that the 'stribution of errors is normal. pp. (11) GipsON. A. 31. vol. 411. 268. 
are generally dealt with in the memoirs concerning them. "On "On Coefficient of Mean Square Contingency. vol. D. : Theory of fit of two distributions 9. 139. p.. Standard deviation 17. 332. on ordinary algebraic lines. 32." Buymetrika. 1909. Eldbeton. vol.. Y." Biometrika.." Biometrika. the Probable Error of the 2) Blakbman.) IssERLis. The Measurement of Groups and Series . etc. 1910. "Observations and Statistics: An Essay on the .. 1902. "On Sac.. vol... the memoirs concerning errors of sampling in proportions or percentages. : 24. J. 4) 5) BowLBY. reference to which has been made in the lists of previous chapters : reference has also been made before to most of The probable . 33. 28. 507. iv. xoii. cf. 1910. vol. p. 1906. "On the Conditions under which the Probable Errors ' of Frecjuenoy Distributions have a real Significance... v. " On the Probable Error of a Partial Correlation Coefficient. p. vol. Mag. & E. 191. Soc. 190.. Palin. Coefficient of correlation. 3) BowLEY. F. etc. 2" iin. 1915.'' Phil. Coefficient of contingency : 2. errors of various special coefficients. 499. 34. 1903. L.. "The Choice of Means. BLji KEMAN. 29. p. p. vol.) Heron. "Tables for Testing the Goodness of (10) FiSHEE.

"On the Probability that two Independent Distributions of Frequency are really Samples from the same Population. (17) — SIMPLER CASES OF SAMPLING FOE VARIABLES. oxoi. 172. 1. 302. of association coefficients. etc.) ) ) XVII. p. 85. vol. ii. cxcii. and others (editorial). 1898. x. "On the Probable Error of a Coefficient of Mean (26) Khind. 1. p. from the standard error (18) al\J1n in the case of a normal distributioti. p. 1915.." Biometrika. (34) Reference may also be made most part with the effects of errors to the following. 1909-10. Karl. and vol. vol. in certain cases. 181. F. " On the Criterion that a given System of Deviations from the Probable in the Case of a Correlated System of Variables is such that it can be reasonably supposed to have arisen from Random Sampling. 101.) "SinDBNT. "On the Theory of Correlation for any number of Variables treated by a New System of Notation. vol. Series Yule.. p." Proc. "On the Probable Errors of Frequency Constants. (23) (24) (25) Pearson. Pkaeson." Jour." " On the Probable Error of a M. Yule. (19) (20) Peabson." Phil.. "On the Probable Error of the Correlation Coefficient to a Second Approximation. Ixxix. vol. "On the Probable Error of the Bi-serial Expression for the Correlation Coe&cient." (The problem of the probable error Biometrika. 1903.. Karl. Ixxvi. U. Stat. Pearson. "On the Probable Error of a CoeflScient of Correlation as found from a Fourfold Table.. 1911. 182. ix. E.. H." Biometrika. p. vol.. G. p. Trans.Ti. N. x. (22) Pearson. "Note on the Significant or Non-significant Character of a Sub-sample drawn from a Sample. 1900. 386. 1913. 590. Soc... 1906. p. "On the Probable Errors of Frequency Constants. Karl. p." Biometrika. SoPBB. vol.. vii.. Eaymokd. "On certain Points concerning the Probable Error of the Standard-deviation. 1913. vol. 355 Peakl. a. (27) (28) (29) (30) (31) Skew Frequency-distributions. 210. G. Karl. vol. Series A. 250. p." Phil. p. Boy. based on the 1913. H. Boy. p. vol. viii. p. E. vi." 
Biometrika. x. Roy. Soc. general case without respect to the form of the frequency-distribution." "On the Distribution of Means of Samples which are not drawn at Random. Pearson." Biometrika. 91. U. p. (32) (33) "Student. 229. 1908. W. ix." BiomeVrika. v.. with small samples. Soc. vii. (See pp.. p." Biometrika." Biometrika. (21) Pbakson. vol.." FMl. and on the Influence of Random Selection on Variation and Correlation. 1912." Biometrika.. 22. Filon. "Student. p. " On the Curves which are most suitable for describing the Frequency of Random Samples of a Population.. vol. 1914..) "On the Methods of Measuring Association between two Attributes. and vol... Soc... (Dseful for the general formulae given. 1898. vol. Trans. 1907. p. Kabl. Karl. 1909. vol. and L.. Square Contingency. vol. 1914.. (Probable error of the correlation coefficient for a fourfold table. vol. 384. (On the amount of divergence. 1906. G. Series A. A. ix. 5th Series. Mag. 1. 112." Biometrika." Biometrika. 127 and p.ea. vol. vi." "On the Probable EiTor of a Correlation Coefficient.. p. Karl. vol. 1908.) Pbabson. 157.." iiomeirite. 1908. "On the Application of the Theory of Error to Cases of Normal Distribution and Normal Correlation. p.. Boy. SoPBR. Kakl. 273.. vi. which deal for the other than errors of sampling:— . "Tables for Facilitating the Computation of Probable Errors of the Chief Coiistants of vol. 192-3 at end. v. p. Shbppakd. (The standard error of the mean in terms of the standard error of the sample.

(35) Bowley, A. L., "Relations between the Accuracy of an Average and that of its Constituent Parts," Jour. Roy. Stat. Soc., vol. lx., 1897, p. 855.
(36) Bowley, A. L., "The Measurement of the Accuracy of an Average," Jour. Roy. Stat. Soc., vol. lxxv., 1911, p. 77.

EXERCISES.

1. For the data in the last column of Table IX., Chap. VI., p. 95, find the standard error of the median (154.7 lbs.).
2. For the same distribution, find the standard errors of the two quartiles (142.5 lbs., 168.4 lbs.).
3. For the same distribution, find the standard error of the semi-interquartile range.
4. The standard-deviation of the same distribution is 21.3 lbs. Find the standard error of the mean, and compare its magnitude with that of the standard error of the median (Qn. 1).
5. Work out the standard error of the standard deviation for the distribution of statures used as an illustration in § 6. (Standard-deviation 2.57 in.; 8585 observations.) Compare the ratio of standard error of standard-deviation to the standard-deviation, with the ratio of the standard error of the semi-interquartile range to the semi-interquartile range, assuming the distribution normal.
6. Calculate a small table giving the standard errors of the correlation coefficient, based on (1) 100, (2) 1000 observations, for values of r = 0, 0.2, 0.4, 0.6, 0.8, assuming the distribution normal.

APPENDIX I.

TABLES FOR FACILITATING STATISTICAL WORK.

A. CALCULATING TABLES.

For heavy arithmetical work an arithmometer is invaluable; but, owing to their cost, arithmetic machines are, as a rule, beyond the reach of the student. For a great deal of simple work, especially work not intended for publication, the student will find a slide-rule exceedingly useful: particulars and prices will be found in any instrument maker's catalogue. A plain 25-cm. rule will serve for most ordinary purposes, or if greater accuracy is desired, a 50-cm. rule, a Fuller spiral rule, or one of the Hannyngton-pattern rules (Aston & Mander, London), in which the scale is broken up into a number of parallel segments, may be preferred. For greater exactness in multiplying or dividing, logarithms are almost essential: five-figure tables suffice if answers are only desired true to five digits; if greater accuracy is needed, seven-figure tables must be used. It is hardly necessary to cite special editions of tables of logarithms here, but attention may perhaps be directed to the recently issued eight-figure tables of Bauschinger and Peters (W. Engelmann, Leipzig, and Asher & Co., London, 1910; vol. i., containing logarithms of all numbers from 1 to 200,000, price 18s. 6d. net; vol. ii., containing logs. of trigonometric functions).

If it is desired to avoid logarithms, extended multiplication tables are very useful. There are many of these, and four of different forms are cited below. Zimmermann's tables are inexpensive and recommended for the elementary student; Cotsworth's, Crelle's, or Peters' tables for more advanced work. Barlow's tables are, of course, invaluable for calculating standard-deviations of ungrouped observations and similar work.

(1) Barlow's Tables of Squares, Cubes, Square-roots, Cube-roots, and Reciprocals of all Integer Numbers up to 10,000; E. & F. N. Spon, London and New York; stereotype edition, price 6s.

for all numbers up to 1000 at the foot of the page. cf.) German or in English. together with others. "Tables for Facilitating the Computation of Probable Errors. without index. (10) EVEKITT. vol.) Berlin .. French or German. 1904. p. John Wiley . New Sechentafeln fur MuUiplilcation imd Division. SPECIAL TABLES OF PTJNCTIONS. price 5s." Biometrika. Wjnifp. (12) Heuon. e. a table of Gamma Functions in Elderton's book (12) and a table of six-figure logarithms of the factorials The of all numbers from 1 to 1100 in De Morgan's treatise (11). differences of O'OOl of the argument. vol. " An Abac to detertnine the Probable Errors of Correlation Coefficients. price 15s. (Gives products up to 100 x 10. gamma functions. B. Asher & Co.. ref. The Direct Calmlator.) (8) DuFEELL. and vol. Berlin . logarithms.. cubes. (Tables of area and ordinates of the normal curve. (4)Eldebton.. a. W. price 15s. P. 1906. 437.) four-figure products." Biometrika. 1000 X 1000. 25s. . vii. 411.. . "Tables of Powers of Natural Numbers.g. 155. (Seven-figure logarithms of the function.. 385. Ernst & Son. (7) C. nebst Sammlung haufig gebrauohter Zahlenwerthe. Can be obtained with explanatory introduction in to 1000x1000. " Tables of F{r. loith especial reference to BioVariation . etc.. Statistical Methods. which were originally published in Biometrika. p. H. 1912. are contained in Tables for Statisticians and Biometricians. (5)Peteks. (Products of all numbers up to 100 x 1000 subsidiary CoTSWOBTH. vii. and of the Sums of Powers of the Natural Numbers from 1 to 100" (gives powers up to seventh). 1902.. p. cube-roots and reciprocals." Biometrika." British Association Report. Mechentofel. F.. vol. New York. Several tables of service will be found in the works cited in Appendix II.bd. J. vol. p.) (13) Lee. English edition. vii. etc. W. Sechentafeln. London. L. p. XVL) (11) Gibson. vol. second edition." Biometrika. ETC. : : tables of squares. p. proceeding by 1909.. 1899. 
edited by Karl Pearson (Cambridge University Press. D. . Reimer. Eldbrton.. (Functions occurring in connection with Professor Pearson's frequency curves. p. iv.. G." Biometrika. i.. majority of the tables in the list below. Biometrika. i4ofChap. "Tables of the Gamma-function. net).. 1910. . ) M'Corquodale & Co. powers. (Tables for facilitating the calculation of the correlation coefficient of a fourfold table by Pearson's method on the assumption that it is a grouping of a normally distributed table . p.) 358 (2) THEORY OF STATISTICS. viii. square-roots. 385. London. "TablesforTestingtheGoodnessof Fit of Theory to Observation. B. Series 0. "Tables of the Tetrachoric Functions for Fourfold Correlation Tables. . (A diagram giving the probable error for any number of observations up to 1000. v) and S{r. W. Reimer. 474. J. 1910. Berlin . (Multiplication table giving all products up (3) Cbeue. ii.. H. price 15s. Chapman & Hall .. (Product table to B..) logical (9) Davenport.000 more convenient than Orelle for forming Introduction in English. London price with thumb index.. vol. v) Functions. (6) ZiMMEEMANN. p. 6s. 21s. 43. 1914. probable errors of the coefficient of correlation. G. Alice. M.

F. 1907. "£«)?ne<riAa.. 386. vol. 208.. W. Biometrika. a. "Tables of the Gaussian Tail-functions. (A table giving the deviation of the normal curve. (Includes not merely table of areas of the normal 1903. "Tables for Facilitating the Computation of Probable Errors of the Chief Constants of Skew Frequency -distributions. vol. "New Tables of the Probability Integral. Aliok. vol. "Table of Deviates of the Normal Curve" (with introductory article on Orades and Deviates by Sir Francis Galton). p.. vii. for the ordinates which divide the area into a thousand equal parts. but also a table of the ordinates to the same degree of accuracy. v.. p. —SPECIAL TABLES QF FUNCTIONS.) (16) Sheppabd. (15) Rhixd. p. F. 1909-10. 1914:.. vol. ETC. x.. when the 'tail' is larger than the body. W. 127 and p.. (17) Shbppaed. p. 17i." Biometrika.) APPENDIX (14) I. ii." Biometrika. . curve (to seven figures). 404. ' ' 359 Leb. in terms of the standard-deviation as unit.

the latter containing. (A German translation in Ostwald's Klassiker der exakten J. Bowley's "Elements" (6 below). 1861 .. Jacques Bertillon's Cours SlSmentaire de statistique (Socidt^ d'^ditions international in scope).. 1907. 107.) ) . 1889. . 1st edn. Betz. SHORT LIST OF WORKS ON THE MATHEMATICAL THEORY OF STATISTICS AND THE THEORY OF PROBABILITY. London. . W.. 1895 Vital Statistics (Swan Sonnensohein. 1909. Sammelforschung J.. Brown. Barth. B. BowLEY. und Kollektivmasslehre 360 . Paris. H.. conjectandi. opus posthumum: Accedit tractatus de seriebus infinitis. The great majority of the works list. . as supplementing the lists of references given at the ends of the several chapters. 1911.) (3) (4) (6) (6) (7) Bektkani). L. 1910). Newsholme's scientifiques. 1879. to An Elementary Manual of Statistics (Macdonald & Evans. et epistola gallic^ scripta de ludo pilot reticularis. The student may find the following short list of service. Leipzig. 1713. Oalcul des probabilit4s Gauthier-Villars. A. BoEBL. Elements of Statistics P. Sir G.. Wahrscheinlichktitsrechnung Teubner. (Part 2 on the theory of correlation applications . . F. are in the library of the Royal Statistical Society. The economic student who wishes to know more of the practical side of statistics may be referred to Mr A. und psych. London. and to M. the Algebraical and Nwmerical Theory of Errors oj 1st edn. by the same writer (useful as a general guide to English statistics). Ueber Korrelation Beihefte zur Zeitschrift fiir ang.. Srd edn.. 108.. A. as a rule. L. On Observations . 1901 Srd edn. King. Dr A. Moments de la tMorie des probabiliMs Hermann. mentioned in the following with others which it has not been thought necessary to include. (8) Bruks. (2) Bernoulli. J.. original memoirs only.. S. 3rd edn.. (1) AlET. Psybh. APPENDIX II. Nos. . .. 1911. L. : to experimental psychology. (Applications to p^chology. 1899) will also be : of service to students of that subject. E. Paris. W. 
The Essentials of Mental Measurement Cambridge University Press. Leipzig^ 1906. Ars Wissenschaften.

(German translation by C. 1913. Layton. 1908-10. A. Gauthier-Villars. . Treatise on ProhaUlUy (republished from the 7th edn. Treatise on the Theory of Proialilities (extracted from . Calcul des probability . Essai pMlqsophique sur les probability. edited by 6. G. Leipzig. Frequency Curves and Correlation C.. (The introduction to 18. separately printed with some modifications. A. Jena. Abhandlungen eur Theorie der Bevolkertmgs.. (10) CzTJBER. 1837.. interest. Leipzig. G. 1837. 1888. i. the Enayclopcedia MetropoUtana). a. Eldeeton. Jena. (25) Westergaard. L. C. An Introduction to the Theory of Mental and Social Measurements. H. T.. Lippsj Engelmann. WahrscJieinlichkeiisrechnung imd ihre Anwendung auf Fehlerausgleichung. Schnuse. —SHORT LIST OF WORKS.. 1814. (15) Gauss.) (13) (14) Fechnee.. J.) (22) QuETELET. prMdies des rigles ginirales du calcul des probabilitis. . J. (Verylargely concerned with an exposition of the statistical methods. London. Kollektivmasslehre (posthumously published. London... 1849. 361 CoURNOT. Downes. T. .. Lettres sur la tMorie des probabilit^s. Elemente der exakten ErUichkeitslehre . . 1841. The Logic of Chance: an Essay on the Fovmdations cmd (24) Province of the Theory of Probability. 1897. Laplace.) . Galloway. . 2ndedn. Macmillan. ThSorie analytigiie des probability (18) 2nd edn.. Fischer. 1856. 1846. A. (16) JoHANNSEN. H. traduits par J. New York. Jena. (English translation by 0. Fischer. Bertrand. Marquis de. Marquis de. vol.wnd Moral(17) statistik . F. D. (Deals with Professor Pearson's frequency curves and with illustrations chiefly of actuarial . 1904. loith especial reference to its Logical Bearings cmd its Application to Moral and Social Science amd to StatiMcs 3rd edn. of the Encyclopaedia Britannica).) E. 1843.. APPENDIX (9) II. Statistik wnd Lelensversicherung Teubner.. . Pikkke Simon. Venn. 1896. F. Fischer... Pibkrb Simon. correlation. (19) Lexis. (11) (12) De MoiiGAN. & E. H. (20) PoiNCABi.. W. "W.. 
Die OrundzUge der Theorie der Statistik (23) Thokndike. with supplements 1 to 4. a'^ Ausgabe. Becherches sur la probability des jugements en matiire criminelle et en matiire civile. appliquie aux sciences morales et politiques.. W. L.) Laplace. S. 1906. p. (21) PoissoN. Exposition de la tMorie des chances et des prohdbilU4s. Mithode des moindres carris: Minwires sur la combinaison des observations. Paris. 1814. 1890. Science Press. 1903. E. 1839.

SUPPLEMENTS.

I. DIRECT DEDUCTION OF THE FORMULÆ FOR REGRESSIONS.

(Supplementary to Chapters IX. and XII.)

To those who are acquainted with the differential calculus the following direct proof may be useful. It is on the lines of the proof given in Chapter XII. § 3.

Taking first the case of two variables (Chapter IX.), it is required to determine values of $a_1$ and $b_1$ in the equation

$$x = a_1 + b_1 y$$

(where $x$ and $y$ denote deviations from the respective means) that will make the sum of the squares of the errors like

$$u = x' - (a_1 + b_1 y')$$

a minimum, $x'$ and $y'$ being a pair of associated deviations. The required equations for determining $a_1$ and $b_1$ will be given by differentiating

$$\Sigma(u^2) = \Sigma\{x - (a_1 + b_1 y)\}^2$$

with respect to $a_1$ and to $b_1$ and equating to zero. Differentiating with respect to $a_1$, we have

$$\Sigma\{x - (a_1 + b_1 y)\} = 0.$$

But $\Sigma(x) = \Sigma(y) = 0$, and consequently we have $a_1 = 0$. Dropping $a_1$, and differentiating with respect to $b_1$,

$$\Sigma(x - b_1 y)\,y = 0.$$

That is,

$$b_1 = \frac{\Sigma(xy)}{\Sigma(y^2)} = r\,\frac{\sigma_x}{\sigma_y},$$

as on p. 171. Similarly, if we determine the values of $a_2$ and $b_2$ in the equation

$$y = a_2 + b_2 x$$

that will make the sum of the squares of the errors like $v = y' - (a_2 + b_2 x')$ a minimum, we will find

$$a_2 = 0, \qquad b_2 = \frac{\Sigma(xy)}{\Sigma(x^2)} = r\,\frac{\sigma_y}{\sigma_x}.$$
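The result of this deduction can be verified numerically on a small set of paired values. The sketch below is illustrative only: the function name `regressions_from_pairs` and the sample figures are assumptions, not data from the text. It forms deviations from the two means, computes $b_1 = \Sigma(xy)/\Sigma(y^2)$ and $b_2 = \Sigma(xy)/\Sigma(x^2)$, and returns $r$ so that the identities $b_1 = r\sigma_x/\sigma_y$ and $b_1 b_2 = r^2$ may be checked.

```python
import math


def regressions_from_pairs(xs, ys):
    """Return (b1, b2, r) for paired observations xs, ys.

    b1 : regression of x on y, sum(xy) / sum(y^2), in deviations from the means
    b2 : regression of y on x, sum(xy) / sum(x^2)
    r  : correlation coefficient
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists of at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations, so that sum(dx) = 0
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / syy, sxy / sxx, sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    b1, b2, r = regressions_from_pairs([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
    print(b1, b2, r, b1 * b2, r * r)
```

Because the deviations sum to zero, the constant terms $a_1$ and $a_2$ vanish, and only the two slopes need be computed.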

If, as in Chapter XII. (§§ 4 et seq.; cf. especially § 7), a number of variables are involved, the equations for determining the coefficients will be given by differentiating

$$\Sigma\{x_1 - (b_{12.34\ldots n}\,x_2 + \ldots + b_{1n.23\ldots(n-1)}\,x_n)\}^2$$

with respect to each coefficient in turn and equating the result to zero. This gives the equations of the form there stated. If a constant term be introduced, as above, its "least square" value will be found to be zero.

II. THE LAW OF SMALL CHANCES.

(Supplementary to Chapter XV.)

We have seen that the normal curve is the limit of the binomial $(p + q)^n$ when $n$ is large and neither $p$ nor $q$ very small. The student's attention will now be directed to the limit reached when either $p$ or $q$ becomes very small, but $n$ is so large that $np$ or $nq$ remains finite.

Let us regard the $n$ trials of the event, for which the chance of success at each trial is $p$, as made up of $m + m' = n$ trials; then the probability of having at least $m$ successes in the $m + m'$ trials is evidently the sum of the $m' + 1$ terms of the expansion of $(p + q)^n$ beginning with $p^n$. But this probability, which we may term $P_m$, can be expressed in another and more convenient form with the help of the following reasoning. The required result might happen in any one of $m' + 1$ ways. For instance:

(a) Each of the first $m$ trials might succeed: the chance of this is $p^m$.

(b) The first $m + 1$ trials might give $m$ successes and 1 failure, the latter not to happen on the $(m+1)$th trial (a condition already covered by (a)). But the probability of the latter at a specified trial is $p^m q$, and, as the failure might occur in any one of $m$ trials, the complete probability of (b) is $m\,p^m q$.

(c) The first $m + 2$ trials might give $m$ successes and 2 failures, the $(m+2)$th trial not to be a failure (so as to avoid a repetition of either of the preceding cases): the probability of this is

$$\frac{m(m+1)}{2!}\,p^m q^2.$$

In a similar way we find for the contribution of $m$ successes and 3 failures, giving $m + 3$ trials,

$$\frac{m(m+1)(m+2)}{3!}\,p^m q^3.$$

For instance. we have the chance that the event succeeds every time. we have verify. is infinite ('-3-->Hence. and (8) reduces to e-\ Put m'= 1.-. when n and.2.(».. under similar conditions.A. e-'^(l + A).23! ('-3"(>-r"t-^---"since — and smaller fractions can be neglected.-).364 Ultimately we reach THEOEY OF STATISTICS.is the chance of exactly one failure. if n be large and g small. Writing = . and we get the chance that the event shall not fail more than once.. L 2 ! m m ! This expression is of course equivalent to the first m' + 1 terms of the binomial expansion beginning with /j™. we have . and the terms . so that e-^.and putting m = ?i — 9w'. becomes 1.-2)g + ("-y-l) g2] ii ! Let us now suppose that q failures to total trials is is very small.->(. if m j>"-^[l+(r. If ...-. n But (1 j • • m' " !..=. we put »i' = 0. so that — = ratio of also very small. so that = 2.x. Let us also suppose 2' that n is so large that (7) nq^ = \\& finite. as the student can = m .. is shown in books on algebra to be equal to e -\ where e is the base of the natural logarithms.

p.q q is r . bracket on the right is equal to e~*. The investigation contained in the preceding paragraphs was published in 1837 by Poisson. 273). we have is The ratio of the (r + 1)"" to the r"" term is ——r -t- J^^l \ . In other words.. 1. etc. which the student may perhaps find easier to follow than that of Poisson (see ref. but the result has been reached independently by several writers since Poisson's time.. .1 SUPPLEMENTS —THE LAW OF SMALL CHANCES. 365 within the bracket give us the proportional frequencies of 0. The convergence of seen from the fact that r cannot exceed -. right Hence the second bracket on the ri] of (9) may be written and (9) is identical with (8). and we shall give one of the methods of proof adopted by modern statisticians.. 2 it (9a) which reduces to the series is — when T very small. so that (8) may be termed Poisson's limit to the binomial . 2.when q Expanding the second bracket. failures. and the to substitution of this value in (9o) reduces q^ (l-y)\' which vanishes with q. (8) is the limit of the binomial (p + j)" when q is very small but nq finite. 19. (?> + ?)" = (! -? + ?)" = (! -#(1 + ^4-)' • W inde- The first finitely small.

the pigeon-holes the dis- . tribution given by the binomial distribution of M being large. in the statistics used in par. method of par. if we desired to find the (all The frequent rediscovery its value is felt in the pendent probabilities.+i!t •) = \e-^( . o-=-78..366 THEORY OF STATISTICS. For instance. 12. tables of which for X have been published by v. .). Hence any statistics produced by causes conforming to Poisson's limit should. Bortkewitsoh and also been applied to cases in which. although is unknown. and for cr^ e-^(\ + 2\^ + ^^+ . n things in JV pigeon-holes being of equal size and equally accessible). + A. . indeFor instance.. if (8) is the real law to be very small. Chapter XIII. Chapter XV. we have for the mean the actual value of q (or p) The theorem has .2-\2 = \. of distribution. 12 of Chapter XIII. 6. the mean is -61.)-X2 = A.. . certain relations must obtain between the conUsing the stants of the statistics (see par. . have the mean equal to the square of the standard deviation. effectively different values of represented by (8). it may safely be assumed It should be noticed tliat.-«(wx.^.x. of this theorem is due to the fact that study of problems involving small. = A./l F-IX iV-lN" would be others. 0-2= -6079. within the limits of sampling.

: SUPPLEMENTS If — GOODNESS : OF FIT. 367 we now compute the \= '61. . we theoretical frequencies from (8). putting have the following results — Deaths.
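The limit (8) and the relation between mean and standard deviation can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, only the Python standard library is assumed, and the value $\lambda$ = ·61 is the one quoted in the text. It compares binomial terms with the terms of (8) as $n$ grows with $nq = \lambda$ held fixed, and sums the series for the mean and for $\sigma^2$.

```python
import math

def binomial_failures(n, q, k):
    """Chance of exactly k failures in n trials, q being the chance of failure."""
    return math.comb(n, k) * (q ** k) * ((1.0 - q) ** (n - k))

def poisson_term(lam, k):
    """Term e^(-lambda) * lambda^k / k! of (8): chance of exactly k failures."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def max_gap(n, lam, kmax=5):
    """Largest difference between binomial and Poisson terms for k = 0..kmax."""
    q = lam / n  # nq = lambda is held fixed while n grows
    return max(abs(binomial_failures(n, q, k) - poisson_term(lam, k))
               for k in range(kmax + 1))

lam = 0.61  # value of lambda used in the text

# Proportional frequencies of 0, 1, 2, ... failures given by (8)
proportions = [poisson_term(lam, k) for k in range(6)]

# Mean and square of the standard deviation, summing the series far enough
# for the neglected terms to be negligible
terms = [poisson_term(lam, k) for k in range(60)]
mean = sum(k * t for k, t in enumerate(terms))
variance = sum(k * k * t for k, t in enumerate(terms)) - mean ** 2

if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, max_gap(n, lam))
    print(proportions, mean, variance)
```

Multiplying the proportional frequencies by the total number of observations would give theoretical frequencies of the kind tabulated in the text.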

GOODNESS OF FIT.

... the relation there shown to hold between the normal curve and the surface of normal correlation at once suggests that the same principle will apply when there are two variables. It was proved on pp. 319-321 that the contours of a normal surface are a system of concentric ellipses. Now suppose we have a normal system of frequency in two variables $x$ and $y$; then the chance that on random sampling we should obtain the combination $x'\,y'$ is measured by the corresponding ordinate of the surface, and the feet of all ordinates of equal height will lie upon an ellipse, which will therefore be the locus of all combinations of $x$ and $y$ equally likely to occur as is $x'\,y'$. Any combination more likely to occur than $x'\,y'$ will have a taller ordinate, and as the locus of its foot must also be an ellipse, that ellipse will be contained within the $x'\,y'$ ellipse. Conversely, combinations less likely to occur than $x'\,y'$ will be represented by ordinates located upon ellipses wholly surrounding the $x'\,y'$ ellipse. Hence, if we dissect the surface into indefinitely thin elliptical slices and determine the total volume of the sum of slices from $x = x'$ and $y = y'$ down to $x = 0$ and $y = 0$, this volume divided by the total volume of the surface will be the probability of obtaining in sampling a result not worse than $x'\,y'$; or, if we prefer, we may sum from $x = x'$, $y = y'$ to $x = y = \infty$, and then the fraction is the chance of obtaining as bad a result as $x'\,y'$, or a worse result.

The reader who has compared the figures on p. 166 and p. 246, and followed the algebra of pp. 331-332, will have no difficulty in seeing that, when the number of variables is 3, 4, . . . $n$, the above principle remains valid although it ceases to be possible to give a graphic representation. With three variables the contour ellipse becomes an ellipsoidal surface, with four variables another dimension is involved, and the four-dimensioned frequency "volume" must be dissected into tridimensional ellipsoids, and so on; but throughout the equation of the contour of equal probability is of the ellipse type (cf. the generalisation of the theorems of Chapter IX. in Chapter XII.).

Let us now suppose that if a certain set of data is derived from a statistical universe conforming to a particular law, these data, $N$ in number, should be distributed into $n+1$ groups containing respectively $n_0, n_1, n_2, \ldots n_n$ each. Instead of this we actually find $m_0, m_1, m_2, \ldots m_n$, where

$$m_0 + m_1 + \ldots + m_n = n_0 + n_1 + \ldots + n_n = N.$$

The problem to be solved is whether the observed system of deviations from the most probable values might have arisen in random sampling.

Since, $N$ being given, fixing the contents of any $n$ of the classes determines the $(n+1)$th, there are only $n$ independent variables. Let us now suppose that the distribution of deviations is normal. Then the equation of the frequency "solid" is of the type set out in equation (15) of p. 331, which we will write for the present in the form

$$z = z_0\,e^{-\frac{1}{2}\chi^2};$$

$\chi^2$ = a constant, is then the equation of the "ellipsoid" delimiting the two portions of the "volume" corresponding to combinations more or less likely to occur than $m_0, m_1, \ldots m_n$. Accordingly, to find the chance of a system of deviations as probable as or less probable than that observed, we have to dissect the frequency solid, adding together the elliptic elements from the ellipsoid $\chi$ to the ellipsoid $\infty$, and to divide this summation by the total volume, i.e. the summation from the ellipsoid 0 to the ellipsoid $\infty$.

In this book we have been concerned with summations the elements of which were finite. The reader is probably aware that when the element summed is taken indefinitely small the summation is called an integration, the symbol $\int$ replacing $\Sigma$ or $S$, and the infinitesimal element being written $dx$. In the present case we have to reduce an $n$-fold integral, the summation relating to $n$ elements $dx_1$, $dx_2$, etc. To reduce this $n$-fold integral to a single integral, the following method is adopted. In the first place the ellipsoid, referred to its principal axes, is transformed into a spheroid by stretching or squeezing, and the system of rectangular co-ordinates transformed into polar co-ordinates, while $\chi$ may be treated as the vectorial element or ray. The reason for adopting the latter device is that, when two rectangular elements $dx$, $dy$ are transformed to polar co-ordinates, we replace them by an angular element $d\theta$, a vectorial element $dr$, and a term in $r$, the radius vector. When $n$ such elements are transformed, the vectorial factor is raised to the $(n-1)$th power and there is an infinitesimal vectorial element, $dr$, and a "solid" angular element. But as the limits of integration of the angular (not of the vectorial) element will be the same in the numerator and denominator, these cancel out. Hence the multiple integral reduces to a single integral and the expression becomes

$$P = \frac{\displaystyle\int_{\chi}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{\,n-1}\,d\chi}{\displaystyle\int_{0}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{\,n-1}\,d\chi},$$

the reduction of which, its integration, therefore, can be effected in terms of $\chi$ by methods described in text-books of the integral calculus.

Everything turns, therefore, upon the computation of the function $\chi$. As we have seen, $\chi^2$ is determined by evaluating the standard deviations of the $n$ variables and their correlations two at a time (the higher partials being deducible if the correlations of zero order are known). By an application of the method of p. 257, we have

$$\sigma_p = \sqrt{n_p\left(1 - \frac{n_p}{N}\right)}$$

for the standard error of sampling in the content of the $p$th class, while by a similar adaptation of the reasoning on p. 342 we reach

$$r_{pq} = -\sqrt{\frac{n_p\,n_q}{(N-n_p)(N-n_q)}}$$

for the correlation of errors of sampling in the $p$th and $q$th classes. With these data, $\chi^2$ can be deduced (the actual process of reduction is somewhat lengthy, but the student should have no difficulty in following the steps given in pp. 370-2 of ref. 47, infra). Its value is

$$\chi^2 = \sum_{p=0}^{n}\frac{(m_p - n_p)^2}{n_p},$$

the summation extending to all $n+1$ classes of the frequency distribution.

Values of the probability that an equally likely or less likely system of deviations will occur, usually denoted by the letter $P$, have been computed for a considerable range of $\chi^2$ and of $n' = n + 1$ = the number of classes, and are published in the Tables for Statisticians and Biometricians mentioned on p. 358. The arithmetical process is illustrated upon the two examples of dice-throwing given on p. 258.

There are three points which the student should note as regards the practical application of the method. In the first place, the proof given assumes that deviations from the expected frequencies follow the normal law. This is a reasonable assumption only if no theoretical frequency is very small, for if it is very small the distribution of deviations will be skew and not normal. It is desirable, therefore, as is done in the second illustration below, to group together the small frequencies in the "tail" of the frequency distribution, so as to make the expected frequency a few units at least. In the case of the first illustration it might have been better to group the frequency of 0 successes with that of 1 success, and the frequency of 12 successes with that of 11 successes.
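The computation of $\chi^2$ and of $P$ described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure of the published tables: the function names are hypothetical, and $P$ is evaluated from the standard finite series for the ratio of integrals (one form when $n = n' - 1$ is even, another when it is odd) instead of by interpolation in a table.

```python
import math

def chi_square(observed, expected):
    """chi^2 = sum of (m - n)^2 / n over all classes (m observed, n expected)."""
    return sum((m - n) ** 2 / n for m, n in zip(observed, expected))

def p_value(chi2, n_prime):
    """P for a given chi^2 and n' = number of classes, so that n = n' - 1.

    Evaluates the ratio of the integral of exp(-chi^2/2) * chi^(n-1)
    from chi to infinity to the same integral from 0 to infinity.
    """
    n = n_prime - 1
    if n < 1:
        raise ValueError("n' must be at least 2")
    half = chi2 / 2.0
    if n % 2 == 0:
        # n even: exp(-chi^2/2) * (1 + h + h^2/2! + ... + h^(n/2-1)/(n/2-1)!)
        term, total = 1.0, 1.0
        for r in range(1, n // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # n odd: twice the normal tail beyond chi, plus a finite series
    chi = math.sqrt(chi2)
    series, term = 0.0, chi
    for r in range(1, (n - 1) // 2 + 1):
        series += term              # chi^(2r-1) / (1 * 3 * ... * (2r-1))
        term *= chi2 / (2 * r + 1)
    return (math.erfc(chi / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-half) * series)

if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only
    observed = [12, 8, 25, 15]
    expected = [10, 10, 20, 20]
    c2 = chi_square(observed, expected)
    print(c2, p_value(c2, n_prime=len(observed)))
```

The function takes $n'$ and not $n$ as its argument because the published tables are entered with $n'$, the number of classes.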

Twelve Dice thrown 4096 times, a throw of 4, 5, or 6 points reckoned a success (p. 258).

No. of Successes.

In the second place, the proof outlined assumes that the theoretical law is known a priori. In a large number, perhaps almost the majority, of practical cases in which the test is applied this condition is not fulfilled. We determine, for example, the constants of a frequency curve from the observations themselves, not from a priori considerations; or we determine the "independence values" of the frequencies for a contingency table from the given row and column totals, again not from a priori considerations. This general case is dealt with below, in the section headed "Comparison Frequencies based on the Observations."

Finally, attention should be paid to the run of the signs of the differences $m' - m$. The method used pays no attention to the order of these signs, and it may happen that $\chi^2$ has quite a moderate value and $P$ is not small when all the positive differences are on one side of the mode and all the negative differences on the other, so that the mean shows a deviation from the expected value that is quite outside the limits of sampling; or that the differences are negative in both tails, so that the standard deviation shows an almost impossible divergence from expectation. In the first example on the preceding page all the differences are negative up to 5 successes, positive from 6 to 10 successes, and negative again for 11 and 12 successes. This is almost the first case supposed, and in fact we have already found (p. 267) that the mean deviates from the expected value by 5·1 (more precisely 5·13) times its standard error. From Table II. of Tables for Statisticians we have:

Greater fraction of the area of a normal curve for a deviation 5·13 : ·9999998551
Area in the tail of the curve : ·0000001449
Area in both tails : ·0000002898

so that the probability of getting such a deviation (+ or -) on random sampling is only about 3 in 10,000,000. The value found for $P$ (·0015) by the grouping used is therefore in some degree misleading. If we regroup the distribution according to the signs of $m' - m$, we find:

Successes.

For this comparison $n'$ is 3, $\chi^2$ is 26·96, or practically 27, and $P$ is about ·000001, a value much more nearly in accordance with that suggested by the mean. Such a regrouping of the frequency distribution by the runs of classes that are in excess and in defect of expectation would appear often to afford a useful and severe test of the real extent of agreement between observation and theory.

In the second example the signs are fairly well scattered, the mean being in almost precise agreement with expectation, and the regrouping has a comparatively small effect. The regrouped distribution is:

Successes.
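The two probabilities quoted for the first dice example can be checked directly. The sketch below is an illustration with hypothetical names: it uses the complementary error function of the standard library in place of Table II., and the fact that for $n' = 3$ the ratio of integrals defining $P$ reduces to $e^{-\frac{1}{2}\chi^2}$.

```python
import math

def upper_tail(deviation):
    """Area of the normal curve beyond `deviation` standard errors (one tail)."""
    return 0.5 * math.erfc(deviation / math.sqrt(2.0))

# Deviation of the mean from expectation: 5.13 times its standard error
tail = upper_tail(5.13)
both_tails = 2.0 * tail

# Regrouped comparison: n' = 3 and chi^2 = 26.96, so P = exp(-chi^2 / 2)
p_regrouped = math.exp(-26.96 / 2.0)

if __name__ == "__main__":
    print(tail, both_tails, p_regrouped)
```

Both figures are of the same order of smallness, which is the point made in the text about the regrouped test.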


... say, when we fulfil the conditions of simple sampling, is uniform over the whole range from 0 to 1. Thus for a rough grouping into four classes the above series of trials gave:

P.

... zero. The same statement must hold good for every row. Hence, if $r$ be the number of rows, $c$ the number of columns, the number of algebraically independent values of $\delta$ is $(r-1)(c-1)$, and the tables must be entered with the value

$$n' = (r-1)(c-1) + 1.$$

The student will realise that this is a reasonable rule if he considers that when we take $n'$ as the number of classes, the comparison frequencies being given a priori, we are taking it as one more than the number of algebraically independent frequencies, since the total number of observations is fixed.

The following will serve as an illustration (Yule, ref. 5 of Chap. V.). Sixteen pieces of photographic paper were printed down to different depths of colour from nearly white to a very deep blackish brown. Small scraps were cut from each sheet and pasted on cards, two scraps on each card one above the other, combining scraps from the several sheets in all possible ways, so that there were 256 cards in the pack. Twenty observers then went through the pack independently, each one naming each tint either "light," "medium," or "dark."

Table showing the Name (light, medium, or dark) assigned to each of two Pieces of Photographic Paper on a Card: 256 Cards and 20 Observers. (Yule, ref. 5 of Chapter V., Table XXI.) Upper figure, observed frequency; central figure, independence frequency; bottom figure, difference $\delta$.

Name assigned to Lower Tint on Card.
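The rule $n' = (r-1)(c-1) + 1$ stated above can be illustrated with a short sketch. The function names and the small table used in the example are hypothetical (the figures of the photographic-paper table are not reproduced here); the independence frequencies are formed from the row and column totals in the usual way.

```python
def independence_values(table):
    """Independence frequencies: (row total) x (column total) / N for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi2_from_independence(table):
    """Return (chi^2, n') for divergence of an r x c table from independence."""
    expected = independence_values(table)
    chi2 = sum((obs - exp) ** 2 / exp
               for obs_row, exp_row in zip(table, expected)
               for obs, exp in zip(obs_row, exp_row))
    rows, cols = len(table), len(table[0])
    n_prime = (rows - 1) * (cols - 1) + 1  # value with which the P-tables are entered
    return chi2, n_prime

def differences(table):
    """Differences delta = observed frequency minus independence frequency."""
    expected = independence_values(table)
    return [[obs - exp for obs, exp in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(table, expected)]

if __name__ == "__main__":
    hypothetical = [[10, 20], [30, 40]]  # illustrative figures only
    print(independence_values(hypothetical))
    print(chi2_from_independence(hypothetical))
```

Because the differences sum to zero in every row and every column, only $(r-1)(c-1)$ of them can be chosen freely, which is where the value of $n'$ comes from.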

377 .SUPPLEMENTS 4225/785 — GOODNESS OF TIT.


Interpolating in the table of areas of the normal curve on p. 310, or taking the required figure directly from Table II. of Tables for Statisticians, we have:

Greater fraction of area for a deviation of ·84 in the normal curve : ·7995
Area in the tail : ·2005
Area in both tails : ·401

That is to say, the probability of getting a difference, of either sign, as great as or greater than that actually observed is ·401, agreeing, within the accuracy of the arithmetic, with the probability given by the $\chi^2$ method. The same result would again have been obtained had we worked from the columns instead of from the rows, and considered the difference between the proportions of white flowers for prickly and for smooth fruits respectively.

Example ii. (Data from ref. 6 of Chapter III., Table XIV.) The following table shows the result of inoculation against cholera on a certain tea estate:

The ratio of the difference to its standard error is therefore ·01853/·01025, or 1·808.

Greater fraction of normal curve for a deviation of 1·808 : ·96470
Fraction in tail : ·03530
Fraction in the two tails : ·07060

so that $P$ is ·07060. As before, both methods must lead to the same result.

An Aggregate of Tables. It may often happen that we have formed a number of contingency or association tables (more often the latter than the former) for similar data from different fields. All may give, perhaps, a positive association, but the values of $P$ may run so high that we do not feel any great confidence even in the aggregate result. The question then arises whether we cannot obtain a single value of $P$ for the aggregate as a whole, telling us what is the probability of getting by mere random sampling a series of divergences from independence as great as or greater than those observed. The question is usually answered by pooling the tables; but, in view of the fallacies that may be introduced by pooling (cf. Chapter IV., §§ 6 and 7), this method is not quite satisfactory. A better answer is given by the application of the present general rule. Add up all the values of $\chi^2$ for the different tables, thus obtaining the value of $\chi^2$ for the aggregate, and enter the $P$-tables with a value of $n'$ equal to the total of algebraically independent frequencies increased by unity; that is, take $n'$ as given by

$$n' = 1 + \Sigma (r-1)(c-1).$$

For the association table there is only one algebraically independent value of $\delta$. Hence if we are testing the divergence from independence of an aggregate of association tables, we must add together the values of $\chi^2$ and enter the $P$-tables with $n'$ taken as one more than the number of tables in the aggregate.

Thus from ref. 6 of Chapter III., from which the data of Example ii. were cited, we take the following values of $\chi^2$ and of $P$ for six tables that include that example. They refer to six different estates in the same group.

$\chi^2$. $P$.
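The normal-curve figures quoted for the two examples (deviations of ·84 and of ·01853/·01025 times the standard error) can be reproduced without tables. The sketch below is an illustration only; the function name is hypothetical and the areas are obtained from the complementary error function of the standard library.

```python
import math

def normal_areas(deviation):
    """Return (greater fraction, one tail, both tails) of the normal curve."""
    tail = 0.5 * math.erfc(deviation / math.sqrt(2.0))
    return 1.0 - tail, tail, 2.0 * tail

# Example ii.: ratio of the difference to its standard error
ratio = 0.01853 / 0.01025
greater_ii, tail_ii, both_ii = normal_areas(ratio)

# Earlier example: deviation of .84
greater_i, tail_i, both_i = normal_areas(0.84)

if __name__ == "__main__":
    print(ratio, greater_ii, tail_ii, both_ii)
    print(greater_i, tail_i, both_i)
```

The area in both tails is the figure to compare with $P$ from the $\chi^2$ method for a fourfold table.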

The association between inoculation and protection from attack is positive for each estate, but for only one of the tables is the value of $P$ so small that we can say the result is very unlikely to have arisen as a fluctuation of sampling. Adding up the values of $\chi^2$, the total is 28·40, and entering the column for $n' = 7$ (one more than the number of tables considered), we find:

$\chi^2$ = 28 : $P$ = ·000094
$\chi^2$ = 29 : $P$ = ·000061

whence by interpolation the value of $P$ is ·000081. We should therefore only expect to get an equal or greater total value of $\chi^2$, on random sampling, 81 times in 1,000,000 trials. We may, I think, go further: for all the observed associations are positive, and in six cases there are $2^6$ or 64 possible permutations of sign. We should only expect to get a total of $\chi^2$'s as great as or greater than this, and tables all showing positive association, not 81 times in 1,000,000 trials but 81/64 or, roundly, 1·3 times. $P$ for the observed event ($S(\chi^2)$ = 28·4 and all associations positive) is therefore only ·0000013. We can therefore regard the results as significant with a high degree of confidence.

Experimental Illustrations of the General Case. The formulae for the general case, as for the special case in which the frequencies with which comparison is made are given a priori, can be checked by experiment. The numbers of beans counted in each of the sixteen compartments of the revolving circular tray mentioned on p. 374 above were entered as the frequencies of a table (1) with 4 rows and 4 columns, (2) with 2 rows and 8 columns, and the value of $\chi^2$ computed for each table for divergence from independence. The observed distributions of the values of $\chi^2$ in 100 experimental tables are given in the columns headed "Observation." For the two cases we have

$$n' = (3 \times 3) + 1 = 10 \quad\text{and}\quad n' = (1 \times 7) + 1 = 8$$

respectively. Differencing the columns of $P$ corresponding to these two values of $n'$, we obtain the theoretical frequency-distributions given in the columns headed "Expectation" in Table A. It will be seen that the agreement between expectation and observation is excellent for so small a number of observations. If the goodness of fit be tested by the $\chi^2$ method, grouping together the frequencies from $\chi^2$ = 15 upwards, $\chi^2$ is found to be 2·27 for the 4 × 4 tables and 4·36 for the 2 × 8 tables, giving $P$ = 0·52 in the first case and 0·22 in the second.
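The arithmetic for the aggregate of six association tables can be followed step by step. The sketch below is an illustration with hypothetical names: it evaluates $P$ from the finite series that applies when $n' - 1$ is even, instead of reading it from the printed tables, then repeats the linear interpolation between $\chi^2$ = 28 and 29 and the division by the 64 permutations of sign.

```python
import math

def p_even(chi2, n_prime):
    """P for chi^2 when n' - 1 is even (finite-series form of the integral ratio)."""
    n = n_prime - 1
    if n < 2 or n % 2:
        raise ValueError("this sketch handles even n' - 1 only")
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    for r in range(1, n // 2):
        term *= half / r
        total += term
    return math.exp(-half) * total

# Six association (2 x 2) tables: n' = 1 + sum of (r - 1)(c - 1)
shapes = [(2, 2)] * 6
n_prime = 1 + sum((r - 1) * (c - 1) for r, c in shapes)

p_28 = p_even(28.0, n_prime)
p_29 = p_even(29.0, n_prime)

# Linear interpolation for the total chi^2 = 28.40, as in the text
p_interpolated = p_28 + 0.40 * (p_29 - p_28)

# Direct evaluation at 28.40, for comparison with the interpolated figure
p_direct = p_even(28.40, n_prime)

# All six associations positive: one of 2^6 equally likely sign patterns
p_all_positive = p_interpolated / 2 ** 6

if __name__ == "__main__":
    print(n_prime, p_28, p_29, p_interpolated, p_direct, p_all_positive)
```

Direct evaluation gives a value slightly below the interpolated one, since $P$ falls off faster than linearly between the tabulated points; the difference does not affect the conclusion.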

Table A. Theoretical Distribution of $\chi^2$, calculated from Independence-values, in Tables with 16 Compartments, compared with the Actual Distributions given by 100 Experimental Tables. (Ref. 50.) In the first case $n'$ must be taken as 10, in the second as 8.

$\chi^2$.

The theorem given for evaluating $P$ for an aggregate of tables is illustrated by the experimental data of Tables C and D. The values of $\chi^2$ for the 350 fourfold tables of Table B were added together in pairs, giving 175 pairs. According to theory the resulting frequency-distribution for the totals of pairs of $\chi^2$'s should be given by differencing the column of the $P$-table for $n' = 3$. The results of theory and observation are compared in the first pair of columns of Table C. Testing goodness of fit, grouping the values of $\chi^2$ 7 and upwards, $n'$ is 8, $\chi^2$ is 5·53, and $P$ is 0·60. Grouping the values of $\chi^2$ for the 350 experimental tables similarly in sets of three and summing, we get the observed distribution on the right of Table C, and the theoretical distribution by differencing the column of the $P$-table for $n' = 4$. Grouping values of $\chi^2$ 8 and upwards, and testing goodness of fit between theory and observation, $n'$ is 9, $\chi^2$ is 2·18, and $P$ 0·97.

Table C. Theoretical Distribution of Totals of $\chi^2$ (calculated from Independence-values) for Pairs and for Sets of Three Tables with 2 Rows and 2 Columns, compared with the Actual Distributions given by Experimental Tables. $n'$ must be taken as 3 in the first case, and 4 in the second.

Sum of $\chi^2$'s.
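"Differencing the column of the $P$-table" can be made concrete for the pairs of fourfold tables, where $n' = 3$ and $P = e^{-\frac{1}{2}\chi^2}$. The sketch below is an illustration only: the function name is hypothetical, and classes of unit width in $\chi^2$ are an assumption, chosen because grouping the values 7 and upwards then gives the 8 classes mentioned in the text.

```python
import math

def expected_by_differencing(n_totals, upper):
    """Expected frequencies of totals of chi^2 in classes 0-1, 1-2, ...,
    with a final class `upper` and upwards, for n' = 3.

    For n' = 3 the column of the P-table is P(chi^2) = exp(-chi^2 / 2);
    the chance of falling in the class k to k+1 is P(k) - P(k+1).
    Unit class width is an assumption of this sketch.
    """
    p = [math.exp(-k / 2.0) for k in range(upper + 1)]
    classes = [n_totals * (p[k] - p[k + 1]) for k in range(upper)]
    classes.append(n_totals * p[upper])  # final class: `upper` and upwards
    return classes

# 175 pairs of fourfold tables, values of 7 and upwards grouped together
expectation = expected_by_differencing(175, 7)

if __name__ == "__main__":
    for k, freq in enumerate(expectation):
        print(k, round(freq, 2))
```

The same differencing applied to the column for $n' = 4$ would give the theoretical distribution for sets of three tables, though $P$ then no longer has so simple a form.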

Table D. Theoretical Distribution of Totals of $\chi^2$ (calculated from Independence-values) for Pairs of Tables with 4 Rows and 4 Columns, compared with the Actual Distribution given by Experimental Tables.

... being given in the second column. Taking the two groups at the bottom of the table together and testing goodness of fit, $n'$ is 5, $\chi^2$ is found to be 4·11, and $P$ is 0·39.

Sum of two $\chi^2$'s.

Table of the Values of $P$ for Divergence from Independence in the Fourfold Table.


Soc. T. 159.. Bennett. New York. Soc. 1916-17. 455. 1921. J. DooDSON. A. Ixxxiii. 1. The Maemillan Co.— ) ) — ) SUPPLEMENTS— ADDITIONAL KEFEKENCES. Gaz. Stat. p. p.. 93. 40.. Index-nvunljers (p. Stat.M. '' (4) RiTOHiE-ScoTT.. "Prices. Stationery Office.. Soy. Biometrika. "The Correlation Coefficient of a Polyohorio Table." Jour. and Cost of Livingin Australia. Frances.. 121. vol. 1921. W." Quart. 1916." Jour. ' On measuring association with special reference to 4 x 3-fold classifications.. H. vol. vol. (Gives a proof of the relation noted on p." . Pub. Ixxxiv. M... Econ.. L. Report (Cd.. the General Theory of Multiple Contingency with Special Reference to Partial Contingency. (14) March. For the student of the cost of living in Great Britain the following are useful : (15) "Labour Gazette Index Number: Scope and Method Lab. 1900-12. in Frequency Curves. Roy. (Considers various methods of ' Pearson. March 1920 and Feb. "The Measurement of Price Chsinges." Metron. tlieir Development and Progress in many Countries. Working Classes. p. 73). March 1921." Jour. 6). S<(rfjs<ics. p. Contingency (2) (p.. A. Report No. 1919. (13) Persons. F. vol. No. p. W. mainly on the progress of official statistics. A. BowLET." Biometrika. 1918. 429. vol. Stat. Wood. classification subjected to various conditions . 1920. 130). vol. Kabl." Biometrika. 130). Median and Mean. The Mode (5) (p. 1916. (12) Fishes. p. Ass." Rev. xi. 8980. p. Ixxxii. (A collection of articles. Roy. : There are useful discussions as to method in the following (6) (7) (8) (9) Knibbs. vol. 533. Roy. "The Best Form of Index-number.. 343. Labour and Industrial Branch. (11) Flux. "Fisher's Formula for Index-numbers. 1. vol. (3) Pkaeson.. i. 145. 167.. vol. of Compilation. p. "The Measurement of Changes in Cost of Living. Ixxvii. xi. Soc.. Arthur "Relation of the Mode. Price-Indexes. Toohee. Stat. Amer. and J.. iii. Karl. 1918. The History of Statistics. ADDITIONAL EEFERENCE3. xii. Soc. 1913-14.. 
"On Criteria for the Existence of Differential Death-Rates. written by a specialist for each country." Commonwealth of Australia. (edited by). "Les modes de mesure du mouvement general desprix. vol." Biometrika.." Jour.. arithmetical examples are provided in the undermentioned paper. 1918. 103. xi. 1921. p. H. "The Theory of Measurement of Changes in the Cost (10) of Living. 1921. L.) T. 1912. Stat. KoBEN.. "The Course of Real Wages in London. (An extension of the method of contingency coefficients to p. Cost of Living Cohimittee. Irving. 4. p. L. 1918).. History of (1) 387 Official Statistics (p. 6.

L.. Econ. etc. 1921) ciitical notices of the same in the Labour Gazette for Aug. p. .. 1921. XV. : History (p. (p. Oxford. Prices and Wages in the United Kingdom. No. Ass. 1912. London. vol. 188).. 84... Pub. "General Ability. p..— 388 (16) THEORY OF STATISTICS. 298. Problem. Soe. vol... 225). Svenska Vetenskapsakademiens Handl. 6578. xi. L. 1917. D. "On foTcnings Tidskrift. p. Hart.. Sept. Correlation (18) xiii. Logarithmic Correlation. Stat. 51. (22) Pearson. p. C. p." Biometrika... p. " On the Application of Goodness of Tit Tables to test Regression Curves and Theoretical Curves used to describe Observational or Experimental Data. [Cd. " Final Report on the Cost of Living of the Parliamentary Committee of the Trades Union Congress" (The Committee. 1917. 33. usually employed instead of "correction. Fit of Regression Lines (19) (p..] (1911). vol. Pearson. 1921. 1920 (Clarendon Press). WiCKSELL.. 854. xiii. Karl. S. U. J. (17) BowLBY. "On the Time-correlation 4. and between the Sum of Two or More Components. Deaths. A. (p. and Sept." Biometrika. WiCKSELL. 32 Eccleston Sq. K. vol. 1920. "Notes on the History of Correlation." Metron. Pearson. Harris. D.. (26) Yule. G. 1921 and review by A. Jour. American Stat. 226).. i. 1917.. p. (Criticises and extends the work of Slutsky. Jowr. 1916-17.. 237.. Miscellaneous 226).. S. ." and a "potential or standard death-rate " is termed an "index death-rate": for the methods of standardisation in present use see is The term "standardisation" now (24) Seventy-fourth Annual Report and Marriages in England and Wales Correlation (26) : of the Registrar General of Births. 497. (27) S. 209). its Existence Brit. 209). Correlation (23) : Effect of Errors of Observation.. Ixxxiv." Biometrika. Psychology. with an Application to the Distribution of Ages at First Marriage. K. Boy. and and Nature." Meddelande fran iMnds Svenska AktuarieAstronomiska Ohservatorium. No. Bowley. vol." Jour. (21) WiCKSBLi. 1913." Quart. v. Spearman. 1921. 
and the Sum of the Remaining Components of a Variable. vol. 25. Iviii." Kungl.. 1921. "The Correlation between a Component. Bd.) Correlation in Case of Non-linear Regression (20) (p.. Arthur. Bernard. "An Exact Formula for Spurious Correlation." Correction or Standardisation of Death-rates (p. vol. 1914-20. "The Correlation Function of Type A. . D. " On a General Method of Determining the Successive Terms in a Skew Regression lAne.

. tome xx." Explanation of Deviations from Poisson's Law in Practice. xiii.. A. V. tion of some Bacteriological Methods employed in Water Analysis.. and G. L. Part (30) IssEKLis. 266. . vol. Stat. Ixxxi. (Continues the discussion initiated by the paper of Miss Whitaker." Genetics. the undermentioned deal with the forms which are suitable for the representation of particular classes of data. (p. 389 252)... Trails.." . Stat. J. the Statistical InterpretaGkbbnwood.. especially statistics of epidemic disease. . C. ocxvi. 1918. 1919. Mathematical Theory of Statistics " (1912). 6. VON. Series A. cited on p. Sampling of Attributes (31) (p. vol.. (29) Vj{-i--rls){\-A) and ris'-^a/ N/(l-r?3)(l-4). L.. (Applies a criterion developed from Poisson's limit to the discrimination of water analyses numerous arithmetical examples. vol. 1906-12. p. Karl..'' (Completes Bhil. x. 1915.. 1921. should consult the following : (38) Charhbr. ix. xi. "Fluctuations of Sampling in a tion." Proc." Allgemein. Random Occurrences in Space and Time when MoKANT.. vol. Udny Yulb..) The advanced student who desires to compare the merits of different frequency systems proposed. 1915. 27. 1916. Roy. vol.. 1916. cal Data. 2^ livr. p. No. Y.. 1916-17. (39) Edgbwobth. 1918. 1917. 50.. ii. 105. 211. de Stat. 1916. Frequency Curves (37) Pearson. vol. Soc. 225. p. Mendelian Popula- The Law of Small Chances (32) (p." Biometrika. . Soc.. and Partial Correlation Ratio Kblley." Bull. Karl. p. 273). de rinstitui Int. p. followed by a Closed Interval. p. 36. 492. the description of type frequency curves contained in references (1) and ' (3) of p. Boy. p. ' (p." Jour. 466 65. Bm/.. 273). "On the Mathematical Representation of StatistiIxxx. " On the Partial Correlation Ratio. BoRTKiBWicz. "On . (Tables giving the values of L. "Tables to facilitate the Calculation of Partial Coeflicients of Correlation and Regression Equations." Biometrika. xci. F. BoBi'KiBWioz. "An "On See also reference 40.. 314)." Bulletin. Ixxix. 
1916. vol. pp. p. (33) (34) (35) (36) L. . issued from the Astronomical especially "Contributions to the (These papers are concerned with the general theory of frequency systems . Arch. 429. 599. M. 322. p. " Realiamus und Formalismus in der mathematisohen Statistik.BiomciJ'ifca. xvi. "Student. vol. "Ueber die Zeitfolge Zufalliger Ereignisse. Numerous papers Department of Lund. T. Dbtlefsbn. iii. the Partial Correlation Ratio. 309.) ) ) — SUPPLEMENTS Partial Correlation (28) —ADDITIONAL llEFERENCES.. Second Supplement to a Memoir on Skew Variation. Soc. "On Numerical. Series A. p." Jov/mal of Hygiene.. 411 . 273.. VON. vol. L.) Peakson. of the University of Texa s.

1922..) ) ) 390 (40) THEORY OF Brownlee. Ixxxv. "On the Value of a Mean as calculated from a Sample. H. Ixxxi. 97 of this paper. Boy. 315 and p. Soc. Soc. (51) IssERLis. vol. 28. Soc. . Sir Ronald." Proc.. U.. xcii. vol.. 204. Soc. Roy. . 1916-17. vol. Greenwood.. vol. (The appendix to this paper summarises the author's results and those of Sir Ronald Ross . (52) SOPER. p. H. Ix. p. J. " An Enquiry into the Nature of Frequency Distributions representative of Multiple Happenings. "The Mathematical Theory of Random Migration and Epidemic Distribution. p. Stat. Theory of Probabilities to the Study of a priori Pathometry. A.. vol. vol. p." Trans.. Sect. 87.." Phil. Epidemiology aiid State Medicine. Actuarial Soc. p. and III. vol. Goodness of Fit (47) (p. Soc. i. " On the Application of the x^ Method to Association and Contingency Tables. xxx. Karl. p. (A modification of the goodness-olfit test to cover such statistics as those indicated by the title. D. Sir Ronald.. Mag.." Biometrika." Appendix A (Contains a to vol. 1918. 1922. xi. p. G." Jour. Stat. 369. (50) YcLE. 95. xoiii.. the Study of u. Soc. Edin. p. "The Mathematical Theory of Population. (44) Knibes.) Application of the Theory of Probabilities to (42) Ross. vol. "On the Distribution of the Correlation Coefficient in Small Samples. 262. pp. (Numerous graphs of mortality rates in different xviii. STATISTICS." Journ. J.. 85. and Others. Hudson..) from Contingency Tables. ^ p.." Jour. p. Stat. D (6th series)." Jour. Pearson. U. America. p. Karl. Ixxxiii. vol. 1916.. 1917. Ixxxv. and Hilda P. vol. A. of Census of the Commonwealth of Australia. 311. Soc. 1920. with experimental illustrations. Boy." Biometrika. R. vol." Proc. x. vol. (63) Pearson.. 292. 1916. "On the Interpretation of and the Calculation of P. Boy. 367).. E. 255. Probable Errors : General References (p.. 1916-17. and Yule. "Mortality Graphs. Stat. 1918. vide infra. H. Karl. 
"Certain Aspects of the Theory of Epidemiology in Special Reference to Plague. p. G. (48) Pearson. "On the Probable Error of Biserial ?." Pts.. 355). 328. xi. "An "An statistics.. (49) Fisher. with particular reference to the Occurrence of Multiple Attacks of Disease or of Repeated Accidents. Boy. Boy. A. vol. "Multiple Cases of Disease in the same House. 1917. (45) (46) MoiR." Proc. Boy. 212 and 225. M.." Biometrika.. classes and periods. 1910-11.. L. II. 1913. " On a Brief Proof of the Fundamental Formula for testing the Goodness of Fit of Frequency Distributions and on the Probable Error of P. priori Pathometry. that a full proof [of the general theorem as ap])lied to contingency tables] seems he has convinced me that his proof covers the case. Boy. (After correspondence with Mr Fisher I wish to withdraw the statement on p. full discussion of the application of various frequency systems to vital (41) Brownlee.. Soc. Application of the (43) Ross.. xxxi. G. 75. Medicine. Proc. still to be lacking : See also reference 19.

of Illinois Agr. and W." Proc. Supplement 7.. "The Element of Uncertainty in the Interpretation of Feeding Experiments. (58) 1920.. "On the Probable Error Coefficient of xi. 6. of work has been done on this particular branch of the and the following references may be useful Berry. and O'Brien.testing. B. Fisher." Jour. Wood.. of the Partial Correlation Coefficient in Samples of Thirty. 430. iii.. E. 1039.Siomein'fe. A. 1905. (69) Errors of Sampling in Agricultural Experiment. J. (55) Editorial. pp.... Wood. Agr. 4.. E.U. i." Proc... H. "An . 1910. A. : A good deal (60) (61) with Cross-bred Pigs. 391 of a vol. la chance. J. 263.." . Agr. ''A Method of Correcting for Soil Heterogeneity in Variety Tests. vol. vol. Science. A.. Roy. No. a. H. B." Jour..l. W. Wood. 283. xiii. vol. 140 and 185." Univ. T. A. 1913. 1916-17." Jour. M. etc. II.. 1911.. of Agronomy. F. 89. p. 1921. T. A. "Variation in the Chemical Composition of Mangels. Agr. "On the Probable Errors of Frequency Constants. Sci. S. T. Flammarion. . S. Experimental Determination of the Distribution (57) BispiiAM. vol. and Kakl Peakson." Journal of the Board of Agriculture. vol.. Mitchell. "Errors in Feeding Experiments subject." Pt. B. p. 113. des prohahilitis. and Raymond Pearl. Young. Baohblier. Science.. (With an appendix by ' ' Student " desftribing the chessboard method of conducting yield trials. xiii. H. D. horticultural work. L. B. A. xiii. p. p. Pickering. 1921. J. 3... p. Soc. D. Bulletin 165. Science. p. 1916.." Metron. 1914. Bebuy. Harris. L. Andrew. and vol. B. Agr. xi. 16.. vol. vol. 361). Lloyd. 1920. "The Experimental Error of Field Trials. i. and R. etc. 215. 1916. milk. "On the Mathematical Expectation of the Moments of Frequency Distributions." Jour. 275. p. Mercer. Gauthier-Villars.. iv. T. Agr. vol. S. Agr. p. "The Interpretation of Experimental Results. 1911. v. R. and F." Jour. Research. Amer.. 1916. (56) "STtTDENT. Geindley. L. A. p. 
" On a Criterion of Substratum Homogeneity (or Heterogeneity) in Field Experiments. 1918-19.) — SUPPLEMENTS (64) —ADDITIONAL REFEKENCES. Russell... Stkatton. Biometrika. J." Jour. "On the Probable Error of a Coefficient of Correlation deduced from a Small Sample. p.." Amer." Biometrika. Science.. (62) Hai. 1921. xovii. D. 144. feeding experiments. Paris. viii. "On the Probable Error of Sampling in Soil Surveys. T.. Collins. Surface. p. vol. Robinson.. and H. (72) Baohklier. (63) (64) (65) (66) (67) (68) (69) (70) Works on Theory (71) of Probability. p. p..) Lyon. vol. vol. iii. Exp. iii. Hall.. R. 1921. W. Le jeu.. Calcul tome i.. vol. G. TcHOUPROFF. vol. M.. (App. Contingency without Approximation. 1912. p. Science. p. Naturalist. and A." Jour. 1910. "The Interpretation of the Results of Agricultural Experiments. Agr. 107. "Some Experiments to estimate Errors in Field Plat Tests. xii. Wood. et le hasard. W. 225. Paris. "The Feeding Value of Mangels. TIL. Station. (Contains a collection of papers on error in field trials." "An Experimental Determination of the Probable Error of l)r Spearman's Correlation Coefficients." Biometrika. 417. 1911. Soc.

An inexpensive reprint of Laplace's Essai philosophique (ref. 17 on p. 361) has been published by Gauthier-Villars (Paris, 1921) in the series entitled "Les maîtres de la pensée scientifique."

(73) Bowley, A. L., Elements of Statistics, 4th ed.; P. S. King, London, 1920. (This edition has been much extended in Part II., "Applications of Mathematics to Statistics": the two Parts can now be purchased separately.)
(74) Brunt, David, The Combination of Observations; Cambridge University Press, 1917.
(75) Czuber, E., Die statistische Forschungsmethode; Seidel, Wien, 1921.
(76) Elderton, W. Palin, Addendum to Frequency Curves and Correlation; London, 1917 (Layton).
(77) Fisher, Arne, The Mathematical Theory of Probabilities and its Application to Frequency Curves and Statistical Methods, vol. i.; New York (Macmillan), 1915.
(78) Forcher, Hugo, Die statistische Methode als selbständige Wissenschaft; Leipzig, 1913 (Veit).
(79) Jones, C., A First Course in Statistics; Bell & Sons, London, 1921.
(80) Julin, A., Principes de statistique théorique et appliquée: tome i., Statistique théorique; Paris (Rivière), Bruxelles (Dewit), 1921.
(81) Keynes, J. M., A Treatise on Probability; Macmillan, London, 1921.
(82) West, C. J., Introduction to Mathematical Statistics; Adams & Co., Columbus, 1918.

ANSWERS TO, AND HINTS ON THE SOLUTION OF, THE EXERCISES GIVEN.

CHAPTER III.

1. (a) positive association, since (AB)₀ = 1457. (b) negative association, since 294/490 = 3/5, 380/570 = 2/3. (c) independence, since 256/768 = 1/3, 48/144 = 1/3.

2. Percentage of Plants above the Average Height.

3. Deaf-mutes from childhood per million among males 222, females 133: there is therefore positive association between deaf-mutism and male sex; if there had been no association between deaf-mutism and sex, there would have been 3176 male and 3393 female deaf-mutes.
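The comparisons of proportions quoted in these answers (294/490 against 380/570, and 256/768 against 48/144) can be checked mechanically. The following is a minimal sketch and not part of the original text: the function name `classify_association` is hypothetical, and reading the first pair of counts as the A's among the B's and the second pair as the A's among the not-B's is an assumption made for illustration.

```python
from fractions import Fraction


def classify_association(ab, b, a_not_b, not_b):
    """Compare the proportion of A's among B's with that among not-B's.

    Assumption (not stated in the source): the first pair of counts
    refers to the B's and the second pair to the not-B's.
    """
    p_b = Fraction(ab, b)
    p_not_b = Fraction(a_not_b, not_b)
    if p_b == p_not_b:
        return "independence"
    if p_b > p_not_b:
        return "positive association"
    return "negative association"


# Both proportions reduce to 1/3.
print(classify_association(256, 768, 48, 144))
# 3/5 compared with 2/3.
print(classify_association(294, 490, 380, 570))
```

Exact fractions are used so that equal proportions compare as equal, which floating-point division would not guarantee in general.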

(2) .

2nd qual. . that the than Mean.p. 0-6391. (2)916.. 146-25. CHAPTER 1. . (r = 18-0. V. (2) Jf=73. (3) 88-9. 2^d. upper quartile =£54 -6. Note that x cannot exceed J.|>0. BO and '224 .. (Note mean and the median should be taken to a place of decimals further is desired for the mode the true mode. 6|d.pq. no inference is possible from positive associations of ./s. £35-6 approximately.5x . ratio 115-2. (1) 116-0.) 4. § 6. ninth decile 5. e/3. 0-29. 7. Note that x cannot exceed J. CHAPTER 2. 154-67 lb. 7. Approximately lower quartile =£26-1. the standard deviation is the higher the coarser the grouping. = £94 Skewness. Mode (approx. is Mode (approx. proof is given in Chapter XV. Ratios: m. Mean. 1200.) 150-6 lb. 142-5. 8. 3. VIII. 156-73 lb.. Median. 2. 1st qual.i4C and . If the terms of the given binomial series are multiplied by 0.). (4)115-2.507. whence $=12-95.. subject to the conditions y-^O. n. A. The assumption that observations are evenly \/n. . 216-6. VII. 6.-2a.. =0-77.) 396 No THEORY OF STATISTICS. upper quartile 168-4.2-l) AB (3) >i{Zx + 23?). 0-36. Lower quartile 2.) 6330. Median. Standard deviation 21-3 lb. 0-68. (3) (Note that while the mean is unaffected in the second place of decimals.d. Geometrical means 77*2.963. 2.. The proof is given in Chapter XV. 5. . 89-0. (r=17-3. 20.= 17-5..d. (2) Means 77-4. 3. <lC. .d. (True mode 0-653. CHAPTER 1. 100. B. inference is possible from positive associations of inference is only possible from negative associations if x lie between "183 . 10s. § 6. As in (2). an inference is possible from negative associations if a.. and •215 .. ratio 114'9. 200. Mean deviation 16-4 lb. o. 1. (1)921. 9s. is 151 '1 lb. 3 note that the resulting series is also a binomial when a common factor [The full is removed.<i(6a.2.. 3. An and BC. 2. 0-651.] CHAPTER = 0-61. 4. distributed over the . lie between '177 .1.. M=n% 6. (1) j¥=73-2. VI.. found by fitting a theoretical : frequency curve.

n-2. (The latter. X^ 9-i. : 2. " 11. and the former consequently probably too high. (Tx-irn The If r is small. (against 1240 per cent. ^ «j(«(-F«) d is the important term. 0"43.2s=2-64. ffs-i2=70-l. expressed as a fraction of the class-interval.. XIV. ir3=3-09: r. rij3=-f0-60.8=-0-13. X=0-5r+0-5. The 4. CHAPTER XII. r=l-3X-l-l-l. No effect at all. mean value of the errors in variables is is d. <2=077. 8.) 1. but it does not hat the second term would become the most important in practical cases. % CHAPTER 1. Cf.is=0-694. 3 for out-relief ratio. 0=2/3. <rx IX. r=-l-0'81. and in the weights the value found for the weighted mean true value -i-d-r. corrected standard-deviation is 0'9954 of the rough value. of course. §10.3 '37 Xj -h -00364 = . is given its proper sign. fa = 579. (2) If the down from symmetry. X. ras= -HO-759. except for the interval in which mean or median lies : for that interval the sum is Wj (0'26 +d^). which can be independently calculated. Chap. TO EXERCISES GIVEN. 397 intervals does not affect the sum of deviations. against 2'572 in. Estimated true standard-deviation 6 '91 : standard-deviation of fluctuations of sampling 9 '38. rm= -0-436. and In this expression d is. 12. 2 for pauperism. 1. Notice that the n^ and of this question are not the same as the iVj and N^ of § 16. CHAPTER XL 1'232 per cent. seem probably able errors in the weights may be of consequence. 3. 5.a= -fO-OOr. 58 per cent.— ANSWERS the ETC. ru. Using the subscripts 1 for earnings. 9 -31 -1. 0-30. ffy=2-280. hence the entire correction is d(ny -jij) + jj2(0-25 +iP].) 2'556 in. The others may be 10. is too low. (1) written e. = l-414. and hence errors in the quantities are If r become considerusually of more importance than errors in the weights. 2.

and the divergence is 5-4 times this. correlation of order 4. . The actual difference is 1'7 times this. »-]2. 0-3. n-2 Hence if r be negative. The standard deviation of the proportion is 0-00179. The diflFerenoe might therefore occur frequently as a fluctuation of sampling. The actual 8. •/•g4.. therefore.24 «ri. (r (t <r (c) 3.f2+0-587 Jfg + O-OSiS X^. 0-4.. fl: (b) . -0-149. ilf=3-47. 49-2. or Case III. M=6. 5..: (^ j3)/(3) = 80 "0 per cent. 71=12-0 :p = 0-454..3= '"l2-3 -1. 2. = 5"23. r]4. (a) (AB)/(S) = 69-1 per cent.24= +0-803. r.3 = 723.1= = '-13'a = '-23. and might. = l-40. 7i=110-4. and would frequently occur as a fluctuation of simple sampling. Standard deviation of simple sampling 23 -0 per cent. Xi = 53 + 0-127 . ir=l-732. but . 7-13. S. (a) Theo. Difference 10-9 percent. and therefore almost certainly . The test can be applied either by the formulae of Case II. p = l-(T^/M. (r=l-225: o- „ . actual actual Theo. 0-2. and thence €.2= 12 "9 per cent. seem to indicate any real variation. Theo. Case II. Difference from expectation 7 "5 standard error 10-0. (A)/N=S7'6 per cent. iIf=2-5. standard-deviation does not. = l-323 : : = 1-26.^. cannot exceed (numerically) l/(n -J-12. 103 = 105 -4. 10. n^M/p :^=0-510. »•«. J!f=3-5. the cannot be numerically greater than unity and r 1). rather infrequently. M=50. = = 3. (r = l-14. '•23i4= -0-433.23i = 9-l?. r = 5 Actual if= 50 -11. and the difference from expectation 18.134 'u 3J= +0-680.significant. 5. 6.1= — !• +1. ir = l'7S2 : Actual Jf=6-n6.iB= +0-397. : CHAPTER Row. 13= -0-553. XIII. . CHAPTER 1. 9. . There is no significance.398 2- THEOKY OF STATISTICS. XIV. : {A$)l(. is taken as the simplest. and thence ei2=3-40 per cent.i2= 12-5. M=2-97. occur as a fluctuation of simple sampling. (Ji)/iV=71'l percent. only fluctuations of sampling. (i) {AB)/{B) = 70-1 per cent. = l-llS Actual Jf= 2-48. M=S.$)-6i-Z per cent. The actual difference is less than this. Difference 5-8 per cent. 
The standard deviation of the number drawn is 32. 4. The correlation of the pth order is r/(1 + pr).
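The expression for the correlation of the pth order can be checked numerically. The sketch below assumes, as the answer appears to, that every pair of variables has the same total correlation r, and it applies the usual first-order partial-correlation formula repeatedly. The name `equal_partial` is illustrative and does not come from the source.

```python
from fractions import Fraction


def equal_partial(r, p):
    """Partial correlation of order p when all total correlations equal r.

    When every coefficient of a given order equals rho, the formula
    (r12 - r13*r23) / sqrt((1 - r13^2) * (1 - r23^2)) reduces to
    (rho - rho^2) / (1 - rho^2); that step is applied p times.
    """
    rho = Fraction(r)
    for _ in range(p):
        rho = (rho - rho * rho) / (1 - rho * rho)
    return rho


r = Fraction(1, 2)
for p in range(4):
    # Each step agrees with the closed form r / (1 + p*r).
    assert equal_partial(r, p) == r / (1 + p * r)
    print(p, equal_partial(r, p))
```

For negative r the denominator 1 + pr shrinks as p grows, which seems consistent with the remark in these answers that a negative r cannot numerically exceed 1/(n - 1).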

CHAPTER XV.
(1)

6. Suppose every horizontal array to be given a slide to the right until its mean lies on the vertical axis through the mean of the whole distribution: then suppose the ellipses to be squeezed in the direction of this vertical axis until they become circles. The original quadrant has now become a sector with an angle between one and two right angles, and the question is solved on determining its magnitude.

r      n=100    n=1000
0.0    0.1      0.0316
0.2    0.096    0.0304
0.4    0.084    0.0266
0.6    0.064    0.0202
0.8    0.036    0.0114

0.24 lb. of the standard-deviation : error of the median. 50. 3. In fig. 0.18 lb. 4. or 0.76 per cent. the standard error of the semi-interquartile range is 1.23 per cent. upper Q. standard 4. less than the standard error 0.34 lb. frequency 1554. 5. . 1. standard error 0.28 lb. 17 per cent. 2.. standard error 0.26 lb. Lower Q. frequency 1116. of that range. 00196 in. Estimated frequency 1472. CHAPTER XVII.
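The decimals given against r = 0.0, 0.2, 0.4, 0.6, 0.8 in the answers above (0.1, 0.096, 0.084, 0.064, 0.036 for n = 100 and 0.0316, 0.0304, 0.0266, 0.0202, 0.0114 for n = 1000) agree with the large-sample standard error of a correlation coefficient, (1 - r²)/√n. The short check below assumes that this is the formula intended; the function name `se_correlation` is illustrative only.

```python
import math


def se_correlation(r, n):
    """Large-sample standard error of a correlation coefficient: (1 - r^2) / sqrt(n)."""
    return (1 - r * r) / math.sqrt(n)


for n in (100, 1000):
    row = [round(se_correlation(r, n), 4) for r in (0.0, 0.2, 0.4, 0.6, 0.8)]
    print(n, row)
```

The standard error falls as r approaches unity and is reduced by a factor of √10 when the sample is made ten times as large.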

of classes. at death of certain (table). 35-36 . constancy of difference from independence values for the secondorder frequencies. 2. Gottfried. 32-33 . and offspring. 42-43. . 164 .. refs. table. Aohenwall.. experiment. 189. 177. 216-217 . 333. Pearson's coefficient case of . 198 . refs. 25-59 def. refs. of husband and wife . of estates in 1715. eye-colour of father and son. errors in. 30-35 . 54-56. 44 . 10-11. generally.. 48-51 total possible . number of. See Earnings. 78 . 391. 90-102 relative positions . Ages. Airy." 144. parent. degrees of. of dweUing-houses 83 . 100 diagram.. 45-46 . 0.. metic. 44-48 . women . Agriculture. 173 constants (qu. generally. Eefs. O.order frequencies. 208. use of terms " error of mean square " and " modulus. 29-30 . 333 . arithmetical treatment. followed by citations of the authors' papers or books in the Usts of references. illusory or misleading. The subject-matter of the Exercises given at the ends of the chapters has been indexed only when such exercises (or the answers thereto) give the constants for statistical tables in the text. 51-54 refs. 39-40. use of ordinary correlationcoefficient as measure of association. inoculation. 101. Sir G. in ignorance of third. 46-48. citations in the text are given first. 42-59. 360. def. 61. In the case of authors' names. Association. 401 26 . Anderson. 236-237. INDEX. of grandparent. hair and eye-colour data cited from.. Accident.. See Mean.. 37-39 .] Ability. deaths from (law of small chances). general. Abriss der Staatswissenschaft. 3). Array. 204-205. correlation difference method. diagram. arith(table).. testing. testing by . ). B. 377-378.. colour and priokfiness of Datura fruits. Aggregate. defects in schoolchildren. coefficients of. Ammon. deaths and occupation. 15. Association. complete independence. for n attributes. del. in all such oases the number of the question cited is given. 52-53 . 379381. (correlation). comparison of percentages. deaf -mutism and imbecility. 28 . 
Annual value examples deaths and sex. standard-deviation of. Asymmetrical frequency-distributions. 57. 53-54 . 265-266. 33-34.. in normal correlation. 36-37. 40. Agricultural labourers' earnings. refs. 159 56-57 based on normal correlation (refs. 319-321. or theoretical results of general interest . Theory of Errors of Observation. [The references are to pages. 388.. the problem. total and partial. 34-35 eye-colour : . partial. Arithmetic mean.

. weight. time-distributions. . Axes. W.. 121-122. H.... cost of living. A.. def. calculated series for different values of p and experimental n. 149-150.. mean. effect of errors on an average. 353. 273. 13-14 .. F. Barometer heights.. refs. 297-299 refs. 390. stature. Laws of Tlwught. Address by 354. J. data cited from. 360. 369. Bernoulli. law of small chances. 107 . forms of. Attributes. Caktd des probabilites. Beeton. . 357. E.. ref.. positive and negative. 78... . .. Measurement of Groups and 354 . . 129-130. 1 and qu. 25-59. cost of living. 313. 193. 273. genesis sampling of attributes. generally. refs. 274. (epidemiology). consistence of class-frequencies. Bispham. Berry. Median.. 226. 402 of THEOKY OF STATISTICS. W.. Boole... Bowley. Refs. refs. ultimate classes.. 209.. 291-293 295 of. 336-337.. 391. A. generally. measures of. 321322. errors in feeding experiments.. L. W. British Association. 391. theory of. Calcul des 391 Lejeu. Bateman. 97 means.. .. Charles. Averages. law of small chances. association of. Mode. 360. 188. Miss M. Reports on index-num130-131. tests for linearity of regression. 113-114. Elements of Statistics. 106-132 . frequency curves coefficient. refs. . 88. illustrations sampling of. principal. refs. 343. 354 360. refs. 259 (qu. Oours eUmentaire de statistique. 122. 279-281. A. 332. diagrams.. mechanical method of forming a representation of series. 391.. 294. Ueber Korrelalion. Theorie des probabilites.. Borel. Bias in samplin<j. 387. 360. in correlation." Binomial in 291-300 . Asymmetry in frequency-distributions. 360. Elementary Manual of 360 . graphic method of forming 371 a representation of series. L. L. refs. 95. correlation. errors of sampling in partial correlations. T. refs.. W. . data cited from. Bravais. Brownlee. 354 . Brown. " statistics. 388. 377-381 (see Association) word of. 107. 391 .. .. Baohblibe.. Booth. et le L. Bertillon. refs. 17-24 (see Consistence) . hasard. J. refs. on pauperism... J. . G. 
variation in mangels.. use of 1.... .. see Stature. ref. . probabilites. 96 diagram. J. 1-59. refs. Statistics. la chance etc.. Barlow. 9-10. J. 366 . W. notation. 299-300 deduction of normal refs. von. 314. data cited from. probable error of contingency coefficient. 389.. Ars Conjectandi. 301-302 258. 109 refs. 252. . and modes. def. P. Blakeman. Bortkewitsch. Bowley on sampling. F. 6. 10 order and aggregate of classes. 377-378. 1914-20. refs. tables of squares. median. . 356 . medians.. 360. R. curve from. 108 . Weight bers . A. refs. 7. Bertrand. J. index correla- tions. Series. refs. 23. 354. Bielfeld. 389 . refs. 195. on sampUng.. 12 positive classes. desirable properties of. 37. L. 261-262. Bateson. 67. J. refs.. 392 Bennett.. Betz. 107108 . and mode in. 254-334 (see Sam- pling of attributes). See also Frequency-distributions. L. table.. effect of experimental errors on the correlation226 TTie Essentials of Mental Measurement.. direct determination of mean and standard-deviation. See Mean. 14-15 . 387 Prices and Wages. J. Brown.. law of small chances. refs. 295-297. von.. 2). Baron series. 10-11 . average in sense of arithmetic mean.

generally. deaths in. INDSX. 208. 18-19 . 389. treatment of table by coefficient of contingency. 10 . Chance. 273. Colours. 175-177. 21-22. standard error of (refs. 104. 60-74. Correction of correlation-coefficient for errors of observation. order of a class. 197-199. 177. of success or failure of an event. 76 homogeneous and heterogeneous. refs. standard error.. standard-deviations of arrays. 389. 157—253 . law of small. 157. 17-24 . 388 elementary methods . 115.. 149. 9 .deviation for grouping of observations. Census (England and Wales). def. def. classifica- tion of ages. (qu. 201-202 . for age and sex-distribution. of class-frequencies for attributes. refs. generally. 64-67 testing . ous classification. 105 . class-frequency. D. 10. Consistence. isotropy. 8. method. construction of tables. fluctuation method. 174. 116. regressions. tabulation of infirmities in. correlation . 12 .. 170def. 328-331 . Class. 202-204.. refs. 204.. difference method. 208-209. (qu. 24. etc. rough methods for esti- mating coefficient. . C. 82-83 . 7.foragroupedtable. 8. 79-80 desirability of equality of intervals.. refs.). of contingency. for one or two attributes. of correlation. V. refs. refs. partial or multiple contingency (refs. 362-363. in sense of complex causation.. Beatrice M. 225-226. 103 . 76. Correlation. 403 W Collins. 9). Cloudiness at Breslau. 6. 391 . 198 . for cases of non-linear regression. def. see Correlation. 208.. theory of frequency-curves. 10. 388. 37-39 . . Classification. appUcacoefficient of.. naming a pair. Cholera andinooulation. deduction. 68-71. Class-interval... 80-81. frequency-distribution. (qu.. 23.. 205 .. diagram. 376-377. between movements of two variables. 360. 164.. 387. refs. example of Brunt. 388.. 177- 231-233 direct 181. Chances. 213-214 . for three attributes. correlation-coeffiConsistence of cients. F. 33-34 classification of occupations. case of equality of contrary frequencies (qu. and refs. choice of magnitude and position. 20. 167. 
60 treatment of. generally. distribution or correlation table. correlation-coefficient. of divergence from independence. Coefficient. 165-167 . 250-251.iUustrations. 59. 8 class symbol. 198 refs. positive and negative classes. refs. data as to ages of husbands and wives cited from. def. 379-381. . 14-15 data as to infirmities cited from. 3) 189 . 355. 64-67 . 159. 16. by elementary methods. difference E. 282-284. def. as example of a heterogene. refs. L.. conditions. ahrschdrdichkeitsrechnung und KoUeklivmasslehre.. 199-201. H. .). 223-225 . of association. of standard. H. refs. 351 . 256.. 71-72 of a variable for frequency. (including correction of moments generally). 8). 225. 211-212 .. 8). 212. tomy. 72 . 363-367 . direct deduction. Bruns. 10 . application of theory of sampling. 30 .. S. 167 174. representation of frequency-distribution by surface. 175. 9 manifold. 164 . Childbirth. 61-63 . Correction of death-rates. by dicho- tion to correlation tables. 113-114. Contrary classes and frequencies (for attributes).. 315. 80. 314. resolution of a compound normal curve. errors of agricultural experiment.181-]88. 392. cor- . 391. Contingency tables. on standard deviation. influence of magnitude on mean. 375-377. in theory of attributes. Cave. 265-266. 76 . correlation difference method. Cave Browne Cave. . The Combination of Observations. 140. refs. 76. 7. calculation of coefficient for ungrouped data.. 226. CharUer. contingency. ultimate classes. of variation.

236 reduction of standard-deviation. correlation between Twodiameters of a shell {Pecten). Ages of husband and wife. diagram. 333. correlationratios. coefficient for a normal distribution grouped linearity of regression. coefficient for a fourfold table. 237-238 . testing for normality of correlation table for stature. (qu. partial correlation. 189. . 225-226. relief. Ratio. 229-231. Weather and crops. 207 . 250-251 . 160 diagrams. deduction of expression for two vari- constancy of 318-319 . 229-253. Discount rates and percentage of reserves on deposits. fundamental theorems on product-sums. 245-247 coefficient of «-fold correlation. 161. Correlation. 331-332 . application to theory of weighted mean. facing 166. diagram. 3). 238245 representation by a model. 176 . 215-216 . facing 166. cance of generalised regressions and correlations. For Illustrations. standard-deviation of arrays and ables. . 182-185. 195-196. efEeot of errors of observation on the pauperism and out- reUef. 3). 271. 206-207 . 404 THEORY OF STATISTICS. 1) 289. 312-328 . 4) 334 applications to theory of quaUtative observations (refs. 387. diagram. and regressions in . Movements of marriage-rate and foreign trade. of contour-lines fitted with ellipses of normal surface. 327. 196-197. 352. 321-322 . partial regressions and correlations. 199-201. to fourfold form round medians (Sheppard's theorem)^ (qu. for all 40. 233-234 normal equa. 217218 . 322-328 .— . Changes Movements of infantile and general mortality. Refs. 247. 342. 249-250 consistence of coefficients. 3). 251252 limitations in interpretation . Earnings of agricultural labour- N in pauperism. 3). 189. isotropy of normal correlation table. Old-age relation-ratio. standard-deviations of arrays and comparison with theory of sampUng (qu. correlations . 185-187. 286-289. terms of those of higher order. see below. testing normaUty of table. Fertility of mother and daughter. constants (qu. diagram of diagonal distribution. 245ers. 388. 3). 388. 
correlation . 175 diagram. 162. 389. 177181 constants. 320-321 . possible pairs of values. constants (qu. fallacies. . correlation in theory of sampling. Sex-ratio and numbers of births in different districts. treatment by partial correlation. of the partial correlation-co- . correlation due to heterogeneity of material. 252 . 325 . signifitions. 197-199. 239-241 geometrical representation. 173 . 247-249 expression of . Illustrations and Examples. 159 . Correlation. normal.. 189 . 333. the problem. 219-220 . 192-195 . principal axes. 189 .). 362-363 notation and definitions. of correlation. 189. pauperism and . partial. : 319-320 normality of linear functions of two normally distributed variables. direct. 332-333. 163. out-relief. 234-235 . Partial. . outproportion of old and population. Statures of father and son. 216-217. constants (qu. 2) 189. 387. 7) 275 and (qu. 239 correlation-ratios. 387. 218—219 effect of adding unoorrelated pairs to a given table. on assumption of normal correlation (Pearson's coefficient) (refs. 328-331 outline of theory for any number of variables. 238 arithmetical treatment. 158 . 174. .). ooefSpient. 349350 . 321 . Normal. standard error of coefficient. contour lines. of . 236-237. 241-245. constants (qu. 317-334.ratios. regression. and motherLengths of daughter-frond in Lemna minor. 207 . direct deduction. Refs. 204-207. constants (qu. 188. 208-209. 221-223 . correlation between indices. 175. 213-214 . .

of a sum — or difference. tables. data as to Pecten Refs. probability. of arrays in theory of correlation. 142-143 table. 138-141 influence of grouping. mean. omega-funcref a.. effect of errors of observation on. 130.. 146-147 . 260-261. 287-288 inapplicability of the theory of simple sampling. def. See Deviation. . theory of Crawford.. refs. 389. 309. 269-270. 299-300. Deviation. data cited from. E. criteria (refs. refs.. 150-162 337-341. Dareishire. M. correlation. 331-332. 144-145 refs. (partial correction for age-distribution). 140-142. 252. J. 405 partial association and partial correlation. Detlefsen. 387-388. other names for. fluctuations of sampling in Mendelian population. E. 223-225. . table. association with with occupation 32-33 sex. 38.. 143 . 98. standard. little 135 affected by small errors in the mean. 389. B. 361. 378. partial. . contains the bulk of the observations. Darwin.. of rectangle. 158. 204-207 error. 285-286. B.. For' standarddeviations of sampling. 62-53. . is least . standard. 134 .. 144-147: refs. quartile. 389. A. of an index. Correlation ratio. 392. 252-253. of a series . Czuber... statistical cited from. 252 . for age and sex-distribution. refs. 145-146.d. 314. 358. 205. data cited from. ref. of binomial series. correlation of movements. 102. 332-333. infantile and general. 134-135 is the least possible root-meansquare deviation. 287288 . INDEX. deaths from explosions in mines. death-rates. applications of theory of sampling deaths from accident. 234. Deciles. refs. refs. .. association between colour pricldiness of fruit... standard 352 . 97 . table.. 155-156 comparison of advanwith standard-deviation. Deaths.. . for a grouped distribution. proof that arithmetic mean exceeds geometric. see Error. 236-237 . efficient.. 282-284. 10) 275. and Deaf-mutism. data cited from. Defects in school children. 14-15. a.. from diphtheria. 196197. 265. A. Cosin. compounded of of others.. H. 388 . Charles.. 209 . De Morgan. 135-137. . correlation. 
23 .. partial correlation in case of normal distribution of frequency. round . 211212 range of six times the s. N consecutive natural numbers. 1881-1890. 358. refs. tions. 77 . 226. 177. 211 .. L. 134-144 def. Wahrscheinlich. 143 . E. 214r-215 . 252. Cotsworth. refs. multiplication table. A. 135 calculation for ungrouped data. 144-. keitsrechnung. A. G.) 387. amongst offspring of deaf-mutes. .. D. Davenport. 358. 304. (qu. Deviation. values of estates in refs. standard error of. : ... A. 33-34. 146 of magnitude with standarddeviation. 134 relation toroot-mean-square deviation from any origin. Cunningham. Cournot. 144 . 12. diagram. of normal . the median. and 1716. Crelle. refs. generally.. deaths in childbirth. See Quartiles. multiplication table. 38. standard. 361. . Datura. Theory of Probabilities. 197-199 . 204. 140. in England and Wales. (qu. Refs. Eefs. . 15 census tabulation of. 265-266. .. illustrations of 128. 154 calculation of. of generalised deviations (arrays). Cost of living. correction of. 282-284. 361 Die statis- curve... refs. association of. 37. 273. 45-46. Formal Logic. 52-53 . association with imfrequency becility. 319-320. 210-211 . 7) tages Crops and weather. 100. 104. root-mean-square. tiache ForschungsmetJiode. De Vries. 188.. C.

. F. 288.. pauperism and Edgeworth. 154... owing to changes of classification. 264^-265 . probable. Indexcorrelation. mean refs. refs. 156. mother. testing for significance of divergence from theory. sampling. 133- standard. See Value. ages at death from. refs. For general references. 387. correlation-ratio and test for linearity of regression. in interpreting associations theorem on. mean deviation.. P. 267. 53-54 contingency with hair-oolour. 107. theory of. 358. 358 . 333. Duncker. 144. 14^15. curve. of percentiles (median. 135-137 . 145 . 226. 389. def. 351 . Error. 144. tables of function. 48-49. association between father and son. 154. quartiles. 258 median. 337-341 . deaths from. 370-373 . H. Exclusive and inclusive notations for statistics of attributes. of quartiles. curve of. errors. unsuitability of. 123 See Deviation. Falkner. . J. 46-48. terms formeasures dioe-throwings probable error of (Weldon). Dispersion. parent. correlation due to .. table. refs. 147 . standard Distribution of Frequency. 144. 185187. Dufiell. of standarddeviation and coefficient of variation. 180 by partial correlation. 371 . 265-266 . 388. when numbers in samples vary. refs. normal correlation surface. 1. p. 328. refs. normal correlation. Y.. Quartiles. R. G. 314. 352 . . 239-247 diagram of model. Hamilton. Fallacies. . 361. Deviation.. 246. 315. correlation between Duckweed. 3) 274.. as illustrating theory of sampling. etc. F.. 273. dissection of normal Elderton. 70-71 association between grandparent. . 273. J.. median. etc. 149 a measure. Doodson. 256-257 . Discounts and reserves in American banks. 333 law of error (normal law) and frequency-curves. of moments. Error. Refs. 258-259.. Refs. numbers.. in interpreting correlations " spurious " correlation between indices. Everitt. 289. relation between geometric and arithmetic 9). . A. actual or virtual. of mean square. Normal 166. standard. T. 38. etc. 358 Palin. 267 . 
tables for calculating Pearson's coefficient for a fourfold table. constants (qu. 392. measures 156 . 239. of coefficients of correlation and regression. 310-311. probable errors. genertheory of ally. in theory of sampling. mean Earnings of agricultural labourers calculation of standard-deviation. 273. . . records of throwing. 406 THEORY OF STATISTICS. refs. mean square. 63. 130-131 188. of dispersion.. 147 ... mean. Diphtheria. illustrations. 2) 189. diagram. facing . 154 . 97. Explosions in coal-mines. mode. 163 diagram. Eye-colour. 354. when chance of success or failure is small. 70-71. 61 66-68 non-isotropy of con tingeney table. ). 358. 352 . translation of Meitzen's Theorie der Statistik. 252.. W. 252. 6. range as relative. refs. calculation table of powers. 354-355. correlation with : out-relief. 2.. Estates. Frequency Curves and CorSee relation. 177-181. D. 72 . 344. ref. quency-distribution. . 354 .and daughter-frond. see Error. . of number or proportion of successes in n events. theory of. 34-35. 344-350 . (qu. law of . Difference method in correlation. 273. for father and son.. — — 215-216 . and mean. in sense of semi-interquartile range. 98 . 144 . and child. See also Sampling. gamma(qu. of arithmetic mean. See Sampling. Dice. theory of.. Dickson. 49-51 . annual value of.. 390-391. tables for testing fit. 197-199 . curve. diagram. See Freof . table.

goodness of fit in contingency tables. E. A... ability. 387. . 195-196. Fay. See also Correlation. 391. 90-98.K. . Galloway. 93 . 6. 381 381-384. mode. N. normal. 314. median.— INDEX. 76. 226.. of. 314. Filon. inheritance refs. Correlaseries (ref. 375-386 contingency tables.. different districts of England and Wales. 251-252. Mathematical Theory of Probabilities. Fountain. 291-300 . refs. G. 354. table. hypergeometrioal . 301-313 tion. Frequency-distributions.. 79-83 graphic representation of. refs. 377380 aggregate of tables. Irving. P-table for use with association tables. 98-102. diagram. correlation. Die statiatische Methode als selbstdndige WissenacJiaft. 384 . Kollektivmasslehre. 154 . W.).. 100 . 391. 129. G. 83-87 ideal forms symmetrical. diagram. 10. 103 . 370-373 . 87 .. 104. 389-390 . data cited from Marriages of the Deaf in America. refs.. 367-386 . 375. 104 grades and percentiles. Flux. Forcher. 78 of . L. errors of sampling in correlation-coefficient. ally. measurement of price-changes. 161. 301-313.. refs. ref. Treatise on Prob- 392. of percentages of deaf-mutes in offspring of deafmutes. See Binomial series . teBting. refs. illustrations and examples. 152 . A. 392. 83 . comparison frequencies based on the observations. of. 218difference of sign of total and partial correlations. . measures of dispersion. def.. Field trials. 95 . . 3 ... construction 84.. 375-377 association tables. Galton. . 77 of ages at death of certain women. Frequency-curve. extremely asymmetrical (J-shaped). 3) 189 . testing goodness of fit. 96 . (ref. of statures of males in the 84 U. tables for. Fecundity of brood-mares. Fertility of mother and daughter. 78 of stigmatic rays on poppies. England and Wales..v. refs. 380experimental illustrations.). . frequency-distribution of consumptivity. normal curve (q. 87-90. 390. etc. of barometer heights at Southampton. . A. annual values of dwelling-houses in Great Britain. ref. diagrams. Feeding trials. 76 formation of. 
binomial series.. Fisher. of comparison frequencies given a priori.gener- 289 . . 314. 96 .. 96 . amples 166.. 358. Fisher. and 3) 389-390. .. ... of degrees of cloudiness at Breslau. diagram. 354. errors in. moderately asymmetrical. . index-numbers. facing 166 jSee forms and ex164-167 . of petals in Ranunculus bulbosus. Gabaqlio. T. 385-386 refs. constants (qu. 131. . illustrations : of death-rates in . of pauperism in .. (qu. frequency-dis- tributions. measure of dispersion. Feohner. 144. ' ... 361. 150. ref. 289. . 391. R. Normal curve . : Frequency-polygon. 387. index-numbers of Frequency prices. 90 . Fit of a theoretical to an actual frequency-distribution. .. Sir Francis. Correlation. probable errors. normal. Hereditary Oenius... 104. 175 . experimental illustration. ref.. ages at death from diphtheria. averages. Frequency-surface. 208. regression. H. Teoria generate delta statistica. of weights of males in the U. 102-105 . mean. 315. 94 . U-shaped. 390 Fluctuation. 208. 370-373. of headbreadths of Cambridge students. refs. 129. 131 . of material. 361. of a class. refs. testing goodness of fit. . T. A. 87-105. 166. 388. A. 374^375 . of fecundity of brood mares. normal curve. errors in. ideal forms of.K. 367-375 cautions.. . 88. 407 heterogeneity 219 . 105. theoretical forms.. normal. 226. 98 of annual values of estates. ref. ref. H.). 102 .

3) 155. Observations on the Bills of Mortality. index correlations.. Grindley. 313 . correlation between ages. Hollis. Geometric (logarithmic) mode. ex- ample 67 . correlation. 66- non-isotropy. refs. 328 299 data cited from. 212. 354. Hair-oolotjr : and eye-colour. general ability. 196 . probable error of a partial correlation coefficient. monic. 69 . 100. Refs. of movements 201. of correlation. 173 . inoculation statistics and association. Hull. 388. method Refs. 6. 269. interpolation for median or percentiles. 332 relation between indices. P. C. S. quartiles. refs. influence Hooker. 34. 354. (qu. 153. Harmonic mean. Hart. Gibbs. 85. etc. 252 . Principlee of Statistical Mechanics. 3) 189.. refs. A. influence on standard-deviation. . errors of feeding trials. theory of sampling applied to certain data. multiple happenings. normal correlation. Harris. C. constants. refs. geometric mean. 154. Horticulture. of representing quency-distributions. 116. 289 . . 113-114. 6. 387. 391.... See Mean. errors in. Gauss. Inheritance. Helguero. of estimating correlation coefficient. 408 THEORY OF STATISTICS. S3 .. . errors of sampling (small samples). refs.. normal curve.. table. de. data cited from. on mean. . 388. 46. 151-152 . Refs. 40 . . 140. refs. 315. Geiger. M. in rural and urban districts. Hudson. Husbands and wives. 70. Gibson. 390. choice of class-interval. 295-297. T. 84 . 289. Gray. 115.... and 358. ref. .. 388 .. Tables for computing probable errors. forming one binomial polygon * from another.. Galton'afuiiotion(correlationcoeffioieiit). refs.. diagram. refs. 188. Winifred. 391. 253 theory of partial correlation. R.. refs. H. The Economic Writings of Sir William Petty. John. refs." 144. 361.eto. 226 Natural binomial machine. correlation between movements of two variables. See Mean. 84. 208. of contingency. intelligence. inhabited and uninhabited. relation between fertility and social defective physique status. 118. ref. Geometric mean..204. 270. 130 percentiles. 
short method of calculating coefficient of correlation. of least squares. 209 . intra-class coefficients. F.. Graunt. 68. 313.. construction History. of representing correlation between two variables.. H. (qu. 270-271. between two variables. .. use of term error. correlation. 226. errors of agricultural experiment. error in field experiments. cited re Cosin's Names of the Roman Catholics. 252. 83-87 . 79-80 .. Hypergeometrical Series. association.. diagram. frequencycurves (epidemiology).. D. 4) 131 . Grades. refs. 354. 188.. weather and crops. H. 203-204 of Histogram. 128.. 208 . together with the Observations on the Bills of Mortality more probably by Captain John Oraunt. compound normal dissecting curve. John. 208 . Greenwood. refs.. cor154 . Willard. 226 . Graphic method. J. law of small chances.. 390. correlation between weather and crops. har- 176. Heron. 209 .. abac giving probable errors of correlation coefficient. 4. H. A.. B. Head-breadths of Cambridge students. miscellaneous. . '' mean 314 . F. 389 . J. H. fre- of application of correctionforage-distribution. 61-63. D. Grouping of observations to form frequency-distribution. of statistics generally. ref. geometric. .. 391. 200. 391. 152. Houses. 272. table. 358 . table. 40 application of law of small chances. 5-6. HaU. . 62 median. (qu. binomial machine. of. 180-181 . 332.. 159 . 61annual value of.

226. Elemente der exakten . 409 Kelley. measures of 208.. T. Refs. ). 314. 360. 164. refs. . 125-126 . L.' 144. 154. 25-28 . 344. Little. A. Stanley. refs. correlacontingency. 359. correlation between lengths of mother. C. refs. in a. ' reasoning numerically definite (theory of attributes). G.. refs. logarithmic mode. form of contingency or correlation table in case of. system Lee. 379-381. S. 215216 . in correlation table. refs.. Keynes. Infirmities. mean deviation least about the median.. criterion of. probable error of median. 129 refs. G. 226. 273 . Befs. tables. Inoculation. classification of.. . of a horse. 154 . goodness of fit test for. 71 . Lemna minor. 361. Jevons. John. Marquis de. refs. 272. 161. Theorie anaZytique des prob abilites. Inclusive and exclusive notations for statistics of attributes. L. def. data as to agricultural labourers' earnings cited from. 392.. normal curve. W. refs. correlation between.. King George. 185-187. 5. price index numbers. M. conditions for real significance of probable errors. 130. Laplace. 388. probable error of mean. W. W. Lyon. 105. 226 tables of functions. fre. ratio. Kapteyn. refs. 287. 265-266 366-367. 352 . History of Statistics^ Labouebrs.. tistics. . data cited from. generally. Koren. 392. refs. Pure Logic and Investigaother Minor Works. 354. dependence (association. cholera." 4. Lloyd. 137. Sir tical. 389. 15 tions in Currency and Finance. 73 Isserlis. J. 130-131. laotropy.. error in soil survey. frequency distributions. T. Julin. Imbecility.. 205-206. W. 38. Intermediate observations. case of complete. A First Course in - Statistics. Oeschichte der Sta- Jones. refs. J. 68 . C. INDEX. Independence. 361 . associations with deafmutism. E. Theorie der Massen erscheinungen.. 56-57 . 129..und Moralstatistik. Linearity of regression. def. 390. Lexis. Essa. 252. graduation of. L.. 96... 33-34.... v. 15 . 358. 67-71 of normal correlation table. J-shaped 98-102. crops and rainfall.. errors of agricultural experiment. 
Indices.. F.. H. 388. refs. Principes de Statistique. Pierre Simon. 314. 122. refs. indexnumbers. 391. Knibbs. Index-numbers of prices. 14-15. 392. 130 . 48-51. 14^15 . See Earnings. use of geometric 127. tion. Refs. ref. 80-81 . age statistics. 354 . Skew Frequency-curves in Biology and Sta- 130. 126127 .. Johannsen. refs. census tabulation of. fertility and fecundity.. Abhandlungen zur Theorie der Bevolkerungs. 392. etc. 33-34.. philosophique. 387 . inheritance of 160. W.. 273. Logarithmic increase of population. J.. 361. frequency-curves. 208. 389 . A Treatise on Kick Probability. partial correlation- 252.. earnings of agricultural. 361. tistih. refs. 328- Larmor. use of term " precision. Jacob... 375-381. Lipps. Refs. following law of small chances. 390. association between deafmutism and imbeciUty.and daughterfrond. M.. 38. for attributes... Alice. 128... LobeUa. quency-distribution. Illusory asaooiations. Erblichkeitslehre. refs. test for. 40 Feohner's Kollektivmasslehre. 269-270. appUoation of theory of sampling to certain data. 387-388. use of harmonic mean.. 391. refs. deaths from.. 126 use of geometric mean for. examples. use of word " statis 331 . J.. of mean.. for attributes..

Clerk. Geschichte . numbers in litters. Milk testing.... 391. Mean deviation. 389. of series of ratios or products. 130. nature of. Maxwell. in estimating intercensal populations. standard error of. Theorie und Teehnilc der Statistik.) 387. 113.. 144. refs. of series compounded of others. comparison with median. 390. Maoalisteb. origin from normal curve." 4. less than arithmetic mean. 122-123 . is less than arithmetic and 128 geometric means. generally. . 116 .. See Error. 109-113.) . continuous observations and small 116-117. fluctuations compared with theory of samphng. arithmetic. position relatively to mode and . 124 . Geschichte. 108-109". 113. (refs. of series compounded of others. 387.. calculation. 115. def.. influence of grouping." 1. approximate determina- of. 120 . Mean. W. law of geometric mean. . 410 THEORY OF STATISTICS.. median. Deviation. 37. 129 in theory of sampling. difference from arithmetic mean in terms of dispersion. R. 126-127 . 304. . tion.. generally. refs. Mitchell. for ageand sex. . 108116. . mean and mode. def.. Mice. Mode. 391. 115-116. (qu. harmonic mean. Mendelian breeding experiments as illustrations. H. calculation. geometric. def. weighting of. 118 comparison with arithmetic mean. plication of weighting to correction of death-rates. 114. 113. refs. 264-265. refs. 128-129. P. Sir Donald. 108 generally. . 129 difference from arithmetic mean in terms of dispersion. diagrams. 123. use of word " statist. cultural experiment. . sum of deviations from. 116-120. data cited from. (refs. indeterminate in certain cases. 144. . 6. 125-126 convenience for index-numbers. 220-225 March. 115 . 116. of sum or difference. 121-122. 108 . See Deviation. 264-265. 226. B. harmonic. 120-123 def. weighting error. def.distribution. statistical. 314. 123-128. calculation of. refs. . generally. 109 calculation of. refs. 128-129 proportions of albinos in litters. 108. 264-265. 208 index-numbers. 114. . 225. 121-122. 124 . Mohl. generally. 
position relatively to Meitzen. slight influence of outlying values on. 334-350. correlation of movements. (refs. 223-225 refs. 226 . 387. John.. 128. . purport of. 8) 156 .) 387 diagrams. . when numbers in samples vary.. weighting of. 84. 108 . 90. errors of agri- 355. 38. 144 . mean. 121diagrams showing position relatively to mean and median. 113-114. . Milton. ref. 5. errors of feeding trials.. 9) 156 use in averaging prices of index-numbers. 299 standard error of. W. standard. A. L.. 114.. 267-268. Mean square difference error. 199-201. 220 . Modulus as measure of dispersion. Median. 130.. H. 225. use on ground that deviations vary with absolute magnitude. unsuited to disseries. refs.. def. 119-120 . 119 advantages in special cases. Macdonell. 122 . 123 .. is zero. . 3-5. weighted. def. 119 summary comparison with median and mode... standard Methods. 128.. 220-225 of binomial series. errors in. between weighted and unweighted means. 116-117. Robert von.. correlation. 117 graphical determination of. use of word " statistical. 124 . logarithmic or geometric mode. 391. mean is the best for all general purposes. 273.. 337-341. fluctuations of sampUng in. (qu. 221-223 ap. for a grouped distribution.. from mean and median. Mercer. 128 . refs. 120.. weighting of. Marriage-rate and trade. 127-128 . etc. 114.

301-302 value of central ordinate. probable errors^ . 215 nomial apparatus.) 154. 289. 303 mean deviation and modulus. deviation. . D. normality in fluctuations of sampling of the mean. 195 spurious correla. For normal correlation. Palgkave. normal. deduction correction Pauperism. 346-347. 111 of median. . fitting to a given distribution. for age-distribution. 149 inheritance of skewness. outline of general. 239-241. refs. 306. Literatur 411 der iStaatswissen- scJiaften. 208. 122 stanmean dard-deviation.. correlation of characters not quantitatively measurable. 2) 189. 315. data cited from. 311-313.. series. regressions. 138-140 145-146 . refs. and its use. data cited from. 130.. testing fit of theoretical to actual distribution. 333. of a class. refs. 161.. 226 . inheritance of fer208. Moment. and modes for other years. . frequency-curves . 310-311. 226 . refs. 245-247 . partial. 309-310 . 390. 117. L. 252. G. bibetween indices. 161. 306 tion .. 388. 118.. 92. Order. see Correlation. Ref. Newsholme. 387 frequency curves. Partial correlation. 10 . proportion of aged. 333. 333. etc. experimental test of normal law. means. 122. 149 . 72-73. 314.def. 144 ooefiicient of variation.. 289. 314. refs. . Movements. Pearson. and fecundity. 151-152. 96. dissection of curve. 177-181. etc. 113 cajtable. in England and Wales. 182185 . 70. Moore. numerical examples of use of tables. . 391. dissection of compound curve. more general methods of deduction. 225. und Moir. . . . Cours d'economie politique. moments. with out-relief. 314. 226. refs. 304-305 from binomial 245.. calculation of general methods of curve-fitting. errors in feed- ing experiments.. hypergeometrical series. 359. P. H. 226 .. 78. etc. correlation with out-reHef. 233-234. 148 . Refs.. 105. V. 307-308 . compound normal . Pearl. 389. 93 culation of mean. tables. quartiles. 388. fertility. 225 selection. 6.' variables. 226. Nixon. Normal curve of errors . calculation of first. of generalised correlations.. 3X5 . 
See Association. G. See Correlation. reproductive moments. 391. birth-rates. A. 209 . J. Mortality. 354. I. Vital Statistics. Sir R. .. (mortality).. 162. See Death-rates. 188. 90. Karl.. 304 table of ordinates. medians. 389 nomial distribution and machine. correlation of. 65 ? mode. with earnings and out-relief. 305-307 . percentiles. tility (ref. Statistical Studies in the York Money Market. 389. contingency. NEGATrvE 10. 241. probable errors. . and standard deviations. H. bi273. normal distribution of number of seeds in Nelumbium. fitting of principal axes and planes. Bramley. from. deviations. 208. .. . Refs. . the table of areas. 105. 160.. partial. 310. quartile deviation and probable error. frequency-distributions. ... . 96. Partial association. INDEX. in two 197-201 .. 388. 209. weighted mean.... correlation between indices.W. 154. correlation and correlation-ratio. 304 series comparison with binomial for moderate value of n. standard-deviation. 120 . 333 . 315 . J. del. classes and attributes. Pareto. inheritance of fertility. Morant. . Ref. 315. 355 errors in variety tests. 40. New O'Brien. 358-359. 63. 3. . Raymond. 110 second and general. refs. contingency. Refs. 299 deduction data cited of normal curve. (qu. 135 . 314 . 192-195.. Dictionary of Political Economy. . Norton. 209. methods. diagrams. 390.

Pickering. 150-153 def. 133. 13-14. Calcul dea proba361. ratio of q. refs.). ref. multiplication table. 98. 152153. 333.. 149. U. standard errors of. 222. 233 stan: . quartile deviation and semi-interquartile range. def. 361. 152-153 use for unmeasured characters. Relative dispersion.. sex-ratio. Poincare. 361. 264^265. H. in correlation. 149 . 358. estimation censuses. to median as measure of relative dispersion. 257. as a measure Petty. frequency of petals. Probability. 6. 314. defs. difference between deviations of quartiles from median as measure of skewness. 387. Economic Writings. pling of attributes. 337-341 . .. termination. bilites. 304. Percentage. 125-126 . 129 . 126 . Rhind. ratio of q. 391. unsuitability of dispersion. errors of agricultural experiment. .d.. refs. standard errors.. 175 total and partial. A. 263. . refs. constants. refs. Becherches sur la probabilite dea jugements. . refs. 285-286.. 355. index-numbers. Precision. refs. correlation between two diameters of shell.. W. . expression of other frequencies. . QtfABTiLE deviation. 52-53. 366. unsuitability of median in such a distribution. See Quartiles. 226. 388 population. 151-152. of. refs. Regressions. 162 diagram.. Reserves and discounts in American banks. Ranks. when numbers in samples See also Samvary. unsuitability of median in case of such a distribution. 391-392. as a 310 measure of dispersion. of normal curve. 391 . 144.. refs. correlation between errors of sampling in. . 341-343.. refs.. 130-131. Poisson. . A. refs. 175-177 def. attributes. 272. Random sampling. between Registrar-General correction or standardisation of death-rates. of harmonic mean. in sense of simple samphng. refs.. 361. 126 . 273 . ref. 150 de. . 117. of geometric 355. Positive classes and refs.. Petals of Ranunculus bulbosus. 358. 352.. use Prices. correlation of fluctuations. 147. Persons. 365. 253.d.. mean. q. of. 153 . Peas. 387-388. D. L. 412 THEOKY OF STATISTICS. 201-202. 352 . frequency of. 
index-numbers of. 148. 149-150 . (qu. 310 . PoppieSjStigmatic rays on. 267-268. 158 .M. determination. applications of theory of sampling to experiments in crossing. 341-342 . 333 . Perozzo.. .. 284. 289. standard error of. 273. 333.. works on. 32-33. L. correlation. theory of. Poynting. 134 generaUy. 102 unsuitability of median for such distributions. 130 data cited from Reports. 205-206. 154. S. 337341. Peters.. auffioienoy of. refs. law of small chances. advantages and disadvantages. Lettres sur la theorie des probabilites. 3) 189. 256257 . non-linear. 362-363. 201 . J. 102 . 148-149 . 359.. Population. J. 116. J. 388.. 147-149. 208. 283.d.. H. 197-199. tables for statisticians. number of positive classes. 321322 . generally.f requency. to standard-deviation.. applications of theory of probability to correlation of ages at marriage. methods of correlation based on (refs. refs. Percentiles. Pecten. direct deduction. 163. 10 13 13 . 147-148 . Sir W. Quartiles... 130. Principal axes. for tabulation..d. ref. tables for computing probable errors. estimates of 224.. 199-201. dard errors of.... 390. Quetelet.. Range. refs. facing 166. 117. S. . 208209. 143. advantages of q. Banunculus. in terms of. 77. 367. refs. 78 ..

354-365. 352 . error in soil surveys. normality of distribumean. 212. : See Quar- Sex-ratio of births correlation with total births. 255256. refs." " statistical. R. (qu. Sir R. . Correlation. Scheibner. Scripture. Semi -interquartile range. tables of normal function and its integral. Sampling. 289. E. standard error of tion of standard-deviation and coefficient of variation. 7) 275. E. (median and quartiles). Skewness of frequency-distributions. the problem. when numbers in samples vary. generally. 300-313 correlation. 359. refs. comparing one sample with another combined limitations to with it. W.. Sheppard. 413 percentiles. arithmetic and harmonic means. application tw sex-ratio. use of words " statistics.. 90-102. 337 standard errors of percentiles. errors of agricultural experiment. 335standard errors of percentiles Refs. of arithmetic 345-346 . J.. 254355. normal curve and correlation. appUcation of the theory of sampUng to.. refe. series Hypergeometrical series .. W. del. correlation of polychoric table. difference between arithmetic and geometric. (qu. 338-340 ... Sinclair. 344-350 of difference between two means. conditions Sampling of attributes assumed in simple sampUng. data cited from. diagram. 207 . ref. 347-350 ." 2. " statistics.. of coefficients of correlation and regression.. curve . (qu. 264r-256. . mean. Sir John. Sampling sumed 337 .. Shakespeare. Skew or asymmetrical frequencydistributions. 314... theory of sampling. comparing a. 390. chance. 271-272 interpretation of standard error 299-300 . inverse interpretation. 351 . 262-264 . Robinson. Russell. 289 .. 267-268 comparing one sample with another independent therefrom. theory of. 279-281 effect of removing conditions of samsimple sampling. 225 . law of small chances. 267 . . W. E.. (qu. 313-315. 344. theorem on correlation of a normal distribution grouped round medians. 273. . " statist. . correction of the standard-deviation for grouping. refs. 281-289 pling from limited material. 
272also Binomial 389. 268-271 . refs. 37. Ross. 287 binomial distribution. P." W. constants. 265-266 standard error. and qu. . (See 273. normal normal curve. 262-264. (qu. 276-279 . . Normal normal. standard: 354-356. 273. 1. refs. use of word 3.. measures of.. . 9) 156. 107 . 163. 337-341 dependence of standard error of median on the form of the disof difference tribution. 291-300 . 346-347 . 149-150. random in sense of simple sampUng. . 264^-265 . See also Frequency-distributions. Saunders.. 258-259 . of variables. limits as a measure of untrustworthiness. 266. Ritchie-Scott. 11) 275. (qu. between two 341-343 . 387. 307 . 259-262 . 273. calculation and correction of moments. when chance of success or failure is small. 355. frequency-curves (epidemiology). 4) 334 normal curve tables. 390-391. G. 390391. 352 . of correlation-ratio and test for linearity of regression. . conditions asin simple sampling. standard 1. W. sample with theory.. Miss E. examples from artificial . 256-257. 391." tiles. 391. error of ratio of male to female births.. refs. 8 deviation of number or proportion of successes in n events. effect of removing conditions of simple sampling on standard error of mean. refs.. 317-334. 333. INDEX. Rutherford. 3) 189 .. 2) 289. A. 175. 176 . Significant differences. use of word when n is small.

diagonp. (qu. for . " Student " (pseudonym). purport of. of a frequencydistribution. probable errors.. 273.. Robert. 265-266. F. 100.. 391. Small chances.. Society. errors of agricultural experiment... use of word " statistics " in British Rainfall. 388 rank . 333 Theory of Mental and Social Measurements. Statures of males in U. James. 355.. of a correlation table. method 215- Standard-deviation. 333.. refs. 414 THEORY OF STATISTICS. 87-90. 5. standard-deviation. tables of exponential binomial limit. estimates of population. 160 . 355 . refs. refs. 209.. 3) 189 . def. E.. . E. effect of errors of observation. I.. 223- 225 . 329-331 . refs.distributions. law of. mean 153 . J. 304. introduction and develop- of statistics of attributes. 361.. 5. Southey. 226. 112 medians. (qu. 391. C. Tohonproff. See also Frequencydistributions . 307-308. 213-214. refs.. M. 343. property ninth deciles. 89. : ment in meaning matical expectation of moments. 391. law of small chances.. A.. F. percentiles. Statist. 273. birthrates. Symons. def. dia- Thorndike. Stratton. methods. (qu.. 116. . 363-367 . John. Statistics. facing 166. (qu. 164. probable error of correlation coefficient.. 1 1-14. M. refs.. refs. Standardisation of death-rates. . diagram of isotropy. A. refs. 3. 2 . 208-209. contingency. 226. correlation of. .. 3. 164. 391. E. of median. refs. diagrams.. refs. 1. 130. refs. Stevenson. L. classes and frequencies. refs.. refs. fit of regression lines. effect of errors of observation on the standard-deviation and coefficient of correlation.. lines and planes of closest fit.. 78 . . 327. 141 . C. 341. 226. 81-83 . for father and son.. 389. Tabulation. 225-.. median for such distributions. 154. methods of measuring correlation. 388. 355 . Slutaky. Symmetrical frequency . Spearman. 37 .. 3-5. mathe- 1-5 theory of. 209.... 390 of bi-serial expression for correlation coefficient. Royals. H. 387. I. Refs. Surface. 344-345 of standard-deviation and semiinterquartile range. Todhunter. J. 
correction of..l distribution. 5) 355. Normal curve. See Deviation. Time-correlation problem. refs. error in field trials (chessboard method). 388. 91 calcula90 means and tion of mean. 12 . diagrams. C. occurrence of the word in Shakespeare and in Milton.. test- Type of array. Stigmatic rays on poppies. constants. def.. in the meaning of the word. Account of Scotland. J. 306 . 88. Snow. of the Roma/n Catholics.. tables. Ultimate def. Tatham. freunsuitability of quency. 216 . 12-13.. and . 174 . deviation and quartiles. cited re Cosin's ing for normality. 389 . for age-distribution. errors in variety tests. refs. 273. Statistical. Soper. gram. . E. dard-deviation. 333.K. correction of death-rates. Stirling. refs. expression for factorials of large numbers. errors in. 388. def. 8. sufficiency of . 253 . 391 errors of Spearman's correlation coefficients. probable of correlation. standard errors of of first mean and median.. 117.. 1) 131 stanof word. Soil surveys. 391. 1) 155 . Spurious correlation of indices. 305-306. for tabula- tion. of fitted contour-lines. Tocher. T. History of the Mathematical Theory of ProbTraohtenberg. E. 388.. refs. standard. introduction and develop- ment 1-5 . 6. M. 325. distribution fitted to normal curve. 197-201 refs. H. 322-328 Names etc. 226. 5.

252 . refs. Earnings. correlation between indices. refs. sex-ratio. F. 15.. Weighted mean. Variation... notation for statistics of attributes.K. 150. 392.. S. ref 4. C. 40. refs. errors of agricultural experiment. 391. refs. 163.. 273. 154. 273. refs. time-correlation problem. Andrew.. 415 refs.. 387. refs. 358. weighted also Mean. John. generally.. E.. .distributions. EDINBURGH. Universe. etc. def. mean deviation and quartiles. 273 . . 226.. 390. Wood Frances... see Mean. 140. estimating Weather and interoensal populations. of dwelling-houses. median.. 6. 252. 1. Allyn tics. refs. annual. 75.. frequency-curves.. relative dispersion. 1. Waters. median. 361... 23. goodness of fit in association and contingency tables. W. LTD. Wages of agricultural labourers. 390. 5 . 83 . 7. 391. fluctuations of sex-ratio. 4) 131 . 95 diagram. quartilea.. refs.. cost of living. 370-373. Westergaard. W. 273. 382.. methods. 188. C. PRINTED IN GREAT BRITAIN BY NEILL AND CO. Befs. of estates in 1715. 226. U - shaped frequency . 129. . Young. A. geometric Median Mode. Theorie der Statistik. 105. birth-rates. 3) 155. study of defects in school-children. 387. A. 226 .. 149. 351-352. W. 2) 131... data cited from. 258-259. 388. Value. (qu. 57 .. to H. 391. refs. index-numbers. 259. Vigor. Lucy. 130. . multiplication table. statistical. theory of... H. D. Working classes. F. 149 .. 75253 . probable errors. A. consistence. F. Refs. 101. . Logic of Chance. citation of Bielfeld. errors in.. Variates.. 18. refs. 355 .. law of small 102-105. 192 data cited from 78. (qu. 196197 ." " in EngUsh. refs. Young. correlation. refs. 253.. use words " statistics. 208. use of term characteristic lines (lines of regression). and mode. 387-388. 208. age statisrefs. 388. . (qu. refs. F. E. of the (qu. ZiMMEBMANN. 273. 389." attributes. D. history of words " statistics. 100 diagram. sampHng in Mendelian ratios. INDEX. 388 application of law of small chances. T." " statistical. specification of. 39. 94 mean. U. 
177 . Venn. table.. correlation. West.. 389 . numbers. B. WiUcox. 73 correlation. Yule. association. def. 130. def. Introduction Mathematical Statistics. Versohaeffelt. facing 186." Weldon. Variables. R.. 226... . 208. standard-deviation. 122. refs. 163. index correlations.. crops. Variety tests. coefficient of. Wicksell. Weight of males in U.. standard error of. Warner. pauperism. error of coefficient of contingency. dice-throwing experiments.. G. probable sex-ratio. problem of pauperism. 17 17. influence of bias in statistics of qualities. measure of relative dispersion. Water analysis. real. 2) 155. see 185. J. Whittaker. 273 . 15. . 388.. Die statistischen Mittelwerthe and translation. 93. table. 361. H... . table. 314. Refs. Wages. isotropy. Zizek.
