Measures of Dispersion

Series        Observations       Total   Mean
Series I      5, 5, 5, 5, 5      25      5
Series II     3, 4, 5, 6, 7      25      5
Series III    1, 3, 6, 7, 8      25      5

In the first series the mean is 5 and all the items are identical. The items are not at all scattered, and the A.M. fully represents the characteristics of the distribution. In the second series the A.M. is again 5, but the items have different values: the maximum value is 7 and the minimum is 3. Thus the A.M. represents the given observations, but not as closely as it does in Series I. Similarly, the A.M. represents Series III, but less closely than it represents Series I and II, because in Series III the observations are relatively more scattered. The degree of scatter of the individual observations about a central value is called 'Dispersion'.

Dispersion refers to the variability in the size of items. It tells us that the sizes of the items in a series are not uniform: the various items differ among themselves in their values. The various measures of dispersion give us a precise picture of the variation in the items of a given series. The measures of dispersion are also called averages of the second order.

It is worth mentioning here that a good measure of dispersion must possess all the properties, listed in the last chapter, of a good measure of central tendency.

If we calculate the variation of a series in the units of the original observations, the measure is called 'Absolute Dispersion'. On the other hand, if the measure is expressed in the form of a percentage or ratio of the average, it is called 'Relative Dispersion'. A measure of relative dispersion is appropriate for comparing two or more series expressed in different units of measurement.

6.2. VARIOUS MEASURES OF DISPERSION

The following are the commonly used measures of dispersion:
1. Range.
2. Inter Quartile Range.
3. Semi-Inter Quartile Range or Quartile Deviation.
4. Absolute Mean Deviation.
5. Standard Deviation.
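The point made by the three opening series can be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper name `summarize` is hypothetical, and it uses the standard-library `statistics` module to compute two of the measures listed above (range and standard deviation, both defined later in this chapter) for each series.

```python
from statistics import mean, pstdev

# The three series from the opening table of this chapter.
series = {
    "Series I": [5, 5, 5, 5, 5],
    "Series II": [3, 4, 5, 6, 7],
    "Series III": [1, 3, 6, 7, 8],
}

def summarize(values):
    """Return (mean, range, population S.D.) for a list of observations.

    `summarize` is an illustrative helper, not a name used in the text.
    pstdev divides by N, which matches the S.D. formula of this chapter.
    """
    return mean(values), max(values) - min(values), pstdev(values)

for name, values in series.items():
    avg, rng, sd = summarize(values)
    print(f"{name}: mean={avg}, range={rng}, S.D.={sd:.3f}")
```

All three series have the same mean of 5, while the range grows from 0 to 4 to 7 and the standard deviation from 0 to about 1.414 to about 2.608, which is the sense in which Series III is "more scattered".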
Now we shall discuss these measures one by one.

1. Range. The simplest measure of dispersion is the range. It is defined as the difference between the largest item (L) and the smallest item (S) of the series:

$$\text{Range} = L - S$$

It is an absolute measure of dispersion. For the purpose of comparing two or more series, a relative measure called the coefficient of range is used:

$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$

Merit of Range. It is easy to understand and calculate.

A few demerits are given below:
(i) It is affected by fluctuations of sampling: its value is never stable and changes from sample to sample.
(ii) A small variation in the value of an extreme item affects its value.
(iii) It does not take into account the composition of the series. The range of a symmetrical and of an asymmetrical distribution can be identical.

In spite of these disadvantages, the range is commonly used in the field of Quality Control to maintain the quality of manufactured goods. As even a small variation in delicate manufactured goods can prove harmful, the range is preferred in such circumstances.

2. Inter Quartile Range. Just as the difference between the largest and the smallest item is termed the range, the difference between the two quartiles, the 3rd ($Q_3$) and the 1st ($Q_1$), is called the Inter Quartile Range:

$$\text{Inter Quartile Range} = Q_3 - Q_1$$

It is a measure of absolute dispersion. Its advantage over the range is that it is not affected by the values of extreme items. In fact 50% of the given observations lie between the two quartiles, and as such this measure is a fair measure of variability.

Merits of the measure are:
(i) It is easy to calculate and is readily understood.
(ii) It is a better measure of variability than the range.

Demerits of the inter quartile range are:
(i) It is affected by fluctuations of sampling, so its value is not stable.
(ii) It does not take into account the composition of the series.

3. Semi-Inter Quartile Range or Quartile Deviation.
It is defined as half the difference between the 3rd quartile and the 1st quartile:

$$\text{Semi-Inter Quartile Range or Quartile Deviation} = \frac{Q_3 - Q_1}{2}$$

For a symmetrical distribution, the quartile deviation when added to the 1st quartile, or when subtracted from the 3rd quartile, gives the median. For a moderately asymmetrical distribution this equality does not hold and there is a difference; the greater the difference, the greater the departure from symmetry. The relative measure corresponding to this absolute measure is

$$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

Merit of this measure is: It is simple to calculate and easy to understand.

Demerits of this measure are:
(i) It is not based on all the observations of the data and is not capable of further algebraic treatment.
(ii) It is affected by fluctuations of sampling.
(iii) It should not be used in series in which the variation is high.

4. Absolute Mean Deviation. The absolute mean deviation of given observations is defined as the A.M. of the deviations (ignoring signs) of the various items from a measure of central tendency (either mean, median or mode). Generally, the deviations are taken from the A.M. The method of calculating this measure is also called the method of averaging deviations. Sometimes the deviations of individual observations are taken from the median instead, because the sum of absolute deviations about the median is never greater than the sum of absolute deviations about the mean. Absolute mean deviation is also called the first moment of dispersion.

Mathematically,

$$\text{Absolute Mean Deviation} = \frac{\sum |X - \bar{X}|}{n}$$

where X represents the values of the individual observations, $\bar{X}$ the A.M. and n the number of observations. The two bars enclosing $X - \bar{X}$ indicate that we are ignoring the signs of the deviations.

On the same lines we can define

$$\text{Absolute Mean Deviation about Mode} = \frac{\sum |X - \text{Mode}|}{n}$$

and

$$\text{Absolute Mean Deviation about Median} = \frac{\sum |X - \text{Median}|}{n}$$

For statistical data involving frequencies we can define

$$\text{Absolute Mean Deviation} = \frac{\sum f\,|X - (\text{Measure of C.T.})|}{N}, \qquad N = \sum f,$$

where C.T. stands for central tendency.
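The definitions above translate directly into a few lines of code. The sketch below is an illustration only: the function name `absolute_mean_deviation` and its `about` parameter are assumptions made for this example, frequencies are assumed to be whole numbers, and only the mean and the median are supported as the measure of central tendency.

```python
from statistics import median

def absolute_mean_deviation(values, freqs=None, about="mean"):
    """Return (absolute mean deviation, centre used).

    values : observations X (or class mid values m for continuous data)
    freqs  : optional whole-number frequencies f; defaults to 1 for each value
    about  : "mean" or "median" (hypothetical parameter for this sketch)
    """
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)  # N = sum of frequencies
    if about == "mean":
        centre = sum(f * x for x, f in zip(values, freqs)) / n
    elif about == "median":
        # Expand the discrete series so the ordinary median can be taken.
        expanded = [x for x, f in zip(values, freqs) for _ in range(f)]
        centre = median(expanded)
    else:
        raise ValueError("about must be 'mean' or 'median'")
    # Sum of f * |X - centre|, divided by N.
    amd = sum(f * abs(x - centre) for x, f in zip(values, freqs)) / n
    return amd, centre

# Five monthly earnings used in a later illustration of this chapter.
earnings = [680, 580, 850, 700, 750]
print(absolute_mean_deviation(earnings, about="mean"))
print(absolute_mean_deviation(earnings, about="median"))
```

For these five earnings the deviation about the mean (712) works out to 70.4 and the deviation about the median (700) to 68, the smaller value being the one taken about the median.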
For continuous data the formula is the same, except that X is replaced by m, the mid value of the corresponding class interval.

The relative measure of dispersion based on the absolute mean deviation is called the coefficient of absolute mean deviation:

$$\text{Coefficient of Absolute Mean Deviation} = \frac{\text{Absolute Mean Deviation}}{\text{Measure of C.T.}}$$

Merits of this measure are:
(i) Absolute mean deviation is rigidly defined and has a definite value.
(ii) It is simple to calculate, although not as easy as the Range and the Quartile Deviation.
(iii) It is easily understood, being an average of deviations from a measure of central tendency.
(iv) It is based on all the observations, and so cannot be calculated if a single observation is missing.

Demerits of this measure are:
(i) It ignores the algebraic signs of the deviations and thus is not capable of further algebraic treatment.
(ii) It is not a very accurate measure of dispersion, particularly when it is calculated from the mode, which can be unrepresentative in some series.

Illustration 1. Following are the diameters (in cm) of the steel balls used in a machine manufactured by a manufacturer: 3.25, 3.18, 3.22, 3.23, 3.24, 3.18, 3.16, 3.23, 3.18, 3.24. What is the range? Also calculate the coefficient of range.

Solution. For the above observations, L = largest value = 3.25 cm and S = smallest value = 3.16 cm. Hence

$$\text{Range} = L - S = 3.25 - 3.16 = 0.09 \text{ cm}$$

$$\text{Coefficient of Range} = \frac{L - S}{L + S} = \frac{3.25 - 3.16}{3.25 + 3.16} = \frac{0.09}{6.41} = 0.014$$

Statistical Methods for Research Workers

Illustration 2. Following is the distribution of milk yield of 80 buffaloes at a farm. Calculate the Interquartile Range, the Quartile Deviation and the Coefficient of Quartile Deviation.

Milk Yield (kg)     0-2   2-4   4-6   6-8   8-10   10-12   12-14   14-16
No. of buffaloes    4     9     16    28    14     5       3       1

Solution. For calculating the Interquartile Range and the Quartile Deviation we first determine $Q_1$ and $Q_3$.

$Q_1$ = 1st Quartile = size of the (N/4)th = 80/4 = 20th observation.
$Q_3$ = 3rd Quartile = size of the (3N/4)th = 60th observation.

Milk Yield (kg)     0-2   2-4   4-6   6-8   8-10   10-12   12-14   14-16
No. of buffaloes    4     9     16    28    14     5       3       1
c.f.                4     13    29    57    71     76      79      80

$Q_1$, the 20th observation, lies in the interval (4-6) and $Q_3$, the 60th observation, lies in the interval (8-10).

$$Q_1 = L + \frac{\frac{N}{4} - \text{c.f.}}{f} \times C = 4 + \frac{20 - 13}{16} \times 2 = 4 + 0.875 = 4.875$$

$$Q_3 = L + \frac{\frac{3N}{4} - \text{c.f.}}{f} \times C = 8 + \frac{60 - 57}{14} \times 2 = 8 + 0.428 = 8.428$$

Now

$$\text{Interquartile Range} = Q_3 - Q_1 = 8.428 - 4.875 = 3.553$$

$$\text{Quartile Deviation} = \frac{Q_3 - Q_1}{2} = \frac{3.553}{2} \approx 1.776$$

$$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{8.428 - 4.875}{8.428 + 4.875} = \frac{3.553}{13.303} = 0.267$$

Illustration 3. Following are the monthly earnings (in Rupees) of five earning members of a family: 680, 580, 850, 700, 750. Calculate the absolute mean deviation about the A.M. and the coefficient of mean deviation. Also calculate these measures about the median.

Solution.

$$\bar{X} = \frac{680 + 580 + 850 + 700 + 750}{5} = \frac{3560}{5} = 712$$

$$\text{Absolute Mean Deviation} = \frac{\sum |X - \bar{X}|}{n} = \frac{|680 - 712| + |580 - 712| + |850 - 712| + |700 - 712| + |750 - 712|}{5}$$

$$= \frac{32 + 132 + 138 + 12 + 38}{5} = \frac{352}{5} = 70.4$$

$$\text{Coefficient of Absolute Mean Deviation} = \frac{\text{Absolute Mean Deviation}}{\text{Mean}} = \frac{70.4}{712} = 0.099$$

Now we calculate the absolute mean deviation about the median. The median of 680, 580, 850, 700 and 750, i.e., of 580, 680, 700, 750 and 850, is the size of the 3rd observation = 700.

$$\text{Absolute Mean Deviation about Median} = \frac{\sum |X - \text{Median}|}{n} = \frac{|680 - 700| + |580 - 700| + |850 - 700| + |700 - 700| + |750 - 700|}{5}$$

$$= \frac{20 + 120 + 150 + 0 + 50}{5} = \frac{340}{5} = 68$$

$$\text{Coefficient of Absolute Mean Deviation about Median} = \frac{68}{700} = 0.097$$

Illustration 4. Calculate the absolute mean deviation about the mean for the following data.

X :   6    7    8    10   12
f :   3    6    11   8    2

Solution. To calculate the absolute mean deviation we form the following table:

X       f       fX      |X - X̄|    f|X - X̄|
6       3       18      2.4        7.2
7       6       42      1.4        8.4
8       11      88      0.4        4.4
10      8       80      1.6        12.8
12      2       24      3.6        7.2
Total   N = 30  252                40.0

$$\text{Mean} = \frac{\sum fX}{N} = \frac{252}{30} = 8.4$$

Absolute Mean Deviation $= \sum f\,|X - \bar{X}| \,/\, N$
$= 40/30 = 1.33$

Illustration 5. Calculate the absolute mean deviation from the median for the following data.

Marks              0-10   10-20   20-30   30-40   40-50
No. of students    1      3       8       2       2

Solution. To calculate the absolute mean deviation we first calculate the median. The median class is (20-30), since the (N/2)th = 8th observation lies in this class.

Marks    f        c.f.   m     |m - Median|   f|m - Median|
0-10     1        1      5     20             20
10-20    3        4      15    10             30
20-30    8        12     25    0              0
30-40    2        14     35    10             20
40-50    2        16     45    20             40
Total    N = 16                               110

$$\text{Median} = L + \frac{\frac{N}{2} - \text{c.f.}}{f} \times C = 20 + \frac{8 - 4}{8} \times 10 = 20 + 5 = 25$$

$$\text{Absolute Mean Deviation about Median} = \frac{\sum f\,|m - \text{Median}|}{N} = \frac{110}{16} = 6.875$$

5. Standard Deviation. [...]

$$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$

where X denotes the observations, $\bar{X}$ denotes the mean of the X values and N is the number of observations.

Root-mean-square deviation is the square root of the arithmetic mean of the squares of the deviations from an arbitrary value. When the deviations are taken from the arithmetic mean, there is no difference between root-mean-square deviation and standard deviation.

A relation connecting root-mean-square deviation and standard deviation. In case the A.M. is not a whole number, we take the deviations of the individual observations from an assumed mean (A), a whole number generally lying somewhere in the middle of the given observations, and use the following relationship to find the standard deviation.

If $x = X - \bar{X}$ and $d = X - A$, so that $\bar{d} = \frac{\sum d}{N} = \bar{X} - A$, then

$$\sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{\sum (X - A + A - \bar{X})^2}{N}} = \sqrt{\frac{\sum [(X - A) - (\bar{X} - A)]^2}{N}}$$

$$= \sqrt{\frac{\sum (X - A)^2 + \sum (\bar{X} - A)^2 - 2\sum (X - A)(\bar{X} - A)}{N}} = \sqrt{\frac{\sum d^2 + N\bar{d}^2 - 2N\bar{d}^2}{N}}$$

$$= \sqrt{\frac{\sum d^2}{N} - \bar{d}^2} = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$

Similarly, if the observations are present with frequencies, in discrete series data or in continuous series data, we can prove that

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}, \qquad \text{where } d = (X - A).$$

We can see that the S.D. of the X values is the same as the S.D. of the deviations of the X values from an assumed mean. Thus the value of the S.D. is not affected by changing the origin of the X values.

Also, if $Y = X/c$ then S.D. (Y) = S.D. (X)/c, i.e., if we divide all the given observations by a constant, the S.D. of the new observations is obtained by dividing the S.D. of the original observations by the same constant.
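To prove this, we have

$$\sigma_Y = \sqrt{\frac{\sum (Y - \bar{Y})^2}{N}} = \sqrt{\frac{\sum \left(\frac{X}{c} - \frac{\bar{X}}{c}\right)^2}{N}} = \frac{1}{c}\sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \frac{\text{S.D.}(X)}{c}$$

We can reduce our calculation work if we use the transformation

$$u = \frac{X - A}{c}$$

where A = assumed mean and c = constant (in continuous data with equal class intervals we take c = class interval). By applying this transformation, we find the S.D. of u and then calculate the S.D. of the X values by the formula

$$\text{S.D. of } X = c \times \sqrt{\frac{\sum f u^2}{N} - \left(\frac{\sum f u}{N}\right)^2}$$

Illustration 1. Calculate the standard deviation of 4, 5, 7, 5, 6 and 9.

Solution.

$$\text{Arithmetic Mean } \bar{X} = \frac{4 + 5 + 7 + 5 + 6 + 9}{6} = \frac{36}{6} = 6$$

$$\text{S.D. } (\sigma) = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{(4-6)^2 + (5-6)^2 + (7-6)^2 + (5-6)^2 + (6-6)^2 + (9-6)^2}{6}}$$

$$= \sqrt{\frac{4 + 1 + 1 + 1 + 0 + 9}{6}} = \sqrt{\frac{16}{6}} = \sqrt{2.67} = 1.63$$

The method of taking deviations from a suitable whole number should be preferred if the A.M. is not a whole number. Let us take A = 5. Then

$$\sum d = -1 + 0 + 2 + 0 + 1 + 4 = 6, \qquad \sum d^2 = 1 + 0 + 4 + 0 + 1 + 16 = 22, \qquad N = 6$$

and

$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{22}{6} - \left(\frac{6}{6}\right)^2} = \sqrt{3.67 - 1} = \sqrt{2.67} = 1.63$$

Illustration 2. Following is the distribution of heights of 80 teachers of a college. Determine the mean height and the standard deviation for the given data.

Height (in cm)     150-155   155-160   160-165   165-170   170-175   175-180   180-185
No. of teachers    8         14        35        12        8         2         1

Solution. We shall calculate the S.D. of the above data by two methods.

1st Method. By taking deviations from an assumed mean A. Let us take A = 162.5.

Height     m       f        d = m - A   fd     fd²
150-155    152.5   8        -10         -80    800
155-160    157.5   14       -5          -70    350
160-165    162.5   35       0           0      0
165-170    167.5   12       5           60     300
170-175    172.5   8        10          80     800
175-180    177.5   2        15          30     450
180-185    182.5   1        20          20     400
Total              N = 80               40     3100

$$\text{Mean height} = A + \frac{\sum f d}{N} = 162.5 + \frac{40}{80} = 163.0 \text{ cm}$$

$$\text{S.D. } (\sigma) = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} = \sqrt{\frac{3100}{80} - \left(\frac{40}{80}\right)^2} = \sqrt{38.75 - 0.25} = \sqrt{38.50} = 6.205$$

2nd Method. The calculation work can be further simplified if we change the origin as well as the scale of the X values. In this example let us take A = 162.5 and c = 5.

Height     m       f        u = (m - A)/c   fu     fu²
150-155    152.5   8        -2              -16    32
155-160    157.5   14       -1              -14    14
160-165    162.5   35       0               0      0
165-170    167.5   12       1               12     12
170-175    172.5   8        2               16     32
175-180    177.5   2        3               6      18
180-185    182.5   1        4               4      16
Total              N = 80                   8      124

$$\sigma = \sqrt{\frac{\sum f u^2}{N} - \left(\frac{\sum f u}{N}\right)^2} \times c = \sqrt{\frac{124}{80} - \left(\frac{8}{80}\right)^2} \times 5 = \sqrt{1.55 - 0.01} \times 5 = \sqrt{1.54} \times 5 = 1.241 \times 5 = 6.205$$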
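The change of origin and scale used in the 2nd Method can be sketched in code. This is an illustrative Python sketch rather than anything prescribed by the text: the function name `grouped_mean_sd` and its parameter names are assumptions, and the data are the class mid values and frequencies of the height distribution in Illustration 2.

```python
import math

def grouped_mean_sd(midpoints, freqs, assumed_mean, step):
    """Mean and S.D. of grouped data via u = (m - A) / c.

    assumed_mean (A) and step (c) are free choices; the result does not
    depend on them, which is the property shown in the derivation above.
    """
    n = sum(freqs)
    u = [(m - assumed_mean) / step for m in midpoints]
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))
    sum_fu2 = sum(f * ui * ui for f, ui in zip(freqs, u))
    mean = assumed_mean + step * sum_fu / n
    sd = step * math.sqrt(sum_fu2 / n - (sum_fu / n) ** 2)
    return mean, sd

# Height distribution of the 80 teachers (class mid values and frequencies).
midpoints = [152.5, 157.5, 162.5, 167.5, 172.5, 177.5, 182.5]
freqs = [8, 14, 35, 12, 8, 2, 1]

mean, sd = grouped_mean_sd(midpoints, freqs, assumed_mean=162.5, step=5)
print(round(mean, 3), round(sd, 3))  # 163.0 6.205
```

Calling the function with a different origin and scale, for example `assumed_mean=170, step=1`, gives the same mean and S.D. up to floating-point rounding, so A and c only need to be chosen for arithmetic convenience.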
Mathematical Properties of Standard Deviation. Following are some important mathematical properties of standard deviation.

1. Combined Standard Deviation. Just as it is possible to compute the combined arithmetic mean of two or more groups, we can also compute the combined standard deviation of two or more groups. The combined standard deviation $\sigma$ for any number of groups can be calculated from the formula

$$\sigma = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + \dots + N_n\sigma_n^2 + N_1 d_1^2 + N_2 d_2^2 + \dots + N_n d_n^2}{N_1 + N_2 + \dots + N_n}}$$

where $N_1, N_2, \dots, N_n$ denote the sizes of the samples whose standard deviations are $\sigma_1, \sigma_2, \dots, \sigma_n$, and $d_1, d_2, \dots, d_n$ are the deviations of the combined mean from the means of the 1st, 2nd, ..., nth sample. Thus if the number of groups is two, the combined S.D. ($\sigma$) is given by

$$\sigma = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$

2. The standard deviation of the first N natural numbers is $\sqrt{\frac{1}{12}(N^2 - 1)}$.

The A.M. of the first N natural numbers is

$$\bar{X} = \frac{\sum X}{N} = \frac{N(N + 1)}{2N} = \frac{N + 1}{2}$$

Also

$$\sum X^2 = 1^2 + 2^2 + \dots + N^2 = \frac{N(N + 1)(2N + 1)}{6}$$

Hence the S.D. of the first N natural numbers is

$$\sigma = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2} = \sqrt{\frac{(N + 1)(2N + 1)}{6} - \frac{(N + 1)^2}{4}} = \sqrt{\frac{N^2 - 1}{12}}$$

3. The sum of the squares of the deviations of items is minimum when the deviations are taken from the A.M. This is why the standard deviation is always computed from the A.M. This property has been proved in the last chapter.

4. For a symmetrical, bell-shaped (normal) distribution, the following area relationships hold, as shown in the figure given below: Mean ± 1σ covers 68.27% of the items, Mean ± 2σ covers 95.45% of the observations and Mean ± 3σ covers 99.73% of the observations.

[Fig. 6.1. Distribution of observations in terms of Mean and S.D.]

Some Empirical Relations between Measures of Dispersion.
The following relations connecting the measures of dispersion are true for a symmetrical distribution and also hold approximately for a moderately skewed distribution:
(a) Mean deviation = 4/5 (Standard deviation).
(b) Semi-Inter Quartile Range = 2/3 (Standard deviation).

Illustration 1. The number of workers employed, the mean wage (in Rs.) per month and the standard deviation (in Rs.) in each section of a factory are given below. Calculate the mean wage and the standard deviation of all the workers taken together.

Section   No. of workers employed   Mean wage (Rs.)   Standard deviation (Rs.)
A         50                        113               6
B         60                        120               7
C         90                        115               8

Solution. First we calculate the combined mean wage:

$$\bar{X} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2 + N_3\bar{X}_3}{N_1 + N_2 + N_3} = \frac{(50)(113) + (60)(120) + (90)(115)}{50 + 60 + 90} = \frac{5650 + 7200 + 10350}{200} = \frac{23200}{200} = \text{Rs. } 116$$

Also

$$\text{Combined S.D. } \sigma = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$$

We are given $\sigma_1 = 6$, $\sigma_2 = 7$, $\sigma_3 = 8$, $N_1 = 50$, $N_2 = 60$, $N_3 = 90$, and the deviations of the section means from the combined mean are $d_1 = 113 - 116 = -3$, $d_2 = 120 - 116 = 4$, $d_3 = 115 - 116 = -1$. Hence

$$\sigma = \sqrt{\frac{50(36) + 60(49) + 90(64) + 50(9) + 60(16) + 90(1)}{200}} = \sqrt{\frac{1800 + 2940 + 5760 + 450 + 960 + 90}{200}} = \sqrt{\frac{12000}{200}} = \sqrt{60} = 7.746$$

Coefficient of Variation. The standard deviation discussed above is an absolute measure of dispersion. The relative measure of dispersion based on the S.D. ($\sigma$) is called the coefficient of variation. It is denoted by the symbol 'C.V.' and is defined as the ratio of the S.D. ($\sigma$) to the A.M. ($\bar{X}$). Thus

$$\text{C.V.} = \frac{\sigma}{\bar{X}}$$

Mostly it is expressed as a percentage. Thus,

$$\text{Coefficient of variation} = \frac{\sigma}{\bar{X}} \times 100\,\%$$

This measure is one of the best measures of relative dispersion, and whenever one has to compare given series with respect to their variability, it is usually preferred over the other measures. The series with the smaller coefficient of variation is said to be more consistent (or less variable) than the other series, and vice versa.

Illustration 1. Following are the runs made by two cricket players M and N in eight test matches. Find out who is better on the average and who is more consistent.

Runs made
M (X) :   25   42   36   62   19   38   85   48
N (Y) :   20   13   52   48   45   56   65   38

Solution. We shall compare
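the A.M. of runs in order to find who is better on the average, and the coefficients of variation to find who is more consistent.

X       d_x = (X - 50)   d_x²      Y       d_y = (Y - 40)   d_y²
25      -25              625       20      -20              400
42      -8               64        13      -27              729
36      -14              196       52      12               144
62      12               144       48      8                64
19      -31              961       45      5                25
38      -12              144       56      16               256
85      35               1225      65      25               625
48      -2               4         38      -2               4
ΣX = 355   Σd_x = -45    Σd_x² = 3363   ΣY = 337   Σd_y = 17   Σd_y² = 2247

$$\text{A.M. of M} = A + \frac{\sum d_x}{N} = 50 - \frac{45}{8} = \frac{355}{8} = 44.375$$

$$\text{A.M. of N} = A + \frac{\sum d_y}{N} = 40 + \frac{17}{8} = \frac{337}{8} = 42.125$$

Hence, on the basis of mean runs, player M is better than N.

Now, to calculate the C.V.'s, we calculate the S.D.'s of the two players.

$$\text{S.D. (M)} = \sqrt{\frac{\sum d_x^2}{N} - \left(\frac{\sum d_x}{N}\right)^2} = \sqrt{\frac{3363}{8} - \left(\frac{-45}{8}\right)^2} = \sqrt{420.375 - 31.64} = 19.72 \text{ (approx.)}$$

$$\text{S.D. (N)} = \sqrt{\frac{\sum d_y^2}{N} - \left(\frac{\sum d_y}{N}\right)^2} = \sqrt{\frac{2247}{8} - \left(\frac{17}{8}\right)^2} = \sqrt{280.875 - 4.52} = 16.62 \text{ (approx.)}$$

$$\text{C.V. for M} = \frac{19.72}{44.375} \times 100 = 44.4\%$$

$$\text{C.V. for N} = \frac{16.62}{42.125} \times 100 = 39.5\%$$

Hence player N is more consistent than M.

Illustration 2. Following is the distribution of monthly wages of employees of two factories. Which factory pays more money to its employees, and in which factory do the monthly wages have less variability?

Wages (Rs.)    Factory I   Factory II
200-400        4           7
400-600        8           12
600-800        11          9
800-1000       6           4
1000-1200      5           4
1200-1400      3           3
1400-1600      3           1

Solution. To see which factory pays more to its employees, we shall compare the total money paid by the two factories. To compare the variability of wages, we shall compare the values of the C.V.

Factory I: Calculation of Mean and S.D.

Wages        m       u = (m - 900)/200   f        fu     fu²
200-400      300     -3                  4        -12    36
400-600      500     -2                  8        -16    32
600-800      700     -1                  11       -11    11
800-1000     900     0                   6        0      0
1000-1200    1100    1                   5        5      5
1200-1400    1300    2                   3        6      12
1400-1600    1500    3                   3        9      27
Total                                    N = 40   -19    123

$$\bar{X} = A + \frac{\sum f u}{N} \times c = 900 + \frac{-19}{40} \times 200 = 900 - 95 = 805$$

$$\sigma = \sqrt{\frac{\sum f u^2}{N} - \left(\frac{\sum f u}{N}\right)^2} \times c = \sqrt{\frac{123}{40} - \left(\frac{-19}{40}\right)^2} \times 200 = \sqrt{3.075 - 0.226} \times 200 = 337.6$$

$$\text{C.V. for Factory I} = \frac{\sigma}{\bar{X}} \times 100 = \frac{337.6}{805} \times 100 = 41.94\%$$

For Factory II

Wages        m       u = (m - 900)/200   f        fu     fu²
200-400      300     -3                  7        -21    63
400-600      500     -2                  12       -24    48
600-800      700     -1                  9        -9     9
800-1000     900     0                   4        0      0
1000-1200    1100    1                   4        4      4
1200-1400    1300    2                   3        6      12
1400-1600    1500    3                   1        3      9
Total                                    N = 40   -41    145

$$\bar{X} = A + \frac{\sum f u}{N} \times c = 900 + \frac{-41}{40} \times 200 = 900 - 205 = 695$$

$$\sigma = \sqrt{\frac{\sum f u^2}{N} - \left(\frac{\sum f u}{N}\right)^2} \times c = \sqrt{\frac{145}{40} - \left(\frac{-41}{40}\right)^2} \times 200 = \sqrt{3.625 - 1.051} \times 200 = 320.9$$

C.V. for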
Factory II $= \frac{\sigma}{\bar{X}} \times 100 = \frac{320.9}{695} \times 100 = 46.2\%$

Now the total money paid by Factory I to its employees = $\bar{X}$ × (No. of employees) = 805 × 40 = Rs. 32200.

Total money paid by Factory II to its employees = $\bar{X}$ × (No. of employees) = 695 × 40 = Rs. 27800.

Thus Factory I pays more money to its employees. Comparing the values of the C.V. for the two factories, we conclude that monthly wages have less variability in Factory I.

Variance. This term was first given by R.A. Fisher in 1918. It is defined as the square of the standard deviation. As the S.D. is denoted by $\sigma$, the variance is denoted by the symbol $\sigma^2$. Thus standard deviation = $\sqrt{\text{Variance}}$.

Merits of Standard Deviation are given below.
1. The standard deviation is the best measure of dispersion because of its mathematical properties.
2. It is based on every item of the series and is rigidly defined.
3. It is possible to calculate the combined standard deviation of two or more groups.
4. For comparing the variability of two or more series, the coefficient of variation is the most appropriate measure, and it is based on the standard deviation.
5. It is prominently used for further statistical work.

Demerits of this measure are:
1. It is difficult to calculate.
2. It gives more weight to extreme items and less to those which are near the mean.

6.3. MOMENTS

If a variable X takes the values $X_1, X_2, \dots, X_n$, then the rth moment is defined as the A.M. of the rth powers of the variable X. Thus,

$$r\text{th Moment of } X = \frac{X_1^r + X_2^r + \dots + X_n^r}{n} = \frac{\sum X^r}{n}$$

Clearly the 1st moment of X is $\frac{X_1 + X_2 + \dots + X_n}{n} = \bar{X}$ (the A.M.).

Also, we define the rth moment about an arbitrary value A as the A.M. of the rth powers of the deviations of the individual values of X from A; this is sometimes termed a raw moment. We shall denote the rth moment about A by the symbol $m'_r$, given by

$$m'_r = \frac{\sum (X - A)^r}{n}$$

On the same lines, we may define the moment about the mean $\bar{X}$ as the A.M. of the rth powers of the deviations of the individual values of X from $\bar{X}$.
We shall call these moments 'Central Moments' and shall denote the rth central moment by the symbol $m_r$. Thus,

$$m_r = \frac{(X_1 - \bar{X})^r + (X_2 - \bar{X})^r + \dots + (X_n - \bar{X})^r}{n} = \frac{\sum (X - \bar{X})^r}{n}$$

Clearly when r = 2, $m_2$ is called the 2nd central moment and is given by

$$m_2 = \frac{\sum (X - \bar{X})^2}{n} = \sigma^2$$

For computing moments for grouped data, we shall use the following formulae. If $X_1, X_2, \dots, X_n$ occur with frequencies $f_1, f_2, \dots, f_n$ respectively, then

$$r\text{th Moment} = \frac{f_1 X_1^r + f_2 X_2^r + \dots + f_n X_n^r}{f_1 + f_2 + \dots + f_n} = \frac{\sum f X^r}{N}$$

$$m'_r = \frac{f_1 (X_1 - A)^r + f_2 (X_2 - A)^r + \dots + f_n (X_n - A)^r}{N} = \frac{\sum f (X - A)^r}{N}$$

and

$$m_r = \frac{f_1 (X_1 - \bar{X})^r + f_2 (X_2 - \bar{X})^r + \dots + f_n (X_n - \bar{X})^r}{N} = \frac{\sum f (X - \bar{X})^r}{N}$$

where $N = \sum f$.

Relations between Moments. The central moments ($m_r$) and the raw moments about A ($m'_r$) are connected by the following relations:

$$m_1 = 0, \qquad m_2 = m'_2 - m'^2_1, \qquad m_3 = m'_3 - 3m'_2 m'_1 + 2m'^3_1$$

and

$$m_4 = m'_4 - 4m'_3 m'_1 + 6m'_2 m'^2_1 - 3m'^4_1$$

These relations are proved in Illustration 1 below.

The calculation of $m'_r$ can be simplified by making the transformation $u = \frac{X - A}{c}$, where A and c are constants. Then

$$m'_r = c^r \cdot \frac{\sum f u^r}{N}$$

and hence the central moments can be calculated.

Illustration 1. Prove that
(a) $m_2 = m'_2 - m'^2_1$
(b) $m_3 = m'_3 - 3m'_2 m'_1 + 2m'^3_1$
(c) $m_4 = m'_4 - 4m'_3 m'_1 + 6m'_2 m'^2_1 - 3m'^4_1$

Solution. If $d = X - A$, then $X = A + d$, $\bar{X} = A + \bar{d}$ and $X - \bar{X} = d - \bar{d}$, where $\bar{d} = \frac{\sum d}{N} = m'_1$.

(a)
$$m_2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{\sum (d - \bar{d})^2}{N} = \frac{\sum (d^2 - 2d\bar{d} + \bar{d}^2)}{N} = \frac{\sum d^2}{N} - 2\bar{d}^2 + \bar{d}^2 = m'_2 - m'^2_1$$

(b)
$$m_3 = \frac{\sum (X - \bar{X})^3}{N} = \frac{\sum (d - \bar{d})^3}{N} = \frac{\sum (d^3 - 3d^2\bar{d} + 3d\bar{d}^2 - \bar{d}^3)}{N} = m'_3 - 3m'_2 m'_1 + 3m'^3_1 - m'^3_1$$

Proceeding on similar lines as in (a), the result can be proved.

(c)
$$m_4 = \frac{\sum (X - \bar{X})^4}{N} = \frac{\sum (d - \bar{d})^4}{N} = m'_4 - 4m'_3 m'_1 + 6m'_2 m'^2_1 - 4m'^4_1 + m'^4_1$$

Proceeding on similar lines, the result can be obtained.

Illustration 2. Calculate the first four moments about the mean from the following data giving the distribution of age of students of a school.

Age               4-6   6-8   8-10   10-12   12-14   14-16   16-18
No. of Students   34    98    85     72      69      57      35

Solution. We shall first determine the moments about an assumed mean, then calculate the central moments using the appropriate formulae.
Age      Mid value (m)   u = (m - 11)/2   f     fu     fu²    fu³    fu⁴
4-6      5               -3               34    -102   306    -918   2754
6-8      7               -2               98    -196   392    -784   1568
8-10     9               -1               85    -85    85     -85    85
10-12    11              0                72    0      0      0      0
12-14    13              1                69    69     69     69     69
14-16    15              2                57    114    228    456    912
16-18    17              3                35    105    315    945    2835

Here c = 2, A = 11, N = Σf = 450, Σfu = -95, Σfu² = 1395, Σfu³ = -317 and Σfu⁴ = 8223.

Using $m'_r = c^r \cdot \frac{\sum f u^r}{N}$, we get

$$m'_1 = (2)\,\frac{(-95)}{450} = -\frac{19}{45} = -0.422, \qquad m'_2 = (2)^2\,\frac{1395}{450} = \frac{558}{45} = 12.4$$

$$m'_3 = (2)^3\,\frac{(-317)}{450} = -\frac{2536}{450} = -5.636, \qquad m'_4 = (2)^4\,\frac{8223}{450} = \frac{131568}{450} = 292.37$$

Now we calculate the central moments:

$$m_1 = 0, \qquad m_2 = m'_2 - m'^2_1 = \frac{558}{45} - \left(-\frac{19}{45}\right)^2 = 12.4 - 0.178 = 12.22$$

$$m_3 = m'_3 - 3m'_2 m'_1 + 2m'^3_1 = -\frac{2536}{450} - 3\left(\frac{558}{45}\right)\left(-\frac{19}{45}\right) + 2\left(-\frac{19}{45}\right)^3 = -5.636 + 15.707 - 0.151 = 9.92$$

$$m_4 = m'_4 - 4m'_3 m'_1 + 6m'_2 m'^2_1 - 3m'^4_1 = \frac{131568}{450} - 4\left(-\frac{2536}{450}\right)\left(-\frac{19}{45}\right) + 6\left(\frac{558}{45}\right)\left(-\frac{19}{45}\right)^2 - 3\left(-\frac{19}{45}\right)^4$$

$$= 292.373 - 9.518 + 13.263 - 0.095 = 296.02$$

6.4. SKEWNESS

[...] A.M., median and mode coincide in this type of distribution (the symmetrical distribution).

A distribution is said to be positively skewed if the value of frequency increases suddenly but falls slowly. For this type of distribution the mode will have the smallest value and the A.M. will have the largest value, while the median will lie in between the mode and the A.M.

A distribution is said to be negatively skewed if the value of frequency increases slowly and falls suddenly. For this type of distribution the mode will have the largest value and the A.M. will have the smallest value, while the median will lie in between the mode and the A.M. A symmetrical curve, a positively skewed curve and a negatively skewed curve will, respectively, be of the types shown in Figures 6.2 to 6.4.

[Fig. 6.2. Symmetrical curve (Mean = Median = Mode)]
[Fig. 6.3. Positively skewed curve (Mode, Median, Mean in increasing order)]
[Fig. 6.4. Negatively skewed curve (Mean, Median, Mode in increasing order)]

[...] curves are shown in Figures 6.5 and 6.6 respectively. For J-shaped curves, the items having the maximum frequency are generally at one end. For U-shaped curves, the values at the extremes have very high frequencies and the central values have low frequencies.

[Fig. 6.5. U-shaped curve (axes: variable, frequency)]
[Fig. 6.6. (axes: variable, frequency)]

Generally we call a distribution skewed if any of the following conditions is satisfied.
1. A.M., median and mode of the distribution fall at different points.
2. Quartiles are not equidistant from the median.
3. The frequency curve drawn with the given data is stretched more on one side than the other.

Two types of measures of skewness are given below:

(a) Absolute measures of Skewness. These measures have the same units as those of the individual observations. Following are the absolute measures of skewness:
(i) Skewness = Mean - Median.
(ii) Skewness = Mean - Mode.
(iii) Skewness = ($Q_3$ - Median) - (Median - $Q_1$).

(b) Relative measures of Skewness. These measures are independent of the units of measurement. Following are the relative measures of skewness.

(i) Karl Pearson's Coefficients of Skewness.

$$\text{Coefficient of Skewness} = \frac{\text{A.M.} - \text{Mode}}{\text{Standard Deviation}}$$

In practice the value of this measure usually lies between -1 and +1. If A.M. > Mode, the distribution is positively skewed, and if Mode > A.M., it is negatively skewed. If the mode is ill defined, the coefficient of skewness given below should be preferred, as it does not use the value of the mode:

$$\text{Coefficient of Skewness} = \frac{3(\text{A.M.} - \text{Median})}{\text{Standard Deviation}}$$

The limits of this measure are -3 and +3. Both these formulae were given by Karl Pearson.

(ii) Bowley's Coefficient of Skewness (based on Quartiles).

$$\text{Coefficient of Skewness} = \frac{Q_3 + Q_1 - 2\,\text{Med}}{Q_3 - Q_1}$$

The limits of this coefficient are -1 and +1.

(iii) Coefficient of Skewness based on Moments. This coefficient is given by

$$\beta_1 = \frac{m_3^2}{m_2^3} \qquad \text{or} \qquad \gamma_1 = \sqrt{\beta_1} = \frac{m_3}{(m_2)^{3/2}}$$

where $\gamma_1$ takes the sign of $m_3$. If $\gamma_1$ is positive (or negative), the distribution is said to be positively (or negatively) skewed. A zero value of $\gamma_1$ indicates that the distribution is symmetrical.

Illustration 1. Compute Karl Pearson's coefficient of skewness and Bowley's coefficient of skewness (based on quartiles) for the following data on milk yield collected from a farm.

Milk yield (kg)     0-2   2-4   4-6   6-8   8-10   10-12   12-14   14-16
No. of Buffaloes    4     8     15    28    10     4       2       1

Solution. Calculation of A.M., mode and standard deviation:
Milk yield   Mid value (m)   No. of Buffaloes (f)   d' = (m - 9)/2   fd'    fd'²
0-2          1               4                      -4               -16    64
2-4          3               8                      -3               -24    72
4-6          5               15                     -2               -30    60
6-8          7               28                     -1               -28    28
8-10         9               10                     0                0      0
10-12        11              4                      1                4      4
12-14        13              2                      2                4      8
14-16        15              1                      3                3      9
Total                        N = 72                                  -87    245

(i) A.M.:

$$\bar{X} = A + \frac{\sum f d'}{N} \times c = 9 + \frac{-87}{72} \times 2 = 9 - 2.42 = 6.58$$

(ii) Mode:

$$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C = 6 + \frac{28 - 15}{56 - 15 - 10} \times 2 = 6 + \frac{13}{31} \times 2 = 6.84$$

(iii) S.D.:

$$\sigma = \sqrt{\frac{\sum f d'^2}{N} - \left(\frac{\sum f d'}{N}\right)^2} \times c = \sqrt{\frac{245}{72} - \left(\frac{-87}{72}\right)^2} \times 2 = \sqrt{3.403 - 1.460} \times 2 = 2.79$$

$$\text{Karl Pearson's Coefficient of Skewness} = \frac{\text{A.M.} - \text{Mode}}{\text{S.D.}} = \frac{6.58 - 6.84}{2.79} = -0.09$$

Calculation of Quartiles and Median:

Milk Yield   f        c.f.
0-2          4        4
2-4          8        12
4-6          15       27
6-8          28       55
8-10         10       65
10-12        4        69
12-14        2        71
14-16        1        72
Total        N = 72

$Q_3$ lies in the class (6-8), as the (3N/4)th, i.e., the 54th observation lies in this class.

$$Q_3 = L + \frac{\frac{3N}{4} - \text{c.f.}}{f} \times C = 6 + \frac{54 - 27}{28} \times 2 = 7.93$$

$Q_1$ lies in the class (4-6), as the (N/4)th, i.e., the 18th observation lies in this class interval. Hence

$$Q_1 = L + \frac{\frac{N}{4} - \text{c.f.}}{f} \times C = 4 + \frac{18 - 12}{15} \times 2 = 4 + 0.8 = 4.8$$

The median lies in the class (6-8), as the (N/2)th, i.e., the 36th observation lies in this class interval. Hence

$$\text{Median} = L + \frac{\frac{N}{2} - \text{c.f.}}{f} \times C = 6 + \frac{36 - 27}{28} \times 2 = 6.64$$

$$\text{Bowley's coefficient of skewness (based on quartiles)} = \frac{Q_3 + Q_1 - 2\,\text{Med}}{Q_3 - Q_1} = \frac{7.93 + 4.8 - 2(6.64)}{7.93 - 4.8} = \frac{-0.55}{3.13} = -0.176$$

6.5. KURTOSIS

Kurtosis is a Greek word meaning bulkiness. In statistics it refers to the degree of flatness or peakedness of a frequency curve in the region about the mode. The degree of kurtosis is measured relative to the peakedness of the normal curve. If a curve is more peaked than the normal curve, it is called 'Leptokurtic'. In such a case the items are more closely bunched around the mode. If a curve is less peaked than the normal curve, i.e., it is flat topped, then it is called 'Platykurtic'. The normal curve is called 'Mesokurtic'. The following diagram illustrates the shapes of the three curves mentioned above.

[Fig. 6.7. Shape of Platykurtic (P), Mesokurtic (M) and Leptokurtic (L) curves]

Measure of Kurtosis.
The coefficient of kurtosis (based on moments) is a relative measure of kurtosis denoted by $\beta_2$ and is given by

$$\beta_2 = \frac{m_4}{m_2^2}$$

where $m_4$ and $m_2$ are, respectively, the 4th and 2nd central moments.

For a normal curve $\beta_2 = 3$. If for a distribution the value of $\beta_2$ is greater than 3, the frequency curve of the distribution is more peaked than the normal curve (Leptokurtic). If the value of $\beta_2$ is less than 3, the curve is less peaked than the normal curve (Platykurtic).

Sometimes a measure $\gamma_2$ derived from $\beta_2$ is used as a measure of kurtosis and is given by $\gamma_2 = \beta_2 - 3$.
If for a distribution $\gamma_2 = 0$, the distribution is normal (Mesokurtic);
if $\gamma_2 > 0$, the distribution is Leptokurtic; and
if $\gamma_2 < 0$, the distribution is Platykurtic.

Illustration 2. Compute the coefficient of skewness and the coefficient of kurtosis based on moments for the following data.

Plant Height (in cm)   30-35   35-40   40-45   45-50   50-55   55-60   60-65   65-70
No. of Plants          5       14      16      25      14      12      8       6

Solution. Calculation of Moments:

Plant Height (in cm)   Mid value (m)   No. of plants (f)   u = (m - 52.5)/5   fu     fu²    fu³    fu⁴
30-35                  32.5            5                   -4                 -20    80     -320   1280
35-40                  37.5            14                  -3                 -42    126    -378   1134
40-45                  42.5            16                  -2                 -32    64     -128   256
45-50                  47.5            25                  -1                 -25    25     -25    25
50-55                  52.5            14                  0                  0      0      0      0
55-60                  57.5            12                  1                  12     12     12     12
60-65                  62.5            8                   2                  16     32     64     128
65-70                  67.5            6                   3                  18     54     162    486
Total                                  N = 100

Here Σfu = -73, Σfu² = 393, Σfu³ = -613 and Σfu⁴ = 3321.

Calculation of raw moments: Using the formula $m'_r = c^r \cdot \frac{\sum f u^r}{N}$, we get

$$m'_1 = (5)\,\frac{(-73)}{100} = -3.65, \qquad m'_2 = (5)^2\,\frac{393}{100} = 98.25$$

$$m'_3 = (5)^3\,\frac{(-613)}{100} = -766.25, \qquad m'_4 = (5)^4\,\frac{3321}{100} = 20756.25$$

Using the raw moments, the central moments can be calculated:

$$m_1 = 0, \qquad m_2 = m'_2 - m'^2_1 = 98.25 - (-3.65)^2 = 84.93$$

$$m_3 = m'_3 - 3m'_2 m'_1 + 2m'^3_1 = -766.25 - 3(98.25)(-3.65) + 2(-3.65)^3 = 212.34$$

$$m_4 = m'_4 - 4m'_3 m'_1 + 6m'_2 m'^2_1 - 3m'^4_1 = 20756.25 - 4(-766.25)(-3.65) + 6(98.25)(-3.65)^2 - 3(-3.65)^4 = 16890.14$$

$$\text{Coefficient of skewness } \gamma_1 = \frac{m_3}{(m_2)^{3/2}} = \frac{212.34}{(84.93)^{3/2}} = \frac{212.34}{782.69} = 0.27$$

Coefficient of kurtosis $\gamma_2 = \frac{m_4}{(m_2)^2} - 3 = \frac{16890.14}{7213.10} - 3$
$= 2.34 - 3 = -0.66$

6.6. COMPARISON OF DISPERSION, SKEWNESS AND KURTOSIS

Dispersion is the study of the variation of the individual items of a series among themselves. It does not tell us about the extent to which deviations cluster below or above the average value. The measures of skewness take this point into consideration and tell us about the clustering of deviations above and below an average. Kurtosis studies the concentration of the various items at the central part of a series. Thus these three measures study three different aspects of a frequency distribution.

MISCELLANEOUS ILLUSTRATIONS

Illustration 1. An experimenter took observations on 100 eggs and found the arithmetic mean and S.D. to be 56 gm and 6.8 gm respectively. For the first 50 eggs selected from these 100 eggs, the mean and standard deviation were 55 gm and 4.8 gm respectively. Find the arithmetic mean and standard deviation of the other half.

Solution. Let $N_1$ and $N_2$ represent the number of observations of the two groups. Then we have

$N_1 + N_2 = 100$, $\bar{X} = 56$, $\sigma = 6.8$
$N_1 = 50$, $\bar{X}_1 = 55$, $\sigma_1 = 4.8$
$N_2 = 50$, $\bar{X}_2 = \,?$, $\sigma_2 = \,?$

To find the mean of the 2nd group, we equate $\bar{X}$ with the combined mean, i.e.,

$$\bar{X} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2}, \qquad 56 = \frac{50(55) + 50(\bar{X}_2)}{100}$$

$$2750 + 50\bar{X}_2 = 5600, \qquad \bar{X}_2 = \frac{2850}{50} = 57$$

Similarly, to find the S.D. of the 2nd group, we use

$$\sigma^2 = \frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}$$

We have $N_1 = 50$, $N_2 = 50$, $\sigma_1 = 4.8$, $\sigma = 6.8$,
$d_1 = (\bar{X}_1 - \bar{X}) = (55 - 56) = -1$ and $d_2 = (\bar{X}_2 - \bar{X}) = (57 - 56) = +1$.

Substituting these values in the above formula we get

$$(6.8)^2 = \frac{50(4.8)^2 + 50\sigma_2^2 + 50(-1)^2 + 50(1)^2}{100}$$

or

$$46.24 = \frac{1252 + 50\sigma_2^2}{100}$$

or $\sigma_2^2 = 67.44$, $\sigma_2 = 8.21$.

Hence the A.M. of the other half = 57 gm and its S.D. = 8.21 gm.

Illustration 2. Following is the data of milk yield obtained from three farms for the year 1980-81. Find the missing observations.

                           Farm I    Farm II   Farm III   Combined
No. of cows                50        ?         90         200
S.D. (kg)                  225.8     270.6     ?          260.4
A.M. of milk yield (kg)    3500.4    ?         3414.8     3516.4

Solution.
If we represent the number of observations of the 1st, 2nd and 3rd farms by $N_1$, $N_2$ and $N_3$, then $N_1 = 50$, $N_3 = 90$ and $N_1 + N_2 + N_3 = 200$.

$N_1 + N_3 = 50 + 90 = 140$

Hence $N_2 = 200 - 140 = 60$.

Let $\bar{X}_1$, $\bar{X}_2$ and $\bar{X}_3$ denote the means of the first, second and third farms. Then

$$\bar{X} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2 + N_3 \bar{X}_3}{N_1 + N_2 + N_3}$$

$$3516.4 = \frac{50\,(3500.4) + 60\,\bar{X}_2 + 90\,(3414.8)}{50 + 60 + 90}$$

$$175020 + 60\,\bar{X}_2 + 307332 = 200 \times 3516.4$$

$$\bar{X}_2 = \frac{220928}{60} = 3682.1$$

To find the standard deviation of the 3rd farm, we have

$$\sigma^2 = \frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_3 \sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}$$

We are given $\sigma = 260.4$, $N_1 = 50$, $N_2 = 60$, $N_3 = 90$, $\sigma_1 = 225.8$, $\sigma_2 = 270.6$, and

$d_1 = (\bar{X}_1 - \bar{X}) = (3500.4 - 3516.4) = -16$
$d_2 = (\bar{X}_2 - \bar{X}) = (3682.1 - 3516.4) = 165.7$
$d_3 = (\bar{X}_3 - \bar{X}) = (3414.8 - 3516.4) = -101.6$

Substituting these values, we get

$$(260.4)^2 = \frac{50\,(225.8)^2 + 60\,(270.6)^2 + 90\,\sigma_3^2 + 50\,(-16)^2 + 60\,(165.7)^2 + 90\,(-101.6)^2}{200}$$

or

$$(260.4)^2 = \frac{9531963.4 + 90\,\sigma_3^2}{200}$$

$$9531963.4 + 90\,\sigma_3^2 = 13561632$$

$$\sigma_3^2 = 44774.1, \qquad \sigma_3 = 211.6$$

Illustration 3. In a small town a survey was conducted in respect of the profit made by retail shops. The following results were obtained.

Profit (in '000 Rs):  -3 to -2   -2 to -1   -1 to 0   0 to 1   1 to 2   2 to 3   3 to 4
No. of retailers:          4         10        24       38       58       32       10

Here negative values indicate loss incurred. Calculate
(i) the average profit made by a retail shop,
(ii) the total profit of all shops,
(iii) the coefficient of variation of earnings.

Solution. Calculation of mean and standard deviation:

Profit          No. of shops   Mid value   d = m - 0.5    fd     fd^2
(in '000 Rs)    f              m
-3 to -2          4            -2.5          -3           -12      36
-2 to -1         10            -1.5          -2           -20      40
-1 to 0          24            -0.5          -1           -24      24
 0 to 1          38             0.5           0             0       0
 1 to 2          58             1.5           1            58      58
 2 to 3          32             2.5           2            64     128
 3 to 4          10             3.5           3            30      90
Total          N = 176                                      96     376

$$\text{Average} = \bar{X} = A + \frac{\sum fd}{N} = 0.5 + \frac{96}{176} = 0.5 + 0.5455 = 1.0455 \text{ (thousand Rs)}, \text{ i.e. about Rs. } 1045.50$$

$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} = \sqrt{\frac{376}{176} - (0.5455)^2} = \sqrt{2.1364 - 0.2975} = 1.356 \text{ (thousand Rs)}$$

$$\text{Coefficient of variation} = \frac{\sigma}{\bar{X}} \times 100 = \frac{1.356}{1.0455} \times 100 = 1.3 \times 100 = 130\% \text{ (approximately)}$$

$$\text{Total profit} = N \bar{X} = 176 \times 1.04545 = 184 \text{ (thousand Rs)}, \text{ i.e. Rs. } 184{,}000$$

Illustration 4. In a frequency distribution, the coefficient of skewness based upon the quartiles is 0.6.
If the sum of the upper and lower quartiles is 100 and the median is 38, find the value of the upper quartile. (Bom. UR.A(s)' 67)

Solution. Given: coefficient of skewness = 0.6, $Q_3 + Q_1 = 100$, Median = 38.

We know that

$$\text{Coefficient of skewness} = \frac{Q_3 + Q_1 - 2\,\text{Med.}}{Q_3 - Q_1}$$

$$0.6 = \frac{100 - 2\,(38)}{Q_3 - Q_1} = \frac{24}{Q_3 - Q_1}, \qquad Q_3 - Q_1 = \frac{24}{0.6} = 40$$

Adding $Q_3 - Q_1 = 40$ and $Q_3 + Q_1 = 100$ gives $2 Q_3 = 140$, so $Q_3 = 70$.

Hence the value of the upper quartile is 70.

Illustration 5. A frequency distribution gives the following results:
(i) coefficient of variation = 10 per cent
(ii) Karl Pearson's coefficient of skewness = 0.6
(iii) $\sigma = 0.8$
Find the A.M., median and mode of the distribution.

Solution. Given $\sigma = 0.8$, coefficient of skewness = 0.6, C.V. = 10 per cent.

We know

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100, \qquad 10 = \frac{0.8}{\bar{X}} \times 100, \qquad \text{or } \bar{X} = 8.$$

Also

$$\text{Coefficient of skewness} = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}, \qquad 0.6 = \frac{8 - \text{Mode}}{0.8}$$

or Mode = 8 - 0.48 = 7.52.

But we know that Mode = 3 Median - 2 Mean. Thus 7.52 = 3 (Median) - 2 (8), or Median = 7.84.

Illustration 6. Find the coefficient of variation of a frequency distribution given that its A.M. is 60, mode is 75 and Karl Pearson's coefficient of skewness is -0.6.

Solution. Given $\bar{X} = 60$, Mode = 75, coefficient of skewness = -0.6.

We know that

$$\text{Coefficient of skewness} = \frac{\text{A.M.} - \text{Mode}}{\sigma}$$

Thus

$$-0.6 = \frac{60 - 75}{\sigma}, \qquad \text{or } \sigma = \frac{15}{0.6} = 25$$

Hence

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{25}{60} \times 100 = 41.67 \text{ per cent}$$

EXERCISES

1. (a) What is meant by a 'measure of dispersion'? State the different methods of measuring it.
   (b) Discuss the relative advantages of coefficient of variation and standard deviation as measures of variation.
2. (a) What are quartiles? How are they used for measuring dispersion?
   (b) Why is it that the standard deviation is so widely preferred as a measure of dispersion?
3. (a) What are the principal measures of dispersion? Discuss their merits and demerits.
   (b) Distinguish between 'Variance' and 'Coefficient of Variation'.
4. In what ways do measures of variation supplement measures of central tendency?
5. Explain, "The duty of the statistician goes much beyond collecting data and calculations. Facts do not speak of themselves and it is the statistician who must interpret the statistical results to discover their meaning." Taking the number of letters in each word as the variable, prepare the frequency distribution and calculate the values of A.M. and S.D.
6. Explain the terms 'Skewness' and 'Kurtosis' used in connection with the frequency distribution of a continuous variable. Give the different measures of Skewness and Kurtosis.
7. (a) Define moments. Establish the relation between the raw moments and the central moments.
   (b) How are moments helpful in the study of different aspects of the formation of a frequency distribution?
8. Given the following results, compute the missing items:

                             Number of observations   Mean   ...
   Group II                  8                        30     ...
   Group I & II combined     ...

9. Find the value of the coefficient of variation in the following cases:
   (i) S.D. = 3.5, N = 10, X = 145
   (ii) Variance = 148.6, A.M. = 40
10. Given the following information: Coefficient of Skewness = 0.8, A.M. = 40, Mode = 36. Find the standard deviation.
11. The first four moments of a distribution about X = 2 are 1, 2.5, 5.5 and 16. Calculate the four moments of X about zero.
12. The following table gives the distribution of population in towns A and B in different age groups. Compare the variation and Skewness of their frequencies.

    Age-group     Population in thousands
                  A       B
    0-10          18      10
    10-20         16      12
    20-30         15      ...
    30-40         12      32
    40-50         10      29
    50-60          5      ...
    60-70          2       3
    above 70       1       1

13. (a) The following are some of the particulars of the distribution of weights of boys and girls in a class:

                      Boys      Girls
    Number            100       50
    Mean weight       60 kg     45 kg
    Variance          9         4

    (i) Find the standard deviation of the combined data.
    (ii) Which of the two distributions is more variable?
14. Compare the variability of the two series given below:

    Series A:  192  236  184  260  243  290  245
    Series B:   83   87  105  120  110   92   95

15. In two series, where $d_1$ and $d_2$ represent the deviations from a trial average 40, the following results were obtained:

    $n_1 = 50$, $\sum d_1 = 65$, $\sum d_1^2 = 1000$
    $n_2 = 75$, $\sum d_2 = 80$, $\sum d_2^2 = 2250$

    Calculate the coefficient of variation for the two series.
16. What are the measures of dispersion of a distribution? Why is the standard deviation most commonly used as a measure of dispersion?
17. Goals scored by two teams A and B in a football season were as follows:

    Number of goals        Number of matches
    scored in a match      A       B
    0                      ...     ...
    1                      9       9
    2                      8       6
    3                      5       5
    4                      4       3

    By calculating the coefficient of variation in each case, find which team may be considered more consistent.
18. A distribution consists of three components with frequencies of 200, 250 and 300 having A.M. of 25, 10 and 15 and S.D. of 3, 4 and 5 respectively. Find the A.M. and the S.D. of the combined distribution.
19. The coefficients of variation of two series are 40% and 60%. Their standard deviations are 16 and 18. What are their arithmetic means?
20. Find the A.M., mode, S.D. and C.V. for the following:

    Years under:      10   20   30   40   50   60
    No. of persons:   15   32   51   78   79   18

21. The first four central moments of a distribution are 0, 2.5, 0.7 and 18.75. Test the Skewness and Kurtosis of the distribution.
22. From the following table, compute the Quartile deviation, the coefficient of Skewness and the coefficient of Kurtosis (based on moments).

    Size:        4-8   8-12   12-16   16-20   20-24   24-28   28-32   32-36   36-40
    Frequency:    6     10     18      30      15      12      10       6       2

23. The first four moments of a distribution are 1, 4, 10 and 46 respectively. Compute the first four central moments and the Beta constants. Comment upon the nature of the distribution.
24. Karl Pearson's coefficient of Skewness of a distribution is 0.32. Its standard deviation is 6.5 and the A.M. is 29.6. Find the mode.
25. (a) If the sum of squares of deviations from the A.M. of 10 observations is 1690, find their standard deviation.
    (b) Given N = 10, $\bar{X} = 12$, $\sum X^2 = 1530$, find the coefficient of variation.
26.
The following results are obtained from a frequency distribution with 100 observations: $\bar{X} = 9$, Variance = 19, $\beta_1 = 0.7$. It was later found that an observation 12 was misread as 21. Calculate the correct values of the first three central moments.
27. For a distribution the mean is 10, the variance is 16, $\gamma_1 = +1$ and $\beta_2 = 4$. Obtain the first four moments about the origin, i.e. zero. Also comment upon the nature of the distribution.
28. Comment on the nature of the following distributions in respect of dispersion and Skewness:

    Distribution I:    ...
    Distribution II:   ...
    Distribution III:  ...
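
The arithmetic of the worked illustrations in this chapter can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names `central_moments_grouped`, `moment_coefficients` and `other_group_stats` are hypothetical, and each class is assumed to be concentrated at its mid value. It recomputes the moment-based coefficients of skewness and kurtosis for the plant-height data and solves the combined mean and combined variance formulas for the second group of eggs.

```python
import math


def central_moments_grouped(midpoints, freqs):
    """Return (mean, m2, m3, m4) for grouped data.

    Assumption: every observation in a class sits at the class mid value.
    """
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    m2, m3, m4 = (
        sum(f * (x - mean) ** r for x, f in zip(midpoints, freqs)) / n
        for r in (2, 3, 4)
    )
    return mean, m2, m3, m4


def moment_coefficients(m2, m3, m4):
    """Return (gamma_1, gamma_2) = (m3 / m2^(3/2), m4 / m2^2 - 3)."""
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


def other_group_stats(n, mean, sd, n1, mean1, sd1):
    """Mean and S.D. of the second group, given the combined figures
    and the figures of the first group (two-group case only)."""
    n2 = n - n1
    # Combined mean: n * mean = n1 * mean1 + n2 * mean2
    mean2 = (n * mean - n1 * mean1) / n2
    d1, d2 = mean1 - mean, mean2 - mean
    # Combined variance: n * sd^2 = n1 (sd1^2 + d1^2) + n2 (sd2^2 + d2^2)
    var2 = (n * sd ** 2 - n1 * (sd1 ** 2 + d1 ** 2) - n2 * d2 ** 2) / n2
    return mean2, math.sqrt(var2)


# Plant-height data: class mid values and frequencies.
midpoints = [32.5, 37.5, 42.5, 47.5, 52.5, 57.5, 62.5, 67.5]
freqs = [5, 14, 16, 25, 14, 12, 8, 6]

mean, m2, m3, m4 = central_moments_grouped(midpoints, freqs)
gamma1, gamma2 = moment_coefficients(m2, m3, m4)
print(round(m2, 2), round(gamma1, 2), round(gamma2, 2))

# Egg data: 100 eggs overall, first 50 eggs known.
mean2, sd2 = other_group_stats(100, 56, 6.8, 50, 55, 4.8)
print(round(mean2, 2), round(sd2, 2))
```

Working directly with deviations from the mean, as above, avoids the step-deviation shortcut used in hand calculation; the two routes agree up to rounding, so the sketch serves only as a cross-check on the tabular method.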
