Chapter 5
MEASURES OF CENTRAL TENDENCY

INTRODUCTION

The most important objective of statistical analysis is to find a single value for the entire mass of data which can be called a representative of the whole set of data. It is expected that such a typical value should lie centrally within the given set of data if it is arranged according to magnitude. Such a typical value which represents the whole set of data is called an average. Thus the main aim of finding an average is to reduce the complexity of the given data and to make it comparable. A figure (numerical value) used to represent a whole set of data should be a value where most of the items of the series tend to cluster.

Properties of a good average. As an average is a single value representing a given group, it should satisfy some basic properties. These properties are given below:
(i) It should be easy to calculate and understand.
(ii) It should be rigidly defined. The bias of the investigator should not affect the value of the average.
(iii) It should be based on all the observations of the series.
(iv) It should be capable of further algebraic treatment.
(v) It should not be affected by fluctuations of sampling.
(vi) It should not be affected much by extreme values.
(vii) Items whose average is to be determined should form a homogeneous group.

MEASURES OF CENTRAL TENDENCY

The following are the five commonly used measures of central tendency:
1. Arithmetic Mean
2. Median
3. Mode
4. Geometric Mean
5. Harmonic Mean

1. Arithmetic Mean. The most popular measure for representing the data is the arithmetic mean (A.M.). The A.M. of a set of observations is a single numerical value obtained by dividing the total of all the observations by their number. Arithmetic mean can be either (i) simple arithmetic mean or (ii) weighted arithmetic mean.

Calculation of simple arithmetic mean
Method I. If we represent the individual
values of the variable by $X_1, X_2, \ldots, X_n$ ($n$ = number of observations), we shall denote by $\bar{X}$ (read as X bar) the A.M. of the variable X. Also $\sum X$ (read as summation X) denotes the sum of the X values, i.e. $\sum X = X_1 + X_2 + \cdots + X_n$. Then according to the definition

$\bar{X} = \frac{\sum X}{n}$

Statistical Methods for Research Workers

Illustration. The following are the weights of 10 students of a class:

Student         1   2   3   4   5   6   7   8   9   10
Weight in kg   45  48  50  60  55  46  50  58  62  46

Solution. Calculation of Arithmetic Mean:

$\bar{X} = \frac{45 + 48 + 50 + 60 + 55 + 46 + 50 + 58 + 62 + 46}{10} = \frac{520}{10} = 52$ kg

Method II. By taking deviations of individual observations from an arbitrary value, the arithmetic mean can be calculated by using the formula

$\bar{X} = A + \frac{\sum d}{n}$

where $A$ denotes the arbitrary value, $\sum d$ the sum of deviations and $n$ the number of observations. For the above problem, if we choose $A = 50$ then the deviations are $d$: -5, -2, 0, +10, +5, -4, 0, +8, +12, -4, so $\sum d = 20$ and

$\bar{X} = 50 + \frac{20}{10} = 50 + 2 = 52$ kg.

Calculation of Arithmetic Mean for Discrete Series. If we have data in discrete form with frequencies, then the arithmetic mean is given by

$\bar{X} = \frac{\sum fX}{n}$

where $X$ represents the value of the variable, $f$ its corresponding frequency, and $n = \sum f$ represents the total number of observations. To simplify the calculations we can use the deviations of the X values from an arbitrary mean $A$ (change of origin) and use the formula

$\bar{X} = A + \frac{\sum fd}{n}$

where $d$ denotes the deviation of a particular value of X from the assumed mean. Calculations can be further reduced if we divide all the values of deviations by a suitable scale $C$ (change of scale). In this case we can use the formula

$\bar{X} = A + \frac{\sum fd'}{n} \times C$, where $d' = \frac{X - A}{C}$.

Illustration. Find the A.M. of the following data.

X               5   10  15  20  25  30  35  40  45  50
Frequency (f)   20  43  75  67  72  45  39  9   8   6

Solution.
Method I

Variate X   Frequency f   fX
5           20            100
10          43            430
15          75            1125
20          67            1340
25          72            1800
30          45            1350
35          39            1365
40          9             360
45          8             360
50          6             300
            n = 384       Sum fX = 8530

Arithmetic Mean $\bar{X} = \frac{\sum fX}{n} = \frac{8530}{384} = 22.21$

Method II (taking $A = 25$, $C = 5$)

Variate X   Frequency f   d' = (X - 25)/5   fd'
5           20            -4                -80
10          43            -3                -129
15          75            -2                -150
20          67            -1                -67
25          72            0                 0
30          45            1                 45
35          39            2                 78
40          9             3                 27
45          8             4                 32
50          6             5                 30
            n = 384                         Sum fd' = -214

$\bar{X} = A + \frac{\sum fd'}{n} \times C = 25 + \frac{-214}{384} \times 5 = 25 - \frac{1070}{384} = 25 - 2.79 = 22.21$

It is obvious from here that the calculation work is considerably reduced by the use of this method.

Calculation of A.M. for Continuous Series. In continuous series the A.M. can be calculated by applying the same three formulae as used for discrete series. In continuous series data, the mid point of a class interval, denoted by $m$, is taken as the value of the variable (as X in the case of discrete series).

Illustration. Calculate the A.M. for the data given in the first two columns of the following table on lactation length of 100 buffaloes.

Solution. For this problem take $A = 275$, $C = 50$. Entries shown as … are not legible in the source.

Lactation length (days)   Frequency f   m     d' = (m - A)/C   fd'
0-50                      …             25    -5               …
50-100                    …             75    -4               …
100-150                   …             125   -3               …
150-200                   14            175   -2               -28
200-250                   13            225   -1               -13
250-300                   22            275   0                0
300-350                   …             325   1                …
350-400                   8             375   2                16
400-450                   5             425   3                15
450-500                   …             475   4                …
                          n = 100                              Sum fd' = -1

$\text{A.M.} = A + \frac{\sum fd'}{n} \times C = 275 + \frac{-1}{100} \times 50 = 275 - 0.5 = 274.5$ days

We see that the use of this formula reduces the work of calculations considerably.

Correcting Incorrect Values. Sometimes, due to mistakes in copying, some wrong observations are taken while calculating the mean. The value of the arithmetic mean can be corrected by subtracting the sum of the incorrect observations from the incorrect value of $\sum X$ and adding the sum of the correct observations. The result thus obtained is again divided by the number of observations to get the correct value of the arithmetic mean.
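The correction rule just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `corrected_mean` and its argument names are assumptions made here, not part of the text, and the figures are those of the height example that accompanies this rule (20 boys, reported mean 155 cms, 136 misread as 156).

```python
def corrected_mean(n, reported_mean, wrong_values, correct_values):
    """Correct an arithmetic mean that was computed from miscopied observations.

    n              -- number of observations
    reported_mean  -- the mean obtained with the wrong values included
    wrong_values   -- observations as they were wrongly recorded
    correct_values -- what those observations should have been
    """
    incorrect_total = n * reported_mean  # the sum of X that was actually used
    corrected_total = incorrect_total - sum(wrong_values) + sum(correct_values)
    return corrected_total / n


# 20 boys, reported mean 155 cms, one height of 136 cms misread as 156 cms
print(corrected_mean(20, 155, [156], [136]))  # 154.0
```

The lists allow several miscopied observations to be corrected at once, which matches the wording of the rule (sum of incorrect observations, sum of correct observations).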
Illustration. The average height of 20 boys was found to be 155 cms. Later on it was found that a height of 136 cms was misread as 156. Find the correct average height of the boys.

Solution. Given $n = 20$, $\bar{X} = 155$. Since $\bar{X} = \frac{\sum X}{n}$, we have $155 = \frac{\sum X}{20}$, so $\sum X = 3100$.

Correct $\sum X$ = 3100 - wrong item + correct item = 3100 - 156 + 136 = 3080

Correct A.M. $= \frac{3080}{20} = 154$ cms.

Calculation of Arithmetic Mean in case of data with open-end classes

Case I. When the data is well behaved (i.e., it consists of equal class intervals except the first and last classes). In this case we make an assumption regarding the first and the last class intervals. Suppose our data is of the following type.

Income            Frequency
Less than 1000    2
1000-3000         3
3000-5000         7
5000-7000         10
7000-9000         13
9000-11000        9
Above 11000       2

In this case we can take the first class interval as 0-1000, since income cannot be taken as negative, and the last interval as 11000-13000, after seeing the pattern of the other classes. It is better not to use the A.M. as a measure of central tendency for data in which it is difficult to estimate the limits of the open-end classes.

Case II. When the data is not well behaved (it consists of unequal class intervals). In such situations the arithmetic mean can be calculated by using the formula

$\bar{X} = \frac{\sum fm}{\sum f}$

where $m$ represents the mid value of an interval and $f$ denotes the corresponding frequency.

Mathematical Properties of Arithmetic Mean. Following are a few mathematical properties of the arithmetic mean:

Property 1. The algebraic sum of the deviations of a set of values from their arithmetic mean is zero.

Proof. Suppose we have n observations denoted by $X_1, X_2, \ldots, X_n$ with frequencies $f_1, f_2, \ldots, f_n$ respectively. Then the total number of observations is $\sum f = N$. The arithmetic mean (A.M.) of these observations, denoted by $\bar{X}$, is given by

$\bar{X} = \frac{\sum fX}{N}$, or $\sum fX = N\bar{X}$.

Hence $\sum f(X - \bar{X}) = \sum fX - \bar{X}\sum f = N\bar{X} - N\bar{X} = 0$.

Property 2. The sum of the squares of the deviations of a set of values is minimum when the deviations are taken from their A.M.

Proof. Let $Z = \sum f(X - A)^2$ denote the sum of the squares of deviations of the given values from any arbitrary point $A$. To prove that $Z$ is minimum when $A = \bar{X}$ (the A.M. of the X values) we shall apply the principle of maxima and minima. Thus $Z$ will be minimum if $\frac{dZ}{dA} = 0$ and $\frac{d^2Z}{dA^2} > 0$.

Now $\frac{dZ}{dA} = -2\sum f(X - A) = 0$, or $\sum fX - A\sum f = 0$, or $A = \frac{\sum fX}{\sum f} = \bar{X}$.

Again $\frac{d^2Z}{dA^2} = -2\sum f(-1) = 2\sum f = 2N > 0$.

Hence $Z$ is minimum when $A = \bar{X}$.

Property 3. If $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_p$ denote the arithmetic means of p series of sizes $n_1, n_2, \ldots, n_p$ respectively, the mean of all the series taken together can be obtained by the formula

$\bar{X} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2 + \cdots + n_p\bar{X}_p}{n_1 + n_2 + \cdots + n_p}$

Proof of this property is obvious.

Illustration. The mean weight of 25 male workers in a factory is 64 kgs and the mean weight of 35 female workers in the same factory is 58 kgs. Find the combined mean weight of the 60 workers in the factory.

Solution. We can use the formula for the combined A.M. (Property 3):

$\bar{X} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$

Here $n_1 = 25$, $\bar{X}_1 = 64$, $n_2 = 35$, $\bar{X}_2 = 58$.

$\bar{X} = \frac{(25 \times 64) + (35 \times 58)}{25 + 35} = \frac{1600 + 2030}{60} = \frac{3630}{60} = 60.5$ kgs.

Weighted Arithmetic Mean. If some items are more important than others, then keeping this fact in mind we give proper weightage to the various items and then calculate the A.M. The A.M. thus calculated is called the weighted arithmetic mean.

If we give weights $w_1, w_2, \ldots, w_n$ to n observations $X_1, X_2, \ldots, X_n$ then we define

Weighted A.M. $= \frac{w_1X_1 + w_2X_2 + \cdots + w_nX_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum wX}{\sum w}$

It may be noted that the weighted A.M. gives the same result as the simple A.M. if equal weights are assigned to all the items.

Illustration. The final marks of a student who took four courses in a trimester are 57, 75, 83 and 62. Find his average marks if the respective credits received for these courses are 5, 3, 4 and 2.

Solution. In this question we have to find the weighted A.M. as the courses have been assigned different credits.
So

Weighted A.M. $= \frac{(57 \times 5) + (75 \times 3) + (83 \times 4) + (62 \times 2)}{5 + 3 + 4 + 2} = \frac{966}{14} = 69$ marks.

Merits and demerits of Arithmetic Mean

Merits:
1. It is the simplest average, easy to understand and calculate.
2. It is rigidly defined and is based on all the observations.
3. It is capable of further algebraic treatment.
4. It is a calculated value, and does not depend on the position in the series.
5. Of all the averages, the arithmetic mean is affected least by fluctuations of sampling. Due to this reason the A.M. is also called a stable average.

Demerits:
1. The value of the arithmetic mean is affected by extreme values. In the presence of extreme items, the arithmetic mean gives a distorted picture of the given observations.
2. The arithmetic mean cannot be calculated even if a single observation is missing.
3. The arithmetic mean should not be calculated if the extreme classes are open, i.e. of the type below 10 or above 70.
4. The arithmetic mean may lead to wrong conclusions if the details of the data from which it is obtained are not given, e.g. two students A and B get the following marks in three examinations.

Examination   I            II           III          Overall
A             30 percent   50 percent   70 percent   Av. 50 percent
B             70 percent   50 percent   30 percent   Av. 50 percent

Although both get an overall average of 50 per cent marks, the position of A has improved and that of B has deteriorated.

2. Median. The median refers to the middle value in the given set of observations. When we arrange the given set of observations in ascending or descending order, the value of the observation which divides the given set of observations into two equal parts is called the median. The median is a numerical value which is greater than the value of half of the observations and is also smaller than the value taken by the other half of the observations. The median is also called a positional average as it occupies a definite place in the given set of observations.
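A minimal Python sketch of the median as a positional value may help fix the idea: the observations are first ordered, and the value in the middle position is taken. The function name `median_ungrouped` is an assumption made here for illustration, and for an even number of observations the sketch averages the two middle values, which is the usual convention for ungrouped data.

```python
def median_ungrouped(values):
    """Median of individual observations: the middle value after ordering."""
    if not values:
        raise ValueError("at least one observation is required")
    ordered = sorted(values)      # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: a single observation occupies the middle position
        return ordered[mid]
    # even n: average the two observations lying in the middle
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_ungrouped([2, 5, 7, 8, 4]))     # 5
print(median_ungrouped([2, 4, 8, 5, 3, 6]))  # 4.5
```

Sorting first is what makes the median "positional": the result depends only on which value lands in the middle, not on how large the extreme values are.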
For ungrouped data, if the number of observations is odd, then the observation lying in the middle, after the given observations have been arranged in ascending or descending order, is unique, and it is called the median. Thus when $n$ is odd, the value of the $\left(\frac{n+1}{2}\right)$th observation is called the median. On the other hand, if the number of observations is even, then the middle position is not occupied by a single observation. In this case we take the two observations, the $\left(\frac{n}{2}\right)$th and the $\left(\frac{n}{2} + 1\right)$th, lying in the middle and find their arithmetic mean, which is called the median for the set with an even number of observations.

Illustration. Find the median of (i) 2, 5, 7, 8, 4 and (ii) 2, 4, 8, 5, 3, 6.

Solution. (i) n = 5 = odd. The arranged observations are: 2, 4, 5, 7, 8.
Locate the $\left(\frac{5+1}{2}\right)$th, i.e. the 3rd observation. Thus median = 5.

(ii) n = 6 = even. The arranged observations are: 2, 3, 4, 5, 6, 8.
Locate the $\left(\frac{6}{2}\right)$th and $\left(\frac{6}{2} + 1\right)$th, i.e. the 3rd and 4th observations.
3rd observation = 4, 4th observation = 5.
A.M. $= \frac{4 + 5}{2} = 4.5$
Thus in this case median = 4.5.

Calculation of Median for Discrete Series. For calculating the median in discrete series data we have to find the cumulative frequency for each value of the variable in order to locate the middle observation. The cumulative frequency corresponding to a variable value is the sum of the frequencies of all values up to and including the given value. Steps for calculating the median are:
(i) Write a column of cumulative frequency in addition to the given columns of variable value and frequency $f$.
(ii) Calculate the value of $\frac{N+1}{2}$, where N = total number of observations. Locate this value in the cumulative frequency column. The value of the variable corresponding to this gives the median.

Illustration. Find the median size of shoes from the data given below.

Size        4   5   6   7   8   9   10  11
Frequency   2   6   10  20  15  8   7   3

Solution. We have

Size (x)            4   5   6   7   8   9   10  11
Frequency (f)       2   6   10  20  15  8   7   3
Cum. freq. (c.f.)   2   8   18  38  53  61  68  71
Here N = 71, therefore $\frac{N+1}{2} = \frac{72}{2} = 36$. Now the 36th item occurs in the c.f. column after the 18th and before the 38th, so the value of x corresponding to 36 is 7. Hence median = 7.

Calculation of Median for Continuous Series. To find the median of continuous data we use the formula

Median $= L + \frac{\frac{N}{2} - c.f.}{f} \times C$

where L = lower limit of the median class
N = total number of observations
f = frequency of the median class
C = class interval of the median class
c.f. = cumulative frequency of the class preceding the median class.

The median class is the class interval in which the median, or the middle observation, lies. The above formula is based on interpolation, and while applying it we assume that the frequencies of the class in which the median lies are uniformly spread over the whole class interval.

Illustration 1. Calculate the median age of teachers in a school from the following data.

Age               20-25  25-30  30-35  35-40  40-45  45-50  50-55  55-60
No. of teachers   4      20     28     27     12     5      3      1

Solution.

Age     f    c.f.
20-25   4    4
25-30   20   24
30-35   28   52
35-40   27   79
40-45   12   91
45-50   5    96
50-55   3    99
55-60   1    100

Here N = 100 and $\frac{N}{2} = 50$; the 50th item will come after the 24th and before the 52nd. Thus the median class is 30-35.
Also here L = 30, N = 100, c.f. = 24, C = 5, f = 28.

Hence Median $= L + \frac{\frac{N}{2} - c.f.}{f} \times C = 30 + \frac{50 - 24}{28} \times 5 = 30 + \frac{26}{28} \times 5 = 30 + 4.64 = 34.64$ yrs.

Illustration 2. Calculate the median from the following data:

Value (less than)   10  20  30  40  50  60   70   80
Frequency           4   16  40  76  96  112  120  125

Solution. In this problem the given frequencies are not the frequencies of class intervals but are in fact the cumulative frequencies. We can change the above table into another with class frequencies and then find the median.

Value   f    c.f.
0-10    4    4
10-20   12   16
20-30   24   40
30-40   36   76
40-50   20   96
50-60   16   112
60-70   8    120
70-80   5    125
        125

Now median = size of the $\frac{N}{2}$th $= \frac{125}{2} = 62.5$th item. Hence the median class is 30-40, and for this problem L = 30, N = 125, C = 10, c.f.
= 40 and f = 36.

Median $= L + \frac{\frac{N}{2} - c.f.}{f} \times C = 30 + \frac{62.5 - 40}{36} \times 10 = 30 + \frac{225}{36} = 30 + 6.25 = 36.25$

Calculation of Median when Class Intervals are Unequal. In case the class intervals are of unequal size we can use the same formula as already given for equal class intervals. The median, being a positional value, is not affected much if the class intervals are of unequal size. Open-end classes also do not affect its value.

Mathematical Property of Median. The sum of the deviations of individual observations, ignoring signs, is the least if the deviations are taken from the median.

Merits of Median.
(i) It is easy to understand and can be easily calculated.
(ii) As it is a positional average, it is not affected by the values of extreme items.
(iii) In the case of data with open-end classes it can be easily determined.
(iv) It is an appropriate average when we have to deal with qualitative data, i.e. data in which items are scored rather than counted or measured.
(v) It can be located merely by inspection in many cases.
(vi) It can be calculated even if the value of some extreme item is not known.
(vii) Its value can be determined graphically.

Demerits of Median.
(i) The median may not be representative of a series in many cases. This happens in the presence of wide variation in the values of different items. If the marks obtained by a student are 10, 10, 12, 12, 14, 30, 35, 40 and 50 then the median marks would be 14. Clearly this average is not representative of the series.
(ii) It is not capable of algebraic treatment, e.g., the median of a set of items cannot be used to find the sum of all the items.
(iii) In continuous series data the median is found by interpolation. The assumption of the interpolation, that all the frequencies of the class interval are uniformly spread over the class interval, may not actually be true.
(iv) It is necessary to arrange the data to calculate the median. Other averages do not require arrangement.
(v) If big or small items in a series are to be given greater importance, then the median is not a suitable average as it ignores the extreme values.
(vi) The median is affected by fluctuations of sampling.

Other Positional Measures. Besides the median, there are other measures which divide a given set of observations, when arranged in ascending order, into equal parts. Important among these are quartiles, deciles and percentiles. Quartiles are those values which divide the given observations into four equal parts, deciles divide them into ten equal parts and percentiles into 100 equal parts. As one point (the median) divides the total frequency into two equal parts, three points will divide it into four parts, and so on. So there are 3 quartiles, 9 deciles and 99 percentiles. The quartiles are denoted by the symbol Q, deciles by D and percentiles by P.

Calculation of Quartiles, Deciles and Percentiles. To calculate quartiles, deciles and percentiles the procedure is similar to that used for the median. For discrete observations we add 1 to N, whereas in continuous series we do not add 1. Thus

$Q_1$ = 1st Quartile = size of the $\frac{N+1}{4}$th item (for discrete series)
$Q_1$ = size of the $\frac{N}{4}$th item (for continuous series)

Similarly

$Q_3$ = size of the $\frac{3(N+1)}{4}$th item (for discrete series)
$Q_3$ = size of the $\frac{3N}{4}$th item (for continuous series)
$D_6$ = size of the $\frac{6(N+1)}{10}$th item (for discrete series)
$D_6$ = size of the $\frac{6N}{10}$th item (for continuous series)
$P_{35}$ = size of the $\frac{35(N+1)}{100}$th item (for discrete series)
$P_{35}$ = size of the $\frac{35N}{100}$th item (for continuous series)

Illustration 1. Following are the heights (in cms) of 30 wheat plants recorded by an experimenter 50 days after sowing. Compute the 3rd Quartile, 6th Decile and 40th Percentile.

14.5, 14, 18, 15.5, 15, 16, 14, 18.5, 17.5, 16, 15, 15.5, 17, 18, 16, 15, 18.5, 18, 18, 18.5, 14.5, 16.5, 18.5, 17.5, 18.5, 19, 16.5, 17, 19, 18.5

Solution. Let us first arrange the given observations in ascending order.

S.No.  Observation   S.No.  Observation   S.No.  Observation
1      14.0          11     16.0          21     18.0
2      14.0          12     16.0          22     18.0
3      14.5          13     16.5          23     18.5
4      14.5          14     16.5          24     18.5
5      15.0          15     17.0          25     18.5
6      15.0          16     17.0          26     18.5
7      15.0          17     17.5          27     18.5
8      15.5          18     17.5          28     18.5
9      15.5          19     18.0          29     19.0
10     16.0          20     18.0          30     19.0

3rd Quartile = size of the $\frac{3(30+1)}{4}$th item = 23.25th item
= size of 23rd item + 0.25 (size of 24th item - size of 23rd item)
= 18.5 + 0 = 18.5 cms.

6th Decile = size of the $\frac{6(30+1)}{10}$th item, i.e. the 18.6th item
= size of 18th item + 0.6 (size of 19th item - size of 18th item)
= 17.5 + 0.6 (0.5) = 17.8 cms.

40th Percentile = size of the $\frac{40(30+1)}{100}$th item, i.e. the 12.4th item
= size of 12th item + 0.4 (size of 13th item - size of 12th item)
= 16.0 + 0.4 (0.5) = 16.2 cms.

Illustration 2. From the data given below calculate the 3rd quartile, 4th decile and 85th percentile. (The first wage value is not legible in the source and is shown as ….)

Daily wage (Rs.)   …   10  12  20  34  40  45  Sum
No. of persons     35  20  15  12  8   6   3   99

Solution. Let us make one column of cumulative frequency so that the position of the items can be easily located.

Daily wage   No. of persons   c.f.
…            35               35
10           20               55
12           15               70
20           12               82
34           8                90
40           6                96
45           3                99

3rd Quartile = size of the $\frac{3}{4}(99+1)$th item = 75th item = Rs. 20. This item comes after the 70th item and before the 82nd item.
4th Decile = size of the $\frac{4}{10}(99+1)$th item = 40th item = Rs. 10.
85th Percentile = size of the $\frac{85}{100}(99+1)$th observation, i.e. the 85th item = Rs. 34.

Determination of Quartiles, Deciles and Percentiles in a Continuous Series Data. The calculations of quartiles, deciles and percentiles follow the same lines as the calculation of the median in continuous series data. The following formulae shall be used for computing quartiles, deciles and percentiles.

$Q_i$ = ith Quartile $= L + \frac{\frac{iN}{4} - c.f.}{f} \times C$; i = 1, 2, 3

where L = lower limit of the class in which the ith quartile, i.e. the $\frac{iN}{4}$th item, lies
N = total number of observations
C = class interval
c.f. = cumulative frequency of the class preceding the class in which the ith quartile lies
f = frequency of the class in which the ith quartile lies.
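The interpolation formula above differs between the median, quartiles, deciles and percentiles only in the position that is looked up, so a single sketch can cover all of them. The Python below is a hedged illustration, not a definitive implementation: the name `grouped_quantile` and its parameters are assumptions introduced here, it assumes equal-width classes given by their lower limits, and `fraction` stands for the share of N being located (0.5 for the median, 0.25 for the 1st quartile, and so on). The data are those of the median illustration with class frequencies 4, 12, 24, 36, 20, 16, 8, 5.

```python
def grouped_quantile(lower_limits, width, freqs, fraction):
    """Interpolated positional value for a continuous (grouped) series.

    Implements L + ((fraction * N - c.f.) / f) * C for equal-width classes.
    """
    n = sum(freqs)
    target = fraction * n          # position of the required item, e.g. N/2
    cum_before = 0                 # c.f. of the class preceding the current one
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cum_before + f >= target:
            # the required item lies in this class; interpolate within it
            return lower + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("fraction must lie between 0 and 1")


lowers = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [4, 12, 24, 36, 20, 16, 8, 5]
print(grouped_quantile(lowers, 10, freqs, 0.5))   # 36.25 (median)
print(grouped_quantile(lowers, 10, freqs, 0.75))  # 48.875 (3rd quartile)
```

Passing the fraction rather than a quartile index keeps the function general; the uniform-spread assumption inside the located class is the same one the formula itself relies on.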
For the ith Decile $D_i$, change $\frac{iN}{4}$ to $\frac{iN}{10}$; i = 1, 2, ..., 9.
For the ith Percentile $P_i$, change $\frac{iN}{4}$ to $\frac{iN}{100}$; i = 1, 2, ..., 99.

Illustration. Calculate the median, 1st quartile, 8th decile and 45th percentile ages for the data on ages of females in a village.

Age     No.   c.f.
5-10    2     2
10-15   3     5
15-20   10    15
20-25   35    50
25-30   30    80
30-35   28    108
35-40   18    126
40-45   10    136
45-50   8     144
50-55   5     149
55-60   5     154
60-65   4     158
65-70   2     160
70-75   1     161

Solution. (i) Median
Median = size of the $\frac{161}{2}$ = 80.5th item, which lies in the class interval 30-35.
So L = 30, N = 161, c.f. = 80, f = 28, C = 5.

Hence Median $= 30 + \frac{80.5 - 80}{28} \times 5 = 30 + \frac{0.5}{28} \times 5 = 30.09$ yrs.

(ii) 1st Quartile
Size of the $\frac{N}{4}$th item $= \frac{161}{4}$ = 40.25th item, which lies in the class interval 20-25.
Here L = 20, N = 161, c.f. = 15, f = 35, C = 5.

$Q_1 = 20 + \frac{40.25 - 15}{35} \times 5 = 20 + 3.61 = 23.61$ yrs.

(iii) 8th Decile
Size of the $\frac{8N}{10}$th item $= \frac{8 \times 161}{10}$ = 128.8th item. It lies in the class interval 40-45.
Here L = 40, N = 161, c.f. = 126, f = 10, C = 5.

$D_8 = 40 + \frac{128.8 - 126}{10} \times 5 = 40 + \frac{2.8}{10} \times 5 = 40 + 1.4 = 41.4$ yrs.

(iv) 45th Percentile
Size of the $\frac{45N}{100}$th item $= \frac{45 \times 161}{100}$ = 72.45th item. It lies in the class interval 25-30.
Here L = 25, N = 161, c.f. = 50, f = 30, C = 5.

$P_{45} = 25 + \frac{72.45 - 50}{30} \times 5 = 25 + 3.74 = 28.74$ yrs.

Determination of Median, Quartiles etc. graphically. The median can be determined graphically from cumulative frequency curves.

Method 1 (only applicable for the median):
(i) Draw two ogives, of less than type and more than type.
(ii) Locate the point where these two curves intersect each other.
(iii) Draw a perpendicular from this point on the X-axis. The point where this perpendicular cuts the X-axis gives the value of the median.

Method 2: Steps for determining Median, Quartiles etc. Draw only one ogive of less than type, taking the variable value on the X-axis and the cumulative frequency on the Y-axis.
Calculate the values of the relative cumulative frequencies by dividing all the cumulative frequencies by the total number of observations. Locate the relative cumulative frequencies on the other side of the graph paper as shown in the figure that follows. Now, for locating the median, draw a line parallel to the X-axis from the relative c.f. of 0.5. Let it cut the ogive at M. From M draw a perpendicular on the X-axis. The value thus located on the X-axis corresponds to the median of the observations.

This method can also be used to locate quartiles, deciles and percentiles graphically. For locating the 1st quartile we repeat the procedure used in the case of the median, taking a relative cumulative frequency of 0.25. Similarly, to locate the 4th decile we start with a relative c.f. of 0.40, and to locate the 65th percentile with a relative c.f. of 0.65.

Illustration. Determine the median income graphically from the following data on the income distribution of employees of a farm. Also determine the 1st Quartile, 8th Decile and 64th Percentile.

Monthly Income (Rs.)   100-400  400-700  700-1000  1000-1300  1300-1600  1600-1900  1900-2200  2200-2500
Frequency              2        8        18        9          6          4          2          1

Solution. Let us form the cumulative frequencies of this distribution.

Monthly Income   f    Less than type c.f.   Rel. c.f.   More than type c.f.
100-400          2    2                     0.04        50
400-700          8    10                    0.20        48
700-1000         18   28                    0.56        40
1000-1300        9    37                    0.74        22
1300-1600        6    43                    0.86        13
1600-1900        4    47                    0.94        7
1900-2200        2    49                    0.98        3
2200-2500        1    50                    1.00        1

[Fig. 5.1. Cumulative frequency curves showing median income. X-axis: income (Rs.); Y-axis: cumulative frequency. The median is marked at Rs. 940 on the income axis.]

[Fig. 5.2. Cumulative frequency curve showing different positional measures. X-axis: income; Y-axis: cumulative frequency, with relative cumulative frequency on the opposite side.]

Characteristics of quartiles, deciles and percentiles.
The quartiles, deciles and percentiles are not averages like the mean and median but are measures of position. These quantities are helpful in understanding the formation of the series.

3. Mode. The mode is the observation that occurs most frequently in the set of observations. In many situations the arithmetic mean and median fail to reveal the true characteristics of the data. For example, when we talk of the most common wage, most common income, most common size of shoes, most common size of a garment or most common height, the measure of central tendency in these situations is the mode.

Calculation of Mode for Individual Observations. For finding the mode of a given set of observations, we count the number of times the various values repeat themselves.

Illustration. Following are the daily wages in rupees of employees of a small factory. Calculate the modal daily wage.

14, 20, 18, 14, 20, 14, 14, 10, 25, 14, 30, 10, 20, 35, 14, 22.

Solution. We can see that the value 14 occurs the maximum number of times, i.e. 6 times, in the above data. Hence modal wage = Rs. 14.

Calculation of Mode for Discrete Series with frequencies. In discrete series, the mode can be determined just by inspection. The value of the variable around which most of the items concentrate is the mode. If the frequency of a particular value of the variable is more than the frequencies of the other values of the variable, then the value of the variable with maximum frequency is the mode.

Calculation of Mode for Continuous data. The value of the mode is calculated by the formula

Mode $= L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C$

where L = lower limit of the modal class
$f_1$ = frequency of the modal class
$f_0$ = frequency of the class preceding the modal class
$f_2$ = frequency of the class succeeding the modal class
C = class interval of the modal class.

Steps for calculating the mode of continuous data:
(i) Locate the modal class.
(ii) Find the value of the mode by the formula given above.

Illustration.
Following is the data on annual income of 75 employees of a factory. Find the modal value of annual income.

Annual Income   Frequency
3000-5000       4
5000-7000       8
7000-9000       13
9000-11000      26
11000-13000     17
13000-15000     4
15000-17000     2
17000-19000     1

Solution. The first step is to locate the modal class. For this problem the class interval (9000-11000) is the modal class as it has the maximum frequency. For the above problem L = 9000, C = 2000, $f_1 = 26$, $f_0 = 13$, $f_2 = 17$.

Using the formula Mode $= L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C$ we get

Mode $= 9000 + \frac{26 - 13}{2 \times 26 - 13 - 17} \times 2000 = 9000 + \frac{13}{22} \times 2000 = 9000 + \frac{26000}{22} = 9000 + 1181.81 = $ Rs. 10181.81

Calculation of Mode in data with unequal class intervals/open-end classes. The formula for calculating the mode remains the same in this case. We must try to make the class intervals of equal size by distributing the frequencies equally among classes. The procedure shall become clear from the following illustration.

Illustration. Calculate the modal monthly income from the following data. (Frequencies shown as … are not legible in the source.)

Monthly Income   Frequency
Less than 200    …
200-300          …
300-400          …
400-600          24
600-700          20
700-800          10
800-900          7
900-1000         3
Above 1000       2

In the above data we see that the class interval (400-600) is of size 200 whereas the other class intervals are of size 100. So we split this interval into two classes of size 100 with equal frequencies, i.e. 12 each. So we can write the data as:

Monthly Income   Frequency
Less than 200    …
200-300          …
300-400          …
400-500          12
500-600          12
600-700          20
700-800          10
800-900          7
900-1000         3
Above 1000       2

Now the modal class is 600-700. Here L = 600, $f_1 = 20$, $f_0 = 12$, $f_2 = 10$, C = 100.

Mode $= L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C = 600 + \frac{20 - 12}{40 - 12 - 10} \times 100 = 600 + \frac{8}{18} \times 100 = 600 + 44.44 = $ Rs. 644.44

Note that if we do not split the class interval (400-600), the mode would have been in the interval 400-600, which is wrong.

Determination of Mode Graphically. The value of the mode can also be determined graphically. This method can be used if there is a single class containing the highest frequency. The following steps can be followed for the calculation of the mode graphically.
(i) Draw a histogram of the given data.
(ii) Draw two lines diagonally in the modal class bar, starting from each upper corner of the bar to the upper corner of the adjacent bar, as shown in Figure 5.3 of the following example.
(iii) Draw a perpendicular line from the intersection of the two diagonal lines on the X-axis. The point where this perpendicular cuts the X-axis gives the modal value.

Illustration. Find the modal marks graphically for the following data:

Marks             0-20  20-40  40-60  60-80  80-100
No. of students   2     5      10     7      1

Solution. First we shall draw a histogram and then calculate the mode by the above procedure.

[Fig. 5.3. Histogram showing modal marks. X-axis: marks; the modal value is marked at 52.]

Merits and Demerits of Mode. Following are the merits of the mode:
(i) It is easy to understand and calculate.
(ii) In some cases its value can be found by mere inspection of the given observations.
(iii) As the mode is the observation that occurs most frequently, it is not an isolated value like the mean or median of the data.
(iv) The mode is not affected by the values of the extreme items.
(v) Its value can be determined in data with open-end classes without ascertaining the class limits.
(vi) It can be used to describe a qualitative phenomenon, e.g., to compare the preference of a consumer for different types of products, say polish etc.
(vii) The value of the mode can also be determined graphically.

Demerits of Mode. Following are the demerits of the mode:
(i) The value of the mode cannot always be determined. In some cases we may have 2, 3 or more modal values.
(ii) The mode is not based on all the observations of a series.
(iii) The mode is not capable of further mathematical treatment.
(iv) The mode may be unrepresentative in many cases.
(v) The mode is not a suitable measure while dealing with quantitative data as it has more disadvantages than good features.

Empirical Relation between Mean, Mode and Median. In a symmetrical distribution the mean, mode and median are identical.
The concept of a symmetrical distribution shall be discussed in the next chapter. In distributions which differ moderately from symmetry, there exists an empirical relationship connecting the mean, mode and median:

Mode = 3 Median - 2 Mean

[Fig. 5.4. A moderately positively skewed curve showing A.M., median and mode. X-axis: variable; Y-axis: frequency. Mo = mode (under the peak of the curve), Me = median (middle value), $\bar{X}$ = mean (centre of gravity).]

Also, in some cases where the position of the mode cannot be determined by known methods, we can determine it by the above formula. This formula also comes to our rescue when the given distribution is bimodal.

4. Geometric Mean. The geometric mean of n items is defined as the nth root of the product of the n items. Symbolically,

G.M. $= \sqrt[n]{X_1 \cdot X_2 \cdots X_n}$   ...(1)

where $X_1, X_2, \ldots, X_n$ are the individual items. The calculation of the G.M. by the above formula is possible only if the number of observations is small. We can convert the above formula by using logarithms and use the formula given below to calculate the value of the G.M. Taking logs of both sides of (1) we get

$\log \text{G.M.} = \frac{\log X_1 + \log X_2 + \cdots + \log X_n}{n}$

or G.M. $= \text{Antilog}\left(\frac{\sum \log X}{n}\right)$

Thus we can define the geometric mean as the antilog of the arithmetic average of the logs of the individual observations.

Illustration. Calculate the geometric mean of 4, 8, 10, 15 and 20.

Solution. Here n = 5.

G.M. $= \sqrt[5]{4 \times 8 \times 10 \times 15 \times 20} = (4 \times 8 \times 10 \times 15 \times 20)^{1/5}$

G.M. $= \text{Antilog}\left(\frac{\log 4 + \log 8 + \log 10 + \log 15 + \log 20}{5}\right)$
$= \text{Antilog}\left(\frac{0.6021 + 0.9031 + 1.0000 + 1.1761 + 1.3010}{5}\right)$
$= \text{Antilog}\left(\frac{4.9823}{5}\right) = \text{Antilog}(0.9965) = 9.919$

Calculation of Geometric Mean for Discrete Series. The formula used is

G.M. $= \text{Antilog}\left(\frac{\sum f \log X}{\sum f}\right)$

Calculation of Geometric Mean for Continuous Series. We shall use the formula

G.M. $= \text{Antilog}\left(\frac{\sum f \log m}{\sum f}\right)$

Here m denotes the middle value of each class interval.

Illustration. Find the G.M. for the following data.

Daily wages   0-6  6-12  12-18  18-24  24-30  30-36  36-42
Frequency     1    3     8      15     17     4      2

Solution.
Class interval    m     f    log m    f log m
0-6               3     1    0.4771    0.4771
6-12              9     3    0.9542    2.8626
12-18            15     8    1.1761    9.4088
18-24            21    15    1.3222   19.8330
24-30            27    17    1.4314   24.3338
30-36            33     4    1.5185    6.0740
36-42            39     2    1.5911    3.1822
Total                  50             66.1715

$\sum f \log m = 66.1715$

$\text{G.M.} = \text{Antilog}\left(\dfrac{66.1715}{50}\right) = \text{Antilog}(1.32343) = 21.06$

Uses of Geometric Mean. The geometric mean should be preferred over the other measures of central tendency while calculating the average percent increase in sales, population, etc.

When the values are given different weights, the weighted geometric mean is

$\text{G.M.} = \text{Antilog}\left(\dfrac{\sum W \log X}{\sum W}\right)$

Here $W_1, W_2, \ldots, W_n$ are the weights given to the values $X_1, X_2, \ldots, X_n$ of the variable.

Mathematical Properties of G.M. Following are a few properties of G.M.:

(i) The product of the values of the series remains the same when each value is replaced by the geometric mean.
(ii) The sums of the deviations of the logarithms of the original observations above and below the logarithm of the geometric mean are equal.
(iii) The G.M. of individual values is equal to the reciprocal of the G.M. of the reciprocals of the individual observations. Because of this property G.M. is preferred for calculating the average percent increase in any variable.

Merits and Demerits of G.M. Important merits are:

(i) The geometric mean is rigidly defined and its value is definite.
(ii) It is based on all the observations. It cannot be calculated if a single observation is missing.
(iii) It is capable of further algebraic treatment.
(iv) It gives more weight to smaller items.
(v) It gives better results when we have to calculate the percent increase in any variable, and also in the construction of Index Numbers.

Demerits of G.M. are the following:

(i) It is neither easy to calculate nor simple to understand.
(ii) If any of the observations has zero value then the geometric mean will also be zero. Also, if an observation is negative, the geometric mean cannot be calculated.
(iii) Like the arithmetic mean, it can take a value which may not even exist in the series.
(iv) The property of giving more weight to smaller items may in some cases prove to be a drawback of the geometric mean.

5. Harmonic Mean. The harmonic mean of a set of observations is defined as the inverse of the arithmetic mean of the inverses of the values of its various items. If we have n observations designated as $X_1, X_2, \ldots, X_n$ and denote their harmonic mean by H, then

$H = \text{Inverse of } \dfrac{\frac{1}{X_1} + \frac{1}{X_2} + \cdots + \frac{1}{X_n}}{n} = \dfrac{n}{\sum \frac{1}{X}}$

Illustration. (The statement and opening steps of this illustration are illegible in the source; only the end of the working survives.)

$H = \dfrac{4}{0.95} = \dfrac{400}{95} = \dfrac{80}{19} = 4.21$ (approx.)

Calculation of Harmonic Mean for Discrete Series. The formula used is

$\text{H.M.} = \dfrac{\sum f}{\sum \frac{f}{X}}$

Illustration. Calculate the harmonic mean for the following data:

Variable    2   4    5   8   10   12
Frequency   4   8   15   8    3    2

Solution.

Variable X   Frequency f   1/X      f/X
 2            4            0.5000   2.0000
 4            8            0.2500   2.0000
 5           15            0.2000   3.0000
 8            8            0.1250   1.0000
10            3            0.1000   0.3000
12            2            0.0833   0.1666
Total        40                     8.4666

$\text{H.M.} = \dfrac{\sum f}{\sum \frac{f}{X}} = \dfrac{40}{8.4666} = 4.72$ (approx.)

Calculation of Harmonic Mean for Continuous Series. When the observations are given in the form of a continuous series, we use the middle value of each class interval as the variable value and calculate the harmonic mean in the manner explained for the discrete series.

Uses of Harmonic Mean. The harmonic mean should be preferred for calculating the average rate of increase of profits of a firm, the average speed at which a journey has been performed, or the average price at which articles are sold.

Weighted Harmonic Mean. In some situations we have to give different weights to individual values before calculating their harmonic mean. In such cases the formula used for calculating H.M. shall be:

$\text{H.M.} = \dfrac{\sum W}{\sum \frac{W}{X}}$

Merits of Harmonic Mean. Some merits of H.M. are:

(i) It is rigidly defined.
(ii) It is based on all the observations.
(iii) It is capable of further algebraic treatment.
(iv) It is a good representative in those situations where small items are to be given more weightage.
(v) In problems relating to time and rates it gives better results than other averages.
(vi) This average is not affected much by fluctuations of sampling.

Demerits of H.M. A few demerits of H.M. are the following:

(i) It is neither easily understood nor easy to calculate.
(ii) It gives a very high weightage to small items and is not suitable for analysis of economic data.
(iii) It is usually a value which does not exist in the series.
(iv) Its value cannot be computed when there are both positive and negative items in a series or when any of the observations is zero.

Because of these limitations H.M. is very rarely used. In problems relating to time and rates, however, it must be preferred over the other averages.

Following is the relationship between the three means:

A.M. ≥ G.M. ≥ H.M.

The equality signs hold only if all the numbers $X_1, X_2, \ldots, X_n$ are identical.

Which Average to Use? We cannot regard any single average as the best for all circumstances. Our choice of average should depend upon:

(i) The purpose for which we have to use it.
(ii) Whether the average is going to be used for further algebraic calculations.
(iii) The type of data available.

5.3 MISCELLANEOUS ILLUSTRATIONS

Illustration 1. Find the arithmetic mean of the first n natural numbers.

Solution. In this problem our variable X will take the values 1, 2, ..., n.

Sum of first n natural numbers $= 1 + 2 + \cdots + n = \dfrac{n(n+1)}{2}$

Arithmetic Mean $\bar{X} = \dfrac{\text{Sum of observations}}{\text{Number of observations}} = \dfrac{n(n+1)}{2n} = \dfrac{n+1}{2}$

Illustration 2. Find the unknown frequencies in the following distribution table. It is given that the arithmetic mean is 22 and the total number of observations is 20.

Variable X    15   18   20   24   28   30
Frequency f    2    3    7    ?    ?    2

Solution. Let us assume frequency x corresponding to X = 24. The frequency corresponding to X = 28 will then be (6 − x). This can be easily seen, as it is given that the total number of observations is 20. Now let us calculate the A.M. by the usual method.
Variable X    f        Xf
15            2        30
18            3        54
20            7        140
24            x        24x
28            6 − x    168 − 28x
30            2        60
Total         20       452 − 4x

We know that A.M. $= \dfrac{\sum X f}{\sum f} = \dfrac{452 - 4x}{20} = 22$ (given).

Hence 452 − 4x = 440, or 4x = 12, or x = 3.

Thus the unknown frequencies are: corresponding to X = 24, frequency = 3; and for X = 28, frequency = 3.

Illustration 3. The mean weight of 60 students in a certain class is 60 kg. The mean weight of boys in the class is 65 kg and that of girls is 50 kg. Find the number of boys and girls in the class.

Solution. Let the number of boys = X.
Number of girls = 60 − X.
Sum of the weights of boys = X × 65 = 65X kg.
Sum of the weights of girls = (60 − X) × 50 kg.
Sum of the weights of boys and girls = 60 × 60 = 3600 kg.

Now, sum of weights of the boys and girls = sum of weights of boys + sum of weights of girls. Thus

3600 = 65X + (60 − X)50
or 3600 = 15X + 3000
or X = 40.

Thus the number of boys in the class = 40 and the number of girls in the class = 20.

Illustration 4. Find the unknown value of the variable in the following data, assuming the arithmetic mean of these observations to be 25.

X              15   18   22    ?   29   32   35
Frequency f     1    5    7   10    3    2    2

Solution. Let us assume the unknown value of the variable is X. Using the formula for the arithmetic mean we get:

X       f     fX
15      1     15
18      5     90
22      7     154
X       10    10X
29      3     87
32      2     64
35      2     70
Total   30    480 + 10X

A.M. $= \dfrac{480 + 10X}{30} = 25$ (given), or 480 + 10X = 750, or 10X = 270, or X = 27.

Unknown value of the variable = 27.

Illustration 5. The arithmetic mean, the mode and the median of a group of 50 observations were calculated to be 25, 32 and 28 respectively. Later on it was found that one observation was read as 85 instead of 35. To what extent will the three averages be affected by the discovery of this error?

Solution. The A.M. will change. The original total was 50 × 25 = 1250, so

New A.M. $= \dfrac{1250 + 35 - 85}{50} = \dfrac{1200}{50} = 24$

Mode will not be affected, because the value 32 was repeated the maximum number of times.
Median will not be affected, because both 35 and 85 lie above the median (28), so writing 35 instead of 85 does not change the middle value.

Illustration 6. In a company having 50 employees, 30 earn Rs. 2 per hour while 20 earn Rs. 3 per hour. Determine their mean earning per hour. Would the answer remain the same if the 30 employees earn a mean hourly earning of Rs. 2 per hour and the 20 employees earn a mean hourly earning of Rs. 3 per hour?

Solution.

(i) $\bar{X} = \dfrac{\sum f X}{N} = \dfrac{30 \times 2 + 20 \times 3}{50} = \dfrac{120}{50}$ = Rs. 2.40 per hour.

(ii) Yes, the answer will be the same, and the combined mean becomes

$\bar{X} = \dfrac{f_1 \bar{X}_1 + f_2 \bar{X}_2}{f_1 + f_2} = \dfrac{30 \times 2 + 20 \times 3}{30 + 20} = \dfrac{120}{50}$ = Rs. 2.40 per hour.

Illustration 7. Given the mean and median wages of employees of X and Y Company to be Rs. 500 and Rs. 496, find the modal wage.

Solution. We know that

Mode = 3 Median − 2 Mean = 3(496) − 2(500) = 1488 − 1000 = Rs. 488.

Illustration 8. During one year the ratio of milk prices to wheat flour prices per kg was 1.56, whereas during the next year the ratio was 1.00.

(a) Find the arithmetic mean of these two ratios for the two-year period.
(b) Find the A.M. of the ratios of wheat flour prices to milk prices for the two-year period.
(c) Is it advisable to use the A.M. for averaging ratios?
(d) Discuss the suitability of the geometric mean for averaging ratios.

Solution.

(a) Mean ratio of milk to wheat flour prices = ½ (1.56 + 1.00) = 1.28.

(b) Since the ratio of milk to wheat flour prices for the first year is 1.56, the ratio of wheat flour prices to milk prices = 1/1.56 = 0.641 (approx.). Similarly, the ratio of wheat flour prices to milk prices in the next year = 1/1.00 = 1.000.

Mean ratio of wheat flour prices to milk prices = ½ (0.641 + 1.000) = 0.8205.

(c) We would expect the mean ratio of milk to wheat flour prices to be the reciprocal of the mean ratio of wheat flour prices to milk prices if the mean is an appropriate average. However,

1/0.8205 = 1.219 ≠ 1.28

This shows that the A.M. is a poor average to use for ratios.
(d) Geometric mean of ratios of milk to wheat flour prices $= \sqrt{1.00 \times 1.56} = \sqrt{1.56}$.

Geometric mean of ratios of wheat flour prices to milk prices $= \sqrt{(0.641)(1.000)} = \sqrt{0.641} = \dfrac{1}{\sqrt{1.56}}$.

Since these averages are reciprocals of each other, our conclusion is that the geometric mean is more suitable than the A.M. for averaging ratios for this type of problem.

Illustration 9. Compute the mean and median for the following data:

Mid value   25   35   45   55   65   75   85
Frequency    2    7    3    2   34    4    6

Solution. Since we are given the mid values, we should find the upper and lower limits of the various classes to calculate the median. The other calculations are usual and hence omitted.

EXERCISES

1. (a) Explain clearly the concept of central tendency, taking a suitable example. Does central value always imply the "middle most" value?
(b) State the empirical relationship between the averages.

2. "An average is a substitute for a complex group of variables but it is not always safe to depend on the substitute alone to the exclusion of individual members of the group". Discuss.

Discuss briefly the merits and demerits of the various statistical measures.

Each type of average has its own particular field of application. In the light of this statement discuss the characteristic features of the chief averages used in statistics.

Why is the arithmetic mean generally preferred over the median as the measure of central tendency? What is the relation between the arithmetic mean and the geometric mean? When is the latter preferred over the former?

What do you understand by graphic representation of median, quartiles and mode? What are the ways of showing them by graph?

Which average is more suitable in the following cases?
Give reasons:
(i) Average size of ready-made garments
(ii) Average intelligence of students in a class
(iii) Average production per shift in a factory
(iv) Average rate of growth of population per decade.

Given the following data, calculate the most suitable average, giving reasons for your choice.

Value            Frequency
less than 100       40
100-200             89
200-300            148
300-400             64
400 and above       39
Total              380

If the mode and median of a moderately asymmetrical series are 16 inches and 20.2 inches respectively, compute the most probable mean.

There are two branches of an establishment employing 100 and 80 persons respectively. If the arithmetic means of the monthly salaries paid by the two branches are Rs. 275 and Rs. 225 respectively, find the arithmetic mean of the salaries of the employees of the establishment as a whole.

A train moves the first 20 km at the rate of 15 km p.h., the next 30 km at the rate of 40 km p.h., and the next 40 km at the rate of 60 km p.h. Find the average speed of the train.

Following is the data on weights (in gms) of 15 eggs collected by an experimenter:

48, 50, 49, 48, 52, 51, 49, 55, 60, 45, 50, 52, 55, 58, 48.

Find the arithmetic mean, geometric mean and median for the above data.

Calculate the geometric mean of the following price relatives:

Commodity   Price relative     Commodity   Price relative
Wheat       205                Pulses      150
Rice        198                Oils        180
Sugar       128                Salts       110

A person received two annual raises in salary. At the end of the first year he received an increase of ... per cent, and at the end of the second year an increase of 12 per cent on the salary as it was at the end of the first year. What is the average percentage increase?

The sales of a company increased by 20 per cent from 1980 to 1981 and by ... per cent from 1981 to 1982. What is the average percentage increase in sales of the company from 1980 to 1982?
In a certain factory a unit of work is completed by A in 4 minutes, by B in ... minutes, by C in 6 minutes, by D in 10 minutes and by E in 12 minutes. What is their average rate of working? What is the average number of units completed per minute? At this rate how many units will they complete in a six-hour day?

The following table gives the distribution of buffaloes in India in 1940 according to yield of milk per day. Calculate the mean and median milk yield.

Yield per day in kg           0-1    1-2    2-3    3-4    4-5    5-6    6-7
No. of buff. (in thousands)   114   2005   7706   4590   2080    240   3549

The following table gives the increase in cost of living over 1947 for a working class family as at 1st January, 1960, and the weights assigned to various groups.

Group               Percentage increase   Weights
Food                 40                   7.5
Rent                 65                   2.5
Clothing            100                   1.5
Fuel and lighting    80                   1.0
Other items          90                   0.5

Find the weighted average of the increase in cost of living.

Calculate the Median, Quartiles, 6th Decile and 70th Percentile from the following data.

Marks less than    80    70    60    50    40    30    20    10
No. of students   100   99    ...   6...   32   2...   13     5

The mean, the mode and the median of a group of 75 observations were calculated to be 27, 34 and 29 respectively. It was later discovered that one observation was wrongly read as 43 instead of the correct value 53. Examine to what extent the calculated values of the three averages will be affected by the discovery of the error.

The arithmetic mean of two numbers is 10 and their geometric mean is .... Find the harmonic mean.

The annual expenditure of ... families is given below:

Expenditure       ...   2000-4000   4000-6000   6000-8000   8000-10000
No. of families   4     ...

Calculate the missing frequencies, it being known that the mode of this distribution is ....

Ans. Frequency of C.I. 2000-4000 is 19 and frequency of C.I. 6000-8000 is 3.

The following table gives the annual income distribution of some families. The median and mode for the distribution are Rs. 19000 and Rs. 18000 respectively. Compute the total number of families and the missing frequencies.
Income (in thousands of Rs.)   5-10   10-15   15-20   20-25   25-30   30-35   35-40
No. of families                ...    ?       ...     ?       3       2       1

Ans. Freq. for C.I. 10-15 is 4 and for C.I. 20-25 is 6.

23. Following is the age distribution of 1000 persons working in a large hosiery mill.

Age group        15-20   20-25   25-30   30-35   35-40   40-45   45-50   50-55   55-60
No. of persons      60     122     135     242     148     107      85      63      38

Due to heavy losses the management decides to bring down the strength to 50 per cent of the present number according to the following scheme:
(i) to retrench the first 8 per cent from the lowest age group,
(ii) to absorb the next 32 per cent in other branches,
(iii) to make 10 per cent from the highest age group retire prematurely.

What will be the age limits of the persons retained in the mill and of those transferred to other branches?

Ans. (i) Persons with age < 20.82 yrs are to be retrenched. (ii) Persons to be transferred will lie in the age group 20.82 to 31.71 yrs. (iii) Persons with age > 50.08 yrs will retire prematurely.
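The formulas of this chapter can be checked numerically. What follows is a minimal Python sketch, not part of the original text. The function names (`arithmetic_mean`, `geometric_mean`, `harmonic_mean`, `empirical_mode`, `age_at_rank`) are hypothetical helpers introduced only for illustration. The last one assumes that persons are spread evenly within each class interval, which appears to be the assumption behind the printed answers to Exercise 23.

```python
import math


def arithmetic_mean(values, freqs=None):
    """A.M. = sum(f * X) / sum(f); every frequency is 1 if none are given."""
    freqs = freqs or [1] * len(values)
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)


def geometric_mean(values, freqs=None):
    """G.M. = antilog(sum(f * log X) / sum(f)); all values must be positive."""
    freqs = freqs or [1] * len(values)
    mean_log = sum(f * math.log10(x) for x, f in zip(values, freqs)) / sum(freqs)
    return 10 ** mean_log


def harmonic_mean(values, freqs=None):
    """H.M. = sum(f) / sum(f / X); no value may be zero."""
    freqs = freqs or [1] * len(values)
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


def empirical_mode(mean, median):
    """Empirical relation for moderately skewed data: Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def age_at_rank(lower_limits, width, freqs, rank):
    """Value below which `rank` items fall, interpolating linearly in the class."""
    cumulative = 0
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= rank:
            return lower + (rank - cumulative) / f * width
        cumulative += f
    raise ValueError("rank exceeds the total frequency")


if __name__ == "__main__":
    data = [4, 8, 10, 15, 20]
    print(round(geometric_mean(data), 3))      # 9.919, as in the G.M. illustration
    print(arithmetic_mean(data) >= geometric_mean(data) >= harmonic_mean(data))  # True
    print(empirical_mode(mean=500, median=496))  # 488, as in Illustration 7

    # Exercise 23: age distribution of 1000 persons, class width 5 years.
    lowers = [15, 20, 25, 30, 35, 40, 45, 50, 55]
    persons = [60, 122, 135, 242, 148, 107, 85, 63, 38]
    for rank in (80, 400, 900):                # 8%, 8% + 32%, and all but the top 10%
        print(round(age_at_rank(lowers, 5, persons, rank), 2))  # 20.82, 31.71, 50.08
```

Passing class mid-values together with their frequencies, for example `geometric_mean([3, 9, 15, 21, 27, 33, 39], [1, 3, 8, 15, 17, 4, 2])`, reproduces the continuous-series result of about 21.06, using the frequencies shown in that illustration's solution table.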
