# 260

MATHEMATICS

STATISTICS
There are lies, damned lies and statistics.

14

— by Disraeli

14.1 Introduction
In Class IX, you have studied the classification of given data into ungrouped as well as grouped frequency distributions. You have also learnt to represent the data pictorially in the form of various graphs such as bar graphs, histograms (including those of varying widths) and frequency polygons. In fact, you went a step further by studying certain numerical representatives of the ungrouped data, also called measures of central tendency, namely, mean, median and mode. In this chapter, we shall extend the study of these three measures, i.e., mean, median and mode from ungrouped data to that of grouped data. We shall also discuss the concept of cumulative frequency, the cumulative frequency distribution and how to draw cumulative frequency curves, called ogives.

14.2 Mean of Grouped Data
The mean (or average) of observations, as we know, is the sum of the values of all the observations divided by the total number of observations. From Class IX, recall that if x1, x2,. . ., xn are observations with respective frequencies f1, f2, . . ., fn, then this means observation x1 occurs f1 times, x2 occurs f2 times, and so on. Now, the sum of the values of all the observations = f1x1 + f2x2 + . . . + fnxn, and the number of observations = f1 + f2 + . . . + fn. So, the mean x of the data is given by f x + f 2 x2 + L + f n xn x = 1 1 f1 + f 2 + L + f n Recall that we can write this in short form by using the Greek letter Σ (capital sigma) which means summation. That is,

File Name : C:\Computer Station\Class - X (Maths)/Final/Chap–14/Chap–14 (28th Nov. 2006)

STATISTICS

261

∑fx
x =
i =1 n

n

i i

∑f
i =1

i

Σ f i xi which, more briefly, is written as x = , if it is understood that i varies from Σ fi 1 to n.
Let us apply this formula to find the mean in the following example. Example 1 : The marks obtained by 30 students of Class X of a certain school in a Mathematics paper consisting of 100 marks are presented in table below. Find the mean of the marks obtained by the students. Marks obtained 10 (x i ) Number of student ( fi) 1 20 1 36 3 40 4 50 3 56 2 60 4 70 4 72 1 80 1 88 2 92 95 3 1

Solution: Recall that to find the mean marks, we require the product of each xi with the corresponding frequency fi. So, let us put them in a column as shown in Table 14.1. Table 14.1 Marks obtained (xi ) 10 20 36 40 50 56 60 70 72 80 88 92 95 Total Number of students ( fi ) 1 1 3 4 3 2 4 4 1 1 2 3 1 Σfi = 30 f ix i 10 20 108 160 150 112 240 280 72 80 176 276 95 Σfixi = 1779

.

File Name : C:\Computer Station\Class - X (Maths)/Final/Chap–14/Chap–14 (28th Nov. 2006)

262

MATHEMATICS

Now,

x=

1779 Σ fi xi = = 59.3 30 Σ fi

Therefore, the mean marks obtained is 59.3. In most of our real life situations, data is usually so large that to make a meaningful study it needs to be condensed as grouped data. So, we need to convert given ungrouped data into grouped data and devise some method to find its mean. Let us convert the ungrouped data of Example 1 into grouped data by forming class-intervals of width, say 15. Remember that, while allocating frequencies to each class-interval, students falling in any upper class-limit would be considered in the next class, e.g., 4 students who have obtained 40 marks would be considered in the classinterval 40-55 and not in 25-40. With this convention in our mind, let us form a grouped frequency distribution table (see Table 14.2). Table 14.2 Class interval Number of students 10 - 25 2 25 - 40 3 40 - 55 7 55 - 70 6 70 - 85 6 85 - 100 6

Now, for each class-interval, we require a point which would serve as the representative of the whole class. It is assumed that the frequency of each classinterval is centred around its mid-point. So the mid-point (or class mark) of each class can be chosen to represent the observations falling in the class. Recall that we find the mid-point of a class (or its class mark) by finding the average of its upper and lower limits. That is, Class mark =

Upper class limit + Lower class limit 2

10 + 25 , i.e., 2 17.5. Similarly, we can find the class marks of the remaining class intervals. We put them in Table 14.3. These class marks serve as our xi’s. Now, in general, for the ith class interval, we have the frequency fi corresponding to the class mark xi. We can now proceed to compute the mean in the same manner as in Example 1.
With reference to Table 14.2, for the class 10-25, the class mark is

File Name : C:\Computer Station\Class - X (Maths)/Final/Chap–14/Chap–14 (28th Nov. 2006)

X (Maths)/Final/Chap–14/Chap–14 (28th Nov.5 92. We can do nothing with the fi’s.5 or a = 62.0 97.0 x = = = 62 30 Σ fi This new method of finding the mean is known as the Direct Method. 2006) . di = xi – a = xi – 47.25 25 . and which one is more accurate? The difference in the two values is because of the mid-point assumption in Table 14.e.5 375.5 fi xi 35. .0 465.100 Total Number of students ( fi ) 2 3 7 6 6 6 Σ fi = 30 Class mark (xi ) 17.5. We observe that Tables 14.70 70 . the mean x of the given data is given by Σf i xi 1860. but we can change each xi to a smaller number so that our calculations become easy.55 55 . 59. The next step is to find the difference di between a and each of the xi’s. So. Sometimes when the numerical values of xi and fi are large.5 32. .5 332. Let us choose a = 47. let us think of a method of reducing these calculations. How do we do this? What about subtracting a fixed number from each of these xi’s? Let us try this method. . we may take ‘a’ to be that xi which lies in the centre of x1. Can you think why this is so. and denote it by ‘a’. So. to further reduce our calculation work. while 62 an approximate mean.5 62.STATISTICS 263 Table 14. finding the product of xi and fi becomes tedious and time consuming.85 85 .0 The sum of the values in the last column gives us Σ fi xi. x2. i. Also.3. The first step is to choose one among the xi’s as the assumed mean.0 555. that is.1 and 14.5 77.0 Σ fi xi = 1860.3 being the exact mean..3 Class interval 10 .5 47.40 40 . for such situations. So.5 File Name : C:\Computer Station\Class .5. xn.3 are using the same data and employing the same formula for the calculation of the mean but the results obtained are different. the deviation of ‘a’ from each of the xi’s.. we can choose a = 47.

This can be explained mathematically as: Mean of deviations.5 di = xi – 47.e.5 62.5 47. Table 14.25 25 . d = d = Σfi di Σfi Σfi ( xi − a) Σfi Σfi xi Σfi a − = Σf i Σf i = x −a = x −a Σf i Σf i So.. so.100 Total Number of students ( fi ) 2 3 7 6 6 6 Σfi = 30 Class mark (xi ) 17. i. d = Σfi di . the mean of the deviations. The calculations are shown in Table 14. Σfi Now. from Table 14.85 85 .5 –30 –15 0 15 30 45 f id i –60 –45 0 90 180 270 Σfidi = 435 So. we need to add ‘a’ to d .70 70 .5 77.5 92.X (Maths)/Final/Chap–14/Chap–14 (28th Nov.4.4.4 Class interval 10 .264 MATHEMATICS The third step is to find the product of di with the corresponding fi. let us find the relation between d and x .5 32. x = a+ d x = a+ Σfi di Σfi File Name : C:\Computer Station\Class . in order to get the mean x . and take the sum of all the fi di’s.40 40 . we subtracted ‘a’ from each xi.55 55 . 2006) . So. Since in obtaining di.

(Here.5.100 Total fi 2 3 7 6 6 6 Σfi = 30 u = xi 17. 30 Therefore. the values in Column 4 are all multiples of 15.. we would get smaller numbers to multiply with fi.5 + 435 = 47. So. (Why ?) So.4.4.5 = 62 . and so on) as ‘a’.STATISTICS 265 Substituting the values of a. Observe that in Table 14.70 70 .5.25 25 .5 + 14.5 47. Activity 1 : From the Table 14. h Now.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. i. the mean of the marks obtained by the students is 62.5 92. 17. let us form Table 14. again let us find the relation between u and x . The method discussed above is called the Assumed Mean Method.3 find the mean by taking each of xi (i. 15 is the class size of each class interval. What do you observe? You will find that the mean determined in each case is the same.) So. 32.. we get x = 47.5 62.55 55 .5.5 77. we can say that the value of the mean obtained does not depend on the choice of ‘a’. Taking h = 15.40 40 ..85 85 . 2006) .5 32. File Name : C:\Computer Station\Class .e. we calculate ui in this way and continue as before (i. 62. find fi ui and then Σ fi ui).e.5 d i = xi – a –30 –15 0 15 30 45 ui = xi – a h –2 –1 0 1 2 3 f iu i –4 –3 0 6 12 18 Σfiui = 29 Let Σfi ui Σfi Here. where a is the assumed mean and h is the class size. if we divide the values in the entire Column 4 by 15. Table 14. Σfidi and Σfi from Table 14. let ui = xi − a .5 Class interval 10 .e.

5 = 62 So. The mean obtained by all the three methods is the same. 2006) .266 MATHEMATICS We have. i. h. Σf ⎤ 1 ⎡ Σfi xi −a i⎥ ⎢ h ⎣ Σfi Σf i ⎦ 1 [ x − a] h hu = x − a x = a + hu ⎛ Σf u ⎞ x = a + h⎜ i i ⎟ ⎝ Σfi ⎠ Now.5 + 14. the mean marks obtained by a student is 62.5 + 15 × ⎜ ⎟ ⎝ 30 ⎠ = 47. u = = = So. The assumed mean method and step-deviation method are just simplified forms of the direct method.. ui = xi − a h Σf i ( xi − a ) 1 h = Σf i h ⎡ Σfi xi − a Σfi ⎤ ⎢ ⎥ Σf i ⎣ ⎦ Therefore. h Let us apply these methods in another example.e. substituting the values of a. but are xi − a . So. we get ⎛ 29 ⎞ x = 47. any non-zero numbers such that ui = File Name : C:\Computer Station\Class . The method discussed above is called the Step-deviation method.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. The formula x = a + hu still holds if a and h are not as given above.5. We note that : the step-deviation method will be convenient to apply if all the di’s have a common factor. Σfiui and Σfi from Table 14.

85 Number of States /U.65 65 . 10 xi 20 30 40 50 60 70 80 Here we take a = 50.7.75 75 .55 55 .65 65 .T.45 45 .55 55 . ( fi ) 6 11 7 4 4 2 1 xi − 50 .75 75 .35 35 .6): Table 14.) of India. of each class.35 35 .STATISTICS 267 Example 2 : The table below gives the percentage distribution of female teachers in the primary schools of rural areas of various states and union territories (U.X (Maths)/Final/Chap–14/Chap–14 (28th Nov.6 Percentage of female teachers 15 . Percentage of 15 .45 45 .T.25 25 . 2006) . then di = xi – 50 and ui = We now find di and ui and put them in Table 14. h = 10. Find the mean percentage of female teachers by all the three methods discussed in this section.T.25 25 .85 female teachers Number of States/U. xi. File Name : C:\Computer Station\Class . and put them in a column (see Table 14. 6 11 7 4 4 2 1 Source : Seventh All India School Education Survey conducted by NCERT Solution : Let us find the class marks.

268 MATHEMATICS Table 14. If xi and fi are sufficiently small.7 Percentage of female teachers Number of states/U.71 ⎝ 35 ⎠ ⎝ Σf i ⎠ Therefore.71 Σfi 35 Using the assumed mean method.35 35 . x = Σfiui = –36. the mean percentage of female teachers in the primary schools of rural areas is 39.75 75 . Σf i xi 1390 = = 39.55 55 . we can still apply the step-deviation method by taking h to be a suitable divisor of all the di’s. Σfixi = 1390. then the direct method is an appropriate choice. Σfi di (−360) = 50 + = 39. we obtain Σfi = 35. and xi are large numerically.45 45 . File Name : C:\Computer Station\Class .85 Total 6 11 7 4 4 2 1 35 20 30 40 50 60 70 80 –30 –20 –10 0 10 20 30 –3 –2 –1 0 1 2 3 120 –180 –18 330 –220 –22 280 –70 –7 200 0 0 240 40 4 140 40 4 80 30 3 1390 –360 –36 From the table above.71 Σfi 35 Using the step-deviation method.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. So the choice of method to be used depends on the numerical values of xi and fi. ( fi ) xi di = xi – 50 ui = xi −50 10 fixi fidi fiui 15 . Using the direct method. If the class sizes are unequal. x = a+ ⎛ Σf u ⎞ ⎛ – 36 ⎞ x = a + ⎜ i i ⎟ × h = 50 + ⎜ ⎟ × 10 = 39. 2006) .25 25 .T. then we can go for the assumed mean method or step-deviation method.71. Σfi di = – 360. If xi and fi are numerically large numbers. Remark : The result obtained by all the three methods is the same.65 65 .

and the xi. Let us still apply the stepdeviation method with a = 200 and h = 20.250 250 . Find the mean number of wickets by choosing a suitable method. the class size varies. let us see how well you can apply the concepts discussed in this section! File Name : C:\Computer Station\Class . x = 200 + 20 ⎜ ⎟ = 200 – 47.350 350 . on an average. Then.8.100 100 .89. we obtain the data as in Table 14.STATISTICS 269 Example 3 : The distribution below shows the number of wickets taken by bowlers in one-day cricket matches.450 5 16 12 2 3 Solution : Here. 2006) . Now.150 150 .89.s are large.11 = 152.60 60 .100 100 .350 350 . u = ⎛ −106 ⎞ −106 ⋅ Therefore.450 Total 7 5 16 12 2 3 45 40 80 125 200 300 400 –160 –120 –75 0 100 200 –8 –6 –3.8 Number of wickets taken Number of bowlers ( fi ) xi di = xi – 200 ui = di 20 ui fi 20 . the number of wickets taken by these 45 bowlers in one-day cricket is 152.150 150 .X (Maths)/Final/Chap–14/Chap–14 (28th Nov. Table 14.250 250 . ⎝ 45 ⎠ 45 This tells us that.75 0 5 10 –56 –30 –60 0 10 30 –106 So. What does the mean signify? Number of wickets Number of bowlers 20 .60 7 60 .

Daily wages (in Rs) Number of workers 100 . the groups should find the mean in each case by the method which they find appropriate.19 13 19 . Daily pocket allowance (in Rs) Number of children 11 .14 3 Which method did you use for finding the mean.200 10 Find the mean daily wages of the workers of the factory by using an appropriate method.10 6 10 . The following distribution shows the daily pocket allowance of children of a locality. Consider the following distribution of daily wages of 50 workers of a factory. A survey was conducted by a group of students as a part of their environment awareness programme. Find the mean number of plants per house. Present this data as a grouped frequency table.25 4 File Name : C:\Computer Station\Class .1 1.17 9 17 .160 8 160 . Measure the heights of all the students of your class (in cm) and form a grouped frequency distribution table of this data.13 7 13 .21 f 21 . 1. 2006) . Find the missing frequency f.270 MATHEMATICS Activity 2 : Divide the students of your class into three groups and ask each group to do one of the following activities. Number of plants Number of houses 0-2 1 2-4 2 4-6 1 6-8 5 8 .180 6 180 . The mean pocket allowance is Rs 18. and why? 2. 3.X (Maths)/Final/Chap–14/Chap–14 (28th Nov.140 14 140 .23 5 23 . Form a grouped frequency distribution of the data obtained. After all the groups have collected the data and formed grouped frequency distribution tables. 3. EXERCISE 14. Collect the daily maximum temperatures recorded for a period of 30 days in your city.15 6 15 . in which they collected the following data regarding the number of plants in 20 houses in a locality. 2. Collect the marks obtained by all the students of your class in Mathematics in the latest examination conducted by your school.120 12 120 .12 2 12 .

350 2 Find the mean daily expenditure on food by a suitable method.08 0.71 4 71 .20 .0.61 115 62 .0. ppm).64 25 Find the mean number of mangoes kept in a packing box.0.08 .0. 7.12 0.52 15 53 .24 Find the mean concentration of SO2 in the air.58 135 59 .55 110 56 . choosing a suitable method. Find the mean heart beats per minute for these women.04 . Thirty women were examined in a hospital by a doctor and the number of heart beats per minute were recorded and summarised as follows.0.20 0. In a retail market. i.86 2 5. fruit vendors were selling mangoes kept in packing boxes.16 0. Daily expenditure (in Rs) Number of households 100 . Number of mangoes Number of boxes 50 .83 4 83 .X (Maths)/Final/Chap–14/Chap–14 (28th Nov. The table below shows the daily expenditure on food of 25 households in a locality.77 8 77 ..12 .150 4 150 . 2006) .16 .74 3 74 .00 .04 0.e.0. Frequency 4 9 9 2 4 2 File Name : C:\Computer Station\Class .250 12 250 .STATISTICS 271 4.200 5 200 . the data was collected for 30 localities in a certain city and is presented below: Concentration of SO2 (in ppm) 0. These boxes contained varying number of mangoes.68 per minute Number of women 2 68 . Which method of finding the mean did you choose? 6. Number of heart beats 65 .80 7 80 . The following was the distribution of mangoes according to the number of boxes. To find out the concentration of SO2 in the air (in parts per million.300 2 300 .

40 10 7 4 4 3 1 9. Let us first recall how we found the mode for ungrouped data through the following example.28 28 . Solution : Let us form the frequency distribution table of the given data as follows: Number of wickets Number of matches 0 1 1 1 2 3 3 2 4 1 5 1 6 1 File Name : C:\Computer Station\Class .20 20 . Though grouped data can also be multimodal.95 3 14.65 10 65 . The following table gives the literacy rate (in percentage) of 35 cities.55 3 55 . the value of the observation having the maximum frequency.38 38 . Find the mean number of days a student was absent.3 Mode of Grouped Data Recall from Class IX.14 14 . Further.10 10 . the data is said to be multimodal. In such situations. 2006) . Literacy rate (in %) Number of cities 45 . we shall restrict ourselves to problems having a single mode only. Number of days Number of students 0-6 11 6 . It is possible that more than one value may have the same maximum frequency. Here. A class teacher has the following absentee record of 40 students of a class for the whole term.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. Example 4 : The wickets taken by a bowler in 10 cricket matches are as follows: 2 6 4 5 0 2 1 3 2 3 Find the mode of the data.85 8 85 .75 11 75 . a mode is that value among the observations which occurs most often.272 MATHEMATICS 8. we shall discuss ways of obtaining a mode of grouped data. we discussed finding the mode of ungrouped data. Find the mean literacy rate. that is.

e. Here. 2006) . So.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. Now. In a grouped frequency distribution. frequency ( f0 ) of class preceding the modal class = 7.STATISTICS 273 Clearly. 3) of matches. the mode of this data is 2. f1 = frequency of the modal class. and is given by the formula: ⎛ f1 − f 0 ⎞ Mode = l + ⎜ ⎟×h ⎝ 2 f1 − f 0 − f 2 ⎠ where l = lower limit of the modal class. The mode is a value inside the modal class. Let us consider the following examples to illustrate the use of this formula.. Now modal class = 3 – 5. frequency ( f2 ) of class succeeding the modal class = 2.11 1 Find the mode of this data. the modal class is 3 – 5. h = size of the class interval (assuming all class sizes to be equal). let us substitute these values in the formula : File Name : C:\Computer Station\Class . we can only locate a class with the maximum frequency. f0 = frequency of the class preceding the modal class. it is not possible to determine the mode by looking at the frequencies. and the class corresponding to this frequency is 3 – 5. Solution : Here the maximum class frequency is 8. Example 5 : A survey conducted on 20 households in a locality by a group of students resulted in the following frequency table for the number of family members in a household: Family size Number of families 1-3 7 3-5 8 5-7 2 7-9 2 9 . class size (h) = 2 frequency ( f1 ) of the modal class = 8. called the modal class. 2 is the number of wickets taken by the bowler in the maximum number (i. f2 = frequency of the class succeeding the modal class. So. lower limit (l ) of modal class = 3.

Now. the mode marks is 52. Remarks : 1. the class size ( h) = 15. But for some other problems it may be equal or more than the mean also. the mode is less than the mean. Find the mode of this data. the modal class is 40 .3 of Example 1.286. 2006) .e.. the maximum number of students obtained 52 marks. Solution : Refer to Table 14. the lower limit ( l ) of the modal class = 40. It depends upon the demand of the situation whether we are interested in finding the average marks obtained by the students or the average of the marks obtained by most ⎛ 7−3 ⎞ Mode = 40 + ⎜ ⎟ × 15 = 52 ⎝ 14 − 6 − 3 ⎠ File Name : C:\Computer Station\Class . Therefore. In Example 6. the frequency ( f1 ) of modal class = 7. the frequency ( f2 ) of the class succeeding the modal class = 6. ⎝ 2 f1 − f 0 − f 2 ⎠ we get So. from Example 1. Since the maximum number of students (i. Also compare and interpret the mode and the mean. Example 6 : The marks distribution of 30 students in a mathematics examination are given in Table 14. the frequency ( f0 ) of the class preceding the modal class = 3. using the formula: ⎛ f1 − f 0 ⎞ Mode = l + ⎜ ⎟ × h.274 MATHEMATICS ⎛ f1 − f 0 ⎞ Mode = l + ⎜ ⎟×h ⎝ 2 f1 − f 0 − f 2 ⎠ ⎛ ⎞ 8−7 2 = 3+⎜ ⎟ × 2 = 3 + = 3.55. 7) have got marks in the interval 40 .286 7 ⎝2×8 − 7 − 2⎠ Therefore.3 of Example 1. while on an average a student obtained 62 marks. Now. you know that the mean marks is 62.55. the mode of the data above is 3.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. 2. So.

Activity 3 : Continuing with the same groups as formed in Activity 2 and the situations assigned to the groups.25 11 25 . 3.4500 4500 .45 23 45 .40 35 40 . the mean is required and in the second situation.4000 4000 . However.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. EXERCISE 14.20 10 20 .65 5 Find the mode and the mean of the data given above.60 52 60 . Find the modal monthly expenditure of the families. 2006) .100 38 100 .2000 2000 . Compare and interpret the two measures of central tendency. find the mean monthly expenditure : Expenditure (in Rs) 1000 . Also. The following data gives the information on the observed lifetimes (in hours) of 225 electrical components : Lifetimes (in hours) Frequency 0 . the mode is required.1500 1500 . 2.3000 3000 .2500 2500 .15 6 15 .3500 3500 . Ask each group to find the mode of the data. The following table shows the ages of the patients admitted in a hospital during a year: Age (in years) Number of patients 5 . Remark : The mode can also be calculated for grouped data with unequal class sizes.STATISTICS 275 of the students. In the first situation.2 1. and interpret the meaning of both.35 21 35 . The following data gives the distribution of total monthly household expenditure of 200 families of a village.5000 Number of families 24 40 33 28 30 22 16 7 File Name : C:\Computer Station\Class . They should also compare this with the mean. we shall not be discussing it.120 29 Determine the modal lifetimes of the components.80 61 80 .55 14 55 .

40 40 .276 MATHEMATICS 4.10000 10000 .35 35 . Number of students per teacher 15 . Runs scored 3000 .70 70 . A student noted the number of cars passing through a spot on a road for 100 periods each of 3 minutes and summarised it in the table given below.50 50 .25 25 . Find the mode and mean of this data.45 45 . 3 8 9 10 3 0 0 2 5.8000 8000 .4000 4000 . Interpret the two measures.7000 7000 . The following distribution gives the state-wise teacher-student ratio in higher secondary schools of India.60 60 .5000 5000 . Find the mode of the data : Number of cars Frequency 0 .T.30 30 .55 Number of states / U .6000 6000 .80 7 14 13 12 20 11 15 8 Number of batsmen 4 18 9 7 6 3 1 1 File Name : C:\Computer Station\Class .9000 9000 . The given distribution shows the number of runs scored by some top batsmen of the world in one-day international cricket matches. 6.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. 2006) .40 40 .30 30 .10 10 .20 20 .50 50 .20 20 .11000 Find the mode of the data.

out of 50.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. we have to find the median of the following data. which gives the marks. the median is the ⎜ ⎟ th observation.4 Median of Grouped Data As you have studied in Class IX. Recall that for finding the median of ungrouped data. we arrange the marks in ascending order and prepare a frequency table as follows : Table 14. And. if n is odd. 2 ⎝2 ⎠ Suppose. we first arrange the data values of the observations in ⎛ n + 1⎞ ascending order. Then. then the median will be the average of the n ⎛n ⎞ th and the ⎜ + 1⎟ th observations. obtained by 100 students in a test : Marks obtained Number of students 20 6 29 28 28 24 33 15 42 2 38 4 43 1 25 20 First.9 Marks obtained Number of students (Frequency) 6 20 24 28 15 4 2 1 100 20 25 28 29 33 38 42 43 Total File Name : C:\Computer Station\Class . if n ⎝ 2 ⎠ is even.STATISTICS 277 14. the median is a measure of central tendency which gives the value of the middle-most observation in the data. 2006) .

10 Marks obtained 20 upto 25 upto 28 upto 29 upto 33 upto 38 upto 42 upto 43 Number of students 6 6 + 20 = 26 26 + 24 = 50 50 + 28 = 78 78 + 15 = 93 93 + 4 = 97 97 + 2 = 99 99 + 1 = 100 Now we add another column depicting this information to the frequency table above and name it as cumulative frequency column. Table 14.11 Marks obtained 20 25 28 29 33 38 42 43 Number of students 6 20 24 28 15 4 2 1 Cumulative frequency 6 26 50 78 93 97 99 100 File Name : C:\Computer Station\Class .e.. we proceed as follows: Table 14. which is even. The median will be the average of the n th and the 2 ⎛n ⎞ ⎜ + 1 ⎟ th observations.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. 2006) .278 MATHEMATICS Here n = 100. To find these ⎝2 ⎠ observations. i. the 50th and 51st observations.

5. Consider a grouped frequency distribution of marks obtained.5 2 Remark : The part of Table 14.12 Marks 0 . we see that: 50th observaton is 28 51st observation is 29 So.40 40 . The median marks 28.100 Number of students 5 3 4 3 3 4 7 9 7 8 From the table above. out of 100. let us see how to obtain the median of grouped data. through the following situation. Now. Median = (Why?) 28 + 29 = 28. in a certain examination. as follows: Table 14.50 50 .90 90 .5 conveys the information that about 50% students obtained marks less than 28.5 and another 50% students obtained marks more than 28. 2006) . by 53 students.30 30 .20 20 .11 consisting Column 1 and Column 3 is known as Cumulative Frequency Table. File Name : C:\Computer Station\Class .80 80 .10 10 .70 70 .X (Maths)/Final/Chap–14/Chap–14 (28th Nov.STATISTICS 279 From the table above. try to answer the following questions: How many students have scored marks less than 10? The answer is clearly 5.60 60 .

this means that there are 53 – 5 = 48 students getting more than or equal to 10 marks.14.. 30.10. Similarly.13 given below: Table 14.12. From Table 14.e.280 MATHEMATICS How many students have scored less than 20 marks? Observe that the number of students who have scored less than 20 include the number of students who have scored marks from 0 . more than or equal to 10. more than or equal to 0. .. we observe that all 53 students have scored marks more than or equal to 0. File Name : C:\Computer Station\Class . as shown in Table 14. and so on.20. 30 or above as 45 – 4 = 41. the number of students with marks less than 30. 20. i. we get the number of students scoring 20 or above as 48 – 3 = 45. . 100. Since there are 5 students scoring marks in the interval 0 . we can compute the cumulative frequencies of the other classes. Here 10. less than 40. We say that the cumulative frequency of the class 10 -20 is 8. 8. We can similarly make the table for the number of students with scores. So. 2006) . Continuing in the same manner. . the total number of students with marks less than 20 is 5 + 3.13 Marks obtained Less than 10 Less than 20 Less than 30 Less than 40 Less than 50 Less than 60 Less than 70 Less than 80 Less than 90 Less than 100 Number of students (Cumulative frequency) 5 5+3=8 8 + 4 = 12 12 + 3 = 15 15 + 3 = 18 18 + 4 = 22 22 + 7 = 29 29 + 9 = 38 38 + 7 = 45 45 + 8 = 53 The distribution given above is called the cumulative frequency distribution of the less than type. less than 100. and so on. We give them in Table 14. i.10 as well as the number of students who have scored marks from 10 . .. are the upper limits of the respective class intervals.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. . more than or equal to 20. .e.

STATISTICS 281 Table 14.12 and 14.30 30 . . . 10. we can make use of any of these cumulative frequency distributions.14 Marks obtained More than or equal to 0 More than or equal to 10 More than or equal to 20 More than or equal to 30 More than or equal to 40 More than or equal to 50 More than or equal to 60 More than or equal to 70 More than or equal to 80 More than or equal to 90 Number of students (Cumulative frequency) 53 53 – 5 = 48 48 – 3 = 45 45 – 4 = 41 41 – 3 = 38 38 – 3 = 35 35 – 4 = 31 31 – 7 = 24 24 – 9 = 15 15 – 7 = 8 The table above is called a cumulative frequency distribution of the more than type.13 to get Table 14.60 60 . Here 0. 20.70 70 .40 40 .100 Number of students ( f ) 5 3 4 3 3 4 7 9 7 8 Cumulative frequency (cf ) 5 8 12 15 18 22 29 38 45 53 Now in a grouped data. we may not be able to find the middle observation by looking at the cumulative frequencies as the middle observation will be some value in File Name : C:\Computer Station\Class .80 80 .15 Marks 0 .15 given below: Table 14. 2006) .X (Maths)/Final/Chap–14/Chap–14 (28th Nov. Let us combine Tables 14.10 10 .50 50 . .. 90 give the lower limits of the respective class intervals. to find the median of grouped data.90 90 .20 20 . Now.

we find the cumulative frequencies of all the classes and Therefore.282 MATHEMATICS a class interval. we get ⎛ 26. ⎜ f ⎟ ⎜ ⎟ ⎝ ⎠ where l = lower limit of median class.5 − 22 ⎞ Median = 60 + ⎜ ⎟ × 10 7 ⎝ ⎠ = 60 + = 66. After finding the median class.5.4 45 7 So. n = 53. cf = 22. 2 2 Now 60 – 70 is the class whose cumulative frequency 29 is greater than (and nearest n . 60 – 70 is the median class. 2 We now locate the class whose cumulative frequency is greater than (and nearest to) n n ⋅ This is called the median class. File Name : C:\Computer Station\Class . 26. It is. f = frequency of median class. i. we use the following formula for calculating the median. 2006) .5. f = 7. about half the students have scored marks less than 66. n = number of observations. Substituting the values n = 26.5. In the distribution above. = 26. necessary to find the value inside a class that divides the whole distribution into two halves. cf = cumulative frequency of class preceding the median class. So.4. h = 10 2 in the formula above.X (Maths)/Final/Chap–14/Chap–14 (28th Nov.e. But which class should this be? n .. to) 2 To find this class. and the other half have scored marks more than 66. h = class size (assuming class size to be equal). ⎛n ⎞ ⎜ 2 − cf ⎟ Median = l + ⎜ ⎟ × h.4. therefore. l = 60.

our frequency distribution table with the given cumulative frequencies becomes: Table 14. Observe that from the given distribution. it is 40 – 29 = 11. 160 .X (Maths)/Final/Chap–14/Chap–14 (28th Nov. we need to find the class intervals and their corresponding frequencies.145. . 150. 145. 140 . the classes should be below 140. So.150 150 . The given distribution being of the less than type. . 2006) .160 160 . 165 give the upper limits of the corresponding class intervals. the number of girls with height in the interval 140 . 140. there are 11 girls with heights less than 145 and 4 girls with height less than 140..155. and so on.145 is 11 – 4 = 7. Now.16 Class intervals Below 140 140 .165.155 155 . we find that there are 4 girls with height less than 140.150. . So. the frequency of 145 ... Similarly. for 150 . . . . Solution : To calculate the median height. i. 145 .165 Frequency 4 7 18 11 6 5 Cumulative frequency 4 11 29 40 46 51 Number of girls 4 11 29 40 46 51 File Name : C:\Computer Station\Class . Therefore.e.STATISTICS 283 Example 7 : A survey regarding the heights (in cm) of 51 girls of Class X of a school was conducted and the following data was obtained: Height (in cm) Less than 140 Less than 145 Less than 150 Less than 155 Less than 160 Less than 165 Find the median height.145 145 .150 is 29 – 11 = 18. the frequency of class interval below 140 is 4.

03. cf (the cumulative frequency of the class preceding 145 .1000 Frequency 2 5 x 12 17 20 y 9 7 4 File Name : C:\Computer Station\Class . 2 2 l (the lower limit) = 145.700 700 .5 . and 50% are taller than this height. Then. This observation lies in the class 145 .03 cm.5 = 149. if the total frequency is 100. f (the frequency of the median class 145 .150) = 11.284 MATHEMATICS Now n = 51. Find the values of x and y.200 200 . we have Using the formula.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. 18 So. = 145 + This means that the height of about 50% of the girls is less than this height.500 500 . the median height of the girls is 149. n 51 = = 25. 2006) .300 300 .800 800 . Median = l + ⎜ f ⎜ ⎟ ⎝ ⎠ ⎛ 25.100 100 .150. h (the class size) = 5. ⎛n ⎞ ⎜ 2 − cf ⎟ ⎟ × h .600 600 . So.400 400 .150) = 18. Class interval 0 . Example 8 : The median of the following data is 525.900 900 .5 − 11 ⎞ Median = 145 + ⎜ ⎟×5 ⎝ 18 ⎠ 72.

700 700 . So..100 100 . 2006) . from (1).200 200 .e.400 400 . x + y = 24 (1) The median is 525.STATISTICS 285 Solution : Class intervals 0 . f = 20.. l = 500.800 800 ..1000 It is given that n = 100 So.500 500 . we get i. 525 – 500 = (14 – x) × 5 25 = 70 – 5x 5x = 70 – 25 = 45 x= 9 9 + y = 24 y = 15 File Name : C:\Computer Station\Class . i. So.e.e. which lies in the class 500 – 600 cf = 36 + x. h = 100 ⎛n ⎜ − cf Median = l + ⎜ 2 ⎜ f ⎝ ⎞ ⎟ ⎟ h.X (Maths)/Final/Chap–14/Chap–14 (28th Nov... Therefore. 76 + x + y = 100. i. we get ⎟ ⎠ Frequency 2 5 x 12 17 20 y 9 7 4 Cumulative frequency 2 7 7+x 19 + x 36 + x 56 + x 56 + x + y 65 + x + y 72 + x + y 76 + x + y Using the formula : ⎛ 50 − 36 − x ⎞ 525 = 500 + ⎜ ⎟ × 100 20 ⎝ ⎠ i.e.900 900 .300 300 . i.e.600 600 .

V. the median is more appropriate. the mode is the best choice. In situations which require establishing the most frequent value or most popular item. 20. It also enables us to compare two or more distributions. The median of grouped data with unequal class sizes can also be calculated.e. the colour of the vehicle used by most of the people. File Name : C:\Computer Station\Class . and we wish to find out a ‘typical’ observation. i. Remarks : 1.286 MATHEMATICS Now. to find the most popular T. by comparing the average (mean) results of students of different schools of a particular examination. that you have studied about all the three measures of central tendency. finding the typical productivity rate of workers. The mean is the most frequently used measure of central tendency because it takes into account all the observations. So. in such cases. average wage in a country. However. But. and the five others have frequency 20. e. then the mean will certainly not reflect the way the data behaves. For example. we shall not discuss it here. rather than the mean.g. 21. 25. the consumer item in greatest demand. etc.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. the mean is not a good representative of the data.. extreme values in the data affect the mean. we can conclude which school has a better performance. e. the largest and the smallest observations of the entire data. if one class has frequency. 2006) . let us discuss which measure would be best suited for a particular requirement..g. etc. 18. and lies between the extremes. In problems where individual observations are not important. programme being watched.. These are situations where extreme values may be there. the mean of classes having frequencies more or less the same is a good representative of the data. There is a empirical relationship between the three measures of central tendency : 3 Median = Mode + 2 Mean 2. say 2. we take the median as a better measure of central tendency. However. So. For example.

find the values of x and y. Calculate the median age.165 165 . Monthly consumption (in units) 65 .X (Maths)/Final/Chap–14/Chap–14 (28th Nov.185 185 .40 40 .3 1. 2006) . A life insurance agent found the following data for distribution of ages of 100 policy holders.105 105 .10 10 . File Name : C:\Computer Station\Class . Class interval 0 .20 20 . Find the median. if policies are given only to persons having age 18 years onwards but less than 60 year.5.125 125 . The following frequency distribution gives the monthly consumption of electricity of 68 consumers of a locality. If the median of the distribution given below is 28.85 85 . mean and mode of the data and compare them.50 50 .205 Number of consumers 4 5 13 20 14 8 4 2.30 30 .60 Total Frequency 5 x 20 15 y 5 60 3.145 145 .STATISTICS 287 EXERCISE 14.

2006) .X (Maths)/Final/Chap–14/Chap–14 (28th Nov.) Number of leaves 3 5 9 12 5 4 2 File Name : C:\Computer Station\Class .5 . The lengths of 40 leaves of a plant are measured correct to the nearest millimetre. . (Hint : The data needs to be converted to continuous classes for finding the median.144 145 . .5. 126.126 127 . 171.. .5 .288 Age (in years) Below 20 Below 25 Below 30 Below 35 Below 40 Below 45 Below 50 Below 55 Below 60 MATHEMATICS Number of policy holders 2 6 24 45 78 89 92 98 100 4. and the data obtained is represented in the following table : Length (in mm) 118 . since the formula assumes continuous classes.171 172 .135 136 .5 .5. The classes then change to 117.126.5.180.162 163 .153 154 .180 Find the median length of the leaves.135.

65 65 . 7.10 40 10 .2000 2000 . Let us now represent a cumulative frequency distribution graphically.70 70 .X (Maths)/Final/Chap–14/Chap–14 (28th Nov.2500 2500 . File Name : C:\Computer Station\Class .19 4 Determine the median number of letters in the surnames.60 60 .13 16 13 . The distribution below gives the weights of 30 students of a class. let us consider the cumulative frequency distribution given in Table 14.50 50 .3500 3500 .STATISTICS 5. Find the mean number of letters in the surnames? Also. 100 surnames were randomly picked up from a local telephone directory and the frequency distribution of the number of letters in the English alphabets in the surnames was obtained as follows: Number of letters Number of surnames 1-4 6 4-7 30 7 . For example.55 55 .4500 4500 .3000 3000 . find the modal size of the surnames. The following table gives the distribution of the life time of 400 neon lamps : Life time (in hours) 1500 . we have represented the data through bar graphs. In Class IX. 2006) .13.75 2 3 8 6 6 3 2 14.4000 4000 . Find the median weight of the students. pictures speak better than words. Number of lamps 14 56 60 86 74 62 48 289 6. A graphical representation helps us in understanding given data at a glance.16 4 16 .5000 Find the median life time of a lamp.45 45 . Weight (in kg) Number of students 40 .5 Graphical Representation of Cumulative Frequency Distribution As we all know. histograms and frequency polygons.

. 10 . 2006) . 53).20. 20. 15). . (30. 29). 100 are the upper limits of the respective class intervals. (80.e. we mark the upper limits of the class intervals on the horizontal axis (x-axis) and their corresponding cumulative frequencies on the vertical axis ( y-axis). In architecture.290 MATHEMATICS Recall that the values 10.2) File Name : C:\Computer Station\Class . (60. (See Fig. 24). or an ogive (of the less than type).. 22). 5). 14. 30.14 and draw its ogive (of the more than type). Fig. (40. . 8). The curve we get is a cumulative frequency curve. corresponding cumulative frequency). (50. The scale may not be the same on both the axis. corresponding cumulative frequency). 18). choosing a convenient scale. (80. on a graph paper. 20.1 i. (30. To represent the data in the table graphically. Let us now plot the points corresponding to the ordered pairs given by (upper limit. . 45). 90 are the lower limits of the respective class intervals 0 . (20. Next.1) The term ‘ogive’ is pronounced as ‘ojeev’ and is derived from the word ogee.2 and join them by a free hand smooth curve. 48). (70. (60. . (50. here 0. 12).. or an ogive (of the more than type). (100. (See Fig. 10. 35). Fig. (90. again we consider the cumulative frequency distribution given in Table 14. 90 . the ogee shape is one of the characteristics of the 14th and 15th century Gothic styles. 8). (0. 31). 53) on a graph paper and join them by a free hand smooth curve. An ogee is a shape consisting of a concave arc flowing into a convex arc. 14. 15). we plot the lower limits on the x-axis and the corresponding cumulative frequencies on the y-axis.e. 45). 38).10. . 41). Then we plot the points (lower limit. 14.X (Maths)/Final/Chap–14/Chap–14 (28th Nov.. 38). so forming an S-shaped curve with vertical ends. To represent ‘the more than type’ graphically.100. (10. (40. (20.. (90. i.. The curve we get is called a cumulative frequency curve. . 14. (70. . . (10. Recall that.

12. are the ogives related to the median in any way? Is it possible to obtain the median from these two cumulative frequency curves corresponding to the data in Table 14.STATISTICS 291 Remark : Note that both the ogives (in Fig.5 2 2 (see Fig. draw a line parallel to the x-axis cutting the curve at a point. 14.4 Example 9 : The annual profits earned by 30 shops of a shopping complex in a locality give rise to the following distribution : Profit (in lakhs Rs) More than or equal to 5 More than or equal to 10 More than or equal to 15 More than or equal to 20 More than or equal to 25 More than or equal to 30 More than or equal to 35 Number of shops (frequency) 30 28 16 14 10 7 3 File Name : C:\Computer Station\Class . The two ogives will intersect each other at a point. Another way of obtaining the median is the following : Draw both ogives (i. 14. Now. One obvious way is to locate n 53 on the y-axis = = 26. Fig.3 Fig.e.. From this point. 14.1 and Fig. draw a perpendicular to the x-axis. 14.12? Let us see.3). of the less than type and of the more than type) on the same axis. the point at which it cuts the x-axis gives us the median (see Fig. which is given in Table 14. From this point.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. if we draw a perpendicular on the x-axis.2) correspond to the same data. From this point. 14. 2006) .4). The point of intersection of this perpendicular with the x-axis determines the median of the data (see Fig. 14.3). 14.

which is the median. 14. (35. 23).5 to get the ‘less than’ ogive.5. (15. 2). 28). (25. it may be noted that the class intervals were continuous.5. We join these points with a smooth curve to get the ‘more than’ ogive. Hence obtain the median profit. 16).35 35 .10 2 2 10 . 30) on the same axes as in Fig. The abcissa of their point of intersection is nearly 17. 14). their frequencies and the cumulative frequency from the table above. (40. of shops Cumulative frequency 5 . with lower limits of the profit along the horizontal axis. 2006) . (30. 7) and (35. 10). 14. (25.15 15 . This can also be verified by using the formula. Table 14. 16). let us obtain the classes. (20. 3). 20). we plot the points (10.6 File Name : C:\Computer Station\Class .5.292 MATHEMATICS Draw both ogives for the data above.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. Remark : In the above examples. Hence.30 30 . Now. For drawing ogives. and the cumulative frequency along the vertical axes. 30). we plot the points (5. (10. (20.40 12 14 2 16 4 20 3 23 4 27 3 30 Fig.20 20 . 14. (Also see constructions of histograms in Class IX) Fig. (15. the median profit (in lakhs) is Rs 17. 14. (30. it should be ensured that the class intervals are continuous. as shown in Fig.17 Classes No.5 Using these values. Solution : We first draw the coordinate axes. 14). 27).6. 14.25 25 . Then. as shown in Fig.

2006) . Production yield (in kg/ha) Number of farms 50 . their weights were recorded as follows: Weight (in kg) Less than 38 Less than 40 Less than 42 Less than 44 Less than 46 Less than 48 Less than 50 Less than 52 Number of students 0 3 5 9 14 28 32 35 Draw a less than type ogive for the given data. The following distribution gives the daily income of 50 workers of a factory. The mean for grouped data can be found by : (i) the direct method : x = Σfi xi Σfi File Name : C:\Computer Station\Class .140 14 140 .180 6 180 . Daily income (in Rs) Number of workers 100 .65 12 65 .120 12 120 .70 24 70 .55 2 55 . and draw its ogive. you have studied the following points: 1. and draw its ogive.75 38 75 . 2.80 16 Change the distribution to a more than type distribution.X (Maths)/Final/Chap–14/Chap–14 (28th Nov.6 Summary In this chapter. 14.200 10 Convert the distribution above to a less than type cumulative frequency distribution.160 8 160 .4 1. During the medical check-up of 35 students of a class.STATISTICS 293 EXERCISE 14. Hence obtain the median weight from the graph and verify the result by using the formula.60 8 60 . The following table gives production yield per hectare of wheat of 100 farms of a village. 3.

Same condition also apply for construction of an ogive.X (Maths)/Final/Chap–14/Chap–14 (28th Nov. or an ogive of the less than type and of the more than type. File Name : C:\Computer Station\Class . 6. in case of ogives. The cumulative frequency of a class is the frequency obtained by adding the frequencies of all the classes preceding the given class. it should be ensured that the class intervals are continuous before applying the formulae. The mode for grouped data can be found by using the formula: ⎛ ⎞ f1 − f 0 Mode = l + ⎜ ⎟×h 2 f1 − f 0 − f 2 ⎠ ⎝ where symbols have their usual meanings.294 (ii) the assumed mean method : x = a + Σfi di Σfi MATHEMATICS ⎛ Σf u ⎞ (iii) the step deviation method : x = a + ⎜ i i ⎟ × h . Further. A NOTE TO THE READER For calculating mode and median for grouped data. 5. the scale may not be the same on both the axes. Representing a cumulative frequency distribution graphically as a cumulative frequency curve. called its class mark. 2. f ⎜ ⎟ ⎜ ⎟ ⎝ ⎠ where symbols have their usual meanings. 4. 3. The median of grouped data can be obtained graphically as the x-coordinate of the point of intersection of the two ogives for this data. ⎝ Σf i ⎠ with the assumption that the frequency of a class is centred at its mid-point. 2006) . The median for grouped data is formed by using the formula: ⎛n ⎞ ⎜ 2 − cf ⎟ Median = l + ⎜ ⎟× h.