
(Measures of Central Tendency)

MEASURES OF CENTRAL TENDENCY or AVERAGE


Definition:
“A measure of central tendency is a typical value around which other
figures congregate.”
Objective and function of Measures of Central Tendency:
• To provide a single value that represents and describes the
characteristics of the entire group.
• To facilitate comparison between and within groups.
• To draw a conclusion about population from sample data.
• To form a basis for statistical analysis.
Essential characteristics/Properties/Pre-requisite for a good or an ideal
Average:
• It should be easy to understand and simple to compute.
• It should be rigidly defined.
• Its calculation should be based on all the items/observations in the
data set.
• It should be capable of further algebraic treatment (mathematical
manipulation).
• It should be least affected by sampling fluctuation.
• It should not be much affected by extreme values.
• It should be helpful in further statistical analysis.
Types of Average

Mathematical Average
1. Arithmetic Mean (or Mean)
   i) Simple Arithmetic Mean
   ii) Weighted Arithmetic Mean
   iii) Combined Mean
2. Geometric Mean
3. Harmonic Mean

Positional Average
1. Median
2. Mode
3. Quantiles
   i) Quartiles
   ii) Deciles
   iii) Percentiles

Commercial Average
1. Moving Average
2. Progressive Average
3. Composite Average
Computation of Simple Arithmetic Mean:
i) For raw data/individual-series/ungrouped data:
$\bar{X} = \frac{\sum X_i}{N}$, where N is the number of observations.
ii) For frequency distribution data:
1) Discrete frequency distribution (Ungrouped frequency distribution) data:
2) Continuous frequency distribution (Grouped frequency distribution) data:
In both cases $\bar{X} = \frac{\sum f_i X_i}{N}$, where $N = \sum f_i$ is the total frequency.

Note: For a continuous frequency distribution, Xi is the middle value (mid-point)
of the i-th class interval of the frequency distribution.
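The two formulas above can be sketched in Python as follows. This is only an illustration: the function names and the small data sets are hypothetical and do not come from the slides.

```python
def mean_raw(values):
    """Arithmetic mean of ungrouped raw data: sum(X_i) / N."""
    return sum(values) / len(values)


def mean_frequency(xs, fs):
    """Arithmetic mean of a frequency distribution: sum(f_i * X_i) / sum(f_i)."""
    total_frequency = sum(fs)
    return sum(f * x for x, f in zip(xs, fs)) / total_frequency


def class_midpoints(class_limits):
    """Mid-value of each (lower, upper) class interval."""
    return [(lower + upper) / 2 for lower, upper in class_limits]


# Hypothetical illustrative data (not taken from the text)
print(mean_raw([4, 8, 6, 10]))                     # raw data
print(mean_frequency([1, 2, 3], [2, 1, 1]))        # discrete distribution
mids = class_midpoints([(0, 10), (10, 20)])
print(mean_frequency(mids, [1, 3]))                # continuous distribution
```

For a continuous distribution the only extra step is replacing each class by its mid-value before applying the frequency formula.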
Merits of Arithmetic Mean:
• It is simplest and most widely used average.
• It is easy to understand and easy to calculate.
• It is rigidly defined.
• Its calculation is based on all the observations.
• It is suitable for further mathematical treatment.
• It is affected least by the fluctuations of sampling.
• If the number of items is sufficiently large, it is more accurate and more reliable.
• It is a calculated value and is not based on its position in the series.
• It provides a good basis for comparison.
Demerits of Arithmetic Mean:
• It cannot be obtained by inspection nor can be located graphically.
• It cannot be used to study qualitative phenomena such as intelligence, beauty,
honesty etc.
• It is very much affected by extreme values.
• It cannot be calculated for open-end classes.
• The A.M. computed may not be an actual item in the series.
• Its value cannot be determined if one or more observations are missing from the
series.
• Sometimes the A.M. gives absurd results, e.g. the number of children per family cannot be a
fraction.
Uses of Arithmetic Mean
• Arithmetic Mean is used to compare two or more series with respect to a
certain characteristic.
• It is the most commonly and widely used average, e.g. in calculating average cost of
production, average cost of cultivation, average cost of yield per hectare
etc.
• It is used in calculating the standard deviation and the coefficient of variation.
• It is used in calculating the correlation coefficient and the regression coefficient.
• It is also used in testing of hypotheses and finding confidence limits.
Example on Ungrouped data:
Example-2

The average fuel efficiencies, in miles per gallon, of cars sold in the United States in the years 1999 to 2003
were 28.2, 28.3, 28.4, 28.5, 29.0 .

Find the sample mean of this set of data.

Solution:

The sample mean x̄ is the average of the five data values.

Thus, x̄ = (28.2 + 28.3 + 28.4 + 28.5 + 29.0)/5 = 142.4/5 = 28.48

Note from this example that whereas the sample mean is the average of all the data values, it need not itself
be one of them.
Examples on Grouped discrete and Continuous frequency distribution
(a)Find the arithmetic mean of the following frequency
distribution :
x: 1 2 3 4 5 6 7
f: 5 9 12 17 14 10 6

(b) Calculate the arithmetic mean of the “marks from the following
table:
Marks : 0-10 10-20 20-30 30-40 40-50 50-60
No. of students : 12 18 27 20 17 6
Solution:

(a) x : 1 2 3 4 5 6 7 Total
f : 5 9 12 17 14 10 6 73
fx : 5 18 36 68 70 60 42 299

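Therefore,
Mean = Σfx/N = 299/73 = 4.0958

(b) Marks : 0-10 10-20 20-30 30-40 40-50 50-60 Total
Mid-value x : 5 15 25 35 45 55
f : 12 18 27 20 17 6 100
fx : 60 270 675 700 765 330 2800

Therefore,
Mean = Σfx/N = 2800/100 = 28

Step-deviation method: when the classes have a common width h, the mean can also be
computed from an assumed mean A as
Arithmetic Mean = Xbar = A + h Σfd/N, where d = (x - A)/h.
For instance, with A = 28, h = 8, Σfd = -25 and N = 77:
Xbar = 28 + 8(-25)/77
= 28 – 200/77
= 28 – 2.597
= 25.403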
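As a check on the step-deviation formula, the sketch below applies both the direct method and the step-deviation method to the marks distribution of example (b). The assumed mean A = 25 is an arbitrary choice made for this illustration; any class mid-value would work.

```python
# Marks distribution from example (b)
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [12, 18, 27, 20, 17, 6]

mids = [(lo + hi) / 2 for lo, hi in classes]   # class mid-values
N = sum(freqs)                                 # total frequency

# Direct method: sum(f * x) / N
direct = sum(f * x for f, x in zip(freqs, mids)) / N

# Step-deviation method: A + h * sum(f * d) / N, with d = (x - A) / h
A = 25   # assumed mean (arbitrary choice for this sketch)
h = 10   # common class width
d = [(x - A) / h for x in mids]
step = A + h * sum(f * di for f, di in zip(freqs, d)) / N

print(direct, step)   # both methods should agree
```

The step-deviation method only shortens hand calculation; it gives the same mean as the direct method.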
Mathematical Properties of the Arithmetic Mean :
1. The sum of the deviations of the individual items from the arithmetic mean is
always zero, i.e. $\sum (X_i - \bar{X}) = 0$.

2. The sum of the squared deviations of the individual items from the arithmetic
mean is always minimum, i.e. $\sum (X_i - \bar{X})^2 \le \sum (X_i - A)^2$ for any value A.

3. The standard error of the A.M. is less than that of any other measure of central
tendency.
4. The arithmetic mean depends on change of both origin and scale
(i.e. if each value of a variable X is increased, decreased, multiplied or
divided by a constant value k, the arithmetic mean of the new series is likewise
increased, decreased, multiplied or divided by the same constant value k).
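Properties 1, 2 and 4 can be checked numerically. The sketch below reuses the fuel-efficiency figures from the ungrouped example; the constant k and the trial values around the mean are arbitrary choices for illustration.

```python
data = [28.2, 28.3, 28.4, 28.5, 29.0]
n = len(data)
x_bar = sum(data) / n

# Property 1: deviations from the mean sum to zero (up to rounding error)
sum_of_deviations = sum(x - x_bar for x in data)


def sum_sq(values, a):
    """Sum of squared deviations of the values from a point a."""
    return sum((x - a) ** 2 for x in values)


# Property 2: the sum of squares is smallest when taken about the mean
at_mean = sum_sq(data, x_bar)
elsewhere = [sum_sq(data, x_bar + delta) for delta in (-1.0, -0.1, 0.1, 1.0)]

# Property 4: adding or multiplying by k changes the mean in the same way
k = 5.0
mean_shifted = sum(x + k for x in data) / n
mean_scaled = sum(x * k for x in data) / n

print(sum_of_deviations, at_mean, min(elsewhere))
print(mean_shifted - x_bar, mean_scaled / x_bar)
```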
Uses of the weighted mean:
1.Construction of index numbers.
2.Comparison of results of two or more groups where number of items differs in
each group.
3.Computation of standardized death and birth rates.
4.When values of items are given in percentage or proportion.
Combined Arithmetic Mean
• Example 3. The average salary of male employees in a firm was Rs.520 and that of females was Rs.420. The mean salary of all the
employees was Rs.500. Find the percentage of male and female employees.
• Solution. Let n1 and n2 denote respectively the number of male and female employees in the firm, and x̄1 and x̄2 denote
respectively their average salary (in rupees). Let x̄ denote the average salary of all the workers in the firm.
• We are given that: x̄1 = 520, x̄2 = 420 and x̄ = 500

The combined mean is x̄ = (n1 x̄1 + n2 x̄2)/(n1 + n2), so

500 = (520 n1 + 420 n2)/(n1 + n2)
implies 500 (n1 + n2) = 520 n1 + 420 n2
implies (520 - 500) n1 = (500 - 420) n2
implies 20 n1 = 80 n2
implies n1/n2 = 4/1
Hence the percentage of male employees in the firm = (4/5) x 100 = 80%
The percentage of female employees in the firm = (1/5) x 100 = 20%
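The salary example can be verified with a short sketch. The helper name `combined_mean` is hypothetical; the group sizes 80 and 20 simply stand for the 4:1 ratio found above.

```python
def combined_mean(sizes, means):
    """Combined mean of several groups: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


# Share p of male employees solves 520*p + 420*(1 - p) = 500
male_share = (500 - 420) / (520 - 420)
female_share = 1 - male_share

# Check: 80 men and 20 women (ratio 4:1) reproduce the overall mean
overall = combined_mean([80, 20], [520, 420])

print(male_share, female_share, overall)
```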

X1bar=
= 50*X1bar, = 50*X2bar,…., = 50*X110bar
Examples Weighted Arithmetic Mean

• Example: Find the simple and weighted arithmetic mean of the first n natural numbers, the
weights being the corresponding numbers.
• Solution: The first n natural numbers are 1, 2, 3, ..., n.
We know that
1 + 2 + 3 + ... + n = n(n+1)/2
1x1 + 2x2 + ... + nxn = n(n+1)(2n+1)/6
Simple A.M. = (1 + 2 + 3 + ... + n)/n = (n + 1)/2
Weighted A.M. = (1x1 + 2x2 + 3x3 + ... + nxn)/(1 + 2 + 3 + ... + n) = [n(n+1)(2n+1)/6] * [2/(n(n+1))]
= (2n+1)/3,
since the sum of the weights is Σwi = 1 + 2 + 3 + ... + n = n(n+1)/2.
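The closed forms (n+1)/2 and (2n+1)/3 can be checked for a particular n. In the sketch below, n = 10 is an arbitrary choice and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


n = 10                                  # arbitrary choice for the check
numbers = list(range(1, n + 1))         # 1, 2, ..., n

simple = sum(numbers) / n               # compare with (n + 1) / 2
weighted = weighted_mean(numbers, numbers)  # compare with (2n + 1) / 3

print(simple, (n + 1) / 2)
print(weighted, (2 * n + 1) / 3)
```

Because larger numbers carry larger weights, the weighted mean lies above the simple mean.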
Example on Weighted Average:
• Now, if w1, w2, ... , wk are nonnegative numbers that sum to 1, then w1x1 + w2x2 +···+ wkxk is
said to be a weighted average of the values x1, x2, ... , xk, with wi being the weight of xi. This is the
normalized weighted arithmetic mean.

• For instance, suppose that k = 2.

• Now, if w1 = w2 = 1/2,

• then the weighted average w1x1 + w2x2 = 1/ 2 x1 + 1/ 2 x2 = (X1 + X2 )/2 is just the ordinary
average of x1 and x2.

• On the other hand, if w1 = 2/3 and w2 = 1/3,

• then the weighted average w1x1 + w2x2 = 2/ 3 x1 + 1 /3 x2 gives twice as much weight to x1 as
it does to x2.

• Note: If all weights are taken equal, then it reduces to simple arithmetic mean.


Continuous frequency distribution (Grouped frequency distribution) data:
If x1, x2, x3, ..., xn are the mid-points of the n class intervals with their corresponding
frequencies f1, f2, f3, ..., fn, then the geometric mean (GM) is defined as
$GM = \left( x_1^{f_1} \, x_2^{f_2} \, x_3^{f_3} \cdots x_n^{f_n} \right)^{1/N}$, where $N = \sum f_i$.
It is equivalent to computing GM = Antilog(log(GM)), i.e.
$GM = \operatorname{antilog}\left[ \frac{\sum f_i \log x_i}{N} \right]$
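The logarithmic form is the one normally used in practice, because the direct product can overflow. A minimal sketch, with a hypothetical function name and made-up mid-points and frequencies:

```python
import math


def geometric_mean_grouped(mids, freqs):
    """GM of a frequency distribution via logs: antilog(sum(f * log x) / N)."""
    n_total = sum(freqs)
    log_sum = sum(f * math.log10(x) for x, f in zip(mids, freqs))
    return 10 ** (log_sum / n_total)


# Hypothetical data: values 1, 10, 100 with frequencies 1, 2, 1
gm = geometric_mean_grouped([1, 10, 100], [1, 2, 1])
print(gm)
```

All mid-points must be positive, otherwise the logarithm is undefined; this matches the demerits listed for the G.M.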
Merits of Geometric mean:
• It is rigidly defined.
• It is based on all observations.
• It is capable of further mathematical treatment.
• It is not affected much by the fluctuations of sampling.
• Unlike AM, it is not affected much by the presence of
extreme values.
• It is very suitable for averaging ratios, rates and
percentages.
Demerits of Geometric mean:
• Its calculation is not as simple as that of the A.M. and it is not easy to
understand.
• The GM may not be an actual value of the series.
• It cannot be determined graphically or by inspection.
• It cannot be used when the values are negative because if any one
observation is negative, G.M. becomes meaningless or doesn’t
exist.
• It cannot be used when the values are zero, because if any one
observation is zero, G. M. becomes zero.
• It cannot be calculated for open-end classes.
Uses of G. M:
1.It is used in the construction of index numbers.
2.It is also helpful in finding out the compound rates of change such as
the rate of growth of population in a country, average rates of
change, average rate of interest etc..
3.It is suitable where the data are expressed in terms of rates, ratios
and percentage.
4. It is most suitable when the smaller observations are to be given
more weight or importance.



Merits of H.M.:
• It is rigidly defined.
• It is based on all items in the series.
• It is amenable to further algebraic treatment.
• It is not affected much by the fluctuations of sampling.
• Unlike AM, it is not affected much by the presence of
extreme values.
• It is the most suitable average when it is desired to give
greater weight to smaller observations and less weight to
the larger ones.
Demerits of H.M:
• It is not easily understood and it is difficult to compute.
• It is only a summary figure and may not be the actual item in the series.
• Its calculation is not possible if one or more items are
missing or zero.
• Its calculation is not possible if the series contains both negative and positive
observations.
• It gives greater importance to small items and is therefore useful only when small
items have to be given greater weight.
• It cannot be determined graphically or by inspection.
• It cannot be calculated for open-end classes.

Uses of H. M.:
• H.M. is of greater significance in cases where prices are expressed in quantities
(units per price). H.M. is also used in averaging time, speed, distance, quantity etc.,
for example to find the average speed over a journey, the average time taken to
travel, the average distance travelled etc.
Example. You can take a trip which entails travelling 900 km. by train at an average speed of 60
km. per hour, 3000 km. by boat at an average of 25 km. per hour, 400 km. by plane at 350 km. per
hour and finally 15 km. by taxi at 25 km. per hour. What is the average speed for the entire
distance?
Solution. Since different distances are covered with varying speeds, the required average speed
for the entire distance is given by the weighted harmonic mean of the speeds (in km.p.h.), the
weights being the corresponding distances covered (in kms.).
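The weighted harmonic mean described in this solution amounts to total distance divided by total time. A minimal sketch, with a hypothetical helper name:

```python
def weighted_harmonic_mean(values, weights):
    """Weighted H.M.: sum(w_i) / sum(w_i / x_i)."""
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


speeds = [60, 25, 350, 25]        # km per hour: train, boat, plane, taxi
distances = [900, 3000, 400, 15]  # km covered at each speed (the weights)

avg_speed = weighted_harmonic_mean(speeds, distances)
total_time = sum(d / s for d, s in zip(distances, speeds))  # hours

print(round(avg_speed, 2), round(total_time, 2))
```

The result is about 31.56 km per hour for the 4315 km trip, well below the simple average of the four speeds, because most of the time is spent on the slow boat leg.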
Positional Averages:
These averages are based on the position of the observations in
arranged (either ascending or descending order) series. Ex: Median,
Mode, quartile, deciles, percentiles.
1) Median:
• Median is the middle most value of the series of the data when the
observations are arranged in ascending or descending order.
• The median is that value of the variate which divides the group into
two equal parts, one part comprising all values greater than middle
value, and the other all values less than middle value.

ii. For frequency distribution data:
(a) Discrete frequency distribution (Ungrouped frequency distribution) data:

(b) Continuous frequency distribution (Grouped frequency distribution) data:
Examples of Discrete Frequency distribution:
Example: Obtain the median for the following frequency distribution:
x: 1 2 3 4 5 6 7 8 9
f: 8 10 11 16 20 25 15 9 6
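Solution:

x : 1 2 3 4 5 6 7 8 9
f : 8 10 11 16 20 25 15 9 6
c.f. : 8 18 29 45 65 90 105 114 120

Here N = 120, so N/2 = 60. The cumulative frequency (c.f.) just greater than N/2 is 65, and the value of x
corresponding to 65 is 5. Therefore, the median is 5.

Note: a raw series such as 1, 1, 1, 2, 2, 2, 3, 3, 3 corresponds to the frequency distribution
X: 1 2 3
f: 3 3 3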
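The rule used in this solution (take the value whose cumulative frequency is the first to exceed N/2) can be sketched as below. The function name is hypothetical, and the code follows the rule exactly as stated in the text, without special handling for a cumulative frequency equal to N/2.

```python
from itertools import accumulate


def median_discrete(xs, fs):
    """Value of x whose cumulative frequency is the first to exceed N/2."""
    cumulative = list(accumulate(fs))
    half = cumulative[-1] / 2          # N / 2
    for x, cf in zip(xs, cumulative):
        if cf > half:
            return x
    return None                        # unreachable when frequencies are positive


xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
fs = [8, 10, 11, 16, 20, 25, 15, 9, 6]
print(list(accumulate(fs)))            # cumulative frequencies
print(median_discrete(xs, fs))         # median of the example distribution
```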
Continuous frequency distribution (Grouped frequency distribution) data:

• In the case of a continuous frequency distribution, the class corresponding to the
cumulative frequency just greater than N/2 is called the median class and the value of the
median is obtained by the following formula:
• $\text{Median} = l + \frac{N/2 - cf}{f} \times h$ ……(1)
• where l : is the lower limit of the median class,
f : is the frequency of the median class,
h : is the magnitude (length) of the median class,
cf : is the cumulative frequency of the class preceding the median class, and
N : is the total frequency Σfi.
Example:
Find the median wage of the following distribution:
Wages (in Rs.) : 20-30 30-40 40-50 50-60 60-70
No. of labourers : 3 5 20 10 5
Wages (in Rs.)   No. of labourers   Cumulative frequency
20-30            3                  3
30-40            5                  8
40-50            20                 28
50-60            10                 38
60-70            5                  43
Here N/2 = 43/2 = 21.5. The cumulative frequency just greater than 21.5 is 28 and
the corresponding class is 40-50. Thus the median class is 40-50.
Hence using (1), we get
• Median = l + [(N/2 - cf)/f] x h = 40 + [(21.5 - 8)/20] x 10 = 40 + 6.75 = 46.75
Thus the median wage of the labourers is Rs. 46.75
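Formula (1) can be sketched as a small function that first locates the median class and then interpolates within it. The function name is hypothetical; classes are assumed to be given as (lower, upper) pairs in ascending order with no gaps.

```python
def median_grouped(classes, freqs):
    """Median of a continuous distribution: l + (N/2 - cf) / f * h."""
    half = sum(freqs) / 2              # N / 2
    cf_before = 0                      # cumulative frequency before the class
    for (lower, upper), f in zip(classes, freqs):
        if cf_before + f > half:       # this is the median class
            return lower + (half - cf_before) / f * (upper - lower)
        cf_before += f
    return None                        # unreachable when frequencies are positive


wage_classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
labourers = [3, 5, 20, 10, 5]
print(median_grouped(wage_classes, labourers))
```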
Graphic method for Location of median:
• Median can be located with the help of the cumulative frequency curve or ‘ogive’ . The procedure for locating
median in a grouped data is as follows:
• Step1: The class boundaries, where there are no gaps between consecutive classes, i.e. exclusive class are
represented on the horizontal axis (x-axis).
• Step2: The cumulative frequency corresponding to different classes is plotted on the vertical axis (y-axis) against
the upper limit of the class interval (or against the variate value in the case of a discrete series.)
• Step3: The curve obtained on joining the points by means of freehand drawing is called the ‘ogive’ . The ogive so
drawn may be either a (i) less than ogive or a (ii) more than ogive.
• Step4: The value of N/2 is marked on the y-axis, where N is the total frequency.
• Step5: A horizontal straight line is drawn from the point N/2 on the y-axis parallel to x-axis to meet the ogive.
• Step6: A vertical straight line is drawn from the point of intersection perpendicular to the horizontal axis.
• Step7: The point at which this vertical line meets the horizontal axis gives the median value.
Open-end class intervals:
Income                 No. of families
< 10,000               100
10,000-50,000          350
50,000-1,00,000        400

10,00,000 and above    50


Graphic method for location of median
Merits of Median:
• It is easily understood and is easy to calculate.
• It is rigidly defined.
• It can be located merely by inspection.
• It is not at all affected by extreme values.
• It can be calculated for distributions with open-end classes.
• Median is the only average to be used to study qualitative data where the
items are scored or ranked.
Demerits of Median:
• In case of even number of observations median cannot be determined
exactly. We merely estimate it by taking the mean of two middle terms.
• It is not based on all the observations.
• It is not amenable to algebraic treatment.
• As compared with the mean, it is affected more by fluctuations of sampling.
• If importance needs to be given for small or big item in the series, then
median is not suitable average.
Uses of Median
• Median is the only average to be used while dealing with qualitative data
which cannot be measured quantitatively but can be arranged in ascending or
descending order.
• Ex: To find the average honesty, average intelligence, average beauty etc.
among a group of people.

• It is used for determining the typical value in problems concerning wages and
distribution of wealth.
• Median is useful in distribution where open-end classes are given.
Median for incomplete frequency distribution data:

Example . An incomplete frequency distribution is given as follows:


Variable   Frequency
10-20      12
20-30      30
30-40      ? (f1)
40-50      65 (= f)
50-60      ? (f2)
60-70      25
70-80      18
Total      229
Given that the median value is 46, determine the missing frequencies
using the median formula.
Median for incomplete frequency distribution

Solution: Let the frequency of the class 30-40 be f1 and that of
50-60 be f2.
Then f1 + f2 = 229 - (12 + 30 + 65 + 25 + 18) = 79.
Since the median is given to be 46, the class 40-50 is the median class.
Hence using the median formula we get

46 = 40 + [114.5 - (12 + 30 + f1)] x 10 / 65

46 - 40 = (72.5 - f1) x 10 / 65
This implies f1 = 72.5 - 39 = 33.5 = 34 approximately

But f1 + f2 = 79 implies f2 = 79 - 34 = 45
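The same steps can be followed in code by rearranging the median formula for f1. The variable names below are chosen for this sketch only; f1 is rounded half-up to a whole frequency, as in the solution.

```python
import math

total = 229                      # total frequency N
known = [12, 30, 65, 25, 18]     # frequencies that are given
median, lower, width, f_median = 46, 40, 10, 65
cf_known = 12 + 30               # classes 10-20 and 20-30 precede class 30-40

# median = lower + (N/2 - (cf_known + f1)) / f_median * width, solved for f1
f1_exact = total / 2 - cf_known - (median - lower) * f_median / width
f1 = math.floor(f1_exact + 0.5)  # round half up to a whole number

f2 = (total - sum(known)) - f1   # because f1 + f2 = 79

print(f1_exact, f1, f2)
```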


Mode:
• The mode is the value in a distribution, which occur most frequently or
repeatedly.
• It is an actual value, which has the highest concentration of items in and
around it or predominant in the series.
• In case of discrete frequency distribution mode is the value of x
corresponding to maximum frequency.
Computation of mode:
For raw data/individual-series/ungrouped data:
• Mode is the value of the variable (observation) which occurs maximum
number of times.
For frequency distribution data :
• Discrete frequency distribution (Ungrouped frequency distribution) data:
• In case of discrete frequency distribution mode is the value of x variable
corresponding to maximum frequency.
• Continuous frequency distribution (Grouped frequency distribution) data:
Graphic method for location of mode:
Steps:
• Draw a histogram of the given distribution.
• Join the top right corner of the highest rectangle (modal class rectangle) by a
straight line to the top right corner of the preceding rectangle. Similarly the top
left corner of the highest rectangle is joined to the top left corner of the
rectangle on the right.
• From the point of intersection of these two diagonal lines, draw a
perpendicular to the x -axis.
• The value read on the x-axis at the foot of this perpendicular gives the mode.
Fig 6 .3: Graphic method for Location of mode
Merits of Mode:
• It is easy to calculate and in some cases it can be located merely by
inspection.
• Mode is not at all affected by extreme values.
• It can be calculated for open-end classes.
• It is usually an actual value of an important part of the series.
• Mode can be conveniently located even if the frequency
distribution has class intervals of unequal magnitude provided
the modal class and the classes preceding and succeeding it are
of the same magnitude.
Demerits of mode:
• Mode is ill defined. It is not always possible to find a
clearly defined mode.
• It is not based on all observations.
• It is not capable of further mathematical treatment.
• As compared with mean, mode is affected to a greater
extent by fluctuations of sampling.
• It is unsuitable in cases where relative importance of items
has to be considered.
Remarks:
In some cases, we may come across distributions with two modes. Such
distributions are called bi-modal. If a distribution has more than two modes, it is
said to be multimodal.

Uses of Mode:
Mode is most commonly used in business forecasting, e.g. in manufacturing
units and the garment industry, to find the ideal size. Ex: in manufacturing
readymade garments, to find the average size of track suits,
average size of dresses, average size of shoes etc.
Example:
Example . Find the mode for the following distribution:
Class-interval: 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80
Frequency : 5 8 7 12 28 20 10 10
Solution. Here maximum frequency is 28. Thus the class 40-50 is the modal
class.
Using the following formula, the value of mode is given by
Mode = l + [(f1 - f0) x h] / (2f1 - f0 - f2)
where f1 : frequency of the modal class
f0 : frequency of the class preceding the modal class
f2 : frequency of the class succeeding the modal class
l : lower limit of the modal class
h : magnitude (length) of the modal class

Mode = 40 + {(28 - 12) x 10}/(2x28 - 12 - 20)
= 40 + 160/24
= 40 + 6.667 = 46.67
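The mode formula can be sketched as follows. The function name is hypothetical, and treating a missing neighbouring class as frequency 0 (when the modal class is first or last) is an assumption of this sketch, not something stated in the slides.

```python
def mode_grouped(classes, freqs):
    """Mode of a continuous distribution: l + (f1 - f0) * h / (2*f1 - f0 - f2)."""
    i = max(range(len(freqs)), key=freqs.__getitem__)   # index of modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                   # assumption: 0 if absent
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0      # assumption: 0 if absent
    lower, upper = classes[i]
    return lower + (f1 - f0) * (upper - lower) / (2 * f1 - f0 - f2)


intervals = [(0, 10), (10, 20), (20, 30), (30, 40),
             (40, 50), (50, 60), (60, 70), (70, 80)]
frequencies = [5, 8, 7, 12, 28, 20, 10, 10]
print(round(mode_grouped(intervals, frequencies), 2))
```

If two classes share the maximum frequency, this sketch simply takes the first one; such a distribution has no clearly defined mode.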
Partition Values:
Partition values are the values of the variable which divide the
total number of observations into a number of equal parts when
the observations are arranged in order of magnitude.

Ex: Median, Quartiles, Deciles, Percentiles.

• Median: The median is a single value which divides the whole
series into two equal parts.
• Quartiles: Quartiles are three in number and divide the whole
series into four equal parts.
• They are represented by Q1, Q2, Q3 respectively.



Some Important relation and results:
1. Relation between A.M., G.M. & H.M. (for positive values): A.M. ≥ G.M. ≥ H.M.
2. For two values, (G.M.)^2 = A.M. x H.M., i.e. the G.M. of the A.M. & H.M. is equal to the G.M. of the two values.
3. A.M. of first “n” natural numbers 1,2,3,....n is (n+1)/2
4. Weighted A.M. of first “n” natural numbers 1,2,3,....n with
corresponding weights 1,2,3,...n is (2n+1)/3, since

Sum of the weighted observations = 1^2 + 2^2 + 3^2 + ….. + n^2 = n(n+1)(2n+1)/6

Total weight = 1 + 2 + 3 + … + n = n(n+1)/2
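Relations 1 and 2 can be checked with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The data values are arbitrary positive numbers picked for this sketch.

```python
from statistics import mean, geometric_mean, harmonic_mean

# Relation 1: A.M. >= G.M. >= H.M. for positive values
data = [2, 4, 8]
am, gm, hm = mean(data), geometric_mean(data), harmonic_mean(data)
print(am, gm, hm)

# Relation 2: for two values, G.M. squared equals A.M. times H.M.
pair = [4, 9]
am2, gm2, hm2 = mean(pair), geometric_mean(pair), harmonic_mean(pair)
print(gm2 ** 2, am2 * hm2)
```

The three means coincide only when all observations are equal.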
Quiz:
1. Find the median of the data: 5, 7, 4, 9, 5, 4, 4, 3 (in ascending order: 3, 4, 4, 4, 5, 5, 7, 9)
A. 5.125 B. 14 C. 4.5 D. 4
2. Find the mean of the following data: 12, 10,15, 10, 16, 12,10,15, 15, 13
A. 13 B. 12.5 C. 15 D. 12.8
3. Find the mode of the following data: 20, 14, 12, 14, 26, 16, 18, 19, 14
A. 14 B. 17 C. 26 D. 16
4. Find the mean of the following data: 0, 5, 2, 4, 0, 5, 0, 3, 0, 5, 0, 3
A. 0 B. 2.25 C. 2.5 D. 3.86
5. Find the median of the following data: 25, 20, 30, 30, 20, 24, 24, 30, 31
A. 20 B. 26 C. 25 D. 30

6. Find the median of the following data: 1, 6, 12, 19, 5, 0, 6


A. 6 B. 7 C. 19 D. 3.5
7. Find the mean of the following data: 20, 24, 24, 24, 22, 22, 24, 22, 23, 25
A. 23.5 B. 23 C. 24 D.25
8. Find the mode of the following data: 5, 0, 5, 4, 12, 2, 14 5
A. 4 B. 5 C. 6 D. 0
9. Find the mean of the following data: 0, 5, 30, 25, 16, 18, 19, 26, 0, 20, 28
A. 0 B. 18 C. 19 D. 17
10. Find the median of the following data: 9, 6, 12, 5, 17, 3, 9, 5, 10, 2, 8, 7
A. 6.5 B. 7.5 C. 6 D. 7.75


Answers:



• 1. C 2. D 3. A 4. B 5. C 6. A 7. B 8. B 9. D 10. B 11. B
12. B 13. B 14.C 15. D 16. A 17. A 18. D 19. B 20. B
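The answers to the quiz questions shown above can be cross-checked with the standard `statistics` module. This is only a verification sketch; question 8 is left out because its data list is not cleanly legible in the text.

```python
from statistics import mean, median, mode

# Computed values for the quiz questions (question number -> result)
quiz = {
    1: median([5, 7, 4, 9, 5, 4, 4, 3]),
    2: mean([12, 10, 15, 10, 16, 12, 10, 15, 15, 13]),
    3: mode([20, 14, 12, 14, 26, 16, 18, 19, 14]),
    4: mean([0, 5, 2, 4, 0, 5, 0, 3, 0, 5, 0, 3]),
    5: median([25, 20, 30, 30, 20, 24, 24, 30, 31]),
    6: median([1, 6, 12, 19, 5, 0, 6]),
    7: mean([20, 24, 24, 24, 22, 22, 24, 22, 23, 25]),
    9: mean([0, 5, 30, 25, 16, 18, 19, 26, 0, 20, 28]),
    10: median([9, 6, 12, 5, 17, 3, 9, 5, 10, 2, 8, 7]),
}

for number, value in quiz.items():
    print(number, value)   # compare each value with the listed option
```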
The End

Thank You
• Coefficient of variation = CV = (standard deviation / mean) x 100
• (i)
• Mean= ?
• Find CV
• (ii). Data:
• CV =
• Find mean?
• (iii). Data: Mean = ?
• CV = ?
• Find
