
Measures of Central Tendency
Introduction to central tendency
• Frequency distributions and their corresponding graphical representations fail to identify
three major properties that describe a set of quantitative data:
• (i) The numerical value of an observation (also called central value) around which most
numerical values of other observations in the data set show a tendency to cluster or group,
called the central tendency.
• (ii) The extent to which numerical values are dispersed around the central value, called
variation.
• (iii) The extent of departure of numerical values from a symmetrical (normal) distribution
around the central value, called skewness.
• Three types of descriptive (summary) measures may be used to extract and summarize the
major features of a data set by applying statistical methods: (i) measures of central tendency,
(ii) measures of dispersion or variation, and (iii) measures of symmetry (skewness).
Central Tendency
• The observations (numerical values) in most data sets have a tendency to group or
cluster around the value of an observation located somewhere in the middle of all
observations.
• It is necessary to identify or calculate this typical central value (also called
average) to describe or project the characteristic of the entire data set.
• Population parameter and Sample statistic
• If descriptive summary measures are computed from sample data, they are called sample
statistics, or simply statistics; if they are computed from data on the whole population, they
are called population parameters, or simply parameters. For example, the population mean is
represented by the Greek letter µ (read: mu) and the sample mean by the Roman letter x̄
(read: x bar).
How to represent
Measures of central tendency
• Arithmetic mean (Average)
• Simple and Weighted Average
• Geometric mean and Harmonic mean
• Average of position
• Median
• Mode
• Quartiles
• Deciles
• Percentiles
Mathematical averages
• When calculating mathematical averages of a data set, note the nature of the available
data: ungrouped (unclassified) or grouped (classified).
• Arithmetic mean of ungrouped data:
• Direct method: A.M. is calculated by adding the values of all
observations and dividing the total by the number of observations. Thus if x1,
x2, . . ., xN represent the values of N observations, then the A.M. for a population
of N observations is:
$\mu = \frac{x_1 + x_2 + \dots + x_N}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i$
• Short-cut method for ungrouped data: deviations $d_i = x_i - A$ are taken from an
assumed mean A, so that $\mu = A + \frac{1}{N}\sum_{i=1}^{N} d_i$.
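The two methods for ungrouped data can be sketched in a few lines of Python. This is only an illustration: the observations and the assumed mean of 20 are hypothetical, and the function names are not taken from the slides. Both functions should return the same value, since the short-cut method only shifts the origin before averaging.

```python
def direct_mean(values):
    """Arithmetic mean by the direct method: sum of values / number of values."""
    return sum(values) / len(values)


def assumed_mean_method(values, assumed):
    """Arithmetic mean by the short-cut method, using deviations from `assumed`."""
    deviations = [x - assumed for x in values]
    return assumed + sum(deviations) / len(values)


# Hypothetical observations, chosen only for illustration.
values = [12, 18, 25, 30, 40]

print(direct_mean(values))              # direct method
print(assumed_mean_method(values, 20))  # short-cut method with assumed mean A = 20
```

Any assumed mean gives the same answer; a value near the middle of the data simply keeps the deviations small when working by hand.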
Problems in Mean (Grouped and ungrouped)
1. In a survey of 5 cement companies, the profit (in ₹ lakh) earned during a year was 15, 20, 10, 35, and 32. Find the
arithmetic mean of the profit earned.

2. If A, B, C, and D are four chemicals costing ₹15, ₹12, ₹8 and ₹5 per 100 g, respectively, and are contained in a
given compound in the ratio of 1, 2, 3, and 4 parts, respectively, then what should be the price of the resultant
compound?
3. The number of new orders received by a company over the last 25 working days were recorded as follows: 3,
0, 1, 4, 4, 4, 2, 5, 3, 6, 4, 5, 1, 4, 2, 3, 0, 2, 0, 5, 4, 2, 3, 3, 1. Calculate the arithmetic mean for the number of
orders received over all similar working days. Can you represent it in frequency table format?
4. From the following information on the number of defective components in 1000 boxes;

Calculate the arithmetic mean of defective components for the whole of the production line.
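For problem 3, one possible way to build the frequency table and compute the mean from it is sketched below. The data are the 25 order counts given in the problem; the variable names and the table layout are illustrative choices, not the layout used in the original slides.

```python
from collections import Counter

# Orders received on each of the 25 working days (from problem 3).
orders = [3, 0, 1, 4, 4, 4, 2, 5, 3, 6, 4, 5, 1,
          4, 2, 3, 0, 2, 0, 5, 4, 2, 3, 3, 1]

# Frequency table: number of days (f) on which x orders were received.
frequency = Counter(orders)
n = sum(frequency.values())                       # total number of days
total = sum(x * f for x, f in frequency.items())  # sum of f * x
mean_orders = total / n

print("orders  days")
for x in sorted(frequency):
    print(f"{x:>6}  {frequency[x]:>4}")
print("mean =", mean_orders)
```

Computing the mean from the table (sum of f·x divided by sum of f) gives the same result as averaging the 25 raw values directly.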
AM of grouped or classified data
• Arithmetic mean for grouped data can also be calculated by applying
any of the following methods: (i) Direct method, and (ii) Indirect or
Step-deviation method.
• For calculating arithmetic mean for a grouped data set, the following
assumptions are made: (i) The class intervals must be closed. (ii) The
width of each class interval should be equal. (iii) The values of the
observations in each class interval must be uniformly distributed
between its lower and upper limits. (iv) The mid-value of each class
interval must represent the average of all values in that class, that is, it is
assumed that all values of observations are evenly distributed between
the lower and upper class limits.
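• Direct method: $\bar{x} = \frac{1}{n}\sum_{i} f_i m_i$, where $m_i$ is the mid-value of the ith class, $f_i$ is its
frequency, and $n = \sum_{i} f_i$.
• Step-deviation (short-cut) method: $\bar{x} = A + \frac{\sum_{i} f_i d_i}{n} \times h$, where A is an assumed mean,
h is the class width, and $d_i = (m_i - A)/h$.

This method is very useful in those cases where the mid-values ($m_i$) and/or frequencies ($f_i$) have three or more
digits.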
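A minimal sketch of both grouped-data methods follows. The class intervals and frequencies are hypothetical, and the assumed mean A = 25 and width h = 10 are chosen only for convenience; each class is represented by its mid-value, in line with the assumptions listed above.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 10), (40, 50, 5)]


def grouped_mean_direct(classes):
    """Direct method: sum of (frequency * mid-value) divided by total frequency."""
    n = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


def grouped_mean_step_deviation(classes, assumed, width):
    """Step-deviation method: work with d = (mid-value - assumed) / width."""
    n = sum(f for _, _, f in classes)
    sum_fd = sum(f * (((lo + hi) / 2 - assumed) / width) for lo, hi, f in classes)
    return assumed + (sum_fd / n) * width


print(grouped_mean_direct(classes))
print(grouped_mean_step_deviation(classes, assumed=25, width=10))
```

With these figures the step deviations are the small integers -2, -1, 0, 1, 2, which is what makes the method convenient for hand calculation.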
For you to solve (HW)
Problems for you to solve
Weighted arithmetic mean
• The arithmetic mean, as discussed earlier, gives equal importance (or weight) to
each observation in the data set. However, there are situations in which values
of individual observations in the data set are not of equal importance. If
values occur with different frequencies, then computing A.M. of values (as
opposed to the A.M. of observations) may not be a true representative of the
data set characteristic and thus may be misleading. Under these circumstances,
we may attach to each observation value a ‘weight’ w1, w2, . . ., wN as an
indicator of its importance (perhaps because of its size) and compute a
weighted mean or average, denoted by $\bar{x}_w$, as:
$\bar{x}_w = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}$
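As an illustration, the weighted mean can be applied to the earlier problem of four chemicals priced at 15, 12, 8 and 5 per 100 g and mixed in the ratio 1 : 2 : 3 : 4, treating the ratio parts as weights. The function name is an illustrative choice.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum of (weight * value) divided by sum of weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


prices = [15, 12, 8, 5]  # price per 100 g of chemicals A, B, C, D
parts = [1, 2, 3, 4]     # parts of each chemical in the compound (weights)

compound_price = weighted_mean(prices, parts)
print(compound_price)    # price per 100 g of the resultant compound
```

The unweighted mean of the four prices would be 10, which overstates the price because the cheapest chemicals make up most of the compound.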
Problem
Median
• Median may be defined as the middle value in the data set when its elements are
arranged in a sequential order, that is, in either ascending or descending order of
magnitude. It is called a middle value in an ordered sequence of data in the sense
that half of the observations are smaller and half are larger than this value. The
median is thus a measure of the location or centrality of the observations. The
median can be calculated for both ungrouped and grouped data sets.
• Ungrouped data: the data is first arranged in either ascending or descending
order of magnitude. (i) If the number of observations (n) is odd, the median (Med)
is the numerical value at position (n + 1)/2 in the ordered data. That is,
Med = value of the $\left(\frac{n+1}{2}\right)$th observation.
• (ii) If the number of observations (n) is even, the median is defined as
the arithmetic mean of the numerical values of the (n/2)th and (n/2 + 1)th observations
in the data array. That is, Med = $\frac{1}{2}\left[x_{(n/2)} + x_{(n/2+1)}\right]$, where $x_{(k)}$ denotes the kth ordered observation.
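The odd and even cases can be sketched as follows. The first data set reuses the five profit figures from the cement-company problem; the second, with an even number of values, is hypothetical.

```python
def median(values):
    """Median of ungrouped data: middle value of the ordered observations."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th ordered value (index mid when counting from 0).
        return ordered[mid]
    # Even n: mean of the (n / 2)th and (n / 2 + 1)th ordered values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([15, 20, 10, 35, 32]))  # odd number of observations
print(median([7, 3, 9, 5]))          # even number of observations (hypothetical data)
```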
Median for grouped data
• To find the median for grouped data, first identify the class interval that contains the
median, that is, the (n/2)th observation of the data set. To do so, accumulate the class
frequencies until reaching the first class whose cumulative frequency is equal to or greater
than n/2. The value of the median within that class is found by interpolation, on the assumption
that the observation values are evenly spaced over the entire class interval. The following
formula is used to determine the median of grouped data:
$\text{Med} = l + \frac{(n/2) - cf}{f} \times h$
where l is the lower limit of the median class, cf is the cumulative frequency of the class
preceding it, f is the frequency of the median class, h is its width, and n is the total number
of observations.
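A short sketch of this interpolation is given below, under the assumption that classes are continuous and listed in ascending order. The class intervals and frequencies are hypothetical, and the function name is an illustrative choice.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 10), (40, 50, 5)]


def grouped_median(classes):
    """Median of grouped data by linear interpolation within the median class."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("classes must contain at least one observation")
    cumulative = 0  # cumulative frequency of the classes before the current one
    for lower, upper, freq in classes:
        if cumulative + freq >= n / 2:
            # First class whose cumulative frequency reaches n / 2: the median class.
            return lower + ((n / 2 - cumulative) / freq) * (upper - lower)
        cumulative += freq


print(grouped_median(classes))
```

Here n = 40, so the 20th observation falls in the class 20 to 30, whose cumulative frequency (25) is the first to reach n/2.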
Mode
• The mode is that value of an observation which occurs most frequently in the data
set, that is, the point (or class mark) with the highest frequency. The concept of
mode is of great use to large-scale manufacturers of consumable items such as
ready-made garments and shoes. In all such cases it is important to
know the size that fits most persons rather than the ‘mean’ size.
• Calculation of mode: it is always preferable to calculate the mode from grouped data.
Table 3.27, for example, shows the sales per day of an item over a 20-day period.
The mode of this data is 71, since this value occurs more frequently (four times)
than any other value. However, it fails to reveal the fact that most of the values
are under 70.
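• In the case of grouped data, the following formula is used for
calculating the mode:
$\text{Mode} = l + \frac{f_m - f_{m-1}}{2f_m - f_{m-1} - f_{m+1}} \times h$
where l is the lower limit of the modal class (the class with the highest frequency), $f_m$ is
its frequency, $f_{m-1}$ and $f_{m+1}$ are the frequencies of the classes preceding and
following it, and h is its width.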
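A minimal sketch of the grouped-data mode, assuming the commonly used interpolation within the modal class based on the frequencies of the two neighbouring classes. The data are hypothetical; treating a missing neighbour as frequency 0 and taking the first class in case of a tie are assumptions of this sketch.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 10), (40, 50, 5)]


def grouped_mode(classes):
    """Mode of grouped data by interpolation within the modal class."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))  # modal class: first class with the highest frequency
    lower, upper, f_m = classes[m]
    # Assumption: a missing neighbouring class is treated as having frequency 0.
    f_prev = freqs[m - 1] if m > 0 else 0
    f_next = freqs[m + 1] if m < len(freqs) - 1 else 0
    denominator = 2 * f_m - f_prev - f_next
    if denominator == 0:
        raise ValueError("mode is ill-defined: neighbouring frequencies equal the modal one")
    return lower + (f_m - f_prev) / denominator * (upper - lower)


print(grouped_mode(classes))
```

The mode is pulled towards the neighbour with the larger frequency; here the following class (10) is larger than the preceding one (8), so the result lies above the mid-point of the class 20 to 30.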
HW for you to solve
Partition Values: Quartiles, Deciles, and Percentiles
• To have more knowledge about the data set, we may decompose it into more parts of equal size. The
measures of central tendency which are used for dividing the data into several equal parts are called partition
values.
• Quartiles: the values of observations in a data set, when arranged in an ordered sequence, can be divided into
four equal parts, or quarters, using three quartiles, namely Q1, Q2, and Q3. The first quartile Q1 divides a
distribution in such a way that 25 per cent (= n/4) of observations have a value less than Q1 and 75 per cent (=
3n/4) have a value more than Q1, i.e. Q1 is the median of the ordered values that are below the median.
• The second quartile Q2 has the same number of observations above and below it. It is therefore the same as
the median value. The quartile Q3 divides the data set in such a way that 75 per cent of the observations have a
value less than Q3 and 25 per cent have a value more than Q3, i.e. Q3 is the median of the ordered values that
are above the median.
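• Deciles: the values of observations in a data set, when arranged in an
ordered sequence, can be divided into ten equal parts using nine
deciles, Di (i = 1, 2, . . ., 9). The generalized formula for calculating
deciles in the case of grouped data is:
$D_i = l + \frac{i(n/10) - cf}{f} \times h, \quad i = 1, 2, \dots, 9$
where l is the lower limit of the class containing the ith decile, cf is the cumulative
frequency of the preceding class, f is the frequency of the decile class, h is its width,
and n is the total number of observations.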
• Percentiles: the values of observations in a data set, when arranged in an ordered sequence, can be
divided into a hundred equal parts using ninety-nine percentiles, Pi (i = 1, 2, . . ., 99). In general, the
ith percentile is a number that has i% of the data values at or below it and (100 – i)% of the data
values at or above it. The lower quartile (Q1), median and upper quartile (Q3) are also the 25th
percentile, 50th percentile and 75th percentile, respectively. For example, if you are told that you
scored at the 90th percentile in a test (like the CAT), it indicates that 90% of the scores were at or below
your score, while 10% were at or above your score. The generalized formula for calculating
percentiles in the case of grouped data is:
$P_i = l + \frac{i(n/100) - cf}{f} \times h, \quad i = 1, 2, \dots, 99$
where l is the lower limit of the class containing the ith percentile, cf is the cumulative
frequency of the preceding class, f is the frequency of the percentile class, h is its width,
and n is the total number of observations.
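Quartiles, deciles and percentiles of grouped data differ only in the number of equal parts (4, 10 or 100), so one interpolation routine can serve all three. The sketch below assumes continuous classes in ascending order; the data and the function name `grouped_partition_value` are hypothetical.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 10), (40, 50, 5)]


def grouped_partition_value(classes, i, parts):
    """i-th partition value when the data are divided into `parts` equal parts.

    parts = 4 gives quartiles, parts = 10 deciles, parts = 100 percentiles.
    """
    if not 0 < i < parts:
        raise ValueError("i must satisfy 0 < i < parts")
    n = sum(f for _, _, f in classes)
    target = i * n / parts  # position of the required observation
    cumulative = 0          # cumulative frequency of the classes before the current one
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            # Interpolate linearly within the class that contains the target position.
            return lower + ((target - cumulative) / freq) * (upper - lower)
        cumulative += freq
    raise ValueError("classes must contain at least one observation")


print(grouped_partition_value(classes, 1, 4))     # first quartile Q1
print(grouped_partition_value(classes, 9, 10))    # ninth decile D9
print(grouped_partition_value(classes, 25, 100))  # 25th percentile, same as Q1
```

Calling the function with i = 2 and parts = 4 reproduces the grouped median, which is consistent with Q2 being the same as the median.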
Geometric Mean
• In many business and economics problems, we deal with quantities (variables) that
change over a period of time. In such cases the aim is to know an average
percentage change rather than simple average value to represent the average growth
or declining rate in the variable value over a period of time. Thus we need to
calculate another measure of central tendency called geometric mean (G.M.).
• The value of G.M. is not much affected by extreme observations and is computed
by taking all the observations into account.
• It is useful for averaging ratios and percentages as well as for determining rates of
increase and decrease.
• In the calculation of G.M. more weight is given to smaller values and less weight to
higher values. For example, it is useful in the study of price fluctuations where the
lower limit can touch zero whereas the upper limit may go up to any number.
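As an illustration of averaging percentage changes, the sketch below computes the G.M. of n positive values as the nth root of their product (evaluated through logarithms). The growth figures are hypothetical: a rise of 25% followed by a fall of 20%, expressed as growth factors.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values: nth root of their product."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean requires a non-empty list of positive values")
    # Averaging the logarithms is equivalent to taking the nth root of the product.
    return math.exp(math.fsum(math.log(v) for v in values) / len(values))


# Hypothetical growth factors: +25% in the first period, -20% in the second.
growth_factors = [1.25, 0.80]

average_factor = geometric_mean(growth_factors)
average_growth_rate = (average_factor - 1) * 100  # average percentage change per period
print(average_factor, average_growth_rate)
```

The two changes cancel out exactly (1.25 × 0.80 = 1), so the average growth rate is essentially zero, whereas the arithmetic mean of +25% and -20% would misleadingly suggest +2.5% per period.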
Harmonic Mean
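• The harmonic mean (H.M.) of a set of observations is defined as the
reciprocal of the arithmetic mean of the reciprocals of the individual
observations: $\text{H.M.} = \frac{n}{\sum_{i=1}^{n} (1/x_i)}$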
• While calculating H.M., more weightage is given to smaller values in
a data set because in this case, the reciprocal of given values is taken
for the calculation of H.M.
• Used for calculating share price, dividend yield
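A minimal sketch of the calculation follows. The example, averaging two speeds over equal distances, is a hypothetical illustration that is not taken from the slides; it is a standard case where the H.M. gives the correct average rate.

```python
def harmonic_mean(values):
    """Harmonic mean: reciprocal of the arithmetic mean of the reciprocals."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean requires a non-empty list of positive values")
    return len(values) / sum(1 / v for v in values)


# Hypothetical speeds (km/h) over two stretches of equal distance.
speeds = [40, 60]

average_speed = harmonic_mean(speeds)
print(average_speed)
```

The result is below the arithmetic mean of 50 km/h, which reflects the greater weight the H.M. gives to the smaller value.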
Combined AM
Correcting incorrect mean
Calculate the missing frequency
