
Describing Data: Measures of Central Tendency

Syed S. Hossain
Institute of Statistical Research and Training
University of Dhaka, Bangladesh.
shahadat@isrt.ac.bd


Measures of Center

A measure of location (measure of central tendency) is a statistical measure that identifies a single value as representative of an entire distribution.
It aims to provide an accurate description of the entire data set.
It is the single value that is most typical or representative of the collected data.


Measures of Center

For a set of observations $x_i$, $i = 1, 2, \ldots, n$, the following measures of center are considered:
1 Arithmetic mean,
2 Median,
3 Mode,
4 Weighted mean,
5 Geometric mean,
6 Harmonic mean.


The Arithmetic Mean

The arithmetic mean is a mathematical average and is one of the most popular measures of location (central tendency).
It is frequently referred to simply as the mean. It is obtained by dividing the sum of the values of all observations in a series by the number of items constituting the series.

For a set of observations $x_1, x_2, \ldots, x_n$, the arithmetic mean is given by
\[
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i .
\]
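As a minimal sketch of the formula above, the following Python function divides the sum of the observations by their number. The function name `arithmetic_mean` and the sample observations are hypothetical, chosen only for illustration.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their number."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical observations, used only to illustrate the formula.
observations = [4, 8, 15, 16, 23, 42]
print(arithmetic_mean(observations))  # 108 / 6 = 18.0
```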


Properties of Arithmetic mean

1 Every set of interval-level data has a mean.
2 All the values are included in computing the mean.
3 A set of data has only one mean. The mean is unique.
4 The mean is a useful measure for comparing two or more populations.
5 The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean will always be zero. Expressed symbolically:
\[
\sum_{i=1}^{n} (x_i - \bar{x}) = 0 .
\]
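The zero-sum property of the deviations can be checked numerically. The sketch below uses hypothetical observations and a hypothetical helper `sum_of_deviations`; because floating-point arithmetic is inexact in general, the comparison with zero uses a small tolerance.

```python
import math


def sum_of_deviations(values):
    """Return the sum of (x_i - mean) over all observations."""
    mean = sum(values) / len(values)
    return sum(x - mean for x in values)


# Hypothetical observations; their mean is 24 / 4 = 6.0.
observations = [2.5, 3.0, 7.5, 11.0]
total = sum_of_deviations(observations)

# The deviations are -3.5, -3.0, 1.5 and 5.0, which cancel out.
print(math.isclose(total, 0.0, abs_tol=1e-12))
```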


Advantages and Disadvantages of Arithmetic mean

Advantages
1 It is easy to understand and simple to calculate.
2 It is based on all the values, and it is rigidly defined.
3 The arithmetic average remains easy to understand even if some of the details of the data are lacking.
4 It is not based on the position in the series.

Disadvantages
1 It is affected by extreme values.
2 It cannot be calculated for open-ended classes.
3 It cannot be located graphically.
4 It may give misleading conclusions.
5 It has an upward bias.


The Median

The median is the central value of the distribution: the value that divides the distribution into two equal parts, each containing an equal number of items. Thus it is the central value of the variable when the values are arranged in order of magnitude.

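For a set of observations $x_1, x_2, \ldots, x_n$, the median is given by
\[
\text{Median} =
\begin{cases}
\left(\dfrac{n+1}{2}\right)\text{th ordered observation}, & \text{for odd } n, \\[2ex]
\dfrac{1}{2}\left\{ \left(\dfrac{n}{2}\right)\text{th ordered observation} + \left(\dfrac{n}{2}+1\right)\text{th ordered observation} \right\}, & \text{for even } n.
\end{cases}
\]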
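The two cases of the median formula can be sketched in Python as follows. The function name `median` and the sample values are illustrative assumptions; note that the formula counts ordered observations from 1, whereas Python lists are indexed from 0.

```python
def median(values):
    """Return the median of the observations using the odd/even rule."""
    ordered = sorted(values)          # arrange in order of magnitude
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th ordered observation is index mid.
        return ordered[mid]
    # Even n: average the (n / 2)th and (n / 2 + 1)th ordered observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # ordered: 1, 5, 7 -> 5
print(median([7, 1, 5, 3]))  # ordered: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```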


Properties of Median

1 Every set of ordinal and interval-level data has a median.
2 The median is unique; that is, like the mean, there is only one median for a set of data.
3 It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values do occur.
4 It can be computed for a frequency distribution with an open-ended class if the median does not lie in an open-ended class.
5 It can be computed for ratio-level, interval-level and ordinal-level data.


Advantages and Disadvantages of Median

Advantages
1 The median can be calculated in all distributions.
2 The median can be understood even by common people.
3 The median can be ascertained even in the presence of extreme items.
4 It can be located graphically.
5 It is most useful when dealing with qualitative data.

Disadvantages
1 It is not based on all the values.
2 It is not capable of further mathematical treatment.
3 It is affected by fluctuations of sampling.
4 In the case of an even number of values, it may not be a value from the data.


Usage of Median

1 Whenever a data set has extreme values, the median is the preferred measure of central location.
2 The median is the measure of location most often reported for
annual income and property value data.
3 A few extremely large incomes or property values can inflate
the mean.
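The effect of an extreme value on the two measures can be illustrated with a small sketch. The income figures below are invented purely for illustration (in thousands, no particular currency), and `summarise` is a hypothetical helper built on Python's standard `statistics` module.

```python
from statistics import mean, median


def summarise(values):
    """Return the (mean, median) pair for a list of observations."""
    return mean(values), median(values)


# Hypothetical annual incomes, in thousands.
incomes = [30, 32, 35, 38, 40]
incomes_with_outlier = incomes + [1000]  # one extremely large income

# Without the outlier the mean and the median are both 35.
print(summarise(incomes))

# With the outlier the mean rises to 1175 / 6 (about 195.8),
# while the median only moves to (35 + 38) / 2 = 36.5.
print(summarise(incomes_with_outlier))
```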


The Mode

Mode is the most frequent value or score in the distribution.

For a set of observations $x_1, x_2, \ldots, x_n$, if $n(x_i)$ denotes the number of times the observation $x_i$ occurs, and if $n(x_k) = \max_i \{ n(x_i) \}$, then
\[
\text{Mode} = x_k .
\]
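The definition above translates directly into a frequency count. The sketch below is one possible reading of it: the hypothetical function `modes` returns every value $x_k$ whose count attains the maximum, so it may return more than one value.

```python
from collections import Counter


def modes(values):
    """Return all values whose frequency n(x_k) equals max{n(x_i)}."""
    counts = Counter(values)            # n(x_i) for each distinct value
    if not counts:
        return []
    highest = max(counts.values())      # max{n(x_i)}
    # Keep every value attaining the maximum, in order of first appearance.
    return [x for x, count in counts.items() if count == highest]


print(modes([1, 2, 2, 3, 3, 3]))          # [3]
print(modes(["a", "b", "a", "b", "c"]))   # ['a', 'b']
```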


Properties of Mode

1 Unlike the mean and median, the mode is not unique; that is, there may be more than one mode for a set of data.
If the data have exactly two modes, the data are bimodal.
If the data have more than two modes, the data are multimodal.
2 The mode may be calculated for ratio-level, interval-level, ordinal-level and nominal-level data.
3 The mode of a data set may not exist, for example when no value occurs more often than any other.
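These properties can be explored with `statistics.multimode` from Python's standard library (available from Python 3.8), which returns every most-frequent value. The helper `describe_modality` is hypothetical, and it rests on an assumption that is a convention rather than something stated in the slides: when several distinct values all occur equally often, the data are reported as having no mode.

```python
from statistics import multimode


def describe_modality(values):
    """Classify data as unimodal, bimodal, multimodal or having no mode."""
    found = multimode(values)           # all values with the highest count
    distinct = len(set(values))
    # Assumed convention: if every distinct value is equally frequent
    # (and there is more than one), no mode is reported.
    if not found or (distinct > 1 and len(found) == distinct):
        return "no mode"
    if len(found) == 1:
        return "unimodal"
    if len(found) == 2:
        return "bimodal"
    return "multimodal"


print(describe_modality([1, 2, 2, 3]))           # unimodal
print(describe_modality([1, 1, 2, 2, 3]))        # bimodal
print(describe_modality([1, 1, 2, 2, 3, 3, 4]))  # multimodal
print(describe_modality([1, 2, 3]))              # no mode
```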


Advantages and Disadvantages of Mode


Advantages
1 The mode is readily comprehensible and easily calculated.
2 It is the best representative of the data.
3 It is not at all affected by extreme values.
4 The value of the mode can also be determined graphically.
5 It is usually an actual value of an important part of the series.

Disadvantages
1 It is not based on all observations.
2 It is not capable of further mathematical manipulation.
3 The mode is affected to a great extent by sampling fluctuations.
4 The choice of grouping has a great influence on the value of the mode.

The weighted mean

The weighted mean generalises the arithmetic mean by attaching a weight to each value. It arises naturally when there are several observations of the same value, which might occur if the data have been grouped into a frequency distribution; the frequencies then serve as the weights. The weighted mean of a set of numbers is computed by
\[
\bar{X}_w = \frac{\sum w_i X_i}{\sum w_i},
\]
where $w_i$ denotes the weight for the $i$th data point.
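A minimal sketch of the weighted mean follows. The function name `weighted_mean` and the example data are hypothetical; the weights in the example are frequencies, so the result coincides with the ordinary arithmetic mean of the ungrouped values.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * X_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical data: the value 10 occurs three times, 20 and 30 once each.
values = [10, 20, 30]
frequencies = [3, 1, 1]
print(weighted_mean(values, frequencies))  # (30 + 20 + 30) / 5 = 16.0

# The same result as the plain mean of the ungrouped observations.
ungrouped = [10, 10, 10, 20, 30]
print(sum(ungrouped) / len(ungrouped))     # 80 / 5 = 16.0
```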


The Geometric mean

The geometric mean is often used for a set of numbers whose values are meant to be multiplied together or are exponential in nature, such as data on the growth of the human population or interest rates of a financial investment.

For a set of observations $x_1, x_2, \ldots, x_n$ ($x_i > 0 \;\forall\, i = 1, 2, \ldots, n$), the geometric mean is given by
\[
G = \left( \prod_{i=1}^{n} x_i \right)^{1/n} = \operatorname{Antilog}\left\{ \frac{1}{n} \sum_{i=1}^{n} \log x_i \right\} .
\]
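Both forms of the geometric mean formula can be compared in a short sketch. The function names and the sample growth factors are hypothetical; natural logarithms are used, so the antilog is `math.exp`, and `math.prod` requires Python 3.8 or later.

```python
import math


def geometric_mean_logs(values):
    """Geometric mean via the antilog of the mean of the logarithms."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("all observations must be positive")
    return math.exp(sum(math.log(x) for x in values) / len(values))


def geometric_mean_product(values):
    """Geometric mean via the n-th root of the product."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("all observations must be positive")
    return math.prod(values) ** (1 / len(values))


# Hypothetical yearly growth factors (for example, 1.10 means +10%).
growth = [1.10, 1.20, 0.95]
print(geometric_mean_logs(growth))
print(geometric_mean_product(growth))  # agrees up to rounding error
```

The logarithmic form is usually preferred for long series, because the running product can overflow or underflow while the sum of logarithms stays moderate.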


The Harmonic mean

The harmonic mean (formerly sometimes called the subcontrary mean) is one of several kinds of average.
Typically, it is appropriate for situations when the average of rates is desired. The harmonic mean is the number of observations divided by the sum of the reciprocals of the observations.
It is useful for ratios such as speed (= distance/time).

For a set of observations $x_1, x_2, \ldots, x_n$ ($x_i \neq 0 \;\forall\, i = 1, 2, \ldots, n$), the harmonic mean is given by
\[
H = \left\{ \frac{1}{n} \sum_{i=1}^{n} \frac{1}{x_i} \right\}^{-1} = \frac{1}{\frac{1}{n} \sum_{i=1}^{n} \frac{1}{x_i}} .
\]
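The speed example can be worked through with a short sketch. The function name `harmonic_mean` and the speeds are hypothetical: a vehicle is assumed to cover two stretches of equal length, one at 40 km/h and one at 60 km/h, and its average speed over the whole trip is the harmonic mean of the two speeds rather than their arithmetic mean.

```python
import math


def harmonic_mean(values):
    """Return n divided by the sum of the reciprocals of the observations."""
    if not values or any(x == 0 for x in values):
        raise ValueError("observations must be non-empty and non-zero")
    reciprocal_sum = sum(1 / x for x in values)
    if reciprocal_sum == 0:
        raise ValueError("the reciprocals must not sum to zero")
    return len(values) / reciprocal_sum


# Hypothetical speeds over two stretches of equal distance (km/h).
speeds = [40, 60]
average_speed = harmonic_mean(speeds)

# 2 / (1/40 + 1/60) = 48 km/h, lower than the arithmetic mean of 50 km/h.
print(math.isclose(average_speed, 48.0))
```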


Measures of Centre from grouped data


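For a frequency distribution with $x_i$ and $f_i$ denoting the class mark (mid-value) and class frequency of the $i$th class ($i = 1, 2, \ldots, k$), and with $n = \sum_{i=1}^{k} f_i$ the total number of observations, the measures of central tendency can be obtained as:
1 Arithmetic mean:
\[
\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} .
\]
2 Geometric mean:
\[
G = \left( \prod_{i=1}^{k} x_i^{f_i} \right)^{1 / \sum f_i} = \operatorname{Antilog}\left( \frac{\sum f_i \log x_i}{\sum f_i} \right) .
\]
3 Harmonic mean:
\[
H = \frac{1}{\frac{1}{n} \sum_{i=1}^{k} \frac{f_i}{x_i}} .
\]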
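The three grouped-data formulas can be sketched together. The function name `grouped_means` and the small frequency table are hypothetical; the class marks are assumed to be positive so that the logarithms and reciprocals are defined, and natural logarithms are used with `math.exp` as the antilog.

```python
import math


def grouped_means(class_marks, frequencies):
    """Return (arithmetic, geometric, harmonic) means from grouped data."""
    if len(class_marks) != len(frequencies):
        raise ValueError("class marks and frequencies must align")
    if any(x <= 0 for x in class_marks):
        raise ValueError("class marks are assumed to be positive here")
    n = sum(frequencies)                       # total number of observations
    if n == 0:
        raise ValueError("the frequencies must not sum to zero")
    pairs = list(zip(class_marks, frequencies))
    arithmetic = sum(f * x for x, f in pairs) / n
    geometric = math.exp(sum(f * math.log(x) for x, f in pairs) / n)
    harmonic = n / sum(f / x for x, f in pairs)
    return arithmetic, geometric, harmonic


# Hypothetical frequency table: class marks 1, 2, 4 with frequencies 1, 2, 1.
am, gm, hm = grouped_means([1, 2, 4], [1, 2, 1])
print(am)  # (1 + 4 + 4) / 4 = 2.25
print(gm)  # (1 * 2**2 * 4) ** (1/4), approximately 2
print(hm)  # 4 / (1 + 1 + 0.25), approximately 1.78
```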

Measures of Centre from grouped data (Cont.)

Median
\[
\text{Median} = L_1 + \frac{\frac{n}{2} - \sum f_l}{f_{\text{median}}} \times c,
\]
where
$L_1$ = Lower class boundary of the median class
$n$ = Number of items in the data
$\sum f_l$ = Sum of frequencies of all classes lower than the median class
$f_{\text{median}}$ = Frequency of the median class
$c$ = Size of class interval
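The grouped-median formula can be sketched as below. The function name `grouped_median` and the frequency table are hypothetical, and two assumptions are made that the slide does not spell out: all classes share the same width `c`, and the median class is taken to be the first class whose cumulative frequency reaches $n/2$.

```python
import math


def grouped_median(lower_boundaries, frequencies, class_width):
    """Return L1 + ((n/2 - sum f_l) / f_median) * c for grouped data."""
    n = sum(frequencies)
    half = n / 2
    cumulative = 0                      # sum of frequencies below this class
    for lower, f in zip(lower_boundaries, frequencies):
        # Assumed rule: the median class is the first non-empty class whose
        # cumulative frequency reaches n / 2.
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * class_width
        cumulative += f
    raise ValueError("no median class found (are all frequencies zero?)")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with their frequencies.
lower_boundaries = [0, 10, 20, 30]
frequencies = [5, 8, 12, 5]

# n = 30, n/2 = 15; the median class is 20-30 with 13 observations below it,
# so the median is 20 + (15 - 13) / 12 * 10.
result = grouped_median(lower_boundaries, frequencies, 10)
print(math.isclose(result, 20 + 20 / 12))
```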


Measures of Centre from grouped data (Cont.)

Mode
\[
\text{Mode} = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c,
\]
where
$L_1$ = Lower class boundary of the modal class
$\Delta_1$ = Excess of modal frequency over frequency of next lower class
$\Delta_2$ = Excess of modal frequency over frequency of next higher class
$c$ = Size of class interval
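A sketch of the grouped-mode formula follows. The function name `grouped_mode` and the frequency table are hypothetical, and three assumptions are made that the slide does not state: the classes have equal width `c`, the modal class is the first class with the highest frequency, and a missing neighbouring class (when the modal class is the first or last) is treated as having frequency zero.

```python
import math


def grouped_mode(lower_boundaries, frequencies, class_width):
    """Return L1 + (d1 / (d1 + d2)) * c for grouped data."""
    # Index of the modal class: first class with the highest frequency.
    m = max(range(len(frequencies)), key=frequencies.__getitem__)
    modal_frequency = frequencies[m]
    # Assumption: a neighbour that does not exist counts as frequency 0.
    lower_neighbour = frequencies[m - 1] if m > 0 else 0
    higher_neighbour = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    delta_1 = modal_frequency - lower_neighbour
    delta_2 = modal_frequency - higher_neighbour
    if delta_1 + delta_2 == 0:
        raise ValueError("the modal class does not stand out; mode undefined")
    return lower_boundaries[m] + delta_1 / (delta_1 + delta_2) * class_width


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with their frequencies.
lower_boundaries = [0, 10, 20, 30]
frequencies = [5, 8, 12, 5]

# Modal class 20-30: delta_1 = 12 - 8 = 4, delta_2 = 12 - 5 = 7,
# so the mode is 20 + 4 / 11 * 10.
result = grouped_mode(lower_boundaries, frequencies, 10)
print(math.isclose(result, 20 + 40 / 11))
```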


Mode and Median from Graph

[Figure: two panels, titled "Median" and "Mode"]
