Measures of Central Tendency: Unit Objectives

CHAPTER 3
Measures of Central Tendency

Chapter Contents
3.1. Definition of center
3.2. Review of Algebra
3.3. Describing center
3.4. Quantiles
3.5. Review questions

In unit two we learnt how one can gain useful information from raw data by organizing (grouping) it into a frequency distribution table, and then presenting the data using various statistical tools such as graphs and diagrams.
In this unit (unit three) we will learn about data analysis: description and summarization, the 4th stage in conducting any statistical study or investigation.
Unit objectives
1. Define what we mean by measures of central tendency.
2. Summarize data using measures of central tendency such as the mean,
median and mode.
3. Identify the position of a data value in a data set using measures of
position such as percentiles, deciles and quartiles.
Learning outcomes
After completing this section, successful students will be able to:
• Define what is meant by measure of central tendency.
• State the objectives of measuring the center of a data set.
• Describe the properties of a good measure of center.
Key words
Deciles, Mean, Median, Mode, Percentiles, Quartiles
3.1 Definition of Central Tendency
Definition 3.1. Measures of central tendency are numerical values that
locate, in some sense, the center of a data set. For this reason, they
are sometimes called measures of location, measures of averaging, or
summary statistics.
Objectives of averaging (or MCT)

Why is it necessary to measure the center of a data set?
• To get a single value that describes the characteristics of the entire
group.
• To facilitate comparisons between two or more different data sets.
• To summarize data or reduce data size.
Properties of a good measure of center

We say a measure of central tendency is good if it possesses most of
the following properties. It should:
• be simple to understand and easy to calculate and interpret.
• exist and be unique, i.e., be rigidly defined by a mathematical formula.
• be based on all observations.
• not be seriously affected by extreme observations (outliers).
• be capable of further statistical analysis and/or manipulation.
3.2 Review of algebra
Definition 3.2. Statistical notation refers to the standardized code for
symbolizing the mathematical operations performed in the formulas
and the answers we obtain.
A. Summation notation (Σ)
A sum of terms such as $X_1 + X_2 + X_3 + X_4 + X_5$ is often designated by the
symbol $\sum_{i=1}^{5} X_i$:
\[ X_1 + X_2 + X_3 + X_4 + X_5 = \sum_{i=1}^{5} X_i \tag{3.2.1} \]
Σ is the capital Greek letter read as "sigma"; in this connection it is
read as "the sum of..." and is called the summation sign. The letter "i" is
called the summation index. The term following Σ is called the summand.
In our example $X_i$ is the summand.
The "i = 1" below Σ indicates that the first term of the sum is obtained
by putting i = 1 in the summand. The five above Σ indicates that the
final term of the sum is obtained by substituting i = 5 in the summand.
The other terms of the sum are obtained by giving "i" the integer values
between the limits 1 and 5.
B. Product notation (Π)
An analogous notation for the product is obtained by substituting the Greek
capital letter Π (read as "pi") for Σ. In this case the terms resulting from
substituting the integers for the index are multiplied instead of added.
A product of terms such as $X_1 \cdot X_2 \cdot X_3 \cdot X_4 \cdot X_5$ is often denoted by the
shorthand notation $\prod_{i=1}^{5} X_i$:
\[ X_1 \cdot X_2 \cdot X_3 \cdot X_4 \cdot X_5 = \prod_{i=1}^{5} X_i \tag{3.2.2} \]
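The two notations map directly onto built-in operations in most programming languages. The short Python sketch below is only an illustration; the five values are hypothetical and are not taken from these notes.

```python
import math

# Hypothetical values standing in for X_1, ..., X_5 (not from the notes).
x = [2, 4, 6, 8, 10]

total = sum(x)          # sigma notation: X_1 + X_2 + ... + X_5
product = math.prod(x)  # pi notation:    X_1 * X_2 * ... * X_5

print(total)    # 30
print(product)  # 3840
```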
C. Algebra of summations (AoS)
The following rules are commonly called the algebra of summations. If X
and Y are two variables and c is a constant number, then
\[ \sum_{i=1}^{n} X_i = X_1 + X_2 + \dots + X_n \tag{3.2.3} \]
\[ \sum_{i=1}^{n} X_i^2 = X_1^2 + X_2^2 + \dots + X_n^2 \tag{3.2.4} \]
\[ \sum_{i=1}^{n} c = \underbrace{c + c + \dots + c}_{n \text{ times}} = nc \tag{3.2.5} \]
\[ \sum_{i=1}^{n} c X_i = cX_1 + \dots + cX_n = c \sum_{i=1}^{n} X_i \tag{3.2.6} \]
\[ \sum_{i=1}^{n} (X_i + Y_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i \tag{3.2.7} \]
\[ \sum_{i=1}^{n} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \dots + X_n Y_n \tag{3.2.8} \]
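A quick numerical check can make rules (3.2.5) to (3.2.8) concrete. The following Python sketch uses hypothetical values for X, Y and c (they do not come from the notes) and simply compares both sides of each rule.

```python
# Hypothetical data for illustration only.
x = [1, 2, 3, 4]
y = [5, 6, 7, 8]
c = 3
n = len(x)

# (3.2.5): summing a constant n times gives n*c.
rule_constant = sum(c for _ in range(n)) == n * c

# (3.2.6): a constant factor can be moved outside the sum.
rule_factor = sum(c * xi for xi in x) == c * sum(x)

# (3.2.7): the sum of (X_i + Y_i) splits into two sums.
rule_additive = sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)

# (3.2.8): a sum of products; in general this differs from the product of the sums.
sum_of_products = sum(xi * yi for xi, yi in zip(x, y))
product_of_sums = sum(x) * sum(y)

print(rule_constant, rule_factor, rule_additive)  # True True True
print(sum_of_products, product_of_sums)           # 70 260
```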

3.3 Describing the center of the distribution or the data set
The most common summary measures used to describe the center of a
distribution are the mean, the median, and the mode.
Of these, the mean is the best known and most frequently used measure. Each
measure has a formula for raw data and for grouped data. We now
deal with the properties and formulas of each measure one by one.
3.3.1 The mean or arithmetic mean

Figure 3.1. Center measures
Definition 3.3. The mean is the sum of the observations divided by
the total number of observations. It is denoted by the letter µ (read
as “mu”) for a population and X̄ (read as “X-bar”) for a sample.
Computing Mean for raw data
Data in list form: given n observations $X_1, X_2, \dots, X_n$, the sample mean is computed by (3.3.1):
\[ \bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n} \tag{3.3.1} \]
ILLUSTRATION 3.1. Calculating mean for raw data. The heights (in
inches) of six female police officer candidates are 62, 64, 63, 61, 62, and 66.
Compute the mean height.
Solution: Let $X_i$ represent the height of the ith candidate.
Step 1 Count the number of values: n = 6 (total candidates).
Step 2 Find the sum by adding all values:
\[ \sum_{i=1}^{6} X_i = 62 + 64 + 63 + 61 + 62 + 66 = 378 \text{ inches} \]
Step 3 Use formula (3.3.1):
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{378}{6} = 63 \text{ inches} \]
Step 4 Interpret the value you computed: on average, the female police
officer candidates were 63 inches tall.
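The four steps above can be sketched in a few lines of Python. The function name `mean` is just an illustrative choice; the data are the six heights from Illustration 3.1.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

heights = [62, 64, 63, 61, 62, 66]  # heights in inches, Illustration 3.1
mean_height = mean(heights)
print(mean_height)  # 63.0
```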
Computing mean from ungrouped and grouped frequency distribution
Data in the form:

Value      x_1  x_2  ...  x_k
Frequency  f_1  f_2  ...  f_k

\[ \text{Mean} = \frac{\text{sum of the products of the ith class value } (X_i) \text{ and its frequency } (f_i)}{\text{total frequency } (n)} \]
Symbolically,
\[ \bar{X} = \frac{X_1 f_1 + X_2 f_2 + \dots + X_k f_k}{\sum_{i=1}^{k} f_i} = \frac{\sum_{i=1}^{k} X_i f_i}{\sum_{i=1}^{k} f_i} \tag{3.3.2} \]
Where
$X_i$ = data value of the ith class, i = 1, 2, ..., k
$f_i$ = frequency (repetition) of $X_i$
k = number of classes
$n = \sum_{i=1}^{k} f_i$ = total frequency.
Note 3.1. In case of grouped data (G.f.d), replace $X_i$ in (3.3.2) by $X_{mi}$,
where $X_{mi}$ = midpoint of the ith class.
ILLUSTRATION 3.2 (Calculating mean for discrete data (u.f.d)). The
following numbers of books were read by each of the 28 students in a literature class. Calculate the mean.

Number of books  0  1  2   3  4  Total
Frequency        2  6  12  5  3  28

Solution: Let $X_i$ = the ith distinct number of books read and $f_i$ = the number of students who read $X_i$ books.
Step 1 Count the number of values: $n = \sum_{i=1}^{k} f_i = 28$ students.
Step 2 Find the sum: $\sum_{i=1}^{k} X_i f_i = 0 \cdot 2 + 1 \cdot 6 + 2 \cdot 12 + 3 \cdot 5 + 4 \cdot 3 = 57$.
Step 3 Use formula (3.3.2):
\[ \bar{X} = \frac{\sum_{i=1}^{k} X_i f_i}{\sum_{i=1}^{k} f_i} = \frac{57}{28} \approx 2.04 \text{ books} \]
Step 4 Interpret the value you computed: on average, each student read about 2 books.
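Formula (3.3.2) is a weighted sum, which the following Python sketch reproduces for the book data of Illustration 3.2. The helper name `frequency_mean` is hypothetical.

```python
def frequency_mean(values, freqs):
    """Mean of an ungrouped frequency distribution: sum(X_i * f_i) / sum(f_i)."""
    total_frequency = sum(freqs)
    weighted_sum = sum(x * f for x, f in zip(values, freqs))
    return weighted_sum / total_frequency

books = [0, 1, 2, 3, 4]       # number of books read
freqs = [2, 6, 12, 5, 3]      # number of students per value

mean_books = frequency_mean(books, freqs)
print(round(mean_books, 2))   # 2.04  (57 / 28)
```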
- Let’s look at another example.
ILLUSTRATION 3.3. Calculating mean for continuous (grouped) data. Thirty automobiles were tested for fuel efficiency (in miles per gallon: mpg). The
results were summarized in the frequency distribution table given below.
Find the mean.

Table 3.1. Fuel efficiency of 30 automobiles

Fuel efficiency (mpg)   Number of automobiles (f)
7.5–12.5                3
12.5–17.5               5
17.5–22.5               15
22.5–27.5               5
27.5–32.5               2

Solution: The given data is tabulated as a grouped frequency distribution, so each class covers an interval of values. Here k = 5. Let $X_{mi}$
represent the mid value of the ith fuel efficiency class.
Step 1 Construct the table as:
Table 3.2. Fuel efficiency of 30 automobiles

Fuel efficiency (in mpg)   f_i (I)   X_mi (II)   X_mi f_i (III)
7.5–12.5                   3         10          10*3 = 30
12.5–17.5                  5         15          15*5 = 75
17.5–22.5                  15        20          20*15 = 300
22.5–27.5                  5         25          25*5 = 125
27.5–32.5                  2         30          30*2 = 60
Total                      30                    590

Step 2 Use formula (3.3.2) with class midpoints:
\[ \bar{X} = \frac{\sum_{i=1}^{k} X_{mi} f_i}{\sum_{i=1}^{k} f_i} = \frac{590}{30} \approx 19.67 \text{ mpg} \]
Step 3 Interpret the value you get: the average fuel efficiency of the
tested automobiles was about 19.67 miles per gallon.
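The same calculation can be sketched in Python by first deriving each class midpoint from its boundaries. The variable names are illustrative; the data are those of Table 3.1.

```python
# Class boundaries (lower, upper) and frequencies from Table 3.1.
classes = [(7.5, 12.5), (12.5, 17.5), (17.5, 22.5), (22.5, 27.5), (27.5, 32.5)]
freqs = [3, 5, 15, 5, 2]

# Midpoint X_mi of each class stands in for every observation in that class.
midpoints = [(low + high) / 2 for low, high in classes]

weighted_sum = sum(m * f for m, f in zip(midpoints, freqs))  # 590.0
grouped_mean = weighted_sum / sum(freqs)

print(midpoints)               # [10.0, 15.0, 20.0, 25.0, 30.0]
print(round(grouped_mean, 2))  # 19.67
```

Because midpoints replace the actual observations, the result is an approximation to the mean of the raw data.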
Properties of the mean (arithmetic mean)
Before proceeding to this section first do the following activity.
Activity 3.1. Properties of mean (4 min.) Consider the observations: 10, 12, 26, 14, and 28.
1. Calculate the mean.
2. Add 2 to each observation and calculate the new mean.
3. Multiply each observation by 2 and compute the new mean.
4. What do you conclude from your work?
Solution:
The following are properties of the mean.
Property 3.1. The sum of the deviations of the observations from their
arithmetic mean is zero, i.e.,
\[ \sum_{i=1}^{n} (X_i - \bar{X}) = 0 \tag{3.3.3} \]
Proof. $\sum_{i=1}^{n} (X_i - \bar{X}) = \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \bar{X} = \sum_{i=1}^{n} X_i - n\bar{X}$. But
$\sum_{i=1}^{n} X_i = n\bar{X}$; substituting this gives $n\bar{X} - n\bar{X} = 0$.
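Property 3.2. The sum of the squares of the deviations of a set of observations from any number, say A, is the least only when A = X̄, i.e., for every A ≠ X̄,
\[ \sum_{i=1}^{n} (X_i - \bar{X})^2 < \sum_{i=1}^{n} (X_i - A)^2 \tag{3.3.4} \]
Proof. For any number A, $\sum_{i=1}^{n} (X_i - A)^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - A)^2$, because the cross term $2(\bar{X} - A)\sum_{i=1}^{n} (X_i - \bar{X})$ vanishes by Property 3.1. The term $n(\bar{X} - A)^2$ is positive unless A = X̄.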
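Properties 3.1 and 3.2 can be checked numerically. The sketch below uses the observations from Activity 3.1; the candidate values of A are arbitrary choices made only for illustration.

```python
data = [10, 12, 26, 14, 28]    # observations from Activity 3.1
xbar = sum(data) / len(data)   # 18.0

# Property 3.1: deviations from the mean sum to zero.
deviation_sum = sum(x - xbar for x in data)

def sum_of_squares(a):
    """Sum of squared deviations of the data from an arbitrary number a."""
    return sum((x - a) ** 2 for x in data)

# Property 3.2: among a few arbitrary candidates, A = xbar gives the smallest value.
candidates = [xbar - 2, xbar - 0.5, xbar, xbar + 0.5, xbar + 2]
best = min(candidates, key=sum_of_squares)

print(deviation_sum)                        # 0.0
print(best == xbar)                         # True
print(sum_of_squares(xbar), sum_of_squares(xbar - 2))  # 280.0 300.0
```

The last line agrees with the identity in the proof: moving A two units away from the mean adds n(X̄ − A)² = 5 × 4 = 20.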
Property 3.3. The combined mean of k different data sets or groups is
calculated as
\[ \bar{X}_{12 \dots k} = \frac{\bar{X}_1 n_1 + \bar{X}_2 n_2 + \dots + \bar{X}_k n_k}{n_1 + n_2 + \dots + n_k} = \frac{\sum_{i=1}^{k} \bar{X}_i n_i}{\sum_{i=1}^{k} n_i} \tag{3.3.5} \]
Where
$\bar{X}_1$ = the mean of data set 1 having $n_1$ observations
$\bar{X}_2$ = the mean of data set 2 having $n_2$ observations
...
$\bar{X}_k$ = the mean of data set k having $n_k$ observations
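As a small sketch of formula (3.3.5), the Python function below weights each group mean by its group size. The two groups used in the example are hypothetical and are not from the notes.

```python
def combined_mean(means, sizes):
    """Combined mean of several groups: sum(mean_i * n_i) / sum(n_i)."""
    if len(means) != len(sizes):
        raise ValueError("each group needs one mean and one size")
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)

# Hypothetical example: group 1 has mean 60 (n = 20), group 2 has mean 70 (n = 30).
overall = combined_mean([60, 70], [20, 30])
print(overall)  # 66.0
```

Note that the result (66) is closer to the mean of the larger group, which a plain average of the two means (65) would miss.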
Property 3.4. If a wrong figure has been used in calculating the mean, we
can correct the mean if we know the correct figure that should have been used:
\[ \bar{X}_c = \frac{n\bar{X}_w + X_c - X_w}{n} \tag{3.3.6} \]
Where c = correct and w = wrong.
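Formula (3.3.6) removes the wrong figure from the total and adds the correct one. A minimal Python sketch, with hypothetical numbers chosen only for illustration:

```python
def corrected_mean(n, wrong_mean, wrong_value, correct_value):
    """Correct a mean computed with one wrong figure, following (3.3.6)."""
    return (n * wrong_mean + correct_value - wrong_value) / n

# Hypothetical case: 10 observations gave a mean of 50, but the value 53
# was wrongly recorded as 35.
fixed = corrected_mean(n=10, wrong_mean=50, wrong_value=35, correct_value=53)
print(fixed)  # 51.8
```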
Property 3.5. If the mean of n observations $X_1, X_2, \dots, X_n$ is $\bar{X}$, then
• If we add a constant number k to all the observations, the new mean
becomes the old mean plus that constant: $\bar{X}_{new} = \bar{X} + k$.
• If we subtract a constant number k from all observations, the new mean
becomes the old mean minus that constant: $\bar{X}_{new} = \bar{X} - k$.
• If we multiply all observations by k, the new mean becomes $\bar{X}_{new} = k\bar{X}$.
• If we divide all observations by k (k ≠ 0), the new mean becomes $\bar{X}_{new} = \bar{X}/k$.
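Property 3.5 can be tried out on the observations of Activity 3.1 with k = 2. The sketch below is one way to check it; the helper `mean` is an illustrative name.

```python
def mean(values):
    return sum(values) / len(values)

data = [10, 12, 26, 14, 28]  # observations from Activity 3.1
k = 2

old_mean = mean(data)                      # 18.0
added = mean([x + k for x in data])        # old mean + k
subtracted = mean([x - k for x in data])   # old mean - k
multiplied = mean([x * k for x in data])   # k * old mean
divided = mean([x / k for x in data])      # old mean / k

print(old_mean, added, subtracted, multiplied, divided)
# 18.0 20.0 16.0 36.0 9.0
```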
Merits and demerits of arithmetic mean
Merits of arithmetic mean
• It has a definite value.
• It is calculated based on all observations.
• It is simple to calculate and easy to understand.
• It can be used for further manipulation, e.g. to calculate the variance.
• It can be used for comparing two data sets.
• It can be used for data measured on an interval or ratio scale.
Demerits of mean
• It is highly affected by extreme values (outliers).
• It cannot be calculated for a frequency distribution having open-ended
classes.
• It sometimes gives absurd results, i.e., a value that no observation can take (for example, a fractional number of books per student).
Types of mean
1. Arithmetic Mean or simply Mean: The formula is
\[ A.M = \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]
This is the mean discussed earlier. See (3.3.1) and (3.3.2).
2. The Geometric Mean (G.M): It is defined as the nth root of the product
of n values (observations). The formula is
\[ G.M = \sqrt[n]{\prod_{i=1}^{n} X_i} \tag{3.3.7} \]
Note 3.2. The geometric mean is useful in finding the average of percentages, ratios, indexes, or growth rates.
3. The Harmonic Mean (H.M): It is defined as the number of values divided
by the sum of the reciprocals of each value. The formula is
\[ H.M = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}} \tag{3.3.8} \]
Note 3.3. The harmonic mean is used in finding the average speed.

Relation between the three types of means
Before we discuss this sub-section, first do the following class activity.
Activity 3.2. Relation between means (5 min.)

1. For the data values $X_1 = 1$, $X_2 = 3$, and $X_3 = 9$: (i) find the
G.M, A.M, and H.M; (ii) compare the results.
2. The cost of food in a specific geographic region increased by 1%, 3% and 5% over the past
three years. Find the average increase.
3. A salesperson drives a 300-mile round trip at 30 miles per hour going
to Chicago and 45 miles per hour returning home. Find the average
speed in miles per hour.
Solution:
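Items 1 and 3 of the activity can be worked with a short Python sketch of formulas (3.3.1), (3.3.7) and (3.3.8). The function names are illustrative, and treating item 3 as a harmonic mean assumes the two legs of the trip are equally long (150 miles each).

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    # nth root of the product of n positive values
    return math.prod(values) ** (1 / len(values))

def harmonic_mean(values):
    # number of values divided by the sum of their reciprocals
    return len(values) / sum(1 / x for x in values)

# Item 1: X_1 = 1, X_2 = 3, X_3 = 9
values = [1, 3, 9]
am = arithmetic_mean(values)   # 13/3, about 4.33
gm = geometric_mean(values)    # cube root of 27, about 3
hm = harmonic_mean(values)     # 27/13, about 2.08

# Item 3: equal distances at 30 mph and 45 mph
avg_speed = harmonic_mean([30, 45])   # about 36 mph
check = 300 / (150 / 30 + 150 / 45)   # total distance / total time

print(round(am, 2), round(gm, 2), round(hm, 2))  # 4.33 3.0 2.08
print(round(avg_speed, 2), round(check, 2))      # 36.0 36.0
```

For this data the three means come out in the order A.M > G.M > H.M, and the average speed is 36 mph rather than the simple average of 37.5 mph.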
Lemma 3.3.1. If $X_1$ and $X_2$ are two observed values, then the G.M of
their A.M and H.M is equal to the geometric mean of the numbers $X_1$
and $X_2$, i.e., $G.m^2 = A.m \cdot H.m$.
Proof. Substituting the values in formulas (3.3.1), (3.3.7), (3.3.8), we
have
\[ A.m = \frac{x_1 + x_2}{2}, \qquad G.m = \sqrt{x_1 x_2}, \qquad H.m = \frac{2}{\frac{1}{x_1} + \frac{1}{x_2}} = \frac{2 x_1 x_2}{x_1 + x_2} \]
\[ \Rightarrow G.m^2 = x_1 x_2 \]
\[ A.m \cdot H.m = \left( \frac{x_1 + x_2}{2} \right) \left( \frac{2 x_1 x_2}{x_1 + x_2} \right) = x_1 x_2 \]
\[ \therefore \; G.m^2 = A.m \cdot H.m \]
Lemma 3.3.2. If A, G and H stand for the arithmetic mean, geometric
mean and harmonic mean respectively, then the relation A ≥ G ≥ H
holds true.
Proof. Left as an exercise. [Hint: Consider two observations $x_1, x_2$
and $(\sqrt{x_1} - \sqrt{x_2})^2 \ge 0 \;\; \forall x_1, x_2 \ge 0$.]
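Both lemmas can be spot-checked numerically for a pair of positive values. The pair (4, 9) below is a hypothetical choice; a numerical check illustrates the lemmas but does not replace the proofs.

```python
import math

x1, x2 = 4, 9  # hypothetical positive observations

a = (x1 + x2) / 2             # arithmetic mean: 6.5
g = math.sqrt(x1 * x2)        # geometric mean: 6.0
h = 2 * x1 * x2 / (x1 + x2)   # harmonic mean: 72/13, about 5.54

# Lemma 3.3.1: G^2 = A * H (compared with a floating-point tolerance).
lemma_1_holds = math.isclose(g ** 2, a * h)

# Lemma 3.3.2: A >= G >= H.
lemma_2_holds = a >= g >= h

print(a, g, round(h, 2))             # 6.5 6.0 5.54
print(lemma_1_holds, lemma_2_holds)  # True True
```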
3.3.2 The median

This is the second measure of central tendency.
Definition 3.4. The median is the middle observation of the values
after they have been ordered from the smallest to the largest or from the
largest to the smallest. Equivalently, the median is the middle point of the data
array, or the halfway point in a data set. The symbol
for the sample median is X̃ (read as "X-tilde").
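Computing Median from un-grouped and grouped data
For un-grouped data,
\[ \tilde{X} = \begin{cases} \left( \frac{n+1}{2} \right)^{\text{th}} \text{ value}, & \text{if } n \text{ is odd} \\[1ex] \dfrac{\left( \frac{n}{2} \right)^{\text{th}} \text{ value} + \left( \frac{n+2}{2} \right)^{\text{th}} \text{ value}}{2}, & \text{if } n \text{ is even} \end{cases} \tag{3.3.9} \]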
ILLUSTRATION 3.4. Calculating median for un-grouped data

Consider the following observations: 1, 5, 9, 3, and 11. Find the
median.
Solution: Follow the steps shown below.
1st Arrange the data in increasing order of magnitude: 1, 3, 5, 9, and
11.
2nd Decide whether n is even or odd: n = 5 (odd).
3rd Select the correct formula in (3.3.9) and calculate:
\[ \tilde{X} = \left( \frac{n+1}{2} \right)^{\text{th}} \text{ value} = \left( \frac{5+1}{2} \right)^{\text{th}} \text{ value} = 3^{\text{rd}} \text{ value} = 5 \]
4th Interpret: half of the values lie at or below 5 and half lie at or above 5.
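Rule (3.3.9) translates into a short Python sketch: sort first, then pick the middle value or average the two middle values. The function name `median` is illustrative; the second example reuses the heights of Illustration 3.1 to show the even case.

```python
def median(values):
    """Median of raw data following rule (3.3.9)."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n+1)/2)th value, which is index mid when counting from 0
        return ordered[mid]
    # even n: average of the (n/2)th and ((n+2)/2)th values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median([1, 5, 9, 3, 11])             # n = 5
even_case = median([62, 64, 63, 61, 62, 66])    # n = 6

print(odd_case)   # 5
print(even_case)  # 62.5
```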

For grouped data,
\[ \tilde{X} = L_m + \frac{w}{f_m} \left( \frac{n}{2} - F_{pm} \right) \tag{3.3.10} \]
Where
• $L_m$ = lower boundary of the median class
• $F_{pm}$ = the less-than cumulative frequency immediately preceding the
median class
• $f_m$ = frequency of the median class
• w = the class width
• $\frac{n}{2}$ is the key to finding the median class and should be calculated first
Note 3.4. The median class is the class with the smallest less-than
cumulative frequency ≥ $\frac{n}{2}$.
ILLUSTRATION 3.5. Calculating median from grouped data: Table
3.3 below gives the distribution of the weekly wages of employees of a small
firm.

Table 3.3. Distribution of wages of employees in a small firm

Wages (in birr)   No. of employees (f_i)
126 & below       3
127–135           5
136–144           9
145–153           12
154–162           5
163–171           4
172 & above       2

a. Find the median of the weekly wage.
b. Why is the median a more suitable measure of central tendency in this
case?
Solution:
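One way to work part (a) is sketched below in Python, applying formula (3.3.10) to the frequencies of Table 3.3. The lower class boundary 144.5 and the class width 9 are assumptions: they follow from reading the class limits 145–153 as boundaries 144.5–153.5. The helper name `grouped_median` is hypothetical.

```python
def grouped_median(lower_boundary, width, f_median, cum_before, n):
    """Median of grouped data, formula (3.3.10)."""
    return lower_boundary + (width / f_median) * (n / 2 - cum_before)

freqs = [3, 5, 9, 12, 5, 4, 2]   # frequencies from Table 3.3
n = sum(freqs)                   # 40

# Locate the median class: first class whose cumulative frequency reaches n/2.
cum_before = 0
for index, f in enumerate(freqs):
    if cum_before + f >= n / 2:
        break
    cum_before += f

# index == 3 is the 145-153 class; assumed boundaries 144.5-153.5, width 9.
median_wage = grouped_median(lower_boundary=144.5, width=9,
                             f_median=freqs[index], cum_before=cum_before, n=n)

print(index, cum_before)  # 3 17
print(median_wage)        # 146.75
```

For part (b), note that the first and last classes are open ended, so class midpoints (and therefore the mean) cannot be computed for them, while the median only needs the median class.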
Merits and demerits of median
Merits of median
• It is used when one is interested in finding the center or middle value of a
data set.
• It is unique.
• It is affected less than the mean by extremely low or high values
because it is a positional average.
• It can be computed for a frequency distribution with an open-ended
class.
• It can be determined for all levels of measurement except nominal.
Demerits of median
• It is not capable of further algebraic treatment or statistical analysis.
• It is not a good representative of the data when the number of observations
is small.
• When the number of items is very large, sorting is cumbersome and
time consuming.
3.3.3 The mode

This is the third measure of central tendency.
Definition 3.5. The mode is the value of the observation that appears
most frequently. The symbol for the sample mode is X̂ (read as "X-hat").
Computing Mode from grouped data
For grouped data,
\[ \hat{X} = L_{mo} + w \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) \tag{3.3.11} \]
Where
• $L_{mo}$ = lower boundary of the modal class
• $f_{mo}$ = frequency of the modal class
• $f_{pmo}$ = frequency of the class immediately preceding the modal class
• $f_{smo}$ = frequency of the class immediately succeeding the modal class
• w = the class width
• $\Delta_1 = f_{mo} - f_{pmo}$ and $\Delta_2 = f_{mo} - f_{smo}$
Note 3.5. The modal class is the class having the largest frequency.
Merits and demerits of mode
Merits of mode
• It is the easiest average to compute.
• It is not affected by extreme values.
• It can be calculated in case of the open ended intervals.
• It is the only measure of center that can be used to find the most
typical case when the data are nominal or categorical.
Demerits of mode
• It may not exist; if it exists it may not be unique.
• It may be unrepresentative in many cases.
Note 3.6. In a moderately asymmetrical distribution the following empirical relation
holds approximately.
\[ \text{mean} - \text{mode} = 3(\text{mean} - \text{median}) \tag{3.3.12} \]
ILLUSTRATION 3.6. Calculating X̂ for continuous (grouped) data. Find the
mode for the following frequency distribution of the birth weights (in kg)
of 30 children given below.

Weight (in kg)    1.9–2.3  2.3–2.7  2.7–3.1  3.1–3.5  3.5–3.9  3.9–4.3
No. of children   5        5        9        4        4        3

Solution: Follow the steps shown below.
Step 1 Find the modal class: the modal class is the 3rd class because its
frequency (9) is higher than that of the other classes. Its interval is 2.7–3.1,
which gives the class boundaries of the 3rd class. So $L_{mo} = 2.7$,
$\Delta_1 = 9 - 5 = 4$ (modal frequency minus the frequency of the preceding class),
$\Delta_2 = 9 - 4 = 5$ (modal frequency minus the frequency of the following class), and
$w = U_{cb1} - L_{cb1} = 2.3 - 1.9 = 0.4$.
Step 2 Substitute the values into formula (3.3.11):
\[ \hat{X} = L_{mo} + w \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) = 2.7 + \left( \frac{4}{4+5} \right) \times 0.4 \approx 2.88 \text{ kg} \]
Step 3 Interpretation: the most typical birth weight of the children is about 2.88 kg.
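The calculation in this illustration can be sketched in Python as follows. The helper name `grouped_mode` is hypothetical, and the sketch assumes the modal class is neither the first nor the last class, so that both neighbouring frequencies exist.

```python
def grouped_mode(lower_boundary, width, f_modal, f_prev, f_next):
    """Mode of grouped data, formula (3.3.11)."""
    delta_1 = f_modal - f_prev   # modal frequency minus preceding frequency
    delta_2 = f_modal - f_next   # modal frequency minus following frequency
    return lower_boundary + width * delta_1 / (delta_1 + delta_2)

lower_boundaries = [1.9, 2.3, 2.7, 3.1, 3.5, 3.9]  # birth weight classes (kg)
freqs = [5, 5, 9, 4, 4, 3]
width = 0.4

i = freqs.index(max(freqs))  # position of the modal class (0-based): 2
mode_weight = grouped_mode(lower_boundaries[i], width,
                           freqs[i], freqs[i - 1], freqs[i + 1])

print(i)                      # 2
print(round(mode_weight, 2))  # 2.88
```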
Example 3.1. Find the mode of the weekly wages data in Table 3.3
3.4 Quantiles
These are other measures, used to describe the position of a data value.
Definition 3.6. Quantiles are values that divide a given data set into
a number of equal parts. They are also called measures of position (MoP).
Example 3.2. Quartiles, deciles, percentiles.

(i) Quartiles are numerical values that divide a given data set into 4 equal
parts.
• Notation: $Q_1, Q_2, Q_3$.
• Meaning: $Q_i$ is the value below which 25i% and above which (100 − 25i)%
of the values are found.
(ii) Deciles are numerical values that divide a given data set into 10 equal
parts.
• Notation: $D_1, D_2, \dots, D_5, \dots, D_9$.
• Meaning: $D_i$ is the value below which 10i% and above which (100 − 10i)%
of the values are found.
(iii) Percentiles are numerical values that divide a given data set into 100
equal parts.
• Notation: $P_1, P_2, \dots, P_{50}, \dots, P_{99}$.
• Meaning: $P_i$ is the value below which i% and above which (100 − i)% of
the values are found.
Example 3.3. Find $Q_1$, X̃, $D_3$, $P_{80}$ for the data given in Illustrations 3.1 to 3.6.
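These notes do not state a computing rule for quantiles of raw data, so the sketch below relies on Python's standard `statistics.quantiles` function, which uses its own default interpolation convention (the "exclusive" method). Other textbooks and software may give slightly different values for small data sets. The data are the heights from Illustration 3.1.

```python
import statistics

heights = [62, 64, 63, 61, 62, 66]  # raw data from Illustration 3.1

# Each call returns the cut points that split the data into n equal parts.
quartiles = statistics.quantiles(heights, n=4)       # [Q1, Q2, Q3]
deciles = statistics.quantiles(heights, n=10)        # [D1, ..., D9]
percentiles = statistics.quantiles(heights, n=100)   # [P1, ..., P99]

q1 = quartiles[0]
median = quartiles[1]   # Q2 coincides with the median
d3 = deciles[2]
p80 = percentiles[79]

print(len(quartiles), len(deciles), len(percentiles))  # 3 9 99
print(median == statistics.median(heights))            # True
```

Q2, D5 and P50 all mark the same position, which is why the median appears in the list of quantities requested in the example.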
3.5 Review questions
