Bit 3 Lesson
03
Measures of Central Location
3.0 Introduction
Graphical techniques provide a pictorial representation of a given data set which is
easily understandable and comparable. They help to gain a good general view about
the data set. For example, the frequency distribution gives us a general idea about
the approximate shape of the distribution. However, the graphical techniques provide
little or no information which can be used to represent and interpret a data set
precisely. Therefore, we have to have some forms of numerical measures which could
be used to represent and interpret a given data set in a more precise, descriptive and
comparable way. These measures then can be used for inferring valuable information
about a population. Numerical descriptive techniques come into play to cater to this
need.
[Figure 3.0.1: Descriptive Statistics divide into Graphical Techniques and Numerical
Techniques; Graphical Techniques divide into Graphical Techniques for Quantitative
Data and Graphical Techniques for Qualitative Data.]
Lesson 3 – Measures of Location
ITE 3703 – Probability and Statistics Week 03
Numerical descriptive techniques provide more precise information about a given data
set, which in turn can be used to draw important conclusions about the data set.
There are a number of different numerical techniques. We will study two of the most
widely used: measures of central location and measures of dispersion.
In this lesson you will learn the first one, measures of central location; in Lesson 4
you will learn about measures of dispersion.
Learning outcomes
After completion of this lesson, you will be able to:
• Define a typical value in a set of observations.
• Identify the position of the mean, median and mode for different shaped
distributions.
A measure of the center or the central value of a data set is called a “measure of
central location” (also known as a “measure of central tendency”). The most commonly
used measures of central location are:
Arithmetic Mean
Median
Mode
You may be wondering why there are three types of measures about the center of a
set of data. These different measures provide a numerical representation about
different types of “centers” of a given data set.
The most frequently used measure of central tendency is the arithmetic mean. But in
certain situations the other two measures (the median and the mode) can be more
useful than the arithmetic mean.
Which measure to use depends on factors such as what we are trying to accomplish
and the nature of the data (qualitative or quantitative).
No doubt you have heard the term “average”. The arithmetic mean is the average
value of a given data set. Depending on the data set under consideration, we may
calculate the population mean or the sample mean. If the data is available for the
entire population, we calculate the population mean whereas if the data is available
only for a sample of the population, we calculate the sample mean. It is not always
practically possible to obtain data for the entire population. In such situations,
we obtain data for a sample, calculate the sample mean, and apply other statistical
techniques (discussed in subsequent chapters) to estimate the population
mean.
Population Mean:
Population mean is obtained by dividing the summation of observations (of the whole
population) by the number of observations.
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
Where,
µ = Population Mean
N = Total number of observations in the population
xi = ith Observation
Example 3.2.1:
Mr. Athukorale’s family owns five cars. The total mileages (in kilometers) of these
cars are 65000, 80000, 35000, 60000, and 70000. Find the mean mileage of a car
owned by this family.
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
Here,
N=5
x1 = 65000 km
x2 = 80000 km
x3 = 35000 km
x4 = 60000 km
x5 = 70000 km
µ =( x1+ x2 + x3 + x4 + x5)/5
= (65000+80000+35000+60000+70000)km/5
= 62000 km
Therefore, we can say that on average a car used by Mr. Athukorale’s family has run
62000 km.
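The calculation in Example 3.2.1 can be checked with a few lines of Python. This is only a minimal sketch; the function name `population_mean` and the variable names are illustrative choices and are not part of the lesson.

```python
def population_mean(observations):
    """Return the sum of the observations divided by their count (N)."""
    if not observations:
        raise ValueError("at least one observation is required")
    return sum(observations) / len(observations)


# Mileages (km) of the five cars from Example 3.2.1.
mileages_km = [65000, 80000, 35000, 60000, 70000]
mu = population_mean(mileages_km)
print(mu)  # 62000.0
```

The same function works for a sample; only the interpretation of the result (sample mean rather than population mean) changes.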
Exercise 3.2.1:
Given below is the number of cars produced at a car manufacturing company for five
days. Find the average number of cars produced per day.
Day Production
1 200
2 350
3 400
4 450
5 700
Sample Mean
Sample mean is obtained by dividing the summation of the observations collected for
a sample by the number of observations in the sample.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Where,
x̄ = Sample mean
n = number of observations in the sample
xi = ith observation of the sample
Note that the notations used for the population mean and the sample mean are different.
Example 3.2.2:
A sample of five executives of Nadee Travels Pvt. Ltd received the following amounts
of bonus last year: RS 12000, RS 14000, RS 8000, RS 6000 and RS 10000. Find the
average bonus for these five executives.
Solution:
Here, the study is based on a sample: Bonuses of all the executives are not collected.
Only a sample of five was selected for the study. Therefore, we can calculate the
sample mean, not the population mean.
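$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{12000 + 14000 + 8000 + 6000 + 10000}{5} = \frac{50000}{5} = 10000$$
Therefore, the average bonus for these five executives is RS 10000.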
Exercise 3.2.2:
Given below is the number of weekly overtime hours of five employees in a company.
Find the average number of weekly overtime hours of an employee of the company.
100 150 50 150 200
The arithmetic mean is strongly affected by extreme values. For example, refer to
Example 3.2.1. If we change the mileage of the fifth car to 950,000 km (that is,
x5 = 950,000 km) and recalculate the average, we get,
µ =( x1+ x2 + x3 + x4 + x5)/5
= (65000+80000+35000+60000+950000)km/5
= 238,000 km
This value is misleading as the average mileage of an individual car in the family,
because a single extreme value pulls the mean far above the other four mileages.
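The effect of the single large mileage can be made concrete with a short Python sketch. The helper `average` and the variable names are illustrative assumptions; the data are the original and modified mileages from Example 3.2.1.

```python
def average(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)


original_km = [65000, 80000, 35000, 60000, 70000]
modified_km = [65000, 80000, 35000, 60000, 950000]  # fifth car changed

print(average(original_km))      # 62000.0
print(average(modified_km))      # 238000.0
# Mean of the four unchanged cars, for comparison with the inflated mean.
print(average(modified_km[:4]))  # 60000.0
```

Four of the five cars have run 80000 km or less, yet the mean of the modified data is 238000 km, which describes none of them well.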
3.3 The Median
The following equations can be used to find the median of a set of data.
If the number of observations = N, then
Example 3.3.1:
Find the median of the following set of data.
Solution:
Example 3.3.2:
Solution:
Therefore, the median is the average of the values at the 2nd and 3rd positions of the ordered data.
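The median rule can be sketched in Python as follows. This assumes the standard definition: sort the data, then take the middle value when the number of observations is odd, or the average of the two middle values when it is even. That is consistent with the statement above about the 2nd and 3rd values for four observations. The function name `median_of` is illustrative, and the four-value data set is made up for this sketch.

```python
def median_of(values):
    """Median under the standard definition (assumed, see the text above)."""
    if not values:
        raise ValueError("at least one observation is required")
    ordered = sorted(values)          # the median is defined on ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


# Odd count: the mileages from Example 3.2.1.
print(median_of([65000, 80000, 35000, 60000, 70000]))  # 65000
# Even count: hypothetical data; 2nd and 3rd ordered values are 5 and 8.
print(median_of([3, 8, 5, 10]))                        # 6.5
```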
Exercise 3.3.1:
http://www.mathsisfun.com/median.html
The mode is the value that appears most frequently in a set of data. A data set can
have more than one mode (multimodal).
Example 3.4.1:
Solution:
Observation Frequency
10 4
12 1
15 2
20 1
The value 10 appears four times, more often than any other value. Therefore, the
mode is 10.
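Counting frequencies as in the table above can be sketched with `collections.Counter` from the Python standard library. The function name `modes` is an illustrative choice, and the list `data` is rebuilt from the frequency table (the original order of the observations is not known).

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency, in sorted order."""
    if not values:
        raise ValueError("at least one observation is required")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# Rebuilt from the frequency table: 10 x4, 12 x1, 15 x2, 20 x1.
data = [10, 10, 10, 10, 12, 15, 15, 20]
print(modes(data))             # [10]
# Hypothetical data with two modes.
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```

Returning a list rather than a single value keeps the multimodal case visible.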
In real life situations, we usually have to work with grouped data due to different
reasons (convenience, classification purposes, etc.).
We have seen how to calculate the mean (arithmetic mean), median and mode for a
set of data. An important point to notice is that those data were not
grouped into classes. If the data are grouped into classes (as in a frequency
distribution), the mean, median and mode are calculated differently.
Let’s have a look at how we calculate the mean, median and mode
for grouped data.
Where,
Example 3.5.1.1:
Ten film halls were selected to find out how many films each hall showed during a
particular week. The findings are given in the following table.
Number of movies shown    Frequency (number of film halls)
1-2 1
3-4 2
5-6 3
7-8 1
9-10 3
Solution:
Let’s first calculate the class midpoints (x_i). Then we can find x_i f_i for each class.
$$\bar{x} = \frac{\sum x_i f_i}{n} = \frac{61}{10} = 6.1$$
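The grouped-mean calculation for the film-hall table can be reproduced step by step in Python. This is a sketch under the assumption that each class is represented by its midpoint, (lower limit + upper limit) / 2; the names `film_halls` and `grouped_mean` are illustrative.

```python
# (lower limit, upper limit, frequency) for each class in Example 3.5.1.1.
film_halls = [(1, 2, 1), (3, 4, 2), (5, 6, 3), (7, 8, 1), (9, 10, 3)]


def grouped_mean(classes):
    """Sum of (class midpoint * frequency) divided by the total frequency."""
    n = sum(f for _, _, f in classes)
    weighted_sum = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return weighted_sum / n


# Midpoints are 1.5, 3.5, 5.5, 7.5, 9.5, so the products x_i * f_i are
# 1.5, 7.0, 16.5, 7.5, 28.5, which add up to 61.
print(grouped_mean(film_halls))  # 6.1
```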
Where,
From the above formula, it is obvious that in order to calculate the median, we have
to find the median class first.
a) Divide the total number of data values by 2 (let us say the result is k).
b) Determine which class contains this value (the class which contains the
kth value).
c) That class is called the median class.
Example 3.5.2.1:
n/2 = 10/2 =5
L=5
CF = 3
f=3
i=7–5=2
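The values listed above match the film-hall table of Example 3.5.1.1, so the steps for finding the median class can be sketched in Python on that table. The formula used here, Median = L + ((n/2 − CF) / f) × i, is an assumption: it is the usual grouped-median formula that uses exactly the symbols listed above, but the lesson's own formula is not reproduced in this text. The function names are illustrative.

```python
def find_median_class(classes):
    """Return (L, CF, f) for the median class.

    classes: list of (lower_limit, frequency) pairs in ascending order.
    L  = lower limit of the median class
    CF = cumulative frequency of the classes before the median class
    f  = frequency of the median class
    """
    n = sum(f for _, f in classes)
    k = n / 2  # step a): divide the total number of data values by 2
    cumulative = 0
    for lower, f in classes:
        # step b): the first class whose running total reaches k holds the kth value
        if cumulative + f >= k:
            return lower, cumulative, f  # step c): this is the median class
        cumulative += f
    raise ValueError("no classes given")


def grouped_median(classes, class_width):
    """Assumed formula: L + ((n/2 - CF) / f) * i, with i = class_width."""
    n = sum(f for _, f in classes)
    L, CF, f = find_median_class(classes)
    return L + (n / 2 - CF) / f * class_width


# (lower limit, frequency) for the film-hall table; class width i = 7 - 5 = 2.
film_halls = [(1, 1), (3, 2), (5, 3), (7, 1), (9, 3)]
print(find_median_class(film_halls))                        # (5, 3, 3)
print(round(grouped_median(film_halls, class_width=2), 2))  # 6.33
```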
The mode of a set of grouped data is the class midpoint of the class with the highest
frequency.
If there are two classes with the same highest frequency, then we call it a “bimodal”
distribution.
Example 3.5.3.1:
There are two classes with the frequency of 3 (which is the highest frequency).
Therefore, we have two modes for this distribution.
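Assuming Example 3.5.3.1 refers to the film-hall table of Example 3.5.1.1 (where the classes 5-6 and 9-10 both have frequency 3), the two modes can be found with a short Python sketch. The function name `grouped_modes` is illustrative.

```python
def grouped_modes(classes):
    """Midpoints of every class that shares the highest frequency."""
    highest = max(f for _, _, f in classes)
    return [(lower + upper) / 2 for lower, upper, f in classes if f == highest]


# (lower limit, upper limit, frequency) for the film-hall table.
film_halls = [(1, 2, 1), (3, 4, 2), (5, 6, 3), (7, 8, 1), (9, 10, 3)]
print(grouped_modes(film_halls))  # [5.5, 9.5]
```

Two midpoints are returned, which corresponds to the bimodal case described above.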
The relative positions of the mean and the median give us information about the
distribution shape.
If all three measures mean, median and mode are equal for a set of data, then the
distribution is symmetrical.
If the mean is larger than the median, it is an indication of a right skewed distribution,
and if the mean is smaller than the median, then the distribution is left skewed. This
is because a few extreme values affect the mean more than the median. (Extreme
values: very large values or very small values.)
• Mean = Median = Mode → Distribution is symmetric
Figure 3.6.1
• Mode < Median < Mean → A positively skewed distribution
Figure 3.6.2
• Mean < Median < Mode → A negatively skewed distribution
Figure 3.6.3
Figure 3.6.4 summarizes these facts.
Figure 3.6.4
Source: Keller G. and Warrack B. (2000). Statistics for Management and Economics. 5th ed. Duxbury.
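The comparison of mean and median described in this section can be sketched in Python using the standard library's `statistics` module. The function name `skew_direction` and its return strings are illustrative, and the comparison is only the rough indicator described above, not a formal measure of skewness. The second and third data sets are made up for this sketch.

```python
from statistics import mean, median


def skew_direction(values):
    """Compare the mean with the median, as described in the text above."""
    m, med = mean(values), median(values)
    if m > med:
        return "right skewed"   # a few large values pull the mean up
    if m < med:
        return "left skewed"    # a few small values pull the mean down
    return "mean equals median"


# Modified mileages from Example 3.2.1: mean 238000, median 65000.
print(skew_direction([65000, 80000, 35000, 60000, 950000]))  # right skewed
# Hypothetical data: mean 7, median 8.5.
print(skew_direction([1, 8, 9, 10]))                         # left skewed
# Hypothetical data: mean 4, median 4.
print(skew_direction([2, 4, 6]))                             # mean equals median
```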
That brings us to the end of the lesson. Now try the quiz to check your knowledge.
Summary
In this lesson you extended your knowledge of descriptive measures.
You learnt three very important measures of central location: Arithmetic
Mean, Median and Mode.
Further Reading:
Anderson, D.R., Sweeney, D.J., Williams, T.A., 2007. Statistics for Business and
Economics. Chapter 3.
http://mste.illinois.edu/hill/dstat/dstat.html