
Measures of Central Tendency

Descriptive Statistics
Measures of Central Tendency
Measures of Location
Measures of Dispersion
Measures of Symmetry
Measures of Peakedness
Measures of Central Tendency
The central tendency is measured by averages.
These describe the point about which the
various observed values cluster.
In mathematics, an average, or central
tendency of a data set refers to a measure of
the "middle" or "expected" value of the data
set.
Measures of Central Tendency
Arithmetic Mean
Geometric Mean
Weighted Mean
Harmonic Mean
Median
Mode
Measure of central tendency
Central tendency
A statistical measure that identifies a
single score as representative for an
entire distribution. The goal of central
tendency is to find the single score that is
most typical or most representative of
the entire group.
Choosing a measure of central tendency
The choice of measure depends on:
the level of measurement of the variable
concerned (nominal, ordinal, interval or ratio);
the shape of the frequency distribution;
what is to be done with the figure obtained.
The mean is really suitable only for ratio and
interval data. For ordinal variables, where the
data can be ranked but one cannot validly talk
of 'equal differences' between values, the
median, which is based on ranking, may be
used. Where it is not even possible to rank the
data, as in the case of a nominal variable, the
mode may be the only measure available.
Summary
1. The purpose of central tendency is to determine the single value
that best represents the entire distribution of scores. The three
standard measures of central tendency are the mode, the median,
and the mean.
2. The mean is the arithmetic average. It is computed by summing all
the scores and then dividing by the number of scores. Conceptually,
the mean is obtained by dividing the total (ΣX) equally among the
number of individuals (N or n). Although the calculation is the same
for a population or a sample mean, a population mean is identified
by the symbol μ and a sample mean is identified by X̄.
3. Changing any score in the distribution will cause the mean to be
changed. When a constant value is added to (or subtracted from)
every score in a distribution, the same constant value is added to
(or subtracted from) the mean. If every score is multiplied by a
constant, the mean will be multiplied by the same constant. In
nearly all circumstances, the mean is the best representative value
and is the preferred measure of central tendency.
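The shift and scale properties in point 3 can be checked numerically. The following is a minimal sketch; the scores and the constants 10 and 2 are illustrative assumptions, not values taken from the summary.

```python
from statistics import mean

# Illustrative scores (assumed for this sketch).
scores = [3, 12, 4, 6, 1, 4, 2, 5, 8]
base = mean(scores)

# Adding a constant to every score adds the same constant to the mean.
shifted = mean([x + 10 for x in scores])

# Multiplying every score by a constant multiplies the mean by that constant.
scaled = mean([x * 2 for x in scores])

print("mean:", base)
print("mean after adding 10:", shifted)
print("mean after doubling:", scaled)
```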
Summary
1. The median is the value that divides a distribution exactly in half.
The median is the preferred measure of central tendency when
a distribution has a few extreme scores that displace the value
of the mean. The median also is used when there are
undetermined (infinite) scores that make it impossible to
compute a mean.
2. The mode is the most frequently occurring score in a
distribution. It is easily located by finding the peak in a
frequency distribution graph. For data measured on a nominal
scale, the mode is the appropriate measure of central tendency.
It is possible for a distribution to have more than one mode.
3. For symmetrical distributions, the mean will equal the median.
If there is only one mode, then it will have the same value, too.
4. For skewed distributions, the mode will be located toward the
side where the scores pile up, and the mean will be pulled
toward the extreme scores in the tail. The median will be
located between these two values.
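As a quick illustration of point 4, the sketch below uses a small made-up sample with a long right tail (the values are an assumption chosen for illustration). For this sample the mode sits where the scores pile up, the mean is pulled toward the tail, and the median falls between them.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed sample: most scores are low, a few are large.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

sample_mode = mode(scores)
sample_median = median(scores)
sample_mean = mean(scores)

print("mode:", sample_mode)
print("median:", sample_median)
print("mean:", sample_mean)
```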
Arithmetic Mean
The arithmetic mean is the sum of a set of
observations, positive, negative or zero,
divided by the number of observations. If we
have n real numbers $x_1, x_2, x_3, \ldots, x_n$,
their arithmetic mean, denoted by $\bar{x}$, can be
expressed as:
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Arithmetic Mean of Group Data
If $z_1, z_2, z_3, \ldots, z_k$ are the mid-values and
$f_1, f_2, f_3, \ldots, f_k$ are the corresponding
frequencies, where the subscript k stands for
the number of classes, then the mean is
$$\bar{z} = \frac{\sum_{i=1}^{k} f_i z_i}{\sum_{i=1}^{k} f_i}$$
Geometric Mean
Geometric mean is defined as the positive n-th root of the
product of n observations. Symbolically,
$$G = (x_1 x_2 x_3 \cdots x_n)^{1/n}$$
It is also often used for a set of numbers whose values are
meant to be multiplied together or are exponential in nature,
such as data on the growth of the human population or
interest rates of a financial investment.
Find geometric mean of rate of growth: 34, 27, 45, 55, 22, 34
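The growth-rate exercise can be worked as follows. This is a sketch that treats the six listed rates directly as the values to be averaged, as the exercise states; it computes the n-th root of the product and, as a cross-check, the equivalent form based on base-10 logarithms.

```python
import math

rates = [34, 27, 45, 55, 22, 34]  # rates of growth from the exercise
n = len(rates)

# Direct definition: n-th root of the product of the observations.
g_root = math.prod(rates) ** (1 / n)

# Equivalent form: antilog of the mean of the logarithms.
g_log = 10 ** (sum(math.log10(x) for x in rates) / n)

print("geometric mean (root form):", round(g_root, 2))
print("geometric mean (log form): ", round(g_log, 2))
```

The logarithmic form is usually preferable for long lists, because the raw product can become very large.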
Geometric mean of Group data
If the n non-zero and positive variate-values
$x_1, x_2, \ldots, x_n$ occur $f_1, f_2, \ldots, f_n$ times, respectively,
then the geometric mean of the set of
observations is defined by:
$$G = \left(x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n}\right)^{1/N} = \left(\prod_{i=1}^{n} x_i^{f_i}\right)^{1/N}$$
Where
$$N = \sum_{i=1}^{n} f_i$$
Geometric Mean (Revised Eqn.)
Ungrouped data:
$$G = (x_1 x_2 x_3 \cdots x_n)^{1/n} = \operatorname{antilog}\left[\frac{1}{n}\sum_{i=1}^{n} \log x_i\right]$$
Grouped data:
$$G = \left(x_1^{f_1} x_2^{f_2} x_3^{f_3} \cdots x_n^{f_n}\right)^{1/N} = \operatorname{antilog}\left[\frac{1}{N}\sum_{i=1}^{n} f_i \log x_i\right], \qquad N = \sum_{i=1}^{n} f_i$$
Harmonic Mean
Harmonic mean (formerly sometimes called the
subcontrary mean) is one of several kinds of
average.
Typically, it is appropriate for situations when the
average of rates is desired. The harmonic mean is
the number of observations divided by the sum of the
reciprocals of the observations. It is useful for ratios such
as speed (= distance/time).
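A small worked case shows why the harmonic mean suits rates. The sketch below assumes a hypothetical trip made of two legs of equal length, driven at 40 and 60 km/h; the leg length of 120 km is arbitrary. The harmonic mean of the two speeds agrees with total distance divided by total time, while the arithmetic mean overstates the average speed.

```python
speeds = [40, 60]  # km/h on two legs of equal length (assumed example)
n = len(speeds)

# Harmonic mean: number of observations over the sum of reciprocals.
harmonic = n / sum(1 / v for v in speeds)

# Cross-check from first principles: total distance / total time.
leg = 120  # km per leg; any positive value gives the same average speed
average_speed = (n * leg) / sum(leg / v for v in speeds)

arithmetic = sum(speeds) / n

print("harmonic mean:", round(harmonic, 2))
print("distance / time:", round(average_speed, 2))
print("arithmetic mean:", round(arithmetic, 2))
```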
Harmonic Mean Group Data
The harmonic mean H of the positive real
numbers $x_1, x_2, \ldots, x_n$ is defined to be
Ungrouped data:
$$H = \frac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}}$$
Grouped data, where $x_i$ occurs $f_i$ times and $N = \sum_{i=1}^{n} f_i$:
$$H = \frac{N}{\sum_{i=1}^{n} \dfrac{f_i}{x_i}}$$
Exercise-1: Find the Arithmetic, Geometric and Harmonic Mean
Class    Frequency (f)   Midpoint (x)   fx       f log10 x   f / x
20-29    3               24.5           73.5     4.17        0.1224
30-39    5               34.5           172.5    7.69        0.1449
40-49    20              44.5           890      32.97       0.4494
50-59    10              54.5           545      17.36       0.1835
60-69    5               64.5           322.5    9.05        0.0775
Sum      N = 43                         2003.5   71.24       0.9778
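The exercise can be checked with a few lines of code. The sketch below applies the grouped-data formulas for the three means to the class limits and frequencies in the table, taking each class midpoint as the average of its limits and using base-10 logarithms for the geometric mean. Variable names are illustrative.

```python
import math

# (lower limit, upper limit, frequency) for each class in the exercise table
classes = [(20, 29, 3), (30, 39, 5), (40, 49, 20), (50, 59, 10), (60, 69, 5)]

midpoints = [(low + high) / 2 for low, high, _ in classes]
freqs = [f for _, _, f in classes]
N = sum(freqs)

# Arithmetic mean: sum(f * x) / N
arithmetic = sum(f * x for f, x in zip(freqs, midpoints)) / N

# Geometric mean: antilog of sum(f * log x) / N
geometric = 10 ** (sum(f * math.log10(x) for f, x in zip(freqs, midpoints)) / N)

# Harmonic mean: N / sum(f / x)
harmonic = N / sum(f / x for f, x in zip(freqs, midpoints))

print("N =", N)
print("arithmetic mean:", round(arithmetic, 2))
print("geometric mean: ", round(geometric, 2))
print("harmonic mean:  ", round(harmonic, 2))
```

For positive data the three results always satisfy harmonic ≤ geometric ≤ arithmetic, which is a useful sanity check on hand calculations.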
Weighted Mean
The weighted mean of the positive real numbers $x_1, x_2, \ldots, x_n$
with their weights $w_1, w_2, \ldots, w_n$ is defined to be
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
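A minimal sketch of the weighted mean follows. The function name `weighted_mean` and the marks-and-credits example are assumptions made for illustration only.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: course marks weighted by credit hours.
marks = [70, 80, 90]
credit_hours = [3, 2, 1]
result = weighted_mean(marks, credit_hours)
print("weighted mean:", round(result, 2))
```

With equal weights the function reduces to the ordinary arithmetic mean.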
Median
The median is the middle value of the ordered
observations, such that the number of
observations above it is equal to the number
of observations below it. With the observations arranged in
ascending order as $X_1 \le X_2 \le \cdots \le X_n$:
$$M_e = X_{\frac{1}{2}(n+1)} \quad \text{if } n \text{ is odd}$$
$$M_e = \frac{1}{2}\left[X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right] \quad \text{if } n \text{ is even}$$
Median of Group Data
$$M_e = L_0 + \frac{h}{f_0}\left(\frac{n}{2} - F\right)$$
$L_0$ = Lower class boundary of the median class
h = Width of the median class
$f_0$ = Frequency of the median class
F = Cumulative frequency of the pre-median class
n = Total frequency (number of observations), also written N
Steps to find Median of group data
1. Compute the less-than type cumulative frequencies.
2. Determine N/2, one-half of the total number of cases.
3. Locate the median class: the first class whose cumulative frequency is
at least N/2.
4. Determine the lower class boundary of the median class. This is $L_0$.
5. Sum the frequencies of all classes prior to the median class.
This is F.
6. Determine the frequency of the median class. This is $f_0$.
7. Determine the class width of the median class. This is h.
Example-3: Find Median
Age in years   Number of births   Cumulative number of births
14.5-19.5      677                677
19.5-24.5      1908               2585
24.5-29.5      1737               4322
29.5-34.5      1040               5362
34.5-39.5      294                5656
39.5-44.5      91                 5747
44.5-49.5      16                 5763
All ages       5763               -
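The steps for the median of grouped data can be sketched as a short function and applied to the births table. The function name `grouped_median` is hypothetical; the logic follows the formula L0 + (h / f0)(N/2 - F), taking the median class as the first class whose cumulative frequency reaches N/2.

```python
def grouped_median(boundaries, freqs):
    """Median of grouped data; boundaries is a list of (lower, upper) class boundaries."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("the frequency table is empty")
    half = total / 2
    cumulative = 0  # F: cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= half:
            width = upper - lower  # h: width of the median class
            return lower + width / f * (half - cumulative)
        cumulative += f
    raise ValueError("median class not found")


age_classes = [(14.5, 19.5), (19.5, 24.5), (24.5, 29.5), (29.5, 34.5),
               (34.5, 39.5), (39.5, 44.5), (44.5, 49.5)]
births = [677, 1908, 1737, 1040, 294, 91, 16]

median_age = grouped_median(age_classes, births)
print("median age of mother:", round(median_age, 2))
```

Here N/2 = 2881.5 falls in the 24.5-29.5 class, so L0 = 24.5, h = 5, f0 = 1737 and F = 2585.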
Mode
Mode is the value of a distribution for which the
frequency is maximum. In other words, mode is the
value of a variable, which occurs with the highest
frequency.
So the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. The
mode is not necessarily unique: the list (1, 2, 2,
3, 3, 5) has two modes, 2 and 3.
Example-2: Find Mean, Median and Mode of Ungrouped Data
The weekly pocket money for 9 first year pupils was
found to be:
3 , 12 , 4 , 6 , 1 , 4 , 2 , 5 , 8
Mean = 5
Mode = 4
Median = 4
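The three answers can be confirmed with Python's standard `statistics` module. This is only a check of the example, not part of the original exercise; `multimode` is used because it returns every mode when there is a tie.

```python
from statistics import mean, median, multimode

pocket_money = [3, 12, 4, 6, 1, 4, 2, 5, 8]

money_mean = mean(pocket_money)
money_median = median(pocket_money)
money_modes = multimode(pocket_money)

print("mean:", money_mean)
print("median:", money_median)
print("mode(s):", money_modes)

# A list with two equally frequent values has two modes.
two_modes = multimode([1, 2, 2, 3, 3, 5])
print("modes of (1, 2, 2, 3, 3, 5):", two_modes)
```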
Mode of Group Data
$$M_0 = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2}\, h$$
$L_1$ = Lower boundary of modal class
$\Delta_1$ = difference of frequency between
modal class and class before it
$\Delta_2$ = difference of frequency between
modal class and class after it
h = class interval
Steps of Finding Mode
Find the modal class, which has the highest frequency
$L_1$ = Lower class boundary of modal class
h = Interval (width) of modal class
$\Delta_1$ = difference of frequency of modal
class and class before modal class
$\Delta_2$ = difference of frequency of modal class and
class after modal class
Example-4: Find Mode
Slope Angle (°)   Midpoint (x)   Frequency (f)   Midpoint × frequency (fx)
0-4               2              6               12
5-9               7              12              84
10-14             12             7               84
15-19             17             5               85
20-24             22             0               0
Total                            n = 30          Σ(fx) = 265
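The mode of this table can be found with the grouped-data formula. The sketch below is one possible implementation; the function name `grouped_mode` is hypothetical, and the class boundaries are an assumption obtained by extending each class limit by 0.5 (so 5-9 becomes 4.5-9.5), since the table lists only class limits.

```python
def grouped_mode(boundaries, freqs):
    """Mode of grouped data; boundaries is a list of (lower, upper) class boundaries."""
    peak = max(freqs)
    if peak == 0:
        raise ValueError("all class frequencies are zero")
    k = freqs.index(peak)  # first class with the highest frequency
    lower, upper = boundaries[k]
    before = freqs[k - 1] if k > 0 else 0
    after = freqs[k + 1] if k + 1 < len(freqs) else 0
    d1 = peak - before  # difference from the class before the modal class
    d2 = peak - after   # difference from the class after the modal class
    return lower + d1 / (d1 + d2) * (upper - lower)


# Assumed class boundaries for the limits 0-4, 5-9, 10-14, 15-19, 20-24.
slope_classes = [(-0.5, 4.5), (4.5, 9.5), (9.5, 14.5), (14.5, 19.5), (19.5, 24.5)]
slope_freqs = [6, 12, 7, 5, 0]

slope_mode = grouped_mode(slope_classes, slope_freqs)
print("modal slope angle:", round(slope_mode, 2))
```

The modal class is 5-9, with differences 12 - 6 = 6 and 12 - 7 = 5, so the mode lies a little past the middle of that class.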
Measures of Central Tendency
Consider the Measurements and Frequency Table
87, 85, 79, 75, 81, 88, 92, 86, 77, 72, 75, 77, 81, 80, 77,
73, 69, 71, 76, 79, 83, 81, 78, 75, 68, 67, 71, 73, 78, 75,
84, 81, 79, 82, 87, 89, 85, 81, 79, 77, 81, 78, 74, 76, 82,
85, 86, 81, 72, 69, 65, 71, 73, 78, 81, 77, 74, 77, 72, 68
Class         Class Midpoint   Total   Relative Frequency
64.5 - 69.5   67               6       0.100
69.5 - 74.5   72               11      0.183
74.5 - 79.5   77               20      0.333
79.5 - 84.5   82               13      0.217
84.5 - 89.5   87               9       0.150
89.5 - 94.5   92               1       0.0167
Measures of Central Tendency
For the 60 temperature readings in this population we obtain:
87, 85, 79, 75, 81, 88, 92, 86, 77, 72, 75, 77, 81, 80, 77,
73, 69, 71, 76, 79, 83, 81, 78, 75, 68, 67, 71, 73, 78, 75,
84, 81, 79, 82, 87, 89, 85, 81, 79, 77, 81, 78, 74, 76, 82,
85, 86, 81, 72, 69, 65, 71, 73, 78, 81, 77, 74, 77, 72, 68
μ = (87 + 85 + 79 + ... + 72 + 68)/60 = 4673/60 = 77.883
Measures of Central Tendency
A third measure of central tendency is the median
The median of a population of size N is found by
1. Arranging the individual measurements in ascending order, and
2. If N is odd, selecting the value in the middle of this list as the median (there
will be the same number of values above and below the median)
3. If N is even, finding the values at positions N/2 and N/2 + 1 in this list (call them
$x_{N/2}$ and $x_{N/2+1}$) and letting the median be given by the formula
median = $(x_{N/2} + x_{N/2+1})/2$, the value halfway between these two measurements.
Note! When N is even the median will usually not be an actual value in the
population
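The procedure translates directly into code. The following is a minimal sketch; the function name `median_of` is illustrative, and Python's zero-based indexing means that positions N/2 and N/2 + 1 become indices N/2 - 1 and N/2.

```python
def median_of(values):
    """Median by sorting, then taking the middle value or the average of the middle two."""
    ordered = sorted(values)  # step 1: ascending order
    size = len(ordered)
    if size == 0:
        raise ValueError("median of an empty list is undefined")
    middle = size // 2
    if size % 2 == 1:
        return ordered[middle]  # step 2: N odd, middle value
    # step 3: N even, halfway between positions N/2 and N/2 + 1
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_case = median_of([3, 1, 2])
even_case = median_of([1, 2, 3, 4])
print("odd N:", odd_case)
print("even N:", even_case)
```

The even case returns 2.5, a value that does not occur in the list, which is the situation the note describes.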
Measures of Central Tendency
We now find the median of the population of temperature readings
87, 85, 79, 75, 81, 88, 92, 86, 77, 72, 75, 77, 81, 80, 77,
73, 69, 71, 76, 79, 83, 81, 78, 75, 68, 67, 71, 73, 78, 75,
84, 81, 79, 82, 87, 89, 85, 81, 79, 77, 81, 78, 74, 76, 82,
85, 86, 81, 72, 69, 65, 71, 73, 78, 81, 77, 74, 77, 72, 68
Arrange these 60 measurements in ascending order
65, 67, 68, 68, 69, 69, 71, 71, 71, 72, 72, 72, 73, 73, 73, 74, 74, 75, 75, 75,
75, 76, 76, 77, 77, 77, 77, 77, 77, 78, 78, 78, 78, 79, 79, 79, 79, 80, 81, 81,
81, 81, 81, 81, 81, 81, 82, 82, 83, 84, 85, 85, 85, 86, 86, 87, 87, 88, 89, 92
Since N/2 = 30 and both the 30th and 31st values in the list are the same, we obtain
median = 78
Measures of Central Tendency
One further parameter of a population that may give some indication of central
tendency of the data is the mode
Define: mode = most frequently occurring value in the
population
From the previous data we see:
65, 67, 68, 68, 69, 69, 71, 71, 71, 72, 72, 72, 73, 73, 73, 74, 74, 75, 75, 75, 75,
76, 76, 77, 77, 77, 77, 77, 77, 78, 78, 78, 78, 79, 79, 79, 79, 80, 81, 81, 81, 81,
81, 81, 81, 81, 82, 82, 83, 84, 85, 85, 85, 86, 86, 87, 87, 88, 89, 92
that the value 81 occurs 8 times, so mode = 81
Note! If two different values were to occur most frequently, the distribution would be
bimodal. A distribution may be multi-modal.
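The summary values for the 60 temperature readings can be recomputed from the raw list with the standard `statistics` module. This sketch is offered as a cross-check; the midrange is taken as the average of the smallest and largest readings.

```python
from statistics import mean, median, multimode

readings = [
    87, 85, 79, 75, 81, 88, 92, 86, 77, 72, 75, 77, 81, 80, 77,
    73, 69, 71, 76, 79, 83, 81, 78, 75, 68, 67, 71, 73, 78, 75,
    84, 81, 79, 82, 87, 89, 85, 81, 79, 77, 81, 78, 74, 76, 82,
    85, 86, 81, 72, 69, 65, 71, 73, 78, 81, 77, 74, 77, 72, 68,
]

count = len(readings)
total = sum(readings)
temp_mean = mean(readings)
temp_median = median(readings)
temp_modes = multimode(readings)          # every most-frequent value
midrange = (min(readings) + max(readings)) / 2

print("N =", count, "sum =", total)
print("mean:", round(temp_mean, 3))
print("median:", temp_median)
print("mode(s):", temp_modes)
print("midrange:", midrange)
```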
Measures of Central Tendency
Next we show where each of these parameters occur in the frequency
distribution graph for this tabulated data.
[Figure: frequency distribution graph of the tabulated data. Vertical axis: Frequency % (0 to 42); horizontal axis: Temperature (class midpoints 67, 72, 77, 82, 87, 92). The mean, the median (78), the midrange (78.5) and the mode (81) are marked on the graph.]
Measures of Central Tendency
From the table we obtain
Class         Class Midpoint (x)   Total (f)   Relative Frequency   f*x
64.5 - 69.5   67                   6           0.100                402
69.5 - 74.5   72                   11          0.183                792
74.5 - 79.5   77                   20          0.333                1540
79.5 - 84.5   82                   13          0.217                1066
84.5 - 89.5   87                   9           0.150                783
89.5 - 94.5   92                   1           0.0167               92
Sum                                60                               4675
$\mu = \sum_i (f_i x_i) / \sum_i f_i$ = 4675/60 = 77.917
The small discrepancy between these two values for the mean is due to the
way the data is accumulated into classes. The mean of the raw data is more
accurate; the mean of the tabulated data is often more convenient to obtain.
Numerical Data Properties & Measures
Central Tendency: Mean, Median, Mode
Variation: Range, Variance, Standard Deviation
Shape: Skew