
WFM 5201: Data Management and Statistical Analysis Dr. Akm Saiful Islam


WFM 5201: Data Management and
Statistical Analysis
Akm Saiful Islam
Lecture-1: Descriptive Statistics
[Measures of central tendency]
April, 2008
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
Descriptive Statistics
Measures of Central Tendency
Measures of Location
Measures of Dispersion
Measures of Symmetry
Measures of Peakedness

Measures of Central Tendency
The central tendency is measured by
averages. These describe the point about
which the various observed values cluster.

In mathematics, an average, or measure of central
tendency, of a data set is a measure of the
"middle" or "expected" value of the data set.
Measures of Central Tendency

Arithmetic Mean
Geometric Mean
Weighted Mean
Harmonic Mean
Median
Mode

Arithmetic Mean

The arithmetic mean is the sum of a set of observations (positive, negative or zero) divided by the number of observations. If we have n real numbers $x_1, x_2, x_3, \ldots, x_n$, their arithmetic mean, denoted by $\bar{x}$, can be expressed as:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Arithmetic Mean of Grouped Data
If $z_1, z_2, z_3, \ldots, z_k$ are the mid-values and $f_1, f_2, f_3, \ldots, f_k$ are the corresponding frequencies, where the subscript k stands for the number of classes, then the mean is

$$\bar{z} = \frac{\sum f_i z_i}{\sum f_i}$$
Geometric Mean
The geometric mean is defined as the positive n-th root of the product of the n observations. Symbolically,

$$G = (x_1 x_2 x_3 \cdots x_n)^{1/n}$$

It is often used for a set of numbers whose values are meant to be multiplied together or are exponential in nature, such as data on the growth of the human population or interest rates of a financial investment.

Find the geometric mean of the rates of growth: 34, 27, 45, 55, 22, 34
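The exercise above can be checked with a short script. This is a minimal sketch rather than a prescribed method: the function name `geometric_mean` is an illustrative choice, and it uses the logarithmic form of the formula, which avoids multiplying many numbers together. The result should come out at roughly 34.5.

```python
import math

def geometric_mean(values):
    """Return the n-th root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    # Averaging base-10 logarithms and taking the antilog is equivalent to
    # taking the n-th root of the product, but is safer for long lists.
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log

# Growth rates taken from the exercise on this slide.
growth_rates = [34, 27, 45, 55, 22, 34]

g = geometric_mean(growth_rates)
# Direct form of the definition, for comparison (math.prod needs Python 3.8+).
direct = math.prod(growth_rates) ** (1 / len(growth_rates))

print(round(g, 2), round(direct, 2))
```

Both forms agree up to floating-point rounding, which is a quick sanity check on the definition.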
Geometric Mean of Grouped Data
If the n non-zero and positive variate-values $x_1, x_2, \ldots, x_n$ occur $f_1, f_2, \ldots, f_n$ times, respectively, then the geometric mean of the set of observations is defined by:

$$G = \left[x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n}\right]^{1/N} = \left[\prod_{i=1}^{n} x_i^{f_i}\right]^{1/N}$$

where $N = \sum_{i=1}^{n} f_i$.
Geometric Mean (Revised Eqn.)
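Ungrouped data:

$$G = (x_1 x_2 x_3 \cdots x_n)^{1/n}$$

$$G = \operatorname{AntiLog}\left(\frac{1}{n}\sum_{i=1}^{n} \operatorname{Log} x_i\right)$$

Grouped data:

$$G = \left(x_1^{f_1} x_2^{f_2} x_3^{f_3} \cdots x_n^{f_n}\right)^{1/N}$$

$$G = \operatorname{AntiLog}\left(\frac{1}{N}\sum_{i=1}^{n} f_i \operatorname{Log} x_i\right), \quad N = \sum_{i=1}^{n} f_i$$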
Harmonic Mean
The harmonic mean (formerly sometimes called the subcontrary mean) is one of several kinds of average.

It is typically appropriate when the average of rates is desired. The harmonic mean is the number of observations divided by the sum of the reciprocals of the observations. It is useful for ratios such as speed (= distance/time).
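To illustrate why the harmonic mean suits rates, here is a small sketch. The trip data is hypothetical (not from the lecture), and `harmonic_mean` is an illustrative name: the same distance is covered once at 40 km/h and once at 60 km/h, and the average speed over the whole trip is the harmonic mean, not the arithmetic mean.

```python
def harmonic_mean(values):
    """Number of observations divided by the sum of their reciprocals."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty list of positive values")
    return len(values) / sum(1 / v for v in values)

# Hypothetical trip: two legs of equal length at different speeds (km/h).
speeds = [40, 60]

h = harmonic_mean(speeds)                 # true average speed over the trip
arithmetic = sum(speeds) / len(speeds)    # overstates the average speed

print(round(h, 2), arithmetic)
```

For a leg length d, the total time is d/40 + d/60 = d/24, so the average speed is 2d / (d/24) = 48 km/h, which is what the harmonic mean returns; the arithmetic mean gives 50 km/h.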

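Harmonic Mean of Grouped Data
The harmonic mean H of the positive real numbers $x_1, x_2, \ldots, x_n$ is defined to be

Ungrouped data:

$$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$

Grouped data, where $x_i$ occurs with frequency $f_i$ and $N = \sum_{i=1}^{n} f_i$ is the total number of observations:

$$H = \frac{N}{\sum_{i=1}^{n} \frac{f_i}{x_i}}$$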
Exercise-1: Find the Arithmetic, Geometric and Harmonic Mean
Class   Frequency (f)   Mid-value (x)   fx       f Log x   f / x
20-29   3               24.5            73.5     4.17      0.1224
30-39   5               34.5            172.5    7.69      0.1449
40-49   20              44.5            890      32.97     0.4494
50-59   10              54.5            545      17.36     0.1835
60-69   5               64.5            322.5    9.05      0.0775
Sum     N = 43                          2003.5   71.24     0.9778
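The three grouped-data means for this exercise can be recomputed from the mid-values and frequencies alone. The following is a sketch under the grouped-data formulas given earlier (base-10 logarithms for the geometric mean); the function name `grouped_means` is illustrative. The results should come out near 46.6, 45.4 and 44.0 respectively.

```python
import math

# Mid-values and frequencies taken from the Exercise-1 table.
mid_values = [24.5, 34.5, 44.5, 54.5, 64.5]
frequencies = [3, 5, 20, 10, 5]

def grouped_means(x, f):
    """Return (arithmetic, geometric, harmonic) means for grouped data."""
    n_total = sum(f)                                    # N = sum of f
    sum_fx = sum(fi * xi for xi, fi in zip(x, f))       # sum of f * x
    sum_f_log_x = sum(fi * math.log10(xi) for xi, fi in zip(x, f))
    sum_f_over_x = sum(fi / xi for xi, fi in zip(x, f))
    arithmetic = sum_fx / n_total
    geometric = 10 ** (sum_f_log_x / n_total)           # AntiLog of the mean log
    harmonic = n_total / sum_f_over_x
    return arithmetic, geometric, harmonic

a, g, h = grouped_means(mid_values, frequencies)
print(f"A = {a:.2f}, G = {g:.2f}, H = {h:.2f}")
```

For positive data the three means always satisfy H ≤ G ≤ A, which gives a quick check on hand calculations.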
Weighted Mean
The weighted mean of the positive real numbers $x_1, x_2, \ldots, x_n$ with their weights $w_1, w_2, \ldots, w_n$ is defined to be

$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
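A short sketch of the weighted mean follows. The data is hypothetical (three course marks weighted by credit hours) and the names `weighted_mean`, `marks` and `credits` are illustrative, not from the lecture.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Hypothetical data: marks in three courses and their credit hours.
marks = [70, 80, 90]
credits = [3, 3, 4]

print(weighted_mean(marks, credits))     # 81.0
print(weighted_mean(marks, [1, 1, 1]))   # 80.0
```

With equal weights the weighted mean reduces to the ordinary arithmetic mean, as the second call shows.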
Median
The median is the middle value of the observations arranged in order of magnitude, such that the number of observations above it is equal to the number of observations below it. Writing the ordered observations as $X_1 \le X_2 \le \cdots \le X_n$:

If n is odd:

$$M_e = X_{\frac{1}{2}(n+1)}$$

If n is even:

$$M_e = \frac{1}{2}\left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right)$$
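The odd and even cases translate directly into code once the data is sorted. This is a minimal sketch with hypothetical input lists; note that the 1-based subscripts of the formulas become 0-based indices in Python.

```python
def median(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: X_((n+1)/2) in 1-based notation is index n // 2 in Python.
        return ordered[mid]
    # Even n: average of X_(n/2) and X_(n/2 + 1), i.e. indices mid - 1 and mid.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))      # 5
print(median([7, 1, 5, 3]))   # 4.0
```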
Median of Grouped Data

$$M_e = L_0 + \frac{h}{f_0}\left(\frac{N}{2} - F\right)$$

$L_0$ = Lower class boundary of the median class
h = Width of the median class
$f_0$ = Frequency of the median class
F = Cumulative frequency of the pre-median class
N = Total number of observations
Steps to Find the Median of Grouped Data
1. Compute the less-than type cumulative frequencies.
2. Determine N/2, one-half of the total number of cases.
3. Locate the median class: the first class whose cumulative frequency is at least N/2.
4. Determine the lower class boundary of the median class. This is $L_0$.
5. Sum the frequencies of all classes prior to the median class. This is F.
6. Determine the frequency of the median class. This is $f_0$.
7. Determine the class width of the median class. This is h.
Example-3: Find Median
Age in years   Number of births   Cumulative number of births
14.5-19.5      677                677
19.5-24.5      1908               2585
24.5-29.5      1737               4322
29.5-34.5      1040               5362
34.5-39.5      294                5656
39.5-44.5      91                 5747
44.5-49.5      16                 5763
All ages       5763               -
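The steps for the grouped median can be sketched as a small function and applied to the birth data of this example. The function name `grouped_median` and the list layout are illustrative assumptions; the code accumulates the frequencies itself, locates the first class whose cumulative frequency reaches N/2, and then interpolates within that class. For this table the median class is 24.5-29.5 and the median comes out at about 25.35 years.

```python
def grouped_median(boundaries, frequencies):
    """Median of grouped data.

    boundaries  -- list of (lower, upper) class boundaries in ascending order
    frequencies -- class frequencies in the same order
    """
    half = sum(frequencies) / 2            # N / 2
    cumulative = 0                          # F: cumulative frequency so far
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cumulative + f >= half:
            width = upper - lower           # h
            # L0 + (h / f0) * (N/2 - F)
            return lower + width / f * (half - cumulative)
        cumulative += f
    raise ValueError("no observations")

# Class boundaries and numbers of births from Example-3.
age_classes = [(14.5, 19.5), (19.5, 24.5), (24.5, 29.5), (29.5, 34.5),
               (34.5, 39.5), (39.5, 44.5), (44.5, 49.5)]
births = [677, 1908, 1737, 1040, 294, 91, 16]

m = grouped_median(age_classes, births)
print(round(m, 2))
```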
Mode
Mode is the value of a distribution for which the
frequency is maximum. In other words, mode is
the value of a variable, which occurs with the
highest frequency.

For example, the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3.
The mode is not necessarily unique. The
list (1, 2, 2, 3, 3, 5) has the two modes 2 and 3.
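Because a list can have more than one mode, it is convenient to return all of them. The sketch below counts occurrences with `collections.Counter` from the Python standard library; the function name `modes` is an illustrative choice.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency, in sorted order."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 2, 3, 3, 3, 4]))   # [3]
print(modes([1, 2, 2, 3, 3, 5]))      # [2, 3]
```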
Example-2: Find Mean, Median and Mode of Ungrouped Data
The weekly pocket money for 9 first year pupils was found to be:

3, 12, 4, 6, 1, 4, 2, 5, 8

Mean = 5
Median = 4
Mode = 4
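These three answers can be cross-checked with Python's standard `statistics` module. This is only a verification sketch; the variable name `pocket_money` is illustrative.

```python
import statistics

# Weekly pocket money of the 9 pupils from Example-2.
pocket_money = [3, 12, 4, 6, 1, 4, 2, 5, 8]

mean_value = statistics.mean(pocket_money)      # sum 45 divided by 9
median_value = statistics.median(pocket_money)  # 5th value of the sorted list
mode_value = statistics.mode(pocket_money)      # 4 is the only repeated value

print(mean_value, median_value, mode_value)
```

The sorted data is 1, 2, 3, 4, 4, 5, 6, 8, 12, so the middle (fifth) value and the most frequent value are both 4.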
Mode of Grouped Data

$$M_0 = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2}\, h$$

$L_1$ = Lower boundary of the modal class
$\Delta_1$ = Difference in frequency between the modal class and the class before it
$\Delta_2$ = Difference in frequency between the modal class and the class after it
h = Class interval
Steps of Finding Mode
Find the modal class, which has the highest frequency
$L_1$ = Lower class boundary of the modal class
h = Interval of the modal class
$\Delta_1$ = Difference in frequency between the modal class and the class before the modal class
$\Delta_2$ = Difference in frequency between the modal class and the class after the modal class

Example-4: Find Mode
Slope angle (°)   Midpoint (x)   Frequency (f)   Midpoint × frequency (fx)
0-4               2              6               12
5-9               7              12              84
10-14             12             7               84
15-19             17             5               85
20-24             22             0               0
Total                            n = 30          Σ(fx) = 265
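The grouped-mode formula can be applied to this table as follows. This is a sketch under one loudly stated assumption: the class limits 0-4, 5-9, ... are converted to class boundaries at the half-way points (-0.5, 4.5, 9.5, ...), so the modal class 5-9 has lower boundary 4.5 and width 5. The slides do not state the boundaries, and using the printed lower limit 5 instead would shift the answer by 0.5. The function name `grouped_mode` is illustrative.

```python
def grouped_mode(lower_boundaries, frequencies, width):
    """Mode of grouped data: L1 + d1 / (d1 + d2) * h for the modal class."""
    # Index of the (first) class with the highest frequency.
    k = max(range(len(frequencies)), key=lambda i: frequencies[i])
    # A missing neighbour at either end is treated as frequency 0.
    before = frequencies[k - 1] if k > 0 else 0
    after = frequencies[k + 1] if k + 1 < len(frequencies) else 0
    d1 = frequencies[k] - before
    d2 = frequencies[k] - after
    if d1 + d2 == 0:
        raise ValueError("flat distribution has no single mode")
    return lower_boundaries[k] + d1 / (d1 + d2) * width

# Frequencies from Example-4; boundaries are ASSUMED at the half-way points.
lower_boundaries = [-0.5, 4.5, 9.5, 14.5, 19.5]
frequencies = [6, 12, 7, 5, 0]

mode_angle = grouped_mode(lower_boundaries, frequencies, width=5)
print(round(mode_angle, 2))
```

Here the differences are 12 - 6 = 6 and 12 - 7 = 5, so the mode is 4.5 + (6/11) × 5, or about 7.2 degrees under the stated assumption.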