ENGINEERING APPLICATIONS
Measures of Dispersions
Range, Inter-quartile range
Average deviation or Mean deviation
Standard deviation and Variance
Arithmetic Mean
Given a set of n observations $x_1, x_2, \ldots, x_n$, the arithmetic mean is defined as the sum of
the observations divided by the sample size. It is given by
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
In the case of an ungrouped frequency distribution where the value $x_i$ occurs with frequency
$f_i$, it is given by
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i \quad \text{where } N = \sum_{i=1}^{n} f_i$$
In the case of a grouped frequency distribution, $x_i$ is taken as the mid-point of the i-th class
interval.
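The frequency-weighted mean above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names `frequency_mean` and `class_midpoints` and the sample numbers are assumptions chosen for the example.

```python
def frequency_mean(values, freqs):
    """Arithmetic mean of a frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    n_total = sum(freqs)  # N, the total frequency
    if n_total <= 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for x, f in zip(values, freqs)) / n_total


def class_midpoints(class_limits):
    """Mid-points of (lower, upper) class intervals, used as x_i for grouped data."""
    return [(lower + upper) / 2 for lower, upper in class_limits]


# Hypothetical ungrouped data: value 1 once, value 2 twice, value 3 once.
print(frequency_mean([1, 2, 3], [1, 2, 1]))  # 2.0

# Hypothetical grouped data: classes 0-10 and 10-20 with frequencies 3 and 1.
mids = class_midpoints([(0, 10), (10, 20)])  # [5.0, 15.0]
print(frequency_mean(mids, [3, 1]))  # 7.5
```

For grouped data the result is an approximation, since every observation in a class is treated as if it sat at the class mid-point.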
Property – 1: The algebraic sum of the deviations of a set of observations from their arithmetic
mean is zero, that is, for the frequency distribution $x_i/f_i$, i = 1, 2, …, n,
$$\sum_{i=1}^{n} f_i (x_i - \bar{x}) = 0 \quad \text{where } \bar{x} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i$$
Proof:
$$\sum_{i=1}^{n} f_i (x_i - \bar{x}) = \sum_{i=1}^{n} f_i x_i - \bar{x}\sum_{i=1}^{n} f_i = N\bar{x} - \bar{x}N = 0$$
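Property 1 can be checked numerically. The sketch below uses exact rational arithmetic (Python's `fractions.Fraction`) so that rounding does not obscure the result; the function name `weighted_deviation_sum` is an assumption, and the data are the die-roll frequencies that appear in Example 1 of these notes.

```python
from fractions import Fraction


def weighted_deviation_sum(values, freqs):
    """Return sum(f_i * (x_i - mean)) using exact rational arithmetic."""
    n_total = sum(freqs)
    mean = Fraction(sum(f * x for x, f in zip(values, freqs)), n_total)
    return sum(f * (x - mean) for x, f in zip(values, freqs))


values = [1, 2, 3, 4, 5, 6]
freqs = [9, 8, 12, 11, 13, 7]
print(weighted_deviation_sum(values, freqs))  # 0
```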
Property – 2: The sum of the squares of the deviations of a set of observations is minimum
about its mean, that is,
$$\sum_{i=1}^{n} f_i (x_i - a)^2 \text{ is minimum at } a = \bar{x}.$$
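Proof: Let
$$S = \sum_{i=1}^{n} f_i (x_i - a)^2, \qquad \frac{dS}{da} = 2\sum_{i=1}^{n} f_i (x_i - a)(-1)$$
$$\frac{dS}{da} = 0 \;\Rightarrow\; \sum_{i=1}^{n} f_i (x_i - a) = 0 \;\Rightarrow\; a = \frac{1}{N}\sum_{i=1}^{n} f_i x_i = \bar{x}$$
Since $\frac{d^2S}{da^2} = 2N > 0$, this stationary point is a minimum.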
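A quick numerical check of Property 2: evaluate S at the mean and at a few nearby points and confirm that the mean gives the smallest value. This is only a sketch; the function name `sum_squared_deviations` and the trial offsets are assumptions, and the data are the die-roll frequencies from Example 1.

```python
def sum_squared_deviations(values, freqs, a):
    """S(a) = sum(f_i * (x_i - a)^2)."""
    return sum(f * (x - a) ** 2 for x, f in zip(values, freqs))


values = [1, 2, 3, 4, 5, 6]
freqs = [9, 8, 12, 11, 13, 7]
mean = sum(f * x for x, f in zip(values, freqs)) / sum(freqs)
s_at_mean = sum_squared_deviations(values, freqs, mean)

# S grows as a moves away from the mean in either direction.
for offset in (-0.5, -0.1, 0.1, 0.5):
    s_shifted = sum_squared_deviations(values, freqs, mean + offset)
    print(offset, s_shifted > s_at_mean)
```

Each line printed should report `True`, because S(a) exceeds S at the mean by N times the squared offset.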
Property – 3: The mean of the combined sample is the weighted mean of the individual
sample means, that is, if $\bar{x}_i$ is the mean of the ith sample of size $n_i$, i = 1, 2, …, k, then the
mean of the combined sample of size $\sum_{i=1}^{k} n_i$ is
$$\bar{x} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$$
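Proof: Let $x_{ij}$ be an observation of the ith sample with frequency $f_{ij}$, where j = 1, 2, …, $n_i$
and i = 1, 2, …, k. Then
$$\bar{x}_i = \frac{\sum_j f_{ij} x_{ij}}{\sum_j f_{ij}} = \frac{\sum_j f_{ij} x_{ij}}{n_i}$$
$$\bar{x} = \frac{\sum_i \sum_j f_{ij} x_{ij}}{\sum_i \sum_j f_{ij}} = \frac{\sum_i n_i \bar{x}_i}{\sum_i n_i}$$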
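Property 3 can be illustrated by comparing the weighted mean of the sample means with the mean of the pooled observations. A minimal sketch follows; the function name `combined_mean` and the two small samples are assumptions made for the example.

```python
def combined_mean(sizes, means):
    """Weighted mean of sample means: sum(n_i * mean_i) / sum(n_i)."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


# Hypothetical samples.
sample_a = [2, 4, 6]  # mean 4.0, size 3
sample_b = [10]       # mean 10.0, size 1

via_property = combined_mean([3, 1], [4.0, 10.0])
pooled = sum(sample_a + sample_b) / len(sample_a + sample_b)
print(via_property, pooled)  # 5.5 5.5
```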
Department of Water Resources and Ocean Engineering, NITK Surathkal 6
MEASURES OF CENTRAL TENDENCY CONTD….
In the case of grouped frequency distribution shifting of origin and scale helps to reduce a
lot of arithmetic while calculating mean, particularly when the class frequencies and class
marks are large enough.
Suppose the origin of x is shifted to a point a, called the assumed mean, and the scale is
made h times the original scale. If u is the new variable, we have
$$u = \frac{x - a}{h} \;\Rightarrow\; x = a + hu$$
$$\bar{x} = \frac{1}{N}\sum f x = \frac{1}{N}\sum f (a + hu) = a + h\bar{u}$$
We can choose a and h as any suitable values depending upon the data given. Normally, a is
the mid-point of the class interval corresponding to which frequency is maximum and h is
taken as the highest common factor of the widths of the class intervals.
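The change of origin and scale described above (often called the step-deviation method) can be sketched as follows. The function name `step_deviation_mean` and the class marks and frequencies are assumptions chosen for illustration; a is taken as the mid-point of the class with the largest frequency, as suggested in the text.

```python
def step_deviation_mean(midpoints, freqs, a, h):
    """Mean via u = (x - a) / h, then x_bar = a + h * u_bar."""
    n_total = sum(freqs)
    u = [(x - a) / h for x in midpoints]
    u_bar = sum(f * ui for ui, f in zip(u, freqs)) / n_total
    return a + h * u_bar


# Hypothetical class marks and frequencies.
midpoints = [10, 20, 30, 40]
freqs = [2, 5, 2, 1]

shortcut = step_deviation_mean(midpoints, freqs, a=20, h=10)
direct = sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)
print(shortcut, direct)  # both are 22 up to floating-point rounding
```

Any choice of a and non-zero h gives the same mean; a good choice only keeps the intermediate numbers small.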
Median
It is that value of the variable which divides the data set into two equal parts when arranged in
increasing (or decreasing) order. Median is thus a positional average.
If n is odd, then the median is the value in position (n+1)/2 but if n is even, it is the average
of the values in position n/2 and (n/2)+1.
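The positional rule above translates directly into code. This is a small sketch with an assumed function name, `median_of`; note that position (n+1)/2 in 1-based counting is index n // 2 in Python's 0-based indexing.

```python
def median_of(data):
    """Median of raw data using the odd/even positional rule."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("data must not be empty")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # position (n + 1) / 2
    return (ordered[mid - 1] + ordered[mid]) / 2  # positions n/2 and n/2 + 1


print(median_of([3, 1, 2]))     # 2
print(median_of([4, 1, 3, 2]))  # 2.5
```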
In the case of a grouped frequency distribution, the median is obtained by the formula
$$\text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - C\right)$$
where l is the lower limit of the median class, f is the frequency of the median class, h is the
width of the median class, N is the total frequency and C is the cumulative frequency of the
class preceding the median class.
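A sketch of the grouped-data median follows. It assumes equal class widths h and takes the median class to be the first class whose cumulative frequency reaches N/2; the function name `grouped_median` and the sample distribution are assumptions for illustration.

```python
def grouped_median(lower_limits, freqs, h):
    """Median = l + (h / f) * (N / 2 - C) for classes of equal width h."""
    n_total = sum(freqs)
    if n_total <= 0:
        raise ValueError("total frequency must be positive")
    half = n_total / 2
    cumulative = 0  # C: cumulative frequency before the current class
    for l, f in zip(lower_limits, freqs):
        if cumulative + f >= half:  # this is the median class
            return l + (h / f) * (half - cumulative)
        cumulative += f
    raise ValueError("median class not found")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40.
print(grouped_median([0, 10, 20, 30], [5, 15, 20, 10], h=10))  # 22.5
```

Here N = 50, so N/2 = 25 falls in the class 20-30, with C = 20 and f = 20.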
Mode
It is the value of the variable that occurs with the greatest frequency. If no single value occurs
most frequently, then all the values that occur with the highest frequency are considered
modal values.
In the case of a grouped frequency distribution, the mode is given by the formula
$$\text{Mode} = l + \frac{h\,(f_m - f_1)}{2f_m - f_1 - f_2}$$
where l is the lower limit of the modal class, $f_m$ is the frequency of the modal class, $f_1$ and $f_2$ are the
frequencies of the pre-modal and post-modal classes, and h is the width of the modal class.
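The grouped-data mode formula can be written as a small function. This is a sketch only: the name `grouped_mode` and the numbers in the usage line are assumptions, and the guard against a zero denominator is an addition not discussed in the notes.

```python
def grouped_mode(l, h, f_m, f_1, f_2):
    """Mode = l + h * (f_m - f_1) / (2 * f_m - f_1 - f_2)."""
    denominator = 2 * f_m - f_1 - f_2
    if denominator == 0:
        raise ValueError("formula undefined when 2*f_m equals f_1 + f_2")
    return l + h * (f_m - f_1) / denominator


# Hypothetical modal class 20-30 with frequency 20, neighbours 15 and 10.
print(round(grouped_mode(l=20, h=10, f_m=20, f_1=15, f_2=10), 2))  # 23.33
```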
Example 1:
The following frequency distribution gives the values obtained in 60 rolls of a die. Find (a)
the mean, (b) the median and (c) the mode.
Value (x)       1   2   3   4   5   6
Frequency (f)   9   8  12  11  13   7
Solution:
(a) The mean is
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i = \frac{212}{60} \approx 3.53$$
(b) The cumulative frequencies of the data values 1, 2, 3, 4, 5 and 6 are respectively 9,
17, 29, 40, 53 and 60. The median is the average of the 30th and 31st values (with the data arranged in
increasing order). Both of these values are 4, so the median is 4.
(c) The mode is 5, the value with the highest frequency (13).
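The three quantities asked for in Example 1 can be recomputed directly from the frequency table. The sketch below expands the table into the 60 individual rolls so that the positional rule for the median can be applied as stated; the variable names are assumptions.

```python
values = [1, 2, 3, 4, 5, 6]
freqs = [9, 8, 12, 11, 13, 7]
n_total = sum(freqs)  # 60 rolls

# (a) Mean: sum(f * x) / N = 212 / 60
mean = sum(f * x for x, f in zip(values, freqs)) / n_total

# (b) Median: average of the 30th and 31st ordered values (indices 29 and 30)
rolls = [x for x, f in zip(values, freqs) for _ in range(f)]
median = (rolls[n_total // 2 - 1] + rolls[n_total // 2]) / 2

# (c) Mode: the value with the largest frequency
mode = values[freqs.index(max(freqs))]

print(round(mean, 2), median, mode)  # 3.53 4.0 5
```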
Example 2:
The following data give the frequencies of serum cholesterol levels of 1000 males aged
between 25 and 35 years who attended a particular city hospital during the past year. Calculate
the mean, median and mode.
Mean:
$$\bar{x} = a + h\bar{u} = 180 + 40 \times \frac{591}{1000} = 203.64 \text{ mg/100 ml}$$
Solution: (b) To calculate median formulate the following table:
Solution: (c) The maximum frequency is 380, so 160 – 200 is the modal class.
Therefore l = 160, $f_m$ = 380, $f_1$ = 145, $f_2$ = 292 and h = 40.
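Substituting these values into the grouped-data mode formula given earlier completes the calculation. The arithmetic below is worked only from the values stated here; the rounded result is not quoted from the original notes.

```latex
\[
\text{Mode} = l + \frac{h\,(f_m - f_1)}{2f_m - f_1 - f_2}
            = 160 + \frac{40\,(380 - 145)}{2(380) - 145 - 292}
            = 160 + \frac{9400}{323}
            \approx 189.10 \ \text{mg/100 ml}
\]
```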
Geometric mean
The geometric mean G of a set of N observations is the Nth root of their product, that is, if
$x_i/f_i$, i = 1, 2, …, n is the frequency distribution, then the geometric mean is
$$G = \left(x_1^{f_1}\, x_2^{f_2} \cdots x_n^{f_n}\right)^{1/N} \quad \text{where } N = \sum_{i=1}^{n} f_i$$
This can be written as
$$\ln G = \frac{1}{N}\sum_{i=1}^{n} f_i \ln x_i$$
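The logarithmic form is the convenient one to compute with, since it avoids forming a very large product. A minimal sketch, assuming strictly positive data values (the logarithm is otherwise undefined); the function name `geometric_mean` and the sample values are assumptions.

```python
import math


def geometric_mean(values, freqs):
    """G = exp((1/N) * sum(f_i * ln(x_i))) for strictly positive x_i."""
    if any(x <= 0 for x in values):
        raise ValueError("all values must be positive")
    n_total = sum(freqs)
    log_g = sum(f * math.log(x) for x, f in zip(values, freqs)) / n_total
    return math.exp(log_g)


# Hypothetical data: 2 and 8, each once; the geometric mean is sqrt(16), about 4.
print(geometric_mean([2, 8], [1, 1]))
```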
Harmonic mean
The harmonic mean H of a set of N non-zero observations is the reciprocal of the arithmetic
mean of the reciprocals of the data values, that is, if $x_i/f_i$, i = 1, 2, …, n is the frequency
distribution, then the harmonic mean is
$$H = \frac{1}{\dfrac{1}{N}\sum_{i=1}^{n} \dfrac{f_i}{x_i}} \quad \text{where } N = \sum_{i=1}^{n} f_i$$
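The harmonic mean formula simplifies to N divided by the sum of $f_i/x_i$, which is how the sketch below computes it. The function name `harmonic_mean` and the sample values are assumptions for illustration.

```python
def harmonic_mean(values, freqs):
    """H = N / sum(f_i / x_i) for non-zero x_i."""
    if any(x == 0 for x in values):
        raise ValueError("all values must be non-zero")
    n_total = sum(freqs)
    return n_total / sum(f / x for x, f in zip(values, freqs))


# Hypothetical data: 40 and 60, each once; the harmonic mean is about 48.
print(harmonic_mean([40, 60], [1, 1]))
```

For the same pair of values the arithmetic mean is 50, so the harmonic mean is the smaller of the two, as expected for positive data that are not all equal.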
SUMMARY
Measures of Dispersions
Range, Inter-quartile range
Average deviation or Mean deviation
Standard deviation and Variance