Summarizing Data
Numerical Summary Measures
Diriba D. (MPH-Epidemiology)
Tel: 0917793542
E-mail: debaba.tolosa@gmail.com
August, 2022
04/19/2023 1
Numerical Summary Measures
MCT…
Pros and cons of mean
Exercise 1
Calculate the mean of the following data
1 5 4 3 2 6 7 8 8 10
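As a quick check on Exercise 1, the arithmetic mean (the sum of the observations divided by their number) can be computed as in the sketch below. The variable names are illustrative and not part of the original slides.

```python
from statistics import mean

# Data from Exercise 1
values = [1, 5, 4, 3, 2, 6, 7, 8, 8, 10]

# Arithmetic mean: sum of the observations divided by how many there are
manual_mean = sum(values) / len(values)

# The standard library function should agree with the manual calculation
library_mean = mean(values)

print(manual_mean)  # 5.4
```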
2. Median
• The median is the value which divides the data set into
two equal parts.
• If the number of values is odd, the median will be the
middle value when all values are arranged in order of
magnitude.
• When the number of observations is even, there is no
single middle value but two middle observations.
• In this case, the median is the mean of these two middle
observations, when all observations have been arranged
in the order of their magnitude.
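The odd/even rule described above can be sketched in a few lines. The function name `median_by_rule` and the sample values are hypothetical, chosen only for illustration, and the sketch assumes a non-empty list.

```python
from statistics import median

def median_by_rule(values):
    """Median using the odd/even rule: sort, then take the middle value(s)."""
    ordered = sorted(values)      # arrange in order of magnitude
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]    # odd count: the single middle value
    # even count: the mean of the two middle observations
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_sample = [3, 1, 4, 2, 5]   # illustrative values, odd count
even_sample = [3, 1, 4, 2]     # illustrative values, even count

print(median_by_rule(odd_sample))   # 3
print(median_by_rule(even_sample))  # 2.5
```

The standard library function `statistics.median` follows the same rule and can be used to cross-check the result.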
Solution
Properties of the median
Pros and Cons of median
Exercise 2
What will be the median of the following data ?
A) 7 ,2, 5,9,10,12,16
B) 8,7,2,5,12,16
4. Mode
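The slide text for this heading is not available here, so the sketch below assumes the usual definition: the mode is the value (or values) that occurs most frequently. It reuses the data from Exercise 1; the variable names are illustrative.

```python
from collections import Counter
from statistics import multimode

values = [1, 5, 4, 3, 2, 6, 7, 8, 8, 10]  # data from Exercise 1

# Count how often each value occurs, then keep the most frequent one(s)
counts = Counter(values)
highest = max(counts.values())
modes = [value for value, count in counts.items() if count == highest]

print(modes)  # [8]
```

A data set can have more than one mode; `statistics.multimode` returns all of them as a list, which is why the manual version also builds a list.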
Pros and Cons of Mode
Which MCT is best for a given set of data?
Measures of Dispersion
Range (R)
Example
Properties of range
Percentiles and quartiles
• Just as the median is the value above and below which
half of the data lie, one can define measures above or
below which other fractions of the data lie.
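As an illustration of such cut points, the sketch below computes the three quartiles of data set A from Exercise 2 with Python's `statistics.quantiles`. Several conventions exist for locating quartiles, and the default "exclusive" method used here is an assumption; it may differ from the method used in the slides.

```python
from statistics import median, quantiles

data = [7, 2, 5, 9, 10, 12, 16]  # data set A from Exercise 2

# n=4 requests the three cut points that split the data into four parts;
# the default method is "exclusive"
q1, q2, q3 = quantiles(data, n=4)

print(q1, q2, q3)  # 5.0 9.0 12.0
```

The second quartile coincides with the median, which is the cut point for one half of the data. Other percentiles can be requested the same way, for example `quantiles(data, n=100)` for the 99 percentile cut points.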
Quartiles
Inter-Quartile Range (IQR)
Properties of IQR:
• It is a simple measure
• It encloses the central 50% of the observations
• Since it excludes the lowest 25% and the highest 25%
of the values, it is not affected by outliers
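The effect of an outlier on the range and on the IQR can be compared with a small sketch. It assumes IQR = Q3 - Q1 with quartiles taken from `statistics.quantiles` (default "exclusive" method, which may differ from the slides' convention). The helper names and the modified data set are hypothetical.

```python
from statistics import quantiles

def data_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

def iqr(values):
    """Inter-quartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

clean = [2, 5, 7, 9, 10, 12, 16]          # data set A from Exercise 2, sorted
with_outlier = [2, 5, 7, 9, 10, 12, 160]  # hypothetical: largest value made extreme

print(data_range(clean), iqr(clean))                # 14 7.0
print(data_range(with_outlier), iqr(with_outlier))  # 158 7.0
```

In this example the range grows from 14 to 158 when the extreme value is introduced, while the IQR stays at 7.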
Variance (σ², S²)
• The variance is the average of the squared
deviations of the individual values from the
mean of the set.
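The definition above can be followed step by step in code. The values are illustrative. Note one assumption: dividing by n gives the population variance, while many texts divide by n - 1 for a sample variance (S²); the slides' own formula is not shown here, so both are computed.

```python
from statistics import pvariance, variance

values = [2, 4, 6, 8, 10]  # illustrative values

n = len(values)
mean_value = sum(values) / n                                  # 6.0
squared_deviations = [(x - mean_value) ** 2 for x in values]  # 16, 4, 0, 4, 16

population_variance = sum(squared_deviations) / n        # divide by n
sample_variance = sum(squared_deviations) / (n - 1)      # divide by n - 1

print(population_variance)  # 8.0
print(sample_variance)      # 10.0
```

The standard library functions `statistics.pvariance` and `statistics.variance` correspond to the two divisors and can be used to cross-check the manual calculation.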
• Variance is used to measure the dispersion of
values relative to the mean.
• When values are close to their mean (narrow
range), the dispersion is smaller than when they
are scattered over a wide range.
• The main disadvantage of variance is that its unit is the
square of the unit of the original measurement values
• The variance of a distribution of weights is not expressed
in kg, but in kg²
Weight = 36.5 kg, S² = 257 kg²
• The variance gives more weight to extreme values than
to values near the mean, because the deviations are
squared.
Standard deviation (σ, s)
• The standard deviation is based on deviations from
the mean of the data.
• It is the square root of the variance.
• This produces a measure on the same scale
as the individual values.
• It is the most commonly used measure of dispersion
• It shows variation about the mean
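The relationship between variance and standard deviation can be sketched as follows. The weights are illustrative, and the sample (n - 1) form of the variance is an assumption.

```python
import math
from statistics import stdev, variance

weights_kg = [50, 60, 70, 80, 90]  # illustrative weights in kg

s2 = variance(weights_kg)  # sample variance, in kg squared
s = math.sqrt(s2)          # standard deviation, back in kg

print(s2)           # 250
print(round(s, 2))  # 15.81
```

Taking the square root returns the measure to the original unit (kg), which is what makes the standard deviation easier to interpret than the variance.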
Example 2
Properties of SD
Coefficient of variation (CV)
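The slide text for this heading is not available here, so the sketch below assumes the usual definition: the CV is the standard deviation expressed as a percentage of the mean, CV = (s / mean) × 100. The function name and both data sets are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV as a percentage: (sample standard deviation / mean) * 100."""
    return stdev(values) / mean(values) * 100

weights_kg = [50, 60, 70, 80, 90]        # illustrative weights
heights_cm = [150, 160, 170, 180, 190]   # illustrative heights

cv_weight = coefficient_of_variation(weights_kg)
cv_height = coefficient_of_variation(heights_cm)

print(round(cv_weight, 1))  # 22.6
print(round(cv_height, 1))  # 9.3
```

Because the CV has no unit, it allows the variability of measurements in different units (here kg and cm) to be compared directly. Both samples have the same standard deviation, but the weights vary more relative to their mean.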
Example
“Give a man a fish, and you feed him for a day. Teach a
man to fish, and you feed him for a lifetime.”
Thank you