February, 2024
Addis Ababa, Ethiopia
Dube Jara (Assistant Professor & PhD Candidate), For MPH
Content
Introduction
Numerical Summary Measures
Measures of Central Tendency
Measures of Dispersion
Since a measure of central tendency (MCT) represents the entire data set, it
facilitates comparison within one group or between groups of data
Example
The heart rates for n=10 patients were as follows (beats per minute):
167, 120, 150, 125, 150, 140, 40, 136, 120, 150
What is the arithmetic mean for the heart rate of these patients?
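The calculation asked for above can be sketched in a few lines of Python. This is only an illustration; the helper name `arithmetic_mean` is an assumption, not part of the lecture.

```python
# Heart rates (beats per minute) for the n = 10 patients in the example.
heart_rates = [167, 120, 150, 125, 150, 140, 40, 136, 120, 150]


def arithmetic_mean(values):
    """Return the sum of the observations divided by their number."""
    return sum(values) / len(values)


mean_hr = arithmetic_mean(heart_rates)
print(f"Mean heart rate: {mean_hr:.1f} beats per minute")  # 1298 / 10 = 129.8
```

The single low reading (40) pulls the mean below most of the observed values.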
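In calculating the mean from grouped data, we assume that all values falling into a
particular class interval are located at the midpoint of that class interval. It is
calculated as follows:
$$\bar{x} = \frac{\sum_{i=1}^{k} m_i f_i}{\sum_{i=1}^{k} f_i}$$
where m_i is the midpoint and f_i the frequency of the ith class interval, and k is the number of class intervals.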
k=3 mesokurtic
k<3 platykurtic
k>3 leptokurtic
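As a rough illustration of how k might be obtained, the sketch below uses the moment-based coefficient k = m4 / m2², where m2 and m4 are the second and fourth central moments. That formula is an assumption here, since the slide does not show how k is computed, and the data values are hypothetical.

```python
import math


def kurtosis(values):
    """Moment-based kurtosis: fourth central moment over squared second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2


def classify(k, tol=1e-9):
    """Label a kurtosis value relative to the normal-curve value of 3."""
    if math.isclose(k, 3.0, abs_tol=tol):
        return "mesokurtic"
    return "leptokurtic" if k > 3 else "platykurtic"


illustrative = [1, 2, 3, 4, 5]  # hypothetical values, not from the lecture
k = kurtosis(illustrative)
print(round(k, 2), classify(k))
```

For these evenly spread values k is well below 3, so the set is flatter than a normal curve.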
A few words about the normal curve
Skewness = 0
Kurtosis = 3
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
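The normal density can be evaluated directly from its formula. The sketch below is a minimal illustration; the function name `normal_pdf` and the choice of the standard normal (mean 0, SD 1) are assumptions made for the example.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x for mean mu and SD sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


peak = normal_pdf(0.0)    # the curve is highest at the mean
left = normal_pdf(-1.0)   # one SD below the mean
right = normal_pdf(1.0)   # one SD above the mean
print(round(peak, 4), round(left, 4), round(right, 4))
```

The equal heights one SD on either side of the mean reflect the symmetry of the curve (skewness = 0).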
Characteristics of mean
log GM = Σ(log xi)/n
Example: The geometric mean may be calculated for the following parasite counts per
100 fields of thick films.
7 8 3 14 2 1 440 15 52 6 2 1 1 25
12 6 9 2 1 6 7 3 4 70 20 200 2 50
$$GM = \sqrt[42]{7 \times 8 \times 3 \times \dots \times 1 \times 237}$$
log GM = (1/42)(41.9985)
= 0.9999 ≈ 1.0000
GM = antilog(1.0000) = 10
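The log-based route to the geometric mean can be sketched as follows. The three counts used here are hypothetical values chosen so the arithmetic is easy to follow; they are not the parasite counts from the example, and base-10 logarithms are assumed.

```python
import math


def geometric_mean(values):
    """Antilog (base 10) of the mean of the base-10 logs; values must be > 0."""
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log


counts = [1, 10, 100]        # hypothetical counts for illustration only
gm = geometric_mean(counts)  # logs are 0, 1, 2, so the mean log is 1
am = sum(counts) / len(counts)
print(gm, am)
```

For skewed counts like these, the geometric mean stays much closer to the bulk of the data than the arithmetic mean does.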
Disadvantages:-
Trimmed mean
Total 169
It is an average of position
It is affected more by the number of items than by extreme
values
There is only one median for a given set of data
(uniqueness)
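The resistance of the median to extreme values can be checked with the heart-rate data from the earlier example. In this sketch the second data set, in which the reading of 40 is replaced by 110, is a hypothetical variant introduced only for comparison.

```python
import statistics

heart_rates = [167, 120, 150, 125, 150, 140, 40, 136, 120, 150]
# Hypothetical variant: the extreme low reading (40) is replaced by 110.
adjusted = [110 if x == 40 else x for x in heart_rates]

median_original = statistics.median(heart_rates)
median_adjusted = statistics.median(adjusted)
mean_original = sum(heart_rates) / len(heart_rates)
mean_adjusted = sum(adjusted) / len(adjusted)

print("median:", median_original, "->", median_adjusted)
print("mean:  ", mean_original, "->", mean_adjusted)
```

The median is the same in both sets because the middle positions are unchanged, while the mean shifts with the extreme value.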
[Figure: relative positions of the mean, median, and mode on distribution curves]
The two data sets given above have a mean of 50, but obviously set 1 is
more “spread out” than set 2. How do we express this numerically?
The object of measuring this scatter or dispersion is to obtain a single
summary figure which adequately exhibits whether the distribution is
compact or spread out
Percentiles
Percentiles divide the ordered data into 100 equal parts.
Percentiles are less sensitive to outliers and not greatly affected by the
sample size (n)
Percentiles can be expressed:
P0: The minimum
P25: (25th percentile) 25% of the sample values are less than or equal to this
value. 1st Quartile
P50: 50% of the sample values are less than or equal to this value. 2nd Quartile
P75: 75% of the sample values are less than or equal to this value. 3rd Quartile
P100: The maximum
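The percentiles listed above can be sketched with Python's standard `statistics.quantiles`, applied to the heart-rate data from the earlier example. Several conventions exist for interpolating percentiles; the `"inclusive"` method used here is an assumption, and other software may give slightly different quartiles.

```python
import statistics

heart_rates = [167, 120, 150, 125, 150, 140, 40, 136, 120, 150]

# n=4 splits the ordered data at the three quartiles (P25, P50, P75).
q1, q2, q3 = statistics.quantiles(heart_rates, n=4, method="inclusive")
p0, p100 = min(heart_rates), max(heart_rates)

print("P0  (minimum):     ", p0)
print("P25 (1st quartile):", q1)
print("P50 (2nd quartile):", q2)
print("P75 (3rd quartile):", q3)
print("P100 (maximum):    ", p100)
```

P50 coincides with the median of the data.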
Quintiles and percentiles…
That is, outliers in the data do not affect the interquartile range.
Also, it can be computed when the distribution has open-ended class intervals.
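A small sketch of this robustness, using the heart-rate data from the earlier example: the second data set, in which the largest reading is replaced by 300, is hypothetical, and the `"inclusive"` quartile method is an assumed convention.

```python
import statistics


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def value_range(values):
    """Range: maximum minus minimum."""
    return max(values) - min(values)


heart_rates = [167, 120, 150, 125, 150, 140, 40, 136, 120, 150]
# Hypothetical variant: the largest reading (167) becomes an outlier of 300.
with_outlier = [300 if x == 167 else x for x in heart_rates]

print("IQR:  ", iqr(heart_rates), "->", iqr(with_outlier))
print("Range:", value_range(heart_rates), "->", value_range(with_outlier))
```

The range roughly doubles when the outlier is introduced, while the interquartile range does not move.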
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
where
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
is the population mean.
Example
Following are the survival times of n=11 patients after heart
transplant surgery.
The survival time for the “ith” patient is represented as Xi for i= 1,
…, 11.
Calculate the sample variance and SD.
$$S^2 = \frac{\sum_{i=1}^{k} (m_i - \bar{x})^2 f_i}{\sum_{i=1}^{k} f_i - 1}$$
where
m_i = the mid-point of the ith class interval
f_i = the frequency of the ith class interval
\bar{x} = the sample mean
k = the number of class intervals
Example. Compute the variance and SD of the age of 169 subjects from the grouped data.
Class interval   Mid-point (mi)   Frequency (fi)   (mi-Mean)   (mi-Mean)²   (mi-Mean)² fi
10-19            14.5             4                -19.98      399.20       1596.80
20-29            24.5             66               -9.98       99.60        6573.60
30-39            34.5             47               0.02        0.0004       0.0188
40-49            44.5             36               10.02       100.40       3614.40
50-59            54.5             12               20.02       400.80       4809.60
60-69            64.5             4                30.02       901.20       3604.80
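The grouped-data calculation for the age table can be sketched as below, using the mid-points and frequencies from the table. The function names are assumptions. Because the code carries the mean at full precision instead of rounding it first, its results may differ slightly in the decimals from a hand calculation.

```python
import math

# Mid-points (mi) and frequencies (fi) of the six age classes in the table.
midpoints = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5]
frequencies = [4, 66, 47, 36, 12, 4]


def grouped_mean(mids, freqs):
    """Mean of grouped data: sum of mi * fi divided by the total frequency."""
    return sum(m * f for m, f in zip(mids, freqs)) / sum(freqs)


def grouped_variance(mids, freqs):
    """Sample variance of grouped data with divisor (sum of fi) - 1."""
    mean = grouped_mean(mids, freqs)
    squared_dev = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs))
    return squared_dev / (sum(freqs) - 1)


total = sum(frequencies)
mean_age = grouped_mean(midpoints, frequencies)
var_age = grouped_variance(midpoints, frequencies)
sd_age = math.sqrt(var_age)
print(total, round(mean_age, 2), round(var_age, 2), round(sd_age, 2))
```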
Variance = 2502/14 = 178.71
SD = √Variance
= √178.71
= 13.37
Properties of Variance
The main disadvantage of variance is that its unit is the square
of the units of the original measurement values
The variance gives more weight to extreme values than to
those near the mean, because the deviations are squared
in the variance
The drawbacks of variance are overcome by the standard
deviation
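$$\sigma = \sqrt{\sigma^2} \quad \text{and} \quad S = \sqrt{S^2}$$
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
where
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
is the population mean.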
Ungrouped....
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
S = sample standard deviation
√ = square root
Σ = sum (sigma)
x = score for each point in data
x̄ = mean of scores for the variable
n = sample size (number of observations or cases)
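The ungrouped formula can be sketched step by step. The data are the heart rates from the earlier example, reused here only for illustration; the function names are assumptions.

```python
import math

heart_rates = [167, 120, 150, 125, 150, 140, 40, 136, 120, 150]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_sd(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))


var_hr = sample_variance(heart_rates)
sd_hr = sample_sd(heart_rates)
print(round(var_hr, 2), round(sd_hr, 2))
```

The SD is in beats per minute, the same unit as the data, whereas the variance is in squared units.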
SD...
This measure of variation is universally used to show the scatter
of the individual measurements around the mean of all the
measurements in a given distribution.
Note that the sum of the deviations of the individual observations
of a sample about the sample mean is always 0.
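This property is easy to verify numerically. The sketch below reuses the heart rates from the earlier example; a small tolerance is used because floating-point arithmetic can leave a tiny non-zero remainder.

```python
import math

heart_rates = [167, 120, 150, 125, 150, 140, 40, 136, 120, 150]

mean_hr = sum(heart_rates) / len(heart_rates)
deviations = [x - mean_hr for x in heart_rates]
sum_of_deviations = sum(deviations)

# Positive and negative deviations cancel, up to rounding error.
print(math.isclose(sum_of_deviations, 0.0, abs_tol=1e-9))
```

This cancellation is why the deviations are squared before they are summed in the variance.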
For example, imagine 5,000 samples, each of the same size n=11.
This would produce 5,000 sample means. This new collection has its own pattern of variability.
We describe this new pattern of variability using the SE, not the SD.
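The thought experiment can be imitated with a short simulation. Everything about the population here is hypothetical: a normal distribution with mean 100 and SD 15 is assumed purely for illustration, and the relation SE = SD / √n is the standard result used for comparison.

```python
import math
import random
import statistics

rng = random.Random(0)           # fixed seed so the run is repeatable
POP_MEAN, POP_SD = 100.0, 15.0   # hypothetical population, not from the lecture
N_SAMPLES, n = 5000, 11          # 5,000 samples, each of size n = 11

sample_means = []
for _ in range(N_SAMPLES):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    sample_means.append(sum(sample) / n)

# Variability of the sample means (SE) versus the theoretical value.
empirical_se = statistics.stdev(sample_means)
theoretical_se = POP_SD / math.sqrt(n)
print(round(empirical_se, 2), round(theoretical_se, 2))
```

The spread of the sample means is much smaller than the spread of individual observations.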
$$CV = \frac{S}{\bar{x}} \times 100$$
Variable      SD          Mean         CV (%)
SBP           20 mmHg     140 mmHg     14.3
Cholesterol   80 mg/dl    400 mg/dl    20.0
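The two CV values in the table can be reproduced with a short sketch. The function name `coefficient_of_variation` is an assumption.

```python
def coefficient_of_variation(sd, mean):
    """CV as a percentage: (SD / mean) * 100."""
    return sd / mean * 100


cv_sbp = coefficient_of_variation(20, 140)          # systolic blood pressure
cv_cholesterol = coefficient_of_variation(80, 400)  # cholesterol

print(f"SBP CV: {cv_sbp:.1f}%")
print(f"Cholesterol CV: {cv_cholesterol:.1f}%")
```

Because the CV has no unit, the two variables can be compared directly even though they are measured on different scales; here cholesterol shows the greater relative variability.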