Biostatistics Measures

Biostatistics Practical
Topic: Measures of central tendency and measures of dispersion
Mr. Ninganagouda P
Lecturer in biostatistics,
Dept. of community medicine,
SIMS&RH,Tumkur
Measures of central tendency
Definition:
The tendency of observations to concentrate around a central value is
called central tendency. The central value around which the observations
concentrate is called a measure of central tendency (also called a measure
of location, or an average).
There are 5 measures of central tendency:

1. Arithmetic Mean (AM)
2. Median
3. Mode
4. Geometric Mean (GM)
5. Harmonic Mean (HM)
Mean: the average of the data
Median: the middle value of the data
Mode: the most commonly occurring value
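1. Arithmetic Mean (AM)
Definition:
The arithmetic mean of a set of values is obtained by dividing the sum of the
values by the number of values in the set. It is denoted by x̄ or µ.
The arithmetic mean of the values x1, x2, ..., xn is
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n}$$
If the observations x1, x2, ..., xn have frequencies f1, f2, ..., fn, the
arithmetic mean is
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{N} = \frac{\sum f x}{N}$$
where N = Ʃf is the total frequency.

Arithmetic Mean (AM)
Thus, for raw data, the arithmetic mean is
$$\bar{x} = \frac{\sum x}{n}$$
For tabulated data (discrete or continuous), it is
$$\bar{x} = \frac{\sum f x}{N}$$
where, for continuous data, x is the mid-point of each class interval.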
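The two forms of the arithmetic mean above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function names `mean_raw` and `mean_tabulated` and the sample values are assumptions chosen for the example.

```python
def mean_raw(values):
    """Arithmetic mean of raw data: sum of the values / number of values."""
    return sum(values) / len(values)


def mean_tabulated(xs, fs):
    """Arithmetic mean of tabulated data: sum(f * x) / N, where N = sum(f)."""
    total_frequency = sum(fs)
    return sum(f * x for x, f in zip(xs, fs)) / total_frequency


# Illustrative values only (not taken from the problems in this practical).
print(mean_raw([2, 4, 6, 8]))                # 5.0
print(mean_tabulated([1, 2, 3], [2, 1, 1]))  # (2 + 2 + 3) / 4 = 1.75
```

For a continuous frequency distribution, `xs` would hold the class mid-points.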
2. Median
Definition:
The median of a set of values is the middle-most value when they are arranged in
ascending order of magnitude (such an arrangement is called an array). It is the
value that is greater than half of the values and less than the remaining half.
The median is denoted by M.
For raw data, the median is
$$M = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value in the arrayed series}$$
In the case of a continuous frequency distribution (tabulated data), the median is
$$M = l + \frac{\left(\frac{N}{2} - m\right) c}{f}$$
where
l = lower limit of the median class,
c = width of the median class,
f = frequency of the median class,
m = less-than cumulative frequency up to l (the cumulative frequency of the
class preceding the median class),
N = total frequency.
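As a hedged sketch of the two median rules described above (the ((n + 1)/2)th value for raw data and the grouped-data formula in terms of l, c, f, m and N), the following Python functions may help. The names `median_raw` and `median_grouped` and the sample numbers are illustrative assumptions, not part of the slides.

```python
def median_raw(values):
    """Median as the ((n + 1) / 2)-th value of the sorted data.

    When the position is fractional (n even), interpolate between
    the two neighbouring values.
    """
    data = sorted(values)
    position = (len(data) + 1) / 2   # 1-based position in the array
    lower = int(position)            # whole part of the position
    fraction = position - lower      # 0.0 or 0.5
    if fraction == 0:
        return data[lower - 1]
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])


def median_grouped(l, c, f, m, N):
    """Grouped-data median: l + (N/2 - m) * c / f."""
    return l + (N / 2 - m) * c / f


# Illustrative values only.
print(median_raw([5, 1, 3]))      # 3
print(median_raw([1, 2, 3, 4]))   # 2 + 0.5 * (3 - 2) = 2.5
print(median_grouped(l=20, c=10, f=8, m=6, N=20))  # 20 + 4 * 10 / 8 = 25.0
```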
3. Mode
Definition:
The mode is the value which has the highest frequency. It is the most
frequently occurring value. It is denoted by Z.
In the case of raw data, and also in the case of a discrete frequency
distribution, the mode is the value with the highest frequency.
In the case of a continuous frequency distribution (tabulated data), the mode is
$$Z = l + \frac{(f - f_1)\, c}{2f - f_1 - f_2}$$
where
l = lower limit of the modal class,
f = frequency of the modal class,
c = width of the modal class,
f1 = frequency of the class preceding the modal class,
f2 = frequency of the class succeeding the modal class.
The modal class is the class which contains the mode, i.e. the class with the
highest frequency.
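The grouped-data mode formula above can be checked with a short Python sketch. The function name `mode_grouped` and the sample numbers are assumptions made for illustration only.

```python
def mode_grouped(l, c, f, f1, f2):
    """Mode of a continuous frequency distribution.

    l  : lower limit of the modal class
    c  : width of the modal class
    f  : frequency of the modal class
    f1 : frequency of the class preceding the modal class
    f2 : frequency of the class succeeding the modal class
    """
    return l + (f - f1) * c / (2 * f - f1 - f2)


# Illustrative values only.
print(mode_grouped(l=20, c=10, f=12, f1=8, f2=6))  # 20 + 4 * 10 / 10 = 24.0
```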
Advantages and disadvantages of the measures of central tendency

Mean
Advantages:
• The logic behind its computation is easily understood, and it is easily computed.
• It can easily be adopted for further statistical analysis.
• It is based on all the values.
• It can be calculated even when some of the values are zero or negative.
Disadvantages:
• It is highly affected by abnormal extreme values.
• Since it is based on all the values, it cannot be calculated if even one of the
values is missing.

Median
Advantages:
• The logic behind its computation is easily understood, and it is easily computed.
• It can be computed even when some of the extreme values are missing.
• It is not affected by abnormal extreme values.
• It can be found graphically.
Disadvantages:
• It is not based on all the values.
• It cannot be used in deep statistical analysis.
• It is not as stable as the arithmetic mean.

Mode
Advantages:
• The logic behind its computation is easily understood, and it is easily computed.
• It can be computed even when some of the extreme values are missing.
• It can be found graphically.
Disadvantages:
• It is not based on all the values.
• It cannot be used in deep statistical analysis.
• It is not as stable as the arithmetic mean.
MEASURES OF DISPERSION
Definition : Variation (dispersion) is the property of deviation of
values from the average. The degree of variation is indicated by the
measures of variation.
Various measures of variation which are in common use are –
1. Range (R)
2. Quartile deviation or Semi-interquartile range (QD)
3. Mean Deviation (M.D)
4. Standard deviation (S.D)
5. Variance (σ²)
6. Coefficient of variation(C.V)
1. The Range:
Definition: The range is the difference between the highest and the
lowest values in the data.
If H is the highest value and L is the lowest value in the data,
the range of variation is
R = H − L
The relative measure of variation based on the range, used for comparing
frequency distributions, is the coefficient of range. It is
$$\text{Coefficient of range} = \frac{H - L}{H + L}$$

2. Quartile deviation (Semi-interquartile range):
The quartile deviation is obtained by dividing
the range between the lower and the upper quartiles
by 2. Thus, if Q1 and Q3 are the lower and upper
quartiles, the quartile deviation is
$$QD = \frac{Q_3 - Q_1}{2}$$
The relative measure of variation based on the
quartiles is the coefficient of quartile deviation. It is
$$\text{Coefficient of QD} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
where Q1 is the lower (first) quartile and Q3 is the upper (third) quartile.
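The range, the quartile deviation and their relative forms can be sketched in Python as below. The coefficients use the standard textbook definitions (H − L)/(H + L) and (Q3 − Q1)/(Q3 + Q1); the function names and sample values are illustrative assumptions rather than anything taken from the slides.

```python
def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative range: (H - L) / (H + L)."""
    high, low = max(values), min(values)
    return (high - low) / (high + low)


def quartile_deviation(q1, q3):
    """Semi-interquartile range: (Q3 - Q1) / 2."""
    return (q3 - q1) / 2


def coefficient_of_quartile_deviation(q1, q3):
    """Relative quartile deviation: (Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)


# Illustrative values only.
data = [10, 20, 30, 40]
print(value_range(data))                          # 30
print(coefficient_of_range(data))                 # 30 / 50 = 0.6
print(quartile_deviation(15, 35))                 # 10.0
print(coefficient_of_quartile_deviation(15, 35))  # 20 / 50 = 0.4
```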
3. The Mean deviation (Average deviation):
The mean deviation of a set of values from a central value is the
mean of the absolute deviations of the values from that central value.
Thus, the mean deviation of the values x1, x2, ..., xn from their arithmetic
mean is
$$MD = \frac{\sum |x - \bar{x}|}{n}$$
In the case of tabulated data (discrete as well as continuous), the
mean deviation from the arithmetic mean is
$$MD = \frac{\sum f\,|x - \bar{x}|}{N}$$
4. The Standard deviation:
It is the positive square root of the average of the squares of the
deviations of the observations from the mean. It is also called the root
mean squared deviation. It is denoted by σ (sigma).
For individual data values x1, x2, ..., xn, the standard deviation is
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
For tabulated data with values x1, x2, ..., xn and corresponding
frequencies f1, f2, ..., fn, the standard deviation is
$$\sigma = \sqrt{\frac{\sum f\,(x - \bar{x})^2}{N}}$$
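5. Variance (σ²)
The square of the standard deviation is called the variance.
It is the mean of the squared deviations of the values from their
arithmetic mean. Thus, the variance of x1, x2, ..., xn is
$$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$$
For the computation of the variance (and hence the standard deviation), the
following formulae are used.
For raw data:
$$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$$
For tabulated data:
$$\sigma^2 = \frac{\sum f\,(x - \bar{x})^2}{N}$$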
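The mean deviation, variance and standard deviation described above can be sketched in Python as follows. This sketch assumes the divisor is the number of observations (n or N), which is how the worked problems later in this practical compute the standard deviation; the function names and data are illustrative.

```python
from math import sqrt


def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


def variance(values):
    """Mean of the squared deviations from the arithmetic mean (divisor n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def standard_deviation(values):
    """Positive square root of the variance."""
    return sqrt(variance(values))


def variance_tabulated(xs, fs):
    """Variance of tabulated data: sum(f * (x - mean)^2) / N."""
    total = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / total
    return sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / total


# Illustrative values only; their mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean_deviation(data))      # 12 / 8 = 1.5
print(variance(data))            # 32 / 8 = 4.0
print(standard_deviation(data))  # 2.0
# The same data written as a frequency table gives the same variance.
print(variance_tabulated([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))  # 4.0
```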

6. Coefficient of variation (CV)
Definition:
The coefficient of variation (CV) is the ratio of the standard
deviation to the mean and shows the extent of variability in
relation to the mean of the population. The higher the CV, the
greater the dispersion.
The standard formula for calculating the coefficient of
variation is
$$CV = \frac{\sigma}{\bar{x}} \times 100$$
that is, CV = (Standard deviation / Mean) × 100.

Advantages and disadvantages of the measures of dispersion

Range
Advantages:
• A reasonably good indicator.
Disadvantages:
• Badly affected by extreme values.

Interquartile range
Advantages:
• Not affected by extreme values.
• Often used with skewed data.
Disadvantages:
• Does not tell you what happened beyond the quartiles.

Variance
Advantages:
• Good measure.
• All values used.
• Used when data are fairly symmetrical.
Disadvantages:
• Mathematical properties not useful (SD preferred).
• Not so good if data are strongly skewed.

Standard deviation
Advantages:
• Good measure.
• All values used.
• Used when data are fairly symmetrical.
• Can be used in the mathematical calculation of other statistics.
Disadvantages:
• Not so good if data are strongly skewed.

Coefficient of variation
Advantages:
• Expresses the relationship between the standard deviation and the mean.
• No unit.
• No dimension.
• Can be used to compare distributions.
Disadvantages:
• When the mean is 0, the coefficient of variation is infinite (undefined).
• Cannot find out the intervals of the mean.
• Sensitive to small mean values.
• Cannot calculate logarithmic values.
Practical – 3 and 4
Problems on
Measures of central tendency
and
Measures of dispersion
1. Birth weight of newborn babies (kg): 2.5, 3.3, 1.7, 4, 3.8, 2.7, 4.1, 3.4, 3.9, 4.2, 3, 2.8, 3.1, 3, 2.1
Calculate the mean, median, mode and range.
Solution:
Mean:
We know the mean formula for raw data, x̄ = Σx / n.
Mean (x̄) = (2.5 + 3.3 + 1.7 + 4 + 3.8 + 2.7 + 4.1 + 3.4 + 3.9 + 4.2 + 3 + 2.8 + 3.1 + 3 + 2.1) / 15
Mean (x̄) = 47.6 / 15
= 3.17 ≈ 3 kg
Median: The arrayed series (ascending order) is

1.7, 2.1, 2.5, 2.7, 2.8, 3, 3, 3.1, 3.3, 3.4, 3.8, 3.9, 4, 4.1, 4.2
Median (M) = ((n + 1)/2)th value in the arrayed series
Here, n = 15,
Median (M) = ((15 + 1)/2)th value in the arrayed series = 8th value in the arrayed series
Median (M) = 3.1 kg
Mode (Z): Here, the value 3 has the highest frequency (it is the only repeated value).
Therefore, the mode is Z = 3 kg
Range (R):
Range (R) = H − L
Here, highest value (H) = 4.2 and
lowest value (L) = 1.7
Range (R) = 4.2 − 1.7
Range (R) = 2.5 kg
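The hand calculation for Problem 1 can be cross-checked with Python's standard `statistics` module. This is a sketch for verification only; it uses the fifteen values that appear in the worked solution (the values summed to 47.6), and the variable names are illustrative.

```python
import statistics

# Birth weights (kg) as used in the worked solution above.
weights = [2.5, 3.3, 1.7, 4, 3.8, 2.7, 4.1, 3.4, 3.9, 4.2,
           3, 2.8, 3.1, 3, 2.1]

mean = round(statistics.mean(weights), 2)            # 47.6 / 15
median = statistics.median(weights)                  # 8th of 15 sorted values
mode = statistics.mode(weights)                      # most frequent value
value_range = round(max(weights) - min(weights), 2)  # H - L

print(mean, median, mode, value_range)  # 3.17 3.1 3 2.5
```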
2. The number of days taken for the appearance of signs and
symptoms after exposure: 12, 19, 13, 5, 14, 10, 22, 43, 14, 51, 5.
Find the mean, median and mode.
Solution:
Mean (x̄) = (12 + 19 + 13 + 5 + 14 + 10 + 22 + 43 + 14 + 51 + 5) / 11
Mean (x̄) = 208 / 11
= 18.91 ≈ 19 days
Median: The arrayed series (ascending order) is
5, 5, 10, 12, 13, 14, 14, 19, 22, 43, 51
Median (M) = ((n + 1)/2)th value in the arrayed series
Here, n = 11,
Median (M) = ((11 + 1)/2)th value in the arrayed series = 6th value in the arrayed series
Median (M) = 14 days
Mode (Z): Here, the values 5 and 14 have the highest frequency (each occurs
twice).
Therefore, the data are bimodal, with modes Z = 5 days and 14 days.
3. The following data relate to the number of children of
26 couples. Find the median.
No. of children per couple: 2, 0, 5, 2, 2, 1, 0, 0, 3, 4, 2, 1, 1,
2, 3, 0, 1, 2, 7, 2, 2, 1, 3, 4, 1, 5
Solution: The arrayed series (ascending order) is
0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 4 4 5 5 7
Here, n = 26. Therefore, the median is
Median (M) = ((n + 1)/2)th value in the arrayed series
Median (M) = ((26 + 1)/2)th value in the arrayed series
Median (M) = 13.5th value in the arrayed series
Median (M) = 13th value + 0.5(14th value − 13th value)
Median (M) = 2 + 0.5(2 − 2)
Median (M) = 2 children per couple

4. The following data represent the Hb levels of 13 pregnant mothers (in g/dl): 9.3, 11.5, 10.2,
11.2, 11.7, 12.1, 11.2, 10.7, 8.7, 9.3, 8.9, 12.4, 11.6. Calculate all the measures of central
tendency and dispersion and interpret the results.
Solution: The given data are raw data.

The measures of central tendency:
Mean (x̄) = (9.3 + 11.5 + 10.2 + 11.2 + 11.7 + 12.1 + 11.2 + 10.7 + 8.7 + 9.3 + 8.9 + 12.4 + 11.6) / 13
Mean (x̄) = 138.8 / 13
Mean (x̄) = 10.68 g/dl
Median (M): The arrayed series (ascending order) is

8.7, 8.9, 9.3, 9.3, 10.2, 10.7, 11.2, 11.2, 11.5, 11.6, 11.7, 12.1, 12.4
Median (M) = ((n + 1)/2)th value in the arrayed series
Here, n = 13,
Median (M) = ((13 + 1)/2)th value in the arrayed series = 7th value in the arrayed series
Median (M) = 11.2 g/dl
Mode (Z): The mode is the value that occurs the maximum number of times.

9.3 and 11.2 are each repeated 2 times.
Thus the given data are bimodal, with modes 9.3 g/dl and 11.2 g/dl.
The measures of dispersion:
Range (R) = Highest value (H) − Lowest value (L) = 12.4 − 8.7 = 3.7 g/dl
Quartile deviation:
Q1 = ((n + 1)/4)th observation = ((13 + 1)/4)th observation = (14/4)th observation = 3.5th observation
Q1 = 3rd value + 0.5(4th value − 3rd value) = 9.3 + 0.5(9.3 − 9.3) = 9.3 g/dl
Q3 = (3(n + 1)/4)th observation = (3(13 + 1)/4)th observation = 10.5th observation
Q3 = 10th value + 0.5(11th value − 10th value) = 11.6 + 0.5(11.7 − 11.6) = 11.65 ≈ 11.7 g/dl
QD = (Q3 − Q1) / 2 = (11.7 − 9.3) / 2 = 1.2 g/dl
Interquartile range (IQR): IQR = Q3 − Q1 = 11.7 − 9.3 = 2.4 g/dl (from Q1 = 9.3 to Q3 = 11.7 g/dl)

Standard deviation (SD):
x        x − x̄     (x − x̄)²
9.3      -1.38     1.90
11.5      0.82     0.67
10.2     -0.48     0.23
11.2      0.52     0.27
11.7      1.02     1.04
12.1      1.42     2.02
11.2      0.52     0.27
10.7      0.02     0.00
8.7      -1.98     3.92
9.3      -1.38     1.90
8.9      -1.78     3.17
12.4      1.72     2.96
11.6      0.92     0.85
Σ(x − x̄)² = 19.20
S.D = σ = √(Σ(x − x̄)² / n) = √(19.20 / 13) = 1.22 g/dl
Coefficient of variation (CV): CV = (σ / x̄) × 100 = (1.22 / 10.68) × 100
CV = 11.42%
Interpretation: Mean ± SD is 10.68 ± 1.22 g/dl.

Median with interquartile range (Q1 to Q3) is 11.2 (9.3 to 11.7) g/dl.
The given data are bimodal (9.3 g/dl and 11.2 g/dl) and the range is 3.7 g/dl.
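The quartile positions and the standard deviation in Problem 4 can be checked with a short Python sketch. It assumes the ((n + 1)/4)th and (3(n + 1)/4)th position rule with linear interpolation used in the solution, and a divisor of n for the standard deviation (`statistics.pstdev`). The helper name `value_at_position` is illustrative. Note that the unrounded Q3 is 11.65 and the unrounded QD is 1.175, which the worked solution rounds to 11.7 and 1.2.

```python
import statistics


def value_at_position(sorted_data, position):
    """Value at a 1-based, possibly fractional, position in sorted data,
    using linear interpolation between neighbouring values."""
    lower = int(position)
    fraction = position - lower
    value = sorted_data[lower - 1]
    if fraction > 0:
        value += fraction * (sorted_data[lower] - value)
    return value


hb = sorted([9.3, 11.5, 10.2, 11.2, 11.7, 12.1, 11.2,
             10.7, 8.7, 9.3, 8.9, 12.4, 11.6])
n = len(hb)

q1 = value_at_position(hb, (n + 1) / 4)      # 3.5th observation
q3 = value_at_position(hb, 3 * (n + 1) / 4)  # 10.5th observation
qd = (q3 - q1) / 2                           # quartile deviation
sd = statistics.pstdev(hb)                   # divisor n, as in the solution

print(round(q1, 2), round(q3, 2), round(qd, 3), round(sd, 2))
# 9.3 11.65 1.175 1.22
```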
5. In a study, the weights of children (in kg) were recorded as follows:
7, 10, 19, 14, 15, 10, 13, 14, 9, 17, 11, 16, 11, 17, 11, 8. Calculate all measures of central
tendency and measures of dispersion and interpret the results.
Solution:
Measures of central tendency:
Mean (x̄) = (7 + 10 + 19 + 14 + 15 + 10 + 13 + 14 + 9 + 17 + 11 + 16 + 11 + 17 + 11 + 8) / 16
Mean (x̄) = 202 / 16 = 12.63 ≈ 13 kg
Median (M): The arrayed series (ascending order) is
7, 8, 9, 10, 10, 11, 11, 11, 13, 14, 14, 15, 16, 17, 17, 19
Median (M) = ((n + 1)/2)th value in the arrayed series
Here, n = 16,
Median (M) = ((16 + 1)/2)th value in the arrayed series = 8.5th value in the arrayed series
Median (M) = 8th value + 0.5(9th value − 8th value) = 11 + 0.5(13 − 11) = 12 kg
Mode (Z): The mode is the value that occurs the maximum number of times.

11 is repeated 3 times.
Thus the mode of the given data is 11 kg.
Measures of dispersion:
Range = H − L = 19 − 7 = 12 kg
Quartile deviation (Semi-interquartile range) = ?
SD = ?
CV = ?
Interpretation: ?
6. The table below gives the frequency distribution of individuals in a community
according to the number of illnesses suffered by them in a year.
No. of illnesses suffered in a year    No. of individuals
0                                      24
1                                      76
2                                      79
3                                      81
4                                      86
5                                      51
6                                      26
7                                      43
Total                                  466
Calculate: I. Measures of central tendency
II. Measures of dispersion and
III. Interpret the results.
Solution: The given data are discrete tabulated data.
Measures of central tendency:
We know that
$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{N}$$
No. of illnesses (x_i)   No. of individuals (f_i)   f_i × x_i        LCF
0                        24                         0 × 24 = 0       24
1                        76                         1 × 76 = 76      24 + 76 = 100
2                        79                         2 × 79 = 158     100 + 79 = 179
3                        81                         3 × 81 = 243     179 + 81 = 260
4                        86                         4 × 86 = 344     260 + 86 = 346
5                        51                         5 × 51 = 255     346 + 51 = 397
6                        26                         6 × 26 = 156     397 + 26 = 423
7                        43                         7 × 43 = 301     423 + 43 = 466
Total                    Σf_i = N = 466             Σf_i x_i = 1533
Mean (x̄) = 1533 / 466
Mean (x̄) = 3.29 ≈ 3.3, i.e. about 3 illnesses suffered per year
Median (M): LCF is the less-than cumulative frequency.

Position of the median = (N + 1)/2 = (466 + 1)/2 = 233.5th observation
M = value of x corresponding to the first LCF ≥ (N + 1)/2
The first LCF that reaches 233.5 is 260, which corresponds to x = 3.
Therefore, M = 233.5th observation = 3 illnesses suffered per year
Mode (Z): Z = value of x corresponding to the highest frequency
The highest frequency is 86. The corresponding value of x is 4.
Z = 4 illnesses suffered per year
Measures of dispersion:
Range (R) = H − L = 7 − 0 = 7 illnesses suffered per year
(H and L are the highest and lowest values of x, not of the frequencies.)
Quartile deviation: Q.D = (Q3 − Q1) / 2
No. of illnesses (x_i)   No. of individuals (f_i)   LCF
0                        24                         24
1                        76                         100
2                        79                         179
3                        81                         260
4                        86                         346
5                        51                         397
6                        26                         423
7                        43                         466
Total                    466
Q1 = ((N + 1)/4)th observation = 116.75th observation
The first LCF that reaches 116.75 is 179, so Q1 = 2.
Q3 = (3(N + 1)/4)th observation = 350.25th observation
The first LCF that reaches 350.25 is 397, so Q3 = 5.
Quartile deviation (Q.D) = (Q3 − Q1) / 2 = (5 − 2) / 2 = 1.5 ≈ 2 illnesses suffered per year
IQR = Q3 − Q1 = 5 − 2 = 3, i.e. from Q1 = 2 to Q3 = 5 illnesses suffered per year

Standard deviation (S.D), taking the rounded mean x̄ ≈ 3:
x_i     f_i     f_i x_i    (x_i − x̄)    (x_i − x̄)²    f_i (x_i − x̄)²
0       24      0          -3           9             216
1       76      76         -2           4             304
2       79      158        -1           1             79
3       81      243         0           0             0
4       86      344         1           1             86
5       51      255         2           4             204
6       26      156         3           9             234
7       43      301         4           16            688
Total   466     1533                                  Σ f_i (x_i − x̄)² = 1811
S.D (σ) = √(Σ f_i (x_i − x̄)² / N) = √(1811 / 466) = 1.97 ≈ 2 illnesses suffered per year
Variance (σ²) = Σ f_i (x_i − x̄)² / N = 1811 / 466 = 3.89 ≈ 4 (in squared units)
Coefficient of variation (CV) = (σ / µ) × 100 = (2 / 3) × 100 = 66.67% (using the rounded values σ ≈ 2 and µ ≈ 3)
Interpretation: Mean ± SD is 3.3 ± 1.97 illnesses suffered per year.
Median with interquartile range (Q1 to Q3) is 3 (2 to 5) illnesses suffered per year.
The mode of the given data is 4 and the range is 7 illnesses suffered per year.
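For a discrete frequency table such as the one in Problem 6, the mean, median, quartiles and mode can be read off programmatically. The sketch below follows the rule used in the solution (take the value of x at the first less-than cumulative frequency that reaches the required position); the helper name `value_at_cumulative` is an illustrative assumption.

```python
xs = [0, 1, 2, 3, 4, 5, 6, 7]            # number of illnesses in a year
fs = [24, 76, 79, 81, 86, 51, 26, 43]    # number of individuals

N = sum(fs)
mean = sum(f * x for x, f in zip(xs, fs)) / N


def value_at_cumulative(xs, fs, position):
    """Smallest x whose less-than cumulative frequency reaches `position`."""
    running = 0
    for x, f in zip(xs, fs):
        running += f
        if running >= position:
            return x
    raise ValueError("position exceeds total frequency")


median = value_at_cumulative(xs, fs, (N + 1) / 2)   # 233.5th observation
q1 = value_at_cumulative(xs, fs, (N + 1) / 4)       # 116.75th observation
q3 = value_at_cumulative(xs, fs, 3 * (N + 1) / 4)   # 350.25th observation
mode = xs[fs.index(max(fs))]                        # x with highest frequency

print(N, round(mean, 2), median, q1, q3, mode)  # 466 3.29 3 2 5 4
```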
7. In a clinical trial to see the effect of improvement in haemoglobin
among anaemic patients, the following data were obtained. Calculate
the median haemoglobin level.
Distribution of haemoglobin level of 50 patients
Haemoglobin level (gm%)    No. of patients
4-6                        2
6-8                        3
8-10                       15
10-12                      20
12-14                      10
Total                      50
Solution: The given data are grouped (classified) data. In order to do the
calculation, the above table has to be reconstructed as given below.
Hb (gm%)            No. of patients    Upper class    Less-than cumulative
(Class interval)    (Frequency = f)    boundary       frequency (< C.F)
4-6                 2                  6              2
6-8                 3                  8              5
8-10                15                 10             20  (m)
10-12               20  (f)            12             40  (median class, l = 10)
12-14               10                 14             50
Total               N = 50
Given, N = 50 (total frequency), so N/2 = 25. The median class is the first class
whose cumulative frequency reaches 25, i.e. 10-12.
l = 10 (lower boundary of the median class)
m = 20 (cumulative frequency of the class preceding the median class)
c = 2 (class width), i.e. 14 − 12 = 12 − 10 = ... = 2
(equal class intervals) and
f = 20 (frequency of the median class), i.e. the frequency in class 10-12
The median formula for grouped or classified data is
$$M = l + \frac{\left(\frac{N}{2} - m\right) c}{f}$$
Median (M) = 10 + ((25 − 20) × 2) / 20 = 10 + 0.5 = 10.5 gm%
8. The distribution of systolic blood pressure (mmHg) values of patients attending a rural health
camp is given in the following table.
Systolic blood pressure (mmHg)    Number of patients
90-100                            1
100-110                           2
110-120                           5
120-130                           11
130-140                           13
140-150                           12
150-160                           9
160-170                           4
170-180                           3
Total                             60
Calculate all the measures of central tendency, dispersion and the coefficient of variation, and
interpret the results.
Solution:
Mean:
Systolic blood pressure    Number of      Mid-point of    f × x    LCF
(mmHg) (Class interval)    patients (f)   CI (x)
90-100                     1              95              95       1
100-110                    2              105             210      3
110-120                    5              115             575      8
120-130                    11  (f1)       125             1375     19  (m)
130-140                    13  (f)        135             1755     32  (median and modal class, l = 130)
140-150                    12  (f2)       145             1740     44
150-160                    9              155             1395     53
160-170                    4              165             660      57
170-180                    3              175             525      60
Total                      Σf = N = 60                    Σfx = 8330

Mean (x̄) = Σfx / N = 8330 / 60
Mean (x̄) = 138.83 ≈ 139 mmHg

Median: The median formula for grouped (classified or tabulated) data is
$$M = l + \frac{\left(\frac{N}{2} - m\right) c}{f}$$
Here N/2 = 60/2 = 30, and the first LCF that reaches 30 is 32, so the median class is 130-140.
Given, l = 130, f = 13, m = 19, c = 10, N/2 = 30
Median (M) = 130 + ((30 − 19) × 10) / 13
Median (M) = 138.46 mmHg
Mode: The mode formula for grouped (classified or tabulated) data is
$$Z = l + \frac{(f - f_1)\, c}{2f - f_1 - f_2}$$
The modal class is 130-140, the class with the highest frequency.
Given, l = 130, f = 13, f1 = 11, f2 = 12, c = 10
Mode (Z) = 130 + ((13 − 11) × 10) / (2 × 13 − 11 − 12)
Mode (Z) = 136.67 mmHg

Quartile deviation:
Systolic blood pressure (mmHg)    Number of        LCF
(Class interval)                  patients (f)
90-100                            1                1
100-110                           2                3
110-120                           5                8   (m1)
120-130                           11  (f1)         19  (Q1 class, l1 = 120)
130-140                           13               32
140-150                           12               44  (m3)
150-160                           9   (f3)         53  (Q3 class, l3 = 150)
160-170                           4                57
170-180                           3                60
Total                             60
Q1 = the first quartile value (25th percentile), below which
25% of the total observations lie.
$$Q_1 = l_1 + \frac{\left(\frac{N}{4} - m_1\right) c}{f_1}$$
Q1 position = (N/4)th observation = (60/4)th observation = 15th observation
The first LCF that reaches 15 is 19, so the Q1 class is 120-130.
Q1 = 120 + ((60/4 − 8) × 10) / 11
Q1 = 120 + 6.36 = 126.36 mmHg

Q3 = the third quartile value (75th percentile), below which
75% of the total observations lie.
$$Q_3 = l_3 + \frac{\left(\frac{3N}{4} - m_3\right) c}{f_3}$$
Q3 position = (3N/4)th observation = 3 × (60/4) = 45th observation
The first LCF that reaches 45 is 53, so the Q3 class is 150-160.
Q3 = 150 + ((3 × 60/4 − 44) × 10) / 9
Q3 = 150 + 1.11 = 151.11 mmHg

Q.D = (Q3 − Q1) / 2 = (151.11 − 126.36) / 2 = 12.38 mmHg
Interquartile range (IQR) = Q3 − Q1 = 151.11 − 126.36 = 24.75 mmHg (from Q1 = 126.36 to Q3 = 151.11 mmHg)

Standard deviation (taking the rounded mean x̄ ≈ 139):
SBP (mmHg)          f      Mid-point of    x − x̄    (x − x̄)²    f × (x − x̄)²
(Class interval)           CI (x)
90-100              1      95              -44      1936        1936
100-110             2      105             -34      1156        2312
110-120             5      115             -24      576         2880
120-130             11     125             -14      196         2156
130-140             13     135             -4       16          208
140-150             12     145             6        36          432
150-160             9      155             16       256         2304
160-170             4      165             26       676         2704
170-180             3      175             36       1296        3888
Total               60 = N                                      18820
Standard deviation (σ) = √(Σ f (x − x̄)² / N) = √(18820 / 60) = 17.71 mmHg
Coefficient of variation (CV) = (σ / µ) × 100 = (17.71 / 139) × 100 = 12.74%
Interpretation: Mean ± S.D is 139 ± 17.71 mmHg.

Median with IQR is 138.46 (126.36 to 151.11) mmHg.
The mode of the given data is 136.67 mmHg and the C.V is 12.74%.
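The grouped-data calculations in Problem 8 can be cross-checked with the Python sketch below. It uses the class frequencies that appear in the worked solution (total 60) and the formula l + (position − m) × c / f for the median and quartiles; the helper name `grouped_quantile` is an illustrative assumption.

```python
lowers = [90, 100, 110, 120, 130, 140, 150, 160, 170]  # lower class limits
width = 10                                             # class width c
fs = [1, 2, 5, 11, 13, 12, 9, 4, 3]                    # class frequencies

N = sum(fs)
mids = [lower + width / 2 for lower in lowers]         # class mid-points
mean = sum(f * x for x, f in zip(mids, fs)) / N


def grouped_quantile(lowers, fs, width, position):
    """l + (position - m) * c / f, where the class used is the first one
    whose less-than cumulative frequency reaches `position`."""
    cumulative = 0  # m: cumulative frequency of the preceding classes
    for lower, f in zip(lowers, fs):
        if cumulative + f >= position:
            return lower + (position - cumulative) * width / f
        cumulative += f
    raise ValueError("position exceeds total frequency")


median = grouped_quantile(lowers, fs, width, N / 2)
q1 = grouped_quantile(lowers, fs, width, N / 4)
q3 = grouped_quantile(lowers, fs, width, 3 * N / 4)

# Grouped mode: modal class is the one with the highest frequency.
i = fs.index(max(fs))
mode = lowers[i] + (fs[i] - fs[i - 1]) * width / (2 * fs[i] - fs[i - 1] - fs[i + 1])

print(round(mean, 2), round(median, 2), round(mode, 2))  # 138.83 138.46 136.67
print(round(q1, 2), round(q3, 2))                        # 126.36 151.11
```

The mode line assumes the modal class is neither the first nor the last class, which holds for this table.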
9. In a randomized controlled trial (RCT) of malnourished children with anaemia, the following
results were obtained.
Groups             Sample size    Mean Hb (g%)    S.D
Treatment group    80             16              4
Control group      120            11              3
Compare both groups and find which group is more consistent.

Solution:
Coefficient of variation (CV) of the treatment group = (σ / µ) × 100 = (4 / 16) × 100 = 25%
Coefficient of variation (CV) of the control group = (σ / µ) × 100 = (3 / 11) × 100 = 27.27%
The C.V of the treatment group (25%) is less than the C.V of the control group (27.27%).
Thus the treatment group is more consistent.
10. In a study on dental hygiene, conducted by the department of community medicine in
association with the department of dentistry, the following data were obtained.
Group                    Sample size    Mean dental hygiene score    S.D
Rural school children    60             24                           6
Urban school children    60             28                           3
Compare the two groups and find which group is more consistent.
Solution:
Coefficient of variation (CV) of rural school children = (σ / µ) × 100 = (6 / 24) × 100 = 25%
Coefficient of variation (CV) of urban school children = (σ / µ) × 100 = (3 / 28) × 100 = 10.71%
The C.V of rural school children (25%) is greater than the C.V of urban
school children (10.71%).
Thus the dental hygiene scores of urban school children are more consistent.
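The comparisons in Problems 9 and 10 both pick the group with the smaller coefficient of variation. A minimal Python sketch of that rule follows; the function name `coefficient_of_variation` and the dictionary layout are illustrative assumptions.

```python
def coefficient_of_variation(sd, mean):
    """CV = (standard deviation / mean) * 100, as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is 0")
    return sd / mean * 100


# (mean, SD) pairs taken from the tables in Problems 9 and 10.
problem_9 = {"Treatment group": (16, 4), "Control group": (11, 3)}
problem_10 = {"Rural school children": (24, 6), "Urban school children": (28, 3)}

for groups in (problem_9, problem_10):
    cvs = {name: coefficient_of_variation(sd, mean)
           for name, (mean, sd) in groups.items()}
    most_consistent = min(cvs, key=cvs.get)  # smallest CV = most consistent
    for name, cv in cvs.items():
        print(f"{name}: CV = {cv:.2f}%")
    print("More consistent:", most_consistent)
```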
THANK YOU
