
Variability

(Dispersion)
• The three basics of statistics are:
1. Variability: making sense of variation
2. Inference: making generalizations
3. Probability: quantifying proportion and chance
Measures of dispersion
(variability)
• Range
• IQR
• Mean deviation
• Standard deviation

Reasons
• Used for "scale or continuous values" in medical
practice.
Examples: systolic and diastolic blood pressure, pulse
rate, heart rate, height, weight, serum cholesterol,
haemoglobin levels, etc.
• Summarizing variability in a single number, in order to
facilitate comparison of variability between different
groups or study samples.
• Using variability as an indicator of homogeneity or
heterogeneity of data.
(1) Range
The simplest measure of dispersion.
It is the difference between the highest and lowest
values of the observations.
The diastolic BP of 10 individuals:
83, 75, 81, 79, 71, 90, 75, 95, 77, 94
The range is expressed either as 71-95 or as the difference,
95 - 71 = 24.
If the data are arranged in classes or groups, the range lies
between the midpoint of the lowest class and the midpoint
of the highest class.
Age      Frequency   Class midpoint
1-6      18          3
7-12     20          9
13-18    16          15
19-24    10          21
25-30    6           27
• The range is from 3 to 27, or 27 - 3 = 24
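
As a rough check of the grouped-data rule, the sketch below derives the range from the class midpoints. The variable names are illustrative, and the midpoints are taken as listed in the table rather than recomputed.

```python
# Class midpoints as listed in the age table (taken as given, not recomputed).
midpoints = {"1-6": 3, "7-12": 9, "13-18": 15, "19-24": 21, "25-30": 27}

lowest = min(midpoints.values())
highest = max(midpoints.values())
grouped_range = highest - lowest

print(f"Range: {lowest} to {highest}, difference = {grouped_range}")
# Range: 3 to 27, difference = 24
```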
Range
• E.g.: Hb level of 5 pregnant women
12, 12.5, 11, 13, 12.5
Range = 13 - 11 = 2
• E.g.: Hb level of 6 pregnant women
12, 12.5, 11, 13, 12.5, 8
Range = 13 - 8 = 5
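
The two Hb examples can be reproduced with a few lines of Python. This is a minimal sketch; the helper name data_range is hypothetical. It shows how one extra low value (8) is enough to more than double the range.

```python
def data_range(values):
    """Return the range: highest value minus lowest value."""
    return max(values) - min(values)

hb_five = [12, 12.5, 11, 13, 12.5]
hb_six = hb_five + [8]  # the first sample plus one extra value of 8

print(data_range(hb_five))  # 2
print(data_range(hb_six))   # 5
```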

Friday, November 13, 2020 7


Range
Advantages
– It includes the extreme values.
– Easy to calculate.
Disadvantages
– The value of the range is determined by only two values.
– The interpretation of the range is difficult.
– It does not provide information about the other
values or how dispersed they are.
– The range does not describe the distribution of the data.



Centiles
• Centiles ignore the extreme values and concentrate
on the central region, where most values are grouped
• They are usually expressed as the 25th, 50th and 75th centiles
• They are also called percentiles
• The 25th centile is called the first quartile
• It is the point that separates the lowest quarter of
values from the second quarter
• The 50th centile is called the second quartile
(it coincides with the median)
• The 75th centile is called the third quartile; it separates
the upper quarter of values from the rest
• The interquartile range (IQR) lies between the 25th and 75th centiles
• It is calculated by subtracting the value at the 25th centile
from the value at the 75th centile
• It describes the variation between the two quartiles,
ignoring the lowest and highest values
Example
• Cholesterol values in mmol/L
• 3.5 3.5 3.6 3.7 4.0 4.1 4.3 4.5 4.7 4.8 5.2 5.7 6.1
6.3 6.3
• The 25th centile is 3.7
• The 50th centile is 4.5
• The 75th centile is 5.7
• Interquartile range (IQR) = 5.7 - 3.7 = 2.0
• This means a variation of 2 mmol/L between the 1st and
3rd quartiles, against a full range of 3.5 to 6.3 mmol/L
• A second group of patients with an IQR of 0.9
mmol/L would show less variation
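
The quartiles in this example can be checked with Python's standard library. The sketch assumes the "exclusive" quantile method, which places the p-th centile at position (n + 1) × p in the ordered data. Other software may use a different convention and give slightly different quartiles.

```python
import statistics

cholesterol = [3.5, 3.5, 3.6, 3.7, 4.0, 4.1, 4.3, 4.5,
               4.7, 4.8, 5.2, 5.7, 6.1, 6.3, 6.3]  # mmol/L

# method="exclusive" puts the p-th centile at position (n + 1) * p,
# which lands on the 4th, 8th and 12th ordered values for n = 15.
q1, q2, q3 = statistics.quantiles(cholesterol, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q2, q3)     # 3.7 4.5 5.7
print(round(iqr, 2))  # 2.0
```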
Example

Statistics: Hb level
Sample number    335
Percentiles      25th    8.0 g/dl
                 50th    9.0 g/dl
                 75th    10.0 g/dl
Mean deviation
• The average deviation from the arithmetic
mean
• Deviation from the mean is the difference
between an individual observed value and the
mean of the group (positive or negative).
• The total of all such deviations from the mean
is equal to zero.
• Deviations may also be measured from the
median or the mode.
N      Diastolic BP   Arithmetic mean   Deviation from the mean
1      83             81                2
2      75             81                -6
3      81             81                0
4      79             81                -2
5      71             81                -10
6      95             81                14
7      75             81                -6
8      77             81                -4
9      84             81                3
10     90             81                9
Total  810                              56 (ignoring the sign)
Mean = 810/10 = 81
MD = 56/10 = 5.6
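
The table's arithmetic can be retraced in Python. This is a minimal sketch with illustrative variable names. It also confirms that the signed deviations sum to zero, which is why the sign is ignored before averaging.

```python
diastolic_bp = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]

mean_bp = sum(diastolic_bp) / len(diastolic_bp)
deviations = [x - mean_bp for x in diastolic_bp]

# Signed deviations cancel out, so the sign is dropped before averaging.
signed_total = sum(deviations)
absolute_total = sum(abs(d) for d in deviations)
mean_deviation = absolute_total / len(diastolic_bp)

print(mean_bp, signed_total, absolute_total, mean_deviation)
# 81.0 0.0 56.0 5.6
```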

Characteristics
• - Based on all observations.
• - Easy to understand.
• - Simple to calculate.
• - Considers all deviations as absolute (positive)
values regardless of direction of deviation.
• - Not readily amenable to further
mathematical treatment.

N      Diastolic BP   Arithmetic mean   Deviation from the mean   (Deviation)^2
1      83             81                2                         4
2      75             81                -6                        36
3      81             81                0                         0
4      79             81                -2                        4
5      71             81                -10                       100
6      95             81                14                        196
7      75             81                -6                        36
8      77             81                -4                        16
9      84             81                3                         9
10     90             81                9                         81
Total  810                              56 (ignoring the sign)    482
Mean = 810/10 = 81
MD = 56/10 = 5.6
Sum of squared deviations = 482

Variance and Standard Deviation
Variance:
• The sum of squared deviations divided by the
number of observations, n,
or divided by (n-1) for the sample variance when the
sample size is less than 30
• Uses deviations from the mean to measure the
variation in the dataset

Variance of a sample
x       x - mean   (x - mean)^2
8       0          0
5       -3         9
4       -4         16
12      4          16
15      7          49
5       -3         9
7       -1         1
Total   56         100

$\bar{x} = 56/7 = 8$, with $n = 7$
$\sum_{i=1}^{n} (x_i - \bar{x})^2 = 100$
$s^2 = 100/6 \approx 16.67$, so $s = \sqrt{16.67} \approx 4.08$
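
A short Python sketch of the same calculation, shown with both divisors (n and n - 1) and cross-checked against the standard library. Variable names are illustrative. Note that 100/6 ≈ 16.67 is the sample variance, and 4.08 is its square root, the standard deviation.

```python
import statistics

x = [8, 5, 4, 12, 15, 5, 7]

mean_x = sum(x) / len(x)                      # 56 / 7 = 8
sum_sq = sum((xi - mean_x) ** 2 for xi in x)  # 100

population_variance = sum_sq / len(x)         # divisor n
sample_variance = sum_sq / (len(x) - 1)       # divisor n - 1
sample_sd = sample_variance ** 0.5

# Cross-check against the standard library.
assert abs(sample_variance - statistics.variance(x)) < 1e-9
assert abs(population_variance - statistics.pvariance(x)) < 1e-9

print(round(population_variance, 2), round(sample_variance, 2), round(sample_sd, 2))
# 14.29 16.67 4.08
```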
Standard deviation
Hb      Mean    Deviation from mean (x - mean)   (x - mean)^2
12      11.5    0.5                              0.25
12.5    11.5    1                                1
11      11.5    -0.5                             0.25
13      11.5    1.5                              2.25
12.5    11.5    1                                1
8       11.5    -3.5                             12.25
Total                                            17

Standard Deviation
• The most commonly used measure of dispersion.
• Roughly, the average deviation of values around the mean
(the square root of the variance)
• Standard deviation (SD): the root of the mean squared
deviation, where deviations are taken
from the mean. It equals the square root of
the variance and is expressed in the units of the
original observations.
$SD = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$
Standard deviation
• variance = 17/(6-1) = 3.4
• SD = √ variance
SD = √ 3.4 = 1.84
• Hb level of 6 pregnant women
12, 12.5, 11, 13, 12.5, 8
Mean = 11.5
SD = 1.84
Hb level of the 6 pregnant women (mean ± SD): 11.5 ± 1.84
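
The steps for the Hb example can be sketched in Python as follows. It assumes the n - 1 divisor used on this slide, and the variable names are illustrative.

```python
import math
import statistics

hb = [12, 12.5, 11, 13, 12.5, 8]

mean_hb = statistics.mean(hb)                 # 11.5
sum_sq = sum((x - mean_hb) ** 2 for x in hb)  # 17.0
variance = sum_sq / (len(hb) - 1)             # 17 / 5 = 3.4
sd = math.sqrt(variance)

print(f"{mean_hb} +/- {sd:.2f}")
# 11.5 +/- 1.84
```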
Standard deviation
• If the mean Hb of 10 women is 11.5 and the SD is 3,
what does this tell you about the dispersion of
these values around the mean compared with
the previous example (11.5 ± 1.84)?

E.g. Systolic blood pressure
• Smoking males
120 130 120 150 130 170 180 160 170 150
• Mean SBP = 148
• Range = 180 - 120 = 60
• SD = 22
• Non-smoking males
110 130 120 140 130 150 160 130 130 150
• Mean SBP = 135
• Range = 160 - 110 = 50
• SD = 15.1
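
The comparison of the two groups can be reproduced with the sketch below. It assumes the first ten readings belong to the smoking group and the other ten to the non-smoking group, a split that matches the stated means. The helper name summarize is hypothetical.

```python
import statistics

groups = {
    "smoking": [120, 130, 120, 150, 130, 170, 180, 160, 170, 150],
    "non-smoking": [110, 130, 120, 140, 130, 150, 160, 130, 130, 150],
}

def summarize(values):
    """Return (mean, range, sample SD) for a list of readings."""
    return (
        statistics.mean(values),
        max(values) - min(values),
        statistics.stdev(values),  # uses the n - 1 divisor
    )

for name, values in groups.items():
    mean, spread, sd = summarize(values)
    print(f"{name}: mean={mean}, range={spread}, SD={sd:.1f}")
```

Under these assumptions the smoking group shows both the higher mean and the larger spread, whether spread is judged by the range or by the SD.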

• The meaning of the SD can only be fully appreciated
with reference to the normal distribution curve
• It gives an idea of the spread of the data
around the mean.
• The larger the SD, the greater the dispersion of
values around the mean

