Measures of Dispersion

Range
Variance
Standard Deviation
Coefficient of Variation
Summary Definitions

• The measure of dispersion shows how the data is spread or scattered around the mean.

• The measure of location or central tendency is a central value that the data values group around. It gives an average value.

• The measure of skewness is how symmetrical (or not) the distribution of data values is.
Measures of Dispersion

Variation: Range, Variance, Standard Deviation, Coefficient of Variation

• Measures of variation give information on the spread or variability or dispersion of the data values.

[Figure: two distributions with the same centre but different variation]
Measures of Dispersion:
The Range

• Simplest measure of dispersion
• Difference between the largest and the smallest values:

$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

Example (dot plot on an axis from 0 to 14; the smallest value is 1 and the largest is 13):

Range = 13 - 1 = 12
Measures of Dispersion:
Why The Range Can Be Misleading

• Ignores the way in which data are distributed

[Figure: two dot plots on an axis from 7 to 12 with differently distributed values; in both cases Range = 12 - 7 = 5]
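
A small sketch of why the range can mislead. The two data sets below are hypothetical (the values in the original dot plots are not recoverable); both run from 7 to 12, so they share a range of 5 even though their values are distributed very differently.

```python
# Hypothetical data sets: same smallest and largest value, different distribution.
evenly_spread = [7, 8, 9, 10, 11, 12]
bunched_low = [7, 7, 7, 7, 7, 12]


def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


range_even = value_range(evenly_spread)
range_bunched = value_range(bunched_low)
print(range_even, range_bunched)
```

Only the two extreme values enter the calculation, which is why the result cannot distinguish the two data sets.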
Percentile Range

• The difference between the 90th and the 10th percentiles.
• It is established by excluding the highest and the lowest 10 percent of the items, and is the difference between the largest and the smallest values of the remaining 80 percent of the items.

$$P_{10\text{-}90} = P_{90} - P_{10}$$
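
As an illustration, the percentile range could be computed with Python's standard `statistics` module (Python 3.8 or later). The data set (the integers 1 to 99) is hypothetical, and `statistics.quantiles` uses its default "exclusive" interpolation method; other percentile conventions can give slightly different cut points.

```python
import statistics

# Hypothetical data: the integers 1..99.
data = list(range(1, 100))

# quantiles(..., n=10) returns the nine cut points P10, P20, ..., P90.
deciles = statistics.quantiles(data, n=10)
p10, p90 = deciles[0], deciles[-1]

percentile_range = p90 - p10
print(p10, p90, percentile_range)
```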
Quartile Deviation

• A measure similar to the percentile range is the inter-quartile range (Q). It is the difference between the third quartile (Q3) and the first quartile (Q1). Thus

$$Q = Q_3 - Q_1$$

• The inter-quartile range is frequently reduced to the semi-interquartile range, known as the quartile deviation (QD), by dividing it by 2. Thus

$$QD = \frac{Q_3 - Q_1}{2}$$
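
A minimal sketch of the inter-quartile range and quartile deviation, assuming a hypothetical data set and the default quartile convention of `statistics.quantiles` (textbooks differ on how quartiles are interpolated, so hand calculations may not match exactly).

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]  # hypothetical values

# quantiles(..., n=4) returns the three quartiles Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1   # inter-quartile range Q
qd = iqr / 2    # quartile deviation QD
print(q1, q3, iqr, qd)
```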
Mean Deviation

• The mean deviation is an average of the absolute deviations of individual observations from the central value of a series. The average deviation about the mean is

$$MD_{\bar{x}} = \frac{\sum_{i=1}^{k} f_i \,\lvert x_i - \bar{x} \rvert}{n}$$

• k = number of classes
• x_i = midpoint of the i-th class
• f_i = frequency of the i-th class
• n = total number of observations (the sum of the f_i)
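
The grouped mean deviation can be sketched as follows. The frequency table (three classes with midpoints 5, 15 and 25) is hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical frequency table.
midpoints = [5, 15, 25]    # x_i, midpoint of each class
frequencies = [2, 5, 3]    # f_i, frequency of each class

n = sum(frequencies)       # total number of observations
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Weighted average of the absolute deviations from the mean.
mean_deviation = sum(
    f * abs(x - mean) for f, x in zip(frequencies, midpoints)
) / n
print(mean, mean_deviation)
```

Here the mean is 16 and the weighted absolute deviations sum to 2(11) + 5(1) + 3(9) = 54, giving a mean deviation of 5.4.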
Standard Deviation

• The standard deviation is one of the most important measures of dispersion. Because it uses every value in the data set, it is much more informative than the range or inter-quartile range, which each depend on only two points of the distribution.

What does it measure?

• It measures the dispersion (or spread) of figures around the mean.

• A large number for the standard deviation means there is a wide spread of values around the mean, whereas a small number for the standard deviation implies that the values are grouped close together around the mean.
Measures of Dispersion:
The Standard Deviation s

• Most commonly used measure of variation
• Shows variation about the mean
• Has the same units as the original data
• Sample standard deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$
For A Population:
The Standard Deviation σ

• Most commonly used measure of variation
• Shows variation about the mean
• Has the same units as the original data
• Population standard deviation:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Approximating the Standard Deviation
from a Frequency Distribution

• Assume that all values within each class interval are located at the midpoint of the class

$$s = \sqrt{\frac{\sum_{j} f_j\,(x_j - \bar{x})^2}{n-1}}$$

Where n = number of values or sample size
x_j = midpoint of the jth class
f_j = number of values in the jth class
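
A short sketch of the grouped approximation, using a hypothetical frequency table. Every value in a class is treated as if it sat at the class midpoint, so the result is only an approximation of the standard deviation of the raw data.

```python
import math

# Hypothetical frequency table.
midpoints = [5, 15, 25]    # midpoint x of each class
frequencies = [2, 5, 3]    # number of values f in each class

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Frequency-weighted sum of squared deviations from the mean.
sum_sq = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints))

s = math.sqrt(sum_sq / (n - 1))
print(sum_sq, round(s, 4))
```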
Measures of Dispersion:
The Standard Deviation

Steps for Calculating Standard Deviation

1. Calculate the difference between each value and the mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n-1 to get the sample
variance.
5. Take the square root of the sample variance to
get the sample standard deviation.
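
The five steps above can be sketched directly in Python. The function name `sample_std` and the three-value data set are illustrative assumptions, not part of the original slides.

```python
import math


def sample_std(values):
    """Sample standard deviation, following the five steps one by one."""
    n = len(values)
    mean = sum(values) / n
    differences = [x - mean for x in values]   # step 1: value minus mean
    squared = [d ** 2 for d in differences]    # step 2: square each difference
    total = sum(squared)                       # step 3: add the squares
    sample_variance = total / (n - 1)          # step 4: divide by n - 1
    return math.sqrt(sample_variance)          # step 5: take the square root


result = sample_std([2, 4, 6])  # hypothetical data: mean 4, squares 4 + 0 + 4
print(result)
```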
Measures of Dispersion:
Sample Standard Deviation:
Calculation Example

Sample Data (X_i): 10 12 14 15 17 18 18 24

n = 8, Mean = $\bar{X}$ = 16

$$S = \sqrt{\frac{(10-\bar{X})^2 + (12-\bar{X})^2 + (14-\bar{X})^2 + \cdots + (24-\bar{X})^2}{n-1}}$$

$$= \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + \cdots + (24-16)^2}{8-1}}$$

$$= \sqrt{\frac{130}{7}} = 4.3095$$

A measure of the “average” scatter around the mean
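
As a cross-check, the same sample can be passed to Python's standard `statistics` module, whose `stdev` function divides by n - 1 as the sample formula does.

```python
import statistics

sample = [10, 12, 14, 15, 17, 18, 18, 24]

mean = statistics.mean(sample)
s = statistics.stdev(sample)   # sample standard deviation (n - 1 divisor)
print(mean, round(s, 4))
```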
Semi-worked example

• We are going to find the standard deviation of the minimum temperatures of 10 weather stations in Britain on a winter's day.

The temperatures are:

5, 9, 3, 2, 7, 9, 8, 2, 2, 3 (°C)
To calculate the standard deviation we construct a table like
this one:

x ẍ (x - ẍ) (x - ẍ)2

There should be enough space here to fit in the number of values. E.g. there are 10 temperatures, so leave 10 lines.

∑x = ∑(x - ẍ)2 =
ẍ = ∑x/n =
∑(x - ẍ)2/n =
√∑(x - ẍ)2/n =

x = temperature --- ẍ = mean temperature --- √ = square root


∑ = total of --- 2 = squared --- n = number of values
Next we write the values (temperatures) in column x (they can be in any order).

x ẍ (x - ẍ) (x - ẍ)2
5
9
3
2
7
9
8
2
2
3
∑x =
∑(x - ẍ)2 =
ẍ = ∑x/n =
∑(x - ẍ)2/n =
√∑(x - ẍ)2/n =

Add them up (∑x), then calculate the mean (ẍ).

x ẍ (x - ẍ) (x - ẍ)2
5
9
3
2
7
9
8
2
2
3
∑x = 50
∑(x - ẍ)2 =
ẍ = ∑x/n = 50/10 = 5
∑(x - ẍ)2/n =
√∑(x - ẍ)2/n =

Write the mean temperature (ẍ) in
every row in the second column.

x ẍ (x - ẍ) (x - ẍ)2
5 5
9 5
3 5
2 5
7 5
9 5
8 5
2 5
2 5
3 5
∑x = 50
∑(x - ẍ)2 =
ẍ = ∑x/n = 50/10 = 5
∑(x - ẍ)2/n =
√∑(x - ẍ)2/n =

Subtract the mean from each value (temperature). It does not matter if you obtain a negative number.

x ẍ (x - ẍ) (x - ẍ)2
5 5 0
9 5 4
3 5 -2
2 5 -3
7 5 2
9 5 4
8 5 3
2 5 -3
2 5 -3
3 5 -2
∑x = 50
∑(x - ẍ)2 =
ẍ = ∑x/n = 50/10 = 5
∑(x - ẍ)2/n =
√∑(x - ẍ)2/n =

Square all of the figures you obtained in column 3. This gets rid of the negative numbers.

x ẍ (x - ẍ) (x - ẍ)2
5 5 0 0
9 5 4 16
3 5 -2 4
2 5 -3 9
7 5 2 4
9 5 4 16
8 5 3 9
2 5 -3 9
2 5 -3 9
3 5 -2 4
∑x = 50
∑(x - ẍ)2 =
ẍ = ∑x/n = 50/10 = 5
∑(x - ẍ)2/n =
√∑(x - ẍ)2/n =

Add up all of the figures that you
calculated in column 4 to get ∑ (x - ẍ)2.

x ẍ (x - ẍ) (x - ẍ)2
5 5 0 0
9 5 4 16
3 5 -2 4
2 5 -3 9
7 5 2 4
9 5 4 16
8 5 3 9
2 5 -3 9
2 5 -3 9
3 5 -2 4
∑x = 50
∑(x - ẍ)2 = 80
ẍ = ∑x/n = 50/10 = 5
∑(x - ẍ)2/n =
√∑(x - ẍ)2/n =

Divide ∑(x - ẍ)2 by the total number of values (in this case 10, the number of weather stations).

x ẍ (x - ẍ) (x - ẍ)2
5 5 0 0
9 5 4 16
3 5 -2 4
2 5 -3 9
7 5 2 4
9 5 4 16
8 5 3 9
2 5 -3 9
2 5 -3 9
3 5 -2 4
∑x = 50
∑(x - ẍ)2 = 80
ẍ = ∑x/n = 50/10 = 5
∑(x - ẍ)2/n = 8
√∑(x - ẍ)2/n =

Take the square root (√) of the figure to obtain the standard deviation. (Round your answer to one decimal place.)

x ẍ (x - ẍ) (x - ẍ)2
5 5 0 0
9 5 4 16
3 5 -2 4
2 5 -3 9
7 5 2 4
9 5 4 16
8 5 3 9
2 5 -3 9
2 5 -3 9
3 5 -2 4
∑x = 50
∑(x - ẍ)2 = 80
ẍ = ∑x/n = 50/10 = 5
∑(x - ẍ)2/n = 8
√∑(x - ẍ)2/n =

Answer

2.8°C
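
The whole table can be reproduced in a few lines of Python. Note that this worked example divides by n, so it treats the 10 stations as the whole population; the sample formula (dividing by n - 1) is shown only for comparison. Variable names are illustrative.

```python
import math
import statistics

temperatures = [5, 9, 3, 2, 7, 9, 8, 2, 2, 3]  # degrees C

n = len(temperatures)
mean = sum(temperatures) / n                       # column 2
squared_deviations = [(x - mean) ** 2 for x in temperatures]  # column 4
total = sum(squared_deviations)                    # sum of column 4

sd = math.sqrt(total / n)                          # divide by n, as in the table
sample_sd = statistics.stdev(temperatures)         # divides by n - 1 instead
print(mean, total, round(sd, 1), round(sample_sd, 2))
```

With only 10 values the choice of divisor is visible: about 2.8 when dividing by n, about 2.98 when dividing by n - 1.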
Why?

• Standard deviation is much more useful than the range because it takes every value into account.

• For example, if the temperatures are roughly normally distributed, our 2.8 means that there is about a 68% chance of the temperature falling within ± 2.8°C of the mean temperature of 5°C.

• That is one standard deviation away from the mean. Values are usually described as lying within one, two or three standard deviations of the mean.
Where did the 68% come from?

This is a normal distribution curve. It is a bell-shaped curve in which most of the data cluster around the mean value and the amount of data gradually declines the further you get from the mean, until very few data appear at the extremes.
For example: people's height

Most people are near average height.

Some are short. Some are tall.

But few are very short, and few are very tall.
Standard deviation

• Standard deviation tells us, roughly, the average distance of each score from the mean.
• 68% of normally distributed data is within 1 sd each side of the mean
• 95% within 2 sd
• Almost all (about 99.7%) is within 3 sd
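
These percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8 or later) and assumes the data follow a normal distribution exactly; the helper name `share_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of a normal distribution lying within k sd of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_1 = share_within(1)
within_2 = share_within(2)
within_3 = share_within(3)
print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```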
Example

• Mean IQ = 100, sd = 15
• What is the IQ of 68% of the population (i.e. what is the range of possible IQs)?
• Between what IQ scores would 95% of people be?
• Dan says he has done an online IQ test, and he has an IQ of 170. Should you believe him? Why/not?
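
One way to work through these questions, assuming IQ scores are normally distributed with the stated mean and sd. The variable names are illustrative.

```python
mean_iq, sd_iq = 100, 15

# About 68% of people lie within 1 sd of the mean, about 95% within 2 sd.
range_68 = (mean_iq - sd_iq, mean_iq + sd_iq)
range_95 = (mean_iq - 2 * sd_iq, mean_iq + 2 * sd_iq)

# How many standard deviations above the mean is Dan's claimed score?
dan_z = (170 - mean_iq) / sd_iq
print(range_68, range_95, round(dan_z, 2))
```

Dan's claimed score sits more than four and a half standard deviations above the mean, well outside the three-sd band that contains almost everyone, so the claim deserves scepticism.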
Another example

• Sol scores 61% in the test. His mum says that’s rubbish. Sol points out that the mean score in class was 50%, with an sd of 5. Did he do well?
• What if the sd was only 2?
• What if sd was 15?
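
A sketch of how the three scenarios compare, expressing Sol's score as a number of standard deviations above the class mean. The helper name `z_score` is an illustrative assumption.

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, measured in standard deviations."""
    return (score - mean) / sd


# Sol's score of 61 against a class mean of 50, for each candidate sd.
z_by_sd = {sd: z_score(61, 50, sd) for sd in (5, 2, 15)}

for sd, z in z_by_sd.items():
    print(f"sd = {sd}: {z:.2f} standard deviations above the mean")
```

With an sd of 5 or 2 the score is more than two standard deviations above the mean, which is unusually good; with an sd of 15 it is less than one standard deviation above, which is fairly ordinary.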
Measures of Dispersion:
Comparing Standard Deviations

[Figure: two distributions, one with a smaller standard deviation and one with a larger standard deviation]
What Does the Variance
Formula Mean?

• Variance is the mean of the squared deviation scores
• The larger the variance is, the more the scores deviate, on average, away from the mean
• The smaller the variance is, the less the scores deviate, on average, from the mean
Measures of Dispersion:
The Variance

• Average (approximately) of squared deviations of values from the mean
• Sample variance:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$

Where $\bar{X}$ = arithmetic mean
n = sample size
X_i = ith value of the variable X
Another formula for Variance

• Sample variance with a frequency table:

$$s^2 = \frac{\sum f x^2 - n\bar{x}^2}{n-1}$$

$\bar{x}$ = arithmetic mean
n = sample size (the sum of the frequencies)
x = value of the variable X (or class midpoint)
f = frequency
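
A quick numerical check that a shortcut of the form (∑f·x² − n·x̄²)/(n − 1) agrees with the definition ∑f·(x − x̄)²/(n − 1). The shortcut is the standard computational form and is assumed here to be what the slide intends; the frequency table is hypothetical.

```python
# Hypothetical frequency table.
values = [5, 15, 25]       # x
frequencies = [2, 5, 3]    # f

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, values)) / n

# Definition: frequency-weighted squared deviations from the mean.
by_definition = sum(
    f * (x - mean) ** 2 for f, x in zip(frequencies, values)
) / (n - 1)

# Shortcut: sum of f * x^2 minus n times the squared mean.
by_shortcut = (
    sum(f * x ** 2 for f, x in zip(frequencies, values)) - n * mean ** 2
) / (n - 1)
print(by_definition, by_shortcut)
```

The shortcut avoids computing each deviation separately, which is convenient for hand calculation.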
For A Population:
The Variance σ²

• Average of squared deviations of values from the mean
• Population variance:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

Where μ = population mean
N = population size
X_i = ith value of the variable X
Measures of Dispersion:
The Coefficient of Variation

• Measures relative variation
• Always in percentage (%)
• Shows variation relative to mean
• Can be used to compare the variability of two or more sets of data measured in different units

$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
The Coefficient of Variation

• Coefficient of variation of a population:

$$CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$$

• This can be used to compare two distributions directly to see which has more dispersion because it does not depend on the units of the distribution.
Measures of Dispersion:
Comparing Coefficients of Variation

• Stock A:
  ◦ Average price last year = $50
  ◦ Standard deviation = $5

$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

• Stock B:
  ◦ Average price last year = $100
  ◦ Standard deviation = $5

$$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$

Both stocks have the same standard deviation, but stock B is less variable relative to its price.
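
The stock comparison can be reproduced with a small helper. The function name `coefficient_of_variation` is an illustrative assumption.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(5, 50)    # Stock A: sd $5, mean $50
cv_b = coefficient_of_variation(5, 100)   # Stock B: sd $5, mean $100
print(cv_a, cv_b)
```

Because the units cancel in the ratio, the same helper could compare series measured in different units.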
Sample statistics versus
population parameters

Measure              Population Parameter   Sample Statistic
Mean                 μ                      X̄
Variance             σ²                     S²
Standard Deviation   σ                      S
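
Python's standard `statistics` module mirrors this distinction: `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by n - 1. The sketch below applies both pairs to the same eight values purely to show the difference; in practice only one interpretation applies to a given data set.

```python
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]

population_variance = statistics.pvariance(data)   # divides by N
sample_variance = statistics.variance(data)        # divides by n - 1
population_sd = statistics.pstdev(data)
sample_sd = statistics.stdev(data)

print(population_variance, round(sample_variance, 4))
print(round(population_sd, 4), round(sample_sd, 4))
```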
Measures of Dispersion:
Summary Characteristics

• The more the data are spread out, the greater the range, variance, and standard deviation.

• The less the data are spread out, the smaller the range, variance, and standard deviation.

• If the values are all the same (no variation), all these measures will be zero.

• None of these measures are ever negative.
Summary of Measures

Measure                           Formula                                     Describes
Range                             $X_{\text{largest}} - X_{\text{smallest}}$  Total spread
Standard Deviation (Sample)       $\sqrt{\sum (X_i - \bar{X})^2 / (n-1)}$     Dispersion about sample mean
Standard Deviation (Population)   $\sqrt{\sum (X_i - \mu)^2 / N}$             Dispersion about population mean
Variance (Sample)                 $\sum (X_i - \bar{X})^2 / (n-1)$            Squared dispersion about sample mean
Thank you
