
CHAPTER 5

MEASURES OF VARIATION

1. Concept of variation. In this chapter we want to study how much the values of a distribution vary,
and to this end we will build several measures called measures of variation, which in turn can be
absolute measures if they have units, or relative measures if they don’t have any units. The most
important absolute measures of variation are the variance and the standard deviation, whereas the most
important relative measure is Pearson’s coefficient of variation.

2. Absolute measures of variation.


1) Range: Defined as the difference between the maximum and the minimum of the observed values,
that is, R = x_k − x_1. So, for Example 2 of the salary distributions shown in Chapter 3, page 2, the
range would be R = 2300 − 1400 = 900 (euros); whereas for Example 3 of newborns' weights in
Chapter 3, page 4, the range would be R = 4.5 − 2.4 = 2.1 (kg).
Observe that the range only depends on the two extreme values. Thus, this information is
sometimes worth very little.

2) Interquartile range (IQR): Defined as the difference between the 3rd and the 1st quartile, that is,
IQR = Q3 − Q1, with the same idea as that of the range but taking only the central 50% of the
observed values. In this way the IQR has exactly the opposite defect to that of the range, because it
only tells us about the variability of the central, non-extreme part of the values of the set.

Between Q1 and Q3 lie the central 50% of the values.

Now, for the newborns’ weights, the calculation would be this:


Intervals     Absolute freq. ni   Relative freq. fi   Cumulative rel. freq. Fi
[2.4, 2.9]    3                   3/19 ≈ 0.158        3/19 ≈ 0.158
(2.9, 3.3]    6                   6/19 ≈ 0.316        9/19 ≈ 0.474
(3.3, 3.8]    6                   6/19 ≈ 0.316        15/19 ≈ 0.789
(3.8, 4.5]    4                   4/19 ≈ 0.211        19/19 = 1

Q1 = 2.9 + ((0.25 − 3/19) / (6/19)) · 0.4 = 3.02
Q3 = 3.3 + ((0.75 − 9/19) / (6/19)) · 0.5 = 3.74

IQR = 3.74 − 3.02 = 0.72 (kg)

Therefore, the difference in weights for the central 50% of the observed values is 0.72 (kg).
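The interpolation above can be reproduced in a few lines of Python. This is only a sketch of the calculation shown, and `grouped_quantile` is a name chosen here for illustration, not a standard function:

```python
# Hypothetical helper: p-quantile of grouped data by linear interpolation
# inside the interval whose cumulative relative frequency first reaches p.
def grouped_quantile(intervals, counts, p):
    n = sum(counts)
    cum = 0
    for (lo, hi), c in zip(intervals, counts):
        prev = cum
        cum += c
        if cum / n >= p:
            # fraction of the interval needed to reach probability p
            return lo + (p - prev / n) / (c / n) * (hi - lo)
    raise ValueError("p out of range")

# Newborns' weights from the table above
intervals = [(2.4, 2.9), (2.9, 3.3), (3.3, 3.8), (3.8, 4.5)]
counts = [3, 6, 6, 4]

q1 = grouped_quantile(intervals, counts, 0.25)
q3 = grouped_quantile(intervals, counts, 0.75)
print(round(q1, 2), round(q3, 2), round(q3 - q1, 2))   # 3.02 3.74 0.72
```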

page 1 Part 5
As we have seen, neither the Range nor the IQR can be considered in general a suitable
measure of variability. An alternative and better idea is to take a central position point as a
reference, measure a positive "distance" from every observed value to this reference and
calculate the mean value of all these "distances". But then there is more than one choice, because the
reference can be the arithmetic mean or the median, and the "distances" can be measured by using the
absolute value of the deviations of every observed value with respect to the reference, or by taking the
square of these deviations. The names and formulations of all these measures of variability are
shown in the following table:

Central point: Me
  - Differences measured by the absolute values of the deviations: average of the absolute
    deviations with respect to the median,
    D_Me = (Σ_{i=1}^k |x_i − Me| · n_i) / N, or D_Me = Σ_{i=1}^k |x_i − Me| · f_i
  - Differences measured by the square of the deviations: not used in practice.

Central point: x̄
  - Differences measured by the absolute values of the deviations: average of the absolute
    deviations with respect to the arithmetic mean,
    D_x̄ = (Σ_{i=1}^k |x_i − x̄| · n_i) / N, or D_x̄ = Σ_{i=1}^k |x_i − x̄| · f_i
  - Differences measured by the square of the deviations: average of the squared deviations
    with respect to the arithmetic mean (the variance),
    S_X² = (Σ_{i=1}^k (x_i − x̄)² · n_i) / N, or S_X² = Σ_{i=1}^k (x_i − x̄)² · f_i

The most widely used of these measures is the variance, and its square root S_X is called the standard
deviation. The reasons are that the arithmetic mean is usually preferable to the median, and that it is
easier and more intuitive to work with squares than with absolute values. It can also be shown
that this chain of inequalities always holds: D_Me ≤ D_x̄ ≤ S_X.
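As a sketch (not part of the original text), the three measures can be computed from their definitions for the salary data used later in this chapter, and the chain of inequalities checked numerically:

```python
from math import sqrt

# Salary data: distinct values with their absolute frequencies ni
values = [1400, 1500, 1950, 2300]
counts = [3, 2, 2, 1]
n = sum(counts)                                                   # N = 8

mean = sum(x * c for x, c in zip(values, counts)) / n             # 1675.0
data = [x for x, c in zip(values, counts) for _ in range(c)]      # expanded, sorted
median = (data[n // 2 - 1] + data[n // 2]) / 2                    # 1500.0 (N even)

d_me = sum(abs(x - median) * c for x, c in zip(values, counts)) / n
d_bar = sum(abs(x - mean) * c for x, c in zip(values, counts)) / n
s_x = sqrt(sum((x - mean) ** 2 * c for x, c in zip(values, counts)) / n)

print(d_me, d_bar, round(s_x, 2))   # 250.0 293.75 322.1
```

For this distribution, D_Me = 250 ≤ D_x̄ = 293.75 ≤ S_X ≈ 322.10, as the inequality predicts.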

3. The variance and the standard deviation.


As we have already said, the variance of a distribution X of values is defined as the mean of the squared
deviations with respect to the arithmetic mean, and is represented as S_X², that is,

S_X² = (Σ_{i=1}^k (x_i − x̄)² · n_i) / N, or S_X² = Σ_{i=1}^k (x_i − x̄)² · f_i

Thus, in order to calculate the variance, one first has to calculate the arithmetic mean. If, for
instance, we want to find the variance of the monthly salaries, then we start by calculating
x̄ = (1400·3 + 1500·2 + 1950·2 + 2300) / 8 = 1675, so that the deviations with respect to x̄ would be as
follows:
xi ni xi  x
1400 3 -275
1500 2 -175
1950 2 275
2300 1 625
N= 8

(275)2  3  (175)2  2  2752  2  6252


Thus, S X2   103, 750(€ 2 )
8
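The same computation can be written as a direct translation of the definition of the variance (a sketch using the salary data above):

```python
values = [1400, 1500, 1950, 2300]   # distinct salaries
counts = [3, 2, 2, 1]               # absolute frequencies ni
n = sum(counts)                     # N = 8

# arithmetic mean, needed before the variance
mean = sum(x * c for x, c in zip(values, counts)) / n
# variance: mean of the squared deviations with respect to the mean
variance = sum((x - mean) ** 2 * c for x, c in zip(values, counts)) / n
print(mean, variance)   # 1675.0 103750.0
```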

Important remarks:
1. Observe that the variance gives more importance to the deviations obtained from points that are far
from the arithmetic mean, since the deviations have been squared. This is very convenient, because
one large deviation must be considered more important than several small deviations.

2. The unit of measure of the variance is the same as that of the observed data, but squared. That is, in
the previous example the variance is 103,750 (€²), which of course doesn't make sense. In order to
correct this situation and recover the original unit of measure, we define the standard deviation
as the square root of the variance, represented by S_X. In our example S_X = √103750 ≈ 322.10
(€), which gives us a reference to decide whether a value deviates more or less from the
mean than the "mean or standard" deviation for the group. For instance, in accordance with this
criterion, the salary of €2300 deviates by nearly twice this general level (625 ≈ 2 · 322.10), whereas
the other deviations are below it.

The following properties of the variance and of the standard deviation are very important:

a) The variance is the difference between the arithmetic mean of the squared observed values
and the square of the original arithmetic mean, that is, S_X² = Σ_{i=1}^k x_i² · f_i − (x̄)², which can
also be written S_X² = (Σ_{i=1}^k x_i² · n_i) / N − (x̄)², or, in words, S_X² = (mean of the
squares) − (square of the mean). In this way, we must take into account
that first squaring the data and then calculating the arithmetic mean is not the same as first
calculating the arithmetic mean and then squaring it. The difference between these two results is
precisely the variance.

Proof: We have to justify a sort of algebraic identity similar to (a  b) 2  a 2  2ab  b 2 but a little
more sophisticated. To this end we start from the definition of variance, expand every square,
reorganize the summation and finally take into account the definition of the arithmetic mean to
obtain the desired result. We proceed this way:
S_X² = Σ_{i=1}^k (x_i − x̄)² · f_i = (x_1 − x̄)² f_1 + … + (x_k − x̄)² f_k
     = (x_1² − 2x̄x_1 + (x̄)²) f_1 + … + (x_k² − 2x̄x_k + (x̄)²) f_k
     = x_1² f_1 + … + x_k² f_k − 2x̄ (x_1 f_1 + … + x_k f_k) + (x̄)² (f_1 + … + f_k)
     = Σ_{i=1}^k x_i² f_i − 2(x̄)² + (x̄)² = Σ_{i=1}^k x_i² f_i − (x̄)²,

and the proof is finished.


If we don't use a calculator, the formula S_X² = (Σ_{i=1}^k x_i² · n_i) / N − (x̄)² is easier to apply than
S_X² = (Σ_{i=1}^k (x_i − x̄)² · n_i) / N, because this way it is not necessary to repeatedly subtract the
mean from every observed value.

If we come back to the example of the salaries, then by using the new formula for the variance we
obtain

S_X² = (1400² · 3 + 1500² · 2 + 1950² · 2 + 2300²) / 8 − 1675² = 103,750,

which of course is the same result we obtained by using the other formula.
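A quick numerical check (a sketch, using the same salary data) that both formulas agree:

```python
values = [1400, 1500, 1950, 2300]
counts = [3, 2, 2, 1]
n = sum(counts)

mean = sum(x * c for x, c in zip(values, counts)) / n          # 1675.0
mean_of_squares = sum(x ** 2 * c for x, c in zip(values, counts)) / n

# "mean of the squares minus square of the mean"
variance = mean_of_squares - mean ** 2
print(variance)   # 103750.0
```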

b) The variance and the standard deviation are always non-negative, and they are zero only if every
deviation with respect to the mean is null, which only happens when all the data take the same
value (the variable is then called degenerate).
Proof: The variance is a summation of squares, and the summation will be zero only if each of the
squares is null.

c) Change of origin: If a constant c is added to or subtracted from every value of a distribution, the
variance doesn't change (and of course, the standard deviation doesn't change either).
Proof: Let X be the original distribution of values, and let Y = X + c be the transformed distribution
after adding the constant. We know that the frequencies of the original values x_i and those of their
transformations y_i = x_i + c are the same, and that the relationship between the means is ȳ = x̄ + c.
Thus, by applying the definition of the variance, we obtain:

S_Y² = Σ_{i=1}^k (y_i − ȳ)² · f_i = Σ_{i=1}^k (x_i + c − x̄ − c)² · f_i = S_X²,

and the two variances are the same.

For instance, if in the example of the salaries we increase each of them by € 200, we would have
the following situation:

[Figure: the original salaries (above) and the salaries increased by €200 (below). The salaries change,
but the deviations with respect to their means are the same.]

d) Change of scale: If every value of a distribution is multiplied (or divided) by a positive constant
c, the new variance is the original variance multiplied (divided) by c2 . The standard deviation
is the original standard deviation multiplied (divided) by c.

The demonstration of this property is similar to that above, and is left as an exercise to the reader.

Properties c) and d) show us that the variance and the standard deviation are sensitive to scale
changes, but they are not sensitive to origin changes.

4. Pearson’s coefficient of variation: Variability is in general better appreciated in relative terms. For
instance, if we observe the weights of a group of dogs, the difference can be, let’s say, 50 kg between
the biggest and the smallest (imagine a Saint Bernard and a Chihuahua), whereas in a herd of elephants
the difference can be 500 kg. Nevertheless, the elephants’ weights can be considered more
homogeneous because a difference of 500 kg is not much for elephants, but 50 kg is a big difference for
dogs. That is, it is usually more effective to compare proportional variations, and in accordance with
this, the main relative measure of variation is Pearson’s coefficient of variation, defined as the
proportion between the standard deviation and the mean, that is:
Pearson's coefficient of variation: CV_X = S_X / x̄

This is a non-dimensional coefficient, because the standard deviation and the arithmetic mean have
the same units, that of the data, and in the division they cancel each other out.

For instance, for the variables X: dogs' weights and Y: elephants' weights, both measured in kg, and
described before, let's suppose the situation to be this:

Dogs:      x̄ = 19.5 kg,   S_X = 9.3 kg    →  CV_X = 9.3 / 19.5 = 0.4769 (47.69%)
Elephants: ȳ = 4523.8 kg, S_Y = 454.1 kg  →  CV_Y = 454.1 / 4523.8 = 0.1004 (10.04%)

The CV is smaller for the elephants, because their population is more homogeneous. On average, the
weights of the dogs deviate by 47.69% with respect to their arithmetic mean, whereas the elephants'
weights deviate by only 10.04%.
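The two coefficients can be computed directly from the figures stated above (a sketch; the kg units cancel out in the division):

```python
def cv(mean, sd):
    """Pearson's coefficient of variation: standard deviation over mean."""
    return sd / mean

cv_dogs = cv(19.5, 9.3)
cv_elephants = cv(4523.8, 454.1)
print(round(cv_dogs, 4), round(cv_elephants, 4))   # 0.4769 0.1004
```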

Observe that the CV is not defined when the arithmetic mean is null. Furthermore, if the arithmetic
mean is negative, the coefficient is usually defined as CV_X = S_X / |x̄|.

Main properties of the coefficient of variation:


1) The CV is not sensitive to scale changes, because if, given a variable X, we apply a change of
scale, the new variable will be Y = cX with c > 0. Therefore, the coefficient of variation of Y is

CV_Y = S_Y / ȳ = (c · S_X) / (c · x̄) = S_X / x̄ = CV_X,

that is, CV_Y = CV_X, and the coefficient hasn't changed.

2) The CV is sensitive to changes of origin, because if Y = X + c, then we obtain

CV_Y = S_Y / ȳ = S_X / (x̄ + c) = (x̄ / (x̄ + c)) · (S_X / x̄) = (x̄ / (x̄ + c)) · CV_X,

that is, CV_Y = (x̄ / (x̄ + c)) · CV_X, and in this case CV_Y is not the same as CV_X.

3) The CV is usually used to evaluate how representative the arithmetic mean is. The smaller
the CV, the more homogeneous the population, and the more representative its arithmetic
mean can be considered. For instance, the arithmetic mean of the elephants' weights can be
considered more representative than that of the dogs, because we have seen that the CV was smaller
for the elephants.

Traditionally, the arithmetic mean is considered a good representation for the group when CV < 35%.
Thus:

CV < 0.35 (35%)  →  representative arithmetic mean (homogeneous population)
CV ≥ 0.35 (35%)  →  non-representative arithmetic mean (non-homogeneous population)

5. Standardised variables.
Given a variable X, for which the arithmetic mean and the standard deviation have been calculated, its
standardised values are those obtained by first subtracting the arithmetic mean and then dividing by
the standard deviation. That is, X standardised is the variable Z_X = (X − x̄) / S_X.

The arithmetic mean of a standardised variable is null, and its standard deviation is one. This is so
because the mean of Z_X will be (x̄ − x̄) / S_X = 0, and its variance
Var(Z_X) = (1 / S_X²) · Var(X − x̄) = S_X² / S_X² = 1.
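As a numerical sketch of this fact, standardising the salary data from earlier in the chapter gives a mean of (essentially) 0 and a standard deviation of 1:

```python
from math import sqrt, isclose

# Salary data, expanded to individual values
data = [1400, 1400, 1400, 1500, 1500, 1950, 1950, 2300]
n = len(data)
mean = sum(data) / n
sd = sqrt(sum((x - mean) ** 2 for x in data) / n)

# standardise: subtract the mean, then divide by the standard deviation
z = [(x - mean) / sd for x in data]
z_mean = sum(z) / n
z_sd = sqrt(sum((v - z_mean) ** 2 for v in z) / n)
print(isclose(z_mean, 0, abs_tol=1e-12), isclose(z_sd, 1))   # True True
```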

Standardisation is then a change of origin and scale, which is useful for two main reasons:

a) It simplifies the evaluation of the relative position of any value inside the distribution. In the
general case, we have to take into account the arithmetic mean and the standard deviation in order
to consider whether a certain value must be considered as a central value or an extreme value, in
accordance with this graphical representation:

The central values for a general distribution are those between x  S X and x  S X

But for a standardised variable, the situation is easier

The central values for a standardised distribution are those between -1 and 1

b) It allows comparisons between elements of different populations, as shown in the following
example.
“In a group, 10 students study Mathematics, whose marks are: 1, 1, 3, 4, 4, 4, 5.5, 8.5, 8.5, 8.5;
whereas 11 students of the group study Economics, whose marks are: 3, 3, 4, 7, 7.5, 7.5, 7.5, 7.5, 7.5,
8, 8; where the highlighted marks (5.5 in Mathematics and 7 in Economics) belong to the same student.
In which subject can it be said that the student is performing better in comparison with his classmates?”

We can find that the arithmetic mean for Mathematics is 4.8, with a standard deviation of 2.75, whereas
in Economics the mean is 6.1 and the standard deviation is 1.92. The student has received a higher
mark in Economics, but this subject seems easier. In Mathematics the standardised value is
(5.5  4.8) 2.75  0.25 , and in Economics it is (7  6.1) 1.92  0.47 . Thus, according to this, the
student is above the general average in both subjects, but in a better position in Economics than in
Mathematics.
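The two standardised marks follow directly from the means and standard deviations stated above (a sketch of the arithmetic, not part of the original text):

```python
# standardised mark = (mark - group mean) / group standard deviation
z_maths = (5.5 - 4.8) / 2.75
z_econ = (7 - 6.1) / 1.92
print(round(z_maths, 2), round(z_econ, 2))   # 0.25 0.47
```

Both values are positive, so the student is above average in both subjects, but the larger z-score in Economics places him higher relative to his classmates there.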

Summary: The measures of variation are used to indicate how much a distribution varies. They can be
of two different types:

Absolute measures of variation: They have units. The main ones are the range, R = x_k − x_1; the
interquartile range, IQR = Q3 − Q1; the average of the absolute deviations with respect to the median,
D_Me = Σ_{i=1}^k |x_i − Me| · f_i; the average of the absolute deviations with respect to the arithmetic
mean, D_x̄ = Σ_{i=1}^k |x_i − x̄| · f_i; and above all, the variance S_X² = Σ_{i=1}^k (x_i − x̄)² · f_i and
its square root, the standard deviation S_X = √(Σ_{i=1}^k (x_i − x̄)² · f_i).

Important properties:
a) The variance is also the difference between the arithmetic mean of the squared observed
values and the square of the original arithmetic mean, that is, S_X² = Σ_{i=1}^k x_i² · f_i − (x̄)².
b) The variance and the standard deviation are both non-negative, and they are zero only if the
variable is degenerate.
c) Change of origin: If a constant c is added to or subtracted from every value of a distribution, neither
the variance nor the standard deviation change.
d) Change of scale: If every value of a distribution is multiplied (or divided) by a positive constant
c, the new variance is the original variance multiplied (divided) by c², and the standard deviation
is the original standard deviation multiplied (divided) by c.

Relative measures of variation: They don't have units, and the most widely used is Pearson's
coefficient of variation, CV_X = S_X / x̄.
Important properties:
a) The CV is not sensitive to scale changes: for Y = cX, CV_Y = CV_X.
b) The CV is sensitive to changes of origin: for Y = X + c, CV_Y = (x̄ / (x̄ + c)) · CV_X.
c) The CV is used to evaluate how representative the arithmetic mean is. According to a
general rule it can be said that:
CV < 0.35 (35%)  →  representative arithmetic mean (homogeneous population)
CV ≥ 0.35 (35%)  →  non-representative arithmetic mean (non-homogeneous population)

X x
Standardisation: Given a variable X, its standardisation is the variable Z X  . For standardised
SX
variables the arithmetic mean is always zero and their standard deviation is always one. Standardisation
is only a change of origin and scale which simplifies the evaluation of the relative position of any
observed value inside the whole population. Standardization also allows comparisons between elements
which belong to different populations.
