MEASURES OF VARIATION
The measures of variation or dispersion indicate the degree or extent to which numerical values are dispersed
or spread out about the average value in a distribution. In this chapter, we will discuss the more commonly used
measures of variation. These are the range, the interquartile range, the semi-interquartile range or quartile deviation,
the mean deviation or average deviation, the variance and the standard deviation.
RANGE
The range, which is the simplest measure to compute, is the difference between the largest and the lowest values in a
set of numerical data. For ungrouped data, it is obtained by subtracting the lowest value from the largest
value. For grouped data, the range is determined by subtracting the lower boundary of
the lowest class interval from the upper boundary of the highest class interval of a frequency distribution,
because the class boundaries are considered the true limits.
Thus, we have the following formulas:
For ungrouped Data
Range (R) = Highest value (HV) – Lowest Value (LV) or
R = HV – LV
For Grouped Data
Range (R) = Upper Boundary of the highest class interval (UBHCI) – Lower Boundary of lowest class interval
(LBLCI)
R = UBHCI - LBLCI
EXAMPLE: The scores obtained by 12 students in Statistics class are 80,75,63,95,98,78,85,90,73,68,87 and 81.
Find the range.
Solution: The highest score is 98, while the lowest score is 63. Hence,
R = HV – LV
R = 98 – 63
R = 35
EXAMPLE: Find the range of a frequency distribution whose highest class interval is 91-95 and whose lowest
class interval is 51-55.
Solution: The upper boundary of the highest class interval 91-95 is 95.5 and the lower boundary of the lowest
class interval 51-55 is 50.5. Therefore, the range is obtained, as follows.
R = UBHCI - LBLCI
R = 95.5 – 50.5
R = 45
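The two range formulas can be sketched in Python as follows. This is only an illustration: the function names are made up for this sketch, and the grouped version assumes that class boundaries lie half a unit beyond the stated class limits, as in the example above.

```python
def range_ungrouped(values):
    """Range of raw data: highest value minus lowest value (R = HV - LV)."""
    return max(values) - min(values)


def range_grouped(lowest_class, highest_class, unit=1):
    """Range of a frequency distribution (R = UBHCI - LBLCI).

    Each class is a (lower_limit, upper_limit) pair. Assumption: the class
    boundaries lie half a unit beyond the limits, e.g. 51-55 -> 50.5 to 55.5.
    """
    lower_boundary = lowest_class[0] - unit / 2
    upper_boundary = highest_class[1] + unit / 2
    return upper_boundary - lower_boundary


# Scores of the 12 students in the first example.
scores = [80, 75, 63, 95, 98, 78, 85, 90, 73, 68, 87, 81]
print(range_ungrouped(scores))            # 35

# Lowest class 51-55 and highest class 91-95 from the second example.
print(range_grouped((51, 55), (91, 95)))  # 45.0
```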
[Figure: positions of the quartiles Q1, Q2 and Q3 in the ordered data]
Note that Q1 is the point below which the lowest 25% of the data fall when arranged from lowest to highest,
while Q3 is the point below which the lowest 75% of the data fall.
Thus, we apply the formulas as follows:
b. IQR = Q3 – Q1
IQR = 30.5 – 22.5
IQR = 8
c. SIQR or QD = (Q3 – Q1) / 2
QD = (30.5 – 22.5) / 2
SIQR or QD = 4
EXAMPLE: Table 1 below shows the average production of 60 employees of a manufacturing company during a
given week.
Find the:
a. Range
b. Inter-quartile range
c. Semi-interquartile range or Quartile Deviation
Q1 = 70.5 + 5 [ (60/4 – 12) / 10 ]
Q1 = 72
Q3 = 80.5 + 5 [ ((3/4)(60) – 12) / 10 ]
Q3 = 83.68
Therefore, IQR = Q3 – Q1
IQR = 83.68 – 72
IQR = 11.68
QD = (83.68 – 72) / 2
QD = 5.84
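The quartile computation above can be sketched in Python. The function and parameter names are illustrative. Because Table 1 is not reproduced here, only Q1 is recomputed from the values shown in the solution; Q3 = 83.68 is taken as stated.

```python
def grouped_quartile(k, n, lower_boundary, width, cum_freq_before, class_freq):
    """k-th quartile of grouped data, interpolated within the quartile class.

    Q_k = LB + c * ((k * n / 4 - cf) / f), where LB is the lower boundary of
    the quartile class, c the class width, cf the cumulative frequency before
    the class and f the frequency of the class.
    """
    return lower_boundary + width * ((k * n / 4 - cum_freq_before) / class_freq)


def interquartile_range(q1, q3):
    """IQR = Q3 - Q1."""
    return q3 - q1


def quartile_deviation(q1, q3):
    """SIQR or QD = (Q3 - Q1) / 2."""
    return (q3 - q1) / 2


# Q1 from the values shown in the solution: n = 60, LB = 70.5, c = 5, cf = 12, f = 10.
q1 = grouped_quartile(1, 60, 70.5, 5, 12, 10)
q3 = 83.68  # taken as stated in the solution, not recomputed

iqr = interquartile_range(q1, q3)
qd = quartile_deviation(q1, q3)
print(q1, round(iqr, 2), round(qd, 2))
```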
The Mean Deviation or Average Deviation
The mean deviation or average deviation is the average of the absolute deviations of the individual values of a set
of numerical data from the mean, the median or the mode. Among the three, however, the mean is the most commonly used
measure of central tendency for computing the mean deviation or average deviation. To determine the mean
or average deviation, we shall use the following formulas:
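For Ungrouped Data:
MD or AD = Σ|x – x̄| / n
For Grouped Data:
MD or AD = Σf|x – x̄| / n
Where:
x – refers to the individual value for ungrouped data, and the midpoint of each class interval for grouped data
x̄ – the mean of the data
f – the frequency of each class interval (grouped data)
n – the number of observations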
Mean (x̄)
x̄ = Σx / n = 80 / 10 = 8
Mean Deviation (MD) or Average Deviation (AD)
MD or AD = Σ|x – x̄| / n = 26 / 10
MD or AD = 2.6
The MD or AD of 2.6 indicates the average amount by which the individual sales deviate from their mean of 8.
The higher the value of MD or AD, the larger the dispersion.
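A minimal Python sketch of the mean deviation for ungrouped data is given below. The data set is hypothetical and chosen only to make the arithmetic easy to follow; it is not the sales data of the example, whose table is not shown here.

```python
def mean_deviation(values):
    """MD or AD = sum(|x - mean|) / n for ungrouped data."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


# Hypothetical data (assumption for illustration only); the mean is 8.
sample = [4, 6, 8, 10, 12]

# Absolute deviations from 8 are 4, 2, 0, 2, 4, which sum to 12.
print(mean_deviation(sample))  # 2.4
```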
For grouped data, we shall compute the mean deviation (MD) or average deviation (AD) by following the
procedures below.
1. Compute the mean (x̄) of the distribution.
2. Subtract the mean (x̄) from each of the midpoints and write the absolute values of the results under
the column |x – x̄|.
3. Find the products of the items under column f and the items under column |x – x̄|, and write them under
the column f|x – x̄|.
4. Add the products in step 3 and divide the sum by n.
The mean (x̄) of the distribution is calculated by using the formula presented in chapter 3, as follows.
x̄ = Σfx / n = 5,177 / 89
x̄ = 58.17
Thus, the mean deviation (MD) or average deviation (AD) shall be computed as follows:
MD or AD = Σf|x – x̄| / n = 519.25 / 89 = 5.83
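The grouped-data procedure can be sketched in Python as follows. The frequency table used here is hypothetical (the table of the example is not shown), and the function name is illustrative.

```python
def grouped_mean_deviation(midpoints, freqs):
    """Return (mean, MD) of a frequency distribution.

    mean = sum(f * x) / n and MD = sum(f * |x - mean|) / n,
    where x is a class midpoint and n = sum(f).
    """
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    total_abs_dev = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs))
    return mean, total_abs_dev / n


# Hypothetical distribution (assumption): classes 51-55, 56-60 and 61-65.
midpoints = [53, 58, 63]
freqs = [2, 5, 3]

mean, md = grouped_mean_deviation(midpoints, freqs)
print(mean, md)  # mean 58.5, MD 2.7 (up to floating-point rounding)
```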
The Variance
The variance is defined as the average of the squared deviations from the mean. The square root of the variance is known
as the standard deviation.
The variance of sample data is denoted by S² (read "S squared" or "the square of S"), while the symbol for the
variance of a population is σ², read "sigma squared".
The Variance for ungrouped data
The formulas for calculating the variance for ungrouped data are:
For the variance of a sample data:
S² = Σ(x – x̄)² / (n – 1), where x̄ is the mean of the sample data
For the variance of a population:
σ² = Σ(x – μ)² / N, where μ represents the mean of the population.
We will use n – 1 instead of n as the divisor in computing the variance of sample data, although the variance is
defined as the average of the squared deviations about the mean. The reason is to avoid the bias that arises
when n is used as the divisor: because the deviations are measured from the sample mean rather than the population
mean, dividing by n tends to underestimate the population variance, especially when the sample size is small.
Different samples of size n selected from the same population generally yield different values for the variance.
With the divisor n – 1, however, the average of these values computed from many samples of the population tends
to be closer to the actual variance, that is, the population variance.
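The effect of the divisor can be checked exactly on a very small case. The sketch below is an illustration under stated assumptions: a hypothetical population of three values, and samples of size 2 drawn with replacement. It lists every possible sample and averages the sample variances computed with each divisor.

```python
from itertools import product


def variance(values, divisor):
    """Sum of squared deviations from the mean of `values`, over `divisor`."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / divisor


# Hypothetical population (assumption for illustration only).
population = [1, 2, 3]
N = len(population)
pop_var = variance(population, N)  # population variance, 2/3

# All 9 ordered samples of size n = 2 drawn with replacement.
n = 2
samples = list(product(population, repeat=n))

avg_with_n = sum(variance(s, n) for s in samples) / len(samples)
avg_with_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)

print(round(pop_var, 4))             # 0.6667
print(round(avg_with_n, 4))          # 0.3333, too small on average
print(round(avg_with_n_minus_1, 4))  # 0.6667, matches the population variance
```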
To determine the variance of ungrouped data, let us follow the steps below.
1. Arrange the values according to magnitude (lowest to highest or vice versa) vertically.
2. Calculate the mean.
3. Obtain the individual deviations from the mean.
4. Square each deviation and write the results under the column (x – x̄)².
5. Find the sum of the squared deviations.
6. Divide the sum in step 5 by n – 1 for sample data or by N for population data.
Example: Determine the variance of the following 8 sample production units of a certain company: 10, 11, 9,
17,13,15,13 and 20.
Solution: The 8 sample data, with the values of x – x̄ and (x – x̄)², are as follows:
x       x – x̄      (x – x̄)²
9       -4.5       20.25
10      -3.5       12.25
11      -2.5        6.25
13      -0.5        0.25
13      -0.5        0.25
15       1.5        2.25
17       3.5       12.25
20       6.5       42.25
n = 8              Σ(x – x̄)² = 96
Thus, the mean is obtained as follows:
x̄ = Σx / n = (9 + 10 + 11 + 13 + 13 + 15 + 17 + 20) / 8 = 108 / 8 = 13.5
S² = Σ(x – x̄)² / (n – 1) = 96 / (8 – 1)
S² = 13.71
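The same computation can be sketched in Python. The hand-written functions follow the steps above; the cross-check uses `statistics.variance` and `statistics.pvariance` from the Python standard library, which use the divisors n – 1 and N respectively.

```python
import statistics


def sample_variance(values):
    """S^2 = sum((x - mean)^2) / (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """sigma^2 = sum((x - mu)^2) / N."""
    N = len(values)
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N


# The 8 sample production units from the example.
units = [10, 11, 9, 17, 13, 15, 13, 20]

print(round(sample_variance(units), 2))      # 13.71
print(round(statistics.variance(units), 2))  # 13.71 (library cross-check)
print(population_variance(units))            # 12.0, if the 8 values were a whole population
```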
As you may have observed, computing the variance (S²) of ungrouped data by the above procedure is
laborious and time-consuming. A simpler and easier solution uses the raw data formula.
The raw data formulas for computing the variance of ungrouped data are:
S² = [nΣx² – (Σx)²] / [n(n – 1)] for sample data
and
σ² = [NΣx² – (Σx)²] / N² for population data
We will use n for the number of observations for sample data and N for population data.
To solve the variance of ungrouped data by raw data formula, we will follow the procedures enumerated
below.
1. Arrange the values in terms of their magnitude
2. Find the sum of the values
3. Square each value and write the results under the column x2
4. Get the sum of the squared values in step 3.
5. Substitute the results obtained in step 2 and step 4 in the raw data formula.
Example: Using the same 8 sample production units in the preceding Example 1, with the computed mean (x̄) =
13.5, find the variance by the raw data formula.
x       x²
9       81
10      100
11      121
13      169
13      169
15      225
17      289
20      400
Σx = 108     Σx² = 1,554
n = 8
Substituting the computed values obtained above in the raw data formula, we shall have:
S² = [nΣx² – (Σx)²] / [n(n – 1)]
S² = [8(1,554) – (108)²] / [8(8 – 1)]
S² = (12,432 – 11,664) / 56
S² = 768 / 56
S² = 13.71
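A short Python sketch of the raw data formula follows; the function names are illustrative. It needs only Σx and Σx², so the mean and the individual deviations are never computed.

```python
def sample_variance_raw(values):
    """S^2 = (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))


def population_variance_raw(values):
    """sigma^2 = (N * sum(x^2) - (sum(x))^2) / N^2."""
    N = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (N * sum_x2 - sum_x ** 2) / N ** 2


# The 8 sample production units from the example.
units = [9, 10, 11, 13, 13, 15, 17, 20]

print(sum(units), sum(x * x for x in units))  # 108 1554
print(round(sample_variance_raw(units), 2))   # 13.71
```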
The variance for Grouped Data
The variance for grouped data may be calculated by two methods. The first method involves a rather longer
procedure that uses the deviations of the class midpoints from the mean, as follows.
Method 1 – (Long Method Formula)
S² = Σf(x – x̄)² / (n – 1) for sample data and
σ² = Σf(x – μ)² / N for population data
We shall use x̄ for the mean of sample data and μ for the mean of population data.
We will follow the procedures below to solve the variance of a grouped data by using the above formula (Long
method)
Solution:
The mean (x̄)
x̄ = Σfx / n
x̄ = 4,131 / 87
x̄ = 47.48
S² = Σf(x – x̄)² / (n – 1)
S² = 4,551.69 / 86
S² = 52.93
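The long method can be sketched in Python as shown below. Since the frequency table of the example is not reproduced here, the sketch uses a small hypothetical distribution; only the formula is taken from the text.

```python
def grouped_sample_variance(midpoints, freqs):
    """Long method: S^2 = sum(f * (x - mean)^2) / (n - 1).

    x is the class midpoint, f the class frequency and n = sum(f).
    """
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    return sum_sq / (n - 1)


# Hypothetical distribution (assumption): classes 40-44, 45-49 and 50-54.
midpoints = [42, 47, 52]
freqs = [2, 5, 3]

# mean = 475 / 10 = 47.5 and sum of f * (x - mean)^2 = 122.5
print(round(grouped_sample_variance(midpoints, freqs), 2))  # 13.61
```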
Now, let us simplify the computation of the variance of grouped data by using a short method known as
the coding method formula, as follows.
Method 2: Short Method
S² = { [nΣfd² – (Σfd)²] / [n(n – 1)] } c² for sample data
σ² = { [NΣfd² – (Σfd)²] / N² } c² for population data
Here d represents the coded value of a class interval and c the size (width) of the class interval.
To obtain the variance of grouped data by the short method or coding formula, we shall follow the
procedures listed below.
1. Write the coded values of the class intervals under the d column.
2. Multiply the frequencies by the corresponding coded values and write the products under the fd column.
3. Multiply the squared coded values by the corresponding frequencies and write the products under the fd² column.
4. Add the results in step 2 to obtain Σfd, and add the results in step 3 to obtain Σfd².
5. Substitute the values in the coding formula.
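The coding steps can be sketched in Python as follows. The distribution is hypothetical, and the choice of the middle class as the origin (d = 0) is an assumption of this sketch; any class may serve as the origin. The long method is included only to confirm that both methods give the same variance.

```python
def variance_coding(midpoints, freqs, origin, width):
    """Short method: S^2 = ((n*sum(f*d^2) - (sum(f*d))^2) / (n*(n-1))) * c^2.

    d = (x - origin) / c is the coded value of each class midpoint x.
    """
    n = sum(freqs)
    codes = [(x - origin) / width for x in midpoints]
    sum_fd = sum(f * d for d, f in zip(codes, freqs))
    sum_fd2 = sum(f * d * d for d, f in zip(codes, freqs))
    return (n * sum_fd2 - sum_fd ** 2) / (n * (n - 1)) * width ** 2


def variance_long(midpoints, freqs):
    """Long method: S^2 = sum(f * (x - mean)^2) / (n - 1)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / (n - 1)


# Hypothetical distribution (assumption): classes 40-44, 45-49, 50-54, so c = 5.
midpoints = [42, 47, 52]
freqs = [2, 5, 3]

# With origin 47 the codes are -1, 0, 1, so sum(fd) = 1 and sum(fd^2) = 5.
short = variance_coding(midpoints, freqs, origin=47, width=5)
long_ = variance_long(midpoints, freqs)
print(round(short, 2), round(long_, 2))  # 13.61 13.61
```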
Example: Calculate the variance of the distribution of the ages of a sample of 87 managerial employees
contained in table 3 in the preceding example 1, by the short method (coded formula)
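The Standard Deviation
S = √[ Σ(x – x̄)² / (n – 1) ]
or S = √{ [ (nΣfd² – (Σfd)²) / (n(n – 1)) ] c² }   (note: c² is included under the square root / radical sign)
or, in general, S = √S² for sample data and σ = √σ² for population data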
Hence, the standard deviation of the 8 ungrouped sample data in Example 1, for which the variance (S²) = 13.71,
is calculated in the following manner:
S = √S²
S = √13.71
S = 3.70
Similarly, the standard deviation of the grouped data on the ages of the 87 managerial employees presented in
the preceding table, where the variance (S²) = 52.93, is found as follows:
S = √S²
S = √52.93
S = 7.28
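Both results can be checked with a few lines of Python. The raw data are available only for the ungrouped example, so the grouped case starts from the stated variance of 52.93; `statistics.stdev` is the standard-library function for the sample standard deviation (divisor n – 1).

```python
import math
import statistics

# The 8 sample production units of the ungrouped example.
units = [10, 11, 9, 17, 13, 15, 13, 20]

s_ungrouped = math.sqrt(statistics.variance(units))  # square root of S^2
s_library = statistics.stdev(units)                  # same quantity, computed directly
print(round(s_ungrouped, 2), round(s_library, 2))

# Grouped example: only the stated variance is available here.
s_grouped = math.sqrt(52.93)
print(round(s_grouped, 2))  # 7.28
```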
The variance and the standard deviation are generally accepted measures of dispersion, especially in
discussions and presentations of reports containing basic statistics. The standard deviation is more popularly
used than the variance since its value is expressed in the same unit of measurement as the observations and the
mean. Hence, it presents values that can be used directly for analysis and interpretation. For instance, when
the unit of measurement of the data is kilos, the standard deviation and the mean are in kilos. The variance,
on the other hand, is in kilos squared.