
Md. Ashraful Islam
Lecturer (Statistics)
Dept. of Natural Science, PCIU
Course Name: STATISTICS
Cell: 01620728620
Email: ashraful144cu@gmail.com

Measures of Dispersion
Dispersion

Dispersion is the state of being dispersed or spread out. Statistical dispersion is the extent to
which numerical data are likely to vary about an average value. In other words, dispersion helps
us understand how the data are distributed.

Measures of Dispersion

In statistics, measures of dispersion help to interpret the variability of data, i.e., to know
how homogeneous or heterogeneous the data are.

As the name suggests, a measure of dispersion shows the scatter of the data. It describes how much
the observations vary from one another and gives a clear idea of the distribution of the data.

Suppose you have four datasets of the same size and with the same mean, say m. In all four cases
the sum of the observations is the same, so the measure of central tendency alone does not give a
clear and complete idea of how the four distributions differ.

We get a much better idea of a distribution once we know how the observations are dispersed from
one another, within and between the datasets. The main purpose of a measure of dispersion is to
show how the data are spread, that is, how much they vary from their average value.
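
As a small illustration of the point above, the Python sketch below uses four made-up datasets
(hypothetical values, not taken from these notes) that share the same size and the same mean but
have very different spreads. It assumes the population standard deviation from Python's standard
`statistics` module as the measure of spread.

```python
from statistics import mean, pstdev

# Four hypothetical datasets: same size (5) and same mean (10), different spread.
datasets = {
    "A": [10, 10, 10, 10, 10],
    "B": [8, 9, 10, 11, 12],
    "C": [2, 6, 10, 14, 18],
    "D": [0, 0, 10, 20, 20],
}

for name, values in datasets.items():
    spread = max(values) - min(values)   # range of the dataset
    sd = pstdev(values)                  # population standard deviation
    print(f"{name}: mean={mean(values)}, range={spread}, SD={sd:.2f}")
```

The mean is 10 in every case, so only the range and the standard deviation tell the four datasets
apart.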

Characteristics of Measures of Dispersion

• A measure of dispersion should be rigidly defined
• It must be easy to calculate and understand
• Not affected much by the fluctuations of observations
• Based on all observations

Types of Measures of Dispersion

There are two main types of measures of dispersion in statistics:

• Absolute Measure of Dispersion
• Relative Measure of Dispersion

Absolute Measure of Dispersion

An absolute measure of dispersion is expressed in the same unit as the original data (the
variance, being an average of squared deviations, is in squared units). Absolute measures fall
into two groups:

• Measures that express the scatter of the observations in terms of distances, e.g., the range
and the quartile deviation.
• Measures that express the variation in terms of the average of the deviations of the
observations, e.g., the mean deviation and the standard deviation.

The types of absolute measures of dispersion are:

1. Range
2. Variance
3. Standard Deviation
4. Quartile Deviation
5. Mean Deviation

Relative Measure of Dispersion

Relative measures of dispersion are used to compare the distributions of two or more data sets.
Because they are ratios, they have no units, so they allow a unit-free comparison between series
measured in different units or with widely different averages.

Common relative measures of dispersion include:

1. Coefficient of Range
2. Coefficient of Variation
3. Coefficient of Standard Deviation
4. Coefficient of Quartile Deviation
5. Coefficient of Mean Deviation

Range

The range is the most common and most easily understood measure of dispersion. It is the
difference between the two extreme observations of the data set, i.e., the maximum value minus
the minimum value.

If $X_{\max}$ and $X_{\min}$ are the two extreme observations, then

$$\text{Range} = X_{\max} - X_{\min}$$

Example: for the data 1, 3, 5, 6, 7,

$$\text{Range} = 7 - 1 = 6$$
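
The calculation above can be sketched in a few lines of Python. The function name `data_range` is
only an illustrative choice. The last two lines check how the range behaves when every value is
shifted by a constant (change of origin) or multiplied by a constant (change of scale).

```python
def data_range(values):
    """Return the largest observation minus the smallest observation."""
    return max(values) - min(values)

data = [1, 3, 5, 6, 7]
print(data_range(data))                     # 7 - 1 = 6

# Change of origin: adding a constant to every value leaves the range unchanged.
print(data_range([x + 10 for x in data]))   # 17 - 11 = 6

# Change of scale: multiplying every value by a constant rescales the range.
print(data_range([x * 2 for x in data]))    # 14 - 2 = 12
```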

Merits of Range

• It is the simplest of the measures of dispersion
• Easy to calculate
• Easy to understand
• Independent of change of origin

Demerits of Range

• It is based only on the two extreme observations, so it is strongly affected by fluctuations
in them
• It is not a reliable measure of dispersion
• Dependent on change of scale

Quartile Deviation

The quartiles are the values that divide an ordered list of numbers into quarters. The first
quartile ($Q_1$) is the middle number between the smallest number and the median of the data. The
second quartile ($Q_2$) is the median of the data set. The third quartile ($Q_3$) is the middle
number between the median and the largest number.

The quartile deviation, or semi-interquartile range, is half of the distance between the third
and the first quartile:

$$Q = \frac{Q_3 - Q_1}{2}$$
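
A minimal Python sketch of this definition follows. It assumes the "median of each half"
convention described above, leaving out the middle value when the number of observations is odd.
Other conventions that interpolate between observations can give slightly different quartiles.
The function name `quartile_deviation` and the sample values are illustrative.

```python
from statistics import median

def quartile_deviation(values):
    """Semi-interquartile range, taking Q1 and Q3 as the medians of the two halves."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two observations")
    half = len(ordered) // 2
    q1 = median(ordered[:half])     # median of the lower half
    q3 = median(ordered[-half:])    # median of the upper half
    return (q3 - q1) / 2

lengths = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7]
print(quartile_deviation(lengths))  # Q1 = 4, Q3 = 9, so (9 - 4) / 2 = 2.5
```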

Merits of Quartile Deviation

• It overcomes the main drawback of the range, since it does not depend on the two extreme
observations
• It uses the middle half of the data
• Independent of change of origin
• The best measure of dispersion for open-end classification

Demerits of Quartile Deviation

• It ignores 50% of the data
• Dependent on change of scale
• Not a reliable measure of dispersion

Mean Deviation

The mean deviation is the arithmetic mean of the absolute deviations of the observations from a
measure of central tendency.

If $x_1, x_2, \ldots, x_n$ are the observations, then the mean deviation of $x$ about the average
$A$ (mean, median, or mode) is as follows.

For ungrouped data,

$$\text{Mean deviation} = \frac{\sum |x_i - A|}{n}$$

For a grouped frequency distribution,

$$\text{Mean deviation} = \frac{\sum f_i |x_i - A|}{n}$$

Here, $x_i$ and $f_i$ are respectively the mid-value and the frequency of the $i$th class
interval, and $n = \sum f_i$ is the total frequency.
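
The two formulas can be sketched in Python as below. The function names, the optional `center`
argument (the average A, defaulting to the arithmetic mean) and the sample values are
illustrative assumptions rather than part of the notes.

```python
from statistics import mean

def mean_deviation(values, center=None):
    """Mean of |x_i - A| for ungrouped data; A defaults to the arithmetic mean."""
    a = mean(values) if center is None else center
    return sum(abs(x - a) for x in values) / len(values)

def grouped_mean_deviation(midpoints, freqs, center=None):
    """Mean of f_i * |x_i - A| for grouped data, with n = sum of the frequencies."""
    n = sum(freqs)
    if center is None:
        center = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * abs(x - center) for x, f in zip(midpoints, freqs)) / n

# Ungrouped: deviations are taken from the mean (4.4) of these five values.
print(round(mean_deviation([1, 3, 5, 6, 7]), 2))

# Grouped: hypothetical class mid-values 5, 15, 25 with frequencies 2, 5, 3.
print(round(grouped_mean_deviation([5, 15, 25], [2, 5, 3]), 2))
```

Passing `center=` a median or mode instead gives the mean deviation about that average.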

Merits of Mean Deviation

• Based on all observations
• It takes its minimum value when the deviations are taken from the median
• Independent of change of origin

Demerits of Mean Deviation

• Not easily understandable
• Its calculation is not easy and is time-consuming
• Dependent on change of scale
• Ignoring the signs of the deviations is artificial and makes it unsuitable for further
mathematical treatment

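Variance

The variance is the arithmetic mean of the squared deviations of the observations from their
arithmetic mean $A$: subtract the mean from each value, square each difference, add the squares,
and divide the sum by the total number of values in the data set.

For ungrouped data,

$$\text{Variance } (\sigma^2) = \frac{\sum (x_i - A)^2}{n}$$

For grouped data,

$$\text{Variance } (\sigma^2) = \frac{\sum f_i (x_i - A)^2}{n}$$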

Standard Deviation

The standard deviation is the positive square root of the arithmetic mean of the squares of the
deviations of the given values from their arithmetic mean, i.e., the square root of the variance.
It is denoted by the Greek letter sigma, $\sigma$, and abbreviated S.D. It is also referred to as
the root mean square deviation.

For ungrouped data,

$$\sigma = \sqrt{\frac{\sum (x_i - A)^2}{n}}$$

For a grouped frequency distribution,

$$\sigma = \sqrt{\frac{\sum f_i (x_i - A)^2}{n}}$$

where $A$ is the arithmetic mean. The square of the standard deviation is the variance, which is
also a measure of dispersion.
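
As a rough sketch, the variance and standard deviation formulas for ungrouped and grouped data
can be written in Python as follows. The function names and the small datasets are hypothetical,
chosen so that the results come out as whole numbers.

```python
import math

def variance(values):
    """Population variance: mean of squared deviations from the arithmetic mean."""
    n = len(values)
    a = sum(values) / n
    return sum((x - a) ** 2 for x in values) / n

def grouped_variance(midpoints, freqs):
    """Variance of a grouped frequency distribution, with n = sum of the frequencies."""
    n = sum(freqs)
    a = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * (x - a) ** 2 for x, f in zip(midpoints, freqs)) / n

data = [2, 4, 4, 4, 5, 5, 7, 9]                     # hypothetical observations, mean 5
print(variance(data), math.sqrt(variance(data)))    # 4.0 2.0

mids, freqs = [5, 15, 25], [2, 5, 3]                # hypothetical classes, mean 16
var_g = grouped_variance(mids, freqs)
print(var_g, math.sqrt(var_g))                      # 49.0 7.0
```
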

Population standard deviation

When you have collected data from every member of the population that you’re interested in,
you can get an exact value for the population standard deviation.

The population standard deviation formula looks like this:

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$

where

• $\sigma$ = population standard deviation
• $\sum$ = sum of…
• $X$ = each value
• $\mu$ = population mean
• $N$ = number of values in the population

Sample standard deviation

When you collect data from a sample, the sample standard deviation is used to make estimates or
inferences about the population standard deviation.

The sample standard deviation formula looks like this:

$$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n - 1}}$$

where

• $s$ = sample standard deviation
• $\sum$ = sum of…
• $X$ = each value
• $\bar{x}$ = sample mean
• $n$ = number of values in the sample

With samples, we use n – 1 in the formula because using n would give us a biased estimate that
consistently underestimates variability: the sample standard deviation would tend to be lower
than the real standard deviation of the population.

Dividing by n – 1 instead of n makes the computed standard deviation slightly larger, giving you
a more conservative estimate of variability.

With n – 1, the sample variance is an unbiased estimate of the population variance. The sample
standard deviation is still not exactly unbiased, but it is less biased than the version that
divides by n.
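
The difference between the two divisors can be seen in a short Python sketch. The helper
`std_dev` and the data are hypothetical. `pstdev` and `stdev` are the population and sample
standard deviation functions from Python's standard `statistics` module, shown for comparison.

```python
import math
from statistics import pstdev, stdev

def std_dev(values, sample=False):
    """Standard deviation dividing by n (population) or n - 1 (sample)."""
    n = len(values)
    m = sum(values) / n
    squares = sum((x - m) ** 2 for x in values)
    return math.sqrt(squares / (n - 1 if sample else n))

data = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical observations
print(std_dev(data))                   # divides by n = 8
print(std_dev(data, sample=True))      # divides by n - 1 = 7, so slightly larger
print(pstdev(data), stdev(data))       # standard-library equivalents
```
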

Merits of Standard Deviation

• Squaring the deviations overcomes the drawback of ignoring signs in the mean deviation
• Suitable for further mathematical treatment
• Least affected by fluctuations of sampling
• The standard deviation is zero if all the observations are equal
• Independent of change of origin

Demerits of Standard Deviation

• Not easy to calculate
• Difficult to understand
• Dependent on change of scale

Why is standard deviation a useful measure of Dispersion?

Although there are simpler ways to measure variability, the standard deviation squares each
deviation, so observations far from the mean carry more weight than observations close to it. As
a result, unevenly spread samples get a higher standard deviation than evenly spread samples: a
higher standard deviation tells you that the distribution is not only more spread out, but also
more unevenly spread out.

This means it gives you a better idea of your data’s variability than simpler measures, such as
the mean absolute deviation (MAD), i.e., the mean deviation taken about the mean.

The MAD is similar to the standard deviation but easier to calculate. First, you express each
deviation from the mean as an absolute value, converting it into a positive number. Then, you
calculate the mean of these absolute deviations.

Unlike the standard deviation, the MAD does not require you to calculate squares or square roots.
However, because it weights all deviations equally, it is less sensitive to how unevenly the data
are spread.
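
To make this comparison concrete, the Python sketch below uses two hypothetical datasets with the
same mean and the same MAD, one evenly and one unevenly spread. The helper name `mean_abs_dev` is
an illustrative choice.

```python
from statistics import mean, pstdev

def mean_abs_dev(values):
    """Mean absolute deviation about the arithmetic mean."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)

even = [8, 8, 12, 12]      # every value is 2 away from the mean of 10
uneven = [6, 10, 10, 14]   # two values on the mean, two values 4 away from it

for name, values in [("even", even), ("uneven", uneven)]:
    print(name, mean_abs_dev(values), round(pstdev(values), 2))
# even 2.0 2.0
# uneven 2.0 2.83
```

Both datasets have a MAD of 2, but the standard deviation is larger for the unevenly spread one.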

Relative Measures of Dispersion

We need relative measures whenever we want to compare the variability of two series that differ
widely in their averages or that are measured in different units. In such cases we calculate the
coefficients of dispersion along with the measures of dispersion. The coefficients of dispersion
(C.D.) based on the different measures of dispersion are:

• Coefficient of range $= \dfrac{x_{\max} - x_{\min}}{x_{\max} + x_{\min}} \times 100$
• Coefficient of quartile deviation $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1} \times 100$
• Coefficient of mean deviation $= \dfrac{\text{M.D.}}{\text{Mean}} \times 100$
• Coefficient of standard deviation $= \dfrac{\text{S.D.}}{\text{Mean}} \times 100$

Coefficient of Variation

The coefficient of variation (C.V.) is the coefficient of dispersion based on the standard
deviation, expressed as a percentage of the mean:

$$\text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100$$
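
A short Python sketch shows why a unit-free measure is useful. The two series below are
hypothetical (heights in cm and weights in kg) and the function names are illustrative. Both
series have the same absolute spread but different averages.

```python
from statistics import mean, pstdev

def coefficient_of_range(values):
    """(max - min) / (max + min), expressed as a percentage."""
    return (max(values) - min(values)) / (max(values) + min(values)) * 100

def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    return pstdev(values) / mean(values) * 100

heights_cm = [150, 160, 170, 180, 190]   # hypothetical heights
weights_kg = [50, 60, 70, 80, 90]        # hypothetical weights

for name, values in [("heights", heights_cm), ("weights", weights_kg)]:
    print(name,
          round(pstdev(values), 2),                    # same S.D. for both series
          round(coefficient_of_range(values), 2),
          round(coefficient_of_variation(values), 2))
```

The standard deviations are equal, yet the weights vary more relative to their mean, which is
what the two coefficients pick up.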

Solved Example on Measures of Dispersion

Question–1: The lengths of 11 similar crystals are measured (in mm) in a chemistry experiment.
Calculate the mean deviation, variance, standard deviation and coefficient of variation for
the observations taken.

Crystal no.    Length (mm)
1              9
2              2
3              5
4              4
5              12
6              7
7              8
8              11
9              9
10             3
11             7

Solution – First find the mean of the data set:

$$A = \frac{\sum x_i}{n} = \frac{77}{11} = 7$$

We can then construct the table as given below –

xi       xi – A          |xi – A|    (xi – A)²
9        9 – 7 = 2       2           4
2        2 – 7 = –5      5           25
5        –2              2           4
4        –3              3           9
12       5               5           25
7        0               0           0
8        1               1           1
11       4               4           16
9        2               2           4
3        –4              4           16
7        0               0           0
Total    ∑xi = 77                    ∑|xi – A| = 28    ∑(xi – A)² = 104

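Mean deviation,

$$\text{M.D.} = \frac{\sum |x_i - A|}{n} = \frac{28}{11} = 2.55$$

Variance,

$$\sigma^2 = \frac{\sum (x_i - A)^2}{n} = \frac{104}{11} = 9.45$$

Standard deviation,

$$\text{S.D.} = \sigma = \sqrt{\frac{\sum (x_i - A)^2}{n}} = \sqrt{\frac{104}{11}} = 3.07$$

We can calculate the coefficient of variation as –

$$\text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100 = \frac{3.07}{7} \times 100 = 43.86\%$$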
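
The hand calculation for the crystal data can be checked with a few lines of Python. This is only
a verification sketch; the variable names are illustrative. The C.V. printed here is about 43.93%
rather than 43.86%, because the code divides the unrounded S.D. by the mean instead of the
rounded value 3.07.

```python
import math

lengths = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7]   # crystal lengths in mm

n = len(lengths)
a = sum(lengths) / n                                  # arithmetic mean
md = sum(abs(x - a) for x in lengths) / n             # mean deviation about the mean
var = sum((x - a) ** 2 for x in lengths) / n          # variance (divides by n)
sd = math.sqrt(var)                                   # standard deviation
cv = sd / a * 100                                     # coefficient of variation (%)

print(a, round(md, 2), round(var, 2), round(sd, 2), round(cv, 2))
# 7.0 2.55 9.45 3.07 43.93
```
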

Example-2

Find the values of M.D., variance, S.D. and C.V. from the table –

Class        1-10    11-20    21-30    31-40    41-50    51-60
Frequency    7       8        6        5        2        7

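Solution – Taking A as the mean of the distribution, A = ∑fixi / n = 972.5/35 = 27.79, we can
construct the table as given below –

Class    Mid-value (xi)    Frequency (fi)    fixi     |xi – A|                   fi|xi – A|              fi(xi – A)²
1-10     5.5               7                 38.5     |5.5 – 27.79| = 22.29      22.29 × 7 = 156.03      3477.91
11-20    15.5              8                 124      |15.5 – 27.79| = 12.29     98.32                   1208.35
21-30    25.5              6                 153      2.29                       13.74                   31.46
31-40    35.5              5                 177.5    7.71                       38.55                   297.22
41-50    45.5              2                 91       17.71                      35.42                   627.29
51-60    55.5              7                 388.5    27.71                      193.97                  5374.91
Total                      n = 35            972.5                               536.03                  11017.14

$$\text{Mean, } A = \frac{\sum f_i x_i}{n} = \frac{972.5}{35} = 27.79$$

$$\text{Mean deviation} = \frac{\sum f_i |x_i - A|}{n} = \frac{536.03}{35} = 15.32$$

$$\text{Variance } (\sigma^2) = \frac{\sum f_i (x_i - A)^2}{n} = \frac{11017.14}{35} = 314.78$$

$$\text{S.D.} = \sigma = \sqrt{\frac{\sum f_i (x_i - A)^2}{n}} = \sqrt{314.78} = 17.74$$

$$\text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100 = \frac{17.74}{27.79} \times 100 = 63.84\%$$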
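
Grouped-data calculations like this one are easy to get wrong by hand, so a short Python sketch
for recomputing them directly from the class mid-values and frequencies may help. The variable
names are illustrative. The code uses the unrounded mean, so the last digit can differ slightly
from a hand calculation that rounds intermediate values.

```python
import math

midpoints = [5.5, 15.5, 25.5, 35.5, 45.5, 55.5]   # class mid-values
freqs = [7, 8, 6, 5, 2, 7]                        # class frequencies

n = sum(freqs)
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
mean_dev = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / n
variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n
sd = math.sqrt(variance)
cv = sd / mean * 100

for label, value in [("mean", mean), ("M.D.", mean_dev), ("variance", variance),
                     ("S.D.", sd), ("C.V. (%)", cv)]:
    print(f"{label}: {value:.2f}")
```
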

Problem-3: The table below shows the results for two companies, A and B.

                                      Company A    Company B
No. of employees                      900          1000
Average daily wage (Rs.)              250          220
Variance of distribution of wages     100          144

1. Which company has the larger wage bill?
2. Calculate the coefficients of variation for both companies.
3. Calculate the average daily wage and the variance of the distribution of wages of all the
employees in firms A and B taken together.

Solution:

For Company A

No. of employees = $n_1$ = 900, and average daily wage = $\bar{y}_1$ = Rs. 250

We know that

$$\text{Average daily wage} = \frac{\text{Total wages}}{\text{Total number of employees}}$$

or, Total wages = Total number of employees × average daily wage

= 900 × 250

= Rs. 225000 … (i)

For Company B

No. of employees = $n_2$ = 1000, and average daily wage = $\bar{y}_2$ = Rs. 220

So, Total wages = Total number of employees × average daily wage

= 1000 × 220

= Rs. 220000 … (ii)

Comparing (i) and (ii), we see that Company A has the larger wage bill.

For Company A

Variance of distribution of wages = $\sigma_1^2 = 100$

S.D., $\sigma_1 = \sqrt{100} = 10$

$$\text{C.V. of distribution of wages} = \frac{\text{standard deviation of distribution of wages}}{\text{average daily wage}} \times 100$$

or, $\text{C.V.}_A = \frac{10}{250} \times 100 = 4$ … (iii)

For Company B

Variance of distribution of wages = $\sigma_2^2 = 144$

S.D., $\sigma_2 = \sqrt{144} = 12$

$\text{C.V.}_B = \frac{12}{220} \times 100 = 5.45$ … (iv)

Comparing (iii) and (iv), we see that Company B has greater variability.

For Companies A and B taken together

The average daily wage for both companies taken together is

$$\bar{y} = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_2}{n_1 + n_2} = \frac{\text{Total wages of A} + \text{Total wages of B}}{n_1 + n_2} = \frac{225000 + 220000}{1900} = \text{Rs. } 234.21$$

The combined variance is

$$\sigma^2 = \frac{n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)}{n_1 + n_2}$$

Here, $d_1 = \bar{y}_1 - \bar{y} = 250 - 234.21 = 15.79$

$d_2 = \bar{y}_2 - \bar{y} = 220 - 234.21 = -14.21$

Hence,

$$\sigma^2 = \frac{900 \times (100 + 15.79^2) + 1000 \times (144 + (-14.21)^2)}{900 + 1000} = \frac{314391.69 + 345924.10}{1900} = 347.53$$
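
The combined mean and combined variance formulas used above can be sketched as one small Python
function. The function name `combined_mean_variance` and its argument names are illustrative
assumptions. The inputs are the figures for Companies A and B.

```python
def combined_mean_variance(n1, mean1, var1, n2, mean2, var2):
    """Combined mean and variance of two groups from their sizes, means and variances."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    d1, d2 = mean1 - mean, mean2 - mean          # deviations of group means from combined mean
    var = (n1 * (var1 + d1 ** 2) + n2 * (var2 + d2 ** 2)) / n
    return mean, var

mean, var = combined_mean_variance(900, 250, 100, 1000, 220, 144)
print(round(mean, 2), round(var, 2))   # 234.21 347.53
```

The combined variance is larger than either company's own variance because it also includes the
spread between the two company means.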

