
Measures of Variation

Objectives

Gain skills in the computation of the different quantitative measures of dispersion;
Compute the most applicable measure of variability in different types of distributions;
Describe and compare groups using the measures of dispersion; and
Interpret results obtained from each measure.

Let us take 5 sets of observations, each with a mean of 47:

Set 1: 45 45 47 48 50
Set 2: 45 46 46 48 50
Set 3: 44 45 46 49 51
Set 4: 41 43 48 48 55
Set 5: 44 45 48 49 49

Questions remain unanswered even after getting the mean:
How variable are the data sets?
How do the values in each data set differ from each other?
How are the values in each data set clustered or dispersed from each other?
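
A quick check illustrates why the mean alone leaves these questions open. The sketch below uses only Python's standard library; the name `data_sets` is an illustrative choice, not something from the slides. It computes the mean of each set and, as a first rough look at spread, the difference between the highest and lowest value.

```python
from statistics import mean

# The five sets of observations listed above.
data_sets = {
    "Set 1": [45, 45, 47, 48, 50],
    "Set 2": [45, 46, 46, 48, 50],
    "Set 3": [44, 45, 46, 49, 51],
    "Set 4": [41, 43, 48, 48, 55],
    "Set 5": [44, 45, 48, 49, 49],
}

# Every set has the same mean, so the mean cannot tell the sets apart.
means = {name: mean(values) for name, values in data_sets.items()}

# Highest value minus lowest value: a first, rough look at the spread.
ranges = {name: max(values) - min(values) for name, values in data_sets.items()}

for name in data_sets:
    print(f"{name}: mean = {means[name]}, highest - lowest = {ranges[name]}")
```

All five means come out as 47, while the spread between the extremes runs from 5 (Sets 1, 2 and 5) to 14 (Set 4).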

Measures of dispersion
group of analytical tools that describes the spread or variability of a data set.
indicate the extent to which individual items in a series are scattered about an average.

Importance of the measures of dispersion
supplements an average or a measure of central tendency.
compares one group of data with another.
indication on how representative the average is.


A measure of dispersion can be expressed in several ways:
Range, Quartile Deviation: based on the position of certain items in a distribution
Mean Absolute Deviation, Variance/Standard Deviation: measure the dispersion around an average
Coefficient of variation: expressed in a relative value

Measures of Dispersion
▪ indicate the extent to which individual items in a series are scattered about an average.

1. Measures of Absolute Dispersion
▪ used to compare two or more data sets with the same means and the same units of measurement.

2. Measures of Relative Dispersion
▪ used to compare two or more data sets with different means and different units of measurement.

1. Range (R)
difference between the highest and lowest values in a given set of data.

Formulas of the range:
For Ungrouped Data: $R = \text{highest value} - \text{lowest value}$
For Grouped Data:

Characteristic:
most crude measure of dispersion.

Uses:
when the quickest measure of dispersion is needed.
if information concerning extreme values is desired.


Range
Advantages:
simplest measure of dispersion.
includes the limits within which all of the items occurred.

Disadvantages:
does not consider every observation in the data set.
fails to measure the variability of the majority of the values.
very sensitive to extreme values.
cannot be computed for open-ended distributions.
not amenable to algebraic manipulations.
is unreliable when computed from a frequency distribution table with gaps or zero frequencies.

Example: Given below are the weights in pounds of five babies below 1 yr. old from Health Center 1. Get the range.

10 pounds (lightest), 12 pounds, 14 pounds, 16 pounds, 20 pounds (heaviest)

Solution: The heaviest baby weighs 20 pounds and the lightest baby weighs 10 pounds. Thus, the weight range of babies is
heaviest – lightest = 20 – 10 = 10 pounds
We can say that the weights of babies range from 10 to 20 pounds.

Example: Given below are the weights of 5 babies from Health Center 2. Compare the weight range of the babies from Health Center 1 given in the previous example and Health Center 2.

Health Center 2: 12 pounds, 12 pounds, 14 pounds, 12 pounds, 16 pounds

Solution: Weight range of babies in Health Center 1:
heaviest – lightest = 20 – 10 = 10 pounds
Weight range of babies in Health Center 2:
heaviest – lightest = 16 – 12 = 4 pounds
The weights in Health Center 2 are less spread out than those in Health Center 1.
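
The two range computations can be sketched in a few lines of Python. The helper name `value_range` is a hypothetical choice made for this illustration.

```python
def value_range(values):
    """Range of ungrouped data: highest value minus lowest value."""
    return max(values) - min(values)


# Weights in pounds from the two examples.
health_center_1 = [10, 12, 14, 16, 20]
health_center_2 = [12, 12, 14, 12, 16]

range_1 = value_range(health_center_1)
range_2 = value_range(health_center_2)

print(f"Health Center 1 range: {range_1} pounds")
print(f"Health Center 2 range: {range_2} pounds")
```

Only the two extreme values enter the computation, which is why the range is so sensitive to a single unusually light or heavy baby.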


Example for ungrouped data
Data Set 1: Total employment from small establishments
45, 45, 47, 48, 50
R = 50 – 45 = 5
❖ This means that the rough estimate of the variability of the total employment of small establishments in data set 1 is 5.

Quartile Deviation (QD)
is half of the interquartile range.
also called the semi-interquartile range.

Formula of the quartile deviation:
$QD = \dfrac{Q_3 - Q_1}{2}$
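
A minimal sketch of the quartile deviation for Data Set 1 follows. Quartile conventions differ between textbooks and software; the `"inclusive"` method of `statistics.quantiles` is an assumption made here, not something the slides specify, and other conventions can give a different result on small data sets.

```python
from statistics import quantiles


def quartile_deviation(values):
    """Half of the interquartile range, (Q3 - Q1) / 2."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / 2


data_set_1 = [45, 45, 47, 48, 50]
qd = quartile_deviation(data_set_1)
print(f"QD = {qd}")
```

Under this convention Q1 = 45 and Q3 = 48, so the quartile deviation is 1.5.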

Quartile Deviation
Characteristics:
expressed in the same unit as the median.
more refined measure of dispersion than the range.

Uses:
when the distribution is open-ended and a measure of dispersion is needed.
when the median is used as a measure of central tendency.
when a measure of dispersion among the middle 50% of observations is needed and the extreme values are of less interest.
when a measure of dispersion that is not affected by extreme values is needed.

Advantages:
not affected by values which are smaller than Q1 or larger than Q3.
easy to calculate and to understand.

Disadvantages:
does not consider every observation in the data set.
fails to consider dispersion that affects the ends of the distribution.
unreliable if there are gaps in the data around the quartiles.
does not lead to useful generalizations beyond the single-variable case.


Example for ungrouped data
Data Set 1: Total employment from small establishments
45, 45, 47, 48, 50
❖ This means that the deviation of the middle 50% of the total employment values from the median is equal to 1.5.

Mean Absolute Deviation (MAD)
the sum of the deviations of the items from the arithmetic mean, regardless of sign, divided by the total number of items.

Mean Absolute Deviation
Formulas of the MAD:
For Ungrouped Data: $MAD = \dfrac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
For Grouped Data: $MAD = \dfrac{\sum_{i=1}^{k} f_i\,|X_i - \bar{x}|}{n}$
where:
$X_i$ is the classmark of the ith class,
$f_i$ is the frequency of the ith class,
$k$ is the number of classes, and
$n$ is the total number of items.

Characteristics:
If the distribution is normal, 57.5% of the items are included within the range of the mean ± MAD.
Sometimes MAD is computed in relation to the median.

Uses:
When a measure of dispersion that takes into account every observation in the data set is needed.
When a measure of dispersion that is less affected by extreme values than the standard deviation is needed.
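
For grouped data, each absolute deviation of a classmark from the mean is weighted by its class frequency. The sketch below illustrates this with a hypothetical frequency table; the classmarks and frequencies are invented for illustration and do not come from the slides.

```python
# Hypothetical frequency table (illustrative values only).
class_marks = [5, 15, 25, 35]   # classmark of each class
frequencies = [2, 3, 4, 1]      # frequency of each class

# Total number of items is the sum of the class frequencies.
n = sum(frequencies)

# Grouped mean: frequency-weighted average of the classmarks.
grouped_mean = sum(f * x for x, f in zip(class_marks, frequencies)) / n

# Grouped MAD: frequency-weighted average absolute deviation from the mean.
grouped_mad = sum(
    f * abs(x - grouped_mean) for x, f in zip(class_marks, frequencies)
) / n

print(f"n = {n}, mean = {grouped_mean}, MAD = {grouped_mad}")
```

With these illustrative numbers the grouped mean is 19 and the grouped MAD is 8.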


Mean Absolute Deviation
Advantages:
gives equal weight to the deviation of every value from the mean.
more refined measure of dispersion than the range and quartile deviation.
more sensitive measure of dispersion than the range and quartile deviation.
less affected by extreme values than the standard deviation.

Disadvantages:
difficult to handle algebraically.
does not lead to useful generalizations beyond the single-variable case.
rarely used.

Steps in computing the MAD:
1. Calculate the mean.
2. Subtract the mean from each observation.
3. Change all the negative values to positive ones.
4. Add these absolute values.
5. Divide by the number of observations.
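
The five steps can be followed one at a time in code. This is a minimal sketch using Data Set 1 (45, 45, 47, 48, 50); the variable names are illustrative.

```python
data_set_1 = [45, 45, 47, 48, 50]

# Step 1: calculate the mean.
mean_value = sum(data_set_1) / len(data_set_1)

# Step 2: subtract the mean from each observation.
deviations = [x - mean_value for x in data_set_1]

# Step 3: change all the negative values to positive ones.
absolute_deviations = [abs(d) for d in deviations]

# Step 4: add these absolute values.
total_absolute_deviation = sum(absolute_deviations)

# Step 5: divide by the number of observations.
mad = total_absolute_deviation / len(data_set_1)

print(f"mean = {mean_value}, MAD = {mad}")
```

The absolute deviations are 2, 2, 0, 1 and 3, which sum to 8, so the MAD is 8 / 5 = 1.6.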

Example for ungrouped data
Data Set 1: Total employment from small establishments

| Value | Absolute deviation from the mean (47) |
|---|---|
| 45 | 2 |
| 45 | 2 |
| 47 | 0 |
| 48 | 1 |
| 50 | 3 |

MAD = (2 + 2 + 0 + 1 + 3) / 5 = 1.6
❖ This means that the variability of the total employment of small establishments is only equal to 1.6.

Variance and Standard Deviation
are the measures of dispersion preferred in most circumstances and by far the most important measures of variation.

Variance
is the average of the squared deviations of each observation in the set from the mean of the dataset. (Shows variation about the mean)
Population variance is denoted by σ² (sigma-squared) while the sample variance is denoted by s² (s-squared).


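Variance and Standard Deviation

Variance
Population
Ungrouped Data: $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Grouped Data: $\sigma^2 = \dfrac{\sum_{i=1}^{k} f_i\,(X_i - \mu)^2}{N}$

Sample
Ungrouped Data: $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Grouped Data: $s^2 = \dfrac{\sum_{i=1}^{k} f_i\,(X_i - \bar{x})^2}{n - 1}$

Here $X_i$ is the classmark and $f_i$ the frequency of the ith class, and $k$ is the number of classes.

Standard Deviation
square root of the average squared deviations.
also known as the root-mean-square of the deviations from the mean.
Standard deviation of the population is represented by the Greek letter σ (sigma) while the sample standard deviation is denoted by s (small s).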

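Standard Deviation
Population
Ungrouped Data: $\sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
Grouped Data: $\sigma = \sqrt{\dfrac{\sum_{i=1}^{k} f_i\,(X_i - \mu)^2}{N}}$

Sample variance/Standard Deviation
Sample variance
Ungrouped Data: $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Grouped Data: $s^2 = \dfrac{\sum_{i=1}^{k} f_i\,(X_i - \bar{x})^2}{n - 1}$

Sample Standard Deviation
Ungrouped Data: $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
Grouped Data: $s = \sqrt{\dfrac{\sum_{i=1}^{k} f_i\,(X_i - \bar{x})^2}{n - 1}}$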

Variance/Standard Deviation
Characteristic:
if all values of a data set are the same, the standard deviation is zero.
small standard deviation means a high degree of uniformity and homogeneity of the observed values.
if the distribution has a few very extreme cases, the standard deviation can give misleading results.
the sample variance can only be computed where n is at least 2.
Variance is never negative; it is zero only when all values are the same.
Variance is not expressed in the same units as the observations.
Standard deviation can be seriously affected if the mean is a poor measure of location.

Variance/Standard Deviation
Uses:
when a dependable measure of dispersion is needed.
if further statistical analysis is needed.
when interpretation related to the normal distribution is required.
when the mean is used as a measure of central tendency.
when further mathematical computations are needed.
most widely used measure of dispersion and the easiest to handle algebraically.

Advantages:
takes into account every value in the data set.
most reliable measure of dispersion.
mathematically logical.
amenable to further mathematical manipulations.
can be used for in-depth analysis.
extremely useful in estimating the ‘representativeness’ of the mean.

Disadvantages:
harder to compute and more difficult to understand.
generally affected by extreme values that may be due to skewness of data.


Variance/Standard Deviation
According to the Empirical Rule, if the distribution is normal,
about 68% of the distribution lies within 1 standard deviation of the mean,
about 95% of the distribution lies within 2 standard deviations of the mean, and
about 99% of the distribution (more precisely, 99.7%) lies within 3 standard deviations of the mean.

Standard Deviation
Remarks:
1. If there is a large amount of variation in the data set, then on the average, the data values will be far from the mean. Hence, the standard deviation will be large.
2. If there is only a small amount of variation in the data set, then on the average, the data values will be close to the mean. Hence, the standard deviation will be small.
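
The Empirical Rule percentages are rounded values of the normal distribution's coverage within 1, 2 and 3 standard deviations of the mean. As a sketch, the coverage can be computed with the error function from Python's `math` module; the helper name `normal_coverage` is hypothetical.

```python
from math import erf, sqrt


def normal_coverage(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_coverage(k):.2%}")
```

The computed shares are roughly 68.27%, 95.45% and 99.73%.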
Variance/Standard Deviation
Steps in computing the Variance (steps 1-5) and Standard deviation (steps 1-6):
1. Calculate the mean.
2. Subtract the mean from each observation.
3. Square each result.
4. Add these squares.
5. Divide this sum by the number of observations (for a sample, divide by n – 1).
6. Take the positive square root.

Example for ungrouped data
Data Set 1: Total employment from small establishments

| x | x² | x – mean | (x – mean)² |
|---|---|---|---|
| 45 | 2,025 | -2 | 4 |
| 45 | 2,025 | -2 | 4 |
| 47 | 2,209 | 0 | 0 |
| 48 | 2,304 | 1 | 1 |
| 50 | 2,500 | 3 | 9 |
| Total: 235 | 11,063 | 0 | 18 |
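
The steps can be sketched in code for Data Set 1 (45, 45, 47, 48, 50). Both divisors are shown: the number of observations n, and n – 1 for data treated as a sample. The variable names are illustrative.

```python
from math import sqrt

data_set_1 = [45, 45, 47, 48, 50]
n = len(data_set_1)

# Step 1: calculate the mean.
mean_value = sum(data_set_1) / n

# Steps 2-3: subtract the mean from each observation and square each result.
squared_deviations = [(x - mean_value) ** 2 for x in data_set_1]

# Step 4: add these squares.
sum_of_squares = sum(squared_deviations)

# Step 5: divide by n (population) or by n - 1 (sample).
population_variance = sum_of_squares / n
sample_variance = sum_of_squares / (n - 1)

# Step 6: take the positive square root.
population_sd = sqrt(population_variance)
sample_sd = sqrt(sample_variance)

print(f"sum of squares = {sum_of_squares}")
print(f"population: variance = {population_variance}, SD = {population_sd:.2f}")
print(f"sample:     variance = {sample_variance}, SD = {sample_sd:.2f}")
```

The squared deviations sum to 18. Dividing by n – 1 = 4 gives a variance of 4.5 and a standard deviation of about 2.12, the value quoted in the interpretation of this example; dividing by n = 5 gives 3.6 and about 1.90.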


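Example for ungrouped data
Data Set 1: Total employment from small establishments
x: 45, 45, 47, 48, 50
x²: 2,025, 2,025, 2,209, 2,304, 2,500

The squared deviations from the mean of 47 sum to 18, so the sample standard deviation is
$s = \sqrt{\dfrac{18}{5 - 1}} = \sqrt{4.5} \approx 2.12$

Interpretation:
❖ The variability of total employment from 5 small establishments, measured by the sample standard deviation, is 2.12.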

| Characteristics | Range | Quartile Deviation | M.A.D. | Standard Deviation |
|---|---|---|---|---|
| Computation based on | Lowest and highest values | Q1 and Q3 | Every value | Every value |
| Affected by extreme values | Greatest | Not by values smaller than Q1 or larger than Q3 | Affected by every value | Affected by every value |
| Degree of precision as a measure of dispersion | Rough estimate | Better than the range | Good, but it only measures absolute deviations from the mean or median | Excellent, measures squared deviations from the mean |
| Mathematical advantages | Easy to compute | Can be used to measure asymmetrical distribution | Easier to compute than standard deviation | Hard to compute, but suitable for further mathematical computations |

Coefficient of Variation (CV)
relative measure of dispersion.
ratio of the standard deviation to the mean.

Formula of the CV:
$CV = \dfrac{s}{\bar{x}} \times 100\%$

Characteristics:
an abstract number expressed in percent.
demonstrates the relationship between standard deviation and mean, by expressing the standard deviation as a percentage of the mean.

Uses:
compares distributions where units are different.
when a measure of relative dispersion is needed.
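
As a sketch of the ratio described above, the CV of Data Set 1 (45, 45, 47, 48, 50) can be computed with the standard library. Using the sample standard deviation (`statistics.stdev`, which divides by n – 1) is an assumption made for this illustration, and the helper name `coefficient_of_variation` is hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return stdev(values) / mean(values) * 100


data_set_1 = [45, 45, 47, 48, 50]
cv = coefficient_of_variation(data_set_1)
print(f"CV = {cv:.2f}%")
```

Here the sample standard deviation of about 2.12 is roughly 4.51% of the mean of 47. The function divides by the mean, so it is not meaningful when the mean is zero or very close to it.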

Coefficient of Variation (CV)
Advantages:
independent of any unit of measurement.
easy to interpret.

Disadvantage:
not useful when the mean is close to 0.

Examples
1. To get the coefficient of variation (CV) using the distribution of the number of vacancies from 43 selected enterprises, we have the following:
Given:

❖ The variability of the number of vacancies from 43 selected enterprises in relation to its mean is 40.18%.


Examples
2. Suppose that there are two sets of data, one for the weights of the employees and the other for their income. These two data sets have equal standard deviations. How do we compare these two data sets?
Given:
weights of the employees:
income of the employees:

Computing the CV for each of the data sets, we have:
For the weights of the employees:
For the income of the employees:

The CV of the weights of the employees is greater than the CV of the income of the employees. This means that the weights of the employees are more variable than their income even though their standard deviations are equal.
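
The effect of equal standard deviations but different means can be illustrated with made-up numbers. The means and standard deviations below are hypothetical values chosen only for illustration; they are not the figures used in the example.

```python
def cv_percent(standard_deviation, mean_value):
    """Coefficient of variation: standard deviation as a percentage of the mean."""
    return standard_deviation / mean_value * 100


# Hypothetical summary statistics (not taken from the slides).
weight_mean, weight_sd = 60.0, 10.0       # in units of weight
income_mean, income_sd = 20000.0, 10.0    # in units of currency

cv_weights = cv_percent(weight_sd, weight_mean)
cv_income = cv_percent(income_sd, income_mean)

print(f"CV of weights: {cv_weights:.2f}%")
print(f"CV of income:  {cv_income:.2f}%")
```

With equal standard deviations, the data set with the smaller mean has the larger CV, so in this illustration the weights are relatively more variable than the income.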

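Calculating the Sample SD
Data ($X_j$): 10 12 14 15 17 18 18 24
n = 8, Mean = 16
The squared deviations from the mean sum to 36 + 16 + 4 + 1 + 1 + 4 + 4 + 64 = 130.
$s = \sqrt{\dfrac{\sum (X_j - \bar{X})^2}{n - 1}} = \sqrt{\dfrac{130}{7}} = 4.3095$

Sample vs Population SD
Data: 10 12 14 15 17 18 18 24
N = 8, Mean = 16
For the Sample: use n – 1 in the denominator.
$s = \sqrt{\dfrac{130}{7}} = 4.3095$
For the Population: use N in the denominator.
$\sigma = \sqrt{\dfrac{130}{8}} = 4.0311$
Value for the Standard Deviation is larger for data considered as a Sample.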
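
The two divisors can be compared directly with Python's `statistics` module, where `stdev` divides by n – 1 (sample) and `pstdev` divides by N (population). This is a minimal sketch on the eight values above, whose squared deviations from the mean of 16 sum to 130.

```python
from statistics import mean, pstdev, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]

# Sum of squared deviations about the mean, shared by both formulas.
sum_of_squares = sum((x - mean(data)) ** 2 for x in data)

sample_sd = stdev(data)        # divides the sum of squares by n - 1 = 7
population_sd = pstdev(data)   # divides the sum of squares by N = 8

print(f"sum of squares = {sum_of_squares}")
print(f"sample SD      = {sample_sd:.4f}")
print(f"population SD  = {population_sd:.4f}")
```

Because the same sum of squares is divided by a smaller number, the sample value is always the larger of the two.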


Comparing Standard Deviations

Example: Team A - Heights of five marathon players in inches
65", 65", 65", 65", 65"
Mean = 65"
s = 0

Example: Team B - Heights of five marathon players in inches
62", 67", 66", 70", 60"
Mean = 65"
s = 3.6"
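
The two sets of heights can be checked with a short sketch. The value of 3.6 is reproduced when the sum of squared deviations is divided by the number of players (`statistics.pstdev`); that choice of divisor is inferred from the arithmetic, not stated in the slides, and the variable names are illustrative.

```python
from statistics import mean, pstdev

identical_heights = [65, 65, 65, 65, 65]
varied_heights = [62, 67, 66, 70, 60]

for label, heights in (("identical", identical_heights), ("varied", varied_heights)):
    # pstdev divides the sum of squared deviations by the number of values.
    print(f"{label}: mean = {mean(heights)}, SD = {pstdev(heights):.1f}")
```

Both teams have the same mean of 65 inches, yet one has no variation at all while the other has a standard deviation of about 3.6 inches. Dividing by n – 1 instead would give 4.0 for the varied heights.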
