Measures of dispersion
group of analytical tools that describe the spread or variability of a data set.
indicate the extent to which individual items in a series are scattered about an average.

Importance of the measures of dispersion
supplements an average or a measure of central tendency.
compares one group of data with another.
gives an indication of how representative the average is.
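1. Range (R)
difference between the highest and lowest values in a given set of data.
Formulas of the range:
For Ungrouped Data: $R = \text{highest value} - \text{lowest value}$
For Grouped Data: $R = \text{upper class boundary of the highest class} - \text{lower class boundary of the lowest class}$

Range
Characteristic:
crudest measure of dispersion.
Uses:
when the quickest measure of dispersion is needed.
if information concerning extreme values is desired.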
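The range can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and all data values are hypothetical, and the grouped version assumes the convention of taking the upper boundary of the highest class minus the lower boundary of the lowest class (some texts use class marks instead).

```python
def range_ungrouped(values):
    """Range of raw data: highest value minus lowest value."""
    return max(values) - min(values)


def range_grouped(class_boundaries):
    """Range of a frequency table given (lower, upper) class boundaries.

    Assumption: range = upper boundary of the highest class
    minus lower boundary of the lowest class.
    """
    lowest_lower = min(lower for lower, _ in class_boundaries)
    highest_upper = max(upper for _, upper in class_boundaries)
    return highest_upper - lowest_lower


# Hypothetical illustration values, not taken from the slides.
weights = [12, 15, 9, 20, 17]
classes = [(4.5, 9.5), (9.5, 14.5), (14.5, 19.5), (19.5, 24.5)]

print(range_ungrouped(weights))  # 20 - 9 = 11
print(range_grouped(classes))    # 24.5 - 4.5 = 20.0
```

Only the two extreme values enter either computation, which is why the range is quick to obtain but sensitive to outliers.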
Range
Advantages:
simplest measure of dispersion.
includes the limits within which all of the items occurred.
Disadvantages:
does not consider every observation in the data set.
fails to measure the variability of the majority of the values.
very sensitive to extreme values.
cannot be computed for open-ended distributions.
not amenable to algebraic manipulations.
is unreliable when computed from a frequency distribution table with gaps or zero frequencies.
Example: Given below are the weights in pounds of five babies below 1 yr. old from Health Center 1. Get the range.

Example: Given below are the weights of 5 babies from Health Center 2. Compare the weight range of the babies from Health Center 1, given in the previous example, with that of the babies from Health Center 2.
Quartile Deviation
Characteristics:
expressed in the same unit as the median.
more refined measure of dispersion than the range.
Uses:
when the distribution is open-ended and a measure of dispersion is needed.
when the median is used as a measure of central tendency.
when a measure of dispersion among the middle 50% of observations is needed and the extreme values are of less interest.
when a measure of dispersion that is not affected by extreme values is needed.

Quartile Deviation
Advantages:
not affected by values which are smaller than Q1 or larger than Q3.
easy to calculate and to understand.
Disadvantages:
does not consider every observation in the data set.
fails to consider dispersion that affects the ends of the distribution.
unreliable if there are gaps in the data around the quartiles.
does not lead to useful generalizations beyond the single-variable case.
Mean Absolute Deviation
Formulas of the MAD:
For Ungrouped Data: $MAD = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
For Grouped Data: $MAD = \frac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{n}$
where:
$x_i$ is the classmark of the ith class,
$f_i$ is the frequency of the ith class,
$k$ is the number of classes, and
$n$ is the total number of observations.

Mean Absolute Deviation
Characteristics:
If the distribution is normal, 57.5% of the items are included within the range of mean ± MAD.
Sometimes MAD is computed in relation to the median.
Uses:
When a measure of dispersion that takes into account every observation in the data set is needed.
When a measure of dispersion that is less affected by extreme values than the standard deviation is needed.
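A short Python sketch of the MAD about the mean, for ungrouped and grouped data. The function names are illustrative. The ungrouped call uses the total-employment values from the ungrouped example in these slides (45, 45, 47, 48, 50); treating each distinct value as a class mark in the grouped call is an assumption made only to show that the two forms agree.

```python
def mad_ungrouped(values):
    """Mean absolute deviation about the mean for raw data."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


def mad_grouped(class_marks, frequencies):
    """Mean absolute deviation about the mean for a frequency table."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(class_marks, frequencies)) / n
    return sum(f * abs(x - mean) for x, f in zip(class_marks, frequencies)) / n


employment = [45, 45, 47, 48, 50]
print(mad_ungrouped(employment))  # (2 + 2 + 0 + 1 + 3) / 5 = 1.6

# Same data written as a frequency table (illustrative regrouping).
print(mad_grouped([45, 47, 48, 50], [2, 1, 1, 1]))  # 1.6
```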
Example for ungrouped data
Data Set 1: Total employment from small establishments

Total employment (x)    Absolute deviation from the mean
45                      2
45                      2
47                      0
48                      1
50                      3

Mean = 47; MAD = (2 + 2 + 0 + 1 + 3)/5 = 1.6
❖ This means that, on average, the total employment of small establishments deviates from the mean by only 1.6.

Variance and Standard Deviation
are the measures of dispersion that are preferred in most circumstances and are by far the most important measures of variation.
Variance
is the average of the squared deviations of each observation in the set from the mean of the dataset. (Shows variation about the mean.)
Population variance is denoted by σ² (sigma-squared) while the sample variance is denoted by s² (s-squared).
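The definition of the variance as an average of squared deviations can be written out directly. The sketch below is illustrative Python, not taken from the slides: the function names are hypothetical, and the sample version assumes the usual n − 1 denominator. It reuses the Data Set 1 employment values.

```python
def population_variance(values):
    """Average squared deviation from the mean (denominator N)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Squared deviations from the mean divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


employment = [45, 45, 47, 48, 50]
# Squared deviations from the mean of 47: 4 + 4 + 0 + 1 + 9 = 18
print(population_variance(employment))  # 18 / 5 = 3.6
print(sample_variance(employment))      # 18 / 4 = 4.5
```

Taking the square root of either result gives the corresponding standard deviation, which is expressed in the same unit as the data.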
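Standard Deviation
Population: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
Sample: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Sample variance/Standard Deviation
Sample variance
Ungrouped Data: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Grouped Data: $s^2 = \frac{\sum_{i=1}^{k} f_i \, (x_i - \bar{x})^2}{n - 1}$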
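For grouped data, a minimal Python sketch of the sample variance is given below. It assumes the common class-mark form: squared deviations of the class marks from the grouped mean, weighted by class frequency and divided by n − 1. The function name and the frequency table are hypothetical and are not data from the slides.

```python
import math


def grouped_sample_variance(class_marks, frequencies):
    """Sample variance of a frequency table using class marks."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(class_marks, frequencies)) / n
    squared = sum(f * (x - mean) ** 2 for x, f in zip(class_marks, frequencies))
    return squared / (n - 1)


# Hypothetical frequency table: class marks and their frequencies.
class_marks = [7, 12, 17, 22]
frequencies = [2, 3, 4, 1]

variance = grouped_sample_variance(class_marks, frequencies)
std_dev = math.sqrt(variance)
print(round(variance, 4), round(std_dev, 4))
```

Because every observation in a class is represented by its class mark, the grouped result is an approximation of the value that the raw data would give.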
Variance/Standard Deviation
Uses:
when a dependable measure of dispersion is needed.
if further statistical analysis is needed.
when interpretation related to the normal distribution is required.
when the mean is used as a measure of central tendency.
when further mathematical computations are needed.
most widely used measure of dispersion and the easiest to handle algebraically.

Variance/Standard Deviation
Advantages:
takes into account every value in the data set.
most reliable measure of dispersion.
mathematically logical.
amenable to further mathematical manipulations.
can be used for in-depth analysis.
extremely useful in estimating the 'representativeness' of the mean.
Disadvantages:
harder to compute and more difficult to understand.
generally affected by extreme values that may be due to skewness of data.
Characteristics | Range | Quartile Deviation | M.A.D. | Standard Deviation
Computation based on | Lowest and highest values | Q1 and Q3 | Every value | Every value
Affected by extreme values | Greatest | Not by values smaller than Q1 or larger than Q3 | Affected by every value | Affected by every value
Degree of precision as a measure of dispersion | Rough estimate | Better than the range | Good, but it only measures absolute deviations from the mean or median | Excellent, measures squared deviations from the mean
Mathematical advantages | Easy to compute | Can be used to measure asymmetrical distribution | Easier to compute than standard deviation | Hard to compute, but suitable for further mathematical computations

Coefficient of Variation (CV)
relative measure of dispersion.
ratio of the standard deviation to the mean: $CV = \frac{s}{\bar{x}} \times 100\%$
Characteristics:
an abstract number expressed in percent.
demonstrates the relationship between the standard deviation and the mean by expressing the standard deviation as a percentage of the mean.
Uses:
compares distributions where units are different.
when a measure of relative dispersion is needed.
Coefficient of Variation (CV)
Advantages:
independent of any unit of measurement.
easy to interpret.
Disadvantage:
not useful when the mean is close to 0.

Examples
1. To get the coefficient of variation (CV) using the distribution of the number of vacancies from 43 selected enterprises, we have the following:
Given:
Examples
2. Suppose that there are two sets of data, one for the weights of the employees and the other for their income. These two data sets have equal standard deviations. How do we compare these two data sets?
Given:
weights of the employees:
income of the employees:

Computing the CV for each data set, we have:
For the weights of the employees:
For the income of the employees:
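The comparison in this example can be sketched in Python. The slide's actual figures are not reproduced here: the two data sets below are hypothetical values chosen only so that both have the same standard deviation but different means, and the function name is illustrative. The sample standard deviation (n − 1 denominator) is assumed.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical data: equal standard deviations, different means and units.
weights_kg = [55, 60, 65, 70, 75]
income_thousands = [15, 20, 25, 30, 35]

cv_weights = coefficient_of_variation(weights_kg)
cv_income = coefficient_of_variation(income_thousands)
print(round(cv_weights, 1), round(cv_income, 1))  # 12.2 31.6
```

Although the two standard deviations are identical, the income data vary more relative to their mean, and this is the kind of comparison across different units that the CV is meant for.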
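Calculating the Sample SD
$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
Data: 10 12 14 15 17 18 18 24
$S = \sqrt{\frac{(10-16)^2 + (12-16)^2 + \cdots + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$

Sample vs Population SD
Data ($X_i$): 10 12 14 15 17 18 18 24
n = 8, Mean = 16
For the Sample: use n - 1 in the denominator.
$s = \sqrt{\frac{130}{7}} = 4.3095$
For the Population: use n in the denominator.
$\sigma = \sqrt{\frac{130}{8}} = 4.0311$
The value of the standard deviation is larger when the data are treated as a sample.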
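These figures can be checked with Python's standard `statistics` module, where `stdev` divides by n − 1 (sample) and `pstdev` divides by n (population). For the eight values shown, the squared deviations from the mean of 16 sum to 130. The variable names are illustrative.

```python
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]

mean = statistics.mean(data)
squared_deviations = sum((x - mean) ** 2 for x in data)

sample_sd = statistics.stdev(data)       # sqrt(130 / 7), n - 1 denominator
population_sd = statistics.pstdev(data)  # sqrt(130 / 8), n denominator

print(mean, squared_deviations)                     # 16 130
print(round(sample_sd, 4), round(population_sd, 4)) # 4.3095 4.0311
```

The sample value is always the larger of the two for the same data, since the same sum of squares is divided by a smaller number.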
Example: Team A - Heights of five marathon players in inches
65" 65" 65" 65" 65"
Mean = 65"
s = 0

Example: Team B - Heights of five marathon players in inches
62" 67" 66" 70" 60"
Mean = 65"
s = 3.6" (computed with n = 5 in the denominator; with n - 1 the value is 4.0")