WEEK 9 – 10
Module 4
Chapter 4: Variability
Overview
While measures of central tendency provide a useful summary of information about a list of
data, a single summary number by itself is not enough to provide a picture of the
distribution of a list (in the same way that a single word cannot fully describe someone).
Several lists of data may have the same mean, but the spread of the lists may be different.
Consider, for instance, the following data sets, each of which has a mean of 17:
Set 1: 15 15 17 18 20
Set 2: 15 16 16 18 20
Set 3: 14 15 18 19 19
A measure of variability shows the extent to which numerical values tend to spread out around
the average. A suitable measure should be large when the values vary over a wide range and
small when the range of variation is narrow.
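As a quick check of the claim above, the following minimal sketch computes the mean and the standard deviation of the three data sets. It uses Python's standard `statistics` module and treats each list as a complete population, which is an assumption made only for this illustration.

```python
from statistics import mean, pstdev

# The three data sets from the overview.
data_sets = {
    "Set 1": [15, 15, 17, 18, 20],
    "Set 2": [15, 16, 16, 18, 20],
    "Set 3": [14, 15, 18, 19, 19],
}

for name, values in data_sets.items():
    # pstdev treats the list as a whole population (it divides by N).
    print(name, "mean =", mean(values), "std dev =", round(pstdev(values), 2))
```

All three means come out equal, while the three standard deviations differ, which is why a measure of central tendency alone cannot describe a distribution.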
Study Guide
Learning Outcomes
LO1: Define variability and determine its use and importance as a statistical measure.
LO2: Calculate the variance and standard deviation for both ungrouped and grouped data.
LO3: Distinguish biased and unbiased statistics and clarify why the sample mean and the
sample variance are unbiased statistics.
LO4: Explain how adding a constant to each score and multiplying each score by a constant
affect the values of both the mean and the standard deviation.
In statistics, variability has the same meaning as in everyday language: if objects are
variable, then they are not all the same. Statistically speaking, variability describes how
scores spread or scatter. If all the scores in a distribution are the same, then the scores
have no variability. The greater the differences among the scores in the distribution, the
greater the variability.
Purposes:
1. It describes the distribution: whether the distance from one score to another is large
or small, and how far an individual score can be expected to lie from the mean.
2. It measures how well an individual score represents the population, which indicates how
much sampling error to expect.
1. Range
2. Variance
3. Standard Deviation
The Range
Range is the simplest form of measuring the variability of a distribution, which is equal to
the difference between the maximum and minimum quantitative data entries in the set.
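Example: Find the range of the starting salaries.
Starting Salaries (in thousands of dollars): 41, 38, 39, 45, 47, 41, 44, 41, 37, 42
Solution: Ordering the data helps to find the least and greatest salaries.
37 38 39 41 41 41 42 44 45 47
Range = 47 − 37 = 10, so the range of the starting salaries is 10, or $10,000.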
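The range calculation can be sketched in a few lines of Python. The variable names are chosen for this illustration only.

```python
# Starting salaries, in thousands of dollars.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

# Ordering is not required by max() and min(); it mirrors the hand solution.
ordered = sorted(salaries)

# Range = maximum entry minus minimum entry.
data_range = max(salaries) - min(salaries)

print(ordered)
print("range =", data_range)
```

Because the range uses only the two extreme entries, it ignores how the remaining values are distributed between them.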
Population Variance: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$
Steps
IN WORDS                                              IN SYMBOLS
1. Find the mean of the data set.                     $\mu = \dfrac{\sum x}{N}$
2. Find the deviation of each entry.                  $x - \mu$
3. Square each deviation.                             $(x - \mu)^2$
4. Add to get the sum of squares.                     $\sum (x - \mu)^2$
5. Divide by N to get the population variance.        $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$
6. Find the square root of the variance to get
   the population standard deviation.                 $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
Find the population variance and standard deviation of the starting salaries.
Solution:
$x$      $x - \mu$      $(x - \mu)^2$
41       −0.5           0.25
38       −3.5           12.25
39       −2.5           6.25
45       3.5            12.25
47       5.5            30.25
41       −0.5           0.25
44       2.5            6.25
41       −0.5           0.25
37       −4.5           20.25
42       0.5            0.25
$\sum x = 415$          $\sum (x - \mu)^2 = 88.5$
$\sigma^2 = \dfrac{88.5}{10} = 8.85$ and $\sigma = \sqrt{\dfrac{88.5}{10}} \approx 2.97$
So, the population variance is 8.85 and the population standard deviation is 2.97.
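The six steps can also be traced in code. The sketch below follows them one at a time for the starting salaries; the variable names are illustrative.

```python
from math import sqrt

# Starting salaries, in thousands of dollars, treated as a population.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

N = len(salaries)
mu = sum(salaries) / N                     # Step 1: mean
deviations = [x - mu for x in salaries]    # Step 2: deviation of each entry
squared = [d ** 2 for d in deviations]     # Step 3: squared deviations
sum_of_squares = sum(squared)              # Step 4: sum of squares
pop_variance = sum_of_squares / N          # Step 5: population variance
pop_std = sqrt(pop_variance)               # Step 6: population standard deviation

print("mean =", mu)
print("sum of squares =", sum_of_squares)
print("variance =", round(pop_variance, 2))
print("standard deviation =", round(pop_std, 2))
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` carry out the same population calculation in a single call.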
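Sample variance: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
(The denominator $n - 1$ in the sample variance formula is used for estimation purposes: it
makes the sample variance an unbiased estimate of the population variance.)
Sample standard deviation: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$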
Steps
IN WORDS                                              IN SYMBOLS
1. Find the mean of the data set.                     $\bar{x} = \dfrac{\sum x}{n}$
2. Find the deviation of each entry.                  $x - \bar{x}$
3. Square each deviation.                             $(x - \bar{x})^2$
4. Add to get the sum of squares.                     $\sum (x - \bar{x})^2$
5. Divide by $n - 1$ to get the sample variance.      $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
6. Find the square root of the variance to get
   the sample standard deviation.                     $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Solution (treating the starting salaries as a sample): $s^2 = \dfrac{88.5}{10 - 1} \approx 9.83$ and
$s = \sqrt{\dfrac{88.5}{10 - 1}} \approx 3.14$
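To see the effect of the denominator, the sketch below computes the sample variance of the starting salaries by hand and compares it with `statistics.variance` and `statistics.stdev`, which also divide by n − 1. Treating the ten salaries as a sample is the assumption here.

```python
from statistics import mean, stdev, variance

# Starting salaries, in thousands of dollars, now treated as a sample.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

n = len(salaries)
x_bar = mean(salaries)
sum_of_squares = sum((x - x_bar) ** 2 for x in salaries)

# Manual calculation: divide the sum of squares by n - 1, not n.
s_squared = sum_of_squares / (n - 1)
s = s_squared ** 0.5

print("manual:  ", round(s_squared, 2), round(s, 2))
print("library: ", round(variance(salaries), 2), round(stdev(salaries), 2))
```

The sum of squares is the same 88.5 used for the population variance; only the divisor changes, so the sample variance is slightly larger.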
Using the raw score method: $s^2 = \dfrac{n \sum x^2 - (\sum x)^2}{n(n - 1)}$
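The raw score method needs only the sum of the scores and the sum of the squared scores, so no deviations have to be computed. A minimal sketch for the starting salaries, with illustrative variable names:

```python
# Starting salaries, in thousands of dollars, treated as a sample.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

n = len(salaries)
sum_x = sum(salaries)                      # sum of x
sum_x_sq = sum(x * x for x in salaries)    # sum of x squared

# Raw score formula: [n * sum(x^2) - (sum x)^2] / [n * (n - 1)]
numerator = n * sum_x_sq - sum_x ** 2
sample_variance = numerator / (n * (n - 1))

print("sum x =", sum_x, " sum x^2 =", sum_x_sq)
print("sample variance =", round(sample_variance, 2))
```

The result agrees with the deviation method, since the two formulas are algebraically equivalent.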
Notes:
1. The standard deviation is a measure of the typical amount an entry deviates from the
mean.
2. You can use standard deviation to compare variation in data sets that use the same units
of measure and have means that are about the same.
3. Data entries that lie more than two standard deviations from the mean are considered
unusual, while those that lie more than three standard deviations from the mean are very
unusual.
4. To compare variation in different data sets, you can use standard deviation when the data
sets use the same units of measure and have means that are about the same. For data sets
with different units of measure or different means, use the coefficient of variation.
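Note 3 can be expressed as a small rule. In the sketch below, the function name `classify` and the test values 49 and 52 are hypothetical and chosen only for illustration; the mean and standard deviation come from the starting salaries treated as a sample.

```python
from statistics import mean, stdev

def classify(value, center, spread):
    """Label a value by how many standard deviations it lies from the mean."""
    distance = abs(value - center) / spread
    if distance > 3:
        return "very unusual"
    if distance > 2:
        return "unusual"
    return "not unusual"

salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]
x_bar, s = mean(salaries), stdev(salaries)

# 47 is in the data set; 49 and 52 are hypothetical salaries.
for value in (47, 49, 52):
    print(value, classify(value, x_bar, s))
```

The cutoffs of two and three standard deviations are the rule of thumb stated in the notes, not strict boundaries.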
Coefficient of Variation (CV) is a measure of relative dispersion that reflects how large the
variation is relative to the average. It is the ratio of the standard deviation to the mean,
expressed as a percentage. Because it is a relative measure, it has no units, which makes it
suitable for comparing data sets with different units of measure or different means.
$CV = \dfrac{s}{\bar{x}} (100\%)$
Solution: $CV = \dfrac{3.14}{41.5} (100\%) \approx 7.57\%$
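The following sketch computes the CV for the starting salaries and then repeats the calculation with the salaries converted from thousands of dollars to dollars. The helper name `coefficient_of_variation` is an assumption for this example.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

salaries_thousands = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]
salaries_dollars = [x * 1000 for x in salaries_thousands]

cv_thousands = coefficient_of_variation(salaries_thousands)
cv_dollars = coefficient_of_variation(salaries_dollars)

print("CV (thousands of dollars) =", round(cv_thousands, 2), "%")
print("CV (dollars)              =", round(cv_dollars, 2), "%")
```

The two results match because changing the unit rescales the standard deviation and the mean by the same factor. Since the code does not round s to 3.14 before dividing, its result can differ in the last digit from the hand calculation.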
MEASURES OF VARIABILITY
GROUPED DATA
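In the formulas below, x is the class midpoint, f is the class frequency, and N (or n) is the
sum of the frequencies.
1. R = upper boundary (highest class interval) – lower boundary (lowest class interval)
2. a. Population Variance: $\sigma^2 = \dfrac{\sum f(x - \mu)^2}{N}$
   b. Sample Variance: $s^2 = \dfrac{\sum f(x - \bar{x})^2}{n - 1}$
      or $s^2 = \dfrac{n \sum fx^2 - (\sum fx)^2}{n(n - 1)}$
3. a. Population standard deviation: $\sigma = \sqrt{\dfrac{\sum f(x - \mu)^2}{N}}$
   b. Sample standard deviation: $s = \sqrt{\dfrac{\sum f(x - \bar{x})^2}{n - 1}}$
4. $CV = \dfrac{s}{\bar{x}} (100\%)$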
Example: Calculate the following measures of variation: R, MAD, $s^2$, $s$, and CV.
Class Interval f
91 – 100 7
81 – 90 9
71 – 80 12
61 – 70 20
51 – 60 18
41 – 50 15
31 – 40 11
21 – 30 8
Solution:
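1. $R = 100.5 - 20.5 = 80$
2. Sample Variance: $s^2 = \dfrac{\sum f(x - \bar{x})^2}{n - 1}$
   $\bar{x} = \dfrac{\sum fx}{n} = \dfrac{5930}{100} = 59.3$
   $s^2 = \dfrac{37756}{100 - 1} \approx 381.37$
3. Sample standard deviation: $s = \sqrt{381.37} \approx 19.53$
4. $CV = \dfrac{19.53}{59.3} (100\%) \approx 32.93\%$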
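Alternatively, the sample variance can be computed with the raw score formula, using
$\sum fx = 5930$ and $\sum fx^2 = 389405$:
$s^2 = \dfrac{100(389405) - (5930)^2}{100(100 - 1)} = \dfrac{3775600}{9900} \approx 381.37$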
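The grouped-data solution can be reproduced with a short script. This sketch assumes, as the range calculation does, that class boundaries lie 0.5 beyond the class limits, and it uses each class midpoint as x; the variable names are illustrative.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each class interval.
classes = [
    (91, 100, 7), (81, 90, 9), (71, 80, 12), (61, 70, 20),
    (51, 60, 18), (41, 50, 15), (31, 40, 11), (21, 30, 8),
]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
x_bar = sum_fx / n

# Sum of squared deviations of the midpoints, weighted by frequency.
sum_sq = sum(f * (x - x_bar) ** 2 for f, x in zip(freqs, midpoints))
s_squared = sum_sq / (n - 1)
s = sqrt(s_squared)
cv = s / x_bar * 100

# Range: upper boundary of the highest class minus lower boundary of the lowest.
upper_boundary = max(hi for _, hi, _ in classes) + 0.5
lower_boundary = min(lo for lo, _, _ in classes) - 0.5
data_range = upper_boundary - lower_boundary

print("R =", data_range)
print("mean =", round(x_bar, 2))
print("s^2 =", round(s_squared, 2), " s =", round(s, 2))
print("CV =", round(cv, 2), "%")
```

Because every score in a class is represented by the class midpoint, these values are approximations of what the ungrouped scores would give.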
A sample statistic is unbiased if the average value of the statistic is equal to the population
parameter. (The average value of the statistic is obtained from all the possible samples for a
specific sample size, n.)
A sample statistic is biased if the average value of the statistic either underestimates or
overestimates the corresponding population parameter.
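These definitions can be checked exactly on a small case. The sketch below assumes a hypothetical population of six scores and lists every possible sample of size n = 2 drawn with replacement; it then averages the sample mean, the sample variance with divisor n − 1, and the variance with divisor n over all samples. Fractions are used so that the averages are exact.

```python
from fractions import Fraction
from itertools import product

# Hypothetical small population, used only for illustration.
population = [0, 0, 3, 3, 9, 9]
N = len(population)
mu = Fraction(sum(population), N)
sigma_sq = sum((x - mu) ** 2 for x in population) / N

n = 2
# Every possible sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

def sample_stats(sample):
    """Return (mean, variance with n - 1, variance with n) for one sample."""
    m = Fraction(sum(sample), n)
    ss = sum((x - m) ** 2 for x in sample)
    return m, ss / (n - 1), ss / n

means, unbiased_vars, biased_vars = zip(*(sample_stats(s) for s in samples))

def avg(values):
    return sum(values) / len(values)

print("population mean:", mu, " average sample mean:", avg(means))
print("population variance:", sigma_sq)
print("average variance, divisor n - 1:", avg(unbiased_vars))
print("average variance, divisor n:    ", avg(biased_vars))
```

In this enumeration the average sample mean equals the population mean and the average of the n − 1 variances equals the population variance, while dividing by n gives an average that falls below it.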
Transformation of Scale
1. The standard deviation does not change when a constant is added to each score in
the distribution.
2. If each score is multiplied by the same constant, the standard deviation is also
multiplied by that constant (by its absolute value if the constant is negative).
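Both rules can be verified numerically. The sketch below reuses Set 1 from the overview; the constants 5 and 3 are arbitrary choices for the illustration.

```python
from statistics import mean, pstdev

scores = [15, 15, 17, 18, 20]         # Set 1 from the overview
shifted = [x + 5 for x in scores]     # add a constant to each score
scaled = [x * 3 for x in scores]      # multiply each score by a constant

for label, data in (("original", scores), ("plus 5", shifted), ("times 3", scaled)):
    print(label, "mean =", mean(data), "std dev =", round(pstdev(data), 3))
```

Adding 5 moves the mean up by 5 and leaves the standard deviation unchanged, while multiplying by 3 multiplies both the mean and the standard deviation by 3.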
Assessment
Assignment
Reference
Gravetter, F. J., & Wallnau, L. B. (2017). Statistics for the Behavioral Sciences (10th ed.).
Boston, MA: Cengage Learning.