Week #6 to #7
While a measure of central tendency describes the typical value, measures of variability define
how far away the data points tend to fall from the center.
The measures of dispersion to be discussed in this chapter are the range, quantile deviation,
interquartile range, mean deviation, variance, and standard deviation.
RANGE
The range of a data set is defined to be the difference between the highest and lowest values in
the data set.
• Outliers can have a disproportionate effect on statistical results, such as the range and mean,
which can result in misleading interpretations.
For example:
Outlier: 34.
Use the range to compare variability only when the sample sizes are similar.
The range is used to report the movement of stock prices over a period of time.
Weather reports typically state the high and low temperature readings for a 24-hour
period.
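The range, and its sensitivity to a single outlier, can be sketched in Python. The scores below are hypothetical values invented for illustration, and `value_range` is an assumed helper name, not something defined in this handout.

```python
# Hypothetical scores, invented for illustration only.
scores = [12, 15, 14, 13, 16]


def value_range(data):
    """Return the difference between the highest and lowest values."""
    return max(data) - min(data)


print(value_range(scores))  # 16 - 12 = 4

# Adding one extreme value (an outlier) stretches the range sharply.
scores_with_outlier = scores + [34]
print(value_range(scores_with_outlier))  # 34 - 12 = 22
```

One added value changes the range from 4 to 22, which is why the range alone can mislead when outliers are present.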
• Besides the range, other measures of variability include the percentile deviation, decile
deviation, quartile deviation, and the interquartile range (IQR).
• These measures may be used to minimize the effect of extremely low and high scores, or
outliers, on the measure of variability or spread.
In descriptive statistics, the interquartile range tells you the spread of the middle half of your
distribution.
Quartiles segment any distribution that’s ordered from low to high into four equal parts. The
interquartile range (IQR) contains the second and third quartiles, or the middle half of your data set.
Whereas the range gives you the spread of the whole data set, the interquartile range gives you the
range of the middle half of a data set.
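As a rough sketch, the quartiles and the IQR can be computed with Python's standard `statistics` module. The data are the 15 observations from the complete table later in this handout. Note that `statistics.quantiles` uses its default "exclusive" method here, which is only one of several quartile conventions, so a textbook method may give slightly different cut points.

```python
import statistics

# The 15 observations from the complete table later in this handout.
data = [22, 23, 21, 26, 48, 32, 27, 40, 46, 44, 40, 43, 37, 28, 33]

# With n=4, quantiles() returns the three cut points Q1, Q2, Q3.
# The input does not need to be sorted first.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(q1, q2, q3)             # 26.0 33.0 43.0
print(iqr)                    # 17.0 (spread of the middle half)
print(max(data) - min(data))  # 27 (spread of the whole data set)
```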
Interpretation: 30% of the students' scores are
lower than or equal to 20.
P43 = 15    P60 = 16    P75 = 18
MEAN DEVIATION
The mean deviation measures the average deviation of the values from the arithmetic mean. It gives
equal weight to the deviation of every observation.
Formula:
\[ M.D. = \frac{\sum |X - \bar{X}|}{n} \]
Where M.D. = mean deviation
X = a particular data value
X̄ = sample mean
n = total number of observations
| | = absolute value
Absolute value describes the distance from zero that a number is on the number line, without
considering direction. The absolute value of a number is never negative.
In that example the values are, on average, 3.75 away from the middle.
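The mean deviation formula can also be sketched in Python. This uses a different data set from the example mentioned above: the 15 observations from the complete table later in this handout. `mean_deviation` is an assumed helper name used only for illustration.

```python
# The 15 observations from the complete table later in this handout.
data = [22, 23, 21, 26, 48, 32, 27, 40, 46, 44, 40, 43, 37, 28, 33]


def mean_deviation(values):
    """Average absolute distance of the values from their arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


print(mean_deviation(data))  # 120 / 15 = 8.0
```

For this data set the values are, on average, 8 units away from the mean of 34.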
Example of getting the Mean Deviation using a Table
The variance of a population is equal to the sum of the squared deviations about the mean
divided by the number of scores. The standard deviation is equal to the square root of the variance.
They are used when the mean is the preferred measure of central tendency. They show whether or not
the scores are grouped closely around the mean of the distribution. The symbols for sample and population
variances are S² and σ², respectively. Variance is frequently discussed by researchers as an indicator of
how much variability there is in an entire distribution of scores. The standard deviation is used to
determine how far the data are from the mean.
If the values are clustered tightly about their mean, the standard deviation is small; if the
values are scattered more widely about their mean, the standard deviation is
large.
• The variance (S²) of a sample is a measure of how items are dispersed about their mean.
• The standard deviation (S) for a sample is the square root of the variance.
Of all Measures of Variability, the standard deviation is the most widely employed because it is
used in so many statistical operations.
Before we can compute the standard deviation, we must determine if our data set represents a
population or a sample. We must know this fact so that the correct formula can be used.
The formulas are similar; however, a denominator of n − 1 is used to compute the sample
standard deviation and a denominator of N is used for the population standard deviation. Since the sample standard
deviation is often used to estimate the value of an unknown population standard deviation, the use of
n − 1 produces better estimates.
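To illustrate the difference between the two denominators, here is a minimal sketch using Python's standard `statistics` module, where `variance` and `stdev` divide by n − 1 and `pvariance` and `pstdev` divide by N. The data are the 15 observations from the complete table later in this handout. Treating them as a sample or as a population is an assumption made only to compare the two results.

```python
import statistics

# The 15 observations from the complete table later in this handout.
data = [22, 23, 21, 26, 48, 32, 27, 40, 46, 44, 40, 43, 37, 28, 33]

sample_var = statistics.variance(data)       # denominator n - 1 = 14
population_var = statistics.pvariance(data)  # denominator N = 15
sample_sd = statistics.stdev(data)           # square root of sample variance
population_sd = statistics.pstdev(data)      # square root of population variance

print(f"{sample_var:.2f}")      # 85.00
print(f"{population_var:.2f}")  # 79.33
print(f"{sample_sd:.2f}")       # 9.22
print(f"{population_sd:.2f}")   # 8.91
```

The sample figures are slightly larger because the same sum of squared deviations is divided by a smaller number.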
The sample variance (S²) and sample standard deviation (S) for ungrouped data can be
computed from the formulas
\[ S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \qquad S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
Where S² = variance of a sample
S = sample standard deviation
x = values of the observations
x̄ = mean of the sample
n = total number of observations in the sample
(x − x̄)² = square of the distance of the x value from the mean, |x − x̄|. To get (x − x̄)², multiply the value of
|x − x̄| by itself (e.g., 12 × 12 = 144).
The Short-Cut Formula
When many items are included and we would like to avoid computing each deviation from the mean, we may use
the formulas:
\[ S^2 = \frac{n \left( \sum x^2 \right) - \left( \sum x \right)^2}{n(n - 1)} \] (for sample variance)
\[ S = \sqrt{\frac{n \left( \sum x^2 \right) - \left( \sum x \right)^2}{n(n - 1)}} \] (for sample standard deviation)
X² = square of the X value. To get X², multiply the value of X by itself (e.g., 22 × 22 = 484).
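As a quick check that the short-cut formula agrees with the deviation-based formula, the sketch below computes the sample variance both ways in Python for the 15 observations of the complete table in this handout. The variable names are illustrative only.

```python
import math

# The 15 observations from the complete table in this handout.
data = [22, 23, 21, 26, 48, 32, 27, 40, 46, 44, 40, 43, 37, 28, 33]

n = len(data)
sum_x = sum(data)                   # sum of x
sum_x2 = sum(x * x for x in data)   # sum of x squared

# Deviation-based formula: sum of squared deviations over n - 1.
mean = sum_x / n
var_deviation = sum((x - mean) ** 2 for x in data) / (n - 1)

# Short-cut formula: needs only n, sum of x, and sum of x squared.
var_shortcut = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

print(sum_x, sum_x2)                       # 510 18530
print(var_deviation, var_shortcut)         # 85.0 85.0
print(round(math.sqrt(var_shortcut), 2))   # 9.22
```

The short-cut version never forms the individual deviations, which is convenient when working by hand from column totals.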
Complete table
X       |X − X̄|    (X − X̄)²    X²
22      12         144         484
23      11         121         529
21      13         169         441
26      8          64          676
48      14         196         2304
32      2          4           1024
27      7          49          729
40      6          36          1600
46      12         144         2116
44      10         100         1936
40      6          36          1600
43      9          81          1849
37      3          9           1369
28      6          36          784
33      1          1           1089
∑X = 510    ∑|X − X̄| = 120    ∑(X − X̄)² = 1190    ∑X² = 18530
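\[ S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{1190}{14} = 85 \]
\[ S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{1190}{14}} \approx 9.22 \]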
Shortcut Formula
n = 15    ∑x² = 18530
n − 1 = 15 − 1 = 14    (∑x)² = (510)² = 260100
\[ S^2 = \frac{15(18530) - (510)^2}{15(14)} = \frac{277950 - 260100}{210} = 85 \]
\[ S = \sqrt{\frac{15(18530) - (510)^2}{15(14)}} = \sqrt{\frac{277950 - 260100}{210}} \approx 9.22 \]
Standard Deviation Example
The Standard Deviation is a measure of how spread out numbers are.
The formula is easy: it is the square root of the Variance. So now you ask, "What
is the Variance?"
Variance
The Variance is defined as: The average of the squared differences
from the Mean.
Example
You and your friends have just measured the heights of your dogs (in
millimeters):
The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and
300mm.
Find out the Mean, the Variance, and the Standard Deviation.
Your first step is to find the Mean:
X       |X − X̄|    (X − X̄)²
600
470
170
430
300
∑X = 1970    ∑|X − X̄| =     ∑(X − X̄)² =
Answer:
Mean = (600 + 470 + 170 + 430 + 300) / 5
= 1970 / 5
= 394
So the mean (average) height is 394 mm. Next, find the distance of each height from the mean:
X       |X − X̄|    (X − X̄)²
600     206
470     76
170     224
430     36
300     94
∑X = 1970    ∑|X − X̄| = 636    ∑(X − X̄)² =
To calculate the Variance, take each difference, square it, and then average the
results (the sample formula used here divides the sum by n − 1 rather than n):
X       |X − X̄|    (X − X̄)²
600     206        42436
470     76         5776
170     224        50176
430     36         1296
300     94         8836
∑X = 1970    ∑|X − X̄| = 636    ∑(X − X̄)² = 108520
Variance
\[ S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
∑(x − x̄)² = 108520
n − 1 = 5 − 1 = 4
\[ S^2 = \frac{108520}{4} = 27130 \]
And the Standard Deviation is just the square root of Variance, so:
Standard Deviation
\[ S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
∑(x − x̄)² = 108520
n − 1 = 5 − 1 = 4
\[ S = \sqrt{\frac{108520}{4}} = \sqrt{27130} \approx 164.71 \]
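And the good thing about the Standard Deviation is that it is useful. Now we can
show which heights are within one Standard Deviation (164.71 mm) of the Mean, that is,
between 229.29 mm and 558.71 mm: the 470 mm, 430 mm, and 300 mm dogs fall inside this band,
while the 600 mm and 170 mm dogs fall outside it.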
So, using the Standard Deviation we have a "standard" way of knowing what
is normal, and what is extra large or extra small.
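The dog-height example can be reproduced with a short Python sketch. It uses the sample standard deviation (denominator n − 1), matching the calculation above. The variable names are illustrative, and the "within one standard deviation" band is simply the mean plus or minus S.

```python
import statistics

# Dog heights at the shoulder, in millimeters, from the example above.
heights_mm = [600, 470, 170, 430, 300]

mean = statistics.mean(heights_mm)   # 1970 / 5 = 394
sd = statistics.stdev(heights_mm)    # sample standard deviation (n - 1)

# Band of "normal" heights: one standard deviation either side of the mean.
low, high = mean - sd, mean + sd

within = [h for h in heights_mm if low <= h <= high]
outside = [h for h in heights_mm if not (low <= h <= high)]

print(round(sd, 2))  # 164.71
print(within)        # [470, 430, 300]
print(outside)       # [600, 170]
```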