
Chapter 3-2

Measures of Variation

Measures of Variation
 An important characteristic of any set of data
is the variation in the data. In some data sets,
the data values are concentrated closely near
the mean; in other data sets, the data values
are more widely spread out from the mean.
How Can We Measure Variability?
Range

Variance

Standard Deviation
Measures of Variation: Range
 The range is the difference between the
highest and lowest values in a data set.

$R = \text{highest value} - \text{lowest value}$

Example 1: Outdoor Paint
Two experimental brands of outdoor paint are
tested to see how long each will last before
fading. Six cans of each brand constitute a
small population. The results (in months) are
shown. Find the mean and range of each group.
Brand A Brand B
10 35
60 45
50 30
30 35
40 40
20 25

Example 3-18/19: Outdoor Paint
Brand A: $\mu = \dfrac{\sum X}{N} = \dfrac{210}{6} = 35$, $R = 60 - 10 = 50$
Brand B: $\mu = \dfrac{\sum X}{N} = \dfrac{210}{6} = 35$, $R = 45 - 25 = 20$

The average for both brands is the same, so you might conclude
that both brands of paint last equally well. Even so, the spread,
or variation, is quite different: the range for brand A is much
greater than the range for brand B.
Which brand would you buy?
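The mean and range comparison above can be checked with a few lines of Python. This is a minimal sketch; the names `brand_a`, `brand_b`, `mean`, and `value_range` are illustrative and not part of the original slides.

```python
# Paint durability in months, copied from the Outdoor Paint example.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)


for name, data in (("Brand A", brand_a), ("Brand B", brand_b)):
    print(name, "mean =", mean(data), "range =", value_range(data))
```

Both brands share the same mean, while the ranges (50 versus 20) differ, which is the point of the example.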
Measures of Variation: Variance &
Standard Deviation
 The variance is the average of the
squares of the distance each value is
from the mean.
 The standard deviation is the square
root of the variance.
 The standard deviation is a measure of
how spread out your data are.

Measures of Variation:
Variance & Standard Deviation
(Population Theoretical Model)
The population variance is
$\sigma^2 = \dfrac{\sum (X - \mu)^2}{N}$
The population standard deviation is
$\sigma = \sqrt{\dfrac{\sum (X - \mu)^2}{N}}$

Example 2: Outdoor Paint
Find the variance and standard deviation for the data set
for Brand A and B in the previous example.
Solution: For brand A, $\mu = 35$.

Months, X    X - µ    (X - µ)²
10           -25      625
60            25      625
50            15      225
30            -5       25
40             5       25
20           -15      225
Sum                   1750

$\sigma^2 = \dfrac{\sum (X - \mu)^2}{N} = \dfrac{1750}{6} \approx 291.7$ months²
$\sigma = \sqrt{\dfrac{1750}{6}} \approx 17.1$ months
Example 3-21: Outdoor Paint
Solution: For brand B, $\mu = 35$.

Months, X    X - µ    (X - µ)²
35             0        0
45            10      100
30            -5       25
35             0        0
40             5       25
25           -10      100
Sum                   250

$\sigma^2 = \dfrac{\sum (X - \mu)^2}{N} = \dfrac{250}{6} \approx 41.7$ months²
$\sigma = \sqrt{41.7} \approx 6.5$ months
Since the standard deviation of brand A is 17.1 and the
standard deviation of brand B is 6.5, the data are
more variable for brand A.
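The two tables above follow the population formulas directly, and the same arithmetic can be sketched in Python. The function names `population_variance` and `population_std` are illustrative choices, not part of the slides.

```python
import math

# Paint durability in months, copied from the Outdoor Paint example.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def population_variance(values):
    """Average of the squared distances of each value from the mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


for name, data in (("Brand A", brand_a), ("Brand B", brand_b)):
    print(name, round(population_variance(data), 1), round(population_std(data), 1))
```

Rounded to one decimal place, this reproduces 291.7 and 17.1 for brand A and 41.7 and 6.5 for brand B.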
Measures of Variation:
Variance & Standard Deviation
(Sample Theoretical Model)
The sample variance is
$s^2 = \dfrac{\sum (X - \bar{X})^2}{n - 1}$
The sample standard deviation is
$s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$

Measures of Variation:
Variance & Standard Deviation
(Sample Computational Model)
The sample variance is
$s^2 = \dfrac{n \sum X^2 - (\sum X)^2}{n(n - 1)}$
The sample standard deviation is
$s = \sqrt{s^2}$

Measures of Variation:
Variance & Standard Deviation
(Sample Computational Model)
 Is mathematically equivalent to the
theoretical formula.
 Saves time when calculating by hand
 Does not use the mean
 Is more accurate when the mean has
been rounded.
Example 3: European Auto Sales
Find the variance and standard deviation for the
amount of European auto sales for a sample of 6
years. The data are in millions of dollars.
11.2, 11.9, 12.0, 12.8, 13.4, 14.3

X       X²
11.2    125.44
11.9    141.61
12.0    144.00
12.8    163.84
13.4    179.56
14.3    204.49
Sum: 75.6    958.94

$s^2 = \dfrac{n \sum X^2 - (\sum X)^2}{n(n - 1)} = \dfrac{6(958.94) - (75.6)^2}{6(5)} \approx 1.28$
$s = \sqrt{1.28} \approx 1.13$
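As a check on the claim that the computational formula is mathematically equivalent to the theoretical one, the sketch below evaluates both on the auto sales data. The function names are illustrative; only the two formulas come from the slides.

```python
import math

# European auto sales (millions of dollars) for a sample of 6 years.
sales = [11.2, 11.9, 12.0, 12.8, 13.4, 14.3]


def sample_variance_theoretical(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_computational(values):
    """[n * sum(X^2) - (sum X)^2] / [n * (n - 1)]; the mean is never used."""
    n = len(values)
    total = sum(values)
    total_of_squares = sum(x ** 2 for x in values)
    return (n * total_of_squares - total ** 2) / (n * (n - 1))


s2 = sample_variance_computational(sales)
s = math.sqrt(s2)
print(round(s2, 2), round(s, 2))
```

The two functions agree up to floating-point rounding, and the rounded results match the hand calculation (1.28 and 1.13).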
Measures of Variation for Grouped
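Data
The sample variance and standard deviation for grouped data
are given by
$s^2 = \dfrac{n \sum f \cdot X_m^2 - (\sum f \cdot X_m)^2}{n(n - 1)}, \quad n = \sum f$
$s = \sqrt{s^2}$
where $f$ is the class frequency and $X_m$ is the class midpoint.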

Measures of Variation for Grouped
Data
Example: Find the variance and the standard deviation for
the following frequency distribution.

Class          f     Xm    f·Xm    f·Xm²
5.5 - 10.5     1      8       8       64
10.5 - 15.5    2     13      26      338
15.5 - 20.5    3     18      54      972
20.5 - 25.5    5     23     115     2645
25.5 - 30.5    4     28     112     3136
30.5 - 35.5    3     33      99     3267
35.5 - 40.5    2     38      76     2888
Total         20            490    13310

$\sum f = 20$, $\sum f \cdot X_m = 490$, $\sum f \cdot X_m^2 = 13310$

Measures of Variation for Grouped
Data
Solution: $n = 20$, $\sum f \cdot X_m = 490$, $\sum f \cdot X_m^2 = 13310$
$s^2 = \dfrac{20(13310) - (490)^2}{20(19)} = \dfrac{266200 - 240100}{380} = \dfrac{26100}{380} \approx 68.68$
$s = \sqrt{68.68} \approx 8.29$
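The grouped-data calculation can be sketched in Python as follows. The list name `classes` and its layout as (frequency, midpoint) pairs are assumptions made for this illustration; the numbers are taken from the frequency distribution in the example.

```python
import math

# (frequency f, class midpoint Xm) for each class in the example.
classes = [(1, 8), (2, 13), (3, 18), (5, 23), (4, 28), (3, 33), (2, 38)]

n = sum(f for f, _ in classes)                    # n = sum of f
sum_fx = sum(f * xm for f, xm in classes)         # sum of f * Xm
sum_fx2 = sum(f * xm ** 2 for f, xm in classes)   # sum of f * Xm^2

# Grouped-data version of the computational formula.
s2 = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
s = math.sqrt(s2)

print(n, sum_fx, sum_fx2)
print(round(s2, 2), round(s, 2))
```

The three totals match the table (20, 490, 13310), and the rounded variance and standard deviation match the solution (68.68 and 8.29).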

Applications:
1-Coefficient of Variation
The coefficient of variation is the
standard deviation divided by the
mean, expressed as a percentage.
$CV = \dfrac{s}{\bar{X}} \cdot 100\%$
Use the CV to compare standard
deviations when the units are different.
Example : Sales of Automobiles
The mean of the number of sales of cars over a
3-month period is 87, and the standard
deviation is 5. The mean of the commissions is
$5225, and the standard deviation is $773.
Compare the variations of the two.
For sales: $CVar = \dfrac{s}{\bar{X}} \cdot 100\% = \dfrac{5}{87} \cdot 100\% \approx 5.7\%$
For commissions: $CVar = \dfrac{s}{\bar{X}} \cdot 100\% = \dfrac{773}{5225} \cdot 100\% \approx 14.8\%$

Commissions are more variable than sales.


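Example: The mean and the variance of two groups
are given below. Compare the variations.

Group    Mean    Variance
A        132     23
B        182     62

For group A: $CVar = \dfrac{s}{\bar{X}} \cdot 100\% = \dfrac{\sqrt{23}}{132} \cdot 100\% \approx \dfrac{4.8}{132} \cdot 100\% \approx 3.6\%$
For group B: $CVar = \dfrac{s}{\bar{X}} \cdot 100\% = \dfrac{\sqrt{62}}{182} \cdot 100\% \approx \dfrac{7.9}{182} \cdot 100\% \approx 4.3\%$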
Group A is less variable than group B.
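Both coefficient of variation examples can be reproduced with one small helper. This is a sketch; `coefficient_of_variation` is an illustrative name. Note that the second example gives variances, so the square root is taken before dividing by the mean.

```python
import math


def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev / mean * 100


# Sales vs. commissions: standard deviations are given directly.
cv_sales = coefficient_of_variation(5, 87)
cv_commissions = coefficient_of_variation(773, 5225)

# Groups A and B: variances are given, so take the square root first.
cv_group_a = coefficient_of_variation(math.sqrt(23), 132)
cv_group_b = coefficient_of_variation(math.sqrt(62), 182)

print(round(cv_sales, 1), round(cv_commissions, 1))
print(round(cv_group_a, 1), round(cv_group_b, 1))
```

Because the CV has no units, sales counts and dollar commissions can be compared on the same scale.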

Measures of Variation:
2-Range Rule of Thumb
The Range Rule of Thumb
approximates the standard deviation as
$s \approx \dfrac{\text{Range}}{4}$
when the distribution is unimodal and
approximately symmetric.

Measures of Variation:
Range Rule of Thumb
Use $\bar{X} - 2s$ to approximate the lowest
value and $\bar{X} + 2s$ to approximate the
highest value in a data set.
Example: $\bar{X} = 10$, Range $= 12$
$s \approx \dfrac{12}{4} = 3$
LOW $\approx 10 - 2(3) = 4$
HIGH $\approx 10 + 2(3) = 16$
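The rule of thumb is simple enough to express as a single function. In this sketch the name `range_rule_of_thumb` and its return layout are assumptions for illustration; the results are rough approximations that presume a unimodal, roughly symmetric distribution.

```python
def range_rule_of_thumb(mean, data_range):
    """Approximate s as range / 4, then estimate low and high as mean -/+ 2s."""
    s = data_range / 4
    low = mean - 2 * s
    high = mean + 2 * s
    return s, low, high


s, low, high = range_rule_of_thumb(mean=10, data_range=12)
print(s, low, high)
```

For the slide's example (mean 10, range 12) this gives s = 3, a low estimate of 4, and a high estimate of 16.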
3-3 Measures of Position
 Z-score

 Percentile

 Quartile

 Outlier

Measures of Position: Z-score
 A z-score or standard score for a value
is obtained by subtracting the mean from
the value and dividing the result by the
standard deviation.
$z = \dfrac{X - \bar{X}}{s}$ (sample) or $z = \dfrac{X - \mu}{\sigma}$ (population)
 A z-score represents the number of
standard deviations a value is above or
below the mean.

Example : Test Scores
A student scored 65 on a calculus test that had a
mean of 50 and a standard deviation of 10; she
scored 30 on a history test with a mean of 25 and
a standard deviation of 5. Compare her relative
positions on the two tests.
Calculus: $z = \dfrac{X - \bar{X}}{s} = \dfrac{65 - 50}{10} = 1.5$
History: $z = \dfrac{X - \bar{X}}{s} = \dfrac{30 - 25}{5} = 1.0$
She has a higher relative position in the Calculus class.
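The comparison above amounts to computing one z-score per test. A minimal sketch, with `z_score` as an illustrative function name:

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations a value lies above or below the mean."""
    return (value - mean) / std_dev


z_calculus = z_score(65, mean=50, std_dev=10)
z_history = z_score(30, mean=25, std_dev=5)

# The larger z-score marks the higher relative position.
better = "Calculus" if z_calculus > z_history else "History"
print(z_calculus, z_history, better)
```

Although the raw scores are on different scales, the z-scores (1.5 and 1.0) are directly comparable.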

Example : Students GPA
Two students, Ahmed and Ali, from different high
schools, wanted to find out who had the higher GPA
when compared to his school. Which student had the
higher GPA when compared to his school?

Student    GPA     School Mean GPA    School Standard Deviation
Ahmed      2.85    3                  0.7
Ali        77      80                 10

Example : Students GPA
Solution:
For Ahmed: $z = \dfrac{X - \bar{X}}{s} = \dfrac{2.85 - 3}{0.7} \approx -0.21$
For Ali: $z = \dfrac{X - \bar{X}}{s} = \dfrac{77 - 80}{10} = -0.3$
Since $-0.21 > -0.3$, Ahmed has the better GPA relative to his school.

Measures of Position: Percentiles
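Percentiles divide a set of data into 100
equal groups with about 1% of the values
in each group. There are 99 percentiles,
denoted P1, P2, . . . , P99.

The percentile corresponding to a given value X:
$\text{Percentile} = \dfrac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \cdot 100$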

Example : Test Scores
A teacher gives a 20-point test to 10 students.
Find the percentile rank of a score of 12.
18, 15, 12, 6, 8, 2, 3, 5, 20, 10
Sort in ascending order.
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
Six values lie below 12.
$\text{Percentile} = \dfrac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \cdot 100 = \dfrac{6 + 0.5}{10} \cdot 100 = 65$
A student whose score was 12 did better than 65% of the class.
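The percentile rank formula used in the test score example translates directly into code. In this sketch, `percentile_rank` is an illustrative name, and "below" is taken to mean strictly less than X, as in the worked example.

```python
def percentile_rank(values, x):
    """(number of values below x + 0.5) / total number of values * 100."""
    below = sum(1 for v in values if v < x)
    return (below + 0.5) * 100 / len(values)


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
print(percentile_rank(scores, 12))
```

Sorting is not needed here, because only the count of values below X matters.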
Converting from the kth
Percentile to the Corresponding
Data Value

Example : Test Scores
A teacher gives a 20-point test to 10 students. Find
the value corresponding to the 25th percentile and
60th percentile
18, 15, 12, 6, 8, 2, 3, 5, 20, 10
Solution: Sort in ascending order.
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
For the 25th percentile
$L = \dfrac{K}{100} \cdot n = \dfrac{25}{100} \cdot 10 = 2.5$, which rounds up to 3
$P_{25} = 5$ (the 3rd value)

Example 3-34: Test Scores
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
For the 60th percentile
$L = \dfrac{K}{100} \cdot n = \dfrac{60}{100} \cdot 10 = 6$
Since L is a whole number, average the 6th and 7th values:
$P_{60} = \dfrac{10 + 12}{2} = 11$
What is the difference between a percentile and a
percentage?
A percentile is a relative measurement of position; a percentage is
an absolute measure of the part to the total.
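The two worked examples suggest a rule: compute L = (K/100)·n; if L is not a whole number, round it up and take that value; if L is a whole number, average the L-th and (L+1)-th values. The sketch below implements that rule as inferred from the examples, so treat it as an assumption rather than the slides' stated procedure. The name `percentile_value` is illustrative.

```python
import math


def percentile_value(values, k):
    """Data value at the k-th percentile (k from 1 to 99), per the rule above."""
    data = sorted(values)
    n = len(data)
    position = k * n / 100
    if position != int(position):
        # Not a whole number: round up and take that value (1-based position).
        return data[math.ceil(position) - 1]
    # Whole number: average the value at that position and the next one.
    position = int(position)
    return (data[position - 1] + data[position]) / 2


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
print(percentile_value(scores, 25), percentile_value(scores, 60))
```

Statistical software often uses other interpolation conventions, so library results may differ slightly from this hand method.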

Deciles:
Deciles divide a set of data into 10
equal groups with about 10% of the values
in each group. There are 9 deciles, denoted
D1, D2, . . . , D9.

Note that:
D1=P10, D2=P20, D3=P30, ... , D9=P90.

QUARTILES:
Quartiles separate the data set into 4 equal groups with about
25 % of the values in each group. There are 3
quartiles denoted by Q1, Q2, Q3

Q1 = first quartile, Q2 =second quartile (median), Q3 = third quartile


Note that: Q1=P25, Q2=MD=D5=P50, Q3=P75
The Interquartile Range, IQR = Q3 – Q1.

Example:
Find Q1, Q2, and Q3 for the data set.
15, 13, 6, 5, 12, 50, 22, 18

Sort in ascending order.


5, 6, 12, 13, 15, 18, 22, 50

$Q_1 = \dfrac{6 + 12}{2} = 9$
$Q_2 = \text{median} = \dfrac{13 + 15}{2} = 14$
$Q_3 = \dfrac{18 + 22}{2} = 20$
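One way to reproduce these quartiles is to take the median of the whole data set, then the medians of its lower and upper halves. The sketch below assumes that convention (with the middle value excluded from both halves when n is odd); the function names are illustrative.

```python
def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    lower_half = data[: n // 2]
    upper_half = data[(n + 1) // 2:]
    return median(lower_half), median(data), median(upper_half)


q1, q2, q3 = quartiles([15, 13, 6, 5, 12, 50, 22, 18])
print(q1, q2, q3, "IQR =", q3 - q1)
```

For this data set the result agrees with the slide: Q1 = 9, Q2 = 14, Q3 = 20.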
Measures of Position:
Outliers
An outlier is an extremely high or low
data value when compared with the rest of
the data values.
A data value less than Q1 – 1.5(IQR) or
greater than Q3 + 1.5(IQR) can be
considered an outlier. That is, an outlier is a
data value outside the interval
[Q1 – 1.5(IQR), Q3 + 1.5(IQR)].
Exercise: Find the outlier for the previous
example.
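The outlier rule can be sketched as two small functions. To leave the exercise open, the data and quartiles below are hypothetical values made up for illustration; they do not come from the slides. The function names are also illustrative.

```python
def outlier_fences(q1, q3):
    """Return (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Values lying outside the interval defined by the two fences."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]


# Hypothetical data set, chosen only for illustration (Q1 = 20, Q3 = 30).
sample = [3, 18, 22, 24, 26, 28, 32, 50]
q1, q3 = 20, 30

print(outlier_fences(q1, q3))
print(find_outliers(sample, q1, q3))
```

With IQR = 10 the fences are 5 and 45, so 3 and 50 are flagged in this made-up data set.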
The Five-Number Summary

Boxplots

Constructing Boxplots
1. Find the five-number summary.
2. Draw a horizontal axis with a scale that includes
the maximum and minimum data values.
3. Draw a box with vertical sides through Q1 and
Q3, and draw a vertical line through the median.
4. Draw a line from the minimum data value to the
left side of the box and a line from the maximum
data value to the right side of the box.

Example 3-38: Meteorites
The number of meteorites found in 10 U.S. states is
shown. Find the five number summary and construct
a boxplot for the data.
89, 47, 164, 296, 30, 215, 138, 78, 48, 39
Solution:

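The five-number summary is
Lowest value = 30, Q1 = 47, MD = 83.5, Q3 = 164,
Highest value = 296

[Boxplot: whiskers extend from 30 to 296; the box spans Q1 = 47 to Q3 = 164, with a vertical line at the median, 83.5.]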
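The five-number summary for the meteorite data can be verified with a short sketch. It assumes the quartiles are the medians of the lower and upper halves of the sorted data, which reproduces the values above; `five_number_summary` and `median` are illustrative names.

```python
def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (lowest, Q1, median, Q3, highest)."""
    data = sorted(values)
    n = len(data)
    q1 = median(data[: n // 2])        # median of the lower half
    q3 = median(data[(n + 1) // 2:])   # median of the upper half
    return data[0], q1, median(data), q3, data[-1]


meteorites = [89, 47, 164, 296, 30, 215, 138, 78, 48, 39]
print(five_number_summary(meteorites))
```

These five values are exactly what step 1 of the boxplot construction asks for.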

Exercise:
Consider the following boxplot,
1- Identify the five number summary?
2- What is the outlier?
3- Describe the distribution?

