
Measures of Variation

Range
• Maximum − minimum
• Sensitive to extreme values

Standard Deviation
• Measures how much the data values deviate from the mean
• “Spread from the mean”
• Never negative
• Increases dramatically with outliers
• Units for s are the same as the units of the original data

$$
s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}
  = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}}
$$
Example
• Calculate range and s for the number of chocolate chips:
• 22, 22, 26, 24, 23

To calculate standard deviation, begin with the summations.
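As a rough sketch of that calculation, the following Python computes the range and s for the five chip counts, once from the squared deviations and once from the summations. The variable names are illustrative and do not come from the slides.

```python
import math

chips = [22, 22, 26, 24, 23]  # chip counts from the example
n = len(chips)

# Range: maximum minus minimum
data_range = max(chips) - min(chips)

# Summations used by the shortcut form of the formula
sum_x = sum(chips)
sum_x2 = sum(x * x for x in chips)

# Definitional form: squared deviations from the mean, divided by n - 1
mean = sum_x / n
s_definition = math.sqrt(sum((x - mean) ** 2 for x in chips) / (n - 1))

# Shortcut form: needs only the two summations
s_shortcut = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(data_range, sum_x, sum_x2)                     # 4 117 2749
print(round(s_definition, 2), round(s_shortcut, 2))  # 1.67 1.67
```

Both forms give the same s (about 1.67 chips), so either can be used to check the other.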


Usual versus Unusual Values
• Rule of thumb: about 95% of values lie within 2 standard deviations of the mean.
• “Usual” values are those that are typical and not too extreme.
• Minimum “usual” value = mean − 2 × standard deviation
• Maximum “usual” value = mean + 2 × standard deviation
Example: Chocolate Chips Continued
• Suppose 40 chocolate chip cookies are randomly sampled, with a mean of 24.0 chips and a standard deviation of 2.6 chips.

• Is a cookie with 30 chocolate chips unusual?
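One way to check this is to compute the two "usual" limits from the rule of thumb and compare the cookie against them. The sketch below assumes only the mean and standard deviation given in the example. The function name `usual_limits` is made up for illustration.

```python
def usual_limits(mean, sd):
    """Return (minimum usual, maximum usual) under the 2-standard-deviation rule of thumb."""
    return mean - 2 * sd, mean + 2 * sd

low, high = usual_limits(24.0, 2.6)

chips = 30
is_unusual = chips < low or chips > high

print(round(low, 1), round(high, 1), is_unusual)  # 18.8 29.2 True
```

Since 30 is above the maximum usual value of 29.2, the cookie counts as unusual under this rule.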


Variance
• Measure of variation equal to the square of the standard deviation
• Sample variance = s²
• Population variance = σ²
Empirical Rule

• For datasets that have a distribution that is approximately bell-shaped, the following properties apply:
• About 68% of the values fall within 1
standard deviation of the mean.
• About 95% of the values fall within 2
standard deviations of the mean.
• About 99.7% of the values fall within
3 standard deviations of the mean.
Example: Chocolate Chips

• Suppose 40 chocolate chip cookies are randomly sampled and the chip counts are normally distributed with mean 24.0 chips and standard deviation 2.6 chips.
• Draw the curve with the mean at the center and the standard deviations as in the previous
slide.
• What percent of the cookies do you expect to have less than 16.2 chips?
• How many chocolate chips do the highest 16% of cookies have?
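The two questions can be worked through with the empirical rule alone. The sketch below is an approximation that uses only the 68%, 95%, and 99.7% figures, assuming the distribution is symmetric so that the leftover percentage splits evenly between the two tails. The helper name `tail_percent` is hypothetical.

```python
mean, sd = 24.0, 2.6

# Percent of values within 1, 2, and 3 standard deviations (empirical rule)
WITHIN = {1: 68.0, 2: 95.0, 3: 99.7}

def tail_percent(k):
    """Approximate percent of values more than k standard deviations below (or above) the mean."""
    return (100.0 - WITHIN[k]) / 2

# 16.2 chips sits a whole number of standard deviations below the mean
k_low = round((mean - 16.2) / sd)
percent_below = tail_percent(k_low)

# The top 16% is the upper tail beyond 1 standard deviation
cutoff_top_16 = mean + 1 * sd

print(k_low, round(percent_below, 2))            # 3 0.15
print(tail_percent(1), round(cutoff_top_16, 1))  # 16.0 26.6
```

Under these assumptions, about 0.15% of cookies have fewer than 16.2 chips, and the highest 16% of cookies have more than about 26.6 chips.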
Z Score
• The number of standard deviations that a given value is above or below the mean.
• Positive z is above the mean.
• Negative z is below the mean.
• So instead of only −3, −2, −1, 0, 1, 2, 3, a value can lie an exact number of standard deviations from the mean, such as 1.27 standard deviations above it.

Sample: $z = \dfrac{x - \bar{x}}{s}$
Population: $z = \dfrac{x - \mu}{\sigma}$
Interpreting z-scores
Example
• The author of the text measured his pulse rate to be 48 beats per
minute.
• Is that pulse rate unusual if the mean adult male pulse rate is 67.3 beats per
minute with a standard deviation of 10.3?
• Step 1. Calculate z-score
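A minimal sketch of this step follows. It assumes that "unusual" means a z-score beyond ±2, in line with the 2-standard-deviation rule of thumb used earlier in these slides. The function name `z_score` is illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z = z_score(48, 67.3, 10.3)

# Assumption: values with z below -2 or above +2 are treated as unusual
is_unusual = abs(z) > 2

print(round(z, 2), is_unusual)  # -1.87 False
```

With z of about −1.87, the pulse rate is low but still inside the ±2 band, so it is not unusual by this criterion.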
Percentiles
• Measures of location.
• 99 percentiles, denoted P1, P2, P3, …, P99, which divide a set of data into 100 groups with 1% of the values in each group.

$$
\text{Percentile of value } x = \frac{\text{number of values less than } x}{\text{total number of values}} \cdot 100
$$
Example
For the 40 Chips Ahoy cookies, find the percentile for a cookie with 23 chips.
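The 40-cookie dataset is not reproduced in these slides, so the sketch below applies the percentile formula to the five chip counts from the earlier example instead. The function name `percentile_of_value` is hypothetical, and the result applies only to that small stand-in sample.

```python
def percentile_of_value(data, x):
    """Percentile of x: (number of values less than x) / (total number of values) * 100."""
    below = sum(1 for v in data if v < x)
    return below * 100 / len(data)

# Stand-in data: the five chip counts from the earlier example, not the 40 cookies
sample = [22, 22, 26, 24, 23]

print(percentile_of_value(sample, 23))  # 40.0
```

Two of the five values are less than 23, so 23 chips sits at the 40th percentile of this small sample.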
Convert from Percentile to Data Value
• If given a percentile, find the location $L = \dfrac{k}{100} \cdot n$
where k = percentile
n = sample size
L = location
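The location formula can be sketched as a one-line function. The example inputs k = 60 and n = 40 are chosen for illustration. The sketch only computes L, because reading off the data value at that location requires the sorted dataset, which is not reproduced here.

```python
def location(k, n):
    """Location L = (k / 100) * n for the k-th percentile in a sample of size n."""
    return k * n / 100

print(location(60, 40))  # 24.0
```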
Example
For the 40 Chips Ahoy cookies, find the 60th percentile.

Answer:
Quartiles
• Special percentiles, denoted Q1, Q2, and Q3.
• Divide a set of data into four groups with 25% of the values in each group.
• Q1 = First quartile, 25% of values below
• Q2 = Second quartile = Median, 50% of values below
• Q3 = Third quartile, 75% of values below

[Figure: sorted data from minimum to maximum, split by Q1, Q2 (median), and Q3 into four groups of 25% each]
Other Statistics
• Interquartile Range (IQR) = Q3 − Q1
• 5-Number Summary:
• Minimum
• First Quartile, Q1
• Median
• Third Quartile, Q3
• Maximum
Example
For the 40 Chips Ahoy cookies, find the five number summary.
Boxplot

• Graphs the 5-Number Summary
• Whiskers extend from the minimum value to the maximum value, with a box from the first to the third quartile and a line at the median.
Boxplots

[Figure: boxplots of a normal distribution and a skewed distribution]
Outlier

• A value that lies very far away from the vast majority of the other values in the dataset.
• Can have a dramatic effect on the mean and standard deviation (non-resistant).
• Can obscure the true nature of a distribution and introduce skewness.
Modified Boxplots

• Indicate outliers beyond the whiskers
• Find the outliers using the following rule:
• Any point above Q3 by more than 1.5 × IQR
• Or any point below Q1 by more than 1.5 × IQR
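The 1.5 × IQR rule can be sketched as a pair of cutoffs, often called fences. The quartiles and data values below are hypothetical numbers chosen for illustration, not the Chips Ahoy data, and the function names are made up for this example.

```python
def outlier_fences(q1, q3):
    """Return (lower, upper) cutoffs: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data, q1, q3):
    """Return the values that fall below the lower cutoff or above the upper cutoff."""
    low, high = outlier_fences(q1, q3)
    return [v for v in data if v < low or v > high]

# Hypothetical quartiles and values, for illustration only
q1, q3 = 22, 26
values = [15, 21, 22, 24, 26, 27, 33]

print(outlier_fences(q1, q3))         # (16.0, 32.0)
print(find_outliers(values, q1, q3))  # [15, 33]
```

In a modified boxplot, the points returned by `find_outliers` would be drawn individually rather than covered by the whiskers.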
Example
For the 40 Chips Ahoy cookies, are there any outliers?
Putting It All Together
• To this point, we have discussed:
• Context of the data
• Source of the data
• Sampling method
• Measures of Center and Variation
• Distribution and Outliers
