Chapter 3
Numerical Descriptive Measures

In this chapter, you learn to:
Describe the properties of central tendency, variation, and shape in numerical variables.
Construct and interpret a boxplot.
Compute descriptive summary measures for a population.
Calculate the covariance and the coefficient of correlation.
ALWAYS LEARNING Copyright © 2020 Pearson Education Ltd. All Rights Reserved.
Mean examples (number-line figures omitted):
(11 + 12 + 13 + 14 + 15) / 5 = 65 / 5 = 13
(11 + 12 + 13 + 14 + 20) / 5 = 70 / 5 = 14

The median is less sensitive than the mean to extreme values.
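The two data sets from the mean figures make the sensitivity claim easy to check. The sketch below is only an illustration using Python's standard `statistics` module; the variable names are made up for this example.

```python
from statistics import mean, median

# Values read off the slide figures: the second set replaces 15 with 20.
original = [11, 12, 13, 14, 15]
with_extreme = [11, 12, 13, 14, 20]

for label, values in (("original", original), ("with extreme value", with_extreme)):
    # The mean shifts when one value becomes extreme; the median does not.
    print(f"{label}: sum={sum(values)}, mean={mean(values)}, median={median(values)}")
```

The mean moves from 13 to 14 while the median stays at 13.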
Note that (n+1)/2 is not the value of the median, only the position of the median in the ranked data.

Mode examples (number-line figures omitted): Mode = 9; No Mode
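A minimal sketch of the position-versus-value distinction, and of the "no mode" case. The helper names and sample data are hypothetical, and averaging the two middle values for an even n is the usual convention rather than something stated on these slides.

```python
from collections import Counter

def median_position(n):
    """1-based position of the median in the ranked data: (n + 1) / 2."""
    return (n + 1) / 2

def median_value(values):
    ranked = sorted(values)
    pos = median_position(len(ranked))
    if pos.is_integer():
        return ranked[int(pos) - 1]          # odd n: a single middle value
    # Even n (assumed convention): average the two values around the position.
    return (ranked[int(pos) - 1] + ranked[int(pos)]) / 2

def modes(values):
    """Most frequent value(s); an empty list means every value occurs once (no mode)."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(median_position(5), median_value([20, 11, 13, 12, 14]))
print(modes([9, 3, 9, 5]), modes([1, 2, 3, 4]))
```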
Measures of Central Tendency: Review Example
DCOVA
House Prices:
$2,000,000
$  500,000
$  300,000
$  100,000
$  100,000
Sum $3,000,000
Mean: ($3,000,000/5) = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent value = $100,000

Measures of Central Tendency: Which Measure to Choose?
DCOVA
The mean is generally used, unless extreme values (outliers) exist.
The median is often used, since the median is not sensitive to extreme values. For example, median home prices may be reported for a region; it is less sensitive to outliers.
In many situations it makes sense to report both the mean and the median.
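The house-price review example can be reproduced in a few lines. This is a sketch using the standard `statistics` module; the dictionary layout is just one convenient way to report the three measures side by side.

```python
from statistics import mean, median, mode

# The five house prices from the review example.
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

summary = {
    "mean": mean(house_prices),      # pulled upward by the $2,000,000 house
    "median": median(house_prices),  # middle value of the ranked data
    "mode": mode(house_prices),      # most frequent value
}
print(summary)
```

Reporting the mean and the median together shows how far the single expensive house pulls the mean above the typical price.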
Measures of Variation: Summary Characteristics
DCOVA
The more the data are spread out, the greater the range, variance, and standard deviation.
The more the data are concentrated, the smaller the range, variance, and standard deviation.
If the values are all the same (no variation), all these measures will be zero.
None of these measures are ever negative.

Measures of Variation: The Coefficient of Variation
DCOVA
Measures relative variation.
Always in percentage (%).
Shows variation relative to mean.
Can be used to compare the variability of two or more sets of data measured in different units.
\[ CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \]
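The summary characteristics of the range, variance, and standard deviation listed above can be checked numerically. The three data sets below are hypothetical, chosen only to show a concentrated set, a spread-out set with the same mean, and a constant set.

```python
from statistics import stdev, variance

def variation_summary(values):
    """Return (range, sample variance, sample standard deviation)."""
    return max(values) - min(values), variance(values), stdev(values)

concentrated = [11, 12, 13, 14, 15]   # hypothetical data, mean 13
spread_out = [1, 7, 13, 19, 25]       # hypothetical data, same mean, wider spread
constant = [5, 5, 5, 5]               # no variation at all

for name, data in (("concentrated", concentrated),
                   ("spread out", spread_out),
                   ("constant", constant)):
    print(name, variation_summary(data))
```

Every measure is larger for the spread-out set, all three are zero for the constant set, and none is negative.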
Stock A (standard deviation = $5, mean price = $50):
\[ CV_A = \left( \frac{S}{\bar{X}} \right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\% \]

Stock B:
Mean price last year = $100.
Standard deviation = $5.
\[ CV_B = \left( \frac{S}{\bar{X}} \right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\% \]
Both stocks have the same standard deviation, but stock B is less variable relative to its mean price.

Stock C:
Mean price last year = $8.
Standard deviation = $2.
\[ CV_C = \left( \frac{S}{\bar{X}} \right) \cdot 100\% = \frac{\$2}{\$8} \cdot 100\% = 25\% \]
Stock C has a much smaller standard deviation but a much higher coefficient of variation.
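The three stock comparisons can be reproduced with a small helper. This is a sketch: the function name and the dictionary layout are made up for the example, and Stock A's mean and standard deviation are taken from the figures in its CV calculation.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (S / X-bar) * 100, expressed in percent."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * std_dev / mean

# name -> (mean price last year, standard deviation), in dollars
stocks = {"A": (50, 5), "B": (100, 5), "C": (8, 2)}

cv = {name: coefficient_of_variation(s, m) for name, (m, s) in stocks.items()}
for name, value in cv.items():
    print(f"Stock {name}: CV = {value:.0f}%")
```

Because the CV is unit-free, it ranks the stocks by relative variability (C, then A, then B) even though C has the smallest standard deviation in dollars.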
Locating Extreme Outliers: Z-Score
DCOVA
To compute the Z-score of a data value, subtract the mean and divide by the standard deviation.
The Z-score is the number of standard deviations a data value is from the mean.
A data value is considered an extreme outlier if its Z-score is less than -3.0 or greater than +3.0.

Locating Extreme Outliers: Z-Score
DCOVA
\[ Z = \frac{X - \bar{X}}{S} \]
where X represents the data value
\( \bar{X} \) is the sample mean
S is the sample standard deviation
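A minimal sketch of the Z-score rule above. The function names and the data set are hypothetical; the ±3.0 cutoff is the one given on the slide.

```python
from statistics import mean, stdev

def z_scores(values):
    """Z = (X - X-bar) / S for every value, using the sample standard deviation."""
    x_bar = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("Z-scores are undefined when all values are equal")
    return [(x - x_bar) / s for x in values]

def extreme_outliers(values, threshold=3.0):
    """Values whose Z-score is below -threshold or above +threshold."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]

# Hypothetical data: nineteen zeros and a single value of 100.
data = [0] * 19 + [100]
print(extreme_outliers(data))
```

With a sample standard deviation, the largest possible Z-score in a sample of n values is (n - 1)/√n, so a sample needs at least 11 values before any of them can exceed 3.0.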
Quartile Measures: Locating Quartiles
DCOVA
Find a quartile by determining the value in the appropriate position in the ranked data, where:
First quartile position: Q1 = (n+1)/4 ranked value.

Quartile Measures: Calculation Rules
DCOVA
When calculating the ranked position use the following rules:
If the result is a whole number then it is the ranked position to use.
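The position rule can be sketched as follows. Only the Q1 position and the whole-number rule appear in the text above; the positions q(n+1)/4 for Q2 and Q3 and the handling of fractional positions (average the two neighbours at a half, otherwise round to the nearest position) are assumptions based on the common textbook convention. All function names are hypothetical.

```python
def quartile_position(n, q):
    """1-based ranked position of quartile q (1, 2 or 3): q * (n + 1) / 4."""
    return q * (n + 1) / 4

def value_at_position(ranked, position):
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        # Rule from the slide: a whole-number result is the ranked position to use.
        return ranked[whole - 1]
    if fraction == 0.5:
        # Assumption: average the two neighbouring ranked values.
        return (ranked[whole - 1] + ranked[whole]) / 2
    # Assumption: otherwise round to the nearest whole position.
    return ranked[round(position) - 1]

def quartiles(values):
    ranked = sorted(values)
    n = len(ranked)
    return tuple(value_at_position(ranked, quartile_position(n, q)) for q in (1, 2, 3))

# The eleven values used in the deck's boxplot example: positions 3, 6 and 9.
print(quartiles([0, 2, 2, 2, 3, 3, 4, 5, 5, 9, 27]))
```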
Interquartile range example (boxplot figure omitted; values marked on the axis: 12, 30, 45, 57, 70):
Interquartile range = 57 – 30 = 27

Median (Q2).
Third Quartile (Q3).
Xlargest.
Five Number Summary: Shape of Boxplots
DCOVA
If data are symmetric around the median then the box and central line are centered between the endpoints.

Distribution Shape and The Boxplot
DCOVA
Boxplot panels (figures omitted): Left-Skewed, Symmetric, Right-Skewed
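A rough sketch of how a five-number summary hints at shape. It uses `statistics.quantiles` with `method="exclusive"`, which interpolates between ranked values and so can differ from hand-calculation rules when a position is fractional. The `shape_hint` comparison of the two median-to-endpoint distances is a simplifying assumption, not a formal test. The first data set is the one listed in the deck's boxplot example.

```python
from statistics import quantiles

def five_number_summary(values):
    """(Xsmallest, Q1, median, Q3, Xlargest)."""
    ranked = sorted(values)
    q1, q2, q3 = quantiles(ranked, n=4, method="exclusive")
    return ranked[0], q1, q2, q3, ranked[-1]

def shape_hint(values):
    """Compare the distance from the median to each endpoint (rough heuristic)."""
    smallest, _, med, _, largest = five_number_summary(values)
    left_tail, right_tail = med - smallest, largest - med
    if left_tail == right_tail:
        return "symmetric"
    return "right-skewed" if right_tail > left_tail else "left-skewed"

data = [0, 2, 2, 2, 3, 3, 4, 5, 5, 9, 27]
summary = five_number_summary(data)
iqr = summary[3] - summary[1]   # interquartile range = Q3 - Q1
print(summary, iqr, shape_hint(data))
```

For this data the median sits 3 units from the smallest value and 24 units from the largest, so the long whisker is on the right.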
Boxplot Example
DCOVA
Below is a Boxplot for the following data (figure omitted):
Xsmallest Q1 Q2 / Median Q3 Xlargest
0 2 2 2 3 3 4 5 5 9 27

Numerical Descriptive Measures for a Population
DCOVA
Descriptive statistics discussed previously described a sample, not the population.
Summary measures describing a population, called parameters, are denoted with Greek letters.
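To make the sample-versus-population distinction concrete, the sketch below computes both versions for one small hypothetical data set. It relies on the standard definitions, which are not spelled out on this slide: the population variance σ² divides by N, while the sample variance S² divides by n - 1.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

values = [11, 12, 13, 14, 15]   # hypothetical data

# Treating the values as an entire population (parameters: mu, sigma^2, sigma).
mu = mean(values)
sigma_sq = pvariance(values)    # sum of squared deviations / N
sigma = pstdev(values)

# Treating the same values as a sample (statistics: X-bar, S^2, S).
s_sq = variance(values)         # sum of squared deviations / (n - 1)
s = stdev(values)

print(f"mu={mu}, sigma^2={sigma_sq}, sigma={sigma:.4f}")
print(f"X-bar={mu}, S^2={s_sq}, S={s:.4f}")
```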
The Empirical Rule
DCOVA
The empirical rule approximates the variation of data in a symmetric mound-shaped distribution.
Approximately 68% of the data in a symmetric mound-shaped distribution is within 1 standard deviation of the mean, or μ ± 1σ.
(figure omitted: 68% of the data within μ ± 1σ)

The Empirical Rule
DCOVA
Approximately 95% of the data in a symmetric mound-shaped distribution lies within two standard deviations of the mean, or μ ± 2σ.
Approximately 99.7% of the data in a symmetric mound-shaped distribution lies within three standard deviations of the mean, or μ ± 3σ.
(figures omitted: 95% of the data within μ ± 2σ; 99.7% within μ ± 3σ)
Approximately 99.7% of all test takers scored between 230 and 770, (500 ± 270).

\[ \left(1 - \frac{1}{2^2}\right) \times 100\% = 75\% \qquad k = 2 \; (\mu \pm 2\sigma) \]
\[ \left(1 - \frac{1}{3^2}\right) \times 100\% = 88.89\% \qquad k = 3 \; (\mu \pm 3\sigma) \]
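The interval and the two percentages above can be reproduced with a short sketch. The standard deviation of 90 is an assumption inferred from 500 ± 270 being a three-standard-deviation interval. The bound (1 - 1/k²) × 100% is the form commonly known as Chebyshev's rule, which holds for any distribution when k > 1. Function names are hypothetical.

```python
def interval(mu, sigma, k):
    """Endpoints of mu ± k * sigma."""
    return mu - k * sigma, mu + k * sigma

# Approximate coverage under the empirical rule (symmetric, mound-shaped data only).
EMPIRICAL_PERCENT = {1: 68.0, 2: 95.0, 3: 99.7}

def chebyshev_percent(k):
    """Minimum percentage of values within mu ± k * sigma, for any distribution."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k**2) * 100

mu, sigma = 500, 90   # sigma = 90 is inferred from 270 / 3, an assumption
for k in (1, 2, 3):
    low, high = interval(mu, sigma, k)
    print(f"k={k}: {low} to {high}, empirical rule ~{EMPIRICAL_PERCENT[k]}%")
for k in (2, 3):
    print(f"k={k}: at least {chebyshev_percent(k):.2f}% for any distribution")
```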
We Discuss Two Measures Of The Relationship Between Two Numerical Variables
DCOVA
Scatter plots allow you to visually examine the relationship between two numerical variables and now we will discuss two quantitative measures of such relationships.
The Covariance.
The Coefficient of Correlation.

The Covariance
The covariance measures the strength of the linear relationship between two numerical variables (X & Y).
The sample covariance:
\[ \mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \]
Only concerned with the strength of the relationship.
No causal effect is implied.
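A direct translation of the sample covariance formula above, as a sketch. The function name and the paired data are hypothetical.

```python
from statistics import mean

def sample_covariance(xs, ys):
    """Sum of (X_i - X-bar)(Y_i - Y-bar) over all pairs, divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical paired data: Y rises with X in the first case, falls in the second.
xs = [1, 2, 3, 4, 5]
print(sample_covariance(xs, [2, 4, 6, 8, 10]))
print(sample_covariance(xs, [10, 8, 6, 4, 2]))
```

The sign gives the direction of the linear relationship, but the magnitude depends on the units of X and Y, so covariances from differently scaled data sets are not directly comparable.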
Scatter plot panels (figures omitted; X on the horizontal axis): r = +1, r = +.3, r = 0
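The r values in these panels can be illustrated with a short sketch. It uses the standard definition of the sample coefficient of correlation, r = cov(X, Y) / (S_X · S_Y), which is not written out in this part of the deck; the n - 1 divisors cancel, so the code works with sums of squared deviations directly. The data sets are hypothetical.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample coefficient of correlation r, always between -1 and +1."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / sqrt(sxx * syy)

print(correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))   # perfect positive line
print(correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))  # Y = X^2: no linear relationship
```

The second case shows that r = 0 only rules out a linear relationship: Y is completely determined by X there, yet r is zero.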
Pitfalls in Numerical Descriptive Measures
DCOVA
Data analysis is objective:
Should report the summary measures that best describe and communicate the important aspects of the data set.
Data interpretation is subjective:
Should be done in a fair, neutral and clear manner.

Ethical Considerations
DCOVA
Numerical descriptive measures:
Should document both good and bad results.
Should be presented in a fair, objective and neutral manner.
Should not use inappropriate summary measures to distort facts.
Chapter Summary