Numerical Measures
Numerical Description
• Three key characteristics of numerical data:
Characteristic      Interpretation
Central Tendency    Where are the data values concentrated? What seem to be typical or middle data values?
Dispersion          How much variation is there in the data? How spread out are the data values? Are there unusual values?
Shape               Are the data values distributed symmetrically? Skewed? Sharply peaked? Flat? Bimodal?
Chapter Topics
Measures of Center and Location: mean, median, mode, geometric mean, midrange
Other Measures of Location: weighted mean, percentiles, quartiles
Measures of Variation: range, interquartile range, variance and standard deviation, coefficient of variation
A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. Chap 3-6
Measures of Center and Location
Overview
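Center and Location: formulas for the mean and the weighted mean

Sample mean:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Population mean:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
Sample weighted mean:
$$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i}$$
Population weighted mean:
$$\mu_W = \frac{\sum w_i x_i}{\sum w_i}$$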
Mean (Arithmetic Average)
The Mean is the arithmetic average of data
values
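Sample mean (n = sample size):
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
Population mean (N = population size):
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$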
Mean (Arithmetic Average)
(continued)
[Figure: two dot plots on a 0 to 10 axis]
Data 1, 2, 3, 4, 5: Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
Data 1, 2, 3, 4, 10: Mean = (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4
Median
[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]
In an ordered array, the median is the “middle”
number
If n or N is odd, the median is the middle number
If n or N is even, the median is the average of the
two middle numbers
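As a rough illustration of how the two measures react to an extreme value, the sketch below recomputes the mean and median for the two five-value data sets from the mean example (1, 2, 3, 4, 5 and 1, 2, 3, 4, 10) using Python's standard `statistics` module. The helper name `center` and the even-sized list are illustrative assumptions, not part of the slides.

```python
from statistics import mean, median

def center(values):
    """Return (mean, median) for a list of numbers."""
    return mean(values), median(values)

data_a = [1, 2, 3, 4, 5]
data_b = [1, 2, 3, 4, 10]   # same data with the largest value replaced by 10

mean_a, median_a = center(data_a)
mean_b, median_b = center(data_b)
print(mean_a, median_a)   # the extreme value in data_b pulls the mean up ...
print(mean_b, median_b)   # ... while the median stays at the middle value

# Hypothetical even-sized sample: the median is the average
# of the two middle values (2 and 3).
even_sample = [1, 2, 3, 4]
even_median = median(even_sample)
print(even_median)
```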
Mode
A measure of central tendency
Value that occurs most often
Not affected by extreme values
Used for either numerical or categorical data
There may be no mode
There may be several modes
[Figure: two dot plots; the first (0 to 14 axis) has Mode = 5, the second (0 to 6 axis) has no mode]
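The three cases listed above (one mode, several modes, no mode) can be sketched with a frequency count. This is a minimal sketch under one assumed convention: when every value occurs exactly once, the function reports no mode by returning an empty list. The function name `modes` and all sample lists are hypothetical.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:          # assumed convention: nothing repeats, so no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)

one_mode = modes([0, 1, 2, 5, 5, 5, 9])         # a single most frequent value
no_mode = modes([0, 1, 2, 3, 4, 5, 6])          # every value occurs once
two_modes = modes([1, 1, 2, 2, 3])              # a tie gives several modes
categorical = modes(["red", "blue", "red"])     # works for categories too
print(one_mode, no_mode, two_modes, categorical)
```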
Weighted Mean
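Example: Sample of 26 Repair Projects

Days to Complete    Frequency
5                   4
6                   12
7                   8
8                   2

Weighted Mean Days to Complete:
$$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31 \text{ days}$$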
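The repair-project calculation can be reproduced in a few lines. The sketch below treats the frequencies as weights, as the example does; the function name `weighted_mean` is an illustrative choice.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

days = [5, 6, 7, 8]          # days to complete
frequency = [4, 12, 8, 2]    # number of projects, used as weights

numerator = sum(w * x for x, w in zip(days, frequency))
result = weighted_mean(days, frequency)
print(numerator, sum(frequency), round(result, 2))
```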
Left-skewed: Mean < Median < Mode (longer tail extends to left)
Symmetric: Mean = Median = Mode
Right-skewed: Mode < Median < Mean (longer tail extends to right)
Central Tendency
Growth Rates
• A variation on the geometric mean used to find the average growth rate for a time series of n observations:
$$G = \sqrt[n-1]{\frac{x_n}{x_1}} - 1$$
• For example, from 1998 to 2002, Spirit Airlines revenues are:

Year    Revenue (mil)
1998    131
1999    227
2000    311
2001    354
2002    403

Central Tendency
Growth Rates
• The average growth rate is given by taking the geometric mean of the ratios of each year’s revenue to the preceding year.
• Five years of data give four year-to-year ratios. Due to cancellations, only the first and last years are relevant:
$$G = \sqrt[4]{\frac{227}{131} \cdot \frac{311}{227} \cdot \frac{354}{311} \cdot \frac{403}{354}} - 1 = \sqrt[4]{\frac{403}{131}} - 1$$
= 1.3244 - 1 = .324 or 32.4% per year
• In Excel use =(403/131)^(1/4)-1
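The growth-rate calculation can be checked numerically. The sketch below assumes that n observations span n - 1 growth periods, so five annual revenue figures give four year-to-year ratios and a fourth root. It also confirms that the geometric mean of the ratios collapses to the last-over-first shortcut. The function name `average_growth_rate` is hypothetical.

```python
import math

def average_growth_rate(series):
    """Average per-period growth rate from the first and last observations."""
    periods = len(series) - 1          # n observations span n - 1 periods
    if periods < 1:
        raise ValueError("need at least two observations")
    return (series[-1] / series[0]) ** (1 / periods) - 1

revenue = [131, 227, 311, 354, 403]    # 1998 to 2002, in millions
growth = average_growth_rate(revenue)

# Cross-check: geometric mean of the year-to-year ratios.
ratios = [later / earlier for earlier, later in zip(revenue, revenue[1:])]
geometric_mean = math.prod(ratios) ** (1 / len(ratios))

print(f"{growth:.1%}")
```

Compounding the first value at this rate for four periods reproduces the last value, which is a quick sanity check on any growth-rate figure.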
Central Tendency
Midrange
• The midrange is the point halfway between the
lowest and highest values of X.
• Easy to use but sensitive to extreme data values.
$$\text{Midrange} = \frac{x_{\min} + x_{\max}}{2}$$
• For the J. D. Power quality data (n=37):
$$\text{Midrange} = \frac{x_{\min} + x_{\max}}{2} = \frac{x_1 + x_{37}}{2} = \frac{87 + 173}{2} = 130$$
• Here, the midrange (130) is higher than the mean
(125.38) or median (121).
Central Tendency
Trimmed Mean
• To calculate the trimmed mean, first remove the
highest and lowest k percent of the observations.
• For example, for the n = 68 P/E ratios, we want a
5 percent trimmed mean (i.e., k = .05).
• To determine how many observations to trim, multiply k × n = 0.05 × 68 = 3.4, which is rounded down to 3 observations.
• So, we would remove the three smallest and three
largest observations before averaging the
remaining values.
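The trimming steps above can be sketched as follows. The P/E ratios themselves are not reproduced in these slides, so the 68 values below are made up purely for illustration (67 consecutive integers plus one extreme value); the function names are also hypothetical. The count to trim from each end is k × n rounded down, matching the 3.4 to 3 step in the example.

```python
import math
from statistics import mean

def trim_count(n, k):
    """Number of observations to drop from EACH end: k * n, rounded down."""
    return math.floor(k * n)

def trimmed_mean(values, k):
    """Mean after removing the lowest and highest k fraction of the data."""
    if not 0 <= k < 0.5:
        raise ValueError("k must be in [0, 0.5)")
    ordered = sorted(values)
    t = trim_count(len(ordered), k)
    return mean(ordered[t:len(ordered) - t])

# Hypothetical data: 68 values, one of them an extreme outlier.
pe_ratios = list(range(10, 77)) + [500]

dropped_each_end = trim_count(len(pe_ratios), 0.05)
plain = mean(pe_ratios)
trimmed = trimmed_mean(pe_ratios, 0.05)
print(dropped_each_end, round(plain, 2), trimmed)
```

With these made-up values the outlier inflates the ordinary mean, while the trimmed mean is computed from the 62 central observations only.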
Other Location Measures
Percentiles and quartiles (Q1, Q2, Q3)
Example: Find the first quartile
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22
(n = 9)
Q1 = 25th percentile, so find the $\frac{25}{100}(9 + 1) = 2.5$ position,
so use the value halfway between the 2nd and 3rd values,
so Q1 = 12.5
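The position rule used in this example, (p/100)(n + 1), can be sketched in code. The slide only shows the halfway case, so treating other fractional positions by linear interpolation between the two neighbouring values is an assumption of this sketch, as is the function name `percentile_by_position`.

```python
def percentile_by_position(values, p):
    """Value at 1-based position (p/100) * (n + 1) in the ordered data.

    Fractional positions are resolved by linear interpolation
    (an assumption; the example only shows the halfway case).
    """
    ordered = sorted(values)
    n = len(ordered)
    position = p / 100 * (n + 1)
    position = min(max(position, 1), n)    # keep the position inside the data
    lower = int(position)                  # 1-based index of the lower neighbour
    fraction = position - lower
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]   # ordered array from the example
q1 = percentile_by_position(data, 25)
q2 = percentile_by_position(data, 50)
q3 = percentile_by_position(data, 75)
print(q1, q2, q3)
```

Other textbooks and software packages use slightly different position rules, so quartiles from a spreadsheet or library may not match these values exactly.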
Box and Whisker Plot
A graphical display of data using the 5-number summary:
Minimum -- Q1 -- Median -- Q3 -- Maximum
Example:
[Figure: box-and-whisker plot with Minimum = 0, Q1 = 2, Median = 3, Q3 = 5, Maximum = 27]
These data are strongly right-skewed, as the plot shows.
[Figure: overview of measures of variation, including sample variance and sample standard deviation; distributions with the same center but different variation]
Example:
[Figure: dot plot on a 0 to 14 axis]
Range = 14 - 1 = 13
Disadvantages of the Range
Ignores the way in which data are distributed
[Figure: two dot plots on a 7 to 12 axis with differently distributed values; Range = 12 - 7 = 5 in both]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
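The outlier example can be reproduced directly. The sketch below builds the two 25-value lists shown above, which differ only in the last value, and compares their ranges. It also prints the median of each list as a contrast; the median comparison is an addition of this sketch, not something the slide states.

```python
from statistics import median

def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

# First 24 values are common to both lists.
common = [1] * 11 + [2] * 8 + [3] * 4 + [4]
without_outlier = common + [5]
with_outlier = common + [120]

range_a = data_range(without_outlier)
range_b = data_range(with_outlier)
print(range_a, range_b)
print(median(without_outlier), median(with_outlier))
```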
Interquartile Range
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
Interquartile range = Q3 – Q1 = 57 – 30 = 27
Sample variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
Standard Deviation
Most commonly used measure of variation
Shows variation about the mean
Has the same units as the original data
Sample standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
Calculation Example:
Sample Standard Deviation
Sample Data (Xi): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16
$$s = \sqrt{\frac{(10 - \bar{x})^2 + (12 - \bar{x})^2 + (14 - \bar{x})^2 + \cdots + (24 - \bar{x})^2}{n - 1}} = \sqrt{\frac{130}{7}} = 4.3095$$
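The sample standard deviation for the eight data values in this example can be recomputed directly from the definition, so that the sum of squared deviations and the final value of s can be checked by running the code. The sketch below also compares the hand-rolled result with `statistics.stdev`, which uses the same n - 1 divisor. The function name `sample_std` is hypothetical.

```python
import math
from statistics import stdev

def sample_std(values):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

sample = [10, 12, 14, 15, 17, 18, 18, 24]
x_bar = sum(sample) / len(sample)
sum_sq = sum((x - x_bar) ** 2 for x in sample)
s = sample_std(sample)

print(x_bar, sum_sq, round(s, 4))
assert math.isclose(s, stdev(sample))   # agrees with the standard library
```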
Comparing Standard Deviations
[Figure: three dot plots on an 11 to 21 axis]

Data set    Mean    s
Data A      15.5    3.338
Data B      15.5    0.9258
Data C      15.5    4.57
Population:
$$CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$$
Sample:
$$CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$$
Comparing Coefficient
of Variation
Stock A:
Average price last year = $50
Standard deviation = $5
$$CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

Stock B:
Average price last year = $100
Standard deviation = $5
$$CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$

Both stocks have the same standard deviation, but stock B is less variable relative to its price.
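The stock comparison reduces to one division per stock. A minimal sketch, with a hypothetical function name and a guard for a zero mean (where the ratio is undefined):

```python
def coefficient_of_variation(std_dev, mean_value):
    """Standard deviation as a percentage of the mean."""
    if mean_value == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * std_dev / mean_value

cv_a = coefficient_of_variation(5, 50)     # Stock A: s = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)    # Stock B: s = $5, mean = $100
print(cv_a, cv_b)
```

Because the units cancel, the result can be compared across series measured on different scales.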
The Empirical Rule
If the data distribution is bell-shaped, then the interval:
μ ± 1σ contains about 68% of the values in the population or the sample
μ ± 2σ contains about 95% of the values in the population or the sample
μ ± 3σ contains about 99.7% of the values in the population or the sample
Examples:
At least                 within
(1 - 1/1²) = 0%          k = 1 (μ ± 1σ)
(1 - 1/2²) = 75%         k = 2 (μ ± 2σ)
(1 - 1/3²) = 89%         k = 3 (μ ± 3σ)
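The bounds in this table have the form 1 - 1/k², which is Chebyshev's inequality: they hold for any distribution, not only bell-shaped ones. The sketch below computes the bound for k = 1, 2, 3 and checks it against a deliberately skewed list of values chosen for illustration. The function names are hypothetical, and the population standard deviation (`pstdev`) is used for σ.

```python
from statistics import fmean, pstdev

def chebyshev_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    return max(0.0, 1 - 1 / k ** 2)

def fraction_within(values, k):
    """Observed fraction of values inside mean ± k standard deviations."""
    mu = fmean(values)
    sigma = pstdev(values)
    inside = sum(1 for x in values if abs(x - mu) <= k * sigma)
    return inside / len(values)

# Illustrative, strongly right-skewed data with one extreme value.
skewed = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

for k in (1, 2, 3):
    bound = chebyshev_bound(k)
    observed = fraction_within(skewed, k)
    print(k, round(bound, 4), observed)
    assert observed >= bound    # the guarantee holds even for skewed data
```

For bell-shaped data the observed fractions are typically much higher than these minimums, which is why the empirical rule quotes 68%, 95%, and 99.7% instead.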
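For a population:
$$z = \frac{x - \mu}{\sigma}$$
where:
x = original data value
μ = population mean
σ = population standard deviation
z = standard score

For a sample:
$$z = \frac{x - \bar{x}}{s}$$
where:
x = original data value
$\bar{x}$ = sample mean
s = sample standard deviation
z = standard score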
Microsoft Excel
descriptive statistics output,
using the house price data:
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
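For comparison with a spreadsheet summary, the five house prices listed above can be run through Python's `statistics` module. This is a sketch of the same descriptive measures, not a reproduction of the Excel output; the dictionary keys are illustrative names. It also applies the sample z-score, z = (x - x̄)/s, to the most expensive house.

```python
from statistics import mean, median, mode, stdev

house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

summary = {
    "mean": mean(house_prices),
    "median": median(house_prices),
    "mode": mode(house_prices),
    "sample_std": stdev(house_prices),              # n - 1 divisor
    "range": max(house_prices) - min(house_prices),
}

# Standard score of the most expensive house relative to the sample.
z_top = (house_prices[0] - summary["mean"]) / summary["sample_std"]

for name, value in summary.items():
    print(f"{name}: {value:,.0f}")
print(f"z-score of $2,000,000: {z_top:.2f}")
```

The single $2,000,000 value pulls the mean well above the median, the same sensitivity to extreme values discussed for the mean earlier.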