AGBS | Bangalore
Summary Measures
Describing Data Numerically
Mode Variance
Coefficient of Variation
Quartiles
Quartiles split the ranked data into 4 segments with
an equal number of values per segment; the cut points are Q1, Q2 and Q3
The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50% are
larger)
Only 25% of the observations are greater than the third
quartile
Quartile Formulas
(n = 9)
Q1 is in the 9/4 = 2.25 position of the ranked data,
a quarter of the way from the 2nd to the 3rd ranked value,
so Q1 = 12.25
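The position rule above can be sketched in Python. This is only an illustration: the function name `quartile_by_position` is hypothetical, linear interpolation between neighbouring ranks is an assumption, and the sample array is made up (the slide's own data did not survive), chosen so that the rule reproduces Q1 = 12.25.

```python
def quartile_by_position(data, k):
    """Return the k-th quartile (k = 1, 2, 3) using the position k*n/4.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighbouring ranked values.
    """
    ranked = sorted(data)
    n = len(ranked)
    position = k * n / 4          # 1-based position in the ranked data
    lower = int(position)         # rank just below the position
    fraction = position - lower   # how far to move toward the next rank
    if lower < 1:
        return ranked[0]
    if lower >= n:
        return ranked[-1]
    return ranked[lower - 1] + fraction * (ranked[lower] - ranked[lower - 1])


# Hypothetical ranked sample with n = 9 (not taken from the slides).
sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1 = quartile_by_position(sample, 1)
print(q1)
```

Other textbooks and libraries use different position rules, for example (n + 1)/4, so their quartiles can differ slightly from this one.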
Same center,
different variation
Range
Example (values plotted on an axis from 0 to 14; smallest value 1, largest value 14):
Range = 14 - 1 = 13
Disadvantages of the Range
Ignores the way in which data are distributed
Two data sets plotted on the same 7 to 12 axis can be distributed very differently and still share a range:
Range = 12 - 7 = 5 for both
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
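The outlier effect can be checked with a short Python sketch. The two lists are the ones shown above; the helper name `value_range` is hypothetical.

```python
def value_range(data):
    """Range = largest value minus smallest value."""
    return max(data) - min(data)


# The two data sets from the slide: identical except for the last value.
without_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

range_without = value_range(without_outlier)
range_with = value_range(with_outlier)
print(range_without, range_with)
```

Changing a single observation moves the range from 4 to 119, although the other 24 values are unchanged.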
Coefficient of Range
Example:
Five-number summary (box plot):
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
Each of the four segments holds 25% of the observations
Interquartile range = Q3 - Q1
= 57 - 30 = 27
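As a small sketch, the quartile-based measures for this example (Q1 = 30, Q3 = 57) can be computed in Python. The definitions used here are the usual textbook ones and are stated as assumptions, since the slides do not spell them out: quartile deviation = (Q3 - Q1)/2 and coefficient of quartile deviation = (Q3 - Q1)/(Q3 + Q1).

```python
# Quartiles read from the box plot example.
q1, q3 = 30, 57

interquartile_range = q3 - q1                     # spread of the middle 50%
quartile_deviation = interquartile_range / 2      # assumed definition: half the IQR
coefficient_of_qd = (q3 - q1) / (q3 + q1)         # assumed definition: unit-free ratio

print(interquartile_range, quartile_deviation, round(coefficient_of_qd, 4))
```

Because the coefficient is a ratio, it has no units and can be compared across data sets measured on different scales.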
Coefficient of QD
Class      Frequency
10-20      3
20-30      6
30-40      11
40-50      3
50-60      2
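Assuming the table above is meant as practice data for the coefficient of QD, one possible approach is the grouped-data interpolation rule: locate the class containing the target cumulative frequency and interpolate linearly inside it. The function name `grouped_quantile` and the choice of this rule are assumptions, not something the slides state.

```python
def grouped_quantile(classes, freqs, fraction):
    """Estimate a quantile from a frequency table by linear interpolation.

    classes: list of (lower, upper) class limits
    freqs: frequency of each class
    fraction: 0.25 for Q1, 0.75 for Q3
    """
    target = fraction * sum(freqs)   # cumulative frequency to reach
    cumulative = 0
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cumulative + f >= target:
            # Interpolate within the class that contains the target.
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("fraction must be between 0 and 1")


classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [3, 6, 11, 3, 2]

q1_grouped = grouped_quantile(classes, freqs, 0.25)
q3_grouped = grouped_quantile(classes, freqs, 0.75)
coefficient_qd_grouped = (q3_grouped - q1_grouped) / (q3_grouped + q1_grouped)
print(round(q1_grouped, 2), round(q3_grouped, 2), round(coefficient_qd_grouped, 3))
```

Under these assumptions Q1 falls in the 20-30 class and Q3 in the 30-40 class.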
Variance
Sample variance:
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
Where $\bar{X}$ = mean
n = sample size
$X_i$ = ith value of the variable X
Standard Deviation
Most commonly used measure of variation
Shows variation about the mean
Is the square root of the variance
Has the same units as the original data
Sample standard deviation:
\[ S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]
Calculation Example:
Sample Standard Deviation
Sample data ($X_i$): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{X}$ = 16
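The calculation for this sample can be followed step by step in Python. This is a minimal sketch using the sample formula with n - 1 in the denominator; the variable names are illustrative.

```python
import math
import statistics

# Sample data from the slide.
data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)
x_bar = sum(data) / n                                  # sample mean
squared_deviations = [(x - x_bar) ** 2 for x in data]  # (X_i - mean)^2 for each value
sample_variance = sum(squared_deviations) / (n - 1)    # divide by n - 1, not n
sample_std = math.sqrt(sample_variance)                # standard deviation

print(x_bar, sum(squared_deviations), round(sample_variance, 4), round(sample_std, 4))

# Cross-check against the standard library's sample standard deviation.
print(round(statistics.stdev(data), 4))
```

The squared deviations sum to 130, so the sample variance is 130/7 and S is about 4.31, in the same units as the data.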
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.567
(Each data set is plotted on the same axis, from 11 to 21.)
Measuring variation
\[ CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \]
Comparing Coefficient of Variation
Stock A:
Average price last year = $50
Standard deviation = $5
\[ CV_A = \left( \frac{S}{\bar{X}} \right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\% \]
Stock B:
Average price last year = $100
Standard deviation = $5
\[ CV_B = \left( \frac{S}{\bar{X}} \right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\% \]
Both stocks have the same standard deviation, but stock B is less
variable relative to its price
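The stock comparison can be reproduced with a few lines of Python. The function name `coefficient_of_variation` is illustrative; the inputs are the figures given for the two stocks.

```python
def coefficient_of_variation(std_dev, mean):
    """CV as a percentage: standard deviation relative to the mean."""
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(5, 50)    # Stock A: S = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)   # Stock B: S = $5, mean = $100
print(cv_a, cv_b)
```

The same $5 standard deviation is 10% of stock A's average price but only 5% of stock B's.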
Sample vs. Population CV
Calculate!
\[ Z = \frac{X - \bar{X}}{S} \]
Z Scores
(continued)
Example:
If the mean is 14.0 and the standard deviation is 3.0,
what is the Z score for the value 18.5?
\[ Z = \frac{X - \bar{X}}{S} = \frac{18.5 - 14.0}{3.0} = 1.5 \]
The value 18.5 is 1.5 standard deviations above the
mean
(A negative Z-score would mean that a value is less
than the mean)
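A minimal Python sketch of the Z score, using the example's mean of 14.0 and standard deviation of 3.0. The function name `z_score` and the second value, 9.5, are illustrative additions.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


z_above = z_score(18.5, 14.0, 3.0)   # the value from the example
z_below = z_score(9.5, 14.0, 3.0)    # hypothetical value below the mean
print(z_above, z_below)
```

The second call shows the negative case: 9.5 lies 1.5 standard deviations below the mean.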
Shape of a Distribution
The Empirical Rule
For an approximately bell-shaped distribution:
μ ± 1σ contains about 68% of the values in
the population or the sample
μ ± 2σ contains about 95% of the values in
the population or the sample
μ ± 3σ contains about 99.7% of the values
in the population or the sample
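The 68%, 95% and 99.7% figures can be checked against an exactly normal distribution. This sketch assumes normality, which is stricter than "approximately bell-shaped", and uses the identity P(|Z| < k) = erf(k/√2); the function name `share_within` is illustrative.

```python
import math


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

Real data that is only roughly bell-shaped will match these shares approximately, not exactly.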
Chebyshev Rule
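The slide gives only the title here. For reference, the Chebyshev rule states that, regardless of the shape of the distribution, at least (1 - 1/k²) × 100% of the values lie within k standard deviations of the mean, for any k > 1. A minimal sketch, with an illustrative function name:

```python
def chebyshev_lower_bound(k):
    """Minimum share of values within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


for k in (2, 3):
    print(f"within {k} standard deviations: at least {chebyshev_lower_bound(k):.1%}")
```

These are guaranteed minimums, so they are lower than the bell-shaped figures of about 95% and 99.7%.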
Positive skewness
There are more observations below the mean than
above it
When the mean is greater than the median
Negative skewness
There are a small number of low observations and a
large number of high ones
When the median is greater than the mean
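The mean-versus-median comparison can be sketched in Python. The first data set is the outlier example from the range slide; the mirrored set is a hypothetical construction used only to show the negative case, and the function name `skew_direction` is illustrative.

```python
from statistics import mean, median


def skew_direction(data):
    """Classify skew by comparing the mean with the median."""
    m, med = mean(data), median(data)
    if m > med:
        return "positive"
    if m < med:
        return "negative"
    return "none"


# Many small values and one very large one (from the range slide).
right_tailed = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]
# Hypothetical mirror image: one very small value and many larger ones.
left_tailed = [-x for x in right_tailed]

print(skew_direction(right_tailed), skew_direction(left_tailed))
```

In the first set the single value 120 pulls the mean (6.52) well above the median (2), which is the signature of positive skewness.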
Measures of Skew