
Week 1

IE 212 STATISTICS
Week 3: Measures of Variability

Dr. Başak GEVER


UNIVERSITY OF TURKISH AERONAUTICAL ASSOCIATION
IE 212 STATISTICS WEEK 1

MEASURES OF VARIABILITY
❖ The mean alone does not provide a complete or sufficient
description of data. In this section we present descriptive
numbers that measure the variability or spread of the
observations about the mean. In particular, we cover:

1) Sample variance and standard deviation
2) Mean absolute deviation
3) Quartiles
4) Skewness
5) Kurtosis
6) Coefficient of variation


1) Sample Variance and Standard Deviation


❖ The sample variance, $s^2$, is the sum of the squared differences
between each observation and the sample mean, divided by the
sample size, n, minus 1:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

➢ Here $\bar{x}$ is the arithmetic mean of the sample. The sample
standard deviation, s, is as follows:

$$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$


Example: Calculate the standard deviation of daily sales for
Gilotti's Pizzeria, Location 1. From the following table the daily
sales for Location 1 are:
6; 8; 10; 12; 14; 9; 11; 7; 13; 11
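
The calculation can be checked with a short Python sketch. It applies the definition of the sample variance directly to the ten Location 1 values listed above; the variable names are illustrative only and are not part of the course material.

```python
import math

# Daily sales for Gilotti's Pizzeria, Location 1 (values from the example)
sales = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]

n = len(sales)
mean = sum(sales) / n                               # sample mean
sum_sq_dev = sum((x - mean) ** 2 for x in sales)    # sum of squared deviations
variance = sum_sq_dev / (n - 1)                     # sample variance s^2
std_dev = math.sqrt(variance)                       # sample standard deviation s

print(f"mean = {mean:.2f}, s^2 = {variance:.4f}, s = {std_dev:.4f}")
```

For these data the sample mean is 10.1, the sum of squared deviations is 60.9, and so $s^2 = 60.9/9 \approx 6.77$ and $s \approx 2.60$.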


Shortcut Formulas for Sample Variance, $s^2$:

➢ Sample variance, $s^2$, can be computed as follows:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n-1}$$

➢ Alternatively, sample variance, $s^2$, can be computed as
follows:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}$$


Example: Use the alternative formula to calculate the sample
variance for Gilotti's Pizzeria sales.
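
As a rough check of this example, the sketch below evaluates both shortcut formulas on the Location 1 sales values (6; 8; 10; 12; 14; 9; 11; 7; 13; 11). The variable names are assumptions made for illustration.

```python
# Daily sales for Gilotti's Pizzeria, Location 1
sales = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]

n = len(sales)
sum_x = sum(sales)                    # sum of the observations
sum_x2 = sum(x * x for x in sales)    # sum of the squared observations
mean = sum_x / n

# Shortcut formula: (sum x^2 - (sum x)^2 / n) / (n - 1)
var_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Alternative formula: (sum x^2 - n * mean^2) / (n - 1)
var_alternative = (sum_x2 - n * mean ** 2) / (n - 1)

print(f"{var_shortcut:.4f} {var_alternative:.4f}")
```

Here $\sum x_i = 101$ and $\sum x_i^2 = 1081$, so both formulas give $(1081 - 1020.1)/9 \approx 6.77$, the same value the definition produces.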


Example: The following data give the time in months from hire to
promotion to manager for a random sample of 25 software
engineers from all software engineers employed by a large
telecommunications firm.

Calculate the variance and standard deviation for this sample.


Solution:

$X_i$   $X_i - \bar{X}$   $(X_i - \bar{X})^2$   |   $X_i$   $X_i - \bar{X}$   $(X_i - \bar{X})^2$
  5     -78.28       6127.7584   |    25     -58.28      3396.5584
  7     -76.28       5818.6384   |    23     -60.28      3633.6784
229     145.72      21234.3184   |    24     -59.28      3514.1184
453     369.72     136692.8784   |    34     -49.28      2428.5184
 12     -71.28       5080.8384   |    37     -46.28      2141.8384
 14     -69.28       4799.7184   |    34     -49.28      2428.5184
 18     -65.28       4261.4784   |    49     -34.28      1175.1184
 14     -69.28       4799.7184   |    64     -19.28       371.7184
 14     -69.28       4799.7184   |    47     -36.28      1316.2384
483     399.72     159776.0784   |    67     -16.28       265.0384
 22     -61.28       3755.2384   |    69     -14.28       203.9184
 21     -62.28       3878.7984   |   192     108.72     11820.0384
                                 |   125      41.72      1740.5584
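$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{5 + 7 + 229 + \cdots + 125}{25} = \frac{2082}{25} = 83.28$$

$$\sum_{i=1}^{n}(X_i - \bar{X})^2 = 395461.04$$

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2 = \frac{395461.04}{24} = 16477.5433; \quad s = 128.3649$$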
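
The figures in this solution can be reproduced with Python's standard `statistics` module, whose `variance` and `stdev` functions use the same $n-1$ divisor. The list name `months` is an illustrative choice; the values are the 25 observations from the solution table.

```python
import statistics

# Months from hire to promotion for the 25 sampled software engineers
months = [5, 7, 229, 453, 12, 14, 18, 14, 14, 483, 22, 21,
          25, 23, 24, 34, 37, 34, 49, 64, 47, 67, 69, 192, 125]

mean = statistics.mean(months)           # sample mean
variance = statistics.variance(months)   # sample variance (divisor n - 1)
std_dev = statistics.stdev(months)       # sample standard deviation

print(f"mean = {mean:.2f}, s^2 = {variance:.4f}, s = {std_dev:.4f}")
```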

Sample Variance for Grouped Data

❖ Suppose that you have two sets of sample data, $\vec{X}_1$ and
$\vec{X}_2$, with sizes $n$ and $m$, respectively. Let us denote
their variances by $s_1^2$ and $s_2^2$, respectively. Then the
joint (pooled) variance $s_p^2$ can be calculated as follows:

$$s_p^2 = \frac{(n-1)s_1^2 + (m-1)s_2^2}{n+m-2}$$
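
A minimal sketch of this formula in Python follows. The function name `pooled_variance` and the two small data sets are hypothetical and chosen only to illustrate the arithmetic; they do not come from the course material.

```python
import statistics

def pooled_variance(sample_1, sample_2):
    """Weight each sample variance by its degrees of freedom and combine."""
    n, m = len(sample_1), len(sample_2)
    s1_sq = statistics.variance(sample_1)   # s_1^2, divisor n - 1
    s2_sq = statistics.variance(sample_2)   # s_2^2, divisor m - 1
    return ((n - 1) * s1_sq + (m - 1) * s2_sq) / (n + m - 2)

# Hypothetical data, for illustration only
group_a = [6, 8, 10, 12, 14]
group_b = [9, 11, 7, 13, 11, 10]

sp_sq = pooled_variance(group_a, group_b)
print(f"pooled variance = {sp_sq:.4f}")
```

Note that $s_p^2$ is a weighted average of the two sample variances, so it always lies between $s_1^2$ and $s_2^2$.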


2) Mean Absolute Deviation (MAD):

Suppose you have a set of data $\vec{X} = (x_1, x_2, x_3, \ldots, x_n)$.
Then the mean absolute deviation can be calculated as follows:

$$MAD = \frac{1}{n}\sum_{i=1}^{n}|x_i - \bar{x}|$$
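
The formula can be sketched in a few lines of Python. The function name `mean_absolute_deviation` is an illustrative choice, and the data are the Gilotti's Pizzeria Location 1 sales from the earlier variance example, reused here only to show the arithmetic.

```python
def mean_absolute_deviation(data):
    """Average absolute distance of the observations from their mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n

# Daily sales for Gilotti's Pizzeria, Location 1
sales = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]

mad = mean_absolute_deviation(sales)
print(f"MAD = {mad:.2f}")
```

For these ten values the mean is 10.1 and the absolute deviations sum to 21.0, so the MAD is 2.1. Unlike the sample variance, the divisor here is $n$, not $n-1$.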


Example. Harry Vernon has collected data on the number of
VCRs sold last year for Vernon's Music Store.


3) Quartiles
❖ The lower quartile ($Q_1$) is the middle number (median) of the
half of the data below the median, and the upper quartile ($Q_3$)
is the middle number (median) of the half of the data above the
median. We will denote

$Q_1$: lower quartile
$Q_2$: M = middle quartile (median)
$Q_3$: upper quartile

❖ The difference between the upper and lower quartiles is called
the interquartile range (IQR): $IQR = Q_3 - Q_1$
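
The median-of-halves definition above can be sketched as follows. This is one convention among several: it assumes that, for an odd number of observations, the median itself is left out of both halves, and statistical packages may use different interpolation rules and return slightly different quartiles. The function name `quartiles` is illustrative, and the data are the Gilotti's Pizzeria Location 1 sales from the variance example.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves rule (needs n >= 2)."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]                    # observations below the median
    # For odd n, skip the middle value so it belongs to neither half
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    q1 = statistics.median(lower)
    q2 = statistics.median(values)
    q3 = statistics.median(upper)
    return q1, q2, q3

sales = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]
q1, q2, q3 = quartiles(sales)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

For the sorted sales 6, 7, 8, 9, 10, 11, 11, 12, 13, 14 this gives $Q_1 = 8$, $Q_2 = 10.5$, $Q_3 = 12$ and $IQR = 4$.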

Solution.


Box-and-Whisker Plot:


4) Skewness: In nearly all situations, we would compute
skewness with a statistical software package or Excel.
❖ If skewness is zero or close to zero, then the distribution is
symmetric or approximately symmetric.
❖ A negative skewness value tells us that the distribution is
skewed to the left.
❖ Similarly, a positive skewness value tells us that the
distribution is skewed to the right.

$$\text{Skewness} = \gamma_3 = \frac{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^3}{s^3}$$


❖ The important part of this expression is the numerator; the
denominator serves the purpose of standardization, making
units of measurement irrelevant.
❖ Positive skewness results if a distribution is skewed to the
right, since the average cubed deviations about the mean are
positive.
❖ Skewness is negative for distributions skewed to the left and 0
for distributions, such as the bell-shaped distribution, that are
mounded and symmetric about their mean.
1) If $\gamma_3 > 0$, then the data have positive skewness (skewed to the right).
2) If $\gamma_3 < 0$, then the data have negative skewness (skewed to the left).
3) If $\gamma_3 = 0$, then the data are symmetric.
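
The sign rules above can be illustrated with a short Python sketch that follows the skewness formula of this section, with the $1/(n-1)$ factor in the numerator. The function name `skewness` is an assumption for illustration, and software packages such as Excel apply different small-sample corrections, so their output may differ somewhat from this value. The `months` list is the promotion-time sample from the variance example.

```python
import math

def skewness(data):
    """Skewness as defined in this section: third moment over s cubed."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    third_moment = sum((x - mean) ** 3 for x in data) / (n - 1)
    return third_moment / s ** 3

# Months to promotion for the 25 software engineers (a few very large values)
months = [5, 7, 229, 453, 12, 14, 18, 14, 14, 483, 22, 21,
          25, 23, 24, 34, 37, 34, 49, 64, 47, 67, 69, 192, 125]

g3_right = skewness(months)                  # long right tail
g3_left = skewness([-x for x in months])     # mirrored data, long left tail
g3_sym = skewness([1, 2, 3, 4, 5])           # symmetric data
print(g3_right, g3_left, g3_sym)
```

The promotion data contain a few very large values, so their skewness is positive; mirroring the data flips the sign, and the symmetric set gives zero.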


Remarks:

1. Arithmetic Mean < Median < Mode: the frequency curve of a
unimodal distribution is skewed to the left.

2. Mode < Median < Arithmetic Mean: the frequency curve of a
unimodal distribution is skewed to the right.

3. Arithmetic Mean = Median = Mode: the frequency curve is
symmetric.


5) Kurtosis: Kurtosis is a measure of whether the distribution is
peaked or flat relative to a normal distribution. Kurtosis is based
on the size of a distribution's tails.

$$\text{Kurtosis} = \gamma_4 = \frac{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^4}{s^4} - 3$$

1) If $\gamma_4 > 0$, then the data have positive kurtosis.
2) If $\gamma_4 < 0$, then the data have negative kurtosis.
3) If $\gamma_4 = 0$, then the data have the same kurtosis as a normal distribution.


➢ Positive kurtosis indicates more observations in the tails than
a normal distribution would have, whereas negative kurtosis
indicates fewer observations in the tails of the distribution.

➢ Distributions with relatively large tails are called leptokurtic
(positive kurtosis), and those with small tails are called
platykurtic (negative kurtosis). A distribution which has the same
kurtosis as a normal distribution is known as mesokurtic. The
fourth standardized moment of a normal distribution is 3, which is
why 3 is subtracted in the formula, so that $\gamma_4 = 0$ for a
normal distribution.
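
A small Python sketch of the kurtosis formula of this section (fourth moment with the $1/(n-1)$ factor, divided by $s^4$, minus 3) is given below. The function name `excess_kurtosis` and both data sets are hypothetical, and statistical packages apply other small-sample corrections, so their values may differ.

```python
def excess_kurtosis(data):
    """Kurtosis as defined in this section: fourth moment over s^4, minus 3."""
    n = len(data)
    mean = sum(data) / n
    s_sq = sum((x - mean) ** 2 for x in data) / (n - 1)      # sample variance
    fourth_moment = sum((x - mean) ** 4 for x in data) / (n - 1)
    return fourth_moment / s_sq ** 2 - 3

# Hypothetical data, for illustration only
flat = [1, 2, 3, 4, 5]                          # evenly spread, no extreme values
heavy_tail = [1, 1, 1, 1, 1, 1, 1, 1, 1, 20]    # one extreme observation

g4_flat = excess_kurtosis(flat)
g4_heavy = excess_kurtosis(heavy_tail)
print(f"{g4_flat:.2f} {g4_heavy:.2f}")
```

The evenly spread sample gives a negative value (about -1.64), while the sample with one extreme observation gives a positive value.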


6) Coefficient of Variation: The coefficient of variation, CV, is a
measure of relative dispersion that expresses the standard
deviation as a percentage of the mean (provided the mean is
positive). The sample coefficient of variation (CV) is

$$CV = \frac{s}{\bar{X}} \times 100\%, \quad \bar{X} > 0$$
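
As an illustration, the sketch below computes the CV for the Gilotti's Pizzeria Location 1 sales used in the variance example. The function name `coefficient_of_variation` is an assumption made for this example; it raises an error when the mean is not positive, matching the condition in the definition.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the (positive) mean."""
    mean = statistics.mean(data)
    if mean <= 0:
        raise ValueError("CV is only defined here for a positive mean")
    return statistics.stdev(data) / mean * 100

# Daily sales for Gilotti's Pizzeria, Location 1
sales = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]

cv = coefficient_of_variation(sales)
print(f"CV = {cv:.1f}%")
```

With $s \approx 2.60$ and $\bar{X} = 10.1$, the CV is roughly 25.8%. Because it is unit-free, the CV can be used to compare the spread of data sets measured on different scales.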
