
Summary Statistics and Observations
Outline

• Summary Statistics (Step 2b of the input analysis process)
• Population vs. Sample
• Measures of Central Tendency
• Measures of Variability/Dispersion
• Measures of Skewness

Readings

• Law – Section 6.4

Population vs. Sample

• Population: a collection of individuals, with parameters $\mu$, $\sigma^2$
• Sample: a subset of the population, with estimates $\hat{\mu}$, $\hat{\sigma}^2$

Example 1

• Population: all currently enrolled MEng students at Concordia
• Sample: this class

Example 2

• Population: all of the bank's current customers
• Sample: customers that arrive at the bank today

Goal/Question

• We want to study the population but observe only a subset of it
• How can we make inferences about the population based on the sample?

Another Question

• How does the previous question relate to simulation? Why do we care about making inferences about a population based on a sample?

➢ Whatever data you collect, you usually collect only a sample!

Example in Simulation

• You collect inter-arrival time data at a bank in order to simulate its operations
• Population: all of the bank's customers
• Sample: customers that arrive while you are at the bank collecting data

Therefore…

• Every time you collect data you are sampling, but you are actually interested in the characteristics of the population
• So, summary statistics are actually sample statistics

Three Types of Summary Stats

1. Measures of Central Tendency

2. Measures of Variability/Dispersion

3. Measures of Skewness

Measures of Central Tendency

• Mode
• Median
• Mean

Figure: By Cmglee (Own work) [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)], https://upload.wikimedia.org/wikipedia/commons/3/33/Visualisation_mode_median_mean.svg

Mode (“Most Frequent Value”)

• Sample: $\hat{x}_m(n)$
• Population: $x_m$

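Median (“half-way point”)

• Sample:
  $\hat{x}_{0.5}(n) = \begin{cases} X_{((n+1)/2)} & \text{if } n \text{ is odd} \\ \dfrac{X_{(n/2)} + X_{(n/2+1)}}{2} & \text{if } n \text{ is even} \end{cases}$
  where $X_{(i)}$ is the ith data point when the points are sorted in non-decreasing order
• Population: $x_{0.5}$
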
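The odd/even rule for the sample median can be sketched in Python as follows. This is only an illustration: the function name sample_median is an assumption, not something from the slides, and the only care needed is translating the 1-based order statistics into 0-based list indices.

```python
def sample_median(data):
    """Sample median from the order statistics X_(1) <= ... <= X_(n)."""
    ordered = sorted(data)  # X_(1), ..., X_(n) in non-decreasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("sample_median requires at least one data point")
    mid = n // 2
    if n % 2 == 1:
        # n odd: X_((n+1)/2), which is zero-based index n // 2
        return ordered[mid]
    # n even: average of X_(n/2) and X_(n/2 + 1), zero-based mid - 1 and mid
    return (ordered[mid - 1] + ordered[mid]) / 2


print(sample_median([5, 1, 3]))     # odd n
print(sample_median([1, 2, 3, 4]))  # even n
```
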
Mean

• Sample:
  $\bar{X}(n) = \dfrac{\sum_{i=1}^{n} X_i}{n}$
  where $X_i$ is the RV that corresponds to the ith sampling attempt (ith point) and $n$ is the number of points in the sample
• Population: $\mu = E(\bar{X}(n))$

Example

• Calculate the mean, median and mode of the following sequence:
  • 3.1, 3.2, 3.6, 3.1, 3.8, 4.2

Mean = 3.5, Median = 3.4, Mode = 3.1

How is this useful for input analysis?
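The worked example can be checked with Python's standard statistics module. This is a quick sketch rather than part of the original slides; the helper name central_tendency is made up for illustration, and results are rounded because floating-point arithmetic may differ from the slide values in the last digits.

```python
import statistics


def central_tendency(data):
    """Return (mean, median, mode) of a sample."""
    return (
        statistics.mean(data),
        statistics.median(data),
        statistics.mode(data),  # most frequent value
    )


sequence = [3.1, 3.2, 3.6, 3.1, 3.8, 4.2]
mean, median, mode = central_tendency(sequence)
print(round(mean, 2), round(median, 2), mode)
```
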

Example: Triangular Distribution

Another good example:
https://en.wikipedia.org/wiki/Mode_(statistics)#/media/File:Comparison_mean_median_mode.svg

Central Tendency: Observations

• If the distribution is symmetric, then $\mu = x_{0.5}$ (might be approximate for discrete distributions)
• Thus, if $\bar{X}(n) \approx \hat{x}_{0.5}(n)$, this suggests that the distribution to be fitted may be symmetric

Symmetry

$\mu = x_{0.5}$

Figure: https://commons.wikimedia.org/wiki/File:Normal_Distribution_PDF.svg

Skewness

$\mu < x_{0.5}$        $\mu > x_{0.5}$

Figure: http://www.statisticshowto.com/wp-content/uploads/2014/02/pearson-mode-skewness.jpg

Extreme Values

• Consider the following example
  • X: 1, 2, 3, 4, 5, each with probability 0.2
    Mean: 3
    Median: 3
  • Y: 1, 2, 3, 4, 100, each with probability 0.2
    Mean: 22
    Median: 3
• Extreme values are more likely to affect the mean than the median!
May want to check the data for outliers!
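The effect of a single extreme value can be reproduced in a few lines of Python. This is a sketch treating the equally likely values as a data set; the helper name mean_and_median is hypothetical.

```python
import statistics


def mean_and_median(values):
    """Return (mean, median) so the two measures can be compared side by side."""
    return statistics.mean(values), statistics.median(values)


x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 100]  # same as x except for one extreme value

print("X:", mean_and_median(x))
print("Y:", mean_and_median(y))
```

Only the mean moves when 5 is replaced by 100; the median is unchanged because it depends only on the middle of the ordered data.
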
Example: Triangular Distribution

• If the mode is well-defined (and unique) and the data should have a finite range, consider using the triangular distribution

Measures of Dispersion/Variability

• Range
• Variance & Standard Deviation
• Lexis Ratio (for discrete)/Coefficient of Variation (for continuous)

Range

• Sample: $X_{(1)}, \ldots, X_{(n)}$
  (the $(i)$ subscript means the ith data point if the points are sorted in non-decreasing order)
• Population: $X_{(\min)}, \ldots, X_{(\max)}$

Range

• Range is useful to rule out unbounded distributions or distributions with negative values
  • e.g., processing times – should not use Normal (or at least check the behaviour of the software when negative values are generated)
  • e.g., # of defective items is known to be in a certain range – should not use Gamma (which has no upper bound)
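To see why a Normal model can be a problem for processing times, one can compute how often it would generate a negative value. The sketch below uses statistics.NormalDist from the Python standard library; the function name prob_negative and the parameter values (mean 10, standard deviation 5) are made-up illustrations, not data from the slides.

```python
from statistics import NormalDist


def prob_negative(mu, sigma):
    """P(X < 0) for X ~ Normal(mu, sigma), where sigma is the standard deviation."""
    return NormalDist(mu, sigma).cdf(0.0)


# Hypothetical processing time with mean 10 and standard deviation 5:
# the mean is only two standard deviations above zero.
print(round(prob_negative(10, 5), 4))
```

If this probability is not negligible, a distribution bounded below by zero is the safer choice, or the simulation software's handling of negative draws needs to be checked.
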

Variance

• Sample:
  $S^2(n) = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X}(n))^2}{n-1}$
  Note: dividing by $n-1$ rather than $n$ is a correction for a potential underestimate
• Population:
  $\sigma^2 = \dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$

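The difference between the two denominators is easy to see in code. The following is a minimal sketch of both formulas; the function names are chosen here for illustration, and Python's statistics.variance and statistics.pvariance compute the same two quantities.

```python
def sample_variance(data):
    """S^2(n): squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance needs at least two data points")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def population_variance(values):
    """sigma^2: squared deviations from the population mean, divided by N."""
    big_n = len(values)
    if big_n == 0:
        raise ValueError("population variance needs at least one value")
    mu = sum(values) / big_n
    return sum((x - mu) ** 2 for x in values) / big_n


points = [1, 2, 3, 4, 5]
print(sample_variance(points), population_variance(points))
```
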
Standard Deviation

• Sample: $\sqrt{S^2(n)}$
• Population: $\sqrt{\sigma^2}$

Variance Illustration

$\sigma^2 = 0.2$, $\sigma^2 = 1$, $\sigma^2 = 5$

Same mean but different variance!


Lexis Ratio (for Discrete Dist’n)

• Sample: $\hat{\tau} = \dfrac{S^2(n)}{\bar{X}(n)}$
• Population: $\tau = \dfrac{\sigma^2}{\mu}$

Lexis Ratio: Observation

• $\tau = 1$ → Poisson Dist’n
• $\tau < 1$ → Binomial Dist’n
• $\tau > 1$ → Negative Binomial Dist’n

Why? Let’s recall what we know about these distributions...

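Recall a slide from probability review

Poisson Dist’n

• PMF: $f(x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$ for $x = 0, 1, 2, \ldots$
• Notation: Poisson($\lambda$)
• Mean: $E(X) = \lambda$
• Variance: $Var(X) = \lambda$

Mean = variance, so the Lexis ratio is 1:
$\tau = \dfrac{\sigma^2}{\mu} = \dfrac{\lambda}{\lambda} = 1$
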
Binomial Dist’n

Probability of x successes out of n trials, where p is the probability of success

• PMF: $f(x) = \binom{n}{x} p^x (1-p)^{n-x}$ for $x = 0, \ldots, n$
• Notation: Binomial(n, p)
• Mean: $E(X) = np$
• Variance: $Var(X) = np(1-p)$

Mean ≥ variance, so the Lexis ratio is $\tau = \dfrac{np(1-p)}{np} = 1 - p < 1$ (for $p > 0$)
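The Lexis ratio comparison can be sketched in Python as below. The function names are illustrative. The negative binomial formula assumes the parametrization that counts failures before the s-th success (mean s(1-p)/p, variance s(1-p)/p^2); this parametrization is an assumption, since it is not shown on these slides.

```python
import statistics


def lexis_ratio(data):
    """Sample Lexis ratio: S^2(n) divided by the sample mean."""
    mean = statistics.mean(data)
    if mean == 0:
        raise ValueError("Lexis ratio is undefined when the sample mean is 0")
    return statistics.variance(data) / mean


def poisson_lexis(lam):
    """Population Lexis ratio for Poisson(lam): variance lam over mean lam."""
    return lam / lam


def binomial_lexis(n, p):
    """Population Lexis ratio for Binomial(n, p): np(1 - p) / (np) = 1 - p."""
    return (n * p * (1 - p)) / (n * p)


def negative_binomial_lexis(s, p):
    """Assumed parametrization: failures before the s-th success, ratio 1 / p."""
    mean = s * (1 - p) / p
    variance = s * (1 - p) / p ** 2
    return variance / mean


counts = [0, 1, 2, 3, 4]  # hypothetical count data
print(lexis_ratio(counts))
print(poisson_lexis(3.0), binomial_lexis(10, 0.3), negative_binomial_lexis(3, 0.5))
```

In practice the sample ratio will rarely equal 1 exactly, so it is a hint about which discrete family to try first rather than a test.
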


Coefficient of Variation (for Continuous Dist’n)

• Sample: $\widehat{CV}(n) = \dfrac{\sqrt{S^2(n)}}{\bar{X}(n)}$
• Population: $CV = \dfrac{\sqrt{\sigma^2}}{\mu}, \quad \mu \neq 0$

Coefficient of Variation: Observations

• CV = 1 → $\sqrt{\sigma^2} = \mu$
• Recall the exponential distribution:
  • $\mu = \dfrac{1}{\lambda}$, $\sigma^2 = \dfrac{1}{\lambda^2}$
  • so CV = 1 → Exponential!
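As a rough check of this observation, the sample CV of simulated exponential data should be close to 1. The sketch below uses random.expovariate from the standard library; the function name sample_cv, the seed, the rate 2.0 and the sample size are arbitrary choices made for illustration.

```python
import random
import statistics


def sample_cv(data):
    """Sample coefficient of variation: sqrt(S^2(n)) divided by the sample mean."""
    mean = statistics.mean(data)
    if mean == 0:
        raise ValueError("CV is undefined when the sample mean is 0")
    return statistics.stdev(data) / mean


rng = random.Random(42)  # fixed seed so the run is repeatable
rate = 2.0               # lambda, so mu = 1/lambda and sigma^2 = 1/lambda^2
sample = [rng.expovariate(rate) for _ in range(10_000)]

print(round(sample_cv(sample), 2))
```

The estimate does not depend on the rate chosen, because both the standard deviation and the mean scale with 1/lambda.
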

Coefficient of Variation: Observations

• For Gamma and Weibull, there is a relationship between CV and the shape parameter alpha
• If CV > 1, alpha < 1
• If CV = 1, alpha = 1
• If CV < 1, alpha > 1

Figure: https://en.wikipedia.org/wiki/Gamma_distribution#/media/File:Gamma_distribution_pdf.svg

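The relationship can be made concrete with the standard closed forms, which are not given on the slides and are stated here as background: for Gamma with shape alpha the CV is 1/sqrt(alpha), and for Weibull with shape alpha the CV follows from its first two moments via the gamma function. The function names below are illustrative.

```python
import math


def gamma_cv(alpha):
    """CV of a Gamma distribution with shape alpha (the scale cancels out)."""
    return 1.0 / math.sqrt(alpha)


def weibull_cv(alpha):
    """CV of a Weibull distribution with shape alpha (the scale cancels out)."""
    g1 = math.gamma(1 + 1 / alpha)  # E(X) / scale
    g2 = math.gamma(1 + 2 / alpha)  # E(X^2) / scale^2
    return math.sqrt(g2 / g1 ** 2 - 1)


for alpha in (0.5, 1.0, 2.0):
    print(alpha, round(gamma_cv(alpha), 3), round(weibull_cv(alpha), 3))
```

For both families the CV decreases as alpha grows and equals 1 at alpha = 1, where each reduces to the exponential distribution.
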
Simulation Modeling and Analysis – Chapter 6 – Selecting Input Probability Distributions

Coefficient of Variation: Observations

• If $\widehat{CV}(n) > 1$, check whether the shape resembles a log-normal
• Log-normal might also be a good choice

Figure: https://en.wikipedia.org/wiki/Log-normal_distribution#/media/File:PDF-log_normal_distributions.svg

Measures of Skewness

• Sample: $\hat{\nu}(n) = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X}(n))^3 / n}{[S^2(n)]^{3/2}}$
• Population: $\nu = \dfrac{E((X - \mu)^3)}{(\sigma^2)^{3/2}}$

Skewness: Observations

• $\nu = 0$ → Symmetric
• $\nu > 0$ → Skewed to the right
• $\nu < 0$ → Skewed to the left

Figure: http://www.statisticshowto.com/wp-content/uploads/2014/02/pearson-mode-skewness.jpg

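A sketch of the sample skewness estimator, with the third central moment divided by n and the sample variance S^2(n) raised to the power 3/2, is shown below. The function name sample_skewness and the three small data sets are made up for illustration.

```python
import statistics


def sample_skewness(data):
    """Sample skewness: (sum of cubed deviations / n) / [S^2(n)]^(3/2)."""
    n = len(data)
    if n < 2:
        raise ValueError("sample skewness needs at least two data points")
    mean = sum(data) / n
    variance = statistics.variance(data)  # S^2(n), with n - 1 in the denominator
    if variance == 0:
        raise ValueError("sample skewness is undefined for constant data")
    third_moment = sum((x - mean) ** 3 for x in data) / n
    return third_moment / variance ** 1.5


print(sample_skewness([1, 2, 3, 4, 5]))        # symmetric data
print(sample_skewness([1, 2, 3, 4, 100]) > 0)  # long right tail
print(sample_skewness([1, 97, 98, 99, 100]) < 0)  # long left tail
```
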
Questions?
