
Managing Uncertainty II

Continuous Distributions

MIT Center for Transportation & Logistics, ctl.mit.edu
Zippy Bright DCs
Zippy Bright manufactures electric toothbrushes that are sold through large
retail outlets. Currently, they distribute one of their premier products, the
XP219, from three Distribution Centers (East, Center, and West) to more than
3500 stores. The weekly demand that the East DC faces is shown in the data
table below.
• What can we say about the weekly demand for this DC?

Week Unit Sales Week Unit Sales Week Unit Sales Week Unit Sales Week Unit Sales
1 3595 11 2346 21 3967 31 2898 41 2196
2 3011 12 2869 22 2844 32 3713 42 3469
3 2994 13 3450 23 2546 33 2845 43 3570
4 3576 14 2031 24 2771 34 2866 44 2071
5 3697 15 3198 25 4084 35 3549 45 3247
6 2648 16 2939 26 2755 36 2365 46 4740
7 3747 17 2034 27 2641 37 2462 47 2316
8 3165 18 2476 28 2875 38 2480 48 2625
9 3412 19 2339 29 3855 39 3055 49 3973
10 2750 20 3200 30 2880 40 2453 50 3491

Summary Statistics for Spreadsheets
Function                    Microsoft Excel              Google Sheets                     LibreOffice Calc
Minimum                     =MIN(array)                  =MIN(array)                       =MIN(array)
Median                      =MEDIAN(array)               =MEDIAN(array)                    =MEDIAN(array)
Mode                        =MODE(array)                 =MODE(array)                      =MODE(array)
Mean (μ)                    =AVERAGE(array)              =AVERAGE(array)                   =AVERAGE(array)
Maximum                     =MAX(array)                  =MAX(array)                       =MAX(array)
Percentile                  =PERCENTILE.INC(array, k)    =PERCENTILE(array, percentile)    =PERCENTILE.INC(array, alpha)
Population Variance (σ²)    =VAR.P(array)                =VARP(array)                      =VAR.P(array)
Sample Variance (s²)        =VAR.S(array)                =VAR(array)                       =VAR.S(array)
Pop. Std Deviation (σ)      =STDEV.P(array)              =STDEVP(array)                    =STDEV.P(array)
Sample Std Deviation (s)    =STDEV.S(array)              =STDEV(array)                     =STDEV.S(array)

A Note on Population versus Sample Variance
• In real life, we usually do not know the true mean of the
  population. Instead, we need to estimate it from a sample.
• The population variance uses the true mean µ and divides by N:

$$\sigma^2_{pop} = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

• An unbiased estimate of the variance, s², uses the sample mean and divides by n − 1:

$$s^2_{sample} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

• In practice, use the sample variance and standard deviation.
Zippy Bright DC-East: Weekly Demand

Minimum     2,031     Range                         2,709
Median      2,889     Interquartile Range             920     25th Percentile             2,566
Mean (μ)    3,022     Variance (σ², population)   356,269     50th Percentile (Median)    2,889
Maximum     4,740     Std Deviation (s, sample)       603     75th Percentile             3,486
                      Coefficient of Variation       0.20
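The summary statistics can also be reproduced outside a spreadsheet. The following is a minimal Python sketch, assuming the 50 weekly values are typed in from the East DC demand table; the `summarize` helper and its key names are illustrative and not part of the lesson. It uses the inclusive quartile method and the sample (n − 1) standard deviation.

```python
import statistics

# Weekly unit sales for the East DC, weeks 1 to 50, copied from the demand table.
demand = [
    3595, 3011, 2994, 3576, 3697, 2648, 3747, 3165, 3412, 2750,
    2346, 2869, 3450, 2031, 3198, 2939, 2034, 2476, 2339, 3200,
    3967, 2844, 2546, 2771, 4084, 2755, 2641, 2875, 3855, 2880,
    2898, 3713, 2845, 2866, 3549, 2365, 2462, 2480, 3055, 2453,
    2196, 3469, 3570, 2071, 3247, 4740, 2316, 2625, 3973, 3491,
]


def summarize(data):
    """Return the descriptive statistics used in the lesson (hypothetical helper)."""
    # Inclusive quartiles interpolate between the sorted observations.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    mean = statistics.mean(data)
    sample_std = statistics.stdev(data)  # divides by n - 1
    return {
        "min": min(data),
        "max": max(data),
        "range": max(data) - min(data),
        "mean": mean,
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "pop_variance": statistics.pvariance(data),  # divides by N
        "sample_std": sample_std,
        "cv": sample_std / mean,
    }


summary = summarize(demand)
for name, value in summary.items():
    print(f"{name:>12}: {value:,.2f}")
```

The coefficient of variation here divides the sample standard deviation by the mean, which is one reasonable reading of the table; swapping in `statistics.pstdev` would give the population version.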
Continuous Distributions

Why not just “Discretize” the Data?

Which histogram should we use?


[Figure: three histograms of the same East DC weekly demand data, each using different bin boundaries between 2500 and 5000 units; the y-axes show the percentage of weeks per bin]
[Figure: histogram "Weekly Demand at Eastern DC": weekly demand at Eastern DC in 100 unit bins for the last year. x-axis: demand from 1500 to 5000 units; y-axis: share of weeks from 0% to 12%]
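To make the binning question concrete, here is a small Python sketch that groups the same 50 weekly observations into fixed-width bins of several widths. The `histogram` helper, the choice of widths, and the zero starting edge are assumptions made for illustration; the point is only that the picture changes with the bin width while the data do not.

```python
from collections import Counter

# Weekly unit sales for the East DC, weeks 1 to 50, copied from the demand table.
demand = [
    3595, 3011, 2994, 3576, 3697, 2648, 3747, 3165, 3412, 2750,
    2346, 2869, 3450, 2031, 3198, 2939, 2034, 2476, 2339, 3200,
    3967, 2844, 2546, 2771, 4084, 2755, 2641, 2875, 3855, 2880,
    2898, 3713, 2845, 2866, 3549, 2365, 2462, 2480, 3055, 2453,
    2196, 3469, 3570, 2071, 3247, 4740, 2316, 2625, 3973, 3491,
]


def histogram(data, bin_width, start=0):
    """Map each bin's lower edge to the share of observations in that bin."""
    counts = Counter(start + ((x - start) // bin_width) * bin_width for x in data)
    n = len(data)
    return {edge: counts[edge] / n for edge in sorted(counts)}


for width in (100, 500, 1000):
    print(f"bin width {width}:")
    for edge, share in histogram(demand, width).items():
        print(f"  {edge:>5} to {edge + width - 1:>5}: {share:6.1%}")
```

With 1000-unit bins the demand looks concentrated in one block, while 100-unit bins leave many nearly empty bars, which is the motivation for moving to a continuous distribution.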
Continuous Probability Distributions
• Differences from Discrete Random Variables
   - The probability of a specific value outcome makes no sense
   - The probability of values within an interval is more helpful
   - Cannot list all possible outcomes; instead we need to use a function
• Probability Density Function (pdf)
   - The probability that X lies between values a and b is equal to the area under
     the curve between a and b
   - The total area under the curve equals 1, but P[X=t] = 0!

$$P(a \le X \le b) = \int_a^b f(t)\,dt$$

[Figure: pdf curve with the area between a and b shaded; x-axis t, y-axis probability]
Continuous vs. Discrete Distributions
Discrete: multiply each value by its probability.

$$\mu = E(X) = \sum_{i=1}^{n} p_i x_i$$

$$\sigma^2 = \sum_{i=1}^{n} p_i (x_i - \mu)^2$$

Continuous: requires integration to calculate µ and σ².

$$\mu = \int_a^b t \cdot f(t)\,dt$$

$$\sigma^2 = \int_a^b (t - \mu)^2 \cdot f(t)\,dt$$

[Figure: a pmf (discrete bars) beside a pdf f(t) (smooth curve); x-axis t, y-axis probability]
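The two pairs of formulas can be compared numerically. The sketch below is illustrative only: the fair six-sided die and the density f(t) = 2t on [0, 1] are hypothetical examples chosen because their answers are easy to check by hand, and the midpoint rule stands in for exact integration.

```python
def discrete_moments(values, probs):
    """Mean and variance of a discrete random variable: sum of p_i * x_i terms."""
    mu = sum(p * x for x, p in zip(values, probs))
    var = sum(p * (x - mu) ** 2 for x, p in zip(values, probs))
    return mu, var


def continuous_moments(pdf, a, b, steps=100_000):
    """Approximate the mean and variance integrals over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    mids = [a + (i + 0.5) * h for i in range(steps)]
    mu = sum(t * pdf(t) for t in mids) * h
    var = sum((t - mu) ** 2 * pdf(t) for t in mids) * h
    return mu, var


# Hypothetical discrete example: a fair die (exact mean 3.5, variance 35/12).
die_mu, die_var = discrete_moments(range(1, 7), [1 / 6] * 6)

# Hypothetical continuous example: f(t) = 2t on [0, 1] (exact mean 2/3, variance 1/18).
ramp_mu, ramp_var = continuous_moments(lambda t: 2 * t, 0.0, 1.0)

print(die_mu, die_var)
print(ramp_mu, ramp_var)
```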
Continuous Probability Distributions
• Cumulative Distribution Function (cdf)
   - F(t) = P(X≤t), the probability that X does not exceed t
   - 0.0 ≤ F(t) ≤ 1.0
   - F(b) ≥ F(a) if b > a; it is non-decreasing
• Simple Rules
   - P(X≤t) = F(t)
   - P(X>t) = 1 – F(t)
   - P(c≤X≤d) = F(d) – F(c)
   - P(X=t) = 0

[Figure: cdf F(t) rising from 0 to 1.0 (y-axis: cumulative probability), shown above the pdf f(t) (y-axis: probability); F(a) is the area under the pdf to the left of a]
Continuous Distributions

[Figure omitted. Source: Wikipedia]
Uniform Distribution
We would say,
“X is uniformly distributed over
the range a to b, or X~U(a,b)”

$$f(t \mid a, b) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le t \le b \\ 0 & \text{otherwise} \end{cases}$$

$$F(t \mid a, b) = \begin{cases} 0 & \text{if } t < a \\ \dfrac{t - a}{b - a} & \text{if } a \le t \le b \\ 1 & \text{if } t > b \end{cases}$$

Mean = (1/2)(a + b)
Median = (1/2)(a + b)
Mode = any value in range [a,b]
Variance = (1/12)(b − a)²
Zippy Bright Transportation I
Zippy Bright has a consumer delivery unit. They distribute
product from a downtown location to all residences and offices in
the city. The deliveries are made on scooters and each customer
is delivered to directly. They found that the distance to each
customer location is ~U(2.75,6.50) kilometers.
1. What is the average distance, median distance, and CV?
We know that mean = (a+b)/2 = (2.75 + 6.50)/2 = 4.625 km, which is also the median!
CV = σ/μ = √[(1/12)(b−a)²] / [(a+b)/2] = √[(1/12)(6.50 − 2.75)²] / 4.625 = 1.0825 / 4.625 = 0.23

2. What is the probability that distance >5 km?
F(t) = P[X≤t]. Since we want to find P[X>t], we need 1 − F(t) = 1 − (t−a)/(b−a)
= 1 − (5 − 2.75)/(6.50 − 2.75) = 1 − 0.60 = 0.40 or 40%.

3. What is the probability that distance is within +/- 1σ of μ?
We know that σ = 1.0825 and that μ = 4.625. So we want to find the probability that X is
between (4.625 − 1.0825) and (4.625 + 1.0825), or [3.5425, 5.7075].
We can find this using the cdf: F(5.7075) − F(3.5425) = 0.789 − 0.211 = 0.577, or about 58%.
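The three answers can be checked with a few lines of Python. This is a sketch that codes the U(a, b) pdf and cdf directly from their definitions; the function names `uniform_pdf` and `uniform_cdf` are illustrative choices, not names used in the lesson.

```python
import math


def uniform_pdf(t, a, b):
    """Density of U(a, b): constant 1/(b - a) inside the range, 0 outside."""
    return 1.0 / (b - a) if a <= t <= b else 0.0


def uniform_cdf(t, a, b):
    """P[X <= t] for X ~ U(a, b)."""
    if t < a:
        return 0.0
    if t > b:
        return 1.0
    return (t - a) / (b - a)


a, b = 2.75, 6.50  # delivery distance range in km

# Question 1: mean (also the median), standard deviation, and CV.
mean = (a + b) / 2
sigma = math.sqrt((b - a) ** 2 / 12)
cv = sigma / mean

# Question 2: P[X > 5] = 1 - F(5).
p_over_5 = 1 - uniform_cdf(5, a, b)

# Question 3: P[mu - sigma <= X <= mu + sigma] = F(mu + sigma) - F(mu - sigma).
p_within_1_sigma = uniform_cdf(mean + sigma, a, b) - uniform_cdf(mean - sigma, a, b)

print(f"mean = {mean:.3f} km, sigma = {sigma:.4f} km, CV = {cv:.2f}")
print(f"P[X > 5] = {p_over_5:.2f}")
print(f"P[within 1 sigma] = {p_within_1_sigma:.3f}")
```

For any uniform distribution the last probability is 2σ/(b − a) = 2/√12, roughly 0.577, so it does not depend on a or b.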
Normal Distribution
We would say,
“X is normally distributed with mean µ
and standard deviation σ, or X~N(µ, σ)”

Note: mean = median = mode = μ

$$f(x \mid \mu, \sigma) = \frac{1}{(2\pi)^{1/2}\,\sigma} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$$

[Figure: bell-shaped pdf f_x(x_0) with μ and μ+kσ_x marked on the x_0 axis. Area to the left of μ+kσ_x = P[x<μ+kσ_x]; area to the right = P[x≥μ+kσ_x] = 1−P[x<μ+kσ_x]]

Characteristics
• Most commonly used distribution – many analyses assume ~N
• High point in ‘bell curve’ occurs at mean
• Symmetric about the mean
• The mean ‘shifts’ the distribution – but not the ‘shape’
• The standard deviation changes the ‘shape’ but doesn’t ‘shift’ it
The Normal Distribution

Common dispersion metrics ~N(µ,σ)
• P(X w/in 1σ around µ) = .6827
• P(X w/in 2σ around µ) = .9545
• P(X w/in 3σ around µ) = .9973

• +/- 1.65 σ around µ = 0.900
• +/- 1.96 σ around µ = 0.950
• +/- 2.81 σ around µ = 0.995

So, what is 6σ?
Error occurs 9.9 x 10^-10 of the time (one-sided, beyond 6σ from the mean)

[Figure: two charts for an example normal distribution, with the +/- σ and +/- 2σ ranges marked. Top: "Normal Distribution", Probability of X (0.0% to 0.6%) versus Value (0 to 70). Bottom: "Normal CDF", Probability of X (0% to 100%) versus Value (0 to 70)]
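These coverage figures follow from the standard normal cdf. As a hedged illustration, the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `within` is an assumption made for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def within(k):
    """P[mu - k*sigma <= X <= mu + k*sigma] for any normal distribution."""
    return Z.cdf(k) - Z.cdf(-k)


coverage = {k: within(k) for k in (1, 2, 3, 1.65, 1.96, 2.81)}
for k, p in coverage.items():
    print(f"+/- {k} sigma covers {p:.4f}")

# One-sided tail beyond 6 sigma; by symmetry P[Z > 6] = P[Z < -6].
upper_tail_6 = Z.cdf(-6)
print(f"P[Z > 6] = {upper_tail_6:.2e}")
```

Because the coverage depends only on k, the same numbers apply to any mean and standard deviation.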
Unit or Standard Normal Distribution
• Standard Normal Distribution (z scores)
   - Z~N(0,1) where Z=(X-μ)/σ
   - Z score gives the number of standard deviations away from the mean
   - Allows for use of standard tables
   - Used extensively in inventory theory for setting safety stock
   - Area under the curve is 1
   - Able to assess the probability of an event
   - A z score can be positive or negative

$$f_u(u_0) = \frac{e^{-u_0^2 / 2}}{\sqrt{2\pi}}$$

[Figure: standard normal pdf f_u(u_0) with 0 and z marked on the u_0 axis. Area to the left of z = P[u<z]; area to the right = P[u≥z] = 1−P[u<z]]
Normal Functions for Spreadsheets
Function                       Microsoft Excel                 Google Sheets                  LibreOffice Calc
cdf of Normal Distribution     =NORM.DIST(X, μ, σ, 1)          =NORMDIST(X, μ, σ, 1)          =NORM.DIST(X, μ, σ, 1)
pdf of Normal Distribution     =NORM.DIST(X, μ, σ, 0)          =NORMDIST(X, μ, σ, 0)          =NORM.DIST(X, μ, σ, 0)
Inverse of Normal cdf          =NORM.INV(Probability, μ, σ)    =NORMINV(Probability, μ, σ)    =NORM.INV(Probability, μ, σ)
Standard Normal cdf            =NORM.S.DIST(z, 1)              =NORMSDIST(z)                  =NORM.S.DIST(z, 1)
Inverse Standard Normal cdf    =NORM.S.INV(Probability)        =NORMSINV(Probability)         =NORM.S.INV(Probability)

Examples for ~N(100, 12):
• What is P[X<85]? =NORM.DIST(85, 100, 12, 1) = 0.106 = 10.6%
• What value of X covers 75% of the probability?
  =NORM.INV(0.75, 100, 12) = 108.09, or about 108
• How many standard deviations does it take to cover 99.99%?
  =NORM.S.INV(.9999) = 3.719
• What probability is covered up to 1.65 standard deviations over the mean?
  =NORM.S.DIST(1.65, 1) = 0.95 = 95%
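For readers working outside a spreadsheet, the same four examples can be sketched with Python's standard-library `statistics.NormalDist` (Python 3.8 or later). The mapping in the comments is an informal correspondence, not an official equivalence table, and the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=12)  # the ~N(100, 12) example
Z = NormalDist()                  # standard normal, N(0, 1)

p_below_85 = X.cdf(85)        # like NORM.DIST(85, 100, 12, 1)
x_at_75 = X.inv_cdf(0.75)     # like NORM.INV(0.75, 100, 12)
z_at_9999 = Z.inv_cdf(0.9999) # like NORM.S.INV(0.9999)
p_below_165 = Z.cdf(1.65)     # like NORM.S.DIST(1.65, 1)
density_at_85 = X.pdf(85)     # like NORM.DIST(85, 100, 12, 0)

print(f"P[X < 85]          = {p_below_85:.3f}")
print(f"75th percentile    = {x_at_75:.2f}")
print(f"z for 99.99%       = {z_at_9999:.3f}")
print(f"P[Z < 1.65]        = {p_below_165:.2f}")
print(f"pdf at X = 85      = {density_at_85:.5f}")
```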
Zippy Bright Transportation II
Zippy Bright has a consumer delivery unit. They distribute
product from a downtown location to all residences and offices in
the city. The deliveries are made on scooters and each customer
is delivered to directly. They found that the distance to each
customer location is ~N(4.6, 1.10) kilometers.
1. What is the average distance, median distance, and CV?
This is trivial since they are all given! Average = median = 4.6 km. CV = σ/μ = 1.1/4.6 = 0.24

2. What is the probability that distance >5 km?
We want to find P[X>5] = 1 − P[X≤5] = 1 − NORM.DIST(5, 4.6, 1.1, 1) = 1 − 0.642 = 0.36 or 36%

3. What is the probability that distance is +/- 1σ around the mean?
By definition, 68.3%. But we could also use the cdf functions.
P[X≤5.7] – P[X≤3.5] = NORM.DIST(5.7, 4.6, 1.1, 1) − NORM.DIST(3.5, 4.6, 1.1, 1)
= 0.8413 − 0.1587 = 0.6827 or 68.3%
Triangle Distribution
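We would say,
“X follows a triangle distribution
with a minimum of a, maximum
b, and a mode of c, ~T(a, b, c)”

[Figure: triangular pdf over x, rising from a to a peak of height 2/(b − a) at the mode c, then falling to b]

PDF

$$f(x) = \begin{cases} 0 & x < a \\ \dfrac{2(x - a)}{(b - a)(c - a)} & a \le x \le c \\ \dfrac{2(b - x)}{(b - a)(b - c)} & c \le x \le b \\ 0 & x > b \end{cases}$$

CDF

$$F(d) = P(x \le d) = \begin{cases} \dfrac{(d - a)^2}{(b - a)(c - a)} & a < d \le c \\ 1 - \dfrac{(b - d)^2}{(b - a)(b - c)} & c \le d < b \end{cases}$$

$$P(x > d) = 1 - P(x \le d)$$

Inverse CDF

$$d = \begin{cases} a + \sqrt{P(x \le d)\,(b - a)(c - a)} & a < d \le c \\ b - \sqrt{\left[1 - P(x \le d)\right](b - a)(b - c)} & c \le d < b \end{cases}$$

Characteristics
• Good way to get a sense of an unknown distribution
• People tend to recall extreme and common values
• Handles asymmetric distributions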
Zippy Bright Transportation III
Zippy Bright has a consumer delivery unit. They distribute
product from a downtown location to all residences and
offices in the city. The deliveries are made on scooters and
each customer is delivered to directly. No one recalls exactly
what the distance to each customer location is, but the
consensus is that the shortest is about 1 km, the longest is
about 9 km, and the most common is probably 4 km.

[Figure: triangular pdf with a = 1, c = 4, b = 9 and peak height 2/(b − a) = 0.25]

1. What is the average distance and CV?
Average = (a + b + c)/3 = (1 + 9 + 4)/3 = 4.67 km
Var[x] = σ² = (a² + b² + c² − ab − ac − bc)/18 = 49/18 = 2.72, so σ = 1.65 and CV = 1.65/4.67 = 0.35

2. What is the probability that distance <=5 km?
We want to find P[X<=d] where d=5. Since 4 ≤ 5 < 9, we select the
case where c≤d<b:

$$P(x \le d) = 1 - \frac{(b - d)^2}{(b - a)(b - c)}$$

Plugging in gives P[X<=5] = 1 − (9 − 5)²/[(8)(5)] = 1 − 16/40 = 0.60 = 60%.
That is, the probability that a delivery is within 5 km is 60%.

3. What is the probability that distance >=5 km?
We want to find P[X≥d] where d=5. Because P[X=5] = 0 for a continuous
distribution, P[X≥5] = P[X>5] = 1 – P[X≤5]. We just found that P[X≤5] = 0.60,
so P[X>5] = 1 – 0.60 = 0.40. So the probability that a delivery is longer
than 5 km is 40%.
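The first three answers can be reproduced with a short Python sketch. It follows the lesson's ~T(a, b, c) convention (a = minimum, b = maximum, c = mode); the function names are hypothetical, and the variance expression is the standard closed form for the triangle distribution.

```python
import math


def tri_mean(a, b, c):
    """Mean of a triangle distribution with minimum a, maximum b, mode c."""
    return (a + b + c) / 3


def tri_var(a, b, c):
    """Variance of the triangle distribution (standard closed form)."""
    return (a * a + b * b + c * c - a * b - a * c - b * c) / 18


def tri_cdf(d, a, b, c):
    """P[X <= d], using the left-hand or right-hand case depending on d."""
    if d <= a:
        return 0.0
    if d >= b:
        return 1.0
    if d <= c:
        return (d - a) ** 2 / ((b - a) * (c - a))
    return 1 - (b - d) ** 2 / ((b - a) * (b - c))


a, b, c = 1, 9, 4  # shortest, longest, and most common distance in km

mean = tri_mean(a, b, c)
sigma = math.sqrt(tri_var(a, b, c))
cv = sigma / mean
p_le_5 = tri_cdf(5, a, b, c)
p_gt_5 = 1 - p_le_5

print(f"mean = {mean:.2f} km, sigma = {sigma:.2f} km, CV = {cv:.2f}")
print(f"P[X <= 5] = {p_le_5:.2f}, P[X > 5] = {p_gt_5:.2f}")
```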
Zippy Bright Transportation III (continued)

4. What is the probability that distance >2 km?
We want to find P[X>d] where d=2, that is, P[X>2] = 1 – P[X≤2]. Because 1 < 2 ≤ 4, we
select the case where a<d≤c:

$$P(x \le d) = \frac{(d - a)^2}{(b - a)(c - a)}$$

Plugging in gives P[X<=2] = (2 − 1)²/[(8)(3)] = 1/24 = 4.2%, so the
probability that a delivery is longer than 2 km is 1 – 0.042 = 0.958 or 95.8%.

5. What is the distance that 90% of the trips will be shorter than?
We want to find d where P[X≤d] = 0.90. This is obviously on the right-hand side of
the distribution, but we can check by looking at P[X≤c] = (c-a)/(b-a) = 3/8 = 0.375. Since
90% is larger than this, we are on the right-hand side.

To find d we select the case where c≤d<b and plug in the
equation. This gives us 9 – sqrt[(0.10)(8)(5)] = 7 km. That
is, 90% of all deliveries will be less than 7 km in length.
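Questions 4 and 5 can be sketched the same way. The code below is an illustration under the lesson's ~T(a, b, c) convention; `tri_inv_cdf` is a hypothetical helper that first compares the target probability with P[X≤c] = (c − a)/(b − a) to decide which side of the mode applies, mirroring the check in question 5.

```python
import math


def tri_cdf(d, a, b, c):
    """P[X <= d] for a triangle distribution with minimum a, maximum b, mode c."""
    if d <= a:
        return 0.0
    if d >= b:
        return 1.0
    if d <= c:
        return (d - a) ** 2 / ((b - a) * (c - a))
    return 1 - (b - d) ** 2 / ((b - a) * (b - c))


def tri_inv_cdf(p, a, b, c):
    """Distance d such that P[X <= d] = p."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    p_at_mode = (c - a) / (b - a)  # cumulative probability at the mode
    if p <= p_at_mode:
        return a + math.sqrt(p * (b - a) * (c - a))       # left-hand side
    return b - math.sqrt((1 - p) * (b - a) * (b - c))     # right-hand side


a, b, c = 1, 9, 4  # shortest, longest, and most common distance in km

p_gt_2 = 1 - tri_cdf(2, a, b, c)      # question 4
d_90 = tri_inv_cdf(0.90, a, b, c)     # question 5

print(f"P[X > 2] = {p_gt_2:.3f}")
print(f"90% of trips are shorter than {d_90:.2f} km")
```

Feeding `d_90` back into `tri_cdf` returns 0.90 up to floating-point error, which is a quick way to confirm that the inverse picked the correct case.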
Key Points from Lesson

Questions, Comments, Suggestions?
Use the Discussion Forum!

“Uniformly fun!”

caplice@mit.edu