Statistical Models
2010
Goals for Today
Topics
Discrete Probability Distributions
• Binomial Distribution
• Bernoulli Distribution
• Geometric Distribution
• Discrete Poisson Distribution/Poisson Process
Continuous Probability Distributions
• Uniform
• Exponential
• Normal
Empirical Distributions
Discrete and Continuous Random Variables
Discrete versus Continuous Random Variables

Discrete Random Variable, with PMF p(x):
1. p(x_i) ≥ 0, for all x_i in R_X
2. Σ_{all i} p(x_i) = 1
CDF: F(x) = Σ_{x_i ≤ x} p(x_i)

Continuous Random Variable, with PDF f(x):
1. f(x) ≥ 0, for all x in R_X
2. ∫_{R_X} f(x) dx = 1
3. f(x) = 0, if x is not in R_X
CDF: F(x) = ∫_{−∞}^{x} f(t) dt
p(a ≤ X ≤ b) = ∫_{a}^{b} f(x) dx
Expectation
• Discrete: E(X) = Σ_{all i} x_i · p(x_i)
• Continuous: E(X) = ∫_{−∞}^{+∞} x · f(x) dx

Variance
• Discrete: V(X) = Σ_{i=1}^{n} (x_i − μ_X)² · p(x_i) = Σ_{i=1}^{n} x_i² · p(x_i) − μ_X²
• Continuous: V(X) = ∫_{−∞}^{+∞} (x − μ_X)² · f(x) dx = ∫_{−∞}^{+∞} x² · f(x) dx − μ_X²
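The discrete formulas above can be computed directly from a PMF. A minimal sketch (the values and probabilities here are made-up example data, not from the slides):

```python
# Expectation and variance of a discrete random variable, straight from its PMF.
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]   # example PMF; must sum to 1

# E[X] = sum of x_i * p(x_i)
mean = sum(x * p for x, p in zip(values, probs))

# V[X] = E[X^2] - (E[X])^2, the second form of the discrete variance above
mean_sq = sum(x**2 * p for x, p in zip(values, probs))
variance = mean_sq - mean**2

print(mean, variance)   # 1.7 and 0.81 for this example PMF
```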
Statistical Models: Properties
• Context of applications
• Mean/Expected value
Discrete Probability Distributions
Bernoulli Trials
Binomial Distribution
Geometric Distribution
Poisson Distribution
Poisson Process
Modeling of Random Events with Two States
Bernoulli Trials
Binomial Distribution
Bernoulli Trials
Context: random events with two possible values
Two events: Yes/No, True/False, Success/Failure
Two possible values: 1 for success, 0 for failure.
Examples: tossing a coin, packet transmission status.
An experiment comprises n trials.

Probability Mass Function (PMF) of one trial:
p(x_j) = p, if x_j = 1, j = 1, 2, ..., n
p(x_j) = 1 − p = q, if x_j = 0, j = 1, 2, ..., n
p(x_j) = 0, otherwise

Binomial Distribution: probability of k successes in n trials

P(X = k) = (n choose k) · p^k · (1 − p)^(n−k), with (n choose k) = n! / (k! (n − k)!)

where
k = 0, 1, 2, ..., n    k: number of successes
n = 1, 2, 3, ...       n: number of trials
0 < p < 1              p: success probability
Binomial Distribution
Probability of k=2 successes in n=3 trials?
E1: 1 1 0   P(E1) = p(1 and 1 and 0) = p · p · (1−p) = p²(1−p)
or
E2: 1 0 1   P(E2) = p(1 and 0 and 1) = p · (1−p) · p = p²(1−p)
or
E3: 0 1 1   P(E3) = p(0 and 1 and 1) = (1−p) · p · p = p²(1−p)
Hence P(X = 2) = 3 · p²(1−p) = (3 choose 2) · p²(1−p), which matches the binomial formula.
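The enumeration above can be checked against the binomial formula. A small sketch (the value p = 0.5 is an arbitrary example, since the slide leaves p symbolic):

```python
from math import comb

# Binomial PMF: P(exactly k successes in n Bernoulli trials with success prob p)
def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = 0.5                               # example value (assumption)
# Enumerating E1, E2, E3 above: each outcome has probability p^2 * (1 - p)
by_enumeration = 3 * p**2 * (1 - p)
print(binomial_pmf(2, 3, p), by_enumeration)   # both 0.375 for p = 0.5
```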
Geometric Distribution
Context: the number of Bernoulli trials until achieving the FIRST SUCCESS.
It is used to represent the random time until a first transition occurs.

PMF: p(X = k) = q^(k−1) · p, for k = 1, 2, 3, ...; 0 otherwise
CDF: F(k) = p(X ≤ k) = 1 − (1 − p)^k
Expected value: E[X] = 1/p
Variance: V[X] = σ² = q/p² = (1 − p)/p²
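The PMF, CDF, and moments above fit together consistently; a minimal sketch with an arbitrary example value p = 0.25:

```python
from math import isclose

# Geometric distribution: number of Bernoulli trials up to and including
# the first success (support k = 1, 2, 3, ...).
def geometric_pmf(k, p):
    return (1 - p)**(k - 1) * p        # q^(k-1) * p

def geometric_cdf(k, p):
    return 1 - (1 - p)**k              # F(k) = 1 - (1-p)^k

p = 0.25                               # example success probability (assumption)
# The CDF equals the cumulative sum of the PMF
assert isclose(geometric_cdf(5, p), sum(geometric_pmf(k, p) for k in range(1, 6)))

mean, var = 1 / p, (1 - p) / p**2      # E[X] = 1/p, V[X] = (1-p)/p^2
print(mean, var)                       # 4.0 12.0
```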
Modeling of Random Number of Arrivals/Events
Poisson Distribution
Poisson Process
Poisson Distribution
Context: the number of events occurring in a fixed period of time.
Events occur with a known average rate and independently of each other.
The Poisson distribution is characterized by the average rate λ,
the average number of arrivals in the fixed time period.
Examples:
  The number of cars passing a fixed point in a 5-minute interval. Average rate: λ = 3 cars/5 minutes
  The number of calls received by a switchboard during a given period of time. Average rate: λ = 3 calls/minute
  The number of messages arriving at a router per second
  The number of travelers arriving at the airport for flight registration
Poisson Distribution
The Poisson distribution with average rate parameter λ:
PMF: p(k) = P(X = k) = (λ^k / k!) · exp(−λ), k = 0, 1, 2, ...
Expected value: E[X] = λ
Variance: V[X] = λ
CDF: F(k) = P(X ≤ k) = Σ_{i=0}^{k} (λ^i / i!) · exp(−λ)
Example: Poisson Distribution
Cars enter a parking lot according to a Poisson distribution with λ = 20 cars/hour.
The probability of having exactly 15 cars entering the parking in one hour:
p(15) = P(X = 15) = (20^15 / 15!) · exp(−20) = 0.051649
or
p(15) = F(15) − F(14) = 0.156513 − 0.104864 = 0.051649
The probability of having more than 3 cars entering the parking in one hour:
p(X > 3) = 1 − p(X ≤ 3) = 1 − F(3)
         = 1 − [p(0) + p(1) + p(2) + p(3)]
         = 0.9999968
(Use Excel/MATLAB for the computations.)
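The two probabilities in this example can also be reproduced with a few lines of code instead of Excel/MATLAB; a minimal sketch:

```python
from math import exp, factorial

# Poisson PMF and CDF for the parking-lot example with lambda = 20 cars/hour.
def poisson_pmf(k, lam):
    return lam**k / factorial(k) * exp(-lam)

def poisson_cdf(k, lam):
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 20
p15 = poisson_pmf(15, lam)                 # exactly 15 cars in one hour
p_more_than_3 = 1 - poisson_cdf(3, lam)    # more than 3 cars in one hour
print(round(p15, 6))                       # 0.051649
print(round(p_more_than_3, 7))             # 0.9999968
```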
Example: Poisson Distribution
p(X = k) = (20^k / k!) · exp(−20)
F(k) = p(X ≤ k) = Σ_{i=0}^{k} (20^i / i!) · exp(−20)
Modeling of Random Number of Arrivals/Events
Poisson Distribution
Poisson Process
Poisson Process
Process N(t): a random variable N that depends on time t.
Poisson process: a process whose counts follow a Poisson distribution.
N(t) is called a counting Poisson process.
There are two types:
  Homogeneous Poisson process: constant average rate
  Non-homogeneous Poisson process: variable average rate
What is a “Stochastic Process”?
State space = {SUNNY, RAINY}
X(day i) = "S" or "R": a RANDOM VARIABLE that varies with the DAY
X(day 1) = "S", X(day 2) = "S", X(day 3) = "R", X(day 4) = "S", X(day 5) = "R", X(day 6) = "S", X(day 7) = "S"
Day 1 (THU), Day 2 (FRI), Day 3 (SAT), Day 4 (SUN), Day 5 (MON), Day 6 (TUE), Day 7 (WED)
(Homogeneous) Poisson Process
N(t + τ) − N(t) describes the number of events in the time interval (t, t + τ].
The mean and the variance are equal:
E[N(τ)] = V[N(τ)] = λ · τ
(Homogeneous) Poisson Process
Properties of the Poisson process:
  Arrivals occur one at a time (no simultaneous arrivals)
  {N(τ), τ ≥ 0} has stationary increments, which means N(t) − N(s) has the same distribution as N(t − s)
CDF of the Exponential Distribution
The first arrival occurring after time t MEANS that there are no arrivals in the interval [0, t]. As a consequence:
p(A1 > t) = p(N(t) = 0) = exp(−λ · t)
p(A1 ≤ t) = 1 − p(A1 > t) = 1 − exp(−λ · t)
The inter-arrival times of a Poisson process are exponentially distributed and independent, with mean 1/λ.
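This connection can be checked by simulation: drawing exponential inter-arrival times and verifying that their sample mean approaches 1/λ. A minimal sketch (the rate λ = 2 is an arbitrary example value):

```python
import random

# Inter-arrival times of a homogeneous Poisson process with rate lam
# are Exponential(lam); their sample mean should approach 1/lam.
random.seed(42)
lam = 2.0                                  # example rate: arrivals per unit time
n = 200_000
gaps = [random.expovariate(lam) for _ in range(n)]
sample_mean = sum(gaps) / n
print(sample_mean)                         # close to 1/lam = 0.5
```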
Splitting and Pooling
Splitting: if each arrival of N(t) ~ Poi(λ) is routed to stream 1 with probability p and to stream 2 with probability 1 − p, then
  N1(t) ~ Poi(λp) and N2(t) ~ Poi(λ(1 − p))
Pooling: merging two independent Poisson streams N1(t) ~ Poi(λ1) and N2(t) ~ Poi(λ2) gives
  N(t) = N1(t) + N2(t) ~ Poi(λ1 + λ2)
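Both operations can be illustrated by simulation; a sketch under example parameter values (λ = 10, p = 0.3 for splitting; λ1 = 4, λ2 = 6 for pooling), checking only the resulting mean counts:

```python
import random

random.seed(1)

def poisson_count(lam):
    """Number of arrivals in one unit of time, built from exponential gaps."""
    t, n = random.expovariate(lam), 0
    while t <= 1.0:
        n += 1
        t += random.expovariate(lam)
    return n

lam, p, trials = 10.0, 0.3, 20_000
# Splitting (thinning): keep each arrival independently with probability p
split_counts = [sum(random.random() < p for _ in range(poisson_count(lam)))
                for _ in range(trials)]
# Pooling: add the counts of two independent Poisson streams
pooled_counts = [poisson_count(4.0) + poisson_count(6.0) for _ in range(trials)]

print(sum(split_counts) / trials)    # close to lam * p = 3.0
print(sum(pooled_counts) / trials)   # close to 4.0 + 6.0 = 10.0
```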
Modeling of Random Number of Arrivals/Events
Poisson Distribution
Non-Homogeneous Poisson Process
Non-Homogeneous (Non-stationary) Poisson Process (NSPP)
The number of cars that cross the intersection of King Fahd Road and Al-Ourouba Road is distributed according to a non-homogeneous Poisson process with a mean rate λ(t) defined as follows:

         80 cars/min if 8 am ≤ t < 9 am
λ(t) =   60 cars/min if 9 am ≤ t < 11 am
         50 cars/min if 11 am ≤ t < 3 pm (15:00)
         70 cars/min if 3 pm (15:00) ≤ t ≤ 5 pm (17:00)

Let us consider the time 8 am as t = 0.
Q1. Compute the average number of car arrivals by 11:30 am.
Q2. Determine the equation that gives the probability of having exactly 10000 car arrivals between 12 pm and 4 pm (16:00).
Q3. What are the distribution and the average (in seconds) of the inter-arrival time of two cars between 8 am and 9 am?
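For Q1, the expected number of arrivals by 11:30 am is the integral of λ(t) over [8:00, 11:30], which reduces to a piecewise sum over the rate table above; a minimal sketch:

```python
# Expected arrivals by 11:30 am = integral of lambda(t) from 8:00 to 11:30,
# computed segment by segment from the rate table (rates in cars/minute).
segments = [
    (60, 80),    # 8-9 am: 60 minutes at 80 cars/min
    (120, 60),   # 9-11 am: 120 minutes at 60 cars/min
    (30, 50),    # 11-11:30 am: 30 minutes at 50 cars/min
]
expected_arrivals = sum(minutes * rate for minutes, rate in segments)
print(expected_arrivals)   # 4800 + 7200 + 1500 = 13500 cars
```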
Continuous Probability Distributions
Uniform Distribution
Exponential Distribution
Normal (Gaussian) Distribution
Uniform Distribution
Continuous Uniform Distribution
The continuous uniform distribution is a family of probability distributions such that, for each member of the family, all intervals of the same length on the distribution's support are equally probable.
A random variable X is uniformly distributed on the interval [a, b], written U(a, b), if its PDF and CDF are:

PDF: f(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise
CDF: F(x) = 0 for x < a; (x − a)/(b − a) for a ≤ x < b; 1 for x ≥ b

Expected value: E[X] = (a + b)/2
Variance: V[X] = (b − a)²/12
Uniform Distribution
Properties
p(x1 ≤ X ≤ x2) is proportional to the length of the interval:
F(x2) − F(x1) = (x2 − x1)/(b − a)
Special case: the standard uniform distribution U(0,1).
Very useful for random-number generators in simulators.
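One reason U(0,1) matters in simulators is inverse-transform sampling: a uniform draw pushed through an inverted CDF yields a draw from that distribution. A sketch for the exponential case (λ = 0.5 is an arbitrary example value):

```python
import random
from math import log

# Inverse-transform sampling: for Exponential(lam), inverting
# F(x) = 1 - exp(-lam * x) gives x = -ln(1 - u) / lam for u ~ U(0,1).
random.seed(7)
lam = 0.5                                  # example rate (assumption)
n = 100_000
samples = [-log(1 - random.random()) / lam for _ in range(n)]
print(sum(samples) / n)                    # sample mean close to 1/lam = 2.0
```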
Exponential Distribution
Exponential Distribution
PDF: f(x) = λ · exp(−λx) for x ≥ 0, and 0 otherwise
CDF: F(x) = 1 − exp(−λx) for x ≥ 0, and 0 otherwise
Expected value: E[X] = 1/λ = μ
Variance: V[X] = 1/λ² = μ²
Exponential Distribution
Example with μ = 20 (λ = 1/20):
f(x) = (1/20) · exp(−x/20) for x ≥ 0, and 0 otherwise
F(x) = 1 − exp(−x/20) for x ≥ 0, and 0 otherwise
Exponential Distribution
The memoryless property: in probability theory, memorylessness is a property of certain probability distributions, the exponential and the geometric, in which the probability of waiting an additional amount of time does not depend on how much time has already elapsed (the distribution retains no "memory" of earlier samples).
Formally, the memoryless property is: for all s and t greater than or equal to 0,
p(X > s + t | X > s) = p(X > t)
This means that future events do not depend on past events, but only on the present.
The fact that Pr(X > 40 | X > 30) = Pr(X > 10) does not mean that the events X > 40 and X > 30 are independent; i.e. it does not mean that Pr(X > 40 | X > 30) = Pr(X > 40).
Example: Exponential Distribution
Suppose the repair time X (in hours) is exponentially distributed with mean μ = 3, i.e. λ = 1/3.
The probability that the repair takes between 2 and 3 hours:
p(2 ≤ X ≤ 3) = F(3) − F(2) = 0.145
The probability that the repair lasts for another hour given that it has already been going on for 2.5 hours:
Using the memoryless property of the exponential distribution, we have:
p(X > (2.5 + 1) | X > 2.5) = p(X > 1) = 1 − p(X ≤ 1) = exp(−1/3) = 0.717
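The memoryless step in this example can be verified numerically; a minimal sketch for the repair-time setting (mean 3 hours, λ = 1/3):

```python
from math import exp

# Numerical check of the memoryless property for an exponential random variable.
lam = 1 / 3                            # repair time: mean 3 hours

def survival(t):
    """P(X > t) for Exponential(lam)."""
    return exp(-lam * t)

s, t = 2.5, 1.0
lhs = survival(s + t) / survival(s)    # P(X > s+t | X > s)
rhs = survival(t)                      # P(X > t)
print(lhs, rhs)                        # both equal exp(-1/3), about 0.717
```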
Normal (Gaussian) Distribution
Normal Distribution
Evaluating the distribution:
Independent of μ and σ, using the standard normal distribution Z ~ N(0, 1):
Φ(z) = ∫_{−∞}^{z} (1/√(2π)) · e^{−t²/2} dt
Transformation of variables: let Z = (X − μ)/σ
F(x) = P(X ≤ x) = P(Z ≤ (x − μ)/σ)
     = ∫_{−∞}^{(x−μ)/σ} (1/√(2π)) · e^{−z²/2} dz
     = ∫_{−∞}^{(x−μ)/σ} φ(z) dz = Φ((x − μ)/σ)
Normal Distribution
Example: the time required to load an oceangoing vessel, X, is distributed as N(12, 4), i.e. μ = 12 and σ = 2.
The probability that the vessel is loaded in less than 10 hours:
F(10) = Φ((10 − 12)/2) = Φ(−1) = 0.1587
Using the symmetry property, Φ(−1) is the complement of Φ(1): Φ(−1) = 1 − Φ(1).
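The loading-time probability can be evaluated without tables using the error function, via the identity Φ(z) = (1 + erf(z/√2))/2; a minimal sketch:

```python
from math import erf, sqrt

# Normal CDF through the standard normal Phi, using math.erf.
def normal_cdf(x, mu, sigma):
    z = (x - mu) / sigma                   # transformation to standard normal
    return (1 + erf(z / sqrt(2))) / 2      # Phi(z)

# Loading-time example: X ~ N(12, 4), i.e. mu = 12, sigma = 2
print(round(normal_cdf(10, 12, 2), 4))     # Phi(-1) = 0.1587
```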
Empirical Distributions
Empirical Distributions
In statistics, an empirical distribution function is a cumulative probability distribution function that concentrates probability 1/n at each of the n numbers in a sample.
Let X1, X2, …, Xn be iid random variables with CDF equal to F(x).
The empirical distribution function Fn(x) based on the sample X1, X2, …, Xn is a step function defined by
Fn(x) = (number of elements in the sample ≤ x) / n = (1/n) · Σ_{i=1}^{n} I(Xi ≤ x)
where I(A) is the indicator of event A: I(Xi ≤ x) = 1 if Xi ≤ x, and 0 otherwise.
For a fixed value x, I(Xi ≤ x) is a Bernoulli random variable with parameter p = F(x); hence n·Fn(x) is a binomial random variable with mean n·F(x) and variance n·F(x)·(1 − F(x)).
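The step-function definition of Fn(x) translates directly into code; a minimal sketch with a small made-up sample:

```python
# Empirical distribution function F_n(x): the fraction of the sample <= x,
# i.e. a step function putting probability 1/n on each sample point.
def empirical_cdf(sample, x):
    return sum(1 for xi in sample if xi <= x) / len(sample)

sample = [2.1, 3.5, 3.5, 4.0, 5.2]         # example data (assumption)
print(empirical_cdf(sample, 3.5))          # 3 of 5 points are <= 3.5 -> 0.6
print(empirical_cdf(sample, 1.0))          # below all points -> 0.0
print(empirical_cdf(sample, 6.0))          # above all points -> 1.0
```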
End