Probability Review

MIE1605 Lecture 1

September 19, 2017


Course Overview

SYLLABUS

• Meeting Times: Tuesdays, 09:00 AM-12:00 PM
• Office Hours:
  ◦ Monday: 04:30-06:00 PM
  ◦ By appointment
• Course Homepage:
  ◦ Blackboard
  ◦ I will post outlines of lecture slides (if any) there before class.
  ◦ Homework assignments and grades will be posted there.
• Textbooks:
  ◦ Adventures in Stochastic Processes, S. Resnick
  ◦ Probability and Random Processes, G. Grimmett, D. Stirzaker
• Grading:
  ◦ 5× Homework: 20%
  ◦ 1× Midterm: 40%
  ◦ 1× Final: 40%
M Cevik MIE1605 - Probability Review 2 / 96

TENTATIVE SCHEDULE

Dates            Topic
12/09 - 19/09    Probability Review
26/09 - 10/10    Discrete Time Markov Chains
17/10 - 24/10    Poisson Processes
30/10            Midterm Exam, 5.15 - 7.30 PM
31/10 - 07/11    Continuous Time Markov Chains
14/11 - 21/11    Renewal Processes
28/11 - 05/12    Brownian Motion and Martingales
12/12            Final Exam, 9.00 - 11.30 PM


OUTLINE

1 Course Overview
2 Probability Basics
    Introduction
    Random variables
    Sum of Independent RVs
    Functions of RVs
3 Limit theorems
4 Generating Functions
    Moment Generating Functions
    Probability Generating Functions
5 Random Sums
6 Simple Branching Process
7 Simple Random Walk


Probability Basics Introduction

PROBABILITY SPACE

• A measure on a set is a systematic way to assign a number to each suitable subset of that set.
• Let X be a set and Σ a σ-field over X. A function µ from Σ to the extended real number line is called a measure if it satisfies the following properties:
  (1) Non-negativity: for all E ∈ Σ, µ(E) ≥ 0.
  (2) Null empty set: µ(∅) = 0.
  (3) Countable additivity: for all countable collections $\{E_i\}_{i=1}^{\infty}$ of pairwise disjoint sets in Σ, $\mu\left(\bigcup_{k=1}^{\infty} E_k\right) = \sum_{k=1}^{\infty} \mu(E_k)$.
• A triple (X, Σ, µ) is called a measure space.


PROBABILITY SPACE

• A probability space is a measure space with a probability measure, and a probability measure is a measure with total measure one, i.e., µ(X) = 1.
• A probability space consists of three parts:
  (1) A sample space, Ω, which is the set of all possible outcomes.
  (2) A set of events, F, where each event is a set containing zero or more outcomes.
  (3) A probability measure, P, specifying the assignment of probabilities to the events.


PROBABILITY SPACE

• Sample space Ω: the set of all possible outcomes $\omega_i$ of an experiment.

Ex: Toss a coin until a head appears.
  $\omega_i = T \ldots T H$ (i − 1 tails followed by a head)
  $\Omega = \{H, TH, TTH, TTTH, \ldots\} = \{\omega_1, \omega_2, \ldots\}$

⇒ An event is a set of outcomes of the experiment.


PROBABILITY SPACE

• Set of events, F: a collection F of subsets of Ω (so $F \subseteq 2^{\Omega}$) is called a σ-field if
  ◦ ∅ = Ω̄ ∈ F and Ω ∈ F,
  ◦ if $A_1, A_2, \ldots \in F$, then $\bigcup_{i=1}^{\infty} A_i \in F$ (closed under countable unions),
  ◦ if A ∈ F, then Ā = Ω\A ∈ F (closed under complements).

Ex: F = {∅, {1, 2, 3, 4}, {1, 2}, {3, 4}, {1, 3}, {2, 4}} is not a σ-field, because {1, 2} ∪ {1, 3} = {1, 2, 3} is not in F.

• Probability measure, P: a function P : F → [0, 1] such that
  (a) P(∅) = 0, P(Ω) = 1,
  (b) if $A_1, A_2, \ldots \in F$ with $A_i \cap A_j = \emptyset$ for all i ≠ j, then $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$.
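
The σ-field example above can be checked by brute force. The sketch below is only an illustration for finite sample spaces; the helper name `is_sigma_field` and the second family `smaller_family` are assumptions introduced here, not part of the lecture.

```python
from itertools import combinations

def is_sigma_field(omega, family):
    """Check the sigma-field axioms for a family of subsets of a finite set."""
    fam = {frozenset(s) for s in family}
    omega = frozenset(omega)
    if frozenset() not in fam or omega not in fam:
        return False
    # Closed under complements.
    if any(omega - a not in fam for a in fam):
        return False
    # On a finite set, closure under pairwise unions implies closure
    # under countable unions.
    return all(a | b in fam for a, b in combinations(fam, 2))

omega = {1, 2, 3, 4}
slide_family = [set(), {1, 2, 3, 4}, {1, 2}, {3, 4}, {1, 3}, {2, 4}]
smaller_family = [set(), {1, 2, 3, 4}, {1, 2}, {3, 4}]  # hypothetical comparison

print(is_sigma_field(omega, slide_family))    # False: {1, 2, 3} is missing
print(is_sigma_field(omega, smaller_family))  # True
```

Dropping {1, 3} and {2, 4} from the slide's family restores closure under unions, which is why the smaller family passes.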


RANDOM VARIABLES

• A function X : Ω → T is called a (T-valued) random variable (RV) (e.g., T = R). Let X be a (discrete) RV whose range is {0, 1, 2, ...}. Then
  $P(X = k) = p_k$, k = 0, 1, 2, ...
  If X is a proper RV, then $\sum_{k=0}^{\infty} p_k = 1$, with $p_k \ge 0$ for all k.
• Examples:
  ◦ Discrete RV
  ◦ Continuous RV
  ◦ Mixed RV


DISTRIBUTION FUNCTIONS

• Probability density function (pdf): a continuous RV X has density $f_X$, where $f_X$ is a non-negative Lebesgue-integrable function, if
  $P[a \le X \le b] = \int_a^b f_X(x)\,dx$
• Probability mass function (pmf): a discrete RV X : S → A has a pmf $f_X$ : A → [0, 1] defined as
  $f_X(x) = P(X = x) = P(\{s \in S : X(s) = x\})$
• Cumulative distribution function (cdf):
  $F_X(x) = P(X \le x)$, $P(a < X \le b) = F_X(b) - F_X(a)$
  ◦ Continuous RV: $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
  ◦ Discrete RV: $F(x) = \sum_{x_i \le x} P(X = x_i) = \sum_{x_i \le x} p(x_i)$

BAYES' THEOREM

• $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)}$
• Let the $A_i$'s constitute a partition of the sample space S. Then
  $P(A_i \mid B) = \frac{P(B \mid A_i)\,P(A_i)}{\sum_j P(B \mid A_j)\,P(A_j)}$, where $P(B) = \sum_j P(B \mid A_j)\,P(A_j)$
• Example:
  In a city, 51% of the adults are males. Also, 9.5% of males smoke cigars, whereas 1.7% of females smoke cigars. If a randomly selected adult smokes cigars, what is the probability that the selected subject is a male?
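
The cigar example can be worked through numerically with the partition form of Bayes' theorem. This is a minimal sketch; the function name `posterior` is an assumption, and it assumes the remaining 49% of adults are female.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over a partition: returns P(A_i | B) for each i."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # law of total probability, P(B)
    return [j / total for j in joint]

# Partition: A_1 = male, A_2 = female (assumed 49%); B = smokes cigars.
priors = [0.51, 0.49]
likelihoods = [0.095, 0.017]
p_male_given_smoker, p_female_given_smoker = posterior(priors, likelihoods)
print(round(p_male_given_smoker, 4))  # 0.8533
```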


JOINT DISTRIBUTION

• Joint pmf: p(x, y) = P(X = x, Y = y)
  Joint pdf: f(x, y), x ∈ A, y ∈ B
• Single RV X: cumulative distribution function
  $F_X(x) = P(X \le x)$
• Two RVs X, Y: joint cdf
  $F(x, y) = P(X \le x, Y \le y)$
  ⇒ Similar definition for more than two RVs.
• Continuous RVs: joint probabilities from the joint pdf
  $P(X \in A, Y \in B) = \int_B \int_A f(x, y)\,dx\,dy$


MARGINAL DISTRIBUTION

• Relation between joint and individual RV densities:
  $P(X \in A) = P(X \in A, Y \in \mathbb{R}) = \int_A \left( \int_{-\infty}^{\infty} f(x, y)\,dy \right) dx$
  ⇒ $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$
• Similarly for discrete RVs, the marginal mass functions are
  $p_X(x) = \sum_{y} p(x, y)$, $p_Y(y) = \sum_{x} p(x, y)$


MARGINAL DISTRIBUTION

• Ex: Consider the following joint pdf:
  $f(x, y) = \begin{cases} e^{-(x+y)}, & \text{if } x \ge 0,\ y \ge 0 \\ 0, & \text{otherwise} \end{cases}$
  ◦ This is a joint density (it integrates to 1)
  ◦ Calculation of the marginals:
  ◦ X and Y are exponentially distributed with mean 1
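
The slide leaves the marginal calculation blank; one way to fill it in, using only the definition of the marginal density, is:

```latex
\begin{align*}
f_X(x) &= \int_{0}^{\infty} e^{-(x+y)}\,dy = e^{-x}\int_{0}^{\infty} e^{-y}\,dy = e^{-x}, \quad x \ge 0,\\
f_Y(y) &= \int_{0}^{\infty} e^{-(x+y)}\,dx = e^{-y}, \quad y \ge 0,\\
\int_{0}^{\infty}\!\!\int_{0}^{\infty} e^{-(x+y)}\,dy\,dx
  &= \left(\int_{0}^{\infty} e^{-x}\,dx\right)\left(\int_{0}^{\infty} e^{-y}\,dy\right) = 1.
\end{align*}
```

Each marginal is the Expo(1) density, which has mean 1.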


CONDITIONAL PROBABILITY DISTRIBUTIONS

• Conditional probability mass function:
  $p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{p(x, y)}{p_Y(y)}$
• Conditional probability density function:
  $f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}$
• Ex: Suppose the joint pmf of X and Y is given by
  p(1, 1) = 0.5, p(1, 2) = 0.1, p(2, 1) = 0.1, p(2, 2) = 0.3.
  Find the pmf of X given Y = 1.
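
The example can be computed directly from the definition of the conditional pmf. This is a small sketch using exact fractions; the helper name `conditional_pmf_x_given_y` is an assumption made for illustration.

```python
from fractions import Fraction

# Joint pmf from the example, stored exactly.
joint = {(1, 1): Fraction(5, 10), (1, 2): Fraction(1, 10),
         (2, 1): Fraction(1, 10), (2, 2): Fraction(3, 10)}

def conditional_pmf_x_given_y(joint, y):
    """p_{X|Y}(x|y) = p(x, y) / p_Y(y)."""
    p_y = sum(p for (_, yy), p in joint.items() if yy == y)  # marginal p_Y(y)
    return {x: p / p_y for (x, yy), p in joint.items() if yy == y}

cond = conditional_pmf_x_given_y(joint, 1)
print(cond[1], cond[2])  # 5/6 1/6
```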


INDEPENDENCE

• P(A|B) = P(A) ⇒ A and B (events) are independent, A ⊥ B.
• If A ⊥ B, then A ⊥ B̄.
• RVs X and Y are independent if one of the following holds for all x, y:
  ◦ $F_{X,Y}(x, y) = F_X(x)\,F_Y(y)$
  ◦ $f_{X,Y}(x, y) = f_X(x)\,f_Y(y)$
• Ex1: A coin is tossed n times. Let X be the number of heads and Y the number of tails. X and Y are dependent, since X + Y = n.
• Ex2: X, Y having joint density exp[−(x + y)] (for x, y ≥ 0) are independent.


INDEPENDENCE

Ex: Discrete RVs that are not independent:

• X = sum of 2 flips, Y = difference of 2 flips
• Joint mass function:

  Outcome   X    Y    Prob
  00        0    0    0.25
  01        1   -1    0.25
  10        1    1    0.25
  11        2    0    0.25

  $p(0, 0) = 0.25$, $p_X(0) = 0.25$, $p_Y(0) = 0.5$ ⇒ $p(0, 0) \ne p_X(0)\,p_Y(0)$
  $p(2, 1) = 0$, $p_X(2) = 0.25$, $p_Y(1) = 0.25$ ⇒ $p(2, 1) \ne p_X(2)\,p_Y(1)$


CONDITIONAL INDEPENDENCE

• R and B are conditionally independent given Y
  ⇐⇒ P(R ∩ B | Y) = P(R | Y) P(B | Y)
  or equivalently ⇐⇒ P(R | B ∩ Y) = P(R | Y)
• Independence of R and B does not imply that R and B are conditionally independent. Likewise, conditional independence of R and B does not imply that they are independent.
• Ex: Consider a coin which can be fair or biased (favoring H):
  ◦ A: First coin toss is H
  ◦ B: Second coin toss is H
  ◦ C: Coin is biased
  ⇒ A and B are dependent.
  ⇒ A and B are conditionally independent given C.

BASIC PROBABILITY EXAMPLE

• P(A) = ?, P(B) = ?, P(C) = ?
• P(A ∩ C) = ?, P(B ∩ C) = ?
• P(A ∩ B ∩ C) = ?
• P(A|C) = ?, P(B|C) = ?
• P(A|B ∩ C) = ?, P(A ∩ B|C) = ?


MEAN (EXPECTATION)

• For an RV X, the mean (or expectation) is defined to be
  ◦ $E[X] = \sum_{x \in \mathbb{R}} x\,p(x)$, $E[g(X)] = \sum_{x \in \mathbb{R}} g(x)\,p(x)$ (discrete RV)
  ◦ $E[X] = \int_{\mathbb{R}} x f(x)\,dx$, $E[g(X)] = \int_{\mathbb{R}} g(x) f(x)\,dx$ (continuous RV)
• Joint distributions:
  $E[aX] = \int_{\mathbb{R}}\int_{\mathbb{R}} a x f(x, y)\,dy\,dx = a \int_{\mathbb{R}} x f_X(x)\,dx = aE[X]$
  $E[aX + bY] = \int_{\mathbb{R}}\int_{\mathbb{R}} (ax + by) f(x, y)\,dy\,dx = aE[X] + bE[Y]$
  $E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$; the $X_i$'s need not be independent.



MEAN (EXPECTATION)

• Moments of an RV:
  The rth moment of X is $E[X^r]$.
  The rth central moment of X is $E[(X - E[X])^r]$.
• For non-negative RVs, an alternative way of computing the expectation:
  ◦ Discrete (integer-valued) RV: $E[X] = \sum_{k=0}^{\infty} P(X > k)$
  ◦ Continuous RV: $E[X] = \int_{0}^{\infty} P(X > k)\,dk$


CONDITIONAL EXPECTATION

• Conditional expectation of X given Y = y:
  ◦ $E[X \mid Y = y] = \sum_{x \in \mathbb{R}} x\,p_{X|Y}(x \mid y)$ (discrete RV)
  ◦ $E[X \mid Y = y] = \int_{x \in \mathbb{R}} x\,f_{X|Y}(x \mid y)\,dx$ (continuous RV)
• Computing expectation by conditioning (continuous RV):
  $E[X] = \int_{-\infty}^{\infty} E[X \mid Y = y]\,f_Y(y)\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x\,f_{X|Y}(x \mid y)\,f_Y(y)\,dx\,dy$
• Chain expansion: $E[X] = E_Y\left[E_{X|Y}[X \mid Y]\right]$



CONDITIONAL EXPECTATION

• Ex: Suppose you arrive at a post office having two clerks at a moment when both are busy, but there is no one else waiting in line. You will enter service when either clerk becomes free. If service times for clerk i (for i = 1, 2) are exponential with rate $\lambda_i$, find E[T], where T is the total amount of time that you will spend in the post office (including the time you spend in service). Note that the two servers are not identical.
  P.S. X ∼ Expo(λ) ⇒ $f_X(x) = \lambda e^{-\lambda x}$, x ≥ 0, and $F_X(x) = 1 - e^{-\lambda x}$, x ≥ 0

Solution (with $R_i$ the remaining service time of the customer at clerk i):
  $E[T] = E[T \mid R_1 < R_2]\,P(R_1 < R_2) + E[T \mid R_2 < R_1]\,P(R_2 < R_1) = \cdots = \frac{3}{\lambda_1 + \lambda_2}$


COVARIANCE

• Measure of how much two RVs change together
• Covariance of RVs X and Y:
  $\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])]$
  Expanding the product yields
  $\mathrm{Cov}(X, Y) = \cdots = E[XY] - E[X]E[Y]$
• Cov(X, X) is the variance of X:
  $\mathrm{Var}(X) = E[X^2] - (E[X])^2$
• If X and Y are independent, then
  $E[XY] = \int_{\mathbb{R}}\int_{\mathbb{R}} x y f(x, y)\,dy\,dx = \int_{\mathbb{R}}\int_{\mathbb{R}} x y f_X(x) f_Y(y)\,dy\,dx = E[X]E[Y]$
  ⇒ Cov(X, Y) = 0.

COVARIANCE PROPERTIES

• $\mathrm{Cov}(cX, Y) = c\,\mathrm{Cov}(X, Y)$
• $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$
• $\mathrm{Cov}(X, Y + Z) = \mathrm{Cov}(X, Y) + \mathrm{Cov}(X, Z)$
• $\mathrm{Cov}\left(\sum_i X_i, \sum_j Y_j\right) = \sum_i \sum_j \mathrm{Cov}(X_i, Y_j)$
• $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2 \sum_{i=1}^{n} \sum_{j<i} \mathrm{Cov}(X_i, X_j)$
• $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i)$ if the $X_i$'s are independent


COVARIANCE PROPERTIES

• Ex: Flip a fair coin 3 times. Let X be the number of heads in the first 2 flips and let Y be the number of heads in the last 2 flips (so there is overlap on the middle flip). Compute Cov(X, Y).
  Solution: Cov(X, Y) = 1/4
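
The value 1/4 can be confirmed by enumerating the 8 equally likely outcomes. The sketch below is an illustration only; the helper `expect` and the 0/1 encoding of tails/heads are assumptions made here.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product([0, 1], repeat=3))  # 1 = heads; 8 equally likely outcomes
prob = Fraction(1, len(outcomes))

def expect(g):
    """Expectation of g over the uniform distribution on outcomes."""
    return sum(prob * g(w) for w in outcomes)

X = lambda w: w[0] + w[1]   # heads in the first two flips
Y = lambda w: w[1] + w[2]   # heads in the last two flips
cov = expect(lambda w: X(w) * Y(w)) - expect(X) * expect(Y)
print(cov)  # 1/4
```

The result equals the variance of the shared middle flip, as the bilinearity property Cov(X, Y + Z) = Cov(X, Y) + Cov(X, Z) suggests.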


STANDARD DEVIATION / CV

• Standard deviation of the RV X is $\sigma(X) = \sqrt{\mathrm{Var}(X)}$
  ⇒ Standard deviation and variance are measures of dispersion about the mean
• The coefficient of variation (CV) of the RV X is σ(X)/E[X]
  ⇒ This is a way of normalizing the description of dispersion (it removes a scale factor)
  ⇒ Not always well-defined (what if E[X] = 0?)


CORRELATION

• Correlation of RVs X and Y:
  $\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma(X)\,\sigma(Y)}$
  ⇒ We always have −1 ≤ ρ(X, Y) ≤ 1
• Correlation has the sign of the covariance, but it is dimensionless.
• Ex: A box contains red, white and black balls. We draw balls from the box n times, where at each draw we note the ball color and then return the ball to the box. Let $X_1$ and $X_2$ be the number of red balls and white balls drawn, respectively, and let $p_1$ and $p_2$ be the probabilities of drawing a red and a white ball. Find $\rho(X_1, X_2)$.
  Solution: $\rho(X_1, X_2) = \frac{-n p_1 p_2}{\sqrt{n p_1 (1 - p_1)}\,\sqrt{n p_2 (1 - p_2)}}$


CORRELATION AND INDEPENDENCE

• If X and Y are independent ⇒ Cov(X, Y) = 0 ⇒ ρ(X, Y) = 0. Then X, Y are uncorrelated.
• Beware!
  If X and Y are uncorrelated, this does not imply that they are independent.
  ⇒ Exception: jointly normally distributed random variables
• Ex:
  ◦ X = sum of 2 coin flips, and Y = difference of the 2 flips
  ◦ E[XY] = 0 = E[Y], but E[X] = 1
  ◦ So, Cov(X, Y) = E[XY] − E[X]E[Y] = 0
  ◦ But we know X and Y are not independent
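
Both claims in this example (zero covariance, yet dependence) can be checked by enumerating the four outcomes. This is a small sketch; the helpers `expect` and `pmf` are names introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product([0, 1], repeat=2))  # two fair flips, each outcome 1/4
prob = Fraction(1, 4)

def expect(g):
    return sum(prob * g(a, b) for a, b in outcomes)

def pmf(g, value):
    return sum(prob for a, b in outcomes if g(a, b) == value)

X = lambda a, b: a + b   # sum of the two flips
Y = lambda a, b: a - b   # difference of the two flips
cov = expect(lambda a, b: X(a, b) * Y(a, b)) - expect(X) * expect(Y)
joint_00 = sum(prob for a, b in outcomes if X(a, b) == 0 and Y(a, b) == 0)

print(cov)                               # 0
print(joint_00, pmf(X, 0) * pmf(Y, 0))   # 1/4 1/8
```

The covariance vanishes, but p(0, 0) differs from the product of the marginals, so X and Y are uncorrelated without being independent.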



Probability Basics Random variables

BINOMIAL RV

• Binomial RVs are used to describe the number of successes in n Bernoulli trials. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed (iid) RVs with
  $X_i = 1$ with prob. p, and 0 with prob. (1 − p).
  ⇒ $X_1, X_2, \ldots, X_n$ are called Bernoulli trials
  ⇒ $\sum_{i=1}^{n} X_i \sim \mathrm{Bin}(n, p)$.
• Let X ∼ Bin(n, p). Then,
  ◦ $P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$, k = 0, 1, 2, ..., n
  ◦ $E[X] = np$, $\mathrm{Var}(X) = \mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \cdots = np(1 - p)$


GEOMETRIC RV

• Number of failures until first success, i.e., X represents the number of failures in successive Bernoulli trials until the first success occurs.
  ◦ $P(X = k) = (1 - p)^k p$, k = 0, 1, 2, ...
  ◦ $P(X \le k) = 1 - P(X > k) = 1 - (1 - p)^{k+1}$, k = 0, 1, 2, ...
  ◦ $E[X] = \frac{1 - p}{p}$, $\mathrm{Var}(X) = \frac{1 - p}{p^2}$
• Alternatively, X may represent the number of trials until the first success.
  ◦ $P(X = k) = (1 - p)^{k-1} p$, k = 1, 2, ...
  ◦ $E[X] = \frac{1}{p}$, $\mathrm{Var}(X) = \frac{1 - p}{p^2}$
  ◦ Assume X ∼ Geom($p_1$) and Y ∼ Geom($p_2$) are independent geometric RVs.
    - P(X < Y)?
    - What is the distribution of min(X, Y)?

NEGATIVE BINOMIAL RV

• Number of trials until the rth success: $N_r$
  ⇒ $N_r$ corresponds to the arrival time of the rth success, i.e., the sum of the first r inter-arrival times.
  ◦ P(rth success on nth trial)
    = P(r − 1 successes in n − 1 trials) P(success on nth trial)
    $= \binom{n-1}{r-1} p^{r-1} (1 - p)^{n-r}\,p$, n = r, r + 1, ...
  ◦ $E[N_r] = r/p$, $\mathrm{Var}(N_r) = r(1 - p)/p^2$
• Number of failures before the rth success: $K_r$
  ◦ P(k failures before rth success)
    = P(k failures in k + r − 1 trials) P(success on (k + r)th trial)
    $= \binom{k+r-1}{k} p^{r-1} (1 - p)^{k}\,p$, k = 0, 1, 2, ...
  ◦ $E[K_r] = r/p - r$, $\mathrm{Var}(K_r) = r(1 - p)/p^2$

POISSON RV

• Used to model the number of occurrences or events over a time interval.
  ◦ $P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$, k = 0, 1, 2, ...
  ◦ E[X] = λ
  ◦ Var(X) = λ
• A Poisson RV might be used to approximate a binomial RV when n is large and p is small.
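
The quality of the Poisson approximation can be inspected numerically by matching λ = np. The values n = 1000 and p = 0.003 below are illustrative assumptions, not from the lecture.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.003          # illustrative values: large n, small p
lam = n * p                 # match the means
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print(max_gap)              # largest pointwise difference over k = 0..20
```

The largest pointwise gap stays well below 0.01 for these parameters; repeating the comparison with a larger p shows the approximation degrading.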


UNIFORM RV (DISCRETE)

• Parameters a, b ∈ Z, b ≥ a.
• pmf: $P(X = k) = \frac{1}{b - a + 1}$, k ∈ {a, a + 1, ..., b − 1, b}
• cdf: $F(k; a, b) = \frac{\lfloor k \rfloor - a + 1}{b - a + 1}$, a ≤ k ≤ b
• $E[X] = \frac{a + b}{2}$, $\mathrm{Var}(X) = \frac{(b - a + 1)^2 - 1}{12}$
• Ex: Roll a die. E[X] = 7/2, Var(X) = 35/12.


NORMAL (GAUSSIAN) RV

• $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}$, x ∈ R
• $F(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right]$, x ∈ R
• If X ∼ N(µ, σ²), then Y = αX + β ∼ N(αµ + β, α²σ²)
• Standard normal distribution: Z ∼ N(0, 1) → µ = 0, σ = 1
  ⇒ If X ∼ N(µ, σ²), then X = µ + σZ
• pdf of the standard normal distribution: $\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
  ⇒ pdf of X ∼ N(µ, σ²): $f_X(x) = \frac{1}{\sigma}\,\phi\left(\frac{x - \mu}{\sigma}\right)$
• cdf of the standard normal distribution: $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$
  ⇒ cdf of X ∼ N(µ, σ²): $F_X(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$

EXPONENTIAL RV

• $f(x) = \lambda e^{-\lambda x}$, 0 ≤ x < ∞
• $F(x) = 1 - e^{-\lambda x}$, 0 ≤ x < ∞
• E[X] = 1/λ, Var(X) = 1/λ²
• Let X ∼ Expo($\lambda_1$) and Y ∼ Expo($\lambda_2$) be independent:
  $P(X < Y) = \int_{0}^{\infty} P(X < k \mid Y = k)\,f_Y(k)\,dk = \int_{0}^{\infty} (1 - e^{-\lambda_1 k})\,\lambda_2 e^{-\lambda_2 k}\,dk = \frac{\lambda_1}{\lambda_1 + \lambda_2}$
  $P(\min(X, Y) \le k) = 1 - P(\min(X, Y) > k) = 1 - P(X > k)\,P(Y > k) = 1 - e^{-(\lambda_1 + \lambda_2)k}$
  ⇒ min(X, Y) ∼ Expo($\lambda_1 + \lambda_2$)

EXPONENTIAL RV

• Ex: Two individuals, A and B, both require kidney transplants. If A does not receive a new kidney, then A will die after an exponential time with rate $\mu_A$, and B after an exponential time with rate $\mu_B$. New kidneys arrive in accordance with a Poisson process having rate λ. It has been decided that the first kidney will go to A (or to B if B is alive and A is not at that time) and the next one to B (if still living).
  (a) What is the probability that A obtains a new kidney?
  (b) What is the probability that B obtains a new kidney?
  Soln:
  (a) $\frac{\lambda}{\lambda + \mu_A}$
  (b) $\frac{\lambda(\mu_A + \lambda)}{(\mu_B + \lambda)(\lambda + \mu_A + \mu_B)}$


ERLANG RV

• $f(x; k, \lambda) = \frac{\lambda^k x^{k-1} e^{-\lambda x}}{(k - 1)!}$, 0 ≤ x < ∞
• $F(x) = 1 - \sum_{n=0}^{k-1} \frac{1}{n!}\,e^{-\lambda x} (\lambda x)^n$, 0 ≤ x < ∞
• $E[X] = \frac{k}{\lambda}$, $\mathrm{Var}(X) = \frac{k}{\lambda^2}$
• If X ∼ Erlang(k, λ) and a > 0 ⇒ aX ∼ Erlang(k, λ/a)
• If X ∼ Erlang($k_1$, λ) and Y ∼ Erlang($k_2$, λ) are independent
  ⇒ X + Y ∼ Erlang($k_1 + k_2$, λ)


GAMMA RV

• Gamma(α, β), α > 0: shape, β > 0: rate
• $f(x; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha-1} e^{-\beta x}$, 0 < x < ∞
• $F(x) = \frac{1}{\Gamma(\alpha)} \int_{0}^{\beta x} t^{\alpha-1} e^{-t}\,dt$, 0 < x < ∞
• Gamma function: $\Gamma(\alpha) = \int_{0}^{\infty} t^{\alpha-1} e^{-t}\,dt$, and Γ(n) = (n − 1)! if n is a positive integer
• $E[X] = \frac{\alpha}{\beta}$, $\mathrm{Var}(X) = \frac{\alpha}{\beta^2}$


GAMMA RV

• If $X_i \sim \mathrm{Gamma}(\alpha_i, \beta)$, i = 1, 2, ..., N, are independent
  ⇒ $\sum_{i=1}^{N} X_i \sim \mathrm{Gamma}\left(\sum_{i=1}^{N} \alpha_i, \beta\right)$
• If X ∼ Gamma(α, β) and c > 0 ⇒ cX ∼ Gamma(α, β/c)
• If X ∼ Gamma(1, β) ⇒ X ∼ Expo(β)
• If α is an integer, the gamma distribution is equivalent to the Erlang distribution. That is, if X ∼ Gamma(α, λ) ⇒ X ∼ Erlang(α, λ)


BETA RV

• Beta(α, β), α > 0: shape, β > 0: shape
• $f(x; \alpha, \beta) = \frac{x^{\alpha-1} (1 - x)^{\beta-1}}{B(\alpha, \beta)}$, 0 ≤ x ≤ 1
• $F(x) = \frac{B(x; \alpha, \beta)}{B(\alpha, \beta)}$, 0 ≤ x ≤ 1
• Beta functions: $B(a, b) = \int_{0}^{1} t^{a-1} (1 - t)^{b-1}\,dt$, $B(x; a, b) = \int_{0}^{x} t^{a-1} (1 - t)^{b-1}\,dt$
• $E[X] = \frac{\alpha}{\alpha + \beta}$, $\mathrm{Var}(X) = \frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$
• If X ∼ Gamma(α, θ) and Y ∼ Gamma(β, θ) are independent
  ⇒ X/(X + Y) has a beta distribution with parameters α and β.
• Beta(1, 1) = Unif(0, 1)


EXAMPLE: POSTERIOR DISTRIBUTION

• Posterior probability is the probability of the parameters θ given the evidence X:
  $p(\theta \mid X) = \frac{p(\theta)\,p(X \mid \theta)}{p(X)}$
  ◦ p(θ|X): posterior probability
  ◦ p(X|θ): likelihood function
  ◦ p(θ): prior probability
  ◦ $p(X) = \int p(X \mid \theta)\,p(\theta)\,d\theta$: normalization constant


EXAMPLE: POSTERIOR DISTRIBUTION

• Ex: We have the following prior distribution: P(λ = 0.01) = 0.3, P(λ = 0.03) = 0.2, P(λ = 0.05) = 0.4, P(λ = 0.1) = 0.1. Assume we observed 3 failures in 100 time units. Determine the posterior distribution of λ given the evidence, using a Poisson likelihood function.
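
One way to carry out this calculation is sketched below. It assumes λ is a failure rate per time unit, so the number of failures in 100 time units is Poisson with mean 100λ; the variable names are illustrative.

```python
from math import exp, factorial

prior = {0.01: 0.3, 0.03: 0.2, 0.05: 0.4, 0.1: 0.1}
failures, duration = 3, 100

def poisson_likelihood(rate):
    """P(3 failures in 100 time units | rate), assuming a Poisson count."""
    mean = rate * duration  # expected number of failures in the window
    return exp(-mean) * mean**failures / factorial(failures)

unnormalized = {lam: p * poisson_likelihood(lam) for lam, p in prior.items()}
evidence = sum(unnormalized.values())   # normalization constant p(X)
post = {lam: w / evidence for lam, w in unnormalized.items()}

for lam, p in post.items():
    print(lam, round(p, 4))
```

Under these assumptions the evidence shifts mass toward λ = 0.03 and λ = 0.05, the two rates whose expected counts (3 and 5) are closest to the observed 3 failures, and away from λ = 0.1.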


EXAMPLE: POSTERIOR DISTRIBUTION

• Ex: Let P have a beta distribution with parameters a and b,
  $f(p) = \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)}\,p^{a-1} (1 - p)^{b-1}$, 0 < p < 1.
  If we observe evidence of k failures in n trials (with P the per-trial failure probability), show that the posterior distribution is P ∼ Beta(a + k, b + n − k).
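
A sketch of the argument, assuming P is the per-trial failure probability and the trials are independent given P = p, so that the likelihood of k failures in n trials is binomial:

```latex
\begin{align*}
f(p \mid k) &\propto \underbrace{\binom{n}{k} p^{k} (1-p)^{n-k}}_{\text{likelihood}}
              \cdot \underbrace{\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, p^{a-1} (1-p)^{b-1}}_{\text{prior}} \\
            &\propto p^{(a+k)-1} (1-p)^{(b+n-k)-1}, \qquad 0 < p < 1 .
\end{align*}
```

The right-hand side is the kernel of a Beta(a + k, b + n − k) density, so normalizing gives the stated posterior.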


UNIFORM RV (CONTINUOUS)

• U(a, b), −∞ < a < b < ∞
• pdf:
  $f(x) = \begin{cases} \frac{1}{b - a}, & \text{if } x \in [a, b] \\ 0, & \text{otw} \end{cases}$
• cdf:
  $F(x) = \begin{cases} 0, & \text{if } x < a \\ \frac{x - a}{b - a}, & \text{if } x \in [a, b] \\ 1, & \text{if } x \ge b \end{cases}$
• $E[X] = (a + b)/2$, $\mathrm{Var}(X) = (b - a)^2/12$
• If X has a standard uniform distribution U(0, 1), then $Y = X^n \sim \mathrm{Beta}(1/n, 1)$ for n > 0


MEMORYLESS PROPERTY

$P(X > m + n \mid X \ge n) = \frac{P(X > m + n,\ X \ge n)}{P(X \ge n)} = \frac{P(X > m + n)}{P(X \ge n)} = P(X > m)$

• Discrete memorylessness: geometric distribution.
• Continuous memorylessness: exponential distribution.
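
The identity above can be verified exactly for a geometric RV. The sketch assumes the failures-counting version (support {0, 1, 2, ...}), for which the conditioning event X ≥ n is the natural one; the function names and p = 1/3 are illustrative.

```python
from fractions import Fraction

def geom_tail(k, p):
    """P(X > k) for X = number of failures before the first success."""
    return (1 - p) ** (k + 1)

def geom_at_least(n, p):
    """P(X >= n) for the same failures-counting geometric RV."""
    return (1 - p) ** n

p = Fraction(1, 3)
checks = [
    geom_tail(m + n, p) / geom_at_least(n, p) == geom_tail(m, p)
    for m in range(6) for n in range(6)
]
print(all(checks))  # True
```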


Probability Basics Sum of Independent RVs

SUM OF INDEPENDENT DISCRETE RVS

• For $X_1 \perp X_2$, we are interested in the distribution of $X_1 + X_2$ or, in general, if $X_1, \ldots, X_n$ are independent, what is the distribution of $\sum_{i=1}^{n} X_i$?
• Let X and Y be non-negative integer-valued RVs, X ⊥ Y.
  ◦ $P(X = i) = a_i$, $P(Y = i) = b_i$, i = 0, 1, 2, ...
  ◦ $\{X + Y = n\} = \bigcup_{i=0}^{n} \{X = i, Y = n - i\}$
  ◦ For i ≠ j, {X = i, Y = n − i} and {X = j, Y = n − j} are mutually exclusive (or disjoint); they cannot happen at the same time.
    - $P(X + Y = n) = \sum_{i=0}^{n} P(X = i, Y = n - i)$
      ⇒ $c_n = P(X + Y = n) = \sum_{i=0}^{n} a_i b_{n-i}$
    - For a third independent non-negative integer-valued RV Z: $P(X + Y + Z = k) = \sum_{i=0}^{k} P(X + Y = i)\,P(Z = k - i)$, where $P(X + Y = i) = c_i$ and $P(Z = k - i) = d_{k-i}$


SUM OF INDEPENDENT DISCRETE RVS

• Result: Let Z = X + Y where X, Y are non-negative RVs (X, Y need not be independent).
  ⇒ $f_Z(z) = \sum_{x} f_X(x)\,f_{Y|X}(z - x \mid x) = \sum_{y} f_Y(y)\,f_{X|Y}(z - y \mid y)$


SUM OF INDEPENDENT DISCRETE RVS

• Ex: Let X ∼ Poisson(λ), Y ∼ Poisson(µ). If X and Y are independent, what is the distribution of X + Y?

Soln:
  X + Y ∼ Poisson(λ + µ)
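
The convolution formula for independent non-negative integer-valued RVs can be used to check this result numerically. The rates 1.5 and 2.5 and the truncation point K = 60 are illustrative assumptions.

```python
from math import exp, factorial

def convolve(a, b):
    """c[n] = sum_i a[i] * b[n - i]: pmf of X + Y for independent X, Y."""
    c = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

lam, mu, K = 1.5, 2.5, 60   # illustrative rates; K truncates the support
a = [poisson_pmf(k, lam) for k in range(K)]
b = [poisson_pmf(k, mu) for k in range(K)]
c = convolve(a, b)

# For n < K the truncated convolution is exact, so compare on that range.
gap = max(abs(c[n] - poisson_pmf(n, lam + mu)) for n in range(K))
print(gap)
```

The gap is at the level of floating-point rounding, consistent with X + Y ∼ Poisson(λ + µ).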


SUM OF INDEPENDENT DISCRETE RVS

• Ex: If X ∼ Binom(n, p) and Y ∼ Binom(m, p), and X and Y are independent, then what is the distribution of X + Y?

Soln:
  X + Y ∼ Binom(n + m, p)


SUM OF INDEPENDENT CONTINUOUS RVS

For independent X and Y, with Y non-negative:

• $P(X + Y \le a) = \int_{0}^{\infty} P(X \le a - y)\,f_Y(y)\,dy$: cdf of X + Y
• $f_{X+Y}(a) = \frac{d}{da} F_{X+Y}(a) = \int_{0}^{\infty} f_X(a - y)\,f_Y(y)\,dy$: pdf of X + Y


SUM OF INDEPENDENT CONTINUOUS RVS

• Ex: Let $Z = U_1 + U_2$ (i.e., the sum of two independent standard uniform RVs). Find $f_Z(z)$.

Soln:
  $f_Z(z) = \begin{cases} z, & \text{if } 0 \le z \le 1 \\ 2 - z, & \text{if } 1 < z < 2 \\ 0, & \text{otw} \end{cases}$
  ⇒ pdf of a triangular RV


Probability Basics Functions of RVs

FUNCTIONS OF RANDOM VARIABLES

• Question: If $X_1$ and $X_2$ have joint density function f, and g, h are functions mapping R² to R, then what is the joint density function of the pair $Y_1 = g(X_1, X_2)$, $Y_2 = h(X_1, X_2)$?
• Theorem (method of direct transformation): Let X be a continuous RV with pdf $f_X$ and support I, where I = [a, b]. Let g : I → R be a continuous, strictly monotonic function with differentiable inverse function h : J → I, where J = g(I). Let Y = g(X). Then the pdf $f_Y$ of Y satisfies
  $f_Y(y) = \begin{cases} f_X(h(y)) \cdot |h'(y)|, & \text{if } y \in J \\ 0, & \text{otw.} \end{cases}$


FUNCTIONS OF RANDOM VARIABLES

• Ex: Suppose X has the density
  $f_X(x) = \frac{\theta}{x^{\theta+1}}$, x > 1, θ > 0.
  Find the density of Y = ln(X).
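
A worked sketch using the method of direct transformation, with g(x) = ln x, inverse h(y) = e^y, and J = (0, ∞):

```latex
\begin{align*}
h(y) &= e^{y}, \qquad |h'(y)| = e^{y}, \qquad y > 0, \\
f_Y(y) &= f_X\!\left(e^{y}\right) \cdot e^{y}
        = \frac{\theta}{e^{(\theta+1)y}} \cdot e^{y}
        = \theta e^{-\theta y}, \qquad y > 0 .
\end{align*}
```

So Y = ln(X) is exponentially distributed with rate θ.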


FUNCTIONS OF RANDOM VARIABLES

• Corollary (two RV case): Suppose $X_1, X_2$ have joint pdf $f_{X_1,X_2}(x_1, x_2)$ with support $A = \{(x_1, x_2) : f(x_1, x_2) > 0\}$. We are interested in the RVs $Y_1 = g_1(X_1, X_2)$ and $Y_2 = g_2(X_1, X_2)$. The transformation $y_1 = g_1(x_1, x_2)$, $y_2 = g_2(x_1, x_2)$ is a one-to-one transformation of A onto B. The inverse transformation is
  $x_1 = g_1^{-1}(y_1, y_2)$, $x_2 = g_2^{-1}(y_1, y_2)$.
• The determinant of the Jacobian of this inverse is
  $J = \begin{vmatrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_2}{\partial y_1} \\ \frac{\partial x_1}{\partial y_2} & \frac{\partial x_2}{\partial y_2} \end{vmatrix} = \frac{\partial x_1}{\partial y_1}\frac{\partial x_2}{\partial y_2} - \frac{\partial x_2}{\partial y_1}\frac{\partial x_1}{\partial y_2}$
• The joint pdf of $Y_1 = g_1(X_1, X_2)$ and $Y_2 = g_2(X_1, X_2)$ is
  $f_{Y_1,Y_2}(y_1, y_2) = |J|\,f_{X_1,X_2}\left(g_1^{-1}(y_1, y_2),\ g_2^{-1}(y_1, y_2)\right)$.
  If $(y_1, y_2)$ is not in B, the range of the transformation, then $f_{Y_1,Y_2}(y_1, y_2) = 0$.

FUNCTIONS OF RANDOM VARIABLES

• Ex: If X and Y have joint density function f, find the density function of U = XY.


FUNCTIONS OF RANDOM VARIABLES

• Ex: Let $X_1, X_2$ be independent exponential RVs with parameter λ. Find the joint density of $Y_1 = X_1 + X_2$, $Y_2 = X_1/X_2$. Show that they are independent.


Limit theorems

MODES OF CONVERGENCE

• Definition: A sequence of real numbers $\{\alpha_1, \alpha_2, \ldots\}$ is said to converge to a real number α if, for any ε > 0, there exists an integer N such that for all n > N,
  $|\alpha_n - \alpha| < \epsilon$
  ⇒ We express the convergence as $\alpha_n \to \alpha$ as n → ∞, or as $\lim_{n\to\infty} \alpha_n = \alpha$.
• Ex: Let $\alpha_n = 1 - \frac{1}{n}$. For any ε > 0 there exists an integer N such that for all n > N: $|\alpha_n - 1| < \epsilon$, so $\lim_{n\to\infty} \alpha_n = 1$. (E.g., consider $N = \lceil 1/\epsilon \rceil$.)


CONVERGENCE IN PROBABILITY

• Definition: A sequence of RVs $\{X_1, X_2, \ldots\}$ is said to converge in probability (or weakly) to a real number µ if for any ε > 0 and γ > 0 there exists an integer N such that for all n > N:
  $P(|X_n - \mu| < \epsilon) > 1 - \gamma$.
  ⇒ We express the convergence as $X_n \xrightarrow{p} \mu$ or $X_n - \mu \xrightarrow{p} 0$ as n → ∞.
• Alternative way to express the same definition:
  A sequence of RVs $\{X_1, X_2, \ldots\}$ is said to converge in probability (or weakly) to a real number µ if for any ε > 0:
  $\lim_{n\to\infty} P(|X_n - \mu| < \epsilon) = 1$


ALMOST SURE CONVERGENCE

• Definition: A sequence of RVs $\{X_1, X_2, \ldots\}$ is said to converge almost surely (or strongly) to a real number µ if for any ε > 0:
  $\lim_{N\to\infty} P\left(\sup_{n > N} |X_n - \mu| < \epsilon\right) = 1$.
  ⇒ We express the convergence as $X_n - \mu \xrightarrow{a.s.} 0$ as n → ∞.
• Alternative way to express the same definition:
  A sequence of RVs $\{X_1, X_2, \ldots\}$ is said to converge almost surely (or strongly) to a real number µ if
  $P\left(\lim_{n\to\infty} X_n = \mu\right) = 1$.
• Alternative terminology:
  ◦ $X_n \to X$ almost everywhere, $X_n \xrightarrow{a.e.} X$
  ◦ $X_n \to X$ with probability 1, $X_n \xrightarrow{w.p.1} X$

CONVERGENCE IN DISTRIBUTION

• Definition: Consider a sequence of RVs $X_1, X_2, \ldots$ and a corresponding sequence of cdfs $F_{X_1}, F_{X_2}, \ldots$, so that for n = 1, 2, ..., $F_{X_n}(x) = P(X_n \le x)$. Suppose that there exists a cdf $F_X$ such that for all x at which $F_X$ is continuous,
  $\lim_{n\to\infty} F_{X_n}(x) = F_X(x)$.
  Then the sequence $\{X_n\}$ converges in distribution to an RV X with cdf $F_X$, denoted
  $X_n \xrightarrow{d} X$,
  and $F_X$ is the limiting distribution.


CONVERGENCE IN rth MEAN

• Definition: The sequence of RVs $\{X_n\}$ converges in rth mean to an RV X (or a real number µ), denoted $X_n \xrightarrow{r} X$, if
  $\lim_{n\to\infty} E[|X_n - X|^r] = 0$.
• If $\lim_{n\to\infty} E[(X_n - X)^2] = 0$, then we write $X_n \xrightarrow{r=2} X$.
  That is, $\{X_n\}$ converges to X in mean-square or in quadratic mean.
• Theorem: For $r_1 > r_2 \ge 1$,
  $X_n \xrightarrow{r=r_1} X \Rightarrow X_n \xrightarrow{r=r_2} X$.

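
RELATING THE MODES OF CONVERGENCE

• Theorem: For a sequence of RVs $\{X_n\}$, the following relationships hold:
  ◦ almost sure convergence ⇒ convergence in probability
  ◦ convergence in rth mean (r ≥ 1) ⇒ convergence in probability
  ◦ convergence in probability ⇒ convergence in distribution

No other relationships hold in general.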


MARKOV'S INEQUALITY

• If X ≥ 0, then for any t > 0
  $E[X] = \int_{\mathbb{R}} x f(x)\,dx$
  $\quad = \int_{\{x \ge t\}} x f(x)\,dx + \int_{\{0 \le x < t\}} x f(x)\,dx$
  $\quad \ge \int_{\{x \ge t\}} x f(x)\,dx$
  $\quad \ge \int_{\{x \ge t\}} t f(x)\,dx = t\,P(X \ge t)$
  ⇒ If t > 0, $P(X \ge t) \le \frac{E[X]}{t}$
• Scaling Markov's inequality: for t > 0,
  $P(X \ge tE[X]) \le (tE[X])^{-1} E[X] = 1/t$

CHEBYSHEV'S INEQUALITY

• Apply Markov's inequality to the RV $Y = (X - E[X])^2 \ge 0$:
  $P\left((X - E[X])^2 \ge \epsilon^2\right) \le \mathrm{Var}(X)/\epsilon^2$
  Remember, $\mathrm{Var}(X) = E[(X - E[X])^2]$.
• We can write this as
  $P(|X - E[X]| \ge \epsilon) \le \mathrm{Var}(X)/\epsilon^2$
  This requires only the mean and variance, with no other assumptions.
• One-sided Chebyshev's inequality: for t > 0,
  $P(X \ge E[X] + t) \le \frac{\sigma^2}{\sigma^2 + t^2}$

CHEBYSHEV'S INEQUALITY

• Ex: Let X be an arbitrary RV with unknown distribution but with known range, e.g., 10 ≤ X ≤ 30. For random samples of size 1000, give a lower bound for $P(|\bar{X} - E[X]| \le 1)$.
  Soln: $P(|\bar{X} - E[X]| \le 1) \ge 0.9$
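
One route to this bound, assuming the 1000 samples are iid and using the fact that an RV confined to [a, b] has variance at most (b − a)²/4:

```latex
\begin{align*}
\mathrm{Var}(X) &\le \frac{(30 - 10)^2}{4} = 100, \qquad
\mathrm{Var}(\bar{X}) = \frac{\mathrm{Var}(X)}{1000} \le 0.1, \\
P\left(|\bar{X} - E[X]| \ge 1\right) &\le \frac{\mathrm{Var}(\bar{X})}{1^2} \le 0.1
\quad\Longrightarrow\quad
P\left(|\bar{X} - E[X]| \le 1\right) \ge 0.9 .
\end{align*}
```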


CHEBYSHEV'S INEQUALITY

• Ex: Suppose that X is an RV with mean 10 and variance 15. What can we say about P(5 < X < 15)?
  Soln: P(5 < X < 15) ≥ 2/5


REVIEW OF IMPORTANT THEOREMS (1)

Strong law of large numbers (SLLN)

Let $X_1, X_2, X_3, \ldots$ be a sequence of independent and identically distributed (i.i.d.) RVs with $E[X_i] = \mu$. Then, with probability 1,
  $\bar{X}_n := \frac{1}{n} \sum_{i=1}^{n} X_i \to \mu$ as n → ∞

• The LLN is the basis for estimation via simulation


REVIEW OF IMPORTANT THEOREMS (2)

Central Limit Theorem (CLT)

Let $X_1, X_2, X_3, \ldots$ be a sequence of i.i.d. RVs with mean µ and variance σ². Then,
  $Z_n := \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1)$ as n → ∞
That is,
  $P(Z_n \le z) \to \Phi(z)$ as n → ∞, ∀z

• Remarks:
  ◦ If n is large, then $\bar{X}_n \approx \mathrm{Nor}(\mu, \sigma^2/n)$
  ◦ The $X_i$'s need not be normally distributed
  ◦ Usually n ≥ 30 is needed for a good approximation (fewer observations are needed when the $X_i$'s are from a symmetric distribution)

LIMIT THEOREMS

• Ex: Use Chebyshev's inequality to prove the weak law of large numbers. Namely, if $X_1, X_2, \ldots$ are iid with mean µ and variance σ², then, for any ε > 0,
  $P\left( \left| \frac{X_1 + X_2 + \ldots + X_n}{n} - \mu \right| > \epsilon \right) \to 0$ as n → ∞
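
A sketch of the proof, writing X̄_n for the sample mean and applying Chebyshev's inequality to it:

```latex
\begin{align*}
E[\bar{X}_n] &= \mu, \qquad
\mathrm{Var}(\bar{X}_n) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma^2}{n}
\quad \text{(independence)}, \\
P\left(|\bar{X}_n - \mu| > \epsilon\right)
  &\le P\left(|\bar{X}_n - \mu| \ge \epsilon\right)
  \le \frac{\mathrm{Var}(\bar{X}_n)}{\epsilon^2}
  = \frac{\sigma^2}{n \epsilon^2} \xrightarrow[n \to \infty]{} 0 .
\end{align*}
```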


LIMIT THEOREMS

• Ex: Let $X_i$, i = 1, 2, ..., 10, be independent RVs, each being uniformly distributed over (0, 1). Estimate $P\left(\sum_{i=1}^{10} X_i > 7\right)$.
  Soln: $P\left(\sum_{i=1}^{10} X_i > 7\right) \approx 1 - \Phi(2.2) = 0.0139$
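
The normal approximation can be reproduced with the standard library alone. The helper `std_normal_cdf` is a name introduced here; it builds Φ from `math.erf`, matching the erf form of the normal cdf given earlier.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Phi(z) expressed through the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n = 10
mean = n * 0.5          # E[U] = 1/2 for U ~ Unif(0, 1)
var = n / 12.0          # Var(U) = 1/12
z = (7 - mean) / sqrt(var)
approx = 1.0 - std_normal_cdf(z)
print(round(z, 2), round(approx, 4))
```

The unrounded z-score is about 2.19; the slide's figure of 0.0139 corresponds to rounding z up to 2.2 before looking up Φ, so the two values differ slightly in the fourth decimal place.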


LIMIT THEOREMS

• Ex: (Normal approximation to the binomial) The Blue Jays play 100 independent baseball games, each of which they have probability 0.8 of winning. What is the probability that they win at least 90?
  Soln: With the continuity correction, P(Y ≥ 90) = P(Y ≥ 89.5) ≈ 1 − Φ(2.375) ≈ 0.0088. (Without the correction, 1 − Φ(2.5) ≈ 0.0062.)
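
The sketch below compares the exact binomial tail with the normal approximation, with and without the continuity correction. The variable names are illustrative, and Φ is again built from `math.erf`.

```python
from math import comb, erf, sqrt

def std_normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p, wins = 100, 0.8, 90
mu, sigma = n * p, sqrt(n * p * (1 - p))   # mean 80, standard deviation 4

# Exact tail P(Y >= 90) for Y ~ Binomial(100, 0.8).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(wins, n + 1))
plain = 1.0 - std_normal_cdf((wins - mu) / sigma)            # z = 2.5
corrected = 1.0 - std_normal_cdf((wins - 0.5 - mu) / sigma)  # z = 2.375
print(exact, plain, corrected)
```

The figure 0.0088 is the continuity-corrected value, while 1 − Φ(2.5) evaluates to roughly 0.0062.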


Generating Functions

GENERATING FUNCTIONS

• Fourier transform: $E[e^{iXt}] = \int_{x \in \mathbb{R}} e^{ixt} f_X(x)\,dx$
• Laplace transform: $E[e^{-Xt}] = \int_{x \in \mathbb{R}} e^{-xt} f_X(x)\,dx$
• Moment generating function: $E[e^{Xt}] = \int_{x \in \mathbb{R}} e^{xt} f_X(x)\,dx$
• Probability generating function (discrete RV with pmf $f_X$): $E[s^X] = \sum_{x} s^x f_X(x)$


Generating Functions Moment Generating Functions

MOMENT GENERATING FUNCTIONS

• $\phi_X(t) = E[e^{Xt}] = \int e^{xt} f_X(x)\,dx$
• $\phi_X(0) = 1$
• $\frac{\partial \phi_X(t)}{\partial t} = E\left[\frac{\partial}{\partial t} e^{Xt}\right] = E[X e^{Xt}]$
  ⇒ $\phi_X'(0) = E[X]$, $\phi_X''(0) = E[X^2]$, etc.
• Sum of RVs:
  $\phi_{\sum_i X_i}(t) = E\left[e^{\sum_i X_i t}\right] = E\left[\prod_i e^{X_i t}\right]$.
  If the $X_i$'s are independent ⇒ $E\left[\prod_i e^{X_i t}\right] = \prod_i E[e^{X_i t}] = \prod_i \phi_{X_i}(t)$


MOMENT GENERATING FUNCTIONS

• Bernoulli distribution:
  $\phi_X(t) = E[e^{Xt}] = pe^t + (1 - p)$
• Binomial distribution:
  $\phi_X(t) = E[e^{Xt}] = E\left[e^{\sum_i X_i t}\right] = \left(E[e^{X_i t}]\right)^n = (pe^t + (1 - p))^n$, since the $X_i$'s are iid Bernoulli RVs.
  Note that $\phi_X(t)$ gives a hint about the distribution of an RV. If we recognize something like $(pe^t + (1 - p))^n$, then we can say that it is a Binomial(n, p) RV.


MOMENT GENERATING FUNCTIONS

• Geometric distribution (trials):
  $\phi_X(t) = E[e^{Xt}] = \sum_{x=1}^{\infty} e^{xt}\,p(1 - p)^{x-1} = \frac{pe^t}{1 - e^t(1 - p)}$
• Negative binomial distribution (number of trials until the rth success):
  $\phi_{N_r}(t) = [\phi_N(t)]^r = \left(\frac{pe^t}{1 - e^t(1 - p)}\right)^r$
• Exponential distribution:
  $\phi_X(t) = E[e^{Xt}] = \lambda/(\lambda - t)$, for t < λ

M Cevik MIE1605 - Probability Review 76 / 96


Generating Functions Probability Generating Functions

PROBABILITY GENERATING FUNCTIONS

 $E[s^X] = \sum_{k=0}^{\infty} s^k P(X = k) = P(s)$

 $P(1) = 1$ → quick way of verifying that P(s) is a g.f.

 Ex: Poisson RV
   $P(s) = \sum_{k=0}^{\infty} s^k \frac{e^{-\lambda} \lambda^k}{k!} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda s)^k}{k!} = e^{\lambda (s - 1)}$

 Ex: Geometric RV
   $P(s) = \sum_{k=0}^{\infty} s^k (1 - p)^k p = p \, \frac{1}{1 - s(1 - p)}$, $(s < 1/(1 - p))$
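The two closed-form PGFs above can be checked against a truncated version of the defining series. The sketch below assumes λ = 3 and p = 0.4 purely for illustration, and truncates the sum at 100 terms, which is ample for these parameters and for 0 ≤ s ≤ 1.

```python
import math

def pgf_from_pmf(pmf, s, terms=100):
    """Truncated series sum_{k < terms} s^k P(X = k)."""
    return sum(s**k * pmf(k) for k in range(terms))

lam, p = 3.0, 0.4
poisson_pmf = lambda k: math.exp(-lam) * lam**k / math.factorial(k)
geometric_pmf = lambda k: (1 - p)**k * p  # number of failures before the first success

for s in (0.0, 0.5, 1.0):
    print(s,
          pgf_from_pmf(poisson_pmf, s), math.exp(lam * (s - 1)),
          pgf_from_pmf(geometric_pmf, s), p / (1 - s * (1 - p)))
```

At s = 1 both series sum to 1, which is the quick validity check mentioned above.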



Generating Functions Probability Generating Functions

PROBABILITY GENERATING FUNCTIONS

 Let P(s) be the g.f. of a mystery RV X with an unknown distribution

   $P(0) = 0^0 p_0 + 0^1 p_1 + 0^2 p_2 + \dots = p_0$

   $P'(s) = \sum_{k=0}^{\infty} k s^{k-1} p_k = \sum_{k=1}^{\infty} k s^{k-1} p_k$ ⇒ $P'(0) = p_1$

   $P''(s) = \sum_{k=2}^{\infty} k(k-1) s^{k-2} p_k$ ⇒ $P''(0) = 2 p_2$

   $P^{(n)}(s) = \sum_{k=n}^{\infty} k(k-1) \cdots (k-n+1) s^{k-n} p_k$ ⇒ $P^{(n)}(0) = n! \, p_n$

   Then, $p_k = \frac{P^{(k)}(0)}{k!}$, $k = 0, 1, 2, \dots$



Generating Functions Probability Generating Functions

PROBABILITY GENERATING FUNCTIONS

 Let P(s) be the g.f. of a mystery RV X with an unknown distribution

   $P'(s) = \sum_{k=1}^{\infty} k s^{k-1} p_k$ ⇒ $P'(1) = \sum_{k=0}^{\infty} k p_k = E[X]$

   $P''(s) = \sum_{k=2}^{\infty} k(k-1) s^{k-2} p_k$
   ⇒ $P''(1) = \sum_{k=0}^{\infty} k(k-1) p_k = E[X(X-1)] = E[X^2] - E[X]$

   $P^{(n)}(1) = E[X(X-1) \cdots (X-n+1)]$

   $Var(X) = E[X^2] - (E[X])^2 = P''(1) + P'(1) - (P'(1))^2$
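For a RV with finite support the PGF is a polynomial, so both recipes (derivatives at 0 for the pmf, derivatives at 1 for the moments) can be carried out exactly on a coefficient list. The sketch below uses a made-up pmf on {0, 1, 2, 3}; the helper names are assumptions for illustration.

```python
import math

def poly_derivative(coeffs):
    """Coefficients of P'(s), given coefficients of P(s) indexed by power of s."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def poly_eval(coeffs, s):
    """Evaluate the polynomial with the given coefficients at s."""
    return sum(c * s**k for k, c in enumerate(coeffs))

def nth_derivative_at(coeffs, n, s):
    """P^{(n)}(s), obtained by differentiating the coefficient list n times."""
    for _ in range(n):
        coeffs = poly_derivative(coeffs)
    return poly_eval(coeffs, s)

pmf = [0.1, 0.2, 0.3, 0.4]  # hypothetical pmf; P(s) = sum_k p_k s^k

# p_k = P^{(k)}(0) / k!
recovered = [nth_derivative_at(pmf, k, 0.0) / math.factorial(k) for k in range(len(pmf))]

d1 = nth_derivative_at(pmf, 1, 1.0)  # P'(1)  = E[X]
d2 = nth_derivative_at(pmf, 2, 1.0)  # P''(1) = E[X(X-1)]
mean = d1
variance = d2 + d1 - d1**2

print(recovered, mean, variance)
```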



Generating Functions Probability Generating Functions

PROBABILITY GENERATING FUNCTIONS

 Ex (with $q = 1 - p$):
   ◦ $P(s) = p/(1 - qs)$
   ◦ $P'(s) = qp/(1 - qs)^2$
     ⇒ $P'(0) = (1 - p)p = P(X = 1)$
     ⇒ $P'(1) = q/p = E[X]$
     ⇒ $P''(s) = 2q^2 p/(1 - qs)^3$ ⇒ $P''(0) = 2q^2 p$
     Then, $P(X = 2) = (1/2!) \, P''(0) = (1 - p)^2 p$

 ⇒ Observation: There is a one-to-one correspondence between the g.f. of a RV and its probability distribution. We can recover the probability distribution and all moments of a RV from its g.f.



Generating Functions Probability Generating Functions

PGF FOR THE SUMS

 Let $X_1, X_2, \dots, X_n$ be independent non-negative integer valued RVs where $X_i$ has g.f. $P_{X_i}(s)$. We are interested in the g.f. of $X_1 + X_2 + \dots + X_n$:

   $P_{X_1 + X_2 + \dots + X_n}(s) = E[s^{X_1 + X_2 + \dots + X_n}]$
   $= E[s^{X_1} s^{X_2} \cdots s^{X_n}] = E[s^{X_1}] E[s^{X_2}] \cdots E[s^{X_n}]$
   $= \prod_{i=1}^{n} P_{X_i}(s)$



Generating Functions Probability Generating Functions

PGF FOR THE SUMS

 Ex: $X_1 \sim$ Poisson($\lambda_1$), $X_2 \sim$ Poisson($\lambda_2$), $X_1 \perp X_2$.
   $P_{X_1 + X_2}(s) = e^{-\lambda_1 (1 - s)} e^{-\lambda_2 (1 - s)} = e^{-(\lambda_1 + \lambda_2)(1 - s)}$
   ⇒ $X_1 + X_2 \sim$ Poisson($\lambda_1 + \lambda_2$)

 Ex: Let $X_1, X_2, \dots, X_n$ be iid Bernoulli RVs with $P(X_i = 1) = p = 1 - P(X_i = 0)$, and let $q = 1 - p$.
   ◦ $P_{X_i}(s) = s^0 q + s^1 p = q + sp$
   ◦ $P_{X_1 + X_2 + \dots + X_n}(s) = \prod_{i=1}^{n} (q + sp) = (q + sp)^n$
   ◦ $X_1 + X_2 + \dots + X_n \sim$ Binom(n, p)
 Note that if $X_1, X_2, \dots, X_n$ are iid RVs with common g.f. $P_{X_i}(s) = P(s)$, then
   $P_{X_1 + X_2 + \dots + X_n}(s) = (P(s))^n$
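Multiplying PGFs corresponds to multiplying polynomials, i.e. convolving pmfs. The following sketch multiplies n copies of the Bernoulli PGF q + sp and compares the resulting coefficients with the Binomial(n, p) pmf; n = 5 and p = 0.3 are arbitrary illustrative values and `poly_multiply` is a helper written for this example.

```python
import math

def poly_multiply(a, b):
    """Coefficients of the product of two polynomials (index = power of s)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

n, p = 5, 0.3
bernoulli_pgf = [1 - p, p]  # q + p s

sum_pgf = [1.0]             # PGF of the empty sum
for _ in range(n):
    sum_pgf = poly_multiply(sum_pgf, bernoulli_pgf)

binomial_pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
print(sum_pgf)
print(binomial_pmf)
```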
Random Sums

RANDOM SUMS
 Let $X_1, X_2, X_3, \dots$ be iid (non-negative integer valued) RVs with $P(X_i = k) = p_k$, $k = 0, 1, 2, \dots$
   Let N be a non-negative integer valued RV which is independent of $\{X_1, X_2, \dots\}$, where $P(N = k) = \alpha_k$, $k \ge 0$. Define
   ◦ $S_0 = 0$
   ◦ $S_n = \sum_{i=1}^{n} X_i$, $n = 1, 2, \dots$

   ⇒ A random sum is $S_N = \sum_{i=1}^{N} X_i$.

 $P(S_N = j) = \sum_{k=0}^{\infty} P(S_N = j, N = k) = \sum_{k=0}^{\infty} P(S_N = j \mid N = k) P(N = k)$
   $= \sum_{k=0}^{\infty} P(S_k = j) P(N = k)$
 Ex: Consider people who are coming to a mall.
   N: # of people coming. $X_i$: money spent by the ith customer.
Random Sums

RANDOM SUMS

 $E[S_N] = E[N] E[X_1] \ne \sum_{i=1}^{E[N]} E[X_i]$ → E[N] may not be an integer!

 Computing $E[S_N]$ by conditioning on N:

   ◦ $E[S_N] = E_N\big[ E_{S_N}[S_N \mid N] \big] = \sum_{k=0}^{\infty} E[S_N \mid N = k] \, P(N = k)$
     $= \sum_{k=0}^{\infty} E\Big[ \sum_{i=1}^{k} X_i \Big] \alpha_k = \sum_{k=0}^{\infty} k E[X_1] \alpha_k = E[X_1] E[N]$



Random Sums

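PGF FOR RANDOM SUMS

 $P_{S_N}(s) = \sum_{j=0}^{\infty} s^j P(S_N = j) = \sum_{j=0}^{\infty} s^j \sum_{k=0}^{\infty} P(S_k = j) \alpha_k$
   $= \sum_{k=0}^{\infty} \alpha_k \sum_{j=0}^{\infty} s^j P(S_k = j) = \sum_{k=0}^{\infty} \alpha_k (P_{X_1}(s))^k$

   ⇒ $P_{S_N}(s) = P_N(P_{X_1}(s))$

   → $S_k = X_1 + X_2 + \dots + X_k$, k is known (g.f. for the sums)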



Random Sums

PGF FOR RANDOM SUMS

 Computing $E[S_N]$ by using the g.f.:
   $P_{S_N}(s) = P_N(P_{X_1}(s))$ ⇒ $\frac{\partial P_{S_N}(s)}{\partial s} \Big|_{s=1} = E[S_N]$
   $\frac{\partial P_{S_N}(s)}{\partial s} = P_{X_1}'(s) \, P_N'(P_{X_1}(s))$
   $P_{X_1}'(s) \big|_{s=1} = E[X_1]$,
   $P_{X_1}(1) = 1$ ⇒ $P_N'(P_{X_1}(1)) = P_N'(1) = E[N]$
   ⇒ $E[S_N] = E[X_1] E[N]$

 Ex: Let $P(X_1 = 1) = p = 1 - P(X_1 = 0)$, $q = 1 - p$, and $N \sim$ Poisson($\lambda$). Calculate $P_{S_N}(s)$.
   ◦ $P_{S_N}(s) = P_N(P_{X_1}(s)) = P_N(ps + q) = e^{-\lambda (1 - (ps + q))} = e^{-\lambda p (1 - s)}$
     ⇒ $S_N \sim$ Poisson($\lambda p$)
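The conclusion of this example (a Poisson number of Bernoulli terms is again Poisson, with rate λp) can be checked numerically by conditioning on N. The sketch assumes λ = 4 and p = 0.25 and truncates the sum over N at 100 terms; both choices are illustrative only.

```python
import math

def poisson_pmf(rate, k):
    return math.exp(-rate) * rate**k / math.factorial(k)

def thinned_pmf(lam, p, j, terms=100):
    """P(S_N = j) = sum_k P(N = k) P(Binomial(k, p) = j), truncated at `terms`."""
    return sum(poisson_pmf(lam, k) * math.comb(k, j) * p**j * (1 - p)**(k - j)
               for k in range(j, terms))

lam, p = 4.0, 0.25
for j in range(6):
    print(j, thinned_pmf(lam, p, j), poisson_pmf(lam * p, j))
```

The two columns should match to floating-point accuracy, consistent with the PGF argument above.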
Simple Branching Process

SIMPLE BRANCHING PROCESS

 We have a pmf $\{p_k\}$ on the non-negative integers and a “progenitor” who forms generation zero. The progenitor splits into k offspring with probability $p_k$, and these offspring constitute the first generation.

 Each member of the first generation splits into a random number of offspring, again with the same mass function $\{p_k\}$.
 This process continues until extinction, if extinction occurs.
 Examples include family generations; in queueing, a branch may be the number of arrivals while the first job is being processed.
Simple Branching Process

SIMPLE BRANCHING PROCESS

 Let $\{Z_{n,j}, n \ge 1, j \ge 1\}$ be iid RVs with pmf $\{p_k, k = 0, 1, 2, \dots\}$. $Z_{n,j}$ corresponds to the number of members of the nth generation that are offspring of the jth member of the (n − 1)st generation.

 Let $\{Z_n, n \ge 0\}$ be a branching process with
   $Z_0 = 1$ → generation zero
   $Z_1 = Z_{1,1}$
   $Z_2 = Z_{2,1} + Z_{2,2} + \dots + Z_{2,Z_1}$
   $Z_n = Z_{n,1} + Z_{n,2} + \dots + Z_{n,Z_{n-1}} = \sum_{j=1}^{Z_{n-1}} Z_{n,j}$

   ⇒ This is a random sum of RVs



Simple Branching Process

SIMPLE BRANCHING PROCESS

 Note that $Z_{n-1}$ and $\{Z_{n,j}, j = 1, 2, \dots\}$ are independent. We can use generating functions to determine the pmf of $Z_n$.
   $P_n(s) = P_{Z_n}(s) = E[s^{Z_n}]$
   Let $P(s) = \sum_{k=0}^{\infty} s^k p_k$, so that $E[s^{Z_1}] = P_1(s) = P(s)$.

   $P_0(s) = s$ → $s^1$, since $Z_0 = 1$ (gen. zero)
   $P_n(s) = P_{n-1}(P(s))$ → from the g.f. of a random sum of RVs
   $P_2(s) = P(P(s))$
   $P_3(s) = P(P_2(s)) = P(P(P(s)))$
   ⇒ $P_n(s) = P(P_{n-1}(s))$



Simple Branching Process

SIMPLE BRANCHING PROCESS

 Ex: Let $P\{Z_{1,1} = 0\} = 1 - p = q$, $P\{Z_{1,1} = 1\} = p$, $0 < p < 1$.

   $P(s) = P_1(s) = q + ps$
   $P_2(s) = P(P(s)) = q + p(q + ps) = q + pq + p^2 s$
   $P_{n+1}(s) = q + pq + p^2 q + \dots + p^n q + p^{n+1} s = E[s^{Z_{n+1}}]$
   $P_{n+1}(s) = (q + pq + p^2 q + \dots + p^n q) s^0 + p^{n+1} s^1 = \sum_{k=0}^{\infty} s^k P(Z_{n+1} = k)$

   $P(Z_{n+1} = 0) = q + pq + p^2 q + \dots + p^n q$
   $P(Z_{n+1} = 1) = p^{n+1}$
   $P(Z_{n+1} = 2) = 0$

   $\lim_{n \to \infty} P(Z_{n+1} = 0) = q \sum_{i=0}^{\infty} p^i = q/(1 - p) = 1$ ⇒ this family will go extinct!
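The recursion P_n(s) = P(P_{n−1}(s)) with P_0(s) = s is easy to iterate numerically. For the Bernoulli offspring example above, summing the geometric series gives P(Z_n = 0) = P_n(0) = 1 − p^n, which the sketch below checks for an assumed p = 0.7; `iterate_pgf` is a hypothetical helper name.

```python
import math

def iterate_pgf(pgf, n, s):
    """P_n(s) = P(P_{n-1}(s)), starting from P_0(s) = s."""
    value = s
    for _ in range(n):
        value = pgf(value)
    return value

p = 0.7
q = 1 - p
offspring_pgf = lambda s: q + p * s  # P(s) for the 0-or-1 offspring example

for n in (1, 2, 5, 20, 100):
    print(n, iterate_pgf(offspring_pgf, n, 0.0), 1 - p**n)
```

As n grows, both columns approach 1, matching the limit computed on the slide.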



Simple Branching Process

SIMPLE BRANCHING PROCESS

 In the previous example, $E[Z_{1,1}] = p < 1$, and there will certainly be extinction.
   ⇒ The mean number of offspring is the measure of extinction.

 Define $m_n = E[Z_n]$. Let $m_1 = m = E[Z_1] = \sum_{k=0}^{\infty} k p_k$
   $P_2'(s) = P'(P(s)) P'(s)$ → chain rule for derivatives
   $P_2'(1) = P'(1) P'(1) = m^2$
   $P_3'(s) = P'(P_2(s)) P_2'(s)$ ⇒ $P_3'(1) = m^3$
   ⇒ $P_n'(1) = m_n = m^n$
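The relation m_n = m^n can be checked exactly for an offspring law with finite support by composing the PGF with itself as a polynomial and reading the mean off the coefficients. The offspring pmf below (m = 1.25) is invented for illustration, and `poly_compose` is a helper written for this sketch.

```python
import math

def poly_multiply(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_compose(outer, inner):
    """Coefficients of outer(inner(s)), built with Horner's rule."""
    result = [outer[-1]]
    for c in reversed(outer[:-1]):
        result = poly_multiply(result, inner)
        result[0] += c
    return result

offspring = [0.25, 0.25, 0.5]  # hypothetical pmf {p_0, p_1, p_2}
m = sum(k * pk for k, pk in enumerate(offspring))

generation = [0.0, 1.0]        # P_0(s) = s
means = []
for n in range(1, 5):
    generation = poly_compose(offspring, generation)  # P_n = P(P_{n-1})
    means.append(sum(k * c for k, c in enumerate(generation)))

print(means, [m**n for n in range(1, 5)])
```

After each composition, `generation` holds the full pmf of Z_n, so the same loop also yields P(Z_n = 0) as `generation[0]`.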



Simple Branching Process

SIMPLE BRANCHING PROCESS

 Our purpose is to characterize the probability of extinction of a branching process.
   $\pi_n = P(Z_n = 0) = P_n(0)$
   $\pi = P\{\text{extinction}\}$
   $\pi_n = P\{\text{extinction occurs in generation } n \text{ or before}\}$
   $\{\text{extinction}\} = \bigcup_{n=1}^{\infty} \{Z_n = 0\}$, and $\{Z_n = 0\} \subset \{Z_{n+1} = 0\}$

   ⇒ All the outcomes that lead to $\{Z_n = 0\}$ imply that $\{Z_{n+1} = 0\}$

   $\pi = P\{\bigcup_{k=1}^{\infty} \{Z_k = 0\}\} = P(\lim_{n \to \infty} \bigcup_{k=1}^{n} \{Z_k = 0\})$
   $= \lim_{n \to \infty} P(Z_n = 0) = \lim_{n \to \infty} \pi_n$
   Note that there are trivial cases such as:
    if $p_0 = 0$ ⇒ $\pi = 0$
    if $p_0 = 1$ ⇒ $\pi = 1$
Simple Branching Process

SIMPLE BRANCHING PROCESS

Theorem
Suppose $0 < p_0 < 1$.
◦ If $m = E[Z_1] \le 1$, then $\pi = 1$.
◦ If $m > 1$, then $\pi < 1$, and $\pi$ is the unique solution of $s = P(s)$ in $[0, 1)$.



Simple Branching Process

SIMPLE BRANCHING PROCESS


 Ex: Consider an operator of a sales booth at a computer show
that takes orders. Each order takes three minutes to fill. While
each order is being filled, there is probability $p_j$ that j more
customers will arrive and join the line. Assume
$p_0 = 0.2$, $p_1 = 0.2$, $p_2 = 0.6$. The operator cannot take a break
until a service is completed and no one is waiting in line to order.
If present conditions persist, what is the probability that the
operator will ever take a break?
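One way to approach this example numerically is to iterate π_n = P(π_{n−1}) from π_0 = 0, which is the sequence π_n = P_n(0) whose limit is the extinction probability. The sketch below reads the three stated probabilities as p_0 = 0.2, p_1 = 0.2, p_2 = 0.6 (an assumption about the intended indices) and treats each customer as the progenitor of a branching process; the function name and tolerance are illustrative choices.

```python
def extinction_probability(offspring_pmf, tol=1e-12, max_iter=100_000):
    """Limit of pi_n = P(pi_{n-1}) with pi_0 = 0, where P is the offspring PGF."""
    pgf = lambda s: sum(pk * s**k for k, pk in enumerate(offspring_pmf))
    pi = 0.0
    for _ in range(max_iter):
        nxt = pgf(pi)
        if abs(nxt - pi) < tol:
            return nxt
        pi = nxt
    return pi

booth_pmf = [0.2, 0.2, 0.6]  # assumed p_0, p_1, p_2; mean m = 1.4 > 1
booth_pi = extinction_probability(booth_pmf)
print(booth_pi)
```

Here m = 1.4 > 1, so by the theorem π < 1. The iteration should settle near 1/3, the root of s = 0.2 + 0.2s + 0.6s^2 that lies below 1.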



Simple Random Walk

SIMPLE RANDOM WALK

 Let $\{X_n, n \ge 1\}$ be iid RVs with possible values $\{-1, 1\}$ where $P(X_1 = 1) = 1 - P(X_1 = -1) = p$.

 The Random Walk Process:
   $\{S_n, n \ge 0\}$ with $S_0 = 0$.
   $S_n = X_1 + X_2 + \dots + X_n = S_{n-1} + X_n$.

 Define $N = \min\{n : S_n = 1\}$, $S_0 = 0$.
   → First passage time or hitting time.
   $\hat{N} = \min\{n : S_n = 0\}$, $S_0 = -1$
   → Starting point shifted to level −1.
   $N$ and $\hat{N}$ are identically distributed RVs.
   Also, $N = 1 \iff X_1 = 1$, which happens w.p. p
Simple Random Walk

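SIMPLE RANDOM WALK

$\phi_n = P(N = n)$, $n \ge 0$
$\phi_0 = 0$
$\phi_1 = p$
$\phi_n = \sum_{j=1}^{n-2} q \, \phi_j \, \phi_{n-j-1}$, $n \ge 2$ (with $q = 1 - p$)
$\Phi(s) = \sum_{n=0}^{\infty} s^n \phi_n$
$E[N] = ?$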
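The recursion for φ_n can be evaluated directly, which gives a numerical handle on the open question E[N] = ?. This is a sketch under assumptions: p = 0.6 and the truncation level 600 are arbitrary, q = 1 − p, and the truncated sums only approximate the infinite ones.

```python
import math

def first_passage_pmf(p, n_max):
    """phi_n = P(N = n), n = 0..n_max, via phi_n = q * sum_{j=1}^{n-2} phi_j * phi_{n-1-j}."""
    q = 1 - p
    phi = [0.0] * (n_max + 1)
    if n_max >= 1:
        phi[1] = p
    for n in range(2, n_max + 1):
        phi[n] = q * sum(phi[j] * phi[n - 1 - j] for j in range(1, n - 1))
    return phi

p = 0.6
phi = first_passage_pmf(p, 600)

total_probability = sum(phi)                         # approximates P(N < infinity)
truncated_mean = sum(n * v for n, v in enumerate(phi))  # approximates E[N] if finite

print(phi[:6], total_probability, truncated_mean)
```

For p > q the truncated mean can be compared with 1/(p − q), the value a standard argument (Wald's identity with S_N = 1) gives. For p ≤ q the truncated sums do not stabilize in the same way, so this check is only meaningful when p > 1/2.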
