
Probability Review

January 26, 2023

Outline

• Textbook: Chapter 1 - Shreve I
• Review basic concepts in probability
• Sample space and probability
• Filtration as a record of information
• Random variables and expectation
• Conditional distribution and conditional expectation
Random process

A random process (or stochastic process) is a mathematical model of a probabilistic
experiment that evolves in time and generates a sequence of numerical values.
• the sequence of daily prices of a stock
• the sequence of scores in a football game
• the sequence of failure times of a machine
Each numerical value in the sequence is modeled by a random variable, so a stochastic
process is simply a (finite or infinite) sequence of random variables.
Why study random processes?

• Many phenomena can be modelled by random processes: understanding random
processes helps to understand the phenomena.
• Wide applications: finance, actuarial science, ...
• Provides a foundation for later courses: Financial Mathematics 1 and 2.
Course outline

• Review probability
• Introduction to random processes
• Introduction to some special random processes
  • Poisson process
  • Markov chain
  • Random walk
  • Brownian motion
• Stochastic calculus
  • Itô integral
  • Itô's formula for a function of Brownian motion and of an Itô process
  • Solving stochastic differential equations
Plan

1 Probability space
2 Random variables
  Random variables
  Simulation
  Expectation and Variance
3 Random vectors
4 Conditional distribution and conditional expectation
  Conditional distribution
  Conditional Expectation
Probability models

• Random experiment: produces uncertain outcomes under the same conditions.
• An outcome ω: a result of the experiment.
• A sample space Ω: the set of all results of the experiment.
• An event: a collection of outcomes.
• Probability measure P : estimates the likelihood of each event.
• P (A): how likely it is that event A occurs.
Axiom of Probability

A mapping P that assigns each event in a sample space Ω a number is a probability
measure on Ω if it satisfies
1 0 ≤ P (A) ≤ 1
2 P (Ω) = 1
3 If A1 , A2 , . . . are mutually exclusive (also called disjoint, i.e. Ai ∩ Aj = ∅ for all i ̸= j)
then

P (∪_{n=1}^∞ An ) = Σ_{n=1}^∞ P (An )
Probability measure on finite sample space

• Sample space with finitely many elements

Ω = {x1 , . . . , xn }

• For any event A,

P (A) = Σ_{xi ∈A} p(xi )

• The probability of each outcome p(xi ) must satisfy
1 0 ≤ p(xi ) ≤ 1
2 p(x1 ) + · · · + p(xn ) = 1
Example 1

• Experiment: toss a fair coin three times


• Sample space: all possible outcomes of three coin tosses

Ω = {HHH, HHT, HT H, HT T, T HH, T HT, T T H, T T T }

1 Probability of individual outcomes?
2 Probability of H on the first toss?
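The two questions above can be checked by enumerating the sample space directly. A minimal Python sketch (the variable names are mine, for illustration):

```python
from itertools import product

# Enumerate the sample space of three fair coin tosses; each of the 8
# outcomes is equally likely under the uniform probability measure.
omega = ["".join(t) for t in product("HT", repeat=3)]

p_individual = 1 / len(omega)                      # each outcome: 1/8
heads_first = [w for w in omega if w[0] == "H"]    # event "H on first toss"
p_h_first = len(heads_first) / len(omega)          # 4/8 = 1/2
```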
Additive rule

P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

where A ∪ B means at least one event occurs and A ∩ B means both events occur.

Additive rule for disjoint sets
If A and B are disjoint, i.e. A ∩ B = ∅, then they cannot occur simultaneously and

P (A ∪ B) = P (A) + P (B)

Complement rule
The complement of A is Ac = Ω − A, the event containing all outcomes that are not in A:

P (Ac ) = 1 − P (A)
Conditional probability

For events A and B, the conditional probability of A given B is

P (A|B) = P (A ∩ B)/P (B)

defined for P (B) > 0.

It measures the likelihood of A in the new sample space B.
Example
Roll a fair die twice. What is the probability that the first roll is a 2 given that the
sum of the rolls is 7?
Solution
• Sample space Ω = {(i, j) : 1 ≤ i, j ≤ 6}
• A: the first roll is a 2
• B: the sum of the rolls is 7
• Need to find P (A|B)
• B = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
• A = {(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)}
• AB = {(2, 5)}

P (A|B) = P (AB)/P (B) = (1/36)/(6/36) = 1/6
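The counting argument above can be verified mechanically; a short Python sketch over the 36 equally likely outcomes:

```python
# Verify P(A|B) = 1/6 by counting over the 36 equally likely outcomes.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
B = [(i, j) for (i, j) in omega if i + j == 7]   # sum of the rolls is 7
A = [(i, j) for (i, j) in omega if i == 2]       # first roll is a 2
AB = [w for w in A if w in B]                    # A ∩ B = {(2, 5)}
p_A_given_B = len(AB) / len(B)                   # (1/36)/(6/36) = 1/6
```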
Practice

Roll a fair die twice. What is the probability that the first roll is a 2 given that the
second roll is even?
Does the result of the first roll affect the second?
Independence

Two events A and B are independent if

P (A|B) = P (A) for P (B) > 0

There is no relationship between A and B: B has no effect on A, i.e. knowing B does
not change the probability that A happens.
Equivalent condition
P (AB) = P (A)P (B)
The second condition can be used to check the independence of A and B even when
P (B) = 0.
Multiplication rule
1 P (AB) = P (B)P (A|B)
2 General case

P (A1 A2 ...Ak ) = P (A1 )P (A2 |A1 )P (A3 |A1 A2 )...P (Ak |A1 ...Ak−1 )

Law of total probability
Let A1 , ..., Ak be a partition of the sample space:
• Ai ∩ Aj = ∅ for all i ̸= j (mutually exclusive (disjoint))
• ∪i Ai = Ω
Then, for any event B, we have

P (B) = Σ_{i=1}^k P (B ∩ Ai ) = Σ_{i=1}^k P (B|Ai )P (Ai ).
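As a hypothetical check, the law can be applied to the two-dice experiment: partition Ω by the value of the first roll (A1, ..., A6) and let B be the event that the sum of the rolls is 7. Given any first roll, exactly one of the six second rolls makes the sum 7.

```python
# Law of total probability on the two-dice experiment (illustrative numbers):
# partition by the first-roll value, B = {sum of the rolls is 7}.
p_Ai = 1 / 6                                  # P(A_i), i = 1..6
p_B_given_Ai = [1 / 6] * 6                    # P(B | A_i) for each i
p_B = sum(pb * p_Ai for pb in p_B_given_Ai)   # total probability: 1/6
```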
Random variables

Definition
A random variable is a function on the sample space

X: Ω→R
ω → X(ω)

Classify random variables
Range(X) = {X(ω) for ω ∈ Ω} is the set of all possible values of the random variable X.
• If Range(X) is a countable set then X is called a discrete random variable.
• If Range(X) is an uncountable set then X is called a continuous random variable.
Example - Binomial Asset Pricing Model
u: up factor, d: down factor

• Initial stock price S0
• Next period
  • Upward: uS0
  • Downward: dS0
  where 0 < d < 1 < u (e.g. d = 1/u)
• Toss a coin
  • Head: move up
  • Tail: move down
• With S0 = 4, u = 2, d = 1/2: Range(S1 ) = {2, 8}, so S1 is a discrete RV
Definition (Cumulative distribution function (cdf))
Probability that the value of X does not exceed a given value:

F (x) = P (X ≤ x)

Evaluate probabilities of a RV with the cdf
• P (X > x) = 1 − F (x)
• P (X < x) = lim_{t→x−} F (t) = F (x− )
• P (a < X ≤ b) = F (b) − F (a)
• P (a ≤ X ≤ b) = F (b) − F (a− )
Properties of cdf

• Increasing
• lim_{x→−∞} F (x) = 0 and lim_{x→∞} F (x) = 1
• Has left limits
• Right-continuous
Distribution of a discrete random variable
Range(X): countable set x1 , x2 , ...
Definition (Probability mass function (pmf))
pmf of X is a function pX given by

pX (xi ) = P (X = xi )

which satisfies
1 0 ≤ pX (xi ) ≤ 1 for all i
2 Σ_i pX (xi ) = 1

Properties
(1) P (a ≤ X ≤ b) = Σ_{a≤xi ≤b} pX (xi )
(2) cdf F (x) = Σ_{xi ≤x} pX (xi )
Bernoulli distribution

A random variable X has a Bernoulli distribution if it is the indicator random variable
of a trial with success probability p and failure probability 1 − p:

P (X = 0) = 1 − p
P (X = 1) = p

Denote X ∼ Ber(p)
Example - Bernoulli distribution

Toss a fair coin. Let X be the number of H.


Probability mass function (pmf) of X

x 0 1
P (X = x) P (T ) = 1/2 P (H) = 1/2

Binomial distribution

• n independent trials, each Ber(p)
• X is the number of successes
• X is called a Binomial RV with parameters (n, p)
• Denote X ∼ Bino(n, p).

P (X = k) = (n choose k) p^k (1 − p)^{n−k}
Example - Binomial distribution

Toss a fair coin twice. Pmf of the number of H

x P (X = x)
0 P (T T ) = 1/4
1 P (T H) + P (HT ) = 1/2
2 P (HH) = 1/4
Example - Binomial distribution

Toss a fair coin 3 times. Pmf of the number of H


x P (X = x)
0 P (T T T ) = 1/8
1 P (T T H) + P (T HT ) + P (HT T ) = 3/8
2 P (HHT ) + P (HT H) + P (T HH) = 3/8
3 P (HHH) = 1/8

Example - Payoff of European Call Option

• European call option: confers the right to buy the stock at maturity or
expiration time T = 2 for strike price K = 14 dollars. It is worth
• S2 − K if S2 − K > 0
• 0 otherwise
• Value (payoff) of the option at maturity

C2 = max(S2 − K, 0) = (S2 − K)+

where x+ = max(x, 0)
• Stock price: binomial model with S0 = 4, d = 1/2, u = 2,
p = p(H) = 1/2, q = p(T ) = 1 − p = 1/2
• Find the probability distribution for the payoff C2 of the European call option.
Solution

pmf of S2

x 1 4 16
P (S2 = x) 1/4 1/2 1/4

pmf of C2

x 0 2
P (C2 = x) 3/4 1/4

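The two tables can be reproduced by enumerating the four toss sequences of the two-period model; a Python sketch (names are mine, not from the slides):

```python
from itertools import product

# Two-period binomial model from the example (S0 = 4, u = 2, d = 1/2,
# p(H) = p(T) = 1/2) and a call with strike K = 14.
S0, u, d, p, K = 4.0, 2.0, 0.5, 0.5, 14.0

pmf_S2 = {}
pmf_C2 = {}
for tosses in product("HT", repeat=2):
    s = S0
    for t in tosses:
        s *= u if t == "H" else d          # stock moves up or down each period
    prob = p ** tosses.count("H") * (1 - p) ** tosses.count("T")
    pmf_S2[s] = pmf_S2.get(s, 0.0) + prob
    payoff = max(s - K, 0.0)               # call payoff (S2 - K)^+
    pmf_C2[payoff] = pmf_C2.get(payoff, 0.0) + prob
```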
Example - European Put Option

• European put option: confers the right to sell the stock at maturity or
expiration time T = 2 for strike price K = 3 dollars. It is worth
• K − S2 if K − S2 > 0
• 0 otherwise
• Value (payoff) of the option at maturity

P2 = max(K − S2 , 0) = (K − S2 )+

• Stock price: binomial model with S0 = 4, d = 1/2, u = 2, p(H) = p(T ) = 1/2
• Find the probability distribution for the payoff P2 of the European put option.
Solution

pmf of S2

x 1 4 16
P (S2 = x) 1/4 1/2 1/4

pmf of P2

x 0 2
P (P2 = x) 3/4 1/4

Practice

Find the cdf of the stock price S2 , European call option C2 and European put option
P2 in the previous example.
Distribution of a continuous random variable
Range(X): uncountable
Definition (Probability density function (pdf))
The pdf of X is a function f that satisfies
1 f (x) ≥ 0 for all x
2 P (−∞ < X < ∞) = ∫_{−∞}^∞ f (x)dx = 1

Properties
• P (X = a) = P (a ≤ X ≤ a) = ∫_a^a f (x)dx = 0
• P (a ≤ X ≤ b) = ∫_a^b f (x)dx = F (b) − F (a)
• cdf: F (a) = P (X ≤ a) = ∫_{−∞}^a f (x)dx
Probability as an Area

Note that probability of any individual value is 0


Interpretation of p.d.f

P (a − ϵ/2 ≤ X ≤ a + ϵ/2) = ∫_{a−ϵ/2}^{a+ϵ/2} f (x)dx ≈ ϵ f (a)

f (a) is a measure of how likely it is that the random variable will be near a.
Continuous uniform distribution

• X is uniformly distributed on [a, b] if its pdf is

fX (x) = 1/(b − a) if x ∈ [a, b],  0 otherwise

Any value in [a, b] is equally likely to be a value of X.
Denote X ∼ Uni[a, b].
• cdf

F (x) = 0 if x < a;  (x − a)/(b − a) if a ≤ x < b;  1 if x ≥ b
Exponential distribution with parameter λ

• pdf

f (x) = 0 if x < 0;  λe^{−λx} if x ≥ 0

• cdf

F (x) = 0 if x < 0;  1 − e^{−λx} if x ≥ 0
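The closed-form cdf makes the exponential distribution easy to sample by inverse transform; this Python sketch (an illustration, not from the slides) draws from Exp(λ) using only uniform random numbers:

```python
import math
import random

# Inverse-transform sampling for Exp(lam): since F(x) = 1 - e^{-lam x},
# F^{-1}(u) = -ln(1 - u)/lam maps Uni[0,1] draws to Exp(lam) draws.
def sample_exponential(lam, n, seed=0):
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

samples = sample_exponential(lam=2.0, n=100_000)
mean = sum(samples) / len(samples)   # should be close to E(X) = 1/lam = 0.5
```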
Normal distribution

Definition
Continuous RV X is said to be normally distributed with parameter µ and σ 2 if its pdf
is
f (x) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)} ,  −∞ < x < ∞

Denote X ∼ N (µ, σ² ).

Properties
1 E(X) = µ
2 V ar(X) = σ²
Simulate random numbers

Simulate uniform random numbers
• Simulate 1 random number from the uniform distribution: rand
• Simulate an m × n matrix of random numbers from the uniform distribution: rand(m,n)

Simulate random numbers from typical distributions
search for the corresponding generators for the binomial, Poisson, exponential, normal
distributions ...
Practice

• Generate 10000 uniform random numbers
• Plot a histogram of the generated sample to compare with the pdf of the uniform
distribution
• Similarly for binomial, Poisson, exponential, normal random numbers
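A Python sketch of this practice (the slides use MATLAB's rand): instead of plotting, it bins 10,000 Uni[0,1] draws into 10 equal bins, and each bin should hold roughly 1/10 of the sample, matching the flat pdf on [0, 1].

```python
import random

# Draw 10,000 Uni[0,1] numbers and bin them into 10 equal bins; the bin
# proportions approximate the flat uniform pdf (each about 0.1).
rng = random.Random(42)
sample = [rng.random() for _ in range(10_000)]

counts = [0] * 10
for x in sample:
    counts[min(int(x * 10), 9)] += 1     # bin index, guarding the edge x = 1.0
proportions = [c / len(sample) for c in counts]
```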
Definition (Expectation of random variable X)

E(X) = Σ_{k=1}^∞ xk P (X = xk ),  if X is discrete with countable values
E(X) = ∫_{−∞}^∞ x fX (x) dx,  if X is continuous (fX is the pdf of X)

Expectation of a function of a random variable

E(g(X)) = Σ_{k=1}^∞ g(xk )P (X = xk ),  if X is discrete
E(g(X)) = ∫_{−∞}^∞ g(x)fX (x)dx,  if X is continuous
Definition (Variance of random variable X)

V ar(X) = E[(X − E[X])² ]

Property

V ar(X) = E[X² ] − (E[X])²

where
E(X² ) = Σ_k xk² P (X = xk ) if X is a discrete random variable
E(X² ) = ∫_{−∞}^∞ x² fX (x)dx if X is a continuous random variable
Some properties of expectation and variance

• E(g(X) + h(X)) = E(g(X)) + E(h(X))


• E[aX + b] = aE[X] + b
• V ar[aX + b] = a2 V ar[X]

Example

Find the expectation and variance of the stock price S2 , European call option C2 and
European put option P2 in the previous example.
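For the discrete pmfs found earlier, the expectation and variance formulas above reduce to finite sums; a Python sketch applied to S2 and the call payoff C2 (the helper function is mine):

```python
# Expectation and variance from a pmf (a dict mapping value -> probability),
# applied to S2 and the call payoff C2 of the earlier example.
def mean_var(pmf):
    m = sum(x * p for x, p in pmf.items())               # E(X)
    v = sum(x * x * p for x, p in pmf.items()) - m * m   # E(X^2) - E(X)^2
    return m, v

pmf_S2 = {1: 0.25, 4: 0.5, 16: 0.25}
pmf_C2 = {0: 0.75, 2: 0.25}
E_S2, Var_S2 = mean_var(pmf_S2)   # E(S2) = 6.25
E_C2, Var_C2 = mean_var(pmf_C2)   # E(C2) = 0.5
```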
Joint distribution of two discrete random variables

Definition (Joint probability mass function)

pX,Y (x, y) = P (X = x, Y = y)

Properties
For constants a < b and c < d,

P (a ≤ X ≤ b, c ≤ Y ≤ d) = Σ_{x=a}^b Σ_{y=c}^d P (X = x, Y = y)
Marginal pmf

The pmfs of X and Y are given by

PX (X = x) = Σ_{y=−∞}^∞ P (X = x, Y = y)

and

PY (Y = y) = Σ_{x=−∞}^∞ P (X = x, Y = y)
Example
Consider binomial asset pricing model
• S0 = 4, u = 2, d = 1/2
• p(H) = 1/3, p(T ) = 2/3

Find the joint pmf of the stock price (S1 , S2 )


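The joint pmf asked for above can be computed by enumerating the four toss sequences; a Python sketch (names are mine, for illustration):

```python
from itertools import product

# Joint pmf of (S1, S2) in the binomial model of the example:
# S0 = 4, u = 2, d = 1/2, p(H) = 1/3, p(T) = 2/3.
S0, u, d, pH = 4.0, 2.0, 0.5, 1.0 / 3.0

joint = {}
for t1, t2 in product("HT", repeat=2):
    s1 = S0 * (u if t1 == "H" else d)
    s2 = s1 * (u if t2 == "H" else d)
    prob = (pH if t1 == "H" else 1 - pH) * (pH if t2 == "H" else 1 - pH)
    joint[(s1, s2)] = joint.get((s1, s2), 0.0) + prob
```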
Joint distribution of two continuous random variables

Definition (Joint pdf)


The joint probability density function of two continuous random variables X and Y ,
denoted by fX,Y (x, y), is a function which satisfies the following properties:
• fX,Y (x, y) ≥ 0, ∀x, y
• ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} fX,Y (x, y)dxdy = 1
• P ((X, Y ) ∈ R) = ∫∫_R fX,Y (x, y)dxdy

In particular P (a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_a^b ∫_c^d fX,Y (x, y)dydx
Probability as a volume

P ((X, Y ) ∈ R) = ∫∫_R fX,Y (x, y)dxdy
Definition (Marginal pdf)
The marginal pdfs of X and Y are given by

fX (x) = ∫_{−∞}^∞ fX,Y (x, y)dy

and

fY (y) = ∫_{−∞}^∞ fX,Y (x, y)dx

Definition (Joint cdf)

FX,Y (a, b) = P (X ≤ a, Y ≤ b) = ∫_{−∞}^a ∫_{−∞}^b fX,Y (x, y)dydx

Relationship between joint pdf and joint cdf

fX,Y (x, y) = ∂² FX,Y (x, y)/∂x∂y
Independence of random variables
• X and Y are independent if and only if

FX,Y (x, y) = FX (x)FY (y)  for all x, y

• If X and Y are discrete RVs then X and Y are independent if and only if

P (X = x, Y = y) = P (X = x)P (Y = y)  for all x, y

• If X and Y are continuous RVs then X and Y are independent if and only if

fX,Y (x, y) = fX (x)fY (y)  for all x, y
Definition (Expectation of a function of a random vector)

E[g(X, Y )] = Σ_x Σ_y g(x, y)P (X = x, Y = y),  if X, Y are discrete
E[g(X, Y )] = ∫_{−∞}^∞ ∫_{−∞}^∞ g(x, y)fX,Y (x, y)dxdy,  if X, Y are continuous

Example

E[XY ] = Σ_x Σ_y xyP (X = x, Y = y),  if X, Y are discrete
E[XY ] = ∫_{−∞}^∞ ∫_{−∞}^∞ xyfX,Y (x, y)dxdy,  if X, Y are continuous
Covariance and correlation coefficient

• Covariance of X and Y is cov(X, Y ) = E[XY ] − E[X]E[Y ].
• Correlation coefficient of X and Y is given by:

corr(X, Y ) = cov(X, Y )/(σX σY )

where σX , σY are the standard deviations of X and Y

−1 ≤ corr(X, Y ) ≤ 1

The correlation coefficient measures how strong the linear relationship between
X and Y is.
Bivariate normal distribution

Let X1 ∼ N (µ1 , σ1² ), X2 ∼ N (µ2 , σ2² ) be 2 normal random variables with covariance
σ12 . Then the random vector X = (X1 , X2 )T has a bivariate normal distribution if the
joint pdf is given by

f (x1 , x2 ) = (1/(2π det(Σ)^{1/2} )) e^{ −(1/2)(x−µ)T Σ^{−1} (x−µ) }

where
• µ = (µ1 , µ2 )T
• Σ = [ σ1²  σ12 ; σ12  σ2² ] is the variance-covariance matrix of X
Properties

• Cov(X, Y ) = Cov(Y, X)
• Cov(X, X) = V ar(X)
• Cov(aX, Y ) = aCov(X, Y )
• Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)
• If X and Y are independent then Cov(X, Y ) = 0

Variance of Sum

X1 , . . . , Xn : RVs

V ar(Σ_{i=1}^n Xi ) = Σ_{i=1}^n V ar(Xi ) + Σ_{i̸=j} Cov(Xi , Xj )

In particular

V ar(X + Y ) = V ar(X) + V ar(Y ) + 2Cov(X, Y )
Sum of normal distributions

• If (X, Y ) has a multivariate normal distribution, X ∼ N (µX , σX² ),
Y ∼ N (µY , σY² ) and Cov(X, Y ) = σXY then

X + Y ∼ N (µX + µY , σX² + σY² + 2σXY )

• If X ∼ N (µX , σX² ), Y ∼ N (µY , σY² ) and X and Y are independent then

X + Y ∼ N (µX + µY , σX² + σY² )
Example

Let X ∼ N (0, 4) and Y ∼ N (0, 1) be the daily returns of stocks A and B respectively.
Suppose that Cov(X, Y ) = 2 and the joint distribution of X and Y is a multivariate
normal distribution.
1 Consider a portfolio consisting of 70% stock A and 30% stock B. Then the return
of the portfolio is given by
W = 0.7X + 0.3Y
Determine the distribution of W .
2 Suppose that we aim to allocate stocks A and B with weights a and 1 − a. The
return of the portfolio is
U = aX + (1 − a)Y
Determine a to minimize the risk of the portfolio.
Solution

1 W is normally distributed with
• E(W ) = 0.7E(X) + 0.3E(Y ) = 0
•
V ar(W ) = V ar(0.7X) + V ar(0.3Y ) + 2Cov(0.7X, 0.3Y )
         = (0.7)² V ar(X) + (0.3)² V ar(Y ) + 2(0.21)Cov(X, Y )
         = (0.7)² × 4 + (0.3)² × 1 + (0.42) × 2 = 2.89

2 Evaluate risk via variance

V ar(U ) = 4a² + (1 − a)² + 4a(1 − a)

Need to find a ∈ [0, 1] to minimize V ar(U )
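The minimization in part 2 can be sketched numerically; note that Var(U) = 4a² + (1 − a)² + 4a(1 − a) simplifies algebraically to (1 + a)², so the minimum on [0, 1] is at a = 0 (hold only stock B). A Python check:

```python
# Risk of the portfolio U = aX + (1 - a)Y with Var(X) = 4, Var(Y) = 1,
# Cov(X, Y) = 2, as in the example; Var(U) simplifies to (1 + a)^2.
def var_U(a):
    return 4 * a**2 + (1 - a) ** 2 + 4 * a * (1 - a)

# Scan a grid over [0, 1]; the minimum risk is at a = 0 (hold only stock B).
grid = [i / 1000 for i in range(1001)]
a_star = min(grid, key=var_U)
min_risk = var_U(a_star)
```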
Conditioning a RV on an event

The conditional pmf of a RV X given an event A is

pX|A (x) = P (X = x|A) = P ((X = x) ∩ A)/P (A)

if P (A) > 0
Example

Let X be the roll of a fair die and let A be the event that the roll is an even number.
Then
pX|A (1) = P (X = 1 and roll is even)/P (roll is even) = 0
Example

Let X be the roll of a fair die and let B be the event that the roll is an even number.
Then
pX|B (2) = P (X = 2 and roll is even)/P (roll is even) = 1/3
Conditioning a discrete RV on another

• 2 RVs X and Y
• given Y = y with P (Y = y) > 0
• conditional pmf of X

pX|Y (x|y) = P (X = x|Y = y) = P (X = x, Y = y)/P (Y = y)
For each y, we view the joint pmf along the slice Y = y and renormalize so that

Σ_x pX|Y (x|y) = 1
Example

Consider a binomial asset pricing model with S0 = 4, d = 1/2, u = 2, p = 2/3 and
q = 1/3. Find the conditional pmf of S2 given S1 = 2.
Solution

x                   1    4    8
PS2 |S1 =2 (x|2)    1/3  2/3  0
Practice

Consider a binomial asset pricing model with S0 = 4, d = 1/2, u = 2, p = 2/3 and
q = 1/3. Find the conditional pmf of S1 given S2 = 4.
Practice

Consider binomial asset pricing model


• S0 = 4, u = 2, d = 1/2
• p = 1/3, q = 2/3
Find
1 The conditional probability of S2 = 16 given that S1 = 8
2 The conditional probability of S3 = 8 given that S1 = 8

Conditional distributions

• If X and Y are jointly distributed discrete random variables, then the conditional
probability mass function of Y given X = x is

P (Y = y|X = x) = P (X = x, Y = y)/P (X = x), defined when P (X = x) > 0.

• For continuous random variables X and Y , the conditional density function of Y
given X = x is

fY |X (y|x) = f (x, y)/fX (x),

which provides the likelihood that Y takes values near y given that X takes
values near x.
Example
Let X1 and X2 be jointly normal random variables with parameters µ1 , σ1 , µ2 , σ2 , and
ρ. The joint pdf is given by

f (x1 , x2 ) = (1/(2πσ1 σ2 √(1 − ρ²))) exp{ −(1/(2(1 − ρ²))) [ (x1 − µ1 )²/σ1² + (x2 − µ2 )²/σ2² − 2ρ(x1 − µ1 )(x2 − µ2 )/(σ1 σ2 ) ] }

X1 has a normal distribution N (µ1 , σ1² ) with pdf

fX1 (x1 ) = (1/(√(2π)σ1 )) e^{−(x1 −µ1 )²/(2σ1² )}

The conditional pdf of X2 given X1 = x1 is

fX2 |X1 (x2 |x1 ) = f (x1 , x2 )/fX1 (x1 ) = ...
Another way to find the conditional distribution of X2 given X1 = x1

Using the construction (Z1 , Z2 independent standard normals)

X1 = µ1 + σ1 Z1
X2 = µ2 + σ2 (ρZ1 + √(1 − ρ²) Z2 )

Given X1 = x1 , we have

Z1 = (x1 − µ1 )/σ1

Then

X2 = µ2 + σ2 ( ρ(x1 − µ1 )/σ1 + √(1 − ρ²) Z2 ) = µ2 + σ2 ρ(x1 − µ1 )/σ1 + σ2 √(1 − ρ²) Z2
Since Z2 and Z1 are independent, knowing Z1 does not provide any information on Z2 .
It implies that given X1 = x1 , X2 is an affine function of Z2 , thus it is normal with
mean

µ2 + σ2 ρ(x1 − µ1 )/σ1

and variance

σ2² (1 − ρ² )

So

X2 |(X1 = x1 ) ∼ N ( µ2 + σ2 ρ(x1 − µ1 )/σ1 , σ2² (1 − ρ² ) )
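The conditional mean µ2 + σ2 ρ(x1 − µ1)/σ1 can be checked by Monte Carlo: simulate the construction and keep only draws where X1 falls in a small window around x1. This Python sketch uses illustrative parameter values of my choosing (ρ = 0.8, standard margins):

```python
import math
import random

# Monte Carlo check: with X1 = mu1 + s1*Z1 and
# X2 = mu2 + s2*(rho*Z1 + sqrt(1 - rho^2)*Z2),
# theory gives E(X2 | X1 = x1) = mu2 + s2*rho*(x1 - mu1)/s1.
mu1, s1, mu2, s2, rho = 0.0, 1.0, 0.0, 1.0, 0.8
x1_target, tol = 1.0, 0.05

rng = random.Random(1)
cond = []
for _ in range(500_000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x1 = mu1 + s1 * z1
    if abs(x1 - x1_target) < tol:        # keep draws with X1 near x1 = 1
        cond.append(mu2 + s2 * (rho * z1 + math.sqrt(1 - rho**2) * z2))
cond_mean = sum(cond) / len(cond)        # theory: 0.8 * 1.0 = 0.8
```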
Conditional expectation of Y |X = x

• E(Y |X = x) = Σ_y yP (Y = y|X = x), if Y is discrete
  E(Y |X = x) = ∫_{−∞}^∞ yfY |X (y|x)dy, if Y is continuous
• E(Y |X = x) is a function of x, i.e., the result depends on the value of x
Example

Consider binomial asset pricing model


• S0 = 4, u = 2, d = 1/2
• p(H) = 1/3, p(T ) = 2/3
Find
1 The conditional expectation of S2 given that S1 = 8.
2 The conditional expectation of S3 given that S1 = 8

Example

Suppose that W1 ∼ N (0, 1) and W2 ∼ N (0, 1) are the log-returns of the first and the
second year of stock A. Suppose W1 and W2 are independent. The cumulative
log-return of this stock is
B1 = W1
and
B2 = W1 + W2 .
Given that B1 = 1, find the conditional expectation of B2 .
Important note

1 If X and Y are independent then E(Y |X = x) = E(Y ).
2 If g is a function then E(g(X)|X = x) = g(x).
3 If g is a function then

E(g(Y )|X = x) = Σ_y g(y)P (Y = y|X = x), if Y is discrete
E(g(Y )|X = x) = ∫_{−∞}^∞ g(y)fY |X (y|x)dy, if Y is continuous
σ-algebra: record of information

Let Ω be the sample space of a random experiment. A collection F of subsets of Ω is
called a σ-algebra over Ω if it satisfies the following conditions:
1 Ω ∈ F
2 A ∈ F ⇒ Ac ∈ F
3 Ai ∈ F, ∀i = 1, 2, · · · ⇒ ∪_{i=1}^∞ Ai ∈ F
Meaning: In measure-theoretic probability, information is modeled using σ-algebras.
The information associated with a σ-algebra F can be thought of as follows. A
random experiment is performed and an outcome ω is determined, but the value of ω
is not revealed. Instead, for each set in the σ-algebra F, we are told whether ω is in
the set. The more sets there are in F, the more information this provides.
Example
Some important σ-algebras of Ω - the sample space of three coin tosses
1 F0 = {∅, Ω}: the trivial σ-algebra - contains no information. Knowing whether the
outcome ω of the three tosses is in ∅ and whether it is in Ω tells you nothing
about ω.
2
F1 = {∅, Ω, {HHH, HHT, HT H, HT T }, {T HH, T HT, T T H, T T T }}
   = {∅, Ω, AH , AT }

where
AH = {HHH, HHT, HT H, HT T } = { H on first toss }
AT = {T HH, T HT, T T H, T T T } = { T on first toss }
F1 : information of the first coin or "information up to time 1". For
example, you are told that the first coin is H and no more.
Example (cont.)
3
F2 = {∅, Ω, AHH , AHT , AT H , AT T ,
      and all sets which can be built by taking unions of these}

where
AHH = {HHH, HHT } = {HH on the first two tosses}
AHT = {HT H, HT T } = {HT on the first two tosses}
AT H = {T HH, T HT } = {TH on the first two tosses}
AT T = {T T H, T T T } = {TT on the first two tosses}
F2 : information of the first two tosses or "information up to time 2"
4 F3 = the set of all subsets of Ω: "full information" about the outcome of all three
tosses
F-measurable

Definition
A random variable X is called F-measurable if σ(X) ⊂ F.
Meaning: the information in F is enough to determine the value of the random
variable X(ω), even though it may not be enough to determine the outcome ω of the
random experiment.
Example
Consider a binomial asset pricing model and let F2 be the σ-algebra generated by the
information of the first two tosses.
The asset prices S1 and S2 are F2 -measurable but S3 is not F2 -measurable.
Conditional expectation of X given a σ-algebra F

The conditional expectation E(X|F) is a random variable which satisfies
• F-measurable
Meaning: the estimate E(X|F) of X is based on the information in F
• Partial-averaging property

∫_A E(X|F)dP = ∫_A XdP for all A ∈ F

E(X|F) is indeed an estimate of X. It gives the same averages as X over all the
sets in F.
If F has many sets, which provide a fine resolution of the uncertainty inherent in
ω, then this partial-averaging property over the "small" sets in F says that
E(X|F) is a good estimator of X.
Conditional expectation given a random variable

Definition

E(Y |X) = E(Y |σ(X))

can be regarded as an estimate of the value of Y based on the knowledge of X

Find a formula for E(Y |X)
• Find g(x) = E(Y |X = x)
• E(Y |X) = g(X)
Example
Consider a binomial asset pricing model with S0 = 4, u = 1/d = 2, p(H) = 2/3 and
p(T ) = 1/3. Find E(S2 |S1 ).
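Following the two-step recipe above: first find g(s) = E(S2 | S1 = s), then E(S2 | S1) = g(S1). A Python sketch of the computation (function name is mine):

```python
# E(S2 | S1 = s) in the binomial model with u = 2, d = 1/2, p(H) = 2/3:
# from S1 = s the next price is u*s (prob p) or d*s (prob 1 - p), so
# g(s) = (p*u + (1 - p)*d) * s = (4/3 + 1/6) * s = 1.5 * s,
# and therefore E(S2 | S1) = 1.5 * S1.
u, d, p = 2.0, 0.5, 2.0 / 3.0

def cond_exp_next(s):
    return (p * u + (1 - p) * d) * s

E_given_8 = cond_exp_next(8.0)   # ≈ 12.0
E_given_2 = cond_exp_next(2.0)   # ≈ 3.0
```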
Properties of conditional expectation
1 Linearity
E(aY + bZ|X) = aE(Y |X) + bE(Z|X)
2 Taking out what we know (a typical approach to verify Markov and martingale
properties)
E(f (X)Y |X) = f (X)E(Y |X)
3 Iterated conditioning: for σ-algebras G ⊂ H,

E(E(Z|H)|G) = E(Z|G)

In particular E(Y ) = E(E(Y |G))
4 Independence
E(Y |G) = E(Y )
if Y is independent of G.
Example - Linearity

Consider a binomial asset pricing model with S0 = 4, u = 2, d = 1/2, p = 2/3 and
q = 1/3. Compare

E(S2 + S3 |S1 )

and

E(S2 |S1 ) + E(S3 |S1 )
Solution

• S1 takes two values 8 and 2
• Given S1 = 8

E(S2 |S1 = 8) = (2/3)(16) + (1/3)(4) = 12

E(S3 |S1 = 8) = (2/3)²(32) + (2/3)(1/3)(8) + (1/3)(2/3)(8) + (1/3)²(2) = 18

So

E(S2 |S1 = 8) + E(S3 |S1 = 8) = 12 + 18 = 30
• Given S1 = 8, (S2 , S3 ) takes the pairs of values (16, 32), (16, 8), (4, 8), (4, 2). So

E(S2 + S3 |S1 = 8) = (2/3)²(16 + 32) + (2/3)(1/3)(16 + 8)
                   + (1/3)(2/3)(4 + 8) + (1/3)²(4 + 2) = 30

• Hence E(S2 + S3 |S1 = 8) = E(S2 |S1 = 8) + E(S3 |S1 = 8) = 30
• Similarly E(S2 + S3 |S1 = 2) = E(S2 |S1 = 2) + E(S3 |S1 = 2) = 7.5
• Regardless of the outcome of S1 , we have

E(S2 + S3 |S1 ) = E(S2 |S1 ) + E(S3 |S1 )
Example - Take out what is known

Compare
E(S1 S2 |S1 )
and
S1 E(S2 |S1 )

Example - Iterated conditioning

Compare
E(S3 |S1 )
and
E(E(S3 |(S1 , S2 ))|S1 )

Example - Independence

Compare

E( S2 /S1 | S1 )

and

E( S2 /S1 )
Practice

Suppose that W1 ∼ N (0, 1) and W2 ∼ N (0, 1) are the log-returns of the first and the
second year of stock A. Suppose W1 and W2 are independent. The cumulative
log-return of this stock is
B1 = W1
and
B2 = W1 + W2 .
Find E(B2 |B1 ).
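The independence property suggests the answer E(B2 | B1) = B1, since W2 is independent of B1 and E(W2) = 0. A Monte Carlo sketch of this (window width and sample size are my choices) conditions on B1 falling near 1:

```python
import random

# Monte Carlo sketch: with B1 = W1 and B2 = W1 + W2, independence of W2
# from B1 and E(W2) = 0 give E(B2 | B1) = B1. Check near B1 ≈ 1.
rng = random.Random(7)
vals = []
for _ in range(500_000):
    w1, w2 = rng.gauss(0, 1), rng.gauss(0, 1)
    if abs(w1 - 1.0) < 0.05:             # condition on B1 close to 1
        vals.append(w1 + w2)             # sample of B2 given B1 ≈ 1
cond_mean = sum(vals) / len(vals)        # theory: E(B2 | B1 = 1) = 1
```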
