
Probability Review

January 26, 2023

Outline

• Textbook: Chapter 1 - Shreve I
• Review basic concepts in probability
• Sample space and probability
• Filtration as a record of information
• Random variables and expectation
• Conditional distribution and conditional expectation
Random process

A random process (or stochastic process) is a mathematical model of a probabilistic
experiment that evolves in time and generates a sequence of numerical values.
• the sequence of daily prices of a stock
• the sequence of scores in a football game
• the sequence of failure times of a machine
Each numerical value in the sequence is modeled by a random variable, so a stochastic
process is simply a (finite or infinite) sequence of random variables.
Why study random processes?

• Many phenomena can be modelled by random processes: understanding random
processes helps to understand the phenomena.
• Wide applications: finance, actuarial science, ...
• Provides a foundation for later courses: Financial Mathematics 1 and 2.
Course outline

• Review probability
• Introduction to random processes
• Introduction to some special random processes
  • Poisson process
  • Markov chain
  • Random walk
  • Brownian motion
• Stochastic calculus
  • Itô integral
  • Itô's formula for a function of Brownian motion and of an Itô process
  • Solving stochastic differential equations
Plan

1 Probability space
2 Random variables
  Random variables
  Simulation
  Expectation and Variance
3 Random vectors
4 Conditional distribution and conditional expectation
  Conditional distribution
  Conditional Expectation
Probability models

• Random experiment: produces uncertain outcomes under the same conditions.
• An outcome ω: a result of the experiment.
• A sample space Ω: the set of all results of the experiment.
• An event: a collection of outcomes.
• Probability measure P : estimates the likelihood of each event.
• P (A): how likely it is that event A occurs.
Axiom of Probability

A mapping P that assigns each event in a sample space Ω a number is a probability
measure on Ω if it satisfies
1 0 ≤ P (A) ≤ 1
2 P (Ω) = 1
3 If A1 , A2 , . . . are mutually exclusive (also called disjoint, i.e. Ai ∩ Aj = ∅ for all i ̸= j)
then

P (∪_{n=1}^∞ An ) = Σ_{n=1}^∞ P (An )
Probability measure on finite sample space

• Sample space with finitely many elements

Ω = {x1 , . . . , xn }

• For any event A,

P (A) = Σ_{xi ∈A} p(xi )

• The probability of each outcome p(xi ) must satisfy
1 0 ≤ p(xi ) ≤ 1
2 p(x1 ) + · · · + p(xn ) = 1
Example 1

• Experiment: toss a fair coin three times


• Sample space: all possible outcomes of three coin tosses

Ω = {HHH, HHT, HT H, HT T, T HH, T HT, T T H, T T T }

1 Probability of individual outcomes?
2 Probability of H on the first toss?
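The two questions above can be checked by enumerating the sample space directly. A minimal Python sketch (the variable names are mine, for illustration):

```python
from itertools import product

# Enumerate the sample space of three fair coin tosses; each of the 8
# outcomes is equally likely under the uniform probability measure.
omega = ["".join(t) for t in product("HT", repeat=3)]

p_individual = 1 / len(omega)                      # each outcome: 1/8
heads_first = [w for w in omega if w[0] == "H"]    # event "H on first toss"
p_h_first = len(heads_first) / len(omega)          # 4/8 = 1/2
```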
Additive rule

P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

where A ∪ B means at least one event occurs and A ∩ B means both events occur.

Additive rule for disjoint sets
If A and B are disjoint, i.e. A ∩ B = ∅, then they cannot occur simultaneously and

P (A ∪ B) = P (A) + P (B)

Complement rule
The complement of A is Ac = Ω − A, the event containing all outcomes that are not in A:

P (Ac ) = 1 − P (A)
Conditional probability

For events A and B, the conditional probability of A given B is

P (A|B) = P (A ∩ B)/P (B)

defined for P (B) > 0.

It measures the likelihood of A in the new sample space B.
Example
Roll a fair die twice. What is the probability that the first roll is a 2 given that the
sum of the rolls is 7?
Solution
• Sample space Ω = {(i, j) : 1 ≤ i, j ≤ 6}
• A: the first roll is a 2
• B: the sum of the rolls is 7
• Need to find P (A|B)
• B = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
• A = {(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)}
• AB = {(2, 5)}

P (A|B) = P (AB)/P (B) = (1/36)/(6/36) = 1/6
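The counting argument above can be verified mechanically; a short Python sketch over the 36 equally likely outcomes:

```python
# Verify P(A|B) = 1/6 by counting over the 36 equally likely outcomes.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
B = [(i, j) for (i, j) in omega if i + j == 7]   # sum of the rolls is 7
A = [(i, j) for (i, j) in omega if i == 2]       # first roll is a 2
AB = [w for w in A if w in B]                    # A ∩ B = {(2, 5)}
p_A_given_B = len(AB) / len(B)                   # (1/36)/(6/36) = 1/6
```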
Practice

Roll a fair die twice. What is the probability that the first roll is a 2 given that the
second roll is even?
Does the result of the first roll affect the second?
Independence

Two events A and B are independent if

P (A|B) = P (A) for P (B) > 0

There is no relationship between A and B: B has no effect on A, i.e. knowing B does
not change the probability that A happens.
Equivalent condition
P (AB) = P (A)P (B)
The second condition can be used to check the independence of A and B even when
P (B) = 0.
Multiplication rule
1 P (AB) = P (B)P (A|B)
2 General case

P (A1 A2 ...Ak ) = P (A1 )P (A2 |A1 )P (A3 |A1 A2 )...P (Ak |A1 ...Ak−1 )

Law of total probability
Let A1 , ..., Ak be a partition of the sample space:
• Ai ∩ Aj = ∅ for all i ̸= j (mutually exclusive (disjoint))
• ∪i Ai = Ω
Then, for any event B, we have

P (B) = Σ_{i=1}^k P (B ∩ Ai ) = Σ_{i=1}^k P (B|Ai )P (Ai ).
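As a hypothetical check, the law can be applied to the two-dice experiment: partition Ω by the value of the first roll (A1, ..., A6) and let B be the event that the sum of the rolls is 7. Given any first roll, exactly one of the six second rolls makes the sum 7.

```python
# Law of total probability on the two-dice experiment (illustrative numbers):
# partition by the first-roll value, B = {sum of the rolls is 7}.
p_Ai = 1 / 6                                  # P(A_i), i = 1..6
p_B_given_Ai = [1 / 6] * 6                    # P(B | A_i) for each i
p_B = sum(pb * p_Ai for pb in p_B_given_Ai)   # total probability: 1/6
```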
Random variables

Definition
A random variable is a function on the sample space

X: Ω→R
ω → X(ω)

Classify random variables
Range(X) = {X(ω) for ω ∈ Ω} is the set of all possible values of the random variable X.
• If Range(X) is a countable set then X is called a discrete random variable.
• If Range(X) is an uncountable set then X is called a continuous random variable.
Example - Binomial Asset Pricing Model
u: up factor, d: down factor

• Initial stock price S0
• Next period
  • Upward: uS0
  • Downward: dS0
  where 0 < d < 1 < u (e.g. d = 1/u)
• Toss a coin
  • Head: move up
  • Tail: move down
• With S0 = 4, u = 2, d = 1/2: Range(S1 ) = {2, 8}, so S1 is a discrete RV
Definition (Cumulative distribution function (cdf))
Probability that the value of X does not exceed a given value:

F (x) = P (X ≤ x)

Evaluate probabilities of a RV with the cdf
• P (X > x) = 1 − F (x)
• P (X < x) = lim_{t→x−} F (t) = F (x− )
• P (a < X ≤ b) = F (b) − F (a)
• P (a ≤ X ≤ b) = F (b) − F (a− )
Properties of cdf

• Increasing
• lim_{x→−∞} F (x) = 0 and lim_{x→∞} F (x) = 1
• Has left limits
• Right-continuous
Distribution of a discrete random variable
Range(X): countable set x1 , x2 , ...
Definition (Probability mass function (pmf))
pmf of X is a function pX given by

pX (xi ) = P (X = xi )

which satisfies
1 0 ≤ pX (xi ) ≤ 1 for all i
2 Σ_i pX (xi ) = 1

Properties
(1) P (a ≤ X ≤ b) = Σ_{a≤xi ≤b} pX (xi )
(2) cdf F (x) = Σ_{xi ≤x} pX (xi )
Bernoulli distribution

A random variable X has a Bernoulli distribution if it is the indicator random variable
of a trial with success probability p and failure probability 1 − p:

P (X = 0) = 1 − p
P (X = 1) = p

Denote X ∼ Ber(p)
Example - Bernoulli distribution

Toss a fair coin. Let X be the number of H.


Probability mass function (pmf) of X

x 0 1
P (X = x) P (T ) = 1/2 P (H) = 1/2

Binomial distribution

• n independent trials, each Ber(p)
• X is the number of successes
• X is called a Binomial RV with parameters (n, p)
• Denote X ∼ Bino(n, p).

P (X = k) = (n choose k) p^k (1 − p)^{n−k}
Example - Binomial distribution

Toss a fair coin twice. Pmf of the number of H

x P (X = x)
0 P (T T ) = 1/4
1 P (T H) + P (HT ) = 1/2
2 P (HH) = 1/4
Example - Binomial distribution

Toss a fair coin 3 times. Pmf of the number of H


x P (X = x)
0 P (T T T ) = 1/8
1 P (T T H) + P (T HT ) + P (HT T ) = 3/8
2 P (HHT ) + P (HT H) + P (T HH) = 3/8
3 P (HHH) = 1/8

Example - Payoff of European Call Option

• European call option: confers the right to buy the stock at maturity or
expiration time T = 2 for strike price K = 14 dollars. It is worth
• S2 − K if S2 − K > 0
• 0 otherwise
• Value (payoff) of the option at maturity

C2 = max(S2 − K, 0) = (S2 − K)+

where x+ = max(x, 0)
• Stock price: binomial model with S0 = 4, d = 1/2, u = 2,
p = p(H) = 1/2, q = p(T ) = 1 − p = 1/2
• Find the probability distribution for the payoff C2 of the European call option.
Solution

pmf of S2

x 1 4 16
P (S2 = x) 1/4 1/2 1/4

pmf of C2

x 0 2
P (C2 = x) 3/4 1/4

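The two tables can be reproduced by enumerating the four toss sequences of the two-period model; a Python sketch (names are mine, not from the slides):

```python
from itertools import product

# Two-period binomial model from the example (S0 = 4, u = 2, d = 1/2,
# p(H) = p(T) = 1/2) and a call with strike K = 14.
S0, u, d, p, K = 4.0, 2.0, 0.5, 0.5, 14.0

pmf_S2 = {}
pmf_C2 = {}
for tosses in product("HT", repeat=2):
    s = S0
    for t in tosses:
        s *= u if t == "H" else d          # stock moves up or down each period
    prob = p ** tosses.count("H") * (1 - p) ** tosses.count("T")
    pmf_S2[s] = pmf_S2.get(s, 0.0) + prob
    payoff = max(s - K, 0.0)               # call payoff (S2 - K)^+
    pmf_C2[payoff] = pmf_C2.get(payoff, 0.0) + prob
```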
Example - European Put Option

• European put option: confers the right to sell the stock at maturity or
expiration time T = 2 for strike price K = 3 dollars. It is worth
• K − S2 if K − S2 > 0
• 0 otherwise
• Value (payoff) of the option at maturity

P2 = max(K − S2 , 0) = (K − S2 )+

• Stock price: binomial model with S0 = 4, d = 1/2, u = 2, p(H) = p(T ) = 1/2
• Find the probability distribution for the payoff P2 of the European put option.
Solution

pmf of S2

x 1 4 16
P (S2 = x) 1/4 1/2 1/4

pmf of P2

x 0 2
P (P2 = x) 3/4 1/4

Practice

Find the cdf of the stock price S2 , European call option C2 and European put option
P2 in the previous example.
Distribution of a continuous random variable
Range(X): uncountable
Definition (Probability density function (pdf))
The pdf of X is a function f that satisfies
1 f (x) ≥ 0 for all x
2 P (−∞ < X < ∞) = ∫_{−∞}^∞ f (x)dx = 1

Properties
• P (X = a) = P (a ≤ X ≤ a) = ∫_a^a f (x)dx = 0
• P (a ≤ X ≤ b) = ∫_a^b f (x)dx = F (b) − F (a)
• cdf: F (a) = P (X ≤ a) = ∫_{−∞}^a f (x)dx
Probability as an Area

Note that probability of any individual value is 0


Interpretation of p.d.f

P (a − ϵ/2 ≤ X ≤ a + ϵ/2) = ∫_{a−ϵ/2}^{a+ϵ/2} f (x)dx ≈ ϵ f (a)

f (a) is a measure of how likely it is that the random variable will be near a.
Continuous uniform distribution

• X is uniformly distributed on [a, b] if its pdf is

fX (x) = 1/(b − a) if x ∈ [a, b],  0 otherwise

Any value in [a, b] is equally likely to be a value of X.
Denote X ∼ Uni[a, b].
• cdf

F (x) = 0 if x < a;  (x − a)/(b − a) if a ≤ x < b;  1 if x ≥ b
Exponential distribution with parameter λ

• pdf

f (x) = 0 if x < 0;  λe^{−λx} if x ≥ 0

• cdf

F (x) = 0 if x < 0;  1 − e^{−λx} if x ≥ 0
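The closed-form cdf makes the exponential distribution easy to sample by inverse transform; this Python sketch (an illustration, not from the slides) draws from Exp(λ) using only uniform random numbers:

```python
import math
import random

# Inverse-transform sampling for Exp(lam): since F(x) = 1 - e^{-lam x},
# F^{-1}(u) = -ln(1 - u)/lam maps Uni[0,1] draws to Exp(lam) draws.
def sample_exponential(lam, n, seed=0):
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

samples = sample_exponential(lam=2.0, n=100_000)
mean = sum(samples) / len(samples)   # should be close to E(X) = 1/lam = 0.5
```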
Normal distribution

Definition
Continuous RV X is said to be normally distributed with parameter µ and σ 2 if its pdf
is
f (x) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)} ,  −∞ < x < ∞

Denote X ∼ N (µ, σ² ).

Properties
1 E(X) = µ
2 V ar(X) = σ²
Simulate random numbers

Simulate uniform random numbers
• Simulate 1 random number from the uniform distribution: rand
• Simulate an m × n matrix of random numbers from the uniform distribution: rand(m,n)

Simulate random numbers from typical distributions
search for the corresponding generators for the binomial, Poisson, exponential, normal
distributions ...
Practice

• Generate 10000 uniform random numbers
• Plot a histogram of the generated sample to compare with the pdf of the uniform
distribution
• Similarly for binomial, Poisson, exponential, normal random numbers
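A Python sketch of this practice (the slides use MATLAB's rand): instead of plotting, it bins 10,000 Uni[0,1] draws into 10 equal bins, and each bin should hold roughly 1/10 of the sample, matching the flat pdf on [0, 1].

```python
import random

# Draw 10,000 Uni[0,1] numbers and bin them into 10 equal bins; the bin
# proportions approximate the flat uniform pdf (each about 0.1).
rng = random.Random(42)
sample = [rng.random() for _ in range(10_000)]

counts = [0] * 10
for x in sample:
    counts[min(int(x * 10), 9)] += 1     # bin index, guarding the edge x = 1.0
proportions = [c / len(sample) for c in counts]
```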
Definition (Expectation of random variable X)

E(X) = Σ_{k=1}^∞ xk P (X = xk ),  if X is discrete with countable values
E(X) = ∫_{−∞}^∞ x fX (x) dx,  if X is continuous (fX is the pdf of X)

Expectation of a function of a random variable

E(g(X)) = Σ_{k=1}^∞ g(xk )P (X = xk ),  if X is discrete
E(g(X)) = ∫_{−∞}^∞ g(x)fX (x)dx,  if X is continuous
Definition (Variance of random variable X)

V ar(X) = E[(X − E[X])² ]

Property

V ar(X) = E[X² ] − (E[X])²

where
E(X² ) = Σ_k xk² P (X = xk ) if X is a discrete random variable
E(X² ) = ∫_{−∞}^∞ x² fX (x)dx if X is a continuous random variable
Some properties of expectation and variance

• E(g(X) + h(X)) = E(g(X)) + E(h(X))


• E[aX + b] = aE[X] + b
• V ar[aX + b] = a2 V ar[X]

Example

Find the expectation and variance of the stock price S2 , European call option C2 and
European put option P2 in the previous example.
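For the discrete pmfs found earlier, the expectation and variance formulas above reduce to finite sums; a Python sketch applied to S2 and the call payoff C2 (the helper function is mine):

```python
# Expectation and variance from a pmf (a dict mapping value -> probability),
# applied to S2 and the call payoff C2 of the earlier example.
def mean_var(pmf):
    m = sum(x * p for x, p in pmf.items())               # E(X)
    v = sum(x * x * p for x, p in pmf.items()) - m * m   # E(X^2) - E(X)^2
    return m, v

pmf_S2 = {1: 0.25, 4: 0.5, 16: 0.25}
pmf_C2 = {0: 0.75, 2: 0.25}
E_S2, Var_S2 = mean_var(pmf_S2)   # E(S2) = 6.25
E_C2, Var_C2 = mean_var(pmf_C2)   # E(C2) = 0.5
```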
Joint distribution of two discrete random variables

Definition (Joint probability mass function)

pX,Y (x, y) = P (X = x, Y = y)

Properties
For constants a < b and c < d,

P (a ≤ X ≤ b, c ≤ Y ≤ d) = Σ_{x=a}^b Σ_{y=c}^d P (X = x, Y = y)
Marginal pmf

The pmfs of X and Y are given by

PX (X = x) = Σ_{y=−∞}^∞ P (X = x, Y = y)

and

PY (Y = y) = Σ_{x=−∞}^∞ P (X = x, Y = y)
Example
Consider binomial asset pricing model
• S0 = 4, u = 2, d = 1/2
• p(H) = 1/3, p(T ) = 2/3

Find the joint pmf of the stock price (S1 , S2 )


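The joint pmf asked for above can be computed by enumerating the four toss sequences; a Python sketch (names are mine, for illustration):

```python
from itertools import product

# Joint pmf of (S1, S2) in the binomial model of the example:
# S0 = 4, u = 2, d = 1/2, p(H) = 1/3, p(T) = 2/3.
S0, u, d, pH = 4.0, 2.0, 0.5, 1.0 / 3.0

joint = {}
for t1, t2 in product("HT", repeat=2):
    s1 = S0 * (u if t1 == "H" else d)
    s2 = s1 * (u if t2 == "H" else d)
    prob = (pH if t1 == "H" else 1 - pH) * (pH if t2 == "H" else 1 - pH)
    joint[(s1, s2)] = joint.get((s1, s2), 0.0) + prob
```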
Joint distribution of two continuous random variables

Definition (Joint pdf)


The joint probability density function of two continuous random variables X and Y ,
denoted by fX,Y (x, y), is a function which satisfies the following properties:
• fX,Y (x, y) ≥ 0, ∀x, y
• ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} fX,Y (x, y)dxdy = 1
• P ((X, Y ) ∈ R) = ∫∫_R fX,Y (x, y)dxdy

In particular P (a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_a^b ∫_c^d fX,Y (x, y)dydx
Probability as a volume

P ((X, Y ) ∈ R) = ∫∫_R fX,Y (x, y)dxdy
Definition (Marginal pdf)
The marginal pdfs of X and Y are given by

fX (x) = ∫_{−∞}^∞ fX,Y (x, y)dy

and

fY (y) = ∫_{−∞}^∞ fX,Y (x, y)dx

Definition (Joint cdf)

FX,Y (a, b) = P (X ≤ a, Y ≤ b) = ∫_{−∞}^a ∫_{−∞}^b fX,Y (x, y)dydx

Relationship between joint pdf and joint cdf

fX,Y (x, y) = ∂² FX,Y (x, y)/∂x∂y
Independence of random variables
• X and Y are independent if and only if

FX,Y (x, y) = FX (x)FY (y)  for all x, y

• If X and Y are discrete RVs then X and Y are independent if and only if

P (X = x, Y = y) = P (X = x)P (Y = y)  for all x, y

• If X and Y are continuous RVs then X and Y are independent if and only if

fX,Y (x, y) = fX (x)fY (y)  for all x, y
Definition (Expectation of a function of a random vector)

E[g(X, Y )] = Σ_x Σ_y g(x, y)P (X = x, Y = y),  if X, Y are discrete
E[g(X, Y )] = ∫_{−∞}^∞ ∫_{−∞}^∞ g(x, y)fX,Y (x, y)dxdy,  if X, Y are continuous

Example

E[XY ] = Σ_x Σ_y xyP (X = x, Y = y),  if X, Y are discrete
E[XY ] = ∫_{−∞}^∞ ∫_{−∞}^∞ xyfX,Y (x, y)dxdy,  if X, Y are continuous
Covariance and correlation coefficient

• Covariance of X and Y is cov(X, Y ) = E[XY ] − E[X]E[Y ].
• Correlation coefficient of X and Y is given by:

corr(X, Y ) = cov(X, Y )/(σX σY )

where σX , σY are the standard deviations of X and Y

−1 ≤ corr(X, Y ) ≤ 1

The correlation coefficient measures how strong the linear relationship between
X and Y is.
Bivariate normal distribution

Let X1 ∼ N (µ1 , σ1² ), X2 ∼ N (µ2 , σ2² ) be 2 normal random variables with covariance
σ12 . Then the random vector X = (X1 , X2 )T has a bivariate normal distribution if the
joint pdf is given by

f (x1 , x2 ) = (1/(2π det(Σ)^{1/2} )) e^{ −(1/2)(x−µ)T Σ^{−1} (x−µ) }

where
• µ = (µ1 , µ2 )T
• Σ = [ σ1²  σ12 ; σ12  σ2² ] is the variance-covariance matrix of X
Properties

• Cov(X, Y ) = Cov(Y, X)
• Cov(X, X) = V ar(X)
• Cov(aX, Y ) = aCov(X, Y )
• Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)
• If X and Y are independent then Cov(X, Y ) = 0

Variance of Sum

X1 , . . . , Xn : RVs

V ar(Σ_{i=1}^n Xi ) = Σ_{i=1}^n V ar(Xi ) + Σ_{i̸=j} Cov(Xi , Xj )

In particular

V ar(X + Y ) = V ar(X) + V ar(Y ) + 2Cov(X, Y )
Sum of normal distributions

• If (X, Y ) has a multivariate normal distribution, X ∼ N (µX , σX² ),
Y ∼ N (µY , σY² ) and Cov(X, Y ) = σXY then

X + Y ∼ N (µX + µY , σX² + σY² + 2σXY )

• If X ∼ N (µX , σX² ), Y ∼ N (µY , σY² ) and X and Y are independent then

X + Y ∼ N (µX + µY , σX² + σY² )
Example

Let X ∼ N (0, 4) and Y ∼ N (0, 1) be the daily returns of stocks A and B respectively.
Suppose that Cov(X, Y ) = 2 and the joint distribution of X and Y is a multivariate
normal distribution.
1 Consider a portfolio consisting of 70% stock A and 30% stock B. Then the return
of the portfolio is given by
W = 0.7X + 0.3Y
Determine the distribution of W .
2 Suppose that we aim to allocate stocks A and B with weights a and 1 − a. The
return of the portfolio is
U = aX + (1 − a)Y
Determine a to minimize the risk of the portfolio.
Solution

1 W is normally distributed with
• E(W ) = 0.7E(X) + 0.3E(Y ) = 0
•
V ar(W ) = V ar(0.7X) + V ar(0.3Y ) + 2Cov(0.7X, 0.3Y )
         = (0.7)² V ar(X) + (0.3)² V ar(Y ) + 2(0.21)Cov(X, Y )
         = (0.7)² × 4 + (0.3)² × 1 + (0.42) × 2 = 2.89

2 Evaluate risk via variance

V ar(U ) = 4a² + (1 − a)² + 4a(1 − a)

Need to find a ∈ [0, 1] to minimize V ar(U )
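The minimization in part 2 can be sketched numerically; note that Var(U) = 4a² + (1 − a)² + 4a(1 − a) simplifies algebraically to (1 + a)², so the minimum on [0, 1] is at a = 0 (hold only stock B). A Python check:

```python
# Risk of the portfolio U = aX + (1 - a)Y with Var(X) = 4, Var(Y) = 1,
# Cov(X, Y) = 2, as in the example; Var(U) simplifies to (1 + a)^2.
def var_U(a):
    return 4 * a**2 + (1 - a) ** 2 + 4 * a * (1 - a)

# Scan a grid over [0, 1]; the minimum risk is at a = 0 (hold only stock B).
grid = [i / 1000 for i in range(1001)]
a_star = min(grid, key=var_U)
min_risk = var_U(a_star)
```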
Conditioning a RV on an event

The conditional pmf of a RV X given an event A is

pX|A (x) = P (X = x|A) = P ((X = x) ∩ A)/P (A)

if P (A) > 0
Example

Let X be the roll of a fair die and let A be the event that the roll is an even number.
Then
pX|A (1) = P (X = 1 and roll is even)/P (roll is even) = 0
Example

Let X be the roll of a fair die and let B be the event that the roll is an even number.
Then
pX|B (2) = P (X = 2 and roll is even)/P (roll is even) = 1/3
Conditioning a discrete RV on another

• 2 RVs X and Y
• given Y = y with P (Y = y) > 0
• conditional pmf of X

pX|Y (x|y) = P (X = x|Y = y) = P (X = x, Y = y)/P (Y = y)
For each y, we view the joint pmf along the slice Y = y and renormalize so that

Σ_x pX|Y (x|y) = 1
Example

Consider a binomial asset pricing model with S0 = 4, d = 1/2, u = 2, p = 2/3 and
q = 1/3. Find the conditional pmf of S2 given S1 = 2.
Solution

x                   1    4    8
PS2 |S1 =2 (x|2)    1/3  2/3  0
Practice

Consider a binomial asset pricing model with S0 = 4, d = 1/2, u = 2, p = 2/3 and
q = 1/3. Find the conditional pmf of S1 given S2 = 4.
Practice

Consider binomial asset pricing model


• S0 = 4, u = 2, d = 1/2
• p = 1/3, q = 2/3
Find
1 The conditional probability of S2 = 16 given that S1 = 8
2 The conditional probability of S3 = 8 given that S1 = 8

Conditional distributions

• If X and Y are jointly distributed discrete random variables, then the conditional
probability mass function of Y given X = x is

P (Y = y|X = x) = P (X = x, Y = y)/P (X = x), defined when P (X = x) > 0.

• For continuous random variables X and Y , the conditional density function of Y
given X = x is

fY |X (y|x) = f (x, y)/fX (x),

which provides the likelihood that Y takes values near y given that X takes
values near x.
Example
Let X1 and X2 be jointly normal random variables with parameters µ1 , σ1 , µ2 , σ2 , and
ρ. The joint pdf is given by

f (x1 , x2 ) = (1/(2πσ1 σ2 √(1 − ρ²))) exp{ −(1/(2(1 − ρ²))) [ (x1 − µ1 )²/σ1² + (x2 − µ2 )²/σ2² − 2ρ(x1 − µ1 )(x2 − µ2 )/(σ1 σ2 ) ] }

X1 has a normal distribution N (µ1 , σ1² ) with pdf

fX1 (x1 ) = (1/(√(2π)σ1 )) e^{−(x1 −µ1 )²/(2σ1² )}

The conditional pdf of X2 given X1 = x1 is

fX2 |X1 (x2 |x1 ) = f (x1 , x2 )/fX1 (x1 ) = ...
Another way to find the conditional distribution of X2 given X1 = x1

Using the construction (Z1 , Z2 independent standard normals)

X1 = µ1 + σ1 Z1
X2 = µ2 + σ2 (ρZ1 + √(1 − ρ²) Z2 )

Given X1 = x1 , we have

Z1 = (x1 − µ1 )/σ1

Then

X2 = µ2 + σ2 ( ρ(x1 − µ1 )/σ1 + √(1 − ρ²) Z2 ) = µ2 + σ2 ρ(x1 − µ1 )/σ1 + σ2 √(1 − ρ²) Z2
Since Z2 and Z1 are independent, knowing Z1 does not provide any information on Z2 .
It implies that given X1 = x1 , X2 is an affine function of Z2 , thus it is normal with
mean

µ2 + σ2 ρ(x1 − µ1 )/σ1

and variance

σ2² (1 − ρ² )

So

X2 |(X1 = x1 ) ∼ N ( µ2 + σ2 ρ(x1 − µ1 )/σ1 , σ2² (1 − ρ² ) )
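The conditional mean µ2 + σ2 ρ(x1 − µ1)/σ1 can be checked by Monte Carlo: simulate the construction and keep only draws where X1 falls in a small window around x1. This Python sketch uses illustrative parameter values of my choosing (ρ = 0.8, standard margins):

```python
import math
import random

# Monte Carlo check: with X1 = mu1 + s1*Z1 and
# X2 = mu2 + s2*(rho*Z1 + sqrt(1 - rho^2)*Z2),
# theory gives E(X2 | X1 = x1) = mu2 + s2*rho*(x1 - mu1)/s1.
mu1, s1, mu2, s2, rho = 0.0, 1.0, 0.0, 1.0, 0.8
x1_target, tol = 1.0, 0.05

rng = random.Random(1)
cond = []
for _ in range(500_000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x1 = mu1 + s1 * z1
    if abs(x1 - x1_target) < tol:        # keep draws with X1 near x1 = 1
        cond.append(mu2 + s2 * (rho * z1 + math.sqrt(1 - rho**2) * z2))
cond_mean = sum(cond) / len(cond)        # theory: 0.8 * 1.0 = 0.8
```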
Conditional expectation of Y |X = x

• E(Y |X = x) = Σ_y yP (Y = y|X = x), if Y is discrete
  E(Y |X = x) = ∫_{−∞}^∞ yfY |X (y|x)dy, if Y is continuous
• E(Y |X = x) is a function of x, i.e., the result depends on the value of x
Example

Consider binomial asset pricing model


• S0 = 4, u = 2, d = 1/2
• p(H) = 1/3, p(T ) = 2/3
Find
1 The conditional expectation of S2 given that S1 = 8.
2 The conditional expectation of S3 given that S1 = 8

Example

Suppose that W1 ∼ N (0, 1) and W2 ∼ N (0, 1) are the log-returns of the first and the
second year of stock A. Suppose W1 and W2 are independent. The cumulative
log-return of this stock is
B1 = W1
and
B2 = W1 + W2 .
Given that B1 = 1, find the conditional expectation of B2 .
Important note

1 If X and Y are independent then E(Y |X = x) = E(Y ).
2 If g is a function then E(g(X)|X = x) = g(x).
3 If g is a function then

E(g(Y )|X = x) = Σ_y g(y)P (Y = y|X = x), if Y is discrete
E(g(Y )|X = x) = ∫_{−∞}^∞ g(y)fY |X (y|x)dy, if Y is continuous
σ-algebra: record of information

Let Ω be the sample space of a random experiment. A collection F of subsets of Ω is
called a σ-algebra over Ω if it satisfies the following conditions:
1 Ω ∈ F
2 A ∈ F ⇒ Ac ∈ F
3 Ai ∈ F, ∀i = 1, 2, · · · ⇒ ∪_{i=1}^∞ Ai ∈ F
Meaning: In measure-theoretic probability, information is modeled using σ-algebras.
The information associated with a σ-algebra F can be thought of as follows. A
random experiment is performed and an outcome ω is determined, but the value of ω
is not revealed. Instead, for each set in the σ-algebra F, we are told whether ω is in
the set. The more sets there are in F, the more information this provides.
Example
Some important σ-algebras of Ω - the sample space of three coin tosses
1 F0 = {∅, Ω}: the trivial σ-algebra - contains no information. Knowing whether the
outcome ω of the three tosses is in ∅ and whether it is in Ω tells you nothing
about ω.
2
F1 = {∅, Ω, {HHH, HHT, HT H, HT T }, {T HH, T HT, T T H, T T T }}
   = {∅, Ω, AH , AT }

where
AH = {HHH, HHT, HT H, HT T } = { H on first toss }
AT = {T HH, T HT, T T H, T T T } = { T on first toss }
F1 : information of the first coin or "information up to time 1". For
example, you are told that the first coin is H and no more.
Example (cont.)
3
F2 = {∅, Ω, AHH , AHT , AT H , AT T ,
      and all sets which can be built by taking unions of these}

where
AHH = {HHH, HHT } = {HH on the first two tosses}
AHT = {HT H, HT T } = {HT on the first two tosses}
AT H = {T HH, T HT } = {TH on the first two tosses}
AT T = {T T H, T T T } = {TT on the first two tosses}
F2 : information of the first two tosses or "information up to time 2"
4 F3 = the set of all subsets of Ω: "full information" about the outcome of all three
tosses
F-measurable

Definition
A random variable X is called F-measurable if σ(X) ⊂ F.
Meaning: the information in F is enough to determine the value of the random
variable X(ω), even though it may not be enough to determine the outcome ω of the
random experiment.
Example
Consider a binomial asset pricing model and let F2 be the σ-algebra generated by the
information of the first two tosses.
The asset prices S1 and S2 are F2 -measurable but S3 is not F2 -measurable.
Conditional expectation of X given a σ-algebra F

The conditional expectation E(X|F) is a random variable which satisfies
• F-measurable
Meaning: the estimate E(X|F) of X is based on the information in F
• Partial-averaging property

∫_A E(X|F)dP = ∫_A XdP for all A ∈ F

E(X|F) is indeed an estimate of X. It gives the same averages as X over all the
sets in F.
If F has many sets, which provide a fine resolution of the uncertainty inherent in
ω, then this partial-averaging property over the "small" sets in F says that
E(X|F) is a good estimator of X.
Conditional expectation given a random variable

Definition

E(Y |X) = E(Y |σ(X))

can be regarded as an estimate of the value of Y based on the knowledge of X

Find a formula for E(Y |X)
• Find g(x) = E(Y |X = x)
• E(Y |X) = g(X)
Example
Consider a binomial asset pricing model with S0 = 4, u = 1/d = 2, p(H) = 2/3 and
p(T ) = 1/3. Find E(S2 |S1 ).
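Following the two-step recipe above: first find g(s) = E(S2 | S1 = s), then E(S2 | S1) = g(S1). A Python sketch of the computation (function name is mine):

```python
# E(S2 | S1 = s) in the binomial model with u = 2, d = 1/2, p(H) = 2/3:
# from S1 = s the next price is u*s (prob p) or d*s (prob 1 - p), so
# g(s) = (p*u + (1 - p)*d) * s = (4/3 + 1/6) * s = 1.5 * s,
# and therefore E(S2 | S1) = 1.5 * S1.
u, d, p = 2.0, 0.5, 2.0 / 3.0

def cond_exp_next(s):
    return (p * u + (1 - p) * d) * s

E_given_8 = cond_exp_next(8.0)   # ≈ 12.0
E_given_2 = cond_exp_next(2.0)   # ≈ 3.0
```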
Properties of conditional expectation
1 Linearity
E(aY + bZ|X) = aE(Y |X) + bE(Z|X)
2 Taking out what we know (a typical approach to verify Markov and martingale
properties)
E(f (X)Y |X) = f (X)E(Y |X)
3 Iterated conditioning: for σ-algebras G ⊂ H,

E(E(Z|H)|G) = E(Z|G)

In particular E(Y ) = E(E(Y |G))
4 Independence
E(Y |G) = E(Y )
if Y is independent of G.
Example - Linearity

Consider a binomial asset pricing model with S0 = 4, u = 2, d = 1/2, p = 2/3 and
q = 1/3. Compare

E(S2 + S3 |S1 )

and

E(S2 |S1 ) + E(S3 |S1 )
Solution

• S1 takes two values 8 and 2
• Given S1 = 8

E(S2 |S1 = 8) = (2/3)(16) + (1/3)(4) = 12

E(S3 |S1 = 8) = (2/3)²(32) + (2/3)(1/3)(8) + (1/3)(2/3)(8) + (1/3)²(2) = 18

So

E(S2 |S1 = 8) + E(S3 |S1 = 8) = 12 + 18 = 30
• Given S1 = 8, (S2 , S3 ) takes the pairs of values (16, 32), (16, 8), (4, 8), (4, 2). So

E(S2 + S3 |S1 = 8) = (2/3)²(16 + 32) + (2/3)(1/3)(16 + 8)
                   + (1/3)(2/3)(4 + 8) + (1/3)²(4 + 2) = 30

• Hence E(S2 + S3 |S1 = 8) = E(S2 |S1 = 8) + E(S3 |S1 = 8) = 30
• Similarly E(S2 + S3 |S1 = 2) = E(S2 |S1 = 2) + E(S3 |S1 = 2) = 7.5
• Regardless of the outcome of S1 , we have

E(S2 + S3 |S1 ) = E(S2 |S1 ) + E(S3 |S1 )
Example - Take out what is known

Compare
E(S1 S2 |S1 )
and
S1 E(S2 |S1 )

Example - Iterated conditioning

Compare
E(S3 |S1 )
and
E(E(S3 |(S1 , S2 ))|S1 )

Example - Independence

Compare

E( S2 /S1 | S1 )

and

E( S2 /S1 )
Practice

Suppose that W1 ∼ N (0, 1) and W2 ∼ N (0, 1) are the log-returns of the first and the
second year of stock A. Suppose W1 and W2 are independent. The cumulative
log-return of this stock is
B1 = W1
and
B2 = W1 + W2 .
Find E(B2 |B1 ).
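The independence property suggests the answer E(B2 | B1) = B1, since W2 is independent of B1 and E(W2) = 0. A Monte Carlo sketch of this (window width and sample size are my choices) conditions on B1 falling near 1:

```python
import random

# Monte Carlo sketch: with B1 = W1 and B2 = W1 + W2, independence of W2
# from B1 and E(W2) = 0 give E(B2 | B1) = B1. Check near B1 ≈ 1.
rng = random.Random(7)
vals = []
for _ in range(500_000):
    w1, w2 = rng.gauss(0, 1), rng.gauss(0, 1)
    if abs(w1 - 1.0) < 0.05:             # condition on B1 close to 1
        vals.append(w1 + w2)             # sample of B2 given B1 ≈ 1
cond_mean = sum(vals) / len(vals)        # theory: E(B2 | B1 = 1) = 1
```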
