
Laura Ballotta

MSc Financial Mathematics

October 2008

© Laura Ballotta - Do not reproduce without permission.

Table of Contents

1. Review of Measure Theory and Probability Theory

   (a) The basic framework: the probability space
   (b) Random variables
   (c) Conditional expectation
   (d) Change of measure

2. Stochastic processes

   (a) Some introductory definitions
   (b) Classes of processes

3. Brownian motions

   (a) The martingale property
   (b) Construction of a Brownian motion
   (c) The variation process of a Brownian motion
   (d) The reflection principle and functionals of a Brownian motion
   (e) Correlated Brownian motions
   (f) Simulating trajectories of the Brownian motion - part 1

4. Itô Integrals and Itô Calculus

   (a) Motivation
   (b) The construction of the Itô integral
   (c) Itô processes and stochastic calculus
   (d) Stochastic differential equations
   (e) Steady-state distribution
   (f) The Brownian bridge and stratified Monte Carlo

5. The Change of Measure for Brownian Motions

   (a) Change of probability measure: the martingale problem
   (b) PDE detour
   (c) Feynman-Kac representation
   (d) Martingale representation theorem


References

[1] Grimmett, G. and D. Stirzaker (2003). Probability and Random Processes. Oxford University Press.

[2] Mikosch, T. (2004). Elementary Stochastic Calculus, with Finance in View. World Scientific Publishing Co Pte Ltd.

[3] Shreve, S. (2004). Stochastic Calculus for Finance II: Continuous-Time Models. Springer Finance.


Introduction

This set of lecture notes will take you through the theory of Brownian motions and stochastic calculus, which is required for a sound understanding of modern option pricing theory and of the modelling of the term structure of interest rates.

As the theory of stochastic processes has its own special "language", the first chapter is devoted to introducing this new notation, but also to some revision of the basic concepts in probability theory required in the following chapters. Particular attention is given to the conditional expectation operator, which is the building block of modern mathematical finance. This will allow us to introduce the idea of a martingale, which underpins the theory of contingent claim pricing. Once these concepts are clear and well understood, we will devote the rest of the module to the Brownian motion and the rules of calculus that go with it. These will be our main "tools" for financial applications, which are explored in great detail in the module "Mathematical Models for Financial Derivatives".

As the Brownian motion by construction ties us to a prespecified distribution of the increments of the process, we will also introduce very briefly a more general class of processes which can be used in the context of mathematical finance. However, the full investigation of these processes and their applications will be the focus of the module "Advanced Stochastic Modelling in Finance", which runs in Term 2.

The material in this booklet covers the entire module; however, it is far from being exhaustive, and students are strongly recommended to do some self-reading. Some references have been provided on the previous page.

Each chapter contains a number of sample exam questions, some in the form of solved examples, others in the form of exercises for you to practise. Solutions to these exercises will be posted on CitySpace at some point before the end of term, together with the solutions to the exam papers that you will find in the very last chapter of this booklet.

Needless to say, waiting for these solutions to become available before attempting the exercises on your own will not help you much in preparing for the exam itself. You need to test yourself first!


1 Review of Measure Theory and Probability Theory

1.1 The basic framework: the probability space

Imagine a random experiment, like the toss of a coin or the prices of securities traded in the market over the next period of time. Imagine that we want to explore the features of this random experiment in order to make appropriate and informed decisions. These features could be: the expected price of the security tomorrow, or its volatility; or the characteristics of the tails of the price distribution (if, for example, you need to calculate some risk measure such as VaR, or the expected shortfall).

In order to do all this, we need appropriate tools describing the random experiment in such a way that we can extract all this information, i.e. we need a mathematical model of the random experiment. This is represented by the so-called probability space.

Definition 1 (Probability space) We denote the probability space by the triplet

Θ := (Ω, F, P).

A probability space can be considered as a mathematical model of a random experiment.

This definition is telling us that the probability space is made up of three building blocks, which we are going to explore one by one.

The first piece of the probability space is Ω, which represents our sample space, i.e. the set of all possible outcomes of the random experiment.

Example 1 Let the random experiment be defined as: choose a number from the unit interval [0, 1]. Then Ω = {ω : 0 ≤ ω ≤ 1} = [0, 1].

Example 2 Assume now that the random experiment you are interested in is the evolution of a stock price over an infinite time horizon, when only 2 states of nature can occur, i.e. up or down. Then Ω = the set of all infinite sequences of ups and downs = {ω : ω = ω_1 ω_2 ω_3 ...}, where ω_n is the result at the n-th period.

The second piece you need in order to have a probability space is F, which is called a σ-algebra. The σ-algebra of a random experiment can be interpreted as the collection of all possible histories of the random experiment itself. Formally, it is defined as follows.

Definition 2 (σ-algebra) Given a set Ω, a collection F of subsets of Ω is a σ-algebra if:

1. ∅ ∈ F

2. A ∈ F implies A^c ∈ F


3. {A_m} ∈ F implies ∪_{m=1}^∞ A_m ∈ F (infinite union).

Example 3 1. F = {∅, Ω} is a σ-algebra.

2. Consider some event A ⊂ Ω. Then the σ-algebra generated by A is F = {∅, Ω, A, A^c}.

3. Consider the sample space defined above for the evolution of the stock price in a 2-state economy, i.e. Ω = the set of infinite sequences of ups and downs, and define

A_U = {ω : ω_1 = U}
A_D = {ω : ω_1 = D}.

The σ-algebra generated by these two sets is

F^(1) = {∅, Ω, A_U, A_D}.

Now consider the sets

A_UU = {ω : ω_1 = U, ω_2 = U}
A_UD = {ω : ω_1 = U, ω_2 = D}
A_DU = {ω : ω_1 = D, ω_2 = U}
A_DD = {ω : ω_1 = D, ω_2 = D}.

Then

F^(2) = {∅, Ω, A_UU, A_UD, A_DU, A_DD, A_UU^c, A_UD^c, A_DU^c, A_DD^c, A_U, A_D,
A_UU ∪ A_DU, A_UU ∪ A_DD, A_DU ∪ A_UD, A_UD ∪ A_DD}

is the corresponding σ-algebra.
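The closure mechanics behind a generated σ-algebra can be made concrete on a finite sample space. The sketch below is not part of the original notes: it truncates the infinite up/down sequences to their first two moves, so Ω has four outcomes, and closes a collection of events under complement and pairwise union (which, on a finite space, yields the generated σ-algebra):

```python
from itertools import combinations

def generated_sigma_algebra(omega, generators):
    """Close a collection of events under complement and union.

    On a finite sample space this closure is exactly the sigma-algebra
    generated by the given events."""
    sigma = {frozenset(), frozenset(omega)} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(sigma)
        for a in current:
            comp = frozenset(omega) - a        # closure under complement
            if comp not in sigma:
                sigma.add(comp)
                changed = True
        for a, b in combinations(current, 2):  # closure under union
            u = a | b
            if u not in sigma:
                sigma.add(u)
                changed = True
    return sigma

# Two-period truncation of the binomial sample space: Omega = {UU, UD, DU, DD}.
omega = {"UU", "UD", "DU", "DD"}
A_U = {"UU", "UD"}                             # first move up
F1 = generated_sigma_algebra(omega, [A_U])
print(len(F1))                                 # 4: {emptyset, Omega, A_U, A_U^c}
F2 = generated_sigma_algebra(omega, [{"UU"}, {"UD"}, {"DU"}, {"DD"}])
print(len(F2))                                 # 16 sets, matching F^(2) above
```

The 16 sets produced for F^(2) are exactly the sets listed in the example: the four atoms, their complements, the empty set, Ω, and the six two-atom unions.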

Example 4 The Borel σ-algebra B on R is the σ-algebra generated by the open subsets of R.

Every σ-algebra has a set of properties that will be useful in the future.

Theorem 3 The σ-algebra F has the following properties:

1. Ω ∈ F.

2. {A_m} ∈ F implies ∩_{m=1}^∞ A_m ∈ F (infinite intersection).


Proof. 1) ∅ ∈ F by definition, hence ∅^c = Ω ∈ F by definition as well (apply properties 1 and 2 from the previous definition).

2) By assumption {A_m} ∈ F; hence A_m^c ∈ F, which implies that ∪_{m=1}^∞ A_m^c ∈ F. By De Morgan's law (b)^1:

∪_{m=1}^∞ A_m^c = (∩_{m=1}^∞ A_m)^c,

therefore (∩_{m=1}^∞ A_m)^c ∈ F. From the definition of σ-algebra, it follows that ((∩_{m=1}^∞ A_m)^c)^c ∈ F, and consequently ∩_{m=1}^∞ A_m ∈ F.

The last piece of our probability space is represented by the symbol P. This is called the probability measure, and you can consider it as a sort of "metric" that measures the likelihood of a specific event or history of the random experiment.

Definition 4 A probability measure P is a set function P : F → [0, 1] such that:

1. P(Ω) = 1

2. For any sequence of disjoint events {A_m}, P(∪_{m=1}^∞ A_m) = Σ_{m=1}^∞ P(A_m).

Based on this definition, you can show that

P(∅) = 0;
P(A ∪ B) = P(A) + P(B) for disjoint A and B;
P(A^c) = 1 − P(A).

Moreover, we can define independent events: two events, A and B, are independent if and only if P(A ∩ B) = P(A) P(B).

Example 5 Consider the previous example of the evolution of the stock price over an infinite time horizon, so that Ω = {ω : ω = ω_1 ω_2 ω_3 ...}, and A_U = {ω : ω_1 = U}, A_D = {ω : ω_1 = D}. Assume that the different up/down movements at each time step are independent, and let

P(A_U) = p; P(A_D) = q = 1 − p.

^1 Proposition (De Morgan's laws) (a) (A ∪ B)^c = A^c ∩ B^c. More generally: (∪_m A_m)^c = ∩_m A_m^c.
(b) (A ∩ B)^c = A^c ∪ B^c. Generalising: (∩_m A_m)^c = ∪_m A_m^c.

Proof. (a) Assume x ∈ ∩_{m=1}^∞ A_m^c. Then x ∈ A_m^c ∀m. Hence x ∉ A_m ∀m, which implies x ∉ ∪_{m=1}^∞ A_m. Therefore x ∈ (∪_{m=1}^∞ A_m)^c.
(b) Assume x ∈ ∪_{m=1}^∞ A_m^c; then x ∈ A_m^c for some m. Hence x ∉ A_m for the same m. Therefore x ∉ ∩_{m=1}^∞ A_m, and hence x ∈ (∩_{m=1}^∞ A_m)^c. The other direction of each statement can be proved in a similar fashion.


Then

P(A_UU) = p^2; P(A_UD) = P(A_DU) = pq; P(A_DD) = q^2.

Further, P(A_UU^c) = 1 − p^2; similarly, you can calculate the probability of each other set in F^(2). Moreover, if A_UUU = {ω : ω_1 = U, ω_2 = U, ω_3 = U}, you can calculate that P(A_UUU) = p^3. And so on. Hence, in the limit you can conclude that the probability of the sequence UUU... is zero (provided p < 1). The same applies, for example, to the sequence UDUD...; in fact this sequence is the intersection of the events U, UD, UDU, .... From this example, we can conclude that every single sequence in Ω has probability zero.
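A quick Monte Carlo sanity check of these path probabilities (an illustration, not from the notes; the value p = 0.6 is an arbitrary choice):

```python
import random

random.seed(42)
p = 0.6              # P(up) at each step, chosen for illustration
n = 200_000

# Estimate P(A_UU) = P(first two independent moves are both up).
hits = sum(1 for _ in range(n)
           if random.random() < p and random.random() < p)
est = hits / n
print(round(est, 3), p * p)   # simulated estimate vs exact p^2 = 0.36
```

With 200,000 trials the standard error is about 0.001, so the estimate lands very close to the exact value p^2.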

In the previous example, we have shown that

P(every movement is up) = 0;

this implies that we are sure this event will not happen. Similarly, since the above is true, we are sure to get at least one down movement in the sequence, although we do not know exactly when in the sequence. Because of this fact, and the fact that the infinite sequence UUU... is in the sample space (which means that it is still a possible outcome), mathematicians have come up with a somewhat strange way of saying this: we will get at least one down movement almost surely.

Definition 5 Let (Ω, F, P) be a probability space. If A ∈ F is such that

P(A) = 1,

we say that the event A occurs almost surely (a.s.).

Now, in order to introduce the next definition, consider the following, maybe a little silly, example. Assume that you want to measure the length of a room, and assume you express this measure in metres and centimetres. It turns out that the room is 4.30 m long. Now assume that you want to change the reference system and express the length of the room in feet and inches. Then, the room is about 14 ft long. But in the process of switching from one reference system to the other, the room did not change: it did not shrink; it did not expand. The same applies to events and probability measures. The idea is given in the following.

Definition 6 (Absolutely continuous/equivalent probability measures) Given two probability measures P and P* defined on the same σ-algebra F:

i) P is absolutely continuous with respect to P*, written P << P*, if P(A) = 0 whenever P*(A) = 0, ∀A ∈ F.

ii) If P << P* and also P* << P, then P ∼ P*, i.e. P and P* are equivalent measures.

Thus, for P ∼ P* the following hold:

• P(A) = 0 ⇔ P*(A) = 0 (same null sets)

• P(A) = 1 ⇔ P*(A) = 1 (same a.s. sets)

• P(A) > 0 ⇔ P*(A) > 0 (same sets of positive measure)

Example 6 Consider a closed interval [a, b], for 0 ≤ a ≤ b ≤ 1, and consider the experiment of choosing a number from the unit interval (as in Example 1). Define the following:

P(the number chosen is in [a, b]) = P[a, b] := b − a.

But you can also define a different metric P*, according to which

P*(the number chosen is in [a, b]) = P*[a, b] := b^2 − a^2.

As there is a conversion factor that helps you to switch between metres and feet, so that 4.30 m ≈ 14 ft, there is also a conversion factor between probability measures. However, this conversion factor depends on a few objects that we have not met yet. Therefore, the discussion of this last feature is postponed to the end of this unit.
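To see that P and P* really are two different "metrics" on the same experiment, one can sample under each. A minimal sketch, not from the notes, assuming the number is drawn from [0, 1]: under P* the cumulative probability of [0, x] is x^2, so inverse-transform sampling gives a draw X = √U with U uniform:

```python
import math
import random

random.seed(0)
a, b = 0.25, 0.75
n = 200_000

# Under P* the chance of [a, b] is b^2 - a^2; since the CDF is F*(x) = x^2
# on [0, 1], inverse-transform sampling gives X = sqrt(U), U ~ Uniform[0, 1].
hits = sum(1 for _ in range(n) if a <= math.sqrt(random.random()) <= b)
print(hits / n, b**2 - a**2)   # empirical frequency vs exact value 0.5
```

The same interval that has P-probability b − a = 0.5 also has P*-probability b^2 − a^2 = 0.5 here only by coincidence of the chosen endpoints; try a = 0, b = 0.5 to see the two measures disagree (0.5 versus 0.25).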

Exercise 1 Let A and B belong to some σ-algebra F. Show that F contains the sets A ∩ B, A\B, and A∆B, where ∆ denotes the symmetric difference operator, i.e.

A∆B = {x : x ∈ A, x ∉ B or x ∉ A, x ∈ B}.

Exercise 2 Show that for every function f : Ω → R the following hold:

1. f^{−1}(∪_n A_n) = ∪_n f^{−1}(A_n);

2. f^{−1}(∩_n A_n) = ∩_n f^{−1}(A_n);

3. f^{−1}(A^c) = (f^{−1}(A))^c,

for any subsets A_n, A of R.

Exercise 3 Let F be a σ-algebra of subsets of Ω and suppose that B ∈ F. Show that G = {A ∩ B : A ∈ F} is a σ-algebra of subsets of B.

Exercise 4 Let P be a probability measure on F. Show that P has the following properties:

1. for any A, B ∈ F such that A ∩ B = ∅, P(A ∪ B) = P(A) + P(B);

2. for any A, B ∈ F such that A ⊂ B, P(A) ≤ P(B) [Hint: use the fact that for any two sets A and B such that A ⊂ B, B = A ∪ (B\A), where we define B\A := {x : x ∈ B, x ∉ A} (the set difference operator)];

3. for any A, B ∈ F such that A ⊂ B, P(B\A) = P(B) − P(A).


1.2 Random variables

So far, we have considered random events, like an up or down movement in the stock price over the next period of time, and the likelihood of such events occurring, as described by the probability measure. The next step in which you might be interested is to "quantify" the outcome of the random event; for example, you might want to know how much the stock price is going to change if an up or down movement occurs in the next time period. In order to do this, you need the idea of a random variable.

Definition 7 (Random variable) Let (Ω, F, P) be a probability space. A random variable X is a function X : Ω → R such that {ω ∈ Ω | X(ω) ≤ x} ∈ F ∀x ∈ R.

Note that if B is any set of the form B = (−∞, x], x ∈ R (these sets generate the Borel σ-algebra B), then Definition 7 says that X^{−1}(B) ∈ F. In other words, any random variable is a measurable function^2, i.e. a numerical quantity whose value is determined by the random experiment of choosing some ω ∈ Ω.

Example 7 Consider once again the random experiment of the evolution of the stock price over an infinite time horizon in a 2-state economy, described in Example 3. Let us define the stock prices by the formulae:

S_0(ω) = 4;

S_1(ω) = 8 if ω_1 = up; 2 if ω_1 = down;

S_2(ω) = 16 if ω_1 = ω_2 = up; 4 if ω_1 ≠ ω_2; 1 if ω_1 = ω_2 = down.

All of these are random variables, assigning a numerical value to each sequence of up and down movements in the stock price at each time period. Example 5 tells us how to calculate the probability that the random variable S takes any of these values; for example

P(S_1(ω) = 8) = P(A_U) = p;

P(S_2(ω) = 4) = P(A_DU ∪ A_UD) = 2pq.

The above example shows that we can associate with any random variable another function measuring the likelihood of its outcomes. This is what we call the law of X. Precisely, by the law of X we mean a probability measure on (R, B), L_X : B → [0, 1], such that

L_X(B) = P(X ∈ B) ∀B ∈ B.

^2 Definition (Measurable function) Let F be a σ-algebra on Ω and f : Ω → R. For A ⊂ R let

f^{−1}(A) = {ω ∈ Ω | f(ω) ∈ A};

then, f is called F-measurable if f^{−1}(E) ∈ F ∀E ∈ B, where f^{−1}(E) is called the pre-image of E.


In general, we prefer to speak in terms of the distribution of a random variable; this is a function F_X : R → [0, 1] defined as

F_X(a) = P(X ≤ a) = P(ω : X(ω) ≤ a).

This is the law of X evaluated at sets B of the form B = (−∞, a], i.e. F_X(a) = L_X((−∞, a]).

In some special cases, we can describe the distribution function of a random variable X in even more detail. The first case is that of a discrete random variable, like the one introduced in Example 7, which assigns lumps of mass to events. For this random variable, we can express the distribution function as

F_X(a) = P(X ≤ a) = Σ_{x ≤ a} p_X(x),

where p_X(x) is the probability mass function of X. If instead the random variable X spreads the mass continuously over the real line, then we have a continuous random variable and

F_X(a) = P(X ≤ a) = ∫_{−∞}^a f_X(x) dx, (1)

where f_X(x) denotes the density function of X.

Exercise 5 Let X be a random variable. Show that the distribution F_X of X defined by

F_X(A) = P(X ∈ A) = P(X^{−1}(A)), A ∈ B(R),

is a probability measure on the σ-algebra B(R).

Remark 1 (A matter of notation) From equation (1), we see that we could write the density function as

f_X(x) = dF_X/dx = dP(ω)/dx ∀x ∈ R.

The expectation E of a random variable X on (Ω, F, P) is then defined by:

E[X] = ∫_Ω X(ω) dP(ω) = ∫_{−∞}^∞ x dF_X(x).

The expectation returns the mean of the distribution. You might also be interested in the dispersion around the mean; this feature is described by the variance of a random variable. Further features that characterise the distribution of a random variable are the skewness (degree of asymmetry) and the kurtosis (behaviour of the tails). These features are described by the moments (about the mean) of a random variable, which can be recovered via the moment generating function (MGF)

M_X(k) = E[e^{kX}] = ∫_{−∞}^∞ e^{kx} dF_X(x).


Example 8 A few (and very important, as we will use them throughout the entire year) examples of random variables:

1. The Poisson random variable is an example of a discrete random variable. More precisely, a Poisson random variable N ∼ Poi(λ) with rate λ has probability mass function

p_N(n) = e^{−λ} λ^n / n!,

from which it follows that

E(N) = λ = Var(N); M_N(k) = e^{λ(e^k − 1)}.

2. The normal (or Gaussian) random variable X ∼ N(µ, σ^2) is a continuous random variable defined by the density function

f_X(x) = e^{−(x−µ)^2/(2σ^2)} / (σ√(2π)).

You can easily show that

E(X) = µ; Var(X) = σ^2; M_X(k) = e^{kµ + k^2 σ^2/2}.

3. Assume X ∼ Γ(α, λ), α > 0. Then X is a non-negative random variable which follows a Gamma distribution; its density function is given by

f(x) = (λ^α / Γ(α)) x^{α−1} e^{−λx},

where Γ(α) is the Gamma function, which is defined as

Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx,

and has the property that^3

Γ(α) = (α − 1) Γ(α − 1).

This means that

Γ(α) = (α − 1)!

when α is a positive integer. The MGF of X is

M_X(k) = (λ^α / Γ(α)) ∫_0^∞ x^{α−1} e^{−x(λ−k)} dx.

^3 Why don't you try to prove this last property... just integrate by parts.


Set y = x(λ − k) (for k < λ); then

M_X(k) = (λ^α / Γ(α)) ∫_0^∞ (y/(λ − k))^{α−1} e^{−y} dy/(λ − k) = (λ/(λ − k))^α.

Note that if α = 1, then X follows an exponential distribution with rate λ. Using the MGF you can show that the Gamma random variable has mean µ = α/λ and variance ν = α/λ^2. The parameter α is the shape parameter, whilst λ is the rate (inverse scale) parameter.
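The stated mean α/λ and variance α/λ^2 can be checked by simulation. A sketch with arbitrary parameter values (note that Python's random.gammavariate takes a shape and a scale, and in this rate parameterisation the scale is 1/λ):

```python
import random

random.seed(1)
alpha, lam = 3.0, 2.0          # shape and rate, chosen for illustration
n = 200_000

# random.gammavariate(shape, scale); with rate lam the scale is 1/lam.
xs = [random.gammavariate(alpha, 1.0 / lam) for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(round(mean, 2), alpha / lam)        # sample mean vs alpha/lam = 1.5
print(round(var, 2), alpha / lam ** 2)    # sample variance vs alpha/lam^2 = 0.75
```

The sample moments agree with the MGF-derived formulae to well within Monte Carlo error.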

Moment generating functions suffer from the disadvantage that the integrals which define them may not always be finite.

Example 9 A Cauchy random variable X has density function

f(x) = 1/(π(1 + x^2)), x ∈ R.

Hence the MGF of X would be given by

M_X(k) = ∫_{−∞}^∞ e^{kx} / (π(1 + x^2)) dx.

This is an improper integral of the first kind which does not converge unless k = 0 (which of course tells us nothing, since M_X(0) = 1 for any random variable). In fact, if you perform the limit comparison test against (1/x)^α with α = 2, you obtain:

lim_{x→∞} [e^{kx}/(π(1 + x^2))] / (1/x)^2 = (1/π) lim_{x→∞} e^{kx} = 0 if k < 0; ∞ if k > 0,

lim_{x→−∞} [e^{kx}/(π(1 + x^2))] / (1/x)^2 = (1/π) lim_{x→−∞} e^{kx} = 0 if k > 0; ∞ if k < 0.

Hence, the MGF of a Cauchy random variable does not exist.

Characteristic functions are another class of functions, equally useful, whose finiteness is guaranteed.

Definition 8 The characteristic function of X is the function φ_X : R → C defined by

φ_X(u) = E[e^{iuX}],

where i = √−1.

This is a common transformation and is often called the Fourier transform of the density f of X, if this quantity exists. In this case

φ_X(u) = ∫ e^{iux} dF(x) = ∫ e^{iux} f(x) dx.


The characteristic function of a random variable has several nice properties. Firstly, it always exists and is finite: note that

φ_X(u) = E[e^{iuX}] = E(cos(uX) + i sin(uX)),

hence^4

|cos(uX) + i sin(uX)| := √(cos^2(uX) + sin^2(uX)) = 1.

Then

|E[e^{iuX}]| ≤ E|e^{iuX}| = 1.

Moreover:

1. if X and Y are independent random variables, φ_{X+Y}(u) = φ_X(u) φ_Y(u);

2. if a, b ∈ R and Y = aX + b, then φ_Y(u) = e^{iub} φ_X(au).

1.2.1 Examples of characteristic functions

Calculations of integrals involving complex numbers are not always pleasant; usually you would need to know about contour integration... but for our purposes you can get away with only knowing about analytic continuation.

Analytic continuation provides a way of extending the domain over which a complex function is defined. Let us start from a complex function f (like the characteristic function); this function is complex differentiable at z_0, with derivative A, if and only if

f(z) = f(z_0) + A(z − z_0) + o(z − z_0) as z → z_0.

A complex function is said to be analytic on a region D if it is complex differentiable at every point in D (i.e. it has no singularities, that is, points at which the function "blows up" or becomes degenerate). Now, let f_1 and f_2 be analytic functions on domains D_1 and D_2 respectively, with D_1 ⊂ D_2, such that f_1 = f_2 on D_1. Then f_2 is called the analytic continuation of f_1 to D_2. Moreover, if it exists, the analytic continuation of f_1 to D_2 is unique.

Consider now the MGF M_X of some random variable X; we can say that the function

M_X(z) = ∫_{−∞}^∞ f(x) e^{zx} dx, z ∈ C,

is the analytic continuation of M_X to the complex plane, if it respects the condition above. Then, the characteristic function of X, φ_X, is the restriction of M_X to the imaginary axis, i.e.

φ_X(u) = M_X(iu).

And now, let's calculate some characteristic functions.

^4 Note that this is the modulus of the complex number z = cos(uX) + i sin(uX), and you can interpret the notation as a norm.


1. Let X ∼ N(0, 1). The characteristic function is

φ_X(u) = (1/√(2π)) ∫_{−∞}^∞ e^{iux − x^2/2} dx.

Now consider the real-valued function

M_X(k) = (1/√(2π)) ∫_{−∞}^∞ e^{kx − x^2/2} dx = e^{k^2/2},

i.e. the MGF of X. Since R ⊂ C, M_X has an analytic continuation to the complex plane given by

M_X(z) = (1/√(2π)) ∫_{−∞}^∞ e^{zx − x^2/2} dx = e^{z^2/2}, z ∈ C.

Therefore, by analytic continuation,

φ_X(u) = M_X(iu) = e^{−u^2/2}.

2. Let X be a Poisson random variable with rate λ. You can apply the same argument as above (i.e. analytic continuation) to show that

φ_X(u) = M_X(iu) = e^{λ(e^{iu} − 1)}.

3. Consider now the Gamma distribution. Analytic continuation implies that

φ_X(u) = (λ/(λ − iu))^α.

4. Assume X is a Cauchy random variable, i.e.

f(x) = 1/(π(1 + x^2)).

We cannot use the analytic continuation argument because the function is not analytic (can you spot why?). Here you need to use contour integration and the residue theorem. You should obtain that

φ_X(u) = e^{−|u|}.
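The first of these characteristic functions is easy to verify numerically: for X ∼ N(0, 1) the imaginary part E[sin(uX)] vanishes by symmetry, so a sample average of cos(uX) should approach e^{−u^2/2}. A small sketch, not from the notes (u = 1.3 is an arbitrary choice):

```python
import math
import random

random.seed(7)
u = 1.3
n = 200_000

# Estimate Re(phi_X(u)) = E[cos(uX)] for X ~ N(0, 1); the imaginary part
# E[sin(uX)] is zero by symmetry of the standard normal density.
est = sum(math.cos(u * random.gauss(0.0, 1.0)) for _ in range(n)) / n
print(round(est, 3), round(math.exp(-u * u / 2), 3))   # estimate vs e^{-u^2/2}
```

The same experiment with Cauchy draws (math.tan(math.pi * (U − 0.5)) for uniform U) reproduces e^{−|u|}, even though the MGF of that distribution does not exist.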

1.3 Conditional expectation

At the beginning of this unit, we talked about the problem of setting up a mathematical model of a random experiment, in order to support our decision process. Specifically, we talked about informed decisions, and we have seen that information in the probability space is captured by the σ-algebra. Then, in the previous section, we saw how to quantify a random event by using random variables.

Now, consider as always that some random experiment is performed, whose outcome is some ω ∈ Ω. Imagine that we are given some information, G, about this possible outcome: not enough to know the precise value of ω, but enough to narrow down the possibilities. Then, we can use this information to estimate, although not precisely, the value of the random variable X(ω). Such an estimate is represented by the conditional expectation of X given G.

In order to understand the definition of conditional expectation, we first need to familiarise ourselves with the indicator function. Precisely, we use the notation 1_A for

1_A(ω) = 1 if ω ∈ A; 0 otherwise.

Hence 1_A is a random variable which follows a Bernoulli distribution, taking value 1 with probability P(A) and 0 with probability P(A^c). Hence E[1_A] = P(A). Properties of the indicator function are listed below.

1. 1_A + 1_{A^c} = 1_{A ∪ A^c} = 1_Ω = 1;

2. 1_{A ∩ B} = 1_A 1_B.
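Since 1_A is Bernoulli, E[1_A] = P(A) can be seen in a one-line simulation (illustrative, not from the notes; here A is the event that a uniform draw on [0, 1] falls in [0, 0.3]):

```python
import random

random.seed(9)
n = 200_000

# Average of the indicator 1_A over many draws: a Bernoulli variable whose
# expectation is P(A) = 0.3 for A = {omega <= 0.3} under the uniform measure.
indicator_mean = sum(1 for _ in range(n) if random.random() <= 0.3) / n
print(round(indicator_mean, 2))   # close to P(A) = 0.3
```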

Now, we are ready for the following.

Definition 9 (Axiomatic definition, Kolmogorov) Let (Ω, F, P) be a probability space and X a random variable with E|X| < ∞. Let G be a sub-σ-algebra of F. Then the random variable Y = E[X|G] is the conditional expectation of X with respect to G if:

1. Y is G-measurable (Y ∈ G).

2. E|Y| < ∞.

3. ∀A ∈ G: E(Y 1_A) = E(X 1_A), i.e. ∫_A Y dP = ∫_A X dP.

The idea is that, if X and G are somehow connected, we can expect the information contained in G to reduce our uncertainty about X. In other words, we can better predict X with the help of G. In fact, Definition 9 is telling us that, although the estimate of X based on G is itself a random variable, the value of the estimate E[X|G] can be determined from the information in G (property 1). Further, Y is an unbiased estimator of X (property 3 with A = Ω).

Example 10 Consider once again the stock price evolution described in Example 7. Suppose you are told that the outcome of the first stock price movement is "up". You can now use this information to estimate the value of S_2:

E[S_2(ω) | up] = 12p + 4.


In this case, the conditioning event is A_U. Similarly,

E[S_2(ω) | down] = 3p + 1,

with conditioning event A_D. Question: what is E[S_2(ω) | A_UD]?
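Both conditional expectations can be checked by brute-force enumeration of the two-period model of Example 7. A sketch, not from the notes: p = 0.6 is an arbitrary choice, and paths are truncated to the first two moves:

```python
# Two-period binomial model from Example 7: S2 on each path, with path
# probabilities built from p (up) and q = 1 - p at each independent step.
p = 0.6
q = 1 - p
S2 = {"UU": 16, "UD": 4, "DU": 4, "DD": 1}
prob = {"UU": p * p, "UD": p * q, "DU": q * p, "DD": q * q}

def cond_exp(paths):
    """E[S2 | the outcome lies in `paths`], by direct enumeration."""
    total = sum(prob[w] for w in paths)
    return sum(S2[w] * prob[w] for w in paths) / total

print(cond_exp({"UU", "UD"}))   # compare with 12p + 4
print(cond_exp({"DU", "DD"}))   # compare with 3p + 1
```

For the question posed above, conditioning on the single path event A_UD pins S_2 down completely: cond_exp({"UD"}) returns 4, with no randomness left.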

Theorem 10 The conditional expectation has the following properties:

1. E[E(X|G)] = E[X], i.e. E[Y] = E[X].

2. If G = {∅, Ω} (the smallest σ-algebra), E[X|G] = E[X].

3. If G = F, E[X|G] = X.

4. If X ∈ G, E[X|G] = X.

5. If Z ∈ G, then E[ZX|G] = Z E[X|G] = ZY.

6. Let G_0 ⊂ G; then E[E(X|G) | G_0] = E[X|G_0].

7. Let G_0 ⊂ G; then E[E(X|G_0) | G] = E[X|G_0].

8. If X is independent of G, then E[X|G] = E[X].

Proof. One by one:

1. Check point 3 in the previous definition for A = Ω (remember that Ω ∈ G): E[Y 1_Ω] = E[X 1_Ω], but 1_Ω = 1.

2. Check point 3 in the axiomatic definition with candidate Y = E[X]. For A = ∅, we have ∫_∅ Y dP = ∫_∅ X dP = 0. For A = Ω,

E[X 1_Ω] = E[X];
E[E(X) 1_Ω] = E[X],

by virtue of property 1. Hence both sides return E[X].

3. Verify the definition of conditional expectation on X itself for G = F:

• X ∈ F, because it is F-measurable by definition of random variable.

• E|X| < ∞ by assumption (axiomatic definition).

• E(Y 1_A) = E(X 1_A) ∀A ∈ G holds with Y = X.


In this case you have available the entire "history" of X. Hence you know everything, and therefore there is no uncertainty left.

4. If X ∈ G, then we are back in the same situation as depicted in (3).

5. We prove this property for the simple case of an indicator function; hence, assume Z = 1_B for some B ∈ G. Then condition 3 in the definition of conditional expectation reads:

∀A ∈ G, E(ZX 1_A) = E(X 1_A 1_B) = E(X 1_{A∩B}).

But A ∩ B ∈ G, so condition 3 implies

E(X 1_{A∩B}) = E(Y 1_{A∩B}) = E(Y 1_A 1_B) = E(ZY 1_A).

The extension to the case of a more general random variable relies on the construction of a random variable as the limit of sums of indicator functions. However, this is beyond the scope of this unit.

6. Let Y = E[X|G] and Z = E[X|G_0]. If A ∈ G_0, then E(Z 1_A) = E(X 1_A); but since G_0 ⊂ G, A ∈ G as well, and by definition E(Y 1_A) = E(X 1_A). Therefore E(Z 1_A) = E(Y 1_A) ∀A ∈ G_0.

7. Let Z = E[X|G_0]; then Z ∈ G_0. Since G_0 ⊂ G, it follows that Z ∈ G. Therefore E[Z|G] = Z.

8. ∀A ∈ G: E(X 1_A) = E(X) E(1_A) = E[E(X) 1_A].

Exercise 6 Let X_1, X_2, ... be identically distributed random variables with mean µ, and let N be a random variable taking values in the non-negative integers and independent of the X_i. Let S = X_1 + X_2 + ... + X_N. Show that E(S|N) = µN and deduce that E(S) = µE(N).
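The identity E(S) = µE(N) in Exercise 6 can be observed by simulation before proving it. A sketch under assumed distributions not fixed by the exercise: exponential summands with mean µ = 2.5, and N uniform on {0, ..., 4} so that E(N) = 2:

```python
import random

random.seed(3)
mu = 2.5
n_trials = 100_000

def sample_S():
    """One draw of S = X_1 + ... + X_N, with N independent of the X_i."""
    N = random.randint(0, 4)                    # E(N) = 2
    # random.expovariate takes a rate, so rate 1/mu gives mean mu.
    return sum(random.expovariate(1 / mu) for _ in range(N))

est = sum(sample_S() for _ in range(n_trials)) / n_trials
print(round(est, 2), mu * 2)                    # sample mean vs mu * E(N) = 5.0
```

Swapping in any other distributions for the X_i or N (keeping them independent) leaves the identity intact, which is the point of the exercise.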

Exercise 7 We define the conditional variance of a random variable X given a σ-algebra F by

Var(X|F) = E[(X − E(X|F))^2 | F].

Show that

Var(X) = E[Var(X|F)] + Var[E(X|F)].

1.4 Change of measure

Let us go back to the example of measuring the length of a room and of wishing to do this using different reference systems. If you want to convert metres into feet, you need a "bridge" between the two (1 ft ≈ 0.30 m). There is something equivalent to this for probability measures, and it is defined as follows.


Theorem 11 (Radon-Nikodým) If P and P* are two probability measures on (Ω, F) such that P ∼ P*, then there exists a random variable Y ∈ F such that

P*(A) = ∫_A Y dP = E[Y 1_A], ∀A ∈ F. (2)

Y is called the Radon-Nikodým derivative of P* with respect to P, and is also written as

Y = dP*/dP.

Remark 2 From the discussion in Section 1.1, it should be obvious by now that Y is not a proper derivative, but rather something like a likelihood ratio.

Example 11 Consider Example 6. There we defined two metrics on the interval [a, b], 0 ≤ a ≤ b ≤ 1:

P(the number chosen is in [a, b]) = P[a, b] := b − a;

P*(the number chosen is in [a, b]) = P*[a, b] := b^2 − a^2.

We could be more specific and say that

P[a, b] = ∫_a^b dω = ∫_{[a,b]} dP(ω);

P*[a, b] = ∫_a^b 2ω dω = ∫_{[a,b]} 2ω dP(ω).

The last equation is (2) with Y(ω) = 2ω.
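The change-of-measure rule E*[X] = E[XY] can be seen in action on this example. A sketch, not from the notes, for the arbitrary test function X(ω) = ω: sample under P* directly (its CDF on [0, 1] is ω^2, so a draw is √U), and separately reweight uniform P-samples by Y(ω) = 2ω:

```python
import math
import random

random.seed(5)
n = 200_000

# dP*/dP = Y(w) = 2w on [0, 1].  Compute E*[X] for X(w) = w two ways:
# (i) sample under P* via inverse transform (CDF w^2 gives w = sqrt(U));
# (ii) reweight samples drawn under P (uniform) by Y, i.e. E*[X] = E[XY].
under_p_star = sum(math.sqrt(random.random()) for _ in range(n)) / n
reweighted = sum(w * 2 * w for w in (random.random() for _ in range(n))) / n
print(round(under_p_star, 2), round(reweighted, 2))   # both near 2/3
```

Both estimates approach ∫_0^1 ω · 2ω dω = 2/3, which is the mechanism behind importance sampling and behind the pricing-measure changes used later in the module.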

Exercise 8 Consider the usual probability space (Ω, F, P) and a standard normal random variable X, i.e. X ∼ N(0, 1). Define a new random variable Y as Y = X + θ, and let P̂(A) be another probability measure on Ω, defined by

dP̂/dP = Z,

where

Z = e^{−θX − θ^2/2}.

Show that Y ∼ N(0, 1) on (Ω, F, P̂).

Note that for any random variable X,

E*[X] = ∫ X dP* = ∫ XY dP = E[XY].


Theorem 12 (Bayes formula) Let P and P* be two equivalent probability measures on the same measurable space (Ω, F) and let

Y = dP*/dP

be the Radon-Nikodým derivative of P* with respect to P. Furthermore, let X be a random variable on (Ω, F, P*) such that E*|X| < ∞, and let G ⊂ F be a sub-σ-algebra of F. Then the following generalised version of the Bayes formula holds:

E*[X|G] = E[XY|G] / E[Y|G].

Proof. Let Z = E*[X|G]. By definition: Z ∈ G, E*|Z| < ∞ and E*(Z 1_A) = E*(X 1_A) ∀A ∈ G. Hence

∫_A Z dP* = ∫_A X dP* ⇔ ∫_A ZY dP = ∫_A XY dP ⇔ E(ZY 1_A) = E(XY 1_A).

Now

E(XY 1_A) = E[E(XY|G) 1_A];
E(ZY 1_A) = E[E(ZY|G) 1_A], ∀A ∈ G.

Then

E[(E(ZY|G) − E(XY|G)) 1_A] = 0 ∀A ∈ G,

which implies that E(ZY|G) = E(XY|G). Since Z ∈ G, E(ZY|G) = Z E(Y|G), and the result follows.

We will use this rule to link expectations calculated in a particular “universe” to the

ones calculated in another universe.
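The Bayes formula can be verified mechanically on a small finite probability space. Everything below is a toy choice: a uniform P on six points, weights Y proportional to ω, and G generated by the partition {1, 2, 3} | {4, 5, 6}.

```python
from fractions import Fraction as F

omega = [1, 2, 3, 4, 5, 6]
P = {w: F(1, 6) for w in omega}
Y = {w: F(6 * w, 21) for w in omega}  # normalised so that sum_w Y(w) P(w) = 1
X = {1: F(2), 2: F(0), 3: F(5), 4: F(1), 5: F(4), 6: F(3)}
partition = [{1, 2, 3}, {4, 5, 6}]

for A in partition:
    # Left-hand side: E*[X | G] on the atom A, computed directly under P*(w) = Y(w) P(w)
    Pstar_A = sum(Y[w] * P[w] for w in A)
    lhs = sum(X[w] * Y[w] * P[w] for w in A) / Pstar_A
    # Right-hand side: E[XY | G] / E[Y | G] on the same atom, computed under P
    P_A = sum(P[w] for w in A)
    rhs = (sum(X[w] * Y[w] * P[w] for w in A) / P_A) / (sum(Y[w] * P[w] for w in A) / P_A)
    assert lhs == rhs
    print(A, lhs)
```

On a finite space the identity reduces to cancelling P(A) from numerator and denominator, which is precisely the algebra of the proof above.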

1.5 Some more exercises

1. a) Formally deﬁne the components of any probability space Θ = (Ω, F,P) .

b) Let Ω = {1, 2, 3, 4, 5} and let U be the collection

U = {{1, 2, 3} , {3, 4, 5}} .

Find the smallest σ-algebra F (U) generated by U.

c) Define X : Ω → R by

X(1) = X(2) = 0; X(3) = 10; X(4) = X(5) = 1.

Define the condition of F-measurability for X. Check whether X is measurable with respect to F(U).


d) Deﬁne Y : Ω →R by

Y (1) = 0; Y (2) = Y (3) = Y (4) = Y (5) = 1.

Find the σ-algebra F (Y ) generated by Y and show that Y is F (Y )-measurable.

2. Let X be a non-negative random variable defined on a probability space (Ω, F, P) with exponential distribution, which is

P(X ≤ x) = F_X(x) = 1 − e^{−λx}, x ≥ 0,

where λ is a positive constant. Let λ̃ be another positive constant, and define

Z = (λ̃/λ) e^{−(λ̃−λ)X}.

Define P̃ by

P̃(A) = ∫_A Z dP for all A ∈ F.

(a) Show that P̃(Ω) = 1.

(b) Compute the cumulative distribution function

P̃(X ≤ x) for x ≥ 0

for the random variable X under the probability measure P̃.
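The expected answers to this exercise can be checked numerically (the parameter values λ = 1.0 and λ̃ = 2.5, and the evaluation point x₀, are arbitrary choices): E[Z] should be 1, and under P̃ the variable X should be exponential with rate λ̃.

```python
import math
import random

random.seed(2)
lam, lam_t = 1.0, 2.5
n = 400_000
xs = [random.expovariate(lam) for _ in range(n)]  # X ~ Exp(lam) under P
zs = [(lam_t / lam) * math.exp(-(lam_t - lam) * x) for x in xs]

total = sum(zs) / n                                        # P~(Omega), should be ~1
x0 = 0.5
cdf_tilde = sum(z for x, z in zip(xs, zs) if x <= x0) / n  # P~(X <= x0)
exact = 1 - math.exp(-lam_t * x0)                          # Exp(lam_t) cdf at x0

print(round(total, 2), round(cdf_tilde, 3), round(exact, 3))
```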

A Set theory: quick reminder

For further reference, see Grimmett and Stirzaker [1], and Schaum (Chapter 2).

A.1 Sets, elements and subsets

• a ∈ A stands for "a is an element of the set A";

• if a ∈ A implies (⇒, in short) a ∈ B, then A is a subset of B, or A ⊆ B, which is read "A is contained in B";

• A = B ⇐⇒ (read: "if and only if") A ⊆ B and B ⊆ A;

• Negations: a ∉ A; A ⊈ B; A ≠ B

• If A ⊆ B and A ≠ B, then A ⊂ B (proper subset)

• An example: let A = {1, 3, 5, 7, 9}; B = {1, 2, 3, 4, 5}; C = {3, 5}


• C ⊂ A

• C ⊂ B

• A ⊈ B

• B ⊈ A

• Sets can be speciﬁed in

– tabular form (roster method): A = {1, 3, 5, 7, 9}

– set-builder form (property method): B = {x : x is an even integer, x > 0}

• Special sets:

– Universal set U

– Empty set ∅: S = {x : x is a positive integer, x² = 3} = ∅

A.2 Union and intersection

• Union of A and B: set of all elements which belong either to A, to B, or to both:

A ∪ B := {x : x ∈ A or x ∈ B}

• Intersection of A and B: set of all elements which belong to both A and B:

A ∩ B := {x : x ∈ A and x ∈ B}

• If A ∩ B = ∅, then A and B are disjoint.

• If A ⊆ B, then

A ∪ B = B,  A ∩ B = A.

A.2.1 Properties

• A ∪ ∅ = A; A ∩ ∅ = ∅

• If A ⊆ U, A ∪ U = U and A ∩ U = A

• Commutative Law

A ∪ B = B ∪ A
A ∩ B = B ∩ A


• Associative Law

(A ∪ B) ∪ C = A ∪ (B ∪ C)
(A ∩ B) ∩ C = A ∩ (B ∩ C)

• Distributive Law

A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

• Idempotent Law

A ∪ A = A
A ∩ A = A

A.3 Complements and difference

• The (absolute) complement of A is defined as

A^c = {x : x ∈ U, x ∉ A},

i.e. the set of elements which do not belong to A;

• The relative complement of B with respect to A (or difference of A and B) is defined as

A\B = {x : x ∈ A, x ∉ B}

! Note that

A\B = A ∩ B^c
A\(B ∪ C) = (A\B) ∩ (A\C)
A\(B ∩ C) = (A\B) ∪ (A\C)

Example 12 Let

U = {1, 2, 3, 4, 5, ...}
A = {1, 2, 3}
B = {3, 4, 5, 6, 7}

then

A^c = {4, 5, 6, ...}
A\B = {1, 2}

Note:

• A ∪ A^c = U

• A ∩ A^c = ∅


A.3.1 Properties

• (A^c)^c = A

• if A ⊂ B, then B^c ⊂ A^c

• De Morgan Laws

– (A ∪ B)^c = A^c ∩ B^c

– (A ∩ B)^c = A^c ∪ B^c
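These identities can be checked mechanically with Python's built-in set type; the universe U below is an arbitrary finite choice.

```python
U = set(range(1, 11))
A = {1, 3, 5, 7, 9}
B = {1, 2, 3, 4, 5}

def comp(S):
    """Absolute complement within the universe U."""
    return U - S

assert comp(A | B) == comp(A) & comp(B)  # (A ∪ B)^c = A^c ∩ B^c
assert comp(A & B) == comp(A) | comp(B)  # (A ∩ B)^c = A^c ∪ B^c
assert A - B == A & comp(B)              # A \ B = A ∩ B^c
print("De Morgan laws verified")
```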

A.4 Further definitions

• A × B := {(x, y) : x ∈ A, y ∈ B} is the Cartesian product of A and B

• A is finite if it is empty or if it consists of exactly n elements, where n is a positive integer;

• Otherwise A is infinite;

• A is countable if it is finite or if its elements can be listed in the form of a sequence (countably infinite);

• Otherwise A is uncountable.

Example 13

• A = {letters of the English alphabet}

• D = {days of the week}

• R = {x : x is a river on Earth}

• Y = {x : x is a positive integer, x is even} = {2, 4, 6, 8, ...}

• I = {x : 0 ≤ x ≤ 1}

B Modes of convergence of a random variable

Let {X_m}_{m∈N} be a sequence of random variables, and let X be another random variable. Then:

• ALMOST SURE CONVERGENCE: X_m →^{a.s.} X if the event {ω ∈ Ω : X_m(ω) → X(ω) as m → ∞} has probability 1.

• CONVERGENCE IN PROBABILITY: X_m →^P X if, ∀ε > 0,

lim_{m→∞} P(|X_m − X| > ε) = 0.


• CONVERGENCE IN L^p (in p-th mean): X_m →^{L^p} X if

lim_{m→∞} E(|X_m − X|^p) = 0.

• CONVERGENCE IN DISTRIBUTION: X_m →^D X if

lim_{m→∞} P(X_m ≤ x) = P(X ≤ x)

for every x ∈ R at which the distribution function of X is continuous.
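Convergence in probability can be illustrated by the weak law of large numbers: the sample mean of m Uniform(0, 1) draws converges in probability to 1/2, so the frequency of deviations larger than ε should shrink as m grows. The tolerance ε and repetition counts below are arbitrary choices.

```python
import random

random.seed(3)
eps, reps = 0.05, 2000

def deviation_freq(m):
    """Estimate P(|X_m - 1/2| > eps), where X_m is the mean of m uniforms."""
    count = 0
    for _ in range(reps):
        mean = sum(random.random() for _ in range(m)) / m
        if abs(mean - 0.5) > eps:
            count += 1
    return count / reps

freqs = [deviation_freq(m) for m in (10, 100, 1000)]
print(freqs)  # decreasing towards 0
```

Note that the deviation probability goes to 0 for each fixed ε, while any individual trajectory may still occasionally stray; that is the gap between convergence in probability and almost sure convergence.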

B.1 Further convergences

• MONOTONE CONVERGENCE: if 0 ≤ X_m ↑ X a.s., then E(X_m) ↑ E(X) (which may be infinite), or equivalently,

lim_{m→∞} E(X_m) = E(lim_{m→∞} X_m) = E(X), as X = lim_{m→∞} X_m.

• DOMINATED CONVERGENCE: for X_m → X a.s., if |X_m| ≤ Y with E(Y) < ∞, then

E(|X_m − X|) → 0.

In particular E(X_m) → E(X), i.e. lim_{m→∞} E(X_m) = E(X).

• BOUNDED CONVERGENCE THEOREM: for X_m → X a.s., if |X_m| ≤ K for a constant K, then

E(|X_m − X|) → 0;

this is implied by dominated convergence.
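A numerical sketch of dominated convergence under a hypothetical setup: take X ∼ N(0, 1) and X_m = X + sin(mX)/m, so that X_m → X pointwise and |X_m| ≤ |X| + 1, which is integrable; the theorem then says E|X_m − X| → 0.

```python
import math
import random

random.seed(4)
xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]

def mean_abs_gap(m):
    """Monte Carlo estimate of E|X_m - X| = E|sin(m X)| / m."""
    return sum(abs(math.sin(m * x) / m) for x in xs) / len(xs)

gaps = [mean_abs_gap(m) for m in (1, 10, 100)]
print(gaps)  # decreasing towards 0
```

Here the domination is trivial (|X_m − X| ≤ 1/m), but the same check works for any dominated sequence.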
