
Stochastic Processes in Continuous Time

Joseph C. Watkins
December 14, 2007

Contents

1 Basic Concepts
1.1 Notions of equivalence of stochastic processes
1.2 Sample path properties
1.3 Properties of filtrations
1.4 Stopping times
1.5 Examples of stopping times

2 Lévy Processes

3 Martingales
3.1 Regularity of Sample Paths
3.2 Sample Path Regularity and Lévy Processes
3.3 Maximal Inequalities
3.4 Localization
3.5 Law of the Iterated Logarithm

4 Markov Processes
4.1 Definitions and Transition Functions
4.2 Operator Semigroups
4.2.1 The Generator
4.2.2 The Resolvent
4.3 The Hille-Yosida Theorem
4.3.1 Dissipative Operators
4.3.2 Yosida Approximation
4.3.3 Positive Operators and Feller Semigroups
4.3.4 The Maximum Principle
4.4 Strong Markov Property
4.5 Connections to Martingales
4.6 Jump Processes
4.6.1 The Structure Theorem for Pure Jump Markov Processes
4.6.2 Construction in the Case of Bounded Jump Rates
4.6.3 Birth and Death Process
4.6.4 Examples of Interacting Particle Systems
4.7 Sample Path Regularity
4.7.1 Compactifying the State Space
4.8 Transformations of Markov Processes
4.8.1 Random Time Changes
4.8.2 Killing
4.8.3 Change of Measure
4.9 Stationary Distributions
4.10 One Dimensional Diffusions
4.10.1 The Scale Function
4.10.2 The Speed Measure
4.10.3 The Characteristic Operator
4.10.4 Boundary Behavior

5 Stochastic Integrals
5.1 Quadratic Variation
5.2 Definition of the Stochastic Integral
5.3 The Itô Formula
5.4 Stochastic Differential Equations
5.5 Itô Diffusions

1 Basic Concepts
We now consider stochastic processes with index set Λ = [0, ∞). Thus, the process

X : [0, ∞) × Ω → S

can be considered as a random function of time via its sample paths or realizations

t → Xt (ω), for each ω ∈ Ω.

Here S is a metric space with metric d.

1.1 Notions of equivalence of stochastic processes


As before, for m ≥ 1, 0 ≤ t1 < · · · < tm , B ∈ B(S m ), the Borel σ-algebra, call

µt1 ,...,tm (B) = P {(Xt1 , . . . , Xtm ) ∈ B}

the finite dimensional distributions of X.


We also have a variety of notions of equivalence for two stochastic processes, X and Y .
Definition 1.1. 1. Y is a version of X if X and Y have the same finite dimensional distributions.
2. Y is a modification of X if for all t ≥ 0, (Xt , Yt ) is an S × S-random variable and

P {Yt = Xt } = 1.

3. Y is indistinguishable from X if

P {ω; Yt (ω) ≠ Xt (ω) for some t ≥ 0} = 0.

Recall that the Daniell-Kolmogorov extension theorem guarantees that the finite dimensional distribu-
tions uniquely determine a probability measure on S [0,∞) .

Exercise 1.2. If A ∈ σ{Xt ; t ≥ 0}, then there exists a countable set C ⊂ [0, ∞) so that A ∈ σ{Xt ; t ∈ C}.
Thus, many sets of interest, e.g., {x; sup_{0≤s≤T} f(xs) > 0} or {x; ∫_0^T f(xs) ds < ∞}, are not measurable
subsets of S^[0,∞). Consequently, we will need methods for finding versions of a stochastic process under which these
sets are measurable.
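The difference between these notions can be seen in a standard example: let Xt ≡ 0 and let Yt equal 1 exactly at a single uniformly distributed random time U. For each fixed t, P {Yt = Xt} = 1, so Y is a modification of X; yet every sample path of Y differs from that of X at the one time U, so X and Y are not indistinguishable. A minimal Python sketch (the names U, mismatches_at_fixed_time, and paths_differing_somewhere are ours):

```python
import random

random.seed(1)

# Hypothetical illustration: X_t = 0 for all t, while Y_t = 1 only at the
# single random time U ~ Uniform(0, 1).  Y is a modification of X (for each
# fixed t, P{Y_t != X_t} = P{U = t} = 0) but is not indistinguishable from X
# (every sample path of Y differs from that of X at the one time U).

def mismatches_at_fixed_time(t, n_paths=10_000):
    """Count paths with Y_t != X_t at the FIXED time t."""
    count = 0
    for _ in range(n_paths):
        U = random.random()            # the exceptional time of this path
        Y_t = 1.0 if t == U else 0.0   # almost surely 0 for fixed t
        X_t = 0.0
        if Y_t != X_t:
            count += 1
    return count

def paths_differing_somewhere(n_paths=10_000):
    """Count paths with Y_t != X_t for SOME t -- always all of them,
    since Y_U = 1 != 0 = X_U on every path."""
    return n_paths

print(mismatches_at_fixed_time(0.5))   # modification: 0 mismatches
print(paths_differing_somewhere())     # not indistinguishable: every path
```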

Exercise 1.3. If X and Y are indistinguishable, then X is a modification of Y . If X is a modification of


Y , then X and Y have the same finite dimensional distributions.

1.2 Sample path properties


Definition 1.4. We call X
1. measurable if X is B[0, ∞) × F-measurable,
2. (almost surely) continuous, (left continuous, right continuous) if for (almost) all ω ∈ Ω the sample path
is continuous, (left continuous, right continuous).
Focusing on the regularity of sample paths, we have

Lemma 1.5. Let x : [0, ∞) → S and suppose

xt+ = lim_{s→t+} xs exists for all t ≥ 0,

and

xt− = lim_{s→t−} xs exists for all t > 0.

Then there exists a countable set C such that for all t ∈ [0, ∞)\C,

xt− = xt = xt+.
Proof. Set

Cn = {t ∈ [0, ∞); max{d(xt−, xt), d(xt−, xt+), d(xt, xt+)} > 1/n}.

If Cn ∩ [0, m] is infinite, then by the Bolzano-Weierstrass theorem, it must have a limit point t ∈ [0, m]. In
this case, either xt− or xt+ would fail to exist. Now, write

C = ∪_{m=1}^∞ ∪_{n=1}^∞ (Cn ∩ [0, m]),

the countable union of finite sets, and, hence, C is countable.


Lemma 1.6. Let D be a dense subset of [0, ∞) and let x : D → S. If for each t ≥ 0,

x⁺t = lim_{s→t+, s∈D} xs

exists, then x⁺ is right continuous. If for each t > 0,

x⁻t = lim_{s→t−, s∈D} xs

exists, then x⁻ is left continuous.


Proof. Fix t0 > 0. Given ε > 0, there exists δ > 0 so that

d(x⁺t0, xs) ≤ ε whenever s ∈ D ∩ (t0, t0 + δ).

Consequently, for all s ∈ (t0, t0 + δ),

d(x⁺t0, x⁺s) = lim_{u→s+, u∈D} d(x⁺t0, xu) ≤ ε,

and x⁺ is right continuous.

The left continuity is proved similarly.

Exercise 1.7. If both limits exist in the previous lemma for all t, then x⁻t = x⁺t− for all t > 0.

Definition 1.8. Let CS [0, ∞) denote the space of continuous S-valued functions on [0, ∞). Let DS [0, ∞)
denote the space of right continuous S-valued functions having left limits on [0, ∞).
Even though we will not need the metric and topological issues associated with the spaces CS [0, ∞)
and DS [0, ∞), we provide a brief overview of them.
If we endow the space CS [0, ∞) with the supremum metric ρ∞(x, y) = sup_{s≥0} d(xs, ys), then the
metric will not in general be separable. In analogy with the use of seminorms in a Fréchet space to define a metric,
we set, for each t > 0,

ρt(x, y) = sup_s d(x_{min{s,t}}, y_{min{s,t}}).

Then ρt satisfies the symmetry and triangle inequality properties of a metric. However, if x and y
agree on [0, t] but xs ≠ ys for some s > t, then x ≠ y but ρt(x, y) = 0. On the other hand, if ρt(x, y) = 0 for all t,
then x = y. Using the bounded metric d̄(x0, y0) = min{d(x0, y0), 1} and setting ρ̄t(x, y) = sup_{0≤s≤t} d̄(xs, ys),
we can create a metric on CS [0, ∞) which is separable by giving increasingly small importance to large
values of t. For example, we can use

ρ̄(x, y) = ∫_0^∞ e^{−t} ρ̄t(x, y) dt.

Then, (CS [0, ∞), ρ̄) is separable whenever (S, d) is separable and complete whenever (S, d) is complete.
For the space DS [0, ∞), unless the jumps of x and y match up exactly, the distance from x to y might be
large even when the paths are close in an intuitive sense. To match up the jumps, we introduce a continuous, strictly
increasing function γ : [0, ∞) → [0, ∞) that is one to one and onto and define

ρ^γ_t(x, y) = sup_s d(x_{min{s,t}}, y_{min{γ(s),t}})

and

ρ(x, y) = inf_γ max{ ess sup_t |log γ′(t)|, ∫_0^∞ e^{−t} ρ^γ_t(x, y) dt }.

As before, (DS [0, ∞), ρ) is separable whenever (S, d) is separable and complete whenever (S, d) is complete.
Exercise 1.9. CR [0, ∞) with the ρ∞ metric is not separable. Hint: Find an uncountable collection of
real-valued continuous functions so that the distance between any two of them is 1.
For a stochastic process X with sample paths in DS [0, ∞), we have the following moment condition
that guarantees that X has a CS [0, ∞) version.

Proposition 1.10. Let (S, d) be a separable metric space and set d1(x, y) = min{d(x, y), 1}. Let X be a
process with sample paths in DS [0, ∞). Suppose for each T > 0, there exist α > 1, β > 0, and C > 0 such
that
E[d1(Xt, Xs)^β] ≤ C(t − s)^α, 0 ≤ s ≤ t ≤ T. (1.1)
Then almost all sample paths of X belong to CS [0, ∞).

Proof. Let T be a positive integer.

Claim.

Σ_{0<t≤T} d1(Xt, Xt−)^β ≤ liminf_{N→∞} Σ_{k=1}^{2^N T} d1(X_{k2^{−N}}, X_{(k−1)2^{−N}})^β.

First note that by Lemma 1.5, the left side is the sum of a countable number of terms. In addition, note
that for each n ≥ 1, {t ∈ [0, T]; d1(Xt, Xt−) > 1/n} is a finite set. Thus, for N sufficiently large, these points
are isolated by the 2^{−N} partition of [0, T]. Thus, in the limit, these jumps are captured. Consequently,

Σ_{0<t≤T} d1(Xt, Xt−)^β I{d1(Xt, Xt−) > 1/n} ≤ liminf_{N→∞} Σ_{k=1}^{2^N T} d1(X_{k2^{−N}}, X_{(k−1)2^{−N}})^β. (1.2)

Note that the left side of expression (1.2) increases as n increases. Thus, let n → ∞ to establish the
claim.
By Fatou’s lemma and the moment inequality (1.1),

E[Σ_{0<t≤T} d1(Xt, Xt−)^β] ≤ liminf_{n→∞} Σ_{k=1}^{2^n T} E[d1(X_{k2^{−n}}, X_{(k−1)2^{−n}})^β]
≤ liminf_{n→∞} Σ_{k=1}^{2^n T} C 2^{−nα}
= liminf_{n→∞} CT 2^{n(1−α)} = 0.

Thus, for each T, with probability 1, X has no discontinuities on the interval [0, T], and consequently
almost all sample paths of X belong to CS [0, ∞).
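The hypothesis of Proposition 1.10 can be checked concretely for Brownian motion (introduced in Section 2): Bt − Bs is normal with variance t − s, so E[|Bt − Bs|⁴] = 3(t − s)², and (1.1) holds with β = 4, α = 2 > 1, C = 3. A Monte Carlo sketch in Python (names are ours; a normal draw stands in for the increment):

```python
import math
import random

random.seed(7)

# For Brownian motion, B_t - B_s ~ N(0, t - s), so
#   E[|B_t - B_s|^4] = 3 (t - s)^2,
# i.e. criterion (1.1) holds with beta = 4, alpha = 2 > 1, C = 3.
# (Using d_1 = min{d, 1} changes nothing for small t - s.)

def fourth_moment_estimate(dt, n=100_000):
    total = 0.0
    for _ in range(n):
        increment = random.gauss(0.0, math.sqrt(dt))
        total += min(abs(increment), 1.0) ** 4
    return total / n

dt = 0.01
est = fourth_moment_estimate(dt)
print(est, 3 * dt * dt)   # both near 3e-4
```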

1.3 Properties of filtrations


Definition 1.11. A collection of sub-σ-algebras {Ft ; t ≥ 0} of the σ-algebra F is called a filtration if s ≤ t
implies that Fs ⊂ Ft .
For a given stochastic process X, write FtX for the σ-algebra σ{Xs; 0 ≤ s ≤ t}. The collection {FtX; t ≥ 0} is called
the natural filtration associated to the process X.
The filtration {Ft; t ≥ 0} in continuous time may have some additional structure.

Definition 1.12. 1. {Ft; t ≥ 0} is right continuous if for each t ≥ 0,

Ft = Ft+ = ∩_{ε>0} F_{t+ε}.

2. {Ft ; t ≥ 0} is complete if (Ω, F, P ) is complete and F0 contains all events having probability zero.
Combining the process X and the filtration {Ft ; t ≥ 0}, we have, as before:

Definition 1.13. X is adapted if Xt is Ft -measurable. In addition, a process X is {Ft }-progressive if, for
every t ≥ 0,
X|[0,t]×Ω is B[0, t] × Ft -measurable.
Exercise 1.14. 1. The filtration {Ft+; t ≥ 0} is right continuous.

2. If X is {Ft}-progressive, then X is {Ft}-adapted and measurable. Hint: Approximate X on [0, t] × Ω
by Xⁿs(ω) = X_{min{t, ([ns]+1)/n}}(ω).
3. If X is {Ft}-adapted and right continuous, then X is {Ft}-progressive.
4. Let X be {Ft}-progressive and let f : S → R be bounded and measurable. Show that f ◦ X is {Ft}-
progressive and At = ∫_0^t f(Xs) ds is {Ft}-adapted.
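The adapted process At in part 4 depends only on the path up to time t, and it can be approximated from sampled path values by a Riemann sum; a sketch (names are ours, with a deterministic test path standing in for X):

```python
def integrate_path(f, path, dt):
    """Left-endpoint Riemann sum for A_t = integral of f(X_s) ds over [0, t],
    given path = [X_0, X_dt, X_2dt, ...].  It uses only values up to time t,
    mirroring the fact that A is adapted."""
    return sum(f(x) * dt for x in path[:-1])

# Sanity check on the deterministic path X_s = s with f(x) = x:
# A_t = integral of s ds from 0 to t = t^2 / 2.
dt = 0.001
t = 1.0
n = int(t / dt)
path = [k * dt for k in range(n + 1)]
A_t = integrate_path(lambda x: x, path, dt)
print(A_t)   # close to 0.5
```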

1.4 Stopping times


A stopping time, τ , and the corresponding stopped σ-algebra Fτ are defined for continuous time processes
in a manner analogous to the discrete time case.
Definition 1.15. A non-negative (possibly infinite) random variable τ is an {Ft}-stopping time if for every
t ≥ 0,
{τ ≤ t} ∈ Ft.
The stopped σ-algebra
Fτ = {A ∈ F; A ∩ {τ ≤ t} ∈ Ft , for all t ≥ 0}.
Many of the same properties hold with the same proof as for discrete time processes.
Proposition 1.16. 1. The supremum of a countable number of stopping times is a stopping time.
2. The minimum of a finite number of stopping times is a stopping time.
3. Fτ is a σ-algebra.
4. For a second stopping time σ, min{τ, σ} is Fτ-measurable.
5. If σ ≤ τ, then Fσ ⊂ Fτ.
The following proposition holds in discrete time. However, the proof in continuous time is more involved.
Proposition 1.17. Let X be an Ft-progressive process and τ a finite stopping time. Then Xτ is Fτ-
measurable.
Proof. Pick Γ ∈ B(S). We must show that {Xτ ∈ Γ} ∈ Fτ, that is, for each t ≥ 0,

{Xτ ∈ Γ} ∩ {τ ≤ t} ∈ Ft.

We shall accomplish this using a composition of maps. Note that, by the previous proposition, min{τ, t} is
Ft-measurable. Thus
ω ↦ (min{τ(ω), t}, ω)
is a measurable mapping from (Ω, Ft) to ([0, t] × Ω, B([0, t]) × Ft).

Next, because X is an Ft -progressive process,

(s, ω) ↦ Xs(ω)

is a measurable mapping from ([0, t] × Ω, B([0, t]) × Ft ) to (S, B(S)).


Thus, the composition
ω ↦ X_{min{τ(ω),t}}(ω)
is a measurable mapping from (Ω, Ft ) to (S, B(S)) and {Xmin{τ,t} ∈ Γ} ∈ Ft .
Finally, {Xτ ∈ Γ} ∩ {τ ≤ t} = {X_{min{τ,t}} ∈ Γ} ∩ {τ ≤ t}. Because both of these events are elements of
the σ-algebra Ft, so is their intersection.
Some additional properties are connected to the continuity of time.

Lemma 1.18. A [0, ∞]-valued random variable τ is an Ft+-stopping time if and only if {τ < t} ∈ Ft for
every t ≥ 0.
Proof. (necessity)

{τ < t} = ∪_{n=1}^∞ {τ ≤ t − 1/n} ∈ Ft ⊂ Ft+.

(sufficiency) For n ≥ m,

{τ < t + 1/n} ∈ F_{t+1/m}.

Therefore,

{τ ≤ t} = ∩_{n=1}^∞ {τ < t + 1/n} ∈ ∩_{m=1}^∞ F_{t+1/m} ⊂ Ft+.

Corollary 1.19. For an Ft+ -stopping time τ ,

{τ = t} ∈ Ft .

Proof. {τ = t} = {τ ≤ t}\{τ < t} ∈ Ft .


Proposition 1.20. Let {τn; n ≥ 1} be Ft-stopping times. If {Ft; t ≥ 0} is right continuous, then

inf_n τn, liminf_{n→∞} τn, and limsup_{n→∞} τn

are Ft-stopping times.


Proof. Use the lemma above and note that

{inf_n τn < t} = ∪_{n=1}^∞ {τn < t}.

Now use

liminf_{n→∞} τn = sup_m inf_{n≥m} τn and limsup_{n→∞} τn = inf_m sup_{n≥m} τn.

Proposition 1.21. Every Ft -stopping time is the nonincreasing limit of bounded stopping times.
Proof. Take the stopping times τn = min{τ, n} for n = 1, 2, . . . .
Definition 1.22. Call a stopping time discrete if its range is almost surely countable.
Proposition 1.23. Every Ft -stopping time is the nonincreasing limit of discrete stopping times.
Proof. For n = 1, 2, . . ., choose a nested collection of sequences 0 = tⁿ0 < tⁿ1 < · · · with lim_{k→∞} tⁿk = ∞
and lim_{n→∞} sup_k (tⁿ_{k+1} − tⁿk) = 0. For an Ft+-stopping time τ, define

τn = tⁿk if tⁿ_{k−1} ≤ τ < tⁿk, and τn = ∞ if τ = ∞.

Then clearly τn is discrete and τ = lim_{n→∞} τn. To see that τn is a stopping time, set γn(t) = max{tⁿk; tⁿk ≤ t}.
Then
{τn ≤ t} = {τn ≤ γn(t)} = {τ < γn(t)} ∈ F_{γn(t)} ⊂ Ft.
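With the dyadic choice tⁿk = k2⁻ⁿ, the construction in the proof amounts to rounding τ up to the next grid point, producing discrete stopping times that decrease to τ; a sketch (function name is ours):

```python
import math

def dyadic_approximation(tau, n):
    """Round tau UP to the next dyadic point k / 2^n, matching the proof's
    tau_n = t_k for t_{k-1} <= tau < t_k on the grid t_k = k / 2^n."""
    if math.isinf(tau):
        return math.inf
    return math.floor(tau * 2**n + 1) / 2**n

tau = 0.3
approximations = [dyadic_approximation(tau, n) for n in range(1, 11)]
print(approximations)
# nonincreasing in n, each value lying in (tau, tau + 2^-n]
```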

Exercise 1.24. Let {τn ; n ≥ 1} be a sequence of Ft -stopping times and let τ = limn→∞ τn .
1. If {τn ; n ≥ 1} is nondecreasing, then τ is an Ft -stopping time.
2. If {τn ; n ≥ 1} is nonincreasing, then τ is an Ft+ -stopping time.
Definition 1.25. For a stochastic process X and a finite stopping time τ, define the stopped process

Xτt = X_{min{τ,t}}

and the shifted process

θτ Xt = X_{τ+t}.
Proposition 1.26. Let X be an Ft -progressive process and let τ be a finite stopping time. Define the
collection of σ-algebras
Gt = Fmin{τ,t} and Ht = Fτ +t .
Then
1. {Gt ; t ≥ 0} is a filtration and X τ is both Gt -progressive and Ft -progressive.
2. {Ht ; t ≥ 0} is a filtration and θτ X is Ht -progressive.
Proof. For the first statement, because {min{τ, t}; t ≥ 0} is an increasing set of stopping times, {Gt ; t ≥ 0}
is a filtration. Because Gt ⊂ Ft , then if X τ is Gt -progressive, then it is automatically Ft -progressive. With
this in mind, choose Γ ∈ B(S) and t ≥ 0. We show that

{(s, ω) ∈ [0, t] × Ω; Xmin{τ (ω),s} (ω) ∈ Γ} ∈ B[0, t] × Gt .

Now, define
Cs,t = {C ∩ ([0, t] × {min{τ, t} ≥ s}); C ∈ B[0, t] × Fs }.

Claim: Cs,t ⊂ B([0, t]) × Gt.

Clearly, Cs,t is a σ-algebra. Thus, it suffices to choose C of the form B × A where B ∈ B([0, t]) and
A ∈ Fs, noting that the collection of such B × A generates B[0, t] × Fs.
Note that A ∩ {min{τ, t} ≥ s} ∈ Gt because A ∈ Fs. Thus, for B ∈ B[0, t],

(B × A) ∩ ([0, t] × {min{τ, t} ≥ s}) = B × (A ∩ {min{τ, t} ≥ s}) ∈ B[0, t] × Gt,

and B × A ∈ Cs,t.
Now, write
{(s, ω) ∈ [0, t] × Ω; Xmin{τ (ω),s} (ω) ∈ Γ}
= {(s, ω) ∈ [0, t] × Ω; Xmin{τ (ω),s} (ω) ∈ Γ, min{τ (ω), t} ≤ s ≤ t} ∪ {(s, ω); Xs (ω) ∈ Γ, s < min{τ, t}}.
To see that the first term is in B[0, t] × Gt , write it as

{(s, ω); min{τ (ω), t} ≤ s ≤ t} ∩ ([0, t] × {Xmin{τ,s} (ω) ∈ Γ})

and note that

{(s, ω); min{τ(ω), t} ≤ s ≤ t} = ∩_{n=1}^∞ ∪_{k=1}^∞ ([k/n, t] × {k/n ≤ min{τ, t} < (k+1)/n}).

Similarly, the second term equals

∪_{n=1}^∞ ∪_{k=1}^∞ ({(s, ω); Xs(ω) ∈ Γ, s < k/n} ∩ {(s, ω); k/n ≤ min{τ(ω), t}}),

and use the claim.


For the second statement, first note that by the same argument as before, {Ht ; t ≥ 0} is a filtration. By
the first part, the mapping
(s, ω) → Xmin{τ (ω)+t,s}
from ([0, ∞] × Ω, B[0, ∞] × Fτ +t ) to (S, B(S)) is measurable. We also have the measurable mapping

(u, ω) → (τ (ω) + u, ω)

from ([0, t] × Ω, B[0, t] × Fτ +t ) to ([0, ∞] × Ω, B[0, ∞] × Fτ +t ). The mapping

(u, ω) → Xτ +u (ω)

is the composition of these two mappings, so it is measurable. Consequently, θτ X is Ht -progressive.

1.5 Examples of stopping times


Definition 1.27. Let x : [0, ∞) → S and let s ≥ 0.
1. The first entrance time into A after time s is defined by

τe (A, s) = inf{t ≥ s; xt ∈ A}.



2. The first contact time into A after time s is defined by

τc (A, s) = inf{t ≥ s; cl{xu ; s ≤ u ≤ t} ∩ A ≠ ∅}.

Here cl(B) is the closure of the set B.

The first exit time from B after time s is the first entrance time into B^c. If s = 0, then it is suppressed in the
notation.
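A first entrance time can be approximated on a simulated path by scanning a time grid; a sketch for a Brownian path entering the open set A = (1, ∞) (names and grid parameters are ours; the grid value only approximates the true entrance time from above):

```python
import math
import random

random.seed(3)

def first_entrance_time(path, dt, in_A):
    """First grid time t = k*dt with X_t in A; None if A is never entered
    on this grid."""
    for k, x in enumerate(path):
        if in_A(x):
            return k * dt
    return None

# Simulate a Brownian path on a grid and find its first entrance into
# the open set A = (1, infinity).  (Grid-based sketch: the true entrance
# time is smaller, by at most the mesh plus the overshoot.)
dt = 0.001
steps = 20_000
path = [0.0]
for _ in range(steps):
    path.append(path[-1] + random.gauss(0.0, math.sqrt(dt)))

tau = first_entrance_time(path, dt, lambda x: x > 1.0)
print(tau)
```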
Proposition 1.28. Suppose that X is a right continuous Ft -adapted process and that σ is an Ft -stopping
time.
1. If A is closed and X has left limits at each t > 0 or if A is compact then τc (A, σ) is an Ft -stopping
time.
2. If A is open then τe (A, σ) is an Ft+ -stopping time.
Proof. If A is open, we have, by the right continuity of X,

{τe(A, σ) < t} = ∪_{s∈Q∩[0,t)} ({Xs ∈ A} ∩ {σ < s}) ∈ Ft,

proving 2.
Under the conditions of part 1,

τc(A, σ) = lim_{n→∞} τe(A^{1/n}, σ),

where A^ε = {x; d(x, A) < ε}. Thus

{τc(A, σ) ≤ t} = ({σ ≤ t} ∩ {Xt ∈ A}) ∪ ∩_{n=1}^∞ {τe(A^{1/n}, σ) < t} ∈ Ft,

proving 1.

2 Lévy Processes
We begin our study of continuous time stochastic processes with the continuous time analog of random walks.

Definition 2.1. An Rd-valued process X is called a Lévy process or a process with stationary independent
increments if
1. (stationary increments) The distribution of Xt+s − Xt does not depend on t.
2. (independent increments) Let 0 = t0 < t1 < · · · < tn . Then the random variables

{Xtj − Xtj−1 ; j = 1, . . . , n}

are independent.
3. (stochastic continuity)
Xt − X0 → 0 in probability as t → 0.

Exercise 2.2. 1. Any linear combination of independent Lévy processes is a Lévy process.
2. If X has finite mean, X0 = 0 and µ = EX1 , then EXt = µt.
3. If, in addition, X has finite variance and σ 2 = Var(X1 ), then Var(Xt ) = σ 2 t.
Definition 2.3. A Poisson process N with parameter λ is a Lévy process with N0 = 0 and

P {Nt = n} = ((λt)^n / n!) e^{−λt}.
Exercise 2.4. Show that the definition of a Poisson process satisfies the compatibility condition in the
Daniell-Kolmogorov extension theorem.
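One way to realize N concretely is to build it from independent Exp(λ) interarrival times (this construction is also the content of Exercise 3.8 below) and compare the simulated law of Nt with the stated distribution; a Monte Carlo sketch (names are ours):

```python
import math
import random

random.seed(11)

def poisson_value(lam, t):
    """N_t built from independent Exp(lam) interarrival times: count the
    arrivals whose cumulative clock falls at or before t."""
    n, clock = 0, random.expovariate(lam)
    while clock <= t:
        n += 1
        clock += random.expovariate(lam)
    return n

# Compare the empirical P{N_t = 0} with the formula e^{-lambda t}.
lam, t, n_sim = 1.0, 1.0, 100_000
frac_zero = sum(poisson_value(lam, t) == 0 for _ in range(n_sim)) / n_sim
print(frac_zero, math.exp(-lam * t))   # both near 0.3679
```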
Definition 2.5. Let {Yk; k ≥ 1} be independent and identically distributed Rd-valued random variables and
let N be an independent Poisson process; then

Xt = Σ_{k=1}^{Nt} Yk

is called a compound Poisson process.


Exercise 2.6. Show that a compound Poisson process is a Lévy process.
Exercise 2.7. Let Yk take on the values ±1 each with probability 1/2 and let X be the associated compound
Poisson process. Assume that N has parameter λ. Then

P {Xt = m} = e^{−λt} Im(λt),

where Im is the modified Bessel function of order m. Hint: The modified Bessel functions of integer order
Im(x) have generating function exp(x(z + z^{−1})/2).
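The identity can be verified numerically by conditioning on Nt directly and comparing with the series Im(x) = Σ_{k≥0} (x/2)^{2k+m}/(k!(k+m)!) for m ≥ 0 (recall I₋ₘ = Iₘ); a sketch (names are ours):

```python
import math

def bessel_I(m, x, terms=60):
    """Series for the modified Bessel function I_m, m >= 0."""
    return sum((x / 2) ** (2 * k + m) / (math.factorial(k) * math.factorial(k + m))
               for k in range(terms))

def prob_X_eq(m, lam, t, n_max=80):
    """P{X_t = m} by conditioning on N_t = n: a walk of n steps of +-1 ends
    at m iff n, m have the same parity, with probability C(n, (n+m)/2) / 2^n."""
    total = 0.0
    for n in range(n_max):
        if (n - m) % 2 or n < abs(m):
            continue
        p_walk = math.comb(n, (n + m) // 2) / 2 ** n
        total += math.exp(-lam * t) * (lam * t) ** n / math.factorial(n) * p_walk
    return total

lam, t, m = 2.0, 1.0, 1
lhs = prob_X_eq(m, lam, t)
rhs = math.exp(-lam * t) * bessel_I(m, lam * t)
print(lhs, rhs)   # agree to many digits
```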

If Yk is not integer valued, then we use the characteristic function φs for Xs:

φs(u) = E e^{i⟨u,Xs⟩} = E[E[e^{i⟨u,Xs⟩}|Ns]]
= Σ_{n=0}^∞ E[e^{i⟨u,(Y1+···+Yn)⟩}] P {Ns = n}
= Σ_{n=0}^∞ (E[e^{i⟨u,Y1⟩}])^n ((λs)^n / n!) e^{−λs}
= exp(λs(φY(u) − 1)) = exp(λs ∫ (e^{i⟨u,y⟩} − 1) ν(dy)),

where ν and φY are, respectively, the distribution and the characteristic function of the Yk.

Exercise 2.8. For a Lévy process X let νs and φs (u) denote, respectively, the distribution and the charac-
teristic function for the Xs . Then, φs+t = φs φt and consequently νs+t = νs ∗ νt .
Use the dominated convergence theorem to show that the stochastic continuity of a Lévy process implies
that s ↦ φs is right continuous, and so, by the functional relationship in the exercise above,

φs(u) = exp(sψ(u))

for some function ψ.
Definition 2.9. The function ψ is called the characteristic exponent for the Lévy process.
Definition 2.10. A one dimensional Brownian motion B with parameters µ and σ² is a Lévy process with
B0 = 0 and

P {Bt ∈ A} = (1/(σ√(2πt))) ∫_A exp(−(x − µt)²/(2σ²t)) dx.

If µ = 0 and σ² = 1, then B is called a standard Brownian motion.
For d-dimensional Brownian motion B, we require µ ∈ Rd and a symmetric, positive definite matrix
Σ. Then B is a Lévy process and, if B0 = 0,

P {Bt ∈ A} = (1/√(det(Σ)(2πt)^d)) ∫_A exp(−(x − µt)^T Σ^{−1} (x − µt)/(2t)) dx.

Exercise 2.11. 1. Show that Brownian motion as defined above satisfies the consistency condition in the
extension theorem.
2. Find the characteristic exponent for Brownian motion.
Exercise 2.12. Let B be a d-dimensional Brownian motion. Show that there exist an invertible matrix C,
a vector µ, and a d-dimensional process B̃t = (B̃t¹, . . . , B̃tᵈ) whose components are independent standard
Brownian motions so that Bt = C B̃t + µt.
As a consequence, we sometimes call B a Brownian motion if Bt = C B̃t for some d × d matrix C.
Exercise 2.13. Let B be a standard Brownian motion. Then the following are also standard Brownian
motions.

1. −B,
2. {Bt+t0 − Bt0 ; t ≥ 0} for t0 > 0.
3. {aBt/a2 ; t ≥ 0} for a > 0.
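Part 3 can be checked in distribution: aB_{t/a²} is centered normal with variance a²(t/a²) = t. A Monte Carlo sketch (names are ours):

```python
import math
import random

random.seed(5)

# Check the scaling property {a B_{t/a^2}} in distribution: for a standard
# Brownian motion, a * B_{t/a^2} should again have mean 0 and variance t.
a, t, n = 2.0, 1.0, 200_000
samples = [a * random.gauss(0.0, math.sqrt(t / a**2)) for _ in range(n)]
mean = sum(samples) / n
var = sum((x - mean) ** 2 for x in samples) / n
print(mean, var)   # near 0 and near t = 1.0
```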

Proposition 2.14. Let B be a standard Brownian motion. Then,

P {sup_t Bt = +∞, inf_t Bt = −∞} = 1.

Proof. Set B* = sup{Bt; t ∈ Q ∩ [0, ∞)}. Then, for any a ∈ Q⁺, B* and aB* have the same distribution.
Consequently, the distribution of B* is concentrated on the set {0, ∞}. Set p* = P {B* = 0}. Then

p* ≤ P {B1 ≤ 0, Bu ≤ 0 for all u ∈ Q ∩ [1, ∞)}
≤ P {B1 ≤ 0, sup{B_{1+t} − B1; t ∈ Q⁺} < ∞}
= P {B1 ≤ 0} P {sup{B_{1+t} − B1; t ∈ Q⁺} < ∞}
= P {B1 ≤ 0} P {sup{Bt; t ∈ Q⁺} < ∞} = (1/2) p*.

Thus, p* = 0 and P {sup_t Bt = +∞} = 1. Because −B is a standard Brownian motion, P {inf_t Bt = −∞} =
1. Because the intersection of a pair of probability 1 events has probability 1, the proposition follows.

Take X0 = 0. Then, we can write

Xt = (Xt − X_{(n−1)t/n}) + (X_{(n−1)t/n} − X_{(n−2)t/n}) + · · · + (X_{t/n} − X0).

Thus, for any n, Xt can be written as the sum of n independent identically distributed random variables.
This motivates the following:

Definition 2.15. A random variable Y is said to have an infinitely divisible distribution if for every n, there
exist n independent identically distributed random variables, Y_{1,n}, . . . , Y_{n,n}, such that

Y = Y1,n + · · · + Yn,n .

The Lévy-Khintchine formula gives the characteristic function for any infinitely divisible distribution on
Rd:

exp ψ(u) = exp( i⟨β, u⟩ − (1/2)⟨u, Σu⟩ + ∫ ( e^{i⟨u,y⟩} − 1 − i⟨u, y⟩/(1 + |y|²) ) ν(dy) ).

In this formula, there is some flexibility in the truncation function χ(y) = y/(1 + |y|²). We need χ to be anti-
symmetric, bounded, and have derivative 1 at the origin. We can choose β to be any d-dimensional vector
and Σ to be any symmetric non-negative definite d × d matrix. The measure ν, called the Lévy measure, can
be any Radon measure satisfying
1. ν{0} = 0, and
2. ∫ |y|²/(1 + |y|²) ν(dy) < ∞.

Now, clearly exp(ψ(u)/n) is the characteristic function of a random variable. The sum of n such inde-
pendent random variables has characteristic function exp ψ(u). Thus, we have the converse: any infinitely
divisible distribution can be realized as the distribution of X1 for some Lévy process X.
If ∫ |y|/(1 + |y|²) ν(dy) < ∞, then we can write

exp ψ(u) = exp( i⟨β̃, u⟩ − (1/2)⟨u, Σu⟩ + ∫ (e^{i⟨u,y⟩} − 1) ν(dy) ),

where

β̃ = β − ∫ y/(1 + |y|²) ν(dy).

This is the characteristic function of a random variable that is the independent sum of a normal random
variable and a compound Poisson process evaluated at time 1.
If λ = ν(Rd\{0}) < ∞ and CC^T = Σ, then the process

Xt = βt + C B̃t + Σ_{n=1}^{Nt} Yn − µλt, where µ = ∫ (y/(1 + |y|²)) (ν/λ)(dy),

is a Lévy process with characteristic exponent ψ. Here B̃t = (B̃t¹, . . . , B̃tᵈ) has components that are
independent standard Brownian motions, N is a Poisson process with parameter λ, and {Yn; n ≥ 1} is an
independent sequence of random variables with common distribution ν/λ.
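When ν is finite, this construction can be tested directly by simulating X1 and comparing its empirical characteristic function with exp ψ(u). A sketch in dimension one with a hypothetical two-point Lévy measure (all parameter values and names are ours):

```python
import cmath
import math
import random

random.seed(13)

# Hypothetical finite Levy measure: nu{1} = 0.5, nu{-2} = 0.3, so
# lambda = nu(R \ {0}) = 0.8 and the jump distribution is nu / lambda.
ys, masses = [1.0, -2.0], [0.5, 0.3]
lam = sum(masses)
beta, sigma = 0.1, 0.5          # drift and Gaussian part (C = sigma)
mu = sum(w / lam * y / (1 + y * y) for y, w in zip(ys, masses))

def psi(u):
    """Levy-Khintchine exponent with truncation chi(y) = y / (1 + |y|^2)."""
    jump = sum(w * (cmath.exp(1j * u * y) - 1 - 1j * u * y / (1 + y * y))
               for y, w in zip(ys, masses))
    return 1j * beta * u - 0.5 * sigma**2 * u**2 + jump

def sample_X1():
    """One draw of X_1 = beta + sigma*B_1 + (compound Poisson) - mu*lam."""
    x = beta + sigma * random.gauss(0.0, 1.0) - mu * lam
    n, clock = 0, random.expovariate(lam)   # Poisson(lam) many jumps
    while clock <= 1.0:
        n += 1
        clock += random.expovariate(lam)
    for _ in range(n):
        x += random.choices(ys, weights=masses)[0]
    return x

u, n_sim = 1.0, 100_000
mc = sum(cmath.exp(1j * u * sample_X1()) for _ in range(n_sim)) / n_sim
print(abs(mc - cmath.exp(psi(u))))   # small Monte Carlo error
```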

3 Martingales
The definition of a martingale in continuous time is exactly the analog of the discrete time definition.

Definition 3.1. A real valued process X with E|Xt | < ∞ for all t ≥ 0 and adapted to a filtration {Ft ; t ≥ 0}
is an Ft -martingale if
E[Xt |Fs ] = Xs for all t > s,
an Ft -submartingale if
E[Xt |Fs ] ≥ Xs for all t > s,
and an Ft-supermartingale if the inequality above is reversed.
Remark 3.2. Many previous facts about discrete martingales continue to hold for continuous time martin-
gales.
1. X is an Ft martingale if and only if for every s < t, E[Xt ; A] = E[Xs ; A], for all A ∈ Fs .
2. Suppose X is an Ft-martingale, φ is convex and E|φ(Xt)| < ∞ for all t ≥ 0. Then φ ◦ X is an
Ft-submartingale.
3. Suppose X is an Ft -submartingale, φ is convex and non-decreasing, E|φ(Xt )| < ∞ for all t ≥ 0. Then
φ ◦ X is an Ft -submartingale.
Exercise 3.3. An adapted stochastic process M is a martingale if and only if

EMτ = EM0

for every bounded stopping time τ .


Example 3.4. Let X be a Lévy process, X0 = 0.
1. If X has finite mean, µ = EX1 . Then Xt − µt is a martingale.
2. If X has mean zero and finite variance, σ 2 = Var(X1 ). Then Xt2 − σ 2 t is a martingale.
3. Let X have characteristic exponent ψ. Then exp(iuXt − tψ(u)) is a complex-valued martingale.
Exercise 3.5. Compute the martingales above for the Poisson process and one-dimensional Brownian mo-
tion.
The optional sampling theorem continues to hold. In this context, let {τt ; t ≥ 0} be an increasing
collection of stopping times and set
Yt = Xτt , and Gt = Fτt .
Assume that each τt satisfies the sampling integrability conditions for X. If X is an Ft-submartingale, then
Y is a Gt -submartingale.
We have similar statements in the discrete time case with 0 = τ0 ≤ τ1 ≤ · · · and Yn = Xτn . One
important special case of this is the choice of τn = tn where {tn ; n ≥ 1} is a set of non-random times.

Proposition 3.6. Let τ and σ be two Ft -stopping times taking values in a finite set F = {t1 < · · · < tm }.
If X is an Ft -submartingale, then
E[Xτ |Fσ ] ≥ Xmin{τ,σ} .
Proof. We show that for every A ∈ Fσ,

E[Xτ; A] ≥ E[X_{min{τ,σ}}; A].

Write A = ∪_{i=1}^m (A ∩ {σ = ti}) to see that it is sufficient to show that, for each i,

E[Xτ; A ∩ {σ = ti}] ≥ E[X_{min{τ,ti}}; A ∩ {σ = ti}] = E[X_{min{τ,σ}}; A ∩ {σ = ti}].

Because A ∩ {σ = ti} ∈ F_{ti}, the inequality above holds provided that

E[Xτ|F_{ti}] ≥ X_{min{τ,ti}}.

Claim. E[X_{min{τ,t_{k+1}}}|F_{tk}] ≥ X_{min{τ,tk}}.

E[X_{min{τ,t_{k+1}}}|F_{tk}] = E[X_{min{τ,t_{k+1}}} I{τ > tk}|F_{tk}] + E[X_{min{τ,t_{k+1}}} I{τ ≤ tk}|F_{tk}]
= E[X_{t_{k+1}}|F_{tk}] I{τ > tk} + X_{min{τ,tk}} I{τ ≤ tk}
≥ X_{tk} I{τ > tk} + X_{min{τ,tk}} I{τ ≤ tk} = X_{min{τ,tk}}.

Thus, E[Xτ|F_{t_{m−1}}] = E[X_{min{τ,tm}}|F_{t_{m−1}}] ≥ X_{min{τ,t_{m−1}}}. Use this for the first step and the tower property
of conditional expectation in the induction step to complete the proof.
Example 3.7. Let N be a Poisson process with parameter λ. Let τn = inf{t ≥ 0; Nt = n}.
1. Fix a time t; then min{τn, t} is bounded, so by the optional sampling theorem,

0 = E N_{min{τn,t}} − λ E min{τn, t}.

Now, use the monotone convergence theorem on each of these terms and the fact that τn < ∞ almost
surely to obtain

0 = E N_{τn} − λ E τn and, therefore, E τn = n/λ.
2. Consider

exp(iuNt − λt(e^{iu} − 1)),

the exponential martingale for the Poisson process. Then

1 = E[exp(iuN_{τn} − λτn(e^{iu} − 1))],

and thus

e^{−iun} = E[exp(−λτn(e^{iu} − 1))].

Set

v = λ(e^{iu} − 1), or u = −i log((v + λ)/λ),

yielding

E e^{−vτn} = (λ/(λ + v))^n.

This is the n-th power of the Laplace transform of an exponential random variable and, hence, the
Laplace transform of a Γ(n, λ) random variable.
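Both conclusions, E τn = n/λ and the Γ(n, λ) Laplace transform, can be checked by simulating τn as a sum of exponential interarrival times (anticipating Exercise 3.8 below); a Monte Carlo sketch (names are ours):

```python
import math
import random

random.seed(17)

lam, n, v, n_sim = 2.0, 3, 1.0, 100_000

def tau_n():
    """Time of the n-th Poisson arrival: sum of n Exp(lam) interarrivals."""
    return sum(random.expovariate(lam) for _ in range(n))

taus = [tau_n() for _ in range(n_sim)]
mean_tau = sum(taus) / n_sim
laplace = sum(math.exp(-v * s) for s in taus) / n_sim
print(mean_tau, n / lam)                    # both near 1.5
print(laplace, (lam / (lam + v)) ** n)      # both near 0.296
```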
Exercise 3.8. Use N and τn as in the example above. Show that τ_{n+1} − τn is an exponential random
variable independent of F^N_{τn}, and use this to see that {τ_{n+1} − τn; n ≥ 1} is an independent and identically
distributed sequence of exponential random variables.
Exercise 3.9. Note that {Nt < n} = {τn > t}. Use this to obtain the density of a Γ(n, λ) random variable.

3.1 Regularity of Sample Paths


Lemma 3.10. Let X be a submartingale, t > 0, and F a finite subset of [0, t]. Then for each x > 0,

P {max_{s∈F} Xs ≥ x} ≤ (1/x) E Xt⁺,

and

P {min_{s∈F} Xs ≤ −x} ≤ (1/x)(E Xt⁺ − E X0).

Proof. The first statement follows immediately from the corresponding result for discrete time submartin-
gales. For the second statement, define

τ = min{u ∈ F; Xu ≤ −x}.

By the proposition above, E[Xt|Fτ] ≥ X_{min{τ,t}}. In addition, if τ < ∞, then min{τ, t} = τ. Thus,

E X0 ≤ E X_{min{τ,t}} = E[X_{min{τ,t}}; {τ < ∞}] + E[X_{min{τ,t}}; {τ = ∞}]
≤ E[Xτ; {τ < ∞}] + E[Xt; {τ = ∞}] ≤ −x P {τ < ∞} + E Xt⁺.

Now simplify.
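The first bound can be checked for Brownian motion, a martingale and hence a submartingale, on a finite grid F ⊂ [0, 1]; here E B1⁺ = 1/√(2π) ≈ 0.399. A Monte Carlo sketch (names and grid parameters are ours):

```python
import math
import random

random.seed(19)

# Check P{max_{s in F} B_s >= x} <= E[B_1^+] / x for Brownian motion on a
# finite grid F of [0, 1]; here E[B_1^+] = 1 / sqrt(2 pi).
steps, x, n_sim = 100, 1.0, 50_000
dt = 1.0 / steps
hits = 0
for _ in range(n_sim):
    b, running_max = 0.0, 0.0
    for _ in range(steps):
        b += random.gauss(0.0, math.sqrt(dt))
        running_max = max(running_max, b)
    hits += running_max >= x
p_hat = hits / n_sim
bound = 1.0 / math.sqrt(2 * math.pi) / x
print(p_hat, bound)   # p_hat stays below the bound ~0.399
```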
Corollary 3.11. Let X be a submartingale, t > 0, and C a countable subset of [0, t]. Then for each x > 0,

P {sup_{s∈C} Xs ≥ x} ≤ (1/x) E Xt⁺,

and

P {inf_{s∈C} Xs ≤ −x} ≤ (1/x)(E Xt⁺ − E X0).

Proof. Choose finite subsets F1 ⊂ F2 ⊂ · · · so that C = ∪_{n=1}^∞ Fn. Then, for 0 < x̃ < x,

P {sup_{s∈C} Xs ≥ x} ≤ lim_{n→∞} P {max_{s∈Fn} Xs ≥ x̃} ≤ (1/x̃) E Xt⁺.

Now, let x̃ → x.

Corollary 3.12. Let X be a submartingale and let D be a countable dense subset of [0, ∞). Then

P { sup_{s∈D∩[0,t]} Xs < ∞ } = P { inf_{s∈D∩[0,t]} Xs > −∞ } = 1.

We can now make similar analogies to the discrete time case for upcrossings. For a stochastic process X,
define U(a, b, F) to be the number of upcrossings of the interval (a, b) by X restricted to a finite set F. For
C countable, as before, write C = ∪_{n=1}^∞ Fn with F1 ⊂ F2 ⊂ · · ·. Then U(a, b, Fn) is a monotone increasing
sequence; call its limit U(a, b, C).

Exercise 3.13. Show that the definition above is independent of the choice of the finite sets Fn .
Theorem 3.14 (Doob’s upcrossing inequality). Let X be a submartingale and let t > 0. For a countable
subset C ⊂ [0, t],

E U(a, b, C) ≤ E(Xt − a)⁺ / (b − a).

Proof. Choose Fn as above; then by the discrete time version of the upcrossing inequality,

E U(a, b, Fn) ≤ E(Xt − a)⁺ / (b − a).

Now, use the monotone convergence theorem.
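The inequality can be checked exactly for a simple random walk (a martingale, hence a submartingale) by enumerating all paths over a finite time set; a sketch (names are ours, with a greedy counter computing the number of completed upcrossings):

```python
from itertools import product

def upcrossings(path, a, b):
    """Number of completed upcrossings of (a, b): passages from <= a to >= b."""
    count, below = 0, False
    for x in path:
        if x <= a:
            below = True
        elif x >= b and below:
            count += 1
            below = False
    return count

# Exact check of Doob's bound, E U(a, b, F) <= E (X_N - a)^+ / (b - a), for a
# simple random walk of N steps, enumerating all 2^N equally likely paths.
N, a, b = 14, 0.0, 2.0
total_up, total_pos, n_paths = 0, 0.0, 0
for steps in product((-1, 1), repeat=N):
    x, path = 0, [0]
    for s in steps:
        x += s
        path.append(x)
    total_up += upcrossings(path, a, b)
    total_pos += max(x - a, 0.0)
    n_paths += 1
mean_up = total_up / n_paths
bound = total_pos / n_paths / (b - a)
print(mean_up, bound)   # mean_up <= bound, as the inequality demands
```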
Corollary 3.15. Let D be a countable dense subset of [0, ∞). Then
P {U (a, b, D ∩ [0, t]) < ∞} = 1.
Remark 3.16. Set

Ω0 = ∩_{n=1}^∞ [ {sup_{s∈D∩[0,n]} Xs < ∞} ∩ {inf_{s∈D∩[0,n]} Xs > −∞} ∩ ∩_{a<b; a,b∈Q} {U(a, b, D ∩ [0, n]) < ∞} ].

Then, P(Ω0) = 1.
Exercise 3.17. For ω ∈ Ω_0,
\[ X_{t+}(\omega) = \lim_{s\to t+,\ s\in D} X_s(\omega) \]
exists for all t ≥ 0, and
\[ X_{t-}(\omega) = \lim_{s\to t-,\ s\in D} X_s(\omega) \]
exists for all t > 0. Furthermore, the process X^+ = (X_{t+}; t ≥ 0) has sample paths in D_S[0, ∞) and its left limits satisfy X^+_{t-}(\omega) = X_{t-}(\omega).
Set X_{t+}(\omega) = 0 for all ω ∉ Ω_0.

Proposition 3.18. Let X be a submartingale and define X^+ as above. Then Γ = {t ≥ 0; P\{X_{t+} \ne X_{t-}\} > 0} is countable, P\{X_t = X_{t+}\} = 1 for t ∉ Γ, and
\[ \tilde X_t(\omega) = \begin{cases} X_t(\omega) & \text{if } t \in \Gamma, \\ X_{t+}(\omega) & \text{if } t \notin \Gamma, \end{cases} \]
defines a modification of X almost all of whose sample paths have right and left limits at all t ≥ 0 and are right continuous at all t ∉ Γ.

Proof. Consider the space L∞ (Ω) of real-valued random variables on Ω and define the metric

ρ(ξ1 , ξ2 ) = E[min{|ξ1 − ξ2 |, 1}].

We can consider the stochastic process X as a map
\[ [0, \infty) \to (L^\infty(\Omega), \rho). \]
Now X^+ has right and left hand limits in this metric, and so, by the lemma, Γ is countable.
Choose a real number a, then φa (x) = max{x, a} is an increasing convex function and so φa (Xt ) is a
submartingale. Consequently,

a ≤ φa (Xs ) ≤ E[φa (Xt )|FsX ], 0 ≤ s ≤ t.

Because \{E[\varphi_a(X_t)|\mathcal F_s^X]; 0 \le s \le t\} is uniformly integrable and
\[ |\varphi_a(X_s)| \le |a| + |E[\varphi_a(X_t)|\mathcal F_s^X]|, \]
we see that \{\varphi_a(X_s); 0 \le s \le t\} is also uniformly integrable. Therefore, almost sure convergence implies convergence in L^1 and
\[ a \le \varphi_a(X_s) \le \lim_{u\to s+,\ u\in D} E[\varphi_a(X_u)|\mathcal F_s^X] = E[\varphi_a(X_{s+})|\mathcal F_s^X]. \]

For s ∉ Γ, X_{s+} = X_{s-} a.s. and
\[ E\big[E[\varphi_a(X_{s+})|\mathcal F_s^X] - \varphi_a(X_s)\big] \le \lim_{u\to s-,\ u\in D} E[\varphi_a(X_{s+}) - \varphi_a(X_u)] = 0. \]
Thus, the non-negative random variable E[\varphi_a(X_{s+})|\mathcal F_s^X] - \varphi_a(X_s) has zero expectation and is consequently 0 almost surely.
Using again that X_{s+} = X_{s-} a.s. and that X_{s-} is \mathcal F_s^X-measurable,
\[ \varphi_a(X_s) = E[\varphi_a(X_{s+})|\mathcal F_s^X] = \varphi_a(X_{s+}) \quad\text{almost surely.} \]

Thus, for almost all ω, if X_s(\omega) > a, then X_s(\omega) = X_{s+}(\omega). Because this holds for all a ∈ R,
\[ P\{X_s = X_{s+}\} = 1 \quad\text{for all } s \notin \Gamma, \]
i.e., X is right continuous at every s ∉ Γ and X̃ is a modification of X.
To see that X̃ has both left and right limits for all t ≥ 0, replace D with D ∪ Γ in the construction of Ω_0 and call this new set Ω_Γ ⊂ Ω_0. For ω ∈ Ω_Γ and t ≥ 0,
\[ X_{t+}(\omega) = \lim_{s\to t+} X_{s+}(\omega) = \lim_{s\to t+,\ s\in D\cup\Gamma} X_s(\omega). \]
Thus,
\[ X_{t+}(\omega) = \lim_{s\to t+} \tilde X_s(\omega) \]
and X̃ has right limits for every t.
Repeat the procedure with X^- to show the existence of left limits.

Remark 3.19. In most of the situations that we will encounter, Γ = ∅ and if the process is defined, as in
the case of Lévy processes, by its finite dimensional distributions, then we can take the version with sample
paths in DS [0, ∞).
Exercise 3.20. The convergence under the metric ρ is equivalent to convergence in probability.
Corollary 3.21. Let Z ∈ L^1. Then, for any filtration {F_t; t ≥ 0} and for all t ≥ 0,
\[ E[Z|\mathcal F_s] \to E[Z|\mathcal F_{t+}] \quad\text{in } L^1 \text{ as } s \to t+. \]
Proof. Let X_t = E[Z|\mathcal F_{t+}]. Then X is a martingale. By the proposition above, X has right limits a.s. at each t ≥ 0, and X is uniformly integrable. Consequently,
\[ X_s \to X_{t+} \quad\text{a.s. and in } L^1 \text{ as } s \to t+. \]

Note that X_{t+} is \mathcal F_{t+}-measurable. Take A ∈ \mathcal F_{t+}; then for s > t, \mathcal F_{t+} ⊂ \mathcal F_s and E[X_s; A] = E[Z; A]. Consequently,
\[ E[X_{t+}; A] = \lim_{s\to t+} E[X_s; A] = E[Z; A]. \]
Thus, X_{t+} = E[Z|\mathcal F_{t+}].


Theorem 3.22 (Doob's regularity theorem). If X is an F_t-submartingale, then the process X^+ is also an F_t-submartingale. Moreover, X^+ is a modification of X if and only if the map
\[ t \mapsto X_t \]
is right continuous from [0, ∞) to L^1(\Omega), that is, for every t ≥ 0,
\[ \lim_{s\to t+} E[|X_s - X_t|] = 0. \]

Proof.
Theorem 3.23 (Optional sampling formula). Let X be a right continuous F_t-submartingale and let τ and σ be F_t-stopping times. Then for each t > 0,
\[ E[X_{\min\{\tau,t\}}|\mathcal F_\sigma] \ge X_{\min\{\sigma,\tau,t\}}. \]
If, in addition, τ satisfies the sampling integrability conditions for X, then
\[ E[X_\tau|\mathcal F_\sigma] \ge X_{\min\{\sigma,\tau\}}. \]
Proof. We have the theorem in the case that τ and σ are discrete stopping times. Now, let {σ_n, τ_n; n ≥ 1} be sequences of nonincreasing discrete stopping times that converge, respectively, to σ and τ.
As before, for a ∈ R, set φ_a(x) = max{x, a} and note that φ_a(X_t) is a submartingale. Then,
\[ E[\varphi_a(X_{\min\{\tau_n,t\}})|\mathcal F_{\sigma_n}] \ge \varphi_a(X_{\min\{\sigma_n,\tau_n,t\}}). \]

Use the fact that \mathcal F_\sigma ⊂ \mathcal F_{\sigma_n} and the tower property to conclude that
\[ E[\varphi_a(X_{\min\{\tau_n,t\}})|\mathcal F_\sigma] \ge E[\varphi_a(X_{\min\{\sigma_n,\tau_n,t\}})|\mathcal F_\sigma]. \]



Note that
\[ a \le \varphi_a(X_{\min\{\tau_n,t\}}) \le E[\varphi_a(X_t)|\mathcal F_{\tau_n}] \quad\text{and}\quad a \le \varphi_a(X_{\min\{\sigma_n,\tau_n,t\}}) \le E[\varphi_a(X_t)|\mathcal F_{\min\{\sigma_n,\tau_n\}}]. \]
Consequently, as before, \{\varphi_a(X_{\min\{\tau_n,t\}}); n \ge 1\} and \{\varphi_a(X_{\min\{\sigma_n,\tau_n,t\}}); n \ge 1\} are uniformly integrable. Now let n → ∞ and use the right continuity of X to obtain
\[ E[\varphi_a(X_{\min\{\tau,t\}})|\mathcal F_\sigma] \ge E[\varphi_a(X_{\min\{\sigma,\tau,t\}})|\mathcal F_\sigma] = \varphi_a(X_{\min\{\sigma,\tau,t\}}). \]
Now let a → −∞. The second part follows as in the discrete time case.
We have theorems on convergence as t → ∞ analogous to those in the discrete time case.

Theorem 3.24 (Submartingale convergence theorem). Let X be a right continuous submartingale. If
\[ \sup_t EX_t^+ < \infty, \]
then
\[ \lim_{t\to\infty} X_t = X_\infty \]
exists almost surely with E|X_\infty| < \infty.


Theorem 3.25. Let X be a right continuous F_t-submartingale. Then X is uniformly integrable if and only if there exists a random variable X_∞ such that
\[ X_t \to X_\infty \quad\text{in } L^1. \]
Furthermore, when this holds,
\[ X_t \to X_\infty \quad\text{a.s.} \]
Corollary 3.26. A nonnegative supermartingale converges almost surely.
Proposition 3.27. Let X be a right continuous non-negative F_t-supermartingale. Let τ_c(0) be the first contact time with 0. Then, with probability 1, X_t = 0 for all t ≥ τ_c(0).
Proof. For n = 1, 2, ..., let τ_n = τ_e([0, n^{-1})), the first entrance time into [0, n^{-1}). Then τ_n is an F_{t+}-stopping time and τ_c(0) = \lim_{n\to\infty} τ_n. If τ_n < ∞, then X_{\tau_n} \le n^{-1}. Consequently, for every t ≥ 0,
\[ E[X_t|\mathcal F_{\tau_n+}] \le X_{\min\{t,\tau_n\}} \]
and hence
\[ E[X_t|\mathcal F_{\tau_n+}] I_{\{\tau_n\le t\}} \le \frac{1}{n}. \]
Taking expected values and letting n → ∞, we have
\[ E[X_t I_{\{\tau_c(0)\le t\}}] \le 0. \]
Now, use the non-negativity and right continuity of X.



3.2 Sample Path Regularity and Lévy Processes


From Doob's regularity theorem, we have that any Lévy process with finite mean has a version in D_{R^d}[0, ∞). For Brownian motion, we have the following.

Proposition 3.28. Brownian motion has a version in C_{R^d}[0, ∞).

Proof. Write B_t = C\tilde B_t + \mu t, where C is a d × d matrix, μ is a d-dimensional vector, and B̃ is a vector consisting of d independent standard Brownian motions. Thus, if standard Brownian motion has a continuous version, so does B. Thus, let B̃ be standard Brownian motion on R. Then B̃_t is a normal random variable with mean zero and variance t. Thus, E\tilde B_t^4 = 3t^2, or
\[ E[|\tilde B_t - \tilde B_s|^4] = 3(t - s)^2. \]
Because B has a version in D_{R^d}[0, ∞), we can apply the moment theorem on continuous versions with C = 3, β = 4 and α = 2.
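The moment bound used here is easy to check numerically. A minimal Monte Carlo sketch (the seed, sample size, and tolerance are arbitrary choices, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(0)
s, t = 0.3, 1.0
# B_t - B_s ~ N(0, t - s), so E|B_t - B_s|^4 should be 3 (t - s)^2
incr = rng.normal(0.0, np.sqrt(t - s), size=1_000_000)
fourth_moment = (incr**4).mean()
print(fourth_moment, 3 * (t - s)**2)
```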
Example 3.29. Let B be a continuous version of standard Brownian motion. Recall that for this process
P {supt Bt = ∞, inf t Bt = −∞} = 1. In particular, because B is continuous, for each a ∈ R, the stopping
time τa = inf{t ≥ 0; Bt = a} is finite with probability 1.
1. Choose a, b > 0 and define τ = inf{t ≥ 0; B_t ∉ (−a, b)} = min{τ_{−a}, τ_b}. Because each point in R is recurrent a.s., τ is almost surely finite. Thus, by the optional sampling theorem,
\[ 0 = EB_{\min\{\tau,n\}} = bP\{B_\tau = b, \tau \le n\} - aP\{B_\tau = -a, \tau \le n\} + E[B_n; \{\tau > n\}]. \]
Now, let n → ∞. For the third term, use the bounded convergence theorem to obtain
\[ 0 = bP\{B_\tau = b\} - aP\{B_\tau = -a\}, \]
or
\[ P\{B_\tau = b\} = \frac{a}{a+b}. \]
Now, use the martingale B_t^2 − t to obtain
\[ 0 = EB_{\min\{\tau,n\}}^2 - E\min\{\tau, n\}. \]
For the first term use the bounded convergence theorem and for the second use the monotone convergence theorem as n → ∞ to see that
\[ E\tau = EB_\tau^2 = b^2\frac{a}{a+b} + a^2\frac{b}{a+b} = ab. \]
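Both conclusions, P{B_τ = b} = a/(a+b) and Eτ = ab, can be checked by simulation. A rough Euler-discretization sketch (step size, path count, and seed are arbitrary choices; the time discretization introduces a small bias):

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 1.0, 2.0
dt, n_paths = 1e-3, 4000
sqdt = np.sqrt(dt)

x = np.zeros(n_paths)             # current position of each path
tau = np.zeros(n_paths)           # accumulated time of each path
hit_b = np.zeros(n_paths, dtype=bool)
alive = np.ones(n_paths, dtype=bool)
while alive.any():
    idx = np.flatnonzero(alive)
    x[idx] += sqdt * rng.standard_normal(idx.size)   # Brownian increment
    tau[idx] += dt
    out_hi = alive & (x >= b)
    out_lo = alive & (x <= -a)
    hit_b |= out_hi
    alive &= ~(out_hi | out_lo)

p_hit_b = hit_b.mean()            # compare with a / (a + b) = 1/3
mean_tau = tau.mean()             # compare with ab = 2
print(p_hit_b, mean_tau)
```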

2. Set X_t = B_t + μt. For x > 0, define the stopping time
\[ \tau_x = \inf\{t \ge 0;\ X_t = x\}. \]
Now
\[ \exp(uX_t - \alpha t) = \exp(uB_t - (\alpha - u\mu)t) \]
is a martingale provided that
\[ \alpha - u\mu = \frac{1}{2}u^2, \]
that is,
\[ u_\pm = -\mu \pm \sqrt{\mu^2 + 2\alpha}. \]
Note that if α > 0, then u_- < 0 < u_+. Thus the martingale exp(u_+X_t − αt) is bounded on [0, τ_x]. The optional sampling theorem applies and
\[ 1 = E[\exp(u_+X_{\tau_x} - \alpha\tau_x)] = e^{u_+x} E[e^{-\alpha\tau_x}]. \]
Consequently,
\[ E[e^{-\alpha\tau_x}] = \exp\big(-x(\sqrt{\mu^2 + 2\alpha} - \mu)\big). \]
Take α → 0+; then exp(−ατ_x) → I_{\{\tau_x<\infty\}}. Therefore,
\[ P\{\tau_x < \infty\} = \begin{cases} 1 & \mu \ge 0, \\ e^{2\mu x} & \mu < 0. \end{cases} \]
In addition, the Laplace transform can be inverted to see that τ_x has density
\[ f_{\tau_x}(t) = \frac{x}{\sqrt{2\pi t^3}} \exp\Big(-\frac{(x - \mu t)^2}{2t}\Big). \]

Exercise 3.30. Let B be standard Brownian motion, a > 0, and τ = inf{t ≥ 0; |B_t| = a}.
1. E[\exp(-\alpha\tau)] = \operatorname{sech}(a\sqrt{2\alpha}).
2. E\tau^2 = 5a^4/3. Hint: Prove that B_t^4 - 6tB_t^2 + 3t^2 is a martingale.
Proposition 3.31. Let X be a right continuous Lévy process. Then for each s ≥ 0, the process X̃_t = X_{s+t} − X_s is a Lévy process with the same finite dimensional distributions as X that is independent of \mathcal F^X_{s+}.

Proof. For each ε > 0, \mathcal F^X_{s+} ⊂ \mathcal F^X_{s+\epsilon} and X_{s+t+\epsilon} − X_{s+\epsilon} is a Lévy process independent of \mathcal F^X_{s+\epsilon}. Let A ∈ \mathcal F^X_{s+}. Choose a right continuous modification of X, times 0 = t_0 < t_1 < ··· < t_n, and bounded continuous functions f_j. Then
\[
E\Big[\prod_{j=1}^n f_j(X_{s+t_j} - X_{s+t_{j-1}});\ A\Big]
= \lim_{\epsilon\to 0+} E\Big[\prod_{j=1}^n f_j(X_{s+t_j+\epsilon} - X_{s+t_{j-1}+\epsilon});\ A\Big]
= \lim_{\epsilon\to 0+} E\Big[\prod_{j=1}^n f_j(X_{s+t_j+\epsilon} - X_{s+t_{j-1}+\epsilon})\Big] P(A)
= \lim_{\epsilon\to 0+} E\Big[\prod_{j=1}^n f_j(X_{t_j} - X_{t_{j-1}})\Big] P(A)
= E\Big[\prod_{j=1}^n f_j(X_{t_j} - X_{t_{j-1}})\Big] P(A).
\]

Proposition 3.32. Let X be a right continuous Lévy process and τ an almost surely finite stopping time. Then the process X̃_s = X_{\tau+s} − X_\tau is a Lévy process with respect to the filtration \{\mathcal F^X_{\tau+s}; s \ge 0\} having the same finite dimensional distributions as X_s − X_0 and is independent of \mathcal F^X_{\tau+}.

Proof. Consider times 0 = t_0 < t_1 < ··· < t_n, bounded continuous functions f_1, f_2, ..., f_n, and A ∈ \mathcal F^X_{\tau+}. For the case that τ is a discrete stopping time with range {s_k; k ≥ 1}, we have
\[
E\Big[\prod_{j=1}^n f_j(X_{\tau+t_j} - X_{\tau+t_{j-1}});\ A\Big]
= \sum_{k=1}^\infty E\Big[\prod_{j=1}^n f_j(X_{\tau+t_j} - X_{\tau+t_{j-1}});\ A \cap \{\tau = s_k\}\Big]
= \sum_{k=1}^\infty E\Big[\prod_{j=1}^n f_j(X_{s_k+t_j} - X_{s_k+t_{j-1}});\ A \cap \{\tau = s_k\}\Big]
= \sum_{k=1}^\infty E\Big[\prod_{j=1}^n f_j(X_{t_j} - X_{t_{j-1}})\Big] P(A \cap \{\tau = s_k\})
= E\Big[\prod_{j=1}^n f_j(X_{t_j} - X_{t_{j-1}})\Big] P(A).
\]
Note that A ∩ {τ = s_k} ∈ \mathcal F^X_{s_k+}, and thus the third line follows from the second by the previous proposition.
For the general case, let {τ_m; m ≥ 1} be a decreasing sequence of discrete stopping times with limit τ. Pick A ∈ \mathcal F^X_{\tau+} ⊂ \mathcal F^X_{\tau_m}. Now, apply the identity above to the τ_m, use right continuity and the bounded convergence theorem to obtain the result.
Corollary 3.33 (Blumenthal 0-1 law). For a Lévy process X with X_0 = 0, every event in \mathcal F^X_{0+} has probability 0 or 1.
Proof. Fix A ∈ \mathcal F^X_{0+}; then, in addition, A ∈ σ{X_t; t ≥ 0} = σ{X_t − X_0; t ≥ 0}. By the proposition, \mathcal F^X_{0+} and σ{X_t − X_0; t ≥ 0} are independent σ-algebras, and thus P(A) = P(A ∩ A) = P(A)P(A), so P(A) = 0 or 1.
In particular, if τ is an \mathcal F^X_{t+}-stopping time, then
\[ P\{\tau = 0\} = 0 \text{ or } 1. \]

Example 3.34. 1. Suppose that f(t) > 0 for all t > 0. Then
\[ \limsup_{t\to 0+} \frac{B_t}{f(t)} = c \quad\text{a.s.} \]
for some c ∈ [0, ∞]. If f(t) = \sqrt t, then B_t/f(t) is a standard normal and so the limit above cannot be finite. Consequently, c = ∞. If f(t) = t^\alpha, α < 1/2, then B_t/f(t) is normal with mean 0 and variance t^{1-2\alpha} → 0, and so the limit above cannot be positive. Consequently, c = 0.
2. Let τ = inf{t > 0; B_t > 0}. Then P{τ ≤ t} ≥ P{B_t > 0} = 1/2. Thus,
\[ P\{\tau = 0\} = \lim_{t\to 0+} P\{\tau \le t\} \ne 0, \]
and thus, by the Blumenthal 0-1 law, it must be 1.

Theorem 3.35 (Reflection principle). Let a ∈ R and let B be a continuous standard Brownian motion. Define the stopping time τ_a = inf{t > 0; B_t > a}. The process
\[ \tilde B_t = \begin{cases} B_t & \text{if } t < \tau_a, \\ 2a - B_t & \text{if } t \ge \tau_a, \end{cases} \]
is a continuous standard Brownian motion.

Proof. τ_a < ∞ with probability 1. Therefore, X_t = B_{\tau_a+t} − a and −X are standard Brownian motions independent of B^{\tau_a}. Consequently, (B^{\tau_a}, X) and (B^{\tau_a}, −X) have the same distribution. Define the continuous map
\[ \varphi : C_{\mathbb R}[0, \infty) \times C_{\mathbb R}[0, \infty) \to C_{\mathbb R}[0, \infty) \]
by
\[ \varphi(b, x)_t = b_t I_{\{t < \tau_a\}} + (a + x_{t-\tau_a}) I_{\{t \ge \tau_a\}}, \]
where τ_a is computed from the path b. Then φ(B^{\tau_a}, X) = B and φ(B^{\tau_a}, −X) = B̃.
Exercise 3.36. State and prove a similar theorem for symmetric Lévy processes and general finite stopping
times.
Corollary 3.37. Define
\[ B_t^* = \sup\{B_s;\ s \le t\}. \]
Then, for all a, x ≥ 0 and t ≥ 0,
\[ P\{B_t^* \ge a, B_t \le a - x\} = P\{B_t \ge a + x\}. \]
Proof.
\[ P\{B_t^* \ge a, B_t \le a - x\} = P\{B_t^* \ge a, \tilde B_t \le a - x\} = P\{B_t^* \ge a, B_t \ge a + x\} = P\{B_t \ge a + x\}. \]
Use the corollary in the case a > 0 and x = 0 to obtain
\[ P\{\tau_a \le t\} = P\{B_t^* \ge a\} = P\{B_t^* \ge a, B_t \le a\} + P\{B_t^* \ge a, B_t \ge a\} = 2P\{B_t \ge a\} = \frac{2}{\sqrt{2\pi t}}\int_a^\infty e^{-x^2/2t}\, dx. \]
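The identity P{B_t^* ≥ a} = 2P{B_t ≥ a} can be checked by simulating discretized Brownian paths; the discrete-time maximum slightly undershoots the continuous one, so the agreement is only approximate. A Monte Carlo sketch (path count, step count, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps, a = 100_000, 500, 1.0
dt = 1.0 / n_steps
pos = np.zeros(n_paths)
running_max = np.zeros(n_paths)
for _ in range(n_steps):                    # simulate B on [0, 1]
    pos += np.sqrt(dt) * rng.standard_normal(n_paths)
    np.maximum(running_max, pos, out=running_max)

p_max = (running_max >= a).mean()           # P{B_1^* >= a}, biased slightly low
p_end = 2 * (pos >= a).mean()               # 2 P{B_1 >= a}
print(p_max, p_end)
```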
Exercise 3.38. 1. Show that τ_a has density
\[ f_{\tau_a}(t) = \frac{a}{\sqrt{2\pi t^3}} e^{-a^2/2t}. \]
2. Show that (B_t^*, B_t) has joint density
\[ f_{(B_t^*, B_t)}(a, x) = \frac{2(2a - x)}{\sqrt{2\pi t^3}} \exp\Big(-\frac{(2a - x)^2}{2t}\Big), \quad a \ge 0,\ x \le a. \]
3. Show that {τ_a; a ≥ 0} is a Lévy process that does not have a finite mean.
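As a consistency check for part 1, the density of τ_a should integrate over [0, t] to P{τ_a ≤ t} = 2P{B_t ≥ a} = 1 − erf(a/√(2t)). A numerical sketch (the grid resolution and the values of a and t are arbitrary choices):

```python
import math
import numpy as np

a, t = 1.0, 2.0
ds = t / 400_000
s = np.arange(ds, t + ds / 2, ds)
density = a / np.sqrt(2 * np.pi * s**3) * np.exp(-a**2 / (2 * s))
lhs = density.sum() * ds                    # P{tau_a <= t} from the density
rhs = 1 - math.erf(a / math.sqrt(2 * t))    # 2 P{B_t >= a} from the reflection principle
print(lhs, rhs)
```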

3.3 Maximal Inequalities


Proposition 3.39 (Doob's submartingale maximal inequality). Let X be a right continuous submartingale.
1. For each x > 0 and t > 0,
\[ P\Big\{\sup_{0\le s\le t} X_s \ge x\Big\} \le \frac{1}{x} EX_t^+, \]
and
\[ P\Big\{\inf_{0\le s\le t} X_s \le -x\Big\} \le \frac{1}{x}\big(EX_t^+ - EX_0\big). \]
2. If X is non-negative, then for α > 1 and t > 0,
\[ E\Big[\sup_{0\le s\le t} X_s^\alpha\Big] \le \Big(\frac{\alpha}{\alpha-1}\Big)^\alpha EX_t^\alpha. \]

Proof. The first statement follows from the corresponding result for a countable collection of times in [0, t] and right continuity.
For the second statement, set τ = inf{s ≥ 0; X_s > x}. Then τ is an F_{t+}-stopping time. The right continuity of X implies that X_\tau \ge x whenever τ < ∞. Consequently,
\[ \Big\{\sup_{0\le s\le t} X_s > x\Big\} \subset \{\tau \le t\} \subset \Big\{\sup_{0\le s\le t} X_s \ge x\Big\}. \]
Moreover, these events have equal probability for all but countably many x > 0, and hence
\[ xP\{\tau \le t\} \le E[X_\tau; \{\tau \le t\}] \le E[X_t; \{\tau \le t\}]. \]
By the optional sampling theorem,
\[ EX_{\min\{\tau,t\}} \le EX_t. \]
Now the balance of the proof follows the same line as in the discrete time case.
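Part 1 of the proposition can be illustrated with a simple nonnegative submartingale: |S_k| for a symmetric simple random walk S. A Monte Carlo sketch (sample sizes and the threshold x are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, x = 50_000, 100, 20.0
steps = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
S = np.cumsum(steps, axis=1)                 # symmetric random walk paths
X = np.abs(S)                                # |S_k| is a nonnegative submartingale
lhs = (X.max(axis=1) >= x).mean()            # P{ max_{k <= n} X_k >= x }
rhs = X[:, -1].mean() / x                    # E[X_n^+] / x  (here X_n >= 0)
print(lhs, rhs)
```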
Proposition 3.40. Let B be a continuous version of standard Brownian motion and define
\[ \tilde B_t = \begin{cases} tB_{1/t} & \text{if } t > 0, \\ 0 & \text{if } t = 0. \end{cases} \]
Then B̃ is also a continuous version of standard Brownian motion.

Proof. An easy calculation shows that B̃ has stationary independent increments, that \tilde B_{t+s} - \tilde B_t is normally distributed with mean 0 and variance s, and that B̃ is continuous on (0, ∞).
To establish continuity at 0, first note that this is equivalent to showing that
\[ \lim_{t\to\infty} \frac{B_t}{t} = 0. \]
The subsequential limit along the integers is 0 by the strong law of large numbers. The following claim proves the theorem.

Claim.
\[ \lim_{n\to\infty} \frac{\max_{n\le t\le n+1} |B_t - B_n|}{n} = 0 \quad\text{a.s.} \]
Let ε > 0. Then, by Doob's submartingale maximal inequality applied to B_t^2,
\[ P\Big\{\max_{n\le t\le n+1} |B_t - B_n| > \epsilon n\Big\} = P\Big\{\max_{0\le t\le 1} |B_t| > \epsilon n\Big\} \le \frac{1}{(\epsilon n)^2} EB_1^2. \]
Thus,
\[ \sum_{n=1}^\infty P\Big\{\max_{n\le t\le n+1} |B_t - B_n| > \epsilon n\Big\} \le \sum_{n=1}^\infty \frac{1}{(\epsilon n)^2} = \frac{\pi^2}{6\epsilon^2} < \infty. \]
Thus, by the first Borel-Cantelli lemma,
\[ P\Big\{ \frac{\max_{n\le t\le n+1} |B_t - B_n|}{n} > \epsilon \text{ i.o.} \Big\} = 0 \]
and the claim holds.
We will need the following integration result for the next theorem.

Exercise 3.41. Let Φ be the cumulative distribution function for the standard normal. Then, for a > 0,
\[ \Phi(-a) = 1 - \Phi(a) \ge \frac{1}{\sqrt{2\pi}} \exp\Big(-\frac{a^2}{2}\Big) a^{-1}(1 - a^{-2}). \]
Hint: Use (1 - 3z^{-4})\exp(-z^2/2) < \exp(-z^2/2) and note that the left side has antiderivative (-z^{-1} + z^{-3})\exp(-z^2/2).

3.4 Localization
To extend our results to a broader class of processes, we introduce the concept of localization.

Definition 3.42. Let Y be an F_t-adapted process with sample paths in D_R[0, ∞).
1. A stopping time τ reduces Y if
\[ Y^\tau I_{\{\tau > 0\}} \]
is uniformly integrable.
2. Y is called a local martingale if there exists an increasing sequence of stopping times {τ_n; n ≥ 1} with \lim_{n\to\infty} \tau_n = \infty so that τ_n reduces Y and Y^{\tau_n} is a martingale. We shall call such a sequence a reducing sequence for Y.

Exercise 3.43. 1. If τ reduces Y and σ ≤ τ, then σ reduces Y.
2. Local martingales form a vector space of processes.
3. Take τ_n = n to see that martingales are local martingales.
4. If {τ_n; n ≥ 1} is a reducing sequence for Y and σ_n = inf{t ≥ 0 : |Y_t| ≥ n}, then {min{σ_n, τ_n}; n ≥ 1} is also a reducing sequence for Y.

This next proposition gives us our first glimpse into the stochastic calculus.

Proposition 3.44. Suppose that Y is a right continuous F_t-local martingale, and that V is a real-valued F_t-adapted, bounded continuous process having locally bounded variation. Then
\[ M_t = V_tY_t - \int_0^t Y_s\, dV_s \]
is an F_t-local martingale.

Remark 3.45. This anticipates the fact, which we will learn later, that integrals of martingales are martingales. Then M can be found from an integration by parts formula:
\[ \int_0^t V_s\, dY_s = V_tY_t - V_0Y_0 - \int_0^t Y_u\, dV_u. \]

Exercise 3.46 (summation by parts). Let {a_k; k ≥ 0} and {b_k; k ≥ 0} be two real valued sequences. Then
\[ \sum_{k=m}^{n-1} a_k(b_{k+1} - b_k) = a_nb_n - a_mb_m - \sum_{k=m}^{n-1} b_{k+1}(a_{k+1} - a_k). \]
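The identity can be spot-checked numerically on random sequences; a minimal sketch (the array length and the indices m, n are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
a = rng.standard_normal(12)
b = rng.standard_normal(12)
m, n = 3, 11
# summation by parts: sum a_k (b_{k+1} - b_k), k = m..n-1
lhs = sum(a[k] * (b[k + 1] - b[k]) for k in range(m, n))
rhs = a[n] * b[n] - a[m] * b[m] - sum(b[k + 1] * (a[k + 1] - a[k]) for k in range(m, n))
print(lhs, rhs)
```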

Proof (of the proposition). Let {τ_n; n ≥ 1} be a reducing sequence for Y so that τ_n is less than the contact time with the set (−n, n)^c.
Write |V|_t for the variation of V on the interval [0, t]. Because V is continuous, σ_n, the first contact time of |V| with [n, ∞), is a stopping time. Because |V| is continuous, σ_n → ∞ as n → ∞.
Write γ_n = min{σ_n, τ_n}. We show that M^{\gamma_n} is a martingale, i.e.,
\[ E\Big[ V^{\gamma_n}_{t+s}Y^{\gamma_n}_{t+s} - V^{\gamma_n}_tY^{\gamma_n}_t - \int_{\min\{t,\gamma_n\}}^{\min\{t+s,\gamma_n\}} Y^{\gamma_n}_u\, dV^{\gamma_n}_u \,\Big|\, \mathcal F_t \Big] = 0. \tag{3.1} \]
Let t = u_0 < u_1 < ··· < u_m = t + s be a partition of the interval [t, t + s]. Then, by the tower property and the fact that V is adapted and that Y^{\gamma_n} is a martingale, we see that
\[ E\Big[\sum_{k=0}^{m-1} V^{\gamma_n}_{u_k}(Y^{\gamma_n}_{u_{k+1}} - Y^{\gamma_n}_{u_k})\,\Big|\,\mathcal F_t\Big] = \sum_{k=0}^{m-1} E\big[V^{\gamma_n}_{u_k} E[Y^{\gamma_n}_{u_{k+1}} - Y^{\gamma_n}_{u_k}|\mathcal F_{u_k}]\,\big|\,\mathcal F_t\big] = 0. \]
Use the summation by parts formula to obtain
\[ E\Big[ V^{\gamma_n}_{t+s}Y^{\gamma_n}_{t+s} - V^{\gamma_n}_tY^{\gamma_n}_t - \sum_{k=0}^{m-1} Y^{\gamma_n}_{u_{k+1}}(V^{\gamma_n}_{u_{k+1}} - V^{\gamma_n}_{u_k}) \,\Big|\, \mathcal F_t \Big] = 0. \tag{3.2} \]
The sum in the conditional expectation converges almost surely to the Riemann-Stieltjes integral
\[ \int_t^{t+s} Y^{\gamma_n}_u\, dV^{\gamma_n}_u. \]
To complete the proof, we must show that this convergence also holds in L^1 and that M^{\gamma_n} is uniformly integrable. However,
\[ \Big|\sum_{k=0}^{m-1} Y^{\gamma_n}_{u_{k+1}}(V^{\gamma_n}_{u_{k+1}} - V^{\gamma_n}_{u_k})\Big| \le \sum_{k=0}^{m-1} |Y^{\gamma_n}_{u_{k+1}}||V^{\gamma_n}_{u_{k+1}} - V^{\gamma_n}_{u_k}| \le \max\{n, |Y^{\gamma_n}_{t+s}|\}\big(|V^{\gamma_n}|_{t+s} - |V^{\gamma_n}|_t\big) \le \max\{n, |Y^{\gamma_n}_{t+s}|\}\, n, \]
which is integrable. Thus, the Riemann sums converge in L^1 and we have both that (3.1) holds and that M^{\gamma_n} is uniformly integrable.

3.5 Law of the Iterated Logarithm


Theorem 3.47 (Law of the iterated logarithm for Brownian motion). Let B be a standard Brownian motion. Then,
\[ P\Big\{ \limsup_{t\to 0+} \frac{B_t}{\sqrt{2t\log\log(1/t)}} = 1 \Big\} = 1. \]
Proof (McKean). Write h(t) = \sqrt{2t\log\log(1/t)}.
Part 1.
\[ P\Big\{ \limsup_{t\to 0+} \frac{B_t}{h(t)} \le 1 \Big\} = 1. \]
Consider the exponential martingale Z_t(\alpha) = \exp(\alpha B_t - \frac{1}{2}\alpha^2 t). Then by Doob's submartingale maximal inequality,
\[ P\Big\{\sup_{0\le s\le t}\Big(B_s - \frac{\alpha}{2}s\Big) > \beta\Big\} = P\Big\{\sup_{0\le s\le t} Z_s(\alpha) > e^{\alpha\beta}\Big\} \le e^{-\alpha\beta} EZ_t(\alpha) = e^{-\alpha\beta}. \]
Fix θ and δ in (0, 1) and apply the inequality above to the sequences
\[ t_n = \theta^n, \qquad \alpha_n = \theta^{-n}(1+\delta)h(\theta^n), \qquad \beta_n = \frac{1}{2}h(\theta^n). \]
Then,
\[ \alpha_n\beta_n = \frac{1}{2}\theta^{-n}(1+\delta)h(\theta^n)^2 = (1+\delta)\log\log\theta^{-n} = (1+\delta)\Big(\log n + \log\log\frac{1}{\theta}\Big). \]
Consequently, for γ = (\log(1/\theta))^{-(1+\delta)},
\[ P\Big\{\sup_{0\le s\le t_n}\Big(B_s - \frac{\alpha_n}{2}s\Big) > \beta_n\Big\} \le \gamma\, n^{-(1+\delta)}. \]
Because δ > 0, these probabilities have a finite sum over n. Thus, by the first Borel-Cantelli lemma,
\[ P\Big\{\sup_{0\le s\le t_n}\Big(B_s - \frac{\alpha_n}{2}s\Big) > \beta_n \text{ i.o.}\Big\} = 0. \]

Thus, we have, for ω in an event having probability 1, that there exists N(ω) so that
\[ \sup_{0\le s\le t_n}\Big(B_s(\omega) - \frac{\alpha_n}{2}s\Big) \le \beta_n \]
for all n ≥ N(ω). For such n, consider t in the interval (θ^{n+1}, θ^n]. Then
\[ B_t(\omega) \le \sup_{0\le s\le \theta^n} B_s(\omega) \le \beta_n + \frac{1}{2}\alpha_n\theta^n = \frac{1}{2}(2+\delta)h(\theta^n) \le \frac{1}{2}\theta^{-1/2}(2+\delta)h(t). \]
Consequently,
\[ \limsup_{t\to 0+} \frac{B_t(\omega)}{h(t)} \le \frac{1}{2}\theta^{-1/2}(2+\delta). \]
Now, use the fact that \inf_{\theta,\delta\in(0,1)} \frac{1}{2}\theta^{-1/2}(2+\delta) = 1 to obtain
\[ \limsup_{t\to 0+} \frac{B_t(\omega)}{h(t)} \le 1. \]

Part 2.
\[ P\Big\{ \limsup_{t\to 0+} \frac{B_t}{h(t)} \ge 1 \Big\} = 1. \]
Choose θ ∈ (0, 1) and define the independent events
\[ A_n = \{B_{\theta^n} - B_{\theta^{n+1}} > h(\theta^n)\sqrt{1-\theta}\}. \]
Noting that B_{\theta^n} - B_{\theta^{n+1}} is a normal random variable with mean 0 and variance \theta^n(1-\theta), we have, by the exercise above,
\[ P(A_n) = 1 - \Phi(\theta^{-n/2}h(\theta^n)) \ge \frac{1}{\sqrt{2\pi}} \exp\Big(-\frac{1}{2}\theta^{-n}h(\theta^n)^2\Big) \theta^{n/2}h(\theta^n)^{-1}\big(1 - \theta^nh(\theta^n)^{-2}\big). \]
Because
\[ \frac{1}{2}\theta^{-n}h(\theta^n)^2 = \log n + \log\log(\theta^{-1}), \]
we have that
\[ \sum_{n=1}^\infty P(A_n) = \infty \]
by comparison to \sum_{n=1}^\infty (n\log n)^{-1}. Now, use the second Borel-Cantelli lemma to conclude that, for infinitely many n,
\[ B_{\theta^n} - B_{\theta^{n+1}} > h(\theta^n)\sqrt{1-\theta}. \]
Use the first part of the theorem, applied to −B, to obtain that
\[ B_{\theta^{n+1}} \ge -2h(\theta^{n+1}) > -4\sqrt{\theta}\, h(\theta^n) \]
for all sufficiently large n. Thus, for infinitely many n,
\[ B_{\theta^n} > \big(\sqrt{1-\theta} - 4\sqrt{\theta}\big)h(\theta^n). \]
This proves that, with probability 1,
\[ \limsup_{t\to 0+} \frac{B_t}{h(t)} \ge \sqrt{1-\theta} - 4\sqrt{\theta} \]
for all θ ∈ (0, 1). Now, part 2 follows by noting that \sup_{\theta\in(0,1)} \big(\sqrt{1-\theta} - 4\sqrt{\theta}\big) = 1.

Corollary 3.48. Let B be a standard Brownian motion. Then,
\[ P\Big\{ \limsup_{t\to\infty} \frac{B_t}{\sqrt{2t\log\log t}} = 1 \Big\} = 1. \]
Proof. Use the fact that tB_{1/t} is a Brownian motion.



4 Markov Processes
4.1 Definitions and Transition Functions
Definition 4.1. A stochastic process X with values in a metric space S is called a Markov process provided
that
P {Xs+t ∈ B|FtX } = P {Xs+t ∈ B|Xt } (4.1)
for every s, t ≥ 0 and B ∈ B(S).
If {F_t; t ≥ 0} is a filtration and F_t^X ⊂ F_t, then we call X a Markov process with respect to the filtration {F_t; t ≥ 0} if (4.1) holds with F_t^X replaced by F_t.
The probability measure α given by α(B) = P{X_0 ∈ B} is called the initial distribution of X.
Exercise 4.2. 1. If X is a Markov process with respect to the filtration {F_t; t ≥ 0}, then it is a Markov process.
2. Use the standard machine to show that the definition above is equivalent to

E[f (Xs+t )|FtX ] = E[f (Xs+t )|Xt ].

for all bounded measurable f .


Definition 4.3. A function
\[ p : [0, \infty) \times S \times \mathcal B(S) \to [0, 1] \]
is called a time homogeneous transition function if
1. for every (t, x) ∈ [0, ∞) × S, p(t, x, ·) is a probability measure,
2. for every x ∈ S, p(0, x, ·) = δ_x,
3. for every B ∈ B(S), p(·, ·, B) is measurable, and
4. (Chapman-Kolmogorov equation) for every s, t ≥ 0, x ∈ S, and B ∈ B(S),
\[ p(t + s, x, B) = \int_S p(s, y, B)\, p(t, x, dy). \]
The function p is a transition function for a time-homogeneous Markov process X if, for every s, t ≥ 0 and B ∈ B(S),
\[ P\{X_{t+s} \in B|\mathcal F_t^X\} = p(s, X_t, B). \]
Exercise 4.4. Show, using the standard machine, that the identity above is equivalent to
\[ E[f(X_{t+s})|\mathcal F_t^X] = \int_S f(y)\, p(s, X_t, dy) \]
for every s, t ≥ 0 and bounded measurable f.



To see how the Chapman-Kolmogorov equation arises, note that for all s, t, u ≥ 0 and B ∈ B(S),
\[ p(t + s, X_u, B) = P\{X_{u+t+s} \in B|\mathcal F_u^X\} = E[P\{X_{u+t+s} \in B|\mathcal F_{u+t}^X\}|\mathcal F_u^X] = E[p(s, X_{u+t}, B)|\mathcal F_u^X] = \int_S p(s, y, B)\, p(t, X_u, dy). \]

Exercise 4.5. Let X be a Lévy process with ν_t(B) = P{X_t ∈ B}. Show that X has transition function
\[ p(t, x, B) = \nu_t(B - x). \]
In this case, the Chapman-Kolmogorov equation states that \nu_{s+t} = \nu_s * \nu_t.
The transition function and the initial distribution determine the finite dimensional distributions of X by
\[ P\{X_0 \in B_0, X_{t_1} \in B_1, \ldots, X_{t_n} \in B_n\} = \int_{B_0}\int_{B_1}\cdots\int_{B_{n-1}} p(t_n - t_{n-1}, x_{n-1}, B_n)\, p(t_{n-1} - t_{n-2}, x_{n-2}, dx_{n-1}) \cdots p(t_1, x_0, dx_1)\, \alpha(dx_0). \]

Exercise 4.6. Show that the consistency conditions in the Daniell-Kolmogorov extension theorem are satisfied via the Chapman-Kolmogorov equation.
This gives us a unique measure on (S^{[0,\infty)}, \mathcal B(S)^{[0,\infty)}). Denote probabilities with respect to this measure by P_α, writing P_x for P_{\delta_x}.
Rarely are we able to write this transition function explicitly. In two well known cases, Brownian motion and the Poisson process, we can.
Example 4.7. For Brownian motion,
\[ p(t, x, B) = \frac{1}{\sqrt{2\pi t}} \int_B \exp\Big(-\frac{(y-x)^2}{2t}\Big)\, dy, \quad t > 0,\ x \in \mathbb R,\ B \in \mathcal B(\mathbb R). \]
For the Poisson process with parameter λ,
\[ p(t, x, \{y\}) = \frac{(\lambda t)^{y-x}}{(y-x)!} e^{-\lambda t}, \quad t \ge 0,\ y \ge x \ge 0,\ x, y \in \mathbb N. \]
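For Brownian motion the Chapman-Kolmogorov equation reduces, as in Exercise 4.5, to ν_s ∗ ν_t = ν_{s+t}: convolving the N(0, s) and N(0, t) densities gives the N(0, s+t) density. A numerical sketch on a grid (the grid limits, spacing, and times are arbitrary choices):

```python
import numpy as np

s, t = 0.5, 1.5
x = np.linspace(-12.0, 12.0, 2401)          # symmetric grid, x = 0 at the center
dx = x[1] - x[0]

def gauss(v):
    return np.exp(-x**2 / (2 * v)) / np.sqrt(2 * np.pi * v)

conv = np.convolve(gauss(s), gauss(t), mode="same") * dx   # density of nu_s * nu_t
err = np.abs(conv - gauss(s + t)).max()
print(err)
```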
The kernels p(s, ·, ·) are naturally associated with operators on the bounded measurable functions.
Definition 4.8. The transition operator for the transition function p is defined by
\[ T(s)f(x) = \int_S f(y)\, p(s, x, dy) = E_x[f(X_s)]. \]
Consequently,
\[ T(s)f(X_t) = \int_S f(y)\, p(s, X_t, dy) = E[f(X_{t+s})|\mathcal F_t^X]. \]
Exercise 4.9. The family of operators {T (t); t ≥ 0} satisfies
1. T (0)f = f
2. T (s + t)f = T (s)T (t)f
for all bounded measurable f .

4.2 Operator Semigroups


Definition 4.10. A one-parameter family {T(t); t ≥ 0} of bounded linear operators on a Banach space (L, ||·||) is called a semigroup if
1. T(0)f = f,
2. T(s + t)f = T(s)T(t)f
for all f ∈ L. The semigroup is called strongly continuous if
\[ \lim_{t\to 0+} T(t)f = f \]
for all f ∈ L. The semigroup is called a contraction semigroup if
\[ ||T(t)|| = \sup\{||T(t)f||;\ ||f|| = 1\} \le 1. \]

Exercise 4.11. Let A be a bounded linear operator and define, for t ≥ 0, the linear operator
\[ \exp tA = \sum_{k=0}^\infty \frac{t^k}{k!} A^k, \qquad \exp 0 = I. \]
Then exp tA is a strongly continuous semigroup and ||\exp tA|| \le \exp(t||A||).
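In finite dimensions, exp tA can be computed directly from the truncated series, and the semigroup property T(s + t) = T(s)T(t) checked numerically; a sketch (the matrix, times, and truncation length are arbitrary choices):

```python
import numpy as np

def expm(A, t, terms=60):
    """Truncated series for exp(tA) = sum_k (t^k / k!) A^k."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ (t * A) / k          # next series term
        result = result + term
    return result

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
s, t = 0.3, 0.7
err = np.abs(expm(A, s + t) - expm(A, s) @ expm(A, t)).max()
print(err)
```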
Proposition 4.12. Let T be a strongly continuous contraction semigroup. Then, for each f ∈ L, the
mapping
t → T (t)f
is a continuous function from [0, ∞) into L.
Proof. For t ≥ 0 and h > 0,

||T (t + h)f − T (t)f || = ||T (t)(T (h)f − f )|| ≤ ||T (t)|| ||T (h)f − f || ≤ ||T (h)f − f ||.

For 0 ≤ h ≤ t,

||T (t − h)f − T (t)f || = ||T (t − h)(T (h)f − f )|| ≤ ||T (t − h)|| ||T (h)f − f || ≤ ||T (h)f − f ||.

Now, let h → 0.

4.2.1 The Generator


Definition 4.13. Let A be a linear operator with domain D(A), a subspace of L and range R(A).
1. The graph of A is given by

G(A) = {(f, Af ) : f ∈ D(A)} ⊂ L × L.

2. L × L is a Banach space under the norm ||(f, g)|| = ||f || + ||g||. A is a closed operator if its graph is
a closed subspace of L × L.
Note that if A is bounded and D(A) = L, then A is closed.

3. Call à an extension of A if D(A) ⊂ D(Ã) and Ãf = Af for all f ∈ D(A).


Definition 4.14. The (infinitesimal) generator of a semigroup T is the linear operator G defined by
\[ Gf = \lim_{t\to 0+} \frac{1}{t}(T(t)f - f). \]
The domain D(G) of G is the subspace of all f ∈ L for which the limit above exists.
Exercise 4.15. 1. Let L be the bounded real-valued functions on N. For the Poisson process with parameter λ, show that the generator is
\[ Gf(x) = \lambda(f(x+1) - f(x)) \]
and that D(G) = L.
2. Assume that A is a bounded operator. Then the semigroup exp tA has generator A and D(A) = L.
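The Poisson generator formula in part 1 can be checked numerically: for small t, (T(t)f(x) − f(x))/t should be close to λ(f(x+1) − f(x)). A sketch (the test function, λ, x, and t are arbitrary choices):

```python
import math

lam, x = 2.0, 3

def f(y):
    return math.sin(0.3 * y)                 # an arbitrary bounded test function

def Tf(t, x, n_max=60):
    # T(t)f(x) = E f(x + N_t), with N_t Poisson(lam * t)
    return sum(math.exp(-lam * t) * (lam * t)**n / math.factorial(n) * f(x + n)
               for n in range(n_max))

t = 1e-5
approx = (Tf(t, x) - f(x)) / t
exact = lam * (f(x + 1) - f(x))              # Gf(x) = lam (f(x+1) - f(x))
print(approx, exact)
```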
We will be examining continuous functions u : [a, b] → L. Their Riemann integrals will be defined via Riemann sums and limits. Thus, we will have analogous definitions of Riemann integrability and of the improper Riemann integral \int_a^\infty u(s)\, ds, and an analogous statement of the fundamental theorem of calculus:
\[ \lim_{h\to 0} \frac{1}{h}\int_a^{a+h} u(s)\, ds = u(a). \]

Exercise 4.16. 1. If u : [a, b] → L is continuous and ||u|| is Riemann integrable, then u is integrable and
Z b Z b
|| u(s) ds|| ≤ ||u(s)|| ds.
a a

2. Let B be a closed linear operator on L and assume that u : [a, b] → L is continuous, that u(s) ∈ D(B)
for all s ∈ [a, b], and that both u and Bu are Riemann integrable. Then
Z b Z b Z b
u(s) ds ∈ D(B) and B u(s) ds = Bu(s) ds.
a a a

3. If u is continuously differentiable, then
\[ \int_a^b \frac{d}{dt}u(t)\, dt = u(b) - u(a). \]
The example of the semigroup exp tA motivates the following identities.

Proposition 4.17. Let T be a strongly continuous semigroup on L with generator G.


1. If f ∈ L and t ≥ 0, then
\[ \int_0^t T(s)f\, ds \in D(G) \quad\text{and}\quad T(t)f - f = G\int_0^t T(s)f\, ds. \tag{4.2} \]
2. If f ∈ D(G) and t ≥ 0, then T(t)f ∈ D(G) and
\[ \frac{d}{dt}T(t)f = GT(t)f = T(t)Gf. \tag{4.3} \]
3. If f ∈ D(G) and t ≥ 0, then
\[ T(t)f - f = \int_0^t GT(s)f\, ds = \int_0^t T(s)Gf\, ds. \tag{4.4} \]

Proof. 1. Observe that (T(h) − I)/h is a bounded, hence closed, operator for each h > 0. Thus,
\[ \frac{1}{h}(T(h) - I)\int_0^t T(s)f\, ds = \frac{1}{h}\int_0^t (T(s+h)f - T(s)f)\, ds = \frac{1}{h}\int_t^{t+h} T(s)f\, ds - \frac{1}{h}\int_0^h T(s)f\, ds. \]
Now, let h → 0 to obtain (4.2).
2. Write G_h = (T(h) − I)/h. Then, for all h > 0,
\[ \frac{1}{h}(T(t+h)f - T(t)f) = G_hT(t)f = T(t)G_hf. \]
Consequently, T(t)f ∈ D(G) and
\[ \frac{d^+}{dt}T(t)f = GT(t)f = T(t)Gf. \]
To check the left derivative, use the identity
\[ \frac{1}{-h}(T(t-h)f - T(t)f) - T(t)Gf = T(t-h)(G_hf - Gf) + (T(t-h) - T(t))Gf, \]
valid for h ∈ (0, t].
3. This is a consequence of part 2 in this proposition and part 3 in the exercise above.

Exercise 4.18. Interpret the theorem above for the semigroup

T (s)f (x) = f (x + s)

for measurable functions f : R → R.


Corollary 4.19. If G is the generator of a strongly continuous contraction semigroup T on L, then G is closed and D(G) is dense in L.
Proof. Because, for every f ∈ L,
\[ \lim_{h\to 0+} \frac{1}{h}\int_0^h T(s)f\, ds = f \]
and, by (4.2), each such integral lies in D(G), D(G) is dense in L.

To see that G is closed, choose {f_n; n ≥ 1} so that
\[ f_n \to f \quad\text{and}\quad Gf_n \to g \quad\text{as } n \to \infty. \]
Then, by (4.4),
\[ T(t)f_n - f_n = \int_0^t T(s)Gf_n\, ds \quad\text{for each } t > 0. \]
Thus, by letting n → ∞, we obtain
\[ T(t)f - f = \int_0^t T(s)g\, ds \quad\text{for each } t > 0. \]
Now, divide by t and let t → 0 to conclude that
\[ f \in D(G) \quad\text{and}\quad Gf = g. \]

4.2.2 The Resolvent


We now give some necessary conditions for an operator to be the generator of a strongly continuous semigroup. The Hille-Yosida theorem will show that these conditions are also sufficient.

Definition 4.20. For a strongly continuous contraction semigroup T, define the resolvent, for λ > 0,
\[ R(\lambda)g = \int_0^\infty e^{-\lambda s} T(s)g\, ds. \]

Exercise 4.21. If X is a time homogeneous Markov process with semigroup T and τ_λ is an independent exponential random variable with parameter λ, then
\[ \lambda R(\lambda)g(x) = E_x[g(X_{\tau_\lambda})]. \]
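The exercise can be illustrated for Brownian motion started at 0 with g = cos, where T(s)g(0) = E\cos(B_s) = e^{-s/2}, so λR(λ)g(0) = λ/(λ + 1/2). A Monte Carlo sketch (λ, the sample size, and the seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
lam, n = 2.0, 1_000_000
tau = rng.exponential(1.0 / lam, size=n)         # exponential(rate lam) times
B_tau = np.sqrt(tau) * rng.standard_normal(n)    # Brownian motion evaluated at tau
mc = np.cos(B_tau).mean()                        # Monte Carlo estimate of E g(X_tau)
exact = lam / (lam + 0.5)                        # lam R(lam) cos evaluated at 0
print(mc, exact)
```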

Proposition 4.22. Let T be a strongly continuous contraction semigroup with generator G. Then for all λ > 0, (λI − G)^{-1} is a bounded linear operator on L and
\[ R(\lambda)g = (\lambda I - G)^{-1}g \]
for all g ∈ L.
Proof. Fix λ > 0 and choose g ∈ L. Then
\[ ||R(\lambda)g|| \le \int_0^\infty e^{-\lambda s}||T(s)g||\, ds \le \frac{1}{\lambda}||g||. \]
Also, for any h > 0,
\[ \frac{1}{h}(T(h) - I)R(\lambda)g = \frac{1}{h}\int_0^\infty e^{-\lambda s}(T(s+h)g - T(s)g)\, ds = \frac{e^{\lambda h} - 1}{h}\int_h^\infty e^{-\lambda s}T(s)g\, ds - \frac{1}{h}\int_0^h e^{-\lambda s}T(s)g\, ds. \]
Let h → 0 to see that R(λ)g ∈ D(G) and GR(λ)g = λR(λ)g − g, or
\[ (\lambda I - G)R(\lambda)g = g. \]
Thus, R(λ) is a right inverse for λI − G and the range of λI − G is all of L.
If g ∈ D(G), then use the fact that G is closed to see that
\[ R(\lambda)Gg = \int_0^\infty e^{-\lambda s}T(s)Gg\, ds = \int_0^\infty G(e^{-\lambda s}T(s)g)\, ds = G\int_0^\infty e^{-\lambda s}T(s)g\, ds = GR(\lambda)g, \]
or
\[ R(\lambda)(\lambda I - G)g = g. \]
Thus, λI − G is one-to-one and its left inverse is R(λ).
Definition 4.23. ρ(G) = {λ; (λI − G) has a bounded inverse} is called the resolvent set of G.
Thus, if G is the generator of a strongly continuous contraction semigroup, then its resolvent set ρ(G) contains (0, ∞).

Remark 4.24. Let τ_λ be an exponential random variable with parameter λ. Then, as λ → ∞, τ_λ converges in distribution to the degenerate random variable that takes on the value 0 with probability 1. This motivates the following.
Proposition 4.25. Let {R(λ); λ > 0} be the resolvent for a strongly continuous contraction semigroup with generator G. Then
\[ \lim_{\lambda\to\infty} \lambda R(\lambda)f = f \]
for every f ∈ L.
Proof. Note that for each f ∈ D(G),
\[ \lambda R(\lambda)f - f = R(\lambda)Gf \quad\text{and}\quad ||R(\lambda)Gf|| \le \lambda^{-1}||Gf||. \]
Thus, the formula holds on the dense set D(G). Now, use the fact that ||λR(λ) − I|| ≤ 2 for all λ > 0 and approximate.
Exercise 4.26. 1. (Resolvent identity) Let λ, μ > 0. Then
\[ R(\lambda)R(\mu) = R(\mu)R(\lambda) = \frac{R(\mu) - R(\lambda)}{\lambda - \mu}. \]
2. If λ ∈ ρ(G) and |λ − μ| < ||R(λ)||^{-1}, then
\[ R(\mu) = \sum_{n=0}^\infty (\lambda - \mu)^n R(\lambda)^{n+1}. \]
Consequently, ρ(G) is open.
3. If G is the generator of a strongly continuous contraction semigroup, then
\[ ||\lambda f - Gf|| \ge \lambda||f|| \quad\text{for every } f \in D(G) \text{ and } \lambda > 0. \]


4.3 The Hille-Yosida Theorem


The Hille-Yosida theorem gives a characterization for generators. We begin with a definition.

4.3.1 Dissipative Operators


Definition 4.27. A linear operator A on L is dissipative if
\[ ||\lambda f - Af|| \ge \lambda||f|| \quad\text{for every } f \in D(A) \text{ and } \lambda > 0. \tag{4.5} \]
Remark 4.28. If A is dissipative, then divide (4.5) by λ to obtain ||f − Af/λ|| ≥ ||f||, or
\[ ||f - cAf|| \ge ||f|| \quad\text{for all } c > 0. \]

Exercise 4.29. 1. If A is dissipative, then λI − A is one-to-one.


2. If A is dissipative and (λI − A)−1 exists, then ||(λI − A)−1 || ≤ λ−1 .
Lemma 4.30. Let A be a dissipative linear operator on L and let λ > 0. Then A is closed if and only if
R(λI − A) is closed.
Proof. Suppose that A is closed. If {fn ; n ≥ 0} ⊂ D(A), and (λI − A)fn → h, then

||(λI − A)(fn − fm )|| ≥ λ||fn − fm ||,

and {fn ; n ≥ 0} is a Cauchy sequence. Thus, there exists f ∈ L such that fn → f and hence

Afn → λf − h.

Because A is closed, f ∈ D(A) and


h = (λI − A)f.
This shows that R(λI − A) is closed.
Suppose R(λI − A) is closed, and let {f_n; n ≥ 0} ⊂ D(A) with f_n → f and Af_n → g. Consequently,
\[ (\lambda I - A)f_n \to \lambda f - g. \]
Because R(λI − A) is closed,
\[ \lambda f - g = (\lambda I - A)f_0 \quad\text{for some } f_0 \in D(A). \]
Because A is dissipative,
\[ ||(\lambda I - A)(f_n - f_0)|| \ge \lambda||f_n - f_0||, \]
and f_n → f_0. Hence, f = f_0 ∈ D(A) and Af = g. This shows that A is closed.
Lemma 4.31. Let A be a dissipative closed linear operator on L. Set ρ^+(A) = ρ(A) ∩ (0, ∞). If ρ^+(A) ≠ ∅, then ρ^+(A) = (0, ∞).

Proof. The condition is equivalent to the statement that ρ+ (A) is both open and closed in (0, ∞). We have
shown that ρ(A) is open.
Suppose that {λn ; n ≥ 1} ⊂ ρ+ (A) with λn → λ > 0. Given g ∈ L, define, for each n,

gn = (λI − A)(λn I − A)−1 g.

Because A is dissipative,

        ||gn − g|| = ||((λI − A) − (λn I − A))(λn I − A)^{−1} g|| = ||(λ − λn )(λn I − A)^{−1} g|| ≤ (|λ − λn |/λn ) ||g||.

Thus, gn → g. Consequently, R(λI − A) is dense in L. We have shown that this range is closed, thus
R(λI − A) = L. Because λI − A is one-to-one and ||(λI − A)−1 || ≤ λ−1 , we have that λ ∈ ρ+ (A). Thus,
ρ+ (A) is closed.

4.3.2 Yosida Approximation


The generator of a strongly continuous contraction semigroup is sometimes an unbounded operator. The
Yosida approximation gives us a method of approximating generators using bounded operators. Later, we
will give a probabilistic interpretation to this approximation.

Definition 4.32. Let G be a dissipative closed linear operator on L, and suppose that D(G) is dense in L
and that (0, ∞) ⊂ ρ(G). Then the Yosida approximation of G is

Gλ = λG(λI − G)−1 , λ > 0.

Proposition 4.33. The Yosida approximation has the following properties:


1. For each λ > 0, Gλ is a bounded linear operator on L and {Tλ (t) = exp(tGλ ); t ≥ 0} is a
strongly continuous contraction semigroup on L.
2. Gλ Gµ = Gµ Gλ for all λ, µ > 0.
3. For every f ∈ D(G),
        lim_{λ→∞} Gλ f = Gf.

Proof. Because

        I = (λI − G)R(λ) on L,        Gλ = λ^2 R(λ) − λI on L.

Because

        R(λ)(λI − G) = I on D(G),     Gλ = λR(λ)G on D(G).

Note that

        ||Tλ (t)|| ≤ e^{−tλ} || exp(tλ^2 R(λ))|| ≤ e^{−tλ} e^{tλ^2 ||R(λ)||} ≤ 1,

proving part 1.
Part 2 follows from the first identity above and the resolvent identity.
Part 3 follows from the second identity above and the fact that for f ∈ D(G), λR(λ)f → f as λ → ∞.

Lemma 4.34. For a pair of commuting bounded operators B and C on L assume that || exp(tB)|| ≤ 1 and
|| exp(tC)|| ≤ 1 for all t ≥ 0. Then

|| exp(tB)f − exp(tC)f || ≤ t||Bf − Cf ||.

Proof.

exp(tB)f − exp(tC)f = ∫_0^t (d/ds)(exp(sB) exp((t − s)C))f ds
                    = ∫_0^t exp(sB)(B − C) exp((t − s)C)f ds
                    = ∫_0^t exp(sB) exp((t − s)C)(B − C)f ds.

The rest is easy.
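For intuition, the inequality of Lemma 4.34 can be checked numerically in a finite-dimensional setting. The commuting generators below are illustrative diagonal matrices (so the semigroups are sup-norm contractions); this is a sanity check, not part of the proof.

```python
import numpy as np

# Hypothetical finite-dimensional check of Lemma 4.34: B and C are
# commuting (here: diagonal, nonpositive) generators, so exp(tB) and
# exp(tC) are contractions in the sup norm.
B = np.diag([-1.0, -2.0, -0.5])
C = np.diag([-3.0, -0.25, -1.5])
f = np.array([1.0, -2.0, 0.5])

def sup(v):
    return np.max(np.abs(v))

for t in [0.1, 0.5, 1.0, 3.0]:
    # exp(tB)f for diagonal B is just elementwise e^{t b_ii} f_i
    lhs = sup(np.exp(t * np.diag(B)) * f - np.exp(t * np.diag(C)) * f)
    rhs = t * sup(B @ f - C @ f)
    assert lhs <= rhs + 1e-12
```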


Theorem 4.35 (Hille-Yosida). A linear operator G is the generator of a strongly continuous contraction
semigroup on L if and only if
1. D(G) is dense in L.
2. G is dissipative.
3. R(λ0 I − G) = L for some λ0 > 0.
Proof. The necessity of these conditions has been established.
We have shown that conditions 2 and 3 imply that G is closed. By condition 3, λ0 ∈ ρ(G) and, therefore,
(0, ∞) ⊂ ρ(G).
As before, for each λ > 0, define the Yosida approximation Gλ and its semigroup Tλ . Then by the lemma
above, we have
        ||Tλ (t)f − Tµ (t)f || ≤ t||Gλ f − Gµ f ||
for all f ∈ L, t ≥ 0 and λ, µ > 0.
Consequently, because the Yosida approximations converge to the generator G, we have, for all f ∈ D(G),

lim Tλ (t)f
λ→∞

exists for all t ≥ 0 uniformly on bounded intervals. Call this limit T (t)f . Because D(G) is dense, this limit
holds for all f ∈ L.
Claim 1. T is a strongly continuous contraction semigroup.
The fact that T (0)f = f and that T (t) is a contraction is immediate. Use the identity

T (t)f − f = (T (t) − Tλ (t))f + (Tλ (t) − I)f

to check the strong continuity at 0 and

T (s + t)f − T (s)T (t)f = (T (s + t) − Tλ (s + t))f + Tλ (s)(Tλ (t) − T (t))f + (Tλ (s) − T (s))T (t)f

to check the semigroup property.


Claim 2. G is the generator of T .
For every f ∈ L, t ≥ 0, and λ > 0,

        Tλ (t)f − f = ∫_0^t Tλ (s)Gλ f ds.

For each f ∈ D(G) and s ≥ 0,


Tλ (s)Gλ f − T (s)Gf = Tλ (s)(Gλ f − Gf ) + (Tλ (s) − T (s))Gf.
Use the convergence of the Yosida approximations and the convergence of the semigroups to see that
Tλ (s)Gλ f → T (s)Gf as λ → ∞
uniformly for s ∈ [0, t]. Thus the Riemann integrals converge and

        T (t)f − f = ∫_0^t T (s)Gf ds,   f ∈ D(G), t ≥ 0.

Consequently, the generator G̃ of T is an extension of G.


With this in mind, choose f̃ ∈ D(G̃) and λ > 0. Because λI − G is surjective, there exists f ∈ D(G) so that

        (λI − G̃)f̃ = (λI − G)f = (λI − G̃)f.

Because G̃ is dissipative, λI − G̃ is one-to-one, and therefore f̃ = f ∈ D(G). Thus, G̃ = G.
Lemma 4.36. Let A be a dissipative operator on L and suppose that u : [0, ∞) → D(A) and Au : [0, ∞) → L
are continuous. If

        u(t) = u(ε) + ∫_ε^t Au(s) ds

for all t > ε > 0, then ||u(t)|| ≤ ||u(0)|| for all t ≥ 0.


Proof. Let 0 < ε = t0 < t1 < · · · < tn = t. Then

||u(t)|| = ||u(ε)|| + Σ_{j=1}^n (||u(tj )|| − ||u(tj−1 )||)

         = ||u(ε)|| + Σ_{j=1}^n (||u(tj )|| − ||u(tj ) − (tj − tj−1 )Au(tj )||)

              + Σ_{j=1}^n (||u(tj ) − (tj − tj−1 )Au(tj )|| − ||u(tj ) − (u(tj ) − u(tj−1 ))||)

         ≤ ||u(ε)|| + Σ_{j=1}^n (||u(tj ) − ∫_{tj−1}^{tj} Au(tj ) ds|| − ||u(tj ) − ∫_{tj−1}^{tj} Au(s) ds||)

         ≤ ||u(ε)|| + Σ_{j=1}^n ∫_{tj−1}^{tj} ||Au(tj ) − Au(s)|| ds.

The first sum is nonpositive by the dissipativity of A. Now use the continuity of Au and u and let
max(tj − tj−1 ) → 0. Then, let ε → 0.
Proposition 4.37. Let T and S be strongly continuous contraction semigroups on L with generators GT
and GS . If these two generators are equal, then T = S.
Proof. Use the lemma above with A = GT = GS and
        u(t) = (T (t) − S(t))f,   f ∈ D(A),
and note that ||u(t)|| ≤ ||u(0)|| = 0.
Definition 4.38. Call a linear operator A on L closable if it has a closed linear extension. The closure Ā
of A is the minimal closed linear extension of A.
Often the generator G is an unbounded operator. In these instances, establishing D(G) can be a tedious
process. Thus, we look for an alternative form for the Hille-Yosida theorem.

Lemma 4.39. Let G be a dissipative linear operator on L with D(G) dense in L. Then G is closable and
R(λI − G) = R(λI − Ḡ) for every λ > 0.
Proof. For the first assertion, we show that

{fn ; n ≥ 0} ⊂ D(G), fn → 0, Gfn → g implies g = 0.

Choose {gm ; m ≥ 0} ⊂ D(G) such that gm → g. Then by the dissipativity of G,

||(λI − G)gm − λg|| = lim ||(λI − G)(gm + λfn )|| ≥ lim λ||gm + λfn || = λ||gm ||
n→∞ n→∞

for every λ > 0. Now divide by λ and let λ → ∞ to obtain, for each m, ||gm − g|| ≥ ||gm ||. Letting m → ∞
yields g = 0.
The inclusion R(λI − Ḡ) ⊂ closure of R(λI − G) is straightforward. The reverse inclusion follows from the fact
that a dissipative linear operator G̃ is closed if and only if R(λI − G̃) is closed.
Theorem 4.40. A linear operator G on L is closable and its closure Ḡ is the generator of a strongly
continuous contraction semigroup on L if and only if
1. D(G) is dense in L.
2. G is dissipative.
3. R(λ0 I − G) is dense in L for some λ0 > 0.
Proof. By the lemma above G satisfies 1-3 if and only if G is closable and Ḡ satisfies 1-3 in the Hille-Yosida
theorem.
Definition 4.41. Let A be a closed linear operator on L. A subspace D of D(A) is called a core for A if A
is the closure of the restriction of A to D.
Proposition 4.42. Let G be the generator of a strongly continuous contraction semigroup on L. Then a
subspace D of D(G) is a core for G if and only if D is dense in L and R(λ0 I − G) is dense in L for some
λ0 > 0.

4.3.3 Positive Operators and Feller Semigroups


We now return to the consideration of Markov processes.

Definition 4.43. Let S be locally compact and let C0 (S) be the Banach space of continuous functions that
vanish at infinity with norm
||f || = sup{|f (x)| : x ∈ S}.
Definition 4.44. Call an operator on a function space positive if it maps nonnegative functions to nonneg-
ative functions.
To establish positivity, we will use the following proposition.

Proposition 4.45. Let T be a strongly continuous contraction semigroup on L with generator G and
resolvent R. For M ⊂ L, let
        ΛM = {λ > 0; λR(λ) : M → M }.
If M is closed and convex and ΛM is unbounded, then for each t ≥ 0,

T (t) : M → M.

Proof. For λ, µ > 0 satisfying |1 − µ/λ| < 1,

        µR(µ) = Σ_{n=0}^∞ (1 − µ/λ)^n (µ/λ) (λR(λ))^{n+1} .

Because M is closed and convex, λ ∈ ΛM implies (0, λ] ⊂ ΛM . Thus, ΛM = (0, ∞). Use the Yosida
approximation

        Tλ (t) = e^{−tλ} exp(tλ(λR(λ))) = e^{−tλ} Σ_{n=0}^∞ ((tλ)^n /n!) (λR(λ))^n

to conclude that
        Tλ (t) : M → M.
Now use the fact that M is closed and that Tλ (t)f converges to T (t)f as λ → ∞.
Definition 4.46. 1. Call a semigroup conservative if there exists a sequence fn ∈ C0 (S) that is bounded
in norm and converges pointwise to 1, and

        lim_{n→∞} T (t)fn = 1

for all t ≥ 0.
2. Call a conservative strongly continuous positive contraction semigroup T on C0 (S) a Feller semigroup.
Exercise 4.47. 1. If T is conservative, then A1 = 0

2. If A1 = 0, then T is conservative.

4.3.4 The Maximum Principle


Definition 4.48. A linear operator A on C0 (S) satisfies the positive maximum principle if whenever f ∈
D(A), x̃ ∈ S, then
f (x̃) = sup{f (x); x ∈ S} ≥ 0 implies Af (x̃) ≤ 0.
Remark 4.49. To see why this ought to be true for a Markov process, note that

        0 ≥ Ex̃ [f (Xs )] − f (x̃) = T (s)f (x̃) − f (x̃).

Now divide by s and let s → 0.


Proposition 4.50. Let S be locally compact. A linear operator A on C0 (S) satisfying the positive maximum
principle is dissipative.
Proof. Let f ∈ D(A) and λ > 0. There exists x̃ ∈ S so that |f (x̃)| = ||f ||. Suppose f (x̃) ≥ 0. (Otherwise
replace f with −f .) Then by the positive maximum principle, Af (x̃) ≤ 0 and hence

||λf − Af || ≥ λf (x̃) − Af (x̃) ≥ λf (x̃) = λ||f ||.

In this context, we have the following variant of the Hille-Yosida theorem.

Theorem 4.51. Let S be locally compact. A linear operator G on C0 (S) is closable and its closure Ḡ is the
generator of a positive strongly continuous contraction semigroup on C0 (S) if and only if
1. D(G) is dense in C0 (S).
2. G satisfies the positive maximum principle.
3. R(λ0 I − G) is dense in C0 (S) for some λ0 > 0.
Proof. The necessity of conditions 1 and 3 follows from the theorem above. To check the neccessity of 2, fix
f ∈ D(G) and x̃ ∈ S so that sup{f (x) : x ∈ S} = f (x̃) ≥ 0. Then, for each t ≥ 0,

T (t)f (x̃) ≤ T (t)f + (x̃) ≤ ||f + || = f (x̃)

and Gf (x̃) ≤ 0.
Conversely, suppose G satisfies conditions 1-3. Because condition 2 implies that G is dissipative, Ḡ
generates a strongly continuous contraction semigroup T . To prove that T is positive, we note, by the
proposition above, it suffices to prove that R(λ) maps nonnegative functions to nonnegative functions.
Because R(λ0 I − G) is dense for some λ0 > 0, and G is dissipative, R(λI − Ḡ) = C0 (S) for all λ > 0. Thus,
it is equivalent to show that for f ∈ D(G), and λ > 0,

(λI − Ḡ)f ≥ 0 implies f ≥ 0.

To establish the contrapositive of this statement, choose f ∈ D(Ḡ) so that inf{f (x) : x ∈ S} < 0. Then,
there exist {fn ; n ≥ 0} ⊂ D(G) so that

        lim_{n→∞} (λI − G)fn = (λI − Ḡ)f.

Because Ḡ is dissipative, we have that fn → f as n → ∞. Let x̃n and x̃ be points at which fn and f ,
respectively, take on their minimum values. Then

        inf{(λI − Ḡ)f (x); x ∈ S} ≤ lim inf_{n→∞} (λI − G)fn (x̃n ) ≤ lim inf_{n→∞} λfn (x̃n ) = λf (x̃) < 0.

Exercise 4.52. Let X1 and X2 be independent Markov processes on a common state space S. Is (X1 , X2 )
a Markov process? If so, determine its generator.

4.4 Strong Markov Property


Definition 4.53. Let X be a Ft -progressive, Ft Markov process with associated transition function P . Let
τ be an almost surely finite Ft -stopping time. Then X is strong Markov at τ if for all s ≥ 0 and B ∈ B(S),

        P {Xτ +s ∈ B|FτX } = p(s, Xτ , B),

or equivalently, for any s ≥ 0 and bounded measurable f ,

        E[f (Xτ +s )|FτX ] = ∫_S f (y)p(s, Xτ , dy).

X is called strong Markov with respect to {Ft ; t ≥ 0} if X is strong Markov at all almost surely finite
Ft -stopping times.
In reviewing the results on Lévy processes and their regularity properties, we obtain the following.

Proposition 4.54. A right continuous Lévy process is strong Markov.


The following proposition has essentially the same proof as in the discrete time case.

Proposition 4.55. Let X be a Ft -progressive, Ft Markov process with associated transition function P .
Then X is strong Markov at all discrete almost surely finite Ft -stopping times.
Now, using essentially the same proof as we had for Lévy processes, we have

Proposition 4.56. Let X be a right continuous Ft -progressive, Ft Markov process with associated transition
function P . Then X is strong Markov at all almost surely finite Ft+ -stopping times.
Theorem 4.57. Let X be a Markov process with time homogeneous transition function P . Let Y be a bounded
measurable process and let τ be an FtX -stopping time so that X is strong Markov at τ + t
for all t ≥ 0. Then
        Eα [Yτ ◦ θτ |Fτ ] = φτ (Xτ ) on {τ < ∞}
where
        φs (x) = Ex Ys .

Proof. By the standard machine, we need only consider functions of the form

        Ys = f0 (s) Π_{k=1}^n fk (Xtk ),   0 < t1 < · · · < tn ,

with f0 , f1 , . . . , fn continuous and bounded. In this case


        φs (x) = f0 (s) ∫ · · · ∫ fn (xn )p(tn − tn−1 , xn−1 , dxn )fn−1 (xn−1 )p(tn−1 − tn−2 , xn−2 , dxn−1 )
                       · · · f1 (x1 )p(t1 , x, dx1 ) = f0 (s)ψ(x).

The proof will proceed by induction on n. The case n = 0 states that

        Eα [f0 (τ )|Fτ ] = f0 (τ ) on {τ < ∞}.

This follows from the fact that τ is Fτ -measurable.


Let
        φ̃(x) = Ex [Π_{k=2}^n fk (Xtk −t1 )].

Then,
        φs (x) = f0 (s) ∫ φ̃(x1 )f1 (x1 )p(t1 , x, dx1 ).

On {τ < ∞},

Eα [Yτ ◦ θτ |Fτ ] = Eα [f0 (τ ) Π_{k=1}^n fk (Xτ +tk )|Fτ ]
                 = f0 (τ )Eα [Eα [Π_{k=2}^n fk (Xτ +tk )|Fτ +t1 ]f1 (Xτ +t1 )|Fτ ]
                 = f0 (τ )Eα [φ̃(Xτ +t1 )f1 (Xτ +t1 )|Fτ ]
                 = f0 (τ )Eα [(φ̃(Xt1 )f1 (Xt1 )) ◦ θτ |Fτ ]
                 = f0 (τ )ψ(Xτ ) = φτ (Xτ ).

Using essentially the same proof as in the case of Lévy processes, we have the following.

Theorem 4.58 (Blumenthal 0-1 law). Let X be a right continuous Ft -progressive, Ft Markov process. Then
for each x ∈ S, every event in F^X_{0+} has Px -probability 0 or 1.

Corollary 4.59. Let τ be an F^X_{t+} -stopping time and fix x ∈ S. Then either

        Px {τ = 0} = 1 or Px {τ > 0} = 1.

4.5 Connections to Martingales


Proposition 4.60. Let X be an S-valued progressive Markov process and let T , the semigroup associated to
X, have generator G. Then for f ∈ D(G),

        Mt^f = f (Xt ) − ∫_0^t Gf (Xs ) ds

is an FtX -martingale.
Proof. For each t, u ≥ 0,

E[M_{t+u}^f |FtX ] = T (u)f (Xt ) − ∫_0^t Gf (Xs ) ds − ∫_t^{t+u} T (s − t)Gf (Xt ) ds
                   = T (u)f (Xt ) − ∫_0^u T (s)Gf (Xt ) ds − ∫_0^t Gf (Xs ) ds
                   = f (Xt ) − ∫_0^t Gf (Xs ) ds = Mt^f .
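On a finite state space, the identity behind this proof, T (t)f − f = ∫_0^t T (s)Gf ds with T (t) = exp(tG), can be verified directly. The 3-state generator below is an illustrative choice, not from the text.

```python
import numpy as np

# Sketch: check T(t)f - f = ∫₀ᵗ T(s)Gf ds for an illustrative
# 3-state generator (off-diagonal entries ≥ 0, rows sum to 0).
G = np.array([[-2.0, 1.0, 1.0],
              [0.5, -0.5, 0.0],
              [1.0, 2.0, -3.0]])
f = np.array([1.0, -1.0, 2.0])

def expm(A, n=80):          # truncated Taylor series; fine for small ||A||
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, n):
        term = term @ A / k
        out = out + term
    return out

t, m = 1.0, 2000
s = np.linspace(0.0, t, m + 1)
integrand = np.array([expm(si * G) @ (G @ f) for si in s])
# trapezoid rule for the time integral
h = t / m
integral = h * (integrand[0] / 2 + integrand[1:-1].sum(axis=0) + integrand[-1] / 2)
lhs = expm(t * G) @ f - f
assert np.allclose(lhs, integral, atol=2e-5)
```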

Exercise 4.61. Let X be a right continuous Markov process with generator G and assume that

        f (Xt ) − ∫_0^t g(Xs ) ds

is a martingale. Show that f ∈ D(G) and Gf = g.


The optional sampling formula gives us the following.

Theorem 4.62 (Dynkin formula). In addition to the conditions above, let τ be a stopping time with Eα τ <
∞. Then
        Eα f (Xτ ) = Eα f (X0 ) + Eα [∫_0^τ Gf (Xs ) ds].

Analogous to the situation for discrete time Markov chains, we make the following definitions.

Definition 4.63. Let G be the generator for a time homogeneous Markov process X and let f ∈ D(G).
Then call
1. f harmonic if Gf = 0.
2. f superharmonic if Gf ≤ 0.

3. f subharmonic if Gf ≥ 0.
Remark 4.64. 1. If f is harmonic, then f (Xt ) is a martingale.
2. If f is superharmonic, then f (Xt ) is a supermartingale.

3. If f is subharmonic, then f (Xt ) is a submartingale.


Example 4.65. 1. Let D0 , D1 be two disjoint subsets of the state space S. Suppose h is harmonic with respect
to a Markov process X having generator G and

        h(x) = 0 for x ∈ D0 ,   h(x) = 1 for x ∈ D1 .

Assume that the first entrance times τDj , j = 0, 1, are stopping times and that τ is their minimum. Then if
τ has finite mean,
        h(x) = Ex h(Xτ ) = Px {τD0 > τD1 }.

2. If we can solve the equation Gf = −1, then

        Mt^f = f (Xt ) + t

is a martingale. Consider a domain D so that τD is a stopping time. If Gf = −1 on D, f = 0 off D, and τD
satisfies the sampling integrability conditions for M^f , then

        f (x) = Ex τD .

Lemma 4.66. Assume that g ≥ 0 and let R(λ) be the resolvent for a positive continuous contraction
semigroup T associated to a Markov process X. Then

        e^{−λt} R(λ)g(Xt )

is a non-negative F^X_{t+} -supermartingale.
Proof. For t > s, use the fact that the semigroup and the resolvent commute to obtain that

E[e^{−λt} R(λ)g(Xt )|F^X_{s+}] = e^{−λt} T (t − s)R(λ)g(Xs )
                              = e^{−λt} ∫_0^∞ e^{−λu} T (t − s + u)g(Xs ) du
                              = e^{−λs} ∫_{t−s}^∞ e^{−λu} T (u)g(Xs ) du
                              ≤ e^{−λs} R(λ)g(Xs ).

Let us continue this line of analysis. Consider the bounded random variable

        Y = ∫_0^∞ e^{−λs} g(Xs ) ds.

Then
        Y = ∫_0^t e^{−λs} g(Xs ) ds + e^{−λt} Y ◦ θt

and
        Ex Y = R(λ)g(x).

We have Doob's martingale

Z_t^{λ,g} = E[Y |FtX ] = ∫_0^t e^{−λs} g(Xs ) ds + e^{−λt} E[Y ◦ θt |FtX ]
          = ∫_0^t e^{−λs} g(Xs ) ds + e^{−λt} R(λ)g(Xt ).

Now, because Doob's martingale is uniformly integrable, any stopping time τ satisfies the sampling
integrability conditions for Z^{λ,g} and therefore,

        R(λ)g(x) = Ex [∫_0^τ e^{−λs} g(Xs ) ds] + Ex [e^{−λτ} R(λ)g(Xτ )].

Now, let f = R(λ)g, so that g = (λI − G)f . Then Doob's martingale above becomes

        C_t^{λ,f} = e^{−λt} f (Xt ) + ∫_0^t e^{−λs} (λI − G)f (Xs ) ds

and the analog to Dynkin's formula becomes

        f (x) = Ex [e^{−λτ} f (Xτ )] + Ex [∫_0^τ e^{−λs} (λI − G)f (Xs ) ds].

Let fλ satisfy the eigenvalue problem Gfλ = λfλ with fλ = 1 on D, then

fλ (x) = Ex e−λτD .

Theorem 4.67. Let X be a measurable Ft -adapted process and let f and g be bounded, measurable functions
with inf x f (x) > 0. Assume that

        Yt = f (Xt ) − ∫_0^t g(Xs ) ds

is an Ft martingale. Then

        f (Xt ) exp(−∫_0^t (g(Xv )/f (Xv )) dv)

is a martingale.
Proof. Let

        Vt = exp(−∫_0^t (g(Xv )/f (Xv )) dv).

Then, we have the local martingale

Yt Vt − ∫_0^t Ys dVs = (f (Xt ) − ∫_0^t g(Xs ) ds) exp(−∫_0^t (g(Xv )/f (Xv )) dv)
        + ∫_0^t (f (Xs ) − ∫_0^s g(Xu ) du) (g(Xs )/f (Xs )) exp(−∫_0^s (g(Xv )/f (Xv )) dv) ds

   = f (Xt ) exp(−∫_0^t (g(Xv )/f (Xv )) dv) − (∫_0^t g(Xs ) ds) exp(−∫_0^t (g(Xv )/f (Xv )) dv)
        + ∫_0^t g(Xs ) exp(−∫_0^s (g(Xv )/f (Xv )) dv) ds
        − ∫_0^t ∫_0^s g(Xu ) (g(Xs )/f (Xs )) exp(−∫_0^s (g(Xv )/f (Xv )) dv) du ds.

For the double integral, reverse the order of integration; integrating in the s variable then gives

− ∫_0^t g(Xu ) [− exp(−∫_0^s (g(Xv )/f (Xv )) dv)]_{s=u}^{s=t} du
   = − ∫_0^t g(Xu ) exp(−∫_0^u (g(Xv )/f (Xv )) dv) du + ∫_0^t g(Xu ) exp(−∫_0^t (g(Xv )/f (Xv )) dv) du,

and the last three terms sum to zero.


Now use the fact that f and g are bounded and f is bounded away from 0 to see that the local martingale
above is indeed a martingale.
Exercise 4.68. Let X be a measurable Ft -adapted process and let f and g be bounded, measurable functions.
1. Assume that inf x f (x) > 0 and that

        f (Xt ) exp(−∫_0^t (g(Xv )/f (Xv )) dv)

is an Ft martingale. Show that

        f (Xt ) − ∫_0^t g(Xs ) ds

is an Ft martingale.
2. Assume that

        f (Xt ) − ∫_0^t g(Xs ) ds

is an Ft martingale. Show that

        e^{−λt} f (Xt ) + ∫_0^t e^{−λs} (λf (Xs ) − g(Xs )) ds

is an Ft martingale.

4.6 Jump Processes


Definition 4.69. Call a time homogeneous Markov process X a pure jump process if, starting from any
point x ∈ S, the process is right continuous and has all its sample paths constant except for isolated jumps.
In the case of a countable state space, the stochastic nature of a pure jump process is captured by the
infinitesimal transition rates

        Px {Xh = y} = g(x, y)h + o(h),   y ≠ x.

The infinitesimal generator G is the rate of change of averages for a function f : S → R of the process,

        Gf (x) = lim_{h→0} (Ex f (Xh ) − f (x))/h,   f ∈ D(G).
To relate these two concepts, write

        Ex f (Xh ) = Σ_{y≠x} f (y)g(x, y)h + f (x)(1 − Σ_{y≠x} g(x, y)h) + o(h)

        Ex f (Xh ) − f (x) = Σ_{y≠x} g(x, y)(f (y) − f (x))h + o(h)

        Gf (x) = Σ_{y≠x} g(x, y)(f (y) − f (x)).

Thus, G can be written as an infinitesimal transition matrix. The xy-entry, x ≠ y, is g(x, y). The diagonal
entry is

        g(x, x) = − Σ_{y≠x} g(x, y).

Thus, the off-diagonal entries of the matrix G are non-negative and each row sums to 0.

4.6.1 The Structure Theorem for Pure Jump Markov Processes


For processes that move from one state to another by jumping, the exponential distribution plays an
important role.

Proposition 4.70. Let X be a pure jump process, then

τ1 = inf{t ≥ 0; Xt 6= X0 }

is an Ft+ -stopping time.


Proof. τ1 = inf{σn : n ≥ 1} where

        σn = inf{k2^{−n} ; X_{k2^{−n}} ≠ X0 }.

Now
        {σn ≤ t} ∈ FtX ⊂ Ft .
The σn are Ft -stopping times and, consequently, τ1 is an Ft+ -stopping time.

Theorem 4.71 (structure theorem for pure jump Markov processes). Under Px , τ1 and Xτ1 are independent
and there is a B(S) measurable function λ on S such that

Px {τ1 > t} = exp(−λ(x)t).

Proof. Set

ex (t + s) = Px {τ1 > t + s} = Px {τ1 > t + s, Xt = x, τ1 > t} = Px {τ1 > t + s|Xt = x, τ1 > t}Px {Xt = x, τ1 > t}.

Note that
        Px {τ1 > t + s|Xt = x, τ1 > t} = Px {τ1 > t + s|Xt = x} = Px {τ1 > s} = ex (s)

and that Px {Xt = x, τ1 > t} = Px {τ1 > t} = ex (t). Thus,

        ex (t + s) = ex (s)ex (t),

and so, for some λ(x) ∈ [0, ∞], ex (t) = exp(−λ(x)t). The function λ is measurable because Px {τ1 > t} is measurable.
Let τ1 (t) be the first exit from state x after time t. Then, for B ∈ B(S) with x ∉ B,

Px {Xτ1 ∈ B, τ1 > t} = Px {Xτ1 (t) ∈ B, τ1 > t, Xt = x} = Px {Xτ1 (t) ∈ B|τ1 > t, Xt = x}Px {τ1 > t, Xt = x}.

For the first term

Px {Xτ1 (t) ∈ B|τ1 > t, Xt = x} = Px {Xτ1 (t) ∈ B|Xt = x} = Px {Xτ1 ∈ B}.

For the second term,


Px {τ1 > t, Xt = x} = Px {τ1 > t}.
Thus, the time of the first jump and the place of the first jump are independent.
Let µ : S × B(S) → [0, 1] be the transition function defined by µ(x, B) = Px {Xτ1 ∈ B}. Then

Ex f (Xh ) = Ex [f (Xh )|τ1 > h]Px {τ1 > h} + Ex [f (Xh )|τ1 ≤ h]Px {τ1 ≤ h}
          = f (x)e^{−λ(x)h} + ∫ f (y)µ(x, dy)(1 − e^{−λ(x)h} ) + o(h).

Thus,
        Ex f (Xh ) − f (x) = (1 − e^{−λ(x)h} )(∫ f (y)µ(x, dy) − f (x)) + o(h)

and
        Gf (x) = λ(x) ∫ (f (y) − f (x))µ(x, dy).

In the case of a countable set of states, set µ(x, {y}) = T (x, y). Then

        Gf (x) = λ(x) Σ_{y∈S} T (x, y)(f (y) − f (x)).

Equating the two expressions for the generator, we find that

g(x, x) = −λ(x)

and for y ≠ x,

        g(x, y) = λ(x)T (x, y)   or   T (x, y) = g(x, y)/λ(x).

For example, let S = {1, 2, 3, 4},

        ⎡ −5    2    3    0 ⎤        ⎡  0   .4   .6    0 ⎤        ⎡  5 ⎤
    G = ⎢  4  −10    3    3 ⎥ ,  T = ⎢ .4    0   .3   .3 ⎥ ,  λ = ⎢ 10 ⎥ .
        ⎢  2    2   −5    1 ⎥        ⎢ .4   .4    0   .2 ⎥        ⎢  5 ⎥
        ⎣  5    0    0   −5 ⎦        ⎣  1    0    0    0 ⎦        ⎣  5 ⎦
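The relations λ(x) = −g(x, x) and T (x, y) = g(x, y)/λ(x) can be checked against this example numerically; the code below is a small sketch.

```python
import numpy as np

# Recover the holding rates λ(x) = -g(x,x) and the jump chain
# T(x,y) = g(x,y)/λ(x) from the 4-state generator of the example.
G = np.array([[-5.0, 2.0, 3.0, 0.0],
              [4.0, -10.0, 3.0, 3.0],
              [2.0, 2.0, -5.0, 1.0],
              [5.0, 0.0, 0.0, -5.0]])
lam = -np.diag(G)
T = G / lam[:, None]
np.fill_diagonal(T, 0.0)

assert np.allclose(lam, [5, 10, 5, 5])
assert np.allclose(T.sum(axis=1), 1.0)   # T is a stochastic matrix
assert np.allclose(G.sum(axis=1), 0.0)   # rows of G sum to 0
```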
Exercise 4.72. 1. Let T1 , . . . , Tn be independent, exponential random variables, having respectively, pa-
rameters λ1 , . . . , λn , then
min{T1 , . . . , Tn }
is an exponential random variable with parameter λ1 + · · · + λn .
2. P {T1 < min{T2 , . . . , Tn }} = λ1 /(λ1 + · · · + λn ).
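A quick Monte Carlo sanity check of both parts of the exercise; the rates below are illustrative.

```python
import numpy as np

# min{T1,...,Tn} is exponential with parameter λ1+···+λn, and
# P{T1 < min{T2,...,Tn}} = λ1/(λ1+···+λn).  Illustrative rates 1, 2, 3.
rng = np.random.default_rng(0)
rates = np.array([1.0, 2.0, 3.0])
n = 200_000
samples = rng.exponential(1.0 / rates, size=(n, 3))
m = samples.min(axis=1)

assert abs(m.mean() - 1.0 / rates.sum()) < 0.005          # mean 1/6
assert abs((samples.argmin(axis=1) == 0).mean() - 1 / 6) < 0.005
```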
Exercise 4.73. Consider a stochastic process X with a finite state space S defined as follows.
1. For every x, y ∈ S, x ≠ y, let N^{(x,y)} denote a Poisson process with parameter g(x, y). Assume that
these processes are independent.
2. If Xt = x, then X jumps to the state ỹ whose process N^{(x,ỹ)} is the first to jump after time t.
3. This first jump after time t takes place at the first jump time after t among the processes N^{(x,ỹ)} , ỹ ≠ x.
Show that X is a Markov process whose generator G can be represented by a matrix with xy-entry g(x, y).

4.6.2 Construction in the Case of Bounded Jump Rates


Let λ : S → [0, λmax ] be measurable. Define
        Gf (x) = λ(x) ∫_S (f (y) − f (x)) µ(x, dy).

Exercise 4.74. Show that G satisfies the positive maximum principle.


We can construct the Markov process on S with generator G as follows:
Let {Yk ; k ≥ 0} be a Markov chain on S with initial distribution α and transition function µ. In addition,
independent of the chain Y , let {σk ; k ≥ 0} be independent exponentially distributed random variables with
parameter 1. Set

        τ0 = 0,   τk = Σ_{j=0}^{k−1} σj /λ(Yj ),

and
        Xt = Yk ,   τk ≤ t < τk+1 .

Note that we can allow λ(x) = 0; in this case, once X arrives at x it remains there. Thus, if the Markov
process arrives at the site x, it remains there an exponential length of time, parameter λ(x), and then jumps
to a new site according to the transition function µ.
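The construction above translates directly into a simulation: hold at x for an exponential time with mean 1/λ(x), then jump according to µ(x, ·). A sketch, using the illustrative four-state rates and jump kernel of the earlier example:

```python
import numpy as np

# Simulate the pure jump process: exponential holding time σ_k/λ(Y_k)
# at the current state, then a jump drawn from µ(x,·).
rng = np.random.default_rng(1)
lam = np.array([5.0, 10.0, 5.0, 5.0])          # holding rates λ(x)
mu = np.array([[0.0, 0.4, 0.6, 0.0],           # jump kernel µ(x,·)
               [0.4, 0.0, 0.3, 0.3],
               [0.4, 0.4, 0.0, 0.2],
               [1.0, 0.0, 0.0, 0.0]])

def sample_path(x0, t_end):
    """Return jump times and visited states of X on [0, t_end]."""
    times, states = [0.0], [x0]
    t, x = 0.0, x0
    while True:
        t += rng.exponential(1.0 / lam[x])     # holding time at x
        if t >= t_end:
            return times, states
        x = rng.choice(4, p=mu[x])             # jump according to µ(x,·)
        times.append(t)
        states.append(x)

times, states = sample_path(0, 10.0)
assert all(s in range(4) for s in states)
assert all(t2 > t1 for t1, t2 in zip(times, times[1:]))
```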
Let us consider the alternative representation for G. Define

        µ̃(x, B) = (1 − λ(x)/λmax ) δx (B) + (λ(x)/λmax ) µ(x, B).

Exercise 4.75. If G̃f (x) = λmax ∫_S (f (y) − f (x)) µ̃(x, dy), then G = G̃.
Write
        T f (x) = ∫_S f (y) µ̃(x, dy)

for the transition operator for the Markov chain Ỹ . Then

G = λmax (T − I).

The semigroup generated by G is

        T (t) = exp(tG) = e^{−λmax t} Σ_{k=0}^∞ ((tλmax )^k /k!) T^k .
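This series is the uniformization identity, and for a finite state space it can be checked against a direct computation of exp(tG); the generator below is illustrative.

```python
import numpy as np
from math import factorial

# Check exp(tG) = e^{-λ_max t} Σ_k (λ_max t)^k / k! · T̃^k, where
# T̃ = I + G/λ_max, for an illustrative 3-state generator.
G = np.array([[-2.0, 1.0, 1.0],
              [3.0, -4.0, 1.0],
              [0.0, 2.0, -2.0]])
lmax = 4.0
Ttilde = np.eye(3) + G / lmax            # stochastic matrix

t = 0.7
series = sum(np.exp(-lmax * t) * (lmax * t) ** k / factorial(k)
             * np.linalg.matrix_power(Ttilde, k) for k in range(60))

def expm(A, n=60):                       # truncated Taylor series
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, n):
        term = term @ A / k
        out = out + term
    return out

assert np.allclose(series, expm(t * G), atol=1e-10)
```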

Let {Ỹk ; k ≥ 0} be a Markov chain on S with initial distribution α and transition function µ̃ and let N
be an independent Poisson process with parameter λmax , and define

X̃t = ỸNt .

Proposition 4.76. Let Ft = σ{(X̃s , Ns ); 0 ≤ s ≤ t}, then X̃ is an Ft -Markov process.


Proof. We begin with a claim.
Claim. E[f (Ỹk+Nt )|Ft ] = T^k f (X̃t ).
Fix A ∈ F^N_t and B ∈ F^Ỹ_m . Then

E[f (Ỹk+Nt ); A ∩ B ∩ {Nt = m}] = E[f (Ỹk+m ); A ∩ B ∩ {Nt = m}] = P (A ∩ {Nt = m})E[f (Ỹk+m ); B]
= P (A ∩ {Nt = m})E[T k f (Ỹm ); B] = E[T k f (X̃t ); A ∩ B ∩ {Nt = m}]

However, sets of the form A ∩ B ∩ {Nt = m} are closed under finite intersection, and the claim follows from
the Sierpinski class theorem.

Now, use the independence of the increments for a Poisson process to obtain that

E[f (X̃t+s )|Ft ] = E[f (ỸNt+s )|Ft ] = E[f (ỸNt+s −Nt +Nt )|Ft ]
                = e^{−λmax s} Σ_{k=0}^∞ ((λmax s)^k /k!) E[f (Ỹk+Nt )|Ft ]
                = e^{−λmax s} Σ_{k=0}^∞ ((λmax s)^k /k!) T^k f (X̃t ) = T (s)f (X̃t ).

Because G = G̃, X is a Markov process whose finite dimensional distributions agree with those of X̃.
Thus, all of the examples of generators with bounded rates are generators for a semigroup T for a Markov
process on C0 (S).
Exercise 4.77. For a two state Markov chain on S = {0, 1} with generator

        A = ⎡ −λ    λ ⎤
            ⎣  µ   −µ ⎦ ,

show that
        P (t, 0, {0}) = µ/(λ + µ) + (λ/(λ + µ)) e^{−(λ+µ)t} .

4.6.3 Birth and Death Process


Example 4.78. 1. For a compound Poisson process,

        Xt = Σ_{n=1}^{Nt} Yn ,

λ(x) is some constant λ, the rate for the Poisson process. If ν is the distribution of the Yn , then

µ(x, B) = ν(B − x).

2. A process is called a birth and death process if S = N and

        g(x, y) = λx if y = x + 1,   g(x, y) = −(λx + µx ) if y = x,   g(x, y) = µx if y = x − 1.

3. A pure birth process has µx = 0 for all x.


4. A pure death process has λx = 0 for all x.
5. A simple birth and death process has µx = µ and λx = λ for all x.
6. A linear birth and death process has µx = xµ and λx = xλ for all x.
7. A linear birth and death process with immigration has µx = xµ and λx = xλ + a for all x.
8. M/M/1 queue. S = {0, 1, · · · , N }, λx = λ for x < N , λN = 0, µx = µ. So customers arrive at an
exponential rate, parameter λ, until the queue reaches length N at which point all arrivals are turned
away. When they reach the front of the queue, they are served immediately. The service time is
exponential, parameter µ.

9. M/M/∞ queue. S = N, λx = λ, µx = xµ. So customers arrive at an exponential rate,
parameter λ. All arrivals are served immediately. The service time is exponential, parameter µ.
10. A logistic process has S = {0, 1, · · · , N }, λx = rx(N − x), µx = xµ. This can be used to model an epidemic.
The susceptibility of the N − x non-infected individuals is proportional to the number x of infected
individuals. An infected individual stays in this state for an exponential length of time, parameter µ.
11. An epidemic model with immunity can be constructed by having S = {0, 1, . . . , N }2 . The state (x, y)
gives the number of infected and the number of immune. The generator

Gf (x, y) = rx(N − x − y)(f (x + 1, y) − f (x, y)) + µx(f (x − 1, y + 1) − f (x, y)) + γy(f (x, y − 1) − f (x, y)).

12. For the Moran model with mutation, S = {0, 1, · · · , N }. Consider a genetic model with two alleles, A1
and A2 , with Xt the number of alleles of type A1 at time t. The process remains at a site for an exponential
length of time, parameter λ; then an individual is chosen at random to be replaced, and a second individual
is chosen at random to duplicate. A mutation may also occur at birth. The mutation A1 → A2 has
probability κ1 and the mutation A2 → A1 has probability κ2 . The generator is

Gf (x) = λ(1 − x/N )((x/N )(1 − κ1 ) + (1 − x/N )κ2 )(f (x + 1) − f (x))
         + λ(x/N )((1 − x/N )(1 − κ2 ) + (x/N )κ1 )(f (x − 1) − f (x)).

13. For a random walk on the lattice S = Z^d , Xt is the position of the particle at time t. The particle at a
site x remains at x for an exponential length of time, parameter λ(x), and then moves according to the
kernel p(x, y). Thus,

        Σ_y p(x, y) = 1

and the generator is

        Gf (x) = λ(x) Σ_y p(x, y)(f (y) − f (x)).

Exercise 4.79. For the compound Poisson process, if {Yn ; n ≥ 1} is a Bernoulli sequence with success
probability p, then X is a Poisson process with parameter λp.
To find harmonic functions h̃ for birth and death processes, note that

        λx (h̃(x + 1) − h̃(x)) + µx (h̃(x − 1) − h̃(x)) = 0,

so
        h̃(x + 1) − h̃(x) = (µx /λx )(h̃(x) − h̃(x − 1)),   λx > 0.

Iterating this in x, we obtain

        h̃(x + 1) − h̃(x) = Π_{x̃=1}^x (µx̃ /λx̃ ) (h̃(1) − h̃(0)).

Assume that h̃(0) = 0 and h̃(1) = 1 and all of the λx > 0. Then any other non-constant harmonic
function is a linear function of h̃, and

        h̃(y) = Σ_{x=0}^{y−1} (h̃(x + 1) − h̃(x)) = Σ_{x=0}^{y−1} Π_{x̃=1}^x (µx̃ /λx̃ ),

with the convention that the empty product (x = 0) equals 1.

Thus, for a simple birth and death process and for a linear birth and death process,

        h̃(y) = Σ_{x=0}^{y−1} (µ/λ)^x = (1 − (µ/λ)^y )/(1 − (µ/λ)).

Let D0 = {0} and D1 = {N }. Then

        h(x) = h̃(x)/h̃(N )

is a harmonic function satisfying
        h(0) = 0,   h(N ) = 1.
Thus, for x ∈ {0, . . . , N },
        h(x) = Px {τ0 > τN }.
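For a simple birth and death process this hitting probability can be cross-checked by solving the linear system for the embedded random walk (which steps up with probability p = λ/(λ + µ)); the rates below are illustrative.

```python
import numpy as np

# Solve h(x) = p·h(x+1) + (1-p)·h(x-1), h(0)=0, h(N)=1, and compare
# with h(x) = (1 - (µ/λ)^x)/(1 - (µ/λ)^N).
lam, mu, N = 2.0, 1.0, 10
p = lam / (lam + mu)

A = np.zeros((N + 1, N + 1))
b = np.zeros(N + 1)
A[0, 0] = A[N, N] = 1.0
b[N] = 1.0
for x in range(1, N):
    A[x, x] = 1.0
    A[x, x + 1] = -p
    A[x, x - 1] = -(1 - p)
h = np.linalg.solve(A, b)

r = mu / lam
formula = (1 - r ** np.arange(N + 1)) / (1 - r ** N)
assert np.allclose(h, formula, atol=1e-12)
```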

Exercise 4.80. For a birth and death process, let τ = min{τ0 , τN }, for x ∈ {0, . . . , N }.
1. Find Ex τ .
2. Find Ex e−λτ , λ > 0.
3. Consider what happens as N → ∞.
Exercise 4.81. 1. For a simple birth and death process, X, find Ex Xt and Varx (Xt ).
2. For a linear birth and death process, X, find Ex Xt and Varx (Xt ).

4.6.4 Examples of Interacting Particle Systems


An interacting particle system is a Markov process {ηt ; t ≥ 0} whose state space S is the set of configurations
on a regular lattice Λ, i.e., the state is a mapping

        η:Λ→F

where F is a finite set. Examples of lattices are Z^d , (Z/M Z)^d , the hexagonal lattice, and regular trees. Examples
of F are

        {0, 1} = {vacant, occupied}
        {−1, 1} = {spin up, spin down}
        {0, 1, · · · , k} = {vacant, species 1, species 2, · · · , species k}
        {∆, 0, · · · , N } = {vacant, 0 individuals having allele A, · · · , N individuals having allele A}

Example 4.82 (interacting particle systems). 1. Exclusion Process (Spitzer, 1970). F = {0, 1}, Λ =
Zd .
If initially we have one particle at x0 , i.e.,

        η0 (x) = 1 if x = x0 ,   η0 (x) = 0 if x ≠ x0 ,

set Xt = ηt^{−1}({1}). Then {Xt ; t ≥ 0} is the random walk on the lattice.
When there are many particles, then each executes an independent random walk on the lattice excluding
transitions that take a particle to an occupied site. The generator

        GF (η) = Σ_{x,y} λ(η, x)p(x, y)η(x)(1 − η(y))(F (η^{xy} ) − F (η))

where
        η^{xy}(z) = η(x) if z = y,   η^{xy}(z) = η(y) if z = x,   η^{xy}(z) = η(z) otherwise.

Let's read the generator:

• A particle at x attempts a move in time ∆t with probability λ(η, x)∆t + o(∆t).
• The particle at x moves to y with probability p(x, y).
• If site y is occupied, nothing happens.
2. Coalescing Random Walk. F = {0, 1}, Λ = Z^d .
Each particle executes an independent random walk on the lattice until the walk takes a particle to an
occupied site; then this particle disappears. The generator

        GF (η) = Σ_{x,y} λ(η, x)p(x, y)η(x)((1 − η(y))(F (η^{xy} ) − F (η)) + η(y)(F (ηx ) − F (η)))

where
        ηx (z) = 1 − η(x) if z = x,   ηx (z) = η(z) otherwise.

3. Voter Model (Clifford and Sudbury, 1973; Holley and Liggett, 1975). F = {0, 1} = {no, yes}, Λ = Z^d .
An individual changes opinion at a rate proportional to the number of neighbors that disagree with the
individual's present opinion. The generator

        GF (η) = Σ_x λ(η, x)(F (ηx ) − F (η))

where
        λ(η, x) = (λ/2d) Σ_{y:|y−x|=1} I{η(y)≠η(x)} .
{y;|y−x|=1}

4. Contact process (Harris, 1974). F = {0, 1} = {uninfected, infected}, Λ = Z^d .
The generator

        GF (η) = Σ_x λ(η, x)(F (ηx ) − F (η))

where
        λ(η, x) = λ Σ_{y:|y−x|=1} η(y) if η(x) = 0,   λ(η, x) = 1 if η(x) = 1.

Thus, λ is the relative infection rate.
5. For the stochastic Ising model, F = {−1, 1}, Λ = Z^d . The generator

        GF (η) = Σ_x exp(−β Σ_{y:|x−y|=1} η(x)η(y)) (F (η̄x ) − F (η))

where
        η̄x (z) = −η(x) if z = x,   η̄x (z) = η(z) otherwise.
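As an illustration of these dynamics, here is a minimal Gillespie-style simulation of the voter model of Example 3 on the one-dimensional cycle Z/M Z; the lattice size, rate, and time horizon are arbitrary choices.

```python
import numpy as np

# Voter model on the cycle Z/MZ (d = 1, nearest neighbours): site x
# flips at rate (λ/2)·#{neighbours disagreeing with x}.
rng = np.random.default_rng(2)
M, lam = 20, 1.0
eta = rng.integers(0, 2, size=M)      # initial opinions

def flip_rates(eta):
    left, right = np.roll(eta, 1), np.roll(eta, -1)
    return (lam / 2) * ((eta != left).astype(float) + (eta != right))

t, t_end = 0.0, 50.0
while t < t_end:
    rates = flip_rates(eta)
    total = rates.sum()
    if total == 0:                    # consensus reached; absorbing state
        break
    t += rng.exponential(1.0 / total)
    x = rng.choice(M, p=rates / total)
    eta[x] = 1 - eta[x]

# Opinions stay in {0, 1}; on a finite lattice the model is eventually
# absorbed in consensus.
assert set(np.unique(eta)) <= {0, 1}
```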

4.7 Sample Path Regularity


The goal of this section is to show that for S separable, every Feller semigroup on C0 (S) corresponds to a
Markov process with sample paths in DS [0, ∞). In addition, we show that every generator A of a positive
strongly continuous contraction semigroup on C0 (S) corresponds to a Markov process.
We have seen that a jump Markov process with bounded jump rates λ : S → [0, ∞) corresponds to a
Feller process X on S. Moreover, this process has a DS[0, ∞) version.
If the rates are bounded only on compact sets, then, because S is locally compact and separable, we can write

S = ∪_{n=1}^∞ K_n

for an increasing sequence of compact sets {K_n; n ≥ 1} and define

σ_n = inf{t > 0; X_t ∉ K_n}.

Then, {σn ; n ≥ 1} is an increasing sequence of stopping times. Define

ζ = lim σn .
n→∞

If this limit is finite, then we say that an explosion occurs. In these circumstances, we have that the transition
function fails to satisfy P (t, x, S) = 1 for all t > 0, and the semigroup T is no longer conservative.

4.7.1 Compactifying the State Space


To include this situation in the discussion of Markov processes, the state space S has adjoined to it a special
state ∆ and we write S ∆ = S ∪ {∆} and let

Xt = ∆ for t ≥ ζ.

If S is not compact, then S ∆ is the one point compactification of S.


This suggests the following modification.

Theorem 4.83. Let S be locally compact and separable and let T be a strongly continuous positive contraction
semigroup on C0 (S). For each t ≥ 0, define the operator T ∆ (t) on C(S ∆ ) by

T ∆ (t)f = f (∆) + T (t)(f − f (∆)).

Then T ∆ is a Feller semigroup on C(S ∆ ).


Proof. The verification that T ∆ is a strongly continuous conservative semigroup on C(S ∆ ) is straightforward.

Claim 1. T ∆ is positive. In other words, given c ∈ R+ and f ∈ C0 (S), c+f ≥ 0 implies that c+T (t)f ≥ 0.

Because T is positive,
T (t)(f + ) ≥ 0 and T (t)(f − ) ≥ 0.
Hence,
−T (t)f ≤ T (t)(f − ) and so (T (t)f )− ≤ T (t)(f − ).
Because T (t) is a contraction, we obtain

||T (t)(f − )|| ≤ ||f − || ≤ c.

Therefore,
T (t)(f − )(x) ≤ c and so c + T (t)f ≥ 0.

Claim 2. T ∆ is a contraction.
Because T ∆ is positive,
|T ∆ (t)f (x)| ≤ T ∆ (t)||f || = ||f ||
and ||T ∆ (t)|| ≤ 1.
With this modification, the condition of being a Feller semigroup states that the distribution of the
Markov process at time t depends continuously on the initial state. In other words, for f ∈ C(S∆),

lim_{x→x0} T(t)f(x) = T(t)f(x0),

lim_{x→x0} E_x f(X_t) = E_{x0} f(X_t),

lim_{x→x0} ∫_{S∆} f(y) P(t, x, dy) = ∫_{S∆} f(y) P(t, x0, dy).

Thus P(t, x, ·) converges in distribution to P(t, x0, ·) as x → x0.
To check for explosions, consider a pure birth process. For τx = inf{t ≥ 0 : Xt = x}, we have, if X0 = 0,

τx = σ 1 + · · · + σ x

where σx̃ is the length of time between the arrival of X to x̃ − 1 and x̃.
Then the random variables {σ_x̃; x̃ ≥ 1} are independent exponential random variables with parameters
{λ_{x̃−1}; x̃ ≥ 1}. Thus, the Laplace transform

E[e^{−uσ_x̃}] = λ_{x̃−1}/(u + λ_{x̃−1}),

and

E[e^{−uτ_x}] = ∏_{x̃=1}^{x} λ_{x̃−1}/(λ_{x̃−1} + u) = ( ∏_{x̃=1}^{x} (1 + u/λ_{x̃−1}) )^{−1}.

For u > 0, the infinite product

∏_{x̃=1}^{∞} (1 + u/λ_{x̃−1})

converges if and only if

Σ_{x̃=1}^{∞} 1/λ_{x̃−1} < ∞.

Thus, if the sum is infinite, then we have no explosion.


In particular, a linear birth process has no explosion. However, if, for example,

lim inf_{x→∞} λ_x / x^p > 0

for some p > 1, then the birth process has an almost surely finite explosion time.
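The explosion criterion can be watched numerically through the Laplace transform computed above: E[e^{−uτ_x}] = ∏ λ_{x̃−1}/(λ_{x̃−1} + u) tends to 0 as x → ∞ exactly when τ_x → ∞, and stays bounded away from 0 when the explosion time is almost surely finite. A small check (the rate choices are ours):

```python
import math

def laplace_tau(u, rates):
    """E[exp(-u * tau_x)] = prod lambda/(lambda + u) for a pure birth process."""
    out = 1.0
    for lam in rates:
        out *= lam / (lam + u)
    return out

u = 1.0
# linear rates lambda_x = x + 1: sum 1/lambda diverges, no explosion,
# so the transform tends to 0 (here it equals 1/(n+1) exactly)
linear = [laplace_tau(u, [k + 1 for k in range(n)]) for n in (10, 100, 1000)]
# quadratic rates lambda_x = (x+1)^2: sum 1/lambda < infinity, explosion,
# so the transform stays bounded away from 0
quadratic = [laplace_tau(u, [(k + 1) ** 2 for k in range(n)]) for n in (10, 100, 1000)]
```

For the quadratic rates the partial products decrease to a positive limit (π/sinh(π) ≈ 0.272), which is P-side evidence that τ_∞ is finite with positive Laplace transform.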
For a birth and death process X, we consider the pure birth process X̃ with the same birth rates as X.
Couple the two processes together in the sense that the exponential time for a jump forward is the same for
the first time each process reaches any site x. In this way,

X_t ≤ X̃_t

for all t ≥ 0. Thus, if X̃ has no explosion, neither does X.


We can use the previous section to show that the Yosida approximation generates a Markov process. Write
Aλ = λ(λR(λ) − I). We know that λR(λ) is a positive linear operator on C0(S). By the Riesz representation
theorem, there exists a positive Borel measure µλ(x, ·) such that

λR(λ)f(x) = ∫_S f(y) µλ(x, dy).

Thus, for f ∈ D(G),

λf(x) = ∫_S (λI − G)f(y) µλ(x, dy).

Because ||λR(λ)|| ≤ 1, we have that µλ (x, S) ≤ 1. However, if G is conservative, we can choose f (x) = 1/λ
to obtain
1 = µλ (x, S).

Note that if G generates a Markov process, then

λR(λ)f(x) = ∫_0^∞ λe^{−λt} E_x f(X_t) dt = E_x f(X_{τλ}),

where τλ is an exponential random variable with parameter λ, independent of X. Thus, µλ(x, ·) is the
distribution of X_{τλ} under P_x.

Exercise 4.84. Let D be a dense subset of C0(S). Then a function x ∈ DS[0, ∞) if and only if f ◦ x ∈
DR[0, ∞) for every f ∈ D.
Theorem 4.85. Let X be an F^X_{t+}-Feller process. Then X has a DS[0, ∞)-valued modification.
Proof. Let X have generator G and let f ∈ D(G). Then the martingale

M^f_t = f(X_t) − ∫_0^t Gf(X_s) ds

is right continuous on a set Ω_f of probability 1. Thus

f(X_t) = M^f_t + ∫_0^t Gf(X_s) ds

is right continuous on Ω_f. Now take a countable dense set D ⊂ D(G). Then

∩_{f∈D} Ω_f ⊂ {X ∈ DS[0, ∞)}.

Theorem 4.86. Let S be locally compact and separable. Let T be a positive strongly continuous contraction
semigroup on C0 (S) and define T ∆ as above. Let X be a Markov process corresponding to T ∆ with sample
paths in DS ∆ [0, ∞) and let
ζ = inf{t ≥ 0 : Xt = ∆ or Xt− = ∆}.
Then on the set {ζ < ∞},
Xζ+s = ∆ for all s ≥ 0.
Proof. By the Urysohn lemma, there exists f ∈ C(S ∆ ) such that f > 0 on S and f (∆) = 0. Then the
resolvent R1∆ f > 0 on S and R1∆ f (∆) = 0. Note that e−t R1∆ f (Xt ) is a supermartingale. Because a right
continuous nonnegative supermartingale is zero after its first contact with 0, the theorem follows.

Proposition 4.87. If, in addition to the conditions above, the extension of the semigroup T to the bounded
measurable functions is conservative, then α(S) = P{X0 ∈ S} = 1 implies that Pα{X ∈ DS[0, ∞)} = 1.
Proof. Let G∆ be the generator of T∆; then G∆ I_S = 0. Thus

Eα I_S(X_t) = Eα I_S(X_0) + Eα[∫_0^t G∆ I_S(X_s) ds] = Eα I_S(X_0),

or

Pα{ζ > t} = Pα{X_t ∈ S} = Pα{X_0 ∈ S} = 1.

Theorem 4.88. Let S be locally compact and separable. Let T be a Feller semigroup on C0(S). Then for
each probability measure α on S, there exists a Markov process X with initial distribution α and sample paths
in DS[0, ∞). Moreover, X is strong Markov with respect to the filtration {F^X_{t+}; t ≥ 0}.
Proof. Let G be the generator for T and let Gλ, λ > 0, be its Yosida approximation. Then Tλ is the
semigroup for a jump Markov process X^λ. Moreover, for every t ≥ 0 and f ∈ C0(S),

lim_{λ→∞} Tλ(t)f = T(t)f.

Let T∆_λ and T∆ be the corresponding semigroups on C(S∆). Then, for each f ∈ C(S∆),

lim_{λ→∞} E[f(X^λ_t)] = lim_{λ→∞} E[T∆_λ(t)f(X^λ_0)] = lim_{λ→∞} ∫ T∆_λ(t)f(x) α(dx) = ∫ T∆(t)f(x) α(dx).

Now f → ∫ T∆(t)f(x) α(dx) is a positive linear functional that maps the constant function 1 to the value
1, and thus, by the Riesz representation theorem, is integration against a probability measure νt. This gives
the one dimensional distributions of X. The proof that the finite dimensional distributions converge follows
from a straightforward induction argument.
To check the strong Markov property, let τ be a discrete F^X_{t+}-stopping time. Then for each ε > 0 and
A ∈ F^X_τ, A ∩ {τ = t} ∈ F^X_{t+}. Thus, for s > ε,

E[f(X_{τ+s}); A ∩ {τ = t}] = E[f(X_{t+s}); A ∩ {τ = t}]
= E[T(s − ε)f(X_{t+ε}); A ∩ {τ = t}]
= E[T(s − ε)f(X_{τ+ε}); A ∩ {τ = t}].

Now use the right continuity of X and the strong continuity of T, letting ε → 0, to conclude that

E[f(X_{τ+s}); A ∩ {τ = t}] = E[T(s)f(X_τ); A ∩ {τ = t}].

Summing over the range of τ, we obtain

E[f(X_{τ+s}) | F^X_{τ+}] = T(s)f(X_τ).

We can obtain the result for an arbitrary stopping time by realizing it as the limit of a decreasing sequence of
discrete stopping times and using the identity above, the right continuity of X, and the strong continuity of T.

Exercise 4.89. Complete the proof above by supplying the induction argument to show that the finite di-
mensional distributions converge.
In summary, we have

Theorem 4.90. Let S be locally compact and separable. Let G be a linear operator on C0 (S) satisfying 1-3
of the Hille-Yosida theorem and let T be a strongly continuous positive contraction semigroup generated by
Ḡ. Then for each x ∈ S, there exists a Markov process X corresponding to T with initial distribution δx and
with sample paths in DS [0, ∞) if and only if G is conservative.

4.8 Transformations of Markov Processes


Definition 4.91. A continuous additive functional A of a Markov process X is an adapted process satisfying
1. A0 = 0.
2. A is continuous and nondecreasing.
3. For all s, t ≥ 0, A_{s+t} = A_s + A_t ◦ θ_s.
4. A is constant on [ζ, ∞), ζ = inf{t > 0; X_t = ∆}.

The example that will occupy our attention in this section is A_t = ∫_0^t a(X_u) du for some nonnegative
measurable function a. We shall use these additive functionals in two settings: random time changes and
killing.
Example 4.92. If N is one of the queues given in the examples, then

∫_0^t N_s ds

is the total amount of service time for the queue up to time t, and

∫_0^t I_{{0}}(N_s) ds

is the amount of time that the queue is empty.

4.8.1 Random Time Changes


Let Y be a process with sample paths in DS[0, ∞) and choose c to be a nonnegative continuous function on S
so that c ◦ Y is bounded on bounded time intervals. Think of c as the rate at which the clock for the process
moves. Thus, for example, if c(x) > 1, the process moves more quickly through the state x.
The goal is to describe solutions to

X_t = Y_{∫_0^t c(X_s) ds}.

If X solves this equation, then set

τ_t = ∫_0^t c(X_s) ds = ∫_0^t c(Y_{τ_s}) ds.

Set σ_0 = inf{t > 0 : ∫_0^t 1/c(X_{τ_s}) ds = ∞}; then for t < σ_0,

t = ∫_0^t (c(X_s)/c(X_s)) ds = ∫_0^{τ_t} (1/c(Y_u)) du,

and therefore τ satisfies the ordinary differential equation

τ′_t = c(Y_{τ_t}), τ_0 = 0.

Because c(Y) is bounded on bounded time intervals, the ordinary differential equation has a unique
solution. Now,

X_t = Y_{τ_t}.

Theorem 4.93. Let Y be a DS[0, ∞)-valued Markov process with generator G on C0(S). If a unique solution
to

X_t = Y_{∫_0^t c(X_s) ds}

exists, then X is a Markov process with generator cG.

Proof. Verifying that X is a Markov process is left as an exercise.
Note that {τ_s ≤ t} = {∫_0^t 1/c(X_{τ_u}) du ≥ s} ∩ {σ_0 ≤ t} ∈ F^Y_{t+}. Thus, τ_s is an F^Y_{t+}-stopping time. For
f ∈ D(G), the optional sampling theorem guarantees that

f(Y_{τ_t}) − ∫_0^{τ_t} Gf(Y_s) ds = f(X_t) − ∫_0^t c(X_u)Gf(X_u) du

is an F^Y_{τ_t}-martingale.
Thus, f is in the domain of the generator of the Markov process X, and this generator is cG.
Exercise 4.94. Verify that X in the theorem above is a Markov process.

4.8.2 Killing
We can choose to have a process X move to the state ∆. The rate at which this happens can be spatially
dependent. Thus, we define the hazard function

k(x) = lim_{h→0} (1/h) P_x{X_h = ∆}.

The analysis follows the previous analysis for the case of k constant. Write the additive functional

K_t = ∫_0^t k(X_s) ds

and define

Y = ∫_0^∞ e^{−K_s} f(X_s) ds, R^k f(x) = E_x Y.

Then

Y = ∫_0^t e^{−K_s} f(X_s) ds + e^{−K_t} Y ◦ θ_t.

Again, we have Doob’s martingale

Z^{k,f}_t = E[Y | F^X_t] = ∫_0^t e^{−K_s} f(X_s) ds + e^{−K_t} E[Y ◦ θ_t | F^X_t]
= ∫_0^t e^{−K_s} f(X_s) ds + e^{−K_t} R^k f(X_t).

Exercise 4.95. Show that for f ∈ D(G), Rk Gf = Rk kf − f or

Rk (k − G)f = f.

Now, let f = (k − G)g; then the Doob martingale above becomes

C^{k,g}_t = e^{−K_t} g(X_t) + ∫_0^t e^{−K_s} (k − G)g(X_s) ds

and the analog of Dynkin’s formula becomes

g(x) = E_x[e^{−K_τ} g(X_τ)] + E_x[∫_0^τ e^{−K_s} (k − G)g(X_s) ds].

Let g satisfy Gg = kg on D with g = h on the boundary of D; then

g(x) = E_x[e^{−K_{τ_D}} h(X_{τ_D})].

Now define the stochastic process X̃ as follows.
Let ξ be an exponential random variable with parameter 1, independent of the process X. Then

X̃_t = X_t, if K_t < ξ;  ∆, if K_t ≥ ξ.

Then, for f ∈ C0(S),

E_x f(X̃_t) = E_x[f(X_t) I_{K_t<ξ}] = E_x[f(X_t) E_x[I_{K_t<ξ} | F^X_t]] = E_x[e^{−K_t} f(X_t)].

Exercise 4.96. For f ∈ C0(S), define

T(t)f(x) = E_x[e^{−K_t} f(X_t)].

Then T is a positive strongly continuous contraction semigroup with generator G − k. Thus, X̃ is a Markov
process with state space S∆ and generator G − k.

Corollary 4.97 (Feynman-Kac formula). Let X be a Feller process on S with generator G. Then the
equation

∂u/∂t = Gu − ku, u(0, x) = f(x)

has solution

u(t, x) = E_x[e^{−K_t} f(X_t)].
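The Feynman-Kac representation can be checked by simulation in a case where the answer is known in closed form. The sketch below is ours: X is standard Brownian motion (so G = ½ d²/dx²), the killing rate k is constant, and f(x) = x, for which u(t, x) = x e^{−kt} exactly. The Euler discretization, path counts, and step counts are arbitrary choices.

```python
import math
import random

random.seed(0)

def feynman_kac_mc(x, t, k, f, n_paths=20000, n_steps=50):
    """Monte Carlo estimate of u(t,x) = E_x[exp(-K_t) f(X_t)], X Brownian motion."""
    dt = t / n_steps
    total = 0.0
    for _ in range(n_paths):
        b, K = x, 0.0
        for _ in range(n_steps):
            K += k(b) * dt                       # accumulate the additive functional
            b += random.gauss(0.0, math.sqrt(dt))  # Euler step for Brownian motion
        total += math.exp(-K) * f(b)
    return total / n_paths

# with constant k = 0.7 and f(x) = x, the exact solution is u(t, x) = x * exp(-0.7 t)
u = feynman_kac_mc(0.5, 1.0, lambda y: 0.7, lambda y: y)
```

The estimate should land within Monte Carlo error (a few times 0.005 here) of 0.5·e^{−0.7}; a state-dependent k only changes the accumulation line.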

4.8.3 Change of Measure


Proposition 4.98 (Bayes formula). On the probability space (Ω, F, P), let L be a nonnegative random
variable satisfying EL = 1. Define the probability measure Q(A) = E[L; A] and write E_Q[Y] = E[Y L]. Then
for any integrable random variable Z and any σ-algebra G ⊂ F,

E_Q[Z | G] = E[ZL | G] / E[L | G].

Proof. Clearly the right hand side is G-measurable. Choose A ∈ G; then

E_Q[ (E[ZL|G]/E[L|G]); A ] = E[ (E[ZL|G]/E[L|G]) L; A ] = E[ E[ (E[ZL|G]/E[L|G]) L | G ]; A ]
= E[ (E[ZL|G]/E[L|G]) E[L|G]; A ] = E[ E[ZL|G]; A ]
= E[ZL; A] = E_Q[Z; A].
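On a finite sample space the Bayes formula is a direct computation, which makes it easy to verify. The check below is ours: uniform P on six points, G generated by the two-block partition {{0,1,2},{3,4,5}}, and an arbitrary nonnegative L with EL = 1. A conditional expectation given G is constant on each block, so we compare E_Q[Z|G] with E[ZL|G]/E[L|G] block by block.

```python
# Finite sample space {0,...,5}; all values below are our own choices.
P = [1.0 / 6] * 6
L = [0.5, 0.5, 2.0, 1.0, 1.0, 1.0]           # nonnegative, E[L] = 1
Z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

gaps = []
for B in ([0, 1, 2], [3, 4, 5]):             # blocks of the partition generating G
    pB = sum(P[w] for w in B)
    qB = sum(P[w] * L[w] for w in B)         # Q(B) = E[L; B]
    lhs = sum(P[w] * L[w] * Z[w] for w in B) / qB   # E_Q[Z | G] on B
    num = sum(P[w] * Z[w] * L[w] for w in B) / pB   # E[ZL | G] on B
    den = qB / pB                                    # E[L | G] on B
    gaps.append(abs(lhs - num / den))
```

Both sides agree on each block up to floating-point rounding.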

Lemma 4.99. Assume Q|F_t << P|F_t and let L_t be the Radon-Nikodym derivative. Then Z is a
Q-martingale if and only if ZL is a P-martingale.
Proof. Because L is an adapted process, Z is adapted if and only if ZL is adapted. To show that L is a
P-martingale, choose A ∈ F_t; then Q(A) = E[L_t; A]. However, A ∈ F_{t+s} and therefore

E[L_t; A] = Q(A) = E[L_{t+s}; A].

By Bayes formula,

E_Q[Z_{t+s} | F_t] = E[Z_{t+s} L_{t+s} | F_t] / E[L_{t+s} | F_t] = E[Z_{t+s} L_{t+s} | F_t] / L_t.

Thus,

E_Q[Z_{t+s} | F_t] − Z_t = (E[Z_{t+s} L_{t+s} | F_t] − Z_t L_t) / L_t,

and, consequently,

E_Q[Z_{t+s} | F_t] = Z_t if and only if E[Z_{t+s} L_{t+s} | F_t] = Z_t L_t.

Note that L is a positive mean one martingale. In addition, if L^{−1} is Q-integrable, then L^{−1} is a Q-
martingale and P|F_t << Q|F_t.
For X, a time homogeneous Markov process with generator G, write g = Gf for f ∈ D(G) with
inf_x f(x) > 0, and recall that

(f(X_t)/f(X_0)) exp( −∫_0^t g(X_v)/f(X_v) dv )

is a mean 1 local martingale. Write f(x) = exp(w(x)) and Gw = e^{−w} G e^{w}; then this local martingale becomes

L_t = exp( w(X_t) − w(X_0) − ∫_0^t Gw(X_v) dv ).

For the case of standard Brownian motion B, take w(x) = µx. We shall soon learn that G = ½∆, and

L_t = exp(µB_t − ½µ²t)

is indeed a martingale.
is indeed a martingale.
Under the change of measure induced by L,

Q{B_t ∈ A} = E[I_A(B_t) exp(µB_t − ½µ²t)] = (1/√(2πt)) ∫_A exp(µx − ½µ²t) e^{−x²/2t} dx
= (1/√(2πt)) ∫_A exp( −(1/2t)(x² − 2µxt + µ²t²) ) dx = (1/√(2πt)) ∫_A exp( −(1/2t)(x − µt)² ) dx
= (1/√(2πt)) ∫_{A−µt} e^{−x²/2t} dx = P{B_t + µt ∈ A}.
Exercise 4.100. Let B, P and Q be defined as above. For times 0 ≤ t_1 < t_2 < · · · < t_n and Borel sets A_1, . . . , A_n,

Q{B_{t_1} ∈ A_1, . . . , B_{t_n} ∈ A_n} = P{B_{t_1} + µt_1 ∈ A_1, . . . , B_{t_n} + µt_n ∈ A_n}.

Thus, by the Daniell-Kolmogorov extension theorem, under the measure Q, B is Brownian motion with drift
µ.
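The identity Q{B_t ∈ A} = P{B_t + µt ∈ A} derived above is a one-dimensional Gaussian computation, so it can be verified by quadrature. The sketch below is ours: the left side E[1{B_t ≤ c} exp(µB_t − ½µ²t)] is computed by a trapezoid rule against the N(0, t) density, the right side is the shifted normal CDF; parameter values are arbitrary.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def q_cdf(c, mu, t, lo=-25.0, n=20000):
    """Q{B_t <= c} = E[1{B_t <= c} exp(mu B_t - mu^2 t/2)], trapezoid rule."""
    def g(x):
        dens = math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)  # N(0,t)
        return math.exp(mu * x - 0.5 * mu * mu * t) * dens
    h = (c - lo) / n
    s = 0.5 * (g(lo) + g(c)) + sum(g(lo + i * h) for i in range(1, n))
    return s * h

mu, t, c = 0.8, 2.0, 1.0
lhs = q_cdf(c, mu, t)
rhs = normal_cdf((c - mu * t) / math.sqrt(t))   # P{B_t + mu*t <= c}
```

The two sides agree to the accuracy of the quadrature, which is exactly the statement that under Q, B_t is Gaussian with mean µt and variance t.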
Example 4.101. Recall that for standard Brownian motion and for a ≥ 0, the hitting time for level a,
τ_a = inf{t > 0; B_t > a}, has density

f_{τ_a}(s) = (a/√(2πs³)) e^{−a²/2s}.

If we consider the process B_t + µt, µ > 0, under the measure Q, then this process is standard Brownian motion
under the measure P. Thus,

Q{τ_a ≤ t} = E[I_{τ_a≤t} L_t] = E[E[I_{τ_a≤t} L_t | F_{min{τ_a,t}}]] = E[I_{τ_a≤t} L_{min{τ_a,t}}] = E[I_{τ_a≤t} L_{τ_a}]
= E[I_{τ_a≤t} exp(µa − ½µ²τ_a)] = ∫_0^t exp(µa − ½µ²s) (a/√(2πs³)) e^{−a²/2s} ds.

Thus,

f_{τ_a,µ}(s) = (a/√(2πs³)) e^{−(a−µs)²/2s}

gives the density of τ_a for Brownian motion with constant drift µ.
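For µ > 0 the drifted Brownian motion hits level a almost surely, so the density f_{τ_a,µ} (the inverse Gaussian density) should integrate to 1 over (0, ∞). A numerical sanity check, with cutoffs and parameter values of our own choosing:

```python
import math

def f_tau(s, a, mu):
    """Candidate density of tau_a for Brownian motion with drift mu (a, mu > 0)."""
    return a / math.sqrt(2 * math.pi * s ** 3) * math.exp(-(a - mu * s) ** 2 / (2 * s))

a, mu = 1.0, 0.5
lo, hi, n = 1e-8, 120.0, 60000          # cutoffs: the integrand vanishes fast at both ends
h = (hi - lo) / n
mass = h * (0.5 * (f_tau(lo, a, mu) + f_tau(hi, a, mu))
            + sum(f_tau(lo + i * h, a, mu) for i in range(1, n)))
```

For µ = 0 the same integral would be 1 as well (the level is still hit a.s., though with infinite mean), while for µ < 0 it would equal the hitting probability e^{2µa} < 1.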

4.9 Stationary Distributions


Definition 4.102. Call a stochastic process stationary if, for every n ≥ 1, and 0 ≤ s1 < · · · < sn , the finite
dimensional distributions

P {Xt+s1 ∈ B1 , . . . , Xt+sn ∈ Bn }, B1 , . . . , Bn ∈ B(S).

are independent of t ≥ 0.
Definition 4.103. A probability measure µ for a Markov process X is called stationary if X is a stationary
process under Pµ.
Lemma 4.104. µ is a stationary measure for a Markov process X if and only if

Pµ {Xt ∈ B} = µ(B), B ∈ B(S).

for all t ≥ 0.
Proof. The necessity is immediate. To prove sufficiency, note that for each t ≥ 0, θ_t X is a Markov process
with initial distribution µ and the same generator as X, and thus has the same finite dimensional distributions
as X.
Theorem 4.105. Let G generate a strongly continuous contraction semigroup T on C0 (S) corresponding to
a Markov process X. In addition, assume that D is a core for G. Then for a probability measure µ on S the
following are equivalent:
1. µ is a stationary distribution for X.
2. ∫ T(t)f dµ = ∫ f dµ for f ∈ C0(S) and t ≥ 0.
3. ∫ Gf dµ = 0 for f ∈ D.
Proof. (1 → 2)

∫ T(t)f dµ = Eµ[T(t)f(X_0)] = Eµ[f(X_t)] = Eµ[f(X_0)] = ∫ f dµ.

(2 → 1) By the lemma above, it suffices to show that the one dimensional distributions agree.

Eµ[f(X_t)] = Eµ[T(t)f(X_0)] = ∫ T(t)f dµ = ∫ f dµ = Eµ[f(X_0)].

(2 → 3) follows from the definition of G.
(3 → 2) By the definition of a core, 3 holds for all f ∈ D(G). Thus

∫ (T(t)f − f) dµ = ∫ ∫_0^t GT(s)f ds dµ = ∫_0^t ∫ GT(s)f dµ ds = 0.

Because D(G) is dense in C0(S), 2 follows.
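On a finite state space, condition 3 is linear algebra: Σ_x µ{x} Gf(x) = 0 for every f exactly when µ is a left null vector of the rate matrix G. A concrete check, with a three-state rate matrix of our own choosing (rows sum to zero) and the stationary vector found by hand:

```python
# Rate matrix of a three-state chain: off-diagonal entries are jump rates,
# each row sums to zero. The entries are our own choices.
G = [[-1.0,  1.0,  0.0],
     [ 0.5, -1.5,  1.0],
     [ 0.0,  2.0, -2.0]]

mu = [0.25, 0.5, 0.25]     # candidate stationary distribution, solved from mu G = 0
residual = [sum(mu[x] * G[x][y] for x in range(3)) for y in range(3)]
```

Each entry of `residual` is Σ_x µ{x} G(x, y); taking f ranging over indicator functions shows this vanishing is equivalent to condition 3.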


Exercise 4.106. Let α be a probability measure on S.
1. If ∫ f dν = lim_{t→∞} ∫ T(t)f dα exists for every f ∈ C0(S), then ν is a stationary measure.
2. If ∫ f dν = lim_{n→∞} (1/t_n) ∫_0^{t_n} ∫ T(t)f dα dt exists for every f ∈ C0(S) and some increasing
sequence {t_n; n ≥ 1} with lim_{n→∞} t_n = ∞, then ν is a stationary measure.


Definition 4.107. A Markov process X is said to be ergodic if there exists a unique stationary measure ν
and

lim_{t→∞} Eα f(X_t) = ∫ f dν

for every initial distribution α and every f ∈ C0(S).


Note that if X is a stationary process, then we may extend it to be a process {Xt ; t ∈ R}. As with
Markov chains, we have:

Definition 4.108. A probability measure ν is said to be reversible for a Markov process X if

Eν[f_1(X_0)f_2(X_t)] = Eν[f_2(X_0)f_1(X_t)],

or, in terms of the semigroup T,

∫ f_1 T(t)f_2 dν = ∫ f_2 T(t)f_1 dν.

In other words, T(t) is a self-adjoint operator with respect to the measure ν.

Proposition 4.109. If T is conservative, then a reversible measure is invariant.


Proof. Take f2 = 1 in the definition.
Theorem 4.110. Suppose that T is a Feller semigroup on C0(S) with generator G and let ν be a probability
measure on S. Then the following statements are equivalent:
1. ν is reversible.
2. ∫ f_1 Gf_2 dν = ∫ f_2 Gf_1 dν for all f_1, f_2 ∈ D(G).
3. ∫ f_1 Gf_2 dν = ∫ f_2 Gf_1 dν for all f_1, f_2 ∈ D, a core for G.
4. ν is stationary, and if X is the Markov process obtained using ν as the distribution of X_0 and
extending the time set to R, then {X_t; t ∈ R} and {X_{−t}; t ∈ R} have the same finite dimensional
distributions.
Proof. (1 → 2) is straightforward and (2 → 3) is immediate. (3 → 1) is nearly identical to the proof of the
similar statement for stationary measures.
(4 → 1)

Eν[f_1(X_0)f_2(X_t)] = Eν[f_2(X_0)f_1(X_{−t})] = Eν[f_2(X_0)f_1(X_t)].

(1 → 4) Now suppose that ν is reversible. Then given

f_0 = 1, f_1, . . . , f_n, f_{n+1} = 1 ∈ C_b(S)

and t_0 < · · · < t_{n+1}, define

g_ℓ(x) = E_x[ ∏_{k=1}^{ℓ−1} f_k(X_{t_ℓ−t_k}) ], and h_ℓ(x) = E_x[ ∏_{k=ℓ+1}^{n+1} f_k(X_{t_k−t_ℓ}) ].

Set

I_ℓ = ∫ g_ℓ(x) f_ℓ(x) h_ℓ(x) ν(dx).

Then

I_0 = Eν[ ∏_{k=1}^{n} f_k(X_{t_k−t_0}) ], and I_{n+1} = Eν[ ∏_{k=1}^{n} f_k(X_{t_{n+1}−t_k}) ].
Claim. I_0 = I_{n+1}.
Let 1 ≤ ℓ ≤ n. Then

I_ℓ = ∫ g_ℓ f_ℓ h_ℓ dν = ∫ (g_ℓ f_ℓ) T(t_{ℓ+1} − t_ℓ)(f_{ℓ+1} h_{ℓ+1}) dν
= ∫ T(t_{ℓ+1} − t_ℓ)(g_ℓ f_ℓ) (f_{ℓ+1} h_{ℓ+1}) dν
= ∫ g_{ℓ+1} f_{ℓ+1} h_{ℓ+1} dν = I_{ℓ+1}

because

T(t_{ℓ+1} − t_ℓ)(f_{ℓ+1}(x) h_{ℓ+1}(x)) = T(t_{ℓ+1} − t_ℓ)( f_{ℓ+1}(x) E_x[ ∏_{k=ℓ+2}^{n+1} f_k(X_{t_k−t_{ℓ+1}}) ] )
= T(t_{ℓ+1} − t_ℓ) E_x[ ∏_{k=ℓ+1}^{n+1} f_k(X_{t_k−t_{ℓ+1}}) ] = E_x[ ∏_{k=ℓ+1}^{n+1} f_k(X_{t_k−t_ℓ}) ] = h_ℓ(x),

and

T(t_{ℓ+1} − t_ℓ)(g_ℓ(x) f_ℓ(x)) = T(t_{ℓ+1} − t_ℓ)( E_x[ ∏_{k=1}^{ℓ−1} f_k(X_{t_ℓ−t_k}) ] f_ℓ(x) )
= T(t_{ℓ+1} − t_ℓ) E_x[ ∏_{k=1}^{ℓ} f_k(X_{t_ℓ−t_k}) ] = g_{ℓ+1}(x).

Iterate this to obtain the claim.


However, by stationarity,

I_0 = Eν[ ∏_{k=1}^{n} f_k(X_{t_k}) ], and I_{n+1} = Eν[ ∏_{k=1}^{n} f_k(X_{−t_k}) ],

and 4 follows.
For a countable state space, the identity ∫ f_1 Gf_2 dν = ∫ f_2 Gf_1 dν becomes

Σ_x Σ_y f_1(x) g_{xy} (f_2(y) − f_2(x)) ν{x} = Σ_x Σ_y f_2(x) g_{xy} (f_1(y) − f_1(x)) ν{x}.

Simplifying, this is equivalent to

Σ_x Σ_y f_1(x) g_{xy} f_2(y) ν{x} = Σ_x Σ_y f_2(x) g_{xy} f_1(y) ν{x},

or, by changing indices,

Σ_x Σ_y f_1(x) g_{xy} f_2(y) ν{x} = Σ_x Σ_y f_2(y) g_{yx} f_1(x) ν{y}.

By writing f_1 and f_2 as sums of indicator functions of sites in S, we have that this statement is equivalent
to

ν{x} g_{xy} = ν{y} g_{yx}.

This is the equation of detailed balance.
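Detailed balance is easy to verify for a concrete birth and death chain. The check below is ours, for the M/M/1 queue: births at rate λ, deaths at rate µ, and the geometric candidate ν{x} = (1 − ρ)ρ^x with ρ = λ/µ < 1. The only nonzero rates are g_{x,x+1} = λ and g_{x+1,x} = µ, so detailed balance reduces to ν{x}λ = ν{x+1}µ.

```python
# M/M/1 queue: birth rate lam, death rate mu (our choices, lam < mu).
lam, mu = 2.0, 5.0
rho = lam / mu

def nu(x):
    """Candidate reversible measure: geometric with parameter rho."""
    return (1.0 - rho) * rho ** x

# nu{x} g_{x,x+1} = nu{x+1} g_{x+1,x} at every edge of the birth-death graph
gaps = [abs(nu(x) * lam - nu(x + 1) * mu) for x in range(50)]
```

Since ν{x+1}µ = (1 − ρ)ρ^x · ρµ = (1 − ρ)ρ^x · λ, the gaps vanish up to floating-point rounding, confirming that the geometric distribution is reversible, hence stationary, for this queue.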

Exercise 4.111. Find a reversible measure for a birth and death process and give conditions so that this
measure can be normalized to be a probability measure. Simplify this expression for the examples of birth and
death processes in the examples above.

4.10 One Dimensional Diffusions.


Let −∞ ≤ r− < r+ ≤ +∞ and let I = [r− , r+ ] ∩ R, int(I) = (r− , r+ ), and I¯ = [r− , r+ ].

Definition 4.112. A diffusion on I is a Feller process with paths in CI [0, ∞). A diffusion is called regular
if, for every x ∈ int(I) and y ∈ I,
Px {τy < ∞} > 0,
where τy = inf{t > 0 : Xt = y}.
Example 4.113. Brownian motion is a one dimensional diffusion on R. The semigroup is

T(t)f(x) = (1/√(2πt)) ∫_R f(y) exp(−(x − y)²/2t) dy = (1/√(2π)) ∫_R f(x + y√t) exp(−y²/2) dy.

Then, for f ∈ C²(R) ∩ C0(R),

(1/t)(T(t)f(x) − f(x)) = (1/√(2π)) ∫_R (1/t)(f(x + y√t) − f(x)) exp(−y²/2) dy
= (1/√(2π)) ∫_R (1/t)( y√t f′(x) + ½ y² t f″(x + θy√t) ) exp(−y²/2) dy
= (1/√(2π)) ∫_R ½ y² f″(x + θy√t) exp(−y²/2) dy

for some θ ∈ (0, 1); the first order term vanishes because y exp(−y²/2) is odd. Thus, the generator is

Gf(x) = ½ f″(x).
Exercise 4.114. For λ > 0, solve g 00 = 2λg and use this to find the Laplace transform for exit times.
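As a hint toward the exercise: solving g″ = 2λg with boundary values g(±b) = 1 produces the candidate E_x[e^{−λτ}] = cosh(x√(2λ))/cosh(b√(2λ)) for the exit time τ of the interval (−b, b) — this formula is the standard answer one obtains, stated here as a claim to verify rather than taken from the notes. One sanity check: the slope of 1 − E_x[e^{−λτ}] at λ = 0 is E_x τ, which for Brownian motion on (−b, b) is (x + b)(b − x) = b² − x².

```python
import math

b, x = 2.0, 0.5   # interval half-width and starting point (our choices)

def laplace_exit(lam):
    """Candidate E_x[exp(-lam * tau)] for tau the exit time of (-b, b)."""
    r = math.sqrt(2.0 * lam)
    return math.cosh(x * r) / math.cosh(b * r)

lam = 1e-6
slope = (1.0 - laplace_exit(lam)) / lam     # approximates E_x[tau] as lam -> 0
exact = (b - x) * (b + x)                   # E_x[tau] = b^2 - x^2
```

The near-zero slope matches the known mean exit time, and 0 < laplace_exit(λ) < 1 for λ > 0, as a Laplace transform of a positive random variable must satisfy.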

Exercise 4.115. Let φ ∈ C 2 (R) be increasing and let B be standard Brownian motion. Find the generator
of X = φ ◦ B.
The major simplifying circumstance for one-dimensional diffusions is the intermediate value theorem, i.e.,
if x_0 is between X_s and X_t, s < t, then X_c = x_0 for some c ∈ (s, t). Regularity implies a much stronger
condition.

Theorem 4.116. Let J be a finite open interval, J̄ ⊂ I, and let τ_J be the time of the first exit from J. Then,
if X is a regular diffusion on I,

sup_{x∈J} E_x τ_J < ∞.

Proof. We begin with a claim.
Claim. Assume that p_J = sup_{x∈J} P_x{τ_J > t} < 1. Then

E_x τ_J ≤ t/(1 − p_J).

Write p_n = sup_{x∈J} P_x{τ_J > nt}. Then, by the strong Markov property,

P_x{τ_J > (n + 1)t} = P_x{τ_J > nt} P_x{τ_J > (n + 1)t | τ_J > nt} ≤ p_n p_J.

Thus, p_n ≤ p_J^n and

E_x τ_J = ∫_0^∞ P_x{τ_J > s} ds = Σ_{n=0}^∞ ∫_{nt}^{(n+1)t} P_x{τ_J > s} ds ≤ t Σ_{n=0}^∞ p_J^n = t/(1 − p_J).

To complete the proof, set J = (a, b) and pick y ∈ J. Then, by the regularity of X, there exists a t > 0
and p < 1 so that
Py {τa > t} ≤ p, Py {τb > t} ≤ p.
Thus, for x ∈ [y, b),
Px {τJ > t} ≤ Px {τb > t} ≤ Py {τb > t}.
For x ∈ (a, y],
Px {τJ > t} ≤ Px {τa > t} ≤ Py {τa > t}.

To set notation, for any open interval J = (a, b) with P_x{τ_J < ∞} = 1 for all x ∈ J, write

s_J(x) = P_x{τ_b < τ_a}.

Clearly, s_J(a) = 0 and s_J(b) = 1. If a < x < y < b, then, by the strong Markov property,

s_J(x) = P_x{τ_y < τ_a} s_J(y).



Lemma 4.117. sJ is continuous and strictly increasing.


Proof. By the statements above, the lemma is equivalent to showing that for a < x < y < b, P_x{τ_y < τ_a}
is continuous and strictly increasing as a function of y and has limit 1 as y → x. (This gives right continuity;
left continuity follows from a symmetric argument.)
To this end, as y ↓ x,

P_x{τ_y < τ_a} = P_x{ sup_{t≤τ_a} X_t ≥ y } → P_x{ sup_{t≤τ_a} X_t > x } = 1

because τ_a > 0 P_x-a.s.


We now show that P_x{τ_y < τ_a} = 1 contradicts the regularity of X. Define stopping times ξ_0 = 0,
σ_{n+1} = inf{t > ξ_n : X_t = x} = σ(ξ_n, {x}), ξ_{n+1} = inf{t > σ_{n+1}; X_t ∈ {a, y}}. By the strong Markov
property at ξ_n,

P_y{ξ_n < ∞, X_{ξ_n} = a} = E_y[ P_x{τ_a < τ_y}; {ξ_n < ∞} ] = 0

and thus,

P_x{τ_a < ∞} = 0.

4.10.1 The Scale Function


Definition 4.118. Let X be a regular diffusion on I. A scale function for X is a function s : I → R such
that for all a < x < b, a, b ∈ I,

P_x{τ_b < τ_a} = (s(x) − s(a))/(s(b) − s(a)).

If s(x) = x is a scale function for X, then we say that X is in its natural scale. In particular, if
x = (a + b)/2, then P_x{τ_b < τ_a} = 1/2.
Exercise 4.119. 1. Standard Brownian motion is on natural scale.
2. s is uniquely determined up to an increasing affine transformation.
3. A process X is on natural scale if and only if Px {τb < τa } = 1/2 for every a, b ∈ I and x = (a + b)/2.
Theorem 4.120. Every regular diffusion on I has a scale function.
Proof. If I is compact, then s_I is a scale function. To see this, take I = [a, b] and a < c < x < d < b. Then,
by the strong Markov property,

P_x{τ_b < τ_a} = P_x{τ_d < τ_c} P_d{τ_b < τ_a} + P_x{τ_c < τ_d} P_c{τ_b < τ_a},

that is,

(s_I(x) − s_I(a))/(s_I(b) − s_I(a)) = P_x{τ_d < τ_c} (s_I(d) − s_I(a))/(s_I(b) − s_I(a))
+ (1 − P_x{τ_d < τ_c}) (s_I(c) − s_I(a))/(s_I(b) − s_I(a)).

Solving, we obtain

P_x{τ_d < τ_c} = (s_I(x) − s_I(c))/(s_I(d) − s_I(c)).

Now, we extend. Take {J_n; n ≥ 1} to be an increasing sequence of compact intervals with union I.
Moreover, if I contains an endpoint, then construct each J_n = [a_n, b_n] to include this endpoint. Define
s̃_n : J_n → R inductively: s̃_1 = s_{J_1} and, for n ≥ 1,

s̃_{n+1} = s̃_n(a_n) + (s̃_n(b_n) − s̃_n(a_n)) (s_{J_{n+1}}(x) − s_{J_{n+1}}(a_n))/(s_{J_{n+1}}(b_n) − s_{J_{n+1}}(a_n)).

Clearly, s̃_{n+1}(a_n) = s̃_n(a_n) and s̃_{n+1}(b_n) = s̃_n(b_n). Repeat the argument above to show that s̃_{n+1} is
a scale function on J_n. Because s̃_n is an affine transformation of s̃_{n+1} and the two functions agree at two
points, they must be equal on J_n.
Consequently, we can define s(x) = s̃_n(x) for x ∈ J_n.
Theorem 4.121. If X is a regular diffusion on I, then X̃ = s(X) is a regular diffusion in natural scale on
s(I).
Proof. Write ã = s(a), b̃ = s(b), x̃ = s(x), and τ̃_ã = inf{t > 0 : X̃_t = ã}. Then

P_{x̃}{τ̃_b̃ < τ̃_ã} = P_x{τ_b < τ_a} = (s(x) − s(a))/(s(b) − s(a)) = (x̃ − ã)/(b̃ − ã).

4.10.2 The Speed Measure


The structure theorem for jump Markov processes gives, for each site x in a birth and death process, a
function λ(x) that determines the length of time spent at x and a probability p_x = P_x{X_{τ_1} = x + 1}, the bias in
direction.
For a one dimensional regular diffusion, the bias in direction is articulated through the scale function s.
The length of time near x is described by E_x τ_J. In this section we shall show how these mean values can be
represented as integration of a kernel against a measure. To explain the kernel, we have the following:

Exercise 4.122. Let Y be a simple symmetric random walk on Z, let τ = min{n; Y_n ∈ {M_−, M_+}},
M_− < M_+, and let N_y be the number of visits to y before τ. Then

E_x τ = (x − M_−)(M_+ − x),

and

E_x N_y = (x − M_−)(M_+ − y)/(M_+ − M_−), x ≤ y;  (y − M_−)(M_+ − x)/(M_+ − M_−), x ≥ y.
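The exit-time formula in the exercise can be confirmed by first-step analysis: inside the interval the walk takes one step and then restarts from x ± 1 with probability ½ each, so E_x τ must satisfy E_x τ = 1 + ½(E_{x−1} τ + E_{x+1} τ) with E_{M_−} τ = E_{M_+} τ = 0. A direct check (endpoints are our choices):

```python
M_minus, M_plus = 0, 10

def expected_tau(x):
    """Claimed mean exit time of the simple symmetric walk from (M_minus, M_plus)."""
    return (x - M_minus) * (M_plus - x)

# first-step (one-step conditioning) residuals at every interior point
gaps = [abs(expected_tau(x) - (1 + 0.5 * (expected_tau(x - 1) + expected_tau(x + 1))))
        for x in range(M_minus + 1, M_plus)]
```

The residuals vanish identically, and the boundary values are zero, so the quadratic is the unique solution of the first-step recursion.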

For a bounded open interval J = (a, b) define, for x, y ∈ J,

G_J(x, y) = (x − a)(b − y)/(b − a), x ≤ y;  (y − a)(b − x)/(b − a), x ≥ y.

Theorem 4.123. Let X be a one dimensional diffusion on I on natural scale. Then there is a unique
measure m defined on B(int(I)) such that

1. m is finite on bounded sets B with B̄ ⊂ int(I).
2. For a bounded open interval J,

m(x, J) = E_x τ_J = ∫_J G_J(x, y) m(dy).

Proof. Let {a = x_0 < x_1 < · · · < x_n = b} be a partition of [a, b] into subintervals of length h = (b − a)/n;
thus, x_k = a + hk. Define J_k = (x_{k−1}, x_{k+1}).
The process restricted to the exit times of the intervals J_k is a simple symmetric random walk Y on
a + hZ. Let σ_j denote the time between steps before τ_J and zero after. Then, for a partition point x,

E_x τ_J = Σ_{j=0}^∞ E_x σ_j

and

E_x σ_j = E_x[E_x[σ_j | Y_j]] = Σ_{k=1}^{n−1} P_x{Y_j = x_k} m(x_k, J_k).

Let n(x_{k_0}, x_k) denote the expected number of visits of the random walk to x_k starting at x_{k_0}. Then, by the
exercise,

n(x_{k_0}, x_k) = k_0(n − k)/n, x_{k_0} ≤ x_k;  k(n − k_0)/n, x_k ≤ x_{k_0};  that is, n(x_{k_0}, x_k) = (1/h) G_J(x_{k_0}, x_k).

Thus,

E_x τ_J = Σ_{j=0}^∞ Σ_{k=1}^{n−1} P_x{Y_j = x_k} m(x_k, J_k) = Σ_{k=1}^{n−1} ( Σ_{j=0}^∞ P_x{Y_j = x_k} ) m(x_k, J_k)
= Σ_{k=1}^{n−1} n(x, x_k) m(x_k, J_k) = Σ_{k=1}^{n−1} G_J(x, x_k) (1/h) m(x_k, J_k).

Define the measure m_n that gives mass m(x_k, J_k)/h to the point x_k. Then the formula above becomes

m(x_j, J) = ∫_J G_J(x_j, y) m_n(dy).

Now consider successive refinements (say n = 2^ℓ). Then the representation above holds whenever the
ratio (x − a)/(b − a) is a dyadic rational. This representation also implies that lim sup_{n→∞} m_n(J) < ∞.
Now use a variant of the Prohorov theorem to show that {m_{2^ℓ} : ℓ ≥ 1} has compact closure in the topology
determined by weak convergence. Let m be a limit point of {m_{2^ℓ} : ℓ ≥ 1}; then for any finite open
interval J with dyadic rational endpoints, J̄ ⊂ I,

m(x, J) = ∫_J G_J(x, y) m(dy), (x − a)/(b − a) a dyadic rational.

By the bounded convergence theorem, this can be extended to all such open intervals J.

To extend this to arbitrary x, use the strong Markov property to obtain the identity, for J̃_n = (ã_n, b̃_n) ⊂ J,

m(x, J) = m(x, J̃_n) + P_x{τ_{b̃_n} < τ_{ã_n}} m(b̃_n, J) + P_x{τ_{ã_n} < τ_{b̃_n}} m(ã_n, J).

Now let ã_n and b̃_n be dyadic rationals with ã_n < x < b̃_n; then

lim_{n→∞} m(ã_n, J) = lim_{n→∞} m(b̃_n, J) = m(x, J)

and, by the representation for dyadic rationals,

lim_{n→∞} m(x, J̃_n) = 0.

Exercise 4.124. The measure m is unique.


Definition 4.125. The measure m is called the speed measure for the process.
Exercise 4.126. 1. If X is not in its natural scale, then the speed measure m can be defined by

m(x, J) = ∫_J G_J(x, y) m(dy)

with

G_J(x, y) = (s(x) − s(a))(s(b) − s(y))/(s(b) − s(a)), x ≤ y;  (s(y) − s(a))(s(b) − s(x))/(s(b) − s(a)), x ≥ y.

2. Let X be standard Brownian motion; then for J = (a, b),

E_x τ_J = (x − a)(b − x).

Thus, the speed measure is Lebesgue measure.
3. Conversely, if the speed measure is Lebesgue measure, then for J = (a, b),

E_x τ_J = (x − a)(b − x).

4. Use the standard machine to prove that for a bounded measurable function f,

E_x[ ∫_0^{τ_J} f(X_t) dt ] = ∫_J f(y) G_J(x, y) m(dy).

Theorem 4.127. Let X be a regular diffusion in natural scale with speed measure m and let X̃ be the unique
solution to

X̃_t = X_{∫_0^t c(X̃_s) ds}.

Then X̃ is a Feller process in natural scale with speed measure

m̃(A) = ∫_A c(x)^{−1} m(dx).

Proof. Set

σ_t = ∫_0^t c(X̃_s) ds.

Then, as before,

t = ∫_0^{σ_t} c(X_u)^{−1} du.

Denote

τ_a = inf{t > 0 : X_t = a}, τ̃_a = inf{t > 0 : X̃_t = a}.

Then

P_x{τ̃_a < τ̃_b} = P_x{τ_a < τ_b}.

Thus, X̃ is in natural scale.
Let τ̃_J be the first exit time of X̃ from J. Then

τ_J = σ_{τ̃_J}, or τ̃_J = ∫_0^{τ_J} c(X_s)^{−1} ds.

Taking expectations, we obtain

E_x τ̃_J = E_x[ ∫_0^{τ_J} c(X_s)^{−1} ds ] = ∫_J G_J(x, y) c(y)^{−1} m(dy).

Thus, X̃ has the asserted speed measure.

4.10.3 The Characteristic Operator


Because we are working with exit times in the definition of the speed measure, it is convenient to continue
this idea to give an operator similar to the generator.

Definition 4.128. Let x ∈ int(I); then

lim_{J→{x}} φ(J) = L

means that for any system of open sets {J_n; n ≥ 1} decreasing to the point x, we have

lim_{n→∞} φ(J_n) = L.

Definition 4.129. Let X be a regular diffusion on I. The characteristic operator is

Cf(x) = lim_{J→{x}} (E_x f(X_{τ_J}) − f(x)) / E_x τ_J.

The domain of C is the subspace of all f ∈ C0(I) for which the limit above exists.
Theorem 4.130. Let X be a regular diffusion on I with generator A. Let f ∈ D(A) ∩ D(C), then Af (x) =
Cf (x) for all x ∈ int(I).

Proof. Choose f ∈ D(A) and x ∈ int(I), and set g = Af. Then Dynkin’s formula states that

E_x[f(X_{τ_J})] = f(x) + E_x[ ∫_0^{τ_J} g(X_s) ds ]

for an interval J, J̄ ⊂ int(I). Choose ε > 0 and J so small that g(J) ⊂ (g(x) − ε, g(x) + ε). Then

g(x) − ε < (E_x f(X_{τ_J}) − f(x)) / E_x τ_J < g(x) + ε

and Cf(x) = Af(x).


Theorem 4.131. If X is a regular diffusion on the line in natural scale and with speed measure equal to
Lebesgue measure, then X is standard Brownian motion.
Proof. Let A be the generator for X and let C be its characteristic operator. Then, because R is open,
D(A) ⊂ D(C). Write an interval containing x as J(λ, ε) = (x − (1 − λ)ε, x + λε), λ ∈ (0, 1). Then

E_x τ_{J(λ,ε)} = λ(1 − λ)ε².

For f ∈ C²(R),

E_x f(X_{τ_{J(λ,ε)}}) = (1 − λ)f(x + λε) + λf(x − (1 − λ)ε)

and

E_x f(X_{τ_{J(λ,ε)}}) − f(x) = (1 − λ)(f(x + λε) − f(x)) + λ(f(x − (1 − λ)ε) − f(x)).

Now, choose sequences ε_n → 0 and λ_n ∈ (0, 1). Then, by L’Hôpital’s rule,

lim_{n→∞} (E_x f(X_{τ_{J(λ_n,ε_n)}}) − f(x)) / E_x τ_{J(λ_n,ε_n)}
= lim_{n→∞} ((1 − λ_n)(f(x + λ_n ε_n) − f(x)) + λ_n(f(x − (1 − λ_n)ε_n) − f(x))) / (λ_n(1 − λ_n)ε_n²)
= lim_{n→∞} (f′(x + λ_n ε_n) − f′(x − (1 − λ_n)ε_n)) / (2ε_n) = ½ f″(x).
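The convergence of the characteristic-operator ratio to ½f″(x) can be observed numerically. The sketch is ours: we fix λ, shrink ε, and compute (E_x f(X_{τ_J}) − f(x))/E_x τ_J for the interval J(λ, ε) using the exit probabilities 1 − λ and λ and the exit time λ(1 − λ)ε² from the proof; f = sin and the specific x, λ are arbitrary.

```python
import math

f = math.sin                         # test function with f'' = -sin
x, lam = 0.7, 0.3                    # center and splitting parameter (our choices)

def ratio(eps):
    """(E_x f(X_tau) - f(x)) / E_x tau over J = (x-(1-lam)*eps, x+lam*eps)."""
    num = ((1 - lam) * (f(x + lam * eps) - f(x))
           + lam * (f(x - (1 - lam) * eps) - f(x)))
    return num / (lam * (1 - lam) * eps ** 2)

target = -0.5 * math.sin(x)          # f''(x)/2
errors = [abs(ratio(e) - target) for e in (0.1, 0.01, 0.001)]
```

The error decreases linearly in ε (the next Taylor term is of order ε f‴(x)), consistent with Cf(x) = ½f″(x).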

Set aside for the moment behavior at the boundary of I. Beginning with any scale s and any speed
measure having a density c, we can transform the diffusion into Brownian motion by a change of scale and a
time change. Thus, by inverting these transformations, we can transform a Brownian motion into a regular
diffusion having scale s and speed measure m(A) = ∫_A c(x) dx. This shows, among this class of scales and
speed measures, that the behavior of a regular diffusion is uniquely determined in the interior of I by these
two properties of the diffusion.
For diffusions X in this class, the generator takes the form

    A = (1/2) a(x) d²/dx² + b(x) d/dx,  a(x) > 0.

Let x0 ∈ I and define

    s(x) = ∫_{x0}^x exp(−2 ∫_{x0}^y b(z)/a(z) dz) dy.

Then s is a harmonic function for X.


For J = (a, b), J̄ ⊂ int(I), we have by Dynkin's formula,

    s(x) = s(a)P_x{X_{τ_J} = a} + s(b)P_x{X_{τ_J} = b}.

Therefore, s is a scale function.


Now, use the chain rule to see that s(X) has generator

    (1/2) a(x) s′(x)² d²/dx².

Thus, the speed measure has density 1/(a(x)s′(x)²).

Example 4.132. 1. The radial part of an n-dimensional Brownian motion R is called a Bessel process. In this case, I = [0, ∞) and

    A = (1/2) d²/dr² + ((n − 1)/(2r)) d/dr.

2. The Ornstein-Uhlenbeck process is obtained by subjecting particles undergoing physical Brownian motion to an elastic force. The generator is

    A = (σ²/2) d²/dx² − ρx d/dx,  I = R.

3. Financial models of stock prices use geometric Brownian motion. Here, I = (0, ∞) and the generator is

    A = (σ²/2) x² d²/dx² + ρx d/dx.

4. A diffusion approximation of a neutral 2 allele model with scaled mutation rates µ0 and µ1 is

    A = (1/2) x(1 − x) d²/dx² + (µ0(1 − x) − µ1 x) d/dx,  I = [0, 1].

(Here mutation pushes the allele frequency away from the boundary points 0 and 1.)
Exercise 4.133. 1. Find a scale function and the speed measure for the four examples above.
2. Show that a normal distribution is stationary for the Ornstein-Uhlenbeck process.
3. Show that a beta distribution is stationary for the diffusion in example 4.
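As a numerical sanity check for item 2 of the exercise (parameter values are illustrative), the exact Gaussian one-step transition of the Ornstein-Uhlenbeck process preserves the claimed stationary law N(0, σ²/(2ρ)):

```python
import numpy as np

# Sketch (illustrative sigma, rho): start a large sample in N(0, sigma**2/(2*rho))
# and apply the exact OU transition X_{t+h} = e^{-rho h} X_t + Gaussian noise.
# The empirical mean and variance should be unchanged after many steps.
rng = np.random.default_rng(1)
sigma, rho, h = 1.0, 0.5, 0.1
var_stat = sigma**2 / (2 * rho)            # claimed stationary variance = 1.0

x = rng.normal(0.0, np.sqrt(var_stat), size=200_000)
for _ in range(50):
    x = np.exp(-rho * h) * x + rng.normal(
        0.0, np.sqrt(var_stat * (1 - np.exp(-2 * rho * h))), size=x.size)

emp_mean, emp_var = x.mean(), x.var()      # should stay near 0 and var_stat
```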

4.10.4 Boundary Behavior


We now have a good understanding of the behavior of a regular diffusion on int(I). We conclude by examining
its behavior at the boundary.

Definition 4.134. Let I = [r−, r+] ∩ R. The boundary point r+ is called accessible if there exist t > 0 and x ∈ int(I) such that

    inf_{y∈(x,r+)} P_x{τ_y ≤ t} > 0.

If r+ is inaccessible, call it entrance if there exist t > 0 and y ∈ int(I) such that

    lim_{x→r+−} P_x{τ_y ≤ t} > 0.

Call it natural if for all t > 0 and y ∈ int(I),

    lim_{x→r+−} P_x{τ_y ≤ t} = 0.

An accessible boundary may or may not be regular. If it is not, then call the boundary point an exit. Corresponding statements hold for r−.

In schematic form, we have

                  r+ → int(I)   r+ ↛ int(I)
    int(I) → r+     regular       exit
    int(I) ↛ r+     entrance      natural

Theorem 4.135. Fix x0 ∈ I and define

    u(x) = ∫_{x0}^x s(y) m(dy).

Then r+ is accessible if and only if u(r+) < ∞.


Proof. Suppose r+ is accessible and let J = (x0, r+). By the lemma, for y ∈ J,

    ∞ > E_y τ_J = ∫_{x0}^{r+} G_J(y, z) m(dz)
      ≥ ((s(r+) − s(y))/(s(r+) − s(x0))) ∫_y^{r+} (s(z) − s(x0)) m(dz)
      ≥ ((s(r+) − s(y))/(s(r+) − s(x0))) (u(r+) − u(y) − s(x0)m((y, r+))).

Because m((y, r+)) < ∞ and u(y) < ∞, we have that u(r+) < ∞.

Conversely, assume that u(r+) < ∞ and choose x ∈ (x0, r+). Note that E_x τ_{(x0,y)} is nondecreasing as y increases, so

    E_x τ_{(x0,r+)} = lim_{y→r+−} E_x τ_{(x0,y)} = lim_{y→r+−} ∫_{x0}^y G_{(x0,y)}(x, z) m(dz),

and

    ∫_{x0}^y G_{(x0,y)}(x, z) m(dz) = (2(s(y) − s(x))/(s(y) − s(x0))) ∫_{x0}^x (s(z) − s(x0)) m(dz)
      + (2(s(x) − s(x0))/(s(y) − s(x0))) ∫_x^y (s(y) − s(z)) m(dz).

Because u(r+) < ∞, the limit as y → r+− is finite, and it tends to zero as x → r+−. Thus,

    lim_{x→r+−} E_x τ_{(x0,r+)} = 0

and r+ is accessible.

5 Stochastic Integrals
In this section we develop the theory of stochastic integrals with the primary intent to apply them to the
study of a class of Markov processes having continuous sample paths.
The theory of the Riemann-Stieltjes integral applies readily to continuous functions having finite variation on compact intervals. Thus, we can readily define the stochastic integral

    (X · V)_t(ω) = ∫_0^t X_s(ω) dV_s(ω)

for
1. progressively measurable continuous V having finite variation on compact intervals, and
2. progressively measurable continuous X bounded on compact intervals.
To extend this to continuous martingales, we must take into account the following.

Exercise 5.1. Let M be a martingale with EM_t² < ∞ for all t > 0. Then, for s, t > 0,

    Var(M_{s+t}|F_t) = E[M_{s+t}²|F_t] − M_t² = E[(M_{s+t} − M_t)²|F_t].

Thus

    E[(M_{s+t} − M_t)²] = E[M_{s+t}² − M_t²].
Definition 5.2. A partition Π of [0, T] is a strictly increasing sequence of real numbers 0 = t_0 < t_1 < · · · < t_n = T. A partition Π of [0, ∞) is a strictly increasing unbounded sequence of real numbers 0 = t_0 < t_1 < · · · < t_k < · · · with lim_{k→∞} t_k = ∞. The mesh of Π is mesh(Π) = sup{t_{k+1} − t_k; k ≥ 0}.
For any process X and partition Π = {t_0 < t_1 < · · · < t_k < · · · } of [0, ∞), define the simple process

    X_t^Π = Σ_{k=0}^∞ X_{t_k} I_{[t_k, t_{k+1})}(t).

Theorem 5.3. A continuous martingale having finite variation on compact intervals is constant.

Proof. By considering the martingale M̃_t = M_t − M_0, we can assume that M_0 = 0. In addition, let σ_c be the time that the variation of M reaches c. Then by considering the martingale M^{σ_c}, we can assume that M has variation bounded above by c.

Now, let Π = {0 = t_0 < t_1 < · · · < t_{n−1} < t_n = t} be a partition of [0, t]. Then, because the total variation is at most c,

    EM_t² = Σ_{j=1}^n E[M_{t_j}² − M_{t_{j−1}}²] = Σ_{j=1}^n E[(M_{t_j} − M_{t_{j−1}})²] ≤ c E[max_{1≤j≤n} |M_{t_j} − M_{t_{j−1}}|].

The random variable max_{1≤j≤n} |M_{t_j} − M_{t_{j−1}}| is bounded above by c and converges to 0 almost surely as the mesh of the partition tends to 0. Thus, EM_t² = 0 and M_t = 0 a.s.

5.1 Quadratic Variation


Definition 5.4. For a partition Π = {0 = t_0 < t_1 < · · · < t_k < · · · } of [0, ∞), the quadratic variation process of a stochastic process X along Π is

    Q_t^Π(X) = Σ_{j=0}^∞ (X_{min{t,t_{j+1}}} − X_{min{t,t_j}})².

We say that X has finite quadratic variation if there exists a process ⟨X, X⟩ such that

    Q_t^Π(X) →^P ⟨X, X⟩_t as mesh(Π) → 0.

Theorem 5.5. A continuous bounded martingale M has finite quadratic variation ⟨M, M⟩_t that is the unique continuous increasing adapted process vanishing at zero such that

    M_t² − ⟨M, M⟩_t

is a martingale.

Proof. The uniqueness of ⟨M, M⟩ is easy: let A and B be two such processes. Then

    A_t − B_t = (M_t² − B_t) − (M_t² − A_t)

is the difference of two martingales and hence is itself a martingale. Because it has finite variation on compact intervals and vanishes at zero, it is constant, hence zero.
For existence, we will prove:

Theorem 5.6. If {Πn; n ≥ 1} is a sequence of partitions of [0, ∞) with mesh(Πn) → 0, then

    lim_{n→∞} E[sup_{0≤t≤T} (Q_t^{Πn}(M) − ⟨M, M⟩_t)²] = 0.

Exercise 5.7. Let B denote standard one-dimensional Brownian motion. With the partitions Πn described as above, show that Q_t^{Πn}(B) → t in L² as n → ∞.
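A Monte Carlo sketch of the exercise (the sample sizes are illustrative): along the uniform n-step partition of [0, 1], the L² error E[(Q_t^{Πn}(B) − t)²] equals 2t²/n exactly, so it shrinks like 1/n:

```python
import numpy as np

# Estimate E[(Q_t(B) - t)^2] over uniform partitions of [0, t] with n steps.
# For Brownian increments, this expectation is exactly 2*t**2/n.
rng = np.random.default_rng(2)
t, n_paths = 1.0, 5000

def qv(n):
    """Quadratic variation sum of n_paths Brownian paths over n uniform steps."""
    incr = rng.normal(0.0, np.sqrt(t / n), size=(n_paths, n))
    return (incr**2).sum(axis=1)

l2_err = [np.mean((qv(n) - t)**2) for n in (10, 100, 1000)]
# theory: roughly 0.2, 0.02, 0.002
```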
n
For this, we will introduce a second sequence of processes I^{Πn}, which we shall later recognize as approximations to the stochastic integral ∫_0^t M_s dM_s. Write Πn = {0 = t_0^n < t_1^n < · · · < t_k^n < · · · }; then

    I_t^{Πn} = Σ_{k=0}^∞ M_{min{t,t_k^n}}(M_{min{t,t_{k+1}^n}} − M_{min{t,t_k^n}}).

Exercise 5.8. 1.

    M_t² − M_0² = Q_t^{Πn}(M) + 2I_t^{Πn}.    (5.1)

2. I^{Πn} is a mean zero martingale. Consequently, M_t² − M_0² − Q_t^{Πn}(M) is a mean zero martingale.

Lemma 5.9. Choose c ≥ sup_{t≥0} |M_t|. Then

    E[(I_t^{Πn})²] ≤ c⁴.

Proof. The quadratic variation of I^{Πn} along Πn is

    Q_t^{Πn}(I^{Πn}) = Σ_{k=0}^∞ M_{min{t,t_k^n}}²(M_{min{t,t_{k+1}^n}} − M_{min{t,t_k^n}})² ≤ c² Q_t^{Πn}(M).

Thus, by the exercise,

    E[(I_t^{Πn})²] = E[Q_t^{Πn}(I^{Πn})] ≤ c² E[Q_t^{Πn}(M)] = c² E[M_t² − M_0²] ≤ c⁴.

Lemma 5.10. With c ≥ sup_{t≥0} |M_t|,

    lim_{m,n→∞} E[sup_{0≤t≤T} (I_t^{Πn} − I_t^{Πm})²] = 0.    (5.2)

Proof. Write

    Πn ∪ Πm = {u_0 < u_1 < · · · < u_k < · · · }.

Then

    I_t^{Πn} − I_t^{Πm} = Σ_{k=0}^∞ (M_{min{t,u_k}}^{Πn} − M_{min{t,u_k}}^{Πm})(M_{min{t,u_{k+1}}} − M_{min{t,u_k}})

is a mean zero martingale, and therefore (I_t^{Πn} − I_t^{Πm})² − Q_t^{Πn∪Πm}(I^{Πn} − I^{Πm}) is, by the exercise, a mean zero martingale. Thus,

    Q_t^{Πn∪Πm}(I^{Πn} − I^{Πm}) = Σ_{k=0}^∞ (M_{min{t,u_k}}^{Πn} − M_{min{t,u_k}}^{Πm})²(M_{min{t,u_{k+1}}} − M_{min{t,u_k}})²
      ≤ sup_{0≤s≤t} (M_s^{Πn} − M_s^{Πm})² Q_t^{Πn∪Πm}(M).

By Doob's inequality, and then by the Cauchy-Schwarz inequality, we have

    E[sup_{0≤t≤T} (I_t^{Πn} − I_t^{Πm})²] ≤ 4E[(I_T^{Πn} − I_T^{Πm})²] = 4E[Q_T^{Πn∪Πm}(I^{Πn} − I^{Πm})]
      ≤ 4E[sup_{0≤t≤T} (M_t^{Πn} − M_t^{Πm})⁴]^{1/2} E[Q_T^{Πn∪Πm}(M)²]^{1/2}.

In addition,

    E[Q_T^{Πn∪Πm}(M)²] ≤ 2(E[(M_T² − M_0²)²] + 4E[(I_T^{Πn∪Πm})²]) ≤ 12c⁴.

Finally, sup_{0≤t≤T} (M_t^{Πn} − M_t^{Πm})⁴ is bounded and converges to zero almost surely as m, n → ∞. Thus, by the bounded convergence theorem, its expectation converges to zero and (5.2) holds.
Returning to the proof of the theorem.

Proof. The continuous processes I^{Πn} are Cauchy in the norm ||X − Y||_{2,T} = E[sup_{0≤t≤T} (X_t − Y_t)²]^{1/2}; thus, by (5.1), so is Q^{Πn}(M). Because the space is complete and the convergence is uniform almost surely along a subsequence, the limit Q(M) is continuous. Choose, if necessary, a subsequence that converges almost surely.

Write Π̃n = ∪_{k=1}^n Πk. Then for any s, t ∈ Π̃n with s < t,

    Q_t^{Π̃n}(M) ≥ Q_s^{Π̃n}(M).

Consequently, Q(M) is nondecreasing on the dense set ∪_{k=1}^∞ Πk. Because Q(M) is continuous, it is nondecreasing. Finally, the fact that M_t² − Q_t(M) is a martingale can be seen by taking limits in the identities

    E[M_t² − Q_t^{Π̃n}(M); A] = E[M_s² − Q_s^{Π̃n}(M); A],  A ∈ F_s, s < t.

We now extend this theorem to the broader class of local martingales.


Proposition 5.11. Let M be a bounded continuous martingale and let τ be a stopping time. Then

    ⟨M^τ, M^τ⟩_t = ⟨M, M⟩_t^τ.

Proof. By the optional sampling theorem,

    (M^τ)² − ⟨M, M⟩^τ

is a martingale. Now use the uniqueness of the quadratic variation process.


Theorem 5.12. A continuous local martingale M has associated to it ⟨M, M⟩_t, the unique continuous increasing adapted process vanishing at zero so that

    M_t² − ⟨M, M⟩_t

is a local martingale. In addition, for any T > 0 and any sequence of partitions {Πn; n ≥ 1} of [0, ∞) with mesh(Πn) → 0,

    sup_{0≤t≤T} |Q_t^{Πn}(M) − ⟨M, M⟩_t| →^P 0

as n → ∞.
Proof. Choose {τm; m ≥ 1} to be a reducing sequence for M so that each M^{τm} I_{{τm>0}} is bounded. Let Q^m(M) denote the unique continuous increasing process with Q_0^m(M) = 0 so that

    (M^{τm})² I_{{τm>0}} − Q^m(M)

is a bounded martingale. By the uniqueness property and the proposition above,

    (Q^{m+1}(M))^{τm} = Q^m(M)

on {τm > 0}. This uniquely defines ⟨M, M⟩ on [0, τm].

For the second statement, let ε, δ > 0. Choose m0 so that

    P{τ_{m0} ≤ T} < δ.

Then, on the event {τ_{m0} > T}, we have for t ≤ T

    Q_t^{Πn}(M^{τ_{m0}}) = Q_t^{Πn}(M) and ⟨M^{τ_{m0}}, M^{τ_{m0}}⟩_t = ⟨M, M⟩_t.

Consequently,

    P{sup_{0≤t≤T} |Q_t^{Πn}(M) − ⟨M, M⟩_t| > ε} ≤ δ + P{sup_{0≤t≤T} |Q_t^{Πn}(M^{τ_{m0}}) − ⟨M^{τ_{m0}}, M^{τ_{m0}}⟩_t| > ε}.

The last term tends to 0 as mesh(Πn) → 0.


For any quadratic form, we have a polarization identity to create a bilinear form. That is the motivation
behind the next theorem.

Theorem 5.13. Let M and N be two continuous local martingales. Then there is a unique continuous adapted process ⟨M, N⟩ of bounded variation vanishing at zero so that

    M_t N_t − ⟨M, N⟩_t

is a local martingale. In addition, set, for a partition Π = {0 = t_0 < t_1 < · · · < t_k < · · · } with lim_{k→∞} t_k = ∞,

    C_t^Π(M, N) = Σ_{j=0}^∞ (M_{min{t,t_{j+1}}} − M_{min{t,t_j}})(N_{min{t,t_{j+1}}} − N_{min{t,t_j}}).

Then, for any T > 0 and any sequence of partitions {Πn; n ≥ 1} of [0, ∞) with mesh(Πn) → 0,

    sup_{0≤t≤T} |C_t^{Πn}(M, N) − ⟨M, N⟩_t| →^P 0

as n → ∞.

Proof. Set

    ⟨M, N⟩_t = (1/4)(⟨M + N, M + N⟩_t − ⟨M − N, M − N⟩_t)

and use an appropriate reducing sequence. Uniqueness follows as before.
Definition 5.14. We will call ⟨M, N⟩ the bracket of M and N and ⟨M, M⟩ the increasing process associated with M.

Exercise 5.15. 1. For a stopping time τ , hM τ , N τ i = hM, N τ i = hM, N iτ .


2. hM, M i = 0 if and only if Mt = M0 a.s. for every t.
3. The intervals of constancy of M and ⟨M, M⟩ are the same. (Hint: For constancy on [a, b], consider the martingale θ_a M_{min{t,b−a}}.)

4. The brackets process is positive definite, symmetric and bilinear.


5. If M and N are independent, then hM, N i is zero.

For any positive definite, symmetric and bilinear form, we have:

Proposition 5.16 (Cauchy-Schwarz). Let M and N be continuous local martingales. Then

    |⟨M, N⟩_t − ⟨M, N⟩_s| ≤ (⟨M, M⟩_t − ⟨M, M⟩_s)^{1/2} (⟨N, N⟩_t − ⟨N, N⟩_s)^{1/2}.

Soon, we shall develop the stochastic integral. We will begin with a class of simple functions and extend
using a Hilbert space isometry. The Kunita-Watanabe inequalities are central to this extension.

Proposition 5.17. Let M and N be local martingales and let H and K be measurable processes. Then

    |∫_0^t H_s K_s d⟨M, N⟩_s| ≤ (∫_0^t H_s² d⟨M, M⟩_s)^{1/2} (∫_0^t K_s² d⟨N, N⟩_s)^{1/2}.

Proof. By the standard machine, it suffices to consider H and K of the form

    H = H_0 I_{{0}} + Σ_{j=1}^n H_j I_{(t_{j−1},t_j]},  K = K_0 I_{{0}} + Σ_{j=1}^n K_j I_{(t_{j−1},t_j]}

with 0 = t_0 < · · · < t_{n−1} < t_n = t. Use the Cauchy-Schwarz inequality for sums:

    |∫_0^t H_s K_s d⟨M, N⟩_s| ≤ Σ_{j=1}^n |H_j K_j| |⟨M, N⟩_{t_j} − ⟨M, N⟩_{t_{j−1}}|
      ≤ Σ_{j=1}^n |H_j K_j| (⟨M, M⟩_{t_j} − ⟨M, M⟩_{t_{j−1}})^{1/2} (⟨N, N⟩_{t_j} − ⟨N, N⟩_{t_{j−1}})^{1/2}
      ≤ (Σ_{j=1}^n H_j²(⟨M, M⟩_{t_j} − ⟨M, M⟩_{t_{j−1}}))^{1/2} (Σ_{j=1}^n K_j²(⟨N, N⟩_{t_j} − ⟨N, N⟩_{t_{j−1}}))^{1/2}
      = (∫_0^t H_s² d⟨M, M⟩_s)^{1/2} (∫_0^t K_s² d⟨N, N⟩_s)^{1/2}.

Applying the Hölder inequality yields the following.

Corollary 5.18 (Kunita-Watanabe inequality). Let p and q be conjugate exponents. Then

    E[|∫_0^t H_s K_s d⟨M, N⟩_s|] ≤ ||(∫_0^t H_s² d⟨M, M⟩_s)^{1/2}||_{L^p} ||(∫_0^t K_s² d⟨N, N⟩_s)^{1/2}||_{L^q}.

We now introduce the class of processes that will become the stochastic integrators.

Definition 5.19. A continuous F_t-semimartingale is a continuous process which can be written

    X = M + V    (5.3)

where M is a continuous local martingale and V is a continuous adapted process having finite variation on bounded time intervals.
Exercise 5.20. 1. Show that V has zero quadratic variation.
2. Using the notation above show that hM, V i = limmesh(Πn )→0 C Πn (M, V ) = 0.

3. If V0 = 0, show that the representation in (5.3) is unique.


Thus, if X = M + V and Y = N + W are semimartingales written as above, with M and N local martingales and V and W processes having bounded variation on bounded intervals, then

    ⟨X, Y⟩ = ⟨M, N⟩.

5.2 Definition of the Stochastic Integral


Stochastic integration will proceed using the standard machine. The extension of the stochastic integral in the limiting process will be based on a Hilbert space isometry. Here are the spaces of interest.

Definition 5.21. 1. Define H² to be the space of L²-bounded martingales. These are martingales M such that

    sup_{t≥0} EM_t² < ∞.

Use H_0² for those elements M ∈ H² with M_0 = 0.

2. For M ∈ H², define L²(M) to be the space of progressively measurable processes K such that

    ||K||_M² = E[(K² · ⟨M, M⟩)_∞] = E[∫_0^∞ K_s² d⟨M, M⟩_s] < ∞.

If M ∈ H², then M is a uniformly integrable martingale; its limit M_∞ exists and is in L²(P). Thus we can define the norm

    ||M||_{H²}² = EM_∞².

Proposition 5.22. A continuous local martingale M is in H_0² if and only if E⟨M, M⟩_∞ < ∞. In this case, M² − ⟨M, M⟩ is a uniformly integrable martingale.

Proof. Let {τn; n ≥ 1} be a sequence that reduces M. Then,

    E[M_{min{t,τn}}² I_{{τn>0}}] − E[⟨M, M⟩_{min{t,τn}} I_{{τn>0}}] = E[M_0² I_{{τn>0}}].

Because M_∞ ∈ L²(P), we can use the dominated and monotone convergence theorems to conclude that

    E[M_∞²] − E[⟨M, M⟩_∞] = E[M_0²]

and E⟨M, M⟩_∞ < ∞.

Conversely, by the same equality,

    E[M_{min{t,τn}}² I_{{τn>0}}] ≤ E[⟨M, M⟩_∞] + E[M_0²] ≤ C.

By Fatou's lemma,

    EM_t² ≤ lim inf_{n→∞} E[M_{min{t,τn}}² I_{{τn>0}}] ≤ C.

Thus, {M_{min{t,τn}} I_{{τn>0}}; n ≥ 1, t ≥ 0} is uniformly integrable. Take s < t and A ∈ F_s; then

    E[M_t; A] = lim_{n→∞} E[M_{min{t,τn}} I_{{τn>0}}; A] = lim_{n→∞} E[M_{min{s,τn}} I_{{τn>0}}; A] = E[M_s; A]

and M is a martingale.

Finally, note that

    sup_{t≥0} |M_t² − ⟨M, M⟩_t| ≤ (M²)_∞^* + ⟨M, M⟩_∞,

an integrable random variable, and thus {M_t² − ⟨M, M⟩_t; t ≥ 0} is uniformly integrable.


Note that, for M ∈ H_0², ||M||_{H²}² = E⟨M, M⟩_∞.
A simple process has the form

    K = K_0 I_{{0}} + Σ_{j=1}^n K_j I_{(t_{j−1},t_j]}.

To be progressive, we must have that K_j is F_{t_{j−1}}-measurable. The stochastic integral is

    (K · M) = ∫ K_s dM_s = Σ_{j=1}^n K_j(M_{t_j} − M_{t_{j−1}}).

Note that

    E(K · M)² = E[Σ_{j=1}^n Σ_{k=1}^n K_j(M_{t_j} − M_{t_{j−1}}) K_k(M_{t_k} − M_{t_{k−1}})].

If j < k,

    E[K_j(M_{t_j} − M_{t_{j−1}}) K_k(M_{t_k} − M_{t_{k−1}})] = E[K_j(M_{t_j} − M_{t_{j−1}}) K_k E[M_{t_k} − M_{t_{k−1}}|F_{t_{k−1}}]] = 0.

Thus,

    E(K · M)² = E[Σ_{j=1}^n K_j²(M_{t_j} − M_{t_{j−1}})²] = E[Σ_{j=1}^n K_j² E[(M_{t_j} − M_{t_{j−1}})²|F_{t_{j−1}}]]
      = E[Σ_{j=1}^n K_j² E[M_{t_j}² − M_{t_{j−1}}²|F_{t_{j−1}}]] = E[Σ_{j=1}^n K_j² E[⟨M, M⟩_{t_j} − ⟨M, M⟩_{t_{j−1}}|F_{t_{j−1}}]]
      = E[Σ_{j=1}^n K_j²(⟨M, M⟩_{t_j} − ⟨M, M⟩_{t_{j−1}})] = E[∫_0^∞ K_s² d⟨M, M⟩_s] = ||K||_M².

Thus, the mapping

    K → K · M

is a Hilbert space isometry from L²(M) into H_0².
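The isometry computation above can be illustrated numerically with M a Brownian motion (so ⟨M, M⟩_t = t) and the simple adapted integrand K_j = B_{t_{j−1}}; the sample sizes below are illustrative:

```python
import numpy as np

# Check E[(K . M)^2] = E[sum_j K_j^2 (<M,M>_{t_j} - <M,M>_{t_{j-1}})] for M = B,
# a Brownian motion on a uniform grid, with the adapted choice K_j = B_{t_{j-1}}.
rng = np.random.default_rng(3)
n_paths, n, t = 50_000, 50, 1.0
dt = t / n

incr = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
B = np.cumsum(incr, axis=1)
B_prev = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])  # B_{t_{j-1}}, adapted

stoch_int = (B_prev * incr).sum(axis=1)       # (K . B)_t for the simple K
lhs = np.mean(stoch_int**2)                   # E[(K . B)^2]
rhs = np.mean((B_prev**2).sum(axis=1) * dt)   # E[sum_j K_j^2 * dt]
# both equal sum_j t_{j-1} dt = t^2 (n-1)/(2n) = 0.49 here
```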

Exercise 5.23. Let N and M be square integrable martingales and let H and K be simple adapted functions,
then
E[(K · M )(H · N )] = E[(KH · hM, N i)].
Theorem 5.24. Let M ∈ H_0². Then for each K ∈ L²(M), there is a unique element K · M ∈ H_0² such that, for all N ∈ H_0²,

    ⟨K · M, N⟩ = K · ⟨M, N⟩.

Proof. To verify uniqueness, note that if M¹ and M² are two martingales in H_0² such that

    ⟨M¹, N⟩ = ⟨M², N⟩ for every N ∈ H_0²,

then

    ⟨M¹ − M², M¹ − M²⟩ = 0

and M¹ − M² is constant, hence 0.

By the Kunita-Watanabe inequality,

    |E[∫_0^∞ K_s d⟨M, N⟩_s]| ≤ ||N||_{H²} ||K||_{L²(M)}.

Thus the map

    N → E[(K · ⟨M, N⟩)_∞]

is a continuous linear functional on H_0². Consequently, there exists an element K · M ∈ H_0² so that

    E[(K · M)_∞ N_∞] = E[(K · ⟨M, N⟩)_∞].

Claim. (K · M)N − K · ⟨M, N⟩ is a martingale.

Recall that elements of H_0² are uniformly integrable. Thus, for any stopping time τ, we have

    E[(K · M)_τ N_τ] = E[E[(K · M)_∞|F_τ] N_τ] = E[(K · M)_∞ N_τ]
      = E[(K · M)_∞ N_∞^τ] = E[(K · ⟨M, N^τ⟩)_∞]
      = E[(K · ⟨M, N⟩^τ)_∞] = E[(K · ⟨M, N⟩)_τ].

By the uniqueness of the covariation process, we have that ⟨K · M, N⟩ = K · ⟨M, N⟩.



This also shows that

    ||K · M||_{H²}² = E[(K · M)_∞²] = E[(K² · ⟨M, M⟩)_∞] = ||K||_{L²(M)}²

and thus

    K → K · M

is an isometry.

Definition 5.25. The martingale K · M is called the Itô stochastic integral of K with respect to M. It is often also denoted by

    (K · M)_t = ∫_0^t K_s dM_s.

From our knowledge of Riemann-Stieltjes integrals, we have that

    J_t = ∫_0^t K_s d⟨M, N⟩_s implies ∫_0^t H_s dJ_s = ∫_0^t H_s K_s d⟨M, N⟩_s.

We would like a similar identity for stochastic integrals, namely

    Y_t = ∫_0^t K_s dM_s implies ∫_0^t H_s dY_s = ∫_0^t H_s K_s dM_s.

This is the content of the next theorem.

Proposition 5.26. Let K ∈ L²(M) and H ∈ L²(K · M). Then

    (HK) · M = H · (K · M).

Proof. Apply the theorem above twice to see that ⟨K · M, K · M⟩ = K² · ⟨M, M⟩. Thus, we have HK ∈ L²(M). For N ∈ H_0², we also have

    ⟨(HK) · M, N⟩ = (HK) · ⟨M, N⟩ = H · (K · ⟨M, N⟩) = H · ⟨K · M, N⟩ = ⟨H · (K · M), N⟩.

Now use the uniqueness in the definition of the stochastic integral.


Proposition 5.27. Let τ be a stopping time, then

K · M τ = KI[0,τ ] · M = (K · M )τ .

Proof. Claim. M τ = I[0,τ ] · M .


Choose N ∈ H02 ,
hM τ , N i = hM, N iτ = I[0,τ ] · hM, N i = hI[0,τ ] · M, N i.

Now use the preceding proposition,

K · M τ = K · (I[0,τ ] · M ) = KI[0,τ ] · M = I[0,τ ] K · M = I[0,τ ] · (K · M ) = (K · M )τ ,

which completes the proof.



We now use localization to extend the definition of the stochastic integral.

Definition 5.28. Let M be a continuous local martingale. Then L²_loc(M) is the space of progressively measurable processes K for which there exists an increasing sequence of stopping times {τn; n ≥ 1} increasing to infinity so that

    E[∫_0^{τn} K_s² d⟨M, M⟩_s] < ∞.

Exercise 5.29. For any K ∈ L2loc (M ), there exists a continuous local martingale K · M such that for any
continuous local martingale N ,
hK · M, N i = K · hM, N i.
Definition 5.30. 1. Call a progressively measurable process K locally bounded if there exists a sequence of stopping times {τn; n ≥ 1} increasing to infinity and constants C_n such that |K^{τn}| ≤ C_n.

2. Let K be locally bounded and let X = M + V be a continuous semimartingale, X_0 = 0. The Itô stochastic integral of K with respect to X is the continuous semimartingale

    K · X = K · M + K · V

where K · M is the stochastic integral defined above and K · V is the usual Stieltjes integral.
Exercise 5.31. The map K → K · X has the following properties:

1. H · (K · X) = (HK) · X.

2. (K · X)^τ = (K I_{[0,τ]}) · X = K · X^τ.

3. If X is a local martingale or a process of finite variation, so is K · X.

4. If K is a progressively measurable simple function,

    K = K_0 I_{{0}} + Σ_{j=1}^∞ K_j I_{(t_{j−1},t_j]},  0 = t_0 < t_1 < · · · ,  lim_{j→∞} t_j = ∞,

then

    (K · X)_t = Σ_{j=1}^∞ K_j(X_{min{t,t_j}} − X_{min{t,t_{j−1}}}).

Theorem 5.32 (bounded convergence theorem). Let X be a continuous semimartingale and let {K n ; n ≥ 1}
be locally bounded processes converging to zero pointwise with |K n | ≤ K for some locally bounded process K,
then for any T > 0 and  > 0,
lim P { sup |(K n · X)t | > } = 0.
n→∞ 0≤t≤T

Proof. If X is a process of finite variation, then we can apply the ordinary bounded convergence theorem for each ω ∈ Ω.

If X is a local martingale and τ reduces X, then by the bounded convergence theorem, (K^n)^τ converges to zero in L²(X^τ) and thus (K^n · X)^τ converges to zero in H². Now repeat the argument used for the convergence of the quadratic variation over a partition for a local martingale.

Exercise 5.33. Let K be left-continuous and let {Πn; n ≥ 1} be a sequence of partitions of [0, ∞), with Πn = {0 = t_0^n < t_1^n < · · · } and mesh(Πn) → 0. Then, for any ε > 0,

    lim_{n→∞} P{sup_{0≤t≤T} |(K · X)_t − Σ_{j=1}^∞ K_{t_{j−1}^n}(X_{min{t,t_j^n}} − X_{min{t,t_{j−1}^n}})| > ε} = 0.

Exercise 5.34. 1. Let X be a continuous semimartingale and let the stochastic integrals H · X and K · X exist. Then for a, b ∈ R,

    ∫_0^t (aH_s + bK_s) dX_s = a ∫_0^t H_s dX_s + b ∫_0^t K_s dX_s.

2. Let X and X̃ be two continuous semimartingales and let K and K̃ be two predictable locally bounded
processes. Prove that if the finite dimensional distributions of (K, X) and (K̃, X̃) agree, then so do the
finite dimensional distributions of (K, X, K · X) and (K̃, X̃, K̃ · X̃).

5.3 The Itô Formula


The Itô formula is the change of variables formula for stochastic integrals. The formula gives explicitly the decomposition of a function of a semimartingale as the sum of a local martingale and a process of bounded variation.

Proposition 5.35. Let X be a continuous semimartingale, then


    X_t² = X_0² + 2∫_0^t X_s dX_s + ⟨X, X⟩_t.

Proof. Let {Πn; n ≥ 1} be a sequence of partitions of [0, ∞), Πn = {0 = t_0^n < t_1^n < · · · }, mesh(Πn) → 0. Expand to obtain

    Σ_{j=1}^∞ (X_{min{t,t_j}} − X_{min{t,t_{j−1}}})² = X_t² − X_0² − 2 Σ_{j=1}^∞ X_{min{t,t_{j−1}}}(X_{min{t,t_j}} − X_{min{t,t_{j−1}}}).

Then,

    Σ_{j=1}^∞ (X_{min{t,t_j}} − X_{min{t,t_{j−1}}})² → ⟨X, X⟩_t

and

    Σ_{j=1}^∞ X_{min{t,t_{j−1}}}(X_{min{t,t_j}} − X_{min{t,t_{j−1}}}) → ∫_0^t X_s dX_s

uniformly in probability on compact intervals.


Corollary 5.36 (Integration by parts formula). Let X and Y be continuous semimartingales, then

    X_t Y_t = X_0 Y_0 + ∫_0^t X_s dY_s + ∫_0^t Y_s dX_s + ⟨X, Y⟩_t.

Proof. The corollary follows from polarization of the identity in the previous proposition.
Example 5.37. Let B be standard Brownian motion. Then, by the previous proposition with ⟨B, B⟩_t = t,

    B_t² − t = 2∫_0^t B_s dB_s.
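A pathwise sketch (one simulated path; the step count is illustrative): the left-point Riemann sums approximate the stochastic integral, and on the discrete grid the gap in the identity is exactly Σ(ΔB)² − t, which is small for fine partitions:

```python
import numpy as np

# One Brownian path on a fine grid: compare B_t^2 - t with 2 * (left-point
# Riemann sum approximating int_0^t B_s dB_s).
rng = np.random.default_rng(4)
n, t = 100_000, 1.0
dt = t / n

incr = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.cumsum(incr)
B_prev = np.concatenate([[0.0], B[:-1]])       # B_{t_{j-1}} at each step

ito_sum = (B_prev * incr).sum()                # approximates int_0^t B_s dB_s
identity_gap = abs((B[-1]**2 - t) - 2 * ito_sum)
# algebraically, identity_gap = |sum(incr**2) - t|, of order t*sqrt(2/n)
```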

Definition 5.38. A d-dimensional vector (continuous) local martingale is an Rd -valued stochastic process
X = (X 1 , . . . , X d ) such that each component is a (continuous) local martingale. A complex (continuous)
local martingale is one in which both the real and imaginary parts are (continuous) local martingales.
Theorem 5.39 (Itô formula). Let f ∈ C²(R^d, R) and let X = (X¹, . . . , X^d) be a vector continuous semimartingale. Then f ◦ X is a continuous semimartingale and

    f(X_t) = f(X_0) + Σ_{i=1}^d ∫_0^t (∂f/∂x_i)(X_s) dX_s^i + (1/2) Σ_{i=1}^d Σ_{j=1}^d ∫_0^t (∂²f/∂x_i∂x_j)(X_s) d⟨X^i, X^j⟩_s.

Thus, the class of semimartingales is closed under composition with C 2 -functions.


Proof. If the formula holds for f , then, by the integration by parts formula, the formula holds for g(x) =
xk f (x). Thus, the Itô formula holds for polynomials. Now, for any compact set K, stop the process at τK .
By the Stone-Weierstrass theorem, any f ∈ C 2 (K, R) can be uniformly approximated by polynomials. Thus
the formula holds up to time τK by the stochastic dominated convergence theorem. Now, extend the result
to all of Rd .
From a vector semimartingale X we often write the stochastic integral

    Y_t = Y_0 + Σ_{i=1}^d ∫_0^t H_s^i dX_s^i

in differential form

    dY_t = Σ_{i=1}^d H_t^i dX_t^i.

Using this notation, the Itô formula becomes

    df(X_t) = Σ_{i=1}^d (∂f/∂x_i)(X_t) dX_t^i + (1/2) Σ_{i=1}^d Σ_{j=1}^d (∂²f/∂x_i∂x_j)(X_t) d⟨X^i, X^j⟩_t.

Example 5.40. 1. Let f : R × R⁺ → C satisfy

    ∂f/∂t + (1/2) ∂²f/∂x² = 0.

Then, if M is a local martingale, so is the process {f(M_t, ⟨M, M⟩_t); t ≥ 0}. In particular,

    E_t(λM) = exp(λM_t − (λ²/2)⟨M, M⟩_t),  λ ∈ C,

is a local martingale.
This local martingale satisfies the stochastic differential equation

dYt = λYt dMt .

2. Note that for standard Brownian motion,

    exp(λB_t − (λ²/2)t)

is a martingale. Use the standard machine to show that

    E_t^f(B) = exp(∫_0^t f(s) dB_s − (1/2)∫_0^t f(s)² ds)

is a martingale.
3. If B is a d-dimensional Brownian motion and if f ∈ C²(R⁺ × R^d), then

    f(t, B_t) − ∫_0^t ((1/2)Δf + ∂f/∂t)(s, B_s) ds

is a martingale. Thus, if f is harmonic in R^d, then f ◦ B is a local martingale. In addition, (1/2)Δ is the restriction of the generator of B to C²(R^d).
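A quick Monte Carlo check (the value of λ and the time points are illustrative) that the exponential martingale of item 2 has constant expectation 1 for Brownian motion:

```python
import numpy as np

# For each t, B_t ~ N(0, t), so E[exp(lam*B_t - lam**2*t/2)] should equal 1.
rng = np.random.default_rng(5)
lam, n_paths = 0.8, 400_000

means = []
for t in (0.5, 1.0, 2.0):
    B_t = rng.normal(0.0, np.sqrt(t), size=n_paths)
    means.append(float(np.mean(np.exp(lam * B_t - lam**2 * t / 2))))
# each entry of means should be close to 1
```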
Exercise 5.41. If B = (B¹, . . . , B^d) is d-dimensional Brownian motion, then ⟨B^i, B^j⟩_t = δ_{ij} t.
Example 5.42. If f : R^d → R is a function of the radius, then the Laplacian can be replaced by its radial component. Writing R_t = |B_t|, we have that

    f(t, R_t) − ∫_0^t (Gf + ∂f/∂t)(s, R_s) ds

is a martingale if G is the generator for the Bessel process in d dimensions.

Check that s(r) = ln r is a scale function for the Bessel process in 2 dimensions and that s(r) = −r^{2−d} is a scale function for dimension d ≥ 3. So if r_i < r < r_o, then

    P_r{τ_{r_o} > τ_{r_i}} = (s(r_o) − s(r))/(s(r_o) − s(r_i)).

Now let r_o → ∞; then for d = 2,

    P_r{τ_{r_i} < ∞} = 1.

Thus, in 2 dimensions, Brownian motion must return infinitely often to any open set containing the origin. For d ≥ 3,

    P_r{τ_{r_i} < ∞} = (r_i/r)^{d−2}

and Brownian motion has a positive probability of never returning to a given neighborhood of the origin. In short, Brownian motion is recurrent in dimension 2 and transient in dimensions higher than 2. This conforms with similar results for random walks.
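The scale-function claims can be checked by finite differences; the grid and step size below are illustrative:

```python
import numpy as np

# Verify numerically that s(r) = log(r) is harmonic for the Bessel generator
# G = (1/2) d^2/dr^2 + (d-1)/(2r) d/dr when d = 2, and that s(r) = -r**(2-d)
# is harmonic when d = 3, by applying G via central differences.
def bessel_G(s, r, d, h=1e-4):
    """Apply the Bessel generator G to the function s at the points r."""
    d2 = (s(r + h) - 2.0 * s(r) + s(r - h)) / h**2
    d1 = (s(r + h) - s(r - h)) / (2.0 * h)
    return 0.5 * d2 + (d - 1) / (2.0 * r) * d1

r = np.linspace(0.5, 5.0, 50)
resid_2d = np.max(np.abs(bessel_G(np.log, r, d=2)))
resid_3d = np.max(np.abs(bessel_G(lambda u: -u**(2 - 3), r, d=3)))
# both residuals vanish up to finite-difference error
```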

Theorem 5.43 (Lévy's characterization theorem). Assume that X is an R^d-valued stochastic process satisfying X_0 = 0. Then the following are equivalent.
1. X is d-dimensional standard Brownian motion.
2. X is a continuous local martingale and hX i , X j i = δij t.

Proof. (1 → 2) is the exercise above.

(2 → 1) Take λ = i and M = ξ · X in the exponential martingale. Then

    ⟨M, M⟩_t = |ξ|²t

and we have the local martingale

    exp(iξ · X_t + (1/2)|ξ|²t).

Because it is bounded, it is a martingale. Thus, for s < t,

    E[exp(iξ · X_t + (1/2)|ξ|²t)|F_s] = exp(iξ · X_s + (1/2)|ξ|²s),

or

    E[exp(iξ · (X_t − X_s))|F_s] = exp(−(1/2)|ξ|²(t − s)).

Thus, X has stationary and independent Gaussian increments, and X_1 has a normal distribution with the identity matrix as its covariance matrix. Thus, it is standard Brownian motion.

5.4 Stochastic Differential Equations


Definition 5.44. Let f and g be adapted functions taking values in R^{d×r} and R^d respectively and let B be a standard r-dimensional Brownian motion. Consider the stochastic differential equation

    X_t^i = X_0^i + Σ_{j=1}^r ∫_0^t f_{ij}(s, X) dB_s^j + ∫_0^t g_i(s, X) ds

or, in vector differential form,

    dX_t = f(t, X) dB_t + g(t, X) dt.

A solution is a pair (X, B) of F_t-adapted processes that satisfy this equation.
The variety of notions of equivalence for stochastic processes leads to differing definitions of uniqueness for solutions to stochastic differential equations.
Definition 5.45. 1. Let (X, B) and (X̃, B) be two solutions on the same probability space with the same
filtration and the same initial conditions. Then the stochastic differential equation above is said to
satisfy pathwise uniqueness if X and X̃ are indistinguishable.
2. Let (X, B) and (X̃, B̃) be two solutions so that the distributions of X0 and X̃0 are equal. Then the
stochastic differential equation above is said to satisfy uniqueness in law if X and X̃ are two versions
of the same process.
5 STOCHASTIC INTEGRALS 99

We will focus on pathwise uniqueness. We begin with the basic estimate used to study the dependence of a solution on its initial conditions.

Theorem 5.46 (Gronwall's inequality). Let f, g : [0, T] → R be continuous and nonnegative. Suppose

    f(t) ≤ A + ∫_0^t f(s)g(s) ds,  A ≥ 0.

Then, for t ∈ [0, T],

    f(t) ≤ A exp(∫_0^t g(s) ds).

Proof. First suppose A > 0 and set

    h(t) = A + ∫_0^t f(s)g(s) ds.

Thus h(t) > 0. Then,

    h′(t) = f(t)g(t) ≤ h(t)g(t), or h′(t)/h(t) ≤ g(t).

Integration gives

    f(t) ≤ h(t) ≤ A exp(∫_0^t g(s) ds).

For A = 0, apply this bound for each A > 0 and take limits to see that f is the zero function.
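A numerical illustration with a hypothetical pair (f, g) chosen so that the hypothesis holds by construction: f(t) = A exp(0.9∫₀ᵗ g) satisfies f′ = 0.9 g f ≤ g f, hence f(t) ≤ A + ∫₀ᵗ f g, and the Gronwall bound must then hold:

```python
import numpy as np

# Illustrative g; f is built to satisfy the Gronwall hypothesis with room to spare.
A, T, n = 2.0, 3.0, 3001
ts = np.linspace(0.0, T, n)
g = 0.5 + np.sin(ts)**2
dt = np.diff(ts)

# trapezoid rule for int_0^t g(s) ds at each grid point
G = np.concatenate([[0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * dt)])

f = A * np.exp(0.9 * G)            # then f' = 0.9 g f <= g f
fg = f * g
hyp_rhs = A + np.concatenate([[0.0], np.cumsum(0.5 * (fg[1:] + fg[:-1]) * dt)])
bound = A * np.exp(G)              # Gronwall's conclusion

hypothesis_ok = bool(np.all(f <= hyp_rhs + 1e-6))
conclusion_ok = bool(np.all(f <= bound + 1e-6))
```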
The basic hypothesis for existence and uniqueness that we will use is the following Lipschitz condition: for x, x̃ ∈ C_{R^r}[0, ∞), there exists a constant K such that for every t > 0,

    |f(t, x) − f(t, x̃)| + |g(t, x) − g(t, x̃)| ≤ K sup_{s≤t} |x_s − x̃_s|.

Theorem 5.47. Let F_t be a right-continuous complete filtration on (Ω, F, P) and let Z be a continuous r-dimensional semimartingale. If the progressive process f(t, X) satisfies the Lipschitz condition above and if for every t > 0,

    E[sup_{0≤s≤t} |∫_0^s f(r, x̄) dZ_r|²] < ∞,

where x̄ is the constant path x̄_s ≡ x_0 for some x_0 ∈ R^r, then there is a pathwise unique process X such that

    X_t = x + ∫_0^t f(s, X) dZ_s.

Proof. Note that the Lipschitz condition guarantees that if the condition above holds for one choice x_0, then it holds for all x_0 ∈ R^r.

Write the semimartingale Z = M + A in its canonical decomposition. Let |A|_t denote the variation of A up to time t.

Case 1. The measures associated to ⟨M, M⟩ and |A| are absolutely continuous with respect to Lebesgue measure, and for all i and j,

    d⟨M^i, M^j⟩_t/dt ≤ 1,  d|A|_t/dt ≤ 1.

Fix x ∈ R^r and define the mapping

    (SU)_t = x + ∫_0^t f(s, U) dZ_s

and the norm

    ||U − V||_{2,t}² = E[sup_{0≤s≤t} |U_s − V_s|²].

Then, by the Doob and Cauchy-Schwarz inequalities,

    ||SU − SV||_{2,t}² ≤ 2E[sup_{0≤s≤t} |∫_0^s (f(r, U) − f(r, V)) dM_r|²] + 2E[sup_{0≤s≤t} |∫_0^s (f(r, U) − f(r, V)) dA_r|²]
      ≤ 8E[Σ_{i=1}^r Σ_{j=1}^r ∫_0^t (f_i(r, U) − f_i(r, V))(f_j(r, U) − f_j(r, V)) d⟨M^i, M^j⟩_r]
        + 2E[|A|_t Σ_{j=1}^r ∫_0^t (f_j(r, U) − f_j(r, V))² d|A^j|_r]
      ≤ 2K²(4r² + rt) ∫_0^t ||U − V||_{2,r}² dr.

This sets us up for the Picard-Lindelöf iteration scheme. Set

    X⁰ = x, and Xⁿ = SX^{n−1},

and write

    C = 2K²(4r² + rT).

Now check, by induction, that

    ||X^{n+1} − Xⁿ||_{2,t}² ≤ ((Ct)ⁿ/n!) ||X¹ − X⁰||_{2,T}².

Thus, {Xⁿ; n ≥ 1} is a Cauchy sequence in || · ||_{2,T}. Call the limit X. Then X is a continuous process and satisfies X = SX. In other words, X is a solution.

To show that X is unique, let X̃ be a second solution. Define τ_k = inf{t > 0 : |X_t| ≥ k or |X̃_t| ≥ k}. Use Gronwall's inequality, with f(t) = ||X^{τ_k} − X̃^{τ_k}||_{2,t}² and g(t) = C, to see that X̃_0 = x implies X̃^{τ_k} = X^{τ_k}. Now, let T and k go to infinity.
Case 2. The general case.

Define the strictly increasing process

    Ā_t = t + Σ_{i=1}^r Σ_{j=1}^r ⟨M^i, M^j⟩_t + Σ_{i=1}^r |A^i|_t.

Consider the time change C_t = Ā_t^{−1} and write

    M̃_t = M_{C_t},  Ã_t = A_{C_t},  and Z̃_t = M̃_t + Ã_t.

Then Z̃ satisfies the circumstances of Case 1 with the filtration F̃_t = F_{C_t}. Thus,

    X̃_t = x + ∫_0^t f(s, X̃) dZ̃_s

has a unique solution. Now, X_t = X̃_{Ā_t} is the unique solution to the equation in the statement of the theorem.
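For a concrete Lipschitz example, the Euler-Maruyama scheme (the standard numerical method, not the Picard iteration of the proof) applied to dX = −ρX dt + σ dB can be compared against the exact Ornstein-Uhlenbeck solution driven by the same Brownian increments; all parameter values are illustrative:

```python
import numpy as np

# dX = -rho*X dt + sigma dB has the exact solution
#   X_T = e^{-rho T} X_0 + sigma * int_0^T e^{-rho (T - s)} dB_s,
# which we approximate with the same increments used by the Euler scheme.
rng = np.random.default_rng(6)
rho, sigma, x0, T, n = 1.0, 0.5, 1.0, 1.0, 10_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)

euler = x0
for k in range(n):
    euler += -rho * euler * dt + sigma * dB[k]   # one Euler-Maruyama step

t_grid = np.arange(n) * dt                       # left endpoints of the steps
exact = np.exp(-rho * T) * x0 + sigma * np.sum(np.exp(-rho * (T - t_grid)) * dB)
strong_err = abs(euler - exact)                  # O(dt) for additive noise
```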

5.5 Itô Diffusions


Definition 5.48. A continuous stochastic process X is called a time homogeneous Itô diffusion if there
exist measurable mappings
1. σ : R^d → R^{d×r} (the diffusion matrix), and
2. b : R^d → R^d (the drift),
and an r-dimensional Brownian motion B so that

    X^i_t = X^i_0 + ∑_{j=1}^r ∫_0^t σ_{ij}(X_s) dB^j_s + ∫_0^t b_i(X_s) ds

has a solution that is unique in law.
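A standard way to realize such a diffusion numerically is the Euler-Maruyama scheme. The sketch below is illustrative only: the coefficients b(x) = −x and σ(x) = 1 (a hypothetical Ornstein-Uhlenbeck type choice for d = r = 1), the step count, and the seed are assumptions, not from the text. For this drift E[X_T] = X_0 e^{−T}, which gives a check on the sample mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical coefficients for d = r = 1.
def b(x):
    return -x            # drift

def sigma(x):
    return np.ones_like(x)   # constant diffusion

# Euler-Maruyama discretization of
#   X_t = X_0 + int_0^t sigma(X_s) dB_s + int_0^t b(X_s) ds,
# run for many independent driving Brownian motions at once.
def euler_maruyama(x0, n_paths, T=1.0, n=1000):
    dt = T / n
    X = np.full(n_paths, x0)
    for _ in range(n):
        dB = rng.normal(0.0, np.sqrt(dt), n_paths)
        X = X + sigma(X) * dB + b(X) * dt
    return X

XT = euler_maruyama(2.0, n_paths=5000)
# For this drift, E[X_T] = x0 * exp(-T); compare with the sample mean.
print(XT.mean())
```

The scheme converges weakly at rate O(dt), so the sample mean should match 2e^{−1} ≈ 0.736 up to Monte Carlo and discretization error.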


If σ and b satisfy an appropriate Lipschitz hypothesis, then we have pathwise uniqueness. To show that
pathwise uniqueness suffices, we have the following.

Theorem 5.49. Let σ and b be locally bounded and Borel measurable, and let α be a probability measure on
R^d. Then pathwise uniqueness for the Itô stochastic differential equation above implies uniqueness in law.
We begin with a lemma.

Lemma 5.50. The following condition is sufficient for uniqueness in law:

Let x ∈ R^d. Whenever (X, B) and (X̃, B̃) are two solutions satisfying

    X_0 = x,  X̃_0 = x,  a.s.,

then the distributions of X and X̃ are equal.



Proof. Let P be the distribution of (X, B) on the space C_{R^{d+r}}[0, ∞) and write the pair (ξ, β) for the coordinate
functions for the first d and last r coordinates, respectively. Because C_{R^{d+r}}[0, ∞) is a Polish space, there is
a regular conditional probability for ξ given β, which we write P(B, β).
In particular, β has the distribution of an r-dimensional Brownian motion. The triples (X, B, σ(X) · B) and
(ξ, β, σ(ξ) · β) have the same distribution and thus (ξ, β) solves the stochastic differential equation.
Repeat this, letting P̃ be the distribution of (X̃, B̃). Then the regular conditional probabilities agree,

    P̃(B, β) = P(B, β),

and so if X_0 and X̃_0 have the same distribution, then P = P̃.
Theorem 5.51. If pathwise uniqueness holds for an Itô type stochastic differential equation, then uniqueness
in law holds. Moreover, there exists a measurable map

    F : C_{R^r}[0, ∞) → C_{R^d}[0, ∞)

such that F(β) is adapted to the natural filtration B^r_t = σ{β_s; s ≤ t} and, for every solution (X, B), we have
that

    X_t(ω) = F(B(ω))_t  a.s.
Proof. Let (X, B) be a solution on (Ω, P) and let (X̃, B̃) be a solution on (Ω̃, P̃) with, for some x ∈ R^d,

    P{X_0 = x} = P̃{X̃_0 = x} = 1.

For the solution (X, B), let Q be the image of P under the map

    ω → (X(ω), B(ω)).

For the solution (X̃, B̃), let Q̃ be the image of P̃ under the map

    ω̃ → (X̃(ω̃), B̃(ω̃)).
In addition, let W be Wiener measure, that is, the law of r-dimensional Brownian motion. Define a probability
measure Π on C_{R^d}[0, ∞) × C_{R^d}[0, ∞) × C_{R^r}[0, ∞) by

    Π(dξ, dξ̃, dβ) = Q(β, dξ) Q̃(β, dξ̃) W(dβ)

and consider the filtration

    F_t = σ{(ξ_s, ξ̃_s, β_s); 0 ≤ s ≤ t}.

Claim 1. β is an r-dimensional Ft Brownian motion.


Let A ∈ σ{ξ_s; 0 ≤ s ≤ t}, Ã ∈ σ{ξ̃_s; 0 ≤ s ≤ t}, and B ∈ σ{β_s; 0 ≤ s ≤ t}, and let s, t > 0. Then

    E[exp(i⟨θ, β_{t+s} − β_t⟩); A ∩ Ã ∩ B] = ∫_B exp(i⟨θ, β_{t+s} − β_t⟩) Q(β, A) Q̃(β, Ã) W(dβ)

        = exp(−|θ|²s/2) ∫_B Q(β, A) Q̃(β, Ã) W(dβ)

        = exp(−|θ|²s/2) Π(A × Ã × B).

Thus β has Gaussian independent increments with identity covariance matrix.


Claim 2. (ξ, β) and (ξ̃, β) are two solutions to the stochastic differential equation on the space

    (C_{R^d}[0, ∞) × C_{R^d}[0, ∞) × C_{R^r}[0, ∞), Π)

with filtration {F_t; t ≥ 0}.

The joint law of (X, B, σ(X) · B) under P is the same as the joint law of (ξ, β, σ(ξ) · β) under Π. Similarly,
the joint law of (X̃, B̃, σ(X̃) · B̃) under P̃ is the same as the joint law of (ξ̃, β, σ(ξ̃) · β) under Π. Because
ξ_0 = ξ̃_0 = x a.s. Π, the property of pathwise uniqueness implies that ξ and ξ̃ are indistinguishable. Thus,
uniqueness in law holds.
In addition, the solution is a Markov process.
Exercise 5.52. If X is an Itô diffusion and if f ∈ C²(R^d, R), then

    f(X_t) = f(X_0) + ∑_{i=1}^d ∫_0^t ∂f/∂x^i(X_s) dX^i_s + (1/2) ∑_{i=1}^d ∑_{j=1}^d ∫_0^t ∂²f/∂x^i∂x^j(X_s) d⟨X^i, X^j⟩_s

        = f(X_0) + ∫_0^t ( ∑_{i=1}^d b_i(X_s) ∂f/∂x^i(X_s) + (1/2) ∑_{i=1}^d ∑_{j=1}^d (σσ^T)_{ij}(X_s) ∂²f/∂x^i∂x^j(X_s) ) ds
            + ∑_{i=1}^d ∑_{k=1}^r ∫_0^t σ_{ik}(X_s) ∂f/∂x^i(X_s) dB^k_s.

In particular, we have the martingales

    M^f_t = f(X_t) − f(X_0) − ∫_0^t Gf(X_s) ds,

with

    Gf(x) = ∑_{i=1}^d b_i(x) ∂f/∂x^i(x) + (1/2) ∑_{i=1}^d ∑_{j=1}^d a_{ij}(x) ∂²f/∂x^i∂x^j(x)

and a = σσ^T.
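The martingale property of M^f can be probed by Monte Carlo. The sketch below is illustrative, with all choices hypothetical: X is the diffusion with b(x) = −x and σ(x) = 1 (so a = 1), simulated by Euler-Maruyama, and f(x) = x², for which the generator above gives Gf(x) = −x · 2x + (1/2) · 1 · 2 = 1 − 2x². The sample mean of M^f_T should then be near zero.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: b(x) = -x, sigma = 1, f(x) = x^2, Gf(x) = 1 - 2x^2.
T, n, n_paths = 1.0, 400, 4000
dt = T / n
x0 = 1.0

X = np.full(n_paths, x0)
integral = np.zeros(n_paths)          # running int_0^t Gf(X_s) ds
for _ in range(n):
    integral += (1.0 - 2.0 * X**2) * dt           # left-endpoint rule
    X = X - X * dt + rng.normal(0.0, np.sqrt(dt), n_paths)

# M_T = f(X_T) - f(X_0) - int_0^T Gf(X_s) ds should have mean 0.
M = X**2 - x0**2 - integral
print(M.mean())
```

The sample mean differs from 0 only by Monte Carlo noise and the Euler discretization bias, both small for these parameters.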

Definition 5.53. A stochastic process X is called a solution to the martingale problem for the collection A
if

    f(X_t) − ∫_0^t g(X_s) ds

is a martingale for every (f, g) ∈ A.


For a probability distribution α, we say that X is a solution to the martingale problem for (A, α) if X is
a solution to the martingale problem and α = P X0−1 .
Uniqueness holds for the martingale problem (A, α) if any two solutions have the same finite dimensional
distributions. In this case, we say that the martingale problem is well posed. If uniqueness holds for any
initial distribution α, then we say that the martingale problem A is well posed.
The relationship to Itô diffusions is as follows.

Theorem 5.54. Let

    σ : R^d → R^{d×d},  and  b : R^d → R^d,

and let α be a probability measure on R^d. Define the operator A by Af = Gf as above and set A = {(f, Af) : f ∈ C_c^∞(R^d)}. Then there
exists a solution to the Itô stochastic differential equation corresponding to σ and b with initial distribution
α if and only if there exists a solution to the martingale problem (A, α).
Moreover, uniqueness in law holds for the stochastic differential equation corresponding to σ and b with initial
distribution α if and only if the martingale problem (A, α) is well posed.
This will follow from a sequence of lemmas:

Lemma 5.55. Assume that σ and b are locally bounded and measurable and that, in addition, σ^{−1} exists and is
locally bounded. If X is a solution to the C_{R^d}[0, ∞)-martingale problem for (A, α) with respect to a complete
filtration {F_t; t ≥ 0}, then there exists a d-dimensional Brownian motion B such that (X, B) is a solution
to the stochastic differential equation.
Proof. Because, for every f ∈ C_c^∞(R^d),

    M^f_t = f(X_t) − f(X_0) − ∫_0^t Af(X_s) ds

is a continuous martingale, this expression gives a continuous local martingale for every f ∈ C^∞(R^d).
Choose f(x) = x^i to obtain the continuous local martingale

    M^i_t = X^i_t − X^i_0 − ∫_0^t b_i(X_s) ds.

Choose f(x) = x^i x^j to see that

    ⟨M^i, M^j⟩_t = ⟨X^i, X^j⟩_t = ∫_0^t a_{ij}(X_s) ds.

Claim.

    B_t = ∫_0^t σ^{−1}(X_s) dM_s

is a d-dimensional F_t-adapted Brownian motion.

B is a vector valued continuous local martingale and B_0 = 0. Thus, by Lévy's characterization, it suffices
to show that ⟨B^i, B^j⟩_t = δ_{ij} t:

    ⟨B^i, B^j⟩_t = ∑_{k=1}^d ∑_{ℓ=1}^d ⟨σ^{−1}(X)_{ik} · M^k, σ^{−1}(X)_{jℓ} · M^ℓ⟩_t

        = ∑_{k=1}^d ∑_{ℓ=1}^d ∫_0^t (σ^{−1}_{ik} a_{kℓ} (σ^T)^{−1}_{ℓj})(X_s) ds = δ_{ij} t.

Thus,

    ∫_0^t σ(X_s) dB_s = ∫_0^t dM_s = X_t − X_0 − ∫_0^t b(X_s) ds,

and the stochastic differential equation has a solution.
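The quadratic variation computation in the claim rests on the matrix identity σ^{−1} a (σ^T)^{−1} = I_d with a = σσ^T, which is easy to check numerically. The sketch below is illustrative only; the random diagonally dominated test matrix is an arbitrary choice guaranteeing invertibility.

```python
import numpy as np

rng = np.random.default_rng(3)

# Check sigma^{-1} a (sigma^T)^{-1} = I_d for a = sigma sigma^T.
d = 4
sigma = rng.normal(size=(d, d)) + d * np.eye(d)   # diagonally dominated, invertible
a = sigma @ sigma.T

sigma_inv = np.linalg.inv(sigma)
product = sigma_inv @ a @ sigma_inv.T   # (sigma^T)^{-1} = (sigma^{-1})^T

print(np.max(np.abs(product - np.eye(d))))   # near 0, up to rounding
```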

To handle the case of singular σ, we have the following.

Lemma 5.56. There exist Borel measurable functions

    ρ, η : R^d → R^{d×d}

such that

    ρaρ^T + ηη^T = I_d,  ση = 0,  (I_d − σρ)a(I_d − σρ)^T = 0.
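Lemma 5.56 is non-constructive in general, but for a concrete singular σ one can exhibit ρ and η by hand. The following sketch verifies the three identities for the hypothetical choice σ = diag(1, 0), with ρ = diag(1, 0) and η = diag(0, 1); none of these matrices come from the text.

```python
import numpy as np

# A concrete singular example in d = 2: sigma = diag(1, 0), a = sigma sigma^T.
I2 = np.eye(2)
sigma = np.diag([1.0, 0.0])
a = sigma @ sigma.T
rho = np.diag([1.0, 0.0])
eta = np.diag([0.0, 1.0])

# The three identities of the lemma:
assert np.allclose(rho @ a @ rho.T + eta @ eta.T, I2)   # rho a rho^T + eta eta^T = I_d
assert np.allclose(sigma @ eta, 0.0)                    # sigma eta = 0
r = I2 - sigma @ rho
assert np.allclose(r @ a @ r.T, 0.0)                    # (I - sigma rho) a (I - sigma rho)^T = 0
print("all three identities hold")
```

Here η supplies the "missing" noise directions killed by σ, which is exactly the role it plays in Lemma 5.57.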
Lemma 5.57. Let B′ be a d-dimensional F′_t-Brownian motion on (Ω′, P′). Define

    Ω̃ = Ω × Ω′,  P̃ = P × P′,  and  X̃_t(ω, ω′) = X_t(ω).

Then there exists a d-dimensional F_t × F′_t-Brownian motion B̃ such that (X̃, B̃) is a solution to the stochastic
differential equation.
Proof. With M as above, define

    M̃_t(ω, ω′) = M_t(ω)  and  B̃′_t(ω, ω′) = B′_t(ω′).

Because M̃ and B̃′ are independent, ⟨M̃^k, B̃′^ℓ⟩_t = 0 for k, ℓ = 1, . . . , d.

Claim.

    B̃_t = ∫_0^t ρ(X̃_s) dM̃_s + ∫_0^t η(X̃_s) dB̃′_s

defines a d-dimensional Brownian motion.
As before, B̃ is a vector valued continuous local martingale, and

    ⟨B̃^i, B̃^j⟩_t = ∑_{k=1}^d ∑_{ℓ=1}^d ⟨ρ(X̃)_{ik} · M̃^k, ρ(X̃)_{jℓ} · M̃^ℓ⟩_t + ∑_{k=1}^d ∑_{ℓ=1}^d ⟨η(X̃)_{ik} · B̃′^k, η(X̃)_{jℓ} · B̃′^ℓ⟩_t

        = ∑_{k=1}^d ∑_{ℓ=1}^d ∫_0^t (ρ_{ik} a_{kℓ} ρ_{jℓ})(X̃_s) ds + ∑_{k=1}^d ∑_{ℓ=1}^d ∫_0^t (η_{ik} δ_{kℓ} η_{jℓ})(X̃_s) ds

        = ∫_0^t (ρaρ^T + ηη^T)_{ij}(X̃_s) ds = δ_{ij} t.

To show that this Brownian motion can be used to solve the stochastic differential equation, note that

    ∫_0^t σ(X̃_s) dB̃_s = ∫_0^t (σρ)(X̃_s) dM̃_s + ∫_0^t (ση)(X̃_s) dB̃′_s

        = ∫_0^t (σρ)(X̃_s) dM̃_s = M̃_t − ∫_0^t (I_d − σρ)(X̃_s) dM̃_s

        = M̃_t = X̃_t − X̃_0 − ∫_0^t b(X̃_s) ds,

because

    ⟨∑_{k=1}^d (I_d − σρ)(X̃)_{ik} · M̃^k, ∑_{ℓ=1}^d (I_d − σρ)(X̃)_{iℓ} · M̃^ℓ⟩_t = ∑_{k=1}^d ∑_{ℓ=1}^d ∫_0^t ((I_d − σρ)_{ik} a_{kℓ} (I_d − σρ)_{iℓ})(X̃_s) ds = 0,

and the lemma follows.
