Lecture 1: Markov processes and Diffusions

Guohuan Zhao

Institute of Applied Mathematics, AMSS

February 22, 2023


Regular Conditional Distribution

Markov Processes and Diffusions


Let (Ω, F, P) be a probability space, X : (Ω, F) → (E , E) a measurable map, and G ⊆ F a σ-field.
▶ When E = R, we define the conditional expectation of X
given G, E(X |G), to be any random variable Y that satisfies
(a) Y ∈ G;
(b) for all A ∈ G, E(X ; A) = E(Y ; A).
▶ QG : Ω × E → [0, 1] is said to be a regular conditional
distribution for X given G if
(a) For each A ∈ E, ω ↦ QG (ω, A) is a version of E(1A (X )|G);
(b) For a.e. ω ∈ Ω, A ↦ QG (ω, A) is a probability measure.
If E = Ω and X (ω) = ω, then QG is called a regular conditional probability.

Proposition 1
Let QG be a RCD for X given G. If f : E → R has E|f (X )| < ∞, then
\[ E(f(X) \mid \mathcal{G})(\omega) = \int_E f(x)\, Q_{\mathcal{G}}(\omega, dx) \quad \text{a.s.} \]
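To make Proposition 1 concrete, the following Python sketch works on a hypothetical finite probability space: six equally likely points, X(ω) = ω mod 3, and G generated by a two-block partition. All of these choices, and the helper names, are illustrative assumptions and not part of the lecture. In this setting QG (ω, ·) is simply the conditional law of X on the block containing ω, so the identity can be checked exactly with rational arithmetic.

```python
from fractions import Fraction

# Hypothetical toy model (an assumption for illustration only).
omega = range(6)
P = {w: Fraction(1, 6) for w in omega}   # uniform probability on Omega
blocks = [{0, 1}, {2, 3, 4, 5}]          # partition generating the sigma-field G


def X(w):
    """The random variable X, with values in E = {0, 1, 2}."""
    return w % 3


def block_of(w):
    """The atom of G that contains w."""
    return next(B for B in blocks if w in B)


def Q(w, A):
    """Q_G(w, A): conditional probability of {X in A} on the atom containing w."""
    B = block_of(w)
    return sum(P[v] for v in B if X(v) in A) / sum(P[v] for v in B)


def cond_exp(f, w):
    """E(f(X) | G)(w), computed directly as a weighted average over the atom."""
    B = block_of(w)
    return sum(f(X(v)) * P[v] for v in B) / sum(P[v] for v in B)


def integral_against_Q(f, w):
    """The right-hand side of Proposition 1: sum over x of f(x) Q_G(w, {x})."""
    return sum(f(x) * Q(w, {x}) for x in (0, 1, 2))


f = lambda x: x * x
for w in omega:
    print(w, cond_exp(f, w), integral_against_Q(f, w))
```

Both columns agree at every ω, and A ↦ Q(ω, A) is a probability measure on {0, 1, 2} for each ω, as required of a RCD.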

Theorem 1 (cf. [Dur19])

A RCD for X given G exists if E is a Polish space and E = B(E ) is its Borel σ-field.
Transition Probability

A transition kernel K on a measurable space (E , E) is a map

K : E × E → R+ such that for each x ∈ E , K (x, ·) is a measure,
and for each A ∈ E, K (·, A) is measurable. If, furthermore,
K (x, E ) = 1 for all x ∈ E , then K is a transition probability.
Theorem 2 (cf. [Dur19])
Suppose X and Y take values in a Polish space E . There is a
transition probability P : E × E → [0, 1] such that
(ω, A) ↦ P(Y (ω), A) is a RCD for X given σ(Y ), and P is unique
law(Y )-a.e.
We denote P(y , A) by P(X ∈ A|Y = y ).
Markov Processes

Let F = (Ω, F, {Ft }t⩾0 , P) be a filtered probability space, and let X = {Xt }t⩾0 be a process taking values in (E , E). The Markov property of X says that, for any times s ⩽ t and any A ∈ E,

P(Xt ∈ A|Fs ) = P(Xt ∈ A|Xs ), P-a.s. (1)

Exercise 1
(1) is equivalent to the following: for every s ⩾ 0,
\[ E(F(X_{s+\cdot}) \mid \mathcal{F}_s) = E(F(X_{s+\cdot}) \mid X_s) \quad P\text{-a.s.}, \tag{2} \]
where F : E^{R+} → R is a bounded measurable function.

In this course, we always assume E is a Polish space. By Theorem
2, one can define

Ps,t (x, A) := P(Xt ∈ A|Xs = x), s ⩽ t.

For each bounded measurable function f : E → R, set

\[ P_{s,t} f(x) := \int_E P_{s,t}(x, dy)\, f(y) = E(f(X_t) \mid X_s = x). \]

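Noting that, for r ⩽ s ⩽ t,
\[ \begin{aligned}
P_{r,t} f(X_r) &= E(f(X_t) \mid X_r) = E\bigl[E(f(X_t) \mid X_s, X_r) \mid X_r\bigr] \\
&= E\bigl[E(f(X_t) \mid X_s) \mid X_r\bigr] \\
&= E(P_{s,t} f(X_s) \mid X_r) = P_{r,s} P_{s,t} f(X_r),
\end{aligned} \]
we have
\[ P_{r,t}(\cdot, dy) = \int_E P_{r,s}(\cdot, dz)\, P_{s,t}(z, dy) \quad \mathrm{law}(X_r)\text{-a.s.} \]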

Therefore, a Markov process can be described by a collection of transition probabilities Ps,t . In this course, we only consider the time-homogeneous case, meaning that Ps,t depends only on the difference t − s.
A homogeneous transition function on (E , E) is a collection
{Pt }t⩾0 of transition probabilities on (E , E) such that Ps Pt = Ps+t
for all s, t ⩾ 0. A process X is Markov with transition function Pt
with respect to a filtered probability space F if it is adapted
and E(f (Xt+s )|Fs ) = Pt f (Xs ) for all s, t ⩾ 0 and all bounded measurable f : E → R. One can verify that

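\[ \begin{aligned}
\mu_{t_1,\cdots,t_k}(A_1 \times \cdots \times A_k) &:= P(X_{t_1} \in A_1, \cdots, X_{t_k} \in A_k) \\
&= \int_{E \times A_1 \times \cdots \times A_k} \mu_0(dz_0)\, P_{t_1}(z_0, dz_1)\, P_{t_2 - t_1}(z_1, dz_2) \cdots P_{t_k - t_{k-1}}(z_{k-1}, dz_k),
\end{aligned} \]
where µ0 = law(X0 ) and 0 ⩽ t1 < · · · < tk .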
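As an illustration of the semigroup property Ps Pt = Ps+t and of the product formula for the finite-dimensional distributions, here is a minimal sketch for a two-state chain on E = {0, 1}. The jump rates a and b, the closed form used for Pt , and the helper names P and fdd are assumptions made for this example only.

```python
import numpy as np

a, b = 2.0, 1.0  # hypothetical jump rates 0 -> 1 and 1 -> 0


def P(t):
    """Transition matrix P_t of the two-state chain with rates a and b."""
    Pi = np.array([[b, a], [b, a]]) / (a + b)   # each row is the stationary law
    return Pi + np.exp(-(a + b) * t) * (np.eye(2) - Pi)


def fdd(mu0, times, states):
    """P(X_{t1} = x1, ..., X_{tk} = xk) for increasing times, via the product formula."""
    w = np.asarray(mu0, dtype=float)   # law of X_0
    prev = 0.0
    for t, x in zip(times, states):
        w = w @ P(t - prev)            # integrate out the previous position
        mask = np.zeros(2)
        mask[x] = 1.0
        w = w * mask                   # restrict to the event {X_t = x}
        prev = t
    return float(w.sum())


mu0 = np.array([0.25, 0.75])
print(np.allclose(P(0.3) @ P(0.5), P(0.8)))   # semigroup property
print(fdd(mu0, [0.4, 1.0], [1, 0]))           # a two-time finite-dimensional probability
```

Summing fdd over the last state recovers the distribution at the earlier times, which is the kind of compatibility between finite-dimensional distributions that a construction of the process has to respect.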
Construction of Markov processes

Question: Given a collection of transition probabilities {Pt }t⩾0 , how can one construct the corresponding Markov process?

The Kolmogorov consistency theorem below gives an answer.
Theorem 3 (Kolmogorov consistency theorem)
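Let E be a Polish space and I an index set. For each k ∈ N and t1 , · · · , tk ∈ I , let
µt1 ,··· ,tk be a probability measure on E^k . Assume that these measures satisfy
the following two consistency conditions:
(a) for all measurable sets Ai ⊂ E and all t_{k+1} , · · · , t_{k+m} ∈ I ,
\[ \mu_{t_1 \cdots t_k}(A_1 \times \cdots \times A_k) = \mu_{t_1 \cdots t_k t_{k+1} \cdots t_{k+m}}\bigl(A_1 \times \cdots \times A_k \times \underbrace{E \times \cdots \times E}_{m}\bigr); \]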
(b) for each permutation π of {1, · · · , k} and for all measurable sets Ai ⊂ E ,
\[ \mu_{t_{\pi(1)} \cdots t_{\pi(k)}}\bigl(A_{\pi(1)} \times \cdots \times A_{\pi(k)}\bigr) = \mu_{t_1 \cdots t_k}(A_1 \times \cdots \times A_k). \]

Then there exists a probability measure µ on Ω = E^I such that
\[ \mu(\omega_{t_1} \in A_1, \cdots, \omega_{t_k} \in A_k) = \mu_{t_1 \cdots t_k}(A_1 \times \cdots \times A_k), \]
i.e. under µ the canonical process ω has µt1 ···tk as its finite-dimensional distribution at the times t1 , · · · , tk .
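For a centred Gaussian family the two consistency conditions reduce to statements about covariance matrices, since such a measure is determined by its covariance. The sketch below assumes the Brownian covariance min(s, t) in d = 1 and checks that dropping a time point deletes the corresponding row and column, and that permuting the times permutes the covariance in the same way. The function name bm_cov and the chosen times are hypothetical.

```python
import numpy as np


def bm_cov(times):
    """Covariance matrix (min(t_i, t_j))_{i,j} of a centred Gaussian vector."""
    t = np.asarray(times, dtype=float)
    return np.minimum.outer(t, t)


times = np.array([0.5, 1.0, 2.5])

# Condition (a): the marginal on the first two coordinates of the
# three-time law is the two-time law (top-left block of the covariance).
cond_a = np.allclose(bm_cov(times)[:2, :2], bm_cov(times[:2]))

# Condition (b): reordering the times reorders rows and columns consistently.
perm = [2, 0, 1]
cond_b = np.allclose(bm_cov(times[perm]), bm_cov(times)[np.ix_(perm, perm)])

print(cond_a, cond_b)
```

This checks the hypotheses of Theorem 3 only for this particular Gaussian family; the theorem itself then supplies the measure on E^I .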
Example: Brownian Motion

As an example, standard Brownian motion W in Rd has the defining property that, for times s < t, the increment Wt − Ws is normal with mean 0 and covariance (t − s)Id , and is independent of {Wu : u ⩽ s}. Equivalently, it is a Markov process with the transition function
\[ P_t f(x) = \frac{1}{(2\pi t)^{d/2}} \int_{\mathbb{R}^d} e^{-\frac{|y-x|^2}{2t}} f(y)\, dy. \]
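The Brownian transition function can be evaluated numerically. The following sketch, restricted to d = 1, approximates Pt f (x) by a Riemann sum over a truncated grid and compares it with two cases where the Gaussian integral is known in closed form: f (y) = y² gives x² + t, and f (y) = cos y gives e^{−t/2} cos x. The function name heat_semigroup and the grid parameters are assumptions of this example.

```python
import numpy as np


def heat_semigroup(f, t, x, half_width=12.0, n=4001):
    """Approximate P_t f(x) = int (2 pi t)^(-1/2) exp(-(y-x)^2 / (2t)) f(y) dy in d = 1."""
    s = np.sqrt(t)
    # Truncate to x +/- half_width standard deviations; the Gaussian tails beyond are negligible.
    y = np.linspace(x - half_width * s, x + half_width * s, n)
    kernel = np.exp(-(y - x) ** 2 / (2 * t)) / np.sqrt(2 * np.pi * t)
    dy = y[1] - y[0]
    return float(np.sum(kernel * f(y)) * dy)   # Riemann sum


t, x = 0.5, 1.0
print(heat_semigroup(lambda y: y ** 2, t, x), x ** 2 + t)
print(heat_semigroup(np.cos, t, x), np.exp(-t / 2) * np.cos(x))
```

The same routine can be iterated to check Ps Pt f = Ps+t f on a grid, at the cost of a nested quadrature.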
Infinitesimal generators

Let E = Rd . Assume that the following limit

\[ L\varphi(x) = \lim_{t \to 0} \frac{1}{t} \int_{\mathbb{R}^d} (\varphi(y) - \varphi(x))\, P_t(x, dy) \tag{3} \]

exists for all φ ∈ Cc∞ , and that Pt (x, dy ) = p(t, x, y )dy . Under
mild conditions, this implies
\[ \partial_t P_t f = L P_t f = P_t L f, \quad \forall f \in C_c^\infty, \]
and
\[ \begin{cases} \partial_t p(t, x, y) = L p(t, \cdot, y)(x), \\ p(0, \cdot, y) = \delta_y \end{cases} \tag{4} \]
in the sense of distributions.
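For Brownian motion in d = 1 the limit (3) can be watched numerically. The sketch below uses φ(x) = e^{−x²}, which is smooth and rapidly decaying but not compactly supported, so it is only a convenient stand-in for a test function in Cc∞ . For this φ the Gaussian convolution is explicit, Pt φ(x) = (1 + 2t)^{−1/2} e^{−x²/(1+2t)}, and the difference quotient should approach ½ φ''(x) = (2x² − 1)e^{−x²}. All function names are assumptions of this example.

```python
import math


def phi(x):
    """Stand-in test function (smooth, rapidly decaying; not compactly supported)."""
    return math.exp(-x * x)


def Pt_phi(t, x):
    """Closed form of P_t phi(x) for the one-dimensional Brownian semigroup."""
    return math.exp(-x * x / (1 + 2 * t)) / math.sqrt(1 + 2 * t)


def half_laplacian_phi(x):
    """(1/2) phi''(x), the expected value of L phi(x) for Brownian motion."""
    return (2 * x * x - 1) * math.exp(-x * x)


def difference_quotient(t, x):
    """(P_t phi(x) - phi(x)) / t, the quantity whose limit defines L phi(x) in (3)."""
    return (Pt_phi(t, x) - phi(x)) / t


for t in (1e-1, 1e-2, 1e-3, 1e-4):
    err = abs(difference_quotient(t, 0.7) - half_laplacian_phi(0.7))
    print(t, err)
```

The printed error shrinks as t decreases, consistent with L = ½Δ for Brownian motion.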
Denote the space of all real-valued functions on Rd by M(Rd ).

Assume that φ(x0 ) = supx∈Rd φ(x). Then
\[ L\varphi(x_0) = \lim_{t \to 0} \frac{1}{t} \int_{\mathbb{R}^d} (\varphi(y) - \varphi(x_0))\, P_t(x_0, dy) \leqslant 0. \]

We say that an operator L : D(L) ⊆ M(Rd ) → M(Rd ) satisfies the positive maximum principle on D(L) if, for all φ ∈ D(L) and x0 ∈ Rd ,

Lφ(x0 ) ⩽ 0 whenever φ(x0 ) = supx∈Rd φ(x).

Theorem 4 (Courrège theorem, see [Cou68])
Let L : Cc∞ (Rd ) → M(Rd ) be a linear operator. Then L satisfies
the positive maximum principle iff there exist, uniquely determined
by L for each x ∈ Rd ,
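(i) a nonnegative definite symmetric matrix (aij (x))1⩽i,j⩽d ;
(ii) a vector b(x) ∈ Rd ;
(iii) a constant c(x) ⩾ 0;
(iv) a Borel measure ν(x, dz) on Rd \{0} satisfying
\[ \int_{\mathbb{R}^d \setminus \{0\}} (1 \wedge |z|^2)\, \nu(x, dz) < \infty \]
such that
\[ \begin{aligned}
L\varphi(x) ={}& a_{ij}(x)\, \partial_{ij}\varphi(x) + b_i(x)\, \partial_i\varphi(x) - c(x)\varphi(x) \\
&+ \int_{\mathbb{R}^d \setminus \{0\}} \bigl(\varphi(x+z) - \varphi(x) - \mathbf{1}_{B_1}(z)\, z \cdot \nabla\varphi(x)\bigr)\, \nu(x, dz).
\end{aligned} \tag{5} \]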
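The "if" direction of the theorem can be sanity-checked numerically in a very special case. The sketch below evaluates the right-hand side of (5) in d = 1 with constant coefficients a, b, c and a single-atom jump measure ν = rate · δ_jump, and applies it to a standard bump function at its maximum x0 = 0. The constant coefficients, the one-atom measure, the finite-difference derivatives, and all names are simplifying assumptions of this example.

```python
import math


def bump(x):
    """A C_c^infinity test function on R, supported in [-1, 1], with maximum at x = 0."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def courrege_1d(phi, x, a, b, c, jump, rate, eps=1e-4):
    """Evaluate (5) in d = 1 with nu(x, dz) = rate * delta_{jump}(dz), jump != 0."""
    d1 = (phi(x + eps) - phi(x - eps)) / (2 * eps)              # central difference for phi'(x)
    d2 = (phi(x + eps) - 2 * phi(x) + phi(x - eps)) / eps ** 2  # central difference for phi''(x)
    compensator = jump * d1 if abs(jump) < 1.0 else 0.0         # 1_{B_1}(z) z . grad phi(x)
    nonlocal_part = rate * (phi(x + jump) - phi(x) - compensator)
    return a * d2 + b * d1 - c * phi(x) + nonlocal_part


# (a, b, c, jump, rate): a >= 0, c >= 0, rate >= 0 as in the theorem.
parameter_sets = [
    (1.0, 0.0, 0.0, 2.0, 0.0),    # pure second-order part
    (0.5, 3.0, 0.2, 0.4, 1.5),    # all four ingredients, small jump
    (0.0, -2.0, 0.0, -0.3, 4.0),  # drift plus jumps only
]
for params in parameter_sets:
    print(params, courrege_1d(bump, 0.0, *params))
```

Each printed value is nonpositive: at the maximum the gradient vanishes, the second derivative is nonpositive, φ(x0 ) ⩾ 0, and φ(x0 + z) − φ(x0 ) ⩽ 0, so every term in (5) has the right sign.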
Remark 1
1. If L is defined by (3), then it satisfies the positive
maximum principle and has the representation (5) with c = 0;
2. L in (5) is the generator of a Lévy process if c = 0 and a, b
and ν are independent of x.
In the rest of this chapter, we add the assumption that the
operator L in (3) is local, in the sense that Lφ(x) = 0 whenever φ
vanishes in a neighborhood of x. Roughly speaking, local
infinitesimal generators correspond to Markov processes with
continuous trajectories (Diffusion processes).
Exercise 2
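The operator L defined in (3) is local if and only if
\[ L\varphi(x) = a_{ij}(x)\, \partial_{ij}\varphi(x) + b_i(x)\, \partial_i\varphi(x), \tag{6} \]
if and only if, for every r > 0,
\[ \lim_{t \to 0} \frac{1}{t} P_t(x, B_r^c(x)) = 0. \]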

Exercise 3
Let X be a Markov process on Rd with transition function Pt . If, for every r > 0,
\[ \lim_{u \downarrow 0} \frac{1}{u} \sup_{t \in [0,u];\, x \in \mathbb{R}^d} P_t(x, B_r^c(x)) = 0, \]
then there is a continuous Markov process Y defined on the same probability space such that P(Xt = Yt ) = 1 for each t ⩾ 0.
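To get a feeling for the small-time tail condition in these exercises, one can compare two standard one-dimensional examples. For Brownian motion, Pt (x, Brc (x)) = P(|Wt | > r ) = erfc(r /√(2t)), which does not depend on x and is increasing in t. For a Poisson process with jump size 1 and r ⩽ 1, the same quantity is 1 − e^{−rate·t}. The sketch below divides both by t; the function names and the parameter values are assumptions of this example.

```python
import math


def bm_tail_over_t(t, r):
    """P(|W_t| > r) / t for one-dimensional standard Brownian motion."""
    return math.erfc(r / math.sqrt(2 * t)) / t


def poisson_tail_over_t(t, rate):
    """P(at least one jump by time t) / t for a Poisson process with the given rate."""
    return -math.expm1(-rate * t) / t


for t in (1e-1, 1e-2, 1e-3):
    print(t, bm_tail_over_t(t, r=0.5), poisson_tail_over_t(t, rate=3.0))
```

The Brownian column collapses to 0 faster than any power of t, while the Poisson column converges to the jump rate: the first process has continuous paths, the second does not.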
Kolmogorov’s construction of diffusion processes

Question: given a and b, how can one construct a diffusion process whose generator is given by (6)?
Kolmogorov’s idea was to recover Pt from (4).
Theorem 5 (see [Fri08])
Assume that b ∈ L∞ , a ∈ C α , and there is a constant
λ ∈ (0, 1) such that λ|ξ|² ⩽ aij ξi ξj ⩽ λ⁻¹|ξ|² for all ξ ∈ Rd . Then there is a unique
continuous function p : (0, T ) × (Rd × Rd \∆) → R+ such that, for
each t > 0 and each x, y ∈ Rd ,
\[ \partial_t p_t(x, y) = L p_t(\cdot, y)(x). \tag{7} \]

Moreover, p fulfils
(a) (Two-sided estimate)
\[ t^{-d/2} e^{-|x-y|^2/(Ct)} \lesssim p_t(x, y) \lesssim t^{-d/2} e^{-C|x-y|^2/t}; \]
(b) (Conservative)
\[ \int_{\mathbb{R}^d} p_t(x, y)\, dy = 1; \]
(c) (C-K equation)
\[ p_{t+s}(x, y) = \int_{\mathbb{R}^d} p_t(x, z)\, p_s(z, y)\, dz. \]
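In the simplest case covered by the theorem, d = 1 with a constant coefficient a > 0 and b = 0, the kernel is the explicit Gaussian pt (x, y) = (4πat)^{−1/2} e^{−(x−y)²/(4at)}, and properties (b) and (c) can be checked by quadrature. The sketch below does this with a Riemann sum on a truncated grid; the value a = 0.7, the grid, and the evaluation points are arbitrary assumptions of this example.

```python
import numpy as np


def p(t, x, y, a=0.7):
    """Heat kernel of L = a * d^2/dx^2 in d = 1 (constant a > 0, b = 0)."""
    return np.exp(-(x - y) ** 2 / (4 * a * t)) / np.sqrt(4 * np.pi * a * t)


z = np.linspace(-30.0, 30.0, 6001)   # truncated integration grid
dz = z[1] - z[0]

# (b) conservativeness: the kernel integrates to 1 in y.
mass = float(np.sum(p(0.5, 0.3, z)) * dz)

# (c) Chapman-Kolmogorov: integrate out the intermediate point z.
ck_lhs = float(p(0.5 + 0.8, 0.3, -1.2))
ck_rhs = float(np.sum(p(0.5, 0.3, z) * p(0.8, z, -1.2)) * dz)

print(mass, ck_lhs, ck_rhs)
```

For variable coefficients no closed form is available in general, which is where the analytic construction cited in the theorem is needed.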

How to construct diffusion processes using

probabilistic methods?

