Markov processes
Summer 2022
3 Feller processes
3.1 Feller processes
3.2 From the Feller process to its infinitesimal description
3.3 From the infinitesimal description to the Feller process

5 Spin systems
5.1 Spin systems
5.2 Ergodicity of spin systems
5.3 Examples
5.4 Attractive spin systems and coupling
Chapter 1

Markov chains in continuous time
Definition 1.1.
Consider the smallest σ-field F on Ω such that ω ↦ ω(t) is measurable for all t ≥ 0.
(ii) a right-continuous filtration (Ft )t≥0 on Ω, Ft ⊂ F for all t ≥ 0 such that (Xt )t≥0 is
adapted to (Ft )t≥0 and Px (X0 = x) = 1 for all x ∈ S, and
and all bounded measurable functions g : Ω → R. (1.1) is called the Markov property. (We write E_x for the expectation with respect to P_x.)
P_x(X_{t_1} = x_1, ..., X_{t_m} = x_m) = p_{t_1}(x, x_1) p_{t_2−t_1}(x_1, x_2) · · · p_{t_m−t_{m−1}}(x_{m−1}, x_m)
q(x, y) ≥ 0 for all x ≠ y,  and  Σ_{y∈S} q(x, y) = 0 for all x ∈ S.
Example 1.4 (Poisson process). (Xt )t≥0 has values in Z. After an exponentially dis-
tributed waiting time, the process jumps by 1, i.e. it goes from its present state x to x + 1.
Example 1.5. Take a discrete-time MC given by the transition matrix P = (p(x, y))x,y∈S ,
where P is a stochastic matrix, i.e.
p(x, y) ≥ 0 for all x, y ∈ S,  and  Σ_{z∈S} p(x, z) = 1 for all x ∈ S.
Then we define (Xt)t≥0 as follows: take i.i.d. exp(1)-distributed random variables; when at x, wait for the next of these exponential times and then jump according to P.
Remark. With Ft = σ(Xu , u ≤ t), (1.1) holds true due to the loss of memory property
of the exponential distribution.
Proof.
p_t(x, y) = e^{−t} P^0(x, y) + e^{−t} Σ_{k=1}^∞ (t^k / k!) P^k(x, y)

d/dt p_t(x, y) = −e^{−t} P^0(x, y) − e^{−t} Σ_{k=1}^∞ (t^k / k!) P^k(x, y) + e^{−t} Σ_{k=1}^∞ (t^{k−1} / (k−1)!) P^k(x, y)

= −e^{−t} P^0(x, y) + e^{−t} Σ_{k=1}^∞ ( t^{k−1}/(k−1)! − t^k/k! ) P^k(x, y)

d/dt p_t(x, y) |_{t=0} = −1_{x=y} + P(x, y).
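The series above is easy to check numerically. The following minimal sketch (with an arbitrarily chosen 3-state stochastic matrix P, not from the text) truncates the series, verifies that each P_t is stochastic, and confirms that the derivative at t = 0 equals P − I, i.e. Q = P − I for this uniformized chain.

```python
import math
import numpy as np

# Arbitrary 3-state stochastic matrix P (illustrative choice).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])

def p_t(t, P, terms=60):
    """Truncated series p_t = e^{-t} * sum_k (t^k/k!) P^k."""
    n = P.shape[0]
    acc = np.zeros((n, n))
    Pk = np.eye(n)  # P^0
    for k in range(terms):
        acc += (t ** k / math.factorial(k)) * Pk
        Pk = Pk @ P
    return math.exp(-t) * acc

Pt = p_t(0.7, P)
assert np.allclose(Pt.sum(axis=1), 1.0)  # rows sum to 1

# Numerical derivative at t = 0 approximates Q = P - I.
h = 1e-6
Q_num = (p_t(h, P) - np.eye(3)) / h
assert np.allclose(Q_num, P - np.eye(3), atol=1e-4)
```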
Claim. For all β > 0, δ > 0, p_t(·,·) belongs to the following MC (Xt)t≥0: when in 0, the MC waits an exp(β)-distributed time in 0 and then jumps to 1; when in 1, it waits an exp(δ)-distributed time in 1 and then jumps to 0, etc. (Note that all waiting times are independent.)
Proof. Exercise.
Lemma 1.7. Let S be finite and Q a Q-matrix. Then Pt = (pt (x, y))x,y∈S given by
P_t = e^{tQ} := Σ_{n=0}^∞ (t^n / n!) Q^n

lim_{t→0} p_t(x, x) = 1
d/dt e^{tQ} = Q e^{tQ}.
Example 1.8. A birth and death chain is a MC with state space N ∪ {0} and the
following Q-matrix
q(k, k + 1) = ρk , k ≥ 0
q(k, k − 1) = λk , k ≥ 1
q(k, k) = −ρk − λk , k ≥ 0.
Interpretation. (Xt)t≥0 is the size of a population at time t. Each particle has two independent clocks, which ring after exponentially distributed times T and T̃ with parameters ρ and λ. If T < T̃, the particle has an offspring at time T; if T̃ ≤ T, the particle dies at time T̃. The offspring particles do the same, and all particles behave independently. This means that from state k the process jumps with rate ρk to k + 1 and with rate λk to k − 1. Note that if T_1, T_2, ..., T_k are i.i.d. with law exp(ρ), then min(T_1, ..., T_k) has the distribution exp(ρk). Note that c(k) = (ρ + λ)k is not bounded in this example.
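The fact that the minimum of k i.i.d. exp(ρ) times is exp(ρk)-distributed is easy to confirm by simulation; a quick sketch (parameters arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, k, n = 1.5, 4, 200_000

# n samples of min(T_1, ..., T_k) with T_i i.i.d. exp(rho);
# numpy parameterizes the exponential by its mean 1/rho.
T = rng.exponential(scale=1.0 / rho, size=(n, k))
m = T.min(axis=1)

# If min ~ exp(rho * k), its mean is 1/(rho * k).
assert abs(m.mean() - 1.0 / (rho * k)) < 0.005
```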
Theorem 1.9. Let (Xt )t≥0 be a continuous time MC. Let pt (x, y) = Px (Xt = y) for t ≥ 0
and x, y ∈ S. Then
Proof. For (1.2) it remains to show that limt↓0 pt (x, x) = p0 (x, x) = 1. Due to the right-
continuity of the paths, we have
Since pt (x, x) ≥ Px (T > t) the claim follows. The Chapman-Kolmogorov equations follow
from the Markov property: (1.1), with g = 1{Xt =y} tells us that
Px (Xs+t = y|Fs ) = PXs (Xt = y) = pt (Xs , y) Px -a.s. for all x ∈ S.
Take expectation w.r.t. Px to get
p_{t+s}(x, y) = P_x(X_{s+t} = y) = Σ_{z∈S} p_s(x, z) p_t(z, y),
We get (1.7) (formally, since one would need to justify that t ↦ p_t(x, y) is continuously differentiable!) by differentiating (1.3) with respect to s and setting s = 0.
p_{s+t}(x, y) = Σ_{z∈S} p_s(x, z) p_t(z, y)

d/dt p_t(x, y) = d/ds p_{s+t}(x, y) |_{s=0} = Σ_{z∈S} ( d/ds p_s(x, z) |_{s=0} ) p_t(z, y) = Σ_{z∈S} q(x, z) p_t(z, y).

In the same way one could differentiate with respect to t and set t = 0 to get

d/ds p_s(x, y) = d/dt p_{s+t}(x, y) |_{t=0} = Σ_{z∈S} p_s(x, z) q(z, y).   (1.8)
If S is finite (1.7) is a finite system of linear differential equations, which has the unique
solution
P_t = e^{tQ} = Σ_{k=0}^∞ (t^k / k!) Q^k.
If S is infinite, uniqueness of the solution is in general not satisfied, but true under
additional assumptions.
We start with some important facts on transition functions, which we will need later.
Theorem 1.10. Let (pt )t≥0 be a transition function. Then the following hold:
implies that pt (x, x) > 0 for t small enough. Furthermore, due to (1.3), we have
Note that

|p_t(x, y)(p_s(x, x) − 1)| ≤ |p_s(x, x) − 1| = 1 − p_s(x, x),

and

0 ≤ Σ_{z≠x} p_s(x, z) p_t(z, y) ≤ Σ_{z≠x} p_s(x, z) = 1 − p_s(x, x).
Thus, we obtain
|pt+s (x, y) − pt (x, y)| ≤ 2(1 − ps (x, x)).
We deduce that
(iii) The proof relies on the following lemma, which is often called subadditivity lemma:
then

lim_{t↘0} g(t)/t = sup_{t>0} g(t)/t.

For a proof we refer to [Liggett, Theorem A.59]. Due to (i) and (ii), the limit

lim_{s↘0} f(s)/s = sup_{s>0} f(s)/s

exists in [0, ∞]. Using the chain rule and lim_{t↘0} p_t(x, x) = 1, we see that the limit on the l.h.s. equals u(x), and the inequality (1.9) also follows readily.
The proof is complete.
Now, we explain informally how to come from the transition function to a Q-matrix.
This direction requires some extra assumptions in certain places. Let (pt )t≥0 be a transition
function and let x ∈ S be such that u(x) < ∞. Then, one can show that for all x ≠ y the
right-hand derivative
u(x, y) := d/dt p_t(x, y) |_{t=0}

exists.
If this is true for all x ∈ S, then (u(x, y))x,y∈S is a Q-matrix corresponding to the transition
function (pt )t≥0 . For proofs we refer to [Liggett, Theorem 2.14].
Definition 1.12. Let (pt )t≥0 be a transition function. The state x ∈ S is called absorbing
if pt (x, x) = 1 for all t ≥ 0. The state x ∈ S is called instantaneous if u(x) = ∞, see
Theorem 1.10.
If S is countable, there are examples (see Section 1.3 below) of transition functions
where all x ∈ S are instantaneous and there is no MC with this transition function.
Z is called the embedded discrete time MC. Let τ0 , τ1 , ... be random variables whose
conditional distribution given (Z0 , Z1 , ...) is the following: Given (Z0 , Z1 , ...) the τk are
independent and have laws
τ_k ∼ exp(c(Z_k)), k = 0, 1, 2, ..., with τ_k = ∞ if c(Z_k) = 0.
The finite dimensional marginals of ((Z0 , Z1 , ....), (τ0 , τ1 , ...)) are given as follows:
Let

N(t) = min{m ≥ 0 : τ_0 + τ_1 + ... + τ_m ≥ t}  if Σ_{k=0}^∞ τ_k > t,  and N(t) = +∞ otherwise.
Hence N(t) = 0 for a time interval of length τ_0, then N(t) = 1 for a time interval of length τ_1, etc. Finally, let X̃_t = Z_{N(t)} on {N(t) < ∞}: X̃ has right-continuous paths (if defined) and waits, when at x, an exponentially distributed time with parameter c(x), and then jumps to state y with probability p(x, y). Everything is fine, except that X̃ could jump infinitely often in a finite time interval.
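The construction of X̃ from the embedded chain and exponential holding times can be sketched in code. Below, a two-state example with rates β, δ chosen arbitrarily (not from the text); the long-run fraction of time spent in state 0 should be close to δ/(β + δ):

```python
import numpy as np

rng = np.random.default_rng(1)
beta, delta = 1.0, 2.0
c = {0: beta, 1: delta}   # holding rates c(x)
p = {0: 1, 1: 0}          # embedded chain: deterministic flips

T_max = 50_000.0
t, x = 0.0, 0
time_in_0 = 0.0
while t < T_max:
    tau = rng.exponential(1.0 / c[x])  # exp(c(x)) holding time
    stay = min(tau, T_max - t)
    if x == 0:
        time_in_0 += stay
    t += tau
    x = p[x]                           # jump via the embedded chain

frac = time_in_0 / T_max
assert abs(frac - delta / (beta + delta)) < 0.02
```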
Theorem 1.13. The following statements are equivalent:
(a) (1.7) has a unique solution, which is a transition function.

(b) (X̃_t)_{t≥0} is an MC in continuous time.
Recalling that countable unions of countable sets are countable, we conclude that S is
countable. Now, set
p_t(x, y) := P(X_t = y | X_0 = x) = Π_{k∈N} P_{x_k}(X^k_t = y_k),  x, y ∈ S.  (1.11)
Next, we give a criterion on the parameters βk and δk such that (pt )t≥0 is a transition
function. Afterwards, we will also see a criterion such that all states are instantaneous.
Theorem 1.15. Suppose that
Σ_{k=1}^∞ β_k / (β_k + δ_k) < ∞.  (1.12)
Then,
P(Xt ∈ S|X0 ∈ S) = 1 for all t ≥ 0, (1.13)
and (pt )t≥0 is a transition function.
Proof. The function (pt )t≥0 is a transition function if we show (1.2) and the Chapman-
Kolmogorov equations (1.3). First, pt (x, y) ≥ 0 is obvious. Second, we have
Σ_{y∈S} p_t(x, y) = P(X_t ∈ S | X_0 = x).
We show that
P(Xt ∈ S|X0 = x) = 1 (1.14)
for all x ∈ S. Fix x ∈ S. Then, there exists an n_x < ∞ such that x_i = 0 for all i ≥ n_x. Note that X_t ∈ S whenever X^n_t = 1 for only finitely many n ≥ n_x. We show that this holds with probability one due to the Borel–Cantelli lemma. Indeed, recalling (1.6), we see
that for any n ≥ nx
P(X^n_t = 1 | X_0 = x) = ( β_n / (β_n + δ_n) ) ( 1 − e^{−t(β_n+δ_n)} ) ≤ β_n / (β_n + δ_n).
Thus, recalling our hypothesis (1.12), the Borel-Cantelli lemma yields that
P(X^n_t = 1 for only finitely many n ≥ n_x | X_0 = x) = 1,
i.e. (1.14) holds. Moreover, we have
P(X_t ∈ S | X_0 ∈ S) = Σ_{x∈S} P(X_t ∈ S | X_0 = x) · P(X_0 = x)/P(X_0 ∈ S) = Σ_{x∈S} P(X_0 = x)/P(X_0 ∈ S) = 1,
which proves (1.13). Next, we prove the last part in (1.2). Using once again (1.6), we see
that for any n ≥ n_x

p_t(x, x) ≥ Π_{k=1}^{n−1} P_{x_k}(X^k_t = x_k) · Π_{i=n}^∞ δ_i / (β_i + δ_i).  (1.15)
Let us briefly argue that the infinite product makes sense. It is well known that an infinite product Π_k a_k converges to a non-zero limit iff the infinite sum Σ_k log(a_k) converges. We have (using the estimate log x ≥ 2(x − 1) for x ∈ (1/2, 1])

Σ_{k=1}^∞ |log( δ_k / (β_k + δ_k) )| ≤ 2 Σ_{k=1}^∞ ( 1 − δ_k/(β_k + δ_k) ) = 2 Σ_{k=1}^∞ β_k / (β_k + δ_k) < ∞
by hypothesis. Thus, the infinite product in (1.15) is well-defined. Now, in (1.15), letting first t ↘ 0 and then n ↗ ∞, we see that

lim_{t↘0} p_t(x, x) = 1.
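The convergence of the infinite product in (1.15) under (1.12) can also be illustrated numerically; a sketch with the arbitrary choice β_k = k^{−2}, δ_k = 1 (so that Σ β_k/(β_k + δ_k) < ∞):

```python
beta = lambda k: 1.0 / k**2
delta = lambda k: 1.0

def partial_product(n):
    """prod_{i=1}^{n} delta_i / (beta_i + delta_i)."""
    p = 1.0
    for i in range(1, n + 1):
        p *= delta(i) / (beta(i) + delta(i))
    return p

p4, p6 = partial_product(10**4), partial_product(10**6)
assert p6 > 0.1            # the product stays bounded away from 0
assert abs(p4 - p6) < 1e-3 # and the partial products converge
```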
= Σ_{z_2,...,z_{n−1}} Π_{k=2}^{n−1} P_{x_k}(X^k_t = z_k) P_{z_k}(X^k_s = y_k) · P_{x_1}(X^1_{t+s} = y_1)

= ··· = Π_{k=1}^{n−1} P_{x_k}(X^k_{t+s} = y_k),

where we use the Chapman–Kolmogorov equations for the MCs (X^k_t)_{t≥0}. We conclude that

Σ_{z∈A_n} p_t(x, z) p_s(z, y) = Π_{k=n}^∞ p^k_t(0, 0) p^k_s(0, 0) · Π_{j=1}^{n−1} P_{x_j}(X^j_{t+s} = y_j).
Letting n ↗ ∞, we obtain the Chapman–Kolmogorov equations for (p_t)_{t≥0} and the proof is complete.
Remark. We stress that X is always a stochastic process with values in {0, 1}∞ . The
condition (1.12) guarantees that we can consider X also as a stochastic process with
values in S.
So far, we have seen a condition on the sequences (βk )k∈N and (δk )k∈N such that (1.11)
defines a transition function. In the following theorem, we also give a condition which
implies that all states of (1.11) are instantaneous. For an intuitive explanation of the
condition, see the remark below.
Theorem 1.16. Suppose that (1.12) holds and that
Σ_{k=1}^∞ β_k = ∞.  (1.16)
Finally, letting n → ∞ and using our hypothesis yields that u(x) = ∞. This finishes the
proof.
Remark. Let us informally understand the condition (1.16). Roughly speaking, a state is
instantaneous if the probability of remaining in it in an interval is zero. The probability
that X^k remains in 0 in an interval of length h is e^{−β_k h}. The probability that the entries of (X^n, X^{n+1}, ...) remain in 0 is consequently
Π_{k=n}^∞ e^{−β_k h},
using the independence of the entries. This product, however, is zero if (1.16) is satisfied.
Chapter 2

Invariant measures, recurrence and transience
¹Note that this is not necessarily a probability measure.
(ii) Assume now Σ_{x∈S} π(x) q(x, y) = 0 for all y ∈ S. Then

d/dt Σ_{x∈S} π(x) p_t(x, y) =(1.7)= Σ_{x∈S} π(x) Σ_{z∈S} q(x, z) p_t(z, y) = Σ_{z∈S} p_t(z, y) Σ_{x∈S} π(x) q(x, z) = 0,

where the interchange of sums uses Fubini (to be verified). Hence

Σ_{x∈S} π(x) p_t(x, y) = Σ_{x∈S} π(x) p_0(x, y) = π(y).
Definition 2.3. A measure π is reversible for the MC with transition function pt (x, y)
if
A reversible measure is also invariant, since (2.3) implies (2.1). However there are
invariant measures that are not reversible.
Example. Fix λ > 0. Let (Xt)t≥0 be a Poisson process with intensity λ, i.e.

p_t(x, y) = e^{−λt} (λt)^{y−x} / (y − x)!  for y ≥ x,  and p_t(x, y) = 0 otherwise.

The counting measure π ≡ 1 is invariant:

Σ_{x∈Z} p_t(x, y) = Σ_{x≤y} p_t(x, y) = Σ_{z=0}^∞ p_t(0, z) = 1.

On the other hand, (2.3) cannot hold, since for x < y we have p_t(x, y) > 0 but p_t(y, x) = 0.
Example. Let

Q = ( −β  β
       δ  −δ ).

Then

π(0) = δ/(β + δ),  π(1) = β/(β + δ)

is a reversible (hence invariant) probability measure.
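The invariance πQ = 0 and the detailed-balance condition π(x)q(x, y) = π(y)q(y, x) for this two-state Q-matrix can be verified directly; a minimal check (β, δ arbitrary):

```python
import math
import numpy as np

beta, delta = 1.5, 0.5
Q = np.array([[-beta, beta],
              [delta, -delta]])
pi = np.array([delta, beta]) / (beta + delta)

# Invariance: pi Q = 0.
assert np.allclose(pi @ Q, 0.0)

# Detailed balance: pi(0) q(0,1) = pi(1) q(1,0).
assert math.isclose(pi[0] * Q[0, 1], pi[1] * Q[1, 0])
```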
Definition 2.5. A stochastic process (Xt )t≥0 is strictly stationary, if for all n ≥ 1 and
t1 < t2 < · · · < tn , the law of (Xt1 +s , ..., Xtn +s ) does not depend on s.
Lemma 2.6. Let π be a probability measure on S and (Xt )t≥0 a MC with transition
function pt (·, ·) and starting distribution π (i.e. X0 has law π). Then (Xt )t≥0 is strictly
stationary if and only if π is invariant.
Proof of Theorem 2.9. Due to our construction of (X̃t)t≥0, the lengths of the time intervals spent in x are i.i.d. with law exp(c(x)). If c(x) = 0 we have p_t(x, x) = 1 for all t ≥ 0 and x is recurrent. Hence assume c(x) > 0. Due to the strong Markov property, the number G̃ of visits of (Zn)n∈N to x has a geometric law (where we admit the case G̃ = ∞). Further, G̃ is independent of the waiting times (τ_0, τ_1, ...). Hence either
∫_{R_+} 1_{X_t=x} dt = ∞  or  ∫_{R_+} 1_{X_t=x} dt = Σ_{k=0}^{G̃} τ_{i_k}.

In the first case G̃ = ∞ ⇒ G(x, x) = ∞ and x is recurrent, or

G(x, x) = E_x[ Σ_{k=0}^{G̃} τ_{i_k} ] = 1 / (c(x) p̃) < ∞,

where p̃ is the parameter of the geometric law of G̃.
Example 2.10. Linear birth and death chain (cf. Example 1.8). Fix ρ, λ > 0 and consider the MC with state space S = {0, 1, 2, ...} and Q-matrix Q = (q(k, l))_{k,l∈S} given by

q(k, l) = ρk  if l = k + 1,
q(k, l) = λk  if l = k − 1,
q(k, l) = −(λ + ρ)k  if l = k.
2. Which states are transient? Which states are recurrent? 0 is clearly a recurrent state.
Claim: All other states are transient.
Consider the discrete embedded MC (Zn ). Intuitively, if x > 0 is recurrent, then it is
visited by (Zn ) infinitely often. But on each visit, we have a probability ≥ ε to visit
0 before coming back to x. Since these trials are iid, one of them will be successful
and the MC will be absorbed at 0: contradiction!
Proof of the claim. Let x > 0 and let T_1 := min{n > 0 : Z_n = x} and T_{i+1} := min{n > T_i : Z_n = x}. For H_0 := min{n > 0 : Z_n = 0} we have

P_x(H_0 < T_1) ≥ ( λ/(λ + ρ) )^x =: ε

(which is the probability that the first x steps go to the left). Now by the strong Markov property, P_x(T_{i+1} < ∞) ≤ (1 − ε) P_x(T_i < ∞). Thus,

P_x(T_i < ∞) ≤ (1 − ε)^i.

Therefore, for all k ∈ N, P_x(Z_n = x for infinitely many n) ≤ P_x(T_k < ∞) ≤ (1 − ε)^k → 0,

=⇒ x is transient.
The arguments of 1. and 2. go through for the general birth and death chain with q(0, 0) = 0 and, for k ≥ 1,

q(k, l) = ρ_k  if l = k + 1,
q(k, l) = λ_k  if l = k − 1,
q(k, l) = −(λ_k + ρ_k)  if l = k.
If ρ/(ρ + λ) = 1/2 or ρ/(ρ + λ) < 1/2, then

P_1( lim_{n→∞} Z_n = 0 ) = P_1( Z̃_k = 0 for some k ≥ 1 ) = 1,

since Z̃_n is either recurrent (if ρ/(ρ + λ) = 1/2) or transient to the left (i.e. if ρ/(ρ + λ) < 1/2 we have P_1(lim_{n→∞} Z̃_n = −∞) = 1). Hence

P_1( lim_{t→∞} X_t = ∞ ) > 0  ⟺  ρ > λ.
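The dichotomy above can be probed by simulation of the embedded chain. For the linear birth and death chain started at 1 with ρ > λ, the extinction probability is λ/ρ; a rough Monte-Carlo sketch (ρ = 2, λ = 1 chosen arbitrarily, and a population cap used as a proxy for survival):

```python
import numpy as np

rng = np.random.default_rng(2)
rho, lam = 2.0, 1.0
runs, cap = 2000, 200

extinct = 0
for _ in range(runs):
    k = 1
    while 0 < k < cap:
        # Embedded chain: from k, jump up with prob rho/(rho+lam),
        # down otherwise (holding times are irrelevant for hitting 0).
        if rng.random() < rho / (rho + lam):
            k += 1
        else:
            k -= 1
    if k == 0:
        extinct += 1

assert abs(extinct / runs - lam / rho) < 0.04
```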
Proposition 2.11. For an irreducible MC, either all states are recurrent or all states are
transient. In the transient case, we have
hence G(x, x) ≥ p_t(x, y) G(y, y) p_t(y, x). Hence if the MC is irreducible, G(x, x) < ∞ implies G(y, y) < ∞. To show the second statement, let H_y := inf{s ≥ 0 : X_s = y} be the first hitting time of y. Then
G(x, y) = E_x[ ∫_{R_+} 1_{X_t=y} dt · 1_{H_y<∞} ] = (strong Markov property) = P_x(H_y < ∞) G(y, y) ≤ G(y, y).
Definition 2.12. An irreducible MC is called recurrent if all states are recurrent and
transient if all states are transient.
Remark. The chain (Xt )t≥0 is irreducible if and only if the embedded discrete time MC
(Zn )n∈N is irreducible. In this case (Xt )t≥0 is recurrent if and only if (Zn )n∈N is recurrent.
Definition 2.13. A function f : S → R is superharmonic for (Xt )t≥0 if, for all t ≥ 0
and x ∈ S
Conversely assume that (Xt )t≥0 is recurrent and that f is bounded and superharmonic.
Then (f (Xt ))t≥0 is a bounded Px -supermartingale, which converges Px -a.s. Since (Xt )t≥0
is recurrent, each x ∈ S is visited at arbitrary large times and hence f is constant.
The last argument also shows the following.
Lemma 2.16. For an irreducible, recurrent MC (Xt )t≥0 each non-negative superharmonic
function is constant.
Lemma 2.17. If an irreducible MC has an invariant distribution, it is recurrent.
Proof. Let π be an invariant distribution. Then for all t ≥ 0
π(y) = Σ_{x∈S} π(x) (1/t) ∫_{[0,t]} p_s(x, y) ds.  (2.6)
which is a contradiction.
Lemma 2.18. An irreducible and recurrent MC has an invariant measure π. Moreover
π(x) > 0, ∀x ∈ S.
Proof. Fix z ∈ S, let R_z denote the first return time to z, and define

π(x) := E_z[ ∫_{[0,R_z]} 1_{X_t=x} dt ],  x ∈ S.

Then π(z) = 1/c(z), and in particular π(z) > 0 and π(z) < ∞.
We show that π is invariant. Due to the strong Markov property
E_z[ ∫_{[0,s]} 1_{X_t=x} dt ] = E_z[ ∫_{[R_z, R_z+s]} 1_{X_t=x} dt ],  s > 0.

Therefore

π(x) = E_z[ ∫_{[0,R_z]} 1_{X_t=x} dt ]

= E_z[ ∫_{[0,R_z+s]} 1_{X_t=x} dt ] − E_z[ ∫_{[R_z,R_z+s]} 1_{X_t=x} dt ]

= E_z[ ∫_{[s,R_z+s]} 1_{X_t=x} dt ]

= E_z[ ∫_{[0,R_z]} 1_{X_{t+s}=x} dt ]

= ∫_{R_+} Σ_{y∈S} P_z(X_t = y, R_z > t) p_s(y, x) dt

= Σ_{y∈S} π(y) p_s(y, x),
since π(y) = E_z[ ∫_{[0,R_z]} 1_{X_t=y} dt ] = ∫_{R_+} P_z(X_t = y, R_z > t) dt. In particular, π(z) = Σ_{y∈S} π(y) p_s(y, z), and since π(z) < ∞ and p_s(y, z) > 0 for all y ∈ S and all s > 0 due to irreducibility, we see that π(y) < ∞ for all y ∈ S. It remains to show that π(x) > 0 for all x ∈ S. Since p_t(z, x) > 0 for all t > 0, x ∈ S, and π(x) = Σ_{y∈S} π(y) p_t(y, x) ≥ π(z) p_t(z, x) > 0, we have π(x) > 0 for all x ∈ S.
Lemma 2.19. There is (up to a multiplication with a constant) exactly one invariant
measure for an irreducible and recurrent MC.
Proof. Assume π_1, π_2 are invariant measures for (Xt)t≥0. Note that π_1(x) > 0 and π_2(x) > 0 for all x ∈ S, by the same argument as in the proof of Lemma 2.18. Define another transition function

p̄_t(x, y) = π_1(y) p_t(y, x) / π_1(x).

Due to Theorem 2.9 the MC (X̄_t)_{t≥0} with transition function p̄ is recurrent as well. Set α(x) = π_2(x)/π_1(x) and note that

Σ_{y∈S} p̄_t(x, y) α(y) = α(x),
In particular, (Xt )t≥0 is transient, since there are two truly different invariant measures.
In this example, more is known: show that lim_{t→∞} X_t / t > 0 (exercise).
Definition 2.21. An irreducible MC which has an invariant distribution is positive
recurrent. An irreducible recurrent MC which does not have an invariant distribution is
null recurrent.
Remark 2.22. An irreducible recurrent MC with invariant measure π is positive recurrent
if and only if the expected return times are finite, i.e.
Ez [Rz ] < ∞, ∀z ∈ S.
Proof. The proof is based on a coupling argument, which goes as follows: Let (Xt )t≥0
and (Yt )t≥0 be two independent MC with the same transition function pt (·, ·). (We will
fix the starting distribution later.) Define Zt := (Xt , Yt ) for t ≥ 0. Then (Zt )t≥0 is an
irreducible MC on S × S with transition function
It is easy to check that (Zt )t≥0 has the invariant distribution π e given by πe(x1 , x2 ) =
π(x1 )π(x2 ), x1 , x2 ∈ S. Hence (Zt )t≥0 is recurrent due to Lemma 2.17. In particular, for
any starting points x1 , x2
P(x1 ,x2 ) (Zt ∈ {(i, i) ∈ S × S|i ∈ S} for some t ≥ 0) = P(x1 ,x2 ) (Xt = Yt some t ≥ 0) = 1
τ := inf{t ≥ 0 : Xt = Yt }
(Wt)t≥0 is a MC with transition function p_t(·,·). We choose now the starting distribution such that X_0 = x and Y_0 ∼ π. Then Y_t ∼ π for all t ≥ 0, since π is invariant. But we also have W_t ∼ π for all t ≥ 0,

since τ is finite P-a.s. Here P_{(x,π)} is the law of (Z_t)_{t≥0}, where X_0 = x and Y_0 ∼ π.
(b) If (Xt )t≥0 is irreducible and recurrent each non-negative harmonic function is con-
stant.
Proof. Exercise.
Remark 2.27. If (Xt )t≥0 is irreducible and recurrent, Theorem 2.15 implies that each
bounded harmonic function is constant. Consider a graph with finite degrees. It has the
Liouville property if each bounded harmonic function (for simple random walk on the
graph) is constant. Simple random walk in continuous time on a graph with finite degrees
can be defined with the Q-matrix
q(x, y) = 1/deg(x)  if x ≠ y, x ∼ y,
q(x, y) = 0  if x ≠ y, x ≁ y,
q(x, x) = −1,
where we write x ∼ y if x and y are neighbours, i.e. there is an edge connecting x and y, and “finite degrees” means that each x has only finitely many neighbours. It is known that Z^d has the Liouville property for d ≥ 1. In particular, there are transient graphs with the Liouville property.
Chapter 3
Feller processes
Here, “vanishing at infinity” means that for each ε > 0, there is a compact subset K_ε of S such that |f(x)| ≤ ε for all x ∉ K_ε. Let ||f|| := sup_{x∈S} |f(x)|. Then C(S) is a separable Banach space. The functions in C(S) are uniformly continuous.
The σ-field F on Ω is the smallest σ-field, such that all projections Xt are measurable
w.r.t. F.
Definition 3.1. A Feller process with values in S consists of
(a) a collection of probability measures (Px )x∈S on (Ω, F)
(b) a right-continuous filtration (Ft )t≥0 on Ω, such that (Xt )t≥0 is adapted to (Ft )t≥0
with the following properties:
Px (X0 = x) = 1, ∀x ∈ S (3.1)
The mapping

x ↦ E_x[f(X_t)] belongs to C(S) for all f ∈ C(S), t ≥ 0,  (3.2)

and

E_x[g ∘ θ_s | F_s] = E_{X_s}[g]  P_x-a.s. for all x ∈ S, s ≥ 0,  (3.3)

for all g : Ω → R bounded and measurable. (3.3) is called the Markov property and (3.2) the Feller property.
||T_t f|| ≤ ||f||,  ∀f ∈ C(S),

i.e. the T_t are contractions. More precisely: let S be compact; then

g = ||f|| − f ≥ 0,  g ∈ C(S),

and
P(X0 = 0) = 1.
Rest: exercise.
Exercise. Assume S is countable, p_t(x, y) a transition function and

(T_t f)(x) = Σ_{y∈S} p_t(x, y) f(y),  x ∈ S, t ≥ 0.
Then
The last inequality follows from property (b) applied to −f. Therefore f ↦ f − λLf is injective, and together with (c) we conclude that (1 − λL)^{−1} is everywhere defined for λ > 0 small enough. Moreover, it is a contraction that maps non-negative functions to non-negative functions.
Example 3.9. Let S = R, D(L) = {f ∈ C(S) : f′ ∈ C(S)} and Lf = f′. Then L is a generator.
Proof. See exercises.
is a generator
Proof. See Exercises.
We will see that Feller processes, transition semigroups and generators are
in one-to-one correspondence.
t ↦ T_t f
Remark. (b) says “T_t = e^{tL}”. We know that this is true if S is finite:

Lf(x) = Σ_{y∈S} q(x, y) f(y),  P_t = T_t = e^{tQ}.
It turns out that the third way is the best for generalization to unbounded operators.
Remark. Often generators are (second order) differential operators. Then, one can con-
sider the partial differential equation (PDE)
du/dt = Lu,  (3.9)
where u = u(t, x) is a function of space and time. L only acts on the space variable. Under
mild conditions the solution of this PDE with initial condition u(0, x) = f (x) is given by
Example 3.13 (Linear motion). Let S = R, ω^{(x)}(t) = x + t and P_x = δ_{ω^{(x)}}. Then,
the corresponding process (Xt )t≥0 is a Feller process.
Example 3.14 (Brownian motion). Let S = R and let L be the following operator

Lf = (1/2) f″,

where

D := {f ∈ C(R) : f twice differentiable with f″ ∈ C(R)}.  (3.11)
(a) One could check directly that L is a generator in the sense of Definition 3.8.
Since f 00 is uniformly continuous, the second term converges to zero for t → 0, uniformly
in x, and the first term becomes arbitrarily small for K large.
Remark. In fact, D = D(L). The proof above does not yet show this, since the limit of (T_t f(x) − f(x))/t as t → 0 could also exist for functions which are not in D.
Remark. In the proof above, there is no requirement on the first derivative of f. However, it is easy to see that in fact D = D̃ := {f ∈ C(R) : f twice differentiable with f′, f″ ∈ C(R)}. Apply the Taylor formula to get, for a function f ∈ D,

f(x + 1) − f(x) = f′(x) + (1/2) f″(x + a_x),  where a_x ∈ (0, 1).

But for each ε > 0, we can choose K large enough such that |f(x + 1) − f(x)| ≤ ε and |f″(x + a_x)| ≤ ε for x ∉ [−K, K], and we conclude that f′ ∈ C(R) as well and hence f ∈ D̃.
Proof. According to (c) in Definition 3.8: for ε > 0 small enough and g ∈ C(S) there is f ∈ D(L) such that f − εLf = g. Further, with f, g as above, recalling that (1 − εL)^{−1} is a contraction (so ||f|| ≤ ||g||),

||L^{(ε)} g|| = ||Lf|| = ||(f − g)/ε|| ≤ (||f|| + ||g||)/ε ≤ (2/ε) ||g||.

Hence L^{(ε)} is a bounded operator, and we can define T_t^{(ε)} by

T_t^{(ε)} = e^{tL^{(ε)}} = Σ_{k=0}^∞ (t^k/k!) (L^{(ε)})^k.
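For a finite state space, L^{(ε)} = L(1 − εL)^{−1} can be computed with matrices, and one checks that it converges to L as ε ↘ 0 and that it is again a Q-matrix; a sketch with an arbitrary 2-state Q-matrix (not from the text):

```python
import numpy as np

Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
I = np.eye(2)

def yosida(eps):
    """L^(eps) = Q (1 - eps Q)^{-1}."""
    return Q @ np.linalg.inv(I - eps * Q)

# L^(eps) -> Q as eps -> 0.
assert np.linalg.norm(yosida(1e-2) - Q) < 0.15
assert np.linalg.norm(yosida(1e-4) - Q) < 2e-3

# L^(eps) is still a Q-matrix: rows sum to 0.
assert np.allclose(yosida(1e-2).sum(axis=1), 0.0)
```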
Lemma 3.15. If A is a bounded operator, then A satisfies (c) in Definition 3.8, i.e.
Proof. (a)
g − εAg = h ⇒ g = (1 − εA)^{−1} h
(a) from Definition 3.8: D(L(ε) ) = C(S). True since R(1 − εL) = C(S) for ε small
enough.
CHAPTER 3. FELLER PROCESSES 34
hence

inf_{x∈S} f(x) ≥ ( λ/(ε + λ) ) inf_{x∈S} f(x) + ( ε/(ε + λ) ) inf_{x∈S} g(x).

Therefore inf_{x∈S} f(x) ≥ inf_{x∈S} g(x).
(c) from Definition 3.8: This is clear since L(ε) is bounded, see Lemma 3.15.
(d) from Definition 3.8: S compact: L(ε) 1 = L(1 − εL)−1 1, but (1 − εL)−1 1 = 1,
because f − εLf = 1 is solved uniquely by f = 1, L1 = 0. Therefore L(ε) 1 = 0.
For the locally compact case see [Liggett].
It remains to show that (T_t^{(ε)})_{t≥0} is a semigroup with generator L^{(ε)}. But, more generally: if a bounded (linear) operator A on C(S) is a generator, then (T_t)_{t≥0} defined by T_t = e^{tA}, t ≥ 0 (defined as a power series) is a transition semigroup with generator A. To prove this, check Definition 3.3.
exists in C(S) and the convergence is uniform on compact intervals. Further (Tt )t≥0 is a
transition semigroup with generator L.
Proof. See [Liggett] Theorem 3.24.
Now we know that for each generator, there is a semigroup (Tt )t≥0 . Next we convince
ourselves that for each transition semigroup (Tt )t≥0 , there is a corresponding Feller process.
Theorem 3.18. Let (Tt )t≥0 be a transition semigroup. Then there is a Feller process
(Xt )t≥0 such that
Ex [f (Xs )g(Xt )] = Ex [f (Xs )EXs [g(Xt−s )]] = Ex [f (Xs )(Tt−s g)(Xs )] = (Ts h)(x),
where h(z) = f (z)(Tt−s g)(z) (this corresponds to the law of (Xs , Xt )). One now has to
show that (Xt )t≥0 can be chosen such that t 7→ Xt (ω) is in D[0, ∞), for this we refer to
[Liggett] Theorem 3.26.
Example 3.20. For a ∈ R, let (Bt )t≥0 be a BM and let Xt := at + Bt be BM with drift.
Then (Xt )t≥0 is again a Feller process with semigroup
T_t f(x) = E_x[f(at + B_t)] = ∫ φ_{x+at,t}(y) f(y) dy = ∫ φ_{0,1}(y) f(x + at + √t y) dy.
By the same arguments as before, for f ∈ D, the first term converges to (1/2) f″(x), uniformly in x, while the second term converges to a f′(x); the convergence of the second term is also uniform in x (see Exercises). Therefore the generator L of (Xt)t≥0 is given by

Lf = (1/2) f″ + a f′

and D ⊆ D(L).
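For f = sin, the semigroup of BM with drift is explicit (via the Gaussian characteristic function: E[sin(x + at + B_t)] = e^{−t/2} sin(x + at)), so the generator limit can be checked without simulation; a small sketch (a, x chosen arbitrarily):

```python
import math

a, x, t = 0.7, 1.2, 1e-6

# For f = sin: T_t f(x) = E[sin(x + a t + B_t)] = exp(-t/2) sin(x + a t).
Ttf = math.exp(-t / 2) * math.sin(x + a * t)

rate = (Ttf - math.sin(x)) / t
Lf = 0.5 * (-math.sin(x)) + a * math.cos(x)  # (1/2) f'' + a f'
assert abs(rate - Lf) < 1e-4
```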
Chapter 4

Generators, martingales, invariant distributions
Example. Let S = [0, 1], D(A) = {f ∈ C(S) : f 0 (0) exists} and define A by Af (x) =
f 0 (0) for all x ∈ S. Note that S is compact, and that A is not a probability generator
(why not?). Then the closure of {(f, Af) : f ∈ D(A)} is not the graph of a linear operator: there exists (f_n)_{n∈N} in D(A) such that f_n → f and Af_n → g, but g ≠ Af. Take for instance f_n(x) := n^{−1} sin(nx); then f_n′(x) = cos(nx), hence Af_n ≡ f_n′(0) = 1, but f_n → 0 in C(S), and 1 ≠ A0 = 0.
Lemma 4.2. (a) Assume A satisfies (a) and (b) in Definition 3.8. Then the closure Ā exists and Ā satisfies (a) and (b) in Definition 3.8 as well.

(b) If A satisfies (a), (b) and (c) in Definition 3.8, then A is closed.

(d) If A is closed and satisfies (b) in Definition 3.8, then R(1 − λA) is a closed subset of C(S).
and for f ∈ D(Ā) define Āf := lim_{n→∞} Af_n. We have to show that this is well-defined, that is: f_n ∈ D(A), f_n → 0, Af_n → h implies h = 0.
Choose g ∈ D(A). Due to (3.5) we have
||g − h − λAg|| ≥ ||g||.
Letting first λ → 0 and then approximating h by g (using (a) in Definition 3.8) we get
h = 0.
The closure Ā of A clearly satisfies (a) in Definition 3.8, since it extends A. To show that Ā satisfies (b), take f ∈ D(Ā), λ ≥ 0, with f − λĀf = g. Then there exists a sequence (f_n)_{n∈N} in D(A) such that f_n → f and Af_n → Āf, and since A satisfies (b) in Definition 3.8, we have

inf_{x∈S} f_n(x) ≥ inf_{x∈S} g_n(x),

where g_n = f_n − λAf_n. Now letting n → ∞ we get inf_{x∈S} f(x) ≥ inf_{x∈S} g(x).
Part (b): Let Ā be the closure of A. We want to show that Ā = A. Let f ∈ D(Ā) and λ > 0 small enough. Due to (c) in Definition 3.8 there is h ∈ D(A) such that
Definition 4.3. A collection D ⊂ D(L) is called a core for L if L is the closure of L|_D. In particular, L is determined by its values on a core.
Remark. We will see that in general a generator L is not determined by its restriction
to an arbitrary dense subset of D(L).
X_t^refl := |B_t|, t ≥ 0.

(X_t^refl)_{t≥0} is a Feller process, hence there is a semigroup (T_t^refl)_{t≥0} and a generator L^refl. For f ∈ C[0, ∞) let the even extension of f be

f_e(x) := f(|x|),  x ∈ R;

then we have

(T_t^refl f)(x) = E_x[f(|B_t|)] = E_x[f_e(B_t)]

for x ≥ 0. Hence

D(L^refl) = { f ∈ C[0, ∞) : f_e ∈ D(L) }.

But since f_e′(x) = f′(x) 1_{x>0} − f′(−x) 1_{x≤0}, we need f′(0) = 0 for f_e′ to be continuous. Hence (see Example 3.14 and the two remarks following it)

L^refl f = (1/2) f″,  D(L^refl) = { f ∈ C[0, ∞) : f′, f″ ∈ C[0, ∞), f′(0) = 0 }.
Example 4.5 (Brownian motion with absorption). Let τ := inf{s ≥ 0 : B_s = 0}. Define (with S = R_+)

X_t^abs := B_t 1_{t<τ}.

(X_t^abs)_{t≥0} is a Feller process, hence there is a corresponding transition semigroup (T_t^abs)_{t≥0} and generator L^abs. For f ∈ C[0, ∞) let f_o be the odd extension of f to R, i.e.

f_o(x) = f(x) 1_{x≥0} + ( 2f(0) − f(−x) ) 1_{x<0}.

Using the strong Markov property, we have

E_x[f_o(B_t) 1_{τ≤t}] = E_x[f_o(−B_t) 1_{τ≤t}] = E_x[ (1/2)( f_o(B_t) + f_o(−B_t) ) 1_{τ≤t} ] = f(0) P_x(τ ≤ t).

Therefore for x ≥ 0

(T_t^abs f)(x) = E_x[ f(X_t^abs) 1_{τ>t} ] + E_x[ f(X_t^abs) 1_{τ≤t} ]
generator, although D_r is dense in C[0, ∞). Namely, L^abs and L^refl are proper extensions of A. But a generator cannot properly extend another generator.
(ii) For s < t we can use the Markov property to write E_x[M_t | F_s] in the following way:

E_x[M_t | F_s] = E_x[f(X_{t−s} ∘ θ_s) | F_s] − ∫_{[0,s]} Lf(X_u) du − E_x[ ∫_{[0,t−s]} Lf(X_u ∘ θ_s) du | F_s ]

= E_{X_s}[f(X_{t−s})] − ∫_{[0,s]} Lf(X_u) du − E_{X_s}[ ∫_{[0,t−s]} Lf(X_u) du ]

= E_{X_s}[M_{t−s}] − ∫_{[0,s]} Lf(X_u) du

= f(X_s) − ∫_{[0,s]} Lf(X_u) du  P_x-a.s. for all x ∈ S.
Hence

∫ f d(μT_t) = ∫ (T_t f) dμ.  (4.2)
Definition 4.8. µ ∈ M1 (S) is invariant for the Feller process (Xt )t≥0 with transition
semigroup (Tt )t≥0 if µTt = µ for all t ≥ 0, i.e.
∫ (T_t f) dμ = ∫ f dμ,  ∀f ∈ C(S), t ≥ 0.  (4.3)
We write I for the collection of invariant distributions. Due to (4.3), I is a convex set.
We write Ie for the set of extremal elements1 of I.
Exercise 4.9. Let μ ∈ M_1(S) and assume that μT_t → ν weakly as t → ∞. Then ν is invariant.
Theorem 4.10. Let D be a core for L. Then μ ∈ M_1(S) is invariant for the corresponding Feller process if and only if ∫ Lf dμ = 0 for all f ∈ D.
(ii) Assume

∫ Lf dμ = 0,  ∀f ∈ D.

Since D is a core, there is, for each f ∈ D(L), a sequence (f_n)_{n∈N} in D such that f_n → f and Lf_n → Lf. Therefore

∫ Lf dμ = 0,  ∀f ∈ D(L).

If f ∈ D(L) and f − λLf = g, then ∫ f dμ = ∫ g dμ. Iteration yields

∫ (1 − λL)^{−n} g dμ = ∫ g dμ.
The following theorem gives a sufficient condition for the existence of an in-
variant distribution.
¹An element v in a convex set C is extremal if v = αv_1 + (1 − α)v_2 with v_1, v_2 ∈ C and α ∈ (0, 1) implies v_1 = v_2 = v.
Hence

∫_S f dν_n − ∫_S (T_t f) dν_n = ∫_S f dν_n − ∫_S f d(ν_n T_t)

= (1/n) [ ∫_{[0,n]} ∫_S T_r f dμ dr − ∫_{[t,n+t]} ∫_S T_r f dμ dr ]   (4.4)

= (1/n) [ ∫_{[0,t]} ∫_S T_r f dμ dr − ∫_{[n,t+n]} ∫_S T_r f dμ dr ]

→ 0  as n → ∞.
Since S is compact, M1 (S) is compact as well (Prohorov’s theorem, see [Liggett] Theorem
A.21) and therefore there exists a subsequence (nk )k∈N such that
ν_{n_k} → ν weakly as k → ∞,
for some ν ∈ M1 (S). Since Tt f ∈ C(S), we can take the limit in (4.4) along the subse-
quence (nk )k∈N and we get
∫ f dν = ∫ T_t f dν.
4.4 Examples
Example 4.13. We have seen: if (Xt)t≥0 is BM, then the corresponding generator is Lf = (1/2) f″ with

D(L) = {f ∈ C(R) : f′, f″ ∈ C(R)}.
Consider Y_t = X_{ct}, c > 0, t ≥ 0. Then (Yt)t≥0 has the generator given by

lim_{t↓0} ( E_x[f(Y_t)] − f(x) )/t = c lim_{t↓0} ( E_x[f(X_{ct})] − f(x) )/(ct) = (c/2) f″(x),  f ∈ D(L).
((1 − x)², 2x(1 − x), x²),  where x = (N_2 + 2N_3)/(2N).

In other words,

P(Ñ_1 = n_1, Ñ_2 = n_2, Ñ_3 = n_3) = binom(N, n_1) binom(N − n_1, n_2) (1 − x)^{2n_1} (2x(1 − x))^{n_2} x^{2n_3}

if n_1 + n_2 + n_3 = N, and zero otherwise. Therefore Ñ_2 + 2Ñ_3 has the law Bin(2N, x). Let X_n denote the proportion of allele A in the n-th generation. (X_n)_{n∈N} is a Markov chain in discrete time with state space {0, 1/(2N), 2/(2N), ..., 1}. Heuristics:
L_N f(x) = E_x[f(X_1)] − f(x) = Σ_{k=0}^{2N} binom(2N, k) x^k (1 − x)^{2N−k} ( f(k/(2N)) − f(x) ).  (4.5)
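The heuristic (4.5) can be evaluated exactly for f(u) = u²: since the next generation is Bin(2N, x)/(2N), we get L_N f(x) = Var(Bin(2N, x))/(2N)² = x(1 − x)/(2N), so 2N · L_N f(x) equals (1/2) x(1 − x) f″(x). A minimal sketch (N, x arbitrary):

```python
from math import comb

def L_N(f, x, N):
    """L_N f(x) = sum_k C(2N,k) x^k (1-x)^(2N-k) (f(k/(2N)) - f(x))."""
    return sum(comb(2 * N, k) * x**k * (1 - x)**(2 * N - k)
               * (f(k / (2 * N)) - f(x)) for k in range(2 * N + 1))

N, x = 50, 0.3
f = lambda u: u * u

# For f(u) = u^2: 2N * L_N f(x) = x(1-x) = (1/2) x(1-x) f''(x).
assert abs(2 * N * L_N(f, x, N) - x * (1 - x)) < 1e-9
```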
Theorem 4.15. Let S = [0, 1] and Lf(x) = (1/2) x(1 − x) f″(x), f polynomial. Then

and

P_x(X_τ = 1) = x,  E_x[ ∫_{R_+} X_t(1 − X_t) dt ] = x(1 − x).
(b) Let f be a polynomial with f − λLf = g for some λ ≥ 0. f has a minimum at some x0 ∈ [0, 1].
If x0 ∈ (0, 1), then f''(x0) ≥ 0, so Lf(x0) = ½ x0(1 − x0) f''(x0) ≥ 0. If x0 ∈ {0, 1}, then
Lf(x0) = 0. In any case Lf(x0) ≥ 0, hence min f = f(x0) = g(x0) + λLf(x0) ≥ min g. Now,
given a polynomial g, we solve
    f − λLf = g.   (4.7)
Write g(x) = Σ_{k=0}^n a_k x^k and assume f(x) = Σ_{k=0}^n b_k x^k; then (4.7) becomes
    b_k − λ( ½ k(k + 1) b_{k+1} − ½ (k − 1)k b_k ) = a_k,  k ∈ {1, . . . , n},  a_0 = b_0,
with b_{n+1} = 0. Solve this recursively, starting with b_n and ending with b_1. Hence
R(1 − λL) contains all polynomials and is dense in C[0, 1]. Hence R(1 − λL) = C[0, 1]
due to Lemma 4.2 (d).
    lim_{t→∞} Xt =: X∞ exists P-a.s. and Ex[Xt] = x for all x ∈ [0, 1].   (4.8)
for some γ, β, C > 0. Then the paths t ↦ Xt(ω), 0 ≤ t ≤ T, are continuous for P-a.a. ω.
Proof. See [Liggett] Theorem 3.27.
To check the hypothesis, fix y ∈ [0, 1] and f(x) = (x − y)². Then Lf(x) = x(1 − x) and
therefore
    (Xt − y)² − ∫_{[0,t]} Xs(1 − Xs) ds,  t ≥ 0,
is a martingale. Hence
    Ey[(Xt − y)²] = ∫_{[0,t]} Ey[Xs(1 − Xs)] ds ≤ ¼ t,   (4.10)
since x(1 − x) ≤ ¼. But this does not suffice to apply Theorem 4.16! Take f(x) = (x − y)⁴;
then Lf(x) = 6x(1 − x)(x − y)². Therefore also
    (Xt − y)⁴ − 6 ∫_{[0,t]} Xs(1 − Xs)(Xs − y)² ds,  t ≥ 0,
is a martingale and
    Ey[(Xt − y)⁴] = 6 ∫_{[0,t]} Ey[Xs(1 − Xs)(Xs − y)²] ds
                  ≤ (3/2) ∫_{[0,t]} Ey[(Xs − y)²] ds
                  ≤ (3/2) ∫_{[0,t]} ¼ s ds = (3/16) t²,
where the last line uses (4.10).
where f(x) = 2x log(x) + 2(1 − x) log(1 − x). Note that f''(x) = 2/x + 2/(1 − x), x ∈ (0, 1), and
hence ½ x(1 − x) f''(x) = 1 for all x ∈ (0, 1). However f ∉ C²[0, 1] and f ∉ D(L), because
otherwise f(Xt) − t would be a martingale; but f(Xt) − t ≤ 0 and f(Xt) − t → −∞, which
yields a contradiction to the martingale convergence theorem. Therefore take fε ∈ C²[0, 1]
with fε = f on [ε, 1 − ε];
then
    fε(Xt) − ∫_{[0,t]} Lfε(Xs) ds
is a martingale, and hence f(X_{τε∧t}) − (τε ∧ t) is a Px-martingale for ε < x < 1 − ε. Hence
Exercise 4.17. Let (Bt)t≥0 be BM, x ∈ [0, 1] and τ = inf{s ≥ 0 : Bs ∈ {0, 1}}. Then
Ex[τ] = x(1 − x).
Spin systems
We want to find conditions on c(x, η) such that the closure of L is a generator. Define
    |||f||| := Σ_{x∈V} sup_{η∈S} |f(η^{(x)}) − f(η)|
and let
    D := {f ∈ C(S) : |||f||| < ∞}.
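For a finite V the triple norm is a finite sum and can be computed by brute force. In the sketch below (the vertex set and the weights w are hypothetical choices), f is linear in the spins, so sup_η |f(η^{(x)}) − f(η)| = |w(x)| and |||f||| = Σ_x |w(x)|.

```python
from itertools import product

V = [0, 1, 2]
w = {0: 1.0, 1: 0.5, 2: 0.25}                    # hypothetical weights
f = lambda eta: sum(w[x] * eta[x] for x in V)    # f(eta) = sum_x w(x) eta(x)

def flip(eta, x):
    """eta^(x): the configuration eta with the spin at x flipped."""
    return tuple(1 - eta[y] if y == x else eta[y] for y in V)

def delta_f(x):
    """Delta_f(x) = sup_eta |f(eta^(x)) - f(eta)|, by enumerating all eta."""
    return max(abs(f(flip(eta, x)) - f(eta))
               for eta in product((0, 1), repeat=len(V)))

triple_norm = sum(delta_f(x) for x in V)
assert delta_f(0) == 1.0          # for linear f, Delta_f(x) = |w(x)|
assert triple_norm == 1.75        # |||f||| = sum of the weights
```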
CHAPTER 5. SPIN SYSTEMS 47
Proof. If f ∈ D, let (ξi) ⊂ S be a sequence with ξ0 = η and f(ξi) → f(ξ) as i → ∞. Then
    |f(η) − f(ξ)| ≤ Σ_{x∈I} αf(x) = d_{αf}(ξ, η),
(ii) |||f ||| < ∞ alone does not imply the continuity of f . Proof: exercise.
Proof. Let V = {x1, x2, ...}. For all ε > 0 there exists Nε such that for all η ∈ S,
    Σ_{i≥Nε} c(xi, η) |f(η^{(xi)}) − f(η)| ≤ C Σ_{i≥Nε} αf(xi) < ε.
Now the claim follows since f ∈ C(S) and c(xi , ·) is continuous for every i ∈ N.
Interpretation:
• inf η∈S [c(u, η) + c(u, η (u) )] is the amount of flipping at u that occurs independently
of the rest of the configuration.
• So ε is the minimal amount of flipping that occurs without interaction between sites.
• a(x, u) describes how much the flip rate at site x depends on the configuration at u
• and M gives the maximal amount of dependence of the flip rate at a site on the rest
of the configuration.
Let ℓ¹(V) be the Banach space of functions α : V → R that satisfy
    ∥α∥ := Σ_{x∈V} |α(x)| < ∞.
Note that if M < ∞, then Γ is well-defined and bounded, and we have ∥Γ∥ = M. More
precisely, note that
    ∥Γα∥ = Σ_{u∈V} |Γα(u)| ≤ Σ_{u∈V} Σ_{x: x≠u} |α(x)| a(x, u) = Σ_{x∈V} ( Σ_{u: u≠x} a(x, u) ) |α(x)| ≤ M Σ_{x∈V} |α(x)|,
which gives ∥Γ∥ ≤ M; we leave it as an exercise to show equality. For f ∈ C(S) and
x ∈ V, let
Remark. Note that if λM < 1 + λε, then the operator in (5.5) is well-defined, since
    ∥[(1 + λε)1 − λΓ]^{−1}∥ ≤ (1/(1 + λε)) Σ_{k=0}^∞ (λ/(1 + λε))^k ∥Γ∥^k.
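A concrete 2×2 illustration of this bound (the matrix entries, λ and ε are arbitrary choices): on ℓ¹ the operator norm of a matrix is its maximal column sum, so ∥Γ∥ = M, and summing the geometric series gives ∥[(1 + λε)1 − λΓ]^{−1}∥ ≤ 1/(1 + λε − λM) when λM < 1 + λε.

```python
# 2x2 example in pure Python. Gamma[u][x] = a(x, u); the l^1 operator norm
# of a matrix is its maximal column sum, so ||Gamma|| = M.
Gamma = [[0.0, 0.6],
         [0.7, 0.0]]
lam, eps = 1.0, 0.5
M = max(sum(abs(Gamma[u][x]) for u in range(2)) for x in range(2))  # = 0.7
assert lam * M < 1 + lam * eps        # condition for the Neumann series

# A = (1 + lam*eps) I - lam*Gamma, inverted with the explicit 2x2 formula
A = [[1 + lam*eps - lam*Gamma[0][0], -lam*Gamma[0][1]],
     [-lam*Gamma[1][0], 1 + lam*eps - lam*Gamma[1][1]]]
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
Ainv = [[A[1][1]/det, -A[0][1]/det],
        [-A[1][0]/det, A[0][0]/det]]
norm_inv = max(sum(abs(Ainv[u][x]) for u in range(2)) for x in range(2))
# Geometric-series bound: ||A^{-1}|| <= 1 / (1 + lam*eps - lam*M)
assert norm_inv <= 1 / (1 + lam*eps - lam*M) + 1e-12
```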
Theorem 5.2. Let M < ∞ and define L as in (5.1) for f ∈ D. Then L is a generator
and the corresponding semigroup (Tt )t≥0 satisfies
for all f ∈ D(L), where the inequality is again pointwise. In particular, if f ∈ D then
Tt f ∈ D and we have
Here
    a(x, u) = 1/(2d) if |u − x| = 1, and a(x, u) = 0 otherwise, so that
    M = sup_{x∈V} Σ_{u: u≠x} a(x, u) = 1.
We have
    a(x, u) = sup_{η∈S} |c(x, η) − c(x, η^{(u)})| = λ if |u − x| = 1, and a(x, u) = 0 otherwise.
Ln is a generator for the spin system where the spins {η(x), x ∉ Vn} are frozen. Let us
write Γn for the quantity from (5.2) corresponding to Ln, and note that Γn ≤ Γ; that is,
if α ∈ ℓ¹(V) is non-negative then Γnα ≤ Γα pointwise.
In particular ∥Γn∥ ≤ ∥Γ∥. Since Ln is a bounded operator (the sum in (5.7) is finite), we
have
due to Lemma 3.15 and Lemma 4.2 (c) for all λ ≥ 0. Hence for a given g ∈ D, there is
fn ∈ C(S) such that
fn − λLn fn = g.
Now let λ be small enough that λM < 1 + λε. Since Ln satisfies (5.3), we must have fn ∈ D
for all n ∈ N, due to the remark after Lemma 5.1. Hence we can define
The first inequality is from Lemma 5.1, and the last from (5.8) and the definition of the
inverse. Since ∆g ∈ ℓ¹(V), the r.h.s. of (5.9) goes to zero as n → ∞, hence gn → g. It
follows that g ∈ R(1 − λL), so
D ⊂ R(1 − λL).
We conclude that R(1 − λL) is dense in C(S) as well. Due to Lemma 4.2 (d), R(1 − λL)
is a closed subspace of C(S). Therefore
Hence, L has the properties (a) - (d) in Definition 3.8 and is a generator.
Definition 5.5. The spin system (ηt)t≥0 (the Feller process with generator L from Theorem 5.2) is called ergodic if there exists a unique invariant distribution µ and
    lim_{t→∞} (Tt f)(η) = ∫ f dµ,  ∀η ∈ S, f ∈ C(S).
In other words, (ηt)t≥0 is ergodic if there exists µ ∈ M1(S) such that
    νTt → µ weakly as t → ∞,  ∀ν ∈ M1(S).
t→∞
Example (Voter model (continued)). Recall Example 5.3. The process (ηt )t≥0 is not
ergodic, since there are at least two invariant distributions: δ0 (the Dirac measure on
the configuration η(x) = 0, ∀x ∈ V ) and δ1 (the Dirac measure on the configuration
η(x) = 1, ∀x ∈ V ) are invariant.
Therefore
    sup_{η,ξ} |f(η) − f(ξ)| ≤ Σ_{x∈V} ∆f(x) = |||f|||,
and hence
    sup_{η,ξ} |Tt f(η) − Tt f(ξ)| ≤ |||Tt f||| ≤ e^{(M−ε)t} |||f|||.
5.3 Examples
Example 5.7. Voter model (more general). Let V be a vertex set and let (p(x, y))x,y∈V
be an irreducible stochastic matrix with p(x, x) = 0 for all x ∈ V. Let
    c(x, η) = Σ_{y: η(y)≠η(x)} p(x, y).
Interpretation: With rate p(x, y), the voter in x takes the opinion of the voter in y.
We have
    M = sup_{x∈V} Σ_{u: u≠x} p(x, u) = 1.
Example 5.8. Contact process (more general). Let (V, E) be a graph with vertices
V and edges E, and assume that the graph has bounded degree. Write x ∼ y if x, y are
neighbors, i.e. if (x, y) ∈ E. Fix λ > 0. Let
    c(x, η) = 1 if η(x) = 1,  and  c(x, η) = λ #{y : y ∼ x, η(y) = 1} if η(x) = 0.
Interpretation: η(x) = 1 means that the particle at x is infected, η(x) = 0 means that
the particle at x is healthy. Infected particles recover with rate 1, healthy particles are
infected by their neighbors with rate λ. The Dirac measure δ0 on “everybody healthy” is
invariant.
Question: Are there other invariant measures? Theorem 5.6 implies that I = {δ0} if
    λ < 1 / max_{x∈V} deg(x).
Example 5.9. Noisy voter model. Let (p(x, y))x,y∈V be an irreducible stochastic matrix
with p(x, x) = 0 for all x ∈ V. Fix β, δ ≥ 0 and let
    c(x, η) = Σ_{y: η(y)≠η(x)} p(x, y) + δ 1{η(x)=1} + β 1{η(x)=0}.
We have ε = 2 and M = 2d e^{2dβ}(1 − e^{−2β}). Hence (ηt)t≥0 is well-defined, and ergodic
if β is small enough. In fact, as we will see soon:
(i) (ηt )t≥0 is always ergodic if d = 1.
(ii) (ηt )t≥0 is not ergodic for β large enough if d ≥ 2.
Theorem 5.12. Let (ηt)t≥0 be a spin system with rates c1(x, η) and (ξt)t≥0 be a spin
system with rates c2(x, ξ). If η ≤ ξ implies that
    c1(x, η) ≤ c2(x, ξ) whenever η(x) = ξ(x) = 0,  and  c1(x, η) ≥ c2(x, ξ) whenever η(x) = ξ(x) = 1,   (5.10)
then there is a coupling (ηt, ξt)t≥0 for starting values η ≤ ξ such that
    P(η,ξ)(ηt ≤ ξt, ∀t ≥ 0) = 1.
(ηt , ξt )t≥0 is a Feller process with S = {(0, 0), (0, 1), (1, 1)}V and its generator is given in
terms of the rates as in (5.1). The construction of the process is analogous to the proof of
Theorem 5.2. The assumptions guarantee that all rates are non-negative. Finally we have
to check that (ηt )t≥0 and (ξt )t≥0 are Feller processes with rates c1 (x, η) and c2 (x, η). For
instance
and
Note that (ηt )t≥0 and (ξt )t≥0 are not independent. In the particular case c1 (x, η) =
c2 (x, ξ) in (5.10), this leads to the definition of an attractive spin system.
Definition 5.13 (Attractive spin system). The spin system with rates c(x, η) is attractive if for all η ≤ ξ,
    c(x, η) ≤ c(x, ξ) whenever η(x) = ξ(x) = 0,  and  c(x, η) ≥ c(x, ξ) whenever η(x) = ξ(x) = 1.
Example. The contact process, the noisy voter model and the Ising model are
attractive.
Example. The anti-voter model on Z^d, given by
    c(x, η) = (1/(2d)) Σ_{y: x∼y} 1{η(y)=η(x)},
is not attractive.
Corollary 5.14. For an attractive spin system, there is a coupling (ηt , ξt )t≥0 of two copies
of the process, such that for η ≤ ξ we have
P(η,ξ) (ηt ≤ ξt , ∀t ≥ 0) = 1.
Definition 5.15.
(i) A function f ∈ C(S) is increasing if
η ≤ ξ ⇒ f (η) ≤ f (ξ).
Lemma 5.16. Let (ηt )t≥0 be an attractive spin system. Then the transition semigroup
(Tt )t≥0 satisfies
(i) f ∈ C ↑ ⇒ Tt f ∈ C ↑ , ∀t ≥ 0.
(ii) µ ≼ ν ⇒ µTt ≼ νTt.
Proof. (i) Let η ≤ ξ. Then
    (Tt f)(η) = Eη[f(ηt)]
              = E(η,ξ)[f1(ηt, ξt)],  f1(η, ξ) := f(η),
              ≤ E(η,ξ)[f2(ηt, ξt)],  f2(η, ξ) := f(ξ),   (Corollary 5.14, f increasing)
              = Eξ[f(ξt)]
              = (Tt f)(ξ).
Theorem 5.17. Let (ηt )t≥0 be an attractive spin system with semigroup (Tt )t≥0 . Then we
have
ν̄ and ν̲ are called the upper invariant measure and the lower invariant measure.
(ii) For each sequence (tk ) with tk → ∞, δ0 Ttk has a limit point (due to compactness) -
but due to (i), all these limit points agree. (Exercise: make this argument precise!).
Theorem 5.19. The stochastic Ising model (see Example 5.10) with d = 1 is ergodic for
all β ≥ 0.
and
Let m ∈ N,
    cm(x, η) = c(x, η) if |x| < m;  K if |x| ≥ m and η(x) = 0;  0 if |x| ≥ m and η(x) = 1.
¹The limits are w.r.t. weak convergence of probability measures on S.
Due to Theorem 5.12 one can couple the spin systems (ηt )t≥0 and (ξt )t≥0 with starting
configurations η0 ≡ 1, ξ0 ≡ 1 and rates c(x, η) and cm (x, ξ) such that ηt ≤ ξt for all t ≥ 0.
The process ξt(x), −m + 1 ≤ x ≤ m − 1, is an irreducible MC in continuous time with state
space {0, 1}^{2m−1}. Therefore the law of ξt converges weakly towards the unique invariant
distribution µm as t → ∞. Extend µm to {0, 1}^Z by setting, for ζ ∈ {0, 1}^Z,
Then ν̄ ≼ µm. Let us compute µm. Let µ ∈ M1({0, 1}^Z) be the law of a (two-sided) time-discrete MC with state space {0, 1} and transition matrix
    P = (1/(e^β + e^{−β})) [ e^β  e^{−β} ; e^{−β}  e^β ]
and with µ(ζ(0) = 0) = 1/2 = µ(ζ(0) = 1). The finite-dimensional marginals of µ are given
by
    µ({ζ : ζ(x) = z(x) for k ≤ x ≤ l}) = (1/2) Π_{j=k}^{l−1} p(z(j), z(j + 1))   (5.11)
for z ∈ {0, 1}^Z.
Claim: The conditional law µ(· | ζ(x) = 1 ∀x with |x| ≥ m) is reversible
for the (time-continuous) MC (ξt)t≥0, hence it is invariant. This implies that
Proof. Exercise.
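The structure of µ can be checked numerically. The sketch below (β is an arbitrary choice) builds the symmetric transition matrix P, for which the uniform distribution (½, ½) is trivially reversible, and verifies that the marginals (5.11) are consistent under summing out a coordinate.

```python
from math import exp

beta = 0.8
Z2 = exp(beta) + exp(-beta)
p = {(0, 0): exp(beta)/Z2, (0, 1): exp(-beta)/Z2,
     (1, 0): exp(-beta)/Z2, (1, 1): exp(beta)/Z2}

# P is a symmetric stochastic matrix, so (1/2, 1/2) is reversible for it.
assert abs(p[0, 0] + p[0, 1] - 1) < 1e-12
assert abs(0.5 * p[0, 1] - 0.5 * p[1, 0]) < 1e-12

def mu_marginal(z):
    """Finite-dimensional marginal (5.11): mu(zeta(x) = z(x), k <= x <= l)
    for the stationary two-sided chain with uniform one-site marginal."""
    prob = 0.5
    for a, b in zip(z, z[1:]):
        prob *= p[a, b]
    return prob

# Consistency: summing out the last coordinate recovers the shorter marginal.
assert abs(mu_marginal((0, 1, 0)) + mu_marginal((0, 1, 1))
           - mu_marginal((0, 1))) < 1e-12
```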
Now we show that µ = lim_{m→∞} µm. For −m < k < l < m, one expresses the marginal
µm({ζ : ζ(x) = z(x) for k ≤ x ≤ l}) through the entries P^{(i)}(·, ·) of the i-th power of P;
this is (5.12). The convergence theorem for MCs in discrete time gives
    lim_{n→∞} P^{(n)}(u, v) = 1/2,  ∀u, v ∈ {0, 1}.
Hence the term in (5.12) converges to the r.h.s. of (5.11). Since ν̄ ≼ µm and lim_{m→∞} µm =
µ, we conclude that ν̄ ≼ µ. The same game, exchanging 0 and 1, yields µ ≼ ν̲ and hence
ν̄ = ν̲, which implies that (ηt)t≥0 is ergodic with unique invariant distribution µ.
Corollary 5.20. The unique invariant distribution for the stochastic Ising model with
d = 1 and β ≥ 0 is the probability measure µ with finite dimensional marginals given by
(5.11).
Consider the stochastic Ising model with S = {0, 1}^{Z^d}. For simplicity of notation we
can pass to S = {−1, +1}^{Z^d} and take
    c(x, η) = exp( −β Σ_{y: |y−x|=1} η(x)η(y) ).   (5.13)
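A quick sanity check of (5.13), under the assumption that the relevant Gibbs measure is π(η) ∝ exp(β Σ_{x∼y} η(x)η(y)): on a small ring (the size n = 4 and β are arbitrary choices) one can verify detailed balance π(η) c(x, η) = π(η^{(x)}) c(x, η^{(x)}) for every configuration and site by brute force.

```python
from math import exp
from itertools import product

n, beta = 4, 0.7                        # ring Z/4Z, arbitrary beta
nbrs = lambda x: [(x - 1) % n, (x + 1) % n]

def energy(eta):                        # sum over edges of eta(x)*eta(y)
    return sum(eta[x] * eta[(x + 1) % n] for x in range(n))

def rate(x, eta):                       # flip rate (5.13)
    return exp(-beta * eta[x] * sum(eta[y] for y in nbrs(x)))

def flip(eta, x):
    return tuple(-s if y == x else s for y, s in enumerate(eta))

# Detailed balance: pi(eta) c(x, eta) = pi(eta^(x)) c(x, eta^(x)),
# with unnormalized pi(eta) = exp(beta * energy(eta)).
for eta in product((-1, 1), repeat=n):
    for x in range(n):
        lhs = exp(beta * energy(eta)) * rate(x, eta)
        rhs = exp(beta * energy(flip(eta, x))) * rate(x, flip(eta, x))
        assert abs(lhs - rhs) < 1e-9
```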
Theorem 5.21. Assume that d ≥ 2. Then there exists βc = βc (d) such that for all
β > βc , the stochastic Ising model has at least two invariant distributions and is therefore
not ergodic.
Remark. We know by Theorem 5.6 that the stochastic Ising model is ergodic if β is small
enough. In the stochastic Ising model we have
    ε = 2,  M = 2d e^{2dβ}(1 − e^{−2β}),
Proof of Theorem 5.21. Assume first that d = 2. We have to show that ν̲ ≠ ν̄ for β large
enough. Let Λ = {−n, . . . , n} × {−n, . . . , n} and let µΛ,+ be the (unique) invariant distribution for the (finite state space) MC on S = {−1, 1}^Λ given by the rates in (5.13) with η(x) = 1
for all x ∈ Λᶜ. (Exercise: what is the Q-matrix of this Markov chain?) We will show that for β
large enough,
    lim_{|Λ|→∞} µΛ,+(η(0) = −1) < 1/2.
Lemma 5.22.
    µΛ,+(η|Λ) = (1/Z) exp( β Σ_{x∈Λ, y∈Λ∪∂Λ, |y−x|=1} η(x)η(y) ).
Peierls's contour argument: For each configuration, draw edges between spins of opposite signs. These edges form contours, and any pair of configurations (obtained from
each other by flipping all signs) corresponds to a unique contour configuration. Contours
are cycles in the dual graph with vertex set Z² + (1/2, 1/2), with edges between dual vertices
at distance 1.
[Figure: a 5 × 5 block of spins, mostly +, with contours separating the − spins from the surrounding + spins.]
To estimate the probability that η(0) = −1, observe that if η(0) = −1, the origin must
lie inside at least one contour. Let B(η) be the set of all contours of η. We estimate, for
γ a contour surrounding 0, µΛ,+(γ ∈ B(η)). Let
    |B(η)| = #{edges in B(η)}.
Then, due to Lemma 5.22,
    µΛ,+(η : γ ∈ B(η)) = Σ_{η: γ∈B(η)} e^{−2β|B(η)|} / Σ_{η∈S} e^{−2β|B(η)|}.
Define η̃ by
    η̃(x) = −η(x) if γ surrounds x,  and  η̃(x) = η(x) otherwise.
Then B(η̃) is B(η) without the contour γ. Hence
    µΛ,+(η : γ ∈ B(η)) = e^{−2β|γ|} · Σ_{η: γ∈B(η)} e^{−2β|B(η̃)|} / Σ_{η∈S} e^{−2β|B(η)|} ≤ e^{−2β|γ|},
since η ↦ η̃ is injective on {η : γ ∈ B(η)}, so the quotient is at most 1.
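The resulting estimate µΛ,+(η(0) = −1) ≤ Σ_γ e^{−2β|γ|} can be made numeric with a crude count of contours. The sketch below assumes the rough bound L · 3^L on the number of contours of length L around the origin (sharper counts exist); already at β = 1 the sum lies below ½, uniformly in the box size.

```python
from math import exp

def peierls_bound(beta, Lmax=200):
    """Crude upper bound for mu_{Lambda,+}(eta(0) = -1): sum over contour
    lengths L of (#contours of length L around 0) * e^{-2 beta L}, using the
    rough count L * 3^L (an assumption; only even L >= 4 actually occur,
    so summing all L >= 4 over-counts)."""
    return sum(L * 3**L * exp(-2 * beta * L) for L in range(4, Lmax))

# For beta = 1 the bound is already below 1/2, so the plus boundary
# condition retains its influence as the box grows.
assert peierls_bound(1.0) < 0.5
assert peierls_bound(2.0) < peierls_bound(1.0)   # the bound decreases in beta
```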
We saw already that on any graph with bounded degree there is a Feller process - the
contact process - corresponding to these rates, and that the contact process is ergodic if
    λ < 1 / max_{x∈V} deg(x).
Our goal is now to show that on Zd the contact process is not ergodic if λ is large enough.
[Figure: graphical representation of the contact process on Z; time t runs upward, with recovery marks δ at the sites −2, −1, 0, 1, 2.]
CHAPTER 6. THE CONTACT PROCESS 61
Let (x, s), (y, t) ∈ V × [0, ∞), s ≤ t. We define a (directed) path from (x, s) to (y, t) to be
a sequence
We write (x, s) → (y, t) if there exists such a directed path. Note that (x, s) → (y, t)
means that y is infected at time t if x is infected at time s. Let η0 ∈ S = {0, 1}V and
define ηt ∈ S for t ∈ [0, ∞), by taking ηt (y) = 1 if and only if there exists x ∈ V such
that η0 (x) = 1 and (x, 0) → (y, t). Then (ηt )t≥0 is a contact process with parameter λ.
1. It gives a coupling of all the contact processes with parameter λ and any starting
configuration η0 .
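This coupling property can be seen in a minimal simulation sketch of the graphical representation on a finite segment (the parameters n, λ and T are arbitrary choices): the recovery marks and infection arrows are generated once, and running two initial conditions A ⊆ B through the same marks yields ordered processes.

```python
import random

def events(n, lam, T, rng):
    """One realization of the graphical representation on {0,...,n-1}:
    recovery marks (rate 1 per site) and infection arrows (rate lam per
    directed neighbor pair), listed in increasing time order."""
    ev = []
    for x in range(n):
        t = rng.expovariate(1.0)
        while t < T:
            ev.append((t, 'recover', x, -1)); t += rng.expovariate(1.0)
        for y in (x - 1, x + 1):
            if 0 <= y < n:
                t = rng.expovariate(lam)
                while t < T:
                    ev.append((t, 'infect', x, y)); t += rng.expovariate(lam)
    return sorted(ev)

def run(initial, ev):
    """Set of infected sites at time T, starting from `initial`."""
    infected = set(initial)
    for t, kind, x, y in ev:
        if kind == 'recover':
            infected.discard(x)
        elif x in infected:          # arrow x -> y transmits the infection
            infected.add(y)
    return infected

rng = random.Random(0)
ev = events(n=10, lam=2.0, T=3.0, rng=rng)
A, B = {5}, set(range(10))           # A is a subset of B
# Same randomness for both: the graphical representation couples all
# initial conditions, and the coupling is monotone.
assert run(A, ev) <= run(B, ev)
```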
    P^A_λ(ηt = 0 on B) = P^B_λ(ηt = 0 on A),   (6.1)
Proof. The event on the l.h.s. of (6.1) is the union over a ∈ A and b ∈ B of the events
(a, 0) → (b, t). If we reverse the direction of time and the direction of the arrows of
infection, the probability of the event does not change and is now the probability of the
event on the r.h.s. of (6.1).
By Theorem 6.1,
    P^{{0}}_λ(ηt ∩ Z^d ≠ ∅) = P_λ(η^{Z^d}_t(0) = 1),
and by the weak convergence of δ1 Tt to ν̄,
    P_λ(η^{Z^d}_t(0) = 1) → ν̄({γ ∈ S : γ(0) = 1})  as t → ∞.
Since ν is shift-invariant, the claim follows.
We define the critical value λc of the contact process by
    λc = λc(d) = sup{λ : θ(λ) = 0} ∈ [0, ∞].
λ ↦ θ(λ) is non-decreasing. Hence
    θ(λ) = 0 for λ < λc,  and  θ(λ) > 0 for λ > λc.
Theorem 6.3. For d ≥ 1, we have
    1/(2d) ≤ λc(d) < ∞.
Remark. The bounds can be improved.
Proof of Theorem 6.3. We know already that
    λc(d) ≥ 1/(2d),
since for the contact process ε = 1 and M = 2dλ, hence M < ε if λ < 1/(2d). Since Z is a
subgraph of Z^d, we have
    λc(d) ≤ λc(1).
Hence it suffices to show λc(1) < ∞. Fix ∆ > 0 and let m, n ∈ Z such that m + n is
even. We define independent random variables Xm,n taking values 0 or 1. Let Xm,n = 1,
and call (m, n) open, if in the graphical representation the following two events occur:
    p(∆) ≥ λ² / ((1 + λ)(2 + λ)) > p^{site}_c
and the last inequality is satisfied for λ large enough (we have p(∆) → 1 for λ → ∞).
Hence λc (1) < ∞.
Proof of Lemma 6.4. Peierls's argument: a finite cluster of open sites has to be surrounded by a closed contour. Consider contours surrounding the connected component of
sites which are reachable from (0, 0). Then
    Pp(∃ closed contour of length n) ≤ 3^n (1 − p)^n,
since there are at most 3^n possible contours of length n and each is closed with probability
(1 − p)^n. Now, for p close to 1,
    Σ_{n∈N} 3^n (1 − p)^n < 1
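Numerically, the geometric series Σ_n (3(1 − p))^n drops below 1 exactly for p > 5/6, so this crude bound already gives percolation of the open sites for p close to 1 (a sketch; the truncation N is an arbitrary choice):

```python
def contour_series(p, N=2000):
    """Partial sum of sum_{n>=1} 3^n (1-p)^n, the bound on the expected
    number of closed contours."""
    q = 3 * (1 - p)
    return sum(q**n for n in range(1, N))

# The geometric series q/(1-q) with q = 3(1-p) is below 1 iff q < 1/2,
# i.e. iff p > 5/6.
assert contour_series(0.9) < 1          # q = 0.3, series sums to 3/7
assert contour_series(5/6 - 0.05) > 1   # below the threshold the bound fails
```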
Theorem 6.5. Consider the contact process on Zd , d ≥ 1. Then there is λc ∈ (0, ∞) such
that
    I = {ν̄} = {δ0} ⇐⇒ λ ≤ λc.
Remark. We showed that there is λc ∈ (0, ∞) such that λ < λc ⇒ ν̄ = δ0, and
λ > λc ⇒ I_e = {δ0, ν̄} with ν̄ ≠ δ0. The critical case λ = λc is delicate, but it has been
proved that θ(λc) = 0 for d ≥ 1.
Chapter 7
Let V be a countable vertex set and let q(x, y) ≥ 0 for all x ≠ y, x, y ∈ V. Assume that
M = sup_{x∈V} Σ_{u: u≠x} q(x, u) < ∞. Then we know that the spin system (ηt)t≥0 with rates
    c(x, η) = Σ_{y: η(y)≠η(x)} q(x, y),  x ∈ V, η ∈ {0, 1}^V,
is well-defined and a Feller process. The voter model is never ergodic, since δ0 and δ1
are invariant.
Question. Are there other extremal invariant distributions, or is I_e = {δ0, δ1}?
CHAPTER 7. THE VOTER MODEL 66
Check that in this case, T^{(1)}_t and L2 commute for any t ≥ 0. Let
    u(t, z1, z2) = E_{z1}[H(Z^1_t, z2)] = (T^{(1)}_t H(·, z2))(z1).   (7.2)
Differentiating in t yields a differential equation for u in the z2-variable; here we use the
assumption in the second equality and the fact that T^{(1)}_t and L2 commute in the third
equality. For each z1 ∈ S1, the unique solution of this differential equation with initial
condition u(0, z1, z2) = H(z1, z2) is given by
    Σ_{y∈S2} pt(z2, y) H(z1, y) = E_{z2}[H(z1, Z^2_t)],   (7.3)
see [Liggett] Theorem 2.39. (7.1) follows from (7.2) and (7.3).
For the voter model we consider
    H(η, A) = Π_{x∈A} η(x) = 1{η(x)=1, ∀x∈A},
where η ∈ {0, 1}^V, A ⊆ V, |A| < ∞. The dual process (At)t≥0 is a system of coalescing
MCs with Q-matrix q(x, y). More precisely, the points in A move independently according
to this MC, and coalesce when they meet.
(At )t≥0 is a MC in continuous time. The state space consists of all finite subsets of V and
the Q-matrix is given by
Remark. Due to our assumption M < ∞, there is a continuous time MC with this
Q-matrix.
Theorem 7.3. The processes (ηt)t≥0 and (At)t≥0 are dual w.r.t.
    H(η, A) = Π_{x∈A} η(x).
Interpretation. The Markov chain describes the back-tracking of opinions. Let t > 0
and follow the development which led to ηt(x). Possibly there are 0 < t1 < t and x1 ∈ V
such that the voter at x took the opinion ηt1(x1); otherwise ηt(x) = η0(x). We continue
going back in time: 0 < tm < tm−1 < ... < t0 = t and ηt(x) = η0(xm). The path, jumping
at time ti from xi−1 to xi, is the path of a continuous-time MC Yx(t) starting at x, with
Q-matrix q(x, y). For different x, the MCs are not independent, but they coalesce when
they meet. With this construction, ηt(x) = η0(Yx(t)).
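The coalescing mechanism can be sketched in a few lines (a discrete-time toy version, not the continuous-time chain of the notes): walkers sharing a site share all later moves, so the number of distinct walkers is non-increasing, and via ηt(x) = η0(Yx(t)) any initial opinion field produces at most that many distinct opinions.

```python
import random

def coalescing_walks(starts, steps, rng):
    """Discrete-time sketch of coalescing random walks on Z: walkers move
    independently, but once two occupy the same site they merge and move
    together ever after."""
    pos = {x: x for x in starts}
    counts = [len(set(pos.values()))]       # number of distinct walkers
    for _ in range(steps):
        move = {p: rng.choice((-1, 1)) for p in set(pos.values())}
        pos = {x: p + move[p] for x, p in pos.items()}
        counts.append(len(set(pos.values())))
    return pos, counts

rng = random.Random(1)
pos, counts = coalescing_walks(range(-3, 4), 200, rng)
# Walkers at the same site share all later moves, so the number of
# distinct walkers never increases.
assert all(a >= b for a, b in zip(counts, counts[1:]))
# eta_t(x) = eta_0(Y_x(t)): coalesced walkers force equal opinions,
# whatever the initial configuration eta_0 is.
eta0 = lambda z: 1 if z >= 0 else 0
etat = {x: eta0(pos[x]) for x in pos}
assert len(set(etat.values())) <= counts[-1]
```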
Remark. Let us assume the starting law is the product measure νρ with finite-dimensional
marginals
    νρ({η : η(x) = 1, ∀x ∈ A}) = ρ^{|A|},  ρ ∈ (0, 1).
Then we have
    P(ηt(x) = 1) = P(η0(Yx(t)) = 1) = ρ,
    P(ηt(x) = ηt(y) = 1) = P(η0(Yx(t)) = η0(Yy(t)) = 1) = ρ P(Yx(t) = Yy(t)) + ρ² P(Yx(t) ≠ Yy(t)).
Since (ηt)t≥0 has the same law as (1 − ηt)t≥0 we also have
    Pη(ηt(x1) = ηt(x2) = 0) = P_{1−η}(ηt(x1) = ηt(x2) = 1)
                            = E^{{x1,x2}}[ 1{1 − η(A^{(1)}_t) = 1 − η(A^{(2)}_t) = 1} ]
                            = P^{{x1,x2}}( η(A^{(1)}_t) = η(A^{(2)}_t) = 0 ).
In the same way,
    P(ηt(x) = 0) = P(η0(Yx(t)) = 0) = 1 − ρ,
    P(ηt(x) = ηt(y) = 0) = P(η0(Yx(t)) = η0(Yy(t)) = 0) = (1 − ρ) P(Yx(t) = Yy(t)) + (1 − ρ)² P(Yx(t) ≠ Yy(t)).
Letting t → ∞, we see that the probability that ηt(x) and ηt(y) have the same value goes
to 1 if and only if the two MCs (Yx) and (Yy) coalesce almost surely.
Let (X^{(1)}_t)t≥0 and (X^{(2)}_t)t≥0 be independent MCs with rates q(·, ·) and starting points
x1, x2 ∈ V. Take now V = Z^d and assume (X^{(1)}_t)t≥0 and (X^{(2)}_t)t≥0 are irreducible. Let
    Zt = X^{(1)}_t − X^{(2)}_t,  t ≥ 0.
(ii) I_e = {δ0, δ1}.
(iii) We have
    lim_{t→∞} µTt = αδ1 + (1 − α)δ0
if and only if
    lim_{t→∞} Σ_{y∈V} pt(x, y) µ({η : η(y) = 1}) = α,  ∀x ∈ V,   (7.4)
where pt(·, ·) is the transition function of (X^{(1)}_t)t≥0.
    Pη(ηt(x1) = ηt(x2) = 1) = Eη[H(ηt, {x1, x2})] = E^{{x1,x2}}[H(η, At)]
                            = P^{{x1,x2}}( η(A^{(1)}_t) = η(A^{(2)}_t) = 1 ).
Using the bijection between (ηt)t≥0 and (1 − ηt)t≥0 we also have
due to (i). Hence η is µ-a.s. constant and therefore µ = αδ1 + (1 − α)δ0 for some
α ∈ [0, 1].
therefore (µTt ) converges for t → ∞ (see the exercise below), hence limt→∞ µTt is
invariant. Hence
due to (ii).
for any finite A ⊆ V . Then µTt converges weakly to an element of M1 (S) for t → ∞.
Then
    (νρ Tt)({η : η(x) = 1, ∀x ∈ A}) = ∫ Pη(ηt(x) = 1, ∀x ∈ A) νρ(dη)
                                    = ∫ P^A(η(x) = 1, ∀x ∈ At) νρ(dη)
                                    = Σ_{B⊆V finite} P^A(At = B) νρ({η : η(x) = 1, ∀x ∈ B})
                                    = E^A[ρ^{|At|}].   (7.5)
Since |At| is non-increasing, we know a∞ = lim_{t→∞} |At| exists (note that a∞ is a random
variable). Let t → ∞ in (7.5) and conclude that
Hence µρ is invariant. Now take V = Z^d, assume that q(x, y) = q(0, |y − x|) and that
(Zt)t≥0 is transient. Recall that Zt = X^{(1)}_t − X^{(2)}_t, t ≥ 0, where X^{(1)}_t, X^{(2)}_t are independent
MCs with Q-matrix Q = (q(x, y))_{x,y∈Z^d, x≠y} and q(x, x) = −Σ_{u≠x} q(x, u).
In the third step, we also used reversibility: Px(Zt = z) = Pz(Zt = x), since
q(x, y) = q(0, |y − x|) = q(y, x), and this implies that the measure π(x) = 1, ∀x, is a
reversible measure for the MC (Zt). Hence we have
    Px(Zs = y for some s > t) ≤ P0(Zs = 0 for some s > t).   (7.8)
More precisely,
    ∫_{[t,∞)} Px(Zs = y) ds = Ex[ ∫_{[t,∞)} 1{Zs=y} ds ] = Px(Zs = y for some s > t) · G(y, y),   (7.9)
where G(y, y) = G(0, 0).
Taking x = y = 0, we get
    ∫_{[t,∞)} P0(Zs = 0) ds = E0[ ∫_{[t,∞)} 1{Zs=0} ds ] = P0(Zs = 0 for some s > t) · G(0, 0).   (7.10)
and in particular,
    Σ_{x∈V} ∫_{[0,1]} P0(Z_{τx+s} = x, τx ≤ t) ds ≤ t + 1.
Hence
    Σ_{x∈V} P0(τx < t) ≤ M(t + 1) / (1 − e^{−M}).
    µ ∘ θx^{−1} = µ,  ∀x ∈ Z^d.
A translation invariant probability measure µ on S = {0, 1}^{Z^d} is mixing if
(ii) The probability measures µρ are translation invariant and mixing for ρ ∈ [0, 1].
(iii) µρ ({η : η(x) = 1}) = ρ.
(iv) The covariances of µρ are given by
    Covµρ(η(x), η(y)) = ρ(1 − ρ) G(x, y)/G(0, 0).   (7.13)
Proof. (i)
    µρ({η : η(x) = 1, ∀x ∈ A}) − ρ^{|A|} = E^A[ρ^{a∞} − ρ^{|A|}]
                                         = E^A[(ρ^{a∞} − ρ^{|A|}) 1{a∞ < |A|}]
                                         ≤ g(A).
where c∞ = limt→∞ |Ct |, b∞ = limt→∞ |Bt |. Replace B with B + x and apply Lemma
7.6 (i), then the last term goes to zero as |x| → ∞.
(iii) and (iv): This follows from (7.5). Note that, with A = {x, y},
    Covµρ(η(x), η(y)) = µρ({η : η(x) = η(y) = 1}) − ρ²
                      = E^{{x,y}}[ρ^{a∞} − ρ²]
                      = ρ(1 − ρ) P^{{x,y}}(a∞ = 1)
                      = ρ(1 − ρ) P^{x−y}(Zt = 0 for some t > 0)
                      = ρ(1 − ρ) Px(Zt = y for some t > 0)
                      = ρ(1 − ρ) G(x, y)/G(0, 0),
where the last equality follows from (7.9) with t = 0 and
    G(x, y) = ∫_{R+} Px(Zs = y) ds = Px(Zs = y for some s > 0) · G(0, 0).
Further,
    Covµρ(η(x), η(y)) = ∫ Cov_{µρ̃}(η(x), η(y)) γ(dρ̃)
                      = ∫ ρ̃(1 − ρ̃) γ(dρ̃) · G(x, y)/G(0, 0)                     (Theorem 7.8 (iv))
                      ≤ ( ∫ ρ̃ γ(dρ̃) )( ∫ (1 − ρ̃) γ(dρ̃) ) · G(x, y)/G(0, 0)    (Jensen)
                      = ρ(1 − ρ) G(x, y)/G(0, 0).
Hence it follows that γ = δρ, since Jensen's inequality is strict if γ is not a Dirac measure.
This shows that {µρ : ρ ∈ [0, 1]} ⊆ I_e. For the rest of the proof, we refer to [Liggett]
Theorem 4.43.
Bibliography