G.A. Pavliotis
Department of Mathematics
Imperial College London
London SW7 2AZ, UK
February 23, 2011
Contents
Preface  vii

1 Introduction  1
1.1 Introduction  1
1.2 Historical Overview  1
1.3 The One-Dimensional Random Walk  3
1.4 Stochastic Modeling of Deterministic Chaos  6
1.5 Why Randomness  6
1.6 Discussion and Bibliography  7
1.7 Exercises  7

2 Elements of Probability Theory  9
2.1 Introduction  9
2.2 Basic Definitions from Probability Theory  9
2.2.1 Conditional Probability  12
2.3 Random Variables  13
2.3.1 Expectation of Random Variables  16
2.4 Conditional Expectation  18
2.5 The Characteristic Function  19
2.6 Gaussian Random Variables  20
2.7 Types of Convergence and Limit Theorems  23
2.8 Discussion and Bibliography  25
2.9 Exercises  25

3 Basics of the Theory of Stochastic Processes  29
3.1 Introduction  29
3.2 Definition of a Stochastic Process  29
3.3 Stationary Processes  31
3.3.1 Strictly Stationary Processes  31
3.3.2 Second-Order Stationary Processes  32
3.3.3 Ergodic Properties of Second-Order Stationary Processes  37
3.4 Brownian Motion  39
3.5 Other Examples of Stochastic Processes  44
3.5.1 Brownian Bridge  44
3.5.2 Fractional Brownian Motion  44
3.5.3 The Poisson Process  45
3.6 The Karhunen-Loève Expansion  45
3.7 Discussion and Bibliography  50
3.8 Exercises  51

4 Markov Processes  57
4.1 Introduction  57
4.2 Examples  57
4.3 Definition of a Markov Process  62
4.4 The Chapman-Kolmogorov Equation  64
4.5 The Generator of a Markov Process  67
4.5.1 The Adjoint Semigroup  69
4.6 Ergodic Markov Processes  70
4.6.1 Stationary Markov Processes  71
4.7 Discussion and Bibliography  73
4.8 Exercises  74

5 Diffusion Processes  77
5.1 Introduction  77
5.2 Definition of a Diffusion Process  77
5.3 The Backward and Forward Kolmogorov Equations  79
5.3.1 The Backward Kolmogorov Equation  79
5.3.2 The Forward Kolmogorov Equation  81
5.4 Multidimensional Diffusion Processes  84
5.5 Connection with Stochastic Differential Equations  84
5.6 Examples of Diffusion Processes  86
5.7 Discussion and Bibliography  86
5.8 Exercises  86

6 The Fokker-Planck Equation  87
6.1 Introduction  87
6.2 Basic Properties of the FP Equation  88
6.2.1 Existence and Uniqueness of Solutions  88
6.2.2 The FP Equation as a Conservation Law  89
6.2.3 Boundary Conditions for the Fokker-Planck Equation  90
6.3 Examples of Diffusion Processes  92
6.3.1 Brownian Motion  92
6.3.2 The Ornstein-Uhlenbeck Process  95
6.3.3 The Geometric Brownian Motion  99
6.4 The Ornstein-Uhlenbeck Process and Hermite Polynomials  100
6.5 Reversible Diffusions  106
6.5.1 Markov Chain Monte Carlo (MCMC)  111
6.6 Perturbations of non-Reversible Diffusions  112
6.7 Eigenfunction Expansions  112
6.7.1 Reduction to a Schrödinger Equation  114
6.8 Discussion and Bibliography  115
6.9 Exercises  116

7 Stochastic Differential Equations  119
7.1 Introduction  119
7.2 The Itô and Stratonovich Stochastic Integral  119
7.2.1 The Stratonovich Stochastic Integral  121
7.3 Stochastic Differential Equations  121
7.3.1 Examples of SDEs  123
7.4 The Generator, Itô's Formula and the Fokker-Planck Equation  125
7.4.1 The Generator  125
7.4.2 Itô's Formula  125
7.5 Linear SDEs  127
7.6 Derivation of the Stratonovich SDE  129
7.6.1 Itô versus Stratonovich  133
7.7 Numerical Solution of SDEs  133
7.8 Parameter Estimation for SDEs  133
7.9 Noise Induced Transitions  133
7.10 Discussion and Bibliography  135
7.11 Exercises  136

8 The Langevin Equation  137
8.1 Introduction  137
8.2 The Fokker-Planck Equation in Phase Space (Klein-Kramers Equation)  137
8.3 The Langevin Equation in a Harmonic Potential  142
8.4 Asymptotic Limits for the Langevin Equation  151
8.4.1 The Overdamped Limit  153
8.4.2 The Underdamped Limit  159
8.5 Brownian Motion in Periodic Potentials  164
8.5.1 The Langevin Equation in a Periodic Potential  164
8.5.2 Equivalence with the Green-Kubo Formula  170
8.6 The Underdamped and Overdamped Limits of the Diffusion Coefficient  171
8.6.1 Brownian Motion in a Tilted Periodic Potential  180
8.7 Numerical Solution of the Klein-Kramers Equation  183
8.8 Discussion and Bibliography  183
8.9 Exercises  184

9 Exit Time Problems  185
9.1 Introduction  185
9.2 Brownian Motion in a Bistable Potential  185
9.3 The Mean First Passage Time  188
9.3.1 The Boundary Value Problem for the MFPT  188
9.3.2 Examples  190
9.4 Escape from a Potential Barrier  192
9.4.1 Calculation of the Reaction Rate in the Overdamped Regime  193
9.4.2 The Intermediate Regime: γ = O(1)  194
9.4.3 Calculation of the Reaction Rate in the Energy-Diffusion-Limited Regime  195
9.5 Discussion and Bibliography  195
9.6 Exercises  197

10 Stochastic Resonance and Brownian Motors  199
10.1 Introduction  199
10.2 Stochastic Resonance  199
10.3 Brownian Motors  199
10.4 Introduction  199
10.5 The Model  202
10.6 Multiscale Analysis  203
10.6.1 Calculation of the Effective Drift  203
10.6.2 Calculation of the Effective Diffusion Coefficient  205
10.7 Effective Diffusion Coefficient for Correlation Ratchets  207
10.8 Discussion and Bibliography  211
10.9 Exercises  211

11 Stochastic Processes and Statistical Mechanics  213
11.1 Introduction  213
11.2 The Kac-Zwanzig Model  214
11.3 Quasi-Markovian Stochastic Processes  218
11.3.1 Open Classical Systems  221
11.4 The Mori-Zwanzig Formalism  223
11.5 Derivation of the Fokker-Planck and Langevin Equations  224
11.6 Linear Response Theory  224
11.7 Discussion and Bibliography  224
11.8 Exercises  224
Preface
The purpose of these notes is to present various results and techniques from the theory of stochastic
processes that are useful in the study of stochastic problems in physics, chemistry and other areas.
These notes have been used for several years for a course on applied stochastic processes offered to
fourth year and to MSc students in applied mathematics at the department of mathematics, Imperial
College London.
G.A. Pavliotis
London, December 2010
Chapter 1
Introduction
1.1 Introduction
In this chapter we introduce some of the concepts and techniques that we will study in this book.
In Section 1.2 we present a brief historical overview of the development of the theory of stochastic
processes in the twentieth century. In Section 1.3 we introduce the one-dimensional random walk,
and we use this example to introduce several concepts, such as Brownian motion and the Markov
property. In Section 1.4 we discuss the stochastic modeling of deterministic chaos. Some
comments on the role of probabilistic modeling in the physical sciences are offered in Section 1.5.
Discussion and bibliographical comments are presented in Section 1.6. Exercises are included in
Section 1.7.
1.2 Historical Overview
The theory of stochastic processes, at least in terms of its application to physics, started with
Einstein's work on the theory of Brownian motion: Concerning the motion, as required by the
molecular-kinetic theory of heat, of particles suspended in liquids at rest (1905), and in a series
of additional papers that were published in the period 1905-1906. In these fundamental works,
Einstein presented an explanation of Brown's observation (1827) that, when suspended in water,
small pollen grains are found to be in a very animated and irregular state of motion. In developing
his theory Einstein introduced several concepts that still play a fundamental role in the study
of stochastic processes and that we will study in this book. Using modern terminology, Einstein
introduced a Markov chain model for the motion of the particle (molecule, pollen grain...). Furthermore,
he introduced the idea that it makes more sense to talk about the probability of finding
the particle at position x at time t, rather than about individual trajectories.
In his work many of the main aspects of the modern theory of stochastic processes can be
found:
• The assumption of Markovianity (no memory) expressed through the Chapman-Kolmogorov
equation.

• The Fokker-Planck equation (in this case, the diffusion equation).

• The derivation of the Fokker-Planck equation from the master (Chapman-Kolmogorov) equation
through a Kramers-Moyal expansion.

• The calculation of a transport coefficient (the diffusion coefficient) using macroscopic (kinetic-theory-based)
considerations:

    D = k_B T / (6πηa),

where k_B is Boltzmann's constant, T is the temperature, η is the viscosity of the fluid and a is the
radius of the particle.
Einstein's theory is based on the Fokker-Planck equation. Langevin (1908) developed a theory
based on a stochastic differential equation. The equation of motion for a Brownian particle is

    m d²x/dt² = −6πηa dx/dt + ξ,
where ξ is a random force. It can be shown that there is complete agreement between Einstein’s
theory and Langevin’s theory. The theory of Brownian motion was developed independently by
Smoluchowski, who also performed several experiments.
The approaches of Langevin and Einstein represent the two main approaches in the theory of
stochastic processes:

• Study individual trajectories of Brownian particles. Their evolution is governed by a stochastic
differential equation:

    dX/dt = F(X) + Σ(X)ξ(t),

where ξ(t) is a random force.

• Study the probability ρ(x, t) of finding a particle at position x at time t. This probability
distribution satisfies the Fokker-Planck equation:

    ∂ρ/∂t = −∇·(F(x)ρ) + (1/2)∇∇ : (A(x)ρ),

where A(x) = Σ(x)Σ(x)ᵀ.
The theory of stochastic processes was developed during the 20th century by several mathematicians
and physicists, including Smoluchowski, Planck, Kramers, Chandrasekhar, Wiener, Kolmogorov,
Itô and Doob.
1.3 The OneDimensional Random Walk
We let time be discrete, i.e. t = 0, 1, . . . . Consider the following stochastic process S_n:
S_0 = 0; at each time step it moves to ±1 with equal probability 1/2.
In other words, at each time step we ﬂip a fair coin. If the outcome is heads, we move one unit
to the right. If the outcome is tails, we move one unit to the left.
Alternatively, we can think of the random walk as a sum of independent random variables:

    S_n = Σ_{j=1}^{n} X_j,

where X_j ∈ {−1, 1} with P(X_j = ±1) = 1/2.
We can simulate the random walk on a computer:

• We need a (pseudo)random number generator to generate n independent random variables
which are uniformly distributed in the interval [0, 1].

• If the value of the random variable is less than 1/2 then the particle moves to the left, otherwise
it moves to the right.

• We then take the sum of all these random moves.

• The sequence {S_n}_{n=1}^{N} indexed by the discrete time T = {1, 2, . . . , N} is the path of the
random walk. We use a linear interpolation (i.e. connect the points {n, S_n} by straight lines)
to generate a continuous path.
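The recipe above can be sketched in a few lines of code; the function name and parameter choices here are illustrative, not from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_walk_paths(n_paths, n_steps):
    """Simulate paths of the simple random walk S_n as described above."""
    u = rng.random((n_paths, n_steps))   # uniform draws in [0, 1)
    steps = np.where(u < 0.5, -1, 1)     # below 1/2: step left; otherwise right
    return np.cumsum(steps, axis=1)      # S_1, ..., S_N for each path

paths = random_walk_paths(n_paths=10_000, n_steps=50)
# Sample averages agree with E(S_n) = 0 and E(S_n^2) = n:
print(paths[:, -1].mean())         # close to 0
print((paths[:, -1] ** 2).mean())  # close to 50
```

Plotting the rows of `paths` against the step number reproduces pictures like the figures below.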
Figure 1.1: Three paths of the random walk of length N = 50.
Figure 1.2: Three paths of the random walk of length N = 1000.
Figure 1.3: Sample Brownian paths.
Every path of the random walk is different: it depends on the outcome of a sequence of independent
random experiments. We can compute statistics by generating a large number of paths and
computing averages. For example, E(S_n) = 0, E(S_n²) = n. The paths of the random walk (without
the linear interpolation) are not continuous: the random walk has a jump of size 1 at each time step.
This is an example of a discrete time, discrete space stochastic process. The random walk is a
time-homogeneous Markov process. If we take a large number of steps, the random walk starts
looking like a continuous time process with continuous paths.
We can quantify this observation by introducing an appropriate rescaled process and by taking
an appropriate limit. Consider the sequence of continuous time stochastic processes

    Z_t^n := (1/√n) S_{nt}.

In the limit as n → ∞, the sequence {Z_t^n} converges (in some appropriate sense, that will be
made precise in later chapters) to a Brownian motion with diffusion coefficient D = Δx²/(2Δt) = 1/2.
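This diffusive rescaling can be checked numerically; the following sketch (an illustration, not part of the notes) compares the sample variance of Z_t^n with the Brownian value 2Dt = t:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 10_000        # walk steps per unit of macroscopic time
t = 1.0
n_paths = 2_000

# Z_t^n = S_{nt} / sqrt(n): sum nt unit steps, then rescale by sqrt(n).
steps = rng.choice([-1, 1], size=(n_paths, int(n * t)))
z = steps.sum(axis=1) / np.sqrt(n)

# For Brownian motion with D = 1/2, Var(Z_t) = 2 D t = t.
print(z.var())  # close to t = 1.0
```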
Brownian motion W(t) is a continuous time stochastic process with continuous paths that starts
at 0 (W(0) = 0) and has independent, normally distributed (Gaussian) increments. We can simulate
Brownian motion on a computer using a random number generator that generates normally
distributed, independent random variables. We can write an equation for the evolution of the paths
of a Brownian motion X_t with diffusion coefficient D starting at x:

    dX_t = √(2D) dW_t,   X_0 = x.
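A sketch of this simulation, assuming the standard discretization in which Brownian increments over a step dt are independent N(0, dt) random variables (the helper name is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def brownian_path(x0, D, T, n_steps):
    """Sample a path of dX_t = sqrt(2D) dW_t, X_0 = x0, on [0, T]."""
    dt = T / n_steps
    # Brownian increments are independent N(0, dt) random variables.
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    return x0 + np.cumsum(np.sqrt(2.0 * D) * dW)

# Over many paths, X_T - x0 has mean 0 and variance 2 D T.
ends = np.array([brownian_path(0.0, D=0.5, T=1.0, n_steps=500)[-1]
                 for _ in range(2_000)])
print(ends.mean(), ends.var())  # close to 0 and 1
```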
This is the simplest example of a stochastic differential equation. The probability of finding
X_t at y at time t, given that it was at x at time t = 0, i.e. the transition probability density ρ(y, t),
satisfies the PDE

    ∂ρ/∂t = D ∂²ρ/∂y²,   ρ(y, 0) = δ(y − x).
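One can verify directly (a short calculation, not spelled out in the notes) that the Gaussian heat kernel solves this PDE:

```latex
\rho(y,t) = \frac{1}{\sqrt{4\pi D t}}\exp\!\left(-\frac{(y-x)^2}{4Dt}\right),
\qquad
\frac{\partial \rho}{\partial t}
  = \rho\left(-\frac{1}{2t} + \frac{(y-x)^2}{4Dt^2}\right),
\qquad
D\,\frac{\partial^2 \rho}{\partial y^2}
  = \rho\left(\frac{(y-x)^2}{4Dt^2} - \frac{1}{2t}\right),
```

so the two sides of the PDE agree, and ρ(y, t) → δ(y − x) as t → 0⁺, which is the required initial condition.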
This is the simplest example of the FokkerPlanck equation. The connection between Brownian
motion and the diffusion equation was made by Einstein in 1905.
1.4 Stochastic Modeling of Deterministic Chaos
1.5 Why Randomness
Why introduce randomness in the description of physical systems?
• To describe outcomes of a repeated set of experiments. Think of tossing a coin repeatedly or
of throwing a die.

• To describe a deterministic system for which we have incomplete information: we have
imprecise knowledge of initial and boundary conditions or of model parameters.

– ODEs with random initial conditions are equivalent to stochastic processes that can be
described using stochastic differential equations.

• To describe systems for which we are not confident about the validity of our mathematical
model.

• To describe a dynamical system exhibiting very complicated behavior (chaotic dynamical
systems). Determinism versus predictability.

• To describe a high dimensional deterministic system using a simpler, low-dimensional stochastic
system. Think of the physical model for Brownian motion (a heavy particle colliding with
many small particles).
• To describe a system that is inherently random. Think of quantum mechanics.
Stochastic modeling is currently used in many different areas ranging from biology to climate
modeling to economics.
1.6 Discussion and Bibliography
The fundamental papers of Einstein on the theory of Brownian motion have been reprinted by
Dover [20]. The readers of this book are strongly encouraged to study these papers. Other
fundamental papers from the early period of the development of the theory of stochastic processes
include the papers by Langevin, Ornstein and Uhlenbeck, Doob, Kramers and Chandrasekhar's
famous review article [12]. Many of these early papers on the theory of stochastic processes have
been reprinted in [18]. Very useful historical comments can be found in the books by Nelson [68]
and Mazo [66].
1.7 Exercises
1. Read the papers by Einstein, OrnsteinUhlenbeck, Doob etc.
2. Write a computer program for generating the random walk in one and two dimensions. Study
numerically the Brownian limit and compute the statistics of the random walk.
Chapter 2
Elements of Probability Theory
2.1 Introduction
In this chapter we put together some basic deﬁnitions and results from probability theory that will
be used later on. In Section 2.2 we give some basic deﬁnitions from the theory of probability.
In Section 2.3 we present some properties of random variables. In Section 2.4 we introduce the
concept of conditional expectation and in Section 2.5 we deﬁne the characteristic function, one of
the most useful tools in the study of (sums of) random variables. Some explicit calculations for
the multivariate Gaussian distribution are presented in Section 2.6. Different types of convergence
and the basic limit theorems of the theory of probability are discussed in Section 2.7. Discussion
and bibliographical comments are presented in Section 2.8. Exercises are included in Section 2.9.
2.2 Basic Deﬁnitions from Probability Theory
In Chapter 1 we deﬁned a stochastic process as a dynamical system whose law of evolution is
probabilistic. In order to study stochastic processes we need to be able to describe the outcome of
a random experiment and to calculate functions of this outcome. First we need to describe the set
of all possible experiments.
Deﬁnition 2.2.1. The set of all possible outcomes of an experiment is called the sample space and
is denoted by Ω.
Example 2.2.2. • The possible outcomes of the experiment of tossing a coin are H and T. The
sample space is Ω = {H, T}.

• The possible outcomes of the experiment of throwing a die are 1, 2, 3, 4, 5 and 6. The
sample space is Ω = {1, 2, 3, 4, 5, 6}.
We deﬁne events to be subsets of the sample space. Of course, we would like the unions,
intersections and complements of events to also be events. When the sample space Ω is uncountable,
then technical difficulties arise. In particular, not all subsets of the sample space need to be
events. A definition of the collection of subsets of events which is appropriate for finitely additive
probability is the following.

Definition 2.2.3. A collection F of subsets of Ω is called a field on Ω if

i. ∅ ∈ F;

ii. if A ∈ F then Aᶜ ∈ F;

iii. if A, B ∈ F then A ∪ B ∈ F.

From the definition of a field we immediately deduce that F is closed under finite unions and
finite intersections:

    A_1, . . . , A_n ∈ F ⇒ ∪_{i=1}^{n} A_i ∈ F,  ∩_{i=1}^{n} A_i ∈ F.
When Ω is inﬁnite dimensional then the above deﬁnition is not appropriate since we need to
consider countable unions of events.
Definition 2.2.4. A collection F of subsets of Ω is called a σ-field or σ-algebra on Ω if

i. ∅ ∈ F;

ii. if A ∈ F then Aᶜ ∈ F;

iii. if A_1, A_2, · · · ∈ F then ∪_{i=1}^{∞} A_i ∈ F.

A σ-algebra is closed under the operation of taking countable intersections.

Example 2.2.5. • F = {∅, Ω}.

• F = {∅, A, Aᶜ, Ω} where A is a subset of Ω.

• The power set of Ω, denoted by {0, 1}^Ω, which contains all subsets of Ω.
Let F be a collection of subsets of Ω. It can be extended to a σ-algebra (take for example the
power set of Ω). Consider all the σ-algebras that contain F and take their intersection, denoted
by σ(F); i.e. A ⊂ Ω belongs to σ(F) if and only if it is in every σ-algebra containing F. σ(F) is a σ-algebra
(see Exercise 1). It is the smallest σ-algebra containing F and it is called the σ-algebra generated
by F.
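For a finite sample space the generated σ-algebra can be computed by brute-force closure under complements and unions; the helper below is an illustration, not part of the notes:

```python
def generated_sigma_algebra(omega, collection):
    """Close `collection` under complements and unions (countable unions
    reduce to finite ones here, since omega is finite)."""
    omega = frozenset(omega)
    sigma = {frozenset(), omega} | {frozenset(a) for a in collection}
    changed = True
    while changed:
        changed = False
        for a in list(sigma):
            for s in [omega - a] + [a | b for b in sigma]:
                if s not in sigma:
                    sigma.add(s)
                    changed = True
    return sigma

# The sigma-algebra generated by {1} on omega = {1,2,3,4}:
sig = generated_sigma_algebra({1, 2, 3, 4}, [{1}])
print(sorted(tuple(sorted(s)) for s in sig))
# [(), (1,), (1, 2, 3, 4), (2, 3, 4)]
```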
Example 2.2.6. Let Ω = R^n. The σ-algebra generated by the open subsets of R^n (or, equivalently,
by the open balls of R^n) is called the Borel σ-algebra of R^n and is denoted by B(R^n).

Let X be a closed subset of R^n. Similarly, we can define the Borel σ-algebra of X, denoted by
B(X).
A sub-σ-algebra is a collection of subsets of a σ-algebra which satisfies the axioms of a σ-algebra.

The σ-field F of a sample space Ω contains all possible outcomes of the experiment that we
want to study. Intuitively, the σ-field contains all the information about the random experiment
that is available to us.
Now we want to assign probabilities to the possible outcomes of an experiment.
Definition 2.2.7. A probability measure P on the measurable space (Ω, F) is a function
P : F → [0, 1] satisfying

i. P(∅) = 0, P(Ω) = 1;

ii. for A_1, A_2, . . . with A_i ∩ A_j = ∅, i ≠ j,

    P(∪_{i=1}^{∞} A_i) = Σ_{i=1}^{∞} P(A_i).

Definition 2.2.8. The triple (Ω, F, P), comprising a set Ω, a σ-algebra F of subsets of Ω and a
probability measure P on (Ω, F), is called a probability space.

Example 2.2.9. A biased coin is tossed once: Ω = {H, T}, F = {∅, {H}, {T}, Ω} (the power set
of Ω), and P : F → [0, 1] is such that P(∅) = 0, P({H}) = p ∈ [0, 1], P({T}) = 1 − p, P(Ω) = 1.

Example 2.2.10. Take Ω = [0, 1], F = B([0, 1]), P = Leb([0, 1]). Then (Ω, F, P) is a probability
space.
2.2.1 Conditional Probability
One of the most important concepts in probability is that of the dependence between events.
Definition 2.2.11. A family {A_i : i ∈ I} of events is called independent if

    P(∩_{j∈J} A_j) = Π_{j∈J} P(A_j)

for all finite subsets J of I.

When two events A, B are dependent it is important to know the probability that the event
A will occur, given that B has already happened. We define this to be the conditional probability,
denoted by P(A|B). We know from elementary probability that

    P(A|B) = P(A ∩ B) / P(B).

A very useful result is the law of total probability.

Definition 2.2.12. A family of events {B_i : i ∈ I} is called a partition of Ω if

    B_i ∩ B_j = ∅ for i ≠ j, and ∪_{i∈I} B_i = Ω.

Proposition 2.2.13 (Law of total probability). For any event A and any partition {B_i : i ∈ I}
we have

    P(A) = Σ_{i∈I} P(A|B_i) P(B_i).
The proof of this result is left as an exercise. In many cases the calculation of the probability
of an event is simpliﬁed by choosing an appropriate partition of Ω and using the law of total
probability.
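A small worked instance of the law of total probability (the two-urn setup here is invented for illustration):

```python
from fractions import Fraction as Fr

# Illustrative setup: choose one of two urns at random (the partition
# B_1, B_2), then draw a ball. Urn 1 holds 3 red / 1 blue; urn 2 holds
# 1 red / 3 blue.
P_B = {1: Fr(1, 2), 2: Fr(1, 2)}
P_red_given_B = {1: Fr(3, 4), 2: Fr(1, 4)}

# Law of total probability: P(A) = sum_i P(A | B_i) P(B_i).
P_red = sum(P_red_given_B[i] * P_B[i] for i in P_B)
print(P_red)  # 1/2
```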
Let (Ω, F, P) be a probability space and fix B ∈ F. Then P(·|B) defines a probability measure
on F. Indeed, we have that

    P(∅|B) = 0,  P(Ω|B) = 1

and (since A_i ∩ A_j = ∅ implies that (A_i ∩ B) ∩ (A_j ∩ B) = ∅)

    P(∪_{j=1}^{∞} A_j | B) = Σ_{j=1}^{∞} P(A_j | B),

for a countable family of pairwise disjoint sets {A_j}_{j=1}^{+∞}. Consequently, (Ω, F, P(·|B)) is a probability
space for every B ∈ F.
2.3 Random Variables
We are usually interested in the consequences of the outcome of an experiment, rather than the
experiment itself. The function of the outcome of an experiment is a random variable, that is, a
map from Ω to R.
Definition 2.3.1. A sample space Ω equipped with a σ-field of subsets F is called a measurable
space.
Definition 2.3.2. Let (Ω, F) and (E, G) be two measurable spaces. A function X : Ω → E such
that the event

    {ω ∈ Ω : X(ω) ∈ A} =: {X ∈ A}     (2.1)

belongs to F for arbitrary A ∈ G is called a measurable function or random variable.

When E is R equipped with its Borel σ-algebra, then (2.1) can be replaced with

    {X ≤ x} ∈ F  ∀x ∈ R.

Let X be a random variable (measurable function) from (Ω, F, µ) to (E, G). If E is a metric space
then we may define expectation with respect to the measure µ by

    E[X] = ∫_Ω X(ω) dµ(ω).

More generally, let f : E → R be G-measurable. Then,

    E[f(X)] = ∫_Ω f(X(ω)) dµ(ω).

Let U be a topological space. We will use the notation B(U) to denote the Borel σ-algebra of U:
the smallest σ-algebra containing all open sets of U. Every random variable from a probability
space (Ω, F, µ) to a measurable space (E, B(E)) induces a probability measure on E:

    µ_X(B) = P(X⁻¹(B)) = µ(ω ∈ Ω; X(ω) ∈ B),  B ∈ B(E).     (2.2)

The measure µ_X is called the distribution (or sometimes the law) of X.
Example 2.3.3. Let I denote a subset of the positive integers. A vector ρ_0 = {ρ_{0,i}, i ∈ I} is a
distribution on I if it has nonnegative entries and its total mass equals 1:

    Σ_{i∈I} ρ_{0,i} = 1.
Consider the case where E = R equipped with the Borel σ-algebra. In this case a random
variable is defined to be a function X : Ω → R such that

    {ω ∈ Ω : X(ω) ≤ x} ∈ F  ∀x ∈ R.

We can now define the probability distribution function of X, F_X : R → [0, 1], as

    F_X(x) = P({ω ∈ Ω | X(ω) ≤ x}) =: P(X ≤ x).     (2.3)

In this case, (R, B(R), F_X) becomes a probability space.

The distribution function F_X(x) of a random variable has the properties that
lim_{x→−∞} F_X(x) = 0, lim_{x→+∞} F_X(x) = 1, and it is right continuous.
Definition 2.3.4. A random variable X with values on R is called discrete if it takes values in
some countable subset {x_0, x_1, x_2, . . .} of R, i.e.: P(X = x) ≠ 0 only for x = x_0, x_1, . . . .

With a discrete random variable we can associate the probability mass function p_k = P(X = x_k).
We will consider nonnegative integer valued discrete random variables. In this case
p_k = P(X = k), k = 0, 1, 2, . . . .

Example 2.3.5. The Poisson random variable is the nonnegative integer valued random variable
with probability mass function

    p_k = P(X = k) = (λ^k / k!) e^{−λ},  k = 0, 1, 2, . . . ,

where λ > 0.
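A quick numerical check of this probability mass function (truncating the infinite sum far into the tail; illustrative only):

```python
import math

lam = 2.0
# Build p_k = (lam^k / k!) e^{-lam} iteratively via p_k = p_{k-1} * lam / k,
# which avoids computing large factorials directly.
pmf = [math.exp(-lam)]
for k in range(1, 100):
    pmf.append(pmf[-1] * lam / k)

total = sum(pmf)                             # should be 1
mean = sum(k * p for k, p in enumerate(pmf)) # should be lambda
print(total, mean)
```

The truncation error at 100 terms is negligible for λ = 2, so both printed values match 1 and λ to machine precision.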
Example 2.3.6. The binomial random variable is the nonnegative integer valued random variable
with probability mass function

    p_k = P(X = k) = (N! / (k!(N − k)!)) p^k q^{N−k},  k = 0, 1, 2, . . . , N,

where p ∈ (0, 1), q = 1 − p.
Deﬁnition 2.3.7. A random variable X with values on R is called continuous if P(X = x) =
0 ∀x ∈ R.
Let (Ω, F, P) be a probability space and let X : Ω → R be a random variable with distribution
F_X. This is a probability measure on B(R). We will assume that it is absolutely continuous with
respect to the Lebesgue measure with density ρ_X: F_X(dx) = ρ_X(x) dx. We will call the density
ρ_X(x) the probability density function (PDF) of the random variable X.
Example 2.3.8. i. The exponential random variable has PDF

    f(x) = λ e^{−λx} for x > 0,  f(x) = 0 for x < 0,

with λ > 0.

ii. The uniform random variable has PDF

    f(x) = 1/(b − a) for a < x < b,  f(x) = 0 for x ∉ (a, b),

with a < b.

Definition 2.3.9. Two random variables X and Y are independent if the events {ω ∈ Ω | X(ω) ≤ x}
and {ω ∈ Ω | Y(ω) ≤ y} are independent for all x, y ∈ R.
Let X, Y be two continuous random variables. We can view them as a random vector, i.e. a
random variable from Ω to R². We can then define the joint distribution function

    F(x, y) = P(X ≤ x, Y ≤ y).

The mixed derivative of the distribution function, f_{X,Y}(x, y) := ∂²F/∂x∂y, if it exists, is called the
joint PDF of the random vector (X, Y):

    F_{X,Y}(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f_{X,Y}(x, y) dx dy.

If the random variables X and Y are independent, then

    F_{X,Y}(x, y) = F_X(x) F_Y(y)

and

    f_{X,Y}(x, y) = f_X(x) f_Y(y).
The joint distribution function has the properties
\[
F_{X,Y}(x, y) = F_{Y,X}(y, x),
\]
\[
F_{X,Y}(+\infty, y) = F_Y(y), \qquad f_Y(y) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\, dx.
\]
We can extend the above definition to random vectors of arbitrary finite dimension. Let $X$ be a random variable from $(\Omega, \mathcal{F}, \mu)$ to $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$. The (joint) distribution function $F_X : \mathbb{R}^d \to [0, 1]$ is defined as
\[
F_X(x) = P(X \leq x).
\]
Let $X$ be a random variable in $\mathbb{R}^N$ with distribution function $f(x_N)$ where $x_N = \{x_1, \dots, x_N\}$. We define the marginal or reduced distribution function $f_{N-1}(x_{N-1})$ by
\[
f_{N-1}(x_{N-1}) = \int_{\mathbb{R}} f_N(x_N)\, dx_N.
\]
We can define other reduced distribution functions:
\[
f_{N-2}(x_{N-2}) = \int_{\mathbb{R}} f_{N-1}(x_{N-1})\, dx_{N-1} = \int_{\mathbb{R}} \int_{\mathbb{R}} f(x_N)\, dx_{N-1}\, dx_N.
\]
2.3.1 Expectation of Random Variables
We can use the distribution of a random variable to compute expectations and probabilities:
\[
E[f(X)] = \int_{\mathbb{R}} f(x)\, dF_X(x) \tag{2.4}
\]
and
\[
P[X \in G] = \int_G dF_X(x), \quad G \in \mathcal{B}(E). \tag{2.5}
\]
The above formulas apply to both discrete and continuous random variables, provided that we define the integrals in (2.4) and (2.5) appropriately.

When $E = \mathbb{R}^d$ and a PDF exists, $dF_X(x) = f_X(x)\, dx$, we have
\[
F_X(x) := P(X \leq x) = \int_{-\infty}^{x_1} \dots \int_{-\infty}^{x_d} f_X(x)\, dx.
\]
When $E = \mathbb{R}^d$ then by $L^p(\Omega; \mathbb{R}^d)$, or sometimes $L^p(\Omega; \mu)$ or even simply $L^p(\mu)$, we mean the Banach space of measurable functions on $\Omega$ with norm
\[
\|X\|_{L^p} = \left( E|X|^p \right)^{1/p}.
\]
Let $X$ be a nonnegative integer-valued random variable with probability mass function $p_k$. We can compute the expectation of an arbitrary function of $X$ using the formula
\[
E(f(X)) = \sum_{k=0}^{\infty} f(k) p_k.
\]
Let $X, Y$ be two random variables. We want to know whether they are correlated and, if they are, to calculate how correlated they are. We define the covariance of the two random variables as
\[
\operatorname{cov}(X, Y) = E\left[ (X - EX)(Y - EY) \right] = E(XY) - EX\, EY.
\]
The correlation coefficient is
\[
\rho(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sqrt{\operatorname{var}(X)}\, \sqrt{\operatorname{var}(Y)}}. \tag{2.6}
\]
The Cauchy-Schwarz inequality yields that $\rho(X, Y) \in [-1, 1]$. We will say that two random variables $X$ and $Y$ are uncorrelated provided that $\rho(X, Y) = 0$. It is not true in general that two uncorrelated random variables are independent. This is true, however, for Gaussian random variables (see Exercise 5).
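The warning that uncorrelated does not imply independent can be illustrated numerically: for $X \sim \mathcal{N}(0, 1)$ and $Y = X^2$, the covariance equals $EX^3 = 0$, yet $Y$ is a deterministic function of $X$. A sketch (sample size and seed are arbitrary choices):

```python
import random

random.seed(1)
n = 200_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [x * x for x in xs]   # Y = X^2 is completely determined by X, hence dependent

mx = sum(xs) / n
my = sum(ys) / n
# cov(X, X^2) = E X^3 = 0 for a standard Gaussian, so rho(X, Y) ~ 0
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
```

The sample covariance is close to zero even though knowing $X$ fixes $Y$ exactly.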
Example 2.3.10. • Consider the random variable $X : \Omega \to \mathbb{R}$ with pdf
\[
\gamma_{\sigma,b}(x) := (2\pi\sigma)^{-\frac{1}{2}} \exp\left( -\frac{(x-b)^2}{2\sigma} \right).
\]
Such an $X$ is termed a Gaussian or normal random variable. The mean is
\[
EX = \int_{\mathbb{R}} x \gamma_{\sigma,b}(x)\, dx = b
\]
and the variance is
\[
E(X - b)^2 = \int_{\mathbb{R}} (x - b)^2 \gamma_{\sigma,b}(x)\, dx = \sigma.
\]
• Let $b \in \mathbb{R}^d$ and $\Sigma \in \mathbb{R}^{d \times d}$ be symmetric and positive definite. The random variable $X : \Omega \to \mathbb{R}^d$ with pdf
\[
\gamma_{\Sigma,b}(x) := \left( (2\pi)^d \det\Sigma \right)^{-\frac{1}{2}} \exp\left( -\frac{1}{2} \langle \Sigma^{-1}(x - b), (x - b) \rangle \right)
\]
is termed a multivariate Gaussian or normal random variable. The mean is
\[
E(X) = b \tag{2.7}
\]
and the covariance matrix is
\[
E\left[ (X - b) \otimes (X - b) \right] = \Sigma. \tag{2.8}
\]
Since the mean and variance specify completely a Gaussian random variable on $\mathbb{R}$, the Gaussian is commonly denoted by $\mathcal{N}(m, \sigma)$. The standard normal random variable is $\mathcal{N}(0, 1)$. Similarly, since the mean and covariance matrix completely specify a Gaussian random variable on $\mathbb{R}^d$, the Gaussian is commonly denoted by $\mathcal{N}(m, \Sigma)$.
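Properties (2.7) and (2.8) can be checked by simulation. The sketch below (the mean vector $b = (1, -1)$, the covariance $\Sigma = [[2, 1], [1, 2]]$, and its hand-computed triangular factor with $\Sigma = CC^T$ are illustrative assumptions) generates samples as a linear transformation of independent standard normals and compares sample moments to $b$ and $\Sigma$:

```python
import random

random.seed(2)
b = (1.0, -1.0)   # illustrative mean vector
# Sigma = [[2, 1], [1, 2]]; lower-triangular C with Sigma = C C^T:
c11, c21, c22 = 2.0 ** 0.5, 1.0 / 2.0 ** 0.5, 1.5 ** 0.5

n = 200_000
xs, ys = [], []
for _ in range(n):
    y1, y2 = random.gauss(0, 1), random.gauss(0, 1)
    # X = C Y + b is N(b, Sigma), cf. (2.7)-(2.8)
    xs.append(b[0] + c11 * y1)
    ys.append(b[1] + c21 * y1 + c22 * y2)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
var_x = sum((x - mean_x) ** 2 for x in xs) / n          # close to Sigma_11 = 2
cov_xy = sum((x - mean_x) * (y - mean_y)
             for x, y in zip(xs, ys)) / n               # close to Sigma_12 = 1
```

The sample mean approaches $b$ and the sample second moments approach the entries of $\Sigma$.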
Some analytical calculations for Gaussian random variables will be presented in Section 2.6.
2.4 Conditional Expectation
Assume that $X \in L^1(\Omega, \mathcal{F}, \mu)$ and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. The conditional expectation of $X$ with respect to $\mathcal{G}$ is defined to be the function (random variable) $E[X|\mathcal{G}] : \Omega \to E$ which is $\mathcal{G}$-measurable and satisfies
\[
\int_G E[X|\mathcal{G}]\, d\mu = \int_G X\, d\mu \quad \forall\, G \in \mathcal{G}.
\]
We can define $E[f(X)|\mathcal{G}]$ and the conditional probability $P[X \in F|\mathcal{G}] = E[I_F(X)|\mathcal{G}]$, where $I_F$ is the indicator function of $F$, in a similar manner.
We list some of the most important properties of conditional expectation.

Theorem 2.4.1. [Properties of Conditional Expectation]. Let $(\Omega, \mathcal{F}, \mu)$ be a probability space and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$.

(a) If $X$ is $\mathcal{G}$-measurable and integrable then $E(X|\mathcal{G}) = X$.

(b) (Linearity) If $X_1, X_2$ are integrable and $c_1, c_2$ constants, then
\[
E(c_1 X_1 + c_2 X_2 \,|\, \mathcal{G}) = c_1 E(X_1|\mathcal{G}) + c_2 E(X_2|\mathcal{G}).
\]

(c) (Order) If $X_1, X_2$ are integrable and $X_1 \leq X_2$ a.s., then $E(X_1|\mathcal{G}) \leq E(X_2|\mathcal{G})$ a.s.

(d) If $Y$ and $XY$ are integrable, and $X$ is $\mathcal{G}$-measurable, then $E(XY|\mathcal{G}) = X E(Y|\mathcal{G})$.

(e) (Successive smoothing) If $\mathcal{D}$ is a sub-$\sigma$-algebra of $\mathcal{F}$, $\mathcal{D} \subset \mathcal{G}$, and $X$ is integrable, then
\[
E(X|\mathcal{D}) = E[E(X|\mathcal{G})|\mathcal{D}] = E[E(X|\mathcal{D})|\mathcal{G}].
\]

(f) (Convergence) Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of random variables such that, for all $n$, $|X_n| \leq Z$ where $Z$ is integrable. If $X_n \to X$ a.s., then $E(X_n|\mathcal{G}) \to E(X|\mathcal{G})$ a.s. and in $L^1$.

Proof. See Exercise 10.
2.5 The Characteristic Function
Many of the properties of (sums of) random variables can be studied using the Fourier transform of the distribution function. Let $F(\lambda)$ be the distribution function of a (discrete or continuous) random variable $X$. The characteristic function of $X$ is defined to be the Fourier transform of the distribution function
\[
\phi(t) = \int_{\mathbb{R}} e^{it\lambda}\, dF(\lambda) = E(e^{itX}). \tag{2.9}
\]
For a continuous random variable for which the distribution function $F$ has a density, $dF(\lambda) = p(\lambda)\, d\lambda$, (2.9) gives
\[
\phi(t) = \int_{\mathbb{R}} e^{it\lambda} p(\lambda)\, d\lambda.
\]
For a discrete random variable for which $P(X = \lambda_k) = \alpha_k$, (2.9) gives
\[
\phi(t) = \sum_{k=0}^{\infty} e^{it\lambda_k} \alpha_k.
\]
From the properties of the Fourier transform we conclude that the characteristic function determines uniquely the distribution function of the random variable, in the sense that there is a one-to-one correspondence between $F(\lambda)$ and $\phi(t)$. Furthermore, in the exercises at the end of the chapter the reader is asked to prove the following two results.
Lemma 2.5.1. Let $\{X_1, X_2, \dots, X_n\}$ be independent random variables with characteristic functions $\phi_j(t)$, $j = 1, \dots, n$, and let $Y = \sum_{j=1}^n X_j$ with characteristic function $\phi_Y(t)$. Then
\[
\phi_Y(t) = \prod_{j=1}^n \phi_j(t).
\]

Lemma 2.5.2. Let $X$ be a random variable with characteristic function $\phi(t)$ and assume that it has finite moments. Then
\[
E(X^k) = \frac{1}{i^k} \phi^{(k)}(0).
\]
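Lemma 2.5.2 can be illustrated numerically: for a Poisson($\lambda$) random variable, evaluating $\phi(t)$ from the pmf and differentiating at $0$ by finite differences recovers $EX = \lambda$ and $EX^2 = \lambda + \lambda^2$. A sketch (the rate, the series truncation, and the finite-difference step are arbitrary choices):

```python
import cmath
import math

lam = 1.5   # illustrative Poisson rate

def phi(t, terms=60):
    # phi(t) = sum_k e^{i t k} p_k with p_k = lambda^k e^{-lambda} / k!
    return sum(cmath.exp(1j * t * k) * lam**k * math.exp(-lam) / math.factorial(k)
               for k in range(terms))

h = 1e-3
# E X   = phi'(0) / i      (Lemma 2.5.2 with k = 1), central difference
m1 = ((phi(h) - phi(-h)) / (2 * h) / 1j).real
# E X^2 = phi''(0) / i^2   (Lemma 2.5.2 with k = 2), second difference
m2 = ((phi(h) - 2 * phi(0) + phi(-h)) / h**2 / (1j ** 2)).real
```

For $\lambda = 1.5$ this gives $m_1 \approx 1.5$ and $m_2 \approx \lambda + \lambda^2 = 3.75$.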
2.6 Gaussian Random Variables
In this section we present some useful calculations for Gaussian random variables. In particular, we calculate the normalization constant, the mean and covariance matrix, and the characteristic function of multidimensional Gaussian random variables.
Theorem 2.6.1. Let $b \in \mathbb{R}^d$ and $\Sigma \in \mathbb{R}^{d \times d}$ a symmetric and positive definite matrix. Let $X$ be the multivariate Gaussian random variable with probability density function
\[
\gamma_{\Sigma,b}(x) = \frac{1}{Z} \exp\left( -\frac{1}{2} \langle \Sigma^{-1}(x - b), x - b \rangle \right).
\]
Then

i. The normalization constant is
\[
Z = (2\pi)^{d/2} \sqrt{\det(\Sigma)}.
\]

ii. The mean vector and covariance matrix of $X$ are given by
\[
EX = b
\]
and
\[
E((X - EX) \otimes (X - EX)) = \Sigma.
\]

iii. The characteristic function of $X$ is
\[
\phi(t) = e^{i\langle b, t \rangle - \frac{1}{2} \langle t, \Sigma t \rangle}.
\]
Proof. i. From the spectral theorem for symmetric positive definite matrices we have that there exists a diagonal matrix $\Lambda$ with positive entries and an orthogonal matrix $B$ such that
\[
\Sigma^{-1} = B^T \Lambda^{-1} B.
\]
Let $z = x - b$ and $y = Bz$. We have
\[
\langle \Sigma^{-1} z, z \rangle = \langle B^T \Lambda^{-1} B z, z \rangle = \langle \Lambda^{-1} B z, B z \rangle = \langle \Lambda^{-1} y, y \rangle = \sum_{i=1}^d \lambda_i^{-1} y_i^2.
\]
Furthermore, we have that $\det(\Sigma^{-1}) = \prod_{i=1}^d \lambda_i^{-1}$, that $\det(\Sigma) = \prod_{i=1}^d \lambda_i$ and that the Jacobian of an orthogonal transformation is $J = \det(B) = 1$. Hence,
\begin{align*}
\int_{\mathbb{R}^d} \exp\left( -\frac{1}{2} \langle \Sigma^{-1}(x - b), x - b \rangle \right) dx
&= \int_{\mathbb{R}^d} \exp\left( -\frac{1}{2} \langle \Sigma^{-1} z, z \rangle \right) dz \\
&= \int_{\mathbb{R}^d} \exp\left( -\frac{1}{2} \sum_{i=1}^d \lambda_i^{-1} y_i^2 \right) |J|\, dy \\
&= \prod_{i=1}^d \int_{\mathbb{R}} \exp\left( -\frac{1}{2} \lambda_i^{-1} y_i^2 \right) dy_i \\
&= (2\pi)^{d/2} \prod_{i=1}^d \lambda_i^{1/2} = (2\pi)^{d/2} \sqrt{\det(\Sigma)},
\end{align*}
from which we get that
\[
Z = (2\pi)^{d/2} \sqrt{\det(\Sigma)}.
\]
In the above calculation we have used the elementary calculus identity
\[
\int_{\mathbb{R}} e^{-\alpha \frac{x^2}{2}}\, dx = \sqrt{\frac{2\pi}{\alpha}}.
\]
ii. From the above calculation we have that
\[
\gamma_{\Sigma,b}(x)\, dx = \gamma_{\Sigma,b}(B^T y + b)\, dy = \frac{1}{(2\pi)^{d/2} \sqrt{\det(\Sigma)}} \prod_{i=1}^d \exp\left( -\frac{1}{2} \lambda_i^{-1} y_i^2 \right) dy_i.
\]
Consequently
\begin{align*}
EX &= \int_{\mathbb{R}^d} x \gamma_{\Sigma,b}(x)\, dx = \int_{\mathbb{R}^d} (B^T y + b)\, \gamma_{\Sigma,b}(B^T y + b)\, dy \\
&= b \int_{\mathbb{R}^d} \gamma_{\Sigma,b}(B^T y + b)\, dy = b.
\end{align*}
We note that, since $\Sigma^{-1} = B^T \Lambda^{-1} B$, we have that $\Sigma = B^T \Lambda B$. Furthermore, $z = B^T y$.
We calculate
\begin{align*}
E((X_i - b_i)(X_j - b_j)) &= \int_{\mathbb{R}^d} z_i z_j \gamma_{\Sigma,b}(z + b)\, dz \\
&= \frac{1}{(2\pi)^{d/2} \sqrt{\det(\Sigma)}} \int_{\mathbb{R}^d} \sum_k B_{ki} y_k \sum_m B_{mj} y_m \exp\left( -\frac{1}{2} \sum_{\ell} \lambda_{\ell}^{-1} y_{\ell}^2 \right) dy \\
&= \frac{1}{(2\pi)^{d/2} \sqrt{\det(\Sigma)}} \sum_{k,m} B_{ki} B_{mj} \int_{\mathbb{R}^d} y_k y_m \exp\left( -\frac{1}{2} \sum_{\ell} \lambda_{\ell}^{-1} y_{\ell}^2 \right) dy \\
&= \sum_{k,m} B_{ki} B_{mj} \lambda_k \delta_{km} = \Sigma_{ij}.
\end{align*}
iii. Let $Y$ be a multivariate Gaussian random variable with mean $0$ and covariance $I$. Let also $C = B^T \Lambda^{1/2}$, so that $C C^T = B^T \Lambda B = \Sigma$. We have that
\[
X = CY + b.
\]
To see this, we first note that $X$ is Gaussian, since it is given through a linear transformation of a Gaussian random variable. Furthermore,
\[
EX = b \quad \text{and} \quad E((X_i - b_i)(X_j - b_j)) = \Sigma_{ij}.
\]
Now we have:
\begin{align*}
\phi(t) &= E e^{i\langle X, t \rangle} = e^{i\langle b, t \rangle}\, E e^{i\langle CY, t \rangle} = e^{i\langle b, t \rangle}\, E e^{i\langle Y, C^T t \rangle} \\
&= e^{i\langle b, t \rangle}\, e^{-\frac{1}{2} |C^T t|^2} = e^{i\langle b, t \rangle}\, e^{-\frac{1}{2} \langle t, C C^T t \rangle} = e^{i\langle b, t \rangle}\, e^{-\frac{1}{2} \langle t, \Sigma t \rangle}.
\end{align*}
Consequently,
\[
\phi(t) = e^{i\langle b, t \rangle - \frac{1}{2} \langle t, \Sigma t \rangle}.
\]
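The elementary Gaussian integral used in the proof is easy to confirm numerically; the sketch below (the quadrature interval, step, and the value of $\alpha$ are arbitrary choices) compares a midpoint-rule approximation with $\sqrt{2\pi/\alpha}$:

```python
import math

def gauss_integral(alpha, L=15.0, n=60_000):
    # Midpoint-rule approximation of the integral of exp(-alpha x^2 / 2)
    # over [-L, L]; for alpha of order one the tails beyond |x| = L
    # are negligibly small.
    h = 2.0 * L / n
    return h * sum(math.exp(-alpha * (-L + (i + 0.5) * h) ** 2 / 2.0)
                   for i in range(n))

alpha = 0.7
approx = gauss_integral(alpha)
exact = math.sqrt(2.0 * math.pi / alpha)
```

The two values agree to many digits, as expected for a smooth, rapidly decaying integrand.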
2.7 Types of Convergence and Limit Theorems
One of the most important aspects of the theory of random variables is the study of limit theorems for sums of random variables. The most well known limit theorems in probability theory are the law of large numbers and the central limit theorem. There are various different types of convergence for sequences of random variables. We list the most important types of convergence below.
Definition 2.7.1. Let $\{Z_n\}_{n=1}^{\infty}$ be a sequence of random variables. We will say that

(a) $Z_n$ converges to $Z$ with probability one if
\[
P\left( \lim_{n \to +\infty} Z_n = Z \right) = 1.
\]

(b) $Z_n$ converges to $Z$ in probability if for every $\varepsilon > 0$
\[
\lim_{n \to +\infty} P\left( |Z_n - Z| > \varepsilon \right) = 0.
\]

(c) $Z_n$ converges to $Z$ in $L^p$ if
\[
\lim_{n \to +\infty} E\left[ \left| Z_n - Z \right|^p \right] = 0.
\]

(d) Let $F_n(\lambda)$, $n = 1, 2, \dots$, and $F(\lambda)$ be the distribution functions of $Z_n$, $n = 1, 2, \dots$, and $Z$, respectively. Then $Z_n$ converges to $Z$ in distribution if
\[
\lim_{n \to +\infty} F_n(\lambda) = F(\lambda)
\]
for all $\lambda \in \mathbb{R}$ at which $F$ is continuous.
Recall that the distribution function $F_X$ of a random variable from a probability space $(\Omega, \mathcal{F}, P)$ to $\mathbb{R}$ induces a probability measure on $\mathbb{R}$ and that $(\mathbb{R}, \mathcal{B}(\mathbb{R}), F_X)$ is a probability space. We can show that convergence in distribution is equivalent to weak convergence of the probability measures induced by the distribution functions.
Definition 2.7.2. Let $(E, d)$ be a metric space, $\mathcal{B}(E)$ the $\sigma$-algebra of its Borel sets, $P_n$ a sequence of probability measures on $(E, \mathcal{B}(E))$ and let $C_b(E)$ denote the space of bounded continuous functions on $E$. We will say that the sequence $P_n$ converges weakly to the probability measure $P$ if, for each $f \in C_b(E)$,
\[
\lim_{n \to +\infty} \int_E f(x)\, dP_n(x) = \int_E f(x)\, dP(x).
\]
Theorem 2.7.3. Let $F_n(\lambda)$, $n = 1, 2, \dots$, and $F(\lambda)$ be the distribution functions of $Z_n$, $n = 1, 2, \dots$, and $Z$, respectively. Then $Z_n$ converges to $Z$ in distribution if and only if, for all $g \in C_b(\mathbb{R})$,
\[
\lim_{n \to +\infty} \int_X g(x)\, dF_n(x) = \int_X g(x)\, dF(x). \tag{2.10}
\]
Notice that (2.10) is equivalent to
\[
\lim_{n \to +\infty} E_n g(X_n) = E g(X),
\]
where $E_n$ and $E$ denote the expectations with respect to $F_n$ and $F$, respectively.

When the sequence of random variables whose convergence we are interested in takes values in $\mathbb{R}^d$ or, more generally, a metric space $(E, d)$, we can use weak convergence of the sequence of probability measures induced by the sequence of random variables to define convergence in distribution.
Definition 2.7.4. A sequence of random variables $X_n$ defined on probability spaces $(\Omega_n, \mathcal{F}_n, P_n)$ and taking values on a metric space $(E, d)$ is said to converge in distribution if the induced measures $F_n(B) = P_n(X_n \in B)$ for $B \in \mathcal{B}(E)$ converge weakly to a probability measure $P$.
Let $\{X_n\}_{n=1}^{\infty}$ be iid random variables with $EX_n = V$. Then, the strong law of large numbers states that the average of the sum of the iid random variables converges to $V$ with probability one:
\[
P\left( \lim_{N \to +\infty} \frac{1}{N} \sum_{n=1}^N X_n = V \right) = 1.
\]
The strong law of large numbers provides us with information about the behavior of a sum of random variables (or, a large number of repetitions of the same experiment) on average. We can also study fluctuations around the average behavior. Indeed, let $E(X_n - V)^2 = \sigma^2$. Define the centered iid random variables $Y_n = X_n - V$. Then, the sequence of random variables $\frac{1}{\sigma\sqrt{N}} \sum_{n=1}^N Y_n$ converges in distribution to a $\mathcal{N}(0, 1)$ random variable:
\[
\lim_{N \to +\infty} P\left( \frac{1}{\sigma\sqrt{N}} \sum_{n=1}^N Y_n \leq a \right) = \int_{-\infty}^a \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}\, dx.
\]
This is the central limit theorem.
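Both theorems are easy to observe numerically (this is essentially Exercise 8 below). A sketch with Uniform$(0,1)$ summands, for which $V = 1/2$ and $\sigma^2 = 1/12$ (the sample sizes are arbitrary choices):

```python
import math
import random

random.seed(3)
N = 1000    # number of terms in each sum
M = 2000    # number of independent repetitions
sigma = math.sqrt(1.0 / 12.0)   # std of a Uniform(0,1) random variable

averages, normalized = [], []
for _ in range(M):
    s = sum(random.random() for _ in range(N))
    averages.append(s / N)                                    # LLN: -> 1/2
    normalized.append((s - N * 0.5) / (sigma * math.sqrt(N)))  # CLT: ~ N(0,1)

lln_err = abs(sum(averages) / M - 0.5)
# For a standard Gaussian, P(Z <= 0) = 1/2; the empirical fraction should match.
frac_below_zero = sum(1 for z in normalized if z <= 0) / M
```

The averages cluster tightly around $1/2$, while the normalized fluctuations behave like standard normal draws.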
2.8 Discussion and Bibliography
The material of this chapter is very standard and can be found in many books on probability theory.
Well known textbooks on probability theory are [8, 23, 24, 56, 57, 48, 90].
The connection between conditional expectation and orthogonal projections is discussed in [13].
The reduced distribution functions deﬁned in Section 2.3 are used extensively in statistical
mechanics. A different normalization is usually used in physics textbooks. See for instance [2,
Sec. 4.2].
The calculations presented in Section 2.6 are essentially an exercise in linear algebra. See [53,
Sec. 10.2].
Random variables and probability measures can also be deﬁned in inﬁnite dimensions. More
information can be found in [75, Ch. 2].
The study of limit theorems is one of the cornerstones of probability theory and of the theory
of stochastic processes. A comprehensive study of limit theorems can be found in [43].
2.9 Exercises
1. Show that the intersection of a family of $\sigma$-algebras is a $\sigma$-algebra.
2. Prove the law of total probability, Proposition 2.2.13.
3. Calculate the mean, variance and characteristic function of the following probability density functions.

(a) The exponential distribution with density
\[
f(x) = \begin{cases} \lambda e^{-\lambda x} & x > 0, \\ 0 & x < 0, \end{cases}
\]
with $\lambda > 0$.

(b) The uniform distribution with density
\[
f(x) = \begin{cases} \frac{1}{b-a} & a < x < b, \\ 0 & x \notin (a, b), \end{cases}
\]
with $a < b$.
(c) The Gamma distribution with density
\[
f(x) = \begin{cases} \frac{\lambda}{\Gamma(\alpha)} (\lambda x)^{\alpha-1} e^{-\lambda x} & x > 0, \\ 0 & x < 0, \end{cases}
\]
with $\lambda > 0$, $\alpha > 0$, where $\Gamma(\alpha)$ is the Gamma function
\[
\Gamma(\alpha) = \int_0^{\infty} \xi^{\alpha-1} e^{-\xi}\, d\xi, \quad \alpha > 0.
\]
4. Let $X$ and $Y$ be independent random variables with distribution functions $F_X$ and $F_Y$. Show that the distribution function of the sum $Z = X + Y$ is the convolution of $F_X$ and $F_Y$:
\[
F_Z(x) = \int F_X(x - y)\, dF_Y(y).
\]
5. Let X and Y be Gaussian random variables. Show that they are uncorrelated if and only if they
are independent.
6. (a) Let $X$ be a continuous random variable with characteristic function $\phi(t)$. Show that
\[
EX^k = \frac{1}{i^k} \phi^{(k)}(0),
\]
where $\phi^{(k)}(t)$ denotes the $k$-th derivative of $\phi$ evaluated at $t$.

(b) Let $X$ be a nonnegative random variable with distribution function $F(x)$. Show that
\[
E(X) = \int_0^{+\infty} (1 - F(x))\, dx.
\]

(c) Let $X$ be a continuous random variable with probability density function $f(x)$ and characteristic function $\phi(t)$. Find the probability density and characteristic function of the random variable $Y = aX + b$ with $a, b \in \mathbb{R}$.

(d) Let $X$ be a random variable with uniform distribution on $[0, 2\pi]$. Find the probability density of the random variable $Y = \sin(X)$.
7. Let $X$ be a discrete random variable taking values on the set of nonnegative integers with probability mass function $p_k = P(X = k)$ with $p_k \geq 0$, $\sum_{k=0}^{+\infty} p_k = 1$. The generating function is defined as
\[
g(s) = E(s^X) = \sum_{k=0}^{+\infty} p_k s^k.
\]

(a) Show that
\[
EX = g'(1) \quad \text{and} \quad EX^2 = g''(1) + g'(1),
\]
where the prime denotes differentiation.

(b) Calculate the generating function of the Poisson random variable with
\[
p_k = P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \dots \text{ and } \lambda > 0.
\]

(c) Prove that the generating function of a sum of independent nonnegative integer-valued random variables is the product of their generating functions.
8. Write a computer program for studying the law of large numbers and the central limit theorem.
Investigate numerically the rate of convergence of these two theorems.
9. Study the properties of Gaussian measures on separable Hilbert spaces from [75, Ch. 2].
10. Prove Theorem 2.4.1.
Chapter 3
Basics of the Theory of Stochastic Processes
3.1 Introduction
In this chapter we present some basic results from the theory of stochastic processes and we investigate the properties of some of the standard stochastic processes in continuous time. In Section 3.2 we give the definition of a stochastic process. In Section 3.3 we present some properties of stationary stochastic processes. In Section 3.4 we introduce Brownian motion and study some of its properties. Various examples of stochastic processes in continuous time are presented in Section 3.5. The Karhunen-Loève expansion, one of the most useful tools for representing stochastic processes and random fields, is presented in Section 3.6. Further discussion and bibliographical comments are presented in Section 3.7. Section 3.8 contains exercises.
3.2 Deﬁnition of a Stochastic Process
Stochastic processes describe dynamical systems whose evolution law is of probabilistic nature.
The precise deﬁnition is given below.
Definition 3.2.1. Let $T$ be an ordered set, $(\Omega, \mathcal{F}, P)$ a probability space and $(E, \mathcal{G})$ a measurable space. A stochastic process is a collection of random variables $X = \{X_t;\ t \in T\}$ where, for each fixed $t \in T$, $X_t$ is a random variable from $(\Omega, \mathcal{F}, P)$ to $(E, \mathcal{G})$. $\Omega$ is called the sample space and $E$ is the state space of the stochastic process $X_t$.

The set $T$ can be either discrete, for example the set of positive integers $\mathbb{Z}^+$, or continuous, $T = [0, +\infty)$. The state space $E$ will usually be $\mathbb{R}^d$ equipped with the $\sigma$-algebra of Borel sets.
A stochastic process $X$ may be viewed as a function of both $t \in T$ and $\omega \in \Omega$. We will sometimes write $X(t)$, $X(t, \omega)$ or $X_t(\omega)$ instead of $X_t$. For a fixed sample point $\omega \in \Omega$, the function $X_t(\omega) : T \to E$ is called a sample path (realization, trajectory) of the process $X$.
Definition 3.2.2. The finite dimensional distributions (fdd) of a stochastic process are the distributions of the $E^k$-valued random variables $(X(t_1), X(t_2), \dots, X(t_k))$ for arbitrary positive integer $k$ and arbitrary times $t_i \in T$, $i \in \{1, \dots, k\}$:
\[
F(x) = P(X(t_i) \leq x_i,\ i = 1, \dots, k)
\]
with $x = (x_1, \dots, x_k)$.
From experiments or numerical simulations we can only obtain information about the ﬁnite
dimensional distributions of a process. A natural question arises: are the ﬁnite dimensional distri
butions of a stochastic process sufﬁcient to determine a stochastic process uniquely? This is true
for processes with continuous paths¹. This is the class of stochastic processes that we will study
in these notes.
Definition 3.2.3. We will say that two processes $X_t$ and $Y_t$ are equivalent if they have the same finite dimensional distributions.
Definition 3.2.4. A one dimensional Gaussian process is a continuous time stochastic process for which $E = \mathbb{R}$ and all the finite dimensional distributions are Gaussian, i.e. every finite dimensional vector $(X_{t_1}, X_{t_2}, \dots, X_{t_k})$ is a $\mathcal{N}(\mu_k, K_k)$ random variable for some vector $\mu_k$ and a symmetric nonnegative definite matrix $K_k$, for all $k = 1, 2, \dots$ and for all $t_1, t_2, \dots, t_k$.
From the above definition we conclude that the finite dimensional distributions of a Gaussian continuous time stochastic process are Gaussian with PDF
\[
\gamma_{\mu_k, K_k}(x) = (2\pi)^{-k/2} (\det K_k)^{-1/2} \exp\left( -\frac{1}{2} \langle K_k^{-1}(x - \mu_k), x - \mu_k \rangle \right),
\]
where $x = (x_1, x_2, \dots, x_k)$.
It is straightforward to extend the above definition to arbitrary dimensions. A Gaussian process $x(t)$ is characterized by its mean
\[
m(t) := E x(t)
\]
and the covariance (or autocorrelation) matrix
\[
C(t, s) = E\Big( \big( x(t) - m(t) \big) \otimes \big( x(s) - m(s) \big) \Big).
\]
Thus, the first two moments of a Gaussian process are sufficient for a complete characterization of the process.

¹ In fact, what we need is the stochastic process to be separable. See the discussion in Section 3.7.
3.3 Stationary Processes
3.3.1 Strictly Stationary Processes
The statistics of many stochastic processes that appear in applications remain invariant under time translations. Such stochastic processes are called stationary. It is possible to develop a quite general theory for stochastic processes that enjoy this symmetry property.
Definition 3.3.1. A stochastic process is called (strictly) stationary if all finite dimensional distributions are invariant under time translation: for any integer $k$ and times $t_i \in T$, the distribution of $(X(t_1), X(t_2), \dots, X(t_k))$ is equal to that of $(X(s + t_1), X(s + t_2), \dots, X(s + t_k))$ for any $s$ such that $s + t_i \in T$ for all $i \in \{1, \dots, k\}$. In other words,
\[
P(X_{t_1 + t} \in A_1, X_{t_2 + t} \in A_2, \dots, X_{t_k + t} \in A_k) = P(X_{t_1} \in A_1, X_{t_2} \in A_2, \dots, X_{t_k} \in A_k), \quad \forall t \in T.
\]
Example 3.3.2. Let $Y_0, Y_1, \dots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$. Then $X_n$ is a strictly stationary process (see Exercise 1). Assume furthermore that $EY_0 = \mu < +\infty$. Then, by the strong law of large numbers, we have that
\[
\frac{1}{N} \sum_{j=0}^{N-1} X_j = \frac{1}{N} \sum_{j=0}^{N-1} Y_j \to EY_0 = \mu,
\]
almost surely. In fact, Birkhoff's ergodic theorem states that, for any function $f$ such that $Ef(Y_0) < +\infty$, we have that
\[
\lim_{N \to +\infty} \frac{1}{N} \sum_{j=0}^{N-1} f(X_j) = Ef(Y_0), \tag{3.1}
\]
almost surely. The sequence of iid random variables is an example of an ergodic strictly stationary process.
Ergodic strictly stationary processes satisfy (3.1). Hence, we can calculate the statistics of a stationary stochastic process $X_n$ using a single sample path, provided that it is long enough ($N \gg 1$).
Example 3.3.3. Let $Z$ be a random variable and define the stochastic process $X_n = Z$, $n = 0, 1, 2, \dots$. Then $X_n$ is a strictly stationary process (see Exercise 2). We can calculate the long time average of this stochastic process:
\[
\frac{1}{N} \sum_{j=0}^{N-1} X_j = \frac{1}{N} \sum_{j=0}^{N-1} Z = Z,
\]
which is independent of $N$ and does not converge to the mean of the stochastic process $EX_n = EZ$ (assuming that it is finite), or any other deterministic number. This is an example of a non-ergodic process.
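The contrast between Examples 3.3.2 and 3.3.3 can be seen directly in simulation: the time average of an iid sequence converges to the mean, while the time average of the frozen process $X_n = Z$ stays at the random value $Z$. A sketch (Gaussian draws with unit mean are an illustrative choice):

```python
import random

random.seed(4)
N = 100_000
mu = 1.0

# Ergodic case (Example 3.3.2): X_n = Y_n iid with mean mu.
iid_avg = sum(random.gauss(mu, 1.0) for _ in range(N)) / N

# Non-ergodic case (Example 3.3.3): X_n = Z for a single draw Z.
Z = random.gauss(mu, 1.0)
frozen_avg = sum(Z for _ in range(N)) / N   # equals Z, not mu
```

The first average is close to $\mu = 1$; the second equals the particular realization of $Z$, whatever it happens to be.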
3.3.2 Second Order Stationary Processes
Let $(\Omega, \mathcal{F}, P)$ be a probability space. Let $X_t$, $t \in T$ (with $T = \mathbb{R}$ or $\mathbb{Z}$) be a real-valued random process on this probability space with finite second moment, $E|X_t|^2 < +\infty$ (i.e. $X_t \in L^2(\Omega, P)$ for all $t \in T$). Assume that it is strictly stationary. Then,
\[
E(X_{t+s}) = EX_t, \quad s \in T, \tag{3.2}
\]
from which we conclude that $EX_t$ is constant, and
\[
E((X_{t_1 + s} - \mu)(X_{t_2 + s} - \mu)) = E((X_{t_1} - \mu)(X_{t_2} - \mu)), \quad s \in T, \tag{3.3}
\]
from which we conclude that the covariance or autocorrelation or correlation function $C(t, s) = E((X_t - \mu)(X_s - \mu))$ depends on the difference between the two times, $t$ and $s$, i.e. $C(t, s) = C(t - s)$. This motivates the following definition.
Definition 3.3.4. A stochastic process $X_t \in L^2$ is called second-order stationary or wide-sense stationary or weakly stationary if the first moment $EX_t$ is a constant and the covariance function $E(X_t - \mu)(X_s - \mu)$ depends only on the difference $t - s$:
\[
EX_t = \mu, \qquad E((X_t - \mu)(X_s - \mu)) = C(t - s).
\]
The constant $\mu$ is the expectation of the process $X_t$. Without loss of generality, we can set $\mu = 0$, since if $EX_t = \mu$ then the process $Y_t = X_t - \mu$ is mean zero. A mean zero process will be called a centered process. The function $C(t)$ is the covariance (sometimes also called autocovariance) or the autocorrelation function of $X_t$. Notice that $C(t) = E(X_t X_0)$, whereas $C(0) = E(X_t^2)$, which is finite, by assumption. Since we have assumed that $X_t$ is a real valued process, we have that $C(t) = C(-t)$, $t \in \mathbb{R}$.
Remark 3.3.5. Let $X_t$ be a strictly stationary stochastic process with finite second moment (i.e. $X_t \in L^2$). The definition of strict stationarity implies that $EX_t = \mu$, a constant, and $E((X_t - \mu)(X_s - \mu)) = C(t - s)$. Hence, a strictly stationary process with finite second moment is also stationary in the wide sense. The converse is not true.
Example 3.3.6. Let $Y_0, Y_1, \dots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$. From Example 3.3.2 we know that this is a strictly stationary process, irrespective of whether $Y_0$ is such that $EY_0^2 < +\infty$. Assume now that $EY_0 = 0$ and $EY_0^2 = \sigma^2 < +\infty$. Then $X_n$ is a second order stationary process with mean zero and correlation function $R(k) = \sigma^2 \delta_{k0}$. Notice that in this case we have no correlation between the values of the stochastic process at different times $n$ and $k$.
Example 3.3.7. Let $Z$ be a single random variable and consider the stochastic process $X_n = Z$, $n = 0, 1, 2, \dots$. From Example 3.3.3 we know that this is a strictly stationary process, irrespective of whether $E|Z|^2 < +\infty$ or not. Assume now that $EZ = 0$, $EZ^2 = \sigma^2$. Then $X_n$ becomes a second order stationary process with $R(k) = \sigma^2$. Notice that in this case the values of our stochastic process at different times are strongly correlated.
We will see in Section 3.3.3 that for second order stationary processes, ergodicity is related to fast decay of correlations. In the first of the examples above there was no correlation between the values of the stochastic process at different times, and the stochastic process is ergodic. On the contrary, in our second example there is very strong correlation between the values of the stochastic process at different times, and this process is not ergodic.
Remark 3.3.8. The ﬁrst two moments of a Gaussian process are sufﬁcient for a complete charac
terization of the process. Consequently, a Gaussian stochastic process is strictly stationary if and
only if it is weakly stationary.
Continuity properties of the covariance function are equivalent to continuity properties of the paths of $X_t$ in the $L^2$ sense, i.e.
\[
\lim_{h \to 0} E|X_{t+h} - X_t|^2 = 0.
\]
Lemma 3.3.9. Assume that the covariance function $C(t)$ of a second order stationary process is continuous at $t = 0$. Then it is continuous for all $t \in \mathbb{R}$. Furthermore, the continuity of $C(t)$ is equivalent to the continuity of the process $X_t$ in the $L^2$ sense.

Proof. Fix $t \in \mathbb{R}$ and (without loss of generality) set $EX_t = 0$. We calculate:
\begin{align*}
|C(t + h) - C(t)|^2 &= |E(X_{t+h} X_0) - E(X_t X_0)|^2 = |E((X_{t+h} - X_t) X_0)|^2 \\
&\leq E(X_0)^2\, E(X_{t+h} - X_t)^2 \\
&= C(0)\left( EX_{t+h}^2 + EX_t^2 - 2 EX_t X_{t+h} \right) \\
&= 2C(0)(C(0) - C(h)) \to 0,
\end{align*}
as $h \to 0$. Thus, continuity of $C(\cdot)$ at $0$ implies continuity for all $t$.

Assume now that $C(t)$ is continuous. From the above calculation we have
\[
E|X_{t+h} - X_t|^2 = 2(C(0) - C(h)), \tag{3.4}
\]
which converges to $0$ as $h \to 0$. Conversely, assume that $X_t$ is $L^2$ continuous. Then, from the above equation we get $\lim_{h \to 0} C(h) = C(0)$.
Notice that from (3.4) we immediately conclude that $C(0) \geq C(h)$, $h \in \mathbb{R}$.
The Fourier transform of the covariance function of a second order stationary process always exists. This enables us to study second order stationary processes using tools from Fourier analysis. To make the link between second order stationary processes and Fourier analysis we will use Bochner's theorem, which applies to all nonnegative definite functions.
Definition 3.3.10. A function $f(x) : \mathbb{R} \to \mathbb{R}$ is called nonnegative definite if
\[
\sum_{i,j=1}^n f(t_i - t_j) c_i \bar{c}_j \geq 0 \tag{3.5}
\]
for all $n \in \mathbb{N}$, $t_1, \dots, t_n \in \mathbb{R}$, $c_1, \dots, c_n \in \mathbb{C}$.
Lemma 3.3.11. The covariance function of a second order stationary process is a nonnegative definite function.

Proof. We will use the notation $X_t^c := \sum_{i=1}^n X_{t_i} c_i$. We have
\begin{align*}
\sum_{i,j=1}^n C(t_i - t_j) c_i \bar{c}_j &= \sum_{i,j=1}^n E X_{t_i} X_{t_j} c_i \bar{c}_j \\
&= E\left( \sum_{i=1}^n X_{t_i} c_i \sum_{j=1}^n X_{t_j} \bar{c}_j \right) = E\left( X_t^c\, \bar{X}_t^c \right) \\
&= E|X_t^c|^2 \geq 0.
\end{align*}
Theorem 3.3.12. (Bochner) Let $C(t)$ be a continuous positive definite function. Then there exists a unique nonnegative measure $\rho$ on $\mathbb{R}$ such that $\rho(\mathbb{R}) = C(0)$ and
\[
C(t) = \int_{\mathbb{R}} e^{ixt} \rho(dx) \quad \forall t \in \mathbb{R}. \tag{3.6}
\]
Definition 3.3.13. Let $X_t$ be a second order stationary process with autocorrelation function $C(t)$ whose Fourier transform is the measure $\rho(dx)$. The measure $\rho(dx)$ is called the spectral measure of the process $X_t$.
In the following we will assume that the spectral measure is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}$ with density $f(x)$, i.e. $\rho(dx) = f(x)\, dx$. The Fourier transform $f(x)$ of the covariance function is called the spectral density of the process:
\[
f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} C(t)\, dt.
\]
From (3.6) it follows that the autocorrelation function of a mean zero, second order stationary process is given by the inverse Fourier transform of the spectral density:
\[
C(t) = \int_{-\infty}^{\infty} e^{itx} f(x)\, dx. \tag{3.7}
\]
There are various cases where the experimentally measured quantity is the spectral density (or power spectrum) of a stationary stochastic process. Conversely, from a time series of observations of a stationary process we can calculate the autocorrelation function and, using (3.7), the spectral density.
The autocorrelation function of a second order stationary process enables us to associate a time scale to $X_t$, the correlation time $\tau_{cor}$:
\[
\tau_{cor} = \frac{1}{C(0)} \int_0^{\infty} C(\tau)\, d\tau = \int_0^{\infty} E(X_\tau X_0)/E(X_0^2)\, d\tau.
\]
The slower the decay of the correlation function, the larger the correlation time is. Notice that when the correlations do not decay sufficiently fast, so that $C(t)$ is not integrable, the correlation time is infinite.
Example 3.3.14. Consider a mean zero, second order stationary process with correlation function
\[
R(t) = R(0) e^{-\alpha|t|}, \tag{3.8}
\]
where $\alpha > 0$. We will write $R(0) = \frac{D}{\alpha}$ where $D > 0$. The spectral density of this process is:
\begin{align*}
f(x) &= \frac{1}{2\pi} \frac{D}{\alpha} \int_{-\infty}^{+\infty} e^{-ixt} e^{-\alpha|t|}\, dt \\
&= \frac{1}{2\pi} \frac{D}{\alpha} \left( \int_{-\infty}^{0} e^{-ixt} e^{\alpha t}\, dt + \int_0^{+\infty} e^{-ixt} e^{-\alpha t}\, dt \right) \\
&= \frac{1}{2\pi} \frac{D}{\alpha} \left( \frac{1}{-ix + \alpha} + \frac{1}{ix + \alpha} \right) \\
&= \frac{D}{\pi} \frac{1}{x^2 + \alpha^2}.
\end{align*}
This function is called the Cauchy or the Lorentz distribution. The correlation time is (we have that $R(0) = D/\alpha$)
\[
\tau_{cor} = \int_0^{\infty} e^{-\alpha t}\, dt = \alpha^{-1}.
\]
A Gaussian process with an exponential correlation function is of particular importance in the
theory and applications of stochastic processes.
Definition 3.3.15. A real-valued Gaussian stationary process defined on $\mathbb{R}$ with correlation function given by (3.8) is called the (stationary) Ornstein-Uhlenbeck process.
The Ornstein-Uhlenbeck process is used as a model for the velocity of a Brownian particle. It is of interest to calculate the statistics of the position of the Brownian particle, i.e. of the integral
\[
X(t) = \int_0^t Y(s)\, ds, \tag{3.9}
\]
where $Y(t)$ denotes the stationary OU process.
Lemma 3.3.16. Let $Y(t)$ denote the stationary OU process with covariance function (3.8), and set $\alpha = D = 1$. Then the position process (3.9) is a mean zero Gaussian process with covariance function
\[
E(X(t)X(s)) = 2\min(t, s) + e^{-\min(t,s)} + e^{-\max(t,s)} - e^{-|t-s|} - 1. \tag{3.10}
\]
Proof. See Exercise 8.
3.3.3 Ergodic Properties of Second-Order Stationary Processes
Second order stationary processes have nice ergodic properties, provided that the correlation between values of the process at different times decays sufficiently fast. In this case, it is possible to show that we can calculate expectations by calculating time averages. An example of such a result is the following.
Theorem 3.3.17. Let $\{X_t\}_{t \geq 0}$ be a second order stationary process on a probability space $(\Omega, \mathcal{F}, P)$ with mean $\mu$ and covariance $R(t)$, and assume that $R(t) \in L^1(0, +\infty)$. Then
\[
\lim_{T \to +\infty} E\left| \frac{1}{T} \int_0^T X(s)\, ds - \mu \right|^2 = 0. \tag{3.11}
\]
For the proof of this result we will first need an elementary lemma.

Lemma 3.3.18. Let $R(t)$ be an integrable symmetric function. Then
\[
\int_0^T \int_0^T R(t - s)\, dt\, ds = 2 \int_0^T (T - s) R(s)\, ds. \tag{3.12}
\]

Proof. We make the change of variables $u = t - s$, $v = t + s$. The domain of integration in the $t, s$ variables is $[0, T] \times [0, T]$. In the $u, v$ variables it becomes $[-T, T] \times [|u|, 2T - |u|]$. The Jacobian of the transformation is
\[
J = \frac{\partial(t, s)}{\partial(u, v)} = \frac{1}{2}.
\]
The integral becomes
\begin{align*}
\int_0^T \int_0^T R(t - s)\, dt\, ds &= \int_{-T}^{T} \int_{|u|}^{2T - |u|} R(u) J\, dv\, du \\
&= \int_{-T}^{T} (T - |u|) R(u)\, du \\
&= 2 \int_0^T (T - u) R(u)\, du,
\end{align*}
where the symmetry of the function $R(u)$ was used in the last step.
Proof of Theorem 3.3.17. We use Lemma 3.3.18 to calculate:
\begin{align*}
E\left| \frac{1}{T} \int_0^T X_s\, ds - \mu \right|^2 &= \frac{1}{T^2} E\left| \int_0^T (X_s - \mu)\, ds \right|^2 \\
&= \frac{1}{T^2} E \int_0^T \int_0^T (X(t) - \mu)(X(s) - \mu)\, dt\, ds \\
&= \frac{1}{T^2} \int_0^T \int_0^T R(t - s)\, dt\, ds \\
&= \frac{2}{T^2} \int_0^T (T - u) R(u)\, du \\
&\leq \frac{2}{T} \int_0^{+\infty} \left| \left( 1 - \frac{u}{T} \right) R(u) \right| du
\leq \frac{2}{T} \int_0^{+\infty} |R(u)|\, du \to 0,
\end{align*}
using the dominated convergence theorem and the assumption $R(\cdot) \in L^1$.
Assume that $\mu = 0$ and define
\[
D = \int_0^{+\infty} R(t)\, dt, \tag{3.13}
\]
which, from our assumption on $R(t)$, is a finite quantity.² The above calculation suggests that, for $T \gg 1$, we have that
\[
E\left( \int_0^T X(s)\, ds \right)^2 \approx 2DT.
\]
This implies that, at sufficiently long times, the mean square displacement of the integral of the ergodic second order stationary process $X_t$ scales linearly in time, with proportionality coefficient $2D$.
Assume that $X_t$ is the velocity of a (Brownian) particle. In this case, the integral of $X_t$,
\[
Z_t = \int_0^t X_s\, ds,
\]
represents the particle position. From our calculation above we conclude that
\[
EZ_t^2 = 2Dt,
\]
where
\[
D = \int_0^{\infty} R(t)\, dt = \int_0^{\infty} E(X_t X_0)\, dt \tag{3.14}
\]
is the diffusion coefﬁcient. Thus, one expects that at sufﬁciently long times and under appropriate
assumptions on the correlation function, the time integral of a stationary process will approximate
²Notice, however, that we do not know whether it is nonzero. This requires a separate argument.
a Brownian motion with diffusion coefficient $D$. The diffusion coefficient is an example of a transport coefficient and (3.14) is an example of the Green-Kubo formula: a transport coefficient can be calculated in terms of the time integral of an appropriate autocorrelation function. In the case of the diffusion coefficient we need to calculate the integral of the velocity autocorrelation function.
Example 3.3.19. Consider the stochastic process with an exponential correlation function from Example 3.3.14, and assume that this stochastic process describes the velocity of a Brownian particle. Since $R(t) \in L^1(0,+\infty)$, Theorem 3.3.17 applies. Furthermore, the diffusion coefficient of the Brownian particle is given by
\[
\int_0^{+\infty} R(t)\,dt = R(0)\,\tau_c = \frac{D}{\alpha^2}.
\]
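As an illustrative numerical sketch (not part of the text), one can check the Green-Kubo scaling $\mathbb{E}Z_t^2 \approx 2 D_{\mathrm{eff}}\, t$ by Monte Carlo for an Ornstein-Uhlenbeck velocity with $R(t) = (D/\alpha)e^{-\alpha|t|}$, so that $D_{\mathrm{eff}} = D/\alpha^2$; all parameter names below are chosen for the sketch:

```python
import numpy as np

# Sketch: Monte Carlo check of E[Z_t^2] ~ 2*D_eff*t for an OU velocity process
# with correlation R(t) = (D/alpha)*exp(-alpha*|t|), hence D_eff = D/alpha**2.
rng = np.random.default_rng(0)
alpha, D = 1.0, 1.0                       # assumed OU parameters
dt, n_steps, n_paths = 0.02, 10_000, 2_000
t_final = n_steps * dt                    # = 200, well past the correlation time

# Exact one-step update for the stationary OU process.
a = np.exp(-alpha * dt)
s = np.sqrt((D / alpha) * (1.0 - a**2))
X = rng.normal(0.0, np.sqrt(D / alpha), n_paths)   # stationary initial velocities
Z = np.zeros(n_paths)                              # particle positions
for _ in range(n_steps):
    Z += X * dt                                    # left-endpoint Riemann sum
    X = a * X + s * rng.normal(size=n_paths)

msd = np.mean(Z**2)
D_eff = D / alpha**2
print(msd / (2 * t_final), D_eff)   # the two numbers should be close
```

For $t \gg 1/\alpha$ the ratio printed first should approach $D_{\mathrm{eff}} = 1$, up to Monte Carlo error.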
3.4 Brownian Motion

The most important continuous time stochastic process is Brownian motion. Brownian motion is a mean zero, continuous (i.e. it has continuous sample paths: for a.e. $\omega \in \Omega$ the function $X_t$ is a continuous function of time) process with independent Gaussian increments. A process $X_t$ has independent increments if for every sequence $t_0 < t_1 < \dots < t_n$ the random variables
\[
X_{t_1} - X_{t_0},\quad X_{t_2} - X_{t_1},\quad \dots,\quad X_{t_n} - X_{t_{n-1}}
\]
are independent. If, furthermore, for any $t_1, t_2, s \in T$ and Borel set $B \subset \mathbb{R}$,
\[
\mathbb{P}(X_{t_2+s} - X_{t_1+s} \in B) = \mathbb{P}(X_{t_2} - X_{t_1} \in B),
\]
then the process $X_t$ has stationary independent increments.

Definition 3.4.1. • A one dimensional standard Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}$ is a real valued stochastic process such that

i. $W(0) = 0$.

ii. $W(t)$ has independent increments.

iii. For every $t > s \geq 0$, $W(t) - W(s)$ has a Gaussian distribution with mean $0$ and variance $t - s$. That is, the density of the random variable $W(t) - W(s)$ is
\[
g(x; t, s) = \bigl(2\pi(t-s)\bigr)^{-\frac{1}{2}} \exp\left( -\frac{x^2}{2(t-s)} \right); \tag{3.15}
\]
• A d–dimensional standard Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}^d$ is a collection of $d$ independent one dimensional Brownian motions:
\[
W(t) = (W_1(t), \dots, W_d(t)),
\]
where $W_i(t)$, $i = 1, \dots, d$ are independent one dimensional Brownian motions. The density of the Gaussian random vector $W(t) - W(s)$ is thus
\[
g(x; t, s) = \bigl(2\pi(t-s)\bigr)^{-d/2} \exp\left( -\frac{\|x\|^2}{2(t-s)} \right).
\]

Brownian motion is sometimes referred to as the Wiener process.

Brownian motion has continuous paths. More precisely, it has a continuous modification.
Definition 3.4.2. Let $X_t$ and $Y_t$, $t \in T$, be two stochastic processes defined on the same probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The process $Y_t$ is said to be a modification of $X_t$ if $\mathbb{P}(X_t = Y_t) = 1$ for all $t \in T$.

Lemma 3.4.3. There is a continuous modification of Brownian motion.

This follows from a theorem due to Kolmogorov.

Theorem 3.4.4 (Kolmogorov). Let $X_t$, $t \in [0, \infty)$ be a stochastic process on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Suppose that there are positive constants $\alpha$ and $\beta$, and for each $T \geq 0$ there is a constant $C(T)$ such that
\[
\mathbb{E}|X_t - X_s|^\alpha \leq C(T)\,|t-s|^{1+\beta}, \qquad 0 \leq s, t \leq T. \tag{3.16}
\]
Then there exists a continuous modification $Y_t$ of the process $X_t$.
The proof of Lemma 3.4.3 is left as an exercise.
Remark 3.4.5. Equivalently, we could have defined the one dimensional standard Brownian motion as a stochastic process on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with continuous paths for almost all $\omega \in \Omega$, and Gaussian finite dimensional distributions with zero mean and covariance $\mathbb{E}(W_{t_i} W_{t_j}) = \min(t_i, t_j)$. One can then show that Definition 3.4.1 follows from the above definition.
It is possible to prove rigorously the existence of the Wiener process (Brownian motion):
Figure 3.1: Brownian sample paths
Theorem 3.4.6 (Wiener). There exists an almost surely continuous process $W_t$ with independent increments such that $W_0 = 0$ and, for each $t \geq 0$, the random variable $W_t$ is $\mathcal{N}(0, t)$. Furthermore, $W_t$ is almost surely locally Hölder continuous with exponent $\alpha$ for any $\alpha \in (0, \frac{1}{2})$.
Notice that Brownian paths are not differentiable.
We can also construct Brownian motion through the limit of an appropriately rescaled random walk: let $X_1, X_2, \dots$ be i.i.d. random variables on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with mean $0$ and variance $1$. Define the discrete time stochastic process $S_n$ with $S_0 = 0$, $S_n = \sum_{j=1}^n X_j$, $n \geq 1$. Define now a continuous time stochastic process with continuous paths as the linearly interpolated, appropriately rescaled random walk:
\[
W^n_t = \frac{1}{\sqrt{n}} S_{[nt]} + (nt - [nt]) \frac{1}{\sqrt{n}} X_{[nt]+1},
\]
where $[\cdot]$ denotes the integer part of a number. Then $W^n_t$ converges weakly, as $n \to +\infty$, to a one dimensional standard Brownian motion.
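A small sketch (illustrative, not from the text) evaluates the interpolated, rescaled walk $W^n_t$ on a time grid for $\pm 1$ steps, and checks that for large $n$ the endpoint $W^n_1$ is approximately $\mathcal{N}(0,1)$; the helper name `rescaled_walk` is ours:

```python
import numpy as np

# Sketch: W^n_t = S_[nt]/sqrt(n) + (nt - [nt]) * X_{[nt]+1}/sqrt(n) for i.i.d.
# mean-zero, variance-one steps X_j, evaluated on a grid in [0, 1].
rng = np.random.default_rng(1)

def rescaled_walk(n, t_grid, rng):
    X = rng.choice([-1.0, 1.0], size=n + 1)       # steps with mean 0, variance 1
    S = np.concatenate([[0.0], np.cumsum(X)])     # S_0, S_1, ..., S_{n+1}
    k = np.floor(n * t_grid).astype(int)          # [nt]
    return (S[k] + (n * t_grid - k) * X[k]) / np.sqrt(n)

t = np.linspace(0.0, 1.0, 101)
paths = np.array([rescaled_walk(10_000, t, rng) for _ in range(2_000)])
# For large n, W^n_1 should be approximately N(0, 1):
print(paths[:, -1].mean(), paths[:, -1].var())
```

The printed mean and variance should be close to $0$ and $1$ respectively, consistent with the weak convergence to Brownian motion at $t = 1$.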
Brownian motion is a Gaussian process. For the d–dimensional Brownian motion, and for $I$ the $d \times d$ dimensional identity, we have (see (2.7) and (2.8))
\[
\mathbb{E}W(t) = 0 \quad \forall\, t \geq 0
\]
and
\[
\mathbb{E}\bigl( (W(t) - W(s)) \otimes (W(t) - W(s)) \bigr) = (t-s) I. \tag{3.17}
\]
Moreover,
\[
\mathbb{E}\bigl( W(t) \otimes W(s) \bigr) = \min(t, s)\, I. \tag{3.18}
\]
From the formula for the Gaussian density $g(x, t-s)$, eqn. (3.15), we immediately conclude that $W(t) - W(s)$ and $W(t+u) - W(s+u)$ have the same pdf. Consequently, Brownian motion has stationary increments. Notice, however, that Brownian motion itself is not a stationary process. Since $W(t) = W(t) - W(0)$, the pdf of $W(t)$ is
\[
g(x, t) = \frac{1}{\sqrt{2\pi t}}\, e^{-x^2/2t}.
\]
We can easily calculate all moments of the Brownian motion:
\[
\mathbb{E}(x^n(t)) = \frac{1}{\sqrt{2\pi t}} \int_{-\infty}^{+\infty} x^n e^{-x^2/2t}\,dx
= \begin{cases}
1 \cdot 3 \cdots (n-1)\, t^{n/2}, & n \text{ even}, \\
0, & n \text{ odd}.
\end{cases}
\]
Brownian motion is invariant under various transformations in time.

Theorem 3.4.7. Let $W_t$ denote a standard Brownian motion in $\mathbb{R}$. Then $W_t$ has the following properties:

i. (Rescaling). For each $c > 0$ define $X_t = \frac{1}{\sqrt{c}} W(ct)$. Then $(X_t, t \geq 0) = (W_t, t \geq 0)$ in law.

ii. (Shifting). For each $c > 0$, $W_{c+t} - W_c$, $t \geq 0$, is a Brownian motion which is independent of $W_u$, $u \in [0, c]$.

iii. (Time reversal). Define $X_t = W_{1-t} - W_1$, $t \in [0, 1]$. Then $(X_t, t \in [0,1]) = (W_t, t \in [0,1])$ in law.

iv. (Inversion). Let $X_t$, $t \geq 0$, be defined by $X_0 = 0$, $X_t = t\, W(1/t)$. Then $(X_t, t \geq 0) = (W_t, t \geq 0)$ in law.

We emphasize that the equivalence in the above theorem holds in law and not in a pathwise sense.

Proof. See Exercise 13.
We can also add a drift and change the diffusion coefficient of the Brownian motion: we will define a Brownian motion with drift $\mu$ and variance $\sigma^2$ as the process
\[
X_t = \mu t + \sigma W_t.
\]
The mean and variance of $X_t$ are
\[
\mathbb{E}X_t = \mu t, \qquad \mathbb{E}(X_t - \mathbb{E}X_t)^2 = \sigma^2 t.
\]
Notice that $X_t$ satisfies the equation
\[
dX_t = \mu\,dt + \sigma\,dW_t.
\]
This is the simplest example of a stochastic differential equation.
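As a quick illustrative check (not part of the text), Brownian motion with drift can be sampled by summing independent Gaussian increments, and the empirical mean and variance at the final time compared with $\mu T$ and $\sigma^2 T$; the parameter values are arbitrary:

```python
import numpy as np

# Sketch: sample X_T = mu*T + sigma*W_T by summing Gaussian increments of W,
# then compare empirical mean/variance with mu*T and sigma^2*T.
rng = np.random.default_rng(2)
mu, sigma, T, n_steps, n_paths = 0.5, 1.3, 2.0, 200, 20_000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
X_T = mu * T + sigma * dW.sum(axis=1)     # X_T = mu*T + sigma*W_T

print(X_T.mean(), mu * T)                 # both close to 1.0
print(X_T.var(), sigma**2 * T)            # both close to 3.38
```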
We can define the OU process through the Brownian motion via a time change.

Lemma 3.4.8. Let $W(t)$ be a standard Brownian motion and consider the process
\[
V(t) = e^{-t} W(e^{2t}).
\]
Then $V(t)$ is a Gaussian stationary process with mean $0$ and correlation function
\[
R(t) = e^{-|t|}. \tag{3.19}
\]
For the proof of this result we first need to show that time changed Gaussian processes are also Gaussian.

Lemma 3.4.9. Let $X(t)$ be a Gaussian stochastic process and let $Y(t) = X(f(t))$, where $f(t)$ is a strictly increasing function. Then $Y(t)$ is also a Gaussian process.

Proof. We need to show that, for all positive integers $N$ and all sequences of times $\{t_1, t_2, \dots, t_N\}$, the random vector
\[
\{Y(t_1), Y(t_2), \dots, Y(t_N)\} \tag{3.20}
\]
is a multivariate Gaussian random variable. Since $f(t)$ is strictly increasing, it is invertible and hence there exist $s_i$, $i = 1, \dots, N$, such that $s_i = f^{-1}(t_i)$. Thus, the random vector (3.20) can be rewritten as
\[
\{X(s_1), X(s_2), \dots, X(s_N)\},
\]
which is Gaussian for all $N$ and all choices of times $s_1, s_2, \dots, s_N$. Hence $Y(t)$ is also Gaussian.
Proof of Lemma 3.4.8. The fact that $V(t)$ is mean zero follows immediately from the fact that $W(t)$ is mean zero. To show that the correlation function of $V(t)$ is given by (3.19), we calculate
\[
\mathbb{E}(V(t)V(s)) = e^{-t-s}\,\mathbb{E}(W(e^{2t})W(e^{2s})) = e^{-t-s}\min(e^{2t}, e^{2s}) = e^{-|t-s|}.
\]
The Gaussianity of the process $V(t)$ follows from Lemma 3.4.9 (notice that the transformation that gives $V(t)$ in terms of $W(t)$ is invertible and we can write $W(s) = s^{1/2} V(\frac{1}{2}\ln(s))$).
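An illustrative numerical check (not part of the text) of the stationary covariance $\mathbb{E}(V(t)V(s)) = e^{-|t-s|}$ uses the fact that $(W(e^{2s}), W(e^{2t}))$ can be sampled via independent Gaussian increments; the chosen times are arbitrary:

```python
import numpy as np

# Sketch: check E[V(t)V(s)] = exp(-|t-s|) for V(t) = e^{-t} W(e^{2t}).
rng = np.random.default_rng(3)
t, s, n_samples = 1.5, 0.7, 200_000

# Sample W at the time points e^{2s} < e^{2t} via independent increments.
W_s = rng.normal(0.0, np.sqrt(np.exp(2 * s)), n_samples)
W_t = W_s + rng.normal(0.0, np.sqrt(np.exp(2 * t) - np.exp(2 * s)), n_samples)

V_t, V_s = np.exp(-t) * W_t, np.exp(-s) * W_s
print(np.mean(V_t * V_s), np.exp(-abs(t - s)))   # the two should be close
```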
3.5 Other Examples of Stochastic Processes

3.5.1 Brownian Bridge

Let $W(t)$ be a standard one dimensional Brownian motion. We define the Brownian bridge (from $0$ to $0$) to be the process
\[
B_t = W_t - tW_1, \quad t \in [0, 1]. \tag{3.21}
\]
Notice that $B_0 = B_1 = 0$. Equivalently, we can define the Brownian bridge to be the continuous Gaussian process $\{B_t : 0 \leq t \leq 1\}$ such that
\[
\mathbb{E}B_t = 0, \qquad \mathbb{E}(B_t B_s) = \min(s, t) - st, \quad s, t \in [0, 1]. \tag{3.22}
\]
Another, equivalent definition of the Brownian bridge is through an appropriate time change of the Brownian motion:
\[
B_t = (1-t)\, W\!\left( \frac{t}{1-t} \right), \quad t \in [0, 1). \tag{3.23}
\]
Conversely, we can write the Brownian motion as a time change of the Brownian bridge:
\[
W_t = (t+1)\, B\!\left( \frac{t}{1+t} \right), \quad t \geq 0.
\]
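As an illustrative sketch (not from the text), Brownian bridge paths can be built from Brownian paths via (3.21), and the covariance (3.22) checked at a pair of grid times; the grid and sample sizes are arbitrary:

```python
import numpy as np

# Sketch: B_t = W_t - t*W_1 on [0, 1]; check E[B_t B_s] = min(s, t) - s*t.
rng = np.random.default_rng(4)
n_paths, n_steps = 20_000, 100
dt = 1.0 / n_steps
t_grid = np.linspace(0.0, 1.0, n_steps + 1)

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)
B = W - t_grid * W[:, -1:]                  # B_t = W_t - t W_1, every path

i, j = 30, 70                               # t = 0.3, s = 0.7
emp = np.mean(B[:, i] * B[:, j])
exact = min(t_grid[i], t_grid[j]) - t_grid[i] * t_grid[j]
print(emp, exact)                           # exact = 0.3 - 0.21 = 0.09
```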
3.5.2 Fractional Brownian Motion

Definition 3.5.1. A (normalized) fractional Brownian motion $W^H_t$, $t \geq 0$, with Hurst parameter $H \in (0, 1)$ is a centered Gaussian process with continuous sample paths whose covariance is given by
\[
\mathbb{E}(W^H_t W^H_s) = \frac{1}{2}\bigl( s^{2H} + t^{2H} - |t-s|^{2H} \bigr). \tag{3.24}
\]
Proposition 3.5.2. Fractional Brownian motion has the following properties.

i. When $H = \frac{1}{2}$, $W^{1/2}_t$ becomes the standard Brownian motion.

ii. $W^H_0 = 0$, $\mathbb{E}W^H_t = 0$, $\mathbb{E}(W^H_t)^2 = |t|^{2H}$, $t \geq 0$.

iii. It has stationary increments, $\mathbb{E}(W^H_t - W^H_s)^2 = |t-s|^{2H}$.

iv. It has the following self similarity property
\[
(W^H_{\alpha t},\ t \geq 0) = (\alpha^H W^H_t,\ t \geq 0), \quad \alpha > 0, \tag{3.25}
\]
where the equivalence is in law.

Proof. See Exercise 19.
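Since fractional Brownian motion is Gaussian with the explicit covariance (3.24), its paths can be sampled on a grid by a Cholesky factorization of the covariance matrix; the following sketch (illustrative, not from the text, with arbitrary grid and a small numerical jitter for the factorization) checks property (ii) at the final time:

```python
import numpy as np

# Sketch: sample fBm on a grid via Cholesky of C[i,j] from (3.24),
# C[i, j] = (t_i^{2H} + t_j^{2H} - |t_i - t_j|^{2H}) / 2.
rng = np.random.default_rng(5)
H, n = 0.75, 200
t = np.linspace(1e-6, 1.0, n)              # avoid t = 0 (degenerate row)

C = 0.5 * (t[:, None]**(2 * H) + t[None, :]**(2 * H)
           - np.abs(t[:, None] - t[None, :])**(2 * H))
L = np.linalg.cholesky(C + 1e-9 * np.eye(n))   # jitter for numerical safety
paths = rng.normal(size=(5_000, n)) @ L.T      # each row is one fBm path

# Check property (ii): E[(W^H_t)^2] = t^{2H} at the final time.
print(paths[:, -1].var(), t[-1]**(2 * H))
```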
3.5.3 The Poisson Process

Another fundamental continuous time process is the Poisson process:

Definition 3.5.3. The Poisson process with intensity $\lambda$, denoted by $N(t)$, is an integer-valued, continuous time, stochastic process with independent increments satisfying
\[
\mathbb{P}\bigl[ (N(t) - N(s)) = k \bigr] = \frac{e^{-\lambda(t-s)} \bigl( \lambda(t-s) \bigr)^k}{k!}, \qquad t > s \geq 0,\ k \in \mathbb{N}.
\]

The Poisson process does not have a continuous modification. See Exercise 20.
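As an illustrative sketch (not part of the text), $N(t)$ can be sampled by summing i.i.d. $\mathrm{Exp}(\lambda)$ interarrival times, a standard construction consistent with the definition above; the helper name `poisson_count` is ours:

```python
import numpy as np

# Sketch: sample N(t_final) via exponential interarrival times, then check that
# its mean and variance both equal lambda * t_final (Poisson distribution).
rng = np.random.default_rng(6)
lam, t_final, n_paths = 2.0, 5.0, 20_000

def poisson_count(lam, t_final, rng):
    total, count = 0.0, 0
    while True:
        total += rng.exponential(1.0 / lam)   # next interarrival time
        if total > t_final:
            return count
        count += 1

counts = np.array([poisson_count(lam, t_final, rng) for _ in range(n_paths)])
print(counts.mean(), counts.var(), lam * t_final)   # all three close to 10
```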
3.6 The Karhunen-Loève Expansion

Let $f \in L^2(\Omega)$ where $\Omega$ is a subset of $\mathbb{R}^d$ and let $\{e_n\}_{n=1}^\infty$ be an orthonormal basis in $L^2(\Omega)$. Then, it is well known that $f$ can be written as a series expansion:
\[
f = \sum_{n=1}^\infty f_n e_n,
\]
where
\[
f_n = \int_\Omega f(x) e_n(x)\,dx.
\]
The convergence is in $L^2(\Omega)$:
\[
\lim_{N\to\infty} \left\| f(x) - \sum_{n=1}^N f_n e_n(x) \right\|_{L^2(\Omega)} = 0.
\]
It turns out that we can obtain a similar expansion for an $L^2$ mean zero process which is continuous in the $L^2$ sense:
\[
\mathbb{E}X_t^2 < +\infty, \qquad \mathbb{E}X_t = 0, \qquad \lim_{h\to 0} \mathbb{E}|X_{t+h} - X_t|^2 = 0. \tag{3.26}
\]
For simplicity we will take $T = [0, 1]$. Let $R(t, s) = \mathbb{E}(X_t X_s)$ be the autocorrelation function. Notice that from (3.26) it follows that $R(t, s)$ is continuous in both $t$ and $s$ (Exercise 21).

Let us assume an expansion of the form
\[
X_t(\omega) = \sum_{n=1}^\infty \xi_n(\omega) e_n(t), \quad t \in [0, 1], \tag{3.27}
\]
where $\{e_n\}_{n=1}^\infty$ is an orthonormal basis in $L^2(0, 1)$. The random variables $\xi_n$ are calculated as
\[
\int_0^1 X_t e_k(t)\,dt = \int_0^1 \sum_{n=1}^\infty \xi_n e_n(t) e_k(t)\,dt = \sum_{n=1}^\infty \xi_n \delta_{nk} = \xi_k,
\]
where we assumed that we can interchange the summation and integration. We will assume that these random variables are orthogonal:
\[
\mathbb{E}(\xi_n \xi_m) = \lambda_n \delta_{nm},
\]
where $\{\lambda_n\}_{n=1}^\infty$ are positive numbers that will be determined later.
Assuming that an expansion of the form (3.27) exists, we can calculate
\begin{align*}
R(t, s) = \mathbb{E}(X_t X_s) &= \mathbb{E}\left( \sum_{k=1}^\infty \sum_{\ell=1}^\infty \xi_k e_k(t)\, \xi_\ell e_\ell(s) \right) \\
&= \sum_{k=1}^\infty \sum_{\ell=1}^\infty \mathbb{E}(\xi_k \xi_\ell)\, e_k(t) e_\ell(s) \\
&= \sum_{k=1}^\infty \lambda_k e_k(t) e_k(s).
\end{align*}
Consequently, in order for the expansion (3.27) to be valid we need
\[
R(t, s) = \sum_{k=1}^\infty \lambda_k e_k(t) e_k(s). \tag{3.28}
\]
From equation (3.28) it follows that
\begin{align*}
\int_0^1 R(t, s) e_n(s)\,ds &= \int_0^1 \sum_{k=1}^\infty \lambda_k e_k(t) e_k(s) e_n(s)\,ds \\
&= \sum_{k=1}^\infty \lambda_k e_k(t) \int_0^1 e_k(s) e_n(s)\,ds \\
&= \sum_{k=1}^\infty \lambda_k e_k(t) \delta_{kn} = \lambda_n e_n(t).
\end{align*}
Hence, in order for the expansion (3.27) to be valid, $\{\lambda_n, e_n(t)\}_{n=1}^\infty$ have to be the eigenvalues and eigenfunctions of the integral operator whose kernel is the correlation function of $X_t$:
\[
\int_0^1 R(t, s) e_n(s)\,ds = \lambda_n e_n(t). \tag{3.29}
\]
Hence, in order to prove the expansion (3.27) we need to study the eigenvalue problem for the integral operator $\mathcal{R} : L^2[0,1] \to L^2[0,1]$. It is easy to check that this operator is self-adjoint ($(\mathcal{R}f, h) = (f, \mathcal{R}h)$ for all $f, h \in L^2(0,1)$) and nonnegative ($(\mathcal{R}f, f) \geq 0$ for all $f \in L^2(0,1)$). Hence, all its eigenvalues are real and nonnegative. Furthermore, it is a compact operator (if $\{\phi_n\}_{n=1}^\infty$ is a bounded sequence in $L^2(0,1)$, then $\{\mathcal{R}\phi_n\}_{n=1}^\infty$ has a convergent subsequence). The spectral theorem for compact, self-adjoint operators implies that $\mathcal{R}$ has a countable sequence of eigenvalues tending to $0$. Furthermore, for every $f \in L^2(0,1)$ we can write
\[
f = f_0 + \sum_{n=1}^\infty f_n e_n(t),
\]
where $\mathcal{R}f_0 = 0$, $\{e_n(t)\}$ are the eigenfunctions of $\mathcal{R}$ corresponding to nonzero eigenvalues and the convergence is in $L^2$. Finally, Mercer's Theorem states that for $R(t, s)$ continuous on $[0,1]\times[0,1]$, the expansion (3.28) is valid, where the series converges absolutely and uniformly.

Now we are ready to prove (3.27).
Theorem 3.6.1 (Karhunen-Loève). Let $\{X_t, t \in [0,1]\}$ be an $L^2$ process with zero mean and continuous correlation function $R(t, s)$. Let $\{\lambda_n, e_n(t)\}_{n=1}^\infty$ be the eigenvalues and eigenfunctions of the operator $\mathcal{R}$ defined in (3.35). Then
\[
X_t = \sum_{n=1}^\infty \xi_n e_n(t), \quad t \in [0, 1], \tag{3.30}
\]
where
\[
\xi_n = \int_0^1 X_t e_n(t)\,dt, \qquad \mathbb{E}\xi_n = 0, \qquad \mathbb{E}(\xi_n \xi_m) = \lambda_n \delta_{nm}. \tag{3.31}
\]
The series converges in $L^2$ to $X(t)$, uniformly in $t$.
Proof. The fact that $\mathbb{E}\xi_n = 0$ follows from the fact that $X_t$ is mean zero. The orthogonality of the random variables $\{\xi_n\}_{n=1}^\infty$ follows from the orthogonality of the eigenfunctions of $\mathcal{R}$:
\begin{align*}
\mathbb{E}(\xi_n \xi_m) &= \mathbb{E}\int_0^1 \int_0^1 X_t X_s e_n(t) e_m(s)\,dt\,ds \\
&= \int_0^1 \int_0^1 R(t, s) e_n(t) e_m(s)\,ds\,dt \\
&= \lambda_n \int_0^1 e_n(s) e_m(s)\,ds \\
&= \lambda_n \delta_{nm}.
\end{align*}
Consider now the partial sum $S_N = \sum_{n=1}^N \xi_n e_n(t)$.
\begin{align*}
\mathbb{E}|X_t - S_N|^2 &= \mathbb{E}X_t^2 + \mathbb{E}S_N^2 - 2\,\mathbb{E}(X_t S_N) \\
&= R(t, t) + \mathbb{E}\sum_{k,\ell=1}^N \xi_k \xi_\ell e_k(t) e_\ell(t) - 2\,\mathbb{E}\left( X_t \sum_{n=1}^N \xi_n e_n(t) \right) \\
&= R(t, t) + \sum_{k=1}^N \lambda_k |e_k(t)|^2 - 2\,\mathbb{E}\sum_{k=1}^N \int_0^1 X_t X_s e_k(s) e_k(t)\,ds \\
&= R(t, t) - \sum_{k=1}^N \lambda_k |e_k(t)|^2 \to 0,
\end{align*}
by Mercer's theorem.
Remark 3.6.2. Let $X_t$ be a Gaussian second order process with continuous covariance $R(t, s)$. Then the random variables $\{\xi_k\}_{k=1}^\infty$ are Gaussian, since they are defined through the time integral of a Gaussian process. Furthermore, since they are Gaussian and orthogonal, they are also independent. Hence, for Gaussian processes the Karhunen-Loève expansion becomes:
\[
X_t = \sum_{k=1}^{+\infty} \sqrt{\lambda_k}\, \xi_k e_k(t), \tag{3.32}
\]
where $\{\xi_k\}_{k=1}^\infty$ are independent $\mathcal{N}(0, 1)$ random variables.
Example 3.6.3 (The Karhunen-Loève Expansion for Brownian Motion). The correlation function of Brownian motion is $R(t, s) = \min(t, s)$. The eigenvalue problem $\mathcal{R}\psi_n = \lambda_n \psi_n$ becomes
\[
\int_0^1 \min(t, s)\psi_n(s)\,ds = \lambda_n \psi_n(t).
\]
Let us assume that $\lambda_n > 0$ (it is easy to check that $0$ is not an eigenvalue). Upon setting $t = 0$ we obtain $\psi_n(0) = 0$. The eigenvalue problem can be rewritten in the form
\[
\int_0^t s\,\psi_n(s)\,ds + t \int_t^1 \psi_n(s)\,ds = \lambda_n \psi_n(t).
\]
We differentiate this equation once:
\[
\int_t^1 \psi_n(s)\,ds = \lambda_n \psi_n'(t).
\]
We set $t = 1$ in this equation to obtain the second boundary condition $\psi_n'(1) = 0$. A second differentiation yields
\[
-\psi_n(t) = \lambda_n \psi_n''(t),
\]
where primes denote differentiation with respect to $t$. Thus, in order to calculate the eigenvalues and eigenfunctions of the integral operator whose kernel is the covariance function of Brownian motion, we need to solve the Sturm-Liouville problem
\[
-\psi_n(t) = \lambda_n \psi_n''(t), \qquad \psi_n(0) = \psi_n'(1) = 0.
\]
It is easy to check that the eigenvalues and (normalized) eigenfunctions are
\[
\psi_n(t) = \sqrt{2}\sin\left( \frac{1}{2}(2n-1)\pi t \right), \qquad \lambda_n = \left( \frac{2}{(2n-1)\pi} \right)^2.
\]
Thus, the Karhunen-Loève expansion of Brownian motion on $[0, 1]$ is
\[
W_t = \sqrt{2} \sum_{n=1}^\infty \xi_n\, \frac{2}{(2n-1)\pi} \sin\left( \frac{1}{2}(2n-1)\pi t \right). \tag{3.33}
\]
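As an illustrative sketch (not part of the text, and in the spirit of Exercise 29), the truncated expansion (3.33) with i.i.d. $\mathcal{N}(0,1)$ coefficients generates approximate Brownian paths, and $\mathrm{Var}(W_1)$ approaches $1$ as the number of retained terms grows:

```python
import numpy as np

# Sketch: generate Brownian paths from the truncated Karhunen-Loeve expansion
# (3.33), W_t = sqrt(2) * sum_n xi_n * [2/((2n-1)pi)] * sin((2n-1)*pi*t/2).
rng = np.random.default_rng(7)
n_terms, n_paths = 500, 10_000
t = np.linspace(0.0, 1.0, 101)

n = np.arange(1, n_terms + 1)
coeff = 2.0 / ((2 * n - 1) * np.pi)                      # sqrt(lambda_n)
phi = np.sqrt(2.0) * np.sin(0.5 * (2 * n[:, None] - 1) * np.pi * t[None, :])
xi = rng.normal(size=(n_paths, n_terms))                 # i.i.d. N(0, 1)
W = (xi * coeff) @ phi                                   # shape (n_paths, len(t))

print(W[:, -1].var())    # should approach Var(W_1) = 1 for large n_terms
```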
We can use the KL expansion in order to study the $L^2$ regularity of stochastic processes. First, let $\mathcal{R}$ be a compact, symmetric positive definite operator on $L^2(0,1)$ with eigenvalues and normalized eigenfunctions $\{\lambda_k, e_k(x)\}_{k=1}^{+\infty}$ and consider a function $f \in L^2(0,1)$ with $\int_0^1 f(s)\,ds = 0$. We can define the one parameter family of Hilbert spaces $H^\alpha$ through the norm
\[
\|f\|_\alpha^2 = \|\mathcal{R}^{-\alpha} f\|_{L^2}^2 = \sum_k |f_k|^2 \lambda_k^{-\alpha}.
\]
The inner product can be obtained through polarization. This norm enables us to measure the regularity of the function $f(t)$.³ Let $X_t$ be a mean zero second order (i.e. with finite second moment) process with continuous autocorrelation function. Define the space $\mathcal{H}^\alpha := L^2((\Omega, \mathbb{P}), H^\alpha(0,1))$ with (semi)norm
\[
\|X_t\|_\alpha^2 = \mathbb{E}\|X_t\|_{H^\alpha}^2 = \sum_k |\lambda_k|^{1-\alpha}. \tag{3.34}
\]
Notice that the regularity of the stochastic process $X_t$ depends on the decay of the eigenvalues of the integral operator $\mathcal{R}\cdot := \int_0^1 R(t,s)\,\cdot\,ds$.

As an example, consider the $L^2$ regularity of Brownian motion. From Example 3.6.3 we know that $\lambda_k \sim k^{-2}$. Consequently, from (3.34) we get that, in order for $W_t$ to be an element of the space $\mathcal{H}^\alpha$, we need that
\[
\sum_k k^{-2(1-\alpha)} < +\infty,
\]
from which we obtain that $\alpha < 1/2$. This is consistent with the Hölder continuity of Brownian motion from Theorem 3.4.6.⁴
3.7 Discussion and Bibliography

The Ornstein-Uhlenbeck process was introduced by Ornstein and Uhlenbeck in 1930 as a model for the velocity of a Brownian particle [93].

The kind of analysis presented in Section 3.3.3 was initiated by G.I. Taylor in [91]. The proof of Bochner's theorem 3.3.12 can be found in [50], where additional material on stationary processes can be found. See also [46].
³Think of $\mathcal{R}$ as being the inverse of the Laplacian with periodic boundary conditions. In this case $H^\alpha$ coincides with the standard fractional Sobolev space.

⁴Notice, however, that Wiener's theorem refers to a.s. Hölder continuity, whereas the calculation presented in this section is about $L^2$ continuity.
The spectral theorem for compact, self-adjoint operators which was needed in the proof of the Karhunen-Loève theorem can be found in [81]. The Karhunen-Loève expansion is also valid for random fields. See [88] and the references therein.
3.8 Exercises
1. Let $Y_0, Y_1, \dots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$.

(a) Show that $X_n$ is a strictly stationary process.

(b) Assume that $\mathbb{E}Y_0 = \mu < +\infty$ and $\mathbb{E}Y_0^2 = \sigma^2 < +\infty$. Show that
\[
\lim_{N\to+\infty} \mathbb{E}\left| \frac{1}{N}\sum_{j=0}^{N-1} X_j - \mu \right| = 0.
\]

(c) Let $f$ be such that $\mathbb{E}f^2(Y_0) < +\infty$. Show that
\[
\lim_{N\to+\infty} \mathbb{E}\left| \frac{1}{N}\sum_{j=0}^{N-1} f(X_j) - \mathbb{E}f(Y_0) \right| = 0.
\]
2. Let $Z$ be a random variable and define the stochastic process $X_n = Z$, $n = 0, 1, 2, \dots$. Show that $X_n$ is a strictly stationary process.
3. Let $A_0, A_1, \dots, A_m$ and $B_0, B_1, \dots, B_m$ be uncorrelated random variables with mean zero and variances $\mathbb{E}A_i^2 = \sigma_i^2$, $\mathbb{E}B_i^2 = \sigma_i^2$, $i = 1, \dots, m$. Let $\omega_0, \omega_1, \dots, \omega_m \in [0, \pi]$ be distinct frequencies and define, for $n = 0, \pm 1, \pm 2, \dots$, the stochastic process
\[
X_n = \sum_{k=0}^m \bigl( A_k \cos(n\omega_k) + B_k \sin(n\omega_k) \bigr).
\]
Calculate the mean and the covariance of $X_n$. Show that it is a weakly stationary process.
4. Let $\{\xi_n : n = 0, \pm 1, \pm 2, \dots\}$ be uncorrelated random variables with $\mathbb{E}\xi_n = \mu$, $\mathbb{E}(\xi_n - \mu)^2 = \sigma^2$, $n = 0, \pm 1, \pm 2, \dots$. Let $a_1, a_2, \dots$ be arbitrary real numbers and consider the stochastic process
\[
X_n = a_1 \xi_n + a_2 \xi_{n-1} + \dots + a_m \xi_{n-m+1}.
\]

(a) Calculate the mean, variance and the covariance function of $X_n$. Show that it is a weakly stationary process.

(b) Set $a_k = 1/\sqrt{m}$ for $k = 1, \dots, m$. Calculate the covariance function and study the cases $m = 1$ and $m \to +\infty$.
5. Let $W(t)$ be a standard one dimensional Brownian motion. Calculate the following expectations.

(a) $\mathbb{E}e^{iW(t)}$.

(b) $\mathbb{E}e^{i(W(t)+W(s))}$, $t, s \in (0, +\infty)$.

(c) $\mathbb{E}\bigl( \sum_{i=1}^n c_i W(t_i) \bigr)^2$, where $c_i \in \mathbb{R}$, $i = 1, \dots, n$ and $t_i \in (0, +\infty)$, $i = 1, \dots, n$.

(d) $\mathbb{E}e^{i\sum_{i=1}^n c_i W(t_i)}$, where $c_i \in \mathbb{R}$, $i = 1, \dots, n$ and $t_i \in (0, +\infty)$, $i = 1, \dots, n$.
6. Let $W_t$ be a standard one dimensional Brownian motion and define
\[
B_t = W_t - tW_1, \quad t \in [0, 1].
\]

(a) Show that $B_t$ is a Gaussian process with
\[
\mathbb{E}B_t = 0, \qquad \mathbb{E}(B_t B_s) = \min(t, s) - ts.
\]

(b) Show that, for $t \in [0, 1)$, an equivalent definition of $B_t$ is through the formula
\[
B_t = (1-t)\, W\!\left( \frac{t}{1-t} \right).
\]

(c) Calculate the distribution function of $B_t$.
7. Let $X_t$ be a mean-zero second order stationary process with autocorrelation function
\[
R(t) = \sum_{j=1}^N \frac{\lambda_j^2}{\alpha_j} e^{-\alpha_j |t|},
\]
where $\{\alpha_j, \lambda_j\}_{j=1}^N$ are positive real numbers.

(a) Calculate the spectral density and the correlation time of this process.

(b) Show that the assumptions of Theorem 3.3.17 are satisfied and use the argument presented in Section 3.3.3 (i.e. the Green-Kubo formula) to calculate the diffusion coefficient of the process $Z_t = \int_0^t X_s\,ds$.

(c) Under what assumptions on the coefficients $\{\alpha_j, \lambda_j\}_{j=1}^N$ can you study the above questions in the limit $N \to +\infty$?
8. Prove Lemma 3.10.
9. Let $a_1, \dots, a_n$ and $s_1, \dots, s_n$ be positive real numbers. Calculate the mean and variance of the random variable
\[
X = \sum_{i=1}^n a_i W(s_i).
\]
10. Let $W(t)$ be the standard one-dimensional Brownian motion and let $\sigma, s_1, s_2 > 0$. Calculate

(a) $\mathbb{E}e^{\sigma W(t)}$.

(b) $\mathbb{E}\bigl( \sin(\sigma W(s_1)) \sin(\sigma W(s_2)) \bigr)$.
11. Let $W_t$ be a one dimensional Brownian motion and let $\mu, \sigma > 0$ and define
\[
S_t = e^{t\mu + \sigma W_t}.
\]

(a) Calculate the mean and the variance of $S_t$.

(b) Calculate the probability density function of $S_t$.
12. Use Theorem 3.4.4 to prove Lemma 3.4.3.
13. Prove Theorem 3.4.7.
14. Use Lemma 3.4.8 to calculate the distribution function of the stationary Ornstein-Uhlenbeck process.
15. Calculate the mean and the correlation function of the integral of a standard Brownian motion
\[
Y_t = \int_0^t W_s\,ds.
\]
16. Show that the process
\[
Y_t = \int_t^{t+1} (W_s - W_t)\,ds, \quad t \in \mathbb{R},
\]
is second order stationary.
17. Let $V_t = e^{-t} W(e^{2t})$ be the stationary Ornstein-Uhlenbeck process. Give the definition and study the main properties of the Ornstein-Uhlenbeck bridge.
18. The autocorrelation function of the velocity $Y(t)$ of a Brownian particle moving in a harmonic potential $V(x) = \frac{1}{2}\omega_0^2 x^2$ is
\[
R(t) = e^{-\gamma|t|} \left( \cos(\delta|t|) - \frac{1}{\delta}\sin(\delta|t|) \right),
\]
where $\gamma$ is the friction coefficient and $\delta = \sqrt{\omega_0^2 - \gamma^2}$.

(a) Calculate the spectral density of $Y(t)$.

(b) Calculate the mean square displacement $\mathbb{E}(X(t))^2$ of the position of the Brownian particle $X(t) = \int_0^t Y(s)\,ds$. Study the limit $t \to +\infty$.
19. Show the scaling property (3.25) of the fractional Brownian motion.
20. Use Theorem 3.4.4 to show that there does not exist a continuous modification of the Poisson process.
21. Show that the correlation function of a process $X_t$ satisfying (3.26) is continuous in both $t$ and $s$.
22. Let $X_t$ be a stochastic process satisfying (3.26) and $R(t, s)$ its correlation function. Show that the integral operator $\mathcal{R} : L^2[0,1] \to L^2[0,1]$,
\[
\mathcal{R}f := \int_0^1 R(t, s) f(s)\,ds, \tag{3.35}
\]
is self-adjoint and nonnegative. Show that all of its eigenvalues are real and nonnegative. Show that eigenfunctions corresponding to different eigenvalues are orthogonal.
23. Let $H$ be a Hilbert space. An operator $\mathcal{R} : H \to H$ is said to be Hilbert-Schmidt if there exists a complete orthonormal sequence $\{\phi_n\}_{n=1}^\infty$ in $H$ such that
\[
\sum_{n=1}^\infty \|\mathcal{R}\phi_n\|^2 < \infty.
\]
Let $\mathcal{R} : L^2[0,1] \to L^2[0,1]$ be the operator defined in (3.35) with $R(t, s)$ being continuous both in $t$ and $s$. Show that it is a Hilbert-Schmidt operator.
24. Let $X_t$ be a mean zero second order stationary process defined in the interval $[0, T]$ with continuous covariance $R(t)$ and let $\{\lambda_n\}_{n=1}^{+\infty}$ be the eigenvalues of the covariance operator. Show that
\[
\sum_{n=1}^\infty \lambda_n = T\, R(0).
\]
25. Calculate the Karhunen-Loève expansion for a second order stochastic process with correlation function $R(t, s) = ts$.

26. Calculate the Karhunen-Loève expansion of the Brownian bridge on $[0, 1]$.
27. Let $X_t$, $t \in [0, T]$ be a second order process with continuous covariance and Karhunen-Loève expansion
\[
X_t = \sum_{k=1}^\infty \xi_k e_k(t).
\]
Define the process
\[
Y(t) = f(t) X_{\tau(t)}, \quad t \in [0, S],
\]
where $f(t)$ is a continuous function and $\tau(t)$ a continuous, nondecreasing function with $\tau(0) = 0$, $\tau(S) = T$. Find the Karhunen-Loève expansion of $Y(t)$, in an appropriate weighted $L^2$ space, in terms of the KL expansion of $X_t$. Use this in order to calculate the KL expansion of the Ornstein-Uhlenbeck process.
28. Calculate the Karhunen-Loève expansion of a centered Gaussian stochastic process with covariance function $R(s, t) = \cos(2\pi(t-s))$.
29. Use the Karhunen-Loève expansion to generate paths of the

(a) Brownian motion on $[0, 1]$.

(b) Brownian bridge on $[0, 1]$.

(c) Ornstein-Uhlenbeck process on $[0, 1]$.

Study computationally the convergence of the KL expansion for these processes. How many terms do you need to keep in the KL expansion in order to calculate accurate statistics of these processes?
Chapter 4
Markov Processes
4.1 Introduction
In this chapter we will study some of the basic properties of Markov stochastic processes. In Section 4.2 we present various examples of Markov processes, in discrete and continuous time. In Section 4.3 we give the precise definition of a Markov process. In Section 4.4 we derive the Chapman-Kolmogorov equation, the fundamental equation in the theory of Markov processes. In Section 4.5 we introduce the concept of the generator of a Markov process. In Section 4.6 we study ergodic Markov processes. Discussion and bibliographical remarks are presented in Section 4.7 and exercises can be found in Section 4.8.
4.2 Examples
Roughly speaking, a Markov process is a stochastic process that retains no memory of where it has been in the past: only the current state of a Markov process can influence where it will go next. A bit more precisely: a Markov process is a stochastic process for which, given the present, the past and the future are statistically independent.

Perhaps the simplest example of a Markov process is that of a random walk in one dimension. We defined the one dimensional random walk as the sum of independent, mean zero and variance $1$ random variables $\xi_i$, $i = 1, \dots$:
\[
X_N = \sum_{n=1}^N \xi_n, \qquad X_0 = 0.
\]
Let $i_1, i_2, \dots$ be a sequence of integers. Then, for all integers $n$ and $m$ we have that¹
\[
\mathbb{P}(X_{n+m} = i_{n+m} \mid X_1 = i_1, \dots, X_n = i_n) = \mathbb{P}(X_{n+m} = i_{n+m} \mid X_n = i_n). \tag{4.1}
\]
In words, the probability that the random walk will be at $i_{n+m}$ at time $n+m$ depends only on its current value (at time $n$) and not on how it got there.

The random walk is an example of a discrete time Markov chain:

Definition 4.2.1. A stochastic process $\{S_n;\ n \in \mathbb{N}\}$ with state space $S = \mathbb{Z}$ is called a discrete time Markov chain provided that the Markov property (4.1) is satisfied.
Consider now a continuous-time stochastic process $X_t$ with state space $S = \mathbb{Z}$ and denote by $\{X_s,\ s \leq t\}$ the collection of values of the stochastic process up to time $t$. We will say that $X_t$ is a Markov process provided that
\[
\mathbb{P}(X_{t+h} = i_{t+h} \mid \{X_s,\ s \leq t\}) = \mathbb{P}(X_{t+h} = i_{t+h} \mid X_t = i_t), \tag{4.2}
\]
for all $h \geq 0$. A continuous-time, discrete state space Markov process is called a continuous-time Markov chain.

Example 4.2.2. The Poisson process is a continuous-time Markov chain with
\[
\mathbb{P}(N_{t+h} = j \mid N_t = i) =
\begin{cases}
0 & \text{if } j < i, \\
\dfrac{e^{-\lambda h}(\lambda h)^{j-i}}{(j-i)!} & \text{if } j \geq i.
\end{cases}
\]
Similarly, we can define a continuous-time Markov process whose state space is $\mathbb{R}$. In this case, the above definitions become
\[
\mathbb{P}(X_{t+h} \in \Gamma \mid \{X_s,\ s \leq t\}) = \mathbb{P}(X_{t+h} \in \Gamma \mid X_t = x) \tag{4.3}
\]
for all Borel sets $\Gamma$.

Example 4.2.3. The Brownian motion is a Markov process with conditional probability density
\[
p(y, t \mid x, s) := p(W_t = y \mid W_s = x) = \frac{1}{\sqrt{2\pi(t-s)}} \exp\left( -\frac{|x-y|^2}{2(t-s)} \right). \tag{4.4}
\]
¹In fact, it is sufficient to take $m = 1$ in (4.1). See Exercise 1.
Example 4.2.4. The Ornstein-Uhlenbeck process $V_t = e^{-t} W(e^{2t})$ is a Markov process with conditional probability density
\[
p(y, t \mid x, s) := p(V_t = y \mid V_s = x) = \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} \exp\left( -\frac{|y - xe^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})} \right). \tag{4.5}
\]
To prove (4.5) we use the formula for the distribution function of the Brownian motion to calculate, for $t > s$,
\begin{align*}
\mathbb{P}(V_t \leq y \mid V_s = x) &= \mathbb{P}(e^{-t}W(e^{2t}) \leq y \mid e^{-s}W(e^{2s}) = x) \\
&= \mathbb{P}(W(e^{2t}) \leq e^t y \mid W(e^{2s}) = e^s x) \\
&= \int_{-\infty}^{e^t y} \frac{1}{\sqrt{2\pi(e^{2t} - e^{2s})}}\, e^{-\frac{|z - xe^s|^2}{2(e^{2t} - e^{2s})}}\,dz \\
&= \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi e^{2t}(1 - e^{-2(t-s)})}}\, e^{-\frac{|\rho e^t - xe^s|^2}{2e^{2t}(1 - e^{-2(t-s)})}}\, e^t\,d\rho \\
&= \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}}\, e^{-\frac{|\rho - xe^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})}}\,d\rho.
\end{align*}
Consequently, the transition probability density for the OU process is given by the formula
\[
p(y, t \mid x, s) = \frac{\partial}{\partial y}\mathbb{P}(V_t \leq y \mid V_s = x)
= \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} \exp\left( -\frac{|y - xe^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})} \right).
\]
Markov stochastic processes appear in a variety of applications in physics, chemistry, biology and finance. In this and the next chapter we will develop various analytical tools for studying them. In particular, we will see that we can obtain an equation for the transition probability
\[
\mathbb{P}(X_{n+1} = i_{n+1} \mid X_n = i_n), \qquad \mathbb{P}(X_{t+h} = i_{t+h} \mid X_t = i_t), \qquad p(X_{t+h} = y \mid X_t = x), \tag{4.6}
\]
which will enable us to study the evolution of a Markov process. This equation will be called the Chapman-Kolmogorov equation.

We will be mostly concerned with time-homogeneous Markov processes, i.e. processes for which the conditional probabilities are invariant under time shifts. For time-homogeneous discrete time Markov chains we have
\[
\mathbb{P}(X_{n+1} = j \mid X_n = i) = \mathbb{P}(X_1 = j \mid X_0 = i) =: p_{ij}.
\]
We will refer to the matrix $P = \{p_{ij}\}$ as the transition matrix. It is easy to check that the transition matrix is a stochastic matrix, i.e. it has nonnegative entries and $\sum_j p_{ij} = 1$. Similarly, we can define the $n$-step transition matrix $P_n = \{p_{ij}(n)\}$ as
\[
p_{ij}(n) = \mathbb{P}(X_{m+n} = j \mid X_m = i).
\]
We can study the evolution of a Markov chain through the Chapman-Kolmogorov equation:
\[
p_{ij}(m+n) = \sum_k p_{ik}(m)\, p_{kj}(n). \tag{4.7}
\]
Indeed, let $\mu_i^{(n)} := \mathbb{P}(X_n = i)$. The (possibly infinite dimensional) vector $\mu^{(n)}$ determines the state of the Markov chain at time $n$. A simple consequence of the Chapman-Kolmogorov equation is that we can write an evolution equation for the vector $\mu^{(n)}$:
\[
\mu^{(n)} = \mu^{(0)} P^n, \tag{4.8}
\]
where $P^n$ denotes the $n$th power of the matrix $P$. Hence in order to calculate the state of the Markov chain at time $n$ all we need is the initial distribution $\mu^{(0)}$ and the transition matrix $P$. Componentwise, the above equation can be written as
\[
\mu_j^{(n)} = \sum_i \mu_i^{(0)}\, p_{ij}(n).
\]
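The evolution equation (4.8) and the Chapman-Kolmogorov equation (4.7) can be checked directly for a small chain; the following sketch (illustrative, with an arbitrary two-state transition matrix) does so:

```python
import numpy as np

# Sketch: mu^(n) = mu^(0) P^n (row vector times matrix power), and the
# Chapman-Kolmogorov equation P^(m+n) = P^m P^n for a two-state chain.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])          # stochastic matrix: rows sum to 1
mu0 = np.array([1.0, 0.0])          # start in state 0

mu5 = mu0 @ np.linalg.matrix_power(P, 5)
print(mu5, mu5.sum())               # a probability vector summing to 1

# Chapman-Kolmogorov: p_ij(2+3) = sum_k p_ik(2) p_kj(3).
lhs = np.linalg.matrix_power(P, 5)
rhs = np.linalg.matrix_power(P, 2) @ np.linalg.matrix_power(P, 3)
print(np.allclose(lhs, rhs))        # True
```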
Consider now a continuous time Markov chain with transition probability
\[
p_{ij}(s, t) = \mathbb{P}(X_t = j \mid X_s = i), \qquad s \leq t.
\]
If the chain is homogeneous, then
\[
p_{ij}(s, t) = p_{ij}(0, t-s) \quad \text{for all } i, j, s, t.
\]
In particular,
\[
p_{ij}(t) = \mathbb{P}(X_t = j \mid X_0 = i).
\]
The Chapman-Kolmogorov equation for a continuous time Markov chain is
\[
\frac{dp_{ij}}{dt} = \sum_k p_{ik}(t)\, g_{kj}, \tag{4.9}
\]
where the matrix $G$ is called the generator of the Markov chain. Equation (4.9) can also be written in matrix notation:
\[
\frac{dP}{dt} = P_t\, G.
\]
The generator of the Markov chain is defined as
\[
G = \lim_{h\to 0} \frac{1}{h}\bigl( P_h - I \bigr).
\]
Let now $\mu_t^i = \mathbb{P}(X_t = i)$. The vector $\mu_t$ is the distribution of the Markov chain at time $t$. We can study its evolution using the equation
\[
\mu_t = \mu_0 P_t.
\]
Thus, as in the case of discrete time Markov chains, the evolution of a continuous time Markov chain is completely determined by the initial distribution and the transition matrix.
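For a finite state space, the matrix equation $dP/dt = P_t G$ with $P_0 = I$ is solved by the matrix exponential $P_t = e^{tG}$; the following sketch (illustrative, with an arbitrary two-state generator, computing the exponential by eigendecomposition) verifies that $P_t$ is stochastic:

```python
import numpy as np

# Sketch: for a generator G (nonnegative off-diagonal entries, zero row sums),
# P_t = exp(t*G) solves dP/dt = P_t G with P_0 = I. Here exp is computed via
# an eigendecomposition of t*G (valid since this G is diagonalizable).
G = np.array([[-2.0,  2.0],
              [ 1.0, -1.0]])
t = 0.7
w, V = np.linalg.eig(t * G)
P_t = (V @ np.diag(np.exp(w)) @ np.linalg.inv(V)).real

print(P_t.sum(axis=1))              # each row sums to 1: P_t is stochastic
mu0 = np.array([0.3, 0.7])
print((mu0 @ P_t).sum())            # mu_t = mu_0 P_t is still a distribution
```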
Consider now the case of a continuous time Markov process with continuous state space and with continuous paths. As we have seen in Example 4.2.3, the Brownian motion is an example of such a process. It is a standard result in the theory of partial differential equations that the conditional probability density of the Brownian motion (4.4) is the fundamental solution of the diffusion equation:
\[
\frac{\partial p}{\partial t} = \frac{1}{2}\frac{\partial^2 p}{\partial y^2}, \qquad \lim_{t\to s} p(y, t \mid x, s) = \delta(y - x). \tag{4.10}
\]
Similarly, the conditional distribution of the OU process satisfies the initial value problem
\[
\frac{\partial p}{\partial t} = \frac{\partial (yp)}{\partial y} + \frac{1}{2}\frac{\partial^2 p}{\partial y^2}, \qquad \lim_{t\to s} p(y, t \mid x, s) = \delta(y - x). \tag{4.11}
\]
The Brownian motion and the OU process are examples of a diffusion process. A diffusion pro
cess is a continuous time Markov process with continuous paths. We will see in Chapter 5, that
the conditional probability density p(y, t[x, s) of a diffusion process satisﬁes the forward Kol
mogorov or FokkerPlanck equation
∂p
∂t
= −
∂
∂y
(a(y, t)p) +
1
2
∂
2
∂y
2
(b(y, t)p), lim
t→s
p(y, t[x, s) = δ(y −x). (4.12)
as well as the backward Kolmogorov equation
−
∂p
∂s
= a(x, s)
∂p
∂x
+
1
2
b(x, s)
∂
2
p
∂x
2
, lim
t→s
p(y, t[x, s) = δ(y −x). (4.13)
for appropriate functions a(y, t), b(y, t). Hence, a diffusion process is determined uniquely from
these two functions.
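A finite-difference sanity check of (4.10) (a sketch, not from the text): the Gaussian transition density of Brownian motion, $p(y, t \mid x, s) = (2\pi(t-s))^{-1/2} \exp\bigl(-(y-x)^2/2(t-s)\bigr)$, should satisfy the diffusion equation at any interior point. The evaluation point and step size below are arbitrary.

```python
import numpy as np

def p(y, t, x, s):
    """Brownian transition density p(y, t | x, s) for t > s (cf. (4.4))."""
    tau = t - s
    return np.exp(-(y - x) ** 2 / (2 * tau)) / np.sqrt(2 * np.pi * tau)

x, s, t, y, h = 0.3, 0.1, 0.9, 1.2, 1e-5
# dp/dt via a central difference; (1/2) d^2 p / dy^2 via a second difference.
dp_dt = (p(y, t + h, x, s) - p(y, t - h, x, s)) / (2 * h)
half_d2p = 0.5 * (p(y + h, t, x, s) - 2 * p(y, t, x, s) + p(y - h, t, x, s)) / h**2
residual = abs(dp_dt - half_d2p)
```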
4.3 Definition of a Markov Process

In Section 4.1 we gave the definition of a Markov process whose time is either discrete or continuous, and whose state space is the set of integers. We also gave several examples of Markov chains as well as of processes whose state space is the real line. In this section we give the precise definition of a Markov process with $t \in T$, a general index set, and $S = E$, an arbitrary metric space. We will use this formulation in the next section to derive the Chapman-Kolmogorov equation.
In order to state the definition of a continuous-time Markov process that takes values in a metric space we need to introduce various new concepts. For the definition of a Markov process we need to use the conditional expectation of the stochastic process conditioned on all past values. We can encode all past information about a stochastic process into an appropriate collection of σ-algebras. Our setting will be that we have a probability space $(\Omega, \mathcal{F}, P)$ and an ordered set $T$. Let $X = X_t(\omega)$ be a stochastic process from the sample space $(\Omega, \mathcal{F})$ to the state space $(E, \mathcal{B}(E))$, where $E$ is a metric space (we will usually take $E$ to be either $\mathbb{R}$ or $\mathbb{R}^d$). Remember that the stochastic process is a function of two variables, $t \in T$ and $\omega \in \Omega$.
We start with the definition of a σ-algebra generated by a collection of sets.

Definition 4.3.1. Let $\mathcal{A}$ be a collection of subsets of $\Omega$. The smallest σ-algebra on $\Omega$ which contains $\mathcal{A}$ is denoted by $\sigma(\mathcal{A})$ and is called the σ-algebra generated by $\mathcal{A}$.
Definition 4.3.2. Let $X_t : \Omega \to E$, $t \in T$. The smallest σ-algebra $\sigma(X_t, t \in T)$ such that the family of mappings $\{X_t, t \in T\}$ is a stochastic process with sample space $(\Omega, \sigma(X_t, t \in T))$ and state space $(E, \mathcal{B}(E))$ is called the σ-algebra generated by $\{X_t, t \in T\}$.
In other words, the σ-algebra generated by $X_t$ is the smallest σ-algebra such that $X_t$ is a measurable function (random variable) with respect to it: the set
\[
\bigl\{ \omega \in \Omega : X_t(\omega) \leqslant x \bigr\} \in \sigma(X_t, t \in T)
\]
for all $x \in \mathbb{R}$ (we have assumed that $E = \mathbb{R}$).
Definition 4.3.3. A filtration on $(\Omega, \mathcal{F})$ is a nondecreasing family $\{\mathcal{F}_t, t \in T\}$ of sub-σ-algebras of $\mathcal{F}$: $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}$ for $s \leqslant t$.
We set $\mathcal{F}_\infty = \sigma(\cup_{t \in T} \mathcal{F}_t)$. The filtration generated by $X_t$, where $X_t$ is a stochastic process, is
\[
\mathcal{F}_t^X := \sigma(X_s;\ s \leqslant t).
\]
Definition 4.3.4. A stochastic process $\{X_t;\ t \in T\}$ is adapted to the filtration $\mathcal{F}_t := \{\mathcal{F}_t, t \in T\}$ if for all $t \in T$, $X_t$ is an $\mathcal{F}_t$-measurable random variable.
Definition 4.3.5. Let $\{X_t\}$ be a stochastic process defined on a probability space $(\Omega, \mathcal{F}, \mu)$ with values in $E$ and let $\mathcal{F}_t^X$ be the filtration generated by $\{X_t;\ t \in T\}$. Then $\{X_t;\ t \in T\}$ is a Markov process if
\[
P(X_t \in \Gamma \mid \mathcal{F}_s^X) = P(X_t \in \Gamma \mid X_s) \qquad (4.14)
\]
for all $t, s \in T$ with $t \geqslant s$, and $\Gamma \in \mathcal{B}(E)$.
Remark 4.3.6. The filtration $\mathcal{F}_t^X$ is generated by events of the form $\{\omega \mid X_{s_1} \in B_1,\ X_{s_2} \in B_2, \dots, X_{s_n} \in B_n\}$ with $0 \leqslant s_1 < s_2 < \dots < s_n \leqslant s$ and $B_i \in \mathcal{B}(E)$. The definition of a Markov process is thus equivalent to the hierarchy of equations
\[
P(X_t \in \Gamma \mid X_{t_1}, X_{t_2}, \dots, X_{t_n}) = P(X_t \in \Gamma \mid X_{t_n}) \quad \text{a.s.}
\]
for $n \geqslant 1$, $0 \leqslant t_1 < t_2 < \dots < t_n \leqslant t$ and $\Gamma \in \mathcal{B}(E)$.
Roughly speaking, the statistics of $X_t$ for $t \geqslant s$ are completely determined once $X_s$ is known; information about $X_t$ for $t < s$ is superfluous. In other words: a Markov process has no memory. More precisely: when a Markov process is conditioned on the present state, there is no memory of the past. The past and future of a Markov process are statistically independent when the present is known.
Remark 4.3.7. A non-Markovian process $X_t$ can be described through a Markovian one $Y_t$ by enlarging the state space: the additional variables that we introduce account for the memory in $X_t$. This "Markovianization" trick is very useful since there exist many analytical tools for analyzing Markovian processes.
Example 4.3.8. The velocity of a Brownian particle is modeled by the stationary Ornstein-Uhlenbeck process $Y_t = e^{-t} W(e^{2t})$. The particle position is given by the integral of the OU process (we take $X_0 = 0$):
\[
X_t = \int_0^t Y_s \, ds.
\]
The particle position depends on the past of the OU process and, consequently, is not a Markov process. However, the joint position-velocity process $\{X_t, Y_t\}$ is. Its transition probability density $p(x, y, t \mid x_0, y_0)$ satisfies the forward Kolmogorov equation
\[
\frac{\partial p}{\partial t} = -y \frac{\partial p}{\partial x} + \frac{\partial}{\partial y}(y p) + \frac{1}{2} \frac{\partial^2 p}{\partial y^2}.
\]
4.4 The Chapman-Kolmogorov Equation

With a Markov process $\{X_t\}$ we can associate a function $P : T \times T \times E \times \mathcal{B}(E) \to \mathbb{R}^+$ defined through the relation
\[
P\bigl(X_t \in \Gamma \mid \mathcal{F}_s^X\bigr) = P(s, t, X_s, \Gamma),
\]
for all $t, s \in T$ with $t \geqslant s$ and all $\Gamma \in \mathcal{B}(E)$. Assume that $X_s = x$. Since $P\bigl(X_t \in \Gamma \mid \mathcal{F}_s^X\bigr) = P(X_t \in \Gamma \mid X_s)$ we can write
\[
P(\Gamma, t \mid x, s) = P(X_t \in \Gamma \mid X_s = x).
\]
The transition function $P(\Gamma, t \mid x, s)$ is (for fixed $t$, $x$, $s$) a probability measure on $E$ with $P(E, t \mid x, s) = 1$; it is $\mathcal{B}(E)$-measurable in $x$ (for fixed $t$, $s$, $\Gamma$) and satisfies the Chapman-Kolmogorov equation
\[
P(\Gamma, t \mid x, s) = \int_E P(\Gamma, t \mid y, u) P(dy, u \mid x, s) \qquad (4.15)
\]
for all $x \in E$, $\Gamma \in \mathcal{B}(E)$ and $s, u, t \in T$ with $s \leqslant u \leqslant t$. The derivation of the Chapman-Kolmogorov equation is based on the assumption of Markovianity and on properties of conditional probability. Let $(\Omega, \mathcal{F}, \mu)$ be a probability space, $X$ a random variable from $(\Omega, \mathcal{F}, \mu)$ to $(E, \mathcal{B}(E))$ and let $\mathcal{F}_1 \subset \mathcal{F}_2 \subset \mathcal{F}$. Then (see Theorem 2.4.1)
\[
E(E(X \mid \mathcal{F}_2) \mid \mathcal{F}_1) = E(E(X \mid \mathcal{F}_1) \mid \mathcal{F}_2) = E(X \mid \mathcal{F}_1). \qquad (4.16)
\]
Given $\mathcal{G} \subset \mathcal{F}$ we define the function $P_X(B \mid \mathcal{G}) = P(X \in B \mid \mathcal{G})$ for $B \in \mathcal{B}(E)$. Assume that $f$ is such that $E(f(X)) < \infty$. Then
\[
E(f(X) \mid \mathcal{G}) = \int_{\mathbb{R}} f(x) P_X(dx \mid \mathcal{G}). \qquad (4.17)
\]
Now we use the Markov property, together with equations (4.16) and (4.17) and the fact that $s < u$ implies $\mathcal{F}_s^X \subset \mathcal{F}_u^X$, to calculate:
\begin{align*}
P(\Gamma, t \mid x, s) &:= P(X_t \in \Gamma \mid X_s = x) = P(X_t \in \Gamma \mid \mathcal{F}_s^X) \\
&= E(I_\Gamma(X_t) \mid \mathcal{F}_s^X) = E(E(I_\Gamma(X_t) \mid \mathcal{F}_s^X) \mid \mathcal{F}_u^X) \\
&= E(E(I_\Gamma(X_t) \mid \mathcal{F}_u^X) \mid \mathcal{F}_s^X) = E(P(X_t \in \Gamma \mid X_u) \mid \mathcal{F}_s^X) \\
&= E(P(X_t \in \Gamma \mid X_u = y) \mid X_s = x) \\
&= \int_{\mathbb{R}} P(\Gamma, t \mid X_u = y) P(dy, u \mid X_s = x) \\
&=: \int_{\mathbb{R}} P(\Gamma, t \mid y, u) P(dy, u \mid x, s).
\end{align*}
Here $I_\Gamma(\cdot)$ denotes the indicator function of the set $\Gamma$. We have also set $E = \mathbb{R}$. The CK equation is an integral equation and is the fundamental equation in the theory of Markov processes. Under additional assumptions we will derive from it the Fokker-Planck PDE, which is the fundamental equation in the theory of diffusion processes, and will be the main object of study in this course.
Definition 4.4.1. A Markov process is homogeneous if
\[
P(t, \Gamma \mid X_s = x) := P(s, t, x, \Gamma) = P(0, t - s, x, \Gamma).
\]
We set $P(0, t, \cdot, \cdot) = P(t, \cdot, \cdot)$. The Chapman-Kolmogorov (CK) equation becomes
\[
P(t + s, x, \Gamma) = \int_E P(s, x, dz) P(t, z, \Gamma). \qquad (4.18)
\]
Let $X_t$ be a homogeneous Markov process and assume that the initial distribution of $X_t$ is given by the probability measure $\nu(\Gamma) = P(X_0 \in \Gamma)$ (for deterministic initial conditions $X_0 = x$ we have that $\nu(\Gamma) = I_\Gamma(x)$). The transition function $P(x, t, \Gamma)$ and the initial distribution $\nu$ determine the finite-dimensional distributions of $X$ by
\begin{align*}
&P(X_0 \in \Gamma_0,\ X_{t_1} \in \Gamma_1, \dots, X_{t_n} \in \Gamma_n) \\
&\quad = \int_{\Gamma_0} \int_{\Gamma_1} \dots \int_{\Gamma_{n-1}} P(t_n - t_{n-1}, y_{n-1}, \Gamma_n) P(t_{n-1} - t_{n-2}, y_{n-2}, dy_{n-1}) \cdots P(t_1, y_0, dy_1) \nu(dy_0). \qquad (4.19)
\end{align*}
Theorem 4.4.2. ([21, Sec. 4.1]) Let $P(t, x, \Gamma)$ satisfy (4.18) and assume that $(E, \rho)$ is a complete separable metric space. Then there exists a Markov process $X$ in $E$ whose finite-dimensional distributions are uniquely determined by (4.19).
Let $X_t$ be a homogeneous Markov process with initial distribution $\nu(\Gamma) = P(X_0 \in \Gamma)$ and transition function $P(x, t, \Gamma)$. We can calculate the probability of finding $X_t$ in a set $\Gamma$ at time $t$:
\[
P(X_t \in \Gamma) = \int_E P(x, t, \Gamma) \nu(dx).
\]
Thus, the initial distribution and the transition function are sufficient to characterize a homogeneous Markov process. Notice that they do not provide us with any information about the actual paths of the Markov process. The transition probability $P(\Gamma, t \mid x, s)$ is a probability measure. Assume that it has a density for all $t > s$:
\[
P(\Gamma, t \mid x, s) = \int_\Gamma p(y, t \mid x, s) \, dy.
\]
Clearly, for $t = s$ we have $P(\Gamma, s \mid x, s) = I_\Gamma(x)$. The Chapman-Kolmogorov equation becomes
\[
\int_\Gamma p(y, t \mid x, s) \, dy = \int_{\mathbb{R}} \int_\Gamma p(y, t \mid z, u) p(z, u \mid x, s) \, dz \, dy,
\]
and, since $\Gamma \in \mathcal{B}(\mathbb{R})$ is arbitrary, we obtain the equation
\[
p(y, t \mid x, s) = \int_{\mathbb{R}} p(y, t \mid z, u) p(z, u \mid x, s) \, dz. \qquad (4.20)
\]
The transition probability density is a function of four arguments: the initial position and time $x, s$ and the final position and time $y, t$.
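Equation (4.20) can be verified numerically for Brownian motion, whose transition density is Gaussian: integrating over the intermediate state reproduces the direct density (a sketch, not from the text; the grid and time points are arbitrary).

```python
import numpy as np

def p(y, t, x, s):
    """Gaussian transition density of Brownian motion, t > s."""
    tau = t - s
    return np.exp(-(y - x) ** 2 / (2 * tau)) / np.sqrt(2 * np.pi * tau)

x, s, u, t, y = 0.0, 0.0, 0.4, 1.0, 0.7
z = np.linspace(-12.0, 12.0, 40001)     # intermediate states, quadrature grid
dz = z[1] - z[0]
direct = p(y, t, x, s)
# Right-hand side of (4.20): integrate over the intermediate state z at time u.
chapman = np.sum(p(y, t, z, u) * p(z, u, x, s)) * dz
```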
In words, the CK equation tells us that, for a Markov process, the transition from $x, s$ to $y, t$ can be done in two steps: first the system moves from $x$ to $z$ at some intermediate time $u$; then it moves from $z$ to $y$ at time $t$. In order to calculate the probability for the transition from $(x, s)$ to $(y, t)$ we need to sum (integrate) the transitions from all possible intermediary states $z$. The above description suggests that a Markov process can be described through a semigroup of operators, i.e. a one-parameter family of linear operators with the properties
\[
P_0 = I, \qquad P_{t+s} = P_t \circ P_s \quad \forall\, t, s \geqslant 0.
\]
Indeed, let $P(t, x, dy)$ be the transition function of a homogeneous Markov process. It satisfies the CK equation (4.18):
\[
P(t + s, x, \Gamma) = \int_E P(s, x, dz) P(t, z, \Gamma).
\]
Let $X := C_b(E)$ and define the operator
\[
(P_t f)(x) := E(f(X_t) \mid X_0 = x) = \int_E f(y) P(t, x, dy).
\]
This is a linear operator with
\[
(P_0 f)(x) = E(f(X_0) \mid X_0 = x) = f(x) \ \Rightarrow\ P_0 = I.
\]
Furthermore:
\begin{align*}
(P_{t+s} f)(x) &= \int f(y) P(t + s, x, dy) \\
&= \int \int f(y) P(s, z, dy) P(t, x, dz) \\
&= \int \left( \int f(y) P(s, z, dy) \right) P(t, x, dz) \\
&= \int (P_s f)(z) P(t, x, dz) \\
&= (P_t \circ P_s f)(x).
\end{align*}
Consequently:
\[
P_{t+s} = P_t \circ P_s.
\]
4.5 The Generator of a Markov Process

Let $(E, \rho)$ be a metric space and let $\{X_t\}$ be an $E$-valued homogeneous Markov process. Define the one-parameter family of operators $P_t$ through
\[
P_t f(x) = \int f(y) P(t, x, dy) = E[f(X_t) \mid X_0 = x]
\]
for all $f(x) \in C_b(E)$ (continuous bounded functions on $E$). Assume for simplicity that $P_t : C_b(E) \to C_b(E)$. Then the one-parameter family of operators $P_t$ forms a semigroup of operators on $C_b(E)$. We define by $\mathcal{D}(\mathcal{L})$ the set of all $f \in C_b(E)$ such that the strong limit
\[
\mathcal{L} f = \lim_{t \to 0} \frac{P_t f - f}{t}
\]
exists.

Definition 4.5.1. The operator $\mathcal{L} : \mathcal{D}(\mathcal{L}) \to C_b(E)$ is called the infinitesimal generator of the operator semigroup $P_t$.
Definition 4.5.2. The operator $\mathcal{L} : C_b(E) \to C_b(E)$ defined above is called the generator of the Markov process $\{X_t;\ t \geqslant 0\}$.

The semigroup property and the definition of the generator of a semigroup imply that, formally at least, we can write:
\[
P_t = \exp(\mathcal{L} t).
\]
Consider the function $u(x, t) := (P_t f)(x)$. We calculate its time derivative:
\begin{align*}
\frac{\partial u}{\partial t} &= \frac{d}{dt}(P_t f) = \frac{d}{dt}\left(e^{\mathcal{L} t} f\right) \\
&= \mathcal{L}\left(e^{\mathcal{L} t} f\right) = \mathcal{L} P_t f = \mathcal{L} u.
\end{align*}
Furthermore, $u(x, 0) = P_0 f(x) = f(x)$. Consequently, $u(x, t)$ satisfies the initial value problem
\[
\frac{\partial u}{\partial t} = \mathcal{L} u, \qquad u(x, 0) = f(x). \qquad (4.21)
\]
When the semigroup $P_t$ is the transition semigroup of a Markov process $X_t$, equation (4.21) is called the backward Kolmogorov equation. It governs the evolution of an observable
\[
u(x, t) = E(f(X_t) \mid X_0 = x).
\]
Thus, given the generator $\mathcal{L}$ of a Markov process, we can calculate all the statistics of our process by solving the backward Kolmogorov equation. In the case where the Markov process is the solution of a stochastic differential equation, the generator is a second-order elliptic operator and the backward Kolmogorov equation becomes an initial value problem for a parabolic PDE.
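For Brownian motion, whose generator is $\mathcal{L} = \frac{1}{2}\frac{d^2}{dx^2}$ (Example 4.5.4 below), the observable $u(x, t) = E\bigl(\cos(W_t + x)\bigr) = e^{-t/2}\cos x$ gives an explicit solution of (4.21) with $f = \cos$. A finite-difference check (a sketch, not part of the text; the evaluation point is arbitrary):

```python
import numpy as np

# u(x, t) = E[cos(x + W_t)] = exp(-t/2) cos(x): should solve u_t = (1/2) u_xx
# with initial condition u(x, 0) = cos(x).
def u(x, t):
    return np.exp(-t / 2.0) * np.cos(x)

x, t, h = 0.8, 0.5, 1e-5
du_dt = (u(x, t + h) - u(x, t - h)) / (2 * h)
Lu = 0.5 * (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2
```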
The space $C_b(E)$ is natural in a probabilistic context, but other Banach spaces often arise in applications; in particular, when there is a measure $\mu$ on $E$, the spaces $L^p(E; \mu)$ sometimes arise. We will quite often use the space $L^2(E; \mu)$, where $\mu$ is the invariant measure of our Markov process. The generator is frequently taken as the starting point for the definition of a homogeneous Markov process. Conversely, let $P_t$ be a contraction semigroup (let $X$ be a Banach space and $T : X \to X$ a bounded operator; then $T$ is a contraction provided that $\|T f\|_X \leqslant \|f\|_X$ for all $f \in X$), with $\mathcal{D}(P_t) \subset C_b(E)$, closed. Then, under mild technical hypotheses, there is an $E$-valued homogeneous Markov process $\{X_t\}$ associated with $P_t$ defined through
\[
E[f(X(t)) \mid \mathcal{F}_s^X] = P_{t-s} f(X(s))
\]
for all $t, s \in T$ with $t \geqslant s$ and $f \in \mathcal{D}(P_t)$.
Example 4.5.3. The Poisson process is a homogeneous Markov process.
Example 4.5.4. The one-dimensional Brownian motion is a homogeneous Markov process. The transition function is the Gaussian defined in the example in Lecture 2:
\[
P(t, x, dy) = \gamma_{t,x}(y) \, dy, \qquad \gamma_{t,x}(y) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{|x - y|^2}{2t}\right).
\]
The semigroup associated to the standard Brownian motion is the heat semigroup $P_t = e^{\frac{t}{2}\frac{d^2}{dx^2}}$. The generator of this Markov process is $\frac{1}{2}\frac{d^2}{dx^2}$.
Notice that the transition probability density $\gamma_{t,x}$ of the one-dimensional Brownian motion is the fundamental solution (Green's function) of the heat (diffusion) PDE
\[
\frac{\partial u}{\partial t} = \frac{1}{2} \frac{\partial^2 u}{\partial x^2}.
\]
4.5.1 The Adjoint Semigroup

The semigroup $P_t$ acts on bounded measurable functions. We can also define the adjoint semigroup $P_t^*$, which acts on probability measures:
\[
P_t^* \mu(\Gamma) = \int_{\mathbb{R}} P(X_t \in \Gamma \mid X_0 = x) \, d\mu(x) = \int_{\mathbb{R}} p(t, x, \Gamma) \, d\mu(x).
\]
The image of a probability measure $\mu$ under $P_t^*$ is again a probability measure. The operators $P_t$ and $P_t^*$ are adjoint in the $L^2$ sense:
\[
\int_{\mathbb{R}} P_t f(x) \, d\mu(x) = \int_{\mathbb{R}} f(x) \, d(P_t^* \mu)(x). \qquad (4.22)
\]
We can, formally at least, write
\[
P_t^* = \exp(\mathcal{L}^* t),
\]
where $\mathcal{L}^*$ is the $L^2$ adjoint of the generator of the process:
\[
\int \mathcal{L} f \, h \, dx = \int f \, \mathcal{L}^* h \, dx.
\]
Let $\mu_t := P_t^* \mu$. This is the law of the Markov process at time $t$, and $\mu$ is the initial distribution. An argument similar to the one used in the derivation of the backward Kolmogorov equation (4.21) enables us to obtain an equation for the evolution of $\mu_t$:
\[
\frac{\partial \mu_t}{\partial t} = \mathcal{L}^* \mu_t, \qquad \mu_0 = \mu.
\]
Assuming that $\mu_t = \rho(y, t) \, dy$ and $\mu = \rho_0(y) \, dy$, this equation becomes:
\[
\frac{\partial \rho}{\partial t} = \mathcal{L}^* \rho, \qquad \rho(y, 0) = \rho_0(y). \qquad (4.23)
\]
This is the forward Kolmogorov or Fokker-Planck equation. When the initial conditions are deterministic, $X_0 = x$, the initial condition becomes $\rho_0 = \delta(y - x)$. Given the initial distribution and the generator of the Markov process $X_t$, we can calculate the transition probability density by solving the forward Kolmogorov equation. We can then calculate all statistical quantities of this process through the formula
\[
E(f(X_t) \mid X_0 = x) = \int f(y) \rho(t, y; x) \, dy.
\]
We will derive rigorously the backward and forward Kolmogorov equations for Markov processes that are defined as solutions of stochastic differential equations later on.
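The duality (4.22) can be checked numerically for Brownian motion with a Gaussian initial law (a sketch; the choices $f(x) = x^2$ and $\mu = \mathcal{N}(0, 1)$ are arbitrary). Here $(P_t f)(x) = E[(x + W_t)^2] = x^2 + t$, while $P_t^* \mu$ is again Gaussian, $\mathcal{N}(0, 1 + t)$.

```python
import numpy as np

def gauss(x, var):
    return np.exp(-x**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

t = 0.5
x = np.linspace(-12.0, 12.0, 20001)
dx = x[1] - x[0]

# Left side of (4.22): integrate (P_t f)(x) = x^2 + t against mu = N(0, 1).
lhs = np.sum((x**2 + t) * gauss(x, 1.0)) * dx
# Right side: integrate f(x) = x^2 against P_t^* mu = N(0, 1 + t).
rhs = np.sum(x**2 * gauss(x, 1.0 + t)) * dx
```

Both sides evaluate to the second moment $1 + t = 1.5$.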
We can study the evolution of a Markov process in two different ways: either through the evolution of observables (Heisenberg/Koopman),
\[
\frac{\partial (P_t f)}{\partial t} = \mathcal{L}(P_t f),
\]
or through the evolution of states (Schrödinger/Frobenius-Perron),
\[
\frac{\partial (P_t^* \mu)}{\partial t} = \mathcal{L}^*(P_t^* \mu).
\]
We can also study Markov processes at the level of trajectories. We will do this after we define the concept of a stochastic differential equation.
4.6 Ergodic Markov Processes

A very important concept in the study of limit theorems for stochastic processes is that of ergodicity. This concept, in the context of Markov processes, provides us with information on the long-time behavior of a Markov semigroup.

Definition 4.6.1. A Markov process is called ergodic if the equation
\[
P_t g = g, \qquad g \in C_b(E), \ \forall t \geqslant 0
\]
has only constant solutions.
Roughly speaking, ergodicity corresponds to the case where the semigroup $P_t$ is such that $P_t - I$ has only constants in its null space or, equivalently, to the case where the generator $\mathcal{L}$ has only constants in its null space. This follows from the definition of the generator of a Markov process. Under some additional compactness assumptions, an ergodic Markov process has an invariant measure $\mu$ with the property that, in the case $T = \mathbb{R}^+$,
\[
\lim_{t \to +\infty} \frac{1}{t} \int_0^t g(X_s) \, ds = E g(x),
\]
where $E$ denotes the expectation with respect to $\mu$. This is a physicist's definition of an ergodic process: time averages equal phase space averages.
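A Monte Carlo illustration of this physicist's definition (a sketch, not from the text; parameters are arbitrary): for the OU process with drift $-\alpha x$ and invariant measure $\mathcal{N}(0, D/\alpha)$, the time average of $g(x) = x^2$ along one long path should approach the phase-space average $D/\alpha$. The exact one-step conditional law is used for the update.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, D = 1.0, 1.0
h, n = 0.01, 200_000

# Exact one-step update of the OU process: X_{k+1} = e^{-alpha h} X_k + Gaussian noise
# whose variance matches the exact conditional variance over a step of length h.
decay = np.exp(-alpha * h)
noise_std = np.sqrt((D / alpha) * (1.0 - decay**2))
x, total = 0.0, 0.0
for _ in range(n):
    x = decay * x + noise_std * rng.standard_normal()
    total += x * x
time_avg = total / n        # (1/t) int_0^t g(X_s) ds with g(x) = x^2
phase_avg = D / alpha       # expectation of x^2 under the invariant measure
```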
Using the adjoint semigroup we can define an invariant measure as a solution of the equation
\[
P_t^* \mu = \mu.
\]
If this measure is unique, then the Markov process is ergodic. Using this, we can obtain an equation for the invariant measure in terms of the adjoint $\mathcal{L}^*$ of the generator, which is the generator of the semigroup $P_t^*$. Indeed, from the definition of the generator of a semigroup and the definition of an invariant measure, we conclude that a measure $\mu$ is invariant if and only if
\[
\mathcal{L}^* \mu = 0
\]
in some appropriate generalized sense ($(\mathcal{L}^* \mu, f) = 0$ for every bounded measurable function $f$). Assume that $\mu(dx) = \rho(x) \, dx$. Then the invariant density satisfies the stationary Fokker-Planck equation
\[
\mathcal{L}^* \rho = 0.
\]
The invariant measure (distribution) governs the long-time dynamics of the Markov process.
4.6.1 Stationary Markov Processes

If $X_0$ is distributed according to $\mu$, then so is $X_t$ for all $t > 0$. The resulting stochastic process, with $X_0$ distributed in this way, is stationary. In this case the transition probability density (the solution of the Fokker-Planck equation) is independent of time: $\rho(x, t) = \rho(x)$. Consequently, the statistics of the Markov process are independent of time.
Example 4.6.2. Consider the one-dimensional Brownian motion. The generator of this Markov process is
\[
\mathcal{L} = \frac{1}{2} \frac{d^2}{dx^2}.
\]
The stationary Fokker-Planck equation becomes
\[
\frac{d^2 \rho}{dx^2} = 0, \qquad (4.24)
\]
together with the normalization and non-negativity conditions
\[
\rho \geqslant 0, \qquad \int_{\mathbb{R}} \rho(x) \, dx = 1. \qquad (4.25)
\]
There are no solutions to equation (4.24) subject to the constraints (4.25).$^2$ Thus, the one-dimensional Brownian motion is not an ergodic process.
Example 4.6.3. Consider a one-dimensional Brownian motion on $[0, 1]$ with periodic boundary conditions. The generator of this Markov process is the differential operator $\mathcal{L} = \frac{1}{2} \frac{d^2}{dx^2}$, equipped with periodic boundary conditions on $[0, 1]$. This operator is self-adjoint. The null space of both $\mathcal{L}$ and $\mathcal{L}^*$ comprises constant functions on $[0, 1]$. Both the backward Kolmogorov and the Fokker-Planck equation reduce to the heat equation
\[
\frac{\partial \rho}{\partial t} = \frac{1}{2} \frac{\partial^2 \rho}{\partial x^2}
\]
with periodic boundary conditions on $[0, 1]$. Fourier analysis shows that the solution converges to a constant at an exponential rate. See Exercise 6.
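The exponential convergence in this example can be seen directly with a spectral (Fourier) solution of the heat equation on $[0, 1]$ (a sketch, not from the text; the initial density is arbitrary): the $k$-th Fourier mode of $\rho$ decays as $e^{-2\pi^2 k^2 t}$, so the density flattens to the constant $1$ at the rate of the slowest ($k = 1$) mode.

```python
import numpy as np

n = 256
x = np.arange(n) / n
# An arbitrary probability density on [0, 1] (mean value 1).
rho0 = 1.0 + 0.8 * np.cos(2 * np.pi * x) + 0.3 * np.sin(6 * np.pi * x)

def solve(t):
    """Solve rho_t = (1/2) rho_xx with periodic b.c. by damping Fourier modes."""
    c = np.fft.fft(rho0)
    k = np.fft.fftfreq(n, d=1.0 / n)             # integer wavenumbers
    return np.fft.ifft(c * np.exp(-0.5 * (2 * np.pi * k) ** 2 * t)).real

dev_small = np.max(np.abs(solve(0.1) - 1.0))     # dominated by the k = 1 mode
dev_large = np.max(np.abs(solve(2.0) - 1.0))     # essentially constant by t = 2
```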
Example 4.6.4. The one-dimensional Ornstein-Uhlenbeck (OU) process is a Markov process with generator
\[
\mathcal{L} = -\alpha x \frac{d}{dx} + D \frac{d^2}{dx^2}.
\]
The null space of $\mathcal{L}$ comprises the constants in $x$. Hence, it is an ergodic Markov process. In order to calculate the invariant measure we need to solve the stationary Fokker-Planck equation:
\[
\mathcal{L}^* \rho = 0, \qquad \rho \geqslant 0, \qquad \|\rho\|_{L^1(\mathbb{R})} = 1. \qquad (4.26)
\]
$^2$The general solution to equation (4.24) is $\rho(x) = Ax + B$ for arbitrary constants $A$ and $B$. This function is not normalizable, i.e. there do not exist constants $A$ and $B$ so that $\int_{\mathbb{R}} \rho(x) \, dx = 1$.
Let us calculate the $L^2$ adjoint of $\mathcal{L}$. Assuming that $f, h$ decay sufficiently fast at infinity, we have:
\begin{align*}
\int_{\mathbb{R}} \mathcal{L} f \, h \, dx &= \int_{\mathbb{R}} \left[ (-\alpha x \partial_x f) h + (D \partial_x^2 f) h \right] dx \\
&= \int_{\mathbb{R}} \left[ f \partial_x (\alpha x h) + f (D \partial_x^2 h) \right] dx =: \int_{\mathbb{R}} f \mathcal{L}^* h \, dx,
\end{align*}
where
\[
\mathcal{L}^* h := \frac{d}{dx}(\alpha x h) + D \frac{d^2 h}{dx^2}.
\]
We can calculate the invariant distribution by solving equation (4.26). The invariant measure of this process is the Gaussian measure
\[
\mu(dx) = \sqrt{\frac{\alpha}{2\pi D}} \exp\left(-\frac{\alpha}{2D} x^2\right) dx.
\]
If the initial condition of the OU process is distributed according to the invariant measure, then the OU process is a stationary Gaussian process.
Let $X_t$ be the one-dimensional OU process and let $X_0 \sim \mathcal{N}(0, D/\alpha)$. Then $X_t$ is a mean zero Gaussian second-order stationary process on $[0, \infty)$ with correlation function
\[
R(t) = \frac{D}{\alpha} e^{-\alpha |t|}
\]
and spectral density
\[
f(x) = \frac{D}{\pi} \frac{1}{x^2 + \alpha^2}.
\]
Furthermore, the OU process is the only real-valued mean zero Gaussian second-order stationary Markov process defined on $\mathbb{R}$.
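The correlation function and spectral density above are a Fourier transform pair, which can be checked by quadrature (a sketch; the convention $f(x) = \frac{1}{2\pi}\int_{\mathbb{R}} R(t) e^{-ixt} \, dt$ and the parameter values are assumptions):

```python
import numpy as np

alpha, D = 2.0, 0.5
R = lambda t: (D / alpha) * np.exp(-alpha * np.abs(t))   # correlation function
spectral = lambda x: (D / np.pi) / (x**2 + alpha**2)     # claimed spectral density

t = np.linspace(-40.0, 40.0, 400001)
dt = t[1] - t[0]
x0 = 1.3
# f(x) = (1/2 pi) int R(t) e^{-ixt} dt; R is even, so only the cosine part survives.
f_numeric = np.sum(R(t) * np.cos(x0 * t)) * dt / (2 * np.pi)
```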
4.7 Discussion and Bibliography

The study of operator semigroups started in the late 1940s independently by Hille and Yosida. Semigroup theory was developed in the 1950s and 1960s by Feller, Dynkin and others, mostly in connection with the theory of Markov processes. Necessary and sufficient conditions for an operator $\mathcal{L}$ to be the generator of a (contraction) semigroup are given by the Hille-Yosida theorem [22, Ch. 7].
4.8 Exercises

1. Let $\{X_n\}$ be a stochastic process with state space $S = \mathbb{Z}$. Show that it is a Markov process if and only if, for all $n$,
\[
P(X_{n+1} = i_{n+1} \mid X_1 = i_1, \dots, X_n = i_n) = P(X_{n+1} = i_{n+1} \mid X_n = i_n).
\]
2. Show that (4.4) is the solution of the initial value problem (4.10), as well as of the final value problem
\[
-\frac{\partial p}{\partial s} = \frac{1}{2} \frac{\partial^2 p}{\partial x^2}, \qquad \lim_{s \to t} p(y, t \mid x, s) = \delta(y - x).
\]
3. Use (4.5) to show that the forward and backward Kolmogorov equations for the OU process are
\[
\frac{\partial p}{\partial t} = \frac{\partial}{\partial y}(y p) + \frac{1}{2} \frac{\partial^2 p}{\partial y^2}
\]
and
\[
-\frac{\partial p}{\partial s} = -x \frac{\partial p}{\partial x} + \frac{1}{2} \frac{\partial^2 p}{\partial x^2}.
\]
4. Let $W(t)$ be a standard one-dimensional Brownian motion, let $Y(t) = \sigma W(t)$ with $\sigma > 0$, and consider the process
\[
X(t) = \int_0^t Y(s) \, ds.
\]
Show that the joint process $\{X(t), Y(t)\}$ is Markovian and write down the generator of the process.
5. Let $Y(t) = e^{-t} W(e^{2t})$ be the stationary Ornstein-Uhlenbeck process and consider the process
\[
X(t) = \int_0^t Y(s) \, ds.
\]
Show that the joint process $\{X(t), Y(t)\}$ is Markovian and write down the generator of the process.
6. Consider a one-dimensional Brownian motion on $[0, 1]$ with periodic boundary conditions. The generator of this Markov process is the differential operator $\mathcal{L} = \frac{1}{2} \frac{d^2}{dx^2}$, equipped with periodic boundary conditions on $[0, 1]$. Show that this operator is self-adjoint. Show that the null space of both $\mathcal{L}$ and $\mathcal{L}^*$ comprises constant functions on $[0, 1]$. Conclude that this process is ergodic. Solve the corresponding Fokker-Planck equation for arbitrary initial conditions $\rho_0(x)$. Show that the solution converges to a constant at an exponential rate.
7. (a) Let $X, Y$ be mean zero Gaussian random variables with $E X^2 = \sigma_X^2$, $E Y^2 = \sigma_Y^2$ and correlation coefficient $\rho$ (the correlation coefficient is $\rho = \frac{E(XY)}{\sigma_X \sigma_Y}$). Show that
\[
E(X \mid Y) = \frac{\rho \sigma_X}{\sigma_Y} Y.
\]
(b) Let $X_t$ be a mean zero stationary Gaussian process with autocorrelation function $R(t)$. Use the previous result to show that
\[
E[X_{t+s} \mid X_s] = \frac{R(t)}{R(0)} X(s), \qquad s, t \geqslant 0.
\]

(c) Use the previous result to show that the only stationary Gaussian Markov process with continuous autocorrelation function is the stationary OU process.
8. Show that a Gaussian process $X_t$ is a Markov process if and only if
\[
E(X_{t_n} \mid X_{t_1} = x_1, \dots, X_{t_{n-1}} = x_{n-1}) = E(X_{t_n} \mid X_{t_{n-1}} = x_{n-1}).
\]
Chapter 5

Diffusion Processes

5.1 Introduction

In this chapter we study a particular class of Markov processes, namely Markov processes with continuous paths. These processes are called diffusion processes and they appear in many applications in physics, chemistry, biology and finance.

In Section 5.2 we give the definition of a diffusion process. In Section 5.3 we derive the forward and backward Kolmogorov equations for one-dimensional diffusion processes. In Section 5.4 we present the forward and backward Kolmogorov equations in arbitrary dimensions. The connection between diffusion processes and stochastic differential equations is presented in Section 5.5. Discussion and bibliographical remarks are included in Section 5.7. Exercises can be found in Section 5.8.
5.2 Definition of a Diffusion Process

A Markov process consists of three parts: a drift (deterministic), a random process and a jump process. A diffusion process is a Markov process that has continuous sample paths (trajectories). Thus, it is a Markov process with no jumps. A diffusion process can be defined by specifying its first two moments:

Definition 5.2.1. A Markov process $X_t$ with transition function $P(\Gamma, t \mid x, s)$ is called a diffusion process if the following conditions are satisfied.

i. (Continuity.) For every $x$ and every $\varepsilon > 0$,
\[
\int_{|x - y| > \varepsilon} P(dy, t \mid x, s) = o(t - s) \qquad (5.1)
\]
uniformly over $s < t$.

ii. (Definition of drift coefficient.) There exists a function $a(x, s)$ such that for every $x$ and every $\varepsilon > 0$,
\[
\int_{|y - x| \leqslant \varepsilon} (y - x) P(dy, t \mid x, s) = a(x, s)(t - s) + o(t - s) \qquad (5.2)
\]
uniformly over $s < t$.

iii. (Definition of diffusion coefficient.) There exists a function $b(x, s)$ such that for every $x$ and every $\varepsilon > 0$,
\[
\int_{|y - x| \leqslant \varepsilon} (y - x)^2 P(dy, t \mid x, s) = b(x, s)(t - s) + o(t - s) \qquad (5.3)
\]
uniformly over $s < t$.
Remark 5.2.2. In Definition 5.2.1 we had to truncate the domain of integration since we didn't know whether the first and second moments exist. If we assume that there exists a $\delta > 0$ such that
\[
\lim_{t \to s} \frac{1}{t - s} \int_{\mathbb{R}^d} |y - x|^{2+\delta} P(dy, t \mid x, s) = 0, \qquad (5.4)
\]
then we can extend the integration over the whole $\mathbb{R}^d$ and use expectations in the definition of the drift and the diffusion coefficient. Indeed, let $k = 0, 1, 2$ and notice that
\begin{align*}
\int_{|y - x| > \varepsilon} |y - x|^k P(dy, t \mid x, s) &= \int_{|y - x| > \varepsilon} |y - x|^{2+\delta} |y - x|^{k - (2+\delta)} P(dy, t \mid x, s) \\
&\leqslant \frac{1}{\varepsilon^{2+\delta-k}} \int_{|y - x| > \varepsilon} |y - x|^{2+\delta} P(dy, t \mid x, s) \\
&\leqslant \frac{1}{\varepsilon^{2+\delta-k}} \int_{\mathbb{R}^d} |y - x|^{2+\delta} P(dy, t \mid x, s).
\end{align*}
Using this estimate together with (5.4) we conclude that
\[
\lim_{t \to s} \frac{1}{t - s} \int_{|y - x| > \varepsilon} |y - x|^k P(dy, t \mid x, s) = 0, \qquad k = 0, 1, 2.
\]
This implies that assumption (5.4) is sufficient for the sample paths to be continuous ($k = 0$) and for the replacement of the truncated integrals in (5.2) and (5.3) by integrals over $\mathbb{R}$ ($k = 1$ and $k = 2$, respectively). The definitions of the drift and diffusion coefficients become
\[
\lim_{t \to s} E\left[ \frac{X_t - X_s}{t - s} \,\Big|\, X_s = x \right] = a(x, s) \qquad (5.5)
\]
and
\[
\lim_{t \to s} E\left[ \frac{|X_t - X_s|^2}{t - s} \,\Big|\, X_s = x \right] = b(x, s). \qquad (5.6)
\]
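The limits (5.5) and (5.6) can be evaluated explicitly for the OU process of Example 4.6.4 (drift $-\alpha x$, generator $-\alpha x \frac{d}{dx} + D \frac{d^2}{dx^2}$), whose conditional law over a step $\tau = t - s$ is Gaussian with mean $x e^{-\alpha \tau}$ and variance $(D/\alpha)(1 - e^{-2\alpha\tau})$. A small sketch (parameter values are arbitrary) recovering $a(x) = -\alpha x$ and $b(x) = 2D$:

```python
import numpy as np

alpha, D, x = 1.5, 0.8, 0.7

def scaled_moments(tau):
    """Conditional increment moments of the OU process, divided by tau."""
    mean = x * (np.exp(-alpha * tau) - 1.0)                  # E[X_t - X_s | X_s = x]
    var = (D / alpha) * (1.0 - np.exp(-2.0 * alpha * tau))   # Var[X_t | X_s = x]
    return mean / tau, (var + mean**2) / tau                 # -> a(x), b(x) as tau -> 0

drift_est, diff_est = scaled_moments(1e-6)
```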
5.3 The Backward and Forward Kolmogorov Equations

In this section we show that a diffusion process is completely determined by its first two moments. In particular, we will obtain partial differential equations that govern the evolution of the conditional expectation of an arbitrary function of a diffusion process $X_t$,
\[
u(x, s) = E(f(X_t) \mid X_s = x),
\]
as well as of the transition probability density $p(y, t \mid x, s)$. These are the backward and forward Kolmogorov equations.

In this section we shall derive the backward and forward Kolmogorov equations for one-dimensional diffusion processes. The extension to multidimensional diffusion processes is presented in Section 5.4.
5.3.1 The Backward Kolmogorov Equation

Theorem 5.3.1. (Kolmogorov) Let $f(x) \in C_b(\mathbb{R})$ and let
\[
u(x, s) := E(f(X_t) \mid X_s = x) = \int f(y) P(dy, t \mid x, s).
\]
Assume furthermore that the functions $a(x, s)$, $b(x, s)$ are continuous in both $x$ and $s$. Then $u(x, s) \in C^{2,1}(\mathbb{R} \times \mathbb{R}^+)$ and it solves the final value problem
\[
-\frac{\partial u}{\partial s} = a(x, s) \frac{\partial u}{\partial x} + \frac{1}{2} b(x, s) \frac{\partial^2 u}{\partial x^2}, \qquad \lim_{s \to t} u(x, s) = f(x). \qquad (5.7)
\]
Proof. First we notice that the continuity assumption (5.1), together with the fact that the function $f(x)$ is bounded, implies that
\begin{align*}
u(x, s) &= \int_{\mathbb{R}} f(y) P(dy, t \mid x, s) \\
&= \int_{|y - x| \leqslant \varepsilon} f(y) P(dy, t \mid x, s) + \int_{|y - x| > \varepsilon} f(y) P(dy, t \mid x, s) \\
&\leqslant \int_{|y - x| \leqslant \varepsilon} f(y) P(dy, t \mid x, s) + \|f\|_{L^\infty} \int_{|y - x| > \varepsilon} P(dy, t \mid x, s) \\
&= \int_{|y - x| \leqslant \varepsilon} f(y) P(dy, t \mid x, s) + o(t - s).
\end{align*}
We add and subtract the final condition $f(x)$ and use the previous calculation to obtain:
\begin{align*}
u(x, s) &= \int_{\mathbb{R}} f(y) P(dy, t \mid x, s) = f(x) + \int_{\mathbb{R}} (f(y) - f(x)) P(dy, t \mid x, s) \\
&= f(x) + \int_{|y - x| \leqslant \varepsilon} (f(y) - f(x)) P(dy, t \mid x, s) + \int_{|y - x| > \varepsilon} (f(y) - f(x)) P(dy, t \mid x, s) \\
&= f(x) + \int_{|y - x| \leqslant \varepsilon} (f(y) - f(x)) P(dy, t \mid x, s) + o(t - s).
\end{align*}
Now the final condition follows from the fact that $f(x) \in C_b(\mathbb{R})$ and the arbitrariness of $\varepsilon$.
Now we show that $u(x, s)$ solves the backward Kolmogorov equation. We use the Chapman-Kolmogorov equation (4.15) to obtain
\begin{align*}
u(x, \sigma) &= \int_{\mathbb{R}} f(z) P(dz, t \mid x, \sigma) \qquad (5.8) \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} f(z) P(dz, t \mid y, \rho) P(dy, \rho \mid x, \sigma) \\
&= \int_{\mathbb{R}} u(y, \rho) P(dy, \rho \mid x, \sigma). \qquad (5.9)
\end{align*}
The Taylor series expansion of the function $u(x, s)$ gives
\[
u(z, \rho) - u(x, \rho) = \frac{\partial u(x, \rho)}{\partial x}(z - x) + \frac{1}{2} \frac{\partial^2 u(x, \rho)}{\partial x^2}(z - x)^2 (1 + \alpha_\varepsilon), \qquad |z - x| \leqslant \varepsilon, \qquad (5.10)
\]
where
\[
\alpha_\varepsilon = \sup_{\rho,\ |z - x| \leqslant \varepsilon} \left| \frac{\partial^2 u(x, \rho)}{\partial x^2} - \frac{\partial^2 u(z, \rho)}{\partial x^2} \right|.
\]
Notice that, since $u(x, s)$ is twice continuously differentiable in $x$, $\lim_{\varepsilon \to 0} \alpha_\varepsilon = 0$.
We combine now (5.9) with (5.10) to calculate
\begin{align*}
\frac{u(x, s) - u(x, s + h)}{h} &= \frac{1}{h} \left( \int_{\mathbb{R}} P(dy, s + h \mid x, s) u(y, s + h) - u(x, s + h) \right) \\
&= \frac{1}{h} \int_{\mathbb{R}} P(dy, s + h \mid x, s) \bigl(u(y, s + h) - u(x, s + h)\bigr) \\
&= \frac{1}{h} \int_{|x - y| < \varepsilon} P(dy, s + h \mid x, s) \bigl(u(y, s + h) - u(x, s)\bigr) + o(1) \\
&= \frac{\partial u}{\partial x}(x, s + h) \frac{1}{h} \int_{|x - y| < \varepsilon} (y - x) P(dy, s + h \mid x, s) \\
&\quad + \frac{1}{2} \frac{\partial^2 u}{\partial x^2}(x, s + h) \frac{1}{h} \int_{|x - y| < \varepsilon} (y - x)^2 P(dy, s + h \mid x, s) (1 + \alpha_\varepsilon) + o(1) \\
&= a(x, s) \frac{\partial u}{\partial x}(x, s + h) + \frac{1}{2} b(x, s) \frac{\partial^2 u}{\partial x^2}(x, s + h)(1 + \alpha_\varepsilon) + o(1).
\end{align*}
Equation (5.7) follows by taking the limits $\varepsilon \to 0$, $h \to 0$.
Assume now that the transition function has a density $p(y, t \mid x, s)$. In this case the formula for $u(x, s)$ becomes
\[
u(x, s) = \int_{\mathbb{R}} f(y) p(y, t \mid x, s) \, dy.
\]
Substituting this in the backward Kolmogorov equation we obtain
\[
\int_{\mathbb{R}} f(y) \left( \frac{\partial p(y, t \mid x, s)}{\partial s} + \mathcal{L}_{s,x}\, p(y, t \mid x, s) \right) dy = 0, \qquad (5.11)
\]
where
\[
\mathcal{L}_{s,x} := a(x, s) \frac{\partial}{\partial x} + \frac{1}{2} b(x, s) \frac{\partial^2}{\partial x^2}.
\]
Since (5.11) is valid for arbitrary functions $f(y)$, we obtain a partial differential equation for the transition probability density:
\[
-\frac{\partial p(y, t \mid x, s)}{\partial s} = a(x, s) \frac{\partial p(y, t \mid x, s)}{\partial x} + \frac{1}{2} b(x, s) \frac{\partial^2 p(y, t \mid x, s)}{\partial x^2}. \qquad (5.12)
\]
Notice that the variation is with respect to the "backward" variables $x, s$. We will obtain an equation with respect to the "forward" variables $y, t$ in the next section.
5.3.2 The Forward Kolmogorov Equation

In this section we will obtain the forward Kolmogorov equation. In the physics literature it is called the Fokker-Planck equation. We assume that the transition function has a density with respect to Lebesgue measure:
\[
P(\Gamma, t \mid x, s) = \int_\Gamma p(y, t \mid x, s) \, dy.
\]
Theorem 5.3.2. (Kolmogorov) Assume that conditions (5.1), (5.2), (5.3) are satisfied and that $p(y, t \mid \cdot, \cdot)$, $a(y, t)$, $b(y, t) \in C^{2,1}(\mathbb{R} \times \mathbb{R}^+)$. Then the transition probability density satisfies the equation
\[
\frac{\partial p}{\partial t} = -\frac{\partial}{\partial y}\bigl(a(t, y) p\bigr) + \frac{1}{2} \frac{\partial^2}{\partial y^2}\bigl(b(t, y) p\bigr), \qquad \lim_{t \to s} p(y, t \mid x, s) = \delta(x - y). \qquad (5.13)
\]
Proof. Fix a function $f(y) \in C_0^2(\mathbb{R})$. An argument similar to the one used in the proof of the backward Kolmogorov equation gives
\[
\lim_{h \to 0} \frac{1}{h} \left( \int f(y) p(y, s + h \mid x, s) \, dy - f(x) \right) = a(x, s) f_x(x) + \frac{1}{2} b(x, s) f_{xx}(x), \qquad (5.14)
\]
where subscripts denote differentiation with respect to $x$. On the other hand,
\begin{align*}
\int f(y) \frac{\partial}{\partial t} p(y, t \mid x, s) \, dy &= \frac{\partial}{\partial t} \int f(y) p(y, t \mid x, s) \, dy \\
&= \lim_{h \to 0} \frac{1}{h} \int \bigl( p(y, t + h \mid x, s) - p(y, t \mid x, s) \bigr) f(y) \, dy \\
&= \lim_{h \to 0} \frac{1}{h} \left( \int p(y, t + h \mid x, s) f(y) \, dy - \int p(z, t \mid x, s) f(z) \, dz \right) \\
&= \lim_{h \to 0} \frac{1}{h} \left( \int \int p(y, t + h \mid z, t) p(z, t \mid x, s) f(y) \, dy \, dz - \int p(z, t \mid x, s) f(z) \, dz \right) \\
&= \lim_{h \to 0} \frac{1}{h} \int p(z, t \mid x, s) \left( \int p(y, t + h \mid z, t) f(y) \, dy - f(z) \right) dz \\
&= \int p(z, t \mid x, s) \left( a(z, t) f_z(z) + \frac{1}{2} b(z, t) f_{zz}(z) \right) dz \\
&= \int \left( -\frac{\partial}{\partial z}\bigl(a(z, t) p(z, t \mid x, s)\bigr) + \frac{1}{2} \frac{\partial^2}{\partial z^2}\bigl(b(z, t) p(z, t \mid x, s)\bigr) \right) f(z) \, dz.
\end{align*}
In the above calculation we used the Chapman-Kolmogorov equation. We have also performed two integrations by parts and used the fact that, since the test function $f$ has compact support, the boundary terms vanish. Since the above equation is valid for every test function $f(y)$, the forward Kolmogorov equation follows.
Assume now that the initial distribution of $X_t$ is $\rho_0(x)$ and set $s = 0$ (the initial time) in (5.13). Define
\[
p(y, t) := \int p(y, t \mid x, 0) \rho_0(x) \, dx. \qquad (5.15)
\]
We multiply the forward Kolmogorov equation (5.13) by $\rho_0(x)$ and integrate with respect to $x$ to obtain the equation
\[
\frac{\partial p(y, t)}{\partial t} = -\frac{\partial}{\partial y}\bigl(a(y, t) p(y, t)\bigr) + \frac{1}{2} \frac{\partial^2}{\partial y^2}\bigl(b(y, t) p(y, t)\bigr), \qquad (5.16)
\]
together with the initial condition
\[
p(y, 0) = \rho_0(y). \qquad (5.17)
\]
The solution of equation (5.16) provides us with the probability that the diffusion process $X_t$, which initially was distributed according to the probability density $\rho_0(x)$, is equal to $y$ at time $t$. Alternatively, we can think of the solution to (5.13) as the Green's function for the PDE (5.16).
Using (5.16) we can calculate the expectation of an arbitrary function of the diffusion process $X_t$:
\[
E(f(X_t)) = \int \int f(y) p(y, t \mid x, 0) p(x, 0) \, dx \, dy = \int f(y) p(y, t) \, dy,
\]
where $p(y, t)$ is the solution of (5.16). Quite often we need to calculate joint probability densities, for example the probability that $X_{t_1} = x_1$ and $X_{t_2} = x_2$. From the properties of conditional expectation we have that
\begin{align*}
p(x_1, t_1, x_2, t_2) &= P(X_{t_1} = x_1, X_{t_2} = x_2) \\
&= P(X_{t_1} = x_1 \mid X_{t_2} = x_2) P(X_{t_2} = x_2) \\
&= p(x_1, t_1 \mid x_2, t_2) p(x_2, t_2).
\end{align*}
Using the joint probability density we can calculate the statistics of a function of the diffusion process $X_t$ at times $t$ and $s$:
\[
E(f(X_t, X_s)) = \int \int f(y, x) p(y, t \mid x, s) p(x, s) \, dx \, dy. \qquad (5.18)
\]
The autocorrelation function at times $t$ and $s$ is given by
\[
E(X_t X_s) = \int \int y x \, p(y, t \mid x, s) p(x, s) \, dx \, dy.
\]
In particular,
\[
E(X_t X_0) = \int \int y x \, p(y, t \mid x, 0) p(x, 0) \, dx \, dy.
\]
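As a check of the autocorrelation formula (a sketch for Brownian motion started at $0$, with arbitrary times): $E(W_t W_s) = \min(t, s)$, so the double integral with $p(y, t \mid x, s)$ Gaussian and $p(x, s) = \mathcal{N}(0, s)$ should return $s$ for $t > s$.

```python
import numpy as np

def gauss(y, mean, var):
    return np.exp(-(y - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

s, t = 0.5, 1.2
grid = np.linspace(-10.0, 10.0, 1201)
d = grid[1] - grid[0]
X, Y = np.meshgrid(grid, grid, indexing="ij")   # X: state at time s, Y: state at time t

# E(X_t X_s) = int int y x p(y, t | x, s) p(x, s) dx dy for Brownian motion.
autocorr = np.sum(X * Y * gauss(Y, X, t - s) * gauss(X, 0.0, s)) * d * d
```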
5.4 Multidimensional Diffusion Processes

Let $X_t$ be a diffusion process in $\mathbb{R}^d$. The drift and diffusion coefficients of a diffusion process in $\mathbb{R}^d$ are defined as
\[
\lim_{t \to s} \frac{1}{t - s} \int_{|y - x| < \varepsilon} (y - x) P(dy, t \mid x, s) = a(x, s)
\]
and
\[
\lim_{t \to s} \frac{1}{t - s} \int_{|y - x| < \varepsilon} (y - x) \otimes (y - x) P(dy, t \mid x, s) = b(x, s).
\]
The drift coefficient $a(x, s)$ is a $d$-dimensional vector field and the diffusion coefficient $b(x, s)$ is a $d \times d$ symmetric matrix (second-order tensor). The generator of a $d$-dimensional diffusion process is
\begin{align*}
\mathcal{L} &= a(x, s) \cdot \nabla + \frac{1}{2} b(x, s) : \nabla \nabla \\
&= \sum_{j=1}^d a_j(x, s) \frac{\partial}{\partial x_j} + \frac{1}{2} \sum_{i,j=1}^d b_{ij}(x, s) \frac{\partial^2}{\partial x_i \partial x_j}.
\end{align*}
Exercise 5.4.1. Derive rigorously the forward and backward Kolmogorov equations in arbitrary
dimensions.
Assuming that the first and second moments of the multidimensional diffusion process exist, we can write the formulas for the drift vector and diffusion matrix as
\[ \lim_{t\to s}\mathbb{E}\Big(\frac{X_t - X_s}{t-s}\,\Big|\,X_s = x\Big) = a(x,s) \tag{5.19} \]
and
\[ \lim_{t\to s}\mathbb{E}\Big(\frac{(X_t-X_s)\otimes(X_t-X_s)}{t-s}\,\Big|\,X_s = x\Big) = b(x,s). \tag{5.20} \]
Notice that from the above definition it follows that the diffusion matrix is symmetric and nonnegative definite.
5.5 Connection with Stochastic Differential Equations
Notice also that the continuity condition can be written in the form
\[ \mathbb{P}\big(|X_t - X_s| \geq \varepsilon \,\big|\, X_s = x\big) = o(t-s). \]
Now it becomes clear that this condition implies that the probability of large changes in $X_t$ over short time intervals is small. Notice, on the other hand, that the above condition implies that the sample paths of a diffusion process are not differentiable: if they were, then the right hand side of the above equation would have to be $0$ when $t-s \ll 1$. The sample paths of a diffusion process have the regularity of Brownian paths. A Markovian process cannot be differentiable: we can define the derivative of a sample path only for processes for which the past and future are not statistically independent when conditioned on the present.
Let us denote the expectation conditioned on $X_s = x$ by $\mathbb{E}^{s,x}$. Notice that the definitions of the drift and diffusion coefficients (5.5) and (5.6) can be written in the form
\[ \mathbb{E}^{s,x}(X_t - X_s) = a(x,s)(t-s) + o(t-s) \]
and
\[ \mathbb{E}^{s,x}\big((X_t-X_s)\otimes(X_t-X_s)\big) = b(x,s)(t-s) + o(t-s). \]
Consequently, the drift coefficient defines the mean velocity vector for the stochastic process $X_t$, whereas the diffusion coefficient (tensor) is a measure of the local magnitude of fluctuations of $X_t - X_s$ about the mean value. Hence, we can write locally:
\[ X_t - X_s \approx a(s,X_s)(t-s) + \sigma(s,X_s)\,\xi_t, \]
where $b = \sigma\sigma^T$ and $\xi_t$ is a mean zero Gaussian process with
\[ \mathbb{E}^{s,x}(\xi_t\otimes\xi_s) = (t-s)I. \]
Since we have that
\[ W_t - W_s \sim \mathcal{N}(0,(t-s)I), \]
we conclude that we can write locally:
\[ \Delta X_t \approx a(s,X_s)\Delta t + \sigma(s,X_s)\Delta W_t. \]
Or, replacing the differences by differentials:
\[ dX_t = a(t,X_t)\,dt + \sigma(t,X_t)\,dW_t. \]
Hence, the sample paths of a diffusion process are governed by a stochastic differential equation
(SDE).
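The local approximation above is exactly the Euler–Maruyama discretization of the SDE. A minimal Python sketch (the names and parameter values are ours, and the stationary-variance check is only a rough Monte Carlo illustration):

```python
import math, random

def euler_maruyama(a, sigma, x0, T, n_steps, rng):
    """Simulate dX_t = a(t, X_t) dt + sigma(t, X_t) dW_t using the
    increment rule Delta X ~ a*Delta t + sigma*Delta W."""
    dt = T / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Delta W ~ N(0, dt)
        x += a(t, x) * dt + sigma(t, x) * dW
        t += dt
    return x

# Example: OU process dX = -X dt + dW; the sample variance of many
# long paths should be close to the stationary value 1/2.
rng = random.Random(0)
samples = [euler_maruyama(lambda t, x: -x, lambda t, x: 1.0, 0.0, 10.0, 1000, rng)
           for _ in range(2000)]
var = sum(x * x for x in samples) / len(samples)
```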
5.6 Examples of Diffusion Processes
i. The 1-dimensional Brownian motion starting at $x$ is a diffusion process with generator
\[ \mathcal{L} = \frac{1}{2}\frac{d^2}{dx^2}. \]
The drift and diffusion coefficients are, respectively, $a(x) = 0$ and $b(x) = 1$. The corresponding stochastic differential equation is
\[ dX_t = dW_t, \qquad X_0 = x. \]
The solution of this SDE is
\[ X_t = x + W_t. \]
ii. The 1-dimensional Ornstein–Uhlenbeck process is a diffusion process with drift and diffusion coefficients, respectively, $a(x) = -\alpha x$ and $b(x) = D$. The generator of this process is
\[ \mathcal{L} = -\alpha x\frac{d}{dx} + \frac{D}{2}\frac{d^2}{dx^2}. \]
The corresponding SDE is
\[ dX_t = -\alpha X_t\,dt + \sqrt{D}\,dW_t. \]
The solution to this equation is
\[ X_t = e^{-\alpha t}X_0 + \sqrt{D}\int_0^t e^{-\alpha(t-s)}\,dW_s. \]
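From this solution, the Itô isometry (a standard fact we invoke here, not derived in this section) gives $\mathrm{Var}\,X_t = D\int_0^t e^{-2\alpha(t-s)}\,ds = \frac{D}{2\alpha}(1-e^{-2\alpha t})$ when $X_0$ is deterministic. A small deterministic Python check of this identity (names ours):

```python
import math

def ou_variance(alpha, D, t, n=100000):
    # Ito isometry: Var X_t = D * integral_0^t exp(-2*alpha*(t-s)) ds,
    # evaluated with the trapezoid rule.
    h = t / n
    f = [math.exp(-2 * alpha * (t - i * h)) for i in range(n + 1)]
    integral = h * (sum(f) - 0.5 * (f[0] + f[-1]))
    return D * integral

def ou_variance_closed(alpha, D, t):
    return D / (2 * alpha) * (1 - math.exp(-2 * alpha * t))
```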
5.7 Discussion and Bibliography
The argument used in the derivation of the forward and backward Kolmogorov equations goes back
to Kolmogorov’s original work. More material on diffusion processes can be found in [36], [42].
5.8 Exercises
1. Prove equation (5.14).
2. Derive the initial value problem (5.16), (5.17).
3. Derive rigorously the backward and forward Kolmogorov equations in arbitrary dimensions.
Chapter 6
The FokkerPlanck Equation
6.1 Introduction
In the previous chapter we derived the backward and forward (FokkerPlanck) Kolmogorov equa
tions and we showed that all statistical properties of a diffusion process can be calculated from the
solution of the FokkerPlanck equation.
1
In this long chapter we study various properties of this
equation such as existence and uniqueness of solutions, long time asymptotics, boundary condi
tions and spectral properties of the FokkerPlanck operator. We also study in some detail various
examples of diffusion processes and of the associated FokkerPalnck equation. We will restrict
attention to timehomogeneous diffusion processes, for which the drift and diffusion coefﬁcients
do not depend on time.
In Section 6.2 we study various basic properties of the FokkerPlanck equation, including exis
tence and uniqueness of solutions, writing the equation as a conservation law and boundary condi
tions. In Section 6.3 we present some examples of diffusion processes and use the corresponding
FokkerPlanck equation in order to calculate various quantities of interest such as moments. In
Section 6.4 we study the multidimensional OnrsteinUhlenbeck process and we study the spectral
properties of the corresponding FokkerPlanck operator. In Section 6.5 we study stochastic pro
cesses whose drift is given by the gradient of a scalar function, gradient ﬂows. In Section 6.7 we
solve the FokkerPlanck equation for a gradient SDE using eigenfunction expansions and we show
how the eigenvalue problem for the FokkerPlanck operator can be reduced to the eigenfunction
expansion for a Schr¨ odinger operator. In Section 8.2 we study the Langevin equation and the as
1
In this chapter we will call the equation FokkerPlanck, which is more customary in the physics literature. rather
forward Kolmogorov, which is more customary in the mathematics literature.
87
sociated FokkerPlanck equation. In Section 8.3 we calculate the eigenvalues and eigenfunctions
of the FokkerPlanck operator for the Langevin equation in a harmonic potential. Discussion and
bibliographical remarks are included in Section 6.8. Exercises can be found in Section 6.9.
6.2 Basic Properties of the FP Equation
6.2.1 Existence and Uniqueness of Solutions
Consider a homogeneous diffusion process on $\mathbb{R}^d$ with drift vector $a(x)$ and diffusion matrix $b(x)$. The Fokker–Planck equation is
\[ \frac{\partial p}{\partial t} = -\sum_{j=1}^d \frac{\partial}{\partial x_j}\big(a_j(x)p\big) + \frac{1}{2}\sum_{i,j=1}^d \frac{\partial^2}{\partial x_i\partial x_j}\big(b_{ij}(x)p\big), \quad t>0,\ x\in\mathbb{R}^d, \tag{6.1a} \]
\[ p(x,0) = f(x), \quad x\in\mathbb{R}^d. \tag{6.1b} \]
Since $f(x)$ is the probability density of the initial condition (which is a random variable), we have that
\[ f(x) \geq 0, \quad\text{and}\quad \int_{\mathbb{R}^d} f(x)\,dx = 1. \]
We can also write the equation in non-divergence form:
\[ \frac{\partial p}{\partial t} = \sum_{j=1}^d \tilde a_j(x)\frac{\partial p}{\partial x_j} + \frac{1}{2}\sum_{i,j=1}^d \tilde b_{ij}(x)\frac{\partial^2 p}{\partial x_i\partial x_j} + \tilde c(x)p, \quad t>0,\ x\in\mathbb{R}^d, \tag{6.2a} \]
\[ p(x,0) = f(x), \quad x\in\mathbb{R}^d, \tag{6.2b} \]
where
\[ \tilde a_i(x) = -a_i(x) + \sum_{j=1}^d \frac{\partial b_{ij}}{\partial x_j}, \qquad \tilde c(x) = \frac{1}{2}\sum_{i,j=1}^d \frac{\partial^2 b_{ij}}{\partial x_i\partial x_j} - \sum_{i=1}^d \frac{\partial a_i}{\partial x_i}. \]
By definition (see equation (5.20)), the diffusion matrix is always symmetric and nonnegative. We will assume that it is actually uniformly positive definite, i.e. we will impose the uniform ellipticity condition:
\[ \sum_{i,j=1}^d b_{ij}(x)\xi_i\xi_j \geq \alpha\|\xi\|^2, \quad \forall\,\xi\in\mathbb{R}^d. \tag{6.3} \]
Furthermore, we will assume that the coefficients $\tilde a$, $b$, $\tilde c$ are smooth and that they satisfy the growth conditions
\[ \|b(x)\| \leq M, \quad \|\tilde a(x)\| \leq M(1+\|x\|), \quad \|\tilde c(x)\| \leq M(1+\|x\|^2). \tag{6.4} \]
Definition 6.2.1. We will call a solution to the Cauchy problem for the Fokker–Planck equation (6.2) a classical solution if:

i. $u \in C^{2,1}(\mathbb{R}^d\times\mathbb{R}^+)$.

ii. For every $T > 0$ there exists a $c > 0$ such that
\[ \|u(t,\cdot)\|_{L^\infty(0,T)} \leq c\,e^{\alpha\|x\|^2}. \]

iii. $\lim_{t\to 0} u(t,x) = f(x)$.
It is a standard result in the theory of parabolic partial differential equations that, under the above regularity and uniform ellipticity assumptions, the Fokker–Planck equation has a unique smooth solution. Furthermore, the solution can be estimated in terms of an appropriate heat kernel (i.e. the solution of the heat equation on $\mathbb{R}^d$).
Theorem 6.2.2. Assume that conditions (6.3) and (6.4) are satisfied, and assume that $|f| \leq c\,e^{\alpha\|x\|^2}$. Then there exists a unique classical solution to the Cauchy problem for the Fokker–Planck equation. Furthermore, there exist positive constants $K$, $\delta$ so that
\[ |p|,\ |p_t|,\ \|\nabla p\|,\ \|D^2 p\| \leq K\,t^{(-n+2)/2}\exp\Big(-\frac{1}{2t}\delta\|x\|^2\Big). \tag{6.5} \]
Notice that from estimates (6.5) it follows that all moments of a uniformly elliptic diffusion process exist. In particular, we can multiply the Fokker–Planck equation by monomials $x^n$, integrate over $\mathbb{R}^d$ and integrate by parts; no boundary terms will appear, in view of the estimate (6.5).

Remark 6.2.3. The solution of the Fokker–Planck equation is nonnegative for all times, provided that the initial distribution is nonnegative. This follows from the maximum principle for parabolic PDEs.
6.2.2 The FP equation as a conservation law
The Fokker–Planck equation is in fact a conservation law: it expresses the law of conservation of probability. To see this we define the probability current to be the vector whose $i$th component is
\[ J_i := a_i(x)p - \frac{1}{2}\sum_{j=1}^d \frac{\partial}{\partial x_j}\big(b_{ij}(x)p\big). \tag{6.6} \]
We use the probability current to write the Fokker–Planck equation as a continuity equation:
\[ \frac{\partial p}{\partial t} + \nabla\cdot J = 0. \]
Integrating the FP equation over $\mathbb{R}^d$ and integrating by parts on the right hand side of the equation we obtain
\[ \frac{d}{dt}\int_{\mathbb{R}^d} p(x,t)\,dx = 0. \]
Consequently:
\[ \|p(\cdot,t)\|_{L^1(\mathbb{R}^d)} = \|p(\cdot,0)\|_{L^1(\mathbb{R}^d)} = 1. \tag{6.7} \]
Hence, the total probability is conserved, as expected. Equation (6.7) simply means that
\[ \mathbb{P}(X_t \in \mathbb{R}^d) = 1, \quad t \geq 0. \]
6.2.3 Boundary conditions for the Fokker–Planck equation
When studying a diffusion process that can take values on the whole of $\mathbb{R}^d$, we study the pure initial value (Cauchy) problem for the Fokker–Planck equation, equation (6.1). The boundary condition was that the solution decays sufficiently fast at infinity. For ergodic diffusion processes this is equivalent to requiring that the solution of the backward Kolmogorov equation is an element of $L^2(\mu)$, where $\mu$ is the invariant measure of the process. There are many applications where it is important to study stochastic processes in bounded domains. In this case it is necessary to specify the value of the stochastic process (or equivalently of the solution to the Fokker–Planck equation) on the boundary.

To understand the type of boundary conditions that we can impose on the Fokker–Planck equation, let us consider the example of a random walk on the domain $\{0,1,\dots,N\}$.² When the random walker reaches either the left or the right boundary we can either set

i. $X_0 = 0$ or $X_N = 0$, which means that the particle gets absorbed at the boundary;

ii. $X_0 = X_1$ or $X_N = X_{N-1}$, which means that the particle is reflected at the boundary;

iii. $X_0 = X_N$, which means that the particle is moving on a circle (i.e., we identify the left and right boundaries).

²Of course, the random walk is not a diffusion process. However, as we have already seen, Brownian motion can be defined as the limit of an appropriately rescaled random walk. A similar construction exists for more general diffusion processes.
Hence, we can have absorbing, reflecting or periodic boundary conditions.

Consider the Fokker–Planck equation posed in $\Omega\subset\mathbb{R}^d$, where $\Omega$ is a bounded domain with smooth boundary. Let $J$ denote the probability current and let $n$ be the unit outward pointing normal vector to the surface. The above boundary conditions become:

i. The transition probability density vanishes on an absorbing boundary:
\[ p(x,t) = 0, \quad\text{on } \partial\Omega. \]

ii. There is no net flow of probability on a reflecting boundary:
\[ n\cdot J(x,t) = 0, \quad\text{on } \partial\Omega. \]

iii. The transition probability density is a periodic function in the case of periodic boundary conditions.

Notice that, using the terminology customary in PDE theory, absorbing boundary conditions correspond to Dirichlet boundary conditions and reflecting boundary conditions correspond to Neumann boundary conditions. Of course, one can consider more complicated, mixed boundary conditions.

Consider now a diffusion process in one dimension on the interval $[0,L]$. The boundary conditions are
\[ p(0,t) = p(L,t) = 0 \quad\text{(absorbing)}, \]
\[ J(0,t) = J(L,t) = 0 \quad\text{(reflecting)}, \]
\[ p(0,t) = p(L,t) \quad\text{(periodic)}, \]
where the probability current is defined in (6.6). An example of mixed boundary conditions would be absorbing boundary conditions at the left end and reflecting boundary conditions at the right end:
\[ p(0,t) = J(L,t) = 0. \]
There is a complete classification of boundary conditions in one dimension, the Feller classification: the boundary conditions can be regular, exit, entrance or natural.
6.3 Examples of Diffusion Processes
6.3.1 Brownian Motion
Brownian Motion on R
Set $a(y,t) \equiv 0$, $b(y,t) \equiv 2D > 0$. This diffusion process is the Brownian motion with diffusion coefficient $D$. Let us calculate the transition probability density of this process, assuming that the Brownian particle is at $y$ at time $s$. The Fokker–Planck equation for the transition probability density $p(x,t|y,s)$ is:
\[ \frac{\partial p}{\partial t} = D\frac{\partial^2 p}{\partial x^2}, \qquad p(x,s|y,s) = \delta(x-y). \tag{6.8} \]
The solution to this equation is the Green's function (fundamental solution) of the heat equation:
\[ p(x,t|y,s) = \frac{1}{\sqrt{4\pi D(t-s)}}\exp\Big(-\frac{(x-y)^2}{4D(t-s)}\Big). \tag{6.9} \]
Notice that using the Fokker–Planck equation for the Brownian motion we can immediately show that the mean squared displacement grows linearly in time. Assuming that the Brownian particle is at the origin at time $t=0$ we get
\[ \frac{d}{dt}\mathbb{E}W_t^2 = \frac{d}{dt}\int_{\mathbb{R}} x^2 p(x,t|0,0)\,dx = D\int_{\mathbb{R}} x^2\frac{\partial^2 p(x,t|0,0)}{\partial x^2}\,dx = 2D\int_{\mathbb{R}} p(x,t|0,0)\,dx = 2D, \]
where we performed two integrations by parts and used the fact that, in view of (6.9), no boundary terms remain. From this calculation we conclude that
\[ \mathbb{E}W_t^2 = 2Dt. \]
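We can also confirm $\mathbb{E}W_t^2 = 2Dt$ directly from the heat kernel (6.9) by quadrature; the following minimal Python sketch (names and discretization ours) computes $\int x^2 p(x,t|0,0)\,dx$ with a midpoint rule:

```python
import math

def msd(D, t, h=0.01):
    # E W_t^2 = integral of x^2 * heat kernel (6.9) with y = s = 0,
    # midpoint rule on [-L, L], L about ten standard deviations.
    L = 10 * math.sqrt(2 * D * t)
    n = int(2 * L / h)
    total = 0.0
    for i in range(n):
        x = -L + (i + 0.5) * h
        p = math.exp(-x * x / (4 * D * t)) / math.sqrt(4 * math.pi * D * t)
        total += x * x * p
    return total * h

# Should reproduce E W_t^2 = 2 D t.
```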
Assume now that the initial condition $W_0$ of the Brownian particle is a random variable with distribution $\rho_0(x)$. To calculate the probability density function (distribution function) of the Brownian particle we need to solve the Fokker–Planck equation with initial condition $\rho_0(x)$. In other words, we need to take the average of the probability density function $p(x,t|y,0)$ over all initial realizations of the Brownian particle. The solution of the Fokker–Planck equation, the distribution function, is
\[ p(x,t) = \int p(x,t|y,0)\rho_0(y)\,dy. \tag{6.10} \]
Notice that the transition probability density depends on $x$ and $y$ only through their difference. Thus, we can write $p(x,t|y,0) = p(x-y,t)$. From (6.10) we see that the distribution function is given by the convolution between the transition probability density and the initial condition, as we know from the theory of partial differential equations:
\[ p(x,t) = \int p(x-y,t)\rho_0(y)\,dy =: p\star\rho_0. \]
Brownian motion with absorbing boundary conditions
We can also consider Brownian motion in a bounded domain, with either absorbing, reflecting or periodic boundary conditions. Set $D = \tfrac{1}{2}$ and consider the Fokker–Planck equation (6.8) on $[0,1]$ with absorbing boundary conditions:
\[ \frac{\partial p}{\partial t} = \frac{1}{2}\frac{\partial^2 p}{\partial x^2}, \qquad p(0,t) = p(1,t) = 0. \tag{6.11} \]
We look for a solution to this equation in a sine Fourier series:
\[ p(x,t) = \sum_{n=1}^{\infty} p_n(t)\sin(n\pi x). \tag{6.12} \]
Notice that the boundary conditions are automatically satisfied. The initial condition is
\[ p(x,0) = \delta(x-x_0), \]
where we have assumed that $W_0 = x_0$. The Fourier coefficients of the initial condition are
\[ p_n(0) = 2\int_0^1 \delta(x-x_0)\sin(n\pi x)\,dx = 2\sin(n\pi x_0). \]
We substitute the expansion (6.12) into (6.11) and use the orthogonality properties of the Fourier basis to obtain the equations
\[ \dot p_n = -\frac{n^2\pi^2}{2}p_n, \quad n = 1,2,\dots \]
The solution of this equation is
\[ p_n(t) = p_n(0)\,e^{-\frac{n^2\pi^2}{2}t}. \]
Consequently, the transition probability density for the Brownian motion on $[0,1]$ with absorbing boundary conditions is
\[ p(x,t|x_0,0) = 2\sum_{n=1}^{\infty} e^{-\frac{n^2\pi^2}{2}t}\sin(n\pi x_0)\sin(n\pi x). \]
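This series is easy to evaluate numerically. The sketch below (names, truncation level and step size are our own choices) computes the truncated series and the survival probability $\int_0^1 p(x,t|x_0,0)\,dx$, which should lie strictly between 0 and 1 and decrease in time as mass is absorbed at the boundary:

```python
import math

def p_absorbing(x, t, x0, n_terms=200):
    # truncated sine series for the transition density on [0,1]
    # with absorbing boundaries (D = 1/2)
    return 2 * sum(math.exp(-n * n * math.pi ** 2 * t / 2)
                   * math.sin(n * math.pi * x0) * math.sin(n * math.pi * x)
                   for n in range(1, n_terms + 1))

def survival_probability(t, x0, h=0.001):
    # integral_0^1 p(x,t|x0,0) dx: probability of not yet being absorbed
    return h * sum(p_absorbing((i + 0.5) * h, t, x0) for i in range(int(1 / h)))
```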
Notice that
\[ \lim_{t\to\infty} p(x,t|x_0,0) = 0. \]
This is not surprising, since all Brownian particles will eventually get absorbed at the boundary.
Brownian Motion with Reﬂecting Boundary Condition
Consider now Brownian motion on the interval $[0,1]$ with reflecting boundary conditions and set $D = \tfrac{1}{2}$ for simplicity. In order to calculate the transition probability density we have to solve the Fokker–Planck equation, which is the heat equation on $[0,1]$ with Neumann boundary conditions:
\[ \frac{\partial p}{\partial t} = \frac{1}{2}\frac{\partial^2 p}{\partial x^2}, \qquad \partial_x p(0,t) = \partial_x p(1,t) = 0, \qquad p(x,0) = \delta(x-x_0). \]
The boundary conditions are satisfied by functions of the form $\cos(n\pi x)$. We look for a solution in the form of a cosine Fourier series
\[ p(x,t) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n(t)\cos(n\pi x). \]
From the initial conditions we obtain
\[ a_n(0) = 2\int_0^1 \cos(n\pi x)\delta(x-x_0)\,dx = 2\cos(n\pi x_0). \]
We substitute the expansion into the PDE and use the orthogonality of the Fourier basis to obtain the equations for the Fourier coefficients:
\[ \dot a_n = -\frac{n^2\pi^2}{2}a_n, \]
from which we deduce that
\[ a_n(t) = a_n(0)\,e^{-\frac{n^2\pi^2}{2}t}. \]
Consequently,
\[ p(x,t|x_0,0) = 1 + 2\sum_{n=1}^{\infty} \cos(n\pi x_0)\cos(n\pi x)\,e^{-\frac{n^2\pi^2}{2}t}. \]
Notice that Brownian motion with reflecting boundary conditions is an ergodic Markov process. To see this, let us consider the stationary Fokker–Planck equation
\[ \frac{\partial^2 p_s}{\partial x^2} = 0, \qquad \partial_x p_s(0) = \partial_x p_s(1) = 0. \]
The unique normalized solution to this boundary value problem is $p_s(x) = 1$. Indeed, we multiply the equation by $p_s$, integrate by parts and use the boundary conditions to obtain
\[ \int_0^1 \Big|\frac{dp_s}{dx}\Big|^2\,dx = 0, \]
from which it follows that $p_s(x) = 1$. Alternatively, by taking the limit of $p(x,t|x_0,0)$ as $t\to\infty$ we obtain the invariant distribution:
\[ \lim_{t\to\infty} p(x,t|x_0,0) = 1. \]
Now we can calculate the stationary autocorrelation function:
\[ \mathbb{E}(W(t)W(0)) = \int_0^1\!\!\int_0^1 x x_0\,p(x,t|x_0,0)p_s(x_0)\,dx\,dx_0 = \int_0^1\!\!\int_0^1 x x_0\Big(1 + 2\sum_{n=1}^{\infty}\cos(n\pi x_0)\cos(n\pi x)\,e^{-\frac{n^2\pi^2}{2}t}\Big)\,dx\,dx_0 = \frac{1}{4} + \frac{8}{\pi^4}\sum_{n=0}^{+\infty}\frac{1}{(2n+1)^4}\,e^{-\frac{(2n+1)^2\pi^2}{2}t}. \]
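The closed-form series can be checked against a direct double quadrature of the integrand; the sketch below (names and discretization ours) compares the two at one time value:

```python
import math

def corr_series(t, n_max=50):
    # closed-form series for E(W(t)W(0)) with reflecting BCs on [0,1]
    return 0.25 + (8 / math.pi ** 4) * sum(
        math.exp(-(2 * n + 1) ** 2 * math.pi ** 2 * t / 2) / (2 * n + 1) ** 4
        for n in range(n_max))

def corr_quadrature(t, h=0.01, n_modes=10):
    # direct double integral of x*x0*p(x,t|x0,0) over [0,1]^2 (p_s = 1),
    # midpoint rule; only the slowest-decaying modes matter for t of order 1
    m = int(1 / h)
    total = 0.0
    for i in range(m):
        x0 = (i + 0.5) * h
        for j in range(m):
            x = (j + 0.5) * h
            p = 1 + 2 * sum(math.cos(n * math.pi * x0) * math.cos(n * math.pi * x)
                            * math.exp(-n * n * math.pi ** 2 * t / 2)
                            for n in range(1, n_modes + 1))
            total += x * x0 * p
    return total * h * h
```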
6.3.2 The Ornstein–Uhlenbeck Process

We set now $a(x,t) = -\alpha x$, $b(x,t) = 2D > 0$. With these drift and diffusion coefficients the Fokker–Planck equation becomes
\[ \frac{\partial p}{\partial t} = \alpha\frac{\partial(xp)}{\partial x} + D\frac{\partial^2 p}{\partial x^2}. \tag{6.13} \]
This is the Fokker–Planck equation for the Ornstein–Uhlenbeck process. The corresponding stochastic differential equation is
\[ dX_t = -\alpha X_t\,dt + \sqrt{2D}\,dW_t. \]
So, in addition to Brownian motion there is a linear force pulling the particle towards the origin.
We know that Brownian motion is not a stationary process, since the variance grows linearly in
time. By adding a linear damping term, it is reasonable to expect that the resulting process can be
stationary. As we have already seen, this is indeed the case.
The transition probability density $p_{OU}(x,t|y,s)$ for an OU particle that is located at $y$ at time $s$ is
\[ p_{OU}(x,t|y,s) = \sqrt{\frac{\alpha}{2\pi D(1-e^{-2\alpha(t-s)})}}\exp\Big(-\frac{\alpha(x-e^{-\alpha(t-s)}y)^2}{2D(1-e^{-2\alpha(t-s)})}\Big). \tag{6.14} \]
We obtained this formula in Example 4.2.4 (for $\alpha = D = 1$) by using the fact that the OU process can be defined through a time change of the Brownian motion. We can also derive it by solving equation (6.13). To obtain (6.14), we first take the Fourier transform of the transition probability density with respect to $x$, solve the resulting first order PDE using the method of characteristics and then take the inverse Fourier transform.³

Notice that from formula (6.14) it immediately follows that in the limit as the friction coefficient $\alpha$ goes to 0, the transition probability of the OU process converges to the transition probability of Brownian motion. Furthermore, by taking the long time limit in (6.14) we obtain (we have set $s = 0$)
\[ \lim_{t\to+\infty} p_{OU}(x,t|y,0) = \sqrt{\frac{\alpha}{2\pi D}}\exp\Big(-\frac{\alpha x^2}{2D}\Big), \]
irrespective of the initial position $y$ of the OU particle. This is to be expected, since as we have already seen the Ornstein–Uhlenbeck process is an ergodic Markov process, with a Gaussian invariant distribution
\[ p_s(x) = \sqrt{\frac{\alpha}{2\pi D}}\exp\Big(-\frac{\alpha x^2}{2D}\Big). \tag{6.15} \]
Using now (6.14) and (6.15) we obtain the stationary joint probability density
\[ p_2(x,t|y,0) = p(x,t|y,0)p_s(y) = \frac{\alpha}{2\pi D\sqrt{1-e^{-2\alpha t}}}\exp\Big(-\frac{\alpha(x^2+y^2-2xye^{-\alpha t})}{2D(1-e^{-2\alpha t})}\Big). \]
More generally, we have
\[ p_2(x,t|y,s) = \frac{\alpha}{2\pi D\sqrt{1-e^{-2\alpha|t-s|}}}\exp\Big(-\frac{\alpha(x^2+y^2-2xye^{-\alpha|t-s|})}{2D(1-e^{-2\alpha|t-s|})}\Big). \tag{6.16} \]
Now we can calculate the stationary autocorrelation function of the OU process:
\[ \mathbb{E}(X(t)X(s)) = \int\!\!\int xy\,p_2(x,t|y,s)\,dx\,dy \tag{6.17} \]
\[ = \frac{D}{\alpha}\,e^{-\alpha|t-s|}. \tag{6.18} \]
In order to calculate the double integral we need to perform an appropriate change of variables.
The calculation is similar to the one presented in Section 2.6. See Exercise 2.
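Before carrying out that change of variables, the result can be checked numerically: the sketch below (our own names and discretization) evaluates the double integral of $xy\,p_2$ over a truncated square with a midpoint rule and compares it with $\frac{D}{\alpha}e^{-\alpha|t-s|}$:

```python
import math

def p2(x, y, tau, alpha, D):
    # stationary joint density (6.16), with tau = |t - s|
    r = math.exp(-alpha * tau)
    denom = 2 * D * (1 - r * r) / alpha
    norm = alpha / (2 * math.pi * D * math.sqrt(1 - r * r))
    return norm * math.exp(-(x * x + y * y - 2 * x * y * r) / denom)

def ou_autocorrelation(tau, alpha, D, L=6.0, h=0.05):
    # midpoint rule for iint x*y*p2 dx dy on [-L, L]^2
    n = int(2 * L / h)
    total = 0.0
    for i in range(n):
        x = -L + (i + 0.5) * h
        for j in range(n):
            y = -L + (j + 0.5) * h
            total += x * y * p2(x, y, tau, alpha, D)
    return total * h * h
```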
³This calculation will be presented in Section ?? for the Fokker–Planck equation of a linear SDE in arbitrary dimensions.
Assume that the initial position of the OU particle is a random variable distributed according to a distribution $\rho_0(x)$. As in the case of a Brownian particle, the probability density function (distribution function) is obtained by averaging the transition density over the initial condition:
\[ p(x,t) = \int p(x,t|y,0)\rho_0(y)\,dy. \tag{6.19} \]
(Note that, unlike the Brownian transition density, $p(x,t|y,0)$ for the OU process is not a function of $x-y$ alone, so this integral is not a convolution.) When the OU process is distributed initially according to its invariant distribution, $\rho_0(x) = p_s(x)$ given by (6.15), then the Ornstein–Uhlenbeck process becomes stationary. The distribution function is given by $p_s(x)$ at all times and the joint probability density is given by (6.16).

Knowledge of the distribution function enables us to calculate all moments of the OU process using the formula
\[ \mathbb{E}((X_t)^n) = \int x^n p(x,t)\,dx. \]
We will calculate the moments by using the Fokker–Planck equation, rather than the explicit formula for the transition probability density. Let $M_n(t)$ denote the $n$th moment of the OU process,
\[ M_n := \int_{\mathbb{R}} x^n p(x,t)\,dx, \quad n = 0,1,2,\dots \]
Let $n = 0$. We integrate the FP equation over $\mathbb{R}$ to obtain:
\[ \int \frac{\partial p}{\partial t}\,dx = \alpha\int \frac{\partial(xp)}{\partial x}\,dx + D\int \frac{\partial^2 p}{\partial x^2}\,dx = 0, \]
after an integration by parts and using the fact that $p(x,t)$ decays sufficiently fast at infinity. Consequently:
\[ \frac{d}{dt}M_0 = 0 \quad\Rightarrow\quad M_0(t) = M_0(0) = 1. \]
In other words, since
\[ \frac{d}{dt}\|p\|_{L^1(\mathbb{R})} = 0, \]
we deduce that
\[ \int_{\mathbb{R}} p(x,t)\,dx = \int_{\mathbb{R}} p(x,0)\,dx = 1, \]
which means that the total probability is conserved, as we have already shown for the general Fokker–Planck equation in arbitrary dimensions. Let $n = 1$. We multiply the FP equation for the OU process by $x$, integrate over $\mathbb{R}$ and perform an integration by parts to obtain:
\[ \frac{d}{dt}M_1 = -\alpha M_1. \]
Consequently, the first moment converges exponentially fast to 0:
\[ M_1(t) = e^{-\alpha t}M_1(0). \]
Let now $n \geq 2$. We multiply the FP equation for the OU process by $x^n$ and integrate by parts (once on the first term on the RHS and twice on the second) to obtain:
\[ \frac{d}{dt}\int x^n p\,dx = -\alpha n\int x^n p\,dx + Dn(n-1)\int x^{n-2}p\,dx. \]
Or, equivalently:
\[ \frac{d}{dt}M_n = -\alpha n M_n + Dn(n-1)M_{n-2}, \quad n \geq 2. \]
This is a first order linear inhomogeneous differential equation. We can solve it using the variation of constants formula:
\[ M_n(t) = e^{-\alpha n t}M_n(0) + Dn(n-1)\int_0^t e^{-\alpha n(t-s)}M_{n-2}(s)\,ds. \tag{6.20} \]
We can use this formula, together with the formulas for the first two moments, in order to calculate all higher order moments in an iterative way. For example, for $n = 2$ we have
\[ M_2(t) = e^{-2\alpha t}M_2(0) + 2D\int_0^t e^{-2\alpha(t-s)}M_0(s)\,ds = e^{-2\alpha t}M_2(0) + \frac{D}{\alpha}e^{-2\alpha t}(e^{2\alpha t}-1) = \frac{D}{\alpha} + e^{-2\alpha t}\Big(M_2(0) - \frac{D}{\alpha}\Big). \]
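The moment equation is an ordinary ODE, so the closed form is easy to cross-check by direct time stepping; a minimal sketch (names, scheme and step count are ours):

```python
import math

def m2_numeric(alpha, D, m2_0, t, n=200000):
    # forward Euler for dM2/dt = -2*alpha*M2 + 2*D*M0, with M0 = 1
    h = t / n
    m2 = m2_0
    for _ in range(n):
        m2 += h * (-2 * alpha * m2 + 2 * D)
    return m2

def m2_closed(alpha, D, m2_0, t):
    # variation-of-constants solution obtained from (6.20)
    return D / alpha + math.exp(-2 * alpha * t) * (m2_0 - D / alpha)
```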
Consequently, the second moment converges exponentially fast to its stationary value $\frac{D}{\alpha}$. The stationary moments of the OU process are:
\[ \langle y^n\rangle_{OU} := \sqrt{\frac{\alpha}{2\pi D}}\int_{\mathbb{R}} y^n e^{-\frac{\alpha y^2}{2D}}\,dy = \begin{cases} 1\cdot 3\cdots(n-1)\,\big(\frac{D}{\alpha}\big)^{n/2}, & n \text{ even},\\[2pt] 0, & n \text{ odd}. \end{cases} \]
It is not hard to check (see Exercise 3) that
\[ \lim_{t\to\infty} M_n(t) = \langle y^n\rangle_{OU} \tag{6.21} \]
exponentially fast.⁴ Since we have already shown that the distribution function of the OU process converges to the Gaussian distribution in the limit as $t\to+\infty$, it is not surprising that the moments also converge to the moments of the invariant Gaussian measure. What is not so obvious is that the convergence is exponentially fast. In the next section we will prove that the Ornstein–Uhlenbeck process does, indeed, converge to equilibrium exponentially fast. Of course, if the initial conditions of the OU process are stationary, then the moments of the OU process become independent of time and are given by their equilibrium values
\[ M_n(t) = M_n(0) = \langle x^n\rangle_{OU}. \tag{6.22} \]
6.3.3 The Geometric Brownian Motion
We set $a(x) = \mu x$, $b(x) = \sigma^2 x^2$. This is the geometric Brownian motion. The corresponding stochastic differential equation is
\[ dX_t = \mu X_t\,dt + \sigma X_t\,dW_t. \]
This equation is one of the basic models in mathematical finance. The coefficient $\sigma$ is called the volatility. The generator of this process is
\[ \mathcal{L} = \mu x\frac{\partial}{\partial x} + \frac{\sigma^2 x^2}{2}\frac{\partial^2}{\partial x^2}. \]
Notice that this operator is not uniformly elliptic. The Fokker–Planck equation of the geometric Brownian motion is:
\[ \frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}(\mu x p) + \frac{\partial^2}{\partial x^2}\Big(\frac{\sigma^2 x^2}{2}p\Big). \]
We can easily obtain an equation for the $n$th moment of the geometric Brownian motion:
\[ \frac{d}{dt}M_n = \Big(\mu n + \frac{\sigma^2}{2}n(n-1)\Big)M_n, \quad n \geq 2. \]
The solution of this equation is
\[ M_n(t) = e^{(\mu+(n-1)\frac{\sigma^2}{2})nt}M_n(0), \quad n \geq 2, \]
and
\[ M_1(t) = e^{\mu t}M_1(0). \]
⁴Of course, we need to assume that the initial distribution has finite moments of all orders in order to justify the above calculations.
Notice that the $n$th moment might diverge as $t\to\infty$, depending on the values of $\mu$ and $\sigma$. Consider for example the second moment and assume that $\mu < 0$. We have
\[ M_2(t) = e^{(2\mu+\sigma^2)t}M_2(0), \]
which diverges when $\sigma^2 + 2\mu > 0$.
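The moment growth rates are simple enough to tabulate in code; the tiny sketch below (function name ours) encodes the rate $n\mu + \frac{\sigma^2}{2}n(n-1)$ and illustrates the case where the mean decays while the second moment explodes:

```python
def moment_growth_rate(n, mu, sigma):
    # d/dt M_n = rate * M_n  with  rate = n*mu + sigma^2 * n*(n-1)/2;
    # M_n(t) diverges as t -> infinity iff rate > 0
    return n * mu + sigma ** 2 * n * (n - 1) / 2

# mu = -1, sigma = 2: the mean decays (rate -1) while the second moment
# grows (rate 2*mu + sigma^2 = 2), so the variance explodes.
```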
6.4 The Ornstein–Uhlenbeck Process and Hermite Polynomials

The Ornstein–Uhlenbeck process is one of the few stochastic processes for which we can calculate explicitly the solution of the corresponding SDE, the solution of the Fokker–Planck equation, as well as the eigenfunctions of the generator of the process. In this section we will show that the eigenfunctions of the OU process are the Hermite polynomials. We will also study various properties of the generator of the OU process. In the next section we will show that many of the properties of the OU process (ergodicity, self-adjointness of the generator, exponentially fast convergence to equilibrium, real discrete spectrum) are shared by a large class of diffusion processes, namely those for which the drift term can be written in terms of the gradient of a smooth function.

The generator of the $d$-dimensional OU process is (we set the drift coefficient equal to 1)
\[ \mathcal{L} = -p\cdot\nabla_p + \beta^{-1}\Delta_p, \tag{6.23} \]
where $\beta$ denotes the inverse temperature. We have already seen that the OU process is an ergodic Markov process whose unique invariant measure is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^d$ with Gaussian density $\rho\in C^\infty(\mathbb{R}^d)$,
\[ \rho_\beta(p) = \frac{1}{(2\pi\beta^{-1})^{d/2}}\,e^{-\beta\frac{|p|^2}{2}}. \]
The natural function space for studying the generator of the OU process is the $L^2$ space weighted by the invariant measure of the process. This is a separable Hilbert space with norm
\[ \|f\|_\rho^2 := \int_{\mathbb{R}^d} f^2\rho_\beta\,dp \]
and corresponding inner product
\[ (f,h)_\rho = \int_{\mathbb{R}^d} fh\,\rho_\beta\,dp. \]
Similarly, we can define weighted $L^2$ spaces involving derivatives, i.e. weighted Sobolev spaces. See Exercise .
The reason why this is the right function space in which to study questions related to convergence to equilibrium is that the generator of the OU process becomes a self-adjoint operator in this space. In fact, $\mathcal{L}$ defined in (6.23) has many nice properties that are summarized in the following proposition.
Proposition 6.4.1. The operator $\mathcal{L}$ has the following properties:

i. For every $f,h \in C_0^2(\mathbb{R}^d)\cap L_\rho^2(\mathbb{R}^d)$,
\[ (\mathcal{L}f,h)_\rho = (f,\mathcal{L}h)_\rho = -\beta^{-1}\int_{\mathbb{R}^d}\nabla f\cdot\nabla h\,\rho_\beta\,dp. \tag{6.24} \]

ii. $\mathcal{L}$ is a nonpositive operator on $L_\rho^2$.

iii. $\mathcal{L}f = 0$ iff $f \equiv \mathrm{const}$.

iv. For every $f \in C_0^2(\mathbb{R}^d)\cap L_\rho^2(\mathbb{R}^d)$ with $\int f\rho_\beta = 0$,
\[ (-\mathcal{L}f,f)_\rho \geq \|f\|_\rho^2. \tag{6.25} \]
Proof. Equation (6.24) follows from an integration by parts (using $\nabla_p\rho_\beta = -\beta p\,\rho_\beta$, so that the boundary term produced by the Laplacian cancels the drift term):
\[ (\mathcal{L}f,h)_\rho = \int -p\cdot\nabla f\,h\rho_\beta\,dp + \beta^{-1}\int \Delta f\,h\rho_\beta\,dp = \int -p\cdot\nabla f\,h\rho_\beta\,dp - \beta^{-1}\int \nabla f\cdot\nabla h\,\rho_\beta\,dp + \int p\cdot\nabla f\,h\rho_\beta\,dp = -\beta^{-1}(\nabla f,\nabla h)_\rho. \]
Nonpositivity of $\mathcal{L}$ follows from (6.24) upon setting $h = f$:
\[ (\mathcal{L}f,f)_\rho = -\beta^{-1}\|\nabla f\|_\rho^2 \leq 0. \]
Similarly, multiplying the equation $\mathcal{L}f = 0$ by $f\rho_\beta$, integrating over $\mathbb{R}^d$ and using (6.24) gives
\[ \|\nabla f\|_\rho = 0, \]
from which we deduce that $f \equiv \mathrm{const}$. The spectral gap follows from (6.24), together with Poincaré's inequality for Gaussian measures:
\[ \int_{\mathbb{R}^d} f^2\rho_\beta\,dp \leq \beta^{-1}\int_{\mathbb{R}^d}|\nabla f|^2\rho_\beta\,dp \tag{6.26} \]
for every $f \in H^1(\mathbb{R}^d;\rho_\beta)$ with $\int f\rho_\beta = 0$. Indeed, upon combining (6.24) with (6.26) we obtain:
\[ (\mathcal{L}f,f)_\rho = -\beta^{-1}\|\nabla f\|_\rho^2 \leq -\|f\|_\rho^2. \]
The spectral gap of the generator of the OU process, which is equivalent to the compactness of its resolvent, implies that $\mathcal{L}$ has discrete spectrum. Furthermore, since it is also a self-adjoint operator, we have that its eigenfunctions form a countable orthonormal basis for the separable Hilbert space $L_\rho^2$. In fact, we can calculate the eigenvalues and eigenfunctions of the generator of the OU process in one dimension.⁵
Theorem 6.4.2. Consider the eigenvalue problem for the generator of the OU process in one dimension
\[ -\mathcal{L}f_n = \lambda_n f_n. \tag{6.27} \]
Then the eigenvalues of $\mathcal{L}$ are the nonnegative integers:
\[ \lambda_n = n, \quad n = 0,1,2,\dots \]
The corresponding eigenfunctions are the normalized Hermite polynomials:
\[ f_n(p) = \frac{1}{\sqrt{n!}}H_n\big(\sqrt{\beta}\,p\big), \tag{6.28} \]
where
\[ H_n(p) = (-1)^n e^{\frac{p^2}{2}}\frac{d^n}{dp^n}\Big(e^{-\frac{p^2}{2}}\Big). \tag{6.29} \]
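The eigenvalue identity can be verified symbolically with exact polynomial arithmetic. The sketch below (our own code; we take $\beta = 1$, so $\mathcal{L}f = f'' - pf'$, and generate the probabilists' Hermite polynomials from the standard three-term recurrence $H_{k+1} = pH_k - kH_{k-1}$, which is derived later in this section):

```python
def hermite(n):
    # probabilists' Hermite polynomials via H_{k+1} = p*H_k - k*H_{k-1};
    # a polynomial is a list of coefficients, index = power of p
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for k in range(1, n):
        shifted = [0] + cur                        # p * H_k
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [a - k * b for a, b in zip(shifted, padded)]
    return cur

def deriv(c):
    return [k * c[k] for k in range(1, len(c))]

def apply_L(c):
    # generator of the OU process for beta = 1: (Lf)(p) = f''(p) - p*f'(p)
    d2 = deriv(deriv(c))
    pd = [0] + deriv(c)                            # p * f'(p)
    m = max(len(d2), len(pd), 1)
    d2 += [0] * (m - len(d2))
    pd += [0] * (m - len(pd))
    return [a - b for a, b in zip(d2, pd)]

# Check -L H_n = n H_n for the first few n.
```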
For the subsequent calculations we will need some additional properties of Hermite polynomials, which we state here without proof (we use the notation $\rho_1 = \rho$).

Proposition 6.4.3. For each $\lambda\in\mathbb{C}$, set
\[ H(p;\lambda) = e^{\lambda p - \frac{\lambda^2}{2}}, \quad p\in\mathbb{R}. \]

⁵The multidimensional problem can be treated similarly by taking tensor products of the eigenfunctions of the one-dimensional problem.
Then
\[ H(p;\lambda) = \sum_{n=0}^{\infty}\frac{\lambda^n}{n!}H_n(p), \quad p\in\mathbb{R}, \tag{6.30} \]
where the convergence is both uniform on compact subsets of $\mathbb{R}\times\mathbb{C}$, and, for $\lambda$'s in compact subsets of $\mathbb{C}$, uniform in $L^2(\mathbb{R};\rho)$. In particular, $\big\{f_n(p) := \frac{1}{\sqrt{n!}}H_n(\sqrt{\beta}\,p) : n\in\mathbb{N}\big\}$ is an orthonormal basis in $L^2(\mathbb{R};\rho_\beta)$.
From (6.29) it is clear that $H_n$ is a polynomial of degree $n$. Furthermore, only odd (even) powers appear in $H_n(p)$ when $n$ is odd (even), and the coefficient multiplying $p^n$ in $H_n(p)$ is always 1. The orthonormality of the modified Hermite polynomials $f_n(p)$ defined in (6.28) implies that
\[ \int_{\mathbb{R}} f_n(p)f_m(p)\rho_\beta(p)\,dp = \delta_{nm}. \]
The first few Hermite polynomials and the corresponding rescaled/normalized eigenfunctions of the generator of the OU process are:
\[ H_0(p) = 1, \qquad f_0(p) = 1, \]
\[ H_1(p) = p, \qquad f_1(p) = \sqrt{\beta}\,p, \]
\[ H_2(p) = p^2 - 1, \qquad f_2(p) = \frac{\beta}{\sqrt{2}}p^2 - \frac{1}{\sqrt{2}}, \]
\[ H_3(p) = p^3 - 3p, \qquad f_3(p) = \frac{\beta^{3/2}}{\sqrt{6}}p^3 - \frac{3\sqrt{\beta}}{\sqrt{6}}p, \]
\[ H_4(p) = p^4 - 6p^2 + 3, \qquad f_4(p) = \frac{1}{\sqrt{24}}\big(\beta^2 p^4 - 6\beta p^2 + 3\big), \]
\[ H_5(p) = p^5 - 10p^3 + 15p, \qquad f_5(p) = \frac{1}{\sqrt{120}}\big(\beta^{5/2}p^5 - 10\beta^{3/2}p^3 + 15\beta^{1/2}p\big). \]
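With $\beta = 1$, so that $\rho$ is the standard Gaussian density, the orthonormality of these eigenfunctions can be checked by direct quadrature; the sketch below (our own code, using the explicit polynomials listed above) computes $(f_n,f_m)_\rho$ with a midpoint rule:

```python
import math

def f_n(n, p):
    # normalized eigenfunctions with beta = 1: f_n = H_n / sqrt(n!)
    H = [1.0, p, p ** 2 - 1, p ** 3 - 3 * p, p ** 4 - 6 * p ** 2 + 3,
         p ** 5 - 10 * p ** 3 + 15 * p][n]
    return H / math.sqrt(math.factorial(n))

def inner(n, m, L=10.0, h=0.01):
    # (f_n, f_m)_rho with rho(p) = exp(-p^2/2)/sqrt(2*pi), midpoint rule
    k = int(2 * L / h)
    total = 0.0
    for i in range(k):
        p = -L + (i + 0.5) * h
        rho = math.exp(-p * p / 2) / math.sqrt(2 * math.pi)
        total += f_n(n, p) * f_n(m, p) * rho
    return total * h

# Orthonormality: inner(n, m) should be close to delta_nm.
```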
The proof of Theorem 6.4.2 follows essentially from the properties of the Hermite polynomials. First, notice that by combining (6.28) and (6.30) we obtain
\[ H(\sqrt{\beta}\,p;\lambda) = \sum_{n=0}^{+\infty}\frac{\lambda^n}{\sqrt{n!}}f_n(p). \]
We differentiate this formula with respect to $p$ to obtain
\[ \lambda\sqrt{\beta}\,H(\sqrt{\beta}\,p;\lambda) = \sum_{n=1}^{+\infty}\frac{\lambda^n}{\sqrt{n!}}\partial_p f_n(p), \]
since $f_0 = 1$. From this equation we obtain
\[ H(\sqrt{\beta}\,p;\lambda) = \sum_{n=1}^{+\infty}\frac{\lambda^{n-1}}{\sqrt{\beta}\sqrt{n!}}\partial_p f_n(p) = \sum_{n=0}^{+\infty}\frac{\lambda^n}{\sqrt{\beta}\sqrt{(n+1)!}}\partial_p f_{n+1}(p), \]
from which we deduce that
\[ \frac{1}{\sqrt{\beta}}\partial_p f_k = \sqrt{k}\,f_{k-1}. \tag{6.31} \]
Similarly, if we differentiate (6.30) with respect to $\lambda$ we obtain
\[ (p-\lambda)H(p;\lambda) = \sum_{k=0}^{+\infty}\frac{\lambda^k}{k!}pH_k(p) - \sum_{k=1}^{+\infty}\frac{\lambda^k}{(k-1)!}H_{k-1}(p) = \sum_{k=0}^{+\infty}\frac{\lambda^k}{k!}H_{k+1}(p), \]
from which we obtain the recurrence relation
\[ pH_k = H_{k+1} + kH_{k-1}. \]
Upon rescaling, we deduce that
\[ pf_k = \sqrt{\beta^{-1}(k+1)}\,f_{k+1} + \sqrt{\beta^{-1}k}\,f_{k-1}. \tag{6.32} \]
We combine now equations (6.31) and (6.32) to obtain
\[ \Big(\sqrt{\beta}\,p - \frac{1}{\sqrt{\beta}}\partial_p\Big)f_k = \sqrt{k+1}\,f_{k+1}. \tag{6.33} \]
Now we observe that
\[ -\mathcal{L}f_n = \Big(\sqrt{\beta}\,p - \frac{1}{\sqrt{\beta}}\partial_p\Big)\frac{1}{\sqrt{\beta}}\partial_p f_n = \Big(\sqrt{\beta}\,p - \frac{1}{\sqrt{\beta}}\partial_p\Big)\sqrt{n}\,f_{n-1} = nf_n. \]
The operators $\big(\sqrt{\beta}\,p - \frac{1}{\sqrt{\beta}}\partial_p\big)$ and $\frac{1}{\sqrt{\beta}}\partial_p$ play the role of creation and annihilation operators. In fact, we can generate all eigenfunctions of the OU operator from the ground state $f_0 = 1$ through a repeated application of the creation operator.
Proposition 6.4.4. Set $\beta = 1$ and let $a^- = \partial_p$. Then the $L_\rho^2$-adjoint of $a^-$ is
\[ a^+ = -\partial_p + p. \]
The generator of the OU process can be written in the form
\[ \mathcal{L} = -a^+a^-. \]
Furthermore, $a^+$ and $a^-$ satisfy the commutation relation
\[ [a^+,a^-] = -1. \]
Define now the creation and annihilation operators on $C^1(\mathbb{R})$ by
\[ S^+ = \frac{1}{\sqrt{n+1}}a^+ \]
and
\[ S^- = \frac{1}{\sqrt{n}}a^-. \]
Then
\[ S^+f_n = f_{n+1} \quad\text{and}\quad S^-f_n = f_{n-1}. \tag{6.34} \]
In particular,
\[ f_n = \frac{1}{\sqrt{n!}}(a^+)^n 1 \tag{6.35} \]
and
\[ 1 = \frac{1}{\sqrt{n!}}(a^-)^n f_n. \tag{6.36} \]
Proof. Let $f, h \in C^1(\mathbb{R}) \cap L^2_\rho$. We calculate
\[
\int \partial_p f\, h \rho = -\int f\, \partial_p (h\rho) \tag{6.37}
\]
\[
= \int f \left(-\partial_p + p\right) h \rho. \tag{6.38}
\]
Now,
\[
-a^+ a^- = -(-\partial_p + p)\,\partial_p = \partial_p^2 - p\,\partial_p = L.
\]
Similarly,
\[
a^- a^+ = -\partial_p^2 + p\,\partial_p + 1,
\]
and
\[
[a^+, a^-] = -1.
\]
Formulas (6.34) follow from (6.31) and (6.33). Finally, formulas (6.35) and (6.36) are a consequence of (6.31) and (6.33), together with a simple induction argument.
Notice that, upon using (6.35) and (6.36) and the fact that $a^+$ is the adjoint of $a^-$, we can easily check the orthonormality of the eigenfunctions:
\[
\int f_n f_m\, \rho = \frac{1}{\sqrt{m!}} \int f_n\, (a^+)^m 1\, \rho
= \frac{1}{\sqrt{m!}} \int (a^-)^m f_n\, \rho
= \int f_{n-m}\, \rho = \delta_{nm}.
\]
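This orthonormality is easy to check numerically. The sketch below is my own illustration (not from the text): it evaluates the rescaled eigenfunctions $f_n(p) = H_n(\sqrt{\beta}\,p)/\sqrt{n!}$ with probabilists' Hermite polynomials and Gauss–Hermite quadrature; the value $\beta = 2$ is an arbitrary choice.

```python
import numpy as np
from math import factorial, sqrt, pi
from numpy.polynomial.hermite import hermgauss
from numpy.polynomial.hermite_e import hermeval

beta = 2.0   # arbitrary illustrative value of the inverse temperature

def f(n, p):
    # f_n(p) = He_n(sqrt(beta) p) / sqrt(n!), He_n = probabilists' Hermite
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return hermeval(np.sqrt(beta) * p, coeffs) / sqrt(factorial(n))

# Gauss-Hermite nodes/weights integrate against exp(-x^2); the substitution
# p = sqrt(2/beta) x turns the Gaussian density rho into that weight.
x, w = hermgauss(80)
p = np.sqrt(2.0 / beta) * x

def inner(n, m):
    # (f_n, f_m)_rho with rho(p) = sqrt(beta/(2 pi)) exp(-beta p^2 / 2)
    return np.sum(w * f(n, p) * f(m, p)) / sqrt(pi)

gram = np.array([[inner(n, m) for m in range(6)] for n in range(6)])
```

Since the integrands are polynomials, an 80-point rule is exact up to round-off, and the Gram matrix comes out as the identity.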
From the eigenfunctions and eigenvalues of $L$ we can easily obtain the eigenvalues and eigenfunctions of $L^*$, the Fokker–Planck operator.
Lemma 6.4.5. The eigenvalues and eigenfunctions of the Fokker–Planck operator
\[
L^* \cdot = \partial_p^2 \cdot + \partial_p (p\, \cdot)
\]
are
\[
\lambda_n^* = -n, \quad n = 0, 1, 2, \dots \quad \text{and} \quad f_n^* = \rho f_n.
\]
Proof. We have
\[
L^*(\rho f_n) = f_n L^* \rho + \rho L f_n = -n \rho f_n.
\]
An immediate corollary of the above calculation is that the $n$th eigenfunction of the Fokker–Planck operator is given by
\[
f_n^* = \rho(p)\, \frac{1}{\sqrt{n!}}\, (a^+)^n 1.
\]
6.5 Reversible Diffusions
The stationary Ornstein–Uhlenbeck process is an example of a reversible Markov process:
Definition 6.5.1. A stationary stochastic process $X_t$ is time reversible if for every $m \in \mathbb{N}$ and every $t_1, t_2, \dots, t_m \in \mathbb{R}^+$, the joint probability distribution is invariant under time reversals:
\[
p(X_{t_1}, X_{t_2}, \dots, X_{t_m}) = p(X_{-t_1}, X_{-t_2}, \dots, X_{-t_m}). \tag{6.39}
\]
In this section we study a more general class (in fact, as we will see later the most general
class) of reversible Markov processes, namely stochastic perturbations of ODEs with a gradient
structure.
Let $V(x) = \frac{1}{2}\alpha x^2$. The generator of the OU process can be written as
\[
L = -\partial_x V\, \partial_x + \beta^{-1} \partial_x^2.
\]
Consider diffusion processes with a potential $V(x)$, not necessarily quadratic:
\[
L = -\nabla V(x) \cdot \nabla + \beta^{-1} \Delta. \tag{6.40}
\]
In applications of (6.40) to statistical mechanics the diffusion coefficient is $\beta^{-1} = k_B T$, where $k_B$ is Boltzmann's constant and $T$ the absolute temperature. The corresponding stochastic differential equation is
\[
dX_t = -\nabla V(X_t)\, dt + \sqrt{2\beta^{-1}}\, dW_t. \tag{6.41}
\]
Hence, we have a gradient ODE $\dot{X}_t = -\nabla V(X_t)$ perturbed by noise due to thermal fluctuations.
The corresponding FP equation is
\[
\frac{\partial p}{\partial t} = \nabla \cdot (\nabla V\, p) + \beta^{-1} \Delta p. \tag{6.42}
\]
It is not possible to calculate the time dependent solution of this equation for an arbitrary potential.
We can, however, always calculate the stationary solution, if it exists.
Definition 6.5.2. A potential $V$ will be called confining if $\lim_{|x| \to +\infty} V(x) = +\infty$ and
\[
e^{-\beta V(x)} \in L^1(\mathbb{R}^d) \tag{6.43}
\]
for all $\beta \in \mathbb{R}^+$.
Gradient SDEs in a conﬁning potential are ergodic:
Proposition 6.5.3. Let $V(x)$ be a smooth confining potential. Then the Markov process with generator (6.40) is ergodic. The unique invariant distribution is the Gibbs distribution
\[
p(x) = \frac{1}{Z}\, e^{-\beta V(x)}, \tag{6.44}
\]
where the normalization factor $Z$ is the partition function
\[
Z = \int_{\mathbb{R}^d} e^{-\beta V(x)}\, dx.
\]
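The direct-substitution check that the Gibbs density is stationary can also be done numerically. The sketch below is my own illustration, with an arbitrarily chosen quartic confining potential: it verifies on a grid that the stationary Fokker–Planck operator annihilates $\rho \propto e^{-\beta V}$, because the probability flux $V'\rho + \beta^{-1}\rho'$ vanishes identically.

```python
import numpy as np

beta = 1.5
x = np.linspace(-4.0, 4.0, 4001)
h = x[1] - x[0]
V = x**4 / 4 - x**2 / 2            # an arbitrary smooth confining potential
rho = np.exp(-beta * V)
rho /= np.sum(rho) * h             # normalise; the sum approximates Z

dV = x**3 - x                      # V'(x), known analytically here
# L* rho = d/dx ( V' rho + beta^{-1} rho' ); for the Gibbs density the
# flux inside the parentheses is identically zero.
flux = dV * rho + (1.0 / beta) * np.gradient(rho, h)
Lstar_rho = np.gradient(flux, h)
residual = np.max(np.abs(Lstar_rho[10:-10]))
```

The residual is pure finite-difference error and shrinks as the grid is refined.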
The fact that the Gibbs distribution is an invariant distribution follows by direct substitution. Uniqueness follows from a PDEs argument (see discussion below). It is more convenient to "normalize" the solution of the Fokker–Planck equation with respect to the invariant distribution.
Theorem 6.5.4. Let $p(x,t)$ be the solution of the Fokker–Planck equation (6.42), assume that (6.43) holds and let $\rho(x)$ be the Gibbs distribution (6.44). Define $h(x,t)$ through
\[
p(x,t) = h(x,t)\rho(x).
\]
Then the function $h$ satisfies the backward Kolmogorov equation:
\[
\frac{\partial h}{\partial t} = -\nabla V \cdot \nabla h + \beta^{-1}\Delta h, \qquad h(x,0) = p(x,0)\rho^{-1}(x). \tag{6.45}
\]
Proof. The initial condition follows from the definition of $h$. We calculate the gradient and Laplacian of $p$:
\[
\nabla p = \rho \nabla h - \rho h \beta \nabla V
\]
and
\[
\Delta p = \rho \Delta h - 2\rho\beta \nabla V \cdot \nabla h - h\beta \Delta V \rho + h|\nabla V|^2 \beta^2 \rho.
\]
We substitute these formulas into the FP equation to obtain
\[
\rho\, \frac{\partial h}{\partial t} = \rho \left(-\nabla V \cdot \nabla h + \beta^{-1} \Delta h\right),
\]
from which the claim follows.
Consequently, in order to study properties of solutions to the FP equation, it is sufficient to study the backward equation (6.45). The generator $L$ is self-adjoint in the right function space. We define the weighted $L^2$ space $L^2_\rho$:
\[
L^2_\rho = \left\{ f \,\Big|\, \int_{\mathbb{R}^d} |f|^2 \rho(x)\, dx < \infty \right\},
\]
where $\rho(x)$ is the Gibbs distribution. This is a Hilbert space with inner product
\[
(f, h)_\rho = \int_{\mathbb{R}^d} f h \rho(x)\, dx.
\]
Theorem 6.5.5. Assume that $V(x)$ is a smooth potential and assume that condition (6.43) holds. Then the operator
\[
L = -\nabla V(x) \cdot \nabla + \beta^{-1} \Delta
\]
is self-adjoint in $L^2_\rho$. Furthermore, it is non-positive and its kernel consists of constants.
Proof. Let $f, h \in C^2_0(\mathbb{R}^d)$. We calculate
\[
(Lf, h)_\rho = \int_{\mathbb{R}^d} \left(-\nabla V \cdot \nabla + \beta^{-1}\Delta\right) f\, h \rho\, dx
\]
\[
= -\int_{\mathbb{R}^d} (\nabla V \cdot \nabla f)\, h \rho\, dx - \beta^{-1} \int_{\mathbb{R}^d} \nabla f \cdot \nabla h\, \rho\, dx - \beta^{-1} \int_{\mathbb{R}^d} \nabla f \cdot \nabla \rho\, h\, dx
\]
\[
= -\beta^{-1} \int_{\mathbb{R}^d} \nabla f \cdot \nabla h\, \rho\, dx,
\]
from which self-adjointness follows.
If we set $f = h$ in the above equation we get
\[
(Lf, f)_\rho = -\beta^{-1} \|\nabla f\|^2_\rho,
\]
which shows that $L$ is non-positive.
Clearly, constants are in the null space of $L$. Assume that $f \in \mathcal{N}(L)$. Then, from the above equation we get
\[
0 = -\beta^{-1} \|\nabla f\|^2_\rho,
\]
and, consequently, $f$ is a constant.
Remark 6.5.6. The expression $(-Lf, f)_\rho$ is called the Dirichlet form of the operator $L$. In the case of a gradient flow, it takes the form
\[
(-Lf, f)_\rho = \beta^{-1} \|\nabla f\|^2_\rho. \tag{6.46}
\]
Using the properties of the generator $L$ we can show that the solution of the Fokker–Planck equation converges to the Gibbs distribution exponentially fast. For this we need to use the fact that, under appropriate assumptions on the potential $V$, the Gibbs measure $\mu(dx) = Z^{-1} e^{-\beta V(x)}\, dx$ satisfies Poincaré's inequality:
Theorem 6.5.7. Assume that the potential $V$ satisfies the convexity condition
\[
D^2 V \geqslant \lambda I.
\]
Then the corresponding Gibbs measure satisfies the Poincaré inequality with constant $\lambda$:
\[
\int_{\mathbb{R}^d} f \rho = 0 \;\Rightarrow\; \|\nabla f\|_\rho \geqslant \sqrt{\lambda}\, \|f\|_\rho. \tag{6.47}
\]
Theorem 6.5.8. Assume that $p(x,0) \in L^2(e^{\beta V})$. Then the solution $p(x,t)$ of the Fokker–Planck equation (6.42) converges to the Gibbs distribution exponentially fast:
\[
\|p(\cdot, t) - Z^{-1}e^{-\beta V}\|_{\rho^{-1}} \leqslant e^{-\lambda \beta^{-1} t}\, \|p(\cdot, 0) - Z^{-1}e^{-\beta V}\|_{\rho^{-1}}. \tag{6.48}
\]
Proof. We use (6.45), (6.46) and (6.47) to calculate
\[
-\frac{d}{dt}\|h - 1\|^2_\rho = -2\left(\frac{\partial h}{\partial t}, h - 1\right)_\rho = -2\,(Lh, h - 1)_\rho
\]
\[
= 2\,(-L(h - 1), h - 1)_\rho = 2\beta^{-1}\|\nabla(h - 1)\|^2_\rho \geqslant 2\beta^{-1}\lambda\, \|h - 1\|^2_\rho.
\]
Our assumption on $p(\cdot, 0)$ implies that $h(\cdot, 0) \in L^2_\rho$. Consequently, the above calculation shows that
\[
\|h(\cdot, t) - 1\|_\rho \leqslant e^{-\lambda \beta^{-1} t}\,\|h(\cdot, 0) - 1\|_\rho.
\]
This, and the definition of $h$, $p = \rho h$, lead to (6.48).
Remark 6.5.9. The assumption
\[
\int_{\mathbb{R}^d} |p(x, 0)|^2\, Z^{-1} e^{\beta V}\, dx < \infty
\]
is very restrictive (think of the case where $V = x^2$). The function space $L^2(\rho^{-1}) = L^2(e^{\beta V})$ in which we prove convergence is not the right space to use. Since $p(\cdot, t) \in L^1$, ideally we would like to prove exponentially fast convergence in $L^1$. We can prove convergence in $L^1$ using the theory of logarithmic Sobolev inequalities. In fact, we can also prove convergence in relative entropy:
\[
H(p\,|\,\rho_V) := \int_{\mathbb{R}^d} p \ln\left(\frac{p}{\rho_V}\right) dx.
\]
The relative entropy norm controls the $L^1$ norm:
\[
\|\rho_1 - \rho_2\|^2_{L^1} \leqslant C H(\rho_1\,|\,\rho_2).
\]
Using a logarithmic Sobolev inequality, we can prove exponentially fast convergence to equilibrium, assuming only that the relative entropy of the initial conditions is finite. A much sharper version of the theorem of exponentially fast convergence to equilibrium is the following:
Theorem 6.5.10. Let $p$ denote the solution of the Fokker–Planck equation (6.42) where the potential is smooth and uniformly convex. Assume that the initial conditions satisfy $H(p(\cdot, 0)\,|\,\rho_V) < \infty$. Then $p$ converges to the Gibbs distribution exponentially fast in relative entropy:
\[
H(p(\cdot, t)\,|\,\rho_V) \leqslant e^{-\lambda \beta^{-1} t}\, H(p(\cdot, 0)\,|\,\rho_V).
\]
Self-adjointness of the generator of a diffusion process is equivalent to time-reversibility.
Theorem 6.5.11. Let $X_t$ be a stationary Markov process in $\mathbb{R}^d$ with generator
\[
L = b(x) \cdot \nabla + \beta^{-1} \Delta
\]
and invariant measure $\mu$. Then the following three statements are equivalent.
i. The process is time-reversible.
ii. The generator of the process is symmetric in $L^2(\mathbb{R}^d; \mu(dx))$.
iii. There exists a scalar function $V(x)$ such that
\[
b(x) = -\nabla V(x).
\]
6.5.1 Markov Chain Monte Carlo (MCMC)
The Smoluchowski SDE (6.41) has a very interesting application in statistics. Suppose we want to sample from a probability distribution $\pi(x)$. One method for doing this is by generating the dynamics whose invariant distribution is precisely $\pi(x)$. In particular, we consider the Smoluchowski equation
\[
dX_t = \nabla \ln(\pi(X_t))\, dt + \sqrt{2}\, dW_t. \tag{6.49}
\]
Assuming that $-\ln(\pi(x))$ is a confining potential, $X_t$ is an ergodic Markov process with invariant distribution $\pi(x)$. Furthermore, the law of $X_t$ converges to $\pi(x)$ exponentially fast:
\[
\|\rho_t - \pi\|_{L^1} \leqslant e^{-\Lambda t}\,\|\rho_0 - \pi\|_{L^1}.
\]
The exponent $\Lambda$ is related to the spectral gap of the generator $L = \frac{1}{\pi(x)} \nabla \pi(x) \cdot \nabla + \Delta$. This technique for sampling from a given distribution is an example of the Markov Chain Monte Carlo (MCMC) methodology.
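A minimal sketch of this idea, discretising (6.49) by Euler–Maruyama (the "unadjusted Langevin algorithm"). It is my own illustration: the target $\pi(x) \propto \exp(-x^4/4)$ and the step size are arbitrary choices, not examples from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n_steps = 1e-2, 200_000
noise = np.sqrt(2.0 * dt) * rng.standard_normal(n_steps)

def grad_log_pi(x):
    # hypothetical target: pi(x) proportional to exp(-x^4/4)
    return -x**3

x = 0.0
samples = np.empty(n_steps)
for i in range(n_steps):
    # Euler-Maruyama step for dX = grad log pi(X) dt + sqrt(2) dW
    x += grad_log_pi(x) * dt + noise[i]
    samples[i] = x

samples = samples[n_steps // 10:]   # discard a burn-in period
```

For a finite step size the chain samples $\pi$ only up to an $O(dt)$ bias; adding a Metropolis–Hastings accept/reject step would remove it.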
6.6 Perturbations of non-Reversible Diffusions
We can perturb the drift of a reversible diffusion, making the process non-reversible, without changing the invariant distribution $Z^{-1}e^{-\beta V}$.
Proposition 6.6.1. Let $V(x)$ be a confining potential, $\gamma(x)$ a smooth vector field, and consider the diffusion process
\[
dX_t = \left(-\nabla V(X_t) + \gamma(X_t)\right) dt + \sqrt{2\beta^{-1}}\, dW_t. \tag{6.50}
\]
Then the invariant measure of the process $X_t$ is the Gibbs measure $\mu(dx) = \frac{1}{Z} e^{-\beta V(x)}\, dx$ if and only if $\gamma(x)$ is divergence-free with respect to the density of this measure:
\[
\nabla \cdot \left(\gamma(x)\, e^{-\beta V(x)}\right) = 0. \tag{6.51}
\]
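A standard family satisfying (6.51) is $\gamma = J\nabla V$ with $J$ a constant antisymmetric matrix. The finite-difference sketch below is my own illustration, with an arbitrary two-dimensional potential; it checks the divergence condition on a grid.

```python
import numpy as np

beta = 1.0
grid = np.linspace(-2.0, 2.0, 801)
h = grid[1] - grid[0]
X, Y = np.meshgrid(grid, grid, indexing="ij")
V = X**2 / 2 + Y**4 / 4            # an arbitrary confining potential
Vx, Vy = X, Y**3                   # its gradient, known analytically
gx, gy = -Vy, Vx                   # gamma = J grad V, J = [[0, -1], [1, 0]]
w = np.exp(-beta * V)

# div( gamma * exp(-beta V) ) should vanish identically for this gamma
div = np.gradient(gx * w, h, axis=0) + np.gradient(gy * w, h, axis=1)
max_div = np.max(np.abs(div[5:-5, 5:-5]))
```

Analytically the two terms cancel exactly, so what remains is only finite-difference truncation error.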
6.7 Eigenfunction Expansions
Consider the generator of a gradient stochastic flow with a uniformly convex potential:
\[
L = -\nabla V \cdot \nabla + D \Delta. \tag{6.52}
\]
We know that $L$ is a non-positive self-adjoint operator on $L^2_\rho$ and that it has a spectral gap:
\[
(Lf, f)_\rho \leqslant -D\lambda\, \|f\|^2_\rho,
\]
where $\lambda$ is the Poincaré constant of the potential $V$ (i.e. for the Gibbs measure $Z^{-1}e^{-\beta V(x)}\, dx$).
The above imply that we can study the spectral problem for $-L$:
\[
-L f_n = \lambda_n f_n, \quad n = 0, 1, \dots
\]
The operator $-L$ has real, discrete spectrum with
\[
0 = \lambda_0 < \lambda_1 < \lambda_2 < \dots
\]
Furthermore, the eigenfunctions $\{f_n\}_{n=0}^{\infty}$ form an orthonormal basis in $L^2_\rho$: we can express every element of $L^2_\rho$ in the form of a generalized Fourier series:
\[
\phi = \sum_{n=0}^{\infty} \phi_n f_n, \qquad \phi_n = (\phi, f_n)_\rho, \tag{6.53}
\]
with $(f_n, f_m)_\rho = \delta_{nm}$. This enables us to solve the time dependent Fokker–Planck equation in terms of an eigenfunction expansion. Consider the backward Kolmogorov equation (6.45). We assume that the initial condition $h_0(x) = \phi(x) \in L^2_\rho$ and consequently we can expand it in the form (6.53). We look for a solution of (6.45) in the form
\[
h(x,t) = \sum_{n=0}^{\infty} h_n(t) f_n(x).
\]
We substitute this expansion into the backward Kolmogorov equation:
\[
\frac{\partial h}{\partial t} = \sum_{n=0}^{\infty} \dot{h}_n f_n = L\left(\sum_{n=0}^{\infty} h_n f_n\right) \tag{6.54}
\]
\[
= \sum_{n=0}^{\infty} -\lambda_n h_n f_n. \tag{6.55}
\]
We multiply this equation by $f_m$, integrate with respect to the Gibbs measure and use the orthonormality of the eigenfunctions to obtain the sequence of equations
\[
\dot{h}_n = -\lambda_n h_n, \quad n = 0, 1, \dots
\]
The solution is
\[
h_0(t) = \phi_0, \qquad h_n(t) = e^{-\lambda_n t}\phi_n, \quad n = 1, 2, \dots
\]
Notice that
\[
1 = \int_{\mathbb{R}^d} p(x, 0)\, dx = \int_{\mathbb{R}^d} p(x, t)\, dx = \int_{\mathbb{R}^d} h(x, t)\, Z^{-1} e^{-\beta V}\, dx = (h, 1)_\rho = (\phi, 1)_\rho = \phi_0.
\]
Consequently, the solution of the backward Kolmogorov equation is
\[
h(x, t) = 1 + \sum_{n=1}^{\infty} e^{-\lambda_n t} \phi_n f_n.
\]
This expansion, together with the fact that all eigenvalues with $n \geqslant 1$ are positive, shows that the solution of the backward Kolmogorov equation converges to 1 exponentially fast. The solution of the Fokker–Planck equation is
\[
p(x, t) = Z^{-1} e^{-\beta V(x)} \left(1 + \sum_{n=1}^{\infty} e^{-\lambda_n t} \phi_n f_n\right).
\]
6.7.1 Reduction to a Schrödinger Equation
Lemma 6.7.1. The Fokker–Planck operator for a gradient flow can be written in the self-adjoint form
\[
\frac{\partial p}{\partial t} = D \nabla \cdot \left(e^{-V/D}\, \nabla\left(e^{V/D} p\right)\right). \tag{6.56}
\]
Define now $\psi(x, t) = e^{V/2D} p(x, t)$. Then $\psi$ solves the PDE
\[
\frac{\partial \psi}{\partial t} = D \Delta \psi - U(x)\psi, \qquad U(x) := \frac{|\nabla V|^2}{4D} - \frac{\Delta V}{2}. \tag{6.57}
\]
Let $H := -D\Delta + U$. Then $-L^*$ and $H$ have the same eigenvalues. The $n$th eigenfunction $\phi_n$ of $-L^*$ and the $n$th eigenfunction $\psi_n$ of $H$ are associated through the transformation
\[
\psi_n(x) = \phi_n(x) \exp\left(\frac{V(x)}{2D}\right).
\]
Remarks 6.7.2. i. Equation (6.56) shows that the FP operator can be written in the form
\[
L^* \cdot = D \nabla \cdot \left(e^{-V/D}\, \nabla\left(e^{V/D}\, \cdot\right)\right).
\]
ii. The operator that appears on the right hand side of eqn. (6.57) is minus a Schrödinger operator:
\[
-H = D\Delta - U(x).
\]
iii. The spectral problem for the FP operator can be transformed into the spectral problem for a Schrödinger operator. We can thus use all the available results from quantum mechanics to study the FP equation and the associated SDE.
iv. In particular, the weak noise asymptotics $D \ll 1$ is equivalent to the semiclassical approximation from quantum mechanics.
Proof. We calculate
\[
D \nabla \cdot \left(e^{-V/D}\, \nabla\left(e^{V/D} f\right)\right) = D \nabla \cdot \left(e^{-V/D}\left(D^{-1}\nabla V f + \nabla f\right)e^{V/D}\right) = \nabla \cdot (\nabla V f + D \nabla f) = L^* f.
\]
Consider now the eigenvalue problem for the FP operator:
\[
-L^* \phi_n = \lambda_n \phi_n.
\]
Set $\phi_n = \psi_n \exp\left(-\frac{1}{2D}V\right)$. We calculate $-L^*\phi_n$:
\[
-L^* \phi_n = -D \nabla \cdot \left(e^{-V/D}\, \nabla\left(e^{V/D}\, \psi_n e^{-V/2D}\right)\right)
= -D \nabla \cdot \left(e^{-V/D}\left(\nabla \psi_n + \frac{\nabla V}{2D}\psi_n\right)e^{V/2D}\right)
\]
\[
= \left(-D\Delta\psi_n + \left(\frac{|\nabla V|^2}{4D} - \frac{\Delta V}{2}\right)\psi_n\right)e^{-V/2D} = e^{-V/2D}\, H \psi_n.
\]
From this we conclude that $e^{-V/2D} H \psi_n = \lambda_n \psi_n e^{-V/2D}$, from which the equivalence between the two eigenvalue problems follows.
Remarks 6.7.3. i. We can rewrite the Schrödinger operator in the form
\[
H = D \mathcal{A}^* \mathcal{A}, \qquad \mathcal{A} = \nabla + \frac{\nabla V}{2D}, \qquad \mathcal{A}^* = -\nabla + \frac{\nabla V}{2D}.
\]
ii. These are creation and annihilation operators. They can also be written in the form
\[
\mathcal{A}\, \cdot = e^{-V/2D}\, \nabla\left(e^{V/2D}\, \cdot\right), \qquad \mathcal{A}^*\, \cdot = -e^{V/2D}\, \nabla\left(e^{-V/2D}\, \cdot\right).
\]
iii. The forward and backward Kolmogorov operators have the same eigenvalues. Their eigenfunctions are related through
\[
\phi^F_n = \phi^B_n \exp\left(-V/D\right),
\]
where $\phi^B_n$ and $\phi^F_n$ denote the eigenfunctions of the backward and forward operators, respectively.
6.8 Discussion and Bibliography
The proof of existence and uniqueness of classical solutions for the Fokker–Planck equation of a uniformly elliptic diffusion process with smooth drift and diffusion coefficients, Theorem 6.2.2, can be found in [30]. A standard textbook on PDEs, with a lot of material on parabolic PDEs, is [22], particularly Chapters 2 and 7 in this book.
It is important to emphasize that the condition that solutions to the Fokker–Planck equation do not grow too fast, see Definition 6.2.1, is necessary to ensure uniqueness. In fact, there are infinitely many solutions of
\[
\frac{\partial p}{\partial t} = \Delta p \quad \text{in } \mathbb{R}^d \times (0, T),
\]
\[
p(x, 0) = 0.
\]
Each of these solutions besides the trivial solution $p = 0$ grows very rapidly as $x \to +\infty$. More details can be found in [44, Ch. 7]. The Fokker–Planck equation is studied extensively in Risken's monograph [82]. See also [35] and [42]. The connection between the Fokker–Planck equation and stochastic differential equations is presented in Chapter 7. See also [1, 31, 32].
Hermite polynomials appear very frequently in applications and they also play a fundamental role in analysis. It is possible to prove that the Hermite polynomials form an orthonormal basis for $L^2(\mathbb{R}^d, \rho_\beta)$ without using the fact that they are the eigenfunctions of a symmetric operator with compact resolvent.⁶ The proof of Proposition 6.4.1 can be found in [90], Lemma 2.3.4 in particular.
Diffusion processes in one dimension are studied in [61]. The Feller classification for one-dimensional diffusion processes can also be found in [45, 24].
Convergence to equilibrium for kinetic equations (such as the Fokker–Planck equation), both linear and nonlinear (e.g., the Boltzmann equation), has been studied extensively. It has been recognized that the relative entropy and logarithmic Sobolev inequalities play an important role in the analysis of the problem of convergence to equilibrium. For more information see [62].
6.9 Exercises
1. Solve equation (6.13) by taking the Fourier transform, using the method of characteristics for first order PDEs and taking the inverse Fourier transform.
2. Use the formula for the stationary joint probability density of the Ornstein–Uhlenbeck process, eqn. (6.17), to obtain the stationary autocorrelation function of the OU process.
3. Use (6.20) to obtain formulas for the moments of the OU process. Prove, using these formulas, that the moments of the OU process converge to their equilibrium values exponentially fast.
4. Show that the autocorrelation function of the stationary Ornstein–Uhlenbeck process is
\[
\mathbb{E}(X_t X_0) = \int_{\mathbb{R}}\int_{\mathbb{R}} x x_0\, p_{OU}(x, t | x_0, 0)\, p_s(x_0)\, dx\, dx_0 = \frac{D}{2\alpha}\, e^{-\alpha|t|},
\]
⁶ In fact, Poincaré's inequality for Gaussian measures can be proved using the fact that the Hermite polynomials form an orthonormal basis for $L^2(\mathbb{R}^d, \rho_\beta)$.
where $p_s(x)$ denotes the invariant Gaussian distribution.
5. Let $X_t$ be a one-dimensional diffusion process with drift and diffusion coefficients $a(y, t) = -a_0 - a_1 y$ and $b(y, t) = b_0 + b_1 y + b_2 y^2$, where $a_i, b_i \geqslant 0$, $i = 0, 1, 2$.
(a) Write down the generator and the forward and backward Kolmogorov equations for $X_t$.
(b) Assume that $X_0$ is a random variable with probability density $\rho_0(x)$ that has finite moments. Use the forward Kolmogorov equation to derive a system of differential equations for the moments of $X_t$.
(c) Find the first three moments $M_0, M_1, M_2$ in terms of the moments of the initial distribution $\rho_0(x)$.
(d) Under what conditions on the coefficients $a_i, b_i \geqslant 0$, $i = 0, 1, 2$, is $M_2$ finite for all times?
6. Let $V$ be a confining potential in $\mathbb{R}^d$, $\beta > 0$, and let $\rho_\beta(x) = Z^{-1}e^{-\beta V(x)}$. Give the definition of the Sobolev space $H^k(\mathbb{R}^d; \rho_\beta)$ for $k$ a positive integer and study some of its basic properties.
7. Let $X_t$ be a multidimensional diffusion process on $[0, 1]^d$ with periodic boundary conditions. The drift vector is a periodic function $a(x)$ and the diffusion matrix is $2DI$, where $D > 0$ and $I$ is the identity matrix.
(a) Write down the generator and the forward and backward Kolmogorov equations for $X_t$.
(b) Assume that $a(x)$ is divergence-free ($\nabla \cdot a(x) = 0$). Show that $X_t$ is ergodic and find the invariant distribution.
(c) Show that the probability density $p(x, t)$ (the solution of the forward Kolmogorov equation) converges to the invariant distribution exponentially fast in $L^2([0, 1]^d)$. (Hint: Use Poincaré's inequality on $[0, 1]^d$.)
8. The Rayleigh process $X_t$ is a diffusion process that takes values on $(0, +\infty)$ with drift and diffusion coefficients $a(x) = -ax + \frac{D}{x}$ and $b(x) = 2D$, respectively, where $a, D > 0$.
(a) Write down the generator and the forward and backward Kolmogorov equations for $X_t$.
(b) Show that this process is ergodic and find its invariant distribution.
(c) Solve the forward Kolmogorov (Fokker–Planck) equation using separation of variables. (Hint: Use Laguerre polynomials.)
9. Let $x(t) = \{x(t), y(t)\}$ be the two-dimensional diffusion process on $[0, 2\pi]^2$ with periodic boundary conditions with drift vector $a(x, y) = (\sin(y), \sin(x))$ and diffusion matrix $b(x, y)$ with $b_{11} = b_{22} = 1$, $b_{12} = b_{21} = 0$.
(a) Write down the generator of the process $\{x(t), y(t)\}$ and the forward and backward Kolmogorov equations.
(b) Show that the constant function
\[
\rho_s(x, y) = C
\]
is the unique stationary distribution of the process $\{x(t), y(t)\}$ and calculate the normalization constant.
(c) Let $\mathbb{E}$ denote the expectation with respect to the invariant distribution $\rho_s(x, y)$. Calculate $\mathbb{E}\left(\cos(x) + \cos(y)\right)$ and $\mathbb{E}(\sin(x)\sin(y))$.
10. Let $a, D$ be positive constants and let $X(t)$ be the diffusion process on $[0, 1]$ with periodic boundary conditions and with drift and diffusion coefficients $a(x) = a$ and $b(x) = 2D$, respectively. Assume that the process starts at $x_0$, $X(0) = x_0$.
(a) Write down the generator of the process $X(t)$ and the forward and backward Kolmogorov equations.
(b) Solve the initial/boundary value problem for the forward Kolmogorov equation to calculate the transition probability density $p(x, t | x_0, 0)$.
(c) Show that the process is ergodic and calculate the invariant distribution $p_s(x)$.
(d) Calculate the stationary autocorrelation function
\[
\mathbb{E}(X(t)X(0)) = \int_0^1 \int_0^1 x x_0\, p(x, t | x_0, 0)\, p_s(x_0)\, dx\, dx_0.
\]
Chapter 7
Stochastic Differential Equations
7.1 Introduction
In this part of the course we will study stochastic differential equations (SDEs): ODEs driven by Gaussian white noise.
Let $W(t)$ denote a standard $m$-dimensional Brownian motion, $h : \mathcal{Z} \to \mathbb{R}^d$ a smooth vector-valued function and $\gamma : \mathcal{Z} \to \mathbb{R}^{d \times m}$ a smooth matrix-valued function (in this course we will take $\mathcal{Z} = \mathbb{T}^d$, $\mathbb{R}^d$ or $\mathbb{R}^l \oplus \mathbb{T}^{d-l}$). Consider the SDE
\[
\frac{dz}{dt} = h(z) + \gamma(z)\frac{dW}{dt}, \qquad z(0) = z_0. \tag{7.1}
\]
We think of the term $\frac{dW}{dt}$ as representing Gaussian white noise: a mean-zero Gaussian process with correlation $\delta(t - s)I$. The function $h$ in (7.1) is sometimes referred to as the drift and $\gamma$ as the diffusion coefficient. Such a process exists only as a distribution. The precise interpretation of (7.1) is as an integral equation for $z(t) \in C(\mathbb{R}^+, \mathcal{Z})$:
\[
z(t) = z_0 + \int_0^t h(z(s))\, ds + \int_0^t \gamma(z(s))\, dW(s). \tag{7.2}
\]
In order to make sense of this equation we need to deﬁne the stochastic integral against W(s).
7.2 The Itô and Stratonovich Stochastic Integral
For the rigorous analysis of stochastic differential equations it is necessary to define stochastic integrals of the form
\[
I(t) = \int_0^t f(s)\, dW(s), \tag{7.3}
\]
where $W(t)$ is a standard one-dimensional Brownian motion. This is not straightforward because $W(t)$ does not have bounded variation. In order to define the stochastic integral we assume that $f(t)$ is a random process, adapted to the filtration $\mathcal{F}_t$ generated by the process $W(t)$, and such that
\[
\mathbb{E}\left(\int_0^T f(s)^2\, ds\right) < \infty.
\]
The Itô stochastic integral $I(t)$ is defined as the $L^2$-limit of the Riemann sum approximation of (7.3):
\[
I(t) := \lim_{K \to \infty} \sum_{k=1}^{K-1} f(t_{k-1})\left(W(t_k) - W(t_{k-1})\right), \tag{7.4}
\]
where $t_k = k\Delta t$ and $K\Delta t = t$. Notice that the function $f(t)$ is evaluated at the left end of each interval $[t_{k-1}, t_k]$ in (7.4). The resulting Itô stochastic integral $I(t)$ is a.s. continuous in $t$. These ideas are readily generalized to the case where $W(s)$ is a standard $d$-dimensional Brownian motion and $f(s) \in \mathbb{R}^{m \times d}$ for each $s$.
The resulting integral satisfies the Itô isometry
\[
\mathbb{E}|I(t)|^2 = \int_0^t \mathbb{E}|f(s)|_F^2\, ds, \tag{7.5}
\]
where $|\cdot|_F$ denotes the Frobenius norm $|A|_F = \sqrt{\mathrm{tr}(A^T A)}$. The Itô stochastic integral is a martingale:
\[
\mathbb{E}I(t) = 0
\]
and
\[
\mathbb{E}[I(t)\,|\,\mathcal{F}_s] = I(s) \quad \forall\, t \geqslant s,
\]
where $\mathcal{F}_s$ denotes the filtration generated by $W(s)$.
Example 7.2.1. Consider the Itô stochastic integral
\[
I(t) = \int_0^t f(s)\, dW(s),
\]
where $f, W$ are scalar-valued. This is a martingale with quadratic variation
\[
\langle I \rangle_t = \int_0^t (f(s))^2\, ds.
\]
More generally, for $f, W$ in arbitrary finite dimensions, the integral $I(t)$ is a martingale with quadratic variation
\[
\langle I \rangle_t = \int_0^t (f(s) \otimes f(s))\, ds.
\]
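The left-endpoint sum (7.4) is easy to illustrate numerically. The sketch below is my own check on one sampled path: for the integrand $f = W$ the Itô integral has the known closed form $\int_0^T W\, dW = \frac{1}{2}(W(T)^2 - T)$, and the Riemann sum approaches it as the partition is refined.

```python
import numpy as np

rng = np.random.default_rng(1)
T, K = 1.0, 200_000
dW = np.sqrt(T / K) * rng.standard_normal(K)   # Brownian increments
W = np.concatenate(([0.0], np.cumsum(dW)))     # path on the grid t_k = k T/K

# Left-endpoint (Ito) Riemann sum for the integrand f = W, as in (7.4)
ito_sum = np.sum(W[:-1] * np.diff(W))
closed_form = 0.5 * (W[-1]**2 - T)
ito_error = abs(ito_sum - closed_form)
```

The discrepancy is half the deviation of the sampled quadratic variation from $T$, which vanishes in the $L^2$ limit.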
7.2.1 The Stratonovich Stochastic Integral
In addition to the Itô stochastic integral, we can also define the Stratonovich stochastic integral. It is defined as the $L^2$-limit of a different Riemann sum approximation of (7.3), namely
\[
I_{\mathrm{strat}}(t) := \lim_{K \to \infty} \sum_{k=1}^{K-1} \frac{1}{2}\left(f(t_{k-1}) + f(t_k)\right)\left(W(t_k) - W(t_{k-1})\right), \tag{7.6}
\]
where $t_k = k\Delta t$ and $K\Delta t = t$. Notice that the function $f(t)$ is evaluated at both endpoints of each interval $[t_{k-1}, t_k]$ in (7.6). The multidimensional Stratonovich integral is defined in a similar way.
The resulting integral is written as
\[
I_{\mathrm{strat}}(t) = \int_0^t f(s) \circ dW(s).
\]
The limit in (7.6) gives rise to an integral which differs from the Itô integral. The situation is more complex than that arising in the standard theory of Riemann integration for functions of bounded variation: in that case the points in $[t_{k-1}, t_k]$ where the integrand is evaluated do not affect the definition of the integral, via a limiting process. In the case of integration against Brownian motion, which does not have bounded variation, the limits differ. When $f$ and $W$ are correlated through an SDE, then a formula exists to convert between them.
7.3 Stochastic Differential Equations
Definition 7.3.1. By a solution of (7.1) we mean a $\mathcal{Z}$-valued stochastic process $\{z(t)\}$ on $t \in [0, T]$ with the properties:
i. $z(t)$ is continuous and $\mathcal{F}_t$-adapted, where the filtration is generated by the Brownian motion $W(t)$;
ii. $h(z(t)) \in L^1((0, T))$, $\gamma(z(t)) \in L^2((0, T))$;
iii. equation (7.1) holds for every $t \in [0, T]$ with probability 1.
The solution is called unique if any two solutions $x_i(t)$, $i = 1, 2$, satisfy
\[
\mathbb{P}(x_1(t) = x_2(t),\ \forall t \in [0, T]) = 1.
\]
It is well known that existence and uniqueness of solutions for ODEs (i.e. when $\gamma \equiv 0$ in (7.1)) holds for globally Lipschitz vector fields $h(x)$. A very similar theorem holds when $\gamma \neq 0$. As for ODEs, the conditions can be weakened when a priori bounds on the solution can be found.
Theorem 7.3.2. Assume that both $h(\cdot)$ and $\gamma(\cdot)$ are globally Lipschitz on $\mathcal{Z}$ and that $z_0$ is a random variable independent of the Brownian motion $W(t)$ with
\[
\mathbb{E}|z_0|^2 < \infty.
\]
Then the SDE (7.1) has a unique solution $z(t) \in C(\mathbb{R}^+; \mathcal{Z})$ with
\[
\mathbb{E}\left(\int_0^T |z(t)|^2\, dt\right) < \infty \quad \forall\, T < \infty.
\]
Furthermore, the solution of the SDE is a Markov process.
The Stratonovich analogue of (7.1) is
\[
\frac{dz}{dt} = h(z) + \gamma(z) \circ \frac{dW}{dt}, \qquad z(0) = z_0. \tag{7.7}
\]
By this we mean that $z \in C(\mathbb{R}^+, \mathcal{Z})$ satisfies the integral equation
\[
z(t) = z(0) + \int_0^t h(z(s))\, ds + \int_0^t \gamma(z(s)) \circ dW(s). \tag{7.8}
\]
By using definitions (7.4) and (7.6) it can be shown that $z$ satisfying the Stratonovich SDE (7.7) also satisfies the Itô SDE
\[
\frac{dz}{dt} = h(z) + \frac{1}{2}\nabla \cdot \left(\gamma(z)\gamma(z)^T\right) - \frac{1}{2}\gamma(z)\nabla \cdot \left(\gamma(z)^T\right) + \gamma(z)\frac{dW}{dt}, \tag{7.9a}
\]
\[
z(0) = z_0, \tag{7.9b}
\]
provided that $\gamma(z)$ is differentiable. White noise is, in most applications, an idealization of a stationary random process with short correlation time. In this context the Stratonovich interpretation of an SDE is particularly important because it often arises as the limit obtained by using smooth approximations to white noise. On the other hand the martingale machinery which comes with the Itô integral makes it more important as a mathematical object. It is very useful that we can convert from the Itô to the Stratonovich interpretation of the stochastic integral. There are other interpretations of the stochastic integral, e.g. the Klimontovich stochastic integral.
The definition of Brownian motion implies the scaling property
\[
W(ct) = \sqrt{c}\, W(t),
\]
where the above should be interpreted as holding in law. From this it follows that, if $s = ct$, then
\[
\frac{dW}{ds} = \frac{1}{\sqrt{c}}\frac{dW}{dt},
\]
again in law. Hence, if we scale time to $s = ct$ in (7.1), then we get the equation
\[
\frac{dz}{ds} = \frac{1}{c}h(z) + \frac{1}{\sqrt{c}}\gamma(z)\frac{dW}{ds}, \qquad z(0) = z_0.
\]
7.3.1 Examples of SDEs
The SDE for Brownian motion is
\[
dX = \sqrt{2\sigma}\, dW, \qquad X(0) = x.
\]
The solution is
\[
X(t) = x + \sqrt{2\sigma}\, W(t).
\]
The SDE for the Ornstein–Uhlenbeck process is
\[
dX = -\alpha X\, dt + \sqrt{2\lambda}\, dW, \qquad X(0) = x.
\]
We can solve this equation using the variation of constants formula:
\[
X(t) = e^{-\alpha t}x + \sqrt{2\lambda}\int_0^t e^{-\alpha(t-s)}\, dW(s).
\]
We can use Itô's formula to obtain equations for the moments of the OU process. The generator is
\[
L = -\alpha x \partial_x + \lambda \partial_x^2.
\]
We apply Itô's formula to the function $f(x) = x^n$ to obtain:
\[
dX(t)^n = L X(t)^n\, dt + \sqrt{2\lambda}\, \partial_x X(t)^n\, dW
= -\alpha n X(t)^n\, dt + \lambda n(n-1)X(t)^{n-2}\, dt + n\sqrt{2\lambda}X(t)^{n-1}\, dW.
\]
Consequently:
\[
X(t)^n = x^n + \int_0^t \left(-\alpha n X(s)^n + \lambda n(n-1)X(s)^{n-2}\right) ds + n\sqrt{2\lambda}\int_0^t X(s)^{n-1}\, dW.
\]
By taking the expectation in the above equation we obtain the equation for the moments of the OU process that we derived earlier using the Fokker–Planck equation:
\[
M_n(t) = x^n + \int_0^t \left(-\alpha n M_n(s) + \lambda n(n-1)M_{n-2}(s)\right) ds.
\]
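For $n = 2$ the moment equation closes (since $M_0 = 1$) and can be integrated directly. The sketch below, with illustrative parameter values of my own choosing, compares a forward-Euler integration of $\dot{M}_2 = -2\alpha M_2 + 2\lambda$ with the closed-form solution $M_2(t) = \lambda/\alpha + (x^2 - \lambda/\alpha)e^{-2\alpha t}$, which relaxes to the equilibrium value $\lambda/\alpha$ exponentially fast.

```python
import numpy as np

alpha, lam, x0 = 1.3, 0.8, 2.0     # arbitrary illustrative parameters
dt, K = 1e-4, 20_000               # integrate up to t = 2
M2 = x0**2                         # M_2(0) = x^2 for a deterministic start
for _ in range(K):
    # forward-Euler step for dM2/dt = -2 alpha M2 + 2 lam
    M2 += dt * (-2 * alpha * M2 + 2 * lam)

t = dt * K
M2_exact = lam / alpha + (x0**2 - lam / alpha) * np.exp(-2 * alpha * t)
moment_error = abs(M2 - M2_exact)
```

The $O(dt)$ Euler error shrinks linearly as the step size is reduced.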
Consider the geometric Brownian motion
\[
dX(t) = \mu X(t)\, dt + \sigma X(t)\, dW(t), \tag{7.10}
\]
where we use the Itô interpretation of the stochastic differential. The generator of this process is
\[
L = \mu x \partial_x + \frac{\sigma^2 x^2}{2}\partial_x^2.
\]
The solution to this equation is
\[
X(t) = X(0)\exp\left(\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t)\right). \tag{7.11}
\]
To derive this formula, we apply Itô's formula to the function $f(x) = \log(x)$:
\[
d\log(X(t)) = L\left(\log(X(t))\right) dt + \sigma x \partial_x \log(X(t))\, dW(t)
= \left(\mu x \frac{1}{x} + \frac{\sigma^2 x^2}{2}\left(-\frac{1}{x^2}\right)\right) dt + \sigma\, dW(t)
= \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma\, dW(t).
\]
Consequently:
\[
\log\left(\frac{X(t)}{X(0)}\right) = \left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t),
\]
from which (7.11) follows. Notice that the Stratonovich interpretation of this equation leads to the solution
\[
X(t) = X(0)\exp(\mu t + \sigma W(t)).
\]
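The closed form (7.11) makes the Itô interpretation easy to test numerically. The sketch below is my own illustration, with arbitrary parameters: it runs Euler–Maruyama for (7.10) on one Brownian path and compares the endpoint with (7.11) evaluated on the same path.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, X0 = 0.5, 0.3, 1.0      # arbitrary illustrative parameters
T, K = 1.0, 100_000
dt = T / K
dW = np.sqrt(dt) * rng.standard_normal(K)

X = X0
for dw in dW:                       # Euler-Maruyama for dX = mu X dt + sigma X dW
    X += mu * X * dt + sigma * X * dw

exact = X0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sum(dW))
gbm_rel_err = abs(X - exact) / exact
```

Repeating the comparison with the drift $\mu t$ in the exponent (the Stratonovich solution) would show a systematic discrepancy of order $\sigma^2 T / 2$.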
7.4 The Generator, Itô's Formula and the Fokker–Planck Equation
7.4.1 The Generator
Given the function $\gamma(z)$ in the SDE (7.1) we define
\[
\Gamma(z) = \gamma(z)\gamma(z)^T. \tag{7.12}
\]
The generator $L$ is then defined as
\[
Lv = h \cdot \nabla v + \frac{1}{2}\Gamma : \nabla\nabla v. \tag{7.13}
\]
This operator, equipped with a suitable domain of definition, is the generator of the Markov process given by (7.1). The formal $L^2$-adjoint operator $L^*$ is
\[
L^* v = -\nabla \cdot (hv) + \frac{1}{2}\nabla \cdot \nabla \cdot (\Gamma v).
\]
7.4.2 Itô's Formula
The Itô formula enables us to calculate the rate of change in time of functions $V : \mathcal{Z} \to \mathbb{R}^n$ evaluated at the solution of a $\mathcal{Z}$-valued SDE. Formally, we can write:
\[
\frac{d}{dt}\left(V(z(t))\right) = LV(z(t)) + \left\langle \nabla V(z(t)),\, \gamma(z(t))\frac{dW}{dt}\right\rangle.
\]
Note that if $W$ were a smooth time-dependent function this formula would not be correct: there is an additional term in $LV$, proportional to $\Gamma$, which arises from the lack of smoothness of Brownian motion. The precise interpretation of the expression for the rate of change of $V$ is in integrated form:
Lemma 7.4.1. (Itô's Formula) Assume that the conditions of Theorem 7.3.2 hold. Let $z(t)$ solve (7.1) and let $V \in C^2(\mathcal{Z}, \mathbb{R}^n)$. Then the process $V(z(t))$ satisfies
\[
V(z(t)) = V(z(0)) + \int_0^t LV(z(s))\, ds + \int_0^t \left\langle \nabla V(z(s)),\, \gamma(z(s))\, dW(s)\right\rangle.
\]
Let $\phi : \mathcal{Z} \to \mathbb{R}$ and consider the function
\[
v(z, t) = \mathbb{E}\left(\phi(z(t))\,|\,z(0) = z\right), \tag{7.14}
\]
where the expectation is with respect to all Brownian driving paths. By averaging in the Itô formula, which removes the stochastic integral, and using the Markov property, it is possible to obtain the backward Kolmogorov equation.
Theorem 7.4.2. Assume that $\phi$ is chosen sufficiently smooth so that the backward Kolmogorov equation
\[
\frac{\partial v}{\partial t} = Lv \quad \text{for } (z, t) \in \mathcal{Z} \times (0, \infty),
\]
\[
v = \phi \quad \text{for } (z, t) \in \mathcal{Z} \times \{0\}, \tag{7.15}
\]
has a unique classical solution $v(x, t) \in C^{2,1}(\mathcal{Z} \times (0, \infty))$. Then $v$ is given by (7.14) where $z(t)$ solves (7.2).
For a Stratonovich SDE the rules of standard calculus apply: Consider the Stratonovich SDE (7.29) and let $V(x) \in C^2(\mathbb{R})$. Then
\[
dV(X(t)) = \frac{dV}{dx}(X(t))\left(f(X(t))\, dt + \sigma(X(t)) \circ dW(t)\right).
\]
Consider the Stratonovich SDE (7.29) on $\mathbb{R}^d$ (i.e. $f \in \mathbb{R}^d$, $\sigma : \mathbb{R}^n \to \mathbb{R}^d$, $W(t)$ is standard Brownian motion on $\mathbb{R}^n$). The corresponding Fokker–Planck equation is:
\[
\frac{\partial \rho}{\partial t} = -\nabla \cdot (f\rho) + \frac{1}{2}\nabla \cdot \left(\sigma \nabla \cdot (\sigma \rho)\right). \tag{7.16}
\]
Now we can derive rigorously the Fokker–Planck equation.
Now we can derive rigorously the FokkerPlanck equation.
Theorem 7.4.3. Consider equation (7.2) with z(0) a random variable with density ρ
0
(z). Assume
that the law of z(t) has a density ρ(z, t) ∈ C
2,1
(Z (0, ∞)). Then ρ satisﬁes the FokkerPlanck
equation
∂ρ
∂t
= L
∗
ρ for (z, t) ∈ Z (0, ∞), (7.17a)
ρ = ρ
0
for z ∈ Z ¦0¦. (7.17b)
Proof. Let $\mathbb{E}^\mu$ denote averaging with respect to the product measure induced by the measure $\mu$ with density $\rho_0$ on $z(0)$ and the independent driving Wiener measure on the SDE itself. Averaging over random $z(0)$ distributed with density $\rho_0(z)$, we find
\[
\mathbb{E}^\mu(\phi(z(t))) = \int_{\mathcal{Z}} v(z, t)\rho_0(z)\, dz = \int_{\mathcal{Z}} (e^{Lt}\phi)(z)\rho_0(z)\, dz = \int_{\mathcal{Z}} (e^{L^* t}\rho_0)(z)\phi(z)\, dz.
\]
But since $\rho(z, t)$ is the density of $z(t)$ we also have
\[
\mathbb{E}^\mu(\phi(z(t))) = \int_{\mathcal{Z}} \rho(z, t)\phi(z)\, dz.
\]
Equating these two expressions for the expectation at time $t$ we obtain
\[
\int_{\mathcal{Z}} (e^{L^* t}\rho_0)(z)\phi(z)\, dz = \int_{\mathcal{Z}} \rho(z, t)\phi(z)\, dz.
\]
We use a density argument so that the identity can be extended to all $\phi \in L^2(\mathcal{Z})$. Hence, from the above equation we deduce that
\[
\rho(z, t) = \left(e^{L^* t}\rho_0\right)(z).
\]
Differentiation of the above equation gives (7.17a). Setting t = 0 gives the initial condition (7.17b).
7.5 Linear SDEs
In this section we study linear SDEs in arbitrary finite dimensions. Let $A \in \mathbb{R}^{d \times d}$ be a positive definite matrix and let $D > 0$ be a positive constant. We will consider the SDE
\[
dX(t) = -AX(t)\, dt + \sqrt{2D}\, dW(t)
\]
or, componentwise,
\[
dX_i(t) = -\sum_{j=1}^d A_{ij}X_j(t)\, dt + \sqrt{2D}\, dW_i(t), \quad i = 1, \dots, d.
\]
The corresponding Fokker–Planck equation is
\[
\frac{\partial p}{\partial t} = \nabla \cdot (Axp) + D\Delta p
\]
or
\[
\frac{\partial p}{\partial t} = \sum_{i,j=1}^d \frac{\partial}{\partial x_i}\left(A_{ij}x_j p\right) + D\sum_{j=1}^d \frac{\partial^2 p}{\partial x_j^2}.
\]
Let us now solve the Fokker–Planck equation with initial condition $p(x, 0 | x_0, 0) = \delta(x - x_0)$. We take the Fourier transform of the Fokker–Planck equation to obtain
\[
\frac{\partial \hat{p}}{\partial t} = -Ak \cdot \nabla_k \hat{p} - D|k|^2 \hat{p} \tag{7.18}
\]
with
\[
p(x, t | x_0, 0) = (2\pi)^{-d}\int_{\mathbb{R}^d} e^{ik \cdot x}\,\hat{p}(k, t | x_0, 0)\, dk.
\]
The initial condition is
\[
\hat{p}(k, 0 | x_0, 0) = e^{-ik \cdot x_0}. \tag{7.19}
\]
We know that the transition probability density of a linear SDE is Gaussian. Since the Fourier transform of a Gaussian function is also Gaussian, we look for a solution to (7.18) which is of the form
\[
\hat{p}(k, t | x_0, 0) = \exp\left(-ik \cdot M(t) - \frac{1}{2}k^T \Sigma(t) k\right).
\]
We substitute this into (7.18) and use the symmetry of $A$ to obtain the equations
\[
\frac{dM}{dt} = -AM \quad \text{and} \quad \frac{d\Sigma}{dt} = -2A\Sigma + 2DI,
\]
with initial conditions (which follow from (7.19)) $M(0) = x_0$ and $\Sigma(0) = 0$, where $0$ denotes the zero $d \times d$ matrix. We can solve these equations using the spectral resolution of $A = B^T \Lambda B$. The solutions are
\[
M(t) = e^{-At}M(0)
\]
and
\[
\Sigma(t) = DA^{-1} - DA^{-1}e^{-2At}.
\]
We now calculate the inverse Fourier transform of $\hat p$ to obtain the fundamental solution (Green's function) of the Fokker-Planck equation
\[
p(x,t|x_0,0) = (2\pi)^{-d/2}\big(\det(\Sigma(t))\big)^{-1/2}\exp\Big(-\frac12\big(x - e^{-At}x_0\big)^T\Sigma^{-1}(t)\big(x - e^{-At}x_0\big)\Big). \qquad (7.20)
\]
We note that the generator of the Markov process $X_t$ is of the form
\[
\mathcal{L} = -\nabla V(x)\cdot\nabla + D\Delta
\]
with $V(x) = \frac12 x^T A x = \frac12\sum_{i,j=1}^{d}A_{ij}x_i x_j$. This is a confining potential and from the theory presented in Section 6.5 we know that the process $X_t$ is ergodic. The invariant distribution is
\[
p_s(x) = \frac{1}{Z}e^{-\frac{1}{2D}x^T A x} \qquad (7.21)
\]
with $Z = \int_{\mathbb{R}^d}e^{-\frac{1}{2D}x^T A x}\,dx = (2\pi D)^{d/2}\sqrt{\det(A^{-1})}$. Using the above calculations, we can compute the stationary autocorrelation matrix:
\[
\mathbb{E}\big(X_0^T X_t\big) = \int\!\!\int x_0^T x\,p(x,t|x_0,0)\,p_s(x_0)\,dx\,dx_0.
\]
We substitute the formulas for the transition probability density and the stationary distribution, equations (7.20) and (7.21), into the above equation and perform the Gaussian integration to obtain
\[
\mathbb{E}\big(X_0^T X_t\big) = DA^{-1}e^{-At}.
\]
We now use the variation of constants formula to obtain
\[
X_t = e^{-At}X_0 + \sqrt{2D}\int_0^t e^{-A(t-s)}\,dW(s).
\]
The matrix exponential can be calculated using the spectral resolution of $A$:
\[
e^{-At} = B^T e^{-\Lambda t}B.
\]
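These formulas are easy to sanity-check numerically. Below is an illustrative sketch (the matrix $A$, the constant $D$ and the step sizes are arbitrary choices, not taken from the text) that integrates the moment equations $dM/dt = -AM$ and $d\Sigma/dt = -2A\Sigma + 2DI$ by forward Euler and compares the result with the closed-form solutions $M(t) = e^{-At}M(0)$ and $\Sigma(t) = DA^{-1}(I - e^{-2At})$, computing the matrix exponentials through the spectral resolution of $A$.

```python
import numpy as np

# Illustrative numerical sketch (A, D and the step sizes are arbitrary
# choices): integrate the moment equations dM/dt = -A M and
# dSigma/dt = -2 A Sigma + 2 D I by forward Euler and compare with the
# closed-form solutions M(t) = e^{-At} M(0), Sigma(t) = D A^{-1}(I - e^{-2At}).

def expm_sym(A, t):
    """e^{A t} for symmetric A via its spectral resolution A = B diag(lam) B^T."""
    lam, B = np.linalg.eigh(A)
    return B @ np.diag(np.exp(lam * t)) @ B.T

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # symmetric positive definite
D = 0.7
x0 = np.array([1.0, -1.0])
t_final, dt = 2.0, 1e-4

# Closed-form solutions.
M_exact = expm_sym(-A, t_final) @ x0
Sigma_exact = D * np.linalg.inv(A) @ (np.eye(2) - expm_sym(-2.0 * A, t_final))

# Forward-Euler integration of the moment ODEs.
M, Sigma = x0.copy(), np.zeros((2, 2))
for _ in range(int(t_final / dt)):
    M = M + dt * (-A @ M)
    Sigma = Sigma + dt * (-2.0 * A @ Sigma + 2.0 * D * np.eye(2))

print(np.max(np.abs(M - M_exact)), np.max(np.abs(Sigma - Sigma_exact)))
```

Note that $A^{-1}$ and $e^{-2At}$ commute, since both are functions of $A$, so the order of the factors in $\Sigma(t)$ is immaterial.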
7.6 Derivation of the Stratonovich SDE
When white noise is approximated by a smooth process this often leads to the Stratonovich interpretation of stochastic integrals, at least in one dimension. We use multiscale analysis (singular perturbation theory for Markov processes) to illustrate this phenomenon in a one-dimensional example.
Consider the equations
\[
\frac{dx}{dt} = h(x) + \frac{1}{\varepsilon}f(x)y, \qquad (7.22a)
\]
\[
\frac{dy}{dt} = -\frac{\alpha y}{\varepsilon^2} + \sqrt{\frac{2D}{\varepsilon^2}}\,\frac{dV}{dt}, \qquad (7.22b)
\]
with $V$ being a standard one-dimensional Brownian motion. We say that the process $x(t)$ is driven by colored noise: the noise that appears in (7.22a) has nonzero correlation time. The correlation function of the colored noise $\eta(t) := y(t)/\varepsilon$ is (we take $y(0) = 0$)
\[
R(t) = \mathbb{E}\big(\eta(t)\eta(s)\big) = \frac{1}{\varepsilon^2}\frac{D}{\alpha}e^{-\frac{\alpha}{\varepsilon^2}|t-s|}.
\]
The power spectrum of the colored noise $\eta(t)$ is
\[
f^{\varepsilon}(x) = \frac{1}{\varepsilon^2}\,\frac{D\varepsilon^{-2}}{\pi}\,\frac{1}{x^2 + (\alpha\varepsilon^{-2})^2} = \frac{D}{\pi}\,\frac{1}{\varepsilon^4 x^2 + \alpha^2} \;\to\; \frac{D}{\pi\alpha^2}
\]
and, consequently,
\[
\lim_{\varepsilon\to 0}\mathbb{E}\Big(\frac{y(t)}{\varepsilon}\,\frac{y(s)}{\varepsilon}\Big) = \frac{2D}{\alpha^2}\,\delta(t-s),
\]
which implies the heuristic
\[
\lim_{\varepsilon\to 0}\frac{y(t)}{\varepsilon} = \sqrt{\frac{2D}{\alpha^2}}\,\frac{dV}{dt}. \qquad (7.23)
\]
Another way of seeing this is by solving (7.22b) for $y/\varepsilon$:
\[
\frac{y}{\varepsilon} = \sqrt{\frac{2D}{\alpha^2}}\,\frac{dV}{dt} - \frac{\varepsilon}{\alpha}\,\frac{dy}{dt}. \qquad (7.24)
\]
If we neglect the $O(\varepsilon)$ term on the right-hand side then we arrive, again, at the heuristic (7.23). Both of these arguments lead us to conjecture the limiting Itô SDE
\[
\frac{dX}{dt} = h(X) + \sqrt{\frac{2D}{\alpha^2}}\,f(X)\,\frac{dV}{dt}. \qquad (7.25)
\]
In fact, as applied, the heuristic gives the incorrect limit. Whenever white noise is approximated by a smooth process, the limiting equation should be interpreted in the Stratonovich sense, giving
\[
\frac{dX}{dt} = h(X) + \sqrt{\frac{2D}{\alpha^2}}\,f(X)\circ\frac{dV}{dt}. \qquad (7.26)
\]
This is usually called the Wong-Zakai theorem. A similar result is true in arbitrary finite and even infinite dimensions. We will show this using singular perturbation theory.

Theorem 7.6.1. Assume that the initial conditions for $y(t)$ are stationary and that the function $f$ is smooth. Then the solution of equation (7.22a) converges, in the limit as $\varepsilon \to 0$, to the solution of the Stratonovich SDE (7.26).
Remarks 7.6.2. i. It is possible to prove pathwise convergence under very mild assumptions.

ii. The generator of the Stratonovich SDE (7.26) has the form
\[
\mathcal{L}_{strat} = h(x)\partial_x + \frac{D}{\alpha^2}\,f(x)\partial_x\big(f(x)\partial_x\big).
\]
iii. Consequently, the Fokker-Planck operator of the Stratonovich SDE can be written in divergence form:
\[
\mathcal{L}^*_{strat}\,\rho = -\partial_x\big(h(x)\rho\big) + \frac{D}{\alpha^2}\,\partial_x\Big(f(x)\partial_x\big(f(x)\rho\big)\Big).
\]
iv. In most applications in physics the white noise is an approximation of a more complicated noise process with nonzero correlation time. Hence, the physically correct interpretation of the stochastic integral is the Stratonovich one.

v. In higher dimensions an additional drift term might appear due to the noncommutativity of the row vectors of the diffusion matrix. This is related to the Lévy area correction in the theory of rough paths.
Proof of Theorem 7.6.1. The generator of the process $(x(t), y(t))$ is
\[
\mathcal{L} = \frac{1}{\varepsilon^2}\big(-\alpha y\partial_y + D\partial_y^2\big) + \frac{1}{\varepsilon}f(x)y\partial_x + h(x)\partial_x =: \frac{1}{\varepsilon^2}\mathcal{L}_0 + \frac{1}{\varepsilon}\mathcal{L}_1 + \mathcal{L}_2.
\]
The "fast" process is a stationary Markov process with invariant density
\[
\rho(y) = \sqrt{\frac{\alpha}{2\pi D}}\,e^{-\frac{\alpha y^2}{2D}}. \qquad (7.27)
\]
The backward Kolmogorov equation is
\[
\frac{\partial u^\varepsilon}{\partial t} = \Big(\frac{1}{\varepsilon^2}\mathcal{L}_0 + \frac{1}{\varepsilon}\mathcal{L}_1 + \mathcal{L}_2\Big)u^\varepsilon. \qquad (7.28)
\]
We look for a solution to this equation in the form of a power series expansion in $\varepsilon$:
\[
u^\varepsilon(x,y,t) = u_0 + \varepsilon u_1 + \varepsilon^2 u_2 + \dots
\]
We substitute this into (7.28) and equate terms of the same power in $\varepsilon$ to obtain the following hierarchy of equations:
\[
-\mathcal{L}_0 u_0 = 0,
\]
\[
-\mathcal{L}_0 u_1 = \mathcal{L}_1 u_0,
\]
\[
-\mathcal{L}_0 u_2 = \mathcal{L}_1 u_1 + \mathcal{L}_2 u_0 - \frac{\partial u_0}{\partial t}.
\]
The ergodicity of the fast process implies that the null space of the generator $\mathcal{L}_0$ consists only of constants in $y$. Hence,
\[
u_0 = u(x,t).
\]
The second equation in the hierarchy becomes
\[
-\mathcal{L}_0 u_1 = f(x)y\partial_x u.
\]
This equation is solvable since the right-hand side is orthogonal to the null space of the adjoint of $\mathcal{L}_0$ (this is the Fredholm alternative). We solve it using separation of variables:
\[
u_1(x,y,t) = \frac{1}{\alpha}f(x)\partial_x u\,y + \psi_1(x,t).
\]
In order for the third equation to have a solution we need to require that the right-hand side is orthogonal to the null space of $\mathcal{L}_0^*$:
\[
\int_{\mathbb{R}}\Big(\mathcal{L}_1 u_1 + \mathcal{L}_2 u_0 - \frac{\partial u_0}{\partial t}\Big)\rho(y)\,dy = 0.
\]
We calculate
\[
\int_{\mathbb{R}}\frac{\partial u_0}{\partial t}\,\rho(y)\,dy = \frac{\partial u}{\partial t}.
\]
Furthermore,
\[
\int_{\mathbb{R}}\mathcal{L}_2 u_0\,\rho(y)\,dy = h(x)\partial_x u.
\]
Finally,
\[
\int_{\mathbb{R}}\mathcal{L}_1 u_1\,\rho(y)\,dy = \int_{\mathbb{R}}f(x)y\partial_x\Big(\frac{1}{\alpha}f(x)\partial_x u\,y + \psi_1(x,t)\Big)\rho(y)\,dy
\]
\[
= \frac{1}{\alpha}f(x)\partial_x\big(f(x)\partial_x u\big)\,\langle y^2\rangle + f(x)\partial_x\psi_1(x,t)\,\langle y\rangle
\]
\[
= \frac{D}{\alpha^2}\,f(x)\partial_x\big(f(x)\partial_x u\big)
\]
\[
= \frac{D}{\alpha^2}\,f(x)\partial_x f(x)\,\partial_x u + \frac{D}{\alpha^2}\,f(x)^2\partial_x^2 u,
\]
since the stationary density (7.27) gives $\langle y^2\rangle = D/\alpha$ and $\langle y\rangle = 0$.
Putting everything together we obtain the limiting backward Kolmogorov equation
\[
\frac{\partial u}{\partial t} = \Big(h(x) + \frac{D}{\alpha^2}\,f(x)\partial_x f(x)\Big)\partial_x u + \frac{D}{\alpha^2}\,f(x)^2\partial_x^2 u,
\]
from which we read off the limiting Stratonovich SDE
\[
\frac{dX}{dt} = h(X) + \sqrt{\frac{2D}{\alpha^2}}\,f(X)\circ\frac{dV}{dt}.
\]
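The heuristic (7.24) can be seen very concretely in a discretization. The sketch below (assuming $h = 0$ and $f(x) = x$, with illustrative parameter values; none of this is from the text) simulates the fast Ornstein-Uhlenbeck process $y$ by Euler-Maruyama. Summing the recursion for $y$ reproduces the discrete analogue of (7.24) exactly, so the colored-noise integral $\int_0^t y/\varepsilon\,ds$, which here equals $\log x(t)$, agrees with the white-noise limit $\sqrt{2D}/\alpha\,V(t)$ up to an $O(\varepsilon)$ remainder.

```python
import numpy as np

# Sketch (assumptions: h = 0, f(x) = x, illustrative parameters). Summing the
# Euler-Maruyama recursion for y gives the exact discrete analogue of (7.24):
#   sum_k (y_k/eps) dt = sqrt(2D)/alpha * V(T) + (eps/alpha) * (y(0) - y(T)),
# so the colored-noise integral equals the Stratonovich white-noise limit
# up to an O(eps) remainder.

rng = np.random.default_rng(0)
alpha, Dnoise, eps = 1.0, 0.5, 0.05
T = 1.0
dt = eps**2 / 20.0                      # resolve the fast O(eps^2) time scale
n = int(T / dt)

y = 0.0
V = 0.0                                 # Brownian motion driving (7.22b)
integral = 0.0                          # accumulates \int_0^t y(s)/eps ds
y0 = y
for _ in range(n):
    dV = np.sqrt(dt) * rng.standard_normal()
    integral += (y / eps) * dt
    y = y - (alpha / eps**2) * y * dt + (np.sqrt(2.0 * Dnoise) / eps) * dV
    V += dV

lhs = integral
rhs = np.sqrt(2.0 * Dnoise) / alpha * V + (eps / alpha) * (y0 - y)
print(lhs, rhs)
```

The two printed numbers agree to floating-point round-off, because the identity is a telescoping sum of the Euler-Maruyama recursion; the $O(\varepsilon)$ remainder $(\varepsilon/\alpha)(y(0) - y(T))$ is what disappears in the limit $\varepsilon \to 0$.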
7.6.1 Itô versus Stratonovich

A Stratonovich SDE
\[
dX(t) = f(X(t))\,dt + \sigma(X(t))\circ dW(t) \qquad (7.29)
\]
can be written as an Itô SDE
\[
dX(t) = \Big(f(X(t)) + \frac12\Big(\sigma\frac{d\sigma}{dx}\Big)(X(t))\Big)\,dt + \sigma(X(t))\,dW(t).
\]
Conversely, an Itô SDE
\[
dX(t) = f(X(t))\,dt + \sigma(X(t))\,dW(t) \qquad (7.30)
\]
can be written as a Stratonovich SDE
\[
dX(t) = \Big(f(X(t)) - \frac12\Big(\sigma\frac{d\sigma}{dx}\Big)(X(t))\Big)\,dt + \sigma(X(t))\circ dW(t).
\]
The Itô and Stratonovich interpretations of an SDE can lead to equations with very different properties!

When the diffusion coefficient depends on the solution of the SDE $X(t)$, we say that we have an equation with multiplicative noise.
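The conversion formulas can be illustrated numerically on geometric Brownian motion (a sketch with arbitrary parameter values, not from the text). We integrate $dX = \sigma X\circ dW$ with a Heun predictor-corrector scheme, which converges to the Stratonovich solution, and integrate the equivalent Itô SDE, carrying the extra drift $\frac12\sigma^2 X$, by Euler-Maruyama; both track the exact Stratonovich solution $X(0)\exp(\sigma W(t))$ along the same Brownian path.

```python
import numpy as np

# Sketch (illustrative parameters): the Stratonovich SDE dX = sigma X o dW has
# exact solution X(t) = X(0) exp(sigma W(t)); its Itô form carries the extra
# drift (1/2) sigma^2 X. A Heun scheme (Stratonovich) and Euler-Maruyama on
# the corrected Itô drift both approximate the same solution.

rng = np.random.default_rng(1)
sigma, X0, T = 0.3, 1.0, 1.0
n = 100_000
dt = T / n

W = 0.0
x_heun = X0     # Stratonovich interpretation, Heun predictor-corrector
x_em = X0       # Itô form with the conversion drift (1/2) sigma^2 x
for _ in range(n):
    dW = np.sqrt(dt) * rng.standard_normal()
    # Heun step for dX = sigma X o dW
    x_pred = x_heun + sigma * x_heun * dW
    x_heun = x_heun + 0.5 * sigma * (x_heun + x_pred) * dW
    # Euler-Maruyama step for the equivalent Itô SDE
    x_em = x_em + 0.5 * sigma**2 * x_em * dt + sigma * x_em * dW
    W += dW

x_exact = X0 * np.exp(sigma * W)
print(x_heun, x_em, x_exact)
```

Running plain Euler-Maruyama on $dX = \sigma X\,dW$ without the drift correction would instead approximate $X(0)\exp(\sigma W(t) - \frac12\sigma^2 t)$, which is exactly the gap between the two interpretations.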
7.7 Numerical Solution of SDEs
7.8 Parameter Estimation for SDEs
7.9 Noise Induced Transitions
Consider the Landau equation:
\[
\frac{dX_t}{dt} = X_t\big(c - X_t^2\big), \qquad X_0 = x. \qquad (7.31)
\]
This is a gradient flow for the potential $V(x) = \frac14 x^4 - \frac12 cx^2$. When $c < 0$ all solutions are attracted to the single steady state $X_* = 0$. When $c > 0$ the steady state $X_* = 0$ becomes unstable and $X_t \to \sqrt{c}$ if $x > 0$ and $X_t \to -\sqrt{c}$ if $x < 0$. Consider additive random perturbations to the Landau equation:
\[
\frac{dX_t}{dt} = X_t\big(c - X_t^2\big) + \sqrt{2\sigma}\,\frac{dW_t}{dt}, \qquad X_0 = x. \qquad (7.32)
\]
This equation defines an ergodic Markov process on $\mathbb{R}$: there exists a unique invariant distribution
\[
\rho(x) = Z^{-1}e^{-V(x)/\sigma}, \qquad Z = \int_{\mathbb{R}}e^{-V(x)/\sigma}\,dx, \qquad V(x) = \frac14 x^4 - \frac12 cx^2.
\]
$\rho(x)$ is a probability density for all values of $c \in \mathbb{R}$. The presence of additive noise in some sense "trivializes" the dynamics. The dependence of various averaged quantities on $c$ resembles the physical situation of a second order phase transition.
Consider now multiplicative perturbations of the Landau equation:
\[
\frac{dX_t}{dt} = X_t\big(c - X_t^2\big) + \sqrt{2\sigma}\,X_t\,\frac{dW_t}{dt}, \qquad X_0 = x, \qquad (7.33)
\]
where the stochastic differential is interpreted in the Itô sense. The generator of this process is
\[
\mathcal{L} = x(c - x^2)\partial_x + \sigma x^2\partial_x^2.
\]
Notice that $X_t = 0$ is always a solution of (7.33). Thus, if we start with $x > 0$ ($x < 0$) the solution will remain positive (negative). We will assume that $x > 0$.

Consider the function $Y_t = \log(X_t)$. We apply Itô's formula to this function:
\[
dY_t = \mathcal{L}\log(X_t)\,dt + \sqrt{2\sigma}\,X_t\,\partial_x\log(X_t)\,dW_t
\]
\[
= \Big(X_t(c - X_t^2)\frac{1}{X_t} - \sigma X_t^2\frac{1}{X_t^2}\Big)dt + \sqrt{2\sigma}\,X_t\frac{1}{X_t}\,dW_t
\]
\[
= (c - \sigma)\,dt - X_t^2\,dt + \sqrt{2\sigma}\,dW_t.
\]
Thus, we have been able to transform (7.33) into an SDE with additive noise:
\[
dY_t = \big((c - \sigma) - e^{2Y_t}\big)\,dt + \sqrt{2\sigma}\,dW_t. \qquad (7.34)
\]
This is a gradient flow with potential
\[
V(y) = -\Big((c - \sigma)y - \frac12 e^{2y}\Big).
\]
The invariant measure, if it exists, is of the form
\[
\rho(y)\,dy = Z^{-1}e^{-V(y)/\sigma}\,dy.
\]
Going back to the variable $x$ we obtain
\[
\rho(x)\,dx = Z^{-1}x^{(c/\sigma - 2)}e^{-\frac{x^2}{2\sigma}}\,dx.
\]
We need to make sure that this distribution is integrable:
\[
Z = \int_0^{+\infty}x^{\gamma}e^{-\frac{x^2}{2\sigma}}\,dx < \infty, \qquad \gamma = \frac{c}{\sigma} - 2.
\]
For this it is necessary that
\[
\gamma > -1 \;\Longrightarrow\; c > \sigma.
\]
Not all multiplicative random perturbations lead to ergodic behavior. The dependence of the invariant distribution on $c$ is similar to the physical situation of first order phase transitions.
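The integrability threshold can be checked numerically. The sketch below (illustrative values of $c$ and $\sigma$, not from the text) evaluates the truncated integral on $(\delta, 40)$ by the trapezoid rule: for $c > \sigma$ the result stabilizes as $\delta \to 0$, while for $c < \sigma$ it blows up like $\delta^{\gamma+1}$.

```python
import numpy as np

# Illustrative check of the condition gamma = c/sigma - 2 > -1 (i.e. c > sigma)
# for integrability of rho(x) ~ x^gamma exp(-x^2/(2 sigma)) near x = 0.

def tail_integral(c, sigma, delta):
    """Trapezoid rule for the integral of x^gamma exp(-x^2/(2 sigma)) on (delta, 40)."""
    x = np.geomspace(delta, 40.0, 200_000)
    y = x**(c / sigma - 2.0) * np.exp(-x**2 / (2.0 * sigma))
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

sigma = 1.0
I_erg_1 = tail_integral(3.0, sigma, 1e-3)   # c > sigma: gamma = 1, integrable
I_erg_2 = tail_integral(3.0, sigma, 1e-6)
I_bad_1 = tail_integral(0.5, sigma, 1e-3)   # c < sigma: gamma = -1.5, divergent
I_bad_2 = tail_integral(0.5, sigma, 1e-6)

print(I_erg_2 - I_erg_1, I_bad_2 / I_bad_1)
```

In the ergodic case ($c = 3$, $\sigma = 1$) shrinking the cutoff changes the integral by a negligible amount, while in the non-ergodic case ($c = 0.5$) the integral grows without bound as the cutoff shrinks, exactly as the exponent $\gamma$ predicts.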
7.10 Discussion and Bibliography

Colored Noise. When the noise which drives an SDE has nonzero correlation time we say that we have colored noise. The properties of the SDE (stability, ergodicity etc.) are quite robust under "coloring of the noise". See G. Blankenship and G.C. Papanicolaou, Stability and control of stochastic systems with wide-band noise disturbances. I, SIAM J. Appl. Math., 34(3), 1978, pp. 437-476. Colored noise appears in many applications in physics and chemistry. For a review see P. Hanggi and P. Jung, Colored noise in dynamical systems, Adv. Chem. Phys. 89, 239 (1995).

In the case where there is an additional small time scale in the problem, in addition to the correlation time of the colored noise, it is not clear what the right interpretation of the stochastic integral is (in the limit as both small time scales go to 0). This is usually called the Itô versus Stratonovich problem. Consider, for example, the SDE
\[
\tau\ddot X = -\dot X + v(X)\eta^{\varepsilon}(t),
\]
where $\eta^{\varepsilon}(t)$ is colored noise with correlation time $\varepsilon^2$. In the limit where both small time scales go to 0 we can get either Itô or Stratonovich or neither. See [51, 71].

Noise-induced transitions are studied extensively in [42]. The material in Section 7.9 is based on [59]. See also [58].
7.11 Exercises

1. Calculate all moments of the geometric Brownian motion for the Itô and Stratonovich interpretations of the stochastic integral.

2. Study additive and multiplicative random perturbations of the ODE
\[
\frac{dx}{dt} = x\big(c + 2x^2 - x^4\big).
\]
3. Analyze equation (7.33) for the Stratonovich interpretation of the stochastic integral.
Chapter 8
The Langevin Equation
8.1 Introduction
8.2 The Fokker-Planck Equation in Phase Space (Klein-Kramers Equation)
Consider a diffusion process in two dimensions for the variables $q$ (position) and $p$ (momentum). The generator of this Markov process is
\[
\mathcal{L} = p\cdot\nabla_q - \nabla_q V\cdot\nabla_p + \gamma\big(-p\cdot\nabla_p + D\Delta_p\big). \qquad (8.1)
\]
The $L^2(dp\,dq)$-adjoint is
\[
\mathcal{L}^*\rho = -p\cdot\nabla_q\rho + \nabla_q V\cdot\nabla_p\rho + \gamma\big(\nabla_p\cdot(p\rho) + D\Delta_p\rho\big).
\]
The corresponding Fokker-Planck equation is
\[
\frac{\partial\rho}{\partial t} = \mathcal{L}^*\rho.
\]
The corresponding stochastic differential equation is the Langevin equation
\[
\ddot X_t = -\nabla V(X_t) - \gamma\dot X_t + \sqrt{2\gamma D}\,\dot W_t. \qquad (8.2)
\]
This is Newton's equation perturbed by dissipation and noise. The Fokker-Planck equation for the Langevin equation, which is sometimes called the Klein-Kramers-Chandrasekhar equation, was first derived by Klein in 1921 and was studied by Kramers in his famous paper [?]. Notice that $\mathcal{L}^*$ is not a uniformly elliptic operator: there are second order derivatives only with respect to $p$ and not $q$. This is an example of a degenerate elliptic operator. It is, however, hypoelliptic. We can
still prove existence, uniqueness and regularity of solutions for the Fokker-Planck equation, and obtain estimates on the solution. It is not possible to obtain the solution of the FP equation for an arbitrary potential. We can, however, calculate the (unique normalized) solution of the stationary Fokker-Planck equation.
Theorem 8.2.1. Let $V(x)$ be a smooth confining potential. Then the Markov process with generator (8.1) (with $D = \beta^{-1}$) is ergodic. The unique invariant distribution is the Maxwell-Boltzmann distribution
\[
\rho_\beta(p,q) = \frac{1}{Z}e^{-\beta H(p,q)} \qquad (8.3)
\]
where
\[
H(p,q) = \frac12|p|^2 + V(q)
\]
is the Hamiltonian, $\beta = (k_B T)^{-1}$ is the inverse temperature and the normalization factor $Z$ is the partition function
\[
Z = \int_{\mathbb{R}^{2d}}e^{-\beta H(p,q)}\,dp\,dq.
\]
It is possible to obtain rates of convergence in either a weighted $L^2$ norm or the relative entropy norm:
\[
H\big(\rho(\cdot,t)\,\big|\,\rho_\beta\big) \leqslant Ce^{-\alpha t}.
\]
The proof of this result is very complicated, since the generator $\mathcal{L}$ is degenerate and non-selfadjoint. See for example [?] and the references therein.
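A quick numerical illustration of the theorem (a sketch, not from the text, using the quadratic potential $V(q) = q^2/2$ and $D = \beta^{-1}$, so that (8.3) is the invariant distribution): an ensemble of Euler-Maruyama trajectories of the Langevin equation (8.2) equilibrates to the Maxwell-Boltzmann distribution, whose marginals here are Gaussian with $\mathrm{Var}(p) = \beta^{-1}$ and $\mathrm{Var}(q) = \beta^{-1}$ (since $V''=1$).

```python
import numpy as np

# Sketch (quadratic potential V(q) = q^2/2, illustrative parameters): an
# ensemble of Langevin trajectories, integrated by Euler-Maruyama, should
# equilibrate to the Maxwell-Boltzmann distribution with Gaussian marginals
# Var(p) = 1/beta and Var(q) = 1/beta.

rng = np.random.default_rng(2)
beta, gamma = 1.0, 1.0
Dcoef = 1.0 / beta                      # fluctuation-dissipation: D = beta^{-1}
n_particles, dt, n_steps = 20_000, 0.005, 4_000

q = np.zeros(n_particles)
p = np.zeros(n_particles)
for _ in range(n_steps):
    dW = np.sqrt(dt) * rng.standard_normal(n_particles)
    q_new = q + p * dt
    p = p + (-q - gamma * p) * dt + np.sqrt(2.0 * gamma * Dcoef) * dW
    q = q_new

print(np.var(p), np.var(q))
```

Both printed variances should be close to $\beta^{-1} = 1$, up to the $O(\Delta t)$ discretization bias of the Euler scheme and the Monte Carlo error of the finite ensemble.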
Let $\rho(q,p,t)$ be the solution of the Kramers equation and let $\rho_\beta(q,p)$ be the Maxwell-Boltzmann distribution. We can write
\[
\rho(q,p,t) = h(q,p,t)\,\rho_\beta(q,p),
\]
where $h(q,p,t)$ solves the equation
\[
\frac{\partial h}{\partial t} = -\mathcal{A}h + \gamma\mathcal{S}h \qquad (8.4)
\]
where
\[
\mathcal{A} = p\cdot\nabla_q - \nabla_q V\cdot\nabla_p, \qquad \mathcal{S} = -p\cdot\nabla_p + \beta^{-1}\Delta_p.
\]
The operator $\mathcal{A}$ is antisymmetric in $L^2_\rho := L^2\big(\mathbb{R}^{2d};\rho_\beta(q,p)\big)$, whereas $\mathcal{S}$ is symmetric.
Let $X_i := -\frac{\partial}{\partial p_i}$. The $L^2_\rho$-adjoint of $X_i$ is
\[
X_i^* = -\beta p_i + \frac{\partial}{\partial p_i}.
\]
We have that
\[
\mathcal{S} = -\beta^{-1}\sum_{i=1}^{d}X_i^* X_i.
\]
Consequently, the generator of the Markov process $\{q(t), p(t)\}$ can be written in Hörmander's "sum of squares" form:
\[
\mathcal{L} = \mathcal{A} - \gamma\beta^{-1}\sum_{i=1}^{d}X_i^* X_i. \qquad (8.5)
\]
We calculate the commutators between the vector fields in (8.5):
\[
[\mathcal{A}, X_i] = \frac{\partial}{\partial q_i}, \qquad [X_i, X_j] = 0, \qquad [X_i, X_j^*] = \beta\delta_{ij}.
\]
Consequently,
\[
\mathrm{Lie}\big(X_1,\dots,X_d,[\mathcal{A},X_1],\dots,[\mathcal{A},X_d]\big) = \mathrm{Lie}\big(\nabla_p,\nabla_q\big),
\]
which spans $T_{p,q}\mathbb{R}^{2d}$ for all $p, q \in \mathbb{R}^d$. This shows that the generator $\mathcal{L}$ is a hypoelliptic operator.
Let now $Y_i = -\frac{\partial}{\partial q_i}$, with $L^2_\rho$-adjoint $Y_i^* = \frac{\partial}{\partial q_i} - \beta\frac{\partial V}{\partial q_i}$. We have that
\[
X_i^* Y_i - Y_i^* X_i = \beta\Big(p_i\frac{\partial}{\partial q_i} - \frac{\partial V}{\partial q_i}\frac{\partial}{\partial p_i}\Big).
\]
Consequently, the generator can be written in the form
\[
\mathcal{L} = \beta^{-1}\sum_{i=1}^{d}\big(X_i^* Y_i - Y_i^* X_i - \gamma X_i^* X_i\big). \qquad (8.6)
\]
Notice also that
\[
\mathcal{L}_V := -\nabla_q V\cdot\nabla_q + \beta^{-1}\Delta_q = -\beta^{-1}\sum_{i=1}^{d}Y_i^* Y_i.
\]
The phase-space Fokker-Planck equation can be written in the form
\[
\frac{\partial\rho}{\partial t} + p\cdot\nabla_q\rho - \nabla_q V\cdot\nabla_p\rho = Q(\rho, f_B),
\]
where the collision operator has the form
\[
Q(\rho, f_B) = D\,\nabla_p\cdot\Big(f_B\,\nabla_p\big(f_B^{-1}\rho\big)\Big)
\]
and $f_B$ denotes the Maxwell-Boltzmann distribution. The Fokker-Planck equation has a similar structure to the Boltzmann equation (the basic equation in the kinetic theory of gases), with the difference that the collision operator for the FP equation is linear. Convergence of solutions of the Boltzmann equation to the Maxwell-Boltzmann distribution has also been proved. See ??.
We can study the backward and forward Kolmogorov equations for (8.2) by expanding the solution with respect to the Hermite basis. We consider the problem in one dimension and we set $D = 1$. The generator of the process is
\[
\mathcal{L} = p\partial_q - V'(q)\partial_p + \gamma\big(-p\partial_p + \partial_p^2\big) =: \mathcal{L}_1 + \gamma\mathcal{L}_0,
\]
where
\[
\mathcal{L}_0 := -p\partial_p + \partial_p^2 \quad\text{and}\quad \mathcal{L}_1 := p\partial_q - V'(q)\partial_p.
\]
The backward Kolmogorov equation is
\[
\frac{\partial h}{\partial t} = \mathcal{L}h. \qquad (8.7)
\]
The solution should be an element of the weighted $L^2$ space
\[
L^2_\rho = \Big\{f \;\Big|\; \int_{\mathbb{R}^2}|f|^2\,Z^{-1}e^{-\beta H(p,q)}\,dp\,dq < \infty\Big\}.
\]
We notice that the invariant measure of our Markov process is a product measure:
\[
e^{-\beta H(p,q)} = e^{-\beta\frac12|p|^2}\,e^{-\beta V(q)}.
\]
The space $L^2\big(e^{-\beta\frac12|p|^2}\,dp\big)$ is spanned by the Hermite polynomials. Consequently, we can expand the solution of (8.7) into the Hermite basis:
\[
h(p,q,t) = \sum_{n=0}^{\infty}h_n(q,t)f_n(p), \qquad (8.8)
\]
where $f_n(p) = \frac{1}{\sqrt{n!}}H_n(p)$. Our plan is to substitute (8.8) into (8.7) and obtain a sequence of equations for the coefficients $h_n(q,t)$. We have
\[
\mathcal{L}_0 h = \mathcal{L}_0\sum_{n=0}^{\infty}h_n f_n = -\sum_{n=0}^{\infty}n h_n f_n.
\]
Furthermore,
\[
\mathcal{L}_1 h = -\partial_q V\,\partial_p h + p\partial_q h.
\]
We calculate each term on the right-hand side of the above equation separately. For this we will need the formulas
\[
\partial_p f_n = \sqrt{n}\,f_{n-1} \quad\text{and}\quad pf_n = \sqrt{n}\,f_{n-1} + \sqrt{n+1}\,f_{n+1}.
\]
We have
\[
p\partial_q h = p\partial_q\sum_{n=0}^{\infty}h_n f_n = \partial_q h_0\,pf_0 + \sum_{n=1}^{\infty}\partial_q h_n\,pf_n
\]
\[
= \partial_q h_0 f_1 + \sum_{n=1}^{\infty}\partial_q h_n\big(\sqrt{n}\,f_{n-1} + \sqrt{n+1}\,f_{n+1}\big)
\]
\[
= \sum_{n=0}^{\infty}\big(\sqrt{n+1}\,\partial_q h_{n+1} + \sqrt{n}\,\partial_q h_{n-1}\big)f_n
\]
with $h_{-1}\equiv 0$. Furthermore,
\[
\partial_q V\,\partial_p h = \sum_{n=0}^{\infty}\partial_q V\,h_n\,\partial_p f_n = \sum_{n=0}^{\infty}\partial_q V\,h_n\sqrt{n}\,f_{n-1} = \sum_{n=0}^{\infty}\partial_q V\,h_{n+1}\sqrt{n+1}\,f_n.
\]
Consequently,
\[
\mathcal{L}h = \mathcal{L}_1 h + \gamma\mathcal{L}_0 h = \sum_{n=0}^{\infty}\Big(-\gamma n h_n + \sqrt{n+1}\,\partial_q h_{n+1} + \sqrt{n}\,\partial_q h_{n-1} - \sqrt{n+1}\,\partial_q V\,h_{n+1}\Big)f_n.
\]
Using the orthonormality of the eigenfunctions of $\mathcal{L}_0$ we obtain the following set of equations which determine $\{h_n(q,t)\}_{n=0}^{\infty}$:
\[
\dot h_n = -\gamma n h_n + \sqrt{n+1}\,\partial_q h_{n+1} + \sqrt{n}\,\partial_q h_{n-1} - \sqrt{n+1}\,\partial_q V\,h_{n+1}, \qquad n = 0, 1, \dots
\]
This set of equations is usually called the Brinkman hierarchy (1956). We can use this approach to develop a numerical method for solving the Klein-Kramers equation. For this we need to expand each coefficient $h_n$ in an appropriate basis with respect to $q$. Obvious choices are either the Hermite basis (polynomial potentials) or the standard Fourier basis (periodic potentials). We will do this for the case of periodic potentials. The resulting method is usually called the continued fraction expansion. See [82]. The Hermite expansion of the distribution function with respect to the velocity is used in the study of various kinetic equations (including the Boltzmann equation). It was initiated by Grad in the late 1940s. It is quite often used in the approximate calculation of transport coefficients (e.g. the diffusion coefficient). This expansion can be justified rigorously for the Fokker-Planck equation. See [67]. This expansion can also be used in order to solve the Poisson equation $-\mathcal{L}\phi = f(p,q)$. See [73].
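The two recurrence formulas used in the computation above can be verified numerically for the scaled probabilists' Hermite polynomials $f_n(p) = \mathrm{He}_n(p)/\sqrt{n!}$ using NumPy's `hermite_e` module (a small check, not part of the text):

```python
import numpy as np
from numpy.polynomial import hermite_e as He
from math import factorial, sqrt

# Check, on a grid of sample points, the formulas
#   d/dp f_n = sqrt(n) f_{n-1}   and   p f_n = sqrt(n) f_{n-1} + sqrt(n+1) f_{n+1}
# for f_n(p) = He_n(p)/sqrt(n!) (probabilists' Hermite polynomials).

def f(n, pts):
    """Scaled probabilists' Hermite polynomial He_n / sqrt(n!)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return He.hermeval(pts, coeffs) / sqrt(factorial(n))

pts = np.linspace(-3.0, 3.0, 41)
errs = []
for n in range(1, 8):
    # derivative formula: f_n' = sqrt(n) f_{n-1}
    dcoeffs = He.hermeder(np.eye(n + 1)[n])
    lhs_d = He.hermeval(pts, dcoeffs) / sqrt(factorial(n))
    errs.append(np.max(np.abs(lhs_d - sqrt(n) * f(n - 1, pts))))
    # three-term recurrence: p f_n = sqrt(n) f_{n-1} + sqrt(n+1) f_{n+1}
    errs.append(np.max(np.abs(pts * f(n, pts)
                              - sqrt(n) * f(n - 1, pts)
                              - sqrt(n + 1) * f(n + 1, pts))))
print(max(errs))
```

Both identities hold to round-off precision; they follow from the standard relations $\mathrm{He}_n' = n\,\mathrm{He}_{n-1}$ and $x\,\mathrm{He}_n = \mathrm{He}_{n+1} + n\,\mathrm{He}_{n-1}$ after dividing by $\sqrt{n!}$.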
8.3 The Langevin Equation in a Harmonic Potential
There are very few potentials for which we can solve the Langevin equation or calculate the eigenvalues and eigenfunctions of the generator of the Markov process $\{q(t), p(t)\}$. One case where we can calculate everything explicitly is that of a Brownian particle in a quadratic (harmonic) potential
\[
V(q) = \frac12\omega_0^2 q^2. \qquad (8.9)
\]
The Langevin equation is
\[
\ddot q = -\omega_0^2 q - \gamma\dot q + \sqrt{2\gamma\beta^{-1}}\,\dot W \qquad (8.10)
\]
or
\[
\dot q = p, \qquad \dot p = -\omega_0^2 q - \gamma p + \sqrt{2\gamma\beta^{-1}}\,\dot W. \qquad (8.11)
\]
This is a linear equation that can be solved explicitly. Rather than doing this, we will calculate the eigenvalues and eigenfunctions of the generator, which takes the form
\[
\mathcal{L} = p\partial_q - \omega_0^2 q\partial_p + \gamma\big(-p\partial_p + \beta^{-1}\partial_p^2\big). \qquad (8.12)
\]
The corresponding Fokker-Planck operator is
\[
\mathcal{L}^*\rho = -p\partial_q\rho + \omega_0^2 q\partial_p\rho + \gamma\big(\partial_p(p\rho) + \beta^{-1}\partial_p^2\rho\big). \qquad (8.13)
\]
The process $\{q(t), p(t)\}$ is an ergodic Markov process with Gaussian invariant measure
\[
\rho_\beta(q,p)\,dq\,dp = \frac{\beta\omega_0}{2\pi}\,e^{-\frac{\beta}{2}p^2 - \frac{\beta\omega_0^2}{2}q^2}\,dq\,dp. \qquad (8.14)
\]
For the calculation of the eigenvalues and eigenfunctions of the operator $\mathcal{L}$ it is convenient to introduce creation and annihilation operators in both the position and momentum variables. We set
\[
a^- = \beta^{-1/2}\partial_p, \qquad a^+ = -\beta^{-1/2}\partial_p + \beta^{1/2}p \qquad (8.15)
\]
and
\[
b^- = \omega_0^{-1}\beta^{-1/2}\partial_q, \qquad b^+ = -\omega_0^{-1}\beta^{-1/2}\partial_q + \omega_0\beta^{1/2}q. \qquad (8.16)
\]
We have that
\[
a^+a^- = -\beta^{-1}\partial_p^2 + p\partial_p
\]
and
\[
b^+b^- = -\omega_0^{-2}\beta^{-1}\partial_q^2 + q\partial_q.
\]
Consequently, the operator
\[
\widehat{\mathcal{L}} = -a^+a^- - b^+b^- \qquad (8.17)
\]
is the generator of the OU process in two dimensions.

The operators $a^\pm$, $b^\pm$ satisfy the commutation relations
\[
[a^+, a^-] = -1, \qquad (8.18a)
\]
\[
[b^+, b^-] = -1, \qquad (8.18b)
\]
\[
[a^\pm, b^\pm] = 0. \qquad (8.18c)
\]
See Exercise 3. Using now the operators $a^\pm$ and $b^\pm$ we can write the generator $\mathcal{L}$ in the form
\[
\mathcal{L} = -\gamma a^+a^- - \omega_0\big(b^+a^- - a^+b^-\big), \qquad (8.19)
\]
which is a particular case of (8.6). In order to calculate the eigenvalues and eigenfunctions of (8.19) we need to make an appropriate change of variables in order to bring the operator $\mathcal{L}$ into the "decoupled" form (8.17). Clearly, this is a linear transformation and can be written in the form
\[
Y = AX
\]
where $X = (q,p)$, for some $2\times 2$ matrix $A$. It is somewhat easier to make this change of variables at the level of the creation and annihilation operators. In particular, our goal is to find first order differential operators $c^\pm$ and $d^\pm$ so that the operator (8.19) becomes
\[
\mathcal{L} = -Cc^+c^- - Dd^+d^- \qquad (8.20)
\]
for some appropriate constants $C$ and $D$. Since our goal is, essentially, to map $\mathcal{L}$ to the two-dimensional OU process, we require that the operators $c^\pm$ and $d^\pm$ satisfy the canonical commutation relations
\[
[c^+, c^-] = -1, \qquad (8.21a)
\]
\[
[d^+, d^-] = -1, \qquad (8.21b)
\]
\[
[c^\pm, d^\pm] = 0. \qquad (8.21c)
\]
The operators $c^\pm$ and $d^\pm$ should be given as linear combinations of the old operators $a^\pm$ and $b^\pm$. From the structure of the generator $\mathcal{L}$ (8.19), the decoupled form (8.20) and the commutation relations (8.21) and (8.18) we conclude that $c^\pm$ and $d^\pm$ should be of the form
\[
c^+ = \alpha_{11}a^+ + \alpha_{12}b^+, \qquad (8.22a)
\]
\[
c^- = \alpha_{21}a^- + \alpha_{22}b^-, \qquad (8.22b)
\]
\[
d^+ = \beta_{11}a^+ + \beta_{12}b^+, \qquad (8.22c)
\]
\[
d^- = \beta_{21}a^- + \beta_{22}b^-. \qquad (8.22d)
\]
Notice that $c^-$ and $d^-$ are not the adjoints of $c^+$ and $d^+$. If we now substitute these equations into (8.20), equate it with (8.19), and use the commutation relations (8.21), we obtain a system of equations for the coefficients $\{\alpha_{ij}\}$, $\{\beta_{ij}\}$. In order to write down the formulas for these coefficients it is convenient to introduce the eigenvalues of the deterministic problem
\[
\ddot q = -\gamma\dot q - \omega_0^2 q.
\]
The solution of this equation is
\[
q(t) = C_1 e^{-\lambda_1 t} + C_2 e^{-\lambda_2 t}
\]
with
\[
\lambda_{1,2} = \frac{\gamma\pm\delta}{2}, \qquad \delta = \sqrt{\gamma^2 - 4\omega_0^2}. \qquad (8.23)
\]
The eigenvalues satisfy the relations
\[
\lambda_1 + \lambda_2 = \gamma, \qquad \lambda_1 - \lambda_2 = \delta, \qquad \lambda_1\lambda_2 = \omega_0^2. \qquad (8.24)
\]
Proposition 8.3.1. Let $\mathcal{L}$ be the generator (8.19) and let $c^\pm$, $d^\pm$ be the operators
\[
c^+ = \frac{1}{\sqrt\delta}\Big(\sqrt{\lambda_1}\,a^+ + \sqrt{\lambda_2}\,b^+\Big), \qquad (8.25a)
\]
\[
c^- = \frac{1}{\sqrt\delta}\Big(\sqrt{\lambda_1}\,a^- - \sqrt{\lambda_2}\,b^-\Big), \qquad (8.25b)
\]
\[
d^+ = \frac{1}{\sqrt\delta}\Big(\sqrt{\lambda_2}\,a^+ + \sqrt{\lambda_1}\,b^+\Big), \qquad (8.25c)
\]
\[
d^- = \frac{1}{\sqrt\delta}\Big(-\sqrt{\lambda_2}\,a^- + \sqrt{\lambda_1}\,b^-\Big). \qquad (8.25d)
\]
Then $c^\pm$, $d^\pm$ satisfy the canonical commutation relations (8.21) as well as
\[
[\mathcal{L}, c^\pm] = -\lambda_1 c^\pm, \qquad [\mathcal{L}, d^\pm] = -\lambda_2 d^\pm. \qquad (8.26)
\]
Furthermore, the operator $\mathcal{L}$ can be written in the form
\[
\mathcal{L} = -\lambda_1 c^+c^- - \lambda_2 d^+d^-. \qquad (8.27)
\]
Proof. First we check the commutation relations:
\[
[c^+, c^-] = \frac{1}{\delta}\Big(\lambda_1[a^+,a^-] - \lambda_2[b^+,b^-]\Big) = \frac{1}{\delta}(-\lambda_1 + \lambda_2) = -1.
\]
Similarly,
\[
[d^+, d^-] = \frac{1}{\delta}\Big(-\lambda_2[a^+,a^-] + \lambda_1[b^+,b^-]\Big) = \frac{1}{\delta}(\lambda_2 - \lambda_1) = -1.
\]
Clearly, we have that
\[
[c^+, d^+] = [c^-, d^-] = 0.
\]
Furthermore,
\[
[c^+, d^-] = \frac{1}{\delta}\Big(-\sqrt{\lambda_1\lambda_2}[a^+,a^-] + \sqrt{\lambda_1\lambda_2}[b^+,b^-]\Big) = \frac{1}{\delta}\Big(\sqrt{\lambda_1\lambda_2} - \sqrt{\lambda_1\lambda_2}\Big) = 0.
\]
Finally:
\[
[\mathcal{L}, c^+] = -\lambda_1 c^+c^-c^+ + \lambda_1 c^+c^+c^- = -\lambda_1 c^+\big(1 + c^+c^-\big) + \lambda_1 c^+c^+c^- = -\lambda_1 c^+,
\]
and similarly for the other equations in (8.26). Now we calculate
\[
\mathcal{L} = -\lambda_1 c^+c^- - \lambda_2 d^+d^-
\]
\[
= -\frac{\lambda_1^2 - \lambda_2^2}{\delta}\,a^+a^- + 0\cdot b^+b^- + \frac{\sqrt{\lambda_1\lambda_2}}{\delta}(\lambda_1 - \lambda_2)\,a^+b^- + \frac{\sqrt{\lambda_1\lambda_2}}{\delta}(\lambda_2 - \lambda_1)\,b^+a^-
\]
\[
= -\gamma a^+a^- - \omega_0\big(b^+a^- - a^+b^-\big),
\]
which is precisely (8.19). In the above calculation we used (8.24).
Using now (8.27) we can readily obtain the eigenvalues and eigenfunctions of $\mathcal{L}$. From our experience with the two-dimensional OU process (or, the Schrödinger operator for the two-dimensional quantum harmonic oscillator), we expect that the eigenfunctions should be tensor products of Hermite polynomials. Indeed, we have the following, which is the main result of this section.

Theorem 8.3.2. The eigenvalues and eigenfunctions of the generator of the Markov process $\{q, p\}$ (8.11) are
\[
\lambda_{nm} = \lambda_1 n + \lambda_2 m = \frac12\gamma(n+m) + \frac12\delta(n-m), \qquad n, m = 0, 1, \dots \qquad (8.28)
\]
and
\[
\phi_{nm}(q,p) = \frac{1}{\sqrt{n!m!}}\big(c^+\big)^n\big(d^+\big)^m 1, \qquad n, m = 0, 1, \dots \qquad (8.29)
\]
Proof. We have
\[
[\mathcal{L}, (c^+)^2] = \mathcal{L}(c^+)^2 - (c^+)^2\mathcal{L} = \big(c^+\mathcal{L} - \lambda_1 c^+\big)c^+ - (c^+)^2\mathcal{L} = c^+\big(c^+\mathcal{L} - \lambda_1 c^+\big) - \lambda_1(c^+)^2 - (c^+)^2\mathcal{L} = -2\lambda_1(c^+)^2,
\]
and similarly $[\mathcal{L}, (d^+)^2] = -2\lambda_2(d^+)^2$. A simple induction argument now shows that (see Exercise 8.3.3)
\[
[\mathcal{L}, (c^+)^n] = -n\lambda_1(c^+)^n \quad\text{and}\quad [\mathcal{L}, (d^+)^m] = -m\lambda_2(d^+)^m. \qquad (8.30)
\]
We use (8.30) to calculate
\[
\mathcal{L}(c^+)^n(d^+)^m 1 = (c^+)^n\mathcal{L}(d^+)^m 1 - n\lambda_1(c^+)^n(d^+)^m 1
\]
\[
= (c^+)^n(d^+)^m\mathcal{L}1 - m\lambda_2(c^+)^n(d^+)^m 1 - n\lambda_1(c^+)^n(d^+)^m 1
\]
\[
= -\big(n\lambda_1 + m\lambda_2\big)(c^+)^n(d^+)^m 1,
\]
from which (8.28) and (8.29) follow.
Exercise 8.3.3. Show that
\[
[\mathcal{L}, (c^\pm)^n] = -n\lambda_1(c^\pm)^n, \quad [\mathcal{L}, (d^\pm)^n] = -n\lambda_2(d^\pm)^n, \quad [c^-, (c^+)^n] = n(c^+)^{n-1}, \quad [d^-, (d^+)^n] = n(d^+)^{n-1}. \qquad (8.31)
\]
Remark 8.3.4. In terms of the operators $a^\pm$, $b^\pm$ the eigenfunctions of $\mathcal{L}$ are
\[
\phi_{nm} = \sqrt{n!m!}\,\delta^{-\frac{n+m}{2}}\lambda_1^{n/2}\lambda_2^{m/2}\sum_{\ell=0}^{n}\sum_{k=0}^{m}\frac{1}{k!\,(m-k)!\,\ell!\,(n-\ell)!}\Big(\frac{\lambda_1}{\lambda_2}\Big)^{\frac{k-\ell}{2}}\big(a^+\big)^{n+m-k-\ell}\big(b^+\big)^{\ell+k}1.
\]
The first few eigenfunctions are
\[
\phi_{00} = 1,
\]
\[
\phi_{10} = \frac{\sqrt\beta\big(\sqrt{\lambda_1}\,p + \sqrt{\lambda_2}\,\omega_0 q\big)}{\sqrt\delta},
\]
\[
\phi_{01} = \frac{\sqrt\beta\big(\sqrt{\lambda_2}\,p + \sqrt{\lambda_1}\,\omega_0 q\big)}{\sqrt\delta},
\]
\[
\phi_{11} = \frac{-2\sqrt{\lambda_1\lambda_2} + \sqrt{\lambda_1\lambda_2}\,\beta p^2 + (\lambda_1 + \lambda_2)\,\omega_0\beta pq + \sqrt{\lambda_1\lambda_2}\,\omega_0^2\beta q^2}{\delta},
\]
\[
\phi_{20} = \frac{-\lambda_1 + \beta p^2\lambda_1 + 2\sqrt{\lambda_1\lambda_2}\,\omega_0\beta pq - \lambda_2 + \omega_0^2\beta q^2\lambda_2}{\sqrt2\,\delta},
\]
\[
\phi_{02} = \frac{-\lambda_2 + \beta p^2\lambda_2 + 2\sqrt{\lambda_1\lambda_2}\,\omega_0\beta pq - \lambda_1 + \omega_0^2\beta q^2\lambda_1}{\sqrt2\,\delta}.
\]
Notice that the eigenfunctions are not orthonormal.
As we already know, the first eigenvalue, corresponding to the constant eigenfunction, is $\lambda_{00} = 0$.

Notice that the operator $\mathcal{L}$ is not selfadjoint and consequently we do not expect its eigenvalues to be real. Indeed, whether the eigenvalues are real or not depends on the sign of the discriminant $\Delta = \gamma^2 - 4\omega_0^2$. In the underdamped regime, $\gamma < 2\omega_0$, the eigenvalues are complex:
\[
\lambda_{nm} = \frac12\gamma(n+m) + \frac12 i\sqrt{-\gamma^2 + 4\omega_0^2}\,(n-m), \qquad \gamma < 2\omega_0.
\]
This is to be expected, since in the underdamped regime the dynamics is dominated by the deterministic Hamiltonian dynamics that give rise to the antisymmetric Liouville operator. We set $\omega = \frac12\sqrt{4\omega_0^2 - \gamma^2}$, i.e. $\delta = 2i\omega$. The eigenvalues can be written as
\[
\lambda_{nm} = \frac{\gamma}{2}(n+m) + i\omega(n-m).
\]
In Figure 8.1 we present the first few eigenvalues of $\mathcal{L}$ in the underdamped regime. The eigenvalues are contained in a cone on the right half of the complex plane. The cone is determined by
\[
\lambda_{n0} = \frac{\gamma}{2}n + i\omega n \quad\text{and}\quad \lambda_{0m} = \frac{\gamma}{2}m - i\omega m.
\]
The eigenvalues along the diagonal are real:
\[
\lambda_{nn} = \gamma n.
\]
On the other hand, in the overdamped regime, $\gamma > 2\omega_0$, all eigenvalues are real:
\[
\lambda_{nm} = \frac12\gamma(n+m) + \frac12\sqrt{\gamma^2 - 4\omega_0^2}\,(n-m), \qquad \gamma > 2\omega_0.
\]
In fact, in the overdamped limit $\gamma\to+\infty$ (which we will study in Chapter ??) the eigenvalues of the generator $\mathcal{L}$ converge to the eigenvalues of the generator of the OU process:
\[
\lambda_{nm} = \gamma n - \frac{\omega_0^2}{\gamma}(n-m) + O\big(\gamma^{-3}\big).
\]
This is consistent with the fact that in this limit the solution of the Langevin equation converges to the solution of the OU SDE. See Chapter ?? for details.
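A quick numerical check of this expansion (illustrative values $\gamma = 50$, $\omega_0 = 1$; not part of the text), using only the exact formulas (8.23) and (8.28):

```python
import numpy as np

# Check (illustrative values, deep in the overdamped regime) that
#   lambda_1 = (gamma + delta)/2 ~ gamma - omega_0^2/gamma,
#   lambda_2 = (gamma - delta)/2 ~ omega_0^2/gamma,
# so lambda_{nm} = lambda_1 n + lambda_2 m is approximated by
# gamma n - (omega_0^2/gamma)(n - m) with an O(gamma^{-3}) error.

gamma, omega0 = 50.0, 1.0
delta = np.sqrt(gamma**2 - 4.0 * omega0**2)
lam1, lam2 = (gamma + delta) / 2.0, (gamma - delta) / 2.0

errs = []
for n in range(4):
    for m in range(4):
        exact = lam1 * n + lam2 * m
        approx = gamma * n - (omega0**2 / gamma) * (n - m)
        errs.append(abs(exact - approx))
print(max(errs))
```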
Figure 8.1: First few eigenvalues of L for γ = ω = 1.
The eigenfunctions of $\mathcal{L}$ do not form an orthonormal basis in $L^2_\beta := L^2\big(\mathbb{R}^2, Z^{-1}e^{-\beta H}\big)$ since $\mathcal{L}$ is not a selfadjoint operator. Using the eigenfunctions/eigenvalues of $\mathcal{L}$ we can easily calculate the eigenfunctions/eigenvalues of the $L^2_\beta$-adjoint of $\mathcal{L}$. From the calculations presented in Section 8.2 we know that the adjoint operator is
\[
\widehat{\mathcal{L}} := -\mathcal{A} + \gamma\mathcal{S} \qquad (8.32)
\]
\[
= \omega_0\big(b^+a^- - a^+b^-\big) - \gamma a^+a^- \qquad (8.33)
\]
\[
= -\lambda_1(c^-)^*(c^+)^* - \lambda_2(d^-)^*(d^+)^*, \qquad (8.34)
\]
where
\[
(c^+)^* = \frac{1}{\sqrt\delta}\Big(\sqrt{\lambda_1}\,a^- + \sqrt{\lambda_2}\,b^-\Big), \qquad (8.35a)
\]
\[
(c^-)^* = \frac{1}{\sqrt\delta}\Big(\sqrt{\lambda_1}\,a^+ - \sqrt{\lambda_2}\,b^+\Big), \qquad (8.35b)
\]
\[
(d^+)^* = \frac{1}{\sqrt\delta}\Big(\sqrt{\lambda_2}\,a^- + \sqrt{\lambda_1}\,b^-\Big), \qquad (8.35c)
\]
\[
(d^-)^* = \frac{1}{\sqrt\delta}\Big(-\sqrt{\lambda_2}\,a^+ + \sqrt{\lambda_1}\,b^+\Big). \qquad (8.35d)
\]
$\widehat{\mathcal{L}}$ has the same eigenvalues as $\mathcal{L}$:
\[
-\widehat{\mathcal{L}}\psi_{nm} = \lambda_{nm}\psi_{nm},
\]
where $\lambda_{nm}$ are given by (8.28). The eigenfunctions are
\[
\psi_{nm} = \frac{1}{\sqrt{n!m!}}\big((c^-)^*\big)^n\big((d^-)^*\big)^m 1. \qquad (8.36)
\]
Proposition 8.3.5. The eigenfunctions of $\mathcal{L}$ and $\widehat{\mathcal{L}}$ satisfy the biorthonormality relation
\[
\int\!\!\int \phi_{nm}\,\psi_{\ell k}\,\rho_\beta\,dp\,dq = \delta_{n\ell}\,\delta_{mk}. \qquad (8.37)
\]
Proof. We will use formulas (8.31). Notice that using the third and fourth of these equations, together with the fact that $c^-1 = d^-1 = 0$, we can conclude that (for $n \geqslant \ell$)
\[
(c^-)^{\ell}(c^+)^n 1 = n(n-1)\dots(n-\ell+1)\,(c^+)^{n-\ell}1. \qquad (8.38)
\]
We have
\[
\int\!\!\int \phi_{nm}\,\psi_{\ell k}\,\rho_\beta\,dp\,dq = \frac{1}{\sqrt{n!\,m!\,\ell!\,k!}}\int\!\!\int \big(c^+\big)^n\big(d^+\big)^m 1\,\big((c^-)^*\big)^{\ell}\big((d^-)^*\big)^{k}1\,\rho_\beta\,dp\,dq
\]
\[
= \frac{n(n-1)\dots(n-\ell+1)\,m(m-1)\dots(m-k+1)}{\sqrt{n!\,m!\,\ell!\,k!}}\int\!\!\int \big(c^+\big)^{n-\ell}\big(d^+\big)^{m-k}1\,\rho_\beta\,dp\,dq
\]
\[
= \delta_{n\ell}\,\delta_{mk},
\]
since all eigenfunctions except the constant $\phi_{00} = 1$ average to $0$ with respect to $\rho_\beta$.
From the eigenfunctions of $\widehat{\mathcal{L}}$ we can obtain the eigenfunctions of the Fokker-Planck operator. Using the formula (see equation (8.4))
\[
\mathcal{L}^*\big(f\rho_\beta\big) = \rho_\beta\,\widehat{\mathcal{L}}f
\]
we immediately conclude that the Fokker-Planck operator has the same eigenvalues as those of $\mathcal{L}$ and $\widehat{\mathcal{L}}$. The eigenfunctions are
\[
\psi^*_{nm} = \rho_\beta\,\psi_{nm} = \rho_\beta\,\frac{1}{\sqrt{n!m!}}\big((c^-)^*\big)^n\big((d^-)^*\big)^m 1. \qquad (8.39)
\]
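The eigenvalue formula (8.28) can also be verified numerically (an illustrative sketch with $\gamma = 3$, $\omega_0 = 1$, i.e. the overdamped regime, so the spectrum is real; none of this is from the text). In the Hermite basis the ladder operators act as $a^-e_n = \sqrt{n}\,e_{n-1}$ and $a^+e_n = \sqrt{n+1}\,e_{n+1}$, and the generator $\mathcal{L} = -\gamma a^+a^- - \omega_0(b^+a^- - a^+b^-)$ conserves the total degree $n + m$, so $-\mathcal{L}$ block-diagonalizes; each finite block can be assembled as a small matrix and its eigenvalues compared with $\lambda_{nm} = \lambda_1 n + \lambda_2 m$.

```python
import numpy as np

# Sketch (gamma = 3, omega_0 = 1, overdamped regime): the generator conserves
# the total Hermite degree n + m, so -L block-diagonalizes over s = n + m.
# Build each block in the basis {(n, m) : n + m = s} and check that its
# eigenvalues are lambda_1 n + lambda_2 m, as claimed in (8.28).

gamma, omega0 = 3.0, 1.0
delta = np.sqrt(gamma**2 - 4.0 * omega0**2)
lam1, lam2 = (gamma + delta) / 2.0, (gamma - delta) / 2.0

max_err = 0.0
for s in range(6):
    B = np.zeros((s + 1, s + 1))          # states (n, m = s - n), indexed by n
    for n in range(s + 1):
        m = s - n
        B[n, n] = gamma * n               # diagonal of -L: gamma a^+ a^-
        if n >= 1:                        # omega0 b^+ a^-: (n, m) -> (n-1, m+1)
            B[n - 1, n] = omega0 * np.sqrt(n) * np.sqrt(m + 1)
        if n <= s - 1:                    # -omega0 a^+ b^-: (n, m) -> (n+1, m-1)
            B[n + 1, n] = -omega0 * np.sqrt(n + 1) * np.sqrt(m)
    ev = np.sort(np.linalg.eigvals(B).real)
    predicted = np.sort([lam1 * n + lam2 * (s - n) for n in range(s + 1)])
    max_err = max(max_err, float(np.max(np.abs(ev - predicted))))
print(max_err)
```

For instance, the $s = 1$ block is $\begin{pmatrix}0 & \omega_0\\ -\omega_0 & \gamma\end{pmatrix}$ with characteristic polynomial $\lambda^2 - \gamma\lambda + \omega_0^2$, whose roots are exactly $\lambda_1$ and $\lambda_2$ from (8.23).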
8.4 Asymptotic Limits for the Langevin Equation
There are very few SDEs/Fokker-Planck equations that can be solved explicitly. In most cases we need to study the problem under investigation either approximately or numerically. In this part of the course we will develop approximate methods for studying various stochastic systems of practical interest. There are many problems of physical interest that can be analyzed using techniques from perturbation theory and asymptotic analysis:

i. Small noise asymptotics at finite time intervals.

ii. Small noise asymptotics/large times (rare events): the theory of large deviations, escape from a potential well, exit time problems.

iii. Small and large friction asymptotics for the Fokker-Planck equation: the Freidlin-Wentzell (underdamped) and Smoluchowski (overdamped) limits.

iv. Large time asymptotics for the Langevin equation in a periodic potential: homogenization and averaging.

v. Stochastic systems with two characteristic time scales: multiscale problems and methods.
We will study various asymptotic limits for the Langevin equation (we have set $m = 1$)
\[
\ddot q = -\nabla V(q) - \gamma\dot q + \sqrt{2\gamma\beta^{-1}}\,\dot W. \qquad (8.40)
\]
There are two parameters in the problem, the friction coefficient $\gamma$ and the inverse temperature $\beta$. We want to study the qualitative behavior of solutions to this equation (and to the corresponding Fokker-Planck equation). There are various asymptotic limits at which we can eliminate some of the variables of the equation and obtain a simpler equation for fewer variables. In the large temperature limit, $\beta \ll 1$, the dynamics of (8.40) is dominated by diffusion: the Langevin equation (8.40) can be approximated by free Brownian motion:
\[
\dot q = \sqrt{2\gamma\beta^{-1}}\,\dot W.
\]
The small temperature asymptotics, $\beta \gg 1$, is much more interesting and more subtle. It leads to exponential, Arrhenius-type asymptotics for the reaction rate (in the case of a particle escaping from a potential well due to thermal noise) or the diffusion coefficient (in the case of a particle moving in a periodic potential in the presence of thermal noise):
\[
\kappa = \nu\exp\big(-\beta E_b\big), \qquad (8.41)
\]
where $\kappa$ can be either the reaction rate or the diffusion coefficient. The small temperature asymptotics will be studied later for the case of a bistable potential (reaction rate) and for the case of a periodic potential (diffusion coefficient).
Assuming that the temperature is fixed, the only parameter that is left is the friction coefficient $\gamma$. The large and small friction asymptotics can be expressed in terms of a slow/fast system of SDEs. In many applications (especially in biology) the friction coefficient is large: $\gamma \gg 1$. In this case the momentum is the fast variable which we can eliminate to obtain an equation for the position. This is the overdamped or Smoluchowski limit. In various problems in physics the friction coefficient is small: $\gamma \ll 1$. In this case the position is the fast variable whereas the energy is the slow variable. We can eliminate the position and obtain an equation for the energy. This is the underdamped or Freidlin-Wentzell limit. In both cases we have to look at sufficiently long time scales.

We rescale the solution to (8.40):
\[
q^\gamma(t) = \lambda_\gamma\,q\big(t/\mu_\gamma\big).
\]
This rescaled process satisfies the equation
\[
\ddot q^\gamma = -\frac{\lambda_\gamma}{\mu_\gamma^2}\,\partial_q V\big(q^\gamma/\lambda_\gamma\big) - \frac{\gamma}{\mu_\gamma}\,\dot q^\gamma + \sqrt{2\gamma\lambda_\gamma^2\mu_\gamma^{-3}\beta^{-1}}\,\dot W. \qquad (8.42)
\]
Different choices for these two parameters lead to the overdamped and underdamped limits. For $\lambda_\gamma = 1$, $\mu_\gamma = \gamma^{-1}$, $\gamma \gg 1$, equation (8.42) becomes
\[
\gamma^{-2}\ddot q^\gamma = -\partial_q V(q^\gamma) - \dot q^\gamma + \sqrt{2\beta^{-1}}\,\dot W. \qquad (8.43)
\]
Under this scaling, the interesting limit is the overdamped limit, $\gamma \gg 1$. We will see later that in the limit as $\gamma\to+\infty$ the solution to (8.43) can be approximated by the solution to
\[
\dot q = -\partial_q V + \sqrt{2\beta^{-1}}\,\dot W.
\]
For $\lambda_\gamma = 1$, $\mu_\gamma = \gamma$, $\gamma \ll 1$:
\[
\ddot q^\gamma = -\gamma^{-2}\,\nabla V(q^\gamma) - \dot q^\gamma + \sqrt{2\gamma^{-2}\beta^{-1}}\,\dot W. \qquad (8.44)
\]
Under this scaling the interesting limit is the underdamped limit, $\gamma \ll 1$. We will see later that in the limit as $\gamma\to 0$ the energy of the solution to (8.44) converges to a stochastic process on a graph.
8.4.1 The Overdamped Limit
We consider the rescaled Langevin equation (8.43):
\[
\varepsilon^2\ddot q^\gamma(t) = -\nabla V\big(q^\gamma(t)\big) - \dot q^\gamma(t) + \sqrt{2\beta^{-1}}\,\dot W(t), \qquad (8.45)
\]
where we have set $\varepsilon^{-1} = \gamma$, since we are interested in the limit $\gamma\to\infty$, i.e. $\varepsilon\to 0$. We will show that, in the limit as $\varepsilon\to 0$, $q^\gamma(t)$, the solution of the Langevin equation (8.45), converges to $q(t)$, the solution of the Smoluchowski equation
\[
\dot q = -\nabla V + \sqrt{2\beta^{-1}}\,\dot W. \qquad (8.46)
\]
We write (8.45) as a system of SDEs:
$$\dot q = \frac{1}{\varepsilon}\,p, \qquad (8.47)$$
$$\dot p = -\frac{1}{\varepsilon}\nabla V(q) - \frac{1}{\varepsilon^2}\,p + \sqrt{\frac{2}{\beta\varepsilon^2}}\,\dot W. \qquad (8.48)$$
This system of SDEs defines a Markov process in phase space. Its generator is
$$\mathcal{L}^\varepsilon = \frac{1}{\varepsilon^2}\big(-p\cdot\nabla_p + \beta^{-1}\Delta\big) + \frac{1}{\varepsilon}\big(p\cdot\nabla_q - \nabla_q V\cdot\nabla_p\big) =: \frac{1}{\varepsilon^2}\mathcal{L}_0 + \frac{1}{\varepsilon}\mathcal{L}_1.$$
This is a singularly perturbed differential operator. We will derive the Smoluchowski equation (8.46)
using a pathwise technique, as well as by analyzing the corresponding Kolmogorov equations.
We apply Itô's formula to $p$:
$$dp(t) = \mathcal{L}^\varepsilon p(t)\,dt + \frac{1}{\varepsilon}\sqrt{2\beta^{-1}}\,\partial_p p(t)\,dW = -\frac{1}{\varepsilon^2}\,p(t)\,dt - \frac{1}{\varepsilon}\nabla_q V(q(t))\,dt + \frac{1}{\varepsilon}\sqrt{2\beta^{-1}}\,dW.$$
Consequently,
$$\frac{1}{\varepsilon}\int_0^t p(s)\,ds = -\int_0^t \nabla_q V(q(s))\,ds + \sqrt{2\beta^{-1}}\,W(t) + O(\varepsilon).$$
From equation (8.47) we have that
$$q(t) = q(0) + \frac{1}{\varepsilon}\int_0^t p(s)\,ds.$$
Combining the above two equations we deduce
$$q(t) = q(0) - \int_0^t \nabla_q V(q(s))\,ds + \sqrt{2\beta^{-1}}\,W(t) + O(\varepsilon),$$
from which (8.46) follows.
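The pathwise statement above is easy to check numerically. The sketch below is a hypothetical setup, not taken from the text: it uses $V(q) = q^2/2$ (so $\nabla V(q) = q$), arbitrary values for $\varepsilon$, $\beta$ and the step size, and integrates the system (8.47)–(8.48) together with the Smoluchowski equation (8.46) by the Euler–Maruyama method, driving both with the same Brownian increments:

```python
import math
import random

def overdamped_vs_langevin(eps, T=1.0, dt=1e-4, beta=1.0, seed=0):
    """Integrate (8.47)-(8.48) and the Smoluchowski SDE (8.46) with the
    same Brownian path, for the quadratic potential V(q) = q^2/2."""
    rng = random.Random(seed)
    q, p = 1.0, 0.0   # Langevin position and rescaled momentum
    qs = 1.0          # Smoluchowski position
    for _ in range(int(T / dt)):
        dW = rng.gauss(0.0, math.sqrt(dt))
        # (8.47)-(8.48): dq = (p/eps) dt,
        # dp = -(1/eps) V'(q) dt - (1/eps^2) p dt + sqrt(2/(beta eps^2)) dW
        q, p = (q + (p / eps) * dt,
                p - ((q / eps) + p / eps**2) * dt
                  + math.sqrt(2.0 / beta) / eps * dW)
        # (8.46): dq = -V'(q) dt + sqrt(2/beta) dW
        qs += -qs * dt + math.sqrt(2.0 / beta) * dW
    return q, qs

q, qs = overdamped_vs_langevin(eps=0.05)
print(abs(q - qs))  # pathwise error, expected to be O(eps)
```

With the same noise path the two trajectories stay within $O(\varepsilon)$ of each other, consistent with the derivation above; halving $\varepsilon$ roughly halves the observed error.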
Notice that in this derivation we assumed that
$$\mathbb{E}|p(t)|^2 \leq C.$$
This estimate is true under appropriate assumptions on the potential $V(q)$ and on the initial conditions. In fact, we can prove a pathwise approximation result:
$$\left(\mathbb{E}\sup_{t\in[0,T]}|q^\gamma(t) - q(t)|^p\right)^{1/p} \leq C\varepsilon^{2-\kappa},$$
where $\kappa > 0$ is arbitrarily small (it accounts for logarithmic corrections).
The pathwise derivation of the Smoluchowski equation implies that the solution of the Fokker–Planck equation corresponding to the Langevin equation (8.45) converges (in some appropriate sense to be explained below) to the solution of the Fokker–Planck equation corresponding to the Smoluchowski equation (8.46). It is important in various applications to calculate corrections to the limiting Fokker–Planck equation. We can accomplish this by analyzing the Fokker–Planck equation for (8.45) using singular perturbation theory. We will consider the problem in one dimension, mainly to simplify the notation; the multi-dimensional problem can be treated in a very similar way.
The Fokker–Planck equation associated to equations (8.47) and (8.48) is
$$\frac{\partial\rho}{\partial t} = \mathcal{L}^{\varepsilon\,*}\rho = \frac{1}{\varepsilon}\big(-p\,\partial_q\rho + \partial_q V(q)\,\partial_p\rho\big) + \frac{1}{\varepsilon^2}\big(\partial_p(p\rho) + \beta^{-1}\partial_p^2\rho\big) =: \left(\frac{1}{\varepsilon^2}\mathcal{L}_0^* + \frac{1}{\varepsilon}\mathcal{L}_1^*\right)\rho. \qquad (8.49)$$
The invariant distribution of the Markov process $\{q, p\}$, if it exists, is
$$\rho_\beta(p,q) = \frac{1}{Z}\,e^{-\beta H(p,q)}, \qquad Z = \int_{\mathbb{R}^2} e^{-\beta H(p,q)}\,dp\,dq,$$
where $H(p,q) = \frac12 p^2 + V(q)$. We define the function $f(p,q,t)$ through
$$\rho(p,q,t) = f(p,q,t)\,\rho_\beta(p,q). \qquad (8.50)$$
Proposition 8.4.1. The function $f(p,q,t)$ defined in (8.50) satisfies the equation
$$\frac{\partial f}{\partial t} = \left[\frac{1}{\varepsilon^2}\big(-p\,\partial_p + \beta^{-1}\partial_p^2\big) - \frac{1}{\varepsilon}\big(p\,\partial_q - \partial_q V(q)\,\partial_p\big)\right]f =: \left(\frac{1}{\varepsilon^2}\mathcal{L}_0 - \frac{1}{\varepsilon}\mathcal{L}_1\right)f. \qquad (8.51)$$
Remark 8.4.2. This is "almost" the backward Kolmogorov equation, with the difference that we have $-\mathcal{L}_1$ instead of $\mathcal{L}_1$. This is related to the fact that $\mathcal{L}_0$ is a symmetric operator in $L^2(\mathbb{R}^2; Z^{-1}e^{-\beta H(p,q)})$, whereas $\mathcal{L}_1$ is antisymmetric.
Proof. We note that $\mathcal{L}_0^*\rho_0 = 0$ and $\mathcal{L}_1^*\rho_0 = 0$. We use this to calculate:
$$\mathcal{L}_0^*\rho = \mathcal{L}_0^*(f\rho_0) = \partial_p(p f\rho_0) + \beta^{-1}\partial_p^2(f\rho_0) = \rho_0\,p\,\partial_p f + \rho_0\,\beta^{-1}\partial_p^2 f + f\,\mathcal{L}_0^*\rho_0 + 2\beta^{-1}\,\partial_p f\,\partial_p\rho_0$$
$$= \big(-p\,\partial_p f + \beta^{-1}\partial_p^2 f\big)\rho_0 = \rho_0\,\mathcal{L}_0 f.$$
Similarly,
$$\mathcal{L}_1^*\rho = \mathcal{L}_1^*(f\rho_0) = \big(-p\,\partial_q + \partial_q V\,\partial_p\big)(f\rho_0) = \rho_0\big(-p\,\partial_q f + \partial_q V\,\partial_p f\big) = -\rho_0\,\mathcal{L}_1 f.$$
Consequently, the Fokker–Planck equation (8.49) becomes
$$\rho_0\,\frac{\partial f}{\partial t} = \rho_0\left(\frac{1}{\varepsilon^2}\mathcal{L}_0 f - \frac{1}{\varepsilon}\mathcal{L}_1 f\right),$$
from which the claim follows.
We will assume that the initial conditions for (8.51) depend only on $q$:
$$f(p,q,0) = f_{\rm ic}(q). \qquad (8.52)$$
Another way of stating this assumption is the following. Let $\mathcal{H} = L^2(\mathbb{R}^{2d}; \rho_\beta(p,q))$ and define the projection operator $P : \mathcal{H} \to L^2(\mathbb{R}^d; \bar\rho_\beta(q))$, with $\bar\rho_\beta(q) = \frac{1}{Z_q}e^{-\beta V(q)}$, $Z_q = \int_{\mathbb{R}^d}e^{-\beta V(q)}\,dq$:
$$Pf := \frac{1}{Z_p}\int_{\mathbb{R}^d} f\,e^{-\beta\frac{|p|^2}{2}}\,dp, \qquad (8.53)$$
with $Z_p := \int_{\mathbb{R}^d}e^{-\beta|p|^2/2}\,dp$. Then assumption (8.52) can be written as
$$Pf_{\rm ic} = f_{\rm ic}.$$
We look for a solution to (8.51) in the form of a truncated power series in $\varepsilon$:
$$f(p,q,t) = \sum_{n=0}^{N}\varepsilon^n f_n(p,q,t). \qquad (8.54)$$
We substitute this expansion into eqn. (8.51) to obtain the following system of equations:
$$\mathcal{L}_0 f_0 = 0, \qquad (8.55a)$$
$$-\mathcal{L}_0 f_1 = -\mathcal{L}_1 f_0, \qquad (8.55b)$$
$$-\mathcal{L}_0 f_2 = -\mathcal{L}_1 f_1 - \frac{\partial f_0}{\partial t}, \qquad (8.55c)$$
$$-\mathcal{L}_0 f_n = -\mathcal{L}_1 f_{n-1} - \frac{\partial f_{n-2}}{\partial t}, \qquad n = 3,4,\dots,N. \qquad (8.55d)$$
The null space of $\mathcal{L}_0$ consists of constants in $p$. Consequently, from equation (8.55a) we conclude that
$$f_0 = f(q,t).$$
Now we can calculate the right hand side of equation (8.55b):
$$\mathcal{L}_1 f_0 = p\,\partial_q f.$$
Equation (8.55b) becomes:
$$\mathcal{L}_0 f_1 = p\,\partial_q f.$$
The right hand side of this equation is orthogonal to $\mathcal{N}(\mathcal{L}_0^*)$ and consequently there exists a unique solution. We obtain this solution using separation of variables:
$$f_1 = -p\,\partial_q f + \psi_1(q,t).$$
Now we can calculate the right hand side of equation (8.55c). We need to calculate $\mathcal{L}_1 f_1$:
$$-\mathcal{L}_1 f_1 = \big(p\,\partial_q - \partial_q V\,\partial_p\big)\big(p\,\partial_q f - \psi_1(q,t)\big) = p^2\,\partial_q^2 f - p\,\partial_q\psi_1 - \partial_q V\,\partial_q f.$$
The solvability condition for (8.55c) is
$$\int_{\mathbb{R}}\left(-\mathcal{L}_1 f_1 - \frac{\partial f_0}{\partial t}\right)\rho_{OU}(p)\,dp = 0,$$
from which we obtain the backward Kolmogorov equation corresponding to the Smoluchowski SDE:
$$\frac{\partial f}{\partial t} = -\partial_q V\,\partial_q f + \beta^{-1}\partial_q^2 f, \qquad (8.56)$$
together with the initial condition (8.52).
Now we solve the equation for $f_2$. We use (8.56) to write (8.55c) in the form
$$\mathcal{L}_0 f_2 = \big(\beta^{-1} - p^2\big)\partial_q^2 f + p\,\partial_q\psi_1.$$
The solution of this equation is
$$f_2(p,q,t) = \frac12\,\partial_q^2 f(q,t)\,p^2 - \partial_q\psi_1(q,t)\,p + \psi_2(q,t).$$
Now we calculate the right hand side of the equation for $f_3$, equation (8.55d) with $n = 3$. First we calculate
$$\mathcal{L}_1 f_2 = \frac12\,p^3\,\partial_q^3 f - p^2\,\partial_q^2\psi_1 + p\,\partial_q\psi_2 - \partial_q V\,\partial_q^2 f\,p + \partial_q V\,\partial_q\psi_1.$$
The solvability condition reads
$$\int_{\mathbb{R}}\left(\frac{\partial\psi_1}{\partial t} + \mathcal{L}_1 f_2\right)\rho_{OU}(p)\,dp = 0.$$
This leads to the equation
$$\frac{\partial\psi_1}{\partial t} = -\partial_q V\,\partial_q\psi_1 + \beta^{-1}\partial_q^2\psi_1,$$
together with the initial condition $\psi_1(q,0) = 0$. From the calculations presented in the proof of Theorem 6.5.5, and using Poincaré's inequality for the measure $\frac{1}{Z_q}e^{-\beta V(q)}$, we deduce that
$$\frac12\frac{d}{dt}\|\psi_1\|^2 \leq -C\|\psi_1\|^2.$$
We now use Gronwall's inequality to conclude that
$$\psi_1 \equiv 0.$$
Putting everything together we obtain the first two terms in the $\varepsilon$-expansion of the solution of the Fokker–Planck equation (8.51):
$$\rho(p,q,t) = Z^{-1}e^{-\beta H(p,q)}\big(f + \varepsilon(-p\,\partial_q f) + O(\varepsilon^2)\big),$$
where $f$ is the solution of (8.56). Notice that we can rewrite the leading order term of the expansion in the form
$$\rho(p,q,t) = (2\pi\beta^{-1})^{-\frac12}\,e^{-\beta p^2/2}\,\rho_V(q,t) + O(\varepsilon),$$
where $\rho_V = Z^{-1}e^{-\beta V(q)}f$ is the solution of the Smoluchowski Fokker–Planck equation
$$\frac{\partial\rho_V}{\partial t} = \partial_q\big(\partial_q V\,\rho_V\big) + \beta^{-1}\partial_q^2\rho_V.$$
It is possible to expand the $n$-th term in the expansion (8.54) in terms of Hermite functions (the eigenfunctions of the generator of the OU process):
$$f_n(p,q,t) = \sum_{k=0}^{n} f_{nk}(q,t)\,\phi_k(p), \qquad (8.57)$$
where $\phi_k(p)$ is the $k$-th eigenfunction of $-\mathcal{L}_0$:
$$-\mathcal{L}_0\phi_k = \lambda_k\phi_k.$$
We can obtain the following system of equations ($\widehat{\mathcal{L}} = \beta^{-1}\partial_q - \partial_q V$):
$$\widehat{\mathcal{L}} f_{n1} = 0,$$
$$\sqrt{(k+1)\beta^{-1}}\,\widehat{\mathcal{L}} f_{n,k+1} + \sqrt{k\beta^{-1}}\,\partial_q f_{n,k-1} = -k f_{n+1,k}, \qquad k = 1,2,\dots,n-1,$$
$$\sqrt{n\beta^{-1}}\,\partial_q f_{n,n-1} = -n f_{n+1,n},$$
$$\sqrt{(n+1)\beta^{-1}}\,\partial_q f_{n,n} = -(n+1) f_{n+1,n+1}.$$
Using this method we can obtain the first three terms in the expansion:
$$\rho(p,q,t) = \rho_0(p,q)\bigg(f + \varepsilon\big(-\sqrt{\beta^{-1}}\,\partial_q f\,\phi_1\big) + \varepsilon^2\Big(\frac{\beta^{-1}}{\sqrt2}\,\partial_q^2 f\,\phi_2 + f_{20}\Big)$$
$$+\,\varepsilon^3\Big(-\sqrt{\frac{\beta^{-3}}{3!}}\,\partial_q^3 f\,\phi_3 + \big(-\sqrt{\beta^{-1}}\,\widehat{\mathcal{L}}\,\partial_q^2 f - \sqrt{\beta^{-1}}\,\partial_q f_{20}\big)\phi_1\Big)\bigg) + O(\varepsilon^4),$$
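The eigenfunction relation $-\mathcal{L}_0\phi_k = \lambda_k\phi_k$ underlying the expansion (8.57) is easy to verify numerically. For $\beta = 1$ the eigenfunctions of $-\mathcal{L}_0 = p\,\partial_p - \partial_p^2$ are the Hermite polynomials with $\lambda_k = k$; a minimal finite-difference sketch (the grid and step size below are arbitrary choices, not from the text) checks this for $k = 2$, $\mathrm{He}_2(p) = p^2 - 1$:

```python
# Check that He_2(p) = p^2 - 1 satisfies -L0 He_2 = 2 He_2, where
# L0 = -p d/dp + d^2/dp^2 is the OU generator with beta = 1,
# using centered finite differences.
h = 1e-3

def L0(f, p):
    d1 = (f(p + h) - f(p - h)) / (2 * h)          # first derivative
    d2 = (f(p + h) - 2 * f(p) + f(p - h)) / h**2  # second derivative
    return -p * d1 + d2

he2 = lambda p: p**2 - 1.0
max_resid = max(abs(-L0(he2, p) - 2.0 * he2(p))
                for p in [i * 0.1 - 2.0 for i in range(41)])
print(max_resid)  # vanishes up to finite-difference and rounding error
```

Since the centered differences are exact on quadratics, the residual here is limited only by floating-point rounding.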
8.4.2 The Underdamped Limit
Consider now the rescaling $\lambda_\gamma = 1$, $\mu_\gamma = \gamma$. The Langevin equation becomes
$$\ddot q^\gamma = -\gamma^{-2}\nabla V(q^\gamma) - \dot q^\gamma + \sqrt{2\gamma^{-2}\beta^{-1}}\,\dot W. \qquad (8.58)$$
We write equation (8.58) as a system of two equations:
$$\dot q^\gamma = \gamma^{-1} p^\gamma, \qquad \dot p^\gamma = -\gamma^{-1}V'(q^\gamma) - p^\gamma + \sqrt{2\beta^{-1}}\,\dot W.$$
This is the equation for an $O(1/\gamma)$ Hamiltonian system perturbed by $O(1)$ noise. We expect that, to leading order, the energy is conserved, since it is conserved for the Hamiltonian system. We apply Itô's formula to the Hamiltonian of the system to obtain
$$\dot H = \big(\beta^{-1} - p^2\big) + \sqrt{2\beta^{-1}p^2}\,\dot W,$$
with $p^2 = p^2(H,q) = 2(H - V(q))$.
Thus, in order to study the $\gamma \to 0$ limit we need to analyze the following fast/slow system of SDEs:
$$\dot H = \big(\beta^{-1} - p^2\big) + \sqrt{2\beta^{-1}p^2}\,\dot W, \qquad (8.59a)$$
$$\dot p^\gamma = -\gamma^{-1}V'(q^\gamma) - p^\gamma + \sqrt{2\beta^{-1}}\,\dot W. \qquad (8.59b)$$
The Hamiltonian is the slow variable, whereas the momentum (or the position) is the fast variable. Assuming that we can average over the Hamiltonian dynamics, we obtain the limiting SDE for the Hamiltonian:
$$\dot H = \big(\beta^{-1} - \langle p^2\rangle\big) + \sqrt{2\beta^{-1}\langle p^2\rangle}\,\dot W. \qquad (8.60)$$
The limiting SDE lives on the graph associated with the Hamiltonian system. The domain of
deﬁnition of the limiting Markov process is deﬁned through appropriate boundary conditions (the
gluing conditions) at the interior vertices of the graph.
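The time-scale separation behind (8.59) can be observed in a direct simulation. A sketch with hypothetical choices (harmonic potential $V(q) = q^2/2$, $\beta = 1$, small $\gamma$, Euler–Maruyama stepping) of the rescaled system $\dot q^\gamma = \gamma^{-1}p$, $\dot p^\gamma = -\gamma^{-1}\partial_q V(q^\gamma) - p + \sqrt{2\beta^{-1}}\,\dot W$: the momentum oscillates on the fast $O(\gamma)$ scale while $H$ changes only by $O(1)$ over an $O(1)$ time interval:

```python
import math
import random

def underdamped_energy(gamma=0.01, T=0.5, dt=1e-5, beta=1.0, seed=1):
    """Count fast oscillations of p and track the slow drift of
    H = p^2/2 + q^2/2 for the rescaled underdamped dynamics."""
    rng = random.Random(seed)
    q, p = 1.0, 0.0
    H0 = 0.5 * p**2 + 0.5 * q**2
    sign_changes = 0
    for _ in range(int(T / dt)):
        dW = rng.gauss(0.0, math.sqrt(dt))
        q_new = q + (p / gamma) * dt
        p_new = p + (-(q / gamma) - p) * dt + math.sqrt(2.0 / beta) * dW
        if p * p_new < 0:
            sign_changes += 1  # each fast oscillation flips the sign of p twice
        q, p = q_new, p_new
    H = 0.5 * p**2 + 0.5 * q**2
    return sign_changes, abs(H - H0)

osc, dH = underdamped_energy()
print(osc, dH)  # many fast oscillations, but only a modest change in H
```

The oscillation period is $O(\gamma)$ (about $2\pi\gamma$ for the harmonic well), so shrinking $\gamma$ increases the count while the energy increment stays comparable.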
159
We identify all points belonging to the same connected component of the a level curve ¦x :
H(x) = H¦, x = (q, p). Each point on the edges of the graph correspond to a trajectory. Interior
vertices correspond to separatrices. Let I
i
, i = 1, . . . d be the edges of the graph. Then (i, H)
deﬁnes a global coordinate system on the graph.
We will study the small $\gamma$ asymptotics by analyzing the corresponding backward Kolmogorov equation using singular perturbation theory. The generator of the process $\{q^\gamma, p^\gamma\}$ is
$$\mathcal{L}^\gamma = \gamma^{-1}\big(p\,\partial_q - \partial_q V\,\partial_p\big) - p\,\partial_p + \beta^{-1}\partial_p^2 = \gamma^{-1}\mathcal{L}_0 + \mathcal{L}_1.$$
Let $u^\gamma = \mathbb{E}\big(f(p^\gamma(p,q;t), q^\gamma(p,q;t))\big)$. It satisfies the backward Kolmogorov equation associated to the process $\{q^\gamma, p^\gamma\}$:
$$\frac{\partial u^\gamma}{\partial t} = \left(\frac{1}{\gamma}\mathcal{L}_0 + \mathcal{L}_1\right)u^\gamma. \qquad (8.61)$$
We look for a solution in the form of a power series expansion in $\gamma$:
$$u^\gamma = u_0 + \gamma u_1 + \gamma^2 u_2 + \dots$$
We substitute this ansatz into (8.61) and equate equal powers of $\gamma$ to obtain the following sequence of equations:
$$\mathcal{L}_0 u_0 = 0, \qquad (8.62a)$$
$$\mathcal{L}_0 u_1 = -\mathcal{L}_1 u_0 + \frac{\partial u_0}{\partial t}, \qquad (8.62b)$$
$$\mathcal{L}_0 u_2 = -\mathcal{L}_1 u_1 + \frac{\partial u_1}{\partial t}. \qquad (8.62c)$$
Notice that the operator $\mathcal{L}_0$ is the backward Liouville operator of the Hamiltonian system with Hamiltonian
$$H = \frac12 p^2 + V(q).$$
We assume that there are no integrals of motion other than the Hamiltonian. This means that the null space of $\mathcal{L}_0$ consists of functions of the Hamiltonian:
$$\mathcal{N}(\mathcal{L}_0) = \big\{\text{functions of } H\big\}. \qquad (8.63)$$
Let us now analyze equations (8.62). We start with (8.62a); eqn. (8.63) implies that $u_0$ depends on $q$, $p$ through the Hamiltonian function $H$:
$$u_0 = u(H(p,q), t). \qquad (8.64)$$
Now we proceed with (8.62b). For this we need to find the solvability condition for equations of the form
$$\mathcal{L}_0 u = f. \qquad (8.65)$$
We multiply this equation by an arbitrary smooth function of $H(p,q)$, integrate over $\mathbb{R}^2$, and use the skew-symmetry of the Liouville operator $\mathcal{L}_0$ to deduce:¹
$$\int_{\mathbb{R}^2}\mathcal{L}_0 u\,F(H(p,q))\,dp\,dq = \int_{\mathbb{R}^2} u\,\mathcal{L}_0^* F(H(p,q))\,dp\,dq = \int_{\mathbb{R}^2} u\,\big(-\mathcal{L}_0 F(H(p,q))\big)\,dp\,dq = 0, \qquad \forall F \in C_b^\infty(\mathbb{R}).$$
This implies that the solvability condition for equation (8.65) is that
$$\int_{\mathbb{R}^2} f(p,q)\,F(H(p,q))\,dp\,dq = 0, \qquad \forall F \in C_b^\infty(\mathbb{R}). \qquad (8.66)$$
We apply the solvability condition to (8.62b) to obtain
$$\int_{\mathbb{R}^2}\left(\mathcal{L}_1 u_0 - \frac{\partial u_0}{\partial t}\right)F(H(p,q))\,dp\,dq = 0. \qquad (8.67)$$
To proceed, we need to understand how $\mathcal{L}_1$ acts on functions of $H(p,q)$. Let $\phi = \phi(H(p,q))$. We have that
$$\frac{\partial\phi}{\partial p} = \frac{\partial H}{\partial p}\frac{\partial\phi}{\partial H} = p\,\frac{\partial\phi}{\partial H}$$
and
$$\frac{\partial^2\phi}{\partial p^2} = \frac{\partial}{\partial p}\left(p\,\frac{\partial\phi}{\partial H}\right) = \frac{\partial\phi}{\partial H} + p^2\,\frac{\partial^2\phi}{\partial H^2}.$$
L
1
=
_
(β
−1
−p
2
)∂
H
+ β
−1
p
2
∂
2
H
_
, (8.68)
1
We assume that both u
1
and F decay to 0 as [p[ →∞to justify the integration by parts that follows.
161
where
p
2
= p
2
(H, q) = 2(H −V (q)).
We want to change variables in the integral (8.67) and go from $(p,q)$ to $(H,q)$. The Jacobian of the transformation is:
$$\frac{\partial(p,q)}{\partial(H,q)} = \frac{\partial p}{\partial H} = \frac{1}{p(H,q)}.$$
We use this, together with (8.68), to rewrite eqn. (8.67) as
$$\int\!\!\int\left(\big[(\beta^{-1} - p^2)\,\partial_H + \beta^{-1}p^2\,\partial_H^2\big]u - \frac{\partial u}{\partial t}\right)F(H)\,p^{-1}(H,q)\,dH\,dq = 0.$$
We introduce the notation
$$\langle\,\cdot\,\rangle := \int \cdot\;dq.$$
The integration over $q$ can be performed "explicitly":
$$\int\left(\big[(\beta^{-1}\langle p^{-1}\rangle - \langle p\rangle)\,\partial_H + \beta^{-1}\langle p\rangle\,\partial_H^2\big]u - \langle p^{-1}\rangle\,\frac{\partial u}{\partial t}\right)F(H)\,dH = 0.$$
This equation should be valid for every smooth function $F(H)$, and this requirement leads to the differential equation
$$\langle p^{-1}\rangle\,\frac{\partial u}{\partial t} = \big(\beta^{-1}\langle p^{-1}\rangle - \langle p\rangle\big)\partial_H u + \beta^{-1}\langle p\rangle\,\partial_H^2 u,$$
or
$$\frac{\partial u}{\partial t} = \big(\beta^{-1} - \langle p^{-1}\rangle^{-1}\langle p\rangle\big)\partial_H u + \beta^{-1}\langle p^{-1}\rangle^{-1}\langle p\rangle\,\partial_H^2 u.$$
Thus, we have obtained the limiting backward Kolmogorov equation for the energy, which is the "slow variable". From this equation we can read off the limiting SDE for the Hamiltonian:
$$\dot H = b(H) + \sigma(H)\,\dot W, \qquad (8.69)$$
where
$$b(H) = \beta^{-1} - \langle p^{-1}\rangle^{-1}\langle p\rangle, \qquad \sigma(H) = \big(2\beta^{-1}\langle p^{-1}\rangle^{-1}\langle p\rangle\big)^{1/2}.$$
Notice that the noise that appears in the limiting equation (8.69) is multiplicative, in contrast to the additive noise in the Langevin equation.

As is well known from classical mechanics, the action and the frequency are defined as
$$I(E) = \int p(q,E)\,dq$$
and
$$\omega(E) = 2\pi\left(\frac{dI}{dE}\right)^{-1},$$
respectively. Using the action and the frequency we can write the limiting Fokker–Planck equation for the distribution function of the energy in a very compact form.
Theorem 8.4.3. The limiting Fokker–Planck equation for the energy distribution function $\rho(E,t)$ is
$$\frac{\partial\rho}{\partial t} = \frac{\partial}{\partial E}\left(I(E)\left(1 + \beta^{-1}\frac{\partial}{\partial E}\right)\left(\frac{\omega(E)\rho}{2\pi}\right)\right). \qquad (8.70)$$
Proof. We notice that
$$\frac{dI}{dE} = \int\frac{\partial p}{\partial E}\,dq = \int p^{-1}\,dq,$$
and consequently
$$\langle p^{-1}\rangle^{-1} = \frac{\omega(E)}{2\pi}.$$
Hence, the limiting Fokker–Planck equation can be written as
$$\frac{\partial\rho}{\partial t} = -\frac{\partial}{\partial E}\left(\left(\beta^{-1} - \frac{I(E)\omega(E)}{2\pi}\right)\rho\right) + \beta^{-1}\frac{\partial^2}{\partial E^2}\left(\frac{I\omega}{2\pi}\rho\right)$$
$$= -\beta^{-1}\frac{\partial\rho}{\partial E} + \frac{\partial}{\partial E}\left(\frac{I\omega}{2\pi}\rho\right) + \beta^{-1}\frac{\partial}{\partial E}\left(\frac{dI}{dE}\,\frac{\omega\rho}{2\pi}\right) + \beta^{-1}\frac{\partial}{\partial E}\left(I\,\frac{\partial}{\partial E}\left(\frac{\omega\rho}{2\pi}\right)\right)$$
$$= \frac{\partial}{\partial E}\left(\frac{I\omega}{2\pi}\rho\right) + \beta^{-1}\frac{\partial}{\partial E}\left(I\,\frac{\partial}{\partial E}\left(\frac{\omega\rho}{2\pi}\right)\right) = \frac{\partial}{\partial E}\left(I(E)\left(1 + \beta^{-1}\frac{\partial}{\partial E}\right)\left(\frac{\omega(E)\rho}{2\pi}\right)\right),$$
where we used that $\frac{dI}{dE}\frac{\omega}{2\pi} = 1$; this is precisely equation (8.70).
Remarks 8.4.4. i. We emphasize that the above formal procedure does not provide us with the boundary conditions for the limiting Fokker–Planck equation. We will discuss this issue in the next section.

ii. If we rescale back to the original time scale we obtain the equation
$$\frac{\partial\rho}{\partial t} = \gamma\,\frac{\partial}{\partial E}\left(I(E)\left(1 + \beta^{-1}\frac{\partial}{\partial E}\right)\left(\frac{\omega(E)\rho}{2\pi}\right)\right). \qquad (8.71)$$
We will use this equation later on to calculate the rate of escape from a potential barrier in the energy-diffusion-limited regime.
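For the harmonic oscillator $V(q) = q^2/2$ the quantities entering (8.69)–(8.71) are explicit: the accessible region at energy $H$ is $|q| \leq \sqrt{2H}$, $\langle p\rangle = \pi H$, $\langle p^{-1}\rangle = \pi$, so $\langle p^{-1}\rangle^{-1}\langle p\rangle = H$ and $\omega(E) = 1$. A quadrature sketch (midpoint rule on one branch of the level set; the grid size is an arbitrary choice, not from the text) confirms this:

```python
import math

def averaged_coefficients(H, n=200000):
    """Midpoint-rule approximations of <p> = int p dq and <p^-1> = int dq/p
    over one branch of the level set p^2/2 + q^2/2 = H."""
    a = math.sqrt(2.0 * H)
    hstep = 2.0 * a / n
    S = T = 0.0
    for i in range(n):
        q = -a + (i + 0.5) * hstep
        p = math.sqrt(max(2.0 * H - q * q, 0.0))
        S += p * hstep     # contributes to <p>
        T += hstep / p     # contributes to <p^-1>
    return S, T

S, T = averaged_coefficients(0.5)
print(S / T)        # <p^-1>^-1 <p>: close to H = 0.5
print(math.pi / T)  # since dI/dE = 2T over the full loop, omega = pi/T, close to 1
```

The midpoint rule tolerates the inverse-square-root endpoint singularity of $\langle p^{-1}\rangle$, at the cost of slow convergence there; hence the fine grid.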
8.5 Brownian Motion in Periodic Potentials
The basic model is
$$m\ddot x = -\gamma\dot x(t) - \nabla V(x(t), f(t)) + y(t) + \sqrt{2\gamma k_B T}\,\xi(t). \qquad (8.72)$$
Our goal is to calculate the effective drift and the effective diffusion tensor
$$U_{\rm eff} = \lim_{t\to\infty}\frac{\langle x(t)\rangle}{t} \qquad (8.73)$$
and
$$D_{\rm eff} = \lim_{t\to\infty}\frac{\big\langle(x(t) - \langle x(t)\rangle)\otimes(x(t) - \langle x(t)\rangle)\big\rangle}{2t}. \qquad (8.74)$$
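The definitions (8.73)–(8.74) can be estimated directly from simulated trajectories. For $V \equiv 0$ (a free Brownian particle) the answer is classical, $U_{\rm eff} = 0$ and $D_{\rm eff} = k_B T/\gamma$, which makes it a convenient sanity check; a Monte Carlo sketch in one dimension (all parameter values are arbitrary choices, not from the text):

```python
import math
import random

def estimate_Ueff_Deff(n_paths=200, T=50.0, dt=0.01, gamma=1.0, kBT=1.0, seed=2):
    """Euler-Maruyama for x'' = -gamma x' + sqrt(2 gamma kBT) xi with V = 0;
    estimate (8.73)-(8.74) from the empirical mean and variance of x(T)."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n_paths):
        x, y = 0.0, 0.0
        for _ in range(int(T / dt)):
            dW = rng.gauss(0.0, math.sqrt(dt))
            x += y * dt
            y += -gamma * y * dt + math.sqrt(2.0 * gamma * kBT) * dW
        xs.append(x)
    m = sum(xs) / n_paths
    var = sum((z - m) ** 2 for z in xs) / n_paths
    return m / T, var / (2.0 * T)

U, D = estimate_Ueff_Deff()
print(U, D)  # expect U_eff near 0 and D_eff near kBT/gamma = 1
```

The finite-time estimator carries an $O(1/T)$ bias from the velocity correlation time $1/\gamma$, plus Monte Carlo noise of relative size roughly $\sqrt{2/n_{\rm paths}}$, so only rough agreement should be expected at these settings.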
8.5.1 The Langevin equation in a periodic potential
We start by studying the underdamped dynamics of a Brownian particle $x(t) \in \mathbb{R}^d$ moving in a smooth periodic potential:
$$\ddot x = -\nabla V(x(t)) - \gamma\dot x(t) + \sqrt{2\gamma k_B T}\,\xi(t), \qquad (8.75)$$
where $\gamma$ is the friction coefficient, $k_B$ the Boltzmann constant and $T$ denotes the temperature. $\xi(t)$ stands for the standard $d$-dimensional white noise process, i.e.
$$\langle\xi_i(t)\rangle = 0 \quad\text{and}\quad \langle\xi_i(t)\xi_j(s)\rangle = \delta_{ij}\delta(t-s), \qquad i,j = 1,\dots,d.$$
The potential $V(x)$ is periodic in $x$, with period 1 in all spatial directions, and satisfies $\|\nabla V(x)\|_{L^\infty} = 1$:
$$V(x + \hat e_i) = V(x), \qquad i = 1,\dots,d,$$
where $\{\hat e_i\}_{i=1}^d$ denotes the standard basis of $\mathbb{R}^d$.

Notice that we have already non-dimensionalized eqn. (8.75) in such a way that the non-dimensional particle mass is 1 and the maximum of the (gradient of the) potential is fixed [52]. Hence, the only parameters in the problem are the friction coefficient and the temperature. Notice, furthermore, that the parameter $\gamma$ in (8.75) controls the coupling between the Hamiltonian system $\ddot x = -\nabla V(x)$ and the thermal heat bath: $\gamma \gg 1$ implies that the Hamiltonian system is strongly coupled to the heat bath, whereas $\gamma \ll 1$ corresponds to weak coupling.
Equation (8.75) defines a Markov process in the phase space $\mathbb{T}^d\times\mathbb{R}^d$. Indeed, let us write (8.75) as a first order system:
$$\dot x(t) = y(t), \qquad (8.76a)$$
$$\dot y(t) = -\nabla V(x(t)) - \gamma y(t) + \sqrt{2\gamma k_B T}\,\xi(t). \qquad (8.76b)$$
The process $\{x(t), y(t)\}$ is Markovian with generator
$$\mathcal{L} = y\cdot\nabla_x - \nabla V(x)\cdot\nabla_y + \gamma\big(-y\cdot\nabla_y + D\Delta_y\big).$$
In writing the above we have set $D = k_B T$. This process is ergodic. The unique invariant measure is absolutely continuous with respect to the Lebesgue measure and its density is the Maxwell–Boltzmann distribution
$$\rho(y,x) = \frac{1}{(2\pi D)^{d/2}\,Z}\,e^{-\frac{1}{D}H(x,y)}, \qquad (8.77)$$
where $Z = \int_{\mathbb{T}^d}e^{-V(x)/D}\,dx$ and $H(x,y)$ is the Hamiltonian of the system
$$H(x,y) = \frac12 y^2 + V(x).$$
The long time behavior of solutions to (8.75) is governed by an effective Brownian motion. Indeed, the following central limit theorem holds [83, 70].

Theorem 8.5.1. Let $V(x) \in C(\mathbb{T}^d)$. Define the rescaled process
$$x^\varepsilon(t) := \varepsilon x(t/\varepsilon^2).$$
Then $x^\varepsilon(t)$ converges weakly, as $\varepsilon \to 0$, to a Brownian motion with covariance
$$D_{\rm eff} = \int_{\mathbb{T}^d\times\mathbb{R}^d} -\mathcal{L}\Phi\otimes\Phi\,\mu(dx\,dy), \qquad (8.78)$$
where $\mu(dx\,dy) = \rho(x,y)\,dx\,dy$ and the vector valued function $\Phi$ is the solution of the Poisson equation
$$-\mathcal{L}\Phi = y. \qquad (8.79)$$
We are interested in analyzing the dependence of $D_{\rm eff}$ on $\gamma$. We will mostly focus on the one-dimensional case. We start by rescaling the Langevin equation (9.11):
$$\ddot x = F(x) - \gamma\dot x + \sqrt{2\gamma\beta^{-1}}\,\dot W, \qquad (8.80)$$
where we have set $F(x) = -\nabla V(x)$. We will assume that the potential is periodic with period $2\pi$ in every direction. Since we expect that at sufficiently long length and time scales the particle performs a purely diffusive motion, we perform a diffusive rescaling of the equations of motion (8.80): $t \to t/\varepsilon^2$, $x \to x/\varepsilon$. Using the fact that $\dot W(ct) = \frac{1}{\sqrt c}\dot W(t)$ in law, we obtain:
$$\varepsilon^2\ddot x = \frac{1}{\varepsilon}F\!\left(\frac{x}{\varepsilon}\right) - \gamma\dot x + \sqrt{2\gamma\beta^{-1}}\,\dot W.$$
Introducing $p = \varepsilon\dot x$ and $q = x/\varepsilon$ we write this equation as a first order system:
$$\dot x = \frac{1}{\varepsilon}\,p, \qquad \dot p = \frac{1}{\varepsilon^2}F(q) - \frac{\gamma}{\varepsilon^2}\,p + \frac{1}{\varepsilon}\sqrt{2\gamma\beta^{-1}}\,\dot W, \qquad \dot q = \frac{1}{\varepsilon^2}\,p, \qquad (8.81)$$
with the understanding that $q \in [-\pi,\pi]^d$ and $x, p \in \mathbb{R}^d$. Our goal now is to eliminate the fast variables $p, q$ and to obtain an equation for the slow variable $x$. We shall accomplish this by studying the corresponding backward Kolmogorov equation using singular perturbation theory for partial differential equations.
Let
$$u^\varepsilon(p,q,x,t) = \mathbb{E}f\big(p(t), q(t), x(t)\,\big|\,p(0) = p,\ q(0) = q,\ x(0) = x\big),$$
where $\mathbb{E}$ denotes the expectation with respect to the Brownian motion $W(t)$ in the Langevin equation and $f$ is a smooth function.² The evolution of the function $u^\varepsilon(p,q,x,t)$ is governed by the backward Kolmogorov equation associated to equations (8.81) [74]:³
$$\frac{\partial u^\varepsilon}{\partial t} = \frac{1}{\varepsilon}\,p\cdot\nabla_x u^\varepsilon + \frac{1}{\varepsilon^2}\Big(-\nabla_q V(q)\cdot\nabla_p + p\cdot\nabla_q + \gamma\big(-p\cdot\nabla_p + \beta^{-1}\Delta_p\big)\Big)u^\varepsilon =: \left(\frac{1}{\varepsilon^2}\mathcal{L}_0 + \frac{1}{\varepsilon}\mathcal{L}_1\right)u^\varepsilon, \qquad (8.82)$$
where
$$\mathcal{L}_0 = -\nabla_q V(q)\cdot\nabla_p + p\cdot\nabla_q + \gamma\big(-p\cdot\nabla_p + \beta^{-1}\Delta_p\big), \qquad \mathcal{L}_1 = p\cdot\nabla_x.$$

²In other words, we have that
$$u^\varepsilon(p,q,x,t) = \int f(x,v,t;p,q)\,\rho(x,v,t;p,q)\,\mu(p,q)\,dp\,dq\,dx\,dv,$$
where $\rho(x,v,t;p,q)$ is the solution of the Fokker–Planck equation and $\mu(p,q)$ is the initial distribution.

³It is more customary in the physics literature to use the forward Kolmogorov equation, i.e. the Fokker–Planck equation. However, for the calculation presented below it is more convenient to use the backward rather than the forward Kolmogorov equation. The two formulations are equivalent. See [72, Ch. 6] for details.
The invariant distribution of the fast process $\{q(t), p(t)\}$ in $\mathbb{T}^d\times\mathbb{R}^d$ is the Maxwell–Boltzmann distribution
$$\rho_\beta(q,p) = Z^{-1}e^{-\beta H(q,p)}, \qquad Z = \int_{\mathbb{T}^d\times\mathbb{R}^d}e^{-\beta H(q,p)}\,dq\,dp,$$
where $H(q,p) = \frac12|p|^2 + V(q)$. Indeed, we can readily check that
$$\mathcal{L}_0^*\rho_\beta(q,p) = 0,$$
where $\mathcal{L}_0^*$ denotes the Fokker–Planck operator, the $L^2$-adjoint of the generator $\mathcal{L}_0$:
$$\mathcal{L}_0^* f = \nabla_q V(q)\cdot\nabla_p f - p\cdot\nabla_q f + \gamma\big(\nabla_p\cdot(pf) + \beta^{-1}\Delta_p f\big).$$
The null space of the generator $\mathcal{L}_0$ consists of constants in $q, p$. Moreover, the equation
$$-\mathcal{L}_0 f = g, \qquad (8.83)$$
has a unique (up to constants) solution if and only if
$$\langle g\rangle_\beta := \int_{\mathbb{T}^d\times\mathbb{R}^d} g(q,p)\,\rho_\beta(q,p)\,dq\,dp = 0. \qquad (8.84)$$
Equation (8.83) is equipped with periodic boundary conditions with respect to $q$ and is such that
$$\int_{\mathbb{T}^d\times\mathbb{R}^d}|f|^2\mu_\beta\,dq\,dp < \infty. \qquad (8.85)$$
These two conditions are sufficient to ensure existence and uniqueness of solutions (up to constants) of equation (8.83) [38, 39, 70].
We assume that the following ansatz for the solution $u^\varepsilon$ holds:
$$u^\varepsilon = u_0 + \varepsilon u_1 + \varepsilon^2 u_2 + \dots \qquad (8.86)$$
with $u_i = u_i(p,q,x,t)$, $i = 1,2,\dots$, being $2\pi$-periodic in $q$ and satisfying condition (8.85). We substitute (8.86) into (8.82) and equate equal powers of $\varepsilon$ to obtain the following sequence of equations:
$$\mathcal{L}_0 u_0 = 0, \qquad (8.87a)$$
$$\mathcal{L}_0 u_1 = -\mathcal{L}_1 u_0, \qquad (8.87b)$$
$$\mathcal{L}_0 u_2 = -\mathcal{L}_1 u_1 + \frac{\partial u_0}{\partial t}. \qquad (8.87c)$$
From the first equation in (8.87) we deduce that $u_0 = u_0(x,t)$, since the null space of $\mathcal{L}_0$ consists of functions which are constants in $p$ and $q$. Now the second equation in (8.87) becomes:
$$\mathcal{L}_0 u_1 = -p\cdot\nabla_x u_0.$$
Since $\langle p\rangle = 0$, the right hand side of the above equation is mean-zero with respect to the Maxwell–Boltzmann distribution. Hence, the above equation is well-posed. We solve it using separation of variables:
$$u_1 = \Phi(p,q)\cdot\nabla_x u_0$$
with
$$-\mathcal{L}_0\Phi = p. \qquad (8.88)$$
This Poisson equation is posed on $\mathbb{T}^d\times\mathbb{R}^d$. The solution is periodic in $q$ and satisfies condition (8.85). Now we proceed with the third equation in (8.87). We apply the solvability condition to obtain:
$$\frac{\partial u_0}{\partial t} = \int_{\mathbb{T}^d\times\mathbb{R}^d}\mathcal{L}_1 u_1\,\rho_\beta(p,q)\,dp\,dq = \sum_{i,j=1}^d\left(\int_{\mathbb{T}^d\times\mathbb{R}^d} p_i\Phi_j\,\rho_\beta(p,q)\,dp\,dq\right)\frac{\partial^2 u_0}{\partial x_i\partial x_j}.$$
This is the backward Kolmogorov equation which governs the dynamics on large scales. We write it in the form
$$\frac{\partial u_0}{\partial t} = \sum_{i,j=1}^d D_{ij}\,\frac{\partial^2 u_0}{\partial x_i\partial x_j}, \qquad (8.89)$$
where the effective diffusion tensor is
$$D_{ij} = \int_{\mathbb{T}^d\times\mathbb{R}^d} p_i\Phi_j\,\rho_\beta(p,q)\,dp\,dq, \qquad i,j = 1,\dots,d. \qquad (8.90)$$
The calculation of the effective diffusion tensor requires the solution of the boundary value problem (8.88) and the calculation of the integral in (8.90). The limiting backward Kolmogorov equation is well posed since the diffusion tensor is nonnegative. Indeed, let $\xi$ be a unit vector in $\mathbb{R}^d$. We calculate (we use the notation $\Phi_\xi = \Phi\cdot\xi$ and $\langle\cdot,\cdot\rangle$ for the Euclidean inner product):
$$\langle\xi, D\xi\rangle = \int (p\cdot\xi)\,\Phi_\xi\,\mu_\beta\,dp\,dq = \int\big(-\mathcal{L}_0\Phi_\xi\big)\Phi_\xi\,\mu_\beta\,dp\,dq = \gamma\beta^{-1}\int\big|\nabla_p\Phi_\xi\big|^2\mu_\beta\,dp\,dq \geq 0, \qquad (8.91)$$
where an integration by parts was used.

Thus, from the multiscale analysis we conclude that at large length and time scales a particle diffusing in a periodic potential performs an effective Brownian motion with a nonnegative diffusion tensor which is given by formula (8.90).
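A crude Monte Carlo estimate of (8.90) illustrates how the potential depresses the diffusion coefficient below the free value $\beta^{-1}/\gamma$. The sketch below is a hypothetical experiment, not from the text (cosine potential, arbitrary parameter values, and a slowly converging mean-square-displacement estimator, so the comparison is only qualitative):

```python
import math
import random

def msd_diffusion(n_paths=100, T=200.0, dt=0.01, gamma=1.0, beta=1.0, seed=3):
    """Euler-Maruyama for x'' = -V'(x) - gamma x' + sqrt(2 gamma/beta) xi
    with V(x) = -cos(x); estimate D_eff from the variance of x(T)."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n_paths):
        x, v = 0.0, 0.0  # start at the bottom of a well
        for _ in range(int(T / dt)):
            dW = rng.gauss(0.0, math.sqrt(dt))
            x += v * dt
            v += (-math.sin(x) - gamma * v) * dt \
                 + math.sqrt(2.0 * gamma / beta) * dW
        xs.append(x)
    m = sum(xs) / n_paths
    return sum((z - m) ** 2 for z in xs) / n_paths / (2.0 * T)

D = msd_diffusion()
print(D)  # typically noticeably smaller than the free value 1/(beta*gamma) = 1
```

At these settings the particle must hop over barriers of height 2 (in units of $\beta^{-1}$), so the measured $D_{\rm eff}$ should come out well below the free-particle value, though the Monte Carlo error is substantial.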
We mention in passing that the analysis presented above can also be applied to the problem of Brownian motion in a tilted periodic potential. The Langevin equation becomes
$$\ddot x(t) = -\nabla V(x(t)) + F - \gamma\dot x(t) + \sqrt{2\gamma\beta^{-1}}\,\dot W(t), \qquad (8.92)$$
where $V(x)$ is periodic with period $2\pi$ and $F$ is a constant force field. The formulas for the effective drift and the effective diffusion tensor are
$$V = \int_{\mathbb{R}^d\times\mathbb{T}^d} p\,\rho(q,p)\,dq\,dp, \qquad D = \int_{\mathbb{R}^d\times\mathbb{T}^d}(p - V)\otimes\phi\,\rho(p,q)\,dp\,dq, \qquad (8.93)$$
where
$$-\mathcal{L}\phi = p - V, \qquad (8.94a)$$
$$\mathcal{L}^*\rho = 0, \qquad \int_{\mathbb{R}^d\times\mathbb{T}^d}\rho(p,q)\,dp\,dq = 1, \qquad (8.94b)$$
with
$$\mathcal{L} = p\cdot\nabla_q + (-\nabla_q V + F)\cdot\nabla_p + \gamma\big(-p\cdot\nabla_p + \beta^{-1}\Delta_p\big). \qquad (8.95)$$
We have used $\otimes$ to denote the tensor product between two vectors; $\mathcal{L}^*$ denotes the $L^2$-adjoint of the operator $\mathcal{L}$, i.e. the Fokker–Planck operator. Equations (8.94) are equipped with periodic boundary conditions in $q$. The solution of the Poisson equation (8.94a) is also taken to be square integrable with respect to the invariant density $\rho(q,p)$:
$$\int_{\mathbb{R}^d\times\mathbb{T}^d}|\phi(q,p)|^2\rho(p,q)\,dp\,dq < +\infty.$$
A calculation similar to the one used to derive (8.91) shows that the diffusion tensor is nonnegative definite:
$$\langle\xi, D\xi\rangle = \gamma\beta^{-1}\int\big|\nabla_p\phi_\xi\big|^2\rho(p,q)\,dp\,dq \geq 0, \qquad (8.96)$$
for every vector $\xi$ in $\mathbb{R}^d$. The study of diffusion in a tilted periodic potential, in the underdamped regime and in high dimensions, based on the above formulas for $V$ and $D$, will be the subject of a separate publication.
8.5.2 Equivalence With the Green–Kubo Formula
Let us now show that the formula for the diffusion tensor obtained in the previous section, equation (8.90), is equivalent to the Green–Kubo formula (3.14). To simplify the notation we will prove the equivalence of the two formulas in one dimension; the generalization to arbitrary dimensions is immediate. Let $(x(t;q,p), v(t;q,p))$ with $v = \dot x$ and initial conditions $x(0;q,p) = q$, $v(0;q,p) = p$ be the solution of the Langevin equation
$$\ddot x = -\partial_x V - \gamma\dot x + \xi,$$
where $\xi(t)$ stands for Gaussian white noise in one dimension with correlation function
$$\langle\xi(t)\xi(s)\rangle = 2\gamma k_B T\,\delta(t-s).$$
We assume that the $(x,v)$ process is stationary, i.e. that the initial conditions are distributed according to the Maxwell–Boltzmann distribution
$$\rho_\beta(q,p) = Z^{-1}e^{-\beta H(p,q)}.$$
The velocity autocorrelation function is [15, eq. 2.10]
$$\langle v(t;q,p)\,v(0;q,p)\rangle = \int v\,p\,\rho(x,v,t;p,q)\,\rho_\beta(p,q)\,dp\,dq\,dx\,dv, \qquad (8.97)$$
where $\rho(x,v,t;p,q)$ is the solution of the Fokker–Planck equation
$$\frac{\partial\rho}{\partial t} = \mathcal{L}^*\rho, \qquad \rho(x,v,0;p,q) = \delta(x-q)\,\delta(v-p),$$
with
$$\mathcal{L}^*\rho = -v\,\partial_x\rho + \partial_x V(x)\,\partial_v\rho + \gamma\big(\partial_v(v\rho) + \beta^{-1}\partial_v^2\rho\big).$$
We rewrite (8.97) in the form
$$\langle v(t;q,p)\,v(0;q,p)\rangle = \int\!\!\int\left(\int\!\!\int v\,\rho(x,v,t;p,q)\,dv\,dx\right)p\,\rho_\beta(p,q)\,dp\,dq =: \int\!\!\int v(t;p,q)\,p\,\rho_\beta(p,q)\,dp\,dq. \qquad (8.98)$$
The function $v(t;p,q)$ satisfies the backward Kolmogorov equation which governs the evolution of observables [74, Ch. 6]:
$$\frac{\partial v}{\partial t} = \mathcal{L}v, \qquad v(0;p,q) = p. \qquad (8.99)$$
We can write, formally, the solution of (8.99) as
$$v = e^{\mathcal{L}t}p. \qquad (8.100)$$
We now combine equations (8.98) and (8.100) to obtain the following formula for the velocity autocorrelation function:
$$\langle v(t;q,p)\,v(0;q,p)\rangle = \int\!\!\int p\,\big(e^{\mathcal{L}t}p\big)\,\rho_\beta(p,q)\,dp\,dq. \qquad (8.101)$$
We substitute this into the Green–Kubo formula to obtain
$$D = \int_0^\infty\langle v(t;q,p)\,v(0;q,p)\rangle\,dt = \int\left(\int_0^\infty e^{\mathcal{L}t}\,dt\;p\right)p\,\rho_\beta\,dp\,dq = \int\big(-\mathcal{L}^{-1}p\big)\,p\,\rho_\beta\,dp\,dq = \int_{-\infty}^{\infty}\int_{-\pi}^{\pi}\phi\,p\,\rho_\beta\,dp\,dq,$$
where $\phi$ is the solution of the Poisson equation (8.88). In the above derivation we have used the formula $-\mathcal{L}^{-1} = \int_0^\infty e^{\mathcal{L}t}\,dt$, whose proof can be found in [74, Ch. 11].
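For the free particle ($V \equiv 0$) the velocity is an OU process with $\langle v(t)v(0)\rangle = \beta^{-1}e^{-\gamma|t|}$, so the Green–Kubo integral gives $D = 1/(\beta\gamma)$; this is easy to confirm numerically. A sketch (arbitrary parameters, not from the text: one long stationary trajectory, an empirical autocorrelation, and trapezoidal integration of the Green–Kubo formula):

```python
import math
import random

def green_kubo_free(T=1000.0, dt=0.02, gamma=1.0, beta=1.0, t_max=5.0, seed=4):
    """Simulate the OU velocity v' = -gamma v + sqrt(2 gamma/beta) xi,
    estimate its autocorrelation and integrate it (Green-Kubo)."""
    rng = random.Random(seed)
    n = int(T / dt)
    v, vs = 0.0, []
    for _ in range(n):
        v += -gamma * v * dt + math.sqrt(2.0 * gamma / beta) * rng.gauss(0.0, math.sqrt(dt))
        vs.append(v)
    lags = int(t_max / dt)
    m = n - lags
    acf = [sum(vs[i] * vs[i + k] for i in range(m)) / m for k in range(lags + 1)]
    # trapezoidal rule for D = int_0^t_max <v(t) v(0)> dt
    return dt * (0.5 * acf[0] + sum(acf[1:-1]) + 0.5 * acf[-1])

D = green_kubo_free()
print(D)  # expect a value near 1/(beta*gamma) = 1
```

The truncation at `t_max` costs only $O(e^{-\gamma t_{\max}})$, but the statistical error of an empirical autocorrelation decays slowly with the trajectory length, so agreement at the 10–20% level is what should be expected here.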
8.6 The Underdamped and Overdamped Limits of the Diffusion Coefficient

In this section we derive approximate formulas for the diffusion coefficient which are valid in the overdamped ($\gamma \gg 1$) and underdamped ($\gamma \ll 1$) limits. The derivation of these formulas is based on the asymptotic analysis of the Poisson equation (8.88).
The Underdamped Limit
In this subsection we solve the Poisson equation (8.88) in one dimension perturbatively for small $\gamma$. We shall use singular perturbation theory for partial differential equations. The operator $\mathcal{L}_0$ that appears in (8.88) can be written in the form
$$\mathcal{L}_0 = \mathcal{L}_H + \gamma\mathcal{L}_{OU},$$
where $\mathcal{L}_H$ stands for the (backward) Liouville operator associated with the Hamiltonian $H(p,q)$ and $\mathcal{L}_{OU}$ for the generator of the OU process, respectively:
$$\mathcal{L}_H = p\,\partial_q - \partial_q V\,\partial_p, \qquad \mathcal{L}_{OU} = -p\,\partial_p + \beta^{-1}\partial_p^2.$$
We expect that the solution of the Poisson equation scales like $\gamma^{-1}$ when $\gamma \ll 1$. Thus, we look for a solution of the form
$$\Phi = \frac{1}{\gamma}\phi_0 + \phi_1 + \gamma\phi_2 + \dots \qquad (8.102)$$
We substitute this ansatz in (8.88) to obtain the sequence of equations
$$\mathcal{L}_H\phi_0 = 0, \qquad (8.103a)$$
$$-\mathcal{L}_H\phi_1 = p + \mathcal{L}_{OU}\phi_0, \qquad (8.103b)$$
$$-\mathcal{L}_H\phi_2 = \mathcal{L}_{OU}\phi_1. \qquad (8.103c)$$
From equation (8.103a) we deduce that, since $\phi_0$ is in the null space of the Liouville operator, the first term in the expansion is a function of the Hamiltonian $z(p,q) = \frac12 p^2 + V(q)$:
$$\phi_0 = \phi_0(z(p,q)).$$
Now we want to obtain an equation for $\phi_0$ by using the solvability condition for (8.103b). To this end, we multiply this equation by an arbitrary function of $z$, $g = g(z)$, and integrate over $p$ and $q$ to obtain
$$\int_{-\infty}^{+\infty}\int_{-\pi}^{\pi}\big(p + \mathcal{L}_{OU}\phi_0\big)\,g(z(p,q))\,dp\,dq = 0.$$
We now change from $(p,q)$ coordinates to $(z,q)$, so that the above integral becomes
$$\int_{E_{\min}}^{+\infty}\int_{-\pi}^{\pi} g(z)\,\big(p(z,q) + \mathcal{L}_{OU}\phi_0(z)\big)\,\frac{1}{p(z,q)}\,dz\,dq = 0,$$
where $J = p^{-1}(z,q)$ is the Jacobian of the transformation. The operator $\mathcal{L}_{OU}$, when applied to functions of the Hamiltonian, becomes:
$$\mathcal{L}_{OU} = (\beta^{-1} - p^2)\frac{\partial}{\partial z} + \beta^{-1}p^2\frac{\partial^2}{\partial z^2}.$$
Hence, the integral equation for $\phi_0(z)$ becomes
$$\int_{E_{\min}}^{+\infty}\int_{-\pi}^{\pi} g(z)\left(p(z,q) + \left[(\beta^{-1} - p^2)\frac{\partial}{\partial z} + \beta^{-1}p^2\frac{\partial^2}{\partial z^2}\right]\phi_0(z)\right)\frac{1}{p(z,q)}\,dz\,dq = 0.$$
Let $E_0$ denote the critical energy, i.e. the energy along the separatrix (homoclinic orbit). We set
$$S(z) = \int_{x_1(z)}^{x_2(z)} p(z,q)\,dq, \qquad T(z) = \int_{x_1(z)}^{x_2(z)}\frac{1}{p(z,q)}\,dq,$$
where Risken's notation [82, p. 301] has been used for $x_1(z)$ and $x_2(z)$. We need to consider the cases $\{z > E_0,\ p > 0\}$, $\{z > E_0,\ p < 0\}$ and $\{E_{\min} < z < E_0\}$ separately.
We consider first the case $z > E_0$, $p > 0$. In this case $x_1(z) = \pi$, $x_2(z) = -\pi$. We can perform the integration with respect to $q$ to obtain
$$\int_{E_0}^{+\infty} g(z)\left(2\pi + \left[\big(\beta^{-1}T(z) - S(z)\big)\frac{\partial}{\partial z} + \beta^{-1}S(z)\frac{\partial^2}{\partial z^2}\right]\phi_0(z)\right)dz = 0.$$
This equation is valid for every test function $g(z)$, from which we obtain the following differential equation for $\phi_0$:
$$-\mathcal{L}\phi := -\beta^{-1}\frac{S(z)}{T(z)}\,\phi'' + \left(\frac{S(z)}{T(z)} - \beta^{-1}\right)\phi' = \frac{2\pi}{T(z)}, \qquad (8.104)$$
where primes denote differentiation with respect to $z$ and where the subscript 0 has been dropped for notational simplicity.
A similar calculation shows that in the regions $\{z > E_0,\ p < 0\}$ and $\{E_{\min} < z < E_0\}$ the equation for $\phi_0$ is
$$-\mathcal{L}\phi = -\frac{2\pi}{T(z)}, \qquad z > E_0,\ p < 0, \qquad (8.105)$$
and
$$-\mathcal{L}\phi = 0, \qquad E_{\min} < z < E_0. \qquad (8.106)$$
Equations (8.104), (8.105), (8.106) are augmented with condition (8.85) and with a continuity condition at the critical energy [27]:
$$2\phi_3'(E_0) = \phi_1'(E_0) + \phi_2'(E_0), \qquad (8.107)$$
where $\phi_1$, $\phi_2$, $\phi_3$ are the solutions of equations (8.104), (8.105) and (8.106), respectively.
The average of a function $h(q,p) = h(q, p(z,q))$ can be written in the form [82, p. 303]
$$\langle h(q,p)\rangle_\beta := \int_{-\infty}^{\infty}\int_{-\pi}^{\pi} h(q,p)\,\mu_\beta(q,p)\,dq\,dp = Z_\beta^{-1}\int_{E_{\min}}^{+\infty}\int_{x_1(z)}^{x_2(z)}\big(h(q,p(z,q)) + h(q,-p(z,q))\big)\,(p(q,z))^{-1}e^{-\beta z}\,dz\,dq,$$
where the partition function is
$$Z_\beta = \sqrt{\frac{2\pi}{\beta}}\int_{-\pi}^{\pi}e^{-\beta V(q)}\,dq.$$
From equation (8.106) we deduce that $\phi_3(z) = 0$. Furthermore, we have that $\phi_1(z) = -\phi_2(z)$. These facts, together with the above formula for averaging with respect to the Boltzmann distribution, yield:
$$D = \langle p\,\Phi(p,q)\rangle_\beta = \frac{1}{\gamma}\langle p\,\phi_0\rangle_\beta + O(1) \qquad (8.108)$$
$$= \frac{2}{\gamma}\,Z_\beta^{-1}\int_{E_0}^{+\infty}\int_{-\pi}^{\pi}\phi_0(z)\,e^{-\beta z}\,dq\,dz + O(1) = \frac{4\pi}{\gamma}\,Z_\beta^{-1}\int_{E_0}^{+\infty}\phi_0(z)\,e^{-\beta z}\,dz + O(1), \qquad (8.109)$$
to leading order in $\gamma$, where $\phi_0(z)$ is the solution of the two-point boundary value problem (8.104). We remark that if we start with the formula $D = \gamma\beta^{-1}\langle|\partial_p\Phi|^2\rangle_\beta$ for the diffusion coefficient, we obtain the following formula, which is equivalent to (8.109):
$$D = \frac{2}{\gamma\beta}\,Z_\beta^{-1}\int_{E_0}^{+\infty} S(z)\,\big|\partial_z\phi_0(z)\big|^2\,e^{-\beta z}\,dz.$$
Now we solve the equation for $\phi_0(z)$ (for notational simplicity, we will drop the subscript 0). Using the fact that $S'(z) = T(z)$, we rewrite (8.104) as
$$-\beta^{-1}\big(S\phi'\big)' + S\phi' = 2\pi.$$
This equation can be rewritten as
$$-\beta^{-1}\big(e^{-\beta z}S\phi'\big)' = 2\pi e^{-\beta z}.$$
Condition (8.85) implies that the derivative of the unique solution of (8.104) is
$$\phi'(z) = 2\pi S^{-1}(z).$$
We use this in (8.109), together with an integration by parts, to obtain the following formula for the diffusion coefficient:
$$D = \frac{8\pi^2}{\gamma\beta}\,Z_\beta^{-1}\int_{E_0}^{+\infty}\frac{e^{-\beta z}}{S(z)}\,dz. \qquad (8.110)$$
We emphasize the fact that this formula is exact in the limit as $\gamma \to 0$ and is valid for all periodic potentials and for all values of the temperature.
Consider now the case of the nonlinear pendulum, $V(q) = -\cos(q)$. The partition function is
$$Z_\beta = \frac{(2\pi)^{3/2}}{\beta^{1/2}}\,J_0(\beta),$$
where $J_0(\cdot)$ is the modified Bessel function of the first kind. Furthermore, a simple calculation yields
$$S(z) = 2^{5/2}\sqrt{z+1}\;E\!\left(\sqrt{\frac{2}{z+1}}\right),$$
where $E(\cdot)$ is the complete elliptic integral of the second kind. The formula for the diffusion coefficient becomes
$$D = \frac{1}{\gamma}\,\frac{\sqrt{\pi}}{2\beta^{1/2}J_0(\beta)}\int_{1}^{+\infty}\frac{e^{-\beta z}}{\sqrt{z+1}\;E\big(\sqrt{2/(z+1)}\big)}\,dz. \qquad (8.111)$$
We now use the asymptotic formula $J_0(\beta) \simeq (2\pi\beta)^{-1/2}e^{\beta}$, $\beta \gg 1$, and the fact that $E(1) = 1$, to obtain the small temperature asymptotics for the diffusion coefficient:
$$D = \frac{1}{\gamma}\,\frac{\pi}{2\beta}\,e^{-2\beta}, \qquad \beta \gg 1, \qquad (8.112)$$
which is precisely the formula obtained by Risken.
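The agreement between (8.110) and the low-temperature asymptotics (8.112) can be checked numerically. The sketch below is a hypothetical computation (not from the text): it evaluates $S(z) = \int_{-\pi}^{\pi}\sqrt{2(z+\cos q)}\,dq$ by direct quadrature rather than via the elliptic integral, computes the modified Bessel function (written $J_0$ in the text, usually denoted $I_0$) from its power series, and compares the two formulas at $\beta = 5$:

```python
import math

def bessel_I0(x, terms=40):
    # power series I_0(x) = sum_k (x/2)^(2k) / (k!)^2
    s, t = 1.0, 1.0
    for k in range(1, terms):
        t *= (x / 2.0) ** 2 / k**2
        s += t
    return s

def S(z, n=2000):
    # S(z) = int_{-pi}^{pi} sqrt(2(z + cos q)) dq for the pendulum V(q) = -cos q
    h = 2.0 * math.pi / n
    return sum(math.sqrt(2.0 * (z + math.cos(-math.pi + (i + 0.5) * h))) * h
               for i in range(n))

def D_times_gamma(beta, n=400):
    # gamma * D from (8.110): (8 pi^2 / (beta Z_beta)) * int_1^inf e^(-beta z)/S(z) dz
    Z = (2.0 * math.pi) ** 1.5 / math.sqrt(beta) * bessel_I0(beta)
    zmax = 1.0 + 40.0 / beta      # truncate where e^(-beta z) is negligible
    h = (zmax - 1.0) / n
    integral = sum(math.exp(-beta * (1.0 + (i + 0.5) * h)) / S(1.0 + (i + 0.5) * h) * h
                   for i in range(n))
    return 8.0 * math.pi**2 / (beta * Z) * integral

beta = 5.0
exact = D_times_gamma(beta)
asymptotic = math.pi / (2.0 * beta) * math.exp(-2.0 * beta)  # formula (8.112)
print(exact / asymptotic)  # approaches 1 as beta grows
```

At $\beta = 5$ the ratio is already of order one but somewhat below 1, since both corrections neglected in (8.112) (the growth of $S(z)$ above the separatrix and the Bessel asymptotics) reduce the exact value.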
Unlike the overdamped limit, which is treated in the next section, it is not straightforward to obtain the next order correction to the formula for the effective diffusivity. This is due to the discontinuity of the solution of the Poisson equation (8.88) along the separatrix. In particular, the next order correction to $\phi$ when $\gamma \ll 1$ is of $O(\gamma^{-1/2})$, rather than $O(1)$ as suggested by the ansatz (8.102).

Upon combining the formula for the diffusion coefficient with the formula for the hopping rate from Kramers' theory [41, eqn. 4.48(a)], we can obtain a formula for the mean square jump length at low friction. For the cosine potential, and for $\beta \gg 1$, this formula is
$$\langle\ell^2\rangle = \frac{\pi^2}{8\gamma^2\beta^2}, \qquad \gamma \ll 1,\ \beta \gg 1. \qquad (8.113)$$
The Overdamped Limit
In this subsection we study the large γ asymptotics of the diffusion coefﬁcient. As in the previous
case, we use singular perturbation theory, e.g. [42, Ch. 8]. The regularity of the solution of (8.88)
when \gamma \gg 1 will enable us to obtain the first two terms in the \frac{1}{\gamma} expansion without any difficulty.
We set \gamma = \frac{1}{\varepsilon}. The differential operator L_0 becomes

L_0 = \frac{1}{\varepsilon} L_{OU} + L_H.

We look for a solution of (8.88) in the form of a power series expansion in \varepsilon = \gamma^{-1}:

\Phi = \phi_0 + \varepsilon \phi_1 + \varepsilon^2 \phi_2 + \varepsilon^3 \phi_3 + \dots \qquad (8.114)
We substitute this into (8.88) and obtain the following sequence of equations:

-L_{OU} \phi_0 = 0, \qquad (8.115a)
-L_{OU} \phi_1 = p + L_H \phi_0, \qquad (8.115b)
-L_{OU} \phi_2 = L_H \phi_1, \qquad (8.115c)
-L_{OU} \phi_3 = L_H \phi_2. \qquad (8.115d)

The null space of the Ornstein–Uhlenbeck operator L_{OU} consists of constants in p. Consequently, from the first equation in (8.115) we deduce that the first term in the expansion is independent of p: \phi_0 = \phi(q). The second equation becomes

-L_{OU} \phi_1 = p \left(1 + \partial_q \phi \right).
Let

\nu_\beta(p) = \left(\frac{2\pi}{\beta}\right)^{-\frac{1}{2}} e^{-\beta \frac{p^2}{2}}

be the invariant distribution of the OU process (i.e. L_{OU}^* \nu_\beta(p) = 0). The solvability condition for an equation of the form -L_{OU} \phi = f requires that the right-hand side average to 0 with respect to \nu_\beta(p), i.e. that the right-hand side of the equation be orthogonal to the null space of the adjoint of L_{OU}. This condition is clearly satisfied for the equation for \phi_1. Thus, by the Fredholm alternative, this equation has a solution, which is

\phi_1(p, q) = \left(1 + \partial_q \phi \right) p + \psi_1(q),
where the function \psi_1(q) is to be determined. We substitute this into the right-hand side of the third equation to obtain

-L_{OU} \phi_2 = p^2 \partial_q^2 \phi - \partial_q V \left(1 + \partial_q \phi \right) + p \, \partial_q \psi_1(q).
From the solvability condition for this we obtain an equation for \phi(q):

\beta^{-1} \partial_q^2 \phi - \partial_q V \left(1 + \partial_q \phi \right) = 0, \qquad (8.116)

together with periodic boundary conditions. The derivative of the solution of this two-point boundary value problem is

\partial_q \phi + 1 = \frac{2\pi \, e^{\beta V(q)}}{\int_{-\pi}^{\pi} e^{\beta V(q)} \, dq}. \qquad (8.117)
The first two terms in the large-\gamma expansion of the solution of equation (8.88) are

\Phi(p, q) = \phi(q) + \frac{1}{\gamma} \left(1 + \partial_q \phi(q)\right) p + O\!\left(\frac{1}{\gamma^2}\right),

where \phi(q) is the solution of (8.116). Substituting this into the formula for the diffusion coefficient and using (8.117) we obtain

D = \int_{-\infty}^{\infty} \int_{-\pi}^{\pi} p \, \Phi \, \rho_\beta(p, q) \, dp \, dq = \frac{4\pi^2}{\beta \gamma Z \hat{Z}} + O\!\left(\frac{1}{\gamma^3}\right),

where Z = \int_{-\pi}^{\pi} e^{-\beta V(q)} \, dq and \hat{Z} = \int_{-\pi}^{\pi} e^{\beta V(q)} \, dq. This is, of course, the Lifson–Jackson formula, which
gives the diffusion coefﬁcient in the overdamped limit [54]. Continuing in the same fashion, we
can also calculate the next two terms in the expansion (8.114), see Exercise 4. From this, we can
compute the next order correction to the diffusion coefﬁcient. The ﬁnal result is
D = \frac{4\pi^2}{\beta \gamma Z \hat{Z}} - \frac{4\pi^2}{\beta \gamma^3} \frac{Z_1}{Z \hat{Z}^2} + O\!\left(\frac{1}{\gamma^5}\right), \qquad (8.118)

where Z_1 = \int_{-\pi}^{\pi} |V'(q)|^2 e^{\beta V(q)} \, dq.
In the case of the nonlinear pendulum, V(q) = \cos(q), formula (8.118) gives

D = \frac{1}{\gamma \beta} J_0^{-2}(\beta) - \frac{\beta}{\gamma^3} \left( \frac{J_2(\beta)}{J_0^3(\beta)} - J_0^{-2}(\beta) \right) + O\!\left(\frac{1}{\gamma^5}\right), \qquad (8.119)

where J_n(\beta) is the modified Bessel function of the first kind.
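As a quick consistency check (an illustration under the notation above, not part of the text): for V(q) = \cos(q) both partition integrals equal 2\pi J_0(\beta), so the leading Lifson–Jackson term 4\pi^2/(\beta\gamma Z\hat{Z}) must collapse to the leading term of (8.119). In SciPy the modified Bessel function of the first kind is `scipy.special.iv`:

```python
import numpy as np
from scipy import special, integrate

beta, gamma = 2.0, 10.0

# Partition integrals over one period for V(q) = cos(q)
Z, _  = integrate.quad(lambda q: np.exp(-beta * np.cos(q)), -np.pi, np.pi)
Zh, _ = integrate.quad(lambda q: np.exp( beta * np.cos(q)), -np.pi, np.pi)

D_lifson_jackson = 4.0 * np.pi**2 / (beta * gamma * Z * Zh)  # leading term of (8.118)
D_bessel = special.iv(0, beta)**(-2) / (gamma * beta)        # leading term of (8.119)
```

The two numbers agree to quadrature accuracy.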
In the multidimensional case, a similar analysis leads to the large-\gamma asymptotics

\langle \xi, D \xi \rangle = \frac{1}{\gamma} \langle \xi, D_0 \xi \rangle + O\!\left(\frac{1}{\gamma^3}\right),

where \xi is an arbitrary unit vector in \mathbb{R}^d and D_0 is the diffusion coefficient for the Smoluchowski (overdamped) dynamics:

D_0 = Z^{-1} \int_{\mathbb{T}^d} \left(-L_V \chi \right) \otimes \chi \, e^{-\beta V(q)} \, dq, \qquad (8.120)

where

L_V = -\nabla_q V \cdot \nabla_q + \beta^{-1} \Delta_q

and \chi(q) is the solution of the PDE L_V \chi = \nabla_q V with periodic boundary conditions.
Now we prove several properties of the effective diffusion tensor in the overdamped limit. For this we will need the following integration by parts formula:

\int_{\mathbb{T}^d} \left(\nabla_y \chi \right) \rho \, dy = \int_{\mathbb{T}^d} \left( \nabla_y (\chi \rho) - \chi \otimes \nabla_y \rho \right) dy = - \int_{\mathbb{T}^d} \left( \chi \otimes \nabla_y \rho \right) dy. \qquad (8.121)
The proof of this formula is left as an exercise, see Exercise 5.
Theorem 8.6.1. The effective diffusion tensor D_0 in (8.120), denoted below by \mathcal{K}, satisfies the upper and lower bounds

\frac{D}{Z \hat{Z}} |\xi|^2 \leq \langle \xi, \mathcal{K} \xi \rangle \leq D |\xi|^2 \quad \forall \xi \in \mathbb{R}^d, \qquad (8.122)

where Z = \int_{\mathbb{T}^d} e^{-V(y)/D} \, dy and \hat{Z} = \int_{\mathbb{T}^d} e^{V(y)/D} \, dy.
In particular, diffusion is always depleted when compared to molecular diffusivity. Furthermore,
the effective diffusivity is symmetric.
Proof. The lower bound follows from the general lower bound (??), equation (??) and the formula for the Gibbs measure. To establish the upper bound, we use (8.121) and (??) to obtain

\mathcal{K} = DI + 2D \int_{\mathbb{T}^d} (\nabla \chi)^T \rho \, dy + \int_{\mathbb{T}^d} -\nabla_y V \otimes \chi \rho \, dy
= DI - 2D \int_{\mathbb{T}^d} \nabla_y \rho \otimes \chi \, dy + \int_{\mathbb{T}^d} -\nabla_y V \otimes \chi \rho \, dy
= DI - 2 \int_{\mathbb{T}^d} -\nabla_y V \otimes \chi \rho \, dy + \int_{\mathbb{T}^d} -\nabla_y V \otimes \chi \rho \, dy
= DI - \int_{\mathbb{T}^d} -\nabla_y V \otimes \chi \rho \, dy
= DI - \int_{\mathbb{T}^d} \left(-L_0 \chi \right) \otimes \chi \rho \, dy
= DI - D \int_{\mathbb{T}^d} \left(\nabla_y \chi \otimes \nabla_y \chi \right) \rho \, dy. \qquad (8.123)

Hence, for \chi_\xi = \chi \cdot \xi,

\langle \xi, \mathcal{K} \xi \rangle = D |\xi|^2 - D \int_{\mathbb{T}^d} |\nabla_y \chi_\xi|^2 \rho \, dy \leq D |\xi|^2.
This proves depletion. The symmetry of \mathcal{K} follows from (8.123).
The One-Dimensional Case

The one-dimensional case is always in gradient form: b(y) = -\partial_y V(y). Furthermore, in one dimension we can solve the cell problem (??) in closed form and calculate the effective diffusion coefficient explicitly, up to quadratures. We start with the following calculation concerning the structure of the diffusion coefficient:

\mathcal{K} = D + 2D \int_0^1 \partial_y \chi \, \rho \, dy + \int_0^1 -\partial_y V \, \chi \rho \, dy
= D + 2D \int_0^1 \partial_y \chi \, \rho \, dy + D \int_0^1 \chi \, \partial_y \rho \, dy
= D + 2D \int_0^1 \partial_y \chi \, \rho \, dy - D \int_0^1 \partial_y \chi \, \rho \, dy
= D \int_0^1 \left(1 + \partial_y \chi \right) \rho \, dy. \qquad (8.124)
The cell problem (??) in one dimension is

D \partial_{yy} \chi - \partial_y V \, \partial_y \chi = \partial_y V. \qquad (8.125)

We multiply equation (8.125) by e^{-V(y)/D} to obtain

\partial_y \left( \partial_y \chi \, e^{-V(y)/D} \right) = -\partial_y \left( e^{-V(y)/D} \right).

We integrate this equation from 0 to 1 and multiply by e^{V(y)/D} to obtain

\partial_y \chi(y) = -1 + c_1 e^{V(y)/D}.

Another integration yields

\chi(y) = -y + c_1 \int_0^y e^{V(y)/D} \, dy + c_2.

The periodic boundary conditions imply that \chi(0) = \chi(1), from which we conclude that

-1 + c_1 \int_0^1 e^{V(y)/D} \, dy = 0.

Hence

c_1 = \frac{1}{\hat{Z}}, \quad \hat{Z} = \int_0^1 e^{V(y)/D} \, dy.
We deduce that

\partial_y \chi = -1 + \frac{1}{\hat{Z}} e^{V(y)/D}.

We substitute this expression into (8.124) to obtain

\mathcal{K} = \frac{D}{Z} \int_0^1 \left(1 + \partial_y \chi(y)\right) e^{-V(y)/D} \, dy = \frac{D}{Z \hat{Z}} \int_0^1 e^{V(y)/D} e^{-V(y)/D} \, dy = \frac{D}{Z \hat{Z}}, \qquad (8.126)

with

Z = \int_0^1 e^{-V(y)/D} \, dy, \quad \hat{Z} = \int_0^1 e^{V(y)/D} \, dy. \qquad (8.127)

The Cauchy–Schwarz inequality shows that Z \hat{Z} \geq 1. Notice that in the one-dimensional case the formula for the effective diffusivity is precisely the lower bound in (8.122). This shows that the lower bound is sharp.
Example 8.6.2. Consider the potential

V(y) = \begin{cases} a_1, & y \in [0, \tfrac{1}{2}], \\ a_2, & y \in (\tfrac{1}{2}, 1], \end{cases} \qquad (8.128)

where a_1, a_2 are positive constants.^4 It is straightforward to calculate the integrals in (8.127) to obtain the formula

\mathcal{K} = \frac{D}{\cosh^2\!\left(\frac{a_1 - a_2}{2D}\right)}. \qquad (8.129)

In Figure 8.2 we plot the effective diffusivity given by (8.129) as a function of the molecular diffusivity D. We observe that \mathcal{K} decays exponentially fast in the limit as D \to 0.
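The algebra behind (8.129) is a two-line computation: each of the two values a_1, a_2 occupies half of the unit period, so Z = (e^{-a_1/D} + e^{-a_2/D})/2 and \hat{Z} = (e^{a_1/D} + e^{a_2/D})/2, whence Z\hat{Z} = \cosh^2((a_1 - a_2)/(2D)). A small numerical check (illustrative only; function names are ours):

```python
import numpy as np

def K_from_partition(a1, a2, D):
    # Each potential level occupies half of the unit period
    Z  = 0.5 * (np.exp(-a1 / D) + np.exp(-a2 / D))
    Zh = 0.5 * (np.exp( a1 / D) + np.exp( a2 / D))
    return D / (Z * Zh)                               # formula (8.126)

def K_closed_form(a1, a2, D):
    return D / np.cosh((a1 - a2) / (2.0 * D))**2      # formula (8.129)
```

When the two levels coincide the depletion factor disappears and the effective diffusivity equals the molecular one, as it must.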
^4 Of course, this potential is not even continuous, let alone smooth, and the theory as developed in this chapter does not apply. It is possible, however, to consider a regularized version of this discontinuous potential; homogenization theory then applies.

Figure 8.2: Effective diffusivity versus molecular diffusivity for the potential (8.128).

8.6.1 Brownian Motion in a Tilted Periodic Potential

In this appendix we use our method to obtain a formula for the effective diffusion coefficient of an overdamped particle moving in a one-dimensional tilted periodic potential. This formula was first derived and analyzed in [80, 79] without any appeal to multiscale analysis. The equation of motion is
\dot{x} = -V'(x) + F + \sqrt{2D} \, \xi, \qquad (8.130)

where V(x) is a smooth periodic function with period L, F and D > 0 are constants, and \xi(t) is standard white noise in one dimension. To simplify the notation we have set \gamma = 1.

The stationary Fokker–Planck equation corresponding to (8.130) is

\partial_x \left( \left(V'(x) - F\right) \rho(x) + D \partial_x \rho(x) \right) = 0, \qquad (8.131)

with periodic boundary conditions. Formula (10.13) for the effective drift now becomes

U_{\mathrm{eff}} = \int_0^L \left(-V'(x) + F\right) \rho(x) \, dx. \qquad (8.132)

The solution of eqn. (8.131) is [77, Ch. 9]

\rho(x) = \frac{1}{Z} \int_x^{x+L} Z_+(y) Z_-(x) \, dy, \qquad (8.133)

with

Z_\pm(x) := e^{\pm \frac{1}{D}\left(V(x) - Fx\right)},

and

Z = \int_0^L dx \int_x^{x+L} dy \, Z_+(y) Z_-(x). \qquad (8.134)

Upon using (8.133) in (8.132) we obtain [77, Ch. 9]

U_{\mathrm{eff}} = \frac{DL}{Z} \left(1 - e^{-\frac{FL}{D}}\right). \qquad (8.135)
Our goal now is to calculate the effective diffusion coefficient. For this we first need to solve the Poisson equation (10.20), which now becomes

\mathcal{L} \chi(x) := D \partial_{xx} \chi(x) + \left(-V'(x) + F\right) \partial_x \chi = V'(x) - F + U_{\mathrm{eff}}, \qquad (8.136)

with periodic boundary conditions. Then we need to evaluate the integrals in (10.18):

D_{\mathrm{eff}} = D + \int_0^L \left(-V'(x) + F - U_{\mathrm{eff}}\right) \chi(x) \rho(x) \, dx + 2D \int_0^L \partial_x \chi(x) \, \rho(x) \, dx.
It will be more convenient for the subsequent calculation to rewrite the above formula for the effective diffusion coefficient in a different form. The fact that \rho(x) solves the stationary Fokker–Planck equation, together with elementary integrations by parts, yields that, for all sufficiently smooth periodic functions \phi(x),

\int_0^L \phi(x) \left(-\mathcal{L} \phi(x)\right) \rho(x) \, dx = D \int_0^L \left(\partial_x \phi(x)\right)^2 \rho(x) \, dx.

Now we have

D_{\mathrm{eff}} = D + \int_0^L \left(-V'(x) + F - U_{\mathrm{eff}}\right) \chi(x) \rho(x) \, dx + 2D \int_0^L \partial_x \chi(x) \, \rho(x) \, dx
= D + \int_0^L \left(-\mathcal{L} \chi(x)\right) \chi(x) \rho(x) \, dx + 2D \int_0^L \partial_x \chi(x) \, \rho(x) \, dx
= D + D \int_0^L \left(\partial_x \chi(x)\right)^2 \rho(x) \, dx + 2D \int_0^L \partial_x \chi(x) \, \rho(x) \, dx
= D \int_0^L \left(1 + \partial_x \chi(x)\right)^2 \rho(x) \, dx. \qquad (8.137)
Now we solve the Poisson equation (8.136) with periodic boundary conditions. We multiply the equation by Z_-(x) and divide through by D to rewrite it in the form

\partial_x \left( \partial_x \chi(x) \, Z_-(x) \right) = -\partial_x Z_-(x) + \frac{U_{\mathrm{eff}}}{D} Z_-(x).
We integrate this equation from x - L to x and use the periodicity of \chi(x) and V(x), together with formula (8.135), to obtain

\partial_x \chi(x) \, Z_-(x) \left(1 - e^{-\frac{FL}{D}}\right) = -Z_-(x) \left(1 - e^{-\frac{FL}{D}}\right) + \frac{L}{Z} \left(1 - e^{-\frac{FL}{D}}\right) \int_{x-L}^{x} Z_-(y) \, dy,

from which we immediately get

\partial_x \chi(x) + 1 = \frac{L}{Z} \int_{x-L}^{x} Z_-(y) Z_+(x) \, dy.
Substituting this into (8.137) and using the formula for the invariant distribution (8.133) we finally obtain

D_{\mathrm{eff}} = \frac{D L^2}{Z^3} \int_0^L \left(I_+(x)\right)^2 I_-(x) \, dx, \qquad (8.138)

with

I_+(x) = \int_{x-L}^{x} Z_-(y) Z_+(x) \, dy \quad \text{and} \quad I_-(x) = \int_x^{x+L} Z_+(y) Z_-(x) \, dy.
Formula (8.138) for the effective diffusion coefﬁcient (formula (22) in [79]) is the main result of
this section.
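Formula (8.138) is straightforward to evaluate by nested quadrature. The sketch below is illustrative (the cosine potential and all parameter values are our choices, not from the text, and the L-dependent normalization matches the convention for Z used here); it also checks the F -> 0 limit, where the effective diffusion coefficient must reduce to the Lifson–Jackson value:

```python
import numpy as np
from scipy import integrate

L_per, D = 2.0 * np.pi, 0.5
V = np.cos                      # example potential with period 2*pi

def Zp(x, F): return np.exp( (V(x) - F * x) / D)
def Zm(x, F): return np.exp(-(V(x) - F * x) / D)

def I_plus(x, F):
    v, _ = integrate.quad(lambda y: Zm(y, F), x - L_per, x)
    return Zp(x, F) * v

def I_minus(x, F):
    v, _ = integrate.quad(lambda y: Zp(y, F), x, x + L_per)
    return Zm(x, F) * v

def D_eff(F):
    Z, _   = integrate.quad(lambda x: I_minus(x, F), 0.0, L_per)          # (8.134)
    num, _ = integrate.quad(lambda x: I_plus(x, F)**2 * I_minus(x, F), 0.0, L_per)
    return D * L_per**2 * num / Z**3                                      # (8.138)
```

For F = 0 one has I_+ = e^{V/D} \int e^{-V/D} and I_- = e^{-V/D} \int e^{V/D}, and the triple of integrals collapses analytically to D L^2 / (\int e^{-V/D} \int e^{V/D}).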
8.7 Numerical Solution of the Klein–Kramers Equation
8.8 Discussion and Bibliography
The rigorous study of the overdamped limit can be found in [68]. A similar approximation theorem
is also valid in inﬁnite dimensions (i.e. for SPDEs); see [10, 11].
More information about the underdamped limit of the Langevin equation can be found in [89, 28, 29].
We also mention in passing that the various formulae for the effective diffusion coefficient that have been derived in the literature [34, 54, 80, 85] can be obtained from equation (??): they correspond to cases where equations (??) and (??) can be solved analytically. An example, the calculation of the effective diffusion coefficient of an overdamped Brownian particle in a tilted periodic potential, is presented in the appendix. Similar calculations yield analytical expressions for all other exactly solvable models that have been considered in the literature.
8.9 Exercises
1. Let \hat{L} be the generator of the two-dimensional Ornstein–Uhlenbeck operator (8.17). Calculate the eigenvalues and eigenfunctions of \hat{L}. Show that there exists a transformation that transforms \hat{L} into the Schrödinger operator of the two-dimensional quantum harmonic oscillator.

2. Let \hat{L} be the operator defined in (8.34).

(a) Show by direct substitution that \hat{L} can be written in the form

\hat{L} = -\lambda_1 (c^-)^* (c^+)^* - \lambda_2 (d^-)^* (d^+)^*.

(b) Calculate the commutators

[(c^+)^*, (c^-)^*], \quad [(d^+)^*, (d^-)^*], \quad [(c^\pm)^*, (d^\pm)^*], \quad [\hat{L}, (c^\pm)^*], \quad [\hat{L}, (d^\pm)^*].

3. Show that the operators a^\pm, b^\pm defined in (8.15) and (8.16) satisfy the commutation relations

[a^+, a^-] = -1, \qquad (8.139a)
[b^+, b^-] = -1, \qquad (8.139b)
[a^\pm, b^\pm] = 0. \qquad (8.139c)

4. Obtain the second term in the expansion (8.118).

5. Prove formula (8.121).
Chapter 9
Exit Time Problems
9.1 Introduction
9.2 Brownian Motion in a Bistable Potential
There are many systems in physics, chemistry and biology that exist in at least two stable states. Among the many applications we mention the switching and storage devices in computers. Another example is biological macromolecules that can exist in many different states. The problems that we would like to solve are:

• How stable are the various states relative to each other?
• How long does it take for a system to switch spontaneously from one state to another?
• How is the transfer made, i.e. through what path in the relevant state space? There is a lot of important current work on this problem by E, Vanden-Eijnden and others.
• How does the system relax to an unstable state?

We can distinguish between the one-dimensional problem, the finite-dimensional problem and the infinite-dimensional problem (SPDEs). We will solve the one-dimensional problem completely and discuss the finite-dimensional problem in some detail. The infinite-dimensional situation is an extremely hard problem and we will only make some remarks. The study of bistability and metastability is a very active research area, in particular the development of numerical methods for the calculation of various quantities such as reaction rates, transition pathways etc.
We will mostly consider the dynamics of a particle moving in a bistable potential, under the influence of thermal noise, in one dimension:

\dot{x} = -V'(x) + \sqrt{2 k_B T} \, \dot{W}. \qquad (9.1)

An example of the class of potentials that we will consider is shown in Figure. It has two local minima and one local maximum, and it increases at least quadratically at infinity. This ensures that the state space is "compact", i.e. that the particle cannot escape to infinity. The standard potential that satisfies these assumptions is

V(x) = \frac{1}{4} x^4 - \frac{1}{2} x^2 + \frac{1}{4}. \qquad (9.2)

It is easily checked that this potential has three critical points: a local maximum at x = 0 and two local minima at x = \pm 1. The values of the potential at these three points are

V(\pm 1) = 0, \quad V(0) = \frac{1}{4}.

We will say that the height of the potential barrier is \frac{1}{4}. The physically (and mathematically!) interesting case is when the thermal fluctuations are weak compared to the potential barrier that the particle has to climb over.

More generally, we assume that the potential has two local minima at the points a and c and a local maximum at b. Let us consider the problem of the escape of the particle from the left local minimum a. The potential barrier is then defined as

\Delta E = V(b) - V(a).
Our assumption that the thermal fluctuations are weak can be written as

\frac{k_B T}{\Delta E} \ll 1.

In this limit, it is intuitively clear that the particle is most likely to be found at either a or c, where it will perform small oscillations around either of the local minima. This is a result that we can obtain by studying the small temperature limit using perturbation theory. The result is that we can describe the dynamics of the particle locally by appropriate Ornstein–Uhlenbeck processes. Of course, this result is valid only for finite times: at sufficiently long times the particle can escape from one local minimum, a say, and surmount the potential barrier to end up at c. It will then spend a long time in the neighborhood of c until it escapes again over the potential barrier and ends up at a. This is an example of a rare event. The relevant time scale, the exit time or mean first passage time, scales exponentially in \beta := (k_B T)^{-1}:

\tau = \nu^{-1} \exp(\beta \Delta E).

It is more customary to calculate the reaction rate \kappa := \tau^{-1}, which gives the rate at which particles escape from a local minimum of the potential:

\kappa = \nu \exp(-\beta \Delta E). \qquad (9.3)

It is very important to notice that the escape from a local minimum, i.e. a state of local stability, can happen only at positive temperatures: it is a noise assisted event. Indeed, consider the case T = 0. The equation of motion becomes

\dot{x} = -V'(x), \quad x(0) = x_0.

In this case the potential becomes a Lyapunov function:

\frac{dV(x(t))}{dt} = V'(x) \frac{dx}{dt} = -\left(V'(x)\right)^2 < 0.

Hence, depending on the initial condition, the particle will converge either to a or to c. The particle cannot escape from either state of local stability.

On the other hand, at high temperatures the particle does not "see" the potential barrier: it essentially jumps freely from one local minimum to the other.
To get a better understanding of the dependence of the dynamics on the depth of the potential barrier relative to the temperature, we solve the equation of motion (10.3) numerically. In Figure we present the time series of the particle position. We observe that at small temperatures the particle spends most of its time around x = \pm 1, with rapid transitions from -1 to 1 and back.
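A minimal Euler–Maruyama discretization of the overdamped dynamics for the quartic potential (9.2) reproduces this behavior; the time step, temperature and random seed below are our illustrative choices, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
kBT, dt, nsteps = 0.5, 1e-3, 200_000
Vprime = lambda x: x**3 - x            # V(x) = x^4/4 - x^2/2 + 1/4

x = np.empty(nsteps + 1)
x[0] = -1.0                            # start in the left well
noise = rng.standard_normal(nsteps)
for n in range(nsteps):
    # Euler-Maruyama step for dx = -V'(x) dt + sqrt(2 kBT) dW
    x[n + 1] = x[n] - Vprime(x[n]) * dt + np.sqrt(2.0 * kBT * dt) * noise[n]
```

With the barrier height 1/4 and k_B T = 0.5 the crossings are frequent; lowering the temperature makes the trajectory dwell near \pm 1 for exponentially long stretches.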
9.3 The Mean First Passage Time
The Arrhenius-type factor in the formula for the reaction rate, eqn. (9.3), is intuitively plausible and was observed experimentally in the late nineteenth century by Arrhenius and others. What is extremely important, both from a theoretical and an applied point of view, is the calculation of the prefactor \nu, the rate coefficient. A systematic approach for the calculation of the rate coefficient, as well as for the justification of the Arrhenius kinetics, is the mean first passage time (MFPT) method. Since this method is of independent interest and is useful in various other contexts, we will present it in a quite general setting and apply it to the problem of escape from a potential barrier in later sections. We will first treat the one-dimensional problem and then extend the theory to arbitrary finite dimensions.

We will restrict ourselves to the case of homogeneous Markov processes. It is not easy to extend the method to non-Markovian processes.
9.3.1 The Boundary Value Problem for the MFPT
Let X_t be a continuous-time diffusion process on \mathbb{R}^d whose evolution is governed by the SDE

dX_t^x = b(X_t^x) \, dt + \sigma(X_t^x) \, dW_t, \quad X_0^x = x. \qquad (9.4)

Let D be a bounded subset of \mathbb{R}^d with smooth boundary. Given x \in D, we want to know how long it takes for the process X_t to leave the domain D for the first time:

\tau_D^x = \inf \left\{ t \geq 0 : X_t^x \notin D \right\}.

Clearly, this is a random variable. The average of this random variable is called the mean first passage time (MFPT) or the first exit time:

\tau(x) := \mathbb{E} \tau_D^x.

We can calculate the MFPT by solving an appropriate boundary value problem.
Theorem 9.3.1. The MFPT is the solution of the boundary value problem

-\mathcal{L} \tau = 1, \quad x \in D, \qquad (9.5a)
\tau = 0, \quad x \in \partial D, \qquad (9.5b)

where \mathcal{L} is the generator of the SDE (9.4).

The homogeneous Dirichlet boundary conditions correspond to an absorbing boundary: the particles are removed when they reach the boundary. Other choices of boundary conditions are also possible. The rigorous proof of Theorem 9.3.1 is based on Itô's formula.

Proof. Let \rho(X, x, t) be the probability distribution of the particles that have not left the domain D at time t. It solves the Fokker–Planck equation with absorbing boundary conditions:

\frac{\partial \rho}{\partial t} = \mathcal{L}^* \rho, \quad \rho(X, x, 0) = \delta(X - x), \quad \rho|_{\partial D} = 0. \qquad (9.6)

We can write the solution to this equation in the form

\rho(X, x, t) = e^{\mathcal{L}^* t} \delta(X - x),

where the absorbing boundary conditions are included in the definition of the semigroup e^{\mathcal{L}^* t}. The homogeneous Dirichlet (absorbing) boundary conditions imply that

\lim_{t \to +\infty} \rho(X, x, t) = 0.

That is, all particles will eventually leave the domain. The (normalized) number of particles that are still inside D at time t is

S(x, t) = \int_D \rho(X, x, t) \, dX.

Notice that this is a decreasing function of time. We can write

\frac{\partial S}{\partial t} = -f(x, t),

where f(x, t) is the distribution of first passage times. The MFPT is the first moment of this distribution:

\tau(x) = \int_0^{+\infty} f(x, s) \, s \, ds = \int_0^{+\infty} -\frac{dS}{ds} \, s \, ds
= \int_0^{+\infty} S(x, s) \, ds = \int_0^{+\infty} \int_D \rho(X, x, s) \, dX \, ds
= \int_0^{+\infty} \int_D e^{\mathcal{L}^* s} \delta(X - x) \, dX \, ds
= \int_0^{+\infty} \int_D \delta(X - x) \left(e^{\mathcal{L} s} 1\right) dX \, ds = \int_0^{+\infty} \left(e^{\mathcal{L} s} 1\right) ds.

We apply \mathcal{L} to the above equation to deduce

\mathcal{L} \tau = \int_0^{+\infty} \mathcal{L} e^{\mathcal{L} s} 1 \, ds = \int_0^{+\infty} \frac{d}{ds} \left(e^{\mathcal{L} s} 1\right) ds = \lim_{s \to +\infty} e^{\mathcal{L} s} 1 - 1 = -1,

since \lim_{s \to +\infty} e^{\mathcal{L} s} 1 = 0.
9.3.2 Examples
In this section we consider a few simple examples for which we can calculate the mean first passage time in closed form.
Brownian motion with one absorbing and one reﬂecting boundary.
We consider the problem of Brownian motion moving in the interval [a, b]. We assume that the left boundary is absorbing and the right boundary is reflecting. The boundary value problem for the MFPT becomes

-\frac{d^2 \tau}{dx^2} = 1, \quad \tau(a) = 0, \quad \frac{d\tau}{dx}(b) = 0. \qquad (9.7)

The solution of this equation is

\tau(x) = -\frac{x^2}{2} + bx + a \left(\frac{a}{2} - b\right).

The MFPT for Brownian motion with one absorbing and one reflecting boundary in the interval [-1, 1] is plotted in Figure 9.1.
Figure 9.1: The mean ﬁrst passage time for Brownian motion with one absorbing and one reﬂecting
boundary.
Brownian motion with two absorbing boundaries.

Consider again the problem of Brownian motion moving in the interval [a, b], but now with both boundaries absorbing. The boundary value problem for the MFPT becomes

-\frac{d^2 \tau}{dx^2} = 1, \quad \tau(a) = 0, \quad \tau(b) = 0. \qquad (9.8)

The solution of this equation is

\tau(x) = \frac{1}{2} (x - a)(b - x).

The MFPT for Brownian motion with two absorbing boundaries in the interval [-1, 1] is plotted in Figure 9.2.
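Both closed-form solutions can be verified against a direct finite-difference solve of the boundary value problem. The sketch below (illustrative; the grid size is our choice) handles the two-absorbing-boundaries case (9.8) on [-1, 1]; for the quadratic exact solution, central differences are exact up to rounding:

```python
import numpy as np

a, b, n = -1.0, 1.0, 201
x = np.linspace(a, b, n)
h = x[1] - x[0]

# Discretize -tau'' = 1 with tau(a) = tau(b) = 0 by central differences
main = np.full(n - 2,  2.0 / h**2)
off  = np.full(n - 3, -1.0 / h**2)
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

tau = np.zeros(n)
tau[1:-1] = np.linalg.solve(A, np.ones(n - 2))

tau_exact = 0.5 * (x - a) * (b - x)
```

The reflecting case (9.7) only changes the last row of the matrix (a one-sided difference enforcing tau'(b) = 0).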
Figure 9.2: The mean first passage time for Brownian motion with two absorbing boundaries.

The Mean First Passage Time for a One-Dimensional Diffusion Process

Consider now the mean exit time problem from an interval [a, b] for a general one-dimensional diffusion process with generator

\mathcal{L} = a(x) \frac{d}{dx} + \frac{1}{2} b(x) \frac{d^2}{dx^2},

where the drift and diffusion coefficients are smooth functions and where the diffusion coefficient b(x) is a strictly positive function (uniform ellipticity condition). In order to calculate the mean first passage time we need to solve the differential equation

-\left( a(x) \frac{d}{dx} + \frac{1}{2} b(x) \frac{d^2}{dx^2} \right) \tau = 1, \qquad (9.9)

together with appropriate boundary conditions, depending on whether we have one absorbing and one reflecting boundary or two absorbing boundaries. To solve this equation we first define the function \psi(x) through \psi'(x) = 2a(x)/b(x) and write (9.9) in the form

\left( e^{\psi(x)} \tau'(x) \right)' = -\frac{2}{b(x)} e^{\psi(x)}.

The general solution of (9.9) is obtained after two integrations:

\tau(x) = -2 \int_a^x e^{-\psi(z)} \int_a^z \frac{e^{\psi(y)}}{b(y)} \, dy \, dz + c_1 \int_a^x e^{-\psi(y)} \, dy + c_2,

where the constants c_1 and c_2 are to be determined from the boundary conditions. When both boundaries are absorbing we get

\tau(x) = -2 \int_a^x e^{-\psi(z)} \int_a^z \frac{e^{\psi(y)}}{b(y)} \, dy \, dz + \frac{2 \hat{Z}}{Z} \int_a^x e^{-\psi(y)} \, dy, \qquad (9.10)

where \hat{Z} = \int_a^b e^{-\psi(z)} \int_a^z \frac{e^{\psi(y)}}{b(y)} \, dy \, dz and Z = \int_a^b e^{-\psi(y)} \, dy.
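The quadrature formula above can be implemented directly. The helper below is an illustrative sketch (the function and argument names are ours): it computes tau(x) for two absorbing boundaries and, for zero drift and b(x) = 2 — the Brownian case with generator d^2/dx^2 used in the previous examples — reproduces tau(x) = (x - a)(b - x)/2:

```python
import numpy as np
from scipy import integrate

def mfpt_two_absorbing(x, x0, x1, drift, diff):
    """tau(x) from (9.10) for generator a(x) d/dx + (1/2) b(x) d^2/dx^2 on [x0, x1]."""
    psi = lambda s: integrate.quad(lambda u: 2.0 * drift(u) / diff(u), x0, s)[0]
    inner = lambda z: integrate.quad(lambda y: np.exp(psi(y)) / diff(y), x0, z)[0]
    T = lambda s: -2.0 * integrate.quad(lambda z: np.exp(-psi(z)) * inner(z), x0, s)[0]
    W = lambda s: integrate.quad(lambda y: np.exp(-psi(y)), x0, s)[0]
    c1 = -T(x1) / W(x1)     # enforce tau(x1) = 0; tau(x0) = 0 holds automatically
    return T(x) + c1 * W(x)
```

The nested quadratures are inefficient but make the structure of (9.10) transparent.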
9.4 Escape from a Potential Barrier
In this section we use the theory developed in the previous section to study the long time/small temperature asymptotics of solutions to the Langevin equation for a particle moving in a one-dimensional potential of the form (9.2):

\ddot{x} = -V'(x) - \gamma \dot{x} + \sqrt{2 \gamma k_B T} \, \dot{W}. \qquad (9.11)

In particular, we justify the Arrhenius formula for the reaction rate,

\kappa = \nu(\gamma) \exp(-\beta \Delta E),

and we calculate the escape rate \nu = \nu(\gamma). In particular, we analyze the dependence of the escape rate on the friction coefficient. We will see that we need to distinguish between the cases of large and small friction coefficients.
9.4.1 Calculation of the Reaction Rate in the Overdamped Regime
We consider the Langevin equation (9.11) in the limit of large friction. As we saw in Section 8.4, in the overdamped limit \gamma \gg 1 the solution to (9.11) can be approximated by the solution to the Smoluchowski equation (10.3):

\dot{x} = -V'(x) + \sqrt{2 \beta^{-1}} \, \dot{W}.

We want to calculate the rate of escape from the potential barrier in this case. We assume that the particle is initially at x_0, near a, the left potential minimum. Consider the boundary value problem for the MFPT of the one-dimensional diffusion process (10.3) from the interval (a, b):

-\beta^{-1} e^{\beta V} \partial_x \left( e^{-\beta V} \partial_x \tau \right) = 1. \qquad (9.12)

We choose a reflecting boundary condition at x = a and an absorbing one at x = b. We can solve (9.12) with these boundary conditions by quadratures:

\tau(x) = \beta \int_x^b dy \, e^{\beta V(y)} \int_a^y dz \, e^{-\beta V(z)}. \qquad (9.13)

Now we can solve the problem of escape from a potential well: the reflecting boundary is at x = a, the left local minimum of the potential, and the absorbing boundary is at x = b, the local maximum. We can replace the boundary condition at x = a by a repelling boundary condition at x = -\infty:

\tau(x) = \beta \int_x^b dy \, e^{\beta V(y)} \int_{-\infty}^y dz \, e^{-\beta V(z)}.
When \beta E_b \gg 1 the integral with respect to z is dominated by the value of the potential near a. Furthermore, we can replace the upper limit of integration by +\infty:

\int_{-\infty}^{y} \exp(-\beta V(z)) \, dz \approx \int_{-\infty}^{+\infty} \exp(-\beta V(a)) \exp\!\left(-\frac{\beta \omega_0^2}{2} (z - a)^2\right) dz = \exp(-\beta V(a)) \sqrt{\frac{2\pi}{\beta \omega_0^2}},

where we have used the Taylor series expansion around the minimum:

V(z) = V(a) + \frac{1}{2} \omega_0^2 (z - a)^2 + \dots

Similarly, the integral with respect to y is dominated by the value of the potential around the saddle point. We use the Taylor series expansion

V(y) = V(b) - \frac{1}{2} \omega_b^2 (y - b)^2 + \dots

Assuming that x is close to a, the minimum of the potential, we can replace the lower limit of integration by -\infty. We finally obtain

\int_x^b \exp(\beta V(y)) \, dy \approx \int_{-\infty}^b \exp(\beta V(b)) \exp\!\left(-\frac{\beta \omega_b^2}{2} (y - b)^2\right) dy = \frac{1}{2} \exp(\beta V(b)) \sqrt{\frac{2\pi}{\beta \omega_b^2}}.

Putting everything together we obtain a formula for the MFPT:

\tau(x) = \frac{\pi}{\omega_0 \omega_b} \exp(\beta E_b).

The rate of arrival at b is 1/\tau. Only half of the particles escape. Consequently, the escape rate (or reaction rate) is given by \frac{1}{2\tau}:

\kappa = \frac{\omega_0 \omega_b}{2\pi} \exp(-\beta E_b).
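The quality of this approximation is easy to probe numerically for the quartic potential (9.2), for which \omega_0^2 = V''(-1) = 2, \omega_b^2 = -V''(0) = 1 and E_b = 1/4. The sketch below is our illustration (the value of \beta and the quadrature cutoff standing in for -\infty are arbitrary choices): it evaluates the exact quadrature formula for the MFPT and compares it with \pi/(\omega_0 \omega_b) e^{\beta E_b}:

```python
import numpy as np
from scipy import integrate

beta = 40.0
V = lambda z: 0.25 * z**4 - 0.5 * z**2 + 0.25

# MFPT from the left well (x = -1) to the barrier top (x = 0);
# the lower limit -4 stands in for -infinity (the integrand is negligible beyond it)
inner = lambda y: integrate.quad(lambda z: np.exp(-beta * V(z)), -4.0, y)[0]
tau_exact, _ = integrate.quad(
    lambda y: beta * np.exp(beta * V(y)) * inner(y), -1.0, 0.0)

omega0, omegab = np.sqrt(2.0), 1.0          # sqrt(V''(-1)), sqrt(-V''(0))
tau_kramers = np.pi / (omega0 * omegab) * np.exp(beta * 0.25)
```

At \beta E_b = 10 the Laplace-asymptotics result is already accurate to within a few percent.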
9.4.2 The Intermediate Regime: γ = O(1)
• Consider now the problem of escape from a potential well for the Langevin equation

\ddot{q} = -\partial_q V(q) - \gamma \dot{q} + \sqrt{2 \gamma \beta^{-1}} \, \dot{W}. \qquad (9.14)

• The reaction rate depends on the friction coefficient and the temperature. In the overdamped limit (\gamma \gg 1) we retrieve (??), appropriately rescaled with \gamma:

\kappa = \frac{\omega_0 \omega_b}{2\pi\gamma} \exp(-\beta E_b). \qquad (9.15)

• We can also obtain a formula for the reaction rate for \gamma = O(1):

\kappa = \frac{\sqrt{\frac{\gamma^2}{4} + \omega_b^2} - \frac{\gamma}{2}}{\omega_b} \frac{\omega_0}{2\pi} \exp(-\beta E_b). \qquad (9.16)

• Naturally, in the limit as \gamma \to +\infty, (9.16) reduces to (9.15).
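This limit is a one-line check: for large \gamma, \sqrt{\gamma^2/4 + \omega_b^2} - \gamma/2 \approx \omega_b^2/\gamma. A small illustrative verification (the default parameter values are ours):

```python
import numpy as np

def kappa_intermediate(gamma, omega0=np.sqrt(2.0), omegab=1.0, beta=4.0, Eb=0.25):
    # eqn (9.16), valid for gamma = O(1)
    prefactor = (np.sqrt(gamma**2 / 4.0 + omegab**2) - gamma / 2.0) / omegab
    return prefactor * omega0 / (2.0 * np.pi) * np.exp(-beta * Eb)

def kappa_overdamped(gamma, omega0=np.sqrt(2.0), omegab=1.0, beta=4.0, Eb=0.25):
    # eqn (9.15), the large-friction limit
    return omega0 * omegab / (2.0 * np.pi * gamma) * np.exp(-beta * Eb)
```

The ratio of the two rates tends to 1 as \gamma grows, and the rate itself decreases monotonically with friction in this regime.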
9.4.3 Calculation of the Reaction Rate in the energydiffusionlimited regime
In order to calculate the reaction rate in the underdamped or energy-diffusion-limited regime \gamma \ll 1 we need to study the diffusion process for the energy, (8.69) or (8.70). The result is

\kappa = \gamma \beta I(E_b) \frac{\omega_0}{2\pi} e^{-\beta E_b}, \qquad (9.17)

where I(E_b) denotes the action evaluated at b.
9.5 Discussion and Bibliography
The calculation of reaction rates and the stochastic modeling of chemical reactions has been a very active area of research since the 1930s. One of the first methods that was developed was transition state theory. Kramers developed his theory in his celebrated paper [49]. In this chapter we have based our approach on the calculation of the mean first passage time. Our analysis is based mostly on [35, Ch. 5, Ch. 9], [96, Ch. 4] and the excellent review article [41]. We highly recommend this review article for further information on reaction rate theory. See also [40] and the review article of Melnikov (1991). A formula for the escape rate which is valid for all values of the friction coefficient was obtained by Melnikov and Meshkov in 1986, J. Chem. Phys. 85(2), 1018–1027. This formula requires the calculation of integrals and it reduces to (9.15) and (9.17) in the overdamped and underdamped limits, respectively.
There are many applications of interest where it is important to calculate reaction rates for non-Markovian Langevin equations of the form

\ddot{x} = -V'(x) - \int_0^t \gamma(t - s) \dot{x}(s) \, ds + \xi(t), \qquad (9.18a)

\langle \xi(t) \xi(0) \rangle = k_B T M^{-1} \gamma(t). \qquad (9.18b)

We will derive generalized non-Markovian equations of the form (9.18a), together with the fluctuation–dissipation theorem (11.10), in Chapter 11. The calculation of reaction rates for the generalized Langevin equation is presented in [40].
The long time/small temperature asymptotics can be studied rigorously by means of the theory of Freidlin–Wentzell [29]. See also [6]. A related issue is that of the small temperature asymptotics for the eigenvalues (in particular, the first eigenvalue) of the generator of the Markov process x(t), which is the solution of

\gamma \dot{x} = -\nabla V(x) + \sqrt{2 \gamma k_B T} \, \dot{W}.

The theory of Freidlin and Wentzell has also been extended to infinite dimensional problems. This is a very important problem in many applications such as micromagnetics... We refer to CITE... for more details.
A systematic study of the problem of the escape from a potential well was developed by Matkowsky, Schuss and collaborators [86, 63, 64]. This approach is based on a systematic use of singular perturbation theory. In particular, the calculation of the transition rate which is uniformly valid in the friction coefficient is presented in [64]. This formula is obtained through a careful analysis of the PDE

p \partial_q \tau - \partial_q V \partial_p \tau + \gamma \left(-p \partial_p + k_B T \partial_p^2 \right) \tau = -1

for the mean first passage time \tau. The PDE is equipped, of course, with the appropriate boundary conditions. Singular perturbation theory is used to study the small temperature asymptotics of solutions to the boundary value problem. The formula derived in this paper reduces to the formulas which are valid at large and small values of the friction coefficient in the appropriate asymptotic limits.
The study of rare transition events between long-lived metastable states is a key feature in many systems in physics, chemistry and biology. Rare transition events play an important role, for example, in the analysis of the transition between different conformation states of biological macromolecules such as DNA [87]. The study of rare events is one of the most active research areas in applied stochastic processes. Recent developments in this area involve the transition path theory of W. E and Vanden-Eijnden. Various simple applications of this theory are presented in Metzner, Schütte et al. (2006). As in the mean first passage time approach, transition path theory is also based on the solution of an appropriate boundary value problem for the so-called committor function.
9.6 Exercises
Chapter 10
Stochastic Resonance and Brownian Motors
10.1 Introduction
10.2 Stochastic Resonance
10.3 Brownian Motors
10.4 Introduction
Particle transport in spatially periodic, noisy systems has attracted considerable attention over the last decades; see e.g. [82, Ch. 11], [78] and the references therein. There are various physical systems where Brownian motion in periodic potentials plays a prominent role, such as Josephson junctions [3], surface diffusion [52, 84] and superionic conductors [33]. When the system of a Brownian particle in a periodic potential is kept away from equilibrium by an external, deterministic or random, force, detailed balance does not hold. Consequently, and in the absence of any spatial symmetry, a net particle current will appear, without any violation of the second law of thermodynamics. It was this fundamental observation [60] that led to a revival of interest in the problem of particle transport in periodic potentials with broken spatial symmetry. These types of non-equilibrium systems, which are often called Brownian motors or ratchets, have found new and exciting applications, e.g. as the basis of theoretical models for various intracellular transport processes such as molecular motors [9]. Furthermore, various experimental methods for particle separation have been suggested which are based on the theory of Brownian motors [7].
The long time behavior of a Brownian particle in a periodic potential is determined uniquely by the effective drift and the effective diffusion tensor, which are defined, respectively, as

U_{\mathrm{eff}} = \lim_{t \to \infty} \frac{\langle x(t) - x(0) \rangle}{t} \qquad (10.1)

and

D_{\mathrm{eff}} = \lim_{t \to \infty} \frac{\langle (x(t) - \langle x(t) \rangle) \otimes (x(t) - \langle x(t) \rangle) \rangle}{2t}. \qquad (10.2)

Here x(t) denotes the particle position, \langle \cdot \rangle denotes ensemble average and \otimes stands for the tensor product. Indeed, an argument based on the central limit theorem [5, Ch. 3], [47] implies that at long times the particle performs an effective Brownian motion, which is a Gaussian process, and hence the first two moments are sufficient to determine the process uniquely. The main goal of all theoretical investigations of noisy, non-equilibrium particle transport is the calculation of (10.1) and (10.2). One wishes, in particular, to analyze the dependence of these two quantities on the various parameters of the problem, such as the friction coefficient, the temperature and the particle mass.
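Definition (10.1) translates directly into a Monte Carlo estimator, which can be checked against the closed-form drift (8.135) of Section 8.6.1. The sketch below is purely illustrative: the cosine potential, tilt, seed and tolerances are our choices, and a long trajectory is needed to beat the O(\sqrt{D_{\mathrm{eff}}/t}) statistical error:

```python
import numpy as np
from scipy import integrate

D, F, L_per = 0.5, 2.0, 2.0 * np.pi     # illustrative parameters; V(x) = cos(x)
rng = np.random.default_rng(1)
dt, nsteps = 1e-3, 1_000_000

noise = np.sqrt(2.0 * D * dt) * rng.standard_normal(nsteps)
x = 0.0
for n in range(nsteps):
    # Euler-Maruyama for dx = (-V'(x) + F) dt + sqrt(2 D) dW, with -V'(x) = sin(x)
    x += (np.sin(x) + F) * dt + noise[n]

U_mc = x / (nsteps * dt)                # estimator of the effective drift (10.1)
```

Estimating (10.2) the same way requires an ensemble of trajectories rather than a single one, since it involves the variance of x(t).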
Enormous theoretical effort has been put into the study of Brownian ratchets and, more gen
erally, of Brownian particles in spatially periodic potentials [78]. The vast majority of all these
theoretical investigations is concerned with the calculation of the effective drift for one dimen
sional models. This is not surprising, since the theoretical tools that are currently available are
not sufﬁcient for the analytical treatment of the multi–dimensional problem. This is only possible
when the potential and/or noise are such that the problem can be reduced to a one dimensional one
[19]. For more general multi–dimensional problems one has to resort to numerical simulations.
There are various applications, however, where the one dimensional analysis is inadequate. As an
example we mention the technique for separation of macromolecules in microfabricated sieves that
was proposed in [14]. In the two-dimensional setting considered in that paper, an appropriately
chosen driving force in the y direction produces a constant drift in the x direction, but with a zero
net velocity in the y direction. On the other hand, a force in the x direction produces no drift in the
y direction. The theoretical analysis of this problem requires new technical tools.
Furthermore, theoretical studies related to the calculation of the effective diffusion
tensor have also been scarce [34, 55, 79, 80, 85]. In these papers, relatively simple potentials
and/or forcing terms are considered, such as tilted periodic potentials or simple time-periodic
forcing. It is widely recognized that the calculation of the effective diffusion coefficient is technically
more demanding than that of the effective drift. Indeed, as we will show in this paper, it
requires the solution of a Poisson equation, in addition to the solution of the stationary Fokker–
Planck equation which is sufﬁcient for the calculation of the effective drift. Diffusive, rather than
directed, transport can be potentially extremely important in the design of experimental setups for
particle selection [78, Sec 5.11] [85]. It is therefore desirable to develop systematic tools for the
calculation of the effective diffusion coefﬁcient (or tensor, in the multi–dimensional setting).
From a mathematical point of view, non–equilibrium systems which are subject to unbiased
noise can be modelled as non–reversible Markov processes [76] and can be expressed in terms
of solutions to stochastic differential equations (SDEs). The SDEs which govern the motion of
a Brownian particle in a periodic potential possess inherent length and time scales: those related
to the spatial period of the potential and the temporal period (or correlation time) of the external
driving force. From this point of view the calculation of the effective drift and the effective diffusion
coefficient amounts to studying the behavior of solutions to the underlying SDEs at length
and time scales which are much longer than the characteristic scales of the system. A systematic
methodology for studying problems of this type, based on scale separation, was developed
many years ago [5, ?, ?]. The techniques developed in the aforementioned references are
appropriate for the asymptotic analysis of stochastic systems (and Markov processes in particular)
which are spatially and/or temporally periodic. The purpose of this work is to apply these multiscale
techniques to the study of Brownian motors in arbitrary dimensions, with particular emphasis
on the calculation of the effective diffusion tensor.
The rest of this paper is organized as follows. In section 10.5 we introduce the model that we
will study. In section 10.6 we obtain formulae for the effective drift and the effective diffusion
tensor in the case where all external forces are Markov processes. In section 10.7 we study the
effective diffusion coefﬁcient for a Brownian particle in a periodic potential driven simultaneously
by additive Gaussian white and colored noise. Section ?? is reserved for conclusions. In Appendix
A we derive formulae for the effective drift and the effective diffusion coefficient for the case where
the Brownian particle is driven away from equilibrium by time-periodic external fluctuations.
Finally, in Appendix B we use the method developed in this paper to calculate the effective diffusion
coefficient of an overdamped particle in a one dimensional tilted periodic potential.
10.5 The Model
We consider the overdamped d-dimensional stochastic dynamics for a state variable x(t) ∈ R^d [78, sec. 3]:

γ ẋ(t) = −∇V(x(t), f(t)) + y(t) + √(2γk_B T) ξ(t),   (10.3)
where γ is the friction coefficient, k_B the Boltzmann constant and T denotes the temperature. ξ(t)
stands for the standard d-dimensional white noise process, i.e.

⟨ξ_i(t)⟩ = 0 and ⟨ξ_i(t) ξ_j(s)⟩ = δ_ij δ(t − s), i, j = 1, . . . , d.
We take f(t) and y(t) to be Markov processes with respective state spaces E_f, E_y and generators
L_f, L_y. The potential V(x, f) is periodic in x for every f, with period L in all spatial directions:

V(x + L ê_i, f) = V(x, f), i = 1, . . . , d,

where {ê_i}_{i=1}^d denotes the standard basis of R^d. We will use the notation Q = [0, L]^d.
The processes f(t) and y(t) can be continuous-time diffusion processes constructed
as solutions of stochastic differential equations, dichotomous noise [42, Ch. 9], more
general Markov chains, etc. The (easier) case where f(t) and y(t) are deterministic, periodic functions
of time is treated in the appendix.
For simplicity, we have assumed that the temperature in (10.3) is constant. This
assumption involves no loss of generality, since eqn. (10.3) with a time dependent temperature can
be mapped to an equation with constant temperature and an appropriate effective potential [78, sec.
6]. Thus, the above framework is general enough to encompass most of the models that have been
studied in the literature, such as pulsating, tilting, or temperature ratchets. We remark that the state
variable x(t) does not necessarily denote the position of a Brownian particle. We will, however,
refer to x(t) as the particle position in the sequel.
The process {x(t), f(t), y(t)} in the extended phase space R^d × E_f × E_y is Markovian with
generator

L = F(x, f, y) · ∇_x + D Δ_x + L_f + L_y,

where D := k_B T / γ and

F(x, f, y) = (1/γ) (−∇V(x, f) + y).
To this process we can associate the initial value problem for the backward Kolmogorov equation
[69, Ch. 8]

∂u/∂t = L u,  u(x, y, f, t = 0) = u_in(x, y, f),   (10.4)

which is, of course, the adjoint of the Fokker–Planck equation. Our derivation of formulae for the
effective drift and the effective diffusion tensor is based on singular perturbation analysis of the
initial value problem (10.4).
10.6 Multiscale Analysis
In this section we derive formulae for the effective drift and the effective diffusion tensor for x(t),
the solution of (10.3). Let us outline the basic philosophy behind the derivation of formulae (10.13)
and (10.18). We are interested in the long time, large scale behavior of x(t). For the analysis that
follows it is convenient to introduce a parameter ε ≪ 1 which in effect is the ratio between the
length scale defined through the period of the potential and a large "macroscopic" length scale at
which the motion of the particle is governed by an effective Brownian motion. The limit ε → 0
corresponds to the limit of infinite scale separation. The behavior of the system in this limit can be
analysed using singular perturbation theory.
We remark that the calculations of the effective drift and of the effective diffusion tensor are
performed separately, because a different rescaling is needed in each case. This is due to the fact that
advection and diffusion have different characteristic time scales.
10.6.1 Calculation of the Effective Drift
The backward Kolmogorov equation reads

∂u(x, y, f, t)/∂t = (F(x, f, y) · ∇_x + D Δ_x + L_f + L_y) u(x, y, f, t).   (10.5)
We rescale space and time in (10.5) according to

x → εx,  t → εt

and divide through by ε to obtain

∂u^ε/∂t = (1/ε) ( F(x/ε, f, y) · ∇_x + εD Δ_x + L_f + L_y ) u^ε.   (10.6)
We solve (10.6) perturbatively by looking for a solution in the form of a two-scale expansion

u^ε(x, f, y, t) = u_0(x, x/ε, f, y, t) + ε u_1(x, x/ε, f, y, t) + ε² u_2(x, x/ε, f, y, t) + · · · .   (10.7)
All terms in the expansion (10.7) are periodic functions of z = x/ε. From the chain rule we have

∇_x → ∇_x + (1/ε) ∇_z.   (10.8)
Notice that we do not take the terms in the expansion (10.7) to depend explicitly on t/ε. This is
because the coefficients of the backward Kolmogorov equation (10.6) do not depend explicitly on
the fast time t/ε. In the case where the fluctuations are periodic, rather than Markovian, in time,
we will need to assume that the terms in the multiscale expansion for u^ε(x, t) depend explicitly on
t/ε. The details are presented in the appendix.
We now substitute (10.7) into (10.5), use (10.8) and treat x and z as independent variables.
Upon equating the coefficients of equal powers in ε we obtain the following sequence of equations:

L_0 u_0 = 0,   (10.9)
L_0 u_1 = −L_1 u_0 + ∂u_0/∂t,   (10.10)
. . . = . . . ,

where

L_0 = F(z, f, y) · ∇_z + D Δ_z + L_y + L_f   (10.11)

and

L_1 = F(z, f, y) · ∇_x + 2D ∇_z · ∇_x.
The operator L_0 is the generator of a Markov process on Q × E_y × E_f. In order to proceed
we need to assume that this process is ergodic: there exists a unique stationary solution of the
Fokker–Planck equation

L_0^* ρ(z, y, f) = 0,   (10.12)

with

∫_{Q×E_y×E_f} ρ(z, y, f) dz dy df = 1

and

L_0^* ρ = −∇_z · (F(z, f, y) ρ) + D Δ_z ρ + L_y^* ρ + L_f^* ρ.
In the above, L_f^* and L_y^* are the Fokker–Planck operators of f and y, respectively. The stationary
density ρ(z, y, f) satisfies periodic boundary conditions in z and appropriate boundary conditions
in f and y. We emphasize that the ergodicity of the "fast" process is necessary for the very existence
of an effective drift and an effective diffusion coefficient, and it has been tacitly assumed in
all theoretical investigations concerning Brownian motors [78].
Under the assumption that (10.12) has a unique solution, eqn. (10.9) implies, by the Fredholm
alternative, that u_0 is independent of the fast scales:

u_0 = u(x, t).
Eqn. (10.10) now becomes

L_0 u_1 = ∂u(x, t)/∂t − F(z, y, f) · ∇_x u(x, t).

In order for this equation to be well posed it is necessary that the right hand side averages to 0
with respect to the invariant distribution ρ(z, f, y). This leads to the following backward Liouville
equation

∂u(x, t)/∂t = U_eff · ∇_x u(x, t),

with the effective drift given by

U_eff = ∫_{Q×E_y×E_f} F(z, y, f) ρ(z, y, f) dz dy df
      = (1/γ) ∫_{Q×E_y×E_f} (−∇V(z, f) + y) ρ(z, y, f) dz dy df.   (10.13)
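In the simplest equilibrium setting (y ≡ 0, static potential, one dimension) the stationary density is ρ(z) ∝ e^{−V(z)/D}, and (10.13) gives U_eff = 0, since −V′ e^{−V/D} = D (e^{−V/D})′ integrates to zero over a period. A quadrature check of this fact, for an assumed cosine potential and assumed parameter values:

```python
import numpy as np

# Equilibrium check of (10.13): with y = 0 and rho(z) ∝ exp(-V(z)/D),
# U_eff = (1/gamma) ∫ (-V'(z)) rho(z) dz vanishes by periodicity.
D, gamma, L, n = 0.7, 1.0, 2 * np.pi, 4096
z = np.arange(n) * L / n                 # periodic grid, endpoint excluded
V, dV = np.cos(z), -np.sin(z)            # assumed example potential and V'
rho = np.exp(-V / D)
rho /= rho.mean() * L                    # normalize: ∫ rho dz = 1
U_eff = (-dV * rho).mean() * L / gamma
print(U_eff)                             # ≈ 0 up to rounding
```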
10.6.2 Calculation of the Effective Diffusion Coefﬁcient
We assume for the moment that the effective drift vanishes, U_eff = 0. We perform a diffusive
rescaling in (10.5),

x → εx,  t → ε²t,

and divide through by ε² to obtain

∂u^ε/∂t = (1/ε²) ( F(x/ε, f, y) · ∇_x + εD Δ_x + L_f + L_y ) u^ε.   (10.14)
We go through the same analysis as in the previous subsection to obtain the following sequence of
equations:

L_0 u_0 = 0,   (10.15)
L_0 u_1 = −L_1 u_0,   (10.16)
L_0 u_2 = −L_1 u_1 − L_2 u_0,   (10.17)
. . . = . . . ,

where L_0 and L_1 were defined in the previous subsection and

L_2 = −∂/∂t + D Δ_x.
Equation (10.15) implies that u_0 = u(x, t). Now (10.16) becomes

L_0 u_1 = −F(z, y, f) · ∇_x u(x, t).

Since we have assumed that U_eff = 0, the right hand side of the above equation is orthogonal to the
null space of L_0^* and this equation is well posed. Its solution is

u_1(x, z, f, y, t) = χ(z, y, f) · ∇_x u(x, t),

where the auxiliary field χ(z, y, f) satisfies the Poisson equation

−L_0 χ(z, y, f) = F(z, y, f)
with periodic boundary conditions in z and appropriate boundary conditions in y and f.
We proceed now with the analysis of equation (10.17). The solvability condition for this equation
reads

∫_{Q×E_y×E_f} (−L_1 u_1 − L_2 u_0) ρ dz dy df = 0,

from which, after some straightforward algebra, we obtain the limiting backward Kolmogorov
equation for u(x, t):

∂u(x, t)/∂t = Σ_{i,j=1}^d D^eff_ij ∂²u(x, t)/∂x_i ∂x_j.
The effective diffusion tensor is

D^eff_ij = D δ_ij + ⟨F_i(z, y, f) χ_j(z, y, f)⟩_ρ + 2D ⟨∂χ_i(z, y, f)/∂z_j⟩_ρ,   (10.18)
where the notation ⟨·⟩_ρ for the averaging with respect to the invariant density has been introduced.
The case where the effective drift does not vanish, U_eff ≠ 0, can be reduced to the situation
analyzed in this subsection through a Galilean transformation with respect to U_eff.¹ The effective
diffusion tensor is now given by

D^eff_ij = D δ_ij + ⟨(F_i(z, y, f) − U^eff_i) χ_j(z, y, f)⟩_ρ + 2D ⟨∂χ_i(z, y, f)/∂z_j⟩_ρ,   (10.19)

and the field χ(z, f, y) satisfies the Poisson equation

−L_0 χ = F(z, y, f) − U_eff.   (10.20)
10.7 Effective Diffusion Coefﬁcient for Correlation Ratchets
In this section we consider the following model [4, 17]:

γ ẋ(t) = −∇V(x(t)) + y(t) + √(2γk_B T) ξ(t),   (10.21a)
ẏ(t) = −(1/τ) y(t) + √(2σ/τ) ζ(t),   (10.21b)

where ξ(t) and ζ(t) are mutually independent standard d-dimensional white noise processes. The
potential V(x) is assumed to be L-periodic in all spatial directions. The process y(t) is the d-dimensional
Ornstein–Uhlenbeck (OU) process [35], a mean zero Gaussian process with
correlation function

⟨y_i(t) y_j(s)⟩ = δ_ij σ e^{−|t−s|/τ},  i, j = 1, . . . , d.
Let z(t) denote the restriction of x(t) to Q = [0, 2π]^d. The generator of the Markov process
{z(t), y(t)} is

L = (1/γ)(−∇_z V(z) + y) · ∇_z + D Δ_z + (1/τ)(−y · ∇_y + σ Δ_y),

with D := k_B T/γ. Standard results from the ergodic theory of Markov processes (see e.g. [5, Ch.
3]) ensure that the process {z(t), y(t)} ∈ Q × R^d with generator L is ergodic and that the unique
(¹ In other words, the process x^ε(t) := ε ( x(t/ε²) − ε^{−2} U_eff t ) converges to a mean zero Gaussian process with
effective diffusivity given by (10.19).)
invariant measure has a smooth density ρ(y, z) with respect to the Lebesgue measure. This is true
even at zero temperature [?, 65]. Hence, the results of section 10.6 apply: the effective drift and
effective diffusion tensor are given by formulae (10.13) and (10.18), respectively. Of course, in
order to calculate these quantities we need to solve equations (10.12) and (10.20) which take the
form:
−
1
γ
∇
z
((−∇
z
V (z) + y)ρ(y, z)) + D∆
z
ρ(y, z) +
1
τ
_
∇
y
(yρ(y, z))
+σ∆
y
ρ(y, z)
_
= 0
and
−
1
γ
(−∇
z
V (z) + y) ∇
z
χ(y, z) −D∆
z
χ(y, z)
−
1
τ
_
−y ∇
y
χ(y, z) + σ∆
y
χ(y, z)
_
=
1
γ
(−∇
z
V (z) + y) −U.
The effective diffusion tensor is non-negative definite. To prove this, let e be a unit vector in R^d, define
f = F · e, u = U_eff · e and let φ := e · χ denote the unique solution of the scalar problem

−Lφ = (F − U) · e =: f − u,  φ(y, z + L ê_i) = φ(y, z),  ⟨φ⟩_ρ = 0.

Let now h(y, z) be a sufficiently smooth function. Elementary computations yield

L^*(hρ) = −ρ Lh + 2D ∇_z · (ρ ∇_z h) + (2σ/τ) ∇_y · (ρ ∇_y h).

We use the above calculation in the formula for the effective diffusion tensor, together with an
integration by parts and the fact that ⟨φ(y, z)⟩_ρ = 0, to obtain

e · D_eff e = D + ⟨f φ⟩_ρ + 2D ⟨e · ∇_z φ⟩_ρ
           = D + ⟨u φ⟩_ρ − ⟨φ Lφ⟩_ρ + 2D ⟨e · ∇_z φ⟩_ρ
           = D + D ⟨|∇_z φ|²⟩_ρ + 2D ⟨e · ∇_z φ⟩_ρ + (σ/τ) ⟨|∇_y φ|²⟩_ρ
           = D ⟨|e + ∇_z φ|²⟩_ρ + (σ/τ) ⟨|∇_y φ|²⟩_ρ.

From the above formula we see that the effective diffusion tensor is non-negative definite and that
it is well defined even at zero temperature:

e · D_eff(T = 0) e = (σ/τ) ⟨|∇_y φ(T = 0)|²⟩_ρ.
Although we cannot solve these equations in closed form, it is possible to calculate the small-τ
expansion of the effective drift and the effective diffusion coefficient, at least in one dimension.
Indeed, a tedious calculation using singular perturbation theory, e.g. [42, ?], yields

U_eff = O(τ³)   (10.22)

and

D_eff = (L² / (Z Ẑ)) [ D + τσ ( 1 + (1/(γD²)) ( Z₂/Ẑ − Z₁/Z ) ) ] + O(τ²).   (10.23)

In writing eqn. (10.23) we have used the following notation:

Z = ∫₀^L e^{−V(z)/D} dz,   Ẑ = ∫₀^L e^{V(z)/D} dz,
Z₁ = ∫₀^L V(z) e^{−V(z)/D} dz,   Z₂ = ∫₀^L V(z) e^{V(z)/D} dz.
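The quadratures entering (10.23) are elementary to evaluate numerically. A sketch for the cosine potential used in the numerical experiments of this section, with parameter values assumed for illustration:

```python
import numpy as np

D, gamma, sigma, tau = 1.0, 1.0, 1.0, 0.05   # assumed parameter values
L, n = 2 * np.pi, 4096
z = np.arange(n) * L / n                     # periodic grid
V = np.cos(z)

def per(g):                                  # integral over one period
    return g.mean() * L

Z, Zhat = per(np.exp(-V / D)), per(np.exp(V / D))
Z1, Z2 = per(V * np.exp(-V / D)), per(V * np.exp(V / D))

# Leading-order small-tau formula (10.23), dropping the O(tau^2) remainder.
D_eff = (L**2 / (Z * Zhat)) * (
    D + tau * sigma * (1 + (Z2 / Zhat - Z1 / Z) / (gamma * D**2)))
print(Z, Zhat, Z1, Z2, D_eff)
```

For V(z) = cos(z) the symmetries Z = Ẑ and Z₁ = −Z₂ (shift z → z + π) provide a quick sanity check on the quadratures.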
It is relatively straightforward to obtain the next order correction to (10.23); the resulting formula
is, however, too complicated to be of much use.
The small-τ asymptotics for the effective drift were also studied in [4, 17] for the model considered
in this section, and in [16, 92] when the external fluctuations are given by a continuous-time
Markov chain. It was shown in [16, 92] that, for the case of dichotomous noise, the small-τ
expansion for U_eff is valid only for sufficiently smooth potentials. Indeed, the first non-zero
term, of order O(τ³), involves the second derivative of the potential. Non-smooth potentials lead
to an effective drift which is O(τ^{5/2}). On the contrary, eqn. (10.23) does not involve any derivatives
of the potential and, hence, is well defined even for non-smooth potentials. On the other
hand, the O(τ²) term involves third order derivatives of the potential and can be defined only when
V(x) ∈ C³(0, L).
We also remark that the expansion (10.23) is only valid for positive temperatures. The problem
becomes substantially more complicated at zero temperature, because the generator of the Markov
process becomes a degenerate differential operator at T = 0.
Naturally, in the limit as τ → 0 the effective diffusion coefficient converges to its value for
y ≡ 0:

D_eff = L² D / (Z Ẑ).   (10.24)
Figure 10.1: Effective diffusivity for (10.21) with V(x) = cos(x) as a function of τ, for σ = 1,
D = k_B T/γ = 1, γ = 1. Solid line: results from Monte Carlo simulations. Dashed line: results
from formula (10.23).
diffusion coefficient given by (10.24) is bounded from above by D. This is not the case for the
effective diffusivity of the correlation ratchet (10.21).
We compare now the small-τ asymptotics for the effective diffusion coefficient with Monte
Carlo simulations. The results presented in figures 10.1 and 10.2 were obtained from the numerical
solution of equations (10.21) using the Euler–Maruyama method, for the cosine potential V(x) =
cos(x). The integration step that was used was Δt = 10⁻⁴ and the total number of integration
steps was 10⁷. The effective diffusion coefficient was calculated by ensemble averaging over 2000
particle trajectories which were initially uniformly distributed on [0, 2π].
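A scaled-down version of this Monte Carlo experiment can be sketched as follows; the step count and ensemble size are deliberately much smaller than those quoted above, so the result is a qualitative illustration only:

```python
import numpy as np

# Euler-Maruyama for the 1D correlation ratchet (10.21) with V(x) = cos(x);
# y is an OU process with correlation time tau and strength sigma.
gamma, kBT, sigma, tau = 1.0, 1.0, 1.0, 0.1
D = kBT / gamma
dt, nsteps, npart = 1e-3, 50000, 200        # much coarser than in the text
rng = np.random.default_rng(1)

x = rng.uniform(0.0, 2 * np.pi, npart)
y = np.sqrt(sigma) * rng.standard_normal(npart)   # stationary OU initial data
x0 = x.copy()
for _ in range(nsteps):
    xi, zeta = rng.standard_normal((2, npart))
    x += (np.sin(x) + y) / gamma * dt + np.sqrt(2 * D * dt) * xi
    y += -y / tau * dt + np.sqrt(2 * sigma / tau * dt) * zeta

t = nsteps * dt
D_eff = np.var(x - x0) / (2 * t)            # ensemble estimator of (10.2)
print(D_eff)
```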
In figure 10.1 we present the effective diffusion coefficient as a function of the correlation time τ
of the OU process. We also plot the results of the small-τ asymptotics. The agreement between
the theoretical predictions and the numerical results is quite satisfactory for τ ≪ 1. We also observe that
the effective diffusivity is an increasing function of τ.
In figure 10.2 we plot the effective diffusivity as a function of the noise strength σ of the OU
process. As expected, the effective diffusivity is an increasing function of σ. The agreement
between the theoretical predictions from (10.23) and the numerical experiments is excellent.
Figure 10.2: Effective diffusivity for (10.21) with V(x) = cos(x) as a function of σ, for τ = 0.1,
D = k_B T/γ = 1, γ = 1. Solid line: results from Monte Carlo simulations. Dashed line: results
from formula (10.23).
10.8 Discussion and Bibliography
10.9 Exercises
1. Derive formulae for the mean drift and the effective diffusion coefficient for
a Brownian particle which moves according to

γ ẋ(t) = −∇V(x(t), t) + y(t) + √(2γ k_B T(x(t), t)) ξ(t),   (10.25)

for a space–time periodic potential V(x, t) and temperature T(x, t) > 0, and a periodic in time
force y(t). We take the spatial period to be L in all directions and the temporal period of
V(x, t), T(x, t) and y(t) to be 𝒯. We use the notation Q = [0, L]^d. Equation (10.25) is interpreted
in the Itô sense.
Chapter 11
Stochastic Processes and Statistical
Mechanics
11.1 Introduction
We will consider some simple "particle + environment" systems for which we can obtain rigorously
a stochastic equation that describes the dynamics of the Brownian particle.
We describe the dynamics of the Brownian particle/fluid system through a Hamiltonian:

H(Q_N, P_N; q, p) = H_BP(Q_N, P_N) + H_HB(q, p) + H_I(Q_N, q),   (11.1)
where {q, p} := ( {q_j}_{j=1}^N, {p_j}_{j=1}^N ) are the positions and momenta of the fluid particles and N is the
number of fluid (heat bath) particles (we will need to take the thermodynamic limit N → +∞).
The initial conditions of the Brownian particle are taken to be fixed, whereas the fluid is assumed
to be initially in equilibrium (Gibbs distribution). Goal: eliminate the fluid variables {q, p}
to obtain a closed equation for the Brownian particle. We will see that this
equation is a stochastic integrodifferential equation, the Generalized Langevin Equation (GLE)
(in the limit as N → +∞):

Q̈ = −V′(Q) − ∫₀^t R(t − s) Q̇(s) ds + F(t),   (11.2)

where R(t) is the memory kernel and F(t) is the noise. We will also see that, in some appropriate
limit, we can derive the Markovian Langevin equation (9.11).
11.2 The Kac–Zwanzig Model
We need to model the interaction between the heat bath particles and the coupling between the Brownian
particle and the heat bath. The simplest model is that of a harmonic heat bath with linear
coupling:

H(Q_N, P_N, q, p) = P_N²/2 + V(Q_N) + Σ_{n=1}^N [ p_n²/(2m_n) + (1/2) k_n (q_n − λQ_N)² ].   (11.3)
The initial conditions of the Brownian particle, {Q_N(0), P_N(0)} := {Q_0, P_0}, are taken to be
deterministic.
The initial conditions of the heat bath particles are distributed according to the Gibbs distribution,
conditional on the knowledge of {Q_0, P_0}:

μ_β(dp dq) = Z⁻¹ e^{−βH(q,p)} dq dp,   (11.4)

where β is the inverse temperature. This is a way of introducing the concept of temperature into
the system (through the average kinetic energy of the bath particles). In order to choose the initial
conditions according to μ_β(dp dq) we can take

q_n(0) = λQ_0 + √(β⁻¹ k_n⁻¹) ξ_n,  p_n(0) = √(m_n β⁻¹) η_n,   (11.5)
where the ξ_n, η_n are mutually independent sequences of i.i.d. 𝒩(0, 1) random variables. Notice that
we actually consider the Gibbs measure of an effective (renormalized) Hamiltonian. Other choices
for the initial conditions are possible. For example, we can take q_n(0) = √(β⁻¹ k_n⁻¹) ξ_n. Our choice
of initial conditions ensures that the forcing term in the GLE that we will derive is mean zero (see below).
Hamilton's equations of motion are:

Q̈_N + V′(Q_N) = Σ_{n=1}^N k_n (λ q_n − λ² Q_N),   (11.6a)
q̈_n + ω_n² (q_n − λ Q_N) = 0,  n = 1, . . . , N,   (11.6b)

where ω_n² = k_n/m_n. The equations for the heat bath particles are second order linear inhomogeneous
equations with constant coefficients. Our plan is to solve them and then to substitute the
result in the equations of motion for the Brownian particle. We can solve the equations of motion
for the heat bath variables using the variation of constants formula

q_n(t) = q_n(0) cos(ω_n t) + (p_n(0)/(m_n ω_n)) sin(ω_n t) + ω_n λ ∫₀^t sin(ω_n(t − s)) Q_N(s) ds.
An integration by parts yields

q_n(t) = q_n(0) cos(ω_n t) + (p_n(0)/(m_n ω_n)) sin(ω_n t) + λ Q_N(t)
         − λ Q_N(0) cos(ω_n t) − λ ∫₀^t cos(ω_n(t − s)) Q̇_N(s) ds.
We substitute this in equation (11.6) and use the initial conditions (11.5) to obtain the Generalized
Langevin Equation

Q̈_N = −V′(Q_N) − λ² ∫₀^t R_N(t − s) Q̇_N(s) ds + λ F_N(t),   (11.7)
where the memory kernel is

R_N(t) = Σ_{n=1}^N k_n cos(ω_n t)   (11.8)

and the noise process is

F_N(t) = Σ_{n=1}^N [ k_n (q_n(0) − λQ_0) cos(ω_n t) + (k_n p_n(0)/(m_n ω_n)) sin(ω_n t) ]
       = √(β⁻¹) Σ_{n=1}^N √(k_n) ( ξ_n cos(ω_n t) + η_n sin(ω_n t) ).   (11.9)
Remarks 11.2.1. i. The noise and dissipation terms are related through the fluctuation–dissipation
theorem:

⟨F_N(t) F_N(s)⟩ = β⁻¹ Σ_{n=1}^N k_n [ cos(ω_n t) cos(ω_n s) + sin(ω_n t) sin(ω_n s) ] = β⁻¹ R_N(t − s).   (11.10)

ii. The noise F(t) is a mean zero Gaussian process.

iii. The choice of the initial conditions (11.5) for q, p is crucial for the form of the GLE and, in
particular, for the fluctuation–dissipation theorem (11.10) to be valid.

iv. The parameter λ measures the strength of the coupling between the Brownian particle and
the heat bath.

v. By choosing the frequencies ω_n and spring constants k_n of the heat bath particles appropriately
we can pass to the limit as N → +∞ and obtain the GLE with different memory
kernels R(t) and noise processes F(t).
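The identity (11.10) can be verified directly for any finite bath, since the covariance of (11.9) collapses to β⁻¹ R_N(t − s) by the cosine addition formula. A check with randomly drawn (assumed, purely illustrative) spring constants and frequencies:

```python
import numpy as np

rng = np.random.default_rng(2)
N, beta = 50, 2.0
k = rng.uniform(0.5, 2.0, N)            # spring constants k_n (arbitrary sample)
w = rng.uniform(0.1, 10.0, N)           # frequencies omega_n (arbitrary sample)

def R_N(t):                             # memory kernel (11.8)
    return np.sum(k * np.cos(w * t))

def cov_F(t, s):                        # exact covariance of F_N from (11.9)
    return np.sum(k * (np.cos(w * t) * np.cos(w * s)
                       + np.sin(w * t) * np.sin(w * s))) / beta

t, s = 1.3, 0.4
print(cov_F(t, s), R_N(t - s) / beta)   # equal up to rounding
```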
Let a ∈ (0, 1), 2b = 1 − a, and set ω_n = N^a ζ_n, where {ζ_n}_{n=1}^∞ are i.i.d. with ζ_1 ∼ 𝒰(0, 1).
Furthermore, we choose the spring constants according to

k_n = f²(ω_n) / N^{2b},

where the function f(ω) decays sufficiently fast at infinity. We can rewrite the dissipation and
noise terms in the form
R_N(t) = Σ_{n=1}^N f²(ω_n) cos(ω_n t) Δω

and

F_N(t) = Σ_{n=1}^N f(ω_n) ( ξ_n cos(ω_n t) + η_n sin(ω_n t) ) √Δω,

where Δω = N^a/N. Using properties of Fourier series with random coefficients/frequencies
and of weak convergence of probability measures we can pass to the limit:

R_N(t) → R(t) in L¹[0, T],

for almost all {ζ_n}_{n=1}^∞, and

F_N(t) → F(t) weakly in C([0, T], R).

The time T > 0 is finite but arbitrary. The limiting kernel and noise satisfy the fluctuation–dissipation
theorem (11.10):

⟨F(t) F(s)⟩ = β⁻¹ R(t − s).   (11.11)
Q_N(t), the solution of (11.7), converges weakly to the solution of the limiting GLE

Q̈ = −V′(Q) − λ² ∫₀^t R(t − s) Q̇(s) ds + λ F(t).   (11.12)
The properties of the limiting dissipation and noise are determined by the function f(ω). As an
example, consider the Lorentzian function

f²(ω) = (2α/π) / (α² + ω²)   (11.13)

with α > 0. Then

R(t) = e^{−α|t|}.

The noise process F(t) is a mean zero stationary Gaussian process with continuous paths and,
from (11.11), exponential correlation function:

⟨F(t) F(s)⟩ = β⁻¹ e^{−α|t−s|}.

Hence, F(t) is the stationary Ornstein–Uhlenbeck process:

dF/dt = −αF + √(2β⁻¹α) dW/dt,   (11.14)

with F(0) ∼ 𝒩(0, β⁻¹). The GLE (11.12) becomes

Q̈ = −V′(Q) − λ² ∫₀^t e^{−α|t−s|} Q̇(s) ds + λ F(t),   (11.15)
where F(t) is the OU process (11.14). Q(t), the solution of the GLE (11.12), is not a Markov
process: the future is not statistically independent of the past when conditioned on the present.
The stochastic process Q(t) has memory. We can turn (11.12) into a Markovian SDE by enlarging
the dimension of the state space, i.e. by introducing auxiliary variables; we might have to introduce
infinitely many variables! For the case of the exponential memory kernel, when the noise is given
by an OU process, it is sufficient to introduce one auxiliary variable. We can rewrite (11.15) as a
system of SDEs:

dQ/dt = P,
dP/dt = −V′(Q) + λZ,   (11.17)
dZ/dt = −αZ − λP + √(2αβ⁻¹) dW/dt,

where Z(0) ∼ 𝒩(0, β⁻¹). The process {Q(t), P(t), Z(t)} ∈ R³ is Markovian.
It is a degenerate Markov process: the noise acts directly on only one of the three degrees of freedom.
We can eliminate the auxiliary process Z by taking an appropriate distinguished limit.
Set λ = √γ ε⁻¹, α = ε⁻². Equations (11.17) become

dQ/dt = P,
dP/dt = −V′(Q) + (√γ/ε) Z,
dZ/dt = −(1/ε²) Z − (√γ/ε) P + √(2β⁻¹/ε²) dW/dt.

We can use tools from singular perturbation theory for Markov processes to show that, in the limit
as ε → 0, we have that

(1/ε) Z → √(2γβ⁻¹) dW/dt − γP.

Thus, in this limit we obtain the Markovian Langevin equation (R(t) = γδ(t))

Q̈ = −V′(Q) − γQ̇ + √(2γβ⁻¹) dW/dt.   (11.18)
11.3 Quasi-Markovian Stochastic Processes
In the previous section we studied the GLE for the case where the memory kernel decays exponentially
fast. We showed that we can represent the GLE as a Markovian process by adding
one additional variable, the solution of a linear SDE. A natural question which arises is whether
it is always possible to turn the GLE into a Markovian system by adding a finite number of additional
variables. This is not always the case. However, there are many applications where the
memory kernel decays sufficiently fast so that we can approximate the GLE by a finite dimensional
Markovian system.
We introduce the concept of a quasi-Markovian stochastic process.

Definition 11.3.1. We will say that a stochastic process X_t is quasi-Markovian if it can be represented
as a Markovian stochastic process by adding a finite number of additional variables: there
exists a stochastic process Y_t so that {X_t, Y_t} is a Markov process.

In many cases the additional variables Y_t can be expressed in terms of solutions to linear SDEs. This is possible,
for example, when the memory kernel consists of a sum of exponential functions, a natural
extension of the case considered in the previous section.
Proposition 11.3.2. Consider the generalized Langevin equation

Q̇ = P,  Ṗ = −V′(Q) − ∫₀^t R(t − s) P(s) ds + F(t)   (11.19)

with a memory kernel of the form

R(t) = Σ_{j=1}^n λ_j² e^{−α_j |t|}   (11.20)

and F(t) a mean zero stationary Gaussian process, where R(t) and F(t) are related
through the fluctuation–dissipation theorem,

⟨F(t) F(s)⟩ = β⁻¹ R(t − s).   (11.21)

Then (11.19) is equivalent to the Markovian SDE

Q̇ = P,  Ṗ = −V′(Q) + Σ_{j=1}^n λ_j u_j,  u̇_j = −α_j u_j − λ_j P + √(2α_j β⁻¹) Ẇ_j,  j = 1, . . . , n,   (11.22)

with u_j(0) ∼ 𝒩(0, β⁻¹) and where the W_j(t) are independent standard one dimensional Brownian motions.
Proof. We solve the equations for u_j:

u_j(t) = −λ_j ∫₀^t e^{−α_j(t−s)} P(s) ds + e^{−α_j t} u_j(0) + √(2α_j β⁻¹) ∫₀^t e^{−α_j(t−s)} dW_j
       =: −∫₀^t R_j(t − s) P(s) ds + η_j(t).

We substitute this into the equation for P to obtain

Ṗ = −V′(Q) + Σ_{j=1}^n λ_j u_j
  = −V′(Q) + Σ_{j=1}^n λ_j ( −∫₀^t R_j(t − s) P(s) ds + η_j(t) )
  = −V′(Q) − ∫₀^t R(t − s) P(s) ds + F(t),

where R(t) is given by (11.20) and the noise process F(t) is

F(t) = Σ_{j=1}^n λ_j η_j(t),
with the η_j(t) being one-dimensional stationary independent OU processes. We readily check that the
fluctuation–dissipation theorem is satisfied:

⟨F(t) F(s)⟩ = Σ_{i,j=1}^n λ_i λ_j ⟨η_i(t) η_j(s)⟩
            = Σ_{i,j=1}^n λ_i λ_j δ_ij β⁻¹ e^{−α_i |t−s|}
            = β⁻¹ Σ_{i=1}^n λ_i² e^{−α_i |t−s|} = β⁻¹ R(t − s).
These additional variables are solutions of a linear system of SDEs. This follows from results
in approximation theory. Consider now the case where the memory kernel is a bounded analytic
function. Its Laplace transform

R̂(s) = ∫₀^{+∞} e^{−st} R(t) dt

can be represented as a continued fraction:

R̂(s) = Δ₁² / ( s + γ₁ + Δ₂² / ( s + γ₂ + · · · ) ),  γ_i ≥ 0.   (11.23)

Since R(t) is bounded, we have that

lim_{s→∞} R̂(s) = 0.
Consider an approximation R_N(t) such that its continued fraction representation terminates after
N steps. R_N(t) is bounded, which implies that

lim_{s→∞} R̂_N(s) = 0.

The Laplace transform of R_N(t) is a rational function:

R̂_N(s) = ( Σ_{j=1}^N a_j s^{N−j} ) / ( s^N + Σ_{j=1}^N b_j s^{N−j} ),  a_j, b_j ∈ R.   (11.24)
This is the Laplace transform of the autocorrelation function of an appropriate linear system of
SDEs. Indeed, set

dx_j/dt = −b_j x_j + x_{j+1} + a_j dW_j/dt,  j = 1, . . . , N,   (11.25)
with x_{N+1}(t) = 0. The process x_1(t) is a stationary Gaussian process with autocorrelation function
R_N(t). For N = 1 and b_1 = α, a_1 = √(2β⁻¹α) we derive the GLE (11.15) with F(t) being the OU
process (11.14). Consider now the case N = 2 with b_i = α_i, i = 1, 2 and a_1 = 0, a_2 = √(2β⁻¹α₂).
The GLE becomes

Q̈ = −V′(Q) − λ² ∫₀^t R(t − s) Q̇(s) ds + λ F₁(t),
Ḟ₁ = −α₁ F₁ + F₂,   (11.27)
Ḟ₂ = −α₂ F₂ + √(2β⁻¹α₂) Ẇ₂,

with

β⁻¹ R(t − s) = ⟨F₁(t) F₁(s)⟩.
We can write (11.27) as a Markovian system for the variables {Q, P, Z₁, Z₂}:

Q̇ = P,
Ṗ = −V′(Q) + λ Z₁(t),
Ż₁ = −α₁ Z₁ + Z₂,
Ż₂ = −α₂ Z₂ − λP + √(2β⁻¹α₂) Ẇ₂.
Notice that this diffusion process is "more degenerate" than (11.15): noise acts on fewer degrees of freedom. It is still, however, hypoelliptic (Hörmander's condition is satisfied): there is sufficient interaction between the degrees of freedom $\{Q, P, Z_1, Z_2\}$ so that noise (and hence regularity) is transferred from the degrees of freedom that are directly forced by noise to those that are not. The corresponding Markov semigroup has nice regularizing properties; in particular, a smooth density exists. Stochastic processes that can be written as Markovian processes by adding a finite number of additional variables are called quasi-Markovian. Under appropriate assumptions on the potential $V(Q)$, the solution of the GLE is an ergodic process. The ergodic properties of a quasi-Markovian process can be studied by analyzing the spectral properties of the generator of the corresponding Markov process; this leads to the analysis of the spectral properties of hypoelliptic operators.
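For the linearization of this system (replacing $V'(Q)$ by $\omega^2 Q$), Hörmander's condition reduces to the Kalman rank condition familiar from control theory: with drift matrix $A$ and constant noise vector $\sigma$ supported on the $Z_2$ component, one needs $\mathrm{rank}\,[\sigma,\, A\sigma,\, A^2\sigma,\, A^3\sigma] = 4$. This is easy to verify directly (the parameter values below are illustrative):

```python
import numpy as np

# Illustrative parameters; V'(Q) is linearized as omega^2 Q.
omega, lam, alpha1, alpha2, beta = 1.0, 1.0, 1.0, 1.0, 1.0

# Drift matrix of the linearized (Q, P, Z1, Z2) dynamics.
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [-omega**2, 0.0, lam, 0.0],
              [0.0, 0.0, -alpha1, 1.0],
              [0.0, -lam, 0.0, -alpha2]])

# Noise enters through Z2 only.
sigma = np.array([0.0, 0.0, 0.0, np.sqrt(2.0 * alpha2 / beta)])

# Kalman rank condition: span{sigma, A sigma, A^2 sigma, A^3 sigma} = R^4.
K = np.column_stack([np.linalg.matrix_power(A, k) @ sigma for k in range(4)])
print(np.linalg.matrix_rank(K))  # prints 4: full rank, hence hypoelliptic
```

The full rank confirms that, although noise is injected only into $Z_2$, it propagates through the drift to all four degrees of freedom.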
11.3.1 Open Classical Systems
When studying the Kac–Zwanzig model we considered a one-dimensional Hamiltonian system coupled to a finite-dimensional Hamiltonian system with random initial conditions (the harmonic heat bath), and then passed to the thermodynamic limit $N \to \infty$. Alternatively, we can consider a small Hamiltonian system coupled to its environment, which we model as an infinite-dimensional Hamiltonian system with random initial conditions; this gives a coupled particle–field model. The distinguished particle (the Brownian particle) is described through the Hamiltonian
$$
H_{DP} = \frac{1}{2} p^2 + V(q). \tag{11.28}
$$
We will model the environment through a classical linear field theory (i.e., the wave equation) with infinite energy:
$$
\partial_t^2 \phi(t,x) = \partial_x^2 \phi(t,x). \tag{11.29}
$$
The Hamiltonian of this system is
$$
H_{HB}(\phi, \pi) = \int \left( |\partial_x \phi|^2 + |\pi(x)|^2 \right) dx. \tag{11.30}
$$
Here $\pi(x)$ denotes the conjugate momentum field. The initial conditions are distributed according to the Gibbs measure (which in this case is a Gaussian measure) at inverse temperature $\beta$, which we formally write as
$$
\text{``}\,\mu_\beta = Z^{-1} e^{-\beta H(\phi,\pi)}\, d\phi\, d\pi\,\text{''}. \tag{11.31}
$$
Care has to be taken when defining probability measures in infinite dimensions.
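One way to interpret (11.31) concretely: in Fourier variables the formal density factorizes into independent Gaussians, one per mode, with the $\phi$-mode variances decaying like $\beta^{-1}k^{-2}$, so approximate samples can be drawn by truncating the Fourier series. A rough sketch (the domain $[0, 2\pi]$, the truncation level and the normalization are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
beta, K, npts = 1.0, 128, 512
x = np.linspace(0.0, 2.0 * np.pi, npts, endpoint=False)

# Sample the phi-marginal of the (formal) Gibbs measure mode by mode:
# the k-th Fourier coefficient is Gaussian with variance ~ 1/(beta k^2).
phi = np.zeros(npts)
for k in range(1, K + 1):
    a, b = rng.standard_normal(2) / (np.sqrt(beta) * k)
    phi += a * np.cos(k * x) + b * np.sin(k * x)

# Typical configurations are continuous but rough.
print(phi.shape, np.abs(phi).max())
```

Each mode contributes an $O(\beta^{-1})$ amount to the energy (the $k^2$ from $|\partial_x\phi|^2$ cancels the $k^{-2}$ variance decay), so the partial sums of $H_{HB}$ grow without bound as the truncation level increases, consistent with the remark below that typical configurations have infinite energy.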
Under this assumption on the initial conditions, typical conﬁgurations of the heat bath have
inﬁnite energy. In this way, the environment can pump enough energy into the system so that
nontrivial ﬂuctuations emerge. We will assume linear coupling between the particle and the ﬁeld:
$$
H_I(q, \phi) = q \int \partial_x \phi(x)\, \rho(x)\, dx, \tag{11.32}
$$
where the function $\rho(x)$ models the coupling between the particle and the field. This coupling is motivated by the dipole coupling approximation from classical electrodynamics. The Hamiltonian
of the particle–field model is
$$
H(q, p, \phi, \pi) = H_{DP}(p, q) + H_{HB}(\phi, \pi) + H_I(q, \phi). \tag{11.33}
$$
The corresponding Hamiltonian equations of motion are those of the coupled particle–field model. We can now proceed as in the case of the finite-dimensional heat bath: we integrate the equations of motion for the heat bath variables and substitute the solution into the equations for the Brownian particle to obtain the GLE. The final result is
$$
\ddot{q} = -V'(q) - \int_0^t R(t-s)\, \dot{q}(s)\, ds + F(t), \tag{11.34}
$$
with appropriate definitions of the memory kernel and the noise, which are related through the fluctuation–dissipation theorem.
11.4 The Mori–Zwanzig Formalism
Consider now the $(N+1)$-dimensional Hamiltonian (particle + heat bath) with random initial conditions. The $(N+1)$-particle probability distribution function $f_{N+1}$ satisfies the Liouville equation
$$
\frac{\partial f_{N+1}}{\partial t} + \{f_{N+1}, H\} = 0, \tag{11.35}
$$
where $H$ is the full Hamiltonian and $\{\cdot, \cdot\}$ is the Poisson bracket
$$
\{A, B\} = \sum_{j=0}^{N} \left( \frac{\partial A}{\partial q_j} \frac{\partial B}{\partial p_j} - \frac{\partial B}{\partial q_j} \frac{\partial A}{\partial p_j} \right).
$$
We introduce the Liouville operator
$$
L_{N+1}\,\cdot = -i \{\cdot, H\}.
$$
The Liouville equation can be written as
$$
i \frac{\partial f_{N+1}}{\partial t} = L_{N+1} f_{N+1}. \tag{11.36}
$$
We want to obtain a closed equation for the distribution function of the Brownian particle. To this end we introduce a projection operator $P$ that projects onto the distribution function $f$ of the Brownian particle:
$$
P f_{N+1} = f, \qquad (I - P) f_{N+1} = h.
$$
The Liouville equation becomes
$$
i \frac{\partial f}{\partial t} = PL(f + h), \tag{11.37a}
$$
$$
i \frac{\partial h}{\partial t} = (I - P)L(f + h). \tag{11.37b}
$$
We integrate the second equation and substitute the result into the first to obtain
$$
i \frac{\partial f}{\partial t} = PLf - i \int_0^t PL\, e^{-i(I-P)Ls} (I-P)L\, f(t-s)\, ds + PL\, e^{-i(I-P)Lt} h(0). \tag{11.38}
$$
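The derivation of (11.38) rests on the Duhamel (Dyson) operator identity $e^{tL} = e^{tQL} + \int_0^t e^{(t-s)L}\, PL\, e^{sQL}\, ds$ with $Q = I - P$, valid for any operator $L$ and projection $P$ (the factors of $-i$ are absorbed into $L$ here). In finite dimensions this can be checked numerically; the matrices below are arbitrary illustrations:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n = 4
L = 0.5 * rng.standard_normal((n, n))   # arbitrary "Liouvillian"

# A projection (P^2 = P) onto the first coordinate; Q = I - P.
P = np.zeros((n, n)); P[0, 0] = 1.0
Q = np.eye(n) - P

# Duhamel/Dyson identity behind (11.38):
# e^{tL} = e^{t QL} + int_0^t e^{(t-s)L} P L e^{s QL} ds.
t, m = 1.0, 2000
h = t / m
s = np.linspace(0.0, t, m + 1)
f = np.array([expm((t - si) * L) @ P @ L @ expm(si * (Q @ L)) for si in s])

# Composite trapezoidal rule along the s-axis.
integral = h * (0.5 * f[0] + f[1:-1].sum(axis=0) + 0.5 * f[-1])

lhs = expm(t * L)
rhs = expm(t * (Q @ L)) + integral
err = np.abs(lhs - rhs).max()
print(err)  # small: only the quadrature error remains
```

The residual is pure quadrature error, confirming that the memory term in (11.38) exactly accounts for the difference between the full evolution and the projected one.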
In the Markovian limit (large mass ratio) we obtain the Fokker–Planck equation.
11.5 Derivation of the Fokker–Planck and Langevin Equations
11.6 Linear Response Theory
11.7 Discussion and Bibliography
The original papers by Kac et al. and by Zwanzig are [26, 95]; see also [25]. The variant of the Kac–Zwanzig model that we have discussed in this chapter was studied in [37]. An excellent discussion of the derivation of the Fokker–Planck equation using projection operator techniques can be found in [66].
Applications of linear response theory to climate modeling can be found in.
11.8 Exercises
Index
autocorrelation function, 32
Banach space, 16
Brownian motion
scaling and symmetry properties, 42
central limit theorem, 24
conditional expectation, 18
correlation coefﬁcient, 17
covariance function, 32
Diffusion process
mean ﬁrst passage time, 188
Diffusion processes
reversible, 106
Dirichlet form, 109
equation
Fokker–Planck, 88
kinetic, 116
Klein–Kramers–Chandrasekhar, 137
Langevin, 137
Fokker–Planck, 88
Fokker–Planck equation, 126
Fokker–Planck equation
classical solution of, 89
Gaussian stochastic process, 30
generator, 68, 125
Gibbs distribution, 107
Gibbs measure, 109
Green–Kubo formula, 39
inverse temperature, 100
Itô formula, 125
Joint probability density, 96
Karhunen–Loève Expansion, 45
Karhunen–Loève Expansion
for Brownian Motion, 49
kinetic equation, 116
Kolmogorov equation, 126
Langevin equation, 137
law, 13
law of large numbers
strong, 24
Markov Chain Monte Carlo, 111
MCMC, 111
Mean ﬁrst passage time, 188
Multiplicative noise, 133
operator
hypoelliptic, 137
Ornstein–Uhlenbeck process
Fokker–Planck equation for, 95
partition function, 107
Poincaré's inequality
for Gaussian measures, 101
Poincaré's inequality, 109
Quasi-Markovian stochastic process, 221
random variable
Gaussian, 17
uncorrelated, 17
Reversible diffusion, 106
spectral density, 35
stationary process, 31
stationary process
second order stationary, 32
strictly stationary, 31
wide sense stationary, 32
stochastic differential equation, 43
Stochastic Process
quasi-Markovian, 221
stochastic process
deﬁnition, 29
Gaussian, 30
second-order stationary, 32
stationary, 31
equivalent, 30
stochastic processes
strictly stationary, 31
transport coefﬁcient, 39
Wiener process, 40
Bibliography
[1] L. Arnold. Stochastic differential equations: theory and applications. WileyInterscience
[John Wiley & Sons], New York, 1974. Translated from the German.
[2] R. Balescu. Statistical dynamics. Matter out of equilibrium. Imperial College Press, London,
1997.
[3] A. Barone and G. Paterno. Physics and Applications of the Josephson Effect. Wiley, New
York, 1982.
[4] R. Bartussek, P. Reimann, and P. Hanggi. Precise numerics versus theory for correlation
ratchets. Phys. Rev. Let., 76(7):1166–1169, 1996.
[5] A. Bensoussan, J.-L. Lions, and G. Papanicolaou. Asymptotic analysis for periodic structures, volume 5 of Studies in Mathematics and its Applications. North-Holland Publishing Co., Amsterdam, 1978.
[6] N. Berglund and B. Gentz. Noise-induced phenomena in slow-fast dynamical systems. Probability and its Applications (New York). Springer-Verlag London Ltd., London, 2006. A sample-paths approach.
[7] M. Bier and R.D. Astumian. Biasing Brownian motion in different directions in a 3–state
ﬂuctuating potential and application for the separation of small particles. Phys. Rev. Let.,
76(22):4277, 1996.
[8] L. Breiman. Probability, volume 7 of Classics in Applied Mathematics. Society for Industrial
and Applied Mathematics (SIAM), Philadelphia, PA, 1992. Corrected reprint of the 1968
original.
[9] C. Bustamante, D. Keller, and G. Oster. The physics of molecular motors. Acc. Chem. res.,
34:412–420, 2001.
[10] S. Cerrai and M. Freidlin. On the SmoluchowskiKramers approximation for a system with
an inﬁnite number of degrees of freedom. Probab. Theory Related Fields, 135(3):363–394,
2006.
[11] S. Cerrai and M. Freidlin. SmoluchowskiKramers approximation for a general class of
SPDEs. J. Evol. Equ., 6(4):657–689, 2006.
[12] S. Chandrasekhar. Stochastic problems in physics and astronomy. Rev. Mod. Phys., 15(1):1–
89, Jan 1943.
[13] A.J. Chorin and O.H. Hald. Stochastic tools in mathematics and science, volume 1 of Surveys
and Tutorials in the Applied Mathematical Sciences. Springer, New York, 2006.
[14] I. Derenyi and R.D. Astumian. ac separation of particles by biased Brownian motion in a two-dimensional sieve. Phys. Rev. E, 58(6):7781–7784, 1998.
[15] W. Dietrich, I. Peschel, and W.R. Schneider. Diffusion in periodic potentials. Z. Phys,
27:177–187, 1977.
[16] C.R. Doering, L. A. Dontcheva, and M.M. Klosek. Constructive role of noise: fast ﬂuctuation
asymptotics of transport in stochastic ratchets. Chaos, 8(3):643–649, 1998.
[17] C.R. Doering, W. Horsthemke, and J. Riordan. Nonequilibrium fluctuation-induced transport. Phys. Rev. Let., 72(19):2984–2987, 1994.
[18] N. Wax (editor). Selected Papers on Noise and Stochastic Processes. Dover, New York, 1954.
[19] R. Eichhorn and P. Reimann. Paradoxical directed diffusion due to temperature anisotropies.
Europhys. Lett., 69(4):517–523, 2005.
[20] A. Einstein. Investigations on the theory of the Brownian movement. Dover Publications Inc., New York, 1956. Edited with notes by R. Fürth, translated by A. D. Cowper.
[21] S.N. Ethier and T.G. Kurtz. Markov processes. Wiley Series in Probability and Mathematical
Statistics: Probability and Mathematical Statistics. John Wiley & Sons Inc., New York, 1986.
[22] L.C. Evans. Partial Differential Equations. AMS, Providence, Rhode Island, 1998.
[23] W. Feller. An introduction to probability theory and its applications. Vol. I. Third edition.
John Wiley & Sons Inc., New York, 1968.
[24] W. Feller. An introduction to probability theory and its applications. Vol. II. Second edition.
John Wiley & Sons Inc., New York, 1971.
[25] G. W. Ford and M. Kac. On the quantum Langevin equation. J. Statist. Phys., 46(5-6):803–810, 1987.
[26] G. W. Ford, M. Kac, and P. Mazur. Statistical mechanics of assemblies of coupled oscillators.
J. Mathematical Phys., 6:504–515, 1965.
[27] M. Freidlin and M. Weber. A remark on random perturbations of the nonlinear pendulum.
Ann. Appl. Probab., 9(3):611–628, 1999.
[28] M. I. Freidlin and A. D. Wentzell. Random perturbations of Hamiltonian systems. Mem.
Amer. Math. Soc., 109(523):viii+82, 1994.
[29] M.I. Freidlin and A.D. Wentzell. Random perturbations of dynamical systems. Springer-Verlag, New York, 1984.
[30] A. Friedman. Partial differential equations of parabolic type. PrenticeHall Inc., Englewood
Cliffs, N.J., 1964.
[31] A. Friedman. Stochastic differential equations and applications. Vol. 1. Academic Press
[Harcourt Brace Jovanovich Publishers], New York, 1975. Probability and Mathematical
Statistics, Vol. 28.
[32] A. Friedman. Stochastic differential equations and applications. Vol. 2. Academic Press
[Harcourt Brace Jovanovich Publishers], New York, 1976. Probability and Mathematical
Statistics, Vol. 28.
[33] P. Fulde, L. Pietronero, W. R. Schneider, and S. Strässler. Problem of Brownian motion in a periodic potential. Phys. Rev. Let., 35(26):1776–1779, 1975.
[34] H. Gang, A. Daffertshofer, and H. Haken. Diffusion in periodically forced Brownian particles
moving in space–periodic potentials. Phys. Rev. Let., 76(26):4874–4877, 1996.
[35] C. W. Gardiner. Handbook of stochastic methods. SpringerVerlag, Berlin, second edition,
1985. For physics, chemistry and the natural sciences.
[36] I. I. Gikhman and A. V. Skorokhod. Introduction to the theory of random processes. Dover
Publications Inc., Mineola, NY, 1996.
[37] D. Givon, R. Kupferman, and A.M. Stuart. Extracting macroscopic dynamics: model problems and algorithms. Nonlinearity, 17(6):R55–R127, 2004.
[38] M. Hairer and G. A. Pavliotis. From ballistic to diffusive behavior in periodic potentials. J.
Stat. Phys., 131(1):175–202, 2008.
[39] M. Hairer and G.A. Pavliotis. Periodic homogenization for hypoelliptic diffusions. J. Statist.
Phys., 117(12):261–279, 2004.
[40] P. Hanggi. Escape from a metastable state. J. Stat. Phys., 42(1/2):105–140, 1986.
[41] P. Hanggi, P. Talkner, and M. Borkovec. Reaction-rate theory: fifty years after Kramers. Rev. Modern Phys., 62(2):251–341, 1990.
[42] W. Horsthemke and R. Lefever. Noise-induced transitions, volume 15 of Springer Series in Synergetics. Springer-Verlag, Berlin, 1984. Theory and applications in physics, chemistry, and biology.
[43] J. Jacod and A.N. Shiryaev. Limit theorems for stochastic processes, volume 288 of
Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical
Sciences]. SpringerVerlag, Berlin, 2003.
[44] F. John. Partial differential equations, volume 1 of Applied Mathematical Sciences. Springer
Verlag, New York, fourth edition, 1991.
[45] S. Karlin and H. M. Taylor. A second course in stochastic processes. Academic Press Inc.
[Harcourt Brace Jovanovich Publishers], New York, 1981.
[46] S. Karlin and H.M. Taylor. A first course in stochastic processes. Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, 1975.
[47] C. Kipnis and S. R. S. Varadhan. Central limit theorem for additive functionals of reversible
Markov processes and applications to simple exclusions. Comm. Math. Phys., 104(1):1–19,
1986.
[48] L. B. Koralov and Y. G. Sinai. Theory of probability and random processes. Universitext.
Springer, Berlin, second edition, 2007.
[49] H. A. Kramers. Brownian motion in a ﬁeld of force and the diffusion model of chemical
reactions. Physica, 7:284–304, 1940.
[50] N. V. Krylov. Introduction to the theory of diffusion processes, volume 142 of Translations
of Mathematical Monographs. American Mathematical Society, Providence, RI, 1995.
[51] R. Kupferman, G. A. Pavliotis, and A. M. Stuart. Itô versus Stratonovich white-noise limits for systems with inertia and colored multiplicative noise. Phys. Rev. E (3), 70(3):036120, 9, 2004.
[52] A.M. Lacasta, J.M. Sancho, A.H. Romero, I.M. Sokolov, and K. Lindenberg. From subdiffusion to superdiffusion of particles on solid surfaces. Phys. Rev. E, 70:051104, 2004.
[53] P. D. Lax. Linear algebra and its applications. Pure and Applied Mathematics (Hoboken).
WileyInterscience [John Wiley & Sons], Hoboken, NJ, second edition, 2007.
[54] S. Lifson and J.L. Jackson. On the self–diffusion of ions in polyelectrolytic solution. J. Chem.
Phys, 36:2410, 1962.
[55] B. Lindner, M. Kostur, and L. SchimanskyGeier. Optimal diffusive transport in a tilted
periodic potential. Fluctuation and Noise Letters, 1(1):R25–R39, 2001.
[56] M. Loève. Probability theory. I. Springer-Verlag, New York, fourth edition, 1977. Graduate Texts in Mathematics, Vol. 45.
[57] M. Loève. Probability theory. II. Springer-Verlag, New York, fourth edition, 1978. Graduate Texts in Mathematics, Vol. 46.
[58] M. C. Mackey. Time’s arrow. Dover Publications Inc., Mineola, NY, 2003. The origins of
thermodynamic behavior, Reprint of the 1992 original [Springer, New York; MR1140408].
[59] M.C. Mackey, A. Longtin, and A. Lasota. Noiseinduced global asymptotic stability. J.
Statist. Phys., 60(56):735–751, 1990.
[60] M. O. Magnasco. Forced thermal ratchets. Phys. Rev. Let., 71(10):1477–1481, 1993.
[61] P. Mandl. Analytical treatment of onedimensional Markov processes. Die Grundlehren der
mathematischen Wissenschaften, Band 151. Academia Publishing House of the Czechoslo
vak Academy of Sciences, Prague, 1968.
[62] P. A. Markowich and C. Villani. On the trend to equilibrium for the FokkerPlanck equation:
an interplay between physics and functional analysis. Mat. Contemp., 19:1–29, 2000.
[63] B. J. Matkowsky, Z. Schuss, and E. BenJacob. A singular perturbation approach to Kramers’
diffusion problem. SIAM J. Appl. Math., 42(4):835–849, 1982.
[64] B. J. Matkowsky, Z. Schuss, and C. Tier. Uniform expansion of the transition rate in Kramers’
problem. J. Statist. Phys., 35(34):443–456, 1984.
[65] J.C. Mattingly and A. M. Stuart. Geometric ergodicity of some hypoelliptic diffusions for
particle motions. Markov Processes and Related Fields, 8(2):199–214, 2002.
[66] R.M. Mazo. Brownian motion, volume 112 of International Series of Monographs on
Physics. Oxford University Press, New York, 2002.
[67] J. Meyer and J. Schr¨ oter. Comments on the Grad procedure for the FokkerPlanck equation.
J. Statist. Phys., 32(1):53–69, 1983.
[68] E. Nelson. Dynamical theories of Brownian motion. Princeton University Press, Princeton,
N.J., 1967.
[69] B. Øksendal. Stochastic differential equations. Universitext. SpringerVerlag, Berlin, 2003.
[70] G.C. Papanicolaou and S. R. S. Varadhan. OrnsteinUhlenbeck process in a random potential.
Comm. Pure Appl. Math., 38(6):819–834, 1985.
[71] G. A. Pavliotis and A. M. Stuart. Analysis of white noise limits for stochastic systems with
two fast relaxation times. Multiscale Model. Simul., 4(1):1–35 (electronic), 2005.
[72] G. A. Pavliotis and A. M. Stuart. Parameter estimation for multiscale diffusions. J. Stat.
Phys., 127(4):741–781, 2007.
[73] G. A. Pavliotis and A. Vogiannou. Diffusive transport in periodic potentials: Underdamped
dynamics. Fluct. Noise Lett., 8(2):L155–173, 2008.
[74] G.A. Pavliotis and A.M. Stuart. Multiscale methods, volume 53 of Texts in Applied Mathe
matics. Springer, New York, 2008. Averaging and homogenization.
[75] G. Da Prato and J. Zabczyk. Stochastic Equations in Inﬁnite Dimensions, volume 44 of
Encyclopedia of Mathematics and its Applications. Cambridge University Press, 1992.
[76] H. Qian, Min Qian, and X. Tang. Thermodynamics of the general diffusion process: time-reversibility and entropy production. J. Stat. Phys., 107(5/6):1129–1141, 2002.
[77] R. L. Stratonovich. Topics in the theory of random noise. Vol. II. Revised English edition. Translated from the Russian by Richard A. Silverman. Gordon and Breach Science Publishers, New York, 1967.
[78] P. Reimann. Brownian motors: noisy transport far from equilibrium. Phys. Rep., 361(2
4):57–265, 2002.
[79] P. Reimann, C. Van den Broeck, H. Linke, P. Hänggi, J.M. Rubi, and A. Perez-Madrid. Diffusion in tilted periodic potentials: enhancement, universality and scaling. Phys. Rev. E, 65(3):031104, 2002.
[80] P. Reimann, C. Van den Broeck, H. Linke, J.M. Rubi, and A. Perez-Madrid. Giant acceleration of free diffusion by use of tilted periodic potentials. Phys. Rev. Let., 87(1):010602, 2001.
[81] Frigyes Riesz and Béla Sz.-Nagy. Functional analysis. Dover Publications Inc., New York, 1990. Translated from the second French edition by Leo F. Boron, reprint of the 1955 original.
[82] H. Risken. The FokkerPlanck equation, volume 18 of Springer Series in Synergetics.
SpringerVerlag, Berlin, 1989.
[83] H. Rodenhausen. Einstein’s relation between diffusion constant and mobility for a diffusion
model. J. Statist. Phys., 55(56):1065–1088, 1989.
[84] J.M. Sancho, A.M. Lacasta, K. Lindenberg, I.M. Sokolov, and A.H. Romero. Diffusion on a solid surface: anomalous is normal. Phys. Rev. Let., 92(25):250601, 2004.
[85] M. Schreier, P. Reimann, P. Hänggi, and E. Pollak. Giant enhancement of diffusion and particle selection in rocked periodic potentials. Europhys. Let., 44(4):416–422, 1998.
[86] Z. Schuss. Singular perturbation methods in stochastic differential equations of mathematical
physics. SIAM Review, 22(2):119–155, 1980.
[87] Ch. Schütte and W. Huisinga. Biomolecular conformations can be identified as metastable sets of molecular dynamics. In Handbook of Numerical Analysis (Computational Chemistry), Vol. X, 2003.
[88] C. Schwab and R.A. Todor. Karhunen-Loève approximation of random fields by generalized fast multipole methods. J. Comput. Phys., 217(1):100–122, 2006.
[89] R.B. Sowers. A boundary layer theory for diffusively perturbed transport around a heteroclinic cycle. Comm. Pure Appl. Math., 58(1):30–84, 2005.
[90] D.W. Stroock. Probability theory, an analytic view. Cambridge University Press, Cambridge,
1993.
[91] G. I. Taylor. Diffusion by continuous movements. London Math. Soc., 20:196, 1921.
[92] T.C. Elston and C.R. Doering. Numerical and analytical studies of nonequilibrium fluctuation-induced transport processes. J. Stat. Phys., 83:359–383, 1996.
[93] G. E. Uhlenbeck and L. S. Ornstein. On the theory of the brownian motion. Phys. Rev.,
36(5):823–841, Sep 1930.
[94] M. Vergassola and M. Avellaneda. Scalar transport in compressible ﬂow. Phys. D, 106(1
2):148–166, 1997.
[95] R. Zwanzig. Nonlinear generalized Langevin equations. J. Stat. Phys., 9(3):215–220, 1973.
[96] R. Zwanzig. Nonequilibrium statistical mechanics. Oxford University Press, New York,
2001.
Preface

The purpose of these notes is to present various results and techniques from the theory of stochastic processes that are useful in the study of stochastic problems in physics, chemistry and other areas. These notes have been used for several years for a course on applied stochastic processes offered to fourth year and to MSc students in applied mathematics at the Department of Mathematics, Imperial College London.

G.A. Pavliotis
London, December 2010
Chapter 1

Introduction

1.1 Introduction

In this chapter we introduce some of the concepts and techniques that we will study in this book. In Section 1.2 we present a brief historical overview of the development of the theory of stochastic processes in the twentieth century. In Section 1.3 we introduce the one-dimensional random walk, and we use this example in order to introduce several concepts such as Brownian motion and the Markov property. In Section 1.4 we discuss the stochastic modeling of deterministic chaos. Some comments on the role of probabilistic modeling in the physical sciences are offered in Section 1.5. Discussion and bibliographical comments are presented in Section 1.6. Exercises are included in Section 1.7.

1.2 Historical Overview

The theory of stochastic processes, at least in terms of its application to physics, started with Einstein's work on the theory of Brownian motion: "Concerning the motion, as required by the molecular-kinetic theory of heat, of particles suspended in liquids at rest" (1905) and a series of additional papers that were published in the period 1905-1906. In these fundamental works, Einstein presented an explanation of Brown's observation (1827) that small pollen grains, when suspended in water, are found to be in a very animated and irregular state of motion. Using modern terminology, Einstein introduced a Markov chain model for the motion of the particle (molecule, pollen grain, etc.). In developing his theory Einstein introduced several concepts that still play a fundamental role in the study of stochastic processes and that we will study in this book. Furthermore, he introduced the idea that it makes more sense to talk about the probability of finding the particle at position x at time t, rather than about individual trajectories.

In his work many of the main aspects of the modern theory of stochastic processes can be found:

• The assumption of Markovianity (no memory) expressed through the Chapman-Kolmogorov equation.
• The Fokker-Planck equation (in this case, the diffusion equation).
• The derivation of the Fokker-Planck equation from the master (Chapman-Kolmogorov) equation through a Kramers-Moyal expansion.
• The calculation of a transport coefficient (the diffusion coefficient) using macroscopic (kinetic theory-based) considerations:

D = k_B T / (6πηa),

where k_B is Boltzmann's constant, T is the temperature, η is the viscosity of the fluid and a is the diameter of the particle.

Einstein's theory is based on the Fokker-Planck equation. Langevin (1908) developed a theory based on a stochastic differential equation. The equation of motion for a Brownian particle is

m d²x/dt² = −6πηa dx/dt + ξ,

where ξ is a random force. It can be shown that there is complete agreement between Einstein's theory and Langevin's theory. The theory of Brownian motion was developed independently by Smoluchowski, who also performed several experiments.

The approaches of Langevin and Einstein represent the two main approaches in the theory of stochastic processes:

• Study individual trajectories of Brownian particles. Their evolution is governed by a stochastic differential equation:

dX/dt = F(X) + Σ(X)ξ(t),

where ξ(t) is a random force.
• Study the probability ρ(x, t) of finding a particle at position x at time t. This probability distribution satisfies the Fokker-Planck equation:

∂ρ/∂t = −∇ · (F(x)ρ) + (1/2) ∇∇ : (A(x)ρ),

where A(x) = Σ(x)Σ(x)^T.

The theory of stochastic processes was developed during the 20th century by several mathematicians and physicists, including Smoluchowski, Planck, Kramers, Chandrasekhar, Wiener, Kolmogorov, Itô and Doob.

1.3 The One-Dimensional Random Walk

We let time be discrete, i.e. t = 0, 1, 2, .... Consider the following stochastic process S_n: S_0 = 0; at each time step it moves to ±1 with equal probability 1/2.

In other words, at each time step we flip a fair coin. If the outcome is heads, we move one unit to the right. If the outcome is tails, we move one unit to the left.

Alternatively, we can think of the random walk as a sum of independent random variables:

S_n = Σ_{j=1}^n X_j,

where X_j ∈ {−1, 1} with P(X_j = ±1) = 1/2.

We can simulate the random walk on a computer:

• We need a (pseudo)random number generator to generate n independent random variables which are uniformly distributed in the interval [0, 1].
• If the value of the random variable is greater than 1/2 then the particle moves to the left; otherwise it moves to the right.
• We then take the sum of all these random moves.
• The sequence {S_n}_{n=1}^N indexed by the discrete time T = {1, 2, ..., N} is the path of the random walk. We use a linear interpolation (i.e. connect the points {n, S_n} by straight lines) to generate a continuous path.
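The simulation recipe above is short enough to sketch in full. The following is a minimal illustration using only the Python standard library; plotting of the interpolated path is omitted:

```python
import random

def random_walk(n_steps, seed=None):
    """Generate one path S_0, S_1, ..., S_N of the simple random walk.

    At each step a uniform [0, 1) random number decides whether the
    walker moves one unit to the left or to the right (probability 1/2 each).
    """
    rng = random.Random(seed)
    path = [0]
    for _ in range(n_steps):
        step = -1 if rng.random() > 0.5 else 1
        path.append(path[-1] + step)
    return path

path = random_walk(50, seed=0)
print(len(path), path[0], path[-1])
```

Calling `random_walk` repeatedly with different seeds produces different paths, since each path depends on the outcome of an independent sequence of random experiments.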
Figure 1.1: Three paths of the random walk of length N = 50.

Figure 1.2: Three paths of the random walk of length N = 1000.
Every path of the random walk is different: it depends on the outcome of a sequence of independent random experiments. We can compute statistics by generating a large number of paths and computing averages. For example, E(S_n) = 0 and E(S_n²) = n. The paths of the random walk (without the linear interpolation) are not continuous: the random walk has a jump of size 1 at each time step. This is an example of a discrete time, discrete space stochastic process. The random walk is a time-homogeneous Markov process.

If we take a large number of steps, the random walk starts looking like a continuous time process with continuous paths. We can quantify this observation by introducing an appropriate rescaled process and by taking an appropriate limit. Consider the sequence of continuous time stochastic processes

Z_t^n := (1/√n) S_{nt}.

In the limit as n → ∞, the sequence {Z_t^n} converges (in some appropriate sense, that will be made precise in later chapters) to a Brownian motion with diffusion coefficient D = Δx²/(2Δt) = 1/2.

Figure 1.3: Sample Brownian paths.

Brownian motion W(t) is a continuous time stochastic process with continuous paths that starts at 0 (W(0) = 0) and has independent, normally distributed Gaussian increments. We can simulate Brownian motion on a computer using a random number generator that generates normally distributed, independent random variables. We can write an equation for the evolution of the paths
of a Brownian motion X_t with diffusion coefficient D starting at x:

dX_t = √(2D) dW_t,  X_0 = x.

This is the simplest example of a stochastic differential equation. The probability of finding X_t at y at time t, given that it was at x at time t = 0, the transition probability density ρ(y, t), satisfies the PDE

∂ρ/∂t = D ∂²ρ/∂y²,  ρ(y, 0) = δ(y − x).

This is the simplest example of the Fokker-Planck equation. The connection between Brownian motion and the diffusion equation was made by Einstein in 1905.

1.4 Stochastic Modeling of Deterministic Chaos

1.5 Why Randomness

Why introduce randomness in the description of physical systems?

• To describe outcomes of a repeated set of experiments. Think of tossing a coin repeatedly or of throwing a die.
• To describe a deterministic system for which we have incomplete information: we have imprecise knowledge of initial and boundary conditions or of model parameters. Think of the physical model for Brownian motion (a heavy particle colliding with many small particles).
• To describe systems for which we are not confident about the validity of our mathematical model.
• To describe a dynamical system exhibiting very complicated behavior (chaotic dynamical systems). Determinism versus predictability.
• To describe a high-dimensional deterministic system using a simpler, low-dimensional stochastic system.
  – ODEs with random initial conditions are equivalent to stochastic processes that can be described using stochastic differential equations.
• To describe a system that is inherently random. Think of quantum mechanics.

Stochastic modeling is currently used in many different areas, ranging from biology to climate modeling to economics.

1.6 Discussion and Bibliography

The fundamental papers of Einstein on the theory of Brownian motion have been reprinted by Dover [20]. Other fundamental papers from the early period of the development of the theory of stochastic processes include the papers by Langevin, Ornstein and Uhlenbeck, Doob, and Kramers, and Chandrasekhar's famous review article [12]. Many of these early papers on the theory of stochastic processes have been reprinted in [18]. Very useful historical comments can be found in the books by Nelson [68] and Mazo [66]. The readers of this book are strongly encouraged to study these papers.

1.7 Exercises

1. Write a computer program for generating the random walk in one and two dimensions. Study numerically the Brownian limit and compute the statistics of the random walk.
2. Read the papers by Einstein, Ornstein and Uhlenbeck, Doob, etc.
Chapter 2

Elements of Probability Theory

2.1 Introduction

In this chapter we put together some basic definitions and results from probability theory that will be used later on. In Section 2.2 we give some basic definitions from the theory of probability. In Section 2.3 we present some properties of random variables. In Section 2.4 we introduce the concept of conditional expectation and in Section 2.5 we define the characteristic function, one of the most useful tools in the study of (sums of) random variables. Some explicit calculations for the multivariate Gaussian distribution are presented in Section 2.6. Different types of convergence and the basic limit theorems of the theory of probability are discussed in Section 2.7. Discussion and bibliographical comments are presented in Section 2.8. Exercises are included in Section 2.9.

2.2 Basic Definitions from Probability Theory

In Chapter 1 we defined a stochastic process as a dynamical system whose law of evolution is probabilistic. In order to study stochastic processes we need to be able to describe the outcome of a random experiment and to calculate functions of this outcome. First we need to describe the set of all possible experiments.

Definition 2.2.1. The set of all possible outcomes of an experiment is called the sample space and is denoted by Ω.

Example 2.2.2.
• The possible outcomes of the experiment of tossing a coin are H and T. The sample space is Ω = {H, T}.
• The possible outcomes of the experiment of throwing a die are 1, 2, 3, 4, 5 and 6. The sample space is Ω = {1, 2, 3, 4, 5, 6}.

We define events to be subsets of the sample space. Of course, we would like the unions, intersections and complements of events to also be events. When the sample space Ω is uncountable, then technical difficulties arise. In particular, not all subsets of the sample space need to be events. A definition of the collection of subsets of events which is appropriate for finite additive probability is the following.

Definition 2.2.3. A collection F of subsets of Ω is called a field on Ω if
i. ∅ ∈ F;
ii. if A ∈ F then A^c ∈ F;
iii. if A, B ∈ F then A ∪ B ∈ F.

From the definition of a field we immediately deduce that F is closed under finite unions and finite intersections:

A_1, ..., A_n ∈ F ⇒ ∪_{i=1}^n A_i ∈ F,  ∩_{i=1}^n A_i ∈ F.

When Ω is infinite dimensional then the above definition is not appropriate, since we need to consider countable unions of events.

Definition 2.2.4. A collection F of subsets of Ω is called a σ-field or σ-algebra on Ω if
i. ∅ ∈ F;
ii. if A ∈ F then A^c ∈ F;
iii. if A_1, A_2, ... ∈ F then ∪_{i=1}^∞ A_i ∈ F.

A σ-algebra is closed under the operation of taking countable intersections.

Example 2.2.5.
• F = {∅, Ω};
• F = {∅, A, A^c, Ω}, where A is a subset of Ω;
• the power set of Ω, denoted by {0, 1}^Ω, which contains all subsets of Ω.
Let F be a collection of subsets of Ω. It can be extended to a σ-algebra (take for example the power set of Ω). Consider all the σ-algebras that contain F and take their intersection, denoted by σ(F); that is, A ⊂ Ω belongs to σ(F) if and only if it is in every σ-algebra containing F. σ(F) is a σ-algebra (see Exercise 1). It is the smallest algebra containing F and it is called the σ-algebra generated by F.

Example 2.2.6. Let Ω = R^n. The σ-algebra generated by the open subsets of R^n (or, equivalently, by the open balls of R^n) is called the Borel σ-algebra of R^n and is denoted by B(R^n). Let X be a closed subset of R^n. Similarly, we can define the Borel σ-algebra of X, denoted by B(X).

A sub-σ-algebra is a collection of subsets of a σ-algebra which satisfies the axioms of a σ-algebra. The σ-field F of a sample space Ω contains all possible outcomes of the experiment that we want to study. Intuitively, the σ-field contains all the information about the random experiment that is available to us.

Now we want to assign probabilities to the possible outcomes of an experiment.

Definition 2.2.7. A probability measure P on the measurable space (Ω, F) is a function P : F → [0, 1] satisfying
i. P(∅) = 0, P(Ω) = 1;
ii. for A_1, A_2, ... with A_i ∩ A_j = ∅, i ≠ j:

P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i).

Definition 2.2.8. The triple (Ω, F, P), comprising a set Ω, a σ-algebra F of subsets of Ω and a probability measure P on (Ω, F), is called a probability space.

Example 2.2.9. A biased coin is tossed once: Ω = {H, T}, F = {∅, {H}, {T}, Ω}, P(H) = p ∈ [0, 1], P(T) = 1 − p.

Example 2.2.10. Take Ω = [0, 1], F = B([0, 1]), P = Leb([0, 1]). Then (Ω, F, P) is a probability space.
2.2.1 Conditional Probability

One of the most important concepts in probability is that of the dependence between events.

Definition 2.2.11. A family {A_i : i ∈ I} of events is called independent if

P(∩_{j∈J} A_j) = Π_{j∈J} P(A_j)

for all finite subsets J of I.

When two events A, B are dependent it is important to know the probability that the event A will occur, given that B has already happened. We define this to be the conditional probability, denoted by P(A|B). We know from elementary probability that

P(A|B) = P(A ∩ B) / P(B).

A very useful result is the law of total probability.

Definition 2.2.12. A family of events {B_i : i ∈ I} is called a partition of Ω if B_i ∩ B_j = ∅ for i ≠ j and ∪_{i∈I} B_i = Ω.

Proposition 2.2.13 (Law of total probability). For any event A and any partition {B_i : i ∈ I} we have

P(A) = Σ_{i∈I} P(A|B_i) P(B_i).

The proof of this result is left as an exercise. In many cases the calculation of the probability of an event is simplified by choosing an appropriate partition of Ω and using the law of total probability.

Let (Ω, F, P) be a probability space and fix B ∈ F. Then P(·|B) defines a probability measure on F. Indeed, we have that P(∅|B) = 0, P(Ω|B) = 1 and, since A_i ∩ A_j = ∅ implies that (A_i ∩ B) ∩ (A_j ∩ B) = ∅,

P(∪_{j=1}^∞ A_j|B) = Σ_{j=1}^∞ P(A_j|B)

for a countable family of pairwise disjoint sets {A_j}_{j=1}^{+∞}. Consequently, (Ω, F, P(·|B)) is a probability space for every B ∈ F.
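The law of total probability is easy to illustrate numerically. In the following sketch the two-event partition and all numerical values are made up for the example: one of two biased coins is chosen (B1 with probability 0.3, B2 with probability 0.7) and A is the event that the toss comes up heads.

```python
import random

# Partition probabilities P(B_i) and conditionals P(A|B_i)
# (illustrative values, not from the text).
p_B = {"B1": 0.3, "B2": 0.7}
p_A_given_B = {"B1": 0.9, "B2": 0.2}

# Law of total probability: P(A) = sum_i P(A|B_i) P(B_i) = 0.41 here.
p_A = sum(p_A_given_B[b] * p_B[b] for b in p_B)

# Monte Carlo check of the same quantity.
rng = random.Random(1)
n = 200_000
hits = 0
for _ in range(n):
    b = "B1" if rng.random() < p_B["B1"] else "B2"
    if rng.random() < p_A_given_B[b]:
        hits += 1
estimate = hits / n
print(p_A, round(estimate, 3))
```

The empirical frequency agrees with the value given by the proposition up to the usual Monte Carlo error of order 1/√n.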
2.3 Random Variables

We are usually interested in the consequences of the outcome of an experiment, rather than in the experiment itself. A function of the outcome of an experiment is a random variable, that is, a map from Ω to R.

Definition 2.3.1. A sample space Ω equipped with a σ-field of subsets F is called a measurable space.

Definition 2.3.2. Let (Ω, F) and (E, G) be two measurable spaces. A function X : Ω → E such that the event

{ω ∈ Ω : X(ω) ∈ A} =: {X ∈ A}   (2.1)

belongs to F for arbitrary A ∈ G is called a measurable function or random variable.

When E is R equipped with its Borel σ-algebra, (2.1) can be replaced with

{X ≤ x} ∈ F  ∀x ∈ R.

Let U be a topological space. We will use the notation B(U) to denote the Borel σ-algebra of U: the smallest σ-algebra containing all open sets of U. Every random variable from a probability space (Ω, F, µ) to a measurable space (E, B(E)) induces a probability measure on E:

µ_X(B) = P X^{−1}(B) = µ(ω ∈ Ω; X(ω) ∈ B),  B ∈ B(E).   (2.2)

The measure µ_X is called the distribution (or sometimes the law) of X.

Example 2.3.3. Let I denote a subset of the positive integers. A vector ρ_0 = {ρ_{0,i}, i ∈ I} is a distribution on I if it has nonnegative entries and its total mass equals 1: Σ_{i∈I} ρ_{0,i} = 1.

Let X be a random variable (measurable function) from (Ω, F, µ) to (E, G). If E is a metric space then we may define the expectation with respect to the measure µ by

E[X] = ∫_Ω X(ω) dµ(ω).

More generally, let f : E → R be G-measurable. Then

E[f(X)] = ∫_Ω f(X(ω)) dµ(ω).

Consider the case where E = R equipped with the Borel σ-algebra. In this case a random variable is defined to be a function X : Ω → R such that {ω ∈ Ω : X(ω) ≤ x} ⊂ F for all x ∈ R. We can now define the probability distribution function of X, F_X : R → [0, 1], as

F_X(x) = P(ω ∈ Ω | X(ω) ≤ x) =: P(X ≤ x).   (2.3)

In this way (R, B(R), F_X) becomes a probability space. The distribution function F_X(x) of a random variable has the properties that lim_{x→−∞} F_X(x) = 0 and lim_{x→+∞} F_X(x) = 1, and it is right continuous.

Definition 2.3.4. A random variable X with values on R is called discrete if it takes values in some countable subset {x_0, x_1, x_2, ...} of R, i.e. P(X = x) ≠ 0 only for x = x_0, x_1, ....

With a discrete random variable we can associate the probability mass function p_k = P(X = x_k). We will consider nonnegative integer valued discrete random variables, in which case p_k = P(X = k), k = 0, 1, 2, ....

Example 2.3.5. The Poisson random variable is the nonnegative integer valued random variable with probability mass function

p_k = P(X = k) = (λ^k / k!) e^{−λ},  k = 0, 1, 2, ...,

where λ > 0.

Example 2.3.6. The binomial random variable is the nonnegative integer valued random variable with probability mass function

p_k = P(X = k) = (N! / (k!(N − k)!)) p^k q^{N−k},  k = 0, 1, 2, ..., N,

where p ∈ (0, 1) and q = 1 − p.

Definition 2.3.7. A random variable X with values on R is called continuous if P(X = x) = 0 for all x ∈ R.

Let (Ω, F, P) be a probability space and let X : Ω → R be a random variable with distribution F_X. This is a probability measure on B(R). We will assume that it is absolutely continuous with respect to the Lebesgue measure with density ρ_X: F_X(dx) = ρ(x) dx. We will call the density ρ(x) the probability density function (PDF) of the random variable X.

Example 2.3.8.
i. The exponential random variable has PDF

f(x) = λe^{−λx} for x > 0 and f(x) = 0 for x < 0,

with λ > 0.
ii. The uniform random variable has PDF

f(x) = 1/(b − a) for a < x < b and f(x) = 0 for x ∉ (a, b),

with a < b.

Definition 2.3.9. Two random variables X and Y are independent if the events {ω ∈ Ω | X(ω) ≤ x} and {ω ∈ Ω | Y(ω) ≤ y} are independent for all x, y ∈ R.

Let X, Y be two continuous random variables. We can view them as a random vector, i.e. a random variable from Ω to R². We can then define the joint distribution function

F(x, y) = P(X ≤ x, Y ≤ y).

The mixed derivative of the distribution function, f_{X,Y}(x, y) := ∂²F/∂x∂y (x, y), if it exists, is called the joint PDF of the random vector {X, Y}:

F_{X,Y}(x, y) = ∫_{−∞}^x ∫_{−∞}^y f_{X,Y}(x, y) dx dy.

If the random variables X and Y are independent, then F_{X,Y}(x, y) = F_X(x)F_Y(y) and f_{X,Y}(x, y) = f_X(x)f_Y(y). The joint distribution function has the properties

F_{X,Y}(x, y) = F_{Y,X}(y, x),  F_{X,Y}(+∞, y) = F_Y(y),  f_Y(y) = ∫_{−∞}^{+∞} f_{X,Y}(x, y) dx.

We can extend the above definitions to random vectors of arbitrary finite dimensions. Let X be a random variable from (Ω, F, µ) to (R^d, B(R^d)). The (joint) distribution function F_X : R^d → [0, 1] is defined as F_X(x) = P(X ≤ x). When a PDF exists, dF_X(x) = f_X(x) dx, we have

F_X(x) := P(X ≤ x) = ∫_{−∞}^{x_1} ··· ∫_{−∞}^{x_d} f_X(x) dx.

Let X be a random variable in R^N with distribution function f(x_N), where x_N = {x_1, ..., x_N}. We define the marginal or reduced distribution function f^{N−1}(x_{N−1}) by

f^{N−1}(x_{N−1}) = ∫_R f^N(x_N) dx_N.

We can define other reduced distribution functions:

f^{N−2}(x_{N−2}) = ∫_R f^{N−1}(x_{N−1}) dx_{N−1} = ∫_R ∫_R f(x_N) dx_{N−1} dx_N.

2.3.1 Expectation of Random Variables

We can use the distribution of a random variable to compute expectations and probabilities:

E[f(X)] = ∫_R f(x) dF_X(x)   (2.4)

and

P[X ∈ G] = ∫_G dF_X(x),  G ∈ B(E).   (2.5)

The above formulas apply to both discrete and continuous random variables, provided that we define the integrals in (2.4) and (2.5) appropriately. When E = R^d then by L^p(Ω; R^d), or sometimes L^p(Ω; µ) or even simply L^p(µ), we mean the Banach space of measurable functions on Ω with norm

‖X‖_{L^p} = (E|X|^p)^{1/p}.
Let X be a nonnegative integer valued random variable with probability mass function p_k. We can compute the expectation of an arbitrary function of X using the formula

E(f(X)) = Σ_{k=0}^∞ f(k) p_k.

Let X, Y be random variables. We want to know whether they are correlated and, if they are, to calculate how correlated they are. We define the covariance of the two random variables as

cov(X, Y) = E[(X − EX)(Y − EY)] = E(XY) − (EX)(EY).

The correlation coefficient is

ρ(X, Y) = cov(X, Y) / (√var(X) √var(Y)).   (2.6)

The Cauchy-Schwarz inequality yields that ρ(X, Y) ∈ [−1, 1]. We will say that two random variables X and Y are uncorrelated provided that ρ(X, Y) = 0. It is not true in general that two uncorrelated random variables are independent. This is true, however, for Gaussian random variables (see Exercise 5).
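A standard counterexample behind the last remark is Y = X² for X symmetric about zero: cov(X, Y) = E X³ = 0, yet Y is a function of X. A quick Monte Carlo sketch, where the choice of X uniform on [−1, 1] is ours:

```python
import random
import math

# X uniform on [-1, 1] and Y = X^2 are uncorrelated but clearly dependent.
rng = random.Random(42)
n = 100_000
xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
ys = [x * x for x in xs]

mx = sum(xs) / n
my = sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var_x = sum((x - mx) ** 2 for x in xs) / n
var_y = sum((y - my) ** 2 for y in ys) / n
rho = cov / math.sqrt(var_x * var_y)

# rho is close to 0, while dependence is obvious:
# P(Y <= 1/4 | |X| <= 1/2) = 1, whereas P(Y <= 1/4) = 1/2.
print(round(rho, 3))
```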
Example 2.3.10.
i. Consider the random variable X : Ω → R with pdf

γ_{σ,b}(x) := (2πσ)^{−1/2} exp(−(x − b)²/(2σ)).

Such an X is termed a Gaussian or normal random variable. The mean is

EX = ∫_R x γ_{σ,b}(x) dx = b

and the variance is

E(X − b)² = ∫_R (x − b)² γ_{σ,b}(x) dx = σ.   (2.7)

Since the mean and variance specify completely a Gaussian random variable on R, the Gaussian is commonly denoted by N(m, σ). The standard normal random variable is N(0, 1).

ii. Let b ∈ R^d and Σ ∈ R^{d×d} be symmetric and positive definite. The random variable X : Ω → R^d with pdf

γ_{Σ,b}(x) := ((2π)^d det Σ)^{−1/2} exp(−(1/2)⟨Σ^{−1}(x − b), (x − b)⟩)

is termed a multivariate Gaussian or normal random variable. The mean is

E(X) = b   (2.8)

and the covariance matrix is

E((X − b) ⊗ (X − b)) = Σ.

Since the mean and covariance matrix completely specify a Gaussian random variable on R^d, the Gaussian is commonly denoted by N(m, Σ). Some analytical calculations for Gaussian random variables will be presented in Section 2.6.

2.4 Conditional Expectation

Assume that X ∈ L¹(Ω, F, µ) and let G be a sub-σ-algebra of F. The conditional expectation of X with respect to G is defined to be the function (random variable) E[X|G] : Ω → E which is G-measurable and satisfies

∫_G E[X|G] dµ = ∫_G X dµ  ∀ G ∈ G.

We can define E[f(X)|G] and the conditional probability P[X ∈ F|G] = E[I_F(X)|G], where I_F is the indicator function of F, in a similar manner.

We list some of the most important properties of conditional expectation.

Theorem 2.4.1 (Properties of conditional expectation). Let (Ω, F, µ) be a probability space and let G be a sub-σ-algebra of F.
(a) If X is G-measurable and integrable then E(X|G) = X.
(b) (Linearity) If X_1, X_2 are integrable and c_1, c_2 constants, then E(c_1X_1 + c_2X_2|G) = c_1E(X_1|G) + c_2E(X_2|G).
(c) (Order) If X_1, X_2 are integrable and X_1 ≤ X_2 a.s., then E(X_1|G) ≤ E(X_2|G) a.s.
(d) If Y and XY are integrable, and X is G-measurable, then E(XY|G) = X E(Y|G).
(e) (Successive smoothing) If D is a sub-σ-algebra of F, D ⊂ G and X is integrable, then E(X|D) = E[E(X|G)|D] = E[E(X|D)|G].
(f) (Convergence) Let {X_n}_{n=1}^∞ be a sequence of random variables such that, for all n, |X_n| ≤ Z, where Z is integrable. If X_n → X a.s., then E(X_n|G) → E(X|G) a.s. and in L¹.

2.5 The Characteristic Function

Many of the properties of (sums of) random variables can be studied using the Fourier transform of the distribution function. Let F(λ) be the distribution function of a (discrete or continuous) random variable X. The characteristic function of X is defined to be the Fourier transform of the distribution function:

φ(t) = ∫_R e^{itλ} dF(λ) = E(e^{itX}).   (2.9)

For a continuous random variable for which the distribution function F has a density, dF(λ) = p(λ) dλ, (2.9) gives

φ(t) = ∫_R e^{itλ} p(λ) dλ.

For a discrete random variable for which P(X = λ_k) = α_k, (2.9) gives

φ(t) = Σ_{k=0}^∞ e^{itλ_k} α_k.

From the properties of the Fourier transform we conclude that the characteristic function determines uniquely the distribution function of the random variable, in the sense that there is a one-to-one correspondence between F(λ) and φ(t). Furthermore, in the exercises at the end of the chapter the reader is asked to prove the following two results.

Lemma 2.5.1. Let {X_1, X_2, ..., X_n} be independent random variables with characteristic functions φ_j(t), j = 1, ..., n, and let Y = Σ_{j=1}^n X_j with characteristic function φ_Y(t). Then

φ_Y(t) = Π_{j=1}^n φ_j(t).

Lemma 2.5.2. Let X be a random variable with characteristic function φ(t) and assume that it has finite moments. Then

E(X^k) = (1/i^k) φ^{(k)}(0).
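Lemma 2.5.2 can also be checked numerically. The following sketch builds the characteristic function of a Poisson random variable directly from its probability mass function and recovers the first moment by a finite-difference derivative at t = 0; the value of λ and the truncation level K are arbitrary choices for the example:

```python
import cmath
from math import exp, factorial

# Characteristic function of a Poisson(lam) variable computed directly
# from its probability mass function: phi(t) = sum_k e^{itk} p_k.
lam = 2.0
K = 100  # series truncation; the Poisson(2) tail beyond k = 100 is negligible

def phi(t):
    return sum(cmath.exp(1j * t * k) * lam**k * exp(-lam) / factorial(k)
               for k in range(K))

# Lemma 2.5.2 with k = 1: E X = phi'(0) / i.  Use a central difference.
h = 1e-5
first_moment = ((phi(h) - phi(-h)) / (2 * h) / 1j).real
print(round(first_moment, 4))
```

The recovered first moment agrees with the known mean E X = λ of the Poisson distribution.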
2.6 Gaussian Random Variables

In this section we present some useful calculations for Gaussian random variables. In particular, we calculate the normalization constant, the mean and variance, and the characteristic function of multidimensional Gaussian random variables.

Theorem 2.6.1. Let b ∈ R^d and Σ ∈ R^{d×d} a symmetric and positive definite matrix. Let X be the multivariate Gaussian random variable with probability density function

γ_{Σ,b}(x) = (1/Z) exp(−(1/2)⟨Σ^{−1}(x − b), x − b⟩).

Then
i. the normalization constant is

Z = (2π)^{d/2} √(det(Σ));

ii. the mean vector and covariance matrix of X are given by

EX = b  and  E((X − EX) ⊗ (X − EX)) = Σ;

iii. the characteristic function of X is

φ(t) = e^{i⟨b,t⟩ − (1/2)⟨t,Σt⟩}.

Proof. i. From the spectral theorem for symmetric positive definite matrices we have that there exists a diagonal matrix Λ with positive entries and an orthogonal matrix B such that Σ^{−1} = B^T Λ^{−1} B. Let z = x − b and y = Bz. We have

⟨Σ^{−1}z, z⟩ = ⟨B^T Λ^{−1} B z, z⟩ = ⟨Λ^{−1} B z, B z⟩ = ⟨Λ^{−1} y, y⟩ = Σ_{i=1}^d λ_i^{−1} y_i².
Furthermore, since Σ^{−1} = B^T Λ^{−1} B, we have that det(Σ^{−1}) = Π_{i=1}^d λ_i^{−1}, that det(Σ) = Π_{i=1}^d λ_i, and that the Jacobian of an orthogonal transformation is J = det(B) = 1. Hence,

∫_{R^d} exp(−(1/2)⟨Σ^{−1}(x − b), x − b⟩) dx = ∫_{R^d} exp(−(1/2)⟨Σ^{−1}z, z⟩) dz
= ∫_{R^d} exp(−(1/2) Σ_{i=1}^d λ_i^{−1} y_i²) |J| dy
= Π_{i=1}^d ∫_R exp(−(1/2) λ_i^{−1} y_i²) dy_i
= (2π)^{d/2} Π_{i=1}^d λ_i^{1/2} = (2π)^{d/2} √(det(Σ)),

from which we get that Z = (2π)^{d/2} √(det(Σ)). In the above calculation we have used the elementary calculus identity

∫_R e^{−αx²/2} dx = √(2π/α).

ii. From the above calculation we have that

γ_{Σ,b}(B^T y + b) = ((2π)^{d/2} √(det(Σ)))^{−1} Π_{i=1}^d exp(−(1/2) λ_i^{−1} y_i²).

Consequently,

EX = ∫_{R^d} x γ_{Σ,b}(x) dx = ∫_{R^d} (B^T y + b) γ_{Σ,b}(B^T y + b) dy = b ∫_{R^d} γ_{Σ,b}(B^T y + b) dy = b.

We
calculate

E((X_i − b_i)(X_j − b_j)) = ∫_{R^d} z_i z_j γ_{Σ,b}(z + b) dz
= ((2π)^{d/2} √(det(Σ)))^{−1} Σ_{k,m} B_{ki} B_{mj} ∫_{R^d} y_k y_m exp(−(1/2)⟨Λ^{−1}y, y⟩) dy
= Σ_{k,m} B_{ki} B_{mj} λ_k δ_{km} = Σ_{ij}.

iii. Let Y be a multivariate Gaussian random variable with mean 0 and covariance I. Let also C = B^T √Λ, so that C C^T = B^T Λ B = Σ. We have that X = C Y + b. To see this, we first note that X is Gaussian, since it is given through a linear transformation of a Gaussian random variable. Furthermore, EX = b and E((X_i − b_i)(X_j − b_j)) = Σ_{ij}. Now we have

φ(t) = E e^{i⟨X,t⟩} = e^{i⟨b,t⟩} E e^{i⟨CY,t⟩} = e^{i⟨b,t⟩} E e^{i⟨Y,C^T t⟩}
= e^{i⟨b,t⟩} e^{−(1/2)|C^T t|²} = e^{i⟨b,t⟩} e^{−(1/2)⟨C C^T t, t⟩} = e^{i⟨b,t⟩} e^{−(1/2)⟨t, Σt⟩}.

Consequently,

φ(t) = e^{i⟨b,t⟩ − (1/2)⟨t,Σt⟩}.
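The construction X = CY + b used in part (iii) is also a practical recipe for sampling from N(b, Σ). A sketch in dimension d = 2, where a Cholesky factor is used in place of the spectral factor (any C with CC^T = Σ works), and the values of b and Σ are illustrative choices:

```python
import random
import math

# Sample X = C Y + b with Y standard normal in R^2 and C C^T = Sigma,
# then verify the empirical mean and cross-covariance.
b = [1.0, -2.0]
Sigma = [[2.0, 0.6], [0.6, 1.0]]

# 2x2 Cholesky factor C = [[c11, 0], [c21, c22]] with C C^T = Sigma.
c11 = math.sqrt(Sigma[0][0])
c21 = Sigma[1][0] / c11
c22 = math.sqrt(Sigma[1][1] - c21**2)

rng = random.Random(7)
n = 200_000
samples = []
for _ in range(n):
    y1, y2 = rng.gauss(0, 1), rng.gauss(0, 1)
    samples.append((b[0] + c11 * y1, b[1] + c21 * y1 + c22 * y2))

m1 = sum(x for x, _ in samples) / n
m2 = sum(y for _, y in samples) / n
cov12 = sum((x - m1) * (y - m2) for x, y in samples) / n
print(round(m1, 2), round(m2, 2), round(cov12, 2))
```

The empirical mean approaches b and the empirical covariance approaches Σ, in agreement with part (ii) of the theorem.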
2.7 Types of Convergence and Limit Theorems

One of the most important aspects of the theory of random variables is the study of limit theorems for sums of random variables. The most well known limit theorems in probability theory are the law of large numbers and the central limit theorem. There are various different types of convergence for sequences of random variables. We list the most important types of convergence below.

Definition 2.7.1. Let {Z_n}_{n=1}^∞ be a sequence of random variables. We will say that
(a) Z_n converges to Z with probability one if

P(lim_{n→+∞} Z_n = Z) = 1;

(b) Z_n converges to Z in probability if for every ε > 0

lim_{n→+∞} P(|Z_n − Z| > ε) = 0;

(c) Z_n converges to Z in L^p if

lim_{n→+∞} E|Z_n − Z|^p = 0;

(d) let F_n(λ), n = 1, ..., +∞, and F(λ) be the distribution functions of Z_n, n = 1, ..., +∞, and Z, respectively; then Z_n converges to Z in distribution if

lim_{n→+∞} F_n(λ) = F(λ)

for all λ ∈ R at which F is continuous.

Recall that the distribution function F_X of a random variable from a probability space (Ω, F, P) to R induces a probability measure on R and that (R, B(R), F_X) is a probability space. We can show that convergence in distribution is equivalent to weak convergence of the probability measures induced by the distribution functions.

Definition 2.7.2. Let (E, d) be a metric space, B(E) the σ-algebra of its Borel sets, P_n a sequence of probability measures on (E, B(E)) and let C_b(E) denote the space of bounded continuous functions on E. We will say that the sequence P_n converges weakly to the probability measure P if, for each f ∈ C_b(E),

lim_{n→+∞} ∫_E f(x) dP_n(x) = ∫_E f(x) dP(x).

Theorem 2.7.3. Let F_n(λ), n = 1, ..., +∞, and F(λ) be the distribution functions of Z_n, n = 1, ..., +∞
and Z, respectively. Then Z_n converges to Z in distribution if and only if, for all g ∈ C_b(R),

lim_{n→+∞} ∫ g(x) dF_n(x) = ∫ g(x) dF(x).   (2.10)

Notice that (2.10) is equivalent to

lim_{n→+∞} E_n g(X_n) = E g(X),

where E_n and E denote the expectations with respect to F_n and F, respectively.

When the sequence of random variables whose convergence we are interested in takes values in R^d or, more generally, a metric space (E, d), then we can use weak convergence of the sequence of probability measures induced by the sequence of random variables to define convergence in distribution.

Definition 2.7.4. A sequence of real valued random variables X_n, defined on probability spaces (Ω_n, F_n, P_n) and taking values on a metric space (E, d), is said to converge in distribution if the induced measures F_n(B) = P_n(X_n ∈ B) for B ∈ B(E) converge weakly to a probability measure P.

The strong law of large numbers provides us with information about the behavior of a sum of random variables (or, equivalently, of a large number of repetitions of the same experiment) on average. Let {X_n}_{n=1}^∞ be iid random variables with EX_n = V. Then the strong law of large numbers states that the average of the sum of the iid random variables converges to V with probability one:

P(lim_{N→+∞} (1/N) Σ_{n=1}^N X_n = V) = 1.

We can also study fluctuations around the average behavior. Let E(X_n − V)² = σ². Define the centered iid random variables Y_n = X_n − V. Then the sequence of random variables (1/(σ√N)) Σ_{n=1}^N Y_n converges in distribution to a N(0, 1) random variable:

lim_{N→+∞} P((1/(σ√N)) Σ_{n=1}^N Y_n ≤ a) = ∫_{−∞}^a (1/√(2π)) e^{−x²/2} dx.

This is the central limit theorem.
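Both limit theorems are easy to observe numerically. A minimal sketch with the ±1 coin-flip variables of Chapter 1 (so V = 0 and σ = 1); the sample sizes are arbitrary choices:

```python
import random
import math

rng = random.Random(123)

# LLN: the running average of N iid +-1 flips should be close to V = 0.
N = 100_000
xs = [rng.choice((-1, 1)) for _ in range(N)]
average = sum(xs) / N

# CLT: S_M / (sigma sqrt(M)), over many independent repetitions, should be
# approximately N(0, 1); we check that its sample variance is close to 1.
reps = 2000
M = 500
zs = []
for _ in range(reps):
    s = sum(rng.choice((-1, 1)) for _ in range(M))
    zs.append(s / math.sqrt(M))
var_z = sum(z * z for z in zs) / reps - (sum(zs) / reps) ** 2
print(round(average, 2), round(var_z, 1))
```

Increasing N and `reps` shows the rates of convergence asked about in Exercise 8 below.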
2.8 Discussion and Bibliography

The material of this chapter is very standard and can be found in many books on probability theory. Well known textbooks on probability theory are [8, 23, 24, 48, 53, 56, 57, 90]. The study of limit theorems is one of the cornerstones of probability theory and of the theory of stochastic processes. A comprehensive study of limit theorems can be found in [43]. The connection between conditional expectation and orthogonal projections is discussed in [13]. The reduced distribution functions defined in Section 2.3 are used extensively in statistical mechanics; a different normalization is usually used in physics textbooks, see for instance [2, Sec. 2.2]. The calculations presented in Section 2.6 are essentially an exercise in linear algebra; see [53, Sec. 10.2]. Random variables and probability measures can also be defined in infinite dimensions. More information can be found in [75, Ch. 2].

2.9 Exercises

1. Show that the intersection of a family of σ-algebras is a σ-algebra.
2. Prove the law of total probability, Proposition 2.2.13.
3. Calculate the mean, variance and characteristic function of the following probability density functions.
(a) The exponential distribution with density

f(x) = λe^{−λx} for x > 0 and f(x) = 0 for x < 0,

with λ > 0.
(b) The uniform distribution with density

f(x) = 1/(b − a) for a < x < b and f(x) = 0 for x ∉ (a, b),

with a < b.
(c) The Gamma distribution with density

    f(x) = (λ/Γ(α)) (λx)^{α−1} e^{−λx} for x > 0, and f(x) = 0 for x < 0,

with λ > 0, α > 0, where Γ(α) is the Gamma function,

    Γ(α) = ∫_0^∞ ξ^{α−1} e^{−ξ} dξ, α > 0.

4. Let X and Y be independent random variables with distribution functions F_X and F_Y. Show that the distribution function of the sum Z = X + Y is the convolution of F_X and F_Y:

    F_Z(x) = ∫ F_X(x − y) dF_Y(y).

5. Let X and Y be jointly Gaussian random variables. Show that they are uncorrelated if and only if they are independent.

6. (a) Let X be a continuous random variable with characteristic function φ(t). Show that

    EX^k = (1/i^k) φ^{(k)}(0),

where φ^{(k)}(t) denotes the k-th derivative of φ evaluated at t.

(b) Let X be a nonnegative random variable with distribution function F(x). Show that

    E(X) = ∫_0^{+∞} (1 − F(x)) dx.

(c) Let X be a continuous random variable with probability density function f(x) and characteristic function φ(t). Find the probability density and characteristic function of the random variable Y = aX + b, with a, b ∈ R.

(d) Let X be a random variable with uniform distribution on [0, 2π]. Find the probability density of the random variable Y = sin(X).

7. Let X be a discrete random variable taking values on the set of nonnegative integers, with probability mass function p_k = P(X = k), p_k ≥ 0, Σ_{k=0}^{+∞} p_k = 1. The generating function is defined as

    g(s) = E(s^X) = Σ_{k=0}^{+∞} p_k s^k.
(a) Show that EX = g′(1) and EX² = g″(1) + g′(1), where the prime denotes differentiation.

(b) Calculate the generating function of the Poisson random variable with

    p_k = P(X = k) = e^{−λ} λ^k / k!, k = 0, 1, 2, . . . ,

and λ > 0.

(c) Prove that the generating function of a sum of independent nonnegative integer valued random variables is the product of their generating functions.

8. Write a computer program for studying the law of large numbers and the central limit theorem. Investigate numerically the rate of convergence of these two theorems.

9. Prove Theorem 2.4.

10. Study the properties of Gaussian measures on separable Hilbert spaces from [75, Ch. 2].
Chapter 3

Basics of the Theory of Stochastic Processes

3.1 Introduction

In this chapter we present some basic results from the theory of stochastic processes and we investigate the properties of some of the standard stochastic processes in continuous time. In Section 3.2 we give the definition of a stochastic process. In Section 3.3 we present some properties of stationary stochastic processes. In Section 3.4 we introduce Brownian motion and study some of its properties. Various examples of stochastic processes in continuous time are presented in Section 3.5. The Karhunen-Loève expansion, one of the most useful tools for representing stochastic processes and random fields, is presented in Section 3.6. Further discussion and bibliographical comments are presented in Section 3.7. Section 3.8 contains exercises.

3.2 Definition of a Stochastic Process

Stochastic processes describe dynamical systems whose evolution law is of probabilistic nature. The precise definition is given below.

Definition 3.2.1. Let T be an ordered set, (Ω, F, P) a probability space and (E, G) a measurable space. A stochastic process is a collection of random variables X = {X_t; t ∈ T} where, for each fixed t ∈ T, X_t is a random variable from (Ω, F, P) to (E, G). Ω is called the sample space and E is the state space of the stochastic process X_t.

The set T can be either discrete, for example the set of positive integers Z⁺, or continuous, T = [0, +∞). The state space E will usually be R^d equipped with the σ-algebra of Borel sets.
A stochastic process X may be viewed as a function of both t ∈ T and ω ∈ Ω. We will sometimes write X(t), X(t, ω) or X_t(ω) instead of X_t. For a fixed sample point ω ∈ Ω, the function X_t(ω) : T → E is called a sample path (realization, trajectory) of the process X.

Definition 3.2.2. The finite dimensional distributions (fdd) of a stochastic process are the distributions of the E^k-valued random variables (X(t₁), X(t₂), . . . , X(t_k)) for arbitrary positive integer k and arbitrary times t_i ∈ T, i ∈ {1, . . . , k}:

    F(x) = P(X(t_i) ≤ x_i, i = 1, . . . , k),

with x = (x₁, . . . , x_k).

Definition 3.2.3. We will say that two processes X_t and Y_t are equivalent if they have the same finite dimensional distributions.

From experiments or numerical simulations we can only obtain information about the finite dimensional distributions of a process. A natural question arises: are the finite dimensional distributions of a stochastic process sufficient to determine a stochastic process uniquely? This is true for processes with continuous paths¹. This is the class of stochastic processes that we will study in these notes.

¹In fact, what we need is for the stochastic process to be separable. See the discussion in Section 3.7.

Definition 3.2.4. A one dimensional Gaussian process is a continuous time stochastic process for which E = R and all the finite dimensional distributions are Gaussian, i.e. every finite dimensional vector (X_{t₁}, X_{t₂}, . . . , X_{t_k}) is a N(μ_k, K_k) random variable for some vector μ_k and a symmetric nonnegative definite matrix K_k, for all k = 1, 2, . . . and for all t₁, t₂, . . . , t_k.

It is straightforward to extend the above definition to arbitrary dimensions. From the above definition we conclude that the finite dimensional distributions of a Gaussian continuous time stochastic process are Gaussian with probability density function

    γ_{μ_k, K_k}(x) = (2π)^{−k/2} (det K_k)^{−1/2} exp( −(1/2) ⟨ K_k^{−1}(x − μ_k), x − μ_k ⟩ ),

where x = (x₁, . . . , x_k).
The first two moments of a Gaussian process are sufficient for a complete characterization of the process. A Gaussian process x(t) is characterized by its mean

    m(t) := E x(t)

and the covariance (or autocorrelation) matrix

    C(t, s) = E( (x(t) − m(t)) ⊗ (x(s) − m(s)) ).

3.3 Stationary Processes

3.3.1 Strictly Stationary Processes

In many stochastic processes that appear in applications their statistics remain invariant under time translations. Such stochastic processes are called stationary. It is possible to develop a quite general theory for stochastic processes that enjoy this symmetry property.

Definition 3.3.1. A stochastic process is called (strictly) stationary if all finite dimensional distributions are invariant under time translation: for any integer k and times t_i ∈ T, the distribution of (X(t₁), X(t₂), . . . , X(t_k)) is equal to that of (X(s+t₁), X(s+t₂), . . . , X(s+t_k)) for any s such that s + t_i ∈ T for all i ∈ {1, . . . , k}. In other words,

    P(X_{t₁+t} ∈ A₁, X_{t₂+t} ∈ A₂, . . . , X_{t_k+t} ∈ A_k) = P(X_{t₁} ∈ A₁, X_{t₂} ∈ A₂, . . . , X_{t_k} ∈ A_k), ∀t ∈ T.

Example 3.3.2. Let Y₀, Y₁, . . . be a sequence of independent, identically distributed random variables and consider the stochastic process X_n = Y_n. Then X_n is a strictly stationary process (see Exercise 1). Assume furthermore that EY₀ = μ < +∞. Then, by the strong law of large numbers, we have that

    (1/N) Σ_{j=0}^{N−1} X_j = (1/N) Σ_{j=0}^{N−1} Y_j → EY₀ = μ,

almost surely. In fact, Birkhoff's ergodic theorem states that, for any function f such that E f(Y₀) < +∞, we have that

    lim_{N→+∞} (1/N) Σ_{j=0}^{N−1} f(X_j) = E f(Y₀),    (3.1)

almost surely. The sequence of iid random variables is an example of an ergodic strictly stationary process.
Ergodic strictly stationary processes satisfy (3.1). Hence, we can calculate the statistics of a stochastic process X_n using a single sample path, provided that it is long enough (N ≫ 1).

Example 3.3.3. Let Z be a random variable and define the stochastic process X_n = Z, n = 0, 1, 2, . . .. Then X_n is a strictly stationary process (see Exercise 2). We can calculate the long time average of this stochastic process:

    (1/N) Σ_{j=0}^{N−1} X_j = (1/N) Σ_{j=0}^{N−1} Z = Z,

which is independent of N and does not converge to the mean of the stochastic process, EX_n = EZ (assuming that it is finite), or to any other deterministic number. This is an example of a non-ergodic process.

3.3.2 Second Order Stationary Processes

Let (Ω, F, P) be a probability space. Let X_t, t ∈ T (with T = R or Z), be a real-valued random process on this probability space with finite second moment, E|X_t|² < +∞ (i.e. X_t ∈ L²(Ω, P) for all t ∈ T). Assume that it is strictly stationary. Then,

    E(X_{t+s}) = EX_t, s ∈ T,    (3.2)

from which we conclude that EX_t is constant, and

    E((X_{t₁+s} − μ)(X_{t₂+s} − μ)) = E((X_{t₁} − μ)(X_{t₂} − μ)), s ∈ T,    (3.3)

from which we conclude that the covariance or autocorrelation or correlation function C(t, s) = E((X_t − μ)(X_s − μ)) depends on the difference between the two times, t and s: C(t, s) = C(t − s). This motivates the following definition.

Definition 3.3.4. A stochastic process X_t ∈ L² is called second-order stationary or wide-sense stationary or weakly stationary if the first moment EX_t is a constant and the covariance function E((X_t − μ)(X_s − μ)) depends only on the difference t − s:

    EX_t = μ, E((X_t − μ)(X_s − μ)) = C(t − s).

The constant μ is the expectation of the process X_t. Without loss of generality, we can set μ = 0, since if EX_t = μ then the process Y_t = X_t − μ is mean zero. A mean zero process will be called a centered process. The function C(t) is the covariance (sometimes also called autocovariance) or the autocorrelation function of X_t. Notice that C(t) = E(X_t X_0), whereas C(0) = E(X_t²), which is finite by assumption. Since we have assumed that X_t is a real valued process, we have that C(t) = C(−t), t ∈ R.

Let X_t be a strictly stationary stochastic process with finite second moment (i.e. X_t ∈ L²). The definition of strict stationarity implies that EX_t = μ, a constant, and E((X_t − μ)(X_s − μ)) = C(t − s). Hence, a strictly stationary process with finite second moment is also stationary in the wide sense. The converse is not true.

Example 3.3.5. Let Y₀, Y₁, . . . be a sequence of independent, identically distributed random variables and consider the stochastic process X_n = Y_n. From Example 3.3.2 we know that this is a strictly stationary process, irrespective of whether Y₀ is such that EY₀² < +∞. Assume now that EY₀ = 0 and EY₀² = σ² < +∞. Then X_n is a second order stationary process with mean zero and correlation function R(k) = σ² δ_{k0}. Notice that in this case we have no correlation between the values of the stochastic process at different times n and k.

Example 3.3.6. Let Z be a single random variable and consider the stochastic process X_n = Z, n = 0, 1, 2, . . .. From Example 3.3.3 we know that this is a strictly stationary process irrespective of whether EZ² < +∞ or not. Assume now that EZ = 0, EZ² = σ². Then X_n becomes a second order stationary process with R(k) = σ². Notice that in this case the values of our stochastic process at different times are strongly correlated.

Remark 3.3.7. The first two moments of a Gaussian process are sufficient for a complete characterization of the process. Consequently, a Gaussian stochastic process is strictly stationary if and only if it is weakly stationary.

Remark 3.3.8. We will see in Section 3.3.3 that, for second order stationary processes, ergodicity is related to fast decay of correlations. In the first of the examples above, there was no correlation between our stochastic processes at different times and the stochastic process is ergodic. On the contrary, in our second example there is very strong correlation between the stochastic process at different times and this process is not ergodic.
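The contrast between the ergodic and the non-ergodic example can be seen directly in simulation. The following sketch (sample sizes and the Gaussian choice of distribution are illustrative assumptions) compares the time average along a single sample path in the two cases:

```python
import numpy as np

# Time averages for the two examples in the text: X_n = Y_n (iid, ergodic)
# versus X_n = Z (one random variable repeated, non-ergodic).
rng = np.random.default_rng(1)
N = 50_000

# Ergodic case: the time average of iid N(0,1) samples converges to EY_0 = 0.
iid_path = rng.standard_normal(N)
iid_time_avg = iid_path.mean()

# Non-ergodic case: every sample path is constant, so the time average
# equals Z itself and does not converge to EZ = 0 as N grows.
Z = rng.standard_normal()
const_path = np.full(N, Z)
const_time_avg = const_path.mean()
```

Repeating the experiment with different seeds shows that `iid_time_avg` concentrates around 0 while `const_time_avg` remains a random quantity with variance σ² for every N.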
Continuity properties of the covariance function are equivalent to continuity properties of the paths of X_t in the L² sense, i.e.

    lim_{h→0} E|X_{t+h} − X_t|² = 0.

Lemma 3.3.9. Assume that the covariance function C(t) of a second order stationary process is continuous at t = 0. Then it is continuous for all t ∈ R. Furthermore, the continuity of C(t) is equivalent to the continuity of the process X_t in the L² sense.

Proof. Fix t ∈ R and (without loss of generality) set EX_t = 0. We calculate:

    |C(t + h) − C(t)|² = |E(X_{t+h} X_0) − E(X_t X_0)|² = |E((X_{t+h} − X_t) X_0)|²
    ≤ E(X_0²) E(X_{t+h} − X_t)² = C(0)( EX_{t+h}² + EX_t² − 2EX_t X_{t+h} )
    = 2C(0)( C(0) − C(h) ) → 0,

as h → 0. Thus, continuity of C(·) at 0 implies continuity for all t.

Assume now that C(t) is continuous. From the above calculation we have

    E|X_{t+h} − X_t|² = 2( C(0) − C(h) ),    (3.4)

which converges to 0 as h → 0. Conversely, assume that X_t is L² continuous. Then, from the above equation we get lim_{h→0} C(h) = C(0). Hence, the continuity of C(t) is equivalent to the continuity of the process X_t in the L² sense.

Notice that from (3.4) we immediately conclude that C(0) ≥ C(h), h ∈ R.

The Fourier transform of the covariance function of a second order stationary process always exists. This enables us to study second order stationary processes using tools from Fourier analysis. To make the link between second order stationary processes and Fourier analysis we will use Bochner's theorem, which applies to all nonnegative definite functions.

Definition 3.3.10. A function f(x) : R → R is called nonnegative definite if

    Σ_{i,j=1}^n f(t_i − t_j) c_i c̄_j ≥ 0    (3.5)

for all n ∈ N, t₁, . . . , t_n ∈ R, c₁, . . . , c_n ∈ C.
Lemma 3.3.11. The covariance function of a second order stationary process is a nonnegative definite function.

Proof. We will use the notation X_t^c := Σ_{i=1}^n X_{t_i} c_i. We have

    Σ_{i,j=1}^n C(t_i − t_j) c_i c̄_j = Σ_{i,j=1}^n E(X_{t_i} X_{t_j}) c_i c̄_j
    = E( Σ_{i=1}^n X_{t_i} c_i Σ_{j=1}^n X_{t_j} c̄_j ) = E( X_t^c X̄_t^c ) = E|X_t^c|² ≥ 0.

Theorem 3.3.12 (Bochner). Let C(t) be a continuous positive definite function. Then there exists a unique nonnegative measure ρ on R such that ρ(R) = C(0) and

    C(t) = ∫_R e^{ixt} ρ(dx), ∀t ∈ R.    (3.6)

The measure ρ(dx) is called the spectral measure of the process X_t. In the following we will assume that the spectral measure is absolutely continuous with respect to the Lebesgue measure on R with density f(x), i.e. ρ(dx) = f(x) dx.

Definition 3.3.13. The Fourier transform f(x) of the covariance function is called the spectral density of the process:

    f(x) = (1/2π) ∫_{−∞}^∞ e^{−itx} C(t) dt.    (3.7)

Let X_t be a second order stationary process with autocorrelation function C(t) whose Fourier transform is the measure ρ(dx). From (3.6) it follows that the autocorrelation function of a mean zero, second order stationary process is given by the inverse Fourier transform of the spectral density:

    C(t) = ∫_{−∞}^∞ e^{itx} f(x) dx.

Conversely, from a time series of observations of a stationary process we can calculate the autocorrelation function and, using (3.7), the spectral density. There are various cases where the experimentally measured quantity is the spectral density (or power spectrum) of a stationary stochastic process.
The autocorrelation function of a second order stationary process enables us to associate a time scale to X_t, the correlation time τ_cor:

    τ_cor = (1/C(0)) ∫_0^∞ C(τ) dτ = ∫_0^∞ E(X_τ X_0)/E(X_0²) dτ.

The slower the decay of the correlation function, the larger the correlation time is. Notice that when the correlations do not decay sufficiently fast, so that C(t) is not integrable, the correlation time will be infinite.

Example 3.3.14. Consider a mean zero, second order stationary process with correlation function

    R(t) = R(0) e^{−α|t|},    (3.8)

where α > 0. We will write R(0) = D/α, where D > 0. The spectral density of this process is:

    f(x) = (1/2π) (D/α) ∫_{−∞}^{+∞} e^{−ixt} e^{−α|t|} dt
    = (1/2π) (D/α) ( ∫_{−∞}^0 e^{−ixt} e^{αt} dt + ∫_0^{+∞} e^{−ixt} e^{−αt} dt )
    = (1/2π) (D/α) ( 1/(−ix + α) + 1/(ix + α) )
    = (D/π) 1/(x² + α²).

This function is called the Cauchy or the Lorentz distribution. The correlation time is (we have that R(0) = D/α)

    τ_cor = ∫_0^∞ e^{−αt} dt = α^{−1}.

A Gaussian process with an exponential correlation function is of particular importance in the theory and applications of stochastic processes.

Definition 3.3.15. A real-valued Gaussian stationary process defined on R with correlation function given by (3.8) is called the (stationary) Ornstein-Uhlenbeck process.

The Ornstein-Uhlenbeck process is used as a model for the velocity of a Brownian particle. It is of interest to calculate the statistics of the position of the Brownian particle, i.e. of the integral

    X(t) = ∫_0^t Y(s) ds,    (3.9)

where Y(t) denotes the stationary OU process.
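The Fourier transform computation in Example 3.3.14 can be verified numerically by direct quadrature (a sketch; the parameter values and grid sizes are our own choices):

```python
import numpy as np

# Numerical check of Example 3.3.14: the correlation function
# R(t) = (D/alpha) e^{-alpha |t|} has spectral density
# f(x) = D / (pi (x^2 + alpha^2)), and correlation time tau_cor = 1/alpha.
D, alpha = 1.0, 2.0
t = np.linspace(-50.0, 50.0, 400_001)
dt = t[1] - t[0]
R = (D / alpha) * np.exp(-alpha * np.abs(t))

x = 0.7  # one test frequency; R is even, so only the cosine part survives
f_numeric = np.sum(np.cos(x * t) * R) * dt / (2.0 * np.pi)
f_exact = D / (np.pi * (x**2 + alpha**2))

# Correlation time: tau_cor = (1/C(0)) * integral_0^inf C(tau) dtau.
mask = t >= 0.0
tau_cor = np.sum(R[mask]) * dt / (D / alpha)
```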
Lemma 3.3.16. Let Y(t) denote the stationary OU process with covariance function (3.8) and set α = D = 1. Then the position process (3.9) is a mean zero Gaussian process with covariance function

    E(X(t)X(s)) = 2 min(t, s) + e^{−min(t,s)} + e^{−max(t,s)} − e^{−t−s} − 1.    (3.10)

Proof. See Exercise 8.

3.3.3 Ergodic Properties of Second-Order Stationary Processes

Second order stationary processes have nice ergodic properties, provided that the correlation between values of the process at different times decays sufficiently fast. In this case, it is possible to show that we can calculate expectations by calculating time averages. An example of such a result is the following.

Theorem 3.3.17. Let {X_t}_{t≥0} be a second order stationary process on a probability space (Ω, F, P) with mean μ and covariance R(t), and assume that R(t) ∈ L¹(0, +∞). Then

    lim_{T→+∞} E | (1/T) ∫_0^T X(s) ds − μ |² = 0.    (3.11)

For the proof of this result we will first need an elementary lemma.

Lemma 3.3.18. Let R(t) be an integrable symmetric function. Then

    ∫_0^T ∫_0^T R(t − s) dtds = 2 ∫_0^T (T − s) R(s) ds.    (3.12)

Proof. We make the change of variables u = t − s, v = t + s. The domain of integration in the t, s variables is [0, T] × [0, T]; in the u, v variables it becomes {(u, v) : u ∈ [−T, T], v ∈ [|u|, 2T − |u|]}, so that for each fixed u the variable v ranges over an interval of length 2(T − |u|). The Jacobian of the transformation is

    J = ∂(t, s)/∂(u, v) = 1/2.

The integral becomes

    ∫_0^T ∫_0^T R(t − s) dtds = ∫_{−T}^T ∫_{|u|}^{2T−|u|} R(u) J dvdu
    = ∫_{−T}^T (T − |u|) R(u) du = 2 ∫_0^T (T − u) R(u) du,

where the symmetry of the function R(u) was used in the last step.
Proof of Theorem 3.3.17. We use Lemma 3.3.18 to calculate:

    E | (1/T) ∫_0^T X_s ds − μ |² = (1/T²) E | ∫_0^T (X_s − μ) ds |²
    = (1/T²) E ∫_0^T ∫_0^T (X(t) − μ)(X(s) − μ) dtds
    = (1/T²) ∫_0^T ∫_0^T R(t − s) dtds
    = (2/T²) ∫_0^T (T − u) R(u) du
    = (2/T) ∫_0^T (1 − u/T) R(u) du ≤ (2/T) ∫_0^{+∞} R(u) du → 0,

using the dominated convergence theorem and the assumption R(·) ∈ L¹.

Assume that μ = 0 and define

    D = ∫_0^{+∞} R(t) dt,    (3.13)

which, from our assumption on R(t), is a finite quantity.² The above calculation suggests that, for T ≫ 1, we have that

    E ( ∫_0^t X(t) dt )² ≈ 2DT.

This implies that, at sufficiently long times, the mean square displacement of the integral of the ergodic second order stationary process X_t scales linearly in time, with proportionality coefficient 2D.

Assume that X_t is the velocity of a (Brownian) particle. In this case the integral of X_t,

    Z_t = ∫_0^t X_s ds,

represents the particle position. From our calculation above we conclude that EZ_t² = 2Dt, where

    D = ∫_0^∞ R(t) dt = ∫_0^∞ E(X_t X_0) dt    (3.14)

is the diffusion coefficient. Thus, one expects that at sufficiently long times and under appropriate assumptions on the correlation function, the time integral of a stationary process will approximate

²Notice however that we do not know whether it is nonzero. This requires a separate argument.
a Brownian motion with diffusion coefficient D. The diffusion coefficient is an example of a transport coefficient, and (3.14) is an example of the Green-Kubo formula: a transport coefficient can be calculated in terms of the time integral of an appropriate autocorrelation function. In the case of the diffusion coefficient we need to calculate the integral of the velocity autocorrelation function.

Example 3.3.19. Consider the stochastic process with an exponential correlation function from Example 3.3.14, and assume that this stochastic process describes the velocity of a Brownian particle. Since R(t) ∈ L¹(0, +∞), Theorem 3.3.17 applies. Furthermore, the diffusion coefficient of the Brownian particle is given by

    ∫_0^{+∞} R(t) dt = R(0) τ_cor = D/α².
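The Green-Kubo prediction EZ_T² ≈ 2(D/α²)T can be tested by Monte Carlo simulation. The sketch below (all step sizes, horizons and path counts are illustrative assumptions) evolves the OU velocity with its exact one-step transition and integrates it to get the position:

```python
import numpy as np

# Monte Carlo check of E Z_T^2 ~ 2 D_eff T with D_eff = D / alpha^2,
# for the OU velocity with correlation R(t) = (D/alpha) e^{-alpha |t|}.
rng = np.random.default_rng(2)
D, alpha = 1.0, 1.0
dt, T, n_paths = 0.01, 50.0, 2_000
n_steps = int(T / dt)

a = np.exp(-alpha * dt)                      # exact OU one-step decay factor
s = np.sqrt((D / alpha) * (1.0 - a**2))      # exact one-step noise amplitude

v = np.sqrt(D / alpha) * rng.standard_normal(n_paths)  # stationary start
z = np.zeros(n_paths)                        # particle positions Z_t
for _ in range(n_steps):
    z += v * dt                              # Z_t = integral of the velocity
    v = a * v + s * rng.standard_normal(n_paths)

msd = np.mean(z**2)      # mean square displacement at time T
d_eff = D / alpha**2     # Green-Kubo value of the diffusion coefficient
```

For T ≫ 1/α the ratio `msd / (2 * d_eff * T)` approaches 1, in agreement with the transient correction visible in (3.10).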
3.4 Brownian Motion

The most important continuous time stochastic process is Brownian motion. Brownian motion is a mean zero, continuous (i.e. it has continuous sample paths: for a.e. ω ∈ Ω the function X_t is a continuous function of time) process with independent Gaussian increments. A process X_t has independent increments if for every sequence t₀ < t₁ < . . . < t_n the random variables

    X_{t₁} − X_{t₀}, X_{t₂} − X_{t₁}, . . . , X_{t_n} − X_{t_{n−1}}

are independent. If, furthermore, for any t₁, t₂, s ∈ T and Borel set B ⊂ R,

    P(X_{t₂+s} − X_{t₁+s} ∈ B) = P(X_{t₂} − X_{t₁} ∈ B),

then the process X_t has stationary independent increments.

Definition 3.4.1.

• A one dimensional standard Brownian motion W(t) : R⁺ → R is a real valued stochastic process such that

i. W(0) = 0;

ii. W(t) has independent increments;

iii. for every t > s ≥ 0, W(t) − W(s) has a Gaussian distribution with mean 0 and variance t − s. That is, the density of the random variable W(t) − W(s) is

    g(x; t, s) = ( 2π(t − s) )^{−1/2} exp( − x² / (2(t − s)) ).    (3.15)
• A d-dimensional standard Brownian motion W(t) : R⁺ → R^d is a collection of d independent one dimensional Brownian motions:

    W(t) = (W₁(t), . . . , W_d(t)),

where W_i(t), i = 1, . . . , d are independent one dimensional Brownian motions. The density of the Gaussian random vector W(t) − W(s) is thus

    g(x; t, s) = ( 2π(t − s) )^{−d/2} exp( − ‖x‖² / (2(t − s)) ).
Brownian motion is sometimes referred to as the Wiener process. Brownian motion has continuous paths. More precisely, it has a continuous modification.

Definition 3.4.2. Let X_t and Y_t, t ∈ T, be two stochastic processes defined on the same probability space (Ω, F, P). The process Y_t is said to be a modification of X_t if P(X_t = Y_t) = 1 for all t ∈ T.

Lemma 3.4.3. There is a continuous modification of Brownian motion.

This follows from a theorem due to Kolmogorov.

Theorem 3.4.4 (Kolmogorov). Let X_t, t ∈ [0, ∞) be a stochastic process on a probability space (Ω, F, P). Suppose that there are positive constants α and β, and for each T ≥ 0 there is a constant C(T) such that

    E|X_t − X_s|^α ≤ C(T) |t − s|^{1+β}, 0 ≤ s, t ≤ T.    (3.16)

Then there exists a continuous modification Y_t of the process X_t.

The proof of Lemma 3.4.3 is left as an exercise.

Remark 3.4.5. Equivalently, we could have defined the one dimensional standard Brownian motion as a stochastic process on a probability space (Ω, F, P) with continuous paths for almost all ω ∈ Ω, and Gaussian finite dimensional distributions with zero mean and covariance E(W_{t_i} W_{t_j}) = min(t_i, t_j). One can then show that Definition 3.4.1 follows from the above definition.

It is possible to prove rigorously the existence of the Wiener process (Brownian motion):
Figure 3.1: Brownian sample paths
Theorem 3.4.6 (Wiener). There exists an almost surely continuous process W_t with independent increments and W₀ = 0, such that for each t ≥ 0 the random variable W_t is N(0, t). Furthermore, W_t is almost surely locally Hölder continuous with exponent α for any α ∈ (0, 1/2).
Notice that Brownian paths are not differentiable.

We can also construct Brownian motion through the limit of an appropriately rescaled random walk: let X₁, X₂, . . . be iid random variables on a probability space (Ω, F, P) with mean 0 and variance 1. Define the discrete time stochastic process S_n with S₀ = 0 and

    S_n = Σ_{j=1}^n X_j, n ≥ 1.

Define now a continuous time stochastic process with continuous paths as the linearly interpolated, appropriately rescaled random walk:

    W_t^n = (1/√n) S_{[nt]} + (nt − [nt]) (1/√n) X_{[nt]+1},

where [·] denotes the integer part of a number. Then W_t^n converges weakly, as n → +∞, to a one dimensional standard Brownian motion.

Brownian motion is a Gaussian process. For the d-dimensional Brownian motion, and for I the d × d dimensional identity matrix, we have (see (2.7) and (2.8))

    EW(t) = 0 ∀t ≥ 0

and

    E( (W(t) − W(s)) ⊗ (W(t) − W(s)) ) = (t − s)I.    (3.17)
Moreover, E W (t) ⊗ W (s) = min(t, s)I. (3.18)
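The random walk construction above is easy to test numerically: at a fixed time t, the rescaled walk S_{[nt]}/√n should be approximately N(0, t) for large n. A minimal sketch (step distribution and sample sizes are illustrative choices):

```python
import numpy as np

# Rescaled random walk approximating Brownian motion: at time t the value
# S_[nt] / sqrt(n) is approximately N(0, t).
rng = np.random.default_rng(3)
n, n_paths, t = 2_000, 5_000, 0.7

# iid steps with mean 0 and variance 1 (here: +-1 coin flips).
steps = rng.choice([-1.0, 1.0], size=(n_paths, int(n * t)))
w_t = steps.sum(axis=1) / np.sqrt(n)   # rescaled walk at time t, one per path

mean_emp, var_emp = w_t.mean(), w_t.var()
```

The linear interpolation term of W_t^n is of order 1/√n and is omitted here, since n·t is an integer for these parameter values.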
From the formula for the Gaussian density g(x, t − s), eqn. (3.15), we immediately conclude that W(t) − W(s) and W(t + u) − W(s + u) have the same pdf. Consequently, Brownian motion has stationary increments. Notice, however, that Brownian motion itself is not a stationary process. Since W(t) = W(t) − W(0), the pdf of W(t) is

    g(x, t) = (1/√(2πt)) e^{−x²/2t}.

We can easily calculate all moments of the Brownian motion:

    E(xⁿ(t)) = (1/√(2πt)) ∫_{−∞}^{+∞} xⁿ e^{−x²/2t} dx
    = 1·3 . . . (n − 1) t^{n/2} for n even, and 0 for n odd.

Brownian motion is invariant under various transformations in time.

Theorem 3.4.7. Let W_t denote a standard Brownian motion in R. Then W_t has the following properties:

i. (Rescaling). For each c > 0 define X_t = (1/√c) W(ct). Then (X_t, t ≥ 0) = (W_t, t ≥ 0) in law.

ii. (Shifting). For each c > 0, W_{c+t} − W_c, t ≥ 0, is a Brownian motion which is independent of W_u, u ∈ [0, c].

iii. (Time reversal). Define X_t = W_{1−t} − W_1, t ∈ [0, 1]. Then (X_t, t ∈ [0, 1]) = (W_t, t ∈ [0, 1]) in law.

iv. (Inversion). Let X_t, t ≥ 0 be defined by X₀ = 0, X_t = tW(1/t). Then (X_t, t ≥ 0) = (W_t, t ≥ 0) in law.

We emphasize that the equivalence in the above theorem holds in law and not in a pathwise sense.

Proof. See Exercise 13.
We can also add a drift and change the diffusion coefficient of the Brownian motion: we will define a Brownian motion with drift μ and variance σ² as the process

    X_t = μt + σW_t.

The mean and variance of X_t are

    EX_t = μt, E(X_t − EX_t)² = σ²t.    (3.19)

Notice that X_t satisfies the equation

    dX_t = μ dt + σ dW_t.

This is the simplest example of a stochastic differential equation.

We can define the OU process through the Brownian motion via a time change.

Lemma 3.4.8. Let W(t) be a standard Brownian motion and consider the process

    V(t) = e^{−t} W(e^{2t}).

Then V(t) is a Gaussian stationary process with mean 0 and correlation function

    R(t) = e^{−|t|}.

For the proof of this result we first need to show that time changed Gaussian processes are also Gaussian.

Lemma 3.4.9. Let X(t) be a Gaussian stochastic process and let Y(t) = X(f(t)), where f(t) is a strictly increasing function. Then Y(t) is also a Gaussian process.

Proof. We need to show that, for all positive integers N and all sequences of times {t₁, t₂, . . . , t_N}, the random vector

    {Y(t₁), Y(t₂), . . . , Y(t_N)}    (3.20)

is a multivariate Gaussian random variable. Since f(t) is strictly increasing, it is invertible and hence there exist s_i, i = 1, . . . , N, such that s_i = f^{−1}(t_i). Thus, the random vector (3.20) can be rewritten as

    {X(s₁), X(s₂), . . . , X(s_N)},

which is Gaussian for all N and all choices of times s₁, s₂, . . . , s_N. Hence Y(t) is also Gaussian.
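The time-change formula of Lemma 3.4.8 can be verified by Monte Carlo, sampling W exactly at the two required times via independent increments (a sketch; the two test times and the sample size are illustrative choices):

```python
import numpy as np

# Check that V(t) = e^{-t} W(e^{2t}) has correlation E V(t)V(s) = e^{-|t-s|}.
rng = np.random.default_rng(4)
n_paths = 200_000
s, t = 0.3, 1.1                          # two fixed times, s < t
u1, u2 = np.exp(2 * s), np.exp(2 * t)

w_u1 = np.sqrt(u1) * rng.standard_normal(n_paths)             # W(u1) ~ N(0, u1)
w_u2 = w_u1 + np.sqrt(u2 - u1) * rng.standard_normal(n_paths) # add increment

v_s = np.exp(-s) * w_u1
v_t = np.exp(-t) * w_u2

corr_emp = np.mean(v_s * v_t)
corr_exact = np.exp(-(t - s))
```

The calculation behind the test is the one in the proof below: E(V(t)V(s)) = e^{−t−s} min(e^{2t}, e^{2s}) = e^{−|t−s|}.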
Proof of Lemma 3.4.8. The fact that V(t) is mean zero follows immediately from the fact that W(t) is mean zero. The Gaussianity of the process V(t) follows from Lemma 3.4.9 (notice that the transformation that gives V(t) in terms of W(t) is invertible and we can write W(s) = s^{1/2} V((1/2) ln(s))). To show that the correlation function of V(t) is R(t) = e^{−|t|}, we calculate

    E(V(t)V(s)) = e^{−t−s} E( W(e^{2t}) W(e^{2s}) ) = e^{−t−s} min(e^{2t}, e^{2s}) = e^{−|t−s|}.

3.5 Other Examples of Stochastic Processes

3.5.1 Brownian Bridge

Let W(t) be a standard one dimensional Brownian motion. We define the Brownian bridge (from 0 to 0) to be the process

    B_t = W_t − tW₁, t ∈ [0, 1].    (3.21)

Notice that B₀ = B₁ = 0. Equivalently, we can define the Brownian bridge to be the continuous Gaussian process {B_t : 0 ≤ t ≤ 1} such that

    EB_t = 0, E(B_t B_s) = min(s, t) − st, s, t ∈ [0, 1].    (3.22)

Another, equivalent definition of the Brownian bridge is through an appropriate time change of the Brownian motion:

    B_t = (1 − t) W( t/(1 − t) ), t ∈ [0, 1).    (3.23)

Conversely, we can write the Brownian motion as a time change of the Brownian bridge:

    W_t = (t + 1) B_{t/(1+t)}, t ≥ 0.

3.5.2 Fractional Brownian Motion

Definition 3.5.1. A (normalized) fractional Brownian motion W_t^H, t ≥ 0, with Hurst parameter H ∈ (0, 1) is a centered Gaussian process with continuous sample paths whose covariance is given by

    E( W_t^H W_s^H ) = (1/2) ( s^{2H} + t^{2H} − |t − s|^{2H} ).    (3.24)
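The definition B_t = W_t − tW₁ of the Brownian bridge is straightforward to simulate from discretized Brownian paths; the sketch below (grid and sample sizes are illustrative) checks the covariance formula (3.22) at two fixed times:

```python
import numpy as np

# Sample Brownian bridges B_t = W_t - t W_1 on [0,1] and check
# E(B_t B_s) = min(t, s) - t s at two fixed times.
rng = np.random.default_rng(7)
n_paths, n_steps = 5_000, 500
dt = 1.0 / n_steps

dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
W = np.cumsum(dW, axis=1)               # W at times dt, 2dt, ..., 1
t_grid = dt * np.arange(1, n_steps + 1)

B = W - np.outer(W[:, -1], t_grid)      # B_t = W_t - t W_1, path by path

i, j = 149, 349                         # indices of t = 0.3 and s = 0.7
cov_emp = np.mean(B[:, i] * B[:, j])
cov_exact = min(0.3, 0.7) - 0.3 * 0.7   # = 0.09
```

Note that the pinning B₁ = 0 holds exactly for every sampled path, by construction.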
Proposition 3.5.2. Fractional Brownian motion has the following properties.

i. When H = 1/2, W_t^{1/2} becomes the standard Brownian motion.

ii. W₀^H = 0, EW_t^H = 0, E(W_t^H)² = |t|^{2H}, t ≥ 0.

iii. It has stationary increments and E(W_t^H − W_s^H)² = |t − s|^{2H}.

iv. It has the following self similarity property:

    (W_{αt}^H, t ≥ 0) = (α^H W_t^H, t ≥ 0), α > 0,    (3.25)

where the equivalence is in law.

Proof. See Exercise 19.

3.5.3 The Poisson Process

Another fundamental continuous time process is the Poisson process:

Definition 3.5.3. The Poisson process with intensity λ, denoted by N(t), is an integer-valued, continuous time, stochastic process with independent increments satisfying

    P[ (N(t) − N(s)) = k ] = e^{−λ(t−s)} ( λ(t − s) )^k / k!, t > s ≥ 0, k ∈ N.

The Poisson process does not have a continuous modification. See Exercise 20.

3.6 The Karhunen-Loève Expansion

Let f ∈ L²(Ω) where Ω is a subset of R^d, and let {e_n}_{n=1}^∞ be an orthonormal basis in L²(Ω). Then, it is well known that f can be written as a series expansion:

    f = Σ_{n=1}^∞ f_n e_n,

where f_n = ∫_Ω f(x) e_n(x) dx.
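The orthonormal series expansion above is easy to check numerically. The following sketch (the sine basis, the test function and the grid are illustrative assumptions) computes the coefficients f_n by quadrature and verifies that the L² truncation error decreases with the number of modes:

```python
import numpy as np

# Expand f in the orthonormal basis e_n(x) = sqrt(2) sin(n pi x) of L^2(0,1)
# and check that the L^2 error of the truncated series decreases with N.
x = np.linspace(0.0, 1.0, 10_001)
dx = x[1] - x[0]
f = x * (1.0 - x)                       # a smooth test function in L^2(0,1)

def partial_sum(N):
    """Return the truncated expansion sum_{n<=N} f_n e_n on the grid."""
    s = np.zeros_like(f)
    for n in range(1, N + 1):
        e_n = np.sqrt(2.0) * np.sin(n * np.pi * x)
        f_n = np.sum(f * e_n) * dx      # f_n = integral of f e_n (quadrature)
        s += f_n * e_n
    return s

err5 = np.sqrt(np.sum((f - partial_sum(5)) ** 2) * dx)
err50 = np.sqrt(np.sum((f - partial_sum(50)) ** 2) * dx)
```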
The convergence is in L²(Ω):

    lim_{N→∞} ‖ f(x) − Σ_{n=1}^N f_n e_n(x) ‖_{L²(Ω)} = 0.

It turns out that we can obtain a similar expansion for an L² mean zero process which is continuous in the L² sense:

    EX_t² < +∞, EX_t = 0, lim_{h→0} E|X_{t+h} − X_t|² = 0.    (3.26)

For simplicity we will take T = [0, 1]. Let R(t, s) = E(X_t X_s) be the autocorrelation function. Notice that from (3.26) it follows that R(t, s) is continuous in both t and s (Exercise 21).

Let us assume an expansion of the form

    X_t(ω) = Σ_{n=1}^∞ ξ_n(ω) e_n(t), t ∈ [0, 1],    (3.27)

where {e_n}_{n=1}^∞ is an orthonormal basis in L²(0, 1). The random variables ξ_n are calculated as

    ∫_0^1 X_t e_k(t) dt = ∫_0^1 Σ_{n=1}^∞ ξ_n e_n(t) e_k(t) dt = Σ_{n=1}^∞ ξ_n δ_{nk} = ξ_k,

where we assumed that we can interchange the summation and integration. We will assume that these random variables are orthogonal:

    E(ξ_n ξ_m) = λ_n δ_{nm},

where {λ_n}_{n=1}^∞ are positive numbers that will be determined later.

Assuming that an expansion of the form (3.27) exists, we can calculate

    R(t, s) = E(X_t X_s) = E( Σ_{k=1}^∞ Σ_{ℓ=1}^∞ ξ_k e_k(t) ξ_ℓ e_ℓ(s) )
    = Σ_{k=1}^∞ Σ_{ℓ=1}^∞ E(ξ_k ξ_ℓ) e_k(t) e_ℓ(s) = Σ_{k=1}^∞ λ_k e_k(t) e_k(s).

Consequently, in order for the expansion (3.27) to be valid we need

    R(t, s) = Σ_{k=1}^∞ λ_k e_k(t) e_k(s).    (3.28)

From equation (3.28) it follows that

    ∫_0^1 R(t, s) e_n(s) ds = ∫_0^1 Σ_{k=1}^∞ λ_k e_k(t) e_k(s) e_n(s) ds
    = Σ_{k=1}^∞ λ_k e_k(t) ∫_0^1 e_k(s) e_n(s) ds = Σ_{k=1}^∞ λ_k e_k(t) δ_{kn} = λ_n e_n(t).

Hence, in order for the expansion (3.27) to be valid, {λ_n, e_n(t)}_{n=1}^∞ have to be the eigenvalues and eigenfunctions of the integral operator whose kernel is the correlation function of X_t:

    ∫_0^1 R(t, s) e_n(s) ds = λ_n e_n(t).    (3.29)

Hence, in order to prove the expansion (3.27) we need to study the eigenvalue problem for the integral operator R : L²[0, 1] → L²[0, 1]. It is easy to check that this operator is self-adjoint ((Rf, h) = (f, Rh) for all f, h ∈ L²(0, 1)) and nonnegative ((Rf, f) ≥ 0 for all f ∈ L²(0, 1)). Hence, all its eigenvalues are real and nonnegative. Furthermore, it is a compact operator (if {φ_n}_{n=1}^∞ is a bounded sequence in L²(0, 1), then {Rφ_n}_{n=1}^∞ has a convergent subsequence). The spectral theorem for compact, self-adjoint operators implies that R has a countable sequence of eigenvalues tending to 0. Furthermore, for every f ∈ L²(0, 1) we can write

    f = f₀ + Σ_{n=1}^∞ f_n e_n(t),

where Rf₀ = 0, {e_n(t)} are the eigenfunctions of R corresponding to nonzero eigenvalues and the convergence is in L². Finally, Mercer's Theorem states that for R(t, s) continuous on [0, 1] × [0, 1], the expansion (3.28) is valid, where the series converges absolutely and uniformly.
Now we are ready to prove (3.27).

Theorem 3.6.1 (Karhunen-Loève). Let {X_t, t ∈ [0, 1]} be an L² process with zero mean and continuous correlation function R(t, s). Let {λ_n, e_n(t)}_{n=1}^∞ be the eigenvalues and eigenfunctions of the operator R defined in (3.29). Then

    X_t = Σ_{n=1}^∞ ξ_n e_n(t), t ∈ [0, 1],    (3.30)

where

    ξ_n = ∫_0^1 X_t e_n(t) dt, Eξ_n = 0, E(ξ_n ξ_m) = λ_n δ_{nm}.    (3.31)

The series converges in L² to X(t), uniformly in t.

Proof. The fact that Eξ_n = 0 follows from the fact that X_t is mean zero. The orthogonality of the random variables {ξ_n}_{n=1}^∞ follows from the orthogonality of the eigenfunctions of R:

    E(ξ_n ξ_m) = E ∫_0^1 ∫_0^1 X_t X_s e_n(t) e_m(s) dtds
    = ∫_0^1 ∫_0^1 R(t, s) e_n(t) e_m(s) dsdt
    = λ_n ∫_0^1 e_n(s) e_m(s) ds = λ_n δ_{nm}.

Consider now the partial sum S_N = Σ_{n=1}^N ξ_n e_n(t). Then

    E|X_t − S_N|² = EX_t² + ES_N² − 2E(X_t S_N)
    = R(t, t) + E Σ_{k,ℓ=1}^N ξ_k ξ_ℓ e_k(t) e_ℓ(t) − 2E( X_t Σ_{n=1}^N ξ_n e_n(t) )
    = R(t, t) + Σ_{k=1}^N λ_k e_k(t)² − 2 Σ_{k=1}^N E ∫_0^1 X_t X_s e_k(s) e_k(t) ds
    = R(t, t) − Σ_{k=1}^N λ_k e_k(t)² → 0,

by Mercer's theorem.

Remark 3.6.2. Let X_t be a Gaussian second order process with continuous covariance R(t, s). Then the random variables {ξ_k}_{k=1}^∞ are Gaussian, since they are defined through the time integral
of a Gaussian process. Furthermore, since they are Gaussian and orthogonal, they are also independent. Hence, for Gaussian processes the Karhunen-Loève expansion becomes
\[
X_t = \sum_{k=1}^{+\infty} \sqrt{\lambda_k}\, \xi_k e_k(t), \qquad (3.32)
\]
where $\{\xi_k\}_{k=1}^\infty$ are independent $\mathcal{N}(0,1)$ random variables.

Example 3.6.3. The Karhunen-Loève expansion for Brownian motion. The correlation function of Brownian motion is $R(t,s) = \min(t,s)$. The eigenvalue problem $\mathcal{R}\psi_n = \lambda_n \psi_n$ becomes
\[
\int_0^1 \min(t,s)\, \psi_n(s)\, ds = \lambda_n \psi_n(t).
\]
Let us assume that $\lambda_n > 0$ (it is easy to check that $0$ is not an eigenvalue). Upon setting $t = 0$ we obtain $\psi_n(0) = 0$. The eigenvalue problem can be rewritten in the form
\[
\int_0^t s\, \psi_n(s)\, ds + t \int_t^1 \psi_n(s)\, ds = \lambda_n \psi_n(t).
\]
We differentiate this equation once:
\[
\int_t^1 \psi_n(s)\, ds = \lambda_n \psi_n'(t).
\]
We set $t = 1$ in this equation to obtain the second boundary condition $\psi_n'(1) = 0$. A second differentiation yields
\[
-\psi_n(t) = \lambda_n \psi_n''(t),
\]
where primes denote differentiation with respect to $t$. Thus, in order to calculate the eigenvalues and eigenfunctions of the integral operator whose kernel is the covariance function of Brownian motion, we need to solve the Sturm-Liouville problem
\[
-\lambda_n \psi_n''(t) = \psi_n(t), \qquad \psi_n(0) = \psi_n'(1) = 0.
\]
It is easy to check that the eigenvalues and (normalized) eigenfunctions are
\[
\psi_n(t) = \sqrt{2} \sin\Big( \frac{(2n-1)\pi t}{2} \Big), \qquad \lambda_n = \Big( \frac{2}{(2n-1)\pi} \Big)^2.
\]
Thus, the Karhunen-Loève expansion of Brownian motion on $[0,1]$ is
\[
W_t = \sqrt{2} \sum_{n=1}^\infty \xi_n \frac{2}{(2n-1)\pi} \sin\Big( \frac{(2n-1)\pi t}{2} \Big). \qquad (3.33)
\]
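Equation (3.33) gives a practical way to sample approximate Brownian paths: truncate the series, draw independent $\mathcal{N}(0,1)$ coefficients and sum. A minimal numerical sketch (the truncation level, time grid and sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

N = 1000                                   # number of KL modes kept
M = 2000                                   # number of sample paths
t = np.linspace(0.0, 1.0, 201)             # time grid on [0, 1]

n = np.arange(1, N + 1)
# eigenfunctions sqrt(2) sin((2n-1) pi t / 2) and sqrt of eigenvalues 2/((2n-1) pi)
phi = np.sqrt(2.0) * np.sin(np.outer(t, (2 * n - 1) * np.pi / 2))
sqrt_lam = 2.0 / ((2 * n - 1) * np.pi)

xi = rng.standard_normal((M, N))           # independent N(0,1) coefficients
W = xi @ (sqrt_lam[:, None] * phi.T)       # (M, len(t)) approximate Brownian paths

# sanity check: Var(W_t) = t for Brownian motion
emp_var = W.var(axis=0)
print(np.max(np.abs(emp_var - t)))         # small: Monte Carlo + truncation error
```

Increasing $N$ reduces the truncation error (the neglected eigenvalues sum to $O(1/N)$), while increasing $M$ reduces the statistical error in the variance estimate.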
We can use the KL expansion in order to study the $L^2$ regularity of stochastic processes. First, let $\mathcal{R}$ be a compact, symmetric positive definite operator on $L^2(0,1)$ with eigenvalues and normalized eigenfunctions $\{\lambda_k, e_k(x)\}_{k=1}^{+\infty}$ and consider a function $f \in L^2(0,1)$ with $\int_0^1 f(s)\, ds = 0$. We can define the one parameter family of Hilbert spaces $H^\alpha$ through the norm
\[
\|f\|_\alpha^2 = \|\mathcal{R}^{-\alpha/2} f\|_{L^2}^2 = \sum_k |f_k|^2 \lambda_k^{-\alpha}.
\]
The inner product can be obtained through polarization. This norm enables us to measure the regularity of the function $f(t)$.³ Let $X_t$ be a mean zero second order (i.e. with finite second moment) process with continuous autocorrelation function. Define the space $\mathcal{H}^\alpha := L^2((\Omega, P), H^\alpha(0,1))$ with (semi)norm
\[
\|X_t\|_\alpha^2 = \mathbb{E} \|X_t\|_{H^\alpha}^2 = \sum_k \lambda_k^{1-\alpha}. \qquad (3.34)
\]
Notice that the regularity of the stochastic process $X_t$ depends on the decay of the eigenvalues of the integral operator
\[
\mathcal{R}f := \int_0^1 R(t,s) f(s)\, ds.
\]
As an example, consider the $L^2$ regularity of Brownian motion. From Example 3.6.3 we know that $\lambda_k \sim k^{-2}$. Consequently, from (3.34) we get that, in order for $W_t$ to be an element of the space $\mathcal{H}^\alpha$, we need
\[
\sum_k \lambda_k^{1-\alpha} \sim \sum_k k^{-2(1-\alpha)} < +\infty,
\]
from which we obtain that $\alpha < 1/2$. This is consistent with the Hölder continuity of Brownian motion from Theorem 3.4.6.⁴
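The decay $\lambda_k \sim k^{-2}$ used above can be checked directly: discretizing the covariance operator of Brownian motion on a grid and computing the eigenvalues of the resulting matrix reproduces $\lambda_n = (2/((2n-1)\pi))^2$. A sketch (the grid resolution is an arbitrary choice):

```python
import numpy as np

# discretize (Rf)(t) = integral_0^1 min(t,s) f(s) ds with the midpoint rule
m = 400
h = 1.0 / m
s = (np.arange(m) + 0.5) * h
K = np.minimum.outer(s, s) * h             # symmetric kernel matrix x quadrature weight

lam = np.linalg.eigvalsh(K)[::-1]          # eigenvalues, largest first
exact = (2.0 / ((2 * np.arange(1, 6) - 1) * np.pi)) ** 2

print(lam[:5])                             # close to exact; decay like k^{-2}
print(exact)
```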
3.7
Discussion and Bibliography
The Ornstein-Uhlenbeck process was introduced by Ornstein and Uhlenbeck in 1930 as a model for the velocity of a Brownian particle [93]. The kind of analysis presented in Section 3.3.3 was initiated by G.I. Taylor in [91]. The proof of Bochner's theorem 3.3.12 can be found in [50], where additional material on stationary processes can also be found. See also [46].
³Think of $\mathcal{R}$ as being the inverse of the Laplacian with periodic boundary conditions. In this case $H^\alpha$ coincides with the standard fractional Sobolev space.
⁴Notice, however, that Wiener's theorem refers to a.s. Hölder continuity, whereas the calculation presented in this section is about $L^2$ continuity.
The spectral theorem for compact, selfadjoint operators which was needed in the proof of the Karhunen-Loève theorem can be found in [81]. The Karhunen-Loève expansion is also valid for random fields. See [88] and the references therein.
3.8
Exercises
1. Let $Y_0, Y_1, \ldots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$.

(a) Show that $X_n$ is a strictly stationary process.

(b) Assume that $\mathbb{E}Y_0 = \mu < +\infty$ and $\mathbb{E}Y_0^2 = \sigma^2 < +\infty$. Show that
\[
\lim_{N \to +\infty} \mathbb{E}\Big| \frac{1}{N} \sum_{j=0}^{N-1} X_j - \mu \Big| = 0.
\]

(c) Let $f$ be such that $\mathbb{E}f^2(Y_0) < +\infty$. Show that
\[
\lim_{N \to +\infty} \mathbb{E}\Big| \frac{1}{N} \sum_{j=0}^{N-1} f(X_j) - \mathbb{E}f(Y_0) \Big| = 0.
\]
2. Let $Z$ be a random variable and define the stochastic process $X_n = Z$, $n = 0, 1, 2, \ldots$. Show that $X_n$ is a strictly stationary process.

3. Let $A_0, A_1, \ldots, A_m$ and $B_0, B_1, \ldots, B_m$ be uncorrelated random variables with mean zero and variances $\mathbb{E}A_i^2 = \sigma_i^2$, $\mathbb{E}B_i^2 = \sigma_i^2$, $i = 1, \ldots, m$. Let $\omega_0, \omega_1, \ldots, \omega_m \in [0, \pi]$ be distinct frequencies and define, for $n = 0, \pm 1, \pm 2, \ldots$, the stochastic process
\[
X_n = \sum_{k=0}^m \big( A_k \cos(n\omega_k) + B_k \sin(n\omega_k) \big).
\]
Calculate the mean and the covariance of $X_n$. Show that it is a weakly stationary process.

4. Let $\{\xi_n : n = 0, \pm 1, \pm 2, \ldots\}$ be uncorrelated random variables with $\mathbb{E}\xi_n = \mu$, $\mathbb{E}(\xi_n - \mu)^2 = \sigma^2$, $n = 0, \pm 1, \pm 2, \ldots$. Let $a_1, a_2, \ldots$ be arbitrary real numbers and consider the stochastic process
\[
X_n = a_1 \xi_n + a_2 \xi_{n-1} + \cdots + a_m \xi_{n-m+1}.
\]
(a) Calculate the mean, variance and the covariance function of $X_n$. Show that it is a weakly stationary process.

(b) Set $a_k = 1/\sqrt{m}$ for $k = 1, \ldots, m$. Calculate the covariance function and study the cases $m = 1$ and $m \to +\infty$.

5. Let $W(t)$ be a standard one dimensional Brownian motion. Calculate the following expectations:

(a) $\mathbb{E} e^{iW(t)}$, $t \in (0, +\infty)$;

(b) $\mathbb{E} e^{i(W(t) + W(s))}$, $t, s \in (0, +\infty)$;

(c) $\mathbb{E}\big( \sum_{i=1}^n c_i W(t_i) \big)^2$, where $c_i \in \mathbb{R}$, $i = 1, \ldots, n$ and $t_i \in (0, +\infty)$;

(d) $\mathbb{E} e^{i \sum_{i=1}^n c_i W(t_i)}$, where $c_i \in \mathbb{R}$, $i = 1, \ldots, n$ and $t_i \in (0, +\infty)$.

6. Let $W_t$ be a standard one dimensional Brownian motion and define $B_t = W_t - tW_1$, $t \in [0,1]$.

(a) Show that $B_t$ is a Gaussian process with $\mathbb{E}B_t = 0$, $\mathbb{E}(B_t B_s) = \min(t,s) - ts$.

(b) Show that, for $t \in [0,1)$, an equivalent definition of $B_t$ is through the formula $B_t = (1-t)\, W\big( \frac{t}{1-t} \big)$.

(c) Calculate the distribution function of $B_t$.

7. Let $X_t$ be a mean-zero second order stationary process with autocorrelation function
\[
R(t) = \sum_{j=1}^N \frac{\lambda_j^2}{\alpha_j} e^{-\alpha_j |t|},
\]
where $\{\alpha_j, \lambda_j\}_{j=1}^N$ are positive real numbers.

(a) Calculate the spectral density and the correlation time of this process.

(b) Show that the assumptions of Theorem 3.17 are satisfied and use the argument presented in Section 3.3 (i.e. the Green-Kubo formula) to calculate the diffusion coefficient of the process $Z_t = \int_0^t X_s\, ds$.

(c) Under what assumptions on the coefficients $\{\alpha_j, \lambda_j\}_{j=1}^N$ can you study the above questions in the limit $N \to +\infty$?

8. Let $a_1, \ldots, a_n$ and $s_1, \ldots, s_n$ be positive real numbers. Calculate the mean and variance of the random variable
\[
X = \sum_{i=1}^n a_i W(s_i).
\]

9. Let $W(t)$ be the standard one-dimensional Brownian motion and let $\sigma, s_1, s_2 > 0$. Calculate

(a) $\mathbb{E} e^{\sigma W(t)}$;

(b) $\mathbb{E}\big( \sin(\sigma W(s_1)) \sin(\sigma W(s_2)) \big)$.

10. Let $W_t$ be a one dimensional Brownian motion, let $\mu, \sigma > 0$ and define $S_t = e^{t\mu + \sigma W_t}$.

(a) Calculate the mean and the variance of $S_t$.

(b) Calculate the probability density function of $S_t$.

11. Prove Lemma 3.4.

12. Prove Theorem 3.10.

13. Use Theorem 3.4 to prove Lemma 3.7.

14. Use Lemma 3.8 to calculate the distribution function of the stationary Ornstein-Uhlenbeck process.

15. Calculate the mean and the correlation function of the integral of a standard Brownian motion,
\[
Y_t = \int_0^t W_s\, ds.
\]
16. Show that the process
\[
Y_t = \int_t^{t+1} (W_s - W_t)\, ds, \qquad t \in \mathbb{R},
\]
is second order stationary.

17. Let $V_t = e^{-t} W(e^{2t})$ be the stationary Ornstein-Uhlenbeck process. Give the definition and study the main properties of the Ornstein-Uhlenbeck bridge.

18. Let $X_t$ be a stochastic process satisfying (3.26) and $R(t,s)$ its correlation function. Show that the integral operator $\mathcal{R} : L^2[0,1] \to L^2[0,1]$,
\[
\mathcal{R}f := \int_0^1 R(t,s) f(s)\, ds, \qquad (3.35)
\]
is selfadjoint and nonnegative. Show that all of its eigenvalues are real and nonnegative. Show that eigenfunctions corresponding to different eigenvalues are orthogonal.

19. Use Theorem (3.4) to show that there does not exist a continuous modification of the Poisson process.

20. Show the scaling property (3.25) of the fractional Brownian motion.

21. The autocorrelation function of the velocity $Y(t)$ of a Brownian particle moving in a harmonic potential $V(x) = \frac{1}{2}\omega_0^2 x^2$ is
\[
R(t) = e^{-\gamma |t|}\Big( \cos(\delta t) - \frac{1}{\delta} \sin(\delta t) \Big),
\]
where $\gamma$ is the friction coefficient and $\delta = \sqrt{\omega_0^2 - \gamma^2}$.

(a) Calculate the spectral density of $Y(t)$.

(b) Calculate the mean square displacement $\mathbb{E}(X(t))^2$ of the position of the Brownian particle $X(t) = \int_0^t Y(s)\, ds$. Study the limit $t \to +\infty$.

22. Show that the correlation function of a process $X_t$ satisfying (3.26) is continuous in both $t$ and $s$.
23. s) being continuous both in t and s. τ (S) = T . Show that it is a HilbertSchmidt operator. Deﬁne the process Y (t) = f (t)Xτ (t) . 24. 26. in terms of the KL expansion of Xt . Let Xt a mean zero second order stationary process deﬁned in the interval [0. nondecreasing function with τ (0) = 0. 1]. in an appropriate weighted L2 e space. where f (t) is a continuous function and τ (t) a continuous. 1] → L2 [0. 1] be the operator deﬁned in (3. 27. Let Xt . t ∈ [0. Use this in order to calculate the KL expansion of the OrnsteinUhlenbeck process. Use the KarhunenLoeve expansion to generate paths of the (a) Brownian motion on [0.35) with R(t. s) = ts. Calculate the KarhunenLoeve expansion of the Brownian bridge on [0. Find the KarhunenLo´ ve expansion of Y (t). An operator R : H → H is said to be HilbertSchmidt if there exists a complete orthonormal sequence {φn }∞ in H such that n=1 ∞ Ren n=1 2 < ∞. n=1 25. Let H be a Hilbert space. t) = cos(2π(t − s)). Show that n=1 ∞ λn = T R(0). Calculate the KarhunenLo´ ve expansion of a centered Gaussian stochastic process with coe variance function R(s. 1]. Calculate the KarhunenLoeve expansion for a second order stochastic process with correlation function R(t. T ] be a second order process with continuous covariance and KarhunenLo´ ve e expansion Xt = k=1 ∞ ξk ek (t). 28. t ∈ [0. 55 . T ] with continuous covariance R(t) and let {λn }+∞ be the eigenvalues of the covariance operator. Let R : L2 [0. 29. S].
(b) the Brownian bridge on $[0,1]$;

(c) the Ornstein-Uhlenbeck process on $[0,1]$.

Study computationally the convergence of the KL expansion for these processes. How many terms do you need to keep in the KL expansion in order to calculate accurate statistics of these processes?
Chapter 4

Markov Processes

4.1 Introduction

In this chapter we will study some of the basic properties of Markov stochastic processes. Roughly speaking, a Markov process is a stochastic process that retains no memory of where it has been in the past: only the current state of a Markov process can influence where it will go next. A bit more precisely: a Markov process is a stochastic process for which, given the present, past and future are statistically independent.

In Section 4.2 we present various examples of Markov processes, in discrete and continuous time. In Section 4.3 we give the precise definition of a Markov process. In Section 4.4 we derive the Chapman-Kolmogorov equation, the fundamental equation in the theory of Markov processes. In Section 4.5 we introduce the concept of the generator of a Markov process. In Section 4.6 we study ergodic Markov processes. Discussion and bibliographical remarks are presented in Section 4.7 and exercises can be found in Section 4.8.

4.2 Examples

Perhaps the simplest example of a Markov process is that of a random walk in one dimension. We defined the one dimensional random walk as the sum of independent, mean zero and variance $1$ random variables $\xi_i$, $i = 1, \ldots$:
\[
X_N = \sum_{n=1}^N \xi_n, \qquad X_0 = 0.
\]
Let $i_1, \ldots, i_n, \ldots$ be a sequence of integers. Then, for all integers $n$ and $m$ we have that
\[
P(X_{n+m} = i_{n+m} \,|\, X_1 = i_1, \ldots, X_n = i_n) = P(X_{n+m} = i_{n+m} \,|\, X_n = i_n). \qquad (4.1)
\]
In words, the probability that the random walk will be at $i_{n+m}$ at time $n+m$ depends only on its current value (at time $n$) and not on how it got there. The random walk is an example of a discrete time Markov chain:

Definition 4.2.1. A stochastic process $\{S_n;\ n \in \mathbb{N}\}$ with state space $S = \mathbb{Z}$ is called a discrete time Markov chain provided that the Markov property (4.1) is satisfied.

In fact, it is sufficient to take $m = 1$ in (4.1). See Exercise 1.

Consider now a continuous-time stochastic process $X_t$ with state space $S = \mathbb{Z}$ and denote by $\{X_s,\ s \leq t\}$ the collection of values of the stochastic process up to time $t$. We will say that $X_t$ is a Markov process provided that
\[
P(X_{t+h} = i_{t+h} \,|\, \{X_s,\ s \leq t\}) = P(X_{t+h} = i_{t+h} \,|\, X_t = i_t), \qquad (4.2)
\]
for all $h \geq 0$. A continuous-time, discrete state space Markov process is called a continuous-time Markov chain.

Example 4.2.2. The Poisson process is a continuous-time Markov chain with
\[
P(N_{t+h} = j \,|\, N_t = i) =
\begin{cases}
0, & j < i, \\[4pt]
\dfrac{e^{-\lambda h} (\lambda h)^{j-i}}{(j-i)!}, & j \geq i.
\end{cases}
\]
Similarly, we can define a continuous-time Markov process whose state space is $\mathbb{R}$. In this case, the above definitions become
\[
P(X_{t+h} \in \Gamma \,|\, \{X_s,\ s \leq t\}) = P(X_{t+h} \in \Gamma \,|\, X_t = x) \qquad (4.3)
\]
for all Borel sets $\Gamma$.

Example 4.2.3. The Brownian motion is a Markov process with conditional probability density
\[
p(y, t | x, s) := p(W_t = y \,|\, W_s = x) = \frac{1}{\sqrt{2\pi(t-s)}} \exp\Big( -\frac{|x - y|^2}{2(t-s)} \Big). \qquad (4.4)
\]
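The Markov property (4.1) can also be seen empirically: for the random walk, conditioning on a past state in addition to the present one does not change the one-step transition probabilities. A small simulation sketch (sample size and conditioning values are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
steps = rng.choice([-1, 1], size=(100_000, 4))   # many independent short walks
X = np.cumsum(steps, axis=1)                     # columns are X_1, X_2, X_3, X_4

# P(X_3 = 1 | X_2 = 0): condition on the present only
sel_now = X[:, 1] == 0
up_given_now = np.mean(X[sel_now, 2] == 1)

# P(X_3 = 1 | X_2 = 0, X_1 = -1): also condition on the past
sel_both = (X[:, 1] == 0) & (X[:, 0] == -1)
up_given_now_and_past = np.mean(X[sel_both, 2] == 1)

print(up_given_now, up_given_now_and_past)       # both close to 1/2
```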
Example 4.2.4. The Ornstein-Uhlenbeck process $V_t = e^{-t} W(e^{2t})$ is a Markov process with conditional probability density
\[
p(y, t|x, s) := p(V_t = y \,|\, V_s = x) = \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} \exp\Big( -\frac{|y - x e^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})} \Big). \qquad (4.5)
\]
To prove (4.5) we use the formula for the distribution function of the Brownian motion to calculate, for $t > s$,
\[
\begin{aligned}
P(V_t \leq y \,|\, V_s = x) &= P(e^{-t} W(e^{2t}) \leq y \,|\, e^{-s} W(e^{2s}) = x) \\
&= P(W(e^{2t}) \leq e^t y \,|\, W(e^{2s}) = e^s x) \\
&= \int_{-\infty}^{e^t y} \frac{1}{\sqrt{2\pi(e^{2t} - e^{2s})}} e^{-\frac{|z - x e^s|^2}{2(e^{2t} - e^{2s})}}\, dz \\
&= \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi e^{2t}(1 - e^{-2(t-s)})}}\, e^{-\frac{|\rho e^t - x e^s|^2}{2 e^{2t}(1 - e^{-2(t-s)})}}\, e^t\, d\rho
= \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} e^{-\frac{|\rho - x e^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})}}\, d\rho.
\end{aligned}
\]
Consequently, the transition probability density for the OU process is given by the formula
\[
p(y, t|x, s) = \frac{\partial}{\partial y} P(V_t \leq y \,|\, V_s = x) = \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} \exp\Big( -\frac{|y - x e^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})} \Big).
\]
Markov stochastic processes appear in a variety of applications in physics, chemistry, biology and finance. In this and the next chapter we will develop various analytical tools for studying them. In particular, we will see that we can obtain an equation for the transition probability
\[
P(X_{n+1} = i_{n+1} | X_n = i_n), \qquad P(X_{t+h} = i_{t+h} | X_t = i_t), \qquad p(X_{t+h} = y | X_t = x), \qquad (4.6)
\]
which will enable us to study the evolution of a Markov process. This equation will be called the Chapman-Kolmogorov equation.

We will be mostly concerned with time-homogeneous Markov processes, i.e. processes for which the conditional probabilities are invariant under time shifts. For time-homogeneous discrete-time Markov chains we have
\[
P(X_{n+1} = j | X_n = i) = P(X_1 = j | X_0 = i) =: p_{ij}.
\]
We will refer to the matrix $P = \{p_{ij}\}$ as the transition matrix. It is easy to check that the transition matrix is a stochastic matrix, i.e. it has nonnegative entries and $\sum_j p_{ij} = 1$. We can define the $n$-step transition matrix $P_n = \{p_{ij}(n)\}$ as
\[
p_{ij}(n) = P(X_{m+n} = j | X_m = i).
\]
We can study the evolution of a Markov chain through the Chapman-Kolmogorov equation:
\[
p_{ij}(m+n) = \sum_k p_{ik}(m)\, p_{kj}(n). \qquad (4.7)
\]
Indeed, let $\mu_i^{(n)} := P(X_n = i)$. The (possibly infinite dimensional) vector $\mu^{(n)}$ determines the state of the Markov chain at time $n$. A simple consequence of the Chapman-Kolmogorov equation is that we can write an evolution equation for the vector $\mu^{(n)}$:
\[
\mu^{(n)} = \mu^{(0)} P^n, \qquad (4.8)
\]
where $P^n$ denotes the $n$th power of the matrix $P$. Componentwise, the above equation can be written as
\[
\mu_j^{(n)} = \sum_i \mu_i^{(0)} p_{ij}(n).
\]
Hence, in order to calculate the state of the Markov chain at time $n$ all we need is the initial distribution $\mu^{(0)}$ and the transition matrix $P$.

Consider now a continuous time Markov chain with transition probability
\[
p_{ij}(s, t) = P(X_t = j | X_s = i), \qquad s \leq t.
\]
If the chain is homogeneous, then $p_{ij}(s, t) = p_{ij}(0, t-s)$ for all $i, j, s, t$. In particular,
\[
p_{ij}(t) = P(X_t = j | X_0 = i).
\]
The Chapman-Kolmogorov equation for a continuous time Markov chain is
\[
\frac{dp_{ij}}{dt} = \sum_k p_{ik}(t)\, g_{kj}, \qquad (4.9)
\]
where the matrix $G$ is called the generator of the Markov chain. Equation (4.9) can also be written in matrix notation:
\[
\frac{dP}{dt} = P_t\, G.
\]
The generator of the Markov chain is defined as
\[
G = \lim_{h \to 0} \frac{1}{h}(P_h - I).
\]
Let now $\mu_i^t = P(X_t = i)$. The vector $\mu^t$ is the distribution of the Markov chain at time $t$. We can study its evolution using the equation
\[
\mu^t = \mu^0 P_t.
\]
Thus, as in the case of discrete time Markov chains, the evolution of a continuous time Markov chain is completely determined by the initial distribution and the transition matrix.

Consider now the case of a continuous time Markov process with continuous state space and with continuous paths. As we have seen in Example 4.2.3, the Brownian motion is an example of such a process. It is a standard result in the theory of partial differential equations that the conditional probability density of the Brownian motion (4.4) is the fundamental solution of the diffusion equation:
\[
\frac{\partial p}{\partial t} = \frac{1}{2} \frac{\partial^2 p}{\partial y^2}, \qquad \lim_{t \to s} p(y, t|x, s) = \delta(y - x). \qquad (4.10)
\]
Similarly, the conditional distribution of the OU process satisfies the initial value problem
\[
\frac{\partial p}{\partial t} = \frac{\partial (yp)}{\partial y} + \frac{1}{2} \frac{\partial^2 p}{\partial y^2}, \qquad \lim_{t \to s} p(y, t|x, s) = \delta(y - x). \qquad (4.11)
\]
The Brownian motion and the OU process are examples of a diffusion process: a diffusion process is a continuous time Markov process with continuous paths. We will see in Chapter 5 that the conditional probability density $p(y, t|x, s)$ of a diffusion process satisfies the forward Kolmogorov or Fokker-Planck equation
\[
\frac{\partial p}{\partial t} = -\frac{\partial}{\partial y}\big( a(y,t)\, p \big) + \frac{1}{2} \frac{\partial^2}{\partial y^2}\big( b(y,t)\, p \big), \qquad \lim_{t \to s} p(y, t|x, s) = \delta(y - x), \qquad (4.12)
\]
as well as the backward Kolmogorov equation
\[
-\frac{\partial p}{\partial s} = a(x,s) \frac{\partial p}{\partial x} + \frac{1}{2} b(x,s) \frac{\partial^2 p}{\partial x^2}, \qquad \lim_{t \to s} p(y, t|x, s) = \delta(y - x), \qquad (4.13)
\]
for appropriate functions $a(y,t)$, $b(y,t)$. Hence, a diffusion process is determined uniquely from these two functions.
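The evolution equation $\mu^{(n)} = \mu^{(0)} P^n$ is easy to exercise numerically. The two-state transition matrix below is an illustrative choice, not from the text; iterating the equation drives the distribution to the stationary vector $\pi$ solving $\pi P = \pi$:

```python
import numpy as np

# an arbitrary two-state stochastic matrix (rows sum to 1)
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
mu = np.array([1.0, 0.0])            # mu^(0): start in state 0

for _ in range(50):                  # mu^(n) = mu^(0) P^n, one step at a time
    mu = mu @ P

# stationary distribution: left eigenvector of P for eigenvalue 1
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmax(np.real(w))])
pi = pi / pi.sum()

print(mu, pi)                        # both close to [5/6, 1/6]
```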
4.3 Definition of a Markov Process

In Section 4.2 we gave the definition of a Markov process whose time is either discrete or continuous, and whose state space is the set of integers. We also gave several examples of Markov chains as well as of processes whose state space is the real line. In this section we give the precise definition of a Markov process with $t \in T$, an arbitrary metric space. We will use this formulation in the next section to derive the Chapman-Kolmogorov equation.

In order to state the definition of a continuous-time Markov process that takes values in a metric space we need to introduce various new concepts. For the definition of a Markov process we need to use the conditional expectation of the stochastic process conditioned on all past values. We can encode all past information about a stochastic process into an appropriate collection of $\sigma$-algebras. Our setting will be that we have a probability space $(\Omega, \mathcal{F}, P)$ and an ordered set $T$. Let $X_t : \Omega \to E$, $t \in T$, be a stochastic process from the sample space $(\Omega, \mathcal{F})$ to the state space $(E, \mathcal{G})$, where $E$ is a metric space (we will usually take $E$ to be either $\mathbb{R}$ or $\mathbb{R}^d$). Remember that the stochastic process is a function of two variables, $t \in T$ and $\omega \in \Omega$.

We start with the definition of a $\sigma$-algebra generated by a collection of sets.

Definition 4.3.1. Let $K$ be a collection of subsets of $\Omega$. The smallest $\sigma$-algebra on $\Omega$ which contains $K$ is denoted by $\sigma(K)$ and is called the $\sigma$-algebra generated by $K$.

Definition 4.3.2. Let $X_t : \Omega \to E$, $t \in T$, where $T$ is a general index set and $S = E$. The smallest $\sigma$-algebra $\sigma(X_t,\ t \in T)$, such that the family of mappings $\{X_t,\ t \in T\}$ is a stochastic process with sample space $(\Omega, \sigma(X_t,\ t \in T))$ and state space $(E, \mathcal{G})$, is called the $\sigma$-algebra generated by $\{X_t,\ t \in T\}$.

In other words, the $\sigma$-algebra generated by $X_t$ is the smallest $\sigma$-algebra such that $X_t$ is a measurable function (random variable) with respect to it: the set $\{\omega \in \Omega : X_t(\omega) \leq x\}$ belongs to $\sigma(X_t,\ t \in T)$ for all $x \in \mathbb{R}$ (we have assumed that $E = \mathbb{R}$).

Definition 4.3.3. A filtration on $(\Omega, \mathcal{F})$ is a nondecreasing family $\{\mathcal{F}_t,\ t \in T\}$ of sub-$\sigma$-algebras of $\mathcal{F}$: $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}$ for $s \leq t$.

We set $\mathcal{F}_\infty = \sigma(\cup_{t \in T} \mathcal{F}_t)$. The filtration generated by $X_t$, where $X_t$ is a stochastic process, is
\[
\mathcal{F}_t^X := \sigma(X_s;\ s \leq t).
\]
Definition 4.3.4. A stochastic process $\{X_t;\ t \in T\}$ is adapted to the filtration $\mathcal{F}_t := \{\mathcal{F}_t,\ t \in T\}$ if for all $t \in T$, $X_t$ is an $\mathcal{F}_t$-measurable random variable.

Definition 4.3.5. Let $\{X_t\}$ be a stochastic process defined on a probability space $(\Omega, \mathcal{F}, \mu)$ with values in $E$ and let $\mathcal{F}_t^X$ be the filtration generated by $\{X_t;\ t \in T\}$. Then $\{X_t;\ t \in T\}$ is a Markov process if
\[
P(X_t \in \Gamma \,|\, \mathcal{F}_s^X) = P(X_t \in \Gamma \,|\, X_s) \qquad (4.14)
\]
for all $t, s \in T$ with $t \geq s$, and $\Gamma \in \mathcal{B}(E)$.

Remark 4.3.6. The filtration $\mathcal{F}_t^X$ is generated by events of the form $\{\omega \,|\, X_{s_1} \in B_1, X_{s_2} \in B_2, \ldots, X_{s_n} \in B_n\}$ with $0 \leq s_1 < s_2 < \cdots < s_n \leq s$ and $B_i \in \mathcal{B}(E)$. The definition of a Markov process is thus equivalent to the hierarchy of equations
\[
P(X_t \in \Gamma \,|\, X_{t_1}, X_{t_2}, \ldots, X_{t_n}) = P(X_t \in \Gamma \,|\, X_{t_n}) \quad \text{a.s.}
\]
for $n \geq 1$, $0 \leq t_1 < t_2 < \cdots < t_n \leq t$ and $\Gamma \in \mathcal{B}(E)$.

Remark 4.3.7. Roughly speaking, the statistics of $X_t$ for $t \geq s$ are completely determined once $X_s$ is known; information about $X_t$ for $t < s$ is superfluous. In other words: a Markov process has no memory. More precisely: when a Markov process is conditioned on the present state, then there is no memory of the past. The past and future of a Markov process are statistically independent when the present is known.

Example 4.3.8. The velocity of a Brownian particle is modeled by the stationary Ornstein-Uhlenbeck process $Y_t = e^{-t} W(e^{2t})$. The particle position is given by the integral of the OU process (we take $X_0 = 0$):
\[
X_t = \int_0^t Y_s\, ds.
\]
The particle position depends on the past of the OU process and, consequently, is not a Markov process. However, the joint position-velocity process $\{X_t, Y_t\}$ is. A non-Markovian process $X_t$ can be described through a Markovian one $Y_t$ by enlarging the state space: the additional variables that we introduce account for the memory in $X_t$. This "Markovianization" trick is very useful since there exist many analytical tools for analyzing Markovian processes. Its transition probability density
tx0 . µ) be a probability space.p(x. tx. Xs . (4. Ex. 64 (4. u. Assume that f is such that E(f (X)) < ∞.15) for all x ∈ E. x s) a probability measure on E with P (t. Then E(f (X)G) = R f (x)PX (dxG). y.4 The ChapmanKolmogorov Equation With a Markov process {Xt } we can associate a function P : T × T × E × B(E) → R+ deﬁned through the relation X P Xt ∈ ΓFs = P (s.16) Given G ⊂ F we deﬁne the function PX (BG) = P (X ∈ BG) for B ∈ F. F. s) is (for ﬁxed t. tx. G) and let F1 ⊂ F2 ⊂ F. Since P Xt ∈ ΓFs = P [Xt ∈ ΓXs ] we can write P (Γ.17) . Γ). Assume that Xs = x. Γ) and satisﬁes the Chapman–Kolmogorov equation P (Γ. u)P (dy. The derivation of the Chapman Kolmogorov equation is based on the assumption of Markovianity and on properties of the conditional probability. Γx. ∂t ∂x ∂y 2 ∂y 2 4.1) E(E(XF2 )F1 ) = E(E(XF1 )F2 ) = E(XF1 ). F. Then (see Theorem 2. it is B(E)–measurable in x (for ﬁxed t. u (4. X a random variable from (Ω.4. s. t. s) = 1. µ) to (E. t ∈ T with s t. ty. The transition function P (t. for all t. s) = E P (Γ. s). s) = P [Xt ∈ ΓXs = x] . s ∈ T with t X s and all Γ ∈ B(E). ux. Let (Ω. Γ ∈ B(E) and s. y0 ) satisﬁes the forward Kolmogorov equation ∂p ∂p ∂ 1 ∂ 2p = −p + (yp) + .
t. .1. x. t. ρ) is a complete separable metric space. Then there exists a Markov process X in E whose ﬁnitedimensional distributions are uniquely determined by (4. X(t1 ) ∈ Γ1 . ty. 65 . ·. x. ·) = P (t.16) and (4. s) := P(Xt ∈ ΓXs = x) = P(Xt ∈ ΓFs ) X X X = E(IΓ (Xt )Fs ) = E(E(IΓ (Xt )Fs )Fu ) X X X = E(E(IΓ (Xt )Fu )Fs ) = E(P(Xt ∈ ΓXu )Fs ) = E(P(Xt ∈ ΓXu = y)Xs = x) = R P (Γ. Γn )P (tn−1 − tn−2 . We set P (0.19) · · · × P (t1 . z. Γ). tx. x. yn−1 . .2. dyn−1 ) (4. y0 . t. The transition function P (x. ux. t − s. We have also set E = R. 4. . tXu = y)P (dy. Γ) = E P (s.Now we use the Markov property.. and will be the main object of study in this course. x. s). Γ) and the initial distribution ν determine the ﬁnite dimensional distributions of X by P(X0 ∈ Γ1 . x.. u)P (dy. Sec. ·). The CK equation is an integral equation and is the fundamental equation in the theory of Markov processes.18) and assume that (E. Γ) = P (0.17) and the fact that X X s < u ⇒ Fs ⊂ Fu to calculate: X P (Γ. Γn−1 P (tn − tn−1 .18) Let Xt be a homogeneous Markov process and assume that the initial distribution of Xt is given by the probability measure ν(Γ) = P (X0 ∈ Γ) (for deterministic initial conditions–X0 = x– we have that ν(Γ) = IΓ (x) ). A Markov process is homogeneous if P (t. together with equations (4.1]) Let P (t. Γ) satisfy (4. which is the fundamental equation in the theory of diffusion processes. Xtn ∈ Γn ) = Γ0 Γ1 . dy1 )ν(dy0 ). yn−2 . Γ). .19). dz)P (t.4. ΓXs = x) := P (s.4. The Chapman–Kolmogorov (CK) equation becomes P (t + s. (4. ([21. Under additional assumptions we will derive from it the FokkerPlanck PDE. uXs = x) P (Γ. Theorem 4. R =: IΓ (·) denotes the indicator function of the set Γ. ·. Deﬁnition 4.
Let $X_t$ be a homogeneous Markov process with initial distribution $\nu(\Gamma) = P(X_0 \in \Gamma)$ and transition function $P(t, x, \Gamma)$. We can calculate the probability of finding $X_t$ in a set $\Gamma$ at time $t$:
\[
P(X_t \in \Gamma) = \int_{\mathbb{R}} P(t, x, \Gamma)\, \nu(dx).
\]
Thus, the initial distribution and the transition function are sufficient to characterize a homogeneous Markov process. Notice that they do not provide us with any information about the actual paths of the Markov process. The transition probability $P(\Gamma, t | x, s)$ is a probability measure. Assume that it has a density for all $t > s$:
\[
P(\Gamma, t | x, s) = \int_\Gamma p(y, t | x, s)\, dy.
\]
Clearly, for $t = s$ we have $P(\Gamma, s | x, s) = I_\Gamma(x)$. The Chapman-Kolmogorov equation becomes
\[
\int_\Gamma p(y, t | x, s)\, dy = \int_{\mathbb{R}} \int_\Gamma p(y, t | z, u)\, p(z, u | x, s)\, dy\, dz,
\]
and, since $\Gamma \in \mathcal{B}(\mathbb{R})$ is arbitrary, we obtain the equation
\[
p(y, t | x, s) = \int_{\mathbb{R}} p(y, t | z, u)\, p(z, u | x, s)\, dz. \qquad (4.20)
\]
In words, the CK equation tells us that, for a Markov process, the transition from $x$ at time $s$ to $y$ at time $t$ can be done in two steps: first the system moves from $x$ to $z$ at some intermediate time $u$, and then it moves from $z$ to $y$ at time $t$. In order to calculate the probability for the transition from $(x, s)$ to $(y, t)$ we need to sum (integrate) the transitions from all possible intermediary states $z$. The transition probability density is a function of four arguments: the initial position and time $x, s$ and the final position and time $y, t$.

The above description suggests that a Markov process can be described through a semigroup of operators, i.e. a one-parameter family of linear operators with the properties
\[
P_0 = I, \qquad P_{t+s} = P_t \circ P_s \quad \forall\, t, s \geq 0.
\]
Indeed, let $P(t, x, dy)$ be the transition function of a homogeneous Markov process.
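For Brownian motion, whose transition density is the Gaussian (4.4), the Chapman-Kolmogorov equation (4.20) can be verified by direct numerical integration over the intermediate state; the times, evaluation points and grid below are arbitrary choices:

```python
import numpy as np

def p(y, t, x, s):
    """Brownian transition density p(y, t | x, s) from equation (4.4), t > s."""
    return np.exp(-(y - x) ** 2 / (2.0 * (t - s))) / np.sqrt(2.0 * np.pi * (t - s))

x, s, u, t = 0.3, 0.1, 0.6, 1.4            # s < u < t
z = np.linspace(-12.0, 12.0, 4001)         # grid for the intermediate state
dz = z[1] - z[0]

for y in (-1.0, 0.0, 2.5):
    lhs = p(y, t, x, s)
    rhs = np.sum(p(y, t, z, u) * p(z, u, x, s)) * dz   # integral over z
    print(lhs, rhs)                        # the two sides agree
```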
Let $X := C_b(E)$ and define the operator
\[
(P_t f)(x) := \mathbb{E}(f(X_t) | X_0 = x) = \int_E f(y)\, P(t, x, dy)
\]
for all $f(x) \in C_b(E)$ (continuous bounded functions on $E$). This is a linear operator with
\[
(P_0 f)(x) = \mathbb{E}(f(X_0) | X_0 = x) = f(x), \quad \text{i.e. } P_0 = I.
\]
Furthermore:
\[
(P_{t+s} f)(x) = \int f(y)\, P(t+s, x, dy) = \int \Big( \int f(y)\, P(s, z, dy) \Big) P(t, x, dz) = \int (P_s f)(z)\, P(t, x, dz) = (P_t \circ P_s f)(x).
\]
Consequently:
\[
P_{t+s} = P_t \circ P_s.
\]
Then the one-parameter family of operators $P_t$ forms a semigroup of operators on $C_b(E)$. Assume for simplicity that $P_t : C_b(E) \to C_b(E)$ and that $P_t$ is a contraction semigroup (let $X$ be a Banach space and $T : X \to X$ a bounded operator; then $T$ is a contraction provided that $\|Tf\|_X \leq \|f\|_X$ for all $f \in X$).

Definition 4.5.1. We define by $\mathcal{D}(\mathcal{L})$ the set of all $f \in C_b(E)$ such that the strong limit
\[
\mathcal{L}f = \lim_{t \to 0} \frac{P_t f - f}{t}
\]
exists. The operator $\mathcal{L} : \mathcal{D}(\mathcal{L}) \to C_b(E)$ is called the infinitesimal generator of the operator semigroup $P_t$.

Definition 4.5.2. The operator $\mathcal{L} : C_b(E) \to C_b(E)$ defined above is called the generator of the Markov process $\{X_t;\ t \geq 0\}$.

The semigroup property and the definition of the generator of a semigroup imply that, formally at least, we can write
\[
P_t = \exp(\mathcal{L}t).
\]
Consider the function $u(x, t) := (P_t f)(x) = \mathbb{E}(f(X_t) | X_0 = x)$. We calculate its time derivative:
\[
\frac{\partial u}{\partial t} = \frac{d}{dt}(P_t f) = \frac{d}{dt}\big( e^{\mathcal{L}t} f \big) = \mathcal{L}\big( e^{\mathcal{L}t} f \big) = \mathcal{L} P_t f = \mathcal{L}u.
\]
Furthermore, $u(x, 0) = P_0 f(x) = f(x)$. Consequently, $u(x, t)$ satisfies the initial value problem
\[
\frac{\partial u}{\partial t} = \mathcal{L}u, \qquad u(x, 0) = f(x). \qquad (4.21)
\]
When the semigroup $P_t$ is the transition semigroup of a Markov process $X_t$, then equation (4.21) is called the backward Kolmogorov equation. It governs the evolution of an observable $u(x, t) = \mathbb{E}(f(X_t) | X_0 = x)$. Thus, given the generator of a Markov process $\mathcal{L}$, we can calculate all the statistics of our process by solving the backward Kolmogorov equation. In the case where the Markov process is the solution of a stochastic differential equation, the generator is a second order elliptic operator and the backward Kolmogorov equation becomes an initial value problem for a parabolic PDE.

The space $C_b(E)$ is natural in a probabilistic context, but other Banach spaces often arise in applications; in particular, when there is a measure $\mu$ on $E$, the spaces $L^p(E; \mu)$ sometimes arise. We will quite often use the space $L^2(E; \mu)$, where $\mu$ is the invariant measure of our Markov process. The generator is frequently taken as the starting point for the definition of a homogeneous Markov process. Conversely, let $P_t$ be a contraction semigroup with $\mathcal{D}(P_t) \subset C_b(E)$, closed. Then, under mild technical hypotheses, there is an $E$-valued homogeneous Markov process $\{X_t\}$ associated with $P_t$, defined through
\[
\mathbb{E}[f(X_t) \,|\, \mathcal{F}_s^X] = P_{t-s} f(X_s)
\]
for all $t, s \in T$ with $t \geq s$ and $f \in \mathcal{D}(P_t)$.
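The definition of the generator can be probed numerically. For Brownian motion the generator is $\mathcal{L} = \frac{1}{2}\frac{d^2}{dx^2}$, so for a smooth observable $(P_t f - f)/t$ should approach $\frac{1}{2} f''$ as $t \to 0$. A Monte Carlo sketch (the observable, evaluation point and step size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

f = np.cos                              # smooth observable
x = 0.7                                 # evaluation point
t = 1e-3                                # short time
M = 4_000_000                           # Monte Carlo samples

# (P_t f)(x) = E[f(W_t) | W_0 = x], with W_t = x + sqrt(t) * N(0,1)
W_t = x + np.sqrt(t) * rng.standard_normal(M)
Ptf = f(W_t).mean()

approx_Lf = (Ptf - f(x)) / t            # finite-time difference quotient
exact_Lf = -0.5 * np.cos(x)             # (1/2) f''(x) for f = cos

print(approx_Lf, exact_Lf)              # close, up to O(t) bias and MC noise
```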
Example 4.5.3. The one dimensional Brownian motion is a homogeneous Markov process. The transition function is the Gaussian defined in the example in Lecture 2:
\[
P(t, x, dy) = \gamma_{t,x}(y)\, dy, \qquad \gamma_{t,x}(y) = \frac{1}{\sqrt{2\pi t}} \exp\Big( -\frac{|x - y|^2}{2t} \Big).
\]
The semigroup associated to the standard Brownian motion is the heat semigroup $P_t = e^{\frac{t}{2}\frac{d^2}{dx^2}}$. The generator of this Markov process is $\frac{1}{2}\frac{d^2}{dx^2}$. Notice that the transition probability density $\gamma_{t,x}$ of the one dimensional Brownian motion is the fundamental solution (Green's function) of the heat (diffusion) PDE
\[
\frac{\partial u}{\partial t} = \frac{1}{2} \frac{\partial^2 u}{\partial x^2}.
\]

Example 4.5.4. The Poisson process is a homogeneous Markov process.

4.5.1 The Adjoint Semigroup

The semigroup $P_t$ acts on bounded measurable functions. We can also define the adjoint semigroup $P_t^*$, which acts on probability measures:
\[
P_t^* \mu(\Gamma) = \int_{\mathbb{R}} P(X_t \in \Gamma | X_0 = x)\, d\mu(x) = \int_{\mathbb{R}} p(t, x, \Gamma)\, d\mu(x).
\]
The image of a probability measure $\mu$ under $P_t^*$ is again a probability measure. The operators $P_t$ and $P_t^*$ are adjoint in the $L^2$ sense:
\[
\int_{\mathbb{R}} P_t f(x)\, d\mu(x) = \int_{\mathbb{R}} f(x)\, d(P_t^* \mu)(x). \qquad (4.22)
\]
We can, formally at least, write
\[
P_t^* = \exp(\mathcal{L}^* t),
\]
where $\mathcal{L}^*$ is the $L^2$ adjoint of the generator of the process:
\[
\int \mathcal{L}f\, h\, dx = \int f\, \mathcal{L}^* h\, dx.
\]
Let $\mu_t := P_t^* \mu$. This is the law of the Markov process and $\mu$ is the initial distribution. An argument similar to the one used in the derivation of the backward Kolmogorov equation (4.21) enables us to obtain an equation for the evolution of $\mu_t$:
\[
\frac{\partial \mu_t}{\partial t} = \mathcal{L}^* \mu_t, \qquad \mu_0 = \mu.
\]
Assuming that $\mu_t = \rho(y, t)\, dy$ and $\mu = \rho_0(y)\, dy$, this equation becomes:
\[
\frac{\partial \rho}{\partial t} = \mathcal{L}^* \rho, \qquad \rho(y, 0) = \rho_0(y). \qquad (4.23)
\]
This is the forward Kolmogorov or Fokker-Planck equation. When the initial conditions are deterministic, $X_0 = x$, the initial condition becomes $\rho_0 = \delta(y - x)$. Given the initial distribution and the generator of the Markov process $X_t$, we can calculate the transition probability density by solving the forward Kolmogorov equation. We can then calculate all statistical quantities of this process through the formula
\[
\mathbb{E}(f(X_t) | X_0 = x) = \int f(y)\, \rho(t, y; x)\, dy.
\]
We will derive rigorously the backward and forward Kolmogorov equations for Markov processes that are defined as solutions of stochastic differential equations later on.

We can study the evolution of a Markov process in two different ways: either through the evolution of observables (Heisenberg/Koopman),
\[
\frac{\partial (P_t f)}{\partial t} = \mathcal{L}(P_t f),
\]
or through the evolution of states (Schrödinger/Frobenius-Perron),
\[
\frac{\partial (P_t^* \mu)}{\partial t} = \mathcal{L}^*(P_t^* \mu).
\]
We can also study Markov processes at the level of trajectories. We will do this after we define the concept of a stochastic differential equation.

4.6 Ergodic Markov processes

A very important concept in the study of limit theorems for stochastic processes is that of ergodicity. This concept, in the context of Markov processes, provides us with information on the long-time behavior of a Markov semigroup.

Definition 4.6.1. A Markov process is called ergodic if the equation
\[
P_t g = g, \qquad g \in C_b(E)\ \ \forall t \geq 0,
\]
has only constant solutions.
Roughly speaking, ergodicity corresponds to the case where the semigroup P_t is such that P_t − I has only constants in its null space or, equivalently, to the case where the generator L has only constants in its null space. This follows from the definition of the generator of a Markov process.

Under some additional compactness assumptions, an ergodic Markov process has an invariant measure µ with the property that, in the case T = R⁺,

lim_{t→+∞} (1/t) ∫_0^t g(X_s) ds = E g(x),

where E denotes the expectation with respect to µ. This is a physicist's definition of an ergodic process: time averages equal phase space averages.

Using the adjoint semigroup we can define an invariant measure as a solution of the equation

P_t* µ = µ.

If this measure is unique, then the Markov process is ergodic. Using this, we can obtain an equation for the invariant measure in terms of the adjoint L* of the generator, which is the generator of the semigroup P_t*. Indeed, from the definition of the generator of a semigroup and the definition of an invariant measure, we conclude that a measure µ is invariant if and only if

L* µ = 0

in some appropriate generalized sense ((L* µ, f) = 0 for every bounded measurable function f). Assume that µ(dx) = ρ(x) dx. Then the invariant density satisfies the stationary Fokker–Planck equation

L* ρ = 0.

The invariant measure (distribution) governs the long-time dynamics of the Markov process. In this case the transition probability density (the solution of the Fokker–Planck equation) becomes independent of time:

ρ(x, t) = ρ(x).

Consequently, the statistics of the Markov process are independent of time.

4.6.1 Stationary Markov Processes

If X_0 is distributed according to the invariant measure µ, then so is X_t for all t > 0. The resulting stochastic process, with X_0 distributed in this way, is stationary.
Example 4.6.2. Consider the one-dimensional Brownian motion. The generator of this Markov process is

L = ½ d²/dx².

In order to calculate the invariant measure we need to solve the stationary Fokker–Planck equation L*ρ = 0. This operator is self-adjoint, so the stationary Fokker–Planck equation becomes

½ d²ρ/dx² = 0,   (4.25)

together with the normalization and nonnegativity conditions

ρ ≥ 0,   ‖ρ‖_{L¹(R)} = ∫_R ρ(x) dx = 1.   (4.24)

The general solution of Equation (4.25) is ρ(x) = Ax + B for arbitrary constants A and B. This function is not normalizable, i.e. there do not exist constants A and B so that ∫_R ρ(x) dx = 1. Hence there are no solutions to Equation (4.25) subject to the constraints (4.24). Thus, the one-dimensional Brownian motion is not an ergodic process.

Example 4.6.3. Consider a one-dimensional Brownian motion on [0, 1], with periodic boundary conditions. The generator of this Markov process L is the differential operator L = ½ d²/dx², equipped with periodic boundary conditions on [0, 1]. Both the backward Kolmogorov and the Fokker–Planck equation reduce to the heat equation

∂ρ/∂t = ½ ∂²ρ/∂x²

with periodic boundary conditions in [0, 1]. Fourier analysis shows that the solution converges to a constant at an exponential rate. The null space of both L and L* comprises the constant functions on [0, 1]. Thus, it is an ergodic Markov process. See Exercise 6.

Example 4.6.4. The one-dimensional Ornstein–Uhlenbeck (OU) process is a Markov process with generator

L = −αx d/dx + D d²/dx².
Let us calculate the L² adjoint of L. Assuming that f, h decay sufficiently fast at infinity, we have:

∫_R (Lf) h dx = ∫_R [(−αx ∂_x f) h + (D ∂²_x f) h] dx = ∫_R [f ∂_x(αxh) + f (D ∂²_x h)] dx =: ∫_R f L*h dx,

where

L*h := d/dx (αxh) + D d²h/dx².

We can calculate the invariant distribution by solving the stationary Fokker–Planck equation

L*ρ = 0.   (4.26)

The invariant measure of this process is the Gaussian measure

µ(dx) = √(α/(2πD)) exp(−(α/(2D)) x²) dx.

If the initial condition of the OU process is distributed according to the invariant measure, then the OU process is a stationary Gaussian process. Let X_t be the one-dimensional OU process and let X_0 ∼ N(0, D/α). Then X_t is a mean zero, Gaussian, second order stationary process on [0, ∞) with correlation function

R(t) = (D/α) e^{−α|t|}

and spectral density

f(x) = (D/π) 1/(x² + α²).

Furthermore, the OU process is the only real-valued, mean zero, Gaussian, second-order stationary Markov process defined on R.

4.7 Discussion and Bibliography

The study of operator semigroups started in the late 40's independently by Hille and Yosida. Semigroup theory was developed in the 50's and 60's by Feller, Dynkin and others, mostly in connection with the theory of Markov processes. Necessary and sufficient conditions for an operator L to be the generator of a (contraction) semigroup are given by the Hille–Yosida theorem [22, Ch. 7].
4.8 Exercises

1. Let {X_n} be a stochastic process with state space S = Z. Show that it is a Markov process if and only if, for all n,

P(X_{n+1} = i_{n+1} | X_1 = i_1, ..., X_n = i_n) = P(X_{n+1} = i_{n+1} | X_n = i_n).

2. Show that (4.4) is the solution of the initial value problem (4.10), as well as of the final value problem

−∂p/∂s = ½ ∂²p/∂x²,   lim_{s→t} p(y, t|x, s) = δ(y − x).

3. Use (4.5) to show that the forward and backward Kolmogorov equations for the OU process are

∂p/∂t = ∂/∂y (yp) + ½ ∂²p/∂y²

and

−∂p/∂s = −x ∂p/∂x + ½ ∂²p/∂x².

4. Let W(t) be a standard one-dimensional Brownian motion, let Y(t) = σW(t) with σ > 0 and consider the process

X(t) = ∫_0^t Y(s) ds.

Show that the joint process {X(t), Y(t)} is Markovian and write down the generator of the process.

5. Let Y(t) = e^{−t} W(e^{2t}) be the stationary Ornstein–Uhlenbeck process and consider the process

X(t) = ∫_0^t Y(s) ds.

Show that the joint process {X(t), Y(t)} is Markovian and write down the generator of the process.

6. Consider a one-dimensional Brownian motion on [0, 1] with periodic boundary conditions. The generator of this Markov process L is the differential operator L = ½ d²/dx², equipped with periodic boundary conditions on [0, 1]. Show that this operator is self-adjoint. Show that the null space of both L and L* comprises the constant functions on [0, 1]. Conclude that this process is ergodic. Solve the corresponding Fokker–Planck equation for arbitrary initial conditions ρ_0(x). Show that the solution converges to a constant at an exponential rate.
7. (a) Let X, Y be mean zero Gaussian random variables with EX² = σ_X², EY² = σ_Y² and correlation coefficient ρ (the correlation coefficient is ρ = E(XY)/(σ_X σ_Y)). Show that

E(X|Y) = (ρ σ_X / σ_Y) Y.

(b) Let X_t be a mean zero stationary Gaussian process with autocorrelation function R(t). Use the previous result to show that

E[X_{t+s} | X_s] = (R(t)/R(0)) X(s),   s, t ≥ 0.

(c) Use the previous result to show that the only stationary Gaussian Markov process with continuous autocorrelation function is the stationary OU process.

8. Show that a Gaussian process X_t is a Markov process if and only if

E(X_{t_n} | X_{t_1} = x_1, ..., X_{t_{n−1}} = x_{n−1}) = E(X_{t_n} | X_{t_{n−1}} = x_{n−1}).
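A numerical sketch related to the last part of Exercise 6: an explicit finite-difference discretization of ∂ρ/∂t = ½ ∂²ρ/∂x² with periodic boundary conditions, showing that the distance of the solution from the constant density decays at the rate e^{−2π²t} of the first Fourier mode while total probability is conserved. The grid size, time step and initial density below are arbitrary choices.

```python
import math

# explicit finite differences for rho_t = (1/2) rho_xx on [0, 1], periodic BCs
N = 100
dx = 1.0 / N
dt = 0.2 * dx * dx           # 0.5*dt/dx^2 = 0.1, well inside the stability limit
rho = [1.0 + math.cos(2.0 * math.pi * i * dx) for i in range(N)]  # normalized density

def step(r):
    lam = 0.5 * dt / dx ** 2
    return [r[i] + lam * (r[(i + 1) % N] - 2.0 * r[i] + r[(i - 1) % N])
            for i in range(N)]

steps = 5000
for _ in range(steps):
    rho = step(rho)
t_end = steps * dt                                 # = 0.1

dist = max(abs(v - 1.0) for v in rho)              # distance to the constant density
predicted = math.exp(-2.0 * math.pi ** 2 * t_end)  # decay of the first Fourier mode
mass = sum(rho) * dx                               # total probability
print(dist, predicted, mass)
```

Since the initial condition is a pure Fourier mode, the discrete solution stays a single mode and the observed decay matches the predicted exponential rate to within the scheme's discretization error.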
Chapter 5

Diffusion Processes

5.1 Introduction

In this chapter we study a particular class of Markov processes, namely Markov processes with continuous paths. These processes are called diffusion processes and they appear in many applications in physics, chemistry, biology and finance.

In Section 5.2 we give the definition of a diffusion process. In Section 5.3 we derive the forward and backward Kolmogorov equations for one-dimensional diffusion processes. In Section 5.4 we present the forward and backward Kolmogorov equations in arbitrary dimensions. The connection between diffusion processes and stochastic differential equations is presented in Section 5.5. Discussion and bibliographical remarks are included in Section 5.7. Exercises can be found in Section 5.8.

5.2 Definition of a Diffusion Process

A Markov process consists of three parts: a drift (deterministic), a random process and a jump process. A diffusion process is a Markov process that has continuous sample paths (trajectories); thus, it is a Markov process with no jumps. A diffusion process can be defined by specifying its first two moments:

Definition 5.2.1. A Markov process X_t with transition function P(Γ, t|x, s) is called a diffusion process if the following conditions are satisfied.
i. (Continuity). For every x and every ε > 0

∫_{|y−x|>ε} P(dy, t|x, s) = o(t − s),   (5.1)

uniformly over s < t.

ii. (Definition of drift coefficient). There exists a function a(x, s) such that for every x and every ε > 0

∫_{|y−x|≤ε} (y − x) P(dy, t|x, s) = a(x, s)(t − s) + o(t − s),   (5.2)

uniformly over s < t.

iii. (Definition of diffusion coefficient). There exists a function b(x, s) such that for every x and every ε > 0

∫_{|y−x|≤ε} (y − x)² P(dy, t|x, s) = b(x, s)(t − s) + o(t − s),   (5.3)

uniformly over s < t.

Remark 5.2.2. In Definition 5.2.1 we had to truncate the domain of integration since we didn't know whether the first and second moments exist. If we assume that there exists a δ > 0 such that

lim_{t→s} 1/(t − s) ∫_{R^d} |y − x|^{2+δ} P(dy, t|x, s) = 0,   (5.4)

then we can extend the integration over the whole of R^d and use expectations in the definition of the drift and the diffusion coefficient. Indeed, let k = 0, 1, 2 and notice that

∫_{|y−x|>ε} |y − x|^k P(dy, t|x, s) = ∫_{|y−x|>ε} |y − x|^{2+δ} |y − x|^{k−(2+δ)} P(dy, t|x, s)
≤ (1/ε^{2+δ−k}) ∫_{|y−x|>ε} |y − x|^{2+δ} P(dy, t|x, s)
≤ (1/ε^{2+δ−k}) ∫_{R^d} |y − x|^{2+δ} P(dy, t|x, s).

Using this estimate together with (5.4) we conclude that:

lim_{t→s} 1/(t − s) ∫_{|y−x|>ε} |y − x|^k P(dy, t|x, s) = 0,   k = 0, 1, 2.
This implies that assumption (5.4) is sufficient for the sample paths to be continuous (k = 0) and for the replacement of the truncated integrals in (5.2) and (5.3) by integrals over R (k = 1 and k = 2, respectively). The definitions of the drift and diffusion coefficients become:

lim_{t→s} E( (X_t − X_s)/(t − s) | X_s = x ) = a(x, s)   (5.5)

and

lim_{t→s} E( |X_t − X_s|²/(t − s) | X_s = x ) = b(x, s).   (5.6)

5.3 The Backward and Forward Kolmogorov Equations

In this section we show that a diffusion process is completely determined by its first two moments. In particular, we will obtain partial differential equations that govern the evolution of the conditional expectation of an arbitrary function of a diffusion process X_t, as well as of the transition probability density p(y, t|x, s). These are the backward and forward Kolmogorov equations. In this section we derive the backward and forward Kolmogorov equations for one-dimensional diffusion processes. The extension to multidimensional diffusion processes is presented in Section 5.4.

5.3.1 The Backward Kolmogorov Equation

Theorem 5.3.1. (Kolmogorov) Let f(x) ∈ C_b(R) and let

u(x, s) := E(f(X_t) | X_s = x) = ∫ f(y) P(dy, t|x, s),

with t fixed. Assume furthermore that the functions a(x, s), b(x, s) are continuous in both x and s. Then u(x, s) ∈ C^{2,1}(R × R⁺) and it solves the final value problem

−∂u/∂s = a(x, s) ∂u/∂x + ½ b(x, s) ∂²u/∂x²,   lim_{s→t} u(s, x) = f(x).   (5.7)
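Before the proof, a quick numerical sanity check of (5.7) in the simplest case, Brownian motion (a = 0, b = 1), for which u(x, s) = E(f(W_t)|W_s = x) is known in closed form when f(y) = cos y. The evaluation point and finite-difference step below are arbitrary choices.

```python
import math

t = 1.0   # fixed terminal time in u(x, s) = E(f(X_t) | X_s = x)

def u(x, s):
    # Brownian motion, f(y) = cos y: E cos(x + Z) = e^{-(t-s)/2} cos x for Z ~ N(0, t-s)
    return math.exp(-(t - s) / 2.0) * math.cos(x)

x0, s0, h = 0.4, 0.3, 1e-3
u_s = (u(x0, s0 + h) - u(x0, s0 - h)) / (2.0 * h)                   # du/ds
u_xx = (u(x0 + h, s0) - 2.0 * u(x0, s0) + u(x0 - h, s0)) / h ** 2   # d2u/dx2
residual = -u_s - 0.5 * u_xx    # (5.7) with a = 0, b = 1: -u_s = (1/2) u_xx
print(residual)                 # close to zero
print(u(x0, t), math.cos(x0))   # final condition u(x, t) = f(x)
```

The residual is zero up to the finite-difference truncation error, and the final condition holds exactly.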
Proof. First we notice that the continuity assumption (5.1), together with the fact that the function f(x) is bounded, imply that

u(x, s) = ∫_R f(y) P(dy, t|x, s)
= ∫_{|y−x|≤ε} f(y) P(dy, t|x, s) + ∫_{|y−x|>ε} f(y) P(dy, t|x, s)
≤ ∫_{|y−x|≤ε} f(y) P(dy, t|x, s) + ‖f‖_{L∞} ∫_{|y−x|>ε} P(dy, t|x, s)
= ∫_{|y−x|≤ε} f(y) P(dy, t|x, s) + o(t − s).

We add and subtract the final condition f(x) and use the previous calculation to obtain:

u(x, s) = f(x) + ∫_R (f(y) − f(x)) P(dy, t|x, s)
= f(x) + ∫_{|y−x|≤ε} (f(y) − f(x)) P(dy, t|x, s) + ∫_{|y−x|>ε} (f(y) − f(x)) P(dy, t|x, s)
= f(x) + ∫_{|y−x|≤ε} (f(y) − f(x)) P(dy, t|x, s) + o(t − s).

Now the final condition follows from the fact that f(x) ∈ C_b(R) and the arbitrariness of ε.

Now we show that u(x, s) solves the backward Kolmogorov equation. We use the Chapman–Kolmogorov equation (4.15) to obtain

u(x, σ) = ∫_R f(z) P(dz, t|x, σ)   (5.8)
= ∫_R ∫_R f(z) P(dz, t|y, ρ) P(dy, ρ|x, σ)
= ∫_R u(y, ρ) P(dy, ρ|x, σ).   (5.9)

The Taylor series expansion of the function u(z, ρ) around x gives

u(z, ρ) − u(x, ρ) = ∂u(x, ρ)/∂x (z − x) + ½ ∂²u(x, ρ)/∂x² (z − x)² (1 + α_ε),   |z − x| ≤ ε,   (5.10)

where

α_ε = sup_{ρ, |z−x|≤ε} | ∂²u(z, ρ)/∂x² − ∂²u(x, ρ)/∂x² |.

Notice that, since u(x, s) is twice continuously differentiable in x, lim_{ε→0} α_ε = 0.
We combine now (5.9) with (5.10) to calculate

(u(x, s) − u(x, s + h))/h = (1/h) ( ∫_R P(dy, s + h|x, s) u(y, s + h) − u(x, s + h) )
= (1/h) ∫_R P(dy, s + h|x, s) (u(y, s + h) − u(x, s + h))
= (1/h) ∫_{|x−y|<ε} P(dy, s + h|x, s) (u(y, s + h) − u(x, s + h)) + o(1)
= ∂u/∂x (x, s + h) (1/h) ∫_{|x−y|<ε} (y − x) P(dy, s + h|x, s)
  + (1/(2h)) ∂²u/∂x² (x, s + h) ∫_{|x−y|<ε} (y − x)² P(dy, s + h|x, s) (1 + α_ε) + o(1)
= a(x, s) ∂u/∂x (x, s + h) + ½ b(x, s) ∂²u/∂x² (x, s + h) (1 + α_ε) + o(1).

Equation (5.7) follows by taking the limits ε → 0, h → 0:

∂u/∂s + A_{s,x} u = 0,   (5.11)

where

A_{s,x} := a(x, s) ∂/∂x + ½ b(x, s) ∂²/∂x².

Assume now that the transition function has a density p(y, t|x, s). In this case the formula for u(x, s) becomes

u(x, s) = ∫_R f(y) p(y, t|x, s) dy.

Substituting this in the backward Kolmogorov equation we obtain

∫_R f(y) ( ∂p(y, t|x, s)/∂s + A_{s,x} p(y, t|x, s) ) dy = 0.   (5.12)

Since (5.12) is valid for arbitrary functions f(y), we obtain a partial differential equation for the transition probability density:

−∂p(y, t|x, s)/∂s = a(x, s) ∂p(y, t|x, s)/∂x + ½ b(x, s) ∂²p(y, t|x, s)/∂x².

Notice that the variation is with respect to the "backward" variables x, s. We will obtain an equation with respect to the "forward" variables y, t in the next section.

5.3.2 The Forward Kolmogorov Equation

In this section we will obtain the forward Kolmogorov equation. In the physics literature it is called the Fokker–Planck equation. We assume that the transition function has a density with respect to Lebesgue measure:
P(Γ, t|x, s) = ∫_Γ p(y, t|x, s) dy.

Theorem 5.3.2. (Kolmogorov) Assume that conditions (5.1), (5.2) and (5.3) are satisfied, and that p(y, t|·, ·), a(y, t), b(y, t) ∈ C^{2,1}(R × R⁺). Then the transition probability density satisfies the equation

∂p/∂t = −∂/∂y (a(t, y)p) + ½ ∂²/∂y² (b(t, y)p),   lim_{t→s} p(y, t|x, s) = δ(x − y).   (5.13)

Proof. Fix a function f(y) ∈ C_0²(R). An argument similar to the one used in the proof of the backward Kolmogorov equation gives

lim_{h→0} (1/h) ( ∫ f(y) p(y, s + h|x, s) dy − f(x) ) = a(x, s) f_x(x) + ½ b(x, s) f_{xx}(x),   (5.14)

where subscripts denote differentiation with respect to x. On the other hand,

∫ f(y) ∂/∂t p(y, t|x, s) dy = ∂/∂t ∫ f(y) p(y, t|x, s) dy
= lim_{h→0} (1/h) ∫ (p(y, t + h|x, s) − p(y, t|x, s)) f(y) dy
= lim_{h→0} (1/h) ( ∫∫ p(y, t + h|z, t) p(z, t|x, s) f(y) dy dz − ∫ p(z, t|x, s) f(z) dz )
= lim_{h→0} (1/h) ∫ p(z, t|x, s) ( ∫ p(y, t + h|z, t) f(y) dy − f(z) ) dz
= ∫ p(z, t|x, s) ( a(z, t) f_z(z) + ½ b(z, t) f_{zz}(z) ) dz
= ∫ ( −∂/∂z (a(z, t) p(z, t|x, s)) + ½ ∂²/∂z² (b(z, t) p(z, t|x, s)) ) f(z) dz.

In the above calculation we used the Chapman–Kolmogorov equation. We have also performed two integrations by parts and used the fact that, since the test function f has compact support, the boundary terms vanish. Since the above equation is valid for every test function f(y), the forward Kolmogorov equation follows.

Assume now that the initial distribution of X_t is ρ_0(x) and set s = 0 (the initial time) in (5.13). Define

p(y, t) := ∫ p(y, t|x, 0) ρ_0(x) dx.   (5.15)
We multiply the forward Kolmogorov equation (5.13) by ρ_0(x) and integrate with respect to x to obtain the equation

∂p(y, t)/∂t = −∂/∂y (a(y, t) p(y, t)) + ½ ∂²/∂y² (b(y, t) p(y, t)),   (5.16)

together with the initial condition

p(y, 0) = ρ_0(y).   (5.17)

The solution of equation (5.16) provides us with the probability that the diffusion process X_t, which initially was distributed according to the probability density ρ_0(x), is equal to y at time t. Alternatively, we can think of the solution to (5.13) as the Green's function for the PDE (5.16). Using (5.16) we can calculate the expectation of an arbitrary function of the diffusion process X_t:

E(f(X_t)) = ∫∫ f(y) p(y, t|x, 0) p(x, 0) dx dy = ∫ f(y) p(y, t) dy,

where p(y, t) is the solution of (5.16). Quite often we need to calculate joint probability densities, for example the probability that X_{t₁} = x₁ and X_{t₂} = x₂. From the properties of conditional expectation we have that

p(x₁, t₁; x₂, t₂) = P(X_{t₁} = x₁, X_{t₂} = x₂) = P(X_{t₁} = x₁ | X_{t₂} = x₂) P(X_{t₂} = x₂) = p(x₁, t₁ | x₂, t₂) p(x₂, t₂).

Using the joint probability density we can calculate the statistics of a function of the diffusion process X_t at times t and s:

E(f(X_t, X_s)) = ∫∫ f(y, x) p(y, t|x, s) p(x, s) dx dy.   (5.18)

The autocorrelation function at times t and s is given by

E(X_t X_s) = ∫∫ y x p(y, t|x, s) p(x, s) dx dy.

In particular,

E(X_t X_0) = ∫∫ y x p(y, t|x, 0) p(x, 0) dx dy.
5.4 Multidimensional Diffusion Processes

Let X_t be a diffusion process in R^d. The drift and diffusion coefficients of a diffusion process in R^d are defined as:

lim_{t→s} 1/(t − s) ∫_{|y−x|<ε} (y − x) P(dy, t|x, s) = a(x, s)

and

lim_{t→s} 1/(t − s) ∫_{|y−x|<ε} (y − x) ⊗ (y − x) P(dy, t|x, s) = b(x, s).

The drift coefficient a(x, s) is a d-dimensional vector field and the diffusion coefficient b(x, s) is a d × d symmetric matrix (second order tensor). Notice that from the above definition it follows that the diffusion matrix is symmetric and nonnegative definite. Assuming that the first and second moments of the multidimensional diffusion process exist, we can write the formulas for the drift vector and diffusion matrix as

lim_{t→s} E( (X_t − X_s)/(t − s) | X_s = x ) = a(x, s)   (5.19)

and

lim_{t→s} E( (X_t − X_s) ⊗ (X_t − X_s)/(t − s) | X_s = x ) = b(x, s).   (5.20)

The generator of a d-dimensional diffusion process is

L = a(x, s) · ∇ + ½ b(x, s) : ∇∇ = Σ_{j=1}^d a_j(x, s) ∂/∂x_j + ½ Σ_{i,j=1}^d b_ij(x, s) ∂²/(∂x_i ∂x_j).

Exercise 5.4.1. Derive rigorously the forward and backward Kolmogorov equations in arbitrary dimensions.

5.5 Connection with Stochastic Differential Equations

Notice also that the continuity condition can be written in the form

P(|X_t − X_s| ≥ ε | X_s = x) = o(t − s).
Now it becomes clear that this condition implies that the probability of large changes in X_t over short time intervals is small. Notice, on the other hand, that the above condition implies that the sample paths of a diffusion process are not differentiable: if they were, then the right hand side of the above equation would have to be 0 when t − s ≪ 1. The sample paths of a diffusion process have the regularity of Brownian paths. A Markovian process cannot be differentiable: we can define the derivative of a sample path only for processes for which the past and future are not statistically independent when conditioned on the present.

Let us denote the expectation conditioned on X_s = x by E^{s,x}. Notice that the definitions of the drift and diffusion coefficients (5.5) and (5.6) can be written in the form

E^{s,x}(X_t − X_s) = a(x, s)(t − s) + o(t − s)

and

E^{s,x}( (X_t − X_s) ⊗ (X_t − X_s) ) = b(x, s)(t − s) + o(t − s).

Consequently, the drift coefficient defines the mean velocity vector for the stochastic process X_t, whereas the diffusion coefficient (tensor) is a measure of the local magnitude of fluctuations of X_t − X_s about the mean value. Hence, we can write locally:

X_t − X_s ≈ a(s, X_s)(t − s) + σ(s, X_s) ξ_t,

where b = σσ^T and ξ_t is a mean zero Gaussian process with

E^{s,x}(ξ_t ⊗ ξ_s) = (t − s) I.

Since we have that W_t − W_s ∼ N(0, (t − s)I), we conclude that we can write locally:

ΔX_t ≈ a(s, X_s) Δt + σ(s, X_s) ΔW_t.

Or, replacing the differences by differentials:

dX_t = a(t, X_t) dt + σ(t, X_t) dW_t.

Hence, the sample paths of a diffusion process are governed by a stochastic differential equation (SDE).
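The local relation ΔX_t ≈ a(s, X_s)Δt + σ(s, X_s)ΔW_t is precisely the Euler–Maruyama scheme for simulating an SDE numerically. A minimal sketch for the Ornstein–Uhlenbeck equation dX_t = −αX_t dt + √D dW_t of Section 5.6 (the parameters, step size and number of paths are arbitrary choices); the sample mean and variance at time T can be compared with the exact values E X_T = x_0 e^{−αT} and Var X_T = (D/2α)(1 − e^{−2αT}).

```python
import math, random

random.seed(1)
alpha, D = 1.0, 1.0        # dX_t = -alpha X_t dt + sqrt(D) dW_t
x0, T, h = 1.0, 1.0, 0.01
steps = int(round(T / h))
n_paths = 20000

finals = []
for _ in range(n_paths):
    x = x0
    for _ in range(steps):
        dW = math.sqrt(h) * random.gauss(0.0, 1.0)   # Brownian increment ~ N(0, h)
        x += -alpha * x * h + math.sqrt(D) * dW      # Euler-Maruyama step
    finals.append(x)

mean = sum(finals) / n_paths
var = sum((v - mean) ** 2 for v in finals) / n_paths
print(mean, math.exp(-1.0))                   # exact mean: x0 e^{-alpha T}
print(var, 0.5 * (1.0 - math.exp(-2.0)))      # exact variance: (D/2a)(1 - e^{-2aT})
```

Both estimates agree with the exact values up to the O(h) weak discretization bias and the Monte Carlo sampling error.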
5.6 Examples of Diffusion Processes

i. The one-dimensional Brownian motion starting at x is a diffusion process with generator

L = ½ d²/dx².

The drift and diffusion coefficients are, respectively, a(x) = 0 and b(x) = 1. The corresponding stochastic differential equation is

dX_t = dW_t,   X_0 = x.

The solution of this SDE is X_t = x + W_t.

ii. The one-dimensional Ornstein–Uhlenbeck process is a diffusion process with drift and diffusion coefficients, respectively, a(x) = −αx and b(x) = D. The generator of this process is

L = −αx d/dx + (D/2) d²/dx².

The corresponding SDE is

dX_t = −αX_t dt + √D dW_t.

The solution to this equation is

X_t = e^{−αt} X_0 + √D ∫_0^t e^{−α(t−s)} dW_s.

5.7 Discussion and Bibliography

The argument used in the derivation of the forward and backward Kolmogorov equations goes back to Kolmogorov's original work. More material on diffusion processes can be found in [36], [42].

5.8 Exercises

1. Prove equation (5.14).

2. Derive the initial value problem (5.16), (5.17).

3. Derive rigorously the backward and forward Kolmogorov equations in arbitrary dimensions.
Chapter 6

The Fokker–Planck Equation

6.1 Introduction

In the previous chapter we derived the backward and forward (Fokker–Planck) Kolmogorov equations and we showed that all statistical properties of a diffusion process can be calculated from the solution of the Fokker–Planck equation.¹ In this long chapter we study various properties of this equation, such as existence and uniqueness of solutions, long-time asymptotics, boundary conditions and spectral properties of the Fokker–Planck operator. We also study in some detail various examples of diffusion processes and of the associated Fokker–Planck equation. We will restrict attention to time-homogeneous diffusion processes, for which the drift and diffusion coefficients do not depend on time.

In Section 6.2 we study various basic properties of the Fokker–Planck equation, including existence and uniqueness of solutions, writing the equation as a conservation law, and boundary conditions. In Section 6.3 we present some examples of diffusion processes and use the corresponding Fokker–Planck equation in order to calculate various quantities of interest such as moments. In Section 6.4 we study the multidimensional Ornstein–Uhlenbeck process and the spectral properties of the corresponding Fokker–Planck operator. In Section 6.5 we study stochastic processes whose drift is given by the gradient of a scalar function, gradient flows. In Section 6.7 we solve the Fokker–Planck equation for a gradient SDE using eigenfunction expansions and we show how the eigenvalue problem for the Fokker–Planck operator can be reduced to the eigenfunction expansion for a Schrödinger operator. In Section 8.2 we study the Langevin equation and the associated Fokker–Planck equation. In Section 8.3 we calculate the eigenvalues and eigenfunctions of the Fokker–Planck operator for the Langevin equation in a harmonic potential. Discussion and bibliographical remarks are included in Section 6.8. Exercises can be found in Section 6.9.

¹ In this chapter we will call the equation Fokker–Planck, which is more customary in the physics literature, rather than forward Kolmogorov, which is more customary in the mathematics literature.

6.2 Basic Properties of the FP Equation

6.2.1 Existence and Uniqueness of Solutions

Consider a homogeneous diffusion process on R^d with drift vector a(x) and diffusion matrix b(x). The Fokker–Planck equation is

∂p/∂t = −Σ_{j=1}^d ∂/∂x_j (a_j(x) p) + ½ Σ_{i,j=1}^d ∂²/(∂x_i ∂x_j) (b_ij(x) p),   t > 0, x ∈ R^d,   (6.1a)

p(x, 0) = f(x),   x ∈ R^d.   (6.1b)

Since f(x) is the probability density of the initial condition (which is a random variable), we have that

f(x) ≥ 0   and   ∫_{R^d} f(x) dx = 1.

We can also write the equation in non-divergence form:

∂p/∂t = Σ_{j=1}^d ã_j(x) ∂p/∂x_j + ½ Σ_{i,j=1}^d b_ij(x) ∂²p/(∂x_i ∂x_j) + c̃(x) p,   t > 0, x ∈ R^d,   (6.2a)

p(x, 0) = f(x),   x ∈ R^d,   (6.2b)

where

ã_i(x) = −a_i(x) + Σ_{j=1}^d ∂b_ij/∂x_j,   c̃(x) = ½ Σ_{i,j=1}^d ∂²b_ij/(∂x_i ∂x_j) − Σ_{i=1}^d ∂a_i/∂x_i.

By definition (see equation (5.20)), the diffusion matrix is always symmetric and nonnegative. We will assume that it is actually uniformly positive definite, i.e. we will impose the uniform ellipticity condition:

Σ_{i,j=1}^d b_ij(x) ξ_i ξ_j ≥ α |ξ|²,   ∀ ξ ∈ R^d.   (6.3)

Furthermore, we will assume that the coefficients ã, b, c̃ are smooth and that they satisfy the growth conditions

|b(x)| ≤ M,   |ã(x)| ≤ M(1 + |x|),   |c̃(x)| ≤ M(1 + |x|²).   (6.4)

Definition 6.2.1. We will call a solution to the Cauchy problem for the Fokker–Planck equation (6.2) a classical solution if:

i. u ∈ C^{2,1}(R^d, R⁺);

ii. for every T > 0 there exists a c > 0 such that

‖u(t, x)‖_{L^∞(0,T)} ≤ c e^{α|x|²};

iii. lim_{t→0} u(t, x) = f(x).

It is a standard result in the theory of parabolic partial differential equations that, under the regularity and uniform ellipticity assumptions, the Fokker–Planck equation has a unique smooth solution. Furthermore, the solution can be estimated in terms of an appropriate heat kernel (i.e. the solution of the heat equation on R^d).

Theorem 6.2.2. Assume that conditions (6.3) and (6.4) are satisfied, and assume that |f| ≤ c e^{α|x|²}. Then there exists a unique classical solution to the Cauchy problem for the Fokker–Planck equation. Furthermore, there exist positive constants K, δ so that

|p|, |p_t|, |∇p|, |D²p| ≤ K t^{(−n+2)/2} exp(−(1/(2t)) δ |x|²).   (6.5)

Remark 6.2.3. The solution of the Fokker–Planck equation is nonnegative for all times, provided that the initial distribution is nonnegative. This follows from the maximum principle for parabolic PDEs.

Notice that from estimates (6.3) and (6.5) it follows that all moments of a uniformly elliptic diffusion process exist. To see this, we can multiply the Fokker–Planck equation by monomials xⁿ, integrate over R^d and integrate by parts; no boundary terms will appear, in view of the estimate (6.5).

6.2.2 The FP equation as a conservation law

The Fokker–Planck equation is in fact a conservation law: it expresses the law of conservation of probability. To see this we define the probability current to be the vector whose ith component is

J_i := a_i(x) p − ½ Σ_{j=1}^d ∂/∂x_j (b_ij(x) p).   (6.6)
We use the probability current to write the Fokker–Planck equation as a continuity equation:

∂p/∂t + ∇ · J = 0.

Integrating the Fokker–Planck equation over R^d and integrating by parts on the right hand side of the equation, we obtain

d/dt ∫_{R^d} p(x, t) dx = 0.

Consequently:

‖p(·, t)‖_{L¹(R^d)} = ‖p(·, 0)‖_{L¹(R^d)} = 1,   t ≥ 0.   (6.7)

Hence, the total probability is conserved, as expected. Equation (6.7) simply means that

E(X_t ∈ R^d) = 1,   t ≥ 0.

6.2.3 Boundary conditions for the Fokker–Planck equation

When studying a diffusion process that can take values on the whole of R^d, we study the pure initial value (Cauchy) problem for the Fokker–Planck equation, equation (6.1). The boundary condition was that the solution decays sufficiently fast at infinity. For ergodic diffusion processes this is equivalent to requiring that the solution of the backward Kolmogorov equation is an element of L²(µ), where µ is the invariant measure of the process. There are many applications where it is important to study stochastic processes in bounded domains. In this case it is necessary to specify the value of the stochastic process (or equivalently of the solution to the Fokker–Planck equation) on the boundary.

To understand the type of boundary conditions that we can impose on the Fokker–Planck equation, let us consider the example of a random walk on the domain {0, 1, ..., N}.² When the random walker reaches either the left or the right boundary we can either set

i. X_0 = 0 or X_N = 0, which means that the particle gets absorbed at the boundary;

ii. X_0 = X_1 or X_N = X_{N−1}, which means that the particle is reflected at the boundary;

iii. X_0 = X_N, which means that the particle is moving on a circle (i.e., we identify the left and right boundaries).

Hence, we can have absorbing, reflecting or periodic boundary conditions.

² Of course, the random walk is not a diffusion process. However, as we have already seen, Brownian motion can be defined as the limit of an appropriately rescaled random walk. A similar construction exists for more general diffusion processes.
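The difference between the boundary behaviours can be illustrated with a small Monte Carlo experiment on the random walk (the lattice size, walker count and step count below are arbitrary choices): with reflecting boundaries every walker survives, so total probability is conserved, while with absorbing boundaries probability leaks out through the ends.

```python
import random

random.seed(2)

def survival_fraction(boundary, n_walkers=2000, n_steps=400, N=20):
    # random walk on {0, ..., N} started at the midpoint;
    # returns the fraction of walkers not yet absorbed after n_steps
    alive = 0
    for _ in range(n_walkers):
        x = N // 2
        absorbed = False
        for _ in range(n_steps):
            x += random.choice((-1, 1))
            if x < 0 or x > N:
                if boundary == "absorbing":
                    absorbed = True
                    break
                else:           # reflecting: the particle bounces back into the domain
                    x = 1 if x < 0 else N - 1
        if not absorbed:
            alive += 1
    return alive / n_walkers

f_reflecting = survival_fraction("reflecting")
f_absorbing = survival_fraction("absorbing")
print(f_reflecting, f_absorbing)   # 1.0 for reflecting; strictly below 1 for absorbing
```

This is the discrete analogue of the statement that the reflecting boundary condition n · J = 0 conserves ‖p(·, t)‖_{L¹}, while the absorbing condition p = 0 does not.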
Consider the Fokker–Planck equation posed in Ω ⊂ R^d, where Ω is a bounded domain with smooth boundary. Let J denote the probability current and let n be the unit outward pointing normal vector to the surface. The boundary conditions are:

i. The transition probability density vanishes on an absorbing boundary:

p(x, t) = 0   on ∂Ω.

ii. There is no net flow of probability on a reflecting boundary:

n · J(x, t) = 0   on ∂Ω.

iii. The transition probability density is a periodic function in the case of periodic boundary conditions.

Notice that, of course, we can consider more complicated, mixed boundary conditions.

Consider now a diffusion process in one dimension on the interval [0, L]. The boundary conditions are

p(0, t) = p(L, t) = 0   (absorbing),

J(0, t) = J(L, t) = 0   (reflecting),

p(0, t) = p(L, t)   (periodic),

where the probability current is defined in (6.6). An example of mixed boundary conditions would be absorbing boundary conditions at the left end and reflecting boundary conditions at the right end:

p(0, t) = 0,   J(L, t) = 0.

There is a complete classification of boundary conditions in one dimension, the Feller classification: the boundary conditions can be regular, exit, entrance or natural. Notice that, using the terminology customary in PDE theory, absorbing boundary conditions correspond to Dirichlet boundary conditions and reflecting boundary conditions correspond to Neumann boundary conditions.
6.3 Examples of Diffusion Processes

6.3.1 Brownian Motion

Brownian Motion on R. Set a(y, t) ≡ 0 and b(y, t) ≡ 2D > 0. This diffusion process is the Brownian motion with diffusion coefficient D. Let us calculate the transition probability density of this process, assuming that the Brownian particle is at y at time s. The Fokker–Planck equation for the transition probability density p(x, t|y, s) is:

∂p/∂t = D ∂²p/∂x²,   p(x, s|y, s) = δ(x − y).   (6.8)

The solution to this equation is the Green's function (fundamental solution) of the heat equation:

p(x, t|y, s) = (1/√(4πD(t − s))) exp(−(x − y)²/(4D(t − s))).   (6.9)

Assume now that the initial condition W_0 of the Brownian particle is a random variable with distribution ρ_0(x). To calculate the probability density function (distribution function) of the Brownian particle we need to solve the Fokker–Planck equation with initial condition ρ_0(x). In other words, we need to take the average of the probability density function p(x, t|y, 0) over all initial realizations of the Brownian particle.

Notice that using the Fokker–Planck equation for the Brownian motion we can immediately show that the mean squared displacement grows linearly in time. Assuming that the Brownian particle is at the origin at time t = 0, we get

d/dt E W_t² = d/dt ∫_R x² p(x, t|0, 0) dx = D ∫_R x² ∂²p(x, t|0, 0)/∂x² dx = 2D ∫_R p(x, t|0, 0) dx = 2D,

where we performed two integrations by parts and used the fact that, in view of (6.9), no boundary terms remain. From this calculation we conclude that

E W_t² = 2Dt.
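The linear growth E W_t² = 2Dt can also be observed in a direct simulation of Brownian paths (the diffusion coefficient, step size and number of paths below are arbitrary choices):

```python
import math, random

random.seed(3)
D, dt, n_steps, n_paths = 0.5, 0.01, 100, 20000
msd = [0.0] * (n_steps + 1)    # Monte Carlo estimate of E W_t^2 at t = k*dt
for _ in range(n_paths):
    x = 0.0
    for k in range(1, n_steps + 1):
        x += math.sqrt(2.0 * D * dt) * random.gauss(0.0, 1.0)  # increment ~ N(0, 2D dt)
        msd[k] += x * x
msd = [m / n_paths for m in msd]
print(msd[50], msd[100])   # compare with 2*D*t: here 0.5 at t = 0.5 and 1.0 at t = 1
```

The estimates at the two times agree with 2Dt up to the Monte Carlo sampling error, confirming the linear growth of the mean squared displacement.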
The solution of the Fokker–Planck equation with random initial condition is

p(x, t) = ∫ p(x, t|y, 0) ρ_0(y) dy =: p ⋆ ρ_0.   (6.10)

Notice that the transition probability density depends on x and y only through their difference, p(x, t|y, 0) = p(x − y, t); from (6.10) we see that the distribution function is given by the convolution between the transition probability density and the initial condition.

Brownian motion with absorbing boundary conditions. We can also consider Brownian motion in a bounded domain, with either absorbing, reflecting or periodic boundary conditions. Set D = 1 and consider the Fokker–Planck equation (6.8) on [0, 1] with absorbing boundary conditions:

∂p/∂t = ½ ∂²p/∂x²,   p(0, t) = p(1, t) = 0.   (6.11)

The initial condition is p(x, 0) = δ(x − x_0), where we have assumed that W_0 = x_0. We look for a solution to this equation in a sine Fourier series:

p(x, t) = Σ_{n=1}^∞ p_n(t) sin(nπx).   (6.12)

Notice that the boundary conditions are automatically satisfied. We substitute the expansion (6.12) into (6.11) and use the orthogonality properties of the Fourier basis to obtain the equations

ṗ_n = −(n²π²/2) p_n,   n = 1, 2, ...

The solution of this equation is

p_n(t) = p_n(0) e^{−n²π²t/2}.

The Fourier coefficients of the initial condition are

p_n(0) = 2 ∫_0^1 δ(x − x_0) sin(nπx) dx = 2 sin(nπx_0).

Consequently, the transition probability density for the Brownian motion on [0, 1] with absorbing boundary conditions is

p(x, t|x_0, 0) = 2 Σ_{n=1}^∞ e^{−n²π²t/2} sin(nπx_0) sin(nπx).
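The sine series just obtained can be summed numerically; integrating it over [0, 1] gives the survival probability, which is strictly less than one and decreasing in time, because probability is lost through the absorbing boundaries. The starting point, truncation level and quadrature grid below are arbitrary choices.

```python
import math

def p_abs(x, t, x0, n_terms=100):
    # truncated sine series for the absorbing-boundary transition density on [0, 1]
    return 2.0 * sum(math.exp(-n * n * math.pi ** 2 * t / 2.0)
                     * math.sin(n * math.pi * x0) * math.sin(n * math.pi * x)
                     for n in range(1, n_terms + 1))

def survival(t, x0, m=1000):
    # int_0^1 p(x, t | x0, 0) dx by the trapezoid rule
    h = 1.0 / m
    s = 0.5 * (p_abs(0.0, t, x0) + p_abs(1.0, t, x0))
    s += sum(p_abs(i * h, t, x0) for i in range(1, m))
    return s * h

x0 = 0.3
s1, s2 = survival(0.1, x0), survival(0.5, x0)
print(s1, s2)   # both below 1, and s2 < s1: probability is being absorbed
```

For large t the series is dominated by the n = 1 mode, so the survival probability decays like (4/π) sin(πx_0) e^{−π²t/2}.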
Notice that

lim_{t→∞} p(x, t|x_0, 0) = 0.

This is not surprising, since all Brownian particles will eventually get absorbed at the boundary.

Brownian Motion with Reflecting Boundary Conditions. Consider now Brownian motion on the interval [0, 1] with reflecting boundary conditions; we set D = 1 for simplicity. In order to calculate the transition probability density we have to solve the Fokker–Planck equation, which is the heat equation on [0, 1] with Neumann boundary conditions:

∂p/∂t = ½ ∂²p/∂x²,   ∂_x p(0, t) = ∂_x p(1, t) = 0,   p(x, 0) = δ(x − x_0).

The boundary conditions are satisfied by functions of the form cos(nπx). We look for a solution in the form of a cosine Fourier series

p(x, t) = ½ a_0 + Σ_{n=1}^∞ a_n(t) cos(nπx).

From the initial conditions we obtain

a_n(0) = 2 ∫_0^1 cos(nπx) δ(x − x_0) dx = 2 cos(nπx_0).

We substitute the expansion into the PDE and use the orthonormality of the Fourier basis to obtain the equations for the Fourier coefficients:

ȧ_n = −(n²π²/2) a_n,

from which we deduce that

a_n(t) = a_n(0) e^{−n²π²t/2}.

Consequently

p(x, t|x_0, 0) = 1 + 2 Σ_{n=1}^∞ cos(nπx_0) cos(nπx) e^{−n²π²t/2}.

Notice that Brownian motion with reflecting boundary conditions is an ergodic Markov process. To see this, let us consider the stationary Fokker–Planck equation

∂²p_s/∂x² = 0,   ∂_x p_s(0) = ∂_x p_s(1) = 0.
We multiply the equation by p_s, integrate by parts and use the boundary conditions to obtain

    \int_0^1 \left(\frac{dp_s}{dx}\right)^2 dx = 0,

from which it follows that p_s(x) = \text{const}. The unique normalized solution to this boundary value problem is p_s(x) = 1. Alternatively, by taking the limit of p(x, t|x_0, 0) as t \to \infty we obtain the invariant distribution:

    \lim_{t\to\infty} p(x, t|x_0, 0) = 1.

Now we can calculate the stationary autocorrelation function:

    E(W(t)W(0)) = \int_0^1\int_0^1 x x_0\, p(x, t|x_0, 0)\, p_s(x_0)\,dx\,dx_0
                = \int_0^1\int_0^1 x x_0 \left(1 + 2\sum_{n=1}^{\infty} e^{-n^2\pi^2 t/2}\cos(n\pi x_0)\cos(n\pi x)\right) dx\,dx_0
                = \frac{1}{4} + \frac{8}{\pi^4}\sum_{n=0}^{\infty} \frac{1}{(2n+1)^4}\, e^{-\frac{(2n+1)^2\pi^2}{2}t}.

6.3.2 The Ornstein-Uhlenbeck Process

We set now a(x, t) = -\alpha x, b(x, t) = 2D > 0. With this drift and diffusion coefficient the Fokker-Planck equation becomes

    \frac{\partial p}{\partial t} = \alpha\frac{\partial(xp)}{\partial x} + D\frac{\partial^2 p}{\partial x^2}.    (6.13)

This is the Fokker-Planck equation for the Ornstein-Uhlenbeck process. The corresponding stochastic differential equation is

    dX_t = -\alpha X_t\,dt + \sqrt{2D}\,dW_t.

So, in addition to Brownian motion there is a linear force pulling the particle towards the origin. We know that Brownian motion is not a stationary process, since the variance grows linearly in time. By adding a linear damping term, it is reasonable to expect that the resulting process can be stationary; as we shall see, this is indeed the case. The transition probability density p_{OU}(x, t|y, s) for an OU particle that is located at y at time s is

    p_{OU}(x, t|y, s) = \sqrt{\frac{\alpha}{2\pi D(1 - e^{-2\alpha(t-s)})}}\,\exp\left(-\frac{\alpha\left(x - e^{-\alpha(t-s)}y\right)^2}{2D(1 - e^{-2\alpha(t-s)})}\right).    (6.14)
We obtained this formula in Example 4.4 (for \alpha = D = 1) by using the fact that the OU process can be defined through a time change of Brownian motion. We can also derive it by solving equation (6.13): to obtain (6.14), we first take the Fourier transform of the transition probability density with respect to x, solve the resulting first order PDE using the method of characteristics, and then take the inverse Fourier transform (the calculation is similar to the one presented in Chapter 2; see the exercises at the end of this chapter).

Notice that from formula (6.14) it immediately follows that, in the limit as the friction coefficient \alpha goes to 0, the transition probability of the OU process converges to the transition probability of Brownian motion. Furthermore, by taking the long time limit in (6.14) we obtain (we have set s = 0)

    \lim_{t\to+\infty} p_{OU}(x, t|y, 0) = \sqrt{\frac{\alpha}{2\pi D}}\,\exp\left(-\frac{\alpha x^2}{2D}\right),

irrespective of the initial position y of the OU particle. This is to be expected: as we have already seen, the Ornstein-Uhlenbeck process is an ergodic Markov process, with a Gaussian invariant distribution

    p_s(x) = \sqrt{\frac{\alpha}{2\pi D}}\,\exp\left(-\frac{\alpha x^2}{2D}\right).    (6.15)

Using now (6.14) and (6.15) we obtain the stationary joint probability density

    p_2(x, t; y, 0) = p(x, t|y, 0)\,p_s(y) = \frac{\alpha}{2\pi D\sqrt{1 - e^{-2\alpha t}}}\,\exp\left(-\frac{\alpha(x^2 + y^2 - 2xye^{-\alpha t})}{2D(1 - e^{-2\alpha t})}\right),    (6.16)

and, more generally,

    p_2(x, t; y, s) = \frac{\alpha}{2\pi D\sqrt{1 - e^{-2\alpha|t-s|}}}\,\exp\left(-\frac{\alpha(x^2 + y^2 - 2xye^{-\alpha|t-s|})}{2D(1 - e^{-2\alpha|t-s|})}\right).    (6.17)

Now we can calculate the stationary autocorrelation function of the OU process:

    E(X(t)X(s)) = \int\!\!\int xy\, p_2(x, t; y, s)\,dx\,dy = \frac{D}{\alpha}\, e^{-\alpha|t-s|}.    (6.18)

In order to calculate the double integral we need to perform an appropriate change of variables; this calculation will be presented in Section ?? for the Fokker-Planck equation of a linear SDE in arbitrary dimensions.
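Since the transition density (6.14) is Gaussian, the OU process can be simulated exactly, with no time-discretization error. The following sketch is illustrative (the parameter values are our own choices); it samples a long OU trajectory and checks the stationary variance D/\alpha and the autocorrelation e^{-\alpha t} numerically.

```python
import math, random

def simulate_ou(alpha, D, x0, dt, n_steps, seed=0):
    """Sample an OU path using the exact Gaussian transition density (6.14):
    X_{t+dt} | X_t = x  ~  N(e^{-alpha*dt} x, (D/alpha)(1 - e^{-2*alpha*dt}))."""
    rng = random.Random(seed)
    decay = math.exp(-alpha * dt)
    std = math.sqrt(D / alpha * (1.0 - decay**2))
    xs = [x0]
    for _ in range(n_steps):
        xs.append(decay * xs[-1] + std * rng.gauss(0.0, 1.0))
    return xs

xs = simulate_ou(alpha=1.0, D=1.0, x0=0.0, dt=0.1, n_steps=200_000)
burn = xs[1000:]
var = sum(x * x for x in burn) / len(burn)        # should be close to D/alpha = 1
lag = 10                                          # lag time = lag*dt = 1.0
ac = sum(burn[i] * burn[i + lag]
         for i in range(len(burn) - lag)) / (len(burn) - lag)
```

With these parameters `var` approximates the stationary variance D/\alpha and `ac` approximates (D/\alpha)e^{-\alpha \cdot 1}, in agreement with (6.18).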
Assume now that the initial position of the OU particle is a random variable distributed according to a distribution \rho_0(x). As in the case of a Brownian particle, the probability density function is then given by the integral

    p(x, t) = \int p(x, t|y, 0)\,\rho_0(y)\,dy.    (6.19)

When the OU process is distributed initially according to its invariant distribution, \rho_0(x) = p_s(x) given by (6.15), then the Ornstein-Uhlenbeck process becomes stationary: the distribution function is given by p_s(x) at all times and the joint probability density is given by (6.16).

Knowledge of the distribution function enables us to calculate all moments of the OU process using the formula

    E\big((X_t)^n\big) = \int_{\mathbb R} x^n p(x, t)\,dx, \qquad n = 0, 1, 2, \dots

We will calculate the moments by using the Fokker-Planck equation, rather than the explicit formula for the transition probability density. Let M_n(t) denote the nth moment of the OU process,

    M_n := \int_{\mathbb R} x^n p(x, t)\,dx, \qquad n = 0, 1, 2, \dots

Let n = 0. We integrate the FP equation over \mathbb R to obtain

    \int \frac{\partial p}{\partial t} = \alpha\int \frac{\partial(yp)}{\partial y} + D\int \frac{\partial^2 p}{\partial y^2} = 0,

after an integration by parts and using the fact that p(x, t) decays sufficiently fast at infinity. Consequently,

    \frac{dM_0}{dt} = 0 \quad\Rightarrow\quad M_0(t) = M_0(0) = 1.

In other words, \frac{d}{dt}\|p\|_{L^1(\mathbb R)} = 0, which means that the total probability is conserved, as we have already shown for the general Fokker-Planck equation in arbitrary dimensions.

Let n = 1. We multiply the FP equation for the OU process by x, integrate over \mathbb R and perform an integration by parts to obtain

    \frac{dM_1}{dt} = -\alpha M_1.
Consequently, the first moment converges exponentially fast to 0:

    M_1(t) = e^{-\alpha t} M_1(0).

Let now n \geq 2. We multiply the FP equation for the OU process by x^n and integrate by parts (once on the first term on the RHS and twice on the second) to obtain

    \frac{d}{dt}\int y^n p = -\alpha n\int y^n p + Dn(n-1)\int y^{n-2} p.

Or, equivalently:

    \frac{dM_n}{dt} = -\alpha n M_n + Dn(n-1) M_{n-2}, \qquad n \geq 2.    (6.20)

This is a first order linear inhomogeneous differential equation. We can solve it using the variation of constants formula:

    M_n(t) = e^{-\alpha n t} M_n(0) + Dn(n-1)\int_0^t e^{-\alpha n(t-s)} M_{n-2}(s)\,ds.    (6.21)

We can use this formula, together with the formulas for the first two moments, in order to calculate all higher order moments in an iterative way. For example, for n = 2 we have

    M_2(t) = e^{-2\alpha t} M_2(0) + 2D\int_0^t e^{-2\alpha(t-s)} M_0(s)\,ds
           = e^{-2\alpha t} M_2(0) + \frac{D}{\alpha} e^{-2\alpha t}\left(e^{2\alpha t} - 1\right)
           = \frac{D}{\alpha} + e^{-2\alpha t}\left(M_2(0) - \frac{D}{\alpha}\right).

Consequently, the second moment converges exponentially fast to its stationary value D/\alpha. The stationary moments of the OU process are

    \langle y^n\rangle_{OU} := \sqrt{\frac{\alpha}{2\pi D}}\int_{\mathbb R} y^n e^{-\frac{\alpha y^2}{2D}}\,dy
    = \begin{cases} 1\cdot 3\cdots(n-1)\left(\frac{D}{\alpha}\right)^{n/2}, & n \text{ even},\\[2pt] 0, & n \text{ odd}.\end{cases}

It is not hard to check that (see Exercise 3)

    \lim_{t\to\infty} M_n(t) = \langle y^n\rangle_{OU}    (6.22)
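The moment hierarchy (6.20)-(6.21) can be checked numerically. The sketch below is illustrative (the coefficient values are arbitrary); it integrates the hierarchy with forward Euler from a point-mass initial condition and compares M_2(t) against the closed-form expression derived above.

```python
import math

def ou_moments(x0, alpha, D, t, n_max=4, dt=1e-4):
    """Integrate dMn/dt = -alpha*n*Mn + D*n*(n-1)*M_{n-2}  (eqn. (6.20))
    with forward Euler, starting from a point mass at x0 (Mn(0) = x0**n)."""
    M = [x0**n for n in range(n_max + 1)]
    for _ in range(int(round(t / dt))):
        M = [M[n] + dt * (-alpha * n * M[n]
                          + (D * n * (n - 1) * M[n - 2] if n >= 2 else 0.0))
             for n in range(n_max + 1)]
    return M

alpha, D, x0, t = 2.0, 0.5, 1.5, 1.0
M = ou_moments(x0, alpha, D, t)
M2_exact = D / alpha + math.exp(-2 * alpha * t) * (x0**2 - D / alpha)
```

The total probability M_0 stays equal to 1, M_1 decays like e^{-\alpha t}, and M_2 matches the variation-of-constants solution.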
exponentially fast. Since we have already shown that the distribution function of the OU process converges to the Gaussian distribution in the limit as t \to +\infty, it is not surprising that the moments also converge to the moments of the invariant Gaussian measure. What is not so obvious is that the convergence is exponentially fast; in the next section we will prove that the Ornstein-Uhlenbeck process does, indeed, converge to equilibrium exponentially fast. Of course, if the initial conditions of the OU process are stationary, then the moments of the OU process are independent of time and given by their equilibrium values,

    M_n(t) = M_n(0) = \langle x^n\rangle_{OU}.

Naturally, we need to assume that the initial distribution has finite moments of all orders in order to justify the above calculations.

6.3.3 The Geometric Brownian Motion

We set a(x) = \mu x, b(x) = \sigma^2 x^2. This is the geometric Brownian motion. The corresponding stochastic differential equation is

    dX_t = \mu X_t\,dt + \sigma X_t\,dW_t.

This equation is one of the basic models in mathematical finance; the coefficient \sigma is called the volatility. The generator of this process is

    L = \mu x\frac{\partial}{\partial x} + \frac{\sigma^2 x^2}{2}\frac{\partial^2}{\partial x^2}.

Notice that this operator is not uniformly elliptic. The Fokker-Planck equation of the geometric Brownian motion is

    \frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}\left(\mu x\, p\right) + \frac{\partial^2}{\partial x^2}\left(\frac{\sigma^2 x^2}{2}\, p\right).

We can easily obtain an equation for the nth moment of the geometric Brownian motion:

    \frac{dM_n}{dt} = \left(\mu n + \frac{\sigma^2}{2}n(n-1)\right) M_n, \qquad n \geq 2.

The solution of this equation is

    M_n(t) = e^{\left(\mu + (n-1)\frac{\sigma^2}{2}\right)nt}\, M_n(0), \qquad n \geq 2,

and M_1(t) = e^{\mu t} M_1(0).
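The moment formula can be verified against the exact solution X_t = X_0\exp\left((\mu - \sigma^2/2)t + \sigma W_t\right) of the geometric Brownian motion by Monte Carlo. A short sketch (the parameter values below are arbitrary, chosen only for illustration):

```python
import math, random

# Monte Carlo check of Mn(t) = exp((mu + (n-1) sigma^2/2) n t) Mn(0) for n = 2,
# using the exact solution X_t = X_0 exp((mu - sigma^2/2) t + sigma W_t).
mu, sigma, t, x0 = 0.05, 0.2, 1.0, 1.0
rng = random.Random(1)
n_samples = 200_000
m2 = 0.0
for _ in range(n_samples):
    w = rng.gauss(0.0, math.sqrt(t))          # W_t ~ N(0, t)
    x = x0 * math.exp((mu - 0.5 * sigma**2) * t + sigma * w)
    m2 += x * x
m2 /= n_samples
m2_formula = math.exp((mu + 0.5 * sigma**2) * 2 * t) * x0**2   # n = 2 case
```

The sample second moment `m2` agrees with `m2_formula` up to Monte Carlo error.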
Notice that the nth moment might diverge as t \to \infty, depending on the values of \mu and \sigma. Consider for example the second moment and assume that \mu < 0. We have

    M_2(t) = e^{(2\mu + \sigma^2)t} M_2(0),

which diverges when \sigma^2 + 2\mu > 0.

6.4 The Ornstein-Uhlenbeck Process and Hermite Polynomials

The Ornstein-Uhlenbeck process is one of the few stochastic processes for which we can calculate explicitly the solution of the corresponding SDE, the solution of the Fokker-Planck equation, as well as the eigenfunctions of the generator of the process. In this section we will show that the eigenfunctions of the OU process are the Hermite polynomials. We will also study various properties of the generator of the OU process. In the next section we will show that many of the properties of the OU process (ergodicity, self-adjointness of the generator, exponentially fast convergence to equilibrium, real, discrete spectrum) are shared by a large class of diffusion processes, namely those for which the drift term can be written in terms of the gradient of a smooth function.

The generator of the d-dimensional OU process is (we set the drift coefficient equal to 1)

    L = -p\cdot\nabla_p + \beta^{-1}\Delta_p,    (6.23)

where \beta denotes the inverse temperature. We have already seen that the OU process is an ergodic Markov process whose unique invariant measure is absolutely continuous with respect to the Lebesgue measure on \mathbb R^d, with Gaussian density \rho_\beta \in C^\infty(\mathbb R^d),

    \rho_\beta(p) = \frac{1}{(2\pi\beta^{-1})^{d/2}}\, e^{-\beta\frac{|p|^2}{2}}.

The natural function space for studying the generator of the OU process is the L^2 space weighted by the invariant measure of the process. This is a separable Hilbert space with norm

    \|f\|_\rho^2 := \int_{\mathbb R^d} f^2 \rho_\beta\,dp

and corresponding inner product

    (f, h)_\rho = \int_{\mathbb R^d} f h\, \rho_\beta\,dp.
Similarly, we can define weighted L^2 spaces involving derivatives, i.e. weighted Sobolev spaces; see Exercise 4. The reason why this is the right function space in which to study questions related to convergence to equilibrium is that the generator of the OU process becomes a self-adjoint operator in this space. In fact, L defined in (6.23) has many nice properties that are summarized in the following proposition.

Proposition 6.4.1. The operator L has the following properties:

i. For every f, h \in C_0^2(\mathbb R^d) \cap L_\rho^2(\mathbb R^d),

    (Lf, h)_\rho = (f, Lh)_\rho = -\beta^{-1}\int_{\mathbb R^d} \nabla f\cdot\nabla h\,\rho_\beta\,dp.    (6.24)

ii. L is a non-positive operator on L_\rho^2.

iii. Lf = 0 iff f \equiv \text{const}.

iv. For every f \in C_0^2(\mathbb R^d) \cap L_\rho^2(\mathbb R^d) with \int f\rho_\beta = 0,

    (-Lf, f)_\rho \geq \|f\|_\rho^2.    (6.25)

Proof. Equation (6.24) follows from an integration by parts:

    (Lf, h)_\rho = \int -p\cdot\nabla f\, h\,\rho_\beta\,dp + \beta^{-1}\int \Delta f\, h\,\rho_\beta\,dp
                 = \int -p\cdot\nabla f\, h\,\rho_\beta\,dp - \beta^{-1}\int \nabla f\cdot\nabla h\,\rho_\beta\,dp + \int p\cdot\nabla f\, h\,\rho_\beta\,dp
                 = -\beta^{-1}(\nabla f, \nabla h)_\rho.

Non-positivity of L follows from (6.24) upon setting h = f:

    (Lf, f)_\rho = -\beta^{-1}\|\nabla f\|_\rho^2 \leq 0.

Similarly, multiplying the equation Lf = 0 by f\rho_\beta, integrating over \mathbb R^d and using (6.24) gives \|\nabla f\|_\rho = 0, from which we deduce that f \equiv \text{const}.

The spectral gap follows from (6.24), together with Poincaré's inequality for Gaussian measures:

    \int_{\mathbb R^d} f^2 \rho_\beta\,dp \leq \beta^{-1}\int_{\mathbb R^d} |\nabla f|^2 \rho_\beta\,dp    (6.26)
for every f \in H^1(\mathbb R^d; \rho_\beta) with \int f\rho_\beta = 0. Indeed, upon combining (6.24) with (6.26) we obtain

    (Lf, f)_\rho = -\beta^{-1}\|\nabla f\|_\rho^2 \leq -\|f\|_\rho^2.

The spectral gap of the generator of the OU process, which is equivalent to the compactness of its resolvent, implies that L has discrete spectrum. Furthermore, since it is also a self-adjoint operator, its eigenfunctions form a countable orthonormal basis for the separable Hilbert space L_\rho^2. In fact, we can calculate the eigenvalues and eigenfunctions of the generator of the OU process in one dimension. (The multidimensional problem can be treated similarly, by taking tensor products of the eigenfunctions of the one-dimensional problem.)

Theorem 6.4.2. Consider the eigenvalue problem for the generator of the OU process in one dimension,

    -Lf_n = \lambda_n f_n.    (6.27)

Then the eigenvalues of L are the non-negative integers,

    \lambda_n = n, \qquad n = 0, 1, 2, \dots,

and the corresponding eigenfunctions are the normalized Hermite polynomials,

    f_n(p) = \frac{1}{\sqrt{n!}}\, H_n\!\left(\sqrt{\beta}\,p\right),    (6.28)

where

    H_n(p) = (-1)^n e^{\frac{p^2}{2}}\frac{d^n}{dp^n}\left(e^{-\frac{p^2}{2}}\right).    (6.29)

For the subsequent calculations we will need some additional properties of Hermite polynomials, which we state here without proof (we use the notation \rho_1 = \rho).

Proposition 6.4.3. For each \lambda \in \mathbb C, set

    H(p, \lambda) = e^{\lambda p - \frac{\lambda^2}{2}}, \qquad p \in \mathbb R.
Then

    H(p, \lambda) = \sum_{n=0}^{\infty} \frac{\lambda^n}{n!}\, H_n(p),    (6.30)

where the convergence is both uniform on compact subsets of \mathbb R\times\mathbb C and, for \lambda's in compact subsets of \mathbb C, uniform in L^2(\rho). In particular, \{f_n(p) := \frac{1}{\sqrt{n!}} H_n(\sqrt\beta\,p) : n \in \mathbb N\} is an orthonormal basis of L^2(\mathbb R; \rho_\beta).

From (6.29) it is clear that H_n is a polynomial of degree n. Furthermore, only odd (even) powers appear in H_n(p) when n is odd (even), and the coefficient multiplying p^n in H_n(p) is always 1. The orthonormality of the modified Hermite polynomials f_n(p) defined in (6.28) implies that

    \int_{\mathbb R} f_n(p) f_m(p)\,\rho_\beta(p)\,dp = \delta_{nm}.

The first few Hermite polynomials and the corresponding rescaled/normalized eigenfunctions of the generator of the OU process are:

    H_0(p) = 1,                    f_0(p) = 1,
    H_1(p) = p,                    f_1(p) = \sqrt{\beta}\,p,
    H_2(p) = p^2 - 1,              f_2(p) = \frac{\beta}{\sqrt 2}p^2 - \frac{1}{\sqrt 2},
    H_3(p) = p^3 - 3p,             f_3(p) = \frac{1}{\sqrt 6}\left(\beta^{3/2}p^3 - 3\beta^{1/2}p\right),
    H_4(p) = p^4 - 6p^2 + 3,       f_4(p) = \frac{1}{\sqrt{24}}\left(\beta^2 p^4 - 6\beta p^2 + 3\right),
    H_5(p) = p^5 - 10p^3 + 15p,    f_5(p) = \frac{1}{\sqrt{120}}\left(\beta^{5/2}p^5 - 10\beta^{3/2}p^3 + 15\beta^{1/2}p\right).

The proof of Theorem 6.4.2 follows essentially from the properties of the Hermite polynomials. First, notice that by combining (6.28) and (6.30) we obtain

    H\!\left(\sqrt\beta\,p, \lambda\right) = \sum_{n=0}^{\infty} \frac{\lambda^n}{\sqrt{n!}}\, f_n(p).

We differentiate this formula with respect to p and use the fact that f_0 = 1 to obtain

    \lambda\sqrt\beta\, H\!\left(\sqrt\beta\,p, \lambda\right) = \sum_{n=1}^{\infty} \frac{\lambda^n}{\sqrt{n!}}\, \partial_p f_n(p).
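These properties are easy to verify numerically. The sketch below is illustrative (function names are our own); it builds H_n from the standard three-term recurrence H_{k+1}(p) = pH_k(p) - kH_{k-1}(p), which follows from the generating function, and checks the orthonormality of the f_n under the Gaussian weight by quadrature.

```python
import math

def hermite(n, p):
    """Probabilists' Hermite polynomial Hn(p) via the recurrence
    H_{k+1}(p) = p H_k(p) - k H_{k-1}(p), with H0 = 1, H1 = p."""
    h0, h1 = 1.0, p
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, p * h1 - k * h0
    return h1

def inner(n, m, beta=2.0, a=12.0, pts=4000):
    """(fn, fm)_rho with fn(p) = Hn(sqrt(beta) p)/sqrt(n!), computed by a
    midpoint rule against the Gaussian density rho_beta on [-a, a]."""
    total = 0.0
    for j in range(pts):
        p = -a + 2 * a * (j + 0.5) / pts
        w = math.sqrt(beta / (2 * math.pi)) * math.exp(-beta * p * p / 2)
        fn = hermite(n, math.sqrt(beta) * p) / math.sqrt(math.factorial(n))
        fm = hermite(m, math.sqrt(beta) * p) / math.sqrt(math.factorial(m))
        total += fn * fm * w
    return total * (2 * a / pts)
```

The midpoint rule is spectrally accurate here because the integrand and all of its derivatives decay rapidly, so the orthonormality relations hold to essentially machine precision.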
Equating the coefficients of the powers of \lambda in the two expansions of \lambda\sqrt\beta\,H(\sqrt\beta\,p, \lambda), we obtain

    \frac{1}{\sqrt\beta}\,\partial_p f_k = \sqrt{k}\, f_{k-1}.    (6.31)

Similarly, if we differentiate (6.30) with respect to \lambda we obtain

    (p - \lambda)H(p, \lambda) = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}\, p H_k(p) - \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!}\, H_{k-1}(p) = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}\, H_{k+1}(p),

from which we obtain the recurrence relation

    pH_k = H_{k+1} + kH_{k-1}.

Upon rescaling, we deduce that

    p f_k = \sqrt{\beta^{-1}(k+1)}\, f_{k+1} + \sqrt{\beta^{-1}k}\, f_{k-1}.    (6.32)

We combine now equations (6.31) and (6.32) to obtain

    \left(\sqrt\beta\, p - \frac{1}{\sqrt\beta}\,\partial_p\right) f_k = \sqrt{k+1}\, f_{k+1}.    (6.33)

Now we observe that

    -Lf_n = \left(\sqrt\beta\, p - \frac{1}{\sqrt\beta}\,\partial_p\right)\frac{1}{\sqrt\beta}\,\partial_p f_n = \left(\sqrt\beta\, p - \frac{1}{\sqrt\beta}\,\partial_p\right)\sqrt n\, f_{n-1} = n f_n.

The operators

    \sqrt\beta\, p - \frac{1}{\sqrt\beta}\,\partial_p \qquad\text{and}\qquad \frac{1}{\sqrt\beta}\,\partial_p

play the role of creation and annihilation operators. In fact, we can generate all eigenfunctions of the OU operator from the ground state f_0 = 1 through a repeated application of the creation operator.

Proposition 6.4.4. Set \beta = 1 and let a^- = \partial_p. Then the L_\rho^2-adjoint of a^- is

    a^+ = -\partial_p + p.
Then the generator of the OU process can be written in the form

    L = -a^+ a^-.    (6.34)

Furthermore, a^+ and a^- satisfy the commutation relation

    [a^+, a^-] = -1.    (6.35)

Define now the creation and annihilation operators on C^1(\mathbb R) by

    S^+ = \frac{1}{\sqrt{n+1}}\, a^+ \qquad\text{and}\qquad S^- = \frac{1}{\sqrt n}\, a^-.

Then

    S^+ f_n = f_{n+1} \qquad\text{and}\qquad S^- f_n = f_{n-1}.    (6.36)

In particular,

    f_n = \frac{1}{\sqrt{n!}}\,(a^+)^n 1    (6.37)

and

    1 = \frac{1}{\sqrt{n!}}\,(a^-)^n f_n.    (6.38)

Proof. Let f, h \in C^1(\mathbb R) \cap L_\rho^2. We calculate

    \int \partial_p f\, h\,\rho = -\int f\,\partial_p(h\rho) = \int f\,(-\partial_p + p)h\,\rho,

which shows that a^+ = -\partial_p + p is the L_\rho^2-adjoint of a^- = \partial_p. Now,

    -a^+ a^- = -(-\partial_p + p)\partial_p = \partial_p^2 - p\partial_p = L.

Similarly,

    a^- a^+ = -\partial_p^2 + p\partial_p + 1,

and therefore [a^+, a^-] = a^+a^- - a^-a^+ = -1. Formulas (6.36) follow from (6.31) and (6.33). Finally, formulas (6.37) and (6.38) follow from (6.35) and (6.36), together with a simple induction argument.
Notice that, upon using (6.35) and (6.36) and the fact that a^+ is the adjoint of a^-, we can easily check the orthonormality of the eigenfunctions: for n \geq m,

    \int f_n f_m\,\rho = \frac{1}{\sqrt{m!}}\int f_n\,(a^+)^m 1\,\rho = \frac{1}{\sqrt{m!}}\int \left((a^-)^m f_n\right)\rho = \sqrt{\frac{n!}{(n-m)!\,m!}}\int f_{n-m}\,\rho = \delta_{nm},

since \int f_k\,\rho = \delta_{k0}.

From the eigenfunctions and eigenvalues of L we can easily obtain the eigenvalues and eigenfunctions of L^*, the Fokker-Planck operator.

Lemma 6.4.5. The eigenvalues and eigenfunctions of the Fokker-Planck operator

    L^*\,\cdot = \partial_p^2\,\cdot + \partial_p(p\,\cdot)

are

    \lambda_n^* = -n, \quad n = 0, 1, 2, \dots \qquad\text{and}\qquad f_n^* = \rho f_n.

Proof. We have

    L^*(\rho f_n) = f_n L^*\rho + \rho L f_n = -n\rho f_n.

An immediate corollary of the above calculation is that the nth eigenfunction of the Fokker-Planck operator is given by

    f_n^* = \rho(p)\,\frac{1}{\sqrt{n!}}\,(a^+)^n 1.    (6.39)

6.5 Reversible Diffusions

The stationary Ornstein-Uhlenbeck process is an example of a reversible Markov process:

Definition 6.5.1. A stationary stochastic process X_t is time reversible if, for every m \in \mathbb N and every t_1, t_2, \dots, t_m \in \mathbb R^+, the joint probability distribution is invariant under time reversals:

    p(X_{t_1}, X_{t_2}, \dots, X_{t_m}) = p(X_{-t_1}, X_{-t_2}, \dots, X_{-t_m}).
In this section we study a more general class (in fact, as we will see later, the most general class) of reversible Markov processes, namely stochastic perturbations of ODEs with a gradient structure. Consider diffusion processes with a potential V(x), not necessarily quadratic:

    L = -\nabla V(x)\cdot\nabla + \beta^{-1}\Delta.    (6.40)

(For the quadratic potential V(x) = \frac{1}{2}\alpha x^2 we recover the generator of the OU process, which can be written as L = -\partial_x V\,\partial_x + \beta^{-1}\partial_x^2.) In applications of (6.40) to statistical mechanics the diffusion coefficient is \beta^{-1} = k_B T, where k_B is Boltzmann's constant and T the absolute temperature. The corresponding stochastic differential equation is

    dX_t = -\nabla V(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t.    (6.41)

Hence, we have a gradient ODE \dot X_t = -\nabla V(X_t) perturbed by noise due to thermal fluctuations. The corresponding FP equation is:

    \frac{\partial p}{\partial t} = \nabla\cdot(\nabla V\, p) + \beta^{-1}\Delta p.    (6.42)

It is not possible to calculate the time dependent solution of this equation for an arbitrary potential. We can, however, always calculate the stationary solution, if it exists.

Definition 6.5.2. A potential V will be called confining if \lim_{|x|\to+\infty} V(x) = +\infty and e^{-\beta V(x)} \in L^1(\mathbb R^d) for all \beta \in \mathbb R^+.

Gradient SDEs in a confining potential are ergodic:

Proposition 6.5.3. Let V(x) be a smooth confining potential. Then the Markov process with generator (6.40) is ergodic. The unique invariant distribution is the Gibbs distribution

    p(x) = \frac{1}{Z}\, e^{-\beta V(x)},    (6.43)

where the normalization factor Z is the partition function

    Z = \int_{\mathbb R^d} e^{-\beta V(x)}\,dx.    (6.44)
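For a concrete illustration (the potential and parameter values below are our own choices, not from the text), take the confining double-well potential V(x) = x^4/4 - x^2/2 in one dimension. The sketch computes the partition function and the Gibbs density on a grid and checks that the density is normalized and symmetric:

```python
import math

# Gibbs density for an illustrative double-well potential V(x) = x^4/4 - x^2/2.
beta = 2.0
V = lambda x: 0.25 * x**4 - 0.5 * x**2

a, n = 6.0, 4000                      # quadrature interval [-a, a] and grid size
h = 2 * a / n
grid = [-a + (j + 0.5) * h for j in range(n)]
Z = sum(math.exp(-beta * V(x)) for x in grid) * h      # partition function (6.44)
gibbs = [math.exp(-beta * V(x)) / Z for x in grid]     # Gibbs density (6.43)
mass = sum(g for g in gibbs) * h                       # should be 1
mean = sum(x * g for x, g in zip(grid, gibbs)) * h     # 0 by symmetry of V
```

The resulting density is bimodal, with peaks at the two minima x = ±1 of the potential, and integrates to one.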
The fact that the Gibbs distribution is an invariant distribution follows by direct substitution; uniqueness follows from a PDEs argument (see the discussion below).

In order to study properties of solutions to the FP equation, it is more convenient to "normalize" the solution with respect to the invariant distribution.

Theorem 6.5.4. Let p(x, t) be the solution of the Fokker-Planck equation (6.42), assume that (6.43) holds, and define h(x, t) through

    p(x, t) = h(x, t)\,\rho(x),

where \rho(x) is the Gibbs distribution (6.43). Then the function h satisfies the backward Kolmogorov equation

    \frac{\partial h}{\partial t} = -\nabla V\cdot\nabla h + \beta^{-1}\Delta h, \qquad h(x, 0) = p(x, 0)\,\rho^{-1}(x).    (6.45)

Proof. The initial condition follows from the definition of h. We calculate the gradient and Laplacian of p:

    \nabla p = \rho\nabla h - \rho h\beta\nabla V,
    \Delta p = \rho\Delta h - 2\rho\beta\nabla V\cdot\nabla h - h\beta\Delta V\rho + h|\nabla V|^2\beta^2\rho.

We substitute these formulas into the FP equation to obtain

    \rho\,\frac{\partial h}{\partial t} = \rho\left(-\nabla V\cdot\nabla h + \beta^{-1}\Delta h\right),

from which the claim follows.

Consequently, in order to study properties of solutions to the FP equation, it is sufficient to study the backward equation (6.45). We define the weighted L^2 space L_\rho^2,

    L_\rho^2 = \left\{ f \,\Big|\, \int_{\mathbb R^d} |f|^2\rho(x)\,dx < \infty \right\},

where \rho(x) is the Gibbs distribution. This is a Hilbert space with inner product

    (f, h)_\rho = \int_{\mathbb R^d} f h\,\rho(x)\,dx.

Theorem 6.5.5. Assume that V(x) is a smooth potential and assume that condition (6.43) holds. Then the operator

    L = -\nabla V(x)\cdot\nabla + \beta^{-1}\Delta

is self-adjoint in L_\rho^2. Furthermore, it is non-positive and its kernel consists of constants.
Proof. We calculate

    (Lf, h)_\rho = \int_{\mathbb R^d}\left(-\nabla V\cdot\nabla + \beta^{-1}\Delta\right)f\, h\,\rho\,dx
                 = \int_{\mathbb R^d}(-\nabla V\cdot\nabla f)\,h\rho\,dx - \beta^{-1}\int_{\mathbb R^d}\nabla f\cdot\nabla(h\rho)\,dx
                 = \int_{\mathbb R^d}(-\nabla V\cdot\nabla f)\,h\rho\,dx - \beta^{-1}\int_{\mathbb R^d}\nabla f\cdot\nabla h\,\rho\,dx + \int_{\mathbb R^d}\nabla V\cdot\nabla f\, h\rho\,dx
                 = -\beta^{-1}\int_{\mathbb R^d}\nabla f\cdot\nabla h\,\rho\,dx,

from which self-adjointness follows.

If we set f = h in the above equation we get

    (Lf, f)_\rho = -\beta^{-1}\|\nabla f\|_\rho^2,

which shows that L is non-positive.

Clearly, constants are in the null space of L. Assume that f \in N(L). Then, from the above equation, we get

    0 = -\beta^{-1}\|\nabla f\|_\rho^2,

and, consequently, f is a constant.

Remark 6.5.6. The expression (-Lf, f)_\rho is called the Dirichlet form of the operator L. In the case of a gradient flow, it takes the form

    (-Lf, f)_\rho = \beta^{-1}\|\nabla f\|_\rho^2.    (6.46)

Using the properties of the generator L we can show that the solution of the Fokker-Planck equation converges to the Gibbs distribution exponentially fast. For this we need to use the fact that, under appropriate assumptions on the potential V, the Gibbs measure \mu(dx) = Z^{-1}e^{-\beta V(x)}dx satisfies Poincaré's inequality:

Theorem 6.5.7. Assume that the potential V satisfies the convexity condition

    D^2 V \geq \lambda I.

Then the corresponding Gibbs measure satisfies the Poincaré inequality with constant \lambda: for every f \in C^1(\mathbb R^d)\cap L_\rho^2,

    \int f\rho = 0 \quad\Rightarrow\quad \sqrt\lambda\,\|f\|_\rho \leq \|\nabla f\|_\rho.    (6.47)
Theorem 6.5.8. Assume that p(x, 0) \in L^2(e^{\beta V}). Then the solution p(x, t) of the Fokker-Planck equation (6.42) converges to the Gibbs distribution exponentially fast:

    \|p(\cdot, t) - Z^{-1}e^{-\beta V}\|_{\rho^{-1}} \leq e^{-\lambda\beta^{-1}t}\,\|p(\cdot, 0) - Z^{-1}e^{-\beta V}\|_{\rho^{-1}}.    (6.48)

Proof. We use (6.45), (6.46) and (6.47) to calculate

    -\frac{d}{dt}\|h - 1\|_\rho^2 = -2\left(\frac{\partial h}{\partial t}, h - 1\right)_\rho = -2(Lh, h - 1)_\rho = 2(-L(h - 1), h - 1)_\rho = 2\beta^{-1}\|\nabla(h - 1)\|_\rho^2 \geq 2\beta^{-1}\lambda\,\|h - 1\|_\rho^2.

Our assumption on p(\cdot, 0) implies that h(\cdot, 0) \in L_\rho^2. Consequently, the above calculation shows that

    \|h(\cdot, t) - 1\|_\rho \leq e^{-\lambda\beta^{-1}t}\,\|h(\cdot, 0) - 1\|_\rho.

This, together with the definition of h, p = \rho h, leads to (6.48).

Remark 6.5.9. The function space L^2(\rho^{-1}) = L^2(e^{\beta V}) in which we prove convergence is not the right space to use. Since p(\cdot, t) \in L^1, ideally we would like to prove exponentially fast convergence in L^1. We can prove such convergence using the theory of logarithmic Sobolev inequalities. In fact, we can also prove convergence in relative entropy,

    H(p|\rho_V) := \int_{\mathbb R^d} p\ln\left(\frac{p}{\rho_V}\right)dx.

The relative entropy norm controls the L^1 norm,

    \|\rho_1 - \rho_2\|_{L^1}^2 \leq C\, H(\rho_1|\rho_2).

Using a logarithmic Sobolev inequality, we can prove exponentially fast convergence to equilibrium assuming only that the relative entropy of the initial conditions is finite. By contrast, the assumption

    \int_{\mathbb R^d} |p(x, 0)|^2\, Z^{-1}e^{\beta V}\,dx < \infty

is very restrictive (think of the case where V = x^2). A much sharper version of the theorem on exponentially fast convergence to equilibrium is the following:
Theorem 6.5.10. Let p denote the solution of the Fokker-Planck equation (6.42), where the potential is smooth and uniformly convex. Assume that the initial conditions satisfy H(p(\cdot, 0)|\rho_V) < \infty. Then p converges to the Gibbs distribution exponentially fast in relative entropy:

    H(p(\cdot, t)|\rho_V) \leq e^{-\lambda\beta^{-1}t}\, H(p(\cdot, 0)|\rho_V).

Self-adjointness of the generator of a diffusion process is equivalent to time-reversibility:

Theorem 6.5.11. Let X_t be a stationary Markov process in \mathbb R^d with generator

    L = b(x)\cdot\nabla + \beta^{-1}\Delta

and invariant measure \mu. Then the following three statements are equivalent:

i. The process is time-reversible.

ii. Its generator is symmetric in L^2(\mathbb R^d; \mu(dx)).

iii. There exists a scalar function V(x) such that b(x) = -\nabla V(x).

6.5.1 Markov Chain Monte Carlo (MCMC)

The Smoluchowski SDE (6.41) has a very interesting application in statistics. Suppose we want to sample from a probability distribution \pi(x). One method for doing this is to generate dynamics whose invariant distribution is precisely \pi(x). In particular, we consider the Smoluchowski equation

    dX_t = \nabla\ln(\pi(X_t))\,dt + \sqrt 2\,dW_t.    (6.49)

Assuming that -\ln(\pi(x)) is a confining potential, X_t is an ergodic Markov process with invariant distribution \pi(x). Furthermore, the law of X_t converges to \pi(x) exponentially fast:

    \|\rho_t - \pi\|_{L^1} \leq e^{-\Lambda t}\,\|\rho_0 - \pi\|_{L^1},

where the exponent \Lambda is related to the spectral gap of the generator L = \nabla\ln\pi(x)\cdot\nabla + \Delta. This technique for sampling from a given distribution is an example of the Markov Chain Monte Carlo (MCMC) methodology.
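A minimal sketch of this sampling idea (illustrative only: the step size, run length and Gaussian target are our own choices, and a practical implementation would usually add a Metropolis-Hastings correction to remove the discretization bias):

```python
import math, random

def langevin_sample(grad_log_pi, x0, dt, n_steps, seed=0):
    """Unadjusted Langevin algorithm: Euler-Maruyama discretization of
    dX = grad(log pi(X)) dt + sqrt(2) dW, whose invariant law is pi."""
    rng = random.Random(seed)
    x, out = x0, []
    s = math.sqrt(2.0 * dt)
    for _ in range(n_steps):
        x = x + grad_log_pi(x) * dt + s * rng.gauss(0.0, 1.0)
        out.append(x)
    return out

# Target pi = N(0, 1), so grad log pi(x) = -x and (6.49) is an OU process.
xs = langevin_sample(lambda x: -x, 0.0, dt=0.02, n_steps=500_000)[10_000:]
mean = sum(xs) / len(xs)
var = sum(x * x for x in xs) / len(xs)
```

After discarding a burn-in, the empirical mean and variance of the chain approximate those of the target distribution, up to an O(dt) discretization bias.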
6.6 Perturbations of non-Reversible Diffusions

We can add a non-reversible perturbation to a reversible diffusion without changing its invariant distribution Z^{-1}e^{-\beta V}:

Proposition 6.6.1. Let V(x) be a confining potential, \gamma(x) a smooth vector field, and consider the diffusion process

    dX_t = \left(-\nabla V(X_t) + \gamma(X_t)\right)dt + \sqrt{2\beta^{-1}}\,dW_t.    (6.50)

Then the invariant measure of the process X_t is the Gibbs measure

    \mu(dx) = \frac{1}{Z}\, e^{-\beta V(x)}\,dx

if and only if \gamma(x) is divergence-free with respect to the density of this measure:

    \nabla\cdot\left(\gamma(x)\,e^{-\beta V(x)}\right) = 0.    (6.51)

6.7 Eigenfunction Expansions

Consider the generator of a gradient stochastic flow with a uniformly convex potential,

    L = -\nabla V\cdot\nabla + D\Delta.    (6.52)

We know that L is a non-positive self-adjoint operator on L_\rho^2 and that it has a spectral gap,

    (Lf, f)_\rho \leq -D\lambda\,\|f\|_\rho^2,

where \lambda is the Poincaré constant of the potential V (i.e. for the Gibbs measure Z^{-1}e^{-\beta V(x)}dx). The above imply that we can study the spectral problem for -L:

    -Lf_n = \lambda_n f_n, \qquad n = 0, 1, \dots

The operator -L has real, discrete spectrum with

    0 = \lambda_0 < \lambda_1 < \lambda_2 < \dots

Furthermore, the eigenfunctions \{f_j\}_{j=0}^{\infty} form an orthonormal basis in L_\rho^2: we can express every element of L_\rho^2 in the form of a generalized Fourier series

    \phi = \sum_{n=0}^{\infty} \phi_n f_n, \qquad \phi_n = (\phi, f_n)_\rho,    (6.53)
with (f_n, f_m)_\rho = \delta_{nm}. This enables us to solve the time dependent Fokker-Planck equation in terms of an eigenfunction expansion. Consider the backward Kolmogorov equation (6.45). We assume that the initial condition h_0(x) = \phi(x) \in L_\rho^2, so that we can expand it in the form (6.53). We look for a solution of (6.45) in the form

    h(x, t) = \sum_{n=0}^{\infty} h_n(t) f_n(x).

We substitute this expansion into the backward Kolmogorov equation:

    \frac{\partial h}{\partial t} = \sum_{n=0}^{\infty} \dot h_n f_n = L\left(\sum_{n=0}^{\infty} h_n f_n\right) = \sum_{n=0}^{\infty} -\lambda_n h_n f_n.    (6.54)

We multiply this equation by f_m, integrate with respect to the Gibbs measure and use the orthonormality of the eigenfunctions to obtain the sequence of equations

    \dot h_n = -\lambda_n h_n, \qquad n = 0, 1, \dots

The solution is h_0(t) = \phi_0 and

    h_n(t) = e^{-\lambda_n t}\phi_n, \qquad n = 1, 2, \dots

Notice that

    1 = \int_{\mathbb R^d} p(x, 0)\,dx = \int_{\mathbb R^d} p(x, t)\,dx = \int_{\mathbb R^d} h(x, t)\, Z^{-1}e^{-\beta V}\,dx = (h, 1)_\rho = (\phi, 1)_\rho = \phi_0.

Consequently, the solution of the backward Kolmogorov equation is

    h(x, t) = 1 + \sum_{n=1}^{\infty} e^{-\lambda_n t}\phi_n f_n.    (6.55)

This expansion, together with the fact that all eigenvalues with n \geq 1 are positive, shows that the solution of the backward Kolmogorov equation converges to 1 exponentially fast. The solution of the Fokker-Planck equation is

    p(x, t) = Z^{-1}e^{-\beta V(x)}\left(1 + \sum_{n=1}^{\infty} e^{-\lambda_n t}\phi_n f_n\right).
6.7.1 Reduction to a Schrödinger Equation

Lemma 6.7.1. The Fokker-Planck operator for a gradient flow can be written in the self-adjoint form

    \frac{\partial p}{\partial t} = D\nabla\cdot\left(e^{-V/D}\nabla\left(e^{V/D}p\right)\right).    (6.56)

Define now \psi(x, t) = e^{V/2D}p(x, t). Then \psi solves the PDE

    \frac{\partial\psi}{\partial t} = D\Delta\psi - U(x)\psi, \qquad U(x) := \frac{|\nabla V|^2}{4D} - \frac{\Delta V}{2}.    (6.57)

Proof. We calculate

    D\nabla\cdot\left(e^{-V/D}\nabla\left(e^{V/D}f\right)\right) = D\nabla\cdot\left(e^{-V/D}\left(D^{-1}\nabla V\, f + \nabla f\right)e^{V/D}\right) = \nabla\cdot\left(\nabla V\, f + D\nabla f\right) = L^* f.

Equation (6.56) shows that the FP operator can be written in the form

    L^*\,\cdot = D\nabla\cdot\left(e^{-V/D}\nabla\left(e^{V/D}\,\cdot\right)\right).

Consider now the eigenvalue problem for the FP operator,

    -L^*\phi_n = \lambda_n\phi_n,

and let H := -D\Delta + U. Then L^* and H have the same eigenvalues, and the nth eigenfunction \phi_n of L^* and the nth eigenfunction \psi_n of H are associated through the transformation

    \psi_n(x) = \phi_n(x)\exp\left(\frac{V(x)}{2D}\right).

Remarks 6.7.2.

i. The operator that appears on the right hand side of eqn. (6.57) has the form of a Schrödinger operator: -H = -D\Delta + U(x).

ii. The spectral problem for the FP operator can thus be transformed into the spectral problem for a Schrödinger operator. We can then use all the available results from quantum mechanics to study the FP equation and the associated SDE.

iii. In particular, the weak noise asymptotics D \ll 1 is equivalent to the semiclassical approximation from quantum mechanics.
Proof. Set \phi_n = \psi_n\exp\left(-\frac{V}{2D}\right). We calculate -L^*\phi_n:

    -L^*\phi_n = -D\nabla\cdot\left(e^{-V/D}\nabla\left(e^{V/D}\psi_n e^{-V/2D}\right)\right)
               = -D\nabla\cdot\left(e^{-V/D}\left(\nabla\psi_n + \frac{\nabla V}{2D}\psi_n\right)e^{V/2D}\right)
               = \left(-D\Delta\psi_n + \left(\frac{|\nabla V|^2}{4D} - \frac{\Delta V}{2}\right)\psi_n\right)e^{-V/2D}
               = e^{-V/2D}H\psi_n.

From this we conclude that e^{-V/2D}H\psi_n = \lambda_n\psi_n e^{-V/2D}, from which the equivalence between the two eigenvalue problems follows.

Remarks 6.7.3.

i. We can rewrite the Schrödinger operator in the form

    H = DA^*A, \qquad A = \nabla + \frac{\nabla V}{2D}, \qquad A^* = -\nabla + \frac{\nabla V}{2D}.

These are creation and annihilation operators. They can also be written in the form

    A\,\cdot = e^{-V/2D}\nabla\left(e^{V/2D}\,\cdot\right), \qquad A^*\,\cdot = -e^{V/2D}\nabla\left(e^{-V/2D}\,\cdot\right).

ii. The forward and the backward Kolmogorov operators have the same eigenvalues. Their eigenfunctions are related through

    \phi_n^F = \phi_n^B\exp\left(-V/D\right),

where \phi_n^B and \phi_n^F denote the eigenfunctions of the backward and forward operators, respectively.

6.8 Discussion and Bibliography

The proof of existence and uniqueness of classical solutions for the Fokker-Planck equation of a uniformly elliptic diffusion process with smooth drift and diffusion coefficients, Theorem 6.2.2, can be found in [30]. A standard textbook on PDEs, with a lot of material on parabolic PDEs, is [22]; see, particularly, Chapters 2 and 7 in this book.

It is important to emphasize that the condition that solutions to the Fokker-Planck equation do not grow too fast, see Definition 6.2.1, is necessary to ensure uniqueness. In fact, there are infinitely many solutions of

    \frac{\partial p}{\partial t} = \Delta p \quad\text{in } \mathbb R^d\times(0, T), \qquad p(x, 0) = 0.
Each of these solutions, apart from the trivial solution p = 0, grows very rapidly as x \to +\infty.

The Fokker-Planck equation is studied extensively in Risken's monograph [82]. See also [35] and [42]. The connection between the Fokker-Planck equation and stochastic differential equations is presented in Chapter 7. Diffusion processes in one dimension are studied in [61]; the Feller classification for one-dimensional diffusion processes can also be found in [45, 24].

Hermite polynomials appear very frequently in applications and they also play a fundamental role in analysis. It is possible to prove that the Hermite polynomials form an orthonormal basis for L^2(\mathbb R^d, \rho_\beta) without using the fact that they are the eigenfunctions of a symmetric operator with compact resolvent; for more information see [62]. In fact, Poincaré's inequality for Gaussian measures can be proved using the fact that the Hermite polynomials form an orthonormal basis for L^2(\mathbb R^d, \rho_\beta). The proof of Proposition 6.4.3 can be found in [90].

Convergence to equilibrium for kinetic equations (such as the Fokker-Planck equation), both linear and nonlinear (e.g. the Boltzmann equation), has been studied extensively. It has been recognized that the relative entropy and logarithmic Sobolev inequalities play an important role in the analysis of the problem of convergence to equilibrium. More details can be found in [44, 31, 7]. See also [1, 32].

6.9 Exercises

1. Solve equation (6.13) by taking the Fourier transform, using the method of characteristics for first order PDEs, and taking the inverse Fourier transform.

2. Use the formula (6.17) for the stationary joint probability density of the Ornstein-Uhlenbeck process to show that the autocorrelation function of the stationary Ornstein-Uhlenbeck process is

    E(X_t X_0) = \int_{\mathbb R}\int_{\mathbb R} x x_0\, p_{OU}(x, t|x_0, 0)\, p_s(x_0)\,dx\,dx_0 = \frac{D}{\alpha}\, e^{-\alpha t},

where p_s(x) denotes the invariant Gaussian distribution.

3. Use (6.20) to obtain formulas for the moments of the OU process. Prove, using these formulas, that the moments of the OU process converge to their equilibrium values exponentially fast.
4. Let V be a confining potential in \mathbb R^d, \beta > 0, and let \rho_\beta(x) = Z^{-1}e^{-\beta V(x)}. Give the definition of the Sobolev space H^k(\mathbb R^d; \rho_\beta) for k a positive integer and study some of its basic properties.

5. Let X_t be a one-dimensional diffusion process with drift and diffusion coefficients a(y, t) = -a_0 - a_1 y and b(y, t) = b_0 + b_1 y + b_2 y^2, where a_i, b_i \geq 0, i = 0, 1, 2.

(a) Write down the generator and the forward and backward Kolmogorov equations for X_t.

(b) Assume that X_0 is a random variable with probability density \rho_0(x) that has finite moments. Use the forward Kolmogorov equation to derive a system of differential equations for the moments of X_t.

(c) Find the first three moments M_0, M_1, M_2 in terms of the moments of the initial distribution \rho_0(x).

(d) Under what conditions on the coefficients a_i, b_i \geq 0, i = 0, 1, 2 is M_2 finite for all times?

6. The Rayleigh process X_t is a diffusion process that takes values on (0, +\infty), with drift and diffusion coefficients a(x) = -ax + \frac{D}{x} and b(x) = 2D, respectively, where a, D > 0.

(a) Write down the generator and the forward and backward Kolmogorov equations for X_t.

(b) Show that this process is ergodic and find its invariant distribution.

(c) Solve the forward Kolmogorov (Fokker-Planck) equation using separation of variables. (Hint: Use Laguerre polynomials.)

7. Let X_t be a multidimensional diffusion process on [0, 1]^d with periodic boundary conditions. The drift vector is a periodic function a(x) and the diffusion matrix is 2DI, where D > 0 and I is the identity matrix.

(a) Write down the generator and the forward and backward Kolmogorov equations for X_t.

(b) Assume that a(x) is divergence-free (\nabla\cdot a(x) = 0). Show that X_t is ergodic and find the invariant distribution.

(c) Show that the probability density p(x, t) (the solution of the forward Kolmogorov equation) converges to the invariant distribution exponentially fast in L^2([0, 1]^d). (Hint: Use Poincaré's inequality on [0, 1]^d.)
tx0 . 2π]2 with periodic boundary conditions with drift vector a(x. (c) Let E denote the expectation with respect to the invariant distribution ρs (x. y) = (sin(y). (b) Solve the initial/boundary value problem for the forward Kolmogorov equation to calculate the transition probability density p(x. Let x(t) = {x(t). 0). y(t)} be the twodimensional diffusion process on [0. y(t)} and calculate the normalization constant. 1] with periodic boundary conditions and with drift and diffusion coefﬁcients a(x) = a and b(x) = 2D. y). 118 . 0)ps (x0 ) dxdx0 . Assume that the process starts at x0 . X(0) = x0 . b12 = b21 = 0. Let a. respectively. (a) Write down the generator of the process {x(t). (c) Show that the process is ergodic and calculate the invariant distribution ps (x). y) = C is the unique stationary distribution of the process {x(t). (b) Show that the constant function ρs (x. 10.9. (d) Calculate the stationary autocorrelation function 1 1 E(X(t)X(0)) = 0 0 xx0 p(x. Calculate E cos(x) + cos(y) and E(sin(x) sin(y)). y) with b11 = b22 = 1. tx0 . (a) Write down the generator of the process X(t) and the forward and backward Kolmogorov equations. y(t)} and the forward and backward Kolmogorov equations. D be positive constants and let X(t) be the diffusion process on [0. sin(x)) and diffusion matrix b(x.
Chapter 7

Stochastic Differential Equations

7.1 Introduction

In this part of the course we will study stochastic differential equations (SDEs): ODEs driven by Gaussian white noise. Let W(t) denote a standard m-dimensional Brownian motion, h : Z → ℝᵈ a smooth vector-valued function and γ : Z → ℝ^{d×m} a smooth matrix-valued function (in this course we will take Z = 𝕋ᵈ, ℝᵈ or ℝˡ ⊕ 𝕋^{d−l}). Consider the SDE

    dz/dt = h(z) + γ(z) dW/dt,   z(0) = z₀.                         (7.1)

We think of the term dW/dt as representing Gaussian white noise: a mean-zero Gaussian process with correlation δ(t − s)I. Such a process exists only as a distribution. The precise interpretation of (7.1) is as an integral equation for z(t) ∈ C(ℝ⁺, Z):

    z(t) = z₀ + ∫₀ᵗ h(z(s)) ds + ∫₀ᵗ γ(z(s)) dW(s).                 (7.2)

In order to make sense of this equation we need to define the stochastic integral against W(s). The function h in (7.1) is sometimes referred to as the drift and γ as the diffusion coefficient.

7.2 The Itô and Stratonovich Stochastic Integral

For the rigorous analysis of stochastic differential equations it is necessary to define stochastic integrals of the form

    I(t) = ∫₀ᵗ f(s) dW(s),                                          (7.3)
where W(t) is a standard one-dimensional Brownian motion. This is not straightforward because W(t) does not have bounded variation. In order to define the stochastic integral we assume that f(t) is a random process, adapted to the filtration Fₜ generated by the process W(t), and such that

    E ∫₀ᵀ f(s)² ds < ∞.

The Itô stochastic integral I(t) is defined as the L²-limit of the Riemann sum approximation of (7.3):

    I(t) := lim_{K→∞} Σ_{k=1}^{K−1} f(t_{k−1}) (W(t_k) − W(t_{k−1})),   (7.4)

where t_k = kΔt and KΔt = t. Notice that the function f(t) is evaluated at the left end of each interval [t_{n−1}, t_n] in (7.4). The resulting Itô stochastic integral I(t) is a.s. continuous in t. These ideas are readily generalized to the case where W(s) is a standard d-dimensional Brownian motion and f(s) ∈ ℝ^{m×d} for each s. The resulting integral satisfies the Itô isometry

    E|I(t)|² = ∫₀ᵗ E‖f(s)‖²_F ds,                                   (7.5)

where ‖·‖_F denotes the Frobenius norm ‖A‖_F = √(tr(AᵀA)). The Itô stochastic integral is a martingale:

    EI(t) = 0   and   E[I(t)|F_s] = I(s)   ∀ t ≥ s,

where F_s denotes the filtration generated by W(s).

Example 7.2.1.
• Consider the Itô stochastic integral

    I(t) = ∫₀ᵗ f(s) dW(s),

where f, W are scalar-valued. This is a martingale with quadratic variation

    ⟨I⟩_t = ∫₀ᵗ (f(s))² ds.

• More generally, for f, W in arbitrary finite dimensions, the integral I(t) is a martingale with quadratic variation

    ⟨I⟩_t = ∫₀ᵗ (f(s) ⊗ f(s)) ds.
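As a quick numerical sanity check (a sketch of mine, not part of the notes), the left-endpoint Riemann sum (7.4) can be evaluated on a sampled Brownian path for the integrand f(s) = W(s). Summation by parts shows that the discrete sum equals (W(t)² − Σₖ(ΔWₖ)²)/2 exactly, and since the discrete quadratic variation Σₖ(ΔWₖ)² converges to t, the sum converges to (W(t)² − t)/2:

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 1.0, 100_000
dW = rng.normal(0.0, np.sqrt(T / K), K)        # Brownian increments on a uniform grid
W = np.concatenate(([0.0], np.cumsum(dW)))     # sampled path, W[0] = 0

# Left-endpoint Riemann sum (7.4) for the Ito integral of f(s) = W(s):
I_ito = np.sum(W[:-1] * dW)

# Summation by parts gives the exact identity
#   sum_k W_{k-1} dW_k = (W_T^2 - sum_k dW_k^2) / 2,
# and sum_k dW_k^2 -> t, so I(t) -> (W(t)^2 - t)/2 as the mesh is refined.
print(I_ito, (W[-1]**2 - T) / 2)
```

The two printed numbers differ only by half the fluctuation of the discrete quadratic variation around t, which is O(K^{-1/2}).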
7.2.1 The Stratonovich Stochastic Integral

In addition to the Itô stochastic integral, we can also define the Stratonovich stochastic integral. It is defined as the L²-limit of a different Riemann sum approximation of (7.3), namely

    I_strat(t) := lim_{K→∞} Σ_{k=1}^{K−1} ½ (f(t_{k−1}) + f(t_k)) (W(t_k) − W(t_{k−1})),   (7.6)

where t_k = kΔt and KΔt = t. Notice that the function f(t) is evaluated at both endpoints of each interval [t_{n−1}, t_n] in (7.6). The multidimensional Stratonovich integral is defined in a similar way. The resulting integral is written as

    I_strat(t) = ∫₀ᵗ f(s) ∘ dW(s).

The limit in (7.6) gives rise to an integral which differs from the Itô integral. The situation is more complex than that arising in the standard theory of Riemann integration for functions of bounded variation: in that case the points in [t_{k−1}, t_k] where the integrand is evaluated do not affect the definition of the integral. In the case of integration against Brownian motion, which does not have bounded variation, the limits differ. When f and W are correlated through an SDE, then a formula exists to convert between them.

7.3 Stochastic Differential Equations

Definition 7.3.1. By a solution of (7.1) we mean a Z-valued stochastic process {z(t)} on t ∈ [0, T] with the properties:

i. z(t) is continuous and Fₜ-adapted, where the filtration is generated by the Brownian motion W(t);
ii. h(z(t)) ∈ L¹((0, T)), γ(z(t)) ∈ L²((0, T));
iii. equation (7.1) holds for every t ∈ [0, T] with probability 1.

The solution is called unique if any two solutions xᵢ(t), i = 1, 2 satisfy

    P(x₁(t) = x₂(t), ∀ t ∈ [0, T]) = 1.
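The difference between the two Riemann sums (7.4) and (7.6) can be seen concretely for f = W on a single discrete path (a sketch of mine): the endpoint-averaged Stratonovich sum telescopes exactly to W(t)²/2, while the left-endpoint Itô sum falls short by half the discrete quadratic variation:

```python
import numpy as np

rng = np.random.default_rng(1)
K = 50_000
dW = rng.normal(0.0, np.sqrt(1.0 / K), K)
W = np.concatenate(([0.0], np.cumsum(dW)))

ito_sum = np.sum(W[:-1] * dW)                    # integrand at the left endpoint, as in (7.4)
strat_sum = np.sum(0.5 * (W[:-1] + W[1:]) * dW)  # endpoint average, as in (7.6)

# strat_sum telescopes exactly to W(t)^2/2; the two sums differ by half the
# discrete quadratic variation, which converges to t/2 as the mesh is refined.
print(strat_sum - ito_sum, 0.5 * np.sum(dW**2))
```

On every path, the gap between the two sums is ½Σₖ(ΔWₖ)², which is why the two integrals of W against dW differ by t/2 in the limit.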
It is well known that existence and uniqueness of solutions for ODEs (i.e. when γ ≡ 0 in (7.1)) holds for globally Lipschitz vector fields h(x). A very similar theorem holds when γ ≠ 0. As for ODEs, the conditions can be weakened when a priori bounds on the solution can be found.

Theorem 7.3.2. Assume that both h(·) and γ(·) are globally Lipschitz on Z and that z₀ is a random variable independent of the Brownian motion W(t) with E|z₀|² < ∞. Then the SDE (7.1) has a unique solution z(t) ∈ C(ℝ⁺; Z) with

    E ∫₀ᵀ |z(t)|² dt < ∞   ∀ T < ∞.

Furthermore, the solution of the SDE is a Markov process.

The Stratonovich analogue of (7.1) is

    dz/dt = h(z) + γ(z) ∘ dW/dt,   z(0) = z₀.                       (7.7)

By this we mean that z ∈ C(ℝ⁺, Z) satisfies the integral equation

    z(t) = z(0) + ∫₀ᵗ h(z(s)) ds + ∫₀ᵗ γ(z(s)) ∘ dW(s).             (7.8)

By using definitions (7.4) and (7.6) it can be shown that z satisfying the Stratonovich SDE (7.7) also satisfies the Itô SDE

    dz/dt = h(z) + ½ ∇ · (γ(z)γ(z)ᵀ) − ½ γ(z) ∇ · γ(z)ᵀ + γ(z) dW/dt,   z(0) = z₀,   (7.9)

provided that γ(z) is differentiable. White noise is, in most applications, an idealization of a stationary random process with short correlation time. In this context the Stratonovich interpretation of an SDE is particularly important because it often arises as the limit obtained by using smooth approximations to white noise. On the other hand, the martingale machinery which comes with the Itô integral makes it more important as a mathematical object. It is very useful that we can convert from the Itô to the Stratonovich interpretation of the stochastic integral. There are other interpretations of the stochastic integral, e.g. the Klimontovich stochastic integral.
The definition of Brownian motion implies the scaling property

    W(ct) = √c W(t),

where the above should be interpreted as holding in law. From this it follows that, if we scale time to s = ct, then

    dW/ds = (1/√c) dW/dt,

again in law. Hence, if we scale time to s = ct in (7.1), then we get the equation

    dz/ds = (1/c) h(z) + (1/√c) γ(z) dW/ds,   z(0) = z₀,

where the above should be interpreted as holding in law.

7.3.1 Examples of SDEs

The SDE for Brownian motion is:

    dX = √(2σ) dW,   X(0) = x.

The solution is:

    X(t) = x + √(2σ) W(t).

The SDE for the Ornstein–Uhlenbeck process is

    dX = −αX dt + √(2λ) dW,   X(0) = x.

We can solve this equation using the variation of constants formula:

    X(t) = e^{−αt} x + √(2λ) ∫₀ᵗ e^{−α(t−s)} dW(s).

We can use Itô's formula to obtain equations for the moments of the OU process. The generator is:

    L = −αx∂ₓ + λ∂ₓ².

We apply Itô's formula to the function f(x) = xⁿ to obtain:

    dX(t)ⁿ = LX(t)ⁿ dt + √(2λ) ∂ₓX(t)ⁿ dW
           = (−αnX(t)ⁿ + λn(n − 1)X(t)ⁿ⁻²) dt + n√(2λ) X(t)ⁿ⁻¹ dW.
Consequently:

    X(t)ⁿ = xⁿ + ∫₀ᵗ (−αnX(s)ⁿ + λn(n − 1)X(s)ⁿ⁻²) ds + n√(2λ) ∫₀ᵗ X(s)ⁿ⁻¹ dW.

By taking the expectation in the above equation we obtain the equation for the moments of the OU process that we derived earlier using the Fokker–Planck equation:

    Mₙ(t) = xⁿ + ∫₀ᵗ (−αnMₙ(s) + λn(n − 1)Mₙ₋₂(s)) ds.

Consider the geometric Brownian motion

    dX(t) = μX(t) dt + σX(t) dW(t),                                 (7.10)

where we use the Itô interpretation of the stochastic differential. The generator of this process is

    L = μx∂ₓ + (σ²x²/2) ∂ₓ².

The solution to this equation is

    X(t) = X(0) exp((μ − σ²/2) t + σW(t)).                          (7.11)

To derive this formula, we apply Itô's formula to the function f(x) = log(x):

    d log(X(t)) = L log(X(t)) dt + σx ∂ₓ log(X(t)) dW(t)
                = (μx (1/x) − (σ²x²/2) (1/x²)) dt + σ dW(t)
                = (μ − σ²/2) dt + σ dW(t).

Consequently:

    log(X(t)/X(0)) = (μ − σ²/2) t + σW(t),

from which (7.11) follows. Notice that the Stratonovich interpretation of this equation leads to the solution

    X(t) = X(0) exp(μt + σW(t)).
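The closed-form solution (7.11) provides a convenient test for numerical schemes. The sketch below (mine; the simple left-endpoint time-stepping used here anticipates the numerical methods of Section 7.7) integrates the Itô SDE (7.10) and compares the result against (7.11) evaluated on the same Brownian path:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, X0, T, N = 0.05, 0.2, 1.0, 1.0, 10_000
h = T / N
dW = rng.normal(0.0, np.sqrt(h), N)

# Left-endpoint (Euler-Maruyama) time stepping for the Ito SDE (7.10):
#   X_{n+1} = X_n (1 + mu h + sigma dW_n)
X = X0
for dw in dW:
    X *= 1.0 + mu * h + sigma * dw

# Closed-form Ito solution (7.11) evaluated on the same Brownian path:
X_exact = X0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dW.sum())
print(X, X_exact)
```

With this step size the two values agree to a fraction of a percent; refining the mesh shrinks the gap further, while replacing the exponent by μT + σW(T) (the Stratonovich solution) would leave an O(σ²T/2) discrepancy.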
7.4 The Generator, Itô's Formula and the Fokker–Planck Equation

7.4.1 The Generator

Given the function γ(z) in the SDE (7.1) we define

    Γ(z) = γ(z)γ(z)ᵀ.

The generator L is then defined as

    Lv = h · ∇v + ½ Γ : ∇∇v.                                        (7.12)

This operator, equipped with a suitable domain of definition, is the generator of the Markov process given by (7.1). The formal L²-adjoint operator L* is

    L*v = −∇ · (hv) + ½ ∇ · ∇ · (Γv).

7.4.2 Itô's Formula

The Itô formula enables us to calculate the rate of change in time of functions V : Z → ℝⁿ evaluated at the solution of a Z-valued SDE. Formally, we can write:

    d/dt V(z(t)) = LV(z(t)) + ⟨∇V(z(t)), γ(z(t)) dW/dt⟩.

Note that if W were a smooth time-dependent function this formula would not be correct: there is an additional term in LV, proportional to Γ, which arises from the lack of smoothness of Brownian motion. The precise interpretation of the expression for the rate of change of V is in integrated form:

Lemma 7.4.1 (Itô's Formula). Assume that the conditions of Theorem 7.3.2 hold. Let x(t) solve (7.1) and let V ∈ C²(Z, ℝⁿ). Then the process V(z(t)) satisfies

    V(z(t)) = V(z(0)) + ∫₀ᵗ LV(z(s)) ds + ∫₀ᵗ ⟨∇V(z(s)), γ(z(s)) dW(s)⟩.   (7.13)

Let φ : Z → ℝ and consider the function

    v(z, t) = E(φ(z(t)) | z(0) = z),                                (7.14)
where the expectation is with respect to all Brownian driving paths. By averaging in the Itô formula, which removes the stochastic integral, and using the Markov property, it is possible to obtain the Backward Kolmogorov equation.

Theorem 7.4.2. Assume that φ is chosen sufficiently smooth so that the backward Kolmogorov equation

    ∂v/∂t = Lv   for (z, t) ∈ Z × (0, ∞),
    v = φ        for (z, t) ∈ Z × {0},                              (7.15)

has a unique classical solution v(x, t) ∈ C^{2,1}(Z × (0, ∞)). Then v is given by (7.14), where z(t) solves (7.2).

For a Stratonovich SDE the rules of standard calculus apply: consider the Stratonovich SDE (7.29) and let V(x) ∈ C²(ℝ). Then

    dV(X(t)) = (dV/dx)(X(t)) (f(X(t)) dt + σ(X(t)) ∘ dW(t)).       (7.16)

Consider the Stratonovich SDE (7.29) on ℝᵈ (i.e. f ∈ ℝᵈ, σ : ℝⁿ → ℝᵈ, W(t) is standard Brownian motion on ℝⁿ). The corresponding Fokker–Planck equation is:

    ∂ρ/∂t = −∇ · (fρ) + ½ ∇ · (σ ∇ · (σρ)).

Now we can derive rigorously the Fokker–Planck equation.

Theorem 7.4.3. Consider equation (7.2) with z(0) a random variable with density ρ₀(z). Assume that the law of z(t) has a density ρ(z, t) ∈ C^{2,1}(Z × (0, ∞)). Then ρ satisfies the Fokker–Planck equation

    ∂ρ/∂t = L*ρ   for (z, t) ∈ Z × (0, ∞),                          (7.17a)
    ρ = ρ₀        for z ∈ Z × {0}.                                  (7.17b)

Proof. Let Eμ denote averaging with respect to the product measure induced by the measure μ with density ρ₀ on z(0) and the independent driving Wiener measure on the SDE itself. Averaging
over random z(0) distributed with density ρ₀(z), we find

    Eμ(φ(z(t))) = ∫_Z v(z, t)ρ₀(z) dz = ∫_Z (e^{Lt}φ)(z)ρ₀(z) dz = ∫_Z (e^{L*t}ρ₀)(z)φ(z) dz.

But since ρ(z, t) is the density of z(t) we also have

    Eμ(φ(z(t))) = ∫_Z ρ(z, t)φ(z) dz.

Equating these two expressions for the expectation at time t we obtain

    ∫_Z (e^{L*t}ρ₀)(z)φ(z) dz = ∫_Z ρ(z, t)φ(z) dz.

We use a density argument so that the identity can be extended to all φ ∈ L²(Z). Hence, from the above equation we deduce that

    ρ(z, t) = (e^{L*t}ρ₀)(z).

Differentiation of the above equation gives (7.17a). Setting t = 0 gives the initial condition (7.17b).

7.5 Linear SDEs

In this section we study linear SDEs in arbitrary finite dimensions. Let A ∈ ℝᵈˣᵈ be a positive definite matrix and let D > 0 be a positive constant. We will consider the SDE

    dX(t) = −AX(t) dt + √(2D) dW(t)

or, componentwise,

    dXᵢ(t) = −Σ_{j=1}^d Aᵢⱼ Xⱼ(t) dt + √(2D) dWᵢ(t),   i = 1, …, d.

The corresponding Fokker–Planck equation is

    ∂p/∂t = ∇ · (Axp) + DΔp
or

    ∂p/∂t = Σ_{i,j} ∂/∂xᵢ (Aᵢⱼxⱼ p) + D Σ_{j=1}^d ∂²p/∂xⱼ².

Let us now solve the Fokker–Planck equation with initial condition p(x, 0|x₀, 0) = δ(x − x₀). We take the Fourier transform of the Fokker–Planck equation to obtain

    ∂p̂/∂t = −Ak · ∇_k p̂ − D|k|² p̂,                                  (7.18)

where

    p(x, t|x₀, 0) = (2π)⁻ᵈ ∫_{ℝᵈ} e^{ik·x} p̂(k, t|x₀, 0) dk.

The initial condition is

    p̂(k, 0|x₀, 0) = e^{−ik·x₀}.                                     (7.19)

We know that the transition probability density of a linear SDE is Gaussian. Since the Fourier transform of a Gaussian function is also Gaussian, we look for a solution to (7.18) which is of the form

    p̂(k, t|x₀, 0) = exp(−ik · M(t) − ½ kᵀΣ(t)k).

We substitute this into (7.18) and use the symmetry of A to obtain the equations

    dM/dt = −AM   and   dΣ/dt = −2AΣ + 2DI,

with initial conditions (which follow from the initial condition (7.19))

    M(0) = x₀   and   Σ(0) = 0,

where 0 denotes the zero d × d matrix. The solutions are

    M(t) = e^{−At}x₀   and   Σ(t) = DA⁻¹ − DA⁻¹e^{−2At}.

We can solve these equations using the spectral resolution of A = BᵀΛB. We calculate now the inverse Fourier transform of p̂ to obtain the fundamental solution (Green's function) of the Fokker–Planck equation

    p(x, t|x₀, 0) = (2π)^{−d/2} (det(Σ(t)))^{−1/2} exp(−½ (x − e^{−At}x₀)ᵀ Σ⁻¹(t) (x − e^{−At}x₀)).   (7.20)
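The formulas M(t) = e^{−At}x₀ and Σ(t) = DA⁻¹ − DA⁻¹e^{−2At} can be verified directly: since A commutes with its own exponential, dΣ/dt = 2De^{−2At}, which equals −2AΣ + 2DI. A numerical sketch of mine, using a hypothetical 2×2 symmetric positive-definite A:

```python
import numpy as np

def expm_sym(A, t):
    """Matrix exponential e^{tA} of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(t * w)) @ V.T

# A hypothetical symmetric positive-definite drift matrix and noise strength:
A = np.array([[2.0, 0.5], [0.5, 1.0]])
D, t = 0.3, 0.7
Ainv = np.linalg.inv(A)

Sigma = D * Ainv - D * Ainv @ expm_sym(A, -2.0 * t)   # covariance Sigma(t)
lhs = 2.0 * D * expm_sym(A, -2.0 * t)                 # dSigma/dt in closed form
rhs = -2.0 * A @ Sigma + 2.0 * D * np.eye(2)          # right-hand side of the ODE
print(np.max(np.abs(lhs - rhs)))                      # residual at machine precision
```

The residual is at round-off level for any t, confirming that Σ(t) solves dΣ/dt = −2AΣ + 2DI with Σ(0) = 0.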
We note that the generator of the Markov process Xt is of the form

    L = −∇V(x) · ∇ + DΔ,   with   V(x) = ½ xᵀAx = ½ Σ_{i,j=1}^d Aᵢⱼxᵢxⱼ.

This is a confining potential and from the theory presented in Section 6.5 we know that the process Xt is ergodic. The invariant distribution is

    ps(x) = (1/Z) e^{−xᵀAx/(2D)},   with   Z = ∫_{ℝᵈ} e^{−xᵀAx/(2D)} dx = (2πD)^{d/2} √(det(A⁻¹)).   (7.21)

Using the above calculations, we can calculate the stationary autocorrelation matrix, which is given by the formula

    E(Xt X₀ᵀ) = ∫∫ x x₀ᵀ p(x, t|x₀, 0) ps(x₀) dx dx₀.

We use now the variation of constants formula to obtain

    Xt = e^{−At}X₀ + √(2D) ∫₀ᵗ e^{−A(t−s)} dW(s).

The matrix exponential can be calculated using the spectral resolution of A: e^{−At} = Bᵀe^{−Λt}B. We substitute the formulas for the transition probability density and the stationary distribution, equations (7.20) and (7.21), into the above equations and do the Gaussian integration to obtain

    E(Xt X₀ᵀ) = DA⁻¹e^{−At}.

7.6 Derivation of the Stratonovich SDE

When white noise is approximated by a smooth process this often leads to Stratonovich interpretations of stochastic integrals, at least in one dimension. We use multiscale analysis (singular perturbation theory for Markov processes) to illustrate this phenomenon in a one-dimensional example. Consider the equations

    dx/dt = h(x) + (1/ε) f(x)y,                                     (7.22a)
    dy/dt = −(α/ε²) y + (√(2D)/ε) dV/dt,                            (7.22b)
22a) has nonzero correlation time. α dt (7. α2 1 D − α t−s .23).23) (7. Both of these arguments lead us to conjecture the limiting Itˆ SDE: o dX = h(X) + dt 2D dV f (X) .6. ε→0 ε α2 dt Another way of seeing this is by solving (7. The correlation function of the colored noise η(t) := y(t)/ε is (we take y(0) = 0) R(t) = E (η(t)η(s)) = The power spectrum of the colored noise η(t) is: f ε (x) = 1 Dε−2 1 ε2 π x2 + (αε−2 )2 1 D D = → 4 x2 + α 2 πε πα2 2D δ(t − s). e ε2 ε2 α and. We will show this using singular perturbation theory.26).1. again. giving dX = h(X) + dt 2D dV f (X) ◦ . 2 dt α α dt (7. Whenever white noise is approximated by a smooth process. at the heuristic (7. the heuristic gives the incorrect limit.22a) converges. the limiting equation should be interpreted in the Stratonovich sense. We say that the process x(t) is driven by colored noise: the noise that appears in (7. consequently. A similar result is true in arbitrary ﬁnite and even inﬁnite dimensions. Theorem 7. Then the solution of eqn (7. as applied.25) In fact.24) If we neglect the O(ε) term on the right hand side then we arrive. α dt (7. 130 .with V being a standard onedimensional Brownian motion. ε→0 lim E y(t) y(s) ε ε = which implies the heuristic y(t) 2D dV = .26) This is usually called the WongZakai theorem. Assume that the initial conditions for y(t) are stationary and that the function f is smooth. in the limit as ε → 0 to the solution of the Stratonovich SDE (7.22b) for y/ε: lim y = ε 2D dV ε dy − .
Remarks 7.6.2.

i. It is possible to prove pathwise convergence under very mild assumptions.

ii. The generator of a Stratonovich SDE has the form

    L_strat = h(x)∂ₓ + (D/α²) f(x)∂ₓ(f(x)∂ₓ).

iii. Consequently, the Fokker–Planck operator of the Stratonovich SDE can be written in divergence form:

    L*_strat · = −∂ₓ(h(x) ·) + (D/α²) ∂ₓ(f²(x)∂ₓ ·).

iv. In most applications in physics the white noise is an approximation of a more complicated noise process with nonzero correlation time. Hence, the physically correct interpretation of the stochastic integral is the Stratonovich one.

v. In higher dimensions an additional drift term might appear due to the noncommutativity of the row vectors of the diffusion matrix. This is related to the Lévy area correction in the theory of rough paths.

Proof of Theorem 7.6.1. The generator of the process (x(t), y(t)) is

    L = (1/ε²)(−αy∂_y + D∂_y²) + (1/ε) f(x)y∂ₓ + h(x)∂ₓ
      =: (1/ε²)L₀ + (1/ε)L₁ + L₂.

The "fast" process is a stationary Markov process with invariant density

    ρ(y) = √(α/(2πD)) e^{−αy²/(2D)}.                                (7.27)

The backward Kolmogorov equation is

    ∂u^ε/∂t = ((1/ε²)L₀ + (1/ε)L₁ + L₂) u^ε.                        (7.28)

We look for a solution to this equation in the form of a power series expansion in ε:

    u^ε(x, y, t) = u₀ + εu₁ + ε²u₂ + ⋯
We substitute this into (7.28) and equate terms of the same power in ε to obtain the following hierarchy of equations:

    −L₀u₀ = 0,
    −L₀u₁ = L₁u₀,
    −L₀u₂ = L₁u₁ + L₂u₀ − ∂u₀/∂t.

The ergodicity of the fast process implies that the null space of the generator L₀ consists only of constants in y. Hence:

    u₀ = u(x, t).

The second equation in the hierarchy becomes

    −L₀u₁ = f(x)y∂ₓu.

This equation is solvable since the right hand side is orthogonal to the null space of the adjoint of L₀ (this is the Fredholm alternative). We solve it using separation of variables:

    u₁(x, y, t) = (1/α) f(x)∂ₓu y + ψ₁(x, t).

In order for the third equation to have a solution we need to require that the right hand side is orthogonal to the null space of L₀*:

    ∫_ℝ (L₁u₁ + L₂u₀ − ∂u₀/∂t) ρ(y) dy = 0.

We calculate:

    ∫_ℝ ∂u₀/∂t ρ(y) dy = ∂u/∂t.

Furthermore:

    ∫_ℝ L₂u₀ ρ(y) dy = h(x)∂ₓu.

Finally,

    ∫_ℝ L₁u₁ ρ(y) dy = ∫_ℝ f(x)y∂ₓ ((1/α) f(x)∂ₓu y + ψ₁(x, t)) ρ(y) dy
                     = (1/α) f(x)∂ₓ(f(x)∂ₓu) ⟨y²⟩ + f(x)∂ₓψ₁(x, t) ⟨y⟩
                     = (D/α²) f(x)∂ₓ(f(x)∂ₓu)
                     = (D/α²) f(x)∂ₓf(x) ∂ₓu + (D/α²) f(x)²∂ₓ²u,

since ⟨y²⟩ = D/α and ⟨y⟩ = 0 with respect to ρ(y).
Putting everything together we obtain the limiting backward Kolmogorov equation

    ∂u/∂t = (h(x) + (D/α²) f(x)∂ₓf(x)) ∂ₓu + (D/α²) f(x)²∂ₓ²u,

from which we read off the limiting Stratonovich SDE

    dX/dt = h(X) + (√(2D)/α) f(X) ∘ dV/dt.                          (7.29)

7.6.1 Itô versus Stratonovich

A Stratonovich SDE

    dX(t) = f(X(t)) dt + σ(X(t)) ∘ dW(t)

can be written as an Itô SDE

    dX(t) = (f(X(t)) + ½ (σ dσ/dx)(X(t))) dt + σ(X(t)) dW(t),      (7.30)

and conversely, an Itô SDE

    dX(t) = f(X(t)) dt + σ(X(t)) dW(t)

can be written as a Stratonovich SDE

    dX(t) = (f(X(t)) − ½ (σ dσ/dx)(X(t))) dt + σ(X(t)) ∘ dW(t).

The Itô and Stratonovich interpretations of an SDE can lead to equations with very different properties! When the diffusion coefficient depends on the solution of the SDE X(t), we will say that we have an equation with multiplicative noise.

7.7 Numerical Solution of SDEs

7.8 Parameter Estimation for SDEs

7.9 Noise Induced Transitions

Consider the Landau equation:

    dXt/dt = Xt(c − Xt²),   X₀ = x.                                 (7.31)
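Before perturbing it, a quick numerical sketch of mine of the deterministic dynamics: for c > 0 the origin is unstable and solutions converge to ±√c depending on the sign of the initial condition, while for c < 0 the origin attracts everything:

```python
def landau(x0, c, T=20.0, h=1e-3):
    """Forward-Euler integration of the deterministic Landau equation x' = x(c - x^2)."""
    x = x0
    for _ in range(int(T / h)):
        x += h * x * (c - x**2)
    return x

# c > 0: the origin is unstable, x(t) -> sign(x0) * sqrt(c);
# c < 0: every solution is attracted to the single steady state 0.
print(landau(0.1, 1.0))    # close to +1
print(landau(-0.1, 1.0))   # close to -1
print(landau(0.5, -1.0))   # close to 0
```

This pitchfork structure at c = 0 is what makes the Landau equation a natural test case for comparing additive and multiplicative noise.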
This is a gradient flow for the potential

    V(x) = −½cx² + ¼x⁴.

When c < 0 all solutions are attracted to the single steady state X* = 0. When c > 0 the steady state X* = 0 becomes unstable and Xt → √c if x > 0 and Xt → −√c if x < 0. Consider additive random perturbations to the Landau equation:

    dXt/dt = Xt(c − Xt²) + √(2σ) dWt/dt,   X₀ = x.                  (7.32)

This equation defines an ergodic Markov process on ℝ: there exists a unique invariant distribution

    ρ(x) = Z⁻¹e^{−V(x)/σ},   Z = ∫_ℝ e^{−V(x)/σ} dx,   V(x) = −½cx² + ¼x⁴.

ρ(x) is a probability density for all values of c ∈ ℝ. The presence of additive noise in some sense "trivializes" the dynamics. The dependence of various averaged quantities on c resembles the physical situation of a second order phase transition.

Consider now multiplicative perturbations of the Landau equation:

    dXt/dt = Xt(c − Xt²) + √(2σ) Xt dWt/dt,   X₀ = x,               (7.33)

where the stochastic differential is interpreted in the Itô sense. The generator of this process is

    L = x(c − x²)∂ₓ + σx²∂ₓ².

Notice that Xt = 0 is always a solution of (7.33). Thus, if we start with x > 0 (x < 0) the solution will remain positive (negative). We will assume that x > 0. Consider the function Yt = log(Xt). We apply Itô's formula to this function:

    dYt = L log(Xt) dt + √(2σ) Xt ∂ₓ log(Xt) dWt
        = (Xt(c − Xt²) (1/Xt) − σXt² (1/Xt²)) dt + √(2σ) dWt
        = (c − σ) dt − Xt² dt + √(2σ) dWt.

Thus, we have been able to transform (7.33) into an SDE with additive noise:

    dYt = ((c − σ) − e^{2Yt}) dt + √(2σ) dWt.                       (7.34)

This is a gradient flow with potential

    V(y) = −(c − σ)y + ½e^{2y}.
The invariant measure, if it exists, is of the form

    ρ(y) dy = Z⁻¹e^{−V(y)/σ} dy.

Going back to the variable x we obtain:

    ρ(x) dx = Z⁻¹ x^{c/σ−2} e^{−x²/(2σ)} dx.

We need to make sure that this distribution is integrable:

    Z = ∫₀^{+∞} x^γ e^{−x²/(2σ)} dx < ∞,   γ = c/σ − 2.

For this it is necessary that

    γ > −1   ⟺   c > σ.

Not all multiplicative random perturbations lead to ergodic behavior. The dependence of the invariant distribution on c is similar to the physical situation of first order phase transitions.

7.10 Discussion and Bibliography

Colored Noise. When the noise which drives an SDE has nonzero correlation time we will say that we have colored noise. The material in Section 7.9 is based on [59]. Colored noise appears in many applications in physics and chemistry. For a review see P. Hanggi and P. Jung, Colored noise in dynamical systems, Adv. Chem. Phys. 89, 239 (1995). The properties of the SDE (stability, ergodicity etc.) are quite robust under "coloring of the noise". See also [58]. Consider, for example, the SDE

    τẌ = −Ẋ + v(X)η^ε(t),

where η^ε(t) is colored noise with correlation time ε². In the case where there is an additional small time scale in the problem, in addition to the correlation time of the colored noise, it is not clear what the right interpretation of the stochastic integral is (in the limit as both small time scales go to 0). In the limit where both small time scales go to 0 we can get either Itô or Stratonovich or neither. This is usually called the Itô versus Stratonovich problem. See [51, 71]. See also G. Blankenship and G.C. Papanicolaou, Stability and control of stochastic systems with wide-band noise disturbances. I, SIAM J. Appl. Math., 34(3), 1978, pp. 437–476. Noise induced transitions are studied extensively in [42].
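The Itô–Stratonovich dictionary used throughout this chapter is easy to encode; a small self-check of mine on geometric Brownian motion (with hypothetical parameter values) shows the drift correction at work:

```python
def strat_to_ito_drift(f, sigma, dsigma):
    """Ito drift equivalent to the Stratonovich SDE dX = f dt + sigma o dW:
    the corrected drift is f(x) + (1/2) sigma(x) sigma'(x)."""
    return lambda x: f(x) + 0.5 * sigma(x) * dsigma(x)

# Geometric Brownian motion in Stratonovich form: f(x) = mu x, sigma(x) = s x.
mu, s = 0.1, 0.4
ito_f = strat_to_ito_drift(lambda x: mu * x, lambda x: s * x, lambda x: s)

# The correction is (1/2) s^2 x, so the equivalent Ito drift is (mu + s^2/2) x,
# consistent with the solutions exp(mu t + s W) vs exp((mu - s^2/2) t + s W).
print(ito_f(2.0), (mu + 0.5 * s**2) * 2.0)   # both 0.36
```

Swapping the sign of the correction gives the converse (Itô-to-Stratonovich) map, which is relevant to Exercises 1 and 2 below.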
7.11 Exercises

1. Calculate all moments of the geometric Brownian motion for the Itô and Stratonovich interpretations of the stochastic integral.

2. Analyze equation (7.33) for the Stratonovich interpretation of the stochastic integral.

3. Study additive and multiplicative random perturbations of the ODE

       dx/dt = x(c + 2x² − x⁴).
Chapter 8

The Langevin Equation

8.1 Introduction

8.2 The Fokker–Planck Equation in Phase Space (Klein–Kramers Equation)

Consider a diffusion process in two dimensions for the variables q (position) and p (momentum). The generator of this Markov process is

    L = p · ∇_q − ∇_qV · ∇_p + γ(−p · ∇_p + DΔ_p).                  (8.1)

The L²(dp dq)-adjoint is

    L*ρ = −p · ∇_qρ + ∇_qV · ∇_pρ + γ(∇_p · (pρ) + DΔ_pρ).

The corresponding FP equation is:

    ∂p/∂t = L*p.

The corresponding stochastic differential equation is the Langevin equation

    Ẍₜ = −∇V(Xₜ) − γẊₜ + √(2γD) Ẇₜ.                                  (8.2)

This is Newton's equation perturbed by dissipation and noise. The Fokker–Planck equation for the Langevin equation, which is sometimes called the Klein–Kramers–Chandrasekhar equation, was first derived by Klein in 1923 and was studied by Kramers in his famous paper [?]. Notice that L* is not a uniformly elliptic operator: there are second order derivatives only with respect to p and not q. This is an example of a degenerate elliptic operator. It is, however, hypoelliptic. We can
still prove existence, uniqueness and regularity of solutions for the Fokker–Planck equation, and obtain estimates on the solution. It is not possible to obtain the solution of the FP equation for an arbitrary potential. We can, however, calculate the (unique normalized) solution of the stationary Fokker–Planck equation.

Theorem 8.2.1. Let V(x) be a smooth confining potential. Then the Markov process with generator (8.45) is ergodic. The unique invariant distribution is the Maxwell–Boltzmann distribution

    ρβ(p, q) = (1/Z) e^{−βH(p,q)},                                  (8.3)

where

    H(p, q) = ½|p|² + V(q)

is the Hamiltonian, β = (k_B T)⁻¹ is the inverse temperature, and the normalization factor Z is the partition function

    Z = ∫_{ℝ²ᵈ} e^{−βH(p,q)} dp dq.

It is possible to obtain rates of convergence in either a weighted L² norm or the relative entropy norm:

    H(p(·, t) | ρβ) ≤ C e^{−αt}.

The proof of this result is very complicated, since the generator L is degenerate and non-self-adjoint. See for example [?] and the references therein. Let ρ(q, p, t) be the solution of the Kramers equation and let ρβ(q, p) be the Maxwell–Boltzmann distribution. We can write

    ρ(q, p, t) = h(q, p, t) ρβ(q, p),

where h(q, p, t) solves the equation

    ∂h/∂t = −Ah + γSh,                                              (8.4)

where

    A = p · ∇_q − ∇_qV · ∇_p,   S = −p · ∇_p + β⁻¹Δ_p.

The operator A is antisymmetric in L²ρ := L²(ℝ²ᵈ; ρβ(q, p)), whereas S is symmetric. Let Xᵢ := −∂/∂pᵢ. The L²ρ-adjoint of Xᵢ is

    Xᵢ* = −βpᵢ + ∂/∂pᵢ.
We have that

    S = β⁻¹ Σ_{i=1}^d Xᵢ*Xᵢ.

Consequently, the generator of the Markov process {q(t), p(t)} can be written in Hörmander's "sum of squares" form:

    L = A + γβ⁻¹ Σ_{i=1}^d Xᵢ*Xᵢ.                                   (8.5)

We calculate the commutators between the vector fields in (8.5):

    [A, Xᵢ] = ∂/∂qᵢ,   [Xᵢ, Xⱼ] = 0,   [Xᵢ, Xⱼ*] = βδᵢⱼ.

Consequently,

    Lie(X₁, …, X_d, [A, X₁], …, [A, X_d]) = Lie(∇_p, ∇_q),

which spans T_{p,q}ℝ²ᵈ for all p, q ∈ ℝᵈ. This shows that the generator L is a hypoelliptic operator.

Let now Yᵢ = −∂/∂qᵢ with L²ρ-adjoint

    Yᵢ* = ∂/∂qᵢ − β ∂V/∂qᵢ.

We have that

    Xᵢ*Yᵢ − Yᵢ*Xᵢ = β (pᵢ ∂/∂qᵢ − ∂V/∂qᵢ ∂/∂pᵢ).

Consequently, the generator can be written in the form

    L = β⁻¹ Σ_{i=1}^d (Xᵢ*Yᵢ − Yᵢ*Xᵢ + γXᵢ*Xᵢ).                     (8.6)

Notice also that

    L_V := −∇_qV · ∇_q + β⁻¹Δ_q = β⁻¹ Σ_{i=1}^d Yᵢ*Yᵢ.

The phase-space Fokker–Planck equation can be written in the form

    ∂ρ/∂t + p · ∇_qρ − ∇_qV · ∇_pρ = Q(ρ, f_B),

where the collision operator has the form

    Q(ρ, f_B) = D ∇ · (f_B ∇(f_B⁻¹ρ)).

The Fokker–Planck equation has a similar structure to the Boltzmann equation (the basic equation in the kinetic theory of gases), with the difference that the collision operator for the FP equation is
linear. Convergence of solutions of the Boltzmann equation to the Maxwell–Boltzmann distribution has also been proved. See ??.

We can study the backward and forward Kolmogorov equations for (9.11) by expanding the solution with respect to the Hermite basis. We consider the problem in 1d. We set D = 1. The generator of the process is:

    L = p∂_q − V′(q)∂_p + γ(−p∂_p + ∂_p²) =: L₁ + γL₀,

where

    L₀ := −p∂_p + ∂_p²   and   L₁ := p∂_q − V′(q)∂_p.

The backward Kolmogorov equation is

    ∂h/∂t = Lh.                                                     (8.7)

The solution should be an element of the weighted L² space

    L²ρ = { f | ∫_{ℝ²} |f|² Z⁻¹e^{−βH(p,q)} dp dq < ∞ }.

We notice that the invariant measure of our Markov process is a product measure:

    e^{−βH(p,q)} = e^{−βp²/2} e^{−βV(q)}.

The space L²(e^{−βp²/2} dp) is spanned by the Hermite polynomials. Consequently, we can expand the solution of (8.7) into the Hermite basis:

    h(p, q, t) = Σ_{n=0}^∞ hₙ(q, t)fₙ(p),                           (8.8)

where fₙ(p) = (1/√(n!)) Hₙ(p). Our plan is to substitute (8.8) into (8.7) and obtain a sequence of equations for the coefficients hₙ(q, t). We have:

    L₀h = L₀ Σ_{n=0}^∞ hₙfₙ = −Σ_{n=0}^∞ n hₙfₙ.

Furthermore,

    L₁h = −∂_qV ∂_p h + p∂_q h.
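The eigenfunction relation used here, L₀fₙ = −nfₙ for the normalized probabilists' Hermite functions fₙ(p) = Hₙ(p)/√(n!), can be checked numerically with NumPy's `hermite_e` module (a sketch of mine; β = 1 as in the text):

```python
import numpy as np
from numpy.polynomial import hermite_e as H
from math import factorial, sqrt

p = np.linspace(-2.0, 2.0, 9)
for n in range(6):
    # coefficient vector of f_n(p) = He_n(p)/sqrt(n!) in the He basis
    c = np.zeros(n + 1)
    c[n] = 1.0 / sqrt(factorial(n))
    fn = H.hermeval(p, c)
    dfn = H.hermeval(p, H.hermeder(c))       # f_n'
    d2fn = H.hermeval(p, H.hermeder(c, 2))   # f_n''
    assert np.allclose(-p * dfn + d2fn, -n * fn)   # L0 f_n = -n f_n
print("L0 f_n = -n f_n holds for n = 0,...,5")
```

The same module can be used to check the raising/lowering identities for ∂_p fₙ and p fₙ that are needed in the expansion below.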
We calculate each term on the right hand side of the above equation separately. For this we will need the formulas

    ∂_p fₙ = √n fₙ₋₁   and   p fₙ = √n fₙ₋₁ + √(n+1) fₙ₊₁.

We have:

    p∂_q h = p∂_q Σ_{n=0}^∞ hₙfₙ = ∂_q h₀ pf₀ + Σ_{n=1}^∞ ∂_q hₙ pfₙ
           = ∂_q h₀ f₁ + Σ_{n=1}^∞ ∂_q hₙ (√n fₙ₋₁ + √(n+1) fₙ₊₁)
           = Σ_{n=0}^∞ (√(n+1) ∂_q hₙ₊₁ + √n ∂_q hₙ₋₁) fₙ,

with h₋₁ ≡ 0. Furthermore,

    ∂_qV ∂_p h = Σ_{n=0}^∞ ∂_qV hₙ ∂_p fₙ = Σ_{n=0}^∞ ∂_qV hₙ √n fₙ₋₁ = Σ_{n=0}^∞ ∂_qV hₙ₊₁ √(n+1) fₙ.

Consequently:

    Lh = (L₁ + γL₀)h
       = Σ_{n=0}^∞ (−γn hₙ + √(n+1) ∂_q hₙ₊₁ + √n ∂_q hₙ₋₁ − √(n+1) ∂_qV hₙ₊₁) fₙ.

Using the orthonormality of the eigenfunctions of L₀ we obtain the following set of equations which determine {hₙ(q, t)}_{n=0}^∞:

    ḣₙ = −γn hₙ + √(n+1) ∂_q hₙ₊₁ + √n ∂_q hₙ₋₁ − √(n+1) ∂_qV hₙ₊₁,   n = 0, 1, …

This set of equations is usually called the Brinkman hierarchy (1956). We can use this approach to develop a numerical method for solving the Klein–Kramers equation. For this we need to expand each coefficient hₙ in an appropriate basis with respect to q. Obvious choices are either the Hermite basis (for polynomial potentials) or the standard Fourier basis (for periodic potentials). We will do this for the case of periodic potentials. The resulting method is usually called the continued fraction expansion. See [82]. This expansion can be justified rigorously for the Fokker–Planck equation. See [73]. This expansion can also be used in order to solve the Poisson equation −Lφ = f(p, q). See [67].

The Hermite expansion of the distribution function with respect to the velocity is used in the study of various kinetic equations (including the Boltzmann equation). It was initiated by Grad in the late 40's. It is quite often used in the approximate calculation of transport coefficients (e.g. the diffusion coefficient).

8.3 The Langevin Equation in a Harmonic Potential

There are very few potentials for which we can solve the Langevin equation or calculate the eigenvalues and eigenfunctions of the generator of the Markov process {q(t), p(t)}. One case where we can calculate everything explicitly is that of a Brownian particle in a quadratic (harmonic) potential

    V(q) = ½ ω₀²q².                                                 (8.9)

The Langevin equation is

    q̈ = −ω₀²q − γq̇ + √(2γβ⁻¹) Ẇ                                     (8.10)

or

    q̇ = p,   ṗ = −ω₀²q − γp + √(2γβ⁻¹) Ẇ.                            (8.11)

This is a linear equation that can be solved explicitly. Rather than doing this, we will calculate the eigenvalues and eigenfunctions of the generator, which takes the form

    L = p∂_q − ω₀²q∂_p + γ(−p∂_p + β⁻¹∂_p²).                        (8.12)

The corresponding Fokker–Planck operator is the L²-adjoint:

    L*ρ = −p∂_qρ + ω₀²q∂_pρ + γ(∂_p(pρ) + β⁻¹∂_p²ρ).                (8.13)

The process {q(t), p(t)} is an ergodic Markov process with Gaussian invariant measure

    ρβ(q, p) dq dp = (βω₀/2π) e^{−βp²/2 − βω₀²q²/2} dq dp.          (8.14)
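The invariant measure (8.14) predicts the equipartition values E p² = β⁻¹ and E q² = (βω₀²)⁻¹. A crude simulation of (8.11) with left-endpoint time stepping (my sketch; better integrators exist for Langevin dynamics) reproduces them to within discretization and sampling error:

```python
import numpy as np

rng = np.random.default_rng(4)
gamma, beta, omega0 = 1.0, 1.0, 1.0   # hypothetical parameter choice: both variances equal 1
h, n_steps = 0.01, 200_000
noise = np.sqrt(2.0 * gamma / beta * h) * rng.standard_normal(n_steps)

q, p = 0.0, 0.0
qs = np.empty(n_steps)
ps = np.empty(n_steps)
# Left-endpoint stepping for q' = p, p' = -omega0^2 q - gamma p + sqrt(2 gamma / beta) W':
for n in range(n_steps):
    q, p = q + p * h, p + (-omega0**2 * q - gamma * p) * h + noise[n]
    qs[n], ps[n] = q, p

# Equipartition under the Gaussian invariant measure (8.14):
#   E p^2 = 1/beta,  E q^2 = 1/(beta omega0^2)  -- both 1 for these parameters.
print(ps.var(), qs.var())
```

Both sample variances come out close to 1; the residual bias is O(h) from the discretization plus the usual Monte Carlo error.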
For the calculation of the eigenvalues and eigenfunctions of the operator L it is convenient to introduce creation and annihilation operators in both the position and momentum variables. We set

    a⁻ = β^{−1/2}∂_p,   a⁺ = −β^{−1/2}∂_p + β^{1/2}p,               (8.15)

and

    b⁻ = ω₀⁻¹β^{−1/2}∂_q,   b⁺ = −ω₀⁻¹β^{−1/2}∂_q + ω₀β^{1/2}q.     (8.16)

We have that

    a⁺a⁻ = −β⁻¹∂_p² + p∂_p   and   b⁺b⁻ = −ω₀⁻²β⁻¹∂_q² + q∂_q.

Consequently, the operator

    L̂ = −a⁺a⁻ − b⁺b⁻                                                (8.17)

is the generator of the OU process in two dimensions. The operators a±, b± satisfy the commutation relations

    [a⁺, a⁻] = −1,                                                  (8.18a)
    [b⁺, b⁻] = −1,                                                  (8.18b)
    [a±, b±] = 0.                                                   (8.18c)

See Exercise 3. Using now the operators a± and b± we can write the generator L in the form

    L = −γa⁺a⁻ − ω₀(b⁺a⁻ − a⁺b⁻),                                   (8.19)

which is a particular case of (8.6). In order to calculate the eigenvalues and eigenfunctions of (8.19) we need to make an appropriate change of variables in order to bring the operator L into the "decoupled" form (8.17). Clearly, this is a linear transformation and can be written in the form Y = AX, where X = (q, p), for some 2 × 2 matrix A. It is somewhat easier to make this change of variables at the level of the creation and annihilation operators. Our goal is to find first order differential operators c± and d± so that the operator (8.19) becomes

    L = −Cc⁺c⁻ − Dd⁺d⁻                                              (8.20)
for some appropriate constants C and D. Since our goal is, essentially, to map L to the two-dimensional OU process, we require that the operators c^± and d^± satisfy the canonical commutation relations

    [c^+, c^−] = −1,                                                   (8.21a)
    [d^+, d^−] = −1,                                                   (8.21b)
    [c^±, d^±] = 0.                                                    (8.21c)

The operators c^± and d^± should be given as linear combinations of the old operators a^± and b^±. From the structure of the generator L (8.19), the decoupled form (8.20) and the commutation relations (8.21) and (8.18), we conclude that c^± and d^± should be of the form

    c^+ = α_{11} a^+ + α_{12} b^+,                                     (8.22a)
    c^− = α_{21} a^− + α_{22} b^−,                                     (8.22b)
    d^+ = β_{11} a^+ + β_{12} b^+,                                     (8.22c)
    d^− = β_{21} a^− + β_{22} b^−.                                     (8.22d)

Notice that c^− and d^− are not the adjoints of c^+ and d^+. If we substitute these expressions into (8.20) and equate it with (8.19), and into the commutation relations (8.21), we obtain a system of equations for the coefficients {α_{ij}}, {β_{ij}}. In order to write down the formulas for these coefficients it is convenient to introduce the eigenvalues of the deterministic problem

    q̈ = −γ q̇ − ω_0^2 q.

The solution of this equation is

    q(t) = C_1 e^{−λ_1 t} + C_2 e^{−λ_2 t}

with

    λ_{1,2} = (γ ± δ)/2,   δ = √(γ^2 − 4ω_0^2).                        (8.23)

The eigenvalues satisfy the relations

    λ_1 + λ_2 = γ,   λ_1 − λ_2 = δ,   λ_1 λ_2 = ω_0^2.                 (8.24)
Proposition 8.3.1. Let L be the generator (8.19) and let c^±, d^± be the operators

    c^+ = (1/√δ) (√λ_1 a^+ + √λ_2 b^+),                                (8.25a)
    c^− = (1/√δ) (√λ_1 a^− − √λ_2 b^−),                                (8.25b)
    d^+ = (1/√δ) (√λ_2 a^+ + √λ_1 b^+),                                (8.25c)
    d^− = (1/√δ) (−√λ_2 a^− + √λ_1 b^−).                               (8.25d)

Then c^±, d^± satisfy the canonical commutation relations (8.21), as well as

    [L, c^±] = ∓λ_1 c^±,   [L, d^±] = ∓λ_2 d^±.                        (8.26)

Furthermore, the operator L can be written in the form

    L = −λ_1 c^+ c^− − λ_2 d^+ d^−.                                    (8.27)

Proof. First we check the commutation relations:

    [c^+, c^−] = (1/δ) (λ_1 [a^+, a^−] − λ_2 [b^+, b^−]) = (1/δ)(λ_2 − λ_1) = −1,

    [d^+, d^−] = (1/δ) (−λ_2 [a^+, a^−] + λ_1 [b^+, b^−]) = (1/δ)(−λ_1 + λ_2) = −1.

Similarly,

    [c^+, d^−] = (1/δ) (−√(λ_1 λ_2) [a^+, a^−] + √(λ_1 λ_2) [b^+, b^−])
               = (1/δ) (√(λ_1 λ_2) − √(λ_1 λ_2)) = 0.

Clearly, we also have [c^+, d^+] = [c^−, d^−] = 0.
Now we calculate

    −λ_1 c^+ c^− − λ_2 d^+ d^−
        = −((λ_1^2 − λ_2^2)/δ) a^+ a^− + 0 · b^+ b^−
          + (√(λ_1 λ_2)/δ)(λ_1 − λ_2) a^+ b^− + (√(λ_1 λ_2)/δ)(−λ_1 + λ_2) b^+ a^−
        = −γ a^+ a^− − ω_0 (b^+ a^− − a^+ b^−),                        (8.28)

which is precisely (8.19). In the above calculation we used (8.24). Finally:

    [L, c^+] = −λ_1 c^+ c^− c^+ + λ_1 c^+ c^+ c^−
             = −λ_1 c^+ (1 + c^+ c^−) + λ_1 c^+ c^+ c^− = −λ_1 c^+,

and similarly for the other equations in (8.26).

Using now (8.26) we can readily obtain the eigenvalues and eigenfunctions of L. Indeed, we have

    [L, (c^+)^2] = L (c^+)^2 − (c^+)^2 L = (c^+ L − λ_1 c^+) c^+ − (c^+)^2 L
                 = c^+ (c^+ L − λ_1 c^+) − λ_1 (c^+)^2 − (c^+)^2 L = −2λ_1 (c^+)^2,

and similarly [L, (d^+)^2] = −2λ_2 (d^+)^2. A simple induction argument now shows that (see Exercise 8.3.4)

    [L, (c^+)^n] = −n λ_1 (c^+)^n   and   [L, (d^+)^m] = −m λ_2 (d^+)^m.   (8.29)

From our experience with the two-dimensional OU process (or the Schrödinger operator for the two-dimensional quantum harmonic oscillator), we expect that the eigenfunctions should be tensor products of Hermite polynomials. Indeed, we have the following, which is the main result of this section.

Theorem 8.3.2. The eigenvalues and eigenfunctions of the generator of the Markov process {q, p} (8.11) are

    λ_{nm} = λ_1 n + λ_2 m = (1/2) γ (n + m) + (1/2) δ (n − m),   n, m = 0, 1, ...,   (8.30)

and

    φ_{nm}(q, p) = (1/√(n! m!)) (c^+)^n (d^+)^m 1,   n, m = 0, 1, ....   (8.31)
Proof. We use (8.29) to calculate

    L (c^+)^n (d^+)^m 1 = (c^+)^n L (d^+)^m 1 − n λ_1 (c^+)^n (d^+)^m 1
        = (c^+)^n (d^+)^m L 1 − m λ_2 (c^+)^n (d^+)^m 1 − n λ_1 (c^+)^n (d^+)^m 1
        = −(n λ_1 + m λ_2) (c^+)^n (d^+)^m 1,

since L1 = 0, from which (8.30) and (8.31) follow.

Exercise 8.3.4. Show that

    [L, (c^±)^n] = ∓n λ_1 (c^±)^n,   [L, (d^±)^n] = ∓n λ_2 (d^±)^n,

    [c^−, (c^+)^n] = n (c^+)^{n−1},   [d^−, (d^+)^n] = n (d^+)^{n−1}.

Remark 8.3.3. In terms of the operators a^±, b^± the eigenfunctions of L are

    φ_{nm} = √(n! m!) δ^{−(n+m)/2} λ_1^{n/2} λ_2^{m/2}
             Σ_{ℓ=0}^{n} Σ_{k=0}^{m} (1 / (k!(m−k)! ℓ!(n−ℓ)!)) (λ_1/λ_2)^{(k−ℓ)/2} (a^+)^{n+m−k−ℓ} (b^+)^{ℓ+k} 1.

The first few eigenfunctions are

    φ_{00} = 1,
    φ_{10} = √(β/δ) (√λ_1 p + √λ_2 ω_0 q),
    φ_{01} = √(β/δ) (√λ_2 p + √λ_1 ω_0 q),
    φ_{11} = (1/δ) (−2√(λ_1 λ_2) + √(λ_1 λ_2) β p^2 + (λ_1 + λ_2) ω_0 β p q + √(λ_1 λ_2) ω_0^2 β q^2),
    φ_{20} = (1/(√2 δ)) (−λ_1 − λ_2 + λ_1 β p^2 + 2√(λ_1 λ_2) ω_0 β p q + λ_2 ω_0^2 β q^2),
    φ_{02} = (1/(√2 δ)) (−λ_1 − λ_2 + λ_2 β p^2 + 2√(λ_1 λ_2) ω_0 β p q + λ_1 ω_0^2 β q^2).
Notice that the eigenfunctions are not orthonormal. As we already know, the first eigenvalue, corresponding to the constant eigenfunction, is zero: λ_{00} = 0. Notice also that the operator L is not self-adjoint and consequently we do not expect its eigenvalues to be real. Whether the eigenvalues are real or not depends on the sign of the discriminant Δ = γ^2 − 4ω_0^2. In the underdamped regime, γ < 2ω_0, the eigenvalues are complex:

    λ_{nm} = (1/2) γ (n + m) + (1/2) i √(−γ^2 + 4ω_0^2) (n − m),   γ < 2ω_0.

This is to be expected, since in the underdamped regime the dynamics is dominated by the deterministic Hamiltonian dynamics that give rise to the antisymmetric Liouville operator. We set

    ω = (1/2) √(4ω_0^2 − γ^2),   i.e.  δ = 2iω.

The eigenvalues can then be written as

    λ_{nm} = (γ/2)(n + m) + i ω (n − m).

In Figure 8.1 we present the first few eigenvalues of L in the underdamped regime. The eigenvalues are contained in a cone on the right half of the complex plane. The cone is determined by

    λ_{n0} = (γ/2) n + i ω n   and   λ_{0m} = (γ/2) m − i ω m.

The eigenvalues along the diagonal are real: λ_{nn} = γ n. On the other hand, in the overdamped regime, γ > 2ω_0, all eigenvalues are real:

    λ_{nm} = (1/2) γ (n + m) + (1/2) √(γ^2 − 4ω_0^2) (n − m),   γ > 2ω_0.

In fact, in the overdamped limit γ → +∞ (which we will study in Chapter ??), the eigenvalues of the generator L converge to the eigenvalues of the generator of the OU process:

    λ_{nm} = γ n + (ω_0^2/γ)(m − n) + O(γ^{−3}).

This is consistent with the fact that in this limit the solution of the Langevin equation converges to the solution of the OU SDE. See Chapter ?? for details.
Figure 8.1: First few eigenvalues of L for γ = ω = 1.
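The eigenvalue structure described above is easy to verify directly. The following hedged sketch (not from the text) computes λ_{nm} = λ₁n + λ₂m from (8.23) and checks the relations (8.24), the reality of the diagonal eigenvalues, and the overdamped expansion λ₂ ≈ ω₀²/γ; the parameter values are illustrative.

```python
import cmath

# lambda_{1,2} = (gamma +/- delta)/2, delta = sqrt(gamma^2 - 4 omega0^2),
# and lambda_nm = lambda_1 * n + lambda_2 * m, as in (8.23)-(8.24) and (8.30).
def spectrum(gamma, omega0, N=4):
    delta = cmath.sqrt(gamma**2 - 4.0 * omega0**2)
    l1, l2 = (gamma + delta) / 2.0, (gamma - delta) / 2.0
    return l1, l2, {(n, m): l1 * n + l2 * m for n in range(N) for m in range(N)}

# Underdamped regime (gamma < 2 omega0): complex spectrum, real on the diagonal.
l1, l2, lam = spectrum(gamma=1.0, omega0=1.0)
diag_real = all(abs(lam[(n, n)].imag) < 1e-12 for n in range(4))
pair_sum = l1 + l2        # should equal gamma     (8.24)
pair_prod = l1 * l2       # should equal omega0^2  (8.24)

# Overdamped limit gamma >> 1: lambda_2 = omega0^2/gamma + O(gamma^-3), so that
# lambda_nm = gamma*n + (omega0^2/gamma)*(m - n) + O(gamma^-3).
g = 50.0
_, l2_big, _ = spectrum(gamma=g, omega0=1.0)
err_overdamped = abs(l2_big.real - 1.0 / g)
print(diag_real, pair_sum, pair_prod, err_overdamped)
```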
The eigenfunctions of L do not form an orthonormal basis in L^2_β := L^2(R^2, Z^{−1} e^{−βH}), since L is not a self-adjoint operator. Using the eigenfunctions/eigenvalues of L we can easily calculate the eigenfunctions/eigenvalues of the L^2_β adjoint of L. From the calculations presented in Section 8.2 we know that the adjoint operator is

    L̂ := −A + γS = ω_0 (b^+ a^− − b^− a^+) − γ a^+ a^−                (8.32)
        = −λ_1 (c^−)^* (c^+)^* − λ_2 (d^−)^* (d^+)^*,                  (8.33)

where

    (c^+)^* = (1/√δ) (√λ_1 a^− + √λ_2 b^−),                            (8.35a)
    (c^−)^* = (1/√δ) (√λ_1 a^+ − √λ_2 b^+),                            (8.35b)
    (d^+)^* = (1/√δ) (√λ_2 a^− + √λ_1 b^−),                            (8.35c)
    (d^−)^* = (1/√δ) (−√λ_2 a^+ + √λ_1 b^+).                           (8.35d)

L̂ has the same eigenvalues as L:

    −L̂ ψ_{nm} = λ_{nm} ψ_{nm},                                        (8.34)

where λ_{nm} are given by (8.30). The eigenfunctions are

    ψ_{nm} = (1/√(n! m!)) ((c^−)^*)^n ((d^−)^*)^m 1.                   (8.36)

Proposition 8.3.5. The eigenfunctions of L and L̂ satisfy the biorthonormality relation

    ∫∫ φ_{nm} ψ_{ℓk} ρ_β dpdq = δ_{nℓ} δ_{mk}.                         (8.37)

Proof. We will use the commutator formulas from Exercise 8.3.4. Using them, together with the fact that c^− 1 = d^− 1 = 0, we can conclude that (for n ≥ ℓ)

    (c^−)^ℓ (c^+)^n 1 = n(n−1)···(n−ℓ+1) (c^+)^{n−ℓ} 1.                (8.38)

We have

    ∫∫ φ_{nm} ψ_{ℓk} ρ_β dpdq
        = (1/√(n! m! ℓ! k!)) ∫∫ (c^+)^n (d^+)^m 1 · ((c^−)^*)^ℓ ((d^−)^*)^k 1 · ρ_β dpdq
        = (n(n−1)···(n−ℓ+1) · m(m−1)···(m−k+1) / √(n! m! ℓ! k!)) ∫∫ (c^+)^{n−ℓ} (d^+)^{m−k} 1 · ρ_β dpdq
        = δ_{nℓ} δ_{mk},

since all eigenfunctions average to zero with respect to ρ_β.
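The biorthonormality relation (8.37) can be checked by Monte Carlo for the first-order eigenfunctions. The sketch below (a hedged illustration, not from the text) samples (q, p) from the Gaussian invariant measure in the overdamped regime γ > 2ω₀, where δ, λ₁, λ₂ are real, and estimates ⟨φ₁₀ψ₁₀⟩ ≈ 1 and ⟨φ₁₀ψ₀₁⟩ ≈ 0; the parameter values are illustrative.

```python
import math, random

# Biorthonormality check: phi_10 against psi_10 and psi_01, sampled under
# rho_beta (a product Gaussian with <p^2> = 1/beta, <q^2> = 1/(beta omega0^2)).
random.seed(1)
gamma, omega0, beta = 3.0, 1.0, 1.0
delta = math.sqrt(gamma**2 - 4.0 * omega0**2)
l1, l2 = (gamma + delta) / 2.0, (gamma - delta) / 2.0
c = math.sqrt(beta / delta)

def phi10(q, p):   # phi_10 = sqrt(beta/delta) (sqrt(l1) p + sqrt(l2) omega0 q)
    return c * (math.sqrt(l1) * p + math.sqrt(l2) * omega0 * q)

def psi10(q, p):   # psi_10 = (c^-)* 1 = sqrt(beta/delta) (sqrt(l1) p - sqrt(l2) omega0 q)
    return c * (math.sqrt(l1) * p - math.sqrt(l2) * omega0 * q)

def psi01(q, p):   # psi_01 = (d^-)* 1 = sqrt(beta/delta) (-sqrt(l2) p + sqrt(l1) omega0 q)
    return c * (-math.sqrt(l2) * p + math.sqrt(l1) * omega0 * q)

N = 200_000
s_same = s_cross = 0.0
for _ in range(N):
    p = random.gauss(0.0, 1.0 / math.sqrt(beta))
    q = random.gauss(0.0, 1.0 / (math.sqrt(beta) * omega0))
    s_same += phi10(q, p) * psi10(q, p)
    s_cross += phi10(q, p) * psi01(q, p)
same, cross = s_same / N, s_cross / N
print(same, cross)   # expect approximately 1 and 0
```

Analytically ⟨φ₁₀ψ₁₀⟩ = (λ₁ − λ₂)/δ = 1, since ⟨p²⟩ = β⁻¹ and ⟨q²⟩ = (βω₀²)⁻¹.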
From the eigenfunctions of L̂ we can obtain the eigenfunctions of the Fokker–Planck operator. Using the formula (see equation (8.4))

    L^* (f ρ_β) = ρ_β L̂ f,

we immediately conclude that the Fokker–Planck operator has the same eigenvalues as those of L and L̂. The eigenfunctions are

    ψ^*_{nm} = ρ_β ψ_{nm} = ρ_β (1/√(n! m!)) ((c^−)^*)^n ((d^−)^*)^m 1.   (8.39)

8.4 Asymptotic Limits for the Langevin Equation

There are very few SDEs/Fokker–Planck equations that can be solved explicitly. In most cases we need to study the problem under investigation either approximately or numerically. In this part of the course we will develop approximate methods for studying various stochastic systems of practical interest. There are many problems of physical interest that can be analyzed using techniques from perturbation theory and asymptotic analysis:

i. Small noise asymptotics at finite time intervals.
ii. Small noise asymptotics/large times (rare events): the theory of large deviations, escape from a potential well, exit time problems.
iii. Small and large friction asymptotics for the Fokker–Planck equation: the Freidlin–Wentzell (underdamped) and Smoluchowski (overdamped) limits.
iv. Large time asymptotics for the Langevin equation in a periodic potential: homogenization and averaging.
v. Stochastic systems with two characteristic time scales: multiscale problems and methods.

We will study various asymptotic limits for the Langevin equation (we have set m = 1)

    q̈ = −∇V(q) − γ q̇ + √(2γβ^{−1}) Ẇ.                                (8.40)
We want to study the qualitative behavior of solutions to this equation (and to the corresponding Fokker–Planck equation). There are two parameters in the problem, the friction coefficient γ and the inverse temperature β. There are various asymptotic limits at which we can eliminate some of the variables of the equation and obtain a simpler equation for fewer variables.

In many applications (especially in biology) the friction coefficient is large: γ ≫ 1. In this case the momentum is the fast variable, which we can eliminate to obtain an equation for the position. This is the overdamped or Smoluchowski limit. In various problems in physics the friction coefficient is small: γ ≪ 1. In this case the position is the fast variable, whereas the energy is the slow variable; we can eliminate the position and obtain an equation for the energy. This is the underdamped or Freidlin–Wentzell limit. In both cases we have to look at sufficiently long time scales. The large and small friction asymptotics can be expressed in terms of a slow/fast system of SDEs.

In the large temperature limit, β ≪ 1, the dynamics of (8.40) is dominated by diffusion: the Langevin equation (8.40) can be approximated by free Brownian motion,

    q̈ = −γ q̇ + √(2γβ^{−1}) Ẇ.

The small temperature asymptotics, β ≫ 1, is much more interesting and more subtle. It leads to exponential, Arrhenius type asymptotics for the reaction rate (in the case of a particle escaping from a potential well due to thermal noise) or for the diffusion coefficient (in the case of a particle moving in a periodic potential in the presence of thermal noise):

    κ = ν exp(−β E_b),                                                 (8.41)

where κ can be either the reaction rate or the diffusion coefficient. The small temperature asymptotics will be studied later for the case of a bistable potential (reaction rate) and for the case of a periodic potential (diffusion coefficient).

Assuming that the temperature is fixed, the only parameter that is left is the friction coefficient γ. We rescale the solution to (8.40):

    q^γ(t) = λ_γ q(t/μ_γ).

This rescaled process satisfies the equation

    q̈^γ = −(λ_γ/μ_γ^2) ∂_q V(q^γ/λ_γ) − (γ/μ_γ) q̇^γ + √(2γ λ_γ^2 μ_γ^{−3} β^{−1}) Ẇ.   (8.42)
Different choices for these two parameters lead to the overdamped and underdamped limits.

λ_γ = 1, μ_γ = γ, γ ≫ 1: in this case equation (8.42) becomes

    γ^{−2} q̈^γ = −∂_q V(q^γ) − q̇^γ + √(2β^{−1}) Ẇ.                   (8.43)

Under this scaling, the interesting limit is the overdamped limit, γ → +∞. We will see later that in the limit as γ → +∞ the solution to (8.43) can be approximated by the solution to

    q̇ = −∂_q V + √(2β^{−1}) Ẇ.

λ_γ = 1, μ_γ = γ^{−1}, γ ≪ 1: in this case equation (8.42) becomes

    q̈^γ = −γ^{−2} ∂_q V(q^γ) − q̇^γ + √(2γ^{−2} β^{−1}) Ẇ.            (8.44)

Under this scaling the interesting limit is the underdamped limit, γ → 0. We will see later that in the limit as γ → 0 the energy of the solution to (8.44) converges to a stochastic process on a graph.

8.4.1 The Overdamped Limit

We consider the rescaled Langevin equation (8.43):

    ε^2 q̈^γ(t) = −∇V(q^γ(t)) − q̇^γ(t) + √(2β^{−1}) Ẇ(t),             (8.45)

where we have set ε^{−1} = γ, since we are interested in the limit γ → ∞, i.e. ε → 0. We will show that, in the limit as ε → 0, q^γ(t), the solution of the Langevin equation (8.45), converges to q(t), the solution of the Smoluchowski equation

    q̇ = −∇V + √(2β^{−1}) Ẇ.                                           (8.46)

We write (8.45) as a system of SDEs:

    q̇ = (1/ε) p,                                                      (8.47)
    ṗ = −(1/ε) ∇V(q) − (1/ε^2) p + √(2/(β ε^2)) Ẇ.                    (8.48)

This system of SDEs defines a Markov process in phase space. Its generator is

    L_ε = (1/ε^2) (−p · ∇_p + β^{−1} Δ_p) + (1/ε) (p · ∇_q − ∇_q V · ∇_p)
        =: (1/ε^2) L_0 + (1/ε) L_1.
This is a singularly perturbed differential operator. We will derive the Smoluchowski equation (8.46) using a pathwise technique, as well as by analyzing the corresponding Kolmogorov equations. We apply Itô's formula to p:

    dp(t) = L_ε p(t) dt + (1/ε) √(2β^{−1}) ∂_p p(t) dW
          = −(1/ε^2) p(t) dt − (1/ε) ∇_q V(q(t)) dt + (1/ε) √(2β^{−1}) dW.

Consequently:

    (1/ε) ∫_0^t p(s) ds = −∫_0^t ∇_q V(q(s)) ds + √(2β^{−1}) W(t) + O(ε).

From equation (8.47) we have that

    q(t) = q(0) + (1/ε) ∫_0^t p(s) ds.

Combining the above two equations we deduce

    q(t) = q(0) − ∫_0^t ∇_q V(q(s)) ds + √(2β^{−1}) W(t) + O(ε),

from which (8.46) follows.

Notice that in this derivation we assumed that E|p(t)|^2 ≤ C. This estimate is true, under appropriate assumptions on the potential V(q) and on the initial conditions. In fact, we can prove a pathwise approximation result:

    (E sup_{t∈[0,T]} |q^γ(t) − q(t)|^p)^{1/p} ≤ C ε^{2−κ},

where κ > 0 is arbitrarily small (it accounts for logarithmic corrections).

The pathwise derivation of the Smoluchowski equation implies that the solution of the Fokker–Planck equation corresponding to the Langevin equation (8.45) converges (in some appropriate sense, to be explained below) to the solution of the Fokker–Planck equation corresponding to the Smoluchowski equation (8.46). It is important in various applications to calculate corrections to the limiting Fokker–Planck equation. We can accomplish this by analyzing the Fokker–Planck equation for (8.45) using singular perturbation theory.
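The pathwise convergence can be illustrated numerically. In the hedged sketch below (not from the text), the system (8.47)–(8.48) with the harmonic potential V(q) = q²/2 is driven by the same Brownian increments as the limiting Smoluchowski equation (8.46); the sup-norm discrepancy should shrink as ε decreases. All parameter values are illustrative.

```python
import math, random

# Same-noise comparison of the rescaled Langevin system (8.47)-(8.48) with the
# Smoluchowski SDE (8.46) for V(q) = q^2/2, beta = 1.
def sup_error(eps, dW, dt=1.0e-4):
    q, p = 1.0, 0.0            # Langevin initial conditions
    qs = 1.0                   # Smoluchowski initial condition
    err = 0.0
    for w in dW:
        q, p = q + (p / eps) * dt, \
               p + (-q / eps - p / eps**2) * dt + math.sqrt(2.0 / eps**2) * w
        qs = qs - qs * dt + math.sqrt(2.0) * w
        err = max(err, abs(q - qs))
    return err

random.seed(2)
n = 10_000                      # T = 1 with dt = 1e-4
dW = [random.gauss(0.0, math.sqrt(1.0e-4)) for _ in range(n)]
err_big, err_small = sup_error(0.3, dW), sup_error(0.1, dW)
print(err_big, err_small)       # the epsilon = 0.1 error should be the smaller one
```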
We will consider the problem in one dimension, mainly to simplify the notation; the multi-dimensional problem can be treated in a very similar way.

The Fokker–Planck equation associated to equations (8.47) and (8.48) is

    ∂ρ/∂t = L^* ρ
          = (1/ε)(−p ∂_q ρ + ∂_q V(q) ∂_p ρ) + (1/ε^2)(∂_p(p ρ) + β^{−1} ∂_p^2 ρ)
          =: ((1/ε^2) L_0^* + (1/ε) L_1^*) ρ.                          (8.49)

The invariant distribution of the Markov process {q, p}, if it exists, is the Maxwell–Boltzmann distribution

    ρ_β(q, p) = (1/Z) e^{−βH(p,q)},   Z = ∫_{R^2} e^{−βH(p,q)} dpdq,   (8.50)

where H(p, q) = (1/2) p^2 + V(q). We define the function f(p, q, t) through

    ρ(p, q, t) = f(p, q, t) ρ_β(p, q).                                 (8.51)

Proposition 8.4.1. The function f(p, q, t) defined in (8.51) satisfies the equation

    ∂f/∂t = ((1/ε^2)(−p ∂_p + β^{−1} ∂_p^2) − (1/ε)(p ∂_q − ∂_q V(q) ∂_p)) f
          =: ((1/ε^2) L_0 − (1/ε) L_1) f.                              (8.52)

Remark 8.4.2. This is "almost" the backward Kolmogorov equation, with the difference that we have −L_1 instead of +L_1. This is related to the fact that L_0 is a symmetric operator in L^2(R^2; Z^{−1} e^{−βH(p,q)}), whereas L_1 is antisymmetric.

Proof. We note that L_0^* ρ_β = 0 and L_1^* ρ_β = 0. We use this to calculate:

    L_0^* ρ = L_0^*(f ρ_β) = ∂_p(p f ρ_β) + β^{−1} ∂_p^2(f ρ_β)
            = ρ_β p ∂_p f + ρ_β β^{−1} ∂_p^2 f + f L_0^* ρ_β + 2β^{−1} ∂_p f ∂_p ρ_β
            = (−p ∂_p f + β^{−1} ∂_p^2 f) ρ_β = ρ_β L_0 f.

Similarly,

    L_1^* ρ = L_1^*(f ρ_β) = (−p ∂_q + ∂_q V ∂_p)(f ρ_β)
            = ρ_β (−p ∂_q f + ∂_q V ∂_p f) = −ρ_β L_1 f.

Consequently, the Fokker–Planck equation (8.49) becomes

    ρ_β ∂f/∂t = ρ_β ((1/ε^2) L_0 f − (1/ε) L_1 f),

from which the claim follows.
We will assume that the initial conditions for (8.52) depend only on q:

    f(p, q, 0) = f_{ic}(q).                                            (8.53)

Another way of stating this assumption is the following. Let H = L^2(R^{2d}; ρ_β(p, q)) and define the projection operator P : H → L^2(R^d; ρ̄_β(q)), with ρ̄_β(q) = (1/Z_q) e^{−βV(q)}, Z_q = ∫_{R^d} e^{−βV(q)} dq:

    P · := (1/Z_p) ∫_{R^d} · e^{−β p^2/2} dp,   Z_p = ∫_{R^d} e^{−β p^2/2} dp.

Then assumption (8.53) can be written as P f_{ic} = f_{ic}.

We look for a solution to (8.52) in the form of a truncated power series in ε:

    f(p, q, t) = Σ_{n=0}^{N} ε^n f_n(p, q, t).                         (8.54)

We substitute this expansion into eqn. (8.52) to obtain the following system of equations:

    L_0 f_0 = 0,                                                       (8.55a)
    −L_0 f_1 = −L_1 f_0,                                               (8.55b)
    −L_0 f_2 = −L_1 f_1 − ∂f_0/∂t,                                     (8.55c)
    −L_0 f_n = −L_1 f_{n−1} − ∂f_{n−2}/∂t,   n = 3, 4, ..., N.         (8.55d)

The null space of L_0 consists of constants in p. Consequently, from equation (8.55a) we conclude that

    f_0 = f(q, t).
Now we can calculate the right-hand side of equation (8.55b):

    L_1 f_0 = p ∂_q f.

Thus equation (8.55b) becomes:

    L_0 f_1 = p ∂_q f.

The right-hand side of this equation is orthogonal to N(L_0^*) and consequently there exists a unique solution. We obtain this solution using separation of variables:

    f_1 = −p ∂_q f + ψ_1(q, t).

Now we can calculate the right-hand side of the equation for f_2. We need to calculate L_1 f_1:

    −L_1 f_1 = −(p ∂_q − ∂_q V ∂_p)(−p ∂_q f + ψ_1(q, t))
             = p^2 ∂_q^2 f − p ∂_q ψ_1 − ∂_q V ∂_q f.

The solvability condition for (8.55c),

    ∫_R (−L_1 f_1 − ∂f_0/∂t) ρ_{OU}(p) dp = 0,

leads, using ∫ p^2 ρ_{OU}(p) dp = β^{−1}, to the backward Kolmogorov equation corresponding to the Smoluchowski SDE:

    ∂f/∂t = −∂_q V ∂_q f + β^{−1} ∂_q^2 f,                             (8.56)

together with the initial condition (8.53). Now we solve the equation for f_2. We use (8.56) to write (8.55c) in the form

    L_0 f_2 = (β^{−1} − p^2) ∂_q^2 f + p ∂_q ψ_1.

The solution of this equation is

    f_2(p, q, t) = (1/2) ∂_q^2 f(q, t) p^2 − ∂_q ψ_1(q, t) p + ψ_2(q, t).

Now we calculate the right-hand side of the equation for f_3. First we calculate

    L_1 f_2 = (1/2) p^3 ∂_q^3 f − p^2 ∂_q^2 ψ_1 + p ∂_q ψ_2 − p ∂_q V ∂_q^2 f + ∂_q V ∂_q ψ_1.

The solvability condition

    ∫_R (∂ψ_1/∂t + L_1 f_2) ρ_{OU}(p) dp = 0

leads to the equation

    ∂ψ_1/∂t = −∂_q V ∂_q ψ_1 + β^{−1} ∂_q^2 ψ_1,
together with the initial condition ψ_1(q, 0) = 0. Multiplying this equation by ψ_1, integrating with respect to the measure ρ̄_β(q) = (1/Z_q) e^{−βV(q)}, and using Poincaré's inequality for this measure, we deduce that

    (1/2) d/dt ‖ψ_1‖^2 ≤ −C ‖ψ_1‖^2.

We use Gronwall's inequality now to conclude that ψ_1 ≡ 0.

Putting everything together we obtain the first two terms in the ε-expansion of the solution of the Fokker–Planck equation (8.49):

    ρ(p, q, t) = Z^{−1} e^{−βH(p,q)} (f + ε(−p ∂_q f) + O(ε^2)),

where f is the solution of (8.56). Notice that we can rewrite the leading order term of the expansion in the form

    ρ(p, q, t) = (2πβ^{−1})^{−1/2} e^{−β p^2/2} ρ_V(q, t) + O(ε),

where ρ_V = Z^{−1} e^{−βV(q)} f is the solution of the Smoluchowski Fokker–Planck equation (see also the calculations presented in the proof of Theorem 6.5)

    ∂ρ_V/∂t = ∂_q(∂_q V ρ_V) + β^{−1} ∂_q^2 ρ_V.

It is possible to expand the n-th term in the expansion (8.54) in terms of Hermite functions (the eigenfunctions of the generator of the OU process),

    f_n(p, q, t) = Σ_{k=0}^{n} f_{nk}(q, t) φ_k(p),                    (8.57)

where φ_k(p) is the k-th eigenfunction of L_0:

    −L_0 φ_k = λ_k φ_k.

We can obtain the following system of equations (with L̂ := β^{−1} ∂_q − ∂_q V):

    L̂ f_{n,1} = 0,
    √((k+1)β) L̂ f_{n,k+1} + √(k β^{−1}) ∂_q f_{n,k−1} = −k f_{n+1,k},   k = 1, ..., n − 1,
    √(n β^{−1}) ∂_q f_{n,n−1} = −n f_{n+1,n},
    √((n+1) β^{−1}) ∂_q f_{n,n} = −(n+1) f_{n+1,n+1}.

Using this method we can obtain the first three terms in the expansion:

    ρ(x, y, t) = ρ_0(p, q) ( f + ε(−√(β^{−1}) ∂_q f φ_1)
                 + ε^2 ((β^{−1}/√2) ∂_q^2 f φ_2 + f_{20})
                 + ε^3 (−√(β^{−3}/3!) ∂_q^3 f φ_3 + (−√(β^{−1}) L̂ ∂_q f − √(β^{−1}) ∂_q f_{20}) φ_1) )
                 + O(ε^4).
(8. to leading order. q) = 2(H − V (q)). in order to study the γ → 0 limit we need to analyze the following fast/slow system of SDEs ˙ H = β −1 − p2 + ˙ 2β −1 p2 W ˙ 2β −1 W . We apply Itˆ ’s formula to the Hamiltonian of the system to obtain o ˙ H = β −1 − p2 + with p2 = p2 (H. (8.58) This is the equation for an O(1/γ) Hamiltonian system perturbed by O(1) noise.ε = γ.2 The Underdamped Limit Consider now the rescaling λγ. We expect that.ε = 1.Using this method we can obtain the ﬁrst three terms in the expansion: ρ(x.4.60) The limiting SDE lives on the graph associated with the Hamiltonian system.58) as system of two equations q γ = γ −1 pγ . the energy is conserved. since it is conserved for the Hamiltonian system. whereas the momentum (or position) is the fast variable. 159 . ˙ pγ = −γ −1 V (q γ ) − pγ + ˙ ˙ 2β −1 W . y. The Langevin equation becomes q γ = −γ −2 V (q γ ) − q γ + ¨ ˙ We write equation (8. Assuming that we can average over the Hamiltonian dynamics. β −1 ∂q f φ1 ) + ε2 β −1 2 √ ∂q f φ2 + f20 2 β −1 ∂q f20 φ1 β −3 3 2 ∂ f φ3 + − β −1 L∂q f − 3! q 8. µγ. we obtain the limiting SDE for the Hamiltonian: ˙ H = β −1 − p2 + ˙ 2β −1 p2 W . Thus.59b) ˙ 2β −1 p2 W pγ = −γ −1 V (q γ ) − pγ + ˙ The Hamiltonian is the slow variable. (8. t) = ρ0 (p. ˙ 2γ −2 β −1 W .59a) (8. q) f + ε(− +ε3 − +O(ε4 ). The domain of deﬁnition of the limiting Markov process is deﬁned through appropriate boundary conditions (the gluing conditions) at the interior vertices of the graph.
.. pγ } is 2 Lγ = γ −1 (p∂q − ∂q V ∂p ) − p∂p + β −1 ∂p = γ −1 L0 + L1 .We identify all points belonging to the same connected component of the a level curve {x : H(x) = H}.62c) .. . . . Let Ii . The generator of the process {q γ . L0 u1 = −L1 u1 + L0 u2 = −L1 u1 + . d be the edges of the graph. ∂t ∂u1 .62a) (8. γ (8. x = (q.62b) (8. Then (i.61) We look for a solution in the form of a power series expansion in ε: uγ = u0 + γu1 + γ 2 u2 + . This means that the null space of L0 consists of functions of the Hamiltonian: N (L0 ) = functions ofH . p).. t))). It satisﬁes the backward Kolmogorov equation associated to the process {q γ . ∂t (8. 160 (8.. q.. We substitute this ansatz into (8. q. Interior vertices correspond to separatrices. pγ }: ∂uγ = ∂t 1 L 0 + L 1 uγ . Notice that the operator L0 is the backward Liouville operator of the Hamiltonian system with Hamiltonian 1 H = p2 + V (q). We will study the small γ asymptotics by analyzing the corresponding backward Kolmogorov equation using singular perturbation theory.61) and equate equal powers in ε to obtain the following sequence of equations: L0 u0 = 0. H) deﬁnes a global coordinate system on the graph.. Each point on the edges of the graph correspond to a trajectory.63) ∂u0 . t). i = 1. . .. Let uγ = E(f (pγ (p. q γ (p. 2 We assume that there are no integrals of motion other than the Hamiltonian.
Let us now analyze equations (8.62). We start with (8.62a); eqn. (8.63) implies that u_0 depends on q, p through the Hamiltonian function H:

    u_0 = u(H(p, q), t).                                               (8.64)

Now we proceed with (8.62b). For this we need to find the solvability condition for equations of the form

    L_0 u = f.                                                         (8.65)

We multiply it by an arbitrary smooth function of H(p, q), integrate over R^2 and use the skew-symmetry of the Liouville operator L_0 to deduce:[1]

    ∫_{R^2} L_0 u F(H(p, q)) dpdq = ∫_{R^2} u (−L_0 F(H(p, q))) dpdq = 0,   ∀F ∈ C_b^∞(R).

This implies that the solvability condition for equation (8.65) is that

    ∫_{R^2} f(p, q) F(H(p, q)) dpdq = 0,   ∀F ∈ C_b^∞(R).              (8.66)

We use the solvability condition in (8.62b) to obtain that

    ∫_{R^2} (−L_1 u_0 + ∂u_0/∂t) F(H(p, q)) dpdq = 0.                  (8.67)

To proceed, we need to understand how L_1 acts on functions of H(p, q). Let φ = φ(H(p, q)). We have that

    ∂φ/∂p = (∂H/∂p)(∂φ/∂H) = p ∂φ/∂H

and

    ∂^2φ/∂p^2 = (∂/∂p)(p ∂φ/∂H) = ∂φ/∂H + p^2 ∂^2φ/∂H^2.

The above calculations imply that, when L_1 acts on functions φ = φ(H(p, q)), it becomes

    L_1 = (β^{−1} − p^2) ∂_H + β^{−1} p^2 ∂_H^2,                       (8.68)

[1] We assume that both u_1 and F decay to 0 as |p| → ∞ to justify the integration by parts that follows.
Notice that the noise that appears in the limiting equation (8. q) = ∂(H.67) and go from (p. The Jacobian of the transformation is: ∂(p. q) We use this. q) = 2(H − V (q)). together with (8. ∂H p(H. E) dq 162 .where p2 = p2 (H. ∂u 2 = β −1 − p−1 −1 p ∂H u + γ p−1 −1 p β −1 ∂H u. (8. H. which is the ”slow variable”.68). q) ∂p ∂H ∂q ∂H ∂p ∂q ∂q ∂q = 1 ∂p = . q) to p. the action and frequency are deﬁned as I(E) = p(q.69) p. contrary to the additive noise in the Langevin equation. to rewrite eqn. From this equation we can read off the limiting SDE for the Hamiltonian: ˙ ˙ H = b(H) + σ(H)W where b(H) = β −1 − p−1 −1 ∂u 2 = β −1 p−1 − p ∂H u + p β −1 ∂H u. ∂t (8. ∂t Thus.67) as ∂u 2 + (β −1 − p2 )∂H + β −1 p2 ∂H u F (H)p−1 (H. and this requirement leads to the differential equation p−1 or. We want to change variables in the integral (8. q) dHdq = 0. As it well known from classical mechanics. The integration over q can be performed ”explicitly”: ∂u −1 2 + (β −1 p−1 − p )∂H + β −1 p ∂H u F (H) dH = 0. ∂t We introduce the notation · := · dq. σ(H) = β −1 p−1 −1 p.69) is multiplicative. p ∂t This equation should be valid for every smooth function F (H). we have obtained the limiting backward Kolmogorov equation for the energy.
and

    ω(E) = 2π (dI/dE)^{−1},                                            (8.70)

respectively. Using the action and the frequency we can write the limiting Fokker–Planck equation for the distribution function of the energy in a very compact form.

Theorem 8.4.3. The limiting Fokker–Planck equation for the energy distribution function ρ(E, t) is

    ∂ρ/∂t = ∂/∂E ( I(E) (1 + β^{−1} ∂/∂E) (ω(E) ρ / 2π) ).

Proof. We notice that

    dI/dE = ∮ (∂p/∂E) dq = ∮ p^{−1} dq = ⟨p^{−1}⟩ = 2π/ω(E),

and consequently ⟨p^{−1}⟩^{−1}⟨p⟩ = I(E)ω(E)/2π. Hence, the limiting Fokker–Planck equation can be written as

    ∂ρ/∂t = −∂/∂E ((β^{−1} − Iω/2π) ρ) + β^{−1} ∂^2/∂E^2 ((Iω/2π) ρ)
          = −β^{−1} ∂ρ/∂E + ∂/∂E ((Iω/2π) ρ) + β^{−1} ∂/∂E ((dI/dE)(ω/2π) ρ + I ∂/∂E ((ω/2π) ρ))
          = ∂/∂E ((Iω/2π) ρ + β^{−1} I ∂/∂E (ωρ/2π))
          = ∂/∂E ( I(E) (1 + β^{−1} ∂/∂E) (ω(E)ρ/2π) ),

where we used β^{−1}(dI/dE)(ω/2π) = β^{−1}, which is precisely the claimed equation.

Remarks 8.4.4. i. We emphasize that the above formal procedure does not provide us with the boundary conditions for the limiting Fokker–Planck equation. We will discuss this issue in the next section.

ii. If we rescale back to the original timescale we obtain the equation

    ∂ρ/∂t = γ ∂/∂E ( I(E) (1 + β^{−1} ∂/∂E) (ω(E)ρ/2π) ).              (8.71)

We will use this equation later on to calculate the rate of escape from a potential barrier in the energy-diffusion-limited regime.
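The action and frequency entering (8.70)–(8.71) are easy to compute by quadrature. The hedged sketch below (not from the text) does this for the harmonic potential V(q) = q²/2 with ω₀ = 1, for which the exact answers are I(E) = 2πE/ω₀ and ω(E) = ω₀; the energy value chosen is illustrative.

```python
import math

# I(E) = closed-loop integral of p dq = 2 * integral over [-q_+, q_+] of
# sqrt(2(E - V(q))) dq, and omega(E) = 2*pi / (dI/dE), for V(q) = q^2/2.
def action(E, n=200_000):
    qmax = math.sqrt(2.0 * E)          # turning points at +/- sqrt(2E)
    h = 2.0 * qmax / n
    s = 0.0
    for i in range(n):                 # midpoint rule copes with the sqrt endpoints
        q = -qmax + (i + 0.5) * h
        s += math.sqrt(max(2.0 * E - q * q, 0.0)) * h
    return 2.0 * s

E = 0.7
I = action(E)
dE = 1.0e-3
omega = 2.0 * math.pi / ((action(E + dE) - action(E - dE)) / (2.0 * dE))
print(I, omega)   # expect approximately 2*pi*0.7 and 1.0
```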
8.5 Brownian Motion in Periodic Potentials

Basic model:

    m ẍ = −∇V(x(t)) − γ ẋ(t) + √(2γ k_B T) ξ(t).                      (8.72)

Goal: calculate the effective drift and the effective diffusion tensor,

    U_{eff} = lim_{t→∞} ⟨x(t)⟩ / t                                     (8.73)

and

    D_{eff} = lim_{t→∞} ⟨(x(t) − ⟨x(t)⟩) ⊗ (x(t) − ⟨x(t)⟩)⟩ / 2t.      (8.74)

8.5.1 The Langevin equation in a periodic potential

We start by studying the underdamped dynamics of a Brownian particle x(t) ∈ R^d moving in a smooth, periodic potential:

    ẍ = −∇V(x(t)) − γ ẋ(t) + √(2γ k_B T) ξ(t),                        (8.75)

where γ is the friction coefficient, k_B the Boltzmann constant and T denotes the temperature; ξ(t) stands for the standard d-dimensional white noise process, i.e.

    ⟨ξ_i(t)⟩ = 0   and   ⟨ξ_i(t) ξ_j(s)⟩ = δ_{ij} δ(t − s),   i, j = 1, ..., d.

The potential V(x) is periodic in x with period 1 in all spatial directions:

    V(x + ê_i) = V(x),   i = 1, ..., d,

where {ê_i}_{i=1}^d denotes the standard basis of R^d. Notice that we have already non-dimensionalized eqn. (8.75) in such a way that the non-dimensional particle mass is 1 and the maximum of the (gradient of the) potential is fixed, ‖∇V‖_{L^∞} = 1 [52]. Hence, the only parameters in the problem are the friction coefficient and the temperature. Notice, furthermore, that the parameter γ in (8.75) controls the coupling between the Hamiltonian system ẍ = −∇V(x) and the thermal heat bath: γ ≪ 1 corresponds to weak coupling, whereas γ ≫ 1 implies that the Hamiltonian system is strongly coupled to the heat bath. We will mostly focus on the one dimensional case.
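The quantity (8.74) can be estimated directly from trajectories. The hedged Monte Carlo sketch below (not from the text) simulates the one-dimensional Langevin dynamics in V(x) = cos x (a convenient 2π-periodic choice) and estimates D_eff from the mean-squared displacement of an ensemble; the parameter values and the choice of potential are illustrative assumptions.

```python
import math, random

# Mean-squared-displacement estimate of D_eff = lim <x^2>/(2t) for
# x'' = -V'(x) - gamma x' + sqrt(2 gamma kT) xi, with V(x) = cos(x),
# so -V'(x) = sin(x). By symmetry U_eff = 0, so x^2 can be used directly.
random.seed(4)
gamma, kT, dt, T, M = 1.0, 1.0, 0.01, 200.0, 100
nsteps = int(T / dt)
disp2 = 0.0
for _ in range(M):
    x, v = 0.0, random.gauss(0.0, math.sqrt(kT))   # Maxwellian initial velocity
    for _ in range(nsteps):
        dW = random.gauss(0.0, math.sqrt(dt))
        x, v = x + v * dt, \
               v + (math.sin(x) - gamma * v) * dt + math.sqrt(2.0 * gamma * kT) * dW
    disp2 += x * x
D_est = disp2 / M / (2.0 * T)
print(D_est)   # for an untilted periodic potential this lies below kT/gamma = 1
```

For an unbiased periodic potential the estimate should be positive but suppressed relative to the free-particle value k_BT/γ.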
Equation (8.75) defines a Markov process in the phase space T^d × R^d. Indeed, let us write (8.75) as a first order system:

    ẋ(t) = y(t),                                                       (8.76a)
    ẏ(t) = −∇V(x(t)) − γ y(t) + √(2γ k_B T) ξ(t).                      (8.76b)

The process {x(t), y(t)} is Markovian with generator

    L = y · ∇_x − ∇V(x) · ∇_y + γ (−y · ∇_y + D Δ_y),

where we have set D = k_B T. This process is ergodic. The unique invariant measure is absolutely continuous with respect to the Lebesgue measure and its density is the Maxwell–Boltzmann distribution

    ρ(y, x) = (1 / ((2πD)^{n/2} Z)) e^{−(1/D) H(x, y)},   Z = ∫_{T^d} e^{−V(x)/D} dx,   (8.77)

where H(x, y) = (1/2) |y|^2 + V(x) is the Hamiltonian of the system.

The long time behavior of solutions to (8.75) is governed by an effective Brownian motion. Indeed, the following central limit theorem holds [83, 70].

Theorem 8.5.1. Let V(x) ∈ C(T^d). Define the rescaled process

    x^ε(t) := ε x(t/ε^2).

Then x^ε(t) converges weakly, as ε → 0, to a Brownian motion with covariance

    D_{eff} = ∫_{T^d × R^d} −LΦ ⊗ Φ μ(dx dy),                          (8.78)

where μ(dx dy) = ρ(x, y) dxdy and the vector valued function Φ is the solution of the Poisson equation

    −LΦ = y.                                                           (8.79)

We are interested in analyzing the dependence of D_{eff} on γ. We start by rescaling the Langevin equation (8.75), which we write as (setting F(x) = −∇V(x) and β^{−1} = k_B T)

    ẍ = F(x) − γ ẋ + √(2γβ^{−1}) Ẇ.                                   (8.80)
Since we expect that at sufficiently long length and time scales the particle performs a purely diffusive motion, we perform a diffusive rescaling of the equations of motion (8.80): t → t/ε^2, x → x/ε. Using the fact that W(ct) = √c W(t) in law, we obtain:

    ε^2 ẍ = F(x/ε) − γ ẋ + √(2γβ^{−1}) ε Ẇ.

Introducing p = ε ẋ and q = x/ε we write this equation as a first order system:

    ẋ = (1/ε) p,
    ṗ = (1/ε^2) F(q) − (1/ε^2) γ p + √(2γβ^{−1}/ε^2) Ẇ,               (8.81)
    q̇ = (1/ε^2) p,

with the understanding that q ∈ [−π, π]^d and x, p ∈ R^d. We will assume that the potential is periodic with period 2π in every direction. Our goal now is to eliminate the fast variables p, q and to obtain an equation for the slow variable x. We shall accomplish this by studying the corresponding backward Kolmogorov equation using singular perturbation theory for partial differential equations.

Let

    u^ε(p, q, x, t) = E f(p(t), q(t), x(t); p(0) = p, q(0) = q, x(0) = x),

where E denotes the expectation with respect to the Brownian motion W(t) in the Langevin equation and f is a smooth function. The evolution of the function u^ε(p, q, x, t) is governed by the backward Kolmogorov equation associated to equations (8.81) [74]:[3]

    ∂u^ε/∂t = (1/ε) p · ∇_x u^ε + (1/ε^2) (−∇_q V(q) · ∇_p + p · ∇_q + γ(−p · ∇_p + β^{−1} Δ_p)) u^ε
            =: ((1/ε^2) L_0 + (1/ε) L_1) u^ε,                          (8.82)

where

    L_0 = −∇_q V(q) · ∇_p + p · ∇_q + γ(−p · ∇_p + β^{−1} Δ_p),   L_1 = p · ∇_x.

[3] It is more customary in the physics literature to use the forward Kolmogorov equation, i.e. the Fokker–Planck equation. However, for the calculation presented below it is more convenient to use the backward as opposed to the forward Kolmogorov equation. The two formulations are equivalent: we have that

    u^ε(p, q, x, t) = ∫ f(x, v, t) ρ(x, v, t; p, q) μ(p, q) dpdq dxdv,

where ρ is the solution of the Fokker–Planck equation and μ(p, q) is the initial distribution. See [72, Ch. 6] for details.
The invariant distribution of the fast process {q(t), p(t)} in T^d × R^d is the Maxwell–Boltzmann distribution

    ρ_β(q, p) = Z^{−1} e^{−βH(q,p)},   Z = ∫_{T^d × R^d} e^{−βH(q,p)} dqdp,

where H(q, p) = (1/2) |p|^2 + V(q). Indeed, we can readily check that

    L_0^* ρ_β(q, p) = 0,

where L_0^* denotes the Fokker–Planck operator, which is the L^2 adjoint of the generator of the process L_0:

    L_0^* f = ∇_q V(q) · ∇_p f − p · ∇_q f + γ (∇_p · (p f) + β^{−1} Δ_p f).

The null space of the generator L_0 consists of constants in q, p. Moreover, the equation

    −L_0 f = g,                                                        (8.83)

equipped with periodic boundary conditions with respect to q and such that

    ∫_{T^d × R^d} f^2 ρ_β(q, p) dqdp < ∞,                              (8.84)

has a unique (up to constants) solution if and only if

    ⟨g⟩_β := ∫_{T^d × R^d} g(q, p) ρ_β(q, p) dqdp = 0.                 (8.85)

These two conditions are sufficient to ensure existence and uniqueness of solutions (up to constants) of equation (8.83) [38, 39, 70].

We assume that the following ansatz for the solution u^ε holds:

    u^ε = u_0 + ε u_1 + ε^2 u_2 + ...,                                 (8.86)

with u_i = u_i(p, q, x, t), i = 1, 2, ..., being 2π-periodic in q and satisfying condition (8.84). We substitute (8.86) into (8.82) and equate equal powers in ε to obtain the following sequence of equations:

    L_0 u_0 = 0,                                                       (8.87a)
    L_0 u_1 = −L_1 u_0,                                                (8.87b)
    L_0 u_2 = −L_1 u_1 + ∂u_0/∂t.                                      (8.87c)
We solve it using separation of variables: u1 = Φ(p. let ξ be a unit vector in Rd . . (8. The limiting backward Kolmogorov equation is well posed since the diffusion tensor is nonnegative. q) · with −L0 Φ = p. i. q) dpdq ∂ 2 u0 .j=1 where the effective diffusion tensor is Dij = Td ×Rd d (8. t).j=1 Td ×Rd L1 u1 ρβ (p. Since p = 0.87). (8. q) dpdq. the above equation is wellposed. .88) x u0 This Poisson equation is posed on Td × Rd . Now we proceed with the third equation in (8.90) The calculation of the effective diffusion tensor requires the solution of the boundary value problem (8. We write it in the form ∂u0 ∂ 2 u0 = Dij ∂t ∂xi ∂xj i. We calculate (we use the notation Φξ = Φ · ξ and ·.90).87) becomes: L0 u1 = −p · x u0 . Dξ = (p · ξ)(Φξ )µβ dpdq = p Φξ 2 − L0 Φξ Φξ µβ dpdq 0. Hence. Now the second equation in (8. the right hand side of the above equation is meanzero with respect to the MaxwellBoltzmann distribution.89) pi Φj ρβ (p.From the ﬁrst equation in (8. · for the Euclidean inner product) ξ. We apply the solvability condition to obtain: ∂u0 = ∂t = i. d. ∂xi ∂xj This is the Backward Kolmogorov equation which governs the dynamics on large scales. The solution is periodic in q and satisﬁes condition (8.88) and the calculation of the integral in (8. Indeed. j = 1.91) = γβ −1 µβ dpdq 168 . . q) dpdq Td ×Rd d pi Φj ρβ (p.85). (8. since the null space of L0 consists of functions which are constants in p and q.87) we deduce that u0 = u0 (x.
We mention in passing that the analysis presented above can also be applied to the problem of Brownian motion in a tilted periodic potential. The Langevin equation becomes

ẍ(t) = −∇V(x(t)) + F − γ ẋ(t) + √(2γβ^{-1}) Ẇ(t),   (8.92)

where V(x) is periodic with period 2π and F is a constant force field. The generator of the corresponding Markov process is

L = p · ∇_q + (−∇_q V + F) · ∇_p + γ ( −p · ∇_p + β^{-1} Δ_p ).

The formulas for the effective drift and the effective diffusion tensor are

V̄ = ∫_{R^d × T^d} p ρ(q, p) dq dp   (8.93)

and

D = ∫_{R^d × T^d} (p − V̄) ⊗ φ ρ(p, q) dp dq,   (8.95)

where

−Lφ = p − V̄,   (8.94a)
L* ρ = 0,   ∫_{R^d × T^d} ρ(p, q) dp dq = 1.   (8.94b)

Here L* denotes the L² adjoint of the operator L, i.e. the Fokker–Planck operator, and ⊗ denotes the tensor product between two vectors. Equations (8.94) are equipped with periodic boundary conditions in q; the solution φ of the Poisson equation (8.94a) is also taken to be square integrable with respect to the invariant density: ∫ φ(q, p)² ρ(p, q) dp dq < +∞. The diffusion tensor is nonnegative definite; a calculation similar to the one used to derive (8.91) shows that

⟨ξ, Dξ⟩ = γβ^{-1} ∫ |∇_p φξ|² ρ(p, q) dp dq ≥ 0   (8.96)

for every vector ξ in R^d. Thus, from the multiscale analysis we conclude that at large length/time scales the particle which diffuses in a periodic potential performs an effective Brownian motion with a nonnegative diffusion tensor which is given by formula (8.90). The study of diffusion in a tilted periodic potential, in the underdamped regime and in high dimensions, based on the above formulas for V̄ and D, will be the subject of a separate publication.
8.5.2 Equivalence with the Green–Kubo Formula

Let us now show that the formula for the diffusion tensor obtained in the previous section, equation (8.90), is equivalent to the Green–Kubo formula (3.14). To simplify the notation we will prove the equivalence of the two formulas in one dimension; the generalization to arbitrary dimensions is immediate.

Let (x(t; q, p), v(t; q, p)), with v = ẋ, be the solution of the Langevin equation

ẍ = −∂_x V − γ ẋ + ξ

with initial conditions x(0; q, p) = q, v(0; q, p) = p, where ξ(t) stands for Gaussian white noise in one dimension with correlation function ⟨ξ(t)ξ(s)⟩ = 2γ k_B T δ(t − s). We assume that the (x, v) process is stationary, i.e. that the initial conditions are distributed according to the Maxwell–Boltzmann distribution ρβ(q, p) = Z^{-1} e^{-βH(p,q)}. The velocity autocorrelation function is [15, eq. 2.10]

⟨v(t; q, p) v(0; q, p)⟩ = ∫∫ v p ρ(x, v, t; q, p) ρβ(q, p) dq dp dx dv,   (8.97)

where ρ(x, v, t; q, p) is the solution of the Fokker–Planck equation

∂ρ/∂t = L* ρ,   ρ(x, v, 0; q, p) = δ(x − q) δ(v − p),   (8.98)

with L* ρ = −v ∂_x ρ + ∂_x V(x) ∂_v ρ + γ ( ∂_v (v ρ) + β^{-1} ∂_v² ρ ). We rewrite (8.97) in the form

⟨v(t; q, p) v(0; q, p)⟩ = ∫ p v̄(t; q, p) ρβ(q, p) dp dq,   (8.99)

where v̄(t; q, p) := ∫∫ v ρ(x, v, t; q, p) dv dx. The function v̄ satisfies the backward Kolmogorov equation which governs the evolution of observables [74, Ch. 6]:

∂v̄/∂t = L v̄,   v̄(0; q, p) = p.   (8.100)

We can write, formally, the solution of (8.100) as

v̄ = e^{Lt} p.   (8.101)

We combine now equations (8.99) and (8.101) and substitute into the Green–Kubo formula to obtain

D = ∫_0^∞ ⟨v(t; q, p) v(0; q, p)⟩ dt = ∫_0^∞ ⟨ p (e^{Lt} p) ⟩_β dt = ⟨ (−L^{-1} p) p ⟩_β = ∫_{−π}^{π} ∫ φ p ρβ dp dq,

where φ is the solution of the Poisson equation (8.88). In the above derivation we have used the formula −L^{-1} = ∫_0^∞ e^{Lt} dt, whose proof can be found in [74, Ch. 11].

8.6 The Underdamped and Overdamped Limits of the Diffusion Coefficient

In this section we derive approximate formulas for the diffusion coefficient which are valid in the overdamped, γ ≫ 1, and underdamped, γ ≪ 1, limits. The derivation of these formulas is based on the asymptotic analysis of the Poisson equation (8.88); we shall use singular perturbation theory for partial differential equations.

The Underdamped Limit

In this subsection we solve the Poisson equation (8.88) in one dimension perturbatively for small γ. The operator L0 that appears in (8.88) can be written in the form

L0 = L_H + γ L_OU,
where L_H stands for the (backward) Liouville operator associated with the Hamiltonian H(p, q) = p²/2 + V(q) and L_OU for the generator of the OU process, respectively:

L_H = p ∂_q − ∂_q V ∂_p,   L_OU = −p ∂_p + β^{-1} ∂_p².

We expect that the solution of the Poisson equation scales like γ^{-1} when γ ≪ 1. Thus, we look for a solution of the form

Φ = (1/γ) φ0 + φ1 + γ φ2 + ....   (8.102)

We substitute this ansatz in (8.88) to obtain the sequence of equations

L_H φ0 = 0,   (8.103a)
−L_H φ1 = p + L_OU φ0,   (8.103b)
−L_H φ2 = L_OU φ1.   (8.103c)

From equation (8.103a) we deduce that, since φ0 is in the null space of the Liouville operator, the first term in the expansion is a function of the Hamiltonian z(p, q) = p²/2 + V(q): φ0 = φ0(z(p, q)). Now we want to obtain an equation for φ0 by using the solvability condition for (8.103b). To this end, we multiply this equation by an arbitrary function of z, g = g(z), and integrate over p and q to obtain

∫_{−∞}^{+∞} ∫_{−π}^{π} ( p + L_OU φ0 ) g(z(p, q)) dp dq = 0.

The operator L_OU, when applied to functions of the Hamiltonian, becomes

L_OU = (β^{-1} − p²) ∂/∂z + β^{-1} p² ∂²/∂z².

We change now from p, q coordinates to z, q, with dp dq = J dz dq, where J = p^{-1}(z, q) is the Jacobian of the transformation. The integral equation for φ0(z) then becomes

∫_{E_min}^{+∞} ∫_{−π}^{π} g(z) ( p(z, q) + (β^{-1} − p²) ∂/∂z + β^{-1} p² ∂²/∂z² ) φ0(z) (1/p(z, q)) dz dq = 0.
Let E0 denote the critical energy, i.e. the energy along the separatrix (homoclinic orbit). We need to consider the cases z > E0, p > 0; z > E0, p < 0; and E_min < z < E0, separately. We set

S(z) = ∫_{x1(z)}^{x2(z)} p(z, q) dq,   T(z) = ∫_{x1(z)}^{x2(z)} (p(z, q))^{-1} dq,

where Risken's notation [82, p. 301] has been used for the turning points x1(z) and x2(z). We consider first the case z > E0, for which x1(z) = −π and x2(z) = π. We can perform the integration with respect to q to obtain

∫_{E0}^{+∞} g(z) ( 2π + (β^{-1} T(z) − S(z)) ∂/∂z + β^{-1} S(z) ∂²/∂z² ) φ0(z) dz = 0.

This equation is valid for every test function g(z), from which we obtain the following differential equation for φ0:

−Lφ := −β^{-1} (1/T(z)) ( S(z) φ' )' + ( S(z)/T(z) − β^{-1} ) φ' = 2π/T(z),   z > E0, p > 0,   (8.104)

where primes denote differentiation with respect to z and where the subscript 0 has been dropped for notational simplicity. A similar calculation shows that in the remaining two regions the equation for φ0 is

−Lφ = −2π/T(z),   z > E0, p < 0,   (8.105)

and

−Lφ = 0,   E_min < z < E0.   (8.106)

Equations (8.104), (8.105) and (8.106) are augmented with condition (8.85) and a continuity condition at the critical energy [27]:

2 φ3'(E0) = φ1'(E0) + φ2'(E0),   (8.107)

where φ1, φ2, φ3 are the solutions of equations (8.104), (8.105) and (8.106), respectively. The average of a function h(q, p) with respect to the Maxwell–Boltzmann distribution can be written in the form [82, p. 303]

⟨h⟩_β := Zβ^{-1} ∫_{E_min}^{+∞} ∫_{x1(z)}^{x2(z)} ( h(q, p(z, q)) + h(q, −p(z, q)) ) (p(z, q))^{-1} e^{-βz} dq dz,
where the partition function is

Zβ = √(2π/β) ∫_{−π}^{π} e^{-βV(q)} dq.

From equation (8.106), together with condition (8.85), we deduce that φ3'(z) = 0, i.e. φ3 is constant in the well region E_min < z < E0. Furthermore, by symmetry, φ1(z) = −φ2(z). Now we solve the equation for φ0(z) in the region z > E0 (for notational simplicity, we will drop the subscript 0). Using the fact that S'(z) = T(z), we rewrite (8.104) as

−β^{-1} ( S φ' )' + S φ' = 2π.

This equation can be rewritten as

−β^{-1} e^{βz} ∂_z ( e^{-βz} S(z) φ' ) = 2π.

Condition (8.85) implies that the derivative of the unique solution of (8.104) is

φ'(z) = 2π S^{-1}(z).   (8.108)

To leading order in γ the diffusion coefficient is

D = ⟨ p Φ(p, q) ⟩_β = (1/γ) ⟨ p φ0 ⟩_β + O(1) ≈ (4π/γ) Zβ^{-1} ∫_{E0}^{+∞} φ0(z) e^{-βz} dz.

Together with (8.108) and an integration by parts, this yields the following formula for the diffusion coefficient:

D = (8π²/(γβ)) Zβ^{-1} ∫_{E0}^{+∞} e^{-βz} / S(z) dz.   (8.109)

We emphasize the fact that this formula is exact in the limit as γ → 0 and is valid for all periodic potentials and for all values of the temperature. We remark that if we start instead with the formula D = γβ^{-1} ⟨ |∂_p Φ|² ⟩_β, we obtain the equivalent expression

D = (2/(γβ)) Zβ^{-1} ∫_{E0}^{+∞} S(z) ( ∂_z φ0(z) )² e^{-βz} dz,   (8.110)

which reduces to (8.109) upon inserting (8.108).
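Formula (8.109) is straightforward to evaluate numerically: both S(z) and the partition function are one-dimensional quadratures. The sketch below does this for the cosine potential V(q) = −cos(q) treated next, and compares the result with the small-temperature asymptotics D ≈ (π/(2γβ)) e^{−2β} derived below. All grid sizes and parameter values are illustrative choices, not part of the text.

```python
import math

def S(z, n=400):
    # action S(z) = ∫_{-π}^{π} sqrt(2(z - V(q))) dq for V(q) = -cos(q), valid for z >= E0 = 1
    h = 2 * math.pi / n
    vals = [math.sqrt(2.0 * (z + math.cos(-math.pi + i * h))) for i in range(n + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def diffusion_underdamped(beta, gamma, n=800):
    # D = (8π²/(γβ)) Z_β^{-1} ∫_{E0}^{∞} e^{-βz}/S(z) dz, truncating the tail of the integral
    zmax = 1.0 + 40.0 / beta
    h = (zmax - 1.0) / n
    vals = [math.exp(-beta * (1.0 + i * h)) / S(1.0 + i * h) for i in range(n + 1)]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    # partition function Z_β = sqrt(2π/β) ∫_{-π}^{π} e^{β cos q} dq
    m = 400
    hq = 2 * math.pi / m
    Zq = hq * sum(math.exp(beta * math.cos(-math.pi + i * hq)) for i in range(m))
    Zbeta = math.sqrt(2 * math.pi / beta) * Zq
    return 8 * math.pi ** 2 / (gamma * beta) * integral / Zbeta

beta, gamma = 5.0, 1.0
D = diffusion_underdamped(beta, gamma)
D_asym = math.pi / (2 * gamma * beta) * math.exp(-2 * beta)  # small-temperature asymptotics
```

Already at the moderate value β = 5 the two numbers agree to within a few tens of percent, consistent with the O(1/β) corrections neglected in the asymptotics.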
Consider now the case of the nonlinear pendulum, V(q) = −cos(q). A simple calculation yields

S(z) = 2^{5/2} √(z + 1) E( √(2/(z + 1)) ),

where E(·) is the complete elliptic integral of the second kind. The partition function is Zβ = (2π)^{3/2} β^{-1/2} J0(β), where J0(·) is the modified Bessel function of the first kind. The formula for the diffusion coefficient becomes

D = ( √π / (2 γ β^{1/2} J0(β)) ) ∫_{1}^{+∞} e^{-βz} / ( √(z + 1) E( √(2/(z + 1)) ) ) dz.   (8.111)

We use now the asymptotic formula J0(β) ≈ (2πβ)^{-1/2} e^{β}, β ≫ 1, and the fact that E(1) = 1 to obtain the small temperature asymptotics for the diffusion coefficient:

D = (π/(2γβ)) e^{-2β},   β ≫ 1,   (8.112)

which is precisely formula (??) obtained by Risken. Unlike the overdamped limit which is treated in the next section, it is not straightforward to obtain the next order correction in the formula for the effective diffusivity. This is because, due to the discontinuity of the solution of the Poisson equation (8.88) along the separatrix, the next order correction to Φ when γ ≪ 1 is of O(γ^{-1/2}), rather than O(1) as suggested by the ansatz (8.102). Upon combining the formula for the diffusion coefficient with the formula for the hopping rate from Kramers' theory [41], [42, eqn. 4.48(a)], we can obtain a formula for the mean square jump length at low friction. For the cosine potential this formula is

⟨ℓ²⟩ = π² / (8 γ² β²),   γ ≪ 1, β ≫ 1.   (8.113)

The Overdamped Limit

In this subsection we study the large γ asymptotics of the diffusion coefficient. As in the previous case, we use singular perturbation theory. The regularity of the solution of (8.88) when γ ≫ 1 will enable us to obtain the first two terms in the 1/γ expansion without any difficulty.
We set γ = 1/ε. The operator L0 becomes

L0 = (1/ε) L_OU + L_H.

We look for a solution of (8.88) in the form of a power series expansion in ε:

Φ = φ0 + ε φ1 + ε² φ2 + ε³ φ3 + ....   (8.114)

We substitute this into (8.88) and obtain the following sequence of equations:

−L_OU φ0 = 0,   (8.115a)
−L_OU φ1 = p + L_H φ0,   (8.115b)
−L_OU φ2 = L_H φ1,   (8.115c)
−L_OU φ3 = L_H φ2.   (8.115d)

The null space of the Ornstein–Uhlenbeck operator L_OU consists of constants in p. Consequently, from the first equation in (8.115) we deduce that the first term in the expansion is independent of p: φ0 = φ(q). Let

νβ(p) = √(β/(2π)) e^{-βp²/2}

be the invariant distribution of the OU process (i.e. L_OU* νβ(p) = 0). The solvability condition for an equation of the form −L_OU φ = f requires that the right-hand side averages to zero with respect to νβ(p), i.e. that it is orthogonal to the null space of the adjoint of L_OU. This condition is clearly satisfied for the equation for φ1. Thus, by the Fredholm alternative, this equation has a solution, which is

φ1(p, q) = (1 + ∂_q φ) p + ψ1(q),

where the function ψ1(q) is to be determined. We substitute this into the right-hand side of the third equation to obtain

−L_OU φ2 = p² ∂_q² φ − ∂_q V (1 + ∂_q φ) + p ∂_q ψ1(q).

From the solvability condition for this equation we obtain an equation for φ(q):

β^{-1} ∂_q² φ − ∂_q V (1 + ∂_q φ) = 0,   (8.116)

together with periodic boundary conditions. The derivative of the solution of this two-point boundary value problem is

∂_q φ + 1 = (2π/Ẑ) e^{βV(q)},   Ẑ = ∫_{−π}^{π} e^{βV(q)} dq.   (8.117)

The first two terms in the large γ expansion of the solution of equation (8.88) are therefore

Φ(p, q) = φ(q) + (1/γ) (1 + ∂_q φ) p + O(1/γ²),

where φ(q) is the solution of (8.116). Substituting this into the formula for the diffusion coefficient and using (8.117), we obtain

D = ∫_{−∞}^{∞} ∫_{−π}^{π} p Φ ρβ(p, q) dp dq = 4π² / (β γ Z Ẑ) + O(1/γ³),   (8.118)

where Z = ∫_{−π}^{π} e^{-βV(q)} dq. This is, of course, the Lifson–Jackson formula, which gives the diffusion coefficient in the overdamped limit [54]. Continuing in the same fashion, we can also compute the next order correction to the diffusion coefficient; see Exercise 4. The final result is

D = 4π² / (β γ Z Ẑ) − 4π² β Z1 / (γ³ Z Ẑ²) + O(1/γ⁵),

where Z1 = ∫_{−π}^{π} |V'(q)|² e^{βV(q)} dq. In the case of the nonlinear pendulum, V(q) = cos(q), this formula gives

D = (1/(γβ)) J0(β)^{-2} − (β/γ³) ( J2(β) J0(β)^{-3} − J0(β)^{-2} ) + O(1/γ⁵),   (8.119)

where Jn(β) is the modified Bessel function of the first kind. In the multidimensional case, a similar analysis leads to the large γ asymptotics

⟨ξ, Dξ⟩ = (1/γ) ⟨ξ, D0 ξ⟩ + O(1/γ³),

where ξ is an arbitrary unit vector in R^d and D0 is the diffusion coefficient for the Smoluchowski (overdamped) dynamics:

D0 = Z^{-1} ∫_{T^d} (−L_V χ) ⊗ χ e^{-βV(q)} dq,   (8.120)

where L_V = −∇_q V · ∇_q + β^{-1} Δ_q and χ(q) is the solution of the PDE L_V χ = ∇_q V with periodic boundary conditions.
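The Lifson–Jackson formula (8.118) reduces the leading-order diffusion coefficient to two one-dimensional quadratures. A minimal sketch (the function name and grid size are illustrative):

```python
import math

def lifson_jackson(V, beta, gamma, n=2000):
    # leading-order coefficient D = 4π² / (βγ Z Ẑ) over one 2π-periodic cell,
    # with Z = ∫ e^{-βV(q)} dq and Ẑ = ∫ e^{βV(q)} dq
    h = 2 * math.pi / n
    qs = [-math.pi + i * h for i in range(n)]
    Z = h * sum(math.exp(-beta * V(q)) for q in qs)
    Zhat = h * sum(math.exp(beta * V(q)) for q in qs)
    return 4 * math.pi ** 2 / (beta * gamma * Z * Zhat)
```

By the Cauchy–Schwarz inequality Z Ẑ ≥ 4π², so the result never exceeds the free diffusion coefficient 1/(γβ); for V = cos it agrees with the leading term 1/(γβ J0(β)²) of (8.119).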
Now we prove several properties of the effective diffusion tensor in the overdamped limit. In this subsection we write K for the tensor D0 of (8.120), set D = β^{-1}, and denote by ρ the Gibbs density Z^{-1} e^{-V(y)/D}. We will need the following integration by parts formula:

∫_{T^d} ∇_y χ ρ dy = ∫_{T^d} ( ∇_y (χ ρ) − χ ⊗ ∇_y ρ ) dy = − ∫_{T^d} ( χ ⊗ ∇_y ρ ) dy.   (8.121)

The proof of this formula is left as an exercise; see Exercise 5.

Theorem 8.6.1. The effective diffusion tensor D0 (8.120) satisfies the upper and lower bounds

D |ξ|² / (Z Ẑ) ≤ ⟨ξ, Kξ⟩ ≤ D |ξ|²   ∀ ξ ∈ R^d,   (8.122)

where Z = ∫_{T^d} e^{-V(y)/D} dy and Ẑ = ∫_{T^d} e^{V(y)/D} dy. In particular, diffusion is always depleted when compared to molecular diffusivity. Furthermore, the effective diffusivity is symmetric.

Proof. The lower bound follows from the general lower bound (??), equation (??), and the formula for the Gibbs measure. To establish the upper bound, we use (8.121), (??) and the fact that ρ is the Gibbs density, so that D ∇_y ρ = −∇_y V ρ, to obtain

K = D I + 2D ∫_{T^d} (∇_y χ)^T ρ dy + ∫_{T^d} (−∇_y V) ⊗ χ ρ dy
  = D I − 2D ∫_{T^d} ∇_y ρ ⊗ χ dy − ∫_{T^d} ∇_y V ⊗ χ ρ dy
  = D I + 2 ∫_{T^d} ∇_y V ⊗ χ ρ dy − ∫_{T^d} ∇_y V ⊗ χ ρ dy
  = D I − ∫_{T^d} (−L_V χ) ⊗ χ ρ dy
  = D I − D ∫_{T^d} ∇_y χ ⊗ ∇_y χ ρ dy,   (8.123)

where in the last two steps we used the cell problem L_V χ = ∇_y V and an integration by parts against the Gibbs density. Hence, for χξ = χ · ξ,

⟨ξ, Kξ⟩ = D |ξ|² − D ∫_{T^d} |∇_y χξ|² ρ dy ≤ D |ξ|².

This proves depletion. The symmetry of K follows from (8.123).
The One-Dimensional Case

The one-dimensional case is always in gradient form: b(y) = −∂_y V(y). Furthermore, in one dimension we can solve the cell problem (??) in closed form and calculate the effective diffusion coefficient explicitly, up to quadratures. We start with the following calculation concerning the structure of the diffusion coefficient:

K = D + 2D ∫_0^1 ∂_y χ ρ dy + ∫_0^1 (−∂_y V) χ ρ dy
  = D + 2D ∫_0^1 ∂_y χ ρ dy + D ∫_0^1 χ ∂_y ρ dy
  = D + 2D ∫_0^1 ∂_y χ ρ dy − D ∫_0^1 ∂_y χ ρ dy
  = D ∫_0^1 ( 1 + ∂_y χ ) ρ dy.   (8.124)

The cell problem (??) in one dimension is

D ∂_yy χ − ∂_y V ∂_y χ = ∂_y V.   (8.125)

We multiply equation (8.125) by e^{-V(y)/D} to obtain

∂_y ( ∂_y χ e^{-V(y)/D} ) = −∂_y e^{-V(y)/D}.

We integrate this equation once and multiply by e^{V(y)/D} to obtain

∂_y χ(y) = −1 + c1 e^{V(y)/D}.

Another integration yields

χ(y) = −y + c1 ∫_0^y e^{V(z)/D} dz + c2.

The periodic boundary conditions imply that χ(0) = χ(1), from which we conclude that

−1 + c1 ∫_0^1 e^{V(y)/D} dy = 0.

Hence

c1 = 1/Ẑ,   Ẑ = ∫_0^1 e^{V(y)/D} dy.

We deduce that

∂_y χ = −1 + (1/Ẑ) e^{V(y)/D}.

We substitute this expression into (8.124) to obtain

K = (D/Ẑ) ∫_0^1 e^{V(y)/D} ρ(y) dy = (D/(Z Ẑ)) ∫_0^1 e^{V(y)/D} e^{-V(y)/D} dy = D/(Z Ẑ),   (8.126)

with

Z = ∫_0^1 e^{-V(y)/D} dy,   Ẑ = ∫_0^1 e^{V(y)/D} dy.   (8.127)
The Cauchy–Schwarz inequality shows that Z Ẑ ≥ 1. Notice that in the one-dimensional case the formula for the effective diffusivity is precisely the lower bound in (8.122). This shows that the lower bound is sharp.

Example 8.6.2. Consider the piecewise constant potential

V(y) = a1 for y ∈ [0, 1/2],   V(y) = a2 for y ∈ (1/2, 1],   (8.128)

where a1, a2 are positive constants.⁴ It is straightforward to calculate the integrals in (8.127) to obtain the formula

K = D / cosh²( (a1 − a2)/(2D) ).   (8.129)

In Figure 8.2 we plot the effective diffusivity given by (8.129) as a function of the molecular diffusivity D. We observe that K decays exponentially fast in the limit as D → 0.
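Formula (8.129) can be checked directly against the quadrature formula (8.126)–(8.127): for the piecewise constant potential the integrals in (8.127) are elementary. A minimal sketch (function names are illustrative):

```python
import math

def K_quadrature(a1, a2, D):
    # K = D/(Z Ẑ) for the two-level potential (8.128); the integrals in (8.127) are exact
    Z = 0.5 * (math.exp(-a1 / D) + math.exp(-a2 / D))
    Zhat = 0.5 * (math.exp(a1 / D) + math.exp(a2 / D))
    return D / (Z * Zhat)

def K_closed_form(a1, a2, D):
    # formula (8.129)
    return D / math.cosh((a1 - a2) / (2 * D)) ** 2
```

The two expressions agree identically, since Z Ẑ = (2 + 2 cosh((a1 − a2)/D))/4 = cosh²((a1 − a2)/(2D)).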
8.6.1 Brownian Motion in a Tilted Periodic Potential

In this subsection we use our method to obtain a formula for the effective diffusion coefficient of an overdamped particle moving in a one-dimensional tilted periodic potential. This formula was first
⁴ Of course, this potential is not even continuous, let alone smooth, and the theory as developed in this chapter does not apply. It is possible, however, to consider a regularized version of this discontinuous potential; homogenization theory then applies.
Figure 8.2: Effective diffusivity versus molecular diffusivity for the potential (8.128).
derived and analyzed in [80, 79] without any appeal to multiscale analysis. The equation of motion is

ẋ = −V'(x) + F + √(2D) ξ,   (8.130)

where V(x) is a smooth periodic function with period L, F and D > 0 are constants, and ξ(t) is standard white noise in one dimension. To simplify the notation we have set γ = 1. The stationary Fokker–Planck equation corresponding to (8.130) is

∂_x ( (V'(x) − F) ρ(x) + D ∂_x ρ(x) ) = 0,   (8.131)

with periodic boundary conditions. Formula (10.13) for the effective drift now becomes

U_eff = ∫_0^L (−V'(x) + F) ρ(x) dx.   (8.132)
The solution of eqn. (8.131) is [77, Ch. 9]

ρ(x) = (1/Z) Z₋(x) ∫_x^{x+L} Z₊(y) dy,   (8.133)

with

Z±(x) := e^{± (V(x) − F x)/D}
and

Z = ∫_0^L ∫_x^{x+L} Z₊(y) Z₋(x) dy dx.   (8.134)

Upon using (8.133) in (8.132) we obtain [77, Ch. 9]

U_eff = (D L / Z) ( 1 − e^{-FL/D} ).   (8.135)

Our goal now is to calculate the effective diffusion coefficient. For this we first need to solve the Poisson equation (10.20), which now becomes

Lχ(x) := D ∂_xx χ(x) + (−V'(x) + F) ∂_x χ = V'(x) − F + U_eff,   (8.136)

with periodic boundary conditions. Then we need to evaluate the integrals in (10.18), which now become

D_eff = D + ∫_0^L (−V'(x) + F − U_eff) χ(x) ρ(x) dx + 2D ∫_0^L ∂_x χ(x) ρ(x) dx.   (8.137)

The fact that ρ(x) solves the stationary Fokker–Planck equation, together with elementary integrations by parts, yields that, for all sufficiently smooth periodic functions φ(x),

∫_0^L φ(x) (−Lφ(x)) ρ(x) dx = D ∫_0^L ( ∂_x φ(x) )² ρ(x) dx.

It will be more convenient for the subsequent calculation to rewrite the formula for the effective diffusion coefficient in a different form. Using (8.136) and the identity above, we have

D_eff = D + ∫_0^L (−Lχ(x)) χ(x) ρ(x) dx + 2D ∫_0^L ∂_x χ(x) ρ(x) dx
     = D + D ∫_0^L ( ∂_x χ(x) )² ρ(x) dx + 2D ∫_0^L ∂_x χ(x) ρ(x) dx
     = D ∫_0^L ( 1 + ∂_x χ(x) )² ρ(x) dx.

Now we solve the Poisson equation (8.136). We multiply the equation by Z₋(x) and divide through by D to rewrite it in the form

∂_x ( ∂_x χ(x) Z₋(x) ) = −∂_x Z₋(x) + (U_eff / D) Z₋(x).
We integrate this equation from x − L to x and use the periodicity of χ(x) and V(x), together with formula (8.135), to obtain

∂_x χ(x) Z₋(x) ( 1 − e^{-FL/D} ) = −Z₋(x) ( 1 − e^{-FL/D} ) + (L/Z) ( 1 − e^{-FL/D} ) ∫_{x−L}^x Z₋(y) dy,

from which we immediately get

∂_x χ(x) + 1 = (L/Z) Z₊(x) ∫_{x−L}^x Z₋(y) dy.

Substituting this into (8.137) and using the formula for the invariant distribution (8.133), we finally obtain

D_eff = (D L² / Z³) ∫_0^L ( I₊(x) )² I₋(x) dx,   (8.138)

with

I₊(x) = Z₊(x) ∫_{x−L}^x Z₋(y) dy   and   I₋(x) = Z₋(x) ∫_x^{x+L} Z₊(y) dy.

Formula (8.138) for the effective diffusion coefficient (formula (22) in [79]) is the main result of this section. Similar calculations yield analytical expressions for all other exactly solvable models that have been considered in the literature.

8.7 Numerical Solution of the Klein–Kramers Equation

8.8 Discussion and Bibliography

The rigorous study of the overdamped limit can be found in [68]. More information about the underdamped limit of the Langevin equation can be found in [89, 28, 29]. A similar approximation theorem is also valid in infinite dimensions (i.e. for SPDEs); see [10, 11]. An example, the calculation of the effective diffusion coefficient of an overdamped Brownian particle in a tilted periodic potential, was presented in Section 8.6.1. We also mention in passing that the various formulae for the effective diffusion coefficient that have been derived in the literature [34, 54, 80, 85] can be obtained from equation (??): they correspond to cases where equations (??) and (??) can be solved analytically.
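As an illustration of formula (8.138), the quadratures I± are easy to evaluate numerically. A sketch, assuming the normalization (8.134) for Z (function names and grid sizes are illustrative; the F = 0 sanity check reduces to the one-dimensional result (8.126) on a cell of length L):

```python
import math

def D_eff_tilted(V, F, D, L=2 * math.pi, n=200, m=200):
    # effective diffusion coefficient of ẋ = -V'(x) + F + sqrt(2D) ξ via (8.138):
    # D_eff = D L² Z^{-3} ∫_0^L I₊(x)² I₋(x) dx, with
    # I₊(x) = Z₊(x) ∫_{x-L}^{x} Z₋(y) dy,  I₋(x) = Z₋(x) ∫_x^{x+L} Z₊(y) dy,
    # Z±(x) = exp(±(V(x) - F x)/D),       Z = ∫_0^L I₋(x) dx
    Zp = lambda x: math.exp((V(x) - F * x) / D)
    Zm = lambda x: math.exp(-(V(x) - F * x) / D)

    def trap(f, a, b, k):
        h = (b - a) / k
        return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + i * h) for i in range(1, k)))

    Ip = lambda x: Zp(x) * trap(Zm, x - L, x, m)
    Im = lambda x: Zm(x) * trap(Zp, x, x + L, m)
    Z = trap(Im, 0.0, L, n)
    return D * L * L * trap(lambda x: Ip(x) ** 2 * Im(x), 0.0, L, n) / Z ** 3
```

For F = 0 the inner integrals are constants and the formula collapses to D L² / ( ∫_0^L e^{-V/D} dy · ∫_0^L e^{V/D} dy ), i.e. the Lifson–Jackson-type expression of the previous subsection.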
8.9 Exercises

1. Let L be the generator of the two-dimensional Ornstein–Uhlenbeck process (8.34).
   (a) Show by direct substitution that L can be written in the form
       L = −λ1 (c⁻)*(c⁺)* − λ2 (d⁻)*(d⁺)*.
   (b) Calculate the commutators
       [(c⁺)*, (c⁻)*], [(d⁺)*, (d⁻)*], [(c±)*, (d±)*], [L, (c±)*], [L, (d±)*].

2. Show that the operators a±, b± defined in (8.15) and (8.16) satisfy the commutation relations
       [a⁺, a⁻] = −1,   (8.139a)
       [b⁺, b⁻] = −1,   (8.139b)
       [a±, b±] = 0.   (8.139c)

3. Let L be the operator defined in (8.17). Calculate the eigenvalues and eigenfunctions of L. Show that there exists a transformation that transforms L into the Schrödinger operator of the two-dimensional quantum harmonic oscillator.

4. Obtain the second term in the expansion (8.118).

5. Prove formula (8.121).
Chapter 9

Exit Time Problems

9.1 Introduction

There are many systems in physics, chemistry and biology that exist in at least two stable states. Among the many applications we mention the switching and storage devices in computers. Another example is biological macromolecules that can exist in many different states. The problems that we would like to solve are:

• How stable are the various states relative to each other?
• How long does it take for a system to switch spontaneously from one state to another?
• How is the transfer made, i.e. through what path in the relevant state space? There is a lot of important current work on this problem by E, Vanden-Eijnden and others.
• How does the system relax to an unstable state?

We can distinguish between the one-dimensional problem, the finite dimensional problem and the infinite dimensional problem (SPDEs). We will solve completely the one-dimensional problem and discuss in some detail the finite dimensional problem. The infinite dimensional situation is an extremely hard problem and we will only make some remarks. The study of bistability and metastability is a very active research area, in particular the development of numerical methods for the calculation of various quantities such as reaction rates, transition pathways etc.

9.2 Brownian Motion in a Bistable Potential
We will mostly consider the dynamics of a particle moving in a bistable potential, under the influence of thermal noise, in one dimension:

ẋ = −V'(x) + √(2 k_B T) Ẇ.   (9.1)

An example of the class of potentials that we will consider is shown in Figure. It has two local minima and one local maximum, and it increases at least quadratically at infinity. This ensures that the state space is "compact", i.e. that the particle cannot escape to infinity. More generally, we assume that the potential has two local minima at the points a and c and a local maximum at b. The standard potential that satisfies these assumptions is

V(x) = (1/4) x⁴ − (1/2) x² + 1/4.   (9.2)

It is easily checked that this potential has three critical points: a local maximum at x = 0 and two local minima at x = ±1. The values of the potential at these points are V(±1) = 0 and V(0) = 1/4; we will say that the height of the potential barrier is 1/4. More generally, the potential barrier is defined as

∆E = V(b) − V(a).

The physically (and mathematically!) interesting case is when the thermal fluctuations are weak when compared to the potential barrier that the particle has to climb over. Let us consider the problem of the escape of the particle from the left local minimum a.
Our assumption that the thermal fluctuations are weak can be written as

k_B T / ∆E ≪ 1.

It is very important to notice that the escape from a local minimum, i.e. from a state of local stability, can happen only at positive temperatures: it is a noise assisted event. Indeed, consider the case T = 0. The equation of motion becomes

ẋ = −V'(x),   x(0) = x0.

In this case the potential becomes a Lyapunov function:

dV(x)/dt = V'(x) dx/dt = −( V'(x) )² < 0.

Hence, depending on the initial condition, the particle will converge either to a or to c; the particle cannot escape from either state of local stability. On the other hand, at high temperatures the particle does not "see" the potential barrier: it essentially jumps freely from one local minimum to the other. Consequently, the interesting regime is that of weak noise. In this regime it is intuitively clear that the particle is most likely to be found at either a or c; there it will perform small oscillations around either of the local minima. This is a result that we can obtain by studying the small temperature limit using perturbation theory: locally, the dynamics of the particle can be described by appropriate Ornstein–Uhlenbeck processes. Of course, this result is valid only for finite times: at sufficiently long times the particle can escape from the one local minimum, a say, and surmount the potential barrier to end up at c. It will then spend a long time in the neighborhood of c until it escapes again over the potential barrier and ends up at a. This is an example of a rare event. The relevant time scale, the exit time or the mean first passage time, scales exponentially in β := (k_B T)^{-1}:

τ = ν^{-1} exp(β∆E).

It is more customary to calculate the reaction rate κ := τ^{-1}, which gives the rate with which particles escape from a local minimum of the potential:

κ = ν exp(−β∆E).   (9.3)
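The metastable behaviour just described is easy to observe in a direct Euler–Maruyama simulation of (9.1) with the quartic potential (9.2). All numerical parameter values below are illustrative choices, not prescribed by the text:

```python
import math, random

def simulate(x0=-1.0, kT=0.05, dt=1e-3, nsteps=20000, seed=1):
    # Euler–Maruyama for dx = -V'(x) dt + sqrt(2 kT) dW, with V(x) = x^4/4 - x^2/2 + 1/4,
    # so that V'(x) = x^3 - x
    random.seed(seed)
    x, path = x0, []
    s = math.sqrt(2.0 * kT * dt)
    for _ in range(nsteps):
        x += -(x ** 3 - x) * dt + s * random.gauss(0.0, 1.0)
        path.append(x)
    return path

path = simulate()
frac_near_wells = sum(1 for x in path if abs(x) > 0.5) / len(path)
```

At kT = 0.05, so that β∆E = 5, the trajectory performs small oscillations around a well and transitions between x = ±1 are rare, in agreement with the Arrhenius scaling (9.3).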
9.3 The Mean First Passage Time

The Arrhenius-type factor in the formula for the reaction rate, eqn. (9.3), is intuitively clear and it has been observed experimentally in the late nineteenth century by Arrhenius and others. What is extremely important, both from a theoretical and an applied point of view, is the calculation of the prefactor ν, the rate coefficient. A systematic approach for the calculation of the rate coefficient, as well as the justification of the Arrhenius kinetics, is that of the mean first passage time (MFPT) method. Since this method is of independent interest and is useful in various other contexts, we will present it in a quite general setting and apply it to the problem of the escape from a potential barrier in later sections. We will first treat the one-dimensional problem and then extend the theory to arbitrary finite dimensions. We will restrict ourselves to the case of homogeneous Markov processes; it is not very easy to extend the method to non-Markovian processes.

To get a better understanding of the dependence of the dynamics on the depth of the potential barrier relative to the temperature, we solve the equation of motion (10.3) numerically. In Figure we present the time series of the particle position. We observe that at small temperatures the particle spends most of its time around x = ±1, with rapid transitions from −1 to 1 and back.

9.3.1 The Boundary Value Problem for the MFPT

Let X_t be a continuous-time diffusion process on R^d whose evolution is governed by the SDE

dX_t^x = b(X_t^x) dt + σ(X_t^x) dW_t,   X_0^x = x.   (9.4)

Let D be a bounded subset of R^d with smooth boundary. Given x ∈ D, we want to know how long it takes for the process X_t to leave the domain D for the first time:

τ_D^x = inf{ t ≥ 0 : X_t^x ∉ D }.

Clearly, this is a random variable. The average of this random variable is called the mean first passage time (MFPT) or the first exit time:

τ(x) := E τ_D^x.

We can calculate the MFPT by solving an appropriate boundary value problem.
Theorem 9.3.1. The MFPT is the solution of the boundary value problem

−Lτ = 1,   x ∈ D,   (9.5a)
τ = 0,   x ∈ ∂D,   (9.5b)

where L is the generator of the SDE (9.4).

The homogeneous Dirichlet boundary conditions correspond to an absorbing boundary: the particles are removed when they reach the boundary. Other choices of boundary conditions are also possible. The rigorous proof of Theorem 9.3.1 is based on Itô's formula; here we give a heuristic derivation.

Proof. Let ρ(X, x, t) be the probability distribution of the particles that have not left the domain D at time t. It solves the Fokker–Planck equation with absorbing boundary conditions:

∂ρ/∂t = L* ρ,   ρ(X, x, 0) = δ(X − x),   ρ|_{∂D} = 0.   (9.6)

We can write the solution to this equation in the form

ρ(X, x, t) = e^{L* t} δ(X − x),

where the absorbing boundary conditions are included in the definition of the semigroup e^{L* t}. The (normalized) number of particles that are still inside D at time t is

S(x, t) = ∫_D ρ(X, x, t) dX.

Notice that this is a decreasing function of time. The homogeneous Dirichlet (absorbing) boundary conditions imply that

lim_{t→+∞} ρ(X, x, t) = 0,

that is, all particles will eventually leave the domain. We can write

∂S/∂t = −f(x, t),
where f(x, t) is the first passage time distribution. The MFPT is the first moment of this distribution:

τ(x) = ∫_0^{+∞} f(s, x) s ds = ∫_0^{+∞} ( −dS/ds ) s ds = ∫_0^{+∞} S(s, x) ds
     = ∫_0^{+∞} ∫_D ρ(X, x, s) dX ds = ∫_0^{+∞} ∫_D e^{L* s} δ(X − x) dX ds
     = ∫_0^{+∞} ∫_D δ(X − x) ( e^{L s} 1 ) dX ds = ∫_0^{+∞} e^{L s} 1 ds.

We apply L to the above equation to deduce:

Lτ = ∫_0^{+∞} L e^{L t} 1 dt = ∫_0^{+∞} (d/dt) ( e^{L t} 1 ) dt = −1.

9.3.2 Examples

In this section we consider a few simple examples for which we can calculate the mean first passage time in closed form.

Brownian motion with one absorbing and one reflecting boundary. We consider the problem of Brownian motion moving in the interval [a, b]. We assume that the left boundary is absorbing and the right boundary is reflecting. The boundary value problem for the MFPT becomes

−d²τ/dx² = 1,   τ(a) = 0,   (dτ/dx)(b) = 0.   (9.7)

The solution of this equation is

τ(x) = −x²/2 + b x + a ( a/2 − b ).

The MFPT for Brownian motion with one absorbing and one reflecting boundary in the interval [−1, 1] is plotted in Figure 9.1.
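The closed-form solution of (9.7) can be checked with a crude Monte Carlo estimate: simulate dX = √2 dW (the process whose generator is d²/dx²), reflect at b and stop at a. Step size, sample size and seed below are illustrative:

```python
import math, random

def mfpt_mc(x0=0.0, a=-1.0, b=1.0, dt=1e-3, npaths=400, seed=2):
    # Euler–Maruyama estimate of the mean first passage time for dX = sqrt(2) dW,
    # absorbing boundary at a, reflecting boundary at b
    random.seed(seed)
    s, total = math.sqrt(2.0 * dt), 0.0
    for _ in range(npaths):
        x, t = x0, 0.0
        while x > a:
            x += s * random.gauss(0.0, 1.0)
            if x > b:
                x = 2.0 * b - x  # reflection at the right boundary
            t += dt
        total += t
    return total / npaths

tau_exact = -0.0 ** 2 / 2 + 1.0 * 0.0 + (-1.0) * (-0.5 - 1.0)  # formula above at x = 0: 1.5
tau_mc = mfpt_mc()
```

The estimate carries both statistical error and a small positive discretization bias (the discrete path can overshoot the absorbing boundary), so only rough agreement with the exact value 1.5 should be expected.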
Figure 9.1: The mean first passage time for Brownian motion with one absorbing and one reflecting boundary.

Brownian motion with two absorbing boundaries. Consider again the problem of Brownian motion moving in the interval [a, b], but now with both boundaries being absorbing. The boundary value problem for the MFPT becomes

−d²τ/dx² = 1,   τ(a) = 0,   τ(b) = 0.   (9.8)

The solution of this equation is

τ(x) = (1/2)(x − a)(b − x).

The MFPT for Brownian motion with two absorbing boundaries in the interval [−1, 1] is plotted in Figure 9.2.

Figure 9.2: The mean first passage time for Brownian motion with two absorbing boundaries.

The Mean First Passage Time for a One-Dimensional Diffusion Process

Consider now the mean exit time problem from an interval [a, b] for a general one-dimensional diffusion process with generator

L = b(x) d/dx + (1/2) a(x) d²/dx²,

where the drift and diffusion coefficients b(x) and a(x) are smooth functions and where the diffusion coefficient a(x) is strictly positive (uniform ellipticity condition). In order to calculate the mean first passage time we need to solve the differential equation

−( b(x) d/dx + (1/2) a(x) d²/dx² ) τ = 1,   (9.9)

together with appropriate boundary conditions, depending on whether we have one absorbing and one reflecting boundary or two absorbing boundaries. To solve this equation we first define the function ψ(x) through ψ'(x) = 2 b(x)/a(x), so that (9.9) can be written as

( e^{ψ(x)} τ'(x) )' = −(2/a(x)) e^{ψ(x)}.

The general solution of (9.9) is obtained after two integrations:

τ(x) = −2 ∫_a^x e^{-ψ(z)} ∫_a^z e^{ψ(y)}/a(y) dy dz + c1 ∫_a^x e^{-ψ(z)} dz + c2,

where the constants c1 and c2 are determined from the boundary conditions. When both boundaries are absorbing, c2 = 0 and c1 is fixed by the requirement τ(b) = 0:

τ(x) = −2 ∫_a^x e^{-ψ(z)} ∫_a^z e^{ψ(y)}/a(y) dy dz + (2Ẑ/Z) ∫_a^x e^{-ψ(z)} dz,   (9.10)

where Z = ∫_a^b e^{-ψ(z)} dz and Ẑ = ∫_a^b e^{-ψ(z)} ∫_a^z e^{ψ(y)}/a(y) dy dz.

9.4 Escape from a Potential Barrier

In this section we use the theory developed in the previous section to study the long time/small temperature asymptotics of solutions to the Langevin equation for a particle moving in a one-dimensional potential of the form (9.2):

ẍ = −V'(x) − γ ẋ + √(2 γ k_B T) Ẇ.   (9.11)

We want to calculate the rate of escape from the potential barrier in this case. In particular, we justify the Arrhenius formula for the reaction rate, κ = ν(γ) exp(−β∆E), and we calculate the dependence of the escape rate ν = ν(γ) on the friction coefficient. We will see that we need to distinguish between the cases of large and small friction coefficients.

9.4.1 Calculation of the Reaction Rate in the Overdamped Regime

We assume that the particle is initially at x0, which is near a, the left local minimum of the potential. As we saw in Chapter 8, in the overdamped limit γ ≫ 1 the solution to (9.11) can be approximated by the solution to the Smoluchowski equation

ẋ = −V'(x) + √(2β^{-1}) Ẇ.

Consider the boundary value problem for the MFPT of this one-dimensional diffusion process from the interval (a, b):

−β^{-1} e^{βV} ∂_x ( e^{-βV} ∂_x τ ) = 1.   (9.12)

We choose a reflecting boundary condition at x = a and an absorbing one at x = b. We can solve (9.12) with these boundary conditions by quadratures:

τ(x) = β ∫_x^b dy e^{βV(y)} ∫_a^y dz e^{-βV(z)}.   (9.13)

Now we can solve the problem of the escape from a potential well: the reflecting boundary is at x = a, the left local minimum of the potential, and the absorbing boundary is at x = b, the local maximum. Since the potential increases rapidly to the left of a, we can replace the reflecting boundary condition at x = a by one at x = −∞:

τ(x) = β ∫_x^b dy e^{βV(y)} ∫_{−∞}^y dz e^{-βV(z)}.
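Before carrying out the asymptotics, note that this double quadrature is easy to evaluate numerically; for the quartic potential (9.2) it can be compared with the saddle-point approximation τ ≈ (π/(ω0 ωb)) e^{β∆E}, with ω0² = V''(a) and ωb² = |V''(b)|, derived in the next paragraph. A sketch with illustrative grid parameters:

```python
import math

def V(x):
    # the bistable potential (9.2)
    return 0.25 * x ** 4 - 0.5 * x ** 2 + 0.25

def mfpt_quadrature(x, b, beta, zlow=-4.0, n=400, m=400):
    # τ(x) = β ∫_x^b e^{βV(y)} ∫_{-∞}^{y} e^{-βV(z)} dz dy, lower limit truncated at zlow
    def inner(y):
        h = (y - zlow) / m
        return h * sum(math.exp(-beta * V(zlow + (i + 0.5) * h)) for i in range(m))
    h = (b - x) / n
    return beta * h * sum(
        math.exp(beta * V(x + (i + 0.5) * h)) * inner(x + (i + 0.5) * h) for i in range(n)
    )

beta = 30.0
tau = mfpt_quadrature(-1.0, 0.0, beta)      # escape from the well at a = -1 to b = 0
omega0, omegab = math.sqrt(2.0), 1.0        # V''(-1) = 2, |V''(0)| = 1
tau_asym = math.pi / (omega0 * omegab) * math.exp(beta * 0.25)
```

At β∆E = 7.5 the saddle-point value already reproduces the exact quadrature to within the expected O(1/(β∆E)) relative error.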
When $\beta E_b \gg 1$, where $E_b := V(b) - V(a)$ is the barrier height, the integral with respect to $z$ is dominated by the value of the potential near $a$. Furthermore, assuming that $x$ is close to $a$, we can replace the lower limit of integration by $-\infty$. We use the Taylor series expansion around the minimum,
\[
V(z) = V(a) + \frac{1}{2}\omega_0^2 (z-a)^2 + \dots,
\]
to obtain
\[
\int_a^y e^{-\beta V(z)}\,dz \approx e^{-\beta V(a)} \int_{-\infty}^{+\infty} e^{-\beta\omega_0^2 (z-a)^2/2}\,dz = e^{-\beta V(a)} \sqrt{\frac{2\pi}{\beta\omega_0^2}}.
\]
Similarly, the integral with respect to $y$ is dominated by the value of the potential around the saddle point, i.e. the local maximum of the potential. We use the Taylor series expansion
\[
V(y) = V(b) - \frac{1}{2}\omega_b^2 (y-b)^2 + \dots
\]
and obtain
\[
\int_x^b e^{\beta V(y)}\,dy \approx e^{\beta V(b)} \int_{-\infty}^{b} e^{-\beta\omega_b^2 (y-b)^2/2}\,dy = \frac{1}{2} e^{\beta V(b)} \sqrt{\frac{2\pi}{\beta\omega_b^2}}.
\]
Putting everything together we obtain a formula for the MFPT:
\[
\tau(x) = \frac{\pi}{\omega_0 \omega_b}\exp\left(\beta E_b\right).
\]
The rate of arrival at $b$ is $1/\tau$. Only half of the particles escape. Consequently, the escape rate (or reaction rate), given by $\frac{1}{2\tau}$, is
\[
\kappa = \frac{\omega_0\omega_b}{2\pi}\exp\left(-\beta E_b\right). \tag{9.14}
\]

9.4.2 The Intermediate Regime: γ = O(1)

• Consider now the problem of escape from a potential well for the Langevin equation
\[
\ddot{q} = -\partial_q V(q) - \gamma\dot{q} + \sqrt{2\gamma\beta^{-1}}\,\dot{W}.
\]
• The reaction rate depends on the friction coefficient and the temperature. In the overdamped limit ($\gamma \gg 1$) we retrieve (9.14), appropriately rescaled with $\gamma$:
\[
\kappa = \frac{\omega_0\omega_b}{2\pi\gamma}\exp\left(-\beta E_b\right). \tag{9.15}
\]
• We can also obtain a formula for the reaction rate for $\gamma = O(1)$:
\[
\kappa = \frac{\sqrt{\frac{\gamma^2}{4} + \omega_b^2} - \frac{\gamma}{2}}{\omega_b}\,\frac{\omega_0}{2\pi}\exp\left(-\beta E_b\right). \tag{9.16}
\]
• Naturally, in the limit as $\gamma \to +\infty$, (9.16) reduces to (9.15).

9.4.3 Calculation of the Reaction Rate in the Energy-Diffusion-Limited Regime

In order to calculate the reaction rate in the underdamped or energy-diffusion-limited regime $\gamma \ll 1$ we need to study the diffusion process for the energy, (8.69) or (8.70). The result is
\[
\kappa = \gamma\beta I(E_b)\,\frac{\omega_0}{2\pi}\exp\left(-\beta E_b\right), \tag{9.17}
\]
where $I(E_b)$ denotes the action evaluated at $b$.

9.5 Discussion and Bibliography

The calculation of reaction rates and the stochastic modeling of chemical reactions have been very active areas of research since the 1930s. One of the first methods that were developed was that of transition state theory. Kramers developed his theory in his celebrated paper [49]. In this chapter we have based our approach to the calculation of reaction rates on the mean first passage time. Our analysis is based mostly on [35, Ch. 4] and [96, Ch. 5, 9], and on the excellent review article [41]; we highly recommend this review article for further information on reaction rate theory. See also [40] and the review article of Melnikov (1991).

A formula for the escape rate which is valid for all values of the friction coefficient was obtained by Melnikov and Meshkov in 1986 (J. Chem. Phys. 85(2), 1018–1027). This formula requires the calculation of integrals and it reduces to (9.15) and (9.17) in the overdamped and underdamped limits, respectively.

There are many applications of interest where it is important to calculate reaction rates for non-Markovian Langevin equations of the form
\[
\ddot{x} = -V'(x) - \int_0^t \gamma(t-s)\dot{x}(s)\,ds + \xi(t), \tag{9.18a}
\]
\[
\langle \xi(t)\xi(0)\rangle = k_B T\,M^{-1}\gamma(t). \tag{9.18b}
\]
We will derive generalized non-Markovian equations of the form (9.18a), together with the fluctuation–dissipation theorem (11.10), in Chapter 11. The calculation of reaction rates for the generalized Langevin equation is presented in [40].

A systematic study of the problem of the escape from a potential well was developed by Matkowsky, Schuss and collaborators [86, 63, 64]. This approach is based on a systematic use of singular perturbation theory: singular perturbation theory is used to study the small temperature asymptotics of solutions to the boundary value problem for the mean first passage time $\tau$. In particular, the calculation of a transition rate which is uniformly valid in the friction coefficient is presented in [64]. This formula is obtained through a careful analysis of the PDE
\[
p\,\partial_q \tau - \partial_q V\,\partial_p \tau + \gamma\left(-p\,\partial_p + k_B T\,\partial_p^2\right)\tau = -1,
\]
equipped with the appropriate boundary conditions. The formula derived in this paper reduces, in the appropriate asymptotic limits, to the formulas which are valid at large and small values of the friction coefficient.

A related issue is that of the small temperature asymptotics for the eigenvalues (in particular, the first eigenvalue) of the generator of the Markov process $x(t)$ which is the solution of
\[
\gamma\dot{x} = -\nabla V(x) + \sqrt{2\gamma k_B T}\,\dot{W}.
\]
The long time/small temperature asymptotics can be studied rigorously by means of the theory of Freidlin and Wentzell [29]. See also [6]. The theory of Freidlin and Wentzell has also been extended to infinite dimensional problems; we refer to CITE for more details.

The study of rare transition events between long lived metastable states is a key feature of many systems in physics, chemistry and biology, and it is one of the most active research areas in applied stochastic processes. Rare transition events play an important role, for example, in the analysis of the transition between different conformation states of biological macromolecules such as DNA [87], and in many applications such as micromagnetics. Recent developments in this area involve the transition path theory of W. E and Vanden-Eijnden. As in the mean first passage time approach, transition path theory is also based on the solution of an appropriate boundary value problem, for the so-called committor function. Various simple applications of this theory are presented in Metzner, Schütte et al. (2006).

9.6 Exercises
The long time behavior of a Brownian particle in a periodic potential is determined uniquely 199 . such as Josephson junctions [3]. have found new and exciting applications e. Ch. surface diffusion [52. Consequently. Furthermore.4 Introduction Stochastic Resonance Brownian Motors Introduction Particle transport in spatially periodic. various experimental methods for particle separation have been suggested which are based on the theory of Brownian motors [7]. While the system of a Brownian particle in a periodic potential is kept away from equilibrium by an external. [78] and the references therein. without any violation of the second law of thermodynamics. There are various physical systems where Brownian motion in periodic potentials plays a prominent role. noisy systems has attracted considerable attention over the last decades.g as the basis of theoretical models for various intracellular transport processes such as molecular motors [9].2 10. 11]. deterministic or random.g. force. detailed balance does not hold. It was this fundamental observation [60] that led to a revival of interest in the problem of particle transport in periodic potentials with broken spatial symmetry. see e. These types of non– equilibrium systems. 84] and superionic conductors [33].1 10.3 10.Chapter 10 Stochastic Resonance and Brownian Motors 10. which are often called Brownian motors or ratchets. [82. and in the absence of any spatial symmetry. a net particle current will appear.
The effective drift and the effective diffusion tensor are defined, respectively, as
\[
U_{eff} = \lim_{t\to\infty}\frac{\langle x(t) - x(0)\rangle}{t} \tag{10.1}
\]
and
\[
D_{eff} = \lim_{t\to\infty}\frac{\left\langle \left(x(t) - \langle x(t)\rangle\right)\otimes\left(x(t) - \langle x(t)\rangle\right)\right\rangle}{2t}. \tag{10.2}
\]
Here $x(t)$ denotes the particle position, $\langle\cdot\rangle$ denotes ensemble average and $\otimes$ stands for the tensor product. Indeed, an argument based on the central limit theorem [5, 47] implies that at long times the particle performs an effective Brownian motion, which is a Gaussian process, and hence the first two moments are sufficient to determine the process uniquely. The main goal of all theoretical investigations of noisy, non-equilibrium particle transport is the calculation of (10.1) and (10.2) and, in particular, the analysis of the dependence of these two quantities on the various parameters of the problem, such as the friction coefficient, the temperature and the particle mass.

Enormous theoretical effort has been put into the study of Brownian ratchets and, more generally, of Brownian particles in spatially periodic potentials [78]. The vast majority of these theoretical investigations is concerned with the calculation of the effective drift for one-dimensional models; in these papers, relatively simple potentials and/or forcing terms are considered, such as tilting periodic potentials or simple periodic in time forcing. On the other hand, the number of theoretical studies related to the calculation of the effective diffusion tensor has been scarce [34, 55, 80, 85]. It is widely recognized that the calculation of the effective diffusion coefficient is technically more demanding than that of the effective drift. This is not surprising, since, in addition to the solution of the stationary Fokker–Planck equation, which is sufficient for the calculation of the effective drift, it requires the solution of a Poisson equation. Furthermore, the theoretical tools that are currently available are not sufficient for the analytical treatment of the multi-dimensional problem; this is only possible when the potential and/or the noise are such that the problem can be reduced to a one-dimensional one [19]. For more general multi-dimensional problems one has to resort to numerical simulations.

There are various applications, however, where the one-dimensional analysis is inadequate. As an example we mention the technique for separation of macromolecules in microfabricated sieves that was proposed in [14]. In the two-dimensional setting considered in this paper, an appropriately chosen driving force in the y direction produces a constant drift in the x direction, but with a zero net velocity in the y direction, whereas a force in the x direction produces no drift in the y direction. Diffusive, rather than directed, transport can be potentially extremely important in the design of experimental setups for particle selection [78, Sec. 5.11], [85]. The theoretical analysis of this problem requires new technical tools, and it is therefore desirable to develop systematic tools for the calculation of the effective diffusion coefficient (or tensor, in the multi-dimensional setting).

From a mathematical point of view, non-equilibrium systems which are subject to unbiased noise can be modelled as non-reversible Markov processes [76] and can be expressed in terms of solutions to stochastic differential equations (SDEs). The SDEs which govern the motion of a Brownian particle in a periodic potential possess inherent length and time scales: those related to the spatial period of the potential and to the temporal period (or correlation time) of the external driving force. From this point of view the calculation of the effective drift and the effective diffusion coefficient amounts to studying the behavior of solutions to the underlying SDEs at length and time scales which are much longer than the characteristic scales of the system. A systematic methodology for studying problems of this type, which is based on scale separation, has been developed many years ago [5, ?, ?]. The techniques developed in the aforementioned references are appropriate for the asymptotic analysis of stochastic systems (and Markov processes in particular) which are spatially and/or temporally periodic. The purpose of this work is to apply these multiscale techniques to the study of Brownian motors in arbitrary dimensions, with particular emphasis on the calculation of the effective diffusion tensor.

The rest of this paper is organized as follows. In Section 10.5 we introduce the model that we will study. In Section 10.6 we obtain formulae for the effective drift and the effective diffusion tensor in the case where all external forces are Markov processes. In Section 10.7 we study the effective diffusion coefficient for a Brownian particle in a periodic potential driven simultaneously by additive Gaussian white and colored noise. Section ?? is reserved for conclusions. In Appendix A we derive formulae for the effective drift and the effective diffusion coefficient for the case where the Brownian particle is driven away from equilibrium by periodic in time external fluctuations. Finally, in Appendix B we use the method developed in this paper to calculate the effective diffusion coefficient of an overdamped particle in a one-dimensional tilted periodic potential.
10.5 The Model

We consider the overdamped d-dimensional stochastic dynamics for a state variable $x(t) \in \mathbb{R}^d$ [78, Sec. 3]:
\[
\gamma\dot{x}(t) = -\nabla V(x(t), f(t)) + y(t) + \sqrt{2\gamma k_B T}\,\xi(t), \tag{10.3}
\]
where $\gamma$ is the friction coefficient, $k_B$ the Boltzmann constant and $T$ denotes the temperature. $\xi(t)$ stands for the standard d-dimensional white noise process, i.e.
\[
\langle \xi_i(t)\rangle = 0 \quad\text{and}\quad \langle \xi_i(t)\xi_j(s)\rangle = \delta_{ij}\delta(t-s), \qquad i, j = 1, \dots, d.
\]
The potential $V(x, f)$ is periodic in $x$ for every $f$, with period $L$ in all spatial directions:
\[
V(x + L\hat{e}_i, f) = V(x, f), \qquad i = 1, \dots, d,
\]
where $\{\hat{e}_i\}_{i=1}^d$ denotes the standard basis of $\mathbb{R}^d$. We take $f(t)$ and $y(t)$ to be Markov processes with respective state spaces $E_f$, $E_y$ and generators $\mathcal{L}_f$, $\mathcal{L}_y$. The processes $f(t)$ and $y(t)$ can be continuous in time diffusion processes, which are constructed as solutions of stochastic differential equations, dichotomous noise [42, Ch. 9], more general Markov chains, etc. The (easier) case where $f(t)$ and $y(t)$ are deterministic, periodic functions of time is treated in the appendix.

We remark that the state variable $x(t)$ does not necessarily denote the position of a Brownian particle; we will, however, refer to $x(t)$ as the particle position in the sequel. For simplicity, we have assumed that the temperature in (10.3) is constant. This assumption comes with no loss of generality, since eqn. (10.3) with a time dependent temperature can be mapped to an equation with constant temperature and an appropriate effective potential [78, Sec. 6]. Thus, the above framework is general enough to encompass most of the models that have been studied in the literature, such as pulsating, tilting, or temperature ratchets.

The process $\{x(t), f(t), y(t)\}$ in the extended phase space $\mathbb{R}^d \times E_f \times E_y$ is Markovian with generator
\[
\mathcal{L} = \frac{1}{\gamma}\left(-\nabla_x V(x, f) + y\right)\cdot\nabla_x + D\Delta_x + \mathcal{L}_f + \mathcal{L}_y,
\]
where $D := k_B T/\gamma$. We will use the notation $Q = [0, L]^d$.
10.6 Multiscale Analysis

In this section we derive formulae for the effective drift and the effective diffusion tensor for $x(t)$, the solution of (10.3). To this process we can associate the initial value problem for the backward Kolmogorov equation [69, Ch. 8]
\[
\frac{\partial u}{\partial t} = \mathcal{L}u, \qquad u(x, y, f, t=0) = u_{in}(x, y, f), \tag{10.4}
\]
which is, of course, the adjoint to the Fokker–Planck equation. Our derivation of formulae for the effective drift and the effective diffusion tensor is based on singular perturbation analysis of the initial value problem (10.4).

For the analysis that follows it is convenient to introduce a parameter $\varepsilon \ll 1$ which in effect is the ratio between the length scale defined through the period of the potential and a large "macroscopic" length scale at which the motion of the particle is governed by an effective Brownian motion. We are interested in the long time, large scale behavior of $x(t)$; the limit $\varepsilon \to 0$ corresponds to the limit of infinite scale separation, and the behavior of the system in this limit can be analysed using singular perturbation theory. We remark that the calculations of the effective drift and of the effective diffusion tensor are performed separately, because a different re-scaling is needed in each case; this is due to the fact that advection and diffusion have different characteristic time scales.

10.6.1 Calculation of the Effective Drift

The backward Kolmogorov equation reads
\[
\frac{\partial u(x, y, f, t)}{\partial t} = \left( F(x, y, f)\cdot\nabla_x + D\Delta_x + \mathcal{L}_f + \mathcal{L}_y \right) u(x, y, f, t), \tag{10.5}
\]
where $F(x, y, f) := \frac{1}{\gamma}\left(-\nabla_x V(x, f) + y\right)$. We re-scale space and time in (10.5) according to $x \to \varepsilon x$, $t \to \varepsilon t$, and divide through by $\varepsilon$ to obtain
\[
\frac{\partial u^\varepsilon}{\partial t} = F\!\left(\frac{x}{\varepsilon}, y, f\right)\cdot\nabla_x u^\varepsilon + \varepsilon D\Delta_x u^\varepsilon + \frac{1}{\varepsilon}\left(\mathcal{L}_f + \mathcal{L}_y\right) u^\varepsilon. \tag{10.6}
\]
We solve (10.6) perturbatively by looking for a solution in the form of a two-scale expansion
\[
u^\varepsilon(x, y, f, t) = u_0\!\left(x, \frac{x}{\varepsilon}, y, f, t\right) + \varepsilon\, u_1\!\left(x, \frac{x}{\varepsilon}, y, f, t\right) + \varepsilon^2 u_2\!\left(x, \frac{x}{\varepsilon}, y, f, t\right) + \dots \tag{10.7}
\]
All terms in the expansion (10.7) are periodic functions of $z = x/\varepsilon$. Notice that we do not take the terms in the expansion to depend explicitly on the fast time $t/\varepsilon$; this is because the coefficients of the backward Kolmogorov equation (10.6) do not depend explicitly on $t/\varepsilon$. In the case where the fluctuations are periodic in time, rather than Markovian, we will need to assume that the terms in the multiscale expansion for $u^\varepsilon(x, y, f, t)$ depend explicitly on $t/\varepsilon$; the details are presented in the appendix.

We treat $x$ and $z$ as independent variables. From the chain rule we have
\[
\nabla_x \to \nabla_x + \frac{1}{\varepsilon}\nabla_z. \tag{10.8}
\]
We now substitute (10.7) into (10.6), use (10.8) and equate the coefficients of equal powers of $\varepsilon$, to obtain the following sequence of equations:
\[
\mathcal{L}_0 u_0 = 0, \tag{10.9}
\]
\[
\mathcal{L}_0 u_1 = -\mathcal{L}_1 u_0 + \frac{\partial u_0}{\partial t}, \tag{10.10}
\]
where
\[
\mathcal{L}_0 = F(z, y, f)\cdot\nabla_z + D\Delta_z + \mathcal{L}_y + \mathcal{L}_f \quad\text{and}\quad \mathcal{L}_1 = F(z, y, f)\cdot\nabla_x + 2D\nabla_z\cdot\nabla_x. \tag{10.11}
\]
The operator $\mathcal{L}_0$ is the generator of a Markov process on $Q \times E_y \times E_f$. In order to proceed we need to assume that this process is ergodic: there exists a unique stationary solution of the Fokker–Planck equation
\[
\mathcal{L}_0^* \rho(z, y, f) = 0, \qquad \int_{Q\times E_y\times E_f} \rho(z, y, f)\,dz\,dy\,df = 1, \tag{10.12}
\]
where
\[
\mathcal{L}_0^*\rho = -\nabla_z\cdot\left(F(z, y, f)\rho\right) + D\Delta_z\rho + \mathcal{L}_y^*\rho + \mathcal{L}_f^*\rho.
\]
In the above, $\mathcal{L}_y^*$ and $\mathcal{L}_f^*$ are the Fokker–Planck operators of $y$ and $f$, respectively. The stationary density $\rho(z, y, f)$ satisfies periodic boundary conditions in $z$ and appropriate boundary conditions in $y$ and $f$. We emphasize that the ergodicity of the "fast" process is necessary for the very existence of an effective drift and an effective diffusion coefficient, and it has been tacitly assumed in all theoretical investigations concerning Brownian motors [78].

Under the assumption that (10.12) has a unique solution, eqn. (10.9) implies, by the Fredholm alternative, that $u_0$ is independent of the fast scales: $u_0 = u(x, t)$. Eqn. (10.10) now becomes
\[
\mathcal{L}_0 u_1 = \frac{\partial u(x, t)}{\partial t} - F(z, y, f)\cdot\nabla_x u(x, t).
\]
In order for this equation to be well posed it is necessary that the right hand side averages to 0 with respect to the invariant distribution $\rho(z, y, f)$. This leads to the following backward Liouville equation,
\[
\frac{\partial u(x, t)}{\partial t} = U_{eff}\cdot\nabla_x u(x, t),
\]
with the effective drift given by
\[
U_{eff} = \int_{Q\times E_y\times E_f} F(z, y, f)\rho(z, y, f)\,dz\,dy\,df = \frac{1}{\gamma}\int_{Q\times E_y\times E_f} \left(-\nabla_z V(z, f) + y\right)\rho(z, y, f)\,dz\,dy\,df. \tag{10.13}
\]

10.6.2 Calculation of the Effective Diffusion Coefficient

We assume for the moment that the effective drift vanishes, $U_{eff} = 0$. We perform a diffusive re-scaling in (10.5), $x \to \varepsilon x$, $t \to \varepsilon^2 t$, and divide through by $\varepsilon^2$ to obtain
\[
\frac{\partial u^\varepsilon}{\partial t} = \frac{1}{\varepsilon}F\!\left(\frac{x}{\varepsilon}, y, f\right)\cdot\nabla_x u^\varepsilon + D\Delta_x u^\varepsilon + \frac{1}{\varepsilon^2}\left(\mathcal{L}_f + \mathcal{L}_y\right) u^\varepsilon. \tag{10.14}
\]
We go through the same analysis as in the previous subsection to obtain the following sequence of equations:
\[
\mathcal{L}_0 u_0 = 0, \tag{10.15}
\]
\[
\mathcal{L}_0 u_1 = -\mathcal{L}_1 u_0, \tag{10.16}
\]
\[
\mathcal{L}_0 u_2 = -\mathcal{L}_1 u_1 - \mathcal{L}_2 u_0, \tag{10.17}
\]
where $\mathcal{L}_0$ and $\mathcal{L}_1$ were defined in the previous subsection and
\[
\mathcal{L}_2 = -\frac{\partial}{\partial t} + D\Delta_x.
\]
Equation (10.15) implies that $u_0 = u(x, t)$. We proceed now with the analysis of equation (10.16), which becomes
\[
\mathcal{L}_0 u_1 = -F(z, y, f)\cdot\nabla_x u(x, t).
\]
Since we have assumed that $U_{eff} = 0$, the right hand side of the above equation averages to zero with respect to $\rho$, and this equation is well posed. Its solution is
\[
u_1(x, z, y, f, t) = \chi(z, y, f)\cdot\nabla_x u(x, t),
\]
where the auxiliary field $\chi(z, y, f)$ satisfies the Poisson equation
\[
-\mathcal{L}_0\chi(z, y, f) = F(z, y, f),
\]
with periodic boundary conditions in $z$ and appropriate boundary conditions in $y$ and $f$. The solvability condition for equation (10.17) reads
\[
\int_{Q\times E_y\times E_f} \left(\mathcal{L}_1 u_1 + \mathcal{L}_2 u_0\right)\rho(z, y, f)\,dz\,dy\,df = 0,
\]
from which, after some straightforward algebra, we obtain the limiting backward Kolmogorov equation for $u(x, t)$:
\[
\frac{\partial u(x, t)}{\partial t} = \sum_{i,j=1}^d D_{ij}^{eff}\,\frac{\partial^2 u(x, t)}{\partial x_i \partial x_j}.
\]
The effective diffusion tensor is
\[
D_{ij}^{eff} = D\delta_{ij} + \left\langle F_i(z, y, f)\,\chi_j(z, y, f)\right\rangle_\rho + 2D\left\langle \frac{\partial\chi_i(z, y, f)}{\partial z_j}\right\rangle_\rho. \tag{10.18}
\]
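In the simplest one-dimensional equilibrium special case (no $f$ or $y$, $\gamma = 1$, so $F(z) = -V'(z)$ and $\mathcal{L}_0 = -V'(z)\partial_z + D\partial_z^2$), the Poisson equation and formula (10.18) can be implemented directly on a periodic grid. The sketch below is illustrative (grid size and potential are assumptions); the result can be checked against the classical closed-form expression $D_{eff} = L^2 D/(Z\hat{Z})$ for diffusion in a periodic potential, which is also recovered later in this chapter.

```python
import numpy as np

def d_eff_cell_problem(V, Vp, D=1.0, L=2*np.pi, n=400):
    """Solve -L0 chi = -V'(z) with L0 = -V'(z) d/dz + D d^2/dz^2 on a periodic
    grid (central finite differences), then evaluate (10.18):
    D_eff = D + <(-V') chi>_rho + 2 D <chi'>_rho, with rho ~ exp(-V/D)."""
    z = np.linspace(0.0, L, n, endpoint=False)
    dz = L / n
    vp = Vp(z)
    A = np.zeros((n, n))
    for i in range(n):
        ip, im = (i + 1) % n, (i - 1) % n
        A[i, i] = -2 * D / dz**2
        A[i, ip] = D / dz**2 - vp[i] / (2 * dz)
        A[i, im] = D / dz**2 + vp[i] / (2 * dz)
    # A is singular (constants are in its null space): take the least-squares solution
    chi = np.linalg.lstsq(A, vp, rcond=None)[0]
    chip = (np.roll(chi, -1) - np.roll(chi, 1)) / (2 * dz)
    rho = np.exp(-V(z) / D)
    rho /= rho.sum() * dz
    avg = lambda g: float((g * rho).sum() * dz)
    return D + avg(-vp * chi) + 2 * D * avg(chip)

def d_eff_closed_form(V, D=1.0, L=2*np.pi, n=4000):
    """L^2 D / (Z * Zhat) with Z = int exp(-V/D), Zhat = int exp(V/D)."""
    z = np.linspace(0.0, L, n, endpoint=False)
    dz = L / n
    return L**2 * D / ((np.exp(-V(z) / D).sum() * dz) * (np.exp(V(z) / D).sum() * dz))

V = lambda z: np.cos(z)       # assumed cosine potential
Vp = lambda z: -np.sin(z)
print(d_eff_cell_problem(V, Vp), d_eff_closed_form(V))
```

The additive constant in $\chi$ left undetermined by the singular system is harmless here, since $\langle V'\rangle_\rho = 0$ for the equilibrium density.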
The case where the effective drift does not vanish, $U_{eff} \neq 0$, can be reduced to the situation analyzed in this subsection through a Galilean transformation with respect to $U_{eff}$: the process $x^\varepsilon(t) := \varepsilon\left(x(t/\varepsilon^2) - \varepsilon^{-2} U_{eff}\,t\right)$ converges to a mean zero Gaussian process with effective diffusivity given below. The field $\chi(z, y, f)$ now satisfies the Poisson equation
\[
-\mathcal{L}_0\chi = F(z, y, f) - U_{eff}, \tag{10.19}
\]
and the effective diffusion tensor is given by
\[
D_{ij}^{eff} = D\delta_{ij} + \left\langle \left(F_i(z, y, f) - U_{eff}^i\right)\chi_j(z, y, f)\right\rangle_\rho + 2D\left\langle \frac{\partial\chi_i(z, y, f)}{\partial z_j}\right\rangle_\rho, \tag{10.20}
\]
where the notation $\langle\cdot\rangle_\rho$ for the averaging with respect to the invariant density has been introduced.

10.7 Effective Diffusion Coefficient for Correlation Ratchets

In this section we consider the following model [4, 17]:
\[
\gamma\dot{x}(t) = -\nabla V(x(t)) + y(t) + \sqrt{2\gamma k_B T}\,\xi(t), \tag{10.21a}
\]
\[
\dot{y}(t) = -\frac{1}{\tau} y(t) + \sqrt{\frac{2\sigma}{\tau}}\,\zeta(t), \tag{10.21b}
\]
where $\xi(t)$ and $\zeta(t)$ are mutually independent standard d-dimensional white noise processes. The potential $V(x)$ is assumed to be L-periodic in all spatial directions. The process $y(t)$ is the d-dimensional Ornstein–Uhlenbeck (OU) process [35], a mean zero Gaussian process with correlation function
\[
\langle y_i(t) y_j(s)\rangle = \sigma\delta_{ij} e^{-|t-s|/\tau}.
\]
Let $z(t)$ denote the restriction of $x(t)$ to $Q = [0, L]^d$. The generator of the Markov process $\{z(t), y(t)\} \in Q \times \mathbb{R}^d$ is
\[
\mathcal{L} = \frac{1}{\gamma}\left(-\nabla_z V(z) + y\right)\cdot\nabla_z + D\Delta_z + \frac{1}{\tau}\left(-y\cdot\nabla_y + \sigma\Delta_y\right),
\]
with $D := k_B T/\gamma$. Standard results from the ergodic theory of Markov processes (see e.g. [5, Ch. 3]) ensure that the process $\{z(t), y(t)\}$, with generator $\mathcal{L}$, is ergodic and that the unique invariant measure has a smooth density $\rho(y, z)$ with respect to the Lebesgue measure. Hence, the results of Section 10.6 apply: the effective drift and the effective diffusion tensor are given by formulae (10.13) and (10.20), respectively. Of course, in order to calculate these quantities we need to solve equations (10.12) and (10.19), which take the form
\[
-\frac{1}{\gamma}\nabla_z\cdot\left(\left(-\nabla_z V(z) + y\right)\rho(y, z)\right) + D\Delta_z\rho(y, z) + \frac{1}{\tau}\left(\nabla_y\cdot\left(y\rho(y, z)\right) + \sigma\Delta_y\rho(y, z)\right) = 0
\]
and
\[
-\frac{1}{\gamma}\left(-\nabla_z V(z) + y\right)\cdot\nabla_z\chi(y, z) - D\Delta_z\chi(y, z) - \frac{1}{\tau}\left(-y\cdot\nabla_y\chi(y, z) + \sigma\Delta_y\chi(y, z)\right) = \frac{1}{\gamma}\left(-\nabla_z V(z) + y\right) - U.
\]

The effective diffusion tensor is positive definite; this is true even at zero temperature [?, 65]. To prove this, let $e$ be a unit vector in $\mathbb{R}^d$, define $f = F\cdot e$ and $u = U_{eff}\cdot e$, and let $\varphi := e\cdot\chi$ denote the unique solution of the scalar problem
\[
-\mathcal{L}\varphi = (F - U_{eff})\cdot e =: f - u, \qquad \varphi(y, z + L\hat{e}_i) = \varphi(y, z), \qquad \langle\varphi\rangle_\rho = 0.
\]
Let now $h(y, z)$ be a sufficiently smooth function. Elementary computations yield
\[
\mathcal{L}^*(h\rho) = -\rho\,\mathcal{L}h + 2D\,\nabla_z\cdot\left(\rho\nabla_z h\right) + \frac{2\sigma}{\tau}\,\nabla_y\cdot\left(\rho\nabla_y h\right).
\]
We use this calculation in the formula for the effective diffusion tensor, together with an integration by parts and the facts that $\mathcal{L}^*\rho = 0$ and $\langle\varphi\rangle_\rho = 0$, to obtain
\[
e\cdot D_{eff}\cdot e = D + \langle f\varphi\rangle_\rho + 2D\,\langle e\cdot\nabla_z\varphi\rangle_\rho = D + \langle(-\mathcal{L}\varphi)\varphi\rangle_\rho + 2D\,\langle e\cdot\nabla_z\varphi\rangle_\rho
\]
\[
= D + D\,\langle|\nabla_z\varphi|^2\rangle_\rho + \frac{\sigma}{\tau}\,\langle|\nabla_y\varphi|^2\rangle_\rho + 2D\,\langle e\cdot\nabla_z\varphi\rangle_\rho = D\left\langle |e + \nabla_z\varphi|^2\right\rangle_\rho + \frac{\sigma}{\tau}\left\langle |\nabla_y\varphi|^2\right\rangle_\rho.
\]
From the above formula we see that the effective diffusion tensor is non-negative definite and that it is well defined even at zero temperature:
\[
e\cdot D_{eff}(T=0)\cdot e = \frac{\sigma}{\tau}\left\langle |\nabla_y\varphi(T=0)|^2\right\rangle_\rho.
\]
Although we cannot solve these equations in closed form, it is possible to calculate the small-τ expansion of the effective drift and the effective diffusion coefficient, at least in one dimension. A tedious calculation using singular perturbation theory yields
\[
U_{eff} = O(\tau^3) \tag{10.22}
\]
and
\[
D_{eff} = \frac{L^2}{Z\hat{Z}}\left[ D + \tau\sigma\left(1 + \frac{1}{\gamma D}\left(\frac{Z_2}{\hat{Z}} - \frac{Z_1}{Z}\right)\right)\right] + O(\tau^2). \tag{10.23}
\]
In writing eqn. (10.23) we have used the following notation:
\[
Z = \int_0^L e^{-V(z)/D}\,dz, \quad \hat{Z} = \int_0^L e^{V(z)/D}\,dz, \quad Z_1 = \int_0^L V(z)\,e^{-V(z)/D}\,dz, \quad Z_2 = \int_0^L V(z)\,e^{V(z)/D}\,dz.
\]
Naturally, in the limit as τ → 0 the effective diffusion coefficient converges to its value for y ≡ 0:
\[
D_{eff} = \frac{L^2 D}{Z\hat{Z}}. \tag{10.24}
\]
This is the effective diffusion coefficient for a Brownian particle moving in a periodic potential in the absence of external fluctuations [54, 94]. It is relatively straightforward to obtain the next order correction to (10.23); the resulting formula is, however, too complicated to be of much use. We remark that the O(τ²) term involves third order derivatives of the potential and can be defined only when V(x) ∈ C³(0, L). We also remark that the expansion (10.23) is only valid for positive temperatures: the problem becomes substantially more complicated at zero temperature, because the generator of the Markov process becomes a degenerate differential operator at T = 0.

The small-τ asymptotics for the effective drift were also studied in [4, 17] for the model considered in this section, and in [16, 92] when the external fluctuations are given by a continuous time Markov chain. It was shown in [16, 92] that, for the case of dichotomous noise, the small-τ expansion for U_eff is valid only for sufficiently smooth potentials. Indeed, the first non-zero term, of order O(τ³), involves the second derivative of the potential; non-smooth potentials lead to an effective drift which is O(τ²). On the other hand, the expansion (10.23) does not involve any derivatives of the potential and, hence, is well defined even for non-smooth potentials.
It is well known, and easy to prove, that the effective diffusion coefficient given by (10.24) is bounded from above by D; this is not the case for the effective diffusivity of the correlation ratchet (10.21).

We now compare the small-τ asymptotics for the effective diffusion coefficient with Monte Carlo simulations. The results presented in Figures 10.1 and 10.2 were obtained from the numerical solution of equations (10.21), with the cosine potential V(x) = cos(x), using the Euler–Maruyama method. The integration step that was used was Δt = 10⁻⁴ and the total number of integration steps was 10⁷. The effective diffusion coefficient was calculated by ensemble averaging over 2000 particle trajectories whose initial positions were uniformly distributed on [0, 2π].

Figure 10.1: Effective diffusivity for (10.21) with V(x) = cos(x) as a function of τ, for σ = 1. Solid line: results from Monte Carlo simulations. Dashed line: results from formula (10.23). γ = 1, D = k_B T = 1.

Figure 10.2: Effective diffusivity for (10.21) with V(x) = cos(x) as a function of σ, for a fixed small value of τ. Solid line: results from Monte Carlo simulations. Dashed line: results from formula (10.23). γ = 1, D = k_B T = 1.

In Figure 10.1 we present the effective diffusion coefficient as a function of the correlation time τ of the OU process; we also plot the results of the small-τ asymptotics. The agreement between the theoretical predictions from (10.23) and the numerical experiments is excellent. In Figure 10.2 we plot the effective diffusivity as a function of the noise strength σ of the OU process. The agreement between the theoretical predictions and the numerical results is again quite satisfactory. As expected, the effective diffusivity is an increasing function of σ and, for τ ≪ 1, an increasing function of τ.

10.8 Discussion and Bibliography

10.9 Exercises

In this appendix we derive formulae for the mean drift and the effective diffusion coefficient for a Brownian particle which moves according to
\[
\gamma\dot{x}(t) = -\nabla V(x(t), t) + y(t) + \sqrt{2\gamma k_B T(x(t), t)}\,\xi(t), \tag{10.25}
\]
for a space-time periodic potential V(x, t), a space-time periodic temperature T(x, t) > 0, and a periodic in time force y(t). We take the spatial period of V(x, t) to be L in all directions, and the temporal period of V(x, t) and y(t) to be T. Equation (10.25) is interpreted in the Itô sense. We use the notation Q = [0, L]^d.
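The ensemble-averaging procedure used for the Monte Carlo results above can be sketched as follows (an illustrative implementation in one dimension; the step size, time horizon and ensemble size are scaled down from the values quoted in the text).

```python
import numpy as np

def estimate_deff(vprime, gamma=1.0, D=1.0, sigma=1.0, tau=0.5,
                  T=20.0, dt=1e-2, n_traj=500, seed=1):
    """Euler-Maruyama estimate of the effective diffusivity (10.2) for the
    1D correlation ratchet (10.21): gamma*dx = (-V'(x) + y) dt + sqrt(2*gamma*kBT) dW,
    dy = -(y/tau) dt + sqrt(2*sigma/tau) dW', with D = kBT/gamma."""
    rng = np.random.default_rng(seed)
    x = 2 * np.pi * rng.random(n_traj)            # uniform initial positions
    x0 = x.copy()
    y = np.sqrt(sigma) * rng.standard_normal(n_traj)   # stationary OU start
    for _ in range(int(T / dt)):
        x += ((-vprime(x) + y) / gamma) * dt \
             + np.sqrt(2 * D * dt) * rng.standard_normal(n_traj)
        y += -(y / tau) * dt + np.sqrt(2 * sigma * dt / tau) * rng.standard_normal(n_traj)
    return float(np.var(x - x0) / (2 * T))

print(estimate_deff(vprime=lambda x: -np.sin(x)))   # V(x) = cos(x), illustrative parameters
```

A useful sanity check: for a vanishing potential the long-time diffusivity is $D + \sigma\tau/\gamma^2$ (white noise plus integrated OU forcing).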
Chapter 11

Stochastic Processes and Statistical Mechanics

11.1 Introduction

We will consider some simple "particle + environment" systems for which we can obtain rigorously a stochastic equation that describes the dynamics of the Brownian particle. We describe the dynamics of the Brownian particle/fluid system through a Hamiltonian of the form
\[
H(Q_N, P_N, q, p) = H_{BP}(Q_N, P_N) + H_{HB}(q, p) + H_I(Q_N, q), \tag{11.1}
\]
where $\{q, p\} := \{q_j, p_j\}_{j=1}^N$ are the positions and momenta of the fluid particles and $N$ is the number of fluid (heat bath) particles (we will need to take the thermodynamic limit $N \to +\infty$). Our goal is to eliminate the fluid variables $\{q_j, p_j\}_{j=1}^N$ and to obtain a closed equation for the Brownian particle. The initial conditions of the Brownian particle are taken to be fixed, whereas the fluid is assumed to be initially in equilibrium (Gibbs distribution). We will see that this closed equation is a stochastic integrodifferential equation, the Generalized Langevin Equation (GLE) (in the limit as $N \to +\infty$)
\[
\ddot{Q} = -V'(Q) - \int_0^t R(t-s)\dot{Q}(s)\,ds + F(t), \tag{11.2}
\]
where $R(t)$ is the memory kernel and $F(t)$ is the noise. We will also see that, in some appropriate limit, we can derive the Markovian Langevin equation (9.11).
11.2 The Kac–Zwanzig Model

We need to model the interaction between the heat bath particles and the coupling between the Brownian particle and the heat bath. The simplest model is that of a harmonic heat bath and of linear coupling:
\[
H(Q_N, P_N, q, p) = \frac{P_N^2}{2} + V(Q_N) + \sum_{n=1}^N \left( \frac{p_n^2}{2m_n} + \frac{k_n}{2}\left(q_n - \lambda Q_N\right)^2 \right). \tag{11.3}
\]
The initial conditions of the Brownian particle, $\{Q_N(0), P_N(0)\} := \{Q_0, P_0\}$, are taken to be deterministic. The initial conditions of the heat bath particles are distributed according to the Gibbs distribution, conditional on the knowledge of $\{Q_0, P_0\}$:
\[
\mu_\beta(dq\,dp) = Z^{-1} e^{-\beta H(q, p)}\,dq\,dp, \tag{11.4}
\]
where $\beta$ is the inverse temperature. This is a way of introducing the concept of temperature in the system, through the average kinetic energy of the bath particles. Notice that we actually consider the Gibbs measure of an effective (renormalized) Hamiltonian. Other choices for the initial conditions are possible; our choice ensures that the forcing term in the GLE that we will derive is mean zero (see below). In order to choose the initial conditions according to $\mu_\beta(dq\,dp)$ we can take
\[
q_n(0) = \lambda Q_0 + \sqrt{\beta^{-1} k_n^{-1}}\,\xi_n, \qquad p_n(0) = \sqrt{m_n\beta^{-1}}\,\eta_n, \tag{11.5}
\]
where the $\xi_n$, $\eta_n$ are mutually independent sequences of i.i.d. $\mathcal{N}(0, 1)$ random variables.

Hamilton's equations of motion are
\[
\ddot{Q}_N + V'(Q_N) = \sum_{n=1}^N k_n\left(\lambda q_n - \lambda^2 Q_N\right), \tag{11.6a}
\]
\[
\ddot{q}_n + \omega_n^2\left(q_n - \lambda Q_N\right) = 0, \qquad n = 1, \dots, N, \tag{11.6b}
\]
where $\omega_n^2 = k_n/m_n$. The equations for the heat bath particles are second order linear inhomogeneous equations with constant coefficients. Our plan is to solve them and then to substitute the result into the equations of motion for the Brownian particle.
We can solve the equations of motion for the heat bath variables using the variation of constants formula:
\[
q_n(t) = q_n(0)\cos(\omega_n t) + \frac{p_n(0)}{m_n\omega_n}\sin(\omega_n t) + \lambda\omega_n \int_0^t \sin\left(\omega_n(t-s)\right) Q_N(s)\,ds.
\]
An integration by parts yields
\[
q_n(t) = q_n(0)\cos(\omega_n t) + \frac{p_n(0)}{m_n\omega_n}\sin(\omega_n t) + \lambda Q_N(t) - \lambda Q_N(0)\cos(\omega_n t) - \lambda\int_0^t \cos\left(\omega_n(t-s)\right)\dot{Q}_N(s)\,ds.
\]
We substitute this into equation (11.6a) and use the initial conditions (11.5) to obtain the Generalized Langevin Equation
\[
\ddot{Q}_N = -V'(Q_N) - \lambda^2 \int_0^t R_N(t-s)\dot{Q}_N(s)\,ds + \lambda F_N(t), \tag{11.7}
\]
where the memory kernel is
\[
R_N(t) = \sum_{n=1}^N k_n\cos(\omega_n t) \tag{11.8}
\]
and the noise process is
\[
F_N(t) = \sum_{n=1}^N \left( k_n\left(q_n(0) - \lambda Q_0\right)\cos(\omega_n t) + \frac{k_n\,p_n(0)}{m_n\omega_n}\sin(\omega_n t) \right) = \sqrt{\beta^{-1}}\sum_{n=1}^N \sqrt{k_n}\left(\xi_n\cos(\omega_n t) + \eta_n\sin(\omega_n t)\right). \tag{11.9}
\]

Remarks 11.2.1.
i. The noise $F_N(t)$ is a mean zero Gaussian process.
ii. The noise and the memory kernel are related through the fluctuation–dissipation theorem:
\[
\langle F_N(t) F_N(s)\rangle = \beta^{-1}\sum_{n=1}^N k_n\left(\cos(\omega_n t)\cos(\omega_n s) + \sin(\omega_n t)\sin(\omega_n s)\right) = \beta^{-1} R_N(t-s). \tag{11.10}
\]
iii. By choosing the frequencies $\omega_n$ and the spring constants $k_n$ of the heat bath particles appropriately, we can pass to the limit as $N \to +\infty$ and obtain the GLE with different memory kernels $R(t)$ and noise processes $F(t)$.
iv. The parameter $\lambda$ measures the strength of the coupling between the Brownian particle and the heat bath.
v. The choice of the initial conditions (11.5) for $q$, $p$ is crucial for the form of the GLE and, in particular, for the fluctuation–dissipation theorem (11.10) to be valid.

Let $a \in (0, 1)$ and $2b = 1 - a$, and set $\omega_n = N^a\zeta_n$, where the $\{\zeta_n\}_{n=1}^\infty$ are i.i.d. with $\zeta_1 \sim \mathcal{U}(0, 1)$. Furthermore, we choose the spring constants according to
\[
k_n = \frac{f^2(\omega_n)}{N^{2b}},
\]
where the function $f(\omega)$ decays sufficiently fast at infinity. We can rewrite the dissipation and noise terms in the form
\[
R_N(t) = \sum_{n=1}^N f^2(\omega_n)\cos(\omega_n t)\,\Delta\omega \quad\text{and}\quad F_N(t) = \sqrt{\beta^{-1}}\sum_{n=1}^N f(\omega_n)\left(\xi_n\cos(\omega_n t) + \eta_n\sin(\omega_n t)\right)\sqrt{\Delta\omega},
\]
where $\Delta\omega = N^a/N = N^{a-1}$. Using now properties of Fourier series with random coefficients/frequencies and of weak convergence of probability measures, we can pass to the limit:
\[
R_N(t) \to R(t) \quad\text{in } L^1[0, T], \text{ for a.a. } \{\zeta_n\}_{n=1}^\infty,
\]
and
\[
F_N(t) \to F(t) \quad\text{weakly in } C([0, T], \mathbb{R}).
\]
The time $T > 0$ is finite but arbitrary. The limiting kernel and noise satisfy the fluctuation–dissipation theorem (cf. (11.10)):
\[
\langle F(t) F(s)\rangle = \beta^{-1} R(t-s). \tag{11.11}
\]
Furthermore, $Q_N(t)$, the solution of (11.7), converges weakly to the solution of the limiting GLE
\[
\ddot{Q} = -V'(Q) - \lambda^2\int_0^t R(t-s)\dot{Q}(s)\,ds + \lambda F(t). \tag{11.12}
\]
The properties of the limiting dissipation and noise are determined by the function f(\omega). As an example, consider the Lorentzian function

f^2(\omega) = \frac{2\alpha/\pi}{\alpha^2 + \omega^2}    (11.13)

with \alpha > 0. Then R(t) = e^{-\alpha |t|}. The noise process F(t) is a mean zero stationary Gaussian process with continuous paths and, from (11.11), exponential correlation function:

\langle F(t) F(s) \rangle = \beta^{-1} e^{-\alpha |t-s|}.

Hence, F(t) is the stationary Ornstein-Uhlenbeck process:

\frac{dF}{dt} = -\alpha F + \sqrt{2\beta^{-1}\alpha} \, \frac{dW}{dt},    (11.14)

with F(0) \sim N(0, \beta^{-1}). In this case the GLE (11.12) becomes

\ddot{Q} = -V'(Q) - \lambda^2 \int_0^t e^{-\alpha(t-s)} \dot{Q}(s) \, ds + \lambda F(t),    (11.15)

where F(t) is the OU process (11.14). The stochastic process Q(t) has memory; it is not a Markov process: the future, when conditioned on the present, is not statistically independent of the past. We can turn (11.15) into a Markovian SDE by enlarging the dimension of the state space, i.e. by introducing auxiliary variables. In general we might have to introduce infinitely many variables! For the case of the exponential memory kernel, when the noise is given by an OU process, it is sufficient to introduce one auxiliary variable. We can rewrite (11.15) as a system of SDEs:

\frac{dQ}{dt} = P,
\frac{dP}{dt} = -V'(Q) + \lambda Z,
\frac{dZ}{dt} = -\alpha Z - \lambda P + \sqrt{2\alpha\beta^{-1}} \, \frac{dW}{dt},

where Z(0) \sim N(0, \beta^{-1}). The process \{Q(t), P(t), Z(t)\} \in \mathbb{R}^3 is Markovian.
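A quick sanity check on the extended system: for a quadratic potential V(q) = q²/2 one can verify that the Gibbs measure ∝ e^{-β(Q²/2 + P²/2 + Z²/2)} is invariant, so a long run should give Var(Q) ≈ β⁻¹. The Euler-Maruyama discretisation and all parameter values below are illustrative choices, not from the text.

```python
import numpy as np

# Euler-Maruyama for dQ = P dt, dP = (-V'(Q) + lam*Z) dt,
# dZ = (-alpha*Z - lam*P) dt + sqrt(2*alpha/beta) dW, with V(q) = q^2/2.
# In equilibrium the variance of Q should be close to 1/beta.
rng = np.random.default_rng(2)
alpha, lam, beta = 1.0, 1.0, 1.0
dt, nsteps, burn, M = 0.01, 20_000, 5_000, 256     # M independent replicas
Q = rng.standard_normal(M)
P = rng.standard_normal(M)
Z = rng.standard_normal(M)
acc, cnt = 0.0, 0
for n in range(nsteps):
    dW = rng.normal(0.0, np.sqrt(dt), M)
    Q, P, Z = (Q + P * dt,
               P + (-Q + lam * Z) * dt,
               Z + (-alpha * Z - lam * P) * dt + np.sqrt(2.0 * alpha / beta) * dW)
    if n >= burn:
        acc += np.mean(Q * Q)
        cnt += 1
var_Q = acc / cnt
print(var_Q)   # close to 1/beta
```

Averaging over independent replicas as well as over time keeps the statistical error of the variance estimate at the percent level.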
It is a degenerate Markov process: the noise acts directly on only one of the three degrees of freedom. We can eliminate the auxiliary process Z by taking an appropriate distinguished limit. Set \lambda = \sqrt{\gamma}\,\varepsilon^{-1} and \alpha = \varepsilon^{-2}. The equations become

\frac{dQ}{dt} = P,
\frac{dP}{dt} = -V'(Q) + \frac{\sqrt{\gamma}}{\varepsilon} Z,
\frac{dZ}{dt} = -\frac{1}{\varepsilon^2} Z - \frac{\sqrt{\gamma}}{\varepsilon} P + \frac{\sqrt{2\beta^{-1}}}{\varepsilon} \frac{dW}{dt}.

As \varepsilon \to 0 we have that

\frac{\sqrt{\gamma}}{\varepsilon} Z \to \sqrt{2\gamma\beta^{-1}} \, \frac{dW}{dt} - \gamma P.    (11.17)

We can use tools from singular perturbation theory for Markov processes to show that, in this limit, we obtain the Markovian Langevin equation (R(t) = \gamma \delta(t))

\ddot{Q} = -V'(Q) - \gamma \dot{Q} + \sqrt{2\gamma\beta^{-1}} \, \frac{dW}{dt}.    (11.18)

11.3 Quasi-Markovian Stochastic Processes

In the previous section we studied the GLE for the case where the memory kernel decays exponentially fast. We showed that we can represent the GLE as a Markovian process by adding one additional variable. A natural question which arises is whether it is always possible to turn the GLE into a Markovian system by adding a finite number of additional variables. This is not always the case. However, there are many applications where the memory kernel decays sufficiently fast so that we can approximate the GLE by a finite dimensional Markovian system. We introduce the concept of a quasi-Markovian stochastic process.

Definition 11.3.1. We will say that a stochastic process X_t is quasi-Markovian if it can be represented as a Markovian stochastic process by adding a finite number of additional variables: there exists a stochastic process Y_t so that \{X_t, Y_t\} is a Markov process.

In many cases the additional variables Y_t can be expressed in terms of solutions to linear SDEs. This is possible, for example, when the memory kernel consists of a sum of exponential functions, a natural extension of the case considered in the previous section.
Proposition 11.3.2. Consider the generalized Langevin equation

\dot{Q} = P, \qquad \dot{P} = -V'(Q) - \int_0^t R(t-s) P(s) \, ds + F(t)    (11.19)

with a memory kernel of the form

R(t) = \sum_{j=1}^n \lambda_j^2 e^{-\alpha_j t}    (11.20)

and F(t) a mean zero stationary Gaussian process, where R(t) and F(t) are related through the fluctuation-dissipation theorem, \langle F(t) F(s) \rangle = \beta^{-1} R(t-s). Then (11.19) is equivalent to the Markovian SDE

\dot{Q} = P, \qquad \dot{P} = -V'(Q) + \sum_{j=1}^n \lambda_j u_j,    (11.21)

\dot{u}_j = -\alpha_j u_j - \lambda_j P + \sqrt{2\alpha_j \beta^{-1}} \, \dot{W}_j, \qquad j = 1, \dots, n,    (11.22)

with u_j(0) \sim N(0, \beta^{-1}) and where the W_j(t) are independent standard one dimensional Brownian motions.

Proof. We solve the equations for the u_j using the variation of constants formula:

u_j(t) = e^{-\alpha_j t} u_j(0) - \lambda_j \int_0^t e^{-\alpha_j (t-s)} P(s) \, ds + \sqrt{2\alpha_j \beta^{-1}} \int_0^t e^{-\alpha_j (t-s)} \, dW_j =: -\int_0^t R_j(t-s) P(s) \, ds + \eta_j(t),

with R_j(t) = \lambda_j e^{-\alpha_j t}. We substitute this into the equation for P to obtain

\dot{P} = -V'(Q) + \sum_{j=1}^n \lambda_j u_j = -V'(Q) + \sum_{j=1}^n \lambda_j \Big( -\int_0^t R_j(t-s) P(s) \, ds + \eta_j(t) \Big) = -V'(Q) - \int_0^t R(t-s) P(s) \, ds + F(t),

where R(t) is given by (11.20) and the noise process F(t) is

F(t) = \sum_{j=1}^n \lambda_j \eta_j(t).
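The relation ⟨F(t)F(s)⟩ = β⁻¹ R(t-s) for a noise of the form F(t) = Σ_j λ_j η_j(t), with η_j independent stationary OU processes and R(t) = Σ_j λ_j² e^{-α_j t}, can also be verified by direct simulation. The rates α_j and couplings λ_j below are illustrative.

```python
import numpy as np

# Simulate n = 2 independent stationary OU processes with
# d eta_j = -alpha_j eta_j dt + sqrt(2 alpha_j / beta) dW_j, eta_j(0) ~ N(0, 1/beta),
# form F(t) = sum_j lam_j eta_j(t), and compare the empirical <F(t) F(0)> with
# beta^{-1} R(t), R(t) = sum_j lam_j^2 exp(-alpha_j t).
rng = np.random.default_rng(3)
alphas = np.array([1.0, 3.0])
lams = np.array([1.0, 0.5])
beta = 1.0
dt, nsteps, M = 0.01, 300, 40_000        # M independent replicas for the ensemble average
eta = rng.normal(0.0, np.sqrt(1.0 / beta), size=(M, 2))
F0 = eta @ lams
corr = np.empty(nsteps + 1)
corr[0] = np.mean(F0 * (eta @ lams))
for n in range(1, nsteps + 1):
    dW = rng.normal(0.0, np.sqrt(dt), size=(M, 2))
    eta = eta - alphas * eta * dt + np.sqrt(2.0 * alphas / beta) * dW
    corr[n] = np.mean(F0 * (eta @ lams))
t = dt * np.arange(nsteps + 1)
R = (lams**2 * np.exp(-np.outer(t, alphas))).sum(axis=1)
print(np.max(np.abs(corr - R / beta)))   # small (Monte Carlo + discretisation error)
```

Cross terms vanish in expectation because the η_j are independent, leaving exactly the sum of exponentials.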
Here the \eta_j(t) are one-dimensional stationary independent OU processes. We readily check that the fluctuation-dissipation theorem is satisfied:

\langle F(t) F(s) \rangle = \sum_{i,j=1}^n \lambda_i \lambda_j \langle \eta_i(t) \eta_j(s) \rangle = \beta^{-1} \sum_{i,j=1}^n \lambda_i \lambda_j \delta_{ij} e^{-\alpha_i |t-s|} = \beta^{-1} \sum_{i=1}^n \lambda_i^2 e^{-\alpha_i |t-s|} = \beta^{-1} R(t-s).

Consider now the case where the memory kernel is a bounded analytic function. Its Laplace transform

\hat{R}(s) = \int_0^{+\infty} e^{-st} R(t) \, dt

can be represented as a continued fraction:

\hat{R}(s) = \cfrac{\Delta_1^2}{s + \gamma_1 + \cfrac{\Delta_2^2}{s + \gamma_2 + \cdots}}, \qquad \gamma_i \geq 0;    (11.23)

this follows from results in approximation theory. Since R(t) is bounded, we have that

\lim_{s \to \infty} \hat{R}(s) = 0.

Consider an approximation R_N(t) such that the continued fraction representation terminates after N steps. R_N(t) is bounded, which implies that

\lim_{s \to \infty} \hat{R}_N(s) = 0.

The Laplace transform of R_N(t) is a rational function:

\hat{R}_N(s) = \frac{\sum_{j=1}^N a_j s^{N-j}}{s^N + \sum_{j=1}^N b_j s^{N-j}}, \qquad a_j, b_j \in \mathbb{R}.    (11.24)

This is the Laplace transform of the autocorrelation function of an appropriate linear system of SDEs. Indeed, set

\frac{dx_j}{dt} = -b_j x_j + x_{j+1} + a_j \frac{dW_j}{dt}, \qquad j = 1, \dots, N,    (11.25)
1 Open Classical Systems When studying the KacZwanzing model we considered a one dimensional Hamiltonian system coupled to a ﬁnite dimensional Hamiltonian system with random initial conditions (the harmonic 221 . Z1 . Consider now the case N = 2 with bi = αi . We can write (11. a1 = The GLE becomes t 2β −1 α we derive the GLE (11. a2 = ¨ Q = −V (Q) − λ2 0 ˙ R(t − s)Q(s) ds + λF1 (t).15) with F (t) being the OU 2β −1 α2 . Z2 }: ˙ Q = P. For N = 1 and b1 = α.14). It is still. 2 and a1 = 0. process (11. Under appropriate assumptions on the potential V (Q) the solution of the GLE equation is an ergodic process. It is possible to study the ergodic properties of a quasimarkovian processes by analyzing the spectral properties of the generator of the corresponding Markov process. Z1 . Z2 } so that noise (and hence regularity) is transferred from the degrees of freedom that are directly forced by noise to the ones that are not.with xN +1 (t) = 0. 11. This leads to the analysis of the spectral properties of hypoelliptic operators.3.27) as a Markovian system for the variables {Q. however. There exists a smooth density. ˙ Z1 = −α1 Z1 + Z2 . hypoelliptic (Hormander’s condition is satisﬁed): there is sufﬁcient interaction between the degrees of freedom {Q. The process x1 (t) is a stationary Gaussian process with autocorrelation function RN (t). Stochastic processes that can be written as a Markovian process by adding a ﬁnite number of additional variables are called quasimarkovian .15): noise acts on fewer degrees of freedom. The corresponding Markov semigroup has nice regularizing properties. ˙ Z2 = −α2 Z2 − λP + ˙ 2β −1 α2 W2 . ˙ F1 = −α1 F1 + F2 . i = 1. ˙ F2 = −α2 F2 + with β −1 R(t − s) = F1 (t)F1 (s) . Notice that this diffusion process is ”more degenerate” than (11. P. ˙ P = −V (Q) + λZ1 (t). P. ˙ 2β −1 α2 W2 .
heat bath) and then passed to the thermodynamic limit N \to \infty. We can also consider a small Hamiltonian system coupled to its environment, which we now model as an infinite dimensional Hamiltonian system with random initial conditions: a coupled particle-field model. The distinguished particle (Brownian particle) is described through the Hamiltonian

H_{DP} = \frac{1}{2} p^2 + V(q).    (11.28)

We will model the environment through a classical linear field theory (i.e. the wave equation) with infinite energy:

\partial_t^2 \phi(t,x) = \partial_x^2 \phi(t,x).    (11.29)

The Hamiltonian of this system is

H_{HB}(\phi, \pi) = \frac{1}{2} \int \big( |\partial_x \phi|^2 + |\pi(x)|^2 \big) \, dx,    (11.30)

where \pi(x) denotes the conjugate momentum field. The initial conditions are distributed according to the Gibbs measure (which in this case is a Gaussian measure) at inverse temperature \beta, which we formally write as

"\mu_\beta = Z^{-1} e^{-\beta H(\phi,\pi)} \, d\phi \, d\pi".    (11.31)

Care has to be taken when defining probability measures in infinite dimensions. Under this assumption on the initial conditions, typical configurations of the heat bath have infinite energy. In this way, the environment can pump enough energy into the system so that non-trivial fluctuations emerge. We will assume linear coupling between the particle and the field:

H_I(q, \phi) = q \int \partial_x \phi(x) \, \rho(x) \, dx,    (11.32)

where the function \rho(x) models the coupling between the particle and the field. This coupling is motivated by the dipole coupling approximation from classical electrodynamics. The Hamiltonian of the full particle-field model is

H(q, p, \phi, \pi) = H_{DP}(p, q) + H_{HB}(\phi, \pi) + H_I(q, \phi).    (11.33)

The corresponding Hamiltonian equations of motion are a coupled system of equations for the particle and the field. Now we can proceed as in the case of the finite dimensional heat
bath. We can integrate the equations of motion for the heat bath variables and plug the solution into the equations for the Brownian particle to obtain the GLE. The final result is

\ddot{q} = -V'(q) - \int_0^t R(t-s) \dot{q}(s) \, ds + F(t),    (11.34)

with appropriate definitions for the memory kernel R(t) and the noise F(t), which are related through the fluctuation-dissipation theorem.

11.4 The Mori-Zwanzig Formalism

Consider now the (N+1)-dimensional Hamiltonian system (particle + heat bath) with random initial conditions. The (N+1)-particle probability distribution function f_{N+1} satisfies the Liouville equation

\frac{\partial f_{N+1}}{\partial t} + \{f_{N+1}, H\} = 0,    (11.35)

where H is the full Hamiltonian and \{\cdot, \cdot\} is the Poisson bracket

\{A, B\} = \sum_{j=0}^N \Big( \frac{\partial A}{\partial q_j} \frac{\partial B}{\partial p_j} - \frac{\partial B}{\partial q_j} \frac{\partial A}{\partial p_j} \Big).    (11.36)

We introduce the Liouville operator L_{N+1} \cdot = -i \{\cdot, H\}, so that the Liouville equation can be written as

i \frac{\partial f_{N+1}}{\partial t} = L_{N+1} f_{N+1}.

We want to obtain a closed equation for the distribution function of the Brownian particle. To this end we introduce a projection operator P which projects onto the distribution function f of the Brownian particle:

P f_{N+1} = f, \qquad (I - P) f_{N+1} = h.

The Liouville equation becomes

i \frac{\partial f}{\partial t} = PL(f + h),    (11.37a)

i \frac{\partial h}{\partial t} = (I - P)L(f + h).    (11.37b)
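The second of the projected equations can be integrated with the variation of constants (Duhamel) formula, treating the f-dependent term as a forcing:

```latex
h(t) = e^{-i(I-P)Lt} h(0) - i \int_0^t e^{-i(I-P)L(t-s)} (I-P)L \, f(s) \, ds
     = e^{-i(I-P)Lt} h(0) - i \int_0^t e^{-i(I-P)Ls} (I-P)L \, f(t-s) \, ds .
```

Substituting this expression for h into the equation for f eliminates the bath degrees of freedom and produces the closed, non-Markovian equation (11.38) for the distribution function of the Brownian particle.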
We integrate the second equation and substitute into the first equation. We obtain

i \frac{\partial f}{\partial t} = PLf - i \int_0^t PL e^{-i(I-P)Ls} (I-P)L f(t-s) \, ds + PL e^{-i(I-P)Lt} h(0).    (11.38)

In the Markovian limit (large mass ratio) we obtain the Fokker-Planck equation (??).

11.5 Derivation of the Fokker-Planck and Langevin Equations

11.6 Linear Response Theory

11.7 Discussion and Bibliography

The original papers by Kac et al. and by Zwanzig are [26, 95]. See also [25]. The variant of the Kac-Zwanzig model that we have discussed in this chapter was studied in [37]. An excellent discussion of the derivation of the Fokker-Planck equation using projection operator techniques can be found in [66]. Applications of linear response theory to climate modeling can be found in.

11.8 Exercises
Index

autocorrelation function, 32
Banach space, 17
Brownian motion, scaling and symmetry properties, 116
central limit theorem, 49
conditional expectation, 111
correlation coefficient, 107
covariance function, 42
diffusion process, mean first passage time, 106
diffusion processes, reversible, 24
Dirichlet form, 68
Fokker-Planck equation, 16
Fokker-Planck equation, classical solution of, 137
Gaussian stochastic process, 24
generator, 111
Gibbs distribution, 18
Gibbs measure, 109
Green-Kubo formula, 137
inverse temperature, 88
Ito formula, 13
joint probability density, 106
Karhunen-Loeve expansion, 45
Karhunen-Loeve expansion, for Brownian motion, 96
kinetic equation, 39
Klein-Kramers-Chandrasekhar equation, 107
Kolmogorov equation, 188
Langevin equation, 89
law of large numbers, strong, 88
Markov chain Monte Carlo (MCMC), 125
mean first passage time, 188
multiplicative noise, 126
operator, hypoelliptic, 125
Ornstein-Uhlenbeck process, 137
Ornstein-Uhlenbeck process, Fokker-Planck equation for, 107
partition function, 95
Poincare's inequality, 101
Poincare's inequality, for Gaussian measures, 107
quasi-Markovian stochastic process, 221
random variable, Gaussian, 17
random variable, uncorrelated, 17
reversible diffusion, 31
spectral density, 106
stationary process, 31
stationary process, second order stationary, 31
stationary process, strictly stationary, 32
stationary process, wide sense stationary, 31
stochastic differential equation, 32
stochastic process, definition, 29
stochastic process, Gaussian, 30
stochastic process, stationary, 30
stochastic processes, equivalent, 31
transport coefficient, 43
Wiener process, 39
Bibliography

[1] L. Arnold. Stochastic differential equations: theory and applications. Wiley-Interscience [John Wiley & Sons], New York, 1974. Translated from the German.
[2] R. Balescu. Statistical dynamics. Matter out of equilibrium. Imperial College Press, London, 1997.
[3] A. Barone and G. Paterno. Physics and Applications of the Josephson Effect. Wiley, New York, 1982.
[4] R. Bartussek, P. Reimann, and P. Hanggi. Precise numerics versus theory for correlation ratchets. Phys. Rev. Lett., 76(7):1166–1169, 1996.
[5] A. Bensoussan, J.-L. Lions, and G. Papanicolaou. Asymptotic analysis for periodic structures, volume 5 of Studies in Mathematics and its Applications. North-Holland Publishing Co., Amsterdam, 1978.
[6] N. Berglund and B. Gentz. Noise-induced phenomena in slow-fast dynamical systems. A sample-paths approach. Probability and its Applications (New York). Springer-Verlag London Ltd., London, 2006.
[7] M. Bier and R.D. Astumian. Biasing Brownian motion in different directions in a 3-state fluctuating potential and application for the separation of small particles. Phys. Rev. Lett., 76(22):4277, 1996.
[8] L. Breiman. Probability, volume 7 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1992. Corrected reprint of the 1968 original.
[9] C. Bustamante, D. Keller, and G. Oster. The physics of molecular motors. Acc. Chem. Res., 34:412–420, 2001.
[10] S. Cerrai and M. Freidlin. On the Smoluchowski-Kramers approximation for a system with an infinite number of degrees of freedom. Probab. Theory Related Fields, 135(3):363–394, 2006.
[11] S. Cerrai and M. Freidlin. Smoluchowski-Kramers approximation for a general class of SPDEs. J. Evol. Equ., 6(4):657–689, 2006.
[12] S. Chandrasekhar. Stochastic problems in physics and astronomy. Rev. Mod. Phys., 15(1):1–89, Jan 1943.
[13] A.J. Chorin and O.H. Hald. Stochastic tools in mathematics and science, volume 1 of Surveys and Tutorials in the Applied Mathematical Sciences. Springer, New York, 2006.
[14] I. Derenyi and R.D. Astumian. ac separation of particles by biased Brownian motion in a two-dimensional sieve. Phys. Rev. E, 58(6):7781–7784, 1998.
[15] W. Dieterich, I. Peschel, and W.R. Schneider. Diffusion in periodic potentials. Z. Phys. B, 27:177–187, 1977.
[16] C.R. Doering, L.A. Dontcheva, and M.M. Klosek. Constructive role of noise: fast fluctuation asymptotics of transport in stochastic ratchets. Chaos, 8(3):643–649, 1998.
[17] C.R. Doering, W. Horsthemke, and J. Riordan. Nonequilibrium fluctuation-induced transport. Phys. Rev. Lett., 72(19):2984–2987, 1994.
[18] N. Wax, editor. Selected Papers on Noise and Stochastic Processes. Dover, New York, 1954.
[19] R. Eichhorn and P. Reimann. Paradoxical directed diffusion due to temperature anisotropies. Europhys. Lett., 69(4):517–523, 2005.
[20] A. Einstein. Investigations on the theory of the Brownian movement. Dover Publications Inc., New York, 1956. Edited with notes by R. Furth; translated by A.D. Cowper.
[21] S.N. Ethier and T.G. Kurtz. Markov processes. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons Inc., New York, 1986.
[22] L.C. Evans. Partial Differential Equations. AMS, Providence, Rhode Island, 1998.
[23] W. Feller. An introduction to probability theory and its applications. Vol. I. Third edition. John Wiley & Sons Inc., New York, 1968.
[24] W. Feller. An introduction to probability theory and its applications. Vol. II. Second edition. John Wiley & Sons Inc., New York, 1971.
[25] G.W. Ford and M. Kac. On the quantum Langevin equation. J. Statist. Phys., 46(5-6):803–810, 1987.
[26] G.W. Ford, M. Kac, and P. Mazur. Statistical mechanics of assemblies of coupled oscillators. J. Mathematical Phys., 6:504–515, 1965.
[27] M. Freidlin and M. Weber. A remark on random perturbations of the nonlinear pendulum. Ann. Appl. Probab., 9(3):611–628, 1999.
[28] M.I. Freidlin and A.D. Wentzell. Random perturbations of dynamical systems. Springer-Verlag, New York, second edition, 1998.
[29] M.I. Freidlin and A.D. Wentzell. Random perturbations of Hamiltonian systems. Mem. Amer. Math. Soc., 109(523):viii+82, 1994.
[30] A. Friedman. Partial differential equations of parabolic type. Prentice-Hall Inc., Englewood Cliffs, N.J., 1964.
[31] A. Friedman. Stochastic differential equations and applications. Vol. 1. Probability and Mathematical Statistics, Vol. 28. Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1975.
[32] A. Friedman. Stochastic differential equations and applications. Vol. 2. Probability and Mathematical Statistics, Vol. 28. Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1976.
[33] P. Fulde, L. Pietronero, W.R. Schneider, and S. Strassler. Problem of Brownian motion in a periodic potential. Phys. Rev. Lett., 35(26):1776–1779, 1975.
[34] H. Gang, A. Daffertshofer, and H. Haken. Diffusion in periodically forced Brownian particles moving in space-periodic potentials. Phys. Rev. Lett., 76(26):4874–4877, 1996.
[35] C.W. Gardiner. Handbook of stochastic methods. For physics, chemistry and the natural sciences. Springer Series in Synergetics. Springer-Verlag, Berlin, second edition, 1985.
[36] I.I. Gikhman and A.V. Skorokhod. Introduction to the theory of random processes. Dover Publications Inc., Mineola, NY, 1996.
[37] D. Givon, R. Kupferman, and A.M. Stuart. Extracting macroscopic dynamics: model problems and algorithms. Nonlinearity, 17(6):R55–R127, 2004.
[38] M. Hairer and G.A. Pavliotis. Periodic homogenization for hypoelliptic diffusions. J. Statist. Phys., 117(1-2):261–279, 2004.
[39] M. Hairer and G.A. Pavliotis. From ballistic to diffusive behavior in periodic potentials. J. Stat. Phys., 131(1):175–202, 2008.
[40] P. Hanggi. Escape from a metastable state. J. Statist. Phys., 42(1/2):105–140, 1986.
[41] P. Hanggi, P. Talkner, and M. Borkovec. Reaction-rate theory: fifty years after Kramers. Rev. Modern Phys., 62(2):251–341, 1990.
[42] W. Horsthemke and R. Lefever. Noise-induced transitions, volume 15 of Springer Series in Synergetics. Theory and applications in physics, chemistry, and biology. Springer-Verlag, Berlin, 1984.
[43] J. Jacod and A.N. Shiryaev. Limit theorems for stochastic processes, volume 288 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, second edition, 2003.
[44] F. John. Partial differential equations, volume 1 of Applied Mathematical Sciences. Springer-Verlag, New York, fourth edition, 1991.
[45] S. Karlin and H.M. Taylor. A second course in stochastic processes. Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1981.
[46] S. Karlin and H.M. Taylor. A first course in stochastic processes. Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, second edition, 1975.
[47] C. Kipnis and S.R.S. Varadhan. Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Comm. Math. Phys., 104(1):1–19, 1986.
[48] L.B. Koralov and Y.G. Sinai. Theory of probability and random processes. Universitext. Springer, Berlin, second edition, 2007.
[49] H.A. Kramers. Brownian motion in a field of force and the diffusion model of chemical reactions. Physica, 7:284–304, 1940.
[50] N.V. Krylov. Introduction to the theory of diffusion processes, volume 142 of Translations of Mathematical Monographs. American Mathematical Society, Providence, RI, 1995.
[51] R. Kupferman, G.A. Pavliotis, and A.M. Stuart. Ito versus Stratonovich white-noise limits for systems with inertia and colored multiplicative noise. Phys. Rev. E (3), 70(3):036120, 2004.
[52] A.M. Lacasta, J.M. Sancho, A.H. Romero, I.M. Sokolov, and K. Lindenberg. From subdiffusion to superdiffusion of particles on solid surfaces. Phys. Rev. E, 70:051104, 2004.
[53] P.D. Lax. Linear algebra and its applications. Pure and Applied Mathematics (Hoboken). Wiley-Interscience [John Wiley & Sons], Hoboken, NJ, second edition, 2007.
[54] S. Lifson and J.L. Jackson. On the self-diffusion of ions in polyelectrolytic solution. J. Chem. Phys., 36:2410, 1962.
[55] B. Lindner, M. Kostur, and L. Schimansky-Geier. Optimal diffusive transport in a tilted periodic potential. Fluctuation and Noise Letters, 1(1):R25–R39, 2001.
[56] M. Loeve. Probability theory. I, volume 45 of Graduate Texts in Mathematics. Springer-Verlag, New York, fourth edition, 1977.
[57] M. Loeve. Probability theory. II, volume 46 of Graduate Texts in Mathematics. Springer-Verlag, New York, fourth edition, 1978.
[58] M.C. Mackey. Time's arrow. The origins of thermodynamic behavior. Dover Publications Inc., Mineola, NY, 2003. Reprint of the 1992 original [Springer, New York; MR1140408].
[59] M.C. Mackey, A. Longtin, and A. Lasota. Noise-induced global asymptotic stability. J. Statist. Phys., 60(5-6):735–751, 1990.
[60] M.O. Magnasco. Forced thermal ratchets. Phys. Rev. Lett., 71(10):1477–1481, 1993.
[61] P. Mandl. Analytical treatment of one-dimensional Markov processes. Die Grundlehren der mathematischen Wissenschaften, Band 151. Academia Publishing House of the Czechoslovak Academy of Sciences, Prague; Springer-Verlag, New York, 1968.
[62] P.A. Markowich and C. Villani. On the trend to equilibrium for the Fokker-Planck equation: an interplay between physics and functional analysis. Mat. Contemp., 19:1–29, 2000.
[63] B.J. Matkowsky, Z. Schuss, and E. Ben-Jacob. A singular perturbation approach to Kramers' diffusion problem. SIAM J. Appl. Math., 42(4):835–849, 1982.
[64] B.J. Matkowsky, Z. Schuss, and C. Tier. Uniform expansion of the transition rate in Kramers' problem. J. Statist. Phys., 35(3-4):443–456, 1984.
[65] J.C. Mattingly and A.M. Stuart. Geometric ergodicity of some hypoelliptic diffusions for particle motions. Markov Processes and Related Fields, 8(2):199–214, 2002.
[66] R.M. Mazo. Brownian motion, volume 112 of International Series of Monographs on Physics. Oxford University Press, New York, 2002.
[67] J. Meyer and J. Schroter. Comments on the Grad procedure for the Fokker-Planck equation. J. Statist. Phys., 32(1):53–69, 1983.
[68] E. Nelson. Dynamical theories of Brownian motion. Princeton University Press, Princeton, N.J., 1967.
[69] B. Oksendal. Stochastic differential equations. Universitext. Springer-Verlag, Berlin, 2003.
[70] G. Papanicolaou and S.R.S. Varadhan. Ornstein-Uhlenbeck process in a random potential. Comm. Pure Appl. Math., 38(6):819–834, 1985.
[71] G.A. Pavliotis and A.M. Stuart. Analysis of white noise limits for stochastic systems with two fast relaxation times. Multiscale Model. Simul., 4(1):1–35 (electronic), 2005.
[72] G.A. Pavliotis and A.M. Stuart. Parameter estimation for multiscale diffusions. J. Stat. Phys., 127(4):741–781, 2007.
[73] G.A. Pavliotis and A.M. Stuart. Multiscale methods. Averaging and homogenization, volume 53 of Texts in Applied Mathematics. Springer, New York, 2008.
[74] G.A. Pavliotis and A. Vogiannou. Diffusive transport in periodic potentials: underdamped dynamics. Fluct. Noise Lett., 8(2):L155–L173, 2008.
[75] G. Da Prato and J. Zabczyk. Stochastic Equations in Infinite Dimensions, volume 44 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1992.
[76] H. Qian, M. Qian, and X. Tang. Thermodynamics of the general diffusion process: time-reversibility and entropy production. J. Stat. Phys., 107(5/6):1129–1141, 2002.
[77] R.L. Stratonovich. Topics in the theory of random noise. Revised English edition. Translated from the Russian by Richard A. Silverman. Gordon and Breach Science Publishers, New York, 1967.
[78] P. Reimann. Brownian motors: noisy transport far from equilibrium. Phys. Rep., 361(2-4):57–265, 2002.
[79] P. Reimann, C. Van den Broeck, H. Linke, P. Hanggi, J.M. Rubi, and A. Perez-Madrid. Giant acceleration of free diffusion by use of tilted periodic potentials. Phys. Rev. Lett., 87(1):010602, 2001.
[80] P. Reimann, C. Van den Broeck, H. Linke, P. Hanggi, J.M. Rubi, and A. Perez-Madrid. Diffusion in tilted periodic potentials: enhancement, universality and scaling. Phys. Rev. E, 65(3):031104, 2002.
[81] F. Riesz and B. Sz.-Nagy. Functional analysis. Dover Publications Inc., New York, 1990. Translated from the second French edition by Leo F. Boron. Reprint of the 1955 original.
[82] H. Risken. The Fokker-Planck equation, volume 18 of Springer Series in Synergetics. Springer-Verlag, Berlin, second edition, 1989.
[83] H. Rodenhausen. Einstein's relation between diffusion constant and mobility for a diffusion model. J. Statist. Phys., 55(5-6):1065–1088, 1989.
[84] J.M. Sancho, A.M. Lacasta, K. Lindenberg, I.M. Sokolov, and A.H. Romero. Diffusion on a solid surface: anomalous is normal. Phys. Rev. Lett., 92(25):250601, 2004.
[85] M. Schreier, P. Reimann, P. Hanggi, and E. Pollak. Giant enhancement of diffusion and particle selection in rocked periodic potentials. Europhys. Lett., 44(4):416–422, 1998.
[86] Z. Schuss. Singular perturbation methods in stochastic differential equations of mathematical physics. SIAM Rev., 22(2):119–155, 1980.
[87] Ch. Schutte and W. Huisinga. Biomolecular conformations can be identified as metastable sets of molecular dynamics. In Handbook of Numerical Analysis (Computational Chemistry), Vol. X. 2003.
[88] C. Schwab and R.A. Todor. Karhunen-Loeve approximation of random fields by generalized fast multipole methods. J. Comput. Phys., 217(1):100–122, 2006.
[89] R.B. Sowers. A boundary layer theory for diffusively perturbed transport around a heteroclinic cycle. Comm. Pure Appl. Math., 58(1):30–84, 2005.
[90] D.W. Stroock. Probability theory, an analytic view. Cambridge University Press, Cambridge, 1993.
[91] G.I. Taylor. Diffusion by continuous movements. Proc. London Math. Soc., 20:196, 1921.
[92] T.C. Elston and C.R. Doering. Numerical and analytical studies of nonequilibrium fluctuation-induced transport processes. J. Statist. Phys., 83:359–383, 1996.
[93] G.E. Uhlenbeck and L.S. Ornstein. On the theory of the Brownian motion. Phys. Rev., 36(5):823–841, Sep 1930.
[94] M. Vergassola and M. Avellaneda. Scalar transport in compressible flow. Phys. D, 106(1-2):148–166, 1997.
[95] R. Zwanzig. Nonlinear generalized Langevin equations. J. Stat. Phys., 9(3):215–220, 1973.
[96] R. Zwanzig. Nonequilibrium statistical mechanics. Oxford University Press, New York, 2001.