Pavliotis

© All Rights Reserved


APPLICATIONS

G.A. Pavliotis

Department of Mathematics

Imperial College London

London SW7 2AZ, UK

June 9, 2011

Contents

Preface

1 Introduction
    1.1 Introduction
    1.2 Historical Overview
    1.3 The One-Dimensional Random Walk
    1.4 Stochastic Modeling of Deterministic Chaos
    1.5 Why Randomness
    1.6 Discussion and Bibliography
    1.7 Exercises

2
    2.1 Introduction
    2.2 Basic Definitions from Probability Theory
        2.2.1 Conditional Probability
    2.3 Random Variables
        2.3.1 Expectation of Random Variables
    2.4 Conditional Expectation
    2.5 The Characteristic Function
    2.6 Gaussian Random Variables
    2.7 Types of Convergence and Limit Theorems
    2.8 Discussion and Bibliography
    2.9 Exercises

3
    3.1 Introduction
    3.2 Definition of a Stochastic Process
    3.3 Stationary Processes
        3.3.1 Strictly Stationary Processes
        3.3.2 Second Order Stationary Processes
        3.3.3 Ergodic Properties of Second-Order Stationary Processes
    3.4 Brownian Motion
    3.5 Other Examples of Stochastic Processes
        3.5.1 Brownian Bridge
        3.5.2 Fractional Brownian Motion
        3.5.3 The Poisson Process
    3.6 The Karhunen-Loève Expansion
    3.7 Discussion and Bibliography
    3.8 Exercises

4 Markov Processes
    4.1 Introduction
    4.2 Examples
    4.3 Definition of a Markov Process
    4.4 The Chapman-Kolmogorov Equation
    4.5 The Generator of a Markov Process
        4.5.1 The Adjoint Semigroup
    4.6 Ergodic Markov Processes
        4.6.1 Stationary Markov Processes
    4.7 Discussion and Bibliography
    4.8 Exercises

5 Diffusion Processes
    5.1 Introduction
    5.2 Definition of a Diffusion Process
    5.3 The Backward and Forward Kolmogorov Equations
        5.3.1 The Backward Kolmogorov Equation
        5.3.2 The Forward Kolmogorov Equation
    5.4 Multidimensional Diffusion Processes
    5.5 Connection with Stochastic Differential Equations
    5.6 Examples of Diffusion Processes
    5.7 Discussion and Bibliography
    5.8 Exercises

6 The Fokker-Planck Equation
    6.1 Introduction
    6.2 Basic Properties of the FP Equation
        6.2.1 Existence and Uniqueness of Solutions
        6.2.2 The FP Equation as a Conservation Law
        6.2.3 Boundary Conditions for the Fokker-Planck Equation
    6.3 Examples of Diffusion Processes
        6.3.1 Brownian Motion
        6.3.2 The Ornstein-Uhlenbeck Process
        6.3.3 The Geometric Brownian Motion
    6.4 The Ornstein-Uhlenbeck Process and Hermite Polynomials
    6.5 Reversible Diffusions
        6.5.1 Markov Chain Monte Carlo (MCMC)
    6.6 Perturbations of non-Reversible Diffusions
    6.7 Eigenfunction Expansions
        6.7.1 Reduction to a Schrödinger Equation
    6.8 Discussion and Bibliography
    6.9 Exercises

7 Stochastic Differential Equations
    7.1 Introduction
    7.2 The Ito and Stratonovich Stochastic Integral
        7.2.1 The Stratonovich Stochastic Integral
    7.3 Stochastic Differential Equations
        7.3.1 Examples of SDEs
    7.4 The Generator, Ito's Formula and the Fokker-Planck Equation
        7.4.1 The Generator
        7.4.2 Ito's Formula
    7.5 Linear SDEs
    7.6 Derivation of the Stratonovich SDE
        7.6.1 Ito versus Stratonovich
    7.7 Numerical Solution of SDEs
    7.8 Parameter Estimation for SDEs
    7.9 Noise Induced Transitions
    7.10 Discussion and Bibliography
    7.11 Exercises

8 The Langevin Equation
    8.1 Introduction
    8.2 The Fokker-Planck Equation in Phase Space (Klein-Kramers Equation)
    8.3 The Langevin Equation in a Harmonic Potential
    8.4 Asymptotic Limits for the Langevin Equation
        8.4.1 The Overdamped Limit
        8.4.2 The Underdamped Limit
    8.5 Brownian Motion in Periodic Potentials
        8.5.1 The Langevin Equation in a Periodic Potential
        8.5.2 Equivalence With the Green-Kubo Formula
    8.6 The Underdamped and Overdamped Limits of the Diffusion Coefficient
        8.6.1 Brownian Motion in a Tilted Periodic Potential
    8.7 Numerical Solution of the Klein-Kramers Equation
    8.8 Discussion and Bibliography
    8.9 Exercises

9
    9.1 Introduction
    9.2 Brownian Motion in a Bistable Potential
    9.3 The Mean First Passage Time
        9.3.1 The Boundary Value Problem for the MFPT
        9.3.2 Examples
    9.4 Escape from a Potential Barrier
        9.4.1 Calculation of the Reaction Rate in the Overdamped Regime
        9.4.2 The Intermediate Regime: γ = O(1)
        9.4.3 Calculation of the Reaction Rate in the Energy-Diffusion-Limited Regime
    9.5 Discussion and Bibliography
    9.6 Exercises

10
    10.1 Introduction
    10.2 The Kac-Zwanzig Model
    10.3 The Generalized-Langevin Equation
    Linear Response Theory
    Projection Operator Techniques
    Discussion and Bibliography
    Exercises

Preface

The purpose of these notes is to present various results and techniques from the theory of stochastic processes that are useful in the study of stochastic problems in physics, chemistry and other areas. These notes have been used for several years for a course on applied stochastic processes offered to fourth year and to MSc students in applied mathematics at the Department of Mathematics, Imperial College London.

G.A. Pavliotis

London, December 2010


Chapter 1

Introduction

1.1 Introduction

In this chapter we introduce some of the concepts and techniques that we will study in this book. In Section 1.2 we present a brief historical overview of the development of the theory of stochastic processes in the twentieth century. In Section 1.3 we introduce the one-dimensional random walk and we use this example to introduce several concepts, such as Brownian motion and the Markov property. In Section 1.4 we discuss the stochastic modeling of deterministic chaos. Some comments on the role of probabilistic modeling in the physical sciences are offered in Section 1.5. Discussion and bibliographical comments are presented in Section 1.6. Exercises are included in Section 1.7.

1.2 Historical Overview

The theory of stochastic processes, at least in terms of its application to physics, started with Einstein's work on the theory of Brownian motion: "Concerning the motion, as required by the molecular-kinetic theory of heat, of particles suspended in liquids at rest" (1905), and with a series of additional papers that were published in the period 1905-1906. In these fundamental works, Einstein presented an explanation of Brown's observation (1827) that, when suspended in water, small pollen grains are found to be in a very animated and irregular state of motion. In developing his theory Einstein introduced several concepts that still play a fundamental role in the study of stochastic processes and that we will study in this book. Using modern terminology, Einstein introduced a Markov chain model for the motion of the particle (molecule, pollen grain...). Furthermore, he introduced the idea that it makes more sense to talk about the probability of finding the particle at position $x$ at time $t$, rather than about individual trajectories.

In his work many of the main aspects of the modern theory of stochastic processes can be found:

- The assumption of Markovianity (no memory), expressed through the Chapman-Kolmogorov equation.
- The Fokker-Planck equation (in this case, the diffusion equation).
- The derivation of the Fokker-Planck equation from the master (Chapman-Kolmogorov) equation through a Kramers-Moyal expansion.
- The calculation of a transport coefficient (the diffusion coefficient) using macroscopic (kinetic theory-based) considerations:

$$D = \frac{k_B T}{6 \pi \eta a},$$

where $k_B$ is Boltzmann's constant, $T$ is the temperature, $\eta$ is the viscosity of the fluid and $a$ is the radius of the particle.

Einstein's theory is based on the Fokker-Planck equation. Langevin (1908) developed a theory based on a stochastic differential equation. The equation of motion for a Brownian particle is

$$m \frac{d^2 x}{dt^2} = -6 \pi \eta a \frac{dx}{dt} + \xi,$$

where $\xi$ is a random force. It can be shown that there is complete agreement between Einstein's theory and Langevin's theory. The theory of Brownian motion was developed independently by Smoluchowski, who also performed several experiments.

The approaches of Langevin and Einstein represent the two main approaches in the theory of stochastic processes:

- Study individual trajectories of Brownian particles. Their evolution is governed by a stochastic differential equation:

$$\frac{dX}{dt} = F(X) + \Sigma(X)\, \xi(t),$$

  where $\xi(t)$ is a random force.

- Study the probability $\rho(x, t)$ of finding a particle at position $x$ at time $t$. This probability distribution satisfies the Fokker-Planck equation:

$$\frac{\partial \rho}{\partial t} = -\nabla \cdot \big(F(x) \rho\big) + \frac{1}{2} \nabla \nabla : \big(A(x) \rho\big),$$

  where $A(x) = \Sigma(x) \Sigma(x)^T$.

The theory of stochastic processes was developed during the 20th century by several mathematicians and physicists including Smoluchowski, Planck, Kramers, Chandrasekhar, Wiener, Kolmogorov, Ito and Doob.

1.3 The One-Dimensional Random Walk

We let time be discrete, i.e. $t = 0, 1, \dots$. Consider the following stochastic process $S_n$: $S_0 = 0$; at each time step it moves by $\pm 1$ with equal probability $\frac{1}{2}$.

In other words, at each time step we flip a fair coin. If the outcome is heads, we move one unit to the right. If the outcome is tails, we move one unit to the left.

Alternatively, we can think of the random walk as a sum of independent random variables:

$$S_n = \sum_{j=1}^{n} X_j,$$

where $X_j \in \{-1, 1\}$ with $P(X_j = \pm 1) = \frac{1}{2}$.

We can simulate the random walk on a computer:

- We need a (pseudo)random number generator to generate $n$ independent random variables which are uniformly distributed in the interval $[0, 1]$.
- If the value of the random variable is $> \frac{1}{2}$ then the particle moves to the left, otherwise it moves to the right.
- The sequence $\{S_n\}_{n=1}^{N}$ indexed by the discrete time $T = \{1, 2, \dots, N\}$ is the path of the random walk. We use a linear interpolation (i.e. connect the points $\{n, S_n\}$ by straight lines) to generate a continuous path.
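The three steps above can be sketched in Python as follows. This is a minimal illustration rather than the author's program: the function name `random_walk` and the `seed` argument are assumptions made here, and the rule that a draw above 1/2 produces a step to the left simply follows the description above.

```python
import random


def random_walk(n_steps, seed=None):
    """Return [S_0, S_1, ..., S_n] for the symmetric walk with S_0 = 0."""
    rng = random.Random(seed)          # (pseudo)random number generator
    path = [0]
    for _ in range(n_steps):
        u = rng.random()               # uniformly distributed on [0, 1)
        step = -1 if u > 0.5 else 1    # > 1/2: move left, otherwise move right
        path.append(path[-1] + step)
    return path


# A 50-step path; plotting the points (n, S_n) joined by straight lines
# gives the linearly interpolated continuous path.
path = random_walk(50, seed=0)
```

Passing a different `seed` (or none) produces a different path, which is the sense in which every path of the random walk is different.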

[Figures: sample paths of the random walk, including a 50-step random walk.]

Figure 1.3: Sample Brownian paths. [Legend: mean of 1000 paths, 5 individual paths; axes: $U(t)$ against $t$.]

Every path of the random walk is different: it depends on the outcome of a sequence of independent random experiments. We can compute statistics by generating a large number of paths and computing averages. For example, $E(S_n) = 0$ and $E(S_n^2) = n$. The paths of the random walk (without the linear interpolation) are not continuous: the random walk has a jump of size 1 at each time step. This is an example of a discrete time, discrete space stochastic process. The random walk is a time-homogeneous Markov process. If we take a large number of steps, the random walk starts looking like a continuous time process with continuous paths.
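The averaging over many paths mentioned above can be tried with a small Monte Carlo experiment. The sketch below is only illustrative: the helper names `walk_endpoint` and `estimate_moments`, the seed and the sample sizes are assumptions chosen here, not taken from the notes.

```python
import random


def walk_endpoint(n_steps, rng):
    """Sample S_n as a sum of n independent +/-1 steps."""
    return sum(rng.choice((-1, 1)) for _ in range(n_steps))


def estimate_moments(n_steps, n_paths, seed=0):
    """Estimate E(S_n) and E(S_n^2) by averaging over independent paths."""
    rng = random.Random(seed)
    ends = [walk_endpoint(n_steps, rng) for _ in range(n_paths)]
    mean = sum(ends) / n_paths
    second_moment = sum(s * s for s in ends) / n_paths
    return mean, second_moment


mean, second_moment = estimate_moments(n_steps=100, n_paths=5000)
print(mean, second_moment)   # should be close to 0 and to n = 100
```

The estimates fluctuate from run to run; the statistical error decreases like the inverse square root of the number of paths.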

We can quantify this observation by introducing an appropriate rescaled process and by taking an appropriate limit. Consider the sequence of continuous time stochastic processes

$$Z_t^n := \frac{1}{\sqrt{n}} S_{nt}.$$

In the limit as $n \to \infty$, the sequence $\{Z_t^n\}$ converges (in some appropriate sense, that will be made precise in later chapters) to a Brownian motion with diffusion coefficient $D = \frac{\Delta x^2}{2 \Delta t} = \frac{1}{2}$. Brownian motion $W(t)$ is a continuous time stochastic process with continuous paths that starts at 0 ($W(0) = 0$) and has independent, normally distributed (Gaussian) increments. We can simulate Brownian motion on a computer using a random number generator that generates normally distributed, independent random variables. We can write an equation for the evolution of the paths of a Brownian motion $X_t$ with diffusion coefficient $D$ starting at $x$:

$$dX_t = \sqrt{2D}\, dW_t, \quad X_0 = x.$$

This is the simplest example of a stochastic differential equation. The probability of finding $X_t$ at $y$ at time $t$, given that it was at $x$ at time $t = 0$, i.e. the transition probability density $\rho(y, t)$, satisfies the PDE

$$\frac{\partial \rho}{\partial t} = D \frac{\partial^2 \rho}{\partial y^2}, \quad \rho(y, 0) = \delta(y - x).$$

The connection between Brownian motion and the diffusion equation was made by Einstein in 1905.
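A minimal sketch of such a simulation, assuming a uniform time step and Python's standard Gaussian generator; the helper name `brownian_path` and the parameter values are illustrative choices made here. Since $X_t = x + \sqrt{2D}\, W_t$ and $W_t$ has variance $t$, the sample variance of $X_T$ should be close to $2DT$.

```python
import math
import random


def brownian_path(x0, D, T, n_steps, rng):
    """Sample X at times 0, dt, ..., T for dX_t = sqrt(2D) dW_t, X_0 = x0."""
    dt = T / n_steps
    path = [x0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))   # increment W(t + dt) - W(t) ~ N(0, dt)
        path.append(path[-1] + math.sqrt(2.0 * D) * dW)
    return path


rng = random.Random(1)
D, T = 0.5, 1.0
ends = [brownian_path(0.0, D, T, 100, rng)[-1] for _ in range(4000)]
var_est = sum(e * e for e in ends) / len(ends)   # the mean of X_T is x0 = 0
print(var_est)                                   # should be close to 2 * D * T
```

For this equation the increments are exactly Gaussian, so the time discretisation introduces no error; only the sampling error of the average remains.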

1.5 Why Randomness

Why introduce randomness in the description of physical systems?

- To describe outcomes of a repeated set of experiments. Think of tossing a coin repeatedly or of throwing a die.
- To describe a deterministic system for which we have incomplete information: we have imprecise knowledge of initial and boundary conditions or of model parameters.
    - ODEs with random initial conditions are equivalent to stochastic processes that can be described using stochastic differential equations.
- To describe systems for which we are not confident about the validity of our mathematical model.
- To describe a dynamical system exhibiting very complicated behavior (chaotic dynamical systems). Determinism versus predictability.
- To describe a high-dimensional deterministic system using a simpler, low-dimensional stochastic system. Think of the physical model for Brownian motion (a heavy particle colliding with many small particles).
- To describe a system that is inherently random. Think of quantum mechanics.

Stochastic modeling is currently used in many different areas ranging from biology to climate modeling to economics.

1.6 Discussion and Bibliography

The fundamental papers of Einstein on the theory of Brownian motion have been reprinted by Dover [11]. The readers of this book are strongly encouraged to study these papers. Other fundamental papers from the early period of the development of the theory of stochastic processes include the papers by Langevin, Ornstein and Uhlenbeck, Doob, Kramers and Chandrasekhar's famous review article [7]. Many of these early papers on the theory of stochastic processes have been reprinted in [10]. Very useful historical comments can be found in the books by Nelson [54] and Mazo [52].

1.7 Exercises

1. Read the papers by Einstein, Ornstein-Uhlenbeck, Doob etc.

2. Write a computer program for generating the random walk in one and two dimensions. Study numerically the Brownian limit and compute the statistics of

the random walk.


Chapter 2

2.1 Introduction

In this chapter we put together some basic definitions and results from probability

theory that will be used later on. In Section 2.2 we give some basic definitions

from the theory of probability. In Section 2.3 we present some properties of random variables. In Section 2.4 we introduce the concept of conditional expectation

and in Section 2.5 we define the characteristic function, one of the most useful

tools in the study of (sums of) random variables. Some explicit calculations for

the multivariate Gaussian distribution are presented in Section 2.6. Different types

of convergence and the basic limit theorems of the theory of probability are discussed in Section 2.7. Discussion and bibliographical comments are presented in

Section 2.8. Exercises are included in Section 2.9.

2.2 Basic Definitions from Probability Theory

In Chapter 1 we defined a stochastic process as a dynamical system whose law of evolution is probabilistic. In order to study stochastic processes we need to be able to describe the outcome of a random experiment and to calculate functions of this outcome. First we need to describe the set of all possible experiments.

Definition 2.2.1. The set of all possible outcomes of an experiment is called the sample space and is denoted by $\Omega$.

Example 2.2.2.
- The possible outcomes of the experiment of tossing a coin are $H$ and $T$. The sample space is $\Omega = \{H, T\}$.
- The possible outcomes of the experiment of throwing a die are 1, 2, 3, 4, 5 and 6. The sample space is $\Omega = \{1, 2, 3, 4, 5, 6\}$.

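Events are subsets of the sample space, and we would like the unions, intersections and complements of events to also be events. When the sample space is uncountable, then technical difficulties arise. In particular, not all subsets of the sample space need to be events. A definition of the collection of subsets of events which is appropriate for finitely additive probability is the following.

Definition 2.2.3. A collection $\mathcal{F}$ of subsets of $\Omega$ is called a field on $\Omega$ if

i. $\emptyset \in \mathcal{F}$;
ii. if $A \in \mathcal{F}$ then $A^c \in \mathcal{F}$;
iii. if $A, B \in \mathcal{F}$ then $A \cup B \in \mathcal{F}$.

From the definition of a field we immediately deduce that $\mathcal{F}$ is closed under finite unions and finite intersections:

$$A_1, \dots, A_n \in \mathcal{F} \ \Rightarrow \ \cup_{i=1}^{n} A_i \in \mathcal{F}, \quad \cap_{i=1}^{n} A_i \in \mathcal{F}.$$

When $\Omega$ is infinite, the above definition is not appropriate, since we need to consider countable unions of events.

Definition 2.2.4. A collection $\mathcal{F}$ of subsets of $\Omega$ is called a $\sigma$-field or $\sigma$-algebra on $\Omega$ if

i. $\emptyset \in \mathcal{F}$;
ii. if $A \in \mathcal{F}$ then $A^c \in \mathcal{F}$;
iii. if $A_1, A_2, \dots \in \mathcal{F}$ then $\cup_{i=1}^{\infty} A_i \in \mathcal{F}$.

A $\sigma$-algebra is closed under the operation of taking countable intersections.

Example 2.2.5.
- $\mathcal{F} = \{\emptyset, \Omega\}$.
- $\mathcal{F} = \{\emptyset, A, A^c, \Omega\}$, where $A$ is a subset of $\Omega$.

Let $\mathcal{F}$ be a collection of subsets of $\Omega$. It can be extended to a $\sigma$-algebra (take for example the power set of $\Omega$). Consider all the $\sigma$-algebras that contain $\mathcal{F}$ and take their intersection, denoted by $\sigma(\mathcal{F})$, i.e. $A \in \sigma(\mathcal{F})$ if and only if it is in every $\sigma$-algebra containing $\mathcal{F}$. $\sigma(\mathcal{F})$ is a $\sigma$-algebra (see Exercise 1). It is the smallest $\sigma$-algebra containing $\mathcal{F}$ and it is called the $\sigma$-algebra generated by $\mathcal{F}$.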
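On a finite sample space the generated $\sigma$-algebra can be computed explicitly, because countable unions reduce to finite ones. The sketch below closes a collection under complements and pairwise unions until nothing new appears; the function name `generated_sigma_algebra` and the example sets are assumptions made for illustration.

```python
from itertools import combinations


def generated_sigma_algebra(omega, collection):
    """Smallest family containing `collection` that is closed under
    complements and unions, for a finite sample space `omega`."""
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(a) for a in collection}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for a in current:                      # close under complements
            if omega - a not in sets:
                sets.add(omega - a)
                changed = True
        for a, b in combinations(current, 2):  # close under pairwise unions
            if a | b not in sets:
                sets.add(a | b)
                changed = True
    return sets


sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
print(sorted(sorted(s) for s in sigma))
```

In this example the generated family has the atoms {1}, {2} and {3, 4}, hence 2^3 = 8 elements. Intersections need no separate step, since they follow from complements and unions by De Morgan's laws.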

Example 2.2.6. Let $\Omega = \mathbb{R}^n$. The $\sigma$-algebra generated by the open subsets of $\mathbb{R}^n$ (or, equivalently, by the open balls of $\mathbb{R}^n$) is called the Borel $\sigma$-algebra of $\mathbb{R}^n$ and is denoted by $\mathcal{B}(\mathbb{R}^n)$.

Let $X$ be a closed subset of $\mathbb{R}^n$. Similarly, we can define the Borel $\sigma$-algebra of $X$, denoted by $\mathcal{B}(X)$.

A sub-$\sigma$-algebra is a collection of subsets of a $\sigma$-algebra which satisfies the axioms of a $\sigma$-algebra.

The $\sigma$-field $\mathcal{F}$ of a sample space $\Omega$ contains all possible outcomes of the experiment that we want to study. Intuitively, the $\sigma$-field contains all the information about the random experiment that is available to us.

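Now we want to assign probabilities to the possible outcomes of an experiment.

Definition 2.2.7. A probability measure $P$ on the measurable space $(\Omega, \mathcal{F})$ is a function $P : \mathcal{F} \to [0, 1]$ satisfying

i. $P(\emptyset) = 0$, $P(\Omega) = 1$;
ii. for $A_1, A_2, \dots$ with $A_i \cap A_j = \emptyset$, $i \neq j$,

$$P\big(\cup_{i=1}^{\infty} A_i\big) = \sum_{i=1}^{\infty} P(A_i).$$

Definition 2.2.8. The triple $(\Omega, \mathcal{F}, P)$ comprising a set $\Omega$, a $\sigma$-algebra $\mathcal{F}$ of subsets of $\Omega$ and a probability measure $P$ on $(\Omega, \mathcal{F})$ is called a probability space.

Example 2.2.9. A biased coin is tossed once: $\Omega = \{H, T\}$, $\mathcal{F} = \{\emptyset, \{H\}, \{T\}, \Omega\}$ (the power set of $\Omega$), $P : \mathcal{F} \to [0, 1]$ such that $P(\emptyset) = 0$, $P(H) = p \in [0, 1]$, $P(T) = 1 - p$, $P(\Omega) = 1$.

Example 2.2.10. Take $\Omega = [0, 1]$, $\mathcal{F} = \mathcal{B}([0, 1])$, $P = \mathrm{Leb}([0, 1])$. Then $(\Omega, \mathcal{F}, P)$ is a probability space.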

2.2.1 Conditional Probability

One of the most important concepts in probability is that of the dependence between events.

Definition 2.2.11. A family $\{A_i : i \in I\}$ of events is called independent if

$$P\big(\cap_{j \in J} A_j\big) = \prod_{j \in J} P(A_j)$$

for all finite subsets $J$ of $I$.

Let $A$ and $B$ be two events with $P(B) > 0$. We are interested in the probability that the event $A$ will occur, given that $B$ has already happened. We define this to be the conditional probability, denoted by $P(A|B)$. We know from elementary probability that

$$P(A|B) = \frac{P(A \cap B)}{P(B)}.$$

A very useful result is the law of total probability.

Definition 2.2.12. A family of events $\{B_i : i \in I\}$ is called a partition of $\Omega$ if

$$B_i \cap B_j = \emptyset, \ i \neq j \quad \text{and} \quad \cup_{i \in I} B_i = \Omega.$$

Proposition 2.2.13 (Law of total probability). For any event $A$ and any partition $\{B_i : i \in I\}$ we have

$$P(A) = \sum_{i \in I} P(A|B_i) P(B_i).$$

The proof of this result is left as an exercise. In many cases the calculation of the probability of an event is simplified by choosing an appropriate partition of $\Omega$ and using the law of total probability.
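As a small worked instance of the law of total probability, consider a fair die with the partition into odd and even outcomes. The sketch below uses exact rational arithmetic; the event $A$, the partition and the helper names `prob` and `cond_prob` are illustrative assumptions.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}              # fair die: all outcomes equally likely


def prob(event):
    """P(event) under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))


def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b), assuming P(b) > 0."""
    return prob(a & b) / prob(b)


A = {1, 2, 3}                           # the outcome is at most 3
partition = [{1, 3, 5}, {2, 4, 6}]      # B_1 = odd, B_2 = even

total = sum(cond_prob(A, B) * prob(B) for B in partition)
print(total, prob(A))                   # both equal 1/2
```

Here $P(A|B_1) = 2/3$ and $P(A|B_2) = 1/3$, each weighted by $P(B_i) = 1/2$.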

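Let $(\Omega, \mathcal{F}, P)$ be a probability space and fix $B \in \mathcal{F}$ with $P(B) > 0$. Then $P(\cdot|B)$ defines a probability measure on $\mathcal{F}$. Indeed, we have that

$$P(\emptyset|B) = 0, \quad P(\Omega|B) = 1$$

and

$$P\big(\cup_{j=1}^{\infty} A_j \,|\, B\big) = \sum_{j=1}^{\infty} P(A_j|B)$$

for a countable family of pairwise disjoint sets $\{A_j\}_{j=1}^{\infty}$. Consequently, $(\Omega, \mathcal{F}, P(\cdot|B))$ is a probability space for every $B \in \mathcal{F}$ with $P(B) > 0$.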

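2.3 Random Variables

We are usually interested in the consequences of the outcome of an experiment, rather than the experiment itself. The function of the outcome of an experiment is a random variable, that is, a map from $\Omega$ to $\mathbb{R}$.

Definition 2.3.1. A sample space $\Omega$ equipped with a $\sigma$-field of subsets $\mathcal{F}$ is called a measurable space.

Definition 2.3.2. Let $(\Omega, \mathcal{F})$ and $(E, \mathcal{G})$ be two measurable spaces. A function $X : \Omega \to E$ such that the event

$$\{\omega \in \Omega : X(\omega) \in A\} =: \{X \in A\} \tag{2.1}$$

belongs to $\mathcal{F}$ for arbitrary $A \in \mathcal{G}$ is called a measurable function or random variable.

When $E$ is $\mathbb{R}$ equipped with its Borel $\sigma$-algebra, then (2.1) can be replaced with

$$\{X \leq x\} \in \mathcal{F} \quad \forall x \in \mathbb{R}.$$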

Let $X$ be a random variable (measurable function) from $(\Omega, \mathcal{F}, \mu)$ to $(E, \mathcal{G})$. If $E$ is a metric space then we may define expectation with respect to the measure $\mu$ by

$$E[X] = \int_{\Omega} X(\omega) \, d\mu(\omega).$$

More generally, for a measurable function $f : E \to \mathbb{R}$,

$$E[f(X)] = \int_{\Omega} f(X(\omega)) \, d\mu(\omega).$$

Let $U$ be a topological space. We will use the notation $\mathcal{B}(U)$ to denote the Borel $\sigma$-algebra of $U$: the smallest $\sigma$-algebra containing all open sets of $U$. Every random variable from a probability space $(\Omega, \mathcal{F}, \mu)$ to a measurable space $(E, \mathcal{B}(E))$ induces a probability measure on $E$:

$$\mu_X(B) = P X^{-1}(B) = \mu(\omega \in \Omega ; X(\omega) \in B), \quad B \in \mathcal{B}(E). \tag{2.2}$$

Example 2.3.3. Let $I$ denote a subset of the positive integers. A vector $\rho_0 = \{\rho_{0,i}, i \in I\}$ is a distribution on $I$ if it has nonnegative entries and its total mass equals 1: $\sum_{i \in I} \rho_{0,i} = 1$.

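Consider the case where $E = \mathbb{R}$ equipped with the Borel $\sigma$-algebra. In this case a random variable is defined to be a function $X : \Omega \to \mathbb{R}$ such that

$$\{\omega \in \Omega : X(\omega) \leq x\} \in \mathcal{F} \quad \forall x \in \mathbb{R}.$$

We can now define the probability distribution function of $X$, $F_X : \mathbb{R} \to [0, 1]$, as

$$F_X(x) = P\big(\{\omega \in \Omega : X(\omega) \leq x\}\big) =: P(X \leq x). \tag{2.3}$$

The distribution function $F_X(x)$ of a random variable has the properties that $\lim_{x \to -\infty} F_X(x) = 0$, $\lim_{x \to +\infty} F_X(x) = 1$ and that it is right continuous.

Definition 2.3.4. A random variable $X$ with values on $\mathbb{R}$ is called discrete if it takes values in some countable subset $\{x_0, x_1, x_2, \dots\}$ of $\mathbb{R}$, i.e. $P(X = x) \neq 0$ only for $x = x_0, x_1, \dots$.

With a discrete random variable we can associate the probability mass function $p_k = P(X = x_k)$. We will consider nonnegative integer valued discrete random variables. In this case $p_k = P(X = k)$, $k = 0, 1, 2, \dots$.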

Example 2.3.5. The Poisson random variable is the nonnegative integer valued random variable with probability mass function

$$p_k = P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}, \quad k = 0, 1, 2, \dots,$$

where $\lambda > 0$.

Example 2.3.6. The binomial random variable is the nonnegative integer valued random variable with probability mass function

$$p_k = P(X = k) = \frac{N!}{k! (N - k)!} p^k q^{N - k}, \quad k = 0, 1, 2, \dots, N,$$

where $p \in (0, 1)$ and $q = 1 - p$.
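Both probability mass functions can be checked numerically: each should sum to 1, and the means are the standard values $\lambda$ (Poisson) and $Np$ (binomial). The sketch below is illustrative only; the function names and the parameter values are assumptions, and the infinite Poisson sum is truncated at 100 terms.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with parameter lam > 0."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def binomial_pmf(k, N, p):
    """P(X = k) for a binomial random variable with N trials and success probability p."""
    return math.comb(N, k) * p**k * (1 - p) ** (N - k)


lam, N, p = 3.0, 10, 0.3

# The Poisson sums are truncated; the neglected tail is negligible for lam = 3.
poisson_mass = sum(poisson_pmf(k, lam) for k in range(100))
poisson_mean = sum(k * poisson_pmf(k, lam) for k in range(100))
binomial_mass = sum(binomial_pmf(k, N, p) for k in range(N + 1))
binomial_mean = sum(k * binomial_pmf(k, N, p) for k in range(N + 1))

print(poisson_mass, poisson_mean)     # close to 1 and to lam
print(binomial_mass, binomial_mean)   # close to 1 and to N * p
```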

Definition 2.3.7. A random variable $X$ with values on $\mathbb{R}$ is called continuous if $P(X = x) = 0 \ \forall x \in \mathbb{R}$.

Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $X : \Omega \to \mathbb{R}$ be a random variable with distribution $F_X$. This is a probability measure on $\mathcal{B}(\mathbb{R})$. We will assume that it is absolutely continuous with respect to the Lebesgue measure with density $\rho_X$: $F_X(dx) = \rho(x) \, dx$. We will call the density $\rho(x)$ the probability density function (PDF) of the random variable $X$.

Example 2.3.8.

i. The exponential random variable has PDF

$$f(x) = \begin{cases} \lambda e^{-\lambda x} & x > 0, \\ 0 & x < 0, \end{cases}$$

with $\lambda > 0$.

ii. The uniform random variable has PDF

$$f(x) = \begin{cases} \frac{1}{b - a} & a < x < b, \\ 0 & x \notin (a, b), \end{cases}$$

with $a < b$.

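Definition 2.3.9. Two random variables $X$ and $Y$ are independent if the events $\{\omega \in \Omega \,|\, X(\omega) \leq x\}$ and $\{\omega \in \Omega \,|\, Y(\omega) \leq y\}$ are independent for all $x, y \in \mathbb{R}$.

Let $X, Y$ be two continuous random variables. We can view them as a random vector, i.e. a random variable from $\Omega$ to $\mathbb{R}^2$. We can then define the joint distribution function

$$F(x, y) = P(X \leq x, Y \leq y).$$

The mixed derivative of the distribution function, $f_{X,Y}(x, y) := \frac{\partial^2 F}{\partial x \partial y}(x, y)$, if it exists, is called the joint PDF of the random vector $\{X, Y\}$:

$$F_{X,Y}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(x', y') \, dy' \, dx'.$$

If the random variables $X$ and $Y$ are independent, then

$$F_{X,Y}(x, y) = F_X(x) F_Y(y)$$

and

$$f_{X,Y}(x, y) = f_X(x) f_Y(y).$$

The joint distribution function has the properties

$$F_{X,Y}(x, y) = F_{Y,X}(y, x), \quad F_{X,Y}(+\infty, y) = F_Y(y), \quad f_Y(y) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y) \, dx.$$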

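We can extend the above definition to random vectors of arbitrary finite dimensions. Let $X$ be a random variable from $(\Omega, \mathcal{F}, \mu)$ to $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$. The (joint) distribution function $F_X : \mathbb{R}^d \to [0, 1]$ is defined as

$$F_X(x) = P(X \leq x).$$

Let $X$ be a random variable in $\mathbb{R}^d$ with distribution function $f(x_N)$ where $x_N = \{x_1, \dots, x_N\}$. We define the marginal or reduced distribution function $f^{N-1}(x_{N-1})$ by

$$f^{N-1}(x_{N-1}) = \int_{\mathbb{R}} f^{N}(x_N) \, dx_N.$$

We can define other reduced distribution functions:

$$f^{N-2}(x_{N-2}) = \int_{\mathbb{R}} f^{N-1}(x_{N-1}) \, dx_{N-1} = \int_{\mathbb{R}} \int_{\mathbb{R}} f(x_N) \, dx_{N-1} \, dx_N.$$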

2.3.1 Expectation of Random Variables

We can use the distribution of a random variable to compute expectations and probabilities:

$$E[f(X)] = \int_{\mathbb{R}} f(x) \, dF_X(x) \tag{2.4}$$

and

$$P[X \in G] = \int_{G} dF_X(x), \quad G \in \mathcal{B}(E). \tag{2.5}$$

The above formulas apply to both discrete and continuous random variables, provided that we define the integrals in (2.4) and (2.5) appropriately.

When $E = \mathbb{R}^d$ and a PDF exists, $dF_X(x) = f_X(x) \, dx$, we have

$$F_X(x) := P(X \leq x) = \int_{-\infty}^{x_1} \dots \int_{-\infty}^{x_d} f_X(x) \, dx.$$

By $L^p(\Omega, \mathcal{F}, \mu)$ we mean the Banach space of measurable functions on $\Omega$ with norm

$$\|X\|_{L^p} = \big(E|X|^p\big)^{1/p}.$$

Let $X$ be a nonnegative integer valued random variable with probability mass function $p_k$. We can compute the expectation of an arbitrary function of $X$ using the formula

$$E(f(X)) = \sum_{k=0}^{\infty} f(k) p_k.$$

Let $X, Y$ be random variables. We want to know whether they are correlated and, if they are, to calculate how correlated they are. We define the covariance of the two random variables as

$$\mathrm{cov}(X, Y) = E\big[(X - EX)(Y - EY)\big] = E(XY) - EX \, EY.$$

The correlation coefficient is

$$\rho(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)} \sqrt{\mathrm{var}(Y)}}. \tag{2.6}$$

The Cauchy-Schwarz inequality yields that $\rho(X, Y) \in [-1, 1]$. We will say that two random variables $X$ and $Y$ are uncorrelated provided that $\rho(X, Y) = 0$. It is not true in general that two uncorrelated random variables are independent. This is true, however, for jointly Gaussian random variables (see Exercise 5).
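A standard counterexample shows that uncorrelated random variables need not be independent: take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. The sketch below verifies this with exact arithmetic; the choice of example and the helper names `expect` and `prob` are assumptions made here for illustration.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1}; Y = X^2 is a deterministic function of X.
pmf = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}


def expect(g):
    """E[g(X)] for the discrete random variable X."""
    return sum(g(x) * p for x, p in pmf.items())


def prob(event):
    """P(event(X) is true)."""
    return sum(p for x, p in pmf.items() if event(x))


# cov(X, Y) = E(XY) - EX EY with Y = X^2
cov = expect(lambda x: x * x**2) - expect(lambda x: x) * expect(lambda x: x**2)

# Independence would require P(X <= 0, Y <= 0) = P(X <= 0) P(Y <= 0).
p_joint = prob(lambda x: x <= 0 and x**2 <= 0)
p_product = prob(lambda x: x <= 0) * prob(lambda x: x**2 <= 0)

print(cov, p_joint, p_product)   # 0, 1/3 and 2/9
```

The covariance vanishes, yet the joint probability differs from the product of the marginals, so $X$ and $Y$ are uncorrelated but not independent.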

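Example 2.3.10.

- Consider the random variable $X : \Omega \to \mathbb{R}$ with pdf

$$\gamma_{\sigma, b}(x) := (2 \pi \sigma)^{-\frac{1}{2}} \exp\left(-\frac{(x - b)^2}{2 \sigma}\right).$$

  Such an $X$ is termed a Gaussian or normal random variable. The mean is

$$EX = \int_{\mathbb{R}} x \, \gamma_{\sigma, b}(x) \, dx = b$$

  and the variance is

$$E(X - b)^2 = \int_{\mathbb{R}} (x - b)^2 \gamma_{\sigma, b}(x) \, dx = \sigma.$$

- Let $b \in \mathbb{R}^d$ and $\Sigma \in \mathbb{R}^{d \times d}$ be symmetric and positive definite. The random variable $X : \Omega \to \mathbb{R}^d$ with pdf

$$\gamma_{\Sigma, b}(x) := \big((2 \pi)^d \det \Sigma\big)^{-\frac{1}{2}} \exp\left(-\frac{1}{2} \langle \Sigma^{-1} (x - b), (x - b) \rangle\right)$$

  is termed a multivariate Gaussian or normal random variable. The mean is

$$E(X) = b \tag{2.7}$$

  and the covariance matrix is

$$E\big((X - b) \otimes (X - b)\big) = \Sigma. \tag{2.8}$$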

Since the mean and variance specify completely a Gaussian random variable on $\mathbb{R}$, the Gaussian is commonly denoted by $N(m, \sigma)$. The standard normal random variable is $N(0, 1)$. Similarly, since the mean and covariance matrix completely specify a Gaussian random variable on $\mathbb{R}^d$, the Gaussian is commonly denoted by $N(m, \Sigma)$.

Some analytical calculations for Gaussian random variables will be presented

in Section 2.6.
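The one-dimensional formulas of Example 2.3.10 can be checked by numerical quadrature. The sketch below follows the convention of that example, in which the second parameter $\sigma$ is the variance rather than the standard deviation; the trapezoidal rule, the truncation of the integration range and the names `gaussian_pdf` and `integrate` are assumptions made for illustration.

```python
import math


def gaussian_pdf(x, b, sigma):
    """One-dimensional Gaussian density with mean b and variance sigma."""
    return math.exp(-((x - b) ** 2) / (2.0 * sigma)) / math.sqrt(2.0 * math.pi * sigma)


def integrate(g, lo, hi, n=20000):
    """Composite trapezoidal rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi)) + sum(g(lo + i * h) for i in range(1, n))
    return total * h


b, sigma = 1.5, 4.0
# Truncate the real line to 12 standard deviations either side of the mean.
lo, hi = b - 12.0 * math.sqrt(sigma), b + 12.0 * math.sqrt(sigma)

mass = integrate(lambda x: gaussian_pdf(x, b, sigma), lo, hi)
mean = integrate(lambda x: x * gaussian_pdf(x, b, sigma), lo, hi)
var = integrate(lambda x: (x - b) ** 2 * gaussian_pdf(x, b, sigma), lo, hi)

print(mass, mean, var)   # close to 1, b and sigma
```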

Assume that X L1 (, F, ) and let G be a subalgebra of F. The conditional

expectation of X with respect to G is defined to be the function (random variable)

E[X|G] : 7 E which is Gmeasurable and satisfies

Z

Z

X d G G.

E[X|G] d =

G

We can define E[f (X)|G] and the conditional probability P[X F |G] = E[IF (X)|G],

where IF is the indicator function of F , in a similar manner.

We list some of the most important properties of conditional expectation.

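Theorem 2.4.1. [Properties of Conditional Expectation]. Let $(\Omega, \mathcal{F}, \mu)$ be a probability space and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$.

(a) If $X$ is $\mathcal{G}$-measurable and integrable then $\mathbb{E}(X|\mathcal{G}) = X$.

(b) (Linearity) If $X_1, X_2$ are integrable and $c_1, c_2$ constants, then $\mathbb{E}(c_1 X_1 + c_2 X_2|\mathcal{G}) = c_1 \mathbb{E}(X_1|\mathcal{G}) + c_2 \mathbb{E}(X_2|\mathcal{G})$.

(c) (Order) If $X_1, X_2$ are integrable and $X_1 \leq X_2$ a.s., then $\mathbb{E}(X_1|\mathcal{G}) \leq \mathbb{E}(X_2|\mathcal{G})$ a.s.

(d) If $Y$ and $XY$ are integrable, and $X$ is $\mathcal{G}$-measurable then $\mathbb{E}(XY|\mathcal{G}) = X\,\mathbb{E}(Y|\mathcal{G})$.

(e) (Successive smoothing) If $\mathcal{D}$ is a sub-$\sigma$-algebra of $\mathcal{F}$, $\mathcal{D} \subset \mathcal{G}$ and $X$ is integrable, then $\mathbb{E}(X|\mathcal{D}) = \mathbb{E}[\mathbb{E}(X|\mathcal{G})|\mathcal{D}] = \mathbb{E}[\mathbb{E}(X|\mathcal{D})|\mathcal{G}]$.

(f) (Convergence) Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of random variables such that, for all $n$, $|X_n| \leq Z$ where $Z$ is integrable. If $X_n \to X$ a.s., then $\mathbb{E}(X_n|\mathcal{G}) \to \mathbb{E}(X|\mathcal{G})$ a.s. and in $L^1$.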

Proof. See Exercise 10.
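On a finite sample space, a sub-$\sigma$-algebra is generated by a partition, and the conditional expectation is the probability-weighted average over each block of the partition. The following sketch checks the successive smoothing property (e) in this setting. The sample space, the random variable and the two partitions are hypothetical choices made only for illustration.

```python
from fractions import Fraction

# Hypothetical finite sample space {0, ..., 5} with the uniform measure.
omega = range(6)
prob = {w: Fraction(1, 6) for w in omega}
X = {w: w * w for w in omega}          # an arbitrary random variable

# Partitions generating the sigma-algebras; D is coarser than G.
G = [{0, 1}, {2, 3}, {4}, {5}]
D = [{0, 1, 2, 3}, {4, 5}]

def cond_exp(Z, partition):
    """E[Z | sigma(partition)]: on each block, the weighted average of Z."""
    out = {}
    for block in partition:
        mass = sum(prob[w] for w in block)
        avg = sum(prob[w] * Z[w] for w in block) / mass
        for w in block:
            out[w] = avg
    return out

E_X_G = cond_exp(X, G)
E_X_D = cond_exp(X, D)
tower = cond_exp(E_X_G, D)      # E[E(X|G)|D]
reverse = cond_exp(E_X_D, G)    # E[E(X|D)|G]

print(tower == E_X_D, reverse == E_X_D)
```

Exact rational arithmetic is used so that the two identities can be compared with `==` rather than up to rounding error.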

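2.5 The Characteristic Function

Many of the properties of (sums of) random variables can be studied using the Fourier transform of the distribution function. Let $F(\lambda)$ be the distribution function of a (discrete or continuous) random variable $X$. The characteristic function of $X$ is defined to be the Fourier transform of the distribution function

$$\phi(t) = \int_{\mathbb{R}} e^{it\lambda}\,dF(\lambda) = \mathbb{E}(e^{itX}). \tag{2.9}$$

For a continuous random variable for which the distribution function $F$ has a density, $dF(\lambda) = p(\lambda)\,d\lambda$, (2.9) gives

$$\phi(t) = \int_{\mathbb{R}} e^{it\lambda} p(\lambda)\,d\lambda.$$

For a discrete random variable for which $\mathbb{P}(X = \lambda_k) = a_k$, (2.9) gives

$$\phi(t) = \sum_{k=0}^{\infty} e^{it\lambda_k} a_k.$$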

From the properties of the Fourier transform we conclude that the characteristic function determines uniquely the distribution function of the random variable, in the sense that there is a one-to-one correspondence between $F(\lambda)$ and $\phi(t)$. Furthermore, in the exercises at the end of the chapter the reader is asked to prove the following two results.

Lemma 2.5.1. Let $\{X_1, X_2, \dots, X_n\}$ be independent random variables with characteristic functions $\phi_j(t)$, $j = 1, \dots, n$, and let $Y = \sum_{j=1}^{n} X_j$ with characteristic function $\phi_Y(t)$. Then

$$\phi_Y(t) = \prod_{j=1}^{n} \phi_j(t).$$
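Lemma 2.5.1 can be checked numerically for discrete random variables, where the characteristic function is a finite sum. The sketch below uses two hypothetical probability mass functions, builds the law of the sum by convolution (which is where independence enters), and compares the two sides of the product formula at a few values of $t$.

```python
import cmath
from itertools import product

# Two independent (hypothetical) discrete random variables given by their pmfs.
pmf_x1 = {1: 0.2, 2: 0.5, 3: 0.3}
pmf_x2 = {0: 0.6, 4: 0.4}

def char_fn(pmf, t):
    """phi(t) = sum_k exp(i t lambda_k) a_k for a discrete law."""
    return sum(a * cmath.exp(1j * t * lam) for lam, a in pmf.items())

# pmf of Y = X1 + X2, obtained by convolving the two pmfs (uses independence).
pmf_y = {}
for (v1, a1), (v2, a2) in product(pmf_x1.items(), pmf_x2.items()):
    pmf_y[v1 + v2] = pmf_y.get(v1 + v2, 0.0) + a1 * a2

for t in (0.0, 0.7, 2.5):
    lhs = char_fn(pmf_y, t)
    rhs = char_fn(pmf_x1, t) * char_fn(pmf_x2, t)
    print(t, abs(lhs - rhs))   # differences at the level of rounding error
```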

Lemma 2.5.2. Let $X$ be a random variable with characteristic function $\phi(t)$ and assume that it has finite moments. Then

$$\mathbb{E}(X^k) = \frac{1}{i^k}\,\phi^{(k)}(0).$$

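2.6 Gaussian Random Variables

In this section we present some useful calculations for Gaussian random variables. In particular, we calculate the normalization constant, the mean and variance and the characteristic function of multidimensional Gaussian random variables.

Theorem 2.6.1. Let $b \in \mathbb{R}^d$ and $\Sigma \in \mathbb{R}^{d \times d}$ a symmetric and positive definite matrix. Let $X$ be the multivariate Gaussian random variable with probability density function

$$\gamma_{\Sigma,b}(x) = \frac{1}{Z} \exp\left(-\frac{1}{2}\langle \Sigma^{-1}(x - b), x - b\rangle\right).$$

Then

i. The normalization constant is

$$Z = (2\pi)^{d/2}\sqrt{\det(\Sigma)}.$$

ii. The mean vector and covariance matrix of $X$ are given by

$$\mathbb{E}X = b \quad \text{and} \quad \mathbb{E}\big((X - \mathbb{E}X) \otimes (X - \mathbb{E}X)\big) = \Sigma.$$

iii. The characteristic function of $X$ is

$$\phi(t) = e^{i\langle b, t\rangle - \frac{1}{2}\langle t, \Sigma t\rangle}.$$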

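Proof.

i. From the spectral theorem for symmetric positive definite matrices we have that there exists a diagonal matrix $\Lambda$ with positive entries and an orthogonal matrix $B$ such that

$$\Sigma^{-1} = B^T \Lambda^{-1} B.$$

Let $z = x - b$ and $y = Bz$. We have

$$\langle \Sigma^{-1} z, z\rangle = \langle B^T \Lambda^{-1} B z, z\rangle = \langle \Lambda^{-1} B z, B z\rangle = \langle \Lambda^{-1} y, y\rangle = \sum_{i=1}^{d} \lambda_i^{-1} y_i^2.$$

Furthermore, we have that $\det(\Sigma^{-1}) = \prod_{i=1}^{d} \lambda_i^{-1}$, that $\det(\Sigma) = \prod_{i=1}^{d} \lambda_i$ and that the Jacobian of an orthogonal transformation satisfies $|J| = |\det(B)| = 1$. Hence,

$$\begin{aligned}
\int_{\mathbb{R}^d} \exp\left(-\frac{1}{2}\langle \Sigma^{-1}(x - b), x - b\rangle\right) dx
&= \int_{\mathbb{R}^d} \exp\left(-\frac{1}{2}\langle \Sigma^{-1} z, z\rangle\right) dz \\
&= \int_{\mathbb{R}^d} \exp\left(-\frac{1}{2}\sum_{i=1}^{d} \lambda_i^{-1} y_i^2\right) |J|\,dy \\
&= \prod_{i=1}^{d} \int_{\mathbb{R}} \exp\left(-\frac{1}{2}\lambda_i^{-1} y_i^2\right) dy_i \\
&= (2\pi)^{d/2} \prod_{i=1}^{d} \lambda_i^{1/2} = (2\pi)^{d/2}\sqrt{\det(\Sigma)},
\end{aligned}$$

from which we get that

$$Z = (2\pi)^{d/2}\sqrt{\det(\Sigma)}.$$

In the above calculation we have used the elementary calculus identity

$$\int_{\mathbb{R}} e^{-\alpha \frac{x^2}{2}}\,dx = \sqrt{\frac{2\pi}{\alpha}}.$$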
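The formula for the normalization constant can be sanity-checked numerically in dimension $d = 2$. The sketch below assumes a hypothetical $2 \times 2$ covariance matrix, inverts it by hand, and approximates the integral of $\exp(-\frac{1}{2}\langle \Sigma^{-1} z, z\rangle)$ by a Riemann sum on a large box. The grid parameters are illustrative choices.

```python
import math

# Hypothetical 2x2 symmetric positive definite covariance matrix Sigma.
s11, s12, s22 = 2.0, 0.5, 1.0
det = s11 * s22 - s12 * s12
# Entries of the inverse matrix Sigma^{-1}.
i11, i12, i22 = s22 / det, -s12 / det, s11 / det

def integrand(z1, z2):
    """exp(-<Sigma^{-1} z, z> / 2) for z = x - b."""
    quad = i11 * z1 * z1 + 2.0 * i12 * z1 * z2 + i22 * z2 * z2
    return math.exp(-0.5 * quad)

# Riemann sum on [-L, L]^2; the integrand is negligible outside this box.
L, h = 10.0, 0.05
n = int(round(2 * L / h))
grid = [-L + k * h for k in range(n + 1)]
Z_numeric = h * h * sum(integrand(a, b) for a in grid for b in grid)

# Closed form (2 pi)^{d/2} sqrt(det Sigma) with d = 2.
Z_exact = (2.0 * math.pi) ** (2 / 2) * math.sqrt(det)
print(Z_numeric, Z_exact)
```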

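ii. From the above calculation we have that

$$\gamma_{\Sigma,b}(x)\,dx = \gamma_{\Sigma,b}(B^T y + b)\,dy = \frac{1}{(2\pi)^{d/2}\sqrt{\det(\Sigma)}} \prod_{i=1}^{d} \exp\left(-\frac{1}{2}\lambda_i^{-1} y_i^2\right) dy_i.$$

Consequently

$$\begin{aligned}
\mathbb{E}X &= \int_{\mathbb{R}^d} x\,\gamma_{\Sigma,b}(x)\,dx \\
&= \int_{\mathbb{R}^d} (B^T y + b)\,\gamma_{\Sigma,b}(B^T y + b)\,dy \\
&= b \int_{\mathbb{R}^d} \gamma_{\Sigma,b}(B^T y + b)\,dy = b,
\end{aligned}$$

where the term containing $B^T y$ vanishes because the density is even in $y$.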

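We note that, since $\Sigma^{-1} = B^T \Lambda^{-1} B$, we have that $\Sigma = B^T \Lambda B$. Furthermore, $z = B^T y$. We calculate

$$\begin{aligned}
\mathbb{E}\big((X_i - b_i)(X_j - b_j)\big)
&= \int_{\mathbb{R}^d} z_i z_j\,\gamma_{\Sigma,b}(z + b)\,dz \\
&= \frac{1}{(2\pi)^{d/2}\sqrt{\det(\Sigma)}} \int_{\mathbb{R}^d} \sum_{k} B_{ki} y_k \sum_{m} B_{mj} y_m \exp\left(-\frac{1}{2}\sum_{\ell} \lambda_\ell^{-1} y_\ell^2\right) dy \\
&= \frac{1}{(2\pi)^{d/2}\sqrt{\det(\Sigma)}} \sum_{k,m} B_{ki} B_{mj} \int_{\mathbb{R}^d} y_k y_m \exp\left(-\frac{1}{2}\sum_{\ell} \lambda_\ell^{-1} y_\ell^2\right) dy \\
&= \sum_{k,m} B_{ki} B_{mj} \lambda_k \delta_{km} \\
&= \Sigma_{ij}.
\end{aligned}$$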

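iii. Let $Y$ be a multivariate Gaussian random variable with mean $0$ and covariance $I$. Let also $C = B^T \sqrt{\Lambda}\,B$, the symmetric square root of $\Sigma$. We have that $\Sigma = CC^T = C^T C$. We have that

$$X = CY + b.$$

To see this, we first note that $X$ is Gaussian since it is given through a linear transformation of a Gaussian random variable. Furthermore,

$$\mathbb{E}X = b \quad \text{and} \quad \mathbb{E}\big((X_i - b_i)(X_j - b_j)\big) = \Sigma_{ij}.$$

Now we have:

$$\begin{aligned}
\phi(t) = \mathbb{E}e^{i\langle X, t\rangle} &= e^{i\langle b, t\rangle}\,\mathbb{E}e^{i\langle CY, t\rangle} \\
&= e^{i\langle b, t\rangle}\,\mathbb{E}e^{i\langle Y, C^T t\rangle} \\
&= e^{i\langle b, t\rangle}\,\mathbb{E}e^{i\sum_j \left(\sum_k C_{jk} t_k\right) y_j} \\
&= e^{i\langle b, t\rangle}\,e^{-\frac{1}{2}\sum_j \left|\sum_k C_{jk} t_k\right|^2} \\
&= e^{i\langle b, t\rangle}\,e^{-\frac{1}{2}\langle Ct, Ct\rangle} \\
&= e^{i\langle b, t\rangle}\,e^{-\frac{1}{2}\langle t, C^T C t\rangle} \\
&= e^{i\langle b, t\rangle}\,e^{-\frac{1}{2}\langle t, \Sigma t\rangle}.
\end{aligned}$$

Consequently,

$$\phi(t) = e^{i\langle b, t\rangle - \frac{1}{2}\langle t, \Sigma t\rangle}.$$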

2.7 Types of Convergence and Limit Theorems

One of the most important aspects of the theory of random variables is the study of limit theorems for sums of random variables. The best known limit theorems in probability theory are the law of large numbers and the central limit theorem. There are various types of convergence for sequences of random variables. We list the most important types of convergence below.

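Definition 2.7.1. Let $\{Z_n\}_{n=1}^{\infty}$ be a sequence of random variables. We will say that

(a) $Z_n$ converges to $Z$ with probability one if

$$\mathbb{P}\left(\lim_{n \to +\infty} Z_n = Z\right) = 1.$$

(b) $Z_n$ converges to $Z$ in probability if for every $\varepsilon > 0$

$$\lim_{n \to +\infty} \mathbb{P}\big(|Z_n - Z| > \varepsilon\big) = 0.$$

(c) $Z_n$ converges to $Z$ in $L^p$ if

$$\lim_{n \to +\infty} \mathbb{E}\big[|Z_n - Z|^p\big] = 0.$$

(d) Let $F_n(\lambda)$, $n = 1, \dots, +\infty$, and $F(\lambda)$ be the distribution functions of $Z_n$, $n = 1, \dots, +\infty$, and $Z$, respectively. Then $Z_n$ converges to $Z$ in distribution if

$$\lim_{n \to +\infty} F_n(\lambda) = F(\lambda)$$

for all $\lambda \in \mathbb{R}$ at which $F$ is continuous.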

Recall that the distribution function $F_X$ of a random variable from a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ to $\mathbb{R}$ induces a probability measure on $\mathbb{R}$ and that $(\mathbb{R}, \mathcal{B}(\mathbb{R}), F_X)$ is a probability space. We can show that convergence in distribution is equivalent to the weak convergence of the probability measures induced by the distribution functions.

Definition 2.7.2. Let $(E, d)$ be a metric space, $\mathcal{B}(E)$ the $\sigma$-algebra of its Borel sets, $P_n$ a sequence of probability measures on $(E, \mathcal{B}(E))$ and let $C_b(E)$ denote the space of bounded continuous functions on $E$. We will say that the sequence $P_n$ converges weakly to the probability measure $P$ if, for each $f \in C_b(E)$,

$$\lim_{n \to +\infty} \int_E f(x)\,dP_n(x) = \int_E f(x)\,dP(x).$$

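Theorem 2.7.3. Let $F_n(\lambda)$, $n = 1, \dots, +\infty$, and $F(\lambda)$ be the distribution functions of $Z_n$, $n = 1, \dots, +\infty$, and $Z$, respectively. Then $Z_n$ converges to $Z$ in distribution if and only if, for all $g \in C_b(\mathbb{R})$,

$$\lim_{n \to +\infty} \int_X g(x)\,dF_n(x) = \int_X g(x)\,dF(x). \tag{2.10}$$

Notice that (2.10) is equivalent to

$$\lim_{n \to +\infty} \mathbb{E}_n g(X_n) = \mathbb{E}g(X),$$

where $\mathbb{E}_n$ and $\mathbb{E}$ denote the expectations with respect to $F_n$ and $F$, respectively.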

When the sequence of random variables whose convergence we are interested in takes values in $\mathbb{R}^d$ or, more generally, in a metric space $(E, d)$, then we can use weak convergence of the sequence of probability measures induced by the sequence of random variables to define convergence in distribution.

Definition 2.7.4. A sequence of random variables $X_n$ defined on probability spaces $(\Omega_n, \mathcal{F}_n, P_n)$ and taking values in a metric space $(E, d)$ is said to converge in distribution if the induced measures $F_n(B) = P_n(X_n \in B)$ for $B \in \mathcal{B}(E)$ converge weakly to a probability measure $P$.

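Let $\{X_n\}_{n=1}^{\infty}$ be iid random variables with $\mathbb{E}X_n = V$. Then, the strong law of large numbers states that the average of the sum of the iid random variables converges to $V$ with probability one:

$$\mathbb{P}\left(\lim_{N \to +\infty} \frac{1}{N}\sum_{n=1}^{N} X_n = V\right) = 1.$$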

The strong law of large numbers provides us with information about the behavior of a sum of random variables (or, a large number of repetitions of the same experiment) on average. We can also study fluctuations around the average behavior. Indeed, let $\mathbb{E}(X_n - V)^2 = \sigma^2$. Define the centered iid random variables $Y_n = X_n - V$. Then, the sequence of random variables $\frac{1}{\sigma\sqrt{N}}\sum_{n=1}^{N} Y_n$ converges in distribution to a $N(0, 1)$ random variable:

$$\lim_{N \to +\infty} \mathbb{P}\left(\frac{1}{\sigma\sqrt{N}}\sum_{n=1}^{N} Y_n \leq a\right) = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}\,dx.$$
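Both limit theorems can be observed in a small simulation. The sketch below is one possible setup, not the only one: it assumes iid Uniform(0, 1) samples (so $V = 1/2$ and $\sigma^2 = 1/12$), a fixed random seed, and illustrative sample sizes. It prints the error of the sample mean for growing $N$, then compares the empirical probability in the central limit theorem with the standard normal distribution function.

```python
import math
import random

random.seed(0)  # fixed seed so that the sketch is reproducible

def sample_mean(N):
    """Average of N iid Uniform(0, 1) draws; here V = 1/2."""
    return sum(random.random() for _ in range(N)) / N

# Law of large numbers: the error of the sample mean shrinks as N grows.
for N in (10, 1000, 100000):
    print(N, abs(sample_mean(N) - 0.5))

# Central limit theorem: normalized sums of Y_n = X_n - V, sigma^2 = 1/12.
sigma = math.sqrt(1.0 / 12.0)
N, M, a = 200, 5000, 1.0
hits = 0
for _ in range(M):
    s = sum(random.random() - 0.5 for _ in range(N)) / (sigma * math.sqrt(N))
    hits += s <= a
empirical = hits / M
gaussian = 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))  # N(0, 1) cdf at a
print(empirical, gaussian)
```

Repeating the experiment with different values of `N` and `M` gives a rough picture of the rate of convergence.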

The material of this chapter is very standard and can be found in many books on

probability theory. Well known textbooks on probability theory are [4, 14, 15, 44,

45, 37, 71].

The connection between conditional expectation and orthogonal projections is

discussed in [8].

The reduced distribution functions defined in Section 2.3 are used extensively

in statistical mechanics. A different normalization is usually used in physics textbooks. See for instance [2, Sec. 4.2].

The calculations presented in Section 2.6 are essentially an exercise in linear

algebra. See [42, Sec. 10.2].

Random variables and probability measures can also be defined in infinite dimensions. More information can be found in [?, Ch. 2].

The study of limit theorems is one of the cornerstones of probability theory and

of the theory of stochastic processes. A comprehensive study of limit theorems can

be found in [33].

2.9 Exercises

1. Show that the intersection of a family of $\sigma$-algebras is a $\sigma$-algebra.

2. Prove the law of total probability, Proposition 2.2.13.

3. Calculate the mean, variance and characteristic function of the following probability density functions.

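(a) The exponential distribution with density

$$f(x) = \begin{cases} \lambda e^{-\lambda x} & x > 0, \\ 0 & x < 0, \end{cases}$$

with $\lambda > 0$.

(b) The uniform distribution with density

$$f(x) = \begin{cases} \frac{1}{b - a} & a < x < b, \\ 0 & x \notin (a, b), \end{cases}$$

with $a < b$.

(c) The Gamma distribution with density

$$f(x) = \begin{cases} \frac{\lambda}{\Gamma(\alpha)} (\lambda x)^{\alpha - 1} e^{-\lambda x} & x > 0, \\ 0 & x < 0, \end{cases}$$

with $\lambda > 0$, $\alpha > 0$ and $\Gamma(\alpha)$ is the Gamma function

$$\Gamma(\alpha) = \int_0^{\infty} \xi^{\alpha - 1} e^{-\xi}\,d\xi, \quad \alpha > 0.$$

4. Let $X$ and $Y$ be independent random variables with distribution functions $F_X$ and $F_Y$. Show that the distribution function of the sum $Z = X + Y$ is the convolution of $F_X$ and $F_Y$:

$$F_Z(x) = \int F_X(x - y)\,dF_Y(y).$$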

5. Let X and Y be Gaussian random variables. Show that they are uncorrelated if

and only if they are independent.

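6. (a) Let $X$ be a random variable with characteristic function $\phi(t)$. Show that

$$\mathbb{E}X^k = \frac{1}{i^k}\,\phi^{(k)}(0),$$

where $\phi^{(k)}(t)$ denotes the $k$-th derivative of $\phi$ evaluated at $t$.

(b) Let $X$ be a nonnegative random variable with distribution function $F(x)$. Show that

$$\mathbb{E}(X) = \int_0^{+\infty} (1 - F(x))\,dx.$$

(c) Let $X$ be a continuous random variable with probability density function $f(x)$ and characteristic function $\phi(t)$. Find the probability density and characteristic function of the random variable $Y = aX + b$ with $a, b \in \mathbb{R}$.

(d) Let $X$ be a random variable with uniform distribution on $[0, 2\pi]$. Find the probability density of the random variable $Y = \sin(X)$.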

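7. Let $X$ be a discrete random variable taking values on the set of nonnegative integers with probability mass function $p_k = \mathbb{P}(X = k)$ with $p_k \geq 0$, $\sum_{k=0}^{+\infty} p_k = 1$. The generating function is defined as

$$g(s) = \mathbb{E}(s^X) = \sum_{k=0}^{+\infty} p_k s^k.$$

(a) Show that

$$\mathbb{E}X = g'(1) \quad \text{and} \quad \mathbb{E}X^2 = g''(1) + g'(1),$$

where the prime denotes differentiation with respect to $s$.

(b) Calculate the generating function of the Poisson random variable with

$$p_k = \mathbb{P}(X = k) = \frac{e^{-\lambda}\lambda^k}{k!}, \quad k = 0, 1, 2, \dots \quad \text{and} \quad \lambda > 0.$$

(c) Prove that the generating function of a sum of independent nonnegative integer valued random variables is the product of their generating functions.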

8. Write a computer program for studying the law of large numbers and the central

limit theorem. Investigate numerically the rate of convergence of these two

theorems.

9. Study the properties of Gaussian measures on separable Hilbert spaces from [?,

Ch. 2].

10. Prove Theorem 2.4.1.

Chapter 3

Processes

3.1 Introduction

In this chapter we present some basic results from the theory of stochastic processes and we investigate the properties of some of the standard stochastic processes in continuous time. In Section 3.2 we give the definition of a stochastic process. In Section 3.3 we present some properties of stationary stochastic processes.

In Section 3.4 we introduce Brownian motion and study some of its properties.

Various examples of stochastic processes in continuous time are presented in Section 3.5. The Karhunen-Loeve expansion, one of the most useful tools for representing stochastic processes and random fields, is presented in Section 3.6. Further

discussion and bibliographical comments are presented in Section 3.7. Section 3.8

contains exercises.

3.2 Definition of a Stochastic Process

Stochastic processes describe dynamical systems whose evolution law is of probabilistic nature. The precise definition is given below.

Definition 3.2.1. Let $T$ be an ordered set, $(\Omega, \mathcal{F}, \mathbb{P})$ a probability space and $(E, \mathcal{G})$ a measurable space. A stochastic process is a collection of random variables $X = \{X_t;\ t \in T\}$ where, for each fixed $t \in T$, $X_t$ is a random variable from $(\Omega, \mathcal{F}, \mathbb{P})$ to $(E, \mathcal{G})$. $\Omega$ is called the sample space and $E$ is the state space of the stochastic process $X_t$.

The set $T$ can be either discrete, for example the set of positive integers $\mathbb{Z}^+$, or continuous, $T = [0, +\infty)$. The state space $E$ will usually be $\mathbb{R}^d$ equipped with the $\sigma$-algebra of Borel sets.

A stochastic process $X$ may be viewed as a function of both $t \in T$ and $\omega \in \Omega$. We will sometimes write $X(t)$, $X(t, \omega)$ or $X_t(\omega)$ instead of $X_t$. For a fixed sample point $\omega \in \Omega$, the function $X_t(\omega) : T \mapsto E$ is called a sample path (realization, trajectory) of the process $X$.

Definition 3.2.2. The finite dimensional distributions (fdd) of a stochastic process are the distributions of the $E^k$-valued random variables $(X(t_1), X(t_2), \dots, X(t_k))$ for arbitrary positive integer $k$ and arbitrary times $t_i \in T$, $i \in \{1, \dots, k\}$:

$$F(x) = \mathbb{P}(X(t_i) \leq x_i,\ i = 1, \dots, k)$$

with $x = (x_1, \dots, x_k)$.

From experiments or numerical simulations we can only obtain information

about the finite dimensional distributions of a process. A natural question arises:

are the finite dimensional distributions of a stochastic process sufficient to determine a stochastic process uniquely? This is true for processes with continuous paths.¹ This is the class of stochastic processes that we will study in these notes.

Definition 3.2.3. We will say that two processes $X_t$ and $Y_t$ are equivalent if they have the same finite dimensional distributions.

Definition 3.2.4. A one dimensional Gaussian process is a continuous time stochastic process for which $E = \mathbb{R}$ and all the finite dimensional distributions are Gaussian, i.e. every finite dimensional vector $(X_{t_1}, X_{t_2}, \dots, X_{t_k})$ is a $N(\mu_k, K_k)$ random variable for some vector $\mu_k$ and a symmetric nonnegative definite matrix $K_k$, for all $k = 1, 2, \dots$ and for all $t_1, t_2, \dots, t_k$.

From the above definition we conclude that the finite dimensional distributions of a Gaussian continuous time stochastic process are Gaussian with probability density function

$$\gamma_{\mu_k, K_k}(x) = (2\pi)^{-k/2} (\det K_k)^{-1/2} \exp\left(-\frac{1}{2}\langle K_k^{-1}(x - \mu_k), x - \mu_k\rangle\right),$$

where $x = (x_1, x_2, \dots, x_k)$.

¹ In fact, what we need is for the stochastic process to be separable. See the discussion in Section 3.7.

A Gaussian process $x(t)$ is characterized by its mean

$$m(t) := \mathbb{E}x(t)$$

and the covariance (or autocorrelation) matrix

$$C(t, s) = \mathbb{E}\Big(\big(x(t) - m(t)\big) \otimes \big(x(s) - m(s)\big)\Big).$$

Thus, the first two moments of a Gaussian process are sufficient for a complete

characterization of the process.

3.3.1 Strictly Stationary Processes

In many stochastic processes that appear in applications their statistics remain invariant under time translations. Such stochastic processes are called stationary. It

is possible to develop a quite general theory for stochastic processes that enjoy this

symmetry property.

Definition 3.3.1. A stochastic process is called (strictly) stationary if all finite dimensional distributions are invariant under time translation: for any integer $k$ and times $t_i \in T$, the distribution of $(X(t_1), X(t_2), \dots, X(t_k))$ is equal to that of $(X(s + t_1), X(s + t_2), \dots, X(s + t_k))$ for any $s$ such that $s + t_i \in T$ for all $i \in \{1, \dots, k\}$. In other words,

$$\mathbb{P}(X_{t_1 + t} \in A_1, X_{t_2 + t} \in A_2, \dots, X_{t_k + t} \in A_k) = \mathbb{P}(X_{t_1} \in A_1, X_{t_2} \in A_2, \dots, X_{t_k} \in A_k), \quad \forall\, t \in T.$$

Example 3.3.2. Let $Y_0, Y_1, \dots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$. Then $X_n$ is a strictly stationary process (see Exercise 1). Assume furthermore that $\mathbb{E}Y_0 = \mu < +\infty$. Then, by the strong law of large numbers, we have that

$$\frac{1}{N}\sum_{j=0}^{N-1} X_j = \frac{1}{N}\sum_{j=0}^{N-1} Y_j \to \mathbb{E}Y_0 = \mu,$$

almost surely. In fact, Birkhoff's ergodic theorem states that, for any function $f$ such that $\mathbb{E}f(Y_0) < +\infty$, we have that

$$\lim_{N \to +\infty} \frac{1}{N}\sum_{j=0}^{N-1} f(X_j) = \mathbb{E}f(Y_0), \tag{3.1}$$

almost surely. The sequence of iid random variables is an example of an ergodic strictly stationary process.

Ergodic strictly stationary processes satisfy (3.1). Hence, we can calculate the statistics of a stochastic process $X_n$ using a single sample path, provided that it is long enough ($N \gg 1$).

Example 3.3.3. Let $Z$ be a random variable and define the stochastic process $X_n = Z$, $n = 0, 1, 2, \dots$. Then $X_n$ is a strictly stationary process (see Exercise 2). We can calculate the long time average of this stochastic process:

$$\frac{1}{N}\sum_{j=0}^{N-1} X_j = \frac{1}{N}\sum_{j=0}^{N-1} Z = Z,$$

which is independent of $N$ and does not converge to the mean of the stochastic process $\mathbb{E}X_n = \mathbb{E}Z$ (assuming that it is finite), or any other deterministic number. This is an example of a non-ergodic process.
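The contrast between the two examples is easy to see numerically. The sketch below assumes standard normal distributions for $Y_n$ and $Z$ and a fixed random seed, all of which are illustrative choices. It computes the time average along three independent sample paths of each process.

```python
import random

random.seed(1)

def time_average(path):
    """(1/N) * sum_{j=0}^{N-1} X_j along a single sample path."""
    return sum(path) / len(path)

N = 50000
# Example 3.3.2: X_n = Y_n with Y_n iid N(0, 1), so E Y_0 = 0.
iid_averages = [time_average([random.gauss(0.0, 1.0) for _ in range(N)])
                for _ in range(3)]

# Example 3.3.3: X_n = Z for every n, with Z ~ N(0, 1) drawn once per path.
frozen_averages = []
for _ in range(3):
    z = random.gauss(0.0, 1.0)
    frozen_averages.append(time_average([z] * N))

print("iid paths:   ", iid_averages)     # all close to E Y_0 = 0
print("frozen paths:", frozen_averages)  # each reproduces its own draw of Z
```

For the iid process every path gives nearly the same time average, whereas for the constant process the time average depends on the path and carries no information about $\mathbb{E}Z$.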

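Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Let $X_t$, $t \in T$ (with $T = \mathbb{R}$ or $\mathbb{Z}$) be a real-valued random process on this probability space with finite second moment, $\mathbb{E}|X_t|^2 < +\infty$ (i.e. $X_t \in L^2(\Omega, \mathbb{P})$ for all $t \in T$). Assume that it is strictly stationary. Then,

$$\mathbb{E}(X_{t+s}) = \mathbb{E}X_t, \quad s \in T, \tag{3.2}$$

from which we conclude that $\mathbb{E}X_t =: \mu$ is constant, and

$$\mathbb{E}\big((X_{t_1 + s} - \mu)(X_{t_2 + s} - \mu)\big) = \mathbb{E}\big((X_{t_1} - \mu)(X_{t_2} - \mu)\big), \quad s \in T, \tag{3.3}$$

from which we conclude that the covariance (or autocorrelation) function $C(t, s) = \mathbb{E}\big((X_t - \mu)(X_s - \mu)\big)$ depends only on the difference between the two times, $t$ and $s$, i.e. $C(t, s) = C(t - s)$. This motivates the following definition.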

Definition 3.3.4. A stochastic process $X_t \in L^2$ is called second-order stationary or wide-sense stationary or weakly stationary if the first moment $\mathbb{E}X_t$ is a constant and the covariance function $\mathbb{E}(X_t - \mu)(X_s - \mu)$ depends only on the difference $t - s$:

$$\mathbb{E}X_t = \mu, \quad \mathbb{E}\big((X_t - \mu)(X_s - \mu)\big) = C(t - s).$$

The constant $\mu$ is the expectation of the process $X_t$. Without loss of generality, we can set $\mu = 0$, since if $\mathbb{E}X_t = \mu$ then the process $Y_t = X_t - \mu$ is mean zero. A mean zero process will be called a centered process. The function $C(t)$ is the covariance (sometimes also called autocovariance) or the autocorrelation function of $X_t$. Notice that $C(t) = \mathbb{E}(X_t X_0)$, whereas $C(0) = \mathbb{E}(X_t^2)$, which is finite, by assumption. Since we have assumed that $X_t$ is a real valued process, we have that $C(t) = C(-t)$, $t \in \mathbb{R}$.

Remark 3.3.5. Let $X_t$ be a strictly stationary stochastic process with finite second moment (i.e. $X_t \in L^2$). The definition of strict stationarity implies that $\mathbb{E}X_t = \mu$, a constant, and $\mathbb{E}\big((X_t - \mu)(X_s - \mu)\big) = C(t - s)$. Hence, a strictly stationary process with finite second moment is also stationary in the wide sense. The converse is not true.

Example 3.3.6. Let $Y_0, Y_1, \dots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$. From Example 3.3.2 we know that this is a strictly stationary process, irrespective of whether $Y_0$ is such that $\mathbb{E}Y_0^2 < +\infty$. Assume now that $\mathbb{E}Y_0 = 0$ and $\mathbb{E}Y_0^2 = \sigma^2 < +\infty$. Then $X_n$ is a second order stationary process with mean zero and correlation function $R(k) = \sigma^2 \delta_{k0}$. Notice that in this case we have no correlation between the values of the stochastic process at different times $n$ and $k$.

Example 3.3.7. Let $Z$ be a single random variable and consider the stochastic process $X_n = Z$, $n = 0, 1, 2, \dots$. From Example 3.3.3 we know that this is a strictly stationary process irrespective of whether $\mathbb{E}|Z|^2 < +\infty$ or not. Assume now that $\mathbb{E}Z = 0$, $\mathbb{E}Z^2 = \sigma^2$. Then $X_n$ becomes a second order stationary process with $R(k) = \sigma^2$. Notice that in this case the values of our stochastic process at different times are strongly correlated.

We will see in Section 3.3.3 that for second order stationary processes, ergodicity is related to fast decay of correlations. In the first of the examples above,

there was no correlation between our stochastic processes at different times and

the stochastic process is ergodic. On the contrary, in our second example there is

very strong correlation between the stochastic process at different times and this

process is not ergodic.

Remark 3.3.8. The first two moments of a Gaussian process are sufficient for a

complete characterization of the process. Consequently, a Gaussian stochastic

process is strictly stationary if and only if it is weakly stationary.

Continuity properties of the covariance function are equivalent to continuity properties of the paths of $X_t$ in the $L^2$ sense, i.e.

$$\lim_{h \to 0} \mathbb{E}|X_{t+h} - X_t|^2 = 0.$$

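Lemma 3.3.9. Assume that the covariance function $C(t)$ of a second order stationary process is continuous at $t = 0$. Then it is continuous for all $t \in \mathbb{R}$. Furthermore, the continuity of $C(t)$ is equivalent to the continuity of the process $X_t$ in the $L^2$-sense.

Proof. Fix $t \in \mathbb{R}$ and (without loss of generality) set $\mathbb{E}X_t = 0$. We calculate:

$$\begin{aligned}
|C(t + h) - C(t)|^2 &= |\mathbb{E}(X_{t+h} X_0) - \mathbb{E}(X_t X_0)|^2 = \big|\mathbb{E}\big((X_{t+h} - X_t) X_0\big)\big|^2 \\
&\leq \mathbb{E}(X_0)^2\,\mathbb{E}(X_{t+h} - X_t)^2 \\
&= C(0)\big(\mathbb{E}X_{t+h}^2 + \mathbb{E}X_t^2 - 2\mathbb{E}X_t X_{t+h}\big) \\
&= 2C(0)\big(C(0) - C(h)\big) \to 0,
\end{aligned}$$

as $h \to 0$, where the inequality is the Cauchy-Schwarz inequality. Thus, continuity of $C(\cdot)$ at $0$ implies continuity for all $t$.

Assume now that $C(t)$ is continuous. From the above calculation we have

$$\mathbb{E}|X_{t+h} - X_t|^2 = 2\big(C(0) - C(h)\big), \tag{3.4}$$

which converges to $0$ as $h \to 0$. Conversely, assume that $X_t$ is $L^2$-continuous. Then, from the above equation we get $\lim_{h \to 0} C(h) = C(0)$.

Notice that from (3.4) we immediately conclude that $C(0) \geq C(h)$, $h \in \mathbb{R}$.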

The Fourier transform of the covariance function of a second order stationary process always exists. This enables us to study second order stationary processes using tools from Fourier analysis. To make the link between second order stationary processes and Fourier analysis we will use Bochner's theorem, which applies to all nonnegative definite functions.

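Definition 3.3.10. A function $f(x) : \mathbb{R} \mapsto \mathbb{R}$ is called nonnegative definite if

$$\sum_{i,j=1}^{n} f(t_i - t_j)\,c_i \bar{c}_j \geq 0 \tag{3.5}$$

for all $n \in \mathbb{N}$, $t_1, \dots, t_n \in \mathbb{R}$, $c_1, \dots, c_n \in \mathbb{C}$.

Lemma 3.3.11. The covariance function of a second order stationary process is a nonnegative definite function.

Proof. We will use the notation $X_t^c := \sum_{i=1}^{n} X_{t_i} c_i$. We have

$$\begin{aligned}
\sum_{i,j=1}^{n} C(t_i - t_j)\,c_i \bar{c}_j &= \sum_{i,j=1}^{n} \mathbb{E}X_{t_i} X_{t_j}\,c_i \bar{c}_j \\
&= \mathbb{E}\left(\sum_{i=1}^{n} X_{t_i} c_i \sum_{j=1}^{n} X_{t_j} \bar{c}_j\right) = \mathbb{E}\big(X_t^c\,\bar{X}_t^c\big) \\
&= \mathbb{E}|X_t^c|^2 \geq 0.
\end{aligned}$$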

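Theorem 3.3.12. (Bochner) Let $C(t)$ be a continuous nonnegative definite function. Then there exists a unique nonnegative measure $\rho$ on $\mathbb{R}$ such that $\rho(\mathbb{R}) = C(0)$ and

$$C(t) = \int_{\mathbb{R}} e^{ixt}\,\rho(dx) \quad \forall\, t \in \mathbb{R}. \tag{3.6}$$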

Definition 3.3.13. Let $X_t$ be a second order stationary process with autocorrelation function $C(t)$ whose Fourier transform is the measure $\rho(dx)$. The measure $\rho(dx)$ is called the spectral measure of the process $X_t$.

In the following we will assume that the spectral measure is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}$ with density $f(x)$, i.e. $\rho(dx) = f(x)\,dx$. The Fourier transform $f(x)$ of the covariance function is called the spectral density of the process:

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx} C(t)\,dt.$$

From (3.6) it follows that the autocorrelation function of a mean zero, second order stationary process is given by the inverse Fourier transform of the spectral density:

$$C(t) = \int_{-\infty}^{\infty} e^{itx} f(x)\,dx. \tag{3.7}$$

There are various cases where the experimentally measured quantity is the spectral density (or power spectrum) of a stationary stochastic process. Conversely, from a time series of observations of a stationary process we can calculate the autocorrelation function and, using (3.7), the spectral density.

The autocorrelation function of a second order stationary process enables us to associate a time scale to $X_t$, the correlation time $\tau_{cor}$:

$$\tau_{cor} = \frac{1}{C(0)}\int_0^{\infty} C(\tau)\,d\tau = \int_0^{\infty} \mathbb{E}(X_\tau X_0)/\mathbb{E}(X_0^2)\,d\tau.$$

The slower the decay of the correlation function, the larger the correlation time is. Notice that when the correlations do not decay sufficiently fast, so that $C(t)$ is not integrable, then the correlation time will be infinite.

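Example 3.3.14. Consider a mean zero, second order stationary process with correlation function

$$R(t) = R(0)\,e^{-\alpha|t|} \tag{3.8}$$

where $\alpha > 0$. We will write $R(0) = \frac{D}{\alpha}$ where $D > 0$. The spectral density of this process is:

$$\begin{aligned}
f(x) &= \frac{1}{2\pi}\frac{D}{\alpha}\int_{-\infty}^{+\infty} e^{-ixt} e^{-\alpha|t|}\,dt \\
&= \frac{1}{2\pi}\frac{D}{\alpha}\left(\int_{-\infty}^{0} e^{-ixt} e^{\alpha t}\,dt + \int_{0}^{+\infty} e^{-ixt} e^{-\alpha t}\,dt\right) \\
&= \frac{1}{2\pi}\frac{D}{\alpha}\left(\frac{1}{-ix + \alpha} + \frac{1}{ix + \alpha}\right) \\
&= \frac{D}{\pi}\frac{1}{x^2 + \alpha^2}.
\end{aligned}$$

This function is called the Cauchy or the Lorentz distribution. The correlation time is (we have that $R(0) = D/\alpha$)

$$\tau_{cor} = \int_0^{\infty} e^{-\alpha t}\,dt = \alpha^{-1}.$$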
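The closed form of the spectral density can be checked by computing the Fourier integral numerically. The sketch below assumes illustrative values for $\alpha$ and $D$ and uses a truncated trapezoidal rule. Because $R$ is even, only the cosine part of $e^{-ixt}$ contributes.

```python
import math

alpha, D = 1.5, 2.0   # hypothetical parameter values

def R(t):
    """Correlation function (3.8) with R(0) = D / alpha."""
    return (D / alpha) * math.exp(-alpha * abs(t))

def spectral_density_numeric(x, T=40.0, h=1e-3):
    """(1/2pi) * integral of exp(-i x t) R(t) dt, via the trapezoidal rule.

    R is even, so the integral reduces to 2 * int_0^T cos(x t) R(t) dt.
    """
    n = int(round(T / h))
    total = 0.5 * (R(0.0) + math.cos(x * T) * R(T))
    total += sum(math.cos(x * k * h) * R(k * h) for k in range(1, n))
    return 2.0 * h * total / (2.0 * math.pi)

def lorentzian(x):
    """Closed form (D / pi) / (x^2 + alpha^2)."""
    return (D / math.pi) / (x * x + alpha * alpha)

for x in (0.0, 1.0, 3.0):
    print(x, spectral_density_numeric(x), lorentzian(x))
```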

A Gaussian process with an exponential correlation function is of particular importance in the theory and applications of stochastic processes.

Definition 3.3.15. A real-valued Gaussian stationary process defined on R with

correlation function given by (3.8) is called the (stationary) Ornstein-Uhlenbeck

process.

The Ornstein-Uhlenbeck process is used as a model for the velocity of a Brownian particle. It is of interest to calculate the statistics of the position of the Brownian particle, i.e. of the integral

$$X(t) = \int_0^t Y(s)\,ds. \tag{3.9}$$

Lemma 3.3.16. Let $Y(t)$ denote the stationary OU process with covariance function (3.8) and set $\alpha = D = 1$. Then the position process (3.9) is a mean zero Gaussian process with covariance function

$$\mathbb{E}(X(t)X(s)) = 2\min(t, s) + e^{-\min(t,s)} + e^{-\max(t,s)} - e^{-|t - s|} - 1. \tag{3.10}$$
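Since $\mathbb{E}(X(t)X(s)) = \int_0^t \int_0^s \mathbb{E}(Y(u)Y(v))\,dv\,du$ and, for $\alpha = D = 1$, the covariance (3.8) is $e^{-|u - v|}$, formula (3.10) can be checked by numerical integration. The sketch below uses a midpoint rule with an illustrative grid size. It is a consistency check of the formula, not a proof.

```python
import math

def cov_formula(t, s):
    """Right-hand side of (3.10) for alpha = D = 1."""
    lo, hi = min(t, s), max(t, s)
    return 2.0 * lo + math.exp(-lo) + math.exp(-hi) - math.exp(-abs(t - s)) - 1.0

def cov_numeric(t, s, n=400):
    """Midpoint-rule approximation of int_0^t int_0^s exp(-|u - v|) dv du."""
    hu, hv = t / n, s / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        for j in range(n):
            v = (j + 0.5) * hv
            total += math.exp(-abs(u - v))
    return total * hu * hv

for t, s in ((1.0, 1.0), (2.0, 0.5), (0.3, 1.7)):
    print(t, s, cov_numeric(t, s), cov_formula(t, s))
```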

Second order stationary processes have nice ergodic properties, provided that the

correlation between values of the process at different times decays sufficiently fast.

In this case, it is possible to show that we can calculate expectations by calculating

time averages. An example of such a result is the following.

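Theorem 3.3.17. Let $\{X_t\}_{t \geq 0}$ be a second order stationary process on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with mean $\mu$ and covariance $R(t)$, and assume that $R(t) \in L^1(0, +\infty)$. Then

$$\lim_{T \to +\infty} \mathbb{E}\left|\frac{1}{T}\int_0^T X(s)\,ds - \mu\right|^2 = 0. \tag{3.11}$$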

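For the proof of this result we will first need an elementary lemma.

Lemma 3.3.18. Let $R(t)$ be an integrable symmetric function. Then

$$\int_0^T\!\!\int_0^T R(t - s)\,dt\,ds = 2\int_0^T (T - s)R(s)\,ds. \tag{3.12}$$

Proof. We make the change of variables $u = t - s$, $v = t + s$. The domain of integration in the $t, s$ variables is $[0, T] \times [0, T]$. In the $u, v$ variables, $u$ ranges over $[-T, T]$ and, for each fixed $u$, $v$ ranges over an interval of length $2(T - |u|)$. The Jacobian of the transformation is

$$J = \frac{\partial(t, s)}{\partial(u, v)} = \frac{1}{2}.$$

The integral becomes

$$\begin{aligned}
\int_0^T\!\!\int_0^T R(t - s)\,dt\,ds &= \int_{-T}^{T}\!\!\int_0^{2(T - |u|)} R(u)\,J\,dv\,du \\
&= \int_{-T}^{T} (T - |u|)R(u)\,du \\
&= 2\int_0^T (T - u)R(u)\,du,
\end{aligned}$$

where the symmetry of the function $R(u)$ was used in the last step.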

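Proof of Theorem 3.3.17. We use Lemma 3.3.18 to calculate:

$$\begin{aligned}
\mathbb{E}\left|\frac{1}{T}\int_0^T X_s\,ds - \mu\right|^2 &= \frac{1}{T^2}\,\mathbb{E}\left|\int_0^T (X_s - \mu)\,ds\right|^2 \\
&= \frac{1}{T^2}\,\mathbb{E}\int_0^T\!\!\int_0^T (X(t) - \mu)(X(s) - \mu)\,dt\,ds \\
&= \frac{1}{T^2}\int_0^T\!\!\int_0^T R(t - s)\,dt\,ds \\
&= \frac{2}{T^2}\int_0^T (T - u)R(u)\,du \\
&\leq \frac{2}{T}\int_0^{+\infty} \left|\left(1 - \frac{u}{T}\right)R(u)\right| du \leq \frac{2}{T}\int_0^{+\infty} |R(u)|\,du \to 0,
\end{aligned}$$

as $T \to +\infty$, since $R \in L^1(0, +\infty)$.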

Assume that $\mu = 0$ and define

$$D = \int_0^{+\infty} R(t)\,dt, \tag{3.13}$$

which, from our assumption on $R(t)$, is a finite quantity.² The above calculation suggests that, for $T \gg 1$, we have that

$$\mathbb{E}\left(\int_0^T X(t)\,dt\right)^2 \approx 2DT.$$

This implies that, at sufficiently long times, the mean square displacement of the integral of the ergodic second order stationary process $X_t$ scales linearly in time, with proportionality coefficient $2D$.

² Notice however that we do not know whether it is nonzero. This requires a separate argument.

Assume that $X_t$ is the velocity of a (Brownian) particle. In this case, the integral of $X_t$

$$Z_t = \int_0^t X_s\,ds,$$

represents the particle position. From our calculation above we conclude that, for $t \gg 1$,

$$\mathbb{E}Z_t^2 \approx 2Dt,$$

where

$$D = \int_0^{\infty} R(t)\,dt = \int_0^{\infty} \mathbb{E}(X_t X_0)\,dt \tag{3.14}$$

is the diffusion coefficient. Thus, one expects that at sufficiently long times and

under appropriate assumptions on the correlation function, the time integral of a

stationary process will approximate a Brownian motion with diffusion coefficient

D. The diffusion coefficient is an example of a transport coefficient and (3.14) is

an example of the Green-Kubo formula: a transport coefficient can be calculated

in terms of the time integral of an appropriate autocorrelation function. In the

case of the diffusion coefficient we need to calculate the integral of the velocity

autocorrelation function.

Example 3.3.19. Consider the stochastic process with an exponential correlation function from Example 3.3.14, and assume that this stochastic process describes the velocity of a Brownian particle. Since $R(t) \in L^1(0, +\infty)$, Theorem 3.3.17 applies. Furthermore, the diffusion coefficient of the Brownian particle is given by

$$\int_0^{+\infty} R(t)\,dt = R(0)\,\tau_{cor} = \frac{D}{\alpha^2}.$$

The most important continuous time stochastic process is Brownian motion. Brownian motion is a mean zero, continuous (i.e. it has continuous sample paths: for a.e. $\omega \in \Omega$ the function $X_t$ is a continuous function of time) process with independent Gaussian increments. A process $X_t$ has independent increments if for every sequence $t_0 < t_1 < \dots < t_n$ the random variables

$$X_{t_1} - X_{t_0},\ X_{t_2} - X_{t_1},\ \dots,\ X_{t_n} - X_{t_{n-1}}$$

are independent. If, furthermore, for any $t_1, t_2, s \in T$ and Borel set $B \subset \mathbb{R}$

$$\mathbb{P}(X_{t_2 + s} - X_{t_1 + s} \in B) = \mathbb{P}(X_{t_2} - X_{t_1} \in B)$$

then the process $X_t$ has stationary independent increments.

Definition 3.4.1.

A one dimensional standard Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}$ is a real valued stochastic process such that

i. $W(0) = 0$.

ii. $W(t)$ has independent increments.

iii. For every $t > s \geq 0$, $W(t) - W(s)$ has a Gaussian distribution with mean $0$ and variance $t - s$. That is, the density of the random variable $W(t) - W(s)$ is

$$g(x; t, s) = \big(2\pi(t - s)\big)^{-\frac{1}{2}} \exp\left(-\frac{x^2}{2(t - s)}\right); \tag{3.15}$$

A $d$-dimensional standard Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}^d$ is a collection of $d$ independent one dimensional Brownian motions:

$$W(t) = (W_1(t), \dots, W_d(t)),$$

where $W_i(t)$, $i = 1, \dots, d$ are independent one dimensional Brownian motions. The density of the Gaussian random vector $W(t) - W(s)$ is thus

$$g(x; t, s) = \big(2\pi(t - s)\big)^{-d/2} \exp\left(-\frac{\|x\|^2}{2(t - s)}\right).$$

Brownian motion is sometimes referred to as the Wiener process.

Brownian motion has continuous paths. More precisely, it has a continuous

modification.

Definition 3.4.2. Let $X_t$ and $Y_t$, $t \in T$, be two stochastic processes defined on the same probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The process $Y_t$ is said to be a modification of $X_t$ if $\mathbb{P}(X_t = Y_t) = 1$ $\forall\, t \in T$.

Lemma 3.4.3. There is a continuous modification of Brownian motion.

This follows from a theorem due to Kolmogorov.

[Figure: 5 individual paths and the mean of 1000 paths; vertical axis $U(t)$.]

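Theorem 3.4.4. (Kolmogorov) Let $X_t$, $t \in [0, \infty)$ be a stochastic process on a probability space $\{\Omega, \mathcal{F}, \mathbb{P}\}$. Suppose that there are positive constants $\alpha$ and $\beta$, and for each $T \geq 0$ there is a constant $C(T)$ such that

$$\mathbb{E}|X_t - X_s|^{\alpha} \leq C(T)\,|t - s|^{1 + \beta}, \quad 0 \leq s, t \leq T. \tag{3.16}$$

Then there exists a continuous modification $Y_t$ of the process $X_t$.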

The proof of Lemma 3.4.3 is left as an exercise.

Remark 3.4.5. Equivalently, we could have defined the one dimensional standard Brownian motion as a stochastic process on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with continuous paths for almost all $\omega \in \Omega$, and Gaussian finite dimensional distributions with zero mean and covariance $\mathbb{E}(W_{t_i} W_{t_j}) = \min(t_i, t_j)$. One can then show that Definition 3.4.1 follows from the above definition.

It is possible to prove rigorously the existence of the Wiener process (Brownian

motion):

Theorem 3.4.6. (Wiener) There exists an almost-surely continuous process $W_t$ with independent increments and $W_0 = 0$, such that for each $t \geq 0$ the random variable $W_t$ is $N(0, t)$. Furthermore, $W_t$ is almost surely locally Hölder continuous with exponent $\alpha$ for any $\alpha \in (0, \frac{1}{2})$.

Notice that Brownian paths are not differentiable.

We can also construct Brownian motion through the limit of an appropriately rescaled random walk: let $X_1, X_2, \dots$ be iid random variables on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with mean $0$ and variance $1$. Define the discrete time stochastic process $S_n$ with $S_0 = 0$, $S_n = \sum_{j=1}^{n} X_j$, $n \geq 1$. Define now a continuous time stochastic process with continuous paths as the linearly interpolated, appropriately rescaled random walk:

$$W_t^n = \frac{1}{\sqrt{n}} S_{[nt]} + (nt - [nt])\frac{1}{\sqrt{n}} X_{[nt]+1},$$

where $[\cdot]$ denotes the integer part of a number. Then $W_t^n$ converges weakly, as $n \to +\infty$, to a one dimensional standard Brownian motion.
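The rescaled random walk is straightforward to simulate. The sketch below assumes steps equal to $\pm 1$ with equal probability (one admissible choice with mean $0$ and variance $1$), a fixed seed, and illustrative values of $n$, $t$ and the number of samples. It evaluates $W_t^n$ at a single time and compares the empirical mean and variance with the Brownian values $0$ and $t$.

```python
import math
import random

random.seed(2)

def rescaled_walk(t, n, steps):
    """W_t^n = S_[nt]/sqrt(n) + (nt - [nt]) X_{[nt]+1}/sqrt(n).

    `steps[j]` holds X_{j+1}, so steps[k] with k = [nt] is X_{[nt]+1}.
    """
    k = int(math.floor(n * t))
    s_k = sum(steps[:k])
    return (s_k + (n * t - k) * steps[k]) / math.sqrt(n)

n, M, t = 200, 2000, 0.7
values = []
for _ in range(M):
    # Steps are +1 or -1 with equal probability: mean 0 and variance 1.
    steps = [random.choice((-1.0, 1.0)) for _ in range(n + 1)]
    values.append(rescaled_walk(t, n, steps))

mean = sum(values) / M
var = sum((v - mean) ** 2 for v in values) / M
print(mean, var)   # compare with E W(t) = 0 and E W(t)^2 = t
```

Checking a single time only probes the one dimensional marginal. Weak convergence of the whole path is a stronger statement that this sketch does not test.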

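Brownian motion is a Gaussian process. For the $d$-dimensional Brownian motion, and for $I$ the $d \times d$ dimensional identity, we have (see (2.7) and (2.8))

$$\mathbb{E}W(t) = 0 \quad \forall\, t \geq 0$$

and

$$\mathbb{E}\Big(\big(W(t) - W(s)\big) \otimes \big(W(t) - W(s)\big)\Big) = (t - s)I. \tag{3.17}$$

Moreover,

$$\mathbb{E}\big(W(t) \otimes W(s)\big) = \min(t, s)I. \tag{3.18}$$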

From the formula for the Gaussian density $g(x, t - s)$, eqn. (3.15), we immediately conclude that $W(t) - W(s)$ and $W(t + u) - W(s + u)$ have the same pdf. Consequently, Brownian motion has stationary increments. Notice, however, that Brownian motion itself is not a stationary process. Since $W(t) = W(t) - W(0)$, the pdf of $W(t)$ is

$$g(x, t) = \frac{1}{\sqrt{2\pi t}}\,e^{-x^2/2t}.$$

We can easily calculate all moments of the Brownian motion:

$$\mathbb{E}(W^n(t)) = \frac{1}{\sqrt{2\pi t}}\int_{-\infty}^{+\infty} x^n e^{-x^2/2t}\,dx = \begin{cases} 1 \cdot 3 \cdots (n - 1)\,t^{n/2}, & n \text{ even}, \\ 0, & n \text{ odd}. \end{cases}$$
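The moment formula can be compared with a direct numerical integration against the density $g(x, t)$. The sketch below is a consistency check under illustrative choices: the truncation at $12$ standard deviations, the step size and the value $t = 2$ are assumptions made for the example.

```python
import math

def brownian_moment(n, t):
    """E W(t)^n from the closed form: 1*3*...*(n-1) * t^(n/2), or 0 for odd n."""
    if n % 2 == 1:
        return 0.0
    double_factorial = math.prod(range(1, n, 2))   # 1 * 3 * ... * (n - 1)
    return double_factorial * t ** (n / 2)

def brownian_moment_numeric(n, t, L=12.0, h=1e-3):
    """Riemann sum for int x^n g(x, t) dx, truncated to |x| <= L * sqrt(t)."""
    m = int(round(L * math.sqrt(t) / h))
    total = sum((k * h) ** n * math.exp(-(k * h) ** 2 / (2.0 * t))
                for k in range(-m, m + 1))
    return h * total / math.sqrt(2.0 * math.pi * t)

for n in (1, 2, 3, 4, 6):
    print(n, brownian_moment(n, 2.0), brownian_moment_numeric(n, 2.0))
```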

Theorem 3.4.7. Let $W_t$ be a standard Brownian motion in $\mathbb{R}$. Then $W_t$ has the following properties:

i. (Rescaling). For each $c > 0$ define $X_t = \frac{1}{\sqrt{c}} W(ct)$. Then $(X_t, t \geq 0) = (W_t, t \geq 0)$ in law.

ii. (Shifting). For each $c > 0$, $W_{c+t} - W_c$, $t \geq 0$ is a Brownian motion which is independent of $W_u$, $u \in [0, c]$.

iii. (Time reversal). Define $X_t = W_{1-t} - W_1$, $t \in [0, 1]$. Then $(X_t, t \in [0, 1]) = (W_t, t \in [0, 1])$ in law.

iv. (Inversion). Let $X_t$, $t \geq 0$ be defined by $X_0 = 0$, $X_t = tW(1/t)$. Then $(X_t, t \geq 0) = (W_t, t \geq 0)$ in law.

We emphasize that the equivalence in the above theorem holds in law and not

in a pathwise sense.

Proof. See Exercise 13.

We can also add a drift and change the diffusion coefficient of the Brownian motion: we will define a Brownian motion with drift $\mu$ and variance $\sigma^2$ as the process

$$X_t = \mu t + \sigma W_t.$$

The mean and variance of $X_t$ are

$$\mathbb{E}X_t = \mu t, \quad \mathbb{E}(X_t - \mathbb{E}X_t)^2 = \sigma^2 t.$$

Notice that $X_t$ satisfies the equation

$$dX_t = \mu\,dt + \sigma\,dW_t.$$

This is the simplest example of a stochastic differential equation.

We can define the OU process through the Brownian motion via a time change.

Lemma 3.4.8. Let $W(t)$ be a standard Brownian motion and consider the process

$$V(t) = e^{-t} W(e^{2t}).$$

Then $V(t)$ is a Gaussian stationary process with mean $0$ and correlation function

$$R(t) = e^{-|t|}. \tag{3.19}$$

For the proof of this result we first need to show that time changed Gaussian

processes are also Gaussian.

Lemma 3.4.9. Let X(t) be a Gaussian stochastic process and let Y (t) = X(f (t))

where f (t) is a strictly increasing function. Then Y (t) is also a Gaussian process.

Proof. We need to show that, for all positive integers $N$ and all sequences of times $\{t_1, t_2, \dots, t_N\}$, the random vector

$$\{Y(t_1), Y(t_2), \dots, Y(t_N)\} \tag{3.20}$$

is a multivariate Gaussian random variable. Set $s_i = f(t_i)$, $i = 1, \dots, N$. Since $f(t)$ is strictly increasing, it is invertible and distinct times $t_i$ are mapped to distinct times $s_i$. Thus, the random vector (3.20) can be rewritten as

$$\{X(s_1), X(s_2), \dots, X(s_N)\},$$

which is Gaussian for all $N$ and all choices of times $s_1, s_2, \dots, s_N$. Hence $Y(t)$ is also Gaussian.

Proof of Lemma 3.4.8. The fact that $V(t)$ is mean zero follows immediately from the fact that $W(t)$ is mean zero. To show that the correlation function of $V(t)$ is given by (3.19), we calculate

$$\mathbb{E}(V(t)V(s)) = e^{-t-s}\, \mathbb{E}(W(e^{2t})W(e^{2s})) = e^{-t-s} \min(e^{2t}, e^{2s}) = e^{-|t-s|}.$$

The Gaussianity of the process $V(t)$ follows from Lemma 3.4.9 (notice that the transformation that gives $V(t)$ in terms of $W(t)$ is invertible and we can write $W(s) = s^{1/2} V(\frac{1}{2}\ln(s))$).
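The covariance computation in the proof can be sanity-checked on a grid of times. This is a small sketch assuming NumPy; the helper name `ou_cov_from_time_change` is hypothetical and only encodes the identity $\mathbb{E}(W(a)W(b)) = \min(a,b)$ used above.

```python
import numpy as np

def ou_cov_from_time_change(t, s):
    """E[V(t)V(s)] for V(t) = exp(-t) W(exp(2t)), using E[W(a)W(b)] = min(a, b)."""
    return np.exp(-t - s) * np.minimum(np.exp(2 * t), np.exp(2 * s))

# Compare with the claimed correlation function exp(-|t - s|) on a grid
# that includes negative times (the process is defined for all real t).
grid = np.linspace(-2.0, 2.0, 9)
T, S = np.meshgrid(grid, grid)
lhs = ou_cov_from_time_change(T, S)
rhs = np.exp(-np.abs(T - S))
max_err = np.max(np.abs(lhs - rhs))  # zero up to floating-point rounding
```

Because the covariance depends on $t$ and $s$ only through $t - s$, the check also illustrates the stationarity claimed in Lemma 3.4.8.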

3.5.1 Brownian Bridge

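Let $W(t)$ be a standard one dimensional Brownian motion. We define the Brownian bridge (from 0 to 0) to be the process

$$B_t = W_t - tW_1, \quad t \in [0,1]. \tag{3.21}$$

Equivalently, we can define the Brownian bridge to be the continuous Gaussian process $\{B_t : 0 \leq t \leq 1\}$ such that

$$\mathbb{E}B_t = 0, \quad \mathbb{E}(B_t B_s) = \min(s,t) - st, \quad s, t \in [0,1]. \tag{3.22}$$

The Brownian bridge can also be written as a time change of the Brownian motion:

$$B_t = (1-t)\, W\!\left(\frac{t}{1-t}\right), \quad t \in [0,1). \tag{3.23}$$

Conversely, we can write the Brownian motion as a time change of the Brownian bridge:

$$W_t = (t+1)\, B\!\left(\frac{t}{1+t}\right), \quad t \geq 0.$$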
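As an illustration, the definition $B_t = W_t - tW_1$ can be simulated directly and the covariance $\min(s,t) - st$ estimated from samples. This is a sketch under the assumption that NumPy is available; the grid, the number of paths and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 50000, 4
dt = 1.0 / n_steps
t = np.linspace(0.0, 1.0, n_steps + 1)          # grid 0, 0.25, 0.5, 0.75, 1

# Brownian paths on the grid, built from independent N(0, dt) increments.
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

# Brownian bridge from 0 to 0: B_t = W_t - t * W_1.
B = W - t * W[:, -1:]

emp_cov = np.mean(B[:, 1] * B[:, 3])            # estimate of E(B_{0.25} B_{0.75})
exact_cov = min(0.25, 0.75) - 0.25 * 0.75       # min(s, t) - s t
```

The bridge is pinned at both endpoints by construction, so the first and last columns of `B` vanish for every path.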

Definition 3.5.1. A (normalized) fractional Brownian motion $W_t^H$, $t \geq 0$, with Hurst parameter $H \in (0,1)$ is a centered Gaussian process with continuous sample paths whose covariance is given by

$$\mathbb{E}(W_t^H W_s^H) = \frac{1}{2}\left(s^{2H} + t^{2H} - |t-s|^{2H}\right). \tag{3.24}$$

1

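ii. $W_0^H = 0$, $\mathbb{E}W_t^H = 0$, $\mathbb{E}(W_t^H)^2 = |t|^{2H}$, $t \geq 0$.

iii. It has stationary increments, $\mathbb{E}(W_t^H - W_s^H)^2 = |t-s|^{2H}$.

iv. It has the following self similarity property

$$(W_{\alpha t}^H, t \geq 0) = (\alpha^H W_t^H, t \geq 0), \quad \alpha > 0, \tag{3.25}$$

where the equivalence is in law.

Proof. See Exercise 19.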

Another fundamental continuous time process is the Poisson process:

Definition 3.5.3. The Poisson process with intensity $\lambda$, denoted by $N(t)$, is an integer-valued, continuous time, stochastic process with independent increments satisfying

$$\mathbb{P}[(N(t) - N(s)) = k] = \frac{e^{-\lambda(t-s)} \left(\lambda(t-s)\right)^k}{k!}, \quad t > s \geq 0, \; k \in \mathbb{N}.$$

The Poisson process does not have a continuous modification. See Exercise 20.

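Let $f \in L^2(\Omega)$ where $\Omega$ is a subset of $\mathbb{R}^d$ and let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in $L^2(\Omega)$. Then, it is well known that $f$ can be written as a series expansion:

$$f = \sum_{n=1}^{\infty} f_n e_n,$$

where

$$f_n = \int_{\Omega} f(x) e_n(x)\, dx.$$

The convergence is in $L^2(\Omega)$:

$$\lim_{N \to \infty} \Big\| f(x) - \sum_{n=1}^{N} f_n e_n(x) \Big\|_{L^2(\Omega)} = 0.$$

It turns out that we can obtain a similar expansion for an $L^2$ mean zero process which is continuous in the $L^2$ sense:

$$\mathbb{E}X_t^2 < +\infty, \quad \mathbb{E}X_t = 0, \quad \lim_{h \to 0} \mathbb{E}|X_{t+h} - X_t|^2 = 0. \tag{3.26}$$

For simplicity we will take $T = [0,1]$. Let $R(t,s) = \mathbb{E}(X_t X_s)$ be the autocorrelation function. Notice that from (3.26) it follows that $R(t,s)$ is continuous in both $t$ and $s$ (Exercise 21).

Let us assume an expansion of the form

$$X_t(\omega) = \sum_{n=1}^{\infty} \xi_n(\omega) e_n(t), \quad t \in [0,1], \tag{3.27}$$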

3.6. THE KARHUNEN-LOEVE

EXPANSION

47

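where $\{e_n\}_{n=1}^{\infty}$ is an orthonormal basis in $L^2(0,1)$. The random variables $\xi_n$ are calculated as

$$\int_0^1 X_t e_k(t)\, dt = \int_0^1 \sum_{n=1}^{\infty} \xi_n e_n(t) e_k(t)\, dt = \sum_{n=1}^{\infty} \xi_n \delta_{nk} = \xi_k,$$

where we assumed that we can interchange the summation and integration. We will assume that these random variables are orthogonal:

$$\mathbb{E}(\xi_n \xi_m) = \lambda_n \delta_{nm},$$

where $\{\lambda_n\}_{n=1}^{\infty}$ are positive numbers that will be determined later.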

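Assuming that an expansion of the form (3.27) exists, we can calculate

$$R(t,s) = \mathbb{E}(X_t X_s) = \mathbb{E}\left(\sum_{k=1}^{\infty} \sum_{\ell=1}^{\infty} \xi_k e_k(t)\, \xi_\ell e_\ell(s)\right) = \sum_{k=1}^{\infty} \sum_{\ell=1}^{\infty} \mathbb{E}(\xi_k \xi_\ell)\, e_k(t) e_\ell(s) = \sum_{k=1}^{\infty} \lambda_k e_k(t) e_k(s).$$

Consequently, in order for the expansion (3.27) to be valid we need

$$R(t,s) = \sum_{k=1}^{\infty} \lambda_k e_k(t) e_k(s). \tag{3.28}$$

From equation (3.28) it follows that

$$\int_0^1 R(t,s) e_n(s)\, ds = \int_0^1 \sum_{k=1}^{\infty} \lambda_k e_k(t) e_k(s) e_n(s)\, ds = \sum_{k=1}^{\infty} \lambda_k e_k(t) \int_0^1 e_k(s) e_n(s)\, ds = \sum_{k=1}^{\infty} \lambda_k e_k(t) \delta_{kn} = \lambda_n e_n(t).$$

Hence, in order for the expansion (3.27) to be valid, $\{\lambda_n, e_n(t)\}_{n=1}^{\infty}$ have to be the eigenvalues and eigenfunctions of the integral operator whose kernel is the correlation function of $X_t$:

$$\int_0^1 R(t,s) e_n(s)\, ds = \lambda_n e_n(t). \tag{3.29}$$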

Hence, in order to prove the expansion (3.27) we need to study the eigenvalue problem for the integral operator $R : L^2[0,1] \mapsto L^2[0,1]$. It is easy to check that this operator is self-adjoint ($(Rf, h) = (f, Rh)$ for all $f, h \in L^2(0,1)$) and nonnegative ($(Rf, f) \geq 0$ for all $f \in L^2(0,1)$). Hence, all its eigenvalues are real and nonnegative. Furthermore, it is a compact operator (if $\{\phi_n\}_{n=1}^{\infty}$ is a bounded sequence in $L^2(0,1)$, then $\{R\phi_n\}_{n=1}^{\infty}$ has a convergent subsequence). The spectral theorem for compact, self-adjoint operators implies that $R$ has a countable sequence of eigenvalues tending to 0. Furthermore, for every $f \in L^2(0,1)$ we can write

$$f = f_0 + \sum_{n=1}^{\infty} f_n e_n(t),$$

where $Rf_0 = 0$, $\{e_n(t)\}$ are the eigenfunctions of $R$ corresponding to nonzero eigenvalues and the convergence is in $L^2$. Finally, Mercer's Theorem states that for $R(t,s)$ continuous on $[0,1] \times [0,1]$, the expansion (3.28) is valid, where the series converges absolutely and uniformly.

Now we are ready to prove (3.27).

Theorem 3.6.1. (Karhunen-Loève). Let $\{X_t, t \in [0,1]\}$ be an $L^2$ process with zero mean and continuous correlation function $R(t,s)$. Let $\{\lambda_n, e_n(t)\}_{n=1}^{\infty}$ be the eigenvalues and eigenfunctions of the operator $R$ defined in (3.35). Then

$$X_t = \sum_{n=1}^{\infty} \xi_n e_n(t), \quad t \in [0,1], \tag{3.30}$$

where

$$\xi_n = \int_0^1 X_t e_n(t)\, dt, \quad \mathbb{E}\xi_n = 0, \quad \mathbb{E}(\xi_n \xi_m) = \lambda_n \delta_{nm}. \tag{3.31}$$

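Proof. The fact that $\mathbb{E}\xi_n = 0$ follows from the fact that $X_t$ is mean zero. The orthogonality of the random variables $\{\xi_n\}_{n=1}^{\infty}$ follows from the orthogonality of the eigenfunctions of $R$:

$$\mathbb{E}(\xi_n \xi_m) = \mathbb{E}\int_0^1\!\!\int_0^1 X_t X_s e_n(t) e_m(s)\, dt\, ds = \int_0^1\!\!\int_0^1 R(t,s) e_n(t) e_m(s)\, ds\, dt = \lambda_n \int_0^1 e_n(s) e_m(s)\, ds = \lambda_n \delta_{nm}.$$

Consider now the partial sum $S_N = \sum_{n=1}^{N} \xi_n e_n(t)$. We have

$$\begin{aligned}
\mathbb{E}|X_t - S_N|^2 &= \mathbb{E}X_t^2 + \mathbb{E}S_N^2 - 2\mathbb{E}(X_t S_N) \\
&= R(t,t) + \mathbb{E}\sum_{k,\ell=1}^{N} \xi_k \xi_\ell e_k(t) e_\ell(t) - 2\mathbb{E}\Big(X_t \sum_{n=1}^{N} \xi_n e_n(t)\Big) \\
&= R(t,t) + \sum_{k=1}^{N} \lambda_k |e_k(t)|^2 - 2\mathbb{E}\sum_{k=1}^{N} \int_0^1 X_t X_s e_k(s) e_k(t)\, ds \\
&= R(t,t) - \sum_{k=1}^{N} \lambda_k |e_k(t)|^2 \to 0,
\end{aligned}$$

by Mercer's theorem.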

Remark 3.6.2. Let $X_t$ be a Gaussian second order process with continuous covariance $R(t,s)$. Then the random variables $\{\xi_k\}_{k=1}^{\infty}$ are Gaussian, since they are defined through the time integral of a Gaussian process. Furthermore, since they are Gaussian and orthogonal, they are also independent. Hence, for Gaussian processes the Karhunen-Loève expansion becomes:

$$X_t = \sum_{k=1}^{+\infty} \sqrt{\lambda_k}\, \xi_k e_k(t), \tag{3.32}$$

where $\{\xi_k\}_{k=1}^{\infty}$ are independent $N(0,1)$ random variables.

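Example 3.6.3. The Karhunen-Loève Expansion for Brownian Motion. The correlation function of Brownian motion is $R(t,s) = \min(t,s)$. The eigenvalue problem $R\psi_n = \lambda_n \psi_n$ becomes

$$\int_0^1 \min(t,s)\, \psi_n(s)\, ds = \lambda_n \psi_n(t).$$

Let us assume that $\lambda_n > 0$ (it is easy to check that 0 is not an eigenvalue). Upon setting $t = 0$ we obtain $\psi_n(0) = 0$. The eigenvalue problem can be rewritten in the form

$$\int_0^t s\, \psi_n(s)\, ds + t \int_t^1 \psi_n(s)\, ds = \lambda_n \psi_n(t).$$

We differentiate this equation once:

$$\int_t^1 \psi_n(s)\, ds = \lambda_n \psi_n'(t).$$

We set $t = 1$ in this equation to obtain the second boundary condition $\psi_n'(1) = 0$. A second differentiation yields

$$-\psi_n(t) = \lambda_n \psi_n''(t),$$

where primes denote differentiation with respect to $t$. Thus, in order to calculate the eigenvalues and eigenfunctions of the integral operator whose kernel is the covariance function of Brownian motion, we need to solve the Sturm-Liouville problem

$$-\psi_n(t) = \lambda_n \psi_n''(t), \quad \psi(0) = \psi'(1) = 0.$$

It is easy to check that the eigenvalues and (normalized) eigenfunctions are

$$\psi_n(t) = \sqrt{2} \sin\left(\frac{1}{2}(2n-1)\pi t\right), \quad \lambda_n = \left(\frac{2}{(2n-1)\pi}\right)^2.$$

Thus, the Karhunen-Loève expansion of Brownian motion on $[0,1]$ is

$$W_t = \sqrt{2} \sum_{n=1}^{\infty} \xi_n \frac{2}{(2n-1)\pi} \sin\left(\frac{1}{2}(2n-1)\pi t\right). \tag{3.33}$$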
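The expansion (3.33) gives a direct recipe for generating approximate Brownian paths by truncating the series. The following is a minimal sketch assuming NumPy; the function names `kl_modes`, `kl_brownian_paths` and `truncated_variance` and the truncation levels are illustrative choices. The last quantity checks Mercer's expansion on the diagonal, $\sum_n \lambda_n \psi_n(t)^2 \to R(t,t) = t$.

```python
import numpy as np

def kl_modes(t, n_terms):
    """Return sqrt(lambda_n) and psi_n(t) for Brownian motion on [0, 1]."""
    n = np.arange(1, n_terms + 1)
    freq = 0.5 * (2 * n - 1) * np.pi                 # (2n - 1) * pi / 2
    sqrt_lam = 1.0 / freq                            # sqrt(lambda_n) = 2 / ((2n - 1) pi)
    psi = np.sqrt(2.0) * np.sin(np.outer(t, freq))   # shape (len(t), n_terms)
    return sqrt_lam, psi

def kl_brownian_paths(t, n_terms, n_paths, seed=0):
    """Truncated KL expansion: sum_n sqrt(lambda_n) * xi_n * psi_n(t)."""
    rng = np.random.default_rng(seed)
    sqrt_lam, psi = kl_modes(t, n_terms)
    xi = rng.standard_normal((n_paths, n_terms))     # independent N(0, 1) coefficients
    return (xi * sqrt_lam) @ psi.T                   # shape (n_paths, len(t))

def truncated_variance(t, n_terms):
    """sum_{n <= N} lambda_n * psi_n(t)^2, which should approach R(t, t) = t."""
    sqrt_lam, psi = kl_modes(t, n_terms)
    return (psi ** 2) @ (sqrt_lam ** 2)

t = np.linspace(0.0, 1.0, 11)
paths = kl_brownian_paths(t, n_terms=200, n_paths=5)
var_err = np.max(np.abs(truncated_variance(t, 2000) - t))
```

Since $\lambda_n \sim n^{-2}$, the truncation error in the variance decays only like $1/N$, which is worth keeping in mind when choosing the number of terms.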

We can use the KL expansion in order to study the $L^2$-regularity of stochastic processes. First, let $R$ be a compact, symmetric positive definite operator on $L^2(0,1)$ with eigenvalues and normalized eigenfunctions $\{\lambda_k, e_k(x)\}_{k=1}^{+\infty}$ and consider a function $f \in L^2(0,1)$ with $\int_0^1 f(s)\, ds = 0$. We can define the one parameter family of Hilbert spaces $H^\alpha$ through the norm

$$\|f\|_\alpha^2 = \|R^{-\alpha/2} f\|_{L^2}^2 = \sum_k |f_k|^2 \lambda_k^{-\alpha}.$$

The inner product can be obtained through polarization. This norm enables us to measure the regularity of the function $f(t)$. [3] Let $X_t$ be a mean zero second order (i.e. with finite second moment) process with continuous autocorrelation function. Define the space $\mathcal{H}^\alpha := L^2((\Omega, P), H^\alpha(0,1))$ with (semi)norm

$$\|X_t\|_\alpha^2 = \mathbb{E}\|X_t\|_{H^\alpha}^2 = \sum_k |\lambda_k|^{1-\alpha}. \tag{3.34}$$

Notice that the regularity of the stochastic process $X_t$ depends on the decay of the eigenvalues of the integral operator $R\, \cdot := \int_0^1 R(t,s)\, \cdot\, ds$.

As an example, consider the $L^2$-regularity of Brownian motion. From Example 3.6.3 we know that $\lambda_k \sim k^{-2}$. Consequently, from (3.34) we get that, in order for $W_t$ to be an element of the space $\mathcal{H}^\alpha$, we need that

$$\sum_k k^{-2(1-\alpha)} < +\infty,$$

from which we obtain that $\alpha < 1/2$. This is consistent with the Hölder continuity of Brownian motion from Theorem 3.4.6. [4]

[3] Think of $R$ as being the inverse of the Laplacian with periodic boundary conditions. In this case $H^\alpha$ coincides with the standard fractional Sobolev space.

[4] Notice, however, that Wiener's theorem refers to a.s. Hölder continuity, whereas the calculation presented in this section is about $L^2$-continuity.

The Ornstein-Uhlenbeck process was introduced by Ornstein and Uhlenbeck in 1930 as a model for the velocity of a Brownian particle [73].

The kind of analysis presented in Section 3.3.3 was initiated by G.I. Taylor in [72]. The proof of Bochner's theorem 3.3.12 can be found in [39], where additional material on stationary processes can be found. See also [36].

The spectral theorem for compact, self-adjoint operators which was needed in the proof of the Karhunen-Loève theorem can be found in [63]. The Karhunen-Loève expansion is also valid for random fields. See [69] and the references therein.

3.8 Exercises

1. Let $Y_0, Y_1, \dots$ be a sequence of independent, identically distributed random variables and consider the stochastic process $X_n = Y_n$.

(a) Show that $X_n$ is a strictly stationary process.

(b) Assume that $\mathbb{E}Y_0 = \mu < +\infty$ and $\mathbb{E}Y_0^2 = \sigma^2 < +\infty$. Show that

$$\lim_{N \to +\infty} \mathbb{E}\left| \frac{1}{N} \sum_{j=0}^{N-1} X_j - \mu \right| = 0.$$

(c) Let $f$ be such that $\mathbb{E}f^2(Y_0) < +\infty$. Show that

$$\lim_{N \to +\infty} \mathbb{E}\left| \frac{1}{N} \sum_{j=0}^{N-1} f(X_j) - \mathbb{E}f(Y_0) \right| = 0.$$

0, 1, 2, . . . . Show that Xn is a strictly stationary process.

3. Let $A_0, A_1, \dots, A_m$ and $B_0, B_1, \dots, B_m$ be uncorrelated random variables with mean zero and variances $\mathbb{E}A_i^2 = \sigma_i^2$, $\mathbb{E}B_i^2 = \sigma_i^2$, $i = 1, \dots, m$. Let $\omega_0, \omega_1, \dots, \omega_m \in [0, \pi]$ be distinct frequencies and define, for $n = 0, \pm 1, \pm 2, \dots$, the stochastic process

$$X_n = \sum_{k=0}^{m} \left( A_k \cos(n\omega_k) + B_k \sin(n\omega_k) \right).$$

Calculate the mean and the covariance of Xn . Show that it is a weakly stationary

process.

, E(n )2 = 2 , n = 0, 1, 2, . . . . Let a1 , a2 , . . . be arbitrary real

numbers and consider the stochastic process

$$X_n = a_1 \xi_n + a_2 \xi_{n-1} + \dots + a_m \xi_{n-m+1}.$$

(a) Calculate the mean, variance and the covariance function of $X_n$. Show that it is a weakly stationary process.

study the cases $m = 1$ and $m \to +\infty$.

5. Let W (t) be a standard one dimensional Brownian motion. Calculate the following expectations.

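(a) $\mathbb{E}e^{iW(t)}$.

(c) $\mathbb{E}\left(\sum_{i=1}^{n} c_i W(t_i)\right)^2$, where $c_i \in \mathbb{R}$, $i = 1, \dots, n$ and $t_i \in (0, +\infty)$, $i = 1, \dots, n$.

(d) $\mathbb{E}e^{i \sum_{i=1}^{n} c_i W(t_i)}$, where $c_i \in \mathbb{R}$, $i = 1, \dots, n$ and $t_i \in (0, +\infty)$, $i = 1, \dots, n$.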

6. Let Wt be a standard one dimensional Brownian motion and define

$$B_t = W_t - tW_1, \quad t \in [0,1].$$

EBt = 0,

formula

$$B_t = (1-t)\, W\!\left(\frac{t}{1-t}\right).$$

(c) Calculate the distribution function of Bt .

function

N

X

2j |t|

e j ,

R(t) =

j

j=1

where

{j , j }N

j=1

(a) Calculate the spectral density and the correlation time of this process.

(b) Show that the assumptions of Theorem 3.3.17 are satisfied and use the argument presented in Section 3.3.3 (i.e. the Green-Kubo formula) to calculate the diffusion coefficient of the process $Z_t = \int_0^t X_s\, ds$.

j=1 can you study

the above questions in the limit $N \to +\infty$?

9. Let $a_1, \dots, a_n$ and $s_1, \dots, s_n$ be positive real numbers. Calculate the mean and variance of the random variable

$$X = \sum_{i=1}^{n} a_i W(s_i).$$

10. Let W (t) be the standard one-dimensional Brownian motion and let , s1 , s2 >

0. Calculate

(a) EeW (t) .

(b) E sin(W (s1 )) sin(W (s2 )) .

11. Let Wt be a one dimensional Brownian motion and let , > 0 and define

St = et+Wt .

(a) Calculate the mean and the variance of St .

(b) Calculate the probability density function of St .

12. Use Theorem 3.4.4 to prove Lemma 3.4.3.

13. Prove Theorem 3.4.7.

14. Use Lemma 3.4.8 to calculate the distribution function of the stationary Ornstein-Uhlenbeck process.

15. Calculate the mean and the correlation function of the integral of a standard Brownian motion

$$Y_t = \int_0^t W_s\, ds.$$

$$Y_t = \int_t^{t+1} (W_s - W_t)\, ds, \quad t \in \mathbb{R},$$

17. Let $V_t = e^{-t} W(e^{2t})$ be the stationary Ornstein-Uhlenbeck process. Give the

definition and study the main properties of the Ornstein-Uhlenbeck bridge.

18. The autocorrelation function of the velocity Y (t) of a Brownian particle moving

in a harmonic potential V (x) = 12 02 x2 is

1

R(t) = e|t| cos(|t|) sin(|t|) ,

p

where is the friction coefficient and = 02 2 .


(b) Calculate the mean square displacement $\mathbb{E}(X(t))^2$ of the position of the Brownian particle $X(t) = \int_0^t Y(s)\, ds$. Study the limit $t \to +\infty$.

19. Show the scaling property (3.25) of the fractional Brownian motion.

20. Use Theorem (3.4.4) to show that there does not exist a continuous modification

of the Poisson process.

21. Show that the correlation function of a process Xt satisfying (3.26) is continuous in both t and s.

22. Let $X_t$ be a stochastic process satisfying (3.26) and $R(t,s)$ its correlation function. Show that the integral operator $R : L^2[0,1] \mapsto L^2[0,1]$

$$Rf := \int_0^1 R(t,s) f(s)\, ds \tag{3.35}$$

is self-adjoint and nonnegative. Show that all of its eigenvalues are real and

nonnegative. Show that eigenfunctions corresponding to different eigenvalues

are orthogonal.

23. Let $H$ be a Hilbert space. An operator $R : H \to H$ is said to be Hilbert-Schmidt if there exists a complete orthonormal sequence $\{\phi_n\}_{n=1}^{\infty}$ in $H$ such that

$$\sum_{n=1}^{\infty} \|R\phi_n\|^2 < \infty.$$

Let R : L2 [0, 1] 7 L2 [0, 1] be the operator defined in (3.35) with R(t, s) being

continuous both in t and s. Show that it is a Hilbert-Schmidt operator.

24. Let $X_t$ be a mean zero second order stationary process defined in the interval $[0,T]$ with continuous covariance $R(t)$ and let $\{\lambda_n\}_{n=1}^{+\infty}$ be the eigenvalues of the covariance operator. Show that

$$\sum_{n=1}^{\infty} \lambda_n = T\, R(0).$$

25. Calculate the Karhunen-Loeve expansion for a second order stochastic process

with correlation function R(t, s) = ts.

26. Calculate the Karhunen-Loeve expansion of the Brownian bridge on [0, 1].

27. Let Xt , t [0, T ] be a second order process with continuous covariance and

Karhunen-Loeve expansion

Xt =

k ek (t).

k=1

Y (t) = f (t)X (t) ,

t [0, S],

where f (t) is a continuous function and (t) a continuous, nondecreasing function with (0) = 0, (S) = T . Find the Karhunen-Loeve expansion of Y (t),

in an appropriate weighted L2 space, in terms of the KL expansion of Xt . Use

this in order to calculate the KL expansion of the Ornstein-Uhlenbeck process.

28. Calculate the Karhunen-Loève expansion of a centered Gaussian stochastic process with covariance function $R(s,t) = \cos(2\pi(t-s))$.

29. Use the Karhunen-Loeve expansion to generate paths of the

(a) Brownian motion on [0, 1].

(b) Brownian bridge on [0, 1].

(c) Ornstein-Uhlenbeck on [0, 1].

Study computationally the convergence of the KL expansion for these processes. How many terms do you need to keep in the KL expansion in order

to calculate accurate statistics of these processes?

Chapter 4

Markov Processes

4.1 Introduction

In this chapter we will study some of the basic properties of Markov stochastic

processes. In Section 4.2 we present various examples of Markov processes, in

discrete and continuous time. In Section 4.3 we give the precise definition of a

Markov process. In Section 4.4 we derive the Chapman-Kolmogorov equation,

the fundamental equation in the theory of Markov processes. In Section 4.5 we

introduce the concept of the generator of a Markov process. In Section 4.6 we study

ergodic Markov processes. Discussion and bibliographical remarks are presented

in Section 4.7 and exercises can be found in Section 4.8.

4.2 Examples

Roughly speaking, a Markov process is a stochastic process that retains no memory of where it has been in the past: only the current state of a Markov process

can influence where it will go next. A bit more precisely: a Markov process is

a stochastic process for which, given the present, past and future are statistically

independent.

Perhaps the simplest example of a Markov process is that of a random walk in one dimension. We defined the one dimensional random walk as the sum of independent, mean zero and variance 1 random variables $\xi_i$, $i = 1, \dots$:

$$X_N = \sum_{n=1}^{N} \xi_n, \quad X_0 = 0.$$

Let $i_1, i_2, \dots$ be a sequence of integers. Then, for all integers $n$ and $m$ we have that

$$\mathbb{P}(X_{n+m} = i_{n+m} \,|\, X_1 = i_1, \dots, X_n = i_n) = \mathbb{P}(X_{n+m} = i_{n+m} \,|\, X_n = i_n). \tag{4.1}$$

In words, the probability that the random walk will be at $i_{n+m}$ at time $n + m$ depends only on its current value (at time $n$) and not on how it got there.

The random walk is an example of a discrete time Markov chain:

Definition 4.2.1. A stochastic process $\{S_n; n \in \mathbb{N}\}$ with state space $S = \mathbb{Z}$ is called a discrete time Markov chain provided that the Markov property (4.1) is satisfied.

Consider now a continuous-time stochastic process $X_t$ with state space $S = \mathbb{Z}$ and denote by $\{X_s, s \leq t\}$ the collection of values of the stochastic process up to time $t$. We will say that $X_t$ is a Markov process provided that

$$\mathbb{P}(X_{t+h} = i_{t+h} \,|\, \{X_s, s \leq t\}) = \mathbb{P}(X_{t+h} = i_{t+h} \,|\, X_t = i_t), \tag{4.2}$$

for all $h \geq 0$. A continuous-time, discrete state space Markov process is called a

continuous-time Markov chain.

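Example 4.2.2. The Poisson process is a continuous-time Markov chain with

$$\mathbb{P}(N_{t+h} = j \,|\, N_t = i) = \begin{cases} 0 & \text{if } j < i, \\ \dfrac{e^{-\lambda h} (\lambda h)^{j-i}}{(j-i)!} & \text{if } j \geq i. \end{cases}$$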

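Similarly, we can define a continuous-time Markov process whose state space is $\mathbb{R}$. In this case, the above definitions become

$$\mathbb{P}(X_{t+h} \in \Gamma \,|\, \{X_s, s \leq t\}) = \mathbb{P}(X_{t+h} \in \Gamma \,|\, X_t = x) \tag{4.3}$$

for all Borel sets $\Gamma$.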

Example 4.2.3. The Brownian motion is a Markov process with conditional probability density

$$p(y,t|x,s) := p(W_t = y \,|\, W_s = x) = \frac{1}{\sqrt{2\pi(t-s)}} \exp\left(-\frac{|x-y|^2}{2(t-s)}\right). \tag{4.4}$$

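The Ornstein-Uhlenbeck process $V_t = e^{-t} W(e^{2t})$ is a Markov process with conditional probability density

$$p(y,t|x,s) := p(V_t = y \,|\, V_s = x) = \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} \exp\left(-\frac{|y - x e^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})}\right). \tag{4.5}$$

To prove (4.5) we use the formula for the distribution function of the Brownian motion to calculate, for $t > s$,

$$\begin{aligned}
\mathbb{P}(V_t \leq y \,|\, V_s = x) &= \mathbb{P}(e^{-t} W(e^{2t}) \leq y \,|\, e^{-s} W(e^{2s}) = x) \\
&= \mathbb{P}(W(e^{2t}) \leq e^{t} y \,|\, W(e^{2s}) = e^{s} x) \\
&= \int_{-\infty}^{e^{t} y} \frac{1}{\sqrt{2\pi(e^{2t} - e^{2s})}} \exp\left(-\frac{|z - x e^{s}|^2}{2(e^{2t} - e^{2s})}\right) dz \\
&= \int_{-\infty}^{y} \frac{e^{t}}{\sqrt{2\pi e^{2t}(1 - e^{-2(t-s)})}} \exp\left(-\frac{|\rho e^{t} - x e^{s}|^2}{2 e^{2t}(1 - e^{-2(t-s)})}\right) d\rho \\
&= \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} \exp\left(-\frac{|\rho - x e^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})}\right) d\rho,
\end{aligned}$$

where we used the change of variables $z = \rho e^{t}$. Consequently, the transition probability density for the OU process is given by the formula

$$p(y,t|x,s) = \frac{\partial}{\partial y} \mathbb{P}(V_t \leq y \,|\, V_s = x) = \frac{1}{\sqrt{2\pi(1 - e^{-2(t-s)})}} \exp\left(-\frac{|y - x e^{-(t-s)}|^2}{2(1 - e^{-2(t-s)})}\right).$$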

chemistry, biology and finance. In this and the next chapter we will develop various analytical tools for studying them. In particular, we will see that we can obtain

an equation for the transition probability

P(Xn+1 = in+1 |Xn = in ),

(4.6)

which will enable us to study the evolution of a Markov process. This equation

will be called the Chapman-Kolmogorov equation.

We will be mostly concerned with time-homogeneous Markov processes, i.e.

processes for which the conditional probabilities are invariant under time shifts.

For a time-homogeneous discrete time Markov chain we have

$$\mathbb{P}(X_{n+1} = j \,|\, X_n = i) = \mathbb{P}(X_1 = j \,|\, X_0 = i) =: p_{ij}.$$

We will refer to the matrix $P = \{p_{ij}\}$ as the transition matrix. It is easy to check that the transition matrix is a stochastic matrix, i.e. it has nonnegative entries and $\sum_j p_{ij} = 1$. Similarly, we can define the $n$-step transition matrix $P_n = \{p_{ij}(n)\}$ as

$$p_{ij}(n) = \mathbb{P}(X_{m+n} = j \,|\, X_m = i).$$

We can study the evolution of a Markov chain through the Chapman-Kolmogorov equation:

$$p_{ij}(m+n) = \sum_k p_{ik}(m)\, p_{kj}(n). \tag{4.7}$$

Let $\mu_i^{(n)} := \mathbb{P}(X_n = i)$. The vector $\mu^{(n)}$ determines the state of the Markov chain at time $n$. A simple consequence of the Chapman-Kolmogorov equation is that we can write an evolution equation for the vector $\mu^{(n)}$:

$$\mu^{(n)} = \mu^{(0)} P^n, \tag{4.8}$$

where $P^n$ denotes the $n$th power of the matrix $P$. Hence in order to calculate the state of the Markov chain at time $n$ all we need is the initial distribution $\mu^{(0)}$ and the transition matrix $P$. Componentwise, the above equation can be written as

$$\mu_j^{(n)} = \sum_i \mu_i^{(0)}\, p_{ij}(n).$$
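For a finite state space, the Chapman-Kolmogorov equation (4.7) and the evolution equation (4.8) reduce to matrix multiplication. The sketch below assumes NumPy; the three-state transition matrix and the initial distribution are made-up illustrative values.

```python
import numpy as np

# A hypothetical 3-state stochastic matrix: nonnegative entries, rows sum to 1.
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])

# Initial distribution: the chain starts in state 0 with probability 1.
mu0 = np.array([1.0, 0.0, 0.0])

# n-step transition matrices are powers of P.
P2 = np.linalg.matrix_power(P, 2)
P3 = np.linalg.matrix_power(P, 3)
P5 = np.linalg.matrix_power(P, 5)

# Chapman-Kolmogorov: p_ij(2 + 3) = sum_k p_ik(2) p_kj(3).
ck_holds = np.allclose(P5, P2 @ P3)

# Distribution at time n = 5: mu^(5) = mu^(0) P^5 (row vector times matrix).
mu5 = mu0 @ P5
```

Note that the distribution is a row vector multiplying $P^n$ from the left, matching the componentwise formula above.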

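Consider now a continuous-time Markov chain with transition probability

$$p_{ij}(s,t) = \mathbb{P}(X_t = j \,|\, X_s = i), \quad s \leq t.$$

If the chain is homogeneous, then

$$p_{ij}(s,t) = p_{ij}(0, t-s) \quad \text{for all } i, j, s, t.$$

In particular,

$$p_{ij}(t) = \mathbb{P}(X_t = j \,|\, X_0 = i).$$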

The Chapman-Kolmogorov equation for a continuous time Markov chain is

$$\frac{dp_{ij}}{dt} = \sum_k p_{ik}(t)\, g_{kj}, \tag{4.9}$$

where the matrix $G$ is called the generator of the Markov chain. Equation (4.9) can also be written in matrix notation:

$$\frac{dP_t}{dt} = P_t G.$$

The generator of the Markov chain is defined as

$$G = \lim_{h \to 0} \frac{1}{h}(P_h - I).$$

Let now $\mu_t^i = \mathbb{P}(X_t = i)$. The vector $\mu_t$ is the distribution of the Markov chain at time $t$. We can study its evolution using the equation

$$\mu_t = \mu_0 P_t.$$

Thus, as in the case of discrete time Markov chains, the evolution of a continuous time Markov chain is completely determined by the initial distribution and the transition matrix.
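For a finite state space the equation $dP_t/dt = P_t G$ with $P_0 = I$ is solved by the matrix exponential $P_t = e^{tG}$. The sketch below assumes NumPy only and uses a plain truncated Taylor series for the exponential, which is adequate for the small matrix used here but is not a robust general-purpose method; the two-state generator is a made-up example.

```python
import numpy as np

def expm_taylor(A, n_terms=60):
    """Matrix exponential via a truncated Taylor series (fine for small norms only)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, n_terms):
        term = term @ A / k          # term is now A^k / k!
        result = result + term
    return result

# Hypothetical two-state generator: off-diagonal rates >= 0, rows sum to zero.
G = np.array([[-1.0, 1.0],
              [2.0, -2.0]])

def transition_matrix(t):
    """P_t = exp(t G) for the generator G above."""
    return expm_taylor(t * G)

# Semigroup property P_{t+s} = P_t P_s.
semigroup_ok = np.allclose(transition_matrix(0.7),
                           transition_matrix(0.3) @ transition_matrix(0.4))

# Generator recovered from the definition G = lim (P_h - I) / h.
h = 1e-6
G_approx = (transition_matrix(h) - np.eye(2)) / h

# Evolution of the distribution: mu_t = mu_0 P_t.
mu0 = np.array([1.0, 0.0])
mu_t = mu0 @ transition_matrix(5.0)  # close to the stationary law (2/3, 1/3)
```

For this generator the nonzero eigenvalue is $-3$, so the distribution relaxes to the stationary law at rate $e^{-3t}$.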

Consider now the case of a continuous time Markov process with continuous state space and with continuous paths. As we have seen in Example 4.2.3 the Brownian motion is an example of such a process. It is a standard result in the theory of partial differential equations that the conditional probability density of the Brownian motion (4.4) is the fundamental solution of the diffusion equation:

$$\frac{\partial p}{\partial t} = \frac{1}{2} \frac{\partial^2 p}{\partial y^2}, \quad \lim_{t \to s} p(y,t|x,s) = \delta(y - x). \tag{4.10}$$

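Similarly, the conditional distribution (4.5) of the OU process satisfies the initial value problem

$$\frac{\partial p}{\partial t} = \frac{\partial (yp)}{\partial y} + \frac{\partial^2 p}{\partial y^2}, \quad \lim_{t \to s} p(y,t|x,s) = \delta(y - x). \tag{4.11}$$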

The Brownian motion and the OU process are examples of a diffusion process. A diffusion process is a continuous time Markov process with continuous paths. We will see in Chapter 5 that the conditional probability density $p(y,t|x,s)$ of a diffusion process satisfies the forward Kolmogorov or Fokker-Planck equation

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial y}\big(a(y,t)\, p\big) + \frac{1}{2} \frac{\partial^2}{\partial y^2}\big(b(y,t)\, p\big), \quad \lim_{t \to s} p(y,t|x,s) = \delta(y - x), \tag{4.12}$$

as well as the backward Kolmogorov equation

$$-\frac{\partial p}{\partial s} = a(x,s) \frac{\partial p}{\partial x} + \frac{1}{2} b(x,s) \frac{\partial^2 p}{\partial x^2}, \quad \lim_{t \to s} p(y,t|x,s) = \delta(y - x), \tag{4.13}$$

for appropriate functions $a(y,t)$, $b(y,t)$. Hence, a diffusion process is determined uniquely from these two functions.

In Section 4.2 we gave the definition of a Markov process whose time is either discrete or continuous, and whose state space is the set of integers. We also gave several examples of Markov chains as well as of processes whose state space is the real line. In this section we give the precise definition of a Markov process with $t \in T$, a general index set, and $S = E$, an arbitrary metric space. We will use this formulation in the next section to derive the Chapman-Kolmogorov equation.

In order to state the definition of a continuous-time Markov process that takes values in a metric space we need to introduce various new concepts. For the definition of a Markov process we need to use the conditional expectation of the stochastic process conditioned on all past values. We can encode all past information about a stochastic process into an appropriate collection of $\sigma$-algebras. Our setting will be that we have a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and an ordered set $T$. Let $X = X_t(\omega)$ be a stochastic process from the sample space $(\Omega, \mathcal{F})$ to the state space $(E, \mathcal{G})$, where $E$ is a metric space (we will usually take $E$ to be either $\mathbb{R}$ or $\mathbb{R}^d$). Remember that the stochastic process is a function of two variables, $t \in T$ and $\omega \in \Omega$.

We start with the definition of a $\sigma$-algebra generated by a collection of sets.

Definition 4.3.1. Let $\mathcal{K}$ be a collection of subsets of $\Omega$. The smallest $\sigma$-algebra on $\Omega$ which contains $\mathcal{K}$ is denoted by $\sigma(\mathcal{K})$ and is called the $\sigma$-algebra generated by $\mathcal{K}$.

Definition 4.3.2. Let $X_t : \Omega \mapsto E$, $t \in T$. The smallest $\sigma$-algebra $\sigma(X_t, t \in T)$, such that the family of mappings $\{X_t, t \in T\}$ is a stochastic process with sample space $(\Omega, \sigma(X_t, t \in T))$ and state space $(E, \mathcal{G})$, is called the $\sigma$-algebra generated by $\{X_t, t \in T\}$.

In other words, the $\sigma$-algebra generated by $X_t$ is the smallest $\sigma$-algebra such that $X_t$ is a measurable function (random variable) with respect to it: the set

$$\{\omega \in \Omega : X_t(\omega) \leq x\} \in \sigma(X_t, t \in T)$$

for all $x \in \mathbb{R}$ (here we take $E = \mathbb{R}$).

A filtration on $(\Omega, \mathcal{F})$ is a nondecreasing family $\{\mathcal{F}_t, t \in T\}$ of sub-$\sigma$-algebras of $\mathcal{F}$: $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}$ for $s \leq t$. The filtration generated by $X_t$, where $X_t$ is a stochastic process, is

$$\mathcal{F}_t^X := \sigma(X_s; s \leq t).$$

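Definition 4.3.4. A stochastic process $\{X_t; t \in T\}$ is adapted to the filtration $\{\mathcal{F}_t, t \in T\}$ if for all $t \in T$, $X_t$ is an $\mathcal{F}_t$-measurable random variable.

Definition 4.3.5. Let $\{X_t\}$ be a stochastic process defined on a probability space $(\Omega, \mathcal{F}, \mu)$ with values in $E$ and let $\mathcal{F}_t^X$ be the filtration generated by $\{X_t; t \in T\}$. Then $\{X_t; t \in T\}$ is a Markov process if

$$\mathbb{P}(X_t \in \Gamma \,|\, \mathcal{F}_s^X) = \mathbb{P}(X_t \in \Gamma \,|\, X_s) \tag{4.14}$$

for all $t, s \in T$ with $t \geq s$, and $\Gamma \in \mathcal{B}(E)$.

Remark 4.3.6. The filtration $\mathcal{F}_t^X$ is generated by events of the form $\{\omega \,|\, X_{s_1} \in B_1, X_{s_2} \in B_2, \dots, X_{s_n} \in B_n\}$ with $0 \leq s_1 < s_2 < \dots < s_n \leq s$ and $B_i \in \mathcal{B}(E)$. The definition of a Markov process is thus equivalent to the hierarchy of equations

$$\mathbb{P}(X_t \in \Gamma \,|\, X_{t_1}, X_{t_2}, \dots, X_{t_n}) = \mathbb{P}(X_t \in \Gamma \,|\, X_{t_n}) \quad \text{a.s.}$$

for $n \geq 1$, $0 \leq t_1 < t_2 < \dots < t_n \leq t$ and $\Gamma \in \mathcal{B}(E)$.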

Roughly speaking, the statistics of Xt for t > s are completely determined

once Xs is known; information about Xt for t < s is superfluous. In other words:

a Markov process has no memory. More precisely: when a Markov process is

conditioned on the present state, then there is no memory of the past. The past and

future of a Markov process are statistically independent when the present is known.

Remark 4.3.7. A non-Markovian process Xt can be described through a Markovian one Yt by enlarging the state space: the additional variables that we introduce

account for the memory in the Xt . This Markovianization trick is very useful

since there exist many analytical tools for analyzing Markovian processes.

Example 4.3.8. The velocity of a Brownian particle is modeled by the stationary Ornstein-Uhlenbeck process $Y_t = e^{-t} W(e^{2t})$. The particle position is given by the integral of the OU process (we take $X_0 = 0$)

$$X_t = \int_0^t Y_s\, ds.$$

The particle position depends on the past of the OU process and, consequently, is not a Markov process. However, the joint position-velocity process $\{X_t, Y_t\}$ is. Its transition probability density $p(x,y,t|x_0,y_0)$ satisfies the forward Kolmogorov equation

$$\frac{\partial p}{\partial t} = -y \frac{\partial p}{\partial x} + \frac{\partial}{\partial y}(yp) + \frac{\partial^2 p}{\partial y^2}.$$

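With a Markov process $\{X_t\}$ we can associate a function $P : T \times T \times E \times \mathcal{B}(E) \to \mathbb{R}^+$ defined through the relation

$$\mathbb{P}\left[X_t \in \Gamma \,|\, \mathcal{F}_s^X\right] = P(s, t, X_s, \Gamma),$$

for all $t, s \in T$ with $t \geq s$ and all $\Gamma \in \mathcal{B}(E)$. Assume that $X_s = x$. Since $\mathbb{P}\left[X_t \in \Gamma \,|\, \mathcal{F}_s^X\right] = \mathbb{P}[X_t \in \Gamma \,|\, X_s]$ we can write

$$P(\Gamma, t|x, s) = \mathbb{P}[X_t \in \Gamma \,|\, X_s = x].$$

The transition function $P(\Gamma, t|x, s)$ is (for fixed $t$, $x$, $s$) a probability measure on $E$ with $P(E, t|x, s) = 1$; it is $\mathcal{B}(E)$-measurable in $x$ (for fixed $t$, $s$, $\Gamma$) and satisfies the Chapman-Kolmogorov equation

$$P(\Gamma, t|x, s) = \int_E P(\Gamma, t|y, u)\, P(dy, u|x, s) \tag{4.15}$$

for all $x \in E$, $\Gamma \in \mathcal{B}(E)$ and $s, u, t \in T$ with $s \leq u \leq t$. The derivation of the Chapman-Kolmogorov equation is based on the assumption of Markovianity and on properties of the conditional probability. Let $(\Omega, \mathcal{F}, \mu)$ be a probability space, $X$ a random variable from $(\Omega, \mathcal{F}, \mu)$ to $(E, \mathcal{G})$ and let $\mathcal{F}_1 \subset \mathcal{F}_2 \subset \mathcal{F}$. Then (see Theorem 2.4.1)

$$\mathbb{E}(\mathbb{E}(X|\mathcal{F}_2)|\mathcal{F}_1) = \mathbb{E}(\mathbb{E}(X|\mathcal{F}_1)|\mathcal{F}_2) = \mathbb{E}(X|\mathcal{F}_1). \tag{4.16}$$

Assume that $f$ is such that $\mathbb{E}(f(X)) < \infty$. Then

$$\mathbb{E}(f(X)|\mathcal{G}) = \int_{\mathbb{R}} f(x)\, P_X(dx|\mathcal{G}), \tag{4.17}$$

where $P_X(\cdot\,|\mathcal{G})$ denotes the conditional distribution of $X$ given $\mathcal{G}$.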

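Now we use the Markov property, together with equations (4.16) and (4.17) and the fact that $s < u \Rightarrow \mathcal{F}_s^X \subset \mathcal{F}_u^X$ to calculate:

$$\begin{aligned}
P(\Gamma, t|x, s) &:= \mathbb{P}(X_t \in \Gamma \,|\, X_s = x) = \mathbb{P}(X_t \in \Gamma \,|\, \mathcal{F}_s^X) \\
&= \mathbb{E}(I_\Gamma(X_t) \,|\, \mathcal{F}_s^X) = \mathbb{E}\big(\mathbb{E}(I_\Gamma(X_t) \,|\, \mathcal{F}_u^X) \,|\, \mathcal{F}_s^X\big) \\
&= \mathbb{E}\big(\mathbb{P}(X_t \in \Gamma \,|\, X_u) \,|\, \mathcal{F}_s^X\big) \\
&= \mathbb{E}\big(\mathbb{P}(X_t \in \Gamma \,|\, X_u = y) \,|\, X_s = x\big) \\
&= \int_{\mathbb{R}} P(\Gamma, t|X_u = y)\, P(dy, u|X_s = x) \\
&=: \int_{\mathbb{R}} P(\Gamma, t|y, u)\, P(dy, u|x, s).
\end{aligned}$$

$I_\Gamma(\cdot)$ denotes the indicator function of the set $\Gamma$. We have also set $E = \mathbb{R}$. The CK equation is an integral equation and is the fundamental equation in the theory of Markov processes. Under additional assumptions we will derive from it the Fokker-Planck PDE, which is the fundamental equation in the theory of diffusion processes, and will be the main object of study in this course.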

Definition 4.4.1. A Markov process is homogeneous if

$$P(t, \Gamma|X_s = x) := P(s, t, x, \Gamma) = P(0, t - s, x, \Gamma).$$

We set $P(0, t, \cdot, \cdot) = P(t, \cdot, \cdot)$. The Chapman-Kolmogorov (CK) equation becomes

$$P(t + s, x, \Gamma) = \int_E P(s, x, dz)\, P(t, z, \Gamma). \tag{4.18}$$

Let $X_t$ be a homogeneous Markov process and assume that the initial distribution of $X_t$ is given by the probability measure $\nu(\Gamma) = \mathbb{P}(X_0 \in \Gamma)$ (for deterministic initial conditions $X_0 = x$ we have that $\nu(\Gamma) = I_\Gamma(x)$). The transition function $P(t, x, \Gamma)$ and the initial distribution determine the finite dimensional distributions of $X$ by

$$\mathbb{P}(X_0 \in \Gamma_0, X_{t_1} \in \Gamma_1, \dots, X_{t_n} \in \Gamma_n) = \int_{\Gamma_0} \int_{\Gamma_1} \cdots \int_{\Gamma_{n-1}} P(t_n - t_{n-1}, y_{n-1}, \Gamma_n)\, P(t_{n-1} - t_{n-2}, y_{n-2}, dy_{n-1}) \cdots P(t_1, y_0, dy_1)\, \nu(dy_0). \tag{4.19}$$

Theorem 4.4.2. ([12, Sec. 4.1]) Let $P(t, x, \Gamma)$ satisfy (4.18) and assume that $(E, \rho)$ is a complete separable metric space. Then there exists a Markov process $X$ in $E$ whose finite-dimensional distributions are uniquely determined by (4.19).

Let $X_t$ be a homogeneous Markov process with initial distribution $\nu(\Gamma) = \mathbb{P}(X_0 \in \Gamma)$ and transition function $P(t, x, \Gamma)$. We can calculate the probability of finding $X_t$ in a set $\Gamma$ at time $t$:

$$\mathbb{P}(X_t \in \Gamma) = \int_E P(t, x, \Gamma)\, \nu(dx).$$

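Thus, the initial distribution and the transition function are sufficient to characterize a homogeneous Markov process. Notice that they do not provide us with any information about the actual paths of the Markov process. The transition probability $P(\Gamma, t|x, s)$ is a probability measure. Assume that it has a density for all $t > s$:

$$P(\Gamma, t|x, s) = \int_\Gamma p(y, t|x, s)\, dy.$$

The CK equation becomes:

$$\int_\Gamma p(y, t|x, s)\, dy = \int_{\mathbb{R}} \int_\Gamma p(y, t|z, u)\, p(z, u|x, s)\, dy\, dz,$$

and, since $\Gamma \in \mathcal{B}(\mathbb{R})$ is arbitrary, we obtain the equation

$$p(y, t|x, s) = \int_{\mathbb{R}} p(y, t|z, u)\, p(z, u|x, s)\, dz. \tag{4.20}$$

The transition probability density is a function of four arguments: the initial position and time $x, s$ and the final position and time $y, t$.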
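Equation (4.20) can be checked numerically for the Brownian transition density (4.4) by integrating over the intermediate state $z$. This is a sketch assuming NumPy; the particular values of $x, s, u, t, y$ and the quadrature grid are arbitrary choices.

```python
import numpy as np

def p_bm(y, t, x, s):
    """Brownian transition density p(y, t | x, s) from (4.4), valid for t > s."""
    var = t - s
    return np.exp(-(y - x) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

# Arbitrary illustrative values with s < u < t.
x, s, u, t, y = 0.3, 0.0, 0.4, 1.0, -0.2

# Integrate over the intermediate state z with a simple Riemann sum; the
# integrand is smooth and negligible at the ends of the grid.
z = np.linspace(-12.0, 12.0, 20001)
dz = z[1] - z[0]
integrand = p_bm(y, t, z, u) * p_bm(z, u, x, s)
rhs = np.sum(integrand) * dz

lhs = p_bm(y, t, x, s)
ck_error = abs(lhs - rhs)
```

The same check works for the OU density (4.5) after replacing `p_bm` with the corresponding formula.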

In words, the CK equation tells us that, for a Markov process, the transition from $x, s$ to $y, t$ can be done in two steps: first the system moves from $x$ to $z$ at some intermediate time $u$. Then it moves from $z$ to $y$ at time $t$. In order to calculate the probability for the transition from $(x, s)$ to $(y, t)$ we need to sum (integrate) the transitions from all possible intermediary states $z$. The above description suggests that a Markov process can be described through a semigroup of operators, i.e. a one-parameter family of linear operators with the properties

$$P_0 = I, \quad P_{t+s} = P_t \circ P_s \quad \forall\, t, s \geq 0.$$

Indeed, let $P(t, \cdot, \cdot)$ be the transition function of a homogeneous Markov process. It satisfies the CK equation (4.18):

$$P(t + s, x, \Gamma) = \int_E P(s, x, dz)\, P(t, z, \Gamma).$$

Define the operator

$$(P_t f)(x) := \mathbb{E}(f(X_t) \,|\, X_0 = x) = \int_E f(y)\, P(t, x, dy).$$

This is a linear operator with

$$(P_0 f)(x) = \mathbb{E}(f(X_0) \,|\, X_0 = x) = f(x) \;\Rightarrow\; P_0 = I.$$

Furthermore:

$$\begin{aligned}
(P_{t+s} f)(x) &= \int f(y)\, P(t + s, x, dy) \\
&= \int\!\!\int f(y)\, P(s, z, dy)\, P(t, x, dz) \\
&= \int \left( \int f(y)\, P(s, z, dy) \right) P(t, x, dz) \\
&= \int (P_s f)(z)\, P(t, x, dz) \\
&= (P_t \circ P_s f)(x).
\end{aligned}$$

Consequently:

$$P_{t+s} = P_t \circ P_s.$$

Let $(E, \rho)$ be a metric space and let $\{X_t\}$ be an $E$-valued homogeneous Markov process. Define the one parameter family of operators $P_t$ through

$$P_t f(x) = \int f(y)\, P(t, x, dy) = \mathbb{E}[f(X_t) \,|\, X_0 = x]$$

for all $f(x) \in C_b(E)$ (continuous bounded functions on $E$). Assume for simplicity that $P_t : C_b(E) \to C_b(E)$. Then the one-parameter family of operators $P_t$ forms a semigroup of operators on $C_b(E)$. We define by $D(L)$ the set of all $f \in C_b(E)$ such that the strong limit

$$Lf = \lim_{t \to 0} \frac{P_t f - f}{t}$$

exists.

Definition 4.5.1. The operator $L : D(L) \to C_b(E)$ is called the infinitesimal generator of the operator semigroup $P_t$.

Definition 4.5.2. The operator $L : C_b(E) \to C_b(E)$ defined above is called the generator of the Markov process $\{X_t; t \geq 0\}$.

The semigroup property and the definition of the generator of a semigroup imply that, formally at least, we can write:

$$P_t = \exp(Lt).$$

Consider the function $u(x, t) := (P_t f)(x)$. We calculate its time derivative:

$$\frac{\partial u}{\partial t} = \frac{d}{dt}(P_t f) = \frac{d}{dt}\left(e^{Lt} f\right) = L\left(e^{Lt} f\right) = L P_t f = Lu.$$

Furthermore, $u(x, 0) = P_0 f(x) = f(x)$. Consequently, $u(x, t)$ satisfies the initial value problem

$$\frac{\partial u}{\partial t} = Lu, \quad u(x, 0) = f(x). \tag{4.21}$$

When the semigroup $P_t$ is the transition semigroup of a Markov process $X_t$, then equation (4.21) is called the backward Kolmogorov equation. It governs the evolution of an observable

$$u(x, t) = \mathbb{E}(f(X_t) \,|\, X_0 = x).$$

Thus, given the generator of a Markov process $L$, we can calculate all the statistics of our process by solving the backward Kolmogorov equation. In the case where the Markov process is the solution of a stochastic differential equation, then the generator is a second order elliptic operator and the backward Kolmogorov equation becomes an initial value problem for a parabolic PDE.

The space $C_b(E)$ is natural in a probabilistic context, but other Banach spaces often arise in applications; in particular when there is a measure $\mu$ on $E$, the spaces $L^p(E; \mu)$ sometimes arise. We will quite often use the space $L^2(E; \mu)$, where $\mu$ is the invariant measure of our Markov process. The generator is frequently taken as the starting point for the definition of a homogeneous Markov process. Conversely, let $P_t$ be a contraction semigroup (let $X$ be a Banach space and $T : X \to X$ a bounded operator; then $T$ is a contraction provided that $\|Tf\|_X \leq \|f\|_X$ for all $f \in X$), with $D(P_t) \subset C_b(E)$, closed. Then, under mild technical hypotheses, there is an $E$-valued homogeneous Markov process $\{X_t\}$ associated with $P_t$ defined through

$$\mathbb{E}[f(X(t)) \,|\, \mathcal{F}_s^X] = P_{t-s} f(X(s))$$

for all $t, s \in T$ with $t \geq s$ and $f \in D(P_t)$.

69

Example 4.5.3. The Poisson process is a homogeneous Markov process.

Example 4.5.4. The one dimensional Brownian motion is a homogeneous Markov process. The transition function is the Gaussian defined in the example in Lecture 2:

$$P(t, x, dy) = \gamma_{t,x}(y)\, dy, \qquad \gamma_{t,x}(y) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{|x - y|^2}{2t}\right).$$

The semigroup associated to the standard Brownian motion is the heat semigroup $P_t = e^{\frac{t}{2}\frac{d^2}{dx^2}}$. The generator of this Markov process is $\frac{1}{2}\frac{d^2}{dx^2}$.

Notice that the transition probability density $\gamma_{t,x}$ of the one dimensional Brownian motion is the fundamental solution (Green's function) of the heat (diffusion) PDE

$$\frac{\partial u}{\partial t} = \frac{1}{2} \frac{\partial^2 u}{\partial x^2}.$$
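The action of the heat semigroup can be checked numerically. The sketch below approximates $(P_t f)(x) = \int f(y)\gamma_{t,x}(y)\,dy$ with the trapezoidal rule; the function names, the truncation width and the grid size are illustrative assumptions and do not come from the text. For $f(y) = \cos(ky)$ the exact value is $e^{-k^2 t/2}\cos(kx)$, because $\cos(kx)$ is an eigenfunction of $\frac{1}{2}\frac{d^2}{dx^2}$ with eigenvalue $-k^2/2$.

```python
import math


def gaussian_kernel(t, x, y):
    """Transition density of standard Brownian motion started at x, evaluated at y."""
    return math.exp(-(x - y) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def apply_semigroup(f, t, x, half_width=8.0, n=400):
    """Approximate (P_t f)(x) by the trapezoidal rule on x +/- half_width * sqrt(t).

    half_width and n are ad hoc numerical parameters (assumptions of this sketch).
    """
    sd = math.sqrt(t)
    a, b = x - half_width * sd, x + half_width * sd
    h = (b - a) / n
    total = 0.5 * (f(a) * gaussian_kernel(t, x, a) + f(b) * gaussian_kernel(t, x, b))
    for i in range(1, n):
        y = a + i * h
        total += f(y) * gaussian_kernel(t, x, y)
    return total * h


if __name__ == "__main__":
    f = lambda y: math.cos(2.0 * y)
    # Compare the quadrature with the closed form exp(-k^2 t / 2) cos(k x), k = 2.
    print(apply_semigroup(f, 0.3, 0.7), math.exp(-0.6) * math.cos(1.4))
```

The same routine can be nested to test the semigroup property $P_{t+s} = P_t P_s$ on a given function, at the cost of one quadrature per evaluation of the inner function.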

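The semigroup $P_t$ acts on bounded measurable functions. We can also define the adjoint semigroup $P_t^*$ which acts on probability measures:

$$P_t^* \mu(\Gamma) = \int_{\mathbb{R}} P(X_t \in \Gamma \mid X_0 = x)\, d\mu(x) = \int_{\mathbb{R}} p(t, x, \Gamma)\, d\mu(x).$$

The image of a probability measure $\mu$ under $P_t^*$ is again a probability measure. The operators $P_t$ and $P_t^*$ are adjoint in the $L^2$-sense:

$$\int_{\mathbb{R}} P_t f(x)\, d\mu(x) = \int_{\mathbb{R}} f(x)\, d(P_t^* \mu)(x). \tag{4.22}$$

We can, formally at least, write

$$P_t^* = \exp(L^* t),$$

where $L^*$ is the $L^2$-adjoint of the generator of the process:

$$\int Lf\, h\, dx = \int f\, L^* h\, dx.$$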

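Let $\mu_t := P_t^* \mu$. This is the law of the Markov process and $\mu$ is the initial distribution. An argument similar to the one used in the derivation of the backward Kolmogorov equation (4.21) gives an equation for the evolution of $\mu_t$:

$$\frac{\partial \mu_t}{\partial t} = L^* \mu_t, \qquad \mu_0 = \mu.$$

Assuming that $\mu_t = \rho(y, t)\, dy$ and $\mu = \rho_0(y)\, dy$, this equation becomes:

$$\frac{\partial \rho}{\partial t} = L^* \rho, \qquad \rho(y, 0) = \rho_0(y). \tag{4.23}$$

This is the forward Kolmogorov or Fokker-Planck equation. When the initial conditions are deterministic, $X_0 = x$, the initial condition becomes $\rho_0 = \delta(y - x)$.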

Given the initial distribution and the generator of the Markov process $X_t$, we can calculate the transition probability density by solving the forward Kolmogorov equation. We can then calculate all statistical quantities of this process through the formula

$$E(f(X_t) \mid X_0 = x) = \int f(y)\, \rho(t, y; x)\, dy.$$

We will derive rigorously the backward and forward Kolmogorov equations for

Markov processes that are defined as solutions of stochastic differential equations

later on.

We can study the evolution of a Markov process in two different ways: either through the evolution of observables (Heisenberg/Koopman)

$$\frac{\partial (P_t f)}{\partial t} = L (P_t f),$$

or through the evolution of states (Schrödinger/Frobenius-Perron)

$$\frac{\partial (P_t^* \mu)}{\partial t} = L^* (P_t^* \mu).$$

We can also study Markov processes at the level of trajectories. We will do this

after we define the concept of a stochastic differential equation.

A very important concept in the study of limit theorems for stochastic processes is

that of ergodicity. This concept, in the context of Markov processes, provides us

with information on the longtime behavior of a Markov semigroup.

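Definition 4.6.1. A Markov process is called ergodic if the equation

$$P_t g = g, \qquad g \in C_b(E), \qquad \forall\, t \geq 0$$

has only constant solutions.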

Roughly speaking, ergodicity corresponds to the case where the semigroup $P_t$ is such that $P_t - I$ has only constants in its null space, or, equivalently, to the case where the generator $L$ has only constants in its null space. This follows from the definition of the generator of a Markov process.

Under some additional compactness assumptions, an ergodic Markov process has an invariant measure $\mu$ with the property that, in the case $T = \mathbb{R}^+$,

$$\lim_{t \to +\infty} \frac{1}{t} \int_0^t g(X_s)\, ds = E g(x),$$

where $E$ denotes the expectation with respect to $\mu$. This is a physicist's definition of an ergodic process: time averages equal phase space averages.

Using the adjoint semigroup we can define an invariant measure as the solution of the equation

$$P_t^* \mu = \mu.$$

If this measure is unique, then the Markov process is ergodic. Using this, we can obtain an equation for the invariant measure in terms of the adjoint $L^*$ of the generator, which is the generator of the semigroup $P_t^*$. Indeed, from the definition of the generator of a semigroup and the definition of an invariant measure, we conclude that a measure $\mu$ is invariant if and only if

$$L^* \mu = 0$$

in some appropriate generalized sense ($(L^* \mu, f) = 0$ for every bounded measurable function $f$). Assume that $\mu(dx) = \rho(x)\, dx$. Then the invariant density satisfies the stationary Fokker-Planck equation

$$L^* \rho = 0.$$

The invariant measure (distribution) governs the long-time dynamics of the Markov process.

If $X_0$ is distributed according to $\mu$, then so is $X_t$ for all $t > 0$. The resulting stochastic process, with $X_0$ distributed in this way, is stationary. In this case the transition probability density (the solution of the Fokker-Planck equation) is independent of time: $\rho(x, t) = \rho(x)$. Consequently, the statistics of the Markov process is independent of time.

Example 4.6.2. Consider the one-dimensional Brownian motion. The generator of this Markov process is

$$L = \frac{1}{2} \frac{d^2}{dx^2}.$$

The stationary Fokker-Planck equation becomes

$$\frac{d^2 \rho}{dx^2} = 0, \tag{4.24}$$

together with the normalization and non-negativity conditions

$$\rho \geq 0, \qquad \int_{\mathbb{R}} \rho(x)\, dx = 1. \tag{4.25}$$

There are no solutions to Equation (4.24), subject to the constraints (4.25).$^2$ Thus, the one dimensional Brownian motion is not an ergodic process.

Example 4.6.3. Consider a one-dimensional Brownian motion on $[0, 1]$, with periodic boundary conditions. The generator of this Markov process $L$ is the differential operator $L = \frac{1}{2}\frac{d^2}{dx^2}$, equipped with periodic boundary conditions on $[0, 1]$. This operator is self-adjoint. The null space of both $L$ and $L^*$ comprises constant functions on $[0, 1]$. Both the backward Kolmogorov and the Fokker-Planck equation reduce to the heat equation

$$\frac{\partial \rho}{\partial t} = \frac{1}{2} \frac{\partial^2 \rho}{\partial x^2}$$

with periodic boundary conditions in $[0, 1]$. Fourier analysis shows that the solution converges to a constant at an exponential rate. See Exercise 6.

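$^2$ The general solution to Equation (4.24) is $\rho(x) = Ax + B$ for arbitrary constants $A$ and $B$. This function is not normalizable, i.e. there do not exist constants $A$ and $B$ so that $\int_{\mathbb{R}} \rho(x)\, dx = 1$.

Example 4.6.4. The one dimensional Ornstein-Uhlenbeck (OU) process is a Markov process with generator

$$L = -\alpha x \frac{d}{dx} + D \frac{d^2}{dx^2}.$$

The null space of $L$ comprises constants in $x$. Hence, it is an ergodic Markov process. In order to calculate the invariant measure we need to solve the stationary Fokker-Planck equation:

$$L^* \rho = 0, \qquad \rho \geq 0, \qquad \|\rho\|_{L^1(\mathbb{R})} = 1. \tag{4.26}$$

Let us calculate the $L^2$-adjoint of $L$. Assuming that $f$, $h$ decay sufficiently fast at infinity, we have:

$$\begin{aligned}
\int_{\mathbb{R}} Lf\, h\, dx &= \int_{\mathbb{R}} \left[ (-\alpha x\, \partial_x f) h + (D\, \partial_x^2 f) h \right] dx \\
&= \int_{\mathbb{R}} \left[ f\, \partial_x(\alpha x h) + f\, (D\, \partial_x^2 h) \right] dx =: \int_{\mathbb{R}} f\, L^* h\, dx,
\end{aligned}$$

where

$$L^* h := \frac{d}{dx}(\alpha x h) + D \frac{d^2 h}{dx^2}.$$

We can calculate the invariant distribution by solving equation (4.26). The invariant measure of this process is the Gaussian measure

$$\mu(dx) = \sqrt{\frac{\alpha}{2\pi D}} \exp\left(-\frac{\alpha}{2D} x^2\right) dx.$$

If the initial condition of the OU process is distributed according to the invariant measure, then the OU process is a stationary Gaussian process. In that case $X_t$ is a mean zero, Gaussian second-order stationary process on $[0, \infty)$ with correlation function

$$R(t) = \frac{D}{\alpha} e^{-\alpha |t|}$$

and spectral density

$$f(x) = \frac{D}{\pi} \frac{1}{x^2 + \alpha^2}.$$

Furthermore, the OU process is the only real-valued mean zero Gaussian second-order stationary Markov process defined on $\mathbb{R}$.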
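The claim that the Gaussian density solves the stationary Fokker-Planck equation for the OU process can be checked by hand or numerically. The sketch below evaluates $L^*\rho = \frac{d}{dx}(\alpha x \rho) + D\frac{d^2\rho}{dx^2}$ with central finite differences; the function names, the step size `h` and the parameter values are illustrative assumptions, not part of the text.

```python
import math


def ou_invariant_density(x, alpha, D):
    """Gaussian density with mean 0 and variance D / alpha."""
    return math.sqrt(alpha / (2.0 * math.pi * D)) * math.exp(-alpha * x * x / (2.0 * D))


def fokker_planck_residual(x, alpha, D, h=1e-3):
    """Central-difference approximation of d/dx(alpha x rho) + D d^2 rho / dx^2 at x."""
    rho = lambda z: ou_invariant_density(z, alpha, D)
    flux = lambda z: alpha * z * rho(z)
    d_flux = (flux(x + h) - flux(x - h)) / (2.0 * h)
    d2_rho = (rho(x + h) - 2.0 * rho(x) + rho(x - h)) / (h * h)
    return d_flux + D * d2_rho


if __name__ == "__main__":
    # The residual should vanish up to the O(h^2) discretization error.
    for x in (-2.0, -0.5, 0.0, 1.0, 2.5):
        print(x, fokker_planck_residual(x, alpha=1.5, D=0.8))
```

The residual is zero only up to discretization and rounding error, so the check is evidence for the formula rather than a proof of it.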

The study of operator semigroups started in the late 40s independently by Hille and

Yosida. Semigroup theory was developed in the 50s and 60s by Feller, Dynkin

and others, mostly in connection to the theory of Markov processes. Necessary

and sufficient conditions for an operator L to be the generator of a (contraction)

semigroup are given by the Hille-Yosida theorem [13, Ch. 7].


4.8 Exercises

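1. Let $\{X_n\}$ be a stochastic process with state space $S = \mathbb{Z}$. Show that it is a Markov process if and only if for all $n$

$$P(X_{n+1} = i_{n+1} \mid X_1 = i_1, \dots, X_n = i_n) = P(X_{n+1} = i_{n+1} \mid X_n = i_n).$$

2. Show that (4.4) is the solution of initial value problem (4.10) as well as of the final value problem

$$-\frac{\partial p}{\partial s} = \frac{1}{2} \frac{\partial^2 p}{\partial x^2}, \qquad \lim_{s \to t} p(y, t \mid x, s) = \delta(y - x).$$

3. Use (4.5) to show that the forward and backward Kolmogorov equations for the OU process are

$$\frac{\partial p}{\partial t} = \frac{\partial}{\partial y}(y p) + \frac{1}{2} \frac{\partial^2 p}{\partial y^2}$$

and

$$-\frac{\partial p}{\partial s} = -x \frac{\partial p}{\partial x} + \frac{1}{2} \frac{\partial^2 p}{\partial x^2}.$$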

4. Let $W(t)$ be a standard one dimensional Brownian motion, let $Y(t) = \sigma W(t)$ with $\sigma > 0$ and consider the process

$$X(t) = \int_0^t Y(s)\, ds.$$

Show that the joint process $\{X(t), Y(t)\}$ is Markovian and write down the generator of the process.

5. Let $Y(t) = e^{-t} W(e^{2t})$ be the stationary Ornstein-Uhlenbeck process and consider the process

$$X(t) = \int_0^t Y(s)\, ds.$$

Show that the joint process $\{X(t), Y(t)\}$ is Markovian and write down the generator of the process.

6. Consider a one-dimensional Brownian motion on $[0, 1]$, with periodic boundary conditions. The generator of this Markov process $L$ is the differential operator $L = \frac{1}{2}\frac{d^2}{dx^2}$, equipped with periodic boundary conditions on $[0, 1]$. Show that this operator is self-adjoint. Show that the null space of both $L$ and $L^*$ comprises constant functions on $[0, 1]$. Conclude that this process is ergodic. Solve the corresponding Fokker-Planck equation for arbitrary initial conditions $\rho_0(x)$. Show that the solution converges to a constant at an exponential rate.

7. (a) Let $X$, $Y$ be mean zero Gaussian random variables with $E X^2 = \sigma_X^2$, $E Y^2 = \sigma_Y^2$ and correlation coefficient $\rho$ (the correlation coefficient is $\rho = \frac{E(XY)}{\sigma_X \sigma_Y}$). Show that

$$E(X \mid Y) = \frac{\rho\, \sigma_X}{\sigma_Y}\, Y.$$

(b) Let $X_t$ be a mean zero stationary Gaussian process with autocorrelation function $R(t)$. Use the previous result to show that

$$E[X_{t+s} \mid X_s] = \frac{R(t)}{R(0)}\, X(s), \qquad s, t \geq 0.$$

(c) Use the previous result to show that the only stationary Gaussian Markov process with continuous autocorrelation function is the stationary OU process.

8. Show that a Gaussian process $X_t$ is a Markov process if and only if

$$E(X_{t_n} \mid X_{t_1} = x_1, \dots, X_{t_{n-1}} = x_{n-1}) = E(X_{t_n} \mid X_{t_{n-1}} = x_{n-1}).$$

Chapter 5

Diffusion Processes

5.1 Introduction

In this chapter we study a particular class of Markov processes, namely Markov

processes with continuous paths. These processes are called diffusion processes

and they appear in many applications in physics, chemistry, biology and finance.

In Section 5.2 we give the definition of a diffusion process. In Section 5.3 we

derive the forward and backward Kolmogorov equations for one-dimensional diffusion processes. In Section 5.4 we present the forward and backward Kolmogorov

equations in arbitrary dimensions. The connection between diffusion processes

and stochastic differential equations is presented in Section 5.5. Discussion and

bibliographical remarks are included in Section 5.7. Exercises can be found in

Section 5.8.

A Markov process consists of three parts: a drift (deterministic), a random process and a jump process. A diffusion process is a Markov process that has continuous sample paths (trajectories). Thus, it is a Markov process with no jumps. A diffusion process can be defined by specifying its first two moments:

Definition 5.2.1. A Markov process $X_t$ with transition function $P(\Gamma, t \mid x, s)$ is called a diffusion process if the following conditions are satisfied.

i. (Continuity). For every $x$ and every $\varepsilon > 0$

$$\int_{|x - y| > \varepsilon} P(dy, t \mid x, s) = o(t - s). \tag{5.1}$$

ii. (Definition of drift coefficient). There exists a function $a(x, s)$ such that for every $x$ and every $\varepsilon > 0$

$$\int_{|y - x| \leq \varepsilon} (y - x)\, P(dy, t \mid x, s) = a(x, s)(t - s) + o(t - s). \tag{5.2}$$

iii. (Definition of diffusion coefficient). There exists a function $b(x, s)$ such that for every $x$ and every $\varepsilon > 0$

$$\int_{|y - x| \leq \varepsilon} (y - x)^2\, P(dy, t \mid x, s) = b(x, s)(t - s) + o(t - s). \tag{5.3}$$

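Remark 5.2.2. In Definition 5.2.1 we had to truncate the domain of integration since we didn't know whether the first and second moments exist. If we assume that there exists a $\delta > 0$ such that

$$\lim_{t \to s} \frac{1}{t - s} \int_{\mathbb{R}^d} |y - x|^{2 + \delta}\, P(dy, t \mid x, s) = 0, \tag{5.4}$$

then we can extend the integration over the whole $\mathbb{R}^d$ and use expectations in the definition of the drift and the diffusion coefficient. Indeed, let $k = 0, 1, 2$ and notice that

$$\begin{aligned}
\int_{|y - x| > \varepsilon} |y - x|^k\, P(dy, t \mid x, s)
&= \int_{|y - x| > \varepsilon} |y - x|^{2 + \delta}\, |y - x|^{k - (2 + \delta)}\, P(dy, t \mid x, s) \\
&\leq \frac{1}{\varepsilon^{2 + \delta - k}} \int_{|y - x| > \varepsilon} |y - x|^{2 + \delta}\, P(dy, t \mid x, s) \\
&\leq \frac{1}{\varepsilon^{2 + \delta - k}} \int_{\mathbb{R}^d} |y - x|^{2 + \delta}\, P(dy, t \mid x, s).
\end{aligned}$$

Using this estimate together with (5.4) we conclude that:

$$\lim_{t \to s} \frac{1}{t - s} \int_{|y - x| > \varepsilon} |y - x|^k\, P(dy, t \mid x, s) = 0, \qquad k = 0, 1, 2.$$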

This implies that assumption (5.4) is sufficient for the sample paths to be continuous ($k = 0$) and for the replacement of the truncated integrals in (5.2) and (5.3) by integrals over $\mathbb{R}^d$ ($k = 1$ and $k = 2$, respectively). The definitions of the drift and diffusion coefficients become:

$$\lim_{t \to s} E\left( \frac{X_t - X_s}{t - s} \,\Big|\, X_s = x \right) = a(x, s) \tag{5.5}$$

and

$$\lim_{t \to s} E\left( \frac{|X_t - X_s|^2}{t - s} \,\Big|\, X_s = x \right) = b(x, s). \tag{5.6}$$
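The limits (5.1), (5.2) and (5.3) can be illustrated on a process whose transition density is known in closed form. The sketch below assumes an Ornstein-Uhlenbeck process with drift $a(x) = -\alpha x$ and diffusion coefficient $b(x) = D$, for which $X_{s+h}$ given $X_s = x$ is Gaussian with mean $x e^{-\alpha h}$ and variance $\frac{D}{2\alpha}(1 - e^{-2\alpha h})$. The function names, the midpoint quadrature and all numerical parameters are illustrative assumptions.

```python
import math


def ou_transition_density(y, h, x, alpha, D):
    """Gaussian transition density of the OU process over a time step h (assumed model)."""
    mean = x * math.exp(-alpha * h)
    var = D * (1.0 - math.exp(-2.0 * alpha * h)) / (2.0 * alpha)
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def truncated_moment(k, h, x, alpha, D, eps, n=20000):
    """(1/h) * integral over |y - x| <= eps of (y - x)^k p(y, s+h | x, s) dy (midpoint rule)."""
    step = 2.0 * eps / n
    total = 0.0
    for i in range(n):
        y = x - eps + (i + 0.5) * step
        total += (y - x) ** k * ou_transition_density(y, h, x, alpha, D)
    return total * step / h


if __name__ == "__main__":
    x, alpha, D, eps = 0.8, 1.0, 1.0, 0.5
    for h in (1e-1, 1e-2, 1e-3):
        # As h decreases, the k = 1 value approaches a(x) = -alpha * x
        # and the k = 2 value approaches b(x) = D.
        print(h, truncated_moment(1, h, x, alpha, D, eps), truncated_moment(2, h, x, alpha, D, eps))
```

With $k = 0$ the same routine gives the probability mass inside the ball $|y - x| \leq \varepsilon$ divided by $h$, so the continuity condition (5.1) can be inspected by checking that one minus that mass is small compared with $h$.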

In this section we show that a diffusion process is completely determined by its first

two moments. In particular, we will obtain partial differential equations that govern

the evolution of the conditional expectation of an arbitrary function of a diffusion

process Xt , u(x, s) = E(f (Xt )|Xs = x), as well as of the transition probability

density p(y, t|x, s). These are the backward and forward Kolmogorov equations.

In this section we shall derive the backward and forward Kolmogorov equations for one-dimensional diffusion processes. The extension to multidimensional

diffusion processes is presented in Section 5.4.

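Theorem 5.3.1. (Kolmogorov) Let $f(x) \in C_b(\mathbb{R})$ and let

$$u(x, s) := E(f(X_t) \mid X_s = x) = \int f(y)\, P(dy, t \mid x, s).$$

Assume furthermore that the functions $a(x, s)$, $b(x, s)$ are continuous in both $x$ and $s$. Then $u(x, s) \in C^{2,1}(\mathbb{R} \times \mathbb{R}^+)$ and it solves the final value problem

$$-\frac{\partial u}{\partial s} = a(x, s) \frac{\partial u}{\partial x} + \frac{1}{2} b(x, s) \frac{\partial^2 u}{\partial x^2}, \qquad \lim_{s \to t} u(s, x) = f(x). \tag{5.7}$$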

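Proof. First we notice that the continuity assumption (5.1), together with the fact that the function $f(x)$ is bounded, imply that

$$\begin{aligned}
u(x, s) &= \int_{\mathbb{R}} f(y)\, P(dy, t \mid x, s) \\
&= \int_{|y - x| \leq \varepsilon} f(y)\, P(dy, t \mid x, s) + \int_{|y - x| > \varepsilon} f(y)\, P(dy, t \mid x, s) \\
&\leq \int_{|y - x| \leq \varepsilon} f(y)\, P(dy, t \mid x, s) + \|f\|_{L^\infty} \int_{|y - x| > \varepsilon} P(dy, t \mid x, s) \\
&= \int_{|y - x| \leq \varepsilon} f(y)\, P(dy, t \mid x, s) + o(t - s).
\end{aligned}$$

We add and subtract the final condition $f(x)$ and use the previous calculation to obtain:

$$\begin{aligned}
u(x, s) &= \int_{\mathbb{R}} f(y)\, P(dy, t \mid x, s) = f(x) + \int_{\mathbb{R}} (f(y) - f(x))\, P(dy, t \mid x, s) \\
&= f(x) + \int_{|y - x| \leq \varepsilon} (f(y) - f(x))\, P(dy, t \mid x, s) + \int_{|y - x| > \varepsilon} (f(y) - f(x))\, P(dy, t \mid x, s) \\
&= f(x) + \int_{|y - x| \leq \varepsilon} (f(y) - f(x))\, P(dy, t \mid x, s) + o(t - s).
\end{aligned}$$

Now the final condition follows from the fact that $f(x) \in C_b(\mathbb{R})$ and the arbitrariness of $\varepsilon$.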

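Now we show that $u(s, x)$ solves the backward Kolmogorov equation. We use the Chapman-Kolmogorov equation (4.15) to obtain

$$\begin{aligned}
u(x, \sigma) &= \int_{\mathbb{R}} f(z)\, P(dz, t \mid x, \sigma) && (5.8) \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} f(z)\, P(dz, t \mid y, \rho)\, P(dy, \rho \mid x, \sigma) \\
&= \int_{\mathbb{R}} u(y, \rho)\, P(dy, \rho \mid x, \sigma). && (5.9)
\end{aligned}$$

A Taylor expansion of $u$ about $x$ gives

$$u(z, \rho) - u(x, \rho) = \frac{\partial u(x, \rho)}{\partial x}(z - x) + \frac{1}{2} \frac{\partial^2 u(x, \rho)}{\partial x^2}(z - x)^2 (1 + \alpha_\varepsilon), \qquad |z - x| \leq \varepsilon, \tag{5.10}$$

where

$$\alpha_\varepsilon = \sup_{\rho,\, |z - x| \leq \varepsilon} \left| \frac{\partial^2 u(x, \rho)}{\partial x^2} - \frac{\partial^2 u(z, \rho)}{\partial x^2} \right|.$$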

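We combine now (5.9) with (5.10) to calculate

$$\begin{aligned}
\frac{u(x, s) - u(x, s + h)}{h}
&= \frac{1}{h} \left( \int_{\mathbb{R}} P(dy, s + h \mid x, s)\, u(y, s + h) - u(x, s + h) \right) \\
&= \frac{1}{h} \int_{\mathbb{R}} P(dy, s + h \mid x, s)\, \big( u(y, s + h) - u(x, s + h) \big) \\
&= \frac{1}{h} \int_{|x - y| < \varepsilon} P(dy, s + h \mid x, s)\, \big( u(y, s + h) - u(x, s + h) \big) + o(1) \\
&= \frac{\partial u}{\partial x}(x, s + h)\, \frac{1}{h} \int_{|x - y| < \varepsilon} (y - x)\, P(dy, s + h \mid x, s) \\
&\quad + \frac{1}{2} \frac{\partial^2 u}{\partial x^2}(x, s + h)\, \frac{1}{h} \int_{|x - y| < \varepsilon} (y - x)^2\, P(dy, s + h \mid x, s)\, (1 + \alpha_\varepsilon) + o(1) \\
&= a(x, s) \frac{\partial u}{\partial x}(x, s + h) + \frac{1}{2} b(x, s) \frac{\partial^2 u}{\partial x^2}(x, s + h)(1 + \alpha_\varepsilon) + o(1).
\end{aligned}$$

Since $\alpha_\varepsilon \to 0$ as $\varepsilon \to 0$ by the continuity of $\partial^2 u / \partial x^2$, taking the limits $\varepsilon \to 0$ and $h \to 0$ yields the backward Kolmogorov equation (5.7).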

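Assume now that the transition function has a density $p(y, t \mid x, s)$. In this case the formula for $u(x, s)$ becomes

$$u(x, s) = \int_{\mathbb{R}} f(y)\, p(y, t \mid x, s)\, dy.$$

Substituting this in the backward Kolmogorov equation we obtain

$$\int_{\mathbb{R}} f(y) \left( \frac{\partial p(y, t \mid x, s)}{\partial s} + A_{s,x}\, p(y, t \mid x, s) \right) dy = 0, \tag{5.11}$$

where

$$A_{s,x} := a(x, s) \frac{\partial}{\partial x} + \frac{1}{2} b(x, s) \frac{\partial^2}{\partial x^2}.$$

Since (5.11) is valid for arbitrary functions $f(y)$, we obtain a partial differential equation for the transition probability density:

$$-\frac{\partial p(y, t \mid x, s)}{\partial s} = a(x, s) \frac{\partial p(y, t \mid x, s)}{\partial x} + \frac{1}{2} b(x, s) \frac{\partial^2 p(y, t \mid x, s)}{\partial x^2}. \tag{5.12}$$

Notice that the variation is with respect to the backward variables $x, s$. We will obtain an equation with respect to the forward variables $y, t$ in the next section.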

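In this section we will obtain the forward Kolmogorov equation. In the physics literature it is called the Fokker-Planck equation. We assume that the transition function has a density with respect to Lebesgue measure:

$$P(\Gamma, t \mid x, s) = \int_{\Gamma} p(y, t \mid x, s)\, dy.$$

Theorem 5.3.2. (Kolmogorov) Assume that conditions (5.1), (5.2), (5.3) are satisfied and that $p(y, t \mid \cdot, \cdot)$, $a(y, t)$, $b(y, t) \in C^{2,1}(\mathbb{R} \times \mathbb{R}^+)$. Then the transition probability density satisfies the equation

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial y}\big( a(y, t)\, p \big) + \frac{1}{2} \frac{\partial^2}{\partial y^2}\big( b(y, t)\, p \big), \qquad \lim_{t \to s} p(y, t \mid x, s) = \delta(x - y). \tag{5.13}$$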

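Proof. Fix a function $f(y) \in C_0^2(\mathbb{R})$. An argument similar to the one used in the proof of the backward Kolmogorov equation gives

$$\lim_{h \to 0} \frac{1}{h} \left( \int f(y)\, p(y, s + h \mid x, s)\, dy - f(x) \right) = a(x, s) f_x(x) + \frac{1}{2} b(x, s) f_{xx}(x), \tag{5.14}$$

where subscripts denote differentiation with respect to $x$. On the other hand

$$\begin{aligned}
\int f(y) \frac{\partial}{\partial t} p(y, t \mid x, s)\, dy
&= \frac{\partial}{\partial t} \int f(y)\, p(y, t \mid x, s)\, dy \\
&= \lim_{h \to 0} \frac{1}{h} \int \big( p(y, t + h \mid x, s) - p(y, t \mid x, s) \big) f(y)\, dy \\
&= \lim_{h \to 0} \frac{1}{h} \left( \int p(y, t + h \mid x, s) f(y)\, dy - \int p(z, t \mid x, s) f(z)\, dz \right) \\
&= \lim_{h \to 0} \frac{1}{h} \left( \int \int p(y, t + h \mid z, t)\, p(z, t \mid x, s) f(y)\, dy\, dz - \int p(z, t \mid x, s) f(z)\, dz \right) \\
&= \lim_{h \to 0} \frac{1}{h} \int p(z, t \mid x, s) \left( \int p(y, t + h \mid z, t) f(y)\, dy - f(z) \right) dz \\
&= \int p(z, t \mid x, s) \left( a(z, t) f_z(z) + \frac{1}{2} b(z, t) f_{zz}(z) \right) dz \\
&= \int \left( -\frac{\partial}{\partial z}\big( a(z, t)\, p(z, t \mid x, s) \big) + \frac{1}{2} \frac{\partial^2}{\partial z^2}\big( b(z, t)\, p(z, t \mid x, s) \big) \right) f(z)\, dz.
\end{aligned}$$

In the above calculation we used the Chapman-Kolmogorov equation and (5.14). We have also performed two integrations by parts and used the fact that, since the test function $f$ has compact support, the boundary terms vanish.

Since the above equation is valid for every test function $f(y)$, the forward Kolmogorov equation follows.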

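Assume now that the initial distribution of $X_t$ is $\rho_0(x)$ and set $s = 0$ (the initial time) in (5.13). Define

$$p(y, t) := \int p(y, t \mid x, 0)\, \rho_0(x)\, dx. \tag{5.15}$$

We multiply the forward Kolmogorov equation (5.13) by $\rho_0(x)$ and integrate with respect to $x$ to obtain the equation

$$\frac{\partial p(y, t)}{\partial t} = -\frac{\partial}{\partial y}\big( a(y, t)\, p(y, t) \big) + \frac{1}{2} \frac{\partial^2}{\partial y^2}\big( b(y, t)\, p(y, t) \big), \tag{5.16}$$

together with the initial condition

$$p(y, 0) = \rho_0(y). \tag{5.17}$$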

The solution of equation (5.16) provides us with the probability that the diffusion process $X_t$, which initially was distributed according to the probability density $\rho_0(x)$, is equal to $y$ at time $t$. Alternatively, we can think of the solution to (5.13) as the Green's function for the PDE (5.16). Using (5.16) we can calculate the expectation of an arbitrary function of the diffusion process $X_t$:

$$\begin{aligned}
E(f(X_t)) &= \int \int f(y)\, p(y, t \mid x, 0)\, p(x, 0)\, dx\, dy \\
&= \int f(y)\, p(y, t)\, dy,
\end{aligned}$$

where $p(y, t)$ is the solution of (5.16). Quite often we need to calculate joint probability densities, for example the probability that $X_{t_1} = x_1$ and $X_{t_2} = x_2$. From the properties of conditional expectation we have that

$$\begin{aligned}
p(x_1, t_1, x_2, t_2) &= P(X_{t_1} = x_1, X_{t_2} = x_2) \\
&= P(X_{t_1} = x_1 \mid X_{t_2} = x_2)\, P(X_{t_2} = x_2) \\
&= p(x_1, t_1 \mid x_2, t_2)\, p(x_2, t_2).
\end{aligned}$$

Using the joint probability density we can calculate the statistics of a function of the diffusion process $X_t$ at times $t$ and $s$:

$$E(f(X_t, X_s)) = \int \int f(y, x)\, p(y, t \mid x, s)\, p(x, s)\, dx\, dy. \tag{5.18}$$

Taking $f(y, x) = yx$ gives the autocorrelation function at times $t$ and $s$:

$$E(X_t X_s) = \int \int y x\, p(y, t \mid x, s)\, p(x, s)\, dx\, dy.$$

In particular,

$$E(X_t X_0) = \int \int y x\, p(y, t \mid x, 0)\, p(x, 0)\, dx\, dy.$$

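Let $X_t$ be a diffusion process in $\mathbb{R}^d$. The drift and diffusion coefficients of a diffusion process in $\mathbb{R}^d$ are defined as:

$$\lim_{t \to s} \frac{1}{t - s} \int_{|y - x| < \varepsilon} (y - x)\, P(dy, t \mid x, s) = a(x, s)$$

and

$$\lim_{t \to s} \frac{1}{t - s} \int_{|y - x| < \varepsilon} (y - x) \otimes (y - x)\, P(dy, t \mid x, s) = b(x, s).$$

The drift coefficient $a(x, s)$ is a $d$-dimensional vector field and the diffusion coefficient $b(x, s)$ is a $d \times d$ symmetric matrix (second order tensor). The generator of a $d$-dimensional diffusion process is

$$\begin{aligned}
L &= a(x, s) \cdot \nabla + \frac{1}{2} b(x, s) : \nabla \nabla \\
&= \sum_{j=1}^d a_j(x, s) \frac{\partial}{\partial x_j} + \frac{1}{2} \sum_{i,j=1}^d b_{ij}(x, s) \frac{\partial^2}{\partial x_i \partial x_j}.
\end{aligned}$$

Exercise 5.4.1. Derive rigorously the forward and backward Kolmogorov equations in arbitrary dimensions.

Assuming that the first and second moments of the multidimensional diffusion process exist, we can write the formulas for the drift vector and diffusion matrix as

$$\lim_{t \to s} E\left( \frac{X_t - X_s}{t - s} \,\Big|\, X_s = x \right) = a(x, s) \tag{5.19}$$

and

$$\lim_{t \to s} E\left( \frac{(X_t - X_s) \otimes (X_t - X_s)}{t - s} \,\Big|\, X_s = x \right) = b(x, s). \tag{5.20}$$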

Notice that from the above definition it follows that the diffusion matrix is symmetric and nonnegative definite.

Notice also that the continuity condition can be written in the form

$$P(|X_t - X_s| \geq \varepsilon \mid X_s = x) = o(t - s).$$

Now it becomes clear that this condition implies that the probability of large changes in $X_t$ over short time intervals is small. Notice, on the other hand, that the above condition implies that the sample paths of a diffusion process are not differentiable: if they were, then the right hand side of the above equation would have to be $0$ when $t - s \ll 1$. The sample paths of a diffusion process have the regularity of Brownian paths. A Markovian process cannot be differentiable: we can define the derivative of a sample path only for processes for which the past and future are not statistically independent when conditioned on the present.

Let us denote the expectation conditioned on $X_s = x$ by $E^{s,x}$. Notice that the definitions of the drift and diffusion coefficients (5.5) and (5.6) can be written in the form

$$E^{s,x}(X_t - X_s) = a(x, s)(t - s) + o(t - s)$$

and

$$E^{s,x}\big( (X_t - X_s) \otimes (X_t - X_s) \big) = b(x, s)(t - s) + o(t - s).$$

Consequently, the drift coefficient defines the mean velocity vector for the stochastic process $X_t$, whereas the diffusion coefficient (tensor) is a measure of the local magnitude of fluctuations of $X_t - X_s$ about the mean value. Hence, we can write locally:

$$X_t - X_s \approx a(s, X_s)(t - s) + \sigma(s, X_s)\, \xi_t,$$

where $b = \sigma \sigma^T$ and $\xi_t$ is a mean zero Gaussian process with

$$E^{s,x}(\xi_t \otimes \xi_s) = (t - s) I.$$

Since we have that

$$W_t - W_s \sim N(0, (t - s) I),$$

we conclude that we can write locally:

$$\Delta X_t \approx a(s, X_s)\, \Delta t + \sigma(s, X_s)\, \Delta W_t.$$

Or, replacing the differences by differentials:

$$dX_t = a(t, X_t)\, dt + \sigma(t, X_t)\, dW_t.$$

Hence, the sample paths of a diffusion process are governed by a stochastic differential equation (SDE).
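The local approximation $\Delta X_t \approx a\,\Delta t + \sigma\,\Delta W_t$ is also the basis of the simplest way to simulate sample paths, commonly known as the Euler-Maruyama scheme. The sketch below is a scalar illustration only: the function name `euler_maruyama`, its signature and the choice of a uniform time grid are assumptions of this example, not something defined in the text.

```python
import math
import random


def euler_maruyama(a, sigma, x0, t_end, n_steps, rng):
    """Simulate one path of dX = a(t, X) dt + sigma(t, X) dW on [0, t_end].

    a, sigma: callables (t, x) -> float; rng: a random.Random instance.
    Returns the list of n_steps + 1 path values on a uniform grid.
    """
    dt = t_end / n_steps
    t, x = 0.0, x0
    path = [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x = x + a(t, x) * dt + sigma(t, x) * dw
        t += dt
        path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative coefficients: a(x) = -x, sigma = 1.
    path = euler_maruyama(lambda t, x: -x, lambda t, x: 1.0, 1.0, 1.0, 100, rng)
    print(path[-1])
```

Each step replaces the Brownian increment by an independent $N(0, \Delta t)$ sample, so the scheme reproduces the drift and diffusion coefficients only in the limit $\Delta t \to 0$.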

i. The 1-dimensional Brownian motion starting at $x$ is a diffusion process with generator

$$L = \frac{1}{2} \frac{d^2}{dx^2}.$$

The drift and diffusion coefficients are, respectively, $a(x) = 0$ and $b(x) = 1$. The corresponding stochastic differential equation is

$$dX_t = dW_t, \qquad X_0 = x,$$

whose solution is

$$X_t = x + W_t.$$

ii. The 1-dimensional Ornstein-Uhlenbeck process is a diffusion process with drift and diffusion coefficients, respectively, $a(x) = -\alpha x$ and $b(x) = D$. The generator of this process is

$$L = -\alpha x \frac{d}{dx} + \frac{D}{2} \frac{d^2}{dx^2}.$$

The corresponding stochastic differential equation is

$$dX_t = -\alpha X_t\, dt + \sqrt{D}\, dW_t,$$

whose solution is

$$X_t = e^{-\alpha t} X_0 + \sqrt{D} \int_0^t e^{-\alpha (t - s)}\, dW_s.$$

The argument used in the derivation of the forward and backward Kolmogorov equations goes back to Kolmogorov's original work. More material on diffusion processes can be found in [26], [32].

5.8 Exercises

1. Prove equation (5.14).

2. Derive the initial value problem (5.16), (5.17).

3. Derive rigorously the backward and forward Kolmogorov equations in arbitrary

dimensions.


Chapter 6

6.1 Introduction

In the previous chapter we derived the backward and forward (Fokker-Planck) Kolmogorov equations and we showed that all statistical properties of a diffusion process can be calculated from the solution of the Fokker-Planck equation.$^1$ In this long chapter we study various properties of this equation such as existence and uniqueness of solutions, long time asymptotics, boundary conditions and spectral properties of the Fokker-Planck operator. We also study in some detail various examples of diffusion processes and of the associated Fokker-Planck equation. We will restrict attention to time-homogeneous diffusion processes, for which the drift and diffusion coefficients do not depend on time.

In Section 6.2 we study various basic properties of the Fokker-Planck equation, including existence and uniqueness of solutions, writing the equation as a conservation law and boundary conditions. In Section 6.3 we present some examples of diffusion processes and use the corresponding Fokker-Planck equation in order to calculate various quantities of interest such as moments. In Section 6.4 we study the multidimensional Ornstein-Uhlenbeck process and the spectral properties of the corresponding Fokker-Planck operator. In Section 6.5 we study stochastic processes whose drift is given by the gradient of a scalar function, gradient flows. In Section 6.7 we solve the Fokker-Planck equation for a gradient SDE using eigenfunction expansions and we show how the eigenvalue problem for the Fokker-Planck operator can be reduced to the eigenfunction expansion for a Schrödinger operator. In Section 8.2 we study the Langevin equation and the associated Fokker-Planck equation. In Section 8.3 we calculate the eigenvalues and eigenfunctions of the Fokker-Planck operator for the Langevin equation in a harmonic potential. Discussion and bibliographical remarks are included in Section 6.8. Exercises can be found in Section 6.9.

$^1$ In this chapter we will call the equation Fokker-Planck, which is more customary in the physics literature, rather than forward Kolmogorov, which is more customary in the mathematics literature.

6.2.1 Existence and Uniqueness of Solutions

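Consider a homogeneous diffusion process on $\mathbb{R}^d$ with drift vector and diffusion matrix $a(x)$ and $b(x)$. The Fokker-Planck equation is

$$\frac{\partial p}{\partial t} = -\sum_{j=1}^d \frac{\partial}{\partial x_j}\big( a_j(x)\, p \big) + \frac{1}{2} \sum_{i,j=1}^d \frac{\partial^2}{\partial x_i \partial x_j}\big( b_{ij}(x)\, p \big), \qquad t > 0,\ x \in \mathbb{R}^d, \tag{6.1a}$$

$$p(x, 0) = f(x), \qquad x \in \mathbb{R}^d. \tag{6.1b}$$

Since $f(x)$ is the probability density of the initial condition (which is a random variable), we have that

$$f(x) \geq 0, \quad \text{and} \quad \int_{\mathbb{R}^d} f(x)\, dx = 1.$$

We can also write the equation in non-divergence form:

$$\frac{\partial p}{\partial t} = \sum_{j=1}^d \tilde{a}_j(x) \frac{\partial p}{\partial x_j} + \frac{1}{2} \sum_{i,j=1}^d b_{ij}(x) \frac{\partial^2 p}{\partial x_i \partial x_j} + \tilde{c}(x)\, p, \qquad t > 0,\ x \in \mathbb{R}^d, \tag{6.2a}$$

$$p(x, 0) = f(x), \qquad x \in \mathbb{R}^d, \tag{6.2b}$$

where

$$\tilde{a}_i(x) = -a_i(x) + \sum_{j=1}^d \frac{\partial b_{ij}}{\partial x_j}, \qquad \tilde{c}(x) = \frac{1}{2} \sum_{i,j=1}^d \frac{\partial^2 b_{ij}}{\partial x_i \partial x_j} - \sum_{i=1}^d \frac{\partial a_i}{\partial x_i}.$$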

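By definition (see equation (5.20)), the diffusion matrix is always symmetric and nonnegative. We will assume that it is actually uniformly positive definite, i.e. we will impose the uniform ellipticity condition: there exists a constant $\alpha > 0$, independent of $x$, such that

$$\sum_{i,j=1}^d b_{ij}(x)\, \xi_i \xi_j \geq \alpha \|\xi\|^2, \qquad \forall\, \xi \in \mathbb{R}^d. \tag{6.3}$$

Furthermore, we will assume that the coefficients $\tilde{a}$, $b$, $\tilde{c}$ are smooth and that they satisfy the growth conditions

$$\|b(x)\| \leq M, \qquad \|\tilde{a}(x)\| \leq M(1 + \|x\|), \qquad \|\tilde{c}(x)\| \leq M(1 + \|x\|^2). \tag{6.4}$$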

Definition 6.2.1. We will call a solution to the Cauchy problem for the Fokker-Planck equation (6.2) a classical solution if:

i. $u \in C^{2,1}(\mathbb{R}^d, \mathbb{R}^+)$.

ii. $\forall\, T > 0$ there exists a $c > 0$ such that $\|u(t, x)\|_{L^\infty(0, T)} \leq c\, e^{\alpha \|x\|^2}$.

iii. $\lim_{t \to 0} u(t, x) = f(x)$.

It is a standard result in the theory of parabolic partial differential equations that, under the regularity and uniform ellipticity assumptions, the Fokker-Planck equation has a unique smooth solution. Furthermore, the solution can be estimated in terms of an appropriate heat kernel (i.e. the solution of the heat equation on $\mathbb{R}^d$).

Theorem 6.2.2. Assume that conditions (6.3) and (6.4) are satisfied, and assume that $|f| \leq c\, e^{\alpha \|x\|^2}$. Then there exists a unique classical solution to the Cauchy problem for the Fokker-Planck equation. Furthermore, there exist positive constants $K$, $\delta$ so that

$$|p|,\ |p_t|,\ \|\nabla p\|,\ \|D^2 p\| \leq K t^{-(n+2)/2} \exp\left( -\frac{1}{2t} \delta \|x\|^2 \right). \tag{6.5}$$

Notice that from estimates (6.5) it follows that all moments of a uniformly elliptic diffusion process exist. In particular, we can multiply the Fokker-Planck equation by monomials $x^n$, integrate over $\mathbb{R}^d$ and then integrate by parts. No boundary terms will appear, in view of the estimate (6.5).

Remark 6.2.3. The solution of the Fokker-Planck equation is nonnegative for all times, provided that the initial distribution is nonnegative. This follows from the maximum principle for parabolic PDEs.

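The Fokker-Planck equation is in fact a conservation law: it expresses the law of conservation of probability. To see this we define the probability current to be the vector whose $i$th component is

$$J_i := a_i(x)\, p - \frac{1}{2} \sum_{j=1}^d \frac{\partial}{\partial x_j}\big( b_{ij}(x)\, p \big). \tag{6.6}$$

We use the probability current to write the Fokker-Planck equation as a continuity equation:

$$\frac{\partial p}{\partial t} + \nabla \cdot J = 0.$$

Integrating the FP equation over $\mathbb{R}^d$ and integrating by parts on the right hand side of the equation we obtain

$$\frac{d}{dt} \int_{\mathbb{R}^d} p(x, t)\, dx = 0.$$

Consequently:

$$\|p(\cdot, t)\|_{L^1(\mathbb{R}^d)} = \|p(\cdot, 0)\|_{L^1(\mathbb{R}^d)} = 1. \tag{6.7}$$

Hence, the total probability is conserved, as expected. Equation (6.7) simply means that

$$P(X_t \in \mathbb{R}^d) = 1, \qquad t \geq 0.$$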
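The conservation-law form has a direct numerical counterpart: if a scheme updates cell averages using differences of interface fluxes, the total mass changes only through the fluxes at the two ends of the grid. The sketch below is a one-dimensional explicit finite-volume step for $\partial_t p = -\partial_x J$ with $J = a p - \frac{1}{2}\partial_x(b p)$ and zero current at both ends. The function name, the central discretization and the grid parameters are illustrative assumptions, and the explicit step is stable only for sufficiently small `dt`.

```python
def fp_step(p, x, dx, dt, a, b):
    """One explicit finite-volume step of dp/dt = -dJ/dx, J = a p - (1/2) d(b p)/dx.

    p: cell averages; x: cell centres (uniform spacing dx); a, b: callables x -> float.
    The current is set to zero at the two outer interfaces.
    """
    n = len(p)
    flux = [0.0] * (n + 1)  # flux[i] is the current through the interface left of cell i
    for i in range(1, n):
        x_face = 0.5 * (x[i - 1] + x[i])
        p_face = 0.5 * (p[i - 1] + p[i])
        grad_bp = (b(x[i]) * p[i] - b(x[i - 1]) * p[i - 1]) / dx
        flux[i] = a(x_face) * p_face - 0.5 * grad_bp
    return [p[i] - dt * (flux[i + 1] - flux[i]) / dx for i in range(n)]


if __name__ == "__main__":
    n, dx = 100, 0.1
    x = [-5.0 + (i + 0.5) * dx for i in range(n)]
    p = [max(0.0, 1.0 - abs(xi - 1.0)) for xi in x]  # triangular initial density
    mass = sum(p) * dx
    p = [pi / mass for pi in p]
    for _ in range(500):
        p = fp_step(p, x, dx, 0.002, lambda z: -z, lambda z: 1.0)
    print(sum(p) * dx)  # total probability after 500 steps
```

Summing the update over all cells makes the interior fluxes cancel in pairs, which is the discrete analogue of integrating $\nabla \cdot J$ over the domain.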

When studying a diffusion process that can take values on the whole of $\mathbb{R}^d$, we study the pure initial value (Cauchy) problem for the Fokker-Planck equation, equation (6.1). The boundary condition was that the solution decays sufficiently fast at infinity. For ergodic diffusion processes this is equivalent to requiring that the solution of the backward Kolmogorov equation is an element of $L^2(\mu)$ where $\mu$ is the invariant measure of the process. There are many applications where it is important to study stochastic processes in bounded domains. In this case it is necessary to specify the value of the stochastic process (or equivalently of the solution to the Fokker-Planck equation) on the boundary.

To understand the type of boundary conditions that we can impose on the Fokker-Planck equation, let us consider the example of a random walk on the domain $\{0, 1, \dots, N\}$.$^2$ When the random walker reaches either the left or the right boundary we can either set

i. $X_0 = 0$ or $X_N = 0$, which means that the particle gets absorbed at the boundary;

ii. $X_0 = X_1$ or $X_N = X_{N-1}$, which means that the particle is reflected at the boundary;

iii. $X_0 = X_N$, which means that the particle is moving on a circle (i.e., we identify the left and right boundaries).

Hence, we can have absorbing, reflecting or periodic boundary conditions.

Consider the Fokker-Planck equation posed in $\Omega \subset \mathbb{R}^d$ where $\Omega$ is a bounded domain with smooth boundary. Let $J$ denote the probability current and let $n$ be the unit outward pointing normal vector to the surface. The above boundary conditions become:

i. The transition probability density vanishes on an absorbing boundary:

$$p(x, t) = 0, \qquad \text{on } \partial\Omega.$$

ii. There is no net flow of probability on a reflecting boundary:

$$n \cdot J(x, t) = 0, \qquad \text{on } \partial\Omega.$$

iii. The transition probability density is a periodic function in the case of periodic boundary conditions.

Notice that, using the terminology customary to PDE theory, absorbing boundary conditions correspond to Dirichlet boundary conditions and reflecting boundary conditions correspond to Neumann. Of course, one can consider more complicated, mixed boundary conditions.

$^2$ Of course, the random walk is not a diffusion process. However, as we have already seen, the Brownian motion can be defined as the limit of an appropriately rescaled random walk. A similar construction exists for more general diffusion processes.

Consider now a diffusion process in one dimension on the interval $[0, L]$. The boundary conditions are

$$\begin{aligned}
p(0, t) = p(L, t) = 0 &\quad \text{absorbing}, \\
J(0, t) = J(L, t) = 0 &\quad \text{reflecting}, \\
p(0, t) = p(L, t) &\quad \text{periodic},
\end{aligned}$$

where the probability current is defined in (6.6). An example of mixed boundary conditions would be absorbing boundary conditions at the left end and reflecting boundary conditions at the right end:

$$p(0, t) = J(L, t) = 0.$$

There is a complete classification of boundary conditions in one dimension, the Feller classification: the BC can be regular, exit, entrance and natural.

6.3.1 Brownian Motion

Brownian Motion on R

Set $a(y, t) \equiv 0$, $b(y, t) \equiv 2D > 0$. This diffusion process is the Brownian motion with diffusion coefficient $D$. Let us calculate the transition probability density of this process assuming that the Brownian particle is at $y$ at time $s$. The Fokker-Planck equation for the transition probability density $p(x, t \mid y, s)$ is:

$$\frac{\partial p}{\partial t} = D \frac{\partial^2 p}{\partial x^2}. \tag{6.8}$$

The solution to this equation is the Green's function (fundamental solution) of the heat equation:

$$p(x, t \mid y, s) = \frac{1}{\sqrt{4\pi D (t - s)}} \exp\left( -\frac{(x - y)^2}{4 D (t - s)} \right). \tag{6.9}$$

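Notice that using the Fokker-Planck equation for the Brownian motion we can immediately show that the mean squared displacement grows linearly in time. Assume that the Brownian particle starts at the origin at time $t = 0$. Then

$$\begin{aligned}
\frac{d}{dt} E W_t^2 &= \frac{d}{dt} \int_{\mathbb{R}} x^2\, p(x, t \mid 0, 0)\, dx \\
&= D \int_{\mathbb{R}} x^2\, \frac{\partial^2 p(x, t \mid 0, 0)}{\partial x^2}\, dx \\
&= 2D \int_{\mathbb{R}} p(x, t \mid 0, 0)\, dx = 2D,
\end{aligned}$$

where we performed two integrations by parts and we used the fact that, in view of (6.9), no boundary terms remain. From this calculation we conclude that

$$E W_t^2 = 2Dt.$$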
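The linear growth of the mean squared displacement can be confirmed directly from the Green's function (6.9) by numerical quadrature. The sketch below is only a check under illustrative assumptions: the function names, the truncation of the integral at ten standard deviations and the grid size are choices made for this example.

```python
import math


def brownian_transition_density(x, t, y=0.0, s=0.0, D=1.0):
    """Formula (6.9): Gaussian with mean y and variance 2 D (t - s)."""
    var = 2.0 * D * (t - s)
    return math.exp(-(x - y) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mean_squared_displacement(t, D, half_width=10.0, n=2000):
    """Trapezoidal approximation of the integral of x^2 p(x, t | 0, 0) over the real line."""
    sd = math.sqrt(2.0 * D * t)
    a = -half_width * sd
    h = 2.0 * half_width * sd / n
    total = 0.0
    for i in range(n + 1):
        x = a + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * x * x * brownian_transition_density(x, t, D=D)
    return total * h


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, mean_squared_displacement(t, D=1.3), 2.0 * 1.3 * t)
```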

Assume now that the initial condition $W_0$ of the Brownian particle is a random variable with distribution $\rho_0(x)$. To calculate the probability density function (distribution function) of the Brownian particle we need to solve the Fokker-Planck equation with initial condition $\rho_0(x)$. In other words, we need to take the average of the probability density function $p(x, t \mid y, 0)$ over all initial realizations of the Brownian particle. The solution of the Fokker-Planck equation, the distribution function, is

$$p(x, t) = \int p(x, t \mid y, 0)\, \rho_0(y)\, dy. \tag{6.10}$$

Notice that the transition probability density depends on $x$ and $y$ only through their difference. Thus, we can write $p(x, t \mid y, 0) = p(x - y, t)$. From (6.10) we see that the distribution function is given by the convolution between the transition probability density and the initial condition, as we know from the theory of partial differential equations:

$$p(x, t) = \int p(x - y, t)\, \rho_0(y)\, dy =: p \star \rho_0.$$

Brownian motion with absorbing boundary conditions

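We can also consider Brownian motion in a bounded domain, with either absorbing, reflecting or periodic boundary conditions. Set $D = \frac{1}{2}$ and consider the Fokker-Planck equation (6.8) on $[0,1]$ with absorbing boundary conditions:

$$\frac{\partial p}{\partial t} = \frac{1}{2}\frac{\partial^2 p}{\partial x^2}, \qquad p(0,t) = p(1,t) = 0. \tag{6.11}$$

We look for a solution of this equation in the form of a sine Fourier series:

$$p(x,t) = \sum_{n=1}^{\infty} p_n(t)\sin(n\pi x). \tag{6.12}$$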

Notice that the boundary conditions are automatically satisfied. The initial condition is

$$p(x,0) = \delta(x - x_0),$$

where we have assumed that $W_0 = x_0$. The Fourier coefficients of the initial condition are

$$p_n(0) = 2\int_0^1 \delta(x - x_0)\sin(n\pi x)\,dx = 2\sin(n\pi x_0).$$

We substitute the expansion (6.12) into (6.11) and use the orthogonality properties of the Fourier basis to obtain the equations

$$\dot{p}_n = -\frac{n^2\pi^2}{2}\,p_n, \qquad n = 1, 2, \dots$$

The solution of this equation is

$$p_n(t) = p_n(0)\,e^{-\frac{n^2\pi^2}{2}t}.$$

Consequently, the transition probability density for the Brownian motion on $[0,1]$ with absorbing boundary conditions is

$$p(x,t|x_0,0) = 2\sum_{n=1}^{\infty} e^{-\frac{n^2\pi^2}{2}t}\sin(n\pi x_0)\sin(n\pi x).$$

Notice that

$$\lim_{t\to\infty} p(x,t|x_0,0) = 0.$$

This is not surprising, since all Brownian particles will eventually get absorbed at the boundary.
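The decay of the absorbed density can be made quantitative by integrating the sine series over $[0,1]$, which gives the probability that the particle has not yet been absorbed. The sketch below is an illustration under stated assumptions: the series is truncated after `terms` modes, and the names `absorbed_density` and `survival_probability` are hypothetical helpers, not notation from the notes.

```python
import math

def absorbed_density(x, t, x0, terms=200):
    """Truncated sine series for p(x, t | x0, 0) with absorbing boundaries on [0, 1]."""
    total = 0.0
    for n in range(1, terms + 1):
        decay = math.exp(-0.5 * (n * math.pi) ** 2 * t)
        total += decay * math.sin(n * math.pi * x0) * math.sin(n * math.pi * x)
    return 2.0 * total

def survival_probability(t, x0, terms=200):
    """Integral of the truncated series over [0, 1]."""
    total = 0.0
    for n in range(1, terms + 1):
        # integral of sin(n pi x) over [0, 1] equals (1 - cos(n pi)) / (n pi)
        mode_integral = (1.0 - math.cos(n * math.pi)) / (n * math.pi)
        decay = math.exp(-0.5 * (n * math.pi) ** 2 * t)
        total += decay * math.sin(n * math.pi * x0) * mode_integral
    return 2.0 * total

# For large t only the n = 1 mode survives, so the survival probability
# decays like (4 / pi) * sin(pi x0) * exp(-pi^2 t / 2).
leading_term = 4.0 / math.pi * math.exp(-0.5 * math.pi ** 2)
print(survival_probability(1.0, 0.5), leading_term)
```

The slowest decay rate $\pi^2/2$ is the smallest eigenvalue of $-\frac{1}{2}\partial_x^2$ with Dirichlet boundary conditions on $[0,1]$.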

Brownian Motion with Reflecting Boundary Condition

Consider now Brownian motion on the interval $[0,1]$ with reflecting boundary conditions and set $D = \frac{1}{2}$ for simplicity. In order to calculate the transition probability density we have to solve the Fokker-Planck equation, which is the heat equation on $[0,1]$ with Neumann boundary conditions:

$$\frac{\partial p}{\partial t} = \frac{1}{2}\frac{\partial^2 p}{\partial x^2}, \qquad \partial_x p(0,t) = \partial_x p(1,t) = 0, \qquad p(x,0) = \delta(x - x_0).$$

The boundary conditions are satisfied by functions of the form $\cos(n\pi x)$. We look for a solution in the form of a cosine Fourier series

$$p(x,t) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n(t)\cos(n\pi x).$$

From the initial condition we obtain

$$a_n(0) = 2\int_0^1 \cos(n\pi x)\,\delta(x - x_0)\,dx = 2\cos(n\pi x_0).$$

We substitute the expansion into the PDE and use the orthogonality of the Fourier basis to obtain the equations for the Fourier coefficients:

$$\dot{a}_n = -\frac{n^2\pi^2}{2}\,a_n,$$

from which we deduce that

$$a_n(t) = a_n(0)\,e^{-\frac{n^2\pi^2}{2}t}.$$

Consequently

$$p(x,t|x_0,0) = 1 + 2\sum_{n=1}^{\infty}\cos(n\pi x_0)\cos(n\pi x)\,e^{-\frac{n^2\pi^2}{2}t}.$$

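Notice that Brownian motion with reflecting boundary conditions is an ergodic Markov process. To see this, let us consider the stationary Fokker-Planck equation

$$\frac{\partial^2 p_s}{\partial x^2} = 0, \qquad \partial_x p_s(0) = \partial_x p_s(1) = 0.$$

The unique normalized solution to this boundary value problem is $p_s(x) = 1$. Indeed, we multiply the equation by $p_s$, integrate by parts and use the boundary conditions to obtain

$$\int_0^1\left|\frac{dp_s}{dx}\right|^2 dx = 0,$$

from which it follows that $p_s(x) = 1$. Alternatively, by taking the limit of $p(x,t|x_0,0)$ as $t\to\infty$ we obtain the invariant distribution:

$$\lim_{t\to\infty} p(x,t|x_0,0) = 1.$$

Now we can calculate the stationary autocorrelation function:

$$\mathbb{E}(W(t)W(0)) = \int_0^1\!\!\int_0^1 x\,x_0\,p(x,t|x_0,0)\,p_s(x_0)\,dx\,dx_0
= \int_0^1\!\!\int_0^1 x\,x_0\left(1 + 2\sum_{n=1}^{\infty}\cos(n\pi x_0)\cos(n\pi x)\,e^{-\frac{n^2\pi^2}{2}t}\right)dx\,dx_0
= \frac{1}{4} + \frac{8}{\pi^4}\sum_{n=0}^{\infty}\frac{1}{(2n+1)^4}\,e^{-\frac{(2n+1)^2\pi^2}{2}t}.$$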
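Two limits of the stationary autocorrelation function of reflected Brownian motion provide a quick consistency check. At $t = 0$ it must equal the second moment of the uniform distribution on $[0,1]$, namely $1/3$, and as $t\to\infty$ it must tend to the squared mean $1/4$. The sketch below evaluates the series $\frac{1}{4} + \frac{8}{\pi^4}\sum_{n\ge 0}(2n+1)^{-4}e^{-(2n+1)^2\pi^2 t/2}$ after truncation; the function name and the number of terms are illustrative choices.

```python
import math

def reflected_autocorrelation(t, terms=2000):
    """Truncated series for E(W(t) W(0)) of stationary reflected Brownian motion on [0, 1]."""
    total = 0.0
    for n in range(terms):
        k = 2 * n + 1  # only odd cosine modes contribute
        total += math.exp(-0.5 * (k * math.pi) ** 2 * t) / k ** 4
    return 0.25 + 8.0 / math.pi ** 4 * total

print(reflected_autocorrelation(0.0))   # close to 1/3, the second moment of U(0, 1)
print(reflected_autocorrelation(50.0))  # close to 1/4, the squared mean of U(0, 1)
```

The value at $t = 0$ relies on the identity $\sum_{n\ge 0}(2n+1)^{-4} = \pi^4/96$.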

We set now $a(x,t) = -\alpha x$, $b(x,t) = 2D > 0$. With these drift and diffusion coefficients the Fokker-Planck equation becomes

$$\frac{\partial p}{\partial t} = \alpha\frac{\partial (xp)}{\partial x} + D\frac{\partial^2 p}{\partial x^2}. \tag{6.13}$$

This is the Fokker-Planck equation for the Ornstein-Uhlenbeck process. The corresponding stochastic differential equation is

$$dX_t = -\alpha X_t\,dt + \sqrt{2D}\,dW_t.$$

So, in addition to Brownian motion there is a linear force pulling the particle towards the origin. We know that Brownian motion is not a stationary process, since

the variance grows linearly in time. By adding a linear damping term, it is reasonable to expect that the resulting process can be stationary. As we have already

seen, this is indeed the case.

The transition probability density $p_{OU}(x,t|y,s)$ for an OU particle that is located at $y$ at time $s$ is

$$p_{OU}(x,t|y,s) = \sqrt{\frac{\alpha}{2\pi D\left(1 - e^{-2\alpha(t-s)}\right)}}\,\exp\left(-\frac{\alpha\left(x - e^{-\alpha(t-s)}y\right)^2}{2D\left(1 - e^{-2\alpha(t-s)}\right)}\right). \tag{6.14}$$

We obtained this formula in Example (4.2.4) (for $\alpha = D = 1$) by using the fact that the OU process can be defined through a time change of the Brownian motion. We can also derive it by solving equation (6.13). To obtain (6.14), we first take the Fourier transform of the transition probability density with respect to $x$, solve the resulting first order PDE using the method of characteristics and then take the inverse Fourier transform.$^3$

Notice that from formula (6.14) it immediately follows that in the limit as the friction coefficient $\alpha$ goes to $0$, the transition probability of the OU process converges to the transition probability of Brownian motion. Furthermore, by taking the long time limit in (6.14) we obtain (we have set $s = 0$)

$$\lim_{t\to+\infty} p_{OU}(x,t|y,0) = \sqrt{\frac{\alpha}{2\pi D}}\exp\left(-\frac{\alpha x^2}{2D}\right),$$

irrespective of the initial position $y$ of the OU particle. This is to be expected, since as we have already seen the Ornstein-Uhlenbeck process is an ergodic Markov process, with a Gaussian invariant distribution

$$p_s(x) = \sqrt{\frac{\alpha}{2\pi D}}\exp\left(-\frac{\alpha x^2}{2D}\right). \tag{6.15}$$

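Using now (6.14) and (6.15) we obtain the stationary joint probability density

$$p_2(x,t|y,0) = p(x,t|y,0)\,p_s(y) = \frac{\alpha}{2\pi D\sqrt{1 - e^{-2\alpha t}}}\exp\left(-\frac{\alpha\left(x^2 + y^2 - 2xy\,e^{-\alpha t}\right)}{2D\left(1 - e^{-2\alpha t}\right)}\right).$$

More generally, we have

$$p_2(x,t|y,s) = \frac{\alpha}{2\pi D\sqrt{1 - e^{-2\alpha|t-s|}}}\exp\left(-\frac{\alpha\left(x^2 + y^2 - 2xy\,e^{-\alpha|t-s|}\right)}{2D\left(1 - e^{-2\alpha|t-s|}\right)}\right). \tag{6.16}$$

Now we can calculate the stationary autocorrelation function of the OU process:

$$\mathbb{E}(X(t)X(s)) = \int\!\!\int xy\,p_2(x,t|y,s)\,dx\,dy \tag{6.17}$$

$$= \frac{D}{\alpha}\,e^{-\alpha|t-s|}. \tag{6.18}$$

The double integral can be calculated by an appropriate change of variables. The calculation is similar to the one presented in Section 2.6. See Exercise 2.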

3

This calculation will be presented in Section ?? for the Fokker-Planck equation of a linear SDE

in arbitrary dimensions.

Assume now that the initial conditions of the OU process are random, distributed according to a distribution $\rho_0(x)$. As in the case of a Brownian particle, the probability density function (distribution function) is given by the convolution integral

$$p(x,t) = \int p(x-y,t)\,\rho_0(y)\,dy, \tag{6.19}$$

where $p(x-y,t) := p(x,t|y,0)$. When the OU process is distributed initially according to its invariant distribution, $\rho_0(x) = p_s(x)$ given by (6.15), then the Ornstein-Uhlenbeck process becomes stationary. The distribution function is given by $p_s(x)$ at all times and the joint probability density is given by (6.16).

Knowledge of the distribution function enables us to calculate all moments of the OU process using the formula

$$\mathbb{E}\left((X_t)^n\right) = \int x^n p(x,t)\,dx.$$

We will calculate the moments by using the Fokker-Planck equation, rather than the explicit formula for the transition probability density. Let $M_n(t)$ denote the $n$th moment of the OU process,

$$M_n := \int_{\mathbb{R}} x^n p(x,t)\,dx, \qquad n = 0, 1, 2, \dots$$

Let $n = 0$. We integrate the FP equation over $\mathbb{R}$ to obtain

$$\int\frac{\partial p}{\partial t} = \alpha\int\frac{\partial (yp)}{\partial y} + D\int\frac{\partial^2 p}{\partial y^2} = 0,$$

after an integration by parts and using the fact that $p(x,t)$ decays sufficiently fast at infinity. Consequently:

$$\frac{d}{dt}M_0 = 0 \quad\Rightarrow\quad M_0(t) = M_0(0) = 1.$$

In other words, since

$$\frac{d}{dt}\|p\|_{L^1(\mathbb{R})} = 0,$$

we deduce that

$$\int_{\mathbb{R}} p(x,t)\,dx = \int_{\mathbb{R}} p(x,t=0)\,dx = 1,$$

which means that the total probability is conserved, as we have already shown for the general Fokker-Planck equation in arbitrary dimensions. Let $n = 1$. We multiply the FP equation for the OU process by $x$, integrate over $\mathbb{R}$ and perform an integration by parts to obtain:

$$\frac{d}{dt}M_1 = -\alpha M_1.$$

Consequently, the first moment converges exponentially fast to $0$:

$$M_1(t) = e^{-\alpha t}M_1(0).$$

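Let now $n \ge 2$. We multiply the FP equation for the OU process by $x^n$ and integrate by parts (once on the first term on the RHS and twice on the second) to obtain:

$$\frac{d}{dt}\int y^n p = -\alpha n\int y^n p + Dn(n-1)\int y^{n-2}p.$$

Or, equivalently:

$$\frac{d}{dt}M_n = -\alpha n M_n + Dn(n-1)M_{n-2}, \qquad n \ge 2.$$

This is a first order linear inhomogeneous differential equation. We can solve it using the variation of constants formula:

$$M_n(t) = e^{-\alpha n t}M_n(0) + Dn(n-1)\int_0^t e^{-\alpha n(t-s)}M_{n-2}(s)\,ds. \tag{6.20}$$

We can use this formula, together with the formulas for $M_0$ and $M_1$, in order to calculate all higher order moments in an iterative way. For example, for $n = 2$ we have

$$M_2(t) = e^{-2\alpha t}M_2(0) + 2D\int_0^t e^{-2\alpha(t-s)}M_0(s)\,ds
= e^{-2\alpha t}M_2(0) + \frac{D}{\alpha}e^{-2\alpha t}\left(e^{2\alpha t} - 1\right)
= \frac{D}{\alpha} + e^{-2\alpha t}\left(M_2(0) - \frac{D}{\alpha}\right).$$

Consequently, the second moment converges exponentially fast to its stationary value $\frac{D}{\alpha}$. The stationary moments of the OU process are:

$$\langle y^n\rangle_{OU} := \sqrt{\frac{\alpha}{2\pi D}}\int_{\mathbb{R}} y^n e^{-\frac{\alpha y^2}{2D}}\,dy =
\begin{cases}
1\cdot 3\cdots(n-1)\left(\dfrac{D}{\alpha}\right)^{n/2}, & n \text{ even},\\[2mm]
0, & n \text{ odd}.
\end{cases}$$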
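The closed form for the second moment can be cross-checked against a direct numerical integration of the moment equation $\dot{M}_2 = -2\alpha M_2 + 2D M_0$ with $M_0 = 1$. The sketch below uses a classical fourth order Runge-Kutta step; the integrator, the function names and the parameter values are illustrative assumptions and are not taken from the notes.

```python
import math

def second_moment_exact(t, m2_0, alpha, D):
    """Closed form M_2(t) = D/alpha + exp(-2 alpha t) (M_2(0) - D/alpha)."""
    return D / alpha + math.exp(-2.0 * alpha * t) * (m2_0 - D / alpha)

def second_moment_rk4(t_final, m2_0, alpha, D, steps=2000):
    """Integrate dM_2/dt = -2 alpha M_2 + 2 D (with M_0 = 1) using RK4."""
    def rhs(m2):
        return -2.0 * alpha * m2 + 2.0 * D
    h = t_final / steps
    m2 = m2_0
    for _ in range(steps):
        k1 = rhs(m2)
        k2 = rhs(m2 + 0.5 * h * k1)
        k3 = rhs(m2 + 0.5 * h * k2)
        k4 = rhs(m2 + h * k3)
        m2 += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return m2

alpha, D, m2_0 = 1.5, 0.8, 3.0   # illustrative values
numerical = second_moment_rk4(2.0, m2_0, alpha, D)
exact = second_moment_exact(2.0, m2_0, alpha, D)
print(numerical, exact, D / alpha)
```

The deviation from the stationary value $D/\alpha$ decays like $e^{-2\alpha t}$, so it shrinks by a factor $e^{-2\alpha}$ per unit of time.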

One can check that

$$\lim_{t\to\infty} M_n(t) = \langle y^n\rangle_{OU} \tag{6.21}$$

exponentially fast$^4$. Since we have already shown that the distribution function of the OU process converges to the Gaussian distribution in the limit as $t\to+\infty$, it is not surprising that the moments also converge to the moments of the invariant Gaussian measure. What is not so obvious is that the convergence is exponentially fast. In the next section we will prove that the Ornstein-Uhlenbeck process does, indeed, converge to equilibrium exponentially fast. Of course, if the initial conditions of the OU process are stationary, then the moments of the OU process become independent of time and are given by their equilibrium values

$$M_n(t) = M_n(0) = \langle x^n\rangle_{OU}. \tag{6.22}$$

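We set $a(x) = \mu x$, $b(x) = \frac{1}{2}\sigma^2 x^2$. This is the geometric Brownian motion. The corresponding stochastic differential equation is

$$dX_t = \mu X_t\,dt + \sigma X_t\,dW_t.$$

This equation is one of the basic models in mathematical finance. The coefficient $\sigma$ is called the volatility. The generator of this process is

$$L = \mu x\frac{\partial}{\partial x} + \frac{\sigma^2 x^2}{2}\frac{\partial^2}{\partial x^2}.$$

Notice that this operator is not uniformly elliptic. The Fokker-Planck equation of the geometric Brownian motion is:

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}(\mu x\,p) + \frac{\partial^2}{\partial x^2}\left(\frac{\sigma^2 x^2}{2}\,p\right).$$

We can easily obtain an equation for the $n$th moment of the geometric Brownian motion:

$$\frac{d}{dt}M_n = \left(\mu n + \frac{\sigma^2}{2}n(n-1)\right)M_n, \qquad n \ge 2.$$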

4

Of course, we need to assume that the initial distribution has finite moments of all orders in order

to justify the above calculations.

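The solution of this equation is

$$M_n(t) = e^{\left(\mu + (n-1)\frac{\sigma^2}{2}\right)nt}M_n(0), \qquad n \ge 2,$$

and

$$M_1(t) = e^{\mu t}M_1(0).$$

Notice that the $n$th moment might diverge as $t\to\infty$, depending on the values of $\mu$ and $\sigma$. Consider for example the second moment and assume that $\mu < 0$. We have

$$M_2(t) = e^{(2\mu + \sigma^2)t}M_2(0),$$

which diverges when $\sigma^2 + 2\mu > 0$.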
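The moment formulas for geometric Brownian motion can be verified independently of the Fokker-Planck equation. The sketch below assumes the Itô interpretation of the SDE, for which the solution is $X_t = X_0\exp\left((\mu - \sigma^2/2)t + \sigma W_t\right)$, and computes $\mathbb{E}X_t^n$ by integrating against the standard Gaussian density with the trapezoidal rule. The function name `gbm_moment` and the parameter values are illustrative.

```python
import math

def gbm_moment(n, t, mu, sigma, x0=1.0, half_width=12.0, num_points=40001):
    """E[X_t^n] for X_t = x0 exp((mu - sigma^2/2) t + sigma sqrt(t) Z), Z ~ N(0, 1)."""
    h = 2.0 * half_width / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        z = -half_width + i * h
        weight = 0.5 if i in (0, num_points - 1) else 1.0
        x = x0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
        gaussian = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * x ** n * gaussian
    return total * h

mu, sigma, t = -0.25, 1.0, 1.0   # negative drift, but 2 mu + sigma^2 > 0
first = gbm_moment(1, t, mu, sigma)
second = gbm_moment(2, t, mu, sigma)
print(first, math.exp(mu * t))                        # mean decays
print(second, math.exp((2.0 * mu + sigma ** 2) * t))  # second moment grows
```

With these values the mean decays while the second moment grows, which is the situation $\mu < 0$, $\sigma^2 + 2\mu > 0$.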

The Ornstein-Uhlenbeck process is one of the few stochastic processes for which

we can calculate explicitly the solution of the corresponding SDE, the solution of

the Fokker-Planck equation as well as the eigenfunctions of the generator of the

process. In this section we will show that the eigenfunctions of the OU process are

the Hermite polynomials. We will also study various properties of the generator

of the OU process. In the next section we will show that many of the properties

of the OU process (ergodicity, self-adjointness of the generator, exponentially fast

convergence to equilibrium, real, discrete spectrum) are shared by a large class of

diffusion processes, namely those for which the drift term can be written in terms

of the gradient of a smooth function.

The generator of the $d$-dimensional OU process is (we set the drift coefficient equal to $1$)

$$L = -p\cdot\nabla_p + \beta^{-1}\Delta_p, \tag{6.23}$$

where $\beta$ denotes the inverse temperature. We have already seen that the OU process is an ergodic Markov process whose unique invariant measure is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^d$ with Gaussian density $\rho_\beta \in C^\infty(\mathbb{R}^d)$,

$$\rho_\beta(p) = \frac{1}{(2\pi\beta^{-1})^{d/2}}\,e^{-\beta\frac{|p|^2}{2}}.$$

The natural function space for studying the generator of the OU process is the $L^2$ space weighted by the invariant measure of the process. This is a separable Hilbert space with norm

$$\|f\|_\rho^2 := \int_{\mathbb{R}^d} f^2\rho_\beta\,dp$$

and corresponding inner product

$$(f,h)_\rho = \int_{\mathbb{R}^d} f h\,\rho_\beta\,dp.$$

Sobolev spaces. See Exercise .

The reason why this is the right function space in which to study questions

related to convergence to equilibrium is that the generator of the OU process becomes a self-adjoint operator in this space. In fact, L defined in (6.23) has many

nice properties that are summarized in the following proposition.

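Proposition 6.4.1. The operator $L$ has the following properties:

i. For every $f, h \in C_0^2(\mathbb{R}^d)\cap L^2_\rho(\mathbb{R}^d)$,

$$(Lf,h)_\rho = (f,Lh)_\rho = -\beta^{-1}\int_{\mathbb{R}^d}\nabla f\cdot\nabla h\,\rho_\beta\,dp. \tag{6.24}$$

ii. $L$ is a non-positive operator on $L^2_\rho$.

iii. $Lf = 0$ iff $f \equiv$ const.

iv. For every $f \in C_0^2(\mathbb{R}^d)\cap L^2_\rho(\mathbb{R}^d)$ with $\int f\rho_\beta = 0$,

$$(-Lf,f)_\rho \ge \|f\|_\rho^2. \tag{6.25}$$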

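Proof. Equation (6.24) follows from an integration by parts:

$$(Lf,h)_\rho = \int -p\cdot\nabla f\,h\,\rho_\beta\,dp + \beta^{-1}\int\Delta f\,h\,\rho_\beta\,dp
= \int -p\cdot\nabla f\,h\,\rho_\beta\,dp - \beta^{-1}\int\nabla f\cdot\nabla h\,\rho_\beta\,dp + \int p\cdot\nabla f\,h\,\rho_\beta\,dp
= -\beta^{-1}(\nabla f,\nabla h)_\rho.$$

Non-positivity of $L$ follows from (6.24) upon setting $h = f$:

$$(Lf,f)_\rho = -\beta^{-1}\|\nabla f\|_\rho^2 \le 0.$$

Similarly, multiplying the equation $Lf = 0$ by $f\rho_\beta$, integrating over $\mathbb{R}^d$ and using (6.24) gives

$$\|\nabla f\|_\rho = 0,$$

from which we deduce that $f \equiv$ const. The spectral gap follows from (6.24), together with Poincaré's inequality for Gaussian measures:

$$\int_{\mathbb{R}^d} f^2\rho_\beta\,dp \le \beta^{-1}\int_{\mathbb{R}^d}|\nabla f|^2\rho_\beta\,dp \tag{6.26}$$

for every $f$ with $\int f\rho_\beta = 0$. Indeed, upon combining (6.24) with (6.26) we obtain:

$$(Lf,f)_\rho = -\beta^{-1}\|\nabla f\|_\rho^2 \le -\|f\|_\rho^2.$$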

the compactness of its resolvent, implies that L has discrete spectrum. Furthermore, since it is also a self-adjoint operator, we have that its eigenfunctions form

a countable orthonormal basis for the separable Hilbert space L2 . In fact, we can

calculate the eigenvalues and eigenfunctions of the generator of the OU process in

one dimension.5

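Theorem 6.4.2. Consider the eigenvalue problem for the generator of the OU process in one dimension

$$-Lf_n = \lambda_n f_n. \tag{6.27}$$

Then the eigenvalues of $L$ are the nonnegative integers:

$$\lambda_n = n, \qquad n = 0, 1, 2, \dots$$

The corresponding eigenfunctions are the rescaled Hermite polynomials:

$$f_n(p) = \frac{1}{\sqrt{n!}}H_n\left(\sqrt{\beta}\,p\right), \tag{6.28}$$

where

$$H_n(p) = (-1)^n e^{\frac{p^2}{2}}\frac{d^n}{dp^n}\left(e^{-\frac{p^2}{2}}\right). \tag{6.29}$$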

5

The multidimensional problem can be treated similarly by taking tensor products of the eigenfunctions of the one dimensional problem.

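We will need some additional properties of the Hermite polynomials, which we state here without proof (we use the notation $\rho_1 = \rho$).

Proposition 6.4.3. For each $\lambda \in \mathbb{C}$, set

$$H(p;\lambda) = e^{\lambda p - \frac{\lambda^2}{2}}, \qquad p \in \mathbb{R}.$$

Then

$$H(p;\lambda) = \sum_{n=0}^{\infty}\frac{\lambda^n}{n!}H_n(p), \qquad p \in \mathbb{R}. \tag{6.30}$$

Furthermore, $\left\{H_n(p)/\sqrt{n!} : n \in \mathbb{N}\right\}$ is an orthonormal basis in $L^2(\mathbb{C};\rho)$.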

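From (6.29) it is clear that $H_n$ is a polynomial of degree $n$. Furthermore, only odd (even) powers appear in $H_n(p)$ when $n$ is odd (even), and the coefficient multiplying $p^n$ in $H_n(p)$ is always $1$. The orthonormality of the modified Hermite polynomials $f_n(p)$ defined in (6.28) implies that

$$\int_{\mathbb{R}} f_n(p)f_m(p)\,\rho_\beta(p)\,dp = \delta_{nm}.$$

The first few Hermite polynomials and the corresponding rescaled/normalized eigenfunctions of the generator of the OU process are:

$$\begin{aligned}
H_0(p) &= 1, & f_0(p) &= 1,\\
H_1(p) &= p, & f_1(p) &= \sqrt{\beta}\,p,\\
H_2(p) &= p^2 - 1, & f_2(p) &= \frac{\beta}{\sqrt{2}}p^2 - \frac{1}{\sqrt{2}},\\
H_3(p) &= p^3 - 3p, & f_3(p) &= \frac{\beta^{3/2}}{\sqrt{6}}p^3 - \frac{3\sqrt{\beta}}{\sqrt{6}}p,\\
H_4(p) &= p^4 - 6p^2 + 3, & f_4(p) &= \frac{1}{\sqrt{24}}\left(\beta^2p^4 - 6\beta p^2 + 3\right),\\
H_5(p) &= p^5 - 10p^3 + 15p, & f_5(p) &= \frac{1}{\sqrt{120}}\left(\beta^{5/2}p^5 - 10\beta^{3/2}p^3 + 15\beta^{1/2}p\right).
\end{aligned}$$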
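The Hermite polynomials defined by (6.29) satisfy the standard three-term recurrence $H_{k+1}(p) = pH_k(p) - kH_{k-1}(p)$, which gives a convenient way to generate them. The following sketch stores each polynomial as a list of integer coefficients in increasing powers of $p$; the function name `hermite_coefficients` and the list representation are illustrative choices.

```python
def hermite_coefficients(n):
    """Coefficients [c_0, ..., c_n] of the Hermite polynomial H_n (increasing powers of p)."""
    previous, current = [1], [0, 1]  # H_0 = 1 and H_1 = p
    if n == 0:
        return previous
    for k in range(1, n):
        # H_{k+1} = p * H_k - k * H_{k-1}
        shifted = [0] + current                                  # multiplication by p
        padded = previous + [0] * (len(shifted) - len(previous))
        previous, current = current, [a - k * b for a, b in zip(shifted, padded)]
    return current

for degree in range(6):
    print(degree, hermite_coefficients(degree))
```

For instance, the list for degree 4 encodes $3 - 6p^2 + p^4$, and the leading coefficient is always $1$.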

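The proof of Theorem 6.4.2 follows essentially from the properties of the Hermite polynomials. First, notice that by combining (6.28) and (6.30) we obtain

$$H\left(\sqrt{\beta}\,p,\lambda\right) = \sum_{n=0}^{+\infty}\frac{\lambda^n}{\sqrt{n!}}f_n(p).$$

We differentiate this formula with respect to $p$ to obtain

$$\lambda\sqrt{\beta}\,H\left(\sqrt{\beta}\,p,\lambda\right) = \sum_{n=1}^{+\infty}\frac{\lambda^n}{\sqrt{n!}}\,\partial_p f_n(p),$$

since $f_0 = 1$. From this equation we obtain

$$H\left(\sqrt{\beta}\,p,\lambda\right) = \sum_{n=1}^{+\infty}\frac{\lambda^{n-1}}{\sqrt{\beta}\sqrt{n!}}\,\partial_p f_n(p) = \sum_{n=0}^{+\infty}\frac{\lambda^n}{\sqrt{\beta}\sqrt{(n+1)!}}\,\partial_p f_{n+1}(p),$$

from which we deduce that

$$\frac{1}{\sqrt{\beta}}\,\partial_p f_k = \sqrt{k}\,f_{k-1}. \tag{6.31}$$

Similarly, if we differentiate (6.30) with respect to $\lambda$ we obtain

$$(p - \lambda)H(p;\lambda) = \sum_{k=0}^{+\infty}\frac{\lambda^k}{k!}\,pH_k(p) - \sum_{k=1}^{+\infty}\frac{\lambda^k}{(k-1)!}H_{k-1}(p) = \sum_{k=0}^{+\infty}\frac{\lambda^k}{k!}H_{k+1}(p),$$

from which we obtain the recurrence relation

$$pH_k = H_{k+1} + kH_{k-1}.$$

Upon rescaling, we deduce that

$$pf_k = \sqrt{\beta^{-1}(k+1)}\,f_{k+1} + \sqrt{\beta^{-1}k}\,f_{k-1}. \tag{6.32}$$

We combine this equation with (6.31) to obtain

$$\left(\sqrt{\beta}\,p - \frac{1}{\sqrt{\beta}}\,\partial_p\right)f_k = \sqrt{k+1}\,f_{k+1}. \tag{6.33}$$

Now we observe that

$$-Lf_n = \left(\sqrt{\beta}\,p - \frac{1}{\sqrt{\beta}}\,\partial_p\right)\frac{1}{\sqrt{\beta}}\,\partial_p f_n = \left(\sqrt{\beta}\,p - \frac{1}{\sqrt{\beta}}\,\partial_p\right)\sqrt{n}\,f_{n-1} = nf_n.$$

The operators $\sqrt{\beta}\,p - \frac{1}{\sqrt{\beta}}\,\partial_p$ and $\frac{1}{\sqrt{\beta}}\,\partial_p$ play the role of creation and annihilation operators. In fact, we can generate all eigenfunctions of the OU operator from the ground state $f_0 = 1$ through a repeated application of the creation operator.

Set $\beta = 1$ and let $a^- = \partial_p$. Then the $L^2_\rho$-adjoint of $a^-$ is

$$a^+ = -\partial_p + p.$$

Then the generator of the OU process can be written in the form

$$L = -a^+a^-.$$

Furthermore, $a^+$ and $a^-$ satisfy the following commutation relation

$$[a^+, a^-] = -1.$$

Define now the creation and annihilation operators on $C^1(\mathbb{R})$ by

$$S^+ = \frac{1}{\sqrt{n+1}}\,a^+$$

and

$$S^- = \frac{1}{\sqrt{n}}\,a^-.$$

Then

$$S^+f_n = f_{n+1} \quad\text{and}\quad S^-f_n = f_{n-1}. \tag{6.34}$$

In particular,

$$f_n = \frac{1}{\sqrt{n!}}\,(a^+)^n 1 \tag{6.35}$$

and

$$1 = \frac{1}{\sqrt{n!}}\,(a^-)^n f_n. \tag{6.36}$$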

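Proof. Let $f, h \in C^1(\mathbb{R})\cap L^2_\rho$. We calculate

$$\int\partial_p f\,h\,\rho = -\int f\,\partial_p(h\rho) \tag{6.37}$$

$$= \int f\left(-\partial_p + p\right)h\,\rho. \tag{6.38}$$

Now,

$$a^+a^- = (-\partial_p + p)\,\partial_p = -\partial_p^2 + p\,\partial_p = -L.$$

Similarly,

$$a^-a^+ = -\partial_p^2 + p\,\partial_p + 1,$$

and

$$[a^+, a^-] = -1.$$

Formulas (6.34) follow from (6.31) and (6.33). Finally, formulas (6.35) and (6.36) are a consequence of (6.31) and (6.33), together with a simple induction argument.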

Notice that upon using (6.35) and (6.36) and the fact that $a^+$ is the adjoint of $a^-$ we can easily check the orthonormality of the eigenfunctions:

$$\int f_n f_m\,\rho = \frac{1}{\sqrt{m!}}\int f_n\,(a^+)^m 1\,\rho = \frac{1}{\sqrt{m!}}\int (a^-)^m f_n\,\rho = \int f_{n-m}\,\rho = \delta_{nm}.$$

From the eigenfunctions and eigenvalues of $L$ we can easily obtain the eigenvalues and eigenfunctions of $L^*$, the Fokker-Planck operator.

Lemma 6.4.5. The eigenvalues and eigenfunctions of the Fokker-Planck operator

$$L^*\cdot = \partial_p^2\,\cdot + \partial_p(p\,\cdot)$$

are

$$\lambda_n^* = -n, \qquad n = 0, 1, 2, \dots \qquad\text{and}\qquad f_n^* = \rho f_n.$$

Proof. We have

$$L^*(\rho f_n) = f_n L^*\rho + \rho Lf_n = -n\rho f_n.$$

An immediate corollary of the above calculation is that the $n$th eigenfunction of the Fokker-Planck operator is given by

$$f_n^* = \rho(p)\,\frac{1}{\sqrt{n!}}\,(a^+)^n 1.$$

The stationary Ornstein-Uhlenbeck process is an example of a reversible Markov process:

Definition 6.5.1. A stationary stochastic process $X_t$ is time reversible if for every $m \in \mathbb{N}$ and every $t_1, t_2, \dots, t_m \in \mathbb{R}^+$, the joint probability distribution is invariant under time reversals:

$$p(X_{t_1}, X_{t_2}, \dots, X_{t_m}) = p(X_{-t_1}, X_{-t_2}, \dots, X_{-t_m}). \tag{6.39}$$

In this section we study a more general class (in fact, as we will see later, the most general class) of reversible Markov processes, namely stochastic perturbations of ODEs with a gradient structure.

Let $V(x) = \frac{1}{2}\alpha x^2$. The generator of the OU process can be written as:

$$L = -\partial_x V\,\partial_x + \beta^{-1}\partial_x^2.$$

Consider diffusion processes with a potential $V(x)$, not necessarily quadratic:

$$L = -\nabla V(x)\cdot\nabla + \beta^{-1}\Delta, \tag{6.40}$$

where $\beta^{-1} = k_BT$, $k_B$ is Boltzmann's constant and $T$ the absolute temperature. The corresponding stochastic differential equation is

$$dX_t = -\nabla V(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t. \tag{6.41}$$

Hence, we have a gradient ODE $\dot{X}_t = -\nabla V(X_t)$ perturbed by noise due to thermal fluctuations. The corresponding FP equation is:

$$\frac{\partial p}{\partial t} = \nabla\cdot(\nabla V\,p) + \beta^{-1}\Delta p. \tag{6.42}$$

It is not possible to calculate the time dependent solution of this equation for an arbitrary potential. We can, however, always calculate the stationary solution, if it exists.

Definition 6.5.2. A potential $V$ will be called confining if $\lim_{|x|\to+\infty}V(x) = +\infty$ and

$$e^{-\beta V(x)} \in L^1(\mathbb{R}^d) \tag{6.43}$$

for all $\beta \in \mathbb{R}^+$.

Gradient SDEs in a confining potential are ergodic:

Proposition 6.5.3. Let $V(x)$ be a smooth confining potential. Then the Markov process with generator (6.40) is ergodic. The unique invariant distribution is the Gibbs distribution

$$p(x) = \frac{1}{Z}\,e^{-\beta V(x)}, \tag{6.44}$$

where the normalization factor $Z$ is the partition function

$$Z = \int_{\mathbb{R}^d}e^{-\beta V(x)}\,dx.$$

The fact that the Gibbs distribution is an invariant distribution follows by direct substitution. Uniqueness follows from a PDE argument (see discussion below). It is more convenient to normalize the solution of the Fokker-Planck equation with respect to the invariant distribution.

Theorem 6.5.4. Let $p(x,t)$ be the solution of the Fokker-Planck equation (6.42), assume that (6.43) holds and let $\rho_\beta(x)$ be the Gibbs distribution (6.44). Define $h(x,t)$ through

$$p(x,t) = h(x,t)\,\rho_\beta(x).$$

Then the function $h$ satisfies the backward Kolmogorov equation:

$$\frac{\partial h}{\partial t} = -\nabla V\cdot\nabla h + \beta^{-1}\Delta h, \qquad h(x,0) = p(x,0)\,\rho_\beta^{-1}(x). \tag{6.45}$$

Proof. The initial condition follows from the definition of $h$. We calculate the gradient and Laplacian of $p$:

$$\nabla p = \rho_\beta\nabla h - \rho_\beta h\beta\nabla V$$

and

$$\Delta p = \rho_\beta\Delta h - 2\rho_\beta\beta\nabla V\cdot\nabla h - h\beta\Delta V\rho_\beta + h|\nabla V|^2\beta^2\rho_\beta.$$

We substitute these formulas into the FP equation to obtain

$$\rho_\beta\frac{\partial h}{\partial t} = \rho_\beta\left(-\nabla V\cdot\nabla h + \beta^{-1}\Delta h\right),$$

from which the claim follows.

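Consequently, in order to study properties of solutions to the FP equation, it is sufficient to study the backward equation (6.45). The generator $L$ is self-adjoint in the right function space. We define the weighted $L^2$ space $L^2_\rho$:

$$L^2_\rho = \left\{f \;\Big|\; \int_{\mathbb{R}^d}|f|^2\rho_\beta(x)\,dx < \infty\right\},$$

where $\rho_\beta(x)$ is the Gibbs distribution. This is a Hilbert space with inner product

$$(f,h)_\rho = \int_{\mathbb{R}^d}fh\,\rho_\beta(x)\,dx.$$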

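Theorem 6.5.5. Assume that $V(x)$ is a smooth potential and assume that condition (6.43) holds. Then the operator

$$L = -\nabla V(x)\cdot\nabla + \beta^{-1}\Delta$$

is self-adjoint in $L^2_\rho$. Furthermore, it is non-positive and its kernel consists of constants.

Proof. Let $f, h \in C_0^2(\mathbb{R}^d)$. We calculate

$$(Lf,h)_\rho = \int_{\mathbb{R}^d}\left(-\nabla V\cdot\nabla + \beta^{-1}\Delta\right)f\,h\,\rho_\beta\,dx
= -\int_{\mathbb{R}^d}(\nabla V\cdot\nabla f)\,h\,\rho_\beta\,dx - \beta^{-1}\int_{\mathbb{R}^d}\nabla f\cdot\nabla h\,\rho_\beta\,dx - \beta^{-1}\int_{\mathbb{R}^d}\nabla f\cdot\nabla\rho_\beta\,h\,dx
= -\beta^{-1}\int_{\mathbb{R}^d}\nabla f\cdot\nabla h\,\rho_\beta\,dx,$$

where we used $\nabla\rho_\beta = -\beta\nabla V\rho_\beta$. Self-adjointness follows from the symmetry of this expression in $f$ and $h$. If we set $f = h$ in the above equation we get

$$(Lf,f)_\rho = -\beta^{-1}\|\nabla f\|_\rho^2,$$

which shows that $L$ is non-positive.

Clearly, constants are in the null space of $L$. Assume that $f \in \mathcal{N}(L)$. Then, from the above equation we get

$$0 = -\beta^{-1}\|\nabla f\|_\rho^2,$$

and, consequently, $f$ is a constant.

Remark 6.5.6. The expression $(-Lf,f)_\rho$ is called the Dirichlet form of the operator $L$. In the case of a gradient flow, it takes the form

$$(-Lf,f)_\rho = \beta^{-1}\|\nabla f\|_\rho^2. \tag{6.46}$$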

Using the properties of the generator $L$ we can show that the solution of the Fokker-Planck equation converges to the Gibbs distribution exponentially fast. For this we need to use the fact that, under appropriate assumptions on the potential $V$, the Gibbs measure $\mu(dx) = Z^{-1}e^{-\beta V(x)}\,dx$ satisfies Poincaré's inequality:

Theorem 6.5.7. Assume that the potential $V$ satisfies the convexity condition

$$D^2V \ge \lambda I.$$

Then the corresponding Gibbs measure satisfies the Poincaré inequality with constant $\lambda$:

$$\int_{\mathbb{R}^d}f\rho_\beta = 0 \quad\Rightarrow\quad \|\nabla f\|_\rho \ge \sqrt{\lambda}\,\|f\|_\rho. \tag{6.47}$$

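Theorem 6.5.8. Assume that $p(x,0) \in L^2(e^{\beta V})$. Then the solution $p(x,t)$ of the Fokker-Planck equation (6.42) converges to the Gibbs distribution exponentially fast:

$$\left\|p(\cdot,t) - Z^{-1}e^{-\beta V}\right\|_{\rho^{-1}} \le e^{-\lambda Dt}\left\|p(\cdot,0) - Z^{-1}e^{-\beta V}\right\|_{\rho^{-1}},$$

where $D = \beta^{-1}$.

Proof. We use (6.45), (6.46) and (6.47) to calculate

$$-\frac{d}{dt}\|h - 1\|_\rho^2 = -2\left(\frac{\partial h}{\partial t}, h - 1\right)_\rho = -2\,(Lh, h - 1)_\rho = 2\beta^{-1}\|\nabla(h - 1)\|_\rho^2 \ge 2\beta^{-1}\lambda\|h - 1\|_\rho^2. \tag{6.48}$$

Our assumption on $p(\cdot,0)$ implies that $h(\cdot,0) \in L^2_\rho$. Consequently, the above calculation shows that

$$\|h(\cdot,t) - 1\|_\rho \le e^{-\lambda\beta^{-1}t}\,\|h(\cdot,0) - 1\|_\rho.$$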

Remark 6.5.9. The assumption

$$\int_{\mathbb{R}^d}|p(x,0)|^2\,Z^{-1}e^{\beta V}\,dx < \infty$$

is very restrictive. The function space $L^2(e^{\beta V})$ in which we prove convergence is not the right space to use. Since $p(\cdot,t) \in L^1$, ideally we would like to prove exponentially fast convergence in $L^1$. We can prove convergence in $L^1$ using the theory of logarithmic Sobolev inequalities. In fact, we can also prove convergence in relative entropy:

$$H(p|\rho_V) := \int_{\mathbb{R}^d}p\ln\left(\frac{p}{\rho_V}\right)dx.$$

The relative entropy controls the $L^1$ norm:

$$\|\rho_1 - \rho_2\|_{L^1}^2 \le C\,H(\rho_1|\rho_2).$$

Using a logarithmic Sobolev inequality, we can prove exponentially fast convergence to equilibrium, assuming only that the relative entropy of the initial conditions is finite.

A much sharper version of the theorem of exponentially fast convergence to equilibrium is the following:

Theorem 6.5.10. Let $p$ denote the solution of the Fokker-Planck equation (6.42) where the potential is smooth and uniformly convex. Assume that the initial conditions satisfy

$$H(p(\cdot,0)|\rho_V) < \infty.$$

Then $p$ converges to the Gibbs distribution exponentially fast in relative entropy:

$$H(p(\cdot,t)|\rho_V) \le e^{-\lambda\beta^{-1}t}\,H(p(\cdot,0)|\rho_V).$$

Let $X_t$ be a stationary diffusion process with generator

$$L = b(x)\cdot\nabla + \beta^{-1}\Delta$$

and invariant measure $\mu$. Then the following three statements are equivalent.

i. The process is time-reversible.

ii. The generator of the process is symmetric in $L^2(\mathbb{R}^d;\mu(dx))$.

iii. There exists a scalar function $V(x)$ such that

$$b(x) = -\nabla V(x).$$

The Smoluchowski SDE (6.41) has a very interesting application in statistics. Suppose we want to sample from a probability distribution $\pi(x)$. One method for doing this is by generating the dynamics whose invariant distribution is precisely $\pi(x)$. In particular, we consider the Smoluchowski equation

$$dX_t = \nabla\ln(\pi(X_t))\,dt + \sqrt{2}\,dW_t. \tag{6.49}$$

Assuming that $-\ln(\pi(x))$ is a confining potential, $X_t$ is an ergodic Markov process with invariant distribution $\pi(x)$. Furthermore, the law $\rho_t$ of $X_t$ converges to $\pi(x)$ exponentially fast:

$$\|\rho_t - \pi\|_{L^1} \le e^{-\Lambda t}\,\|\rho_0 - \pi\|_{L^1}.$$

The exponent $\Lambda$ is related to the spectral gap of the generator $L = \frac{1}{\pi(x)}\nabla\pi(x)\cdot\nabla + \Delta$. This technique for sampling from a given distribution is an example of the Markov Chain Monte Carlo (MCMC) methodology.
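A minimal sketch of this sampling idea follows. It discretizes the SDE with the Euler-Maruyama scheme, which is an assumption made here for illustration (the notes do not prescribe a discretization), and the time step introduces a small bias in the sampled distribution. The target is the standard Gaussian, for which $\nabla\ln\pi(x) = -x$; the function name `langevin_samples`, the step size and the chain length are hypothetical choices.

```python
import math
import random

def langevin_samples(grad_log_density, x0, step, num_steps, seed=0):
    """Euler-Maruyama discretization of dX = grad log pi(X) dt + sqrt(2) dW."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(num_steps):
        noise = rng.gauss(0.0, 1.0)
        x += grad_log_density(x) * step + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples

# Target: standard Gaussian, pi(x) proportional to exp(-x^2 / 2).
chain = langevin_samples(lambda x: -x, x0=3.0, step=0.01, num_steps=200000)
burned = chain[1000:]  # discard the transient caused by the initial condition
mean = sum(burned) / len(burned)
variance = sum((x - mean) ** 2 for x in burned) / len(burned)
print(mean, variance)  # expected to be close to 0 and 1
```

For this linear example the discretized chain has stationary variance $1/(1 - h/2)$ with step $h$, which shows how the bias vanishes as $h \to 0$.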

We can add a non-reversible perturbation to a reversible diffusion without changing the invariant distribution $Z^{-1}e^{-\beta V}$.

Proposition 6.6.1. Let $V(x)$ be a confining potential, $\gamma(x)$ a smooth vector field and consider the diffusion process

$$dX_t = \left(-\nabla V(X_t) + \gamma(X_t)\right)dt + \sqrt{2\beta^{-1}}\,dW_t. \tag{6.50}$$

Then the invariant measure of the process $X_t$ is the Gibbs measure $\mu(dx) = \frac{1}{Z}e^{-\beta V(x)}\,dx$ if and only if $\gamma(x)$ is divergence-free with respect to the density of this measure:

$$\nabla\cdot\left(\gamma(x)\,e^{-\beta V(x)}\right) = 0. \tag{6.51}$$
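As an illustration of the divergence-free condition, one standard family of admissible perturbations (an example added here, not taken from the notes) is $\gamma = J\nabla V$ with $J$ a constant antisymmetric matrix, $J^T = -J$. A short calculation confirms that the condition holds:

```latex
\begin{aligned}
\nabla \cdot \left( J \nabla V \, e^{-\beta V} \right)
  &= e^{-\beta V} \, \nabla \cdot (J \nabla V)
     - \beta \, e^{-\beta V} \, \nabla V \cdot J \nabla V \\
  &= e^{-\beta V} \sum_{i,j} J_{ij} \, \partial_i \partial_j V
     - \beta \, e^{-\beta V} \sum_{i,j} J_{ij} \, \partial_i V \, \partial_j V
   = 0 ,
\end{aligned}
```

since both sums contract the antisymmetric matrix $J$ with a symmetric one. In two dimensions this corresponds to adding a rotation along the level sets of $V$.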

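Consider the generator of a gradient stochastic flow with a uniformly convex potential

$$L = -\nabla V\cdot\nabla + D\Delta. \tag{6.52}$$

We know that $L$ is a self-adjoint operator on $L^2_\rho$ with a spectral gap:

$$(Lf,f)_\rho \le -D\lambda\,\|f\|_\rho^2,$$

where $\lambda$ is the Poincaré constant of the potential $V$ (i.e. for the Gibbs measure $Z^{-1}e^{-\beta V(x)}\,dx$). The above imply that we can study the spectral problem for $-L$:

$$-Lf_n = \lambda_n f_n, \qquad n = 0, 1, \dots$$

The operator $-L$ has real, discrete spectrum with

$$0 = \lambda_0 < \lambda_1 < \lambda_2 < \dots$$

Furthermore, the eigenfunctions $\{f_j\}_{j=0}^{\infty}$ form an orthonormal basis in $L^2_\rho$: we can express every element $\phi$ of $L^2_\rho$ in the form of a generalized Fourier series:

$$\phi = \sum_{n=0}^{\infty}\phi_n f_n, \qquad \phi_n = (\phi, f_n)_\rho. \tag{6.53}$$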

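This enables us to solve the time dependent Fokker-Planck equation in terms of an eigenfunction expansion. Consider the backward Kolmogorov equation (6.45). We assume that the initial condition $h_0(x) = \phi(x) \in L^2_\rho$ and consequently we can expand it in the form (6.53). We look for a solution of (6.45) in the form

$$h(x,t) = \sum_{n=0}^{\infty}h_n(t)f_n(x).$$

We substitute this expansion into the backward Kolmogorov equation:

$$\frac{\partial h}{\partial t} = \sum_{n=0}^{\infty}\dot{h}_n f_n = L\left(\sum_{n=0}^{\infty}h_n f_n\right) \tag{6.54}$$

$$= \sum_{n=0}^{\infty}-\lambda_n h_n f_n. \tag{6.55}$$

We multiply this equation by $f_m$, integrate with respect to the Gibbs measure and use the orthonormality of the eigenfunctions to obtain the sequence of equations

$$\dot{h}_n = -\lambda_n h_n, \qquad n = 0, 1, \dots$$

The solution is

$$h_0(t) = \phi_0, \qquad h_n(t) = e^{-\lambda_n t}\phi_n, \quad n = 1, 2, \dots$$

Notice that

$$1 = \int_{\mathbb{R}^d}p(x,0)\,dx = \int_{\mathbb{R}^d}p(x,t)\,dx = \int_{\mathbb{R}^d}h(x,t)\,Z^{-1}e^{-\beta V}\,dx = (h,1)_\rho = (\phi,1)_\rho = \phi_0.$$

Consequently, the solution of the backward Kolmogorov equation is

$$h(x,t) = 1 + \sum_{n=1}^{\infty}e^{-\lambda_n t}\phi_n f_n.$$

This expansion, together with the fact that all eigenvalues are positive ($\lambda_n > 0$ for $n \ge 1$), shows that the solution of the backward Kolmogorov equation converges to $1$ exponentially fast. The solution of the Fokker-Planck equation is

$$p(x,t) = Z^{-1}e^{-\beta V(x)}\left(1 + \sum_{n=1}^{\infty}e^{-\lambda_n t}\phi_n f_n\right).$$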

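Lemma 6.7.1. The Fokker-Planck operator for a gradient flow can be written in the self-adjoint form

$$\frac{\partial p}{\partial t} = D\nabla\cdot\left(e^{-V/D}\nabla\left(e^{V/D}p\right)\right). \tag{6.56}$$

Define now $\psi(x,t) = e^{V/2D}p(x,t)$. Then $\psi$ solves the PDE

$$\frac{\partial\psi}{\partial t} = D\Delta\psi - U(x)\psi, \qquad U(x) := \frac{|\nabla V|^2}{4D} - \frac{\Delta V}{2}. \tag{6.57}$$

Let $H := -D\Delta + U$. Then $L^*$ and $H$ have the same eigenvalues. The $n$th eigenfunction $\phi_n$ of $L^*$ and the $n$th eigenfunction $\psi_n$ of $H$ are associated through the transformation

$$\psi_n(x) = \phi_n(x)\exp\left(\frac{V(x)}{2D}\right).$$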

Remarks 6.7.2.

i. Equation (6.56) shows that the FP operator can be written in the form

$$L^*\cdot = D\nabla\cdot\left(e^{-V/D}\nabla\left(e^{V/D}\,\cdot\right)\right).$$

ii. The operator that appears on the right hand side of eqn. (6.57) has the form of a Schrödinger operator:

$$-H = D\Delta - U(x).$$

iii. The spectral problem for the FP operator can be transformed into the spectral problem for a Schrödinger operator. We can thus use all the available results from quantum mechanics to study the FP equation and the associated SDE.

iv. In particular, the weak noise asymptotics $D \ll 1$ is equivalent to the semiclassical approximation from quantum mechanics.
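As a worked example of the transformation to a Schrödinger operator (added for illustration, with the quadratic potential $V(x) = \frac{1}{2}x^2$ in one dimension as the assumption), the effective potential $U = \frac{|\nabla V|^2}{4D} - \frac{\Delta V}{2}$ is that of a shifted quantum harmonic oscillator:

```latex
\begin{aligned}
U(x) &= \frac{|V'(x)|^2}{4D} - \frac{V''(x)}{2} = \frac{x^2}{4D} - \frac{1}{2},
\qquad
H = -D \frac{d^2}{dx^2} + \frac{x^2}{4D} - \frac{1}{2}, \\
\psi_0(x) &= e^{-x^2/(4D)}
\quad\Longrightarrow\quad
-D \psi_0'' = \left( \frac{1}{2} - \frac{x^2}{4D} \right) \psi_0
\quad\Longrightarrow\quad
H \psi_0 = 0 .
\end{aligned}
```

The harmonic oscillator $-D\frac{d^2}{dx^2} + \frac{x^2}{4D}$ has eigenvalues $n + \frac{1}{2}$, so the eigenvalues of $H$ are $n = 0, 1, 2, \dots$, independently of $D$, in agreement with the spectrum of the OU generator with unit drift coefficient. The ground state $\psi_0$ corresponds to the Gaussian invariant density through $\phi_0 = \psi_0 e^{-V/2D} = e^{-x^2/(2D)}$.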

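Proof. We calculate

$$D\nabla\cdot\left(e^{-V/D}\nabla\left(e^{V/D}f\right)\right) = D\nabla\cdot\left(e^{-V/D}\left(D^{-1}\nabla V f + \nabla f\right)e^{V/D}\right) = \nabla\cdot\left(\nabla V f + D\nabla f\right) = L^*f.$$

Consider now the eigenvalue problem for the FP operator:

$$-L^*\phi_n = \lambda_n\phi_n.$$

Set $\phi_n = \psi_n\exp\left(-\frac{1}{2D}V\right)$. We calculate $-L^*\phi_n$:

$$-L^*\phi_n = -D\nabla\cdot\left(e^{-V/D}\nabla\left(e^{V/D}\psi_n e^{-V/2D}\right)\right)
= -D\nabla\cdot\left(e^{-V/D}\left(\nabla\psi_n + \frac{\nabla V}{2D}\psi_n\right)e^{V/2D}\right)
= \left(-D\Delta\psi_n + \left(\frac{|\nabla V|^2}{4D} - \frac{\Delta V}{2}\right)\psi_n\right)e^{-V/2D} = e^{-V/2D}H\psi_n.$$

From this we conclude that $e^{-V/2D}H\psi_n = \lambda_n\psi_n e^{-V/2D}$, from which the equivalence between the two eigenvalue problems follows.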

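Remarks 6.7.3.

i. We can rewrite the Schrödinger operator in the form

$$H = DA^*A, \qquad A = \nabla + \frac{\nabla V}{2D}, \qquad A^* = -\nabla + \frac{\nabla V}{2D}.$$

ii. These are creation and annihilation operators. They can also be written in the form

$$A\,\cdot = e^{-V/2D}\nabla\left(e^{V/2D}\,\cdot\right), \qquad A^*\cdot = -e^{V/2D}\nabla\left(e^{-V/2D}\,\cdot\right).$$

iii. The forward and the backward Kolmogorov operators have the same eigenvalues. Their eigenfunctions are related through

$$\phi_n^B = \phi_n^F\exp\left(V/D\right),$$

where $\phi_n^B$ and $\phi_n^F$ denote the eigenfunctions of the backward and forward operators, respectively.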

The proof of existence and uniqueness of classical solutions for the Fokker-Planck

equation of a uniformly elliptic diffusion process with smooth drift and diffusion

coefficients, Theorem 6.2.2, can be found in [21]. A standard textbook on PDEs,

with a lot of material on parabolic PDEs is [13], particularly Chapters 2 and 7 in

this book.

It is important to emphasize that the condition that solutions to the Fokker-Planck equation do not grow too fast, see Definition 6.2.1, is necessary to ensure uniqueness. In fact, there are infinitely many solutions of

$$\frac{\partial p}{\partial t} = \Delta p \quad\text{in } \mathbb{R}^d\times(0,T), \qquad p(x,0) = 0.$$

Each of these solutions besides the trivial solution $p = 0$ grows very rapidly as $|x|\to+\infty$. More details can be found in [34, Ch. 7].

See also [25] and [32]. The connection between the Fokker-Planck equation and

stochastic differential equations is presented in Chapter 7. See also [1, 22, 23].

Hermite polynomials appear very frequently in applications and they also play

a fundamental role in analysis. It is possible to prove that the Hermite polynomials

form an orthonormal basis for L2 (Rd , ) without using the fact that they are the

eigenfunctions of a symmetric operator with compact resolvent.6 The proof of

Proposition 6.4.1 can be found in [71], Lemma 2.3.4 in particular.

Diffusion processes in one dimension are studied in [48]. The Feller classification for one dimensional diffusion processes can be also found in [35, 15].

Convergence to equilibrium for kinetic equations (such as the Fokker-Planck

equation) both linear and non-linear (e.g., the Boltzmann equation) has been studied extensively. It has been recognized that the relative entropy and logarithmic

Sobolev inequalities play an important role in the analysis of the problem of convergence to equilibrium. For more information see [49].

6.9 Exercises

1. Solve equation (6.13) by taking the Fourier transform, using the method of characteristics for first order PDEs and taking the inverse Fourier transform.

2. Use the formula for the stationary joint probability density of the OrnsteinUhlenbeck process, eqn. (6.17) to obtain the stationary autocorrelation function

of the OU process.

3. Use (6.20) to obtain formulas for the moments of the OU process. Prove, using

these formulas, that the moments of the OU process converge to their equilibrium values exponentially fast.

4. Show that the autocorrelation function of the stationary Ornstein-Uhlenbeck is

Z Z

xx0 pOU (x, t|x0 , 0)ps (x0 ) dxdx0

E(Xt X0 ) =

R

D |t|

e

,

2

6

In fact, Poincares inequality for Gaussian measures can be proved using the fact that that the

Hermite polynomials form an orthonormal basis for L2 (Rd , ).

6.9. EXERCISES

121

5. Let Xt be a one-dimensional diffusion process with drift and diffusion coefficients a(y, t) = a0 a1 y and b(y, t) = b0 + b1 y + b2 y 2 where ai , bi > 0, i =

0, 1, 2.

(a) Write down the generator and the forward and backward Kolmogorov

equations for Xt .

(b) Assume that X0 is a random variable with probability density 0 (x) that

has finite moments. Use the forward Kolmogorov equation to derive a

system of differential equations for the moments of Xt .

(c) Find the first three moments M0 , M1 , M2 in terms of the moments of the

initial distribution 0 (x).

(d) Under what conditions on the coefficients ai , bi > 0, i = 0, 1, 2 is M2

finite for all times?

6. Let V be a confining potential in Rd , > 0 and let (x) = Z 1 eV (x) .

Give the definition of the Sobolev space H k (Rd ; ) for k a positive integer

and study some of its basic properties.

7. Let Xt be a multidimensional diffusion process on [0, 1]d with periodic boundary conditions. The drift vector is a periodic function a(x) and the diffusion

matrix is 2DI, where D > 0 and I is the identity matrix.

(a) Write down the generator and the forward and backward Kolmogorov

equations for Xt .

(b) Assume that a(x) is divergence-free ( a(x) = 0). Show that Xt is

ergodic and find the invariant distribution.

(c) Show that the probability density p(x, t) (the solution of the forward Kolmogorov equation) converges to the invariant distribution exponentially

fast in L2 ([0, 1]d ). (Hint: Use Poincares inequality on [0, 1]d ).

8. The Rayleigh process Xt is a diffusion process that takes values on (0, +)

with drift and diffusion coefficients a(x) = ax + D

x and b(x) = 2D, respectively, where a, D > 0.

(a) Write down the generator the forward and backward Kolmogorov equations for Xt .

(b) Show that this process is ergodic and find its invariant distribution.

122

(c) Solve the forward Kolmogorov (Fokker-Planck) equation using separation

of variables. (Hint: Use Laguerre polynomials).

9. Let $\mathbf{x}(t) = \{x(t), y(t)\}$ be the two-dimensional diffusion process on $[0, 2\pi]^2$ with periodic boundary conditions with drift vector $a(x, y) = (\sin(y), \sin(x))$ and diffusion matrix $b(x, y)$ with $b_{11} = b_{22} = 1$, $b_{12} = b_{21} = 0$.

(a) Write down the generator of the process $\{x(t), y(t)\}$ and the forward and backward Kolmogorov equations.

(b) Show that the constant function
$$\rho_s(x, y) = C$$
is the unique stationary distribution of the process $\{x(t), y(t)\}$ and calculate the normalization constant.

(c) Let $\mathbb{E}$ denote the expectation with respect to the invariant distribution $\rho_s(x, y)$. Calculate
$$\mathbb{E}\big(\cos(x) + \cos(y)\big) \quad \text{and} \quad \mathbb{E}\big(\sin(x)\sin(y)\big).$$

10. Let a, D be positive constants and let X(t) be the diffusion process on [0, 1]

with periodic boundary conditions and with drift and diffusion coefficients a(x) =

a and b(x) = 2D, respectively. Assume that the process starts at x0 , X(0) =

x0 .

(a) Write down the generator of the process X(t) and the forward and backward Kolmogorov equations.

(b) Solve the initial/boundary value problem for the forward Kolmogorov

equation to calculate the transition probability density p(x, t|x0 , 0).

(c) Show that the process is ergodic and calculate the invariant distribution

ps (x).

(d) Calculate the stationary autocorrelation function
$$\mathbb{E}(X(t)X(0)) = \int_0^1 \int_0^1 x x_0 \, p(x, t | x_0, 0) \, p_s(x_0) \, dx \, dx_0.$$

Chapter 7

7.1 Introduction

In this part of the course we will study stochastic differential equations (SDEs): ODEs driven by Gaussian white noise.

Let $W(t)$ denote a standard $m$-dimensional Brownian motion, $h : Z \to \mathbb{R}^d$ a smooth vector-valued function and $\gamma : Z \to \mathbb{R}^{d \times m}$ a smooth matrix-valued function (in this course we will take $Z = \mathbb{T}^d$, $\mathbb{R}^d$ or $\mathbb{R}^l \oplus \mathbb{T}^{d-l}$). Consider the SDE
$$\frac{dz}{dt} = h(z) + \gamma(z)\frac{dW}{dt}, \quad z(0) = z_0. \tag{7.1}$$

We think of the term $\frac{dW}{dt}$ as representing Gaussian white noise: a mean-zero Gaussian process with correlation $\delta(t - s)I$. Such a process exists only as a distribution. The function $h$ in (7.1) is sometimes referred to as the drift and $\gamma$ as the diffusion coefficient. The precise interpretation of (7.1) is as an integral equation for $z(t) \in C(\mathbb{R}^+, Z)$:
$$z(t) = z_0 + \int_0^t h(z(s))\,ds + \int_0^t \gamma(z(s))\,dW(s). \tag{7.2}$$

In order to make sense of this equation we need to define the stochastic integral

against W (s).


For the rigorous analysis of stochastic differential equations it is necessary to define

stochastic integrals of the form

$$I(t) = \int_0^t f(s)\,dW(s), \tag{7.3}$$

where W (t) is a standard one dimensional Brownian motion. This is not straightforward because W (t) does not have bounded variation. In order to define the

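stochastic integral we assume that $f(t)$ is a random process, adapted to the filtration $\mathcal{F}_t$ generated by the process $W(t)$, and such that
$$\mathbb{E}\left(\int_0^T f(s)^2\,ds\right) < \infty.$$
The Itô stochastic integral $I(t)$ is defined as the $L^2$ limit of the Riemann sum approximation of (7.3):
$$I(t) := \lim_{K \to \infty} \sum_{k=1}^{K-1} f(t_{k-1})\big(W(t_k) - W(t_{k-1})\big), \tag{7.4}$$
where $t_k = k\Delta t$ and $K\Delta t = t$. Notice that the function $f$ is evaluated at the left end of each interval $[t_{k-1}, t_k]$ in (7.4). The resulting Itô stochastic integral $I(t)$ is a.s. continuous in $t$. These ideas are readily generalized to the case where $W(s)$ is a standard $d$-dimensional Brownian motion and $f(s) \in \mathbb{R}^{m \times d}$ for each $s$.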

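The resulting integral satisfies the Itô isometry
$$\mathbb{E}|I(t)|^2 = \int_0^t \mathbb{E}|f(s)|^2\,ds.$$
Furthermore, the Itô stochastic integral is a martingale:
$$\mathbb{E}I(t) = 0 \quad \text{and} \quad \mathbb{E}[I(t) \,|\, \mathcal{F}_s] = I(s) \quad \forall\, t \geq s, \tag{7.5}$$
where $\mathcal{F}_s$ denotes the filtration generated by $W(s)$.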

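Example 7.2.1. Consider the Itô stochastic integral
$$I(t) = \int_0^t f(s)\,dW(s),$$
where $f$ and $W$ are scalar-valued. This is a martingale with quadratic variation
$$\langle I \rangle_t = \int_0^t (f(s))^2\,ds.$$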

In addition to the Ito stochastic integral, we can also define the Stratonovich stochastic integral. It is defined as the L2 limit of a different Riemann sum approximation

of (7.3), namely

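$$I_{strat}(t) := \lim_{K \to \infty} \sum_{k=1}^{K-1} \frac{1}{2}\big(f(t_{k-1}) + f(t_k)\big)\big(W(t_k) - W(t_{k-1})\big), \tag{7.6}$$
where $t_k = k\Delta t$ and $K\Delta t = t$. Notice that the function $f$ is evaluated at both endpoints of each interval $[t_{k-1}, t_k]$ in (7.6). The multidimensional Stratonovich integral is defined in a similar way. The resulting integral is written as
$$I_{strat}(t) = \int_0^t f(s) \circ dW(s).$$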

The limit in (7.6) gives rise to an integral which differs from the Ito integral. The

situation is more complex than that arising in the standard theory of Riemann integration for functions of bounded variation: in that case the points in $[t_{k-1}, t_k]$
where the integrand is evaluated do not affect the definition of the integral, via a

limiting process. In the case of integration against Brownian motion, which does

not have bounded variation, the limits differ. When f and W are correlated through

an SDE, then a formula exists to convert between them.
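
The difference between the two Riemann sums (7.4) and (7.6) can be seen numerically. The following is a minimal sketch, assuming the integrand $f(s) = W(s)$ (our choice for illustration, not an example taken from the text), for which the Itô sum should approach $(W(T)^2 - T)/2$ and the Stratonovich sum should equal $W(T)^2/2$. The function name and parameters are hypothetical.

```python
import math
import random


def ito_and_stratonovich_sums(n_steps=100_000, t_final=1.0, seed=0):
    """Approximate int_0^T W dW on one simulated Brownian path.

    Returns the left-endpoint (Ito) sum, the endpoint-average
    (Stratonovich) sum and the terminal value W(T).
    """
    rng = random.Random(seed)
    dt = t_final / n_steps
    w = 0.0
    ito = 0.0
    strat = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        w_next = w + dw
        ito += w * dw                       # integrand at the left endpoint
        strat += 0.5 * (w + w_next) * dw    # average of both endpoints
        w = w_next
    return ito, strat, w


ito, strat, w_T = ito_and_stratonovich_sums()
print("Ito sum         :", ito, " vs (W_T^2 - T)/2 =", 0.5 * (w_T**2 - 1.0))
print("Stratonovich sum:", strat, " vs W_T^2/2       =", 0.5 * w_T**2)
```

The Stratonovich sum telescopes exactly to $W(T)^2/2$, while the Itô sum differs from it by half the sum of squared increments, which concentrates around $T/2$ as the mesh is refined.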


Definition 7.3.1. By a solution of (7.1) we mean a $Z$-valued stochastic process $\{z(t)\}$ on $t \in [0, T]$ with the properties:

i. $z(t)$ is continuous and $\mathcal{F}_t$-adapted, where the filtration is generated by the Brownian motion $W(t)$;

ii. $h(z(t)) \in L^1((0, T))$, $\gamma(z(t)) \in L^2((0, T))$;

iii. equation (7.1) holds for every $t \in [0, T]$ with probability 1.

The solution is called unique if any two solutions $x_i(t)$, $i = 1, 2$ satisfy
$$\mathbb{P}\big(x_1(t) = x_2(t), \ \forall t \in [0, T]\big) = 1.$$

It is well known that existence and uniqueness of solutions for ODEs (i.e. when $\gamma \equiv 0$ in (7.1)) holds for globally Lipschitz vector fields $h(x)$. A very similar theorem holds when $\gamma \neq 0$. As for ODEs the conditions can be weakened, when a priori bounds on the solution can be found.

Theorem 7.3.2. Assume that both $h(\cdot)$ and $\gamma(\cdot)$ are globally Lipschitz on $Z$ and that $z_0$ is a random variable independent of the Brownian motion $W(t)$ with
$$\mathbb{E}|z_0|^2 < \infty.$$
Then the SDE (7.1) has a unique solution $z(t) \in C(\mathbb{R}^+; Z)$ with
$$\mathbb{E}\left[\int_0^T |z(t)|^2\,dt\right] < \infty \quad \forall\, T < \infty.$$

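The Stratonovich analogue of (7.1) is
$$\frac{dz}{dt} = h(z) + \gamma(z) \circ \frac{dW}{dt}, \quad z(0) = z_0. \tag{7.7}$$
By this we mean that $z \in C(\mathbb{R}^+, Z)$ satisfies the integral equation
$$z(t) = z(0) + \int_0^t h(z(s))\,ds + \int_0^t \gamma(z(s)) \circ dW(s). \tag{7.8}$$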

By using definitions (7.4) and (7.6) it can be shown that $z$ satisfying the Stratonovich SDE (7.7) also satisfies the Itô SDE
$$\frac{dz}{dt} = h(z) + \frac{1}{2}\nabla \cdot \big(\gamma(z)\gamma(z)^T\big) - \frac{1}{2}\gamma(z)\nabla \cdot \big(\gamma(z)^T\big) + \gamma(z)\frac{dW}{dt}, \tag{7.9a}$$
$$z(0) = z_0, \tag{7.9b}$$
provided that $\gamma(z)$ is differentiable. White noise is, in most applications, an idealization of a stationary random process with short correlation time. In this context

the Stratonovich interpretation of an SDE is particularly important because it often

arises as the limit obtained by using smooth approximations to white noise. On

the other hand the martingale machinery which comes with the Ito integral makes

it more important as a mathematical object. It is very useful that we can convert

from the Ito to the Stratonovich interpretation of the stochastic integral. There are

other interpretations of the stochastic integral, e.g. the Klimontovich stochastic

integral.

The definition of Brownian motion implies the scaling property
$$W(ct) = \sqrt{c}\,W(t),$$
where the above should be interpreted as holding in law. From this it follows that, if $s = ct$, then
$$\frac{dW}{ds} = \frac{1}{\sqrt{c}}\frac{dW}{dt},$$
again in law. Hence, if we scale time to $s = ct$ in (7.1), then we get the equation
$$\frac{dz}{ds} = \frac{1}{c}h(z) + \frac{1}{\sqrt{c}}\gamma(z)\frac{dW}{ds}, \quad z(0) = z_0.$$

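The SDE for Brownian motion is:
$$dX = \sqrt{2\sigma}\,dW, \quad X(0) = x.$$
The solution is:
$$X(t) = x + \sqrt{2\sigma}\,W(t).$$
The SDE for the Ornstein-Uhlenbeck process is
$$dX = -\alpha X\,dt + \sqrt{2\lambda}\,dW, \quad X(0) = x.$$
The solution is:
$$X(t) = e^{-\alpha t}x + \sqrt{2\lambda}\int_0^t e^{-\alpha(t-s)}\,dW(s).$$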

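We can use Itô's formula to obtain equations for the moments of the OU process. The generator is:
$$L = -\alpha x\partial_x + \lambda\partial_x^2.$$
We apply Itô's formula to the function $f(x) = x^n$ to obtain:
$$dX(t)^n = LX(t)^n\,dt + \sqrt{2\lambda}\,\partial_x X(t)^n\,dW = \big({-\alpha n}X(t)^n + \lambda n(n-1)X(t)^{n-2}\big)\,dt + n\sqrt{2\lambda}\,X(t)^{n-1}\,dW.$$
Consequently:
$$X(t)^n = x^n + \int_0^t \big({-\alpha n}X(t)^n + \lambda n(n-1)X(t)^{n-2}\big)\,dt + n\sqrt{2\lambda}\int_0^t X(t)^{n-1}\,dW.$$
By taking the expectation in the above equation we obtain the equation for the moments of the OU process that we derived earlier using the Fokker-Planck equation:
$$M_n(t) = x^n + \int_0^t \big({-\alpha n}M_n(s) + \lambda n(n-1)M_{n-2}(s)\big)\,ds.$$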
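
The moment equations for the OU process form a closed triangular system, $\dot{M}_n = -\alpha n M_n + \lambda n(n-1)M_{n-2}$, which can be integrated numerically. The sketch below is an illustration under our own assumptions (a classical RK4 integrator, hypothetical function names and parameter values); it compares the computed second moment with the closed form $M_2(t) = x^2 e^{-2\alpha t} + \frac{\lambda}{\alpha}(1 - e^{-2\alpha t})$, which solves the $n = 2$ equation.

```python
import math


def ou_moment_rhs(m, alpha, lam):
    """Right-hand side of dM_n/dt = -alpha*n*M_n + lam*n*(n-1)*M_{n-2}."""
    rhs = []
    for n in range(len(m)):
        value = -alpha * n * m[n]
        if n >= 2:
            value += lam * n * (n - 1) * m[n - 2]
        rhs.append(value)
    return rhs


def ou_moments(x0, alpha, lam, t_final, n_max=4, n_steps=1000):
    """Integrate the moment hierarchy with RK4, starting from X(0) = x0."""
    m = [x0**n for n in range(n_max + 1)]  # deterministic initial condition
    h = t_final / n_steps
    for _ in range(n_steps):
        k1 = ou_moment_rhs(m, alpha, lam)
        k2 = ou_moment_rhs([a + 0.5 * h * b for a, b in zip(m, k1)], alpha, lam)
        k3 = ou_moment_rhs([a + 0.5 * h * b for a, b in zip(m, k2)], alpha, lam)
        k4 = ou_moment_rhs([a + h * b for a, b in zip(m, k3)], alpha, lam)
        m = [a + h / 6.0 * (b1 + 2 * b2 + 2 * b3 + b4)
             for a, b1, b2, b3, b4 in zip(m, k1, k2, k3, k4)]
    return m


alpha, lam, x0, t_final = 1.0, 0.5, 1.0, 1.0
moments = ou_moments(x0, alpha, lam, t_final)
exact_m2 = x0**2 * math.exp(-2 * alpha * t_final) \
    + lam / alpha * (1 - math.exp(-2 * alpha * t_final))
print("M_1:", moments[1], " exact:", x0 * math.exp(-alpha * t_final))
print("M_2:", moments[2], " exact:", exact_m2)
```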

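Consider the geometric Brownian motion
$$dX(t) = \mu X(t)\,dt + \sigma X(t)\,dW(t), \tag{7.10}$$
where we use the Itô interpretation of the stochastic differential. The generator of this process is
$$L = \mu x\partial_x + \frac{\sigma^2 x^2}{2}\partial_x^2.$$
The solution to this equation is
$$X(t) = X(0)\exp\left(\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t)\right). \tag{7.11}$$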
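
The closed form (7.11) can be checked against a direct discretization of (7.10). The following minimal sketch assumes an Euler-Maruyama scheme (a standard discretization, not one discussed in the text) driven by the same Brownian increments that enter the exact formula; the function name and parameter values are hypothetical.

```python
import math
import random


def gbm_euler_vs_exact(x0=1.0, mu=0.1, sigma=0.3, t_final=1.0,
                       n_steps=10_000, seed=1):
    """Simulate the Ito SDE dX = mu*X dt + sigma*X dW with Euler-Maruyama
    and evaluate the exact solution (7.11) on the same Brownian path."""
    rng = random.Random(seed)
    dt = t_final / n_steps
    x_euler = x0
    w = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        x_euler += mu * x_euler * dt + sigma * x_euler * dw
        w += dw
    x_exact = x0 * math.exp((mu - 0.5 * sigma**2) * t_final + sigma * w)
    return x_euler, x_exact


x_euler, x_exact = gbm_euler_vs_exact()
print("Euler-Maruyama:", x_euler)
print("Formula (7.11):", x_exact)
```

Note that the Euler-Maruyama scheme evaluates the diffusion coefficient at the left endpoint of each step, so it is consistent with the Itô interpretation and hence with the $-\sigma^2/2$ correction in the exponent.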


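To derive this formula, we apply Itô's formula to the function $f(x) = \log(x)$:
$$\begin{aligned}
d\log(X(t)) &= L\log(X(t))\,dt + \sigma x\partial_x\log(X(t))\,dW(t) \\
&= \left(\mu x\frac{1}{x} + \frac{\sigma^2 x^2}{2}\left(-\frac{1}{x^2}\right)\right)dt + \sigma\,dW(t) \\
&= \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW(t).
\end{aligned}$$
Consequently:
$$\log\left(\frac{X(t)}{X(0)}\right) = \left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t),$$
from which (7.11) follows. Notice that the Stratonovich interpretation of this equation leads to the solution
$$X(t) = X(0)\exp\big(\mu t + \sigma W(t)\big).$$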

7.4 The Generator, Itô's Formula and the Fokker-Planck Equation

7.4.1 The Generator

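Given the function $\gamma(z)$ in the SDE (7.1) we define
$$\Gamma(z) = \gamma(z)\gamma(z)^T. \tag{7.12}$$
The generator $L$ is then defined as
$$Lv = h \cdot \nabla v + \frac{1}{2}\Gamma : \nabla\nabla v. \tag{7.13}$$
This operator, equipped with a suitable domain of definition, is the generator of the Markov process given by (7.1). The formal $L^2$-adjoint operator $L^*$ is
$$L^* v = -\nabla \cdot (hv) + \frac{1}{2}\nabla \cdot \nabla \cdot (\Gamma v).$$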

The Itô formula enables us to calculate the rate of change in time of functions $V : Z \to \mathbb{R}^n$ evaluated at the solution of a $Z$-valued SDE. Formally, we can write:
$$\frac{d}{dt}V(z(t)) = LV(z(t)) + \left\langle \nabla V(z(t)), \gamma(z(t))\frac{dW}{dt} \right\rangle.$$
Note that if $W$ were a smooth time-dependent function this formula would not be correct: there is an additional term in $LV$, proportional to $\Gamma$, which arises from the lack of smoothness of Brownian motion. The precise interpretation of the expression for the rate of change of $V$ is in integrated form:

Lemma 7.4.1. (Itô's Formula) Assume that the conditions of Theorem 7.3.2 hold. Let $z(t)$ solve (7.1) and let $V \in C^2(Z, \mathbb{R}^n)$. Then the process $V(z(t))$ satisfies
$$V(z(t)) = V(z(0)) + \int_0^t LV(z(s))\,ds + \int_0^t \big\langle \nabla V(z(s)), \gamma(z(s))\,dW(s) \big\rangle.$$
Let $\phi : Z \to \mathbb{R}$ and consider the function
$$v(z, t) = \mathbb{E}\big(\phi(z(t)) \,|\, z(0) = z\big), \tag{7.14}$$

where the expectation is with respect to all Brownian driving paths. By averaging

in the Ito formula, which removes the stochastic integral, and using the Markov

property, it is possible to obtain the Backward Kolmogorov equation.

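Theorem 7.4.2. Assume that $\phi$ is chosen sufficiently smooth so that the backward Kolmogorov equation
$$\begin{aligned}
\frac{\partial v}{\partial t} &= Lv \quad \text{for } (z, t) \in Z \times (0, \infty), \\
v &= \phi \quad \text{for } (z, t) \in Z \times \{0\},
\end{aligned} \tag{7.15}$$
has a unique classical solution. Then $v$ is given by (7.14) where $z(t)$ solves (7.2).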

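For a Stratonovich SDE the rules of standard calculus apply: Consider the Stratonovich SDE (7.29) and let $V(x) \in C^2(\mathbb{R})$. Then
$$dV(X(t)) = \frac{dV}{dx}(X(t))\big(f(X(t))\,dt + \sigma(X(t)) \circ dW(t)\big).$$
Consider now the Stratonovich SDE (7.29) in several dimensions (with $W(t)$ a standard Brownian motion on $\mathbb{R}^n$). The corresponding Fokker-Planck equation is:
$$\frac{\partial\rho}{\partial t} = -\nabla \cdot (f\rho) + \frac{1}{2}\nabla \cdot \big(\sigma\nabla \cdot (\sigma\rho)\big). \tag{7.16}$$
Now we can derive rigorously the Fokker-Planck equation.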

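Theorem 7.4.3. Consider equation (7.2) with $z(0)$ a random variable with density $\rho_0(z)$. Assume that the law of $z(t)$ has a density $\rho(z, t) \in C^{2,1}(Z \times (0, \infty))$. Then $\rho$ satisfies the Fokker-Planck equation
$$\frac{\partial\rho}{\partial t} = L^*\rho \quad \text{for } (z, t) \in Z \times (0, \infty), \tag{7.17a}$$
$$\rho = \rho_0 \quad \text{for } z \in Z \times \{0\}. \tag{7.17b}$$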

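Proof. Let $\mathbb{E}^\mu$ denote averaging with respect to the product measure induced by the measure $\mu$ with density $\rho_0$ on $z(0)$ and the independent driving Wiener measure on the SDE itself. Averaging over random $z(0)$ distributed with density $\rho_0(z)$, we find
$$\begin{aligned}
\mathbb{E}^\mu\big(\phi(z(t))\big) &= \int_Z v(z, t)\rho_0(z)\,dz \\
&= \int_Z (e^{Lt}\phi)(z)\rho_0(z)\,dz \\
&= \int_Z (e^{L^* t}\rho_0)(z)\phi(z)\,dz.
\end{aligned}$$
But since $\rho(z, t)$ is the density of $z(t)$ we also have
$$\mathbb{E}^\mu\big(\phi(z(t))\big) = \int_Z \rho(z, t)\phi(z)\,dz.$$
Equating these two expressions for the expectation at time $t$ we obtain
$$\int_Z (e^{L^* t}\rho_0)(z)\phi(z)\,dz = \int_Z \rho(z, t)\phi(z)\,dz.$$
We use a density argument so that the identity can be extended to all $\phi \in L^2(Z)$. Hence, from the above equation we deduce that
$$\rho(z, t) = \big(e^{L^* t}\rho_0\big)(z).$$
Differentiation of the above equation gives (7.17a). Setting $t = 0$ gives the initial condition (7.17b).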

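In this section we study linear SDEs in arbitrary finite dimensions. Let $A \in \mathbb{R}^{d \times d}$ be a positive definite matrix and let $D > 0$ be a positive constant. We will consider the SDE
$$dX(t) = -AX(t)\,dt + \sqrt{2D}\,dW(t)$$
or, componentwise,
$$dX_i(t) = -\sum_{j=1}^d A_{ij}X_j(t)\,dt + \sqrt{2D}\,dW_i(t), \quad i = 1, \dots, d.$$
The corresponding Fokker-Planck equation is
$$\frac{\partial p}{\partial t} = \nabla \cdot (Axp) + D\Delta p$$
or
$$\frac{\partial p}{\partial t} = \sum_{i,j=1}^d \frac{\partial}{\partial x_i}(A_{ij}x_j p) + D\sum_{j=1}^d \frac{\partial^2 p}{\partial x_j^2}.$$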

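Let us now solve the Fokker-Planck equation with initial conditions $p(x, 0|x_0, 0) = \delta(x - x_0)$. We take the Fourier transform of the Fokker-Planck equation to obtain
$$\frac{\partial\hat{p}}{\partial t} = -Ak \cdot \nabla_k\hat{p} - D|k|^2\hat{p} \tag{7.18}$$
with
$$\hat{p}(k, t|x_0, 0) = \int_{\mathbb{R}^d} e^{-ik \cdot x}p(x, t|x_0, 0)\,dx.$$
The initial condition is
$$\hat{p}(k, 0|x_0, 0) = e^{-ik \cdot x_0}. \tag{7.19}$$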

We know that the transition probability density of a linear SDE is Gaussian. Since the Fourier transform of a Gaussian function is also Gaussian, we look for a solution to (7.18) which is of the form
$$\hat{p}(k, t|x_0, 0) = \exp\left(-ik \cdot M(t) - \frac{1}{2}k^T\Sigma(t)k\right).$$
We substitute this into (7.18) and use the symmetry of $A$ to obtain the equations
$$\frac{dM}{dt} = -AM \quad \text{and} \quad \frac{d\Sigma}{dt} = -2A\Sigma + 2DI,$$
with initial conditions (which follow from (7.19)) $M(0) = x_0$ and $\Sigma(0) = 0$, where $0$ denotes the zero $d \times d$ matrix. We can solve these equations using the spectral resolution of $A = B^T\Lambda B$. The solutions are
$$M(t) = e^{-At}M(0)$$
and
$$\Sigma(t) = DA^{-1} - DA^{-1}e^{-2At}.$$

We now calculate the inverse Fourier transform of $\hat{p}$ to obtain the fundamental solution (Green's function) of the Fokker-Planck equation
$$p(x, t|x_0, 0) = (2\pi)^{-d/2}\big(\det(\Sigma(t))\big)^{-1/2}\exp\left(-\frac{1}{2}\big(x - e^{-At}x_0\big)^T\Sigma^{-1}(t)\big(x - e^{-At}x_0\big)\right). \tag{7.20}$$
We note that the generator of the Markov process $X_t$ is of the form
$$L = -\nabla V(x) \cdot \nabla + D\Delta$$
with $V(x) = \frac{1}{2}x^T Ax = \frac{1}{2}\sum_{i,j=1}^d A_{ij}x_i x_j$. This is a confining potential and from the theory presented in Section 6.5 we know that the process $X_t$ is ergodic. The invariant distribution is
$$p_s(x) = \frac{1}{Z}e^{-\frac{1}{2D}x^T Ax} \tag{7.21}$$
with $Z = \int_{\mathbb{R}^d} e^{-\frac{1}{2D}x^T Ax}\,dx = (2\pi D)^{d/2}\sqrt{\det(A^{-1})}$. Using the above calculations, we can calculate the stationary autocorrelation matrix, which is given by the formula
$$\mathbb{E}(X_0^T X_t) = \int\!\!\int x_0^T x\, p(x, t|x_0, 0)\,p_s(x_0)\,dx\,dx_0.$$
We substitute the formulas for the transition probability density and the stationary distribution, equations (7.20) and (7.21), into the above equation and do the Gaussian integration to obtain
$$\mathbb{E}(X_0^T X_t) = DA^{-1}e^{-At}.$$
We now use the variation of constants formula to obtain
$$X_t = e^{-At}X_0 + \sqrt{2D}\int_0^t e^{-A(t-s)}\,dW(s).$$
The matrix exponential can be calculated using the spectral resolution of $A$:
$$e^{At} = B^T e^{\Lambda t}B.$$

When white noise is approximated by a smooth process this often leads to Stratonovich

interpretations of stochastic integrals, at least in one dimension. We use multiscale

analysis (singular perturbation theory for Markov processes) to illustrate this phenomenon in a one-dimensional example.

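Consider the equations
$$\frac{dx}{dt} = h(x) + \frac{1}{\varepsilon}f(x)y, \tag{7.22a}$$
$$\frac{dy}{dt} = -\frac{\alpha y}{\varepsilon^2} + \sqrt{\frac{2D}{\varepsilon^2}}\frac{dV}{dt}, \tag{7.22b}$$
with $V$ being a standard one-dimensional Brownian motion. We say that the process $x(t)$ is driven by colored noise: the noise that appears in (7.22a) has non-zero correlation time. The correlation function of the colored noise $\eta(t) := y(t)/\varepsilon$ is (we take $y(0) = 0$)
$$R(t) = \mathbb{E}\big(\eta(t)\eta(s)\big) = \frac{1}{\varepsilon^2}\frac{D}{\alpha}e^{-\frac{\alpha}{\varepsilon^2}|t-s|}.$$
The power spectrum of the colored noise $\eta(t)$ is
$$f^\varepsilon(x) = \frac{1}{\varepsilon^2}\frac{D\varepsilon^{-2}}{\pi}\frac{1}{x^2 + (\alpha\varepsilon^{-2})^2} = \frac{D}{\pi}\frac{1}{\varepsilon^4 x^2 + \alpha^2} \to \frac{D}{\pi\alpha^2}$$
and, consequently,
$$\lim_{\varepsilon \to 0}\mathbb{E}\left(\frac{y(t)}{\varepsilon}\frac{y(s)}{\varepsilon}\right) = \frac{2D}{\alpha^2}\delta(t - s),$$
which implies the heuristic
$$\lim_{\varepsilon \to 0}\frac{y(t)}{\varepsilon} = \sqrt{\frac{2D}{\alpha^2}}\frac{dV}{dt}. \tag{7.23}$$
Another way of seeing this is by solving (7.22b) for $y/\varepsilon$:
$$\frac{y}{\varepsilon} = \sqrt{\frac{2D}{\alpha^2}}\frac{dV}{dt} - \frac{\varepsilon}{\alpha}\frac{dy}{dt}. \tag{7.24}$$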

If we neglect the $O(\varepsilon)$ term on the right hand side then we arrive, again, at the heuristic (7.23). Both of these arguments lead us to conjecture the limiting Itô SDE:
$$\frac{dX}{dt} = h(X) + \sqrt{\frac{2D}{\alpha^2}}f(X)\frac{dV}{dt}. \tag{7.25}$$
In fact, as applied, the heuristic gives the incorrect limit. Whenever white noise is approximated by a smooth process, the limiting equation should be interpreted in the Stratonovich sense, giving
$$\frac{dX}{dt} = h(X) + \sqrt{\frac{2D}{\alpha^2}}f(X) \circ \frac{dV}{dt}. \tag{7.26}$$

This is usually called the Wong-Zakai theorem. A similar result is true in arbitrary

finite and even infinite dimensions. We will show this using singular perturbation

theory.
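
The white noise limit of the colored noise $\eta(t) = y(t)/\varepsilon$ can be illustrated numerically: the correlation function $R$ becomes taller and narrower as $\varepsilon \to 0$ while its integral stays equal to $2D/\alpha^2$, and the power spectrum flattens to $D/(\pi\alpha^2)$. The sketch below only evaluates these two formulas; the function names and parameter values are our own illustrative choices.

```python
import math


def correlation(tau, eps, d, alpha):
    """Stationary correlation of eta = y/eps at time lag tau."""
    return d / (alpha * eps**2) * math.exp(-alpha * abs(tau) / eps**2)


def spectrum(x, eps, d, alpha):
    """Power spectrum of eta at frequency x."""
    return (d / math.pi) / (eps**4 * x**2 + alpha**2)


def correlation_mass(eps, d, alpha, n=20_000):
    """Trapezoidal approximation of the integral of R over the real line."""
    upper = 40.0 * eps**2 / alpha   # the tail beyond this point is negligible
    h = upper / n
    total = 0.5 * (correlation(0.0, eps, d, alpha)
                   + correlation(upper, eps, d, alpha))
    for k in range(1, n):
        total += correlation(k * h, eps, d, alpha)
    return 2.0 * h * total          # R is even in tau


D, ALPHA = 0.5, 2.0
for eps in (0.1, 0.01):
    print(eps, correlation(0.0, eps, D, ALPHA), correlation_mass(eps, D, ALPHA))
print("2D/alpha^2 =", 2 * D / ALPHA**2)
print("spectrum at x = 3:", spectrum(3.0, 1e-3, D, ALPHA),
      " limit:", D / (math.pi * ALPHA**2))
```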

Theorem 7.6.1. Assume that the initial conditions for $y(t)$ are stationary and that the function $f$ is smooth. Then the solution of eqn (7.22a) converges, in the limit as $\varepsilon \to 0$, to the solution of the Stratonovich SDE (7.26).

Remarks 7.6.2.

i. It is possible to prove pathwise convergence under very mild assumptions.

ii. The generator of a Stratonovich SDE has the form
$$L_{strat} = h(x)\partial_x + \frac{D}{\alpha^2}f(x)\partial_x\big(f(x)\partial_x\big).$$

iii. Consequently, the Fokker-Planck operator of the Stratonovich SDE can be written in divergence form:
$$L^*_{strat}\,\cdot = -\partial_x\big(h(x)\,\cdot\big) + \frac{D}{\alpha^2}\partial_x\Big(f(x)\partial_x\big(f(x)\,\cdot\big)\Big).$$

iv. In most applications, white noise is an approximation of more complicated noise processes with non-zero correlation time. Hence, the physically correct interpretation of the stochastic integral is the Stratonovich one.

v. In higher dimensions an additional drift term might appear due to the noncommutativity of the row vectors of the diffusion matrix. This is related to the Lévy area correction in the theory of rough paths.

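Proof of Theorem 7.6.1. The generator of the process $(x(t), y(t))$ is
$$\begin{aligned}
L &= \frac{1}{\varepsilon^2}\big({-\alpha y}\partial_y + D\partial_y^2\big) + \frac{1}{\varepsilon}f(x)y\partial_x + h(x)\partial_x \\
&=: \frac{1}{\varepsilon^2}L_0 + \frac{1}{\varepsilon}L_1 + L_2.
\end{aligned}$$
The fast process is a stationary Markov process with invariant density
$$\rho(y) = \sqrt{\frac{\alpha}{2\pi D}}\,e^{-\frac{\alpha y^2}{2D}}. \tag{7.27}$$
The backward Kolmogorov equation is
$$\frac{\partial u^\varepsilon}{\partial t} = \left(\frac{1}{\varepsilon^2}L_0 + \frac{1}{\varepsilon}L_1 + L_2\right)u^\varepsilon. \tag{7.28}$$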

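We look for a solution to this equation in the form of a power series expansion in $\varepsilon$:
$$u^\varepsilon(x, y, t) = u_0 + \varepsilon u_1 + \varepsilon^2 u_2 + \dots$$
We substitute this into (7.28) and equate terms of the same power in $\varepsilon$ to obtain the following hierarchy of equations:
$$\begin{aligned}
-L_0 u_0 &= 0, \\
-L_0 u_1 &= L_1 u_0, \\
-L_0 u_2 &= L_1 u_1 + L_2 u_0 - \frac{\partial u_0}{\partial t}.
\end{aligned}$$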

The ergodicity of the fast process implies that the null space of the generator $L_0$ consists only of constants in $y$. Hence:
$$u_0 = u(x, t).$$
The second equation in the hierarchy becomes
$$-L_0 u_1 = f(x)y\partial_x u.$$
This equation is solvable since the right hand side is orthogonal to the null space of the adjoint of $L_0$ (this is the Fredholm alternative). We solve it using separation of variables:
$$u_1(x, y, t) = \frac{1}{\alpha}f(x)\partial_x u\,y + \psi_1(x, t).$$

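In order for the third equation to have a solution we need to require that the right hand side is orthogonal to the null space of $L_0^*$:
$$\int_{\mathbb{R}}\left(L_1 u_1 + L_2 u_0 - \frac{\partial u_0}{\partial t}\right)\rho(y)\,dy = 0.$$
We calculate:
$$\int_{\mathbb{R}}\frac{\partial u_0}{\partial t}\rho(y)\,dy = \frac{\partial u}{\partial t}.$$
Furthermore:
$$\int_{\mathbb{R}}L_2 u_0\,\rho(y)\,dy = h(x)\partial_x u.$$
Finally
$$\begin{aligned}
\int_{\mathbb{R}}L_1 u_1\,\rho(y)\,dy &= \int_{\mathbb{R}}f(x)y\partial_x\left(\frac{1}{\alpha}f(x)\partial_x u\,y + \psi_1(x, t)\right)\rho(y)\,dy \\
&= \frac{1}{\alpha}f(x)\partial_x\big(f(x)\partial_x u\big)\langle y^2\rangle + f(x)\partial_x\psi_1(x, t)\langle y\rangle \\
&= \frac{D}{\alpha^2}f(x)\partial_x\big(f(x)\partial_x u\big) \\
&= \frac{D}{\alpha^2}f(x)\partial_x f(x)\,\partial_x u + \frac{D}{\alpha^2}f(x)^2\partial_x^2 u.
\end{aligned}$$
Putting everything together we obtain the limiting backward Kolmogorov equation
$$\frac{\partial u}{\partial t} = \left(h(x) + \frac{D}{\alpha^2}f(x)\partial_x f(x)\right)\partial_x u + \frac{D}{\alpha^2}f(x)^2\partial_x^2 u,$$
from which we read off the limiting Stratonovich SDE
$$\frac{dX}{dt} = h(X) + \sqrt{\frac{2D}{\alpha^2}}f(X) \circ \frac{dV}{dt}.$$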

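A Stratonovich SDE
$$dX(t) = f(X(t))\,dt + \sigma(X(t)) \circ dW(t) \tag{7.29}$$
can be written as an Itô SDE
$$dX(t) = \left(f(X(t)) + \frac{1}{2}\left(\sigma\frac{d\sigma}{dx}\right)(X(t))\right)dt + \sigma(X(t))\,dW(t).$$
Conversely, an Itô SDE
$$dX(t) = f(X(t))\,dt + \sigma(X(t))\,dW(t) \tag{7.30}$$
can be written as a Stratonovich SDE
$$dX(t) = \left(f(X(t)) - \frac{1}{2}\left(\sigma\frac{d\sigma}{dx}\right)(X(t))\right)dt + \sigma(X(t)) \circ dW(t).$$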

The Ito and Stratonovich interpretation of an SDE can lead to equations with very

different properties!
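
The drift correction $\pm\frac{1}{2}\sigma\,\frac{d\sigma}{dx}$ is easy to encode. The following sketch, with hypothetical helper names, converts between the two interpretations for scalar SDEs and applies the conversion to the geometric Brownian motion coefficients $f(x) = \mu x$, $\sigma(x) = \sigma x$, chosen here purely as an illustration.

```python
def stratonovich_to_ito_drift(f, sigma, dsigma):
    """Ito drift equivalent to the Stratonovich drift f: f + (1/2) sigma sigma'."""
    return lambda x: f(x) + 0.5 * sigma(x) * dsigma(x)


def ito_to_stratonovich_drift(f, sigma, dsigma):
    """Stratonovich drift equivalent to the Ito drift f: f - (1/2) sigma sigma'."""
    return lambda x: f(x) - 0.5 * sigma(x) * dsigma(x)


mu, s = 0.1, 0.3
drift = lambda x: mu * x          # f(x) = mu x
diffusion = lambda x: s * x       # sigma(x) = s x
ddiffusion = lambda x: s          # sigma'(x) = s

ito_drift = stratonovich_to_ito_drift(drift, diffusion, ddiffusion)
back = ito_to_stratonovich_drift(ito_drift, diffusion, ddiffusion)
print(ito_drift(2.0))   # (mu + s^2/2) * 2
print(back(2.0))        # recovers mu * 2
```

For additive noise ($\sigma$ constant) the correction vanishes and the two interpretations coincide.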

When the diffusion coefficient depends on the solution of the SDE $X(t)$, we
will say that we have an equation with multiplicative noise.

7.8 Parameter Estimation for SDEs

7.9 Noise Induced Transitions

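Consider the Landau equation:
$$\frac{dX_t}{dt} = X_t(c - X_t^2), \quad X_0 = x. \tag{7.31}$$
This is a gradient flow, $\dot{X}_t = -V'(X_t)$, for the potential $V(x) = -\frac{1}{2}cx^2 + \frac{1}{4}x^4$. When $c < 0$ all solutions are attracted to the single steady state $X^* = 0$. When $c > 0$ the steady state $X^* = 0$ becomes unstable and $X_t \to \sqrt{c}$ if $x > 0$, $X_t \to -\sqrt{c}$ if $x < 0$.

Consider additive random perturbations to the Landau equation:
$$\frac{dX_t}{dt} = X_t(c - X_t^2) + \sqrt{2\sigma}\frac{dW_t}{dt}, \quad X_0 = x. \tag{7.32}$$
This equation has a unique invariant distribution:
$$\rho(x) = Z^{-1}e^{-V(x)/\sigma}, \quad Z = \int_{\mathbb{R}}e^{-V(x)/\sigma}\,dx, \quad V(x) = -\frac{1}{2}cx^2 + \frac{1}{4}x^4.$$
$\rho(x)$ is a probability density for all values of $c \in \mathbb{R}$. The presence of additive noise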

in some sense trivializes the dynamics. The dependence of various averaged

quantities on c resembles the physical situation of a second order phase transition.

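Consider now multiplicative perturbations of the Landau equation:
$$\frac{dX_t}{dt} = X_t(c - X_t^2) + \sqrt{2\sigma}X_t\frac{dW_t}{dt}, \quad X_0 = x, \tag{7.33}$$
where the stochastic differential is interpreted in the Itô sense. The generator of this process is
$$L = x(c - x^2)\partial_x + \sigma x^2\partial_x^2.$$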

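Notice that $X_t = 0$ is always a solution of (7.33). Thus, if we start with $x > 0$ ($x < 0$) the solution will remain positive (negative). We will assume that $x > 0$. Consider the function $Y_t = \log(X_t)$. We apply Itô's formula to this function:
$$\begin{aligned}
dY_t &= L\log(X_t)\,dt + \sqrt{2\sigma}X_t\partial_x\log(X_t)\,dW_t \\
&= \left(X_t(c - X_t^2)\frac{1}{X_t} - \sigma X_t^2\frac{1}{X_t^2}\right)dt + \sqrt{2\sigma}X_t\frac{1}{X_t}\,dW_t \\
&= (c - \sigma)\,dt - X_t^2\,dt + \sqrt{2\sigma}\,dW_t.
\end{aligned}$$
Thus, we have been able to transform (7.33) into an SDE with additive noise:
$$dY_t = \left[(c - \sigma) - e^{2Y_t}\right]dt + \sqrt{2\sigma}\,dW_t. \tag{7.34}$$
This is a gradient flow with potential
$$V(y) = -\left[(c - \sigma)y - \frac{1}{2}e^{2y}\right].$$
The invariant measure, if it exists, is of the form
$$\rho(y)\,dy = Z^{-1}e^{-V(y)/\sigma}\,dy.$$
Going back to the variable $x$ we obtain:
$$\rho(x)\,dx = Z^{-1}x^{(c/\sigma - 2)}e^{-\frac{x^2}{2\sigma}}\,dx.$$
We need to make sure that this distribution is integrable:
$$Z = \int_0^{+\infty}x^\gamma e^{-\frac{x^2}{2\sigma}}\,dx < \infty, \quad \gamma = \frac{c}{\sigma} - 2.$$
For this it is necessary that
$$\gamma > -1 \ \Rightarrow\ c > \sigma.$$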
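
The integrability condition can be made concrete. With the substitution $u = x^2/(2\sigma)$ the normalization constant becomes $Z = \frac{1}{2}(2\sigma)^{(\gamma+1)/2}\,\Gamma\big(\frac{\gamma+1}{2}\big)$ for $\gamma > -1$ (a standard Gamma-function identity, not stated in the text). The sketch below, with hypothetical function names, evaluates this expression and compares it with a midpoint quadrature in a case where the integrand is bounded.

```python
import math


def landau_normalization(c, sigma):
    """Normalization constant Z of x^gamma * exp(-x^2/(2 sigma)) on (0, inf),
    with gamma = c/sigma - 2. Returns math.inf when c <= sigma."""
    gamma_exp = c / sigma - 2.0
    if gamma_exp <= -1.0:
        return math.inf  # not integrable at x = 0: no invariant density
    half = 0.5 * (gamma_exp + 1.0)
    return 0.5 * (2.0 * sigma) ** half * math.gamma(half)


def midpoint_normalization(c, sigma, upper=12.0, n=20_000):
    """Midpoint-rule approximation of the same integral on (0, upper)."""
    gamma_exp = c / sigma - 2.0
    h = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += x**gamma_exp * math.exp(-x * x / (2.0 * sigma))
    return h * total


print(landau_normalization(3.0, 1.0), midpoint_normalization(3.0, 1.0))
print(landau_normalization(0.5, 1.0))   # c < sigma: Z is infinite
```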

Not all multiplicative random perturbations lead to ergodic behavior. The dependence of the invariant distribution on c is similar to the physical situation of first

order phase transitions.


Colored Noise When the noise which drives an SDE has non-zero correlation time

we will say that we have colored noise. The properties of the SDE (stability,

ergodicity etc.) are quite robust under coloring of the noise. See G. Blankenship and G.C. Papanicolaou, Stability and control of stochastic systems with wide-band noise disturbances. I, SIAM J. Appl. Math., 34(3), 1978, pp. 437-476. Colored noise appears in many applications in physics and chemistry. For a review see P. Hänggi and P. Jung, Colored noise in dynamical systems, Adv. Chem. Phys. 89, 239 (1995).

In the case where there is an additional small time scale in the problem, in addition to the correlation time of the colored noise, it is not clear what the right interpretation of the stochastic integral is (in the limit as both small time scales go to 0). This is usually called the Itô versus Stratonovich problem. Consider, for example, the SDE
$$\tau\ddot{X} = -\dot{X} + v(X)\eta^\varepsilon(t),$$
where $\eta^\varepsilon(t)$ is colored noise with correlation time $\varepsilon^2$. In the limit where both small time scales go to 0 we can get either Itô or Stratonovich or neither. See [40, 56].

Noise induced transitions are studied extensively in [32]. The material in Section 7.9 is based on [47]. See also [46].

7.11 Exercises

1. Calculate all moments of the geometric Brownian motion for the Ito and Stratonovich

interpretations of the stochastic integral.

2. Study additive and multiplicative random perturbations of the ODE

$$\frac{dx}{dt} = x(c + 2x^2 - x^4).$$

3. Analyze equation (7.33) for the Stratonovich interpretation of the stochastic

integral.

Chapter 8

8.1 Introduction

8.2 The Fokker-Planck Equation in Phase Space (KleinKramers Equation)

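Consider a diffusion process in two dimensions for the variables $q$ (position) and $p$ (momentum). The generator of this Markov process is
$$L = p \cdot \nabla_q - \nabla_q V \cdot \nabla_p + \gamma\big({-p} \cdot \nabla_p + D\Delta_p\big). \tag{8.1}$$
The $L^2(dp\,dq)$-adjoint is
$$L^*\rho = -p \cdot \nabla_q\rho + \nabla_q V \cdot \nabla_p\rho + \gamma\big(\nabla_p \cdot (p\rho) + D\Delta_p\rho\big).$$
The corresponding FP equation is:
$$\frac{\partial p}{\partial t} = L^* p.$$
The corresponding stochastic differential equation is the Langevin equation
$$\ddot{X}_t = -\nabla V(X_t) - \gamma\dot{X}_t + \sqrt{2\gamma D}\,\dot{W}_t. \tag{8.2}$$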

The Fokker-Planck equation for the Langevin equation, which is sometimes called the Klein-Kramers-Chandrasekhar equation, was first derived by Kramers in 1923 and was studied by Kramers in his famous paper [?]. Notice that $L^*$ is not a uniformly elliptic operator: there are second order derivatives only with respect to $p$ and not $q$. This is an example of a degenerate elliptic operator. It is, however, hypoelliptic. We can still prove existence, uniqueness and regularity of solutions for the Fokker-Planck equation, and obtain estimates on the solution. It is not possible to obtain the solution of the FP equation for an arbitrary potential. We can, however, calculate the (unique normalized) solution of the stationary Fokker-Planck equation.

Theorem 8.2.1. Let $V(x)$ be a smooth confining potential. Then the Markov process with generator (8.1) is ergodic. The unique invariant distribution is the Maxwell-Boltzmann distribution
$$\rho(p, q) = \frac{1}{Z}e^{-\beta H(p, q)} \tag{8.3}$$
where
$$H(p, q) = \frac{1}{2}\|p\|^2 + V(q)$$
is the Hamiltonian, $\beta = (k_B T)^{-1}$ is the inverse temperature and the normalization factor $Z$ is the partition function
$$Z = \int_{\mathbb{R}^{2d}}e^{-\beta H(p, q)}\,dp\,dq.$$
It is possible to obtain rates of convergence to equilibrium in the relative entropy norm:
$$H\big(p(\cdot, t)\,|\,\rho\big) \leq Ce^{-\alpha t}.$$
The proof of this result is very complicated, since the generator $L$ is degenerate and non-selfadjoint. See for example [?] and the references therein.

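Let $\rho(q, p, t)$ be the solution of the Kramers equation and let $\rho_\beta(q, p)$ be the Maxwell-Boltzmann distribution. We can write
$$\rho(q, p, t) = h(q, p, t)\rho_\beta(q, p),$$
where $h(q, p, t)$ solves the equation
$$\frac{\partial h}{\partial t} = -Ah + \gamma Sh \tag{8.4}$$
where
$$A = p \cdot \nabla_q - \nabla_q V \cdot \nabla_p, \quad S = -p \cdot \nabla_p + \beta^{-1}\Delta_p.$$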

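Let $X_i := -\frac{\partial}{\partial p_i}$. The $L^2_\rho$-adjoint of $X_i$ is
$$X_i^* = -\beta p_i + \frac{\partial}{\partial p_i}.$$
We have that
$$S = -\beta^{-1}\sum_{i=1}^d X_i^* X_i.$$
Consequently, the generator of the Markov process $\{q(t), p(t)\}$ can be written in Hörmander's sum of squares form:
$$L = A - \gamma\beta^{-1}\sum_{i=1}^d X_i^* X_i. \tag{8.5}$$
We calculate the commutators between the vector fields in (8.5):
$$[A, X_i] = \frac{\partial}{\partial q_i}, \quad [X_i, X_j] = 0, \quad [X_i, X_j^*] = \beta\delta_{ij}.$$
Consequently,
$$\mathrm{Lie}(X_1, \dots, X_d, [A, X_1], \dots, [A, X_d]) = \mathrm{Lie}(\nabla_p, \nabla_q)$$
which spans $T_{p,q}\mathbb{R}^{2d}$ for all $p, q \in \mathbb{R}^d$. This shows that the generator $L$ is a hypoelliptic operator.

Let now $Y_i = -\frac{\partial}{\partial q_i}$ with $L^2_\rho$-adjoint $Y_i^* = \frac{\partial}{\partial q_i} - \beta\frac{\partial V}{\partial q_i}$. We have that
$$X_i^* Y_i - Y_i^* X_i = \beta\left(p_i\frac{\partial}{\partial q_i} - \frac{\partial V}{\partial q_i}\frac{\partial}{\partial p_i}\right).$$
Consequently, the generator can be written in the form
$$L = \beta^{-1}\sum_{i=1}^d\big(X_i^* Y_i - Y_i^* X_i - \gamma X_i^* X_i\big).$$
Notice also that
$$L_V := -\nabla_q V \cdot \nabla_q + \beta^{-1}\Delta_q = -\beta^{-1}\sum_{i=1}^d Y_i^* Y_i.$$
The phase-space Fokker-Planck equation can be written in the form
$$\frac{\partial\rho}{\partial t} + p \cdot \nabla_q\rho - \nabla_q V \cdot \nabla_p\rho = Q(\rho, f_B) \tag{8.6}$$
where the collision operator has the form
$$Q(\rho, f_B) = D\nabla \cdot \big(f_B\nabla(f_B^{-1}\rho)\big).$$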

The Fokker-Planck equation has a similar structure to the Boltzmann equation (the

basic equation in the kinetic theory of gases), with the difference that the collision

operator for the FP equation is linear. Convergence of solutions of the Boltzmann

equation to the Maxwell-Boltzmann distribution has also been proved. See ??.

We can study the backward and forward Kolmogorov equations for (9.13) by

expanding the solution with respect to the Hermite basis. We consider the problem

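in 1d. We set $D = 1$. The generator of the process is:
$$L = p\partial_q - V'(q)\partial_p + \gamma\big({-p}\partial_p + \partial_p^2\big) =: L_1 + \gamma L_0,$$
where
$$L_0 := -p\partial_p + \partial_p^2 \quad \text{and} \quad L_1 := p\partial_q - V'(q)\partial_p.$$
The backward Kolmogorov equation is
$$\frac{\partial h}{\partial t} = Lh. \tag{8.7}$$
The solution should be an element of the weighted $L^2$-space
$$L^2_\rho = \left\{f \ \Big|\ \int_{\mathbb{R}^2}|f|^2 Z^{-1}e^{-\beta H(p, q)}\,dp\,dq < \infty\right\}.$$
We notice that the invariant measure of our Markov process is a product measure:
$$e^{-\beta H(p, q)} = e^{-\beta\frac{1}{2}|p|^2}e^{-\beta V(q)}.$$
The space $L^2(e^{-\beta\frac{1}{2}|p|^2}\,dp)$ is spanned by the Hermite polynomials. Consequently, we can expand the solution of (8.7) into the Hermite basis:
$$h(p, q, t) = \sum_{n=0}^\infty h_n(q, t)f_n(p), \tag{8.8}$$
where $f_n(p) = \frac{1}{\sqrt{n!}}H_n(p)$. Our plan is to substitute (8.8) into (8.7) and obtain a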

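sequence of equations for the coefficients $h_n(q, t)$. We have:
$$L_0 h = L_0\sum_{n=0}^\infty h_n f_n = -\sum_{n=0}^\infty n h_n f_n.$$
Furthermore
$$L_1 h = -\partial_q V\partial_p h + p\partial_q h.$$
We calculate each term on the right hand side of the above equation separately. For this we will need the formulas
$$\partial_p f_n = \sqrt{n}f_{n-1} \quad \text{and} \quad pf_n = \sqrt{n}f_{n-1} + \sqrt{n+1}f_{n+1}.$$
We have
$$\begin{aligned}
p\partial_q h &= p\partial_q\sum_{n=0}^\infty h_n f_n = p\partial_q h_0 + \sum_{n=1}^\infty\partial_q h_n\,pf_n \\
&= \partial_q h_0 f_1 + \sum_{n=1}^\infty\partial_q h_n\big(\sqrt{n}f_{n-1} + \sqrt{n+1}f_{n+1}\big) \\
&= \sum_{n=0}^\infty\big(\sqrt{n+1}\,\partial_q h_{n+1} + \sqrt{n}\,\partial_q h_{n-1}\big)f_n
\end{aligned}$$
with $h_{-1} \equiv 0$. Furthermore
$$\partial_q V\partial_p h = \sum_{n=0}^\infty\partial_q V h_n\partial_p f_n = \sum_{n=0}^\infty\partial_q V h_n\sqrt{n}f_{n-1} = \sum_{n=0}^\infty\partial_q V h_{n+1}\sqrt{n+1}f_n.$$
Consequently:
$$\begin{aligned}
Lh &= (L_1 + \gamma L_0)h \\
&= \sum_{n=0}^\infty\Big({-\gamma n}h_n + \sqrt{n+1}\,\partial_q h_{n+1} + \sqrt{n}\,\partial_q h_{n-1} - \sqrt{n+1}\,\partial_q V h_{n+1}\Big)f_n.
\end{aligned}$$
Using the orthonormality of the eigenfunctions of $L_0$ we obtain the following set of equations which determine $\{h_n(q, t)\}_{n=0}^\infty$:
$$\dot{h}_n = -\gamma n h_n + \sqrt{n+1}\,\partial_q h_{n+1} + \sqrt{n}\,\partial_q h_{n-1} - \sqrt{n+1}\,\partial_q V h_{n+1}, \quad n = 0, 1, \dots$$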

This set of equations is usually called the Brinkman hierarchy (1956). We can use this approach to develop a numerical method for solving the Klein-Kramers equation. For this we need to expand each coefficient $h_n$ in an appropriate basis with respect to $q$. Obvious choices are either the Hermite basis (polynomial potentials) or the standard Fourier basis (periodic potentials). We will do this for the case of periodic potentials. The resulting method is usually called the continued fraction expansion. See [64]. The Hermite expansion of the distribution function with respect to the velocity is used in the study of various kinetic equations (including the Boltzmann equation). It was initiated by Grad in the late 40s. It is quite often used in the approximate calculation of transport coefficients (e.g. diffusion coefficient). This expansion can be justified rigorously for the Fokker-Planck equation. See [53]. This expansion can also be used in order to solve the Poisson equation $-L\phi = f(p, q)$. See [58].
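
The two identities that drive the Hermite expansion, $\partial_p f_n = \sqrt{n}f_{n-1}$ and $pf_n = \sqrt{n}f_{n-1} + \sqrt{n+1}f_{n+1}$, also give a stable way of evaluating the normalized basis functions. The sketch below assumes $\beta = 1$ and the probabilists' convention for $H_n$ (so that $H_2(p) = p^2 - 1$), which is the convention under which $f_n = H_n/\sqrt{n!}$ is orthonormal with respect to the standard Gaussian weight; the function names are hypothetical.

```python
import math


def normalized_hermite(n_max, p):
    """Return [f_0(p), ..., f_{n_max}(p)] with f_n = H_n / sqrt(n!),
    computed from p f_n = sqrt(n) f_{n-1} + sqrt(n+1) f_{n+1}."""
    f = [1.0]
    if n_max >= 1:
        f.append(p)
    for n in range(1, n_max):
        f.append((p * f[n] - math.sqrt(n) * f[n - 1]) / math.sqrt(n + 1))
    return f


def l0_on_basis(n, p, h=1e-4):
    """Finite-difference evaluation of L_0 f_n = -p f_n' + f_n'' at the point p."""
    centre = normalized_hermite(n, p)[n]
    plus = normalized_hermite(n, p + h)[n]
    minus = normalized_hermite(n, p - h)[n]
    first = (plus - minus) / (2.0 * h)
    second = (plus - 2.0 * centre + minus) / (h * h)
    return -p * first + second


p = 0.7
values = normalized_hermite(4, p)
print(values[2], (p**2 - 1.0) / math.sqrt(2.0))
print(l0_on_basis(3, p), -3.0 * values[3])   # L_0 f_3 = -3 f_3
```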

There are very few potentials for which we can solve the Langevin equation or calculate the eigenvalues and eigenfunctions of the generator of the Markov process $\{q(t), p(t)\}$. One case where we can calculate everything explicitly is that of a Brownian particle in a quadratic (harmonic) potential
$$V(q) = \frac{1}{2}\omega_0^2 q^2. \tag{8.9}$$
The Langevin equation is
$$\ddot{q} = -\omega_0^2 q - \gamma\dot{q} + \sqrt{2\gamma\beta^{-1}}\,\dot{W} \tag{8.10}$$
or
$$\dot{q} = p, \quad \dot{p} = -\omega_0^2 q - \gamma p + \sqrt{2\gamma\beta^{-1}}\,\dot{W}. \tag{8.11}$$

This is a linear equation that can be solved explicitly. Rather than doing this, we

will calculate the eigenvalues and eigenfunctions of the generator, which takes the

form

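$$L = p\partial_q - \omega_0^2 q\partial_p + \gamma\big({-p}\partial_p + \beta^{-1}\partial_p^2\big). \tag{8.12}$$
The Fokker-Planck operator is
$$L^*\rho = -p\partial_q\rho + \omega_0^2 q\partial_p\rho + \gamma\big(\partial_p(p\rho) + \beta^{-1}\partial_p^2\rho\big). \tag{8.13}$$
The process $\{q(t), p(t)\}$ is an ergodic Markov process with Gaussian invariant measure
$$\rho_\beta(q, p)\,dq\,dp = \frac{\beta\omega_0}{2\pi}e^{-\frac{\beta}{2}p^2 - \frac{\beta\omega_0^2}{2}q^2}\,dq\,dp. \tag{8.14}$$
For the calculation of the eigenvalues and eigenfunctions of the generator $L$ it is convenient to introduce creation and annihilation operators in both the position and momentum variables. We set
$$a^- = \beta^{-1/2}\partial_p, \quad a^+ = -\beta^{-1/2}\partial_p + \beta^{1/2}p \tag{8.15}$$
and
$$b^- = \omega_0^{-1}\beta^{-1/2}\partial_q, \quad b^+ = -\omega_0^{-1}\beta^{-1/2}\partial_q + \omega_0\beta^{1/2}q. \tag{8.16}$$
We have that
$$a^+ a^- = -\beta^{-1}\partial_p^2 + p\partial_p$$
and
$$b^+ b^- = -\omega_0^{-2}\beta^{-1}\partial_q^2 + q\partial_q.$$
Consequently, the operator
$$\widehat{L} = -a^+ a^- - b^+ b^- \tag{8.17}$$
is the generator of the OU process in two dimensions. The operators $a^\pm$, $b^\pm$ satisfy the commutation relations
$$[a^+, a^-] = -1, \tag{8.18a}$$
$$[b^+, b^-] = -1, \tag{8.18b}$$
$$[a^\pm, b^\pm] = 0. \tag{8.18c}$$
See Exercise 3. Using now the operators $a^\pm$ and $b^\pm$ we can write the generator $L$ in the form
$$L = -\gamma a^+ a^- - \omega_0(b^+ a^- - a^+ b^-), \tag{8.19}$$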

which is a particular case of (8.6). In order to calculate the eigenvalues and eigenfunctions of (8.19) we need to make an appropriate change of variables in order

to bring the operator L into the decoupled form (8.17). Clearly, this is a linear

transformation and can be written in the form

Y = AX

where $X = (q, p)$ for some $2 \times 2$ matrix $A$. It is somewhat easier to make this change of variables at the level of the creation and annihilation operators. In particular, our goal is to find first order differential operators $c^\pm$ and $d^\pm$ so that the operator (8.19) becomes
$$L = -Cc^+ c^- - Dd^+ d^- \tag{8.20}$$
for some appropriate constants $C$ and $D$. Since our goal is, essentially, to map $L$ to the two-dimensional OU process, we require that the operators $c^\pm$ and $d^\pm$ satisfy the canonical commutation relations
$$[c^+, c^-] = -1, \tag{8.21a}$$
$$[d^+, d^-] = -1, \tag{8.21b}$$
$$[c^\pm, d^\pm] = 0. \tag{8.21c}$$
The operators $c^\pm$ and $d^\pm$ should be given as linear combinations of the old operators $a^\pm$ and $b^\pm$. From the structure of the generator $L$ (8.19), the decoupled form (8.20) and the commutation relations (8.21) and (8.18) we conclude that $c^\pm$ and $d^\pm$ should be of the form
$$c^+ = \alpha_{11}a^+ + \alpha_{12}b^+, \tag{8.22a}$$
$$c^- = \alpha_{21}a^- + \alpha_{22}b^-, \tag{8.22b}$$
$$d^+ = \beta_{11}a^+ + \beta_{12}b^+, \tag{8.22c}$$
$$d^- = \beta_{21}a^- + \beta_{22}b^-. \tag{8.22d}$$
Notice that $c^-$ and $d^-$ are not the adjoints of $c^+$ and $d^+$. If we substitute now these equations into (8.20) and equate it with (8.19) and into the commutation relations (8.21) we obtain a system of equations for the coefficients $\{\alpha_{ij}\}$, $\{\beta_{ij}\}$. In order to write down the formulas for these coefficients it is convenient to introduce the eigenvalues of the deterministic problem
$$\ddot{q} = -\gamma\dot{q} - \omega_0^2 q.$$
The solution of this equation is
$$q(t) = C_1 e^{-\lambda_1 t} + C_2 e^{-\lambda_2 t}$$
with
$$\lambda_{1,2} = \frac{\gamma \pm \delta}{2}, \quad \delta = \sqrt{\gamma^2 - 4\omega_0^2}. \tag{8.23}$$
The eigenvalues satisfy the relations
$$\lambda_1 + \lambda_2 = \gamma, \quad \lambda_1 - \lambda_2 = \delta, \quad \lambda_1\lambda_2 = \omega_0^2. \tag{8.24}$$

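Proposition 8.3.1. Let $L$ be the generator (8.19) and let $c^\pm$, $d^\pm$ be the operators
$$c^+ = \frac{1}{\sqrt{\delta}}\left(\sqrt{\lambda_1}\,a^+ + \sqrt{\lambda_2}\,b^+\right), \tag{8.25a}$$
$$c^- = \frac{1}{\sqrt{\delta}}\left(\sqrt{\lambda_1}\,a^- - \sqrt{\lambda_2}\,b^-\right), \tag{8.25b}$$
$$d^+ = \frac{1}{\sqrt{\delta}}\left(\sqrt{\lambda_2}\,a^+ + \sqrt{\lambda_1}\,b^+\right), \tag{8.25c}$$
$$d^- = \frac{1}{\sqrt{\delta}}\left(-\sqrt{\lambda_2}\,a^- + \sqrt{\lambda_1}\,b^-\right). \tag{8.25d}$$
Then $c^\pm$, $d^\pm$ satisfy the canonical commutation relations (8.21) as well as
$$[L, c^\pm] = \mp\lambda_1 c^\pm, \quad [L, d^\pm] = \mp\lambda_2 d^\pm. \tag{8.26}$$
Furthermore, the operator $L$ can be written in the form
$$L = -\lambda_1 c^+ c^- - \lambda_2 d^+ d^-. \tag{8.27}$$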

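Proof. First we check the commutation relations:
$$[c^+, c^-] = \frac{1}{\delta}\big(\lambda_1[a^+, a^-] - \lambda_2[b^+, b^-]\big) = \frac{1}{\delta}(-\lambda_1 + \lambda_2) = -1.$$
Similarly,
$$[d^+, d^-] = \frac{1}{\delta}\big({-\lambda_2}[a^+, a^-] + \lambda_1[b^+, b^-]\big) = \frac{1}{\delta}(\lambda_2 - \lambda_1) = -1.$$
Clearly, we have that
$$[c^+, d^+] = [c^-, d^-] = 0.$$
Furthermore,
$$[c^+, d^-] = \frac{1}{\delta}\left(-\sqrt{\lambda_1\lambda_2}\,[a^+, a^-] + \sqrt{\lambda_1\lambda_2}\,[b^+, b^-]\right) = \frac{1}{\delta}\left(\sqrt{\lambda_1\lambda_2} - \sqrt{\lambda_1\lambda_2}\right) = 0.$$
Finally:
$$\begin{aligned}
[L, c^+] &= -\lambda_1 c^+ c^- c^+ + \lambda_1 c^+ c^+ c^- \\
&= -\lambda_1 c^+(1 + c^+ c^-) + \lambda_1 c^+ c^+ c^- \\
&= -\lambda_1 c^+,
\end{aligned}$$
and similarly for the other equations in (8.26). Now we calculate
$$\begin{aligned}
L &= -\lambda_1 c^+ c^- - \lambda_2 d^+ d^- \\
&= -\frac{\lambda_1^2 - \lambda_2^2}{\delta}a^+ a^- + 0\,b^+ b^- + \frac{\sqrt{\lambda_1\lambda_2}}{\delta}(\lambda_1 - \lambda_2)a^+ b^- + \frac{\sqrt{\lambda_1\lambda_2}}{\delta}(-\lambda_1 + \lambda_2)b^+ a^- \\
&= -\gamma a^+ a^- - \omega_0(b^+ a^- - a^+ b^-),
\end{aligned}$$
which is precisely (8.19). In the above calculation we used (8.24).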

Using now (8.27) we can readily obtain the eigenvalues and eigenfunctions of
$L$. From our experience with the two-dimensional OU processes (or, the Schrödinger
operator for the two-dimensional quantum harmonic oscillator), we expect that the

eigenfunctions should be tensor products of Hermite polynomials. Indeed, we have

the following, which is the main result of this section.

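Theorem 8.3.2. The eigenvalues and eigenfunctions of the generator of the Markov process $\{q, p\}$ (8.11), with the sign convention $-L\phi_{nm} = \lambda_{nm}\phi_{nm}$, are
$$\lambda_{nm} = \lambda_1 n + \lambda_2 m = \frac{1}{2}\gamma(n + m) + \frac{1}{2}\delta(n - m), \quad n, m = 0, 1, \dots \tag{8.28}$$
and
$$\phi_{nm}(q, p) = \frac{1}{\sqrt{n!\,m!}}(c^+)^n(d^+)^m 1, \quad n, m = 0, 1, \dots \tag{8.29}$$

Proof. We have
$$[L, (c^+)^2] = L(c^+)^2 - (c^+)^2 L = -2\lambda_1(c^+)^2$$
and similarly $[L, (d^+)^2] = -2\lambda_2(d^+)^2$. A simple induction argument now shows that (see Exercise 8.3.3)
$$[L, (c^+)^n] = -n\lambda_1(c^+)^n \quad \text{and} \quad [L, (d^+)^m] = -m\lambda_2(d^+)^m. \tag{8.30}$$
We use (8.30), together with $L1 = 0$, to calculate
$$L(c^+)^n(d^+)^m 1 = -(n\lambda_1 + m\lambda_2)(c^+)^n(d^+)^m 1,$$
from which (8.28) and (8.29) follow.

Exercise 8.3.3. Show that
$$[L, (c^\pm)^n] = \mp n\lambda_1(c^\pm)^n, \quad [L, (d^\pm)^n] = \mp n\lambda_2(d^\pm)^n, \quad [c^-, (c^+)^n] = n(c^+)^{n-1}, \quad [d^-, (d^+)^n] = n(d^+)^{n-1}. \tag{8.31}$$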

nm

n m

= n!m! 2 1 2

=0 k=0

1

k!(m k)!!(n )!

1

2

k

2

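The first few eigenfunctions are
$$\phi_{00} = 1,$$
$$\phi_{10} = \frac{\sqrt{\beta}\left(\sqrt{\lambda_1}\,p + \sqrt{\lambda_2}\,\omega_0 q\right)}{\sqrt{\delta}},$$
$$\phi_{01} = \frac{\sqrt{\beta}\left(\sqrt{\lambda_2}\,p + \sqrt{\lambda_1}\,\omega_0 q\right)}{\sqrt{\delta}},$$
$$\phi_{11} = \frac{-2\sqrt{\lambda_1}\sqrt{\lambda_2} + \sqrt{\lambda_1}\,\beta p^2\sqrt{\lambda_2} + \beta p\lambda_1\omega_0 q + \omega_0\beta q\lambda_2 p + \sqrt{\lambda_2}\,\omega_0^2\beta q^2\sqrt{\lambda_1}}{\delta},$$
$$\phi_{20} = \frac{-\lambda_1 + \beta p^2\lambda_1 + 2\sqrt{\lambda_2}\,\beta p\sqrt{\lambda_1}\,\omega_0 q - \lambda_2 + \omega_0^2\beta q^2\lambda_2}{\sqrt{2}\,\delta},$$
$$\phi_{02} = \frac{-\lambda_2 + \beta p^2\lambda_2 + 2\sqrt{\lambda_2}\,\beta p\sqrt{\lambda_1}\,\omega_0 q - \lambda_1 + \omega_0^2\beta q^2\lambda_1}{\sqrt{2}\,\delta}.$$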

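As we already know, the first eigenvalue, corresponding to the constant eigenfunction, is 0:

$$\lambda_{00} = 0.$$

Notice that the operator $L$ is not self-adjoint and consequently we do not expect its eigenvalues to be real. Indeed, whether the eigenvalues are real or not depends on the sign of the discriminant $\Delta = \gamma^2 - 4\omega_0^2$. In the underdamped regime, $\gamma < 2\omega_0$, the eigenvalues are complex:

$$\lambda_{nm} = \frac{1}{2}\gamma(n+m) + \frac{1}{2}i\sqrt{-\gamma^2 + 4\omega_0^2}\,(n-m), \quad \gamma < 2\omega_0.$$

This is to be expected, since in the underdamped regime the dynamics is dominated by the deterministic Hamiltonian dynamics that give rise to the antisymmetric Liouville operator. We set $\omega = \frac{1}{2}\sqrt{4\omega_0^2 - \gamma^2}$, i.e. $\delta = 2i\omega$. The eigenvalues can be written as

$$\lambda_{nm} = \frac{\gamma}{2}(n+m) + i\omega(n-m).$$

In Figure 8.3 we present the first few eigenvalues of $L$ in the underdamped regime. The eigenvalues are contained in a cone on the right half of the complex plane. The cone is determined by

$$\lambda_{n0} = \frac{\gamma}{2}n + i\omega n \quad\text{and}\quad \lambda_{0m} = \frac{\gamma}{2}m - i\omega m.$$

The eigenvalues along the diagonal are real:

$$\lambda_{nn} = \gamma n.$$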

On the other hand, in the overdamped regime, $\gamma > 2\omega_0$, all eigenvalues are real:

$$\lambda_{nm} = \frac{1}{2}\gamma(n+m) + \frac{1}{2}\sqrt{\gamma^2 - 4\omega_0^2}\,(n-m), \quad \gamma > 2\omega_0.$$

In fact, in the overdamped limit $\gamma \to +\infty$ (which we will study in Chapter ??), the eigenvalues of the generator $L$ converge to the eigenvalues of the generator of the OU process:

$$\lambda_{nm} = \gamma n - \frac{\omega_0^2}{\gamma}(n-m) + O(\gamma^{-3}).$$

This is consistent with the fact that in this limit the solution of the Langevin equation converges to the solution of the OU SDE. See Chapter ?? for details.
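The formula (8.28) is easy to evaluate numerically. The following is a minimal sketch, not part of the original notes: the function name `langevin_eigenvalue` and the parameter values are illustrative assumptions. It evaluates $\lambda_{nm}$ in both regimes and compares the overdamped value of $\lambda_{01}$ with the leading-order approximation $\omega_0^2/\gamma$.

```python
import cmath

def langevin_eigenvalue(n, m, gamma, omega0):
    """Evaluate lambda_nm = gamma/2 (n+m) + delta/2 (n-m), delta = sqrt(gamma^2 - 4 omega0^2).

    Hypothetical helper for illustration; cmath.sqrt returns a purely
    imaginary delta when gamma < 2*omega0 (underdamped regime).
    """
    delta = cmath.sqrt(gamma**2 - 4.0 * omega0**2)
    return 0.5 * gamma * (n + m) + 0.5 * delta * (n - m)

# Underdamped example (gamma < 2*omega0): the eigenvalue is complex.
under = langevin_eigenvalue(1, 0, gamma=1.0, omega0=1.0)

# Overdamped example (gamma > 2*omega0): the eigenvalue is real and
# lambda_01 is close to omega0^2 / gamma for large gamma.
over = langevin_eigenvalue(0, 1, gamma=10.0, omega0=1.0)
leading_order = 1.0**2 / 10.0

print(under, over, leading_order)
```

For large friction the slow eigenvalues $\lambda_{0m}$ scale like $\omega_0^2 m/\gamma$, which is the time scale on which the position relaxes in the overdamped description.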

Figure 8.1: First few eigenvalues of $L$ for $\gamma = \omega_0 = 1$ (horizontal axis: $\mathrm{Re}(\lambda_{nm})$; vertical axis: $\mathrm{Im}(\lambda_{nm})$).

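The eigenfunctions of $L$ do not form an orthonormal basis in $L^2(\mathbb{R}^2; Z^{-1}e^{-\beta H})$, since $L$ is not a selfadjoint operator. Using the eigenfunctions/eigenvalues of $L$ we can easily calculate the eigenfunctions/eigenvalues of the $L^2$ adjoint of $L$. From the calculations presented in Section 8.2 we know that the adjoint operator is

$$\widehat{L} := -A + \gamma S \tag{8.32}$$

$$= \omega_0(b^+a^- - b^-a^+) - \gamma a^+a^- \tag{8.33}$$

$$= -\lambda_1(c^-)^*(c^+)^* - \lambda_2(d^-)^*(d^+)^*, \tag{8.34}$$

where

$$(c^+)^* = \frac{1}{\sqrt{\delta}}\left(\sqrt{\lambda_1}\,a^- + \sqrt{\lambda_2}\,b^-\right), \tag{8.35a}$$

$$(c^-)^* = \frac{1}{\sqrt{\delta}}\left(\sqrt{\lambda_1}\,a^+ - \sqrt{\lambda_2}\,b^+\right), \tag{8.35b}$$

$$(d^+)^* = \frac{1}{\sqrt{\delta}}\left(\sqrt{\lambda_2}\,a^- + \sqrt{\lambda_1}\,b^-\right), \tag{8.35c}$$

$$(d^-)^* = \frac{1}{\sqrt{\delta}}\left(-\sqrt{\lambda_2}\,a^+ + \sqrt{\lambda_1}\,b^+\right). \tag{8.35d}$$

The operator $\widehat{L}$ has the same eigenvalues as $L$:

$$-\widehat{L}\psi_{nm} = \lambda_{nm}\psi_{nm}, \qquad \psi_{nm} = \frac{1}{\sqrt{n!\,m!}}\,((c^-)^*)^n((d^-)^*)^m 1. \tag{8.36}$$

Furthermore, the eigenfunctions of $L$ and $\widehat{L}$ satisfy the biorthonormality relation

$$\int\!\!\int \phi_{nm}\,\psi_{\ell k}\,\rho_\beta\,dp\,dq = \delta_{n\ell}\,\delta_{mk}. \tag{8.37}$$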

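Proof. We will use formulas (8.31). Notice that using the third and fourth of these equations together with the fact that $c^- 1 = d^- 1 = 0$ we can conclude that (for $n \geq \ell$)

$$(c^-)^\ell (c^+)^n 1 = n(n-1)\cdots(n-\ell+1)\,(c^+)^{n-\ell}\,1. \tag{8.38}$$

We have

$$\int\!\!\int \phi_{nm}\psi_{\ell k}\rho_\beta\,dp\,dq = \frac{1}{\sqrt{n!\,m!\,\ell!\,k!}}\int\!\!\int \big((c^+)^n(d^+)^m 1\big)\big(((c^-)^*)^\ell((d^-)^*)^k 1\big)\rho_\beta\,dp\,dq$$

$$= \frac{n(n-1)\cdots(n-\ell+1)\,m(m-1)\cdots(m-k+1)}{\sqrt{n!\,m!\,\ell!\,k!}}\int\!\!\int (c^+)^{n-\ell}(d^+)^{m-k}\,1\;\rho_\beta\,dp\,dq$$

$$= \delta_{n\ell}\,\delta_{mk},$$

since every eigenfunction other than the constant one averages to 0 with respect to $\rho_\beta$.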

From the eigenfunctions of $\widehat{L}$ we can obtain the eigenfunctions of the Fokker-Planck operator. Using the formula (see equation (8.4))

$$L^*(f\rho_\beta) = \rho_\beta\,\widehat{L}f$$

we immediately conclude that the Fokker-Planck operator has the same eigenvalues as those of $L$ and $\widehat{L}$. The eigenfunctions are

$$\psi^*_{nm} = \rho_\beta\,\psi_{nm} = \rho_\beta\,\frac{1}{\sqrt{n!\,m!}}\,((c^-)^*)^n((d^-)^*)^m 1. \tag{8.39}$$

There are very few SDEs/Fokker-Planck equations that can be solved explicitly. In

most cases we need to study the problem under investigation either approximately

or numerically. In this part of the course we will develop approximate methods for

studying various stochastic systems of practical interest. There are many problems

of physical interest that can be analyzed using techniques from perturbation theory

and asymptotic analysis:

i. Small noise asymptotics at finite time intervals.

ii. Small noise asymptotics/large times (rare events): the theory of large deviations, escape from a potential well, exit time problems.

iii. Small and large friction asymptotics for the Fokker-Planck equation: The Freidlin-Wentzell (underdamped) and Smoluchowski (overdamped) limits.

iv. Large time asymptotics for the Langevin equation in a periodic potential:

homogenization and averaging.


and methods.

We will study various asymptotic limits for the Langevin equation (we have set $m = 1$)

$$\ddot{q} = -\nabla V(q) - \gamma\dot{q} + \sqrt{2\gamma\beta^{-1}}\,\dot{W}. \tag{8.40}$$

There are two parameters in the problem, the friction coefficient $\gamma$ and the inverse temperature $\beta$. We want to study the qualitative behavior of solutions to this equation (and to the corresponding Fokker-Planck equation). There are various asymptotic limits at which we can eliminate some of the variables of the equation and obtain a simpler equation for fewer variables. In the large temperature limit, $\beta \ll 1$, the dynamics of (8.40) is dominated by diffusion: the Langevin equation (8.40) can be approximated by free Brownian motion:

$$\dot{q} = \sqrt{2\gamma\beta^{-1}}\,\dot{W}.$$

The small temperature asymptotics, $\beta \gg 1$, is more subtle. It leads to exponential, Arrhenius type asymptotics for the reaction rate (in the case of a particle escaping from a potential well due to thermal noise) or the diffusion coefficient (in the case of a particle moving in a periodic potential in the presence of thermal noise)

$$\kappa = \nu\exp(-\beta E_b), \tag{8.41}$$

where $\kappa$ can be either the reaction rate or the diffusion coefficient. The small temperature asymptotics will be studied later for the case of a bistable potential (reaction rate) and for the case of a periodic potential (diffusion coefficient).

Assuming that the temperature is fixed, the only parameter that is left is the friction coefficient $\gamma$. The large and small friction asymptotics can be expressed in terms of a slow/fast system of SDEs. In many applications (especially in biology) the friction coefficient is large: $\gamma \gg 1$. In this case the momentum is the fast variable which we can eliminate to obtain an equation for the position. This is the overdamped or Smoluchowski limit. In various problems in physics the friction coefficient is small: $\gamma \ll 1$. In this case the position is the fast variable whereas the energy is the slow variable. We can eliminate the position and obtain an equation for the energy. This is the underdamped or Freidlin-Wentzell limit. In both cases we have to look at sufficiently long time scales.

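We rescale the solution to (8.40):

$$q^\gamma(t) = \lambda_\gamma\,q(t/\mu_\gamma).$$

This rescaled process satisfies the equation

$$\ddot{q}^\gamma = -\frac{\lambda_\gamma}{\mu_\gamma^2}\,\partial_q V(q^\gamma/\lambda_\gamma) - \frac{\gamma}{\mu_\gamma}\,\dot{q}^\gamma + \sqrt{2\gamma\lambda_\gamma^2\mu_\gamma^{-3}\beta^{-1}}\,\dot{W}. \tag{8.42}$$

Different choices for these two parameters lead to the overdamped and underdamped limits: $\lambda_\gamma = 1$, $\mu_\gamma = \gamma^{-1}$, $\gamma \gg 1$. In this case equation (8.42) becomes

$$\gamma^{-2}\ddot{q}^\gamma = -\partial_q V(q^\gamma) - \dot{q}^\gamma + \sqrt{2\beta^{-1}}\,\dot{W}. \tag{8.43}$$

Under this scaling, the interesting limit is the overdamped limit, $\gamma \gg 1$. We will see later that in the limit as $\gamma \to +\infty$ the solution to (8.43) can be approximated by the solution to

$$\dot{q} = -\partial_q V + \sqrt{2\beta^{-1}}\,\dot{W}.$$

The second choice is $\lambda_\gamma = 1$, $\mu_\gamma = \gamma$, $\gamma \ll 1$:

$$\ddot{q}^\gamma = -\gamma^{-2}\nabla V(q^\gamma) - \dot{q}^\gamma + \sqrt{2\gamma^{-2}\beta^{-1}}\,\dot{W}. \tag{8.44}$$

Under this scaling the interesting limit is the underdamped limit, $\gamma \ll 1$. We will see later that in the limit as $\gamma \to 0$ the energy of the solution to (8.44) converges to a stochastic process on a graph.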

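We consider the rescaled Langevin equation (8.43):

$$\epsilon^2\ddot{q}^\gamma(t) = -\nabla V(q^\gamma(t)) - \dot{q}^\gamma(t) + \sqrt{2\beta^{-1}}\,\dot{W}(t), \tag{8.45}$$

where we have set $\epsilon^{-1} = \gamma$, since we are interested in the limit $\gamma \to \infty$, i.e. $\epsilon \to 0$. We will show that, in the limit as $\epsilon \to 0$, $q^\gamma(t)$, the solution of the Langevin equation (8.45), converges to $q(t)$, the solution of the Smoluchowski equation

$$\dot{q} = -\nabla V + \sqrt{2\beta^{-1}}\,\dot{W}. \tag{8.46}$$

We write (8.45) as a system of SDEs:

$$\dot{q} = \frac{1}{\epsilon}\,p, \tag{8.47}$$

$$\dot{p} = -\frac{1}{\epsilon}\nabla V(q) - \frac{1}{\epsilon^2}\,p + \sqrt{\frac{2}{\beta\epsilon^2}}\,\dot{W}. \tag{8.48}$$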

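This system of SDEs defines a Markov process in phase space. Its generator is

$$L^\epsilon = \frac{1}{\epsilon^2}\left(-p\cdot\nabla_p + \beta^{-1}\Delta_p\right) + \frac{1}{\epsilon}\left(p\cdot\nabla_q - \nabla_q V\cdot\nabla_p\right) =: \frac{1}{\epsilon^2}L_0 + \frac{1}{\epsilon}L_1.$$

This is a singularly perturbed differential operator. We will derive the Smoluchowski equation (8.46) using a pathwise technique, as well as by analyzing the corresponding Kolmogorov equations.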
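The fast/slow system (8.47)-(8.48) can be explored numerically before any analysis. The sketch below is an illustration under stated assumptions, not the method of these notes: it uses a plain Euler-Maruyama discretization, the function name `simulate_rescaled_langevin` is hypothetical, and the time step must satisfy `dt << eps**2` for the explicit scheme to remain stable.

```python
import math
import random

def simulate_rescaled_langevin(eps, beta, grad_V, q0, p0, T, dt, seed=0):
    """Euler-Maruyama sketch for
        dq = (p / eps) dt,
        dp = (-grad_V(q) / eps - p / eps**2) dt + sqrt(2 / (beta * eps**2)) dW.
    Returns q(T). Passing beta = math.inf switches the noise off.
    """
    rng = random.Random(seed)
    noise = math.sqrt(2.0 / beta) / eps
    q, p = q0, p0
    for _ in range(int(round(T / dt))):
        dW = rng.gauss(0.0, math.sqrt(dt))
        # Both updates use the values of (q, p) from the previous step.
        q, p = (q + p / eps * dt,
                p + (-grad_V(q) / eps - p / eps**2) * dt + noise * dW)
    return q

# Harmonic potential V(q) = q^2 / 2, one sample path with noise.
q_T = simulate_rescaled_langevin(eps=0.1, beta=1.0, grad_V=lambda q: q,
                                 q0=1.0, p0=0.0, T=1.0, dt=1e-4)
print(q_T)
```

With the noise switched off and $V(q) = q^2/2$, the position should stay close to $e^{-t}$, the solution of the limiting equation $\dot{q} = -q$, which gives a quick sanity check of the scaling.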

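We apply Itô's formula to $p$:

$$dp(t) = L^\epsilon p(t)\,dt + \frac{1}{\epsilon}\sqrt{2\beta^{-1}}\,\partial_p p(t)\,dW = -\frac{1}{\epsilon^2}\,p(t)\,dt - \frac{1}{\epsilon}\nabla_q V(q(t))\,dt + \frac{1}{\epsilon}\sqrt{2\beta^{-1}}\,dW.$$

Consequently:

$$\frac{1}{\epsilon}\int_0^t p(s)\,ds = -\int_0^t \nabla_q V(q(s))\,ds + \sqrt{2\beta^{-1}}\,W(t) + O(\epsilon).$$

From equation (8.47) we have that

$$q(t) = q(0) + \frac{1}{\epsilon}\int_0^t p(s)\,ds.$$

Combining the above two equations we deduce

$$q(t) = q(0) - \int_0^t \nabla_q V(q(s))\,ds + \sqrt{2\beta^{-1}}\,W(t) + O(\epsilon),$$

from which (8.46) follows.

Notice that in this derivation we assumed that

$$\mathbb{E}|p(t)|^2 \leq C.$$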

This estimate is true, under appropriate assumptions on the potential V (q) and on

the initial conditions. In fact, we can prove a pathwise approximation result:

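$$\left(\mathbb{E}\sup_{t\in[0,T]}|q^\gamma(t) - q(t)|^p\right)^{1/p} \leq C\epsilon^{2-\kappa},$$

where $\kappa > 0$ is arbitrarily small.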

The pathwise derivation of the Smoluchowski equation implies that the solution of the Fokker-Planck equation corresponding to the Langevin equation (8.45)

converges (in some appropriate sense to be explained below) to the solution of the

Fokker-Planck equation corresponding to the Smoluchowski equation (8.46). It is

important in various applications to calculate corrections to the limiting Fokker-Planck equation. We can accomplish this by analyzing the Fokker-Planck equation for (8.45) using singular perturbation theory. We will consider the problem in one dimension. This is mainly to simplify the notation. The multidimensional problem

can be treated in a very similar way.

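The Fokker-Planck equation associated to equations (8.47) and (8.48) is

$$\frac{\partial\rho}{\partial t} = L^*\rho = \frac{1}{\epsilon}\left(-p\,\partial_q\rho + \partial_q V(q)\,\partial_p\rho\right) + \frac{1}{\epsilon^2}\left(\partial_p(p\rho) + \beta^{-1}\partial_p^2\rho\right) =: \left(\frac{1}{\epsilon^2}L_0^* + \frac{1}{\epsilon}L_1^*\right)\rho. \tag{8.49}$$

The invariant distribution of the Markov process $\{q, p\}$, if it exists, is

$$\rho_\beta(p,q) = \frac{1}{Z}\,e^{-\beta H(p,q)}, \qquad Z = \int_{\mathbb{R}^2} e^{-\beta H(p,q)}\,dp\,dq,$$

where $H(p,q) = \frac{1}{2}p^2 + V(q)$. We define the function $f(p,q,t)$ through

$$\rho(p,q,t) = f(p,q,t)\,\rho_\beta(p,q). \tag{8.50}$$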

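Proposition 8.4.1. The function $f(p,q,t)$ defined in (8.50) satisfies the equation

$$\frac{\partial f}{\partial t} = \left[\frac{1}{\epsilon^2}\left(-p\,\partial_p + \beta^{-1}\partial_p^2\right) - \frac{1}{\epsilon}\left(p\,\partial_q - \partial_q V(q)\,\partial_p\right)\right]f =: \left(\frac{1}{\epsilon^2}L_0 - \frac{1}{\epsilon}L_1\right)f. \tag{8.51}$$

Remark 8.4.2. This is "almost" the backward Kolmogorov equation with the difference that we have $-L_1$ instead of $L_1$. This is related to the fact that $L_0$ is a symmetric operator in $L^2(\mathbb{R}^2; Z^{-1}e^{-\beta H(p,q)})$, whereas $L_1$ is antisymmetric.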

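Proof. We note that $L_0^*\rho_\beta = 0$ and $L_1^*\rho_\beta = 0$. We use this to calculate:

$$L_0^*\rho = L_0^*(f\rho_\beta) = \partial_p(p f\rho_\beta) + \beta^{-1}\partial_p^2(f\rho_\beta)$$

$$= \rho_\beta\,p\,\partial_p f + \rho_\beta\,\beta^{-1}\partial_p^2 f + f L_0^*\rho_\beta + 2\beta^{-1}\partial_p f\,\partial_p\rho_\beta$$

$$= \left(-p\,\partial_p f + \beta^{-1}\partial_p^2 f\right)\rho_\beta = \rho_\beta L_0 f.$$

Similarly,

$$L_1^*\rho = L_1^*(f\rho_\beta) = \left(-p\,\partial_q + \partial_q V\,\partial_p\right)(f\rho_\beta) = \rho_\beta\left(-p\,\partial_q f + \partial_q V\,\partial_p f\right) = -\rho_\beta L_1 f.$$

Consequently, the Fokker-Planck equation (8.49) becomes

$$\rho_\beta\frac{\partial f}{\partial t} = \rho_\beta\left(\frac{1}{\epsilon^2}L_0 f - \frac{1}{\epsilon}L_1 f\right),$$

from which the claim follows.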

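We will assume that the initial conditions for (8.51) depend only on $q$:

$$f(p,q,0) = f_{ic}(q). \tag{8.52}$$

Another way for stating this assumption is the following: Let $H = L^2(\mathbb{R}^{2d}; \rho_\beta(p,q))$ and define the projection operator $P : H \to L^2(\mathbb{R}^d; \bar\rho_\beta(q))$ with $\bar\rho_\beta(q) = \frac{1}{Z_q}e^{-\beta V(q)}$, $Z_q = \int_{\mathbb{R}^d}e^{-\beta V(q)}\,dq$:

$$P\,\cdot := \frac{1}{Z_p}\int_{\mathbb{R}^d}\cdot\;e^{-\beta\frac{|p|^2}{2}}\,dp, \tag{8.53}$$

with $Z_p := \int_{\mathbb{R}^d}e^{-\beta|p|^2/2}\,dp$. Then, assumption (8.52) can be written as

$$P f_{ic} = f_{ic}.$$

We look for a solution to (8.51) in the form of a truncated power series in $\epsilon$:

$$f(p,q,t) = \sum_{n=0}^{N}\epsilon^n f_n(p,q,t). \tag{8.54}$$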

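We substitute this expansion into eqn. (8.51) to obtain the following system of equations.

$$L_0 f_0 = 0, \tag{8.55a}$$

$$-L_0 f_1 = -L_1 f_0, \tag{8.55b}$$

$$-L_0 f_2 = -L_1 f_1 - \frac{\partial f_0}{\partial t}, \tag{8.55c}$$

$$-L_0 f_n = -L_1 f_{n-1} - \frac{\partial f_{n-2}}{\partial t}, \quad n = 3, 4, \dots, N. \tag{8.55d}$$

The null space of $L_0$ consists of constants in $p$. Consequently, from equation (8.55a) we conclude that

$$f_0 = f(q,t).$$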

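Now we can calculate the right hand side of equation (8.55b):

$$-L_1 f_0 = -p\,\partial_q f.$$

Equation (8.55b) becomes:

$$-L_0 f_1 = -p\,\partial_q f.$$

The right hand side of this equation is orthogonal to $N(L_0^*)$ and consequently there exists a unique solution. We obtain this solution using separation of variables:

$$f_1 = -p\,\partial_q f + \psi_1(q,t).$$

Now we can calculate the RHS of equation (8.55c). We need to calculate $L_1 f_1$:

$$-L_1 f_1 = \left(p\,\partial_q - \partial_q V\,\partial_p\right)\left(p\,\partial_q f - \psi_1(q,t)\right) = p^2\partial_q^2 f - p\,\partial_q\psi_1 - \partial_q V\,\partial_q f.$$

The solvability condition for (8.55c) is

$$\int_{\mathbb{R}}\left(-L_1 f_1 - \frac{\partial f_0}{\partial t}\right)\rho_{OU}(p)\,dp = 0,$$

from which we obtain the backward Kolmogorov equation corresponding to the Smoluchowski SDE:

$$\frac{\partial f}{\partial t} = -\partial_q V\,\partial_q f + \beta^{-1}\partial_q^2 f, \tag{8.56}$$

together with the initial condition (8.52).

Now we solve the equation for $f_2$. We use (8.56) to write (8.55c) in the form

$$L_0 f_2 = \left(\beta^{-1} - p^2\right)\partial_q^2 f + p\,\partial_q\psi_1.$$

The solution of this equation is

$$f_2(p,q,t) = \frac{1}{2}\,\partial_q^2 f(q,t)\,p^2 - \partial_q\psi_1(q,t)\,p + \psi_2(q,t).$$

Now we calculate the right hand side of the equation for $f_3$, equation (8.55d) with $n = 3$. First we calculate

$$L_1 f_2 = \frac{1}{2}\,p^3\partial_q^3 f - p^2\partial_q^2\psi_1 + p\,\partial_q\psi_2 - \partial_q V\,\partial_q^2 f\,p + \partial_q V\,\partial_q\psi_1.$$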

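The solvability condition for this equation is

$$\int_{\mathbb{R}}\left(\frac{\partial\psi_1}{\partial t} + L_1 f_2\right)\rho_{OU}(p)\,dp = 0.$$

This leads to the equation

$$\frac{\partial\psi_1}{\partial t} = -\partial_q V\,\partial_q\psi_1 + \beta^{-1}\partial_q^2\psi_1,$$

together with the initial condition $\psi_1(q,0) = 0$. From the calculations presented in the proof of Theorem 6.5.5, and using Poincaré's inequality for the measure $\frac{1}{Z_q}e^{-\beta V(q)}$, we deduce that

$$\frac{1}{2}\frac{d}{dt}\|\psi_1\|^2 \leq -C\|\psi_1\|^2.$$

We use Gronwall's inequality now to conclude that

$$\psi_1 \equiv 0.$$

Putting everything together we obtain the first two terms in the $\epsilon$-expansion of the Fokker-Planck equation (8.51):

$$\rho(p,q,t) = Z^{-1}e^{-\beta H(p,q)}\left(f + \epsilon\,(-p\,\partial_q f) + O(\epsilon^2)\right),$$

where $f$ is the solution of (8.56). Notice that we can rewrite the leading order term to the expansion in the form

$$\rho(p,q,t) = (2\pi\beta^{-1})^{-\frac{1}{2}}\,e^{-\beta p^2/2}\,\rho_V(q,t) + O(\epsilon),$$

where $\rho_V = Z_q^{-1}e^{-\beta V(q)}f$ is the solution of the Smoluchowski Fokker-Planck equation

$$\frac{\partial\rho_V}{\partial t} = \partial_q\left(\partial_q V\,\rho_V\right) + \beta^{-1}\partial_q^2\rho_V.$$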

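It is possible to expand the $n$-th term in the expansion (8.54) in terms of Hermite functions (the eigenfunctions of the generator of the OU process)

$$f_n(p,q,t) = \sum_{k=0}^{n} f_{nk}(q,t)\,\phi_k(p), \tag{8.57}$$

where $\phi_k(p)$ is the $k$-th eigenfunction of $L_0$: $-L_0\phi_k = \lambda_k\phi_k$.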


s

b n1 = 0,

Lf

p

k+1 b

Lfn,k+1 + k 1 q fn,k1 = kfn+1,k , k = 1, 2 . . . , n 1,

1

p

n 1 q fn,n1 = nfn+1,n ,

p

(n + 1) 1 q fn,n = (n + 1)fn+1,n+1 .

Using this method we can obtain the first three terms in the expansion:

1

p

2

!!

r

3

p

p

b 2 f 1 q f20 1

3 f 3 + 1 L

+3

q

3! q

+O(4 ),

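Consider now the rescaling $\lambda_{\gamma,\epsilon} = 1$, $\mu_{\gamma,\epsilon} = \gamma$. The Langevin equation becomes

$$\ddot{q}^\gamma = -\gamma^{-2}\nabla V(q^\gamma) - \dot{q}^\gamma + \sqrt{2\gamma^{-2}\beta^{-1}}\,\dot{W}. \tag{8.58}$$

We write equation (8.58) as a system of two equations

$$\dot{q}^\gamma = \gamma^{-1}p^\gamma, \qquad \dot{p}^\gamma = -\gamma^{-1}V'(q^\gamma) - p^\gamma + \sqrt{2\beta^{-1}}\,\dot{W}.$$

This is the equation for an $O(1/\gamma)$ Hamiltonian system perturbed by $O(1)$ noise. We expect that, to leading order, the energy is conserved, since it is conserved for the Hamiltonian system. We apply Itô's formula to the Hamiltonian of the system to obtain

$$\dot{H} = \left(\beta^{-1} - p^2\right) + \sqrt{2\beta^{-1}p^2}\,\dot{W}$$

with $p^2 = p^2(H,q) = 2(H - V(q))$.

Thus, in order to study the $\gamma \to 0$ limit we need to analyze the following fast/slow system of SDEs

$$\dot{H} = \left(\beta^{-1} - p^2\right) + \sqrt{2\beta^{-1}p^2}\,\dot{W}, \tag{8.59a}$$

$$\dot{p}^\gamma = -\gamma^{-1}V'(q^\gamma) - p^\gamma + \sqrt{2\beta^{-1}}\,\dot{W}. \tag{8.59b}$$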

The Hamiltonian is the slow variable, whereas the momentum (or position) is the fast variable. Assuming that we can average over the Hamiltonian dynamics, we obtain the limiting SDE for the Hamiltonian:

$$\dot{H} = \beta^{-1} - \langle p^2\rangle + \sqrt{2\beta^{-1}\langle p^2\rangle}\,\dot{W}. \tag{8.60}$$

The limiting SDE lives on the graph associated with the Hamiltonian system. The domain of definition of the limiting Markov process is defined through appropriate boundary conditions (the gluing conditions) at the interior vertices of the graph.

We identify all points belonging to the same connected component of a level curve $\{x : H(x) = H\}$, $x = (q,p)$. Each point on the edges of the graph corresponds to a trajectory. Interior vertices correspond to separatrices. Let $I_i$, $i = 1, \dots, d$ be the edges of the graph. Then $(i, H)$ defines a global coordinate system on the graph.

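We will study the small $\gamma$ asymptotics by analyzing the corresponding backward Kolmogorov equation using singular perturbation theory. The generator of the process $\{q^\gamma, p^\gamma\}$ is

$$L^\gamma = \gamma^{-1}\left(p\,\partial_q - \partial_q V\,\partial_p\right) - p\,\partial_p + \beta^{-1}\partial_p^2 = \gamma^{-1}L_0 + L_1.$$

Let $u^\gamma = \mathbb{E}(f(p^\gamma(p,q;t), q^\gamma(p,q;t)))$. It satisfies the backward Kolmogorov equation associated to the process $\{q^\gamma, p^\gamma\}$:

$$\frac{\partial u^\gamma}{\partial t} = \left(\frac{1}{\gamma}L_0 + L_1\right)u^\gamma. \tag{8.61}$$

We look for a solution in the form of a power series expansion in $\gamma$:

$$u^\gamma = u_0 + \gamma u_1 + \gamma^2 u_2 + \dots$$

We substitute this ansatz into (8.61) and equate equal powers in $\gamma$ to obtain the following sequence of equations:

$$L_0 u_0 = 0, \tag{8.62a}$$

$$L_0 u_1 = -L_1 u_0 + \frac{\partial u_0}{\partial t}, \tag{8.62b}$$

$$L_0 u_2 = -L_1 u_1 + \frac{\partial u_1}{\partial t}. \tag{8.62c}$$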

.........

Notice that the operator $L_0$ is the backward Liouville operator of the Hamiltonian system with Hamiltonian

$$H = \frac{1}{2}p^2 + V(q).$$

We assume that there are no integrals of motion other than the Hamiltonian. This means that the null space of $L_0$ consists of functions of the Hamiltonian:

$$N(L_0) = \{\text{functions of } H\}. \tag{8.63}$$

Let us now analyze equations (8.62). We start with (8.62a); eqn. (8.63) implies that $u_0$ depends on $q$, $p$ through the Hamiltonian function $H$:

$$u_0 = u(H(p,q), t). \tag{8.64}$$

Now we proceed with (8.62b). For this we need to find the solvability condition for equations of the form

$$L_0 u = f. \tag{8.65}$$

We multiply it by an arbitrary smooth function of $H(p,q)$, integrate over $\mathbb{R}^2$ and use the skew-symmetry of the Liouville operator $L_0$ to deduce:¹

$$\int_{\mathbb{R}^2} L_0 u\,F(H(p,q))\,dp\,dq = \int_{\mathbb{R}^2} u\,L_0^* F(H(p,q))\,dp\,dq = \int_{\mathbb{R}^2} u\,\big(-L_0 F(H(p,q))\big)\,dp\,dq = 0, \quad \forall F \in C_b^\infty(\mathbb{R}).$$

This implies that the solvability condition for equation (8.65) is that

$$\int_{\mathbb{R}^2} f(p,q)\,F(H(p,q))\,dp\,dq = 0, \quad \forall F \in C_b^\infty(\mathbb{R}). \tag{8.66}$$

We use the solvability condition in (8.62b) to obtain that

$$\int_{\mathbb{R}^2}\left(L_1 u_0 - \frac{\partial u_0}{\partial t}\right)F(H(p,q))\,dp\,dq = 0. \tag{8.67}$$

¹ We assume that both $u_1$ and $F$ decay to 0 as $|p| \to \infty$ to justify the integration by parts that follows.

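To proceed, we need to understand how $L_1$ acts on functions of $H(p,q)$. Let $\phi = \phi(H(p,q))$. We have that

$$\frac{\partial\phi}{\partial p} = \frac{\partial H}{\partial p}\frac{\partial\phi}{\partial H} = p\,\frac{\partial\phi}{\partial H}$$

and

$$\frac{\partial^2\phi}{\partial p^2} = \frac{\partial}{\partial p}\left(p\,\frac{\partial\phi}{\partial H}\right) = \frac{\partial\phi}{\partial H} + p^2\frac{\partial^2\phi}{\partial H^2}.$$

The above calculations imply that, when $L_1$ acts on functions $\phi = \phi(H(p,q))$, it becomes

$$L_1 = \left[\left(\beta^{-1} - p^2\right)\partial_H + \beta^{-1}p^2\,\partial_H^2\right], \tag{8.68}$$

where $p^2 = p^2(H,q) = 2(H - V(q))$.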

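We want to change variables in the integral (8.67) and go from $(p,q)$ to $(H,q)$. The Jacobian of the transformation is:

$$\frac{\partial(p,q)}{\partial(H,q)} = \begin{vmatrix}\dfrac{\partial p}{\partial H} & \dfrac{\partial p}{\partial q}\\[2mm] \dfrac{\partial q}{\partial H} & \dfrac{\partial q}{\partial q}\end{vmatrix} = \frac{\partial p}{\partial H} = \frac{1}{p(H,q)}.$$

We use this, together with (8.68), to rewrite eqn. (8.67) as

$$\int\!\!\int\left(\frac{\partial u}{\partial t} + \left[\left(-\beta^{-1} + p^2\right)\partial_H - \beta^{-1}p^2\,\partial_H^2\right]u\right)F(H)\,p^{-1}(H,q)\,dH\,dq = 0.$$

We introduce the notation

$$\langle\cdot\rangle := \int\cdot\;dq.$$

The integration over $q$ leads to the equation

$$\int\left(\langle p^{-1}\rangle\frac{\partial u}{\partial t} + \left[\left(-\beta^{-1}\langle p^{-1}\rangle + \langle p\rangle\right)\partial_H - \beta^{-1}\langle p\rangle\,\partial_H^2\right]u\right)F(H)\,dH = 0.$$

This equation should be valid for every smooth function $F(H)$, and this requirement leads to the differential equation

$$\langle p^{-1}\rangle\frac{\partial u}{\partial t} = \left(\beta^{-1}\langle p^{-1}\rangle - \langle p\rangle\right)\partial_H u + \langle p\rangle\,\beta^{-1}\partial_H^2 u,$$

or,

$$\frac{\partial u}{\partial t} = \left(\beta^{-1} - \langle p^{-1}\rangle^{-1}\langle p\rangle\right)\partial_H u + \langle p^{-1}\rangle^{-1}\langle p\rangle\,\beta^{-1}\partial_H^2 u.$$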

Thus, we have obtained the limiting backward Kolmogorov equation for the energy, which is the slow variable. From this equation we can read off the limiting SDE for the Hamiltonian:

$$\dot{H} = b(H) + \sigma(H)\,\dot{W}, \tag{8.69}$$

where

$$b(H) = \beta^{-1} - \langle p^{-1}\rangle^{-1}\langle p\rangle, \qquad \sigma(H) = \sqrt{2\beta^{-1}\langle p^{-1}\rangle^{-1}\langle p\rangle}.$$

Notice that the noise that appears in the limiting equation (8.69) is multiplicative, contrary to the additive noise in the Langevin equation.
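The coefficients of (8.69) only require the two orbit averages $\langle p\rangle$ and $\langle p^{-1}\rangle$. The following sketch, under assumptions of our own choosing, evaluates them for a symmetric single-well potential with turning points $\pm a$ by substituting $q = a\sin\theta$, which removes the inverse square-root singularity of $1/p$ at the turning points. The function names and the harmonic test potential are illustrative and do not appear in the notes.

```python
import math

def orbit_averages(V, H, a, n=2000):
    """Return (<p>, <1/p>) with <.> = integral over q in (-a, a) on the branch p > 0
    of the level set {p^2/2 + V(q) = H}; midpoint rule in theta, q = a sin(theta)."""
    h = math.pi / n
    avg_p = avg_inv_p = 0.0
    for i in range(n):
        theta = -0.5 * math.pi + (i + 0.5) * h
        q = a * math.sin(theta)
        p = math.sqrt(2.0 * (H - V(q)))
        w = a * math.cos(theta) * h      # dq = a cos(theta) d(theta)
        avg_p += p * w
        avg_inv_p += w / p
    return avg_p, avg_inv_p

def energy_sde_coefficients(V, H, a, beta):
    """Drift b(H) and noise amplitude sigma(H) of the limiting energy SDE (8.69)."""
    avg_p, avg_inv_p = orbit_averages(V, H, a)
    ratio = avg_p / avg_inv_p            # <1/p>^{-1} <p>
    return 1.0 / beta - ratio, math.sqrt(2.0 * ratio / beta)

# Harmonic potential V(q) = omega0^2 q^2 / 2 with omega0 = 2; turning point a = sqrt(2H)/omega0.
H0, omega0, beta0 = 1.5, 2.0, 4.0
b, sigma = energy_sde_coefficients(lambda q: 0.5 * omega0**2 * q * q,
                                   H0, math.sqrt(2.0 * H0) / omega0, beta0)
print(b, sigma)
```

For the harmonic potential the ratio $\langle p^{-1}\rangle^{-1}\langle p\rangle$ equals $H$, so the sketch should return $b(H) = \beta^{-1} - H$ and $\sigma(H) = \sqrt{2\beta^{-1}H}$.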

As is well known from classical mechanics, the action and frequency are defined as

$$I(E) = \int p(q,E)\,dq$$

and

$$\omega(E) = 2\pi\left(\frac{dI}{dE}\right)^{-1},$$

respectively. Using the action and the frequency we can write the limiting Fokker-Planck equation for the distribution function of the energy in a very compact form.

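Theorem 8.4.3. The limiting Fokker-Planck equation for the energy distribution function $\rho(E,t)$ is

$$\frac{\partial\rho}{\partial t} = \frac{\partial}{\partial E}\left(I(E)\left(1 + \beta^{-1}\frac{\partial}{\partial E}\right)\left(\frac{\omega(E)\rho}{2\pi}\right)\right). \tag{8.70}$$

Proof. We notice that

$$\frac{dI}{dE} = \int\frac{\partial p}{\partial E}\,dq = \int p^{-1}\,dq$$

and consequently

$$\langle p^{-1}\rangle^{-1} = \frac{\omega(E)}{2\pi}.$$

Hence, the limiting Fokker-Planck equation can be written as

$$\frac{\partial\rho}{\partial t} = -\frac{\partial}{\partial E}\left(\left(\beta^{-1} - \frac{I(E)\omega(E)}{2\pi}\right)\rho\right) + \beta^{-1}\frac{\partial^2}{\partial E^2}\left(\frac{I\omega}{2\pi}\rho\right)$$

$$= -\beta^{-1}\frac{\partial\rho}{\partial E} + \frac{\partial}{\partial E}\left(\frac{I\omega}{2\pi}\rho\right) + \beta^{-1}\frac{\partial}{\partial E}\left(\frac{dI}{dE}\frac{\omega\rho}{2\pi}\right) + \beta^{-1}\frac{\partial}{\partial E}\left(I\frac{\partial}{\partial E}\left(\frac{\omega\rho}{2\pi}\right)\right)$$

$$= \frac{\partial}{\partial E}\left(\frac{I\omega}{2\pi}\rho\right) + \beta^{-1}\frac{\partial}{\partial E}\left(I\frac{\partial}{\partial E}\left(\frac{\omega\rho}{2\pi}\right)\right)$$

$$= \frac{\partial}{\partial E}\left(I(E)\left(1 + \beta^{-1}\frac{\partial}{\partial E}\right)\left(\frac{\omega(E)\rho}{2\pi}\right)\right),$$

where in the third line we used $\frac{dI}{dE}\frac{\omega}{2\pi} = 1$. This is precisely equation (8.70).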

Remarks 8.4.4.

i. We emphasize that the above formal procedure does not provide us with the boundary conditions for the limiting Fokker-Planck equation. We will discuss this issue in the next section.

ii. If we rescale back to the original time-scale we obtain the equation

$$\frac{\partial\rho}{\partial t} = \gamma\,\frac{\partial}{\partial E}\left(I(E)\left(1 + \beta^{-1}\frac{\partial}{\partial E}\right)\left(\frac{\omega(E)\rho}{2\pi}\right)\right). \tag{8.71}$$

We will use this equation later on to calculate the rate of escape from a potential barrier in the energy-diffusion-limited regime.

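Basic model:

$$m\ddot{x} = -\gamma\dot{x}(t) - \nabla V(x(t), f(t)) + y(t) + \sqrt{2\gamma k_B T}\,\xi(t). \tag{8.72}$$

Goal: Calculate the effective drift and the effective diffusion tensor

$$U_{\mathrm{eff}} = \lim_{t\to\infty}\frac{\langle x(t)\rangle}{t} \tag{8.73}$$

and

$$D_{\mathrm{eff}} = \lim_{t\to\infty}\frac{\langle (x(t) - \langle x(t)\rangle)\otimes(x(t) - \langle x(t)\rangle)\rangle}{2t}. \tag{8.74}$$

We start by studying the underdamped dynamics of a Brownian particle $x(t) \in \mathbb{R}^d$ moving in a smooth, periodic potential:

$$\ddot{x} = -\nabla V(x(t)) - \gamma\dot{x}(t) + \sqrt{2\gamma k_B T}\,\xi(t), \tag{8.75}$$

where $\gamma$ is the friction coefficient, $k_B$ the Boltzmann constant and $T$ denotes the temperature. $\xi(t)$ stands for the standard $d$-dimensional white noise process, i.e.

$$\langle\xi_i(t)\rangle = 0 \quad\text{and}\quad \langle\xi_i(t)\xi_j(s)\rangle = \delta_{ij}\,\delta(t-s), \quad i, j = 1, \dots, d.$$

The potential $V(x)$ is assumed to be periodic with period 1 in all spatial directions:

$$V(x + \hat{e}_i) = V(x), \quad i = 1, \dots, d,$$

where $\{\hat{e}_i\}_{i=1}^d$ denotes the standard basis of $\mathbb{R}^d$.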

Notice that we have already nondimensionalized eqn. (8.75) in such a way that the nondimensional particle mass is 1 and the maximum of the (gradient of the) potential is fixed [41]. Hence, the only parameters in the problem are the friction coefficient and the temperature. Notice, furthermore, that the parameter $\gamma$ in (8.75) controls the coupling between the Hamiltonian system $\ddot{x} = -\nabla V(x)$ and the thermal heat bath: $\gamma \gg 1$ implies that the Hamiltonian system is strongly coupled to the heat bath, whereas $\gamma \ll 1$ corresponds to weak coupling.

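Equation (8.75) defines a Markov process in the phase space $\mathbb{T}^d \times \mathbb{R}^d$. Indeed, let us write (8.75) as a first order system

$$\dot{x}(t) = y(t), \tag{8.76a}$$

$$\dot{y}(t) = -\nabla V(x(t)) - \gamma y(t) + \sqrt{2\gamma k_B T}\,\xi(t). \tag{8.76b}$$

The process $\{x(t), y(t)\}$ is Markovian with generator

$$L = y\cdot\nabla_x - \nabla V(x)\cdot\nabla_y + \gamma\left(-y\cdot\nabla_y + D\Delta_y\right).$$

In writing the above we have set $D = k_B T$. This process is ergodic. The unique invariant measure is absolutely continuous with respect to the Lebesgue measure and its density is the Maxwell-Boltzmann distribution

$$\rho(y,x) = \frac{1}{(2\pi D)^{d/2}\,Z}\,e^{-\frac{1}{D}H(x,y)}, \tag{8.77}$$

where $Z = \int_{\mathbb{T}^d}e^{-V(x)/D}\,dx$ and $H(x,y)$ is the Hamiltonian of the system

$$H(x,y) = \frac{1}{2}y^2 + V(x).$$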

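The long time behavior of solutions to (8.75) is governed by an effective Brownian motion. Indeed, the following central limit theorem holds [65, 55, ?]

Theorem 8.5.1. Let $V(x) \in C(\mathbb{T}^d)$. Define the rescaled process

$$x^\epsilon(t) := \epsilon\,x(t/\epsilon^2).$$

Then $x^\epsilon(t)$ converges weakly, as $\epsilon \to 0$, to a Brownian motion with covariance

$$D_{\mathrm{eff}} = \int_{\mathbb{T}^d\times\mathbb{R}^d} -L\Phi\otimes\Phi\;\mu(dx\,dy), \tag{8.78}$$

where $\mu(dx\,dy) = \rho(x,y)\,dx\,dy$ and the vector valued function $\Phi$ is the solution of the Poisson equation

$$-L\Phi = y. \tag{8.79}$$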

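We are interested in analyzing the dependence of $D_{\mathrm{eff}}$ on $\gamma$. We will mostly focus on the one dimensional case. We start by rescaling the Langevin equation

$$\ddot{x} = F(x) - \gamma\dot{x} + \sqrt{2\gamma\beta^{-1}}\,\dot{W}, \tag{8.80}$$

where we have set $F(x) = -\nabla V(x)$. We will assume that the potential is periodic with period $2\pi$ in every direction. Since we expect that at sufficiently long length and time scales the particle performs a purely diffusive motion, we perform a diffusive rescaling to the equations of motion (8.80): $t \to t/\epsilon^2$, $x \to x/\epsilon$. Using the fact that $\dot{W}(c\,t) = \frac{1}{\sqrt{c}}\dot{W}(t)$ in law we obtain:

$$\epsilon^2\ddot{x} = \frac{1}{\epsilon}F\left(\frac{x}{\epsilon}\right) - \gamma\dot{x} + \sqrt{2\gamma\beta^{-1}}\,\dot{W}.$$

Introducing $p = \epsilon\dot{x}$ and $q = x/\epsilon$ we write this equation as a first order system:

$$\dot{x} = \frac{1}{\epsilon}\,p, \qquad \dot{p} = \frac{1}{\epsilon^2}F(q) - \frac{1}{\epsilon^2}\gamma p + \frac{1}{\epsilon}\sqrt{2\gamma\beta^{-1}}\,\dot{W}, \qquad \dot{q} = \frac{1}{\epsilon^2}\,p, \tag{8.81}$$

with the understanding that $q \in [-\pi,\pi]^d$ and $x, p \in \mathbb{R}^d$. Our goal now is to eliminate the fast variables $p$, $q$ and to obtain an equation for the slow variable $x$. We shall accomplish this by studying the corresponding backward Kolmogorov equation using singular perturbation theory for partial differential equations.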

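Let

$$u^\epsilon(p,q,x,t) = \mathbb{E}f\big(p(t), q(t), x(t)\,|\,p(0) = p,\ q(0) = q,\ x(0) = x\big),$$

where $\mathbb{E}$ denotes the expectation with respect to the Brownian motion $W(t)$ in the Langevin equation and $f$ is a smooth function.² The evolution of the function $u^\epsilon(p,q,x,t)$ is governed by the backward Kolmogorov equation associated to equations (8.81):³

$$\frac{\partial u^\epsilon}{\partial t} = \frac{1}{\epsilon}\,p\cdot\nabla_x u^\epsilon + \frac{1}{\epsilon^2}\Big(-\nabla_q V(q)\cdot\nabla_p + p\cdot\nabla_q + \gamma\left(-p\cdot\nabla_p + \beta^{-1}\Delta_p\right)\Big)u^\epsilon =: \left(\frac{1}{\epsilon^2}L_0 + \frac{1}{\epsilon}L_1\right)u^\epsilon, \tag{8.82}$$

where:

$$L_0 = -\nabla_q V(q)\cdot\nabla_p + p\cdot\nabla_q + \gamma\left(-p\cdot\nabla_p + \beta^{-1}\Delta_p\right), \qquad L_1 = p\cdot\nabla_x.$$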

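The invariant distribution of the fast process $\{q(t), p(t)\}$ in $\mathbb{T}^d\times\mathbb{R}^d$ is the Maxwell-Boltzmann distribution

$$\rho_\beta(q,p) = Z^{-1}e^{-\beta H(q,p)}, \qquad Z = \int_{\mathbb{T}^d\times\mathbb{R}^d}e^{-\beta H(q,p)}\,dq\,dp,$$

where $H(q,p) = \frac{1}{2}|p|^2 + V(q)$. Indeed, we can readily check that

$$L_0^*\rho_\beta(q,p) = 0,$$

where $L_0^*$ denotes the Fokker-Planck operator which is the $L^2$-adjoint of the generator of the process $L_0$:

$$L_0^* f = \nabla_q V(q)\cdot\nabla_p f - p\cdot\nabla_q f + \gamma\left(\nabla_p\cdot(pf) + \beta^{-1}\Delta_p f\right).$$

The null space of the generator $L_0$ consists of constants in $q$, $p$. Moreover, the equation

$$-L_0 f = g \tag{8.83}$$

has a unique (up to constants) solution if and only if

$$\langle g\rangle_\beta := \int_{\mathbb{T}^d\times\mathbb{R}^d}g(q,p)\,\rho_\beta(q,p)\,dq\,dp = 0. \tag{8.84}$$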

² The function $u^\epsilon$ can equivalently be written as an integral against $\rho(x,v,t;p,q)\,\mu(p,q)$, where $\rho(x,v,t;p,q)$ is the solution of the Fokker-Planck equation and $\mu(p,q)$ is the initial distribution.

³ It is more customary in the physics literature to use the forward Kolmogorov equation, i.e. the Fokker-Planck equation. However, for the calculation presented below, it is more convenient to use the backward as opposed to the forward Kolmogorov equation. The two formulations are equivalent. See [57, Ch. 6] for details.

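Equation (8.83) is equipped with periodic boundary conditions with respect to $q$ and its solution is such that

$$\int_{\mathbb{T}^d\times\mathbb{R}^d}|f|^2\,\rho_\beta\,dq\,dp < \infty. \tag{8.85}$$

These two conditions are sufficient to ensure existence and uniqueness of solutions (up to constants) of equation (8.83) [28, 29, 55].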

We assume that the following ansatz for the solution $u^\epsilon$ holds:

$$u^\epsilon = u_0 + \epsilon u_1 + \epsilon^2 u_2 + \dots \tag{8.86}$$

with $u_i = u_i(p,q,x,t)$, $i = 1, 2, \dots$ being $2\pi$ periodic in $q$ and satisfying condition (8.85). We substitute (8.86) into (8.82) and equate equal powers in $\epsilon$ to obtain the following sequence of equations:

$$L_0 u_0 = 0, \tag{8.87a}$$

$$L_0 u_1 = -L_1 u_0, \tag{8.87b}$$

$$L_0 u_2 = -L_1 u_1 + \frac{\partial u_0}{\partial t}. \tag{8.87c}$$

From the first equation in (8.87) we deduce that $u_0 = u_0(x,t)$, since the null space of $L_0$ consists of functions which are constants in $p$ and $q$. Now the second equation in (8.87) becomes:

$$L_0 u_1 = -p\cdot\nabla_x u_0.$$

Since $\langle p\rangle = 0$, the right hand side of the above equation is mean-zero with respect to the Maxwell-Boltzmann distribution. Hence, the above equation is well-posed. We solve it using separation of variables:

$$u_1 = \Phi(p,q)\cdot\nabla_x u_0$$

with

$$-L_0\Phi = p. \tag{8.88}$$

The solution $\Phi$ is periodic in $q$ and satisfies condition (8.85). Now we proceed with the third equation in (8.87). We apply the solvability condition to obtain:

$$\frac{\partial u_0}{\partial t} = \int_{\mathbb{T}^d\times\mathbb{R}^d}L_1 u_1\,\rho_\beta(p,q)\,dp\,dq = \sum_{i,j=1}^{d}\left(\int_{\mathbb{T}^d\times\mathbb{R}^d}p_i\Phi_j\,\rho_\beta(p,q)\,dp\,dq\right)\frac{\partial^2 u_0}{\partial x_i\partial x_j}.$$

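This is the backward Kolmogorov equation which governs the dynamics on large scales. We write it in the form

$$\frac{\partial u_0}{\partial t} = \sum_{i,j=1}^{d}D_{ij}\frac{\partial^2 u_0}{\partial x_i\partial x_j}, \tag{8.89}$$

where the effective diffusion tensor is

$$D_{ij} = \int_{\mathbb{T}^d\times\mathbb{R}^d}p_i\Phi_j\,\rho_\beta(p,q)\,dp\,dq, \quad i, j = 1, \dots, d. \tag{8.90}$$

The calculation of the effective diffusion tensor requires the solution of the boundary value problem (8.88) and the calculation of the integral in (8.90). The limiting backward Kolmogorov equation is well posed since the diffusion tensor is nonnegative. Indeed, let $\xi$ be a unit vector in $\mathbb{R}^d$. We calculate (we use the notation $\Phi_\xi = \Phi\cdot\xi$ and $\langle\cdot,\cdot\rangle$ for the Euclidean inner product)

$$\langle\xi, D\xi\rangle = \int(p\cdot\xi)(\Phi_\xi)\,\rho_\beta\,dp\,dq = \int\left(-L_0\Phi_\xi\right)\Phi_\xi\,\rho_\beta\,dp\,dq = \gamma\beta^{-1}\int\left|\nabla_p\Phi_\xi\right|^2\rho_\beta\,dp\,dq \geq 0, \tag{8.91}$$

where an integration by parts was used.

Thus, from the multiscale analysis we conclude that at large length/time scales the particle which diffuses in a periodic potential performs an effective Brownian motion with a nonnegative diffusion tensor which is given by formula (8.90).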

We mention in passing that the analysis presented above can also be applied to the problem of Brownian motion in a tilted periodic potential. The Langevin equation becomes

$$\ddot{x}(t) = -\nabla V(x(t)) + F - \gamma\dot{x}(t) + \sqrt{2\gamma\beta^{-1}}\,\dot{W}(t), \tag{8.92}$$

where $V(x)$ is periodic with period $2\pi$ and $F$ is a constant force field. The formulas for the effective drift and the effective diffusion tensor are

$$V = \int_{\mathbb{R}^d\times\mathbb{T}^d}p\,\rho(q,p)\,dq\,dp, \qquad D = \int_{\mathbb{R}^d\times\mathbb{T}^d}(p - V)\otimes\phi\,\rho(p,q)\,dp\,dq, \tag{8.93}$$

where

$$-L\phi = p - V, \tag{8.94a}$$

$$L^*\rho = 0, \qquad \int_{\mathbb{R}^d\times\mathbb{T}^d}\rho(p,q)\,dp\,dq = 1, \tag{8.94b}$$

with

$$L = p\cdot\nabla_q + \left(-\nabla_q V + F\right)\cdot\nabla_p + \gamma\left(-p\cdot\nabla_p + \beta^{-1}\Delta_p\right). \tag{8.95}$$

We have used $\otimes$ to denote the tensor product between two vectors; $L^*$ denotes the $L^2$-adjoint of the operator $L$, i.e. the Fokker-Planck operator. Equations (8.94) are equipped with periodic boundary conditions in $q$. The solution of the Poisson equation (8.94) is also taken to be square integrable with respect to the invariant density $\rho(q,p)$:

$$\int_{\mathbb{R}^d\times\mathbb{T}^d}|\phi(q,p)|^2\,\rho(p,q)\,dp\,dq < +\infty.$$

The diffusion tensor is nonnegative definite. A calculation similar to the one used to derive (8.91) shows this:

$$\langle\xi, D\xi\rangle = \gamma\beta^{-1}\int\left|\nabla_p\phi_\xi\right|^2\rho(p,q)\,dp\,dq \geq 0, \tag{8.96}$$

for every vector $\xi$ in $\mathbb{R}^d$. The study of diffusion in a tilted periodic potential, in the underdamped regime and in high dimensions, based on the above formulas for $V$ and $D$, will be the subject of a separate publication.

Let us now show that the formula for the diffusion tensor obtained in the previous section, equation (8.90), is equivalent to the Green-Kubo formula (3.14). To simplify the notation we will prove the equivalence of the two formulas in one dimension. The generalization to arbitrary dimensions is immediate. Let $(x(t;q,p), v(t;q,p))$ with $v = \dot{x}$ and initial conditions $x(0;q,p) = q$, $v(0;q,p) = p$ be the solution of the Langevin equation

$$\ddot{x} = -\partial_x V - \gamma\dot{x} + \xi,$$

where $\xi(t)$ stands for Gaussian white noise in one dimension with correlation function

$$\langle\xi(t)\xi(s)\rangle = 2\gamma k_B T\,\delta(t-s).$$

We assume that the $(x,v)$ process is stationary, i.e. that the initial conditions are distributed according to the Maxwell-Boltzmann distribution

$$\rho_\beta(q,p) = Z^{-1}e^{-\beta H(p,q)}.$$

The velocity autocorrelation function is [9, eq. 2.10]

$$\langle v(t;q,p)\,v(0;q,p)\rangle = \int v\,p\,\rho(x,v,t;p,q)\,\rho_\beta(p,q)\,dp\,dq\,dx\,dv, \tag{8.97}$$

and $\rho(x,v,t;p,q)$ is the solution of the Fokker-Planck equation

$$\frac{\partial\rho}{\partial t} = L^*\rho, \qquad \rho(x,v,0;p,q) = \delta(x-q)\,\delta(v-p),$$

where

$$L^*\rho = -v\,\partial_x\rho + \partial_x V(x)\,\partial_v\rho + \gamma\left(\partial_v(v\rho) + \beta^{-1}\partial_v^2\rho\right).$$

We rewrite (8.97) in the form

$$\langle v(t;q,p)\,v(0;q,p)\rangle = \int\!\!\int\left(\int\!\!\int v\,\rho(x,v,t;p,q)\,dv\,dx\right)p\,\rho_\beta(p,q)\,dp\,dq =: \int\!\!\int\bar{v}(t;p,q)\,p\,\rho_\beta(p,q)\,dp\,dq. \tag{8.98}$$

The function $\bar{v}(t)$ satisfies the backward Kolmogorov equation which governs the evolution of observables [59, Ch. 6]

$$\frac{\partial\bar{v}}{\partial t} = L\bar{v}, \qquad \bar{v}(0;p,q) = p. \tag{8.99}$$

We can write, formally, the solution of (8.99) as

$$\bar{v} = e^{Lt}p. \tag{8.100}$$

We combine now equations (8.98) and (8.100) to obtain the following formula for the velocity autocorrelation function

$$\langle v(t;q,p)\,v(0;q,p)\rangle = \int\!\!\int p\left(e^{Lt}p\right)\rho_\beta(p,q)\,dp\,dq. \tag{8.101}$$

We substitute this into the Green-Kubo formula to obtain

$$D = \int_0^\infty\langle v(t;q,p)\,v(0;q,p)\rangle\,dt = \int\left(\int_0^\infty e^{Lt}\,dt\;p\right)p\,\rho_\beta\,dp\,dq = \int\left(-L^{-1}p\right)p\,\rho_\beta\,dp\,dq = \int\!\!\int\phi\,p\,\rho_\beta\,dp\,dq,$$

where $\phi$ is the solution of the Poisson equation (8.88). In the above derivation we have used the formula $-L^{-1} = \int_0^\infty e^{Lt}\,dt$, whose proof can be found in [59, Ch. 11].
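The equivalence can be checked by hand in the simplest case $V \equiv 0$, which is trivially periodic. There the velocity is a stationary OU process with $\langle v(t)v(0)\rangle = \beta^{-1}e^{-\gamma t}$, and the Poisson equation $-L\phi = p$ is solved by $\phi = p/\gamma$, so both formulas should give $D = 1/(\gamma\beta)$. The sketch below, with a hypothetical function name and a truncation time chosen for illustration, integrates the autocorrelation function with the trapezoidal rule.

```python
import math

def green_kubo_free_particle(gamma, beta, n=200000):
    """Trapezoidal approximation of D = int_0^inf <v(t) v(0)> dt for V = 0,
    where <v(t) v(0)> = exp(-gamma t) / beta. The integral is truncated at
    t_max = 40 / gamma, beyond which the integrand is negligible."""
    t_max = 40.0 / gamma
    dt = t_max / n

    def acf(t):
        return math.exp(-gamma * t) / beta

    total = 0.5 * (acf(0.0) + acf(t_max))
    for i in range(1, n):
        total += acf(i * dt)
    return total * dt

D_green_kubo = green_kubo_free_particle(gamma=2.0, beta=0.5)
D_poisson = 1.0 / (2.0 * 0.5)   # integral of phi * p * rho_beta with phi = p / gamma
print(D_green_kubo, D_poisson)
```

For a nontrivial periodic potential neither side is available in closed form, which is why the asymptotic formulas for small and large $\gamma$ are useful.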


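In this section we derive approximate formulas for the diffusion coefficient which are valid in the overdamped $\gamma \gg 1$ and underdamped $\gamma \ll 1$ limits. The derivation of these formulas is based on the asymptotic analysis of the Poisson equation (8.88).

The Underdamped Limit

In this subsection we solve the Poisson equation (8.88) in one dimension perturbatively for small $\gamma$. We shall use singular perturbation theory for partial differential equations. The operator $L_0$ that appears in (8.88) can be written in the form

$$L_0 = L_H + \gamma L_{OU},$$

where $L_H$ stands for the (backward) Liouville operator associated with the Hamiltonian $H(p,q)$ and $L_{OU}$ for the generator of the OU process, respectively:

$$L_H = p\,\partial_q - \partial_q V\,\partial_p, \qquad L_{OU} = -p\,\partial_p + \beta^{-1}\partial_p^2.$$

We expect that the solution of the Poisson equation scales like $\gamma^{-1}$ when $\gamma \ll 1$. Thus, we look for a solution of the form

$$\Phi = \frac{1}{\gamma}\phi_0 + \phi_1 + \gamma\phi_2 + \dots \tag{8.102}$$

We substitute this ansatz in (8.88) to obtain the sequence of equations

$$L_H\phi_0 = 0, \tag{8.103a}$$

$$-L_H\phi_1 = p + L_{OU}\phi_0, \tag{8.103b}$$

$$-L_H\phi_2 = L_{OU}\phi_1. \tag{8.103c}$$

From equation (8.103a) we deduce that, since $\phi_0$ is in the null space of the Liouville operator, the first term in the expansion is a function of the Hamiltonian $z(p,q) = \frac{1}{2}p^2 + V(q)$:

$$\phi_0 = \phi_0(z(p,q)).$$

Now we want to obtain an equation for $\phi_0$ by using the solvability condition for (8.103b). To this end, we multiply this equation by an arbitrary function of $z$, $g = g(z)$, and integrate over $p$ and $q$ to obtain

$$\int_{-\infty}^{+\infty}\int_{-\pi}^{\pi}\left(p + L_{OU}\phi_0\right)g(z(p,q))\,dp\,dq = 0.$$

We change now from $p$, $q$ coordinates to $z$, $q$, so that the above integral becomes

$$\int_{E_{\min}}^{+\infty}\int_{-\pi}^{\pi}g(z)\left(p(z,q) + L_{OU}\phi_0(z)\right)\frac{1}{p(z,q)}\,dz\,dq = 0,$$

where $J = p^{-1}(z,q)$ is the Jacobian of the transformation. The operator $L_{OU}$, when applied to functions of the Hamiltonian, becomes:

$$L_{OU} = \left(\beta^{-1} - p^2\right)\frac{\partial}{\partial z} + \beta^{-1}p^2\frac{\partial^2}{\partial z^2}.$$

Hence, the integral equation for $\phi_0(z)$ becomes

$$\int_{E_{\min}}^{+\infty}\int_{-\pi}^{\pi}g(z)\left[p(z,q) + \left(\left(\beta^{-1} - p^2\right)\frac{\partial}{\partial z} + \beta^{-1}p^2\frac{\partial^2}{\partial z^2}\right)\phi_0(z)\right]\frac{1}{p(z,q)}\,dz\,dq = 0.$$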

Let $E_0$ denote the critical energy, i.e. the energy along the separatrix (homoclinic orbit). We set

$$S(z) = \int_{x_1(z)}^{x_2(z)}p(z,q)\,dq, \qquad T(z) = \int_{x_1(z)}^{x_2(z)}\frac{1}{p(z,q)}\,dq,$$

where Risken's notation [64, p. 301] has been used for $x_1(z)$ and $x_2(z)$.

We need to consider the cases $\{z > E_0,\ p > 0\}$, $\{z > E_0,\ p < 0\}$ and $\{E_{\min} < z < E_0\}$ separately.

We consider first the case $E > E_0$, $p > 0$. In this case $x_1(z) = -\pi$, $x_2(z) = \pi$. We can perform the integration with respect to $q$ to obtain

$$\int_{E_0}^{+\infty}g(z)\left[2\pi + \left(\left(\beta^{-1}T(z) - S(z)\right)\frac{\partial}{\partial z} + \beta^{-1}S(z)\frac{\partial^2}{\partial z^2}\right)\phi_0(z)\right]dz = 0.$$

This equation is valid for every test function $g(z)$, from which we obtain the following differential equation for $\phi_0$:

$$-\mathcal{L}\phi := -\beta^{-1}\frac{1}{T(z)}S(z)\,\phi'' + \left(\frac{1}{T(z)}S(z) - \beta^{-1}\right)\phi' = \frac{2\pi}{T(z)}, \tag{8.104}$$

where primes denote differentiation with respect to $z$ and where the subscript 0 has been dropped for notational simplicity.


A similar calculation shows that in the regions $E > E_0$, $p < 0$ and $E_{\min} < E < E_0$ the equation for $\phi_0$ is

$$-\mathcal{L}\phi = -\frac{2\pi}{T(z)}, \quad E > E_0,\ p < 0 \tag{8.105}$$

and

$$-\mathcal{L}\phi = 0, \quad E_{\min} < E < E_0. \tag{8.106}$$

Equations (8.104), (8.105), (8.106) are augmented with condition (8.85) and a continuity condition at the critical energy [18]

$$2\phi_3'(E_0) = \phi_1'(E_0) + \phi_2'(E_0), \tag{8.107}$$

where $\phi_1$, $\phi_2$, $\phi_3$ are the solutions of equations (8.104), (8.105) and (8.106), respectively.

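The average of a function $h(q,p) = h(q,p(z,q))$ can be written in the form [64, p. 303]

$$\langle h(q,p)\rangle_\beta := \int_{-\infty}^{\infty}\int_{-\pi}^{\pi}h(q,p)\,\rho_\beta(q,p)\,dq\,dp = Z_\beta^{-1}\int_{E_{\min}}^{+\infty}\int_{x_1(z)}^{x_2(z)}\Big(h(q,p(z,q)) + h(q,-p(z,q))\Big)\left(p(q,z)\right)^{-1}e^{-\beta z}\,dz\,dq,$$

where the partition function is

$$Z_\beta = \sqrt{\frac{2\pi}{\beta}}\int_{-\pi}^{\pi}e^{-\beta V(q)}\,dq.$$

From equation (8.106) and condition (8.85) we take $\phi_3(z) = 0$; furthermore, $\phi_1(z) = -\phi_2(z)$. These facts, together with the above formula for the averaging with respect to the Boltzmann distribution, yield:

$$D = \langle p\,\Phi(p,q)\rangle_\beta = \frac{1}{\gamma}\langle p\,\phi_0\rangle_\beta + O(1) \tag{8.108}$$

$$\approx \frac{4\pi}{\gamma}Z_\beta^{-1}\int_{E_0}^{+\infty}\phi_0(z)\,e^{-\beta z}\,dz, \tag{8.109}$$

to leading order in $\gamma$, and where $\phi_0(z)$ is the solution of the two point boundary value problem (8.104). We remark that if we start with formula $D = \gamma\beta^{-1}\langle|\partial_p\Phi|^2\rangle_\beta$ for the diffusion coefficient, we obtain the following formula, which is equivalent to (8.109):

$$D = \frac{2}{\gamma\beta}Z_\beta^{-1}\int_{E_0}^{+\infty}S(z)\,|\partial_z\phi_0(z)|^2\,e^{-\beta z}\,dz.$$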

Now we solve the equation for $\phi_0(z)$ (for notational simplicity, we will drop the subscript 0). Using the fact that $S'(z) = T(z)$, we rewrite (8.104) as

$$-\beta^{-1}(S\phi')' + S\phi' = 2\pi.$$

This equation can be rewritten as

$$-\beta^{-1}\left(e^{-\beta z}S\phi'\right)' = 2\pi e^{-\beta z}.$$

Condition (8.85) implies that the derivative of the unique solution of (8.104) is

$$\phi'(z) = 2\pi S^{-1}(z).$$

We use this in (8.109), together with an integration by parts, to obtain the following formula for the diffusion coefficient:

$$D = \frac{1}{\gamma}\,8\pi^2 Z_\beta^{-1}\beta^{-1}\int_{E_0}^{+\infty}\frac{e^{-\beta z}}{S(z)}\,dz. \tag{8.110}$$

We emphasize the fact that this formula is exact in the limit as $\gamma \to 0$ and is valid for all periodic potentials and for all values of the temperature.
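The identity $S'(z) = T(z)$ used above can be verified numerically for a concrete potential. The sketch below assumes the cosine potential $V(q) = -\cos q$ and an energy above the separatrix ($z > 1$), so that $x_1 = -\pi$ and $x_2 = \pi$; the function names and the choice $z = 2$ are illustrative. Because the integrands are smooth and $2\pi$-periodic, the midpoint rule is very accurate here.

```python
import math

def action_S(z, n=4000):
    """S(z) = int_{-pi}^{pi} sqrt(2 (z + cos q)) dq for V(q) = -cos(q), z > 1."""
    h = 2.0 * math.pi / n
    return h * sum(math.sqrt(2.0 * (z + math.cos(-math.pi + (i + 0.5) * h)))
                   for i in range(n))

def period_T(z, n=4000):
    """T(z) = int_{-pi}^{pi} dq / sqrt(2 (z + cos q)) for V(q) = -cos(q), z > 1."""
    h = 2.0 * math.pi / n
    return h * sum(1.0 / math.sqrt(2.0 * (z + math.cos(-math.pi + (i + 0.5) * h)))
                   for i in range(n))

z, dz = 2.0, 1e-4
dS_dz = (action_S(z + dz) - action_S(z - dz)) / (2.0 * dz)   # central difference
print(dS_dz, period_T(z))
```

The two printed numbers should agree to several digits, which reflects $\partial_z p(z,q) = 1/p(z,q)$ under the integral sign.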

Consider now the case of the nonlinear pendulum $V(q) = -\cos(q)$. The partition function is

$$Z_\beta = \frac{(2\pi)^{3/2}}{\beta^{1/2}}\,J_0(\beta),$$

where $J_0(\cdot)$ is the modified Bessel function of the first kind. Furthermore, a simple calculation yields

$$S(z) = 2^{5/2}\sqrt{z+1}\;E\!\left(\sqrt{\frac{2}{z+1}}\right),$$

where $E(\cdot)$ is the complete elliptic integral of the second kind. The formula for the diffusion coefficient becomes

$$D = \frac{1}{\gamma}\,\frac{\sqrt{\pi}}{2\beta^{1/2}J_0(\beta)}\int_1^{+\infty}\frac{e^{-\beta z}}{\sqrt{z+1}\;E\!\left(\sqrt{2/(z+1)}\right)}\,dz. \tag{8.111}$$

that E(1) = 1 to obtain the small temperature asymptotics for the diffusion coefficient:

1 2

D=

e

, 1,

(8.112)

2

which is precisely formula (??), obtained by Risken.

Unlike the overdamped limit, which is treated in the next section, it is not straightforward to obtain the next order correction in the formula for the effective diffusivity. This is due to the discontinuity of the solution of the Poisson equation (8.88) along the separatrix. In particular, the next order correction to $\phi$ when $\gamma \ll 1$ is of $O(\gamma^{-1/2})$, rather than $O(1)$ as suggested by ansatz (8.102).

Upon combining the formula for the diffusion coefficient and the formula for

the hopping rate from Kramers theory [31, eqn. 4.48(a)] we can obtain a formula

for the mean square jump length at low friction. For the cosine potential, and for

1, this formula is

h2 i =

2

8 2 2

for 1, 1.

(8.113)

In this subsection we study the large $\gamma$ asymptotics of the diffusion coefficient. As in the previous case, we use singular perturbation theory, e.g. [32, Ch. 8]. The regularity of the solution of (8.88) when $\gamma \gg 1$ will enable us to obtain the first two terms in the $\frac{1}{\gamma}$ expansion without any difficulty.

We set $\gamma = \epsilon^{-1}$. The differential operator $L_0$ becomes

$$ L_0 = \frac{1}{\epsilon}L_{OU} + L_H. $$

We look for a solution of (8.88) in the form of a power series expansion in $\epsilon$:

$$ \phi = \phi_0 + \epsilon\phi_1 + \epsilon^2\phi_2 + \epsilon^3\phi_3 + \dots \tag{8.114} $$

We substitute this into (8.88) and obtain the following sequence of equations:

$$ -L_{OU}\phi_0 = 0, \tag{8.115a} $$
$$ -L_{OU}\phi_1 = p + L_H\phi_0, \tag{8.115b} $$
$$ -L_{OU}\phi_2 = L_H\phi_1, \tag{8.115c} $$
$$ -L_{OU}\phi_3 = L_H\phi_2. \tag{8.115d} $$

The null space of the Ornstein-Uhlenbeck operator $L_{OU}$ consists of constants in $p$. Consequently, from the first equation in (8.115) we deduce that the first term in the expansion is independent of $p$, $\phi_0 = \phi(q)$. The second equation becomes

$$ -L_{OU}\phi_1 = p\,(1 + \partial_q\phi). $$

Let

$$ \rho_\beta(p) = \frac{1}{\sqrt{2\pi\beta^{-1}}}\,e^{-\beta p^2/2} $$

be the invariant distribution of the OU process (i.e. $L_{OU}^*\rho_\beta(p) = 0$). The solvability condition for an equation of the form $-L_{OU}\phi = f$ requires that the right hand side averages to 0 with respect to $\rho_\beta(p)$, i.e. that the right hand side of the equation is orthogonal to the null space of the adjoint of $L_{OU}$. This condition is clearly satisfied for the equation for $\phi_1$. Thus, by the Fredholm alternative, this equation has a solution which is

$$ \phi_1(p,q) = (1 + \partial_q\phi)\,p + \psi_1(q), $$

where the function $\psi_1(q)$ is to be determined. We substitute this into the right hand side of the third equation to obtain

$$ -L_{OU}\phi_2 = p^2\,\partial_q^2\phi - \partial_qV\,(1 + \partial_q\phi) + p\,\partial_q\psi_1(q). $$

From the solvability condition for this we obtain an equation for $\phi(q)$:

$$ \beta^{-1}\partial_q^2\phi - \partial_qV\,(1 + \partial_q\phi) = 0, \tag{8.116} $$

together with the periodic boundary conditions. The derivative of the solution of this two-point boundary value problem is

$$ \partial_q\phi + 1 = \frac{2\pi}{\int_{-\pi}^{\pi}e^{\beta V(q)}\,dq}\,e^{\beta V(q)}. \tag{8.117} $$

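The first two terms in the large $\gamma$ expansion of the solution of equation (8.88) are

$$ \phi(p,q) = \phi(q) + \frac{1}{\gamma}\,p\,(1 + \partial_q\phi) + O\!\left(\frac{1}{\gamma^2}\right), $$

where $\phi(q)$ is the solution of (8.116). Substituting this in the formula for the diffusion coefficient and using (8.117) we obtain

$$ D = \int_{-\infty}^{+\infty}\int_{-\pi}^{\pi} p\,\phi(p,q)\,\mu_\beta(q,p)\,dq\,dp = \frac{4\pi^2}{\beta\gamma Z\hat{Z}} + O\!\left(\frac{1}{\gamma^3}\right), $$

where $\mu_\beta$ denotes the Gibbs density, $Z = \int_{-\pi}^{\pi}e^{-\beta V(q)}\,dq$ and $\hat{Z} = \int_{-\pi}^{\pi}e^{\beta V(q)}\,dq$. This is, of course, the Lifson-Jackson formula which gives the diffusion coefficient in the overdamped limit [43]. Continuing in the same fashion, we can also calculate the next two terms in the expansion (8.114), see Exercise 4. From this, we can compute the next order correction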

to the diffusion coefficient. The final result is

1

4 2 Z1

4 2

+O

D=

,

(8.118)

3

2

b

b

5

Z Z

ZZ

R

where Z1 = |V (q)|2 eV (q) dq.

In the case of the nonlinear pendulum, V (q) = cos(q), formula (8.118) gives

J2 ()

1

1 2

2

J () 3

,

(8.119)

J0 () + O

D=

3

5

0

J0 ()

where Jn () is the modified Bessel function of the first kind.

In the multidimensional case, a similar analysis leads to the large gamma

asymptotics:

1

1

h, Di = h, D0 i + O

,

3

where is an arbitrary unit vector in Rd and D0 is the diffusion coefficient for the

Smoluchowski (overdamped) dynamics:

Z

D0 = Z 1

(8.120)

LV eV (q) dq

Rd

where

LV = q V q + 1 q

and (q) is the solution of the PDE LV = q V with periodic boundary conditions.

Now we prove several properties of the effective diffusion tensor in the overdamped limit. For this we will need the following integration by parts formula

Z

Z

Z

( y ) dy. (8.121)

y () y dy =

y dy =

Td

Td

Td

Theorem 8.6.1. The effective diffusion tensor D0 (8.120) satisfies the upper and

lower bounds

D

6 h, Ki 6 D||2 Rd ,

(8.122)

b

ZZ

where

Zb =

eV (y)/D dy.

Td

Furthermore, the effective diffusivity is symmetric.

Proof. The lower bound follows from the general lower bound (??), equation (??)

and the formula for the Gibbs measure. To establish the upper bound, we use

(8.121) and (??) to obtain

K = DI + 2D

() dy +

d

ZT

ZT

y V dy

y V dy

y dy +

= DI 2D

d

d

T

T

Z

Z

y V dy

y V dy +

= DI 2

d

d

T

T

Z

y V dy

= DI

d

ZT

= DI

L0 dy

Td

Z

y y dy.

= DI D

(8.123)

Td

Hence, for = ,

h, Ki = D||2 D

6 D||2 .

Td

|y |2 dy

The One Dimensional Case

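The one dimensional case is always in gradient form: $b(y) = -\partial_yV(y)$. Furthermore, in one dimension we can solve the cell problem (??) in closed form and calculate the effective diffusion coefficient explicitly, up to quadratures. We start with the following calculation concerning the structure of the diffusion coefficient. Here $\chi$ denotes the solution of the cell problem and $\rho(y) = Z^{-1}e^{-V(y)/D}$ the invariant density, so that $-\partial_yV\,\rho = D\,\partial_y\rho$:

$$ \begin{aligned}
K &= D + 2D\int_0^1\partial_y\chi\,\rho\,dy + \int_0^1(-\partial_yV)\,\chi\,\rho\,dy \\
  &= D + 2D\int_0^1\partial_y\chi\,\rho\,dy + D\int_0^1\chi\,\partial_y\rho\,dy \\
  &= D + 2D\int_0^1\partial_y\chi\,\rho\,dy - D\int_0^1\partial_y\chi\,\rho\,dy \\
  &= D\int_0^1\big(1 + \partial_y\chi\big)\,\rho\,dy.
\end{aligned} \tag{8.124} $$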

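The cell problem (??) in one dimension is

$$ D\,\partial_{yy}\chi - \partial_yV\,\partial_y\chi = \partial_yV. \tag{8.125} $$

We multiply equation (8.125) by $e^{-V(y)/D}$ and divide by $D$ to obtain

$$ \partial_y\left(\partial_y\chi\,e^{-V(y)/D}\right) = -\partial_y\left(e^{-V(y)/D}\right). $$

We integrate this equation once and multiply by $e^{V(y)/D}$ to obtain

$$ \partial_y\chi(y) = -1 + c_1e^{V(y)/D}. $$

Another integration yields

$$ \chi(y) = -y + c_1\int_0^y e^{V(y')/D}\,dy' + c_2. $$

The periodic boundary conditions imply that $\chi(0) = \chi(1)$, from which we conclude that

$$ -1 + c_1\int_0^1 e^{V(y)/D}\,dy = 0. $$

Hence

$$ c_1 = \frac{1}{\hat{Z}}, \qquad \hat{Z} = \int_0^1 e^{V(y)/D}\,dy. $$

We deduce that

$$ \partial_y\chi = -1 + \frac{1}{\hat{Z}}\,e^{V(y)/D}. $$

We substitute this expression into (8.124) to obtain

$$ K = \frac{D}{Z}\int_0^1\big(1 + \partial_y\chi(y)\big)\,e^{-V(y)/D}\,dy = \frac{D}{Z\hat{Z}}\int_0^1 e^{V(y)/D}e^{-V(y)/D}\,dy = \frac{D}{Z\hat{Z}}, \tag{8.126} $$

with

$$ Z = \int_0^1 e^{-V(y)/D}\,dy, \qquad \hat{Z} = \int_0^1 e^{V(y)/D}\,dy. \tag{8.127} $$

The Cauchy-Schwarz inequality shows that $Z\hat{Z} \geq 1$. Notice that in the one-dimensional case the formula for the effective diffusivity is precisely the lower bound in (8.122). This shows that the lower bound is sharp.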

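Example 8.6.2. Consider the potential

$$ V(y) = \begin{cases} a_1, & y \in [0, \tfrac12], \\ a_2, & y \in (\tfrac12, 1]. \end{cases} \tag{8.128} $$

It is straightforward to calculate the integrals in (8.127): $Z = \tfrac12\big(e^{-a_1/D} + e^{-a_2/D}\big)$ and $\hat{Z} = \tfrac12\big(e^{a_1/D} + e^{a_2/D}\big)$, so that $Z\hat{Z} = \cosh^2\big(\tfrac{a_1-a_2}{2D}\big)$ and we obtain the formula

$$ K = \frac{D}{\cosh^2\left(\frac{a_1 - a_2}{2D}\right)}. \tag{8.129} $$

In Figure 8.2 we plot the effective diffusivity given by (8.129) as a function of the molecular diffusivity $D$. We observe that $K$ decays exponentially fast in the limit as $D \to 0$.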
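The quadrature formula $K = D/(Z\hat{Z})$ can be checked numerically for the piecewise constant potential. The following is a minimal sketch, not part of the original text: the function names and the midpoint-rule discretization are illustrative assumptions, and the closed form used for comparison is $D/\cosh^2((a_1-a_2)/(2D))$, obtained by evaluating the two integrals directly.

```python
import math


def piecewise_potential(a1, a2):
    """Potential of Example 8.6.2: a1 on [0, 1/2], a2 on (1/2, 1]."""
    return lambda y: a1 if y <= 0.5 else a2


def effective_diffusivity(V, D, n=2000):
    """K = D / (Z * Zhat), with both integrals approximated by the midpoint rule on [0, 1]."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    Z = h * sum(math.exp(-V(y) / D) for y in mids)
    Zhat = h * sum(math.exp(V(y) / D) for y in mids)
    return D / (Z * Zhat)


def closed_form(a1, a2, D):
    """Closed form obtained by evaluating Z and Zhat by hand for the two-level potential."""
    return D / math.cosh((a1 - a2) / (2.0 * D)) ** 2


if __name__ == "__main__":
    a1, a2 = 1.0, 0.3  # hypothetical parameter values, chosen only for illustration
    for D in (0.05, 0.2, 1.0, 5.0):
        print(D, effective_diffusivity(piecewise_potential(a1, a2), D), closed_form(a1, a2, D))
```

For small $D$ the printed values drop rapidly, in line with the exponential decay of $K$ noted above, while for large $D$ they approach $D$.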

Figure 8.2: Effective diffusivity versus molecular diffusivity for the potential (8.128).

4 Of course, this potential is not even continuous, let alone smooth, and the theory as developed in this chapter does not apply. It is possible, however, to consider a regularized version of this discontinuous potential and then homogenization theory applies.

In this appendix we use our method to obtain a formula for the effective diffusion coefficient of an overdamped particle moving in a one dimensional tilted periodic potential. This formula was first derived and analyzed in [62, 61] without any appeal to multiscale analysis. The equation of motion is

$$ \dot{x} = -V'(x) + F + \sqrt{2D}\,\xi, \tag{8.130} $$

where $V(x)$ is a smooth periodic function with period $L$, $F$ and $D > 0$ are constants and $\xi(t)$ is standard white noise in one dimension. To simplify the notation we have set $\gamma = 1$.

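The stationary Fokker-Planck equation corresponding to (8.130) is

$$ \partial_x\Big(\big(V'(x) - F\big)\rho(x) + D\,\partial_x\rho(x)\Big) = 0, \tag{8.131} $$

with periodic boundary conditions. Formula (??) for the effective drift now becomes

$$ U_{eff} = \int_0^L\big(-V'(x) + F\big)\rho(x)\,dx. \tag{8.132} $$

The solution of equation (8.131) is

$$ \rho(x) = \frac{1}{Z}\int_x^{x+L}dy\,Z_+(y)\,Z_-(x), \tag{8.133} $$

with

$$ Z_\pm(x) := e^{\pm\frac{1}{D}(V(x) - Fx)}, $$

and

$$ Z = \int_0^L dx\int_x^{x+L}dy\,Z_+(y)\,Z_-(x). \tag{8.134} $$

Upon using (8.133) in (8.132) we obtain

$$ U_{eff} = \frac{DL}{Z}\left(1 - e^{-\frac{FL}{D}}\right). \tag{8.135} $$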

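Our goal now is to calculate the effective diffusion coefficient. For this we first need to solve the Poisson equation (8.94a) which now becomes

$$ L\chi(x) := D\,\partial_{xx}\chi(x) + \big(-V'(x) + F\big)\partial_x\chi(x) = V'(x) - F + U_{eff}, \tag{8.136} $$

with periodic boundary conditions. Then we need to evaluate the integrals in (??):

$$ D_{eff} = D + \int_0^L\big(-V'(x) + F - U_{eff}\big)\chi(x)\rho(x)\,dx + 2D\int_0^L\partial_x\chi(x)\,\rho(x)\,dx. $$

It will be more convenient for the subsequent calculation to rewrite the above formula for the effective diffusion coefficient in a different form. The fact that $\rho(x)$ solves the stationary Fokker-Planck equation, together with elementary integrations by parts, yields that, for all sufficiently smooth periodic functions $\phi(x)$,

$$ \int_0^L\phi(x)\big(-L\phi(x)\big)\rho(x)\,dx = D\int_0^L\big(\partial_x\phi(x)\big)^2\rho(x)\,dx. $$

Now we have

$$ \begin{aligned}
D_{eff} &= D + \int_0^L\big(-V'(x) + F - U_{eff}\big)\chi(x)\rho(x)\,dx + 2D\int_0^L\partial_x\chi(x)\,\rho(x)\,dx \\
        &= D + \int_0^L\big(-L\chi(x)\big)\chi(x)\rho(x)\,dx + 2D\int_0^L\partial_x\chi(x)\,\rho(x)\,dx \\
        &= D + D\int_0^L\big(\partial_x\chi(x)\big)^2\rho(x)\,dx + 2D\int_0^L\partial_x\chi(x)\,\rho(x)\,dx \\
        &= D\int_0^L\big(1 + \partial_x\chi(x)\big)^2\rho(x)\,dx.
\end{aligned} \tag{8.137} $$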

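Now we solve the Poisson equation (8.136) with periodic boundary conditions. We multiply the equation by $Z_-(x)$ and divide through by $D$ to rewrite it in the form

$$ \partial_x\big(\partial_x\chi(x)\,Z_-(x)\big) = -\partial_xZ_-(x) + \frac{U_{eff}}{D}\,Z_-(x). $$

We integrate this equation from $x - L$ to $x$ and use the periodicity of $\chi(x)$ and $V(x)$, which gives $Z_-(x-L) = e^{-FL/D}Z_-(x)$, together with formula (8.135) to obtain

$$ \partial_x\chi(x)\,Z_-(x)\left(1 - e^{-\frac{FL}{D}}\right) = -Z_-(x)\left(1 - e^{-\frac{FL}{D}}\right) + \frac{L}{Z}\left(1 - e^{-\frac{FL}{D}}\right)\int_{x-L}^{x}Z_-(y)\,dy, $$

from which we immediately get

$$ \partial_x\chi(x) + 1 = \frac{L}{Z}\int_{x-L}^{x}Z_-(y)\,Z_+(x)\,dy. $$

Substituting this into (8.137) and using the formula for the invariant distribution (8.133) we finally obtain

$$ D_{eff} = \frac{D\,L^2}{Z^3}\int_0^L\big(I_+(x)\big)^2I_-(x)\,dx, \tag{8.138} $$

with

$$ I_+(x) = \int_{x-L}^{x}Z_-(y)\,Z_+(x)\,dy \qquad\text{and}\qquad I_-(x) = \int_x^{x+L}Z_+(y)\,Z_-(x)\,dy. $$

Formula (8.138) for the effective diffusion coefficient (formula (22) in [61]) is the main result of this section.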
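As a sanity check on the drift and diffusion formulas for the tilted periodic potential, one can evaluate the integrals numerically. The sketch below is an illustration only and rests on assumptions not made in the text: it uses the midpoint rule, the hypothetical function name `tilted_transport`, and the normalization $D_{eff} = D L^2 Z^{-3}\int_0^L I_+^2 I_-\,dx$ with $Z = \int_0^L I_-\,dx$. For $V \equiv 0$ the result should reduce to $U_{eff} = F$ and $D_{eff} = D$.

```python
import math


def tilted_transport(V, F, D, L, n=200):
    """Return (U_eff, D_eff) for dx/dt = -V'(x) + F + sqrt(2 D) xi, V periodic with period L.

    All integrals are approximated by the midpoint rule with n cells per period.
    """
    h = L / n

    def w(x):
        # Exponent of Z_+(x) = exp(w(x)); Z_-(x) = exp(-w(x)).
        return (V(x) - F * x) / D

    xs = [(i + 0.5) * h for i in range(n)]
    I_plus, I_minus = [], []
    for x in xs:
        # I_+(x) = int_{x-L}^{x} Z_-(y) Z_+(x) dy
        I_plus.append(h * sum(math.exp(w(x) - w(x - L + (j + 0.5) * h)) for j in range(n)))
        # I_-(x) = int_{x}^{x+L} Z_+(y) Z_-(x) dy
        I_minus.append(h * sum(math.exp(w(x + (j + 0.5) * h) - w(x)) for j in range(n)))

    Z = h * sum(I_minus)
    U_eff = D * L / Z * (1.0 - math.exp(-F * L / D))
    D_eff = D * L ** 2 / Z ** 3 * h * sum(p * p * m for p, m in zip(I_plus, I_minus))
    return U_eff, D_eff


if __name__ == "__main__":
    # Free particle: expect U_eff close to F and D_eff close to D.
    print(tilted_transport(lambda x: 0.0, F=0.7, D=0.5, L=2.0))
    # Untilted cosine potential: expect zero drift and D_eff below D.
    print(tilted_transport(lambda x: math.cos(2.0 * math.pi * x), F=0.0, D=0.5, L=1.0))
```

The exponents of $Z_+$ and $Z_-$ are combined before exponentiation so that large tilts do not overflow.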

8.8 Discussion and Bibliography

The rigorous study of the overdamped limit can be found in [54]. A similar approximation theorem is also valid in infinite dimensions (i.e. for SPDEs); see [5, 6].

More information about the underdamped limit of the Langevin equation can be found in [70, 19, 20].

We also mention in passing that the various formulae for the effective diffusion coefficient that have been derived in the literature [24, 43, 62, 66] can be obtained from equation (??): they correspond to cases where equations (??) and (??) can be solved analytically. An example, the calculation of the effective diffusion coefficient of an overdamped Brownian particle in a tilted periodic potential, is presented in the appendix. Similar calculations yield analytical expressions for all other exactly solvable models that have been considered in the literature.


8.9 Exercises

1. Let $\hat{L}$ be the generator of the two-dimensional Ornstein-Uhlenbeck operator (8.17). Calculate the eigenvalues and eigenfunctions of $\hat{L}$. Show that there exists a transformation that transforms $\hat{L}$ into the Schrödinger operator of the two-dimensional quantum harmonic oscillator.

2. Let Lb be the operator defined in (8.34)

Lb = 1 (c ) (c+ ) 2 (d ) (d+ ) .

b (c ) ], [L,

b (d ) ].

[(c+ ) , (c ) ], [(d+ ) , (d ) ], [(c ) , (d ) ], [L,

3. Show that the operators a , b defined in (8.15) and (8.16) satisfy the commutation relations

[a+ , a ] = 1,

(8.139a)

[b+ , b ] = 1,

(8.139b)

[a , b ] = 0.

(8.139c)

5. Prove formula (8.121).


Chapter 9

Exit Time Problems

9.1 Introduction

9.2 Brownian Motion in a Bistable Potential

There are many systems in physics, chemistry and biology that exist in at least two

stable states. Among the many applications we mention the switching and storage

devices in computers. Another example is biological macromolecules that can exist

in many different states. The problems that we would like to solve are:

How stable are the various states relative to each other?

How long does it take for a system to switch spontaneously from one state

to another?

How is the transfer made, i.e. through what path in the relevant state space?

There is a lot of important current work on this problem by E, Vanden Eijnden etc.

How does the system relax to an unstable state?

We can distinguish between the one dimensional problem, the finite dimensional problem and the infinite dimensional problem (SPDEs). We will solve the one dimensional problem completely and discuss the finite dimensional problem in some detail. The infinite dimensional situation is an extremely hard problem and we will


192CHAPTER 9. THE MEAN FIRST PASSAGE TIME AND EXIT TIME PROBLEMS


only make some remarks. The study of bistability and metastability is a very active

research area, in particular the development of numerical methods for the calculation of various quantities such as reaction rates, transition pathways etc.

We will mostly consider the dynamics of a particle moving in a bistable potential, under the influence of thermal noise in one dimension:

$$ \dot{x} = -V'(x) + \sqrt{2k_BT}\,\dot{W}. \tag{9.1} $$

We will assume that the potential has two local minima, one local maximum and that it increases at least quadratically at infinity. This ensures that the state space is compact, i.e. that the particle cannot escape at infinity. The standard potential that satisfies these assumptions is

$$ V(x) = \frac{1}{4}x^4 - \frac{1}{2}x^2 + \frac{1}{4}. \tag{9.2} $$

It is easily checked that this potential has three critical points: a local maximum at $x = 0$ and two local minima at $x = \pm1$. The values of the potential at these three points are:

$$ V(\pm1) = 0, \qquad V(0) = \frac{1}{4}. $$

We will say that the height of the potential barrier is $\frac14$. The physically (and mathematically!) interesting case is when the thermal fluctuations are weak when compared to the potential barrier that the particle has to climb over.


More generally, we assume that the potential has two local minima at the points $a$ and $c$ and a local maximum at $b$. Let us consider the problem of the escape of the particle from the left local minimum $a$. The potential barrier is then defined as

$$ \Delta E = V(b) - V(a). $$

Our assumption that the thermal fluctuations are weak can be written as

$$ \frac{k_BT}{\Delta E} \ll 1. $$

In this limit, it is intuitively clear that the particle is most likely to be found at either $a$ or $c$. There it will perform small oscillations around either of the local minima. This is a result that we can obtain by studying the small temperature limit by using perturbation theory. The result is that we can describe locally the dynamics of the particle by appropriate Ornstein-Uhlenbeck processes. Of course, this result is valid only for finite times: at sufficiently long times the particle can escape from the one local minimum, $a$ say, and surmount the potential barrier to end up at $c$. It will then spend a long time in the neighborhood of $c$ until it escapes again over the potential barrier and ends up at $a$. This is an example of a rare event. The relevant time scale, the exit time or the mean first passage time, scales exponentially in $\beta := (k_BT)^{-1}$:

$$ \tau = \nu^{-1}\exp(\beta\Delta E). $$

It is more customary to calculate the reaction rate $\kappa := \tau^{-1}$ which gives the rate with which particles escape from a local minimum of the potential:

$$ \kappa = \nu\exp(-\beta\Delta E). \tag{9.3} $$

It is very important to notice that the escape from a local minimum, i.e. a state of local stability, can happen only at positive temperatures: it is a noise assisted event. Indeed, consider the case $T = 0$. The equation of motion becomes

$$ \dot{x} = -V'(x), \qquad x(0) = x_0. $$

In this case the potential is a Lyapunov function: away from the critical points of $V$,

$$ \frac{d}{dt}V(x(t)) = V'(x)\,\frac{dx}{dt} = -\big(V'(x)\big)^2 < 0. $$

Hence, depending on the initial condition the particle will converge either to $a$ or $c$. The particle cannot escape from either state of local stability.


On the other hand, at high temperatures the particle does not see the potential

barrier: it essentially jumps freely from one local minimum to another.

To get a better understanding of the dependence of the dynamics on the depth of the potential barrier relative to temperature, we solve the equation of motion (9.1) numerically. In Figure we present the time series of the particle position. We observe that at small temperatures the particle spends most of its time around $x = \pm1$ with rapid transitions from $-1$ to $1$ and back.
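A numerical solution of this kind can be sketched with the Euler-Maruyama scheme. The code below is an illustration under stated assumptions, not the method used to produce the figure: the step size, the seed, the helper names and the threshold used to decide which well the particle is in are all hypothetical choices, and the potential is the quartic $V(x) = x^4/4 - x^2/2 + 1/4$.

```python
import math
import random


def V_prime(x):
    """Derivative of the bistable potential V(x) = x^4/4 - x^2/2 + 1/4."""
    return x ** 3 - x


def euler_maruyama(kBT, x0=-1.0, dt=1e-3, n_steps=20000, seed=0):
    """Simulate dx = -V'(x) dt + sqrt(2 kBT) dW with the Euler-Maruyama scheme."""
    rng = random.Random(seed)
    amplitude = math.sqrt(2.0 * kBT * dt)
    x, path = x0, [x0]
    for _ in range(n_steps):
        x += -V_prime(x) * dt + amplitude * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def count_well_switches(path, threshold=0.5):
    """Count switches between the regions x < -threshold and x > threshold."""
    side, switches = 0, 0
    for x in path:
        new_side = 1 if x > threshold else (-1 if x < -threshold else 0)
        if new_side != 0:
            if side != 0 and new_side != side:
                switches += 1
            side = new_side
    return switches


if __name__ == "__main__":
    for kBT in (0.05, 0.25, 1.0):
        path = euler_maruyama(kBT, n_steps=200000)
        print(kBT, count_well_switches(path))
```

At zero temperature the scheme reduces to the explicit Euler method for the gradient flow, so a trajectory started in the left well stays there; the number of switches is expected to grow as the temperature is raised.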

The Arrhenius-type factor in the formula for the reaction rate, eqn. (9.3), is intuitive and was observed experimentally in the late nineteenth century by Arrhenius and others. What is extremely important both from a theoretical and an applied point of view is the calculation of the prefactor $\nu$, the rate coefficient. A systematic approach for the calculation of the rate coefficient, as well as the justification of the Arrhenius kinetics, is that of the mean first passage time method (MFPT). Since this method is of independent interest and is useful in various other contexts, we will present it in a quite general setting and apply it to the problem of the escape from a potential barrier in later sections. We will first treat the one dimensional problem and then extend the theory to arbitrary finite dimensions.

We will restrict ourselves to the case of homogeneous Markov processes. It is

not very easy to extend the method to non-Markovian processes.

Let $X_t$ be a continuous time diffusion process on $\mathbb{R}^d$ whose evolution is governed by the SDE

$$ dX_t^x = b(X_t^x)\,dt + \sigma(X_t^x)\,dW_t, \qquad X_0^x = x. \tag{9.4} $$

Let $D$ be a bounded subset of $\mathbb{R}^d$ with smooth boundary. Given $x \in D$, we want to know how long it takes for the process $X_t$ to leave the domain $D$ for the first time:

$$ \tau_D^x = \inf\{t \geq 0 : X_t^x \notin D\}. $$

Clearly, this is a random variable which is called the first passage time. The average of this random variable is called the mean first passage time (MFPT) or the first exit time:

$$ \tau(x) := \mathbb{E}\tau_D^x = \mathbb{E}\Big(\inf\{t \geq 0 : X_t^x \notin D\}\,\Big|\,X_0^x = x\Big). $$


We have written the second equality in the above in order to emphasize the fact that the mean first passage time is defined in terms of a conditional expectation, i.e. the MFPT is defined as the expectation of the first time the diffusion process $X_t$ leaves the domain, conditioned on $X_t$ starting at $x \in D$. Consequently, the MFPT is a function of the starting point $x$. Consider now an ensemble of initial conditions distributed according to a distribution $p_0(x)$. The confinement time is defined as

$$ \bar{\tau} = \int_D\tau(x)\,p_0(x)\,dx = \int_D\mathbb{E}\Big(\inf\{t \geq 0 : X_t^x \notin D\}\,\Big|\,X_0^x = x\Big)\,p_0(x)\,dx. \tag{9.5} $$

We can calculate the MFPT by solving an appropriate boundary value problem. The calculation of the confinement time then follows by calculating the integral in (9.5).

Theorem 9.3.1. The MFPT is the solution of the boundary value problem

$$ -L\tau = 1, \qquad x \in D, \tag{9.6a} $$
$$ \tau = 0, \qquad x \in \partial D, \tag{9.6b} $$

where $L$ is the generator of the diffusion process (9.4).

The homogeneous Dirichlet boundary conditions correspond to an absorbing boundary: the particles are removed when they reach the boundary. Other choices of boundary conditions are also possible. The rigorous proof of Theorem 9.3.1 is based on Itô's formula.

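Proof. Let $\rho(X,x,t)$ be the probability distribution of the particles that have not left the domain $D$ at time $t$. It solves the FP equation with absorbing boundary conditions:

$$ \frac{\partial\rho}{\partial t} = L^*\rho, \qquad \rho(X,x,0) = \delta(X - x), \qquad \rho|_{\partial D} = 0. \tag{9.7} $$

We can write the solution to this equation in the form

$$ \rho(X,x,t) = e^{L^*t}\delta(X - x), $$

where the absorbing boundary conditions are included in the definition of the semigroup $e^{L^*t}$. The homogeneous Dirichlet (absorbing) boundary conditions imply that

$$ \lim_{t\to+\infty}\rho(X,x,t) = 0. $$

That is: all particles will eventually leave the domain. The (normalized) number of particles that are still inside $D$ at time $t$ is

$$ S(x,t) = \int_D\rho(X,x,t)\,dX. $$

This is a decreasing function of time, and we can write

$$ \frac{\partial S}{\partial t} = -f(x,t), $$

where $f(x,t)$ is the first passage times distribution. The MFPT is the first moment of the distribution $f(x,t)$:

$$ \begin{aligned}
\tau(x) &= \int_0^{+\infty}f(x,s)\,s\,ds = \int_0^{+\infty}-\frac{dS}{ds}\,s\,ds \\
        &= \int_0^{+\infty}S(x,s)\,ds = \int_0^{+\infty}\int_D\rho(X,x,s)\,dX\,ds \\
        &= \int_0^{+\infty}\int_D e^{L^*s}\delta(X - x)\,dX\,ds \\
        &= \int_0^{+\infty}\int_D\delta(X - x)\,\big(e^{Ls}1\big)\,dX\,ds = \int_0^{+\infty}\big(e^{Ls}1\big)\,ds.
\end{aligned} $$

We apply $L$ to the above equation to deduce:

$$ L\tau = \int_0^{+\infty}L\,e^{Lt}1\,dt = \int_0^{+\infty}\frac{d}{dt}\big(e^{Lt}1\big)\,dt = -1. $$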

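In the case where a part of the boundary is absorbing and a part is reflecting, we end up with a mixed boundary value problem for the MFPT:

$$ -L\tau = 1, \qquad x \in D, \tag{9.8a} $$
$$ \tau = 0, \qquad x \in \partial D_A, \tag{9.8b} $$
$$ \eta\cdot J = 0, \qquad x \in \partial D_R. \tag{9.8c} $$

Here $\partial D_A \cup \partial D_R = \partial D$, where $\partial D_A$ denotes the absorbing and $\partial D_R$ the reflecting part of the boundary, $\eta$ is the unit normal to the boundary and $J$ denotes the probability flux.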


Figure 9.1: The mean first passage time for Brownian motion with one absorbing

and one reflecting boundary.

9.3.2 Examples

In this section we consider a few simple examples for which we can calculate the

mean first passage time in closed form.

We consider the problem of Brownian motion moving in the interval $[a, b]$. We assume that the left boundary is absorbing and the right boundary is reflecting. The boundary value problem for the MFPT becomes

$$ -\frac{d^2\tau}{dx^2} = 1, \qquad \tau(a) = 0, \qquad \frac{d\tau}{dx}(b) = 0. \tag{9.9} $$

The solution of this equation is

$$ \tau(x) = -\frac{x^2}{2} + bx + a\left(\frac{a}{2} - b\right). $$

The MFPT for Brownian motion with one absorbing and one reflecting boundary in the interval $[-1, 1]$ is plotted in Figure 9.1.


Figure 9.2: The mean first passage time for Brownian motion with two absorbing

boundaries.

Consider again the problem of Brownian motion moving in the interval $[a, b]$, but now with both boundaries being absorbing. The boundary value problem for the MFPT becomes

$$ -\frac{d^2\tau}{dx^2} = 1, \qquad \tau(a) = 0, \qquad \tau(b) = 0. \tag{9.10} $$

The solution of this equation is

$$ \tau(x) = -\frac{x^2}{2} + \frac{a + b}{2}\,x - \frac{ab}{2}. $$

The MFPT for Brownian motion with two absorbing boundaries in the interval $[-1, 1]$ is plotted in Figure 9.2.
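Both boundary value problems are easy to check with a finite difference discretization. The following is a minimal sketch, with the grid size, the ghost-point treatment of the reflecting boundary and all function names being assumptions made for illustration. It compares the discrete solution of $-\tau'' = 1$ with the closed forms $\tau(x) = -x^2/2 + bx + a(a/2 - b)$ (absorbing at $a$, reflecting at $b$) and $\tau(x) = -x^2/2 + (a+b)x/2 - ab/2$ (absorbing at both ends).

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(diag)
    d, r = list(diag), list(rhs)
    for i in range(1, n):
        w = lower[i] / d[i - 1]
        d[i] -= w * upper[i - 1]
        r[i] -= w * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def mfpt_brownian(a, b, n=200, right="reflecting"):
    """Solve -tau'' = 1 on [a, b] with tau(a) = 0 and tau'(b) = 0 or tau(b) = 0."""
    h = (b - a) / n
    m = n if right == "reflecting" else n - 1  # number of unknown grid values
    lower, diag, upper, rhs = [-1.0] * m, [2.0] * m, [-1.0] * m, [h * h] * m
    if right == "reflecting":
        # Ghost point tau_{n+1} = tau_{n-1} enforces tau'(b) = 0 to second order.
        lower[-1] = -2.0
    tau = [0.0] + solve_tridiagonal(lower, diag, upper, rhs)
    if right != "reflecting":
        tau.append(0.0)
    xs = [a + i * h for i in range(n + 1)]
    return xs, tau


def tau_absorbing_reflecting(x, a, b):
    return -0.5 * x * x + b * x + a * (0.5 * a - b)


def tau_two_absorbing(x, a, b):
    return -0.5 * x * x + 0.5 * (a + b) * x - 0.5 * a * b


if __name__ == "__main__":
    xs, tau = mfpt_brownian(-1.0, 1.0, right="reflecting")
    print(max(abs(t - tau_absorbing_reflecting(x, -1.0, 1.0)) for x, t in zip(xs, tau)))
    xs, tau = mfpt_brownian(-1.0, 1.0, right="absorbing")
    print(max(abs(t - tau_two_absorbing(x, -1.0, 1.0)) for x, t in zip(xs, tau)))
```

Because the exact solutions are quadratic, central differences reproduce them up to rounding error, so the printed discrepancies should be tiny.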

The Mean First Passage Time for a One-Dimensional Diffusion Process

Consider now the mean exit time problem from an interval $[a, b]$ for a general one-dimensional diffusion process with generator

$$ L = a(x)\frac{d}{dx} + \frac{1}{2}b(x)\frac{d^2}{dx^2}, $$

where the drift and diffusion coefficients are smooth functions and where the diffusion coefficient $b(x)$ is a strictly positive function (uniform ellipticity condition). In order to calculate the mean first passage time we need to solve the differential equation

$$ -\left(a(x)\frac{d}{dx} + \frac{1}{2}b(x)\frac{d^2}{dx^2}\right)\tau = 1, \tag{9.11} $$

together with appropriate boundary conditions, depending on whether we have one absorbing and one reflecting boundary or two absorbing boundaries. To solve this equation we first define the function $\psi(x)$ through $\psi'(x) = 2a(x)/b(x)$ to write (9.11) in the form

$$ \left(e^{\psi(x)}\tau'(x)\right)' = -\frac{2}{b(x)}\,e^{\psi(x)}. $$

The general solution of (9.11) is obtained after two integrations:

$$ \tau(x) = -2\int_a^x e^{-\psi(z)}\,dz\int_a^z\frac{e^{\psi(y)}}{b(y)}\,dy + c_1\int_a^x e^{-\psi(y)}\,dy + c_2, $$

where the constants $c_1$ and $c_2$ are to be determined from the boundary conditions. When both boundaries are absorbing we get

$$ \tau(x) = -2\int_a^x e^{-\psi(z)}\,dz\int_a^z\frac{e^{\psi(y)}}{b(y)}\,dy + \frac{2\hat{Z}}{Z}\int_a^x e^{-\psi(y)}\,dy, \tag{9.12} $$

where

$$ Z = \int_a^b e^{-\psi(y)}\,dy, \qquad \hat{Z} = \int_a^b e^{-\psi(z)}\,dz\int_a^z\frac{e^{\psi(y)}}{b(y)}\,dy. $$
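The quadrature formula for two absorbing boundaries translates directly into a few cumulative integrals. The sketch below is illustrative only: the trapezoid rule, the grid size and the names `cumulative_trapezoid` and `mfpt_two_absorbing` are assumptions, and the interval endpoints are called `lo` and `hi` in the code to avoid the clash with the drift $a(x)$ and diffusion coefficient $b(x)$. For the generator $d^2/dx^2$ (zero drift, $b(x) = 2$) on $[-1, 1]$ the result should agree with $\tau(x) = (1 - x^2)/2$.

```python
import math


def cumulative_trapezoid(values, h):
    """Running trapezoid-rule integral of equally spaced samples; the first entry is 0."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * h * (left + right))
    return out


def mfpt_two_absorbing(drift, diffusion, lo, hi, n=2000):
    """MFPT on a uniform grid for L = drift d/dx + (1/2) diffusion d^2/dx^2, absorbing at lo and hi."""
    h = (hi - lo) / n
    xs = [lo + i * h for i in range(n + 1)]
    # psi(x) = int_lo^x 2 drift / diffusion
    psi = cumulative_trapezoid([2.0 * drift(x) / diffusion(x) for x in xs], h)
    # inner(z) = int_lo^z exp(psi) / diffusion
    inner = cumulative_trapezoid([math.exp(p) / diffusion(x) for p, x in zip(psi, xs)], h)
    # A(x) = int_lo^x exp(-psi(z)) inner(z) dz,  B(x) = int_lo^x exp(-psi)
    A = cumulative_trapezoid([math.exp(-p) * q for p, q in zip(psi, inner)], h)
    B = cumulative_trapezoid([math.exp(-p) for p in psi], h)
    c1 = 2.0 * A[-1] / B[-1]  # enforces tau(hi) = 0; tau(lo) = 0 holds by construction
    tau = [-2.0 * Ai + c1 * Bi for Ai, Bi in zip(A, B)]
    return xs, tau


if __name__ == "__main__":
    xs, tau = mfpt_two_absorbing(lambda x: 0.0, lambda x: 2.0, -1.0, 1.0)
    print(tau[len(tau) // 2])  # value at x = 0
```

The same routine can be applied to a drift such as $a(x) = -V'(x)$, in which case $\psi$ is proportional to $-V$ and the formula reduces to the nested exponential integrals familiar from the escape problem.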

In this section we use the theory developed in the previous section to study the long time/small temperature asymptotics of solutions to the Langevin equation for a particle moving in a one-dimensional potential of the form (9.2):

$$ \ddot{x} = -V'(x) - \gamma\dot{x} + \sqrt{2\gamma k_BT}\,\dot{W}. \tag{9.13} $$

In particular, we justify the Arrhenius formula for the reaction rate

$$ \kappa = \nu(\gamma)\exp(-\beta\Delta E) $$

and we calculate the escape rate $\nu = \nu(\gamma)$. In particular, we analyze the dependence of the escape rate on the friction coefficient. We will see that we need to distinguish between the cases of large and small friction coefficients.


We consider the Langevin equation (9.13) in the limit of large friction. As we saw in Section 8.4, in the overdamped limit $\gamma \gg 1$, the solution to (9.13) can be approximated by the solution to the Smoluchowski equation (9.1)

$$ \dot{x} = -V'(x) + \sqrt{2\beta^{-1}}\,\dot{W}. $$

We want to calculate the rate of escape from the potential barrier in this case. We assume that the particle is initially at $x_0$ which is near $a$, the left potential minimum. Consider the boundary value problem for the MFPT of the one dimensional diffusion process (9.1) from the interval $(a, b)$:

$$ -\beta^{-1}e^{\beta V}\partial_x\left(e^{-\beta V}\partial_x\tau\right) = 1. \tag{9.14} $$

We choose a reflecting boundary condition at $x = a$ and an absorbing boundary condition at $x = b$. We can solve (9.14) with these boundary conditions by quadratures:

$$ \tau(x) = \beta\int_x^b dy\,e^{\beta V(y)}\int_a^y dz\,e^{-\beta V(z)}. \tag{9.15} $$

Now we can solve the problem of the escape from a potential well: the reflecting boundary is at $x = a$, the left local minimum of the potential, and the absorbing boundary is at $x = b$, the local maximum. We can replace the B.C. at $x = a$ by a repelling B.C. at $x = -\infty$:

$$ \tau(x) = \beta\int_x^b dy\,e^{\beta V(y)}\int_{-\infty}^y dz\,e^{-\beta V(z)}. $$

When $\beta E_b \gg 1$, with $E_b = V(b) - V(a)$, the integral wrt $z$ is dominated by the value of the potential near $a$. Furthermore, we can replace the upper limit of integration by $+\infty$:

$$ \int_{-\infty}^{y}\exp(-\beta V(z))\,dz \approx \int_{-\infty}^{+\infty}\exp(-\beta V(a))\exp\left(-\frac{\beta\omega_0^2}{2}(z - a)^2\right)dz = \exp(-\beta V(a))\sqrt{\frac{2\pi}{\beta\omega_0^2}}, $$

where we have used the Taylor series expansion around the minimum:

$$ V(z) = V(a) + \frac{1}{2}\omega_0^2(z - a)^2 + \dots $$

Similarly, the integral wrt $y$ is dominated by the value of the potential around the saddle point. We use the Taylor series expansion

$$ V(y) = V(b) - \frac{1}{2}\omega_b^2(y - b)^2 + \dots $$

Assuming that $x$ is close to $a$, the minimum of the potential, we can replace the lower limit of integration by $-\infty$. We finally obtain

$$ \int_x^b\exp(\beta V(y))\,dy \approx \int_{-\infty}^b\exp(\beta V(b))\exp\left(-\frac{\beta\omega_b^2}{2}(y - b)^2\right)dy = \frac{1}{2}\exp(\beta V(b))\sqrt{\frac{2\pi}{\beta\omega_b^2}}. $$

Putting everything together we obtain a formula for the MFPT:

$$ \tau(x) = \frac{\pi}{\omega_0\omega_b}\exp(\beta E_b). $$

The rate of arrival at $b$ is $1/\tau$. Only half of the particles escape. Consequently, the escape rate (or reaction rate) is given by $\frac{1}{2\tau}$:

$$ \kappa = \frac{\omega_0\omega_b}{2\pi}\exp(-\beta E_b). $$
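The quality of the Laplace approximation can be gauged by evaluating the double integral for the MFPT directly. The sketch below is an illustration under stated assumptions: it uses the quartic potential $V(x) = x^4/4 - x^2/2 + 1/4$ (so $a = -1$, $b = 0$, $\omega_0^2 = 2$, $\omega_b^2 = 1$, $E_b = 1/4$), truncates the lower limit $-\infty$ at a hypothetical cutoff `z_min`, and uses the trapezoid rule; the function names are not from the text.

```python
import math


def V(x):
    """Bistable potential with minima at x = -1, 1 and a maximum at x = 0."""
    return 0.25 * x ** 4 - 0.5 * x ** 2 + 0.25


def mfpt_quadrature(beta, x=-1.0, b=0.0, z_min=-4.0, n=40000):
    """tau(x) = beta * int_x^b dy exp(beta V(y)) int_{-inf}^y dz exp(-beta V(z)), trapezoid rule."""
    h = (b - z_min) / n
    inner = 0.0                       # running value of int_{z_min}^{y} exp(-beta V)
    prev_f = math.exp(-beta * V(z_min))
    prev_g = None                     # previous sample of the outer integrand
    outer = 0.0
    for i in range(1, n + 1):
        y = z_min + i * h
        f = math.exp(-beta * V(y))
        inner += 0.5 * h * (prev_f + f)
        prev_f = f
        if y >= x - 1e-12:            # accumulate the outer integral only on [x, b]
            g = math.exp(beta * V(y)) * inner
            if prev_g is not None:
                outer += 0.5 * h * (prev_g + g)
            prev_g = g
    return beta * outer


def kramers_mfpt(beta, omega_0=math.sqrt(2.0), omega_b=1.0, barrier=0.25):
    """Laplace-approximation result pi / (omega_0 omega_b) * exp(beta E_b)."""
    return math.pi / (omega_0 * omega_b) * math.exp(beta * barrier)


if __name__ == "__main__":
    for beta in (10.0, 20.0, 40.0):
        print(beta, mfpt_quadrature(beta) / kramers_mfpt(beta))
```

The printed ratio is expected to approach 1 as $\beta E_b$ grows, since the corrections to the Gaussian approximations of both integrals are of relative order $1/\beta$.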

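Consider now the problem of escape from a potential well for the Langevin equation

$$ \ddot{q} = -\partial_qV(q) - \gamma\dot{q} + \sqrt{2\gamma\beta^{-1}}\,\dot{W}. \tag{9.16} $$

The reaction rate depends on the friction coefficient and the temperature. In the overdamped limit ($\gamma \gg 1$) we retrieve (??), appropriately rescaled with $\gamma$:

$$ \kappa = \frac{\omega_0\omega_b}{2\pi\gamma}\exp(-\beta E_b). \tag{9.17} $$

We can also obtain a formula for the reaction rate for $\gamma = O(1)$:

$$ \kappa = \frac{\sqrt{\frac{\gamma^2}{4} + \omega_b^2} - \frac{\gamma}{2}}{\omega_b}\,\frac{\omega_0}{2\pi}\exp(-\beta E_b). \tag{9.18} $$

Naturally, in the limit as $\gamma \to +\infty$ (9.18) reduces to (9.17), since $\sqrt{\gamma^2/4 + \omega_b^2} - \gamma/2 \approx \omega_b^2/\gamma$.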


regime

In order to calculate the reaction rate in the underdamped or energy-diffusion-limited regime $\gamma \ll 1$ we need to study the diffusion process for the energy, (8.69) or (8.70). The result is

$$ \kappa = \gamma\beta\,I(E_b)\,\frac{\omega_0}{2\pi}\,e^{-\beta E_b}, \tag{9.19} $$

where $I(E_b)$ denotes the action evaluated at $b$.

The calculation of reaction rates and the stochastic modeling of chemical reactions has been a very active area of research since the 1930s. One of the first methods that were developed was that of transition state theory. Kramers developed his theory in his celebrated paper [38]. In this chapter we have based our approach on the calculation of the mean first passage time. Our analysis is based mostly on [25, Ch. 5, Ch. 9], [75, Ch. 4] and the excellent review article [31]. We highly recommend this review article for further information on reaction rate theory. See also [30] and the review article of Melnikov (1991). A formula for the escape rate which is valid for all values of the friction coefficient was obtained by Melnikov and Meshkov in 1986, J. Chem. Phys 85(2) 1018-1027. This formula requires the calculation of integrals and it reduces to (9.17) and (9.19) in the overdamped and underdamped limits, respectively.

There are many applications of interest where it is important to calculate reaction rates for non-Markovian Langevin equations of the form

x

= V (x)

t

0

b(t s)x(s)

ds + (t)

h(t)(0)i = kB T M 1 (t)

(9.20a)

(9.20b)

with the fluctuationdissipation theorem (10.16), in Chapter 10. The calculation of

reaction rates for the generalized Langevin equation is presented in [30].

The long time/small temperature asymptotics can be studied rigorously by means of the theory of Freidlin-Wentzell [20]. See also [3]. A related issue is that of the small temperature asymptotics for the eigenvalues (in particular, the first eigenvalue) of the generator of the Markov process $x(t)$ which is the solution of

$$ \dot{x} = -V'(x) + \sqrt{2k_BT}\,\dot{W}. $$

The theory of Freidlin and Wentzell has also been extended to infinite dimensional

problems. This is a very important problem in many applications such as micromagnetics...We refer to CITE... for more details.

A systematic study of the problem of the escape from a potential well was

developed by Matkowsky, Schuss and collaborators [67, 50, 51]. This approach

is based on a systematic use of singular perturbation theory. In particular, the

calculation of the transition rate which is uniformly valid in the friction coefficient

is presented in [51]. This formula is obtained through a careful analysis of the PDE

$$ p\,\partial_q\tau - \partial_qV\,\partial_p\tau + \gamma\left(-p\,\partial_p\tau + k_BT\,\partial_p^2\tau\right) = -1, $$

for the mean first passage time . The PDE is equipped, of course, with the appropriate boundary conditions. Singular perturbation theory is used to study the small

temperature asymptotics of solutions to the boundary value problem. The formula

derived in this paper reduces to the formulas which are valid at large and small

values of the friction coefficient at the appropriate asymptotic limits.

The study of rare transition events between long lived metastable states is a

key feature in many systems in physics, chemistry and biology. Rare transition

events play an important role, for example, in the analysis of the transition between

different conformation states of biological macromolecules such as DNA [68]. The

study of rare events is one of the most active research areas in applied stochastic processes. Recent developments in this area involve the transition path theory of W. E and Vanden Eijnden. Various simple applications of this theory are presented in Metzner, Schütte et al 2006. As in the mean first passage time approach, transition path theory is also based on the solution of an appropriate boundary value problem for the so-called committor function.

9.6 Exercises


Chapter 10

Statistical Mechanics

10.1 Introduction

In this final chapter we study the connection between stochastic processes and

non-equilibrium statistical mechanics. In particular, we derive stochastic equations of evolution for a particle (or more generally, a low-dimensional deterministic Hamiltonian dynamical system) that is in contact with a heat bath. This derivation provides a justification for the use of stochastic differential equations in physics and chemistry. We also develop some additional tools that are useful in the study of systems far from equilibrium, such as linear response theory and projection operator techniques.

In Section 10.2 we study the Kac-Zwanzig model and we derive the generalized Langevin equation (GLE), together with the fluctuation-dissipation theorem. The generalized Langevin equation is studied in Section 10.3. More general classes of models that describe the dynamics of a particle interacting with a heat bath are studied in Section 10.4. Linear response theory, one of the most important techniques that are used in the study of systems far from equilibrium, is developed

in Section 10.5. Projection operator techniques, another extremely useful tool in

non-equilibrium statistical mechanics, are studied in Section 10.6. Discussion and

bibliographical remarks are included in Section 10.7. Exercises can be found in

Section 10.8.


In this section we will study a simple model for the dynamics of a particle (the distinguished or Brownian particle) that interacts, i.e. exchanges energy, with its environment (the heat bath). The dynamics of the particle-heat bath system can be described through a Hamiltonian of the form

$$ H(Q, P; q, p) = H_{BP}(Q, P) + H_{HB}(\{q\}, \{p\}) + H_I(Q, \{q\}), \tag{10.1} $$

where $\{Q, P\}$ are the coordinates of the Brownian particle and $\{\{q\}, \{p\}\}$ the coordinates of the particles in the heat bath. The last term in the Hamiltonian function (10.1), $H_I(Q, q)$, describes the interaction between the particle and the heat bath. The heat bath is assumed to be in equilibrium at temperature $\beta^{-1}$. For this, we need to prepare the system appropriately, i.e. we need to assume that the initial conditions for the particles in the heat bath are random variables that are distributed according to an appropriate probability distribution, an appropriate Gibbs measure.

For simplicity we will restrict ourselves to the one dimensional case. We will

also consider the simplest possible model for the heat bath as well as the simplest

possible coupling between the particle and the heat bath: the heat bath will be taken
to consist of $N$ harmonic oscillators and the coupling will be taken to be linear:

$$
H(Q_N, P_N, q, p) = \frac{P_N^2}{2} + V(Q_N) + \sum_{n=1}^{N} \left( \frac{p_n^2}{2 m_n} + \frac{1}{2} m_n \omega_n^2 q_n^2 - \lambda \mu_n q_n Q_N \right), \tag{10.2}
$$

where we have introduced the subscript $N$ in the notation for the position and momentum of the distinguished particle, $Q_N$ and $P_N$, to emphasize their dependence on the number $N$ of the harmonic oscillators in the heat bath. $V(Q)$ denotes the potential experienced by the Brownian particle. For notational simplicity we have assumed that the Brownian particle has unit mass. Notice also that we have introduced a parameter $\lambda$ that measures the strength of the coupling between the particle and the thermal reservoir and that we have also introduced a family of constants $\{\mu_n\}_{n=1}^{N}$.

Hamilton's equations of motion are:

$$
\ddot{Q}_N + V'(Q_N) = \lambda \sum_{n=1}^{N} \mu_n q_n, \tag{10.3a}
$$
$$
\ddot{q}_n + \omega_n^2 q_n - \lambda \frac{\mu_n}{m_n} Q_N = 0, \quad n = 1, \dots, N. \tag{10.3b}
$$


The equations for the particles in the harmonic heat bath are second order linear

inhomogeneous equations with constant coefficients. Our plan is to solve them and

then to substitute the result in the equations of motion for the Brownian particle.

We can solve the equations of motion for the heat bath variables using the variation

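of constants formula. Set $z_n = (q_n, v_n)^T$, $v_n = \dot{q}_n$. Then equations (10.3b) can be written as
$$
\frac{dz_n}{dt} = A_n z_n + \lambda h_n(t), \tag{10.4}
$$
where
$$
A_n = \begin{pmatrix} 0 & 1 \\ -\omega_n^2 & 0 \end{pmatrix} \quad \text{and} \quad h_n(t) = \begin{pmatrix} 0 \\ \frac{\mu_n}{m_n} Q_N(t) \end{pmatrix}.
$$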

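The solution of (10.4) is
$$
z_n(t) = e^{A_n t} z_n(0) + \lambda \int_0^t e^{A_n (t-s)} h_n(s) \, ds,
$$
where
$$
e^{A_n t} = \cos(\omega_n t) \, I + \frac{1}{\omega_n} \sin(\omega_n t) \, A_n, \tag{10.5}
$$
and $I$ stands for the $2 \times 2$ identity matrix. From this we obtain, with $p_n = m_n \dot{q}_n$,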

$$
q_n(t) = q_n(0) \cos(\omega_n t) + \frac{p_n(0)}{m_n \omega_n} \sin(\omega_n t) + \frac{\lambda \mu_n}{m_n \omega_n} \int_0^t \sin(\omega_n (t-s)) Q_N(s) \, ds. \tag{10.6}
$$
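Formula (10.5) is easy to check numerically. The following is a minimal sketch that compares the closed form with a truncated power series for the matrix exponential; the function names and the values of `omega` and `t` are illustrative choices and are not taken from the text.

```python
import numpy as np

def generator(omega):
    """Matrix A_n of the first-order system (10.4) for one bath oscillator."""
    return np.array([[0.0, 1.0], [-omega**2, 0.0]])

def exp_closed_form(omega, t):
    """Right-hand side of (10.5): cos(omega t) I + sin(omega t)/omega * A_n."""
    return np.cos(omega * t) * np.eye(2) + np.sin(omega * t) / omega * generator(omega)

def exp_power_series(omega, t, terms=60):
    """Truncated series sum_k (A_n t)^k / k!, used as an independent reference."""
    At = generator(omega) * t
    total = np.eye(2)
    term = np.eye(2)
    for k in range(1, terms):
        term = term @ At / k
        total = total + term
    return total

omega, t = 1.7, 0.9  # illustrative values, not taken from the text
max_error = np.max(np.abs(exp_closed_form(omega, t) - exp_power_series(omega, t)))
print(f"max entrywise difference: {max_error:.2e}")
```

The agreement rests on the identity $A_n^2 = -\omega_n^2 I$, which collapses the even and odd terms of the series into the cosine and sine in (10.5).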

Now we can substitute (10.6) into (10.3a) to obtain a closed equation that describes

the dynamics of the distinguished particle. However, it is more customary to perform an integration by parts in (10.6) first:

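$$
\begin{aligned}
q_n(t) &= \left( q_n(0) - \frac{\lambda \mu_n}{m_n \omega_n^2} Q_N(0) \right) \cos(\omega_n t) + \frac{p_n(0)}{m_n \omega_n} \sin(\omega_n t) \\
&\quad + \frac{\lambda \mu_n}{m_n \omega_n^2} Q_N(t) - \frac{\lambda \mu_n}{m_n \omega_n^2} \int_0^t \cos(\omega_n (t-s)) \dot{Q}_N(s) \, ds \\
&=: \alpha_n \cos(\omega_n t) + \beta_n \sin(\omega_n t) + \lambda E_n Q_N(t) - \lambda \int_0^t R_n(t-s) \dot{Q}_N(s) \, ds,
\end{aligned}
$$
with $E_n = \mu_n / (m_n \omega_n^2)$ and $R_n(t) = E_n \cos(\omega_n t)$. We substitute this into (10.3a) to obtain
$$
\ddot{Q}_N + V'(Q_N) = \lambda^2 E_N Q_N(t) - \lambda^2 \int_0^t R_N(t-s) \dot{Q}_N(s) \, ds + \lambda F_N(t), \tag{10.7}
$$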

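where
$$
E_N = \sum_{n=1}^{N} \frac{\mu_n^2}{m_n \omega_n^2}, \tag{10.8a}
$$
$$
R_N(t) = \sum_{n=1}^{N} \frac{\mu_n^2}{m_n \omega_n^2} \cos(\omega_n t), \tag{10.8b}
$$
$$
F_N(t) = \sum_{n=1}^{N} \left[ \left( \mu_n q_n(0) - \frac{\lambda \mu_n^2}{m_n \omega_n^2} Q_N(0) \right) \cos(\omega_n t) \right. \tag{10.8c}
$$
$$
\left. \qquad\qquad + \frac{\mu_n p_n(0)}{m_n \omega_n} \sin(\omega_n t) \right]. \tag{10.8d}
$$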

It is important to note that equation (10.7), with $E_N$, $R_N(t)$ and $F_N(t)$ given by (10.8), is equivalent to the original Hamiltonian system (10.2): so far no approximation or particular assumption has been made. Notice also that the above

calculation is valid for any number of harmonic oscillators in the heat bath, even

for N = 1!

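Equation (10.7) can also be written in the form
$$
\ddot{Q}_N + V_{\mathrm{eff}}'(Q_N) = -\lambda^2 \int_0^t R_N(t-s) \dot{Q}_N(s) \, ds + \lambda F_N(t), \tag{10.9}
$$
where
$$
V_{\mathrm{eff}}(Q) = V(Q) - \frac{\lambda^2}{2} E_N Q^2. \tag{10.10}
$$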

Consequently, the effect of the interaction between the Brownian particle and the

heat bath is not only to introduce two additional terms to the equations of motion

for the Brownian particle, the two terms on the right hand side of (10.9), but also to

modify the potential. Notice also that all the dependence on the initial conditions

in (10.9) is included in FN (t). When the initial conditions for the heat bath are

random, the case of interest here, FN (t) becomes a stochastic process, a random

forcing term.

The initial conditions of the Brownian particle $\{Q_N(0), P_N(0)\} =: \{Q_0, P_0\}$ are taken to be deterministic. (The initial conditions for the distinguished particle are, of course, independent of the number of particles in the heat bath.) As has already been mentioned, the initial conditions for the harmonic heat bath are chosen so that the thermal reservoir is in equilibrium. Here we can make two choices: we can either assume that the heat bath is initially in equilibrium in the absence of the Brownian particle or that the heat bath is initially in equilibrium in the presence of the distinguished particle, i.e. that the initial positions and momenta of the heat bath particles are distributed according to a Gibbs distribution, conditional on the knowledge of $\{Q_0, P_0\}$:

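$$
\mu_\beta(dp \, dq) = Z^{-1} e^{-\beta H_{\mathrm{eff}}(q, p, Q_N)} \, dq \, dp, \tag{10.11}
$$
where
$$
H_{\mathrm{eff}}(q, p, Q_N) = \sum_{n=1}^{N} \left[ \frac{p_n^2}{2 m_n} + \frac{1}{2} m_n \omega_n^2 \left( q_n - \frac{\lambda \mu_n}{m_n \omega_n^2} Q_N \right)^2 \right]. \tag{10.12}
$$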

Here $\beta$ is the inverse temperature. This is a way of introducing the concept of the temperature in the system: through the average kinetic energy of the bath particles.

Our assumption that the initial conditions for the heat bath are distributed according to (10.11) implies that

$$
q_n(0) = \frac{\lambda \mu_n}{m_n \omega_n^2} Q_0 + \sqrt{\beta^{-1} k_n^{-1}} \, \xi_n, \quad p_n(0) = \sqrt{m_n \beta^{-1}} \, \eta_n, \tag{10.13}
$$

where the $\xi_n$, $\eta_n$ are mutually independent sequences of i.i.d. $N(0, 1)$ random variables and we have used the notation $k_n = m_n \omega_n^2$. We reiterate that we actually consider the Gibbs measure of an effective Hamiltonian. If we assume that the heat bath is in equilibrium at $t = 0$ in the absence of the distinguished particle, then we have $q_n(0) = \sqrt{\beta^{-1} k_n^{-1}} \, \xi_n$. Our choice of the initial conditions (10.13) ensures that the forcing term in the generalized Langevin equation that we will derive is mean zero (see below).

Now we substitute (10.13) into (10.8c) to obtain
$$
F_N(t) = \sqrt{\beta^{-1}} \sum_{n=1}^{N} \mu_n \sqrt{k_n^{-1}} \left( \xi_n \cos(\omega_n t) + \eta_n \sin(\omega_n t) \right). \tag{10.14}
$$

Equation (10.9) is called the generalized Langevin equation, $F_N(t)$ the noise and
$$
R_N(t) = \sum_{n=1}^{N} \frac{\mu_n^2}{k_n} \cos(\omega_n t) \tag{10.15}
$$
is the memory kernel. (Footnote to (10.12): notice that if we add the quadratic term in $Q$ to the Hamiltonian (10.2) then no correction to the potential $V(Q)$, eqn. (10.10), appears.)

The noise and memory kernel are related. This is not surprising, since the dissipation (i.e. the term $\lambda^2 \int_0^t R_N(t-s) \dot{Q}_N(s) \, ds$) and the noise $F_N(t)$ in (10.9) have the same source, namely the interaction between the Brownian particle and the heat bath. In fact, the autocorrelation function of the noise is the memory kernel times a constant, the temperature. The following proposition summarizes the basic properties of the noise term $F_N(t)$.

Proposition 10.2.1. The noise term $F_N(t)$ is a mean zero Gaussian stationary process with autocorrelation function
$$
\langle F_N(t) F_N(s) \rangle = \beta^{-1} R_N(t-s). \tag{10.16}
$$
In writing the above equation we have used the notation $\langle \cdot \rangle$ to denote the average with respect to the random variables $\{\xi_n, \eta_n\}_{n=1}^{N}$.

Remark 10.2.2. Equation (10.16) is called the fluctuation-dissipation theorem.

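Proof. The fact that $F_N(t)$ is mean zero follows from (10.13). Gaussianity follows from the fact that the $\xi_n$, $\eta_n$ are mutually independent Gaussian random variables. Stationarity is proved in Exercise 3, Chapter 3. The proof of (10.16) follows from the formulas $\langle \xi_n \xi_m \rangle = \delta_{nm}$, $\langle \eta_n \eta_m \rangle = \delta_{nm}$, $\langle \xi_n \eta_m \rangle = 0$, $n, m = 1, \dots, N$ and a simple trigonometric identity:
$$
\langle F_N(t) F_N(s) \rangle = \beta^{-1} \sum_{n=1}^{N} \frac{\mu_n^2}{k_n} \left( \cos(\omega_n t) \cos(\omega_n s) + \sin(\omega_n t) \sin(\omega_n s) \right) = \beta^{-1} R_N(t-s).
$$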
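The fluctuation-dissipation relation (10.16) can also be checked by sampling. The sketch below draws realizations of $F_N$ from (10.14) and compares the empirical two-time average with $\beta^{-1} R_N(t-s)$. The bath parameters, the sample size and the helper names `memory_kernel` and `sample_noise` are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative bath parameters (assumed values, not from the text).
beta = 2.0
omega = np.array([1.0, 1.5, 2.0, 2.5, 3.0])   # frequencies omega_n
m = np.ones(5)                                 # masses m_n
mu = np.ones(5)                                # coupling constants mu_n
k = m * omega**2                               # k_n = m_n omega_n^2

def memory_kernel(t):
    """R_N(t) from (10.15)."""
    return np.sum(mu**2 / k * np.cos(omega * t))

def sample_noise(times, n_samples):
    """Draw n_samples realizations of F_N from (10.14) at the given times."""
    xi = rng.standard_normal((n_samples, omega.size))
    eta = rng.standard_normal((n_samples, omega.size))
    amplitude = mu / np.sqrt(beta * k)          # mu_n * sqrt(beta^{-1} k_n^{-1})
    phase = np.outer(omega, times)              # shape (N, len(times))
    return (xi * amplitude) @ np.cos(phase) + (eta * amplitude) @ np.sin(phase)

t, s = 1.3, 0.4
F = sample_noise(np.array([t, s]), 200_000)
empirical = np.mean(F[:, 0] * F[:, 1])
exact = memory_kernel(t - s) / beta
print(f"empirical <F(t)F(s)> = {empirical:.4f}, beta^-1 R_N(t-s) = {exact:.4f}")
```

The two numbers should agree up to Monte Carlo error, which decreases like the inverse square root of the number of samples.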

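By choosing the frequencies $\omega_n$ and spring constants $k_n$ of the heat bath particles appropriately we can pass to the limit as $N \to +\infty$ and obtain the GLE with different memory kernels $R(t)$ and noise processes $F(t)$.

Let $a \in (0, 1)$, $2b = 1 - a$ and set $\omega_n = N^a \zeta_n$ where $\{\zeta_n\}_{n=1}^{\infty}$ are i.i.d. with $\zeta_1 \sim U(0, 1)$. Furthermore, we choose the spring constants according to
$$
k_n = \frac{f^2(\omega_n)}{N^{2b}},
$$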

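where the function $f(\omega_n)$ decays sufficiently fast at infinity. With the coupling constants chosen as $\mu_n = k_n$, we can rewrite the dissipation and noise terms (10.15) and (10.14) in the form
$$
R_N(t) = \sum_{n=1}^{N} f^2(\omega_n) \cos(\omega_n t) \, \Delta\omega
$$
and
$$
F_N(t) = \sqrt{\beta^{-1}} \sum_{n=1}^{N} f(\omega_n) \left( \xi_n \cos(\omega_n t) + \eta_n \sin(\omega_n t) \right) \sqrt{\Delta\omega},
$$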

where $\Delta\omega = N^a / N$. Using now properties of Fourier series with random coefficients/frequencies and of weak convergence of probability measures we can pass to the limit:
$$
R_N(t) \to R(t) \quad \text{in } L^1[0, T],
$$
for a.a. $\{\zeta_n\}_{n=1}^{\infty}$ and
$$
F_N(t) \to F(t) \quad \text{weakly in } C([0, T], \mathbb{R}).
$$
The time $T > 0$ is finite but arbitrary. The limiting kernel and noise satisfy the fluctuation-dissipation theorem (10.16):
$$
\langle F(t) F(s) \rangle = \beta^{-1} R(t-s). \tag{10.17}
$$

$Q_N(t)$, the solution of (??), converges weakly to the solution of the limiting GLE
$$
\ddot{Q} = -V'(Q) - \lambda^2 \int_0^t R(t-s) \dot{Q}(s) \, ds + \lambda F(t). \tag{10.18}
$$

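The properties of the limiting dissipation and noise are determined by the function $f(\omega)$. As an example, consider the Lorentzian function
$$
f^2(\omega) = \frac{2\alpha/\pi}{\alpha^2 + \omega^2} \tag{10.19}
$$
with $\alpha > 0$. In this case the limiting memory kernel is
$$
R(t) = e^{-\alpha |t|}.
$$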
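The passage from the random sum $R_N(t)$ to the exponential kernel can be illustrated numerically. The following sketch samples the frequencies $\omega_n = N^a \zeta_n$, evaluates $R_N(t)$ with the Lorentzian (10.19) and compares it with $e^{-\alpha|t|}$. The values of `alpha`, `a` and `N` are illustrative assumptions; for finite $N$ the agreement is only approximate, both because of sampling error and because the frequencies are truncated at $N^a$.

```python
import numpy as np

rng = np.random.default_rng(0)

alpha = 1.0          # width of the Lorentzian (illustrative value)
a = 1.0 / 3.0        # exponent a in (0, 1)
N = 10**6            # number of bath oscillators

def f_squared(w):
    """Lorentzian (10.19)."""
    return (2.0 * alpha / np.pi) / (alpha**2 + w**2)

omega = N**a * rng.uniform(0.0, 1.0, N)   # omega_n = N^a zeta_n, zeta_n ~ U(0, 1)
d_omega = N**a / N                        # Delta omega

def R_N(t):
    """Random Riemann sum for the memory kernel."""
    return np.sum(f_squared(omega) * np.cos(omega * t)) * d_omega

times = np.array([0.0, 0.5, 1.0, 2.0])
approx = np.array([R_N(t) for t in times])
limit = np.exp(-alpha * np.abs(times))
for t, r, e in zip(times, approx, limit):
    print(f"t = {t:.1f}: R_N = {r:.4f}, exp(-alpha |t|) = {e:.4f}")
```

In expectation the sum is the integral of $f^2(\omega)\cos(\omega t)$ over $[0, N^a]$, so the comparison is in effect a Monte Carlo evaluation of the cosine transform of the Lorentzian.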

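The noise process $F(t)$ is a mean zero stationary Gaussian process with continuous paths and, from (10.17), exponential correlation function:
$$
\langle F(t) F(s) \rangle = \beta^{-1} e^{-\alpha |t-s|}.
$$
Hence, $F(t)$ is the stationary Ornstein-Uhlenbeck process:
$$
\frac{dF}{dt} = -\alpha F + \sqrt{2 \beta^{-1} \alpha} \, \frac{dW}{dt}, \tag{10.20}
$$
and the GLE (10.18) becomes
$$
\ddot{Q} = -V'(Q) - \lambda^2 \int_0^t e^{-\alpha |t-s|} \dot{Q}(s) \, ds + \lambda F(t), \tag{10.21}
$$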

where F (t) is the OU process (10.20). Q(t), the solution of the GLE (10.18), is not

a Markov process, i.e. the future is not statistically independent of the past, when

conditioned on the present. The stochastic process Q(t) has memory. We can

turn (10.18) into a Markovian SDE by enlarging the dimension of state space, i.e.

introducing auxiliary variables. We might have to introduce infinitely many variables! For the case of the exponential memory kernel, when the noise is given

by an OU process, it is sufficient to introduce one auxiliary variable. We can

rewrite (10.21) as a system of SDEs:

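$$
\begin{aligned}
\frac{dQ}{dt} &= P, \\
\frac{dP}{dt} &= -V'(Q) + \lambda Z, \\
\frac{dZ}{dt} &= -\alpha Z - \lambda P + \sqrt{2 \alpha \beta^{-1}} \, \frac{dW}{dt},
\end{aligned}
$$
where $Z(0) \sim N(0, \beta^{-1})$. The process $\{Q(t), P(t), Z(t)\}$ is Markovian.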

It is a degenerate Markov process: noise acts directly only on one of the 3 degrees

of freedom.
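The extended system for $\{Q, P, Z\}$ is straightforward to simulate. The following is a minimal Euler-Maruyama sketch under the assumption of a quadratic potential $V(Q) = Q^2/2$, which is our choice for illustration and not one made in the text. For this potential the product Gaussian measure with variance $\beta^{-1}$ in each of $Q$, $P$, $Z$ is invariant, so an ensemble started from it should keep sample variances close to $\beta^{-1}$, up to discretization and sampling error. The function name and all parameter values are hypothetical.

```python
import numpy as np

def simulate(n_paths=20_000, T=5.0, dt=2e-3, lam=1.0, alpha=1.0, beta=1.0, seed=0):
    """Euler-Maruyama for dQ = P dt, dP = (-V'(Q) + lam Z) dt,
    dZ = (-alpha Z - lam P) dt + sqrt(2 alpha / beta) dW, with V(Q) = Q^2 / 2."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(2.0 * alpha / beta)
    # Start every path from the Gibbs measure exp(-beta (Q^2/2 + P^2/2 + Z^2/2)).
    Q, P, Z = (rng.standard_normal(n_paths) / np.sqrt(beta) for _ in range(3))
    for _ in range(int(round(T / dt))):
        dW = np.sqrt(dt) * rng.standard_normal(n_paths)
        Q_new = Q + P * dt
        P_new = P + (-Q + lam * Z) * dt
        Z_new = Z + (-alpha * Z - lam * P) * dt + sigma * dW
        Q, P, Z = Q_new, P_new, Z_new
    return Q, P, Z

Q, P, Z = simulate()
print("Var Q, Var P, Var Z:", Q.var(), P.var(), Z.var())
```

Only the $Z$ equation receives a random increment, which is the degeneracy of the noise in this system.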

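We can eliminate the auxiliary process $Z$ by taking an appropriate distinguished limit. Set $\lambda = \sqrt{\gamma} \, \varepsilon^{-1}$ and $\alpha = \varepsilon^{-2}$. The system becomes
$$
\begin{aligned}
\frac{dQ}{dt} &= P, \\
\frac{dP}{dt} &= -V'(Q) + \frac{\sqrt{\gamma}}{\varepsilon} Z, \\
\frac{dZ}{dt} &= -\frac{1}{\varepsilon^2} Z - \frac{\sqrt{\gamma}}{\varepsilon} P + \sqrt{\frac{2 \beta^{-1}}{\varepsilon^2}} \, \frac{dW}{dt}.
\end{aligned}
$$
We can use tools from singular perturbation theory for Markov processes to show that, in the limit as $\varepsilon \to 0$, we have that
$$
\frac{1}{\varepsilon} Z \to \sqrt{2 \beta^{-1}} \, \frac{dW}{dt} - \sqrt{\gamma} \, P.
$$
Thus, in this limit we obtain the Markovian Langevin equation ($R(t) = \gamma \delta(t)$)
$$
\ddot{Q} = -V'(Q) - \gamma \dot{Q} + \sqrt{2 \gamma \beta^{-1}} \, \frac{dW}{dt}. \tag{10.24}
$$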

In the previous section we studied the GLE for the case where the memory kernel decays exponentially fast. We showed that we can represent the GLE as a Markovian process by adding one additional variable, the solution of a linear SDE. A natural question which arises is whether it is always possible to turn the GLE into a Markovian system by adding a finite number of additional variables. This is not always the case. However, there are many applications where the memory kernel decays sufficiently fast so that we can approximate the GLE by a finite dimensional Markovian system.

We introduce the concept of a quasi-Markovian stochastic process.

Definition 10.3.1. We will say that a stochastic process Xt is quasi-Markovian if

it can be represented as a Markovian stochastic process by adding a finite number

of additional variables: There exists a stochastic process Yt so that {Xt , Yt } is a

Markov process.

In many cases the additional variables $Y_t$ can be written in terms of solutions of linear SDEs. This is possible, for example, when the memory kernel consists of a sum of exponential functions, a natural extension of the case considered in the previous section.

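Proposition 10.3.2. Consider the generalized Langevin equation
$$
\dot{Q} = P, \quad \dot{P} = -V'(Q) - \int_0^t R(t-s) P(s) \, ds + F(t), \tag{10.25}
$$
with a memory kernel of the form
$$
R(t) = \sum_{j=1}^{n} \lambda_j^2 e^{-\alpha_j |t|} \tag{10.26}
$$
and $F(t)$ being a mean zero stationary Gaussian process and where $R(t)$ and $F(t)$ are related through the fluctuation-dissipation theorem,
$$
\langle F(t) F(s) \rangle = \beta^{-1} R(t-s). \tag{10.27}
$$
Then (10.25) is equivalent to the Markovian SDE
$$
\dot{Q} = P, \quad \dot{P} = -V'(Q) + \sum_{j=1}^{n} \lambda_j u_j, \quad \dot{u}_j = -\alpha_j u_j - \lambda_j P + \sqrt{2 \alpha_j \beta^{-1}} \, \dot{W}_j, \quad j = 1, \dots, n, \tag{10.28}
$$
with $u_j(0) \sim N(0, \beta^{-1})$ and where $W_j(t)$ are independent standard one dimensional Brownian motions.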

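Proof. We solve the equations for $u_j$ in (10.28):
$$
\begin{aligned}
u_j &= -\lambda_j \int_0^t e^{-\alpha_j (t-s)} P(s) \, ds + e^{-\alpha_j t} u_j(0) + \sqrt{2 \alpha_j \beta^{-1}} \int_0^t e^{-\alpha_j (t-s)} \, dW_j \\
&=: -\lambda_j \int_0^t R_j(t-s) P(s) \, ds + \eta_j(t).
\end{aligned}
$$
We substitute this into the equation for $P$ to obtain
$$
\begin{aligned}
\dot{P} &= -V'(Q) + \sum_{j=1}^{n} \lambda_j u_j \\
&= -V'(Q) + \sum_{j=1}^{n} \lambda_j \left( -\lambda_j \int_0^t R_j(t-s) P(s) \, ds + \eta_j(t) \right) \\
&= -V'(Q) - \int_0^t R(t-s) P(s) \, ds + F(t),
\end{aligned}
$$
where $R(t)$ is given by (10.26) and the noise process is
$$
F(t) = \sum_{j=1}^{n} \lambda_j \eta_j(t),
$$
with the $\eta_j(t)$ being independent stationary Ornstein-Uhlenbeck processes. Its autocorrelation function is
$$
\begin{aligned}
\langle F(t) F(s) \rangle &= \sum_{i,j=1}^{n} \lambda_i \lambda_j \langle \eta_i(s) \eta_j(t) \rangle \\
&= \beta^{-1} \sum_{i,j=1}^{n} \lambda_i \lambda_j \delta_{ij} e^{-\alpha_i |t-s|} \\
&= \beta^{-1} \sum_{i=1}^{n} \lambda_i^2 e^{-\alpha_i |t-s|} = \beta^{-1} R(t-s).
\end{aligned}
$$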
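The last step of this computation can be checked by sampling. The sketch below draws pairs $(F(0), F(\tau))$ for a noise built from independent stationary Ornstein-Uhlenbeck processes, using the exact Gaussian transition of each process over a time step $\tau$, and compares the empirical correlation with $\beta^{-1} R(\tau)$ from (10.26). The values of `lam`, `alpha`, `beta`, `tau` and the helper names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 1.5
lam = np.array([1.0, 0.5, 2.0])      # lambda_j (illustrative values)
alpha = np.array([0.5, 1.0, 3.0])    # alpha_j (illustrative values)

def kernel(t):
    """R(t) from (10.26)."""
    return np.sum(lam**2 * np.exp(-alpha * np.abs(t)))

def sample_pair(tau, n_samples):
    """Sample (F(0), F(tau)) with F = sum_j lambda_j eta_j and eta_j stationary OU."""
    eta0 = rng.standard_normal((n_samples, lam.size)) / np.sqrt(beta)
    decay = np.exp(-alpha * tau)
    # Exact OU transition over a time step tau.
    eta_tau = decay * eta0 + np.sqrt((1.0 - decay**2) / beta) * rng.standard_normal(eta0.shape)
    return eta0 @ lam, eta_tau @ lam

tau = 0.7
F0, Ftau = sample_pair(tau, 400_000)
empirical = np.mean(F0 * Ftau)
exact = kernel(tau) / beta
print(f"empirical <F(0)F(tau)> = {empirical:.4f}, beta^-1 R(tau) = {exact:.4f}")
```

Using the exact transition avoids any time-discretization error, so the only discrepancy between the two numbers is statistical.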

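These additional variables are solutions of a linear system of SDEs. This follows from results in approximation theory. Consider now the case where the memory kernel is a bounded analytic function. Its Laplace transform
$$
\hat{R}(s) = \int_0^{+\infty} e^{-st} R(t) \, dt
$$
can be represented as a continued fraction:
$$
\hat{R}(s) = \cfrac{\Delta_1^2}{s + \gamma_1 + \cfrac{\Delta_2^2}{\ddots}}, \quad \gamma_i > 0. \tag{10.29}
$$
Since $R(t)$ is bounded, we have that
$$
\lim_{s \to \infty} \hat{R}(s) = 0.
$$
Consider an approximation $R_N(t)$ such that the continued fraction representation terminates after $N$ steps. $R_N(t)$ is bounded, which implies that
$$
\lim_{s \to \infty} \hat{R}_N(s) = 0.
$$
The Laplace transform of $R_N(t)$ is a rational function:
$$
\hat{R}_N(s) = \frac{\sum_{j=1}^{N} a_j s^{N-j}}{s^N + \sum_{j=1}^{N} b_j s^{N-j}}, \quad a_j, b_j \in \mathbb{R}. \tag{10.30}
$$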

This is the Laplace transform of the autocorrelation function of an appropriate linear system of SDEs. Indeed, set
$$
\frac{dx_j}{dt} = -b_j x_j + x_{j+1} + a_j \frac{dW_j}{dt}, \quad j = 1, \dots, N, \tag{10.31}
$$

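with $x_{N+1}(t) = 0$. The process $x_1(t)$ is a stationary Gaussian process with autocorrelation function $R_N(t)$. For $N = 1$ and $b_1 = \alpha$, $a_1 = \sqrt{2 \beta^{-1} \alpha}$ we derive the GLE (10.21) with $F(t)$ being the OU process (10.20). Consider now the case $N = 2$ with $b_i = \alpha_i$, $i = 1, 2$ and $a_1 = 0$, $a_2 = \sqrt{2 \beta^{-1} \alpha_2}$. The GLE becomes
$$
\ddot{Q} = -V'(Q) - \lambda^2 \int_0^t R(t-s) \dot{Q}(s) \, ds + \lambda F_1(t),
$$
with
$$
\begin{aligned}
\dot{F}_1 &= -\alpha_1 F_1 + F_2, \\
\dot{F}_2 &= -\alpha_2 F_2 + \sqrt{2 \beta^{-1} \alpha_2} \, \dot{W}_2.
\end{aligned}
$$
We can write (10.33) as a Markovian system for the variables $\{Q, P, Z_1, Z_2\}$:
$$
\begin{aligned}
\dot{Q} &= P, \\
\dot{P} &= -V'(Q) + \lambda Z_1(t), \\
\dot{Z}_1 &= -\alpha_1 Z_1 + Z_2, \\
\dot{Z}_2 &= -\alpha_2 Z_2 - \lambda P + \sqrt{2 \beta^{-1} \alpha_2} \, \dot{W}_2.
\end{aligned}
$$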

Notice that this diffusion process is more degenerate than (10.21): noise acts on fewer degrees of freedom. It is still, however, hypoelliptic (Hörmander's condition is satisfied): there is sufficient interaction between the degrees of freedom $\{Q, P, Z_1, Z_2\}$ so that noise (and hence regularity) is transferred from the degrees of freedom that are directly forced by noise to the ones that are not. The corresponding Markov semigroup has nice regularizing properties. There exists a smooth density. Stochastic processes that can be written as a Markovian process by adding a finite number of additional variables are called quasi-Markovian. Under appropriate assumptions on the potential $V(Q)$ the solution of the GLE is an ergodic process. It is possible to study the ergodic properties of a quasi-Markovian process by analyzing the spectral properties of the generator of the corresponding Markov process. This leads to the analysis of the spectral properties of hypoelliptic operators.


When studying the Kac-Zwanzig model we considered a one dimensional Hamiltonian system coupled to a finite dimensional Hamiltonian system with random initial conditions (the harmonic heat bath) and then passed to the thermodynamic limit $N \to \infty$. We can consider a small Hamiltonian system coupled to its environment, which we model as an infinite dimensional Hamiltonian system with random initial conditions. We have a coupled particle-field model. The distinguished particle (Brownian particle) is described through the Hamiltonian

$$
H_{DP} = \frac{1}{2} p^2 + V(q). \tag{10.34}
$$

We will model the environment through a classical linear field theory (i.e. the wave

equation) with infinite energy:

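$$
\partial_t^2 \phi(t, x) = \partial_x^2 \phi(t, x). \tag{10.35}
$$
The Hamiltonian of this system is
$$
H_{HB}(\phi, \pi) = \frac{1}{2} \int \left( |\partial_x \phi|^2 + |\pi(x)|^2 \right) dx. \tag{10.36}
$$
$\pi(x)$ denotes the conjugate momentum field. The initial conditions are distributed according to the Gibbs measure (which in this case is a Gaussian measure) at inverse temperature $\beta$, which we formally write as
$$
\mu_\beta = Z^{-1} e^{-\beta H_{HB}(\phi, \pi)} \, d\phi \, d\pi. \tag{10.37}
$$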

Under this assumption on the initial conditions, typical configurations of the

heat bath have infinite energy. In this way, the environment can pump enough

energy into the system so that non-trivial fluctuations emerge. We will assume

linear coupling between the particle and the field:

$$
H_I(q, \phi) = q \int \partial_q \phi(x) \rho(x) \, dx, \tag{10.38}
$$
where the function $\rho(x)$ models the coupling between the particle and the field. This coupling is influenced by the dipole coupling approximation from classical electrodynamics. The Hamiltonian of the particle-field model is
$$
H(q, p, \phi, \pi) = H_{DP}(p, q) + H_{HB}(\phi, \pi) + H_I(q, \phi). \tag{10.39}
$$

The corresponding Hamilton's equations of motion form a coupled system of equations for the particle and the field. Now we can proceed as in the case of the finite dimensional heat bath. We can integrate the equations of motion for the heat bath variables and plug the solution into the equations for the Brownian particle to obtain the GLE. The final result is
$$
\ddot{q} = -V'(q) - \int_0^t R(t-s) \dot{q}(s) \, ds + F(t), \tag{10.40}
$$

with appropriate definitions for the memory kernel and the noise, which are related

through the fluctuation-dissipation theorem.

10.6 Projection Operator Techniques

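Consider now the $N+1$-dimensional Hamiltonian (particle + heat bath) with random initial conditions. The $(N+1)$-particle probability distribution function $f_{N+1}$ satisfies the Liouville equation
$$
\frac{\partial f_{N+1}}{\partial t} + \{f_{N+1}, H\} = 0, \tag{10.41}
$$
where $\{\cdot, \cdot\}$ is the Poisson bracket
$$
\{A, B\} = \sum_{j=0}^{N} \left( \frac{\partial A}{\partial q_j} \frac{\partial B}{\partial p_j} - \frac{\partial B}{\partial q_j} \frac{\partial A}{\partial p_j} \right).
$$
We introduce the Liouville operator
$$
L_{N+1} \, \cdot = -i \{\cdot, H\}.
$$
The Liouville equation can be written as
$$
i \frac{\partial f_{N+1}}{\partial t} = L_{N+1} f_{N+1}. \tag{10.42}
$$
We want to obtain a closed equation for the distribution function of the Brownian particle. We introduce a projection operator $P$ which projects onto the distribution function $f$ of the Brownian particle:
$$
P f_{N+1} = f, \quad (I - P) f_{N+1} = h.
$$
The Liouville equation becomes
$$
i \frac{\partial f}{\partial t} = P L (f + h), \tag{10.43a}
$$
$$
i \frac{\partial h}{\partial t} = (I - P) L (f + h). \tag{10.43b}
$$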

We integrate the second equation and substitute into the first equation. We obtain

$$
i \frac{\partial f}{\partial t} = P L f - i \int_0^t P L e^{-i (I - P) L s} (I - P) L f(t-s) \, ds + P L e^{-i (I - P) L t} h(0). \tag{10.44}
$$
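The structure of (10.44) can be checked in a finite-dimensional analogue. In the sketch below a random Hermitian matrix stands in for the Liouville operator and a coordinate projection stands in for $P$. Both are assumptions made purely for illustration, as are the dimensions, the quadrature resolution and the helper names. The code compares $i \, \partial f / \partial t = P L f_{N+1}$ with the right-hand side of (10.44), where the memory integral is evaluated by the trapezoidal rule.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2   # total dimension and dimension of the resolved subspace (illustrative)

# Hypothetical Hermitian matrix standing in for the Liouville operator L.
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
L = (X + X.conj().T) / 4.0

P = np.diag([1.0] * k + [0.0] * (n - k))   # projection onto the first k components
Qc = np.eye(n) - P                          # complementary projection I - P

def expm(M):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    norm = np.linalg.norm(M, 1)
    squarings = 0 if norm < 0.5 else int(np.ceil(np.log2(norm))) + 1
    A = M / 2**squarings
    E = np.eye(n, dtype=complex)
    term = np.eye(n, dtype=complex)
    for j in range(1, 20):
        term = term @ A / j
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

g0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)   # initial full state
h0 = Qc @ g0

def g(t):
    """Solution of i dg/dt = L g, the analogue of (10.42)."""
    return expm(-1j * L * t) @ g0

def rhs_10_44(t, m=2001):
    """Right-hand side of (10.44); the memory integral uses the trapezoidal rule."""
    s = np.linspace(0.0, t, m)
    integrand = np.array(
        [P @ L @ expm(-1j * Qc @ L * si) @ Qc @ L @ P @ g(t - si) for si in s]
    )
    ds = s[1] - s[0]
    memory = ds * (integrand.sum(axis=0) - 0.5 * (integrand[0] + integrand[-1]))
    return P @ L @ P @ g(t) - 1j * memory + P @ L @ expm(-1j * Qc @ L * t) @ h0

t = 0.8
lhs = P @ L @ g(t)            # i df/dt = P L (f + h)
residual = np.max(np.abs(lhs - rhs_10_44(t)))
print(f"max residual of (10.44): {residual:.2e}")
```

The residual is limited only by the quadrature error, which reflects the fact that (10.44) is an exact reformulation of the Liouville equation and involves no approximation.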

In the Markovian limit (large mass ratio) we obtain the Fokker-Planck equation (??).


The original papers by Kac et al and by Zwanzig are [17, 74]. See also [16].

The variant of the Kac-Zwanzig model that we have discussed in this chapter was

studied in [27]. An excellent discussion on the derivation of the Fokker-Planck

equation using projection operator techniques can be found in [52].

Applications of linear response theory to climate modeling can be found in.

10.8 Exercises

1. Prove (10.5). Use this formula to obtain (10.6).


Bibliography

[1] L. Arnold. Stochastic differential equations: theory and applications. Wiley-Interscience [John Wiley & Sons], New York, 1974. Translated from the German.

[2] R. Balescu. Statistical dynamics. Matter out of equilibrium. Imperial College Press, London, 1997.

[3] N. Berglund and B. Gentz. Noise-induced phenomena in slow-fast dynamical systems. Probability and its Applications (New York). Springer-Verlag London Ltd., London, 2006. A sample-paths approach.

[4] L. Breiman. Probability, volume 7 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1992. Corrected reprint of the 1968 original.

[5] S. Cerrai and M. Freidlin. On the Smoluchowski-Kramers approximation for a system with an infinite number of degrees of freedom. Probab. Theory Related Fields, 135(3):363-394, 2006.

[6] S. Cerrai and M. Freidlin. Smoluchowski-Kramers approximation for a general class of SPDEs. J. Evol. Equ., 6(4):657-689, 2006.

[7] S. Chandrasekhar. Stochastic problems in physics and astronomy. Rev. Mod. Phys., 15(1):1-89, Jan 1943.

[8] A.J. Chorin and O.H. Hald. Stochastic tools in mathematics and science, volume 1 of Surveys and Tutorials in the Applied Mathematical Sciences. Springer, New York, 2006.

[9] W. Dietrich, I. Peschel, and W.R. Schneider. Diffusion in periodic potentials. Z. Phys, 27:177-187, 1977.

[10] N. Wax (editor). Selected Papers on Noise and Stochastic Processes. Dover, New York, 1954.

[11] A. Einstein. Investigations on the theory of the Brownian movement. Dover Publications Inc., New York, 1956. Edited with notes by R. Fürth, Translated by A. D. Cowper.

[12] S.N. Ethier and T.G. Kurtz. Markov processes. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley & Sons Inc., New York, 1986.

[13] L.C. Evans. Partial Differential Equations. AMS, Providence, Rhode Island, 1998.

[14] W. Feller. An introduction to probability theory and its applications. Vol. I. Third edition. John Wiley & Sons Inc., New York, 1968.

[15] W. Feller. An introduction to probability theory and its applications. Vol. II. Second edition. John Wiley & Sons Inc., New York, 1971.

[16] G. W. Ford and M. Kac. On the quantum Langevin equation. J. Statist. Phys., 46(5-6):803-810, 1987.

[17] G. W. Ford, M. Kac, and P. Mazur. Statistical mechanics of assemblies of coupled oscillators. J. Mathematical Phys., 6:504-515, 1965.

[18] M. Freidlin and M. Weber. A remark on random perturbations of the nonlinear pendulum. Ann. Appl. Probab., 9(3):611-628, 1999.

[19] M. I. Freidlin and A. D. Wentzell. Random perturbations of Hamiltonian systems. Mem. Amer. Math. Soc., 109(523):viii+82, 1994.

[20] M.I. Freidlin and A.D. Wentzell. Random perturbations of dynamical systems. Springer-Verlag, New York, 1984.

[21] A. Friedman. Partial differential equations of parabolic type. Prentice-Hall Inc., Englewood Cliffs, N.J., 1964.

[22] A. Friedman. Stochastic differential equations and applications. Vol. 1. Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1975. Probability and Mathematical Statistics, Vol. 28.

[23] A. Friedman. Stochastic differential equations and applications. Vol. 2. Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1976. Probability and Mathematical Statistics, Vol. 28.

[24] H. Gang, A. Daffertshofer, and H. Haken. Diffusion in periodically forced Brownian particles moving in space-periodic potentials. Phys. Rev. Let., 76(26):4874-4877, 1996.

[25] C. W. Gardiner. Handbook of stochastic methods. Springer-Verlag, Berlin, second edition, 1985. For physics, chemistry and the natural sciences.

[26] I. I. Gikhman and A. V. Skorokhod. Introduction to the theory of random processes. Dover Publications Inc., Mineola, NY, 1996.

[27] D. Givon, R. Kupferman, and A.M. Stuart. Extracting macroscopic dynamics: model problems and algorithms. Nonlinearity, 17(6):R55-R127, 2004.

[28] M. Hairer and G. A. Pavliotis. From ballistic to diffusive behavior in periodic potentials. J. Stat. Phys., 131(1):175-202, 2008.

[29] M. Hairer and G.A. Pavliotis. Periodic homogenization for hypoelliptic diffusions. J. Statist. Phys., 117(1-2):261-279, 2004.

[30] P. Hänggi. Escape from a metastable state. J. Stat. Phys., 42(1/2):105-140, 1986.

[31] P. Hänggi, P. Talkner, and M. Borkovec. Reaction-rate theory: fifty years after Kramers. Rev. Modern Phys., 62(2):251-341, 1990.

[32] W. Horsthemke and R. Lefever. Noise-induced transitions, volume 15 of Springer Series in Synergetics. Springer-Verlag, Berlin, 1984. Theory and applications in physics, chemistry, and biology.

[33] J. Jacod and A.N. Shiryaev. Limit theorems for stochastic processes, volume 288 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 2003.

[34] F. John. Partial differential equations, volume 1 of Applied Mathematical Sciences. Springer-Verlag, New York, fourth edition, 1991.

[35] S. Karlin and H. M. Taylor. A second course in stochastic processes. Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1981.

[36] S. Karlin and H.M. Taylor. A first course in stochastic processes. Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, 1975.

[37] L. B. Koralov and Y. G. Sinai. Theory of probability and random processes. Universitext. Springer, Berlin, second edition, 2007.

[38] H. A. Kramers. Brownian motion in a field of force and the diffusion model of chemical reactions. Physica, 7:284-304, 1940.

[39] N. V. Krylov. Introduction to the theory of diffusion processes, volume 142 of Translations of Mathematical Monographs. American Mathematical Society, Providence, RI, 1995.

[40] R. Kupferman, G. A. Pavliotis, and A. M. Stuart. Ito versus Stratonovich white-noise limits for systems with inertia and colored multiplicative noise. Phys. Rev. E (3), 70(3):036120, 9, 2004.

[41] A.M. Lacasta, J.M. Sancho, A.H. Romero, I.M. Sokolov, and K. Lindenberg. From subdiffusion to superdiffusion of particles on solid surfaces. Phys. Rev. E, 70:051104, 2004.

[42] P. D. Lax. Linear algebra and its applications. Pure and Applied Mathematics (Hoboken). Wiley-Interscience [John Wiley & Sons], Hoboken, NJ, second edition, 2007.

[43] S. Lifson and J.L. Jackson. On the self-diffusion of ions in polyelectrolytic solution. J. Chem. Phys, 36:2410, 1962.

[44] M. Loève. Probability theory. I. Springer-Verlag, New York, fourth edition, 1977. Graduate Texts in Mathematics, Vol. 45.

[45] M. Loève. Probability theory. II. Springer-Verlag, New York, fourth edition, 1978. Graduate Texts in Mathematics, Vol. 46.

[46] M. C. Mackey. Time's arrow. Dover Publications Inc., Mineola, NY, 2003. The origins of thermodynamic behavior, Reprint of the 1992 original [Springer, New York; MR1140408].

[47] M.C. Mackey, A. Longtin, and A. Lasota. Noise-induced global asymptotic stability. J. Statist. Phys., 60(5-6):735-751, 1990.

Grundlehren der mathematischen Wissenschaften, Band 151. Academia Publishing House of the Czechoslovak Academy of Sciences, Prague, 1968.

[49] P. A. Markowich and C. Villani. On the trend to equilibrium for the Fokker-Planck equation: an interplay between physics and functional analysis. Mat. Contemp., 19:1-29, 2000.

[50] B. J. Matkowsky, Z. Schuss, and E. Ben-Jacob. A singular perturbation approach to Kramers' diffusion problem. SIAM J. Appl. Math., 42(4):835-849, 1982.

[51] B. J. Matkowsky, Z. Schuss, and C. Tier. Uniform expansion of the transition rate in Kramers' problem. J. Statist. Phys., 35(3-4):443-456, 1984.

[52] R.M. Mazo. Brownian motion, volume 112 of International Series of Monographs on Physics. Oxford University Press, New York, 2002.

[53] J. Meyer and J. Schröter. Comments on the Grad procedure for the Fokker-Planck equation. J. Statist. Phys., 32(1):53-69, 1983.

[54] E. Nelson. Dynamical theories of Brownian motion. Princeton University Press, Princeton, N.J., 1967.

[55] G.C. Papanicolaou and S. R. S. Varadhan. Ornstein-Uhlenbeck process in a random potential. Comm. Pure Appl. Math., 38(6):819-834, 1985.

[56] G. A. Pavliotis and A. M. Stuart. Analysis of white noise limits for stochastic systems with two fast relaxation times. Multiscale Model. Simul., 4(1):1-35 (electronic), 2005.

[57] G. A. Pavliotis and A. M. Stuart. Parameter estimation for multiscale diffusions. J. Stat. Phys., 127(4):741-781, 2007.

[58] G. A. Pavliotis and A. Vogiannou. Diffusive transport in periodic potentials: Underdamped dynamics. Fluct. Noise Lett., 8(2):L155-173, 2008.

[59] G.A. Pavliotis and A.M. Stuart. Multiscale methods, volume 53 of Texts in Applied Mathematics. Springer, New York, 2008. Averaging and homogenization.

[60] R. L. Stratonovich. Topics in the theory of random noise. Vol. II. Revised English edition. Translated from the Russian by Richard A. Silverman. Gordon and Breach Science Publishers, New York, 1967.

[61] P. Reimann, C. Van den Broeck, H. Linke, P. Hänggi, J.M. Rubi, and A. Perez-Madrid. Diffusion in tilted periodic potentials: enhancement, universality and scaling. Phys. Rev. E, 65(3):031104, 2002.

[62] P. Reimann, C. Van den Broeck, H. Linke, J.M. Rubi, and A. Perez-Madrid. Giant acceleration of free diffusion by use of tilted periodic potentials. Phys. Rev. Let., 87(1):010602, 2001.

[63] Frigyes Riesz and Bela Sz.-Nagy. Functional analysis. Dover Publications Inc., New York, 1990. Translated from the second French edition by Leo F. Boron, Reprint of the 1955 original.

[64] H. Risken. The Fokker-Planck equation, volume 18 of Springer Series in Synergetics. Springer-Verlag, Berlin, 1989.

[65] H. Rodenhausen. Einstein's relation between diffusion constant and mobility for a diffusion model. J. Statist. Phys., 55(5-6):1065-1088, 1989.

[66] M. Schreier, P. Reimann, P. Hänggi, and E. Pollak. Giant enhancement of diffusion and particle selection in rocked periodic potentials. Europhys. Let., 44(4):416-422, 1998.

[67] Z. Schuss. Singular perturbation methods in stochastic differential equations of mathematical physics. SIAM Review, 22(2):119-155, 1980.

[68] Ch. Schütte and W. Huisinga. Biomolecular conformations can be identified as metastable sets of molecular dynamics. In Handbook of Numerical Analysis (Computational Chemistry), Vol X, 2003.

[69] C. Schwab and R.A. Todor. Karhunen-Loève approximation of random fields by generalized fast multipole methods. J. Comput. Phys., 217(1):100-122, 2006.

[70] R.B. Sowers. A boundary layer theory for diffusively perturbed transport around a heteroclinic cycle. Comm. Pure Appl. Math., 58(1):30-84, 2005.

Press, Cambridge, 1993.

[72] G. I. Taylor. Diffusion by continuous movements. London Math. Soc., 20:196, 1921.

[73] G. E. Uhlenbeck and L. S. Ornstein. On the theory of the Brownian motion. Phys. Rev., 36(5):823-841, Sep 1930.

[74] R. Zwanzig. Nonlinear generalized Langevin equations. J. Stat. Phys., 9(3):215-220, 1973.

New York, 2001.

- Wps - Rev 4 First Page PrintUploaded byAlamirfan Mohammad
- F648.1889433-1Uploaded byKandido Aca
- Stewardson; The Effect of Elastic ModulusUploaded bybendung69
- SolutionsManual-Statistical and Adaptive Signal ProcessingUploaded byRodrigo Lobos Morales