

**Applied Stochastic Processes**

in science and engineering

by

M. Scott

© 2011

Objectives

This book is designed as an introduction to the ideas and methods used to formulate mathematical models of physical processes in terms of random functions. The first five chapters use the historical development of the study of Brownian motion as their guiding narrative. The remaining chapters are devoted to methods of solution for stochastic models. The material is too much for a single course – chapters 1-4 along with chapters 7 and 8 are ample for a senior undergraduate course offered to students with a suitably mathematical background (i.e. familiarity with most of the methods reviewed in Appendix B). For a graduate course, on the other hand, a quick review of the first three chapters, with a focus on the later chapters, should provide a good general introduction to numerical and analytic approximation methods in the solution of stochastic models.

The content is primarily designed to develop mathematical methods useful in the study of stochastic processes. Nevertheless, an effort has been made to tie the derivations, whenever possible, to the underlying physical assumptions that gave rise to the mathematics. As a consequence, very little is said about Itô's formula and the associated methods of what has come to be called Stochastic Calculus. If that comes as a disappointment to the reader, I suggest they consider C. W. Gardiner's book:

• Handbook of Stochastic Methods (3rd Ed.), C. W. Gardiner (Springer, 2004),

as a friendly introduction to Itô's calculus.

A list of references useful for further study appears at the beginning of some sections, and at the end of each chapter. These references are usually pedagogical texts or review articles; they are not meant to be an exhaustive tabulation of current results, but rather a first step along the road of independent research in stochastic processes. A collection of exercises appears at the end of each chapter. Some of these are meant to focus the reader's attention on a particular point of the analysis; others are meant to develop essential technical skills. None of the exercises are very difficult, and all of them should be attempted (or at least read). My hope is that the reader will find that the study of stochastic processes is not as difficult as it is sometimes made out to be.


Acknowledgements

This book is based, in part, upon the stochastic processes course taught by Pino Tenti at the University of Waterloo (with additional text and exercises provided by Zoran Miskovic), drawn extensively from the text by N. G. van Kampen, "Stochastic Processes in Physics and Chemistry." The content of Chapter 8 (particularly the material on parametric resonance) was developed in collaboration with Francis Poulin, and the effective stability approximation of Chapter 9 was developed in collaboration with Terry Hwa and Brian Ingalls. The material on stochastic delay equations was motivated by discussions with Lev Tsimring. Discussions with Stefan Klumpp led to clarification of a great deal of the material. I am grateful to Chris Ferrie, Killian Miller, David Stechlinski, Mihai Nica and all of my students in stochastic processes for correcting errors in the notes, and for trouble-shooting the exercises.


Contents

1 Introduction
  1.1 Stochastic Processes in Science and Engineering
  1.2 Brownian motion
    1.2.1 Einstein
    1.2.2 Smoluchowski
    1.2.3 Langevin
    1.2.4 Ornstein and Uhlenbeck
  1.3 Outline of the Book
  1.4 Suggested References

2 Random processes
  2.1 Random processes – Basic concepts
    2.1.1 Moments of a Stochastic Process
  2.2 Stationarity and the Ergodic Theorem
  2.3 Spectral Density and Correlation Functions
    2.3.1 Linear Systems
    2.3.2 Fluctuation-Dissipation Relation
    2.3.3 Johnson and Nyquist

3 Markov processes
  3.1 Classification of Stochastic Processes
  3.2 Chapman-Kolmogorov Equation
  3.3 Master Equation
  3.4 Stosszhalansatz
  3.5 Example – One-step processes
  3.6 Example – Bernoulli's Urns and Recurrence
  3.7 Example – Chemical Kinetics

4 Solution of the Master Equation
  4.1 Exact Solutions
    4.1.1 Moment Generating Functions
    4.1.2 Matrix Iteration
  4.2 Approximation Methods
    4.2.1 Numerical Methods – Gillespie's Algorithm

5 Perturbation Expansion of the Master Equation
  5.1 Linear Noise Approximation (LNA)
    5.1.1 Example – Nonlinear synthesis
    5.1.2 Approximation of the Brusselator model
  5.2 Example – 'Burstiness' in Protein Synthesis
  5.3 Limitations of the LNA

6 Fokker-Planck Equation
  6.1 Kolmogorov Equation
  6.2 Derivation of the Fokker-Planck Equation
  6.3 Macroscopic Equation
    6.3.1 Coefficients of the Fokker-Planck Equation
  6.4 Pure Diffusion and Ornstein-Uhlenbeck Processes
    6.4.1 Wiener process
    6.4.2 Ornstein-Uhlenbeck process
    6.4.3 Heuristic Picture of the Fokker-Planck Equation
  6.5 Connection to Langevin Equation
    6.5.1 Linear Damping
    6.5.2 Nonlinear Damping
  6.6 Limitations of the Kramers-Moyal Expansion
  6.7 Example – Kramer's Escape Problem

7 Stochastic Analysis
  7.1 Limits
  7.2 Mean-square Continuity
  7.3 Stochastic Differentiation
    7.3.1 Wiener process is not differentiable
  7.4 Stochastic Integration
    7.4.1 Itô and Stratonovich Integration

8 Random Differential Equations
  8.1 Numerical Simulation
  8.2 Approximation Methods
    8.2.1 Static Averaging
    8.2.2 Bourret's Approximation
    8.2.3 Cumulant Expansion
  8.3 Example – Random Harmonic Oscillator
  8.4 Example – Parametric Resonance
  8.5 Optional – Stochastic Delay Differential Equations

9 Macroscopic Effects of Noise
  9.1 Keizer's Paradox
    9.1.1 Example – Malthus Predator-Prey Dynamics
  9.2 Oscillations from Underdamped Dynamics
    9.2.1 Example – Excitable Oscillator
  9.3 Effective Stability Analysis (ESA)

10 Spatially varying systems
  10.1 Deterministic pattern formation
    10.1.1 Turing-type instabilities
    10.1.2 Differential flow instabilities
    10.1.3 Other mechanisms
  10.2 Internal noise: Reaction-transport master equation
    10.2.1 Diffusion as stochastic transport
    10.2.2 Combining reaction and transport
    10.2.3 LNA for spatial systems
  10.3 External noise: Stochastic PDEs
  10.4 Example – Homogeneous steady-state
  10.5 Example – Spatially-varying steady-state

11 Special Topics
  11.1 Random Walks and Electric Networks
  11.2 Fluctuations Along a Limit Cycle
  11.3 Stochastic Resonance
  11.4 Kalman Filter
  11.5 Novikov-Furutsu-Donsker Relation
    11.5.1 Variational Approximation

A Review of Classical Probability
  A.1 Statistics of a Random Variable
  A.2 Examples of Distributions and Densities
  A.3 Central limit theorem
  A.4 Generating Random Numbers
    A.4.1 Joint inversion generating method
  A.5 Change of Variables
  A.6 Monte-Carlo simulation
  A.7 Bertrand's Paradox

B Review of Mathematical Methods
  B.1 Matrix Algebra and Time-Ordered Exponentials
  B.2 Linear Stability Analysis
    B.2.1 Routh-Hurwitz Criterion
  B.3 Transforms Methods
    B.3.1 Laplace Transform
    B.3.2 Fourier Transform
    B.3.3 z-Transform
  B.4 Partial differential equations
    B.4.1 Method of characteristics
    B.4.2 Separation of variables
    B.4.3 Transform methods
  B.5 Some Results from Asymptotics and Functional Analysis
    B.5.1 Cauchy-Schwarz Inequality
    B.5.2 Watson's Lemma
    B.5.3 Stirling's Approximation

C Itô Calculus
  C.1 Itô's stochastic integral
    C.1.1 Example – ∫₀ᵗ W(t′) dW(t′)
  C.2 Change of Variables Formula
    C.2.1 Example – Calculate cos[η(t)]

D Sample Matlab code
  D.1 Examples of Gillespie's direct method
    D.1.1 Defining the propensity vector and stoichiometry matrix
    D.1.2 Core stochastic simulation algorithm
    D.1.3 Example – the Brusselator
  D.2 Stochastic differential equations
    D.2.1 White noise
    D.2.2 Coloured noise

Chapter 1 Introduction

1.1 Stochastic Processes in Science and Engineering

Physics is the study of collective phenomena arising from the interaction of many individual entities. Even a cannonball dropped from a high tower will collide with some 10³⁰ gas molecules on its way down. Part of the miracle of physics is that, as a rule, only a few variables are required to capture the behaviour of the system, and this behaviour can, in turn, be described by very simple physical laws. In the case of the falling cannonball, for example, only its position and velocity are important. In hydrodynamics, despite the incredible numbers of individual fluid molecules, the flow velocity, density and temperature are sufficient to describe the system under most circumstances.

Yet the interactions that are eliminated from large-scale¹ models make themselves felt in other ways: a ball of crumpled paper experiences a strong drag force in flight due to air-resistance, a constitutive equation must be provided to fix the visco-elastic properties of a fluid, etc. Drag forces, viscosity, electrical resistance – these are all vestiges of the microscopic dynamics left behind in the macroscopic models when the enormous degrees of freedom in the original many-body problem were integrated away to leave a simple deterministic description. These shadows of microscopic motion are often called fluctuations or noise, and their description and characterization will be the focus of this course.

¹ Also called macroscopic or phenomenological models.

Deterministic models (typically written in terms of systems of ordinary differential equations) have been very successfully applied to an endless variety of physical phenomena (e.g. Newton's laws of motion); however, there are situations where deterministic laws alone are inadequate, particularly as the number of interacting species becomes small and the individual motion of microscopic bodies has observable consequences. The most famous example of observable fluctuations in a physical system is Brownian motion: the random, incessant meandering of a pollen grain suspended in a fluid, successfully described by Einstein in 1905. (We shall consider Brownian motion in greater detail throughout these notes.) The next major developments in the study of noise came in the 1950s and 1960s through the analysis of electrical circuits and radio wave propagation. Today, there is again renewed interest in the study of stochastic processes, particularly in the context of microfluidics, nanoscale devices and the study of the genetic circuits that underlie cellular behaviour.

**Why study Brownian motion?**

Brownian motion will play a central role in the development of the ideas presented in these notes. This is partly historical because much of the mathematics of stochastic processes was developed in the context of studying Brownian motion, and partly pedagogical because Brownian motion provides a straightforward and concrete illustration of the mathematical formalism that will be developed. Furthermore, Brownian motion is a simple enough physical system that the limitations of the various assumptions employed in the modeling of physical phenomena are made obvious.

1.2 Brownian motion

Using a microscope, Robert Brown (1773-1858) observed and documented the motion of large pollen grains suspended in water. No matter what Brown did, the pollen grains moved incessantly in erratic and irregular motion (Figure 1.1). Brown tried different materials and different solvents, and still the motion of these particles continued. This was a time when most scientists did not believe in atoms or molecules, so the underlying mechanism responsible remained a mystery for nearly a century. In the words of S. G. Brush, "three quarters of a century of experiments produced almost no useful results in the understanding of Brownian motion – simply because no theorist had told the experimentalists what to measure!"

Figure 1.1: Brownian motion. A) Sample path of Brownian motion. B) Schematic of the physical mechanism underlying Brownian motion. Notice the figure is not to scale – in reality, there would be about 10⁹ solvent molecules in the view shown, and each molecule would be about 1/100 mm in diameter.

Thanks mainly to the work of Gouy [J. de Physique 7, 561 (1888)], several suggestive facts were known by century's end:

1. The motion of the particles is very irregular, and the path appears to have no tangent at any point. (On an observable scale, the path appears non-differentiable, though of course it is continuous.)
2. The particles appear to move independently of one another, even when they approach closely.
3. The molecular composition and mass density of the particles has no effect.
4. As the solvent viscosity is decreased, the motion becomes more active.
5. As the particle radius is decreased, the motion becomes more active.
6. As the ambient temperature is increased, the motion becomes more active.
7. The motion never ceases.

Such was the state of affairs when Einstein published his seminal paper in 1905 [Ann. Phys. 17, 549 (1905)], compelling him to write in the opening paragraph of that article that "it is possible that the movements to be discussed are identical with the so-called 'Brownian molecular motion'; however, the information available to me regarding the latter is so lacking in precision, that I can form no judgment in the matter."



Figure 1.2: Albert Einstein and Marian Smoluchowski

1.2.1 Einstein

Einstein's argument is a classic – and it contains the seeds of all of the ideas developed in this course. To appreciate the revolutionary nature of Einstein's formulation of the problem, it is important to recall that at the time many physicists did not believe in atoms or molecules. The argument is built upon two key hypotheses. First, that the motion of the particle is caused by many frequent impacts of the incessantly moving solvent molecules. Second, that the effect of the complicated solvent motion can be described probabilistically in terms of very frequent, statistically independent collisions.

Einstein's approach follows two directions – the first is physical. Balancing the osmotic pressure with the body forces acting on the particle, Einstein derives a partial differential equation for the particle density f(x, t),

    ∂f/∂t = D ∂²f/∂x².

Here, the diffusion coefficient D is a measure of how quickly the density distribution will spread along the X-axis. For a spherical Brownian particle of radius R, using Stokes's theory for the drag force acting on the particle, Einstein arrives at the following expression for D,

    D = kB T / (6πνR).    (1.1)

Here ν is the shear viscosity of the suspending liquid, T is the absolute temperature, and kB is Boltzmann's constant. The units of kB are such that kB T has the units of energy. Knowledge of the value of kB is equivalent to knowledge of Avogadro's number, and hence of molecular size. Notice how the qualitative aspects of Brownian motion (items 4-6 from the table on p. 3) are reflected in the expression for the diffusion coefficient.

The second direction of Einstein's paper is mathematical, using simple assumptions to derive a partial differential equation governing the probability density of finding the Brownian particle at a given position x after a time t – a derivation that will be echoed throughout this course. The original article is a wonderful piece of scientific writing, not only because of its historical significance, but because of the clarity and force of the central ideas. For that reason, the portion of the paper describing the derivation of the partial differential equation is reproduced below.

Excerpt from Einstein's paper on Brownian motion (Author's translation)

• A. Einstein (1905) Ann. Physik 17:549.

It must clearly be assumed that each individual particle executes a motion which is independent of the motions of all other particles; it will also be considered that the movements of one and the same particle in different time intervals are independent processes, as long as these time intervals are not chosen too small. We introduce a time interval τ into consideration, which is very small compared to the observable time intervals, but nevertheless so large that in two successive time intervals τ, the motions executed by the particle can be thought of as events which are independent of each other.

Now let there be a total of n particles suspended in a liquid. In a time interval τ, the X-coordinates of the individual particles will increase by an amount ∆, where for each particle ∆ has a different (positive or negative) value. There will be a certain frequency law for ∆; the number dn of the particles which experience a shift which is between ∆ and ∆ + d∆ will be expressible by an equation of the form

    dn = n φ(∆) d∆,    (1.2)

where

    ∫₋∞⁺∞ φ(∆) d∆ = 1,    (1.3)

and φ is only different from zero for very small values of ∆, and satisfies the condition φ(∆) = φ(−∆). In the case considered by Einstein, the collisions are unbiased, so the sign of ∆ makes no difference whatsoever.

We now investigate how the diffusion coefficient depends on φ. We shall once more restrict ourselves to the case where the number ν of particles per unit volume depends only on x and t. Let ν = f(x, t) be the number of particles per unit volume; we compare the distribution of particles at the time t + τ with the distribution at time t. From the definition of the function φ(∆), it is easy to find the number of particles which at time t + τ are found between two planes perpendicular to the x-axis and passing through the points x and x + dx. One obtains²

    f(x, t + τ) dx = dx ∫₋∞⁺∞ f(x − ∆, t) φ(∆) d∆.    (1.4)

But since τ is very small, we can set

    f(x, t + τ) = f(x, t) + τ ∂f/∂t.    (1.5)

Furthermore, we develop f(x − ∆, t) in powers of ∆:

    f(x − ∆, t) = f(x, t) − ∆ ∂f/∂x + (∆²/2!) ∂²f/∂x² + ...    (1.6)

We can use this series under the integral, because only small values of ∆ contribute to this equation. We obtain

    f + (∂f/∂t) τ = f ∫₋∞⁺∞ φ(∆) d∆ − (∂f/∂x) ∫₋∞⁺∞ ∆ φ(∆) d∆ + (∂²f/∂x²) ∫₋∞⁺∞ (∆²/2) φ(∆) d∆ + ...    (1.7)

Because φ(∆) = φ(−∆), the second, fourth, etc. terms on the right-hand side vanish, while out of the 1st, 3rd, 5th, etc. terms, each one is very small compared with the previous. We obtain from this equation, by taking into consideration

    ∫₋∞⁺∞ φ(∆) d∆ = 1,    (1.9)

and setting

    (1/τ) ∫₋∞⁺∞ (∆²/2) φ(∆) d∆ = D,    (1.10)

and keeping only the 1st and 3rd terms of the right-hand side,

    ∂f/∂t = D ∂²f/∂x².    (1.11)

This is already known as the differential equation of diffusion, and it can be seen that D is the diffusion coefficient. The problem, which corresponds to the problem of diffusion from a single point (neglecting the interaction between the diffusing particles), is now completely determined mathematically; its solution is

    f(x, t) = (n / √(4πD)) · exp(−x²/4Dt) / √t.    (1.12)

We now calculate, with the help of this equation, the displacement λx in the direction of the X-axis that a particle experiences on the average or, more exactly, the square root of the arithmetic mean of the square of the displacement in the direction of the X-axis; it is

    λx = √⟨x²⟩ = √(2Dt).    (1.13)

(End of excerpt)

² Einstein actually wrote f(x + ∆, t), which is incorrect (Why?). The sign has been corrected in all subsequent equations.

The temperature T and the viscosity ν of the surrounding fluid can be measured, and with care a suspension of spherical particles of fairly uniform radius R may be prepared. The diffusion coefficient D is then available through the statistics of the Brownian particle (via Eq. 1.13).
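Einstein's two results – the coefficient of Eq. 1.1 and the mean-squared displacement of Eq. 1.13 – are easy to check numerically. The sketch below is in Python (the book's own samples, in Appendix D, are in Matlab), and the particle radius and temperature are illustrative values chosen here, not taken from the text:

```python
import numpy as np

# Stokes-Einstein coefficient, Eq. 1.1: D = kB*T / (6*pi*nu*R)
# (illustrative values: a 1-micron sphere in water at room temperature)
kB = 1.380649e-23      # Boltzmann's constant, J/K
T = 293.0              # absolute temperature, K
nu = 1.0e-3            # shear viscosity of water, Pa*s
R = 1.0e-6             # particle radius, m
D = kB * T / (6 * np.pi * nu * R)    # on the order of 1e-13 m^2/s

# Brownian ensemble: statistically independent Gaussian "pushes" of
# variance 2*D*tau per time-step, as in Einstein's hypotheses
rng = np.random.default_rng(0)
n_particles, n_steps, tau = 5000, 1000, 1e-3
pushes = rng.normal(0.0, np.sqrt(2 * D * tau), size=(n_particles, n_steps))
x = pushes.sum(axis=1)               # positions after t = n_steps*tau

t = n_steps * tau
msd = np.mean(x**2)                  # should approach 2*D*t (Eq. 1.13)
lam_x = np.sqrt(msd)                 # Einstein's lambda_x
```

For a micron-sized sphere the simulated mean-squared displacement tracks 2Dt to within statistical error, which is exactly the measurement Perrin's experiments exploited.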

In this way, Boltzmann's constant (or, equivalently, Avogadro's number) can be determined. This program was undertaken in a series of laborious experiments by Perrin and Chaudesaigues – resulting in a Nobel prize for Perrin in 1926. Eq. 1.13 is one of the earliest examples of using fluctuations to quantify a physical parameter (as opposed to simply considering the averaged behaviour); said another way, it is the irreproducibility in the experiment that is reproducible!

Einstein's work was of great importance in physics, for it showed in a visible and concrete way that atoms were real. Quoting from Einstein's autobiography:

    The agreement of these considerations with experience together with Planck's determination of the true molecular size from the law of radiation (for high temperature) convinced the skeptics, who were quite numerous at the time (Ostwald, Mach), of the reality of atoms. The antipathy of these scholars towards atomic theory can indubitably be traced back to their positivistic philosophical attitude. This is an interesting example of the fact that even scholars of audacious spirit and fine instinct can be obstructed in the interpretation of the facts by philosophical prejudices.

Quoting from Gardiner's Handbook: Einstein's argument does not give a dynamical theory of Brownian motion; it only determines the nature of the motion and the value of the diffusion coefficient on the basis of some assumptions. Smoluchowski, independently of Einstein, attempted a dynamical theory and arrived at Eq. 1.1 with an additional factor of 32/27 on the right-hand side (Section 1.2.2).

**Relation between Einstein's derivation and the rest of the course**

Einstein's work anticipated much of what was to come in the study of stochastic processes in physics. A few key steps will be highlighted, with reference to their appearance later in the course notes.

• The Chapman-Kolmogorov Equation: Eq. 1.5 states that the probability of the particle being at point x at time t + τ is given by the sum of the probabilities of all possible "pushes" ∆ from positions x − ∆, multiplied by the probability of being at x − ∆ at time t. This assumption is based on the independence of the push ∆ of any previous history of the motion: it is only necessary to know the initial position of the particle at time t – not at any previous time. This is the Markov postulate, and the Chapman-Kolmogorov equation, of which Eq. 1.5 is a special form, is the central dynamical equation of all Markov processes. The Chapman-Kolmogorov equation and Markov processes will be studied in detail in Chapter 3.

• The Fokker-Planck Equation: Eq. 1.11 is the diffusion equation, a special case of the Fokker-Planck equation, which describes a large class of very interesting stochastic processes in which the system has a continuous sample path. In this case, that means that the pollen grain's position, if thought of as obeying a probabilistic law given by solving the diffusion equation, can be written x(t), where x(t) is a continuous function of time – but a random function. This leads us to consider the possibility of describing the dynamics of the system in some direct probabilistic way, so that we would have a random or stochastic differential equation for the path. This procedure was initiated by Langevin with the famous equation that to this day bears his name [see the following section].

• The Kramers-Moyal and similar expansions are essentially the same as that used by Einstein to go from Eq. 1.5 to Eq. 1.11. The use of this type of approximation, which effectively replaces a process whose sample paths need not be continuous with one whose paths are continuous, has been a topic of discussion in the last decade. Its use and validity will be discussed in Chapters 4 and 6.

• The Fluctuation-Dissipation Relation: Eqs. 1.1 and 1.13 connect the dissipation in the system with the magnitude of the fluctuations in thermal equilibrium (see Section 2.3.2 and Section 5.2, p. 105).

1.2.2 Smoluchowski

• M. von Smoluchowski (1906) Ann. Physik 21:756.

Independently of Einstein, Smoluchowski developed a complementary model of Brownian motion; he also lay the foundation for a more extensive study of Brownian motion, including the possibility of various forces acting upon the particle. His original derivation has more in common with Rayleigh's study of Brownian motion (see Exercise 9), although the essence of his argument, and his legacy to the modeling of stochastic processes, can be appreciated by considering a spatially-discretized version of Eq. 1.11 (see Exercise 6).

Smoluchowski's model of Brownian motion assumes that a particle can move only one step to the right or to the left, and that these steps occur with equal probability (Figure 1.3). This type of process is called an unbiased random walk, an example of a more general class of one-step processes. In addition to a discretization of time (t, t + τ, t + 2τ, ...), imagine the motion of the particle is likewise constrained to discrete positions ..., x − ∆x, x, x + ∆x, .... To fully characterize the model, one assigns a starting position (generally x = 0 at t = 0 for convenience) and a transition probability between the lattice points: a particle at position x can move to the neighbouring positions x + ∆x and x − ∆x with equal probability during each time-step (t, t + τ). Mathematically, the conditional probability density P(x, t) for the unbiased random walk obeys a difference equation,

    P(x, t + τ) = ½ P(x − ∆x, t) + ½ P(x + ∆x, t),    P(x, 0) = δ_{x,0},    (1.14)

where δ_{ij} is the Kronecker delta. The two terms on the right-hand side of Eq. 1.14 represent the two ways in which a particle can move into position x during the interval (t, t + τ).

Figure 1.3: Schematic illustration of an unbiased random walk in one dimension – from x, a particle moves to x − ∆x or x + ∆x, each with probability ½.

Eq. 1.14 can be solved explicitly for the full time-dependent conditional distribution P(x, t) using transform methods (see Exercise 7). Furthermore, in a certain limit, the Smoluchowski model reduces to Einstein's diffusion equation, Eq. 1.11 (see Exercise 6).
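The difference equation (Eq. 1.14) can also be iterated directly on a computer. A minimal Python sketch (with ∆x = τ = 1 for convenience; the book's own code samples are in Matlab) starts from the delta-function initial condition and confirms that the variance of P(x, t) grows as 2Dt with D = ∆x²/2τ, as the diffusion limit suggests:

```python
import numpy as np

# Iterate the unbiased random walk, Eq. 1.14:
#   P(x, t + tau) = (1/2) P(x - dx, t) + (1/2) P(x + dx, t),  P(x, 0) = delta_{x,0}
# with dx = tau = 1 for convenience.
n_steps = 200
sites = np.arange(-(n_steps + 1), n_steps + 2)   # lattice wide enough that no
P = np.zeros(sites.size)                         # probability reaches the edge
P[sites.size // 2] = 1.0                         # delta function at x = 0

for _ in range(n_steps):
    # np.roll implements the shifts P(x -/+ dx, t); the wrap-around at the
    # boundary is harmless because P is zero there throughout
    P = 0.5 * np.roll(P, 1) + 0.5 * np.roll(P, -1)

mean = np.sum(sites * P)
var = np.sum(sites**2 * P) - mean**2
# With D = dx^2/(2*tau) = 1/2, the diffusion limit predicts
# var = 2*D*t = n_steps -- and the iteration reproduces it.
```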

The random walk, Eq. 1.14, is an example of a master equation with discrete states. In problems where the state space is inherently discrete, such as molecule numbers or animal populations, Smoluchowski's methodology is particularly convenient. In many applications where x represents physical space, however, discretization is unnatural. It is not difficult to imagine whole classes of more complicated processes constructed along similar lines by allowing, for example, biased transition rates, state-dependent transition rates, multiple steps, multidimensional lattices, etc. These more general models will be explored in more detail in Section 3.5 and Chapter 4.

**Relation between Smoluchowski's formulation and the rest of the course**

• The master equation: Smoluchowski's conceptual framework is ideal for many physical applications: the physics of the process are captured by the transition probabilities among lattice sites, then the master equation is used to predict how the state evolves in time. Many physical systems can be represented by Markov processes (Chapter 3), with an appropriately chosen coarse-graining of the observation time-scale. The main dynamical equation for a Markov process is the Chapman-Kolmogorov equation (see p. 52). Under the same coarse-graining of the observation time-scale that permits the physical process to be described as Markovian (see p. 59), the Chapman-Kolmogorov equation reduces to the more simple master equation. The master equation is rarely solvable exactly, so a large portion of the course will be devoted to simulating or approximating its solutions. The master equation and Markov processes will be studied in detail in Chapter 3.
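As a quick illustration of the one-step generalizations just mentioned, the following Python sketch simulates a walk with a biased transition rate (the bias p = 0.6 is an arbitrary illustrative choice, not from the text): the walker acquires a deterministic drift n(2p − 1) on top of the diffusive spread 4np(1 − p).

```python
import numpy as np

# Biased one-step random walk: step +1 with probability p, -1 otherwise.
# After n steps: <x> = n*(2p - 1), var(x) = 4*n*p*(1 - p).
rng = np.random.default_rng(1)
p, n_walkers, n_steps = 0.6, 20000, 500

steps = np.where(rng.random((n_walkers, n_steps)) < p, 1, -1)
x = steps.sum(axis=1)

mean, var = x.mean(), x.var()
drift_theory = n_steps * (2 * p - 1)        # 100
var_theory = 4 * n_steps * p * (1 - p)      # 480
```

Setting p = ½ recovers the unbiased Smoluchowski walk; a state-dependent p(x) would give a one-step process of the kind studied in Section 3.5.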
1.2.3 Langevin

• P. Langevin (1908) C. R. Acad. Sci. (Paris) 146:530.
• English translation and commentary: D. S. Lemons and A. Gythiel (1997) Am. J. Phys. 65:1079–1081.

A few years after Einstein and Smoluchowski's work, a new method for studying Brownian motion was developed by Paul Langevin. Although coming from a very different perspective, and following [in his words] an "infinitely simpler" derivation, Langevin arrives at the same result for the mean-squared displacement as Einstein. Beginning with the trajectory of a single particle, his work marks the first attempt to provide a dynamical theory of Brownian motion (to be contrasted with Einstein's kinematical theory). Langevin's argument is as follows.

Figure 1.4: Paul Langevin with Einstein.

A single Brownian particle is subject to Newton's laws of motion; that is, from a macroscopic point of view, Brownian motion can be studied using Newton's second law, $F = ma$,
$$ m\,\frac{d^2x}{dt^2} = F. \qquad (1.15) $$
The Brownian particle moves around because it is subject to a resultant force $F$ coming from the collision with the solvent molecules. With enough information about $F$, the solution of Eq. 1.15 would be the path $x(t)$. The precise, microscopic form of the force acting on the particle is unknown, however; from a macroscopic point of view, it is reasonable to consider it as the resultant of two different forces:

1. Viscous drag $f_d = -\gamma\,dx/dt$ ($\gamma = 6\pi\nu R$ for a sphere of radius $R$ in slow motion in a fluid with shear viscosity $\nu$; Stokes flow).

2. A fluctuating (random) force $f_r(t)$ due to the incessant molecular impacts, about which next to nothing is known.

Under these assumptions, the equation of motion for the Brownian particle is,
$$ m\,\frac{d^2x}{dt^2} = -\gamma\,\frac{dx}{dt} + f_r(t), \qquad (1.16) $$
or, after multiplication with $x$ and re-arrangement,
$$ \frac{m}{2}\,\frac{d^2}{dt^2}\!\left(x^2\right) - mv^2 = -\frac{\gamma}{2}\,\frac{d}{dt}\!\left(x^2\right) + x f_r(t). \qquad (1.17) $$

Langevin remarks that "since the quantity $f_r(t)$ has great irregularity," Eq. 1.17 is not a differential equation in the ordinary sense. The path $x(t)$ was not Langevin's objective, however. Instead, he proceeded to average the equation over a large number of similar (but different) particles in Brownian motion under the same conditions. Denote this average by $\langle\cdots\rangle$, and assume that it commutes with the derivative operator $\frac{d}{dt}$. Assuming Eq. 1.17 has a well-defined meaning in a statistical sense, the averaged equation reads,
$$ \frac{m}{2}\,\frac{d^2}{dt^2}\langle x^2\rangle - \langle mv^2\rangle = -\frac{\gamma}{2}\,\frac{d}{dt}\langle x^2\rangle + \langle x f_r(t)\rangle. \qquad (1.18) $$
The cross-correlation $\langle x f_r(t)\rangle$ Langevin dismissed by saying "it can be set $= 0$, due to the great irregularity of $f_r(t)$." From the equipartition of energy theorem derived in equilibrium statistical mechanics, Langevin was able to replace the average kinetic energy of the particle $\langle mv^2\rangle$ by $k_B T$, where $k_B$ is Boltzmann's constant and $T$ is the absolute temperature. (The equipartition of energy theorem comes from classical statistical mechanics, in the absence of quantum effects. It states that, at thermal equilibrium, energy is shared equally among all forms and all degrees of freedom. For an ideal gas, the average kinetic energy in thermal equilibrium of each molecule is $(3/2)k_B T$; in the $x$-component of the velocity, the energy is $(1/2)k_B T$ – this is the result used by Langevin.) The resulting differential equation for the mean-squared displacement $\langle x^2\rangle$ is then simply,
$$ m\,\frac{d^2}{dt^2}\langle x^2\rangle + \gamma\,\frac{d}{dt}\langle x^2\rangle = 2k_B T. \qquad (1.19) $$
Integrating once,
$$ \frac{d}{dt}\langle x^2\rangle = \frac{2k_B T}{\gamma} + Ce^{-\Omega t}, \qquad \Omega = \frac{\gamma}{m},\quad C = \text{constant}. \qquad (1.20) $$
Langevin proceeds by noting that experimentally $\Omega \approx 10^8$ Hz, so the exponential may be dropped after a time period of about $10^{-8}$ s has elapsed. Another integration yields the final result,
$$ \langle x^2\rangle - \langle x_0^2\rangle = 2\,\frac{k_B T}{\gamma}\,t = 2Dt, \qquad (1.21) $$
if we identify the diffusion coefficient with Einstein's result, $D = k_B T/\gamma = k_B T/6\pi\nu R$. It should be clear from Langevin's equation that stochastic processes afford a tremendous simplification of the governing model.
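Langevin's final result is easy to probe numerically. The following sketch is an illustration only, not part of Langevin's analysis: it assumes the random force is Gaussian white noise of strength $\sqrt{2\gamma k_B T}$ (a form justified in later chapters), integrates Eq. 1.16 with a simple Euler-Maruyama scheme, and compares the ensemble mean-squared displacement at a time long compared to $m/\gamma$ against $2Dt$. All parameter values are arbitrary, illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, gamma, kBT = 1.0, 1.0, 1.0            # illustrative units, so D = kBT/gamma = 1
D = kBT / gamma
dt, nsteps, npart = 5e-3, 20000, 4000    # final time t = 100 >> m/gamma = 1

v = np.zeros(npart)                      # ensemble of velocities
x = np.zeros(npart)                      # ensemble of positions
sigma = np.sqrt(2.0 * gamma * kBT) / m   # assumed white-noise strength
for _ in range(nsteps):
    v += -(gamma / m) * v * dt + sigma * np.sqrt(dt) * rng.standard_normal(npart)
    x += v * dt

t = nsteps * dt
msd = np.mean(x**2)
print(f"<x^2> = {msd:.1f}, Langevin/Einstein prediction 2*D*t = {2 * D * t:.1f}")
```

The two numbers should agree to within a few percent; the small systematic shortfall is the short-time ballistic transient analyzed by Ornstein and Uhlenbeck in the next section.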

Several critical remarks can be made regarding Langevin's derivation, and criticisms were not long in coming.

1. What, precisely, is the function $f_r(t)$? We know what a random variable is, and we know what a function is – but $f_r(t)$ is a random function. What is the mathematical meaning of that statement?

2. If $f_r(t)$ is a random function, then $x(t)$ should inherit some of that randomness, too. We assume $x(t)$ obeys a differential equation, but how do we know the derivatives $\frac{dx}{dt}$ and $\frac{d^2x}{dt^2}$ even exist?

3. What kind of averaging are we using in moving from Eq. 1.17 to Eq. 1.18? Are we picking some a priori probability distribution and computing the expectation? If so, how do we choose the probability distribution? Are we taking an arithmetical average?

4. Setting $Ce^{-\Omega t} \approx 0$ because $\Omega$ is large only holds if $t$ is large. What is going on over very short timescales?

Notice, in retrospect, the similarity between $\phi(\Delta)$ of Einstein and $f_r(t)$ of Langevin – both characterize the fluctuations: $\phi(\Delta)$ describes the effect of the force imparted by the collisions (a kinematic approach), while $f_r(t)$ treats the force itself as a random variable (a dynamic approach). Einstein derives a partial differential equation governing the entire probability distribution, whereas Langevin derives an ordinary differential equation for the first two moments of the distribution.

In fact, the single variable $x$ does not really obey a closed differential equation, but interacts with a host of fluid molecules. Quoting van Kampen:

Together with its surrounding fluid [the Brownian particle] constitutes a closed, isolated system, including $\sim 10^{23}$ auxiliary variables describing the movement of the solvent molecules. The "relevant" variable $x$ is the position of the particle. These variables are not visible in the equation for $x$, but their effect shows up in the random Langevin force. Fluctuations in $x$ are constantly being generated by the collisions. [This] is due to the fact that the single variable $x$ does not really obey a closed differential equation.

As a result, Langevin includes their effect as the single random forcing function $f_r(t)$, leading to a closed equation for the particle position $x(t)$, which constitutes a stochastic process. Soon after the 1908 paper by Langevin, several outstanding contributions to the theory of Brownian motion appeared. Of particular interest is

the 1930 paper by Uhlenbeck and Ornstein that sought to strengthen the foundations of Langevin's approach. Of particular note, too, is the work of Kolmogorov (see p. 20), Itô (p. 30) and Doob (p. 271).

Figure 1.5: Leonard Ornstein and George Uhlenbeck.

Relation between Langevin's approach and the rest of the course

• Random differential equations: Langevin's formulation of stochastic processes in terms of Newton's laws of motion still appeals to many physicists. For systems governed by linear dynamics in the absence of fluctuations, Langevin's approach is very intuitive and simple to formulate. For nonlinear systems, however, Langevin's approach cannot be used. The difficulties of Langevin's approach for nonlinear equations is discussed in N. G. van Kampen's article 'Fluctuations in Nonlinear Systems.' Differential equations with stochastic forcing or stochastic coefficients are the subject of Chapter 8.

• Stochastic analysis: The cavalier treatment of the analytic properties of the random force $f_r(t)$, and the resulting analytic properties of the stochastic process $x(t)$, generated extensive activity by mathematicians to solidify the foundations of stochastic calculus.

1.2.4 Ornstein and Uhlenbeck

• G. E. Uhlenbeck and L. S. Ornstein (1930) Phys. Rev. 36:823.

The Ornstein-Uhlenbeck paper begins by reminding the reader that the central result of both Einstein and Langevin is the mean-squared displacement of the Brownian particle. Einstein obtained the result ($\gamma$ is the friction coefficient)
$$ \langle x^2\rangle = 2Dt, \qquad D = \frac{k_B T}{\gamma}, \qquad (1.22) $$
kinematically, while many workers preferred Langevin's dynamical approach based upon Newton's law. They declare their intention of using Langevin's equation,
$$ m\,\frac{du}{dt} = -\gamma u + f_r(t), \qquad u \equiv \frac{dx}{dt}, \qquad (1.23) $$
along with some "natural" assumptions about the statistical properties of $f_r(t)$. Ornstein and Uhlenbeck go on to observe that the Brownian motion problem is still far from solved, and that the two following questions have yet to be answered:

• Einstein, Smoluchowski and Langevin all assumed a time scale such that
$$ t \gg \frac{m}{\gamma}. \qquad (1.24) $$
Under what conditions is this assumption justified?

• What changes must be made to the formulation if the Brownian particle is subject to an external field?

The Ornstein and Uhlenbeck paper is devoted to answering these outstanding questions. Their paper, too, is a classic – we paraphrase it below, although the reader is encouraged to seek out the original.

The problem is to determine the probability that a free particle in Brownian motion after the time $t$ has a velocity which lies between $u$ and $u + du$.

Writing
$$ \frac{du}{dt} + \beta u = A(t), \qquad \beta = \frac{\gamma}{m}, \quad A(t) = \frac{f_r(t)}{m}, \qquad (1.25) $$
whose solution is, when it started at $t = 0$ with velocity $u_0$,
$$ u = u_0 e^{-\beta t} + e^{-\beta t}\int_0^t e^{\beta\xi} A(\xi)\,d\xi. \qquad (1.26) $$

(By solution, they mean an integration of the Langevin equation. Is that sensible?) On taking the average over the initial ensemble, using the assumption that $\langle A(t)\rangle = 0$ as Langevin did,
$$ \langle u\rangle_{u_0} = u_0 e^{-\beta t}. \qquad (1.27) $$
The subscript means 'average of all $u$ that began with velocity $u_0$ at time $t = 0$.' Squaring Eq. 1.26 and taking the average,
$$ \langle u^2\rangle_{u_0} = u_0^2 e^{-2\beta t} + e^{-2\beta t}\int_0^t\!\!\int_0^t e^{\beta(\xi+\eta)}\,\langle A(\xi)A(\eta)\rangle\,d\xi\,d\eta. \qquad (1.28) $$
To make progress, we now assume that,
$$ \langle A(t_1)A(t_2)\rangle = \phi_1(t_1 - t_2), \qquad (1.29) $$
where $\phi_1$ is a function with values close to zero everywhere except at the origin where it has a very sharp peak. Physically, this is justifiable on the grounds that the fluctuating function $A(t)$ has values that are correlated only for very short time instants. In Ornstein and Uhlenbeck's own words:

[W]e will naturally make the following assumptions. There will be correlation between the values of [the random forcing function $A(t)$] at different times $t_1$ and $t_2$ only when $|t_1 - t_2|$ is very small. More explicitly we shall suppose that: $\langle A(t_1)A(t_2)\rangle = \phi(t_1 - t_2)$, where $\phi(x)$ is a function with a very sharp maximum at $x = 0$.

The dependence on the time difference only, $t_1 - t_2$, is typical of a special class of stochastic processes. As we will see in the coming chapters, $\phi_1$ is called the correlation function or autocorrelation of $A(t)$. Making the change of variable $\xi + \eta = v$ and $\xi - \eta = w$, the integration can be carried out,
$$ \langle u^2\rangle_{u_0} = u_0^2 e^{-2\beta t} + \frac{\tau_1}{2\beta}\left(1 - e^{-2\beta t}\right), \qquad \tau_1 \equiv \int_{-\infty}^{\infty}\phi_1(w)\,dw \qquad (1.30) $$
(where the integration limits in the definition of $\tau_1$ extend to $\pm\infty$ because $\phi_1 \approx 0$ except near the origin). Recalling that the equipartition of energy theorem requires that,
$$ \lim_{t\to\infty}\langle u^2\rangle = \frac{k_B T}{m}, \qquad (1.31) $$
we have – in that same limit – that $\tau_1/2\beta = k_B T/m$, so,
$$ \tau_1 = \frac{2\beta k_B T}{m}, \qquad (1.32) $$
and the final result for the mean-squared velocity of the Brownian particle is,
$$ \langle u^2\rangle_{u_0} = u_0^2 e^{-2\beta t} + \frac{k_B T}{m}\left(1 - e^{-2\beta t}\right). \qquad (1.33) $$
If, in addition to Eq. 1.29, it is assumed that the fluctuations $A(t)$ are Gaussian random variables at every time $t$, it is then possible to show that the random quantity $u - u_0 e^{-\beta t}$ also obeys a Gaussian (normal) probability distribution, given by,
$$ G(u, u_0, t) = \left[\frac{m}{2\pi k_B T\left(1 - e^{-2\beta t}\right)}\right]^{1/2} \exp\left[-\frac{m\left(u - u_0 e^{-\beta t}\right)^2}{2k_B T\left(1 - e^{-2\beta t}\right)}\right]. \qquad (1.34) $$
Having derived the velocity distribution, Ornstein and Uhlenbeck proceed to calculate the distribution for the displacement.

The problem is to determine the probability that a free particle in Brownian motion which at $t = 0$ starts from $x = x_0$ with velocity $u_0$ will lie, after a time $t$, between $x$ and $x + dx$. It is clear that this probability will depend on $s = x - x_0$ only, and on $t$.

In analogy with the velocity, they begin with the velocity solution (Eq. 1.26),
$$ u = u_0 e^{-\beta t} + e^{-\beta t}\int_0^t e^{\beta\xi} A(\xi)\,d\xi, \qquad (1.35) $$
and integrate, with the result,
$$ x = x_0 + \frac{u_0}{\beta}\left(1 - e^{-\beta t}\right) + \int_0^t e^{-\beta\eta}\int_0^{\eta} e^{\beta\xi} A(\xi)\,d\xi\,d\eta; \qquad (1.36) $$
integrating partially, this may be written in the form,
$$ s = x - x_0 = \frac{u_0}{\beta}\left(1 - e^{-\beta t}\right) - \frac{1}{\beta}e^{-\beta t}\int_0^t e^{\beta\xi} A(\xi)\,d\xi + \frac{1}{\beta}\int_0^t A(\xi)\,d\xi. \qquad (1.37) $$
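Equations 1.27 and 1.33 can likewise be checked by simulation. The sketch below is illustrative only: it assumes $A(t)$ is Gaussian white noise with intensity fixed by $\tau_1 = 2\beta k_B T/m$ (Eq. 1.32), integrates Eq. 1.25 by the Euler-Maruyama method for an ensemble started at $u_0$, and compares the sample mean and variance of $u$ with the predicted values. The parameter choices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
beta, kBT_over_m, u0 = 2.0, 1.5, 3.0       # illustrative values
dt, nsteps, n = 1e-3, 800, 100000          # observation time t = 0.8

# White-noise strength consistent with tau1 = 2*beta*kBT/m (Eq. 1.32).
sigma = np.sqrt(2.0 * beta * kBT_over_m)
u = np.full(n, u0)                          # ensemble, all started at u0
for _ in range(nsteps):
    u += -beta * u * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)

t = nsteps * dt
mean_pred = u0 * np.exp(-beta * t)                          # Eq. 1.27
var_pred = kBT_over_m * (1.0 - np.exp(-2.0 * beta * t))     # Eq. 1.33
print("mean:", u.mean(), "predicted:", mean_pred)
print("var: ", u.var(), "predicted:", var_pred)
```

Both the decay of the mean toward zero and the relaxation of the variance toward the equipartition value $k_B T/m$ should be reproduced to within the sampling error of the ensemble.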

Taking the mean gives,
$$ \langle s\rangle_{u_0} = \frac{u_0}{\beta}\left(1 - e^{-\beta t}\right). \qquad (1.38) $$
By squaring, averaging, and computing the double integral as above,
$$ \langle s^2\rangle_{u_0} = \frac{\tau_1}{\beta^2}\,t + \frac{u_0^2}{\beta^2}\left(1 - e^{-\beta t}\right)^2 + \frac{\tau_1}{2\beta^3}\left(-3 + 4e^{-\beta t} - e^{-2\beta t}\right), \qquad (1.39) $$
where $\tau_1$ is as above. For short times, we have,
$$ \langle s\rangle_{u_0} \sim u_0 t \quad\text{and}\quad \langle s^2\rangle_{u_0} \sim u_0^2 t^2 \quad\text{as } t \to 0, \qquad (1.40) $$
i.e. the motion is uniform with velocity $u_0$. In the limit of long times, on the other hand,
$$ \langle s^2\rangle_{u_0} \sim \frac{\tau_1}{\beta^2}\,t = 2\,\frac{k_B T}{m\beta}\,t \quad\text{as } t \to \infty, \qquad (1.41) $$
i.e. it goes over to Einstein's result, $\langle x^2\rangle = 2Dt$. Taking a second average over $u_0$ (using the distribution derived above), and since $\langle u^2\rangle = k_B T/m$, we have,
$$ \langle s\rangle = 0, \qquad \langle s^2\rangle = 2\,\frac{k_B T}{m\beta^2}\left(\beta t - 1 + e^{-\beta t}\right). \qquad (1.42) $$
This result was first obtained by Ornstein (1917). The calculation of higher powers proceeds similarly. Under the assumption that $A(t)$ is Gaussian, one can show that the variable,
$$ S = s - \frac{u_0}{\beta}\left(1 - e^{-\beta t}\right), \qquad (1.43) $$
is Gaussian distributed, too, with
$$ F(x, x_0, t) = \left[\frac{m\beta^2}{2\pi k_B T\left(2\beta t - 3 + 4e^{-\beta t} - e^{-2\beta t}\right)}\right]^{1/2} \exp\left[-\frac{m\beta^2\left(x - x_0 - \frac{u_0}{\beta}\left(1 - e^{-\beta t}\right)\right)^2}{2k_B T\left(2\beta t - 3 + 4e^{-\beta t} - e^{-2\beta t}\right)}\right]. \qquad (1.44) $$
The Ornstein-Uhlenbeck paper contains more than the results described here. For instance, they re-derive the equation using what is called the Fokker-Planck equation.
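As a worked check of the two limits quoted above, one can expand Eq. 1.42 directly:

```latex
\langle s^2 \rangle = \frac{2 k_B T}{m\beta^2}\left(\beta t - 1 + e^{-\beta t}\right).
% Short times (beta*t << 1): e^{-\beta t} = 1 - \beta t + \tfrac{1}{2}(\beta t)^2 - \cdots,
% so the first two terms cancel and the leading contribution is
\langle s^2 \rangle \approx \frac{2 k_B T}{m\beta^2}\cdot\frac{(\beta t)^2}{2}
                   = \frac{k_B T}{m}\, t^2 = \langle u^2 \rangle\, t^2,
% i.e. ballistic motion at the thermal velocity.
% Long times (beta*t >> 1): the constant and the exponential are negligible, so
\langle s^2 \rangle \approx \frac{2 k_B T}{m\beta}\, t = \frac{2 k_B T}{\gamma}\, t = 2 D t,
% i.e. Einstein's diffusive result, using beta = gamma/m and D = k_B T/gamma.
```

The crossover between the two regimes occurs at $t \sim 1/\beta = m/\gamma$, precisely the time scale excluded by the assumption of Eq. 1.24.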

(We shall meet the Fokker-Planck equation later, in Chapter 6.) They also consider the effect of an external force, by solving the problem of an harmonically-bound particle,
$$ \frac{d^2x}{dt^2} + \beta\,\frac{dx}{dt} + \omega^2 x = A(t). \qquad (1.45) $$

Relation between Ornstein and Uhlenbeck's work and the rest of the course

• Fokker-Planck equation: In addition to clarifying the assumptions left implicit by Langevin, Ornstein and Uhlenbeck provided a mathematical connection between the kinematic approach of Einstein (using a partial differential equation to describe the probability distribution) and the dynamical approach of Langevin (using an ordinary differential equation with random forcing to describe a single trajectory), permitting a generalization of Langevin's approach. That connection will be made explicit in Chapter 6.

Although the approach of Langevin was improved and expanded upon by Ornstein and Uhlenbeck, some more fundamental problems remained, as Doob pointed out in the introduction to his famous paper of 1942 (J. L. Doob (1942) Annals of Math. 43:351):

The purpose of this paper is to apply the methods and results of modern probability theory to the analysis of the Ornstein-Uhlenbeck distribution, its properties, and its derivation. A stochastic differential equation is introduced in a rigorous fashion to give a precise meaning to the Langevin equation for the velocity function. This will avoid the usually embarrassing situation in which the Langevin equation, involving the second-derivative of $x$, is used to find a solution $x(t)$ not having a second-derivative.

Obviously there is something deep going on in terms of the differentiability and integrability of a stochastic process; to make sense of Doob's statement, we need to know what is meant by differentiability in this context. That will be the focus of Chapter 7.

1.3 Outline of the Book

In broad strokes, the outline of the book is as follows: First, the fundamental concepts are introduced –

Chapter 6 . After the fundamentals have been established.Fokker-Planck Equation: This is a partial diﬀerential diﬀusion equation with drift that governs the entire probability distribution – a generalization of the diﬀusion equation used by Einstein in his study of Brownian motion. The remaining chapters are. Particular attention is paid to the second moment of a stochastic function evaluated at diﬀerent times. the master equation is too diﬃcult to solve exactly. Chapter 5 . with particular focus on the most common process appearing in the modeling of physical systems: The Markov process. methods of approximating the master equation so that it is more amenable to solution. but it is often approximately satisﬁed by many physical systems and allows a useful coarse-graining of the microscopic dynamics.Markov processes: The most commonly used stochastic descriptor with the property that the future state of a system is determined by the present state. There are some ambiguities and essential diﬃculties associated with this method when applied to systems with nonlinear transition probabilities. We examine the most popular approximation method – stochastic simulation. Chapter 4 . In most cases of interest. Before returning to the dynamic (stochastic diﬀerential equation) meth- . called the correlation function.Perturbation Expansion of the Master Equation: The linear noise approximation of van Kampen is a robust and widelyapplicable analytic approximation method for the solution of the Master equation. a diﬃcult equation to solve exactly. we examine in more detail solution methods and the kinematic formulation of Einstein. Once Markov processes and their evolution equations have been derived.Solution of the Master Equation: Brieﬂy discuss exact solution methods. and not on any states in the past. we examine how stochastic processes are classiﬁed. Chapter 3 . This is a strong requirement. to a large extent. 
The evolution equation for the probability distribution of a Markov process is often called the master equation.Brownian motion 21 Chapter 2 . We focus on a particular class of stochastic processes – called stationary processes – that will be used throughout the course.Random processes: The ideas of classical probability theory (Appendix A) are extended from random variables to the concept of random functions. A popular analytic method of approximating the evolution of a Markov process.

Chapter 7 – Stochastic Analysis: We shall be interested in the mean-square limits of sequences as the foundation of continuity and the operations of differentiation and integration. More refined ideas of integration will be required to provide a formal meaning to expressions involving white noise as the canonical noise source (Itô's calculus).

Chapter 8 – Random Differential Equations: The differential equation-based modeling of Langevin, Ornstein and Uhlenbeck has many appealing features. We examine generalizations of their analysis, with particular focus on numerical simulation and analytic approximation of solutions.

Chapter 9 – Macroscopic effects of noise: There are situations where stochastic models exhibit behaviour that is in sharp contrast with deterministic models of the same system. We examine several such cases. Combining the ideas of the previous chapters, we characterize the effect of fluctuations on the macroscopic behaviour of nonlinear systems, and the analytic tools that can be used to study these effects.

Chapter 10 – Special Topics: Some additional topics are included which are beyond the scope of the course, though perhaps of interest to some readers.

1.4 Suggested References

Much of the content on Brownian motion comes from C. W. Gardiner's book:

• Handbook of stochastic methods (3rd Ed.), C. W. Gardiner (Springer, 2004).

As claimed, it is a handbook, so it is a great starting point if you're looking for specific information about a specific stochastic process. One of the finest books on stochastic processes is by N. G. van Kampen:

• Stochastic processes in physics and chemistry (2nd Ed.), N. G. van Kampen (North-Holland, 2001).

Beyond the first chapter, it, too, is difficult reading, but well-worth the effort. He doesn't shy away from making emphatic and unambiguous statements that help in learning the material. The exercises are excellent – difficult and rewarding. A good supplemental text is the statistical mechanics book by Reif:

• Fundamentals of statistical and thermal physics, F. Reif (McGraw-Hill, 1965).

Exercises

1. Fill in the steps that connect Eq. 1.28 to Eq. 1.30. It may be helpful to treat $\phi_1(t_1 - t_2)$ in Eq. 1.29 as an idealized delta function $\delta(t_1 - t_2)$ (see Section B.2).

2. Harmonically-bound particle: Repeat Ornstein and Uhlenbeck's analysis to calculate $\langle x(t)\rangle$ and $\langle x^2(t)\rangle$ for the harmonically-bound particle, Eq. 1.45.

3. Fick's law: A substance with concentration $c(x, t)$ undergoing diffusive transport in one dimension obeys the diffusion equation,
$$ \frac{\partial c}{\partial t} = D\,\frac{\partial^2 c}{\partial x^2} = -\frac{\partial J}{\partial x}, \qquad (1.46) $$
to a good level of approximation. Here, the flux term $J = -D\,\partial c/\partial x$ was discovered empirically by Fick (1855). At the time, the dependence of the flux upon the concentration gradient was interpreted as meaning particles bump into each other in locations of high concentration and consequently move to areas of lower concentration. That interpretation persists in many textbooks. How does the physical interpretation change given Einstein's work on Brownian motion (1905)?

4. Benveniste: Read the paper "Human basophil degranulation triggered by very dilute antiserum against IgE," by Davenas et al. Notice the disclaimer appended to the end of the article by the editor. Why does the editor say that '[t]here is no physical basis for such an activity?' Specifically, if the results reported in this paper are correct, what parts of the course would need to be revised?

5. Read S. E. Luria and M. Delbrück (1943) Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28:491–511. Briefly (one or two sentences) summarize what the authors claim.

(a) What is the problem Luria and Delbrück set out to study? How will they distinguish among the possible hypotheses?

(b) What relation can you see between this article and Einstein's article on Brownian motion? What parameter in the work of Luria and Delbrück plays the role of Avogadro's number in Einstein's analysis?

(c) It was primarily this work that resulted in Luria and Delbrück winning the Nobel prize in 1969 (along with Hershey) – why do you think this work is so significant?

6. Brownian motion – Smoluchowski's model: Smoluchowski formulated Brownian motion as a random walk along a discrete lattice. In a certain limit, Smoluchowski's formulation coincides with Einstein's diffusion equation.

(a) Suppose each step has length $\Delta$ and is taken after a time $\tau$, with equal probability of moving to the left or to the right. Using a centered-difference scheme in $\Delta$, find an expression for,
$$ \frac{p_m(s+1) - p_m(s)}{\tau}, $$
where $p_m(s)$ is the probability the particle, beginning at lattice site $m = 0$ at $t = 0$, is at lattice site $m$ after $s$ steps ($m$ and $s$ are integers, with $s \ge 0$). Taking the limit $\Delta, \tau \to 0$ in such a way that,
$$ \frac{\Delta^2}{2\tau} = D, \qquad m\Delta \to x, \qquad s\tau = t, $$
show that the discrete random walk reduces to Einstein's diffusion equation,
$$ \frac{\partial P}{\partial t} = D\,\frac{\partial^2 P}{\partial x^2}. $$

(b) Suppose the movement is biased, with probability $p = \frac{1}{2} + \beta\Delta$ of moving to the left and $q = 1 - p = \frac{1}{2} - \beta\Delta$ of moving to the right after each step (where $\Delta$ must be small enough that $q > 0$). Under the same limiting conditions as above, how does the partial differential equation for $P(x, t)$ change?
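As a numerical companion to part (a) – a consistency check only, not a solution to the exercise, with all parameter values chosen arbitrarily – one can verify that the variance of the unbiased walk matches that of the diffusion equation's Gaussian solution, $\langle x^2\rangle = 2Dt$ with $D = \Delta^2/2\tau$:

```python
import numpy as np

rng = np.random.default_rng(2)
delta, tau = 0.1, 0.005            # step length and step time (illustrative)
D = delta**2 / (2.0 * tau)         # diffusion coefficient in the scaling limit
nsteps, nwalkers = 2000, 50000     # observation time t = nsteps * tau

# Net displacement after nsteps is delta * (#right - #left),
# with #right drawn from a Binomial(nsteps, 1/2) distribution.
nright = rng.binomial(nsteps, 0.5, size=nwalkers)
x = delta * (2 * nright - nsteps)

t = nsteps * tau
print(x.var(), 2 * D * t)          # the two should agree to within ~1%
```

The binomial shortcut avoids storing every step while being statistically identical to simulating the walk step by step.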

Solve this equation for $P(x, t)$, subject to $P(x, t) \to 0$ and $\partial P/\partial x \to 0$ as $x \to \pm\infty$.

(c) Suppose a state-dependent transition rate, so that the probability of moving in either direction depends upon the position of the particle. More precisely, if the particle is at $k\Delta$, the probabilities of moving right or left are,
$$ \frac{1}{2}\left(1 - \frac{k}{R}\right) \quad\text{or}\quad \frac{1}{2}\left(1 + \frac{k}{R}\right), $$
respectively, where $R$ is a given integer, and possible positions of the particle are limited by the condition $-R \le k \le R$. Under the same limiting conditions as above, and with $R \to \infty$ and $1/R\tau \to \gamma$, derive the partial differential equation for $P(x, t)$. Solve for $P(x, t)$, subject to the boundary conditions $P(x, t) \to 0$ and $\partial P/\partial x \to 0$ as $x \to \infty$, and a reflecting barrier at $x = 0$. That is, when the particle reaches the point $x = 0$ it must, in the next step, move $\Delta$ to the right.

7. One-dimensional random walk with hesitation: Consider a random walk along a discrete lattice with unit steps, with equal probabilities for moving to the right and to the left, given by $p = q = \lambda\tau$ (so that the probability to remain on a site is $r = 1 - 2\lambda\tau$), where $\tau$ is the time between consecutive steps and $\lambda$ is a given constant. (Note that throughout, $m$ is an integer.)

(a) Write a balance equation for the probability $p_m(s)$ for occupying the site $m$ after $s$ moves (if the initial site was $m = 0$). Take the $\tau \to 0$ limit in that equation to derive the following continuous-time equation,
$$ \frac{dp_m(t)}{dt} = \lambda\left[p_{m-1}(t) + p_{m+1}(t) - 2p_m(t)\right], $$
where $t = s\tau$ is a finite observation time.

(b) Show that,
$$ p_m(t) = \frac{1}{2\pi i}\oint z^{-m-1}\,Q(z, t)\,dz, $$

where
$$ Q(z, t) = \sum_{m=-\infty}^{\infty} z^m\,p_m(t) $$
is the probability-generating function, with $z$ a complex number, and the integral running along the unit circle $|z| = 1$ in the complex plane.

(c) Derive a differential equation for the probability-generating function and show that its solution, subject to the appropriate initial condition, is given by,
$$ Q(z, t) = \exp\left[\left(z + z^{-1} - 2\right)\lambda t\right]. $$

(d) Show that the solution of the equation from part 7a is given by,
$$ p_m(t) = e^{-2\lambda t}\,I_{|m|}(2\lambda t), $$
where $I_n(x)$ is a modified Bessel function of integer order $n$.

(e) Deduce that the leading term of the asymptotic series for the probability density function from part 7d, when $t, m \to \infty$ with $m^2/t$ fixed, has the form,
$$ p_m(t) = \frac{1}{\sqrt{4\pi\lambda t}}\,\exp\left(-\frac{m^2}{4\lambda t}\right). $$

Note: In parts 7d and 7e, you may use any standard handbook of mathematical methods and formulas, such as the book by Abramowitz and Stegun.

8. Monte Carlo simulation of random walks: Often various properties of stochastic processes are not amenable to direct computation and simulation becomes the only option. Estimate the solution of the following problems by running a Monte Carlo simulation.

(a) Consider a one-dimensional unbiased random walk, with a reflecting barrier at $x = 0$ and an absorbing barrier at $x = n$ (any particle that reaches $x = n$ disappears with probability 1). For a system that starts at $x = 0$, on average how many steps must it take before it is absorbed at $x = n$? Run your simulations for various $n$, and deduce a general formula. This is a simple enough example that the answer can be checked analytically – verify your formula obtained from simulation against the analytic solution.
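In the spirit of exercise 8, the closed-form result of part 7d can be spot-checked by direct simulation. The sketch below is illustrative only (arbitrary parameter values; the modified Bessel function $I_0$ is evaluated from its power series rather than a library routine), comparing the fraction of walkers found at $m = 0$ at time $t$ with $e^{-2\lambda t}I_0(2\lambda t)$:

```python
import math
import random

random.seed(3)

def bessel_I(n, x, terms=60):
    """Modified Bessel function I_n(x) from its power series."""
    return sum((x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

lam, tau, t = 1.0, 0.01, 2.0           # rate, step time, observation time
nsteps, nwalkers = int(t / tau), 10000
p = lam * tau                           # probability of a left (or right) step

count0 = 0                              # walkers found at m = 0 at time t
for _ in range(nwalkers):
    m = 0
    for _ in range(nsteps):
        r = random.random()
        if r < p:
            m -= 1
        elif r < 2 * p:
            m += 1                      # otherwise the walker hesitates
    count0 += (m == 0)

exact = math.exp(-2 * lam * t) * bessel_I(0, 2 * lam * t)   # p_0(t), part 7d
print(count0 / nwalkers, exact)
```

The discrete-time simulation only approximates the continuous-time equation, so agreement is expected to within a few percent for small $\lambda\tau$.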

ii.6). iii. For a spider asleep at the center of the web. . assume the time to traverse along a strand is independent of the strand length). Spider web for Exercise 8(b)iv. Suppose the web is very dusty so that the ﬂy can easily move along every strand (for simplicity.) Can you imagine a mapping of this problem to a one-dimensional random walk? Verify your simulation results with an analytic estimate. 8(b)ii and 8(b)iii. how many steps must the spider take in order to ﬁnd the ﬂy? How does that number change as the number of radial vertices increases? (In Fig. B. 1. Suppose the ﬂy dies of fright. Spider web for Exercise 8(b)i. On average. i. what is the probability of escape for a ﬂy landing on any given vertex of the web? Compute the escape probability at every vertex for 3. (b) Consider a two-dimensional random walk along a spider web (Fig.6: From a given vertex. 1.6. there are 3 radial vertices. 4. but slowly.Brownian motion A B 27 Figure 1. Now assume the spider can move. A ﬂy lands on the web at some vertex. A. then it is free. and 5 radial vertices. If the ﬂy can make it to one of the arrows. If the spider catches the ﬂy. Repeat 8(b)ii with a spider that moves 1 step for every 2 steps taken by the ﬂy. Hint: Use symmetry arguments to cut down the number of simulations you must run. then the ﬂy is eaten. the probability to move along any adjacent strand is equal. and the spider moves blindly along the web to ﬁnd it.

On average.2n . what is the most likely number of tosses for which Albert is in the lead? Guess ﬁrst. 1. Albert bets on heads. The probability that Albert leads in 2r out of 2n tosses is called the lead probability P2r. how many steps must the spider take in order to ﬁnd the ﬂy? (c) Suppose that in a coin-toss game with unit stakes. . . 2. . For a coin tossed 20 times.2n for n = 10 and r = 0. 10. 1. . .28 Applied stochastic processes iv. Suppose the ﬂy dies of fright.6B. and the spider moves blindly along the web to ﬁnd it. if the spider is asleep at the center. then run Monte Carlo simulations to make a table of P2r. Compute the escape probability for the ﬂy that lands in the tattered web shown in Fig. and Paul bets on tails.

Chapter 2

Random processes

By the time the Ornstein and Uhlenbeck paper appeared, it had become clear that Brownian motion was a phenomenon that could not be handled in a rigorous fashion by classical probability theory. For one thing, although Wiener (1922) had put a special case of Brownian motion on a sound mathematical basis, it was not obvious whether Wiener's methods could be generalized. For another, it was clear that Brownian motion was just one example of this kind of random phenomenon (with many more cases popping up in science and engineering). Although Brownian motion is certainly a random affair, the variables that describe it are not the ordinary random variables of classical probability theory, but rather they are random functions. Roughly speaking, these are functions which are specified by the results of an observation, and which can take on different values when the observation is repeated many times.

A phenomenon similar to Brownian motion is that of fluctuations in electric circuits. Strictly speaking, the current through a conductor is always a random function of time, since the thermal motion of the electrons produces uncontrollable fluctuations (thermal "noise"). This kind of noise was becoming increasingly important in the 1920's in electrical engineering due to the increasing importance of the vacuum tube – recall the pioneering work of van der Pol (1924) in this area. These vacuum tubes are always a source of considerable noise because of fluctuations in the number of electrons passing through the tube during identical time intervals (shot effect) and because of fluctuations in the cathode emission (flicker noise). In radio receivers, one not only observes noise arising in the electrical circuits themselves, but also random changes in the level of the received signals due to scattering of electromagnetic waves caused by inhomogeneities in the refractive index of the atmosphere (fading) and the influence of random electrical discharges (meteorological and industrial noise). As other examples, we may cite the pressure, temperature and velocity vector of a fluid particle in a turbulent flow – in particular, at a point in the Earth's atmosphere. These random functions depend on four variables (the three space variables and time): they are called random fields. The list goes on – electrical noise, albeit very important, is far from a unique case.

Figure 2.1: Examples of Random Functions. A) Height of a protein molecule above a catalytic surface. B) Fluctuations in the magnetic field around the North pole. C) Example of a random field – the cosmic background radiation.

To illustrate the appearance of the observed values of random functions, see Figure 2.1. Curves (and contours) of this type are obtained by experiment; that is, they are the result of observations. They represent realizations of the random functions; they are also called sample functions or sample paths. Their general appearance shows that a deterministic description of the phenomenon would be so complicated that the resulting mathematical model (if possible at all) would be practically impossible to solve.

2.1 Random processes – Basic concepts

Suppose we are given an experiment specified by its outcomes $\omega \in \Omega$ ($\Omega$ = the set of all possible outcomes), and by the probability of occurrence of certain subsets of $\Omega$. For example, in the tossing of a fair die, $\Omega = \{1, 2, 3, 4, 5, 6\}$; the probability of each outcome is independent of the previous toss and equal to $P = \frac{1}{6}$. A sample experiment might be $\omega = \{5, 3, 6, 4, 3, 2\}$. To every outcome $\omega$ we now assign – according to a certain rule – a function of time $\xi(\omega, t)$.
2 0 50 C Height (nm) 1 0. To every outcome ω we now assign – according to a certain rule – a . 5. homogeneities in the refractive index of the atmosphere (fading) and the inﬂuence of random electrical discharges (meteorological and industrial noise). see Figure 2. in the tossing of a fair die.6 B0 0. 3.8 δB 0. . 2. 6. 4. To illustrate the appearance of the observed values of random functions.5 Applied stochastic processes B 0. Ω = {1. 2. 4. albeit very important. Curves (and contours) of this type are obtained by experiment. As other examples.Basic concepts Suppose we are given an experiment speciﬁed by its outcomes ω ∈ Ω (Ω = the set of all possible outcomes). that is. they are the result of observations.1 Random processes .1: Examples of Random Functions A) Height of a protein molecule above a catalytic surface. These random functions depend on four variables (the three space variables and time): they are called random ﬁelds. 2}. 4.30 A 1. 3. the probability of each outcome is independent of the previous toss and equal to P = 1 . is far from a unique case.5 0 100 150 time (ms) 200 50 time (s) 100 150 Figure 2.1.4 0. . Their general appearance shows that a deterministic description of the phenomenon would be so complicated that the resulting mathematical model (if possible at all) would be practically impossible to solve. They represent realizations of the random functions. C) Example of a random ﬁeld – The cosmic background radiation. For example. at a point in the Earth’s atmosphere. The list goes on. 6}. they are also called sample functions or sample paths. A sample experiment might be 6 ω = {5. B) Fluctuations in the magnetic ﬁeld around the North pole. and by the probability of occurrence of certain subsets of Ω. we may cite the pressure. Electrical noise. temperature and velocity vector of a ﬂuid particle in a turbulent ﬂow – in particular.

since the use of such techniques is not essential for our subsequent considerations. t). This family is called a stochastic process (or a random function). depending upon the parameter ω. then ξ(ω. by specifying this measure. Remarks: 1. for us. Fix ω. This function is called a realization. originating in Kolmogorov’s work. To each outcome ω. This leads to the deﬁnition of a particular measure P (the probability measure) on the function space of realizations. We have thus created a family of functions. . T ]. Fix t. we will denote the stochastic process by ξ(t). in particular. but notice that Einstein’s point of view was to treat Brownian motion as a distribution of a random variable describing position (ξt (ω)). then ξ(ω. Unfortunately. ω and t. real or complex. From a purely mathematical point of view. it requires the use of advanced ideas from set and measure theory (see STAT 901/902). although it could also be that t ∈ [0. we specify the stochastic process. A stochastic process is a function of two variables. This approach. We shall demand that. Moreover. t ∈ R. we shall develop the theory from a more physical point of view. there corresponds a function of t.Random processes 31 function of time ξ(ω. This may seem a pedantic distinction. or as a family of random variables ξt (ω). 2. Usually. There are two points of view: 1. it can be shown to include as special cases all other ways of specifying a random process. 2. has been mathematically very successful and fruitful. for a ﬁxed t. ξt (ω) be a random variable in the classical sense. t) = ξt (ω) is a family of random variables depending upon the parameter t. in order to specify a stochastic process we have to provide the probability (or probability density) of occurrence of the various realizations. while Langevin took the point of view that Newton’s law’s of motion apply to an individual realization (ξ (ω) (t)). a stochastic process can be regarded as either a family of realizations ξ (ω) (t). 
t) = ξ (ω) (t) is a (real-say) function of time. In that way. with the dependence on ω understood. or sample function of the stochastic process. one for each ω.

Figure 2.2: Frequency interpretation of an ensemble of sample paths ξ(ω, t). An experiment is repeated n times and a sample path is observed each time.

Because of the above interpretation, various probability concepts may be immediately generalized to stochastic processes. For example, equality of two stochastic processes will mean that their respective functions are identical for any outcome ω_i; i.e., ξ(t) = ζ(t) means ξ_t(ω_i) = ζ_t(ω_i). Similarly, we define the operations of addition and multiplication. Now, given two numbers (x, t), then

F(x, t) = P{ξ(t) ≤ x}   (2.1)

is the probability of the event {ξ(t) ≤ x} consisting of all outcomes ω such that ξ_t(ω) ≤ x. More intuitively, we can adopt the frequency interpretation of Eq. 2.1 (see Figure 2.2): an experiment is repeated n times and a sample path is observed each time. The meaning of Eq. 2.1 is then: given x, we count the total number n_t(x) of times that (at a given t) the ordinates (x) of the observed functions do not exceed x, so that

F(x, t) ≈ n_t(x)/n.

As in classical probability theory, if ξ(t) is a real stochastic process, the probability density corresponding to the distribution F(x, t) is given by

f(x, t) = ∂F(x, t)/∂x   (2.2)

(see Eq. A.1 on page 232). Similarly, given two instants t1 and t2, consider the random variables ξ(t1) and ξ(t2) and define their joint probability distribution by

F(x1, x2, t1, t2) = P{ξ(t1) ≤ x1, ξ(t2) ≤ x2}.   (2.3)
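The frequency interpretation F(x, t) ≈ n_t(x)/n is easy to make concrete with a small simulation. The following sketch is not from the text: the Brownian-path ensemble, all parameter values, and the NumPy implementation are my own illustrative choices. It builds an ensemble of sample paths and estimates F(x, t) as the fraction of observed ordinates that do not exceed x:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Ensemble of n discretized Brownian sample paths xi(t), scaled so Var[xi(t)] = t.
n, steps, dt = 20000, 200, 0.01
increments = rng.normal(0.0, math.sqrt(dt), size=(n, steps))
paths = np.cumsum(increments, axis=1)       # paths[j, k] = xi^(j)(t_k)

# Frequency interpretation: F(x, t) ~ n_t(x)/n, the fraction of the n observed
# ordinates that do not exceed x at the chosen instant t.
k = steps - 1                               # pick t = 2.0
t = (k + 1) * dt
x = 0.5
F_hat = np.mean(paths[:, k] <= x)

# Exact distribution for this particular test process: xi(t) ~ N(0, t).
F_exact = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * t)))
```

Increasing the number of paths n shrinks the discrepancy between the empirical F_hat and the exact distribution, exactly as the frequency interpretation suggests.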

The corresponding probability density is defined analogously,

f(x1, x2, t1, t2) = ∂²F(x1, x2, t1, t2)/∂x1 ∂x2,   (2.4)

and so on for higher-order joint distributions. In general,

F(x1, . . . , xn, t1, . . . , tn) = P{ξ(t1) ≤ x1, . . . , ξ(tn) ≤ xn},   (2.5)

with density

f(x1, . . . , xn, t1, . . . , tn) = ∂ⁿF(x1, . . . , xn, t1, . . . , tn)/∂x1 · · · ∂xn.   (2.6)

Note that the nth-order distribution function determines all lower-order distribution functions, and in fact the nth-order distribution function completely determines the stochastic process. The distribution function Eq. 2.5 is not arbitrary, however, but must satisfy the following two conditions.

• Symmetry condition. For every permutation (j1, j2, . . . , jn) of (1, 2, . . . , n), we must have

F(xj1, xj2, . . . , xjn, tj1, tj2, . . . , tjn) = F(x1, . . . , xn, t1, . . . , tn).

This condition follows from the AND logic of the distribution function: the statements 'A AND B AND C' and 'A AND C AND B' are equivalent.

• Compatibility condition. For m < n, we must have

F(x1, . . . , xm, ∞, . . . , ∞, t1, . . . , tm, tm+1, . . . , tn) = F(x1, . . . , xm, t1, . . . , tm).

This statement again follows from the AND logic: the statements 'A AND B AND C' and 'A AND TRUE AND C' are equivalent.

Remark: Kolmogorov (1931) proved that given any set of density functions satisfying the two conditions above, there exists a probability space Ω and a stochastic process ξ(ω, t) such that F(x1, . . . , xn, t1, . . . , tn) gives the joint probability distribution of ξt1(ω), . . . , ξtn(ω).

As a further extension, we introduce the conditional probability density,

f(x2, t2|x1, t1) = f(x1, x2, t1, t2) / f(x1, t1).   (2.7)

In particular, the relationship between the conditional and joint probabilities is very straightforward: f(x1, x2, t1, t2) = f(x2, t2|x1, t1) × f(x1, t1). Intuitively, this simply means that the probability that the state is x2 at time t2 and x1 at t1 is equal to the probability that the state is x2 at time t2 given that it was x1 at t1, multiplied by the probability that the state actually was x1 at t1. More generally,

f(x1, . . . , xn, t1, . . . , tn) = f(xn, tn|xn−1, . . . , x1, tn−1, . . . , t1) × · · · × f(x2, t2|x1, t1) × f(x1, t1),

and,

f(xk+1, . . . , xn, tk+1, . . . , tn|x1, . . . , xk, t1, . . . , tk) = f(x1, . . . , xn, t1, . . . , tn) / f(x1, . . . , xk, t1, . . . , tk).

2.1.1 Moments of a Stochastic Process

We define the mean of the stochastic process by,

μ(t) = ⟨ξ(t)⟩ = ∫_{−∞}^{∞} x f(x, t) dx,   (2.8)

and the correlation function by,

B(t1, t2) = ⟨ξ(t1)ξ(t2)⟩ = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x1 x2 f(x1, x2, t1, t2) dx1 dx2.   (2.9)

It will be useful in coming chapters to re-write the correlation function using the definition from the previous section – re-writing the joint distribution using the conditional distribution. Focusing upon the two-variable case,

B(t1, t2) = ⟨ξ(t1)ξ(t2)⟩ = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x1 x2 [f(x2, t2|x1, t1) × f(x1, t1)] dx1 dx2.   (2.10)
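Eqs. 2.8 and 2.9 can be checked directly as ensemble averages over realizations. As a sketch (the test process ξ(t) = A cos t with A a standard normal random variable, the sample size, and the chosen instants are all my own illustrative choices, not from the text), one can compare the ensemble estimates of μ(t) and B(t1, t2) with their exact values μ(t) = 0 and B(t1, t2) = cos t1 cos t2:

```python
import numpy as np

rng = np.random.default_rng(1)

# Test process xi(t) = A*cos(t) with A ~ N(0,1):
# mu(t) = 0 and B(t1,t2) = <xi(t1) xi(t2)> = cos(t1)*cos(t2).
n = 100000
A = rng.normal(size=n)
t1, t2 = 0.3, 1.1

x1 = A * np.cos(t1)                 # ensemble of xi(t1) values
x2 = A * np.cos(t2)                 # ensemble of xi(t2) values

mu_hat = x1.mean()                  # estimate of mu(t1), Eq. 2.8
B_hat = (x1 * x2).mean()            # estimate of B(t1,t2), Eq. 2.9
B_exact = np.cos(t1) * np.cos(t2)
```

The sampling error of both estimates shrinks like 1/sqrt(n), so the agreement tightens as the ensemble grows.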

Carrying out the x2 integration first, Eq. 2.10 may also be written

B(t1, t2) = ∫_{−∞}^{∞} x1 ⟨x2⟩_{x1} f(x1, t1) dx1,

where ⟨x2⟩_{x1} = ∫_{−∞}^{∞} x2 f(x2, t2|x1, t1) dx2 is the conditional average of x2 (conditioned so that x2(t1) = x1). Related to the correlation is the covariance, given by

C(t1, t2) = ⟨{ξ(t1) − μ(t1)} · {ξ(t2) − μ(t2)}⟩ ≡ ⟨⟨ξ(t1)ξ(t2)⟩⟩,   (2.11)

and so on for higher-order correlations. The double-angled brackets denote the cumulants of the process. Recall that for a scalar random variable ξ, the cumulants κm of ξ are defined via a generating function,

⟨e^{−iξt}⟩ = exp[ Σ_{m=1}^{∞} ((−it)^m / m!) κm ]

(see Eq. A.9 on page 238). Similarly, the cumulants of a stochastic function ξ(t), denoted by κm ≡ ⟨⟨ξ^m⟩⟩, are given by

⟨e^{−i ∫_0^t ξ(t′)dt′}⟩ = exp[ Σ_{m=1}^{∞} ((−i)^m / m!) ∫_0^t · · · ∫_0^t ⟨⟨ξ(t1)ξ(t2) · · · ξ(tm)⟩⟩ dt1 dt2 · · · dtm ].   (2.12)

The general rule for relating cumulants to the moments of ξ(t) is to partition the digits of the moments into all possible subsets; for each partition, one writes the product of cumulants, then adds all such products. The first few relations will make the prescription clear – writing ⟨ξ(ti)⟩ = ⟨i⟩, and so on,

⟨1⟩ = ⟨⟨1⟩⟩,
⟨12⟩ = ⟨⟨1⟩⟩⟨⟨2⟩⟩ + ⟨⟨12⟩⟩,
⟨123⟩ = ⟨⟨123⟩⟩ + ⟨⟨1⟩⟩⟨⟨23⟩⟩ + ⟨⟨2⟩⟩⟨⟨13⟩⟩ + ⟨⟨3⟩⟩⟨⟨12⟩⟩ + ⟨⟨1⟩⟩⟨⟨2⟩⟩⟨⟨3⟩⟩.

2.2 Stationarity and the Ergodic Theorem

For systems characterized by linear equations, there is a very convenient relationship between the probability distribution of the fluctuations in equilibrium and the parameters in the model governing the dynamics.
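The partition rule can be checked on a scalar random variable, for which all the time arguments coincide and the third relation above collapses to ⟨ξ³⟩ = κ3 + 3κ1κ2 + κ1³. A numerical sketch (the exponential distribution, its rate, and the sample size are my own choices; for an exponential variable the exact cumulants are κm = (m−1)!/λ^m):

```python
import numpy as np

rng = np.random.default_rng(2)

# Sample an exponential variable with rate lam: kappa_m = (m-1)! / lam**m.
lam = 1.5
xi = rng.exponential(1.0 / lam, size=400000)

m1, m2, m3 = xi.mean(), (xi**2).mean(), (xi**3).mean()

# Empirical cumulants built from the moments.
k1 = m1
k2 = m2 - m1**2
k3 = m3 - 3 * m1 * m2 + 2 * m1**3

# Partition rule with all time arguments equal:
# <xi^3> = kappa3 + 3*kappa1*kappa2 + kappa1^3.
m3_from_cumulants = k3 + 3 * k1 * k2 + k1**3
```

The partition identity holds exactly (it is an algebraic rearrangement), while the empirical third cumulant k3 converges to the exact value 2/λ³ as the sample grows.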

This relationship is called the 'fluctuation-dissipation relation'. The fluctuation-dissipation relation underlies many results in stochastic processes (including Einstein and Langevin's treatment of Brownian motion), and its derivation will be the focus of this section. As a prerequisite, we introduce a very useful notion – stationarity.

Definition: The stochastic process ξ(t) is called stationary if all finite-dimensional distribution functions defining ξ(t) remain unchanged when the whole group of points is shifted along the time axis, i.e., if

F(x1, . . . , xn, t1 + τ, t2 + τ, . . . , tn + τ) = F(x1, . . . , xn, t1, t2, . . . , tn)   (2.13)

for any n, t1, . . . , tn and τ. In particular, all one-dimensional distribution functions must be identical (i.e. F(x, t) = F(x) cannot depend upon t), all 2-dimensional distribution functions can only depend upon t1 − t2, and so on. It follows that the autocorrelation B depends upon the time difference only,

⟨ξ(t)ξ(s)⟩ = B(t − s) = B(τ).

It turns out that there exists a whole class of phenomena where the underlying stochastic process is completely characterized by the mean ⟨ξ(t)⟩ = μ = constant and by the correlation function B(τ). Such stochastic processes are said to be stationary in the wide sense (or in Khinchin's sense). It should be clear that, in general, Eq. 2.13 above may not hold for stochastic processes which are only stationary in the wide sense.

A very useful property of stationary stochastic processes is the fact that they obey the so-called Ergodic Theorem. Consider a stochastic process, and suppose we know how to compute mathematical expectations; the question arises as to how to compare mathematical averages with those obtained by experiment. Experimentally, we may proceed as follows:

1. Take a large number N of records (realizations) of ξ(t): ξ^(1)(t), ξ^(2)(t), . . . , ξ^(j)(t), . . . , ξ^(N)(t).

2. Then we take the arithmetic mean, with respect to j, of ξ^(j)(t) for every t, and claim that this equals ⟨ξ(t)⟩,

⟨ξ(t)⟩ ≈ (1/N) Σ_{j=1}^{N} ξ^(j)(t).

3. Next, we take the arithmetic mean of the products ξ^(j)(t) ξ^(j)(s) for every pair (t, s), and claim that this equals ⟨ξ(t)ξ(s)⟩ = B(t, s).

4. Proceed analogously with higher-order moments.

This program is fine in principle, but in practice taking all of these records and processing them as above is a very complicated affair – impossible to carry out in many cases. It is far easier to take a few long records of the phenomenon; in such a case, we cannot compute the arithmetic mean, but rather we can compute the time averages of the quantities of interest, e.g.

(1/T) ∫_0^T ξ(t) dt,   (1/T) ∫_0^T ξ(t) ξ(t + τ) dt.

In general, this time-average will not equal the mathematical expectation; if the process is stationary, however, the ergodic theorem guarantees that they coincide. If ξ(t) is a stationary stochastic process (with t a continuous variable) satisfying certain general conditions, then

⟨ξ(t)⟩ = lim_{T→∞} (1/T) ∫_0^T ξ(t) dt,

⟨ξ(t) ξ(t + τ)⟩ = lim_{T→∞} (1/T) ∫_0^T ξ(t) ξ(t + τ) dt,

and so on for higher-order correlations. Here, T is a finite, but very large, time. A nice pictorial way of thinking about the content of the ergodic theorem is the following. Say we have a single, very long record ξ⋆(t) of our stochastic process ξ(t) – something like Figure 2.3.
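The ergodic claim – that time averages over one long stationary record reproduce the ensemble expectations – is easy to probe numerically. The sketch below is my own construction (not from the text): it generates a single long record of a stationary Ornstein-Uhlenbeck process, using its exact one-step update, and compares time averages against the known ensemble values ⟨ξ⟩ = 0 and ⟨ξ²⟩ = Γ/2β. All parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stationary Ornstein-Uhlenbeck process: <xi> = 0, <xi^2> = Gamma/(2*beta).
beta, Gamma = 1.0, 2.0
dt, nsteps = 0.05, 200000                 # record length T = nsteps*dt = 10000
a = np.exp(-beta * dt)
s = np.sqrt((Gamma / (2 * beta)) * (1 - a**2))   # exact one-step noise amplitude

x = np.empty(nsteps)
x[0] = 0.0
noise = rng.normal(size=nsteps - 1)
for k in range(nsteps - 1):
    x[k + 1] = a * x[k] + s * noise[k]

# Time averages over the single long record:
mean_time = x.mean()                      # should approach <xi> = 0
var_time = (x**2).mean()                  # should approach Gamma/(2*beta) = 1
```

With a record much longer than the correlation time 1/β, the time averages land close to the ensemble values, as the ergodic theorem promises.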

Figure 2.3: A long sample ξ⋆(t) of a stationary process can be used in place of many (shorter) repeated samples ξ^(1)(t), ξ^(2)(t), ξ^(3)(t), each of length T.

Now suppose this single record is cut into a large number N of pieces of length T. Since the process is stationary, each piece ξ^(j)(t) may be regarded as a separate measurement, and the arithmetic average can be performed,

⟨ξ(t)⟩ ≈ (1/N) Σ_{j=1}^{N} ξ^(j)(t),

⟨ξ(t) ξ(t + τ)⟩ = B(τ) ≈ (1/N) Σ_{j=1}^{N} ξ^(j)(t) ξ^(j)(t + τ),

and so on. The ergodic theorem says that in the limit N → ∞, the time-averages and the arithmetic averages coincide. In practice, the calculation of the averages is performed as follows. The record ξ⋆(t) is digitized (since ξ(t) cannot, in general, be expressed by a closed formula), and the time integrals are replaced by their approximating sums, e.g.,

∫_0^T ξ(t) dt ≈ (T/N) Σ_{k=1}^{N} ξ_k,

so that,

⟨ξ(t)⟩ ≈ (1/N) Σ_{k=1}^{N} ξ⋆(k∆),

where ∆ is a small time interval, and N is chosen to make N∆ = T sufficiently large. Usually, ∆ is chosen in such a way that the function ξ⋆(t) does not change appreciably during ∆, while the number N should be so large that a further increase has a negligible effect on the computed value.

2.3 Spectral Density and Correlation Functions

In practice, it is often much easier to measure the Fourier transform of a signal rather than trying to compute the autocorrelation function directly. The two representations are closely related, as we shall now show. First, we define,

X̂(ω) = ∫_0^T ξ(t) e^{−iωt} dt.

Then the fluctuation spectrum, or spectral density of the fluctuations, is defined as,

S(ω) = lim_{T→∞} (1/2πT) |X̂(ω)|².

Written another way,

S(ω) = lim_{T→∞} (1/π) ∫_0^T cos(ωτ) [ (1/T) ∫_0^{T−τ} ξ(t) ξ(t + τ) dt ] dτ.

After some manipulation, the connection between the (time-averaged) autocorrelation function and the fluctuation spectrum can be made explicit,

S(ω) = (1/π) ∫_0^∞ cos(ωτ) B(τ) dτ,

or,

S(ω) = (1/2π) ∫_{−∞}^{∞} e^{−iωτ} B(τ) dτ  ⟺  B(τ) = ∫_{−∞}^{∞} e^{iωτ} S(ω) dω.   (2.14)
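As a quick check of the cosine-transform pair, take B(τ) = e^{−β|τ|} (the exponential correlation met later in this chapter); then S(ω) = (1/π) ∫_0^∞ cos(ωτ) e^{−βτ} dτ = (β/π)/(β² + ω²) in closed form. The quadrature sketch below is my own check (the truncation length and grid are arbitrary choices):

```python
import numpy as np

# Wiener-Khinchin check: B(tau) = exp(-beta*|tau|) should give
# S(omega) = (beta/pi) / (beta^2 + omega^2).
beta = 1.3
tau = np.linspace(0.0, 40.0, 400001)      # truncation of the infinite integral
h = tau[1] - tau[0]
B = np.exp(-beta * tau)

def trap(y):
    # trapezoidal rule, written out to stay NumPy-version agnostic
    return (0.5 * (y[0] + y[-1]) + y[1:-1].sum()) * h

omegas = np.array([0.0, 0.7, 2.5])
S_numeric = np.array([trap(np.cos(w * tau) * B) / np.pi for w in omegas])
S_exact = beta / (np.pi * (beta**2 + omegas**2))
```

The exponential decay of B(τ) makes the truncation error negligible, so the numerical and exact spectra agree to many digits.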

The conjugal relationship between the autocorrelation function B(τ) and the fluctuation spectrum S(ω) is called the Wiener-Khinchin theorem (see Fig. 2.4). Here, for a stationary process ξ(t), the time-average may be replaced by the ensemble average via the ergodic theorem, and we write ⟨ξ(t) ξ(t + τ)⟩ = B(τ). Furthermore, using the Fourier transform of ξ(t),

Ξ̂(ω) = ∫_{−∞}^{∞} ξ(t) e^{−iωt} dt,

and assuming ⟨ξ(t)⟩ = 0 (so that ⟨Ξ̂(ω)⟩ = 0), we have the following,

⟨Ξ̂(ω) Ξ̂(ω′)⟩ = δ(ω − ω′) S(ω).   (2.15)

In the next chapter, we use the fluctuation spectrum S(ω) to derive the very useful fluctuation-dissipation relation.

2.3.1 Linear Systems

Consider a linear system with stochastic forcing η(t),

[a_n dⁿ/dtⁿ + a_{n−1} d^{n−1}/dt^{n−1} + · · · + a_0] y(t) = η(t).   (2.16)

For such systems, the correlation function of the process y(t) is related in a very simple way to the correlation of the forcing η(t). To show this relationship explicitly, we begin with the function H(ω), which is called the transfer function of the linear operator acting on y(t) and is given by the Fourier transform of the solution h(t) of the auxiliary equation,

[a_n dⁿ/dtⁿ + a_{n−1} d^{n−1}/dt^{n−1} + · · · + a_0] h(t) = δ(t)

(where δ(t) is the Dirac-delta function and h(t) is often called the impulse response of the system). Writing H(ω) explicitly,

H(ω) = [a_n (iω)ⁿ + a_{n−1} (iω)^{n−1} + · · · + a_0]^{−1}.
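For the first-order system dy/dt + βy = η(t), the impulse response is h(t) = e^{−βt} for t ≥ 0, and the formula above gives H(ω) = 1/(iω + β). The sketch below is my own numerical confirmation (parameters are illustrative) that the Fourier transform of h(t) reproduces H(ω):

```python
import numpy as np

# First-order linear system dy/dt + beta*y = eta(t):
# impulse response h(t) = exp(-beta*t) for t >= 0, and H(omega) = 1/(i*omega + beta).
beta = 2.0
t = np.linspace(0.0, 30.0, 300001)
dt = t[1] - t[0]
h = np.exp(-beta * t)

def transfer(omega):
    integrand = h * np.exp(-1j * omega * t)
    # trapezoidal rule, written out to stay NumPy-version agnostic
    return (0.5 * (integrand[0] + integrand[-1]) + integrand[1:-1].sum()) * dt

omega = 1.7
H_numeric = transfer(omega)
H_exact = 1.0 / (1j * omega + beta)
```

The agreement is limited only by the quadrature step and the truncation of the decaying exponential, both negligible here.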

Figure 2.4: Correlation functions and spectra. A) Example correlation functions Bx(τ) and By(τ). B) The corresponding fluctuation spectra Sx(ω) and Sy(ω), with the bandwidth indicated. As the correlation time decreases, the autocorrelation becomes more narrowly peaked, while the spectrum broadens – this is a general feature of correlation functions and spectra, implied by the Wiener-Khinchin theorem.

The utility of the impulse response h(t) is that the particular solution of an inhomogeneous differential equation can always be written as a convolution of the forcing function with h(t). After a change of variables, for example, Eq. 2.16 has the formal solution,

y(t) = ∫_{−∞}^{∞} η(t − τ′) h(τ′) dτ′.

If we multiply η(t + τ) by y⋆(t) (where ⋆ denotes the complex conjugate) and take the average, then

⟨η(t + τ) y⋆(t)⟩ = ∫_{−∞}^{∞} ⟨η(t + τ) η⋆(t − τ′)⟩ h⋆(τ′) dτ′,

and a Fourier transform gives,

S_ηy(ω) = S_ηη(ω) H⋆(ω).   (2.17)

In a similar way, we also have,

S_yy(ω) = H(ω) S_ηy(ω).   (2.18)

Combining Eqs. 2.17 and 2.18 yields the fundamental relation,

S_yy(ω) = S_ηη(ω) H(ω) H⋆(ω) = S_ηη(ω) × |H(ω)|².   (2.19)

As an example, consider a linear system with damping, driven by a stochastic function with zero mean and a delta correlation function,

dy/dt = −βy + η(t).   (2.20)

Eq. 2.20 is Langevin's equation for the velocity of a Brownian particle. The transfer function is simply H(ω) = 1/(iω + β), so that

S_yy(ω) = S_ηη(ω)/(ω² + β²) = (Γ/2π) · 1/(ω² + β²).   (2.21)

Taking the inverse Fourier transform,

⟨y(t) y(t − τ)⟩ = (Γ/2β) e^{−β|τ|}.   (2.22)

Eq. 2.22 is the autocorrelation function for the Ornstein-Uhlenbeck process at steady-state. If this is combined with the fundamental result above, we obtain the so-called fluctuation-dissipation relation, relating the noise magnitude Γ to the other physical parameters in the model.

2.3.2 Fluctuation-Dissipation Relation

For linear systems, we can obtain the relation between the correlation function of the process y(t) and the correlation function of the forcing function η(t) by directly calculating the autocorrelation of y(t). We can write a generic version of Langevin's equation as the following phenomenological damping law governing the variable y, subject to random forcing η(t),

dy/dt = −βy + η(t),   (2.23)

where β is the damping constant and the random forcing η(t) has the statistical properties,

⟨η(t)⟩ = 0,   (2.24)

⟨η(t) η(t′)⟩ = Γ δ(t − t′).   (2.25)
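The exponential autocorrelation of Eq. 2.22 can be probed by simulating the Langevin equation directly. The sketch below uses the Euler-Maruyama discretization (a standard integration scheme for such equations, not described in the text); the step size, record length, seed, and parameters are my own choices:

```python
import numpy as np

rng = np.random.default_rng(4)

# dy/dt = -beta*y + eta(t), with <eta(t) eta(t')> = Gamma*delta(t - t').
beta, Gamma = 1.0, 2.0
dt, nsteps = 0.01, 200000                  # record length T = 2000

y = np.empty(nsteps)
y[0] = 0.0
kicks = rng.normal(0.0, np.sqrt(Gamma * dt), size=nsteps - 1)
for k in range(nsteps - 1):
    y[k + 1] = y[k] - beta * y[k] * dt + kicks[k]   # Euler-Maruyama step

y = y[nsteps // 10:]                       # discard the initial transient

def autocorr(lag_steps):
    return np.mean(y[:-lag_steps] * y[lag_steps:])

# Compare against (Gamma / 2 beta) * exp(-beta*tau):
B_half = autocorr(int(0.5 / dt))           # tau = 0.5
B_one = autocorr(int(1.0 / dt))            # tau = 1.0
B_exact = lambda tau: (Gamma / (2 * beta)) * np.exp(-beta * tau)
```

The residual mismatch comes from the finite record length and the O(dt) bias of the discretization; refining both tightens the agreement with the exponential form.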

(Compare with Eq. 1.29 on page 17.) Noise with a Dirac-delta correlation function is called white noise, because one can easily show that the fluctuation spectrum of the Dirac-delta is a constant, i.e. all frequencies are equally represented in the spectrum, in analogy with white light that contains all frequencies of visible light (see Eq. B.32 on p. 264).

For the linear system (2.23), it is possible to compute the fluctuation spectrum of y without using Eq. 2.25! To that end, note that the general solution of Eq. 2.23 is,

y(t) = y0 e^{−βt} + ∫_0^t e^{−β(t−t′)} η(t′) dt′,   (2.26)

conditioned by the initial value y(0) = y0. Taking the average,

⟨y(t)⟩_{y0} = y0 e^{−βt},   (2.27)

since the average of η(t) vanishes by Eq. 2.24. Calculating the autocorrelation function, conditioned by an equilibrium initial state,

⟨y(0) y(t)⟩_{y0, eq} = ⟨y²⟩_eq e^{−βt},   (2.28)

where ⟨y²⟩_eq is given by the variance of the equilibrium distribution, as calculated by equilibrium statistical mechanics (equipartition of energy). The spectral density is simply,

S_yy(ω) = (1/π) ⟨y²⟩_eq β/(β² + ω²).   (2.29)

Notice that up to this point, Eq. 2.25 has not been used. From the fundamental relation above (Eq. 2.19), we have,

(β² + ω²) S_yy(ω) = S_ηη(ω).   (2.30)

Now using Eq. 2.29 in Eq. 2.30, we arrive at,

Γ = 2 ⟨y²⟩_eq β,   (2.31)

relating the magnitude of the fluctuations (Γ) to the magnitude of the phenomenological dissipation (β). This is a statement of the fluctuation-dissipation relation. It says that in order to achieve thermal equilibrium,

the diffusive effects of the fluctuations must be balanced by the dissipative effects of the drag. A version of this relation was used by Einstein in his study of Brownian motion to relate the diffusion coefficient D to the drag force experienced by the particle, thereby obtaining an explicit relation between the mean-squared displacement and Boltzmann's constant (cf. Eq. 1.13 on page 7). The fluctuation-dissipation relation was later used by Nyquist and Johnson in the study of thermal noise in a resistor. That work is described in more detail below and provides another example of using the fluctuations to quantify phenomenological parameters (in this case, Boltzmann's constant).

2.3.3 Johnson and Nyquist

• J. B. Johnson (1928) "Thermal agitation of electricity in conductors," Physical Review 32: 97–109.
• H. Nyquist (1928) "Thermal agitation of electric charge in conductors," Physical Review 32: 110–113.

In 1928, Johnson observed random fluctuations of current through resistors of various materials. Most importantly, he observed that the power in the fluctuations scaled linearly with temperature. In an article immediately following, Nyquist provided a theoretical framework for Johnson's observations. His main tools were, again – like Langevin – the equipartition of energy theorem from equilibrium statistical mechanics and the fluctuation-dissipation theorem.

Figure 2.5: Johnson-Nyquist circuit. Capacitance C, carrying charge Q(t), shorted through a resistance R at temperature T; the current I(t) is dQ/dt. Redrawn from D. S. Lemons "An introduction to stochastic processes in physics," (Johns Hopkins University Press, 2002).

The consequence of Nyquist's analysis

was that the linear dependence of the fluctuation power on the ambient temperature yields Boltzmann's constant (which is proportional to the slope of the line). We shall consider their work in more detail below, starting with the work of Nyquist and applying the fluctuation-dissipation relation to a simple RC-circuit (Figure 2.5).

Figure 2.6: Johnson's measurement of ⟨ν²⟩/R as a function of T; the slope is proportional to Boltzmann's constant. Copied from Figure 6 of Johnson's paper.

We consider a specific circuit – a charged capacitor C discharging through a resistance R, all at a fixed temperature T (Figure 2.5). Nyquist uses essentially a completely verbal argument based on equilibrium thermodynamics to arrive at the following conclusions:

• To a very good approximation, the fluctuations should be uniformly distributed among all frequencies (i.e., the noise is white).

• At equilibrium, the average energy in each degree of freedom will be kB T, where kB is Boltzmann's constant and T is the absolute temperature of the system. Of that energy, one half is magnetic and the other half is electric.

In the

absence of fluctuations, Kirchhoff's law gives,

IR + Q/C = 0,

where I is the current in the circuit and Q is the charge on the capacitor. Johnson observed variability in the current, represented by a white noise source η(t). Writing I = dQ/dt leads to the Langevin equation,

dQ/dt = −(1/RC) Q + η(t).   (2.32)

This equation has the canonical form discussed in the previous section, with damping coefficient β = 1/RC. From the equipartition of energy, we know that the mean electrical energy stored in the capacitor is,

⟨Q²⟩/2C = kB T/2,

or, ⟨Q²⟩ = C kB T. From the fluctuation-dissipation relation (Eq. 2.31), we immediately have the variance of the fluctuations Γ,

⟨η²⟩ = Γ = 2 ⟨Q²⟩ · β = 2 kB T / R.   (2.33)

This result is more commonly written as a fluctuating voltage ν applied to the circuit. By Ohm's law, V = IR, so that,

⟨ν²⟩ = ⟨η²⟩ · R² = 2 R kB T,   (2.34)

called Nyquist's theorem. Notice ⟨ν²⟩/R should be linear in T – that is precisely what Johnson observed (Figure 2.6). Using his setup, Johnson obtained an averaged estimate for Boltzmann's constant of (1.27 ± 0.17) × 10⁻²³ J/K, as compared to the currently accepted value of 1.38 × 10⁻²³ J/K.

Suggested References

Two superb books dealing with correlation functions and spectra from an applied perspective are,

• Probability, random variables, and stochastic processes (2nd Ed.), A. Papoulis (McGraw-Hill, 1984),

• Introduction to random processes with applications to signals and systems (2nd Ed.), W. A. Gardner (McGraw-Hill, 1990).

The second book (by Gardner) is particularly recommended for electrical engineering students, or those wishing to use stochastic processes to model signal propagation and electrical networks.

Exercises

1. White noise: It is often useful, in cases where the time-scale of the fluctuations is much shorter than any characteristic time scale in the system of interest, to represent the fluctuations as white noise, with correlation function B(τ) = δ(τ). Such a process is called white noise since the spectrum contains all frequencies (in analogy with white light).

(a) Show that for δ-correlated fluctuations, the fluctuation spectrum is constant.

(b) Assume ξ has units such that ⟨ξ²⟩ ∝ Energy; then ∫_{ω1}^{ω2} S(ω) dω = E(ω1, ω2) is the energy lying between frequencies ω1 and ω2. Show that the energy content of white-noise fluctuations is infinite! That is to say, no physical process can be exactly described by white noise.

2. Linear systems: Systems such as Langevin's model of Brownian motion, with linear dynamic equations, filter fluctuations in a particularly simple fashion.

(a) Fill in the details to derive Eqs. 2.17 and 2.18.

(b) For systems characterized by several variables, the correlation function is generalized to the correlation matrix, defined either by the dot product ⟨y(t) · y(t′)ᵀ⟩ = C(t, t′) (where ᵀ is the transpose), or element-wise, ⟨yi(t) yj(t′)⟩ = Cij(t, t′). For a stationary process, Cij(t, t′) = Cij(t − t′). Use the same line of reasoning as above to generalize Eq. 2.19 to higher dimensions, and show that,

S_yy(ω) = H(ω) · S_ηη(ω) · Hᵀ(−ω),   (2.35)

where S_yy(ω) is the matrix of the spectra of Cij(t − t′) and H is the matrix Fourier transform.

3. Let the stochastic process X(t) be defined by X(t) = A cos(ωt) + B sin(ωt), where ω is constant and A and B are random variables.

(a) Show that if ⟨A⟩ = ⟨B⟩ = ⟨AB⟩ = 0 and ⟨A²⟩ = ⟨B²⟩, then X(t) is stationary in the wide sense.

(b) If the joint probability density function for A and B has the form fAB(a, b) = h(√(a² + b²)), then X(t) is stationary in the strict sense.

4. MacDonald's theorem: Let Y(t) be a fluctuating current at equilibrium (i.e. Y(t) is stationary). It is often easier to measure the transported charge Z(t) = ∫_0^t Y(t′) dt′. Show that the spectral density of Y, S_YY(ω), is related to the transported charge fluctuations by MacDonald's theorem,

S_YY(ω) = (ω/π) ∫_0^∞ sin(ωt) (d/dt)⟨Z²(t)⟩ dt.

Hint: First show that (d/dt)⟨Z²(t)⟩ = 2 ∫_0^t ⟨Y(0) Y(t′)⟩ dt′.

5. Integrated Ornstein-Uhlenbeck process: Define Z(t) = ∫_0^t Y(t′) dt′ (t ≥ 0), where Y(t) is an Ornstein-Uhlenbeck process with,

⟨Y(t)⟩ = 0 and ⟨Y(t) Y(t − τ)⟩ = (Γ/2β) e^{−β|τ|}.

Show that Z(t) is Gaussian, but neither stationary nor Markovian.

(a) Find ⟨Z(t1) Z(t2)⟩.

(b) Calculate ⟨cos[Z(t1) − Z(t2)]⟩. It may be helpful to consider the cumulant expansion of the exponential of a random process, Eq. 2.12 on page 35.

6. Reciprocal spreading and the Wiener-Khinchin theorem: Suppose g(t) is any non-periodic, real-valued function with Fourier transform G(ω) and finite energy:

W = ∫_{−∞}^{∞} g²(t) dt = (1/2π) ∫_{−∞}^{∞} |G(ω)|² dω < ∞.

For the following, it may be useful to use the Cauchy-Schwarz inequality (p. 267), and to consider the integral,

∫_{−∞}^{∞} t g(t) (dg/dt) dt.

(a) Writing the 'uncertainty' in time as σt and the 'uncertainty' in frequency as σω, where,

σt² = (1/W) ∫_{−∞}^{∞} t² g²(t) dt,   σω² = (1/2πW) ∫_{−∞}^{∞} ω² |G(ω)|² dω,

show that σt · σω ≥ 1/2.

(b) For what function g(t) does σt · σω = 1/2?
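For intuition on part (b), the Gaussian g(t) = e^{−t²/2} is the classic candidate for the equality case. The numerical sketch below (my own check; the grids simply truncate the infinite integrals, and the transform G(ω) = √(2π) e^{−ω²/2} is the standard Gaussian pair) evaluates both uncertainties and their product:

```python
import numpy as np

# Gaussian test function g(t) = exp(-t^2/2), with G(omega) = sqrt(2*pi)*exp(-omega^2/2).
t = np.linspace(-20.0, 20.0, 400001)
dt = t[1] - t[0]
g = np.exp(-t**2 / 2)

w = np.linspace(-20.0, 20.0, 400001)
dw = w[1] - w[0]
G2 = 2 * np.pi * np.exp(-w**2)             # |G(omega)|^2

W_time = np.sum(g**2) * dt                  # energy computed from g(t)
W_freq = np.sum(G2) * dw / (2 * np.pi)      # same energy via the spectrum

sigma_t = np.sqrt(np.sum(t**2 * g**2) * dt / W_time)
sigma_w = np.sqrt(np.sum(w**2 * G2) * dw / (2 * np.pi * W_time))
product = sigma_t * sigma_w                 # saturates the bound at 1/2
```

Both routes give the same energy W = √π, and the uncertainty product lands on the lower bound 1/2, consistent with the Gaussian saturating the inequality.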

Chapter 3

Markov processes

3.1 Classification of Stochastic Processes

Because the nth-order distribution function F(x1, . . . , xn, t1, . . . , tn), or density f(x1, . . . , xn, t1, . . . , tn), completely determines a stochastic process, it leads to a natural classification system. Assuming the time-ordering t1 < t2 < · · · < tn, we identify several examples of stochastic processes:

1. Purely Random Process: Successive values of ξ(t) are statistically independent, i.e.,

f(x1, . . . , xn, t1, . . . , tn) = f(x1, t1) · f(x2, t2) · · · f(xn, tn);

in other words, all the information about the process is contained in the 1st-order density. Obviously, if the nth-order density factorizes, then all lower-order densities do as well; e.g., from

f(x1, x2, x3, t1, t2, t3) = f(x1, t1) · f(x2, t2) · f(x3, t3),

it follows, by integrating over x3, that

f(x1, x2, t1, t2) = f(x1, t1) · f(x2, t2).

However, the converse is not true.

2. Markov Process: Defined by the fact that the conditional probability density enjoys the property,

f(xn, tn|x1, . . . , xn−1, t1, . . . , tn−1) = f(xn, tn|xn−1, tn−1).   (3.1)

That is, given the value xn−1 at tn−1, the conditional probability density at tn is not affected by the values at earlier times. In this sense, the process is "without memory." A Markov process is fully determined by the two functions f(x1, t1) and f(x2, t2|x1, t1); the whole hierarchy can be reconstructed from them. For example, with t1 < t2 < t3,

f(x1, x2, x3, t1, t2, t3) = f(x3, t3|x1, x2, t1, t2) · f(x2, t2|x1, t1) · f(x1, t1) = f(x3, t3|x2, t2) · f(x2, t2|x1, t1) · f(x1, t1),   (3.2)

where the last equality follows from the Markov property, f(x3, t3|x1, x2, t1, t2) = f(x3, t3|x2, t2). The algorithm can be continued: beginning with the initial distribution f(x1, t1), we have a process carried forward in time by the propagator f(xi+1, ti+1|xi, ti). Notice, too, the analogy with ordinary differential equations. This property makes Markov processes manageable, and in many applications (for example Einstein's study of Brownian motion) the property is approximately satisfied by the process over the coarse-grained time scale (see p. 59). This viewpoint is developed in detail in the book by D. T. Gillespie (1992) Markov Processes.

Progressively more complicated processes may be defined in a similar way, although usually very little is known about them. In some cases, however, it is possible to add a set of auxiliary variables to generate an augmented system that obeys the Markov property; see N. G. van Kampen (1998) "Remarks on non-Markov processes," Brazilian Journal of Physics 28: 90–96.

WARNING: In the physical literature, the adjective Markovian is used with regrettable looseness. The term seems to have a magical appeal, which invites its use in an intuitive sense not covered by the definition. In particular:

• When a physicist talks about a "process," a certain phenomenon involving time is usually what is being referred to. It is meaningless to say a "phenomenon" is Markovian (or not) unless one specifies the variables (especially the time scale) to be used for its description.

• Eq. 3.1 is a condition on all the probability densities; one simply cannot say that a process is Markovian if only information about the first few of them is available. On the other hand, if one knows the process is Markovian, then of course f(x1, t1) and f(x2, t2|x1, t1) do suffice to specify the entire process.

3.2 Chapman-Kolmogorov Equation

First, some remarks about terminology before deriving this fundamental result. We shall call those variables appearing on the left (right) of the conditional line in a conditional probability distribution the left variables (right variables). For example, in

f(x1, x2, x3, x4, x5 | x6, x7, x8, x9, x10),

x1, . . . , x5 are left variables and x6, . . . , x10 are right variables. Sometimes one wants to remove a left (or right) variable; this can be done according to the following rules.

• To remove a number of left variables, simply integrate with respect to them. For example,

f(x1|x3) = ∫_{−∞}^{∞} f(x1, x2|x3) dx2.

• To remove a number of right variables, multiply by their conditional density with respect to the remaining right variables and integrate. For example,

f(x1|x4) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x1|x2, x3, x4) f(x2, x3|x4) dx2 dx3.
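Returning to the Markov property: the reconstruction of the joint density from f(x1, t1) and the transition density, Eq. 3.2, is easy to verify explicitly for a discrete two-state Markov chain, where sums replace the integrals. The transition matrix and initial distribution below are my own arbitrary choices for illustration:

```python
import numpy as np

# Two-state Markov chain: P[i, j] = f(x_{n+1} = j | x_n = i); initial density f1.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
f1 = np.array([0.7, 0.3])

# Joint density reconstructed from the hierarchy, Eq. 3.2:
# f(x1, x2, x3) = f(x3|x2) * f(x2|x1) * f(x1)
joint = np.einsum('i,ij,jk->ijk', f1, P, P)

total = joint.sum()                        # normalization: should be exactly 1
f3_marginal = joint.sum(axis=(0, 1))       # removing left variables = summing them out
```

Summing out x1 and x2 (the discrete analogue of integrating out left variables) recovers the marginal f(x3) = f1 P², as it must.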

Recall the definition of the distribution function of n random variables ξ1, ξ2, . . . , ξn:

F(x1, . . . , xn) = P{ξ1 ≤ x1, . . . , ξn ≤ xn}.

If we substitute certain variables in the distribution function by ∞, we obtain the joint distribution function of the remaining variables, e.g.,

F(x1, x3) = F(x1, ∞, x3, ∞).

The joint density function is obtained by differentiation with respect to {x1, . . . , xn}; obviously, f(x1, . . . , xn) ≥ 0, and

∫_{−∞}^{∞} · · · ∫_{−∞}^{∞} f(x1, . . . , xn) dx1 · · · dxn = 1.

If we integrate the joint density function f(x1, . . . , xn) with respect to certain variables, we obtain the joint density of the remaining variables (called the marginal density), e.g.,

f(x1, x3) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x1, x2, x3, x4) dx2 dx4.

From now on, the ensemble average (or E{·}) will be denoted by ⟨· · ·⟩, e.g.,

⟨x(t)⟩ = ∫_{−∞}^{∞} x f(x, t) dx.

We now proceed with the derivation of the Chapman-Kolmogorov equation. As we have seen, a Markov process is fully determined by the two functions f(x1, t1) and f(x2, t2|x1, t1). This, of course, does not mean that these two functions can be chosen arbitrarily, for they must also obey two important identities. The first one follows from the definition of the conditional density,

f(x1, x2, t1, t2) = f(x2, t2|x1, t1) · f(x1, t1)   (3.3)

(where f(x2, t2|x1, t1) can be thought of as the transition probability, for reasons that will become clear below when we discuss the master equation). By integration over x1, we immediately have,

f(x2, t2) = ∫_{−∞}^{∞} f(x2, t2|x1, t1) f(x1, t1) dx1.   (3.4)

The second identity is obtained from Eq. 3.2. With t1 < t2 < t3, integration of Eq. 3.2 over x2 gives,

f(x1, x3, t1, t3) = f(x1, t1) ∫_{−∞}^{∞} f(x3, t3|x2, t2) f(x2, t2|x1, t1) dx2,

and, using f(x1, x3, t1, t3) = f(x3, t3|x1, t1) · f(x1, t1) (Eq. 3.3), we immediately have,

f(x3, t3|x1, t1) = ∫_{−∞}^{∞} f(x3, t3|x2, t2) f(x2, t2|x1, t1) dx2,   (3.5)

which is known as the Chapman-Kolmogorov Equation (Figure 3.1), where the time ordering in the integrand is essential.

Remarks:

1. The Chapman-Kolmogorov equation is a functional equation relating all conditional probability densities f(xi, ti|xj, tj) for a Markov process; its solution would give us a complete description of any Markov process. Unfortunately, no general solution to this equation is known.

2. The converse is also true: if f(x1, t1) and f(x2, t2|x1, t1) obey the consistency conditions, Eqs. 3.4 and 3.5, then they uniquely define a Markov process.

3. From the meaning of f(x, t|x0, t0), it is clear that we must have f(x, t|x0, t0) → δ(x − x0) as t → t0.

For example, an important Markov process is the one whose transition probability is normal, i.e. the Gaussian,

f(x2, t2|x1, t1) = [1/√(4πD(t2 − t1))] exp[−(x2 − x1)²/(4D(t2 − t1))].
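That this Gaussian satisfies the Chapman-Kolmogorov equation can be checked numerically: composing two transitions through an intermediate point x2 must reproduce a single transition over the combined time interval. The quadrature sketch below is my own check (D, the grid, and the chosen times are illustrative values):

```python
import numpy as np

def p(x2, x1, dt, D=1.0):
    """Gaussian transition density of the text, with variance 2*D*dt."""
    return np.exp(-(x2 - x1)**2 / (4 * D * dt)) / np.sqrt(4 * np.pi * D * dt)

x2 = np.linspace(-20.0, 20.0, 40001)     # intermediate variable, integrated out
dx = x2[1] - x2[0]

x1, x3 = 0.0, 1.5
t12, t23 = 1.0, 1.0

# Chapman-Kolmogorov, Eq. 3.5: integrate over the intermediate point x2.
lhs = np.sum(p(x3, x2, t23) * p(x2, x1, t12)) * dx
rhs = p(x3, x1, t12 + t23)
```

The agreement reflects the familiar fact that the convolution of two Gaussians is a Gaussian whose variances add, here 2D·t12 + 2D·t23 = 2D·(t12 + t23).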

Figure 3.1: Chapman-Kolmogorov equation. The intermediate variable x2 is integrated over to provide a connection between x1 and x3.

This is the probability density of the so-called Wiener-Lévy process,

  f(x, t) = [1 / √(4πDt)] exp[ − x² / (4Dt) ],  (t ≥ 0).   (3.6)

This process describes the position of a Brownian particle according to Einstein. It is a Markov process, and it is Gaussian, but despite appearances it is not a stationary stochastic process, because

  ⟨ξ(t1)ξ(t2)⟩ = 2D min(t1, t2),

which is not a function of the time-difference only (Exercise 1). Notice, however, that the transition probability f(x2, t2 | x1, t1) does depend on (x2 − x1, t2 − t1) only. Convince yourself that it obeys the Chapman-Kolmogorov equation.

For a stationary Markov process, it is convenient to use a special notation. We let

  f(x2, t2 | x1, t1) ≡ p(x2 | x1, τ),  τ = t2 − t1,

in terms of which the Chapman-Kolmogorov equation reads

  p(x3 | x1, τ + τ′) = ∫_{−∞}^{∞} p(x3 | x2, τ′) · p(x2 | x1, τ) dx2,   (3.7)
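Because everything here is an explicit Gaussian, Eq. 3.7 can be checked by direct numerical quadrature. A minimal sketch (the values of D, τ, τ′ and the grid are arbitrary illustrative choices, not taken from the text):

```python
import math

def p(x2, x1, tau, D=1.0):
    # Gaussian transition density of the Wiener-Levy process (Eqs. 3.6/3.7)
    return math.exp(-(x2 - x1)**2 / (4*D*tau)) / math.sqrt(4*math.pi*D*tau)

# check p(x3|x1, tau+tau') = integral of p(x3|x2, tau') p(x2|x1, tau) dx2
D, tau, taup, x1, x3 = 1.0, 0.5, 0.3, 0.2, 1.0
dx = 0.01
grid = [i*dx - 20.0 for i in range(4001)]     # quadrature nodes on [-20, 20]
lhs = p(x3, x1, tau + taup, D)
rhs = sum(p(x3, x2, taup, D) * p(x2, x1, tau, D) for x2 in grid) * dx
print(lhs, rhs)   # the two sides agree to within the quadrature error
```

The composition of two Gaussian kernels is again Gaussian with summed variances, which is exactly what the quadrature confirms.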

where τ′ = t3 − t2, and (τ′, τ) > 0 because of the time-ordering.

The Ornstein-Uhlenbeck process is the best-known example of a stationary Markov process; it is defined by:

  f(x1) = [1 / √(π(D/β))] exp[ − x1² / (D/β) ],

  p(x2 | x1, τ) = [1 / √(π(D/β)(1 − e^{−2βτ}))] exp[ − (x2 − x1 e^{−βτ})² / ((D/β)(1 − e^{−2βτ})) ].   (3.8)

This process describes the velocity of a Brownian particle according to Ornstein and Uhlenbeck. The Ornstein-Uhlenbeck process is stationary, Gaussian and Markovian; Doob (1942) proved that it is “essentially” the only process with these three properties. It is straight-forward to verify that this process satisfies the Chapman-Kolmogorov equation, and that the correlation function is exponential (Exercise 2),

  B(τ) ∝ e^{−β|τ|},  where  B(τ) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x1 x2 p(x2 | x1, τ) f(x1) dx1 dx2,

the latter expression applying when the stationary Markov process has zero mean. We also have

  ∫_{−∞}^{∞} p(x2 | x1, τ) dx2 = 1,  ∫_{−∞}^{∞} p(x2 | x1, τ) f(x1) dx1 = f(x2).
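The Chapman-Kolmogorov equation and the stationarity conditions above can likewise be verified numerically for the kernel (3.8); a sketch, again with arbitrary illustrative parameter values:

```python
import math

def p_ou(x2, x1, tau, D=1.0, beta=1.0):
    # Ornstein-Uhlenbeck transition density (Eq. 3.8); v = (D/beta)(1 - e^{-2 beta tau})
    v = (D/beta) * (1 - math.exp(-2*beta*tau))
    return math.exp(-(x2 - x1*math.exp(-beta*tau))**2 / v) / math.sqrt(math.pi*v)

def f_st(x, D=1.0, beta=1.0):
    # stationary density f(x1)
    return math.exp(-x**2 / (D/beta)) / math.sqrt(math.pi*D/beta)

dx = 0.01
grid = [i*dx - 15.0 for i in range(3001)]
tau, taup, x1, x3 = 0.7, 0.4, 0.5, -0.3
lhs = p_ou(x3, x1, tau + taup)
rhs = sum(p_ou(x3, x2, taup) * p_ou(x2, x1, tau) for x2 in grid) * dx
stat = sum(p_ou(x3, y, tau) * f_st(y) for y in grid) * dx
print(lhs, rhs, stat, f_st(x3))   # C-K composition and stationarity both hold
```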

3.3 Master Equation

Aside from providing a consistency check, the real importance of the Chapman-Kolmogorov equation is that it enables us to build up the conditional probability densities over the “long” time interval (t1, t3) from those over the “short” intervals (t1, t2) and (t2, t3). It turns out this is an incredibly useful property.

In physical applications, we generally have well-developed theories describing in detail (microscopically) the evolution of a system out of equilibrium. These theories are formulated in terms of differential equations describing the trajectories of the many (∼ 10²³) particles constituting the system – such as Hamilton’s equations in classical mechanics or Schrödinger’s equation in quantum mechanics. These descriptions are deterministic, and if we could solve the initial value problems for them, we would not need to think of approximations such as the splitting of time scales mentioned above, for we would have the solution for all time. [Even in that case, we would have a big puzzle on our hands, for while our fundamental microscopic theories are time-reversal invariant, such symmetry is lost at the macroscopic level and irreversibility dominates the evolution of physical phenomena. How does this irreversibility come about? This is one of the classic problems of statistical mechanics (see M. C. Mackey, Time’s Arrow: The Origins of Thermodynamic Behavior (Dover, 2003), for an interesting discussion).]

There are several methods in physics that provide the microscopic dynamics over very short time scales, generating the transition probability between two states during the time interval ∆t, as ∆t → 0 (e.g. time-dependent perturbation theory in quantum mechanics). This is not enough, however, for we need to know the evolution of the system on a time scale of the order of the time it takes to perform an experiment. It is here, namely in bridging these two time scales, that the Markovian assumption helps: from our knowledge of the transition probability at small times we can build up our knowledge of the transition probability at all time iteratively from the Chapman-Kolmogorov equation.

For a large class of systems, it is possible to show that over very short time τ′, the transition probability is

  p(x | z, τ′) = (1 − a0 τ′) δ(x − z) + τ′ w(x | z) + o(τ′),   (3.9)

where w(x | z) is the transition probability per unit time, and a0 is the

zeroth jump moment,

  a0(z) = ∫_{−∞}^{∞} w(x | z) dx.

The physical content of Eq. 3.9 is straightforward – it simply says that:

  [the probability that a transition (z → x) occurs] + [the probability that no transition occurs during that time (i.e. z = x)] = the transition probability of moving from z to x during time τ′.

Substitute Eq. 3.9 into the Chapman-Kolmogorov equation (Eq. 3.7); we write

  p(x3 | x1, τ + τ′) = ∫_{−∞}^{∞} [(1 − a0(x2) τ′) δ(x3 − x2) + τ′ w(x3 | x2)] p(x2 | x1, τ) dx2
   = (1 − a0(x3) τ′) p(x3 | x1, τ) + τ′ ∫_{−∞}^{∞} w(x3 | x2) p(x2 | x1, τ) dx2.   (3.10)

Re-arranging, dividing by τ′, and recalling the definition of a0 (cf. Eq. 3.10),

  [p(x3 | x1, τ + τ′) − p(x3 | x1, τ)] / τ′ = − ∫_{−∞}^{∞} w(x2 | x3) p(x3 | x1, τ) dx2 + ∫_{−∞}^{∞} w(x3 | x2) p(x2 | x1, τ) dx2.

This formulation is particularly natural in systems where the fluctuations arise from approximating a discrete stochastic process by a continuous deterministic model – for example, models of chemical reaction kinetics, predator-prey dynamics, radioactive decay, particle collisions, etc.; any model where it is the particulate nature of the variables that necessitates a stochastic formulation.

Passing to the limit τ′ → 0, we have¹

  ∂p(x3 | x1, τ)/∂τ = ∫_{−∞}^{∞} [w(x3 | x2) p(x2 | x1, τ) − w(x2 | x3) p(x3 | x1, τ)] dx2,   (3.11)

which is generally called the Master Equation. Note that it is a conservation equation of the gain-loss type, and there is of course a discrete version,

  d pn(t)/dt = Σ_m [wnm pm(t) − wmn pn(t)],   (3.12)

where the label n refers to the possible states of the stochastic process ξ(t). Assuming the transition probabilities are stationary, the constants wnm denote the probability of a transition from state m to state n in a small time increment dt.

The big difference between the Chapman-Kolmogorov equation and the master equation is that the Chapman-Kolmogorov equation is a nonlinear equation (in the transition probabilities) that expresses the Markov character of the process, but contains no information about any particular Markov process. In the master equation, by contrast, one considers the transition probability at short times, w(xj | xi), as a given function determined by the specific physical system, and the resulting equation is linear in the conditional probability density which determines the (mesoscopic) state of that system.

The derivation and utility of the master equation formalism is best appreciated by example. We first consider the simplest master equation, the birth-and-death process (see Section 3.5), which is how Smoluchowski formulated his study of Brownian motion.

3.4 Stosszahlansatz

At this point, it is well worth taking a step back and examining the various assumptions that have been made so far (cf. Section 3.3). By beginning with a general functional equation expressing a consistency condition for Markov processes (the Chapman-Kolmogorov equation), but containing no information about any particular Markov process, we are able to express the

¹This limit is taken in an asymptotic sense: τ′ is much smaller than the characteristic time scale over which p(x3 | x1, τ) varies.
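The discrete master equation (3.12) is easy to integrate directly. The sketch below uses a hypothetical 3-state rate matrix (the numerical rates are arbitrary choices, not from the text) with a forward-Euler step; note that the gain-loss structure conserves total probability exactly:

```python
# w[n][m] is the transition probability per unit time from state m to state n
w = [[0.0, 1.0, 0.5],
     [2.0, 0.0, 1.0],
     [0.5, 3.0, 0.0]]

def rhs(p):
    # dp_n/dt = sum_m [ w_nm p_m - w_mn p_n ]   (Eq. 3.12)
    return [sum(w[n][m]*p[m] - w[m][n]*p[n] for m in range(3)) for n in range(3)]

p, dt = [1.0, 0.0, 0.0], 1e-3       # start with certainty in state 0
for _ in range(20000):              # integrate out to t = 20
    dp = rhs(p)
    p = [p[n] + dt*dp[n] for n in range(3)]
print(p, sum(p))    # sum(p) stays 1; p relaxes to the stationary distribution
```

Because the right-hand side of (3.12) sums to zero over n, the Euler update preserves normalization to machine precision, and the long-time state is the stationary distribution where gain balances loss in every state.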

Chapman-Kolmogorov equation as a discrete-differential equation (the master equation) that turns out to be far more amenable to analysis. The mathematical derivation is straightforward, but it is important that the underlying physical assumptions are clear.

What is essential to bear in mind is that the limit ∆t → 0 used in the derivation of the master equation is not a true limit in the mathematical sense (cf. Eq. 3.11 on page 59) – although ∆t → 0, we must remember that ∆t is still long enough on the microscopic scale that the microscopic variables are allowed to relax to their equilibrium state, so that the microscopic state is (approximately) independent at times t and t + ∆t. The master equation rests upon the assumption that on the time scale ∆t over which the observable state evolves, all of the microscopic auxiliary variables (for example, the motion of the solvent molecules colliding with a Brownian particle) assume their stationary equilibrium distributions. Thus at every time step ∆t, the microscopic variables are perturbed from equilibrium, then rapidly relax to their new equilibrium distribution; the equilibrium distribution of the microscopic variables at time t + ∆t depends only upon the state of the system at time t. This is called the repeated randomness assumption, or the Stosszahlansatz.

A more detailed description of the physical derivation of the master equation is provided by van Kampen in “Fluctuations in Nonlinear Systems,” appended to these notes, and quoted in part below:

 We are concerned with systems that consist of a very large number N of particles. In classical theory, the precise microscopic state of the system is described by 6N variables x1, . . . , x3N, p1, . . . , p3N. They obey the 6N microscopic equations of motion. The gross, macroscopic aspect of the state is described by a much smaller number of variables Q1, . . . , Qn, which are functions of x1, . . . , p3N. For convenience we suppose that apart from the energy there is just one other Q. Experience tells us the remarkable fact that this macroscopic variable Q(x1, . . . , p3N) obeys again a differential equation

  Q̇ = F(Q),   (3.13)

 which permits to uniquely determine its future values from its value at some initial instant. The phenomenological law (3.13) is not a purely mathematical consequence of the microscopic equations of motion. The reason why it exists can be roughly

understood as follows.

 Using the equations of motion one has

  Q̇ = Σ_{k=1}^{3N} [ (∂Q/∂xk) ẋk + (∂Q/∂pk) ṗk ] ≡ g(x1, . . . , p3N).

 The variables in g may be expressed in Q and the energy (which we do not write explicitly), and 6N − 2 remaining variables ϑ1, . . . , ϑ_{6N−2}. Hence

  Q̇ = f(Q, ϑ).

 This may also be written

  Q(t + ∆t) − Q(t) = ∫_t^{t+∆t} f[Q(t′), ϑ(t′)] dt′.

 Now suppose that Q(t) varies much more slowly than the ϑλ (which is the reason it is macroscopic). It is then possible to pick ∆t such that Q(t) does not vary much during ∆t, while the ϑλ practically run through all their possible values (ergodic theorem with fixed value for Q). Hence one may substitute in the integral Q(t) for Q(t′) and replace the time integration by an average over that part of the phase space that corresponds to given values of the energy and Q:

  Q(t + ∆t) − Q(t) = ∆t · ⟨f[Q(t), ϑ]⟩ = ∆t · F[Q(t)].

 It should be emphasized that this implies that at each time t the ϑλ vary in a sufficiently random way to justify the use of a phase space average (“repeated randomness assumption”).

 Fluctuations arise from the fact that, in the relevant part of phase space, f is not exactly equal to its average F. Hence Q(t + ∆t) is no longer uniquely determined by Q(t), but has a probability distribution around it. More precisely, there exists a transition probability W(q′ | q): ∆t W(q′ | q) dq′ is the probability that, if Q has the value q at time t, the value of Q(t + ∆t) will lie between q′ and q′ + dq′. The probability distribution P(q, t) of Q at any time t then obeys the rate equation

  ∂P(q, t)/∂t = ∫ {W(q | q′) P(q′, t) − W(q′ | q) P(q, t)} dq′.   (3.14)

 Again a repeated randomness assumption is involved, namely that at each time the ϑλ are sufficiently random to justify the identification of probability with measure in phase space.

  – N. G. van Kampen, Fluctuations in Nonlinear Systems.

This is the general form of the master equation. In the following examples, it is the derivation of the master equation that is emphasized; discussion of actual solution methods is postponed to Chapter 4 – but by all means, read ahead if you’re curious.

3.5 Example – One-step processes

Many stochastic processes are of a special type called one-step process, birth-and-death process or generation-recombination process. They are continuous-time Markov processes whose range consists of the integers n, and whose transition probability per unit time (i.e. wnm) permits only jumps between adjacent sites:

  wnm = rm δ_{n,m−1} + gm δ_{n,m+1},  (m ≠ n),
  wnn = 1 − (rn + gn),

where rn is the probability per unit time that, being at site n, a jump occurs to site n − 1, and gn is the probability per unit time that, being at site n, a jump occurs to site n + 1 (Figure 3.2).

Figure 3.2: Schematic illustration of a birth-death process. Generation rates g_{n−1}, g_n and recombination rates r_n, r_{n+1} connect the neighbouring sites n − 1, n, n + 1.

Smoluchowski applied this model to the study of Brownian motion by setting the generation and recombination probabilities to 1/2: gn = rn = 1/2. Defining the probability density p_{n|m,s} as the probability that a random walker beginning at site n at time t = 0 will be at site m after s steps, the master equation for this unbounded random walk is

  p_{n|m,s+1} = (1/2) p_{n|m−1,s} + (1/2) p_{n|m+1,s}.   (3.15)

Let ν = |m − n| be the net distance traveled by the random walker; then one can show that p_{n|m,s} is given by

  p_{n|m,s} = (1/2^s) · s! / [((s+ν)/2)! ((s−ν)/2)!],  for ν ≤ s and ν + s even;
  p_{n|m,s} = 0,  otherwise.   (3.16)

It is possible to show that this discrete random walk formulation is equivalent with Einstein’s diffusion representation of free Brownian motion (Exercise 6a).

Note that it is often the case in the literature that authors will not explicitly write the conditioning of the distribution by the initial state, preferring instead the short-hand p_{n|m,s} ≡ p_{m,s} or p_m(s). One must always remember, however, that any distribution governed by a master equation (or any other dynamical equation) is necessarily conditioned by the initial distribution.
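The closed form (3.16) can be cross-checked against direct iteration of the master equation (3.15); written with a binomial coefficient, the formula is C(s, (s+ν)/2)/2^s. A short sketch:

```python
from math import comb

def p_walk(nu, s):
    # Eq. 3.16, rewritten as C(s, (s+nu)/2) / 2^s
    if nu > s or (nu + s) % 2 == 1:
        return 0.0
    return comb(s, (s + nu)//2) / 2**s

# iterate Eq. 3.15 starting from a walker localized at site 0
s_max, probs = 10, {0: 1.0}
for _ in range(s_max):
    new = {}
    for m, q in probs.items():
        new[m-1] = new.get(m-1, 0.0) + 0.5*q
        new[m+1] = new.get(m+1, 0.0) + 0.5*q
    probs = new
print(probs[4], p_walk(4, s_max))   # both equal C(10, 7) / 2**10
```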

3.6 Example – Bernoulli’s Urns and Recurrence

• D. Bernoulli (1770) “Disquisitiones analyticae de novo problemate conjecturali,” Novi Commentarii Academiae Scientiarum Imperialis Petropolitanae XIV: 3–25.
• P. S. Laplace (1812) Théorie Analytique des Probabilités. Livre II, Théorie Générale des Probabilités, Chapter 3.
• M. Jacobsen (1996) “Laplace and the origin of the Ornstein-Uhlenbeck process,” Bernoulli 2: 271–286.

The following problem was posed by Bernoulli (see Figure 3.3): There are n white balls and n black balls in two urns, A and B, and each urn contains n balls. The balls are moved cyclically, one by one, from one urn to the other. What is the probability z_{x,r} that after r cycles, urn A will contain x white balls?

Figure 3.3: Bernoulli’s urn model. A) Two urns, A and B, each contain n balls, with n of the 2n balls white and n black. A ball is drawn at random from each urn, and then the ball that came from A is placed in urn B, and the ball that came from B is placed in urn A. B) Laplace’s approximation of the fraction of white balls in urn A (x̂ = x/n) after a very long time: the distribution ρ̂(x̂) is Gaussian, centered at x̂ = 1/2, with standard deviation 1/√(8n).

In attempting to solve this problem, Laplace derived the following difference equation for the evolution of z_{x,r} – in words,

  Prob{x white balls in urn A after r + 1 cycles} =
   Prob{x + 1 white balls in urn A after r cycles} × Prob{drawing a white ball from urn A} × Prob{drawing a black ball from urn B}
   + Prob{x white balls in urn A after r cycles} × [Prob{white ball from urn A} × Prob{white ball from urn B} + Prob{black ball from urn A} × Prob{black ball from urn B}]
   + Prob{x − 1 white balls in urn A after r cycles} × Prob{drawing a black ball from urn A} × Prob{drawing a white ball from urn B}.   (3.17)

The difference equation for this process is (much) more concisely written as

  z_{x,r+1} = ((x+1)/n)² z_{x+1,r} + (1 − (x−1)/n)² z_{x−1,r} + 2 (x/n)(1 − x/n) z_{x,r},   (3.18)

where the first term represents the loss of a white ball from urn A, the second term represents the gain of a white ball by urn A, and the remaining term contributes no change in the number of white balls in urn A.

Notice that in contrast to the master equation for the simple birth-death process described in the preceding section, Eq. 3.18 has nonlinear transition rates. If the transition rates are nonlinear, it is very difficult to find an explicit solution for the probability z_{x,r}, so Laplace sought an approximate solution instead. His approach will be described in some detail
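Before turning to the approximation, the difference equation (3.18) can be sanity-checked against direct Monte Carlo simulation of the drawing process; a sketch for small, illustrative values of n and r (the seed and trial count are arbitrary):

```python
import random

n, r, trials = 6, 4, 200000
random.seed(3)

def step(z):
    # one cycle of Eq. 3.18 applied to a distribution z over x = 0..n
    znew = [2*(x/n)*(1 - x/n)*z[x] for x in range(n + 1)]
    for x in range(1, n + 1):
        znew[x-1] += (x/n)**2 * z[x]          # loss of a white ball: x -> x-1
    for x in range(n):
        znew[x+1] += (1 - x/n)**2 * z[x]      # gain of a white ball: x -> x+1
    return znew

z = [0.0]*(n + 1); z[n//2] = 1.0              # start with 3 white balls in urn A
for _ in range(r):
    z = step(z)

counts = [0]*(n + 1)
for _ in range(trials):
    x = n//2
    for _ in range(r):
        white_A = random.random() < x/n       # draw from urn A (x white of n)
        white_B = random.random() < (n - x)/n # urn B holds n - x white balls
        x += (not white_A and white_B) - (white_A and not white_B)
    counts[x] += 1
freq = [c/trials for c in counts]
print(max(abs(freq[x] - z[x]) for x in range(n + 1)))   # ~ Monte Carlo error
```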

because the method is essentially the same as van Kampen’s linear noise approximation, which we will meet again in Section 5.1.

First, Laplace assumes the number of balls n is large, so that changes x ± 1 are almost infinitesimal with respect to the density of white balls x/n, allowing the differences to be replaced with differentials:

  z_{x±1,r} ≈ z_{x,r} ± ∂z_{x,r}/∂x + (1/2) ∂²z_{x,r}/∂x²,
  z_{x,r+1} ≈ z_{x,r} + ∂z_{x,r}/∂r.   (3.19)

Second, Laplace makes a change of variables²,

  x = n/2 + √n µ,  r = n r′.   (3.20)

With Eq. 3.19, along with Eq. 3.20, the difference equation (3.18) leads to a linear partial differential equation governing the distribution U(µ, r′) of µ:

  ∂U(µ, r′)/∂r′ = 2 ∂(µ U(µ, r′))/∂µ + (1/4) ∂²U(µ, r′)/∂µ².   (3.21)

This is an example of a Fokker-Planck equation (cf. Chapter 6), and in particular, it characterizes what is called an Ornstein-Uhlenbeck process, which we met in Chapter 1 with Ornstein and Uhlenbeck’s study of the velocity of a Brownian particle³. Assuming that n is large enough that the range of µ is unbounded (µ ∈ (−∞, ∞)), one can easily verify that the steady-state solution of Eq. 3.21 is the Gaussian

  lim_{r′→∞} U(µ, r′) = U(µ) = (2/√π) e^{−4µ²}.   (3.22)

Reverting to the original variables by writing a continuous probability density ρ̂(x̂) for the fraction of white balls in urn A (x̂ = x/n),

  ρ̂(x̂) = 2 √(n/π) exp[ −4n (x̂ − 1/2)² ],   (3.23)

²We shall see in Section 5.1 that this change of variables is tantamount to assuming that the density of white balls, x/n, is distributed about the “deterministic” value 1/2, with some “fluctuations” parameterized by µ, and that the magnitude of these fluctuations scales with √n.
³The connection with the work of Ornstein and Uhlenbeck becomes apparent if one makes the substitution kT/m = 1/8 in their stationary velocity distribution (Chapter 1).
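The quality of Laplace’s Gaussian (3.23), with mean 1/2 and variance 1/(8n), can be checked by iterating the exact difference equation (3.18) to its stationary state; a sketch with an illustrative choice of n:

```python
# Iterate the urn difference equation (3.18) to stationarity and compare the
# mean and variance of x^ = x/n with Laplace's Gaussian (3.23).
n = 50
z = [0.0]*(n + 1); z[n] = 1.0       # all white balls start in urn A
for _ in range(5000):
    znew = [2*(x/n)*(1 - x/n)*z[x] for x in range(n + 1)]
    for x in range(1, n + 1):
        znew[x-1] += (x/n)**2 * z[x]
    for x in range(n):
        znew[x+1] += (1 - x/n)**2 * z[x]
    z = znew

mean = sum((x/n)*z[x] for x in range(n + 1))
var = sum((x/n - mean)**2 * z[x] for x in range(n + 1))
print(mean, var, 1/(8*n))   # variance is close to the predicted 1/(8n)
```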

which is a Gaussian distribution, with mean 1/2 and variance 1/8n. As the number of balls increases (n → ∞), the density becomes more narrowly peaked at x̂ = 1/2, approaching a delta-function. In Section 4.1 (on page 84), we shall see that Laplace’s approximation is extremely good all the way down to n ≈ 10.

A model very similar to Bernoulli’s urn model was proposed at the beginning of the 20th century by Paul and Tatiana Ehrenfest to clarify some foundational concerns that continued to undermine Boltzmann’s formulation of statistical mechanics (see Exercise 7). The trouble lay with the precise meaning of irreversibility in thermodynamics. For example, a hot liquid will spontaneously, and irreversibly, lose heat to the surrounding environment until the temperature of the liquid is in equilibrium with the surroundings. Yet, if we treat the system as a microscopic dynamical system, Poincaré proved that almost every state of the system will be revisited eventually, to an arbitrarily prescribed degree of accuracy (this is Poincaré’s Wiederkehrsatz). Zermelo argued that the irreversibility of thermodynamics and the recurrence properties of dynamical systems are incompatible. Boltzmann replied that Poincaré’s Wiederkehrsatz is a mathematical result, true in the limit of infinite time, but that in practice the time-span between the recurrence of an unlikely state is unimaginably long, and so the statistical mechanical processes underlying thermodynamics are irreversible for all intents and purposes.

One can make Boltzmann’s argument quantitative using the following simple urn model. Imagine 2R balls, numbered consecutively from 1 to 2R, distributed in two boxes (I and II), so that at the beginning there are R + n (−R ≤ n ≤ R) balls in box I. We choose a random integer between 1 and 2R (all integers are supposed to be equiprobable) and move the ball, whose number has been drawn, from the box in which it is to the other box. This process is repeated s times, and we ask for the probability Q_{R+m,s} that after s drawings there will be R + m balls in box I.

M. Kac proves in his classic paper on Brownian motion (M. Kac (1947) “Random walk and the theory of Brownian motion,” The American Mathematical Monthly 54: 369–391) that irrespective of how the initial state is prepared, every possible state is visited with probability one – this is Poincaré’s Wiederkehrsatz. Nevertheless, Kac also shows that excursions from equilibrium return exponentially quickly (Newton’s law of cooling) and that if the number of balls is large and the initial state is far from equilibrium, then the recurrence time is enormously long. Specifically,

neighbouring states are visited often: if we begin with R = 10000 and n = 0 (i.e. half of the 20000 balls in Urn I), and each drawing takes 1 second, then on average we must wait about 100√π ∼ 175 seconds for the state to re-occur. On the other hand, if we begin with R = 10000 and n = 10000 (i.e. all 20000 balls in Urn I), then on average we will have to wait more than 10^6000 years (!) for this state to re-occur. The number of draws, s_recur, required on average for the recurrence of a state beginning with R + n balls in one urn is given by

  s_recur = [ (R + n)! (R − n)! / (2R)! ] · 2^{2R}.

3.7 Example – Chemical Kinetics

• D. McQuarrie (1967) “Stochastic approaches to chemical kinetics,” Journal of Applied Probability 4: 413.
• D. T. Gillespie (1977) “Exact stochastic simulation of coupled chemical reactions,” Journal of Physical Chemistry 81: 2340.
• J. Elf and M. Ehrenberg (2003) “Fast evaluation of fluctuations in biochemical networks with the linear noise approximation,” Genome Research 13: 2475.

Figure 3.4: Noise in chemical reaction kinetics. [Schematic of the reversible association reaction A + B ⇌ C, with forward rate k+ and reverse rate k−.]

Very often, the kinetics of chemical reactions are described in terms of deterministic chemical rate equations, which take the form of a system of coupled nonlinear differential equations. Underlying that formulation is the implicit assumption that the concentration of the reactants varies both continuously and differentiably. For moles of reactants (i.e. molecule numbers of the order 10^23), close to equilibrium, these assumptions are perfectly justified
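The recurrence-time formula above can be verified by Monte Carlo simulation of the Ehrenfest drawing process for a very small system; here R = 2 with all four balls starting in box I, so the formula gives s_recur = 4!·0!/4! · 2⁴ = 16 draws (the seed and trial count are arbitrary choices):

```python
import math, random

def s_recur(R, n):
    # average number of draws for the state "R + n balls in box I" to recur
    return (math.factorial(R + n) * math.factorial(R - n)
            / math.factorial(2*R)) * 4**R      # 2^(2R) = 4^R

random.seed(1)
R, n0, trials = 2, 2, 20000                    # 4 balls, all starting in box I
total = 0
for _ in range(trials):
    k, steps = R + n0, 0                       # k = balls currently in box I
    while True:
        # the drawn ball is in box I with probability k/(2R)
        k += -1 if random.randrange(2*R) < k else 1
        steps += 1
        if k == R + n0:
            break
    total += steps
print(total/trials, s_recur(R, n0))   # Monte Carlo average vs the exact 16
```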

since a change of one or two molecules in a population of 10^23 is, for all intents and purposes, infinitesimal. That accounts for the great success of deterministic models in most macroscopic systems, including freshman chemistry labs (Figure 3.4).

Inside living cells, however, reactant numbers tend to be of the order 1–1000. A reaction altering the population by one or two therefore generates a large relative change, and the molecule numbers no longer evolve differentiably (Figure 3.5). Furthermore, reactions no longer occur ‘continuously’ over an infinitely small time interval, but rather progress in a series of steps of finite time width. For small pools of reactants, the mathematical formulation becomes more delicate.

By way of analogy, one can imagine the national birth rate as compared to the chances my next-door neighbor will have a baby. One often hears statements such as: “Every 10 minutes, a baby is born in this country.” That clearly cannot be true of my next-door neighbor. Evolution of the population of an entire country can be well-described using differential equations, but the evolution of the population of a small town occurs in a stochastic, step-wise manner. This example illustrates the dichotomy of discrete evolution of individuals on the one hand, and (nearly) continuous evolution of the population density on the other.

It is straightforward to argue that in the gas phase (or a very dilute solution), the dynamics of chemical reaction networks can be reasonably described by a master equation governing the probability density for the molecule numbers n,

  ∂P(n, t)/∂t = Σ_{n′} [ w_{nn′} P(n′, t) − w_{n′n} P(n, t) ],   (3.24)

where w_{nn′} denotes the transition probability from the state n′ to the state n. In contrast with examples studied so far, P(n, t) is usually a multivariate probability distribution, depending upon state variables for all the products and reactants of interest in the network. The transition probabilities w_{ij} are generalizations of the deterministic reaction rates. We can make this relationship explicit by writing the number of individuals n as being proportional to a density X,

  n = Ω · X,

with the constant of proportionality Ω being a measure of the system size. In chemical reaction dynamics, Ω is usually the volume of the reaction vessel; in the urn example above, Ω was the total number of balls in one urn.
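As a concrete instance of Eq. 3.24, the sketch below integrates the simplest chemical master equation, a birth-death scheme in which molecules appear at a constant rate α and are degraded at rate β per molecule; its stationary distribution is Poisson with mean α/β. The rates and truncation are illustrative choices, not values from the text:

```python
import math

alpha, beta, N = 5.0, 1.0, 60              # truncate the state space at N
p = [0.0]*(N + 1); p[0] = 1.0
dt, steps = 2e-3, 10000                    # integrate out to t = 20 >> 1/beta
for _ in range(steps):
    new = p[:]
    for m in range(N + 1):
        if m < N:
            new[m+1] += dt * alpha * p[m]  # birth out of state m ...
            new[m]   -= dt * alpha * p[m]  # ... balanced by its loss term
        new[m] -= dt * beta * m * p[m]     # death out of state m ...
        if m > 0:
            new[m-1] += dt * beta * m * p[m]
    p = new

poisson = [math.exp(-alpha/beta) * (alpha/beta)**m / math.factorial(m)
           for m in range(N + 1)]
print(max(abs(p[m] - poisson[m]) for m in range(N + 1)))   # small residual
```

The gain-loss bookkeeping conserves total probability exactly, and the long-time solution matches the Poisson distribution to well within the integration error.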

These fluctuations arise due to the probabilistic nature of individual reaction events and the finite change in molecule numbers incurred.

Figure 3.5: Noise in chemical reaction kinetics. a) For many reactant molecules (C ∼ 1000), the species concentrations evolve both continuously and differentiably. b) When small numbers of reactants are involved (C ∼ 10 or C ∼ 100), the concentration [C](t) evolves step-wise, although as the reactant numbers increase, the relative size of the jumps decreases. c) Repeating an experiment many times, we typically obtain some repeatable averaged behavior that conforms very nearly to the deterministic description, and some envelope around the average that accounts for the fluctuations in individual trajectories. The width of the probability distribution scales roughly as 1/√N, where N is the number of molecules. Inset: We can imagine the full evolution of the system as composed of two parts: the deterministic evolution of [C](t) and a probability distribution for the fluctuations that moves along [C](t).

and in fact we shall find that the coefficients appearing in the deterministic

rate equations are simply proportional to the mean of an exponential distribution.

For multidimensional master equations evolving over a discrete state space, it is convenient to introduce several new objects: the step-operator E, the stoichiometry matrix S and the propensity vector ν.

The step-operator E is short-hand for writing evolution over discrete space. The action of the operator is defined in the following way: E_i^k increments the i-th variable by an integer k. That is, for a function f(n1, n2, . . . , ni, . . .) depending upon several variables,

  E_i^k f(n1, n2, . . . , ni, . . .) = f(n1, n2, . . . , ni + k, . . .).   (3.25)

Generally, we record the reaction propensities in a vector ν and the stoichiometries in a matrix S, defined such that when the j-th reaction νj occurs it increments the i-th reactant by an integer Sij:

  ni → ni + Sij.

The collection of the elements Sij compose the stoichiometry matrix. The stoichiometry matrix S describes how much each species changes with the completion of a given reaction, while the propensity vector ν describes the rate at which a particular reaction proceeds. An example should make this notation more clear.

Example: Coupled Poisson processes – To introduce the stoichiometry matrix and the propensity vector, consider a simple linear two-state model:

  n1 →(ν1) n1 + 1,  ν1 = α1,
  n1 →(ν2) n1 − 1,  ν2 = β1 · n1/Ω,
  n2 →(ν3) n2 + 1,  ν3 = α2 · n1/Ω,
  n2 →(ν4) n2 − 1,  ν4 = β2 · n2/Ω.   (3.26)

All of the transition rates are linear or constant, making it possible to solve for the probability distribution P(n1, n2, t) exactly, at least in principle (see Section 5.2 on page 113). We shall not attempt to do so here, but focus instead on how to represent the reaction network in a convenient manner. In the present example, the stoichiometry matrix is

         ν1   ν2   ν3   ν4
  n1   [  1   −1    0    0 ]
  n2   [  0    0    1   −1 ]

where each column of the stoichiometry matrix corresponds to a particular reaction, and each row to a particular reactant.

Using this notation, the deterministic rate equations can be written in terms of the stoichiometry matrix and the propensity vector. In a deterministic system, where the number of molecules is very large, in the limit Ω → ∞ with ni/Ω = xi held constant,

  dx/dt = lim_{Ω→∞} S · ν,   (3.27)

or, explicitly as a system of coupled linear ordinary differential equations,

  dx1/dt = α1 − β1 · x1,
  dx2/dt = α2 · x1 − β2 · x2.

The real convenience of describing the reaction network in terms of S and ν comes in writing the master equation. Piecing together each term in the equation from the various reactions can be tedious, but an explicit expression exists – for a system of R reactions and N reactants (i.e. n ∈ R^N), the master equation is

  ∂P/∂t = Ω Σ_{j=1}^{R} [ Π_{i=1}^{N} E_i^{−Sij} − 1 ] νj(n) P(n, t),   (3.28)

where νj is the microscopic reaction propensity (with units of concentration per time), explained below. Furthermore, for numerical and analytic approximations of the solution of the master equation, most schemes are concisely written in terms of S and ν.

Brief note about microscopic reaction propensities ν

For nonlinear transition rates, the propensity appearing in the master equation is not identical with the propensity in the deterministic rate equations (Eq. 3.27). The difference is not difficult to understand. Consider, for example, the dimerization reaction,

  X + X → X2  (rate constant a).

In a deterministic system, where the number of molecules is very large, the rate of accumulation of the product X2 is written

  d[X2]/dt = a [X]².   (3.29)
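Given S and ν, a stochastic trajectory of scheme (3.26) can be generated with Gillespie’s direct method (Section 4.2.1). The sketch below is a minimal illustration: the rate constants, system size and simulation length are arbitrary choices, not values from the text, and the long-time averages are compared against the deterministic fixed point x1* = α1/β1 = 10, x2* = α2·x1*/β2 = 20 of Eq. 3.27:

```python
import random

random.seed(0)
a1, b1, a2, b2, Om = 10.0, 1.0, 2.0, 1.0, 50.0
S = [(1, 0), (-1, 0), (0, 1), (0, -1)]       # stoichiometry: one column per reaction

n1 = n2 = 0
t = t_prev = 0.0
acc1 = acc2 = 0.0
t_end = 100.0
while t < t_end:
    prop = [a1*Om, b1*n1, a2*n1, b2*n2]      # Omega * nu_j, in molecules per time
    a0 = sum(prop)
    t += random.expovariate(a0)              # exponentially distributed waiting time
    acc1 += n1*(t - t_prev)                  # time-weighted state averages
    acc2 += n2*(t - t_prev)
    t_prev = t
    u, j = random.uniform(0, a0), 0          # choose which reaction fired
    while u > prop[j]:
        u -= prop[j]; j += 1
    n1 += S[j][0]; n2 += S[j][1]
print(acc1/t/Om, acc2/t/Om)   # near the deterministic fixed point (10, 20)
```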

Microscopically, what we mean by a reaction event is that two molecules of X find one another with sufficient energy that they form the dimer X2. The probability for the reaction to occur is then proportional to the number of ways two molecules can collide,

  d[X2]/dt ∝ (1/2) (nX)(nX − 1),   (3.30)

where nX is the number of molecules of X in a reaction vessel of volume Ω. The last term on the right-hand side is (nX − 1) because we need at least two molecules to have a reaction, and the 1/2 factor comes from not double-counting each reactant – one molecule colliding with another is the same as the other colliding with the one. Here, and throughout, to the order of approximation that we will be concerned with in this course, the fraction accounting for different permutations of the reactants will be absorbed into the rate of reaction so that we can write

  ν(nX/Ω) = (a′/2) · (nX/Ω) · ((nX − 1)/Ω)  →(nX → ∞)  ν̃([X]) = a · [X] · [X],   (3.31)

(microscopic reaction rate on the left, macroscopic reaction rate on the right, with a′/2 = a). In numerical simulation of chemical reaction kinetics (Section 4.2.1), the microscopic reaction propensities ν are used, while in asymptotic solutions of the master equation (Section 5.2.1), the macroscopic propensities ν̃ are sufficient.

Suggested References

The text by Gardiner,

• Handbook of stochastic methods (3rd Ed.), C. W. Gardiner (Springer, 2004),

has a thorough discussion of Markov processes. The text by Gillespie,

• Markov Processes, D. T. Gillespie (Academic Press, Inc., 1992),

uses a distinctive approach to Markov processes that some readers may enjoy. The mathematics is a bit heavy, but the connections among many aspects of Markov processes are united into a single framework. The pedagogical article by Kac (pronounced ‘Katz’),
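The gap between the microscopic and macroscopic propensities in Eq. 3.31 is of order 1/nX, which is easy to see numerically by holding the concentration [X] fixed while growing the system size (the values of a and [X] below are illustrative):

```python
# microscopic vs macroscopic dimerization propensity (Eq. 3.31), with a'/2 = a
a = 1.0

def nu_micro(nX, Om):
    return a * (nX/Om) * ((nX - 1)/Om)       # the a'/2 factor is absorbed into a

def nu_macro(X):
    return a * X * X

X = 2.0                                      # hold the concentration fixed
for Om in (10, 100, 1000, 10000):
    nX = X * Om
    print(Om, nu_micro(nX, Om), nu_macro(X)) # relative gap shrinks like 1/nX
```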

The pedagogical article by Kac (pronounced 'Katz'),

• M. Kac (1947) "Random walk and the theory of Brownian motion," The American Mathematical Monthly 54: 369–391,

is a classic, as is the lengthy review by Chandrasekhar,

• S. Chandrasekhar (1943) "Stochastic problems in physics and astronomy," Review of Modern Physics 15: 1–89.

Finally, Elf and Ehrenberg present the stoichiometry matrix and propensity vector in the context of several applications,

• J. Elf and M. Ehrenberg (2003) "Fast evaluation of fluctuations in biochemical networks with the linear noise approximation," Genome Research 13: 2475.

Exercises

1. Wiener process: Use the defining features of the Wiener process X(t) - the stationary-independent increments and the 1st-order probability density given by (Eq. 3.6),

f(x, t) = (1/√(4πDt))·exp[−x²/(4Dt)]   (t ≥ 0),

to show the following:

(a) The autocovariance is given by ⟨X(t1)X(t2)⟩ = 2D·min(t1, t2).

(b) The increments of the Wiener process are un-correlated on disjoint time-intervals, i.e.

⟨[X(t4) − X(t3)][X(t2) − X(t1)]⟩ = 0,  for any t4 > t3 > t2 > t1.

2. Ornstein-Uhlenbeck process: In the following, use the defining expressions for the Ornstein-Uhlenbeck process X(t) (Eq. 3.8),

f(x1) = (1/√(π(D/β)))·exp[−x1²/(D/β)],

p(x2|x1, τ) = (1/√(π(D/β)(1 − e^{−2βτ})))·exp[−(x2 − x1·e^{−βτ})²/((D/β)(1 − e^{−2βτ}))]   (where τ ≡ t2 − t1 > 0),

to show the following:

(a) Compute explicitly the autocorrelation function ⟨X(t1)X(t2)⟩.

(b) Show that the compatibility condition is satisfied for t2 > t1,

∫_{−∞}^{∞} f(x1, t1; x2, t2) dx1 = f(x2, t2).

(c) Show that the increments of the Ornstein-Uhlenbeck process are negatively correlated on disjoint time intervals, i.e.

⟨[X(t4) − X(t3)][X(t2) − X(t1)]⟩ < 0,  for any t4 > t3 > t2 > t1.

(d) If Z(t) is the integral of the Ornstein-Uhlenbeck process X(t),

Z(t) = ∫_0^t X(τ) dτ,

then find ⟨Z(t1)Z(t2)⟩. Hint: It may be convenient to use cumulants.

(e) If Z(t) is the integral of the Ornstein-Uhlenbeck process X(t), then find ⟨cos[Z(t1) − Z(t2)]⟩.

3. Chapman-Kolmogorov equation: Consider a stochastic process X(t) which starts from X(0) = 0, takes continuous values x and is homogeneous in time and space (i.e. has stationary independent increments), such that

f_{1|1}(x2, t2|x1, t1) = f(x2 − x1, t2 − t1)  for t2 ≥ t1 ≥ 0,

where f(x, t) is the first-order probability density.

(a) Show that the Chapman-Kolmogorov equation,

f(x, t) = ∫_{−∞}^{∞} f(x − y, t − τ)·f(y, τ) dy,

is satisfied if the first-order cumulant-generating function has the form ln G(k, t) = t·g(k), where g(k) is an arbitrary function of k. Hint: Use the Fourier transform to convert the Chapman-Kolmogorov equation into a functional equation for the characteristic function G(k, t).

(b) Assuming that the time evolution of the process is governed by the (known) transition probability per time of the form w(x′|x) = w(x′ − x), apply the Fourier transform to the master equation for f(x, t) and show that,

f(x, t) = (1/2π) ∫_{−∞}^{∞} exp[ −ikx + t ∫_{−∞}^{∞} w(x′)·(e^{ikx′} − 1) dx′ ] dk.
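In Fourier space the Chapman-Kolmogorov convolution becomes a product, so the condition ln G(k, t) = t·g(k) makes the factorization immediate: G(k, t) = G(k, t − τ)·G(k, τ). A tiny numerical illustration (not from the text) uses the Wiener process of Exercise 1, whose characteristic function is G(k, t) = exp(−Dk²t), i.e. g(k) = −Dk²; the value of D is arbitrary.

```python
from math import exp

D = 0.7  # arbitrary diffusion coefficient

def G(k, t):
    # characteristic function of the Wiener process: ln G = -D k^2 t = t*g(k)
    return exp(-D * k * k * t)

# Chapman-Kolmogorov in Fourier space: G(k, t) = G(k, t - tau) * G(k, tau)
for k in (0.0, 0.5, 1.3):
    for t, tau in ((2.0, 0.5), (3.0, 1.0)):
        assert abs(G(k, t) - G(k, t - tau) * G(k, tau)) < 1e-12
```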

4. Dichotomic Markov process (Random telegraph process): The function Y(t) is a Markov process with range {−1, 1}, switching between the two at a rate γ.

(a) Construct the master equation for the dichotomic Markov process.

(b) Solve the master equation with the initial condition y(t0) = y0 to obtain the conditional probability density P(y, t|y0, t0).

(c) Show that this is consistent with the single-point probability P(y, t) = (1/2)·(δ_{y,−1} + δ_{y,1}), where δij is the Kronecker delta. Compute the steady-state autocorrelation function for this process.

(d) Repeat the above for a Markov process Y(t) with a range {a, b} and asymmetric transition probabilities: a →(α) b and b →(β) a.

5. Lead-probability: Suppose that in a coin-toss game with unit stakes, Albert bets on heads, and Paul bets on tails. The probability that Albert leads in 2r out of 2n tosses is called the lead probability P_{2r,2n}. One can show that,

P_{2r,2n} = C(2r, r)·C(2n − 2r, n − r)·2^{−2n}.

(a) For a coin tossed 20 times, what is the most likely number of tosses for which Albert is in the lead? Guess first, then make a table of P_{2r,2n} for n = 10 and r = 1, 2, . . . , 10.

(b) Show that the probability that r/n < x is given by

f(x) ∼ (2/π)·arcsin √x,  as n → ∞.

For a game where the coin is tossed every second for one year, what is the probability that one of the players will be in the lead less than one day out of 365?

(c) For a one-dimensional random walk, would you expect a given realization to spend most of its time on the negative axis, the positive axis, or to hover near the origin, spending equal time on the positive and negative axes?

6. Bernoulli's urn model and the approximation of Laplace: The urn model of Bernoulli has nonlinear transition rates, and therefore cannot be solved in general. Laplace, by a clever substitution, derived an approximate evolution equation.
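As a quick check of the formula for P_{2r,2n} (and a head start on part (a) of Exercise 5), the sketch below tabulates the lead probability for n = 10 using exact rational arithmetic. The table is U-shaped: never leading (r = 0) and always leading (r = n) are the most likely outcomes, consistent with the arcsine law of part (b).

```python
from fractions import Fraction
from math import comb

def lead_probability(n, r):
    # P_{2r,2n} = C(2r, r) * C(2n - 2r, n - r) * 2^(-2n), kept exact
    return Fraction(comb(2 * r, r) * comb(2 * (n - r), n - r), 4 ** n)

n = 10
table = [lead_probability(n, r) for r in range(n + 1)]

assert sum(table) == 1           # the lead probabilities are normalized
assert table[0] == table[n]      # symmetric under r -> n - r
assert table[0] > table[n // 2]  # U-shaped: the extremes are most likely
print([float(p) for p in table])
```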

(a) Solve Laplace's partial differential equation, Eq. 3.21, at steady-state, i.e. set the left-hand side equal to 0, and solve the resulting ordinary differential equation for U(µ), assuming U(µ), U′(µ) → 0 exponentially as µ → ±∞. Comparing the solution with the canonical Gaussian probability density (cf. Eq. A.19 on page 241), and using the change of variables Eq. 3.20, what is the variance of the steady-state distribution for the white ball density x̂ = x/n? What can you say about the equilibrium state as n → ∞?

(b) Multiply Eq. 3.21 by µ and integrate from µ ∈ (−∞, ∞) to obtain an evolution equation for the average ⟨µ⟩. From the resulting equation, show that ⟨µ⟩ returns to µ = 0 exponentially quickly; that is, deviations from the equilibrium state x = n/2 are restored exponentially quickly (Newton's law of cooling). Hint: Integrate by parts, assuming that the probability distribution and its first derivative both vanish as µ → ±∞.

(c) Repeat the above, but this time multiply by µ² and integrate from µ ∈ (−∞, ∞) to obtain an evolution equation for the variance ⟨µ²⟩. Show that the variance asymptotically approaches a non-zero steady-state. (Use the same boundary conditions as above: U(µ), U′(µ) → 0 exponentially as µ → ±∞.) How does this steady-state compare with the results obtained in question 6a?

7. Ehrenfests' urn model: The urn model proposed by Paul and Tatiana Ehrenfest to clarify the idea of irreversibility in thermodynamics is the following: Imagine 2R balls, numbered consecutively from 1 to 2R, distributed in two boxes (I and II), so that at the beginning there are R + n (−R ≤ n ≤ R) balls in box I. We choose a random integer between 1 and 2R (all integers are supposed to be equiprobable) and move the ball, whose number has been drawn, from the box in which it is to the other box. This process is repeated s times and we ask for the probability Q_{R+m,s} that after s drawings there will be R + m balls in box I.

(a) Write out the master equation governing the evolution of the probability distribution Q_{R+m,s}.

(b) Following the method of Laplace discussed in Section 3.6, define a suitable continuous fluctuation variable µ and derive a partial differential equation for the evolution of the probability distribution for µ.

(c) Multiply the partial differential equation derived in 7b by µ and integrate from µ ∈ (−∞, ∞) to obtain an evolution equation for the average ⟨µ⟩. From the resulting equation, show that ⟨µ⟩ returns to µ = 0 exponentially quickly, i.e., deviations from the equilibrium state are restored exponentially quickly (Newton's law of cooling). Hint: Integrate by parts, assuming that the probability distribution and its first derivative both vanish exponentially as µ → ±∞.

8. Malthus's Law: Malthus's law for population growth assumes that the birth and death rates are proportional to the number of individuals, i.e. b(n) = βn and d(n) = αn, with α and β given constants.

(a) Use the probability generating function to solve the master equation for f(n, t) = P{N(t) = n | N(0) = n0} and show that, in the case α = β, the probability of extinction goes as,

f(0, t) = [1 + (βt)^{−1}]^{−n0},

where n0 is the initial population.

(b) Solve the deterministic rate equation for ⟨N(t)⟩ and ⟨N²(t)⟩. Describe the time-dependence of the variance σ²(t) for the cases when β > α, β < α, and β = α.

9. Step-operator (Eq. 3.25):

(a) Prove the identity

Σ_{N=0}^{∞} g(N)·E f(N) = Σ_{N=1}^{∞} f(N)·E^{−1} g(N)   (3.32)

for any pair of functions f, g such that the sum converges.

(b) Consider the decay of a radioactive isotope X →(a) A, where A is inert and does not enter into the equations.
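The extinction formula in Exercise 8(a) can be checked against a direct simulation of the critical (α = β) birth-death process - a minimal sketch, with arbitrary choices of rate, initial population, observation time, and number of runs. With β = 1, n0 = 3 and t = 2, the formula predicts [1 + 1/(βt)]^{−n0} = (2/3)³ ≈ 0.296.

```python
import random
from math import log

random.seed(7)
alpha = beta = 1.0     # critical case: death rate = birth rate
n0, T, runs = 3, 2.0, 20_000

extinct = 0
for _ in range(runs):
    n, t = n0, 0.0
    while n > 0:
        rate = (alpha + beta) * n                    # total event rate
        t += log(1.0 / (1.0 - random.random())) / rate  # exponential waiting time
        if t > T:
            break
        # birth with probability beta/(alpha+beta), death otherwise
        n += 1 if random.random() < beta / (alpha + beta) else -1
    if n == 0:
        extinct += 1

predicted = (1.0 + 1.0 / (beta * T)) ** (-n0)  # f(0, t) from Exercise 8(a)
assert abs(extinct / runs - predicted) < 0.02
```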

i. Write out the master equation governing the probability distribution of X. What is the deterministic equation describing this process?

ii. Compute the equation governing the mean ⟨X⟩ by multiplying the master equation by X and summing over all X.

iii. Use Eq. 3.32 to show that the mean satisfies the deterministic rate equation. Try to do the same without using the explicit formula provided. This is a general characteristic of master equations with linear transition rates.

10. Chemical reaction networks: The time-evolution of species in a chemical reaction network is often represented by a Markov process, characterized by the master equation.

(a) Write out the master equation for the simple linear network (3.26).

(b) A "toggle switch" network consists of two mutually repressing species, r1 and r2 – if r1 is high, synthesis of r2 is low, and, conversely, if r2 is high, synthesis of r1 is kept low. A simple network describing this system is the following four reactions:

r1 →(ν1) r1 + 1,  ν1 = α·gR(r2/Ω),
r2 →(ν2) r2 + 1,  ν2 = α·gR(r1/Ω),
r1 →(ν3) r1 − 1,  ν3 = β·(r1/Ω),
r2 →(ν4) r2 − 1,  ν4 = β·(r2/Ω),

where the function gR(x) is high if x is low and low if x is high. Write out the stoichiometry matrix S and the propensity vector ν for the toggle switch. Write out the deterministic reaction rate equations and the master equation.

Chapter 4

Solution of the Master Equation

The master equation derived in Chapter 3 provides a foundation for most applications of stochastic processes. Although it is more tractable than the Chapman-Kolmogorov equation, it is still rare to find an exact solution. More often, the transition rates are nonlinear, and approximation methods are the only recourse. We shall explore two popular and powerful approximation methods: numerical simulation algorithms (Section 4.1) and a perturbation expansion called the linear noise approximation (Chapter 5).

4.1 Exact Solutions

There are very few general methods for solving the master equation. We shall discuss two of these in detail. The first is called the moment generating function and is used to transform the master equation into a linear partial differential equation. The method only works if the transition probabilities are linear in the state variables. As such, it is restricted to use on rather artificial and uninteresting examples. Furthermore, if the dimensionality of the system is high, the algebra is formidable and the auxiliary partial differential equation may not be amenable to solution either. The second method we shall discuss relies upon re-writing the master equation in vector-matrix notation. The steady-state solution can then be computed as the eigenvector of a given transfer matrix.

The method is semi-exact since the eigenvector is usually computed by numerical matrix iteration. Furthermore, the method becomes difficult to implement if the dimensionality of the system is large or the range of the state space is unbounded. Nevertheless, the method can be very useful, even with nonlinear transition probabilities.

4.1.1 Moment Generating Functions

The moment generating function Q(z, t), associated with the probability P(n, t), is

Q(z, t) = Σ_{n=0}^{∞} z^n P(n, t),   (4.1)

very similar to the z-transform used by electrical engineers¹. Call P(n, t) the probability of finding the system in state n at time t (conditioned, as always, by some initial distribution). The moment generating function is so-named because the moments of P(n, t) are generated by subsequent derivatives of Q(z, t):

Q(1, t) = Σ_{n=0}^{∞} P(n, t) = 1   [Normalization condition on P(n, t)],   (4.2)

∂Q(z, t)/∂z |_{z=1} = Σ_{n=0}^{∞} n·z^{n−1} P(n, t) |_{z=1} = ⟨n(t)⟩,   (4.3)

∂²Q(z, t)/∂z² |_{z=1} = Σ_{n=0}^{∞} n·(n − 1)·z^{n−2} P(n, t) |_{z=1} = ⟨n²(t)⟩ − ⟨n(t)⟩.   (4.4)

Multiplying both sides of the master equation by z^n and summing over all n allows the discrete-differential master equation for P(n, t) to be transformed into a partial differential equation for Q(z, t). A simple example should clarify the procedure.

Example – Poisson process. Consider a simple birth-death process, with constant birth rate gn = α and linear death rate rn = β × n.

¹The z-transform is usually defined over an unbounded domain, X(z) = Σ_{n=−∞}^{∞} xn/z^n, for the sequence {xk}_{k=−∞}^{∞}, with z^n appearing in the denominator. See Chapter 3 of G. James, "Advanced modern engineering mathematics (3rd Ed.)" (Prentice-Hall, 2004), and Section B.3 on p. 265.

The master equation corresponding to this process is,

dP(n, t)/dt = α·[P(n−1, t) − P(n, t)] + β·[(n+1)·P(n+1, t) − n·P(n, t)],   (4.5)

or, rearranging slightly,

dP(n, t)/dt = [α·P(n−1, t) + β·(n+1)·P(n+1, t)] − [α·P(n, t) + β·n·P(n, t)].   (4.6)
               GAIN                                 LOSS

A particularly elegant short-hand, which shall be useful in the following section on approximation methods, involves the step-operator E^k_i (recall Eq. 3.25 on page 71). The operator acts by finding the ith entry of n and incrementing it by an integer k:

E^k_i f(..., ni, ...) = f(..., ni + k, ...).   (4.7)

Using the step-operator, Eq. 4.6 is written,

dP(n, t)/dt = α·(E^{−1} − 1)·P(n, t) + β·(E^{1} − 1)·[n·P(n, t)].   (4.8)

Multiplying by z^n and summing over all n,

(d/dt) Σ_n z^n P(n, t) = α·(z − 1)·Σ_n z^n P(n, t) + β·(1 − z)·Σ_n n·z^{n−1} P(n, t).   (4.9)

Under the transformation Eq. 4.1, Eq. 4.9 becomes a simple partial differential equation for Q(z, t),

∂Q(z, t)/∂t = α·(z − 1)·Q(z, t) − β·(z − 1)·∂Q(z, t)/∂z.   (4.10)

We have transformed a discrete-differential equation (difficult) into a linear first-order partial differential equation (easier). The full time-dependent solution Q(z, t) can be determined using what is called the method of characteristics. Instead of the full time-dependent distribution, we shall focus upon the first two moments in steady-state. Solving the transformed Eq. 4.10 at steady-state (∂Q/∂t = 0),

Qs(z) = exp[ (α/β)·(z − 1) ].   (4.11)
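Anticipating the result derived below - that Qs(z) is the generating function of a Poisson distribution with mean α/β - one can verify directly that this Poisson distribution balances gain and loss in the stationary master equation, Eq. 4.5. A small sketch (the values of α and β are arbitrary):

```python
from math import exp, factorial

alpha, beta = 3.0, 1.5
mu = alpha / beta

def p(n):
    # Poisson distribution with mean mu = alpha/beta
    return exp(-mu) * mu ** n / factorial(n)

# steady state of Eq. 4.5:
#   alpha*P(n-1) + beta*(n+1)*P(n+1) = (alpha + beta*n)*P(n)  for every n
for n in range(0, 30):
    gain = (alpha * p(n - 1) if n > 0 else 0.0) + beta * (n + 1) * p(n + 1)
    loss = (alpha + beta * n) * p(n)
    assert abs(gain - loss) <= 1e-12 * loss
```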

The steady-state moments follow immediately, from the definition of the moment generating function (Eq. 4.1),

∂Qs/∂z |_{z=1} = α/β = ⟨n⟩_s,   (4.12)

∂²Qs/∂z² |_{z=1} = (α/β)² = ⟨n²⟩_s − ⟨n⟩_s,   (4.13)

with steady-state variance,

σ² = ⟨n²⟩_s − ⟨n⟩_s² = α/β.   (4.14)

Having mean equal to the variance is the footprint of a Poisson process. In fact, expanding Qs(z) as an infinite series,

Qs(z) = exp[ (αm/βm)·(z − 1) ] = exp[−αm/βm]·Σ_n z^n (αm/βm)^n / n!,

we recognize the coefficients of the series as the probability distribution of a Poisson process,

pn = e^{−µ}·µ^n / n!,  with µ = αm/βm.

We can measure how close to Poisson distributed a given process is by considering the Fano factor, σ²/⟨n⟩. In our case (since our process is Poisson distributed),

σ²/⟨n⟩ = 1.   (4.15)

Alternately, the fractional deviation η = σ/⟨n⟩ is a dimensionless measure of the fluctuations and often provides better physical insight than the Fano factor. In our case,

η = 1/√⟨n⟩,   (4.16)

substantiating the rule-of-thumb that relative fluctuations scale roughly as the square-root of the number of reactants (cf. Laplace's solution to Bernoulli's urn problem, Eq. 3.20 on page 66). We shall exploit this scaling in Chapter 5, discussing the linear noise approximation.
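The Fano factor can also be estimated by simulation, anticipating Gillespie's algorithm of the next section. The sketch below simulates the birth-death process with the arbitrary choices α = 10, β = 1, and computes the time-averaged mean and variance; both should be close to α/β = 10, giving a Fano factor near 1.

```python
import random
from math import log

random.seed(1)
alpha, beta = 10.0, 1.0

n = 0          # current copy number
t_tot = 0.0    # total elapsed time
s1 = s2 = 0.0  # time-weighted sums of n and n^2

for _ in range(200_000):
    a = alpha + beta * n                        # total propensity
    tau = log(1.0 / (1.0 - random.random())) / a  # exponential waiting time
    t_tot += tau
    s1 += n * tau
    s2 += n * n * tau
    # birth with probability alpha/a, death otherwise
    n += 1 if random.random() < alpha / a else -1

mean = s1 / t_tot
var = s2 / t_tot - mean ** 2
fano = var / mean
assert abs(mean - alpha / beta) < 1.0   # time-averaged mean ~ alpha/beta
assert abs(fano - 1.0) < 0.15           # Fano factor ~ 1 (Poisson)
```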

The moment generating function method will only work if the transition rates are linear in the state variable – in complete analogy with the Laplace transform for ordinary differential equations. A (semi)-exact method that can be used in the case of nonlinear transition rates is matrix iteration, a method we shall discuss in the next section.

4.1.2 Matrix Iteration

Discrete master equations derived for systems with low dimensionality (i.e. one or two state variables) can often be usefully re-written in vector-matrix notation, with the probability of finding the system in a (discrete) state n ∈ (0, N), written pn, occupying elements in a 1 × (N + 1) vector p, and the transition probabilities occupying the elements of an (N + 1) × (N + 1) matrix W. The evolution of the probability distribution is defined as an iterative matrix multiplication,

p(t + 1) = W·p(t),  so that  p(t) = W^t·p(0).   (4.18)

The steady-state distribution is then the eigenvector corresponding to the λ = 1 eigenvalue of the matrix Ws obtained from multiplying W by itself an infinite number of times,

Ws = lim_{t→∞} W^t.   (4.19)

Notice, also, that the continuous time master equation,

∂pn/∂t = Σ_{n′} [ w_{nn′}·p_{n′} − w_{n′n}·pn ],   (4.17)

can likewise be expressed in terms of the transition matrix W,

∂pn/∂t = W·pn,  where  W_{nn′} = w_{nn′} − δ_{nn′}·Σ_{n″} w_{n″n}.

Although Eq. 4.17 could be generalized to an unbounded state space n ∈ (−∞, ∞), it is clearly most useful for events restricted to a bounded domain.

A theorem by Perron and Frobenius states that if W is irreducible – systems whose deterministic counterparts have a single asymptotically stable fixed point – then there is a unique stationary distribution ps; that is, the steady-state probability distribution ps corresponds to the unit eigenvector of Ws,

Ws·ps = ps.

Furthermore, if the system is 'aperiodic', then once Ws is calculated, multiplication with any unit vector yields ps. For many processes, the transition matrix W does not depend upon time. Furthermore, for one-step processes, the transition matrix takes the particularly simple tri-diagonal form,

W = [ T0  T−  0   0   ···
      T+  T0  T−  0
      0   T+  T0  T−
      0   0   T+  T0  ···
                      ⋱  ],

where it should be understood that the elements T0, T+ and T− are themselves functions of the row in which they appear. If W is a separable transition matrix,

W = [ A  0
      0  B ],

different initial conditions converge to different equilibrium distributions (Exercise 1).

Example – Bernoulli's urn model. Matrix iteration provides a particularly convenient method to estimate the accuracy of Laplace's approximate solution of the Bernoulli urn problem (see Section 3.6 on page 64). The following problem was posed by Bernoulli (see Figure 3.3 on page 64): There are n white balls and n black balls in two urns, and each urn contains n balls. The balls are moved cyclically, one by one, from one urn to another. What is the probability z_{x,r} that after r cycles, urn A will contain x white balls?

Figure 4.1: Steady-state distribution for the fraction of white balls in one urn of Bernoulli's urn model, for increasing numbers of balls. A) Number of balls in one urn, n = 10. B) n = 50. C) n = 100. The eigenvector of the repeated multiplication of the transition matrix is shown as filled circles, with Laplace's approximation (cf. Eq. 3.23 on page 66) shown as a solid line. As n → ∞, Laplace's approximation becomes indistinguishable from the 'exact' numerical solution.

In attempting to solve this problem, Laplace derived the following difference equation for the evolution of z_{x,r},

z_{x,r+1} = ((x+1)/n)²·z_{x+1,r} + 2·(x/n)·(1 − x/n)·z_{x,r} + (1 − (x−1)/n)²·z_{x−1,r}.   (4.20)

Assuming the number of balls is large (n → ∞), so that changes in the density x/n are nearly continuous, along with the ansatz that x = n/2 + µ·√n, Laplace arrived at a partial differential equation governing the probability distribution for µ,

∂U(µ, r′)/∂r′ = 2·∂[µ·U(µ, r′)]/∂µ + (1/2)·∂²U(µ, r′)/∂µ².   (4.21)

The steady-state solution for the density x̂ = x/n is the Gaussian, centered at x̂ = 1/2 with variance 1/8n,

ρ(x̂) = 2·√(n/π)·exp[ −4n·(x̂ − 1/2)² ].   (4.22)

Re-writing Laplace's difference equation in vector-matrix notation allows us to use matrix iteration to estimate the error in Laplace's approximation. As the size of the transition matrix becomes larger (n → ∞), however, repeated multiplication becomes computationally demanding.

Writing the probability z_{x,r} as a vector z(r) ∈ R^{n+1}, where z_i(r) ≡ z_{i−1,r}, the transition rates appearing in the ith row are given by,

W_{i,i+1} = (i/n)²,   W_{i,i} = 2·((i−1)/n)·(1 − (i−1)/n),   W_{i,i−1} = (1 − (i−2)/n)².   (4.23)

For example, with n = 4 balls in each urn, W is the 5 × 5 matrix,

W = [ 0     1/16  0    0     0
      1     3/8   1/4  0     0
      0     9/16  1/2  9/16  0
      0     0     1/4  3/8   1
      0     0     0    1/16  0 ].

The steady-state transition matrix Ws is given by,

Ws = (1/70)·[ 1   1   1   1   1
              16  16  16  16  16
              36  36  36  36  36
              16  16  16  16  16
              1   1   1   1   1 ],

so that the steady-state probability distribution is z = (1/70)·[1 16 36 16 1]. Continuing in the same fashion for larger n, we see that Laplace's approximation of the distribution is quite accurate even down to n = 10 (Figure 4.1).

4.2 Approximation Methods

There are two broad classes of approximation methods - numerical simulation algorithms and perturbation methods. Each has clear advantages and disadvantages.

1. Numerical Simulation: The classic reference for these types of algorithms is the paper by Gillespie:

D. T. Gillespie (1977) "Exact stochastic simulation of coupled chemical reactions," Journal of Physical Chemistry 81: 2340.

The method simulates a single trajectory n(t) that comes from the unknown probability distribution P(n, t) characterized by the master equation.
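The n = 4 computation is easy to reproduce. The sketch below builds W from Eq. 4.23 using exact rational arithmetic, confirms that (1/70)·[1, 16, 36, 16, 1] is an exact eigenvector with eigenvalue 1, and shows that matrix iteration from an arbitrary initial condition converges to the same distribution.

```python
from fractions import Fraction as F

n = 4  # balls of each colour; states x = 0..n white balls in urn A

def column(x):
    # one cycle of Bernoulli's urn (Eq. 4.20): from state x, move to x-1, x, or x+1
    col = [F(0)] * (n + 1)
    col[x] = 2 * F(x, n) * (1 - F(x, n))   # stay at x
    if x > 0:
        col[x - 1] = F(x, n) ** 2          # down to x - 1
    if x < n:
        col[x + 1] = (1 - F(x, n)) ** 2    # up to x + 1
    return col

# W[i][j] = probability of moving from state j to state i (columns sum to 1)
cols = [column(x) for x in range(n + 1)]
W = [[cols[j][i] for j in range(n + 1)] for i in range(n + 1)]

# the claimed steady state is an exact eigenvector of W with eigenvalue 1
ps = [F(1, 70), F(16, 70), F(36, 70), F(16, 70), F(1, 70)]
assert [sum(W[i][j] * ps[j] for j in range(n + 1)) for i in range(n + 1)] == ps

# matrix iteration from an arbitrary start converges to the same distribution
p = [1.0, 0.0, 0.0, 0.0, 0.0]
for _ in range(200):
    p = [sum(float(W[i][j]) * p[j] for j in range(n + 1)) for i in range(n + 1)]
assert max(abs(pi - float(si)) for pi, si in zip(p, ps)) < 1e-9
```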

2. Perturbation Methods: The classic reference for the perturbation scheme we will consider is the paper by van Kampen:

N. G. van Kampen (1976) "Expansion of the Master equation," Advances in Chemical Physics 34: 245.

For perturbation methods, the discrete jump in n(t) that occurs with each reaction is treated as a nearly deterministic process. The 'nearly' is what we use as a perturbation parameter.

The Brusselator

We shall develop both the stochastic simulation algorithm and the linear noise approximation in the context of a specific example – the Brusselator. The Brusselator is an often-used pedagogical example of a system exhibiting limit cycle behaviour over a range of parameter values.

• S. Strogatz, Nonlinear dynamics and chaos: With applications to physics, biology, chemistry and Engineering (Perseus Books Group, 2001).

The model is a two-species chemical network described by the following reactions,

∅ →(γ) X1,   2X1 + X2 →(a) 3X1,   X1 →(b) X2,   X1 →(δ) ∅.

Without loss of generality, the time and volume are scaled so that γ = δ = 1 to give the deterministic rate equations,

dX1/dt = 1 + a·X1²·X2 − (b + 1)·X1,
dX2/dt = −a·X1²·X2 + b·X1,   (4.24)

and consequently, the deterministic rate equations admit a single stable equilibrium point (X1, X2) = (1, b/a) for all choices of parameters with b < 1 + a. In terms of the molecule numbers (n1, n2), the re-scaled master equation governing the probability density P(n1, n2, t) is,

∂P/∂t = Ω·(E1^{−1} − 1)·P + (a/Ω²)·(E1^{−1}E2^{1} − 1)·[n1·(n1 − 1)·n2·P] + (E1^{1} − 1)·[n1·P] + b·(E1^{1}E2^{−1} − 1)·[n1·P].   (4.25)

Figure 4.2: The stability of the Brusselator. a) With suitably chosen units, and with γ = δ = 1, there are two parameters in the Brusselator model, a and b. The stability of the trajectory in phase space is determined by the relationship between a and b in parameter space. Along the critical line b = 1 + a, the system undergoes a Hopf bifurcation. b) For b > 1 + a, a trajectory will move away from the fixed point, and orbit on a closed limit cycle. c) For b < 1 + a, a trajectory will move toward the stable fixed point.

For b > 1 + a, the steady-state is a stable limit cycle (Figure 4.2). Notice that the transition rates are nonlinear in n – as a consequence, there is no general solution method available to determine P(n, t). Nevertheless, it is precisely the nonlinearity of the transition rates that makes this model interesting, and therefore the Brusselator is a terrific example to showcase numerical and analytic approximation methods. We shall express both the stochastic simulation algorithm of Gillespie and the linear noise approximation of van Kampen in terms of the propensity vector ν and the stoichiometry matrix S introduced in Chapter 3 (Section 3.7 on page 71). From the reaction network shown above, the propensity vector and stoichiometry matrix are given by,

ν = [ 1,  a·n1·(n1 − 1)·n2/Ω³,  b·n1/Ω,  n1/Ω ],   (4.26)

and, with rows corresponding to the reactants (n1, n2) and columns to the reactions (ν1, ν2, ν3, ν4),

S = [ 1   1  −1  −1
      0  −1   1   0 ].   (4.27)

4.2.1 Numerical Methods – Gillespie's Algorithm

For master equations with nonlinear transition probabilities, the full distribution P(n, t) can rarely be solved exactly. Ultimately, we are after the conditional probability distribution, and Gillespie's algorithm is a method by which an individual sample path, starting at a given initial point, can be simulated in time such that it conforms to the unknown probability distribution we seek. In this way, we generate a discrete time series for the reactant numbers and, for a sufficiently large population of sample paths, the inferred probability distribution is as near to the exact solution as we wish (analogous with Langevin's modeling of Brownian motion).

To advance the state with each reaction event, we need two random variables – the time for the completion of the next reaction, τ, and the index of the reaction that fires, µ; that is, which of the νj's is completed at time t + τ. The algorithm proceeds in 3 steps:

1. The propensities νj are used to generate a probability distribution for the next reaction time, and τ is drawn from this distribution.

2. The propensities are used to generate a probability distribution for which reaction in the network will occur next; the index of that reaction, µ, is drawn from this distribution.

3. The time is advanced, t → t + τ, and the state is updated using the stoichiometry matrix - for each reactant, ni → ni + Siµ. Repeat.

It is Gillespie's deep insight that allows us to determine the probability distribution for each of these, and in fact how to generate the pair (τ, µ) using a unit uniform random number generator. We shall go over each of these steps in greater detail (see Figure 4.3), and it will be clear that in developing his algorithm, Gillespie built upon some fundamental properties of stochastic processes. It is also instructive to see Gillespie's algorithm in action, so we generate some stochastic simulations of the Brusselator model introduced on page 88.

Details of the stochastic simulation algorithm (Figure 4.3)

• D. T. Gillespie (1977) "Exact stochastic simulation of coupled chemical reactions," Journal of Physical Chemistry 81: 2340.
• Markov Processes, D. T. Gillespie (Academic Press, Inc., 1992).

1. Initialize: t → t0, n → n0.

2. Pick τ according to the density function^a,

p1(τ|n, t) = a(n)·exp[−a(n)·τ].

3. Pick µ according to the density function^b,

p2(∆nµ|n, t) = νµ(n)/a(n).

4. Advance the process: ni → ni + Siµ, t → t + τ (the state is constant between jumps: n(t′) = n for t − τ < t′ < t).

5. Record as required for sampling or plotting. If the process is to continue, then return to 2; otherwise, stop.^c

a) If a(n, t) ≡ a(n), use the inversion method for generating the random number τ (see Section A.4): draw a unit uniform random number r1, and take τ = [1/a(n)]·ln(1/r1).
b) Draw a unit uniform random number r2, and take µ to be the smallest integer for which the sum over νj(n)/a(n) from j = 1 to j = µ exceeds r2.
c) In a simulation run containing ∼10^K jumps, the sum t + τ should be computed with at least K + 3 digits of precision.

Figure 4.3: Gillespie's algorithm for stochastic simulation of the master equation. Notice the jump in the state, ∆nµ = Sµ, where Sµ is the µth column of the stoichiometry matrix. Taken from Gillespie (1992), p. 331.

To advance the state, it is sufficient that we simply generate the random variable specifying which reaction has fired, because the stoichiometry matrix S records how the state advances with each reaction event, carrying the state from n to n + ∆n. In practice, we are after p(n + ∆n, t + τ|n, t)dτ: the probability that, given the system is in state n at time t, the next jump occurs between t + τ and t + τ + dτ, carrying the state from n to n + ∆n.

Following Gillespie, we introduce the probability q(n, t, τ) that the system in state n(t) will jump at some instant between t and t + τ. From the microscopic transition rates ν(n), we know that over an infinitesimal interval dt, at most one jump can occur, so that

q(n, t, dt) = Prob[Reaction 1 occurs] + Prob[Reaction 2 occurs] + · · ·
            = ν1(n)·dt + ν2(n)·dt + · · · = Σ_j νj(n)·dt ≡ a(n)·dt   (4.28)

(Exercise 6). Therefore, the probability that no jump occurs over dt is q⋆(n, t, dt) = 1 − q(n, t, dt), and over a non-infinitesimal interval,

q⋆(n, t, τ) = exp[−a(n)·τ].

In that way, we can factor the conditional probability into two parts, p1(τ) and p2(∆n). The probability p(n + ∆n, t + τ|n, t)dτ is written,

p(n + ∆n, t + τ|n, t) = q⋆(n, t, τ) × a(n)·dτ × w(∆n|n, t + τ),

where the three factors are: the probability the state will NOT jump during [t, t + τ]; the probability the state WILL jump in [t + τ, t + τ + dτ]; and the probability that, given the state jumps at t + τ, it will land in n + ∆n. That is,

p(n + ∆n, t + τ|n, t) = a(n)·exp[−a(n)·τ]·dτ × w(∆n|n, t + τ) ≡ p1(τ|n, t)·p2(∆n|n, t + τ).   (4.29)

The first two terms on the right-hand side determine the next reaction time τ, while the last term determines the next reaction index.


Probability p1(τ) is exponentially distributed; consequently, a unit uniform random number r1 can be used to simulate τ via the inversion (see Section A.4),

τ = (1/a(n))·ln(1/r1).   (4.30)
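Eq. 4.30 is the standard inversion formula for an exponential waiting time. A short sketch confirms that samples generated this way have the right mean, 1/a(n); the propensity value is arbitrary.

```python
import random
from math import log

random.seed(0)
a = 2.5  # total propensity a(n), held fixed for this check

# 1 - r is uniform on (0, 1], avoiding log(1/0); it plays the role of r1
samples = [log(1.0 / (1.0 - random.random())) / a for _ in range(200_000)]

mean = sum(samples) / len(samples)
assert abs(mean - 1.0 / a) < 0.01  # exponential waiting times have mean 1/a(n)
```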

The index of the next reaction, µ, is a little more subtle – notice from Eq. 4.28 that the probability for reaction µ to occur is proportional to the rate of reaction νµ. The normalization condition ensures that the probability that it is reaction µ that has caused the state to jump is,

p2(∆nµ) = νµ(n)/Σ_j νj(n) ≡ νµ(n)/a(n),   (4.31)

where ∆nµ is the µth column of the stoichiometry matrix S. The next reaction index µ is simulated using a unit uniform random number r2 via the integer inversion method (see Exercise 4b on p. 252). That is, the index µ drawn from p2 is the first integer for which

(1/a(n))·Σ_{j=1}^{µ} νj(n) > r2.   (4.32)

With (τ, µ), the system is updated, t → t + τ, ni → ni + Siµ, and the algorithm is repeated as long as desired, each time drawing two random numbers (r1, r2) from a unit uniform random number generator. An implementation of Gillespie's algorithm coded in Matlab is annotated in the Appendix (see Section D.1.2 on p. 277).

Stochastic simulation of the Brusselator model

We shall produce stochastic simulations of the Brusselator model in both the stable and limit-cycle regimes of the model parameter space (see Section D.1.3 on p. 278 for example Matlab code). We shall find that both regimes are equally accessible to the stochastic simulation algorithm. In contrast, the linear noise approximation described in the next section requires a major modification to treat fluctuations around the limit cycle. On the other hand, it is very difficult to describe the simulation results
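For readers who want to experiment without the Matlab code of Appendix D, here is a minimal Python sketch (not from the text) of Gillespie's algorithm applied to the Brusselator, using the propensity vector and stoichiometry matrix of Eqs. 4.26-4.27. In the stable regime, (a, b) = (0.1, 0.2) with Ω = 1000, the trajectory fluctuates about the fixed point (n1, n2) = (Ω, 2Ω); the run length and system size are arbitrary choices.

```python
import random
from math import log

random.seed(42)
Omega, a, b = 1000, 0.1, 0.2

# columns of S: 0 -> X1, 2X1 + X2 -> 3X1, X1 -> X2, X1 -> 0
S = [[1, 1, -1, -1],
     [0, -1, 1, 0]]

def propensities(n1, n2):
    # Omega * nu, the microscopic transition rates (Eq. 4.26)
    return [Omega * 1.0,
            a * n1 * (n1 - 1) * n2 / Omega**2,
            b * n1,
            1.0 * n1]

n = [Omega, 2 * Omega]  # start at the deterministic fixed point (1, b/a)*Omega
t = 0.0
s = [0.0, 0.0]          # time-weighted sums for the means

for _ in range(100_000):
    nu = propensities(*n)
    a0 = sum(nu)
    tau = log(1.0 / (1.0 - random.random())) / a0  # next reaction time (Eq. 4.30)
    r2 = random.random() * a0
    acc, mu = 0.0, len(nu) - 1
    for j, nuj in enumerate(nu):                   # next reaction index (Eq. 4.32)
        acc += nuj
        if acc > r2:
            mu = j
            break
    s[0] += n[0] * tau
    s[1] += n[1] * tau
    t += tau
    n[0] += S[0][mu]
    n[1] += S[1][mu]

mean1, mean2 = s[0] / t, s[1] / t
assert abs(mean1 - Omega) / Omega < 0.15      # <n1> near Omega
assert abs(mean2 - 2 * Omega) / Omega < 0.25  # <n2> near 2*Omega
```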

[Figure 4.4 appears here: two phase-space panels, A and B, plotting the number of X2 molecules (×10³) against the number of X1 molecules (×10³).]
any more than qualitatively, while the perturbation methods provide a connection between system behaviour and model parameters.

Figure 4.4: Gillespie's stochastic simulation algorithm - The Brusselator in the stable regime. a) (a, b) = (0.1, 0.2) (α = 10 in Figure 23 of Gillespie (1977)). The system is very stable. b) (a, b) = (0.5, 1) (α = 2 in Figure 22 of Gillespie (1977)). The system is still stable, but fluctuations are more appreciable.

Stable regime, b < 1 + a: In the stable regime of the model parameter space, the system admits a single, stable equilibrium point. In Figure 4.4, that equilibrium is (x1, x2) = (1000, 2000). The number of reactant molecules is large, so the intrinsic fluctuations are correspondingly small. As the parameters get closer to the Hopf bifurcation, the fluctuations become somewhat larger (Figure 4.4b). Neither plot illustrates particularly interesting behaviour.

Limit cycle, b ≥ 1 + a: At the bifurcation (Figure 4.5), the system parameters are on the threshold of stability and the fluctuations carry the state on long excursions away from the fixed point. From the time-series plot (Figure 4.5a), it appears as if the fluctuations are generating nearly regular oscillations (see Exercise 5 on p. 212). In phase-space (Figure 4.5b), the system seems confined to an elongated ellipse with a negatively-sloped major axis. Beyond the Hopf bifurcation, the system exhibits regular limit cycle oscillations (Figure 4.6). As Gillespie notes in his original article, the horizontal leg of the cycle seems to travel along grooves of negative slope, rather than straight from right to left. There is some spread along the diagonal leg, but both the horizontal and vertical legs are little influenced by the fluctuations (for analytic insight into why that is, see Section 11.2).
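The deterministic backbone of these two regimes is easy to reproduce by integrating the macroscopic rate equations dx1/dt = 1 + a x1²x2 − (b + 1)x1, dx2/dt = b x1 − a x1²x2. The following is a minimal Python sketch using a fixed-step classical RK4 integrator; the step size, horizon, and parameter values are arbitrary choices:

```python
import numpy as np

def brusselator_rhs(x, a, b):
    """Macroscopic Brusselator rate equations."""
    x1, x2 = x
    return np.array([1 + a * x1**2 * x2 - (b + 1) * x1,
                     b * x1 - a * x1**2 * x2])

def integrate(x0, a, b, dt=0.01, t_end=200.0):
    """Classical fourth-order Runge-Kutta integration."""
    x = np.array(x0, dtype=float)
    for _ in range(int(t_end / dt)):
        k1 = brusselator_rhs(x, a, b)
        k2 = brusselator_rhs(x + 0.5 * dt * k1, a, b)
        k3 = brusselator_rhs(x + 0.5 * dt * k2, a, b)
        k4 = brusselator_rhs(x + dt * k3, a, b)
        x += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

stable = integrate((0.5, 0.5), a=1.0, b=1.5)    # b < 1 + a: spiral into (1, b/a)
cycling = integrate((0.5, 0.5), a=1.0, b=3.0)   # b > 1 + a: settle onto the limit cycle
```

For b < 1 + a the trajectory spirals into the fixed point (1, b/a); for b > 1 + a it never settles, circulating on the limit cycle instead.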

[Figure 4.5 here: panel A, number of Xj molecules (×10³) plotted against time; panel B, number of X2 molecules (×10³) plotted against number of X1 molecules (×10³).]
Figure 4.5: Gillespie’s stochastic simulation algorithm - The Brusselator at the Hopf bifurcation. a) Simulation of the system at the Hopf bifurcation, (a, b) = (1, 2) (α = 1 in Figure 21 of Gillespie (1977)). The ﬂuctuations generate what appear to be almost regular oscillations. b) In phase-space, the ﬂuctuations are conﬁned to a large ellipse, angled with negative slope indicating the strong cross-correlation between ﬂuctuations in x1 and x2 .

[Figure 4.6 here: panels A and B, number of X2 molecules (×10³) plotted against number of X1 molecules (×10³).]
Figure 4.6: Gillespie's stochastic simulation algorithm - The Brusselator in the unstable regime. a) (a, b) = (5, 10) (α = 0.2 in Figure 19 of Gillespie (1977)). b) (a, b) = (10, 20) (α = 0.1 in Figure 18 of Gillespie (1977)).

The advantages of the stochastic simulation algorithm are that it is simple to program and that it provides an output trajectory that exactly conforms to the solution distribution of the master equation. The disadvantages are that the original algorithm is computationally expensive and the method


does not scale well as the number of molecules gets large (although there are approximate algorithms that alleviate some of the computational burden). Most importantly, the method suﬀers from the same limitations as any numerical scheme – there is a lack of deep insight into the model and it is diﬃcult to systematically explore diﬀerent regions of parameter space. Nevertheless, Gillespie’s algorithm is the benchmark against which all other methods of solving the master equation are measured.
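For readers without Matlab, the full algorithm of this section can be sketched in a few lines of Python. The propensities below are chosen to match the Brusselator rate equations used in this chapter; the parameter values and system size are arbitrary:

```python
import numpy as np

def gillespie_brusselator(a, b, omega, n0, t_end, rng):
    """Gillespie SSA for the Brusselator.

    Propensities match the macroscopic rate equations
    dx1/dt = 1 + a*x1^2*x2 - (1+b)*x1,  dx2/dt = b*x1 - a*x1^2*x2.
    """
    # stoichiometry matrix S: one column per reaction
    S = np.array([[1, -1, -1,  1],
                  [0,  0,  1, -1]])
    n = np.array(n0, dtype=float)
    t = 0.0
    times, states = [t], [n.copy()]
    while t < t_end:
        n1, n2 = n
        nu = np.array([omega,                        # 0 -> X1
                       n1,                           # X1 -> 0
                       b * n1,                       # X1 -> X2
                       a * n1 * n1 * n2 / omega**2]) # 2X1 + X2 -> 3X1
        a0 = nu.sum()
        if a0 == 0.0:
            break
        r1, r2 = rng.random(2)
        t += np.log(1.0 / r1) / a0                   # next reaction time, Eq. 4.30
        mu = min(np.searchsorted(np.cumsum(nu) / a0, r2, side='right'),
                 len(nu) - 1)                        # next reaction index, Eq. 4.32
        n += S[:, mu]                                # update n_i -> n_i + S_{i,mu}
        times.append(t)
        states.append(n.copy())
    return np.array(times), np.array(states)

rng = np.random.default_rng(0)
times, states = gillespie_brusselator(a=0.1, b=0.2, omega=100,
                                      n0=(100, 200), t_end=5.0, rng=rng)
```

Because every propensity that consumes a species vanishes when that species hits zero, molecule counts can never go negative.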

Suggested References

For numerical simulation methods, the textbook by Gillespie is unsurpassed, along with his seminal article on the stochastic simulation algorithm:
• D. T. Gillespie, Markov processes: An introduction for physical scientists (Academic Press, 1992).
• D. T. Gillespie (1977) "Exact stochastic simulation of coupled chemical reactions," Journal of Physical Chemistry 81: 2340–2361.

Exercises

1. Separable transition matrices: Write out the most general 2 × 2 transition matrix W. Show that for this case, any initial condition p(0) converges to a well-defined steady-state distribution ps, i.e. lim_{t→∞} W^t · p(0) = ps, with two exceptions.
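The convergence in Exercise 1 is easy to watch numerically. A minimal Python sketch with an arbitrary choice of the two off-diagonal rates (presumably the two exceptional cases are w21 = w12 = 0, the identity, and w21 = w12 = 1, the pure period-two flip):

```python
import numpy as np

# A general 2x2 (column-stochastic) transition matrix is fixed by two
# rates, w21 (state 1 -> 2) and w12 (state 2 -> 1); columns sum to one.
w21, w12 = 0.3, 0.1          # arbitrary choice with 0 < w21, w12 < 1
W = np.array([[1 - w21, w12],
              [w21, 1 - w12]])

p = np.array([1.0, 0.0])     # initial condition p(0)
for _ in range(200):         # iterate W^t . p(0)
    p = W @ p

ps = np.array([w12, w21]) / (w12 + w21)   # balance: w21*p1 = w12*p2
```

The subdominant eigenvalue of W is 1 − w21 − w12, so the distance to ps shrinks geometrically at that rate.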

2. Time-dependent reaction rates: A chemical reaction is described by the following deterministic rate equation,

dA/dt = k(t) A(t),

where k(t) is a time-dependent reaction rate. Solve the associated chemical master equation using a moment generating function.

3. Bursty Poisson model: Consider the following generalization of the Poisson model: synthesis occurs with a constant rate α, but in 'bursts' of size b, and degradation is linear, with rate βn. The master equation corresponding to this process is,

dP(n, t)/dt = α [P(n − b, t) − P(n, t)] + β [(n + 1) P(n + 1, t) − n P(n, t)].

(a) For b = 1, solve for the characteristic function Q(z, t) for all time. What do you notice about the distribution if n0 = 0?
(b) Repeat 3a, but for arbitrary burst size.

4. Bernoulli's urn model: In describing Bernoulli's urn model, Laplace derived a difference equation with nonlinear transition rates (Eq. 4.20). The nonlinearity of the transition rates makes the equation difficult to solve even at steady-state.
(a) Using matrix iteration for the cases n = 2, 3, and 4, multiply the steady-state probability distribution z by its first element z1 to get a vector of integers. Can you spot a pattern in the individual entries? Postulate a general solution for the steady-state probability distribution and verify that it satisfies the difference equation.
(b) Using Laplace's approximate solution (Eq. 4.22), calculate the mean-squared error between the exact solution and the continuous approximation as a function of the number of balls.

5. Asymmetric cell division: Most bacteria divide with surprising symmetry – E. coli, for example, typically divides into daughter cells that differ in length by less than 5%. Suppose a bacterium divided unequally; what would the age distribution (i.e., the time to next division) look like for a population?
(a) Divide the lifetime of the cell into time-steps ∆t. Assume the larger daughter lives 10∆t before division, while the smaller daughter lives 15∆t. Scale the time so that the transition rates are 1. Write out the transition matrix W.
(b) Find Ws. What does the equilibrium lifetime distribution look like?
(c) Repeat the above, but on a finer scale. That is, assume the large daughter lives 100∆t and the small daughter lives 150∆t.
(d) Derive a deterministic model for the process. How do the results above compare?

(e) Suppose the size after division was itself a stochastic process. How would the transition matrix change? Code a routine to allow both the volume after division and the partitioned contents to be narrowly-peaked random variables. What distribution will you choose for these two variables?

6. Gillespie's simulation algorithm: Gillespie draws upon two very important ideas in stochastic processes – the evolution of probability for a Markov process, and the simulation of a random variable by a unit uniform random number.
(a) Making use of the Markov property, show that for time-independent reaction events, ν(n, t) = ν(n), integration of the probability q⋆(n, t, dt) ≡ 1 − q(n, t, dt) gives q⋆(n, t, τ) = exp[−a(n)τ], as quoted in the main text (Eq. 4.29).
(b) Repeat part 6a for time-dependent reaction events ν(n, t).
(c) Write a stochastic simulation algorithm to generate realizations of the stochastic process that describes the Brusselator in a growing cell. In bacterial cell growth, the volume grows approximately exponentially over a cell-cycle, then divides more or less symmetrically. As a first approximation, assume perfect division of the cell volume and perfect partitioning of the cell contents into daughter cells. That is, repeat the example in the text, but with Ω a time-dependent quantity.

7. Exploring stochastic dynamics: Stochastic models exhibit features that do not appear in their deterministic counter-parts. Some of these features are straight-forward; others are quite surprising.
(a) Stable system. Consider a very simple system with constant synthesis and linear degradation,

x → x + 1, with propensity ν1 = α,
x → x − 1, with propensity ν2 = β · x.

Starting with x(0) = 0, compute the deterministic trajectory for x(t). Generate stochastic simulation data for molecule

numbers from about 10 − 10³, and plot these runs normalized to the same steady-state value. How does the relative magnitude of the fluctuations scale with the number of molecules? Plot a histogram of the fluctuations around the steady-state – what distribution does it resemble? How does the half-width scale with the number of molecules?

(b) Multi-stable system. A "toggle switch" network consists of two mutually repressing species, r1 and r2 – if r1 is high, synthesis of r2 is low, and, conversely, if r2 is high, r1 is kept low. A simple network describing this system is the following four reactions:

r1 → r1 + 1, ν1 = α · gR(r2/Ω),      r1 → r1 − 1, ν3 = β · (r1/Ω),
r2 → r2 + 1, ν2 = α · gR(r1/Ω),      r2 → r2 − 1, ν4 = β · (r2/Ω),

where the function gR(x) is high if x is low, and low if x is high. Suppose gR(x) takes the simple Hill-form,

gR(x) = [1 + f · (x/KR)^n] / [1 + (x/KR)^n],

where f is the capacity, KR measures the repressor strength (the smaller the KR, the less repressor necessary to reduce gR), and n is the cooperativity determining how abrupt the transition is from the high to the low state. Nondimensionalize the deterministic rate equations corresponding to this system, and estimate a range of parameters for which the system exhibits bistability. Perform stochastic simulations of the model for parameters in the bistable regime, along with varying system size Ω. What differences do you see comparing the stochastic and deterministic models?

(c) Noise-induced oscillations. Read J. M. G. Vilar, H. Y. Kueh, N. Barkai, and S. Leibler (2002) "Mechanisms of noise-resistance in genetic oscillators," Proceedings of the National Academy of Sciences USA 99: 15988–15992. Repeat the simulation of their model for the parameter choice leading to a deterministically stable system.
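As a starting point for Exercise 7(a), the constant-synthesis/linear-degradation system can be simulated directly; for this model the stationary distribution is Poisson with mean α/β, which gives a simple sanity check. A Python sketch (the values of α, β, the run length, and the number of realizations are arbitrary choices):

```python
import numpy as np

def ssa_birth_death(alpha, beta, t_end, rng):
    """SSA for constant synthesis (rate alpha) and linear degradation (rate beta*x)."""
    x, t = 0, 0.0
    while True:
        a0 = alpha + beta * x          # total propensity a(x)
        r1, r2 = rng.random(2)
        t += np.log(1.0 / r1) / a0     # waiting time, Eq. 4.30
        if t > t_end:
            return x
        x += 1 if r2 * a0 < alpha else -1  # synthesis vs degradation

rng = np.random.default_rng(2)
alpha, beta = 50.0, 1.0                # deterministic steady state alpha/beta = 50
samples = [ssa_birth_death(alpha, beta, t_end=8.0, rng=rng) for _ in range(300)]
mean = np.mean(samples)                # stationary distribution is Poisson(alpha/beta)
```

The deterministic trajectory for this system is x(t) = (α/β)(1 − e^{−βt}), so by t ≫ 1/β the ensemble mean should sit at α/β, with Poissonian spread around it.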

Chapter 5

Perturbation Expansion of the Master Equation

The master equation derived in Chapter 3 provides a foundation for most applications of stochastic processes to physical phenomena. Although it is more tractable than the Chapman-Kolmogorov equation, it is still rare to find an exact solution. Often we can gain a better sense of a particular model by examining certain limiting regimes. One possibility is to adopt the Fokker-Planck equation as an approximate evolution equation, as in Chapter 6. We have already seen that as the number of molecules increases, the system evolution becomes more smooth and the deterministic formulation becomes more appropriate (Figure 3.5). We shall show in this chapter that this is the first step in a systematic analytic approximation scheme.

5.1 Linear Noise Approximation (LNA)

• N. G. van Kampen (1976) "Expansion of the Master equation," Advances in Chemical Physics 34: 245.

The approximation method that we describe in this section examines system behavior in the limit of large numbers of reactant molecules. The linear noise approximation exploits this behavior and rests upon the supposition that the deterministic evolution of the reactant concentrations, call them x, can be meaningfully separated from the fluctuations, call them α, and that

the fractional deviation is inversely proportional to the square-root of the number of molecules (see Eq. 4.16 on page 83). We are led to the square-root scaling of the fluctuations by the suggestion from Poisson statistics (Eq. 4.16): recall that for a Poisson process, the fluctuations scale roughly as the square-root of the number of molecules. We introduce an extensive parameter Ω that carries the units of volume and is directly proportional to the molecule numbers, allowing the molecule numbers to be written

ni = Ω xi + √Ω αi.   (5.1)

The picture that underlies Eq. 5.1 is that of a deterministic, reproducible trajectory surrounded by a cloud of fluctuations: the probability density for the whole state, P(n, t), is re-written as a distribution for the fluctuations, Π(α, t), centered upon the macroscopic trajectory x(t) (Figure 5.1a). We would like a set of equations that govern the change in the deterministic part x, and an equation that governs the change in the probability distribution of the fluctuations, call it Π(α, t).

Figure 5.1: The Linear Noise Approximation. a) The microscopic fluctuations are separated from the macroscopic evolution of the system by re-writing the probability density for the whole state P(n, t) as a distribution for the fluctuations Π(α, t) centered on the macroscopic trajectory x(t). b) As Ω → ∞, the discrete state space is smeared into a continuum by replacing the discrete step-operator E by a continuous differential operator.

With the propensity vector ν and the stoichiometry matrix S, we are in a position to write the master equation in a compact and convenient manner: for a network of R reactions involving N species, the master equation is

dP(n, t)/dt = Ω Σ_{j=1}^{R} [ Π_{i=1}^{N} Ei^{−Sij} − 1 ] νj(n, Ω) P(n, t)   (5.2)

(where we have used the step-operator Ei^k defined in Eq. 4.7). We have repeatedly emphasized that if the transition probabilities νj are nonlinear

functions, then there is no systematic way to obtain an exact solution of the master equation, and we must resort to approximation methods. The linear noise approximation, which is the subject of this section, proceeds in three steps.

1. First, we replace the full probability distribution P(n, t) by the probability distribution for the fluctuations, Π(α, t), centered on the macroscopic trajectory x,

P(n, t) → Ω^{−N/2} Π(α, t).   (5.3)

The pre-factor Ω^{−N/2} comes from the normalization of the probability distribution.

2. Recall that what makes the master equation difficult to solve exactly is the discrete evolution over state-space characterized by the step-operator Ei^k – it increments the ith species by an integer k. To make headway, we must find some continuous representation of the action of this operator. To that end, consider the action of the operator,

Ei^k f(. . . , ni, . . .) = f(. . . , ni + k, . . .) = f(. . . , Ω xi + √Ω [αi + k/√Ω], . . .).   (5.4)

The term k/√Ω becomes negligibly small as Ω → ∞, suggesting a Taylor series around k/√Ω = 0,

Ei^k f(. . . , ni, . . .) ≈ f(. . . , ni, . . .) + (k/√Ω) ∂f/∂αi + (k²/2Ω) ∂²f/∂αi² + . . . ,   (5.5)

allowing us to approximate the discrete step-operator by a continuous differential operator (Figure 5.1b),

Ei^k ≈ 1 + (k/√Ω) ∂/∂αi + (k²/2Ω) ∂²/∂αi² + . . . .   (5.6)

(Compare this with the Kramers-Moyal expansion, Eq. 6.11 on page 126.)

3. Finally, to remain consistent in our perturbation scheme, we must likewise expand the propensities in the limit Ω → ∞,

νj(n, Ω) ≈ ν̃j(x) + (1/√Ω) Σ_{i=1}^{N} (∂ν̃j/∂xi) αi + . . . ,   (5.7)

where ν̃j(x) are the macroscopic propensities defined in the limit that the molecule numbers go to infinity while the concentration remains fixed (n/Ω constant; see p. 72),

ν̃j(x) = lim_{Ω→∞} νj(n, Ω), with x = n/Ω held fixed.   (5.8)

It is the expansion of the propensities that distinguishes the linear noise approximation from the Kramers-Moyal expansion.

Putting all of this together, and taking care to write ∂P/∂t using the chain rule¹,

∂P/∂t = Ω^{−N/2} [ ∂Π/∂t − Ω^{1/2} Σ_{i=1}^{N} (dxi/dt)(∂Π/∂αi) ],   (5.9)

we collect Eq. 5.2 in like powers of Ω^{−1/2}. To zeroth order (Ω⁰), we have

Σ_i (dxi/dt)(∂Π/∂αi) = Σ_i [S · ν̃]i (∂Π/∂αi).   (5.10)

This system of equations is identically satisfied if x obeys the deterministic rate equations,

dxi/dt = [S · ν̃]i ≡ fi(x).   (5.11)

At the next order, Ω^{−1/2}, we have the equation characterizing the probability distribution for the fluctuations,

∂Π/∂t = − Σ_{i,j} Γij ∂i (αj Π) + (1/2) Σ_{i,j} Dij ∂²ij Π,   (5.12)

The result is a consistent approximation scheme with a particularly simple Fokker-Planck equation governing the probability distribution of the fluctuations.

¹ There is a missing step here: the chain rule gives an expression involving the time derivatives of α. To obtain the expression involving dxi/dt, note that the time derivative of P is taken with n fixed, i.e. −dx = (1/√Ω) dα.

where ∂i ≡ ∂/∂αi, and

Γij(t) = ∂fi/∂xj |_{x(t)},   D = S · diag[ν̃] · S^T.   (5.13)

We now have in hand a system of ordinary differential equations that govern the deterministic evolution of the system, which happen to coincide with the macroscopic reaction rate equations. We also have a partial differential equation that characterizes the probability distribution of the fluctuations, Π(α, t), as it moves along x(t). Eq. 5.12 is a special sub-class of Fokker-Planck equations, since the coefficient matrices Γ and D are independent of α. As a consequence, the moments of α are easily computed from Eq. 5.12: multiplying by αi and integrating by parts gives

d⟨α⟩/dt = Γ · ⟨α⟩.   (5.14)

If we choose the initial condition of x to coincide with the initial state of the system n0 (i.e. α(0) = 0), then ⟨α⟩ = 0 for all time; without loss of generality, we set ⟨α⟩ = 0. For the covariance Ξij = ⟨αi αj⟩ − ⟨αi⟩⟨αj⟩ = ⟨αi αj⟩, multiplication of the Fokker-Planck equation, Eq. 5.12, by αi αj and integration by parts gives

dΞ/dt = Γ · Ξ + Ξ · Γ^T + D.   (5.15)

You can prove (see Exercise 1) that for linear drift and constant diffusion coefficient matrices, the solution distribution is Gaussian for all time. The full distribution satisfying Eq. 5.12 is then the Gaussian

Π(α, t) = [(2π)^N det Ξ(t)]^{−1/2} exp[ −(1/2) α^T · Ξ^{−1}(t) · α ],   (5.16)

with covariance matrix Ξ(t) determined by Eq. 5.15.

Some comments are in order:

1. We have used Ω as an ordering parameter in our perturbation expansion of the master equation. Notice, however, that it is not the single parameter Ω that determines the reliability of the expansion, particularly in higher-dimensional systems. The condition

that must be satisfied is that the standard deviation of the fluctuations be small compared with the mean. In practice, a more convenient measure is the requirement that the elements of the diffusion coefficient matrix D be small.

2. Notice Γ is simply the Jacobian of the deterministic system, evaluated pointwise along the macroscopic trajectory. As such, it represents the local damping or dissipation of the fluctuations. The diffusion coefficient matrix D tells us how much the microscopic system is changing at each point along the trajectory. As such, it represents the local fluctuations.

3. Notice the role stoichiometry plays in the magnitude of the fluctuations through the coefficient matrix D = S · diag[ν̃] · S^T. Although Γ is unchanged by lumping together the propensity and stoichiometry, the fluctuation matrix D is not! (See Exercise 2.)

4. The Fokker-Planck equation provides an even deeper insight into the physics of the process. Compare Eq. 5.12 with Kramers' equation for a Brownian particle trapped in a potential well (Section 6.7 on page 143). You will see that Γ plays the role of the curvature of the potential (the spring constant), and D plays the role of temperature.

5. The balance of these two competing effects – dissipation and fluctuation – occurs at steady-state and is described by the fluctuation-dissipation relation (Eq. 5.15 with dΞ/dt = 0),

Γs · Ξs + Ξs · Γs^T + Ds = 0,   (5.17)

where each of the matrices is evaluated at a stable equilibrium point of the deterministic system. It is straightforward to show that the autocorrelation of the fluctuations about the steady-state decays exponentially (Exercise 3),

⟨α(t) · α^T(0)⟩ = exp[Γs t] · Ξs.   (5.18)

5.1.1 Example – Nonlinear synthesis

Consider a one-state model with a general synthesis rate g, but linear degradation,

n → n + b, with propensity ν1 = g(n/Ω),
n → n − 1, with propensity ν2 = β · (n/Ω),

where n is the number of molecules and Ω is the volume of the reaction vessel (so that x = n/Ω is the concentration). The master equation for this process is

dP(n, t)/dt = Ω (E^{−b} − 1) g(n/Ω) P(n, t) + β (E¹ − 1) n P(n, t).   (5.19)

Substituting the ansatz (Eq. 5.1), along with the expansion of the step-operator (Eq. 5.6), into the master equation (Eq. 5.19), cancelling the common normalization factor from both sides, and collecting terms in like powers of 1/√Ω, we find to leading order

−(dx/dt)(∂Π/∂α) = −{b · g[x] − βx} (∂Π/∂α).

This equation is satisfied if the deterministic trajectory x(t) obeys the deterministic rate equation

dx/dt = b · g[x] − β · x.   (5.20)

To first order in 1/√Ω, the fluctuations Π(α, t) obey a linear Fokker-Planck equation,

∂Π/∂t = −{b · g′[x] − β} ∂(αΠ)/∂α + ({b² · g[x] + β · x}/2) ∂²Π/∂α².   (5.21)

The solution to this equation is straightforward – it is simply a Gaussian centered on the trajectory x(t) determined by Eq. 5.20, with variance given by

d⟨α²⟩/dt = 2 {b · g′[x] − β} ⟨α²⟩ + b² · g[x] + β · x.

The two equations, 5.20 and 5.21, provide an approximation of the full time-dependent evolution of the probability distribution P(n, t) obeying the master equation (Eq. 5.19). Although the method is called the linear noise approximation, the full nonlinearity in the deterministic trajectory is retained.

At steady-state, the mean x⋆ satisfies the algebraic condition

x⋆ = (b/β) · g[x⋆],

and the variance is given by

⟨α²⟩ = β x⋆ (b + 1) / (2 {β − b · g′[x⋆]}).   (5.22)

What is the system size? In the derivation of the linear noise approximation, the parameter Ω is used to order the terms in the perturbation expansion. The series neglects terms of order 1/Ω and higher. If these terms are retained, the coefficients in Eq. 5.21 become nonlinear in α, higher-order partial derivatives appear, and the distribution Π(α, t) is no longer Gaussian (see Exercise 5). The validity of the expansion depends upon the magnitude of Ω only indirectly; more informally, what is required is that Ωx ≫ √Ω α. What precisely the ordering parameter is will depend upon the problem at hand – for many electrical applications it is proportional to the capacitance in the circuit; for many chemistry problems it is proportional to the volume. In general, it is essential to find an extensive parameter that quantifies the relative magnitude of the jump size in the reaction numbers (and take the limit that this jump magnitude becomes small).

To be specific, consider a model of an autoactivator at the level of protein (after time-averaging the promoter and mRNA kinetics), and suppose the synthesis of the activator is expressed as a Hill function, so that the deterministic rate equation governing the concentration of A is

dA/dt = A0 · g[A/KA] − A,

where for convenience time is scaled relative to the degradation rate of A. The function g[A/KA] depends upon the dissociation constant KA (in units of concentration), and A0 is the maximal synthesis rate of A. For example, let g[x] = x²/(1 + x²). This model is a specific example of the nonlinear synthesis model described in detail above. The relevant concentration scale in the deterministic model is the dissociation constant KA, and the unitless parameter that fully characterizes

the behaviour of the deterministic model is the ratio A0/KA.

For the linear noise approximation to be valid, we need the fractional deviation of the molecule numbers to be small. It is straightforward to analyze this condition around steady-state. The fractional deviation is

⟨δn²⟩/⟨n⟩² = ⟨α²⟩/(Ω x⋆²).

For clarity, we make the simplifying assumption that g′[x⋆] ≪ 1 after scaling time relative to the degradation rate. Then, using Eq. 5.22,

⟨δn²⟩/⟨n⟩² ≈ [(b + 1)/2] · [1/(Ω x⋆)].

Non-dimensionalizing the mean x⋆ using the characteristic concentration KA, the fractional deviation will remain small provided the unitless parameter

∆ = [(b + 1)/2] · [1/(Ω KA)]

remains small. What is the physical interpretation of ∆? The first factor, (b + 1)/2, is the average change in numbers when a reaction occurs (1 for degradation, b for synthesis). The second factor is perhaps the more important one – ΩKA is the dissociation constant (in units of concentration) multiplied by the system volume. In the autoactivator example, the dissociation constant KA determines the threshold beyond which a bistable positive feedback loop switches from the low state to the high state (or vice-versa). In an electronic feedback loop with a microphone pointed at a speaker, KA would correspond to the sensitivity of the microphone to noise from the speaker, determining how loud a disturbance from the speaker must be in order to initiate a feedback whine. If KA was, for example, 25 nanomolar and you were working in E. coli (Ω ∼ 10^{−15} L), then KAΩ is about 25 molecules. So long as the average jump size (b + 1)/2 is much less than 25, the linear noise approximation provides a useful approximation.
The factor 1/(ΩKA) expresses a characteristic number scale in the system. In the present example, the state space is one-dimensional and so the number scale is obvious. For a more complicated model, it may not be so obvious what dimensionless parameters must be small for the approximation to remain valid. One strategy is to identify in the diffusion matrix D (Eq. 5.13) dimensionless parameters like ∆ that will control the magnitude of the variance.

5.1.2 Approximation of the Brusselator model

To illustrate the linear noise approximation, we return to the Brusselator example introduced on page 88. We use the ansatz ni = Ω xi + √Ω αi to re-write the molecule numbers ni as concentrations xi and fluctuations αi. Expanding the step-operator and the propensity functions in the master equation for the Brusselator, then collecting terms in powers of Ω^{−1/2}, at Ω⁰ we have

−(dx1/dt)(∂Π/∂α1) − (dx2/dt)(∂Π/∂α2) = −[1 + a x1² x2 − (1 + b) x1] ∂Π/∂α1 − [b x1 − a x1² x2] ∂Π/∂α2.

Identifying x1 and x2 with the macroscopic trajectory of the system,

dx1/dt = 1 + a x1² x2 − (1 + b) x1,
dx2/dt = b x1 − a x1² x2,

this term vanishes identically. The next term in the expansion, at Ω^{−1/2}, is

∂Π/∂t = − ∂/∂α1 { [2a x1 x2 − (b + 1)] α1 + a x1² α2 } Π
        − ∂/∂α2 { [b − 2a x1 x2] α1 − a x1² α2 } Π
        + (1/2) [ (b + 1) x1 + a x1² x2 + 1 ] ∂²Π/∂α1²
        + (1/2) [ b x1 + a x1² x2 ] ∂²Π/∂α2²
        − [ b x1 + a x1² x2 ] ∂²Π/∂α1∂α2.   (5.23)

This is a Fokker-Planck equation with coefficients linear in αi. Writing the above in more compact notation,

∂Π/∂t = − Σ_{i,j} Γij ∂(αj Π)/∂αi + (1/2) Σ_{i,j} Dij ∂²Π/∂αi∂αj,

with

Γ = [ 2a x1 x2 − (b + 1)    a x1²
      b − 2a x1 x2          −a x1² ],

D = [ (b + 1) x1 + a x1² x2 + 1    −b x1 − a x1² x2
      −b x1 − a x1² x2             b x1 + a x1² x2 ].

In this form, the mean and variance of α1 and α2 are easily derived, since

d⟨αi⟩/dt = Σ_j Γij ⟨αj⟩,
d⟨αi αj⟩/dt = Σ_k Γik ⟨αk αj⟩ + Σ_k Γjk ⟨αi αk⟩ + Dij,

to give

d⟨α1⟩/dt = [2a x1 x2 − (b + 1)] ⟨α1⟩ + a x1² ⟨α2⟩,
d⟨α2⟩/dt = [b − 2a x1 x2] ⟨α1⟩ − a x1² ⟨α2⟩,

and

d⟨α1²⟩/dt = 2 [2a x1 x2 − (b + 1)] ⟨α1²⟩ + 2a x1² ⟨α1 α2⟩ + 1 + (b + 1) x1 + a x1² x2,
d⟨α1 α2⟩/dt = [b − 2a x1 x2] ⟨α1²⟩ + [2a x1 x2 − (b + 1) − a x1²] ⟨α1 α2⟩ + a x1² ⟨α2²⟩ − b x1 − a x1² x2,
d⟨α2²⟩/dt = 2 [b − 2a x1 x2] ⟨α1 α2⟩ − 2a x1² ⟨α2²⟩ + b x1 + a x1² x2.

Stable regime, b < 1 + a: In the parameter regime where the macroscopic solutions are asymptotically stable and tend to the fixed point (x1^ss, x2^ss) = (1, b/a), the mean of the fluctuations vanishes asymptotically, (⟨α1⟩, ⟨α2⟩) → (0, 0), and the steady-state (co)variances are

⟨α1²⟩ss = [b + (1 + a)] / [(1 + a) − b],
⟨α1 α2⟩ss = −2b / [(1 + a) − b],
⟨α2²⟩ss = (b/a) [b + (1 + a)] / [(1 + a) − b].

These expressions obviously become meaningless for a choice of parameters outside the stable region, b ≥ 1 + a. If the macroscopic solution is unstable, the variance diverges, and after a transient period the variance will exceed Ω, so that the ordering implied by our original ansatz ni = Ω xi + Ω^{1/2} αi breaks down. This is certainly true for exponentially divergent macroscopic trajectories; but for orbitally stable trajectories, it is only the uncertainty in the phase that is divergent, while the variance remains bounded and non-zero. To examine the fluctuations in a system on a periodic orbit, more sophisticated techniques are required (see Section 11.2 on page 219). It is instructive to compare the linear noise approximation with the results of stochastic simulation in order to contrast the two approaches.
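The steady-state covariances quoted above can be checked numerically: at the fixed point, the fluctuation-dissipation relation (Eq. 5.17) is a small linear system for Ξs, solvable with a Kronecker-product trick. A Python sketch (the parameter values are an arbitrary stable choice, b < 1 + a):

```python
import numpy as np

def lna_steady_covariance(Gamma, D):
    """Solve the fluctuation-dissipation relation Gamma X + X Gamma^T + D = 0.

    With row-major vectorization, vec(Gamma X + X Gamma^T) =
    (kron(Gamma, I) + kron(I, Gamma)) vec(X).
    """
    n = Gamma.shape[0]
    I = np.eye(n)
    M = np.kron(Gamma, I) + np.kron(I, Gamma)
    return np.linalg.solve(M, -D.flatten()).reshape(n, n)

a, b = 1.0, 1.5                    # stable regime: b < 1 + a
x1, x2 = 1.0, b / a                # macroscopic fixed point
Gamma = np.array([[2*a*x1*x2 - (b + 1), a * x1**2],
                  [b - 2*a*x1*x2,      -a * x1**2]])
D = np.array([[(b + 1)*x1 + a*x1**2*x2 + 1, -b*x1 - a*x1**2*x2],
              [-b*x1 - a*x1**2*x2,           b*x1 + a*x1**2*x2]])
Xi = lna_steady_covariance(Gamma, D)

# closed-form steady-state (co)variances quoted in the text
denom = (1 + a) - b
Xi_exact = np.array([[(b + 1 + a) / denom,        -2*b / denom],
                     [-2*b / denom, (b / a) * (b + 1 + a) / denom]])
```

For (a, b) = (1, 1.5), both routes give Ξs = [[7, −6], [−6, 10.5]], and the residual of Eq. 5.17 vanishes.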

Figure 5.2: The stable system – far from the Hopf bifurcation. a) (a, b) = (0.1, 0.2) (α = 10 in Figure 23 of Gillespie (1977)). The system is very stable, with microscopic trajectories fluctuating about the fixed point, shown here at (n1, n2) = (1000, 2000). b) (a, b) = (0.5, 1) (α = 2 in Figure 22 of Gillespie (1977)). The system is still stable, but fluctuations are more appreciable. The insets show the first and second standard deviations of the Gaussian fluctuations predicted by the linear noise approximation. In both figures, Ω = 1000.

α2 ) · Ξ−1 · (α1 . though the analytic approximation becomes more reliable. In systems exhibiting a limit cycle or multistability. the ﬂuctuations become enhanced and the eigenvectors approach a limiting slope (Figure 5. Nevertheless. α2 ) = with Ξs = 2 α1 α1 α2 ss ss 1 T exp − 1 (α1 . the probability density of αi can be explicitly written as the bivariate Gaussian distribution. . α2 ) . xss ) = 1. the linear noise approximation cannot be used because it relies on linearization about a unique stable macroscopic trajectory. α2 ). described by the ellipse 2 [(b + (1 + a)) ((1 + a) − b)] α1 + a 2 [(b + (1 + a)) ((1 + a) − b)] α2 b +4a [(1 + a) − b] α1 α2 3 =K b (b2 + 2b − 2ab + 2a + a2 + 1) 2 . α2 ) (Figure 5. at steady-state. the probability density Π (α1 . As the system parameters approach the bifurcation line b → 1 + a.2). Πss (α1 .24) . the linear noise approximation provides an excellent approximation of the statistics of the ﬂuctuations.112 Applied stochastic processes 1 Since retention of terms up to ﬁrst order in Ω− 2 results in a linear FokkerPlanck equation. s 2 2π det Ξs √ ss α1 α2 2 ss α2 = 1 (1 + a) − b b + (1 + a) −2b b a −2b [b + (1 + a)] (5. In the stable regime. stochastic simulation can be performed easily. α2 . the linear noise approximation must be modiﬁed (see Section 11. as the number of molecules increases (Ω → ∞).3 and Excercise 4). b centred upon (xss .2 on page 219). 2 a [(1 + a) − b] The constant K determines the fraction of probability contained under the surface Πss (α1 . In fact. For ﬂuctuations around a limit-cycle. For the tracking of a trajectory through a multi-stable phase space. a . stochastic simulation becomes prohibitively time-consuming. Notice that the major and minor axes of the level curve ellipse are the eigenvectors of Ξ−1 . t) must necessarily be Gaussian (to this order in Ω). The ﬂuctuations are therefore con1 2 strained to lie within some level curve of Πss (α1 .

Figure 5.4: The Central Dogma of Molecular Biology. A) RNA polymerase transcribes DNA into messenger RNA (mRNA); ribosomes translate mRNA into protein. B) The process of protein synthesis can be represented by a simplified model. Here, the block arrow denotes the DNA of the gene of interest, the gray ribbon denotes mRNA (m), and the green circle denotes protein (p).

5.2 Example – 'Burstiness' in Protein Synthesis

• M. Thattai and A. van Oudenaarden (2001) "Intrinsic noise in gene regulatory networks," Proceedings of the National Academy of Sciences USA 98: 8614–8619.

In cells, protein is synthesized via a two-step process – DNA is transcribed to messenger RNA (mRNA), and that mRNA is translated to protein. Deterministically, this process can be represented as a pair of linear, coupled differential equations for the concentrations of mRNA (m) and protein (p),

dm/dt = αm − βm m,
dp/dt = αp m − βp p,   (5.25)

where αm is the rate of transcription, αp is the rate of translation, and βm and βp are the rates of mRNA and protein degradation, respectively (Figure 5.4). Notice that the steady-state values of the mRNA and protein levels are,

m_s = αm/βm  and  p_s = (αp · m_s)/βp = (αm · αp)/(βm · βp).   (5.26)

On a mesoscopic level, the deterministic model above is recast as a master

equation for the number of mRNA (n1) and protein (n2) molecules,

∂P(n1, n2, t)/∂t = αm (E1^(−1) − 1) P(n1, n2, t) + βm (E1^(+1) − 1) [n1 P(n1, n2, t)]
  + αp (E2^(−1) − 1) [n1 P(n1, n2, t)] + βp (E2^(+1) − 1) [n2 P(n1, n2, t)],   (5.27)

where, as above, E is the step-operator: E_i^k f(ni, nj) = f(ni + k, nj). All of the transition rates are linear in the variables n1 and n2, so in principle the full distribution P(n1, n2, t) can be obtained. Using the moment generating function Q(z1, z2, t) as in Section 4.1, the partial differential equation for Q is,

∂Q/∂t = αm (z1 − 1) Q + βm (1 − z1) ∂Q/∂z1 + αp z1 (z2 − 1) ∂Q/∂z1 + βp (1 − z2) ∂Q/∂z2.   (5.28)

Despite being a linear partial differential equation, even at steady-state it is difficult to determine Q_s(z1, z2) exactly. Often Q is expanded as a Taylor series about the point z1 = z2 = 1, and an evolution equation for the first two moments of the distribution is obtained by treating ε1 = (z1 − 1) and ε2 = (z2 − 1) as small parameters and collecting terms of like powers in εi. The calculation is algebraically tedious – you are encouraged to try.

Alternatively, one can use the linear noise approximation. We make the usual ansatz that the concentration can be partitioned into deterministic and stochastic components,

m(t) = x1(t) + Ω^(−1/2) α1(t),  p(t) = x2(t) + Ω^(−1/2) α2(t).

Focusing upon the steady-state, x1^s = m_s and x2^s = p_s, the mean mRNA and protein levels correspond to the deterministic values (⟨α1⟩_s = ⟨α2⟩_s = 0). The steady-state covariance is determined by the fluctuation-dissipation relation, Eq. 5.17. Solving for the steady-state covariance (writing φ = βm/βp),

Ξ_s = [ ⟨δm²⟩   ⟨δm δp⟩ ;  ⟨δm δp⟩   ⟨δp²⟩ ] = [ m_s   p_s/(1 + φ) ;  p_s/(1 + φ)   p_s + p_s²/(m_s (1 + φ)) ].   (5.29)
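The pieces above can be checked numerically. The sketch below (my own; the rate values are the illustrative ones from Figure 5.5, blue curve) integrates the deterministic equations (5.25) onto the steady states (5.26), and then substitutes the covariance (5.29) into the steady-state fluctuation-dissipation relation A·Ξ + Ξ·Aᵀ + B = 0, where A is the Jacobian and B the diffusion matrix of the linear model; the residual vanishes, confirming the algebra:

```python
# Rates (1/min); the deterministic mean protein level is 50.
am, bm, ap, bp = 0.1, 0.2, 2.0, 0.02

# Deterministic steady states, Eq. 5.26.
ms, ps = am / bm, am * ap / (bm * bp)

# Euler integration of Eq. 5.25 relaxes onto (ms, ps).
m = p = 0.0
dt = 0.1
for _ in range(20000):              # 2000 minutes >> 1/bp = 50 min
    m, p = m + dt * (am - bm * m), p + dt * (ap * m - bp * p)

# LNA covariance, Eq. 5.29, with phi = bm/bp.
phi = bm / bp
Xi = [[ms, ps / (1 + phi)],
      [ps / (1 + phi), ps + ps ** 2 / (ms * (1 + phi))]]

# Fluctuation-dissipation residual: A*Xi + Xi*A^T + B should be zero.
A = [[-bm, 0.0], [ap, -bp]]
B = [[2 * bm * ms, 0.0], [0.0, 2 * bp * ps]]
R = [[sum(A[i][k] * Xi[k][j] for k in range(2)) +
      sum(Xi[i][k] * A[j][k] for k in range(2)) + B[i][j]
      for j in range(2)] for i in range(2)]
```

The diffusion matrix B is diagonal here because each reaction changes only one species at a time; at steady state the production and degradation fluxes balance, giving the entries 2βm·m_s and 2βp·p_s.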

Figure 5.5: Burstiness of protein synthesis. A) Stochastic simulation of the minimal model of protein synthesis (Eq. 5.27), with (αm, αp) = (0.1, 2) [blue] and (αm, αp) = (0.01, 20) [red]. The degradation rates are the same for both: βm^(−1) = 5 min and βp^(−1) = 50 min. Both the red and the blue curves have the same mean (⟨p⟩ = 50), but distinctly different variance; the difference is captured by the burstiness b. B) Burstiness can be observed experimentally. The Xie lab has developed sophisticated methods to observe bursts of protein off of individual mRNA transcribed from a highly repressed lac promoter. Trapping individual E. coli cells in a microfluidic chamber, it is possible to observe the step-wise increase of β-gal off the lacZ gene. From Cai L, Friedman N, Xie XS (2006) "Stochastic protein expression in individual cells at the single molecule level," Nature 440: 358–362.

Focusing upon the protein, the fractional deviation of the fluctuations is,

⟨δp²⟩ / ⟨p⟩² = (1/p_s) (1 + αp / (βp (1 + φ))).

In prokaryotes, the degradation rate of the mRNA is typically much greater than the degradation rate of the protein, so φ ≫ 1, and the fractional deviation becomes,

⟨δp²⟩ / ⟨p⟩² ≈ (1/p_s) (1 + αp/βm) ≡ (1/p_s) [1 + b],   (5.30)

where b = αp/βm is called the burstiness of protein synthesis. The burstiness is a measure of the amplification of transcription noise by translation, since each errant transcript is amplified by b = αp/βm = (protein molecules synthesized per mRNA per unit time) × (average mRNA lifetime) (Figure 5.5). What is surprising is that we can observe this burstiness experimentally (Figure 5.5b)!

The moment generating function approach only works with linear transition rates and can be algebraically formidable. The linear noise approximation, by contrast, is calculated algorithmically, works regardless of the form of the transition rates, and is in fact exact for linear transition rates. Having said that, it is a wonder that moment generating functions are used at all.

5.3 Limitations of the LNA

The linear noise approximation is built upon the ansatz that the full state can be separated into deterministic and stochastic components, with the fluctuations scaling with the square-root of the system size (cf. Eq. 5.1),

n_i = Ω x_i + √Ω α_i,

with the probability distribution of the fluctuations Π(α, t) evaluated along the trajectory x(t). The limitations of the linear noise approximation are implicit in the ansatz. First, the approximation is a perturbation expansion that is valid so long as terms of order 1/√Ω remain sub-dominant to the leading-order deterministic trajectory. Second, the linear noise approximation is a local expansion of the master equation.
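The burst effect is easy to reproduce numerically. Below is a minimal sketch (my own implementation, using the illustrative rates of the blue curve in Figure 5.5) of Gillespie's direct method applied to the four reactions of Eq. 5.27; the time-averaged protein level comes out near p_s = 50, while the Fano factor ⟨δp²⟩/⟨p⟩ sits far above the Poisson value of 1, reflecting the burstiness:

```python
import random

def simulate(am, bm, ap, bp, t_end, burn_in, seed=0):
    """Gillespie direct method for the two-stage gene expression model.
    Returns the time-averaged mean and variance of the protein count."""
    rng = random.Random(seed)
    t, m, p = 0.0, 0, 0
    w = s1 = s2 = 0.0                      # time-weighted accumulators
    while t < t_end:
        rates = [am, bm * m, ap * m, bp * p]
        total = sum(rates)
        dt = rng.expovariate(total)        # waiting time to next reaction
        if t > burn_in:                    # discard the initial transient
            w, s1, s2 = w + dt, s1 + p * dt, s2 + p * p * dt
        t += dt
        r = rng.uniform(0.0, total)        # choose which reaction fires
        if r < rates[0]:
            m += 1
        elif r < rates[0] + rates[1]:
            m -= 1
        elif r < rates[0] + rates[1] + rates[2]:
            p += 1
        else:
            p -= 1
    mean = s1 / w
    return mean, s2 / w - mean * mean

mean_p, var_p = simulate(0.1, 0.2, 2.0, 0.02, t_end=20000.0, burn_in=1000.0)
fano = var_p / mean_p                      # roughly 1 + b/(1 + 1/phi), well above 1
```

Averages are time-weighted (each state is weighted by how long the system sits in it), which is the correct way to estimate stationary moments from an SSA trajectory.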

The most conservative application of the approximation is to systems with deterministic dynamics exhibiting a single asymptotically stable fixed-point. For multistable systems, the approximation can still be applied, but only within a basin of attraction, thereby providing an estimation of the fluctuations along a trajectory close to a fixed point. Furthermore, with slightly more elaborate analysis, the fluctuations about a stable limit cycle can also be estimated (see Section 11.2 on p. 219). The linear noise approximation cannot, however, be used to compute global properties of the system, such as switching rates between fixed-points or splitting probabilities among multiple equilibria. For those types of computation, stochastic simulation is the only consistent method presently available.

Suggested References

The review by van Kampen of his linear noise approximation is excellent:

• N. G. van Kampen (1976) "Expansion of the Master equation," Advances in Chemical Physics 34: 245.

For a critique of the Kramers-Moyal expansion as a reliable approximation of the master equation, see the two articles by van Kampen:

• N. G. van Kampen (1981) "The validity of non-linear Langevin equations," Journal of Statistical Physics 25: 431.

• N. G. van Kampen (1982) "The diffusion approximation for Markov processes," in Thermodynamics and kinetics of biological processes, Ed. I. Lamprecht and A. I. Zotin (Walter de Gruyter & Co.).

Exercises

1. Linear Fokker-Planck equation: Show that the solution of the linear Fokker-Planck equation is Gaussian for all time (conditioned by a delta-peak at t = 0).

2. Stoichiometry vs. propensity: The deterministic rate equations are written in terms of the dot-product of the stoichiometry matrix S and the propensity vector ν. From the perspective of the phenomenological model, it is permissible to re-scale the entries in the stoichiometry to ±1, and absorb the step-size into each entry in

the propensity vector. From the point of view of characterizing the fluctuations, however, it is very important to distinguish between stoichiometry and propensity. To that end, examine the following process with variable step-size:

n → n + a, with propensity ν1 = α/a,
n → n − b, with propensity ν2 = (β/b) · n.

(a) Show that the deterministic rate equation is independent of the steps a and b.

(b) Calculate the variance in n. Show that it is strongly dependent upon a and b.

3. Exponential autocorrelation in the linear noise approximation: Derive the expression for the steady-state autocorrelation function in the linear noise approximation. There are several ways to do this; perhaps the most straightforward is to generalize the ideas developed in the section on the fluctuation-dissipation relation (Section 2.2 on p. 42).

4. Brusselator near the bifurcation: The major and minor axes of the stationary probability distribution equiprobability contours are given by the eigenvectors of the inverse-covariance matrix Ξ^(−1). In the Brusselator example, calculate these eigenvectors using Eq. 5.24. What is the slope of the major axis as the parameters approach the bifurcation a → b − 1? Do you recognize this number?

5. Higher-order corrections: For a model with nonlinear degradation,

n → n + 1, with propensity ν1 = α,
2n → n, with propensity ν2 = β n(n − 1)/Ω²,

including terms of order 1/Ω in the linear noise approximation, determine the following:

(a) The partial differential equation for Π(α, t).

(b) The mean and variance of n.

(c) The auto-correlation function at steady-state, ⟨n(t)n(t − τ)⟩.
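For Exercise 2, a quick simulation (my own sketch, with illustrative values α = 50, β = 1) makes the point concrete: the deterministic mean is α/β regardless of the step sizes, but the stationary variance follows the fluctuation-dissipation estimate α(a + b)/(2β), so the Fano factor is (a + b)/2:

```python
import random

def step_size_process(alpha, beta, a, b, t_end, burn_in, seed=0):
    """SSA for: n -> n + a at rate alpha/a, n -> n - b at rate (beta/b)*n.
    Returns the time-averaged mean and variance of n."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    w = s1 = s2 = 0.0                      # time-weighted accumulators
    while t < t_end:
        birth, death = alpha / a, (beta / b) * n
        total = birth + death
        dt = rng.expovariate(total)
        if t > burn_in:
            w, s1, s2 = w + dt, s1 + n * dt, s2 + n * n * dt
        t += dt
        n += a if rng.uniform(0.0, total) < birth else -b
    mean = s1 / w
    return mean, s2 / w - mean * mean

alpha, beta, a, b = 50.0, 1.0, 5, 1
mean_n, var_n = step_size_process(alpha, beta, a, b, t_end=2000.0, burn_in=100.0)
var_fdr = alpha * (a + b) / (2 * beta)     # fluctuation-dissipation estimate: 150
```

The deterministic rate equation dn/dt = a·(α/a) − b·(β/b)n = α − βn is blind to a and b, yet the simulated variance tracks var_fdr, which is three times the Poisson value here.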

6. Negative feedback: Consider the nonlinear negative feedback model,

r → r + b, with propensity ν1 = α · g(r/(Ω · K_R)),
r → r − 1, with propensity ν2 = β r/Ω,

where g(x) is a Hill-type function, g(x) = (1 + x^n)^(−1). In the regime where the system is homeostatic (r⋆ ≫ Ω · K_R):

(a) Find explicit estimates for the steady-state mean r⋆ and the steady-state variance ⟨δr²⟩.

(b) How does the fractional deviation about the steady-state, √⟨δr²⟩/r⋆, change as the cooperativity n increases?

(c) Identify a unitless parameter that determines the validity of the linear noise approximation (see Section 5.1.1).

Chapter 6

Fokker-Planck Equation

The work of Ornstein and Uhlenbeck on Brownian motion suggested that there is a general connection between a certain class of stochastic differential equations for a given trajectory (as studied by Langevin) and a certain family of partial differential equations governing the probability distribution for an ensemble of trajectories (as studied by Einstein). In this chapter, we formalize the connection between the two approaches, and find that for stochastic differential equations with multiplicative white noise, a new calculus is required.

The master equation (cf. Eq. 3.11 on page 59),

∂p(x3|x1, τ)/∂τ = ∫ [w(x3|x2) p(x2|x1, τ) − w(x2|x3) p(x3|x1, τ)] dx2,

is an integro-differential equation (or, in the case of a discrete state-space, a discrete-differential equation). This family of equations is difficult to solve in general, and a systematic approximation method is needed. The Kolmogorov, or Fokker-Planck, equation is an attempt to approximate the master equation by a (one hopes) more amenable partial differential equation governing p(x3|x1, τ). Kolmogorov derived his equation by imposing abstract conditions on the moments of the transition probabilities w in the Chapman-Kolmogorov equation (Eq. 3.12), while the derivation of the same equation by Fokker and Planck proceeds as a continuous approximation of the discrete master equation. We shall see in Chapter 5 that in systems where the noise is intrinsic to the dynamics, both derivations are inconsistent approximations. (See N. G. van Kampen (1982) "The diffusion approximation for Markov processes," in Thermodynamics and kinetics of biological processes, Ed. I. Lamprecht and A. I. Zotin, p. 181.)

6.1 Kolmogorov Equation

• A. Kolmogorov (1931) Math. Ann. 104: 415.

Consider a stationary Markov process – one dimensional, for simplicity – possessing all properties required for the following operations to be well-defined. Write the Chapman-Kolmogorov equation as,

p(x|y, t + ∆t) = ∫ p(x|z, t) p(z|y, ∆t) dz,   (6.1)

and define the jump moments by,

a_n(z, ∆t) = ∫ (y − z)^n p(z|y, ∆t) dy.   (6.2)

Now, assume that for ∆t → 0, only the 1st and 2nd moments become proportional to ∆t (assume all higher moments vanish in that limit), so that the following limits exist,

A(z) = lim_{∆t→0} (1/∆t) a1(z, ∆t),  B(z) = lim_{∆t→0} (1/∆t) a2(z, ∆t).

Furthermore, let R(y) be a suitable test function, with all the necessary properties for the following to hold. Then,

∫ R(y) (∂p/∂t) dy = lim_{∆t→0} (1/∆t) ∫ R(y) [p(x|y, t + ∆t) − p(x|y, t)] dy
  = lim_{∆t→0} (1/∆t) [ ∫∫ R(y) p(x|z, t) p(z|y, ∆t) dz dy − ∫ R(y) p(x|y, t) dy ],

using Eq. 6.1 to replace p(x|y, t + ∆t). We expand R(y) around the point y = z,

R(y) ≈ R(z) + R′(z)(y − z) + (1/2) R″(z)(y − z)²,

and interchange the order of integration. The R(z) term cancels the final integral, since ∫ p(z|y, ∆t) dy = 1. Since we have assumed all jump moments beyond the second vanish, we are left with, in terms of the functions A(z) and B(z) defined above in Eq. 6.2,

∫ R(y) (∂p/∂t) dy = ∫ R′(z) p(x|z, t) A(z) dz + (1/2) ∫ R″(z) p(x|z, t) B(z) dz.   (6.3)

(Prove to yourself that this is true.) Recall that R(y) is an arbitrary test function. In particular, choose R(y) so that R(y) and R′(y) → 0 as y → ±∞ as fast as necessary for the following to be true,

lim_{z→±∞} R(z) p(x|z, t) A(z) = 0,  lim_{z→±∞} R′(z) p(x|z, t) B(z) = 0.

Integrating Eq. 6.3 by parts, we obtain:

∫ R(z) [ ∂p(x|z, t)/∂t + ∂/∂z {A(z) p(x|z, t)} − (1/2) ∂²/∂z² {B(z) p(x|z, t)} ] dz = 0.

By the du Bois-Reymond lemma from analysis, since R(y) is an arbitrary test function, we must conclude that,

∂p(x|y, t)/∂t = −∂/∂y {A(y) p(x|y, t)} + (1/2) ∂²/∂y² {B(y) p(x|y, t)}.   (6.4)
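As a concrete check of Eq. 6.4 (a small verification of my own, not in the text), take constant coefficients A(y) = A and B(y) = B > 0; the Gaussian transition density solves the equation directly. Writing u = y − x − At and p(x|y, t) = (2πBt)^(−1/2) exp[−u²/(2Bt)]:

```latex
\begin{aligned}
\frac{\partial p}{\partial t}
  &= \Big(-\frac{1}{2t} + \frac{uA}{Bt} + \frac{u^{2}}{2Bt^{2}}\Big)\, p,
\qquad
\frac{\partial p}{\partial y} = -\frac{u}{Bt}\, p,
\qquad
\frac{\partial^{2} p}{\partial y^{2}}
  = \Big(\frac{u^{2}}{B^{2}t^{2}} - \frac{1}{Bt}\Big)\, p,\\[4pt]
-A\frac{\partial p}{\partial y} + \frac{B}{2}\frac{\partial^{2} p}{\partial y^{2}}
  &= \Big(\frac{uA}{Bt} + \frac{u^{2}}{2Bt^{2}} - \frac{1}{2t}\Big)\, p
  \;=\; \frac{\partial p}{\partial t}.
\end{aligned}
```

So the drifting heat kernel is the fundamental solution when A and B are constant, consistent with the pure-diffusion and Ornstein-Uhlenbeck examples of Section 6.4.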

This is the Kolmogorov Equation, often called the Fokker-Planck Equation or the Generalized Diffusion Equation, though sometimes appearing as the Smoluchowski Equation.

The derivation of the Kolmogorov equation was rightly considered a breakthrough in the theory of stochastic processes: it is obvious that having a partial differential equation that is equivalent (under the assumed conditions) to the impossibly difficult Chapman-Kolmogorov equation is great progress. This success has been exploited in a large number of applications since 1931, but as work in this area grew, subtle questions of consistency arose, and the debate is far from over today. The crux of the problem is the following: despite the impeccable mathematics of Kolmogorov, the derivation does not hold any information regarding what physical processes actually obey the necessary mathematical assumptions – in particular, for which processes one might expect the higher jump moments to vanish. (See N. G. van Kampen (1982) "The diffusion approximation for Markov processes," in Thermodynamics and kinetics of biological processes, Ed. I. Lamprecht and A. I. Zotin, p. 181.)

Remarks:

1. The first term on the right, involving A(y), is called the convection or drift term; the second term, involving B(y), is called the diffusion or fluctuation term.

2. Eq. 6.4 may be written as a continuity equation,

∂p/∂t = −∂J/∂y,

where J is the probability flux obeying the constitutive equation,

J(y) = A(y) p − (1/2) ∂/∂y {B(y) p}.

3. The derivation above may be repeated (with minor modifications) for an n-dimensional process; the result is then given by,

∂p/∂t = −Σ_{i=1}^{n} ∂/∂y_i {A_i(y) p} + (1/2) Σ_{k,l=1}^{n} ∂²/(∂y_k ∂y_l) {B_kl(y) p}.   (6.5)

4. Note that the Kolmogorov equation is an evolution equation for the conditional probability density, and that it must be solved subject to the initial condition,

p(x|y, 0) = δ(x − y).

6.2 Derivation of the Fokker-Planck Equation

Kolmogorov's derivation is formal, with no reference to any particular physical process, nor even an indication of what kinds of physical processes obey the necessary mathematical conditions. Fokker and Planck derived the same equation based on assumptions tied directly to particular features of the physical systems under consideration. The labeled states x1, x2, x3 appearing in the master equation may be interpreted as representing the initial, intermediate and final state, respectively. Let them be re-labeled as y0, y′, y, so that the master equation reads,

∂p(y|y0, τ)/∂τ = ∫ [w(y|y′) p(y′|y0, τ) − w(y′|y) p(y|y0, τ)] dy′.   (6.6)

This is an evolution equation for the conditional probability density, and it is linear in p(y|y0, τ), since w(y|y′) is supposed to be provided by the "physics" of the system. If the dependence on the initial state is understood, then Eq. 6.6 may be written in the short-hand form,

∂p(y, τ)/∂τ = ∫ [w(y|y′) p(y′, τ) − w(y′|y) p(y, τ)] dy′.   (6.7)

This must not be confused with an evolution equation for the single-state probability distribution, since p is still conditioned upon the initial distribution p(y0, t0). Notice, however, that before any attempt at solving the initial value problem is made, one must find an explicit expression for the coefficients A(y) and B(y). The initial-value problem can be solved in a few cases (see H. Risken, The Fokker-Planck Equation: Methods of Solution and Applications (Springer-Verlag, 1996), for details). Nevertheless, it will become clear in Chapter 5 that Fokker's derivation (described below) is only consistent if the transition probabilities are linear in the state variables. Despite this very strict limitation, the Fokker-Planck equation is the favoured approximation method for many investigators (although rarely applied to systems with linear transition rates). We introduce the jump-size ∆y = y − y′.

The transition probability w(y|y′) is the probability per unit time of a transition y′ → y. Since ∆y = y − y′, we can re-write the transition probabilities in terms of the jump-size: w(y|y′) = w(y′, ∆y), that is, starting at y′, there is a jump of size ∆y. Similarly, w(y′|y) = w(y, −∆y). With this change in notation, the master equation becomes,

∂p(y, τ)/∂τ = ∫ [w(y − ∆y, ∆y) p(y − ∆y, τ) − w(y, −∆y) p(y, τ)] d∆y.   (6.8)

Next, we make two assumptions:

1. Only small jumps occur; i.e., there exists a δ > 0 such that w(y′, ∆y) ≈ 0 for |∆y| > δ and w(y′ + ∆y, ∆y) ≈ w(y′, ∆y) for |∆y| < δ.

2. p(y, τ) varies slowly with y; i.e., p(y + ∆y, τ) ≈ p(y, τ) for |∆y| < δ.

Under these conditions, one is able to Taylor expand w and p in y′ about y, to O(∆y³),

w(y − ∆y, ∆y) p(y − ∆y, τ) ≈ w(y, ∆y) p(y, τ) − ∆y ∂/∂y [w(y, ∆y) p(y, τ)] + (∆y²/2) ∂²/∂y² [w(y, ∆y) p(y, τ)].

Substituting this expansion into Eq. 6.8, the zeroth-order terms cancel, and the master equation is approximated by,

∂p(y, τ)/∂τ = −∫ ∆y ∂/∂y [w(y, ∆y) p(y, τ)] d∆y + (1/2) ∫ ∆y² ∂²/∂y² [w(y, ∆y) p(y, τ)] d∆y.   (6.9)

Note the dependence of w(y, ∆y) on the second argument is fully maintained. Furthermore, since

∫ ∆y ∂/∂y [w(y, ∆y) p(y, τ)] d∆y = ∂/∂y [ p(y, τ) ∫ ∆y w(y, ∆y) d∆y ] = ∂/∂y [p(y, τ) a1(y)],

we finally obtain the Fokker-Planck Equation,

∂p(y, τ)/∂τ ≈ −∂/∂y [a1(y) p(y, τ)] + (1/2) ∂²/∂y² [a2(y) p(y, τ)],   (6.10)

where a_n(y) = ∫ ∆y^n w(y, ∆y) d∆y.

Remarks:

1. The Fokker-Planck equation has the same mathematical form as the Kolmogorov equation, viz. a linear, 2nd-order partial differential equation of parabolic type. An expansion with respect to ∆y is not possible, since w varies rapidly with ∆y. Nevertheless, the assumptions made in deriving the Fokker-Planck equation make it possible to answer the question: For a given w, how good an approximation is provided by this equation?

2. It is not difficult to include all terms in the Taylor expansion used above and obtain the Kramers-Moyal Expansion, viz.

∂p(y, τ)/∂τ = Σ_{n=1}^{∞} ((−1)^n / n!) ∂^n/∂y^n [a_n(y) p(y, τ)],   (6.11)

where

a_n(y) = ∫ ∆y^n w(y, ∆y) d∆y.

While this is formally equivalent to the master equation, to be practical the expansion must be truncated at a certain point – the Fokker-Planck equation, for example, is the result of truncation after two terms. The truncated Kramers-Moyal expansion is what Einstein used in his derivation of the diffusion equation for a Brownian particle (cf. Eq. 1.7 on page 6). Breaking off the series in a consistent fashion requires the introduction of a well-defined extensive variable quantifying the relative size of the individual jump events (as we shall see in Section 5.1).

Figure 6.1: Review of Markov Processes. A flow chart relating the main evolution equations: the Chapman-Kolmogorov equation (a nonlinear functional equation governing all conditional probability densities of a Markov process); the master equation, obtained for a stationary process by computing the short-time transition probability from a microscopic theory, p(x|z, τ′) = (1 − a0τ′)δ(x − z) + τ′w(x|z) + o(τ′); the Kolmogorov equation, obtained by assuming a_n = 0 for n > 2; and the Fokker-Planck equation, obtained by assuming the jumps are "small" and that p(y, t) is a slowly varying function of y.

6.3 Macroscopic Equation

• R. Kubo, K. Matsuo and K. Kitahara (1973) "Fluctuation and relaxation of macrovariables," Journal of Statistical Physics 9: 51–96.

We know that the study of physical phenomena always starts from a phenomenological approach, reflecting the initial stage of organization of experimental facts. This stage often leads to a mathematical model in terms of differential equations where the variables of interest are macroscopic variables, in which the microscopic fluctuations are neglected (averaged over), resulting in a deterministic theory. Examples include Ohm's law, chemical rate equations, population growth dynamics, etc. At the next, more fundamental (mesoscopic) level, the fluctuations are taken into account – by the master equation (or the Fokker-Planck equation, if it applies) for stationary Markov processes. As the latter determines the entire probability distribution, it must be possible to derive from it the macroscopic equations as an approximation for the case that the fluctuations are negligible.

Let Y be a physical quantity with Markov character, taking the value y0 at t = 0. For definiteness, say the system is closed and isolated. Then we have from equilibrium statistical mechanics that the probability distribution p(y, t) tends toward some equilibrium distribution peq(y) as t → ∞,

lim_{t→∞} p(y, t) = peq(y),

with P(y, t|y0, 0) → δ(y − y0) as t → 0. We know from experience that the fluctuations remain small during the whole process, and so p(y, t) for each t is a sharply-peaked function (see Figure 6.2). The location of this peak is a fairly well-defined number,

Figure 6.2: Evolution of p(y, t), from an initial delta-peak at y0, toward its equilibrium distribution peq(y); the peak follows ȳ(t) toward yeq.

having an uncertainty of the order of the width of the peak, and it is to be identified with the macroscopic value ȳ(t). As t increases, the peak slides bodily along the y-axis from its initial location ȳ(0) = y0 to its final location ȳ(∞) = yeq, the width of the distribution growing to its equilibrium value. Bearing this picture in mind, one usually takes,

ȳ(t) = ⟨Y⟩_t = ∫ y p(y, t) dy.   (6.12)

Using either the master equation or the Fokker-Planck equation, we have:

dȳ(t)/dt = ∫ y ∂p(y, t)/∂t dy = ∫ a1(y) p(y, t) dy = ⟨a1(y)⟩_t,   (6.13)

which we identify as the macroscopic equation.

Example: One-step process (see p. 62). The discrete master equation reads,

dp_n/dt = r_{n+1} p_{n+1} + g_{n−1} p_{n−1} − (r_n + g_n) p_n.   (6.14)

Multiplying both sides by n, and summing over all n ∈ (−∞, ∞), we obtain the evolution equation for the average ⟨n⟩,

d⟨n⟩/dt = ⟨g_n⟩ − ⟨r_n⟩.   (6.15)

For example, with r_n = βn and g_n = α, Eq. 6.15 reduces to:

d⟨n⟩/dt = α − β⟨n⟩.   (6.16)

Note in this last example, we had ⟨r_n⟩_t = ⟨r(n)⟩_t = r(⟨n⟩_t), and similarly for g_n. The situation illustrated by this example holds in general:

Proposition: If the function a1(y) is linear in y, then the macroscopic variable ȳ(t) satisfies the macroscopic equation,

dȳ(t)/dt = a1(ȳ),

which follows exactly from the master equation.
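A direct numerical check of the example above (my own sketch, with illustrative rates α = 10, β = 1): integrating the truncated master equation (6.14) forward in time, the computed mean ⟨n⟩(t) tracks the macroscopic solution (α/β)(1 − e^(−βt)) of Eq. 6.16, as the proposition guarantees for linear rates:

```python
import math

alpha, beta = 10.0, 1.0
N = 80                        # truncation: P(n >= 80) is negligible for mean 10
p = [0.0] * (N + 1)
p[0] = 1.0                    # start from a delta-peak at n = 0

dt, T = 0.002, 3.0
for _ in range(int(T / dt)):
    dp = [0.0] * (N + 1)
    for n in range(N + 1):
        gain = (alpha * p[n - 1] if n > 0 else 0.0) \
             + (beta * (n + 1) * p[n + 1] if n < N else 0.0)
        loss = (alpha * p[n] if n < N else 0.0) + beta * n * p[n]
        dp[n] = gain - loss
    p = [p[n] + dt * dp[n] for n in range(N + 1)]

mean = sum(n * p[n] for n in range(N + 1))
exact = (alpha / beta) * (1.0 - math.exp(-beta * T))   # solution of Eq. 6.16
```

Births out of the top state n = N are disabled so that probability is conserved exactly by the truncated scheme; the truncation bias is astronomically small here.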

In the literature, one often finds Eq. 6.13 read as the approximation,

dȳ(t)/dt ≈ a1(ȳ),   (6.17)

even when a1(y) is nonlinear. The (usually-not-stated) assumption is that since the fluctuations ⟨(y − ȳ)²⟩ are small, ⟨a1(y)⟩_t ≈ a1(ȳ); this is the "meaning" that is assigned to the macroscopic equation when a1(y) is nonlinear.

It is also possible to deduce from the master equation (or the Fokker-Planck equation) an approximate evolution equation for the width of the distribution. First, one shows that

d⟨y²⟩_t/dt = ⟨a2(y)⟩_t + 2⟨y a1(y)⟩_t,   (6.18)

from which it follows that the variance σ²(t) = ⟨y²⟩_t − ⟨y⟩²_t obeys,

dσ²/dt = ⟨a2(y)⟩_t + 2⟨(y − ȳ) a1(y)⟩_t.   (6.19)

To estimate the averages, assume a1(y) is smooth and expand it about ⟨y⟩_t = ȳ,

a1(y) = a1(ȳ) + a1′(ȳ)(y − ȳ) + (1/2) a1″(ȳ)(y − ȳ)² + ⋯

Taking the average,

⟨a1(y)⟩_t ≈ a1(ȳ) + (1/2) a1″(ȳ) ⟨(y − ȳ)²⟩_t,

so that,

dȳ(t)/dt = a1(ȳ) + (1/2) σ² a1″(ȳ),   (6.20)

and the macroscopic equation is no longer determined uniquely by the initial value of ⟨y⟩. Proceeding in the same way for the variance,

dσ²/dt = a2(ȳ) + 2σ² a1′(ȳ),   (6.21)

which is exact if a1 and a2 are linear in y, and only an approximation otherwise. This equation for the variance may now be used to compute corrections to the macroscopic equation.

Note that, by definition, a2(y) > 0, and for the system to evolve toward a stable steady state, one needs a1′(y) < 0. It follows that σ² tends to increase at the rate a2, but this tendency is kept in check by the second term of Eq. 6.21; hence,

σ² → a2 / (2|a1′|),   (6.22)

and so the condition for the approximate validity of Eq. 6.17 – namely that the 2nd term in Eq. 6.20 be small compared to the first – is given by,

(a2 / (2|a1′|)) · (1/2)|a1″| ≪ |a1|,   (6.23)

which says that it is the second derivative of a1 (responsible for the departure from linearity) which must be small. The linear noise approximation (Section 5.1) provides a more satisfying derivation of the macroscopic evolution from the master equation, proceeding as it does from a systematic expansion in some well-defined small parameter.

6.3.1 Coefficients of the Fokker-Planck Equation

The coefficients of the Fokker-Planck equation,

∂p(y, τ)/∂τ = −∂/∂y [a1(y) p(y, τ)] + (1/2) ∂²/∂y² [a2(y) p(y, τ)],

are given by,

a1(y) = ∫ ∆y w(y, ∆y) d∆y,  a2(y) = ∫ ∆y² w(y, ∆y) d∆y.   (6.24)

In theory, they can be computed explicitly, since w(y, ∆y) is known from an underlying microscopic theory. In practice, that is not always easy to do, and an alternative method would be highly convenient. The way one usually proceeds is the following:

Let p(y, t|y0, t0) be the solution of Eq. 6.24 with initial condition δ(y − y0). Neglecting fluctuations, the macroscopic equation is,

d⟨y⟩/dt = ⟨a1(y)⟩_t ≈ a1(⟨y⟩_t).

Since this equation must coincide with the equation known from the phenomenological theory, the function a1 is determined. Next, the steady-state solution of the Fokker-Planck equation is,

p_s(y) = (constant / a2(y)) exp[ 2 ∫^y (a1(y′)/a2(y′)) dy′ ].   (6.25)

This, of course, is only possible if p_s is integrable. Eq. 6.25 may be identified with the equilibrium probability distribution peq: for closed systems, we know peq from statistical mechanics, hence a2 is also determined. One may then construct a Markov process with transition probability p(y2, t2|y1, t1), whose one-time distribution p(y1, t1) may still be chosen arbitrarily at one initial time t0. Then,

p(y2, t2) = ∫ p(y2, t2|y1, t1) p(y1, t1) dy1.

If one chooses the "steady-state" solution of Eq. 6.25, then the resulting Markov process is stationary.

The simplest of all procedures, however, is the derivation based upon the Langevin equation. Recall that this begins with the dynamical description,

dȳ/dt + βȳ = f(t),   (6.26)

with assumptions about the statistics of the fluctuating force f(t),

⟨f(t)⟩ = 0,  ⟨f(t1)f(t2)⟩ = 2D δ(t1 − t2).

(Note the various – for the most part uncontrolled – assumptions made in this derivation.) To "derive" the coefficients of the Fokker-Planck equation, we proceed as follows. Integrating Eq. 6.26 over a short time interval,

∆ȳ = −βȳ ∆t + ∫_t^{t+∆t} f(ξ) dξ,   (6.27)

so that,

a1(ȳ) = lim_{∆t→0} (1/∆t) ⟨∆ȳ⟩ = −βȳ + lim_{∆t→0} (1/∆t) ∫_t^{t+∆t} ⟨f(ξ)⟩ dξ = −βȳ,

since ⟨f(t)⟩ = 0. Similarly,

⟨(∆ȳ)²⟩ = β²ȳ² (∆t)² + ∫_t^{t+∆t} ∫_t^{t+∆t} ⟨f(ξ)f(η)⟩ dξ dη

(the cross-term vanishes because ⟨f⟩ = 0), from which it follows – using Eq. 6.27 – that a2(ȳ) = 2D. The corresponding Fokker-Planck equation reads,

∂p/∂t = β ∂/∂y (yp) + D ∂²p/∂y².   (6.28)

This coincides with the appropriate Fokker-Planck equation for the position of a Brownian particle in the absence of an external field, as derived by Ornstein and Uhlenbeck.

The identification of a1(y) as the macroscopic rate law is really only valid when a1(y) is a linear function of y – as it is in the Ornstein-Uhlenbeck process described above. For nonlinear systems, derivation of the Fokker-Planck equation in the manner described here can lead to serious difficulties, and so the systematic expansion scheme described in Section 5.1 is indispensable. See N. G. van Kampen (1965) Fluctuations in Nonlinear Systems.

6.4 Pure Diffusion and Ornstein-Uhlenbeck Processes

As examples of the Fokker-Planck equation, we shall consider two families of diffusive equations – pure diffusion, and diffusion with drift (which is the same as the Ornstein-Uhlenbeck process shown above, Eq. 6.28).
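The jump moments can also be estimated numerically. The sketch below (my own, with illustrative values β = 2, D = 0.5, y = 1.5) draws many short-time Langevin increments ∆y = −βy∆t + ∫f dξ, with the integrated force modeled as a Gaussian of variance 2D∆t, and recovers a1(y) ≈ −βy and a2(y) ≈ 2D from the sample moments:

```python
import random
import math

beta, D, y = 2.0, 0.5, 1.5
dt, n_samples = 1e-3, 200_000
rng = random.Random(42)

sigma = math.sqrt(2.0 * D * dt)     # std. dev. of the integrated force over dt
s1 = s2 = 0.0
for _ in range(n_samples):
    dy = -beta * y * dt + rng.gauss(0.0, sigma)
    s1 += dy
    s2 += dy * dy

a1_est = s1 / (n_samples * dt)      # -> -beta * y  (here -3)
a2_est = s2 / (n_samples * dt)      # -> 2 * D      (here 1, up to O(dt))
```

The deterministic part of ⟨(∆y)²⟩ contributes only at O(∆t²), which is why a2_est converges to 2D as ∆t → 0, exactly as in the derivation above.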

6.4.1 Wiener process

First consider the simple case with a1 = 0 and a2 = 2k, where k is some (positive) constant. Then we have the problem,

∂p/∂t = k ∂²p/∂y²,   (6.29)

with p(y|y0, 0) = δ(y − y0). This is a parabolic partial differential equation with constant coefficients, so it is readily solved using a Fourier transform in y (see Section B.3.2 on page 263). Taking

φ(s, t) = ∫ e^{isy} p(y, t) dy,

the transformed Eq. 6.29 leads to the ordinary differential equation in φ,

dφ(s, t)/dt = −s²k φ(s, t),  φ(s, 0) = e^{isy0},

whose solution is,

φ(s, t) = exp[isy0 − ks²t].

The inverse Fourier transform back to p(y, t) yields what is sometimes called the 'heat kernel',

p(y, t) = (1/√(4πkt)) exp[ −(y − y0)²/(4kt) ].   (6.30)

This is a very famous Fourier transform that appears in many models of heat conduction. In particular, for k = D, this coincides with Einstein's result for a free particle in Brownian motion (i.e., in the absence of a damping force). This purely diffusive process,

dW/dt = √(2D) η(t),   (6.31)

is sometimes also called a Wiener process (denoted W(t)), although that designation is not very descriptive, or sometimes even called Brownian motion.
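Eq. 6.30 is easy to verify with a finite-difference check (my own sketch, with k = 1): starting from the heat kernel at t0 = 0.1 and stepping ∂p/∂t = k ∂²p/∂y² forward with an explicit scheme, the numerical solution stays within a small tolerance of the analytic kernel at a later time:

```python
import math

def heat_kernel(y, t, k=1.0, y0=0.0):
    """Analytic solution (6.30) of dp/dt = k d2p/dy2 from a delta at y0."""
    return math.exp(-(y - y0) ** 2 / (4.0 * k * t)) / math.sqrt(4.0 * math.pi * k * t)

k, dy, dt = 1.0, 0.05, 0.0005            # k*dt/dy**2 = 0.2 < 1/2  (stable)
ys = [-10.0 + dy * i for i in range(401)]
t0, t1 = 0.1, 1.0
p = [heat_kernel(y, t0, k) for y in ys]

for _ in range(int(round((t1 - t0) / dt))):
    lap = [0.0] + [(p[i - 1] - 2.0 * p[i] + p[i + 1]) / dy ** 2
                   for i in range(1, len(p) - 1)] + [0.0]
    p = [p[i] + dt * k * lap[i] for i in range(len(p))]

err = max(abs(p[i] - heat_kernel(ys[i], t1, k)) for i in range(len(ys)))
```

Zero boundary values are harmless here because the Gaussian tails at y = ±10 are negligible over this time span.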

Because the forcing function η(t) is delta-correlated, the autocorrelation function of the Wiener process is straightforward to compute,

⟨W(t₁)W(t₂)⟩ = 2D ∫_0^{t₁} ∫_0^{t₂} δ(u₁ − u₂) du₂ du₁ = 2D min(t₁, t₂).    (6.32)

6.4.2 Ornstein-Uhlenbeck process

Now consider the more general Fokker-Planck equation with a₁ = −ky, a₂ = D,

∂p/∂t = k ∂(yp)/∂y + (D/2) ∂²p/∂y²,    (6.33)

where k and D are (positive) constants, with p(y|y₀, 0) = δ(y − y₀). Proceeding as above, taking the Fourier transform, we are left with a first-order linear partial differential equation,

∂φ/∂t + ks ∂φ/∂s = −(D/2) s² φ.    (6.34)

This is easily solved by the method of characteristics to give the partial solution,

φ(s, t) = e^{−Ds²/4k} g(se^{−kt}),    (6.35)

in terms of an arbitrary function g. From the initial distribution p(y|y₀, 0) = δ(y − y₀), i.e. φ(s, 0) = e^{isy₀}, the arbitrary function g is fully determined:

g(x) = exp[Dx²/4k + iy₀x].    (6.36)

The complete solution is therefore

φ(s, t) = exp[−(Ds²/4k)(1 − e^{−2kt}) + isy₀e^{−kt}].    (6.37)

Comparison with the solution above for the purely diffusive process shows that the density is a Gaussian with

⟨Y(t)⟩ = y₀e^{−kt},   σ² = (D/2k)(1 − e^{−2kt}),    (6.38)

as derived by Ornstein and Uhlenbeck (see Chapter 1).
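The moments in Eq. 6.38 are easy to confirm by simulation, since the Gaussian solution lets the Ornstein-Uhlenbeck process be sampled exactly over any finite step. A minimal sketch (parameter values are illustrative):

```python
import math, random

def ou_moments(k=1.0, D=2.0, y0=2.0, t=1.0, n_paths=5000, n_steps=20, seed=3):
    """Exact one-step sampling of the OU process of Eq. 6.33: over a step dt,
    y -> y*exp(-k dt) + Gaussian noise of variance (D/2k)(1 - exp(-2k dt))."""
    rng = random.Random(seed)
    dt = t / n_steps
    decay = math.exp(-k * dt)
    noise = math.sqrt(D / (2 * k) * (1 - decay ** 2))
    ys = []
    for _ in range(n_paths):
        y = y0
        for _ in range(n_steps):
            y = y * decay + noise * rng.gauss(0.0, 1.0)
        ys.append(y)
    mean = sum(ys) / n_paths
    var = sum((v - mean) ** 2 for v in ys) / n_paths
    return mean, var
```

For k = 1, D = 2, y₀ = 2 and t = 1, Eq. 6.38 gives ⟨Y(1)⟩ = 2e⁻¹ ≈ 0.736 and σ² = 1 − e⁻² ≈ 0.865, which the sample moments should reproduce to within Monte Carlo error.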

Figure 6.3: Some solutions of the Fokker-Planck equation. A) Liouville equation, a₁(y) = A(y), a₂(y) = 0: the initial delta-peak δ(y − y₀) follows the deterministic trajectory y(t), where dy/dt = A(y). B) Wiener process, a₁ = A, a₂ = D, both constant: a Gaussian with mean y(t) = y₀ + A·t whose width 2√(D·t) spreads ∝ √t. C) Ornstein-Uhlenbeck process, a₁ = −ky, a₂ = D: a Gaussian with mean y(t) = y₀e^{−kt} and width 2√((D/2k)(1 − e^{−2kt})), relaxing to an equilibrium distribution around y = 0.

6.4.3 Heuristic Picture of the Fokker-Planck Equation

Examining several limiting cases of the Fokker-Planck equation is useful in developing intuition regarding the behaviour of the solution. Here, again for convenience, is the general Fokker-Planck equation,

∂p(y, τ)/∂τ = −∂/∂y [a₁(y) p(y, τ)] + (1/2) ∂²/∂y² [a₂(y) p(y, τ)].    (6.39)

We begin with the same initial condition p(y, 0) = δ(y − y₀); then by choosing different coefficients a₁(y) and a₂(y), certain characteristics of the solution can be made clear (Figure 6.3). It should be understood that in this figure P(y, t) ≡ P(y, t|y₀, 0).

1. Liouville equation (a₁(y) = A(y), a₂(y) = 0). Without diffusion, the initial delta-distribution follows the deterministic trajectory,

p(y, t|y₀, 0) = δ(y − y(t)),

where y(t) obeys the deterministic ordinary differential equation

dy/dt = A(y),   y(0) = y₀.

The probability remains a delta-peak, localized about y(t).

2. Wiener process (a₁(y) = A, a₂(y) = D). This is the equation governing pure diffusion. The solution is a Gaussian: the mean follows the line y(t) = y₀ + A·t, and the variance grows linearly with t, ⟨y²⟩ − ⟨y⟩² = 2Dt.

3. Ornstein-Uhlenbeck process (a₁(y) = −ky, a₂(y) = D). The solution is a Gaussian. The coefficient a₁(y) = −ky is like a Hooke's force restoring the mean state to y = 0 exponentially quickly, y(t) = y₀e^{−kt}. The variance grows until it reaches the equilibrium value ⟨y²⟩ = D/2k. The equilibrium distribution is the Gaussian centered at y = 0 with variance D/2k,

p∞(y) = [1/√(2π·(D/2k))] exp[−y²/(2·(D/2k))].

6.5 Connection to Langevin Equation

From the work of Einstein and Langevin, there is clearly a direct analogy between the Fokker-Planck and Langevin equations in the case of Brownian motion. That direct correspondence holds in general, and can be uniquely assigned for Langevin equations where the coefficient of the noise term is constant. For non-constant coefficients, the correspondence is no longer unique and an additional interpretation rule for the Langevin equation must be provided. We shall outline the connection between the Fokker-Planck and Langevin equations below, for the cases of linear and non-linear drift coefficients with the requirement that the noise coefficient is constant. Interpretation rules for the case where the noise coefficient is a function of the state variables will be postponed to Chapter 7.

6.5.1 Linear Damping

We shall review the general procedure for the linear case. Langevin begins with the dynamical equation

dy/dt = −βy + η(t),    (6.40)

where the random force η(t) is supposed to have the properties

⟨η(t)⟩ = 0,    (6.41)

⟨η(t)η(t′)⟩ = Γδ(t − t′).    (6.42)

Note that Eqs. 6.41 and 6.42 are not enough to fully specify the stochastic process η(t) – they only specify the first two moments. Also note that because of the random forcing η(t), the variable y(t) becomes a stochastic process. Suppose that y(0) = y₀. Then, from Eq. 6.40,

y(t) = y₀e^{−βt} + e^{−βt} ∫_0^t e^{βt′} η(t′) dt′.    (6.43)

Averaging over the ensemble and using Eq. 6.41, one gets

⟨y(t)⟩_{y₀} = y₀e^{−βt}.    (6.44)

Moreover, after squaring, averaging and using Eq. 6.42, we have

⟨y²(t)⟩_{y₀} = y₀²e^{−2βt} + (Γ/2β)(1 − e^{−2βt}).    (6.45)

The constant Γ is unknown so far; however, for t ≫ 1/β the effect of the initial conditions must disappear and the system reaches equilibrium. Equilibrium statistical mechanics (the equipartition of energy) can be used to derive

(1/2) m ⟨y²(∞)⟩_{y₀} = (1/2) kB T,    (6.46)

where kB is Boltzmann's constant and T is the equilibrium temperature. Taking t → ∞ in Eq. 6.45 and comparing with the equilibrium result above, we have

Γ = (2β/m) kB T,    (6.47)

an example of the so-called Fluctuation-Dissipation Relation (see Eq. 2.31 on page 43). Physically, the random force η(t) creates a tendency for y to spread out over an ever broadening range of values, while the damping term tries to bring y back to the origin. The equilibrium distribution is the resulting compromise between these two opposing tendencies.

To derive the relation between the Langevin equation and the Fokker-Planck equation, take for t in Eq. 6.44 a small time interval ∆t. Then,

⟨∆y⟩_{y₀} = ⟨y(t)⟩_{y₀} − y₀ = −βy₀∆t + O(∆t²).    (6.48)
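The fluctuation-dissipation balance is easy to check numerically: simulating Eq. 6.40 by the Euler-Maruyama method, the long-time average of y² should approach the equilibrium value Γ/2β from Eq. 6.45. A minimal sketch (the time step and run lengths are illustrative choices):

```python
import math, random

def equilibrium_variance(beta=1.0, Gamma=4.0, dt=0.01,
                         t_relax=10.0, t_sample=2000.0, seed=11):
    """Euler-Maruyama for dy/dt = -beta*y + eta(t), <eta(t)eta(t')> = Gamma*delta;
    the time average of y^2 after an initial transient estimates Gamma/(2*beta)."""
    rng = random.Random(seed)
    amp = math.sqrt(Gamma * dt)   # white noise enters as sqrt(Gamma*dt)*N(0,1)
    y, acc, count = 0.0, 0.0, 0
    t, total = 0.0, t_relax + t_sample
    while t < total:
        y += -beta * y * dt + amp * rng.gauss(0.0, 1.0)
        t += dt
        if t > t_relax:
            acc += y * y
            count += 1
    return acc / count
```

With β = 1 and Γ = 4, the estimate should settle near Γ/2β = 2, the "compromise" between spreading and damping described above.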

Similarly, from the square of Eq. 6.43, along with Eqs. 6.41 and 6.42,

⟨∆y²⟩_{y₀} = Γ∆t + O(∆t²).    (6.49)

Therefore, the drift coefficient in the Fokker-Planck equation is

a₁(y₀) = lim_{∆t→0} ⟨∆y⟩_{y₀}/∆t = −βy₀,    (6.50)

and the diffusion coefficient is a₂(y) = Γ. The resulting Fokker-Planck equation reads

∂p/∂t = ∂/∂y [(βy)p] + (1/2) ∂²/∂y² (Γp).    (6.51)

This Fokker-Planck equation returns the same values for the first- and second-moments of y(t) as does the Langevin equation (6.40); however, it cannot yet be said that they are equivalent, because the higher moments do not agree. The Fokker-Planck equation provides a definite expression for each moment, whereas the Langevin equation does not, unless higher moments of η(t) are specified. Customarily, the assumption is:

• All odd moments of η(t) vanish.

• All even moments obey the same rule as for Gaussian distributions, e.g. decomposition into pair-wise products,

⟨η(t₁)η(t₂)η(t₃)η(t₄)⟩ = ⟨η(t₁)η(t₂)⟩⟨η(t₃)η(t₄)⟩ + ⟨η(t₁)η(t₃)⟩⟨η(t₂)η(t₄)⟩ + ⟨η(t₁)η(t₄)⟩⟨η(t₂)η(t₃)⟩
= Γ²[δ(t₁−t₂)δ(t₃−t₄) + δ(t₁−t₃)δ(t₂−t₄) + δ(t₁−t₄)δ(t₂−t₃)].

(Note that noise η(t) defined as above is called Gaussian white-noise.) With this stipulation, the joint distribution of all quantities η(t′) is Gaussian. According to Eq. 6.43, the value of y(t) is a linear combination of the values that η(t′) takes at all previous times (0 ≤ t′ ≤ t); it follows that y(t) is Gaussian. By the same argument, the joint distribution of y(t₁), y(t₂), . . . is Gaussian. Hence the process y(t) determined by Eq. 6.40 with initial condition y₀ is Gaussian. On the other hand, we know that the solution of the Fokker-Planck equation (6.51) with initial value y₀ is also a Gaussian. Since its coefficients have been chosen so that the first- and second-moments of both Gaussians are the same, it follows that the two Gaussians are identical. The equivalence between the Fokker-Planck and Langevin descriptions then holds.

6.5.2 Nonlinear Damping

Another case occurring frequently in applications is when the Langevin equation contains a nonlinear damping term,

dy/dt = A(y) + η(t),    (6.53)

where the random force is still assumed to be Gaussian white-noise, but the damping term A(y) is a nonlinear function of y. Although in this case an explicit solution for y(t) is not usually available, it can still be argued that the Langevin equation is equivalent to the Fokker-Planck equation

∂p/∂t = −∂/∂y [A(y)p] + (1/2) ∂²/∂y² (Γp).    (6.54)

Outline of the proof: First, Eq. 6.53 along with the initial condition y(0) = y₀ uniquely determines y(t), since it is clear that for each sample function η(t) the existence/uniqueness theorem for differential equations applies. Since the values of η(t) at different times are independent, it follows that y(t) is Markovian. Hence, it obeys a master equation which may be written in the Kramers-Moyal form (see Eq. 6.11 on page 126). Next, for a very small ∆t, Eq. 6.53 gives

∆y = ∫_t^{t+∆t} A(y(t′)) dt′ + ∫_t^{t+∆t} η(t′) dt′,    (6.55)

hence, upon averaging,

⟨∆y⟩_{y₀} = A(y₀)∆t + O(∆t²),    (6.56)

to yield a₁(y) = A(y) as before. Taking the square of Eq. 6.55 and averaging,

⟨∆y²⟩ = ⟨[∫_t^{t+∆t} A(y(t′)) dt′]²⟩ + 2 ∫_t^{t+∆t} ∫_t^{t+∆t} ⟨A(y(t′))η(t″)⟩ dt″ dt′ + ∫_t^{t+∆t} ∫_t^{t+∆t} ⟨η(t′)η(t″)⟩ dt″ dt′.
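The equivalence asserted by Eq. 6.54 can be spot-checked numerically. For constant Γ, the stationary solution of Eq. 6.54 is p∞(y) ∝ exp[(2/Γ)∫A(y′)dy′]; the sketch below compares ⟨y²⟩ from a long simulation of Eq. 6.53 against the same moment computed from p∞ by quadrature (the cubic drift A(y) = y − y³ and all numerical parameters are illustrative choices, not from the text):

```python
import math, random

def compare_stationary_moment(Gamma=1.0, dt=0.005, t_total=2000.0, seed=5):
    """Langevin dy/dt = A(y) + eta(t) with the illustrative drift A(y) = y - y^3.
    The stationary FPE solution is p(y) ~ exp[(2/Gamma)(y^2/2 - y^4/4)];
    compare <y^2> from the time average with the quadrature value."""
    A = lambda y: y - y ** 3
    rng = random.Random(seed)
    amp = math.sqrt(Gamma * dt)
    y, acc, n = 0.5, 0.0, 0
    for _ in range(int(t_total / dt)):
        y += A(y) * dt + amp * rng.gauss(0.0, 1.0)
        acc += y * y
        n += 1
    sim = acc / n
    # quadrature of the stationary density on a grid
    dy = 0.01
    grid = [-3.0 + i * dy for i in range(601)]
    w = [math.exp((2.0 / Gamma) * (g ** 2 / 2 - g ** 4 / 4)) for g in grid]
    quad = sum(wi * g * g for wi, g in zip(w, grid)) / sum(w)
    return sim, quad
```

The two estimates of ⟨y²⟩ should agree to within the sampling error of the trajectory average, even though A(y) is nonlinear.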

The first line is O(∆t²). To untangle the second term, expand A(y(t′)) about y(t),

A(y(t′)) = A(y(t)) + A′(y(t))(y(t′) − y(t)) + · · · .    (6.58)

Substituting this in the second term above,

2 ∫_t^{t+∆t} ∫_t^{t+∆t} ⟨A(y(t′))η(t″)⟩ dt″ dt′ = 2A(y(t))∆t ∫_t^{t+∆t} ⟨η(t″)⟩ dt″ + 2A′(y(t)) ∫_t^{t+∆t} ∫_t^{t+∆t} ⟨[y(t′) − y(t)]·η(t″)⟩ dt″ dt′ + · · · ,    (6.59)

where the first term vanishes since ⟨η⟩ = 0, and the last term is O(∆t²), so it does not contribute to a₂(y). The third line gives

∫_t^{t+∆t} ∫_t^{t+∆t} ⟨η(t′)η(t″)⟩ dt″ dt′ = ∫_t^{t+∆t} Γ dt′ = Γ∆t,

to give a₂(y) = Γ. This agrees with the last term of Eq. 6.54. Similarly, one can show that all higher-order moments contribute nothing, since ⟨(∆y)ⁿ⟩ = o(∆t). That concludes the outline of the proof.

In summary, the Langevin equation

dy/dt = A(y) + cη(t),    (6.57)

(with ⟨η(t)η(t′)⟩ = Γδ(t − t′) and the higher moments of η(t) suitably defined) is equivalent to the Fokker-Planck equation

∂p/∂t = −∂/∂y [A(y)p] + (c²Γ/2) ∂²p/∂y²,

irrespective of the form of A(y). Difficulties emerge once c(y) is no longer constant. We can proceed naïvely, and introduce the change of variables

ŷ = ∫ dy/c(y),   Â(ŷ) = A(y)/c(y),   P̂(ŷ) = P(y)c(y),    (6.60)


transforming the nonlinear Langevin equation¹,

dy/dt = A(y) + c(y)η(t),    (6.61)

to the Langevin equation described above,

dŷ/dt = Â(ŷ) + η(t).    (6.62)

This equation, in turn, is equivalent to the Fokker-Planck equation

∂P̂/∂t = −∂/∂ŷ [Â(ŷ)P̂] + (Γ/2) ∂²P̂/∂ŷ².    (6.63)

In the original variables,

∂P(y, t)/∂t = −∂/∂y {[A(y) + (Γ/2)c(y)c′(y)] P(y, t)} + (Γ/2) ∂²/∂y² [c²(y)P(y, t)].    (6.64)

There is a problem – Eq. 6.61 as it stands has no meaning! The difficulty comes in trying to interpret the product c(y)η(t). Since η(t) causes a jump in y, what value of y should be used in c(y)? Specifically, how is one to interpret the c(y(t′))η(t′)dt′ term in the equivalent integral equation,

y(t + dt) − y(t) = ∫_t^{t+dt} A(y(t′)) dt′ + ∫_t^{t+dt} c(y(t′))η(t′) dt′ ?

Stratonovich chose to replace c(y(t′)) by its mean value over the interval, giving

y(t + dt) − y(t) = ∫_t^{t+dt} A(y(t′)) dt′ + c([y(t) + y(t + dt)]/2) ∫_t^{t+dt} η(t′) dt′.    (6.65)
One can prove that this interpretation indeed leads to Eq. 6.64. Other choices are possible, generating diﬀerent transformation laws. The calculus of stochastic functions is the focus of the next chapter, but ﬁrst we shall turn attention to a classic application of the Fokker-Planck equation to the modeling of a physical system – Kramers escape over a potential barrier.

Langevin equation is the designation usually given to any equation of the Langevin type where c(y) is non-constant, irrespective of the (non)linearity of A(y).

1 Nonlinear

Fokker-Planck Equation

143

6.6

Limitations of the Kramers-Moyal Expansion

The Fokker-Planck equation, as derived in Chapter 6, remains the most popular approximation method to calculate a solution for the probability distribution of ﬂuctuations in nonlinear systems. Typically, the FokkerPlanck equation is generated through a two-term Kramers-Moyal expansion of the master equation (see Eq. 6.11 on page 126). The result is a Fokker-Planck equation with nonlinear drift and diﬀusion coeﬃcients. There are several problems with this approach – First, the expansion is not consistent since the step-operator is approximated by a continuous operator, but the transition probabilities remain as nonlinear functions of the microscopic variables. Second, the nonlinear Fokker-Planck equation has no general solution for state dimensions greater than one. That is, as soon as the system of interest has more than one variable, the nonlinear Fokker-Planck equation must be solved numerically or by some specialized approximation methods. Since the nonlinear Fokker-Planck equation derived by the Kramers-Moyal expansion is consistent only insofar as it agrees with the linear Fokker-Planck equation derived using the linear noise approximation, and since the nonlinear Fokker-Planck equation is so diﬃcult to solve, the Kramers-Moyal expansion seems an undesirable approximation method. For a more detailed discussion, see: • N. G. van Kampen (1981) “The validity of non-linear Langevinequations,” Journal of Statistical Physics 25: 431. • N. G. van Kampen, “The diﬀusion approximation for Markov processes,” in Thermodynamics and kinetics of biological processes, Ed. I. Lamprecht and A. I. Zotin (Walter de Gruyter & Co., 1982).

6.7

• • •

**Example – Kramer’s Escape Problem
**

Kramers, HA (1940) “Brownian motion in a ﬁeld of force and the diﬀusion model of chemical reactions,” Physica 7: 284. D. ter Haar, Master of Modern Physics: The Scientiﬁc Contributions of H. A. Kramers. Princeton University Press, 1998. Chapter 6. P. H¨nggi, P. Talkner and M. Borkovec (1990) “Reaction-rate theory: 50 years a after Kramers,” Reviews of Modern Physics 62: 251.

Figure 6.4: Kramers Escape from a Potential Well. A) Potential field U(x) with a smooth barrier of height W between the metastable minimum at x = A and the barrier top at x = B; redrawn from Kramers (1940). B) The steady-state probability distribution is used, normalized as though the barrier at x = B were not there; each attempt on the barrier succeeds with probability ∝ exp[−W/kT]. The escape time is computed by calculating the rate at which particles pass X = B with positive velocity.

Briefly, Kramers' escape problem is the following: Consider a particle trapped in a potential well, subject to Brownian motion (Figure 6.4). What is the probability that the particle will escape over the barrier? Among the many applications of this model, Kramers suggested that the rates of chemical reactions could be understood via this mechanism. Imagine a Brownian particle subject to a position-dependent force F(x), in addition to the damping and fluctuations of Brownian motion,

m d²X/dt² = −γ dX/dt + F(X) + η(t).    (6.66)

Here, η(t) is the same Gaussian white-noise forcing that appears in Langevin's equation. Suppose the force can be written as the gradient of a potential U(X) (as is always the case in one dimension), F(X) = −U′(X). Kramers had in mind a two-welled potential, as depicted in Figure 6.4a. To derive the associated Fokker-Planck equation, it is convenient to rewrite Eq. 6.66 explicitly in terms of the position X and the velocity V,

dX/dt = V,   m dV/dt = −γV − U′(X) + η(t).

We are after a Fokker-Planck equation that describes the bivariate probability distribution P(X, V, t). The coefficients for the drift are straightforward to calculate (see Section 6.3.1),

lim_{∆t→0} ⟨∆X⟩/∆t = V,   lim_{∆t→0} ⟨∆V⟩/∆t = F(X)/m − (γ/m)V.    (6.67)
It is likewise straightforward to show that the diffusion coefficients involving ∆X² and ∆X∆V vanish,

lim_{∆t→0} ⟨∆X²⟩/∆t = lim_{∆t→0} ⟨∆X∆V⟩/∆t = 0.    (6.68)

For the ∆V²-term, we must appeal to statistical mechanics, which says that the equilibrium distribution for the velocity is given by the Maxwell distribution,

P^e(V) = (m/2πkT)^{1/2} exp[−mV²/2kT].    (6.69)

In that way, one can show (Exercise 9a),

lim_{∆t→0} ⟨∆V²⟩/∆t = γ kT/m.    (6.70)
The full Fokker-Planck equation is then

∂P(X, V, t)/∂t = −V ∂P/∂X + (U′(X)/m) ∂P/∂V + (γ/m) ∂/∂V [V P + kT ∂P/∂V],    (6.71)

solved subject to the initial condition P(X, V, 0) = δ(X − A)δ(V). As a first estimate, we assume the well-depth W is large compared with kT. We are then justified in using the stationary distribution for P(X, V, t) around the point X = A,

P^s(X, V) = N · exp[−(U(X) + ½V²)/kT].    (6.72)

There is escape over the barrier, so obviously P^s(X, V) is not correct for long times; but over short times, the probability is very small near the top of the barrier X = B (Figure 6.4b). We can then set P^s(X′, V) = 0 for X′ > B. The normalization constant N is approximately

N⁻¹ = √(2πkT/m) ∫_{−∞}^{B} e^{−U(x)/kT} dx ≈ (2πkT/ωa) e^{−U(A)/kT},
where ωa² = U″(A) is the curvature at the base of the well, X = A. The escape rate comes from calculating the outward flow across the top of the barrier, at X = B,

1/τ = ∫_0^∞ V P^s(X, V) dV = N e^{−U(B)/kT} ∫_0^∞ V e^{−V²/2kT} dV = (ωa/2π) e^{−W/kT}.    (6.73)
The underlying assumption is that if the particle is at the top of the barrier with positive velocity, then it will escape – as if there were an absorbing barrier at X = B. A heuristic interpretation of Eq. 6.73 is that the particle oscillates in a potential well ½ωa²X² and therefore hits the wall ωa/2π times per second, each time with a probability of e^{−W/kT} to get across (Figure 6.4b). In a more sophisticated treatment, the Fokker-Planck Eq. 6.71 is solved with an initial distribution inside the well, and the escape rate is determined by the flow past a point X = C sufficiently far from the top of the barrier that the probability of return is negligible (Figure 6.4a). This is Kramers' escape problem. There is no known solution for a potential well of the shape drawn in Figure 6.4a, so clever approximation methods are the only recourse for estimating the escape time. In Exercise 11, an approximate solution is developed in the limit of large friction, γ ≫ 1.
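The saddle-point steps behind Eq. 6.73 can be verified numerically for a concrete well. The sketch below (the quartic double-well, the temperature, and m = 1 are illustrative choices, not from the text) evaluates the flux formula by direct quadrature and compares it against the closed form (ωa/2π)e^{−W/kT}:

```python
import math

def kramers_rate_check(kT=0.05):
    """Illustrative well U(x) = x^4/4 - x^2/2 (minimum A = -1, barrier top B = 0,
    so W = 1/4 and omega_a^2 = U''(-1) = 2, taking m = 1).  Compare the flux rate
    1/tau = kT e^{-U(B)/kT} / (sqrt(2 pi kT) * I),  I = int_{-inf}^{B} e^{-U/kT} dx,
    evaluated by quadrature, with the saddle-point form (omega_a/2 pi) e^{-W/kT}."""
    U = lambda x: x ** 4 / 4 - x ** 2 / 2
    dx = 1e-3
    xs = [-3.0 + i * dx for i in range(int(3.0 / dx) + 1)]   # [-3, 0] covers the well
    I = sum(math.exp(-U(x) / kT) for x in xs) * dx
    rate_num = kT * math.exp(-U(0.0) / kT) / (math.sqrt(2 * math.pi * kT) * I)
    omega_a = math.sqrt(2.0)
    W = U(0.0) - U(-1.0)
    rate_saddle = (omega_a / (2 * math.pi)) * math.exp(-W / kT)
    return rate_num, rate_saddle
```

For W/kT = 5, the two rates agree to within a few percent, the residual difference being the anharmonic correction neglected by the Gaussian (saddle-point) approximation of the normalization integral.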

Suggested References

The encyclopedic reference for applications of the Fokker-Planck equation, along with an exhaustive detailing of existing solution methods, is the following text by Risken:

• The Fokker-Planck Equation: Methods of Solution and Applications (2nd Ed.), H. Risken (Springer, 1996).

For all kinds of first-passage and escape problems,

• A Guide to First-Passage Processes, S. Redner (Cambridge University Press, 2001)

is recommended.

Exercises

1. One-step process: Fill in the details in the derivation of Eq. 6.15 from Eq. 6.14.

2. Deterministic dynamics as a Markov process: Consider the deterministic differential equation dy/dt = f(y).

(a) Show that the process y(t) is a Markov process. It may be helpful to write the Dirac delta function as the derivative of the unit-step function.

(b) Derive the associated Fokker-Planck equation (called the Liouville equation). Show that if y(t) satisfies the Liouville equation, then it likewise satisfies the original differential equation.

3. Autocorrelation of the Wiener process: Fill in the details leading to Eq. 6.32.

4. Deterministic equations: Starting from the Fokker-Planck equation,

∂f(x, t)/∂t = −∂/∂x [α₁(x, t)f(x, t)] + (1/2) ∂²/∂x² [α₂(x, t)f(x, t)],

derive the deterministic (macroscopic) rate equations for ⟨X(t)⟩ and ⟨X²(t)⟩. It may help to integrate by parts.

5. Langevin equation with nonlinear drift: In Eq. 6.59, show that the last term,

2A′(y(t)) ∫_t^{t+∆t} ∫_t^{t+∆t} ⟨[y(t′) − y(t)]·η(t″)⟩ dt″ dt′,

is indeed O(∆t²) as claimed in the text.

6. Stratonovich Fokker-Planck equation:

(a) Fill in the missing details that lead from the change of variables, Eq. 6.60, to the Fokker-Planck equation, Eq. 6.64.

(b) If α₁(x) = a(x) + (1/4)α₂′(x) and α₂(x) = b²(x), show that

∂f(x, t)/∂t = −∂/∂x [a(x)f(x, t)] + (1/2) ∂/∂x {b(x) ∂/∂x [b(x)f(x, t)]}.

7. Diffusion in a wedge: For a Brownian particle diffusing in an infinite wedge with absorbing edges (Figure 6.5), compute the probability that a particle starting from (r₀, θ₀) (where θ₀ is the angle with the horizontal) is absorbed by the horizontal edge.

Figure 6.5: Diffusion in a wedge. An infinite wedge forming angle θ with the horizontal; the edges are absorbing. What is the probability that a Brownian particle beginning at (r₀, θ₀) is absorbed by the horizontal edge?

8. First-passage time: Consider the diffusion process X(t) in one dimension on a finite interval (0, L),

∂f(x, t)/∂t = D ∂²f(x, t)/∂x²,   t ≥ 0, 0 ≤ x ≤ L,

with the initial condition f(x, 0) = δ(x − x₀) (where x₀ ∈ (0, L)), and with reflecting boundary conditions at x = 0 and x = L (no-flux boundary conditions).

(a) Solve the Fokker-Planck equation for the process. Hint: Your answer will be given in terms of a cosine Fourier series.

(b) What is the stationary probability density function for the process X(t)?

(c) For absorbing boundary conditions (f(0, t) = f(L, t) = 0), the trapping-probability Tᵢ(t) is the flux of probability through the boundary point at i. Find the splitting probability Sᵢ that a particle beginning at x₀ will be eventually trapped at boundary point i. Find the average lifetime of the particle starting at x = x₀ before it is trapped at x = 0, or unconditionally trapped at either boundary.

(d) Show that for L → ∞, the probability distribution for diffusion on the positive half of the x-axis, with reflecting boundary conditions at x = 0, can be written as

f(x, t) = (1/√(4πDt)) [e^{−(x−x₀)²/4Dt} + e^{−(x+x₀)²/4Dt}].

If the boundary condition is absorbing at x = 0, how is the solution changed? Provide a physical interpretation of the distribution with absorbing conditions at x = 0.

(e) In the semi-infinite domain (L → ∞), suppose the x = 0 boundary is absorbing. Show that for a particle starting at x = x₀, trapping at x = 0 is certain, but that the average lifetime is infinite! This is an example of Zermelo's paradox (see p. 67).

9. Rayleigh particle: In 1891 (some 15 years before Einstein's work), Lord Rayleigh published his study of the motion of a particle buffeted by collisions with the surrounding gas molecules. The motion is studied on a timescale over which the velocity relaxes (a finer time scale than Einstein's study of Brownian motion). In one dimension, the macroscopic equation for the velocity is given by the damping law,

dV/dt = −γV,    (6.74)

and the drift coefficient is easily found,

lim_{∆t→0} ⟨∆V⟩/∆t = −γV.

From statistical mechanics, the equilibrium distribution for the velocity is given by the Maxwell distribution,

P^e(V) = (m/2πkT)^{1/2} exp[−mV²/2kT].    (6.75)

(a) Use this to derive the diffusion coefficient, and thereby the full Fokker-Planck equation governing the distribution for the velocity P(V, t).

(b) For the initial condition V(0) = V₀, find ⟨V(t)⟩ and ⟨V(t)V(t + τ)⟩.

10. Brownian motion in a magnetic field: A Brownian particle with mass m and charge q moves in three-dimensional space spanned by a rectangular coordinate system xyz under the influence of a constant, homogeneous magnetic field B = (0, 0, B)^T directed along the z-axis. The velocity vector V(t) = (Vx(t), Vy(t), Vz(t))^T satisfies the 3-component Langevin equation,

m dV/dt = −β V(t) − q B × V(t) + √(2β kB T) Γ(t),    (6.76)

where β is the friction coefficient, and Γ(t) = (Γx(t), Γy(t), Γz(t))^T is a vector of Gaussian white noise with independent components.

(a) Assuming over-damped motion, neglect the inertial term in Eq. 6.76 and derive the Langevin equation for the position R(t) = (X(t), Y(t), Z(t))^T in matrix form,

dR/dt = C · Γ(t),    (6.77)

where C is a 3 × 3 matrix.

(b) Derive from Eq. 6.77 the coefficients of the Fokker-Planck equation governing the joint probability distribution f(x, y, z, t) for the position process R(t).

(c) Solve the Fokker-Planck equation in 10b subject to the initial condition f(x, y, z, 0) = δ(x)δ(y)δ(z) by using a 3-component Fourier transform. Show that the components X(t) and Y(t) are independent Wiener processes, each with zero mean and variance 2D_B t, where D_B = D/[1 + (qB/β)²] and D = kB T/β. What can you say about the motion Z(t)?

11. Large damping (γ ≫ 1) limit of Kramers' Fokker-Planck equation: For large damping, the velocity relaxes very quickly, and so we would like an equation that governs the marginal distribution for the position X alone.

(a) In Eq. 6.71, re-scale the variables,

X = x √(kT/m),   V = v √(kT/m),   F(X) = f(x) √(kT/m),    (6.78)

to derive a re-scaled Fokker-Planck equation.

(b) We seek a perturbation expansion of P(x, v, t) in powers of γ⁻¹. To that end, make the ansatz

P(x, v, t) = P⁽⁰⁾(x, v, t) + γ⁻¹P⁽¹⁾(x, v, t) + γ⁻²P⁽²⁾(x, v, t) + · · · .

Substitute into the re-scaled Fokker-Planck equation, and find P(x, v, t) up to O(γ⁻²).

(c) Integrate P(x, v, t) over v to obtain the marginal distribution P(x, t), and show that it obeys

∂P(x, t)/∂t + γ⁻¹ [d/dx (f(x)P) − ∂²P/∂x²] = O(γ⁻²),

or, in the original variables,

∂P(X, t)/∂t = −∂/∂X [(F(X)/mγ) P(X, t)] + (kT/mγ) ∂²P(X, t)/∂X².    (6.79)
Chapter 7

Stochastic Analysis

The purpose of this paper is to apply the methods and results of modern probability theory to the analysis of the Ornstein-Uhlenbeck distribution, its properties, and its derivation. A stochastic differential equation is introduced in a rigorous fashion to give a precise meaning to the Langevin equation for the velocity function. This will avoid the usually embarrassing situation in which the Langevin equation, involving the second-derivative of x, is used to find a solution x(t) not having a second-derivative.

–J. L. Doob (1942) Annals of Math. 43:351.

7.1 Limits

We return now to Doob's criticism of Ornstein and Uhlenbeck's study of Brownian motion (cf. page 20). To understand Doob's objection, we must extend the definition of the derivative to include random functions. This, in turn, requires a definition for the limit of a sequence of random variables, of which there are several. We choose to use the mean-square limit here because of the close association between limits defined in this way and limits from ordinary calculus. For more details, M. Loève "Probability theory," Van Nostrand (1966) is recommended.

Definition: The random variable ξ is the limit of the sequence of random variables {ξ₁, ξ₂, . . . , ξₙ}, if

lim_{n→∞} ⟨|ξ − ξₙ|²⟩ = 0,    (7.1)

i.e. if for any ε > 0, there exists an N = N(ε) such that for all n > N, ⟨|ξ − ξₙ|²⟩ < ε. A limit defined in this way is usually called a mean-square limit (or a limit in the mean-square), and {ξₙ} is said to converge to ξ in the mean-square. Using Chebyshev's inequality,

P{|ξₙ − ξ| ≥ ε} ≤ ⟨|ξ − ξₙ|²⟩/ε²,

it is straightforward to verify that Eq. 7.1 implies that the sequence also converges in probability,

lim_{n→∞} P{|ξₙ − ξ| ≥ ε} = 0.

This means that, given any ε > 0 and η > 0, there exists an N₀ = N₀(ε, η) such that for all n > N₀, P{|ξₙ − ξ| < ε} > 1 − η. (If ξ is the limit in the mean-square, then it is also the limit in probability.)

Using these definitions, the statement of the ergodic theorem from Chapter 2 can be made more precise. We will not do so here. Instead, we shall discuss some conditions which have to be imposed on a stationary stochastic process to guarantee the validity of the ergodic theorem. One question that arises is: Under what conditions do we have

m = ⟨ξ(t)⟩ = lim_{T→∞} (1/T) ∫_0^T ξ(t) dt = lim_{N→∞} (1/N) Σ_{j=1}^N ξ⁽ʲ⁾(t)?    (7.2)

This question was answered by Slutski (1938) (see Doob's "Stochastic Processes," Wiley (1960) for more details): Consider the centered correlation function,

C(τ) = ⟨[ξ(t + τ) − m][ξ(t) − m]⟩ = B(τ) − m².
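A concrete illustration of the definition: for ξₙ the sample mean of n independent Uniform(0, 1) draws, ⟨|ξₙ − ξ|²⟩ = 1/(12n) → 0, so ξₙ converges in the mean-square to ξ = 1/2. The sketch below estimates the mean-square error empirically (sample sizes and replicate counts are illustrative choices):

```python
import random

def mean_square_error(n, reps=2000, seed=13):
    """Empirical <|xi_n - xi|^2> for xi_n the average of n Uniform(0,1) draws
    and xi = 1/2; the variance of one draw is 1/12, so the error decays as 1/(12 n)."""
    rng = random.Random(seed + n)
    total = 0.0
    for _ in range(reps):
        s = sum(rng.random() for _ in range(n))
        total += (s / n - 0.5) ** 2
    return total / reps

# Scaled by 12*n, the estimate should sit close to 1 for every n,
# exhibiting the 1/n decay that defines mean-square convergence here.
```

This also shows why mean-square convergence implies convergence in probability through Chebyshev's inequality: the bound P{|ξₙ − ξ| ≥ ε} ≤ 1/(12nε²) shrinks with the same 1/n rate.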

Then Eq. 7.2 holds if, and only if,

lim_{T→∞} (1/T) ∫_0^T C(τ) dτ = 0,    (7.3)

where the limits are defined as above. In practice, the function C(τ) → 0 as τ → ∞, since the dependence between ξ(t + τ) and ξ(t) usually weakens as τ → ∞; in this case Eq. 7.3 is satisfied. Another case of practical importance occurs when C(τ) = (function → 0 as τ → ∞) + (periodic terms), as when ξ(t) contains a purely harmonic component of the type ξ = ξ₀e^{iωt}, where ξ₀ is a random variable and ω is a real number. Then, too, Eq. 7.3 is satisfied.

7.2 Mean-square Continuity

Throughout this chapter, we shall modify the standard definitions of calculus (continuity, differentiability and integrability) to include stochastic processes. The idea of convergence in the mean-square will be crucial; indeed, we shall find that wherever mean-squared convergence appears, reformulation of the definition in terms of the autocorrelation is natural. We begin with the definition of mean-squared continuity.

Definition: A process ξ(t) is mean-square continuous at t if, and only if, the limit

lim_{h→0} ⟨(ξ(t) − ξ(t − h))²⟩ = 0    (7.4)

exists. The formal definition can be recast in a more useful form involving the correlation function B(t₁, t₂) = ⟨ξ(t₁)ξ(t₂)⟩. The above limit exists if, and only if, the limit

lim_{h₁,h₂→0} [B(t − h₁, t − h₂) − B(t, t)] = 0    (7.5)

exists. Notice this is not the same as setting h₁ = h₂ = h and taking the limit as h → 0. If ξ(t) is wide-sense stationary, then B(t, t − τ) = B(τ) is a function of the time-difference only, and the condition for continuity is satisfied if, and only if, the limit

lim_{h→0} [B(h) − B(0)] = 0    (7.6)

exists; that is, if, and only if, B(τ) is continuous at τ = 0.

i.3 Stochastic Diﬀerentiation Deﬁnition: A process ξ(t) is mean-square diﬀerentiable at t if. ξ ′ (t) ∂2 B(t1 . the limit 1 [B(t − h1 .9) must exist.7) exists. From Eq. (7. h2 →0 h1 h2 (7. If the process ξ(t) is wide-sense stationary. The deﬁnition. h→0 h2 lim (7. denoted by ξ ′ (t). and only if. t2 ). t2 ) = ξ(t1 )ξ(t2 ) is diﬀerentiable at the point t = t1 = t2 . one can show that Eq.7 is satisﬁed if. it is continuous for all time.Random processes 155 must exist.e. as usual. but we shall also show that this does not matter in the least for the modeling of physical systems. such that the limit lim ξ(t + h) − ξ(t) − ξ ′ (t) h 2 The derivative of a stochastic process is deﬁned in a formal way: h→0 = 0. t − h2 ) − B(t. then B(t. dτ With these conditions in hand. h1 . and only if. 7. t2 ) = or. 7.6 it should be clear that if a wide-sense stationary process is continuous at one time t. d2 B (τ ) = − 2 B(τ ). 7. the autocorrelation function of the derivative ξ ′ (t) is given by. As a more useful corollary. t) + B(t. Bξ′ (t) (t1 . it is straightforward to show that the Wiener process and the Ornstein-Uhlenbeck process are not diﬀerentiable (precisely Doob’s objection). ∂t1 ∂t2 . t − h2 ) − B(t − h1 . for a stationary process. t)] . there exists a random variable.8) lim must exist. Moreover. the autocorrelation function B(t1 . t − τ ) = B(τ ) is a function of the time-diﬀerence only and the condition for diﬀerentiability is simpliﬁed: The limit 1 [B(h) − 2B(0) + B(−h)] . is diﬃcult to use in practice.

7.3.1 Wiener process is not differentiable

Recall from Chapter 6 (cf. Eq. 6.31 on page 134) that in the absence of a damping force, the Langevin equation reduces to the equation characterizing the Wiener process W(t),

dW/dt = η(t).

We will now show that W(t) is not differentiable – and so the equation above, as it stands, is meaningless. That was Doob's point in his criticism of the work of Ornstein and Uhlenbeck (quoted on page 20). BUT (and this is critically important), we shall show that if the forcing function F(t) is not strictly delta-correlated – that is, if the process has a non-zero correlation time (however small) – then the differential equation

dy/dt = −γy + F(t)    (7.10)

is well-defined and y(t) is differentiable.

In Chapter 6 (Eq. 6.32 on page 135), we derived the autocorrelation function for the Wiener process,

⟨W(t₁)W(t₂)⟩ = 2D min(t₁, t₂).

Clearly, the autocorrelation function indicates that the process W(t) is nonstationary, so we must use Eq. 7.8 to check its differentiability:

lim_{h₁,h₂→0} (2D/h₁h₂) [min(t − h₁, t − h₂) − (t − h₂) − (t − h₁) + t] = lim_{h₁,h₂→0} (2D/h₁h₂) [min(t − h₁, t − h₂) + h₂ + h₁ − t].

The limit does not exist, and we must conclude that the Wiener process is not differentiable (although one can easily show that it is mean-square continuous). One finds that the Ornstein-Uhlenbeck process is likewise not differentiable (Exercise 3a).
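The divergence is easy to tabulate directly from the autocorrelation function: the mean-square of the difference quotient, computed from the second difference of B(t₁, t₂) = 2D min(t₁, t₂), equals 2D/h and blows up as h → 0 (the values of D, t and h below are illustrative choices):

```python
def diff_quotient_ms(D=0.5, t=1.0):
    """Mean-square of the Wiener difference quotient,
    <[(W(t+h) - W(t))/h]^2> = [B(t+h,t+h) - 2B(t+h,t) + B(t,t)]/h^2 = 2D/h,
    which grows without bound as h shrinks."""
    B = lambda t1, t2: 2 * D * min(t1, t2)
    return [(B(t + h, t + h) - 2 * B(t + h, t) + B(t, t)) / h ** 2
            for h in (0.1, 0.01, 0.001)]
```

Each tenfold reduction of h multiplies the mean-square slope by ten, so no limiting random variable ξ′(t) can exist.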

For the sake of example, suppose F(t) is an Ornstein-Uhlenbeck process; the steady-state correlation function for the Ornstein-Uhlenbeck process is the exponential

$$B_F(\tau)=\frac{\Gamma}{2\tau_c}e^{-|\tau|/\tau_c},$$

where we have written the correlation time explicitly, and made the prefactor proportional to 1/τ_c to clarify the correspondence between the present example and the Wiener process studied above, since

$$\lim_{\tau_c\to 0}\frac{1}{2\tau_c}e^{-|\tau|/\tau_c}=\delta(\tau).$$

From Chapter 2 (Eq. 2.19 on page 42), the spectral density of the process y(t) (as characterized by Eq. 7.10) is simply

$$S_{yy}(\omega)=\frac{\Gamma}{\omega^2+\gamma^2}\,\frac{(1/\tau_c)^2}{\omega^2+(1/\tau_c)^2},\qquad(7.11)$$

from which the autocorrelation function follows,

$$B_y(\tau)=\frac{\Gamma}{2\gamma}\,\frac{1}{1-\gamma^2\tau_c^2}\left[e^{-\gamma|\tau|}-\gamma\tau_c\,e^{-|\tau|/\tau_c}\right].\qquad(7.12)$$

Taking the limit (cf. Eq. 7.9),

$$\lim_{h\to 0}\frac{B_y(h)-2B_y(0)+B_y(-h)}{h^2}=-\frac{\Gamma}{2\tau_c}\,\frac{1}{1+\gamma\tau_c}.\qquad(7.13)$$

The limit exists for any non-zero correlation time, however small, so we conclude that the process y(t) defined by the differential equation (7.10) is well-defined and y(t) is differentiable; the derivative of y(t) can be manipulated using the rules of ordinary calculus. Obviously, as τ_c → 0, the limit above becomes undefined. Incidentally, Ornstein and Uhlenbeck never explicitly state that their forcing function is delta-correlated, merely that it is very narrow (G. E. Uhlenbeck and L. S. Ornstein (1930) Physical Review 36: 823–841):

[W]e will naturally make the following assumptions. There will be correlation between the values of [the random forcing

function A(t)] at different times t₁ and t₂ only when |t₁ − t₂| is very small. More explicitly we shall suppose that:

$$\langle A(t_1)A(t_2)\rangle=\phi(t_1-t_2),$$

where φ(x) is a function with a very sharp maximum at x = 0.

Under these assumptions, the derivation of Ornstein and Uhlenbeck is perfectly correct and no questions of inconsistency arise. This led Wang and Uhlenbeck to write as a footnote in a later publication (M. C. Wang and G. E. Uhlenbeck (1945) Reviews of Modern Physics 17: 323–342):

The authors are aware of the fact that in the mathematical literature (especially in papers by N. Wiener, J. L. Doob, and others; cf. for instance Doob, Ann. Math. 43, 351 (1942), also for further references) the notion of a random (or stochastic) process has been defined in a much more refined way. This allows for instance to determine in certain cases the probability that a random function y(t) is of bounded variation, or continuous or differentiable, etc. However, it seems to us that these investigations have not helped in the solution of problems of direct physical interest, and we will, therefore, not try to give an account of them.

7.4 Stochastic Integration

Similar ideas can be used to define the integral of a stochastic process ξ(t):

Definition: A process ξ(t) is mean-square integrable on the interval (0, t) if, and only if, there exists a random variable, denoted by ξ^(−1)(t) or ∫₀ᵗ ξ(u)du, such that the limit

$$\lim_{\varepsilon\to 0}\left\langle\left|\varepsilon\sum_{i=1}^{t/\varepsilon}\xi(i\varepsilon)-\int_0^t\xi(u)\,du\right|^2\right\rangle=0\qquad(7.14)$$

exists (where the sequence of values of ε is chosen such that t/ε are integers). As above with differentiation, the definition can be recast as a condition on the correlation function B(t₁, t₂). Specifically, the limit in Eq. 7.14
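The Riemann-sum characterization of mean-square integrability can be made concrete numerically. The sketch below (an illustration, not from the text) takes a wide-sense stationary process with exponential correlation B(τ) = e^{−|τ|/τ_c} and checks that the double Riemann sum ε²Σ_{i,j} B(iε − jε) converges as ε shrinks to the closed-form second moment ⟨(∫₀ᵗ ξ du)²⟩ = 2∫₀ᵗ(t − τ)B(τ)dτ, so the mean-square integral exists:

```python
import math

def riemann_double_sum(t, tau_c, N):
    """eps^2 * sum_{i,j=1..N} B(i*eps - j*eps) with B(tau) = exp(-|tau|/tau_c)."""
    eps = t / N
    total = 0.0
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            total += math.exp(-abs(i - j) * eps / tau_c)
    return eps * eps * total

def exact_second_moment(t, tau_c):
    """2 * Integral_0^t (t - tau) exp(-tau/tau_c) dtau, in closed form."""
    E = math.exp(-t / tau_c)
    return 2.0 * (t * tau_c - tau_c**2 * (1.0 - E))

t, tau_c = 2.0, 0.5
for N in [50, 200, 400]:
    print(N, riemann_double_sum(t, tau_c, N), exact_second_moment(t, tau_c))
```

The discrete sums settle onto the closed-form value, in contrast to the white-noise case B(t₁, t₂) = δ(t₁ − t₂), for which no such limit exists.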

exists if, and only if, B(t₁, t₂) is Riemann-integrable on the square (0, t) × (0, t); that is,

$$\int_0^t\!\!\int_0^t B(t_1,t_2)\,dt_1\,dt_2\equiv\lim_{\varepsilon\to 0}\,\varepsilon^2\sum_{i,j=1}^{t/\varepsilon}B(i\varepsilon,j\varepsilon)\qquad(7.15)$$

must exist. If the process ξ(t) is wide-sense stationary, then the corresponding limit for the stationary correlation function must exist:

$$\int_a^b B(\tau)\,d\tau\equiv\lim_{\varepsilon\to 0}\,\varepsilon\sum_i B(i\varepsilon).\qquad(7.16)$$

For a test function f(t), a necessary condition for

$$\int f(t)\,\xi(t)\,dt\qquad(7.17)$$

to be integrable in the mean-squared sense is that ξ(t) is integrable; i.e., if the integral is defined in the Lebesgue-Stieltjes sense, then the measure ξ(t)dt must be of bounded variation. That means, very roughly, that the infinitesimal ξ(t)dt must not be infinitely large in the mean-squared sense.

Notice that because the correlation function for white noise is delta-correlated, B(t₁, t₂) = δ(t₁ − t₂), this process is not integrable in the ordinary sense, and some interpretation must be provided. Recall that the correspondence between the Langevin equation and the Fokker-Planck equation is unambiguous unless the white-noise forcing is multiplied by a nonconstant function of the state variable.¹ Consider, then, the nonlinear Langevin equation

$$\frac{dy}{dt}=A(y)+c(y)\,\eta(t).\qquad(7.18)$$

The question comes down to how one defines the terms in the equivalent integrated equation,

$$dy=A(y)\,dt+c(y)\,\eta(t)\,dt;$$

that is, what precisely is the meaning of the last term, c(y)η(t)dt?

¹For convenience, we call any Langevin equation (irrespective of whether A(y) is nonlinear) a nonlinear Langevin equation if the forcing function c(y) is a nonconstant function of the state variable y(t).

As we saw in Section 7.3, the Langevin equation is not well-defined, since the process it characterizes is non-differentiable. The problem comes from the singularity of the forcing function's correlation, ⟨η(t₁)η(t₂)⟩ = δ(t₁ − t₂). Nevertheless, once an interpretation rule has been attached, the integrated form of the equation is well-defined. It is therefore more correct mathematically to write the Langevin equation as

$$dy=A(y)\,dt+c(y)\,\eta(t)\,dt.\qquad(7.19)$$

Recall from Section 6.4 on page 133 that a purely diffusive process (i.e., a Langevin equation without damping) is called Brownian motion, or a Wiener process W(t); thus dW/dt = η(t), or, in terms of increments, dW(t) = η(t)dt. Very often, Eq. 7.19 is written using the increment of the Wiener process,

$$dy=A(y)\,dt+c(y)\,dW(t).\qquad(7.20)$$

It should be emphasized once again that Eqs. 7.19 and 7.20 have no meaning until an interpretation rule is provided for

$$c(y(t'))\,\eta(t')\,dt'\equiv c(y(t'))\,dW(t').$$

The two dominant interpretations are discussed below.

7.4.1 Itô and Stratonovich Integration

First, some notation.² There are two interpretations that dominate the literature: the Stratonovich interpretation and the Itô interpretation. It is useful to compare the two viewpoints, contrasting how they were developed and, most importantly, how they come in to solving actual problems.

²For simplicity, we set the variance of the white noise to one, Γ = 1.

1. Stratonovich interpretation.

Stratonovich viewed the correlation ⟨η(t₁)η(t₂)⟩ = δ(t₁ − t₂) as an idealization of a narrowly peaked correlation function. As we showed in Section 7.3, so long as the correlation time τ_c is non-zero, the correlation function is non-singular and the Langevin equation can be manipulated using ordinary calculus. In this interpretation, c(y) is evaluated at the mid-point of each increment,

$$y(t+dt)-y(t)=\int_t^{t+dt}A(y(t'))\,dt'+c\!\left[\frac{y(t)+y(t+dt)}{2}\right]\int_t^{t+dt}\eta(t')\,dt',\qquad(7.21)$$

which amounts to setting c(y) equal to its mean value over the infinitesimal range of integration in Eq. 7.20; we write

$$dy=A(y)\,dt+c(y)\,dW(t)\quad(\text{STRATONOVICH}).$$

Stratonovich arrived at the following correspondence between the Langevin and Fokker-Planck equations:

$$\frac{\partial P(y,t)}{\partial t}=-\frac{\partial}{\partial y}\left\{\left[A(y)+\frac{1}{2}c(y)c'(y)\right]P(y,t)\right\}+\frac{1}{2}\frac{\partial^2}{\partial y^2}\left[c^2(y)\,P(y,t)\right].\qquad(7.22)$$

Typically, the function c(y) ≠ 0 for any y – if c(y) does vanish somewhere, more sophisticated methods are required to patch together solutions in the vicinity of the singular point. With the change of variables (cf. Eq. 6.60 on page 141),

$$\hat y=\int\frac{dy}{c(y)},\qquad\hat A(\hat y)=\frac{A(y)}{c(y)},\qquad\hat P(\hat y)=P(y)\,c(y),\qquad(7.23)$$

the multiplicative-noise equation is reduced to one with additive noise.

2. Itô interpretation. Itô began from an entirely different viewpoint. In the same way that Lebesgue developed the ideas of set theory to provide a more general definition of Riemann's integral, Itô extended the ideas of Lebesgue to include integration with the Wiener process as the weighting function dW(t′). One can show that dW(t) has unbounded variation, which means, roughly speaking, that the mean-squared length of the infinitesimal dW(t) is infinite – so Itô's taming of the wild Wiener process into a consistent definition of stochastic integration is certainly a mathematical tour-de-force. We write

$$dy=A(y)\,dt+c(y)\,dW(t)\quad(\text{ITÔ}),\qquad(7.24)$$

and Itô's development leads to the correspondence

$$\frac{\partial P(y,t)}{\partial t}=-\frac{\partial}{\partial y}\left[A(y)\,P(y,t)\right]+\frac{1}{2}\frac{\partial^2}{\partial y^2}\left[c^2(y)\,P(y,t)\right].\qquad(7.25)$$

In this interpretation, c(y) is evaluated at the initial point of each increment,

$$y(t+dt)-y(t)=\int_t^{t+dt}A(y(t'))\,dt'+c(y(t))\int_t^{t+dt}\eta(t')\,dt',$$

which amounts to setting c(y) equal to its initial value over the infinitesimal range of integration in Eq. 7.20. It is important to note that, in contrast to Stratonovich's interpretation, the Itô interpretation cannot be formulated unless the noise is strictly delta-correlated. Transformations of variables under Itô's calculus are not the same as in ordinary calculus; this is often useful in proving various properties of the solution of Eq. 7.24 (see Appendix C). Obviously, if c(y) is constant (c′(y) = 0), then the two interpretations coincide. Furthermore, the Stratonovich and Itô interpretations can be made to generate the same form for the Fokker-Planck equation at the expense of re-defining the drift A(y): the same process is described by an Itô-interpreted equation with drift A(y) and by a Stratonovich-interpreted equation with drift

$$\bar A(y)=A(y)-\tfrac{1}{2}c(y)c'(y)\quad(\text{STRATONOVICH})\qquad\Longleftrightarrow\qquad\bar A(y)=A(y)\quad(\text{ITÔ}).$$

A great deal is made about the distinction between Stratonovich and Itô calculus in some circles, but in the end, what it all comes down to is that no natural process is described by white noise, which is, by construction, a gross simplification of whatever process it is meant to represent. In the words of van Kampen ((1981) Journal of Statistical Physics 24: 175–187),

The final conclusion is that a physicist cannot go wrong by regarding the Itô interpretation as one of those vagaries of the mathematical mind that are of no concern to him. It merely served to point out that [the nonlinear Langevin equation] is not a meaningful equation, and thereby warn him against glib applications of the Langevin approach to nonlinear equations.

Nevertheless, white noise is a useful construction in the development of approximation methods, particularly in the numerical simulation of stochastic differential equations (see Section 8.1). For further details about Itô's calculus, Gardiner's "Handbook of stochastic methods" is recommended.
Notice that the Fokker-Planck equation does not require an interpretation rule – it is simply a partial diﬀerential equation governing the probability density. white noise is a useful construction in the development of approximation methods.
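The practical difference between the two interpretations is easy to see in a short Monte Carlo experiment. The sketch below (illustrative code, not from the text; function names and parameter choices are ours) integrates dy = y dW(t), i.e. A = 0 and c(y) = y, using the left-point (Itô) rule and a Heun predictor-corrector step, a standard numerical realization of the Stratonovich mid-point rule. The Itô average ⟨y(t)⟩ stays at y₀, while the Stratonovich average grows like y₀e^{t/2}, consistent with the extra drift ½c(y)c′(y) = y/2:

```python
import math, random

def simulate(interpretation, t_final=1.0, dt=0.01, n_paths=5000, seed=1):
    """Average of dy = y dW under the Ito (left-point) or Stratonovich (Heun) rule."""
    rng = random.Random(seed)
    steps = int(round(t_final / dt))
    total = 0.0
    for _ in range(n_paths):
        y = 1.0
        for _ in range(steps):
            dW = rng.gauss(0.0, math.sqrt(dt))
            if interpretation == "ito":
                y += y * dW                      # c(y) evaluated at the start point
            else:                                # Heun step ~ mid-point evaluation
                y_pred = y + y * dW
                y += 0.5 * (y + y_pred) * dW
        total += y
    return total / n_paths

mean_ito = simulate("ito")
mean_strat = simulate("strat")
print(mean_ito, mean_strat, math.exp(0.5))
```

The same white-noise increments, fed through two different evaluation rules for c(y), produce averages obeying two different Fokker-Planck equations, exactly as the correspondences above assert.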

Afterword

As mentioned above, white noise does not exist! Any equation resembling the nonlinear Langevin equation is necessarily an approximation of dynamics forced by a noise source with very narrow (though nonzero) correlation time τ_c. The formal limit of vanishing correlation time, from whence white noise in the Stratonovich sense is defined,

$$\lim_{\tau_c\to 0}\frac{1}{2\tau_c}\exp\left[-|t|/\tau_c\right]=\delta(t),$$

is best thought of as an asymptotic limit, because our Markov assumption relies upon a full relaxation of the microscopic variables over time scales much shorter than τ_c (see Section 3.4 on page 59). Said another way, dW(t) has unbounded variation, so that the mean-squared length of the infinitesimal dW(t) is infinite! That is clearly problematic, and leads to ∫c(y(t′))dW(t′) being undefined in any ordinary sense.

Suggested References

Much of the early content of this chapter comes from

• Introduction to random processes (2nd Ed.), W. A. Gardner (McGraw-Hill, 1990),

which is an excellent text on stochastic signal analysis directed toward an electrical engineering audience.
Before leaving this chapter, it bears repeating that Itô's calculus is a useful construction of pure mathematics that streamlines the proof of theorems using the Wiener process as a canonical noise source. By providing a foundation built upon the very deep ideas of set theory, Itô did for Lebesgue integration what Lebesgue did for Riemann integration: using Itô's calculus, pathological integrals such as ∫c(y(t′))dW(t′) are endowed with useful and self-consistent meaning, and various transformations and theorems have been established that greatly simplify the analysis of nonlinear Langevin equations. This streamlining is so significant that white noise sources are often used to expedite the analysis of various models. Itô's calculus is lovely mathematics and can sometimes provide a short-cut to lengthy computations, but it rests on a drastic approximation of real processes, and therefore the physical relevance of results derived in this fashion must be viewed with skepticism.

• Handbook of stochastic methods (3rd Ed.), C. W. Gardiner (Springer, 2004).

Gardiner's Handbook has a large collection of examples illustrating the range and applicability of Itô calculus, and devotes several pages to the derivations outlined in this chapter. van Kampen's master work,

• Stochastic processes in physics and chemistry (2nd Ed.), N. G. van Kampen (North-Holland, 2001),

is highly recommended; it spares no punches in a characteristic savaging of Itô calculus as it relates to the modeling of physical processes (see p. 232–237 in that book), along with

• N. G. van Kampen (1981) "Itô versus Stratonovich," Journal of Statistical Physics 24: 175–187.
• N. G. van Kampen (1981) "The validity of non-linear Langevin equations," Journal of Statistical Physics 25: 431–442.

Exercises

1. Wide-sense stationary processes: For wide-sense stationary processes, the fundamental definitions of continuity, differentiability and integrability were re-cast in terms of the correlation function B(τ).

(a) Show that Eq. 7.6 follows from the definition of continuity for a wide-sense stationary process.
(b) Show that Eq. 7.9 follows from the definition of differentiability for a wide-sense stationary process.
(c) Show that Eq. 7.16 follows from the definition of integrability for a wide-sense stationary process.
(d) Show that mean-squared differentiability implies mean-squared continuity.

2. Orthogonality of a wide-sense stationary process: For the wide-sense stationary process ξ(t), show that

$$\left\langle\xi(t)\,\frac{d\xi(t)}{dt}\right\rangle=0.$$

3. Non-differentiability of the Ornstein-Uhlenbeck process:

(a) Show that the Ornstein-Uhlenbeck process is not differentiable.
(b) Explain, in words, why the damped differential equation for y(t) (Eq. 7.10),

$$\frac{dy}{dt}=-\gamma y+F(t),$$

forced with a nondifferentiable Ornstein-Uhlenbeck process F(t) still results in y(t) being differentiable.

4. Moments of an integrated process: Show that for the integrable wide-sense stationary process ξ(t), the integrated process Y(t) = ∫₀ᵗ ξ(u)du has mean and variance given by

$$\langle Y(t)\rangle=\langle\xi\rangle\cdot t,\qquad\left\langle(Y(t)-\langle Y(t)\rangle)^2\right\rangle=t\cdot\int_{-t}^{t}\left(1-\frac{|\tau|}{t}\right)C_\xi(\tau)\,d\tau,$$

where C_ξ(τ) = ⟨ξ(t)ξ(t−τ)⟩ − ⟨ξ⟩².

5. Itô and Stratonovich interpretations:

(a) Show that with Itô's interpretation, ⟨c(y)η(t)⟩ = 0, while the Stratonovich interpretation gives ⟨c(y)η(t)⟩ = ½⟨c′(y)c(y)⟩.
(b) For the following Langevin equation,

$$\frac{dx}{dt}=-x\,\eta(t),\qquad x(0)=x_0,\qquad(7.26)$$

where η(t) is Gaussian white noise:

i. What is ⟨x(t)⟩ with Eq. 7.26 interpreted in the Itô sense?
ii. What is ⟨x(t)⟩ with Eq. 7.26 interpreted in the Stratonovich sense?

Chapter 8

Random Differential Equations

The 'repeated-randomness assumption' (or Stosszahlansatz, Section 3.4 on page 59) allows many degrees of microscopic freedom in the model dynamics to be eliminated, leaving behind a Markov process governed by a master equation. In some cases, this coarse-graining is continued, and it is possible to identify a subset of variables whose stochastic properties are not influenced by the variables of interest. In the extreme case, all of the stochastic character comes from these external variables, and the master equation reduces to a differential equation with random coefficients (a random differential equation). Noise of this type is usefully denoted 'extrinsic' or 'external' noise (in contrast with 'intrinsic' noise, where the fluctuations are inherent to the dynamics and the master equation cannot be further reduced). An example of extrinsic noise is Brownian motion, where the motion of the solvent molecules gives rise to a random forcing of the solute particle, although their motion is not affected by the solute. Another example is an electromagnetic wave passing through a turbulent troposphere – the wave obeys Maxwell's equations with a random dielectric constant (leading to a random coefficient in the differential equation), justified by the observation that the passing wave has negligible effect on the turbulent atmosphere.

In much of the current literature, the term stochastic differential equation is used almost exclusively to denote Itô's version of the nonlinear Langevin equation,

$$dy(t)=A(y)\,dt+c(y)\,dW(t),$$

where dW(t) = η(t)dt is the Wiener process coming from a Gaussian white noise source η(t).

When the random force dξ(t) is not white noise, the problem of interpretation (Itô versus Stratonovich) does not occur, because c(y)dξ(t) is well-defined as an ordinary (Lebesgue-Stieltjes) integral. If one's focus is upon problems arising from physical systems (i.e., problems in science and engineering), the emphasis on Itô's equation and Itô calculus is misdirected, because it is the fact that dW(t) comes from a white noise source that creates all the problems. Roughly speaking, these difficulties are artificial and disappear when one takes into account that a random force in physics is never really white noise, but has (at best) a very short correlation time. Once that is accepted, one also gets rid of Doob's objection arising from the non-existence of the derivative in the Langevin equation, since a non-white noise forcing does not affect the differentiability of the process. In a manner of speaking, stochastic differential equations can then be formally written as ordinary differential equations,

$$\frac{du}{dt}=F(u,t,Y(t)),\qquad u(t_0)=u_0,\qquad(8.1)$$

where u and F may be vectors, and Y(t) is a random function whose stochastic properties are given and whose correlation time is non-zero. Roughly speaking, random differential equations can be subdivided into several classes:

1. linear random differential equations where only the forcing term is a random function, as in the Langevin equation (additive noise);
2. linear random differential equations where one or more of the coefficients are random functions (multiplicative noise);
3. nonlinear random differential equations;
4. other variations: random partial differential equations, etc.

Additive noise was dealt with in great detail in Chapter 1. Nonlinear stochastic differential equations and random partial differential equations are more advanced than the level of this course. Multiplicative noise will be dealt with in the following sections.

Here, we concentrate on the class of multiplicative noise processes, viz. those random differential equations with random coefficients. Even for this comparatively simple class, the theory is very incomplete and a general solution is out of the question. Instead, we focus upon the first two moments, and specifically on the behaviour of the average ⟨u(t)⟩.

The typical situation in physics is that the physical world, or at least some large system, is subdivided into a subsystem and its environment (see Section 3.4 on page 59). The influence of the environment on the subsystem is treated like a heat bath, i.e., as a random force whose stochastic properties are given (or assumed). Hence the equations of motion of the total system are reduced to those of the subsystem, at the expense of introducing random coefficients. The system described by Eq. 8.1 determines, for each particular realization y(t) of the random function Y(t), a functional U([y], t, u₀) which depends upon all values y(t′) for 0 ≤ t′ ≤ t. The ensemble of solutions U([y], t, u₀) for all possible realizations y(t′) constitutes a stochastic process. The random differential equation is solved when the stochastic properties of this process have been found; this can rarely be done in practice, and so approximation methods are necessary. In the following sections, we briefly discuss how numerical solutions of random differential equations should be computed, then turn attention to analytic approximation schemes for generating closed equations for the moments of u(t).

8.1 Numerical Simulation

• C. W. Gardiner, Handbook of stochastic methods (3rd Ed.), (Springer, 2004).
• D. T. Gillespie, Markov processes: An introduction for physical scientists, (Academic Press, 1992).
• D. S. Lemons, An introduction to stochastic processes in physics, (Johns Hopkins University Press, 2002).

White noise

Numerical simulation of random differential equations begins with a discretization of the Langevin equation,

$$dy=A(y,t)\,dt+c(y,t)\,\eta(t)\,dt,$$

where η(t) is Gaussian white noise. One can show (Exercise 2a) that the integrated process dy (interpreted in the Itô sense) is equivalent to the equation

$$dy=A(y,t)\,dt+\mathrm{N}\!\left(0,\,c^2(y,t)\,dt\right),\qquad(8.2)$$

1. Initialize: t ← t₀, y ← y₀.
2. Choose a suitably small Δt > 0.ᵃ
3. Draw a sample value n of the unit normal random variable N(0, 1).ᵇ
4. Advance the process:
   • y ← y + n·c(y, t)·[Δt]^{1/2} + A(y, t)·Δt
   • t ← t + Δt.ᶜ
5. Record y(t) = y as required for sampling or plotting. If the process is to continue, then return to 2 or 3;ᵈ otherwise, stop.

ᵃ Δt should satisfy the first-order accuracy condition (Eq. 8.5) for some ε₁ ≪ 1; optionally, it may also satisfy the plotting condition (Eq. 8.6) for some ε₂ < 1.
ᵇ Use a suitable method to generate the normally distributed random number (see Exercise 3).
ᶜ If (t_max − t₀)/Δt ∼ 10^K, then the sum t + Δt should be computed with at least K + 3 digits of precision.
ᵈ The value of Δt may be reset at the beginning of any cycle.

Figure 8.1: Simulation of a trajectory drawn from p(y, t|y₀, t₀) satisfying the Fokker-Planck equation ∂p/∂t = −∂_y[A(y,t)p] + ½∂²_y[c²(y,t)p]. Taken from Gillespie (1992), p. 194.

where N(µ, σ²) is a Gaussian (normal) distribution with mean µ and variance σ² (see Section A.2 on page 238). From the properties of the Gaussian distribution, we can re-write Eq. 8.2 as

$$dy=A(y,t)\,dt+n\cdot c(y,t)\,\sqrt{dt},\qquad(8.3)$$

where now n is a number drawn from a unit normal distribution N(0, 1). The integration of Eq. 8.3 over a small, but finite, time interval Δt provides a practical numerical simulation algorithm,

$$y(t+\Delta t)=y(t)+A(y,t)\,\Delta t+n\cdot c(y,t)\,\sqrt{\Delta t}\qquad(8.4)$$

(see Figure 8.1). Several remarks are in order:

1. For the Wiener process (A = 0, D = 1), Eq. 8.3 reduces to

$$dW(t)=n\cdot dt^{1/2},$$

a relation that is exploited in many theorems using Itô's calculus (see Gardiner (2004) and Appendix C for examples).

2. The Wiener process is continuous, so

$$W(t+\Delta t)-W(t)=\left[W(t+\Delta t)-W\!\left(t+\tfrac{\Delta t}{2}\right)\right]+\left[W\!\left(t+\tfrac{\Delta t}{2}\right)-W(t)\right].$$

From Eq. 8.4, this requires

$$\sqrt{\Delta t}\;\mathrm{N}_t^{t+\Delta t}(0,1)=\sqrt{\tfrac{\Delta t}{2}}\;\mathrm{N}_t^{t+\Delta t/2}(0,1)+\sqrt{\tfrac{\Delta t}{2}}\;\mathrm{N}_{t+\Delta t/2}^{t+\Delta t}(0,1),$$

where the sub- and super-scripts indicate that the random variable is explicitly associated with the given interval. This relation is only satisfied if the normal distributions are independent of one another for non-overlapping intervals. We say, therefore, that the Wiener process has independent increments.

3. The deviation of the numerical scheme from the Chapman-Kolmogorov equation quantifies the error involved; to minimize this type

of consistency error (i.e., to keep the error higher order in Δt), the time step Δt must satisfy the accuracy condition (ε₁ ≪ 1),

$$\Delta t\le\min\left\{\frac{\varepsilon_1}{\left|\partial_y A(y,t)\right|},\;\left[\frac{\varepsilon_1}{\partial_y c(y,t)}\right]^2\right\}.\qquad(8.5)$$

Furthermore, to generate a representative plot of the simulation results, the data must be stored with a resolution such that the fluctuations are captured on a time-scale much shorter than the evolution due to the drift, resulting in an additional plotting condition on the storage time step (ε₂ < 1),

$$\Delta t\le\varepsilon_2\,\frac{c^2(y,t)}{A^2(y,t)}.\qquad(8.6)$$

The simulation algorithm, Eq. 8.4, in the absence of diffusion (c(y, t) = 0), reduces to the simple forward Euler method for the numerical solution of ordinary differential equations. The main error associated with the Euler scheme comes from the approximation

$$\int_0^{\Delta t}c(y(t),t)\,dW(t)\approx c(y(0),0)\,W(\Delta t).$$

One can show that the scheme described above exhibits ½-order strong convergence and 1st-order weak convergence. The question that immediately presents itself is whether higher-order schemes, such as Runge-Kutta or Adams-Bashforth methods, which have proved so useful in the numerical solution of ordinary differential equations, can be extended to stochastic systems. The major difficulty is not so much the stochasticity, but rather that the trajectory governed by the Langevin equation has no well-defined derivative, frustrating the derivation of higher-order methods; it would seem they cannot be extended without considerable complication of the stepping algorithm. We will briefly outline how higher-order explicit schemes can be derived – although we shall see that very little is gained from the added complexity of the algorithm.

Higher-order schemes

• S. Asmussen and P. W. Glynn, Stochastic simulation: Algorithms and analysis, (Springer, 2007).
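The stepping rule of Figure 8.1 (Eq. 8.4) takes only a few lines to implement. The sketch below (illustrative code, not the author's; names and parameter values are ours) applies it to the damped Langevin equation dy = −γy dt + c dW, whose variance at long times should approach c²/2γ:

```python
import math, random

def euler_maruyama(A, c, y0, t_final, dt, rng):
    """Forward-Euler (Euler-Maruyama) integration of dy = A(y,t) dt + c(y,t) dW (Eq. 8.4)."""
    y, t = y0, 0.0
    steps = int(round(t_final / dt))
    for _ in range(steps):
        n = rng.gauss(0.0, 1.0)                  # unit normal sample
        y += A(y, t) * dt + n * c(y, t) * math.sqrt(dt)
        t += dt
    return y

gamma, diff = 1.0, math.sqrt(2.0)                # dy = -y dt + sqrt(2) dW
rng = random.Random(42)
samples = [euler_maruyama(lambda y, t: -gamma * y,
                          lambda y, t: diff,
                          0.0, 5.0, 0.01, rng) for _ in range(3000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)   # stationary variance should be c^2/(2*gamma) = 1
```

Since c is constant here, the Itô/Stratonovich distinction plays no role; for state-dependent c(y, t), the left-point evaluation in the update makes this an Itô scheme.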

To improve upon the algorithm, higher-order terms are included in the estimate of the stochastic integral. Re-writing

$$\int_0^{\Delta t}c(y(t),t)\,dW(t)-c(y(0),0)\,W(\Delta t)=\int_0^{\Delta t}\left\{c(y(t),t)-c(y(0),0)\right\}dW(t),\qquad(8.7)$$

we expand c(y(t), t) using Itô's formula (see Appendix C, p. 273). The result is Milstein's scheme,

$$y(t+\Delta t)=y(t)+A(y,t)\,\Delta t+n_1\cdot c(y,t)\sqrt{\Delta t}-\frac{1}{2}\,c(y,t)\cdot c_y(y,t)\,\Delta t\,(1-n_1^2),\qquad(8.8)$$

(where, again, n₁ is an independent sample of a unit normal distribution N(0, 1)), which converges strongly with order 1. Higher-order schemes can be generated, but they become increasingly unwieldy, with only moderate improvement of convergence – and convergence order can rarely be computed explicitly for these schemes. A pragmatic approach is to use coloured noise and higher-order stepping schemes.

Coloured noise

Coloured noise is often simulated by augmenting the model equations to include a Langevin equation governing an Ornstein-Uhlenbeck process. For example, the equation

$$\frac{dy}{dt}=A(y,t)+c(y,t)\cdot F(t)\qquad(8.9)$$

is simulated by the augmented system

$$\frac{dy}{dt}=A(y,t)+c(y,t)\cdot F(t),\qquad\frac{dF}{dt}=-\frac{1}{\tau_c}F(t)+\frac{1}{\tau_c}\eta(t),$$

where η(t) is zero-mean, unit-variance Gaussian white noise, F(t) is an Ornstein-Uhlenbeck process, and τ_c is the correlation time of the coloured noise process. Written in this way, the steady-state autocorrelation function of F(t) is

$$\langle F(t)F(t-\tau)\rangle=\frac{1}{2\tau_c}e^{-|\tau|/\tau_c}\;\xrightarrow{\;\tau_c\to 0\;}\;\delta(\tau).$$
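The Milstein step (Eq. 8.8) can be exercised on a case with a known solution. The sketch below (illustrative code, not from the text) integrates dy = c·y dW in the Itô sense, for which the exact path is y(t) = y₀ exp(−c²t/2 + cW(t)), and compares the numerical endpoint with the exact solution built from the same Wiener increments; strong order-1 convergence means the two should agree pathwise to O(Δt):

```python
import math, random

def milstein_path(y0, c_coef, t_final, dt, rng):
    """Milstein scheme (Eq. 8.8 with A = 0) for dy = c_coef*y dW, tracked pathwise."""
    y, W = y0, 0.0
    steps = int(round(t_final / dt))
    for _ in range(steps):
        n = rng.gauss(0.0, 1.0)
        dW = n * math.sqrt(dt)
        # Eq. 8.8 with A = 0, c(y) = c_coef*y, c_y = c_coef:
        y += c_coef * y * dW - 0.5 * (c_coef * y) * c_coef * dt * (1.0 - n * n)
        W += dW
    exact = y0 * math.exp(-0.5 * c_coef**2 * t_final + c_coef * W)
    return y, exact

rng = random.Random(7)
y_num, y_exact = milstein_path(1.0, 1.0, 1.0, 1e-3, rng)
print(y_num, y_exact)
```

Note the correction term is exactly ½c·c_y(dW² − Δt), which vanishes on average but repairs the pathwise error of the plain Euler step.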

An improved approach comes from noting that the Ornstein-Uhlenbeck process is Gaussian, so we can write the exact stepping formula for F(t) (Exercise 2b),

$$F(t+\Delta t)=F(t)\cdot e^{-\Delta t/\tau_c}+n\cdot\left[\frac{1}{2\tau_c}\left(1-e^{-2\Delta t/\tau_c}\right)\right]^{1/2},\qquad(8.10)$$

where, again, n is a unit normal random variable. The process governed by Eq. 8.9 is differentiable (see Eq. 7.13 on page 157), so we are free to use any stepping scheme we wish to advance y. That is,

$$y(t+\Delta t)=h\left[y(t),F(t)\right],\qquad F(t+\Delta t)=F(t)\cdot e^{-\Delta t/\tau_c}+n\cdot\left[\frac{1}{2\tau_c}\left(1-e^{-2\Delta t/\tau_c}\right)\right]^{1/2},$$

where h[·] is any implicit or explicit stepping scheme used to integrate Eq. 8.9. We are then able to use all the methods developed in ordinary calculus, including the numerical integration of differential equations. Although it should not need saying, notice that in the limit τ_c → 0, Eq. 8.9 reduces to the Langevin equation interpreted in the Stratonovich sense! In the next section, we shall discuss various approximation schemes for treating equations such as Eq. 8.9 under various assumptions about the correlation time τ_c and the magnitude of the noise (focusing exclusively on models with A and c linear in y).

8.2 Approximation Methods

There are several approximation methods that have been developed to derive evolution equations for the moments of a process governed by a linear stochastic differential equation. Echoing the analysis of Chapter 7, we again find that the Langevin equation is difficult to deal with because it is a differential equation characterizing a non-differentiable process. This paradox disappears when we replace the canonical white noise forcing by a more physically relevant coloured noise. We shall consider approximation methods that assume either very long or very short correlation time for the fluctuations, and we shall limit the analysis to the derivation of leading-order terms.
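The exact Ornstein-Uhlenbeck update of Eq. 8.10 can be verified against the stationary statistics it must reproduce – variance 1/(2τ_c) and one-step autocorrelation e^{−Δt/τ_c} – even for step sizes comparable to the correlation time. The sketch below (an illustration, not from the text) iterates the update and checks both:

```python
import math, random

def ou_exact_step(F, dt, tau_c, rng):
    """Exact Ornstein-Uhlenbeck update (Eq. 8.10); valid for any step size dt."""
    decay = math.exp(-dt / tau_c)
    sigma = math.sqrt((1.0 - decay * decay) / (2.0 * tau_c))
    return F * decay + rng.gauss(0.0, 1.0) * sigma

tau_c, dt = 0.5, 0.25          # a 'large' step: dt is half the correlation time
rng = random.Random(3)
F, prev_sum, var_sum, n = 0.0, 0.0, 0.0, 40000
for _ in range(100):           # burn-in to reach the stationary state
    F = ou_exact_step(F, dt, tau_c, rng)
for _ in range(n):
    F_new = ou_exact_step(F, dt, tau_c, rng)
    prev_sum += F * F_new      # one-step correlation <F(t) F(t+dt)>
    var_sum += F * F           # stationary variance <F^2> = 1/(2*tau_c)
    F = F_new
print(var_sum / n, prev_sum / n, math.exp(-dt / tau_c) / (2 * tau_c))
```

Because the update is exact in distribution, there is no step-size bias to control, unlike the Euler or Milstein schemes above.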

8.2.1 Static Averaging

It can happen that the fluctuations have very long correlation time compared with the rapid deterministic relaxation of the system. In that case, we may treat the fluctuating coefficients as a time-independent random variable,

$$\frac{du}{dt}=A(t,\xi)\,u,$$

where ξ is a random variable; we assume from the outset that the entire probability distribution for ξ is known. The differential equation is solved in the ordinary fashion to obtain a solution that depends parametrically upon the fluctuating term, u(t, ξ), and then the solution is averaged over the probability distribution f(ξ) of the fluctuating coefficient,

$$\langle u(t)\rangle=\int_{-\infty}^{\infty}f(\xi)\,u(t,\xi)\,d\xi.$$

We shall call this the static averaging approximation. An example will make the procedure clear.

Example: The static averaging approximation is well-illustrated by the complex harmonic oscillator with random frequency,

$$\frac{du}{dt}=-i\xi u,\qquad u(0)=u_0,$$

where ξ is treated as a random variable. Because ξ does not depend on time, we can solve the random differential equation explicitly – i.e., solve for arbitrary ξ, and average over the ensemble. The solution is

$$u(t)=u_0\,e^{-i\xi t}.$$

It is instructive to look at a few special cases for the probability distribution f(ξ); in the following list, ξ₀ and γ are fixed parameters.

1. Cauchy/Lorentz:

$$f(\xi)=\frac{\gamma/\pi}{\gamma^2+(\xi-\xi_0)^2}\quad\Rightarrow\quad\langle u\rangle=u_0\,e^{-i\xi_0 t-\gamma t}.$$

2. Gaussian:

$$f(\xi)=\frac{1}{\sqrt{2\pi\gamma}}\exp\left[-\frac{(\xi-\xi_0)^2}{2\gamma}\right]\quad\Rightarrow\quad\langle u\rangle=u_0\,e^{-i\xi_0 t-\gamma t^2/2}.$$

3. Laplace:

$$f(\xi)=\frac{1}{2}\gamma\,e^{-\gamma|\xi-\xi_0|}\quad\Rightarrow\quad\langle u\rangle=u_0\,e^{-i\xi_0 t}\,\frac{\gamma^2}{\gamma^2+t^2}.$$

In each case, the averaged solution tends to zero as t → ∞. The form of the damping factor is determined by f(ξ); only for one particular f(ξ) does it have the form that corresponds to a complex frequency. This damping is due to the fact that the harmonic oscillators of the ensemble gradually lose the phase coherence they had at t = 0. In plasma physics this is called "phase mixing"; in mathematics, the "Riemann-Lebesgue theorem". The modulus of u is not subject to phase mixing and does not tend to zero: |u(t)| = u₀, hence ⟨|u(t)|²⟩ = u₀².

Note that ⟨u(t)⟩ is identical with the characteristic function χ(t) of the distribution f(ξ). The fact that ⟨u(t)⟩ = χ(t) suggests the use of the cumulant expansion,

$$\langle u(t)\rangle=\left\langle e^{-i\xi t}\right\rangle=\exp\left[\sum_{m=1}^{\infty}\frac{(-it)^m}{m!}\,\kappa_m\right],$$

where κ_m stands for the m-th cumulant – and indeed it is this idea that underlies the approximation discussed later in this chapter.

8.2.2 Bourret's Approximation

• R. C. Bourret (1965) "Ficton theory of dynamical systems with noisy parameters," Canadian Journal of Physics 43: 619–639.
• A. Brissaud and U. Frisch (1974) "Solving linear stochastic differential equations," Journal of Mathematical Physics 15: 524–534.
• N. G. van Kampen (1976) "Stochastic differential equations," Physics Reports 24: 171–228.

Consider the linear random differential equation

$$\frac{du}{dt}=[A_0+\alpha A_1(t)]\,u,\qquad(8.11)$$

where u is a vector, A₀ is a constant matrix, A₁(t) is a random matrix, and α is a parameter measuring the magnitude of the fluctuations in the coefficients. If the mathematical model is chosen properly, then α is small. We would like to find deterministic equations for the moments of u. Notice that if we simply average the stochastic differential equation,

$$\frac{d\langle u\rangle}{dt}=A_0\langle u\rangle+\alpha\langle A_1(t)u\rangle,$$

we have no obvious method to evaluate the cross-correlation ⟨A₁(t)u⟩. The methods derived in this section are approximations of that cross-correlation. Concerning the random matrix A₁(t), we make the following assumptions:

1. ⟨A₁(t)⟩ = 0. This is not strictly necessary, but it simplifies the discussion: if A₁(t) is stationary, its mean is independent of t and can be absorbed into A₀.

2. A₁(t) has a finite (non-zero) correlation time τ_c. This means that for any two times t₁, t₂ such that |t₁ − t₂| ≫ τ_c, one may treat all matrix elements of A₁(t₁) and of A₁(t₂) as statistically independent.

Since α is small, the obvious method for solving this problem seems to be a perturbation series in α. First, we eliminate A₀ from the random differential equation by setting

$$u(t)=e^{tA_0}v(t).$$

Substituting into Eq. 8.11 yields the new equation for v(t),

$$\frac{dv}{dt}=\alpha V(t)\,v(t),\qquad(8.12)$$

where V(t) = e^{−tA₀}A₁(t)e^{tA₀}, and v(0) = u(0) = u₀. Since α is small, assume a solution of the form

$$v(t)=v_0(t)+\alpha v_1(t)+\alpha^2 v_2(t)+\cdots.\qquad(8.13)$$

Upon taking the average, with fixed u0,

⟨v(t)⟩ = u0 + α² ∫₀ᵗ ∫₀^(t1) ⟨V(t1)V(t2)⟩ dt2 dt1 u0 + ···   (8.14)

One could claim to have an approximation for ⟨v⟩ to 2nd-order in α. This is not a suitable perturbation scheme, however, because the successive terms are not merely of increasing order in α, but also in t. That is, the expansion is actually in powers of (αt), and is therefore valid only for a limited time, i.e. αt ≪ 1.

A different approach was taken by Bourret. Starting from Eq. 8.12, the equation can be re-written in an equivalent integrated form,

v(t) = u0 + α ∫₀ᵗ V(t′)v(t′) dt′.   (8.15)

By iteration, we have Born's iterative approximation,

v(t) = u0 + α ∫₀ᵗ V(t′)[ u0 + α ∫₀^(t′) V(t″)v(t″) dt″ ] dt′ = u0 + u0 α ∫₀ᵗ V(t′) dt′ + α² ∫₀ᵗ ∫₀^(t′) V(t′)V(t″)v(t″) dt″ dt′.   (8.16)

This equation is exact, but of no help in finding ⟨v(t)⟩ because it contains the higher-order correlation ⟨V(t′)V(t″)v(t″)⟩. On taking the average,

⟨v(t)⟩ = u0 + α² ∫₀ᵗ ∫₀^(t′) ⟨V(t′)V(t″)v(t″)⟩ dt″ dt′.   (8.17)

Suppose, however, that one could write,

⟨V(t′)V(t″)v(t″)⟩ ≈ ⟨V(t′)V(t″)⟩⟨v(t″)⟩.   (8.18)

Then Eq. 8.17 becomes an integral equation for ⟨v⟩ alone, and after differentiation, in terms of the original variables, we are left with the convolution equation for ⟨u(t)⟩,

d⟨u(t)⟩/dt = A0⟨u(t)⟩ + α² ∫₀ᵗ ⟨A1(t) e^(A0(t−t′)) A1(t′)⟩ ⟨u(t′)⟩ dt′.   (8.19)
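The limitation of the Born iteration – that it is really an expansion in powers of (αt), not α – can be seen in a scalar caricature. A sketch, assuming the trivial case V(t) ≡ 1 so that the iterated integrals can be written down exactly and the true solution is v(t) = e^(αt):

```python
import math

def born2(alpha, t):
    """Second-order Born iterate for dv/dt = alpha*V(t)*v with V(t) = 1, v(0) = 1.

    The iterated integrals give the partial sum 1 + (alpha*t) + (alpha*t)^2/2,
    so the truncation error is O((alpha*t)^3) -- small only while alpha*t << 1.
    """
    return 1.0 + alpha * t + 0.5 * (alpha * t) ** 2

alpha = 0.01
exact = lambda t: math.exp(alpha * t)

err_short = abs(exact(1.0) - born2(alpha, 1.0))      # alpha*t = 0.01: excellent
err_long = abs(exact(500.0) - born2(alpha, 500.0))   # alpha*t = 5: useless
print(err_short, err_long)
```

However small α is, the truncated series eventually fails once t grows large enough that αt is of order one.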

This is called Bourret's convolution equation, or sometimes in the physics literature, the mode-coupling approximation. It is a striking result because it asserts that the average of u(t) obeys a closed equation; thus one can find ⟨u(t)⟩ without knowledge of the higher moments of u(t), and without going through the procedure of solving the random differential equation du/dt = A(t, ω)u for each individual ω and averaging afterwards.

Bourret's convolution equation is formally solved using the Laplace transform, but the structure of the equation is sometimes not convenient. A convolution equation extends over the past values of ⟨u(t)⟩, and as such does not describe a Markov process. Nevertheless, with further manipulation, we are able to turn the convolution equation into an ordinary differential equation (albeit with modified coefficients), as was done by Kubo using a different argument. We use the noiseless dynamics (α = 0) to re-write ⟨u(t′)⟩,

⟨u(t′)⟩ ≈ e^(−A0(t−t′)) ⟨u(t)⟩,

allowing ⟨u(t)⟩ to pass through the integral in Eq. 8.19,

∫₀ᵗ ⟨A1(t) e^(A0(t−t′)) A1(t′)⟩ ⟨u(t′)⟩ dt′ ≈ ∫₀ᵗ ⟨A1(t) e^(A0(t−t′)) A1(t′)⟩ e^(−A0(t−t′)) dt′ ⟨u(t)⟩.   (8.20)

We have assumed that A1(t) is a stationary random process, so the argument of the integral is a function of the time-difference only,

⟨A1(t) e^(A0(t−t′)) A1(t′)⟩ e^(−A0(t−t′)) ≡ K′(t − t′).   (8.21)

Making the change of variable τ = t − t′,

∫₀ᵗ K′(t − t′) dt′ = ∫₀ᵗ K′(τ) dτ.

The fluctuations are correlated only for small time separations (τ ≪ τc), which means the upper limit of integration is not important and can be extended t → ∞, incurring only exponentially small error,

∫₀ᵗ K′(τ) dτ ≈ ∫₀^∞ K′(τ) dτ.

This last term is a constant (i.e. independent of time), and so we arrive at an ordinary differential equation for ⟨u(t)⟩, where ∫₀^∞ K′(τ) dτ renormalizes the constant coefficient matrix A0,

d⟨u(t)⟩/dt = [ A0 + α² ∫₀^∞ K′(τ) dτ ] · ⟨u(t)⟩   (8.22)
           = [ A0 + α² ∫₀^∞ ⟨A1(τ) e^(A0τ) A1(0)⟩ e^(−A0τ) dτ ] · ⟨u(t)⟩.   (8.23)

This is called Kubo's renormalization equation. In contrast with Bourret's convolution equation, Kubo's approximation results in an ordinary differential equation, and so ⟨u(t)⟩ characterized by this equation is a Markov process.

The hypothesis of Bourret, Eq. 8.18, is unsatisfactory because it implies an uncontrolled neglect of certain fluctuations without any a priori justification. Although the derivation may seem ad hoc, it is important to note that replacing an average of a product with a product of averages underlies the microscopic theory of transport phenomena; in a sense, it is what is meant by the Stosszahlansatz discussed on page 59. Nonetheless, we shall develop Bourret's approximation (and higher-order terms) with a careful estimate of the error incurred at each step, and in fact show that the error incurred in deriving Kubo's renormalization is of the same order as the error incurred in Bourret's approximation – so that Eq. 8.23 is as accurate as Eq. 8.19! The cumulant expansion method also points the way to the calculation of higher-order terms (although the algebra becomes formidable), and, as we show below, we can use cumulants to compute a very tight error estimate of the approximation. These ideas will be the focus of the next two sections.

8.2.3 Cumulant Expansion

• N. G. van Kampen (1974) "A cumulant expansion for stochastic linear differential equations." Physica 74: 215–238.
• R. H. Terwiel (1974) "Projection operator method applied to stochastic linear differential equations." Physica 74: 248–265.

Cumulants of a Stochastic Process

• R. Kubo (1962) "Generalized cumulant expansion method." Journal of the Physical Society of Japan 17: 1100–1120.

Let ξ be a scalar random variable. The cumulants κm of ξ are defined by means of the generating function,

⟨e^(−iξt)⟩ = exp[ Σ_{m=1}^∞ (−it)^m/m! κm ],

(cf. page 238). An expansion in cumulants is preferable over an expansion in moments for two main reasons. First, the expansion occurs in the exponential, greatly decreasing the risk of secular terms¹. The second advantage of cumulants is that higher-order cumulants vanish if any of the arguments are uncorrelated with any of the others.

Let ξ(t) be a scalar random function. When discussing the cumulants of a stochastic function ξ(t), it is convenient to adopt the notation κm ≡ ⟨⟨ξ^m⟩⟩; then,

⟨e^(−i ∫₀ᵗ ξ(t′)dt′)⟩ = exp[ Σ_{m=1}^∞ (−i)^m/m! ∫₀ᵗ ··· ∫₀ᵗ ⟨⟨ξ(t1)ξ(t2)···ξ(tm)⟩⟩ dt1 dt2 ··· dtm ].   (8.24)

The general rule for relating cumulants to the moments of ξ(t) is to partition the digits of the moments into all possible subsets; for each partition, one writes the product of cumulants, and then adds all such products. The first few relations will make the prescription clear. Writing ⟨1⟩ for ⟨ξ(t1)⟩, ⟨12⟩ for ⟨ξ(t1)ξ(t2)⟩, and so on,

⟨1⟩ = ⟨⟨1⟩⟩,
⟨12⟩ = ⟨⟨1⟩⟩⟨⟨2⟩⟩ + ⟨⟨12⟩⟩,
⟨123⟩ = ⟨⟨1⟩⟩⟨⟨2⟩⟩⟨⟨3⟩⟩ + ⟨⟨1⟩⟩⟨⟨23⟩⟩ + ⟨⟨2⟩⟩⟨⟨13⟩⟩ + ⟨⟨3⟩⟩⟨⟨12⟩⟩ + ⟨⟨123⟩⟩.

For example, suppose ξ(t) has a short correlation time τc – specifically, that for |t1 − t2| ≫ τc, ξ(t1) and ξ(t2) are uncorrelated; then ⟨12⟩ = ⟨1⟩⟨2⟩,

¹ Secular terms are terms arising in a perturbation approximation that diverge in time although the exact solution remains well-behaved. For example, after a short time 1 − t + t²/2! is a terrible approximation of e^(−t).

so, comparing with ⟨12⟩ = ⟨⟨1⟩⟩⟨⟨2⟩⟩ + ⟨⟨12⟩⟩ = ⟨1⟩⟨2⟩ + ⟨⟨12⟩⟩, we have ⟨⟨12⟩⟩ = 0. Similarly, one can show that having ξ(t1) and ξ(t2) uncorrelated is sufficient to ensure that all higher-order cumulants vanish, irrespective of how the remaining arguments are correlated with one another. That is, the mth cumulant vanishes as soon as the sequence of times {t1, t2, ..., tm} contains a gap large compared to τc. Since the cumulants are approximately zero once the gap among any of the time points exceeds τc, the effective integration domain of each integral in Eq. 8.24 is an (m − 1)-dimensional sphere of radius ∼ τc moving along the time axis; that is, each integral in Eq. 8.24 grows linearly in time (once t ≫ τc).

Cumulant Expansion

Returning again to our original linear stochastic differential equation,

dv/dt = αV(t)v(t),   (8.25)

(with V(t) = e^(−tA0) A1(t) e^(tA0)), we see that the equation contains three time scales:

1. The time 1/α over which ⟨v(t)⟩ varies.
2. The correlation time τc over which V(t) is correlated.
3. The time τV over which V(t) varies, which is not relevant in what follows.

While the naive perturbation solution initially proposed failed because we expanded in powers of (αt), an improved perturbation scheme should be possible if we expand in powers of (ατc).² If ατc ≪ 1, then it is possible to subdivide the interval [0, t] so that t ≫ τc and yet αt ≪ 1. Said another way, ⟨V(t)V(t′)⟩ ≈ 0 for |t − t′| ≫ τc. This is precisely the Stosszahlansatz underlying the derivation of the master equation, and precisely the assumption Einstein made when constructing his model of Brownian motion – that the time axis can be subdivided into steps Δt which are long enough that the microscopic variables relax to their equilibrium state (Δt ≫ τc), though not long enough that the observable state v(t) is appreciably changed (1/α ≫ Δt). The condition that ατc ≪ 1 underlies the Bourret and Kubo approximations.

² The dimensionless parameter ατc is called the Kubo number.

As we now show, it is possible to justify these approximations and obtain higher-order terms as a systematic expansion in powers of (ατc) using the cumulants of a stochastic process. We can write the formal solution of Eq. 8.25 as a time-ordered exponential (see Section B.1),

v(t) = ⌈exp( ∫₀ᵗ αV(t′) dt′ )⌉ v(0),   (8.26)

where the time-ordering operator ⌈ ⌉ is short-hand for the iterated series, Eq. 8.13. For the average process,

⟨v(t)⟩ = ⟨ ⌈exp( α ∫₀ᵗ V(t′) dt′ )⌉ ⟩ v(0),   (8.27)

since the average and the time-ordering commute. Eq. 8.24 shows how the cumulants of a scalar stochastic process can be used to express the averaged exponential. For the matrix V(t), it would seem at first sight that Eq. 8.24 is of little use; we are saved, however, by the time-ordering: inside the time-ordering operator, we can freely commute the matrix operator V(t), since everything must be put in chronological order once the time-ordering operator is removed. We are therefore justified in writing the suggestive expression,

⟨v(t)⟩ = ⌈exp( α ∫₀ᵗ ⟨⟨V(t1)⟩⟩ dt1 + (α²/2) ∫₀ᵗ ∫₀ᵗ ⟨⟨V(t1)V(t2)⟩⟩ dt2 dt1 + ··· )⌉ ⟨v(0)⟩.   (8.28)

This equation is exact; no approximations have been made up to this point. There are several important characteristics of Eq. 8.28 that make it well-suited as a starting point for approximations of an evolution equation for ⟨v(t)⟩. First, the cumulant expansion appears in the exponent, greatly reducing the risk of secular terms. Second, we assume the fluctuations have finite correlation time τc in the sense that the cumulants of the matrix elements,

⟨⟨V_ij(t1) V_kl(t2) ··· V_qr(tm)⟩⟩ = 0,

vanish as soon as a gap of order τc appears among any of the time-points. We then have a good estimate of the size of each term in the expansion – each term is of order,

α^m t τc^(m−1).

To obtain from the exact Eq. 8.28 a practical approximation of the process ⟨v(t)⟩, we begin eliminating terms in the expansion. For example, if we eliminate all terms α² and higher, we have,

⟨v(t)⟩ ≈ ⌈exp( ∫₀ᵗ ⟨V(t1)⟩ dt1 )⌉ v(0),

which solves the differential equation,

d⟨v(t)⟩/dt = ⟨V(t)⟩ ⟨v(t)⟩,

the simplest possible approximation, where the fluctuations are replaced by their average. Suppose that ⟨V⟩ = 0 (see page 176); then, truncating Eq. 8.28 after two terms, we have,

⟨v(t)⟩ ≈ ⌈exp( (α²/2) ∫₀ᵗ ∫₀ᵗ ⟨V(t1)V(t2)⟩ dt2 dt1 )⌉ v(0),

or, equivalently,

⟨v(t)⟩ = ⌈exp( α² ∫₀ᵗ ∫₀^(t1) ⟨V(t1)V(t2)⟩ dt2 dt1 )⌉ v(0).   (8.29)

We introduce the operator,

K(t1) = ∫₀^(t1) ⟨V(t1)V(t2)⟩ dt2,

and consider the differential equation,

d⟨v(t)⟩/dt = α² K(t) ⟨v(t)⟩.   (8.30)

The solution is given by,

⟨v(t)⟩ = ⌈exp( α² ∫₀ᵗ K(t1) dt1 )⌉ v(0).   (8.31)

This solution is not quite the same as Eq. 8.29, although the difference is of higher order in ατc. We may therefore use Eq. 8.30 as an approximation of the evolution equation for ⟨v(t)⟩. No ad hoc assumptions regarding the factoring of correlations have been made, nor have any secular terms appeared in the derivation. Furthermore, we have a good estimate of the error incurred by the approximation – the error is of order (α³τc²). In the original representation, Eq. 8.30 is,

d⟨u(t)⟩/dt = [ A0 + α² ∫₀ᵗ ⟨A1(t) e^(A0τ) A1(t − τ)⟩ e^(−A0τ) dτ ] ⟨u(t)⟩,   (8.32)

whose kernel is identical to Eq. 8.21. Eq. 8.32 reduces to Bourret's approximation once ⟨u(t)⟩ is brought back inside the integral (although, being a convolution equation, Bourret's approximation is slightly more cumbersome to deal with than the Kubo renormalization), or to Kubo's approximation once the upper limit of integration is extended t → ∞.

With the many approximation methods discussed above, it is helpful to have a "user's guide" outlining the range of applicability of each approximation scheme (Figure 8.2).

Figure 8.2: User's guide for various approximations of the evolution equation for y(t) derived from a linear random differential equation. The range of validity of each scheme is set by the Kubo number ατc and by the time t: Born iteration (αt ≪ 1 or α²τc t ≪ 1), Bourret convolution (ατc < 1), Kubo renormalization (ατc ≪ 1 and t ≫ τc), static averaging (ατc ≫ 1, or else t ≪ τc). Redrawn after Brissaud and Frisch (1974).

Loosely speaking, if the correlation time of the fluctuations is short, then the Born approximation is only good over short times, the Kubo renormalization is good for times longer than the correlation time, and the Bourret approximation is good over short and long times.
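These validity claims can be checked against a scalar case that is exactly solvable. Assume A0 = 0 and A1(t) = ξ(t), a stationary Gaussian process with ⟨ξ(t)ξ(t′)⟩ = σ²e^(−|t−t′|/τc); then ⟨u(t)⟩ = exp[α²σ²(τc t − τc²(1 − e^(−t/τc)))] exactly (the Gaussian cumulant series terminates at second order), while Kubo's renormalization gives exp[α²σ²τc t]. For t ≫ τc the two differ only by the constant factor exp[α²σ²τc²], which is negligible precisely when the Kubo number ατc is small. A sketch (parameter values are arbitrary):

```python
import math

alpha, sigma, tau_c = 0.1, 1.0, 0.1   # Kubo number alpha*tau_c = 0.01

def exact_mean(t):
    """<u(t)> for du/dt = alpha*xi(t)*u with Gaussian, exponentially correlated xi.

    The exponent is alpha^2/2 times the variance of int_0^t xi(t') dt'.
    """
    half_var = sigma**2 * (tau_c * t - tau_c**2 * (1.0 - math.exp(-t / tau_c)))
    return math.exp(alpha**2 * half_var)

def kubo_mean(t):
    """Kubo renormalization: constant effective growth rate alpha^2 sigma^2 tau_c."""
    return math.exp(alpha**2 * sigma**2 * tau_c * t)

t = 50.0   # t >> tau_c
rel_err = abs(kubo_mean(t) - exact_mean(t)) / exact_mean(t)
print(rel_err)
```

The relative error saturates at roughly α²σ²τc² (here 10⁻⁴) no matter how large t becomes, consistent with the error estimate of order (ατc)² relative to the leading behaviour.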

If the correlation time of the noise is long, then the static averaging approximation is useful – if the Kubo number (ατc) is very large (ατc ≫ 1), then the approximation is good for all time, while if the Kubo number is not large, then the approximation is good for times shorter than the noise correlation time.

8.3 Example – Random Harmonic Oscillator

• N. G. van Kampen (1976) "Stochastic differential equations." Physics Reports 24: 171–228.
• M. Gitterman, The noisy oscillator: The first hundred years, from Einstein until now (World Scientific Publishing Company, 2005).

As an example of a simple linear system with stochastic coefficients, consider the random harmonic oscillator,

d²x/dt² + ω0² x = 0,   x(0) = a,   dx/dt(0) = 0.

For constant frequency ω0, the solution is x(t) = a cos(ω0 t). Suppose instead that ω0 is a random function,

ω0² → ω0²(1 + αξ(t)),

where ξ(t) is a zero-mean, stationary stochastic process. Re-scaling time, t → ω0 t, we have the non-dimensionalized model,

d²x/dt² + (1 + αξ(t))x = 0,   x(0) = a,   dx/dt(0) = 0.   (8.33)

The process ξ(t) is assumed to have a finite correlation time τc such that ⟨ξ(t)ξ(t − τ)⟩ = 0 for |τ| ≫ τc. We now examine two limiting regimes for τc.

Long correlation time, τc ≫ 1. If the correlation time τc is long, then we can invoke the static averaging approximation by treating ξ(t) → ξ as a time-independent random variable. The formal solution for x(t) is then given by,

x(t) = a cos[(1 + αξ)t].   (8.34)
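This frozen-frequency average is easy to evaluate by Monte Carlo for any f(ξ). As a sketch, take ξ ~ N(0, σ²); the standard Gaussian characteristic-function identity gives ⟨cos[(1 + αξ)t]⟩ = e^(−α²σ²t²/2) cos t, which the sampled ensemble reproduces (parameter values are arbitrary):

```python
import math
import random

def avg_x(t, a=1.0, alpha=0.1, sigma=1.0, n=200_000, seed=1):
    """Monte Carlo estimate of <x(t)> = a <cos((1 + alpha*xi) t)>, xi ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        xi = rng.gauss(0.0, sigma)
        total += math.cos((1.0 + alpha * xi) * t)
    return a * total / n

t, alpha, sigma = 5.0, 0.1, 1.0
mc = avg_x(t, alpha=alpha, sigma=sigma)
exact = math.exp(-0.5 * (alpha * sigma * t) ** 2) * math.cos(t)
print(mc, exact)
```

Each realization oscillates without decay; only the ensemble average is damped, by the progressive loss of phase coherence.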

To find ⟨x(t)⟩, we average Eq. 8.34 over the probability distribution for ξ, f(ξ),

⟨x(t)⟩ = a ∫_{−∞}^{∞} cos[(1 + αξ)t] f(ξ) dξ.   (8.35)

We tabulate several examples of f(ξ) below:

1. Cauchy: (σ ≪ 1) f(ξ) = C(0, σ):   ⟨x(t)⟩ = a e^(−ασt) cos t.
2. Uniform: (β ≪ 1) f(ξ) = U(−β, β):   ⟨x(t)⟩ = a (sin[αβt]/(αβt)) cos t.
3. Gaussian: (σ ≪ 1) f(ξ) = N(0, σ²):   ⟨x(t)⟩ = a e^(−α²σ²t²/2) cos t.

In all cases, we see again that the super-position of an ensemble of out-of-phase oscillators results in the damping of the average ⟨x(t)⟩.

Short correlation time, τc ≪ 1. It is convenient to re-write Eq. 8.33 in matrix-vector notation,

d/dt [x; ẋ] = ( 0  1 ; −1  0 )[x; ẋ] + αξ(t)( 0  0 ; −1  0 )[x; ẋ],   (8.36)

with,

A0 = ( 0  1 ; −1  0 ),   A1(t) = αξ(t)( 0  0 ; −1  0 ).   (8.37)

Using Kubo's renormalization approximation, Eq. 8.23, we have,

d²⟨x⟩/dt² + (α²/2) c2 d⟨x⟩/dt + (1 − (α²/2) c1) ⟨x⟩ = 0,   (8.38)

where,

c1 = ∫₀^∞ ⟨ξ(t)ξ(t − τ)⟩ sin[2τ] dτ,   c2 = ∫₀^∞ ⟨ξ(t)ξ(t − τ)⟩ (1 − cos[2τ]) dτ.

Comparing the two regimes, we find that for long correlation time it is the distribution f(ξ) of the fluctuations that is important, while for short correlation time it is the autocorrelation function ⟨ξ(t)ξ(t − τ)⟩ that determines the average state. Although outside the scope of this course, there are approximation methods that combine these ideas to develop an approximation method valid over all correlation times – see, for example, the review by Brissaud and Frisch:

• A. Brissaud and U. Frisch (1974) "Solving linear stochastic differential equations." Journal of Mathematical Physics 15: 524–534.

8.4 Example – Parametric Resonance

In contrast to the example above, we now consider a linear differential equation with time-varying coefficients – the Mathieu equation,

ü + [a + 2q cos 2t]u = 0.   (8.39)

Over a range of a and q, the periodic time-dependence in the frequency leads to instability via a mechanism called parametric resonance. This is the same effect that is exploited by children "pumping" a swing. To transform the Mathieu equation to a stochastic differential equation, we allow the parametric forcing to have a random amplitude, q → q(1 + αξ(t)). In vector-matrix notation,

d/dt [u; u̇] = ( 0  1 ; −(a + 2q cos 2t)  0 )[u; u̇] + αξ(t)( 0  0 ; −2q cos 2t  0 )[u; u̇].   (8.40)

Because the Mathieu equation has time-dependent coefficients, it is no longer possible to find a closed-form solution, even in the absence of fluctuations (α = 0).
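Although the deterministic Mathieu equation has no closed-form solution, the parametric instability itself is easy to exhibit by direct numerical integration. A sketch (parameter values chosen for illustration: for small q the principal resonance tongue is centred near a = 1, with growth rate of approximately q/2, while a = 2.5 lies between tongues and the motion stays bounded):

```python
import math

def mathieu_peak(a, q, T=100.0, dt=0.01):
    """Integrate u'' + (a + 2q cos 2t) u = 0 by RK4; return max |u(t)| on [0, T]."""
    def deriv(t, u, v):
        return v, -(a + 2.0 * q * math.cos(2.0 * t)) * u

    u, v, t, peak = 1.0, 0.0, 0.0, 1.0
    for _ in range(round(T / dt)):
        k1u, k1v = deriv(t, u, v)
        k2u, k2v = deriv(t + dt / 2, u + dt / 2 * k1u, v + dt / 2 * k1v)
        k3u, k3v = deriv(t + dt / 2, u + dt / 2 * k2u, v + dt / 2 * k2v)
        k4u, k4v = deriv(t + dt, u + dt * k3u, v + dt * k3v)
        u += dt / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
        peak = max(peak, abs(u))
    return peak

peak_unstable = mathieu_peak(1.0, 0.2)   # inside the principal resonance tongue
peak_stable = mathieu_peak(2.5, 0.2)     # between tongues: bounded oscillation
print(peak_unstable, peak_stable)
```

The amplitude inside the tongue grows by orders of magnitude over the integration interval, while the detuned case remains of order one.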

Nevertheless, the formal solution can be written as a time-ordered exponential,

[u; u̇](t) = ⌈exp( ∫₀ᵗ A0(t1) dt1 )⌉ · [u(0); u̇(0)].   (8.41)

If we adopt the Kubo renormalization approximation, the correction to the deterministic dynamics is the integral,

∫₀^∞ ⟨ A1(t) · ⌈exp( ∫₀^τ A0(t1) dt1 )⌉ · A1(t − τ) ⟩ · ⌈exp( −∫₀^τ A0(t1) dt1 )⌉ dτ.   (8.42)

Here, ⌈exp( −∫₀^τ A0(t1) dt1 )⌉ is the matrix inverse of the deterministic propagator, and

A1(t) = αξ(t)( 0  0 ; −2q cos 2t  0 ).

Let ξ(t) be a colored noise process, with an exponential correlation function,

⟨ξ(t)ξ(t − τ)⟩ = exp( −|τ|/τc ).   (8.43)

In that case, we can invoke Watson's lemma (Section B.2) to approximate the integral (8.42) as a series in powers of τc (Exercise 9).

8.5 Optional – Stochastic Delay Differential Equations

In many models of physical systems, several degrees of freedom are eliminated from the governing equations by introducing a time delay into the dynamics. The result is a system of delay differential equations. For linear delay differential equations with stochastic coefficients, it is straightforward to derive an approximation for the averaged process as we have done in the preceding sections. In particular, we shall derive Bourret's convolution approximation for a linear stochastic delay equation in the limit that the time-delay is large compared to the correlation time of the fluctuating coefficients.

To that end, consider the linear delay differential equation,

dx/dt = ax + bx_τ + η(t)x,   x(t) = x0 for t ≤ 0,   (8.44)

where τd is the delay time, x_τ = x(t − τd), and η(t) is zero-mean multiplicative noise. The one-sided Green's function g(t) for this system is calculated using the Laplace transform. Consider the original delay equation without noise,

dx/dt = ax + bx_τ.

The auxiliary equation characterizing g(t) is,

dg/dt − ag(t) − bg(t − τd) = δ(t).

Taking the Laplace transform L{·}, with ĝ(s) = L{g(t)}, and since L{g(t − τd)} = e^(−sτd) L{g(t)},

sĝ(s) − aĝ(s) − be^(−sτd) ĝ(s) = 1,

so that,

g(t) = L⁻¹{ĝ(s)} = L⁻¹{ 1/(s − a − be^(−sτd)) }.

Using the one-sided Green's function g(t), we write the formal solution of (8.44) as a convolution,

x(t) = x0 g(t) + ∫₀ᵗ g(t − t′) η(t′) x(t′) dt′.

With substitution into the right-hand side of (8.44), and taking the ensemble average, we are left with the evolution equation for the first-moment,

d⟨x⟩/dt = a⟨x⟩ + b⟨x_τ⟩ + ∫₀ᵗ g(t − t′) ⟨η(t) η(t′) x(t′)⟩ dt′.
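The Green's function has no simple closed form, but it can be computed by the method of steps and checked against the Laplace-domain expression ĝ(s) = 1/(s − a − be^(−sτd)). On the first two intervals the method of steps gives g(t) = e^(at) for 0 ≤ t < τd, and g(t) = e^(at) + b(t − τd)e^(a(t−τd)) for τd ≤ t < 2τd. A sketch with hypothetical parameter values:

```python
import math

a, b, tau_d = -1.0, 0.3, 1.0   # illustrative values (stable when noiseless)
dt, T = 0.001, 30.0
n, lag = round(T / dt), round(tau_d / dt)

# Euler integration of dg/dt = a g(t) + b g(t - tau_d),
# with g(0) = 1 (the delta impulse) and g(t) = 0 for t < 0.
g = [0.0] * (n + 1)
g[0] = 1.0
for i in range(n):
    delayed = g[i - lag] if i >= lag else 0.0
    g[i + 1] = g[i] + dt * (a * g[i] + b * delayed)

# Compare with the method-of-steps solution on tau_d <= t < 2 tau_d.
t1 = 1.5
i1 = round(t1 / dt)
exact_t1 = math.exp(a * t1) + b * (t1 - tau_d) * math.exp(a * (t1 - tau_d))

# The numerical Laplace transform should match 1/(s - a - b e^{-s tau_d}).
s = 0.5
ghat_num = sum(g[i] * math.exp(-s * i * dt) for i in range(n + 1)) * dt
ghat_exact = 1.0 / (s - a - b * math.exp(-s * tau_d))
print(g[i1], exact_t1, ghat_num, ghat_exact)
```

The time-domain and Laplace-domain computations agree to the accuracy of the discretization.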

Invoking Bourret's approximation to factor the cross-correlation,

⟨η(t) η(t′) x(t′)⟩ ≈ ⟨η(t) η(t′)⟩ ⟨x(t′)⟩,

we have,

d⟨x⟩/dt = a⟨x⟩ + b⟨x_τ⟩ + ∫₀ᵗ g(t − t′) ⟨η(t) η(t′)⟩ ⟨x(t′)⟩ dt′.

We further assume that η(t) is a Gaussian distributed, stationary Markov process, so that,

⟨η(t) η(t′)⟩ = K(t − t′) = σ² exp( −|t − t′|/τc ),

where τc is the correlation time of the noise. The approximate evolution equation then simplifies to,

d⟨x⟩/dt = a⟨x⟩ + b⟨x_τ⟩ + ∫₀ᵗ g(t − t′) K(t − t′) ⟨x(t′)⟩ dt′.

Writing the convolution explicitly,

d⟨x(t)⟩/dt = a⟨x(t)⟩ + b⟨x(t − τd)⟩ + [g(t)K(t)] ∗ ⟨x(t)⟩.

The Laplace transform ⟨x̂(s)⟩ is readily obtained,

⟨x̂(s)⟩ = x0 / ( s − a − be^(−sτd) − σ² L{ g(t) e^(−t/τc) } ).

Explicit inversion of the Laplace transform ĝ(s) is difficult, but since,

L{ g(t) e^(−t/τc) } = ĝ( s + 1/τc ),

we have, writing s̃ = s + 1/τc,

⟨x̂(s)⟩ = x0 / ( (s − a − be^(−sτd)) − σ² [ s̃ − a − be^(−s̃τd) ]⁻¹ ).   (8.45)
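The dominant pole of ⟨x̂(s)⟩ in Eq. 8.45 – the root of its denominator – sets the asymptotic growth or decay rate of ⟨x(t)⟩, and can be located by Newton iteration. A sketch (hypothetical parameter values; the noiseless resolvent root is computed alongside for comparison, and the σ² term shifts the pole toward instability):

```python
import math

a, b, tau_d = -1.0, 0.3, 1.0       # illustrative values
sigma, tau_c = 0.2, 0.1

def D(s):
    """Denominator of <x^(s)> in Eq. (8.45); its root is the dominant pole."""
    st = s + 1.0 / tau_c
    return (s - a - b * math.exp(-s * tau_d)) - sigma**2 / (st - a - b * math.exp(-st * tau_d))

def Dprime(s):
    st = s + 1.0 / tau_c
    inner = st - a - b * math.exp(-st * tau_d)
    return 1.0 + b * tau_d * math.exp(-s * tau_d) \
        + sigma**2 * (1.0 + b * tau_d * math.exp(-st * tau_d)) / inner**2

s = 0.0
for _ in range(50):                 # Newton iteration for the noisy pole
    s -= D(s) / Dprime(s)

s0 = 0.0                            # noiseless pole: s - a - b e^{-s tau_d} = 0
for _ in range(50):
    f = s0 - a - b * math.exp(-s0 * tau_d)
    fp = 1.0 + b * tau_d * math.exp(-s0 * tau_d)
    s0 -= f / fp

print(s, s0)
```

For these parameter values both poles are negative (the averaged process decays), but the multiplicative noise moves the pole to the right of its noiseless position.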

In the absence of noise, the asymptotic stability of x(t) is determined by the real part of the pole of the Laplace transform x̂(s), or equivalently by the real part of s⋆, where s⋆ is the root of the resolvent equation,

s⋆ − a − be^(−s⋆τd) = 0.

Similarly, the asymptotic stability of the first-moment ⟨x(t)⟩ is determined by the real part of s⋆ satisfying,

( s⋆ − a − be^(−s⋆τd) ) − σ² [ (s⋆ + 1/τc) − a − be^(−(s⋆ + 1/τc)τd) ]⁻¹ = 0.   (8.46)

In this form, the roots of the resolvent equation can be developed as an asymptotic series in a straightforward manner (see Exercise 11). To examine the white-noise limit (in the Stratonovich sense), it is convenient to make the substitution σ² = Γ²/(2τc), so that,

lim_{τc→0} σ² e^(−|t|/τc) = lim_{τc→0} (Γ²/(2τc)) e^(−|t|/τc) = Γ² δ(t),

with ∫₀^∞ δ(t) dt = 1/2 (the Stratonovich interpretation of the Dirac delta function). We then write the resolvent equation as,

( s⋆ − a − be^(−s⋆τd) )( s⋆ + 1/τc − a − be^(−τd/τc) e^(−s⋆τd) ) − Γ²/(2τc) = 0.

Suggested References

Much of the early part of this chapter comes from van Kampen's review,

• N. G. van Kampen (1976) "Stochastic differential equations." Physics Reports 24: 171–228,

and the review,

• A. Brissaud and U. Frisch (1974) "Solving linear stochastic differential equations." Journal of Mathematical Physics 15: 524–534.

Two excellent sources are:

• V. I. Klyatskin and V. I. Tatarskii (1974) "Diffusive random process approximation in certain nonstationary statistical problems of physics." Soviet Physics – Uspekhi 16: 494–511, translated from Russian.

This second article is notable in its use of the Novikov-Furutsu relation (see Section 11.5 on page 226).

Exercises

1. Jenny's paradox: In the course of testing her code to numerically integrate stochastic differential equations, Jenny noticed that the following differential equation,

dx/dt = −xη(t),   x(0) = 1,

(where η(t) is Gaussian white noise) had different averaged behaviour ⟨x(t)⟩ depending upon how she coded the white noise (Figure 8.3).

Figure 8.3: Jenny's paradox – Exercise 1. A) Two different schemes to represent the differential equation ẋ = −x × η(t):

Scheme 1:  x(t + Δt) = x(t) − x(t)·η(t)·Δt,  with  η(t + Δt) = ρ·η(t) + √(1 − ρ²)·N(0, 1/(2τc)),  ρ ≡ exp[−Δt/τc];
Scheme 2:  x(t + Δt) = x(t) − x(t)·√Δt·N(0, 1).

B) The averaged behaviour ⟨x(t)⟩ using each of the schemes shown in panel A. N(μ, σ²) represents a random number drawn from a Normal distribution with mean μ and variance σ². The step-size is Δt = 10⁻², and the correlation time is τc = 10⁻¹. The average is taken over an ensemble of 1000.

(a) Explain what each routine shown in Figure 8.3A is intended to do. What is the fundamental difference between the two routines?

along with the subordinate (1) density functions Q1 (s) and Q2 (θ|s). we shall consider how to transform a realization r ∈ U(0. 1) to a number n ∈ N(0. 1). 3. it is often necessary to generate Gaussian-distributed random numbers. 5. Cumulant expansion: In the derivation of the cumulant expansion. follows from the Langevin equation interpreted in the Itˆ sense. explicit derivation of the algorithm is illuminating.Random Diﬀerential Equations 193 (b) Why don’t both routines return the same result? What should Jenny expect the averages to look like? What will happen to the output of scheme 1 if the step-size ∆t is reduced? 2. several details were left out. 8. 8. derive the Bourret and Kubo approximations. o (b) Show that the update scheme.31? (b) From Eq. t exp i 0 F (t′ )dt′ t = exp −σ 2 0 (t − τ )ψ(τ )dτ . where F (t1 )F (t2 ) = σ 2 ψ(|t1 − t2 |).29 and Eq.2. provides the necessary correlation function for the Ornstein-Uhlenbeck process F (t). introduce two auxiliary random numbers s = σ[2 ln(1/r1 )]1/2 and θ = 2πr2 . Eq. σ 2 ). Eq. . 8. (a) What makes it impossible to directly invert a unit uniform random variable into a normally distributed random number? Why is this not a problem for inversion to an exponential or Cauchy distribution? (b) Using two unit uniform realizations r1 and r2 . Show that x1 = µ + s cos θ and x2 = µ + s sin θ are a pair of statistically independent sample values of the Gaussian distribution N(µ. (a) What is the diﬀerence between Eq. 8. Show that for a zero-mean Gaussian process F (t). Though most coding packages include subroutines for just this purpose. Hint: Use the Joint Inversion Generating Method described on page 245. 4. Simulation of the Langevin equation (in the Itˆ’s sense): o (a) Show that the update scheme.10. 8. Gaussian distributed random number generator: In the simulation of random diﬀerential equations. Since there are many tried and true routines for generating a unit uniform random number.32.

7. . (c) Determine the conditions for stability of the second moments (mean-squared stability) for the damped random harmonic oscillator discussed in Section 8. Redﬁeld equations: We can exploit the linearity of Eq. It may be helpful to review stability conditions for ordinary diﬀerential equations. Eigenvalues of the averaged random harmonic oscillator: From the equation governing the averaged response of the random harmonic oscillator. du = [A0 + αA1 (t)] · u ≡ A · u. 8. Verify the approximate equation for the second moments using an ensemble of simulation data. derive the eigenvalues of the system for the following choices of noise autocorrelation function: (a) Exponential correlation: (b) Uniform correlation: (c) Triangle correlation: ξ (t) ξ (t − τ ) = σ2 τc ξ(t)ξ(t − τ ) = = σ 2 exp − |τc| τ σ2 2τc ξ (t) ξ (t − τ ) − σ2 |τ | τc 0 2 0 −τc ≤ τ ≤ τc otherwise −τc ≤ τ ≤ τc otherwise ξ(t)ξ(t − τ ) = σ 2 sin |τ | π |τ | (d) Damped sinusoidal correlation: Do you notice any commonalities among the eigenvalues resulting from all of these various choices? Explain.194 Applied stochastic processes 6.3. each component of the N -dimensional diﬀerential equation.2. (b) Use either Bourret’s convolution equation or Kubo’s renormalization approximation to derive an equation for the second moments ui uk for the damped random harmonic oscillator discussed in Section 8. Eq. (a) Using tensor notation. Appendix B. Use this formulation to derive the diﬀerential equation for the products ui uk .3. 8. Verify the stability bounds using an ensemble of simulation data.38. dt N is written ui = j=1 Aij uj .1.11 to generate an equation for the higher moments. cf.

(d) Use Kubo's renormalization approximation to derive an equation for the second moments ⟨u_i u_k⟩ for the parametric oscillator discussed in Section 8.4. What happens to the corrected dynamics in the limit of white noise τc → 0? Verify the approximate equation for the second moments using an ensemble of simulation data.

8. Additive noise: Consider the inhomogeneous random differential equation,

du/dt = [A0 + αA1(t)] · u + f(t),

where A1(t) and f(t) are correlated, and f(t) = f0 + f1(t). We seek an approximate evolution equation for the mean ⟨u⟩.

(a) Notice that the inhomogeneous equation can be written as a homogeneous equation if the state space is expanded,

d/dt [u; 1] = ( A0  f0 ; 0  0 )[u; 1];

show that,

exp( τ ( A0  f0 ; 0  0 ) ) = ( e^(τA0)  (e^(τA0) − 1)A0⁻¹f0 ; 0  1 ).

(b) Derive an equation for ⟨u⟩.

9. Parametric resonance: It is challenging to compute the evolution equation for the average process ⟨u⟩ characterized by the stochastic Mathieu equation, Eq. 8.40. The primary difficulty lies in the time dependence of the coefficients, which results in the deterministic Mathieu equation not having a closed-form solution, allowing the methods developed in this section to be readily applied only approximately.

(a) Write out the formal solution of the deterministic Mathieu equation (i.e. α = 0 in Eq. 8.40) as a series of iterated integrals (the Born approximation).

(b) For a narrowly-peaked correlation function, we can invoke Watson's lemma (Section B.2) to express integral (8.42) as a series in the correlation time τc. Write out the details of this approximation, and thereby arrive at the renormalized equation for ⟨u⟩.

(c) The stability of the mean of an oscillatory process is not particularly informative – much more important is the energy stability, determined by the stability of ⟨u² + u̇²⟩. Use the Redfield equations (Exercise 7) to generate approximate equations governing ⟨u²⟩, ⟨uu̇⟩ and ⟨u̇²⟩. Under what conditions are these asymptotically stable? Verify the approximation results using an ensemble of simulation data.

10. Random differential equations with Markov noise: For the random differential equation du/dt = F(u, Y(t)), where u ∈ R^N and Y(t) is a Markov process having probability density Π(y, t|y0, t0) obeying the master equation dΠ/dt = WΠ, the joint probability density P(u, y, t|u0, y0, t0) obeys its own master equation,

∂P/∂t = −Σ_{i=1}^N ∂/∂u_i [ F_i(u, y) P ] + WP.   (8.47)

Eq. 8.47 is exact, and it makes no assumptions about the correlation time of the fluctuations. Nevertheless, this equation is often not very useful since W can be quite large (even infinite).

(a) Show that if the random differential equation is linear (with coefficients that do not depend upon time),

du_i/dt = Σ_{j=1}^N A_ij(Y(t)) u_j,

then Eq. 8.47 reduces to a particularly simple form.

(b) Define the marginal averages m_i,

m_i(y, t) = ∫ u_i P(u, y, t) du.

The average of the process of interest is given by the integral,

⟨u_i(t)⟩ = ∫ m_i(y, t) dy.

For the linear system introduced in (10a), find the evolution equation for m_i. These equations must be solved with the initial values m_i(y, 0) = u_{0i} Π(y, 0).

(c) If Y(t) is a dichotomic Markov process (i.e. a process taking one of two values ±1 with transition rate γ; see Exercise 4 on page 76), then the joint probability is simply the two-component vector P(u, t) ≡ [P+(u, t), P−(u, t)], with transition matrix,

W = ( −γ  γ ; γ  −γ ).

Consider the random differential equation u̇ = −iω(t)u, where u is complex and ω(t) is a random function of time. Let ω(t) = ω0 + αξ(t), where ξ(t) is a dichotomic Markov process. This example was introduced by Kubo to illustrate the effect of fluctuations on spectral line broadening. Show that Eq. 8.47 gives the two coupled equations,

∂P+/∂t = i ∂/∂u [ (ω0 + α)uP+ ] − γP+ + γP−,
∂P−/∂t = i ∂/∂u [ (ω0 − α)uP− ] − γP− + γP+.

Find the equations for the two marginal averages m±(t) using the initial condition m±(0) = ½a. Find ⟨u(t)⟩ = m+(t) + m−(t). What can you say about the two limits γ ≪ α (slow fluctuations) and γ ≫ α (fast fluctuations)?

11. Stochastic delay differential equations: Read Section 8.5.

(a) To determine the stability of the stochastic delay differential equation is not easy. Re-write the resolvent equation by nondimensionalizing all of the parameters appearing in Eq. 8.46 with respect to the delay time τd, and introduce the perturbation parameter ε = τc/τd.

(b) Assume the correlation time is short compared with the delay time (ε = τc/τd ≪ 1), and develop a perturbation series for s = s0 + εs1 + ε²s2 + ···. The zero-order term s0 is given in terms of Lambert functions. Find an expression for s1.

(c) Derive the analogue of Eq. 8.45 for a system of delay differential equations; i.e. find an expression for ⟨x̂(s)⟩, where x(t) is an n-dimensional vector.

Chapter 9

Macroscopic Effects of Noise

The great success of deterministic ordinary differential equations in mathematical modeling tends to foster the intuition that including fluctuations will simply distort the deterministic signal somewhat. In that sense, studying stochastic processes seems to be an interesting hobby, but not very practical. We have seen, however, that the statistics of the fluctuations can be used to measure parameters that are not accessible on an observable scale – for example, Avogadro's number as determined by Perrin from Einstein's work on Brownian motion (page 8), Johnson's determination of Avogadro's number from thermal noise in resistors using Nyquist's analysis (page 46), and even the mutation rate in bacteria (see Exercise 5 on page 23). Alternatively, in this chapter we consider some examples where stochastic models exhibit behaviour that is qualitatively different from their deterministic counterparts¹: unstable equilibrium points that become stable (Keizer's paradox, Section 9.1), or stable systems that exhibit regular oscillations when fluctuations are included (Sections 9.2 and 9.3). In these cases, stochastic models provide insight into system behaviour that is simply not available from deterministic formulations.

¹ Deterministic counterpart means the deterministic ordinary differential equations obtained in the macroscopic limit of the master equation (i.e. the zero'th order term in the linear noise approximation): the number of individuals and the volume go to infinity while the density is held fixed, leading to a distribution of states that follows the macroscopic solution.

9.1 Keizer's Paradox

• J. Keizer, Statistical thermodynamics of nonequilibrium processes (Springer-Verlag, 1987), pp. 164–169.
• M. Vellela and H. Qian (2007) "A quasistationary analysis of a stochastic chemical reaction: Keizer's paradox," Bulletin of Mathematical Biology 69: 1727.

The fluctuations described by the master equation come about, in part, because the systems under study are composed of discrete particles. It should be no surprise, then, that as the number of particles decreases, the discrete nature of the constituents will become manifest. For example, a fixed point that is unstable with respect to infinitesimal perturbations can be stable if approached along a discrete lattice. This is Keizer's paradox. Some authors have taken variations of Keizer's paradox to suggest that the master equation formalism is somehow fundamentally flawed, or that the macroscopic limit is problematic. Such claims should not be taken too seriously, but it is important to emphasize that Keizer used his example to illustrate how correct macroscopic behaviour is sometimes not exhibited by the master equation, suggesting caution in interpreting asymptotic behaviour of the master equation literally. We shall examine his argument in more detail below.

As a concrete example, consider the following (nonlinear) autocatalytic reaction scheme:

   X ⇌ 2X  (forward rate k1, reverse rate k−1),   X → ∅  (rate k2).   (9.1)

The deterministic equation governing the concentration of the species X is given by the differential equation,

   dX/dt = (k1 − k2)X − k−1 X².   (9.2)

The two equilibrium points for the model are,

   X^ss = 0  and  X^ss = (k1 − k2)/k−1.   (9.3)

It is straightforward to show that X^ss = 0 is unstable, while the equilibrium point X^ss = (k1 − k2)/k−1 is stable. Re-cast as a master equation, the probability p(n, t) of the system having n molecules of X at time t obeys the (nonlinear) master equation,

   dp/dt = k̂1 (E⁻¹ − 1)[np] + k̂2 (E − 1)[np] + k̂−1 (E − 1)[n(n − 1)p],   (9.4)

where we have absorbed the volume V into the definition of the rate constants: k̂1 = k1/V, etc. Writing out each component of p(n) as a separate element in the vector p, the master equation can be expressed in terms of the transition matrix W (Exercise 1b),

   dp/dt = W · p.   (9.5)

It is more convenient to separate each element in Eq. 9.5 as a coupled system of ordinary differential equations. So we have,

   dp0/dt = k̂2 p1,
   dp1/dt = 2(k̂−1 + k̂2) p2 − (k̂1 + k̂2) p1,
   dp2/dt = k̂1 p1 − 2(k̂−1 + k̂1 + k̂2) p2 + 3(2k̂−1 + k̂2) p3,

and so on. At steady state, the first equation forces p1 = 0, and by induction all probabilities pn for n > 0 are zero. Because the sum of all the probabilities must add to one, this forces p0 = 1. So we have,

   p*(0) = 1  and  p*(n) = 0,  n > 0,   (9.6)

as the probability distribution for the unique steady state of the stochastic model. Note that the steady states of the deterministic model are fixed points, while the steady state of the stochastic model is a distribution. Quoting from Vellela and Qian, "The stochastic model shows that eventually there will be no X left in the system. This is in striking contrast to the previous deterministic model, which predicts that the concentration of X will stabilize at the nonzero x*, while x* = 0 is unstable. This is the Keizer's paradox." This 'paradox' is of the same form as Zermelo's paradox discussed on page 67, and is resolved in precisely the same fashion: for systems with a reasonable number of molecules, the paradox disappears. Quoting from Keizer (p. 166): "Since n0 is large, say X(0) = 1000, the probability of nX taking on other values will grow, and in a time τ1 = 1/k1 it will become a sharp Gaussian centered at n∞ = V k̂1/2k̂2 [the macroscopic equilibrium point]. On a much longer time scale, of the order of τ2 = τ1 e^{V k̂1/2k̂2}, the systems of the ensemble in the low-nX tail of the Gaussian will fall into the absorbing state at nX = 0. This creates a new peak near nX = 0 and, if one waits for the period of time τ2, almost all members of the ensemble will have no X molecules."
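The separation between τ1 and τ2 is easy to see in simulation. Below is a minimal Gillespie (SSA) sketch of the scheme in Eq. 9.1 – the rate values are hypothetical – started at the deterministic fixed point: on any reasonable simulation horizon the trajectory simply fluctuates about the quasi-stationary state, and extinction (the true stationary state, Eq. 9.6) is never observed.

```python
import numpy as np

rng = np.random.default_rng(0)
k1, k2, km1, V = 2.0, 1.0, 1.0, 100.0   # X -> 2X, X -> 0, 2X -> X (hypothetical rates)
n = int(V * (k1 - k2) / km1)            # start at the deterministic fixed point, n* = 100
t, T = 0.0, 50.0
extinct = False
while t < T:
    a = np.array([k1 * n, k2 * n, (km1 / V) * n * (n - 1)])  # reaction propensities
    a0 = a.sum()
    if a0 == 0.0:                        # n = 0 is absorbing: nothing can fire
        extinct = True
        break
    t += rng.exponential(1.0 / a0)       # waiting time to the next reaction
    r = rng.choice(3, p=a / a0)          # which reaction fires
    n += (+1, -1, -1)[r]
print(n, extinct)
```

Even at this modest system size (V = 100), the trajectory hovers near n* with fluctuations of order √n; the absorbing state is astronomically far away in time, exactly as Keizer argued.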

The important point here is the difference in the two time scales τ1 and τ2: τ1 is independent of the volume, whereas τ2 depends strongly on the volume and becomes infinite like e^V. For a large system, to witness the probability distribution in Eq. 9.6 one would have to wait a time of the order of 10^(10^23) times longer than it takes to achieve the distribution centered at n∞ = V k̂1/2k̂2. This is tantamount to the probability in Eq. 9.6 being unobservable. As Keizer says, "if used uncritically, [the master equation] can lead to meaningless results."

There are situations where the behaviour of a stochastic system, even with a large number of molecules, is obviously in conflict with the predictions of the deterministic model, coming from mechanisms distinct from the separation of time scales underlying Keizer's paradox. Other deviant effects are described in:

• M. S. Samoilov and A. P. Arkin (2006) Deviant effects in molecular reaction pathways. Nature Biotechnology 24: 1235–1240.

We shall examine two examples in the following sections.

9.2 Oscillations from Underdamped Dynamics

Before we consider quantification of noise-induced oscillations, we will collect a review of relevant background scattered through previous chapters. We will generally be interested in systems where we are able to apply the linear noise approximation. Looking back to the multivariate linear noise approximation (Eq. 6.10 on page 126), we found that to leading order in the system size Ω, the fluctuations α obey the linear Fokker-Planck equation,

   ∂Π/∂t = − Σ_{i,j} Γij ∂i (αj Π) + ½ Σ_{i,j} Dij ∂i ∂j Π,   (9.7)

where ∂i ≡ ∂/∂αi and,

   Γij(t) = ∂fi/∂xj |x(t),   D = S · diag[ν] · Sᵀ.   (9.8)

Here f are the macroscopic reaction rates, S is the stoichiometry matrix and ν is the vector of reaction propensities. The matrices Γ and D are independent of α, which appears only linearly in the drift term. As a consequence, the distribution Π(α, t) will be Gaussian for all time. In particular, at equilibrium Γs and Ds will be constant, and the fluctuations are distributed with density,

   Πs(α) = [(2π)^{d/2} (det Ξs)^{1/2}]⁻¹ exp[−½ αᵀ · Ξs⁻¹ · α],

with variance Ξs = ⟨α · αᵀ⟩ determined by,

   Γs · Ξs + Ξs · Γsᵀ + Ds = 0.   (9.9)

Moreover, the steady-state autocorrelation is exponential (see Eq. 5.18),

   B(t) = ⟨αs(t) · αᵀ(0)⟩ = exp[Γs t] · Ξs.   (9.10)

In Chapter 2, we introduced the fluctuation spectrum S(ω) (see Section 2.3 on page 39),

   S(ω) = (1/2π) ∫_{−∞}^{∞} e^{−iωτ} B(τ) dτ  ⟺  B(τ) = ∫_{−∞}^{∞} e^{iωτ} S(ω) dω,   (9.11)

which is the Fourier transform of the autocorrelation function, and provides the distribution of the frequency content of the fluctuations.
it is straightforward to show that the Fourier transform of Eq. 5. Finally. (9. The matrices Γ and D are independent of α. t) will be Gaussian for all time. again. the distribution Π(α. i.j 1 2 Dij ∂ij Π.10 is (Excercise 2). ∂Π =− ∂t Γij ∂i (αj Π) + i. (9. at equilibrium Γs and Ds will be constant and the ﬂuctuations are distributed with density.202 Applied stochastic processes ﬂuctuations α obey the linear Fokker-Planck equation. we introduced the ﬂuctuation spectrum S(ω) (see Section 2.9) Γs · Ξs + Ξs · Γs T + Ds = 0.j (9. (9. √ Ω −1 : and. In particular.12) .7) where ∂i ≡ ∂ ∂αi Γij (t) = ∂fi ∂xj x(t) D = S · diag[ν] · ST . B(t) = αs (t) · αT (0) = exp [Γs t] · Ξs . the steady-state autocorrelation is exponentially distributed (see Eq.10) In Chapter 2. S is the stoichiometry matrix and nu is the vector of reaction propensities.18 on page 5. s (9. Πs (α) = (2π) det Ξs and variance Ξs = α · αT d 1 2 1 exp − αT · Ξs −1 · α . which appears only linearly in the drift term.18).11) which is the Fourier transform of the autocorrelation function. 9. f are the macroscopic reaction rates. 2 determined by. (9. and provides the distribution of the frequency content of the ﬂuctuations. In multivariate systems. Sα (ω) = 1 −1 [Γs + Iiω] · Ds · ΓT − Iiω s 2π −1 . S(ω) = 1 2π ∞ −∞ e−iωτ B(τ )dτ ⇐⇒ B(τ ) = ∞ −∞ eiωτ S(ω)dω.8) where. As a consequence.3 on page 39).

then noiseinduced oscillations are inevitable and these oscillations will have amplitude proportional to Ds . dt (9. dα − Γs · α = B · η.5).. then the system will selectively amplify the components of the white-noise forcing that lie near to the resonance frequency of the system. looking at the spectrum at the level of species concentration. For a multivariate system.e. dα = Γs · α + B · η.13) where Ds = BBT and η is a vector of uncorrelated unit-variance white noise with the same dimensions as the reaction propensity vector ν. the amplitude of the oscillations will of course vanish. it was shown that the linear Fokker-Planck equation is equivalent to the Langevin equation. 9. and we ﬁnd that Eq.7 is equivalent to. as we discuss below. constant Ω concentration). and that if any of the eigenvalues of Γs have a non-zero imaginary part. Said another way. recovering the deterministic stability implied by the negative real parts of the eigenvalues of Γs . in Chapter 6 (on page 6. dt (9. narrow peaks in Sα (ω) indicate coherent noise-induced oscillations. Sx (ω) = 1 −1 Ds [Γs + Iiω] · · ΓT − Iiω s 2π Ω −1 .Macroscopic Eﬀects of Noise 203 Since Sα (ω) contains the distribution of the frequency content of the ﬂuctuations. . underdamped).15) if the eigenvalues of Γs are complex (i. the same result holds. (9. Finally. In the deterministic limit (Ω → ∞.14) where now the dynamics of α about the deterministic steady-state are mapped to a harmonic oscillator with white-noise forcing. Physical insight into the origin of noise-induced oscillations comes from re-writing this equation more suggestively. It should be clear that the frequency response of the system is given by the eigenvalues of the deterministic Jacobian Γs .

Following McKane and Newman. No. Dieckmann and Doebeli (2007) Oikos 116: 53–64. 218102. Notice the vertical axis is logarithmic.16) . Data is from Elton and Nicholson (1942) J.1). governing the birth. dt dZ = γ · Z · P − δZ. Ecol. death and predation among a population of predators Z and their prey P . we consider noise-induced oscillations in predatory-prey models used in theoretical ecology. (9. dP = α · P − β · Z · P.1: Oscillations in Predator-Prey Populations. One of the earliest models of predator-prey dynamics is the Lotka-Volterra model (1925-1926). Some very brief historical context helps to illuminate the signiﬁcance of McKane and Newman’s analysis. Anim. Sample of lynx and hare populations taken from Pineda-Krch. Predatory-prey populations often exhibit oscillatory dynamics (Figure 9. dt Here: α : birth rate of prey β : rate of predation of P by Z γ : eﬃciency of the predator to turn food into oﬀspring δ : death rate of predators. 11: 215–244.2. Blok. Figure 5A.1 Example – Malthus Predator-Prey Dynamics McKane AJ.204 Applied stochastic processes Figure 9. Newman TJ (2005) “Predator-prey cycles from resonant ampliﬁcation of demographic stochasticity. 9.” Physical Review Letters 94 (21): Art.

9. Despite the fact that Eqs. B) Fluctuation spectrum for the predator population extracted from an ensemble of 500 stochastic simulations. 9. dP P − β · Z · P. The deterministic trajectory rapidly approaches equilibrium (dashed line). Physically. Eq.15) to the spectrum extracted from an ensemble of 500 . (Excercise 4). this modiﬁed set of equations no longer oscillates.17) where K is called the “carrying capacity of the environment.Macroscopic Eﬀects of Noise A) B) 205 Figure 9. Unfortunately.2: Noise-induced oscillations in predator-prey models. A simple modiﬁcation of the Lotka-Volterra model that attempts to increase the relevance of the equations is to limit the exponential growth of the prey P by introducing a death term corresponding to overcrowding. so the magnitude of the oscillations depends precisely upon the initial conditions. 9. they do exhibit noise-induced oscillations! Figure 9. 9.17. =α·P · 1− dt K dZ = γ · Z · P − δZ. the eigenvalues are pure imaginary.2B compares the theoretical spectrum (Eq. Noise-induced oscillations are evident in the simulation data.2A shows an example trajectory of the predator-prey dynamics from a stochastic simulation.” and corresponds to the maximum prey population that can be sustained by the environment.17 do not oscillate deterministically. The inset corresponds to the power spectrum for the prey.15. Eq. Taken from McKane and Newman (2005). dt (9. A) Sample stochastic simulation of the modiﬁed Lotka-Volterra model. it makes more sense for the model to evolve toward some stable limit cycle. compared with the analytic approximation. While these equations do indeed exhibit oscillatory dynamics. Figure 9. irrespective of the initial conditions.

as well. the system is linearized about the equilibrium point: x = xs + xp . In analogy with linear stability analysis of ordinary nonlinear diﬀerential equations. Deviations from deterministic behaviour can occur in systems with all-real eigenvalues. the loss of stability may simply bounce the state among multiple ﬁxed points. d xp = J(0) · xp . linearized about a deterministic ﬁxed point.3 • Eﬀective Stability Analysis (ESA) M. we √ −1 set x = xs + xp + ωα(t). To accommodate ﬂuctuations on top of the small perturbation xp . Scott. The eﬀective stability approximation is a method by which the eﬀect of the noise is used to renormalize the eigenvalues of the system. it cannot be used to provide the ﬂuctuation spectrum of the noise-induced oscillations. dt (9. if all the eigenvalues have negative real part. Hwa and B. the eﬀective stability approximation will demonstrate that the system is unstable. To calculate the stability of the macroscopic model dx = f (x) to dt small perturbations. Both will be brieﬂy reviewed below.206 Applied stochastic processes stochastic simulations. where we have written ω ≡ Ω to keep the . Obviously. The approximation method combines the linear noise approximation of Chapter 4 (page 100) to characterize the ﬂuctuations. generate coherent noise-induced oscillations. 9. Oscillations in the predator-prey model above comes from driving with full-spectrum white noise a system with complex eigenvalues. or. but cannot say anymore than that – in particular. Ingalls (2007) “Deterministic characterization of stochastic genetic circuits.” Proceedings of the National Academy of Sciences USA 104: 7402. in some cases. we say the system is locally asymptotically stable.18) ∂f The eigenvalues of the Jacobian J(0) = ∂x x=x provide the decay rate of s the exponential eigenmodes. T. and Bourret’s approximation from Chapter 8 (page 177) to renormalize the eigenvalues. Here. 
providing conditions on the model parameters for which stability is lost in the stochastic model. the agreement is quite good. Depending upon the underlying phase-space. the interplay between intrinsic noise and nonlinear transition rates leads to a noise-induced loss of stability.

the dynamics of xp are governed by the convolution equation.20) . (9.19) xp = [J(0) + ωJ(1) (t)] · xp + zero mean terms. In the limit ω → 0. The Jacobian J≡ ∂f ∂x . dt The right-most term is the cross-correlation between the process xp and the coeﬃcient matrix J(1) (t). Derivation of the entire probability distribution of xp (t) is usually impossible. By assumption. page 177) to arrive at a deterministic equation for the evolution of the averaged process xp (t) in terms of only the ﬁrst and second moments of the ﬂuctuations. dt This is a linear stochastic diﬀerential equation with random coeﬃcient matrix J(1) (t) composed of a linear combination of the steady-state ﬂuctuations α(t) which have non-zero correlation time (cf. the trajectory xp (t) is a random function of time since it is described by a diﬀerential equation with random coeﬃcients.19. Our present interest is in the mean stability of the equilibrium point. Taking the ensemble average of Eq. we can further linearize J with respect to ω. We shall adopt the closure scheme of Bourret (Eq.Macroscopic Eﬀects of Noise notation compact. In that approximation. x=xs +ωα 207 will then be a (generally) nonlinear function of the ﬂuctuations about the steady-state α(t). d xp (t) = J0 xp (t) + ω 2 dt 0 t Jc (t − τ ) xp (τ ) dτ . although not so small that intrinsic ﬂuctuations can be ignored.19. ω→0 The stability equation is then given by. the number of molecules is large so the parameter ω is small.10). J ≈ J|ω→0 + ω ∂J ∂ω ≡ J(0) + ωJ(1) (t) . d xp = J(0) · xp + ω J(1) (t) · xp . 9. it cannot be replaced by white noise. To leading-order in ω. 8. Since the correlation time of J(1) (t) is not small compared with the other time scales in the problem. 9. provided J(0) ≫ ωJ(1) . and an approximation scheme must be developed to ﬁnd a closed evolution equation for xp . and we must resort to methods of approximation. Eq. d (9.

Eq. Some insight into the behavior of the system can be gained by considering a perturbation expansion of the eﬀective eigenvalues λ′ in terms of the small parameter ω. and provided the eigenvalues are distinct. The equation can be solved formally by Laplace transform. ˆ λ′ = λi + ω 2 [ P−1 · Jc (λi ) · P ]ii . In the next section.3.22) where [ · ]ii denotes the ith diagonal entry of the matrix. diag[λi ] = P−1 ·J(0) ·P. ˆ det λ′ I − J0 − ω 2 Jc (λ′ ) = 0. Notice the matrix product Jc (t) contains linear combinations of the correlation of the ﬂuctuations αi (t)αj (0) . and S. J. we will apply the eﬀective stability approximation to a model of an excitable oscillator.21) all have negative real parts (Re(λ′ ) < 0).” Proceedings of the National Academy of Sciences USA 99.. (9. These are simply given by the exponential autocorrelation function derived above. with a transcriptional autoactivator driving expression of a repressor that provides negative control by sequestering activator proteins through dimerization.Y. A necessary and suﬃcient condition for asymptotic stability of the averaged perturbation modes xp (t) is that the roots λ′ of the resolvent. Kueh. i (9. The repressor and activator form an inert complex until the activator degrades.10. 9. ˆ xp (s) = sI − J(0) − ω 2 Jc (s) ˆ ˆ where now Jc (s) = t 0 −1 xp (0) . 5988-5992.208 (0) Applied stochastic processes where Jc (t − τ ) = J(1) (t) eJ (t−τ ) J(1) (τ ) is the time autocorrelation matrix of the ﬂuctuations. 9. Jc (t) e−st dt. We consider the generic model proposed by Vilar and co-workers to describe circadian rhythms in eukaryotes. using the method to construct a phase-plot in parameter space illustrating regions of noise-induced oscillations. We further diagonalize J(0) . Leibler (2002) “Mechanisms of noiseresistance in genetic oscillators. N.1 • Example – Excitable Oscillator Vilar. H. Barkai. recycling . we can explicitly write λ′ in terms of the unperturbed eigenvalues i λi to O(ω 4 ) as.

The degradation rate of the activator, δA, is the same irrespective of whether it is bound in the inert complex or free in solution. We simplify their original model somewhat, and assume fast activator/DNA binding along with rapid mRNA turnover, leading to a reduced set of rate equations governing the concentration of activator A, repressor R and the inert dimer C,

   dA/dt = γA · g(A/KA, fA) − δA · A − κC · A · R,
   dR/dt = γR · g(A/KR, fR) − δR · R − κC · A · R + δA · C,
   dC/dt = κC · A · R − δA · C.   (9.23)

Here, the function,

   g(x, f) = (1 + f · x)/(1 + x),   (9.24)

characterizes the response of the promoter to the concentration of the regulatory protein A. The fold-change in the synthesis rate, f, is ≫ 1 because A is an activator. In this complicated system, there are many dimensionless combinations of parameters that characterize the system dynamics. The scaled repressor degradation rate ε = δR/δA is a key control parameter in the deterministic model, since oscillations occur only for an intermediate range of this parameter. For the nominal parameter set used in the Vilar paper, the deterministic model exhibits oscillations over the range 0.12 < ε < 40 (Figure 9.3a, black region). We shall focus on the parameter regime near to the phase boundary at ε ≈ 0.12, and examine the role intrinsic noise plays in generating regular oscillations from a deterministically stable system.

Applying the ESA to the oscillator model, the parameter ∆bA = (bA + 1)/(2 · KA · Vcell) emerges as an important measure quantifying the discreteness in activator synthesis. Here, bA is the burst size in the activator synthesis (see p. 113), KA is the activator/DNA dissociation constant and Vcell is the cell volume (with Vcell = 100 µm³, as is appropriate for eukaryotic cells). The nominal parameter set of Vilar leads to a burstiness in activator synthesis of bA = 5 (giving ∆bA = 6 × 10⁻²) and a burstiness in repressor synthesis of bR = 10. The phase boundary predicted by the ESA is shown as a solid line in Figure 9.3a, bounding a region of parameter space between it and the deterministic phase boundary where qualitatively different behavior is expected from the stochastic model.

[Figure 9.3: Oscillations in the excitable system. A) Phase plot shows a region of noise-induced oscillations bounded by the curves λ = 0 and λ′ = 0, with ε = δR/δA plotted against ∆bA. B) Stochastic simulation of a deterministically stable system, compared with the deterministic model (black line).]

We examine the system behavior in this region by running a stochastic simulation using the parameter choice ε = 0.1 and ∆bA = 6 × 10⁻² (denoted by a cross in Figure 9.3a). With this choice, the deterministic model is stable (Figure 9.3b, black line). Nevertheless, a stochastic simulation of the same model, with the same parameters, including protein bursting and stochastic dimerization, clearly shows oscillations (Figure 9.3b, green line).

Suggested References

Beyond the references cited above, Keizer's paradox is discussed by Gillespie in his seminal paper on stochastic simulation:

• D. T. Gillespie (1977) "Exact stochastic simulation of coupled chemical reactions," Journal of Physical Chemistry 81: 2340.

He takes the view that the stochastic simulation trajectories show very clearly the immense time scales involved before the system reaches the anomalous stability of the fixed point at the origin. Another method of characterizing noise-induced oscillations is the multiple-scales analysis of Kuske and coworkers:

• R. Kuske, L. F. Gordillo and P. Greenwood (2006) "Sustained oscillations via coherence resonance in SIR," Journal of Theoretical Biology 245: 459–469.
green line). clearly shows oscillations (Figure 9. With this choice.75 ) days Figure 9.06 0. T. denoted by a cross in the left panel) shows oscillations (grey line).210 A) 0.5 500 1 1 2 3 4 5 6 7 ∆bA ×10−1 ( 0.10 0.25 0. Nevertheless. L.12 0.1 and ∆bA = 6 × 10−2 (denoted by a cross in Figure 9. Suggested References Beyond the references cited above. Greenwood.” Journal of Theoretical Biology 245: 459-469.04 0. Keizer’s paradox is discussed by Gillespie in his seminal paper on stochastic simulation: • D.3b. Gillespie (1977) “Exact simulation of coupled chemical reactions.3b. Gordillo and P.” Journal of Chemical Physics 81: 2340.14 0. Another method of characterizing noise-induced oscillations is the multiple-scales analysis of Kuske and coworkers: • R.02 Applied stochastic processes Oscillations λ =0 δR δA B) R (t ) 1500 1000 X Noise – Induced Oscillations λ′ = 0 Monostable 0. black line). A) Phase plot shows a region of noise induced oscillations. F. . with the same parameters including protein bursting and stochastic dimerization. Kuske. system behavior in this region by running a stochastic simulation using the parameter choice ǫ = 0. He takes the view that the stochastic simulation trajectories show very clearly the immense time scales involved before the system reaches the anomalous stability of the ﬁxed point at the origin. B) Stochastic simulation of a deterministically stable system (black line.3: Oscillations in the excitable system. the deterministic model is stable (Figure 9. (2006) “Sustained oscillations via coherence resonance in SIR.08 0. a stochastic simulation of the same model.3a).

as written in Eq. ∞). In Section 9. Initializing the system at the macroscopic equilibrium point.12. the method of Kuske allows the noise-induced oscillations to be fully characterized in terms of ﬂuctuation spectra and oscillation amplitude. 9. 2. The details left out of the main text will be ﬁlled in below. the macroscopic equilibrium point is stable. Show that in the limit t → ∞.10. is valid for t > 0. i ≡ √ −1 . Do this for various choices of the system size. Notice that B(t).1. and the amplitude of y is maximum when b = ω.5? (c) Write a stochastic simulation of the example discussed in the text. 9. 9. (b) Derive the master equation that corresponds to the autocatalytic network shown in Eq. Keizer’s paradox: Stochastic models often exhibit behaviour that is in contrast to their deterministic counter-parts. . 9. with an arbitrary initial condition. What is the explicit form of the transition matrix W appearing in Eq. 9.Macroscopic Eﬀects of Noise 211 Like the method of McKane and Newman discussed in the text. (a) Damped harmonic oscillator. Resonance in driven linear systems: Some systems can be excited into oscillation through periodic forcing. show sample trajectories to argue that for all intents and purposes. Exercises 1.1. we discussed a particular autocatalytic network. (a) Use linear stability analysis to prove the claim that the equilibrium point X ss = 0 is unstable in the deterministic model.12 by taking the multivariate Fourier transform of the correlation function B(t). Find the expression for the autocorrelation function for t < 0 in order to be able to compute the Fourier integral over the interval t ∈ (−∞. Derivation of the multivariate frequency spectrum: Derive Eq. Use the ﬂuctuation-dissipation relation to put the spectrum in the form shown in Eq. Solve dy + (a + ib) y = eiωt dt a > 0. the inﬂuence of the initial condition vanishes. 3.

→ corresponding to the deterministic rate equations. and determine the stability of the system linearized about those ﬁxed points. ∅ − X1 . the Brusselator model was introduced. What can you say about stochastic trajectories in the predator-prey phase space for this model? (c) For what choice of parameters would you expect noise-induced oscillations in the modiﬁed Lotka-Volterra model (Eq.2 on p. (a) Find the ﬁxed points of the Lotka-Volterra model. dX1 2 = 1 + aX1 X2 − (b + 1)X1 . X1 − X2 . 88. dt Compute the power spectrum for the ﬂuctuations about the steadystate for X 2 . Eq. Eq. → a b → 2X1 + X2 − 3X1 .212 Applied stochastic processes (b) 2D Langevin equation. 9. Noise-induced oscillations in the Brusselator: In Section 4. → X1 − ∅. XY and Y 2 . Write out the white noise as a Fourier series. dy + (a + ib) y = η(t) dt (a > 0) . 9.17. Consider the harmonic oscillator driven by white noise η(t). What are the conditions for noiseinduced oscillations in this model? . 9. What can you say about the t → ∞ amplitude maximum of y in this case? 4.16.2. several details were suppressed – these will be made more explicit below. dt dX2 2 = −aX1 X2 + bX1 .17)? 5. Predator-prey models: In the predator-prey models discussed in Section 9. and determine the stability of the system linearized about those ﬁxed points. What can you say about stochastic trajectories in the predator-prey phase space? (b) Find the ﬁxed points of the modiﬁed Lotka-Volterra model.

decide how many stable ﬁxed points the deterministic system has.4). B) More detailed view of a genetic circuit that encodes a protein that activates its own rate of transcription. allows the approximation to be computed explicitly. Each transcription event triggers the translation of b activator proteins. Hypothesize about what noise will do to this system. Eﬀective stability approximation of the autoactivator: The vector-matrix formalism of the eﬀective stability approximation leads to unweildy expressions for the eﬀective eigenvalues in systems with several state variables. What is the eigenvalue of the system? Identify the dimensionless combinations of parameters that fully characterize this system.4: Exercise 6 – The autoactivator. . The deterministic rate equation for this system is. by contrast. where b is the burst parameter (see Section 5. 6.Macroscopic Eﬀects of Noise A B A 213 C δ ln g ([ A]) f A KA γ KA ln [ A] Figure 9. dt where. (a) Drawing the synthesis and degradation rates on the same loglog plot.2). The autoactivator is a genetic regulatory motif with a product A that stimulates its own synthesis through a positive feedback loop (see Figure 9. g(A) = 1 + f · (A/KA )n 1 + (A/KA )n (n > 1). dA = γ · g(A) − δ · A. and draw a phase plot showing the boundaries where the number of stable ﬁxed points changes. The one-species autoactivator model. A) Schematic of the positive feedback loop. C) The activation function g([A]) increases (nonlinearly) with increasing activator concentration.

(d) Compute the eﬀective eigenvalue. written in terms of the dimensionless parameters and without writing out g(A) explicitly. and identify an additional dimensionless combination that characterizes the stochastic behaviour of the system. Provide a physical interpretation of this parameter. What can you say about the noise-induced correction? . (c) Write out the diﬀusion matrix D = S · diag[ν] · ST .214 Applied stochastic processes (b) Write out explicitly the propensity vector ν and the stoichiometry matrix S for the stochastic version of this model.

xp . diﬀusion can act to destabilize the steady solution. In a strictly deterministic system. the local stability is characterized by the evolution of some small perturbation. Taking the Laplace and Fourier transforms in time and space. the stability of the equilibrium state is determined by the 215 . Turing (1953) Philosophical Transactions of the Royal Society .Chapter 10 Spatially varying systems Preamble 10. ∂t where A is the Jacobian of the reaction dynamics and D is the diﬀusion matrix. leading to the emergence of regular patterning. the system becomes destabilized to only a certain range of spatial modes. For example. chemical and biological systems. M. One mechanism through which they can arise is due to the evolution of a deterministically unstable steady state.1. Pattern formation is ubiquitous in physical. respectively. For suﬃciently small amplitudes. ∂xp = A · xp + D · ∇2 xp .London B 237:37. the perturbation ﬁeld obeys the linearized mean-ﬁeld equation. about the equilibrium state.1 • Deterministic pattern formation Turing-type instabilities A.1 10. Moreover. Turing showed that for a reaction-diﬀusion model tending to a homogeneous equilibrium state.

(10. Turing’s claim is that for a certain class of diﬀusivity matrices D and a range of wavenumbers k. ˆ a22 → a22 = a22 − D2 k 2 . ˆ since the condition on the trace is automatically satisﬁed. The equilibrium is asymptotically stable if Re[λ] < 0. the criteria for A to have stable eigenvalues are. we arrive at an explicit expression for 2 kmin that minimizes Q(k 2 ). Applied stochastic processes det λI − A + k 2 D = 0. ˆ ˆ det A − k 2 D < 0 ⇒ a11 a22 − a12 a21 < 0.e. Q(k 2 ) ≡ k 4 − D1 a22 + D2 a11 2 a11 a22 − a12 a21 k + < 0. Setting the derivative of Q to zero. det A − k 2 D > 0. 2 kmin = D1 a22 + D2 a11 . a11 → a11 = a11 − D1 k 2 . For a two-species model. The presence of diﬀusion introduces a re-scaling of the diagonal elements of A. The above equation can be written explicitly as a quadratic in k 2 . The conditions for stability then become. For diﬀusion to destabilize the steady-state. the roots of the resolvent equation λ are shifted to the right-hand side of the complex plane and the system becomes unstable.216 resolvent equation. it must be that. the eigenvalues of A lie in the left-half of the complex plane). Even though A may be stable (i.1) tr A − k 2 D < 0. trA < 0 ⇒ a11 + a22 < 0. 2D1 D2 . The analysis is made more transparent by considering a two-species model in one spatial dimension. D1 D2 D1 D2 A suﬃcient condition for instability is that the minimum of Q(k 2 ) < 0. and det A > 0 ⇒ a11 a22 − a12 a21 > 0.

Spatially varying systems 217 Finally. The instability that arises is periodic with some characteristic wavenumber close to kmin .3 External noise: Stochastic PDEs . 2.4 10. and that once intrinsic ﬂuctuations in the populations are admitted.2 is satisﬁed.2 10. 2 Q(kmin ) (D1 a22 + D2 a11 ) < 0 ⇒ det A < . the diﬀusion coeﬃcients are rarely very diﬀerent.2 10.2 10. is called the Turing space.3 Diﬀerential ﬂow instabilities Other mechanisms 10. the severity of both limitations is greatly reduced.1. unless one of the species is immobilized. leading to a distribution in the spectrum spread across a range of wavenumbers. The diﬀerence in the diﬀusion coeﬃcients D1 and D2 must be quite large (D2 ≫ D1 ).1. In reality. the condition for Turing-type instability can then be written as.2) The range of parameter space for which A is stable. 4D1 D2 2 (10.1 10. 1. 10. spatial patterning in natural systems exhibit irregular patterning.2. 10. typically at least an order of magnitude. In reality.2. We shall see below that both objections hold for deterministic models.5 Internal noise: Reaction-transport master equation Diﬀusion as stochastic transport Combining reaction and transport LNA for spatial systems Example – Homogeneous steady-state Example – Spatially-varying steady-state 10. to satisfy the necessary conditions for pattern formation.2.2.2. The two major limitations associated with applications of Turing-type pattern-formation in the modeling of natural systems are that.3 10. but Eq.

Chapter 11

Special Topics

This final chapter contains supplemental topics that don't really fit in the rest of the notes. They are either suggested extensions of topics covered in previous chapters, or very short discussions of topics that may be of interest to some readers, along with a list of relevant references.

11.1 Random Walks and Electric Networks

• P. G. Doyle and J. L. Snell, Random Walks and Electrical Networks (Mathematical Association of America, 1984).
• S. Redner, A Guide to First-Passage Processes (Cambridge University Press, 2001).

Consider, for example, the question of recurrence in high-dimensional random walks. If a random walker is guaranteed to return to the origin at some point in the wandering, then the walk is called recurrent. If there is some probability of never returning, then the walk is transient. There is a direct analogy between the escape probability for a random walk along a lattice (Figure 11.1a) and the potential along the nodes of an array of resistors (Figure 11.1b). The equivalence is made precise by observing the correspondence of the steady-state master equation governing the probability and the Kirchhoff laws governing the potential. Beyond the analogy, recasting a high-dimensional random walk in terms of the electrical properties of a high-dimensional array of resistors allows many difficult theorems to be proved with ease. For the question of recurrence, Pólya proved the following theorem:

PÓLYA'S THEOREM: A simple random walk on a d-dimensional lattice is recurrent for d = 1, 2 and transient for d > 2.

In the language of resistor arrays, Pólya's theorem can be restated as: the random walk in d dimensions is recurrent if and only if the resistance to infinity is infinite. Estimating the resistance at infinity is a far simpler approach than Pólya's original proof of the problem.

Figure 11.1: Equivalence of a random walk and the potential along a resistor array. A) Escape probability from a two-dimensional lattice for a random walker. The nodes marked E are escape points, while the nodes marked P are police. B) An array of 1 Ω resistors, with boundary points held at 1 volt, or grounded, at the interior nodes. C) The escape probability, or equivalently the potential. Redrawn from Figures 2.1 and 2.2 of Doyle and Snell, Random Walks and Electrical Networks (1984).

11.2 Fluctuations Along a Limit Cycle

• K. Tomita, T. Ohta and H. Tomita (1974) "Irreversible circulation and orbital revolution," Progress of Theoretical Physics 52: 1744.
• F. Ali and M. Menzinger (1999) "On the local stability of limit cycles," Chaos 9: 348.
• Scott M, Ingalls B, Kaern M (2006) "Estimations of intrinsic and extrinsic noise in models of nonlinear genetic networks," Chaos 16: 026107.

In Section 5, we used the linear noise approximation to estimate the statistics of the steady-state fluctuations in the Brusselator model. What makes the Brusselator model interesting is that over a range of parameter values, the system exhibits a stable limit cycle. With a change of basis, the linear noise approximation can again be used to characterize the fluctuations around the limit cycle. That change of basis is the subject of the present section.
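As an aside on Section 11.1, the dichotomy in Pólya's theorem can be glimpsed by brute-force simulation. The sketch below estimates the probability that a simple d-dimensional walk returns to the origin within a finite number of steps; the walk counts and step limits are arbitrary choices, and the finite-step return frequency is only a proxy for true recurrence:

```python
import random

# Monte Carlo estimate of the finite-time return probability of a simple
# random walk on the d-dimensional integer lattice (illustrative parameters).
def return_probability(d, n_walks=800, max_steps=1500, seed=1):
    rng = random.Random(seed)
    returned = 0
    for _ in range(n_walks):
        pos = [0] * d
        for _ in range(max_steps):
            axis = rng.randrange(d)          # pick a coordinate axis
            pos[axis] += rng.choice((-1, 1)) # step +/-1 along it
            if not any(pos):                 # back at the origin
                returned += 1
                break
    return returned / n_walks

p1, p3 = return_probability(1), return_probability(3)
print(p1, p3)   # d = 1: close to 1 (recurrent); d = 3: well below 1 (transient)
```

In three dimensions the estimate hovers near Pólya's return probability of about 0.34, while in one dimension nearly every walk comes home.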

Limit cycle regime (b > 1 + a)

From the Fokker-Planck equation for Π(α1, α2, t) far away from the critical line b = 1 + a, we obtain the evolution equations for the variance of the fluctuations,

dCij/dt = Σ_m Γim Cmj + Σ_n Γjn Cin + Dij,

where we have written Cij = Cji = ⟨αi αj⟩. In the parameter regime where the macroscopic system follows a limit cycle, the coefficients Γ and D will be periodic functions of time, and the equations governing the variance will not admit a stable steady-state. Physically, there is no mechanism to control fluctuations tangent to the limit cycle, and the variance grows unbounded in that direction. Trajectories perturbed away from the limit cycle, however, are drawn back, and so we expect the variance of fluctuations perpendicular to the limit cycle to reach a steady-state.

To separate the fluctuations tangent to the limit cycle from those perpendicular, we introduce a rotating change of basis from n1-n2 space to r-s space, with s tangent to the limit cycle, and r perpendicular (Figure 11.2). The transformation matrix to the new coordinate system is given by the rotation,

U(t) = [ cos φ(t)  −sin φ(t) ; sin φ(t)  cos φ(t) ],

where φ(t) is related to the macroscopic propensities of x1 and x2 (denoted by ν1 and ν2, respectively),

−sin φ(t) = ν1(t)/√(ν1²(t) + ν2²(t)),  cos φ(t) = ν2(t)/√(ν1²(t) + ν2²(t)).

Figure 11.2: Rotating change of basis along the limit cycle, from n1-n2 space to r-s space, with s tangent to the macroscopic limit cycle and r perpendicular.

With the change of basis, the evolution equation for the variance C′ in the r-s coordinate system is given by,

dC′/dt = (Γ′ + R)·C′ + [(Γ′ + R)·C′]ᵀ + D′,

where Γ′ and D′ are the transformed matrices,

Γ′ = U(t)·Γ(t)·Uᵀ(t),  D′ = U(t)·D(t)·Uᵀ(t),

and R is the rate of rotation of the coordinate frame itself,

R = (dφ/dt)·[ 0  −1 ; 1  0 ].

Figure 11.3: Fluctuations about the limit cycle. a) Although the variance of the fluctuations tangent to the limit cycle grows without bound, the perpendicular fluctuations are confined to a gully of width √Crr, shown as a dashed line. b) The blue curve is the result of a stochastic simulation of the system. The trajectory is confined to a region very close to the macroscopic limit cycle.
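The role of R as the rotation rate of the frame can be verified directly: a short numerical sketch (with an arbitrary, made-up phase φ(t) chosen only for illustration) confirms that (dU/dt)·Uᵀ = (dφ/dt)·[0 −1; 1 0]:

```python
import math

# Check that R = (dU/dt)·U^T equals dφ/dt·[[0, -1], [1, 0]] for the
# rotation U(φ); the phase φ(t) below is an arbitrary smooth function.
def U(phi):
    return [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi),  math.cos(phi)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

phi = lambda t: 0.7 * t + 0.2 * math.sin(t)      # illustrative phase
t, h = 1.3, 1e-6
dphi = (phi(t + h) - phi(t - h)) / (2 * h)       # dφ/dt, central difference
dU = [[(U(phi(t + h))[i][j] - U(phi(t - h))[i][j]) / (2 * h)
       for j in range(2)] for i in range(2)]
R = matmul(dU, transpose(U(phi(t))))             # should be dφ/dt·[[0,-1],[1,0]]
print(R)
```

The off-diagonal entries come out as ∓dφ/dt and the diagonal vanishes, as the definition of R asserts.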

In this new coordinate system, the variance Crr decouples from the tangential fluctuations, and is characterized by the single evolution equation,

dCrr/dt = 2 (Γ′ + R)rr Crr + D′rr,

that converges to a stable limit cycle. After the transients have passed, the trajectory in the original n1-n2 coordinate system can be anywhere along the limit cycle, but the perpendicular fluctuations are confined to a narrow gully of width √Crr, shown in Figure 11.3a as a dashed curve confining the macroscopic limit cycle. The blue curve (Figure 11.3b) is a stochastic simulation of the same system (compare with Gillespie (1977), Figure 19). Here, a = 5, b = 10, and Ω = 1000. Again, the linear noise approximation captures the statistics of the randomly generated trajectories very well.

11.3 Stochastic Resonance

• B. McNamara and K. Wiesenfeld (1989) "Theory of stochastic resonance," Physical Review A 39: 4854.

Recall from Section 6.7 on page 143 that the escape probability from a potential well depends exponentially upon the well depth. As a consequence, small changes in the well-depth can have exponentially larger effects on the hopping rate between wells. For example, suppose that in addition to a symmetric, two-well potential there is a small, periodic perturbation to the potential (Figure 11.4),

U(x) → U(x) + ε cos ωp t  (ε ≪ 1),

and that the particle in the well is subject to Brownian motion of variance D. Here, the signal is the component of the power spectrum centered on ωs, and for a fixed signal magnitude ε and frequency ωs, the system will exhibit a maximum in the signal-to-noise ratio as the white noise variance D is varied (Figure 11.5).

Figure 11.4: Stochastic resonance. A) Consider the hopping of a particle between the wells of a symmetric double-well potential U(x), with minima at −c and +c separated by a barrier of height U0. The rate of hopping W+ from −c to c is the same as the rate of hopping W− from c to −c. B) Impose upon this potential a second, very small perturbation that results in the tilting of the double-well with a prescribed frequency ωp – called the signal. Redrawn from Figure 2 of McNamara and Wiesenfeld (1989).

Figure 11.5: Stochastic resonance, shown as the signal-to-noise ratio plotted against the variance of the white noise forcing. As the variance in the white noise forcing (D) is varied, the component of the power spectrum lying at the frequency ωs is enhanced well above the background white noise. As the magnitude of the noise is increased, the hopping probability becomes slaved to the external perturbation, and over a range of noise strengths the signal-to-noise ratio attains a maximum. This ability of the system to turn noise into an amplification of a weak signal (rather than a corruption) is called stochastic resonance. Redrawn from Figure 9b of McNamara and Wiesenfeld (1989).
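The noise-activated hopping underlying this effect can be sketched with an overdamped Langevin simulation. The double-well U(x) = −x²/2 + x⁴/4 and all parameter values below are illustrative choices, not taken from McNamara and Wiesenfeld:

```python
import math, random

# Euler-Maruyama sketch of hopping in a weakly tilted double-well:
#   dx/dt = x - x^3 + eps*cos(omega t) + sqrt(2D)*xi(t).
# All parameters are arbitrary illustrative choices.
def count_hops(D, eps=0.1, omega=0.01, dt=0.01, steps=150_000, seed=3):
    rng = random.Random(seed)
    x, side, hops = -1.0, -1, 0          # start in the left well
    amp = math.sqrt(2 * D * dt)
    for k in range(steps):
        drift = x - x ** 3 + eps * math.cos(omega * k * dt)
        x += drift * dt + amp * rng.gauss(0.0, 1.0)
        if side < 0 and x > 0.5:         # crossed into the right well
            side, hops = 1, hops + 1
        elif side > 0 and x < -0.5:      # crossed back into the left well
            side, hops = -1, hops + 1
    return hops

weak, moderate = count_hops(0.01), count_hops(0.15)
print(weak, moderate)  # essentially no hopping at weak noise; frequent at moderate noise
```

The exponential sensitivity of the hopping rate to the ratio of barrier height to noise variance is exactly what allows a tiny periodic tilt to entrain the hopping at intermediate noise levels.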

11.4 Kalman Filter

• R. E. Kalman (1960) "A new approach to linear filtering and prediction problems," Transactions of the ASME – Journal of Basic Engineering 82: 35.
• M. S. Grewal and A. P. Andrews, Kalman Filtering: Theory and practice using MATLAB, 2nd Ed. (Wiley Interscience, 2001).

A problem that is of central concern in control theory is how to estimate the trajectory of a system, perturbed by noise, given a history of observations corrupted by noise. The Kalman filter combines fundamental ideas spread across several areas – least squares, stochastic processes, linear algebra, dynamical systems, and probability theory (Figure 11.6) – to obtain an estimate of the state in the presence of noise in the system and noise in the observations. For example, given the following model for the trajectory x with known input u,

dx/dt = F x + G η(t) + C u, (11.1)
z = H x + ξ(t) + D u, (11.2)

where η(t) and ξ(t) are zero-mean (uncorrelated) white noise,

⟨η(t1) η(t2)⟩ = Q(t1) δ(t1 − t2), (11.3)
⟨ξ(t1) ξ(t2)⟩ = R(t1) δ(t1 − t2), (11.4)
⟨η(t1) ξ(t2)⟩ = 0, (11.5)

how can the state of the system x be estimated from the set of observations z? Obviously, in the absence of noise, x̂ = H⁻¹z provides an exact estimate of the state.

Figure 11.6: Mathematical foundations of the Kalman filter. Redrawn from Figure 1.1 of Grewal and Andrews, Kalman Filtering (2001).

The question is: with the state dynamics and the observation data corrupted by noise, how can we optimally extract the estimate x̂?

First, we must define in what sense an estimate is optimal. We seek an estimate x̂(t) which is a linear function of the observations z(t) and which minimizes,

x̂(t) = min over x̂ of ⟨[x(t) − x̂(t)]ᵀ · M · [x(t) − x̂(t)]⟩,

where M is a given symmetric positive-definite matrix. We say that x̂(t) is optimal in the mean-squared sense, or that x̂(t) is a least-squares estimate of x(t). The continuous-time filter is called the Kalman-Bucy filter, while the discrete-time filter is called the Kalman filter. The discrete-time implementation is more straightforward to explain.

Discrete Kalman filter

We have a state xk that evolves through (known) deterministic dynamics Φ and some white noise forcing wk,

x_{k+1} = Φ · xk + wk,

where wk is a vector of random variables drawn from a normal distribution N(0, Qk). Our knowledge of the true state xk comes from observations zk that are also corrupted by white noise vk,

zk = H · xk + vk,

where vk is a vector of random variables drawn from a normal distribution N(0, Rk), that are all independent of wk. Given some initial estimate of the state x̂0 (here, the ˆ indicates an estimate of the unknown state xk) and the variance of our estimate P0 = ⟨x̂0 · x̂0ᵀ⟩, the Kalman filter provides an algorithm for updating our estimate of the state and the variance using an auxiliary quantity called the Kalman gain matrix. The Kalman gain matrix Kk tells us how much to trust the observation zk in refining our estimate x̂k beyond the deterministic dynamics Φ; it is chosen in such a way that the estimate x̂k is optimal in the mean-squared sense described above. Specifically, we use Φ to form an initial estimate of the state x̂k(−) and of the variance Pk(−),

x̂k(−) = Φ_{k−1} · x̂_{k−1}(+),
Pk(−) = Φ_{k−1} · P_{k−1}(+) · Φᵀ_{k−1} + Q_{k−1}.
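Combining the prediction step above with the gain-based correction given below, the complete cycle can be sketched in scalar form; the values of Φ, H, Q and R are illustrative choices, not from the text:

```python
import random

# Scalar sketch of the discrete Kalman filter predict/correct cycle.
Phi, H, Q, R = 1.0, 1.0, 1e-4, 0.25     # illustrative model parameters
rng = random.Random(0)

x_true, x_hat, P = 0.0, 0.0, 1.0
se_raw = se_filt = 0.0
n_steps = 500
for _ in range(n_steps):
    x_true = Phi * x_true + rng.gauss(0.0, Q ** 0.5)   # true (hidden) state
    z = H * x_true + rng.gauss(0.0, R ** 0.5)          # noisy observation
    x_pred = Phi * x_hat                               # estimate x_k(-)
    P_pred = Phi * P * Phi + Q                         # variance P_k(-)
    K = P_pred * H / (H * P_pred * H + R)              # Kalman gain
    x_hat = x_pred + K * (z - H * x_pred)              # estimate x_k(+)
    P = (1 - K * H) * P_pred                           # variance P_k(+)
    se_raw += (z / H - x_true) ** 2
    se_filt += (x_hat - x_true) ** 2

mse_raw, mse_filt = se_raw / n_steps, se_filt / n_steps
print(mse_raw, mse_filt)   # the filtered estimate beats the raw observations
```

Because the state here drifts slowly (Q ≪ R), the gain K settles to a small value and the filter averages away most of the observation noise.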

Then we use the Kalman gain matrix Kk and the observation zk to refine our original estimates,

Kk = Pk(−) · Hkᵀ · [Hk · Pk(−) · Hkᵀ + Rk]⁻¹,
x̂k(+) = x̂k(−) + Kk · [zk − Hk · x̂k(−)],
Pk(+) = [I − Kk · Hk] · Pk(−),

and the process is repeated at each time-step.

11.5 Novikov-Furutsu-Donsker Relation

• E. A. Novikov (1965) "Functionals and the random-force method in turbulence theory," Soviet Physics – JETP 20: 1290–1294.
• K. Furutsu (1963) "On the statistical theory of electromagnetic waves in a fluctuating medium," Journal of the Research of the National Bureau of Standards D67: 303.
• Donsker M. D. (1964) "On function space integrals," in Analysis in Function Space, W. T. Martin & I. Segal, eds., MIT Press, pp. 17–30.

For a Gaussian-distributed multiplicative noise source A1(t),

dy/dt = A0 y + α A1(t) y, (11.6)

the Novikov theorem allows the correlation of the process y(t), which depends implicitly upon A1(t), to be calculated in terms of the autocorrelation of A1(t) and the functional derivative of y(t) with respect to the fluctuations. To leading order, we simply recover Bourret's approximation, although the Novikov theorem does open the way to a variational estimate of higher-order terms, as shown in Section 11.5.1. If we average the evolution equation (11.6) directly, we have,

d⟨y(t)⟩/dt = A0 ⟨y(t)⟩ + α ⟨A1(t) y(t)⟩. (11.7)

The correlation of the coefficient A1(t) and the process y(t) can be calculated using Novikov's theorem,

⟨A1(t) y(t)⟩ = ∫₀ᵗ ⟨A1(t) A1(t′)⟩ ⟨δy(t)/δA1(t′)⟩ dt′, (11.8)

where ⟨δy(t)/δA1(t′)⟩ is the functional derivative of y(t) with respect to the Gaussian random function A1(t′). The Green's function G0(t, t′) for the noiseless operator d/dt − A0 allows y(t) to be re-written as an integral equation,

y(t) = G0(t, 0) y(0) + α ∫₀ᵗ G0(t, τ) A1(τ) y(τ) dτ.

Taking the functional derivative with respect to A1(t′),

δy(t)/δA1(t′) = α ∫₀ᵗ G0(t, τ) [ δ(τ − t′) y(τ) + A1(τ) δy(τ)/δA1(t′) ] dτ
= α G0(t, t′) y(t′) + α ∫ G0(t, τ) A1(τ) [δy(τ)/δA1(t′)] dτ, (11.9)

which is an integral equation for the functional derivative that can be solved iteratively if α is small. Retaining only the leading term, and averaging,

⟨δy(t)/δA1(t′)⟩ ≈ α G0(t, t′) ⟨y(t′)⟩.

With substitution into (11.8), the averaged equation (11.7) becomes,

d⟨y(t)⟩/dt = A0 ⟨y(t)⟩ + α² ∫₀ᵗ ⟨A1(t) A1(t′)⟩ G0(t, t′) ⟨y(t′)⟩ dt′,

which is identical to Bourret's approximation of the evolution equation for the first-moment (8.19) written in the original notation, with G0(t, t′) = e^{A0(t−t′)} and y(t) = e^{A0(t−t′)} y(t′). [In Section 11.5.1, we consider a variational approximation of the higher-order terms.]

*****

This proof of Eq. 11.8 follows the original by Novikov, with annotations. In full, the Novikov-Furutsu-Donsker relation reads:

The correlation of a Gaussian random variable γ(t) with a functional depending upon that variable, x[γ(t), t], is:

⟨γ(t) x(t)⟩ = ∫ ⟨γ(t) γ(t1)⟩ ⟨δx(t)/δγ(t1)⟩ dt1.

Proof: The functional x[γ] is expanded as a functional Taylor series about the point γ(t) = 0:

x[γ(t)] = x[0] + ∫ (δx/δγ1)|_{γ=0} γ(t1) dt1 + (1/2) ∫∫ (δ²x/δγ1δγ2)|_{γ=0} γ(t1) γ(t2) dt1 dt2 + ...
= x[0] + Σ_{n=1}^∞ (1/n!) ∫⋯∫ (δⁿx/δγ1⋯δγn)|_{γ=0} γ(t1)⋯γ(tn) dt1⋯dtn. (11.10)

Here, for brevity,

δⁿx/δγ1⋯δγn ≡ δⁿx(t)/δγ(t1)⋯δγ(tn).

The functional derivative is evaluated at γ = 0 (zero noise), i.e. at γ(t1) = ... = γ(tn) = 0, and as such constitutes a non-random function, so that,

⟨ (δⁿx/δγ1⋯δγn)|_{γ=0} γ(t1)⋯γ(tn) ⟩ = (δⁿx/δγ1⋯δγn)|_{γ=0} ⟨γ(t1)⋯γ(tn)⟩.

Multiplying (11.10) by γ(t), and taking the ensemble average,

⟨γ(t) x(t)⟩ = Σ_{n=1}^∞ (1/n!) ∫⋯∫ (δⁿx/δγ1⋯δγn)|_{γ=0} ⟨γ(t) γ(t1)⋯γ(tn)⟩ dt1⋯dtn. (11.11)

For a Gaussian random variable, the mean value of the product of an odd number of terms vanishes, while for an even number of terms, the mean value is equal to the sum of the products of all pair-wise combinations:

⟨γ(t) γ(t1)⋯γ(tn)⟩ = Σ_{α=1}^n ⟨γ(t) γ(tα)⟩ ⟨γ(t1)⋯γ(tα−1) γ(tα+1)⋯γ(tn)⟩

(where ⟨γ(t1)⋯γ(tα−1) γ(tα+1)⋯γ(tn)⟩ can obviously be further divided into pair-wise combinations). The functional derivatives in the integrals of (11.10) are all symmetric with respect to the labelling of the arguments t1, ..., tn, so that there are n equivalent ways to choose α above. So,

⟨γ(t) x(t)⟩ = ∫ ⟨γ(t) γ(t1)⟩ [ Σ_{n=1}^∞ (1/(n−1)!) ∫⋯∫ (δⁿx/δγ1δγ2⋯δγn)|_{γ=0} ⟨γ(t2)⋯γ(tn)⟩ dt2⋯dtn ] dt1. (11.12)

On the other hand, taking the functional derivative of (11.10) directly,

δx[γ(t)]/δγ(t′) = Σ_{n=1}^∞ (1/(n−1)!) ∫⋯∫ (δⁿx/δγ1⋯δγn)|_{γ=0} δ(t1 − t′) γ(t2)⋯γ(tn) dt1⋯dtn
= Σ_{n=1}^∞ (1/(n−1)!) ∫⋯∫ (δⁿx/δγ′δγ2⋯δγn)|_{γ=0} γ(t2)⋯γ(tn) dt2⋯dtn.

Call t′ ≡ t1, and take the ensemble average, again taking advantage of the symmetry,

⟨δx(t)/δγ(t1)⟩ = Σ_{n=1}^∞ (1/(n−1)!) ∫⋯∫ (δⁿx/δγ1δγ2⋯δγn)|_{γ=0} ⟨γ(t2)⋯γ(tn)⟩ dt2⋯dtn. (11.13)

Substituting (11.13) into (11.12) yields the Novikov-Furutsu-Donsker relation. For a causal process, x(t) will only depend upon values of γ(t1) that precede it, so,

⟨γ(t) x(t)⟩ = ∫₀ᵗ ⟨γ(t) γ(t1)⟩ ⟨δx(t)/δγ(t1)⟩ dt1. (11.14)

11.5.1 Variational Approximation

• F. J. Dyson (1949) "The radiation theories of Tomonaga, Schwinger, and Feynman," Physical Review 75: 486–502.

• L. Rosenberg and S. Tolchin (1973) "Variational methods for many body problems," Physical Review A 8: 1702–1709.

Beginning with equation (11.9),

δy(t)/δA1(t′) = α G0(t, t′) y(t′) + α ∫ G0(t, τ) A1(τ) [δy(τ)/δA1(t′)] dτ,

it is convenient to assume a form for the functional derivative,

δy(t)/δA1(t′) = α G(t, t′) y(t′),

for some noiseless function G(t, t′). With substitution,

G(t, t′) = G0(t, t′) + α ∫ G0(t, τ) A1(τ) G(τ, t′) dτ. (11.15)

This is sometimes called Dyson's equation, and a variational principle derived from this equation has a long history (see Rosenberg and Tolchin). A variational estimate of G(t, t′) correct to first order is:

Gv(t, t′) = G0(t, t′) + α ∫ G0(t, τ) A1(τ) G2(τ, t′) dτ + α ∫ G1(t, τ) A1(τ) G0(τ, t′) dτ − α ∫ G1(t, τ) A1(τ) G2(τ, t′) dτ + α² ∫∫ G1(t, τ1) A1(τ1) G0(τ1, τ2) A1(τ2) G2(τ2, t′) dτ1 dτ2. (11.16)

Here, G1(t, t′) and G2(t, t′) are two trial functions. The stationarity of this estimate with respect to independent variations of the trial functions about the exact solution can easily be demonstrated using (11.16); varying with respect to G1,

δGv(t, t′)/δG1(t1, t′1) |_{G1=G2=G} = −α δ(t − t1) A1(t′1) × [ G(t′1, t′) − G0(t′1, t′) − α ∫ G0(t′1, τ) A1(τ) G(τ, t′) dτ ] = 0,

which vanishes by Dyson's equation (11.15), and similarly for variations in G2(t, t′). The variational approximation can be used as the starting point for a Rayleigh-Ritz scheme to determine

the optimal trial functions via some expansion of G1(t, t′) and G2(t, t′) in a suitable orthonormal basis. Here, however, we shall use Gv(t, t′) directly. With the convenient choice G1(t, t′) = G2(t, t′) = G0(t, t′),

Gv(t, t′) = G0(t, t′) + α² ∫∫ G0(t, τ1) A1(τ1) G0(τ1, τ2) A1(τ2) G0(τ2, t′) dτ1 dτ2. (11.17)

By assumption, G(t, t′) is a noiseless function, while Gv(t, t′) is not, so we take the ensemble average of (11.17) (assuming the trial functions are deterministic). For zero-mean noise, the averaged solution is considerably simplified:

G(t, t′) ≈ ⟨Gv(t, t′)⟩ = G0(t, t′) + α² ∫∫ G0(t, τ1) G0(τ1, τ2) G0(τ2, t′) ⟨A1(τ1) A1(τ2)⟩ dτ1 dτ2.

⟨Gv(t, t′)⟩ still obeys the variational principle, as one can show by multiplying (11.16) through and taking the ensemble average (provided the trial functions are not random). Furthermore, the averaged solution should provide a superior estimate of G(t, t′) than the one obtained by successive iteration of the original integral equation (11.9).

Substituting back into the variational derivative,

δy(t)/δA1(t′) = α [ G0(t, t′) + α² ∫∫ G0(t, τ1) G0(τ1, τ2) G0(τ2, t′) ⟨A1(τ1) A1(τ2)⟩ dτ1 dτ2 ] y(t′),

the Novikov relation reads:

⟨A1(t) y(t)⟩ = α ∫ G0(t, t′) ⟨A1(t) A1(t′)⟩ ⟨y(t′)⟩ dt′
+ α³ ∫ [ ∫∫ G0(t, τ1) G0(τ1, τ2) G0(τ2, t′) ⟨A1(τ1) A1(τ2)⟩ dτ1 dτ2 ] ⟨A1(t) A1(t′)⟩ ⟨y(t′)⟩ dt′.

The first term is Bourret's approximation, while the second term is the higher-order correction obtained from the variational principle.

Appendix A

Review of Classical Probability

The mathematical theory of probability deals with phenomena en masse, i.e. observations (experiments, trials) which can be repeated many times under similar conditions. Its principal concern is with the numerical characteristics of the phenomenon under study, i.e. quantities which take on various numerical values depending upon the result of the observation. Such quantities are called random variables; for example,

• the number of points which appear when a die is tossed;
• the number of calls which arrive at a telephone station during a given time interval;
• the lifetime of an electric light bulb;
• the error made in measuring a physical quantity.

A random variable ξ is regarded as specified if one knows the distribution function,

F(x) = P{ξ ≤ x}, x ∈ (−∞, ∞), (A.1)

where P is the probability of occurrence of the relation in {·}. The distribution function F(x) can be shown to have the following properties:

1. F(−∞) = 0, F(∞) = 1;
2. x1 < x2 ⇒ F(x1) ≤ F(x2) (F is non-decreasing);
3. F(x⁺) = F(x) (continuity from the right).
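These three properties are easy to check on the first example above, the toss of a fair die:

```python
# Distribution-function properties for F(x) = P{ξ ≤ x}, ξ a fair die toss.
def F(x):
    return sum(1 for face in (1, 2, 3, 4, 5, 6) if face <= x) / 6

xs = [float("-inf"), 0.5, 1, 2.5, 3, 5.999, 6, float("inf")]
values = [F(x) for x in xs]
print(values)   # starts at 0, ends at 1, and never decreases
```

The staircase jumps at x = 1, ..., 6 are exactly the discrete probabilities pn = 1/6 discussed below.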

It is convenient to introduce the probability density function,

f(x) = dF/dx, (A.2)

of the random variable ξ. Since F(x) might not have a derivative for every x, one can distinguish several types of random variables – in particular,

1. Continuous random variables. We assume that the number of points where f(x) doesn't exist is a countable set (i.e. points of discontinuity are comparatively "few"). Then it can be shown that,

F(x2) − F(x1) = ∫_{x1}^{x2} f(x) dx,

and, from P{x1 ≤ ξ ≤ x2} = F(x2) − F(x1), it follows that,

P{x1 ≤ ξ ≤ x2} = ∫_{x1}^{x2} f(x) dx.

In particular,

• Eqs. A.1 and A.2 ⇒ F(x) = ∫_{−∞}^{x} f(x′) dx′;
• Eqs. A.1 and A.2 ⇒ ∫_{−∞}^{∞} f(x) dx = 1;
• property 2 ⇒ f(x) ≥ 0,

and so one also has the definition,

f(x) = lim_{Δx→0} P{x ≤ ξ ≤ x + Δx} / Δx.

Note this definition leads to a frequency interpretation for the probability density function. To determine f(x) for a given x, we perform the experiment n times, and count the number of trials Δn(x) such that x ≤ ξ ≤ x + Δx. Then, if Δx is sufficiently small, P{x ≤ ξ ≤ x + Δx} ≈ f(x)Δx, so,

f(x)Δx ≈ Δn(x)/n.
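This frequency interpretation can be sketched directly; here the samples are drawn from an exponential density (the choices of a, x, Δx and n are arbitrary):

```python
import math, random

# Frequency estimate of a density: f(x)·Δx ≈ Δn(x)/n, checked against the
# exponential density f(x) = a·exp(-a x); parameter values are arbitrary.
a, x0, dx, n = 1.0, 0.5, 0.05, 200_000
rng = random.Random(7)
count = sum(1 for _ in range(n) if x0 <= rng.expovariate(a) <= x0 + dx)
f_est = count / (n * dx)          # Δn(x)/(n·Δx)
f_true = a * math.exp(-a * x0)
print(f_est, f_true)              # the two agree to a few percent
```

Shrinking Δx reduces the discretization bias, at the cost of needing more trials n to keep the counting noise small.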

2. Discrete (and mixed) random variables. These have a distribution function F(x) that resembles a staircase (Figure A.1). We have,

pn = P{ξ = xn} = F(xn) − F(xn⁻), (A.3)

F(x) = Σ_{n such that xn ≤ x} P{ξ = xn}, with Σ_n pn = F(∞) − F(−∞) = 1. (A.4)

Figure A.1: Typical distribution function for a discrete process; the step at xn has height pn = F(xn) − F(xn⁻).

In general, if a random variable is not of continuous type, then one doesn't usually associate with it a density function, since F(x) is not differentiable in the ordinary sense. Nevertheless, we extend the concept of function to include distributions – in particular, the Dirac δ-function, specified by the integral property,

∫_{−∞}^{∞} φ(x) δ(x − x0) dx = φ(x0), (A.5)

where φ is continuous at x0, though otherwise arbitrary. Then one can show that if F(x) is discontinuous at x0,

dF/dx |_{x=x0} = k · δ(x − x0), with k = F(x0⁺) − F(x0⁻) (the step height).

In particular, if F(x) = θ(x), where θ(x) = 1 for x ≥ 0 and θ(x) = 0 for x < 0, then,

dθ/dx = δ(x)

(see Figure A.2).

Figure A.2: The Dirac delta function δ(x) is a distribution that can be thought of as the derivative of the Heaviside step function θ(x).

Using this, Eqs. A.4 and A.5 give,

f(x) = Σ_n pn δ(x − xn).

An important, and subtle, result is the following.

Existence theorem: Given a function G(x) such that,

• G(−∞) = 0, G(∞) = 1;
• x1 < x2 ⇒ G(x1) ≤ G(x2);
• G(x⁺) = G(x) (continuity from the right),

one can find an experiment, and an associated random variable, such that its distribution function F(x) = G(x).

Similarly, one can determine a random variable having as a density a given function g(x), provided g(x) ≥ 0 and ∫ g(x) dx = 1. This is done by determining first,

G(x) = ∫_{−∞}^{x} g(x′) dx′,

and then invoking the existence theorem.

A.1 Statistics of a Random Variable

1. Expectation value:

E{ξ} = ∫_{−∞}^{∞} x f(x) dx, if ξ is continuous;  E{ξ} = Σ_n xn pn, if ξ is discrete.

2. Moments:

mk = E{ξᵏ} = ∫_{−∞}^{∞} xᵏ f(x) dx, or Σ_n xnᵏ pn. (A.6)

3. Variance:

σ² = E{(ξ − E{ξ})²} = ∫_{−∞}^{∞} (x − E{ξ})² f(x) dx, or Σ_n (xn − E{ξ})² pn,

from which we have the important result:

σ² = E{ξ²} − E²{ξ}.

Note that σ² is also called the dispersion, and σ the standard deviation. If the variance is vanishingly small, then the random variable ξ becomes a deterministic, or sure, variable: σ² = 0 ⇐⇒ ξ is the sure variable ⟨ξ⟩.
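The identity σ² = E{ξ²} − E²{ξ} can be checked on the die-toss example:

```python
# Check of σ² = E{ξ²} − E²{ξ} for the toss of a fair die.
faces = [1, 2, 3, 4, 5, 6]
p = 1 / 6
m1 = sum(x * p for x in faces)                      # E{ξ} = 3.5
m2 = sum(x * x * p for x in faces)                  # E{ξ²}
var_direct = sum((x - m1) ** 2 * p for x in faces)  # definition of σ²
var_formula = m2 - m1 ** 2                          # the identity above
print(m1, var_direct, var_formula)                  # 3.5 2.9166... 2.9166...
```

Both routes give σ² = 35/12, as they must.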

Note that m0 = 1 and m1 = E{ξ}. Actually, one frequently uses the central moments,

μk = E{(ξ − E{ξ})ᵏ} = ∫_{−∞}^{∞} (x − E{ξ})ᵏ f(x) dx, or Σ_n (xn − E{ξ})ᵏ pn,

where, again, we note that μ0 = 1, μ1 = 0 and μ2 = σ². When we discuss multivariate distributions, it will be convenient to introduce the following notation – we denote the second central moment by the double angled brackets, viz.

⟨⟨XY⟩⟩ ≡ ⟨XY⟩ − ⟨X⟩⟨Y⟩. (A.7)

4. Correlation Coefficient: It is sometimes useful to express the degree of correlation among variables by some dimensionless quantity. For that purpose, we introduce the correlation coefficient,

ρ = ⟨⟨XY⟩⟩ / (σX σY). (A.8)

Notice that the correlation coefficient is bounded, −1 ≤ ρ ≤ 1. For ρ = 0, the two processes are obviously uncorrelated; if ρ > 0, we say X and Y are correlated, while if ρ < 0 we say the two processes X and Y are anticorrelated.

5. Characteristic function: The characteristic function is a generating function for the moments of a distribution. It is defined as,

φ(ω) = E{e^{iωξ}} = ∫_{−∞}^{∞} e^{iωx} f(x) dx.

Notice this is precisely the Fourier transform of the probability density f(x); there is a unique one-to-one correspondence between the characteristic function φ(ω) and a given probability density f(x). A useful result is the moment theorem,

dⁿφ(ω)/dωⁿ |_{ω=0} = iⁿ mn.

The characteristic function is also useful as an intermediate step in computing the moments of a nonlinear transformation of a given distribution.
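The moment theorem can be sketched numerically. Here φ(ω) = a/(a − iω), the characteristic function of the exponential density f(x) = a e^{−ax}, with the derivatives at ω = 0 approximated by finite differences (a and the step h are arbitrary choices):

```python
# Moment theorem sketch: m_n = i^{-n} φ^{(n)}(0), for φ(ω) = a/(a - iω).
a = 2.0
phi = lambda w: a / (a - 1j * w)

h = 1e-4
m1 = (-1j * (phi(h) - phi(-h)) / (2 * h)).real            # -i φ'(0)  = 1/a
m2 = (-(phi(h) - 2 * phi(0) + phi(-h)) / h ** 2).real     # -φ''(0)   = 2/a²
print(m1, m2)   # ≈ 0.5 and 0.5 for a = 2
```

The recovered values match the exponential moments m1 = 1/a and m2 = 2/a² to finite-difference accuracy.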

6. Cumulants: The characteristic function also generates the cumulants κn of the random variable,

dⁿ ln φ(ω)/dωⁿ |_{ω=0} = iⁿ κn. (A.9)

The cumulants are combinations of the lower moments, e.g.

κ1 = m1, κ2 = m2 − m1² = σ², κ3 = m3 − 3 m2 m1 + 2 m1³, . . .

The utility of cumulants comes from the identity,

⟨e^{aξ}⟩ = Σ_{n=0}^∞ (aⁿ mn / n!) = exp[ Σ_{n=1}^∞ (aⁿ κn / n!) ], (A.10)

where a is a constant and ξ is a random variable. For a Gaussian distribution, one can show that all cumulants beyond the second are zero. In that way, cumulants appear in approximations of the average of an exponentiated noise source.

A.2 Examples of Distributions and Densities

1. Uniform

The uniform random variable has density function,

f(x) = (b − a)⁻¹, a ≤ x ≤ b;  f(x) = 0, otherwise, (A.11)

(see Figure A.3A). In the domain x ∈ [a, b], the distribution function is correspondingly given by,

F(x) = ∫_a^x (b − a)⁻¹ dx′ = (x − a)/(b − a), (A.12)

with F(x) = 0 for x < a, and F(x) = 1 for x > b (see Figure A.4A). A random variable drawn from this distribution is said to

Figure A.3: Common probability density functions. A) Uniform, X = U(a, b): f(x) = (b − a)⁻¹ for a ≤ x ≤ b, and 0 otherwise. B) Exponential, X = E(a): f(x) = a exp[−ax], x ≥ 0. C) Normal (Gaussian), X = N(μ, σ²): f(x) = (2πσ²)^{−1/2} exp[−(x − μ)²/(2σ²)]. D) Cauchy (Lorentz), X = C(μ, σ): f(x) = (σ/π)/[(x − μ)² + σ²]. Redrawn from Gillespie (1992) Markov Processes, Academic Press, Inc.

Figure A.4: Common probability distribution functions; the corresponding density functions are shown as dashed lines. A) Uniform, X = U(a, b): F(x) = (x − a)/(b − a). B) Exponential, X = E(a): F(x) = 1 − e^{−ax}. C) Normal, X = N(μ, σ²): F(x) = ½ [1 − erf((μ − x)/√(2σ²))]. D) Cauchy, X = C(μ, σ): F(x) = ½ − (1/π) arctan[(μ − x)/σ].

be "uniformly distributed on [a, b]," written as X = U(a, b). The mean and standard deviation are,

⟨X⟩ = (a + b)/2, (A.13)
√⟨⟨(X − ⟨X⟩)²⟩⟩ = (b − a)/(2√3). (A.14)

As a consequence,

lim_{b→a} U(a, b) = the sure variable a.

2. Exponential

The exponential random variable has density function,

f(x) = a exp(−ax), x ≥ 0, (A.15)

(see Figure A.3B). For x ≥ 0, the distribution function is correspondingly given by,

F(x) = ∫₀ˣ a exp(−ax′) dx′ = 1 − exp(−ax), (A.16)

with F(x) = 0 for x < 0 (see Figure A.4B). A random variable drawn from this distribution is said to be "exponentially distributed with decay constant a," written as X = E(a). The mean and standard deviation are,

⟨X⟩ = 1/a, (A.17)
√⟨⟨(X − ⟨X⟩)²⟩⟩ = 1/a. (A.18)

As a consequence,

lim_{a→∞} E(a) = the sure number 0.

3. Normal/Gaussian

The normal random variable has density function,

f(x) = (2πσ²)^{−1/2} exp[−(x − μ)²/(2σ²)], (A.19)

(see Figure A.3C). The distribution function is correspondingly given by,

F(x) = ∫_{−∞}^{x} (2πσ²)^{−1/2} exp[−(x′ − μ)²/(2σ²)] dx′ = ½ [1 − erf((μ − x)/√(2σ²))], (A.20)

where

erf(z) = (2/√π) ∫₀ᶻ e^{−t²} dt (A.21)

is the error function (see Figure A.4C). A random variable drawn from this distribution is said to be "normally (or Gaussian) distributed with mean μ and variance σ²," written as X = N(μ, σ²). The mean and standard deviation are,

⟨X⟩ = μ, ⟨⟨(X − ⟨X⟩)²⟩⟩ = σ². (A.22)

As a consequence,

lim_{σ→0} N(μ, σ²) = the sure variable μ. (A.23)

The multivariate Gaussian distribution is a straightforward generalization of the above – for the n-dimensional vector x ∈ Rⁿ, the joint Gaussian distribution is,

P(x) = (2π)^{−n/2} (det C)^{−1/2} exp[ −½ (x − ⟨x⟩)ᵀ · C⁻¹ · (x − ⟨x⟩) ], (A.24)

where ⟨x⟩ = μ is the vector of averages and C is called the cross-correlation matrix, with elements given by Cij = ⟨xi xj⟩ − ⟨xi⟩⟨xj⟩. Obviously, by definition C is a symmetric, positive-semidefinite matrix.

4. Cauchy/Lorentz

The Cauchy random variable has density function,

f(x) = (σ/π) / [(x − μ)² + σ²], (A.25)

(see Figure A.3D). The distribution function is correspondingly given by,

F(x) = ∫_{−∞}^{x} (σ/π) / [(x′ − μ)² + σ²] dx′ = ½ − (1/π) arctan[(μ − x)/σ], (A.26)

(see Figure A.4D). A random variable drawn from this distribution is said to be "Cauchy distributed about μ with half-width σ,"

written as X = C(μ, σ). Although the distribution satisfies the normalization condition, the tails of the distribution vanish so slowly that one can show that no higher moments of the distribution are defined! Nevertheless, since the density function is one of many representations of the Dirac delta function, it follows that,

lim_{σ→0} C(μ, σ) = the sure variable μ. (A.27)

A.3 Central limit theorem

I know of scarcely anything so apt to impress the imagination as the wonderful form of cosmic order expressed by the "Law of Frequency of Error". The law would have been personified by the Greeks and deified, if they had known of it. It reigns with serenity and in complete self-effacement, amidst the wildest confusion. The huger the mob, and the greater the apparent anarchy, the more perfect is its sway. It is the supreme law of Unreason. Whenever a large sample of chaotic elements are taken in hand and marshaled in the order of their magnitude, an unsuspected and most beautiful form of regularity proves to have been latent all along.

–Sir Francis Galton (1889)

In its most restrictive form, the Central Limit Theorem states: Let X1, X2, X3, ..., Xn be a sequence of n independent and identically distributed random variables, each having finite values of expectation μ and variance σ² > 0. As the sample size n increases (n → ∞), the distribution of the random variable,

Yn = ( Σ_{i=1}^{n} Xi − nμ ) / (σ√n), (A.28)

approaches the standard normal distribution N(0, 1).

This result was significantly generalized by Lyapunov (1901), who showed that it applies to independent random variables without identical distribution: Let {Xi} be a sequence of independent random variables defined on the same probability space. Assume that Xi has finite

This result was significantly generalized by Lyapunov (1901), who showed that it applies to independent random variables without identical distribution: Let {Xᵢ} be a sequence of independent random variables defined on the same probability space. Assume that Xᵢ has finite mean µᵢ and finite standard deviation σᵢ. We define
$$ s_n^2 = \sum_{i=1}^{n} \sigma_i^2. $$
Assume that the third central moments
$$ r_n^3 = \sum_{i=1}^{n} \left\langle |X_i - \mu_i|^3 \right\rangle $$
are finite for every n, and that
$$ \lim_{n \to \infty} \frac{r_n}{s_n} = 0. $$
Under these conditions, writing the sum of the random variables as Sₙ = X₁ + ··· + Xₙ and the sum of their means as mₙ = Σᵢ µᵢ, the distribution of the variable defined by
$$ Z_n = \frac{S_n - m_n}{s_n} $$
converges to the standard normal distribution N(0, 1).

A.4 Generating Random Numbers

Notice that the distribution function F(x) by definition always lies between 0 and 1. Since F′(x) = f(x) and f(x) ≥ 0, the function F(x) is strictly increasing (wherever f(x) > 0) and the inversion F⁻¹ is uniquely defined for any distribution. It is possible, then, to obtain a sample x from any given distribution by generating a unit uniform random variable r ∈ U(0, 1) and solving F(x) = r; viz.,
$$ x = F^{-1}(r). \tag{A.29} $$
By spraying the vertical axis with a uniformly distributed random sampling, the distribution function reflects the sampling onto the horizontal axis, thereby transforming r into the random variable x with the desired statistics. For example, the exponentially distributed random variable E(a) has the distribution function
$$ F(x) = 1 - \exp(-ax). \tag{A.30} $$
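The inversion recipe of Eq. A.29 applies just as well to the Cauchy distribution of Eq. A.26; solving r = F(x) for x gives x = µ − σ tan[π(1/2 − r)] (this inversion is my own working of A.26, not a formula quoted from the text). A quick Python/NumPy check that the sampled quartiles land at µ − σ, µ, and µ + σ, as A.26 predicts:

```python
import numpy as np

rng = np.random.default_rng(7)
mu, sigma = 1.0, 2.0

r = rng.random(200000)                      # unit uniform samples
x = mu - sigma * np.tan(np.pi * (0.5 - r))  # inverts Eq. A.26

# From A.26: F(mu - sigma) = 1/4, F(mu) = 1/2, F(mu + sigma) = 3/4
q = np.quantile(x, [0.25, 0.5, 0.75])
```

Note that the sample *quantiles* are well behaved even though the Cauchy distribution has no finite moments, so checking the sample mean would be meaningless.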

If we set F(x) = 1 − r, then
$$ x = \frac{1}{a} \ln\!\left( \frac{1}{r} \right), $$
where r is a unit uniform random variable, is exponentially distributed with decay constant a. This relation will be of particular interest in Chapter 4 when we consider numerical simulation of random processes, and is particularly useful in the generating of Gaussian-distributed random variables (see Exercise 3 on page 193).

A.4.1 Joint inversion generating method

• Markov Processes, D. T. Gillespie (Academic Press, Inc., 1992), p. 53.

The following procedure is used to generate joint random variables. Let X₁, X₂ and X₃ be three random variables with joint density function P. Define the functions F₁, F₂ and F₃ by
$$ F_1(x_1) = \int_{-\infty}^{x_1} P_1(x')\,dx', \quad F_2(x_2; x_1) = \int_{-\infty}^{x_2} P_2^{(1)}(x'|x_1)\,dx', \quad F_3(x_3; x_1, x_2) = \int_{-\infty}^{x_3} P_3^{(1,2)}(x'|x_1, x_2)\,dx', $$
where the subordinate densities P_k^{(i,j)} are defined as follows:
$$ P_1(x_1) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} P(x_1, x_2, x_3)\,dx_2\,dx_3, $$
$$ P_2^{(1)}(x_2|x_1) = \frac{\int_{-\infty}^{\infty} P(x_1, x_2, x_3)\,dx_3}{\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} P(x_1, x_2, x_3)\,dx_2\,dx_3}, \qquad P_3^{(1,2)}(x_3|x_1, x_2) = \frac{P(x_1, x_2, x_3)}{\int_{-\infty}^{\infty} P(x_1, x_2, x_3)\,dx_3}. $$

Then if r₁, r₂ and r₃ are three independent unit uniform random numbers, the values x₁, x₂ and x₃ obtained by successively solving the three equations
$$ F_1(x_1) = r_1, \qquad F_2(x_2; x_1) = r_2, \qquad F_3(x_3; x_1, x_2) = r_3, $$
are simultaneous sample values of X₁, X₂ and X₃.

A.5 Change of Variables

• D. T. Gillespie (1983) "A theorem for physicists in the theory of random variables," American Journal of Physics 51: 520.

Given a random variable X with density distribution P(x), and a new random variable Y defined by the transformation
$$ Y = g(X) \tag{A.32} $$
(where g(·) is a given deterministic function), the probability density Q(y) of this new variable is given by
$$ Q(y) = \int_{-\infty}^{\infty} P(x)\,\delta(y - g(x))\,dx. \tag{A.33} $$
Here, δ(·) is, as always, the Dirac-delta function. In general, for a vector of random variables {Xᵢ}ⁿᵢ₌₁ with density P(x), and transformed variables {yⱼ = gⱼ(x)}ᵐⱼ₌₁, the joint density for Y is
$$ Q(\mathbf{y}) = \int_{-\infty}^{\infty} P(\mathbf{x}) \prod_{j=1}^{m} \delta(y_j - g_j(\mathbf{x}))\,d\mathbf{x}. \tag{A.34} $$
If m = n, then making the change of variables zⱼ = gⱼ(x) in the integral above, we arrive at the formula for a transformation of probability density functions,
$$ Q(\mathbf{y}) = P(\mathbf{x}) \left| \frac{\partial(x_1, \ldots, x_n)}{\partial(y_1, \ldots, y_n)} \right|, \tag{A.35} $$
where |∂x/∂y| is the determinant of the Jacobian of the transformation xᵢ = hᵢ(y).
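Formula A.35 is straightforward to confirm by simulation. For instance, with X uniform on (0, 1) and Y = X², the Jacobian factor gives Q(y) = 1/(2√y), so the distribution function of Y is F_Y(y) = √y. A Python/NumPy sketch (illustration only; the example Y = X² is mine, not the text's):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.random(200000)   # X ~ U(0,1), so P(x) = 1 on (0,1)
y = x**2                 # transformation y = g(x) = x^2

# Eq. A.35: Q(y) = P(x)|dx/dy| = 1/(2 sqrt(y)),  hence  F_Y(y) = sqrt(y)
ys = np.array([0.04, 0.25, 0.64])
empirical = np.array([(y <= s).mean() for s in ys])
analytic = np.sqrt(ys)   # 0.2, 0.5, 0.8
```

The empirical distribution function of the transformed samples matches √y to within Monte Carlo error.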

A.6 Monte-Carlo simulation

• P. J. Nahin, Digital Dice (Princeton, 2008).

There are situations when exact solutions to probability questions are very difficult to calculate, or perhaps the approach to a solution is not clear. Often, Monte Carlo simulations may be the only solution strategy available. The main idea is to use a random number generator and a program loop to estimate the averaged outcome of a particular experiment, providing a straightforward method to gain some insight into the behaviour of the system. For example, if I want to know the average number of times I will get 4 heads in a row flipping a fair coin, I could sit down and flip a coin 4 times, recording the outcome each time, and repeat this hundreds or thousands of times; then take the number of times that 4 heads appear in my list, and divide this by the number of 4-flip trials I have done. The larger the number of trials, the better my estimate for the average will be. The data in the table below comes from a random number generator and a for loop in place of flipping an actual coin; the routine used to generate that data is shown alongside (Figure A.5).

A more sophisticated example comes from trying to estimate the value of an integral using Monte Carlo methods. Suppose we would like to estimate the area of a quarter-circle of radius 1. One approach is to randomly generate points in the 1 × 1 square in the first quadrant, and count how many of these lie inside the circle (i.e. x² + y² ≤ 1). We then assume that fraction is an adequate representation of the fraction of the area of the 1 × 1 square that lies within the quarter-circle (Figure A.6). Again, the Matlab script used to generate the data is also shown (Figure A.7). More challenging applications of Monte Carlo methods appear in the exercises.

A.7 Bertrand's Paradox

The classical theory of probability is – from a mathematical point of view – nothing but transformations of variables. Some probability distribution is given a priori on a set of elementary events; the problem is then to transform it into a probability distribution for the possible outcomes, each of which corresponds to a collection of elementary events. For example, when 2 dice are cast there are 36 elementary events, and they are assumed to have equal a priori probability. The problem is then to find the probability for the various totals by counting the number of elementary events that make up each total.

Figure A.5: A) Matlab routine to simulate a four-toss experiment, recording the number of times 4 heads are tossed. The only idiosyncratic commands are `rand`, which calls a unit uniform random number generator, and `sprintf`, which displays the output as a formatted string; in particular, the `%0.6f` command ensures that the output is displayed with 6 decimal points. B) Output simulation data. As the number of trials increases, the Monte Carlo estimate approaches the analytic value.

A)
```matlab
N=input('How many trials?');
number_of_4_heads=0;
for i=1:N
    f1=rand; f2=rand; f3=rand; f4=rand;
    if f1>0.5 && f2>0.5 && f3>0.5 && f4>0.5
        number_of_4_heads=number_of_4_heads+1;
    end
end
sprintf('Average number of 4 heads tossed: %0.6f', number_of_4_heads/N)
```

B)
    Number of trials, N    Estimate of probability to flip 4 heads
    10^3                   0.063300
    10^4                   0.061000
    10^5                   0.062140
    10^6                   0.062450
    10^7                   0.062643
    exact, 1/2^4           0.062500

Figure A.6: The fraction of randomly generated points lying inside the quarter circle provides an estimate of the area.
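The four-toss routine of Figure A.5 translates directly into other environments; here is an equivalent vectorized sketch in Python/NumPy (for comparison only, it is not part of the original text):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10**6

# four coin flips per trial; a trial succeeds when all four show heads
flips = rng.random((N, 4)) > 0.5
number_of_4_heads = int(np.all(flips, axis=1).sum())
estimate = number_of_4_heads / N   # analytic value: 1/2^4 = 0.0625
```

Vectorizing over the N trials replaces the explicit `for` loop of the Matlab listing and runs in a fraction of the time for large N.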

Figure A.7: A) Matlab routine to simulate points inside the 1 × 1 square, recording the number of points lying inside the quarter circle of radius 1. B) Output simulation data. As the number of points increases, the Monte Carlo estimate approaches the analytic value.

A)
```matlab
N=input('How many points?');
pts_inside=0;
for i=1:N
    p1=rand; p2=rand;
    if p1^2+p2^2 <= 1
        pts_inside=pts_inside+1;
    end
end
sprintf('Area of the quarter circle: %0.6f', pts_inside/N)
```

B)
    Number of points, N    Estimate of area of quarter circle
    10^3                   0.805000
    10^4                   0.786300
    10^5                   0.783310
    10^6                   0.785627
    10^7                   0.785315
    exact, π/4             0.785398

In problems of gambling, or balls in urns, the assignment of equal a priori probability is usually clear (or at least the one meant by the author is), and so is frequently not stated explicitly. In applications to the real world, however, one must decide which a priori distribution correctly describes the actual situation. (This is not a mathematical problem.) Mathematics can only derive probabilities of outcomes from a given a priori distribution. This has led to the erroneous view that pure mathematics can conjure up the correct probability for actual events to occur; in fact, no philosophical principle can tell whether a die is loaded or not! One attempt to avoid an explicit assumption of the a priori probability distribution is the so-called Principle of Insufficient Reason, with an enormous literature of semi-philosophical character. It states that two elementary events have the same probability when there is no known reason why they should not. At best, this can be considered a working hypothesis. For a review, see for example J. R. Lucas, "The concept of probability," Clarendon, Oxford, 1970. The danger of intuitive ideas about equal probabilities has been beautifully illustrated by Bertrand (Calcul des probabilités, Gauthiers-Villars, Paris, 1889):

Take a fixed circle of radius 1, and draw at random a straight line intersecting it. What is the probability that the chord has length > √3 (the length of the side of the inscribed equilateral triangle)?

Figure A.8: A circle of radius 1, with an inscribed equilateral triangle with sides of length √3.

Answer 1: Take all random lines through a fixed point P on the edge of the circle (Figure A.8). Apart from the tangent (zero probability), all such lines will intersect the circle. For the chord to be > √3, the line must lie within an angle of 60° out of a total 180°. Hence, p = 1/3.

Answer 2: Take all random lines perpendicular to a fixed diameter. The chord is > √3 when the point of intersection lies on the middle half of the diameter. Hence, p = 1/2.

Answer 3: For a chord to have length > √3, its center must lie at a distance less than 1/2 from the center of the circle. The area of a circle of radius 1/2 is a quarter of that of the original circle. Hence, p = 1/4.

Each solution is based upon a different assumption about equal a priori probabilities. The loose phrase "at random" does not sufficiently specify the a priori probability to choose among the solutions.
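All three answers can be reproduced by Monte Carlo, once the phrase "at random" is made concrete. The sampling prescriptions below are my own parameterizations of the three answers (Python/NumPy sketch; not from the text): random endpoints on the circle, a random foot on a fixed diameter, and a random midpoint uniform over the disk.

```python
import numpy as np

rng = np.random.default_rng(4)
N, L = 200000, np.sqrt(3.0)

# Answer 1: chord through two uniform random points on the circle
t1, t2 = rng.uniform(0, 2 * np.pi, N), rng.uniform(0, 2 * np.pi, N)
p1 = np.mean(2.0 * np.abs(np.sin((t1 - t2) / 2.0)) > L)   # -> 1/3

# Answer 2: chord perpendicular to a diameter, foot uniform on [0, 1]
d = rng.uniform(0, 1, N)
p2 = np.mean(2.0 * np.sqrt(1.0 - d**2) > L)               # -> 1/2

# Answer 3: chord midpoint uniform over the area of the disk
r = np.sqrt(rng.uniform(0, 1, N))                         # radius with area weighting
p3 = np.mean(2.0 * np.sqrt(1.0 - r**2) > L)               # -> 1/4
```

Each prescription is internally consistent; the three estimates disagree because the three a priori distributions over chords differ, which is exactly Bertrand's point.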

Suggested References

Much of this chapter was taken from:
• Stochastic Processes in Physics and Chemistry (2nd Ed.), N. G. van Kampen (North-Holland, 2001),
and
• Markov Processes, D. T. Gillespie (Academic Press, Inc., 1992).
Gillespie's book is particularly useful if one wants to develop practical schemes for simulating random processes. In that direction,
• An Introduction to Stochastic Processes in Physics, D. S. Lemons (Johns Hopkins University Press, 2002),
is also very useful.

Exercises

1. Flipping a biased coin: Suppose you have a biased coin, showing heads with a probability p ≠ 1/2. How could you use this coin to generate a random string of binary outcomes, each with a probability of exactly 1/2?

2. Characteristic functions:
(a) Derive the characteristic function φ(ω) for the normal distribution N(µ, σ²). Use this characteristic function to show that, for two independent normal random variables,
$$ aN(\mu_1, \sigma_1^2) + bN(\mu_2, \sigma_2^2) = N(a\mu_1 + b\mu_2,\; a^2\sigma_1^2 + b^2\sigma_2^2). $$
(b) Use the characteristic function to show that for an arbitrary random variable the third-order cumulant κ₃ is expressed in terms of the moments µ₁, µ₂, and µ₃ as
$$ \kappa_3 = \mu_3 - 3\mu_2\mu_1 + 2\mu_1^3. $$
(c) Use the characteristic function for an arbitrary random variable to prove the weak version of the central limit theorem – that is, show that for a series of independent, identically

distributed random variables Xᵢ with mean µ and standard deviation σ, the characteristic function of the random variable
$$ Y_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma \sqrt{n}} \tag{A.36} $$
converges to the characteristic function of the unit normal distribution N(0, 1) as n → ∞. Hint: Let Zᵢ = (Xᵢ − µ)/σ and write out the Taylor series for the characteristic function of Zᵢ. Use this to derive an expression for the characteristic function of Yₙ in the limit n → ∞.
(d) Write a Monte Carlo simulation to verify the central limit theorem, and plot a histogram of the distribution of the sum Eq. A.28. What do you notice for small n? Compare the histogram to a unit normal distribution – beyond what values of n are the plots indistinguishable?

3. Generating random numbers: Very powerful and reliable schemes have been developed to generate unit uniform random numbers, but we often want random numbers drawn from more exotic distributions. Start with any kind of nonlinear transformation of a unit uniform random number.
(a) Write the generation formulas for x drawn from U(a, b) and C(µ, σ) in terms of the unit uniform variable r. Do you foresee a problem trying to do the same for x drawn from N(µ, σ²)?
(b) Given the discrete probability distribution P(n), show that for r drawn from a unit uniform distribution U(0, 1), the integer n satisfying
$$ \sum_{n'=-\infty}^{n-1} P(n') \;\leq\; r \;<\; \sum_{n'=-\infty}^{n} P(n') $$
is a realization drawn from P(n). Hint: What is the probability that a ≤ r < b, for 0 ≤ a < b < 1?

4. Multivariate Gaussian distribution: Show that if the correlation matrix C is diagonal (C_ij = 0, i ≠ j), then x is a vector of independent Gaussian random variables.

5. Monte Carlo simulation: Often analytic solutions to probability questions are prohibitively difficult, or one simply wants to gain some intuition for the process; in those cases, stochastic simulation is indispensable. Estimate the solution of the following problems by running a Monte Carlo simulation.
(a) Suppose you have to match members of two lists – for example, a list of architects and their famous buildings. Each member of one list maps uniquely to a member of the other list. For a list of 15 members, what is the average number of correct connections that will be made if each member of one list is randomly connected to another member of the other list (in a one-to-one fashion)? How does this average change for a list with 5 members? With 100 members?
(b) Suppose you have a dozen eggs. You take them out of the carton and wash them. When you put them back, what is the probability that none of the eggs end up in their original location? What if you have 100 eggs? Do you recognize the reciprocal of this probability? The probability that after permutation no member returns to where it began is called the derangement probability, and was computed explicitly for an arbitrary number of elements by Euler (1751) using a very crafty argument.
(c) Two (independent) bus lines operate from a stop in front of your apartment building. One bus arrives every hour, on the hour; the other an unknown, but constant, x fraction of an hour later (where, for lack of additional information, we assume x is a unit uniform random variable). What is the average wait time for a rider arriving at the stop at random? What about the case for n independent lines running from the same stop? By running simulations for different n, can you speculate about a general formula for all n?

6. Change of variables: To a large extent, classical probability theory is simply a change of variables. The following examples test that notion.
(a) Ballistics: (van Kampen, 2001) A cannon shoots a ball with initial velocity ν at an angle θ with the horizontal. Both ν and θ have uncertainty given by Gaussian distributions with variance σν² and σθ², centered on ν₀ and θ₀, respectively. The variance is narrow enough that negative values of ν and θ can

be ignored. Find the probability distribution for the distance traveled by the cannon ball.
(b) Single-slit diffraction: (Lemons, 2002, p. 30) According to the probability interpretation of light, formulated by Max Born in 1926, light intensity at a point is proportional to the probability that a photon exists at that point.
i. What is the probability density f(x) that a single photon passes through a narrow slit and arrives at position x on a screen parallel to and at a distance d beyond the barrier? Each angle of forward propagation θ is the uniform random variable U(−π/2, π/2) (see Figure A.9).
ii. The light intensity produced by diffraction through a single, narrow slit, as found in almost any introductory physics text, is proportional to
$$ I \propto \frac{1}{r^2 \sin^2\theta} \sin^2\!\left[ \left( \frac{\pi a}{\lambda} \right) \sin\theta \right], $$
where r is the distance from the center of the slit to an arbitrary place on the screen, a is the slit width, and λ the light wavelength. Show that for slits so narrow that πa/λ ≪ 1, the above light intensity is proportional to the photon probability density derived in part 6(b)i.

Figure A.9: Single-slit diffraction. From Lemons (2002).

(c) Linear noise approximation: If we define the random variables n(t) by the linear (though time-dependent) transformation
$$ n_i(t) = \Omega\, x_i(t) + \Omega^{1/2} \alpha_i(t), \qquad i = 1, 2, \ldots, N, $$

where α(t) is a random variable with density Π(α, t), Ω is a constant, and x(t) is a deterministic function of time, then show that the density P(n, t) is simply
$$ P(\mathbf{n}, t) = \Omega^{-N/2}\, \Pi(\boldsymbol{\alpha}, t). $$
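As an illustration of the Monte Carlo strategy called for in Exercise 5, the derangement probability of part 5(b) can be estimated with a few lines (Python/NumPy sketch, not part of the text, whose own listings use Matlab):

```python
import numpy as np

rng = np.random.default_rng(5)
trials, n_eggs = 100000, 12

hits = 0
for _ in range(trials):
    perm = rng.permutation(n_eggs)       # a random re-shelving of the eggs
    if np.all(perm != np.arange(n_eggs)):  # no egg back in its original spot
        hits += 1
p_derange = hits / trials
```

Already for a dozen eggs the estimate sits very close to 1/e ≈ 0.3679, consistent with Euler's exact result for the derangement probability.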

Appendix B

Review of Mathematical Methods

There are several mathematical methods we return to again and again in this course. As a refresher, more background is provided in this appendix. In the main notes, it is assumed that the reader is familiar with matrix algebra, linear stability analysis, Laplace/Fourier transforms, and some elementary results from asymptotic analysis.

B.1 Matrix Algebra and Time-Ordered Exponentials

The main application of matrix algebra in this course will be to codify a system of differential equations in a compact way. As such, most of the methods outlined in this appendix will be written in vector-matrix notation.

Matrix form of differential equations

A system of high-order differential equations can always be written as a larger system of first-order differential equations by the simple expedient of introducing new state variables for each of the derivatives. For example, the equation describing simple harmonic motion can be written either as a second-order differential equation or as a 2 × 2 system of first-order

differential equations:
$$ \frac{d^2 x}{dt^2} + \omega^2 x = 0 \quad \Longleftrightarrow \quad \frac{d}{dt} \begin{bmatrix} x \\ \dot{x} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\omega^2 & 0 \end{bmatrix} \begin{bmatrix} x \\ \dot{x} \end{bmatrix}, \tag{B.1} $$
where the first line of the matrix equation defines the new state variable ẋ. In many ways, the matrix form of the equation is preferable, since solution methods immediately generalize to higher-order systems.

For a fully nonlinear system, the system reads
$$ \frac{d}{dt}\mathbf{x} = \mathbf{F}(\mathbf{x}), \tag{B.2} $$
where x ∈ Rⁿ and F = (F₁(x), F₂(x), ..., Fₙ(x)) is the reaction rate vector. There is no general solution for such systems. If, however, the reaction rate vector is linear in the state variables, there are formal solutions.

Autonomous linear differential equations

Autonomous means that the coefficients appearing in the differential equation are constant. For a homogeneous equation (i.e., zero right-hand side),
$$ \frac{d\mathbf{x}}{dt} - \mathbf{A} \cdot \mathbf{x} = 0, \qquad \mathbf{x}(0) = \mathbf{x}_0, \tag{B.3} $$
in direct analogy with the one-dimensional system, the fundamental solutions of this equation are simply exponentials, albeit with matrix arguments. We define the matrix exponential as
$$ \exp \mathbf{A} = \mathbf{I} + \mathbf{A} + \frac{1}{2!}\mathbf{A}^2 + \ldots = \sum_{n=0}^{\infty} \frac{\mathbf{A}^n}{n!}. \tag{B.4} $$
Here,
$$ \mathbf{I} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \tag{B.5} $$
is the identity matrix. With the matrix exponential, the formal solution of Eq. B.3 is simply
$$ \mathbf{x}(t) = \mathbf{x}_0 \cdot \exp\left[ \mathbf{A} t \right]. $$
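The series of Eq. B.4 is directly computable, and for the harmonic oscillator of Eq. B.1 the result can be checked against the known solution x(t) = cos ωt, ẋ(t) = −ω sin ωt for x(0) = 1, ẋ(0) = 0. A Python/NumPy sketch (illustration only; with column vectors the solution is applied as exp[At]·x₀):

```python
import numpy as np

def expm_series(M, nterms=60):
    """Matrix exponential summed term by term as in Eq. B.4 (fine for modest ||M||)."""
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for n in range(1, nterms):
        term = term @ M / n   # A^n / n! built recursively
        E = E + term
    return E

omega, t = 2.0, 1.5
A = np.array([[0.0, 1.0], [-omega**2, 0.0]])   # Eq. B.1
x0 = np.array([1.0, 0.0])                      # x(0) = 1, dx/dt(0) = 0

x_t = expm_series(A * t) @ x0
exact = np.array([np.cos(omega * t), -omega * np.sin(omega * t)])
```

Truncating the series works here because ||At|| is modest; for stiff or large systems one would use a library routine (e.g. a scaling-and-squaring implementation) instead.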

Time-ordered exponentials for non-autonomous differential equations

• R. P. Feynman (1951) "An operator calculus having applications in quantum electrodynamics," Physical Review 84: 108–128.
• N. G. van Kampen (1974) "A cumulant expansion for stochastic linear differential equations," Physica 74: 215–238.

Non-autonomous means the coefficients in the differential equation are no longer constant. Consider, for example, the differential equation with time-dependent coefficients¹,
$$ \frac{d}{dt}\mathbf{Y}(t) = \mathbf{A}(t) \cdot \mathbf{Y}(t), \qquad \mathbf{Y}(0) = \mathbf{1}. \tag{B.6} $$
By iteration, the solution can be expressed as an infinite series,
$$ \mathbf{Y}(t) = \mathbf{I} + \int_0^t \mathbf{A}(t_1)\,dt_1 + \int_0^t\!\!\int_0^{t_1} \mathbf{A}(t_1)\mathbf{A}(t_2)\,dt_2\,dt_1 + \ldots = \sum_{n=0}^{\infty} \int_0^t dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{n-1}} dt_n\, \mathbf{A}(t_1)\mathbf{A}(t_2)\cdots\mathbf{A}(t_n). \tag{B.7} $$
If the matrices A(tₙ) commute with one another, this series can be simply written as an exponential of an integral (as in the autonomous case discussed above),
$$ \mathbf{Y}(t) = \exp\!\left[ \int_0^t \mathbf{A}(t')\,dt' \right] = \sum_{n=0}^{\infty} \frac{1}{n!} \int_0^t dt_1 \int_0^t dt_2 \cdots \int_0^t dt_n\, \mathbf{A}(t_1)\mathbf{A}(t_2)\cdots\mathbf{A}(t_n), $$
where the 1/n! is included to accommodate the extended domain of integration, from [0, t] for each iterate. If the matrices A(tₙ) do not commute with one another (as is typically the case), then a time-ordering operator ⌈ ⌉ is used to allow the infinite series, Eq. B.7, to be written as an exponential,
$$ \mathbf{Y}(t) = \left\lceil \exp\!\left[ \int_0^t \mathbf{A}(t')\,dt' \right] \right\rceil \equiv \sum_{n=0}^{\infty} \int_0^t dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{n-1}} dt_n\, \mathbf{A}(t_1)\mathbf{A}(t_2)\cdots\mathbf{A}(t_n). \tag{B.8} $$

¹ Equations of this sort cannot be readily solved using Laplace or Fourier transform methods.
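For the commuting case, the exponential-of-an-integral formula can be checked numerically against a product of short-time propagators. The example below (a Python/NumPy sketch of my own, not from the text) takes A(t) = cos(t)·B with a fixed matrix B, so that A at different times commutes:

```python
import numpy as np

def expm_series(M, nterms=25):
    # matrix exponential via the series of Eq. B.4
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for n in range(1, nterms):
        term = term @ M / n
        E = E + term
    return E

B = np.array([[0.0, 1.0], [-1.0, 0.0]])
T, steps = 2.0, 5000
dt = T / steps

# product of short-time propagators for dY/dt = cos(t) B Y, midpoint sampling
Y = np.eye(2)
for k in range(steps):
    tk = (k + 0.5) * dt
    Y = expm_series(np.cos(tk) * dt * B) @ Y

# closed form valid because A(t) commutes with itself: exp of the integral of A
Y_closed = expm_series(np.sin(T) * B)
err = np.abs(Y - Y_closed).max()
```

When A(tₙ) do not commute, the same product of short-time propagators still converges to the true solution, but it converges to the *time-ordered* exponential of Eq. B.8, not to exp of the integral.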

Many equations, for example the Mathieu equation discussed in Section 8.4, can be written formally as a time-ordered exponential, although the explicit solution cannot be written in terms of elementary functions.

B.2 Linear Stability Analysis

• L. Edelstein-Keshet, Mathematical Models in Biology (SIAM, 2005).

Suppose we have the nonlinear system of differential equations
$$ \frac{d}{dt}\mathbf{x} = \mathbf{F}(\mathbf{x}). \tag{B.9} $$
We can determine the equilibrium, or fixed, points of this system by looking for values of the state variables such that
$$ \frac{d\mathbf{x}}{dt} = 0, $$
or, equivalently, those values of x that satisfy the condition
$$ \mathbf{F}(\mathbf{x}^\star) = 0. \tag{B.10} $$
Once these equilibrium points x⋆ have been determined, either analytically or numerically, the question arises as to whether or not they are stable, in the sense that small perturbations away from x⋆ return to x⋆.

Eigenvalues and the fundamental modes

For very small perturbations xₚ, the local dynamics near to x⋆ can be adequately described by a linearization of the rates. Starting from the governing equation
$$ \frac{d}{dt}\left[ \mathbf{x}^\star + \mathbf{x}_p \right] = \frac{d}{dt}\mathbf{x}_p = \mathbf{F}(\mathbf{x}^\star + \mathbf{x}_p), $$
we use the Taylor series to approximate F for small xₚ,
$$ \mathbf{F}(\mathbf{x}^\star + \mathbf{x}_p) \approx \mathbf{F}(\mathbf{x}^\star) + \mathbf{J} \cdot \mathbf{x}_p = \mathbf{J} \cdot \mathbf{x}_p, $$
where J is called the Jacobian or response matrix of F, with elements
$$ J_{ij} = \frac{\partial F_i}{\partial x_j}(\mathbf{x}^\star). $$

The dynamics of the small perturbation modes xₚ are then governed by the homogeneous linear equation
$$ \frac{d\mathbf{x}_p}{dt} = \mathbf{J} \cdot \mathbf{x}_p. \tag{B.11} $$
For a one-dimensional system, we know that the solution is an exponential, xₚ ∝ exp λt. Assuming the same holds true in multiple dimensions, we make the ansatz
$$ \mathbf{x}_p(t) = \mathbf{v}\, e^{\lambda t}. \tag{B.12} $$
Substituting into both sides of Eq. B.11 and cancelling e^{λt}, we arrive at the constraint on λ,
$$ \lambda \mathbf{v} = \mathbf{J} \cdot \mathbf{v}, \tag{B.13} $$
and values λ that satisfy this condition are called the eigenvalues of J. Writing λv = λI·v,
$$ \left[ \lambda \mathbf{I} - \mathbf{J} \right] \cdot \mathbf{v} = 0. \tag{B.14} $$
Non-trivial solutions, v ≠ 0, are only possible if the matrix on the left-hand side is not invertible, i.e.
$$ \det\left[ \lambda \mathbf{I} - \mathbf{J} \right] = 0. \tag{B.15} $$
This is called the resolvent equation for the linear operator J. The importance of the eigenvalues comes from what they tell us about the long-time stability of the perturbation modes xₚ. Since we have xₚ ∝ e^{λt}, if any of the eigenvalues have a positive real part, the perturbations will grow and we say that the equilibrium point x⋆ is unstable.
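In practice the eigenvalues of J are simply computed numerically. As a hypothetical example (mine, not the text's), the damped oscillator ẍ + γẋ + ω²x = 0 linearizes to J = [[0, 1], [−ω², −γ]], whose eigenvalues all have real part −γ/2 < 0, so the fixed point at the origin is stable (a Python/NumPy sketch):

```python
import numpy as np

# Jacobian of a damped oscillator x'' + gamma x' + omega^2 x = 0
omega, gamma = 2.0, 0.5
J = np.array([[0.0, 1.0], [-omega**2, -gamma]])

lam = np.linalg.eigvals(J)
stable = bool(np.all(lam.real < 0))  # stable: every eigenvalue in the left half-plane
```

As a sanity check, the eigenvalues must satisfy Σλ = tr J = −γ and Πλ = det J = ω².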

B.2.1 Routh-Hurwitz Criterion

Notice that the resolvent equation, Eq. B.15, is a polynomial in λ with degree equal to the dimensionality of our system – that is, if J ∈ Rⁿ × Rⁿ, then the resolvent equation is an nth-order polynomial. It would be nice, particularly when the order is greater than 4 or some of the coefficients are free parameters, if we had a condition for stability that did not require that the λ be computed explicitly. The Routh-Hurwitz criterion provides just such a test, determining the stability from the coefficients of the resolvent equation without explicit computation of λ.

Suppose our resolvent equation is the kth-order polynomial
$$ \lambda^k + a_1 \lambda^{k-1} + \ldots + a_k = 0, \tag{B.16} $$
then define the following matrices:
$$ \mathbf{H}_1 = \left[ a_1 \right], \qquad \mathbf{H}_2 = \begin{bmatrix} a_1 & 1 \\ a_3 & a_2 \end{bmatrix}, \quad \ldots, \quad \mathbf{H}_k = \begin{bmatrix} a_1 & 1 & 0 & \cdots & 0 \\ a_3 & a_2 & a_1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \cdots & a_k \end{bmatrix}, \tag{B.17} $$
where the (l, m) element of the matrix Hⱼ is
$$ (\mathbf{H}_j)_{lm} = \begin{cases} a_{2l-m} & \text{for } 0 < 2l - m < k, \\ 1 & \text{for } 2l = m, \\ 0 & \text{otherwise.} \end{cases} \tag{B.18} $$
The stability criterion is the following: A necessary and sufficient condition for all of the roots of the resolvent Eq. B.16 to have negative real parts is that the determinants of the matrices Hₖ are all positive (> 0). It looks cumbersome written out for a general system, but it is easily coded into symbolic mathematics packages like Maple, Mathematica or Matlab.
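It is just as easily coded numerically. The sketch below (Python/NumPy, my own; the cubic test polynomials are hypothetical examples) builds the Hurwitz matrices using the compact convention a₀ = 1 and aᵢ = 0 for i outside 0..k, which reproduces the "1 on the superdiagonal, a_k in the corner" pattern of Eq. B.17, and cross-checks the criterion against explicitly computed roots:

```python
import numpy as np

def hurwitz_matrices(a):
    """H_1..H_k for lambda^k + a[0] lambda^(k-1) + ... + a[-1] = 0 (Eq. B.17).
    Element (l, m) is a_{2l-m}, with a_0 = 1 and a_i = 0 outside 0..k."""
    k = len(a)
    coeff = [1.0] + list(a)          # coeff[i] = a_i
    H = np.zeros((k, k))
    for l in range(1, k + 1):
        for m in range(1, k + 1):
            i = 2 * l - m
            if 0 <= i <= k:
                H[l - 1, m - 1] = coeff[i]
    return [H[:j, :j] for j in range(1, k + 1)]

def routh_hurwitz_stable(a):
    return all(np.linalg.det(Hj) > 0 for Hj in hurwitz_matrices(a))

stable_poly = [6.0, 11.0, 6.0]     # (s+1)(s+2)(s+3): roots -1, -2, -3
unstable_poly = [1.0, -4.0, -4.0]  # (s-2)(s+2)(s+1): one root at +2
```

The criterion agrees with the sign of the root real parts in both cases, without the roots ever being computed.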

B.3 Transform Methods

• G. James, Advanced Modern Engineering Mathematics (Prentice Hall, 2005).

Laplace and Fourier transforms are a convenient method of exchanging the search for the solution of a differential equation (difficult) with the search for the solution of an algebraic equation (easier). The simplification of solving the problem in Laplace (or Fourier) space comes at the expense of a sometimes difficult (or impossible) inversion back to the original problem space. In the context of the present course, however, we will rarely be interested in inverting the Laplace or Fourier transform back to the original problem space, since we will be using the properties of the transformed quantities directly: the poles of the Laplace transform will be used to determine asymptotic stability (Section 8.5 on page 188 and Section 9.3 on page 206), and the Fourier transform of the correlation function is related to the power spectrum of the fluctuations (Section 2.3 on page 39).

B.3.1 Laplace Transform

The Laplace transform of the vector function f(t) ∈ Rⁿ is given by the integration of the function multiplied by exp[−sIt],
$$ \mathcal{L}[\mathbf{f}(t)](s) \equiv \hat{\mathbf{F}}(s) = \int_0^{\infty} \mathbf{f}(t) \cdot e^{-s\mathbf{I}t}\,dt, \tag{B.19} $$
where I is the n × n identity matrix (for the transform of a scalar function, set n = 1). Our interest in the Laplace transform comes from two useful properties. First, using the definition of the transform and integration by parts, it is possible to derive the relation
$$ \mathcal{L}\!\left[ \frac{d\mathbf{x}}{dt} \right] = s\mathbf{I} \cdot \hat{\mathbf{X}}(s) - \mathbf{x}(0). \tag{B.20} $$
Then, for a system of linear differential equations,
$$ \frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot \mathbf{x}, \qquad \mathbf{x}(0) = \mathbf{x}_0, \tag{B.21} $$
the formal solution can always be expressed as a Laplace transform,
$$ \hat{\mathbf{X}}(s) = \left[ s\mathbf{I} - \mathbf{A} \right]^{-1} \cdot \mathbf{x}_0. \tag{B.22} $$
The second useful property is that the Laplace transform of a convolution integral,
$$ g(t) * f(t) = \int_0^t g(t - \tau) \cdot f(\tau)\,d\tau, \tag{B.23} $$
is simply the product of the individual Laplace transforms,
$$ \mathcal{L}\left[ g(t) * f(t) \right] = \mathcal{L}\left[ g(t) \right] \cdot \mathcal{L}\left[ f(t) \right]. \tag{B.24} $$
Notice from Eq. B.22 that [sI − A] is not invertible precisely at those values of s for which
$$ \det\left[ s\mathbf{I} - \mathbf{A} \right] = 0. \tag{B.25} $$

Asymptotic Stability

• S. I. Grossman and R. K. Miller (1973) "Nonlinear Volterra integrodifferential systems with L¹-kernels," Journal of Differential Equations 13: 551–566.

This equation, called the resolvent equation, likewise determines the eigenvalues of the matrix A (Section B.2). That is, the solution of the differential equation Eq. B.21 is asymptotically stable (x(t) → 0 as t → ∞) if, and only if, those values of s that solve the resolvent equation all have negative real parts:
$$ \det\left[ s\mathbf{I} - \mathbf{A} \right] = 0 \quad \Longrightarrow \quad \operatorname{Re}(s) < 0. \tag{B.26} $$
A more general theorem (Grossman and Miller, 1973) states that asymptotic stability of the convolution equation
$$ \frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot \mathbf{x} + \int_0^t \mathbf{g}(t - \tau) \cdot \mathbf{x}(\tau)\,d\tau \tag{B.27} $$
is also guaranteed if, and only if,
$$ \det\left[ s\mathbf{I} - \mathbf{A} - \hat{\mathbf{G}}(s) \right] = 0 \quad \Longrightarrow \quad \operatorname{Re}(s) < 0. \tag{B.28} $$
Finally, a useful feature of the Laplace transform is that the s → 0 limit provides the t → ∞ limit of the un-transformed function x(t), provided the roots of the resolvent equation have negative real parts:
$$ \lim_{t \to \infty} \mathbf{x}(t) = \lim_{s \to 0} s \cdot \hat{\mathbf{x}}(s). \tag{B.29} $$

B.3.2 Fourier Transform

The Fourier transform is very similar to the Laplace transform, but now with an explicitly complex argument iω and an unbounded domain of integration,
$$ \mathcal{F}[\mathbf{f}(t)](i\omega) \equiv \hat{\mathbf{F}}(i\omega) = \int_{-\infty}^{\infty} \mathbf{f}(t) \cdot e^{-i\omega\mathbf{I}t}\,dt. \tag{B.30} $$
The Fourier transform, in contrast to the Laplace or z-transform, is easily inverted,
$$ \mathbf{f}(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{\mathbf{F}}(i\omega) \cdot e^{i\omega\mathbf{I}t}\,d\omega. $$
Finally, note the duality property of the Fourier transform: up to a factor of 2π, the Fourier transform is its own inverse,
$$ \mathcal{F}[\hat{\mathbf{F}}(it)] = 2\pi\, \mathbf{f}(-\omega). \tag{B.31} $$

The Fourier transform can be thought of as a projection of the frequency content of a signal f(t) onto the basis functions cos ωt and sin ωt, for a continuous distribution of frequencies.

Dirac delta function

The Dirac delta function δ(t) is a distribution with the following sifting property:
$$ \int_a^b f(t)\,\delta(t - c)\,dt = f(c) \quad \text{for } a < c < b. $$
What happens for c = a or c = b is a matter of convention. If δ(t) is defined as the limit of a sequence of functions, for example
$$ \lim_{n \to \infty} \frac{n}{2}\, e^{-|\tau| n} = \delta(\tau), $$
then it follows that
$$ \int_a^b f(t)\,\delta(t - c)\,dt = \frac{1}{2} f(c) \qquad c = a, b.^2 $$
Itô's interpretation, on the other hand, is consistent with the convention
$$ \int_a^b f(t)\,\delta(t - c)\,dt = \begin{cases} f(c) & c = a, \\ 0 & c = b. \end{cases} $$
For the purposes of deriving generalized Fourier transforms, these distinctions are immaterial, as we will be considering an unbounded domain of integration.

Generalized Fourier transforms

By the sifting property,
$$ \mathcal{F}[\delta(t)] = \int_{-\infty}^{\infty} \delta(t)\, e^{-i\omega t}\,dt = 1, \tag{B.32} $$

² One can show that this convention is consistent with Stratonovich's interpretation of the nonlinear Langevin equation.

and
$$ \mathcal{F}[\delta(t - t_0)] = \int_{-\infty}^{\infty} \delta(t - t_0)\, e^{-i\omega t}\,dt = e^{-i\omega t_0}. $$
In and of themselves, these transform pairs are unremarkable. By using the duality property (Eq. B.31), however, we arrive at the decidedly more exotic pairs
$$ 1 \Longleftrightarrow 2\pi\,\delta(\omega), \qquad e^{i\omega_0 t} \Longleftrightarrow 2\pi\,\delta(\omega - \omega_0). $$
The trigonometric functions cos ω₀t and sin ω₀t can be written as a sum of complex exponentials, leading to the following Fourier transforms:
$$ \mathcal{F}[\cos \omega_0 t] = \pi\left[ \delta(\omega - \omega_0) + \delta(\omega + \omega_0) \right], \qquad \mathcal{F}[\sin \omega_0 t] = i\pi\left[ \delta(\omega + \omega_0) - \delta(\omega - \omega_0) \right]. $$
The Fourier transform can therefore be thought of as a portrait of the frequency content of the signal f(t). A peak at some frequency ω₀ indicates a dominant oscillatory component in the signal; conversely, a flat spectrum indicates very little structure in the signal (as is the case for white noise).

B.3.3 z-Transform

The z-transform is essentially a discrete form of the Laplace transform. Formally, given a sequence {xₖ}∞₋∞ of complex numbers xₖ, the z-transform of the sequence is defined as
$$ Z\{x_k\}_{-\infty}^{\infty} = X(z) = \sum_{k=-\infty}^{\infty} \frac{x_k}{z^k}, $$
whenever the sum exists. A sequence is called causal if xₖ = 0 for k < 0. For a causal sequence, then, the z-transform reduces to
$$ X(z) = \sum_{k=0}^{\infty} \frac{x_k}{z^k}. $$
The z-transform is useful for solving linear difference equations, such as one finds in trying to solve for the equilibrium probability distribution of a random walk on a discrete lattice. It also appears in the guise of moment generating functions, again used to solve linear difference equations.
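The "portrait of the frequency content" picture above has an exact discrete counterpart in the fast Fourier transform: sampling a pure cosine and taking its discrete transform produces a single spectral peak at the signal frequency. A Python/NumPy sketch (illustration only; the sampling parameters are arbitrary choices of mine):

```python
import numpy as np

fs, T = 200.0, 10.0                  # sampling rate (Hz) and record length (s)
t = np.arange(0, T, 1.0 / fs)
f0 = 13.0                            # frequency of the test signal
signal = np.cos(2 * np.pi * f0 * t)

spectrum = np.abs(np.fft.rfft(signal))         # one-sided spectrum (real signal)
freqs = np.fft.rfftfreq(len(t), 1.0 / fs)
peak_freq = freqs[np.argmax(spectrum)]
```

Because f0 falls exactly on a frequency-grid point here, the peak is a single sharp line; an off-grid frequency would leak into neighbouring bins, the discrete shadow of the delta functions above.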

B.4 Partial differential equations

Partial differential equations are equations characterizing the behaviour of a multivariable function in terms of partial derivatives. They are notoriously difficult to solve, particularly if the equation is nonlinear in the function of interest. For the purposes of this course, we will only consider linear partial differential equations, which will be susceptible to solution through a variety of simple methods. For details, consult a text devoted to the solution of partial differential equations, for example the reasonably-priced
• S. J. Farlow, Partial Differential Equations for Scientists and Engineers (Dover, 1993).

B.4.1 Method of characteristics

Notice that the chain rule applied to the function u(x(s), t(s)) of parameterized curves (x(s), t(s)),
$$ \frac{du}{ds} = \frac{\partial u}{\partial x}\frac{dx}{ds} + \frac{\partial u}{\partial t}\frac{dt}{ds}, $$
implies that the first-order partial differential equation
$$ a(x, t, u)\frac{\partial u}{\partial x} + b(x, t, u)\frac{\partial u}{\partial t} = c(x, t, u) $$
can be written as a system of ordinary differential equations,
$$ \frac{dx}{ds} = a(x(s), t(s), u(s)), \qquad \frac{dt}{ds} = b(x(s), t(s), u(s)), \qquad \frac{du}{ds} = c(x(s), t(s), u(s)). $$
The method of characteristics is only applicable to first-order partial differential equations, although these occur in application when the moment generating function is used to transform a master equation with linear transition rates (see Section 4.1.1).

B.4.2 Separation of variables

Partial differential equations with particular underlying symmetry governing the function u(x, t) can be transformed to a coupled system of ordinary differential equations by assuming a solution of the form u(x, t) =

For example, the diffusion equation,

∂u/∂t = D ∂²u/∂x²,

is separable. Substitution of a solution of the form u(x, t) = X(x)T(t) yields two ordinary differential equations for X(x) and T(t). It was in this context that Fourier derived his famous series.

B.4.3 Transform methods

The transform methods, Laplace and Fourier, are particularly powerful methods for solving partial differential equations, because they convert derivatives into algebraic expressions involving the transformed functions. Laplace transforms are used to transform variables with domain defined on the positive half-line – for example, Laplace transform of u(x, t) with t > 0 yields an ordinary differential equation for Û(x, s). Fourier transforms, on the other hand, are used when the function of interest has an unbounded domain, because of the unbounded support of the transform integral – for example, Fourier transform of u(x, t) with −∞ < x < ∞ yields an ordinary differential equation for Û(ω, t). The limitations are that they only work if the coefficients are constant in the dependent variable, and that inversion back to the original problem space can be difficult (or impossible).

B.5 Some Results from Asymptotics and Functional Analysis

Occasionally, use is made of some important results from functional analysis and asymptotic analysis. We will use only basic results that can be proved without much effort.

B.5.1 Cauchy-Schwarz Inequality

If f(t) and g(t) are any real-valued functions, then it is obviously true that for any real constant λ ∈ R,

∫_L^U [f(t) + λ g(t)]² dt ≥ 0,

since the integrand is nowhere negative. For fixed limits of integration U and L (possibly infinite), the value of each integral is simply a constant. Expanding this expression, we then have,

λ² ∫_L^U g²(t) dt + 2λ ∫_L^U f(t) g(t) dt + ∫_L^U f²(t) dt ≥ 0.   (B.33)

In terms of our original integrals, call them a, b, and c,

h(λ) = aλ² + 2bλ + c ≥ 0,   (B.34)

with,

a = ∫_L^U g²(t) dt,   b = ∫_L^U f(t) g(t) dt,   c = ∫_L^U f²(t) dt.   (B.35)

The condition h(λ) ≥ 0 means a plot of h(λ) must lie above the λ-axis and cannot cross it. At most, it may touch the axis, in which case we have the double root λ = −b/a. For h(λ) above the λ-axis, we must have that the roots of h(λ) are imaginary. From the quadratic formula,

h(λ) ≥ 0 ⇔ b² ≤ ac.   (B.36)

In terms of our original integrals, Eq. B.36 reduces to,

[∫_L^U f(t) g(t) dt]² ≤ [∫_L^U f²(t) dt] [∫_L^U g²(t) dt].   (B.37)

This is the Cauchy-Schwarz inequality.

B.5.2 Watson's Lemma

For 0 < b < ∞, suppose

f(t) ∼ f0 t^α + f1 t^β as t → 0⁺,   with −1 < α < β;

then the estimate for the following integral holds,

∫_0^b f(t) e^{−xt} dt ∼ f0 Γ(1 + α)/x^{1+α} + f1 Γ(1 + β)/x^{1+β},   as x → ∞.   (B.38)
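A numerical sanity check of the estimate (an illustrative sketch, with f(t) = √t so that f0 = 1, α = 1/2 and no second term; the values of x and b are assumptions, not from the text):

```python
import math

# Midpoint-rule approximation of the Laplace-type integral in Watson's
# lemma, compared with the leading-order estimate Gamma(1+a)/x^(1+a).

def watson_integral(x, b=1.0, n=200_000):
    dt = b / n
    return sum(math.sqrt((i + 0.5) * dt) * math.exp(-x * (i + 0.5) * dt)
               for i in range(n)) * dt

x = 50.0
numeric = watson_integral(x)
estimate = math.gamma(1.5) / x**1.5   # f0 * Gamma(1 + alpha) / x^(1 + alpha)
print(numeric / estimate)             # ≈ 1 for large x
```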

More generally, for −∞ < a < b < ∞, suppose that near the minimum t0 of g(t),

f(t) ∼ f0 (t − t0)^α   and   g(t) ∼ g0 + g1 (t − t0)^λ,   λ > 0, g1 ≠ 0;   (B.39)

then the estimate for the following integral holds,

∫_a^b f(t) e^{−x g(t)} dt ∼ (2 f0/λ) Γ((1 + α)/λ) [1/(x g1)]^{(1+α)/λ} e^{−x g0},   as x → ∞,   (B.40)

where the minimum of g(t) occurs at t0 (a < t0 < b). If the minimum t0 should fall on either of the end points a or b, then the factor of 2 no longer appears; i.e. for t0 = a or t0 = b,

∫_a^b f(t) e^{−x g(t)} dt ∼ (f0/λ) Γ((1 + α)/λ) [1/(x g1)]^{(1+α)/λ} e^{−x g0},   as x → ∞.   (B.41)
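The interior-minimum estimate can likewise be checked numerically; an illustrative sketch (values assumed, not from the text) with f(t) = 1 (f0 = 1, α = 0) and g(t) = t² on [−1, 1] (g0 = 0, g1 = 1, λ = 2, t0 = 0), for which the estimate reduces to √(π/x):

```python
import math

# Midpoint-rule value of int_{-1}^{1} exp(-x t^2) dt versus the
# interior-minimum estimate (2/lambda) Gamma(1/2) x^{-1/2} = sqrt(pi/x).

def laplace_integral(x, n=100_000):
    dt = 2.0 / n
    return sum(math.exp(-x * (-1.0 + (i + 0.5) * dt)**2)
               for i in range(n)) * dt

x = 40.0
numeric = laplace_integral(x)
estimate = math.sqrt(math.pi / x)
print(numeric / estimate)  # ≈ 1 for large x
```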

B.5.3 Stirling's Approximation

Stirling's approximation is,

n! ≈ √(2πn) · n^n e^{−n}   for n ≫ 1,   (B.42)

which is incredibly accurate. To justify the expression, we write the factorial in its equivalent form as an improper integral,

n! = ∫_0^∞ x^n e^{−x} dx.   (B.43)

Note that the integrand F(x) = x^n e^{−x} is a sharply-peaked function of x, and so we seek an approximation of the integrand near the maximum. Actually, F(x) is so sharply-peaked that it turns out to be more convenient to consider the logarithm ln F(x). It is straightforward to show that ln F(x) (and hence, F(x)) has a maximum at x = n. We make the change of variable x = n + ε, and expand ln F(x) in a power-series in ε,

ln F = n ln x − x = n ln(n + ε) − (n + ε) ≈ n ln n − n − ε²/(2n).   (B.44)

Or, in terms of the original integrand,

F ≈ n^n e^{−n} e^{−ε²/(2n)}.   (B.45)

From the integral for the factorial, Eq. B.43,

n! ≈ ∫_{−n}^{∞} n^n e^{−n} e^{−ε²/(2n)} dε ≈ n^n e^{−n} ∫_{−∞}^{∞} e^{−ε²/(2n)} dε,   (B.46)

where we have replaced the lower limit of integration by −∞. The remaining integral is simply the integral of a Gaussian probability distribution. Therefore,

n! ≈ √(2πn) · n^n e^{−n}   for n ≫ 1,   (B.47)

as above.
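A three-line Python check (illustrative, not part of the notes) of how accurate the approximation already is at modest n:

```python
import math

# Stirling's approximation, Eq. B.47, compared against the exact factorial.

def stirling(n):
    return math.sqrt(2 * math.pi * n) * n**n * math.exp(-n)

for n in (5, 10, 20):
    print(n, stirling(n) / math.factorial(n))  # ratio approaches 1 as n grows
```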

Suggested references

One of the finest books on general mathematical methods is

• M. L. Boas, Mathematical Methods in the Physical Sciences, (Wiley, 2005).

It has gone through several editions, and no harm comes from picking up a cheap used copy of an earlier edition. For asymptotic and perturbation methods,

• C. M. Bender and S. A. Orszag, Advanced Mathematical Methods for Scientists and Engineers: Asymptotic Methods and Perturbation Theory, (Springer, 1999),

is excellent. It contains a huge survey of methods by masters of the field. It, too, has gone through various editions and can be picked up used.

**Appendix C: Itô Calculus**
**

Itô's calculus is used very commonly in financial applications of stochastic processes, presumably because it allows very concise proofs of boundedness, continuity, etc., that would be difficult or impossible using Stratonovich's interpretation of a white-noise stochastic differential equation. A very useful formula developed by Itô is his rule for a change of variables. We will derive the change of variables formula and other related properties of the Itô stochastic integral in this brief appendix.

C.1

**Itô's stochastic integral**
**

We would like to assign a meaning to the integral ∫_{t0}^{t} G(t′) dW(t′), where dW(t) is the increment of a Wiener process, and G(t) is an arbitrary function. Following the formulation of the Riemann-Stieltjes integral, we divide the domain into n subintervals [t0, t1], [t1, t2], …, [tn−1, t], and choose intermediate points τi such that τi ∈ [ti−1, ti]. The integral ∫_{t0}^{t} G(t′) dW(t′) is then defined as the (mean-square) limit of the partial sum,

Sn = Σ_{i=1}^{n} G(τi) [W(ti) − W(ti−1)],

as n → ∞. In contrast with the Riemann-Stieltjes integral, however, the limit depends upon the choice of intermediate point τi! One can show that the two interpretations of the stochastic integral discussed in Section 7.4.1 correspond to particular choices of τi – for the Itô interpretation, τi is the initial point in the interval, τi = ti−1, while the Stratonovich interpretation amounts to choosing τi as the mid-point, τi = (ti + ti−1)/2.

C.1.1 Example – ∫_0^t W(t′) dW(t′)

Writing Wi ≡ W(ti) and ∆Wi ≡ Wi − Wi−1, in the Itô interpretation (τi = ti−1),

Sn = Σ_{i=1}^{n} Wi−1 ∆Wi
   = (1/2) Σ_{i=1}^{n} [(Wi−1 + ∆Wi)² − (Wi−1)² − (∆Wi)²]
   = (1/2) [W²(t) − W²(t0)] − (1/2) Σ_{i=1}^{n} (∆Wi)².   (C.1)

The mean-square limit of the last term can be computed: since

⟨Σ_i ∆Wi²⟩ = t − t0,   (C.2)

we find,

⟨[Σ_i (Wi − Wi−1)² − (t − t0)]²⟩ = 2 Σ_i (ti − ti−1)² → 0,   (C.3)

as n → ∞. Or, equivalently,

∫_{t0}^{t} W(t′) dW(t′) = (1/2) [W²(t) − W²(t0) − (t − t0)].   (ITÔ)   (C.4)

In the Stratonovich interpretation, one can show that (Exercise 2),

∫_{t0}^{t} W(t′) dW(t′) = ms-lim_{n→∞} Σ_i [(W(ti) + W(ti−1))/2] ∆Wi
                       = (1/2) [W²(t) − W²(t0)].   (STRATONOVICH)   (C.5)

For an arbitrary function G(t), the Stratonovich integral has no relationship whatever to the Itô integral; unless, of course, G(t) is related to a stochastic differential equation – then there is an explicit relationship, as explained in Section 7.4.1.
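The two partial sums are easy to compare by direct simulation (a seeded Python sketch, not part of the text): the left-endpoint sum converges to the Itô value of Eq. C.4, while the endpoint-average sum telescopes exactly to the Stratonovich value of Eq. C.5.

```python
import random, math

# Partial sums for int W dW with the left-endpoint (Ito) and
# endpoint-average (Stratonovich) choices of tau_i, on [0, t] with t0 = 0.
# Seeded for reproducibility; step count is an illustrative assumption.

def sample_sums(t=1.0, n=2000, seed=1):
    rng = random.Random(seed)
    dt = t / n
    W = ito = strat = 0.0
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))
        ito += W * dW                       # G evaluated at t_{i-1}
        strat += (W + (W + dW)) / 2 * dW    # average of interval endpoints
        W += dW
    return ito, strat, W

ito, strat, W = sample_sums()
print(ito - (W**2 / 2 - 1.0 / 2))  # ≈ 0   (Eq. C.4)
print(strat - W**2 / 2)            # 0 up to roundoff (Eq. C.5, telescoping)
```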


C.2 Change of Variables Formula

Computations using Itô's calculus are greatly facilitated by the following identities,

dW²(t) = dt   and   dW^{2+N}(t) = 0   (N > 0).

That is, written more precisely,

∫_{t0}^{t} G(t′) [dW(t′)]^{2+N} = ∫_{t0}^{t} G(t′) dt′   for N = 0,
∫_{t0}^{t} G(t′) [dW(t′)]^{2+N} = 0                     for N > 0.

These relations lead to a generalized chain rule in the Itô interpretation. For example,

d{exp[W(t)]} = exp[W(t) + dW(t)] − exp[W(t)]
             = exp[W(t)] (dW(t) + (1/2) dW²(t))
             = exp[W(t)] (dW(t) + (1/2) dt).

Written in general, for an arbitrary function f[W(t), t],

df[W(t), t] = (∂f/∂t + (1/2) ∂²f/∂W²) dt + (∂f/∂W) dW(t).   (C.6)

Suppose we have a function X(t) characterized by the Langevin equation,

dX(t) = a(X, t) dt + b(X, t) dW(t),   (C.7)

and φ(X) is a function of the solution of this stochastic differential equation. Using the generalized chain rule, Eq. C.6, and the Langevin equation, Eq. C.7, we obtain Itô's change of variables formula:

dφ(X(t)) = φ(X(t) + dX(t)) − φ(X(t))
= φ′(X(t)) dX(t) + (1/2) φ′′(X(t)) [dX(t)]² + ···
= φ′(X(t)) [a(X(t), t) dt + b(X(t), t) dW(t)] + (1/2) φ′′(X(t)) b²(X(t), t) [dW(t)]² + ···
= [a(X(t), t) φ′(X(t)) + (1/2) b²(X(t), t) φ′′(X(t))] dt + b(X(t), t) φ′(X(t)) dW(t).   (C.8)
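A quick numerical check of the formula (an illustrative seeded Python sketch, with an assumed test equation): for dX = σX dW and φ(X) = ln X, the change of variables formula gives d(ln X) = −(σ²/2) dt + σ dW, so along each path ln X(t) should track σW(t) − σ²t/2.

```python
import random, math

# Euler simulation of the Ito SDE dX = sigma X dW, compared pathwise with
# the prediction of the change-of-variables formula for phi(X) = ln X.
# sigma, t, and the step count are illustrative assumptions.

def compare_path(sigma=0.5, t=1.0, n=20_000, seed=5):
    rng = random.Random(seed)
    dt = t / n
    X, W = 1.0, 0.0
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))
        X += sigma * X * dW      # Euler step of the Ito SDE (a = 0, b = sigma X)
        W += dW
    return math.log(X), sigma * W - sigma**2 * t / 2

lnX, formula = compare_path()
print(abs(lnX - formula))  # small (Euler discretization error only)
```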

Notice that if φ(X) is linear in X(t), then the anomalous term,

(1/2) b²(X(t), t) φ′′(X(t)),

vanishes, and the change of variables formula reduces to the chain rule of ordinary (Stratonovich) calculus.

C.2.1 Example – Calculate ⟨cos[η(t)]⟩

Suppose we have white noise governed by the stochastic differential equation

dη = Γ dW,   η(0) = 0,

i.e. ⟨η(t)⟩ = 0 and ⟨η(t)η(t′)⟩ = Γ² δ(t − t′), and we would like to compute ⟨cos[η(t)]⟩. Using Itô's change of variables, or the generalized chain rule, we arrive at the equation governing cos[η(t)],

d cos η = −(Γ²/2) cos η dt − Γ sin η dW,

since a(X) = 0 and b(X) = Γ in Eq. C.8. Taking the average, the last term on the right-hand side vanishes, leaving behind the ordinary differential equation

d⟨cos η⟩/dt = −(Γ²/2) ⟨cos η⟩,   ⟨cos η(0)⟩ = 1.

The solution is simply,

⟨cos η⟩ = e^{−Γ² t/2}.

Suggested References

This chapter is taken from Gardiner's Handbook,

• Handbook of stochastic methods (3rd Ed.), C. W. Gardiner (Springer, 2004),

which contains a large collection of examples illustrating the range and applicability of Itô calculus.
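The decay ⟨cos η⟩ = e^{−Γ²t/2} derived in the example above can be checked by direct simulation; a seeded Python sketch (illustrative parameters, not part of the notes):

```python
import random, math

# Monte-Carlo estimate of <cos eta(t)> for d(eta) = Gamma dW, eta(0) = 0,
# compared against exp(-Gamma^2 t / 2). Gamma = t = 1 here by assumption.

def mean_cos(gamma=1.0, t=1.0, n_steps=20, n_paths=20_000, seed=2):
    rng = random.Random(seed)
    dt = t / n_steps
    total = 0.0
    for _ in range(n_paths):
        eta = 0.0
        for _ in range(n_steps):
            eta += gamma * rng.gauss(0.0, math.sqrt(dt))  # increment Gamma dW
        total += math.cos(eta)
    return total / n_paths

print(mean_cos(), math.exp(-0.5))  # estimate vs exact e^{-Gamma^2 t/2}
```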

Exercises

1. Fill in the details in the derivation of Eq. C.4 from Eq. C.1.

2. Prove Eq. C.5.

3. Repeat the calculation of ⟨cos[η(t)]⟩ using the average of the cumulant generating function (p. 238) instead of Itô's change of variable formula. Repeat the calculation of ⟨cos[F(t)]⟩ for the Ornstein-Uhlenbeck process F(t) with autocorrelation function,

⟨F(t)F(t − τ)⟩ = (σ²/2τc) exp(−|τ|/τc).

Could you use Itô's formula for Gaussian coloured noise (i.e. noise with nonzero correlation time)? Could you use the cumulant generating function for Gaussian coloured noise?

4. Milstein simulation algorithm: Using Itô's change of variables formula, and neglecting terms of O(∆t²), justify Milstein's scheme for simulating the Langevin equation (interpreted in the Itô sense),

dy = A(y, t) dt + c(y, t) dW(t),

namely,

y(t + ∆t) = y(t) + A(y, t)∆t + n1 · c(y, t)√∆t + (∆t/2) · c(y, t) · cy(y, t) · (n1² − 1),   (C.9)

where cy ≡ ∂c/∂y and the ni are independent samples of a unit Normal distribution N(0, 1). Hint: show that, to the same order,

∫_0^{∆t} [c(y(t), t) − c(y(0), 0)] dW(t) ≈ c(y(0), 0) · cy(y(0), 0) · [W²(∆t) − ∆t]/2.
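Milstein's update rule is easy to exercise numerically without doing the derivation; a seeded Python sketch (with an assumed test equation, not part of the notes) for the choice A = 0, c(y) = y (so cy = 1), compared against the exact solution y(t) = y(0) exp[W(t) − t/2] driven by the same increments:

```python
import random, math

# Milstein update for dy = y dW: y -> y + y dW + (1/2) y (dW^2 - dt),
# compared pathwise against the exact solution exp(W - t/2).

def milstein_vs_exact(t=1.0, n=1000, seed=6):
    rng = random.Random(seed)
    dt = t / n
    y, W = 1.0, 0.0
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))
        y += y * dW + 0.5 * y * (dW * dW - dt)  # Milstein correction term
        W += dW
    return y, math.exp(W - t / 2)

y, exact = milstein_vs_exact()
print(abs(y / exact - 1))  # small: the scheme is strong order 1
```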

Appendix D

Sample Matlab code

In this chapter, there are several examples of code used to generate simulation data shown in the main notes. The routines are not fully annotated, and some familiarity with Matlab is assumed. If you are not familiar with Matlab, there are several good references, both print and web-based, including the following:

• Numerical methods using MATLAB, J. H. Mathews and K. D. Fink (Pearson, 2004).
• http://www.cs.ubc.ca/spider/cavers/MatlabGuide/guide.html

The most conspicuous absence is a lack of fully-assembled code – only small steps in the simulation algorithms are shown. This is partly to allow the reader to assemble full programs in their own style, and partly because full routines require many tedious lines of code for pre-allocating memory, plotting results, etc., that only detract from the presentation of the algorithms.

D.1 Examples of Gillespie's direct method

• D. T. Gillespie (1977) "Exact stochastic simulation of coupled chemical reactions," Journal of Physical Chemistry 81: 2340.

D.1.1 Defining the propensity vector and stoichiometry matrix

For a system with N reactants and M reactions, we define two functions:

1. The first is run at the initialization of the program, and returns the transpose of the stoichiometry matrix Smat (an M × N matrix):

function Smat = defineReactions(N,M)
%Generates the stoichiometry matrix
Smat = zeros(M,N);
% reactant       1    2   ...   N
Smat(1,:) = [    1   -1   ...   0 ];
Smat(2,:) = [   -1    1   ...  -2 ];
...
Smat(M,:) = [    2    0   ...   1 ];

2. The second takes as an argument the present state of the system X and the system size OMEGA, returning a vector of the reaction rates vVec (an M × 1 vector). The microscopic reaction rates νj are written in terms of the reactant concentrations (see p. 72). The OMEGA multiplying each rate νj converts the units from concentration per time to per time:

function vVec=vVecIn(X,OMEGA)
% Function returning vector of M vVec which are functions of
% N reactants X
% Writing the reactions in terms of concentration
Xc=X/OMEGA;
vVec(1) = OMEGA*ν1(Xc);
vVec(2) = OMEGA*ν2(Xc);
...
vVec(M) = OMEGA*νM(Xc);

D.1.2 Core stochastic simulation algorithm

Once the stoichiometry matrix and propensity vector are defined, the core simulation algorithm is straightforward. The current state of the system X, the current time T, the stoichiometry matrix Smat and the system size Ω are input, and the updated state and time, Xnew and Tnew, are output.

function [Xnew,Tnew]=CoreSSA(X,T,Smat,OMEGA)

[M,N]=size(Smat); % N reactants, M reactions

%Step 1: Calculate a_mu & a_0
a_mu = vVecIn(X,OMEGA);
a_0 = sum(a_mu);

%Step 2: Calculate tau and mu using random number generators
% Tau is the *time* the next reaction completes and mu is the *index* of
% the next reaction.
r1 = rand;
tau = (1/a_0)*log(1/r1);
r2 = rand;
next_mu = find(cumsum(a_mu)>=r2*a_0,1,'first');

%Step 3: Update the system:
% carry out reaction next_mu
prod = Smat(next_mu,1:N); % Stoichiometry of update
for i=1:N
    Xnew(i) = X(i)+prod(i);
end

% update the time
Tnew = T + tau;

D.1.3 Example – the Brusselator

The Brusselator example (Section 4.2) is a simple model that exhibits limit cycle and stable behaviour over a range of parameter values.

The reactions are,

∅ → x (rate 1),   2x + y → 3x (rate a),   x → y (rate b),   x → ∅ (rate 1).

In the Gillespie algorithm, the propensity vector is coded as,

function vVec=vVecIn(X,OMEGA)
global a b % if your parameter values are declared as global variables
x=X(1)/OMEGA; xm1=(X(1)-1)/OMEGA; y=X(2)/OMEGA;
vVec(1) = OMEGA*(1);
vVec(2) = OMEGA*(a*x*xm1*y);
vVec(3) = OMEGA*(b*x);
vVec(4) = OMEGA*(x);

The stoichiometry matrix is,

function Smat = defineReactions(N,M)
Smat = zeros(M,N);
% reactant       x    y
Smat(1,:) = [    1    0 ];
Smat(2,:) = [    1   -1 ];
Smat(3,:) = [   -1    1 ];
Smat(4,:) = [   -1    0 ];

D.2 Stochastic differential equations

The update step for a simple forward-Euler method is shown. Higher-order methods are possible – see

• Handbook of stochastic methods (3rd Ed.), C. W. Gardiner (Springer, 2004),

for details and references.

D.2.1 White noise

The Euler method for white noise is straightforward. We are simulating a sample trajectory of the process y(t) characterized by the Itô stochastic differential equation,

dy = A(y, t) dt + c(y, t) dW(t).

The input is the present state of the system (y(t), t) and the time-step dt. The output is the updated state (y(t + dt), t + dt). Notice that the white noise is generated independently at each update step – since it is uncorrelated, its history does not need to be saved – in contrast with the coloured noise algorithm below, where a past value is needed.

function [yNew,tNew]=whiteEuler(y,t,dt)
yNew=y+A(y,t)*dt+c(y,t)*dt^(1/2)*randn;
tNew=t+dt;

D.2.2 Coloured noise

We are simulating a sample trajectory of the process y(t) characterized by the stochastic differential equation,

dy/dt = A(y, t) + c(y, t) η(t),

where η(t) is coloured noise with unit variance and correlation time τc = tauC. The input is the present state of the system (y(t), t, η(t)) and the time-step dt. The output is the updated state (y(t + dt), t + dt, η(t + dt)).

function [yNew,tNew,nNew]=colouredEuler(y,t,n,dt)
rho=exp(-dt/tauC);
yNew=y+A(y,t)*dt+c(y,t)*n*dt;
tNew=t+dt;
nNew=rho*n+(1-rho^2)^(1/2)*(1/2/tauC)^(1/2)*randn;
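The whiteEuler update can be exercised on the Ornstein-Uhlenbeck equation dy = −y dt + dW(t) (i.e. A(y, t) = −y, c(y, t) = 1), whose stationary variance is 1/2; a seeded Python transcription (illustrative, not part of the notes):

```python
import random, math

# Forward-Euler integration of dy = -y dt + dW, measuring the stationary
# variance from a long time series after discarding the initial transient.
# Step size and run length are illustrative assumptions.

def ou_variance(dt=0.01, n_steps=1_000_000, skip=10_000, seed=4):
    rng = random.Random(seed)
    y = acc = acc2 = 0.0
    for i in range(n_steps):
        y += -y * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)  # whiteEuler step
        if i >= skip:                 # discard the transient
            acc += y
            acc2 += y * y
    n = n_steps - skip
    return acc2 / n - (acc / n) ** 2

print(ou_variance())  # ≈ 0.5 (the Euler scheme's fixed point is 1/(2 - dt))
```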
