
Grundlehren der

mathematischen Wissenschaften 324


A Series of Comprehensive Studies in Mathematics

Editors
S. S. Chern B. Eckmann P. de la Harpe
H. Hironaka F. Hirzebruch N. Hitchin
L. Hormander M.-A. Knus A. Kupiainen
J. Lannes G. Lebeau M. Ratner D. Serre
Ya.G. Sinai N. J. A. Sloane J.Tits
M. Waldschmidt S. Watanabe

Managing Editors
M. Berger J. Coates S.R.S. Varadhan
Springer-Verlag Berlin Heidelberg GmbH
Thomas M. Liggett

Stochastic
Interacting Systems:
Contact, Voter and
Exclusion Processes
With 6 Figures

Springer
Thomas M. Liggett
Mathematics Department
University of California
Los Angeles, CA 90095-1555
USA
email: tml@math.ucla.edu

Cataloging-in-Publication Data applied for


Die Deutsche Bibliothek - CIP-Einheitsaufnahme
Liggett, Thomas M.: Stochastic interacting systems: contact, voter and
exclusion processes / Thomas M. Liggett. - Berlin; Heidelberg; New York;
Barcelona; Hong Kong; London; Milan; Paris; Singapore; Tokyo:
Springer 1999
(Grundlehren der mathematischen Wissenschaften; 324)

Mathematics Subject Classification (1991): 60K35

ISSN 0072-7830
ISBN 978-3-642-08529-1 ISBN 978-3-662-03990-8 (eBook)
DOI 10.1007/978-3-662-03990-8

This work is subject to copyright. All rights are reserved, whether the whole or
part of the material is concerned, specifically the rights of translation, reprinting,
reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in
any other way, and storage in data banks. Duplication of this publication or parts
thereof is permitted only under the provisions of the German Copyright Law of
September 9,1965, in its current version, and permission for use must always be
obtained from Springer-Verlag. Violations are liable for prosecution under the
German Copyright Law.
© Springer-Verlag Berlin Heidelberg 1999
Originally published by Springer-Verlag Berlin Heidelberg New York in 1999.
Softcover reprint of the hardcover 1st edition 1999
Cover design: MetaDesign plus GmbH, Berlin
Typesetting: Photocomposed from the author's AMSTEX files after editing and
reformatting by Kurt Mattes, Heidelberg, using a Springer TEX macro-package
Cover design: de'blik, Berlin
SPIN: 10728278 41/3143-543210 Printed on acid-free paper
Preface

Interacting particle systems is a branch of probability theory that has rich con-
nections with a number of areas of science - primarily physics in the early days,
but increasingly biology and the social sciences today. Stochastic processes of the
sort that are studied in this field are used to model magnetism, spatial competition,
tumor growth, spread of infection, and certain economic systems, to mention but
a few of the many areas of application.
The subject is by now about thirty years old. At the midpoint of that thirty year
period, I wrote the book Interacting Particle Systems (IPS) as an attempt to give
some order to the work that had been done by then, and to make the field more
accessible to new researchers, and more useful to workers in areas of application.
Judging from the rapid development of the field since then, this attempt appears
to have been successful.
My earlier book covered more or less the entire field, as it was at that time.
Even so, some topics, such as zero range processes and the then emerging area
of hydrodynamics, were mentioned only briefly. By now, the field has grown
to the point where it would be impossible to cover it entirely in one book. In
fact, a number of books that treat special topics within the field have appeared in
the interim - see for example Chen (1992), De Masi and Presutti (1991), Durrett
(1988), Kipnis and Landim (1999), Konno (1994), and Spohn (1991).
IPS was organized horizontally, in that a separate chapter was devoted to each
type of model: stochastic Ising models, voter models, contact processes, nearest
particle systems, exclusion processes, and linear systems. The present book has a
more vertical appearance. It takes but three of these models - the ones given in
the title - and traces their development since 1985. Nearest particle systems are
omitted because, even though substantial progress has been made on them since
1985 (especially by T. Mountford), they are by their nature somewhat special.
Linear systems are omitted because they have been less active recently, while
stochastic Ising models are omitted because developments in that area alone would
justify an entire book.
Even my relatively modest objective of covering recent work on three models
cannot be attained in a book of reasonable size, so I have had to make some
choices about what to include. These choices reflect to some extent, of course,
my own interests and perspective on the field. Other work on these models is
described briefly in the Notes and References section for each of the three parts.

I have tried to make the treatment as self-contained as possible without dupli-


cating too much of the contents of IPS. The initial section on background material,
as well as the initial Preliminaries section of each of the three main parts, should
help in this regard.
This book is an outgrowth of my Wald Memorial Lectures - see Liggett (1997)
- that also dealt primarily with contact, voter and exclusion processes. One of the
advantages of this selection of topics is that it provides illustrations of the use of
some of the most important tools in the area: percolation and graphical techniques
(Part I), correlation inequalities (Part I), duality (Parts I and II), coupling (Parts I
and III), and partial differential equations (Part III). It should not be expected that
many models that come up in applications will fit exactly into one of the three
classes we consider here. The hope, rather, is that a good understanding of the
behavior of these classes and of the tools used in their analysis will facilitate the
analysis of new models that arise.
It should be clear from the above comments that the present book is in no
sense a second edition of IPS. It is also not really a second volume. I hope it will
play a role similar to that of IPS, though, as a reference for workers in probability
and areas of application, as well as an advanced text. There is plenty of material
in it for a semester course, and by adding lectures based on the papers discussed
in the Notes and References sections, it can easily be extended to a full year. The
mathematical prerequisites for reading this book are year-long courses in analysis
and probability - a probability course based on Durrett (1996), for example.
We turn now to a brief survey of the contents of this book. The contact process
has been one of the central models in the subject since its introduction by T. Harris
over twenty years ago. The theory as of 1985 was primarily one dimensional. Very
little was known in higher dimensions. Developments during the past decade have
therefore rendered much of Chapter VI of IPS largely out of date. The primary
exception to this is the first section, on critical value bounds. Here only minor
improvements have been made. As a consequence, our treatment of the contact
process in the first part of this book starts almost from the beginning. The critical
value bounds from IPS are stated in the Preliminaries section. While the proofs
from IPS are not repeated, a more elaborate version of the argument plays a
dominant role in Part II of this book.
Until the early 1990's, the contact process was studied almost exclusively on
the d-dimensional integer lattice Zd. Sections 2 and 3 of Part I explain some of
these developments. Section 2 is primarily devoted to the advances by Bezuiden-
hout and Grimmett (1990, 1991) that more or less completed the Zd theory,
showing among other things that the critical contact process dies out. Section 3
is dedicated to several results that address the following natural question: Since
real systems are finite, and contact processes on finite sets die out with probability
one, how can the phase transitions that occur on infinite sets have any bearing on
our understanding of real systems?
Section 4 traces the development of the theory of contact processes on ho-
mogeneous trees. Interest in this comes from the fact that an intermediate phase

occurs in this context that is absent in the case of Zd. Briefly, the contact process
on Zd has one critical value, while the contact process on a homogeneous tree
(other than Z1) has two distinct critical values. Between these two critical values,
the finite process survives globally, but dies out locally. Unlike Zd, the tree is
large enough that the infected set can wander out to infinity without dying out,
but this can only happen for intermediate values of the infection parameter.
The story is quite different for voter models. The voter models discussed in
Chapter V of IPS are what are now known as linear voter models. Their ergodic
theory was more or less completed in IPS. While significant progress has been
made on linear voter models since then, and is discussed in the Notes and Refer-
ences section, the focus of Part II is on their nonlinear cousins. Nonlinear voter
models require quite a different approach, primarily because their duals (when they
exist) are harder to analyze. While the theory of nonlinear voter models is still
very far from being complete, there are close connections to the contact process,
and this makes it a natural candidate for inclusion in this book. The main theorem
in Part II gives a complete classification of threshold voter models with threshold
level = 1. The proof given there is a substantial improvement over my original
treatment, which was computer aided and contained a serious error.
The situation for exclusion processes is again different. The material in Chapter
VIII of IPS has in general not been superseded by subsequent developments.
However, there is a whole new collection of issues that have been investigated,
and it is to these that we address our attention in the final part of this book.
We again omit a treatment of the by now mature area of hydrodynamics, partly
because it is well covered in the books of De Masi and Presutti (1991), Spohn
(1991), and Kipnis and Landim (1999), and partly because it has quite a different
flavor from the topics we will cover here.
The first main section of Part III gives a probabilistic treatment of shocks in the
asymmetric, nearest neighbor, exclusion process in one dimension, based on work
by Ferrari and his coauthors. The main technique used here is coupling. Then we
move to a more analytic treatment of roughly the same issues that was developed
by Derrida and his coworkers. This is known as the matrix approach. Finally,
we turn to central limit theorems for tagged particles in more general exclusion
processes, based on work of Varadhan and coauthors. IPS has a treatment of
this only in the case of the symmetric, nearest neighbor, one-dimensional system,
which has a different behavior than the general system considered here.
The Background and Tools section at the beginning of the book describes the
basic particle system setup, and some of the key techniques that are useful in the
analysis of many models - coupling, monotonicity, correlation inequalities and
subadditivity, for example. These first few subsections should be read before ven-
turing into the book proper, but the latter subsections can be skipped, and read
when they are used later on. Each of the three parts begins with a brief description
of that particular model, and gives precise statements of results from the corre-
sponding chapters of IPS (or from other references) that are used later. With this
exception, the numbered sections within each part are largely self-contained. Each

part ends with a Notes and References section that has two functions. First, it
details the sources of the material in that part. Secondly, it contains brief descrip-
tions of the large amount of related work that I have not been able to include in
the book itself.
While this book is more or less self-contained, the reader may find that reading
parts of IPS first makes the going easier. Here are my suggestions about what parts
of IPS to read in this case:
(a) The first four sections of Chapter I and the first three sections of Chapter II
before starting this book.
(b) The first three sections of Chapter VI before reading Part I.
(c) The first two sections of Chapter V before reading Part II.
(d) The first three sections of Chapter VIII before reading Part III.
A popular (I think) feature of IPS was its sets of open problems. I have not
attempted to do anything formal of this sort here. There are simply too many
open problems, and many of them are not directly about the three types of models
I treat here, but rather about other models that are nevertheless closely related
to contact, voter and exclusion processes. However, I do mention problems that
I think should be looked at when they arise naturally, mainly in the Notes and
References sections.
As I mentioned in the preface to IPS, my wife Chris had a lot to do with my
writing that book. For the last several years, she has been lobbying for a follow-up.
It took a while, but she finally got it. In the earlier preface, I mentioned some of
the people who had had the most impact on my work, as well as on the subject as
a whole. Most have continued to be leaders in the field, but they have now been
joined by a large and impressive group of younger mathematicians. I won't list
them here, but most appear prominently in the bibliography. One of the measures
of a field of research is the caliber of researcher that it attracts. By this measure,
interacting particle systems has been a great success.
Pablo Ferrari, Norio Konno, Tom Mountford, Roberto Schonmann, and espe-
cially my former students, Amber Puha and Li-Chau Wu, have read parts of this
book, and made suggestions for improvement - I very much appreciate their input.
I would like to acknowledge the National Science Foundation for its support of
my work over the past quarter of a century, and the Guggenheim Foundation for
freeing my time in 1997-98, so that I could devote much of it to writing this book.
Without their support, this work would not have been possible.

Los Angeles, CA Thomas M. Liggett


March 1, 1999
Contents

Background and Tools


The Processes
Invariant Measures 4
Reversible Measures 5
Coupling, Monotonicity and Attractiveness 6
Correlation Inequalities 8
Duality ...... . 11
Subadditivity 12
Oriented Percolation 13
Domination by Product Measures 14
Renewal Sequences and Logconvexity 16
Translation Invariant Measures 21
Some Ergodic Theory 22
Branching Processes 25
Some Queuing Theory 26
The Martingale CLT 29

Part I. Contact Processes 31


1. Preliminaries 31
Description of the Process 31
The Graphical Representation; Additivity 32
The Upper Invariant Measure 34
Duality ........... . 35
Convergence ........ . 36
Monotonicity and Continuity in λ 38
Rate of Growth . . . . . . . . . . 40
Survival and Extinction; Critical Values 42
Preview of Part I . . . . . . . . . . . . 44
2. The Process on the Integer Lattice Zd 44
The Boundary of a Big Box Has Many Infected Sites 45
The Finite Space-Time Condition ........ . 50
Comparison with Oriented Percolation ..... . 51
First Consequences of the Percolation Comparison 54

Exponential Bounds in the Supercritical Case 57


Exponential Decay Rates in the Subcritical Case 60
A Critical Exponent Inequality 69
3. The Process on {1, ..., N}d 71
The Subcritical Case . 72
The Supercritical Case ... 74
4. The Process on the Homogeneous Tree Td 78
Some Critical Value Bounds . . . . . . . 79
Branching Random Walk . . . . . . . 80
Back to the Contact Process - the Function φ 86
Extinction at the First Critical Value . . . . . 91
Existence of an Intermediate Phase 94
The Sequence u and its Growth Parameter β(λ) 96
The Complete Convergence Theorem 103
Continuity of the Survival Probability 104
The Growth Profile .......... 105
Invariant Measures in the Intermediate Regime - First Construction 109
Invariant Measures in the Intermediate Regime - Second Construction 119
Strict Monotonicity of β(λ) 123
5. Notes and References 125

Part II. Voter Models 139


1. Preliminaries 139
Description of the Process 139
Clustering and Coexistence 140
The Linear Voter Model .. 140
The Threshold Voter Model 142
The Graphical Representation 142
Duality when T = 1 143
Preview of Part II . . . . . . 145
2. Models with General Threshold and Range 146
Fixation for Large Thresholds ....... 146
Clustering in One Dimension . . . . . . . . 147
Coexistence; the Threshold Contact Process 151
The Threshold Contact Process with Large Range 153
The Threshold Voter Model with Large Range 155
3. Models with Threshold = 1 . . . . . . . . . . . 155
Duality for the Threshold Contact Process, T = 1 156
Reduction to One Dimension 158
The Convolution Equation 159
The Density . . . . . . . . . . 162

The Renewal Sequence . . . . . . . . . . . . . . . 167


Existence of a Nontrivial Invariant Measure 174
Nonnegativity for Sets that Contain No Singletons 180
Nonnegativity for General Sets .. 184
Strings of Length One . . . . . . . 185
Strings of Length Greater than One 191
4. Notes and References 201

Part III. Exclusion Processes 209


1. Preliminaries . . . . . . . 209
Description of the Process 209
Invariant Measures .... 210
Symmetric Systems 212
Coupling; the Graphical Representation 215
Translation Invariant Systems . 215
First and Second Class Particles 218
The Tagged Particle Process 219
Preview of Part III 220
2. Asymmetric Processes on the Integers 220
Heuristics . . . . . . . . . . . . . . . 222
Basic Assumption; Expected Results 224
Location of the Shock . . . . . . . . 225
Another View of the Shock . . . . . 226
An Invariant Measure for the Process Viewed from X t 230
The Process X t Identifies the Shock 232
The Process Zt also Identifies the Shock 234
Behavior of the Shock - First Moments 238
Behavior of the Shock - Weak Law of Large Numbers 240
Behavior of the Shock - Second Moments 242
Central Limit Behavior of the Shock 253
Dynamic Phase Transition 258
3. Invariant Measures for Processes on {1, ..., N} 261
The Matrix Approach 262
Properties of the Matrices ... 264
Examples of Matrices D and E 266
Correlation Functions 268
The Partition Function 269
The Current . . . . . . 272
The Limiting Measure 273
An Application - the Process with a Blockage 276
4. The Tagged Particle Process . . . . . . . . . . 278
The Process Viewed from the Tagged Particle; First Decomposition 278

Invariance and Ergodicity of the Environment 280


The Law of Large Numbers for Xt 284
Asymptotic Normality for M t 285
The Second Decomposition - Beginning 286
The Basic Assumption .......... 288
The Second Decomposition - Conclusion 290
Asymptotic Normality for Xt 294
The Limit Is Not Degenerate 295
5. Notes and References 298

Bibliography 317

Index 331
Background and Tools

We begin this section by setting up the basic terminology and notation to be used
in this book. Then we will discuss briefly the main foundational results and other
tools that will be used later. Many of these are taken from IPS, so the proofs will
often not be given here. Insofar as possible, we will use the notation from IPS.
The first part of this section should be read at the outset. The latter material is
more special, and can be read when it comes up later. This material appears in
roughly the order in which it is used in the rest of the book.

The Processes
The models studied in this book are continuous time Markov processes η_t with
state space X = {0, 1}^S, where S is a countable set of sites. Usually S will be Z^d
or a tree. Note that X is compact in the product topology. A configuration η ∈ X
has the following interpretations in the three cases to be considered:
Contact Processes. There is an individual (or plant, or cell, or ...) at each site
x ∈ S that is infected if η(x) = 1 and healthy if η(x) = 0.
Voter Models. There is an individual (i.e., a voter) at each site x who has possible
opinions 0 or 1 at any given time. Alternatively, each site is occupied by an
individual of one of two types, labelled 0 or 1.
Exclusion Processes. At each time, a site x is either occupied by a particle (if
η(x) = 1) or vacant (if η(x) = 0).

The dynamics of the process are specified by a collection of transition rates.


For contact and voter models, the rate at which there is a flip at x from 0 to 1
or vice versa is given by a function c(x, η) of the site x and the configuration η.
General processes with transitions of this type are called spin systems. In exclusion
processes, the states at two sites x, y change simultaneously, at a rate given by
c(x, y, η). This transition corresponds to the motion of a particle from one site to
another. The specific form that c(x, η) or c(x, y, η) takes for each of our three
models will be described in Section 1 of the corresponding part. The function c
will always be assumed to be nonnegative, uniformly bounded, and continuous as
a function of η in the product topology on X.
We will use the notation P^η for the distribution of the process with initial
configuration η. Let C(X) be the space of continuous functions on X with the
uniform norm

‖f‖ = sup_{η∈X} |f(η)|.

All the processes we will consider will have the Feller property, so that we can
define the semigroup of the process on C(X) by

S(t)f(η) = E^η f(η_t),   f ∈ C(X).

(The Feller property is just the statement S(t)f ∈ C(X) whenever f ∈ C(X).)
For η ∈ X and x, y ∈ S, define η_x and η_{x,y} by

η_x(z) = 1 − η(x)   if z = x,
         η(z)       if z ≠ x,

and

η_{x,y}(z) = η(y)   if z = x,
             η(x)   if z = y,
             η(z)   if z ≠ x, y.

Thus η_x is obtained from η by flipping the xth coordinate, while η_{x,y} is obtained
from η by interchanging the xth and yth coordinates. With the occupancy inter-
pretation of exclusion processes, the effect of this is to move a particle from x to
y (if η(x) = 1, η(y) = 0), to move a particle from y to x (if η(y) = 1, η(x) = 0),
or has no effect (if η(x) = η(y)).
These are the new configurations obtained from η following a single transition.
The intuitive meaning of the function c in each case is then

P^η(η_t = η_x) = c(x, η)t + o(t)

and

P^η(η_t = η_{x,y}) = c(x, y, η)t + o(t)   (if η(x) ≠ η(y))

as t ↓ 0. Strictly speaking, these statements are only correct if S is finite, since
otherwise the probabilities on the left are typically zero for t > 0. When S is
infinite and c(x, η) is bounded below by a positive number, there will be infinitely
many transitions in every finite time interval.
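Here is a minimal computational sketch of the maps η ↦ η_x and η ↦ η_{x,y} just defined, with a configuration on a finite site set represented as a Python dictionary; the representation and the three-site example are purely illustrative.

def flip(eta, x):
    """Return eta_x: the configuration eta with the coordinate at x flipped."""
    zeta = dict(eta)
    zeta[x] = 1 - eta[x]
    return zeta

def exchange(eta, x, y):
    """Return eta_{x,y}: the configuration eta with the values at x and y interchanged."""
    zeta = dict(eta)
    zeta[x], zeta[y] = eta[y], eta[x]
    return zeta

eta = {0: 1, 1: 0, 2: 1}        # a configuration on the site set S = {0, 1, 2}
print(flip(eta, 1))             # {0: 1, 1: 1, 2: 1}
print(exchange(eta, 0, 1))      # {0: 0, 1: 1, 2: 1}: the particle at 0 moves to 1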
More formally, the connection between the rate function c and the process η_t
is made through the generator Ω of η_t. For functions f on X that depend on
finitely many coordinates (these are known as cylinder functions), define

(B1)    Ωf(η) = Σ_x c(x, η) [f(η_x) − f(η)]

or

(B2)    Ωf(η) = Σ_{x,y} c(x, y, η) [f(η_{x,y}) − f(η)]

in the two cases. The restriction to cylinder functions is needed so that these series
will converge. In (B1), for example, there are only finitely many nonzero terms. In
(B2) the series will converge provided that c(x, y, η) satisfies natural summability
conditions.
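When S is finite, the sum in (B1) has only finitely many terms and Ωf(η) can be evaluated directly. The following sketch does this for an illustrative spin system on a ring of five sites; the rate function used here is only an example.

def flip(eta, x):
    zeta = dict(eta)
    zeta[x] = 1 - eta[x]
    return zeta

def generator(f, c, eta, sites):
    # Omega f(eta) = sum over x of c(x, eta) [f(eta_x) - f(eta)], as in (B1).
    return sum(c(x, eta) * (f(flip(eta, x)) - f(eta)) for x in sites)

sites = range(5)                          # a ring of five sites
def c(x, eta):
    if eta[x] == 1:
        return 1.0                        # flip 1 -> 0 at rate 1
    return eta[(x - 1) % 5] + eta[(x + 1) % 5]   # flip 0 -> 1 at rate = occupied neighbors

f = lambda eta: eta[0]                    # a cylinder function
eta = {x: int(x % 2 == 0) for x in sites}
print(generator(f, c, eta, sites))        # the initial rate of change of E f(eta_t)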
The fundamental construction of the process η_t is given by the following
theorem. It is a special case of Theorem 3.9 on page 27 of IPS. We will state
it only for spin systems, but the corresponding existence theorem for exclusion
processes is entirely analogous. The assumptions in that case are given in (1.1) of
Part III.

Theorem B3 (Liggett). Suppose that

(B4)    sup_{x∈S} Σ_{u∈S} sup_{η∈X} |c(x, η) − c(x, η_u)| < ∞.

Then the closure Ω̄ of the Ω defined in (B1) is the generator of a Feller Markov
process η_t on X. In particular, if f is a cylinder function, then

Ω̄f = lim_{t↓0} [S(t)f − f]/t,    Ω̄S(t)f = S(t)Ω̄f,

and u(t) = S(t)f is the unique solution to the evolution equation

(d/dt) u(t) = Ω̄ u(t),    u(0) = f.

Remarks. (a) The interpretation of condition (B4) is that in a certain uniform


sense, the transition rates at a site x do not depend very strongly on the state of
the system far from x. In particular, (B4) is automatically satisfied if the process
has finite range, in the sense that there is a constant K so that for each x ∈ S,
c(x, η) depends on η through at most K of its coordinates.
(b) The closure referred to above is the following: Consider the graph

G = {(f, Ωf), f a cylinder function} ⊂ C(X) × C(X).

Then Ω̄ is the linear operator on C(X) whose graph is the ordinary closure of the
set G. Part of the statement of the theorem is that the closure of G is the graph
of a (single valued) linear operator.

Often we will carry out some computation on a finite system, and then will
argue that the result applies to infinite systems as well. This extension will usually
not be carried out explicitly, but will be left to the reader. The extension from
finite to infinite systems is usually justified by the following result, which is a
special case of Corollary 3.14 on page 29 of IPS.

Theorem B5 (Trotter-Kurtz). Suppose c_n(x, η) and c(x, η) are transition rates
that satisfy (B4). Define Ω_n f and Ωf for cylinder functions f by (B1). Suppose
that

lim_{n→∞} Ω_n f = Ωf

for all cylinder functions f. Then the corresponding semigroups satisfy

(B6)    S(t)f = lim_{n→∞} S_n(t)f

for all f ∈ C(X) and t ≥ 0. The convergence in (B6) is uniform on bounded t
intervals.

A second type of application of Theorem B5 is to the proof that families of


models whose rates depend on some parameter have various continuity properties
as functions of that parameter. It is important to note that this will give continuity
only of quantities that depend on the process over a finite time period. For quan-
tities that depend on the entire evolution of the process, the issue of continuity
is much more subtle. See, for example, Section 1 of Part I, where this point is
discussed in the context of the contact process.
In this book, all processes with state space X = {0, 1}^S will be assumed to
have the Feller property.

Invariant Measures
Much of the study of interacting particle systems involves their invariant measures
and convergence to them. If μ is a probability measure on X, the distribution of
η_t when the initial distribution is μ is denoted by μS(t), and is defined by

∫_X f d[μS(t)] = ∫_X S(t)f dμ,   f ∈ C(X).

The fact that this relation determines μS(t) uniquely is a consequence of the Riesz
Representation Theorem (Theorem 2.14 of Rudin (1966)). The probability measure
μ is said to be an invariant measure if it satisfies μS(t) = μ for all t > 0. The
set of all invariant measures is denoted by ℐ.
The following theorem summarizes some elementary, but important, proper-
ties of ℐ. See pages 10-18 of IPS for their proofs. The topology on the set of
probability measures on X is that of weak convergence. The compactness of X
implies the compactness of the set of probability measures on X in this topology,
and this is essential for several parts of the theorem. The fact that the process
satisfies the Feller property is also crucial.

Theorem B7. Consider a Feller process on {0, 1}^S.
(a) μ ∈ ℐ if and only if

∫_X S(t)f dμ = ∫_X f dμ   for all f ∈ C(X), t ≥ 0.

(b) μ ∈ ℐ if and only if

∫_X Ω̄f dμ = 0   for all cylinder functions f.

(c) ℐ is compact, convex and nonempty.
(d) ℐ is the closed convex hull of its extreme points.
(e) If ν = lim_{t→∞} μS(t) exists for some probability measure μ, then ν ∈ ℐ.
(f) If

ν = lim_{n→∞} (1/T_n) ∫_0^{T_n} μS(t) dt

exists for some probability measure μ and some sequence T_n ↑ ∞, then ν ∈ ℐ.
(g) In the context of Theorem B5, if μ_n is invariant for the process with generator
Ω_n and μ_n → μ weakly, then μ is invariant for the process with generator Ω.

One consequence of (c) and (d) is that ℐ has at least one extreme point. The
set of all extreme points will be denoted by ℐ_e.

Reversible Measures
According to part (c) of Theorem B7, the process always has at least one invariant
measure. Sometimes an invariant measure satisfies a symmetry property known as
reversibility, and when it does, additional tools become available, and results are
generally more complete. The probability measure μ on X is said to be reversible
for the process if it satisfies

∫ f S(t)g dμ = ∫ g S(t)f dμ

for all f, g ∈ C(X). Taking g = 1 and comparing with part (a) of Theorem B7,
it is clear that every reversible measure is invariant. A discussion of reversibility
can be found in Sections 5 and 6 of Chapter II of IPS. Included there is a proof
of the analogue of part (b) of Theorem B7 for reversibility: μ is reversible if and
only if

∫ f Ωg dμ = ∫ g Ωf dμ

for all cylinder functions f, g.

The probabilistic meanings of invariance and reversibility are the following:
If μ is invariant, then the process η_t obtained by using μ as initial distribution is
stationary in time (and therefore its definition can be extended to negative times).
If, in addition, μ is reversible, then the processes η_t and η_{−t} have the same joint
distributions.
For continuous time Markov chains on a countable set S with transition rates
q(x, y), invariance of π corresponds to

Σ_x π(x) q(x, y) = 0,   y ∈ S,

while reversibility corresponds to

π(x) q(x, y) = π(y) q(y, x),   x, y ∈ S.



Comparing these two properties, one can see that the second is quite strong, and
should be expected to hold only in very special cases. For example, if π is strictly
positive, then reversibility of π implies that q(x, y) > 0 if and only if q(y, x) > 0.
Even when a measure is not reversible, quantities that one might call the
defects from reversibility,

π(x) q(x, y) − π(y) q(y, x),

can play a useful role. An example of this occurs in the proof of Theorem 3.1 of
Part III.
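The contrast between the two displays can be checked on a small example. The following sketch uses an illustrative three state chain, the cyclic chain run at rate 1, whose uniform distribution π is invariant but not reversible.

import numpy as np

Q = np.array([[-1.0,  1.0,  0.0],
              [ 0.0, -1.0,  1.0],
              [ 1.0,  0.0, -1.0]])        # q(x, y): the cyclic chain 0 -> 1 -> 2 -> 0
pi = np.array([1/3, 1/3, 1/3])

print(pi @ Q)                             # invariance: sum_x pi(x) q(x, y) = 0 for each y
flux = pi[:, None] * Q                    # flux[x, y] = pi(x) q(x, y)
print(np.allclose(flux, flux.T))          # reversibility fails: False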

Coupling, Monotonicity and Attractiveness


Many arguments and techniques in the area of interacting particle systems are
based on monotonicity considerations. The state space X of the process is a par-
tially ordered set, with partial order given by

η ≤ ζ if η(x) ≤ ζ(x) for all x ∈ S.

A function f ∈ C(X) is said to be increasing if

η ≤ ζ implies f(η) ≤ f(ζ).

This leads naturally to the definition of stochastic monotonicity for probability
measures μ on X: μ_1 ≤ μ_2 provided that

(B8)    ∫_X f dμ_1 ≤ ∫_X f dμ_2 for all increasing f on X.
This stochastic monotonicity for probability measures is best understood in
terms of the idea of coupling. A coupling of random variables or stochastic pro-
cesses is simply a joint construction of them on a common probability space.
Taken by itself, this is not a particularly compelling definition. However, making
a judicious choice of the joint distribution of the random variables or processes
involved turns out to be a very powerful technique. This book provides many
illustrations of this. The following is Theorem 2.4 on page 72 of IPS, and gives
the connection between coupling and stochastic monotonicity.

Theorem B9. Suppose μ_1 and μ_2 are probability measures on X. Then μ_1 ≤ μ_2 if
and only if there is a coupling (η, ζ) so that η has distribution μ_1, ζ has distribution
μ_2, and η ≤ ζ a.s.

Remark. One direction of the proof is easy: If a coupling (η, ζ) with these prop-
erties exists and f is increasing, then f(η) ≤ f(ζ) a.s., so that

∫_X f dμ_1 = Ef(η) ≤ Ef(ζ) = ∫_X f dμ_2.


The other direction requires a construction, and is somewhat more difficult.
A simple application of this theorem shows that stochastic monotonicity is
quite a strong relation. Suppose that μ_1 ≤ μ_2 and that μ_1 and μ_2 have the same
marginal distributions: μ_1{η : η(x) = 1} = μ_2{η : η(x) = 1} for all x ∈ S. Then
μ_1 = μ_2. To see this, make the construction of (η, ζ) provided by Theorem B9,
and note that the equality of marginals and η(x) ≤ ζ(x) implies η(x) = ζ(x) a.s.
for each x. Therefore η = ζ a.s., so that μ_1 = μ_2.
Neither the definition (B8) nor Theorem B9 provides a very effective way of
checking stochastic monotonicity for a given pair of probability measures. The
following sufficient condition, which is Theorem 2.9 on page 75 of IPS, is often
useful in this regard. For η, ζ ∈ X, define η ∧ ζ and η ∨ ζ coordinatewise:

η ∧ ζ(x) = min{η(x), ζ(x)},   η ∨ ζ(x) = max{η(x), ζ(x)}.

Theorem B10 (Holley). Suppose S is finite and μ_1, μ_2 are probability measures
on X that assign strictly positive probabilities to each point in X. If

(B11)    μ_1(η ∧ ζ) μ_2(η ∨ ζ) ≥ μ_1(η) μ_2(ζ)

for all η, ζ ∈ X, then μ_1 ≤ μ_2.

Remark. It is important to keep in mind that (B11) is much stronger than μ_1 ≤ μ_2.
For example, (B11) implies that the conditional measures obtained by specifying
the configurations on a subset of S are also stochastically ordered, while this is
certainly not the case if only (B8) is assumed.

We turn now to some connections between stochastic monotonicity of measures
on the one hand, and the process η_t with semigroup S(t) on the other. According
to Theorem 2.2 on page 71 of IPS, the following two statements about η_t are
equivalent:

(B12)    f increasing implies S(t)f increasing for all t ≥ 0,

and

(B13)    μ_1 ≤ μ_2 implies μ_1 S(t) ≤ μ_2 S(t) for all t ≥ 0.

The proof is an immediate consequence of the definitions. A process that satisfies
these equivalent conditions is called monotone or attractive.
According to Theorem 2.2 on page 134 of IPS, the following is a necessary
and sufficient condition for a spin system to be attractive:

(B14)    η ≤ ζ implies   c(x, η) ≤ c(x, ζ) if η(x) = ζ(x) = 0,
                         c(x, η) ≥ c(x, ζ) if η(x) = ζ(x) = 1.

We can use coupling to see that (B14) implies attractiveness, for example. Take
initial configurations η, ζ satisfying η ≤ ζ. Construct a coupled process (η_t, ζ_t) on
X × X that satisfies η_t ≤ ζ_t a.s. for all t ≥ 0 by allowing the following transitions:
if η(x) = ζ(x) = 0, then   (η, ζ) → (η_x, ζ_x) at rate c(x, η),
                           (η, ζ) → (η, ζ_x) at rate c(x, ζ) − c(x, η),

if η(x) = ζ(x) = 1, then   (η, ζ) → (η_x, ζ_x) at rate c(x, ζ),
                           (η, ζ) → (η_x, ζ) at rate c(x, η) − c(x, ζ),

and

if η(x) = 0 and ζ(x) = 1, then   (η, ζ) → (η_x, ζ) at rate c(x, η),
                                 (η, ζ) → (η, ζ_x) at rate c(x, ζ).

Note that the marginals have the right transition rates. For example, if ζ(x) = 0,
then ζ → ζ_x at rate

c(x, η) + [c(x, ζ) − c(x, η)] = c(x, ζ).

Assumption (B14) is needed to guarantee that all of the above rates are nonneg-
ative. This construction is known as the basic coupling. Since η_t ≤ ζ_t a.s., f
increasing implies that

S(t)f(η) = E f(η_t) ≤ E f(ζ_t) = S(t)f(ζ),

so that (B12) is satisfied.
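The following is a simulation sketch of the basic coupling on a ring of N sites, using contact-process-style rates (a flip 0 → 1 at rate λ times the number of occupied neighbors, and a flip 1 → 0 at rate 1), which satisfy (B14); the parameter values are arbitrary illustrative choices. The assertion in the loop checks that the ordering η_t ≤ ζ_t is never violated.

import random

N, lam, T = 30, 2.0, 5.0                  # ring size, infection rate, time horizon

def rate(x, eta):
    nbrs = eta[(x - 1) % N] + eta[(x + 1) % N]
    return 1.0 if eta[x] == 1 else lam * nbrs

def coupled_moves(eta, zeta):
    # The coupled transitions listed above; each entry is (rate, site, flip eta?, flip zeta?).
    moves = []
    for x in range(N):
        ce, cz = rate(x, eta), rate(x, zeta)
        if eta[x] == zeta[x] == 0:
            moves += [(ce, x, True, True), (cz - ce, x, False, True)]
        elif eta[x] == zeta[x] == 1:
            moves += [(cz, x, True, True), (ce - cz, x, True, False)]
        else:                             # eta(x) = 0 and zeta(x) = 1
            moves += [(ce, x, True, False), (cz, x, False, True)]
    return [m for m in moves if m[0] > 0]

eta = [0] * N; eta[N // 2] = 1            # eta_0 has a single occupied site
zeta = [1] * N                            # zeta_0 is fully occupied, so eta_0 <= zeta_0
t = 0.0
while t < T:
    moves = coupled_moves(eta, zeta)
    total = sum(m[0] for m in moves)
    if total == 0:
        break
    t += random.expovariate(total)
    _, x, fe, fz = random.choices(moves, weights=[m[0] for m in moves])[0]
    if fe:
        eta[x] = 1 - eta[x]
    if fz:
        zeta[x] = 1 - zeta[x]
    assert all(a <= b for a, b in zip(eta, zeta))   # the ordering is never violated
print(sum(eta), sum(zeta))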

Correlation Inequalities
Correlation inequalities are also very useful in the study of interacting particle
systems. A probability measure μ on X is said to have positive correlations if

∫_X fg dμ ≥ ∫_X f dμ ∫_X g dμ

for all increasing functions f and g on X. The following analogue of Theorem
B10 is Corollary 2.12 on page 78 of IPS. It provides a sufficient condition for a
measure to have positive correlations. Again, this condition is very far from being
necessary.

Theorem B15 (FKG). Suppose S is finite and μ assigns positive probability to
every point in X. If

(B16)    μ(η ∧ ζ) μ(η ∨ ζ) ≥ μ(η) μ(ζ)

for all η, ζ ∈ X, then μ has positive correlations.

For almost any Markov process, it is practically impossible to check that μS(t)
has positive correlations using Theorem B15, partly because (B16) essentially
requires that μS(t) be known explicitly, but also because (B16) is often false.
For example, Liggett (1994) showed that (at least for some times and parameter
values), the distribution at time t of the one dimensional contact process with
initial condition η ≡ 1 does not satisfy (B16). In view of these comments, it
should not be surprising that the following result is useful. It is a special case of
Theorem 2.14 on page 80 of IPS.

Theorem B17 (Harris). If η_t is an attractive spin system, then for every t > 0,
μS(t) has positive correlations whenever μ does.

An immediate corollary is that the distribution of the contact process with
initial configuration η ≡ 1 does have positive correlations, even though it does
not satisfy (B16).
As we will see below, an easy consequence of either Theorem B15 or Theorem
B17 is the following:

Corollary B18. Suppose ν is the product measure on X with

(B19)    ν{η : η(x) = 1} = α(x),   x ∈ S.

Then ν has positive correlations.

The measure ν has the following cylinder probabilities:

ν{η : η(x) = 1 ∀x ∈ G, η(x) = 0 ∀x ∈ H} = ∏_{x∈G} α(x) ∏_{x∈H} [1 − α(x)].

To deduce Corollary B18 from Theorem B15, simply note that both sides of (B16)
are

∏_x α(x)^{η(x)+ζ(x)} [1 − α(x)]^{2−η(x)−ζ(x)}.

To deduce it from Theorem B17, consider the spin system with

c(x, η) = α(x)       if η(x) = 0,
          1 − α(x)   if η(x) = 1,

and initial distribution μ, the pointmass on η ≡ 1. Then the individual coordinates
η_t(x) are independent two state Markov chains with

lim_{t→∞} P(η_t(x) = 1) = α(x),

so

ν = lim_{t→∞} μS(t).

Every deterministic distribution has positive correlations, so the distribution μS(t)
has positive correlations by Theorem B17, and therefore ν does also.
A set A ⊂ X is said to be increasing if its indicator function

1_A(η) = 1 if η ∈ A,
         0 if η ∉ A

is increasing. An immediate consequence of Corollary B18 is that if ν is a product
measure on X and A_1, A_2 ⊂ X are both increasing, then

(B20)    ν(A_1 ∩ A_2) ≥ ν(A_1) ν(A_2).

In other words, increasing events are positively correlated in the usual sense.
Often it is important to have inequalities in the opposite direction. Clearly the
event appearing on the left side of the inequality must be smaller than A_1 ∩ A_2
in order to have the opposite inequality. Here is the appropriate definition. For
A_1, A_2 ⊂ X, define

A_1 ∘ A_2 = {η ∈ X : ∃ S_1, S_2 ⊂ S, S_1 ∩ S_2 = ∅, such that ζ = η on S_1 implies ζ ∈ A_1,
             and ζ = η on S_2 implies ζ ∈ A_2}.

The set A_1 ∘ A_2 is read A_1 and A_2 occur disjointly. The idea is that an η ∈ A_1 ∘ A_2
is not only in both A_1 and in A_2, but it is so for disjoint reasons.
Here is a simple example. Let

A_1 = {η : η(x) + η(y) ≥ 1} and A_2 = {η : η(y) + η(z) ≥ 1},

where x, y, z are distinct points in S. Then A_1 ∘ A_2 = {η : η(x) + η(y) + η(z) ≥ 2},
so that

(A_1 ∩ A_2)\(A_1 ∘ A_2) = {η : η(x) = 0, η(y) = 1, η(z) = 0}.

If ν is the product measure with density 1/2, for example, then

ν(A_1 ∘ A_2) = 8/16,   ν(A_1) ν(A_2) = 9/16,

which are ordered as predicted by (B20) and Theorem B21 below.
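These numbers can be confirmed by enumerating the eight possible values of (η(x), η(y), η(z)) in a few lines of Python:

from itertools import product

configs = list(product([0, 1], repeat=3))            # values (eta(x), eta(y), eta(z))
A1 = [e for e in configs if e[0] + e[1] >= 1]
A2 = [e for e in configs if e[1] + e[2] >= 1]
A1oA2 = [e for e in configs if sum(e) >= 2]          # A1 and A2 occurring disjointly
both = [e for e in configs if e in A1 and e in A2]   # A1 intersect A2

nu = lambda A: len(A) / 8                            # product measure with density 1/2
print(nu(A1oA2), nu(A1) * nu(A2), nu(both))          # 0.5, 0.5625, 0.625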

Theorem B21 (BKR inequality). If ν is a product measure and S is finite, then

ν(A_1 ∘ A_2) ≤ ν(A_1) ν(A_2)

for all A_1, A_2 ⊂ X.

Note that A_1 ∘ A_2 ⊂ A_1 ∩ A_2 always, so that if A_1, A_2 are increasing, Theorem
B21 gives an inequality similar to (B20), but in the opposite direction. Interest-
ingly, (B20) (for increasing sets) is actually a consequence of Theorem B21. To
see this, suppose A_1 is increasing, and A_2 is decreasing (i.e., its complement A_2^c
is increasing). Then A_1 ∘ A_2 = A_1 ∩ A_2, since given η ∈ A_1 ∩ A_2, one can check
that η ∈ A_1 ∘ A_2 by taking

S_1 = {x : η(x) = 1}   and   S_2 = {x : η(x) = 0}.

Therefore Theorem B21 implies that ν(A_1 ∩ A_2) ≤ ν(A_1) ν(A_2). But this is equiv-
alent to ν(A_1 ∩ A_2^c) ≥ ν(A_1) ν(A_2^c).

Theorem B21 for increasing events is due to van den Berg and Kesten, and
has long been known as the BK inequality. See Section 2.3 of Grimmett (1989)
for a proof in this context. The proof of the general form of the theorem is due
to Reimer, and this leads to our calling it the BKR inequality. Reimer's proof is
given in Section 6 of Chayes, Puha and Sweet (1999). The proof given there is
for the case that ν{η : η(x) = 1} = 1/2 for all x ∈ S. The fact that this special
case of Theorem B21 implies the general case had been proved earlier by van
den Berg and Fiebig (1987) - see their Lemma 3.5. In that paper, they proved the
inequality in several cases, including that in which A and B are intersections of an
increasing and a decreasing event. In his second edition, Grimmett (1999) states
the general BKR inequality (see his Theorem 2.19), but again proves it only for
increasing events.
Results such as Corollary B 18 and Theorem B21 have been stated for inde-
pendent Bernoulli random variables. However, they can be used to obtain similar
results for independent Poisson processes. This is important, since all of the pro-
cesses discussed in this book can be constructed from collections of independent
Poisson processes.
Here is the idea. Suppose N is a rate one Poisson process on [0, 1], i.e., N(·)
is a random measure on [0, 1] with the following properties:
(i) For each Borel set A ⊂ [0, 1], N(A) is a Poisson distributed random variable
with mean m(A), where m is Lebesgue measure.
(ii) If {A_i} are disjoint, then {N(A_i)} are independent.
Define random variables by

M(A) = 0   if N(A) = 0,
       1   if N(A) ≥ 1.

If {A_i} are disjoint, then {M(A_i)} are independent Bernoulli random variables.
Furthermore,

P(N(A) ≠ M(A)) = P(N(A) ≥ 2) = 1 − e^{−m(A)}[1 + m(A)] ~ (1/2)[m(A)]²

as m(A) ↓ 0. In particular, if A_i = [(i−1)/n, i/n] for i = 1, ..., n, then

P(N(A_i) ≠ M(A_i) for some 1 ≤ i ≤ n) ≤ C/n

for some constant C. This makes it possible to apply correlation inequalities to
the M's, and then deduce corresponding inequalities for the N's.
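The error estimate above is easy to check numerically; the values of m(A) used below are arbitrary:

import math

for m in [0.5, 0.1, 0.01, 0.001]:
    exact = 1.0 - math.exp(-m) * (1.0 + m)           # P(N(A) >= 2) for m(A) = m
    print(m, exact, 0.5 * m * m)                     # the two agree as m -> 0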

Duality
Two Markov processes η_t and ζ_t (with possibly different state spaces) are said to
be dual with respect to the function H if

E^η H(η_t, ζ) = E^ζ H(η, ζ_t)

for all η in the state space of the first process and ζ in the state space of the
second. The function H should be jointly measurable, and either nonnegative or
bounded, so that the above expectations are well defined. Duality is often a useful
tool because it permits the computation of certain probabilities for one of the
processes in terms of probabilities for the other. It has other important uses as
well, as we will see later in this book.
A general discussion of duality can be found in Section 3 of Chapter II of IPS.
Rather than repeat this here, we will limit ourselves to the observation that duality
will arise in our discussion of the basic contact process in Part I of this book, in
our discussion of the threshold contact process, and the linear and threshold voter
models in Part II, and in our discussion of the symmetric exclusion process in
Part III.

Subadditivity
Subadditive sequences and functions will come up frequently. The much more
powerful subadditive ergodic theorem (Theorem 2.6 on page 277 of IPS) plays an
important role in some aspects of the study of the contact process, but will not be
used in this book.
Here is the main result we will use.

Theorem B22. Suppose a(t), t ∈ [0, ∞), is locally bounded and satisfies

(B23)    a(s + t) ≤ a(s) + a(t),   s, t ≥ 0.

Then

−∞ ≤ lim_{t→∞} a(t)/t = inf_{t>0} a(t)/t < ∞.

An analogous statement holds for sequences.

Proof. Let

α = inf_{t>0} a(t)/t.

Fix s > 0 and write t = ks + u, 0 ≤ u ≤ s, where k is an integer. Then

a(t) = a(ks + u) ≤ k a(s) + a(u)

by (B23). Letting t → ∞, we have k → ∞ and t/k → s. Therefore,

limsup_{t→∞} a(t)/t ≤ a(s)/s.

Since this is true for any s > 0, it follows that

α ≤ liminf_{t→∞} a(t)/t ≤ limsup_{t→∞} a(t)/t ≤ α,

which completes the proof. The proof for sequences is similar.
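As a concrete illustration of Theorem B22, take the subadditive function a(t) = t + √t (an arbitrary choice); the ratio a(t)/t then decreases to inf_{t>0} a(t)/t = 1:

import math

a = lambda t: t + math.sqrt(t)                       # subadditive since sqrt is subadditive
for t in [1.0, 10.0, 100.0, 10000.0]:
    print(t, a(t) / t)                               # decreases toward inf a(t)/t = 1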

Oriented Percolation
Oriented site percolation is a very useful comparison process for interacting par-
ticle systems - especially the contact process. Here is a description of the site
percolation model with parameter p: A_n is a discrete time Markov chain on the
collection of finite subsets of Z with the following evolution: conditional on the
process up to time n, the events {x ∈ A_{n+1}} are independent and have probability

P(x ∈ A_{n+1} | A_0, ..., A_n) = p   if A_n ∩ {x − 1, x} ≠ ∅,
                                0   if A_n ∩ {x − 1, x} = ∅.

This is not quite the traditional description of the process, but it has the advantage
of making clear that oriented percolation can be viewed as a discrete time version
of the one dimensional contact process.
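Here is a short simulation sketch of this Markov chain; the parameter values, and the use of 200 steps as a proxy for survival, are arbitrary illustrative choices.

import random

def step(A, p):
    candidates = set()
    for x in A:
        candidates.update((x, x + 1))        # the sites with A_n intersect {x-1, x} nonempty
    return {x for x in candidates if random.random() < p}

def survives(p, n_steps=200):
    A = {0}
    for _ in range(n_steps):
        A = step(A, p)
        if not A:
            return False
    return True

for p in [0.5, 0.7, 0.8, 0.9]:
    est = sum(survives(p) for _ in range(200)) / 200
    print(p, est)                            # rough estimate of the survival probability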
The following result summarizes the main facts we will use about A_n. Note
that if A_0 = {0}, then A_n ⊂ [0, n]. Let T = inf{n ≥ 1 : A_n = ∅}.

Theorem B24. If p is sufficiently close to 1, then there are constants C and ε > 0
such that

(a) inf_n P^{{0}}(n ∈ A_{2n}) > 0,
(b) P^{{0}}(n < T < ∞) ≤ C e^{−εn}, and
(c) P^A(T < ∞) ≤ C e^{−ε|A|}.

A proof of Theorem B24 can be found in Durrett (1984) for oriented bond
percolation, in which the transition probabilities given above are replaced by

P(x ∈ A_{n+1} | A_0, ..., A_n) = p(2 − p)   if |A_n ∩ {x − 1, x}| = 2,
                                p          if |A_n ∩ {x − 1, x}| = 1,
                                0          if |A_n ∩ {x − 1, x}| = 0.

To deduce parts (a) and (c) of Theorem B24 for oriented site percolation from the
corresponding statements in the bond case, it suffices to note that one can couple a
site percolation process An with parameter p(2 - p) to a bond percolation process
Bn with parameter p so that Bn C An, provided that the initial states satisfy
Bo C Ao. Part (b) is not quite so easy, since {n < T < oo} is not a monotone
event. However, it can be deduced from the bond case by using the restart argument
described on pages 1031-1032 of Durrett (1994). (We will encounter a version of
this argument in the proof of Theorem 2.30 of Part I.)
For the one dimensional contact process, the analogues of the three parts of
Theorem B24 can be found in IPS as Theorem 2.28 on page 284, Theorem 3.23
on page 302 and Theorem 3.29 on page 303 respectively.
More quantitative statements related to Theorem B24 have been proved by
Liggett (1995b): If p ≥ 3/4, then

(B25)    P^{{0}}(A_n ≠ ∅) ≥ 2(1 − p) / (p[1 − √(4p − 3)]),   n ≥ 0.

Note that the right side of (B25) is 2/3 when p = 3/4, and converges to 1 as p ↑ 1.


It is often useful to think in terms of the more traditional description of the
oriented percolation process. Consider the bond case, for example. Place arrows
on R2 independently, with probability p each, from each (i, j) to (i, j + 1), and
from each (i, j) to (i + 1, j + 1). Given A ⊂ Z1, let

A_n = {k : there is an oriented path from (i, 0) to (k, n) for some i ∈ A}.

Here an oriented path is simply a concatenation of arrows. Then {A_n, n ≥ 0}
is a version of the oriented bond percolation model described earlier with initial

is a version of the oriented bond percolation model described earlier with initial
state A.
This construction of the oriented percolation model is analogous to the graphi-
cal representation that will be described in the first section of Part I. The graphical
representation is extensively used throughout this book.

Domination by Product Measures


Often the percolation models that arise in comparisons with particle systems fail to
satisfy the conditional independence assumption made in the definition of oriented
percolation above. The following is frequently useful in extending the applicability
of Theorem B24 to such dependent situations. It is a special case of results in
Liggett, Schonmann and Stacey (1997). In its statement, | · | can be any norm
on R^d.

Theorem B26. Fix k, d ≥ 1, and let Δ = #{y ∈ Z^d : |y| ≤ k}. Then whenever
{X_x, x ∈ Z^d} are Bernoulli random variables that satisfy

P(X_x = 1 | X_y, |y − x| > k) ≥ 1 − (1 − √p)^Δ   a.s.

for all x ∈ Z^d, with p ≥ 1/4, it follows that the distribution μ of this family satisfies
μ ≥ ν_p, where ν_p is the product measure on {0, 1}^{Z^d} with density p, i.e., ν_p has
cylinder probabilities

ν_p{η : η(x) = 1 ∀x ∈ G, η(x) = 0 ∀x ∈ H} = p^{|G|} (1 − p)^{|H|}.

Remark. The most important situation in which this theorem is used is that in
which the X_x's are k-dependent. Recall that a collection of random variables {X_x}
indexed by Z^d is said to be k-dependent provided that whenever A and B are
subsets of Z^d that satisfy

|x − y| > k for all x ∈ A, y ∈ B,

the collections of random variables {X_x, x ∈ A} and {X_x, x ∈ B} are independent
of each other. In the k-dependent case, the hypothesis of Theorem B26 reduces to

P(X_x = 1) ≥ 1 − (1 − √p)^Δ.

Proof of Theorem B26. Let {x_n, n ≥ 0} be any enumeration of the points in Z^d,
and write X_n = X_{x_n}. We assume without loss of generality that all probabilities of
the form P(X_0 = ε_0, ..., X_n = ε_n) are strictly positive. If it were the case that

(B27)    P(X_{n+1} = 1 | X_0 = ε_0, ..., X_n = ε_n) ≥ p

for all n and all choices of ε_i ∈ {0, 1}, then it would be easy to construct recursively
a coupling that would realize the desired inequality μ ≥ ν_p. Alternatively, one
could apply Theorem B10 to check this, since (B11) reduces to (B27) when μ_1 =
ν_p and μ_2 = μ. However, (B27) is too strong a condition to expect to check in
any significant generality.
The idea of the proof is to let {Y_n, n ≥ 0} be an i.i.d. sequence of Bernoulli
random variables with P(Y_n = 1) = r that is independent of the X's, and try
to check that the sequence Z_n = X_n Y_n satisfies (B27). Since Z_n ≤ X_n, this will
suffice. Since

P(Z_{n+1} = 1 | Z_0 = ε_0, ..., Z_n = ε_n) = r P(X_{n+1} = 1 | Z_0 = ε_0, ..., Z_n = ε_n)

is r times an average of quantities of the form

P(X_{n+1} = 1 | X_0 = ε_0, ..., X_n = ε_n),


it should be plausible that it is easier to have the Z's satisfy (B27) (with a different
value of p) than to have the X's satisfy it. One should think of condition (B27)
for the Z's as being a smoothed or averaged version of condition (B27) for the
X's.
So, we assume that

(B28)    P(X_x = 1 | X_y, |y − x| > k) ≥ s   a.s.

for some s, and try to prove

(B29)    P(Z_{n+1} = 1 | Z_0 = ε_0, ..., Z_n = ε_n) ≥ p

for some p depending on r, s. Recall from above that

(B30)    P(Z_{n+1} = 1 | Z_0 = ε_0, ..., Z_n = ε_n) = r P(X_{n+1} = 1 | Z_0 = ε_0, ..., Z_n = ε_n).

For fixed n, let

N_0 = {i : 0 ≤ i ≤ n, |x_i − x_{n+1}| ≤ k, ε_i = 0},
N_1 = {i : 0 ≤ i ≤ n, |x_i − x_{n+1}| ≤ k, ε_i = 1},
M  = {i : 0 ≤ i ≤ n, |x_i − x_{n+1}| > k}.

Since Z_i = 1 if and only if X_i = Y_i = 1, and Y_i = 0 implies Z_i = 0,


P(X_{n+1} = 0 | Z_0 = ε_0, ..., Z_n = ε_n)
    = P(X_{n+1} = 0 | Z_i = 0, i ∈ N_0; X_i = 1, i ∈ N_1; Z_i = ε_i, i ∈ M)
    = P(X_{n+1} = 0; Z_i = 0, i ∈ N_0; X_i = 1, i ∈ N_1; Z_i = ε_i, i ∈ M)
      / P(Z_i = 0, i ∈ N_0; X_i = 1, i ∈ N_1; Z_i = ε_i, i ∈ M)
(B31)
    ≤ P(X_{n+1} = 0; Z_i = ε_i, i ∈ M) / P(Y_i = 0, i ∈ N_0; X_i = 1, i ∈ N_1; Z_i = ε_i, i ∈ M)
    = P(X_{n+1} = 0 | Z_i = ε_i, i ∈ M) / [(1 − r)^{|N_0|} P(X_i = 1, i ∈ N_1 | Z_i = ε_i, i ∈ M)]
    ≤ (1 − s) / [(1 − r)^{|N_0|} P(X_i = 1, i ∈ N_1 | Z_i = ε_i, i ∈ M)],

where the final inequality comes from (B28) and the fact that each Y_i is indepen-
dent of all the X's and all the other Y's.
We will now use (B31), which is true for any ordering of Z^d and any n, to
prove inductively that if r is chosen appropriately, then

(B32)    P(X_{n+1} = 1 | Z_0 = ε_0, ..., Z_n = ε_n) ≥ r

for all orderings of Z^d, all n and all ε_i. By (B28), it is true for n = −1 provided
that r ≤ s. Write the P(X_i = 1, i ∈ N_1 | Z_i = ε_i, i ∈ M) that appears in the final
expression of (B31) as a product of |N_1| conditional probabilities of the form

P(X_l = 1 | Z_i = ε_i, i ∈ M; X_j = 1, j ∈ N_1, j > l)
    = P(X_l = 1 | Z_i = ε_i, i ∈ M; Z_j = 1, j ∈ N_1, j > l)

for l ∈ N_1. Then we see from (B31) that (B32) holds for a given n, provided it
holds for all smaller values of n and that

(B33)    (1 − s) / [(1 − r)^{|N_0|} r^{|N_1|}] ≤ 1 − r.

Since |N_0| + |N_1| ≤ Δ − 1, if r ≥ 1/2 it suffices for (B33) that

(B34)    1 − s ≤ (1 − r)^Δ.

We conclude that if s ≥ r ≥ 1/2 and (B34) all hold, then (B28) implies that (B29)
holds for p = r² (by (B30) and (B32)), and hence by the remarks at the beginning
of the proof, that μ ≥ ν_p. Take r = √p and s so that equality holds in (B34) to
complete the proof.

Renewal Sequences and Logconvexity


Suppose {f(n), n ≥ 1} is a probability density on the positive integers. Its renewal
sequence {u(n), n ≥ 0} is defined by u(0) = 1 and

(B35)    u(n) = Σ_{k=1}^{n} f(k) u(n − k),   n ≥ 1.

Here is the interpretation. Let {X_k, k ≥ 1} be independent and identically dis-
tributed random variables with

P(X_k = n) = f(n),   n ≥ 1,

and let S_0 = 0 and S_k = X_1 + ... + X_k, k ≥ 1, be their partial sums. Think of X_k
as the lifetime of the kth object used in some operation. A new object is installed
when the old one dies. The installation of a new object is called a renewal, so S_k
is the time of the kth renewal. Then

(B36)    u(n) = P(S_k = n for some k ≥ 0)

is the probability that a renewal occurs at time n. Let's check that the sequence
defined by (B36) satisfies the recurrence (B35). For n ≥ 1, write

P(S_k = n for some k ≥ 0) = Σ_{k=1}^{n} P(X_1 = k) P(S_l = n for some l ≥ 0 | X_1 = k)
                          = Σ_{k=1}^{n} P(X_1 = k) P(S_l = n − k for some l ≥ 0).
The renewal theorem asserts that if f is nonlattice (i.e., does not concentrate
on multiples of an integer > 1), then

(B37)    lim_{n→∞} u(n) = 1 / Σ_{k=1}^{∞} k f(k).

An easy way to see this is to define a Markov chain Y_n on {0, 1, ...} with transition
probabilities

p(k, 0) = f(k + 1)/F(k + 1)   and   p(k, k + 1) = F(k + 2)/F(k + 1),

where F(n) is the tail probability

F(n) = Σ_{k=n}^{∞} f(k).

The chain Y_n can be interpreted as the age process associated with the renewal
process:

Y_n = n − max{S_j : S_j ≤ n}.

Note that with this definition, Y_n increases by one at each unit of time, except that
it is reset to zero when a renewal occurs. Therefore Y_n represents the age of the
object currently in service, and Y_n = 0 corresponds exactly to a renewal occurring
at time n. To check that it is a Markov chain with the transition probabilities given
above, consider conditioning on the values Y_0, Y_1, ..., Y_{n−1} and Y_n = k. In this

situation, Sj = n - k for some j, and the conditioning determines the values of


Xl, ... , Xj and the fact that Xj+l ~ k + 1. Therefore, with this conditioning,
Yn + 1 = 0 or Yn +1 = k + 1 with probabilities

p(Xj + 1 = k + 1) p(Xj + 1 ~ k + 2)
and
P(Xj+l ~ k + 1) P(Xj+l ~ k + 1)

respectively.
From these observations, it is easy to see that

and the return time to the origin

r = min{n : Yn = O}
starting from 0 has density f. Therefore, (B37) follows from the convergence
theorem for Markov chains, which says in this case that

. P o( Yn
lIm = 0) = -0-.
1
n->oo E r
See Chapters 3 and 5 of Durrett (1996) for more on this.
A property that will be useful in our applications of renewal theory in Part II
is logconvexity. A positive sequence {c(n), n ≥ n_0} is said to be logconvex if the
successive ratios are monotone:

(B38)    c(n)/c(n + 1) ≥ c(n + 1)/c(n + 2),   n ≥ n_0.

Note that the logconvexity of the sequence c(n) is equivalent to the nonnegativity
of the 2 × 2 determinants

| c(n)       c(n + 1) |
| c(n + 1)   c(n + 2) | ,   n ≥ n_0.

De Bruijn and Erdős (1953) discovered the following connection between renewal
theory and logconvexity.

Theorem B39. Let f be any strictly positive probability density on {1, 2, ...} and
let u(n) be the corresponding renewal sequence. If f is logconvex, then so is u.

The proof of this result depends on an identity that relates determinants based
on f to determinants based on u. Note the similarity between (B41) below and
the convolution equation (B35) that defines the renewal sequence.

Lemma B40. Let f be any probability density on {1, 2, ...} and let u(n) be the
corresponding renewal sequence. Then

(B41)    f(n + 1) | u(n)      u(n + 1) |  =  Σ_{j=1}^{n} | f(j)      f(j + 1) | | u(n − j)   u(n − j + 1) |
                  | u(n + 1)  u(n + 2) |                 | f(n + 1)  f(n + 2) | | u(n)       u(n + 1)     |

for n ≥ 1.

Proof. Expanding the determinants and using (B35) four times gives the following
for the sum on the right side of (B41):

f(n + 2) u(n + 1) Σ_{j=1}^{n} f(j) u(n − j) − f(n + 2) u(n) Σ_{j=1}^{n} f(j) u(n − j + 1)
    − f(n + 1) u(n + 1) Σ_{j=1}^{n} f(j + 1) u(n − j) + f(n + 1) u(n) Σ_{j=1}^{n} f(j + 1) u(n − j + 1)

= f(n + 2) u(n + 1) u(n) − f(n + 2) u(n)[u(n + 1) − f(n + 1)]
    − f(n + 1) u(n + 1)[u(n + 1) − f(1) u(n)]
    + f(n + 1) u(n)[u(n + 2) − f(1) u(n + 1) − f(n + 2)]

= f(n + 1)[u(n) u(n + 2) − u²(n + 1)].

Proof of Theorem B39. The proof that

(B42)    u(n) u(n + 2) ≥ u²(n + 1)

is by induction on n. Take n = 0. Then

u(0) u(2) − u²(1) = f(2) > 0.

Suppose now that (B42) is true for n < m. Then

u(0)/u(1) ≥ u(1)/u(2) ≥ ... ≥ u(m − 1)/u(m) ≥ u(m)/u(m + 1).

Applying (B41) with n = m, we see that all the determinants on the right side are
nonnegative. Therefore (B42) holds for n = m as well.
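Theorem B39 is easy to test numerically. In the sketch below, the strictly logconvex density f(n) = 1/(n 2^n ln 2) (an arbitrary illustrative choice) is fed through (B35), and the determinants u(n)u(n + 2) − u²(n + 1) all come out nonnegative.

import math

f = lambda n: 1.0 / (n * 2**n * math.log(2))        # ratios f(n)/f(n+1) = 2(n+1)/n decrease

N = 30
u = [1.0]
for n in range(1, N + 1):
    u.append(sum(f(k) * u[n - k] for k in range(1, n + 1)))    # (B35)

dets = [u[n] * u[n + 2] - u[n + 1]**2 for n in range(N - 1)]
print(min(dets) >= 0)                               # True: u is logconvex, as predicted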

Liggett (1989) generalized Theorem B39 to higher order convexity properties.
To define these, begin with the notion of total positivity. A matrix M = (m_{i,j}) is
said to be totally positive of order r ≥ 1 (TP_r) if for every k ≤ r, every k × k
submatrix of M has nonnegative determinant. A sequence {c(n), n ≥ 0} generates
a matrix M via m_{i,j} = c(i + j). Note that this matrix is TP_1 if and only if the
sequence c is nonnegative, and for strictly positive sequences, the matrix is TP_2 if
and only if c is logconvex. The generalization of Theorem B39 is the following:

If f(i + j + 1) is TP_r, then u(i + j) is TP_r.

This statement has a partial converse:

If u(i + j) is TP_{r+1}, then f(i + j + 1) is TP_r.

Since logconvexity implies convexity (by the arithmetic-geometric mean in-
equality), we see from Theorem B39 that if f is logconvex, then u is convex,
i.e.,

u(n) − u(n + 1) ≥ u(n + 1) − u(n + 2),   n ≥ 0.
We will need another inequality that says that these differences cannot decrease
too rapidly. Here is the identity that leads to this inequality. It is taken from Liggett
(1989).

Proposition B43. If n :::: 1, then

[u(n) - u(n + I)]F(n + I) =

L [u(k -
n
(B44) 1) - u(k)][ F(n + 2)F(n - k + 1)
k=l
- F(n + I)F(n - k + 2)].

Proof Summing the right side above by parts and using

f(k) = F(k) - F(k + 1), k:::: 1

gIves

n-l n
F(n + 2) L u(k)f(n - k) - F(n + 1) L u(k)f(n - k + 1)
k=O k=O
-u(n)F(n + 2) + u(n)F(n + I).
Now apply (B35) to get the result.

Before stating the next result, we observe that the logconvexity of the density
I implies the logconvexity of the tail probabilities F:

F(n + l)F(n - 1) - F2(n) =f(n - I)F(n + I) - f(n)F(n)

L
00

= [f(n - l)f(k + I) - f(n)f(k)].


k=n

Theorem B45. Suppose {F(n), n :::: I} is logconvex. Then u(n) t, andfor n :::: 2,
Background and Tools 21

F(n+2) ]
u(n) - u(n + 1) >[u(n - 1) - u(n)] [ - F(2)
- F(n + 1)
(B46)
F(n+2) ]
+ [u(n - 2) - u(n - 1)] [ F(2) - F(3) .
F(n + 1)

Proof Since F is 10gconvex,

F(n + 2) F(n - k + 2)
--->-----
F(n + 1) - F(n - k + 1)

for 1 :::: k :::: n. Therefore, u(n) ~ u(n+ 1) follows from Proposition B43 and induc-
tion. Now we know that all the summands on the right of (B44) are nonnegative.
Inequality (B46) comes from dropping all but the two summands corresponding
to k = nand k = n - 1.

Translation Invariant Measures


In Part III, we will have occasion to use some elementary properties of translation
invariant measures. Here is the most important one, which says that the mean
distance between particles is the reciprocal of the particle density.

Theorem B47. Suppose that fL is a shift invariant probability measure on {O, I}ZI
that puts no mass on the == 0 configuration, and let Y] have distribution fL. Define
Xb -00 < k < 00 by

... < X-I < Xo = 0 < XI < ...

and {Xb k =1= O} = {x E ZI\{O} : Y](x) = I}. Then

(B48)

for all k.

Proof First note that translation invariance guarantees that

(B49) I>(x) =L y](x) = 00 a.s.


x>o x<o

This is because probabilities of the sort

a = p(y](x) = 0 for all x ~ n)


are independent of n. The corresponding events are monotone in n, so their union
and intersection both have probability a. Our assumption is that the intersection
has probability 0, so the union has probability 0 as well.
In proving (B48), we will assume that k ~ 1. The case k = 0 is slightly
different, but easier, and negative cases are handled by symmetry. Write
22 Background and Tools

00

E{Xk+1 - Xk.I1(O) = 1) = LnP{Xk+1 - Xk = n, 11(0) = 1)


n=1
00 00

=L L nP {I1(O) = 1, X k = m, X k+1 = m + n)
n=1 m=k

= ~ ~np( 11(0) = 1, ~ l1(i) = k - 1, l1(m) = 1,

m~-I
L 11(i) = 0, l1(m + n) = 1
)

i=m+1

t;
00 00 m+n-I ( m-/-I
=~~ P 11(-1)=1, i~/I1(i)=k-l'l1(m-t)=I,

L
m+n-/-I )
l1(i) = 0, l1(m + n -I) = 1 ,
i=m-/+I

where the final step uses shift invariance. Making the change of variables

w = -t, u = m -t, v = m + n -t

and using (B49), this becomes

00 0 u-k ( u-I
~ u~oow~ooP I1(W) = 1, i];ll1(i) = k - 1, I1(U) = 1,

i~1 11(i) = 0, I1(V) = 1)


= t ui;oo p( I1(U) = 1, i~II1(i) = 0, I1(V) = 1)
= tp(~I1(i) = 0, I1(V) = 1) = 1.

Some Ergodic Theory


A stochastic process 111 on X is said to be stationary if the joint distributions of

are independent of t for all choices of n and of tl, ... , tn. It is said to be ergodic if
in addition it satisfies the following property: for every event G in path space that
is invariant under time shifts, P(I1. E G) = 0 or 1. The main result concerning
stationary ergodic processes is the Birkhoff Ergodic Theorem - see Section 6.2 of
Durrett (1996) for example. Here it is:
Background and Tools 23

Theorem B50. If 1]1 is stationary and ergodic, and if 1 is any bounded measurable
function on X, then

lit
-
t 0
1(1]s)ds ---+ EI(1]o) a.s.

as t ---+ 00. If 1]t is only stationary, the above limit exists a.s., but may not be
constant.

Theorem B50 can be applied to obtain an often useful criterion for ergodicity:

Theorem B51. The stationary process 1]t is ergodic if and only if

for all bounded measurable (or equivalently, all bounded continuous) functions 1
and g of n variables, all choices of SI < S2 < ... < Sn, and all n 2: 1.

For a proof of Theorem B51, see Proposition 4.11 of Chapter I of IPS, for
example. The general continuous functions that appear there can be replaced by
functions of finitely many variables to get the above statement.
One common way to construct stationary processes is to take a Markov process
1]t and use an invariant measure I-t as its initial distribution. The following result
gives an important connection between extremality of f.-t and ergodicity of the
resulting stationary process.

Theorem B52. Suppose that 1]1 is a stationary Markov process whose distribution
at each fixed time is the measure f.-t E g. Then each of the following is equivalent
to ergodicity of the process:
(a) f.-t E.9;.
(b)

lim
1--->00
~t 10t EF(1]o)G(1]s)ds = f f Fdf.-t Gdf.-t

for all bounded continuous functions F, G.

Proof For part (a), suppose 1 is a bounded nonnegative measurable function on


X with f Idf.-t = 1, and let v = If.-t. Then
24 Background and Tools

111
- EI(TJo)g(TJs)ds = -
t o t
Ill! I(TJ)E~g(TJs)d/1ds

11 f E~g(TJs)dvds
0

={

= { 11 f S(s)gdvds

= f gd[{ 11 VS(S)dS].

Therefore, by Theorem B51 and the Markov property, TJI is ergodic if and only if
for every such I,

(B53) -
t
111
0
vS(s )ds ::::} /1,

where::::} denotes weak convergence. Now suppose

(B54)

for two probability measures VI, V2. Then Vi is absolutely continuous with respect
to /1, so it may be written as Vi = fi /1, with 0 :s fi :s 2 and II + h = 1. Also,
since /1 E .9',
/1 = ~
2t
r
10
vIS(s)ds + ~
2t
r
10
v2S(s)ds.

If /1 E .9,;, passing to subsequences and using Theorem B7(f) implies that


(B53) holds for each Vi, and hence TJI is ergodic. Conversely, if TJI is ergodic, and
the Vi in (B54) are taken to be in.9', then (B53) holds for Vi, and hence Vi = /1,
thus showing that /1 E .9,;.
For part (b), apply Theorem B51. One direction is immediate, since the con-
dition in (b) is a special case of the condition in Theorem B51. For the other
direction, take functions I, g of n variables, and define functions F, G of one
variable via

and
G(TJ) = E~g(TJo, TJs 2 -S 1,··· , TJSn-Sl)·
Then, using the Markov property, for s > Sn - Sl, we can write

EI(TJs 1, ... , TJsJg(TJs1+s, ... , TJsn+s)


= E(E[f(TJs 1,··· ,TJsJ I TJs.]E[g(TJs1+s, ... ,TJsn+s) I TJsnl)
= E(F(TJsJE~'ng(TJsl+S-Sn'··· , TJs))
= E(F(TJo)E~Og(TJSl+S-Sn'··· , TJs))
= E(F(TJo)G(TJs1+s-sJ).
Taking averages on s and passing to the limit gives the required result.
Background and Tools 25

Branching Processes
Branching processes are very useful in making comparisons with interacting par-
ticle systems. Suppose {fen), n 2: O} is a probability density on the nonnegative
integers. Construct a discrete time Markov chain Xn on to, I, ... } by letting the
conditional distribution of Xn+! given Xn = k be the distribution of the sum of
k independent random variables with density I. Then Xn can be interpreted as
the number of individuals in the nth generation for a population in which each
member replaces itself at integer times with a random number of offspring, chosen
with density I.
Branching processes have been studied in this and more general forms for many
years. Athreya and Ney (1972) provides a good account of the theory, though all
the facts we will need can be found in any of several standard probability books
- see Chapter 4 of Durrett (1996), for example. Here we will summarize the basic
properties of a branching process. To rule out uninteresting special cases, we will
assume that 1(0) + 1(1) < 1; otherwise, Xn cannot grow.
The first question one asks is whether the survival probability

is strictly positive or not. Note in this connection that state 0 is absorbing for the
process. The answer to this question, and other aspects of the behavior of X n , are
basically determined by the mean of I,

which we assume is finite. The process is said to be subcritical, critical, or super-


critical according to whether m < 1, m = 1 or m > 1. The martingale

plays a key role in the theory. The martingale convergence theorem implies that

M = lim Mn
n~oo

exists a.e.

Theorem B55. (a) The extinction probability 1 - p is the smallest nonnegative


solution x of the equation

L I(k)x
00
k = x.
k=O
(b) p > 0 if and only ifm > 1.
(c) If Xo = 1, m > 1, and Lk k 2 1(k) < 00 then EM = 1. In particular, M is not
identically zero.
26 Background and Tools

Some Queuing Theory


Let S be a countable set, and q (x, y) be the transition rates for an irreducible,
positive recurrent continuous time Markov chain on S:

q(x, y)::: 0, y i= x; Lq(x, y) = 0, XES.


Y

Assume that the chain is well behaved in that

sup L Iq(x, y) + q(y, x)1 < 00.


x y

Let rr be the stationary measure for the chain:

(B56) Lrr(x)q(x, y) = 0, YES.


x

Assume for simplicity that there is a unique point x* E S where rr achieves its
maximum value, and normalize rr so that rr(x*) = 1.
Define now a queuing system TJr associated with q (., .) in the following way:
At any given time, TJr(x) E {O, 1, ... ,oo} is regarded as the number of customers
in queue x. For x i= y such that TJr(x) ::: 1, at rate q(x, y), a customer moves from
queue x to queue y. The effect is that TJr (x) decreases by 1 and TJr (y) increases
by 1. The process can be formally defined by monotonicity arguments, since we
have allowed the number of customers in a queue to be infinite. To do so, note
first that the process is well defined whenever the initial configuration TJ satisfies

L TJ(x) < 00.


x

For two different initial configurations TJ, l; that satisfy TJ(x) ::::: l;(x), XES, the
two resulting processes can be coupled so that TJr (x) ::::: l;r (x), XES at all later
times. Thus the process can be constructed for a general initial TJ by taking a
sequence of finite configurations TJn t TJ, and defining TJr = limn TJ~.
Suppose that p (.) is a function on S that satisfies

(B57) p(x*) = 1 > p(x) > rr(x), x i= x*,


and define the measure v on the space of configurations of the queuing system by
taking TJ (x) to be independent random variables that are geometrically distributed
with parameter I - p(x):

v{TJ: TJ(x) = k} = [1 - p(x)]p(x)k, k::: 0.

By convention, v{TJ : TJ(x*) = oo} = 1. We would like to see under what conditions
we would expect v to be invariant for the system. Queue x* is automatically in
equilibrium, so we compute formally for x i= x*, k ::: 1,
Background and Tools 27

When the distribution of 111 is taken to be v, the right side above is

[1 - p(x) ]p(x)k-l [P(X)q(X, x) +L p(y)q(y, X)].


y=l=x

Thus, we will want to assume that

(B58) p(x)q(x, x) + LP(y)q(y, x) = 0, x*- x*.


y=l=x

It is not too hard to show that under this condition, v is invariant. See Andjel
(1982) for a proof of this type. Distributions of this sort that are invariant for
queuing systems are known as product form - see Kelly (1979). Note that by
(B56), (B57) and (B58),

A= L [p(x)q(x, x*) - q(x*, x)] > L n(x)q(x, x*) + q(x*, x*) = 0.


x=l=x' x=l=x'

Under our assumptions, the queue length is finite for x *-


x*, and infinite for
x = x*.In fact, x* can be thought of as a source and sink for customers in the real
queuing system on S\{x*}. The parameter A defined above can be interpreted as
the net rate at which customers are joining queue x*, and hence, since the system
is in equilibrium, the rate at which customers are coming in from 00. Define the
departure and arrival processes DI and AI as the number of customers going from
S\{x*} to x* in the time interval [0, t] and the number of customers going from
x* to S\{x*} in [0, t] respectively. The net output process is XI = DI - AI'
Our objective is to state and motivate two results that are proved by Ferrari
and Fontes (1994). In each case, we assume the conditions stated up until now,
and consider the process in equilibrium, with distribution v. The first result is
often called Burke's theorem, since it was proved by Burke in the case of a single
queue. A good general reference is Kelly (1979).

Theorem B59. Suppose

' " q(y, x)


(B60) supp(y) ~ -- < 00.
y x=l=y p(x)

Then DI is a Poisson process with rate

L p(x)q(x, x*).
x=l=x'

By the observations made above, the rate of the resulting Poisson process is
no mystery. The proof is based on the reversed process with respect to v, i.e., the
process 11; whose generator is the formal adjoint of the generator of 111 in L2(V).
An example of the use of such reversal in the context of the exclusion process is
given in the proof of Theorem 1.17 in Part III. The reversed process is simply the
queuing system corresponding to the rates
28 Background and Tools

*( ) p(y)q(y, x)
q x, y = p(x) ,

an observation that explains assumption (B60). In the reversal, the roles of arrivals
and departures are interchanged. In particular, {D t , t 2: O} and {A;, t 2: O} have
the same distribution. The latter process is Poisson by construction, and hence so
is the former.

Theorem B61. Suppose that

(B62) L iT (x)
---'--'--- < 00,
1 - p(x)
xoj=x'

"q(y, x)
SUPiT(Y) ~ - - < 00,
Y xo/=y iT (x)

and
" q(y, x)
sup [p(y) - iT(Y) ] ~ () _ () < 00.
Y xoj=x',y PX iT x

Then the net output process can be written as

(B63)

where R t is a Poisson process with rate A and Bt is a nonnegative stationary


process whose marginal distributions have finite exponential moments:

(B64)

for all E > O.

Remark. By applying the central limit theorem for the Poisson process, one im-
mediately deduces that the Dt in Theorem B59 and the X t in Theorem B61 satisfy
the central limit theorem.

The idea of the proof of this theorem is similar to that of Theorem B59.
The main difference is that the process is decomposed into a sum 1)t = 1)f + 1)~,
where the summands keep track of customers of two types, called black and red
respectively. The black customers are thought of as having entered the system
from x*, and the red ones are the ones that have entered the system from 00.
All customers at x* are labelled black. When some customer at an x =1= x* is
supposed to move, the customer that does move is chosen uniformly from among
all the customers at x, some of which will generally be black and others red.
The analogue of v for this bicolored process is the measure /L constructed in the
following way: 1) is first chosen according to v. Then each customer at queue
x =1= x* is labelled black with probability iT(x)/ p(x), and red otherwise. It turns
out that /L is invariant for the evolution of (1)f, 1)~). The reversed process (1)f*, 1)~*)
has a similar evolution to (1)f, 1);), except that the transition rates are different, and
Background and Tools 29

different for customers of the two colors. For a detailed description, see Ferrari
and Fontes (1994).
Decomposition (B63) comes about in the following way: R t is the departure
process of red customers, and Bt is the number of black customers in the system
(at queues other than x*) at time t. Since no red customers enter the system,
the decomposition is clear. Since the process is in equilibrium, B t is a stationary
process. To check (B64), note that

Bt = L 1J~(x),
x =/=x *

so that

Ee EBt = n EeEry~(X) n E[EeEry~(X)


x=/=x*
=
x =/=x *
!1Jt(X)]

=n E[
P(X) +n(x)(eE _l)]ryt(X)

n
x=/=x* p(x)
1- p(x)
- x=/=x* 1 - p(x) - (e E - l)n(x)'

which is finite by (B62).

The Martingale eLT


If one had to choose one tool as the most important and widely applicable in
probability theory, it would probably be martingale theory. Recall that a (discrete
time) martingale is a sequence {(Mn, ~), n 2: O} of integrable random variables
Mn and corresponding a-algebras .9'f;, which satisfy ~ C ~+l and E[Mn+l I
,gq;] = Mn for each n 2: O. The most important results about martingales are a.s.
convergence theorems and stopping time theorems. These are well known to all
probabilists. Somewhat less known are central limit theorems for martingales, so
we give one here that will be used in Part III. It can be found as Theorem 7.4 in
Durrett (1996), for example.

Theorem B65. Suppose that {(Mn, ~), n 2: O} is a martingale that satisfies

n--+oo n
in probability, and

(B66)
.
hm
L~:6 E[(Mk+l - Md, IMk+1- Mkl > EJn]
=0
n--+oo n
for every E > O. Then
Mn
In => N(O, a 2),
30 Background and Tools

where N(m, a 2 ) is the normal distribution with mean m and variance a 2 .

Our main applications of this result will be in the case that {Mn+ 1 - Mn} are
identically distributed. Note that in this case, (B66) is equivalent to E(M 1 -Mo)2 <
00.
Part I. Contact Processes

1. Preliminaries

The contact process is often thought of as a model for the spread of infection. The
collection of individuals that may be infected at any given time is taken to be the
set of vertices of a connected, undirected graph S. For such a graph, the degree of
a vertex x is the number of vertices y that are connected to x by an edge. The main
examples to be treated below are the d dimensional integer lattice Zd (in which
the degree of each vertex is 2d), and the homogeneous tree Td in which every
vertex has degree d + 1. In general, we will assume that the degrees of the vertices
are uniformly bounded. A path through S is a sequence of consecutive edges in
the graph, and its length is the number of edges used. The distance between two
vertices x, yES is the minimal length of a path from x to y, and is denoted by
Iy -xl·
While we will use the language of infection in talking about the contact process,
this process has arisen in other contexts, such as Reggeon Field Theory in high
energy physics. The contact process is a fundamental model that is often used
as a test case for new techniques or results that might apply more generally. It
has been the subject of intensive research, both rigorous within the mathematics
community, and numerical in the physics literature. An understanding of it and of
the tools that are used in its study is an important first step toward being able to
work with other models of interacting particle systems.

Description of the Process


The contact process on S with (infection) parameter A ::: 0 is a continuous time
Markov process rJt on {O, l}s. Points rJ E {O, l}s will often be identified with
subsets A of S via A = {x E S : rJ(x) = I}. Individuals in A are regarded as
infected, while the other individuals are thought of as being healthy. The transition
rates for rJt are given by

A -+ A \ {x} for x E A at rate 1, and


A -+ A U {x} for x fJ; A at rate A#{y E A : Iy - xl = I}.

Here # denotes cardinality. At times, we will also use IAI to denote the cardinality
of a finite set. In words, infected individuals recover from their infection after an
exponential time with mean 1, independently of the status of their neighbors, while
32 Part I. Contact Processes

healthy individuals become infected at a rate that is proportional to the number


of infected neighbors. Using the notation from the background section, the rate
function for the process is given by

I if 'fJ(x) = I,
c(x, 'fJ) ={
A Lly-xl=l 'fJ(y) if 'fJ(x) = O.

The fact that these rates uniquely define a well behaved Markov process is a
consequence of Theorem B3. Often we will denote the initial state of the process
by a superscript: A~ is the process with initial state A. At other times, the usual
Markov process notation will be used: pA [At E -]. A key feature of these rates is
that the infection cannot appear spontaneously. In other words, (0 is a trap for the
process.
The contact process we have just defined is often called the basic contact
process. Another version of the process will be considered in Part II, and still
other versions have been studied elsewhere - see Section 5 for details. In Part I,
we will omit the word basic from the name of the process.
An alternative way of thinking about the contact process is as follows: Infected
sites become healthy at rate I as before. In addition, each infected site generates a
new infection at rate A at each neighboring site. If the neighbor is already infected,
this new infection has no effect.
This point of view leads to a useful comparison with a simpler process known
as a branching random walk. This is a process I;t with a state space that is a
reasonable subset of to, 1,2, ... }s. It is not particularly important what the word
reasonable means here. Suffice it to say that it should not allow for explosions
to occur. Regarding I;(X) as the number of particles at x, the process evolves
according to the following rules: Particles die at rate I, and generate offspring
at each neighboring site at rate A. From this perspective, the contact process can
be thought of as a branching random walk in which particles at the same site
coalesce. Alternatively, the branching random walk can be regarded as a contact
process in which we keep track of the multiplicity of infections. Mathematically,
the branching random walk is easier to study because the offspring of different
parents evolve independently.

The Graphical Representation; Additivity


A very useful explicit construction of the contact process is based on the graphical
representation that is described on page 172 of IPS. One of the advantages is that
it lends itself to the use of ideas and terminology from the well developed theory
of percolation. (See Grimmett (1989,1999) for example.)
To carry out the construction, assign a Poisson process Nx of rate 1 to each
vertex x of S and a Poisson process N(x,y) of rate A to each ordered pair of vertices
that are joined by an edge of S. (All Poisson processes are independent.) Think of
the space-time picture S x [0, 00). For each event time t of N x a recovery symbol
* is placed at the point (x, t) E S x [0,00), and for each event time t of N(x.y),
1. Preliminaries 33

*
-4 -3 -2 -1 o 2 3 4

Figure 1

an infection arrow ---+ is placed from (x, t) to (y, t). This construction is shown
in Figure 1 in case S = Zl.
An active path in S x [0,00) is a connected oriented path which moves along
the time lines in the increasing t direction without passing through a recovery
symbol, and along infection arrows in the direction of the arrow. For example, in
Figure 1, there is an active path from (2,0) to (1, t), but not to (2, t). The process
A~ with initial state A can be obtained explicitly by setting

A~ = {y E S: :3 x E A with an active path from (x, 0) to (y, t)}.

In Figure 1, for example, A)O} = 0, while A)l} = {O, I}. Generally speaking, the
symbol P with no superscript will refer to a probability computed with respect to
the probability space on which the Poisson processes are defined.
One advantage of the graphical construction is that it provides a joint cou-
pling of the processes with arbitrary initial states. In fact, it provides a monotone
coupling, in the sense that

(1.1)

Thus the graphical representation allows us to conclude that the contact process is
attractive. Of course, it is easy to see this by checking condition (B 14) directly. It
also follows from the graphical representation that the contact process is additive:

(1.2) A AUB - AA U AB
t - t t·
34 Part I. Contact Processes

The graphical representation can be extended in various ways to couple pro-


cesses with different evolution rules. For example, suppose we want to couple
the contact processes At and Bt with parameters AA and AB respectively, where
AA < AB. All we need to do is to couple the corresponding Poisson processes: Use
the same Nx for the two processes, but for the Poisson processes that generate the
infection arrows, use N(~,y)' N(~,y), where

N(~,y) = N(~,y) + M(x,y) ,


and the M's are Poisson processes of rate AB - AA that are independent of the
NA,s. Note that this coupling satisfies (1.1). Similarly, one can couple contact
processes on SA and SB respectively, where SA is a sub graph of SB.
Along somewhat the same lines, one can use common Poisson processes in
order to couple the contact process TJI with the corresponding branching random
walk l;t in such a way that

TJo(x) ::s l;o(x) for all XES implies


(1.3)
TJt(x) ::s l;t(x) for all XES, t::: O.
Here is the idea. First construct the branching random walk in the following way.
For each XES and nearest neighbor ordered pair (x, y) use independent Poisson
processes N~ of rate 1 and Ntx,y) of rate A for i ::: 1. At an event time of N~,
replace l;t(x) by l;t(x) - 1 if i ::s l;t(x). At an event time of Ntx,y)' replace l;t(Y)
by l;t (y) + 1 if i ::s l;t (x). Then construct TJt via the graphical representation based
on the Poisson processes N; and N(~,y).
To check that (1.3) holds, we must check that no transition leads to a pair
(TJ, t;) that violates TJ ::s l;. Since each transition just involves one coordinate being
increased or decreased by 1, the only way that could happen is that for some
u, TJ(u) = l;(u) E to, I} before the transition, and TJ(u) = I, l;(u) = 0 after the
transition. If the common value of l;(u) and TJ(u) before the transition was 1, then
the transition occurred because of an event time of N ~ . But then after the transition
we would have had TJ(u) = l;(u) = O. If, on the other hand, the common value
of l;(u) and TJ(u) before the transition was 0, then the transition occurred because
of an event time of Nix,u) for an x such that TJ(x) = 1. But since l;(x) ::: 1, this
would have led to TJ(u) = l;(u) = 1.

The Upper Invariant Measure


Clearly 80 is the smallest invariant measure for the contact process, since 0 is a
trap. Here smallest is to be understood in the sense of the partial order (B8). The
biggest invariant measure can be constructed using the monotonicity (1.1). To see
this, take Ao = S (which is the biggest possible configuration) and let ILl be the
distribution of At. Then ILt ::s ILo, so applying (1.1) and the Markov property, we
see that ILt+s ::s ILs for any s. Therefore ILt is stochastically decreasing in t. In

f
particular,

fdlLt
1. Preliminaries 35

is decreasing in t for every increasing continuous function f on X - see (B8). It


follows from this and the compactness of the set of all probability measures on X
that the limiting distribution

(1.4) v = t-+oo
lim JLt

exists. This is the biggest (or upper) invariant measure of the process.
The fact that v is invariant comes from Theorem B7(e). To see that it is the
biggest invariant measure, let v be any invariant measure. Then v S JLo, so that

v = vS(t) S JLoS(t) = JLt

by (B13). Now let t -+ 00 to conclude that v S v.


The measure v has a number of special properties. For example, it has positive
correlations by Theorem B 17. It also satisfies

(1.5) v({0}) = 0 or 1.

To see this, suppose p = v({0}) < 1. Then the conditional measure v(·) = v(. I
{0}c) is again invariant, and satisfies v S v, since

and 8o S V. Since v is the largest invariant measure, it follows that v = v, and


hence that v({0}) = O. An immediate consequence of (1.5) is that

(1.6) lim v{B : B n A 1= 0} = 1


AtS

whenever v 1= 8o.
Duality

Another useful property of the contact process is its self-duality:

(1.7)

for all A, B c S (Theorem 1.7 on page 266 of IPS). Here we have used At and
B t to denote the contact process with initial states A and B respectively, in order
to avoid confusion. The most general way of proving relations such as (1. 7) is via
a generator computation. Letting H(A, B) = 1{AnBj0}, it is not hard to check that

QH(·, B)(A) = QH(A, ·)(B)


for finite A, B c S, and then to apply Theorem B3 to get (1.7). In fact, the above
generator identity can be viewed as the differential form of (1.7).
In the case of the contact process, (1.7) can also be obtained from the graphical
representation by thinking of At as being generated by paths which move in the
increasing t direction, and B t as being generated by paths which move in the
36 Part I. Contact Processes

decreasing t direction, reversing the directions of the arrows, and then using the
basic symmetry of the graphical construction.
Taking B = S in (1.7), letting t --+ 00, and using the fact that the event
{At =1= 0} is monotone in t, we see that the survival probability

(1.8) pA(A t =1= 0 V t ::: 0) = v{B :B nA =1= 0}

for every A C S. Combining this with (1.6) gives

(1.9)

whenever v =1= 80 .
The self-duality (1.7) and graphical representation can be used to construct
invariant measures that are potentially different than 80 and V. To do so, take
B C S, use the shorthand {At n B =1= 0 i.o.} (infinitely often) for the event that
At n B =1= 0 for a sequence of times t t 00, and {At n B =1= 0 f.o.} (finitely often)
for the complement of this event. Define the measure VB by prescribing the cylinder
probabilities in the following way. For finite disjoint G, H C S, put

VB{I] : I](x) =0Vx E G, I](x) = 1Vx E H}


= P(A;xJ n B =1= 0 f.o. V x E G, A)xJ n B =1= 0 i.o. V x E H).

It is easy to check that these cylinder probabilities are consistent. To check that
VB is invariant, we need a little notation. For any probability measure II and any
A C S, define
/L(A) = /l{T) : T) nA =1= !o},

so that
vB(A) = P(A~ n B =1= 0 i.o.).
Then by duality,
(VBS(t))~(A) = EAvB(A t ) = vB(A),
where the second equality comes from the fact that {A;xJ n B =1= 0 i.o.} is an
invariant event. Note that V0 = 80 , and Vs = V. When B =1= 0 is finite, VB is the
invariant measure introduced by Salzano and Schonmann (1997). Whether or not
V B is different from 80 and v depends very much on the nature of the graph S.

Convergence
The general problem of determining when convergence of IIS(t) as t --+ 00 occurs
is difficult, and the answer depends heavily on the structure of the graph Sand
the value of A. However, if S is appropriately homogeneous (e.g., if S = Zd or
Td), and if the initial distribution II is also homogeneous and satisfies 1I(0) = 0,
then it is not too hard to prove that

(1.10) IIS(t) => v


l. Preliminaries 37

as t --+ 00, where:::} denotes weak convergence. For example, this is Theorem
4.8 on page 309 of IPS if S = Zd and J-L is translation invariant. One consequence
of (1.1 0) in this case is that there are at most two extremal translation invariant
measures in g.
We tum next to the more important concept of complete convergence. This
term refers to the following property: For every initial configuration A,

(1.11)

where
Ci"A = pA(A I =f 0 V t ::: 0)
is the survival probability. Again an immediate consequence of property (1.11) is
that all invariant measures are mixtures of v and 80 . The main tool we will use in
proving complete convergence is the following:

Theorem 1.12. Suppose

(1.13) p(x E A~ i.o.) = Ci"A

for all XES and A C S, and

(1.14) lim liminfP(A~(n)


n400 t400
n B(n) =f 0) = 1,
where B(n) is the ball ofradius nand ajixed center. Then (1.11) holds. Conversely,
if (1.11) holds for every A and v =f 80, then (1.13) and (1.14) hold also.
Proof By (1.8), (1.11) is equivalent to

(1.15) lim P(A~


1-->00
n B =f 0) = Ci"ACi"B, for all finite B C S.

One inequality in (1.15) is easy to see: Using the graphical representation and the
independence of the Poisson processes used in it for disjoint parts of space-time,

p(A~nB =f 0)
= P(3 an active path from B)
(x, 0) to (y, 2t) for some x E A, y E
.:s P(3 an active path from (x, 0) to (z, t) for some x E A, z E S)
x P(3 an active path from (z, t) to (y, 2t) for some z E S, Y E B)
= P(A~ =f 0)P(A~ =f 0).

Passing to the limit gives

(1.16) lim sup P(A~ n B =f 0) .:s Ci"ACi"B, for all A, B C S.


1-->00

For the other inequality, define the stopping time

iB = inf{t ::: 0 : At ~ B}.


38 Part I. Contact Processes

By the strong Markov property and monotonicity (1.1),

(1.17) pA(A s+t nC*-0):::pA(TB:::::s)infpB(A r nC*-0), A,B,CCS.


r?::.t

Applying this twice and using duality (1.7) gives

pA(As+t+u nD *- 0) ::: pA(TB ::::: s) r;::t+u


inf pB(A r n D *- 0)

= pA(TB ::::: s) inf pD(A r n B *- 0)


r;::t+u
::: pA(TB ::::: S)pD(TC ::::: t) inf pC(A r n B
r?:.u
*- 0).
Applying this to B = C = B(n), letting first u, s, t -+ 00 and then n -+ 00 and
using (1.14) gives

liminf pA(Au n D
u----+oo
*- 0)::: lim pA(TB(n) < oo)pD(TB(n) <
n----+oo
00).

But by (1.13),
pA(TB < 00) ::: aA
for all A C S and all finite B C S, and this completes the proof of one direction.
For the other direction, suppose (1.11) holds for all A. Then (1.15) holds, and
(1.14) follows immediately from this, by taking A = B = B(n) and using (1.9).
To check (l.13), use (1.15) to conclude that

p(A1 n B *- 0 i.o.) = lim P(A:


t-+oo
n B *- 0 for some s ::: t) ::: aAaB.
Since p(x E A;Y}) > 0 for all x, y, it follows that

p(x E A1 i.o.) ::: aAaB.

Applying this with B = B(n) and using (1.9) gives

p(x E A1 i.o.) ::: aA.

Remark. Since S is connected, condition (1.13) is independent of x. Also, by


additivity, (1.13) holds for all finite A C S if and only ifit holds for all singletons.

Monotonicity and Continuity in A


The properties we have discussed so far deal primarily with the contact process
with a fixed value of A. Most important issues in this field are concerned with
how the behavior of the process changes when A changes. In this connection, it
is worth recalling that the graphical construction makes it clear that the contact
process is monotone in A: Monotone families of Poisson processes indexed by A
can be used in the graphical representation, and with this joint construction, if At
and Bt have parameters AA and AB respectively with AA ::::: AB, then (1.1) holds
here as well.
1. Preliminaries 39

Generally speaking, it is fairly easy to prove continuity of reasonable functions


of A that depend on the process for finite time periods. Functions that depend on
the process for all times can easily be discontinuous in A - see Theorem 4.65(f),
for example. Theorem B5 provides one approach to proving continuity in the case
of a finite time horizon. Another approach, which can give explicit estimates, is
based on the graphical representation.
To illustrate this approach, suppose that the degrees of the vertices of S are
bounded by K. An easy coupling shows that there is a pure birth process Yt on
the positive integers with Yo = 1 and transitions

n --+ n + 1 at rate n K A

so that cardinality of the set of all sites infected up to time t satisfies

(l.l8)

provided that IAI = 1. To see this, simply ignore recoveries in the contact process,
and note that any set of size n has at most nK neighbors.
Let Tl, T2, ... be independent, exponentially distributed random variables with
means
1
ETn = --.
nKA
These can be thought of as the holding times at the various integers for the process
Yt . Therefore, for e > 0, the exponential form of Chebyshev's inequality gives

P(Yt > N) = P(TI + T2 + ... + TN :::: t)


:::: Ee e(t-r l -r2 -···-rN )

= eet N nKA [ N
et- e ]
< ex
DnKA+e - p ~nKA+e '

were we have used the inequality 1 - x :::: e- X in the final inequality. Taking
e= mKA for any integer m leads to

since

L --
N I
::: L I N m+n+ l 1
-dx =
I m+ N + l 1 m+N +1
-dx = log - - - -
m +n
n=l n=l m+n X m+l X m+1

Combining this with (l.l8) and additivity gives

(l.l9)

for all finite A, t and k.


40 Part I. Contact Processes

Now we can use (1.19) to check continuity in A of various quantities. The idea
is to take AA < AB, and let At and B t be the contact processes with parameters
AA and AB respectively with a common initial configuration A, coupled by using
the graphical representation with the same Poisson processes associated with the
recovery symbols, and Poisson processes associated with the infection arrows that
are obtained as follows:

N(~.y) = Nt,y) + an independent Poisson process with parameter AB - AA.

Conditional on the Poisson processes {Nx }, {N(~,y)} for all x, y, the number of
infection arrows that could lead to an extra infection in the process Bs up to time
t has a Poisson distribution with parameter

(AB - AA) t Llds:s (AB - AA)K t lAs Ids.


10 x,y:xEA, 10
Ix-yl=1

It follows that

P(B s =1= As for some s :s t) :s 1 - EA exp [ - (AB - AA)K lot lAs Ids ]
(1.20)
:s (AB - AA)K lot EAIAslds,

where we have used the inequality 1 - e- u :s u in the last step. Using (1.19)
and (1.20), continuity in A can be easily shown for any reasonable function of the
process on a finite time interval. Rather than formulate a general theorem here,
which would necessarily have unpleasant assumptions, we will show how to use
(1.19) and (1.20) to prove continuity when the need arises later. However, the
idea should be fairly clear: (1.19) says that the set of sites ever infected by time
t is not too large. But then taking AB - AA small in (1.20) says that with large
probability, the two processes agree up to time t. One place where this argument
is worked out in detail is the proof of Proposition 4.33.

Rate of Growth
Bound (1.19) says nothing about how rapidly the cardinality of At grows as t t 00.
If there were no restriction on the number of infections per site (i.e., if this were
a branching random walk), then the size of the infection would in general grow
exponentially in time. However, this restriction leads to slower growth in general,
and polynomial growth on Zd, for example. To see this, let At be the contact
process on Zd, and let Bt be the process obtained from At by suppressing all
recoveries. Let Pt (x, y) be the transition probabilities for the simple random walk
on Zd that moves to each neighbor at rate A.

Proposition 1.21.
1. Preliminaries 41

Proof Let l;t be the branching random walk with no deaths: l;t (x) increases by 1
at rate
A l;t(Y)· L
ly-xl=!
Then the means of l;t satisfy the system of differential equations

The solution to this system is

E{ l;t(x) = e2dtA L Pt(x, y)l;(y).


y

To see this, write

:t [e 2dtA L Pt(x, y)l;(y)] = 2dAe2dtA L Pt(x, y)l;(y)


y y

+ e2dtA ~ [A IZ~=! Pt(z, y) - 2dAPt(X, y) ]l;(Y)

= Ae2dtA L L Pt(z, y)l;(y).


Iz-xl=! y

Since BiO} can be coupled to l;t with l;0(0) = 1 and l;o(x) = 0, x =F 0 so that
BiO} C {x : l;t (x) ::: I},

we have

p(x E BiO}) ::: p(l;t(x) ::: 1) ::: El;t(x) = e2dtA pt(0, x),
and the result follows.

In order to control the right side of the inequality in Proposition 1.21, we need
the following weak form of the large deviations bound for random walks.

Lemma 1.22. For every a > 0 there is a b > 0 so that

(1.23) L Pt(O, x) ::: be-at.


Ixl:o:bt

Proof It is enough to prove this for d = 1, since the one dimensional result can
be applied to the d coordinates of the d-dimensional random walk. By symmetry,
it is then enough to prove (1.23) where the sum is taken over positive x's only.
Let Xt be the one dimensional random walk starting at 0 that moves to each
neighbor at rate A. Then for y ::: 0, the exponential form of Chebyshev's inequality
gives
42 Part I. Contact Processes

p(Xt ~ bt) S Eey(X,-btl = J-b y+).(e Y +e- Y -2 l]r,


which provides the required result - simply take any y > 0 and then a large b.

Proposition 1.24. For k ~ 1 there is a constant c independent oft so that

EIAjO}l k S c(l + t kd ).

Proof Choose y so that

Then breaking up the multiple sum below according to whether any IXi I ~ nand
if so, which IXi I is largest, we see that

E IAt(O) Ik S E IBt(O) Ik = '"'


~ P (XI( EOBtJ , ... , Xk E Bt{OJ)

S yknkd +k L yk-Ilxl(k-lld p(x E Bt(O}).


Ixl:::n

Now use the Schwarz inequality, Proposition 1.21, and Lemma 1.22, replacing n
by bt, where b is chosen to satisfy (1.23) for an a > 4dA. The result is that

EIAjO} Ik S yk (bt)kd + kyk-Ie2dtAJ"be-at/2 L IxI 2(k-l)d Pt(O, x).


x

The second summand on the right tends to zero as t ~ 00, since the expression
inside the square root grows polynomially in t, and this gives the result.

Survival and Extinction; Critical Values


The most important feature of the contact process is that survival and extinction
can both occur. Which of these occurs depends on the value of A. The contact
process is said to die out (or become extinct) if

otherwise it is said to survive. It is said to survive strongly if

p(x} (x E At i.o.) > O.

Note that neither of these properties depends on x, since S is connected. The


process is said to survive weakly if it survives but does not survive strongly.
Using these definitions and the monotonicity of the process in A, one can define
two critical values 0 S Al S A2 S 00 by
At dies out if A < Al
(1.25) At survives weakly if Al < A < A2
At survives strongly if A > A2.
1. Preliminaries 43

Relation (1.8) leads to a second interpretation of AI: V is the point mass at the
empty set 80 if A < AI, but is nontrivial if A > AI .
A host of questions is implicit in these definitions. For example,
(a) Is Al > O?
(b) Is A2 < oo?
(c) Is Al < A2?
(d) What happens when A = AI or A = A2?
(e) What is the limiting behaviour of the process for initial configurations other
than S itself? In particular, what are the invariant measures for the process.
Here are a number of facts that are either easy to see, or are proved in IPS:
(a) If all vertices of G have degree at most K, then

1
(1.26) AI> - .
-K
This is an easy consequence of comparison (1.3) of the contact process with a
branching random walk ~t. To see this, simply note that

is dominated by a branching process that is subcritical if K A < 1. (See Theorem


B55.) If S = Zd, this bound of 1/(2d) can be improved to

1
(1.27) AI>--
- 2d-l
(page 166 of IPS). If d = 1, it has been further improved to Al 2: 1.539 (page
289 of IPS).
(b,c) If S contains a copy of Zl, then both critical values for S are bounded
above by the corresponding critical value for Zl by an easy coupling based on
the graphical representation. For S = Zl,

(1.28)

and in fact the process with A = 2 survives (Theorem 1.33 on page 274). For
S = Zd,

(1.29) A(d) < ~


I - d

(Theorem 4.1 on page 307 of IPS), and in fact

(1.30) · d'1\.1(d)
11m -- ~
d~oo 2
«4.7) on page 308 of IPS).
Very little was known about the answers to questions (c), (d) or (e) at the
time IPS was written outside of the case S = Z I. In that case, it was known that
44 Part I. Contact Processes

complete convergence (1.11) holds for A > A\ = A2. (Theorem 2.28 on page 284
of IPS). For larger values of d, results such as the complete convergence theorem
were known to hold for very large A, but not for all A.

Preview of Part I
The main objective of the next section is to give complete answers to questions
(c), (d) and (e) when S = Zd. It turns out that the answers are those that were
expected based on what was known about the one dimensional case, but the proofs
are quite different. Following this, we will derive exponential bounds for various
quantities: In the supercritical case, if the process does die out, it does so very
quickly. In the subcritical case, the process does die out very quickly.
Section 3 deals with the question of how the critical behavior of infinite systems
is reflected in the behavior of large finite systems. For the system on {I, ... ,N}d
starting with all sites infected, the process dies out after a time that is logarithmic
in N in the subcritical case (i.e., A is subcritical for the infinite system), and after
a time that is exponential in N in the supercritical case. Some of the results in
this section are based on theorems proved for the infinite system in Section 2.
Section 4 gives answers to questions (c), (d) and (e) for contact processes on
homogeneous trees. We will see that not only the techniques, but also the results,
tum out to be quite different from the Zd case, and it is this fact that makes them
so interesting. In particular, we will see that, unlike the case of Zd, A\ < A2, and
for values of A between the two critical values, there are infinitely many extremal
invariant measures.

2. The Contact Process on the Integer Lattice Zd

Our main objectives in the first part of this section are to prove the following for
the contact process on Zd:
(a) There is no intermediate phase, i.e., A\ = A2.
(b) At dies out at this common critical value.
(c) Complete convergence (1.11) holds for all A.
Following this, we will prove some exponential bounds in the supercritical case,
and then focus on the subcritical case, proving that the process dies out exponen-
tially rapidly.
Statements (a) and (c) were proved in IPS (Theorem 2.28 on page 284) for the
case d = I, using arguments based on edge speeds that work only in one dimen-
sion. Bezuidenhout and Grimmett (1990) developed entirely different techniques
that led to proofs of all three statements for all d ~ I. Note that even though we
have stated (a), (b) and (c) separately, (c) easily implies (a), so the main point is
to prove
(d) At dies out at A\, and
2. The Contact Process on the Integer Lattice Zd 45

(e) complete convergence holds for A > A\.


The crux of the proofs of these statements is finding some condition that
(i) depends only on the Poisson processes from the graphical representation cor-
responding to a finite space-time region, and
(ii) is equivalent to survival of the process.
The existence of such a condition may not seem plausible, since survival is
inherently an infinite time horizon statement. Nevertheless, such a condition does
exist. Because of (i), this condition will tum out to be continuous in A (see the
discussion surrounding (1.20», and this will give (d). The proof that the condition
implies survival will give more - it will imply that survival occurs in a very
strong sense, and this will lead to (e). Proofs of the general type we will use
in this section are known as block arguments. They lead to comparisons with
supercritical oriented percolation processes. The rough idea is that space-time is
partitioned into blocks. These blocks are regarded as the vertices that occur in
oriented percolation.
We will not write down the finite space-time condition at the outset, but rather
will assume that the process survives, and then build the condition in stages.
(Readers who cannot stand the suspense can look ahead to Theorem 2.12 below,
where the condition is stated explicitly.) The first step is to show that if the
contact process survives, then the contact process restricted to a large space-time
box (-L, L)d x [0, T] has the property that there are many infected sites on
various parts of the boundary of that box with high probability, provided that the
initial configuration is sufficiently large.

The Boundary of a Big Box Has Many Infected Sites


We begin by noting that survival of the unrestricted process is very likely if the
initial state is sufficiently large.

Proposition 2.1. Suppose At survives. Then

lim p(A~-n,nld
n-+oo
-+ 0 V t :::: 0) = 1.

Proof This is a special case of (1,9),

For L :::: 1, let LAt be the truncated contact process defined via the graphical
representation, but using only paths with vertical segments corresponding to sites
in (-L, L)d and infection arrows from (x, ,) to (y, ,) with x E (-L, L)d, The
next two results combine to say that there are many infected sites in an orthant
of the top of the (large) space-time box (-L, L)d x [0, t], In these and the results
that follow them, arguments based on correlation inequalities play a prominent
role,
46 Part I. Contact Processes

Proposition 2.2. For every finite A and every N ::::: 1,

lim lim P(ILA~I


(-+00 L-+oo
::::: N) = P(A~ =fo 0 V t ::::: 0).

Remark. Note that the order of the limits above is important. Since the contact
process on a finite set dies out,

(2.3) lim P(ILA~I


t-'>oo
::::: N) = 0

for every L.

Proof of Proposition 2.2. Since

it follows that

(2.4) lim P(ILA~I


L-'>oo
::::: N) = P(IA~I ::::: N).

For an initial configuration of cardinality n, the probability that all n sites recover
before there is any infection is at least the probability that the maximum of n
independent exponential random variables with parameter I is smaller than the
minimum of 2dn independent exponential random variables with parameter A.
Therefore, since this mininmum is exponentially distributed with parameter 2dnA,

1 ]IA'I
P(A t = 0 for some tl,¥') > [ ,
- 1 + 2dAIAsi

where .97, is the a-algebra generated by the graphical representation up to time s.


By the martingale convergence theorem,

P(A t = 0 for some tl.¥.) ---+ I{A,=0 for some t} a.s.

as s ---+ 00. It follows that

(2.5) lim
t-'>oo
IAtl = 00 a.s. on {As =fo 0 V s ::::: OJ.

The statement of the proposition is a consequence of (2.4) and (2.5).

Proposition 2.6. For every n, N ::::: I and L ::::: n,

Proof Let Xl = ILA~-n,nld n [0, L)dl, and X2, ... ,X2d be defined similarly with
respect to the other orthants in R d , so that
2. The Contact Process on the Integer Lattice Zd 47

(2.7) I L A t[-n,njd I ::: X1+"'+ X2 d •

Then X I, ... , X 2d are identically distributed, and are positively correlated by


Corollary B 18, since they are increasing functions of the infection Poisson pro-
cesses and decreasing functions of the recovery Poisson processes. (In applying
this result here and elsewhere, it is necessary to discretize the Poisson processes
used in the graphical representation, to apply Corollary BI8 to the Bernoulli ran-
dom variables that appear in the discretization, and then to pass to the limit. See
the discussion following the statement of Theorem B21.) Therefore

P(ILAf-n,njdl :::2 dN)::: P(XI +",+X2d :::2 dN)


2d
::: P(Xi ::: N for all 1 ::: i ::: 2d) ::: [P(X I ::: N)] .

Next we tum to the sides of the space-time box. For x E Zd, write x
(XI, ... , Xd) and Ix I = maXi Ix;!. The inequality x ::: 0 will mean Xi ::: 0 for all i.
Let
S(L, T) = {(x, s) E Zd X [0, T] : Ixl = L}
be the union of the sides of the box (- L, L)d X [0, T] and put

LA = Ut:::o( LAt X {tl) C Zd x [0,00),

which is the set of space-time points that are infected by the truncated process.
Let NA(L, T) be the maximal number of points in a subset of S(L, T) n LAA
with the following property: If (x, Sl) and (x, S2) are any two points in this set
with the same spatial coordinate x, then lSI - s21 ::: 1.

Proposition 2.S. Suppose L j t 00 and 'Fj t 00. For any M, N and any finite
A C Zd,

lim sup P(NA(L j , 'Fj) ::: M)p(ILA:


j-+oo J J
I ::: N) ::: P(A: = 0 for some s).

Proof Let .'.¥L,T be the a-algebra generated by the Poisson processes from the
graphical representation in (-L, L)d x [0, T]. The first step is to prove that if
A c (-L, L)d, then

(2.9)
p( A: = 0 for some SI.'.¥L'T) ::: [1 ::~Jk
a.s. on {NA(L, T) + ILA:I ::: k}.
To begin to check (2.9), note that for each point x E LA: there is probability
(1 + 2d'A)-1 that a recovery symbol occurs on the time line above (x, T) before
any infection arrows occur emanating from that time line. To see this, consider this
time line {x} x [T, 00), and the Poisson processes associated with it in the graphical
48 Part I. Contact Processes

representation. The first recovery symbol after time T comes after an exponential
time with parameter 1, while the first infection arrow to a given neighbor of
x comes after an exponential time with parameter A. These exponential times
are independent. An elementary computation shows that if O"i are independent
exponential random variables with parameters Yi respectively, then for any j,

In our application, Yj = 1 and Yi = A for 2d i's. If the first recovery symbol


precedes the first infection arrow on that time line, then there is no infinite active
path passing through (x, T). Therefore, with probability at least (1 + 2dA)-I,
IL I
this site x cannot contribute to survival of the process. If A~ = I, then the
conditional probability that no x EL A~ contributes to survival is at least (1 +
2dA)-I.
Now consider a time line {x} x [0, T] above (x, 0), where Ix I = L, and let

(x, Sl), ... ,(x, Sj)

be a maximal set of points on this time line in S(L, T) n LAA with the property
that each pair is separated by at least distance 1. Assume j :::: 1, since otherwise,
nothing on this time line can contribute to survival. Let

1= U{=I({X} X (Si -l,si + 1)).


Then all points on this time line in S(L, T) n LAA are contained in I by the max-
imality assumption. The Lebesgue measure of I is at most 2j, so the probability
that there are no infection arrows to any of the 2d neighbors of x emanating from
I is at least e- 4djA , the probability that 2d independent Poisson random variables
with parameter 2jA are all zero. For each interval of length u in the complement
of I in this time line {x} x [0, T], the probability that there is no infection arrow
emanating from it, or if there is, it is preceded by a recovery symbol, is

e- 2dAU + jo 2dAe-2dAS [1 _ e-S]ds = 1 - jU 2dAe-O 2dA )sds> 1 + 2dA .


u
0
+
-
1

These events are independent, since they refer to disjoint parts of the graphical
representation, so the probability that none of the points on this time line in
S(L, T) n LAA contributes to survival of the process is at least

e-4dA ]'
[
1 +2dA
The numerator comes from the points in I, while the denominator comes from
points in the complement of I in {x} x [0, T]. Considering the contributions from
all the various x's gives (2.9).
Write G = {A~ = 0 for some s} and H j = {NA(L j , 'Fj) + I LjA~ I :s k} for a
fixed k. By the martingale convergence theorem, }
2. The Contact Process on the Integer Lattice Zd 49

P(GI.¥Lj.T) --+ 1G a.s.

as j --+ 00. By (2.9), P(GI.¥Lj,T) is bounded below on H j . Therefore,

{Hj i.o.} c G.
It follows that

(2.10) lim sup P(Hi ) :s P(G)


i~oo

Next, write

P(NA(L, T) + IL A~I :s M + N) 2: p(NA(L, T) :s M, IL A~I :s N)


2: P(NA(L, T) :s M)p(IL A~I :s N),

where the second inequality is a consequence of positive correlations, Corollary


B 18. Combining this with (2.10) gives the statement of the proposition.

The next result is similar to Proposition 2.6. To state it, define

S+(L, T) = {(x, s) E Zd X [0, T] : XI = +L, Xi 2: 0 for 2 :s i :s d},


and let N~(L, T) be the maximal number of points in S+(L, T) n LAA such that
each pair of these points on the same time line is separated by at least a vertical
distance of 1.

Proposition 2.11. For any L, M, T and n < L,

Proof Let XI = N~-n,njd (L, T), and define X 2 , ••. , X d2 d similarly by replacing
the first coordinate in the definition of S+(L, T) by any of the d choices of
coordinates, and the positive signs used in the definition of S+(L, T) by any of
the 2d choices of signs. These random variables are identically distributed, and
are positively correlated by Corollary B18. Furthermore,

Therefore

:s M)] d2 :s M) x ... X P(Xd2d :s M)


d
[P(XI = P(XI
:s P(XI :s M, ... , X d2 :s M) d

:s P(Nl-n,n jd (L, T) :s Md2 d).


50 Part 1. Contact Processes

The Finite Space-Time Condition


By now, we have almost proved the necessity of the equivalent condition for
survival that we have been looking for, so this is a good time to state the condition,
and complete the proof of its necessity.

Theorem 2.12. If At survives, then it satisfies the following condition:


For every E > 0 there are choices ofn, L, T so that

(2.13)

and

p( L+2nA~~7,nld :J x + [-n, n]d for some 0 S t < T,


(2.14)
and for some x E {L + n} x [0, L)d-J) > 1 - E.

Proof The idea of the proof is to use Propositions 2.2,2.6,2.8 and 2.11 to construct
a big space-time box with many infected points on its boundary, and in fact, on
certain orthants of its boundary. If there are enough infected points, then at least
one of them will generate an infected cube of side length 2n in the extra time
period of length 1 that we are allowing ourselves. We will start with a 0 < 8 < 1,
and show at the end how to choose it in terms of the given E > O.
Given 8 > 0, use Proposition 2.1 to choose an n so that

(2.l5)

Choose N so large that any N points in Zd will contain a subset of at least N'
points, each pair of which is separated by an Loo distance of at least 2n + 1, where
N' is chosen so large that N' independent trials with success probability

will contain at least one success with probability at least 1 - 8, i.e.,

[ I-P ( nAJ{OJ :J[-n,n] d)]N' So.


Similarly, choose M so large that any M points in Zd will contain a subset of at
least M' points, each pair of which is separated by a distance of at least 2n + I,
where M' is chosen so large that M' independent trials with success probability

will contain at least one success with probability at least 1 - 8.


2. The Contact Process on the Integer Lattice Zd 51

Since P(IL A~-n.nt I ::: 2d N) is continuous in t, and since 0 < 1 - 8 < 1 - 82 ,


inequality (2.15), Proposition 2.2 and (2.3) imply that there exist L j t 00 and
1j t 00 so that

(2.16)

for each j ::: 1. Applying Proposition 2.8 with M and N replaced by M d2 d and
N2 d respectively, it follows that for some j,

(2.17)

Letting L = L j and T = Tj for that choice of j, and applying Propositions 2.6


and 2.l1 to (2.l6) and (2.17) respectively, we see that

(2.18)

and

(2.19)

By our choices of Nand M, and the fact that the Poisson processes used in the
graphical representation are independent on disjoint space-time regions, (2.18) and
(2.19) then imply that

p( L+2nA~-:(ld :l x + [-n, n]d for some x E [0, L)d) ::: [1 - 82- d][l - 8],

and

p( L+2nA~~~,nld :l x + [-n, n]d for some O:s t < T, x E {L + n} x [0, L)d-l)


::: [1 - 82- d
/ d ][l_ 8].
It is clear then that 8 should have been chosen originally so that

[1 - 82 - d ][1 - 8] ::: 1 - E and [1 - 82 - d / d ][1 - 8] ::: 1 - E.

Doing this completes the proof.

Comparison with Oriented Percolation


The next step is to show that the condition appearing in Theorem 2.l2 implies
the survival of At. This is done by making a construction that leads to a compar-
ison between the contact process and a type of supercritical oriented percolation
process. The fact that there are two statements in the condition in Theorem 2.12
turns out to be somewhat inconvenient in carrying out this construction. The next
result has the effect of combining them into one.

Proposition 2.20. Suppose the condition appearing in Theorem 2.12 is satisfied.


Then for every E > 0, there are choices of n, L, T so that
52 Part 1. Contact Processes

p( 2L+3nAl-n,n)d :l X+[ -n, n]d for some T :::: t < 2T and


(2.21)
x E [L +n,2L +n] x [0,2L)d-l):::: 1- £.

Proof Given £ > 0, choose n, L, T so that (2.13) and (2.14) are satisfied. Using
(2.14) first, we see that with probability :::: I - £, there exist x and t with the prop-
erty appearing in that probability. Now consider the process restarted at time t + I
with initial state x + [-n, n]d. Use the strong Markov property and monotonicity
and apply (2.13) to conclude that (conditionally on the first event considered) with
probability:::: I - £ there is a y so that y - x E [0, L)d and the restarted process
at time T + I covers y + [-n, n]d. Putting these statements together, it follows
that

p( 2L+3nAl-n,n)d :l x + [-n, n]d for some T + I :::: t < 2T + 2 and


x E [L + n, 2L + n] x [0, 2L)d-l) :::: (1 - £)2.

Now replace T + I by T and (1 - £)2 by I - £ to complete the proof.

Next we will carry out the fundamental construction that will shortly lead to
the comparison with oriented percolation. Recall that an active path is a connected
oriented path in the graphical representation that moves along the time lines in the
increasing t direction without passing through a recovery symbol, and along the
infection arrows in the direction of the arrows.

Proposition 2.22. Suppose the condition appearing in Theorem 2.12 is satisfied.


Then for every £ > 0, there are choices of n, a, b with n < a so that if (x, s) E
[-a,a]d x [0, b), then

P (3(Y, t) E [a, 3a] x [-a, a]d-l x [Sb, 6b] and there are active paths

that stay in [-Sa, Sat x [0, 6b] and go from

(x, s) + ([-n, n]d x {O}) to every point in (y, t) + ([-n, n]d x {on) : : 1- E.

Proof The idea is to apply Proposition 2.20 repeatedly (between four and ten
times) to move the center (x, s) of a cube in four to ten steps to the center (y, t)
of a cube in such a way that if the first cube is fully infected, then so will be the
final one. In doing so, it is important to remember that while Proposition 2.20 was
stated for x in the positive box [L + n, 2L + n] x [0, 2L)d-l, (2.21) is true by
symmetry if this box is replaced by boxes obtained from it by reflections about
the coordinate planes in Zd. Thus we are free at each stage of the construction to
use any sign for each of the d coordinates.
2. The Contact Process on the Integer Lattice Zd 53

Take n, L, T so that (2.21) is satisfied. Let a = 2L +n and b = 2T. Here are


the rules we will follow at each step of the construction:
(i) For 2 :::: i :::: d, if the current center (z, r) has Zi 2: 0, move the ith
coordinate in the negative direction; if Zi < 0, move it in the positive direction.
Note that since a 2: 2L, this coordinate will never leave [-a, a].
(ii) Move the first coordinate of the center in the positive direction until it
reaches [a, 3a], and then move it in the positive direction if it is to the left of 2a
and in the negative direction if it is to the right of 2a. Since it always moves by
at least L + nand 4(L + n) 2: 2a, it will reach [a, 3a] in at most four steps. Since
2L + n = a, it will remain in [a, 3a] thereafter.
(iii) Move the time coordinate r of the center (x, r) upward until it exceeds
5b. Since it always moves between T and 2T units, this will happen after four to
ten steps. Since b = 2T, it will not overshoot the height 6b. The entire process
ceases when this occurs.
Note that at each stage of the construction, only Poisson processes correspond-
ing to sites in [- 5a, 5a]d are ever used. Also, the various steps of the construction
use Poisson processes in (random) disjoint time intervals, so the strong Markov
property implies that the construction succeeds with probability at least (1 - E) 10.
By changing the value of E appropriately, we obtain the statement of the Propo-
sition.

Finally we are ready for the comparison with (independent) oriented site perco-
lation that provides the converse to Theorem 2.12. To avoid confusing the percola-
tion process with the contact process, we will denote the oriented site percolation
process defined prior to Theorem B24 by B k .

Theorem 2.23. Suppose the condition appearing in Theorem 2.12 is satisfied. Then
for every p < 1 there are choices ofn, a, b with n < a so that the following holds:
If the initial configurations Bo and A satisfy

j E Bo implies A J x+[ -n, n]d for some x E [a(4j -1), a(4j + 1)] x [-a, a]d-l,

then {A~, t 2: O} can be coupled with {Bb k 2: O} with parameter p so that

(2.24a) j E Bk implies A~ J x + [-n, n]d

for some

(2.24b) (x, t) E [a(4j -2k-l), a(4j -2k+ 1)] x [-a, a]d-l x [5bk, b(5k+ 1)].

In particular, At survives.

Remark. As will be clear from the proof, this coupling can also be achieved if
At is replaced by the process obtained using only the Poisson processes in the
graphical representation that correspond to x E Zd with Ix;! :::: 5a, 2 :::: i :::: d.
54 Part I. Contact Processes

Proof of Theorem 2.23. There are two stages in the construction. In the first,
we do not try to achieve the conditional independence properties required in the
definition of the oriented percolation process. The Bernoulli random variables
needed in constructing the Bk 's are generated recursively in k, using the graphical
representation on which the construction of the contact process is based. Suppose
{Bi' i :s k} have been constructed. If Bk n {j - 1, j} =1= 0, then (2.24) holds for
j -lor j. The construction provided by Proposition 2.22 succeeds with probability
2: 1 - E, so this can be used to generate the appropriate Bernoulli random variable,
provided that 1 - E > p.
This completes the first stage ofthe construction. We are not done yet, however,
since these Bernoulli random variables are not independent. However, they are
m-dependent for some m - see the definition of m-dependence following the
statement of Theorem B26. It is for this reason that we wanted the active paths
occurring in the statement of Proposition 2.22 to remain in [-5a, 5a]d x [0,6b].
Because of this m-dependence, we can use Theorem B26 to construct independent
Bernoulli random variables that lie below the dependent ones, provided that we
take 1 - E » p. It is important, of course, that the value of m not depend on the
choices of a, b, n, but this is clearly the case.
To check that At survives, it is enough to take p large enough so that the
conclusions of Theorem B24 are satisfied.

First Consequences of the Percolation Comparison


We are now ready to reap the benefits of the constructions just carried out. Recall
that A] and A2 are the (apriori possibly different) critical values defined in (1.25).
Once we know that they are the same, the critical contact process is the one whose
parameter is this common value of A.

Theorem 2.25. (a) A] = A2.


(b) The critical contact process dies out.

Proof For part (a), take A > A]. Then At survives. By Theorems B24, 2.12 and
2.23, there exist n, a, b and a corresponding supercritical oriented site percolation
process Bk with Bo = {OJ which lies below it in the sense of (2.24). Again by
Theorem B24, P(B2k = k) is bounded below in k, so that P(B2k = k i.o.) > O.
Therefore by (2.24), with positive probability, there are infinitely many choices of
k so that
A\-n.n]d :> x + [-n, n]d for some (x, t) E [-a, a]d x [lObk, b(lOk + 1)].
For every x E Zd,
p(x E A\-n.n]d)
is strictly positive for t > 0 and continuous in t (by Theorem B3). Therefore it is
bounded below by a positive number for (x, t) in compact subsets of Zd x (0, 00).
For the process with initial state {OJ, there is positive probability of covering
[-n, n]d by time 1. By the Markov property and monotonicity, we may therefore
2. The Contact Process on the Integer Lattice Zd 55

consider the process starting with [-n, n]d instead of {OJ. Every time a box of side
length 2n that is a bounded distance from the origin is covered by the process, there
is a positive probability that the process will cover 0 one unit of time later. This
is a consequence of the observation at the end of the last paragraph. Therefore,

(2.26) P (0 E AjO) for a sequence of times t too) > 0

by the generalized Borel-Cantelli Lemma (Corollary 3.2 of Chapter 4 of Durrett


(1996)). Therefore the contact process survives strongly. We have shown that
A > Al implies A ::: A2, from which part (a) of the theorem follows.
For part (b), take A so that At survives. Then the condition stated in Theorem
2.12 is satisfied. The probabilities on the left of (2.13) and (2.14) are continuous
in A by (1.20). Therefore, the condition in Theorem 2.12 is satisfied for some
A' < A. It follows from Theorem 2.23 that the contact process with parameter A'
survives also, so that A' ::: AI. Therefore, A > AI. We have shown that survival
implies A > AI. Therefore, A = AI implies extinction.

Since we now know that Al = A2, we will denote their common value by Ae for
the rest of this section and Section 3. The next result is the complete convergence
theorem. Since the process dies out for A ::: Ae, the only case of interest is A > Ae.

Theorem 2.27. Suppose A > Ae. Then for every A C Zd,

A~ ::::} (XAV + [1 - (XA]00

as t ---+ 00, where::::} denotes weak convergence, and (XA is the survival probability

Proof We need to check the conditions of Theorem 1.12. The first one is easy:
Let G be the event

G = {O E At for a sequence of times t too}.


For any x E Zd, monotonicity and the strong Markov property give

(2.28)

Here is the argument that leads to (2.28): First, G is an invariant event (i.e.,
invariant under time shifts), and is therefore a tail event. Therefore the equality in
(2.28) is just the Markov property at time s. To check the inequality in (2.28), let

a = inf{t : 0 E At},

and let g;;: be the a-algebra associated with this stopping time. On the event
{x E As},
56 Part I. Contact Processes

pA'(G) ::: p(x)(G) ::: E(x)[P(G I ~), a < 00]


= E(X)[pAa(G), a < 00]
::: E(x)[P(O)(G),a < 00]
= p(O)(G)p(x)(a < (0).
Here we have used monotonicity of the process in the first and third inequalities,
and the strong Markov property in the first equality. This proves (2.28).
Since p(x E AIO)) > 0 for every x,

P(O E A)x) for some t) = P(x E A)O) for some t) ::: p(O)(G),

so that (2.28) gives

and therefore

(2.29)
By (2.26) p(O)(G) > 0, and by the martingale convergence theorem, P(GI~ -+
1Ga.s. as s -+ 00. So, (2.29) implies that

{As =1= 0 V s} c G a.s.


This proves (1.13).
For the proof of (1.14), we consider only the case d ::: 2. There are three
reasons for this. First, the complete convergence theorem in one dimension appears
as Theorem 2.28 on page 284 ofIPS - the proof there uses one dimensionality in a
crucial way. Secondly, the whole approach used in this section relies on properties
of oriented percolation that are analogous to properties of the one dimensional
contact process that we would appear to be proving, so the arguments would be
almost circular. Finally, the argument is simpler in two or more dimensions, as
we will now see.
Take p large enough that Bk is supercritical, and choose the n, a, b whose exis-
tence is guaranteed by Theorem 2.23. For j E Z, let Aj,t be the process constructed
from the graphical representation using only Poisson processes corresponding to
x E Zd with X2 E (6a(2j - 1), 6a(2j + 1)). These are clearly independent. Using
the remark following the statement of Theorem 2.23, the coupling in (2.24) can
be carried out with At replaced by AO,t. Since

At :) Uj Aj,r.

if m ::: 1 is odd,
p(A~-6ma,6ma)d n (-6ma, 6ma)d = 0) :s [p(o rf. At6a ,6a)d)r.

Therefore, to check (1.14), it is enough to show that

inf P(O E A~~6a,6a)d) > O.


t":O '
But this follows from the argument that led to (2.26).
2. The Contact Process on the Integer Lattice Zd 57

Exponential Bounds in the Supercritical Case


The last two parts of Theorem B24 provided exponential bounds for oriented
percolation with a large parameter p. Now we will see how these can be used,
together with the comparison of the contact process with oriented percolation
given in Theorem 2.23, to obtain analogous results for the supercritical contact
process. There are two points to note in this connection: First, the techniques used
in the proof of parts (b) and (c) of Theorem B24 are highly one dimensional.
Nevertheless, we will be able to prove the corresponding contact process results
in all dimensions. Secondly, even though the results will be proved for all A > Ac ,
we will be using the oriented percolation results for large p only. Since it is usually
easier to prove percolation results for large p than for all supercritical p, this is
an advantage.

Theorem 2.30. Suppose A > Ac , and let rA = inf{t 2: 0 : At


= fO} be the
extinction time for the process starting at A. Then there are constants C and E > 0
(independent of A and t) so that

(a)

and

(b)

Remark. One of the reasons for our interest in (a) is that it provides exponential
rates of convergence to the upper invariant measure. To see this, let ILl be the
distribution at time t of the contact process with initial configuration Zd. By
duality (1.7), for any finite A,

p(t < rA < (0) = P(A: t- (0) - lim P(A: t- (0)


(2.31 ) s~oo

= ILdB : B nAt- fO} - v{B : B nAt- fO}.

Proof of Theorem 2.30. The idea of the proof of (a) is to use the percolation
construction of Theorem 2.23 repeatedly. Choose a p < 1 so that the oriented
percolation process Bk with parameter p satisfies the conclusions of Theorem
B24. By Theorems 2.12 and 2.23, there are choices of n, a, b with the coupling
property (2.24). Let

8 = P(A\O} ::::> x + [-n, n]d for some x E Zd) > O.

Then
P(A~ ::::> x + [-n, n]d for some x E Zd) 2: 8

for all A =1= 10 by monotonicity. Start the process with any A =1= 10. We will
define a random variable N (so that N + I is a stopping time with respect to the
58 Part I. Contact Processes

percolation structure, together with some independent auxiliary randomization)


with the property that peN = k) = 8(1 - 8)k, k ::: 0, and

either A~=0 or A~+\:>x+[-n,n]dforsomexEZd.

Start with the observation that

At:> y + [-n,n]d for some y E Zd

with probability at least 8. Take {N = O} to be a subevent of this event with


probability 8. On the complement of {N = OJ, either At = 0 or At =1= 0. In the
first case, N ::: 1, so A ~ = 0 as required. In the second case, repeat the procedure
above for a new time period of length 1. Conditional on {N > 0, At =1= 0}, let
{N = I} be a subevent of

At :> y + [-n, n]d for some y E Zd

of probability 8, and continue in this way to construct the required N.


On the event {A~+\ :> x + [-n, n]d for some x E Zd} apply the percolation
construction from Theorem 2.23, obtaining a comparison with the contact process
with initial time N + 1 and initial set A~+l" Without loss of generality, the initial
state of the oriented percolation process can always be taken to be a singleton.
Let N + M + 1 be the extinction time of the percolation process. If M = 00, then
A~ survives. If M < 00, then at time N + M + 1, the contact process is either
the empty set or not. In the latter case, begin the whole procedure again. This
generates sequences of independent random variables N; with the distribution of
N and independent random variables M; with the distribution of (MIM < 00),
and a geometric random variable L independent of the N;'s and M;'s (L is the
number of times the (N, M) procedure is carried out) so that at time
L
a = L (N; + M; + 1),
;=\

either A: = 0 or rA = 00. In other words, a ::: rA on the event {rA < oo}.
Therefore,
p(t < rA < 00) :::: pea > t).
By the construction, Land N; have exponentially decaying tail probabilities. By
Theorem B24, the same is true of M;. It follows that a has exponentially decaying
tail probabilities. To see this, take E\ > 0 so that Ee E1L < 00, and then take E2 > 0
so that

Then

Ee E2 t'I = E[E(e E2 t'I I L)] = E[ Ee E2 (NI+MI+\lf :::: Ee E1L < 00.

This completes the proof of part (a) of the theorem.


2. The Contact Process on the Integer Lattice Zd 59

For the proof of part (b), consider the contact process At on the tube in Zd
given by T = {x E Zd : Ix; I :::: Sa, 2 :::: i :::: d}, and note that it is enough to
prove the analogue of (b) for At. To see this, suppose that (b) holds for At. write
Zd as the disjoint union of translates Tn of T, and let An,t be the contact process
restricted to Tn. Then
A -AnT,
At :J UnAn,t '
and the An,r's are independent, so by the analogue of (b) for At.
P(r A < (0) :::: np(A:,~T'
n
= 0 for some t) :::: n
n
e-fIAnT,1 = e- fIAI .

To prove the analogue of (b) for At, we will use Theorem 2.23 again, so fix
a large p and the corresponding n, a, b. Write T as the disjoint union

(2.32) T = U~_oo((4j - 2)a, (4j +2)a] x [-5a, Sa]d-I.

Take 8 > 0 so that the contact process restricted to (- 2a, 2a] x [- Sa, Sa]d-I
starting at any singleton E (-2a,2a] x [-Sa, 5a]d-1 covers [-n, n]d at time I
with probability ~ 8.
Given any initial configuration A for At. thin it so that the resulting set contains
at most one point in each of the boxes appearing on the right of (2.32). The
cardinality of the resulting set will be at least a constant multiple of IA I. For
each point x in this thinned set, run a contact processes restricted to its box up
to time 1. These contact processes are independent for different x's, so that with
the exception of an event of exponentially small probability (exponentially small
in the number of x's in the thinned set, and hence exponentially small in the
cardinality of A itself), at least a fraction 8/2 of the processes starting at these x' s
will at time 1 cover the cube of side length 2n + 1 centered at the center of the
corresponding boxes. We conclude that for some E > 0,

P(A~ = 0 for some t) :::: e- E1A1 + P(A~ = 0 for some t, A~ contains EIAI
boxes of the form (4ja, 0, ... ,0) + [-n, n]d).
The first term on the right corresponds to the exceptional event with the exponen-
tially small probability mentioned above. On the complementary event, at least
some fraction of the boxes of side length 2n + 1 will be covered at time 1, and
that leads to the second term on the right. Now use the Markov property at time 1,
together with Theorems 2.23 and B24 to conclude that for some E > 0 and C,

(2.33)

It remains to show that we can take C = 1. To do so, let

a(A) = P(A~ = 0 for some t).


-A -8
Then by Corollary BI8, the events {At = 0 for some t} and {At = 0 for some t}
are positively correlated. This and additivity (1.2) leads to
60 Part I. Contact Processes

a(A U B) = P(A~ = 0 for some t, -:4: = 0 for some t) 2: a(A)a(B).


Applying this to the union of k disjoint shifts of A, it follows from (2.33) that

[a(A)t :s Ce-EkIAI.
Taking kth roots and letting k --+ 00 leads to (2.33) with C = 1.

Exponential Decay Rates in the Subcritical Case


In this subsection, we tum to the subcritical case, showing that for A < Ac ,
quantities such as the extinction time, and the most distant site ever infected have
exponential moments. The first step is to use the crude bound on the growth of At
for all A obtained in Section 1 to show that exponential decay of any reasonable
quantity follows from exponential decay in t of the survival probability peAt =f= 0).

Theorem 2.34. There exists a constant c independent of t so that

(2.35)

and

(2.36) P(x E AiO) for some s 2: °and some Ixl 2: ct) :s ce- t + P(A;O) =f= 0).

Proof For the first statement, use the Schwarz inequality to get

and then use Proposition 1.24 with k = 2.


For (2.36), take a = 2dA + 1 in Lemma 1.22, and let c be the value of b so
that (1.23) holds for that a. Let B t be the contact process with no recoveries, as in
that lemma. Coupling At and Bt so that At C Bf> we see that since Bt increases
in t, the probability on the left of (2.36) is at most

(2.37) P(x E B/O) for some Ixl 2: ct) + P(A;O) =f= 0).
Using Proposition 1.21 and then Lemma l.22, the first summand in (2.37) is at
most
e2dtA L
Pt(O,x):s ce- t ,
Ixl~ct

which completes the proof.

Thus motivated, we are now ready to launch into the proof of exponential
decay of the survival probability I(A, t) = P(A;O) =f= 0) in the subcritical case.
We have included explicitly in the notation the dependence on the infection rate
A for reasons that will become clear below. Note that I is increasing in A and
decreasing in t. The idea of the proof of exponential decay is the following:
2. The Contact Process on the Integer Lattice Zd 61

(a) First prove an inequality of the form

(2.38)
a a
Cl-Iog I(A, t) - C2t-Iog I(A, t) 2:
t
2,
1+ fo
I -
aA at I(A, s)ds

where C 1 and C2 depend on A (mildly) but not on t. This will be valid whether or
not the process is subcritical, though it is not very interesting in the supercritical
regime.
(b) Secondly, use (2.38) to show that if lim Hoo I(A, t) = 0 for some A, then
I(A, t) decays exponentially in t for all strictly smaller values of A. To see that
(2.38) might in fact imply this, suppose that fooo I (A', t )dt < 00, so that the right
side of (2.38) grows linearly in t. It is easy to check that if either of the terms
on the left of (2.38) grows linearly in t for an interval of A's, then I(A, t) decays
exponentially in t for A'S in that interval.
We begin by evaluating the partial derivatives that appear on the left side of
(2.38). The first two lemmas are versions of what is known as Russo's formula
in percolation. (See Section 2.4 of Grimmett (1989), for example.) Let XI be the
number of infection arrows in the graphical representation with the property that
if the arrow is removed, then there is no active path from (0,0) to (z, t) for
any Z E Zd. Such arrows are known as pivotal. We will often use PI. or E).. to
indicate that probabilities or expectations are taken with respect to the graphical
representation with infection rate A.

Lemma 2.39.

Proof Take h > 0, and think of constructing the graphical representation with
parameter A from that with parameter A + h by independently deleting infection
"*
arrows with probability h/(A+h). If A)O} 0 for the graphical representation with
parameter A + h and a pivotal arrow is deleted, then A)O} = 0 for the graphical
representation with parameter A. Therefore

I(A + h, t) -
h
I(A, t) =~P
~ J.+h
(A{O}.../.. 0 X
I -r- , I
= k)~
h
[1 _(_A_)kJ
A+ h
+ 0
h
(1).
k=l

The Oh (1) term comes from the possibility that two or more arrows are deleted
that together lead to the elimination of all active paths to time t, even though no
one of the deleted arrows is itself pivotal. The fact that the total rate at which any
of these arrows is deleted has finite expectation comes from (1.19).
Now pass to the limit, using (1.19) and dominated convergence for justification,
to obtain

(2.40)
62 Part I. Contact Processes

Combining (1.19) and (1.20), we see that the + can be removed on the right of
(2.40), and then that the right side of (2.40) is continuous in A. It follows that the
partial derivative of I(A, t) with respect to A exists, and

a I(A, t)
A- = E).. ({O})
X t , At =1= 0 .
aA
Dividing by I(A, t) gives the result.

For the next result, let Yt be the total length of all vertical segments in the
graphical representation with the property that the addition of a recovery symbol
at any point in the segment means that there is no active path from (0,0) to (z, t)
for any Z E Zd in the resulting structure. Maximal segments with this property
are known as pivotal intervals. A convenient way of thinking of pivotal arrows
and pivotal intervals (on the event AjO} =1= 0) is that taken together, they form the
intersection of all active paths from (0,0) to Zd X it}.

Lemma 2.41.

Proof When we set up the contact process in Section 1, we placed the recovery
symbols in the graphical representation with rate 1. In this proof, it is convenient
to place them at a general rate 0 > 0. We will incorporate the 0 into our notation
in the obvious way. The scaling property of the Poisson process (i.e., if N (t) is a
Poisson process with rate A, then N*(t) = N(et) is a Poisson process of rate AC)
implies that
1(0, A, t) = l(l, A/O, Of) = I(A/O, ot),
so that

(2.42) a 1(0, A, t) I
- ao a I(A, t)
= AaA a I(A, t).
- ta-
8=1 t
Therefore, we need to compute the left side of (2.42), which we will do in a
manner analogous to the proof of Lemma 2.39.
Take h > 0, and construct the graphical representation corresponding to recov-
ery rate 0 + h from that with recovery rate 0 by adding recovery symbols at rate h.
Conditional on the graphical structure with recovery parameter 0, the probability
that one (or more) of these additional recovery symbols is placed in some pivotal
interval is 1 - e- hY" so that

1(0, A, t) - 1(0 + h, A, t) [1 - e- hY, {Of ]


h = E 8,).. h ' At =1= 0 .

Letting h t °and arguing as in the proof of Lemma 2.39 gives


2. The Contact Process on the Integer Lattice Zd 63

a
- all f(8, A, t) = E8,;" [Yt,
{O}
At =1= 0] .

Taking 8 = 1, combining this with (2.42), and dividing by f(A, t) gives the
required result.

Next let Zt be the number of pivotal intervals. We will bound this in terms of
X t and Yt as follows:

Lemma 2.43.

1+ E;..(XtIAjO} =1= 0) ::: E;..(ZtIAjO} =1= 0)


(2.44)
::: 1 + eE;..(Xt IAjO} =1= 0) + 2dAeE;..(Yt IAjO} =1= 0).

Proof For the first inequality, it suffices to note that every pivotal arrow begins at
the end of a pivotal interval, and ends at the beginning of another pivotal interval.
Therefore, 1 + X t ::: Zt on {AjO} =1= 0}.
For the second inequality, which is the one we will actually use, fix y > 0;
a particular choice will be made at the end of the proof. Here is the idea of the
proof: Pivotal intervals with any of the following properties are easy to handle:
(i) Those of length at least y, since the total length of such intervals is at least
y x their number, and hence their number is at most y -I Yt .
(ii) Those that end at time t, since there is at most one such pivotal interval.
(iii)Those that end at a pivotal arrow, since the number of such pivotal intervals
is at most X t •
This explains the three summands that appear on the right of (2.44). So, it
will be enough to consider pivotal intervals that have none of the above three
properties, and show that the expected number of them is at most a constant
multiple of the expected number of pivotal intervals that do satisfy one of these
three properties.
In order to count pivotal intervals, it is useful to do some discretization. Choose
an E > 0, which will eventually be taken to approach zero. For fixed x E Zd and
integer k ::: 1 let F be the event (defined on the graphical structure of Poisson
processes) that there is a pivotal interval that A;O} =1= 0 and
(a) contains the point (x, kE),
(b) does not contain the point (x, (k - l)E),
(c) is of length less than y,
(d) ends strictly before time t, and
(e) does not end at a pivotal arrow.
For W E F, we will define a new configuration Twas follows. Since all points
in the graphical representation that we will consider here lie on the time line
{x} x [0, t], and all arrows will begin on this interval, we will omit the coordinate
x from the notation. So, for w E F, let [a, b] be the pivotal interval that begins
64 Part I. Contact Processes

between (k -1)E and kE, so that (k -1)E < a < kE, and kE < b < min(kE + y, t).
In what follows, we will assume for simplicity that kE + y < t; otherwise, simply
replace kE + y by t. Let c > b be the last point

::: min (first recovery symbol after b, t)

such that there is an active path from it to Zd x {t}. Let T w be the configuration
obtained from w by removing all infection arrows in (kE, kE + y) except the one
at c (if c < kE + y.)
With this construction, T w has a pivotal interval containing [a, b) that either
ends in a pivotal arrow, or has length> y. To see this, consider two cases:
1. c > kE + y. Then the pivotal interval for T w contains kE + y, and hence is
of length > y.
2. c < kE + y. Then [a, c) is a pivotal interval for T w, and the infection arrow
at c is pivotal.
This is easiest to see by drawing some pictures, which is left to the reader.
Note that Tw rf. F, so at least one interval was deleted. Elementary properties
of Poisson processes imply that the Radon-Nykodym derivative of PoT-I with
respect to P satisfies

d(P 0 T- 1) < 1 - e- 2d J..y = e2dJ..y _ 1.


dP e- 2d J..y

Therefore since F C T- 1(T F),

P(F) :::: (p 0 T-1)(T F) :::: (e 2dAY - l)P(T F).

Summing over k and x, we see that

E J.. (#pivotal intervals of length in (E, y)


not ending at t or a pivotal arrow, A!O) =1= 0)
::: (e 2d J..Y _ I)[EJ..(Xr. A!O) =1= 0) + y-I EJ..(Yt , A;O) =1= 0)].
Now let E t 0, add the terms that we saw were easy to handle at the beginning
of the proof, and divide by P (A!O) =1= 0) to get

EJ..(ZtIA;O) =1= 0) ::: 1+ e2d J..y EJ..(XtIA!O) =1= 0) + y-I e2dJ..y EJ..(YtIA)O) =1= 0).
Now put y = 1/(2dA) to get (2.44).

We come now to the final ingredient in the proof of (2.38). Since the collection
of pivotal arrows and intervals make up the intersection of all active paths from
(0,0) to Zd X {t} in the graphical representation, they are, in particular, a subset
of any active path. Therefore, the projections of the pivotal intervals onto the time
line [0, t] are disjoint. Label these projections ([Pi, a;], 1 ::: i ::: Zt) in increasing
2. The Contact Process on the Integer Lattice Zd 65

order. For i > Zt. set Pi = ai = t. Let T be the extinction time for the contact
process starting at {O}: P(T > s) = p(A1°} =1= 0).

Lemma 2.45. For any s ::: 0 and any integer k ::: 1,

where TI, T2, ... are independent random variables with the distribution of T.

Proof Let Xi E Zd be the spatial coordinate of the points in the ith pivotal interval.
Note that XI = O. Every pivotal interval must end in an arrow; let Yi be the spatial
coordinate of the endpoint of the arrow that begins at (Xi, ai ).
Fix k ::: 1, and let G be the union of all active paths in the graphical represen-
tation up to time t that start at (0, 0) and do not have (Xb ak) as an interior point,
together with the arrow from (Xb ak) to (Yb ak). Note that PI, ... , Pk. 0'1, ... ,ak,
XI, ... , Xk and YI, ... , Yk are all measurable with respect to G. For s > 0,

P(A;O} =1= 0, PHI - ak > slG) :s P(there are disjoint active paths from (Xb ak)
to Zd x {ak + s} and from (Yb ak) to Zd x {t}, not passing through GIG).
To see this, condition on G, and argue as follows:
1. If AlO} =1= 0, then there must be an active path from (Xb ad to Zd x {t},
since every active path from (0,0) to Zd X {t} must pass through the kth pivotal
interval. There must be one such path that proceeds from (Xb ak) through the
arrow that begins there, rather than through the time line above (Xb ak), because
of the maximality of the kth pivotal interval.
2. If also PHI - ak > s, then there is no pivotal interval with time coordinate
in (ab ak + s), and this forces the existence of a disjoint active path from (Xb ak)
that starts up the time line above that point. If such an active path had to intersect
the path in point 1 above, then this forced intersection would constitute part of a
pivotal interval.
By Theorem B21 (applied to a discretization of the graphical representation
conditional on G), the right side above is at most

P ( there is an active path from (Xb ak) to Zd X {ak + s },


not passing through G IG)
x P( there is an active path from (Yb ak) to Zd x {t},
not passing through GIG)
:s P(T > s)p(AlO} =1= 0I G).
Therefore,
P(Pk+1 - ak > slG, A;O] =1= 0) :s P(T > s).
Since P2 - 0'1, . " , Pk - ak-I are G measurable, the result follows from this and
an induction argument.
66 Part I. Contact Processes

Next we combine the last four lemmas to prove (2.38).

Proposition 2.46.

(2.47) E). ( Yt / At(o) =1= 0) + E). (Zt/ At( o=1=)0) ::: t t - 1,


1 + Jo f(A, s)ds

and hence (2.38) holds with

C] = (1 + e + 2dAe)A and C2 = 1 + 2dAe.

Proof To deduce (2.38) with these choices for C] and C2, write

t t - 1 :::: E).(Yr/AjO) =1= 0) + E).(Zt/AjO} =1= 0) (by (2.47))


1 + Jo f(A, s)ds
:::: (1 +2dAe)E).(Yt /AjO) =1= 0) + 1 +eE).(Xt/AjO} =1= 0) (by Lemma 2.43)
a a
= 1 + C] aA log f(A, t) - C2 t at log f(A, t). (by Lemmas 2.39 and 2.41).

To prove (2.47), we define Pi and ai as in Lemma 2.45. Since


2,
t = L(Pi+] - ai) + Yt ,
i=]

if for a fixed k, L~=] (PH] - ai) :::: t - k, then either Zt ::: k, or Zt < k and
Yt > k. In other words

Let Ti i ::: 1 be independent random variables with distribution that of (T /\ t) + 1.


Then by Lemma 2.45,

p( t Ti :::: t) : : p( t(Pi+] - ai) :::: t- klAjO} 0). =1=

Combining the last two inequalities, setting N = min{n : T] + ... + Tn > t}, and
summing on k ::: 1 gives

EN - 1 :::: E).(Yt/AjO} =1= 0) + E).(Zt/AjO} =1= 0).


But by Wald's equation (see (1.6) on page 179 of Durrett (1996)),

EN = E "N
L....i=]
T"
I > t
t
.
ET] - 1 + Jo f(A, s)ds
This proves (2.47).
2. The Contact Process on the Integer Lattice Zd 67

Finally, we show that (2.38) implies exponential decay in the subcritical case.

Theorem 2.48. For A < Ac , there is an fO(A) > 0 so that

PA(A)O} * 0):S e-E(A)t

for all t :::: o.


Proof Inequality (2.38) is somewhat awkward to use directly, since there are two
partial derivatives appearing on the left side. To combine the two, define a new
function g by
g(a, t) = f(a, a-It).

Note that g is also increasing in a and decreasing in t. Using subscripts to denote


partial derivatives, we have

Therefore, Proposition 2.46 implies that

(2.49) a(l
a
+ e + 2dae)-10gg(a, t) :::: t
t
- 2.
aa 0'+ fo g(a, s)ds

Next, we will make two observations based on this inequality:


1. If 0'1 < 0'2, we can integrate (2.49), using the monotonicity of g in a to write

g(a2, t) ( t )
a2(l+e+ 2da2e )10g ::::(0'2-0'1) t -2 ,
g(al, t) 0'2 + fo g(a2, s)ds

or equivalently,

(2.50) g(al, t) :s g(a2, t) exp [ a] - 0'2 ( t


t - 2)] .
0'2(l + e + 2da2e) 0'2 + fo g(0'2, s)ds

From (2.50) it is clear that

1 00
g(0'2, s)ds < 00

implies that g(a\, t) decays exponentially in t for 0'1 < 0'2, since then the integral
in the denominator in (2.50) remains bounded as t -+ 00.
2. It is also clear from (2.50) that if
C
(2.51 ) g(a2, s) :s 8s for s > 0,

where C is a constant and 8 > 0, then

1 00
g(a\, s)ds < 00
68 Part I. Contact Processes

for CTI < CT2. To check this, replace g(CT2' t) by Ct- 8 and g(CT2' s) by Cs- 8 (for
s, 1 2': 1, say). The result is that

_ C I t- 8e- C2t '


g( CT I, I) <

for two constants C I and C2, and this is integrable for t 2': 1.
Combining these two observations, we see that it suffices to prove that for every
CT2 < Ac , (2.51) holds for some 8 > O.
To do so, take CTO > 0, 10 > 0, and define CTk, tk recursively by
tk
tk+1 = g(CTk, tk) .
Since 0 < g(CT, t) < 1 for CT > 0, t > 0, CTk t and tk t. There is a potential
problem that some CTk may become negative. If that happens, set that CTk and all
successive ones = 0, and set the tk'S that would then be undefined = 00. Suppose
CTk+1 > O. Apply (2.50) with CTI ~ CTk+I, CT2 ~ CTk. t ~ tk+l, and then use the
recursion and monotonicity of g to make the substitutions

(tHI
10 g(CTk, s)ds ~ tk + 1k+lg(CTk, tk),
CTk+1 - CTk ~ g(CTk, tk) 10gg(CTk, tk),
tk
tk+1 ~ ,
g(CTk, td
g(CTk, tk+l) ~ g(CTk, td

(2.52)

Since
g(O+, t) = lim/(CT, CT-I/) = 0
a.j,O
for t > 0, inequality (2.52) holds trivially if CTk > 0, even if CTk+1 = O. (In that
case, 1k+1 < 00.) Define

Y(CT, t) = CTO + e ~ 2dCTe) (CT : 2t - 2g(CT, t»),

which is decreasing in CT and increasing in t whenever Y(CT, t) 2': O. Therefore, if


y = Y(CTo, to) > 0, it follows that Y(CTk, tk) 2': y as long as CTk > O. (Recall that
CTk t and tk t·) Note that y > 0 can be achieved for any CTo < Ac by taking to
sufficiently large. So in this case, (2.52) implies

(2.53)
and iterating,
2. The Contact Process on the Integer Lattice Zd 69

g(ak, tk) :s [g(ao, to) t+ d .


Since the function x logx is decreasing on [0, e- I ], if g(ao, to) < e- I and ak > 0,
then
k-I k-I
ak = ao + L [a;+1 - a;] = ao + L g(a;, t;) log g(a;, t;)
;=0 ;=0
k-I
::: ao + log g(ao, to) L(l + yng(ao, to) t+y);·
;=0

Note that by making g(ao, to) small, we can make SUPk lak - aol small, and in
particular, ak > 0 for all k.
To summarize, if 0 < a < ao < Ac , we can choose a to so large that y(ao, to) >
o and g(ao, to) < e- I , and then take to even larger so that ak > a for all k. Now,
by (2.53) and the tk recursion,

g(aHI, tHI) :s g(ak, td [~]Y,


tHI

and by iterating this,

to]Y
g(ak, td:S [ - = Y
it ,
tk [g(ak, tk)tHI]

where the equality comes from the tk recursion again. Simplifying this gives

g(ak, tk) :s [~]8,


tHI

where 8 = y / (1 + y). By the monotonicity of g, if tk :s t :s tH I, then

g(a, t) :s g(ab td :s [~J8 :s [~J8,


tHI t

which is (2.51).
This gives the statement of the the theorem with an extra constant on the right
side of the inequality. To remove it, use the fact that

f(A, t + s) ::: f(A, t)f(A, s)

and apply Theorem B22.

A Critical Exponent Inequality


The proof of Theorem 2.48 was quite involved, so it is fortunate that the techniques
developed there have other implications. One of them is a bound on the critical
exponent for the survival probability

a(A) = PA(A t 1= 0 for all t).


70 Part I. Contact Processes

Recall that a(Ac) = 0 by Theorem 2.25(b). To say that a(A) has a critical exponent
of y is to say that in some sense,

as A t Ac for some constant C. There are various forms that this statement can
take:

a(A)
C[ < < C2 for Ac < A < Ac + 1, and
- (A - Ac)Y -
. 10ga(A)
hm =y,
A,I).< log (A - Ac)
for example. The following result implies that if the critical exponent y for survival
exists in any of these senses, then y S 1. Calculating y rigorously is probably
hopeless, but it would be interesting to improve the next result to show at least
. a(A)
hm--=oo.
A - Ac
A-J,Ac

Theorem 2.54. For A > Ac ,


A - Ac
a(A»-----
- A(3 + e + 2dAe)

Proof Multiply (2.49) by g(a, t), take Ac S a S A, and replace a by A in the


factor on the left and the rightmost term to get
a tg(a, t)
A(l + e + 2dAe)-g(a, t) 2: 1 - 2g(A, t).
aa a + fo g(a, s)ds
Then integrate this from Ac to A to conclude that

A(1 + e + 2dAe)[g(A, t) - g(A c, t)]


(2.55)
2: fAc
A

a
tg(a, t)
rl
+ Jo g(a, s)ds
da - 2(A - Ac)g(A, t).

Since
a(a) = 1--+00
lim g(a, t),

and therefore
·
11m tg(a, t) _ I. g(a, t)
1m = 1,
'f + +fo g(a, s)ds
1 - 1
Hoo a + fo g(a, s)ds HOO

we can pass to the limit in (2.55), using Fatou's Lemma, obtaining


3. The Contact Process on {1, ... , N}d 71

A(1 + e + 2dAe)[a(A) - a(AJ] ::: (A - Ac)[l - 2a(A)].

Recalling that a(Ac) = 0 and discarding the term 2Aca(A) leads to the statement
of the theorem.

3. The Contact Process on {I, ... , N}d

At first glance, one might object to the developments so far on the following
grounds:
(a) In the real world, all systems are finite.
(b) The contact process on a finite set always dies out, so it has no critical behavior.
(c) The main interest and challenge of the study of the contact process on Zd
comes precisely from the fact that it does exhibit critical behavior.
In view of these facts, how can the contact process on Zd be considered to be a
relevant model for real world phenomena?
To answer this question, it is necessary to remember that extinction is a t = 00
characteristic, and that in the real world, we are interested in large but finite times.
So the question becomes: Is it the case that infinite models observed over the
entire time axis capture important features of large finite systems at large finite
times? This section provides some answers to this question.
Let A L be the contact process on {l, ... , N}d with initial configuration A.
This is simply the contact process on Zd, modified so that no infections are allowed
off {I, ... , N}d. When no initial configuration A is specified, it will be taken to
be A = {l, ... , N}d. Since the contact process on {I, ... ,N}d is a finite state
Markov chain that is irreducible, except for having a single absorbing state 0, it
will eventually be absorbed at 0. Let

TN = inf{t ::: 0 : AN,t = 0}

be this absorption time. Clearly, TN ~ 00 in probability as N ~ 00, since TN is


larger than the maximum of N d independent unit exponentials. We are interested in
determining how rapidly TN ~ 00, and how this rate depends on A. In particular,
does something important change at Ac , the critical value of the contact process
on Zd?
Here is the punch line: As a function of N, TN is logarithmic if A < Ac ,
polynomial if A = Ac, and exponential if A > Ac. These results provide a precise
way of saying that large finite systems observed at large finite times die out below
Ac, and survive above Ac. After all, if extinction occurs at an exponentially large
time, it will not be seen at reasonably large times, while if the extinction time is
only logarithmic, then it will be observed without having to wait too long.
The proofs of these results are applications of the hard work we did in Sec-
tion 2. We will only prove the subcritical and supercritical versions of these state-
ments for several reasons: The results in the critical case have only been proved
72 Part I. Contact Processes

in one dimension, the proofs are significantly more difficult, and even in one di-
mension, the results that have been proved are not complete. See the discussion
in Section 5 for more on this.

The Subcritical Case


To get warmed up, consider the case A = O. Then TN is the maximum of N d
independent random variables with the unit exponential distribution. Therefore

if c < d
if c > d,

and so
TN
(3.1) ---+d
10gN

in probability, as N -+ 00. The main result in this subsection is that (3.1) holds
(with a limit depending on A) for 0 < A < Ac as well.
The first step is to identify the quantity (defined in terms of the unrestricted
process on Zd) that will turn out to be the limit. Recall that by the Markov property
and monotonicity,

so that by Theorem B22,

Y_(A) =- lim ~ log P(AjO) =1= 0)


1--+00 t

exists, and

(3.2)

By Theorem 2.48, Y_(A) > 0 for A < Ac. Also, Y_(A) is decreasing in A, and
y_(O) = 1, since if A = 0, P(A)O) =1= 0) = e- I • Here is the general version of
(3.1):

Theorem 3.3. If A < Ac , then

TN d
---+ - -
10gN Y_(A)

in probability, as N -+ 00.

Proof The graphical representation provides a coupling between the contact pro-
cesses At,1 on {l, ... , N}d and A~ on Zd with the property that At,! c A~ for
all A c {I, ... , N}d and all t ::: O. Therefore, using additivity (1.2),
3. The Contact Process on {I, ... , N} d 73

P(rN > t) = P(AN,t =F 0) = P(A~!t =F 0 for some x E {l, ... , N}d)


(3.4) :s P(A)x} =F 0 for some x E {l, ... , N}d)
:s N d P (A)O) =F 0) :s N d e-Y-(A)t,
where the final inequality comes from (3.2). Now take t = clog N in (3.4), where
c> dly_(A.), to get

as N -+ 00. This completes one part of the proof.


For the other part, the idea is to compare the system starting with {I, ... ,N}d
with one starting with the more sparse initial configuration

where k is a positive integer, as follows:

P(rN < t) =P(AN,t = 0) = P(A~!t = 0 for all x E {I, '" , N}d)


:SP(A~,}t = 0 for all x E SN,k)
(3.5) :s L p(A1X} <t. x + (-k, k)d for some s > 0)

+ n
XESN,k
p(Alx} ex + (-k, k)d for all s > 0, A)x) = 0).

The last inequality comes from the fact that the events

{Ai X} ex + (-k, k)d for all s > 0, A)x) = 0}, x E SN,k

depend on disjoint parts of the percolation structure, and hence are independent.
Combining Theorems 2.34 and 2.48, we see that there are constants band E > 0
so that

(3.6)

Now take c < a < dlY_(A.). By the definition of y_(A.) above, b can be taken
larger so that for t 2: b,

(3.7)

Therefore, continuing with (3.5), using (3.6) and (3.7) and setting t = clogN, we
see that for clog N 2: b,

P(rN < clogN) :s (#SN,dbe- Ek + (I _ N-cd/a)#SN,k.


To complete the proof, we need to take k = kN so that the right side above tends
to zero, i.e., so that
74 Part 1. Contact Processes

This is easy to achieve, since up to constant multiples,

#SN,k ~ (~r
For example, kN = (log N)2 works.

The Supercritical Case


Again, we begin by identifying a quantity defined in terms of the unrestricted
contact process At that is relevant to the asymptotics of TN. Let

a(N) = P(Ajl, .. ·,Nl = 0 for some t).


d

If n, k, N are integers such that nk ::: N, then {l, ... ,N}d is contained in the
union of k d disjoint translates of {I, ... ,n}d. By additivity (1.2) and positive
correlations (see Corollary B 18 and the discussion following the statement of
Theorem B21),
kd
a(N) ::: [a(n)] ,
so
[a(N)t Nd ::: [a(n)fk/N)d.

Letting k -+ 00 and N -+ 00 so that N / k -+ n, we see that


-d N-d
[a(n)r ~ liminf[a(N)] .
N-HXJ

Therefore,
. 10ga(N)
(3.8) Y+(A) =- hm
N-,>oo N
d

exists. If A > Ac , then Y+ (A) > 0 by Theorem 2.30(b). It is finite, since

a(N) ::: C +12dA )


Nd

The right side is simply the probability that all N d sites recover before they have
a chance to infect any neighbor.
The next result is somewhat unsatisfactory, in that it does not identify the
limit in probability of (lOgTN)/N d (nor show the limit exists), but it does at least
determine the rate of growth of TN. For more precise results, see the discussion
in Section 5.

Theorem 3.9. If A > Ac and Y > Y+(A), then

(3.10) . P (lOg
hm TN )
--d-::: Y = O.
N-,>oo N
3. The Contact Process on {I, ... , N}d 75

Furthermore, there exists a 8 > 0 so that

(3.11) lim
N-+oo
p(IOg~N
N
:::: 8) = O.

Proof For the first statement, take 8 so that 8E > Y+(A), where E is the number
appearing in Theorem 2.30(a). Then by that result,

0:::: a(N) - p(A~~·e··N}d = 0) :::: Ce- ENde .


Rewrite this as

[a(N) - Ce- ENde

Take logarithms and divide by N d to get


r: p(A~~·e··N}d = 0) :::: a(N).

10ga(N) 10g[1 - Ce-ENde[a(N)rlt


Nd + Nd
log p(A~~i}··N}d = 0) 10ga(N)
< < --'----,,-
Nd Nd
Therefore by (3.8), recalling that 8E > Y+ (A), so that the argument of the second
log above tends to 1,

(l •... •N}d - 0)
Iog P (A Nde
. -
(3.12) Y+(A) =- hm
N-+oo N
d .

Now let At be the process obtained from At by restarting it with configuration


{I, ... ,N}d at each time t that is an integer multiple of N d8, provided At - -=1= 0
at that time. In other words, if AkNdO- oF 0 and

then As is the set of x E Zd such that there is an active path in the graphical
representation from (y, kN d8) to (x, s) for some y E {I, ... , N}d. Since A N.t C
At for every time that is a multiple of N d 8 by definition (until the extinction time
of At), it follows that A N . t C At for all t. Since disjoint parts of the graphical
representation are independent, we conclude that

P(rN > kN d8) = P(AN.kNdg -=1=0)


(3.13)
:::: P(AkNde -=1= 0) = [p(A~~i}··N}d oF 0)t.

To complete the proof of (3.10), use (3.12) and (3.13), choosing k = kN so that

where Y > y' > y+(A).


76 Part I. Contact Processes

Figure 2

The proof of (3.11) uses an oriented percolation comparison analogous to that


in Theorem 2.23. Giving details would only obscure the main point, so we will
not give them here. The only real difference is that the percolation process evolves
in a linear tube embedded in {I, ... ,N}d, whose length is essentially a constant
multiple of N d • The basic construction is shown in Figure 2 if d = 2. The small
squares in the figure are of side length c, where c is chosen to be large enough that
the construction of Theorem 2.23 can be made using only the Poisson processes
corresponding to points in the shaded tube (c > lOa, for example, where a is the
value occurring in that theorem). The big square is of side length kc S N, where
k is an odd integer (k = 7 in the figure). As N increases, the c remains fixed, so
that the k increases as a constant multiple of N. Therefore, the number of small
squares in the tube increases as a constant multiple of N 2 . The tube is taken so
that it fills only about half of the big square, in order to facilitate the application
of Theorem B26.
For higher dimensions, we repeat the construction in layers. We give some
details in three dimensions - the extension to d > 3 should be clear. For d = 3,
construct a tube whose length is essentially a constant multiple of N 3 in the
following way. Make a copy of Figure 2, where the small squares are replaced by
cubes of side length c. This is the bottom layer of the construction. Make (k+ 1)/2
copies of this, and alternate them with empty layers of height c. Finally, connect
the (k + I) /2 tubes by adding one cube of side length c to each empty layer,
alternating between southwest and northwest comers of the layers. The result is
the long tube we want.
This construction leads to a statement analogous to that of Theorem 2.23, in
which the oriented percolation process is restricted to an interval whose length is
3. The Contact Process on {I, ... , N} d 77

a constant multiple of N d . Therefore, it is enough to prove an analogue of (3.11)


for this restricted oriented percolation process.
Let Bk be the oriented percolation process Ak described in the background
chapter, and let B N •k be the corresponding process with the restriction that at
time k,
k+ k+
BN,k C [ -2-' N + -2- .
2 I]
Let r N be the extinction time for this process with initial condition BN,O =
{I, ... , N}. We need to show that if p is sufficiently close to 1, then there is
a 8 > 0 so that

(3.14) lim p(lOgrN :s 8) = O.


N-->oo N
It will also be convenient to use a process intermediate between Bk and BN,k: Let
B~ be the process Bk with the restriction that at time k,

BkI C [k-2-,00.
+2 ]

We intentionally have not specified yet whether the B's are to be site or bond
processes. We are using (3.14) for the site processes, but will prove it for the bond
processes, since this is a bit more convenient. Since the site and bond processes can
be compared in either direction, provided the parameter p is adjusted appropriately,
and since we only need (3.14) for p sufficiently close to 1, (3.14) holds for the
site process if and only if it holds for the bond process (for different values of p).
So, from now on we take the B's to be bond versions of the processes. From
the traditional (i.e., graphical) description of the process, it is easy to see that

(3.15) p{l, ... ,nl(B~ = 0) = P{l,2, ... 1( B~ n {[k: 3]. ... , 1] +n} = 0).
[k:

where [.] is the greatest integer function. In fact, (3.15) is the analogue of contact
process duality. Let
h = min {i :i E B~ }
be the left edge of the semi-infinite processes. With this notation, (3.15) can be
rewritten as

(3.16) p{I, ... ,nJ(B~ = 0) = P{I,2, ... 1(h > [k; 1] +n).
Next, we will use the fact that there is an E > 0 and a C so that

(3.17) p{l, ... ,nl(B~ = 0 for some k) :s Ce- En , n 2: 1.

The analogous statement for the process Bk is Theorem B24(c), and the proof for
B~ is similar. Combining (3.16) and (3.17) gives
78 Part I. Contact Processes

(3.18) pl l ,2, ... l(l j > [i ~ 1] +n for some i.:::: k) .:::: Cke-En, n:::: 1.

Now consider the reflected version of B£: Br is the process Bk with the
restriction that at time k,

Bk/I c [
1, N k+l] ,
+ -2-

Bn
and
rk = max {i :i E

is its right edge. The analogue of (3.18) is then

(3.19)
PI ... ,N-I,Nl(rj < [i: 1] + N _ n for some i .: : k) .:::: Cke-En,

n::::1.
By checking each possible transition, one sees that if initially, BN.O = {I, ... ,N},
Bo = {l, 2, ... } and B~ = {... ,N - 1, N}, then on the event {lj < rj for all i .: :
k},

and therefore
P(TN .:::: k) .:::: P(lj > rj for some i .: : k) .:::: 2Cke- EN /2 ,
where for the last inequality, we have used (3.18) and (3.19) with n = N 12 (taking
N to be even for simplicity). Therefore, (3.14) holds for any /) < E/2.

4. The Process on the Homogeneous Tree Td

Let Td be the homogeneous connected tree in which each vertex has d + 1 neigh-
bors. It is often useful to think of Td as a branching tree in which each vertex
has one parent and d children, and then it is natural to say that y is a descen-
dent of x if y is a child of a child ... of a child of x. More formally, define a
function I (x) from Td to Z I so that for each x, I (y) = I (x) - 1 for exactly one
neighbor y of x, and ley) = lex) + 1 for the other d neighbors y of x. Thus lex)
can be thought of as the generation number of x, and y is a descendent of x if
I(y) -lex) = Iy - xl :::: 1. Take {en, -00 < n < oo} in Td such that l(e n) = n
and len - en+11 = 1 and write e = eo. This provides an embedding of Zl in Td .
In Section 2, we saw that weak survival does not occur for the contact process
on Zd. Our first result in this section shows that the situation really is quite
different on Td - weak survival does occur. It is this fact that has led to much of
the interest in contact processes on trees, and is the primary justification for this
section. The occurrence of weak survival raises an entirely new set of questions
concerning the behavior of the process in the intermediate phase Al < A < A2.
Throughout Section 4, we will assume that d :::: 2, since TI = Zl, and this case is
covered by the results of Section 2.
4. The Process on the Homogeneous Tree Td 79

Some Critical Value Bounds


We begin by proving some easy bounds on the critical values A1 and A2 that
are already good enough to show that these critical values are different if d is
sufficiently large. Later we will show that they are different for all d :::: 2.

Theorem 4.1.
1
(a) Al < - - .
- d-l

(b)

In particular, Al < A2 for d :::: 6.

Proof The final statement is an immediate consequence of the bounds in (a) and
(b). To prove (a) take 0 < P < 1 and define a function vp on the finite subsets of
Td by

Compute

d A vp(At)
-E I 0
(4.2) dt . t=

= [p-IIAI- A#{(X, y) : x E A, y rJ- A, Ix - yl = l}](1- p)vp(A).

For any finite subset A of Td, the number of edges incident to points in A (counted
with multiplicity) is (d + 1)IAI. There are at most IAI - 1 edges that join two
vertices in A. Therefore if A =1= 0,

#{(X, y) : x E A, y rJ- A, Ix - yl = I} :::: (d + 1)IAI- 2(IAI- 1) = (d - 1)IAI + 2.

Using this bound in (4.2), we see that if pA(d - 1) :::: 1, then for nonempty A,

so that EAvp(At) is nonincreasing. If At dies out, the limit would have to be 1.


So, the process survives whenever it is possible to choose 0 < p < 1 so that
pA(d - 1) :::: 1, i.e., when A > d~l.
Turning to (b), we will do something similar, but using a function that contains
some information about the locations of the infected sites. By contrast, recall that
vp depends only on how many infected sites there are. For p :::: 0, define wp by

(4.3) wp(A) = I>I(X),


XEA

and compute
80 Part I. Contact Processes

(4.4)
:t EAwp{At)lt=o =~ [(A IY~=I /(Yl) - pl(Xl]
yj"A

:::: [A{dp + p-I) - I]wp{A).

Choosing
1
and A = 2..(J'

we see that the right side of (4.4) is zero, and hence that M t = wp{At) is a (pos-
itive) supermartingale. Therefore, M t converges a.s. On the event {x E At i.o.},
M t has to change by at least pl(xl i.o. Therefore pA{X E At i.o.) = 0 for each x
and A, and it follows that for this value of A, At does not survive strongly.

Pemantle (1992) proved Theorem 4.1, and then went on to improve the bounds
enough to conclude that Al < A2 for d > 2. Liggett (1996a) further improved the
bounds for d = 2. Here are their results:

d=2 A2 :::: 0.609

d=3 Al :::: 0.391 A2 :::: 0.425


d=4 Al :::: 0.279 A2 :::: 0.354
d=5 Al :::: 0.218 A2 :::: 0.309.
The proofs of these bounds become increasingly difficult as d decreases. The case
d = 2, in particular, is very computationally intensive. Recall that TI = Zl, so
that for d = 1, Al = A2 by the results of Section 2. It of course follows from these
bounds that AI < A2 for d :::: 2. Rather than prove the bounds here, we will see
later that the fact that the two critical values are different follows from general
considerations that do not require proving such bounds.
Much of the behavior of the contact process on Td can be understood in terms
of properties of a function ¢(p) that will be defined below. In order to motivate
the introduction of this function, we begin with a brief discussion of the analogous
function for the corresponding branching random walk. In general, one can say
that results for the branching random walk are useful predictors of results for
the contact process, though the proofs are substantially harder in the latter case.
Certain results are more complete in the case of branching random walk. For
example, we will be able to compute the values of the two critical values exactly,
and conclude that they are different for d :::: 2 by inspection.

Branching Random Walk


Let ~t be the branching random walk on Td that was described near the beginning
of Section 1. So, ~t (x) E {O, 1, 2, ... } is the number of particles at x. Particles die
at rate 1, and give birth to particles at each neighboring site at rate A. Carrying
out a computation analogous to that in (4.4) leads to
4. The Process on the Homogeneous Tree Td 81

(4.5)

The fact that there is an equality here instead of the inequality in (4.4) is a reflection
of the independence of the offspring of different parents. Because of the equality,
(4.5) can be solved explicitly to obtain

(4.6)
x x

where

(4.7) 1Jr(p) = exp [A(dp + p-l) - 1].

Note that

and 1Jr achieves its minimum at p = l/,Jd. When we discuss the analogous
function (called 11) for the contact process shortly, we will find that it shares these
properties, though it cannot be computed explicitly.
It should be clear what strong and weak survival mean in this context, and how
the critical values A1 and A2 are defined. In order to relate them to the function 1Jr,
it is useful to consider also a sequence v(n) that is defined as follows: Abbreviate
the configuration that consists of a single particle at x by x itself, and for n ~ 0,
let

and
v(n) = peCan < (0).

Thus v(n) is the probability that an infection starting at e ever reaches e- n . Note
that v(n) is nonincreasing in n, since the infection can reach e-(n+l) only if it
reaches en first. It is left continuous in A, since it is a supremum of increasing
continuous functions of A:

v(n) = sup pe(~t(e_n) ~ 1 for some t :::: T).


T

For the continuity in A of the functions that appear in the supremum above, recall
the discussion surrounding (1.20).
The next result relates survival and strong survival to properties of the function
1Jr and the sequence v(n).

Theorem 4.8. (a) ~t survives if and only if

(b) !;t survives strongly if and only if


82 Part I. Contact Processes

(c) Al = d~I' and l;t dies out at AI.


(d) A2 = 2./J, and l;t survives weakly at A2.
(e) If A > A2, then
. 1
hm v(n) = 1 - > 0.
n--->oo A(d + I)
(f) If A :s A2, then

. v(n+l) 1-v'1-4dA2 1
lim v(n) = 0,
n~oo
and f3 = n~oo
hm
v(n)
= 2dA
< -
- Jd

Remark. Aside from the fact that properties of 1{! lead to the explicit computation
of Al and A2, the most interesting consequence of this theorem comes from (e)
and (t). Above A2, v(n) does not tend to zero. Below A2 it tends to zero at an
exponential rate :s ..let. So, there is an interval of parameters of exponential decay
that cannot be attained in this problem.

Proof of Theorem 4.8. Parts (c) and (d) follow easily from parts (a) and (b) re-
spectively, since (4.7) is so explicit. For part (a), it suffices to note that

I)t(X)
x

is an ordinary Galton-Watson branching process. Since 1{! (I) is its exponential rate
of growth by (4.6), the branching process is supercritical if 1{! (1) > I, critical if
1{!(l) = I, and subcritical if 1{!(l) < 1. (See Theorem B55.)
Turning to part (b), note first that (4.6) implies that

M _ Lx l;t(x)/(x)
t- [1{!(p»)I

is a nonnegative martingale, and hence converges a.s. To check the martingale


property, first use the Markov property at time s < t, and then divide both sides
by [1{!(p) Y:
E[ ~l;t(X)/(X)lgr] = E~'[ ~l;t_s(X)pl(X)] = [1{!(p)r- s ~l;s(X)/(X).
If 1{!(p) < I for some p > 0, then for that value of p,

L l;t(x)/(x)
x
~ ° a.s.

as t ~ 00, so l;t does not survive strongly. If 1{! (p) = I, then the limit of
4. The Process on the Homogeneous Tree Td 83

(4.9)

exists a.s. On the event {~t(x) ~ 1 i.o.}, (4.9) changes by at least pl(x) i.o., so it
follows that ~t does not survive strongly in this case either. Since Vr attains its
minimum at 1/-Jd, the result so far can be restated as follows: If Vr(1/-Jd) :::: 1,
then ~t does not survive strongly.
For the converse, use the strong Markov property, monotonicity and spatial
homogeneity to write

(4.10)

The events whose probabilities appear on the right of (4.10) are decreasing in n.
On the intersection of these events, ~t(e) ~ 1 for an unbounded sequence of times
t. Therefore, if ~t does not survive strongly, then the right side of (4.10) tends to
zero as n ---* 00, and hence

(4.11 ) lim v(n) = O.


n-->oo

Take n ~ 1, and run the process until the first transition occurs. Using the strong
Markov property at that time and the independence of the offspring of different
parents, one gets

[1 + (d + l)A][l - v(n)] = 1 + A[l - v(n)][l - v(n -1)]


+ dA[l - v(n)][l - v(n + 1)].
Expanding gives

(4.12) v(n)[l + Av(n - 1) + dAv(n + 1)] = Av(n - 1) + dAv(n + 1).


Since v(n) is monotone, one can pass to the limit and conclude that

. 1
(4.13) hm v(n) = 1 - or O.
n-->OO A(d + 1)
Suppose the limit is zero, which by (4.11) is true if ~t does not survive strongly.
Then given E > 0 there is an N so that the left side of (4.12) is at most (1 +E)v(n)
for n ~ N. Using the arithmetic-geometric mean inequality on the right side of
(4.12) gives

(1 + E)v(n) ~ 2AJdv(n - l)v(n + 1), n ~ N.

Taking the product of this inequality for N :::: n < N +m gives

(4.14) (1 + E)m v(N)v(N + m - 1) > (2AJdr.


v(N - l)v(N + m) -

By (4.12), the ratios


84 Part I. Contact Processes

v(n - 1) v(n + 1)
and
v(n) v(n)
are uniformly bounded. Therefore, taking the mth root, then letting m -+ (Xl in
(4.14), and finally letting E {, 0, we see that 2A,Jd s 1. But this means that

1/!(~) = e2Jcv'd-1 S 1,

as required to complete the proof of (b). Note that since we only used (4.11) in
this argument, we have also proved (e) and the first part of (t).
To prove the second part of (t), we need to look at the ratios

v(n +1)
f3n = v(n)

more carefully. Define


e 1
Ic(p) = dA - dp'

This function is increasing and concave in p, and (4.12) can be written in the form

f3n = ICn (f3n-l) where en = 1 + Av(n - 1) + dAv(n + 1).


For e :::: 1, the fixed points of Ic are

e ± ";e 2 - 4dA 2
P±(e) = .
2dA
If c = 1, these are exactly the solutions of 1fr (p) = 1. The smaller fixed point is
unstable, while the larger one is stable. In other words, the k-fold iterate I?) of
II satisfies
lim I?\p) = p+(l), p > p-(l).
k->oo

See Figure 3 below.


Since Ie :::: II for e :::: 1,

f3n+k :::: I?) (f3n),

so that if f3n > p-(l) for any n, it would follow that

(4.15)

However, since 1/!(p-(l)) = 1, (4.9) with p = p-(l) is a martingale M t . Applying


the martingale stopping theorem to (a truncation ot) an gives

and hence

(4.16)
4. The Process on the Homogeneous Tree Td 85

p + (c)

Figure 3 - graph of Je(P), c ~ 1

In... < A2, p-(1) < p+(1), so that (4.15) and (4.16) are incompatible. Therefore
we conclude that

I-JI-4dA2
(4.17) f3n .:::: p-(1) = 2dA

for all n. Since f3n is a left continuous function of A, it follows that (4.17) holds
for A = A2 as well. To get a lower bound for f3n, we argue in a similar way. Fix
c> I, and take n so large that Cn < c. Then

(4.18)

If f3n < p_(c), then the right side of (4.18) becomes negative for some k, which
is impossible. Therefore,

(4.19) f3n 2: p_(c) whenever Cn < c.

Since p_(c) is continuous at c = I, (4.17) and (4.19) combine to give part (f) of
the theorem.

Here are some properties of 1/1 and f3 that are immediate consequences of
Theorem 4.8, and that should be kept in mind as we develop analogous properties
for the contact process:
(a) f3 is a strictly increasing function of A for A .:::: A2.
(b) 1/1 (f3) = I for A .:::: A2·
(c) If A = A\ then f3 = I/d, while if A = A2 then f3 = I/-Jd.
86 Part I. Contact Processes

Back to the Contact Process - the Function ¢


There is not much difference between the results we will prove about the contact
process and the situation described above. There are significant differences in the
proofs, however. The main features that make the proofs rather straightforward in
the context of the branching random walk are
(a) the expression in (4.6) is an exact exponential in t,
(b) the function 1jJ (p) that appears there is simple and explicitly computable,
and
(c) the sequence v(n) satisfies the simple recursion (4.12).
The contact process does not share these features. Still, there is a function anal-
ogous to 1jJ(p), and there is a sequence analogous to v(n) (to be called ¢(p)
and u(n) respectively below) that have similar properties. When no initial state is
specified for the contact process, it will be taken to be {e}.
The next several results are aimed at proving that there is an intermediate
phase for all d ::: 2 (Theorem 4.46). The main steps are:
(a) Define the function ¢, and show that it determines the asymptotics of
EWp(At) in a very strong sense (Proposition 4.27).
(b) Prove various monotonicity and continuity properties that are needed in
working with it (Proposition 4.33).
(c) Prove that if A = AI, not only does the process die out, but its expected
size remains bounded (Proposition 4.39).
(d) Show that properties of ¢ determine whether or not the process survives
strongly (Proposition 4.44).
In order to define ¢, we use the fact that EWp(At) is almost an exponential
in t, where W p is the function defined in (4.3). To see this, start by using the
additivity property of the contact process (1.2) to write

(4.20)

where es is the shift by time s of the Poisson processes used in the graphical
representation that was described in Section 1. Therefore for p > 0,

wp(At+s) = L /(y) ::: L L pl(y).


yEA,+, xEA, yEA: 08.,

The inequality above comes from the fact that the union in (4.20) is not neces-
sarily disjoint. Taking conditional expectations with respect to 31f, the a-algebra
generated by the process up to time s, we see that

(4.21) E[ w p(A t+s )I31f] ::: L pl(x) E L /(y)-l(x) = wp(As)Ewp(At).


xEAs yEA:

Taking expected values in (4.21) gives

(4.22)
4. The Process on the Homogeneous Tree Td 87

which means that log EWp(At) is subadditive, and hence

(4.23)

exists and satisfies

(4.24)

by Theorem B22. Since the contact process At and the branching random walk
can be coupled together so that x E At implies /;t (x) ::: 1, it follows that

(4.25) </>(p) :s 1jr(p) = exp [A.(dp + p-l) -1] < 00.

Next we will show that </> has many of the qualitative properties of 1jr - the
main difference is that </> cannot be computed explicitly. We need these properties
so that we can use </> in the analysis of the contact process much as we used 1jr in
the analysis of the branching random walk. We start with some easy combinatorial
facts.

Lemma 4.26. (a) Let an,k be the number of x E Td such that Ix - el = nand
lex) = n - 2k. Then

ifk = 0,
if 1 :s k :s n - 1,
ifk = n.
(b) Let
an(p) = L /(x),

Ix-el=n

Then ao(p) = 1 and

for n ::: 1.

Proof The cases k = 0 and k = n of part (a) are immediate. For the other cases,
take x E Td such that Ix - el = n, and let k ::: 0 be the largest index so that e-k is
on the shortest path joining e and x. Then Ix - e_kl = n - k, and lex) = n - 2k.
Therefore, an,k is the number of x E Td so that e_k is on the geodesic joining e
and x, but e_k-l is not on it. In traversing such a geodesic from e to x, there is
one choice of edge at each step until reaching e-b d - 1 choices at the next step,
and d choices at the remaining n - k - 1 steps.
88 Part I. Contact Processes

For part (b), take n ::: 1 and dp2 =1= 1, and use part (a) to write
n
an(p) = L /(x) = Lan,kpn-2k
Ix-el=n k=O
n-l
= (dp)n + (d _l)dn-1pn L(dp 2)-k + p-n
k=l
(dp)n[dp2 - 1] + (d - l)d n- 1pn[1 - (dp2)-n+l] + p-n[dp2 - 1]
dp2 - 1

Simplifying gives the required result. The result for dp2 = 1 follows by using
L'Hopital's rule.

Proposition 4.27. (a) The following symmetry properties hold:

EWl/dp(At) = EWp(At) and 4>(d~) = 4>(p).


(b) There is a constant C(p) :::: 00 depending only on d and p so that

[4>(p)Y :::: EWp(At) :::: C(p)[4>(p)Y, t::: O.

One can take C(p) < 00 if p =1= I/Jd.

Proof A simple computation shows that

Since
00

(4.28) EWp(At) = Lan(p)P(en EAt),


n=O

the first statement in part (a) is immediate, and the second follows from it by the
definition of 4> in (4.23).
For (b), note that the left inequality is just (4.24). By part (a), it is enough to
prove the right inequality when p > 1/ Jd, which we now assume. The inequality
in (4.22) (which led to (4.24)) came from the additivity property. The idea is to
show that except for a constant factor, the opposite inequality holds in (4.22)
because there is a substantial amount of disjointness in the union in (4.20).
For a finite set A C Td, let {Bx, x E A} be the subsets of Td that are defined
as follows: Bx is the set of descendents of x whose closest predecessor in A is x
itself. These sets are disjoint, so by additivity (1.2),
(4.29)
EAwp(At) = EW p( UxEA An::: EW p( UxEA (A; n Bx)) = E L wp(A; n Bx).
XEA
4. The Process on the Homogeneous Tree Td 89

To find a lower bound for the right side of (4.29), we will need the following
inequality:
(4.30) dp L pl(x)::: (dp - l)d n L pl(x).
xEA.YEBx XEA
Ix-YI=n

We will prove this by induction on the number of points in A. The case of a


singleton A = {u} is immediate: the left side of (4.30) is (dp) pl(u) d n, and the
right side is (dp - l)d npl(u). To carry out the induction step, let A' = A U {x'},
where x' rj. A and I(x') is maximal among I(x), x E A'. In other words, A' is
obtained from A by adding a point with a generation number that is at least as
large as as the generation number of any point in A. Then let {B~, x E A'} be the
sets defined above, but relative to A' rather than A. We will show that if (4.30)
holds for A, it holds for A'. That will be the induction step.
Write LHS, RHS, LHS' and RHS' for the left and right sides of (4.30) relative
to A and A' respectively. Then
RHS' = RHS + (dp _l)dnpl(x').

To write a similar expression for the left sides, note that B~ C Bx for all x E A,
and Y E Bx \B~ if and only if x' E Bx and y is a descendent of x'. In particular,
there is at most one x E A such that B~ =1= Bx. If there is such an x, then
L pl(x) = /(x)dn-l(x'l+l(x).
yEB, \B~
Ix-yl=n

So
LH S' = LH S + d n+1 pl(x'l+l _ dn-l(x'l+l(xl+l pl(x)+l,

where the last term appears only if there is an x E A such that B~ =1= Bx. Therefore,
to complete the induction step in the proof of (4.30), we need
dp - (dp)l(x)-l(x')+l ::: dp - l.

But this is a consequence of dp ::: 1 and I (x') > I (x).


Using (4.30) and spatial homogeneity, we bound the right side of (4.29) as
follows:

xEA xEA yEBx


00

=L peen EAt) L /(xl+n


n=O xEA.yEBx
Iy-xl=n
(4.31 )
d I 00
::: Pd- L/(X)LP(enEAt)(dPt
p XEA n=O

dp-l ~
= --wp(A) ~ peen E At)(dpt.
dp n=O
90 Part I. Contact Processes

Again using spatial homogeneity,

L peen EAt) L L peen


00 00

(4.32) EWp(At) = /(x) = E At)an(p),


n=O Ix-el=n n=O

where an(p) is defined in the statement of Lemma 4.26. Since dp2 > 1, that lemma
implies that an (p) is asymptotic to a constant multiple of (dp)n as n -+ 00, so
that by (4.29), (4.31) and (4.32), there is a constant C(p) so that

wp(A)Ewp(At) ::: C(p)EAwp(A t ).

By the Markov property,

EWp(As)Ewp(At) ::: C(p)Ewp(A t+s )'

Iterating this gives

[Ewp(At)f::: C(pt-IEwp(A nt ).

The right hand inequality in part (b) of the proposition now follows by taking nth
roots and passing to the limit, recalling the definition of ¢ in (4.23).

Proposition 4.33. (a) ¢ is nondecreasing in A, and is nondecreasing in p for


p ::: 1/v'd.
(b) ¢ is jointly continuous for A > 0, P > O.

Proof Part (a) comes from (4.23) and (4.28), together with the fact that an(p) is
increasing in p for p ::: 1/..,(J and peen E At) is increasing in A. The monotonicity
of an (p) is easiest to see by pairing up the summands in the expression for it given
in the proof of part (b) of Lemma 4.26, and rewriting the sum of pairs as

pn [( d p 2f k + (d p 2f<n-k) ] = rn/2 [( v'dp r- 2k + (v'dp )2k-n l


Now use the fact that the function x + X-I is increasing for x ::: 1.
For part (b) note first that E W p (At) is jointly continuous in A and p for fixed t,
since it only involves the process for a finite time period. To check this, consider
first continuity in A. Let A; C A;' be the processes with parameters A' < A"
respectively, coupled via the graphical representation. Then

0::: EWp(A;') - EWp(A;) =E L /(x)


xEA;'\A;

x x
4. The Process on the Homogeneous Tree Td 91

where we have used the Schwarz inequality. The first factor on the right is finite
by comparison with the branching random walk process. To bound the second,
use the Schwarz inequality again to get

EIA;'\A;I J
= E(IA;'\A;I, A;' =1= A;) S JEIA;'1 2 P(A;' =1= A;).

Now use (1.19) to bound the first factor above, and both (1.19) and (1.20) to
show that the second factor is small if A" - A' is small. This argument shows
that E W P (At) is continuous in A, uniformly for (A, p) in compact subsets of
[0,00) x (0,00). It is easy to check the analogous statement for continuity in p,
using the inequality

Ipf - p~1 S Ip2 - Plllkl(p~-1 + p~-I) :

!Ew p1 (At) - EW p2 (A t )! S Ip2 - PII L p(x E At)ll(x)l(p:(xl-1 + p;(xl-I).


x

Now, by (4.23) and Proposition 4.27(b),

and for p =1= 1/./J,


EW (At)]t
¢(p) = s~p [ C~p)
The first of these statements implies that as a function of A and p, ¢ is upper
semicontinuous, while the second implies that it is lower semicontinuous, except
possibly when p = l/./J. Therefore, ¢ is continuous, except possibly when
p = l/./J. But by part (a) and Proposition 4.27(a),

where the + and - denote the right and left hand limits of ¢. But strict inequality
is ruled out by the upper semi continuity that holds for all p, including l/./J.

Extinction at the First Critical Value


A key technique in the analysis of At on Td is that of showing that certain sparser
versions of it have the same exponential growth rate that At does. These sparser
versions are useful in constructing branching processes that lie below the contact
process. Since they are branching processes, they are easier to analyze than the
contact process itself. (See Theorem B55.)
The first result of this type will be used below to show that at AI, the contact
process not only dies out, but its expected size remains bounded in t. That is an
important ingredient in the proof that Al < A2. To state the result, for any x E Td,
let S(x) C Td be x, together with all the descendents of x. Then S(x) can be
thought of as a rooted tree with root at x. Let At be the process constructed from
92 Part I. Contact Processes

the graphical representation by using only the recovery symbols in See]) U {e} and
only the infection arrows that join vertices in S (ed U {e}.

Lemma 4.34. If p > 1/ y'd and ¢ (p) > 1, then

lim sup [Ewp(At)]]/t = ¢(p).


t--'>oo

Proof Since At C At, one inequality is clear from (4.23). To prove the other
inequality, note that if y E At n See]), then there must be an infection arrow from
e to e] at some time r < t such that there is an active path from (e], r) to (y, t).
So, since infection arrows occur at rate )...,

(4.35) P(y EAt) :s 1t p(y E A;~s))"'ds


for y E See]). Multiplying (4.35) by pl(y), replacing s by t - s, and summing
gives

(4.36)

By the spatial homogeneity of At and the fact that an (p) is asymptotic to a constant
multiple of (dp)n for p > 1/ y'd, there is a constant C so that
00

EWp(At) = Lan(p)P(en EAt)


n=O
(4.37) 00

:s 1 + C L(dpt peen EAt)


n=]

Also, using the Markov property at time 1 and monotonicity, we have

(4.38)

Combining (4.24), (4.36), (4.37) and (4.38) gives the following inequality, where

C' = Cd)",

r
P(A] = fed)

¢(p) :s [1 + c' f t +] Ewp(As)ds

But ¢(p) > 1 and


lim sup [Ewp(At)f/t < ¢(p)
t--'>oo

would imply that for some I < a < ¢ (p) and all s beyond some point,
4. The Process on the Homogeneous Tree Td 93

which would give a contradiction.

Proposition 4.39. If A = AI, then cp(l) = 1, and hence


(4.40) sup EIAtl < 00
t>O

and At dies out.

Remark. It is not known whether (4.40) is true for the critical contact process on
Zd. It is thought to be false.

Proof We need only prove that cp(l) = 1, since then (4.40) follows from Proposi-
tion 4.27(b). (Recall that WI (A) = IAI.) That (4.40) implies extinction is a standard
Markov chain fact, which follows from

inf
(A:IAI=n)
peAt = 0 for some t) > O.
See (2.5), where the corresponding fact was proved on Zd.
If A > AI, then At survives, and hence (by the argument just given)

lim EIAtl = 00.


t-+oo

Therefore cp(l) > 1 by Proposition 4.27(b). So, Proposition 4.33(b) implies that
cp(l) 2: 1 for A = AI.
For the opposite inequality, we will use Lemma 4.34. For any finite A C Td ,
define its frontier F(A) to be the set of points x E A for which at least one of its
children - call it x' - has Sex') n A = 0. Let A' = the set of x' such that x' is the
child of some x E A and Sex') n A = 0. Since every point in A has d children,
and points in A\F(A) have no children in A',

(4.41) IA'I ~ dlF(A)I·

We will check

(4.42) IA'I2: IAI(d - 1)

by induction on the cardinality of A. Given A =1= 0, choose x E A that has the


maximal value of lex), and let B = A\{x}. Then A' contains all d children of
x, and B' contains at most one point that is not in A' (the child of the nearest
ancestor to x in A, if any). Therefore,

IA'I 2: IB'I + (d - 1).

It follows that if (4.42) holds for B, it holds for A. This is the induction step.
Combining (4.41) with (4.42) gives
94 Part I. Contact Processes

d-I
(4.43) IF(A)I::: -d-1A1.

Suppose now that A satisfies cp(1) > 1. By Lemma 4.34 with P = I,

limsupEIAtl = 00.
t-+oo

By (4.43),
lim sup EIF(At)1 = 00
t-+oo

as well. Choose a t so that EIF(At)1 > 1. Then construct a discrete time process
Bn in the following way: Bo = {e} and Bl = F(At ) for that t. In general, Bn+l
is defined by applying the construction that led from Bo to Bl to each of the
points x E Bn (using the graphical recovery symbols and infection arrows for the
time period [nt, (n + I)t], falling in Sex') U {x}, where x' is a child of x with no
descendents in Bn) and then taking the union of the resulting sets. Then IBn I is a
supercritical branching process that satisfies

Bn CAnt a.s.

Therefore At survives, from which it follows that A ::: A1. So, we have shown that
A < Al implies cp(l) ~ 1. By Proposition 4.33(b), it follows that cp(l) ~ 1 for
A = Al as well.

Existence of an Intermediate Phase


We come now to the final ingredient in the proof that weak survival does occur
on all (exponentially growing) homogeneous trees. It provides a close connection
between the function cp and the issue of whether or not the process survives
strongly.

Proposition 4.44. (a) Suppose that At does not survive strongly. If 1j.j(j ~ PI <
P2 and CP(P2) ::: 1, then CP(Pl) < CP(P2).
(b) If cP (p) < 1 for some P > 0, then At does not survive strongly.

Proof It is enough to prove (a) for PI > Ij.j(j because of Proposition 4.33(a).
By Lemma 4.26(b),
lim a n (Pl) = 0.
an (P2)
n-+oo

Given E > 0, choose N so that

an (Pl)
--<E
an (P2) -
for n ::: N. Applying this, together with (4.28) for both PI and P2 gives

(4.45)
EWpJAt) <
E
+ L:=oan(pj)P(en EAt) .
EW p2 (A t ) - EW p2 (A t )
4. The Process on the Homogeneous Tree Td 95

Since At does not survive strongly, the numerator of the second tenn on the right
side of (4.45) tends to zero as t -7 00. By (4.24) and our assumption,

EW p2 (A t ) 2: [<I>(p2)Y 2: 1.

Therefore,

By Proposition 4.27(b),

[<I>(pdY s [~::~~~:~ C(P2)] [<I> (p2)r.


Part (a) of the proposition follows from the last two statements.
Suppose <I>(p) < 1. By (4.23), there is a t so that EWp(At) < 1. By (4.21)

Mn = wp(Ant)
[Ewp(At)f

is a nonnegative supennartingale, and hence converges a.s. Since the denominator


tends to zero, it follows that

lim wp(Ant)
n---+oo
=0 a.s.

Therefore, P(x E Ant i.o.) = 0 for each x. Every time x E Ant, it remains in the
infected set for an exponential time with parameter 1. So, the process does not
survive strongly.

We come now to the main result in the first part of this section - the one that
is the primary justification for the study of the contact process on Td • It is a very
simple consequence of the developments up to this point.

Theorem 4.46. For all d 2: 2, A\ < A2.

Proof If A = A\, then <1>0) = 1 and At dies out by Proposition 4.39. Therefore
<I>(p) < 1 for I/Jd S P < 1 by Proposition 4.44(a), applied to PI = P and
P2 = 1. Fix such a p. By Proposition 4.33(b), there is a A> AI so that <I>(p) < 1
for this A as well. But Proposition 4.44(b) implies that At does not survive strongly
for this A, so A S A2. Therefore, A2 > A\.

This is probably a good time to see how well we have done so far in proving
analogues of parts (a) and (b) of Theorem 4.8 for the contact process. Recall that
those statements for branching random walk are:
(a) ~t survives if and only if 1/r0) > 1, and
(b) ~t survives strongly if and only if 1/r (1 / Jd) > 1.
So far we have a complete analogue of (a), but only a weaker fonn of one direction
of (b):
96 Part I. Contact Processes

(a') '7t survives if and only if ¢ (1) > 1, by Propositions 4.27(b) and Proposition
4.39, and
(h') ¢ (1 /,Jd) < 1 implies that '7t does not survive strongly, by Proposition
4.44(b).

The Sequeuce u and Its Growth Parameter f3(A)


Recalling the close connection between 1jr and v in Theorem 4.8 (concerning
branching random walks), it should not be surprising that it is useful to study the
sequence
u(n) = peen E At for some t), n ~ o.

Our objective is to prove as much of the analogue of Theorem 4.8 in the contact
process context as we can. It turns out that we will be able to prove somewhat
weaker versions of essentially everything, except for statements that involve ex-
plicit formulas. Later we will use these results in a number of ways, including
a proof of the complete convergence theorem above A2, and a construction of
nontrivial invariant measures in the intermediate phase A\ < A < A2. Another
interesting fact that will emerge is that u (n) is discontinuous as a function of A at
A2 for every n ~ 1 - see Theorem 4.65(f). The analogous statement for branching
random walks (at least for large n) follows from Theorem 4.8 (see also (4.16)),
which implies that

2,Jd 1
v(n) > 1 - - - n>1 A > - -
- d+ l' -, 2,Jd
1
v(n) ::s d- nj2 , n ::: 1, A ::s 2,Jd.
We start by obtaining some inequalities that will lead to the existence of an
exponential decay rate for u(n). In order for At to reach en +m it must first reach
en. Letting
r = inf{t > 0 : en E Ad,
one can use monotonicity and the strong Markov property to show that

(4.47) u(n + m) ~ u(n)u(m),


as follows:

u(n + m) = P(e n +m E At for some t)


= E[pAr(en +m E At for some t), r < 00]
~ E[ pen (e n +m E At for some t), r < 00]
= u(n)u(m).

So, the logarithm of u is superadditive, and hence


1
(4.48) f3(A) = n-+oo
lim [u(n)]"
4. The Process on the Homogeneous Tree Td 97

exists and satisfies

(4.49)

by the discrete version of Theorem B22. Note that (4.48) can be regarded as a
Cesaro version of the convergence statement in Theorem 4.8(f).
We will begin to make the connection between ¢ and f3 by proving two
inequalities. It turns out that both are in fact equalities (the first one for A < A2 -
see Theorem 4.83 and Corollary 4.78), but the proofs of the reverse inequalities
are harder, and will be deferred until we develop some more machinery. Often we
will show explicitly the dependence of ¢(A, p) = ¢(p) on A as well as on p.

Proposition 4.50. (a) ¢(A, f3(A)) 2: 1 for A > O.


(b) f3(Ad s ~.

Proof For part (a), suppose that A, p > 0 satisfy ¢(A, p) < 1. Then

p-n [00 P(e_ n E At)dt S L pl(x) [00 P(x E At)dt


10 x 10
(4.51 ) = 100 EWp(At)dt
c(p)
< <00
- Ilog¢(A, p)1

if p =f. 1/./d by Proposition 4.27(b). Once e_ n EAt. e_ n remains infected for an


exponential time of mean one, so it follows from (4.51) that
C(p)pn
u(n) S -C
I l-og-¢'-(A-,-p-)I,
and hence that f3 (A) ::5 p. Restating this, we see that

(4.52) f3(A) > p implies ¢(A, p) 2: 1.

Letting p approach f3(A) and using Proposition 4.33(b) gives part (a).
Part (b) also follows by this argument. To see this, recall that Proposition 4.39
implies that ¢(A\, 1) = 1, and Propositions 4.27(a) and 4.44(a) then imply that
¢(A\, p) < 1 for ~ < p < 1. By (4.52), f3(A\) S ~.

Lemma 4.34 said that the process At C At is equivalent to At. at least in the
sense of the exponential rate of growth of EWp(A t ), and therefore the function
¢ could be defined equivalently in terms of either process. The next result is a
similar statement relative to the definition of f3. There are two ways in which
quantities are made smaller below, and both tum out to be inconsequential in
terms of exponential growth rates: The t is brought out of the probability, and At
is replaced by At in the definition of u(n). Let B(x, n) = {y E Td : Iy - xl S n}
be the ball centered at x of radius n, and as usual, B(n) = B(e, n).
98 Part 1. Contact Processes

Lemma 4.53. For any A,


I

lim [suPP(e n E
n-:H)O t
At)]~ = f3(A).

Proof One inequality is clear, since

sup peen EAt) ::: peen E At for some t).


t

For the other inequality, we argue as follows: Let vk(m) be the probability that
there is an active path in the graphical representation from (e,O) to (em, t) for
some t ::: k that remains inside B(k). Then

(4.54) lim vk(m) = u(m)


k-+oo

since the union of the events whose probabilities appear on the left is the event
whose probability is u(m). Now take positive integers j, k, m, n satisfying

(4.55) jm +k::: n.
Let Xo = en-jm,XI = en-(j-I)m,'" ,Xj = en. Suppose that there are times 0 <
TO < TI < ... < Tj so that Ti+1 is a stopping time relative to the post Ti collection
of Poisson processes in the graphical representation and TO ::: 1, Ti +I - Ti ::: k for
o ::: i < j, and there are active paths

from (e, 0) to (xo, TO) without exiting S(el) U {e},


from (xo, TO) to (XI, TI) without exiting B(XI, k),
from (XI, TI) to (X2, T2) without exiting B(X2, k),

from (Xj_l, Tj-d to (Xj, Tj) without exiting B(xj' k).

Then en E At for some t ::: 1 + jk ::: nk. Applying the strong Markov property
to the Poisson processes in the graphical representation and spatial homogeneity
gives

peen E At for some t ::: nk) ::: P(en-jm E At for some t ::: 1)[vk(m) t
On the other hand

peen E At for some t ::: nk) ::: nk max peen E At for some t E [i, i + I)).
O:::i<nk

Since sites in At remain in At for an exponential time with parameter 1,

Combining the last three inequalities gives


4. The Process on the Homogeneous Tree Td 99

-I
(4.56) sup peen EAt)::: =--k P(en-jrn E At for some t :::: l)[vk(m)t
t n

Take nth roots and then for fixed k, m, let n -+ 00, with j = [n;;;k] so that (4.55)
is satisfied throughout. Note that with this choice, n - mj takes the finitely many
values k, ... , k+m, so that the probability on the right of (4.56) takes only finitely
many (positive) values as n, j -+ 00 in this way. Therefore
I

liminf[suPP(en
n..-....+oo t
EAt)]"::: [vk(m)]~.
Now let k -+ 00 and then m -+ 00 and use (4.48) and (4.54) to complete the
proof.

Next we use Lemma 4.53 in order to embed a supercritical branching process in


the contact process in much the way that we used Lemma 4.34 to prove Proposition
4.39. The following proposition will be used for two purposes: (i) to deduce further
properties of the function f3 below A.2 that complement Proposition 4.50, and (ii)
to prove the complete convergence theorem above A.2.

Proposition 4.57. If
I
(4.58) f3(A.) > ..{J'

then

(4.59) inf Pee


t
EAt) > O.

Proof Suppose that (4.58) holds. It will be slightly more convenient here to re-
define At so that it is constructed using all the Poisson processes in See), instead
of only those in S(el) U {e}. This is even larger than the process used in Lemma
4.53, so its statement holds for this redefined process also. It is not hard to check
that the conclusion (4.59) holds for the modified At if and only if it holds for the
original one. By Lemma 4.53, there exist a > ./ct, n :::: 1 and t > 0 (that we now
fix) so that

(4.60)

We will construct an embedded branching process in the following way. Let


Bo = {e}, and BI = {x E At : Ix - el = n}. For each point x in B 1, use the
same rules that led from Bo to BI to construct a random subset B(x) of {y E
See) : Iy - el = 2n}, and then let B2 = UXEBI B(x). Continuing the construction
in this way, and then taking the cardinalities of the resulting sets, we obtain a
branching process IBj I whose offspring distribution is bounded and has mean
d n an. Furthermore, with the appropriate coupling,
100 Part I. Contact Processes

Bj C A jt .

The basic limit theorem for supercritical branching processes (Theorem B55(c»
says that
.
11m IBjl
.
j-+oo (d n a n )1

exists a.s. and is not identically zero. Therefore, there is an E > 0 so that

(4.61)

for all sufficiently large j. Now let

ri = pee E A 2ijt ).

Then

(4.62) ri+l ::: P(x E A(2i+l)jt for some x, Ix - el = nj)P(enj E Ajt ).

The same argument that led to (4.47) gives

(4.63)

where the equality comes from (4.60). To handle the first probability on the right
of (4.62) note that for each x E Bj , there is probability at least ri that x E A(2i+ l)jr.
and the appropriate events with those probabilities are independent for different
x's. Therefore, letting N be the integer part of E(da)n j , we have

(4.64) P(x E A(2i+l)jt for some x, Ix - el = nj) ::: P(IBj I ::: N)[l - (1- ri)N].

Combining (4.61-4.64) gives

where
fer) = E[1 - (1 - r)N]a nj .
Note that f (0) = 0 and f' (0) = EN a nj ::: E2 (da 2)nj - w nj , which can be taken
to be > 1 by taking j large since da 2 > 1 and a < 1. In this case, since
f(1) = w nj < 1, f has a fixed point r* in (0, 1), i.e., f(r*) = r*.
Now we can prove inductively that ri ::: r* for i ::: o. Since ro = 1, the basis
step is automatic. If ri ::: r*, then the monotonicity of f in r implies that

ri+l ::: f(ri) ::: f(r*) = r*.


This gives the induction step, and therefore we have proved (4.59) along a discrete
skeleton of times. To extend to all times, simply use

pee E As) ::: e-2jtri' 2ijt S s S 20 + l)jt.

From Proposition 4.57 we get immediately the following result, which collects
a number of properties of {3, particularly in the vicinity of ).,2.
4. The Process on the Homogeneous Tree Td 101

Theorem 4.65. (a) f3(A) is nondecreasing and left continuous in A.


(b) f3(A2) = ~.
(c) At A = A2, At survives weakly, not strongly.
(d) If A > A2, then f3(A) = 1.
(e) Iff3(A) < ~, then¢(~) < 1.
(j) As afunction of A, u(n) is discontinuous at A2for every n :::: 1.
(g)
A
f3(A) :::: I + A'
(h)

Remark. In part (a), only the left continuity of f3 is asserted. Note that by parts
(b) and (d), f3 is not right continuous at A2.

Proof of Theorem 4.65. The monotonicity of f3 is clear from (4.48) and the fact
that each u(n) is nondecreasing in A. For the left continuity, write

u(n) = lim peen E At for some t ::s T)


T---+oo

for fixed n. The probability on the right is continuous in A for fixed T, since it
involves the graphical representation for only a finite time period. Since u (n) is
an increasing function of A, it follows that u(n) is left continuous in A. By (4.48)
and (4.49),
1
f3(A) = supu(n);;,
n

so (a) holds. Since At CAt. (4.59) implies strong survival. Therefore by Proposi-
tion 4.57, A < A2 implies f3(A) ::s ~. Combining this with (a) gives one inequality
in (b): f3(A2) ::s ~. We will return to the other inequality shortly. For part (c), it
suffices to use this half of (b), which implies that limn u (n) = 0 by (4.49), and to
note that

(4.66) Pee E At for a sequence of times t t 00) ::s u(n)


for each n and A > O. To check this last statement, fix n and let E = peen E
AI) > O. By monotonicity and the Markov property,

peen E At+ll.%"> :::: E a.s. on {e E Ad.

Now apply the extended Borel-Cantelli Lemma (page 240 of Durrett (1996)).
Part (d) also follows from (4.66), since it implies that u(n) is bounded below for
A> A2.
102 Part I. Contact Processes

Turning to (e), use (4.28), (4.49) and Lemma 4.26 to write


00 00

sup EWp(At) :s I>n(pH.B(A)r:s CL)dP.B(A)f < 00


t n=O n=O

for some constant C, provided that Ja


< p < dr/po)' Therefore, ¢ (p) :s I for
such p's by (4.23). By parts (c) and (d), the process does not survive strongly
if .B(A) < Ja.
So, we can apply Proposition 4.44(a) to conclude that ¢(p) < I
for Ja < p < dfl~A)' and hence the conclusion holds since ¢ has its minimum at
1/ -Jd by Proposition 4.33(a). The remaining inequality in (b) now follows easily:
If .B(A2) < Ja, then ¢(Ja) < I by part (e). By Proposition 4.33(b), this will still
be true for slightly larger values of A. But then At does not survive strongly for
those A'S by Proposition 4.44(b), which contradicts the definition of A2.
For the proof of (t), consider the function
fA(A) = Pee E At for some t :::: 0), A finite.

By the strong Markov property, fA satisfies


LXEA fA\(x} (A) + A LXEA,yj!A fAU(y} (A)
(4.67) fA (A) = Ix-yl='
IAI + A#{(X, y) : x E A, y 1. A, Ix - yl = I}
for A C See,). Since each fA is nondecreasing in A, this implies that if fA is
continuous at A for some A C See,), then so are fA\(x} for each x E A, and fAu(y}
for each y 1. A that is a neighbor of some x E A. Inductively, it follows that fA is
continuous for some A C See,) if and only if it is continuous for all A c See,).
Since u(n) = f(enl(A), it follows that u(n) is continuous at A for one n ::: 1 if and
only if it is continuous at A for every n ::: 1. Note that above A2,
(4.68) Pee E At for a sequence of times t t 00) = peAt =1= 0 V t).

The proof of this is similar to that of (4.66): Let


G = {e E At for a sequence of times t too}.

Then by monotonicity and the Markov property,

P(GI..¥0 = pAs (G) :::: u(n)P(G)I(enEA,},

so that by (4.66),
P(GI..¥0 :::: P(G)21(As*0}.
By the martingale convergence theorem,

P(GI..¥0 -+ Ie a.s.
as s -+ 00. Since P(G) > 0 for A above A2, it follows that
{As =1= 0 V s} c G,
and hence that peAs =1= 0 V s) :s P(G). The other inequality is clear.
4. The Process on the Homogeneous Tree Td 103

Continuing with the proof of (f), it now follows from (4.66) that

u(n) ?: E for all n?: 1, A> A2,

where
E = peAt =1= 0 'V t)1),=),2 > O.
But by (4.49) and part (b) above,
1
u(n) < - for A <_ A2.
- dn/ 2

It follows that u(n) is discontinuous at A2 once E2d n > 1, which completes the
proof of (f), since we observed earlier that a discontinuity in one u (n) leads to a
discontinuity in all the u(n)'s.
For part (g), apply (4.67) with A = {el}, using monotonicity and homogeneity,
to get

[1 + (d + l)A ]u(l) = Vle,ed(A) + dV{el,e2J(A) ?: AU(O) + dAu(l),


so that

(4.69) (1 + A)U(l) ?: A.
Combining (4.69) with (4.49) gives (g). Part (h) is an immediate consequence of
parts (b) and (g).

The Complete Convergence Theorem


An important application of Proposition 4.57 is the complete convergence theorem,
which we now state and prove.

Theorem 4.70. If A > A2, then


A~ => <YA(A)V + [1 - <YA(A)]80

as t ~ 00 for any initial configuration A C Td , where => denotes weak conver-


gence, v is the upper invariant measure of the process, and <YA (A) = P(A~ =1= 0 "It)
is the survival probability for the process started at A.

Remarks. (a) The above statement is trivially true for A :::: A\, since then v = 80.
It is trivially false for AI < A :::: A2, since the limiting distribution is 80 whenever
A is finite.
(b) An immediate consequence of Theorem 4.70 is that 80 and v are the only
extremal invariant measures if A > A2. Later, we will see that there are infinitely
many extremal invariant measures in the intermediate regime - see Theorems
4.107 and 4.12l.

Proof of Theorem 4.70. Assume A> A2. By Theorem 4.65(d), {3(A) = 1, and then
by Proposition 4.57,
104 Part I. Contact Processes

inf Pee EAt) > O.


t

For each x E See) with Ix - el = n, let A;


be the process with initial state {x} that
is constructed from the graphical representation using only the Poisson processes
corresponding to vertices in S (x). These evolve independently for different x' s,
and are translated copies of At. Also,
-x B(n)
U XES(e) At C At .
Ix-el=n

Therefore

so that
lim liminf p(A~(n) n B(n) =1= 0) = l.
n~oo (---+00

The result then follows from (4.68) and Theorem 1.12.

Continuity of the Survival Probability


Since the survival probability CXA(A) appears in the limit on the right in Theorem
4.70, this is a natural time to prove that it is a continuous function of A. Note the
contrast with Theorem 4.65(f).

Theorem 4.71. For any A C Td, CXA(A) is continuous on [0,00).

Proof Clearly CXA == I if A is infinite, so we will consider only finite A's. Since
more than one value of J... will be used below, we use a subscript to indicate its
value: PAO. The right continuity of CXA is immediate, since

as t t 00, and PA (A~ =1= 0) is increasing and continuous in A for each t. To prove
left continuity, define the event Dn,t by

Dn,t = {A: :J B(x, n) for some x E Td and s ::: t}.

Take Al < A" < A' < A, and use the strong Markov property, monotonicity and
spatial homogeneity to write

Since the first factor in the products above depends on the graphical representation
only up to time t, it is continuous in A', so that we can let A' t A and conclude
that
CXA(A-) 2: P). (Dn,t)CXB(n) (J...II).
Now let t t 00 and note that PA (Dn,oc;) = CXA (A) for every n. The reason for this is
that if x E At for some x and t, then there is a positive probability (depending on
4. The Process on the Homogeneous Tree Td 105

n) that B(x, n) C At+l, so on the survival event, this will occur with probability
1. Therefore,
aA(A-) 2: aA(A)aB(n)(A").
But by duality (1.7), and the fact that v(0) = 0 for A> Al by (1.5),

lim aB(n)(A)
n---+oo
= n---+oo
lim v{B: B n B(n) =1= 0} = 1,
so that aA (A-) 2: aA (A). It then follows that aA is left continuous at every A > AI.
But it is == 0 for A S Al (by definition for A < AI, and by Proposition 4.39 for
A = AI), so it is left continuous everywhere.

Remark. Continuity of the survival probability (above the critical value) was
proved for the one dimensional contact process on page 266 of IPS using the fact
that there are only two extremal (translation invariant) invariant measures. For the
contact process on Td , this proof works perfectly well, once that fact is proved,
and it can be proved in a manner similar to that on page 168 of IPS.

The Growth Profile


When we get to the point of studying the invariant measures for At for AI < A <
A2, it will be important to find roots of the equation ¢ (p) = 1. In order to do
so, it turns out to be convenient to express the function ¢ in terms of another
function that is known as the growth profile because it records information about
the location of infected sites at large times.
To begin, define
u(n, t) = peen EAt)
for integer n 2: 0 and real t 2: O. The arguments that led to (4.47) give also

(4.72) u(m + n, s + t) 2: u(m, s)u(n, t),


and hence
u(m + n, (m + n)t) 2: u(m, mt)u(n, nt).
Taking logarithms again, it follows from the discrete version of Theorem B22 that
I
V(t) = lim u(n, nt)~
n--+oo

exists and satisfies

(4.73) u(n, nt) S V(tt, n 2: O.

The following result collects some elementary properties of the function V.

Proposition 4.74. The function Vet) is logconcave (i.e., log V is concave) on


(0, (0) and hence is continuous there, and satisfies

(4.75) sup Vet) = f3(A)


t>O
106 Part I. Contact Processes

and

(4.76) limV(t) = 0.
ttO

Proof Take °< a < 1 rational, and m, n so that a = m~n. By (4.72),

u(m + n, (m + n)(as + (1 - a)t)) ::: u(m, ms)u(n, nt).

Taking (m + n )th roots and passing to a limit leads to


(4.77)

To conclude that V is logconcave, we would need this to be true for all


not just rational ones. Note however that by (4.73),
° < a < 1,

1
Vet) = supu(n, nt)., t > 0,
n

so that by the continuity of u(n, .), V is lower semicontinuous:

(4.78) Vet) s liminfV(s).


s-H

Given a E (0, 1) and s, t > 0, choose rational an -+ a and define Sn, tn by


anS n = as and (1 - an)tn = (1 - a)t. Then

V(as + (1- a)t) = V(ans n + (1- an)tn )::: V(sn)a n V(tn)l-a n ,

so that (4.77) follows from (4.78). This proves the logconcavity statement.
Turning to (4.75), note that u(n, t) s u(n), so (4.49) implies that Vet) s fJ(A)
for all t. On the other hand, since Ant CAnt,

by (4.73), so that Lemma 4.53 implies

sup Vet) ::: fJ(A).


t

To prove (4.76), couple At with a pure birth process X t on the nonnegative


integers that has birth rates A so that en E At implies X t ::: n. To do so, simply
let Bt be the (increasing) process obtained from the graphical representation by
suppressing recoveries, and put

Then

< P(Xt _
u(n ,t)- > n) = peSn_< t) <
-
Eeo(t-Sn) = eOt (_A_)n
A+8 '
4. The Process on the Homogeneous Tree Td 107

where Sn is the partial sum of n i.i.d. exponential random variables with parameter
A, and e > O. Replacing t by nt and optimizing over e gives the bound

(4.79) u(n, nt)~ S Ate l - At for At S 1.

Letting n ---+ 00 and then t -J, °in (4.79) gives (4.76).


Lemma 4.80. If f3 (A) < 1/"fd, then
1 1
(4.81) 0< lim [U(t))'
t~oo
= inf[U(t))'
1
< 1.

Proof Since U is logconcave by Proposition 4.74, the limit in (4.81) exists and
equals the infimum. Since u(O, t) S EWp(At) for any p > 0,
1
(4.82) lim [u(O,
1-+00
t))' S ¢(p)

by (4.23). (The limit on the left exists by Theorem B22, since u(O, s)u(O, t) s
u(O, s + t).) The argument that led to (4.72) also implies

u(n, s)u(n, t) S u(O, s + t),


and in particular,
1 1 1
[u(n, nt)]"' [u(n, n)]"' S [u(O,(n+ 1)t)]"'.

Passing to the limit as n ---+ 00 and using (4.82) and the definition of UO gives
1 1
[U(t)]'[U(1)]' S ¢(p).

The second inequality in (4.81) now follows from Theorem 4.65 (e). The first
inequality is easier: Take n = 1 in (4.73), and use the fact that once el E At, el
stays infected for an exponentially distributed time.

Now we are in a position to relate the functions ¢ and U, and then to show
that f3 can be used to generate solutions to the equation ¢ (p) = 1. Recall that
this was the reason for introducing the growth profile U. Recall in this connection
Proposition 4.50(a), which gave one inequality in (4.85) below.

Theorem 4.83. Iff3(A) < 1/"fd, then


1 1
(4.84) ¢(p) = sup [dpU(t))' for p>-
0<1<00 "fd
and

(4.85)
108 Part I. Contact Processes

Proof For the first statement, fix a p > II Jd, and let a be the supremum
appearing on the right of (4.84). Note that by (4.28),
00

EWp(At) = Lan(p)u(n, t).


n=O

Since an (p) is asymptotic to a constant multiple of (dp)n for p > 1I Jd by Lemma


4.26(b), it follows from (4.23) that

(4.86) ¢(p) = t!i~ [~(dPYU(n, t)] f.

By Proposition 4.74 and Lemma 4.80, the function f(t) = 10g[dpU(t)] is concave
and has limiting slope 00 at t = 0 and a finite negative limiting slope at 00.
Furthermore, log a is the supremum of the slopes of lines joining the origin to
points on the graph of f. If a' < a, then the line with slope log a' must intersect
the graph of f at some point, call it s. Then f(s) = s loga'. Therefore
00

L(dpYu(n, t) 2: (dp)[t/s1u([tls], t),


n=O

so that by (4.86) and the definition of U,


I ,
¢(p) 2: [dpU(s)F =a .

Taking a' ---+ a gives ¢(p) 2: a.


For the other inequality, choose 0 < E < min(1, a). By (4.76), there is a 8> 0
so that dpU (t) ::: E for t ::: 8. By the definition of a, dpU (t) ::: at for all t > O.
Using these bounds and (4.73), we see that

Taking tth roots and passing to the limit using (4.86) gives ¢ (p) ::: a, which
completes the proof of (4.84).
The first equality in (4.85) is just Proposition 4.27(a). Taking p = IldfJ(A) in
(4.84) gives

where the inequality comes from Proposition 4.74. Combining this with Proposi-
tion 4.50(a) concludes the proof of (4.85).

Corollary 4.87. (a) fJ(A) is continuous on [0, A2].


(b) fJ(Ad = ~.
(c) fJ(A) > ~ for A > AI.
4. The Process on the Homogeneous Tree Td 109

Proof Recall that f3 is left continuous on [0, (0) by Theorem 4.65(a), so we need
only check right continuity on [0, A2). By Proposition 4.50(a),

(4.88) <p(A, f3(A)) ::: 1, A> O.

Therefore by Proposition 4.33(b),

(4.89) <p(A, f3(A+)) ::: 1, A> O.

Take A < A2. We want to show that f3(A+) = f3(A). If not, then f3(A) < f3(A+) :::
~ by Theorem 4.65, so that (4.88) is an equality by Theorem 4.83. But this
contradicts Proposition 4.44(a) with
1
PI = df3(A+) and P2 = df3(A)·
To prove part (b), it suffices to note that A > Al implies that
1
f3(A) ::: d'
since otherwise
d + 1 00
EI Ut Atl ::: Lu(lx - el)::: -d- L [df3(A)f < 00
x n=O

by (4.49). Therefore by part (a), f3(AI) ::: ~. Combining this with Proposition
4.50(b) gives the result.
For (c), suppose A > Al and f3(A) = ~. Then <p(l) = 1 by Theorem 4.83,
so SUPt EIAtl < 00 by Proposition 4.27(b). It follows that At dies out, which
contradicts the definition of AI.

Invariant Measures in the Intermediate Regime - First Construction


Perhaps the most important application of Theorem 4.83 is to the construction of
invariant measures for the contact process on Td. ForA ::: AI, the process dies
out by Proposition 4.39, and hence the only invariant measure is 80. If A > A2,
the Complete Convergence Theorem 4.70 implies that all invariant measures are
convex combinations of 80 and the upper invariant measure v. Here we will take
Al < A < A2, and will see that the situation is much more complex in this regime.
We will also assume explicitly that f3 < ~. This is an assumption that has already
appeared several times - in Lemma 4.80 and Theorem 4.83, for example. Later
we will show that this is automatically true when A < A2, so the reader should
mentally replace the assumption f3 < ~ by A < A2 in the results below.
It follows from these assumptions that we are in the following situation (recall
Corollary 4.87(c)):
1 1 1
(4.90) -<f3<-<-<1
d ../d df3 '
110 Part I. Contact Processes

and then
1
(4.91) ¢({3) = ¢(d{3) = 1

by Theorem 4.83. In order to construct nontrivial invariant measures for Ar. start
by taking a nonnegative function a on Td. Let JL = JLa be the product measure on
the subsets of Td with marginals

JLa{A : x E A} = min[a(x)(d{3)-lx-e l, I],

and let JLt be the distribution of the process at time t when the initial distribution
is JL. By duality (1. 7), for any finite set B,

(4.92) JLdA : An B =I=!O} = 1- E n


xeA~
[1 - a(x)(d{3)-lx- e 'r.
We will construct invariant measures by passing to a limit in JLt as t ---+
00, obtaining an invariant measure that has marginals with the same asymptotic
properties as JLa as x recedes from e. Define

M tB = L a(x)(d{3)-lx-e l.
xeA~

As we will see shortly, this quantity essentially controls the behavior of JLr. so
we need to develop some of its properties. In fact, in the proof of Theorem 4.107
below, we will prove and use the following bounds

EMtB - ~E(MtB)2 ::::: JLt{A : An B =1= !O} ::::: EMtB.

Recall that M t = M1 by convention.

Lemma 4.93. There are positive constants C 1 , C z so that

C 1 infa(x) ::::: EMf::::: Cz supa(x), t :::: O.


x x

Proof It is enough to consider the case a == I. In this case, since there are
(d + l)d n - 1 sites a distance n away from e for n :::: I,

d + I 00
(4.94) EMt = E L(d{3)-lx-e l = Pee EAt) + -d- L peen E A t ){3-n.
xeA t n=l

On the other hand, taking p = 1/d{3 > l/,jd in (4.28) and using the fact that by
Lemma 4.26(b), an(p) is asymptotic to a constant multiple of (dp)n = (3-n we
see that the right side of (4.94) is bounded above and below by constant multiples
of EWp(At). The lemma now follows from Proposition 4.27(b) and (4.91).
4. The Process on the Homogeneous Tree Td 111

F or the next result, recall that S (x) is x, together with all the descendents of x.

Lemma 4.95. If n > k > 0, then

L (df3)-lx-e l = r k f3- k - n ,
Ix-ekl=n
XES(ek)

d-l . 2·
L I I k
(df3)-x-e = - d - r J f3- ]-n+ for 0 < j < k,
Ix-eki=n
XES(ej )\S(ej+l)

and
L (df3) -Ix-el = f3k-n.
Ix-ekl=n
xETd\S(ej)

Proof This is an easy consequence of the following observations:

x E Seek) ::::} Ix - el = k + Ix - ekl,

x E S(ej)\S(ej+d ::::} Ix -el = j + Ix - ejl, Ix -ekl = Ix -ejl +k - j


for 0 < j < k, and

Using the above lemma, we can investigate the limiting behavior of EM;k
as t ---+ 00. The boundary aTd of Td is defined as the collection of semi-infinite
self-avoiding paths emanating from e. This is a natural extension of the idea of
identifying a vertex x E Td with the (finite) path that leads from e to x. A base for
the natural topology on aTd is given by collections D(xo, ... ,xn ) of such paths
that share a common initial segment {e = Xo, Xl, ... ,xn }. With this topology,
Td U a Td becomes a compact metrizable space. An example of a metric for this
topology is the following. Fix a e E (0,1). For x, y E Td U aTd, let z be the
endpoint (other than e) of the intersection of the self-avoiding paths from e to x
and from e to y respectively. Then

dist(x, y) = e le - ZI [2 - e 1x - zl - e1y-zll

Note that S(ej) can be naturally interpreted as a subset of aTd for j :::: 1:
S(ej) = D(eo, ... ,ej). Let y be the uniform probability measure on aTd, i.e., the
one for which
1
y(D(xo, ... ,xn )) = (d + l)d n - l ' n:::: 1.

We will use the standard comparison process Nt = LXEA t (df3)-lx-e l (i.e., Mt


when a == 1).
112 Part I. Contact Processes

Lemma 4.96. Suppose that a is uniformly bounded on Td, and that the limit

(4.97) a(z) = X---"z


lima(x)

exists for a. e. z E aTd with respect to y. Then

where
for z E Seek),
for Z E S(ej)\S(ej+d, 0 < j < k and
for Z E Td\S(ed.

Remark. It is not hard to construct many bounded a's that satisfy (4.97). To see
this, let Xn be the discrete time simple random walk on Td , i.e., the one that moves
to each neighbor with probability I/(d + I). Since Xn is transient,

Xoo = lim Xn
n---..oo
E aTd
exists a.s. If a is a bounded Borel function on aTd , one can define an extension
a on Td by

Then a is a bounded harmonic function for X n, so a(Xn) is a bounded martingale,


which converges a.s. In fact, by the Markov property,

If X0 = e, then the distribution of X 00 is y. But Xn must visit every point on the


self-avoiding path leading from e to Xoo , so (4.97) holds.

Proof of Lemma 4.96. Write

EM;k = La(x)(d.B)-lx-elp(x E A~k)


x

= L a(x)(d.B)-lx-elp(x E A~k)
(4.98) Ix-e,I::ok
00

+ L peen EAt) L a(x)(d.B)-lx-e l.


n=k+! Ix-eki=n

By (4.97), Lemma 4.95, and the bounded convergence theorem,

(4.99) 1· Llx-e,l=n,xES(ek) a (x)(d.B)-lx-el


n~~ '" -Ix-el = (
I
) a(z)dy,
f
L..lx-ekl=n,XES(ek) (d.B) y Seek) S(ed
4. The Process on the Homogeneous Tree Td 113

with similar statements holding for sums and integrals over S(ej)\S(ej+l), 0 <
j < k and over Td\S(el). Recalling that Nt is Mt when a == 1, comparing (4.94)
to (4.98), and using (4.99), Lemma 4.95,

and the facts that


lim peen EAt)
t-'?oo
=0
for each n, and ENt is bounded below by Lemma 4.93, one obtains the statement
of the lemma.

Nothing in the previous developments depends on using ek, as opposed to any


other x E Td such that Ix - el = k. So, defining Gx(z) on aTd for x E Td\{e} in
an analogous way, we can conclude that

(4.100) EM
lim _
t-'?oo
_t
E Nt
x
= 1 aTd
a(z)GxCz)dy, x i= e.
Using the explicit expression for G, one can check that

(4.101)
.
hm (dfJ)lx-e l
X-'?W
1 aTd
a(z)Gx(z)dy =
(d
d(l - fJ2)
+ 1)(1 -
2 a(w),
dfJ )
for a.e. w E aTd.

For example, suppose w = {eo, el, ... } E aTd, and assume that a(x) -+ a(w) as
x -+ w. Then

(4.102)

Passing to the limit using the expressions for y(S(ek)), ... given in the proof
of Lemma 4.96 and summing the resulting geometric series gives (4.101) in this
case.
We will also need information about the second moments of The first M:.
step is an application of the BKR inequality (or more specifically, the special case
known as the BK inequality - see the discussion of Theorem B21).

Lemma 4.103. If y i= z, then

P(y,z E A:)::: A L t[P(u E A:)+P(v E A:)]P(y E A~_s)P(z E A~_s)ds.


lu-vl=110
114 Part I. Contact Processes

Proof The events {y E An


and {z E An
are defined in tenns of the Poisson
processes used in the graphical representation up to time t. These Poisson processes
can be discretized in time, yielding a collection of independent Bernoulli random
variables. In applying Theorem B21, we will prove the inequality we want for the
corresponding events defined in tenns of the Bernoulli random variables, and will
then pass to the limit. In order not to complicate the notation, we will not discuss
this discretization explicitly in what follows.
If in the graphical representation, there are two active paths, one leading from
(x, 0) to (y, t) and the other leading from (x, 0) to (z, t), then there is a last point
(u, s) that they share. The two paths can be taken to agree up to that point. There
is a v satisfying lu - vi = 1 and an infection arrow from (u, s) to (v, s) so that
the paths are disjoint after s. Therefore,

P(y, Z E A:) ::::: A L t P(u E A~)P(3 disjoint active paths from (u, s)
Iu-vl=! 10
to (y, t) and from (v, s) to (z, t), or 3 disjoint active paths
from (u, s) to (Z, t) and from (v, s) to (y, t))ds.
By Theorem B2l,
P(3 disjoint active paths from (u, s) to (y, t) and from (v, s) to (z, t»
::::: P(3 an active path from (u, s) to (y, t»
x P(3 an active path from (v, s) to (Z, t»
= P(y E A~_s)P(z E A~_s)'
Applying this twice and using symmetry in the resulting sum gives the result.

Lemma 4.104. lfsupx a(x) < 00, then

lim (d{3)lx- el limsup E(M:)2 = O.


X-'>aTd t-'>oo

Proof Since a is bounded, we may as well take a == 1. Then by Lemma 4.103,


E(M:)2 = L(d{3)-ly-e l- 1z -e 1p(y, Z E An
y,z

:::::L(d{3)-2 Iy -e 1p(YEA:)+A L t[P(UEA~)+P(VEA~)]


y lu-vl=! 10
(4.105) x L(d{3)-ly-e 1p(y E A~_s) L(d{3)-lz-e p(z E A~_s)ds
1

y z
= L(d{3)-2 Iy -e P(y El An
y

+ 2A L 1t P(u E A~)EMtU_sEMtV_sds.
lu-vl=! 0
4. The Process on the Homogeneous Tree Td 115

Take a p satisfying

max ( 5a, (d~)2 ) < P< d~'


The first term on the right of (4.105) is bounded above by

L ply-e l P(y E A:) S L pl(y) P(y E A:)


(4.106) y y

= pl(x) EWp(At) S pl(x)C(p)[¢(p)r,

where the final inequality comes from Proposition 4.27(b). The right side of (4.106)
tends to zero as t -+ 00 by (4.90), (4.91) and Proposition 4.44(a). To handle the
second term on the right of (4.105), use (4.100) to write

lim sup L
t-'>oo lu-vl=!
t P(u
Jo
E A;)EM~_sEMtV_sds

S sup (ENt )2 L roo P(u E A;)ds { Gu(z)dy { Gv(z)dy.


t lu-vl=! Jo JaTd JaTd
By (4.101) and Lemma 4.93, the expression on the right above is bounded by a
constant multiple of

L(dtn-2Iu-el roo P(u E A;)ds,


u Jo
and this is bounded above by the integral (on t) of the left side of (4.106). Since
E(M:) 2 depends on x only through Ix - e I, we have shown that there is a constant
independent of x so that

lim sup E(M:)2 S Cplx-e l,


t

from which the result follows, since d{3p < 1.

We are now ready to carry out the basic construction of invariant measures.

Theorem 4.107. Suppose that a is uniformly bounded on Td, and that the limit

(4.108) a(z) = lim a(x)


X-'>Z
exists for a. e. z E aTd with respect to y. Then there exists an invariant measure va
for the contact process that satisfies
(4.109) lim va{A : x E A}(d{3)lx-e l = a(z) a.e. z E aTd
X-'>Z
and
(4.110) lim lim sup sup va{A : x, y E A}(d{3r = o.
k-,>oo n-'>oo Ix-el~n.ly-el~n
Ix-YI~k
116 Part 1. Contact Processes

If a] S a2 are two such a's, then the corresponding invariant measures can be
taken to satisfY val S Va2 ·

Proof Define IJ.t given in (4.92) to be the distribution at time t of the process with
the appropriate initial product measure. Since a is bounded, and no limiting proper-
ties of IJ.t depend on a(x) for any fixed x, we may assume that a(x)(dfJ)-lx-e l S 1
for all x. Using the elementary bounds

1
1- "E·
~.-
< n(1 - E·) < 1- "E· + -2~'J
.-~.
"E·E·
i i i iOPj

for 0 < Ei < 1, it follows from (4.92) that

(4.111)

By compactness of the set of probability measures on {O, l} Td and by Lemma 4.93,


there is a sequence ten) t 00 so that

(4.112) K = nlim -
1
..... oot(n)
I
0
t (n)
ENtdt

I
and

1 t (n)
(4.113) Va = nlim -
..... oo ten) 0
IJ.tdt

exist. Therefore,
.. IJ.t{A:XEA} . IJ.t{A:XEA}
K hm Illf < Va {A : x E A} < K hm sup .
t ..... oo ENt - - t ..... oo ENt

By (4.100), (4.101), (4.111) and Lemma 4.104,

I1m . fIJ.rf A : x E A} = ----'---_::_


. (df3)lx-ell·1m III d(1 - f32)a(z)
HZ 1->00 ENt (d + 1)(1 - df32)

and
. (df3)lx-ell· IJ.t{A : x E A} d(1 - f32)a(z)
I1m 1m sup =
(d + 1)(1 - df32)
-----~
HZ 1->00 ENt
for a.e. zE aTd. Combining these statements leads to

lim va{A : x E A}(df3)lx- el = K d(1 - f32)a(z)


HZ (d + 1)(1 - df32)

for a.e. z E aTd. The extra constant in front of a(z) can be removed by replacing
the a used in the initial product measure by an appropriate multiple of it. This
gives (4.109). The fact that Va is invariant is a consequence of Theorem B7(t).
4. The Process on the Homogeneous Tree Td 117

To prove (4.11 0), let B C Td be chosen randomly according to Ma, and in-
dependently of the Poisson processes used in the graphical representation of the
contact process. Use additivity and duality «1.2) and (1.7)) to write

MdA : x, YEA} = peA: n B =1= 0, Ai n B =1= 0)


(4.114) = P(UU,VEB{U E A:, v E AiD
::; L Ma{A : u, v E A}P(u E A:, v E Ai).
u,v

The sum of the terms on the right of (4.114) that have U =1= v is at most EM: M r
By Lemma 4.104 and the Schwarz inequality,

Ix-el+lv-el
lim (dfJ)-2-"-limsupEM:M( = O.
x,y-+aTd 1--+00

Therefore, these terms are easy to handle. It remains to show that

(4.115) lim lim sup sup (dfJ)n Lex (u) (dfJ) -tu-et P (u E A: n Ai) = 0,
k-+oo n--+oo tx-et~n,ty-et~n u
tx-yt~k

and in doing so, we may take ex == 1.


In order to show this, write

L(d,6)-lu-e p(u
1 E A; nAn = L(d,6)-lu-e p(x, Y E 1 A~)
u u

::;)" L(dfJ)-I(U) L
u tw-zt=110
[pew r E A~)
(4.116)
+ P(z E A~)]P(x E A~_s)P(y E A;_s)ds

::; )"C(d1 ) L (OO [(dfJ)-I(w) + (dfJ)-I(Z)]


fJ tw-zt=110
P(x E A~)P(y E A;)ds,

where the equality comes from duality, the first inequality comes from Lemma
4.103 together with the fact that leu) ::; lu - el, and the second inequality comes
from Proposition 4.27 and (4.91), since pew E A~) is symmetric in u and w.
It is enough to consider one of the dfJ terms that appears on the right of
(4.116). Break up the sum below according to whether Iy - z I < j or :::: j:
118 Part I. Contact Processes

L (d{3)-I(w) 1 00
P(x E A~)P(y E A;)ds

1
Iw-zl=1 0

:s (d + 1) L(d{3)-I(W) 00
P(x E A~) sup P(u E A.,)ds
w 0 lu-el:::j

+ IW~=I (d{3)-I(w) 1 00
P(x E A~)P(y E A;)ds
Iy-zl<j

:s (d + I)C(~)(d{3)-I(X)
d{3
1 00
sup P(u
lu-el:::j
E As)ds

1
0

+ (d{3) -l(y)+ j (d + 1)2 d j 00


sup P (u E As )ds,
o lu-el:::k- j

where in the second inequality, we have used Proposition 4.27(b) on the first term,
and the triangle inequality Ix - wi :::: k - Iy - wi :::: k - j on the second term.
Without loss of generality, we may take x, y E See), in which case lex) = Ix - el
and ley) = Iy - el. In passing to the limit on k in (4.115), we can first let k -+ 00
and then j -+ 00. Therefore, we will have proved (4.115) if we prove

(4.117) lim
j->OO
1 0
00
sup P(u
lu-el:::j
E As)ds = O.

In order to prove (4.117) it suffices by the dominated convergence theorem to


show that

(4.118) [00 sup P(u E As)ds < 00.


Jo u

To do this, take a p between Ij,J(i and Ij(d{3), and use (4.28), Proposition 4.27(b),
and the fact that an (p) is asymptotic to a constant multiple of (pd)n (which tends
to (0) to conclude that

sup peen EAt) :s C[¢(p)Y


n

for some constant C that is independent of t. By Proposition 4.44(a), ¢(p) < I,


so that (4.118) holds.
For the final statement of the theorem, one can simply take the sequence ten)
in the construction so that (4.113) is satisfied for the two initial product measures
corresponding to £¥I and £¥2.

Proposition 4.119. Consider the Va constructed in Theorem 4.107 for constant


'so Then
£¥

lim
a->oo
Va = v,
the upper invariant measure.
4. The Process on the Homogeneous Tree Td 119

Proof First note that the above limit exists by monotonicity, and is invariant by
Theorem B7(c). For constant a, let fJ";. be the product measure with marginals
M~{A : x E A} = min[a(d,B)-lx-YI, 1].

Then
Y > z
Ma - Ma(dfJ)-IY-ZI'

An analogous property holds for the invariant measures constructed using the
method of Theorem 4.l07 with these two initial measures. Letting a ---+ 00, we
see that lima - Hlo Va is stochastically larger than any "translate" of itself. But that
implies that it is invariant under the automorphisms of Td . Now use Theorem 5.l8
on page 168 of IPS to complete the proof. (This theorem was proved in IPS for
particle systems on Zd, but the proof is the same for particle systems on Td.)

Remark. It is not hard to extend the statement of Theorem 4.l07 to allow un-
bounded functions a on Td, whose boundary limit a(z) is allowed to be infinite
on a set of positive measure. For a z E aTd for which a(z) = 00, (4.l09) is to be
interpreted as meaning
(4.120) lim va{A : x E A} = v{A : YEA}.
x---+z

Simply take such an a, and a sequence an t a. By the final statement of the


theorem, van can be taken to be stochastically increasing, so that Va can be defined
by
Va = lim van'
n---+oo

To check (4.l20), use Proposition 4.119.

Invariant Measures in the Intermediate Regime - Second Construction


The invariant measures we have constructed up to this point have the property
that their marginals tend to zero at aTd at a prescribed exponential rate (d,B)-lx-e l.
This particular exponential rate was chosen because ¢ ((d,B) -I) = 1. Recall (4.91).
Using any smaller exponential rate in this construction would produce only the
invariant measure 80. Using a larger exponential rate would produce the upper
invariant measure v. See Proposition 4.119.
Next, we will construct invariant measures of a somewhat different sort. The
idea is that if the initial state of the process is one branch of Td, then the distribution
at a large time t should be approximately v on that branch, and approximately 80
on the complementary branches. In the resulting invariant measure, the marginals
will not tend to zero at all on the chosen branch, but will tend to zero at a rate
:s ,Blx-e l on the complementary branches. Recall that ,B < (d,B)-1 by (4.90). For
x =f=. e, let
S'(x) = {y E Td : Iy - el = Iy - xl + Ix - ell·

Theorem 4.121. Let B = U~=I S'(xn) c h where the S'(x n) are disjoint. Then
there is an invariant measure VB for the contact process that satisfies
120 Part I. Contact Processes

(4.122) vB{A : x E A} = t--+oo


lim peA; n B =1= 0).

Furthermore, ifx E S'(x n), then

(4.123) viA : x E A} - vB{A : x E A} :s u(lx - xnl),

while if x rf. B, then

(4.124) vB{A : x E A} :s L u(lx - Xn I) :s L fJlx-xnl.


n n

Proof Let !J,t be the distribution of the contact process with initial set B. By
duality (1.7),
!J,t{A : An c =1= 0} = peA; n B =1= 0)
for any finite C C Td . Since the process does not survive strongly, Af eventually
leaves every finite set, and therefore,

lim peA; n B =1= 0) = p(Af n B =1= 0 i.o.) = peA; n B =1= 0 eventually).


t--+oo

Therefore,
VB = lim !J,t
1--+00

exists, and is invariant by Theorem B7(e). To prove (4.123), take x E S'(x n), and
write

ii{A : x E A} - vB{A : x E A} = lim [peA; =1= 0) - peA; n B =1= 0)]


t--+oo

:s P(x n E A; for some t) = u(lx - Xn I).

For (4.124), take x rf. B and write

vB{A : x E A} = lim peA;


1--+00
n B =1= 0)
:s P(Xn E A; for some n, t) :s L u(lx - Xn I)·
n

The final inequality comes from (4.49).

As noted earlier, the B in Theorem 4.121 can be viewed as being a subset


of aTd, and VB depends on B only through this "projection" onto the boundary.
Every closed subset B of aTd is a decreasing intersection of sets Bn C aTd of the
type appearing in Theorem 4.121, and therefore we can define VB for such B by

(4.125) VB = n--+oo
lim VB n •

It mayor may not be the case that VB is not trivial; i.e., =1= 80 . The next result
gives some indication of how large B needs to be for this to be the case.
4. The Process on the Homogeneous Tree Td 121

Theorem 4.126. Take 0 < p < 1, and consider bond percolation on Td in which
bonds are open with probability p. Let B C aTd be the (random) set of points
that are connected to e by an open path. Then with probability one on the event
{B =1= 0},
if P < 1
I dfJ'
1
Iif p> dfJ'

Proof Let Cn be the set of x E Td so that Ix - el = n and x is connected to e by


an open (percolation) path, and Dn = UXECnS'(x). Then

so that

peAt n Cn =1= 0 for some t) S L pn P(x E At for some t)


x:lx-el=n

by (4.49). Therefore if dp{3 < 1,

lim v Dn {A : x E A} = 0
n--+oo
by (4.122), and hence VB = 80 .
For the second case, we will carry out a construction similar to that in the
proof of Proposition 4.57. Let {Bj, j 2: O} be the random sets constructed in that
proof corresponding to a fixed nand t that satisfy (4.60) for an a such that

(4.127) dap> l.

That such a choice can be made follows as before from Lemma 4.53 since d{3p >
l. Recall that Bj C Ajt and the cardinalities IB j I form a Galton-Watson branching
process whose offspring distribution has mean d nan. Therefore IB j n Cjn I is again
a Galton-Watson branching whose offspring distribution has mean d nan pn. This
process is supercritical by (4.127), so its survival probability

q = P(Bj n Cjn =1= 0 V j) > O.

(See Theorem B55.) Then by (4.125),

vD.{A : e
J
E A} = lim peAs
s~oo
n Djn =1= 0)

2: lim P(A kt n Dkn =1= 0)


k--+oo

2: lim P(B k
k--+oo
n Ckn =1= 0) =q > O.

and therefore
vB{A : x E A} 2: q > O.
122 Part I. Contact Processes

The constructions given in Theorems 4.107 and 4.121 can be combined by


carrying out one of them on part of the space and the other on its complement.
Doing so, one obtains an invariant measure for the contact process Va.B associated
with a nonnegative function a on aTd and a closed subset B of {z E aTd : a(z) =
oo} that satisfies (4.l 09) and

An easy way to carry out the combined construction is the following: Suppose
VI and V2 are invariant measures, and let A I and A2 be independent with distri-
butions VI and V2 respectively. Let v be the distribution of Al U A 2, and Vt be the
distribution at time t when the initial distribution is v. As a consequence of the
monotonicity and additivity properties of the contact process, for specific choices
of AI, A2, and i = lor 2,

so that

Choosing A I, A2 random as described above and using the invariance of Vi gives

max (vdA : x E A}, v2{A : x E AJ) : : : vt{A : x E A}


::::: VI {A :x E A} + v2{A :x E A},

and limits J1 of Cesaro averages of V t satisfy the same bounds. Therefore, if

lim vdA : x
x--+z
E A} =0
for some z E aTd for example, then
lim J1{A : x E A}
x~z
= lim v2{A
x~z
: x E A}

for that z.
It is also possible to interpolate between the constructions in Theorems 4.107
and 4.l21. We will indicate briefly what we have in mind, but will not strive for
maximum generality, nor give full details, since we would still not have a charac-
terization of all invariant measures. There are two parameters in the construction:
apE (0, 1] and a connected set B C Td that satisfies the property that

. #{YES'(x)nB:ly-xl=n}
(4.128) hm = a(x)
n-+oo an

exists for every x =1= e, where ap{3 = 1. The limit necessarily satisfies
(4.129) a(x) = a-I L a(y).
YES'(x).ly-xl=1

Also, put aCe) = a-I Lly-el=1 a(y). One way of generating such a B is via the
bond percolation process used in the statement of Theorem 4.126 with P = J.
4. The Process on the Homogeneous Tree Td 123

Let f-L be the product measure with marginals

f-L{A: x E A} = min [plx-e l, IB(x)],


and let f-Lt be the distribution of the process at time t when the initial distribution
is f-L. The analogue of (4.92) is
f-LdA : x E A} = 1- E n
yEA:nB
[1 _ ply-e l],

so it is natural to define
M: = L ply-el.
yEA:nB
The analogue of (4.98) is

EM: = L p(en EAt} L ply-el.


n YEB,ly-xl=n

The analogue of Lemma 4.95 implies that all limits as t --+ 00 of M: k are bounded
above and below by constant multiples of
k-I
la(ek) + LP2j-kaj-k[a(ej) - a-Ia(ej+t>] + p-ka-k[a(e) - a-Ia(el)].
j=!

Strict Monotonicity of f3(A)


For roughly the second half of this section, the operative assumption has been
f3 < 1/ v'd. The natural assumption would have been A < A2. We will now prove
that these are in fact equivalent. Recall that by Theorem 4.65, f3 is a nondecreasing
function of A, and f3 (A2) = 1/ v'd. Thus what we really want is the following result:

Theorem 4.130. The function f3(A) is strictly increasing on [0, A2].

Remark. One consequence of this result is the analogue of the remaining im-
plication of Theorem 4.8(b): If the process does not survive strongly, then
¢(1/v'd) ::: 1. To see this, take A < A2. By Theorems 4.65(b) and 4.130,
f3 < 1/v'd. By Theorem 4.65(e), ¢(1/v'd) < 1. By Proposition 4.33(b), we
can let At A2, to conclude that at A2, ¢(1/v'd) ::: 1. In fact, passing to the limit
in (4.85) gives ¢(1/v'd) = 1 at A = A2.

Proof of Theorem 4.130. The proof uses heavily the graphical representation of the
contact process that is described in Section 1. Recall that it is based on a collection
of rate 1 Poisson processes {Nx, x E Td } that generate recovery symbols, and
rate A Poisson processes {N(x,y), x, y E Td , Ix - yl = I} that generate infection
arrows. To each infection arrow a, associate a Bernoulli random variable ~a with
parameter p. These are to be conditionally independent given the collection of
Poisson processes.
124 Part I. Contact Processes

Consider now a modification of the graphical representation based on {N;, x E


Td} and {N(*x,y )' x, Y E Td, Ix - y I = I}, where the starred Poisson processes are
defined by replacing every infection arrow a in the original graphical representa-
tion such that ~ct = 0 by a recovery symbol placed at the tail of a. Then N; is
a Poisson process of rate 1+ (d + 1)(1 - p)A, while N(:,y) is a Poisson process
of rate Ap. Let A7 be the contact process defined in tenns of the starred graphi-
cal representation. Then, except for a detenninistic time change, it is a standard
contact process with parameter

A* = ____A_p_ _ __
1 + (d + 1)(1 - p)A

When p ranges from 0 to 1, A* ranges from 0 to A. So, a contact process with


any parameter < A can be represented in this way with a p < 1.
The crux of the proof is to use the idea of pivotal arrows. This idea was used
in a crucial way in the proof of Theorem 2.48 - see in particular Lemma 2.45. If
G is an event defined on the original graphical representation, say that an infection
arrow a is pivotal if G occurs, but changing a to a recovery symbol would have
the effect of making G not occur.
Now let G n = {en E At for some t} and G~ = {en E A7 for some t}, so that
u(n) = P(G n ), and u*(n) = P(G~) is the corresponding quantity for the starred
process. Note that by construction, G~ C G n . Furthennore, if G~ is to occur, then
it must be the case that G n occurs, and that all pivotal arrows are retained in the
construction of the starred graphical representation. It follows that

where N n is the number of arrows that are pivotal for G n (in the original graphical
representation). Recalling the definition of f3 in (4.48), and using

we see that in order to prove the theorem, it suffices to show that for some C and
some a < 1,

(4.131)

This is a statement about the original graphical representation, so we may now


forget the starred version of the representation.
We will only sketch the proof of (4.131) - full details can be found in Lalley
(1999). Consider the space time cluster

{(x, t) E Td X [0,00) : x EArl.

Let am be the mth infection arrow whose endpoints lie in this cluster, ordered by
the time coordinate Tm of am. Ifthere are fewer than m such arrows, set Tm = 00. If
Tm < 00, let Xm and Ym denote the tail and head of am respectively. By definition,
5. Notes and References 125

there is an active path from (e,O) to (Xm, i m), and therefore also an active path
from (e, 0) to (Ym, i m). Then am is pivotal for G n if and only if
(a) there is an active path from (xm, i m) to {en} X (0,00) or there is an active
path from (Ym, i m) to {en} X (0,00), and
(b) there is no active path from (Arm \{Xm, Ym}) X {im} to {en} X (0,00).
For fixed n, let Fm be the event that am is pivotal for G n, and let Dm,k be the
event that the shortest paths from Xm and Ym to the tail w of the next pivotal arrow
(or to en if there are no further pivotal arrows) intersect the path eo, el, ... , en
in a segment of length at least k. Using some graph theoretic arguments, one
can show that on Fm n Dm.k. there is an active path from {xm, Ym} X lim} to
{w} X (0,00) whose projection onto Td travels a distance:::: k on eo, el, ... , en,
and does not intersect the active path guaranteed by (a) above, except possibly at
the endpoints. By Theorem B21 applied to the events Fm and Dm,k. and the strong
Markov property applied at the stopping time i m , the conditional distributions
of the lengths of the parts of the path eo, el, ... , en covered between successive
pivotal arrows is dominated by a distribution with exponentially decaying tails.
This gives (4.131).

5. Notes and References

Results from Section 1


Theorem 1.12 is modeled on Theorem 2 of Salzano and Schonmann (1997). Earlier
proofs of complete convergence were based on the lemma in Griffeath (1978).
Interest in the two critical values A1 and A2 originated in Pemantle (1992).
Here are some other results that have been proved for the contact process on
general graphs:

Correlation Inequalities. Let a (A) peAt = 0 for some t) be the extinction


probability. By duality (l.9),

a(A) = v{B : B nA = 0}.

On page 267 of IPS, there is a proof of Harris' result that a satisfies

(5.1) a(A U B) + a(A n B) :::: a(A) + a(B).


Also, a special case of Theorem B 17 is

(5.2) a(A U B) :::: a(A)a(B),

which is part of the statement that v has positive correlations. Belitsky, Ferrari,
Konno and Liggett (1997) proved the following inequality, which generalizes (5.1)
and (5.2):
a(A n B)a(A U B) :::: a(A)a(B).
126 Part I. Contact Processes

Using duality again, this can be viewed as a correlation inequality for the upper
invariant measure v.
Other types of correlation inequalities have been conjectured for special graphs.
For example, if S = Zl, Konno (1994) conjectured that v satisfies

v(l)v(O· . ·0 1 0·· ·0) ::: v(O· . ·0 l)v(l 0·· ·0),

where in the cylinder probabilities above, there are m zeros to the left of the one
and n zeros to the right of the one. Liggett (1994a) proved that if Ac < A < 2,
then the above inequality holds (strictly) for some choices of m, n ::: 1. Some
numerical evidence for the conjecture in case 1 ::: m, n ::: 2 is given in Tretyakov,
Belitsky, Konno and Yamaguchi (1998).

Recurrence vs. Survival. Salzano and Schonmann (1997, 1999) have studied the
contact process on fairly general graphs, and discovered a number of phenomena
that occur in that context but do not occur on Zd or Td . The second lowest extremal
invariant measure Vr that appears in the titles of these papers is defined as follows:
Define the recurrence probability by

f3A = p(x E A~ for a sequence of times t too),


which is independent of x, and let

for finite B. Note that by (1.8), v can be thought of as having been defined in
the same way, but with f3B replaced by the survival probability aBo The measure
Vr is the second lowest extremal invariant measure in the sense that any invariant
measure v that puts no mass on 0 lies above it in the following weak sense:

for all finite B. As mentioned by Salzano and Schonmann (1997), Andjel (private
communication) has proved that Vr ::: v in the stronger sense of (B8). Of course,
it is often the case that Vr = 00 or Vr = V. For the tree Td , for example,

00 = Vr = V if A::: AI,
00 = Vr =I=- v if Al < A ::: A2, and
00 =I=- Vr = V if A> A2.

On more general graphs, Vr can be different from both 00 and V.


One of the questions considered by Salzano and Schonmann is whether certain
properties of the contact process on a graph are monotone in A, and in the graph
itself, in the sense that if they hold on one graph for one value of A, then they hold
for all bigger A'S and bigger graphs. Survival itself is clearly a monotone property,
as can be seen most easily by considering the graphical representation. However,
Salzano and Schonmann show that survival, together with complete convergence
5. Notes and References 127

(1.11) for all finite initial configurations A is not monotone. On the other hand,
recurrence (in the sense that fJA > 0 for all finite A =1= 0) and what they call
partial convergence ((1.11) for finite A with aA replaced by fJA) is a monotone
property.
Their second paper is primarily devoted to the study of continuity properties
of aA and fJA as functions of A.

Results from Section 2


This section is based largely on Bezuidenhout and Grimmett (1990, 1991). Another
exposition of the material leading up to Theorems 2.12 and 2.23 can be found in
Durrett (1991). Extensions to other growth models are given in Bezuidenhout
and Gray (1994). The restart argument used in the proof of Theorem 2.30(a) is
explained in Durrett (1991). If d = 1, the bound Ac .::: 2 (see (1.28)) can be used
in Theorem 2.54 to conclude

lim inf a(A) > .03.


A.p.,. A - Ac -

Grippenberg (1996) gave another lower bound in this case that improves the .03
to .4.
The complete convergence theorem for the contact process on Zd has a long
history. Griffeath (1978) proved it in one dimension for A above the critical value
for the one-sided contact process. Durrett proved it in one dimension for all A > Al
- see page 284 of IPS. Durrett and Griffeath (1982) proved it for all d and
sufficiently large A. Schonmann (1987b) simplified their proof. Andjel (1988)
proved it for a larger class of A'S. The final result, Theorem 2.27, was proved
by Bezuidenhout and Grimmett (1990), though the proof given there is somewhat
different.
Here are some other results that have been proved for the contact process on
Zd:
Chen, Durrett and Liu (1990) gave a necessary and sufficient condition for the
convergence in Theorem 2.27 to be exponentially rapid in the one dimensional
case. Examples of initial distributions that satisfy this condition are homogeneous
product measures and deterministic finite configurations.
Gray (1991) proved a number of monotonicity properties for the one dimen-
sional contact process, including the following:
p(x E A)O))
is a decreasing function of Ix I. Note that even though this might appear to be
obvious, there does not appear to be any simple way to prove it.
Durrett and Schonmann (1988b) proved that the upper invariant measure v
for the one dimensional supercritical contact process has the usual large deviation
behavior:
1 {IAn[l,n]1
lim -logv A : E [a, b]
}= - .
mf ¢(x),
n~oo n n aSxSb
128 Part I. Contact Processes

where ¢ is a nonnegative convex function on [0, 1] that is 0 only at x = v{A :


o E A}. In the same context, Galves, Martinelli and Olivieri (1989) proved that if
x > v{A : 0 E A} and A is finite, then

Tn
. {
= mf t > 0:
IA~n[1,n]l}
n > x

satisfies
Tn
f3n => T,

where T has the unit exponential distribution and {f3n} is an appropriate normal-
izing sequence. For related results in a more general context, see Lebowitz and
Schonmann (1987).

The Shape Theorem. This theorem states that the supercritical contact process A;O)
has an asymptotic shape in the following sense: Let Ht = us<tA1° C Zd and
d (0) Zd -=
K t = {x E Z : At = At }. Define H t = UxEH,C(X) and K t = UXEK,C(X),
-

where C (x) C Rd is the cube centered at x of side length 1. Then there is a


convex set U C Rd so that for any E > 0,
1- 1 - -
(1 - E) U C - H t C (1 + E) U and (1 - E) U c - (H t n K t) C (1 + E) U
t t

eventually a.s. on the event {AjO) *-


0 'V t}. The history of this result is roughly
parallel to that of the complete convergence theorem, with the final technology
needed in the proof on Zd being due to Bezuidenhout and Grimmett (1990). For
details of the proof, see Durrett (1991).

Critical Values. Small improvements have been made in critical value upper
bounds: Liggett (1995b) improved (1.28) to Ac :::: 1.942. The point of this was not
so much that the numerical value is a bit smaller than 2, but rather that the new
bound results from a procedure that in principle can be used to generate succes-
sively better bounds. Stacey (1994) improved (1.29) for d = 2 to A~2) :::: .79.
Durrett (1992) developed another technique for getting upper bounds that ap-
plies in significant generality, which is based on certain computations on small
finite sets. For the one dimensional contact process itself, though, the best result
he gets is Ac :::: 3.95.
Various upper bounds on the survival probability and corresponding lower
bounds for the critical value have been obtained by Katori and Konno, in a series
of papers listed in the Bibliography. Much of this material is treated in Konno's
1997 lecture notes.

Central Limit Theorems. Schonmann (1986a) proved that if d = 1 and A is an


infinite initial configuration, the supercritical contact process satisfies
5. Notes and References 129

for any function f that depends on finitely many coordinates, where =} denotes
convergence in distribution and N (0, (J"2) is the normal distribution with mean
zero and variance (J"2. If f is increasing and not constant, then (J"f > O.

Edge Processes in One Dimension. Consider the supercritical one dimensional con-
tact process whose initial configuration has a rightmost infected site, and infinitely
many infected sites on its left. Let rt be the position of the rightmost infected site
at time t:
rt = max{x : x EAt}.
Galves and Presutti (1987a) proved that properly scaled, rt converges to a nonde-
generate Brownian motion. A simpler proof, presented in the context of oriented
percolation, was given by Kuczek (1989). An extension of his argument to non-
nearest neighbor contact processes in one dimension is provided by Mountford
and Sweet (1999).
According to results in Section 2 of Chapter VI of IPS,
. rt
hm -
t--+oo t
= peA)
a.s., where peA) > 0 in the supercritical case. Galves and Presutti (1987b) proved
that the distribution of the process shifted by p(A)t converges to the symmetric
mixture of the two extremal invariant measures:
1 1
-8 0 + -v.
2 2
The process viewed from rt is defined by shifting by rt units:

Galves and Presutti (1987b) proved that this process has a unique invariant measure
for A > Ac. The existence had been proved earlier in the oriented percolation
setting by Durrett (1984). Andjel, Schinazi and Schonmann (1990) proved that
this invariant measure can be coupled with the upper invariant measure v of
the unmodified contact process in such a way that there are only finitely many
discrepancies to the left of the origin. Galves and Schinazi (1989) proved that the
invariant measure is the limit as n -+ 00 of the invariant measures for a truncated
process that is not allowed to die out or to have cardinality greater than n. Cox,
Durrett and Schinazi (1991) proved the existence and uniqueness of the invariant
measure in the critical case.

Long Range Contact Processes. Consider the contact process on the graph Zd in
which vertices x, yare connected by edges if their Euclidean distance is at most
M. Renormalize the infection parameter so that A is the total infection rate from
a single isolated site, and let A) (M) be its critical value for survival. Bramson,
Durrett and Swindle (1989) proved that

lim Al(M) = 1,
M--+oo
130 Part I. Contact Processes

which is the critical value for the corresponding branching random walk. More
interestingly, they found the asymptotics of the error in this limiting statement:
There are positive constants C I , C2 (depending on d) so that
2 2
CIM-} ::: Al (M) - I ::: C2M-} if d = 1,
(5.3) C I (logM)M- 2 :::
AI(M) - I::: C2 (logM)M- 2 if d = 2,
d
CIM- ::: Al (M) - 1 ::: C2M-d if d 2: 3.

Durrett and Perkins (1999) proved that rescaled long range contact processes con-
verge to super Brownian motion in two and higher dimensions, and as a conse-
quence were able to give sharp constants for the asymptotics in (5.3): Let N be
the number of neighbors of a point. Then

61T log N
Al '" 1 + N

in d = 2, and

in d 2: 3, where Cd is 2- d x the expected number of visits to [-1, 1t of a random


walk whose steps are uniformly chosen from [-1, l]d.
There has been a significant amount of interest in other limits of contact pro-
cesses. One version of the limiting process was obtained by Swindle (1990). More
recently, Mueller and Tribe (1994, 1995) considered the following rescaled ver-
sion of the long range contact process described above: The process evolves on
S = n- 2 Z. Infected sites recover at rate n, and infected sites attempt to infect
a randomly chosen site a distance :s 1/ In at a total rate of n + (). Mueller and
Tribe prove that this family of processes converges as n ~ 00 to a solution of
the stochastic partial differential equation

where W is a space time white noise process on {(t, x) : t > 0, x E R}. They also
show that there is a critical value (}c so that solutions to this SPDE die out (i.e.,
are identically zero in x for some t) with probability one if () < (}c and survive
with positive probability if () > (}c' In this statement, the initial condition u(O, x) is
assumed to be continuous with compact support, nonnegative, but not identically
zero.
Penrose (1996) proved a continuum limit for the threshold contact process on
Zd. (The threshold contact process will be used in Part II as a comparison process
for the threshold voter model.) In this model, recovery occurs at rate 1, and sites
become infected at rate A if there is an infected site within distance M, and zero
otherwise. Let Al (M) be the critical value for survival. The result is that
5. Notes and References 131

as M -+ 00, where f.-Lc is the critical value for a threshold contact process on
Rd. This is analogous to the first Bramson, Durrett and Swindle result described
above. It would be interesting to investigate the analogue of their more refined
result: How does

behave as M -+ oo?

Contact Processes with Stirring. Another way of passing to a limit is to add stirring
(also known as symmetric exclusion - see Part III) at a large rate. Consider the
process whose generator is the sum of the generator of the contact process and D x
the generator of the symmetric nearest neighbor exclusion process on Zd. (D is
the rate at which the values of '1 (x) and '1 Cy) are interchanged if Ix - y I = 1.) This
process is also attractive and self-dual, but as D gets large, it behaves increasingly
like a branching process. The reason is that (for the finite system), the fast stirring
separates particles, so that they are not likely to be close together, and hence are
not likely to affect each other. In fact, Durrett and Neuhauser (1994) proved that

. 1
hm )1.) (D) =-.
D-+oo 2d
The limit is of course the critical value for the associated branching process.
Konno and Sato (1995) obtained explicit lower bounds on the critical value (and
corresponding upper bounds on the survival probability) as a function of D:

ACD» __1_+_C_2d
__-_1_)D
__
I - C2d - 1)(1 + 2dD)

Katori (1994) proved upper bounds on this critical value for d ::: 3. Konno (1995)
proved the following analogue of (5.3) in this context:

I 1 I
CID-'j :::: AICD) - 2d :::: C2 D -'j if d = 1,
1
C I (log D)D- I :::: Al CD) - 2d :::: C2(log D)D- I if d = 2,
1
CI D- i :::: Al CD) - 2d :::: C2D- I if d ::: 3.

Inhomogeneous and Random Environments. This is an area in which a significant


amount of work has been done, but there remain many important open problems.
Suppose that both the infection rates and recovery rates are allowed to be spatially
inhomogeneous, so that the possible transitions are

A -+ A\{x} for x E A at rate 8(x), and


A-+AU{x} forx rtAatrate L A(x,y).
yEA
\y-x\=1
132 Part 1. Contact Processes

In the homogeneous case, survival implies linear growth of At - see the discussion
of shape theorems above. Bramson, Durrett and Schonmann (1991) proved that
this is not necessarily the case for inhomogeneous systems. In their examples,
they take d = 1, A(X, y) == 1, and {8(x), x E Z} to be i.i.d. with a particular
distribution. Madras, Schinazi and Schonmann (1994) give examples in which the
critical process survives, unlike the homogeneous case.

°
Now take 8 == 1 and {A(X, y), Ix - yl = I} i.i.d. In one dimension, Liggett
(1991 a, 1992) showed that At dies out if E log A < and survives if E 2~t 1 < 1.
For d > 1, Klein (1994) proved that there is extinction if

E[ log(l + A) t d)

is sufficiently small, where f3(d) is of order 2d 2 for large d. Andjel (1992) proved
the complementary result that for any f3 < d, survival is possible even if

is arbitrarily small. There is a natural open problem here - what is the correct
power f3, or at least, what is its asymptotic behavior as d ---* oo?
Newman and Volchan (1996) take d = 1, A(X, x-I) == AI, A(X, x + 1) == An
ALAr> 0, and {8(x), x E Z} i.i.d., and prove survival under a condition that is
slightly stronger than
E[ -log8t = 00.
More generally, one can ask what moment assumptions on the transition rates
imply survival or extinction.

Results from Section 3


Theorems 3.3 and 3.9 in one dimension were proved by Durrett and Liu (1988).
Versions of these results for d > 1 were proved by Chen (1994). In one dimension,
Durrett and Schonmann (1988a) improved Theorem 3.9, finding the exact rate of
exponential growth of the extinction time in the supercritical case:

in probability, as N ---* 00. For the analogous problem for d > 1, see the discussion
of metastability below.
Durrett, Schonmann and Tanaka (1989) showed in one dimension that TN
grows polynomially in the critical case:

lim P(aN:::: TN :::: bN 4 ) = 1


N-+oo

for any a, b > 0. It is not known what the correct power is in one dimension. It
has also not been proved yet that the growth of TN is polynomial in the critical
case in higher dimensions.
5. Notes and References 133

Here are some other results that have been proved for the contact process
restricted to cubes in Zd:

Metastability. Metastability refers to the following behavior: A process X t has a


unique invariant measure fJ, 1 to which it converges in distribution as t --+ 00, yet
there is another measure fJ,2 (the metastable state) with the following property:
The distribution of X t remains near fJ,2 for a long period of time T (which has an
approximate exponential distribution), after which it relaxes to fJ,1 rather quickly
(relative to the time scale T). This phenomenon has been extensively studied for
several different models. See Schonmann and Shlosman (1998), for example, for
recent results on the metastability of stochastic Ising models. For contact processes
on finite sets, one expects metastable behavior when the corresponding infinite
system is supercritical. In this case, fJ,1 = 80 and fJ,2 is is the upper invariant
measure v of the infinite system, restricted to the finite set.
Consider the one dimensional supercritical contact process AN,t restricted to
{1, ... ,N}, and let TN be the extinction time:

TN -- . f{t >
III _ 0 .. A{l·
N,t.. · ,N) -- 0} .

Define fiN by P(TN > fiN) = e- 1 • Schonrnann (1985) proved that


(a) TN / fiN converges in distribution to the unit exponential distribution, and
(b) for times only slightly smaller than TN, the distribution of the process is
close to the upper invariant measure v of the unrestricted process on Z.
This had been proved earlier by Cassandro, Galves, Olivieri and Vares for
large values of Ie. An improved proof of the exponential limit law was given in
Durrett and Schonmann (1988a). See also Cox and Greven (1990).
Mountford (1993) used the renormalization procedure of Section 2 to prove
the exponential limit law (a) for the contact process restricted to {1, ... , N}d,
d > 1. In his 1999 paper, he showed that
. log ETN
y = N~oo
I1m
Nd

exists. This limit is positive by Theorem 3.9. Combining these results leads to the
following strengthened form of Theorem 3.9 in all dimensions:

log TN
------;:jd --+ Y+ (Ie )

in probability.
Simonis (1996) proved (b) in this multidimensional setting.

Asymmetric Systems in One Dimension. Asymmetric contact processes were first


studied by Schonrnann (1986b). Consider the contact process on Z in which there
is an asymmetry in the infection rates - the rate for the transition

A --+ AU{x}
134 Part I. Contact Processes

is eAIA (x - I) + (2 -e)AIA (x + I). Think of e as being fixed, while A is varied. It


is easy to see that Al < A2 for many choices of 8. (An interesting open problem is
e
to determine whether this is true for all =f= 1.) For example, the process survives
if A 2: 4 (see Holley and Liggett (1978» but it survives only weakly if eA < I
(since At can be kept to the left of a simple random walk with drift eA - I).
Therefore e < ~ is enough to guarantee Al < A2. For A = A2, Schinazi (1994)
proved that the process restricted to {1, ... , N} satisfies

10grN
~ 2
10gN

in probability. Sweet (1997) then proved the stronger distributional limit theorem:

::::} inf{t: IBtl = c},

for some constant c > 0, where B t is a standard Brownian motion.

Results from Section 4


Theorem 4.1 was proved by Pemantle (1992). Pemantle's paper contains a wealth
of information about the contact process on both homogeneous and inhomogeneous
trees. It is responsible for stimulating the interest and activity in the study of these
models.
Madras and Schinazi (1992) proved Theorem 4.8(d). The proof given here is
based on Liggett (1996b). For related results for branching random walks on more
general sets, see Schinazi (1993).
Lemma 4.26 comes from Stacey (1996). Proposition 4.27(b) was proved for
p = I (for the biased voter model) by Madras and Schinazi (1992), and for general
p by Liggett (1996b). Proposition 4.39 was proved by Morrow, Schinazi and Zhang
(1994). Theorem 4.46 is due to Pemantle (1992) for d > 2 and to Liggett (1996a)
for d = 2. The proof given here is due to Stacey (1996). Proposition 4.50 comes
from Liggett (1996b). Proposition 4.57 is due to Salzano and Schonmann (1998).
Turning to Theorem 4.65(b), the inequality

was conjectured by Liggett (1996b) and proved by Lalley and Sellke (1998).
The proof of Theorem 4.65(b) is given in Lalley (1999). Theorem 4.65(h) is an
improvement of a result in Pemantle (1992) that gives an upper bound for A2 that
is asymptotic to ejJd as d ~ 00. The proof given here is completely different.
Theorem 4.70 was proved by Zhang (1996); the simplified proof given here is
due to Salzano and Schonmann (1998). Theorem 4.71 was proved by Pemantle
(1992) (under the assumption of the then not fully verified fact that AI < A2). The
proof given here is taken from Salzano and Schonmann (1999). Theorem 4.83 is
due to Lalley (1999). Corollary 4.87(a) and (b) was proved by Schonmann (1998),
5. Notes and References 135

though the proof of part (a) given here is different. Theorem 4.130 is due to Lalley
(1999).
Turning to the construction of invariant measures, Theorem 4.107 is an exten-
sion of the construction given in Liggett (1996b). Theorem 4.121 is due to Durrett
and Schinazi (1995), who also proved that these measures are extremal.
Here are some other results that are related to the contact process on Td:

Critical Values. Let Aid) for i = 1, 2 denote the critical values for the contact pro-
cess on Td • Combining Theorems 4.1(a) and 4.8(c) and using the natural coupling
At C {x : {t(x) ::: l}, we see that
_1_ < A(d) < _1_
d+1- I -d-l'
and hence that
(5.4) lim dA;d) = 1.
d-+oo

(For the analogous statement on Zd, see (1.26).) It follows from Theorems 4.1(b)
and 4.65(h) that

(5.5)

Pemantle (1992) gives improved bounds that lead to the replacement of on the !
left side (5.5) with 2 - J2 : : : .5858. This bound could be improved a bit more
by using the results in Liggett (1996a). Note that unlike (5.4), the limit in (5.5)
cannot be the same as the corresponding limit for the branching random walk
process, which is !
by Theorem 4.8(c). It would be interesting to evaluate the
limit in (5.5).
Some numerical work on critical values and critical exponents for the contact
process on T2 have been carried out. For example, Tretyakov and Konno (1995)
give the estimate Al ~ .542.

Critical Exponents. There are constants C I, C2 so that

and
CI(AI - A)-I::: E 1 00
IAtldt ::: C2(AI - A)-I, A < AI.

This was proved to be a consequence of what is known as the triangle condition


by Barsky and Wu (1998). This triangle condition was checked by Wu (1995) for
d ::: 4, and by Schonmann (1998) for d ::: 2.

Growth Profile. Assume Al < A < A2, and let

rt = min
xEA,
Ix - el, Rt = max
XEA,
Ix - el, Nn(t) = #{x EAt: Ix - el = n}.
136 Part I. Contact Processes

Take 0 < Sl :s S2 < 00 to be the smallest and largest solutions of U(s) = ~.


Lalley (1999) proved that

. rt 1 Rt 1.
= -, hm Nn(nsF = dUes)
1
hm - =-, lim -
t-+oo t S2 t-+oo t Sl n-+oo

a.s. on {At =1= '" V t}, provided in the last case that d U (s) > 1.

The Process on a Finite Tree. Stacey (2000) shows that the contact process on
a ball B in Td with A > A2 and Ao = {e} survives for a time that is almost
exponential in the cardinality of B with positive probability.
Liggett (1999) has studied branching random walks on the ball of radius N
on Td . Unlike the contact process, this process survives for large A. Let A~ be the
critical value for this survival. One would expect A~ ~ A2, and in fact it turns
out that

(Recall from Theorem 4.8(d) that A2 = 1/2Jd.) Liggett also gives precise asymp-
totics for the time tN at which the expected number of particles is I when the
initial configuration is ~ == I: For 0 < A < 1/2Jd,

lim
N-->oo
[tN(1-2A-/d)-NIOgd+~IOgN]=C,
2
where C is an explicit function of A and d.

Anisotropic Processes. Heuter (2000) has proved several results analogous to those
discussed in Section 4 for a contact process on T2d+ I, d 2: 1, in which different
infection rates apply in different directions: there are parameters AI, ... , Ad+1 so
that an infected site x with neighbors XI, ... , X2d+2 infects X2i-l, X2i at rate Ai
each.

Branching Random Walk on Galton Watson Trees. In Section 4, we used branching


random walks to suggest results for the contact process on Td. We found that the
behaviour of the two processes is essentially the same, but that the proofs are
much more difficult in the case of the contact process. One might guess that the
situation is similar for reasonable classes of inhomogeneous trees.
Pemantle and Stacey (2000) have shown that there are in fact significant differ-
ences between the behaviour of the two processes if the tree is chosen at random
via a Galton Watson branching process; i.e., the vertex set is the collection of all
individuals ever alive in that process - see the discussion surrounding Theorem
B55. Here are two consequences of their work:
(a) There is a tree of bounded degree so that Al < A2 for the contact process,
but Al = A2 for the branching random walk process.
(b) There is a tree in which every vertex has degree either 3 or 100 so that the
critical values of the contact process on that tree satisfy 0 < Al = A2 < 00.
5. Notes and References l37

A Reversible Version of the Contact Process. A modification of the contact process


on Td was studied by Puha (1999, 2000). Her process is obtained by allowing only
transitions that do not disconnect At. A bonus that results from this modification
is that the resulting process is reversible. For d = 2, Puha uses this reversibility
to prove that AI = A2 = ~, and that the survival probability satisfies

( 41)
C1 A -
l+ffi/2
:s peAt =F 0 V t):s
(
A-
1)5/2 '
4
1
-<A<1
4 - - ,

thus showing that the critical exponent for the survival probability, if it exists, is
between 2.5 and 2.803. Note the contrast with (5.6), which says that this critical
exponent is 1 for the contact process.
Part II. Voter Models

1. Preliminaries

Interest in voter models began at about the same time that people started working
on the contact process - the mid 1970's. As was the case for the contact process,
these models provided a fertile ground for the use of some of the basic tools in the
area of interacting particle systems. In fact, the main reason for their introduction
was not so much a desire to model political systems, as the name might suggest,
but rather the fact that voter models are exactly the class of spin systems to
which duality can be applied most completely and successfully. After applying
duality, one was often led to problems involving sustems of random walks, and
that provided a close link to one of the most active areas of research in probability
of the previous two decades.

Description of the Process


By a voter model, we will mean a Markov process TJI on {O, l}Zd whose generator
has the form (B 1), where the rate function c(x, TJ) has the following properties:
(a) c(x, TJ) = 0 for every x E Zd ifTJ == 0 or ifTJ == 1,
(b) c(x, YJ) = c(x, n for every x E Zd if YJ(Y) + I;(Y) = 1 for all Y E Zd,
(c) c(x, TJ) ::: c(x, n
if TJ ::: I; and TJ(x) = i;(x) = 0, and
(d) c(x, TJ) is invariant under shifts in Zd.

Property (a) implies that the pointmasses 80 and 81 on the constant configurations
YJ == 0 and TJ == 1 are invariant. Property (b) says that the evolution of the system
is not changed by interchanging the roles of 0 and 1, while property (c) makes the
process attractive - i.e., (BI2) and (BI3) hold. Finally, the last property implies
that the process is invariant under spatial shifts.
There are various interpretations that one can give to such a process. Individu-
als placed at the points of Zd might have one of two possible opinions on an issue,
and they change their opinions at random times, based on the opinions of their
neighbors. This interpretation leads to the name voter model. Alternatively, one
could think of Zd as representing territory, each parcel of which is controlled by
one of two competing populations. The transition 0 ~ 1 at x, for example, then
corresponds to control of parcel x changing from population 0 to population 1.
With either interpretation, properties (a)-(d) should appear quite natural.
140 Part II. Voter Models

Clustering and Coexistence


The main question that will interest us in this part of the book is whether coex-
istence of types is possible in equilibrium. The two trivial invariant measures 80
and 81 correspond, of course, to lack of coexistence. We will say that the process
T/t coexists if there exists an invariant measure that is not a mixture of 80 and 81 .
If there is no nontrivial invariant measure, it must be because large clusters of 0' s
and I's develop over time, so that in the limit, only one type is seen. Therefore,
we will say that T/t clusters if

for all x, Y E Zd and all initial configurations T/. A desirable, but currently unre-
alistic, objective would be to classify all possible rate functions c(x, T/) according
to whether the corresponding processes coexist or cluster.

The Linear Voter Model


One class for which this classification is not only possible, but is relatively easy,
is the class of linear voter models. They are the models that are treated in Chapter
V of IPS, and are defined by taking

(1.1) c(x, T/) = L p(x, Y)T/(Y) for T/(x) = 0,


y

where p(., .) are the transition probabilities for an irreducible random walk on Zd.
One way of thinking of this process is that the individual at x E Zd waits a unit
exponential time, then chooses ayE Zd with probability p(x, Y), and adopts the
opinion of that y. The main reason why the analysis is easier in this case, is that
linear voter models satisfy a very useful form of duality:

(1.2)

where At is a process of coalescing random walks. To define At, consider random


walks on Zd with unit exponential holding times and transition probabilities p(., .),
and take them to be independent until two of them meet. At that time, the two
that meet coalesce into one particle, which continues to move like a random walk
with transition probabilities p(., .). Then At C Zd is the set of sites occupied by
these random walks at time t. Note that unlike the contact process dual (which
is again a contact process), At has decreasing cardinality. This property makes a
huge amount of difference in what can be proved, and how hard it is to prove it.
In this part of the book, we will concentrate on nonlinear voter models, for
which (1.2) fails. However, we include statements of some of the main results in
the linear case here, since they will provide a useful context for our discussion
of the nonlinear case. Here is the classification we wanted in the case of linear
voter models - the proof can be found in Section I of Chapter V of IPS. For its
statement, let X t be the symmetrized random walk with unit exponential holding
times and transition probabilities
1. Preliminaries 141

I
2[P(x, y) + p(y, x)].

Theorem 1.3. The linear voter model TJt clusters if X t is recurrent, and coexists
if X t is transient.
In particular,
(a) the process clusters if d = I and

L Ixlp(O, x) < 00,


x

or if d = 2 and
L Ixl2 p(O, x) < 00,
x

and
(b) the process coexists if d ::: 3.

In order to contrast this with the behavior of the threshold voter models that
will be discussed shortly, note that whether the linear voter model clusters or
coexists depends almost exclusively on the dimension of the set of sites, rather
than on the size of the range of interaction.
A lot can be said about the invariant measures in case of coexistence, and
about convergence to invariant measures in both cases. Here are special cases of
some of the results in the first two sections of Chapter V of IPS.

Theorem 1.4. Suppose X t is recurrent, and J-L is any translation invariant proba-
bility measure on {O, l} Z d. Then

J-LS(t)::::} p/h + (1- p)8 0


as t ---+ 00, where => means weak convergence, and p = J-L{TJ: TJ(x) = I}.

Theorem 1.5. Suppose X t is transient. Then the extremal invariant measures for
TJt form a one-parameter family {J-Lp, 0 :s p :s I}, where J-Lp is translation invariant
and ergodic, and J-Lp{TJ : TJ(x) = I} = p. Furthermore, J-Lp has covariances given
by
Cov!-'p (TJ(x), TJ(Y)) = pO - p )pY-x (X t = 0 for some t ::: 0).
If J-L is any translation invariant, ergodic probability measure on {O, I }Zd with

J-L{TJ : TJ(x) = I} = p,
then J-LS(t) ::::} J-Lp as t ---+ 00.

One property of the linear voter model that is quite special is that it has what
is known as a conserved quantity (on average). In this case, this is the statement
that if J-L is translation invariant with density J-L{TJ : TJ(x) = I} = p, then J-LS(t)
satisfies
142 Part II. Voter Models

fJ,S(t){IJ: IJ(x) = I} = P
for all t 2: 0. (Take A to be a singleton in (1.2) and integrate with respect to fJ, to
see this.) It is for this reason that one expects the process to have a one-parameter
family of extremal invariant measures, indexed by the conserved quantity (if there
is coexistence). This will not be true for more general voter models in which there
is no conserved quantity.

The Threshold Voter Model


The nonlinear voter models that we will concentrate on are known as threshold
voter models. To define them, let ~;V. be a neighborhood of °
E Zd that is
obtained by intersecting Zd with any compact, convex, symmetric set in Rd. To
avoid uninteresting cases, we will always assume that d' contains all the unit
vectors (1,0, ... ,0), ... ,(0, ... ,0,1). For a positive integer T, the threshold
voter model with neighborhood Jj/ and threshold T is the one with rate function

if#{y EX+AI: IJ(Y) =1= IJ(x)} 2: T,


(1.6) C(x,IJ) = {~ otherwise.

If T is large, the process will coexist according to our definition for an un-
interesting reason. For example, if d = 1, AI = {-I, 0, I} and T = 2, then the
following configuration is a trap for the process:

···110011001100

In fact, if IJt (x) = IJt (x + 1) for some x and t, then those two coordinates will
never flip again. It is clear from this observation that starting from any initial
configuration, each site will flip only finitely many times. We will say that the
process fixates if it gets trapped in this way, i.e., if each IJt (x) flips only finitely
often for every initial configuration. This concept is not relevant for linear voter
models. In fact, results in Cox and Griffeath (1983) imply that nearest neighbor
voter models on Zd do not fixate.

The Graphical Representation


There is a useful graphical representation for the threshold voter model that has
some similarity to the one we defined for the contact process in Section 1 of Part I.
Let {Nx, x E Zd} be independent rate 1 Poisson processes. Then IJt is defined by
saying that IJt (x) =1= IJt- (x) if and only if t is an event time of N x and

#{y EX + d' : IJt(Y) =1= IJt-(x)} 2: T.

There is a small technical issue that must be resolved in carrying out this
construction. It is important to know that only finitely many previous decisions
are relevant in deciding whether to flip the configuration at site x at time t. In
one dimension, this issue is easily resolved by noting that for any k and t, with
probability one, there are infinitely many positive and negative n's so that N x has
l. Preliminaries 143

had no event times by time t for any n :::; x :::; n + k. This breaks up Z into finite
"islands" that have no influence on each other up to time t.
This idea does not work quite so easily in higher dimensions. If d > 1, one
constructs the process for small times, using a percolation argument to control
these finite islands of influence. The small time restriction comes from the fact
that the percolation parameter must be kept small in order to prevent percolation
from occurring. But once the process has been constructed for 0 :::; t :::; E, it can
be recursively constructed for later times by restarting the procedure at integer
multiples of E.
More explicitly, fix t > 0 and construct a random oriented graph with vertex
set Zd by placing an edge from x to y if Nx has an event time by time t and
y E x + .ff. By a potential path of length n, we will mean a sequence Xo, ... ,Xn
of distinct vertices in Zd so that Xi+! E Xi + JV for each i. There are at most
(1J1/ I - 1) n potential paths of length n starting at x, and the probability that any
one of them is a path in the random graph is (1 - e- t r. Therefore, if t is so small
that (IA/·I- 1)(1 - e- t ) < 1, the expected number of vertices connected to x in
the random graph is finite, so only finitely many sites can influence the evolution
of 1]s(x) up to time t.

Duality when T = 1
The graphical representation given above, unlike that described in Part I for the
contact process, does not lend itself to defining a dual process. There is another
graphical representation for some nonlinear voter models, including the threshold
model with T = 1, that does permit the definition of a dual process via arrow
reversal. The representation is of the type known as cancellative. (The duality for
the contact process that was used in Part I is known as additive. For a treatment of
both types of duality, see Section 4 of Chapter III of IPS.) A general description
of this graphical representation and the corresponding duality in the context of
voter models is provided in Section 2 of Cox and Durrett (1991).
Rather than discuss the graphical representation itself here, we will describe
the dual process for the threshold voter model with T = 1, and explain analytically
why the duality equation holds. Let At be an annihilating branching process that
evolves in the following way. At all times, At is a finite subset of Zd. Each
x E At has a rate 2 exponential clock, and when its clock rings, a subset B is
chosen uniformly from all even subsets of x +JV, and At is replaced by AtfJ.B,
where fJ. denotes the symmetric difference of two sets. In other words, any point
falling on an already occupied site leads to the annihilation of both points. The
duality equation is then

(1.7)

Here 1]t is the threshold voter model with T = 1, and is regarded as a subset of
Zd.
144 Part II. Voter Models

To check (1.7), define

H(A, 1J) = n[1 -


XEA
21J(x)] = {
+1
-1
if
if
IA n 1J1 is even,
IA n 1J1 is odd.

Then (1.7) can be rewritten as

(1.8)

The key property that leads to (1.8) is that the derivative of both sides with respect
to t be the same at t = 0 for all choices of A and 1J. The integration of the resulting
identity to get to (1.8) is described in Section 4 of Chapter III ofIPS.
The derivative of the left side of (1.8) at t = 0 is just the generator of the
process 1JI applied to the function H as a function of its second argument. The
generator is given by (B 1):

(1.9)
x

To compute this expression, note first that

-H(A,1J) if x E A
{
H(A, 1Jx) = +H(A,1J) if x 1: A.

Another important property of the function H is that

(1.10) H(A, 1J)H(B, 1J) = H(A/).B, 1J).

We need to write c(x, 1J) in terms of the H's. To do so, we argue as follows. The
set of all functions that depend on coordinates in a finite set A is a vector space
of dimension 21AI. A basis for this space is the collection {H(B, .), B C A}. The
easiest way to see that they form a basis is to check that they are orthonormal
in L2(V~), and this is an immediate consequence of (1.10). (As usual, v~ is the
product measure with density !.)
Therefore, any function 1 in this vector space
can be written as a linear combination

(1.11) 1(1J) = L a(B)H(B, 1J).


BcA

To evaluate the coefficients, multiply (l.ll) by H (C, 1J) and integrate with respect
to v 1. Using the orthogonality of the H' s, the result is
2

a(C) = f 1(1J)H(C, 1J)dv~.


In particular,
1. Preliminaries 145

c(x, 1]) =l-l(ry=Oorry=1 onx+. V}

=1 - L H(B, 1]) 1
Bcx+. V (('=0 or ('=1 on x+. V}
H(B, r;)dVl
2

1 )1. VH
=1 - (- L H(B,1]).
2 Bcx+. V.B even

Therefore, we can write (1.9) in the following way:

-2 L H(A, 1])[1 - 21. /'1-1 L.'r


XEA Bcx+. .B even
H(B,1])]

= 2 L 21. H L, [H(A~B, 1]) -


1
Y H(A, 1])].
XEA Bcx+. ~"
Beven

But the right side above is the generator of the dual process At applied to H as a
function of its first argument, and hence is the derivative of the right side of (1.7)
at t = 0.
The dual process is often described in somewhat different terms: Instead of
choosing an even subset B of x + ,/1/' and taking the new state to be At~B,
one chooses an odd subset C of x + ./V uniformly, and takes the new state to
be (At \ {x l) ~ C. This is really the same process, since the mapping B --+ C =
B ~ {x} takes even subsets to odd subsets in a one-to-one fashion, and satisfies
A~B = (At \{xl)~C.

Preview of Part II
The next section deals with general threshold voter models. If a threshold voter
model does not fixate, we should expect that the process will coexist for small
threshold and cluster for large threshold - where large and small are interpreted
as being relative to the size of the neighborhood, I.ffl. The reason for this is that
having a small threshold makes it easy for flips to occur, so it is likely that there
will be a lot of both O's and 1's around at all times. In Section 2, we will verify
this by proving the following results:
(a) The process fixates if T > I.V~I-I.
(b) If d = 1 and T = I· V~H , then the process clusters.
(c) If T = elJVI with e sufficiently small and IJVI sufficiently large, then
the process coexists.
Section 3 is devoted to the case T = 1, which is the only situation in which
anything like complete results are available. In this case, we will show that the
process coexists in all cases except d = 1, ./1/' = {-I, 0, I} (in which case it
clusters by (b) above). Note that this is a very different state of affairs than the
one we saw in Theorem 1.3 for linear models.
Throughout the rest of Part II, we will consider only threshold voter models.
We will use Ix I for x E Zd to denote the restriction to Zd of any norm on Rd.
146 Part II. Voter Models

2. Models with General Threshold and Range

In this section, we consider the threshold voter model with neighborhood JV and
threshold T, whose transition rates are given by (1.6). We treat fixation, clustering
and coexistence of the model, in that order. Recall that for a fixed choice of
neighborhood, we expect these to correspond to large T, moderate T and small T
respectively.

Fixation for Large Thresholds


Here is the main result about fixation. Note that it is easy to check the result in
one dimension: If an interval of length T in the configuration is constant at any
time, then no site in that interval will ever flip again. In higher dimensions, we
have to work a bit harder.

Theorem 2.1. If T > 1.1/~I-l, then the process fixates.

Proof For E > 0, define the weight of the configuration I] by


w(l]) = L e-flx+yl.

x,Y:X-yE.V
ry(x)ofory(y)

Note that w(l]) < 00 for all I] E to, l} Z d. We need to see what the effect of a
flip is on the value of w. So, letting I]u be the configuration obtained from I] by
flipping the uth coordinate, write

X.y:x-yE. 'v' x,y:X-yE~V


ryu(x)i% (y).ry(x)=ry(y) ryu(x)=ryu (y).ry(x)ofory(y)
(2.2)
=2 L e-flu+yl.

YEu+,/V,yofou YEU+.,Y
ry(y)=ry(u) ry(y)ofory(u)

If the flip at u can occur, then

#{y E u +./V : I](Y) =1= I](u)} ::: T,

and hence

#{y E u + JV : I](Y) = I](u), Y =1= u} :::: Ih'l - T - 1.

Also, letting R = sup{lxl : x E JV}, we have


21ul- R :::: lu + yl :::: 21ul + R.
Using these inequalities in (2.2), we see that if c(u, 1]) = 1, then
2. Models with General Threshold and Range 147

By assumption, IJY'I- T - 1 < T, so that we can choose E small enough that the
last factor in (2.3) is < O. Since every flip at u decreases w by at least a certain
amount, there can only be finitely many flips at u.

Clustering in One Dimension


Assume now that d = 1 and ./V = {- T, . .. , T}, where T :::: 1. It is easy to
see that 80 and 8\ are the only extremal invariant measures that are translation
invariant. This is a good indication that the process clusters. To see this, let IL be
any probability measure on {O, l}z. Take k > T + 1 and compute

d
dtILS(tHI1 : 11(1) = ... = l1(k) = 1}1t=0
k
(2.4) =L IL{11 : 11(j) = 0, l1(i) = 1 for 1 ::: i ::: k, i =1= j}
j=\

- IL{11 : 11(1 - T) = ... = 11(0) = 0, 11(1) = '" = l1(k) = I}


- IL{11 : 11(1) = ... = l1(k) = 1, l1(k + 1) = ... = l1(k + T) = OJ.

If IL is invariant for the process, the left side of (2.4) is zero, so the right side is
zero as well. If IL is also translation invariant, then the two negative terms on the
right of (2.4) are (in magnitude) ::: the first and last terms in the sum respectively.
Therefore the other terms in the sum must be zero:

(2.5) IL{11 : 11(j) = 0, l1(i) = 1 for 1 ::: i ::: k, i =1= j} = 0, 1 < j < k.

Since IL is invariant, it is not hard to show that the fact that these cylinder
probabilities are zero implies that all other cylinder probabilities in which there is
at least one coordinate set to zero and another coordinate set to one are also zero.
This implies that IL is a convex combination of 80 and 8\. We leave the details for
general T to the reader, since we will prove directly the stronger result that this
process clusters.
Here is how it works for T = 1. Take j = 2, k = 3 in (2.5) to conclude that
IL (101) = 0, where we are using a natural shorthand to denote cylinder proba-
bilities. Therefore, IL puts no mass on configurations with a singleton zero. Since
configurations with a doubleton zero can flip to configurations with a singleton
zero with a positive rate, and since IL is invariant, it follows that IL puts no mass
on configurations with a doubleton zero. Arguing inductively, it follows that all
cylinder probabilities of the form IL(10··· 01) are zero. Since IL is translation
invariant, IL(10000· .. ) = O. To see this, let

An = {11 : l1(n) = 1, 11(j) = 0 V j > n}.

These are disjoint for different n's, and have the same probability by translation
invariance, so IL(An) = 0 for all n. Therefore,

IL(10) = IL(101) + IL(1001) + IL(1000l) + ... = O.


148 Part II. Voter Models

Similarly, !-l(01) = o. Therefore, !-l concentrates on the constant configurations


TJ == 0 and TJ == 1. Now we tum to the proof of the stronger statement that these
models cluster.

Theorem 2.6. The threshold voter model in one dimension with ./V = {- T, ... ,
T}, T 2: 1, clusters.

Remarks. (a) One might guess from this result and Theorem 2.1 that threshold
voter models in higher dimensions cluster if T = I· J~I-I . This is not correct. For
example, take d = 2, T = 2 and f f = {(O, 0), (0, 1), (l, 0), (0, -1), (-1,0)}. If
TJ is constant on alternating vertical infinite strips:

TJ(4i, j) = TJ(4i + 1, j) = 1, TJ(4i + 2, j) = TJ(4i + 3, j) = 0


for all i, j, then no transitions ever occur.
(b) Under the assumptions of this theorem, the process does not fixate. To see
this, consider the initial configuration

···00001111

in which infinitely many zeros are followed by infinitely many ones. Then only the
zero and one at the boundary can flip, so that the configuration will always look the
same, except that the boundary will move like a simple symmetric random walk.
The fact that this random walk is recurrent implies that every site flips infinitely
often.

Proof of Theorem 2.6. The idea of the proof is to construct two sequences of
random times Un, Vn for n 2: 1 with the following properties:
(a) 0 = Vo < UI < VI < U2 < V2 ··· ,
(b) {Uk+ 1 - Vko k 2: O} are i.i.d. with E(Uk+1 - Vk) < 00,
(c) {Vk - Uko k 2: I} are i.i.d. with E(Vk - Uk) = 00,
(d) the random variables in (b) and (c) are independent of each other, and
(e) TJIO is constant on {-T, ... , T} for every t E Ubl[Uko Vk).
Once this construction is made, it will follow from renewal theory that

(2.7) P(TJIO is constant on {-T, ... ,TJ) 2: p(t E Ubl[Uko Vk)) -+ 1

as t -+ 00. (See, for example, Exercise 4.8 in Chapter 3 of Durrett (1996).) It


follows from (2.7) that

lim P(TJI(l)
1-+00
=1= TJt(O)) = 0,
so that the process clusters.
We begin with several general comments about the construction, and then
explain concretely how to carry it out. The U's and V's will be defined in terms of
the Poisson processes N x used in the graphical representation described in Section
2. Models with General Threshold and Range 149

1, together with two additional rate one Poisson processes N +, N _. In fact, the
U's and V's will be stopping times with respect to the associated filtration. Also,
Uk+l will depend only on TJVk and the Poisson processes for times t > Vb and
Vk will depend only on TJUk and the Poisson processes for times t > Uk. This
will guarantee the independence required in (b), (c) and (d). The fact that the U's
and V's are separately identically distributed will follow automatically from the
construction. Since the U's and V's will be defined recursively, we might as well
just explain how a U is constructed starting from a general configuration TJ (which
will be the configuration at the previous time V), and how a V is constructed
starting from a configuration TJ that is constant on {- T, ... , T} (which will be
the configuration at the previous time U). We start with the latter.
So, suppose that TJ is constant on {-T, ... , T}. Without loss of generality,
assume that TJ(x) = 1 for Ixi ::: T. We will define two simple, symmetric random
walks L t , R t so that Lo = -T, Ro = T, and TJt(x) = 1 for all L t ::: x ::: R t up
until

(2.8) V = inf{t 2: 0: L t = -T + 1 or R t = T - I}.

Suppose Rs, s ::: t has been constructed, and R t = x. Then R stays at x until the
first time that one of the following happens:
1. There is an event time in N x ; at that time R moves one step to the left.
2. There is an event time in Nx+l and TJ(x + 1) = 0 at that time; at that time R
moves one step to the right.
3. There is an event time in N+ and TJ(x + 1) = 1 at that time; at that time R
moves one step to the right.
L t is constructed in a similar (reflected) way, using N_ instead of N+. Note that
L t , R t defined in this way have the property required above: TJt(x) = 1 for all
L t ::: x ::: Rt up until the time V defined in (2.8). Now, V is the minimum of two
independent random variables, each of which has the distribution of the hitting
time of {O} for a simple symmetric random walk on Z starting at 1. That hitting
time has tail probabilities of order Ct-~ by the reflection principle. (See Section
3.3 of Durrett (1996), for example.) Therefore

p(V 2: t) '" C't- 1 ,


and so E V = CXl as required.
Next, we will explain the construction of U starting from an arbitrary con-
figuration TJ. The first step is to argue that for any TJ E {O, 1} z, there is a finite
sequence Ul, ... , Urn E {-T, ... , T} so that if the configurations TJi are defined
recursively by
o 10m m-l
TJ = TJ, TJ = TJ u1 ' .•• , TJ = TJ um '
then C(Ui+l, TJi) = 1 for each i, and TJm is constant on {-T, ... , T}. Recall that
subscripts on the TJ' s above are the sites at which the spin is flipped.
The argument is by induction on the number n of maximal subintervals of
{ - T, . .. , T} on which TJ is constant. If n = 1, then TJ itself is constant on
150 Part II. Voter Models

{ - T, . .. , T}, so the result is immediate. Suppose now that n > 1, and the
result has been proved for all configurations with fewer intervals of constancy.
Let {j, ... ,k - I} and {k, ... ,I - I} be two consecutive maximal intervals
of constancy of 17 on {- T, ... , T}. Without loss of generality, assume that
17(k - 1) = 1, 17(k) = O. Then

k-I+T k+T
(2.9)
L [1-17(i)]+ L 17(i)=I-17(k-l-T)+17(k+T)+2T~2T.
i=k-I-T i=k-T

Therefore, at least one of the sums on the left of (2.9) is ~ T. If the first sum is
~ T, then c(k - 1,17) = 1, while if the second sum is ~ T, then c(k, 17) = 1. In
the first case, the site U at which we will flip is taken to be k - 1; in the second
case, it is taken to be k. If both sums are ~ T, either choice can be made.
To be specific, suppose that it is the first sum on the left of (2.9) that is ~ T,
so that U = k - 1. Then

k-2+T k-I+T
L [I-17k-l(i)] = I-17(k-2-T)+17(k-I+T)+ L [1-17(i)]~T,
i=k-2-T i=k-I-T

so that site k - 2 can be flipped next. Continuing in this way, we flip sites until site
j is flipped. At that point the number of intervals of constancy has been reduce
to n - 1, so that the induction hypothesis can be applied.
Let UI(17), U2(17), ... ,U m (ry)(17) E {-T, ... ,T} be the sequence constructed
above. It has the property that the successive application of flips at these sites
makes the configuration constant on {- T, ... , T}, and that each of the flips will
occur if there is an event time in the appropriate Poisson process. Note that the se-
quence of sites that are flipped to achieve a constant configuration on {- T, ... , T}
depends on 17 only through {17(X), Ixl :::: 2T}. Therefore, m = maxry m(17) < 00.
Extend the sequence Ui(17) to i :::: m by setting Ui(17) = 0 for m(17) < i :::: m. This
choice has the following property: For the process starting with configuration 17,
if the first m event times among the Poisson processes {Nx, Ix I :::: 2T} occur at
UI (17), U2 (17), ... , Um (17) in that order, then at the last of these event times t, 17t
will be constant on {- T, ... , T}.
Now partition the time axis into intervals of length 1. Let Ak be the event that
the only event times for {Nx, Ixl :::: 2T} in the time interval [k, k + 1) occur at

where 17 is the configuration of the process at time k, and that they occur in that
order. These are independent and have the same positive probability, so that

U = min{k ~ 0 : Ak occurs} + 1

is a stopping time with finite mean, and 17t is constant on {- T, . .. , T} at that


time. This is the U that we needed.
2. Models with General Threshold and Range 151

Coexistence; the Threshold Contact Process


Most proofs of coexistence for threshold voter models are based on comparisons
with a hybrid model known as the threshold contact process with parameter A > O.
This is the process on {O, l}Zd with flip rates

if 1) (x) = 0 and #{y EX +JV': 1)(Y) = 1}::: T,


(2.10) c(q) ={~ if1)(x)=l, and
otherwise.

Unlike the standard contact process that is the subject of Part I, the threshold
contact process is not self-dual. Therefore it is no longer immediate that survival is
equivalent to the existence of a nontrivial invariant measure. (See (l.8) of Part I.)
So, we will use the phrase has a nontrivial invariant measure instead of survives
in this context. Here is the basic comparison that will be used for the remainder
of this section, and in Section 3:

Proposition 2.11. For any d, JV' and T, if the threshold contact process with
A = 1 has a nontrivial invariant measure, then the threshold voter model coexists.

Proof The proof relies on a comparison of these two processes with a third one
- a process in which each site flips independently at rate 1. This is the very
simple spin system with c(x, 1) == 1. We will use the superscripts v, c, i to denote
quantities corresponding to the threshold voter model, threshold contact process
with A = 1, and the independent flips process respectively. Let v be the upper
invariant measure for the threshold contact process, and v 1 be the product measure
!.
2
with density
The transition rates for the three processes satisfy the following inequalities:

and
CV(X, 1) :s CC(c, 1) = c i (x, 1) if 1) (x) = 1.

This means that the processes can be coupled so that they satisfy

(2.12)

for all t ::: 0 if these inequalities are satisfied initially. This coupling is analogous
to that used in our discussion of attractiveness - see (B 14).
By the convergence theorem for finite state Markov chains,

(2.13)

as t -+ 00. By (2.12),

(2.14)
152 Part II. Voter Models

Combining (2.13) and the first part of (2.14), we see that v :s v1. Combining this
with the second part of (2.14) and the fact that 117 is attractive (see (B13)) gives

Therefore all Cesaro averages of v 1 sv (t) are stochastically larger than v. Any
weak limit JL of Cesaro averages of v 1 sv (t) is invariant for the threshold voter
2
model by Theorem B7(f), and is stochastically larger than v. But v concentrates on
configurations with infinitely many ones. This statement is a consequence of (1.5)
of Part I for the standard contact process - the proof for the threshold contact
process is identical. Therefore, JL concentrates on configurations with infinitely
many ones. But JL is unchanged by interchanging the roles of zeros and ones,
since both the initial distribution and the transition mechanism for the threshold
voter model have that symmetry. Therefore, JL concentrates on configurations with
infinitely many zeros as well, and so it is nontrivial as required.

The next result explains why it is easier to deal with the threshold contact
process than the threshold voter model. The analogous result for the voter model
(with coexistence replacing the existence of a nontrivial invariant measure) may
well be true, but it certainly does not follow from the simple argument that works
for the contact process.

Proposition 2.15. Suppose 17: is the threshold contact process on Zd 1 with neigh-
borhood Jh: threshold TI and parameter AI, and 11; is the threshold contact pro-
cess on Z d2 with neighborhood .A2: threshold T2 and parameter A2. Assume that
dl ::: d2 ,

(XI, ... ,Xdl) E Jh. implies (XI, ... ,Xdl' 0, ... ,0) E A"2·,
TI 2: T2, and AI:S A2.

If 11: has a nontrivial invariant measure, then so does 11;.

Proof All that is required is to couple the two processes so that they are both
== I at time zero, and 11: (XI, ... ,Xdl) :s 11; (XI , ... ,Xdl' 0, .. ,0) for all t and
all (XI, ... ,Xd1 ) E Zd 1 • This is easy to do, using the type of coupling that is
discussed following (B 14). In order to carry out the construction, use the fact that
the transition rates for the two processes satisfy

(2.16)

if 111 (XI, ... ,Xdl) = 112 (XI , ... ,Xdl' 0, ... ,0) = 1 and
CI ((XI, ... , Xd 1 ), 111) :s C2((XI, ... , Xdl' 0, ... ,0),112)

if 111 (XI, ... , Xd 1 ) = 112 (XI , ... , Xdl' 0, ... ,0) = o. The reader should be able to
write down the coupling explicitly.
2. Models with General Threshold and Range 153

Remarks. In Proposition 2.15, if we were considering threshold voter models


instead of threshold contact processes, the equality in (2.16) would be replaced
by the inequality .:s, and that is the wrong inequality to make the coupling work.
See again (BI4). In any case, if there is a nontrivial invariant measure for the
threshold voter model, it can be chosen (by 0 # I symmetry) to have density
!, so one does not expect to be able to compare them for different choices of
A/~ T in any simple way. Certainly they cannot be stochastically ordered without
being equal - see the application following Theorem B9. It would be interesting
to know whether a correlation type-comparison can be made for these measures.
Does increasing ,/V or decreasing T lead to lower correlations for the nontrivial
invariant measure? To see why this might be the case, recall that by Theorem
B 17, the limit (if it exists) of v IS (t) has positive correlations for any choice of
2
./V and T. On the other hand, formally letting T = 0 or jV = Zd results in the
independent flip process whose unique invariant measure in v I, and this measure
has uncorrelated coordinates. 2

The Threshold Contact Process with Large Range


We are now ready to show that the threshold contact process with).. = I has a
nontrivial invariant measure if the threshold T is sufficiently small compared to
the size of the neighborhood A< Application of Proposition 2.11 will then give
analogous coexistence results for the threshold voter model.
The idea behind the next theorem and its proof is the following. If T is
small compared to the neighborhood, then while there are T ones in a certain
region, there will be many sites x (i.e., those whose neighborhoods contain that
region) that evolve like two state Markov chains with rate I for each transition
I] (x) ----+ I - I] (x). The distribution of the configuration for those spins will remain
close to v I. But that means that there will be substantially many ones again. So,
2
the limiting distribution should not be too far from v I. 2
The formalization of this
argument is based on a comparison with oriented percolation.

Theorem 2.17. There is a c > 0 with the following property. If a sequence of


threshold contact processes 1]7 with).. = I, thresholds Tn and neighborhoods ./1-;;' =
{x E Zd : Ix I .:s n} satisfies

. Tn
(2.18) hm sup ;//' < c,
n-+oo 1./ r n I
then 1]7 has a nontrivial invariant measure for all sufficiently large n.

Proof The proof is based on a comparison with oriented percolation. To simplify


the notation, we will omit the index n. Take the integer L to be an appropriate
fraction of n so that for every x E {-3L, ... ,3L}d,

{-L, ... ,L}d cx+.ff.


154 Part II. Voter Models

Then while the number of ones in {- L, . " , L}d is at least T, every site in
{-3L, ... ,3L}d evolves like a two state Markov chain YI with rate I for each
of the transitions 0 ~ I and 1 ~ O. Therefore, thinking about the evolution of
the number of ones in a box of side length 2L + 1, it will be useful to make a
comparison with the Markov chain XI on the nonnegative integers with transitions

k~ k + 1 at rate (2L + I)d - k


k ~ k - 1 at rate k.

Since pI (YI = 0) = pO(Yt = 1) = HI -


e- 2t ), it is clear that if Xo = k, then
Xt is distributed as the sum of two independent binomials, one with parameters k
and HI + e- 2t ), and the other with parameters (2L + I)d - k and !(1- e- 2t ).
Write T = a(2L + I)d with a < !
(which is possible if the c in (2.18)
is sufficiently small and n is sufficiently large), and choose fJ E (a, so that !)
fJ(2L + l)d is an integer. Then if t is large enough that HI -
e- ) > fJ, the law
2t

of large numbers gives

(2.19) lim pk(X t :::: fJ(2L + I)d) = 1


L--->oo

for k = O. By coupling copies of XI starting at 0 and k respectively, we see that


pk(X t :::: I) :::: pO(Xt :::: I)
for every k and I, so that (2.19) holds uniformly in k. The coupling is simply the
one in which the two processes move independently until they hit each other (if
they ever do), after which time they move together. This keeps the process that
started at 0 to the left of the one that started at k.
As L ~ 00, if we take the initial state Xo to be asymptotic to z(2L + I)d for
some Z E [0, 1], the law of large numbers implies that Xs/(2L + I)d converges to
the following deterministic process Zs on [0, 1]:
1 2z - 1 2s
Z
s
=-+--e-
2 2
.
So if we let
T = inf{s :::: 0 : Xs ::::: a(2L + I)d},
it follows that

(2.20) lim pk (T > t) = I


L--->oo

for k = fJ (2L + I)d and for any t, and in particular for the t that satisfies (2.19),
which we now fix. By the coupling argument used above, (2.20) holds uniformly
for k :::: fJ(2L + I)d.
We are ready to make the comparison with oriented percolation. By (2.19) and
(2.20), by making L sufficiently large, we can guarantee that if

L rJo(x) :::: fJ(2L + l)d,


XE{-L, ... ,L)d
3. Models with Threshold = 1 155

then there is arbitrarily large probability that

L TJt(X) 2: {J(2L + l)d


xE{-3L •...• -L} x {-L •...• Ljd-l

and
L TJt(X) 2: (J(2L + \)d.
XE{L •... ,3Ljx{-L, ... ,Ljd-l

So, we can use Theorems B24(a) and 826 to conclude that if TJo == 1, then

i~fP( L d TJ2kt(X) 2: (J(2L + l)d) > 0,


XE{-L, ... ,Lj

and hence the upper invariant measure for the threshold contact process is non-
trivial.

The Threshold Voter Model with Large Range


Finally, we have the application to the coexistence of threshold voter models.

Corollary 2.21. There is a c > 0 with the following property. If a sequence of


threshold voter models TJ7 with thresholds Tn and neighborhoods .A{ = {x E Zd :
Ix I ::::: n} satisfies
. Tn
hmsup 4/1 < c,
n-+oo IJPn
then TJ7 coexists for all sufficiently large n.

Proof By Theorem 2.17, the threshold contact process has a nontrivial invariant
measure for large n. Now apply Proposition 2.11.

3. Models with Threshold = 1

In this section, we take the threshold T = 1. This case is of particular interest


because it is the only case in which we currently know exactly which models
coexist and which models cluster. By Theorem 2.6, the model with d = 1 and
J//' = {-I, 0, I} clusters. We will see that for all other choices of d and .IV,
the model coexists. By Propositions 2.11 and 2.15, in order to prove this, it is
sufficient to show that the threshold contact process with A. = 1 has a nontrivial
invariant measure in the following two cases:

(3.1) d = 1, J//' = {-2, -1,0, +1, +2},


and

(3.2) d = 2, JV = {(-I, 0), (0, -1), (0,0), (1,0), (0, I)}.


156 Part II. Voter Models

The first case is used as a comparison process for all one dimensional models
other than the one we know clusters, and the second for all models in two or more
dimensions. As we will see shortly, it is fairly easy to show that the second case
reduces to the first. Most of the work will be required to prove the result in case
(3.1). The proof in that case is a significantly more elaborate version of the proof
of (1.28) of Part I.
Since the proof in case (3.1) is fairly long and difficult, we will wann up by
showing that the threshold contact process with A = 1 has a nontrivial invariant
measure if the range of interaction is somewhat larger:

d = 1, ff = {-7, -6, ... , +6, +7}.


Let TI; be this process, and let TIl be the nearest neighbor threshold contact process
on Zl with A = 4. Note that the latter process has a nontrivial invariant measure,
since it can be coupled to lie above the basic contact process on Zl with A = 2,
and this process survives by (1.28) of Part I. The fact that makes this coupling
possible is that the infection rate in the threshold process is always 4, provided
that there is at least one infected neighbor, while the basic contact process has
infection rates 2 or 4, depending on whether one neighbor is infected or both
are.
Now, let h = {4k, 4k + 1, 4k + 2, 4k + 3} for k E Zl. Then TIl and TI; can be
coupled so that they maintain the following relationship:

TI/ (k) = 1 implies TI; (j) = 1 for some j E h.


Coupling the recovery transitions is easy: If Tll{k) = 1, then TI;U) = 1 for some
j E h, so we can couple the recovery at k in TIl with a recovery at any of the
infected sites in h in the process TI;. To couple the infection transitions, note that

h-I U h+1 C j +./1/'

for j E h. Suppose Tll{k) = 0 and TI; == 0 on h. IfTlt{k-l) = 1 or Tlt{k+ 1) = 1,


then TI; ¢. 0 on h-I U h+l, so there is an infection rate of 1 at each point of h,
which gives a total infection rate of 4 for the entire interval h. Thus the infection
at k in TIl can be coupled with an infection at some site in h. Since TIl has a
nontrivial invariant measure, it follows that TI; does as well, and this completes
the proof.
Of course, this proof is easy only because we have used the relatively hard
fact that the basic contact process on Z 1 with A = 2 survives. The rest of this
section is patterned after the original proof of that fact. It is quite a bit harder
because the process we must now consider is not nearest neighbor.

Duality for the Threshold Contact Process, T = 1


At several points in this section, we will need to use duality for the threshold
contact process TIt with T = 1, so we begin with that topic. In order to guess what
the dual process is (and whether there is one at all), we perfonn the following
3. Models with Threshold = 1 157

computation, where Set) is the semigroup for the threshold contact process with
T = 1, and A is a finite subset of Zd:

~J-tS(t){1) : 1) = 0 on A}I
dt t=O

= LJ-t{1J : 1)(x) = 1,1) = 0 on A\{x}}


XEA

(3.3) - AL J-t{1J : 1) = 0 on A, 1) ¢. 0 on x + A/'}


XEA

XEA XEA

- (1 + A) [A [J-t {1J : 1) = 0 on A}.


This computation is the analogue of the one that led to (1.8), the duality statement
for the threshold voter model. The difference is primarily that we now use the
duality function R'(A, 1) = 1(ry=o on Aj, which is appropriate for additive duality,
rather than R(A, 1), which is appropriate for cancellative duality. For more on
this, see the discussion in Section 4 of Chapter III of IPS.
The right side of (3.3) can be viewed as the result of applying the generator
of a certain Markov chain to the function

f(A)= J-t{1J: 1) = 0 on A}.


This is the Markov chain with transitions
A -+ A\{x} at rate I, and
(3.4)
A -+ A U (x + JV) at rate A
for every x E A. Let At denote this Markov chain. Writing the analog of (3.3) for
general t and integrating up gives the duality relation

(3.5) J-tS(t){1) : 1) = 0 on A} = L pA(A t = C)J-t{1) : 1) = 0 on C}.


c
For technical details on this, see Section 3 of Chapter II of IPS. Note that the
computation (3.3) does not work for T > l. This is one of the places that we use
T = 1 in an essential way.
This duality provides an important connection between survival of one process
and existence of a nontrivial invariant measure for the other. As in Part I, we will
say that At survives if
pA(A t =F 0 for all t) > 0
for all A =F 0.

Proposition 3.6. At survives if and only if 1)t has a nontrivial invariant measure.

Proof Take J-t = 8\, the pointmass on 1) == I, in (3.5) to obtain


8\S(t){1) : 1) = 0 on A} = pA(A t = 0).
158 Part II. Voter Models

Pass to the limit in t to conclude that the upper invariant measure v for the
threshold contact process satisfies

v{7J: 7J = 0 on A} = pA(A t = 0 for some t).


Since At survives if and only if the right side is less than 1 (for A =1= 0), and v is
nontrivial if and only if the left side is less than one, the result follows.

Reduction to One Dimension


We are now in a position to reduce our problem to case (3.1). For the first result
A is arbitrary.

Proposition 3.7. Ifthe threshold contact process has a nontrivial invariant measure
in case (3.1), then it has a nontrivial invariant measure in case (3.2).

Proof Let At be the dual process with transition rates (3.4) in case (3.1), and Bt
be the dual process in case (3.2). By Proposition 3.6, we can equally well prove
that if At survives, then Bt survives. The proof is based on a coupling of the
processes At and Bt . To describe this coupling, define a mapping rr : Z2 --+ Zl
by rr (i, j) = i + 2 j. Here is a picture that shows this mapping, putting the value
rr(i, j) on the point (i, j):

2 3 4 5 6

o 2 3 4

-2 -1 o 2

-4 -3 -2 -1 o
-6 -5 -4 -3 -2
The point is that the four neighbors of an (i, j) E Z2 that has rr(i, j) = k have
corresponding rr values k - 2, k - 1, k + 1, k + 2, which are the four neighbors of
k E Zl. In other words, rr respects the neighborhood structure. Define a relation
A :::: B for A C ZI, B C Z2 by saying A :::: B if and only if

(3.8) x E A implies :3 Y E B such that rr(y) = x.


In the coupling, we will take Ao = {OJ and Bo = {CO, O)}, and try to preserve the
relation At :::: Bt for later times. This is easy to do. Suppose that At :::: Bt at a given
time t. If At = {XI, ... , xn}, then there are YI, ... , Yn E B t so that rr(Yi) = Xi
for 1 :::: i :::: n. Couple together the transitions from (3.4) at the corresponding
paired points, letting the other points in B t evolve independently. To check that
the relation holds after a transition, it suffices to note that if A :::: B, X E A, Y E
B, rr(y) = x, then
3. Models with Threshold = I 159

A\{x}:sB\{y} and AU(x+AD:sBU(y+A'2"),

where JK and J1;2' are the neighborhoods in the cases (3.1) and (3.2) respectively.
Since A :s B, A =1= 0 implies B =1= 0, the result follows.

The Convolution Equation


From now on, we can restrict ourselves to the case (3.1). For simplicity, we will
take A = 1, since this is the only value that is relevant in the application of
Proposition 2.11. We then need to show that the threshold contact process has a
nontrivial invariant measure in this case. The remainder of this section is devoted
to proving this.
From a purely algebraic point of view, one can think of a probability measure
fJ.. on {O, l}ZI as a collection of numbers fJ..{rJ : rJ = 0 on A} indexed by the
finite subsets A of Z \. These numbers must satisfy a lot of inequalities - the ones
needed so that all cylinder probabilities tum out to be nonnegative. The measure
will be invariant if the right side of (3.3) is zero, i.e., if these numbers satisfy a lot
of identities. This is an infinite dimensional linear programming problem, though
giving this problem a name does not necessarily make it easier to solve.
Let's look at a problem that is presumably easier, and that will tum out to
play an important role in solving the real problem. Suppose we require that fJ.. be
a stationary renewal measure. This means that there is a probability density f (.)
on {I, 2, ... } with finite mean so that

fJ..{rJ : rJ(O) = 1, rJ(1) = ... = rJ(k\ - 1) = 0, rJ(k\) = 1,


rJ(k\ + 1) = ... = rJ(k\ +k2 -1) = 0, rJ(k l +k2) = 1,
rJ(k l + k2 + 1) = 0, ... , rJ(k l + ... + kn - 1) = 0,
rJ(k l + ... +kn ) = I}
f(k 1 )··· f(k n )
Lj if(j)
for all choices of n ::: 1 and k\ ::: 1, ... , kn ::: 1. It is often useful to write
expressions involving f in terms of the corresponding tail probabilities

=L
00

F(n) f(k).
k=n

Then
F(n)
fJ..(1000· ··000) = fJ..{rJ: rJ(O) = 1, rJ(1) = ... = rJ(n -1) = O} = L .
j:o:l F(j)

This choice of a renewal measure is motivated by the fact that a particular


renewal measure played a key role in the proof of (1.28) of Part I. In that context
(the nearest neighbor basic contact process in one dimension), the motivation
was the following: The measure used in the proof has to be fairly simple, so
160 Part II. Voter Models

that explicit computations can be carried out. Product measures do not work, and
renewal measures are among the simplest measures on {O, I}ZI that need not be
product measures. Furthermore, the contact process is a nearest particle system
(see Chapter VII of IPS for more on this topic), and reversible nearest particle
systems have renewal measures as invariant measures.
We certainly do not expect such a special /1 to be invariant for the process.
But, we might try to find one that satisfies some of the equations: RHS of (3.3) =
O. For example, since we now have a one (discrete) parameter family of unknowns
U(k), k :::: I}, and therefore expect to be able to satisfy a one parameter family
of equations, we can try to choose the fen), n :::: I, so that the right side of (3.3)
is zero whenever A is an interval {I, ... ,n}, n :::: 1. The hope is that the resulting
measure /1 (if it exists) will tum out to make /1S(t) increase over time in some
sense, and therefore will make this have a nontrivial limit as t ---+ 00. This limit
would be the required nontrivial invariant measure. Proposition 3.44 below states
this more precisely.
Here are the equations one obtains by setting the right side of (3.3) equal to
zero when A is an interval of length n (after dividing by the mean of f):
n = 1: 1 = F(2) + F(3) + F(4) + F(5),
n = 2: F(2) = F(3) + F(4) + F(5),
n = 3: 2F(3) + F2(2) = 3F(4) + 3F(5),

and

L F(k)F(n -
n
(3.9) k + 1) = 4F(n + 1) + 2F(n + 2), n:::: 4.
k=l

To derive the first one, for example, take A = {OJ and use our shorthand for
cylinder probabilities to express the right side of (3.3) as

1 + /1(00000) - 2/1(0) = /1(1) - [/1(0) - /1(00000)]


= /1(1) - [/1(10) + /1(100) + /1(1000) + /1(10000)].
Note that (3.9) is a convolution equation. The equations for n = 1,2, 3, to-
gether with the fact that f is a probability density, can be rewritten as

1 1
(3.10) F(1) = 1, F(2) = ~, F(3) = 4' F(4) + F(5) = -.
4
We embark now on a somewhat lengthy analysis of (3.9) and (3.10), and of
properties of the solution. This effort would certainly not be justified if it led only
to a solution of the easier problem discussed above, in which we replaced the
full collection RHS of (3.3) = 0 for all finite A by a one parameter subfamily of
equations. It will in fact be crucial to the solution of the real problem.

Proposition 3.11. There is a unique bounded solution to the equations (3.9),


(3.10). For this choice, F(4) = .1497729 ...
3. Models with Threshold = I 161

Proof Without assuming boundedness, these equations do not have a unique


solution. In fact, letting f3 = F(4), they can clearly be solved explicitly for
F(n), n 2: 5, in terms of f3. What we must show is that there is a unique choice
of f3 so that the resulting solution is bounded. So, take any value for f3, and let
F(n),n 2: 1, be the corresponding solution of (3.9) and (3.10).
The fact that the left side of (3.9) is a convolution suggests that we use the
generating function

n=l

Multiplying (3.9) by x n + 1 and summing for n 2: 4 gives


2
L L F(n)x n + - L F(n)xn.
00 00

F(I)F(m)x l +m =4
l.m::o:l:l+m::o:5 n=5 X n=6

Using (3.10), this can be written as

4/(x) - (4 + ~ )¢(x) + [2 + 5x + ~X2 + 2f3x 3+ (2f3 - ~ )X4] = O.


The solution to this quadratic that is bounded near the origin is

1 + 2x - Jp(x)
¢(x) = ,
x
where P is the polynomial

P(x) = 1 + 4x + 2x 2 - 5x 3 - 3
2x 4 - 2f3x 5 - ( 2f3 - 41) x 6 .
The radius of convergence of ¢ (x) is the the magnitude of the (complex) zero of
P of odd multiplicity that is closest to the origin, since J (z - a)n is analytic in
the whole complex plane if n is even, but has a singularity at z = a if n is odd.
Therefore, we need to show that there is a unique choice of f3 so that P has no
zeros of odd multiplicity in the unit disk of the complex plane.
Note that
3
P(l) =- - 4f3 and
4
so that P has a root of odd multiplicity in (-1, 1) if
3 9
f3 > - or f3 < - -
16 8'
Therefore, we may assume

(3.12)

Make a change of variables in P, writing


162 Part II. Voter Models

P(u)
u-
= P ( -2- 1) .

Then

256P(u) =(9 + 8f3) - (134 + 32f3)u + (479 + 40f3)u 2 - 84u 3


- (9 + 40f3)u 4 - (6 - 32f3)u 5 - (8f3 - 1)u 6 •

So, if u is complex and lu 1 = 1,

1256P(u) - (479 + 40f3)u21 ::: 243 + 1201f31


< 479 - 401f31 ::: 1(479 + 40f3)u 21,

where the middle inequality follows from (3.12). Therefore, by Rouche's Theorem
(see Chapter 10 of Rudin (1966), for example),

P(u) and (479 + 40f3)u 2

have the same number of zeros in the unit disk. So, P(u) has exactly two zeros
in the unit disk, and hence P (x) has exactly two zeros in the disk If Ix - ! I ::: !.
these two zeros are simple, then F(n) is not bounded. Therefore, we can restrict
ourselves to the case that these two zeros agree, and hence are real.
So, we need to find -1 < x < 0 and f3 so that P(x) = 0 and P'(x) = O.
Eliminating f3 from these two equations gives

(3.13)

The left side of(3.13) is -20 at x = 0 and is +11 at x = -1, so there is a root
in (-1, 0). It is easy to check that this root is unique; Mathematica gives it as
xo = - .425465 . .. The corresponding value of f3 is .149772 ...
Fix these values of Xo and f3. Since P(x) has a double root at xo, it can be
factored as

where
ao = 5.524 ... , a] = 3.871 ... , a2 = 1.272 ... ,
a3 = .257 ... , a4 = .0495 ...
Since ao > a] + a2 + a3 + a4, the other four roots of P lie outside the unit disk.
Therefore, we see that for this choice of f3, F(n) decays exponentially rapidly. In
particular, it is bounded.

The Density
Proposition 3.11 is not entirely satisfactory, because we will need to know that the
bounded {F (n), n 2: I} whose existence is guaranteed by that result is decreasing
(so that fen) is nonnegative) and satisfies some other inequalities. In principle,
F(n) can be computed by expanding v' P(x) in a power series, but this approach
3. Models with Threshold = I 163

makes it difficult, if not impossible, to check any properties of F(n). So, we must
take a different tack.
The system of equations (3.9) and (3.10) can be rewritten in terms of the
density f:

(3.14) :t
k=i
f(k) = 1, f(1) =~, f(2) = ~,
and
2f(1)f(2) + 2f(4) + 2f(5) - 4f(3) = 0,
2f(1)f(3) + f2(2) + 2f(5) + 2f(6) - 5f(4) = 0,
(3.15) n-i
L f(k)f(n - k) + 2f(n + 1) + 2f(n + 2) - 6f(n) = 0, n 2: 5.
k=i

The easiest way to see this is to compute


d
dtJLS(t){ry: 1](0) = 1,1](1) = ... = 1](n - 1) = 0, 1](n) = I}, n 2: 1

as we did in (3.3). An important feature of(3.15) that is not shared by the equations
(3.9) for F is that there is only one negative term on the left. This property is used
in a crucial way in the proof of Proposition 3.17 below, and leads to the following
motivating remarks.
A probabilist looking at equations (3.15) should be struck by their similarity to
the equations that define a harmonic function for a continuous time Markov chain
on {l, 2,3, ... }. If the transition rates for such a chain are given by q(n, m), then
f is harmonic if it satisfies
L q(n, m)f(m) - fen) L q(n, m) = 0, m 2: 1.
m:mi=n m:mi=n

Equation (3.15) for n 2: 5, for example, almost has this form for a chain that leaves
n at rate 6, going to n + 1 or n + 2 at rate 2 each, and to sites to the left of n at
a total rate of 2. Of course, (3.15) is different in that the terms that correspond to
moving to the left are quadratic in f rather than linear. Nevertheless, this analogy
is useful in trying so solve (3.15). In particular, the form of (3.15) suggests that
we define a family of evolutions Ut (n), n 2: I} by setting
1
ft(1) == 2' ft(2) == 4'
d
dt ft(3) = 2ft (1)ft (2) + 2ft (4) + 2ft(5) - 4!t(3),
d
(3.16) - ft(4) = 2ft (1)ft (3) + ft2(2) + 2ft(5) + 2ft(6) - 5ft(4),
dt
d
L ft (k)ft (n -
n-i
- ft(n) = k) + 2ft(n + 1)
dt k=i

+ 2ft(n + 2) - 6ft(n), n 2: 5.
164 Part II. Voter Models

These would correspond to the Kolmogorov backward equations in the Markov


chain context.
The idea is to try to get a solution of (3.14) and (3.15) by passing to the
limit as t --+ 00 in a solution of (3.16). To see the connection between (3.15)
and (3.16), and hence complete the motivation for the latter, suppose that !ten)
converges nicely as t --+ 00. Then one would expect the t derivative to tend to
zero, and hence the right side of (3.16) to tend to zero. But this means that the
limiting values satisfy (3.15). Note that It will not in general be a density, even if
10 is - the evolution of the total mass Ln !ten) will be computed in (3.21) below.
However, it will turn out that the limit is a density.

Proposition 3.17. Let It(n) be defined by (3.16) with initial condition lo(n) = 0
for n ::: 3. Then
(a) !ten) is nonnegative and non decreasing in t for all n ::: 1,
(b) I(n) = lim t - Hx) !ten) < 00 for each n,
(c) {f (n), n ::: I} is the unique positive solution of the system (3.14), (3.15),
and
(d) L:l nl(n) < 3.

Proof For part (a), note that the only negative sign on the right of (3.16) is on
the term whose derivative appears on the left side. Therefore, if at some time t
and for some n, It(n) = 0, while It(k) ::: 0 for all k -=1= n, then the derivative of
It (n) is nonnegative, so It (n) is forced up by its differential equation. It follows
that It (n) remains nonnegative for all n at later times if this is the case at t = O.
The proof of monotonicity is similar. Differentiate the equations in (3.16).
Again, all the terms on the right of the differentiated equations have positive
signs, except for the last one. Therefore the derivatives

(3.18)

remain nonnegative if they are nonnegative initially, which is true. Arguments of


this sort are generally known as maximum principles. One interesting feature of
their application here is that the differential equations involved are nonlinear.
This argument requires, of course, that It have derivatives of higher order. This
property is automatic for solutions of differential equations of this sort. Equation
(3.16) itself shows that if It (n) is continuous in t for each n, then its derivative is
also continuous, since the right side of (3.16) is continuous. Therefore, we know
that the right side of (3.16) is continuously differentiable, and hence that It(n) is
twice continuously differentiable. Iterating this argument, it follows that It (n) is
infinitely differentiable in t.
The arguments given so far are somewhat informal. A formal proof of this
nonnegativity and monotonicity would parallel the usual treatment of the backward
equations for a continuous time Markov chain. The first step is to replace the
differential equations by integral equations. For example, the last equation in (3.16)
would become
3. Models with Threshold = 1 165

ft(n) - fo(n) = It e- 6(t-s) [~fs(k)fs(n - k)


(3.19)
+ 2fs(n + 1) + 2fs(n + 2)]dS.

Then the solution to (3.16) is constructed by successive approximations - the


(k + 1)st approximation f?+l)(n) is obtained by putting the kth approximation
ft(k) (n) into the right side of (3.19):

f/k+l)(n) = fo(n)
t e- (t-s) [n-l
+ 10 6 f; f}k) (j)f}k) (n - j)

+ 2f}k)(n + 1) + 2fi k )(n + 2) ]dS

for n :::: 5, with similar equations for small n. This makes it clear that the successive
approximations are always nonnegative if the zeroth one is. The same argument
proves the monotonicity statement - the only difference is that the integrating
factor technique is applied to the differentiated versions of (3.16), thus yielding
successive approximations for the derivatives (3.18).
The previous argument is rather soft, and applies quite generally. If we were
considering the analogues of equations (3.14), (3.15) for a A below the critical
value of the threshold contact process, instead of A = 1, then nothing would
change in part (a). The problem would be that ft(n) might blow up as t -+ 00, or
if not, the limit would not satisfy (3.14), (3.15). The real work comes in the next
part of the proof.
Turning to part (b), note that the existence of the limit is immediate from the
monotonicity statement in part (a). The real issue is the finiteness of the limit.
Introduce the generating functions
00

1jJ(t,x) = Lfr(n)x n .
n=l
In the following computations, we will leave the initial conditions general at first,
since we will need to take different ones in the next subsection. Multiplying the
nth equation in (3.16) by xn and summing yields

(3.20)

Differentiate this with respect to x twice, and then replace x by 1. Letting

aCt) = 1jJ(t, 1), bet) = ~1jJ(t, X)I


ax x=l
' a2
c(t) = -21jJ(t, x)
ax
I
x=1
,
166 Part II. Voter Models

one gets the following differential equations:

d 2 I
dta(t) = [I - a(t)] + 4' - 2ft(3) - ft(4),
d
(3.21 ) dt bet) = 2[1 - a(t)][3 - b(t)], and
d 2 3
d/(t) = 2[3 - b(t)] - 2[1 - a(t)] [e(t) + 8] - 2 + 8ft(3) + 8ft(4).
These hold as long as aCt) remains finite. The fact that aCt) appears squared on
the right of the first equation means that in principle, aCt) could blow up in finite
time. We will see shortly that it remains finite for all t. The fact that bet) appears
only to the first power in the second equation, and e(t) appears only to the first
power in the third equation means that bet) and e(t) will remain finite as long as
aCt) does.
By part (a), ft(n) is nondecreasing in t for each n. It follows that aCt), bet),
and e(t) are also nondecreasing, and hence each of the derivatives in (3.21) is
nonnegative. We will need one other inequality. By the first and fourth expressions
in (3.16) and the fact that ft (4) is nondecreasing,

(3.22) ft(3) + f/(2) + 2ft(5) + 2ft (6) :::: 5ft(4).

To eliminate ft(5) and ft(6), use the fact that


I
- +L
00
ft(k) = aCt),
2 k=2

so that ft(2) + ft(3) + ft(4) + ft(5) + ft(6) s aCt) -~. Using this in (3.22) gives

(3.23) ft(3) + 7 ft(4) S 2a(t) - 1 - 2ft (2) + f/(2).


Now we want to eliminate the ft(3) and ft(4) terms that appear in inequality
(3.23) and in the inequalities that come from the nonnegativity of the right sides
of the first and third lines in (3.21). Taking a specific linear combination of these
three inequalities and using the second expression in (3.16) leads to

Os 13[2[3 - b(t)]2 - 2[1 - a(t)][e(t) + 8] - ~ + 8ft(3) + 8ft (4)]


+48[[I-a(t)]2 + l-2ft(3) - ft(4)]
(3.24)

+ 8[2a(t) - ~~ - ft(3) - 7ft(4)]


=26[3 - b(t)]2 - [1 - aCt) ][26e(t) + 48a(t) + 176] - 3.
Now we can argue as follows. From the nonnegativity of the right side of
the middle line of (3.21), we see that if I - aCt) or 3 - bet) changes sign, then
3. Models with Threshold = 1 167

they must both change sign at the same time. By (3.24), they cannot be zero
simultaneously. Therefore, they never change sign. Since a(O) = ~ and b(O) = 1,
it follows that
(3.25) aCt) < 1 and bet) < 3 for all t ~ O.
In particular, ft (n) is bounded in t for each n, so that
fen) = lim ft(n) < 00.
t--> 00

This proves part (b).


Forpart(c),notethatbythemiddlelineof(3.21),eithera(t) -+ 1 orb(t) -+ 3
as t -+ 00, since otherwise b(t) would be unbounded. By (3.24), the latter is
impossible. Therefore, (fen), n ~ I} sums to 1, and hence is a probability density.
The right sides of (3.16) have limits, and these limits must be zero, since otherwise
fl(n) would be unbounded. Therefore, (fen), n ~ I} satisfies (3.l4) and (3.l5)
as required. The uniqueness follows from the uniqueness statement in Proposition
3.ll.
Finally, part (d) follows from

n=l

which we have just argued must remain uniformly below 3.

The Renewal Sequence


Having a density f that solves (3.14) and (3.15) (or equivalently, whose tail
probabilities F satisfy (3.9) and (3.10)) is a step in the right direction, but in
order to use it to prove that the threshold contact process with A = 1 has a
nontrivial invariant measure in case (3.1), we will need to use various monotonicity
and convexity properties of f, F and of the corresponding renewal sequence u,
which is defined in (B35). From now on, we will use f, F and u to denote these
sequences.
In order to see what inequalities we might expect to hold, consider the values
(rounded to the number of decimal places provided) given below for f, F, u, and
their successive ratios. One question we might ask is whether any of the sequences
is logconvex (see (B38)), and if not, whether it is "almost" logconvex. Note that
once the value of f3 = F(4) has been determined in the proof of Proposition
3.11, any other values of f can be computed recursively from (3.9) with arbitrary
precision.
Clearly the ratios are not all monotone, though it appears that they may be
monotone after the first obvious exceptions. The next order of business is to prove
this.

Proposition 3.26. The density f has the following property:


f(2) f(3) fen)
f(3) ~ f(4) ~ ... ~ fen + 1) ~ ....
168 Part II. Voter Models

Table I
f(n) F(n) u(n-I)
n fen) F(n) u(n) f(n+l) F(n+l) ---,;(ri)

.5000000 1.000 .5000000 2.00000 2.000 2.00000


2 .2500000 .500 .5000000 2.49434 2.000 1.00000
3 .1002271 .250 .4752271 2.02292 1.669 1.05213
4 .0495459 .150 .4622729 1.91236 1.494 1.02802
5 .0259083 .100 .4507380 1.56141 1.349 1.02559
6 .0165928 .074 .4428877 1.46809 1.287 1.01773
7 .0113023 .058 .4365590 1.39314 1.243 1.01450
8 .0081128 .046 .4314540 1.33989 1.212 1.01183
9 .0060548 .038 .4272128 1.29929 1.188 1.00993

It would be enough to prove that the evolution in (3.16) with the initial con-
ditions used in Proposition 3.17 has the property that

(3.27) .fr(k - 1).fr(k + 1) :::: f/(k), k:::: 3

for all t :::: 0, since then we could simply pass to the limit in these inequalities.
Note that (3.27) is true for t = 0, since the right side is zero for k :::: 3. However,
the asymptotics

as t t 0, which are easy to read off from (3.16), make it clear that (3.27) fails for
small t, at least for k = 4. In fact, it fails for all even k. So, we will have to argue
differently. The first step is to prove the following weaker statement.

Lemma 3.28.
f(5)]n-4
fen) :::: f(4) [ f(4) ,

Proof The idea of the proof is to follow the proof of Proposition 3.17 with a
different initial condition for the evolution (3.16). So, write the initial condition
as
fo(3) = a, fo(4) = b, fo(n) = ca n - 5 for n :::: 5,

where a, b, c, a are positive constants to be determined. In order for the argument


of Proposition 3.17 to work, we need to choose these constants so that the right
side of(3.16) is nonnegative for t = O. Since we have four constants to determine,
we will try to make the right side of (3.16) be zero for t = 0 and n = 3,4,5,6.
Here are these four equations:
3. Models with Threshold = 1 169

1
4 + 2b + 2e = 4a,
1
a + 16 + 2e + 2m = 5b,
(3.29)
1
b + "l a + 2ea + 2ea 2 = 6e
1
e + "lb + a 2 + 2m2 + 2ea 3 = 6ea.
Solve the first three equations in (3.29) for a, b, e in tenns of a, and substitute
into the fourth equation. The result is

63 - 26a - 22a 2 51 - 12a - 16a 2 19


8a=-----_____=_ 16b = , 16e = ,
91 - 46a - 36a 2 ' 91 - 46a - 36a 2 91 - 46a - 36a 2
and

20167 - 55144a + 24496a 2 + 26736a 3 - 10828a 4 - 5472a 5 = O.

This polynomial has a root a = .585 ... , and then the values of a, b, e become
a = .0972 ... , b = .0464 ... , e = .0229 ...
Comparing with the corresponding entries in Table 1, we see that these are slightly
smaller than f(3), f(4), f(5). This is encouraging, since we hope to show that
with this initial condition, ft(n) t fen), n ~ 3. So, we will take these values for
a, b, e, a from now on.
The right sides of the other expressions in (3.16) are respectively

1
(n=7) m + "le + 2ab + 2m 3 + 2ea 4 - 6ea 2 ,
1
(n=8) ea 2 + "lea + 2ae + b 2 + 2ea 4 + 2ea 5 - 6ea 3 ,
1 n 7
ea n - 6 + _ea - + 2aea n - 8 + 2bea n - 9 + (n _ 9)e 2 a n - 10
2

The final one corresponds to n ~ 9, and it is clearly sufficient that it be nonnegative


for n = 9. It is a simple matter to check that they are all nonnegative for the values
of a, b, e, a that we have chosen. In fact, the values of the above expressions for
n = 7, 8,9 are .00138, .00214, and .00212 respectively.
So, as in the proof of Proposition 3.17, ft(n) t for all n. We need to show that
the limit is fen), as it was there. Since the current initial conditions are larger than
they were previously, the maximum principle implies that the current evolution is
larger for all t than it was previously. Therefore,

(3.30) lim ft(n) ~ fen).


t--+oo
170 Part II. Voter Models

Equality will follow once we know that


00

'"' lim ft(n)


~ t-'>oo
= 1.
n=l

But this follows as before, since now


3 e
a(O) = - +a +b+ - - = .948 ... < 1,
4 1- a
5 -4a
b(O) = 1 + 3a + 4b + e (l _ a)2 = 1.831 ... < 3.

Therefore, equality holds in (3.30). It follows that

fen) = lim !ten) ::: fo(n) = ea n- S, n::: 5.


t-'>oo

So, we have proved the statement of the lemma for any n for which

ea n- S > f(5)[f(5)]n-S
- f(4)

This is clearly true for large n, since f(5)lf(4) = .52 ... < .58 ... = a, by Table
1. In fact, it is true for n ::: 7, since

2 f3(5)
ea = .00785 ... > .00708 ... = f2(4).

But the statement of the lemma is obvious for n = 4 and n = 5, and follows from
Table I for n = 6, so the proof of the lemma is complete.

Proof of Proposition 3.26. Any proof of this result that uses the evolution (3.16)
runs into difficulties caused by the fact that the logconvexity actually fails for
small values of n. We will get around this problem by modifying the evolution
in such a way that the first few values do not change at all. So, take gt(n) to be
defined by
gt(n) == fen) for n ::: 5, go(n) = f(5)a n- S,
where a = f(5)lf(4) = .52 ... , and

d
L gt(k)gt(n -
n-l
- gt(n) = k) + 2gt (n + 1) + 2gt (n + 2) - 6g t (n), n::: 6.
dt k=l

Note that

(3.31) gt(2) > gt(3) > ... > gt(n) > ...
gt(3) - gt(4) - - gt(n + 1) -

for t = 0 by Table 1. Our plan is to prove that (3.31) holds for t > 0 as well, and
that
3. Models with Threshold = I 171

(3.32) lim gt(n) = fen), n 2: 1.


t-+oo

Combining (3.31) and (3.32) gives the result.


We begin with (3.32). Let (ft(n), n 2: I} be the evolution determined by (3.16)
with fo(n) = 0 for n 2: 3. To check that

(3.33) !ten) ::S gt(n), n 2: 1, t 2: 0

argue as follows. The inequalities (3.33) are clearly true at t = 0 for all n, and for
all t 2: 0 if n ::S 5 by Proposition 3.17. For n 2: 6, the equations of evolution are
the same for the two systems, so (3.33) holds by the maximum principle. Now,
by Lemma 3.28,

(3.34) gt(n) ::S fen), n 2: 1

holds at t = O. Since f satisfies (3.15), the maximum principle shows that (3.34)
holds for all t 2: O. Combining (3.33) and (3.34) with Proposition 3.17 gives
(3.32).
Finally, we tum to the proof of (3.31), which can be restated as

(3.35)

This is automatically true for k = 3,4, and for all k if t = O. We will again use
the maximum principle to show that (3.35) is true for t > O. To do so, we need
to check the following statement: If (3.35) is true for a given t and all k 2: 3, and
holds with equality at that time for a fixed k, then
d
(3.36) dt [gt(k - l)gt(k + 1) - g;(k)] 2: 0

for that k and t.


The first case is k = 5. Then the left side of (3.36) is

(3.37) f(4) [f(5) + ~ f(4) + f2(3) + 2gt (7) + 2gt (8) - 6gt(6) l
Using

f(4)gt(6) = f2(5), f(5)gt(7) 2: g;(6), gt(6)gt(8) 2: g;(7)

gives the following lower bound for (3.37):

f(4)f(5) + ~ f2(4) + f(4)f2(3) + 2 f3(5) + 2 f:(5) - 6f2(5) = .00005 ... > O.


2 f(4) f (4)
Next, consider the case k = 6. Then the left side of (3.36) is

f(5)[gt(6) + ~ f(5) + 2f(3)f(4) + 2gt(8) + 2gt(9) - 6gt(7)]


(3.38)
-2gt (6) [f(5) + ~ f(4) + f2(3) + 2gt(7) + 2gt(8) - 6gt(6)).
172 Part II. Voter Models

Now we will use

(3.39) 1(5)gt(7) = g;(6), gt(6)gt(8)::: g;(7), gt(7)gt(9)::: g;(8)

to get a lower bound for (3.38). First replace gt(9) in (3.38) by g;(8)/gt(7). This
results in a quadratic function of gt (8). This quadratic is increasing in gt (8) for

(8) > gt(6)gt(7) _ gt(7) = g;(7) _ gt(7)


gt - 1(5) 2/(5) gt(6) 2/(5)'
where the equality follows from (3.39). Therefore, we may replace gt(8) by its
lower bound from (3.39), and then replace gt(7) by its value in (3.39). The result
is the following lower bound for (3.38):

[~/2(5) + 2 / (3)/(4)/(5)] - gt(6) [/(5) + 1(4) + 2/2(3)]


(3.40)
+6 2(6)_2 g(6) _2 g1 (6).
gt 1(5) j2(5)
This is a polynomial in gt(6), whose smallest positive root is .0197 ... Since

gt(6) :::: 1(6) = .0165 ...


by (3.34) and Table I, it follows that (3.40) is nonnegative.
Finally, we consider the general case k ::: 7. The right side of (3.36) is

gt(k - 1)[ tgt(J)gt(k + 1 - j)

+ 2gt(k + 2) + 2gtCk + 3) - 6gtCk + I)]


(3.41)
+ gt(k + 1)[ ~ gt(j)gtCk-l- j) + 2gtCk) + 2gtCk+ I) - 6gtCk-l)]

- 2gt (k) [ ~ gt (J)gr (k - j) + 2gt (k + 1) + 2gt (k + 2) - 6gt (k) l


The key is to rewrite (3.41) in the following way:

grCk - I)gt(k + I) - g;(k) [~gt(j)gr(k _ j _ 1) _ 12gt (k _ I)]


gr(k - I)

I:
j=]

+ [gt(k)gt(J) - gt(k-l)gt(J+I)][gr(k)gt(k- j -1) - gt(kz -I)gt(k- j)]


j=] gt(k-I)
+ 2[gt(k - l)gtCk + 2) - gt(k)gtCk + 1)]
+ 2[gt(k - I)gt(k + 3) - 2gt (k)gt(k + 2) + g;(k + I)].
3. Models with Threshold = I 173

Since
1(4) < 1(1) 1(2) 1(3)
1(5) - 1(2)' 1(3)' 1(4)
by Table 1, the above expression is nonnegative under our assumption
gt(4) gt(5)
- - > - - > ... >
gt(k - 1)
= gt(k)
>
gt(k + 1) > ....
(3.42)
gt(5) - gt(6) - gt(k) gt(k + 1) - gt(k + 2) -

The only term for which this is not completely clear is the last one. But by (3.42)
and the inequality between geometric and arithmetic means,

gt(k)gt(k + 2) ::: J[gt(k - l)gt(k + 1)][gt(k + l)gt(k + 3)]


gt(k - l)gt(k + 3) + g;(k + 1)
< ~----~------~-----
- 2
This completes the proof of (3.31), and hence of the Proposition.

As pointed out earlier, 1 and u are not logconvex, so we cannot use Theorem
B39 directly to deduce inequalities for u from Proposition 3.26. However, Lemma
B40 can be used to prove the slightly weaker fact that is true in our case.

Proposition 3.43. The renewal sequence u satisfies the following properties:


u(O) u(2) u(3) u(n)
(i) --->--->---> ... > > ...
u(1) - u(3) - u(4) - - u(n + 1) -

(ii) u(O) - u(l) :::: u(2) - u(3) :::: u(3) - u(4) :::: '" :::: u(n) - u(n + 1) :::: ... ,

1 1
(iii) u(n) - u(n + 1) :::: 3[u(n - 1) - u(n)] + 6[u(n - 2) - u(n -1)], n:::: 3,

and
1
(iv) u(n) - u(n + 1) :::: 2" [u(n - 1) - u(n)], n:::: 2.

Proof The proof of (i) is by induction on n. Suppose that we have proved


u(O) u(2) u(3) u(n)
--- > --- > --- > ... > ------
u(l) - u(3) - u(4) - - u(n + 1)

for a fixed n :::: 5. (Note that the values in Table 1 show that this statement is true
for n = 5.) We need to prove that

u(n)u(n + 2) :::: u(n + 1)2,

which is just the statement that the left side of (B41) is nonnegative. Consider the
terms on the right side of (B41). All the determinants involving 1 are nonnegative
by Proposition 3.26, together with the fact that
174 Part II. Voter Models

f(1) f(4)
-->--
f(2) - f(5)'

which can be seen from Table 1. All the determinants involving u on the right
side of (B41) are nonnegative by the induction hypothesis, except for the one
corresponding to j = n - 1. Take 2 .::: j .::: n - I, and write

1fenf(j) f(j + 1) 1_ . 1 n 2 [ f(j) _ fen + 1)]


+ 1) fen + 2) - fCJ + )f( + ) f(j + 1) fen + 2)
fen - 1) fen + 1)]
::: f(n)f(n + 2) [ fen) - fen + 2)
fen - 1) fen) 1
- 1
fen + 1) fen + 2) ,
where the inequality follows from Proposition 3.26. Note in this connection that
the limit of f(k)lf(k + 1), which exists by monotonicity, must be ::: 1, since
otherwise f(k) would be unbounded. It follows that f(k) is decreasing in k so
f(j + 1) ::: fen) above. Using these observations, we see that the right side of
(B41) is bounded below by

fen - 1) fen) 1 ~ 1 u(n - j) u(n - j + 1) 1


fen + 1) fen + 2) u(n + 1)
1
.~ u(n) .
J=n-3

Therefore, it is enough to prove that the sum above is nonnegative. But this sum
is
u(n + l)[u(l) + u(2) + u(3)] - u(n)[u(2) + u(3) + u(4)].
By the values in Table 1 and the induction hypothesis,

u(l) + u(2) + u(3) = 1.0262> 1.0256 = _u(_4)> _u_(n_)_


u(2) + u(3) + u(4) - u(5) - u(n + 1)

as required.
Part (ii) follows from part (i), the arithmetic-geometric mean inequality and
Table 1. For part (iii), the result holds for n ::: 8 by Theorem B45, since
F(lO) 5
-->-
F(9) - 6

by Table 1. The other cases follow directly from Table 1. Part (iv) is a consequence
of parts (ii) and (iii) and Table 1.

Existence of a Nontrivial Invariant Measure


Having invested so much energy in the analysis of the solution to (3.9) and (3.10)
(equivalently, to (3.14) and (3.15», we will now explain more precisely how
this will be used to show that the threshold contact process with A = 1 has a
nontrivial invariant measure in case (3.1). The following preliminary result is a
simple consequence of duality.
3. Models with Threshold = 1 175

Proposition 3.44. Suppose f-L is a probability measure on {O, l} Z d and Set) is the
semigroup for the threshold contact process on Zd with T = 1 and A ::: O. If

(3.45) !£f-LS(t){1): 1) = 0 on A}I :s 0


dt 1=0

for all finite A C Zd, then

f-LS(t){1J: 1) = 0 on A}
is a nonincreasing function of t for all finite A C Zd. In particular,

(3.46) v = lim f-LS(t)


1-'>00

exists, is an invariant measure for the process, and satisfies


(3.47) v {1) : 1) = 0 on A} :s f-L {1) : 1) = 0 on A}
for all finite A C Zd. If f-L j 80 , then v j 80.

Proof The final statement is an immediate consequence of (3.47). The existence


of the limit in (3.46) and inequality (3.47) follow from the monotonicity statement.
The fact that v is invariant comes from Theorem B7(e). So, we need only prove
that (3.45) for all finite A implies
d
(3.48) d t f-L S (t) {1) : 1) = 0 on A} :s 0
for all finite A and all t ::: O. To do so, apply (3.5) to the measure f-LS(s), giving

f-LS(t + s){1) : 1) = 0 on A} = L pA(A/ = C)f-LS(s){1) : 1) = 0 on C}.


c
Differentiate this relation with respect to s, and set s = O. The result is

d
-f-LS(t){1J:
dt
1) = 0 on A} = Lc pA(A/ = C)-f-LS(s){1)
d
ds
: 1) = 0 on C} I .
s=o

Therefore (3.45) implies (3.48). It is perhaps worth emphasizing that it is not the
case that (3.45) for a particular A implies (3.48) for that A.

We come now to the main result in this section.

Theorem 3.49. All threshold voter models (with T = 1) coexist except for the one
with d = 1, .IV = {-I, 0, I}.

Proof By Propositions 2.11, 2.15, 3.7 and 3.44, it suffices to consider the case
d = 1, JV = {-2, -1,0, 1, 2}, and to find a measure f-L j 80 satisfying (3.45)
for the corresponding threshold contact process with A = 1. In choosing such a
measure, it seems reasonable to look for one that satisfies (3.45) with equality for
176 Part II. Voter Models

as many A's as possible - say for all connected sets. Not coincidentally, we have
already found a candidate: the stationary renewal measure p, corresponding to the
density f that satisfies (3.14) and (3.15). These equations are exactly the ones that
say that (3.45) holds with equality for all intervals A.
By (3.3), we must show that

L p, {r! : IJ == 0 on A, IJ ¢. 0 on k + JV}
kEA
(3.50)
- LP,{IJ: IJ == 0 on A\{k}, IJ(k) = I} :::: 0
kEA

for all finite A C ZI. In order to take advantage of the renewal property, the left
side of (3.50) is best written in terms of the following conditional probabilities:

LA(k) = p,{r! : IJ == 0 on A n (-00, k)IIJ(k) = l},


RA(k) = p,{r!: IJ == 0 on A n (k, oo)IIJ(k) = I}.

It is not too hard to write the negative terms on the left of(3.50) in terms of LA and
R A . For the positive terms, the first step is to write the following decomposition
(which holds a.s.) according to the locations of the first 1's to the right and left
of a particular k E A:

{r!: IJ = 0 on A} = Uj,!j!A;j<k<l{r!(j) = IJ(I) = 1, IJ = 0 on AU (j, I)}.


Using the fact that, relative to the renewal measure p" the families of random
variables {IJ(j), j < k} and {IJ(j), j > k} are conditionally independent given the
event {IJ(k) = I}, the left side of (3,50) divided by p,{IJ : IJ(O) = I} can now be
written as

(3.51) L f(l- j)LA(j)RA(l) - LLA(k)RA(k).


j<k<!;kEA;j,lj!A kEA
Ik-jl:::2 or II-kl:::2

In order to show that this is nonnegative, it will be important to use various


identities satisfied by the functions f, LA, and R A, The ones satisfied by fare
(3,14) and (3.15), Here is the one satisfied by LA:

(3,52) LA(k) = L LA(j)f(k - j).


j<k,jj!A

To check it, use a decomposition based on the location of the first 1 to the left of
k as follows:

p,{r!: IJ == 0 on An (-00, k), IJ(k) = I}


= L p,{IJ: IJ == 0 on An (-00, j), IJ(j) = I, IJ == 0 on (j, k), IJ(k) = I}
j<k,jj!A
= L p,{IJ: IJ == 0 on An (-00, j), IJ(j) = I}f(k - j).
j<k,jj!A
3. Models with Threshold = 1 177

Dividing by JL{1] : 1](0) = I} gives (3.52). The analogous identity for RA is

(3.53) RA(k) = L RA(J)f(J - k).


j>k.j<tA

Next we will use these identities to rewrite the second term in (3.51). Using
(3.52) and (3.53),

LLA(k)RA(k)= L LA(J)f(k-j)f(l-k)RA(l)
kEA j<k<l
kEA:j.l~A

(3.54) L LA (J)RA (I) L f(k - j)f(1 - k)


j <1:j.l~A j <k<l
- L LA(j)f(k - j)f(l- k)RA(l).
j<k<l
j.k,l~A

Note that the right side of (3.54) is the difference of two divergent series. The
interpretation is that all identical summands should be cancelled before the sum-
mations take place. After this cancellation, the remaining sums are convergent.
Perhaps we should pause a minute before plunging into the next set of compu-
tations to see what the objective is. Clearly for the proof to work, we must at some
point use the fact that f satisfies the convolution equation (3.15). The computation
in (3.54) is designed to introduce a convolution, so that we can use (3.15) at that
point. After doing so, there is simply some bookkeeping to do. Looking ahead to
(3.59), the reader will see an expression for (3.51) that has several virtues: (a) the
convolution equation defining f has been used, as it must be, and (b) the values of
f (n) do not appear explicitly. This latter fact is quite important, since there is no
explicit expression for f to be used. In carrying out the following computations,
it will be useful to have the explicit expressions for the first few f (.), s:

1 1 1
(3.55) f(l) = 2' f(2) = 4' f(4) = 2f3 - -.
4
Here f3 = F(4) as before.
The first term on the right of (3.54) can now be rewritten using (3.15) and
these values in the following way:

L L A(J)R A(l)[6f(l- j) - 2f(l- j + 1) - 2f(l- j + 2)]


j <l:j.l~A

(3.56) -(2f3 -~) l_jJ;;,l~ALA(j)RA(l) - (~-2f3) I-jl;,l~A LA(J)RA(l)


- (~ - 2f3 ) I-jf,l~A LA(J)R A(I) - (2 + 2(3) l-j1;.I~A LA(J)R A(I).

Now use (3.52) and (3.53) again to rewrite the first sum in (3.56) as
178 Part II. Voter Models

6 L LA(j)RA(j) - L LA(j)[RA(j - 1) - ~RA(j)J


UA UA 2

- LRA(l)[LA(l + 1) - ~LA(l)J
I¢A 2
(3.57)
- LLA(j)[RA(j -2) - ~RA(j -l)IU-l¢A} - ~RA(j)J
j¢A 2 4

- L RA(/)[LA(l + 2) - ~LA(l + l)l(l+I¢A} - ~LA(/)J.


I¢A 2 4

Note that in each of the terms rewritten in this step, there is a choice between
using (3.52) and (3.53), depending on the order in which the sums on j and I
are taken in the first expression in (3.56). To maintain symmetry we have used
(3.52) on half of the terms of each type, and (3.53) on the other half. Note that the
i
factors of 4 and in (3.57) are f(l) and f(2) respectively. The terms in braces
in (3.57) are just the left sides of (3.52) and (3.53), after one or two terms from
the right sides have been moved to the left.
The second term on the right of (3.54) is easier. Using (3.52) and (3.53), it
becomes

- L LA(k)RA(k).
k¢A

Next we tum to the first tenn in (3.51), and write it as

(sum for j + 2 = k < I) + (sum for j + 1 = k < I)


+(sum for j < k = 1 - 1) + (sum for j < k = 1- 2)
(3.58)
-(sum for j + 1 = k = I - 1) - (sum for j + 1 = k = I - 2)
-(sum for j + 2 = k = 1 - 1) - (sum for j + 2 = k = I - 2).

Using (3.52) and (3.53) in the first four sums in (3.58), they become:

kEA.k-2¢A kEA,k-l¢A
+ L LA(k + I)RA(k + 1) + L LA(k + 2)RA(k + 2)
kEA,k+l¢A kEA,k+2¢A
1 1
2 L LA(k - 2)RA(k - 1) - 2" L LA(k + I)R A(k + 2).
kEA;k-l,k-2¢A kEA;k+l,k+2¢A

Combining all of these expressions, we obtain the following for (3.51):


3. Models with Threshold = 1 179

L {LA(k)[RA(k - 2) + RA(k - 1)] + [LA(k + 2) + LA(k + 1)]R A(k)}


kotA
5
- - L L ACk)RACk)+0+2,B) L LACk)RA(k+I)
2 kotA k,k+1otA

- L {LACk+I)RA(k)+LA(k)RACk)+LACk+I)RACk+l)}
k,k+lotA

- L {LA(k)RA(k) + L ACk+2)R A(k+2) - 0 - 2,B)L A(k)R A(k+2)}


(3.59) k,k+2otA

1
+- L {2L A(k)R A(k+ 1) + 2LA(k+ I)RACk+2) + L A(k)R A(k+2)}
4 k.k+l,k+2otA

+ (~-,B) ( L LA(k)RA(k+3) + L L A(k)R A(k+3))


4 k.k+l,k+3otA k,k+2,k+3otA

+ (2,B - ~) L LA(k)RA(k+4).
4 k,k+2.k+4otA

Note that all constraints on the indexes in (3.59) are of the form j ~ A. When
in the previous expressions one encounters a term with a constraint of the form
j E A, it is changed to the desired form by writing

lUEA) = 1- lUotA)'

The argument leading up to (3.59) is a bit tedious, but it involves nothing other
than careful bookkeeping.
As a check, consider (3.59) where A = {-n, -n + 1, ... , -1, O}, in which
case we know (3.59) should be zero. For simplicity, take n not to be too small, so
that all the summands in (3.59) can be unambiguously attached to the left or right
components of A c. Then the contributions to each of these components should be
zero. Consider those corresponding to the right component, which for each sum
correspond to k 2: 1. Since R( -1) = ~ and R(O) = R(I) = ... = 1, these
contributions become

GLAO) + 2L A(2) + 2LA(3) + ... ] + [L A(2) + 2L A(3) + 2LA(4) + ... ]


5
- -[LAO) + L A(2) + ... ] + 0 + 2,B)[L AO) + LA(2) + ... ]
2
- [LA (2) + L A(3) + ... ] - [LAO) + L A(2) + ... ]
- [LA (2) + L A(3) + ... ] - [LAO) + LA(2) + ... ] - [L A(3) + LA(4) + ... ]
I
+ 0- 2,B)[L AO) + LA(2) + ... ] + 2[L AO) + L A(2) + ... ]
1 I
+-[LA(2) + L A(3) + ... ] + -[LAO) + L A(2) + ... ]
2 4

+ 2G -,B )[L A(1) + L A(2) + ... ] + (2,B - ~)[LA(1) + L A(2) + ... ].


180 Part II. Voter Models

Note that all the terms cancel in this sum, as they should.

Nonnegativity for Sets that Contain No Singletons


In order to prove Theorem 3.49, we need to show that (3.59) is nonnegative for
every finite set A. This is not too hard to do for sets that contain no isolated points
- the presence of such points makes the proof significantly more difficult. This is
one of the complications that arises because the contact process we are considering
is not nearest neighbor. In particular, this difficulty does not arise in the proof of
the corresponding result «1.28) of Part I) for the basic contact process given in
Section 1 of Chapter VI of IPS.
In order to understand better why singletons in A present additional difficulties,
consider the constraints on k in the sums in (3.59). They are all of the form
k, ... f/. A - i.e., they say that certain indexes fall in the complement of A. It
would be ideal to be able to write (3.59) as a sum of expressions, each of which
corresponds to k's that fall in a given maximal interval of the complement of A.
For the first, second, third, fourth, and sixth sums, it is fairly clear how to do this -
just assign each summand to the interval containing k. The feature that allows this
to be done is the fact that the indexes in the constraint are consecutive integers, so
that they all fall in the same interval. In the fifth, seventh, eighth and ninth sums,
the indexes in the constraint are not consecutive, and hence may fall in different
maximal intervals in the complement of A if A contains singletons. Therefore, it
is not clear to which interval a given summand should be assigned in this case.
The hardest situation to handle is that in which A has the form

- * - * - * - * -,
where * denotes a point in A and - denotes a point in the complement of A.
The fact that the indexes in some of the sums in (3.59) are not consecutive is
a consequence of the fact that the process is not nearest neighbor. So, this is a
difficulty that does not have to be faced for the nearest neighbor contact process
(threshold or basic).
To carry out the verification of the nonnegativity of (3.59) for sets with no
isolated points, and at the same time do the first part of the proof for general sets,
proceed as follows. Let [m, n) be a maximal interval in the complement of A,
i.e., m :s n, m - 1, n + I E A, m, ... , n f/. A. Consider the terms in (3.59) with
the property that all of the indexes appearing in the constraint in the sum fall in
[m, n). That is, these are the terms in (3.59) with some index in [m, n) with the
property that they would appear no matter what the status (in A or not) of points
outside [m - I, n + I) might be. In a natural way, we will associate half of these
with the left boundary (m - 1, m) of the interval, and the other half with the right
boundary (n, n + 1). This suggests the following definitions:
3. Models with Threshold = I 181

Q(m - 1, m) = L LA(k)[RA(k - 2) + RA(k - 1) - ~RA(k)]


m~:.kS:.n

+ m:C:~-1 [(~ + f3 )LA(k)RA(k + 1) - ~LA(k + I)RA(k) - LA(k)RA(k)]

+ m:C:~_2LA(k)[ -RA(k)+~RA(k+l)+(~-f3)RA(k+2)]
+ (~ - f3 ) m:C:~-3 LA (k)RA (k + 3) + (f3 - t) m:C:~-4 LA (k)RA (k + 4)

and

Q(n, n + 1) = L
m:c:k:c:n
RA(k)[LA(k + 2) + LA(k + 1) - ~LA(k)]

+ L [(~ + f3 )RA(k)LA(k - 1) - ~RA(k - I)LA(k) - RA(k)LA(k)]


m+l:c:k:c:n

+ L RA(k)[ - LA(k) + ~LA(k - 1) + (~ - f3)L A(k - 2)]


m+2:c:k:c:n

If A contains no isolated points, then the expression (3.59) can be written as

L Q(m-l,m)+ L Q(n,n+l).
m:m-IEA,m\!A n:n\!A,n+IEA

If A contains isolated points, then the additional terms that appear in (3.59) are
those for which there are two constrained indexes in the sum that lie on either
side of an isolated point in A.
To check that each of the Q's is nonnegative, it is enough by symmetry to
consider Q(m - 1, m) when m - 1 E A, m rf. A. Rewrite it in the form

(3.60) Q(m - 1, m) = cLA(m) + L ck[LA(k + 1) - LA (k)].


m:c:k:c:n- l

Comparing with the definition of Q(m - 1, m) above, we can solve for the coef-
ficients in (3.60) as follows:

5
c = RA(m - 2) + RA(m - 1) - 4RA(m)
(3.6la)
1 5
= RA\{m-l)(m - 2) + -RA(m - 1) - -RA(m)
2 4

if n = m;
182 Part II. Voter Models

c = RA(m - 2) + 2RA(m - 1) - ~RA(m) - (~ - f3 )RA(m + 1)


- ~RA(m + 2) 1{n:o:m+2} - (f3 - ~ )RA(m + 3) 1{n:o:m+3)
(3.61b)
= RA\{m-l}(m - 2) 3
+ "2RA(m - 7
1) - 4"RA(m) - (3)
4" - f3 RA(m + 1)

- ~RA(m + 2) 1{n:o:m+2} - (f3 - ~ )RA(m + 3) 1{n:O:m+3)


ifn > m;

(3.62a)

if m S k = n - 1; and

Ck = RA(k - 1) + ~RA(k) - ~RA(k + 1) - (~- f3 )RA(k + 2)


(3.62b)
- ~RA(k + 3) 1{n:o:k+3} - (f3 - ~ )RA(k + 4) 1{n:o:k+4)
ifm S k < n - 1.
Note that in (3.61), we have expressed C in two slightly different ways. The
reason for preferring the second will become apparent shortly. Basically, it is to
be able to use monotonicity arguments more effectively. In comparing RA for two
different arguments, there will be natural inequalities when there are the same
number of points in A to the right of both arguments. This is not the case for the
arguments m - 2 and m - I, unless we replace A by A \ {m - I}. The identity

that is used in going from the first to the second expression for c is a consequence
of (3.53).
Clearly, to show that (3.60) is nonnegative, we will need to know that the L's
and R's satisfy some inequalities. What we know from Proposition 3.43 is that
the renewal sequence u satisfies some inequalities. Therefore, we need to relate
the L' sand R' s to u. Here are the relevant relations for an arbitrary finite set B:

I - LB(k) = L u(k - j)LB(J),


j<k,jEB
(3.63)
1 - RB(k) = L u(J - k)RB(J).
j>k.jEB

To check the first of these, for example, note that both sides are equal to

fLb : TJ(J) = 1 for some j E B n (-00, k)ITJ(k) = I}.


3. Models with Threshold = 1 183

The right side of the first line of (3.63) is a decomposition of the above event
according to the location of the leftmost 1 in B n (-00, k).
By Proposition 3.43 and Table 1, u(n) ..}, so it follows from (3.63) that

(3.64a)

and

(3.64b)

By (3.64a) and (3.60), Q(m - 1, m) will be nonnegative if c :::: 0 and Ck :::: 0 for
m ::: k ::: n - 1. To check the latter statement, use (3.62), (3.64b) and the fact that

f3 > For example, for (3.62b) when m ::: k ::: n - 4, write
5
Ck = [RA(k -1) - RA(k)] + 2[R A(k) - RA(k+ 1)]
3
+ 4[RA (k + 1) - RA (k + 2)] + f3 [ RA (k + 2) - RA (k + 3) ]
+ (f3 - t) [RA (k + 3) - RA (k + 4)].

The verification that C :::: 0 uses the same argument, together with the obser-
vation that
RA(m - 1) ::: RA\{m-l}(m - 2),
which follows like (3.64b) did from u(n) ..} and (3.63). Thus we see that (3.59) is
nonnegative whenever A contains no isolated points.
Before continuing, we will pause to explain two aspects of this argument, one
of which is explicit above, while the other is implicit. In the harder arguments
to come, both will appear explicitly. First, (3.59) is a quadratic form in the LA's
and RA's. We have seen that monotonicity of these functions is used crucially in
the proof that (3.59) is nonnegative. But these functions are monotone only in
intervals in the complement of A. When their arguments cross a site in A, this
monotonicity is lost. Therefore, LA and RA are written in terms of LB and R B,
where B is obtained from A by deleting a few points in order to regain the lost
monotonicity at those critical locations.
The second aspect to comment on is the following. Suppose one is trying to
show that the quadratic form

i.j

is nonnegative when Ui and Vj are monotone. This is fairly hard to do directly.


But, if one rewrites this form as

where u; = Ui+1 - Ui and vj = Vj - Vj+l, and <'j is whatever the coefficient turns
out to be after the change of variables, then the quadratic form is nonnegative if
184 Part II. Voter Models

the new coefficients are nonnegative, and this is much easier to see. Rewriting
expressions in terms of differences of LA'S and RA'S will be used repeatedly in
what follows for that very simple reason.

Nonnegativity for General Sets


Next we tum to the situation in which A has isolated points. Unfortunately, the rest
of the proof requires consideration of a fairly large number of cases, depending
on how the isolated points are situated in A. It is suggested that the reader not
try to check them all, but rather choose two or three and check those in order to
gain an understanding of the approach. At the same time, he can try to simplify
the proof. The computational aspects of the elementary linear algebra used can be
eased by reference to Mathematica or Maple.
We begin with a definition. By a maximal string of isolated points, we will
mean a set S = {m, m + 2, ... , n - 2, n} where n - m :::: 0 is even, SeA,
m - 1, m + 1, ... , n - 1, n + 1 ~ A, either n + 2, n + 3 E A or n + 2 ~ A, and
either m - 2, m - 3 E A or m - 2 ~ A. (These last constraints are the ones that say
S is maximal.) The terms in (3.59) that are to be associated with the string S are

Q(m - 2, m - 1)1(m-2EA} + Q(m - 1, m)


+Q(n, n + 1) + Q(n + 1, n + 2)I(n+2EA}
(n-m)/2
+ L L A(m+2k-1)[R A (m+2k-3)+R A(m+2k-2)]
k=!
(n-m)/2
+ L R A (m+2k-l)[L A(m+2k)+L A(m+2k+l)]
k=!

(3.65)
(n-m)/2
- LA(n + I)RA(n + 1) + (1 - 2fJ) L LA(m + 2k - I)R A(m + 2k + 1)

What we have done here is to list all terms from (3.59) for which some of the
indexes in the constraint in the sum are in (m, n), together with as much of the
contribution from the Q's (recall they are positive, so we want to use as much as
possible of them) as we can without double counting. For example, if m - 2 ~ A,
3. Models with Threshold = 1 185

we cannot include a term corresponding to Q(m - 2, m - 1), because that might


be used in the expression for another maximal string of isolated points to the left
of S. However, if m - 2 E A, then m - 3 E A also by the definition ofmaximality,
so there cannot be another maximal string of isolated points immediately to the
left of S.
Let k = max{x E A : x < m} and I = min {x E A : x > n} be the points in A
closest to S on the left and right respectively. We assume that k and I are finite;
if one is not, the nonnegativity we need can be checked by a limiting argument.

Strings of Length One


Consider first the case m = n. Put
L(j) = LA\{m.lJ(j), R(j) = RA\{k,m}(j).
Then
L(k + 1) .:s L(k + 2) .:s '" .:s L(l) .:s L(l + 1)
and
R(l - 1) .:s R(l - 2) .:s ... .:s R(k) .:s R(k - 1)

by the argument that led to (3.64). Note that these inequalities would not neces-
sarily be true for LA because of the presence of I and m in A. However, by the
above inequalities for Land R, the following quantities are nonnegative:

ak = L(k + 1), ai = L (i + 1) - L (i) for k < i .:s I, ai = 0 for i < k,


bl = R(I - 1), bi = R(i - 1) - R(i) for k .:s i < I, bi = 0 for i > l.

Half of the terms in (3.65) are given by

Q(m, m + 1) + Q(m + 1, m + 2) 1(l=m+2} - LA(m + 1)RA(m + 1)


+ (~ - f3 )LA(m - l)RA(m + 1)
(3.66)
+ (~- f3 )LA(m -1)RA(m +2)l{l~m+3}

+ (2f3 - ~ )LA(m - l)RA(m + 3)1{l~m+4}.

(Note by the definition of maximality that I ::: m + 4 is equivalent to m + 3 ¢ A.)


We will write (3.66) in terms of the a's and b's in order to check nonnegativity
more easily. To do so, note first that

LA(m - 1) = L(m - 1) = ak +.,. + am-2,


LA(m) = L(m) = ak + ... + am-I,
LA(j) = L(j) - u(j - m)L(m)
= [1 - u (j - m) ] [ak + ... + am- d + am + ... + aj-I
186 Part II. Voter Models

for m < j :s I,
LA(m + 3) = L(m + 3) - ~L(m + 2) - (~ - fJ )L(m)

= (~+fJ )(ak + ... +am-l) + ~(am +am+l) +am+2


if I = m + 2, and
1 1
RA(m - 1) = R(m - 1) - "2R(m) = "2 (bm+l + ... + bl) + bm
RA(m) = R(m) = bm+l + ... + bl ,
RA(m + 1) = bm+2 + ... + bl,'" ,
RA(l- 1) = bl .

Since u(k) +by Proposition 3.43 and Table 1,


LA(J + 1) - LA(J) = aj + [u(J - m) - u(J + 1- m)][ak + ... + am-I] :::: aj

for m < j < l. Using this inequality in evaluating Q(m, m + 1) in (3.60) (note
that the m appearing there is not the same as the m appearing here, but rather is
smaller by one), it follows that (3.66) is ::::

[~(ak + ... + am-I) +am][bm +bm+l(~ + 1{l~m+3J)


+bm+2(~ + ~1(l?m+3J) +f3bm+3 + (f3 - ~)bm+4J
+ .L
1-2 [
aj bj
(3
+ bj+l "2 + 1{l~j+3J )
J=m+l

(3.67) + bj+2 ( ~ + ~ 1(l~j+3J) + fJbj+3 + (fJ - ~) bj+4 ]

+ 1(l=m+2J bm+2[fJ(ak + ... + am-I) + 1am + ~am+l + am+2]

- [~(ak + ... +am-l) +am}bm+2 + ... +bl)


+ (ak + ... + am-2) [ (~ - fJ )bm+2 + (l- 2fJ )bm+3 + ~(bm+4 + ... + bl) l
If I = m + 2, this can be written as

[~(ak + ... +am-l) +amJ[bm + ~bm+l]


(3.68)
+bm+2U(ak + ... +am-2) - (~- fJ )am-l - ~am + ~am+l +am+2)
3. Models with Threshold = I 187

Already we can see in this case that not all the coefficients of ai bj are nonnegative,
which makes the necessary analysis quite delicate. If I = m + 3, then (3.67)
becomes

bmU(ak + ... + am-I) + am] + bm+{~(ak + ... + am-I) + ~am + am+1]

(3.69) + bm+2 [(~ - (3 ) (ak + ... + am-2) - ~am-I - ~am + ~am+l]


+bm+3[ (~- ~{3 )(ak + ... +am-2) - (~- ~{3 )am-1 - (1- (3)am].

For I :::: m + 4, (3.67) becomes

(3.70)

Recall that (3.67) contains only half of the tenus (3.65). The other half are
obtained by symmetry. By this we mean that we interchange the roles of LA (m - i)
and RA (m + i), or equivalently, of am-i and bm+i . If k = m - 2 and I = m + 2,
then (3.68), together with its counterpart obtained by symmetry, imply that (3.65)
can be written as

where M is the matrix


188 Part II. Voter Models
1 3
4 ~+,8 0 2
3
~+,8 2 2 0 0

M= 0 2 2 0 0

3
2 0 0 0 0

0 0 0 0
Since all entries of M are nonnegative and the a's and b's are nonnegative, (3.65)
is nonnegative in this case. So, we see that even though some of the terms in (3.68)
are negative, they are compensated by positive terms in the symmetric expression.
If k = m - 3 and I = m + 3, then (3.69), together with its counterpart obtained
by symmetry, imply that (3.65) can be written as

where M is the matrix


1 - 3,8 i - ~,8 l + .!.,8
4 2 -1+,8 0

i - ~,8 ~ - 2,8 9
"8
1
4
3
2

l4 + .!.,8 9 5
M= 2 "8 2 3

1
-1 +,8 4 3 2 0

3
0 2 0 0
Recalling that ,8 = .1497 ... , we see that the only negative entry in this matrix is
-1 +,8, which appears twice. To handle these, use (3.63) to show that

am-l - am = 2L(m) - L(m - 1) - L(m + 1)


j:'OmJ;,jEA L(j)[[U(m - 1 - j) - u(m - j)]
(3.71 )
- [u(m - j) - u(m + 1 - j)]]

::: o.
The nonnegativity comes from part (ii) of Proposition 3.43. Similarly, bm +1 ::: bm .
Therefore, the ~ + !,8 entries can be used to compensate for the -! + ,8 entries.
It follows that (3.65) is nonnegative in this case as well.
Now take k = m - 2 and I = m + 3. Using (3.69) and the version of (3.68)
obtained by symmetry, we see that (3.65) can be written as
3. Models with Threshold = I 189

where M is the matrix


3
~ - ~f3 !-f3 i+f3 0 2"

1+113
4 2
5
8" 2 2 0 0
M=
-! + 13 1
4 3 2 0 0

3
0 2" 0 0 0

The only negative entry is again -!


+ 13. Now (3.71) is no longer good enough
to compensate for this. Looking at the the first column of M we see that we need
to know that

(3.72)

To check this, we will show that

(3.73)

Using (3.52) twice, write

ak + 2ak+l - 4ak+2 =6L(k + 2) - L(k + 1) - 4L(k + 3)


=4L(k + 2) - 2L(k + 1) - 4 LL(j)f(k + 3 - j)
jd,jlfcA

=4 L L(j)[J(k +2- j) - f(k +3- j)],


j<k,jlfcA

which is nonnegative, since f(j) t by Proposition 3.26 and Table 1. Using (3.73),
we see that the left side of (3.72) is

~ f3am -l + (~ - ~f3 )am-2 ~ O.


The case k = m - 3 and I = m + 2 is handled symmetrically.
By now, it should be clear what the strategy is. If all the entries in the matrix
M were nonnegative, it would be immediate that (3.65) is nonnegative. However,
some entries are negative. So, in each case, we need to identify the negative entries,
and find nearby positive entries that can be used to offset the negative ones. The
offset process requires that we prove that the a's and b' s satisfy some inequalities,
such as (3.71) and (3.72). These are proved either by combining (3.52) and (3.53)
with Proposition 3.26, or by combining (3.63) with Proposition 3.43.
In the remainder of the cases, I - m ~ 4 or m - k ~ 4 or both. Since (3.70)
contains many terms, it behooves us to argue that many can be dropped without
loss. Note that all coefficients of terms involving aj for j > m in (3.68), (3.69)
and (3.70) are nonnegative, so we will drop all these terms. After doing so, we
see that it is sufficient to prove that the following expressions are nonnegative:
190 Part II. Voter Models

8
1
[ -am-z 1
+ -am-I
4
- -am
2
1]
j=m+S
bj L
I

+ (am_z, am-I, am)M(bm+4 , bm+3, bm+z , bm+l , bm),


where M is the matrix
-.L +!.8 ~ -~.8 1-.8 i+.8
M =
(
~ +;.8 ~ + 1.8 S
8" 2

-i+.8 -1+.8 I
4 3
if k = m - 2 and I :::: m + 4,

[( ~ - ~.8 )am-3 + (~ -.8 )am-2 + !am-I - ~am] }=m+S


. bj t
+ (a m-3, am-2, am-I, am)M(bm+4 , bm+3, bm+z, bm+l , bm)
where M is the matrix
ft -.8 1- 3.8 i - ~.8 ~
4
+ !.8
2 -1 +.8
ft -1.8 i - ~.8 ~ - 2.8 9
8"
I
4
M=
!!
16
+ !.8
Z
~4 + !.8
2
9
8"
S
:2 3

-i+.8 -1+.8 ~ 3 2
+.038 +.051 +.251 +.825 -.350
+.238 +.251 +.451 1.125 +.250
+.762 +.825 1.125 2.500 3.000
-.475 -.350 +.250 3.000 2.000
if k = m - 3 and I :::: m + 4, and

[ - ~bm + !bm+1 + (~ - .8 )bm+z + (~ - ~.8 )bm+3


1 1) ] m-S
+ ( "2.8 - 16 bm +4 { ; ai

+[- ~am + !am-I + (~-.8 )am-z + (~-~.8 )am-3


+ (~.8 - 116 )a m - 4] jfs bj
3. Models with Threshold = 1 191

where M is the matrix


t3-~ -t, - 13 ft - 413 !.!
16
+ !t3
2 -i + 13
-t, - 13 4- 313 i - ~t3 l4 + !t3
2 -4 + 13
M= ft - 413 i - ~t3 ~ - 213 9
"8
1
4

!.! + !t3 l4 + !t3 9 5


16 2 2 "8 2 3

-i + 13 -4 + 13 4
1
3 2
+.025 +.038 +.238 +.762 -.475

+.038 +.051 +.251 +.825 -.350

+.238 +.251 +.451 1.125 +.250

+.762 +.825 1.125 2.500 3.000

-.475 -.350 +.250 3.000 2.000


if k S m - 4 and I :::: m + 4. Note that in the above expressions, the negative
terms are exactly those corresponding to a;bm for k SiS m - 3 and ambj for
m + 3 S j S l. Use (3.73) and its counterpart for the b's to replace bm by
4bm+1 + ~bm+2 in the first case, and am by 4am-1 + ~am-2 in the second case.
After doing so, all resulting coefficients are nonnegative. This completes the proof
that (3.65) is nonnegative when m = n.

Strings of Length Greater than One


Now we tum to the case n > m. We want to regard (3.65) as the sum over
collections of terms, with one collection associated with each point in S, and then
show that the net contribution of each collection is nonnegative. It would appear
that the most natural way to do this is to associate

Q(n, n + 1) + Q(n + 1, n + 2) 1(l=n+2} - LA(n + I)R A(n + 1)

+ (1 - 2t3)LA (n - l)RA (n + 1) + (~ - 13 ) LA (n - I)RA (n + 2) 1(l:o:n+3}

(3.74) + (213 - ~)LA(n - I)R A(n + 3) 1{l:o:n+4} + LA(n + I)RA(n - 1)

9
+LA(n)RA(n - 1) - -LA(n - I)R A(n - 1)
4

+ (13 - t)LA(n - 3)RA(n + 1)


to the point n,
192 Part II. Voter Models

Q(m - 1, m) + Q(m - 2, m - 1)l(k=m-2} - LA(m - l)R A(m - 1)


+ (1 - 2f3)L A(m - l)R A(m + 1)
+ (~ - f3 )LA(m - 2)RA(m + 1) 1(k::om-3)

(3.75) + (2f3 - ~)LA(m - 3)RA(m + 1) 1(k::om-4) + LA(m + l)R A(m-1)

9
+LA(m + l)R A(m) - 4'L A(m + l)R A(m + 1)

+ (f3 - t)LA(m - l)RA(m + 3)

to the point m, and

2L A(J + l)R A(J - 1) + LA(J)RA(J - 1) + LA(J + l)R A(J)


9 9
- 4'L A(J - l)RA(J - 1) - 4'L A(J + 1)RA(J + 1)
(3.76)
+ (1 - 2f3)L A(J - l)R A(J + 1)

+ (f3 - t)[LA(J - 3)RA(J + 1) + LA(J - l)RA(J + 3)]

to j E S\{m, n}.
It turns out that while (3.76) is nonnegative, (3.74) and (3.75) are not necessar-
ily nonnegative. They contain negative terms that apparently cannot be compen-
sated for by positive terms in the same collection. The solution to this difficulty is
to use positive terms in (3.74) to compensate for negative terms in (3.75) and vice
versa. But, since these terms and their compensators may be located quite far from
each other if n - m is large, it seems not to be possible to carry out the trade-off
directly. We must move the terms through the (3.76)'s, using the positivity of the
intervening (3.76)'s to prevent a loss of positivity that might occur otherwise.
With these comments as motivation, we will now write down the collections
that we will show are nonnegative. The idea is to add to (3.74-3.76) bilinear
expressions in the LA'S and RA 's that satisfy the natural symmetry conditions and
whose total contribution to (3.65) is zero. To simplify the notation, put

Here is the most general such choice:

(3.74) + C\[D\(n - 1) - D\(n - 2)]


(3.74')
+ C2[D2(n - 1) - D2(n - 3)] + C3[D 3(n - 2) - D3(n - 3)],

(3.75) + C\[D\(m) - D\(m + 1)]


(3.75')
+ C2[D2(m - 1) - D2(m + 1)] + C3[D3(m - 1) - D3(m)],
and for each j E S\{m,n},
3. Models with Threshold = I 193

(3.76) + CI[DI(j) + DI(j - 1) - DI(j - 2) - DI(j + 1)]


(3.76') + C2[2D 2(j - 1) - D 2(j - 3) - D 2(j + 1)]
+ C3[D3(j - 1) + D3(j - 2) - D3(j - 3) - D3(j)].

Note that the sum over all j E S of all the added terms is zero, by a telescoping
series type of argument, so that

(3.74) + (3.75) + L (3.76) = (3.74') + (3.75') + L (3.76').


jES\{m,nj jES\{m,nj

Therefore, we need to show that each of the terms on the right is nonnegative if
the constants C I , C2 and C3 are chosen appropriately.
We will look at (3.74'), (3.75') and (3.76') separately. The first two are related
by a symmetry, so it is enough to consider one of them. We begin with (3.76') for
a fixed j E S\{m, n}. Write

L(i) = L A\{j-2,jJ(i), R(i) = RA\{j,j+2j(i).


Then arguing as we did in the case m = n, the following quantities are nonnegative:
aj-4 = L(j - 3), ai = L(i + 1) - L(i) for j - 4 < i :::: j,
ai = 0 for i < j - 4,
bj+4 = R(j + 3), bi = R(i - 1) - R(i) for j :::: i < j + 4,
bi = 0 for i > j + 4.
Solving for LA and RA as before leads to

LA(j - 3) = aj-4, LA(j - 2) = aj-4 + aj-3,


1
LA(j - 1) = 2'(aj-4 + aj-3) + aj-2,
1
LA(j) = 2'(aj-4 + aj-3) + aj-2 + aj_l,
LA(j + 1) = (~+ fJ ) (aj-4 +aj-3) + ~(aj-2 +aj_l) +aj'
and

RA(j + 3) =
bj+4, RA(j + 2) = bj +4 + bj+3,
1
RA (j + 1) = 2(bj +4 + bj+3) + bj+2,
1
RA (j) = 2(bj+4 + bj +3) + bj+2 + bj+l ,
RA(j - 1) = (~ + fJ) (bj+4 + bj +3) + ~(bj+2 + bj+l) + bj .
Substituting into (3.76') leads to
194 Part II. Voter Models

(3.77) (aj-4, aj-3, aj-2, aj_l, aj)M(bj +4, bj+3, bj+2, bj+l, bj ),

where M = Mo + elMI + e2M2 + e3M3 and the Mi's are the matrices
2f32 - if3 2f32 - ~f3 + ft -if3 + tz 2f3 - ft 2f3 - ~

2f32 - ~f3 + ft 2f32 - ~f3 +! -~f3 + f2 2f3 - ft 2f3 - ~

3
Mo= i - 2f3 8

2R
}J
-..!.
16 2f3 - ft 3
8
3
2 2

I
2f3 - ~ 2f3 - ~ -4 2 2
I
i - 2f3 i - 2f3 2 0 -1

I
i - 2f3 i - 2f3 2 0 -1

I I
MI= 2 2 2 0

0 0 0 0

-1 -1 0 0 0
I I
i - 2f3 ~-f3 2 -2 -1
I
~-f3 2 0 0
I
M2 = 2 2 0 0

I
-2 0 0 0 0

-1 0 0 0 0
and I
0 2 0 -1 0

I
2 0 0

M3 = 0 0 0 0

-1 0 0 0 0

o 0 0 0 0
To check that (3.77) is nonnegative, we need to know how to choose the
constants el, e2 and e3, and to do that, we need to consider also the case (3.74').
Thus we defer the verification that (3.77) is nonnegative a bit. By analogy with
the earlier cases, in considering (3.74'), it is natural to let
3. Models with Threshold = I 195

L(i) = L A\(n-2.n./}(i), R(i) = RA\(n-2.n}(i),

a n-4 = L(n-3), ai = L(i+1)-L(i) for n-4 < is I, ai = 0 for i < n-4,

and

b l = R(l - 1), b i = R(i - 1) - R(i) for n - 4 < i < I, b i = 0 for i > l.

These are nonnegative as before.


Noting the similarity between (3.66) and (3.74), we will apply the argument
that led to (3.67), though the formulas are a bit different. Now

LA(n - 3) = a n-4,
LA(n - 2) = an-4 + an-3,
1
LA(n - 1) = 2(an-4 + an-3) + an-2
1
LA (n) = 2(an-4 + an-3) + an-2 + an-I,
LA(i) = [1 - u(i - n + 2) - ~U(i - n)}an- 4+ an-3)

+ [1 - u(i - n) ](an-2 + an-I) + an + ... + ai-l

if n < i S I, and if 1 = n + 2,

LA(n + 3) = c: 1
f3 - ~~)(an-4 +an-3) + (f3 + ~)(an-2 +an-l)
+ 2(an +an+l) +an+2.

U sing the inequality LA (i + 1) - LA (i) :::: ai for n < i < 1 as before, and dropping
the terms involving ai for i > n (aU of which are nonnegative), the analogues of
(3.68), (3.69), and (3.70) are ::::

[(f3 + ~) (an-4 + an-3) + ~(an-2 + an-I) + an] [bn + ~bn+l]


+ bn+2[ (~f3 - 312) (an-4 + an-3) + ~an-2 - (~- f3 )an-l - ~an]
if 1 = n + 2,
196 Part II. Voter Models

bn[ (,8 + ~) (an-4 + an-3) + ~ (an-2 + an-I) + an]


+ bn+{ (~,8 + 156) (an-4 + an-3) + ~(an-2 + an-I) + ~an]
+bn+2[ (372 -~,8 ) (an-4 +an-3) + (~-,8 )an-2 - ~an-I - ~an]
15 1)
+ bn+3[ (,8 2 - 8,8 + 4 (an-4 + an-3) + 4 - 4,8 an-2 (1 3)
-~(l - ,8)an-1 - (1 - ,8)an]

if 1= n + 3, and

if I ::: n + 4.
The terms that appear in (3.74) that do not appear in (3.66) (with m replaced
by n) are

(~-,8 )LA(n - I)RA(n + 1) + LA(n + I)RA(n -1) + LA(n)RA(n - 1)

-~LA(n - 1)RA(n - 1) + (,8 - ~ )LA(n - 3)RA(n + 1).

Writing this in terms of the a's and b' s gives

(bn+2+ ... + bt )[ (~,8 + ~ )an-4 + (~ - ~,8 )an-3 + (~ -,8 )an-2]


+[~(bn+1 + ... +bt ) +bn][ - (~-,8 )(an-4 +an-3) - ~an-2 + ~an-I + an].
Also,
3. Models with Threshold = I 197

(3.74') - (3.74) = [b n+2 + ... bl ] [(CI + C2)an-2 + (~C2 + C3 )an- 3 ]


+ bn+1 [Clan-2 - (~C2 + C3 )an-4]
- bn[Clan-3 + (CI + C2)an-4].
Therefore, if 1= n + 2, (3.74') is ::::

where N2 is the matrix

~f3 - fz
2f3 - .l
16 2f3 - ~ - C I

I
~ +CI -4
3
~+f3 :2 2

o 2 2

If 1= n + 3, (3.74') is ::::

(3.79) (a n-4, an-3, an-2, an-I, an)N3(b n+3, bn+2, bn+l , bn),

where N3 is the matrix


f32 - ~f3 + ! I f3 3
4 + TI 3f3 + t; 2f3 - ~

f32_¥f3+~ ~ - ~f3 3f3 + 1t; 2f3 - ~

N3= ~ - ~f3 ! - 2f3 7


"8 -4
I

5
1.4 + 1.f3
2 "8 2 2

-! + f3 I
4 3 2

0 0 -!C2 - C3 -C I - C2

!C2 + C3 !C2 + C3 0 -C I

+ CI +C2 C I +C2 CI 0

0 0 0 0

0 0 0 0
198 Part II. Voter Models

Finally, if I :::: n + 4, (3.74') is ::::

(3.80)

where N4 is the matrix


fJ2 - 1-
64 fJ2 - ~fJ + ~ 3
32 + 41 fJ 3fJ - .l
32 2fJ - ~

fJ2 - fJ + b. fJ 2 _¥fJ+* fz - ~fJ 3fJ - .l


32 2fJ - ~

N4 = ft - !fJ ~ - ~fJ ! - 2fJ 7


"8 -4
1

16
3
+ 2"1 fJ 14 + IfJ
2
5
"8 2 2

-~ + fJ -! + fJ 4
1
3 2
0 0 0 -!C2 - C3 -C 1 - C2

!C2 + C3 !C2+ C3 !C2 + C3 0 -Cl

+ C 1 +C2 C 1 +C2 C 1 +C2 C1 0

0 0 0 0 0

0 0 0 0 0
Next we need to decide how to choose values of C 1 , C2, C3 that make it
possible to show that (3.77)-(3.80) are all nonnegative. Looking first at (3.77),
note that aj-4 and bj+4 are values of Land R, while the other a's and b's are
differences of such values. This means that the latter ones may well be much
smaller in size than the former. In particular, it will be hard to make (3.77)
nonnegative unless the upper left entry of M is nonnegative, and the top row (or
equivalently, left column) of M has a nonnegative sum. In other words, we will
need

(3.8Ia)

and

(3.8Ib) 11
32
( 45)
2fJ 2 + 4fJ - - - C 1 2fJ + - - C2 fJ + - ( 13)
8
1 - O.
- 2 -C3 >

Looking at (3.80), we will certainly want to make the coefficient of L:=n+5 bi


nonnegative when the a's are constant:
3. Models with Threshold = I 199

(3.81c)

So, a reasonable strategy is to choose C\, C2 , C3 so that the left sides of (3.81)
are all zero. Using the value of f3 from Proposition 3.11 gives (rounded to four
decimals):
C\ = -.1231, C2 = .2729, C3 = .0133.
Using these values leads to
o +,0693 +.0687 +.0873 -.2252

+.0693 +.1385 +.1937 +.2370 +.0477

M = +.0687 +.1937 +.2500 +.2519 -.2500

+.0873 +.2370 +.2519 1.5000 2.0000

-.2252 +.0477 -.2500 2.0000 2.0000


+.1059 +.0873 -.2252

+.2309 +.2370 +.0477

N2 = +.2500 +.2519 -.2500

+.5248 1.5000 2.0000

o 2.0000 2.0000
+.0164 +.1312 +.3620 -.2252

+.1414 +.2562 +.5118 +.0477

N3 = +.2627 +.3502 +.7519 -.2500

+.3249 +.6250 2.0000 2.0000

-.3502 +.2500 3.0000 2.0000


and
+.0068 +.0164 +.1312 +.2058 -.2252

+.1318 +.1414 +.2562 +.3556 +.0477

N4 = +.1374 +.1503 +.3502 +.7519 -.2500

+.2624 +.3249 +.6250 2.0000 2.0000

-.4752 -.3502 +.2500 3.0000 2.0000


200 Part II. Voter Models

Most entries in the above matrices are nonnegative, but there are a few negative
ones that we must deal with. Start with M. The negative entries in the comers are
compensated by the positive entries in the first column and row respectively. To
see this, use (3.63) to write

aj-3 = L [u(j - 3 - i) - u(j - 2 - i)]LA(i),


i::;j-4.iEA

(3.82)
aj= L [u(j-i)-u(j-i)]LA(i).
i::;j-4,iEA

Therefore, in order for the overall contribution of the first column of M to (3.77)
to be nonnegative, we need

.0693[ u(i) - u(i + 1)] + .0687[u(i + 1) - u(i + 2)]


(3.83)
+.0873[u(i + 2) - u(i+ 3)] - .2252[ u(i + 3) - u(i + 4)] ::: 0

for i ::: 1. Recalling that the sum of the coefficients above was taken to be zero (in
making (3.8Ib) an equality), (3.83) follows immediately from Proposition 3.43(ii)
for i ::: 2. Using the values in Table 1, we see that (3.83) is true for i = 1 as well.
The - .25 entries in M are compensated by the entries just below and to the
right respectively. In order for this to be true, we would need

2aj_l - .25aj-2 ::: 0,

or using (3.82) again,

2[u(i + 2) - u(i + 3)] - .25[u(i + 1) - u(i + 2)] ::: 0, i::: 1.

But this follows from Proposition 3.43(iv). This (together with the analogues with
a's replaced by b's) completes the proof of the nonnegativity of (3.77).
The treatment of the negative entries in Nl, N2, N3 is similar. Since the sum
of the entries in the first row of N2 is negative, it might appear that there would
be a problem in that case. So, we will treat this case only, leaving the other entries
in the N's to the reader. We need to check that

(3.84) .1059bn +2 + .0873bn + 1 - .2252bn ::: O.

The analogue of (3.73) is bn +2 + 2bn + 1 ::: 4bn . An analogous argument gives


bn+2 ::: 2b n +1• These two inequalities are more than enough to check (3.84).
Finally, we have to check that the last expression in (3.80) is nonnegative. For
this, given our choices of C 1, C2, C3, we need

Using (3.63) again, it suffices to show that


4. Notes and References 201

1 1 1
S[u(i + 1) - u(i + 3)] + 4[u(i + 3) - u(i + 4)] - 2:[u(i + 4) - u(i + 5)] 2: 0

for i 2: O. For i = 0, this follows from the values in Table 1, while for i 2: 1, it
follows from Proposition 3.43(ii).
This completes the proof of the nonnegativity of (3.65) for all choices of
k < m < n < l. Together with the already proved nonnegativity of the Q's, this
shows that (3.59) is nonnegative for all finite A C Zl, and hence that (3.50) is
true. Therefore, the proof of Theorem 3.49 is complete.

4. Notes and References

Results from Section 1


Nonlinear voter models were introduced and first studied by Cox and Durrett
(1991). Further references to work on these models are given below in discussing
the results from Sections 2 and 3. First, we will discuss linear voter models.
Quite a few results have been proved about the linear voter model since the
publication of IPS. In order to discuss some of them, consider the nearest neighbor
process, i.e., the one with rates (1.1) with p(x, y) = 1/2d for Iy - xl = 1, and
p(x, y) = 0 otherwise.

Clustering in Two Dimensions. Recall from Theorem 1.3 that the two dimensional
linear voter model clusters, and in fact, that the critical dimension for clustering
is 2, in the sense that higher dimensional linear voter models coexist. Cox and
Griffeath (1986b) and Bramson, Cox and Griffeath (1986) quantify the way in
which this clustering occurs. Take the initial distribution to be the product measure
v p with density p: v p {1) : 1) (x) = I} = p for all x E Z2. The limiting behavior of
the voter model can be described in terms of the Fisher-Wright diffusion process
Y (t) on [0, 1], which is the process with generator
1
Qf(x) = /I
2:x(l - x)f (x).

Take the initial condition for this process to be Y(O) = p. Then for a E [0, 1], the
following limiting statements hold as t -+ 00:
(a) For a E [0, 1], {TJt(xt a / 2), x E Z2} converges in distribution to {~(x), x E
Z2}, where the limit is an exchangeable Bernoulli random field with

In other words, ~ is a mixture of product measures (by de Finetti's Theorem), with


the mixing distribution given by the distribution of Y at time 10g(1/a). Note that
the content of this statement in the extreme cases is: If a = 1, ~ has distribution
vp , so this says that the opinions at sites that are ,Ji apart are asymptotically
independent at time t. This is easy to see from (1.2), since in two dimensions,
202 Part II. Voter Models

a random walk that starts at xy't, x =F 0, will not have hit 0 by time t with
large probability. If a = 0, Y is being viewed at time 00. Since Qf = 0 for
f(x) = x, yet) is a (bounded) martingale. Its limit exists a.s., and cannot be in
(0,1). Therefore, yet) -+ 1 with probability p and yet) -+ 0 with probability
I - p. So, the above result reduces in this case to a special case of Theorem 1.4.
(b) The block averages

converge in distribution (jointly for finitely many a's) to Y(log ~).


(c) The width N (t) of the largest square centered at the origin on which IJt is
constant satisfies
10gN(t) L
---::::}-
logt 2'
where L is the hitting time of {O, I} for the process Y (log A).
As is usually the case, the proofs of these results are based on an analysis of
the dual process of coalescing random walks - see (1.2) for the statement of this
duality.

Occupation Times. Consider the linear voter model IJt on Zd, and let Tt be the
occupation time of the origin up to time t:

Tt = 1t 1]s (O)ds.

Cox (1988) established a central limit for Tt when the initial distribution is one
of the nontrivial invariant measures if d 2: 3, or an appropriate deterministic
configuration if d = 2. This followed earlier work by Cox and Griffeath, in which
the initial distribution is a product measure.
Bramson, Cox and Griffeath (1988) proved the following large deviation results
for Tt when the initial distribution is the product measure vp: For any a E (p, 1)
there are positive constants C 1, C2 (depending on d and a) so that for large t,

e- C1 (logt)2 :::: p(Tt/t > a) :::: e-C21ogt if d = 2,

e- CI0 :::: a) :::: e- C20


P(Tr/t > if d = 3,
e-Clt/logt :::: p(Tt/t > a) :::: e-C2t/logt if d = 4,
e- C1t :::: P(Tr/t > a) :::: e- C2t if d 2: 5.

Consensus Times for Finite Systems. Cox (1989) treats a topic somewhat analogous
to that discussed for the contact process in Section 3 of Part I. Consider the
nearest neighbor linear voter model on the box {I, ... , N}d, regarded as a torus by
identifying opposite sides. Use the initial distribution vp. This process is eventually
4. Notes and References 203

absorbed into one of the traps '1 == 0, '1 - 1. Let TN be this absorption time.
Then

-N2 '*TN
T if d = 1,
TN
~--
N210gN
'* T if d = 2,

-Nd '*
TN
T if d ::: 3,

where T is a random variable with a distribution (depending on d) that can be


computed explicitly. If d ::: 2, this limiting distribution is described again in terms
of the Fisher-Wright diffusion process. See Cox and Greven (1990) for related
results.

Rescaling Linear Voter Models. In Section 5 of Part I, we mentioned briefly some


results by Mueller and Tribe and by Durrett and Perkins on convergence of rescaled
contact processes to super Brownian motion and/or solutions of stochastic partial
differential equations. Analogous results for rescaled linear voter models have been
obtained by Mueller and Tribe (1995) and Cox, Durrett and Perkins (2000).

Modified Linear Voter Models. Several variants of the linear voter model have been
studied. For example, Granovsky and Madras (1995) have considered the model
obtained by adding constants to the transition rates for 0 ~ 1 and 1 ~ O. Ferreira
(1990) analyzes a one dimensional voter model in a random environment.
Sudbury (1999) studies the process '1t on {O, l}ZI in which

if '1 (x) = 0,
if '1(x) = 1.
The nearest neighbor one dimensional voter model corresponds to do = d\ = 1.
He proves that if do = 1 or 2, d\ > do, and the initial configuration contains
infinitely many blocks of 1's of length at least d\, then '1t converges weakly to
the pointmass on all 1'So

Other Voter Models. Mountford (1992) considers a class of voter models that
includes linear, but not threshold models, and proves a weak form of clustering
for them in one dimension. He assumes that the process is finite range, that c(x, '1)
satisfies a mild positivity condition, and most importantly, that the generator Q
satisfies the following condition: if fn ('1) = Llxl::n '1 (x), then

(4.1) sup IQfn('1)1 < 00.


n,~

The role of this last condition is to guarantee that fn ('1t) is almost a martingale.
(If Qf = 0, then f('1t) is a martingale.) The conclusion is that 80 and 8\ are the
only extremal invariant measures that are translation invariant. To check (4.1) for
finite range linear voter models, write
204 Part II. Voter Models

Qln(TJ) = I: I:p(x, Y)[TJ(Y) - TJ(X)] = I:p(O, Z) I: [TJ(Z +X) - TJ(x)],


Ixl:sn y Z Ixl:sn

so that
IQln(TJ)1 :::: 2 I: Izlp(O, z) < 00.

To show that (4.1) is not satisfied for the threshold voter model with JV
{- T, ... , T} (in which case we know the process clusters by Theorem 2.6), note
that if TJ is a configuration in which intervals of zeros of length T alternate with
intervals of ones of length T + 1, then

Qln(TJ) = I: [1 - 2TJ(x)),
Ixl:sn

and this quantity is asymptotic to - 2i:l as n --+ 00.

Muititype Voter Model with Mutation. This is a model in which there are infinitely
many potential types, rather than two. The types are indexed by the interval (0, 1),
so that a configuration TJ is a point in (0, l)zd. There are two kinds of transitions:
(a) (Dispersal) For nearest neighbor pairs x, y, site Y adopts the type of site x
1
at rate U'
(b) (Mutation) Each site Y adopts a new type chosen at random from (0, 1) at
rate a> 0.

°
Note that for each a E (0, 1), y (x) = 1(ry:ry(x)~cr) is an ordinary linear voter
model with additional spontaneous flips from to 1 at rate a(1 - a) and from 1
°
to at rate eta. This process has a unique invariant measure with density 1 - a.
Bramson, Cox and Durrett (1996, 1998) have used the multitype voter model
with mutation for small mutation rate et in two dimensions as a model to study
the abundance of species. They begin by observing that the process has a unique
stationary distribution, to which the distribution at time t converges as t --+ 00,
for any initial configuration. That should not be surprising, in view of the above
comment. The proof is a straightforward application of duality. Let ~ E (0, 1)Z2
have that stationary distribution.
Their first paper is devoted to the question of how the number of species in
a region depends on the size of the region. For r > 0, let N r •a be the number of
distinct types in the restriction of ~ to the square centered at the origin of side
length L', where L = 1/,Ja. They prove the following asymptotics for N r •a as
at 0:
(a) If r :::: 1, then
Nra 2
. --+ -
L2r-2(log L)2 7r

°: :
in probability.
(b) If r < 1, then
N r •a => Fr ,
where Fr is a distribution that is of order (1 - r) -1 as r t 1.
4. Notes and References 205

The second Bramson, Cox and Durrett paper obtains results on the relative
abundance of species for this model in a large box as ex t O.

Results from Section 2


Theorem 2.1 is due to Durrett and Steif (1993). They go on to prove more refined
results in the fixation regime, including the following, in one dimension: Let
θ_c ≈ .649 be a solution of an equation identified in that paper. Consider a sequence
of threshold voter models η^k_t with initial distribution the product measure with
density 1/2, and parameters 𝒩_k, T_k so that

    T_k / |𝒩_k| → θ.

Note that by Theorem 2.1, fixation occurs for large k provided that θ > 1/2. Then
for each x ∈ Z^1,

(4.2)    lim_{k→∞} P(η^k_t(x) = η^k_0(x) for all t ≥ 0) = 1   if θ > θ_c

and

(4.3)    lim_{k→∞} P( lim_{t→∞} η^k_t(x) = lim_{t→∞} η^k_t(x + 1) ) = 1   if 1/2 < θ < θ_c.

Durrett and Steif conjecture that a similar result holds for d > 1, but with a
different value of θ_c. (They prove (4.2) in all dimensions - the hard part is (4.3).)
Theorem 2.6 was proved in case T = 1 by Cox and Durrett (1991). The
general case was proved by Andjel, Liggett and Mountford (1992). In the latter
paper, the following results were also proved for the threshold voter model in one
dimension with 𝒩 = {−T, ..., T}:
(a) If the initial distribution μ is translation invariant, then

    lim_{t→∞} √t P^μ(η_t(x) ≠ η_t(x + 1)) = D(μ),

where D(μ) is a constant depending on μ.
(b) If T = 1, then as a function of ρ, D(ν_ρ) is concave on [0, 1/2] and convex
on [1/2, 1]. Furthermore,
    lim_{ρ↓0} D(ν_ρ)/ρ = 2.
(c) If T > 1, then

    lim_{ρ↓0} D(ν_ρ)/ρ^{T−1} = 0   and   liminf_{ρ↓0} D(ν_ρ)/ρ^{T} > 0.

As usual, ν_ρ denotes the product measure with density ρ.

A generalization of the T = 1 version of Theorem 2.6 appears in Andjel
and Mountford (1998). The result is that finite range, translation invariant, one
dimensional spin systems cluster provided that the flip rates satisfy the following
assumption: c(x, η) = 0 if and only if η(x) = η(x − 1) = η(x + 1). In particular,
the system is not required to be attractive.
Theorem 2.17 and Corollary 2.21 come from Section 5 of Durrett (1995). The
stronger result with c = 1/4 was proved by Durrett (1992). He also proved there
that the constant 1/4 is sharp for Theorem 2.17, in the sense that if

    liminf_{n→∞} T_n / |𝒩_n| > 1/4,

then the threshold contact process with λ = 1 does not have a nontrivial invariant
measure for large n. Conjecture 6.1 in Durrett (1995) is that 1/4 is sharp for Corollary
2.21 also, in the sense that if

(4.4)    1/4 < lim_{n→∞} T_n / |𝒩_n| < 1/2,

then the threshold voter model clusters for large n. (Recall that by Theorem 2.1,
the process fixates for large n if the limit in (4.4) is > 1/2.)
In Section 3, it is proved that with T = 1, the threshold voter model coexists
in all cases except d = 1, 𝒩 = {−1, 0, 1}. It would be interesting to know what
happens if
    d = 1,   T = 2,   𝒩 = {−n, ..., n},
for example. In this case, we know from Theorem 2.1 that the process fixates if
n = 1, from Theorem 2.6 that the process clusters if n = 2, and from Theorem 2.17
that the process coexists if n is sufficiently large. Cox and Durrett (1991) proved
that it is enough that n ≥ 47 for the process to coexist. They quote computer
simulations to guess that the process clusters if n = 3 and coexists if n ≥ 4.

Results from Section 3


The results in this section are based primarily on Liggett (1994b). Theorem 3.49
was conjectured by Cox and Durrett (1991), and proved by them for all but a few
cases. The proof of the range 7 result given at the beginning of Section 3 comes
from their paper.
The proof of Theorem 3.49 given here is a substantial improvement over the
original in several respects. The proof (in both treatments) has two main parts: (a)
the one on the existence of a well-behaved solution to the convolution equation
(3.9) and corresponding properties of the density f and renewal sequence u given
in Propositions 3.11, 3.17 and 3.43, and (b) the verification of (3.50) for all finite
A ⊂ Z^1. In the original paper, part (a) was based on a computer assisted proof.
The required inequalities on f(n) and u(n) were proved analytically for n ≥ 1000,
while the values of f(n) and u(n) were computed explicitly for n < 1000, and the
inequalities were then obtained by inspection. The computer calculations involved
exact integer arithmetic with integers of nearly 2000 digits, so that the proof was
very computationally intensive. The proof of part (a) given here is entirely analytic,
and eliminates the need for a computer, except to do small calculations that could
be carried out on a calculator.
The improvement in part (b) is more significant. In working out the proof
for this presentation, the author discovered a serious error in the treatment of
some of the cases appearing on pages 777-787 of Liggett (1994b). (In the rest
of this paragraph, equation numbers refer to that paper.) In case 3, for example,
the bilinear expression (3.24) was supposed to be shown to be nonnegative for all
choices of L_A and R_A for which the corresponding L and R satisfy the inequalities
(3.25). In the proof, some reductions were made that led to an expression whose
nonnegativity was checked by verifying it at the extreme points of the convex set
determined by the inequalities. That is fine for a bilinear expression. However,
in the reductions, certain linear terms were replaced by nonlinear terms, thus
invalidating the proof. An example that satisfies (3.25) but for which (3.24) is
negative is

    L(m − 2) = L(m − 1) = R(m + 2) = R(m + 1) = 26,
    L(m) = R(m) = 27,
    L(m + 1) = R(m − 1) = 28,
    L(m + 2) = R(m − 2) = 29,

and correspondingly

    L_A(m − 2) = L_A(m − 1) = R_A(m + 2) = R_A(m + 1) = 26,
    L_A(m) = R_A(m) = 14,
    L_A(m + 1) = R_A(m − 1) = 15,
    L_A(m + 2) = R_A(m − 2) = 21/4 + 26β.

It appears that there is no simple fix for this part of the proof in the paper. The
present treatment is a bit longer, but avoids that pitfall.
There should be ways of simplifying the arguments of Section 3. Given the
increase in the degree of difficulty in the proof in going from nearest neighbor to
second nearest neighbor contact processes, it appears hopeless to use this argument
as it now stands to obtain good upper bounds for the critical values of fairly general
finite range contact processes. Attempts to simplify and improve the proof given
here should be aimed at more general applicability.
In other work on threshold voter models, Handjani (1999) proved a complete
convergence theorem in the context of Theorem 3.49. Here is the statement:

(4.5)    η_t ⇒ α_0 δ_0 + α_1 δ_1 + (1 − α_0 − α_1) ν   as t → ∞,

where ν is the limiting distribution of the process starting with the product measure
with density 1/2, which is nontrivial by Theorem 3.49, and

    α_0 = P(η_t ≡ 0 for some t),
    α_1 = P(η_t ≡ 1 for some t).

In particular, η_t ⇒ ν for any initial configuration with infinitely many zeros and
infinitely many ones. A consequence of this is that ν is the only nontrivial extremal
invariant measure for the process. Her proof of (4.5) uses duality (1.7) in a crucial
way.
Part III. Exclusion Processes

1. Preliminaries

A common feature of the contact processes and voter models that are treated in
Parts I and II is that only one coordinate of the configuration changes at each
time. One consequence of this property is that these processes tend to have only a
few invariant measures - typically there are one or two trivial ones, and then with
a substantial amount of work, one can often prove the existence of a nontrivial
invariant measure. The reason for this scarcity of invariant measures is that the
process has no conserved quantity, i.e., a quantity that does not change with time.
The existence of a conserved quantity tends to break up the state space {0, 1}^S
into classes determined by the value of this quantity, and then there tends to be
an invariant measure for each of its possible values. This corresponds roughly to
the difference between irreducible and reducible Markov chains.
One of the simplest models with a conserved quantity is the exclusion process
that we will study in Part III. This process is usually thought of as modelling
particle motion, and the conserved quantity is the number, or density, of particles.
There are other situations that can be modelled with the exclusion process, though.
One example is traffic flow, where particles are replaced by cars. In another (per-
haps the first studied), the particles are ribosomes (centers of protein production
in cells) that move along messenger RNA as they read genetic information.

Description of the Process

Take a countable set S and transition probabilities p(x, y) for a discrete time
Markov chain on S:

    p(x, y) ≥ 0,    Σ_{y∈S} p(x, y) = 1.

In the exclusion process, particles try to move on S according to independent
continuous time Markov chains on S that have unit exponential holding times,
and jumps from x to y with probability p(x, y). Multiple occupancy is forbidden,
though, so that jumps to occupied sites are not allowed. If a particle tries to jump
from x to y and y is already occupied, then the particle remains at x and begins
a new exponential holding time.
A configuration η ∈ {0, 1}^S is given an occupancy interpretation: η(x) = 1
means that x is occupied by a particle, while η(x) = 0 means that x is vacant.
The generator of the process η_t is of the form (B2), where

    c(x, y, η) = { p(x, y)   if η(x) = 1, η(y) = 0,
                 { 0         otherwise.

The transition η → η_{x,y} when η(x) = 1, η(y) = 0 corresponds to the motion of a
particle from x to y. The fact that particles are neither created nor destroyed leads
to the conserved quantity discussed above.
The analogue of Theorem B3 for the exclusion process is valid, with assump-
tion (B4) replaced by

(1.1)    sup_{y∈S} Σ_{x∈S} p(x, y) < ∞.

(See (0.2) of Chapter VIII of IPS.) Note that (1.1) is automatically satisfied if p
is symmetric, or if S = Z^d and p is translation invariant. This result provides the
formal construction and basic properties of the process.
To get an intuitive feeling about why a condition like (1.1) is needed in or-
der to have a well behaved process, consider what would happen if the initial
configuration η were given by

    η(x) = { 0   if x = x*,
           { 1   if x ≠ x*,

where x* is a fixed site in S. Then the particle at x ≠ x* can be thought of
as waiting an exponential time with parameter p(x, x*), at which time it moves
to x* if it is the first particle to attempt a transition to x*. There will be a well
defined first attempt if and only if the infimum of these exponential times is strictly
positive. The necessary and sufficient condition for this is

    Σ_{x≠x*} p(x, x*) < ∞.

If this sum were infinite, the only reasonable definition of the process (i.e., one
obtained by constructing the process on a large finite part of S and then pass-
ing to a limit) would have η_{0+} ≡ 1, so that the process would not even have
right continuous paths. Assumption (1.1) is just a uniform version of the above
condition.

Invariant Measures

What are the invariant measures for the exclusion process? This question has not
been answered completely, but a lot is known about it. The pointmasses on η ≡ 1
and η ≡ 0 are certainly invariant, since these two configurations are traps for the
process. It turns out that there are many invariant measures that are easy to write
down but do not concentrate on traps. Recall that ℐ is defined to be the set of
all invariant measures for the process. For a function α : S → [0, 1], let ν_α be the
product measure on {0, 1}^S with marginals

    ν_α{η : η(x) = 1} = α(x).

The following is Theorem 2.1 of Chapter VIII of IPS.

Theorem 1.2. (a) If p(·, ·) is doubly stochastic, i.e.,

(1.3)    Σ_{x∈S} p(x, y) = 1,    y ∈ S,

then ν_α ∈ ℐ for any constant α.
(b) If π is a nonnegative function on S and p(·, ·) is reversible with respect to π,
i.e.,

(1.4)    π(x)p(x, y) = π(y)p(y, x),    x, y ∈ S,

then ν_α ∈ ℐ where

    α(x) = π(x) / (1 + π(x)),    x ∈ S.

When (1.3) is satisfied, part (a) of the theorem produces a one parameter family
of invariant measures, indexed by particle density. If π satisfies (1.4), then so does
cπ for any positive constant c, so part (b) generates a one parameter family of
invariant measures as well, though it is not so clear in this case what the parameter
represents.

Example 1.5. An instructive example is provided by the one dimensional, trans-
lation invariant, nearest neighbor exclusion process:

(1.6)    p(x, y) = { p   if y = x + 1,
                   { q   if y = x − 1,
                   { 0   otherwise,

where p + q = 1, 0 < p < 1. Then (1.3) is satisfied, so all translation invariant
product measures are invariant. However, (1.4) is satisfied as well with

(1.7)    π(x) = c(p/q)^x,

where c is a constant. If p = 1/2, then this π is constant, so the invariant measures
produced by part (b) of the theorem are the same as those produced by part (a). If
p > 1/2, though, π grows exponentially rapidly at +∞ and decays exponentially
rapidly at −∞, so the corresponding ν_α concentrates on configurations satisfying

    Σ_{x<0} η(x) < ∞,    Σ_{x≥0} [1 − η(x)] < ∞.

There are countably many such configurations, and they form the disjoint union
of
    X_n = {η : Σ_{x<n} η(x) = Σ_{x≥n} [1 − η(x)] < ∞}

for integers −∞ < n < ∞. When restricted to X_n, η_t is an irreducible countable
state Markov chain, and is positive recurrent since it has a stationary distribution
given by the conditional measure

    ν_n = ν_α( · | X_n),

where α is determined by the π in (1.7). (While ν_α depends on the parameter c
in (1.7), ν_n does not. This is a consequence of the uniqueness of the stationary
distribution of a positive recurrent, irreducible Markov chain.) Note in particular
that ν_α is not extremal, but rather is a convex combination of the ν_n's. Thus we
have identified two one parameter families of invariant measures. One parameter
is discrete; the other is continuous. Liggett (1976) showed that there are no other
extremal invariant measures in this case:

    ℐ_e = {ν_ρ : 0 ≤ ρ ≤ 1} ∪ {ν_n : −∞ < n < ∞}.

If p = 1, the analogue of ν_n is the pointmass on the configuration η given by

    η(x) = { 1   if x ≥ n,
           { 0   if x < n.
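As a quick check (added here; it is immediate but not written out above), the π
in (1.7) does satisfy the reversibility condition (1.4) for the rates (1.6): the only
pairs contributing to (1.4) are nearest neighbors, and for them

    π(x)p(x, x + 1) = c(p/q)^x p = c(p/q)^{x+1} q = π(x + 1)p(x + 1, x).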

Symmetric Systems

Much of Part III deals with asymmetric systems - they turn out to be significantly
more interesting than the symmetric ones, which are defined by

(1.8)    p(x, y) = p(y, x),    x, y ∈ S.

Already we got a hint of this difference in the discussion of Example 1.5, where we
saw that there are more invariant measures for the process with p ≠ 1/2 than for the
one with p = 1/2. However, it is helpful for comparison purposes to review some
of the known results for symmetric systems. So, assume in this subsection that
(1.8) holds. In this case, the process satisfies the following self-duality property -
see Theorem 1.1 of Chapter VIII of IPS:

(1.9)    P^η(η_t ≡ 1 on A) = P^A(η ≡ 1 on A_t),    η ∈ {0, 1}^S, A ⊂ S.

Here A_t is the same exclusion process, but thought of as a process on the collection
of finite subsets of S, with the identification A = {x : η(x) = 1}. Property (1.9) is
the main reason for the difference between the symmetric and asymmetric theories.
One consequence of (1.9) is that the k site marginal probabilities
of the process at time t depend on the initial distribution only through its k site
marginals. This is because the cardinality of A_t does not change with time. This
property does not hold for asymmetric systems - in general even P(η_t(x) = 1)
depends on the full structure of the initial distribution.
The fact that the dependence on the initial distribution is relatively simple
makes it possible to determine all the extremal invariant measures for symmetric
systems. To describe them, let

    ℋ = {α : S → [0, 1] : Σ_y p(x, y)α(y) = α(x), x ∈ S}

be the harmonic functions for p(·, ·) taking values between zero and one. The
following result is proved in Section 1 of Chapter VIII of IPS. The fact that the
μ_α defined below is invariant is a consequence of Theorem B7(e).

Theorem 1.10. Suppose the Markov chain with transition probabilities p(x, y) is
irreducible.
(a) For every α ∈ ℋ,
    μ_α = lim_{t→∞} ν_α S(t)
exists and is in ℐ.
(b) μ_α{η : η(x) = 1} = α(x) for x ∈ S.
(c) μ_α = ν_α if and only if α is constant on S.
(d) ℐ_e = {μ_α : α ∈ ℋ}.

Note that in statement (d), the one-to-one correspondence is between extremal
invariant measures and all harmonic functions between 0 and 1.
Whenever all bounded harmonic functions for p(·, ·) are constant, ℋ consists
only of constants, and so it follows from Theorem 1.10 that the only extremal
invariant measures are the homogeneous product measures. In particular, we have
the following consequence.

Corollary 1.11. Assuming irreducibility, if either (a) the Markov chain with tran-
sition probabilities p(x, y) is recurrent, or (b) S = Z^d and p(x, y) = p(0, y − x),
then
    ℐ_e = {ν_ρ : ρ = constant ∈ [0, 1]}.

In the following example, ℋ is very large, so the exclusion process has many
extremal invariant measures by Theorem 1.10.

Example 1.12. Consider the simple random walk Y_n on the tree T_d in which each
vertex has d + 1 neighbors; Y_n moves to each of its neighbors with probability
1/(d + 1). Take d ≥ 2, so that Y_n is transient. Fix a pair of neighbors x_−, x_+ and write
T_d = S_− ∪ S_+, where S_− consists of all vertices that are closer to x_− than to
x_+, and S_+ consists of all vertices that are closer to x_+ than to x_−. Since Y_n is
transient, it visits x_− only finitely many times, and hence Y_n is either eventually
in S_− or eventually in S_+. Define

    α(x) = P^x(Y_n ∈ S_+ eventually).

This function is harmonic by the Markov property, and can be computed explicitly
as

    α(x) = { 1 / [(d + 1)d^{|x−x_−|}]        if x ∈ S_−,
           { 1 − 1 / [(d + 1)d^{|x−x_+|}]    if x ∈ S_+.

By Theorem 1.10, the corresponding exclusion process has an extremal invariant
measure with marginal probabilities given by this function α. The measure de-
pends on the choice of x_−, x_+, so we see that the exclusion process has many
inhomogeneous extremal invariant measures in this case.
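As a check of the displayed formula for α (added here; the computation is elementary
but not written out above), verify harmonicity at x_−, whose neighbors are x_+ and
d vertices of S_− at distance 1 from x_−:

    (1/(d + 1)) [ d/(d + 1) + d · 1/((d + 1)d) ] = 1/(d + 1) = α(x_−),

and at a vertex x ∈ S_− with |x − x_−| = k ≥ 1, one neighbor is at distance k − 1
and d are at distance k + 1, so

    (1/(d + 1)) [ 1/((d + 1)d^{k−1}) + d/((d + 1)d^{k+1}) ] = 1/((d + 1)d^k) = α(x).

The same computation applies on S_+ by symmetry.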
We conclude our discussion of the symmetric case by stating a convergence
theorem that is proved in Section 1 of Chapter VIII of IPS. For its statement, let
p_t(x, y) be the transition probabilities for the continuous time Markov chain with
unit exponential holding times and transition probabilities p(·, ·).

Theorem 1.13. Suppose that α ∈ ℋ and μ is a probability measure on {0, 1}^S
such that
    lim_{t→∞} Σ_y p_t(x, y) μ{η : η(y) = 1} = α(x),    x ∈ S,
and
    lim_{t→∞} Σ_{u,v} p_t(x, u) p_t(y, v) μ{η : η(u) = 1, η(v) = 1} = α(x)α(y),    x, y ∈ S.
Then μS(t) ⇒ μ_α.

It is important to emphasize that the proofs of all these results depend heavily
on the duality property (1.9). This is not simply a matter of technique. In the
absence of symmetry, the results, and not just the proofs, are different. Example
1.5 showed that Corollary 1.11 is not correct without the symmetry assumption.
So far, we have discussed primarily ergodic properties of symmetric systems.
Many other results are known. Some of them are described in Section 5. At this
point, we mention only one that plays an important role in the proofs of Theorems
1.10 and 1.13. It concerns inequalities that can be viewed (using duality) as com-
parisons between the symmetric exclusion process and systems of independent
Markov chains. Here is one, which is a special case of Proposition 1.7 of Chapter
VIII of IPS: For any A ⊂ S,

(1.14)    P^η(η_t ≡ 1 on A) ≤ Π_{x∈A} P^η(η_t(x) = 1).
Coupling; the Graphical Representation

The main tool that replaces duality in the analysis of asymmetric exclusion pro-
cesses is coupling. The coupling between two versions of the process η_t, ζ_t can
be defined analytically, by writing down the generator for the bivariate process
(η_t, ζ_t), or probabilistically, by using common Poisson processes to construct the
two processes as we did in Parts I and II for contact and voter models. We will
use the latter approach here. The former approach is explained in Section 2 of
Chapter VIII of IPS.
For every ordered pair of sites (x, y), let N_{x,y}(t) be a Poisson process of
rate p(x, y). These Poisson processes are taken to be independent, as usual. In
constructing the process η_t, proceed as follows. At an event time t of the process
N_{x,y}, if η_{t−} had a particle at x but none at y, move the particle from x to y. Thus if
η_{t−}(x) = 1, η_{t−}(y) = 0, we will have η_t(x) = 0, η_t(y) = 1, and η_t(z) = η_{t−}(z) for
all z ≠ x, y. Otherwise, there is no change. We have described this construction
informally - condition (1.1) would come into the argument to show that this
procedure is well defined.
This construction can be used to run two processes η_t, ζ_t simultaneously, using
the same family of Poisson processes to generate the two versions of the exclusion
process. The resulting pair has many useful properties. For example, if the initial
configurations satisfy η_0(x) ≤ ζ_0(x) for all x ∈ S, then η_t(x) ≤ ζ_t(x) for all x ∈ S
and all t ≥ 0. This shows that the exclusion process is attractive in the sense of
(B12, B13).
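The following sketch is an addition to the text (it is not from the original): a
minimal Python implementation of the graphical construction on a ring of N sites,
with the rates, the ring size and the initial configurations chosen arbitrarily for
illustration. It drives two copies with the same Poisson events and checks that the
coordinatewise ordering of the initial configurations is preserved, which is the
attractiveness property just described.

    import random

    def simulate_coupled_exclusion(eta, zeta, p, T, seed=0):
        """Run two exclusion processes driven by the same Poisson events.
        p[(x, y)] is the jump rate from x to y; eta <= zeta coordinatewise."""
        random.seed(seed)
        rates = list(p.items())
        total = sum(r for _, r in rates)
        t = 0.0
        while True:
            # one exponential clock with the total rate, then a pair chosen
            # proportionally to its rate, is equivalent (by superposition)
            # to independent Poisson processes N_{x,y} of rate p(x, y)
            t += random.expovariate(total)
            if t > T:
                break
            u, cum = random.uniform(0, total), 0.0
            for (x, y), r in rates:
                cum += r
                if u <= cum:
                    break
            # apply the same event to both copies: move a particle from x to y
            # only if x is occupied and y is vacant in that copy
            for cfg in (eta, zeta):
                if cfg[x] == 1 and cfg[y] == 0:
                    cfg[x], cfg[y] = 0, 1
        return eta, zeta

    # ring of 10 sites, jumps to the right neighbor at rate 0.7, left at 0.3
    N = 10
    p = {}
    for x in range(N):
        p[(x, (x + 1) % N)] = 0.7
        p[(x, (x - 1) % N)] = 0.3
    eta  = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
    zeta = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]          # eta <= zeta coordinatewise
    eta, zeta = simulate_coupled_exclusion(eta, zeta, p, T=5.0)
    assert all(a <= b for a, b in zip(eta, zeta))  # ordering is preserved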
More generally, say that x is a discrepancy at time t if η_t(x) ≠ ζ_t(x). There
are two types of discrepancy, determined by whether η_t(x) = 0, ζ_t(x) = 1 or
η_t(x) = 1, ζ_t(x) = 0. In the coupling we have defined, discrepancies can move
and they can disappear (if two discrepancies of opposite type meet). However,
discrepancies cannot be created. This is a key property used below.
The connection between invariant measures for the exclusion process and its
coupling is given next. It is Proposition 2.14 of Chapter VIII of IPS.

Proposition 1.15. If μ_1 and μ_2 are invariant for the exclusion process, then there
is an invariant measure ν for the coupled process (η_t, ζ_t) that has marginals μ_1 and
μ_2 respectively. If μ_1 and μ_2 are both extremal, then ν can be taken to be extremal
as well. If μ_1 ≤ μ_2, then ν can be taken to concentrate on {(η, ζ) : η ≤ ζ}.

Translation Invariant Systems

Assume now that S = Z^d, p(x, y) = p(0, y − x) and the symmetrized random walk
with transition probabilities [p(x, y) + p(y, x)]/2 is irreducible. In this case, the
transition probabilities p are automatically doubly stochastic, so that (1.1) holds,
and part (a) of Theorem 1.2 applies. Let 𝒮 be the set of probability measures on
{0, 1}^S that are shift invariant. Here are two analogues of Theorem 1.10 that are proved
in Section 3 of Chapter VIII of IPS. Clearly, they fall substantially short of the
ideal of determining all of the invariant measures for the process in this case. The
proofs are based on coupling. Example 1.5 shows that the mean zero assumption
in part (b) is needed.

Theorem 1.16. (a) (ℐ ∩ 𝒮)_e = {ν_ρ : ρ ∈ [0, 1]}.
(b) If d = 1, Σ_x |x| p(0, x) < ∞ and Σ_x x p(0, x) = 0, then ℐ_e = {ν_ρ : ρ ∈ [0, 1]}.

To see how coupling is used in this context, we outline the proof of part (a)
of Theorem 1.16. Consider the coupled process (η_t, ζ_t). Suppose that the initial
distribution ν is shift invariant, and let ν_t be the distribution at time t. Then shift
invariance implies that

    f(t) = ν_t{(η, ζ) : η(x) ≠ ζ(x)}

is independent of x. The fact that discrepancies cannot be created means that
f′(t) ≤ 0. In fact, writing down this derivative explicitly, and using shift invariance
to cancel all positive terms, one shows that it can be written as a sum of nonpositive
terms. If ν is also invariant for the process, then this derivative is zero, and so
each of these nonpositive terms must be zero. A typical term is

    −[p(x, y) + p(y, x)] ν{(η, ζ) : η(x) = ζ(y) = 1, η(y) = ζ(x) = 0}.

Therefore

    ν{(η, ζ) : η(x) = ζ(y) = 1, η(y) = ζ(x) = 0} = 0

whenever p(x, y) + p(y, x) > 0. Using the irreducibility of the symmetrized
random walk, it follows that ν puts no mass on pairs of configurations that contain
discrepancies of opposite type. In other words,

    ν{(η, ζ) : η ≤ ζ or ζ ≤ η} = 1.

Now, by Proposition 1.15, given any pair of invariant measures for η_t, there
is an invariant measure for (η_t, ζ_t) that has those two measures as marginals. In
the construction used in the proof of that result, shift invariance is preserved, so
that if the original marginal measures are shift invariant, the invariant measure for
the coupled process will also be shift invariant. Applying this to the pair ν_ρ, μ,
where μ ∈ ℐ ∩ 𝒮, it follows that either μ ≤ ν_ρ or μ ≥ ν_ρ for each ρ. Since
this is true for all ρ, it follows that μ = ν_ρ for some ρ.
Even though the full class of invariant measures has not been determined
for asymmetric translation invariant systems, it is still possible to show that the
product measures ν_ρ are extremal.

Theorem 1.17. For each constant ρ ∈ [0, 1], ν_ρ ∈ ℐ_e.

Proof. Let Ω be the generator for the exclusion process with transition probabilities
p(x, y) from x to y, and Ω* be the generator for the process with transition
probabilities p(y, x) from x to y. Since ν_ρ is exchangeable, these generators are
formal adjoints of each other in L²(ν_ρ):

(1.18)    ∫ f Ω g dν_ρ = ∫ g Ω* f dν_ρ

for any cylinder functions f, g. To see this, fix x and y, make the change of
variables η → η_{x,y}, and use the fact that ν_ρ is exchangeable; this makes the terms
involving η_{x,y} cancel in the next computation. Letting T ⊂ Z^d be finite so that
f and g depend only on the coordinates in T, we see that

    ∫ (f Ω g − g Ω* f) dν_ρ = Σ_{x or y ∈ T} p(x, y) ∫ f(η) g(η)[η(y) − η(x)] dν_ρ

    = ∫ f(η) g(η) [ Σ_{x,y∈T} p(x, y)[η(y) − η(x)]
        + Σ_{x∈T, y∉T} p(x, y)[ρ − η(x)] + Σ_{x∉T, y∈T} p(x, y)[η(y) − ρ] ] dν_ρ.

But this is zero, since

    Σ_{x∈T, y∉T} p(x, y)[ρ − η(x)] = Σ_{x∈T} [ρ − η(x)] − Σ_{x,y∈T} p(x, y)[ρ − η(x)]

and

    Σ_{x∉T, y∈T} p(x, y)[η(y) − ρ] = Σ_{y∈T} [η(y) − ρ] − Σ_{x,y∈T} p(x, y)[η(y) − ρ].

Identity (1.18) extends to any g in the domain of Ω and any f in the domain
of Ω*. So, we may replace g by S(t − s)g and f by S*(s)f, where S(t) and S*(t)
are the semigroups corresponding to Ω and Ω* respectively. Therefore,

    ∫ (S*(s)f · Ω S(t − s)g − S(t − s)g · Ω* S*(s)f) dν_ρ = 0

for cylinder functions f, g. It follows from the product rule that

    ∫ (d/ds)[S*(s)f · S(t − s)g] dν_ρ = 0,

and hence that

(1.19)    ∫ f S(t)g dν_ρ = ∫ g S*(t)f dν_ρ.

Since ν_ρ is invariant for both Ω and Ω* by Theorem 1.2(a), the semigroups
S(t) and S*(t) extend to contractions on L²(ν_ρ). To see this, use ‖·‖ and ⟨·, ·⟩ for
the norm and inner product in this space, and write for continuous f

    ‖S(t)f‖² = ∫ [S(t)f]² dν_ρ = ∫ [E^η f(η_t)]² dν_ρ
             ≤ ∫ E^η f²(η_t) dν_ρ = ∫ f² dν_ρ = ‖f‖².

Since these semigroups are contractions and the space of continuous functions is
dense in L²(ν_ρ), (1.19) extends to all f, g ∈ L²(ν_ρ).
The idea of the proof is now the following: Suppose

(1.20)    ν_ρ = (1/2)μ_1 + (1/2)μ_2,

where μ_1, μ_2 ∈ ℐ. We need to show that μ_1 = μ_2 = ν_ρ. To do this, it is enough
to show that μ_1 and μ_2 are invariant for the process with generator Ω + Ω*, since
this is a symmetric exclusion process, and we know by Corollary 1.11 that ν_ρ is
extremal invariant for symmetric translation invariant exclusion processes.
By (1.20), μ_1 is absolutely continuous with respect to ν_ρ, so there is a measur-
able function h so that μ_1 = hν_ρ. In fact, it follows from (1.20) that h is bounded:
0 ≤ h ≤ 2. Since μ_1 is invariant with respect to the process with generator Ω,

    ⟨h, S(t)f⟩ = ∫ h S(t)f dν_ρ = ∫ S(t)f dμ_1 = ∫ f dμ_1 = ∫ hf dν_ρ = ⟨h, f⟩

for any f ∈ L²(ν_ρ). In particular, setting f = h and using the contraction property
of S(t), we get

    ‖S(t)h − h‖² = ‖S(t)h‖² − 2⟨h, S(t)h⟩ + ‖h‖² ≤ 0,

so that S(t)h = h. But now, by (1.19),

    ∫ S*(t)f dμ_1 = ∫ h S*(t)f dν_ρ = ⟨h, S*(t)f⟩
                  = ⟨S(t)h, f⟩ = ⟨h, f⟩ = ∫ hf dν_ρ = ∫ f dμ_1.

It follows that μ_1 is invariant for the process with generator Ω* as well, and
therefore for the process with generator Ω + Ω* as required.

First and Second Class Particles

There is another way to view coupling for the exclusion process. Imagine that
the particles in the system are each called either first class or second class. The
evolution is the same as before, except that if a second class particle attempts to go
to a site occupied by a first class particle, it is not allowed to do so, while if a first
class particle attempts to move to a site occupied by a second class particle, the
two particles exchange positions. In other words, a first class particle has priority
over a second class particle. This rule has no effect on whether or not a given
site is occupied at a given time. The advantage, though, is that viewed by itself,
the collection of first class particles is Markovian, and has the same law as the
exclusion process. The collection of second class particles is clearly not Markovian.
However, the collection of first and second class particles is Markovian, and again
evolves like an exclusion process.
As mentioned earlier, this is just a slightly different way of thinking about
the coupling described above. To see this, let (η_t, ζ_t) be the coupled process, and
assume that η_t ≤ ζ_t at t = 0, and hence for all t. Think of the sites at which
η_t(x) = ζ_t(x) = 1 as being occupied by first class particles, and the set of sites at
which η_t(x) = 0, ζ_t(x) = 1 as being occupied by second class particles. Then the
joint evolution of first and second class particles is exactly that described above.
To see this, note that the coupled process makes the transition

    ζ:  1  1         1  1
                 →
    η:  1  0         0  1
        x  y         x  y

at rate p(x, y), and when viewed in terms of first class and second class particles,
this transition becomes the exchange of positions when the first class particle at x
attempts to move to y, which is occupied by a second class particle.
Clearly, we can also consider third class particles, fourth class particles, etc. In
each case, mth class particles have priority over nth class particles if m < n. The
joint evolution of particles of different classes can again be realized by coupling
several copies of the exclusion process using the graphical representation. If η_t^1 ≤
η_t^2 ≤ ··· ≤ η_t^k coordinatewise, then we regard sites x for which η_t^1(x) = 1 as
locations of first class particles, sites for which η_t^2(x) = 1, η_t^1(x) = 0 as locations
of second class particles, etc. It follows that for any j, the collection of all particles
with class ≤ j is a version of the exclusion process.

The Tagged Particle Process

For the exclusion process on Z^d with p(x, y) = p(0, y − x), it is natural to start
the process off in the equilibrium distribution ν_ρ, except that we put a particle
down at the origin with probability 1, and then to follow the evolution of the
particle that started at 0, which is called the tagged particle. Let X_t be the position
of this particle at time t. Arratia proved the following, which is Theorem 4.13 in
Chapter VIII of IPS.

Theorem 1.21. If d = 1 and p(x, x + 1) = p(x, x − 1) = 1/2, then

    X_t / t^{1/4} ⇒ N(0, √(2/π) (1 − ρ)/ρ)

as t → ∞, where N(0, σ²) denotes the normal distribution with mean zero and
variance σ².

The most interesting feature of this result is the nonstandard normalization: t^{1/4}
instead of t^{1/2}. As we will see in Section 4, this is special to the nearest neighbor,
symmetric, one dimensional case. The small variance is caused by the rigidity of
the system - the original order of all the particles is preserved. Furthermore, when
X_t is large and positive, the tagged particle will tend to see a greater density of
particles to its right than to its left. This tends to slow it down, reducing the spread
of the distribution.

Preview of Part III

In Sections 2 and 3, we will consider the asymmetric system which has been most
heavily and successfully studied - the one with only nearest neighbor jumps in one
dimension. In this case, homogeneous product measures are invariant by Theorem
1.2(a). Our main interest in Section 2 will be to understand the limiting behavior
of the system when the initial distribution is a relatively simple inhomogeneous
product measure. Even in this very special context, it will become clear that
both the results and the techniques are much richer and more interesting in the
asymmetric than in the symmetric case. The techniques used here are primarily
elaborate couplings. There will be close connections with the evolution of shock
profiles in certain partial differential equations.
Section 3 is devoted to what is known as the matrix approach to the exclusion
process. It is a technique that allows one to represent the invariant measure for
the nearest neighbor system on {1, ..., N} with various boundary conditions in a
rather explicit form. This leads to certain concrete computations that give insight
into the corresponding process on Z^1. We will discuss the matrix formulation
only in the special context in which it first arose. Once understood, though, it
can be used to clarify the structure of the shock profiles alluded to above. Other
applications of the matrix approach will be surveyed in Section 5.
One of the things we will have discovered in Section 2 is that understanding
the motion of a tagged particle can be of great help in studying the exclusion
process itself. So, while Sections 2 and 3 are concerned only with the nearest
neighbor system in one dimension, it is of interest to study the tagged particle
process in greater generality. This is the subject of Section 4. The main question
is, what replaces Theorem 1.21 for other choices of transition rates p(·, ·)? The
answer is that one generally gets asymptotic normality, but with the standard
normalization √t.

2. Asymmetric Processes on the Integers

In this section, we take S = Z^1 and p(x, x + 1) = p, p(x, x − 1) = q for all
x ∈ Z^1, where p + q = 1, 1/2 < p ≤ 1. Our main objective is to study the long
time behavior of the exclusion process with initial distribution ν_{λ,ρ}, which is the
product measure on {0, 1}^{Z^1} with marginals

(2.1)    ν_{λ,ρ}{η : η(x) = 1} = { λ   if x < 0,
                                 { ρ   if x ≥ 0.

Schematically, we are in the following context:

    [Figure 4: jumps to the right at rate p and to the left at rate q, with q < p.]

The reason we exclude the case p = 1/2 from consideration is that the situation is
rather simple and uninteresting there. Theorem 1.13 implies in this case that

    lim_{t→∞} ν_{λ,ρ}S(t) = ν_{(λ+ρ)/2},

where as usual, ν_γ denotes the homogeneous product measure with density γ. The
limiting behavior is more complex if p > 1/2. Here is the analogue of the above
limiting statement if p > 1/2:

    lim_{t→∞} ν_{λ,ρ}S(t) = { ν_{1/2}                  if λ ≥ 1/2 and ρ ≤ 1/2,
                            { ν_ρ                      if ρ ≥ 1/2 and λ + ρ > 1,
                            { ν_λ                      if λ ≤ 1/2 and λ + ρ < 1,
                            { (1/2)ν_λ + (1/2)ν_ρ      if 0 < λ < ρ and λ + ρ = 1.

The first three cases correspond to the regions labelled I, II and III respectively in
the figure below, while the fourth corresponds to the line of slope −1.

    [Figure 5: the (λ, ρ) square, with regions I, II, III and the line λ + ρ = 1.]
The most interesting case is the last one, in which the limit is a mixture of
product measures, rather than a single product measure. Note also that, unlike the
symmetric case p = 1/2, the limit is not continuous in (λ, ρ) at the line λ + ρ = 1.
Our major objective in this section is to explain the above result.
The reason for considering this particular initial distribution is twofold. First,
it is about the simplest initial distribution for which one cannot easily guess what
the limiting distribution should be. The second reason is that the answer connects
up with important issues in nonlinear partial differential equations, such as shock
propagation. Here is the basic question we would like to answer: If x(t) is a
reasonable function of t, what is the approximate distribution of ν_{λ,ρ}S(t), viewed
from position x(t)?

Heuristics

We begin with an informal calculation. Let μ_t be the distribution of the process
at time t, and set u(x, t) = μ_t{η : η(x) = 1}. We will often use the shorthand
μ_t{η(x) = 1, η(y) = 0} for cylinder probabilities such as μ_t{η : η(x) = 1, η(y) = 0}.
The exclusion version of Theorem B3 implies that

(2.2)    (d/dt) u(x, t) = p μ_t{η(x − 1) = 1, η(x) = 0} + q μ_t{η(x + 1) = 1, η(x) = 0}
                        − p μ_t{η(x) = 1, η(x + 1) = 0} − q μ_t{η(x) = 1, η(x − 1) = 0}.

The easiest way to work out equations of this type is to consider separately the
positive and negative contributions to the derivative. For the positive ones, ask
what situations can lead to transitions from η(x) = 0 to η(x) = 1 (i.e., transitions
that increase the probability being differentiated), and write one term for each such
situation. The term to be written is simply the rate of the transition multiplied by
the probability that the situation occurs. The negative terms are similar, but the
transitions to be considered are now from η(x) = 1 to η(x) = 0.
We see already in (2.2) the main reason that asymmetric systems are harder to
analyze than symmetric systems: The derivative of a cylinder probability involving
one site contains terms that involve two sites. If p = 1/2, then some cancellation
occurs that permits one to write the right side of (2.2) as

    (1/2)u(x − 1, t) + (1/2)u(x + 1, t) − u(x, t).

This can also be seen as a special case of duality (1.9). In this symmetric case,
(2.2) becomes the discrete heat equation - a discrete version of

    ∂u/∂t = (1/2) ∂²u/∂x².
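The cancellation just mentioned can be made explicit (this step is added here; it
uses only the decomposition of one-site probabilities into two-site ones):

    μ_t{η(x − 1) = 1, η(x) = 0} − μ_t{η(x) = 1, η(x − 1) = 0}
        = [u(x − 1, t) − μ_t{η(x − 1) = 1, η(x) = 1}] − [u(x, t) − μ_t{η(x − 1) = 1, η(x) = 1}]
        = u(x − 1, t) − u(x, t),

and similarly for the pair x, x + 1. When p = q = 1/2, the right side of (2.2) is 1/2
times the sum of these two differences, which is the expression displayed above;
for p ≠ q the two-site probabilities enter with unequal weights and do not cancel.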

If p > 1/2, on the other hand, this cancellation does not occur, and it is not possible
to write the right side of (2.2) in terms of the function u.
If we are content to operate at the heuristic level, we can proceed as follows.
The initial distribution is a product measure. The interesting extremal invariant
measures are product measures - see the discussion of Example 1.5. While the
distribution at time t > 0 is certainly not a product measure, we might not lose
too much if we pretend that it is. Doing so leads to the following approximation
to (2.2):

    (d/dt) u(x, t) ≈ p u(x − 1, t)[1 − u(x, t)] + q u(x + 1, t)[1 − u(x, t)]
                   − p u(x, t)[1 − u(x + 1, t)] − q u(x, t)[1 − u(x − 1, t)].

This is a discrete approximation to Burgers' partial differential equation

(2.3)    ∂u/∂t + (p − q) (∂/∂x)[u(1 − u)] = 0.

Unlike the heat equation that one gets in the symmetric case, (2.3) is nonlinear.
One big difference between these two partial differential equations is the following:
The heat equation is well known to be smoothing - the solution at time t is much
smoother in x than the initial condition. This is not the case for Burgers' equation.
Discontinuities - also known as shocks - can persist for all time, or develop later
even if they are not present initially.
In our case, by analogy with (2.1), the natural initial condition for (2.3) is

(2.4)    u(x, 0) = { λ   if x < 0,
                   { ρ   if x ≥ 0,

which is discontinuous if λ ≠ ρ. The nature of the solution (here we mean the
so-called entropy weak solution - the entropy condition is supposed to pick out
the physically relevant solution when there is nonuniqueness) depends on whether
λ < ρ or λ > ρ. (If λ = ρ, the solution is clearly constant in space and time.)
To see how this works, let's try to find a solution of (2.3) with initial condition
(2.4) that is of the following form:

    u(x, t) = { λ               if x ≤ c_1 t,
              { a(t)x + b(t)    if c_1 t ≤ x ≤ c_2 t,
              { ρ               if x ≥ c_2 t,

where c_1 < c_2 and a(t), b(t) are chosen so that u is continuous. By the continuity
requirement,

    a(t) = (ρ − λ) / [(c_2 − c_1)t],    b(t) = (λc_2 − ρc_1) / (c_2 − c_1).

Substituting into (2.3) gives two linear equations in c_1, c_2, whose solution is

(2.5)    c_1 = (p − q)(1 − 2λ),    c_2 = (p − q)(1 − 2ρ).

All is well provided that this solution satisfies c_1 < c_2. This occurs if λ > ρ, but
not otherwise. In this case, the shock disappears immediately, and the solution is
continuous (though not smooth) for t > 0. If λ < ρ, however, this procedure does
not produce a solution, and the entropy weak solution turns out to be

(2.6)    u(x, t) = u(x − vt, 0),    where v = (p − q)(1 − λ − ρ),

i.e., the shape of the solution does not change, but it moves at velocity v. Note
that this v is the average of c_1 and c_2 in (2.5). In this case, the shock persists, and
moves linearly with speed v. For more on partial differential equations of type
(2.3), see Section 3.4 of Evans (1998).
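The value of v in (2.6) can also be obtained directly from the Rankine–Hugoniot
condition (a standard fact about conservation laws, added here as a check): a
discontinuity joining the states λ and ρ is a weak solution of (2.3) precisely when
it travels at the speed given by the jump in the flux (p − q)u(1 − u) divided by
the jump in u, namely

    v = (p − q)[ρ(1 − ρ) − λ(1 − λ)] / (ρ − λ)
      = (p − q)[(ρ − λ) − (ρ² − λ²)] / (ρ − λ) = (p − q)(1 − λ − ρ),

which is also the average of c_1 and c_2 in (2.5), since (1/2)[(1 − 2λ) + (1 − 2ρ)] =
1 − λ − ρ.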

Basic Assumption; Expected Results

For the remainder of Section 2, we will assume that λ < ρ, since this is the more
delicate and interesting case. It is the case in which shocks persist for solutions
of Burgers' equation, and correspondingly, the limiting behavior of ν_{λ,ρ}S(t) is
discontinuous. Our objective will be to show that the exclusion process behaves
roughly in the following way: there is a (randomly located) shock that moves with
speed v and has Brownian fluctuations about this drift, such that, viewed from the
position of the shock, the distribution of the process is essentially ν_λ far to the left
and ν_ρ far to the right. One consequence of this picture should be the following:
If τ_n is the spatial shift by n units, then

    lim_{t→∞} ν_{λ,ρ}S(t) τ_{[vt + a√t]} = α ν_λ + (1 − α) ν_ρ,

where α = P(W ≥ a) for an appropriate normally distributed random variable W
with mean zero. This statement appears as Theorem 2.93 below.
Expanding a bit on this, let Z_t be the location of the shock at time t. (It is
not obvious yet what this means, but giving it a precise meaning will be the first
order of business.) We will show that

    (Z_t − vt)/√t ⇒ W

as t → ∞ (Theorem 2.90), so that

    lim_{t→∞} P(Z_t ≥ vt + a√t) = P(W ≥ a) = α.

It follows also from this central limit theorem that

    lim_{t→∞} P(|Z_t − vt − a√t| ≤ M) = 0

for every M. Therefore, with probability approximately α, Z_t is far to the right
of vt + a√t, and hence the distribution of the process near vt + a√t will be
approximately ν_λ. On the other hand, with probability approximately 1 − α, Z_t is
far to the left of vt + a√t, and hence the distribution of the process near vt + a√t
will be approximately ν_ρ. This gives rise to the mixture αν_λ + (1 − α)ν_ρ that
appears in the limit.
Results that have been proved in case λ > ρ are described in Section 5. This
case falls within the rubric of general hydrodynamic results that are not restricted
to nearest neighbor processes in one dimension. These more general results apply
only away from the shock in the corresponding partial differential equation. Our
interest here is precisely to determine what happens at the shock itself.

Location of the Shock

The first problem in carrying out this program is to decide what will be meant
by the location of the shock at time t. This is easy in the context of the partial
differential equation (2.3) - it is simply the location of the discontinuity at time
t. For the exclusion process, the natural definition involves the use of a second
class particle. Start the process with a second class particle at 0, and first class
particles distributed on Z^1\{0} according to ν_{λ,ρ}. Note that there will always be
exactly one second class particle in the process. Let Z_t be its location, and let η_t
be the process of first class particles. Then η_t evolves according to the rules of
the exclusion process, as does the process obtained by adding a particle to it at
location Z_t. The distribution of the exclusion process starting from ν_{λ,ρ} itself is
the 1 − ρ, ρ mixture of the distributions of these two processes, since this is true
for the initial condition.
The process Z_t is not Markovian. In fact, conditional on {η_s, s ≤ t}, Z_t moves
in the following way:

(2.7)    x → x + 1 at rate { p   if η_t(x + 1) = 0,
                           { q   if η_t(x + 1) = 1,

         x → x − 1 at rate { q   if η_t(x − 1) = 0,
                           { p   if η_t(x − 1) = 1.

Note again the simplification that occurs if p = q = 1/2. In this case, Z_t is a simple
random walk.
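As an illustration of (2.7) (again an addition, not part of the original text; the ring
size, density and value of p below are arbitrary choices), the following Python
sketch runs the nearest neighbor dynamics with a single second class particle. The
second class particle at Z_t is pushed by first class particles that jump onto its site
and moves on its own only onto empty sites, which realizes exactly the rates (2.7).

    import random

    def simulate_second_class(N=50, density=0.4, p=0.7, T=20.0, seed=1):
        """First class particles do nearest neighbor exclusion on a ring of
        N sites; one second class particle starts at site 0."""
        q = 1.0 - p
        random.seed(seed)
        eta = [1 if random.random() < density else 0 for _ in range(N)]
        Z = 0
        eta[Z] = 0                  # the origin holds the second class particle
        t, total = 0.0, N * (p + q)
        while True:
            t += random.expovariate(total)
            if t > T:
                break
            x = random.randrange(N)
            d = 1 if random.random() < p else -1   # right w.p. p since p+q=1
            y = (x + d) % N
            if eta[x] == 1 and eta[y] == 0:
                eta[x], eta[y] = 0, 1              # a first class particle moves
                if y == Z:
                    Z = x                          # it displaces the second class particle
            elif x == Z and eta[y] == 0:
                Z = y                              # the second class particle moves
        return eta, Z

    eta, Z = simulate_second_class()
    print("final position of the second class particle:", Z)

With p = q = 1/2 the two cases in each line of (2.7) have the same rate, which is
the observation above that Z_t is then a simple random walk.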
Let η̃_t be the process η_t, viewed from position Z_t:

    η̃_t(x) = η_t(Z_t + x).

This process is Markovian on the set of configurations η with η(0) = 0. In this
process, there are transitions corresponding to the exclusion process rules, provided
that neither of the two sites involved in the transition is 0, and in addition there
are (modified) shifts of the entire configuration with rates that derive from (2.7).
For example, if η̃_t(1) = 1, then at rate q the configuration becomes η′, where
η′(−1) = 1 and η′(x) = η̃_t(x + 1) for x ≠ −1, 0. This transition corresponds to
the interchange of the first class particle at site 1 with the second class particle at
site 0.
We will argue that Z_t can be viewed as the approximate location of the shock
in the process η_t. This should mean that, uniformly in t, the distribution of η̃_t is
asymptotic to ν_λ at −∞, and asymptotic to ν_ρ at +∞. This suggests the following
definition: For a probability measure μ on {0, 1}^{Z^1}, we will say that μ ~ ν_{λ,ρ} if

(2.8)    C-lim_{n→−∞} μ{η : η ≡ 1 on A + n} = λ^{|A|}   and
         C-lim_{n→+∞} μ{η : η ≡ 1 on A + n} = ρ^{|A|}

for every finite A ⊂ Z^1. Here C-lim means Cesàro limit. Recall that a sequence
u_n converges to u in the Cesàro sense if

    lim_{N→∞} (1/N) Σ_{n=1}^N u_n = u.

With our notation, this would be written as C-lim_{n→∞} u_n = u.
We will say that a family of probability measures satisfies the property μ ~ ν_{λ,ρ}
uniformly if for every A, the convergence in (2.8) is uniform over that family.
This uniformity is important, since for every fixed t, the fact that the distribution
at time t satisfies (2.8) (even without taking Cesàro averages) is a consequence
of the invariance of homogeneous product measures under the dynamics of the
exclusion process.

Another View of the Shock

We will not show directly that the distribution of η̃_t ~ ν_{λ,ρ} uniformly in t. Instead,
we will show that viewing η_t from a certain location X_t, instead of Z_t, results in
this property, and that for this choice,

(2.9)    sup_t E|X_t − Z_t| < ∞.

But this implies that the process viewed from Z_t satisfies this property also. To see
this, suppose that η is any random configuration, and X and Z are two randomly
chosen sites. Then

    | (1/N) Σ_{n=1}^N P(η ≡ 1 on A + X + n) − (1/N) Σ_{n=1}^N P(η ≡ 1 on A + Z + n) |

        ≤ (1/N) E | Σ_{k=X+1}^{X+N} 1_{{η ≡ 1 on A+k}} − Σ_{k=Z+1}^{Z+N} 1_{{η ≡ 1 on A+k}} | ≤ (2/N) E|X − Z|.
The key to defining the position X_t, and to proving these facts, is to find a
closely related process for which an invariant measure ~ ν_{λ,ρ} can be computed
more or less explicitly. To this end, consider the process (η_t^1, η_t^2, η_t^3) in which the
particles in η^1 are first class, the particles in η^2 are second class, and the particles
in η^3 are third class. Each site contains at most one particle overall, of course.
Defining η_t^{2,3} = η_t^2 + η_t^3, which amounts to not distinguishing between second and
third class particles, we see that (η_t^1, η_t^{2,3}) can be viewed as a process of first and
second class particles.
Recall from our discussion in Section 1 that (η_t^1, η_t^1 + η_t^{2,3}) can be regarded as
the coupling of two exclusion processes, the second of which lies above the first.
Therefore, by Theorem 1.2(a) and Proposition 1.15, together with its extension to
shift invariant situations mentioned in the outline of the proof of Theorem 1.16,
there exists an invariant measure ν for the process (η_t^1, η_t^{2,3}) that is shift invariant,
and such that, with this distribution, η^1 has distribution ν_λ and η^{1,2,3} = η^1 + η^{2,3}
has distribution ν_ρ. Note that ν{(η, ξ) : ξ(x) = 1} = ρ − λ > 0. Even though
the distributions of η and η + ξ under ν are product measures, ν itself is not in
general a product measure. In particular, there is no reason to expect η(x) and
ξ(y) to be independent according to ν for x ≠ y. In fact, the distribution of
{η(x), x < 0; η(x) + ξ(x), x ≥ 0} is not ν_{λ,ρ}.
If the process (η_t^1, η_t^2, η_t^3) is started from a configuration in which η_0^{2,3}(0) = 1,
let X_t be the position at time t of the particle that began at the origin at time
0, using the (η_t^1, η_t^{2,3}) interpretation. To be more specific, X_t is defined so that it
does not move when there are interchanges of positions of second and third class
particles. Define (η̄_t^1, η̄_t^2, η̄_t^3) as the process (η_t^1, η_t^2, η_t^3) viewed from X_t:

    η̄_t^i(x) = η_t^i(X_t + x).

The process (η̄_t^1, η̄_t^{2,3}) is defined analogously. Let S(t) and S̄(t) be the semigroups
of (η_t^1, η_t^{2,3}) and (η̄_t^1, η̄_t^{2,3}) respectively. Write M for the set of shift invariant
probability measures μ on {(η, ξ) : η + ξ ∈ {0, 1}^{Z^1}} such that μ{(η, ξ) : ξ(0) =
1} > 0. For μ ∈ M, let

    μ̂(·) = μ(· | ξ(0) = 1)

be the measure on {(η, ξ) : ξ(0) = 1} obtained by conditioning on the presence
of a second class particle at the origin.
When applied to shift invariant measures, there is a close relationship between
the action of S(t) and the action of S̄(t):

Proposition 2.10. (a) If μ ∈ M, then

    μ̂ S̄(t) = (μS(t))^,

where ^ denotes conditioning on {ξ(0) = 1} as above.
(b) If ν ∈ M is invariant for the process (η_t^1, η_t^{2,3}), then ν̂ is invariant for the
process (η̄_t^1, η̄_t^{2,3}).

Proof. Part (b) is an immediate consequence of part (a). It may be instructive
to give two proofs of part (a). The first is based on a generator computation,
while the second is shorter and may be regarded as more probabilistic. Let Ω and
Ω̄ be the generators of (η_t^1, η_t^{2,3}) and (η̄_t^1, η̄_t^{2,3}) respectively. Then, recalling that
η(x) + ξ(x) ∈ {0, 1}, these can be written (for cylinder f) as

    Ωf(η, ξ) = Σ_{η(x)=1, η(y)=0} p(x, y)[f(η_{x,y}, ξ_{x,y}) − f(η, ξ)]
             + Σ_{ξ(x)=1, η(y)=ξ(y)=0} p(x, y)[f(η, ξ_{x,y}) − f(η, ξ)]

and

    Ω̄f(η, ξ) = Σ_{x,y≠0, η(x)=1, η(y)=0} p(x, y)[f(η_{x,y}, ξ_{x,y}) − f(η, ξ)]
              + Σ_{x,y≠0, ξ(x)=1, η(y)=ξ(y)=0} p(x, y)[f(η, ξ_{x,y}) − f(η, ξ)]
              + Σ_{η(x)=1} p(x, 0)[f(τ_x η_{0,x}, τ_x ξ_{0,x}) − f(η, ξ)]
              + Σ_{η(y)=ξ(y)=0} p(0, y)[f(τ_y η, τ_y ξ_{0,y}) − f(η, ξ)],

where the subscript x, y on η and ξ means that the x and y coordinates are
interchanged, and τ_y shifts a configuration y units, to bring the second class particle
back to the origin: τ_y η(u) = η(u + y). The first two sums in the expression for Ω̄
give the contributions from transitions that do not involve the second class particle
at the origin, while the last two sums are contributions from transitions involving
that particle.
We need to relate Ω and Ω̄, and this is done via the mapping T that is defined
by

    Tf(η, ξ) = ξ(0)f(η, ξ).

In the following computation, separate the terms corresponding to x, y ≠ 0, x =
0, y = 0:

    Ω[Tf(η, ξ)] = ξ(0) Σ_{x,y≠0, η(x)=1, η(y)=0} p(x, y)[f(η_{x,y}, ξ_{x,y}) − f(η, ξ)]
                + ξ(0) Σ_{x,y≠0, ξ(x)=1, η(y)=ξ(y)=0} p(x, y)[f(η, ξ_{x,y}) − f(η, ξ)]
                + η(0) Σ_y ξ(y) p(0, y) f(η_{0,y}, ξ_{0,y}) − ξ(0) Σ_{η(y)=ξ(y)=0} p(0, y) f(η, ξ)
                − ξ(0) Σ_{η(x)=1} p(x, 0) f(η, ξ) + [1 − η(0) − ξ(0)] Σ_{ξ(x)=1} p(x, 0) f(η, ξ_{x,0}).

In doing this computation, it is important to remember that η(x) + ξ(x) ≤ 1, since
this property is used to simplify the constraints. There is a lot of cancellation in
computing the difference we are interested in, and this cancellation leads to

    TΩ̄f(η, ξ) − Ω[Tf(η, ξ)] = Σ_x ξ(0)η(x)p(x, 0) f(τ_x η_{0,x}, τ_x ξ_{0,x})
                             + Σ_y ξ(0)[1 − η(y) − ξ(y)]p(0, y) f(τ_y η, τ_y ξ_{0,y})
                             − Σ_y η(0)ξ(y)p(0, y) f(η_{0,y}, ξ_{0,y})
                             − Σ_x ξ(x)[1 − η(0) − ξ(0)]p(x, 0) f(η, ξ_{x,0}).

We will rewrite this expression in terms of the functions

    G_0(η, ξ) = η(0)ξ(1)f(η_{0,1}, ξ_{0,1}),
    G_1(η, ξ) = [1 − η(0) − ξ(0)]ξ(−1)f(η, ξ_{0,−1}),
    G_2(η, ξ) = η(0)ξ(−1)f(η_{0,−1}, ξ_{0,−1}),
    G_3(η, ξ) = [1 − η(0) − ξ(0)]ξ(1)f(η, ξ_{0,1}).

Then

    G_0(τ_{−1}η, τ_{−1}ξ) = ξ(0)η(−1)f(τ_{−1}η_{0,−1}, τ_{−1}ξ_{0,−1}),
    G_1(τ_1 η, τ_1 ξ) = [1 − η(1) − ξ(1)]ξ(0)f(τ_1 η, τ_1 ξ_{0,1}),
    G_2(τ_1 η, τ_1 ξ) = ξ(0)η(1)f(τ_1 η_{0,1}, τ_1 ξ_{0,1}),
    G_3(τ_{−1}η, τ_{−1}ξ) = [1 − η(−1) − ξ(−1)]ξ(0)f(τ_{−1}η, τ_{−1}ξ_{0,−1}).

Therefore,

    TΩ̄f(η, ξ) − Ω[Tf(η, ξ)] = p[G_0 ∘ τ_{−1} − G_0 + G_1 ∘ τ_1 − G_1](η, ξ)
                             + q[G_3 ∘ τ_{−1} − G_3 + G_2 ∘ τ_1 − G_2](η, ξ).

Here (G ∘ τ)(η, ξ) = G(τη, τξ) represents the composition of G with the shift.
It follows that for any translation invariant measure μ,

(2.11)    ∫ {TΩ̄f(η, ξ) − Ω[Tf(η, ξ)]} dμ = 0.

Next we need to obtain a similar relation for the two semigroups. To do this,
write

    TS̄(t)f − S(t)Tf = ∫_0^t (d/ds)[S(t − s)TS̄(s)f] ds
                     = ∫_0^t S(t − s)[TΩ̄ − ΩT]S̄(s)f ds.

Integrating with respect to a shift invariant measure μ leads to

(2.12)    ∫ [TS̄(t)f − S(t)Tf] dμ = ∫_0^t ∫ [TΩ̄ − ΩT]S̄(s)f d[μS(t − s)] ds.

Now apply (2.11) to the function S̄(s)f and the measure μS(t − s) (which is also
shift invariant) to conclude that the right side of (2.12) is zero. But by definition,

    ∫ f d[(μS(t))^] = ( ∫ S(t)Tf dμ ) / μS(t){(η, ξ) : ξ(0) = 1},

and

    ∫ f d[μ̂S̄(t)] = ∫ S̄(t)f dμ̂ = ( ∫ TS̄(t)f dμ ) / μ{(η, ξ) : ξ(0) = 1}.

The denominators in the two expressions above are equal, because μ is transla-
tion invariant, and second class particles are neither created nor destroyed by the
evolution, so this completes the first proof of part (a).
For the second (version of the) proof, note from the last display, that what we
need to show is

    ∫ S(t)Tf dμ = ∫ TS̄(t)f dμ

for cylinder functions f. For any x so that η_0^{2,3}(x) = 1, let X_t^x be the position
at time t of the particle that was originally at x. Then, breaking the left side up
according to the initial location of the η^{2,3} particle that is at the origin at time t,
we see that

    ∫ S(t)Tf dμ = ∫ E^{(η^1, η^{2,3})}[η_t^{2,3}(0) f(η_t^1, η_t^{2,3})] dμ

    = ∫ [ Σ_x η^{2,3}(x) E^{(η^1, η^{2,3})}(f(η_t^1, η_t^{2,3}); X_t^x = 0) ] dμ

    = ∫ [ Σ_x η^{2,3}(x) E^{(τ_x η^1, τ_x η^{2,3})}(f(τ_{−x}η_t^1, τ_{−x}η_t^{2,3}); X_t^0 = −x) ] dμ

    = ∫ [ Σ_x (τ_x η^{2,3})(0) E^{(τ_x η^1, τ_x η^{2,3})}(f(τ_{−x}η_t^1, τ_{−x}η_t^{2,3}); X_t^0 = −x) ] dμ

    = ∫ [ Σ_x η^{2,3}(0) E^{(η^1, η^{2,3})}(f(τ_{−x}η_t^1, τ_{−x}η_t^{2,3}); X_t^0 = −x) ] dμ

    = ∫ TS̄(t)f dμ.

The third equality above comes from the translation invariance of the process,
while the fifth comes from the translation invariance of the initial measure μ.

Remark. An analogue of Proposition 2.10 for the exclusion process itself (i.e.,
the process consisting only of first class particles) appears later as Proposition 4.3.

An Invariant Measure for the Process Viewed from X_t

Now we will take ν as in Proposition 2.10(b), and use the corresponding ν̂ to
construct a family of invariant measures ν̂_c for the full process (η̄_t^1, η̄_t^2, η̄_t^3). This is
done in two stages. First, choose (η^1, η^{2,3}) according to ν̂. Then number the second
and third class particles that occur in this random configuration consecutively, so
that the particle at the origin is numbered 0 (recall that there is always a second
or third class particle at the origin for configurations taken from ν̂), the next η^{2,3}
particle to the right is numbered 1, the first η^{2,3} particle to the left of the origin
is numbered −1, etc. Fix a constant c > 0. For the particle numbered n, call
it a second class particle with probability c(p/q)^n / (1 + c(p/q)^n), and call it a
third class particle with probability 1 / (1 + c(p/q)^n), with the choices being made
independently from particle to particle. If p = 1, q = 0, the interpretation is that
the particle at the origin is taken to be a second class particle with probability c/(1 + c),
all particles to the right of the origin are second class particles, and all particles
to the left of the origin are third class particles.
The choice of these particular probabilities is motivated by the invariant mea-
sures constructed in Example 1.5. Note that if λ = 0, ρ = 1, then according to
the ν̂_c we have constructed, there are no first class particles, every site is occu-
pied by either a second or a third class particle, and the second class particles are
distributed according to the invariant measure described in Example 1.5.

Proposition 2.13. The measure ν̂_c is invariant for the process (η̄_t^1, η̄_t^2, η̄_t^3).

Proof. Let Ω̄* be the generator for the process (η̄_t^1, η̄_t^2, η̄_t^3). By Theorem B7(b), we
need to show that for every cylinder function f of three variables, ∫ Ω̄*f dν̂_c = 0.
We can write

    Ω̄* = Ω̄*_1 + Ω̄*_2,

where Ω̄*_1 consists of all the terms in the sum corresponding to transitions that
change the value of (η̄_t^1, η̄_t^{2,3}), and Ω̄*_2 is the rest of the summands, i.e., the ones
that involve an exchange between neighboring second and third class particles.
Recall that such exchanges do not affect X_t, so that Ω̄*_2 contains no translations.
Since ν̂ is invariant for (η̄_t^1, η̄_t^{2,3}), and the process with generator Ω̄*_1 does not
change the labelling of second vs. third class particles, ν̂_c is invariant for this
process, and hence ∫ Ω̄*_1 f dν̂_c = 0. It remains to show that ν̂_c is invariant for the
process with generator Ω̄*_2. In fact, it turns out to be reversible with respect to
this process. To see this, we may consider the configuration (η̄^1, η̄^{2,3}) to be fixed,
since the process with generator Ω̄*_2 does not change it. Consider two adjacent
sites x, x + 1 such that

    η̄^{2,3}(x) = η̄^{2,3}(x + 1) = 1.

Let n and n + 1 be the numbers associated with these two particles in the assignment
of second/third class labels. Then reversibility is simply the statement that after
conditioning on everything except the second/third class labels at those sites,

(2.14)    p P(particle n is second class, particle n + 1 is third class)
            = q P(particle n is third class, particle n + 1 is second class).

But this is just

    p · [c(p/q)^n / (1 + c(p/q)^n)] · [1 / (1 + c(p/q)^{n+1})]
        = q · [1 / (1 + c(p/q)^n)] · [c(p/q)^{n+1} / (1 + c(p/q)^{n+1})],

which is clearly true.

Remark. More generally, if μ is any shift invariant initial distribution for the
process (η_t^1, η_t^{2,3}), we will define μ̂_c via the same procedure that was used to
construct ν̂_c from ν̂. The corresponding result is that the distribution of the sec-
ond/third class labellings for the process (η̄_t^1, η̄_t^2, η̄_t^3) (relative to the particle at X_t)
is stationary in time.

The Process X_t Identifies the Shock

Say that μ has good marginals if it is shift invariant and the pair (η, ξ) with
distribution μ has the property that η has distribution ν_λ and η + ξ has distribution
ν_ρ. An example of such a measure is the ν that was defined just before Proposition
2.10. Another is the measure in which sites are independently given a first class
particle with probability λ and a second class particle with probability ρ − λ.
Note that the property of having good marginals is preserved by the evolution of
(η_t^1, η_t^{2,3}) - see Theorem 1.2(a). We will need to know that in a uniform sense, the
second class particles cannot be spaced too far apart with respect to any measure
with good marginals:

Lemma 2.15. There exists a constant C so that

    μ{(η, ξ) : ξ(k) = 0 for all 1 ≤ k ≤ n} ≤ e^{−Cn},    n ≥ 1,

for all μ with good marginals.

Proof. Choose 0 < ε < (ρ − λ)/2. Then

    μ{(η, ξ) : ξ(k) = 0 for all 1 ≤ k ≤ n}
        ≤ μ{(η, ξ) : |(1/n) Σ_{k=1}^n η(k) − λ| > ε}
          + μ{(η, ξ) : |(1/n) Σ_{k=1}^n [η(k) + ξ(k)] − ρ| > ε}
        = ν_λ{η : |(1/n) Σ_{k=1}^n η(k) − λ| > ε} + ν_ρ{η : |(1/n) Σ_{k=1}^n η(k) − ρ| > ε},

which decays exponentially rapidly to zero by the large deviations theorem for
independent Bernoulli random variables. (See Section 1.9 of Durrett (1996), for
example.)

Theorem 2.16. Let (1/1,1/;,1/;) be the coupled process of first, second and
third class particles. Assume that (1/6, 1/~,3) has distribution Ii, where f1 has good
marginals. Let X t be the position of the particle in 1/;,3 that began at the origin. If

(2.l7) L 1/~(x) < 00 and L 1/~(x) < 00 a.s.,


x<O x>O
2. Asymmetric Processes on the Integers 233

°
(both sums = if p = 1) then 11:,2 = 11: + 11; is a version of the exclusion process
such that the distribution of rx, 1It '" v)",p uniformly in t.

Proof Note first that changing the labelling of the second and third class particles
does not affect the process Xt. This is important, since we will consider different
labellings in the proof. Fix a c > 0, and let (t;/, t;/, t;?) be a version of the three
class process, but with initial distribution 7I~, and let t;/,2 = t;/ + t;/. Without loss
of generality, we can assume that (t;J, l;~,3) = (116, 11~,3), and then the coupling
maintains this relation at later times. By (2.17),
(2.18)

when t = 0. But the basic properties of the coupling imply that the probabilities
in (2.18) are increasing in t for each c. Therefore, the convergence in (2.18) is
uniform in t.
Now write
(2.19) P(rX,1I:,2 = 1 on A + n) ~P(rx,l;/,2 = 1 on A + n) + p(1I:. 2 i l;/,2).
°
The second term tends to as c -+ 00 uniformly in t by (2.18). So, we need to
show that the first term on the right side of (2.19) tends in the Cesaro sense to
AlAI as n -+ -00 and to piAl as n -+ +00 for each value of c > 0, uniformly in
t. This will give one inequality, and the other comes from using

P(rX,1I:,2 = 1 on A + n) 2: P(rx,l;/,2 = 1 on A + n) - p(1II· 2 1:. t;/,2)


instead of (2.19), and letting c tend to zero instead of 00 when using (2.18). We
will write out the argument in the case n -+ -00 only, since the other case is
similar.
First we will show that l;/,2 can be replaced by l;/ in the statement that the
first term on the right of (2.19) tends to AlAI in the Cesaro sense. To do so, since
d,2 = l;/ + t;/, it is enough to show that
(2.20)

uniformly in t. For n < 0, let M t = L;2n l;t2,3(Xt + k) and write


P(rx,t;t2(n) = 1) = p(t;t2(X t +n) = 1)
= E[ E[l;t2(Xt + n)I{l;t2,3(Xt + k), k E Zl}]]
= E[l;2.3(X t + n) c(p/q)-M, ]
(2.21) t 1 + c(p/q)-M,

~ E[t;t2,3(Xt + n)c(:) -M']


234 Part III. Exclusion Processes

where';l, ';2, ... are Bernoulli random variables, distributed according to the law of
the second class particles at time t for the two class process with initial distribution
/L. This is a consequence of Proposition 2.10.
We will now use Lemma 2.15 to show that the right side of (2.21) tends to
zero as n ---+ -00. Since Lemma 2.15 is unifonn in the measure /L with good
marginals, this conclusion will be unifonn in t. For n > 0, write

For any positive integer k,

P(Skn < k) :::: kP(Sn = 0) :::: ke- Cn .


Therefore
Er Skn :::: ke- Cn + rk,
where r = <J. < 1. Letting n ---+ 00 first, and then letting k ---+ 00, shows that Er Sn
tends to 0, ~nd hence the right side of (2.21) tends to 0 unifonnly in t.
It remains to show that

C- lim P(rx,s/
n--+-oo
= I on A + n) = AlAI
unifonnly in t. To do so, use Proposition 2.10 to write

t
I~ n=1 P(rx,s/ = 1 on A - n) - AlAI I ::::
p
~ AE ~ It 1{~=1
n=1
on A-n} - NAIAII,

where'; is distributed like the first class particles in the two class process at time
t with initial distribution /L. But this distribution is just VA, so the right side above
tends to zero by the law of large numbers.

Remark. If /L in Theorem 2.16 is taken to be the product measure with each site
occupied by a first class particle with probability A and a second class particle
with probability p - A, and the two sums in (2.17) are taken to be zero, then 111. 2
has initial distribution vA•P on ZI\{O}.

The Process Zt also Identifies the Shock


One unpleasant aspect of what we have done so far is that the process rx,111,2
considered in Theorem 2.16 is not Markov. Since rz,I1J,2 is Markov, this is a
better process to consider. We will now show that X t and Zt are not too far apart,
so that either can be thought of as the location of the shock.
The main idea is to construct the process (111, 11;, 11;, Xio Zt) jointly in such a
way that the relative positions of X t and Zt are in equilibrium. The joint construc-
tion is made via the graphical representation as usual. The first three components
are regarded again as the locations of first, second, and third class particles, re-
spectively. X t is the location at time t of the 11;,3 particle that started at the origin.
As before, it does not move when second and third class particles interchange
2. Asymmetric Processes on the Integers 235

positions. Zt is the location of an rd,3 particle that moves with a priority inter-
mediate between those of second and third class particles - it has priority over
third class particles, while second class particles have priority over it. The copy of
the exclusion process that we focus on is YJ;,2 = yJ; + YJ;. By the priority rule we
have chosen for Zt, (YJ;,2, Zt) is a copy of the exclusion process, together with the
location of a particle that is second class with respect to it. Thus the evolutions
of the processes (YJ;, YJ;, YJ;, X t ) and (YJl,2, Zt) are consistent with the definitions
given earlier in this section.
Next we must choose an initial distribution for the process (YJ;, YJ;, YJ;, X t , Zt).
We want to do it in such a way that the relative positions of X t and Zt are in
equilibrium. The construction is similar to that used in the context of Proposition
2.13, but now we use a probability measure m(·) on ZI and a collection of numbers
satisfying 0 < mk(l) < 1. Initially, choose (YJ6, YJ~,3) according to Ti, where /-L is
a measure with good marginals, and put Xo = O. Next number the YJ~,3 particles
consecutively as before, with the particle at 0 being numbered O. Independently

'*
of what we have done so far, choose Zo to be the YJ~,3 particle numbered k with
probability m(k). Finally, if Zo = k and I k, the YJ~.3 particle numbered I is
called second class with probability mk(l) and third class otherwise.
Consider for a moment the case p = 1. Suppose that initially Xo = Zo = 0
and the YJ~,3 particles at negative sites are third class, and those at positive sites are
second class. Then at all future times, X t = Zt, and the YJ;,3 particles at sites to
the left of X t are third class, and those at sites to the right of X t are second class.
In particular, the relative positions of X t and Zt are automatically in equilibrium,
and conclusions (2.25a,b) below are automatic. For this reason, we exclude the
case p = 1 from the next result.

Theorem 2.22. Suppose that! < p < 1 and that m(k) and mk(l) are chosen so
that
qm(k)mk(k + 1) = pm(k + l)mk+l (k),
(2.23) mk(k + 1)] = qm(k + 1)[1 - mk+l(k)],
'*
pm(k)[1 -
pmk(l)[1 - mk(l + 1)] = qmk(l + 1)[1 - mk(l)], I, I + 1 k.

Let Xt(k), -00 < k < 00, be the locations of the YJ;,3 particles, when numbered in
order, with Xt(O) = XI' Then

(2.24)
and
1
L
00

(2.25a) EIZt - Xtl = -- Iklm(k).


p - A. k=-oo

Furthermore, for every n :::: 0 there is a constant C so that

L
00

(2.25b) EIZt - Xtl n :::: C Iklnm(k), t:::: O.


k=-oo
236 Part III. Exclusion Processes

Remarks. (a) It is not hard to check that the following is the most general solution
to (2.23): Take a positive function a(k) on Zl and put
m(k + 1) p + qa(k)
(2.26a)
m(k) q + paCk)

and

(2.26b)
mk(l) = (fl)k-l x { a(k - 1) if I < k,
1 - mk(l) P a(k) if I > k.
Note that the resulting m is summable if a(k) tends to 0 at -00 and to 00 at +00.
In fact, it can be normalized to make a probability measure that has exponential
tails in both directions. In particular, the right sides of (2.25a,b) are finite.
(b) If the initial distribution of (11:, 11;, 11;, X t , Zt) is modified by conditioning
on an event of positive probability, and m has a finite first moment, then it follows
from (2.25a) that
supEIZt - Xtl < 00
t~O

for this modified initial configuration as well. A natural example of such an event
IS

{ZO = 0, L 116 (x) = 0, L 11~(X) = oJ.


x<O x>O

If 11 is the product measure with good marginals, then the distribution of 116,2 =
11& + 116 after this conditioning is vA,p on ZI\{O}.

Proof of Theorem 2.22. Giving a completely formal proof of (2.24) would involve
introducing a lot of notation that would obscure the main point, so we will argue
somewhat informally. First recall that the process {(11;, 11;·3), s 2: O}, is Markov,
and that {X s, s 2: O}, is measurable with respect to it. Furthermore, while the
transitions of {(11;, 11;,3), s 2: O}, can change the locations of the 11;,3 particles,
they do not change the labellings of these particles as second vs. third class, or the
determination of which of these particles is the one with location Zt. To check the
latter statement, note that any transitions of (111, 11;,3) that affect Zt correspond to
a first class particle switching positions with Zt, or Zt moving to an empty site.
Therefore it will be enough to check that the labellings as second vs. third class
particles and the choice of which of these is at Zt are in equilibrium with respect
to the part of the evolution that does not change (11;, 11;,3),
Any such transition involves two adjacent sites, x, x + 1, that are occupied by
11;,3 particles, not both of which have the same class. We will use some shorthand
to describe the situation at these two sites, The possible situations are called

fez, 2, k), (z, 3, k), (2, z, k), (3, z, k), (2,3, k), (3,2, k), k E Zl},

The third coordinate k is determined by Xt(k) = Zr, The first two coordinates are
the classes of the particles at x, x + 1 respectively, with 2=second class, 3=third
2. Asymmetric Processes on the Integers 237

class and z = the special Zt particle. Thus (z, 2, k) denotes the situation in which
Zt = Xt(k) = x, rd(x + 1) = 1, for example. The possible transitions and their
rates are

(z, 2, k) ---+ (2, z, k + 1) at rate q


(z, 3, k) ---+ (3, z, k + 1) at rate p
(2, z, k) ---+ (z, 2, k - 1) at rate p
(3, z, k) ---+ (z, 3, k - 1) at rate q
(2,3, k) ---+ (3,2, k) at rate p
(3,2, k) ---+ (2,3, k) at rate q.

Note, as observed before the statement of the theorem, that if p = 1, then the
third coordinate k above does not change if there are no (z, 3)'s or (2, z)'s in the
configuration. The above transitions may be easier to visualize in the following
form:

z 2 2 z
--.q
I
Z = X(k) X(k + 1) X(k) Z = X(k+l)

z 3 3 z
I --.
p
I
Z = X(k) X(k + 1) X(k) Z = X(k+l)

2 z z 2
--.p

X(k - 1) Z = X(k) Z = X(k-l) X(k)

3 z z 3
I --.q
I
X(k - 1) Z = X(k) Z = X(k-l) X(k)

2 3 3 2
I I --.p

X(j) X(j + 1) X(j) X(j + 1)

3 2 2 3
I I --.q
I
X(j) X(j + 1) XC)~ X(j + 1)
Figure 6
238 Part III. Exclusion Processes

The rates are shown above the arrows. X (k) is the special Z particle, and in the
last two transitions, k -=1= j, j + 1, i.e., the special Z particle is not at x or x + l. The
three equalities in (2.23) are exactly the detail balance, or reversibility, conditions
for these transitions. This establishes not only (2.24), but the stronger fact that the
full assignment of classes to the particles in rJ~,3 is in equilibrium.
To prove (2.25a), observe first that (2.24) implies that the event {Zt = X t (k)} is
independent of {(rJ.;, rJ;,3), s 2: O}, and hence of the sequence {Xt(k), -00 < k <
oo}, which is measurable with respect to it. By Proposition 2.10, the distribution
of rJ~,3 is a shift invariant measure, conditioned on rJ~,3 (0) = 1, so

£ll[X t (k + 1) - Xt(k), rJ;,3(0) = 1] = 1, -00 < k < 00

by Theorem B47. Therefore, since P (rJ~,3 (0) = 1) = P - A,

EIZt - Xtl = L E(IZt - Xtl I Zt = Xt(k))P(Zt = Xt(k))


k

= L E(IXt(k) - Xt(O)1 I Zt = Xt(k))P(Zt = Xt(k))


(2.27)
k
1
= - ~ Iklm(k).
P-A ~

The proof of (2.25b) is similar, noting that by Lemma 2.15, Xt(k + 1) - Xt(k)
has moments of all orders that are bounded in t.

Behavior of the Shock - First Moments


The next order of business is to determine the asymptotics of XI and Zt. The
analysis is based on the idea of a current - the net flow of particles across the
origin. More formally, let 1/ be the number of particles in rJ: that were in (-00, 0]
at time 0 but are in (0, 00) at time t, minus the number of particles in rJ: that
were in (0, 00) at time 0 but are in (-00, 0] at time t. The number of particles
that cross the origin by time t is dominated by a Poisson random variable, so
1/ has moments of all orders. Consistently with previous practice, we will write
1(2,3 = I? I?,
+ etc. Note that 1(2,3 is then the current for the process rJ;,3.
We begin with the behavior of the mean of X t and Zt. This will already
identify the v in (2.6) as the speed of the shock.

Theorem 2.28. In the context ofthe discussion preceding the statement of Theorem
2.22,
EXt = vt,
where v = (p - q)(1 - P - A). If, in addition, m(·) has mean zero, then

EZt = vt.

Remark. One way to give m mean zero is to make it symmetric about the origin.
If m is written as in (2.26), it is symmetric about 0 if and only if
2. Asymmetric Processes on the Integers 239

(2.29) a(k)a(-k - 1) = 1, k E Zl.

This symmetry is easy to achieve, while still making m have exponential moments.
It is enough to define {a(k), k ~ O} so that limk-+oo a(k) = 00, and then define
{a(k), k < O} by (2.29).

Proof of Theorem 2.28. The second statement follows from the first by doing the
computation in (2.27) without the absolute values. This leads to
1
E(Zt - Xt) = - Lkm(k) = O.
p - A k

To prove the first statement, consider the process rd·


3 . For any x such that

1]~.3 (x) = 1, let X~ be the location at time t of the particle that was at x at time
O. Then the current of 1];.3 particles can be written as

(2.30) J/. 3 = L l{x;>o} - L 1{X;:<:,O}.


x<O x>O
~~·J(~)=l ~~·J(x)=l

Taking expected values with respect to the process with initial distribution fJ., with
good marginals and using translation invariance, we have

x:<:,O x>O
(2.31 )

Now we need to compute the left side of (2.31) in a different way, to determine
the value of EXt. Write

(2.32) Jt2.3 = J 1,2,3 _ Jl


t t.

The two terms on the right side depend only on the marginal processes 1]1,2,3
and 1]1 respectively, and these are in equilibrium under the evolution with initial
distribution fJ.,. At a time when the distribution of the exclusion process is y,
the rate at which particles cross from (-00, x] to (x, (0) is just py(1xOx+l) -
qy(Oxlx+l), where we have used the shorthand for cylinder probabilities from
(2.2). Therefore
d d
(2.33) -Ell Jl = (p - q)A(1 - A) and -Ell J/,2,3 = (p - q)p(1- p).
dt t dt

Combining (2.32) and (2.33) gives

Ell Jt2.3 = (p - q)[p(l - p) - A(l - A)]t = (p - A)vt,


240 Part III. Exclusion Processes

where v = (p - q)(1 - A - p). Comparing this with (2.31) gives EXt = vt as


required.

Behavior of the Shock - Weak Law of Large Numbers


Theorem 2.28 suggests that X t and Zt satisfy a law of large numbers. Here is
the statement for X t - the analogous result for Zt can be easily deduced from
Theorem 2.22. This law of large numbers is not only of intrinsic interest, but will
be useful in analyzing the second moments of the shock location shortly.

Theorem 2.34. Suppose /L has good marginals. Then


. Xt
hm - = v
t---+oo t
in Ll with respect to pli.

Proof For any configuration rJ and x E Z 1, let N (x, rJ) be the signed number of
particles in (0, x]:
if x > 0,
(2.35) if x < 0,
if x = O.
Then, since particles of a given class do not change their order,

(2.36) 1t2,3 = N(X t, .,2,3)


'It

on the event {rJo' (0) = I}. So, we need to consider laws oflarge numbers for 1:.
23 .

The first observation is a stronger form of (2.33):

(2.37) Mt = 1/ - P lot l{ry~(O)=I,ry~(l)=ojds + q lot l{ry1(o)=o,ry1(l)=I}ds


is a martingale. To check this, it is enough by the Markov property to show that
Ery Mt = 0 for any initial configuration rJ and any time t. But this is just the
argument that led to (2.33).
By the martingale property,
2 2 2
(2.38) E ( Mt - Ms ) = EMt - EMs, 0< s < t.

To compute the left side for small values of t - s, write

EryM; =Ery(1/)2 + oCt), t.J, 0


=ptl{ryi(O)=l.ryi(l)=Oj + qtl{ryi(O)=o,ryi(l)=lj + oCt), t.J, 0
This implies that

E(Mt+h-Mtf = phP(rJl(O) = 1, rJl(l) = O)+qhP(rJl(O) = 0, rJl(l) = l)+o(h)


2. Asymmetric Processes on the Integers 241

as h '" O. Combining this with (2.38) gives


d 2
dt EMt =).,(1 - ).,),

and then
EM; =).,(l - ).,)t.

In particular,
Mt
(2.39) --+0 t-+oo
t '

in probability.
Since VA is extremal invariant for the exclusion process (see the discussion
of Example 1.5 and Theorem 1.17), ry1 is a stationary and ergodic process by
Theorem B52. Therefore, the ergodic theorem (Theorem B50) gives

a.s. Combining this with (2.37) and (2.39) gives


Jl
~ -+ (p - q)).,(l -).,)
t

in probability with respect to pfL. A similar statement holds for J/,2.3 with ).,
replaced by p, so by (2.32),
J 2 ,3
(2.40) _t_ -+ (p _ ).,)v
t

in probability with respect to pfL.


To translate (2.40) into a weak law for Xt, assume that v > 0, since the other
cases are handled similarly. Then (2.36) and (2.40) combine to give
1 X,
(2.41 ) - "" ry2,\y) -+ (p _ ).,)v
t~ t
y=1

in probability with respect to pli. On the other hand, since the distributions under
pfL of ry1 and ryt1,2,3 are independent of t, the weak law of large numbers for
independent Bernoulli random variables gives
1 I rt
L ry1 (y) -+ ).,r,
rt
t - "" ryl,2,3(y) -+ pr
t ~ t
y=1 y=1

in probability with respect to pfL. Therefore,


1 rt
(2.42) - "" ry2,3(y) -+ (p _ ).,)r
t ~ t
y=1
242 Part III. Exclusion Processes

in probability with respect to pll-. The statement of the theorem in the sense of
convergence in probability comes from comparing (2.41) and (2.42). To check LI
convergence, note that IXII is dominated by a Poisson process, so that

{Xdt,t> I}
is uniformly integrable.

Behavior of the Shock - Second Moments


Now we consider the second moments of XI and ZI' Use Var to denote the
variance of a random variable: Var(W) = EW2 - (Ewf. If it is necessary to
specify the initial distribution of the process with respect to which a variance is
to be computed, we will do so via a superscript, as usual. Here is the statement
of the relevant asymptotics:

Theorem 2.43. Suppose TJI has initial distribution v).,p on ZI\{O), with a second
class particle placed at the origin. Then ZI, the location o/the second class particle
at time t, satisfies
. Var(ZI) p(l - p) + A(l - A)
D = hm = (p - q) .
1-+00 t p - A

Remark. Note that this expression for the variance tends to 00 when p - A --+ O.
This suggests that if p = A, the motion of the second class particle is superdiffu-
sive. See Spohn (1991) for a discussion of this.

The proof of Theorem 2.43 is based on a number of reductions, which are stated
below as propositions. Throughout this discussion, t-t is the product measure with
good marginals, and Ii is t-t conditioned on TJ 2 ,3 (0) = 1. As in the case of first
moments, we will prove a result analogous to that in Theorem 2.43 for XI first,
and deduce the result for ZI easily from it later.
In (2.31), we related the first moments of 112 ,3 and XI' Let's try to do the same
thing for second moments. The first observation is that the ordering of the tagged
TJ;,3 particles is preserved, so using the notation from the proof of Theorem 2.28,
we see that
(2.44) x < y and TJ~,3(x) = TJ~'\Y) = 1 implies X~ < xi.
In particular, the product of any pair of summands in (2.30), one from the first
sum, and one from the second sum, is zero. Writing (2.30) in the form 1/,3 =
1?,3,+ - 112,3,-, where the terms on the right are defined as the two sums in (2.30),
it follows that
(2.45)
To compute the expected value of the first term on the right of (2.45), square out
the sum and use (2.44) again, to obtain
2. Asymmetric Processes on the Integers 243

(2.46) £!-'[1/.3.+]2 = 2 L PI'(11~·3(x) = 11~,3(y) = 1, X; > 0) + EI' Jt2•3.+.


y<x:,::O

Arguing as in (2.31),

(2.47)

where x+ denotes the positive part.


For the next part of the computation, we need to introduce two additional
processes, to be used when the initial distribution is 7I. In this situation, 116 (0) = 0
and 11~·3 (0) = 1. Let V t be the position at time t of the particle starting at 0,
defining its motion by saying that it has priority intermediate between the 111
particles and the 11;·3 particles that did not start at 0, while Vt is the position at
time t of the particle starting at 0 whose motion is defined by saying that it has
lower priority than all the other particles. All these processes are defined in terms
of the original graphical representation.
Instead of computing the first term on the right of (2.46) directly, consider the
following closely related expression:

y<x:,::O

- PI'(11~·3(y) = 1, X; > 0 I 11b· 2 .3(x) = 0)]


(2.48) + (p - A) L [PI'(11~·3(y) = 1, X; > 0 111~·3(x) = 1)
- PI'(11~·3(y) = 1, X; > 0 111~·3(x) = 1)]
+A L [PI'(11~·3(y) = I,X; > 0 111~·3(x) = 1)
- PI'(11~·3(y) = 1, X; > 0 I 116 (x) = 1)].
The middle term on the right of (2.48) is of course zero, but is included to make
it more obvious that the equality is correct, since the sum of the positive terms
on the right is the positive term on the left, and the sum of the negative terms on
the right is the negative term on the left. The first and third sums on the right of
(2.48) are handled in a manner similar to each other. To treat the first one, note
that by translation invariance, it is, except for the factor of 1 - p, equal to

(2.49) y<O:,::x

- pl'(11~·3(y) = 1, X; > x I 11b· 2•3(0) = 0)]'


Fix an initial configuration y = (yl, y2) = (116, 11~,3) taken from the distribu-
tion 7I. Let y* be the configuration obtained from it by removing the 11 2. 3 particle
at O. Couple two processes with initial configurations y and y* together with a
244 Part III. Exclusion Processes

common graphical representation, and let xi be the position of the second class
particle that started at y for the un starred process, and xi'* be the position of the
second class particle that started at y for the starred process, Then (2.49) is simply
the average value with respect to Ii of

(2.50)

The expected value in (2.50) refers to averaging over the evolution for a fixed
initial condition.
Let··· < Y_I < Yo = 0 < YI < ... be those values of Y so that y 2 ey) = 1.
Then at all times t, ... < Xtl < xio < xii < ... are the locations of the
7]2,3 particles in the unstarred process, and· .. < xi- I ,* < xii" < ... are the
locations of the 7]2,3 particles in the starred process. These two sets of locations
are the same, except for Vt which is in the first set, but not the second. For a
given realization of the processes and time t, define i by setting xi; = Vt . Since
the labels must match up far out to the left and far out to the right, it must be the

I
case that
X tYi + 1 if i :::; j < 0,

xi" ~ XYj-l
t
X Yi
ifO<j:::;i,
otherwise.
t

Therefore

L [eX;i)+ - eX;i'*)+] =lri<oJ L [eX;i)+ - eX;i+ )+]


-I
I

j=i

=I U<oJ[eXr)+ - eX;o)+]
=l rxt >vtJ[V/ - xi].
Averaging over Ii gives the following expression for (2.49):

(2.51)

Similarly, the third sum on the right side of (2.48), except for the factor of A,
can be written as

(2.52)

The difference between this and the computation leading to (2.51) is that y', the
initial configuration of the starred process, is obtained from y by replacing the
7]2,3 particle at the origin by a first class particle. Ut is the site at which there is
an 7]2,3 particle in the unstarred process and an 7]1 particle in the starred process
at time t.
Using translation invariance again, we can compute the sum of the negative
terms on the left side of (2.48):
2. Asymmetric Processes on the Integers 245

L pfl(1J~'\Y) = 1, xi> 0) = LyPfl(1J~,3(0) = 1, X t > Y)


y<x50 y>O

(2.53)
y>O

Combining (2.46}-(2.53), we obtain

Efl[Jt2,3,+]2 =(p - Ai Eil(Xn 2 + (p - A) (1 - P + A)EilX;


(2.54) + 2(p - A)(1 - p)Eil[V/ - x;, X t > Vt]
Correspondingly, one can compute

Efl[Jt2,3,-]2 =(p - A)2 Eil(X;)2 + (p - A)(1 - P + A) EilX t-


(2.55) + 2(p - A)(1- p)Eil[Vt- - X;, X t < Vt]
+ 2(p - A)AEil[Vt- - X;, X t < V t ].

Combining (2.31), (2.45), (2.54) and (2.55) gives the following result:

Proposition 2.56. If JL is the product measure with good marginais, then

Varfl 1t2,3 = (p - A)2Va?iX t + (p - A)(1 - P + A)EilIXtl


+ 2(p-A)(1-p)[ Eil(V/ - X;, X t > Vt )+ Eil(Yr- - X;, X t < Vt)]
+ 2(p-A)A[ Eil(Vt+ -X;, X t > V t ) + Eil(Vt- -X;, X t < V t )]'

Before proceeding, a few words are in order on ~w this result will be used.
Recall that we are trying to find the asymptotics of Var fl X t . Proposition 2.56 relates
this to the variance of the current (on the left side of the identity) and first order
properties of Xt, V t and Vt (on the right side of the identity). We will actually end
up using Proposition 2.56 in both directions: to compute the variance of the current
in terms of the variance of the tagged particle, and vice versa. This is a profitable
approach for the following reason. We will be able to compute the asymptotic
variance of X t directly when A = 0 or p = 1. This is because of Proposition 2.13.
As it stands, that result refers to an invariant measure v for the coupled process. In
general, this cannot be written down explicitly. However, if A = 0, for example,
then v is the same as the product measure with good marginals - this is the key
fact. Once we carry out this part of the argument, Proposition 2.56 will give us the
variance of the current in this case. However, we will be able to use the variance
of the current in this case to compute the variance of the current in the general
case, and then use Proposition 2.56 again to get the variance of X t in the general
case.
246 Part III. Exclusion Processes

But first, we need to handle the first order tenns in Proposition 2.56. The law
of large numbers given in Theorem 2.34 will allow us to do this for XI' So, we
need to prove analogous laws of large numbers for VI and VI to be used in the
tenns that involve them. Note that V t is the position of a second class particle
starting at the origin when the rest of the system is made up of first class particles
that have initial distribution VA on Z 1\ {O}, while Vt is the position of a second
class particle starting at the origin when the rest of the system is made up of first
class particles that have initial distribution vp on Z 1\ {O}. Therefore the results we
need for VI and VI are the same, except for the density of the initial distribution.
To avoid confusion, we will call the density of particles away from the origin fJ
in the next result, and the position of the second class particle Wt .

Proposition 2.57. Consider the exclusion process 17t that consists of first class
particles on Zl \ {O} with initial distribution vj3, and a single second class particle
initially at O. Let Wt be the position of the second class particle at time t. Then

· -WI = (p - q)(l - 2 fJ)


I1m
t->oo t

in LI.

Proof Choose A = fJ < p. Consider the process (171, 17;, 17;, Xr, Zt) with the initial
distribution used in Theorem 2.22, based on a choice of m(·) that is symmetric
about 0 and has exponentially decaying tails. Let L t be the position of the leftmost
particle in 17; and R t be the rightmost particle in 17;. Note that these are finite,
since by (2.26b),

L mk(l) <
l<k
00, L [I - mk(l)] <
l>k
00.

Recall from the proof of Theorem 2.22 that given {(171 , 17;,3), s ::: O}, the law of
the location of Zt relative to XI> and the labelling of the 17;,3 particles as second
vs. third class are in equilibrium. Again let Xt(k) be the ordered locations of the
17;,3 particles at time t, with Xt(O) = Xt. Then, as in the computation that led to
(2.27),

EJL[Xt(k + 1) - XI(k), 17;,3(0) = I] = I, -00 < k < 00,

For k E ZI and f = {Ej}jEZ1\(Oj, Ej E {O, I}, such that

Lfj < 00 and L[l-Ej] < 00,


j<O j>O

let G(k, E) be the event

Then
2. Asymmetric Processes on the Integers 247

P(G(k, E») = m(k) n


Ej=i
mk(J) n [1 -
Ej=O
mk(J)].

Also, by Proposition 2.10 and Theorem B47,

E(ILt - Xtl
-
I G(k, E») = E"(IXt(l) - Xt(O)I) = -III- ,
P-A
where I = min {j : Ej = I}. Therefore,

for appropriate choice of a (.) in (2.26). Since the right side is independent of t,
even if the initial distribution is conditioned on the event

{ZO = 0, L1/~(x) = 0, L1/~(x) = OJ,


x 0:::0

it will still be the case that

supEILt - Xtl < 00.


t>O

Combining this with Theorem 2.34 gives

. Lt
(2.58) hm -
t-+oo t
= (p - q)(l - A - p)

in probability.
Now consider coupling together the processes (1/t, Wt ) and (1/1, 1/;, 1/:, x t , Zt),
using a common graphical representation. The initial configurations are coupled
by saying that Xo = Wo = 0 and 1/0 = 1/6 on Zi\{O}. Note that with this coupling,
Wt :::: L t for all t, provided that it is true at t = O. To see this, it suffices to check
that 1/; (Wt ) = 1, i.e., the sole second class particle in (1/l> Wt ) is always at a site
occupied by a second class particle in (1/] , 1/;, 1/:, X t , Zt). On the set {1/5 (0) = I},
Lo = Wo = O. Combining these observations with (2.58), we see that

(2.59) lim p(WI < u,


1-+00 t
1/~(0) = 1) = 0
for every u < (p - q)(l - A - p). Since A = {3, p > {3 is arbitrary, and the
distribution of WI depends only on {3, it follows that (2.59) holds for all u <
(p - q)(l - 2{3). Noting that WI is independent of

{Zo = 0, L 1/~(x) = 0, L 1/~(x) = OJ,


x 0:::0

one concludes that (2.59) holds without the condition 1/5(0) = 1.


This gives half of the statement of the proposition in the sense of convergence
in probability. For the other half, use the same argument, but applied now to
248 Part III. Exclusion Processes

A < P = {3, using R t instead of Lt. To check LI convergence, note that IWtl is
dominated by a Poisson process, and hence {Wt / t , t > I} is uniformly integrable.

By the discussion preceding the statement of Proposition 2.57, we conclude


from that result that in the context of Theorem 2.43,
. Ut . Vt
11m
(2.60) hm -
t-->oo t
= (p - q)( 1 - 2A) and - =
t-->oo t
(p - q)(l - 2p)

in LI.
Recalling the discussion following the statement of Proposition 2.56, we see
that the next step in the proof of Theorem 2.43 is to obtain the asymptotics of
VarJi X t in the special case A = 0. We tum to this next.

Proposition 2.61. Suppose that A


marginals. Then
= ° and JL is the product measure with good

t~1

and
. Varl" X t
hm - -
t-->oo t
= (p - q)(1 - p).

Proof Since A = 0, the product measure with good marginals is just the measure
in which there are no first class particles, and the 7]2,3 particles have distribution
vp. In this case, therefore, this measure is invariant for the process (7]}, 7];,3). Using
the assignment of labels described prior to Proposition 2.13 (with c = 1, say), we
have then that (Tj;, Tji) is stationary. Let L t be the position of the leftmost 7];
particle at time t. Then all moments of L t - X t are uniformly bounded in t, so
we may as well prove the result for L t in place of X t .
The proof for L t is based on Theorem B61. The point is that we can map
the evolution of iL t 7]; to a series of queues. To make the connection, think of
the number of sites between successive particles in 7]; as queue lengths. When a
particle moves from x to x + 1, for example, this can be thought of as a customer
moving from one queue to the previous one. Therefore, in the context of Theorem
B61, we should take S = {-I, 0,1, ... } with x* = -1, and

f
if y = x-I, x ::: 0,
q(x,Y) ~ { if y = x + 1, x ::: - 1,
otherwise.
Then
rr(x) = (~r+l,
and we should take p(x) = rr(x) + (1 -rr(x) )(1 - p), so that the A from Theorem
B61 is just (p - q) (1 - p) as required. Since with this mapping, the net output
2. Asymmetric Processes on the Integers 249

process for the queuing system is just L t - L o, the result for L t follows from
Theorem B61.
There is one difficulty, however. We need to know that our mapping takes the
distribution of iL , I1; to the measure v that is relevant to Theorem B6l, for the
present choice of p (.). This should not be surprising, since both v and the image of
the distribution of iL, 11; are invariant for the queuing system, and both distributions
have a queue length that is asymptotically distributed as the same geometric, since
p (x) -+ 1 - p as x -+ 00. What we need then is the following: Let {X (k), k E Zl}
be the ordered locations of particles in vp , so that the increments X (k + 1) - X(k)
are independent and satisfy

(2.62) P(X(k + 1) - X(k) = j) = p(l - p)j-l, j::: 1.

Remove the particle at X (k) with probability

qk
r(k) = k k
P +q
There is a leftmost particle among the remaining ones, since Lk<O [1-r(k)] < 00.
Let Y (0) < Y (l) < ... be the positions of the remaining particles. Then it should
be the case that the increments Y(k + 1) - Y(k) are independent and satisfy

(2.63) P(y(k + 1) - Y(k) = j) = [1 - p(k)]p(k)j-l, j::: 1, k ::: o.


To show this, define random variables No < Nl < ... by Y(k) = X(Nk).
Conditional on {Nk, k ::: o}, {Y(k + 1) - Y(k), k ::: o} are independent random
variables, with Y(k + 1) - Y(k) being the sum of Nk+l - Nk independent copies of
a random variable with distribution given in (2.62). Therefore, if we take numbers
{Uk, k ::: o} satisfying IUkl .:s 1 and Uk = 1 for all k beyond some point,

E Qui(k+l)-Y(k) I{NJ' j ::: o} = Q 1 _ (~~ P)Uk


Nk+l- N,
(2.64) [ ] [ ]

Next we need to compute the joint distribution of the Nk'S, so that we can compute
the expected value of the right side of (2.64). Fix an m well beyond the point that
the Uk'S become 1, and write

P(No = no,··· ,Nm = nm) = n r(k)[1 - reno)] n r(k)[1 - r(nd]···

_(p)no+.+n
- -
m- 1 _
P(No - n m ).
q

Using this in (2.64) gives


250 Part III. Exclusion Processes

=
n[
k
pn(k)uk ]
l - uk[l-p+pn(k)] E
(P)mNO
q
Since the left side is 1 when Uk == 1, the right side must be 1 as well in that case.
This allows us to evaluate the final factor on the right side. Using this, we find
that

as required.

Remark. Just as in Theorem B61, this proof implies that X/ satisfies the central
limit theorem. We will not need this in what follows, however.

We can now use this information on the asymptotic variance of the tagged
particle to get corresponding information for the current.

Corollary 2.65. Suppose IL is the product measure with good marginals.


(a) If)." = 0, then

lim Varl1 J/2 ,3 = (p _ q)p(1 - p)12p - 11.


/---+00 t
(b) For general)", < p,

VarJl J1
lim _ _ I = (p - q»).,,(1 - )"')12)", - 11
/---+00 t
and
Varl1 J 1,2,3
lim / = (p - q)p(1 - p)12p - 11.
/---+00 t

Proof Part (b) follows immediately from part (a), since J/. 2.3 is the current for
a system with distribution vp and J/ is the current for a system with distribution
VA' To prove part (a), we evaluate asymptotically the right side of the identity in
2. Asymmetric Processes on the Integers 251

Proposition 2.56. In doing so, we use Proposition 2.61, together with the following
limiting statements that come from Theorem 2.34 and (2.60):
Vf Xt Vt
--+p-q, - -+ (p - q) (1 - p), - -+ (p-q)(1-2p)
t t t

as t -+ 00 in L 1. Therefore
VarJL J2.3
lim t =p2(p _ q)(1 - p) + p(1 _ p)2(p _ q)
t--+oo t
+ 2p(l - p)(p - q)[(l - 2p)+ - (l - p)]
=(p - q)p(1 - p)12p - 11·

Remark. In order to simplify notation, we have considered so far only currents


across the origin. Analogous results can be proved for more general currents, and
in fact will be needed shortly. Since the proofs are similar, we will state the more
general versions without proof. For details, see Ferrari and Fontes (1994b).
J/.
For any number r define the current r as the number of particles of type i
that were to the left of the origin at time 0, but are to the right of rt at time t.
Thus the currents considered so far correspond to the case r = O. The computation
of the mean is similar to (2.33), and the law of large numbers is similar to the
argument that led to (2.40), but now there is an additional term coming from the
moving frame of reference. The result is that

(2.66)
. J/ = (p -
hm - '
r
q)A(1 - A) - rA
t
t--+oo

in probability and L 1. The mean of the left side of (2.66) is the right side for fixed
t, except for small errors coming from the fact that rt may not be an integer. The
analogue of the conclusion of part (a) of Corollary 2.65 is then that if A = 0,
VarJL J2.3
(2.67) lim t,r = p(1 - p)l(p - q)(1 - 2p) - rl.
t--+oo t

Even though J?,3 = J/,2,3 - J/, Corollary 2.65(b) does not give enough
information to compute the asymptotic variance of J?,3, since at this point we do
not know what correlations might exist between J/,2,3 and J/. To get around this
problem, we will relate currents to initial configurations, and it is here that (2.67)
is relevant. For the next statement, recall the definition of N(x, TJ) in (2.35).

Proposition 2.68. Suppose fJ., is the product measure with good marginals and
A = O. Then
252 Part III. Exclusion Processes

Proof The idea is to use (2.67) in the case that the limit is zero, so take r =
(p - q)(1 - 2p). By (2.66), the asymptotic mean of Jt~~3 is (p - q)p2 t . For ease
of exposition, assume r < O. Write I] = 1]~,3, which has distribution vp' As usual,
write X~ for the position at time t of the particle that began at x. Then

J?,3 + N(p - q)(2p - l)t, 1]) - (p - q)p2t

(2.69)
= L l](x)l{x;>o) - L l](x)l{x;::oo} + L I](x) - (p - q)p2 t

x~-rt x>-rt

But the right side of (2.69) has the same distribution as Jt~~3 - (p - q) p 2t, so the
result follows from (2.66) and (2.67).

Corollary 2.70. Suppose f.L is the product measure with good marginals. Then

+ N(p
(2.71)
J/
o
- q)(2A - l)t, 1]6) - (p - q»)."h
-+ 0,

(2.72)

and hence

J?,3 _ (p _ q)(p2 _ A2)t

(2.73)
o
N«p - q)(2p - l)t, 1]6,2,3) - N(p - q)(2A - l)t, 1]6)
+ 0 -+0

in L2. In particular,

1I. mVarJt2,3
---

I
t ..... oo t

(p - q)(p - A)[(2A - 1)(1 - p + A) + 2p(1 - p)] if 0 ::: 2A-1,


= (p - q)[(1 - 2A)A(1 - A) + (2p - l)p(1 - p)] if2A-1::: 0::: 2p-1,
(p - q)(p - A)[O - 2p)(1 - p + A) + 2A(1 - A)] if2p-1 ::: O.

Proof Statements (2.71) and (2.72) are just Proposition 2.68, applied to the
marginal processes that have distribution VA and Vp respectively. Statement (2.73)
is obtained by taking differences of the first two statements, recalling (2.32). The
final statement comes from (2.73), since the numerator of the second expression,
after some cancellation, is a sum and difference of independent Bernoulli random
variables. One does have to exercise some care in the computation, since different
2. Asymmetric Processes on the Integers 253

cancellations occur in the three different cases detennined by the signs of 2A - 1


and 2p - 1. For example, if 0 :s 2)" - 1, the numerator of the second expression is
the sum of (approximately) (p - q)(2)" - l)t Bernoulli random variables 1]~,3(x)
with parameter (p -)..) and (p - q )2(p - )..)t Bernoulli random variables 1]~,2,3 (x)
with parameter p.

We are now finally able to detennine the asymptotics of the variance of the
shock.

Proof of Theorem 2.43. By the final statement of Theorem 2.22, it is enough to


prove the result for XI in place of Zt. But we have in Proposition 2.26 an identity
that relates VarX t to quantities whose asymptotics we now know: Corollary 2.70
for Var lt2,3, Theorem 2.34 for the first order behavior of XI> and (2.60) for the first
order behavior of V t and Vt. Note that the asymptotics of these three processes
imply that ultimately they are ordered as

So, passing to the limit in Proposition 2.26, we have

. Varl/,3 2. VarX t
hm - - =(p -)..) hm - -
t ..... oo t t ..... oo t
+ (p -)..)(1 - p + )..)(p - q)ll -).. - pi
+ 2(p - )..)(1 - p)(p - q)[(1 - 2p)+ - (1 -).. - p)+]
+ 2(p - )..»)..(p - q)[(1 - 2),,)- - (1 -).. - p)-]
2· VarXt 2
=(p -)..) hm - - - (p - q)(p - )..)(1 -).. - p)
t ..... oo t
+ 2(p - q)(p - )..)[(1 - p)(1 - 2p)+ +)..(1 - 2),,)-].

Considering the three cases 0 :s 2A - 1, 2).. - 1 :s 0 :s 2p - 1, and 2p - 1 :s 0


separately and using the final statement of Corollary 2.70 gives the statement of
Theorem 2.43 with ZI replaced by Xt, as required.

Central Limit Behavior of the Shock


The key to proving the central limit theorem for Zt is to relate this process closely
to the initial configuration. Since the initial configuration is made up of independent
Bernoulli random variables, we can then apply the usual central limit theorem to
them. Here is the result that establishes the connection we need.

Proposition 2.74. Suppose 1]1 has initial distribution vA,p on Z'\{O}, with a second
class particle placed at the origin. Then ZI, the location of the second class particle
at time t, satisfies
(p - ),,)ZI - (p - q)(p - )..)t + Ll x l:S(P-q)(P-A)I1]O(X) --+ 0
(2.75)
.fi
254 Part III. Exclusion Processes

Proof The mean of the expression in (2.75) tends to zero, by Theorems 2.22 and
2.28. Therefore, it is enough to show that the variance tends to zero. Its variance
is (up to small errors caused by the fact that (p -q)(p - A)t may not be an integer
- we will ignore such errors in this computation) is

VarZ
(2.76) (p - A)2 _ _t + (p _ A)2 D + 2(p _ A)
COV(Zt'LI x::o
I ( )( A) 1I0(X»)
p-q p- t ,
t t

where D is defined in Theorem 2.43 and Cov denotes the covariance of two
random variables. Theorem 2.43 gives us the asymptotics of the first term, so we
need only consider the covariance term. In particular, the covariance term should
in the limit exactly cancel the first two terms in (2.76).
Take x > 0, and compute

(2.77) COV(Zr.1I0(X») = p[E(Zt I 110 (x) = 1) - EZt].

Of course,

(2.78) EZ t = pE(Zt I 110 (x) = 1) + (1 - p)E(Zt 11I0(x) = 0),


so we will be able to compute (2.77) if we know

(2.79) E(Zt I 110 (x) = 0) - E(Zt 11I0(x) = I).


This is simply the difference in mean locations of the shock depending on whether
or not there is a particle at x initially. We will show that in a Cesaro sense, over
the range of positive x's relevant to (2.76), (2.79) is approximately (p - A)-I:
1 (p-q)(p-A)t

(2.80) lim
(-+00 (p - q)t
"
~
[E(Zt 11I0(x) = 0) - E(Zt 11I0(x) = 1)] = 1.

Combining (2.80) with (2.78), it follows that asymptotically, in the Cesaro sense
over the same range,
p(1 - p)
Cov(Zt' 1I0(X») '" - .
P-A
This, together with the corresponding result for negative x's and Theorem 2.43
implies that the limit of (2.76) is zero as t -+ 00.
So, we need to prove (2.80). Choose (116, 1I~,3) according to the product measure
J1 with good marginals, and let Xt(k) be the ordered locations of the 11;·3 particles.
Define processes Yt (k) so that

by setting Yo(k) = Xo(k), and letting these positions evolve according to the
graphical representation with the following priority rule:

Yt(k) has priority over YrU) iff j < k.


2. Asymmetric Processes on the Integers 255

Since this construction is shift invariant, the distribution of Xt(k) - Yt(k) is inde-
pendent of k. Therefore,

1
(2.81 ) E[Yt(k + I) - Yt(k)] = E[Xt(k + I) - Xt(k)] = - - ,
P-A
where the second equality comes from Theorem B47.
The priority rules we have chosen are intended to guarantee that when viewed
from Yt(k), the process
'7: + 1{Y,(j).j>k}
has distribution v).,pS(t). Consider then the process ('7t, Yt(O), Yt(-I)), where

'7t = '7: + 1{Y,(j).j>O}·


Then '7t is a copy of the exclusion process, Yt (0) has lower priority than the
particles in '7" and Yt ( -I) has lower priority still.
For an x > Yo(O), we will consider the processes obtained from ('71> Yt(O),
Yt(-I)) by replacing '7o(x) by 0 or 1 respectively - call them (~j, uj, V/)
for i = 0, I. Recalling that we are trying to prove (2.80), note that

(2.82) E(Zt I '7o(x) = 0) - E(Zt I '7o(x) = I) = E(UtO - Un·

Couple the two processes (~tO, U tO, Vto) and (~/ ' U/, Vt1) together with the graph-
ical representation. For a certain amount of time, the configuration of the coupled
processes will be of the form
*' *... 0
*' * ...
where *' and * represent the locations of the particles earlier denoted by uj and
V/' in either order, and the· .. represent the rest of the configuration of O's and
1'so These agree in the two configurations. The location of the ~ can be thought
of as moving as a second class particle with respect to the process without the *S.
At some point, the : and the ~ may be at adjacent sites. (By this time, * may
be either of the star particles, since their order is not preserved by the evolution.)
At this time, the following transitions affecting these two sites are possible:

(2.83) * 0 0 * and * 0 * 0
* * * *
at rates p and q respectively. It is important to note that the transitions in the
opposite direction do not occur, so that once one of the transitions in (2.83) has
occurred, there will no longer be a site with a ~ in the pair of configurations.

At this stage, the three special sites are of the form *:, 0 and *1' These are
* two
all second class with respect to the other particles. The latter * of them interact
256 Part III. Exclusion Processes

with each other, with ~ having greater priority than ~. The interactions among
the other two pairs is more interesting. Here are the possible transitions in these
cases:
* *' at rate p if *' > *,
*'
(2.84) *' * -+
* *' at rate q if * > *',
*' *'
*' *
at rate q if *' > *,
*'
where * > *' means that * has priority over *', and
0 *'
at rate p if * > *',
*' *
*' 0 0 *'
(2.85) -+ at rate p if *' > *,
*' * * *'
*' 0
at rate q if * > *'.
* *'
The thing to notice is that after these transitions, the 1 is paired with the higher
priority * (in the first case) and the 0 is paired with the lower priority * (in the
second case). After each of these types of transitions has occurred, this pairing
will persist forever, and then VI O = V/ at all later times.
Breaking up the following expectation according to whether 1)o(x) = 0 or 1,
we have

(2.86)

By (2.81), the left side of (2.86) is (p - A)-I. Pretend for a minute that VtO = V/
with probability 1. Then we would be able to rewrite (2.86) as

(2.87)

By (2.82), what we wanted was to show that

(2.88) E(VO _ Vi) '" _I_


I I P_ A

for the appropriate (x, t) range. This would follow from (2.87), provided that

(2.89) E(VOI - Vi)


I
'" E(VOI_tVi)
·

But the left side of (2.89) can be thought of as the expression on the right side,
but computed for a shifted x' = x + Yo (0) - Yo (-1). Since what we are interested
in is a Cesaro average of these expressions as x varies, this shift plays no role in
the limit.
2. Asymmetric Processes on the Integers 257

This is essentially the entire proof of (2.80), except for the proof that
(a) the transitions (2.83), (2.84) and (2.85) will have occurred with large probability
by time t if x is in the range relevant to (2.80): 0 < x « (p - q)(p - ),.)t, and
(b) errors in the above argument caused by the fact that VtO = U/ is only true
with large probability, not with probability 1, disappear in the limit.
At this point we discuss the basic ideas for (a) only, referring to Ferrari (1 992a)
for the rest of the details. The *s are travelling along the shock, so that by Theorem
2.34, they are moving at rate (p - q)(l - ),. - p). Until it nears the *s, the ~
moves like a second class particle in an environment with distribution vp , so that
by Proposition 2.57, it moves at rate (p - q)(l - 2p). Therefore, these will meet
at approximately the time s at which

(p - q)(l -),. - p)s = x + (p - q)(l - 2p)s,

i.e., at time
x
s=------
(p - q)(p -),.)
So, they will have met by time t with high probability provided that

x «(p-q)(p-),.)t

as claimed.

Here is the central limit theorem for Zt. which follows easily from Proposition
2.74.

Theorem 2.90. Suppose rJt has initial distribution vA,p on Zi\{O}, with a second
class particle placed at the origin. Then Zt, the location of the second class particle
at time t, satisfies the following:
Zt - vt

converges in distribution to the normal with mean zero and variance


p(l-p)+),.(l-),.)
D = (p - q)------
p-),.

as t ---+ 00.

Proof By Proposition 2.74, it is enough to prove that

Llxl:5(p-Q)(P-A)t rJo(x) - (p - q)(p2 - ),.2)t


Jt
converges in distribution to the normal with mean zero and variance D / (p - ),.)2.
But the central limit theorem for i.i.d. Bernoulli random variables implies the
convergence of
258 Part III. Exclusion Processes

LO<x::;(p-q)(p-'\)1 1]o(x) - (p - q)(p - )...)pt


(2.91)
-Ii
to the normal with mean zero and variance (p - q)(p - )...)p(1 - p), and the
convergence of

L-(p-q)(p-,\)I::;x<O 1]o(x) - (p - q)(p - )"')M


(2.92)
-Ii
to the normal with mean zero and variance (p - q)(p - )...»)...(1 - )...). Since (2.91)
and (2.92) are independent, the result follows.

Dynamic Phase Transition


We are now in a position to obtain some very precise information about the
limiting behavior of v,\.pS(t). As we will see, this limiting behavior was accurately
predicted in our earlier discussion of the evolution of shocks in Burgers' equation
(2.3). An important reason for our detailed analysis of the shock was to be able
to prove this result.

Theorem 2.93. For any a,

V'\,pS(t)r:vl+a./i -+ av,\ + (1 - a)vp,


where a = peW 2: a) and W is a normally distributed random variable with mean
zero and variance D.

Before proving this theorem, we check the following easier statement:

Proposition 2.94. Any weak limit of v'\,pS(t)r:vt+a./i as t -+ 00 is translation


invariant.

Proof Since the evolution of the system is translation invariant, shifting the dis-
tribution at time zero has the same effect as shifting the distribution at time t.
Therefore, it is enough to show that for any cylinder function,

(2.95) l!i~ [/ fd(r:1V,\,pS(t)r:vI+a./i) - / fd(V,\,pS(t)r:vt+a./i)] = O.


Couple together two copies of the exclusion process, 1]1 and ~I' with the graphical
representation, so that the initial distributions are v'\,p and r:1 v'\,p respectively, and
so that initially the two processes differ only at the origin (if at all), and differ as
little as possible there. This is possible since these two distributions are product
measures with marginals that agree everywhere except at the origin. Conditional
on 1]0(0) = ~0(0), 1]1 == ~I for all t, while conditional on ~o(O) = 0, 1]0(0) = 1, there
is always one discrepancy, and its location ZI moves as a second class particle
with respect to 1]1' Therefore, if g is any function that depends on the coordinates
in a finite set T and is bounded in absolute value by 1,
2. Asymmetric Processes on the Integers 259

IEg(1]t) - Eg(~t)1 ~ Elg(1]t) - g(~t)1


(2.96)
~ 2P(Zt E T) ~ 21TI max P(Zt = x).
XEZ 1

The right side of (2.96) tends to zero as t --+ 00 by Theorem 2.90. Applying this
to g = a translate of f gives (2.95).

Proof of Theorem 2.93. Let (1]1, 1];, 1];,


Xf, Zt) be the process described prior to
the statement of Theorem 2.22 with the initial distribution described there based on
the product measure /-L with good marginals and a symmetric m with exponential
tails. As in the proof of Proposition 2.57, let L t be the leftmost particle in and 1];
R t be the rightmost particle in 1]i. As shown there, even after conditioning on

(2.97) {Zo = 0, L 1]6 (x) = 0, L 1]6 (x) = o},


X<o

these processes satisfy

(2.98) sup EIL t - Xtl < 00, sup EIR t - Xtl < 00.
t t

When conditioned on (2.97), the initial distribution of 1]t = 1]:,2


is v)",p on ZI \ {OJ.
It is this conditioned process that we use below.
Suppose now that f is a function that depends only on the coordinates in
{-k, ... ,k}. By compactness, it is enough to prove that for any sequence of t's
tending to 00 for which v)",pS(t)Tvt+av'! has a weak limit along that sequence, the

f f
limit of
fd(v)",pS(t)Tvt+av'!] IS fd[av)" + (1 - a)vp].

All limits in t below will be understood to be along such a sequence. By Propo-


sition 2.94,

= lim [h(t)
t~oo
+ h(t) + h(t)],
where Mt), h(t), [3(t) are the expressions in the middle of (2.99), but where the
expected value is taken over the events

and K t = the complement of G t U H t , respectively.


Recalling the proof of Proposition 2.57, we have L t ~ Zt ~ Rt . Therefore

P(Kt ) ~ P{vt+av't-t I/4 ~ Zt ~ vt+av't+t I/4 ) = p(IZtJ"rvt al ~ t- 1/4 ).


260 Part III. Exclusion Processes

and so by Theorem 2.90,


lim h(t)
t-+oo
=0
for every n. So, it will be enough to prove that

(2.100) lim limh(t)=a!fdv).


n-+oo (-+00
and lim lim lz(t)
n-+oo 1-+00
= (l - a)! fdvp,

where what we mean by this in case the limit on t does not exist is that these
statements are true with limt replaced by either lim SUPt or lim inft . The two
statements in (2.100) are proved in the same manner, so we consider only the first
of them.
Recalling Theorem 2.22 and the proof of Proposition 2.57, we have

sup EIL t - Zr/ < 00.


t

Combining this with Theorem 2.90, it follows that

lim P(G t )
t-+oo
= a.

Therefore, we need to consider

Ih(t) - ! fdV).P(Gt)1

!
(2.101)
= /2n ~ 1 jt.;n E[Tvt+av't+(2k+l)J(1Jt) - fdv)., GtJ/.

Now, the translate of f that appears in (2.101) depends on the coordinates in

{vt + aJt + (2k + l)j - k, ... ,vt + aJt + (2k + l)j + k}.
Provided that t is large enough that 2(k + l)n +k :::: t l / 4 , on the event G" we may
replace the 1], above by 1]1. After making that replacement, the translates of f that
appear in (2.101) are i.i.d. with the distribution of f, evaluated at a v). distributed
configuration. Let Uj be i.i.d. with this distribution. Using the Schwarz inequality,
we see that the square of (2.101) is bounded above by

In
E [ - - "(U' - EU·)
]2 = --Var(U
1
1).
2n+l.~ j j 2n+l
j=-n

This proves the first statement in (2.1 00), as required.

Now we can go back and answer the original question raised in this section
- what is the limit of v).,pS(t) for various values of A and p? We must restrict
ourselves to A < p, since that has been the underlying assumption throughout.
3. Invariant Measures for Processes on {I, ... , N} 261

Corollary 2.102. Suppose A < p. Then

ifA+p>l,
ifA+P < 1,
ifA+p=1.

Proof The third case is obtained by setting a = 0 in Theorem 2.93, since v = 0 in


this case. The first two cases correspond to v < 0 and v > 0 respectively, and we
will get these by a comparison argument. Since vA•P Tx is stochastically increasing
in x, the same is true of vA,pS(t)Tx for each t.
Suppose then that A + p > 1, for example, so that v < O. Fix an a. Once t is
large enough that vt + a0 :s 0, we will have

(2.103)

The second inequality is a consequence of attractiveness, while the equality comes


from the invariance of vp (Theorem 1.2). Letting first t --+ 00 and then a --+ 00,
and using Theorem 2.93 to evaluate the limit of the left side of (2.103), it follows
that vA,pS(t) --+ vp.

3. Invariant Measures for Processes on {I, ... , N}

In Section 2, we studied the asymmetric nearest neighbor exclusion process on Z 1


with an initial distribution that is asymptotically VA to the left and V p to the right.
The approach was almost entirely probabilistic, making extensive use of coupling.
Rather than considering the infinite system with those boundary conditions, one
might hope to understand the same phenomena (the motion of shocks, the state of
the system viewed from the location of the shock ... ) by considering a system on
{I, ... , N} for large N that is in equilibrium. The transitions within {I, ... , N}
are the same as before - particles move to the right at rate p and to the left at rate
q, provided the destination site is empty. The effect of the boundary condition is
built in by imagining that there are sites 0 and N + 1 at which the distribution is
always
with probability A,
rJ(O)={~ with probability 1 - A,
and
with probability p,
rJ(N + 1) = {~ with probability 1 - p,
independently of each other, and of the configuration on {I, ... ,N}. More for-
mally, in addition to the exclusion transitions on {l, ... , N}, the following tran-
sitions occur:
262 Part III. Exclusion Processes

° AP,
°
--+ 1 at site 1 at rate
1 --+ at site 1 at rate (1 - A)q,
° --+ 1 at site N at rate pq,
°
1 --+ at site N at rate (1 - p)p.

The resulting process 1/1 is a finite state, continuous time, Markov chain on
{O, I}N, which is irreducible except for a few choices of A, p, P E [0, 1]. The main
objective of this section is to study properties of its stationary distribution, and
to then relate these to some of the phenomena studied in Section 2. It is perhaps
surprising that the stationary distribution can be written down fairly explicitly. In
contrast with Section 2, the approach in this section is almost entirely analytic.

The Matrix Approach


The main idea is to try to represent the stationary distribution of the system in
terms of certain (usually infinite, noncommuting) matrices D, E and vectors iV, V.
Here is the result that gets the program started. The hypotheses may seem a bit
strange, but the proof will show how they arise. Later we will show that often
there exist matrices and vectors with the required properties.

Theorem 3.1. Suppose that the matrices D, E and vectors iV, v satisfy

(3.2a) pDE-qED=D+E,
(3.2b) iV[ApE - (1 - A)qD] = iV,
(3.2c) [0 - p)pD - pqE]v = v
For 1/ E {O, I}N, put

(3.3) IN(1/) = iV n
N

i=1
[1/(i)D + (1 - 1/(i»E]v.

If (3.3) is well defined for each 1/ (i.e., the matrix products converge), IN satisfies
Li" IN(/;) =1= 0, and 1/1 is irreducible, then

is the stationary distribution for 1/1,

Proof Let Q(1/, n be the rate at which the process goes from 1/ to ~:
if 1/(i) = 1, 1/(i + I) = 0,
if 1/(i) = 0, 1/(i + 1) = 1
for 1 :s i < N,
AP if 1/(1) = 0,
Q(1/, 1/1) = { (1 _ A)q
if 1/(1) = 1,
3. Invariant Measures for Processes on {I, ... , N} 263

pq if rJ(N) = 0,
Q(1), rJN) ={ (1 _ p)p
ifrJ(N) = 1,
and Q(1), 0 = 0 otherwise. Here, as usual, rJi is the configuration obtained from rJ
by flipping the ith coordinate, and rJ;,j is obtained from rJ by interchanging the ith
and jth coordinates. A finite state, irreducible Markov chain has a unique nonzero
invariant signed measure of given total mass, and it is strictly of one sign. So, it
will be enough to show that

(3.4)

for each 1;.


To check (3.4), consider the following relations:

fN(I;;,;+dQ(I;;,;+I, 0 - fN(OQ(I;, 1;;,;+1)


(3.5b) =[1 - 21;(i)]fN-I(I;(1), ... ,I;(i -1), I;(i + 1), ... ,I;(N))
- [1 - 21;(i + 1)]fN-I (1;(1), ... ,I;(i), I;(i + 2), ... ,I;(N))

for 1 :s i < N, and


fN(I;N)Q(I;N, 0 - fN(OQ(I;, I;N)
(3.5c)
= [1 - 21;(N) ]fN-I (1;(1), ... , I;(N - 1)).
(Note that the left sides of (3.5) are just the expressions that would have to be
zero if the chain were to be reversible with respect to fN - see the discussion
of reversibility in the Background and Tools section. This chain is almost never
reversible, yet (3.5) says that the defects that keep it from being reversible can be
written in terms of the invariant measure for a smaller system.) The sum of the
left sides of (3.5) is just the difference between the two sides of (3.4), while the
sum of the right sides is zero, since it is a telescoping series. Therefore, if (3.5)
holds, then so does (3.4).
So, we need to check (3.5). Consider first (3.5a). Both sides are multiplied by
-1 if we replace I; by I; I. Therefore, we may assume I; (1) = O. The left side of
(3.5a) is then

(1 - )..,)qfN(I;I) - )..,pfN(O =w[(l- )..,)qD - )..,pE] nN

;=2
[1;(i)D + (1 -1;(i))E]ii

wn[1;(i)D + (1 -1;(i))E]ii
N
=-
i=2

as required. In the middle equality, we have used (3.2b). The verification of (3.5c)
is similar, using (3.2c) in place of (3.2b). For (3.5b), we may assume that I; (i) = 1,
264 Part III. Exclusion Processes

~(i + 1) = O. (Both sides of (3.5b) are multiplied by -1 if such a ~ is replaced


by ~i,i+l, while both sides are zero if ~(i) = ~(i + 1).) In this case, the left side
of (3.5b) is

qfN(~i,i+l) - pfN({)

=W n[~(j)D + (1 - ~(j»)E][qED
i-I

j=1
n [~(j)D + (1 - ~(j»)E]v
- pDE]
N

j=i+2

= -w n [~(j)D + (1 - ~(j»)E][D + n [~(j)D + (1 - ~(j»)E]v


i-I N
E]
j=1 j=i+2

= - fN-I (~(l), .. , , ~(i - 1), ~(i + 1), ... , ~(N»)


- fN-I (~(l), ... , ~(i), ~(i + 2), ... , ~(N»)
as required. In the middle equality, we have used (3.2a).

The reader might wonder how one would guess that (3.5) is true and/or relevant
here. As mentioned in the proof, the expressions on the left are natural because
they are defects from reversibility. Computing these for small values of N led to a
guess that these defects might be related in this way to the stationary distribution
of the smaller system, and that suggested that there might be a potentially useful
recursion here.

Properties of the Matrices


In order for Theorem 3.1 to be useful, we must be able to find matrices and vectors
that satisfy (3.2). The first issue to resolve is whether the matrices D and E can
be taken to commute and/or to be finite dimensional, since that would naturally
simplify matters.

Proposition 3.6. Suppose D, E, wand v satisfy (3.2), and the resulting fN is


strictly positive.
(a) If D and E commute, then (i) A = p, or (iO p = 1 and A = 0 or p = 1,
or (iii) p = 0 and A = 1 or p = O. In case (i), the unique stationary distribu-
tion is VA, while in cases (ii) and (iii), the stationary distributions concentrate on
configurations of the form ... 000 III . .. (in case (ii)) or ... 111000 . .. (in case
(iii)).
(b) If p = 1 and D and E do not commute, then they are necessarily infinite
dimensional.

Proof Suppose D and E commute, and define matrices

(3.7) A = ApE - (1 - A)qD, B = (1 - p)pD - pqE.

Then A and B commute, and wA = W, Bv = v by (3.2b,c). Eliminating E and


D respectively from the linear equations (3.7) gives
3. Invariant Measures for Processes on {I, ... ,N} 265

pqA + )...pB =[)...(l - p)p2 - p(l - )...)q2]D,


(3.8)
(l - p)pA + (l - )...)qB =[)...(l - p)p2 - p(l - )...)q2]E.

Now multiply (3.2a) by [)...(l - p)p2 - p(l - )...)q2]2 and use (3.8) to replace E
and D by A and B. This gives

(p - q)[pqA + )...pB] [ (l - p)pA + (1 - )...)qB]


= [)...(l - p)p2 - p(l - )...)q2] [pqA + )...pB + (l - p)pA + (1 - )...)qB].

Applying won the left and von the right leads to


(p -q)[pq +)...p][(1- p)p+ (l-)...)q]w. v
(3.9)
= [)...(l - p)p2 - p(l - )...)q2][pq +)...p + (1 - p)p + (1 - )...)q]w. v.

The left side of (3.9) minus the right side of (3.9) factors nicely:

(p + q)(p - )...)[)...p + (1 - )...)q ][pq + (1 - p)p]w . v.

Since IN is strictly positive, w· v *


O. Taking each of the other factors to be zero
gives the three cases in part (a).
If 0 < )... = P < 1, the chain is irreducible, and if in addition p
dimensional matrices
the one *!,
1 1
d= , e=---
(l - )...)(p - q) )...(p - q)

satisfy (3.2). Except for a constant multiple, the resulting IN gives the distribution
VA, so VA is the stationary distribution in this case. The same is true if p = by !
continuity. Part (a) in cases (ii) and (iii) is easy to check directly.
For part (b), assume that p = 1. Then (3.2) becomes

(3.10) DE = D + E, )...wE = W, (l - p)Dv = v.

Suppose that D and E are finite dimensional, and that the vector u satisfies Eu =
u. Multiplying the first identity in (3.10) by u on the right gives Du = Du + u,
and hence u = o. It follows that E - I is invertible, so that we may solve the
first identity in (3.10) for D: D = E(E - I)-I. But this implies the E and D
commute.

Assumption: From now on, we will exclude the trivial cases in (a) above. There-
fore, the matrices D and E (if they exist) will necessarily not commute and will
usually be infinite dimensional. This also makes the Markov chain 11t irreducible,
so the stationary distribution is unique. In addition, we will assume that p > !.
There is no real loss in this assumption, since there is a symmetry between the
cases p > !
and p < !,
and as we saw in the last section, the symmetric case
p = !
is much simpler than the asymmetric case.
266 Part III. Exclusion Processes

Examples of Matrices D and E


There is no reason to expect (3.2) to detennine the matrices uniquely. For some
computations, all that is relevant is that some representation of the fonn (3.3)
exists - not what the particular matrices that appear in it are. Therefore, it makes
sense to try to find at least some choices that work.
The simplest matrices are diagonal, but if D and E are diagonal, they will
commute. So, we will not find any useful examples that are diagonal. The next
simplest case is that of bidiagonal matrices. So, take D and E of the following
fonn:

do d'0 0 0 eo 0 0 0
0 dl d'I 0 e'0 el 0 0
D= 0 0 d2 d'2 E= 0 e'I e2 0
0 0 0 d3 0 0 e'2 e3

Equation (3.2a) then becomes

p(diei + d;e;) - q(e;_ld;_1 + eidi) =di + ei,


(3.11 ) pd;ei+1 - qeid; =d;,
pdi+1e; - qe;d; =e;,

where we have set d_ 1 = d~1 = e_1 = e~1 = O. Assuming that d; =1= 0, e; =1= 0 for
i ::: 0, the last two equations in (3.11) become

Solving these recursions leads to


1 _ (qjp)i .
(3.12a) ei = + (qjp)'eo, di = 1 - (qjp)i + (qjp)ido.
p-q p-q
Using these values, the first equation in (3.11) becomes

' ' - d' ,


Pdiei -q i-lei-I
+ -1- - [1 - (p - q)do][1 - (p - q)eo] ( j )2i
q p .
p-q p-q
Solving this recursion gives

" [1 - (qjp)i+I][1 - (qjp)i[1- (p - q)do][1- (p - q)eoJ]


(3.12b) die i = (p - q)
2 •

With our choice of matrices, equations (3.2b,c) for v= (vo, VI, ... ) and IV =
(wo, WI, .•. ) become

(3. 13 a)

and
3. Invariant Measures for Processes on {I, ... ,N} 267

(3.13b)

For fixed positive choices of ei, d i , e;, d;, (3.13a,b) has solutions V, W that are
unique up to constant multiples, and can be computed recursively, provided that
A> 0, P < 1.
To understand the nature of the solutions, note first that ei ~ (p - q)-I and
di ~ (p - q) -I as i -+ 00 by (3 .12a), and it is consistent by (3 .12b) to take
e; ~ (p - q)-I and d; ~ (p - q)-I as well. Therefore, the solutions to (3.13)
should behave asymptotically like the solutions of the second order recursions
with constant coefficients

(p - q)Wi =Ap(Wi + Wi+l) - (1 - A)q(Wi + Wi-I),


(p - q)Vi =(1 - P)P(Vi + Vi+l) - pq(Vi + Vi-I).
The general solutions to these recursions are

(3.14)

respectively.
In order for (3.3) to be well defined, we need Li IViWiI < 00. This will be
true for any choice of constants in (3.14) if P < A, q < A, p < p, but in general,
we would need at least one of C I, C2 to be zero.
Looking at (3.12b) and recalling that we need d~1 = e~1 = 0, it makes sense
to choose do, eo so that
q
(3.15) [1 - (p - q)dol[l - (p - q)eol = -,
p

and then
I I 1 - (qjp)i+1
d·I = e· = - -p_q
I
---
To solve (3.13) it is natural, in view of(3.14), to try solutions of the form Wi = Wi
and Vi = Vi, where
1- A q p q
W = - - or and V = -- or
A p 1-p p

The conditions for (3.13) to be satisfied are

AW[ - p + eop(p - q) - qw] = (1 - A)q[ - W + do(p - q) -1]


and

(1 - p)v[ - p + dop(p - q) - qv] = pq[ - V + eo(p - q) -1]


respectively. Then one can check under what conditions (3.15) holds. Here is the
answer in the four cases:
268 Part III. Exclusion Processes

p 1- A
A = 1 or p = 0 if v = - - and w = - -
I-p A
q 1- A
A = 1 or p = 1 or q = 0 if v = - - and w = - -
(3.16) P A
A = 0 or p = 0 or q = 0 if v = _P- and w = - f{
1- P P
q = 0 if v = - f{ and w = - f{.
p p

This gives a number of examples in which a representation of type (3.3) is possible.


In particular, in the totally asymmetric case p = 1, there are several choices that
work for any value of A, p. Other choices can be made to cover some other cases.

Correlation Functions
Next, we will see how to compute correlation functions for the stationary distri-
v,
bution p, N in terms of the vectors wand the matrices D, E and C = D + E.

Proposition 3.17. Under the assumptions of Theorem 3.1,

L iN
ry
(I]) = wCNv,

and

Proof The proofs of all three are immediate from (3.3). For example, consider the
first statement with N = 2. Then the left side has the following four contributions:

wD 2v if I] = 11
wDEv if I] = 10
wED v ifl]=OI
wE 2v if I] = 00,

while the right side is

weD + E)(D + E)v = w(D 2 + DE + ED + E2)v.


There is probably no need to give further details.

In what follows, we will consider only the completely asymmetric system,


p = 1, in order to simplify the computations. Recall that in this case, there are
3. Invariant Measures for Processes on {I, ... , N} 269

various choices of vectors and matrices that satisfy the assumptions of Theorem
3.l - see (3.16). When p = 1, (3.2) becomes

(3.18) DE = D + E = e, AwE = W, (1 - p)Dv = v.


Our objective is to evaluate

where keN) is a reasonable sequence, and Lk is the shift that moves site k to the
origin.

The Partition Function


The first step is to determine the asymptotics of weNv. By the first statement of
Proposition 3.l7, this is the analogue of the quantity that is known as the partition
function in statistical mechanics - it is the normalizing constant that must be used
in order to make the measure with density iN into a probability measure. The next
result expresses this quantity in terms of a sequence of polynomials RN defined
by Ro(x) = x and

RN(X) = L 2Nk-
N
k=O k
(2N - k) x k+!
N

for N ::: 1.

Proposition 3.19. If p = 1, the partition function can be written as

Proof Note that eN = (D + E)N is a sum of products of D's and E's. Anytime
v
the product is of the form Ei Dj, WEi Dj can be computed easily, because w
v
is a left eigenvector of E and is a right eigenvector of D. Since E and D do
not commute in general, we cannot simply reorder the factors so that all the E's
precede all the D's. However, we can use the first part of (3.l8) to take any pair
that is in the wrong order, and replace it bye. Of course, that reduces the degree
of the product by one.
Here is the result of using this reduction repeatedly:

(3.20) N _
e-~
~_k_(2N -k) ~ j
~ED.
k-j
k=O 2N - k N j=O

We prove (3.20) by induction on N. The case N = 1 is just e = D + E, which


is part of (3.l8). For the induction step, assume that (3.20) holds for N. Multiply
eN on the right bye = D + E, so that
270 Part III. Exclusion Processes

C N +1 = L - - -
N k (2N _ k) LE)Dk-)(D+E).
k
(3.21 )
k=O 2N - k N )=0

The factor of D is on the correct side of E) D k -), but the factor of E is on the
incorrect side. To fix this, write for n 2: 1

D nE = D n- 1DE = Dn-1(D + E) = D n + D n- 1E.


Iterating this gives

D nE = D n + D n- 1 + ... + D + E.
Using this in (3.21) gives

C N +1 = t _k_(2N -k) t
k=O 2N - k N )=0
E) [Dk-j+l + D k-) + ... + D + E]

~
=~--- (2N - k k) "~ ..
E'DJ.
k=O 2N - k N I:Oi+):o:k+l
;,),:0

Letting I = i + j - 1, we see that to complete the induction step, we need to show


that for 0 .:'S I .:'S N,

(3.22) bN k
2N - k
(2N - k)
N
1+1
= 2N - / + 1
(2N -/ +
N+1
1) .
This is clearly true for I = N, since both sides are = 1. Given that, (3.22) is
equivalent to the equality of the successive differences of the two sides of (3.22):

(3.23) _1_ (2N - I) =


2N - I N
I+
2N - I + 1
(2N - I +
N +1
1 1) _2N
~ (2N - I).
- I N +1
But this is easy to check directly. Writing out the binomial coefficients in terms
of factorials, and cancelling common factors, (3.23) becomes just

I(N + I) = (l + 1)(2N -I) - (l + 2)(N -I),


which is true. This proves (3.20).
Multiplying (3.20) on the left by wand on the right by v and using the last
two parts of (3.18) gives

~
wCNv = L
~ N k (2N _k) k.
LA-J(l- p)J-k W
. ~ ~
' v
k=O 2N - k N )=0

=" kN

f=Q2N-k
(2N-k)(l-p)-k-l_A-k-l~ ~
N (l_p)-I-A- 1 W·V '
3. Invariant Measures for Processes on {I, ... , N} 271

where the final step comes from summing the finite geometric series. The statement
of the proposition follows by using the definition of RN .

In view of Proposition 3.19, in order to find the asymptotics of wCNv, we


need the asymptotics of RN . These are given next.

Lemma 3.24. As N -+ 00,

1 4N
JIT(2x-l)2 N ~
2 4N 1
JIT NI/2
if
lX=2:

N+l
( )
(1-2x) x(l~X) if 0 < x < ~.

Proof One could look at the definition of RN directly to carry out the asymptotics,
but it is bit easier to argue indirectly. Multiply (3.23) by Xl+l and sum for 1 <
I .:'S N. The result, after some cancellation is

RN(X)
x-I
= ~RN+l(X) + 2N
1+ 1 (2NN ++ 11) .
This can be rewritten as

[x(1-x) ]NRN(x-)-
1 [ x(1-x) ]N+l RN+1(X- 1) = [ x(1-x) ]N
2N
1+ 1 (2N ++ 11)
N
.

Replacing N by k, and then summing this telescoping series for 0 .:'S k < N leads
to the following alternative representation for RN:

(3.25)

The advantage of this representation over the definition of RN is that the summands
on the right depend on k but not on N. The final ingredient of the proof is the
following form of the Taylor expansion of the square root function:

(3.26) 1 - }1 - 4y = 2y L --
00 1 (2k + 1) y , k
2k + 1 k + 1
k=O

This may be used with y = x(1 - x) since x(1 - x) .:'S t for 0 .:'S x .:'S 1. Passing
to the limit in (3.25) gives

if ~ .:'S x < 1,
= l-II-2xl = lx-
N 1 1
x- 1 _ lim [x(1-x)] RN(x-)
N-+oo 2x(1 - x) (1 - x)-l if 0 < x < ~.

This already gives the statement of the lemma if 0 < x < ~. If ~ .:'S x < 1, this
argument gives only
272 Part III. Exclusion Processes

But in this case, (3.25) can be rewritten as

N 1 ~
[x(l-x)] RN(X-) = 6[X(l-X)] 2k+ 1
k 1 (2kk++11) '
and the statement of the lemma follows from this and
_1+_
2k 1
(2k +
k+1
1) '" _1_
.fiT k~ ,
4k

which is a consequence of Stirling's formula.

Combining Proposition 3.19 and Lemma 3.24, it is not hard to determine the
asymptotics of wCNv. Recall that we are excluding the trivial cases A = 0 and
p=l.

Corollary 3.27. Suppose p = 1. Then there is a constant K > 0, depending on A


and p, so that
if p ! < A,
<
if p = ! < A or p < ! = A,
wcNv '" K x [A(l - A) rN if A < ! and A + P < 1,
[P(l - p)r N if p > ! and A + P > 1,
N[A(l - A)r N if A < ~ < p and A + P = 1.

The Current
Corollary 3.27 is already enough to determine the asymptotics of the current
JLN(lO) = JLN{IJ : 1J(i) = 1, IJU + 1) = OJ.
Since this is the rate at which particles move from i to i + 1 in equilibrium, this
quantity is independent of i. To see this, simply compute

f QNgdJLN = JLN{IJ : lJ(i - 1) = 1, 1J(i) = O} - JLN{IJ : 1J(i) = 1, 1J(i + 1) = O},


where Q N is the generator of the process on {I, ... , N} and g(lJ) = lJ(i). This
quantity is zero by Theorem B7. This lack of dependence on i is also easy to see
from Proposition 3.17 directly - see the proof below.

Theorem 3.28. Suppose p = 1. Then

1~(l
ifp:S!:SA,
lim JLN(lO)
N-->oo
= - A) if A :S ~ and A + P :S 1,
p(l - p) if p ::: ! and A + P ::: 1.
3. Invariant Measures for Processes on {I, ... , N} 273

Proof By Proposition 3.17,


WC i - 1 DECN-i-IV
fl,N{17 : 17(i) = 1, 17(i + 1) = O} = --w"'-C-N"""'v"'---
WCi-lCCN-i-IV

WCN V
WCN-IV

WCN V '

where the second equality comes from (3.18). Now apply Corollary 3.27.

The Limiting Measure


Knowledge of the current says a lot about the nature of the measure fl,N, as the
next result shows. This result should be compared with Corollary 2.102.

Theorem 3.29. Suppose p = 1, and keN) a sequence ofintegers such that keN) --+

I
00 and N - keN) --+ 00. Then

VI if P ::: ! ::: A,
lim
N->oo
Tk(N)fl,N = v~ if A ::: ! and A + P < 1,
vp if P :::: ! and A + P > 1.

If A < !, P > !, and A + P = 1, then any weak limit of Tk(N)fl,N is of the form
(3.30)

for some 0 ::: a ::: 1.

Proof At times, we will make the dependence of fl,N on A and P explicit by


writing fl, N CA, p). The processes with different choices of A and P can be coupled
together so that if AI ::: A2 and PI ::: P2, and 17; is the process with parameters
(Ai, Pi), then the inequality 17: ::: 17; is preserved in time. Therefore,

(3.31 )

for each N. Since fl,N(A, A) = VA for each A by Proposition 3.6(a), it follows that

(3.32)

for any A, p, where /\ and v denote the minimum and maximum respectively.
Consider a sequence Nt along which the limit

fl, = lim
N'
Tk(N')fl,N'

exists. Then fl, is invariant for the exclusion process on Zl by (the exclusion
version of) Theorem B7(g). Therefore, using the notation of Example 1.5, fl, is a
mixture of Va. a E [0, 1] and of vn , n E Zl. By Theorem 3.28
274 Part III. Exclusion Processes

if p ::: ~ ::: A,
(3.33) J,t(A, p)(10) = I ;(1 - A) if A ::: ~ and A + p ::: I,
pO - p) if p 2: ~ and A + p 2: 1.
Passing to the limit in (3.32) leads to

(3.34)

If A /\ P > ° °
or A v p < 1, (3.34) implies that J,t must be an average of the
Va, a E [0, 1] alone. Since we are assuming that A > and p < 1, the only case
in which we cannot yet reach this conclusion is A = I, p = 0. But in this case,
(3.33) tells us that J,t(1, 0)(10) = ~, so again it follows that none of the Vn are
involved in the representation of J,t as a mixture of extremal invariant measures.
To see this, note that for any mixture J,t of Va, a E [0, I], J,t(10) ::: ~, while for
any mixture J,t of Vn, n E Zl, J,t(10) = 0. It follows that in general,

(3.35) J,t(A, p) = !ol vay(da)


for some probability measure y on [0, 1]. By (3.34), y concentrates on [A /\ p, A v
p]. To see this, write

J,t(A, p){I](1) = ... = I](n) = I} = !ol any(da) ::: (A Vpt


and let n -+ 00 to show that yeA v p, 1] = 0, and use a similar argument with 1
°
replaced by above to conclude that y [0, A /\ p) = as well.
Putting all these observations together, we have that
°
(3.36) J,t(A, p) = l AVP

Al\p
vay(da).

But by (3.33),
1
AVp 4 if p ::: ~ ::: A,
(3.37) ( a(1 - a)y(da) = {
A(1 - A) if A ::: ~ and A + p ::: 1,
JAI\P
p(1 - p) if p 2: ~ and A + p 2: 1.
Note that in each case, the right side of (3.37) is either the maximum value or the
minimum value of the function a(1 - a) for a E [A /\ p, A V p], and therefore,

°
y puts all of its mass on the point or points at which this extremum is attained.
This point is unique in all cases except < A < ~,A + p = 1. In this case, the
minimum is attained at both A and p. This proves the statement of the theorem in
all cases.

Next, we want to determine the value of a in (3.30). For this, we need to find
the asymptotics of J,tN{I] : I](k(N» = I}. Here is the expression that makes this
possible.
3. Invariant Measures for Processes on {I, ... , N} 275

Proposition 3.38. If p = 1, then

J-LN{TJ : TJ(k) = I} = f; 2j 1+ 1 (2j j+ 1) wCN-j-]Jj


N-k-]
wCNJj

. (2N - 2k - .) .
+ wCk-]Jj ,
N-k
,1 1 (1 )-J-]
WCNJj f;;J2N-2k-j N-k -p

for 1 :::: k < N.

Proof Recalling Proposition 3.17, we need to compute expressions involving DC n


for various values of n. Here is the key identity, which we will prove in much the
same way that we proved (3.20):

DC = ~ _._1_ (2j :- I)Cn-j


j=O 21 +1 1

t
(3.39)
+ _1_'-. (2n - j)DH], n::: 1.
j=] 2n - 1 n

The proof of (3.39) is by induction on n. For n = 1, (3.39) becomes DC = C + D2,


which is an immediate consequence of (3.18). Suppose now that (3.39) is true for
a given n, and multiply both sides of (3.39) by C on the right, giving

DC+] = ~ _.1_(2 j :- I)C n-H]


j=O 21 + 1 1

t
(3.40)
+ - j-. (2n - j)DH](D + E).
j=] 2n - 1 n
As in the proof of (3.20), we have
DH] E = DH] + Dj + ... + D2 + c.
Using this in (3.40) gives

DCn+] = ~ _.1_(2 j :- I)C n-H] +


j=O 21 + 1 1
ct _1_'_. (2n
j=] 2n - 1
- j)
n

+ t - j - . (2n - j)[DH2 + ... + D2].


j=] 2n - 1 n
Using (3.22) with I = 0, we see that the C term above can be written as the j = n
term in the first sum. Doing this, and interchanging the order of summation in the
last sum, gives

DC+] = t _.I_(2 j :- I)C n-H] +


j=O 21 + 1 1 i=]
D i+] I: t
_1_'_. (2n - j).
j=i-] 2n - 1 n
276 Part III. Exclusion Processes

Applying (3.22) again, now with I = i-I, leads to (3.39) with n replaced by
n + 1. The statement of the proposition now follows by using the middle identity in
Proposition 3.17, (3.39) with n = N - k, and the fact that Dj+1Jj = (1 - p)-j-1Jj,
which comes from (3.18).

We can now use this expression to determine the value of a in (3.30). The
answer given below can be interpreted in the following way: There is shock that
is approximately uniformly placed on [0, N]. To the left of the shock, J-LN is
approximately VA, while to the right of the shock it is approximately vp. This
should be compared with Theorem 2.93.

Theorem 3.41. Suppose that p = 1, 0 < "A < ! and "A + p = 1. If k(N) satisfies
k(N)/N -+ a, 0 < a < 1, then
lim Lk(N)J-LN
N~oo
= (1 - a)vA + avp •

Proof By Theorem 3.29, it is enough to show that


(3.42) lim J-LN{TJ : TJ(k(N» = I} = (1 - a)"A + ap.
N~oo

By Corollary 2.27, there is a positive constant K so that

wCNJj '" KN["A(l - "A)r N


as N -+ 00. Therefore, we can pass to the limit in the statement of Proposition
3.38 to get

(3.43)

Applying (3.26) to the first expression on the right of (3.43) and Lemma 3.24 to
the second expression leads to
lim J-LN{TJ : TJ(k(N» = 1} ="A + a(1 - 2"A)
N~oo

as required.

An Application - the Process with a Blockage


In this subsection we consider the following question: Can a local perturbation in
the dynamics have a global effect on the evolution? More specifically, consider
the exclusion process on Zl with p = 1, modified only by making the rate at
which a particle goes from site 0 to (empty) site 1 equal to r E [0, 1] instead
of 1. We refer to this as the exclusion process with a blockage. Take the initial
configuration to be . .. 1 1 1 0 0 0 "', where the O's are on the strictly positive
sites. If r = 1, this is the original exclusion process, so the limiting distribution is
Vl by Theorem 1.3 of Liggett (1975). (See also Theorem 3.29 above.) If r = 0,
2
3. Invariant Measures for Processes on {I, ... , N} 277

there are no transitions at all, so the limiting distribution is the initial distribution.
One would expect for general r that the limiting distribution would be asymptotic
to vy(r) at -00 and to Vl-y(r) at +00. By the above remarks, yeO) = 1, yO) = !.
The question is, is it the case that y (r) > !
for all r < I? If so, then arbitrarily
small local perturbations in the dynamics do have global effects on the limiting
behavior of the system. This problem is open for r close to 1. It turns out that
Theorem 3.29 gives some information away from 1.
One case is easy to handle without using it. Suppose fL is invariant for the
system, and suppose that fL is asymptotic to Vy at -00. Since the current is constant
in equilibrium,

(3.44) fL{l]: I](i) = 1, I](i + 1) = O} = rfL{1] : 1](0) = 1,1]0) = o}, i < O.

The right side of (3.44) is at most r, while the left side tends to y (1 - y) as
i -+ -00. Therefore

(3.45) y(1 - y) ::: r.

If r < ~, this makes it impossible to have y = !.


In the spirit of this section, it is natural to ask the above question in a different,
but closely related, way. Consider the process l;t on {- N + 1, ... , N} with a
blockage between sites 0 and 1 as before. Take the boundary condition I] ( - N) =
1, I](N + 1) = 0, and let aN be the unique stationary distribution for this system.
Stationary distributions for the system on Zl can be constructed by taking limits
of aN, so one would want to know if

(3.46) J = limsupaN{I]: I](i) = 1, I](i + 1) = O},


N-+oo

which is independent of i for i < 0, must be < ~. By the argument that led to
(3.45), this must be the case for r < ~. Next we will use Theorem 3.29 to extend
this to a larger range of r's.

Theorem 3.47. Ifr < !, then J ::: r(l - r) < ~.

Proof Let I]t be the exclusion process on {I, ... , N} with boundary conditions

1](0) = A = r, I](N + 1) = p = o.
The processes St and I]t have the same dynamics on {I, ... , N}, except for the
transition 0 -+ 1 at site 1. For I]t. this occurs at rate r, while for st. it occurs at
rate r if St- (0) = 1 and at rate zero otherwise. Therefore, the two processes can
be coupled so that St ::: I]t at all times, provided this is true at time O. It follows
that

(3.48)

Since the current is constant for both processes,

aN{1] : I](i) = 1, I](i + 1) = O} = aN{1] : I](N) = I}, -N < i < N, i =f. 0


278 Part III. Exclusion Processes

and

JIN{'7 : '7(i)= 1, '7(i + 1) = O} = JIN{'7 : '7(N) = I}, 1 S i < N.


Since ('7 : '7(N) = I} is an increasing event, applying (3.48) to the right sides
above leads to
(3.49) aN{'7 : '7(i) = 1, '7(i + 1) = O} S JIN{'7 : '7(j) = 1, '7(j + 1) = O}
for -N < i < Nand i =F 0, 1 S j < N.
!
Since A < and p = 0, it follows from Theorem 3.29 that
lim Tk(N)JIN = VA
N--->oo

whenever keN) --+ 00, N - keN) --+ 00. Passing to the limit in (3.49) gives the
result.

4. The Tagged Particle Process


Throughout this section, we take S = Zd, and p(x, y) = p(O, y-x) to be the tran-
sition probabilities for an irreducible random walk on Zd. To avoid uninteresting
difficulties we will assume throughout that p(O, 0) = 0 and (x : p(O, x) > O} is
finite. Start off the exclusion process with distribution vp on Zd\{O} and '7(0) = 1.
Let XI be the position at time t of the particle that started at O. This section is
devoted to proving limit theorems for XI; particularly of interest is the central
limit theorem.
We will specifically exclude the nearest neighbor, one dimensional cases:
p(x, x + 1) = p(x, x-I) = !,
in which the asymptotic behavior of XI is atypical
by Theorem 1.21, and p(x, x + 1) = p, p(x, x-I) = 1 - p, p =F where !,
corresponding results have been proved by special methods by Kipnis (1986). The
value of p E (0, 1) will be fixed in this section.

The Process Viewed from the Tagged Particle; First Decomposition


The first idea in studying the position of the tagged particle Xt is to represent it in
terms of the so called environment process. In this section, we will let '71 denote
the exclusion process itself, and ~I the process viewed from XI: ~I (x) = '71 (X I + x).
The latter one is called the environment process. The generators of the processes
'71, (XI' ~I) and ~I are respectively (when applied to suitable functions f)

Q/('7) = p(x, y)[J('7x,y) - 1('7)],

Q/(x, n= p(u, v)[J(x, ~u,v) - f(x, n]


u.V*O
(u)=l,(v)=O

+ L
(y-x)=O
p(x, y)[J(y, Ty-xn - f(x, n],
4. The Tagged Particle Process 279

and

0.f(O = p(u, v)[J(~u,v) - f(O]


u,vi=O
nu)=l,nv)=O

+ L p(O, y)[J(TyO - f(O].


ny)=o

Here Tx is the modified spatial shift that moves x to the origin:

if y =1= 0, -x,
if y = -x,
if y = O.
See (B2) for the expression for 0.; the expressions for Q and 0. are analogous.
Note that if f(x, 0 = g(n, then Qf(x, 0 = 0.g(n, as it should be, indicating
that ~t is a Markov process in its own right.
Below we have the decomposition that expresses X t in tenns of the environ-
ment and a martingale. As we will see, the martingale part is fairly easy to deal
with. That will leave us with the task of detennining the asymptotic behavior of
the part that depends on the environment.

Proposition 4.1. Let


1/1(0 = L vp(O, v).
(v:~(v)=O)

Then

(4.2)

where M t is a martingale.

Proof The process M t is defined by (4.2). Take f (x, 0 = x. Substituting into the
above expression for Q, we see that

Qf(x,O = L p(x, y)[y - x] = 1/I(n.


(y:ny-x)=O)

Let jifbe the a-algebra generated by the process (X s , ~s) for s :'S t. To check the
martingale property of Mr, it suffices to take s < t and show that E [Mt - Ms I
.~] = O. Using the Markov property and the definition of Mr,

E[Mt - Ms I ~] = E[ Xt - Xs _ [t 1/I(~r)dr I ~]
= E(X,,(,) [ X t - s - Xo _I t
-
s
1/I(~r )dr l
280 Part III. Exclusion Processes

So, it is enough to check that

E(X,O[X t - x] = 1t E(x,01jJ({r)dr

for all t > O. In terms of the semigroup Set) for the process (X t , {t), this can be
written as
S(t)/(x, n-
I(x, n
= 1t S(r)Q/(x, ndr,
which is a consequence of Theorem B3, applied to the present semi group and
generator. (Strictly speaking, our I is not in the domain of Q, so one should
apply the above argument to a truncation of I and pass to a limit, but this step is
left to the reader.)

Invariance and Ergodicity of the Environment


In order to take advantage of representation (4.2), we need to have some infor-
mation about the environment process {t. The simplest statement is that it is in
equilibrium.

Proposition 4.3. For all t ::: 0, (ryt(X t + x), x =1= O} are i.i.d. random variables
with P (TJt (X t + x) = 1) = p. The process {t is stationary.

Proof For any x that is initially occupied, let X: be the position at time t of the
particle that was initially at x. In particular, X~ = Xt. In order to prove the first
part of the theorem, we need to show that for any finite A C Zd with 0 cJ. A,
(4.4) En{t(x)=pIAI.
XEA

To do so, use successively Theorem 1.2(a), a decomposition according to the initial


position of the particle that is at the origin at time t, the translation invariance of
the process, the fact that TJ(Y) = (ryTJ)(O), and the translation invariance of vP ' to
write

pIAI+! =e p n
XEAU(O}
TJt(x) = f Ery[ n
XEAU(O}
TJt(X)]dV p

= f L TJ(Y)Ery[ n TJt(X), xi = O]dV


y XEA
p

= f L TJ (0) Ery [n TJt(X + X?), X? = -Y]dVp


n
Y XEA

= pE {t(X).
XEA

Cancelling the factor of p gives (4.4).


4. The Tagged Particle Process 281

The second part of the proposition is now a special case of the general fact
that a Markov process started off with an invariant measure is a stationary process.

An immediate consequence of Proposition 4.3 is the following.

Corollary 4.5. The martingale Mt has stationary increments.

°: :
Proof Since the transitions of X t correspond exactly to the shifts in 1;1> X t is a
function of {I;s, s ::: t}, say

(4.6)

Therefore by (4.2), if s < t,

(4.7) Mt - Ms = Ft-s(l;" s ::: r ::: t) _ [t ljf(l;r)dr.

Since appropriate functions of stationary processes are again stationary, the result
follows from Proposition 4.3.

Proposition 4.8. The stationary process I;t is ergodic.

Proof The idea is to deduce the ergodicity of I;t (with initial distribution vp (.) =
vp(' 11](0) = 1»
from that of 1]t (with initial distribution vp). The ergodicity of 1]t
follows from Theorem 1.17 and Theorem B52(a). We will argue by contradiction,
so assume that I;t is not ergodic. Then there is a set A of configurations 1] with
1](0) = 1 so that

(4.9)

and

(4.10)

for a.e. I; E A and all t > 0, i.e., A is invariant for the process I;t.
Here is one way to see this. Take a bounded continuous function G so that the
statement equivalent to ergodicity in Theorem B52(b) fails for some function F.
By Theorem B50,
W = lim -
t ..... oo
11t G(l;s)ds
t 0

exists a.s. By Fubini's theorem, we can consider this limit for the process with
initial configuration I; for a.e. I; (with respect to vp). Because the condition in The-
orem B52(b) fails, w(1;) = E~W is not constant a.s. Let also v(1;) = Var~(W) be
the variance of a random variable whose distribution is the conditional distribution
of W given 1;0 = 1;. Then

(4.11 )
282 Part III. Exclusion Processes

Integrating both sides of (4.11) with respect to vp and using stationarity, we


conclude that Va~ WeSt) = 0 a.s., and hence that for a.e. S, p{ (w(St) = w(n) = 1.
Therefore, we can take A = {s : wen < a} for an appropriate a.
Given an A satisfying (4.9) and (4.10), let B = {7] : 7](0) = 1}\A. Then

vp(B) = f p{ (St E B)dvp = Is p{ (St E B)dvp

since vp is invariant for St and the integrand is zero on A by (4.10). Therefore,


for t > 0,

(4.12)

for a.e. S E B. It follows that A and B are closed for the part of the evolution
of 7]t that does not involve transitions to or from the origin. It is not invariant for
those transitions, of course, since 7] (0) = I on AU B. In order to find sets that are
invariant for all transitions of 7](, let

Since every transition for 7]t that involves the origin is a transition of St followed
by a translation, it follows that A and B are invariant for the process 7]t. Since 7]t
(with initial distribution vp) is ergodic, vp(A) and vp(B) are each either 0 or 1. If
vp(A) = 0, then vp(A) = 0, which contradicts (4.9). This, together with the same
argument applied to B implies that

(4.13)

In particular, A and B are not disjoint. This does not yet contradict the fact that
A and B are disjoint, since A and B are potentially much larger than A and B
respectively. So, we must work a bit harder.
We will argue shortly that (as a consequence of (4.13)) for a.e. 7] with respect
to vp , there are sites

with the following properties:


(i) Ta7] E A and Tb7] E B. (In particular, 7](a) = 7](b) = 1.)
(ii) 7](c) = 7](a\) = ... = 7](a n ) = O.
(iii) Ci =1= b, bi =1= a, ai =1= C for all i.
(iv) pea, adp(a\, a2)'" p(a n , b) > 0, pCb, b\)p(h, b2) .. · p(b[, c) > 0, and
pea, c\)p(c\, C2)'" p(Ck. c) > O.
Properties (iii) and (iv) can be expressed in words as follows: {a\, ... ,an} is a
path from a to b that avoids c, {b l , ... ,bd is a path from b to C that avoids a,
and {CI' ... , cd is a path from a to C that avoids b.
Assuming the existence of sites with these properties for the moment, we
will complete the argument by contradiction. Take an 7] so that sites exist that
4. The Tagged Particle Process 283

satisfy properties (i)-(iv). Let N = {a,b,c,al, ... ,an,b l , ... ,bl,cl, ... ,cd.
Fix a time to and let ri' be the random configuration that agrees with 1'/ on N,
while on the complement of N it has the distribution that the exclusion process
would have at time to if it evolved starting with configuration 1'/, but allowing only
transitions on the complement of N. Recalling the graphical representation of the
exclusion process that is described in Section 1, we see that for any particular
way of transforming 1'/ into I'/a,c using only transitions in N, on a set of positive
probability, 1'/10 has the same distribution as I'/~,c' and the transitions on N have
occurred in the order dictated by that way of going from 1'/ to I'/a,c.
We will focus on two ways in which 1'/ can be transformed into I'/a.c, using
only transitions in N. They also provide two ways of transforming 1'/' into I'/~.c'
(1) Let {ij, 1 .:::: j .:::: m} be the successive values of i so that I'/(Ci) = 1. Move
the particle at Ci m to c, then the particle at cim~l to Ci m , ••• , and finally the particle
at a to Ci 1 • Since the particle at b has not moved in this sequence of transitions,
Tbl'/ E B, and B is closed for the process ~r. this shows that Tbl'/~,c E B a.s.
(2) Move the particle at b to C through the sites bi in a manner similar to (1)
above, and then move the particle from a to b through the sites ai. Recall at this
latter step that the sites ai are vacant by (ii). Since the particle originally at a is
now at b, Tal'/ E A, and A is closed for the process ~I' this shows that Tbl'/~,c E A
a.s.
But since A and B are disjoint, it cannot be the case that both Tbl'/~,c E B a.s.
and Tbl'/~,c E A a.s. This gives the required contradiction.
It remains to prove the existence of sites that satisfy properties (i)-(iv) for a.e.
1'/. For two distinct sites a, b, define C(a, b) to be the set of sites C for which there
is a path from a to C that avoids b and there is a path from b to C that avoids a.
Using the fact that the random walk is irreducible and is not nearest neighbor in
one dimension, we will show below that

(4.14) IC(a, b)1 = 00


for all a -=f. b.
By (4.14), a.e. 1'/ has the property that for every a -=f. b, there are infinitely
many C E C(a, b) so that I'/(c) = O. By (4.13), for a.e. 1'/, there are sites a, b so
that

(4.15) Tal'/ E A and Tbl'/ E B.

Since AU B = {I'/ : 1'/(0) = I} modulo a null set, for a.e. 1'/, Twl'/ E AU B for
all w such that 1'/ (w) = 1. Fix an 1'/ with these properties, and (by irreducibility)
choose a path a = ao, ai, ... ,an, an+1 = b so that p(ai' ai+l) > 0 for each i,
where a, b satisfy (4.15). Then among the i's such that I'/(ai) = 1, there must be
two successive ones so that Taj 1'/ E A for the first of these, and Taj 1'/ E B for the
second. Therefore, by using these as new choices of a and b, we may assume that
l1(ad = 0 for 1 .:::: i .:::: n. This gives properties (i) and most of (ii). To get the rest
of (ii), and (iii) and (iv) as well, choose C E C(a, b) such that C -=f. ai for all i and
284 Part III. Exclusion Processes

TJ(C) = O. This choice is possible since there are infinitely many vacant sites in
C(a, b).
Finally, we need to check (4.14). We will consider only the case d = 1 - the
higher dimensional case is similar. Without loss of generality, we can take a < b,
and assume that p(O, x) > 0 for some x > 1. By irreducibility, there is a y < 0 so
that p(O, y) > O. Again by irreducibility, there is a path from 0 to 1; call it n. If
i > 0 is sufficiently large, then the path that begins with {b +x, b +2x, ... , b +i x},
and then continues with any number of shifts of n will remain to the right of b,
and hence will avoid a. Thus for a sufficiently large Co, for every c ::: Co there is
a path from b to c that avoids a. Similarly, if i is sufficiently large, then the path
that begins with {a + y, a + 2 y, . .. , a + i y}, and then continues with some number
of shifts of n will remain to the left of a, and will end at some Z for which b - z is
not a multiple x. Then, continuing this path by adding {z + x, z + 2x, . .. , z + j x}
for a large j (thereby avoiding b), and following it with any number of shifts of
n, leads to a path from a to any sufficiently large positive c, while still avoiding
b. Thus we conclude that C(a, b) contains a half line of the form [co, (0). This
concludes the proof of Proposition 4.8.

Just as Corollary 4.5 followed from Proposition 4.3, we get the next result as
a consequence of Proposition 4.8:

Corollary 4.16. The martingale M t has ergodic increments.

The Law of Large Numbers for X t


We are now in a position to reap the first benefits of the representation 4.2 for XI'
Let m= Ly yp(O, y) denote the mean of the motion of the individual particles.

Theorem 4.17.
EXt = t(1 - p)m,

and

. Xt
11m ~
(4.18) - = (1- p)m
t--+oot

a.s. and in L,.

Proof Taking expected values of (4.2) and using the martingale property of Mt
and the stationarity property of ~t gives

Applying the ergodic theorem (Theorem B50) to both terms on the right side of
(4.2) gives (4.18).
4. The Tagged Particle Process 285

Asymptotic Normality for M t


In the decomposition (4.2), it is relatively easy to check that M t has central limit

°
behavior. We will look at the first term on the right of (4.2) a bit later. Let N (0, I;)
denote the multivariate normal distribution with mean and covariance matrix I;.

Proposition 4.19.

Mt
(4.20) .ji => N(O, I;),

where the covariance matrix is determined by

vI;v = (1 - p) L(y· v)2p(0, y), v E Zd.


y

Proof We will write the proof in case d = 1 to simplify the notation. The proof for
general d is the same, except that quantities of interest are multiplied by arbitrary
vectors in Rd, in order to make them one dimensional. Since

sup
n:::t:::n+l
IMt - Mn I :s L Ivlp(O, v) +
v
sup
ns:::n+l
IX t - Xn I

by (4.2), and the last term above is dominated by a constant multiple of a Poisson
distributed random variable, it is enough to prove (4.20) along the integer sequence
t = n.
Define D. n by
n
Mn = LD.k,
k=l

so that {D.b k 2: I} is a stationary sequence by Proposition 4.3. In this situation,


the martingale central limit theorem (Theorem B65) implies that

Mn
In => N(O, (J
2
)

provided that E D.~ < 00 and

(4.21) . lLn E [2D.k I OZ'] =


hm -
n
n ..... oo
oY'"k-l (J
2

k=l

in probability. But by the Markov property,

(4.22)

where
286 Part III. Exclusion Processes

By Propositions 4.3 and 4.8, the random variables on the right of (4.22) are

-1 1/I(~s)dsr
stationary and ergodic, so that (4.21) follows from the ergodic theorem, Theorem
B50, where
1
a 2 = E[ XI = EMf.

Since MI is a martingale, it has orthogonal increments, and hence

EM; = tEMf.
This allows us to compute

a 2 = lim EM(
1,),0 t
= lim ~E[XI _
t,j,o t
r 1/I(~s)dS]2 = lim VarX
J0 1,),0 t
1,

where the last equality comes from the fact that (since 1/1 is bounded)

is uniformly of order t as t t 0. But

EX; ~ t(1- p) Lip(O, y), t t 0,


y

so that
a 2 = (1- p) Lip(O, y).
y

The Second Decomposition - Beginning


Returning to the decomposition (4.2), it would be nice if the first expression on the
right were also a martingale (after centering), so that we could apply the martingale
central limit theorem to it too. However, a process of the form

is never a martingale, unless f = 0. This suggests, though, that we try to write


the centered first expression on the right of (4.2) as

(4.23) 11 1/I(~s)ds - t(1 - p)m = N(t) + D(t),


where N (t) is a martingale and D(t) is a process that is negligible in the sense
that
D(t)
--~o
Jt
in probability. That would be just as good. This is the idea we will implement
next. At this point, to simplify notation, we will begin to write down expressions
4. The Tagged Particle Process 287

as if they were one dimensional. If d > 1, the arguments are then applied to one
coordinate at a time. For example, decomposition (4.23) is simply the statement
that each component of the left side can be expressed as the sum of a (one
dimensional) martingale and a negligible process.
The decomposition (4.23) is based on the solution u).. to the resolvent equation

(4.24)

for A> 0, where 1/!(S) = 1/!(I;) - (1 - p)m. The solution to (4.24) can be written
down explicitly as

(4.25)

where Set) is the semigroup for the process {to The integral converges since A > O.
To check that the right side of (4.25) solves (4.24), simply put it into (4.24) and
integrate by parts, using the fact that

-- - d- -
Q S(t)1/! = -S(t)1/!.
dt

(See Theorem B3.) Now we can at least write down a first approximation to the
desired decomposition (4.23).
To begin, note that (4.2) can be written as

since I(x, S) = x and QI(x, S) = 1/!(S) - see the beginning of the proof of
Proposition 4.1. Applying the argument in the proof of Proposition 4.1 to the
generator Q and the function u).. gives the analogous expression

(4.26)

where N).. is a martingale with stationary ergodic increments. Using the resolvent
equation (4.24), this can be written as

(4.27) i t 1/!({s)ds = N)..(t) + D)..(t),

where
D)..(t) = i t AU).. ({s)ds - u)..({t) + u)..({o).
The idea now is to pass to a limit in (4.27) as A -+ 0 to get (4.23). In order to do
so, it is necessary to have some control of the behavior of u).. as A -} O.
288 Part III. Exclusion Processes

The Basic Assumption


The control we need is expressed in terms of the following Dirichlet form. For a
function U defined on {1) E {O, l} Z d : 1)(0) = I}, define

f!5Z(u) = ~xCu) + ~h(U),


where
6.
~x(u) -- 4"1 f "[
~ p(x, y) U(1)x.y) - U(1)
x,Y',co
]2 dvp
-

f
and
~h(U) = ~ L p(O, x)[u(rx1) - U(1)]2dvp .
{x:ry(x)=O}

The subscript ex stands for exclusion or exchange, while the sh stands for shift.
Here is our basic assumption: There is a constant C so that

(4.28a)

and

(4.28b)

for all U and all A > O.


To make sure that this assumption is not vacuous, we will check it below
in the (easy) case that p(., .) is symmetric. More generally, (4.28) was proved
by Varadhan (1995) when p(., .) has mean zero (see the third display on page
278 for (4.28a) and combine the final display on page 278 with his Theorem
5.1 for (4.28b», and by Sethuraman, Varadhan and Yau (1999) for p(., .) with
mean different from 0 if d 2: 3 (see Lemma 2.1 for (4.28a) and Theorem 2.3 for
(4.28b».
We begin with a simple fact about Markov chains, which explains the relation
between a generator and the corresponding Dirichlet form. Suppose the Markov
chain has transition rates q (1), n
and stationary measure TC. The generator of the
chain is given by
Q' u(1) = L q(1), n[u(n - u(1)],
~

and its Dirichlet form is

f!5Z'(U) = ~ LTC(1)q(1), n[u(1) - u(nt


ry,~

We state the next result for finite state Markov chains to avoid problems with
convergence of sums, but corresponding statements for more general processes
can usually be obtained easily by passing to limits,
4. The Tagged Particle Process 289

Lemma 4.29. Suppose the chain has a finite state space. Then
(a)
- L U(1])Q'u(1]);rr(1]) = M'(u),

and
(b) if the Markov chain is reversible with respect to;rr, i.e., if the expression
a(1], n = ;rr(1])q(1], n = ;rr(nq(~, 1])

is symmetric, then

(4.30) [ ~U(1])Q'V(1]);rr(1])f :s ~'(u)~'(v).

Proof For part (a), write out both sides explicitly, and cancel common terms. The
resulting identity that must be checked is

To check it, interchange the roles of 1] and ~ in the sum on the right, and use the
fact that ;rr is invariant:

;rr(1]) Lq(1], n = L;rr(nq(~, 1]).


~ ~

For part (b), use the symmetry of a(1], n to write the left side of (4.30) as
[LU(1])a(1], n[v(n- V(1])]f = [La(1], nU(1])v(n- La(1], n U(1])V(1])f
ry.t ry,t ry,t

= ~[La(1], n[u(1]) - u(n][v(1]) - v(n]r


ry.~

:s ~'(u)~'(v),
where the final step comes from the Schwarz inequality.

Now we return to the exclusion process. One application of the last result is the
following. Multiply (4.24) by u).. and integrate with respect to vP ' using Lemma
4.29(a), to get

(4.31 ) A. f u~dvp + ~(u)..) = f 1/J u )..dvp.

Here is another application.

Proposition 4.32. If p(x, y) = p(y, x), then (4.28) holds.


290 Part III. Exclusion Processes

Proof The process (Xt. St) is reversible with respect to the product measure that
is counting measure on the first component and vp on the second (since p is
symmetric). Recalling from the proof of Proposition 4.1 that 1/1 = 1/1 = fl.!, where
lex, l;) = x, (4.28a) follows essentially from (4.30), since Ly
p(O, y)lyl 2 < 00.
It doesn't quite follow directly, because both sides of (4.30) are infinite (since
each x gives rise to an identical term). To fix this, simply replace p(x, y) by

1
p(x, y) if x, yET
PT(X, y) = ~ ifx=y~T
otherwise,

where T = [-n, n]d, divide by n d, and then let n -+ 00.


Likewise, the process St is reversible with respect to vp , so to get (4.28b) from
(4.30), it is then enough to prove that

(4.33)

To check this, consider (4.31). Neglecting the first term on the left, and using
(4.28a) to bound the term on the right gives

and this implies (4.33), since M(u)..) < 00 for each A by (4.31).

The Second Decomposition - Conclusion


For the next step, we need to use some functional analysis. Define Ilulll = .jM(u)
for cylinder functions u. After modding out by functions with II u III = 0 and
completing with respect to II . III, one obtains a Hilbert space HI. If g satisfies

for some constant K, then

u -+ f gudv p

defines a bounded linear functional on HI, and therefore can be represented by an


element G (g) of HI. For such g, define the norm

IIgll-1 = IIG(g)lh·
Let H_I be the completion (after modding out functions of norm 0 again) with
respect to II . II-I. This is again a Hilbert space. Note that in this language, our
basic assumption (4.28) becomes

(4.34) 111/111-1 :::: C and IIQuAII-I:::: C.


4. The Tagged Particle Process 291

When combined with (4.24), this implies

(4.35) AIIU,l,.II_1 ::s 2e.


Here is an elementary Hilbert space lemma that we will need below. Let (., .)
denote the inner product. Recall that Un -+ U weakly in a Hilbert space H means
that

(4.36) (Un' v) -+ (u, v)

for all v E H. A basic fact about weak convergence in H is that it is equivalent


to (4.36) for a dense set of v E H, together with

(4.37) sup Ilunll < 00.


n

The only hard part of the equivalence is the uniform boundedness principle (The-
orem 5.8 in Rudin (1966», which is used to deduce (4.37).

Lemma 4.38. Suppose H is a Hilbert space and Un E H converges weakly to u.


Then there is a sequence Vn E H so that for each n, Vn is a convex combination of
{u I, ... , un} and Vn converges strongly to u.

Proof By replacing Un by Un - u, we may assume that U = O. Since Un -+ 0


weakly, there is a sequence nk so that

Then

which tends to zero as k -+ 00 by (4.37). So, one can let

Next we will see that the solution to the resolvent equation U,l,. has some useful
properties as A t 0 that will enable us to pass to the limit in (4.27) to construct
the decomposition (4.23).

Theorem 4.39. Suppose (4.28) holds. Then there is aWE HI so that

lim Ilu,l,. - will


,l,.-I-0
= o.

Furthermore,
292 Part III. Exclusion Processes

Proof By (4.31) and (4.28a),

(4.40) A f uidvp + Ilu).llf :s Cllu).lll.


For each A, u). is a uniformly bounded function by (4.25). Therefore, it is in HI
by (4.31). In particular, all the terms in (4.40) are finite. By discarding the first
term in (4.40), we see that Ilu).11I :s C. Since bounded sets in a Hilbert space
are weakly relatively compact, there is a sequence An i 0 and aWE HI so that
Un = u).n converges weakly to w in HI. By (4.40) again, AJuidvp :s C 2 , so
AU). ---+ 0 strongly in L2 (v p) as A i O. By the definitions of the various Hilbert
spaces we are using, the inner products are related in the following ways: For
functions g, h in a dense set in H_ b u E HI,

(4.41) (g, h}_1 = (G(g), G(h)}1 = f gG(h)dvp, (u, G(g)}1 = f gudvp.

Therefore, the strong convergence AU). ---+ 0 in L 2 Cvp ), together with (4.35), im-
plies the weak convergence AU). ---+ 0 in H_ I . Recalling (4.24), it follows that
Qu). ---+ -1/1 weakly in H_ I .
Applying Lemma 4.38, there is a sequence Vn so that Vn is a convex combi-
nation of {UI, ... ,un} for each n so that Vn ---+ W strongly in HI and QVn ---+ -1/1
strongly in H_I. From the proof of Lemma 4.38, it is clear that the same convex
combinations can be used in the two cases. Again, by the definitions of the two
Hilbert spaces,

Therefore,

and hence
Ilwllf = f w1/ldvp.

Passing to the limit in (4.31) using (4.41) and the fact (from 4.34) that 1/1 E H_I,
we get

Since the norm is lower semicontinuous under weak convergence,

(4.43) Ilwllf :s liminfllunllf·


n--+oo

Combining (4.42) and (4.43) leads to

lim An
n---*oo
f u~dvp = 0 and lim lIunllf
n~oo
= IlwllT-
4. The Tagged Particle Process 293

The latter fact (together with the weak convergence of Un to w) implies that
Un ~ w strongly in HI, since

Ilw - unliT = IIwllT - 2(w, Un)1 + lIunliT ~ O.


This gives the assertions of the theorem along the sequence An. The fact that the
second assertion holds along all A ~ 0 follows from the fact that we could have
taken the An in the proof to be a subsequence of any given sequence.
So, it remains to prove that the limit w is independent of the sequence chosen.
To do so, take two values of A. Applying (4.24) to both and taking differences
gives

so that

(4.44)

Now take two subsequences An and A~ as in the first part of the proof, with
UAn ~ w and UA~ ~ w' strongly in HI. Recalling that AUA ~ 0 weakly in H_I,
it follows from (4.44) that IIw - w'III = 0 as required.

We can now get the decomposition (4.23).

Theorem 4.45. Suppose (4.28) holds. Then there is a martingale N(t) with sta-
tionary ergodic increments so that

(4.46) 10 1
1jJ(ss)ds = N(t) + D(t),

(4.47)
and

(4.48) lim EID(t)1 2 = o.


/-'>00 t

Proof Begin with (4.26). Since NA is a mean zero martingale, ENi(t) = tENi(1)
is linear in t. As in the proof of Proposition 4.19, we compute the factor by looking
at small t. The first term on the right of (4.26) is of order t as t ~ 0, so

· ENi(t)
11m = l'1m ----"------=--
E[UA(S/) - UA(SO)]2
I,J,O t I,J,O t

. 2 f u~dvp - 2 f uAS(t)uAdvp
= hm --"----'-'---'-----"------'-
t-J,O t

= - 2 lim
t,J,O
f UA
S(t)UA -
t
UA
dv p

= - 2 f UAQuAdvp = 211uAIif.
294 Part III. Exclusion Processes

Therefore

(4.49)

Applying the same argument to the differences in (4.26) for two different values
of).. gives
E[N),,(t) - NA2(t)]2 = 2tllu A1 - UA211~.
By Theorem 4.39, UA is Cauchy in HI. Therefore, for any t, NA (t) has an L2 limit
as ).. t O. The martingale property allows us to conclude that there is a square
integrable martingale N(t) so that NA(t) -+ N(t) in L2 for every t. By (4.27),
DA (t) has an L2 limit as well, which we will call D(t). This gives (4.46). For
(4.47), use (4.31) to write

IluAII~ .:::: f 1/Iu Adv p .:::: IluAIIII11/I11-1,

so that IluAllI .:::: 111/111-1' Now pass to the limit in (4.49) as ).. t O.
It remains to prove (4.48). Recall from the discussion that led to (4.27) that
DA can be expressed as

DA(t) = 1t )..uA(~s)ds - UA(~t) + UA(~O).


Using the inequality (a+b+c)2 .:::: 3(a 2+b2+c2) and then the Schwarz inequality,

f
ED~(t).:::: 6 u~dvp +3)..2 E [l t UA(~s)dSr .:::: (6+3)..2 t 2) u~dvp. f
Since D(t) = DA(t) + [NA(t) - N(t)] by (4.27) and (4.46), this implies that

ED2(t) .:::: (12 + 6)..2t 2) f u~dvp + 2tE[NA(1) - N(1)t

Divide this by t, put).. = lit, and let t -+ 00. Using the second statement in
Theorem 4.39 and the L2 convergence of N A(1) to N(1) leads to (4.48).

Asymptotic Normality for X t


Combining the two decompositions (4.2) and (4.46) gives

X t - EXt = M t + N(t) + D(t).


The proof of Proposition 4.19 applies equally well to the martingale N(t) - the
function h that appears in that proof is now

The finiteness of the limiting variance is guaranteed by (4.47).


4. The Tagged Particle Process 295

In fact, the same proof applies to the martingale Mt+N(t). Combining this with
(4.48) leads to the following central limit theorem for the position of the tagged
particle. Recall that we have proved (4.28) for symmetric systems (Proposition
4.32), and that it has been proved in many other cases.

Theorem 4.50. Suppose (4.28) holds. Then


X t - EXt
"fi =} N(O, 1:)

for some covariance matrix 1:.

The Limit Is Not Degenerate


An unsatisfactory aspect of Theorem 4.50 is that, as far as we know at this point,
1: might be O. In fact, 1: is zero in the nearest neighbor, symmetric case in one
dimension - see Theorem 1.21. This degeneracy can happen because the two
martingales M t and N(t) cancel each other. We will now check that, at least if
m = 0, this complete cancellation cannot occur in any other cases. The key result
is an improvement of condition 111/111-1 < 00 from (4.34), which is equivalent to

for some constant C.

Lemma 4.51. Suppose m = 0 (and p(., .) is not nearest neighbor in one dimen-
Sion}. Then there is a constant C so that

(4.52)

Remarks. Recall that 1/1 = 1/1 if m = O. Note that the above statement is not true
in the one dimensional nearest neighbor case. In that case, 1/1(17) = ~(-li-~(1). If
u is of the form
L a (k)17(k)
00

U(17) =
k=1
where a(k) = 0 for all but finitely many k's, then (4.52) becomes

L [a(k) -
00

a 2 (1) ::: C a(k + 1)]2


k=1

(for a different constant C). But this is clearly false.

Proof of Lemma 4.51. Since the mean is zero,

(4.53) 1/1(17) = L [1 -17(X)]XP(O, x) = L a (X)17(X),


x x#
296 Part III. Exclusion Processes

where a(x) =fo 0 for only finitely many x's and LxiOa(x) = O. Therefore there
are numbers a (x, y), with only finitely many of them nonzero, so that

(4.54) 1/1(1J) = L a(x, y)[ 1J(x) - 1J(Y)]'


x,yiO

One way to argue this, is inductively on the number of nonzero a(x)'s. For the
induction step, take any two nonzero sites x, y so that a (x) =fo 0 and a (y) =fo O.
The contributions to (4.53) due to them are

a(x)1J(x) + a(Y)1J(Y) = a(x)[ 1J(x) - 1J(Y)] + [a(x) + a(y) hey).


This reduces the problem to one involving fewer nonzero a(·)'s. The property
LxiO a(x) = 0 is preserved, so there can never be exactly one site at which a is
nonzero. By an argument similar to the one used to prove (4.14), using the fact
that the random walk is irreducible and not nearest neighbor in one dimension,
for any nonzero x, y, there is either a path x = Xo, ... , Xn = Y from x to y so
that Xi =fo 0 for each i and P(Xi-l, Xi) > 0 for each i, or there is such a path from
y to x. Writing
n
1J(x) - 1J(Y) = L [1J(Xi-l) - 1J(Xi)],
i=1

we see that in representation (4.54), we may assume that a(x, y) =fo 0 only if
p(x, y) > O.
Since vp is invariant under permutations of nonzero coordinates, if x, yare
nonzero,

f [1J(x) -1J(y)]u(1J)dvp = f [1Jx,y(Y) -1Jx,y(x)]u(1J)dvp

= f [1J(y) -1J(x)]u(1Jx,y)dvp •

Therefore, we can write

Using the Schwarz inequality leads to

Since a (x, y) =fo 0 for only finitely many pairs (x, y), the second factor on the
right is finite. Since a(x, y) =fo 0 only if p(x, y) > 0, the third factor on the right
is bounded by a constant multiple of ~xCu), so the result follows.

Now we can complete the statement of the central limit theorem for X t .
4. The Tagged Particle Process 297

Theorem 4.55. Suppose that m = 0 (and pC .) is not nearest neighbor in one


dimension}. Then in the context of Theorem 4.50, ~ =1= o.

Proof Begin again by multiplying (4.24) by UA and integrating with respect to vp.
Drop the first term on the left and use (4.52) to bound the term on the right. The
result is

(4.56)

+
If IluA111 -+ 0 as A. 0, then (4.49) implies that NA(t) -+ 0 in L2 for each t
and hence N(t) = O. In this case, the ~ in Theorem 4.50 is the same as the ~ in
Proposition 4.19, which is not zero. Therefore, we may assume IluA111 fr 0, and
hence by (4.56),

(4.57)

The final step is a modification of the first argument in the proof of Theorem
4.45. Let gA(X, n
= x + uA(l;). Combining (4.2) with (4.26) and recalling that
QUA = Qu since UA is a function of ~ alone, one can write

Therefore, recalling the definition of Q at the beginning of this section, we have

. E[NA(t) + M t ]2 . E[gA(Xt'~t)-gA(XO'~O)]2
lIm = lIm - - = - - - - - - - - - - - - ' ' - -
q,o t tin t

=j L p(u, v)~(u)[l - ~(V)][gA(O' ~u,v) - gA(O, n]2dvp


u,v"'o

+ jLP(O,Y)[I-l;(Y)J[gA(y,ryn-gA(O,n]2dVp
y

::: j L p(u, v)S(u)[l - ~(V)][UA(~U,V) - u A(n]2 dvp = 2~X(UA)'


u,v*o
Since NA(t) + Mt is a martingale, it follows that

and hence
E[N(t) + Mt ]2 ::: 2t limsup~xCuA)'
AiO
Combining this with (4.57) gives the result.
298 Part III. Exclusion Processes

5. Notes and References

Results from Section 1


Most of the material in Section 1 comes from IPS. Precise references can be found
there. Theorem 1.17 is due to Saada (1987). Here are some results that have been
proved for the exclusion process that are not directly related to the material covered
in Sections 2-4, beginning with symmetric systems:

Correlation Inequalities for Symmetric Systems. Consider the exclusion process


with p(x, y) = p(y, x), x, YES. Andjel (1988) proved the following negative
correlation inequality: If A, B are disjoint subsets of Sand TJ E {O, l}s, then

for all t ::: O. Note that this is a generalization of (1.14). He then used (5.1) to prove
the following pointwise ergodic theorem: If S = Zd and p(x, y) = p(O, Y - x),
and if the initial configuration satisfies

(5.2) lim'""' Pt(x, Y)TJ(Y) = p, XES


t-+oo~
y

then

(5.3) lim ~ 10
t-+oo t
r I(TJs)ds = f Idvp a.s.

for every continuous function I on {O, l}s. This had been proved earlier by Andjel
and Kipnis (1987) for d ::: 3, and for all d if I depends on only one coordinate.
Note that (5.2) is simply the hypothesis of Theorem 1.13 in case of deterministic
initial configurations (with a = constant - recall that 9(j consists only of constants
in the translation invariant context).
The negative correlation inequality (5.1) should be compared with Theorem
B17, which provides a positive correlation result for attractive spin systems. The-
orem Bl7 asserts in particular that I(TJt) and g(TJt) are positively correlated for
deterministic initial configurations for all increasing continuous functions I and
g, while (5.1) applies only to very special increasing functions - those of the form

if TJ == 1 on A
otherwise.

It would be interesting to know for what larger class of increasing functions


one could prove negative correlations in the context of the symmetric exclusion
process.
Negative correlations are harder to deal with than positive correlations, because
for any function I, I(TJt) and I(TJt) are automatically positively correlated:
5. Notes and References 299

by lensen's or Holder's inequality. This means that it would be natural in proving


negative correlations results to assume something like ! and g depend on disjoint
sets of coordinates. This creates difficulties, though, since it does not follow from
this that Set)! and S(t)g depend on disjoint sets of coordinates for t > O.
Negative correlations are not in general preserved by asymmetric exclusion
processes. This is easy to see from Theorem 2.93. In that case, the initial distri-
bution is a product measure, but the limiting distribution is a mixture of different
product measures, and therefore has (strictly) positive correlations.

Occupation Times for Symmetric Systems. Statement (5.3) when !(11) = 11(0) is a
strong law of large numbers for the occupation time of the origin. Kipnis (1987)
proved the associated central limit theorem in the case of the nearest neighbor ex-
clusion process on Zd with initial distribution vp. The result displays an interesting
dimension dependence. Here it is:

f~ I1s (O)ds - pt 2
~---- => N(O, a ),

I
bet)
where
r
3
if d = 1,
b(t) = ~tlogt if d = 2,
Jt if d :::: 3,
and
4v'2 if d = 1,
a2 =p(1-p)x 1 3.jii
1 if d = 2,
; fooo Ps(O, O)ds if d :::: 3,
where Ps (x, y) is the probability that a simple random walk on Zd that starts at
x will be at y at time s.
There are also large deviations results in this context. Arratia (1985) and
Landim (1992) proved that if ex > p, then

log p (~ fot I1s(O)ds :::: ex)


lies between two constant multiples of

if d = 1,
if d = 2,
if d :::: 3.
Process level large deviations results for the symmetric exclusion process on
Zd, d :::: 3 have been obtained by Quastel, Rezakhanlou and Varadhan (1999).

Occurrence of Rare Events. During the past decade and a half, a number of results
have been proved that say that the time at which a rare event occurs is nearly
300 Part III. Exclusion Processes

exponential. For Markov chains, see Aldous (1982), for example. In interacting
particle systems, there are results of this type for stochastic Ising models (Schon-
mann (1991), Schonmann and Shlosman (1998» and for zero range processes
(Ferrari, Galves and Landim (1994», among others. Here is a result of this type
for the symmetric exclusion process on Zd with nearest neighbor jumps. It was
proved for d = 1, pi = 1 by Ferrari, Galves and Liggett (1995) and for general
d, pi by Asselah and Dai Pra (1997).
Let 17t be the process with initial distribution vP' and set

Tn = inf {t : : 0: L 17t (x) :::: pi n d },


xE[I,nJd

where pi > p. Then there is a sequence an and constants Ci so that

sup IP(anTn > t) - e-tl.:::: e- C1nd


t2:0

and

C2nd-lvp{17: L 17(X)=[ndpIJ}.::::an'::::C3nd-lvp{17: L 17 (x) = [ndpIJ}.


xE[I,nJd xE[I,nJd

Rates o/Convergence. Take S = Zl and p(x, x+ 1) = p(x, x-I) = x E Zl. A !,


number of results have been proved about the rate of convergence to equilibrium
in this context. Ferrari, Presutti, Scacciatelli and Vares (1991 a) proved that there
are constants Cn so that if A C Zl is finite and 17 E {O, 1}Zl, then

(5.4)
IETI n
XEA
[17t(X) - PTI(17t(X) = I)JI.:::: CIAlt-IAI/8.

In the companion paper (1991b), they prove the following strong form of (5.3) in
this case: If 0< a < 1, n > 1, and 17 E {O, l}Zl, then

1 It+t. j(Tx17s) -
lim sup 1 -;;
t-->oo Ixign t t
f jdva(x,t)
I
= ° a.s.

for any continuous j On {O, l}Zl, where

a(x, t) = LPt(x, Y)17(Y),


y

and Tx is the spatial shift by x units.


If A = {y, z}, (5.4) gives a uniform bound on the rate of decay of the co-
variance of 17t(Y) and 17t(z) of order t-~ for deterministic initial configurations. A
precise decay rate for this covariance was given by Keisling (1998) for the special
case where the initial distribution is the fair mixture of the pointmasses on 17e and
5. Notes and References 301

TJo, which have particles exactly on the even sites and odd sites respectively. In
this case, the covariance is asymptotic to -et-~, where e is an explicit constant.
For symmetric exclusion processes in higher dimensions, one can prove al-
gebraic rates of convergence in L 2 (v p). Deuschel (1994) did so for some related
processes, and indicated that the same technique gives analogous results for sym-
metric exclusion processes.
For the asymmetric nearest neighbor exclusion process on Zd, Cancrini and
Galves (1995) proved that if the initial configuration is periodic, or if the initial
distribution is stationary and exponentially mixing, then there is a constant e
(depending on the initial distribution) so that
d
d nd 2d ( log t )
Ip(TJt == 1 on {I, ... , n} ) - pis en Jt

The Finite Symmetric System. Consider two particles moving according to a sym-
metric, translation invariant exclusion process on Zd, where Lx Ix 12 p(O, x) < 00,
and two particles with the same initial states moving according to the same rules,
but without the exclusion interaction. In an unpublished manuscript, Andjel ob-
tained the following upper bounds for the total variation distance between the
distributions of these two systems:
e logt
if d = 1,
Jt
e logt if d = 2,
t
e if d :::: 3,

where e is a constant. In the one dimensional nearest neighbor case, he was able
to remove the logarithmic term.

The Asymmetric System on a Finite Torus. Fill (1991) studied rates of convergence
to equilibrium for the system of k particles that move clockwise at rate 1 on a
discrete circle of size N with exclusion. The particles are regarded as labelled, so
that for a given initial configuration, there are kG) possible configurations at later
times. The limiting distribution f.1 is uniform on that set of configurations. Let f.1t
be the distribution at time t, where the initial state is deterministic, and let II . II
denote the total variation norm. Here is one of Fill's results. Take N = 2k. Then

IIf.1cN6 - f.111 S IN /8exp { -l(3C -IOg4)N}.


Gwa and Spohn (1992) discuss the asymptotics of the principal eigenvalues of
the generator of this process.

Hydrodynamics. This is a huge area of interacting particle systems, which is the


subject of an entire book - Kipnis and Landim (1999). Many of the papers that
302 Part III. Exclusion Processes

have been written on this topic are listed in the Bibliography, and can be identified
by the presence of the word hydrodynamics in the title. We list here only some:
Rost (1981), Kipnis, Olla and Varadhan (1989), De Masi and Presutti (1991),
Landim (1991), Rezakhanlou (1991), and Varadhan (1994a).
Here is an informal description of the type of result that falls under the rubric
of hydrodynamics: Consider a translation invariant exclusion process on Zd, where
the individual particles have a drift

m = LXp(O, x) =1= O.
x

Suppose that u is a reasonable function on R d , and the initial distribution fl~ for
the exclusion process is close to being a product measure with density

fl~ {ry : ry (x) = I} = u (~ ).


Then for large N, the distribution fl Zt of the process at time Nt will be approxi-
mately a product measure with density

flZt{ry: ry(x) = I} = u(~, t)'


where u(x, t) is the (correct) solution of Burgers' equation, which now is the
following generalization of (2.3):
au
- + m . grad(u[1 - u1) = 0, u(x,O) = u(x).
at
Here grad(f) refers to the gradient of a function f.
The relevance of measures that are locally product in both the hypothesis and
in the conclusion comes from Theorems 1.2 and 1.16, which guarantee that homo-
geneous product measures are the most general extremal stationary shift invariant
measures. Strictly speaking, the term hydrodynamics refers to the asymptotic evo-
lution of the particle density according to Burgers' equation. The fact that the
distribution at later times is close to being the corresponding product measure is
usually known as preservation of local equilibrium.
These general results have only been proved at continuity points of u(x, t),
and of course, the results as stated are not true at the shock, since we know now
(by Theorem 2.93) that the limit at the shock is a mixture of product measures
rather than a single product measure.
In the context of Section 2, but with A > p, the fact that

VA if A S ~,
{
(5.5) lim vA.pS(t) =
1---7>-00
Vl
2
if p S ~ SA,
Vp
·f 1
1 P 2: 2'

follows from these more general results. To see this, recall our discussion of (2.3)
in case A > p: The solution with initial condition (2.4) is
5. Notes and References 303

if x ::: (p-q)(1-2A)t,
if (p - q)(1 - 2)..)t :::: x ::: (p - q)(1 - 2p)t,
if x ::: (p - q)(1 - 2p)t.

Taking x = 0 in (5.6), it should then be clear how the general hydrodynamic


results lead to (5.5). The special result (5.5) was proved much earlier by Liggett
(1975, 1977) before the connection with Burgers' equation was understood. (The
second paper considered processes that are not necessarily nearest neighbor.)
Hydrodynamics results have been proved in many other contexts. One example
is the exclusion process in which at most one particle per site is replaced by at
most K particles per site - see Kipnis, Landim and Olla (1994) for the symmetric
case (with K = 2) in one dimension, and Seppalainen (1999) for the totally
asymmetric case in one dimension. A significant difference between the symmetric
and asymmetric cases is that the invariant measures can be written down explicitly
in the symmetric case, but (unlike for the exclusion process, K = 1) not in the
asymmetric case. In the symmetric case, invariant measures can be obtained by
taking product measures whose marginals are Poisson distributions, conditioned to
be::: K. For more on this, see Keisling (1998). Other results have been proved in
this context. See Yau (1997) for the logarithmic Sobolev inequality, for example.

The Weakly Asymmetric Exclusion Process. This refers to a system in which there
are symmetric jumps at a fast rate E- i , and completely asymmetric jumps at rate
1. De Masi, Presutti and Scacciatelli (1989) consider a one dimensional process
of this type, and prove that for times of order 1, the hydro dynamical behaviour is
governed by the linear heat equation (as it would be in the symmetric case), while
for times of order E- i , the relevant equation is a (nonlinear) Burgers equation (as in
the asymmetric case). Other papers on this topic are Gartner (1988), Dittrich (1990,
1992), and Dittrich and Gartner (1991). Ravishankar (1992b) deals with similar
issues for a two-dimensional process in which the asymmetry applies to only one
direction. The weakly asymmetric process in a regime that leads to a nonlinear
stochastic partial differential equation was studied in Bertini and Giacomin (1997).

Results from Section 2


The issues discussed in Section 2 were first raised to the author by F. Spitzer in
1974. His original question was: For the exclusion process that moves only one
step to the right (p(x, x + 1) = 1), what is the limiting distribution when the initial
configuration is
... 1 1 I 1 0000 ... ?

This corresponds to the case ).. = 1, p = O. It was clear by symmetry that the
answer had to be \J l, but at that time it was not even clear how to prove that.
The material id this section is based on Ferrari, Kipnis and Saada (1991),
Ferrari (1992a) and Ferrari and Fontes (1994a, 1994b). In Ferrari (1992a) and the
papers that followed, the property'" 1L).,p is defined as in (2.8), but without the
Cesaro averaging, and the analogue of Theorem 2.16 and corresponding result for
304 Part III. Exclusion Processes

Zt (see the discussion surrounding (2.9)) are asserted for this stronger version of
the property. Only the weaker fonn follows directly from the arguments given
there, however. It is possible to obtain the stronger (non-Cesaro) statement for Xt,
but that, together with (2.9), does not imply the stronger statement for Zt. Since
Zt = Xt if p = 1, there is no difficulty in this case.
Proposition 2.10 (for a system with only one class of particles) is due to Harris
(1967). The proofs given here are taken from Ferrari (1986) and Ferrari (1992b)
respectively. In the 1986 paper, Ferrari identifies all the invariant measures for the
process viewed from the tagged particle.
Ferrari, Kipnis and Saada proved the strong law version of Theorem 2.34,
rather than the weak law presented here. Theorem 2.43 was conjectured by Spohn
(1991). Proposition 2.74 was proved by Giirtner and Presutti (1990) for A = 0,
p=l.
The crucial variance computation given in Proposition 2.56 is Theorem 3.1 of
Ferrari and Fontes (1994a). The expression given here looks quite different from
that in the paper, but it is not hard to check that it is, in fact, the same. In carrying
out the verification of this, the reader should keep in mind that the roles of A and p
are reversed in the paper, and that our Ut and Vt are called Rt and R t respectively
in the paper.
°
Theorem 2.93 was proved for A = 0, a = by Wick (1985) in case p = 1 and
by De Masi, Kipnis, Presutti and Saada (1989) for p > !. The latter paper also has

°
a central limit theorem for XI> which is, in this case, the position of the leftmost
particle. Theorem 2.93 was proved for A + p = 1, a = by Andjel, Bramson
and Liggett (1988). (A Cesaro version of the statement was proved earlier by
Andjel (1986).) This is the third case in Corollary 2.102. The other two cases
were proved by Liggett (1975). They were extended to more general (not nearest
neighbor) exclusion processes by Liggett (1977).
The connection between the exclusion process and queuing theory that was
used in the proof of Proposition 2.61 was first employed by Kipnis (1986). This
connection was exploited in the opposite direction recently when Mountford and
Prabhakar (1995) used ideas about the exclusion process to settle an old problem
about series of queues. To state their result, recall Theorem B59, a special case
of which says that that if a stationary Poisson process of rate A < 1 is fed into
a single server queue, where the service is exponential of rate 1, then the output
process is again Poisson with rate A. Now suppose the input process is a general
stationary ergodic point process X of rate A < 1. Then the output process is
another stationary ergodic point process of rate A - call it T X. This can be fed
back into another queue of the same type, and the output process is then T2 X.
The Mountford-Prabhakar theorem states that
r X:::} the Poisson process of rate A
as n -+ 00. For other connections between exclusion processes and queuing sys-
tems, see Srinivasan (1993) and Seppiiliiinen (1997).
Here are some other results that have been proved for the nearest neighbor
asymmetric exclusion process on ZI:
5. Notes and References 305

The Rarefaction Fan. Consider the process with P = 1 and initial distribution vA•P
on Zl \ {OJ, A > p, and put a second class particle at the origin. Let Zt be the
position of the second class particle at time t. Ferrari and Kipnis (1995) proved that
Zt! t converges weakly as t ---+ 00 to the uniform distribution on [1 - 2A, 1 - 2p].
To understand the limiting distribution, recall that a second class particle in a sea
of first class particles of density y has drift (1 - 2y) ~ see Proposition 2.57. So,
a second class particle well to the left of the origin would travel at speed 1 - 2A,
while a particle well to the right would travel at speed 1 - 2p. The result asserts
that a second class particle starting at the origin chooses what speed to use at
random from among the allowed possibilities.
Ferrari and Kipnis also consider what happens when initially there are two
second class particles, one at 0 and the other at 1, the negative sites are occupied
by first class particles, and the other positive sites are empty. The second class
particles coalesce at rate 1 when they are at adjacent sites. If Z~ and Z/ are their
respective positions at time t, they prove that

P(Z~ =1= zi for all t) :::: 4'1


and
.
11m E(Zl- Zn
---''--'------'..:....
2
t ..... oo t 3

Evolution of the Finite System. Schutz (1997b) computes the exact distribution of
the finite exclusion process At in the totally asymmetric case P = 1. The anSwer

I
is given in terms of the function Fm(n, t) that is defined for n :::: 0 by

Fm(n, t) =
e-
t
b (k + 1)
00 m-
m- 1 (k
t k +n
+ n)! if m :::: 1,

e
-t
b(-) (1m I)
Iml 1k
k
t k +n
(k+n)! if m :s o.

If A = {Xl, ... ,XN} and B = {YI, ... , YN} are configurations of size N, written
so that Xl < ... < XN and YI < ... < YN, then pA(A t = B) is the determinant
of the N x N matrix whose (i, j) entry is Fi-j(Yi -Xj, t).
This is reminiscent of the following old result by Karlin and McGregor (1959).
Suppose XI(t), ... ,XN(t) are independent continuous time birth and death pro-
cesses On Zl with transition probabilities Pt(x, Y), i.e., Markov chains that move
only One step to the left or right at each transition. Let G be the event that
XI(s) < ... < XN(s) for all s:s t. If Xl < ... < XN and YI < ... < YN, then

is the determinant of the N x N matrix whose (i, j) entry is Pt (Xi , Yj). Note that
if these chains move only to the right at rate 1, then Pt (x, y) = Fo (y - x, t).
306 Part III. Exclusion Processes

Shift Equivalent Measures. Consider the context of (2.7), in which the exclusion
process is viewed from the location of a second class particle. If the initial dis-
tribution is vA,p on ZI\{O}, Derrida, Goldstein, Lebowitz and Speer (1998) have
proved that the two measures obtained by treating the second class particle as
either a first class particle or an empty site are random translates of one another.

Results from Section 3


This process on {1, ... ,N} with boundary conditions at the two endpoints ap-
peared already in MacDonald, Gibbs and Pipkin (1968) in their study of the
kinetics of protein synthesis. It was first studied rigorously by Liggett (1975).
He proved that in the irreducible case, the stationary distributions for the process
satisfy the following recursion: There is a constant CN so that

PftN(EI"'Ei-II0Ei+2"'EN) -qftN(EI"'Ei-I 01Ei+2"' EN)


(5.7a)
= CN[ftN-I (EI ... Ei-I OEi+2 ... EN) + ftN-I (EI ... Ei-I1Ei+2 ... EN)]
for all choices 1 :::: i < Nand E/S in {O, I},

(5.7b)

for all choices of E/S in {O, I}, and

(5.7c) p(l - P)ftN(EI ... EN-II) - qPftN(EI .•. EN-IO) = CNftN-I (EI ... EN-I)
for all choices of E/S in {O, I}. Note that (5.7b,c) can be thought of as versions
of (5.7a) for i = 0 and i = N respectively, since the boundary conditions have
the interpretation of making 11(0) = 1 with probability A and I1(N + 1) = 1 with
probability P, independently of the configuration on {I, ... , N}. By summing over
all values of the Ej'S, it is clear that eN is the net rate at which particles move
to the right in equilibrium for the process on {I, ... , N} - i.e., the current. He
then used this recursion to obtain the asymptotics of ftN as N ---+ 00, and then to
compute
lim vA,pS(t)
t~oo

in the context of Section 2 for p > i,


A + P =1= 1.
A similar recursion for p = 1 was obtained independently by Derrida, Domany
and Mukamel (1992), and used to prove the following explicit formula for the
marginals in case A = 1, P = 0:

They then observed this implies the following square root law for the spatial decay
of the system that is obtained by formally letting N ---+ 00:
. 1 1
hm ftN{11 : 11(i) = I} ~ - + '-"" i ---+ 00.
N~oo 2 2", rri
5. Notes and References 307

The recursions discussed above were the precursors of the matrix method,
which is the main subject of Section 3. In fact, it is easy to check that (5.7) is
a consequence of Theorem 3.1 (provided, of course, that there exist D, E, w, v
satisfying its assumptions), with (3.2a,b,c) being used to check (5.7a,b,c) respec-
tively. Besides, equation (3.5), which is the key to the proof of Theorem 3.1,
is a restatement of (5.7). The matrix method was introduced by Derrida, Evans,
Hakim and Pasquier (1993a) and used extensively in a series of papers by Derrida
and various coauthors. Much of the material in Section 3 is based on the paper
mentioned above and the review paper by Derrida and Evans (1997). Some of
these results were obtained independently by Schlitz and Domany (1993) using
the recursions directly. The analysis of the partition function corresponding to
Corollary 3.27 was carried out by Sandow (1994) for the general case p > !,
though parts of the argument do not appear to be entirely rigorous. Theorems
3.28 and 3.29 were originally proved in Liggett (1975) for general p > and in !
Liggett (1977) for one dimensional exclusion processes with Lx Ixlp(O, x) < 00
and Lx xp(O, x) > O.
Theorem 3.47 is taken from lanowsky and Lebowitz (1994). In that paper,
the authors observe that the same monotonicity arguments that are used in the
proof of Theorem 3.47 can be used to show that the current for the system on
{- N + 1, ... , N} with blockage between sites 0 and 1,

IN(r) = O"N{1J : 1J(i) = 1, 1J(i + 1) = O}, -N < i < N, i =1= 0,

is decreasing in N for each r. Therefore, anytime JN (r) < ~ for some N, it


follows that J < ~ for that r. They then compute JN (r) for small values of N. It
turns out that J1 (.4) = h(.4591) = iJ(.4943) = ~. (The arguments are rounded
to four decimal places.) Using simulation, they also get the approximate value
J15000(.8) = .24979. Based on this, it seems reasonable to guess that J < ~ for
all r < 1, but this has not been proved.
The following picture for the exclusion process on Zl with p = 1 and a
blockage between sites 0 and 1 has emerged from the work of lanowsky and
Lebowitz, and unpublished work of Bramson: There is a function y (r) on [0, 1]
that is continuous and strictly decreasing, and satisfies y (0) = 1, y (l) = so !,
that
(a) There is an invariant measure that is asymptotic to vy(r) at -00 and to
VI-y(r) at +00.
(b) For every y > y(r) and every y < 1 - y(r), there is an invariant measure
that is asymptotic to Vy at ±oo.
(c) For 1 - y(r) < y < y(r) there is no invariant measure that is asymptotic
to Vy at -00 or at +00.
Some of this picture has been proved, but the most interesting parts remain open.
Here are some other results that have been proved using the matrix approach.

The Diffusion Constant. Consider the system on {I, ... ,N} with p = 1 in equi-
librium, and let Yt be the number of particles that have entered {l, ... , N} by
308 Part III. Exclusion Processes

time t. Then Yt has stationary increments, and

EYt
- = A/LN{11 : 11(1) = O} = /LN(10),
t

which is the current whose asymptotics are given in Theorem 3.28. Derrida, Evans
and Mallick (1995) have computed the asymptotic variance

. Var(Yt )
/}.N = hm - - - .
t--+oo t

Their expression is rather complicated - see (58) in the paper - but simplifies
significantly in two cases: (a) if A = p,

/}.N =AO-A)(2A-l +2[A(1-A)f+IRN(A-1»)

I A(1I-A)I2A-l1
4JriN
ifA=F!,
if A = 2'
1 N -+ 00,

where the asymptotics come from Lemma 3.24, and (b) A = 1, p = 0,


(4N+4) ~

/}.N = 4(2N
3(N + 1)
+ 1)(4N + 3) e 2N+I
NN+2)2 ~
3",2:rr
64,.[Fi'
N -+ 00.

Analogous results for the exclusion process on {I, ... , N} with periodic boundary
conditions (i.e., where one identifies sites 0 and N) were obtained by Derrida,
Evans and Mukamel (1993) in case p = 1 and by Derrida and Mallick (1997) for
general p.

Exponential Rate of Growth. The next step after considering the asymptotics of
the first two moments is to study the behavior of exponential moments. Derrida
and Lebowitz (1998) have done so for the process with p = 1 with periodic
boundary conditions. To state their result, consider the process with n particles on
{I, ... , N}, and let Yt be the total distance travelled by all the particles by time
t. Then
log Ee"Y'
y (a) = lim ---=----
t--+oo t
can be computed by solving

yea) = -n 8 1
00
Nk _ 1
(Nk -
nk
1) xk
and
a = - f_l
k=1 Nk
(Nk)xk
nk
simultaneously, eliminating the x.
5. Notes and References 309

Invariant Measures Viewed from the Location of the Shock. In Section 2, we saw
that for the asymmetric system on Z I, the location of the shock can be thought
of as the position Zt of a second class particle moving in a sea of first class
particles. To understand the microscopic structure of the shock, it is natural to
study invariant measures for the process of first class particles, when viewed
from Zt. That such invariant measures exist was proved by Ferrari, Kipnis and
Saada (1991). Using the matrix approach, Derrida, Lebowitz and Speer (1997)
were able to write down these measures explicitly. Corresponding expressions
in case p = 1 were obtained earlier by Derrida, Janowsky, Lebowitz and Speer
(l993).
To describe these results, let TIt be the process of first class particles on Zl,
viewed from the position of a single second class particle - the second class
particle is always placed at O. Take 0 :::: A < p :::: 1, and consider three matrices
v w
D, E and A and two vectors and that satisfy the following analogue of (3.2):

pDE - qED = (p - q)[(l - A)(l - p)D + ApE],


pAE - qEA = (p - q)(1 - A)(l - p)A,
(5.8)
pDA - qAD = (p - q)ApA
(D + E)v = v, weD + E) = w, wAv = 1.
Then the measure J1, on {O, l}ZI\{O} that has cylinder probabilities

is well defined and invariant for TIt. Furthermore, the measure is asymptotically
VA at -00 and vp at +00, with this convergence being exponentially rapid.
In fact, the single site probabilities are expressed in the following very explicit
form. Let Xn be a random walk on Zl that starts at 0, moves one step to the
right with probability p(l - A), one step to the left with probability A(l - p) and
remains where it is with the remaining probability, (1 - A) (l - p) + Ap. Note that
this random walk has a drift to the right, since A < p. Define
qk
f(k) = k Pk -q k
for k =1= 0, letting f(O) be arbitrary, and in terms of f,

I
F(k) = __ [p2(l - A)2 f(k + 2) - p(l - A)(A +p- 2Ap)f(k + I)
P-A
+ A(l - p)(A +p - 2Ap)f(k - I) - A2(l - p)2 f(k - 2)].

Note that F satisfies


as k -+ +00
F(k) -+ { 0
A-P as k -+ -00
310 Part III. Exclusion Processes

exponentially rapidly. Then

J.L{1] : 1](n + 1) = I} = p + EF(X n ), n:::: O.

The single site probabilities for the negative sites can be obtained by symmetry:

J.L{1]: 1](n) = I} + J.L{1] : 1](-n) = I} ="A + p, n =1= O.

Related results can be found in Sandow and Schiitz (1994) and Schiitz (1997a).
Speer (1997) showed that if "A > 0, p < 1, then system (5.8) has a finite
dimensional representation if and only if

( _pq)r = "A(l _ p)
p(1 -"A)

for some positive integer r.


Invariant measures for the case in which both first class and second class par-
ticles appear with positive densities (recall the discussion preceding Proposition
2.10) with p = 1 are discussed by Derrida, Janowsky, Lebowitz and Speer (1993),
Speer (1994), and Ferrari, Fontes and Kohayakawa (1994). In the latter paper, it
is proved that the invariant measures with good marginals have a renewal struc-
ture: The configurations of first class particles and empty sites between successive
second class particles are i.i.d.

Results from Section 4


Proposition 4.8 and Theorem 4.17 are due to Saada (1987). The weaker version of
Theorem 4.17 without the assertion that the limit in (4.18) is constant is Corollary
4.6 of Chapter VIII of IPS.
The central limit theorem for the tagged particle process XI was first proved
in the case of asymmetric nearest neighbor exclusion processes in one dimension
by Kipnis (1986). The limiting variance in this case is (1 - p)(p - q), as was
proved by De Masi and Ferrari (1985).
The central limit theorem for XI was then proved in different contexts by
Kipnis and Varadhan (1986) (for symmetric systems), Varadhan (1995) (for asym-
metric systems with mean zero), and Sethuraman, Varadhan and Yau (1999) (for
systems on Zd, d :::: 3 with nonzero mean). The treatment in Section 4 is based
on these three papers. The problem remains open for one and two dimensional
systems with drift, other than those covered by Kipnis (1986).
The Kipnis-Varadhan paper deduced the central limit theorem for XI from a
more general result for additive functionals of reversible Markov processes. De
Masi, Ferrari, Goldstein and Wick (1989) weakened the hypotheses in the general
theorem. A streamlined proof of their result was given by Goldstein (1995).
Here are some other results on tagged particles in the exclusion process.

Systems of Tagged Particles. In Section 4, we have considered the situation in


which only one particle is tagged. One can equally well follow all of the particles.
5. Notes and References 311

Seppiiliiinen (1998a) does so in the context of the totally asymmetric exclusion


process on Zl (i.e., the context of Section 2 with p = 1). Let {... < X;-I < X? <
Xi < ... } be the ordered locations of the particles at time t. The main results
are: If laws of large numbers and large deviations results hold at time 0, then they
hold at (rescaled) time t, with limits that can be computed explicitly. To describe
the results more concretely, take a fixed Xo E RI.
For the law of large numbers, the assumption is that there is a nondecreasing
function Vo{x) so that
X[nx1
lim _0- = Vo{x)
n--+oo n
in probability for x ::: Xo. Then for t > 0, x ::: xo,
X[nx 1
lim ~
n--+oo n
= Vex, t),
where
Vex, t) = )~L {vo{x) + x - y + [-Ii - JY=X]2}.
For (lower tail) large deviations, the assumption is that for every x ::: Xo there
exists a right continuous function ¢x on R I so that for all s,

¢xCs) = - lim log p(X~nxl :::: ns).


n--+ 00

The conclusion is that for t > 0, x ::: xo,

- lim log p(Xf~nl :::: ns)


n--+oo
= lx,t{s),
where

t+s-r )
- 2{s - r +y - x) cosh- I (
2,Jt{s - r + y - x)

- 2{y - x) cosh- I ( t-s+r )]} .


2,Jt{y - x)

Asymmetric Tagged Particle in a Symmetric Environment. Landim, Olla and


Volchan (1997, 1998) consider the exclusion process on Zl in which a tagged
particle starting at the origin moves to the right with probability p and to the left
with probability q, where p + q = 1, p > !,
while all other particles move to the
!
right and left with probability each, Suppose that initially, the symmetrically
moving particles have distribution vp on Z 1\ {O}. Then they show that the position
X t of the tagged particle satisfies
. Xt
hm r; = v{p)
t--+oo Vt
312 Part III. Exclusion Processes

in probability, where v(p) is deterministic, and satisfies

(5.9) lim v(p) = 1- p iI.


pH p -q p V;
Their proofs use a mapping between the exclusion process and the zero range
process. This latter process can be viewed as a system of queues. The fact that
the right side of (5.9) agrees with the limiting variance in Theorem 1.21 is known
as an Einstein relation. Weaker results along these lines were proved earlier by
Ferrari, Goldstein and Lebowitz (1985).

Some Results about Related Processes


Various relatives of the exclusion process have received some attention. Here is a
brief description of some results that have been proved about them.

Exclusion Processes with Different Update Rules. If one wishes to simulate the
exclusion process, one is naturally led to consider various rules that could be used
to determine the order in which the states of the various sites are updated. For
the process on {l, . " , N} that is studied in Section 3, for example, updates must
be performed at each endpoint of the interval, and at each nearest neighbor pair
of sites. One can imagine updating these in a random order, for example, or se-
quentially from left to right. Rajewsky, Santen, Schadschneider and Schreckenberg
(1998) use the matrix approach discussed in Section 3 to compare systems with
different update rules. That paper contains other references on this topic.

Exclusion Processes with Spontaneous Births and Deaths. Ferrari and Golstein

°
(1988) consider the symmetric nearest neighbor exclusion process on Z3 with the
addition of births at 0 at rate f3 and deaths at at rate 8. For each p E [0, 1], there
is an extremal invariant measure JL p for this process that has asymptotic density p
at 00. (Invariant measures for this type of process where spontaneous births and
deaths are allowed at any site are discussed by Schwartz (1976).) This is a product
measure if and only if p = f3! (f3 + 8). In all other cases, the covariance of 11 (x)
and 11 (y) relative to JL p for x, y -=f=. 0 lies between two negative constant multiples
of

Exclusion Processes with Spin System Dynamics. The exclusion process has been
added to other particle systems dynamics in several contexts. Its addition to the
contact process was mentioned in Section 5 of Part I. Here we consider its addition
to one dimensional reversible spin systems. The process has state space {O, 1}ZI
and the following transitions:
b2 + 1)) = 010 or 101,
1
if (7] (x - 1), 7](x), 7](x
7] ~ 7]x at rate ab if (7] (x - 1), 7](x), 7](x + 1» = 001,100,110 or 011,
a2 if (7] (x - 1), 7](x), I1(X + 1» = 000 or Ill,
5. Notes and References 313

where a .:s b (to make it attractive), and

if Ix - yl = 1,
17 -+ 17x,y at rate { ~ if Ix - yl =1= 1.
This process was first considered by De Masi, Ferrari and Lebowitz (1986), who
proved hydrodynamic type results for it. We will be concerned with the issue of
ergodicity: Does this process have a unique invariant measure? Especially, for
fixed a, b, what happens if M is very large or very small?
Here are the known results:
(a) If %> 1,
the process is ergodic for all M by Theorem 4.1 of Chapter I of
IPS.
(b) If %> ~, the process is ergodic for sufficiently large M, as was proved by
Brassesco, Presutti, Sidoravicius and Vares (1999).
(c) For any strictly positive a, b, the process is ergodic for sufficiently small
M - see Neuhauser (1990).
The process is clearly not ergodic if a = 0, since then the pointmasses on
17 == 0 and on 17 == 1 are invariant. The most interesting open problem involves
the case % < ~ and M large. In this case, one can make a heuristic argument for
nonergodicity as follows: Let the distribution at time t be ILt, and assume that the
initial distribution is shift invariant. Then
d
-ILt(1) = b2ILt(101) + 2abILt(100) + a 2ILt(000)
dt
- a 2ILl (111) - 2abILI (110) - b2ILl (010).

If ILl is the product measure with density p, then the right side above is

Since ~ < ~, there are three roots of f in [0, 1], given by

1
p=- and p(1 _ p) = (_a_)2
2 b-a

The system
d
(5.10) -pet) = f(p(t))
dt
has these three roots as fixed points; p = 1
is unstable, and the other two are
stable.
Here is the heuristic part of the argument: If M is large, then Theorems 1.10
and 1.13 suggest that the distribution of the process at time t is close to being
a product measure, since the exclusion part of the evolution should dominate
the spin-flip part. If this were the case, one would expect the process to have two
extremal invariant measures corresponding to the two stable fixed points of (5.10).
It would be quite interesting to determine whether this is in fact the case.
314 Part III. Exclusion Processes

Here is an argument that counters the above heuristic: If a > 0, the spin system
alone is exponentially ergodic (Holley and Stroock (1989», while the exclusion
process converges only algebraically rapidly (Deuschel (1994». Therefore only a
small amount of spin evolution might be enough to render the combined process
ergodic, no matter how large the exclusion component is.
Similar issues arise in another context. Consider a process T/I on {O, 1,2, ... }Zl
with the following transitions:
(a) increase T/(x) by 1 at rate f3(T/(x)),
(b) decrease T/(x) by 1 at rate 8(T/(x)),
(c) increase T/ (x) by 1 and decrease T/ (y) by 1 at rate M if Ix - Y I = 1.
This is sometimes called a reaction-diffusion process; the reaction part is the
increase and decrease in (a) and (b) above, and the diffusion part is given by
(c). Homogeneous product measures with Poisson marginals are invariant for the
diffusion part, so if f3 (-) and 8 (-) are chosen so that a particular Poisson distribution
is invariant for the reaction part, this Poisson will be invariant for the combined
evolution. The condition for this to be the case for the Poisson with parameter A
is

(S.l1) (n + 1)f3(n) = A8(n + 1), n ~ O.

In a somewhat restricted version of this situation, Ding, Durrett and Liggett (1990)
proved that this is the only invariant measure, and there is convergence to it from
any initial configuration. This result was generalized by Chen, Ding and Zhu
(1994). Ergodicity has also been proved in some other (nonreversible) (f3, 8, M)
regions - see Chen (199S) for details.
There is again a heuristic argument for nonergodicity for certain choices of
(f3,8) if M is large: If M is large, the distribution of the process at large times
should be close to a homogeneous product of Poisson distributions. If the distri-
bution at time t really were such a product, with Poisson parameter A(t), then A(t)
would satisfy

=L
d 00 e-)'(I)[A(t)f
(S.12) -ET/I(x) [f3(n) - 8(n)].
dt n=O n!

As an example, suppose

f3(n) = a + bn(n - 1), 8(n) = en + dn(n - 1)(n - 2),

where a, b, e, d > O. This is known as Schl6gl's second model. Then the right
side of (S.12) becomes

(S.13)

evaluated at A(t). If this cubic has three positive roots, the smallest and largest
will be stable, and that argues for the nonergodicity of the system. Note that in
this example, (S.11) holds for some A if and only if
5. Notes and References 315

a b
=
c d'
and in this case (5.13) has only one real root. Thus, the heuristic does not contradict
known results in the reversible case.

Asymmetric Exclusion Processes with Random Rates. Take a nearest neighbor one
dimensional exclusion process with jump probabilities that are different for dif-
ferent particles - the ith particle jumps to the right with probability Pi and to the
left with probability qi, where
1
Pi +qi = 1, ·>
P1 _
c > -.
2

Note that the jump probabilities are associated with particles, not with sites. Let
{Pi, i E Zl} be chosen according to a stationary, ergodic process, and ask for what
densities p, does there exist a product measure that is invariant for the process seen
from a tagged particle, and has asymptotic density p in both directions. Benjamini,
Ferrari and Landim (1996) give the following answer: There is a critical density
p* so that for almost all choices of the {Pi}, if p ~ p*, there is such a product
measure, while if p < p*, there is no such product measure. Now suppose that c
is the essential infimum of the distribution of Pi, and that the Pi's are i.i.d. Then
p* > 0 if and only if P(Po = c) < I~C. More explicit results for one sided models
were obtained by Krug and Ferrari (1996). Hydrodynamics for models of this sort
is studied by Seppiiliiinen and Krug (1999).

Long Range Exclusion Processes. The long range exclusion process differs from
the exclusion process we have considered in Part III in that when a particle attempts
to move to an occupied site, instead of returning at its original site, it continues
searching (instantaneously) for a vacant site until it finds one (which may never
happen). In other words, if it is at XES when its exponential clock rings, and
the configuration of the system at that time is 1'], it constructs a Markov chain Xn
with transition probabilities P (', .) and initial state x, and moves to X T, where

r = inf{n ~ 1: 1'](Xn) = 0 or Xn =x}.


This process was introduced by Spitzer (1970), and constructed and first studied
by Liggett (1980). In a more recent paper, Guio1 (1997) has proved that for a
translation invariant system on Zd, all shift invariant measures that are invariant
for the process are mixtures of {v p' 0 :::: p :::: I}.

Ulam's Problem and Hammersley's Process. Here is Vlam's problem: Take a


random permutation of {l, ... , n}, with each permutation having probability ~.
n.
Let Ln be the length of the longest increasing subsequence in that permutation.
Hammersley used an embedding in a spatial Poisson process and subadditivity to
show that
. Ln
hm r;;; = C
n---*oo V n
316 Part III. Exclusion Processes

in probability for some constant c. Vershik-Kerov and Logan-Shepp showed that


c = 2. In a recent paper, Aldous and Diaconis (1995) deduced this result by con-
sidering a process they call Hammersley's process, and proving a hydrodynamic
limit theorem for it.
Hammersley's process ~t is closely related to the exclusion process, as the
following description indicates: The states ~ are collections of points in (0, (0)
with only finitely many points in each compact subset. Let i?l' be a homogeneous
Poisson process on (0, (0)2. At an event time (x, t) of i?l', the point in ~t that is
closest to x on its right is moved to x. If there is no point to the right of x, a
new point is put at x. Thus points move to the left in such a way that order is
preserved. There is a natural way to start the process off at the empty set - that
turns out to be the relevant initial configuration.
The analogue of Burgers' equation (2.3) for this process turns out to be
au au
--=1,
at ax
where u(x, t) = EI~t(O, x)l, the expected number of points in (0, x) at time t.
The initiallboundary conditions are

u(x, 0) = u(O, t) = 0.
The solution to this partial differential equation with these initiallboundary con-
ditions is u(x, t) = 2Ft, and it is this factor of 2 that gives the value of c
above.
For another view of this connection between Ulam's problem and particle
systems related to the exclusion process, see Seppiiliiinen (1996). A closely related
problem is solved in Seppiiliiinen (1997). Large deviations in this context are
considered by Seppiiliiinen (1 998b).
Bibliography

Papers on the main topics of this book - contact, voter and exclusion processes - are listed
by topic, following a list of books. In these sections, we have tried to include all papers
written about those models since 1985. Papers that are more general, or do not fit naturally
into one of the three categories, are listed at the end.
In most cases, only references after 1985 are listed. For earlier references, please see
the bibliography oflPS, Liggett (1985). In that book, we tried to list essentially all papers
written about interacting particle systems up to that time. By now, there are well over 1000
papers on this subject, so we have not tried to be as inclusive this time. Therefore, the
listing of "other papers" at the end contains only papers that are referred to explicitly in
the text.

Books
K. B. Athreya and P. E. Ney, Branching Processes, Springer, 1972.
M. F. Chen, From Markov Chains to Non-Equilibrium Particle Systems, World Scientific,
1992.
A. De Masi and E. Presutti, Mathematical Methods for Hydrodynamic Limits, Springer
Lecture Notes in Mathematics 1501, 1991.
R. Durrett, Lecture Notes on Particle Systems and Percolation, Wadsworth, 1988.
R. Durrett, Probability: Theory and Examples, second edition, Duxbury, 1996.
L. C. Evans, Partial Differential Equations, American Mathematical Society, 1998.
G. Grimmett, Percolation, Springer, 1989.
G. Grimmett, Percolation, 2nd edition, Springer, 1999.
F. P. Kelly, Reversibility and Stochastic Networks, Wiley, 1979.
C. Kipnis and C. Landim, Scaling Limits of Interacting Particle Systems, Springer, 1999.
N. Konno, Phase Transitions of Interacting Particle Systems, World Scientific, 1994.
T. M. Liggett, Interacting Particle Systems, Springer, 1985.
W. Rudin, Real and Complex Analysis, McGraw-Hill, 1966.
R. Schinazi, Classical and Spatial Stochastic Processes, Birkhauser, 1999.
H. Spohn, Large Scale Dynamics of Interacting Particles, Springer Texts and Monographs
in Physics, 1991.

Contact Processes
M. Aizenman and G. Grimmett, Strict monotonicity for critical points in percolation and
ferromagnetic models, J. Statist. Phys. 63 (1991), 817-835.
E. D. Andje1, The contact process in high dimensions, Ann. Prob. 16 (1988), 1174-1183.
E. D. Andjel, Survival of multidimensional contact process in random environments, Bol.
Soc. Bras. Mat. 23 (1992), 109-119.
E. D. AndjeJ, R. Schinazi and R. H. Schonmann, Edge processes ofone-dimensional stochas-
tic growth models, Ann. Inst. H. Poincare Probab. Statist. 26 (1990), 489-506.
318 Bibliography

D. J. Barsky and C. C. Wu, Critical exponents for the contact process under the triangle
condition,1. Statist. Phys. 91 (1998), 95-124.
V. Belitsky, P. A. Ferrari, N. Konno and T. M. Liggett, A strong correlation inequality for
contact processes and oriented percolation, Stoch. Proc. Appl. 67 (1997), 213-225.
C. Bezuidenhout and L. Gray, Critical attractive spin systems, Ann. Probab. 22 (1994),
1160--1194.
C. Bezuidenhout and G. Grimmett, The critical contact process dies out, Ann. Probab. 18
(1990), 1462-1482.
C. Bezuidenhout and G. Grimmett, Exponential decay for subcritical contact and percolation
processes, Ann. Probab. 19 (1991), 984--1009.
M. Bramson, R. Durrett and R. H. Schonmann, The contact process in a random environ-
ment, Ann. Probab. 19 (1991), 960--983.
M. Bramson, R. Durrett and G. Swindle, Statistical mechanics of crabgrass, Ann. Probab.
17 (1989), 444--481.
L. Buttel, J. T. Cox and R. Durrett, Estimating the critical values of stochastic growth
models, 1. Appl. Probab. 30 (1993), 455-461.
M. Cassandro, A. Galves, E. Olivieri and M. E. Vares, Metastable behavior of stochastic
dynamics: A pathwise approach, 1. Statist. Phys. 35 (1984), 603-634.
J. W. Chen, Small density fluctuation for one-dimensional contact processes under nonequi-
librium, Acta Math. Sci. 13 (1993), 399-405.
J. W. Chen, The contact process on afinite system in higher dimensions, Chinese J. Contemp.
Math. 15 (1994), 13-20.
1. W. Chen, Smoothness and stability of one-dimensional contact processes, Acta Math.
Sinica 38 (1995), 91-98.
J. W. Chen, R. Durrett and X. F. Liu, Exponential convergence for one dimensional contact
processes, Acta Math. Sinica 6 (1990), 349-353.
J. T. Cox, R. Durrett and R. Schinazi, The critical contact process seen from the right edge,
Probab. Th. ReI. Fields 87 (1991), 325-332.
1. T. Cox and A. Greven, On the long term behavior ofsome finite particle systems, Probab.
Th. ReI. Fields 85 (1990), 195-237.
R. Dickman, Nonequilibrium lattice models: series analysis of steady states, J. Statist. Phys.
55 (1989), 997-1026.
R. Durrett, The contact process, 1974-1989, Proceedings of the 1989 AMS Seminar on
Random Media (W. E. Kohler and B. S. White, ed.), vol. 27, AMS Lectures in Applied
Mathematics, 1991, pp. 1-18.
R. Durrett, Stochastic growth models - bounds on critical values, J. Appl. Prob. 29 (1992),
11-20.
R. Durrett and D. Griffeath, Contact processes in several dimensions, Z. Wahrsch. verw.
Gebiete 59 (1982), 535-552.
R. Durrett and X. Liu, The contact process on a finite set, Ann. Probab. 16 (1988),
1158-1173.
R. Durrett and E. Perkins, Rescaled contact processes converge to super Brownian motion
for d ~ 2, Probab. Theory ReI. Fields (1999).
R. Durrett and R. Schinazi, Intermediate phase for the contact process on a tree, Ann.
Probab. 23 (1995), 668-673.
R. Durrett and R. Schonmann, Stochastic growth models, Percolation Theory and Ergodic
Theory of Infinite Particle Systems (H. Kesten, ed.), Springer, 1987, pp. 85-119.
R. Durrett and R. Schonmann, The contact process on a finite set II, Ann. Probab. 16
(1988a), 1570--1583.
R. Durrett and R. Schonmann, Large deviations for the contact process and two dimensional
percolation, Probab. Th. ReI. Fields 77 (1988b), 583-603.
R. Durrett, R. Schonmann and N. Tanaka, The contact process on a finite set III. The critical
case, Ann. Probab. 17 (1989), 1303-1321.
Bibliography 319

S. N. Evans and E. A. Perkins, Measure-valued branching diffusions with singular interac-


tions, Canad. J. Math. 46 (1994), 120-168.
M. Fiocco, Statistical estimation for the supercritical contact process, Thesis, Leiden, 1997.
A. Galves, F. Martinelli and E. Olivieri, Large density fluctuations for the one dimensional
supercritical contact process, 1. Statist. Phys. 55 (1989), 639-648.
A. Galves and E. Presutti, Edge fluctuations for the one dimensional supercritical contact
process, Ann. Probab. 15 (1987a), 1131-1145.
A. Galves and E. Presutti, Travelling wave structure of the one dimensional contact process,
Stoch. Proc. App!. 25 (1987b), 153-163.
A. Galves and R. Schinazi, Approximations finis de la mesure invariante du processus de
contact sur-critique vu par la premiere particule, Probab. Th. ReI. Fields 83 (1989),
435-445.
L. Gray, Is the contact process dead?, Proceedings of the 1989 AMS Seminar on Ran-
dom Media (W. E. Kohler and B. S. White, ed.), vol. 27, AMS Lectures in Applied
Mathematics, 1991, pp. 19-29.
D. Griffeath, Limit theorems for nonergodic set-values Markov processes, Ann. Probab. 6
(1978), 379-387.
C. Grillenberger and H. Ziezold, On the critical infection rate of the one dimensional basic
contact process: numerical results, 1. Appl. Probab. 25 (1988), 1-8.
G. Grippenberg, A lower bound for the order parameter in the one-dimensional contact
process, Stoch. Proc. App!. 63 (1996), 211-219.
I. Heuter, Anisotropic contact process on homogeneous trees, (2000).
R. Holley and T. M. Liggett, The survival of contact processes, Ann. Probab. 6 (1978),
198-206.
I. Jensen and R. Dickman, Time-dependent perturbation theory for nonequilibrium lattice
models, 1. Statist. Phys. 71 (1993), 89-127.
S. Jitomirskaya and A. Klein, Ising model in a quasiperiodic transverse field, percola-
tion, and contact processes in quasiperiodic environments, 1. Statist. Phys. 73 (1993),
319-344.
M. Katori, Rigorous results for the diffusive contact process in d 2: 3, J. Phys. A 27 (1994),
7327-7341.
M. Katori and N. Konno, Correlation inequalities and lower bounds for the critical value
)."C of contact processes, 1. Phys. Soc. Japan 59 (1990), 877-887.
M. Katori and N. Konno, Applications of the Harris-FKG inequality to upper bounds for
order parameters in the contact process, J. Phys. Soc. Japan 60 (1991), 430-434.
M. Katori and N. Konno, Three point Markov extension and an improved upper bound
for survival probability of the one dimensional contact process, J. Phys. Soc. Japan 60
(1991),418-429.
M. Katori and N. Konno, Correlation identities for nearest-particle systems and their appli-
cations to one-dimensional contact process, Modem Phys. Lett. B 5 (1991),151-159.
M. Katori and N. Konno, An upper bound for survival probability of infected region in the
contact process, J. Phys. Japan 60 (1991), 95-99.
M. Katori and N. Konno, Upper bounds for the survival probability of the contact process,
1. Statist. Phys. 63 (1991), 115-130.
M. Katori and N. Konno, Bounds on the critical line of the &-contact processes with
I ::: & ::: 2, 1. Phys. A 26 (1993), 6597--6614.
A. Klein, Extinction of contact and percolation processes in a random environment, Ann.
Probab. 22 (1994),1227-1251.
N. Konno, Asymptotic behavior of basic contact processes with rapid stirring, 1. Th. Probab.
8 (1995), 833-876.
N. Konno, Lecture Notes on Harris Lemma and Particle Systems, Publicacoes do Instituto
de Matematica e Estatistica da Universidade de Sao Paulo, 1996.
320 Bibliography

N. Konno, Lecture Notes on Interacting Particle Systems, Rokko Lectures in Mathematics


#3 Kobe University, 1997.
N. Konno and K. Sato, Upper bounds on order parameters of diffusive contact processes,
1. Phys. Soc. Japan 64 (1995), 2405-2412.
S. M. Krone, The two-stage contact process, Ann. Appl. Probab. 9 (1999).
T. Kuczek, The central limit theoremfor the right edge ofsupercritical oriented percolation,
Ann. Probab. 17 (1989), 1322-1332.
S. P. Lalley, Growth profile and invariant measures for the weakly supercritical contact
process on a homogeneous tree, Ann. Probab. 27 (1999), 206-225.
s. P. Lalley and T. Sellke, Limit set of a weakly supercritical contact process on a homo-
geneous tree, Ann. Probab. 26 (1998), 644-657.
1. L. Lebowitz and R. H. Schonmann, On the asymptotics of occurrence times of rare events
for stochastic spin systems, 1. Statist. Phys. 48 (1987), 727-751.
T. M. Liggett, Spatially inhomogeneous contact processes, Spatial Stochastic Processes.
A Festschrift in honor of the Seventieth Birthday of Ted Harris, Birkhauser, 1991a,
pp. 105-140.
T. M. Liggett, The periodic threshold contact process, Random Walks, Brownian Motion
and Interacting Particle Systems, A Festschrift in honor of Frank Spitzer (R. Durrett
and H. Kesten, ed.), Birkhauser, 1991b, pp. 339-358.
T. M. Liggett, The survival of one dimensional contact processes in random environments,
Ann. Probab. 20 (1992), 696-723.
T. M. Liggett, Survival and coexistence in interacting particle systems, Probability and
Phase Transition, Kluwer, 1994a, pp. 209-226.
T. M. Liggett, Improved upper bounds for the contact process critical value, Ann. Probab.
23 (1995a), 697-723.
T. M. Liggett, Multiple transition points for the contact process on the binary tree, Ann.
Probab. 24 (1996a), 1675-1710.
T. M. Liggett, Branching random walks and contact processes on homogeneous trees,
Probab. Th. ReI. Fields 106 (1996b), 495-519.
T. M. Liggett, Branching random walks on finite trees, Perplexing Problems in Probability:
Papers in Honor of Harry Kesten, Birkhäuser, 1999.
N. Madras and R. Schinazi, Branching random walks on trees, Stoch. Proc. Appl. 42 (1992),
255-267.
N. Madras, R. Schinazi and R. Schonmann, On the critical behavior of the contact process
in deterministic inhomogeneous environment, Ann. Probab. 22 (1994), 1140-1159.
G. Morrow, R. Schinazi and Y. Zhang, The critical contact process on a homogeneous tree,
J. Appl. Probab. 31 (1994),250-255.
T. S. Mountford, A metastable result for the finite multidimensional contact process, Can.
Math. Bull. 36 (1993), 216-226.
T. S. Mountford, Existence of constant for finite system extinction, J. Statist. Phys. (1999).
T. S. Mountford and T. D. Sweet, An extension of Kuczek's argument to nonnearest neighbor
contact processes, (1999).
C. Mueller and R. Tribe, A phase transition for a stochastic PDE related to the contact
process, Probab. Th. ReI. Fields 100 (1994), 131-156.
C. Mueller and R. Tribe, Stochastic p.d.e.'s arising from the long range contact and long
range voter processes, Probab. Th. ReI. Fields 102 (1995), 519-545.
C. M. Newman and S. Volchan, Persistent survival of one-dimensional contact processes
in random environments, Ann. Probab. 24 (1996), 411-421.
C. M. Newman and C. C. Wu, Percolation and contact processes with low dimensional
inhomogeneity, Ann. Probab. 25 (1997), 1832-1845.
R. Pemantle, The contact process on trees, Ann. Probab. 20 (1992), 2089-2116.
R. Pemantle and A. M. Stacey, The branching random walk and contact process on
non-homogeneous and Galton-Watson trees, (2000).
M. D. Penrose, The threshold contact process: a continuum limit, Probab. Th. ReI. Fields
104 (1996), 77-95.
A. Puha, A reversible nearest particle system on the homogeneous tree, J. Th. Probab. 12
(1999), 217-254.
A. Puha, Critical exponents for a reversible nearest particle system on the binary tree, Ann.
Probab. (2000).
M. Salzano and R. H. Schonmann, The second lowest extremal invariant measure of the
contact process, Ann. Probab. 25 (1997), 1846-1871.
M. Salzano and R. H. Schonmann, A new proof that for the contact process on homogeneous
trees local survival implies complete convergence, Ann. Probab. 26 (1998), 1251-1258.
M. Salzano and R. H. Schonmann, The second lowest extremal invariant measure of the
contact process II, Ann. Probab. 27 (1999).
R. Schinazi, On multiple phase transitions for branching Markov chains, J. Statist. Phys.
71 (1993), 521-525.
R. Schinazi, The asymmetric contact process on a finite set, J. Statist. Phys. 74 (1994),
1005-1016.
R. Schinazi, A contact process with a single inhomogeneous site, J. Statist. Phys. 83 (1996),
767-777.
R. H. Schonmann, Metastability for the contact process, J. Statist. Phys. 41 (1985), 445-464.
R. H. Schonmann, Central limit theorem for the contact process, Ann. Probab. 14 (1986a),
1291-1295.
R. H. Schonmann, The asymmetric contact process, J. Statist. Phys. 44 (1986b), 505-534.
R. H. Schonmann, A new look at contact processes in several dimensions, Percolation Theory
and Ergodic Theory of Infinite Particle Systems (H. Kesten, ed.), vol. 8, IMA Series
in Mathematics and its Applications, 1987a, pp. 245-250.
R. H. Schonmann, A new proof of the complete convergence theorem for contact processes in
several dimensions with large infection parameter, Ann. Probab. 15 (1987b), 382-387.
R. H. Schonmann, The triangle condition for contact processes on homogeneous trees, J.
Statist. Phys. 90 (1998), 1429-1440.
R. H. Schonmann and M. E. Vares, The survival of the large dimensional basic contact
process, Probab. Th. ReI. Fields 72 (1986), 387-393.
A. Simonis, Metastability for the d-dimensional contact process, J. Statist. Phys. 83 (1996),
1225-1239.
A. Simonis, Filling in the hypercube in the supercritical contact process in equilibrium,
Markov Proc. Rel. Fields 4 (1998), 113-130.
A. M. Stacey, Bounds on the critical probabilities in oriented percolation models, Cambridge
University thesis (1994).
A. M. Stacey, The existence of an intermediate phase for the contact process on trees, Ann.
Probab. 24 (1996), 1711-1726.
A. M. Stacey, The contact process on a finite tree, (2000).
T. Sweet, The asymmetric contact process at its second critical value, J. Statist. Phys. 86
(1997),749-764.
G. Swindle, A mean field limit of the contact process with large range, Probab. Th. ReI.
Fields 85 (1990), 261-282.
A. Y. Tretyakov, V. Belitsky, N. Konno and T. Yamaguchi, Numerical estimation on cor-
relation inequalities for Holley-Liggett bounds, Mem. Muroran Inst. Tech. 48 (1998),
101-105.
A. Y. Tretyakov, N. Inui and N. Konno, Phase transition for the one-sided contact process,
J. Phys. Soc. Japan 66 (1997), 3764-3769.
A. Y. Tretyakov and N. Konno, Phase transition of the contact process on the binary tree,
J. Phys. Soc. Japan 64 (1995), 4069-4072.
C. C. Wu, The contact process on a tree: behavior near the first phase transition, Stoch.
Proc. Appl. 57 (1995), 99-112.
C. C. Wu, Inhomogeneous contact processes on trees, J. Statist. Phys. 88 (1997), 1399-1408.


Y. Zhang, The complete convergence theorem of the contact process on trees, Ann. Probab.
24 (1996), 1408-1443.

Voter Models
E. D. Andjel, T. M. Liggett and T. Mountford, Clustering in one dimensional threshold
voter models, Stoch. Proc. Appl. 42 (1992), 73-90.
E. D. Andjel and T. Mountford, A coupling of infinite particle systems II, J. Math. Kyoto
Univ. 38 (1998), 635-642.
M. Bramson, J. T. Cox and R. Durrett, Spatial models for species area curves, Ann. Probab.
24 (1996),1727-1751.
M. Bramson, J. T. Cox and R. Durrett, A spatial model for the abundance of species, Ann.
Probab. 26 (1998), 658-709.
M. Bramson, J. T. Cox and D. Griffeath, Consolidation rates for two interacting systems in
the plane, Probab. Th. ReI. Fields 73 (1986), 613-625.
M. Bramson, J. T. Cox and D. Griffeath, Occupation time large deviations of the voter
model, Probab. Th. ReI. Fields 77 (1988), 401-413.
D. Chen, The consensus times of the majority vote process on a torus, J. Statist. Phys. 86
(1997), 779-802.
J. T. Cox, Some limit theorems for voter model occupation times, Ann. Probab. 16 (1988),
1559-1569.
J. T. Cox, Coalescing random walks and voter model consensus times on the torus in Zd,
Ann. Probab. 17 (1989), 1333-1366.
J. T. Cox and R. Durrett, Nonlinear voter models, Random Walks, Brownian Motion and
Interacting Particle Systems, A Festschrift in honor of Frank Spitzer (R. Durrett and
H. Kesten, ed.), Birkhauser, 1991, pp. 189-201.
J. T. Cox and R. Durrett, Hybrid zones and voter model interfaces, Bernoulli 1 (1995),
343-370.
J. T. Cox, R. Durrett and E. A. Perkins, Rescaled voter models converge to super Brownian
motion, 2000.
J. T. Cox and A. Greven, On the long term behavior of finite particle systems: A critical
dimension example, Random Walks, Brownian Motion and Interacting Particle Systems,
A Festschrift in honor of Frank Spitzer, Birkhauser, 1991, pp. 203-213.
J. T. Cox and D. Griffeath, Occupation time limit theorems for the voter model, Ann. Probab.
11 (1983), 876-893.
J. T. Cox and D. Griffeath, Critical clustering in the two dimensional voter model, Stochastic
Spatial Processes (P. Tautu, ed.), vol. 1212, Springer Lecture Notes in Mathematics,
1986a, pp. 59-68.
J. T. Cox and D. Griffeath, Diffusive clustering in the two dimensional voter model, Ann.
Probab. 14 (1986b), 347-370.
M. J. De Oliveira, Isotropic majority vote model on a square lattice, J. Statist. Phys. 66
(1992), 273-281.
R. Durrett, Multicolor particle systems with large threshold and range, J. Th. Probab. 5
(1992), 127-152.
R. Durrett and J. E. Steif, Fixation results for threshold voter systems, Ann. Probab. 21
(1993),232-247.
1. Ferreira, The probability of survival for the biased voter model in a random environment,
Stoch. Proc. Appl. 34 (1990), 25-38.
B. Granovsky and N. Madras, The noisy voter model, Stoch. Proc. Appl. 55 (1995), 23-43.
S. Handjani, The complete convergence theorem for coexistent threshold voter models, Ann.
Probab. 27 (1999), 226--245.
T. M. Liggett, Coexistence in threshold voter models, Ann. Probab. 22 (1994b), 764-802.
T. S. Mountford, Generalized voter models, J. Statist. Phys. 67 (1992), 303-311.
M. A. Santos and S. Texeira, Anisotropic voter model, J. Statist. Phys. 78 (1995), 963-970.
A. Sudbury, Hunting submartingales in the jumping voter model and the biased annihilating
branching process, Adv. Appl. Probab. (1999).

Exclusion Processes

F. J. Alexander, Z. Cheng, S. A. Janowsky and J. L. Lebowitz, Shock fluctuations in the
two-dimensional asymmetric exclusion process, J. Statist. Phys. 68 (1992), 761-785.
E. D. Andjel, Convergence to a nonextremal equilibrium measure in the exclusion process,
Probab. Th. ReI. Fields 73 (1986), 127-134.
E. D. Andjel, A correlation inequality for the symmetric exclusion process, Ann. Probab. 16
(1988),717-721.
E. D. Andjel, Finite exclusion process and independent random walks, Unpublished paper.
E. D. Andjel, M. D. Bramson and T. M. Liggett, Shocks in the asymmetric exclusion process,
Probab. Th. ReI. Fields 78 (1988), 231-247.
E. D. Andjel and C. P. Kipnis, Pointwise ergodic theorems for the symmetric exclusion
process, Probab. Theory and ReI. Fields 75 (1987), 545-550.
E. D. Andjel and M. E. Vares, Hydrodynamic equations for attractive particle systems on
Z, J. Statist. Phys. 47 (1987), 265-288.
R. Arratia, Symmetric exclusion processes: a comparison inequality and a large deviation
result, Ann. Probab. 13 (1985), 53-61.
D. Arora, D. P. Bhatia and M. A. Prasad, Survival probability in one dimension for the
A + B → B reaction with hard-core repulsion, J. Statist. Phys. 84 (1996), 697-711.
A. Asselah and P. Dai Pra, Sharp estimates for the occurrence times of rare events for
symmetric simple exclusion, Stoch. Proc. Appl. 71 (1997),259-273.
C. Bahadoran, Hydrodynamical limit for spatially heterogeneous simple exclusion processes,
Probab. Th. ReI. Fields 110 (1998), 287-331.
A. Benassi and J. P. Fouque, Hydrodynamical limit for the asymmetric exclusion process,
Ann. Probab. 15 (1987), 546-560.
A. Benassi and J. P. Fouque, Fluctuation field for the asymmetric simple exclusion process,
Random Partial Differential Equations, Birkhäuser, 1991, pp. 31-43.
A. Benassi, J. P. Fouque, E. Saada and M. E. Vares, Asymmetric attractive systems on Z:
hydrodynamic limit for monotone initial profiles, J. Statist. Phys. 63 (1991), 719-735.
I. Benjamini, P. A. Ferrari and C. Landim, Asymmetric conservative processes with random
rates, Stoch. Proc. Appl. 61 (1996), 181-204.
L. Bertini and G. Giacomin, Stochastic Burger's equation and KPZ equations from particle
systems, Comm. Math. Phys. 183 (1997), 571-607.
C. Boldrighini, G. Cosini, S. Frigio and M. Grasso Nunes, Computer simulation of shock
waves in the completely asymmetric simple exclusion process, J. Statist. Phys. 55 (1989),
611-623.
M. Bramson, Front propagation in certain one dimensional exclusion models, J. Statist.
Phys. 51 (1988), 863-870.
M. Bramson, P. Calderoni, A. De Masi, P. Ferrari, J. Lebowitz and R. H. Schonmann,
Microscopic selection principle for diffusion-reaction equations, J. Statist. Phys. 45
(1986), 56-70.
S. Brassesco, E. Presutti, V. Sidoravicius and M. E. Vares, Ergodicity and exponential
convergence of a Glauber+Kawasaki process, Trans. Amer. Math. Soc. (1999).
S. Brassesco, E. Presutti, V. Sidoravicius and M. E. Vares, Ergodicity of a Glauber+Kawa-
saki process with metastable states, 2000.
C. Cammarotta and P. A. Ferrari, An invariance principle for the edge of the branching
exclusion process, Stoch. Proc. Appl. 38 (1991), 1-11.
N. Cancrini and A. Galves, Approach to equilibrium in the symmetric simple exclusion
process, Markov Proc. ReI. Fields 1 (1995), 175-184.
C. C. Chang, Equilibrium fluctuations of nongradient reversible particle systems, Nonlinear
Stochastic PDEs, Springer, 1994, pp. 41-51.
A. De Masi and P. A. Ferrari, Self diffusion in one dimensional lattice gases in the presence
of an external field, J. Statist. Phys. 38 (1985), 603-613.
A. De Masi, P. A. Ferrari and J. L. Lebowitz, Reaction-diffusion equations for interacting
particle systems, J. Stat. Phys. 44 (1986), 589-644.
A. De Masi, P. A. Ferrari and M. E. Vares, A microscopic model of interface related to the
Burgers equation, J. Statist. Phys. 55 (1989), 601-609.
A. De Masi, C. Kipnis, E. Presutti and E. Saada, Microscopic structure at the shock in the
asymmetric simple exclusion, Stoch. and Stoch. Reports 27 (1989), 151-165.
A. De Masi, E. Presutti and E. Scacciatelli, The weakly asymmetric simple exclusion process,
Ann. Inst. H. Poincare Probab. Statist. 25 (1989), 1-38.
B. Derrida, Systems out of equilibrium: some exactly soluble models, StatPhys 19, World
Sci., 1996, pp. 243-253.
B. Derrida, E. Domany and D. Mukamel, An exact solution of a one-dimensional asymmetric
exclusion model with open boundaries, J. Statist. Phys. 69 (1992), 667-687.
B. Derrida and M. R. Evans, The asymmetric exclusion model: exact results through a matrix
approach, Nonequilibrium Statistical Mechanics in One Dimension (V. Privman, ed.),
Cambridge U. Press, 1997, pp. 277-304.
B. Derrida, M. R. Evans, V. Hakim and V. Pasquier, Exact solution of a 1D asymmetric
exclusion model using a matrix formulation, J. Phys. A 26 (1993a), 1493-1517.
B. Derrida, M. R. Evans, V. Hakim and V. Pasquier, A matrix method of solving an asymmet-
ric exclusion model with open boundaries, Cellular Automata and Cooperative Systems
(N. Boccara, E. Goles, S. Martinez and P. Picco, ed.), Kluwer, 1993b, pp. 121-133.
B. Derrida, M. R. Evans and K. Mallick, Exact diffusion constant of a one-dimensional
asymmetric exclusion model with open boundaries, J. Statist. Phys. 79 (1995), 833-874.
B. Derrida, M. R. Evans and D. Mukamel, Exact diffusion constant for one-dimensional
asymmetric exclusion models, J. Phys. A 26 (1993), 4911-4918.
B. Derrida, S. Goldstein, J. L. Lebowitz and E. R. Speer, Shift equivalence of measures and
the intrinsic structure of shocks in the asymmetric simple exclusion process, J. Statist.
Phys. 93 (1998), 547-571.
B. Derrida, S. A. Janowsky, J. L. Lebowitz and E. R. Speer, Exact solution of the to-
tally asymmetric simple exclusion process: shock profiles, J. Statist. Phys. 73 (1993a),
813-842.
B. Derrida, S. A. Janowsky, J. L. Lebowitz and E. R. Speer, Microscopic-shock profiles:
exact solution of a nonequilibrium system, Europhys. Lett. 22 (1993b), 651-656.
B. Derrida and J. L. Lebowitz, Exact large deviation function in the asymmetric exclusion
process, Phys. Rev. Lett. 80 (1998), 209-213.
B. Derrida, J. L. Lebowitz and E. R. Speer, Shock profiles for the asymmetric simple exclu-
sion process in one dimension, J. Statist. Phys. 89 (1997), 135-167.
B. Derrida and K. Mallick, Exact diffusion constant for the one-dimensional partially asym-
metric exclusion model, J. Phys. A 30 (1997), 1031-1046.
P. Dittrich, Travelling waves and long-time behaviour of the weakly asymmetric exclusion
process, Probab. Th. ReI. Fields 86 (1990), 443-455.
P. Dittrich, Long-time behavior of the weakly asymmetric exclusion process and the Burgers
equation without viscosity, Math. Nachr. 155 (1992), 279-287.
P. Dittrich and J. Gärtner, A central limit theorem for the weakly asymmetric simple exclusion
process, Math. Nachr. 151 (1991), 75-93.
R. Esposito, R. Marra and H.-T. Yau, Diffusive limit of asymmetric simple exclusion, Rev.
Math. Phys. 6 (1994),1233-1267.
P. A. Ferrari, The simple exclusion process as seen from a tagged particle, Ann. Probab. 14
(1986),1277-1290.
P. A. Ferrari, Microscopic shocks in one-dimensional driven systems, Ann. Inst. H. Poincare
Phys. Theor. 55 (1991), 637-655.
P. A. Ferrari, Shock fluctuations in the asymmetric simple exclusion, Probab. Th. Rel. Fields
91 (1992a), 81-101.
P. A. Ferrari, Shocks in the Burgers equation and the asymmetric simple exclusion pro-
cess, Automata, Networks, Dynamical Systems and Statistical Physics, Kluwer, 1992b,
pp. 25-64.
P. A. Ferrari, Shocks in one-dimensional processes with drift, Probability and Phase Tran-
sition, Kluwer, 1994, pp. 35-48.
P. A. Ferrari, Limit theorems for tagged particles, Markov Proc. ReI. Fields 2 (1996), 17-40.
P. A. Ferrari and L. R. G. Fontes, Shocks in asymmetric one-dimensional simple exclusion
processes, Resenhas IME-USP 1 (1993), 57-68.
P. A. Ferrari and L. R. G. Fontes, Shock fluctuations in asymmetric simple exclusion process,
Probab. Th. ReI. Fields 99 (1994a), 305-319.
P. A. Ferrari and L. R. G. Fontes, Current fluctuations in asymmetric simple exclusion
process, Ann. Probab. 22 (1994b), 820-832.
P. A. Ferrari and L. R. G. Fontes, Poissonian approximation for the tagged particle in
asymmetric simple exclusion, J. Appl. Probab. 33 (1996), 411-419.
P. A. Ferrari, L. R. G. Fontes and Y. Kohayakawa, Invariant measures for a two species
asymmetric process, J. Statist. Phys. (1994), 1153-1178.
P. A. Ferrari, A. Galves and T. M. Liggett, Exponential waiting time for filling a large
interval in the symmetric simple exclusion process, Ann. Inst. Henri Poincare 31 (1995),
155-175.
P. A. Ferrari and S. Goldstein, Microscopic stationary states for stochastic systems with
particle flux, Probab. Th. ReI. Fields 78 (1988), 455-471.
P. A. Ferrari, S. Goldstein and J. L. Lebowitz, Diffusion, mobility and the Einstein rela-
tion, Statistical Physics and Dynamical Systems, Rigorous Results, Birkhauser, 1985,
pp. 405-442.
P. A. Ferrari and C. Kipnis, Second class particles in the rarefaction fan, Ann. Inst. H.
Poincare Probab. Statist. 31 (1995), 143-154.
P. A. Ferrari, C. Kipnis and E. Saada, Microscopic structure of travelling waves in the
asymmetric simple exclusion process, Ann. Probab. 19 (1991), 226-244.
P. A. Ferrari, E. Presutti, E. Scacciatelli and M. E. Vares, The symmetric simple exclusion
process I: Probability estimates, Stoch. Proc. Appl. 39 (1991a), 89-105.
P. A. Ferrari, E. Presutti, E. Scacciatelli and M. E. Vares, The symmetric simple exclusion
process II: Applications, Stoch. Proc. Appl. 39 (1991b), 107-115.
J. A. Fill, Eigenvalue bounds on convergence to stationarity for nonreversible Markov
chains, with an application to the exclusion process, Ann. Appl. Probab. 1 (1991),
62-87.
J. P. Fouque, Hydrodynamical behavior of asymmetric attractive particle systems. One ex-
ample: One-dimensional nearest-neighbors asymmetric simple exclusion process, Pro-
ceedings of the 1989 AMS Seminar on Random Media, vol. 27, AMS Lectures in
Applied Mathematics, 1991, pp. 97-107.
J. P. Fouque and E. Saada, Totally asymmetric attractive particle systems on Z: hydrody-
namic limit for general initial profiles, Stoch. Proc. Appl. 51 (1994), 9-23.
T. Funaki, K. Handa and K. Uchiyama, Hydrodynamic limit of one-dimensional exclusion
processes with speed change, Ann. Probab. 19 (1991), 245-265.
J. Gartner, Convergence towards Burgers' equation and propagation of chaos for weakly
asymmetric exclusion processes, Stoch. Proc. Appl. 27 (1988), 233-260.
J. Gartner and E. Presutti, Shock fluctuations in a particle system, Ann. Inst. H. Poincare
Phys. Theor. 53 (1990), 1-14.
A. Greven, Symmetric exclusion on random sets and a related problem for random walks in
random environment, Prob. Th. ReI. Fields 85 (1990), 307-364.
H. Guiol, Un résultat pour le processus d'exclusion à longue portée, Ann. Inst. Henri
Poincare 33 (1997), 387-405.
L.-H. Gwa and H. Spohn, Bethe solution for the dynamic-scaling exponent of the noisy
Burgers equation, Phys. Rev. A 46 (1992), 844-854.
S. A. Janowski, Exact solution of the totally asymmetric exclusion process: shock profiles,
Rebrape 8 (1994), 85-91.
S. A. Janowski and J. L. Lebowitz, Finite size effects and shock fluctuations in the asym-
metric simple exclusion process, Phys. Rev. A 45 (1992), 618-625.
S. A. Janowski and J. L. Lebowitz, Exact results for the asymmetric simple exclusion process
with a blockage, J. Statist. Phys. 77 (1994), 35-51.
J. D. Keisling, Convergence speed for simple symmetric exclusion: An explicit calculation,
J. Statist. Phys. 90 (1998), 1003-1013.
J. D. Keisling, An ergodic theorem for the symmetric generalized exclusion process, Markov
Proc. ReI. Fields 4 (1998), 351-379.
C. Kipnis, Recent results on the movement of a tagged particle in simple exclusion, Par-
ticle Systems, Random Media, and Large Deviations (R. Durrett, ed.), vol. 41, AMS
Contemporary Mathematics, 1985, pp. 259-265.
C. Kipnis, Central limit theorem for infinite series of queues and applications to simple
exclusion, Ann. Probab. 14 (1986), 397-408.
C. Kipnis, Fluctuations des temps d'occupation d'un site dans l'exclusion simple symetrique,
Ann. Inst. H. Poincare Probab. Statist. 23 (1987), 21-35.
C. Kipnis, C. Landim and S. Olla, Hydrodynamical limit for a nongradient system: The gen-
eralized symmetric exclusion process, Comm. Pure Appl. Math 47 (1994), 1475-1545.
C. Kipnis, C. Landim and S. Olla, Macroscopic properties of a stationary non-equilibrium
distribution for a non-gradient interacting particle system, Ann. Inst. H. Poincare
Probab. Statist. 31 (1995), 191-221.
C. Kipnis, S. Olla, and S. R. S. Varadhan, Hydrodynamics and large deviations for simple
exclusion processes, Comm. Pure Appl. Math 42 (1989),115-137.
C. Kipnis and S. R. S. Varadhan, Central limit theorem for additive functionals of re-
versible Markov processes and applications to simple exclusions, Comm. Math. Phys.
104 (1986),1-19.
K. Komoriya, Hydrodynamic limit for asymmetric mean zero exclusion processes with speed
change, Ann. Inst. H. Poincare Probab. Statist. 34 (1998), 767-797.
J. Krug and P. A. Ferrari, Phase transitions in driven diffusive systems with random rates,
J. Phys. A 29 (1996), L465-L471.
C. Landim, Hydrodynamical equation for attractive particle systems on Zd, Ann. Probab.
19 (1991),1537-1558.
C. Landim, Occupation time large deviations for the symmetric simple exclusion process,
Ann. Probab. 20 (1992), 206-231.
C. Landim, S. Olla and S. B. Volchan, Driven tracer particle and Einstein relation in
one-dimensional symmetric simple exclusion process, Resenhas 3 (1997), 173-209.
C. Landim, S. Olla and S. B. Volchan, Driven tracer particle in one-dimensional symmetric
simple exclusion process, Comm. Math. Phys. 192 (1998), 287-307.
C. Landim, S. Olla and H.-T. Yau, Some properties of the diffusion coefficient for asymmetric
simple exclusion processes, Ann. Probab. 24 (1996), 1779-1808.
C. Landim, S. Olla and H. T. Yau, First order correction for the hydrodynamic limit of
asymmetric simple exclusion processes in dimension d ≥ 3, Comm. Pure Appl. Math.
50 (1997), 149-203.
C. Landim and M. E. Vares, Equilibrium fluctuations for exclusion processes with speed
change, Stoch. Proc. Appl. 52 (1994), 107-118.
C. Landim and H.-T. Yau, Fluctuation-dissipation equation of asymmetric simple exclusion
processes, Probab. Th. ReI. Fields 108 (1997), 321-356.
T. M. Liggett, Ergodic theorems for the asymmetric simple exclusion process, Trans. Amer.
Math. Soc. 213 (1975), 237-261.
T. M. Liggett, Coupling the simple exclusion process, Ann. Probab. 4 (1976), 339-356.
T. M. Liggett, Ergodic theorems for the asymmetric simple exclusion process II, Ann.
Probab. 4 (1977), 339-356.
T. M. Liggett, Long range exclusion processes, Ann. Probab. 8 (1980), 861-889.
C. T. MacDonald, J. H. Gibbs and A. C. Pipkin, Kinetics of biopolymerization on nucleic
acid templates, Biopolymers 6 (1968), 1-25.
F. P. Machado, Branching exclusion on a strip, J. Statist. Phys. 86 (1997), 765-777.
C. Maes and F. Redig, Anisotropic perturbations of the simple symmetric exclusion process:
long correlations, J. Phys. I 1 (1991), 669-684.
J. P. Marchand and P. A. Martin, Exclusion process and droplet shape, J. Statist. Phys. 44
(1986),491-504.
J. P. Marchand and P. A. Martin, Errata: Exclusion process and droplet shape, J. Statist.
Phys. 50 (1988), 469-471.
Y. Nagahata, The gradient condition for one-dimensional symmetric exclusion processes, J.
Statist. Phys. 91 (1998), 587-602.
C. Neuhauser, One dimensional stochastic Ising model with small migration, Ann. Probab.
18 (1990), 1539-1546.
J. Quastel, Diffusion of color in simple exclusion process, Comm. Pure Appl. Math. 45
(1992), 623-679.
J. Quastel, F. Rezakhanlou and S. R. S. Varadhan, Large deviations for the symmetric simple
exclusion process in dimensions d ≥ 3, Probab. Th. Rel. Fields 113 (1999), 1-84.
N. Rajewsky, L. Santen, A. Schadschneider and M. Schreckenberg, The asymmetric exclu-
sion process: comparison of update procedures, J. Statist. Phys. 92 (1998), 151-194.
K. Ravishankar, Fluctuations from the hydrodynamical limit for the symmetric simple ex-
clusion in Zd, Stoch. Proc. Appl. 42 (1992a), 31-37.
K. Ravishankar, Interface fluctuations in the two-dimensional weakly asymmetric simple
exclusion process, Stoch. Proc. Appl. 43 (1992b), 223-247.
F. Rezakhanlou, Hydrodynamic limit for attractive particle systems on Zd, Comm. Math.
Phys. 140 (1991), 417-448.
F. Rezakhanlou, Evolution of tagged particles in nonreversible particle systems, Comm.
Math. Phys. 165 (1994a), 1-32.
F. Rezakhanlou, Propagation of chaos for symmetric simple exclusion, Comm. Pure Appl.
Math. 47 (1994b), 943-957.
F. Rezakhanlou, Microscopic structure of shocks in one conservation laws, Ann. Inst. H.
Poincare Anal. Non Lineaire 12 (1995), 119-153.
H. Rost, Non-equilibrium behaviour of a many particle process: density profile and local
equilibria, Z. Wahrsch. verw. Gebiete 58 (1981), 41-53.
E. Saada, A limit theorem for the position of a tagged particle in a simple exclusion process,
Ann. Probab. 15 (1987), 375-381.
S. Sandow, Partially asymmetric exclusion process with open boundaries, Phys. Rev. E 50
(1994), 2660-2667.
S. Sandow and G. M. Schutz, On Uq [SU(2)]-symmetric driven diffusion, Europhys. Lett.
26 (1994), 7-12.
T. Sasamoto and M. Wadati, Dynamic matrix product ansatz and Bethe ansatz equation for
asymmetric exclusion process with periodic boundary condition, J. Phys. Soc. Japan 66
(1997), 279-282.
G. M. Schutz, Generalized Bethe ansatz solution of a one dimensional asymmetric exclusion
process on a ring with blockage, J. Statist. Phys. 71 (1993), 471-505.
G. M. Schutz, Pairwise balance and invariant measures for generalized exclusion processes,
J. Phys. A 29 (1996), 837-843.
G. M. Schütz, Duality relations for asymmetric exclusion processes, J. Statist. Phys. 86
(1997a), 1265-1287.
G. M. Schütz, Exact solution of the master equation for the asymmetric exclusion process,
J. Statist. Phys. 88 (1997b), 427-445.
G. M. Schütz and E. Domany, Phase transitions in an exactly soluble one dimensional
exclusion process, J. Statist. Phys. 72 (1993), 277-296.
D. Schwartz, Ergodic theorems for an infinite particle system with births and deaths, Ann.
Probab. 4 (1976), 783-801.
T. Seppäläinen, A scaling limit for queues in series, Ann. Appl. Probab. 7 (1997a), 855-872.
T. Seppäläinen, Coupling the totally asymmetric exclusion process with a moving interface,
Markov Proc. Rel. Fields 4 (1998a), 593-628.
T. Seppäläinen, Existence of hydrodynamics for the totally asymmetric simple K-exclusion
process, Ann. Probab. 27 (1999), 361-415.
T. Seppäläinen and J. Krug, Hydrodynamics and platoon formation for a totally asymmetric
exclusion process with particlewise disorder, J. Statist. Phys. 95 (1999).
S. Sethuraman and L. Xu, A central limit theorem for reversible exclusion and zero range
particle systems, Ann. Probab. 24 (1996), 1842-1870.
S. Sethuraman, S. R. S. Varadhan and H.-T. Yau, Diffusive limit of a tagged particle in
asymmetric simple exclusion processes, 1999.
E. R. Speer, The two species totally asymmetric simple exclusion process, Micro, Meso
and Macroscopic Approaches in Physics (M. Fannes, C. Maes and A. Verbeure, ed.),
Plenum, 1994, pp. 91-102.
E. R. Speer, Finite-dimensional representations of a shock algebra, 1. Statist. Phys. 89
(1997), 169-175.
R. Srinivasan, Queues in series via interacting particle systems, Math. Oper. Res. 18 (1993),
39-50.
T. Strobel, The Burgers equation as hydrodynamic limit of the exclusion process with bound-
ary condition, Stoch. Stoch. Rep. 58 (1996), 139-189.
S. R. S. Varadhan, Entropy methods in hydrodynamic scaling, Proceedings of the Interna-
tional Congress of Mathematicians, Birkhauser, 1994a, pp. 196-208.
S. R. S. Varadhan, Regularity of self-diffusion coefficient, The Dynkin Festschrift (M. I.
Freidlin, ed.), Birkhauser, 1994b, pp. 387-397.
S. R. S. Varadhan, Self-diffusion of a tagged particle in equilibrium for asymmetric mean
zero random walk with simple exclusion, Ann. Inst. Henri Poincare 31 (1995), 273-285.
S. R. S. Varadhan, The complex story of simple exclusion, Ito's Stochastic Calculus and
Probability Theory, Springer, 1996, pp. 385-400.
D. Wick, A dynamical phase transition in an infinite particle system, J. Statist. Phys. 38
(1985),1015-1025.
H. Yaguchi, Entropy analysis of a nearest-neighbor attractive/repulsive exclusion process
on one-dimensional lattices, Ann. Probab. 18 (1990), 556-580.
H.-T. Yau, Logarithmic Sobolev inequality for generalized simple exclusion processes,
Probab. Th. ReI. Fields 109 (1997), 507-539.

Other Papers
D. Aldous, Markov chains with almost exponential hitting times, Stoch. Proc. Appl. 13
(1982),305-310.
D. Aldous and P. Diaconis, Hammersley's interacting particle process and longest increas-
ing subsequences, Probab. Th. ReI. Fields 103 (1995), 199-213.
E. Andjel, Invariant measures for the zero range process, Ann. Probab. 10 (1982), 525-547.
J. T. Chayes, A. Puha and T. Sweet, Independent and dependent percolation, Probability
Theory and Applications (E.P. Hsu and S.R.S. Varadhan, eds.), IAS/Park City Mathe-
matics Series, Vol. 6, AMS, 1999, pp. 49-166.
M.-F. Chen, On the ergodic region of Schlögl's model, Proc. Intern. Conf. Dirichlet Forms
and Stoch. Proc., Walter de Gruyter, 1995, pp. 87-102.
M.-F. Chen, W.-D. Ding and D.-G. Zhu, Ergodicity of reversible reaction diffusion processes
with general reaction rates, Acta Math. Sinica 10 (1994), 99-112.
N. G. de Bruijn and P. Erdos, On a recursion formula and some Tauberian theorems, J.
Res. Nat. Bur. Standards 50 (1953), 161-164.
A. De Masi, P. A. Ferrari, S. Goldstein and W. D. Wick, Invariance principle for reversible
Markov processes with applications to random motions in random environments, J.
Statist. Phys. 55 (1989), 787-855.
J. van den Berg and U. Fiebig, On a combinatorial conjecture concerning disjoint occur-
rences of events, Ann. Probab. 15 (1987), 354-374.
J. D. Deuschel, Algebraic L^2 decay of attractive critical processes on the lattice, Ann.
Probab. 22 (1994),264-283.
W.-D. Ding, R. Durrett and T. M. Liggett, Ergodicity of reversible reaction diffusion pro-
cesses, Probab. Th. ReI. Fields 85 (1990), 13-26.
R. Durrett, Oriented percolation in two dimensions, Ann. Probab. 12 (1984), 999-1040.
R. Durrett, Ten Lectures on Particle Systems, Proceedings of the 1993 St. Flour Summer
School, Springer Lecture Notes #1608, 1995, pp. 97-201.
R. Durrett, Stochastic spatial models, Probability Theory and Applications (E.P. Hsu
and S.R.S. Varadhan, eds.), IAS/Park City Mathematics Series, Vol. 6, AMS, 1999,
pp. 5-47.
R. Durrett and C. Neuhauser, Particle systems and reaction-diffusion equations, Ann.
Probab. 22 (1994), 289-333.
P. A. Ferrari and L. R. G. Fontes, The net output process of a system with infinitely many
queues, Ann. Appl. Probab. 4 (1994), 1129-1144.
P. A. Ferrari, A. Galves and C. Landim, Exponential waiting time for a big gap in a one
dimensional zero range process, Ann. Probab. 22 (1994), 284-288.
S. Goldstein, Antisymmetric functionals of reversible Markov processes, Ann. Inst. Henri
Poincare 31 (1995),177-190.
T. E. Harris, Random measures and motions of point processes, Z. Wahrsch. verw. Gebiete
9 (1967), 36-58.
R. A. Holley and D. W. Stroock, Uniform and L^2 convergence in one dimensional stochastic
Ising models, Comm. Math. Phys. 123 (1989), 85-93.
S. Karlin and J. McGregor, Coincidence probabilities, Pac. J. Math. 9 (1959), 1141-1164.
T. M. Liggett, Total positivity and renewal theory, Probability, Statistics and Mathematics:
Papers in Honor of Samuel Karlin, Academic Press, 1989, pp. 141-162.
T. M. Liggett, Survival of discrete time growth models, with applications to oriented perco-
lation, Ann. Appl. Probab. 5 (1995b), 613-636.
T. M. Liggett, Stochastic models of interacting systems, Ann. Probab. 25 (1997), 1-29.
T. M. Liggett, R. H. Schonmann and A. M. Stacey, Domination by product measures, Ann.
Probab. 25 (1997), 71-95.
T. S. Mountford and B. Prabhakar, On the weak convergence of departures from an infinite
series of ·/M/1 queues, Ann. Appl. Probab. 5 (1995), 121-127.
R. H. Schonmann, An approach to characterize metastability and critical droplets in stochas-
tic Ising models, Ann. Inst. H. Poincare A 55 (1991), 591-600.
R. H. Schonmann and S. B. Shlosman, Wulff droplets and the metastable relaxation of
kinetic Ising models, Comm. Math. Phys. 194 (1998), 389-462.
T. Seppäläinen, A microscopic model for the Burgers equation and longest increasing sub-
sequences, Elect. J. Probab. 1 (1996), 1-51.
T. Seppäläinen, Increasing sequences of independent points on the planar lattice, Ann. Appl.
Probab. 7 (1997b), 886-898.
T. Seppäläinen, Large deviations for increasing sequences on the plane, Probab. Th. Rel.
Fields 112 (1998b), 221-244.
F. Spitzer, Interaction of Markov processes, Advances Math. 5 (1970), 246-290.
Index

Additive processes 33
Age process 17
Attractive process 7
Basic coupling 8
BKR inequality 10
Branching processes 25
Branching random walk 32, 80, 136
Burgers' partial differential equation 223, 302, 316
Central limit theorem
- for martingales 29
- for the contact process 128
- for the exclusion process 253, 295
Clustering 140, 147, 201, 203
Coalescing random walks 140, 202
Coexistence 140, 151
Complete convergence 37
- for the contact process on Zd 55, 127
- for the contact process on Td 103
- for the threshold voter model 207
Consensus times 202
Contact processes 1, 31
- asymmetric 133
- in random environments 131
- on a finite set 71
- on Zd 44
- on Td 78
- reversible version of 137
- subcritical 44, 60, 72
- supercritical 44, 57, 74
- with stirring 131
Convolution equation 159
Correlation inequalities 8-11
- for the contact process 125
- for the exclusion process 298
Coupling 6
- for the contact process 33
- for the exclusion process 215
- for the threshold contact process 158
Critical exponents 69, 135, 137
Critical value equality 54
Critical values for branching random walk 82
Critical values for the contact process 42
- on Zd 128
- on Td 135
Currents 239, 272
Cylinder function 2
Degree of a vertex 31
Diffusion constant 307
Dirichlet form 288
Domination by product measures 14
Duality 11
- for the basic contact process 35
- for the linear voter model 140
- for the symmetric exclusion process 212
- for the threshold contact process with T = 1 156
- for the threshold voter model 143
Dynamic phase transition 258
Edge processes for the contact process 129
Ergodic theorem 23
Exclusion processes 1, 209
- added to the contact process 131
- finite 301
- long range 315
- on a finite set 261
- symmetric 212, 298
- weakly asymmetric 303
- with random rates 315
- with spin dynamics 312
- with spontaneous births and deaths 312
Extinction for the contact process 42
- on Zd 54
- on Td 93
Feller process 2
Finite range process 3
Finite space-time condition 50
Finite tree 136
First class particle 218
Fisher-Wright diffusion 201
Fixation 142, 146
FKG theorem 8
Generator 2
Good marginals 232
Graphical representation
- for the contact process 32
- for the voter model 142
- for the exclusion process 215
Growth profile 105, 135
Hammersley's process 315
Harmonic functions 112, 213
Hydrodynamics 225, 301
Indicator function 9
Intermediate phase 94
Invariant measures 4
- for the contact process 109, 119
- for the linear voter model 141
- for the threshold voter model 208
- for exclusion processes 210, 309
Linear voter model 140, 201
Logconvexity 18
Long range
- contact processes 129
- exclusion processes 315
Martingale central limit theorem 29
Matrix approach 262
Maximal string of isolated points 184
Maximum principle 164
Metastability 133
Monotone coupling 33
Monotone process 7
Multitype voter model 204
Occupation times
- for linear voter models 202
- for symmetric exclusion processes 299
Oriented percolation 13, 51
Partial convergence 127
Pivotal arrows 61
Pivotal intervals 62
Poisson process 11
Positive correlations 8-11
Product measure 9
Protein synthesis 209, 306
Queuing and the exclusion process 248, 304, 312
Queuing systems 26
Random environments 131
Rate function 1
Reaction-diffusion processes 131, 312, 314
Recurrence probability 126
Reggeon Field Theory 31
Renewal measure 159
Renewal sequence 16, 167
Resolvent equation 287
Reversible Markov chains 5, 211, 289
Reversible measures 5
Reversible version of the contact process 137
Russo's formula 61
Schlögl's model 314
Second class particle 218
Semigroup 2
Shape theorem 128
Shocks 223
Spacings for stationary sequences 21
Spin systems 1
Stirring 131
Stochastic monotonicity 6
Strong survival 42, 81
Subadditive functions and sequences 12
Subcritical contact process 44, 60, 72
Super Brownian motion 130
Supercritical contact process 44, 57, 74
Survival for the contact process 42
Survival probability 37, 103
Symmetric exclusion processes 212, 298
Tagged particle process 219, 278, 310
Threshold contact process 130, 151
Threshold voter model 142
Total positivity 19
Transition rates 1
Translation invariant measures 21
Triangle condition 135
Ulam's problem 315
Upper invariant measure 34, 127
Voter model 1, 139
- linear 140, 201
- multitype 204
- threshold 142
Weak survival 42, 81
Weakly asymmetric exclusion process 303
