Deformation Quantization
A C Hirshfeld, Universität Dortmund, Dortmund, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Deformation quantization is an alternative way of looking at quantum mechanics. Some of its techniques were introduced by the pioneers of quantum mechanics, but it was first proposed as an autonomous theory in a paper in Annals of Physics (Bayen et al. 1978). More recent reviews treat modern developments (HH I 2001, Dito and Sternheimer 2002, Zachos 2002).

Deformation quantization concentrates on the central physical concepts of quantum theory: the algebra of observables and their dynamical evolution. Because it deals exclusively with functions of phase-space variables, its conceptual break with classical mechanics is less severe than in other approaches. It gives a very precise formulation of the correspondence principle, which played such an important role in the historical development.

Although this article deals mainly with nonrelativistic bosonic systems, deformation quantization is much more general. For inclusion of fermions and the Dirac equation, see Hirshfeld et al. (2002b). The fermionic degrees of freedom may, in special cases, be obtained from the bosonic ones by supersymmetric extension (Hirshfeld et al. 2004). For applications to field theory, see Hirshfeld et al. (2002). For the relation to Hopf algebras, see Hirshfeld et al. (2003), and to geometric algebra, see Hirshfeld et al. (2005).

The observables of a physical system, such as the Hamilton function, are smooth real-valued functions on phase space. Physical quantities of the system at some time, such as the energy, are calculated by evaluating the Hamilton function at the point $x_0 = (q_0, p_0)$ in phase space that characterizes the state of the system at this time (we assume, for the moment, a one-particle system). The mathematical expression for this operation is

$$E = \int H(q,p)\,\delta^{(2)}(q - q_0,\, p - p_0)\,dq\,dp \qquad [1]$$

where $\delta^{(2)}$ is the two-dimensional Dirac delta function. The observables of the dynamical system are functions on the phase space, the states of the system are positive functionals on the observables (here the Dirac delta functions), and we obtain the value of the observable in a definite state by the operation shown in eqn [1].

In general, functions on a manifold are multiplied by each other in a pointwise manner, that is, given two functions $f$ and $g$, their product $fg$ is the function

$$(fg)(x) = f(x)\,g(x) \qquad [2]$$

In the context of classical mechanics, the observables build a commutative algebra, called the commutative "classical algebra of observables."

In Hamiltonian mechanics there is another way to combine two functions on phase space in such a way that the result is again a function on the phase space, namely by using the Poisson bracket

$$\{f,g\}(q,p) = \sum_{i=1}^{n}\left.\left(\frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i}\right)\right|_{q,p} = f\left(\overleftarrow{\partial}_q\,\overrightarrow{\partial}_p - \overleftarrow{\partial}_p\,\overrightarrow{\partial}_q\right)g \qquad [3]$$

in an abbreviated notation.

The notation can be further abbreviated by using $x$ to represent points of the phase-space manifold, $x = (x_1, \ldots, x_{2n})$, and introducing the Poisson tensor $\alpha^{ij}$, where the indices $i, j$ run from 1 to $2n$. In canonical coordinates $\alpha^{ij}$ is represented by the matrix

$$\alpha = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix} \qquad [4]$$

where $I_n$ is the $n \times n$ identity matrix. Then eqn [3] becomes

$$\{f,g\}(x) = \alpha^{ij}\,\partial_i f(x)\,\partial_j g(x) \qquad [5]$$

where $\partial_i = \partial/\partial x_i$.

For a general observable,

$$\dot f = \{f, H\} \qquad [6]$$
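As a minimal sketch of eqns [4]–[6], the bracket can be evaluated exactly for polynomials in one pair of canonical variables. The dictionary representation of polynomials, the helper names, and the sample Hamilton function $H = (p^2 + q^2)/2$ (unit mass and frequency) are assumptions made for illustration only; they do not come from the article.

```python
from fractions import Fraction

# Poisson tensor of eqn [4] for n = 1, in coordinates x = (q, p).
ALPHA = [[0, 1], [-1, 0]]

# A polynomial is a dict {(i, j): coeff} standing for coeff * q**i * p**j.

def partial(f, var):
    """Partial derivative with respect to x[var] (0 -> q, 1 -> p)."""
    out = {}
    for exps, c in f.items():
        e = exps[var]
        if e == 0:
            continue
        new = list(exps)
        new[var] = e - 1
        key = tuple(new)
        out[key] = out.get(key, 0) + c * e
    return out

def multiply(f, g):
    """Pointwise product of eqn [2]."""
    out = {}
    for (a1, b1), c1 in f.items():
        for (a2, b2), c2 in g.items():
            key = (a1 + a2, b1 + b2)
            out[key] = out.get(key, 0) + c1 * c2
    return {k: v for k, v in out.items() if v != 0}

def poisson_bracket(f, g):
    """{f, g} = alpha^{ij} (d_i f)(d_j g), eqn [5]."""
    out = {}
    for i in range(2):
        for j in range(2):
            if ALPHA[i][j] == 0:
                continue
            for key, c in multiply(partial(f, i), partial(g, j)).items():
                out[key] = out.get(key, 0) + ALPHA[i][j] * c
    return {k: v for k, v in out.items() if v != 0}

q = {(1, 0): Fraction(1)}
p = {(0, 1): Fraction(1)}
# Illustrative Hamilton function (an assumption): H = (p^2 + q^2) / 2.
H = {(2, 0): Fraction(1, 2), (0, 2): Fraction(1, 2)}

q_dot = poisson_bracket(q, H)   # eqn [6] applied to f = q
p_dot = poisson_bracket(p, H)   # eqn [6] applied to f = p
```

With these definitions, `q_dot` represents $p$ and `p_dot` represents $-q$, which are Hamilton's equations for the sample system.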
Because $\alpha$ transforms like a tensor with respect to coordinate transformations, eqn [5] may also be written in noncanonical coordinates. In this case the components of $\alpha$ need not be constants, and may depend on the point of the manifold at which they are evaluated. But in Hamiltonian mechanics, $\alpha$ is still required to be invertible. A manifold equipped with a Poisson tensor of this kind is called a symplectic manifold. In general, the tensor $\alpha$ is no longer required to be invertible, but it nevertheless suffices to define Poisson brackets via eqn [5], and these brackets are required to have the properties

1. $\{f, g\} = -\{g, f\}$,
2. $\{f, gh\} = \{f, g\}h + g\{f, h\}$, and
3. $\{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0$.

Property (1) states that the Poisson bracket is antisymmetric, property (2) is referred to as the Leibniz rule, and property (3) is called the Jacobi identity. The Poisson bracket used in Hamiltonian mechanics satisfies all these properties, but we now abstract these properties from the concrete prescription of eqn [3], and a Poisson manifold $(M, \alpha)$ is defined as a smooth manifold $M$ equipped with a Poisson tensor $\alpha$, whose components are no longer necessarily constant, such that the bracket defined by eqn [5] has the above properties. It turns out that such manifolds provide a better context for treating dynamical systems with symmetries. In fact, they are essential for treating gauge-field theories, which govern the fundamental interactions of elementary particles.

Quantum Mechanics and Star Products

The essential difference between classical and quantum mechanics is Heisenberg's uncertainty relation, which implies that in the latter, states can no longer be represented as points in phase space. The uncertainty is a consequence of the noncommutativity of the quantum mechanical observables. That is, the commutative classical algebra of observables must be replaced by a noncommutative quantum algebra of observables.

In the conventional approach to quantum mechanics, this noncommutativity is implemented by representing the quantum mechanical observables by linear operators in Hilbert space. Physical quantities are then represented by eigenvalues of these operators, and physical states are related to the operator eigenfunctions. Although these entities are somehow related to their classical counterparts, to which they are supposed to reduce in an appropriate limit, the precise relationship has remained obscure, one hundred years after the beginnings of quantum mechanics. Textbooks refer to the correspondence principle, which guided the pioneers of the subject. Attempts to give this idea a precise formulation by postulating a specific relation between the classical Poisson brackets of observables and the commutators of the corresponding quantum mechanical operators, as undertaken, for example, by Dirac and von Neumann, encountered insurmountable difficulties, as pointed out by Groenewold in 1946 in an unjustly neglected paper (Groenewold 1946). In the same paper Groenewold also wrote down the first explicit representation of a "star product" (see eqn [11]), without however realizing the potential of this concept for overcoming the difficulties that he wanted to resolve.

In the deformation quantization approach, there is no such break when going from the classical system to the corresponding quantum system; we describe the quantum system by using the same entities that are used to describe the classical system. The observables of the system are described by the same functions on phase space as their classical counterparts. Uncertainty is realized by describing physical states as distributions on phase space that are not sharply localized, in contrast to the Dirac delta functions which occur in the classical case. When we evaluate an observable in some definite state according to the quantum analog of eqn [1] (see eqn [24]), values of the observable in a whole region contribute to the number that is obtained, which is thus an average value of the observable in the given state. Noncommutativity is incorporated by introducing a noncommutative product for functions on phase space, so that we get a new noncommutative quantum algebra of observables. The systematic work on deformation quantization stems from Gerstenhaber's seminal paper, where he introduced the concept of a star product of smooth functions on a manifold (Gerstenhaber 1964).

For applications to quantum mechanics, we consider smooth complex-valued functions on a Poisson manifold. A star product $f \ast g$ of two such functions is a new smooth function, which, in general, is described by an infinite power series:

$$f \ast g = fg + (i\hbar)\,C_1(f,g) + O(\hbar^2) = \sum_{n=0}^{\infty} (i\hbar)^n\, C_n(f,g) \qquad [7]$$

The first term in the series is the pointwise product given in eqn [2], and $(i\hbar)$ is the deformation parameter, which is assumed to be varying continuously. If $\hbar$ is identified with Planck's constant, then what varies is really the magnitude of the action of the dynamical system considered in units of $\hbar$: the classical limit holds for systems with large action. In this limit, which we express here as $\hbar \to 0$, the star product reduces to the usual product. In general, the coefficients $C_n$ will be such that the new product is noncommutative, and we consider the noncommutative algebra formed from the functions with this new multiplication law as a deformation of the original commutative algebra, which uses pointwise multiplication of the functions.

The expressions $C_n(f, g)$ denote functions made up of the derivatives of the functions $f$ and $g$. It is obvious that without further restrictions on these coefficients, the star product is too arbitrary to be of any use. Gerstenhaber's discovery was that the simple requirement that the new product be associative imposes such strong requirements on the coefficients $C_n$ that they are essentially unique in the most important cases (up to an equivalence relation, as discussed below). Formally, Gerstenhaber required that the coefficients satisfy the following properties:

1. $\sum_{j+k=n} C_j(C_k(f,g),h) = \sum_{j+k=n} C_j(f, C_k(g,h))$,
2. $C_0(f,g) = fg$, and
3. $C_1(f,g) - C_1(g,f) = \{f,g\}$.

Property (1) guarantees that the star product is associative: $(f \ast g) \ast h = f \ast (g \ast h)$. Property (2) means that in the limit $\hbar \to 0$, the star product $f \ast g$ agrees with the pointwise product $fg$. Property (3) has at least two aspects: (i) mathematically, it anchors the new product to the given structure of the Poisson manifold and (ii) physically, it provides the connection between the classical and quantum behavior of the dynamical system. Define a commutator by using the new product:

$$[f,g]_\ast = f \ast g - g \ast f \qquad [8]$$

Property (3) may then be written as

$$\lim_{\hbar \to 0} \frac{1}{i\hbar}\,[f,g]_\ast = \{f,g\} \qquad [9]$$

Equation [9] is the correct form of the correspondence principle. In general, the quantity on the left-hand side of eqn [9] reduces to the Poisson bracket only in the classical limit. The mathematical difficulties encountered by previous attempts to formulate the correspondence principle stemmed from trying to enforce equality between the Poisson bracket and the corresponding expression involving the quantum mechanical commutator. Equation [9] shows that such a relation in general only holds up to corrections of higher order in $\hbar$.

For physical applications we usually require the star product to be Hermitian: $\overline{f \ast g} = \bar g \ast \bar f$, where $\bar f$ denotes the complex conjugate of $f$. The star products considered in this article have this property.

For a given Poisson manifold, it is not clear a priori if a star product for the smooth functions on the manifold actually exists, that is, whether it is at all possible to find coefficients $C_n$ that satisfy the above list of properties. Even if we find such coefficients, it is still not clear that the series they define through eqn [7] yields a smooth function. Mathematicians have worked hard to answer these questions in the general case. For flat Euclidean spaces, $M = \mathbb{R}^{2n}$, a specific star product has long been known. In this case, the components of the Poisson tensor $\alpha^{ij}$ can be taken to be constants. The coefficient $C_1$ can then be chosen antisymmetric, so that

$$C_1(f,g) = \tfrac{1}{2}\,\alpha^{ij}\,(\partial_i f)(\partial_j g) = \tfrac{1}{2}\{f,g\} \qquad [10]$$

by property (3) above. The higher-order coefficients may be obtained by exponentiation of $C_1$. This procedure yields the Moyal star product (Moyal 1949):

$$f \ast_M g = f \exp\left(\frac{i\hbar}{2}\,\alpha^{ij}\,\overleftarrow{\partial}_i\,\overrightarrow{\partial}_j\right) g \qquad [11]$$

In canonical coordinates, eqn [11] becomes

$$(f \ast_M g)(q,p) = f(q,p)\,\exp\left[\frac{i\hbar}{2}\left(\overleftarrow{\partial}_q\,\overrightarrow{\partial}_p - \overleftarrow{\partial}_p\,\overrightarrow{\partial}_q\right)\right] g(q,p) \qquad [12]$$

$$= \sum_{m,n=0}^{\infty}\left(\frac{i\hbar}{2}\right)^{m+n}\frac{(-1)^m}{m!\,n!}\,\left(\partial_p^m\,\partial_q^n f\right)\left(\partial_p^n\,\partial_q^m g\right) \qquad [13]$$

We now come to the question of uniqueness of the star product on a given Poisson manifold. Two star products $\ast$ and $\ast'$ are said to be "c-equivalent" if there exists an invertible transition operator

$$T = 1 + \hbar T_1 + \cdots = \sum_{n=0}^{\infty} \hbar^n\, T_n \qquad [14]$$

where the $T_n$ are differential operators that satisfy

$$f \ast' g = T^{-1}\big((Tf) \ast (Tg)\big) \qquad [15]$$

It is known that for $M = \mathbb{R}^{2n}$ all admissible star products are c-equivalent to the Moyal product. The concept of c-equivalence is a mathematical one (c stands for cohomology (Gerstenhaber 1964)); it does not by itself imply any kind of physical equivalence, as shown below.
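For polynomial observables the double series of eqn [13] terminates, so the Moyal product can be evaluated exactly. The sketch below is illustrative only: the dictionary representation, the helper names, and the use of a formal symbol `lam` standing for the deformation parameter $(i\hbar)$ are assumptions, not part of the article.

```python
from fractions import Fraction
from math import factorial

# A polynomial is a dict {(i, j, k): coeff} standing for
# coeff * q**i * p**j * lam**k, where lam is the formal parameter (i*hbar).

def clean(f):
    return {key: c for key, c in f.items() if c != 0}

def add(f, g, sign=1):
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + sign * c
    return clean(out)

def mul(f, g):
    out = {}
    for (a1, b1, k1), c1 in f.items():
        for (a2, b2, k2), c2 in g.items():
            key = (a1 + a2, b1 + b2, k1 + k2)
            out[key] = out.get(key, 0) + c1 * c2
    return clean(out)

def deriv(f, nq, np_):
    """Apply d^nq/dq^nq and d^np_/dp^np_ to f."""
    out = {}
    for (a, b, k), c in f.items():
        if a >= nq and b >= np_:
            fac = (factorial(a) // factorial(a - nq)) * (factorial(b) // factorial(b - np_))
            key = (a - nq, b - np_, k)
            out[key] = out.get(key, 0) + c * fac
    return clean(out)

def degree(f):
    return max((a + b for (a, b, _k) in f), default=0)

def moyal(f, g):
    """Moyal product via eqn [13]; the sum is finite for polynomials."""
    top = min(degree(f), degree(g))  # higher derivative orders vanish
    out = {}
    for m in range(top + 1):
        for n in range(top + 1 - m):
            coeff = Fraction((-1) ** m, 2 ** (m + n) * factorial(m) * factorial(n))
            # (d_p^m d_q^n f) * (d_p^n d_q^m g)
            term = mul(deriv(f, n, m), deriv(g, m, n))
            out = add(out, mul({(0, 0, m + n): coeff}, term))
    return out

def star_commutator(f, g):
    """[f, g]_* of eqn [8]."""
    return add(moyal(f, g), moyal(g, f), sign=-1)

q = {(1, 0, 0): Fraction(1)}
p = {(0, 1, 0): Fraction(1)}
q2 = mul(q, q)
p2 = mul(p, p)
qp = mul(q, p)
```

Here `star_commutator(q, p)` is the single term `lam`, that is $[q,p]_\ast = i\hbar$, and `star_commutator(q2, p2)` is `4*q*p*lam`, so dividing by $i\hbar$ reproduces the Poisson bracket $\{q^2, p^2\} = 4qp$ in line with eqn [9]. For polynomials of higher degree the commutator also contains higher powers of `lam`, which is why eqn [9] is stated as a limit.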
Another expression for the Moyal product is a kind of Fourier representation:

$$(f \ast_M g)(q,p) = \frac{1}{\hbar^2\pi^2}\int dq_1\,dq_2\,dp_1\,dp_2\; f(q_1,p_1)\,g(q_2,p_2)\,\exp\left[\frac{2}{i\hbar}\Big(p(q_1 - q_2) + q(p_2 - p_1) + (q_2 p_1 - q_1 p_2)\Big)\right] \qquad [16]$$

Equation [16] has an interesting geometrical interpretation. Denote points in phase space by vectors, for example, in two dimensions:

$$r = \begin{pmatrix} q \\ p \end{pmatrix}, \quad r_1 = \begin{pmatrix} q_1 \\ p_1 \end{pmatrix}, \quad r_2 = \begin{pmatrix} q_2 \\ p_2 \end{pmatrix} \qquad [17]$$

Now, consider the triangle in phase space spanned by the vectors $r - r_1$ and $r - r_2$. Its area (symplectic volume) is

$$A(r, r_1, r_2) = \tfrac{1}{2}(r - r_1) \wedge (r - r_2) = \tfrac{1}{2}\big[p(q_2 - q_1) + q(p_1 - p_2) + (q_1 p_2 - q_2 p_1)\big] \qquad [18]$$

which is proportional to the exponent in eqn [16]. Hence, we may rewrite eqn [16] as

$$(f \ast g)(r) = \frac{1}{\hbar^2\pi^2}\int dr_1\,dr_2\; f(r_1)\,g(r_2)\,\exp\left[\frac{4i}{\hbar}\,A(r, r_1, r_2)\right] \qquad [19]$$

Deformation Quantization

The properties of the star product are well adapted for describing the noncommutative quantum algebra of observables. We have already discussed the associativity and the incorporation of the classical and semiclassical limits. Note that the characteristic nonlocality feature of quantum mechanics is also explicit. In the expression for the Moyal product given in eqn [13], the star product of the functions $f$ and $g$ at the point $x = (q, p)$ involves not only the values of the functions $f$ and $g$ at this point, but also all higher derivatives of these functions at $x$. But for an analytic function, knowledge of all the derivatives at a given point is equivalent to the knowledge of the function on the entire space. In the integral expression of eqn [16], we also see that knowledge of the functions $f$ and $g$ on the whole phase space is necessary to determine the value of the star product at the point $x$.

The c-equivalent star products correspond to different quantization schemes. Having chosen a quantization scheme, the quantities of interest for the quantum system may be calculated. It turns out that different quantization schemes lead to different spectra for the observables. The choice of a specific quantization scheme can only be motivated by further physical requirements. In the simple example we discuss below, the classical system is completely specified by its Hamilton function. In more general cases, one may have to decide what constitutes a sufficiently large set of good observables for a complete specification of the system (Bayen et al. 1978).

A state is characterized by its energy $E$; the set of all possible values for the energy is called the spectrum of the system. The states are described by distributions on phase space called projectors. The state corresponding to the energy $E$ is denoted by $\pi_E(q, p)$. These distributions are normalized:

$$\frac{1}{2\pi\hbar}\int \pi_E(q,p)\,dq\,dp = 1 \qquad [20]$$

and idempotent:

$$(\pi_E \ast \pi_{E'})(q,p) = \delta_{E,E'}\,\pi_E(q,p) \qquad [21]$$

The fact that the Hamilton function takes the value $E$ when the system is in the state corresponding to this energy is expressed by the equation

$$(H \ast \pi_E)(q,p) = E\,\pi_E(q,p) \qquad [22]$$

Equation [22] corresponds to the time-independent Schrödinger equation, and is sometimes called the "∗-genvalue equation." The spectral decomposition of the Hamilton function is given by

$$H(q,p) = \sum_E E\,\pi_E(q,p) \qquad [23]$$

where the summation sign may indicate an integration if the spectrum is continuous. The quantum mechanical version of eqn [1] is

$$E = \frac{1}{2\pi\hbar}\int (H \ast \pi_E)(q,p)\,dq\,dp = \frac{1}{2\pi\hbar}\int H(q,p)\,\pi_E(q,p)\,dq\,dp \qquad [24]$$

where the last expression may be obtained by using eqn [16] for the star product.

The time-evolution function for a time-independent Hamilton function is denoted by $\mathrm{Exp}(Ht)$, and the fact that the Hamilton function is the generator of the time evolution of the system is expressed by

$$i\hbar\,\frac{d}{dt}\,\mathrm{Exp}(Ht) = H \ast \mathrm{Exp}(Ht) \qquad [25]$$
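The normalization of eqn [20] and the phase-space average of eqn [24] can be checked numerically for a concrete projector. The sketch below borrows the Gaussian ground-state projector of the harmonic oscillator in the Moyal scheme, $\pi_0 = 2e^{-2H/\hbar\omega}$, which the article derives later (eqn [47]). The units $m = \omega = \hbar = 1$, the midpoint-rule integration, the grid size, and all function names are assumptions made for illustration.

```python
import math

def hamiltonian(q, p):
    """Oscillator Hamilton function with m = omega = 1 (an assumed choice of units)."""
    return 0.5 * (p * p + q * q)

def ground_state_projector(q, p, hbar=1.0):
    """pi_0 = 2 exp(-2H/(hbar*omega)) with omega = 1, taken from eqn [47]."""
    return 2.0 * math.exp(-2.0 * hamiltonian(q, p) / hbar)

def phase_space_average(func, hbar=1.0, half_width=8.0, steps=400):
    """Midpoint-rule estimate of (1 / (2 pi hbar)) * integral of func(q, p) dq dp."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        q = -half_width + (i + 0.5) * h
        for j in range(steps):
            p = -half_width + (j + 0.5) * h
            total += func(q, p)
    return total * h * h / (2.0 * math.pi * hbar)

# Left-hand side of eqn [20].
norm = phase_space_average(ground_state_projector)

# Last expression in eqn [24]: the average of H weighted by the projector.
energy = phase_space_average(lambda q, p: hamiltonian(q, p) * ground_state_projector(q, p))
```

In these units `norm` comes out close to 1 and `energy` close to 1/2, the ground-state energy $\hbar\omega/2$ that the Moyal scheme assigns to the oscillator.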
Equation [25] corresponds to the time-dependent Schrödinger equation. It is solved by the star exponential:

$$\mathrm{Exp}(Ht) = \sum_{n=0}^{\infty}\frac{1}{n!}\left(\frac{-it}{\hbar}\right)^n (H\ast)^n \qquad [26]$$

where $(H\ast)^n = \underbrace{H \ast H \ast \cdots \ast H}_{n\ \text{times}}$. Because each state of definite energy $E$ has a time evolution $\exp(-iEt/\hbar)$, the complete time-evolution function may be written in the form

$$\mathrm{Exp}(Ht) = \sum_E \pi_E\, e^{-iEt/\hbar} \qquad [27]$$

This expression is called the "Fourier–Dirichlet expansion" for the time-evolution function.

Questions concerning the existence and uniqueness of the star exponential as a $C^\infty$ function and the nature of the spectrum and the projectors again require careful mathematical analysis. The problem of finding general conditions on the Hamilton function $H$ which ensure a reasonable physical spectrum is analogous to the problem of showing, in the conventional approach, that the symmetric operator $\hat H$ is self-adjoint and finding its spectral projections.

The Simple Harmonic Oscillator

As an example of the above procedure, we treat the simple one-dimensional harmonic oscillator characterized by the classical Hamilton function

$$H(q,p) = \frac{p^2}{2m} + \frac{m\omega^2}{2}\,q^2 \qquad [28]$$

In terms of the holomorphic variables

$$a = \sqrt{\frac{m\omega}{2}}\left(q + i\,\frac{p}{m\omega}\right), \qquad \bar a = \sqrt{\frac{m\omega}{2}}\left(q - i\,\frac{p}{m\omega}\right) \qquad [29]$$

the Hamilton function becomes

$$H = \omega\, a\,\bar a \qquad [30]$$

Our aim is to calculate the time-evolution function. We first choose a quantization scheme characterized by the normal star product

$$f \ast_N g = f\, e^{\hbar\,\overleftarrow{\partial}_a\,\overrightarrow{\partial}_{\bar a}}\, g \qquad [31]$$

We then have

$$\bar a \ast_N a = \bar a\, a, \qquad a \ast_N \bar a = a\,\bar a + \hbar \qquad [32]$$

so that

$$[a, \bar a]_{\ast_N} = \hbar \qquad [33]$$

Equation [25] for this case is

$$i\hbar\,\frac{d}{dt}\,\mathrm{Exp}_N(Ht) = \left(H + \hbar\omega\,\bar a\,\partial_{\bar a}\right)\mathrm{Exp}_N(Ht) \qquad [34]$$

with the solution

$$\mathrm{Exp}_N(Ht) = e^{-a\bar a/\hbar}\,\exp\left(e^{-i\omega t}\,a\bar a/\hbar\right) \qquad [35]$$

By expanding the last exponential in eqn [35], we obtain the Fourier–Dirichlet expansion

$$\mathrm{Exp}_N(Ht) = e^{-a\bar a/\hbar}\sum_{n=0}^{\infty}\frac{1}{\hbar^n\, n!}\,\bar a^n a^n\, e^{-in\omega t} \qquad [36]$$

From here, we can read off the energy eigenvalues and the projectors describing the states by comparing coefficients in eqns [27] and [36]:

$$\pi_0^{(N)} = e^{-a\bar a/\hbar} \qquad [37]$$

$$\pi_n^{(N)} = \frac{1}{\hbar^n\, n!}\,\pi_0^{(N)}\,\bar a^n a^n = \frac{1}{\hbar^n\, n!}\,\bar a^n \ast_N \pi_0^{(N)} \ast_N a^n \qquad [38]$$

$$E_n = n\hbar\omega \qquad [39]$$

Note that the spectrum obtained in eqn [39] does not include the zero-point energy. The projector onto the ground state $\pi_0^{(N)}$ satisfies

$$a \ast_N \pi_0^{(N)} = 0 \qquad [40]$$

The spectral decomposition of the Hamilton function (eqn [23]) is in this case

$$H = \sum_{n=0}^{\infty} n\hbar\omega\,\frac{1}{\hbar^n\, n!}\, e^{-a\bar a/\hbar}\,\bar a^n a^n = \omega\, a\,\bar a \qquad [41]$$

We now consider the Moyal quantization scheme. If we write eqn [12] in terms of holomorphic coordinates, we obtain

$$f \ast_M g = f \exp\left[\frac{\hbar}{2}\left(\overleftarrow{\partial}_a\,\overrightarrow{\partial}_{\bar a} - \overleftarrow{\partial}_{\bar a}\,\overrightarrow{\partial}_a\right)\right] g \qquad [42]$$

Here, we have

$$a \ast_M \bar a = a\bar a + \frac{\hbar}{2}, \qquad \bar a \ast_M a = a\bar a - \frac{\hbar}{2} \qquad [43]$$

and again

$$[a, \bar a]_{\ast_M} = \hbar \qquad [44]$$

The value of the commutator of two phase-space variables is fixed by property (3) of the star product, and cannot change when one goes to a c-equivalent star product. The Moyal star product is c-equivalent to the normal star product with the transition operator

$$T = e^{-(\hbar/2)\,\partial_a\,\partial_{\bar a}} \qquad [45]$$

We can use this operator to transform the normal product version of the ∗-genvalue equation, eqn [22], into the corresponding Moyal product version according to eqn [15]. The result is

$$H \ast_M \pi_n^{(M)} = \omega\left(\bar a \ast_M a + \frac{\hbar}{2}\right) \ast_M \pi_n^{(M)} = \hbar\omega\left(n + \tfrac{1}{2}\right)\pi_n^{(M)} \qquad [46]$$

with

$$\pi_0^{(M)} = T\pi_0^{(N)} = 2\,e^{-2a\bar a/\hbar} \qquad [47]$$

$$\pi_n^{(M)} = T\pi_n^{(N)} = \frac{1}{\hbar^n\, n!}\,\bar a^n \ast_M \pi_0^{(M)} \ast_M a^n \qquad [48]$$

The projector onto the ground state $\pi_0^{(M)}$ satisfies

$$a \ast_M \pi_0^{(M)} = 0 \qquad [49]$$

We now have, for the spectrum,

$$E_n = \left(n + \tfrac{1}{2}\right)\hbar\omega \qquad [50]$$

which is the textbook result. We conclude that for this problem, the Moyal quantization scheme is the correct one.

The use of the Moyal product in eqn [25] for the star exponential of the harmonic oscillator leads to the following differential equation for the time-evolution function:

$$i\hbar\,\frac{d}{dt}\,\mathrm{Exp}_M(Ht) = \left(H - \frac{(\hbar\omega)^2}{4}\,\partial_H - \frac{(\hbar\omega)^2}{4}\,H\,\partial_H^2\right)\mathrm{Exp}_M(Ht) \qquad [51]$$

The solution is

$$\mathrm{Exp}_M(Ht) = \frac{1}{\cos(\omega t/2)}\,\exp\left[\frac{2H}{i\hbar\omega}\,\tan\frac{\omega t}{2}\right] \qquad [52]$$

This expression can be brought into the form of the Fourier–Dirichlet expansion of eqn [27] by using the generating function for the Laguerre polynomials:

$$\frac{1}{1+s}\,\exp\left[\frac{zs}{1+s}\right] = \sum_{n=0}^{\infty} s^n\,(-1)^n\,L_n(z) \qquad [53]$$

with $s = e^{-i\omega t}$. The projectors then become

$$\pi_n^{(M)} = 2\,(-1)^n\, e^{-2H/\hbar\omega}\, L_n\!\left(\frac{4H}{\hbar\omega}\right) \qquad [54]$$

which is equivalent to the expression already found in eqn [48].

Conventional Quantization

One usually finds the observables characterizing some quantum mechanical system by starting from the corresponding classical system and then, either by guessing or by using some more or less systematic method, finding the corresponding representations of the classical quantities in the quantum system. The guiding principle is the correspondence principle: the quantum mechanical relations are supposed to reduce somehow to the classical relations in an appropriate limit. Early attempts to systematize this procedure involved finding an assignment rule $\Theta$ that associates to each phase-space function $f$ a linear operator in Hilbert space $\hat f = \Theta(f)$ in such a way that in the limit $\hbar \to 0$, the quantum mechanical equations of motion go over to the classical equations. Such an assignment cannot be unique, because even though an operator that is a function of the basic operators $\hat Q$ and $\hat P$ reduces to a unique phase-space function in the limit $\hbar \to 0$, there are many ways to assign an operator to a given phase-space function, due to the different orderings of the operators $\hat Q$ and $\hat P$ that all reduce to the original phase-space function. Different ordering procedures correspond to different quantization schemes. It turns out that there is no quantization scheme for systems with observables that depend on the coordinates or the momenta to a higher power than quadratic which leads to a correspondence between the quantum mechanical and the classical equations of motion, and which simultaneously strictly maintains the Dirac–von Neumann requirement that $(1/i\hbar)[\hat f, \hat g] \leftrightarrow \{f, g\}$. Only within the framework of deformation quantization does the correspondence principle acquire a precise meaning.

A general scheme for associating phase-space functions and Hilbert space operators, which includes all of the usual orderings, is given as follows: the operator $\Theta_\lambda(f)$ corresponding to a given phase-space function $f$ is

$$\Theta_\lambda(f) = \int \tilde f(\xi,\eta)\, e^{-i(\xi\hat Q + \eta\hat P)}\, e^{\lambda(\xi,\eta)}\, d\xi\, d\eta \qquad [55]$$

where $\tilde f$ is the Fourier transform of $f$, and $(\hat Q, \hat P)$ are the Schrödinger operators that correspond to the phase-space variables $(q, p)$; $\lambda(\xi, \eta)$ is a quadratic form:

$$\lambda(\xi,\eta) = \frac{\hbar}{4}\left(\alpha\eta^2 + \beta\xi^2 + 2i\gamma\,\xi\eta\right) \qquad [56]$$

Different choices for the constants $(\alpha, \beta, \gamma)$ yield different operator ordering schemes.
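The statement that different orderings of the same classical function lead to different quantum operators can be made concrete for the oscillator function $a\bar a$. The sketch below compares two standard orderings in a truncated number basis: the normal-ordered operator $\hat a^\dagger \hat a$ and the symmetric (Weyl) ordered operator $\tfrac{1}{2}(\hat a^\dagger \hat a + \hat a \hat a^\dagger)$. The matrix truncation, the function names, and the choice $\hbar = \omega = 1$ are assumptions made for illustration only.

```python
import math

def ladder_matrices(dim, hbar=1.0):
    """Truncated matrices of a-hat and its adjoint, scaled so that [a, a^dagger] = hbar."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(hbar * n)  # a|n> = sqrt(hbar * n) |n-1>
    a_dag = [[a[col][row] for col in range(dim)] for row in range(dim)]
    return a, a_dag

def matmul(x, y):
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def ordered_spectra(dim, omega=1.0, hbar=1.0):
    """Diagonal entries of omega * (ordered version of a * a-bar) in the number basis."""
    a, a_dag = ladder_matrices(dim, hbar)
    n_op = matmul(a_dag, a)    # normal ordering: creation operator on the left
    anti = matmul(a, a_dag)    # opposite ordering
    normal = [omega * n_op[i][i] for i in range(dim)]
    weyl = [0.5 * omega * (n_op[i][i] + anti[i][i]) for i in range(dim)]
    return normal, weyl

normal, weyl = ordered_spectra(dim=6)
```

Both operators are diagonal in this basis, so the lists hold their eigenvalues. Apart from the last entry of `weyl`, which is an artifact of the truncation, `normal` reproduces $n\hbar\omega$ as in eqn [39] and `weyl` reproduces $(n + \tfrac{1}{2})\hbar\omega$ as in eqn [50].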
The relation between operator algebras and star products is given by

$$\Theta(f)\,\Theta(g) = \Theta(f \ast g) \qquad [57]$$

where $\Theta$ is a linear assignment of the kind discussed above. Different assignments, which correspond to different operator orderings, correspond to c-equivalent star products. Equation [57] shows that the quantum mechanical algebra of observables is a representation of the star product algebra. Because in the algebraic approach to quantum theory all the information concerning the quantum system may be extracted from the algebra of observables, specifying the star product completely determines the quantum system.

The inverse procedure of finding the phase-space function that corresponds to a given operator $\hat f$ is, for the special case of Weyl ordering, given by

$$f(q,p) = \int \left\langle q + \tfrac{1}{2}\xi \right| \hat f \left| q - \tfrac{1}{2}\xi \right\rangle e^{-i\xi p/\hbar}\, d\xi \qquad [58]$$

When using holomorphic coordinates, it is convenient to work with the coherent states

$$\hat a\,|a\rangle = a\,|a\rangle, \qquad \langle \bar a|\,\hat a^\dagger = \langle \bar a|\,\bar a \qquad [59]$$

These states are related to the energy eigenstates of the harmonic oscillator

$$|n\rangle = \frac{1}{\sqrt{n!}}\,\hat a^{\dagger n}\,|0\rangle \qquad [60]$$

by

$$|a\rangle = e^{-\frac{1}{2}a\bar a/\hbar}\sum_{n=0}^{\infty}\frac{a^n}{\sqrt{n!}}\,|n\rangle, \qquad \langle \bar a| = e^{-\frac{1}{2}a\bar a/\hbar}\sum_{n=0}^{\infty}\frac{\bar a^n}{\sqrt{n!}}\,\langle n| \qquad [61]$$

In normal ordering, we obtain the phase-space function $f(a, \bar a)$ corresponding to the operator $\hat f$ by just taking the matrix element between coherent states:

$$f(a,\bar a) = \langle \bar a|\, f(\hat a, \hat a^\dagger)\, |a\rangle \qquad [62]$$

For holomorphic coordinates, it is easy to show

$$\pi_n^{(N)}(a,\bar a) = \frac{1}{\hbar^n}\,\langle \bar a|n\rangle\langle n|a\rangle = \frac{1}{\hbar^n\, n!}\,(\bar a a)^n\, e^{-a\bar a/\hbar} \qquad [63]$$

in agreement with eqn [38] for the normal star product projectors.

The star exponential $\mathrm{Exp}(Ht)$ and the projectors $\pi_n$ are the phase-space representations of the time-evolution operator $\exp(-i\hat H t/\hbar)$ and the projection operators $\hat\rho_n = |n\rangle\langle n|$, respectively. Weyl ordering corresponds to the use of the Moyal star product for quantization and normal ordering to the use of the normal star product. In the density matrix formalism, we say that the projection operator is that of a pure state, which is characterized by the property of being idempotent: $\hat\rho_n^2 = \hat\rho_n$ (compare eqn [21]). The integral of the projector over the momentum gives the probability distribution in position space:

$$\frac{1}{2\pi\hbar}\int \pi_n^{(M)}(q,p)\,dp = \frac{1}{2\pi\hbar}\int \langle q + \xi/2|n\rangle\langle n|q - \xi/2\rangle\, e^{-i\xi p/\hbar}\, d\xi\, dp = \langle q|n\rangle\langle n|q\rangle = |\psi_n(q)|^2 \qquad [64]$$

and the integral over the position gives the probability distribution in momentum space:

$$\frac{1}{2\pi\hbar}\int \pi_n^{(M)}(q,p)\,dq = \langle p|n\rangle\langle n|p\rangle = |\tilde\psi_n(p)|^2 \qquad [65]$$

The normalization is

$$\frac{1}{2\pi\hbar}\int \pi_n^{(M)}(q,p)\,dq\,dp = 1 \qquad [66]$$

which is the same as eqn [20]. Applying these relations to the ground-state projector of the harmonic oscillator, eqn [47] shows that this is a minimum-uncertainty state. In the classical limit $\hbar \to 0$, it goes to a Dirac $\delta$-function. The expectation value of the Hamiltonian operator is

$$\frac{1}{2\pi\hbar}\int \left(H \ast_M \pi_n^{(M)}\right)(q,p)\,dq\,dp = \int \langle q|\hat H\hat\rho_n|q\rangle\,dq = \mathrm{tr}\left(\hat H\hat\rho_n\right) \qquad [67]$$

which should be compared to eqn [24].

Quantum Field Theory

A real scalar field is given in terms of the coefficients $a(k)$, $\bar a(k)$ by

$$\phi(x) = \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2\omega_k}}\left[a(k)\,e^{-ikx} + \bar a(k)\,e^{ikx}\right] \qquad [68]$$

where $\hbar\omega_k = \sqrt{\hbar^2 k^2 + m^2}$ is the energy of a single quantum of the field. The corresponding quantum field operator is

$$\Phi(x) = \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2\omega_k}}\left[\hat a(k)\,e^{-ikx} + \hat a^\dagger(k)\,e^{ikx}\right] \qquad [69]$$

where $\hat a(k)$, $\hat a^\dagger(k)$ are the annihilation and creation operators for a quantum of the field with momentum $\hbar k$. The Hamiltonian is

$$H = \int d^3k\; \hbar\omega_k\, \hat a^\dagger(k)\,\hat a(k) \qquad [70]$$
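As a rough illustration of eqn [70], the momentum integral can be replaced by a finite set of modes, each carrying a whole number of quanta. The discretization into modes, the function names, and the units with $c = 1$ are assumptions made for this sketch; they do not come from the article.

```python
import math

def single_quantum_energy(k, m, hbar=1.0):
    """hbar * omega_k = sqrt(hbar^2 |k|^2 + m^2), in units with c = 1."""
    k_squared = sum(component * component for component in k)
    return math.sqrt(hbar ** 2 * k_squared + m ** 2)

def fock_state_energy(occupations, m, hbar=1.0):
    """Discrete analog of eqn [70]: sum over modes of n_k * hbar * omega_k.

    `occupations` maps a wave vector (a 3-tuple) to the number of quanta in that mode.
    """
    return sum(n * single_quantum_energy(k, m, hbar) for k, n in occupations.items())

vacuum_energy = fock_state_energy({}, m=1.0)
two_quanta_at_rest = fock_state_energy({(0.0, 0.0, 0.0): 2}, m=1.0)
one_massless_quantum = fock_state_energy({(3.0, 4.0, 0.0): 1}, m=0.0)
```

In this discretized picture the state with no quanta has zero energy, two quanta at rest carry energy $2m$, and a single massless quantum with $|k| = 5$ carries energy $5\hbar$.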
$N(k) = \hat a^\dagger(k)\hat a(k)$ is interpreted as the number operator, and eqn [70] is then just the generalization of eqn [39], the expression for the energy of the harmonic oscillator in the normal ordering scheme, for an infinite number of degrees of freedom. Had we chosen the Weyl-ordering scheme, it would have resulted in (by the generalization of eqn [50]) an infinite vacuum energy. Hence, requiring the vacuum energy to vanish implies the choice of the normal ordering scheme in free field theory. In the framework of deformation quantization, this requirement leads to the choice of the normal star product for treating free scalar fields: only for this choice is the star product well defined.

Currently, in realistic physical field theories involving interacting relativistic fields we are limited to perturbative calculations. The objects of interest are products of the fields. The analog of the Moyal product of eqn [11] for systems with an infinite number of degrees of freedom is

$$\phi(x_1) \ast \phi(x_2) \ast \cdots \ast \phi(x_n) = \exp\left[\frac{1}{2}\sum_{i<j}\int d^4x\, d^4y\; \frac{\delta}{\delta\phi_i(x)}\,\Delta(x - y)\,\frac{\delta}{\delta\phi_j(y)}\right] \phi_1(x_1), \ldots, \phi_n(x_n)\Big|_{\phi_i = \phi} \qquad [71]$$

where the expressions $\delta/\delta\phi(x)$ indicate functional derivatives. Here, we have used the antisymmetric Schwinger function:

$$\Delta(x - y) = [\Phi(x), \Phi(y)] \qquad [72]$$

The Schwinger function is uniquely determined by relativistic invariance and causality from the equal-time commutator

$$\left[\Phi(x), \partial_0\Phi(y)\right]\Big|_{x_0 = y_0} = i\hbar\,\delta^{(3)}(x - y) \qquad [73]$$

which is the characterization of the canonical structure in the field theoretic framework.

The Moyal product is, however, not the suitable star product to use in this context. In relativistic quantum field theory, it is necessary to incorporate causality in the form advocated by Feynman: positive frequencies propagate forward in time, whereas negative frequencies propagate backwards in time. This property is achieved by using the Feynman propagator:

$$\Delta_F(x) = \begin{cases} \Delta^+(x) & \text{for } x_0 > 0 \\ -\Delta^-(x) & \text{for } x_0 < 0 \end{cases} \qquad [74]$$

where $\Delta^+(x)$, $\Delta^-(x)$ are the propagators for the positive and negative frequency components of the field, respectively. In operator language

$$\Delta_F(x - y) = T\big(\Phi(x)\Phi(y)\big) - N\big(\Phi(x)\Phi(y)\big) \qquad [75]$$

where $T$ indicates the time-ordered product of the fields and $N$ the normal-ordered product. Because the second term in eqn [75] is a normal-ordered product with vanishing vacuum expectation value, the Feynman propagator may be simply characterized as the vacuum expectation value of the time-ordered product of the fields. The antisymmetric part of the positive frequency propagator is the Schwinger function:

$$\Delta^+(x) - \Delta^+(-x) = \Delta^+(x) + \Delta^-(x) = \Delta(x) \qquad [76]$$

The fact that going over to a c-equivalent product leaves the antisymmetric part of the differential operator in the exponent of eqn [71] invariant suggests that the use of the positive frequency propagator instead of the Schwinger function merely involves the passage to a c-equivalent star product. This is indeed easy to verify. The time-ordered product of the operators is obtained by replacing the Schwinger function $\Delta(x - y)$ in eqn [72] by the c-equivalent positive frequency propagator $\Delta^+(x - y)$, restricting the time integration to $x_0 > y_0$, as in eqn [74], and symmetrizing the integral in the variables $x$ and $y$, which brings in the negative frequency propagator $\Delta^-(x - y)$ for times $x_0 < y_0$. Then eqn [71] becomes Wick's theorem, which is the basic tool of relativistic perturbation theory. In operator language

$$T\big(\Phi(x_1), \ldots, \Phi(x_n)\big) = \exp\left[\frac{1}{2}\int d^4x\, d^4y\; \frac{\delta}{\delta\Phi(x)}\,\Delta_F(x - y)\,\frac{\delta}{\delta\Phi(y)}\right] N\big(\Phi(x_1), \ldots, \Phi(x_n)\big) \qquad [77]$$

Another interesting relation between deformation quantization and quantum field theory has been uncovered by studies of the Poisson–Sigma model. This model involves a set of scalar fields $X^i$ which map a two-dimensional manifold $\Sigma_2$ onto a Poisson space $M$, as well as generalized gauge fields $A_i$, which are 1-forms on $\Sigma_2$ mapping to 1-forms on $M$. The action is given by

$$S_{PS} = \int_{\Sigma_2}\left(A_i\, dX^i + \alpha^{ij} A_i A_j\right) \qquad [78]$$

where $\alpha^{ij}$ is the Poisson structure of $M$. A remarkable formula was found (Cattaneo and Felder 2000):

$$(f \ast g)(x) = \int DX\, DA\; f(X(1))\, g(X(2))\, e^{iS_{PS}/\hbar} \qquad [79]$$

where $f$, $g$ are functions on $M$, $\ast$ is Kontsevich's star product (Kontsevich 1997), and the functional integration is over all fields $X$ that satisfy the boundary condition $X(\infty) = x$. Here $\Sigma_2$ is taken to be a disk in $\mathbb{R}^2$; 1, 2, and $\infty$ are three points on its circumference. By expanding the functional integral in eqn [79] according to the usual rules of perturbation theory, one finds that the coefficients of the powers of $\hbar$ reproduce the graphs and weights that characterize Kontsevich's star product. For the case in which the Poisson tensor is invertible, we can perform the Gaussian integration in eqn [79] involving the fields $A_i$. The result is

$$(f \ast g)(x) = \int DX\; f(X(1))\, g(X(2))\, \exp\left\{\frac{i}{\hbar}\int \Omega_{ij}\, dX^i\, dX^j\right\} \qquad [80]$$

Equation [80] is formally similar to eqn [16] for the Moyal product, to which the Kontsevich product reduces in the symplectic case. Here $\Omega_{ij} = (\alpha^{ij})^{-1}$ is the symplectic 2-form, and $\int \Omega_{ij}\, dX^i\, dX^j$ is the symplectic volume of the manifold $M$. To make this relationship exact, one must integrate out the gauge degrees of freedom in the functional integral in eqn [79]. Since the Poisson-sigma model represents a topological field theory there remains only a finite-dimensional integral, which coincides with the integral in eqn [80].

See also: Deformations of the Poisson Bracket on a Symplectic Manifold; Deformation Quantization and Representation Theory; Deformation Theory; Fedosov Quantization; Noncommutative Geometry from Strings; Operads; Quantum Field Theory: A Brief Introduction; Schrödinger Operators.

Further Reading

Bayen F, Flato M, Fronsdal C, Lichnerowicz A, and Sternheimer D (1978) Deformation theory and quantization I, II. Annals of Physics (NY) 111: 61–110, 111–151.
Cattaneo AS and Felder G (2000) A path integral approach to the Kontsevich quantization formula. Communications in Mathematical Physics 212: 591–611.
Dito G and Sternheimer D (2002) Deformation quantization: genesis, developments, and metamorphoses. In: Halbout G (ed.) Deformation Quantization, IRMA Lectures in Mathematical Physics, vol. I, pp. 9–54. Berlin: Walter de Gruyter.
Gerstenhaber M (1964) On the deformation of rings and algebras. Annals of Mathematics 79: 59–103.
Groenewold HJ (1946) On the principles of elementary quantum mechanics. Physica 12: 405–460.
Hirshfeld A and Henselder P (2002a) Deformation quantization in the teaching of quantum mechanics. American Journal of Physics 70(5): 537–547.
Hirshfeld A and Henselder P (2002b) Deformation quantization for systems with fermions. Annals of Physics 302: 59–77.
Hirshfeld A and Henselder P (2002c) Star products and perturbative field theory. Annals of Physics (NY) 298: 382–393.
Hirshfeld A and Henselder P (2003) Star products and quantum groups in quantum mechanics and field theory. Annals of Physics 308: 311–328.
Hirshfeld A, Henselder P, and Spernat T (2004) Cliffordization, spin, and fermionic star products. Annals of Physics (NY) 314:
Hirshfeld A, Henselder P, and Spernat T (2005) Star products and geometric algebra. Annals of Physics (NY) 317: 107–129.
Kontsevich M (1997) Deformation quantization of Poisson manifolds, q-alg/9709040.
Moyal JE (1949) Quantum mechanics as a statistical theory. Proceedings of Cambridge Philosophical Society 45: 99–124.
Zachos C (2001) Deformation quantization: quantum mechanics lives and works in phase space, hep-th/0110114.

Deformation Quantization and Representation Theory

S Waldmann, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
© 2006 Elsevier Ltd. All rights reserved.

The Quantization Problem

Though quantum theory for the classical phase space $\mathbb{R}^{2n}$ is well established by means of what usually is called canonical quantization, physics demands that one go beyond $\mathbb{R}^{2n}$. On the one hand, systems with constraints lead by phase-space reduction to classical phase spaces different from $\mathbb{R}^{2n}$; in general one ends up with a symplectic or even Poisson manifold. Thus, one needs to quantize geometrically nontrivial phase spaces. On the other hand, field theories and thermodynamical systems require one to pass from $\mathbb{R}^{2n}$ to infinitely many degrees of freedom, where one faces additional analytical difficulties. Both types of difficulties combine for gauge field theories and gravity, whence it is clear that quantization is still one of the most important issues in mathematical physics.

One possibility (among many others) is to use the structural similarity between the classical and quantum observable algebras. In both cases the observables constitute a complex $^*$-algebra: in the classical case it is commutative with the additional structure of a Poisson bracket, whereas in the quantum case the algebra is noncommutative. In deformation quantization, one tries to pass from the classical observables to the quantum observables by a deformation of the algebraic structures.

From Canonical Quantization to Star Products

Let us briefly recall canonical quantization and the ordering problem. In order to “quantize” classical

observables like the polynomials on $\mathbb{R}^{2n}$, to $q^k$, $p_l$ one assigns the operators

$$q^k \mapsto \varrho(q^k) = Q^k = \left(q \mapsto q^k\,\psi(q)\right) \qquad [1]$$

$$p_l \mapsto \varrho(p_l) = P_l = \left(q \mapsto \frac{\hbar}{i}\frac{\partial \psi}{\partial q^l}(q)\right) \qquad [2]$$

for $k, l = 1, \ldots, n$, defined on a suitable domain in $L^2(\mathbb{R}^n, d^nq)$. For simplicity, we choose $C_0^\infty(\mathbb{R}^n)$ as domain. The well-known ordering problem is encountered if one wants to also quantize higher polynomials. One convenient (although not the only) possibility is Weyl's total symmetrization rule, that is, for a monomial like $q^2p$ we take the quantization

$$\varrho_{\mathrm{Weyl}}(q^2p) = \frac{1}{3}\left(Q^2P + QPQ + PQ^2\right) = -i\hbar q^2\frac{\partial}{\partial q} - i\hbar q \qquad [3]$$

This can be written in the more explicit form:

$$\varrho_{\mathrm{Weyl}}(f) = \sum_{r=0}^{\infty} \frac{1}{r!}\left(\frac{\hbar}{i}\right)^r \left.\frac{\partial^r (Nf)}{\partial p_{i_1}\cdots\partial p_{i_r}}\right|_{p=0} \frac{\partial^r}{\partial q^{i_1}\cdots\partial q^{i_r}} \qquad [4]$$

with

$$N = \exp\left(\frac{\hbar}{2i}\Delta\right) \quad\text{and}\quad \Delta = \frac{\partial^2}{\partial q^i\,\partial p_i}$$

Using [4] one can easily extend $\varrho_{\mathrm{Weyl}}$ to all functions $f \in C^\infty(\mathbb{R}^{2n})$ which are polynomial in the momentum variables only and have an arbitrary smooth dependence on the position variables. This Poisson subalgebra of $C^\infty(\mathbb{R}^{2n})$ certainly covers all classical observables of physical interest. Denoting these observables by $\mathrm{Pol}(T^*\mathbb{R}^n)$, one obtains a linear isomorphism

$$\varrho_{\mathrm{Weyl}} : \mathrm{Pol}(T^*\mathbb{R}^n) \to \mathrm{Diffop}(\mathbb{R}^n) \qquad [5]$$

into the differential operators with smooth coefficients, called Weyl symbol calculus. Other orderings would result in a different linear isomorphism like [5]; for example, the standard ordering is obtained by simply omitting the operator $N$ in [4].

Using [5], one can pull back the operator product of $\mathrm{Diffop}(\mathbb{R}^n)$ to obtain a new product $\star_{\mathrm{Weyl}}$ for $\mathrm{Pol}(T^*\mathbb{R}^n)$, that is

$$f \star_{\mathrm{Weyl}} g = \varrho_{\mathrm{Weyl}}^{-1}\left(\varrho_{\mathrm{Weyl}}(f)\,\varrho_{\mathrm{Weyl}}(g)\right) \qquad [6]$$

which is called the Weyl–Moyal star product. Explicitly, one has

$$f \star_{\mathrm{Weyl}} g = \mu \circ \exp\left(\frac{i\hbar}{2}\left(\frac{\partial}{\partial q^k}\otimes\frac{\partial}{\partial p_k} - \frac{\partial}{\partial p_k}\otimes\frac{\partial}{\partial q^k}\right)\right) f \otimes g \qquad [7]$$

where $\mu(f \otimes g) = fg$ is the commutative product. Clearly, for $f, g \in \mathrm{Pol}(T^*\mathbb{R}^n)$ the exponential series terminates after finitely many terms. If one now wants to extend further to all smooth functions, then [7] is only a formal power series in $\hbar$. Since on a manifold one does not have a priori a nice distinguished class of functions like $\mathrm{Pol}(T^*\mathbb{R}^n)$, one indeed has to generalize in this direction if a geometric framework is desired. This observation and the simple fact that $\star_{\mathrm{Weyl}}$ satisfies all the following properties lead to the definition of a formal star product by Bayen et al. (1978):

Definition 1 A formal star product on a Poisson manifold $(M, \pi)$ is an associative $\mathbb{C}[[\lambda]]$-bilinear multiplication

$$f \star g = \sum_{r=0}^{\infty} \lambda^r C_r(f, g) \qquad [8]$$

for $f, g \in C^\infty(M)[[\lambda]]$ such that
1. $C_0(f, g) = fg$ and $C_1(f, g) - C_1(g, f) = i\{f, g\}$,
2. $1 \star f = f = f \star 1$, and
3. $C_r$ is a bidifferential operator.
If in addition $\overline{f \star g} = \bar{g} \star \bar{f}$, then $\star$ is called Hermitian.

Clearly, $\star_{\mathrm{Weyl}}$ defines a Hermitian star product for $\mathbb{R}^{2n}$. The first condition is called the correspondence principle in deformation quantization and the formal parameter $\lambda$ corresponds to Planck's constant $\hbar$ once a convergence scheme is established.

If $S = \mathrm{id} + \sum_{r=1}^{\infty} \lambda^r S_r$ is a formal series of differential operators with $S_r 1 = 0$ for $r \ge 1$, then it is easy to see that

$$f \star' g = S^{-1}(Sf \star Sg) \qquad [9]$$

defines again a star product which is Hermitian if $\star$ is Hermitian and if in addition $S\bar{f} = \overline{Sf}$. In particular, the operator $N$, as before, serves for the transition from $\star_{\mathrm{Weyl}}$ to the standard-ordered star product $\star_{\mathrm{Std}}$ obtained the same way from the standard-ordered quantization. Thus, [9] can be seen as the abstract notion of changing the ordering prescription, even if no operator representation has been specified. Two star products related by such an equivalence transformation are called equivalent and $^*$-equivalent in the Hermitian case.

One main advantage of formal deformation quantization is that one has very strong existence and classification results:

Theorem 2 On every Poisson manifold there exists a star product.

The above theorem was first shown by deWilde and Lecomte (1983) for the symplectic case and


independently by Fedosov (1985) and Omori, Maeda, and Yoshioka (1991). In 1997, Kontsevich was able to prove the general Poisson case by showing his profound formality theorem. The full classification of star products up to equivalence was first obtained for the symplectic case by Nest and Tsygan (1995) and independently by Deligne (1995), Bertelson, Cahen, and Gutt (1997), and Weinstein and Xu (1997). The general Poisson case again follows from Kontsevich's formality. In particular, in the symplectic case, star products are classified by their characteristic class

$$c : \star \mapsto c(\star) \in \frac{[\omega]}{i\lambda} + H^2_{\mathrm{deRham}}(M, \mathbb{C})[[\lambda]] \qquad [10]$$

As conclusion one can state that for the price of formal power series in $\hbar$ one obtains in formal deformation quantization a very general and well-understood picture of the observable algebra for the quantum version of any classical system described kinematically by a Poisson manifold. It turns out that already in this framework one can discuss dynamics as well by use of a Heisenberg equation formulated with $\star$. Moreover, the quantization of symmetries described by Hamiltonian Lie group or Lie algebra actions has been extensively studied.

For a physical theory of quantization, however, there are still at least two ingredients missing. On the one hand, one has to overcome the formal power series expansion in $\hbar$. This problem is, in principle, on the same footing as any perturbative approach to quantum theory and thus no easy answer can be expected to hold in general. In particular examples, however, such as the Weyl–Moyal star product, it can easily be solved. These issues together with the corresponding questions about a spectral calculus are best studied in the framework of Rieffel's strict deformation quantization based on a more $C^*$-algebraic formulation of the deformation problem. On the other hand, the observable algebra is not enough to describe a quantum system: one also needs to have a notion for the states. It turns out that already in the formal framework one has a physically reasonable notion of states as discussed by Bordemann and Waldmann (1998).

States and Representations

The notion of states in deformation quantization is adapted from the $C^*$-algebraic world and based on the notion of positive functionals. Recall that for a $^*$-algebra $\mathcal{A}$ over $\mathbb{C}$ a linear functional $\omega : \mathcal{A} \to \mathbb{C}$ is called positive if $\omega(a^*a) \ge 0$. For formal deformation quantization, things are slightly more subtle as now one has to consider $\mathbb{C}[[\lambda]]$-linear functionals

$$\omega : (C^\infty(M)[[\lambda]], \star) \to \mathbb{C}[[\lambda]] \qquad [11]$$

where $\star$ is assumed to be a Hermitian star product in the following. Then the positivity is understood in the sense of formal power series where $a \in \mathbb{R}[[\lambda]]$ is called positive if $a = \sum_{r=r_0}^{\infty} \lambda^r a_r$ with $a_{r_0} > 0$. Thus, we can make sense out of the following

Definition 3 Let $\star$ be a Hermitian star product on $M$. A $\mathbb{C}[[\lambda]]$-linear functional $\omega : C^\infty(M)[[\lambda]] \to \mathbb{C}[[\lambda]]$ is called positive with respect to $\star$ if

$$\omega(\bar{f} \star f) \ge 0 \qquad [12]$$

and it is called a state if, in addition, $\omega(1) = 1$.

In fact, $\omega(f)$ is interpreted as the expectation value of the observable $f$ in the state $\omega$. The positivity [12] ensures that the usual uncertainty relations between expectation values hold.

Sometimes it is convenient to consider positive functionals only defined on a (proper) $^*$-ideal in $C^\infty(M)[[\lambda]]$, for instance, $C_0^\infty(M)[[\lambda]]$.

Since in some situations one wants more general formal series than just power series, it is convenient to embed the above definition of states into a larger and more algebraic context: consider an ordered ring $\mathsf{R}$, that is, a commutative, associative, unital ring $\mathsf{R}$ together with a distinguished subset $\mathsf{P} \subset \mathsf{R}$ (the positive elements) such that $\mathsf{R}$ is the disjoint union $-\mathsf{P} \,\dot\cup\, \{0\} \,\dot\cup\, \mathsf{P}$, and we have $\mathsf{P} \cdot \mathsf{P} \subseteq \mathsf{P}$ and $\mathsf{P} + \mathsf{P} \subseteq \mathsf{P}$. Then $\mathsf{C} = \mathsf{R}(i)$ denotes the ring extension by a square root $i$ of $-1$, and one considers $^*$-algebras $\mathcal{A}$ over $\mathsf{C}$. Clearly, this generalizes the cases $\mathsf{R} = \mathbb{R}$, where $\mathsf{C} = \mathbb{C}$, as well as $\mathsf{R} = \mathbb{R}[[\lambda]]$, where $\mathsf{C} = \mathbb{C}[[\lambda]]$. In this way, one provides a framework where $C^*$-algebras, $^*$-algebras over $\mathbb{C}$, and formal Hermitian star products can be treated on the same footing. It is clear that the definition of a positive functional immediately extends to $\omega : \mathcal{A} \to \mathsf{C}$ for such a ring $\mathsf{C}$.

Example 4
(i) For the Wick star product on $\mathbb{R}^{2n} \cong \mathbb{C}^n$, defined by

$$f \star_{\mathrm{Wick}} g = \sum_{r=0}^{\infty} \frac{(2\lambda)^r}{r!} \frac{\partial^r f}{\partial z^{i_1}\cdots\partial z^{i_r}} \frac{\partial^r g}{\partial \bar{z}^{i_1}\cdots\partial \bar{z}^{i_r}} \qquad [13]$$

the $\delta$-functional $\delta : f \mapsto f(0)$ is positive. Note, however, that $\delta$ is not positive for $\star_{\mathrm{Weyl}}$.
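Item (i) can be illustrated by hand for $n = 1$ with the single test function $f = z$. The sketch below assumes the complex coordinate $z = q + ip$ (an assumption, chosen so that $i\{z, \bar{z}\} = 2$ matches the factor $2\lambda$ in [13]) and uses the Weyl–Moyal product [7] with $\hbar$ replaced by $\lambda$. Because $z$ and $\bar{z}$ are linear, all terms beyond first order in $\lambda$ vanish.

```latex
\begin{align*}
\bar{z} \star_{\mathrm{Wick}} z &= \bar{z} z,
  & \delta(\bar{z} \star_{\mathrm{Wick}} z) &= 0 \ge 0, \\
z \star_{\mathrm{Wick}} \bar{z} &= z\bar{z} + 2\lambda,
  & \delta(z \star_{\mathrm{Wick}} \bar{z}) &= 2\lambda > 0, \\
\bar{z} \star_{\mathrm{Weyl}} z &= \bar{z} z + \frac{i\lambda}{2}\{\bar{z}, z\} = \bar{z} z - \lambda,
  & \delta(\bar{z} \star_{\mathrm{Weyl}} z) &= -\lambda < 0 .
\end{align*}
```

The last line already exhibits a function with $\delta(\bar{f} \star_{\mathrm{Weyl}} f) < 0$ in the ordering of $\mathbb{R}[[\lambda]]$, whereas the first two lines are merely consistent with (and do not prove) positivity for $\star_{\mathrm{Wick}}$.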

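(ii) For the Weyl–Moyal star product $\star_{\mathrm{Weyl}}$ the Schrödinger functional

$$\omega(f) = \int_{\mathbb{R}^n} f(q, p = 0)\, d^nq \qquad [14]$$

defined on the $^*$-ideal $C_0^\infty(\mathbb{R}^{2n})[[\lambda]]$, is positive.

(iii) For any connected symplectic manifold $(M, \omega)$ and any Hermitian star product $\star$, there exists a unique normalized trace functional

$$\mathrm{tr} : C_0^\infty(M)[[\lambda]] \to \mathbb{C}[[\lambda]], \qquad \mathrm{tr}(f \star g) = \mathrm{tr}(g \star f) \qquad [15]$$

with zeroth order equal to the integration over $M$ with respect to the Liouville measure $\Omega = \omega^n$. Then this trace is positive as well, $\mathrm{tr}(\bar{f} \star f) \ge 0$.

Having a notion for states as expectation-value functionals is still not enough to formulate quantum theory. One main feature of quantum states, the superposition principle, is not yet implemented. In particular, forming convex combinations like $\omega = c_1\omega_1 + c_2\omega_2$, with $c_1, c_2 \ge 0$ and $c_1 + c_2 = 1$, does not give a superposition of $\omega_1$ and $\omega_2$ but a mixed state. Hence, one needs an additional linear structure on the states whence we look for a $^*$-representation $\pi$ of the observable algebra $\mathcal{A}$ on a pre-Hilbert space $\mathcal{H}$ over $\mathsf{C}$ such that the states $\omega_1, \omega_2$ can be written as vector states $\omega_i(a) = \langle \phi_i, \pi(a)\phi_i \rangle$ for some unit vectors $\phi_1, \phi_2 \in \mathcal{H}$. Then one can build superpositions of the vectors $\phi_1, \phi_2$ in the usual way. While this is the well-known argument in any quantum theory based on the observable algebras, for deformation quantization one first has to make sense out of the above notions, since now $\mathsf{R} = \mathbb{R}[[\lambda]]$ is only an ordered ring. This can actually be done in a consistent way as demonstrated and exemplified by Bordemann, Bursztyn, Waldmann, and others.

We recall the basic results: A pre-Hilbert space $\mathcal{H}$ over $\mathsf{C}$ is a $\mathsf{C}$-module with a $\mathsf{C}$-sesquilinear inner product $\langle \cdot, \cdot \rangle : \mathcal{H} \times \mathcal{H} \to \mathsf{C}$ such that $\langle \phi, \psi \rangle = \overline{\langle \psi, \phi \rangle}$ and $\langle \phi, \phi \rangle > 0$ for $\phi \ne 0$. This makes sense since $\mathsf{R}$ is ordered. An operator $A : \mathcal{H}_1 \to \mathcal{H}_2$ is called adjointable if there exists an operator $A^* : \mathcal{H}_2 \to \mathcal{H}_1$ such that $\langle A\phi, \psi \rangle_2 = \langle \phi, A^*\psi \rangle_1$ for all $\phi \in \mathcal{H}_1$, $\psi \in \mathcal{H}_2$. The set of adjointable operators is denoted by $\mathfrak{B}(\mathcal{H}_1, \mathcal{H}_2)$, and $\mathfrak{B}(\mathcal{H}) = \mathfrak{B}(\mathcal{H}, \mathcal{H})$ turns out to be a $^*$-algebra over $\mathsf{C}$. This allows one to define a $^*$-representation $\pi$ of $\mathcal{A}$ on $\mathcal{H}$ to be a $^*$-homomorphism $\pi : \mathcal{A} \to \mathfrak{B}(\mathcal{H})$. An intertwiner $T$ between two $^*$-representations $(\mathcal{H}_1, \pi_1)$ and $(\mathcal{H}_2, \pi_2)$ is an operator $T \in \mathfrak{B}(\mathcal{H}_1, \mathcal{H}_2)$ with $T\pi_1(a) = \pi_2(a)T$ for all $a \in \mathcal{A}$. This defines the category $^*\text{-Rep}(\mathcal{A})$ of $^*$-representations of $\mathcal{A}$.

Let us now recall that a positive linear functional $\omega$ can be written as an expectation value for a vector state in some representation. This is the well-known Gelfand–Naimark–Segal (GNS) construction from operator algebra theory which can be transferred to this purely algebraic context (Bordemann and Waldmann 1998). First recall that any positive linear functional $\omega : \mathcal{A} \to \mathsf{C}$ satisfies the Cauchy–Schwarz inequality

$$\omega(a^*b)\,\overline{\omega(a^*b)} \le \omega(a^*a)\,\omega(b^*b) \qquad [16]$$

and $\omega(a^*b) = \overline{\omega(b^*a)}$. If $\mathcal{A}$ is unital, which will always be assumed for simplicity, then $\omega(a^*) = \overline{\omega(a)}$ follows.

$$\mathcal{J}_\omega = \{a \in \mathcal{A} \mid \omega(a^*a) = 0\} \qquad [17]$$

is a left ideal in $\mathcal{A}$, the so-called Gel'fand ideal, and hence $\mathcal{H}_\omega = \mathcal{A}/\mathcal{J}_\omega$ is a left $\mathcal{A}$-module with module structure denoted by $\pi_\omega(a)\psi_b = \psi_{ab}$, where $\psi_b \in \mathcal{H}_\omega$ denotes the equivalence class of $b \in \mathcal{A}$. Finally, $\langle \psi_b, \psi_c \rangle = \omega(b^*c)$ turns $\mathcal{H}_\omega$ into a pre-Hilbert space and $\pi_\omega$ becomes a $^*$-representation, the GNS representation with respect to $\omega$. Moreover, $\psi_1 \in \mathcal{H}_\omega$ is a cyclic vector, $\psi_b = \pi_\omega(b)\psi_1$, with the property

$$\omega(a) = \langle \psi_1, \pi_\omega(a)\psi_1 \rangle \qquad [18]$$

These properties characterize the GNS representation $(\mathcal{H}_\omega, \pi_\omega, \psi_1)$ up to unitary equivalence.

Example 5 We can now apply this construction to the three basic examples and obtain the following well-known representations as GNS representations:

(i) The GNS representation corresponding to the $\delta$-functional and the Wick star product is (unitarily equivalent to) the formal Bargmann–Fock representation. Here $\mathcal{H} = \mathbb{C}[[\bar{y}^1, \ldots, \bar{y}^n]][[\lambda]]$ with inner product

$$\langle \phi, \psi \rangle = \sum_{r=0}^{\infty} \frac{(2\lambda)^r}{r!}\, \overline{\frac{\partial^r \phi}{\partial \bar{y}^{i_1}\cdots\partial \bar{y}^{i_r}}(0)}\; \frac{\partial^r \psi}{\partial \bar{y}^{i_1}\cdots\partial \bar{y}^{i_r}}(0) \qquad [19]$$

and $\pi_\delta$ is explicitly given by

$$\pi_\delta(f) = \sum_{r,s=0}^{\infty} \frac{(2\lambda)^r}{r!\,s!}\, \frac{\partial^{r+s} f}{\partial z^{i_1}\cdots\partial z^{i_r}\,\partial \bar{z}^{j_1}\cdots\partial \bar{z}^{j_s}}(0)\; \bar{y}^{j_1}\cdots\bar{y}^{j_s}\, \frac{\partial^r}{\partial \bar{y}^{i_1}\cdots\partial \bar{y}^{i_r}} \qquad [20]$$

In particular, $\pi_\delta(z^i) = 2\lambda\,\partial/\partial \bar{y}^i$ and $\pi_\delta(\bar{z}^i) = \bar{y}^i$ are the annihilation and creation operators and [20] gives the Wick (or normal) ordering. This basic example has been extended to arbitrary Kähler manifolds by Bordemann and Waldmann (1998).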
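The formulas of item (i) can be checked by hand for $n = 1$. The following sketch writes $\bar{y}$ for the single formal variable and uses only [13], [19], and [20]; it verifies that $\pi_\delta$ respects the Wick product on $z$ and $\bar{z}$ and that $\pi_\delta(z)$ is the adjoint of $\pi_\delta(\bar{z})$ on monomials. It is an illustration under these restrictions, not a proof of the general statement.

```latex
\begin{align*}
z \star_{\mathrm{Wick}} \bar{z} &= z\bar{z} + 2\lambda, \qquad
\bar{z} \star_{\mathrm{Wick}} z = \bar{z} z, \\
\pi_\delta(z)\,\pi_\delta(\bar{z})
  &= 2\lambda \frac{\partial}{\partial \bar{y}}\, \bar{y}
   = 2\lambda\, \bar{y} \frac{\partial}{\partial \bar{y}} + 2\lambda
   = \pi_\delta(z\bar{z} + 2\lambda), \\
[\pi_\delta(z), \pi_\delta(\bar{z})]
  &= 2\lambda
   = \pi_\delta\!\left(z \star_{\mathrm{Wick}} \bar{z} - \bar{z} \star_{\mathrm{Wick}} z\right), \\
\langle \bar{y}^m, \bar{y}^k \rangle &= \delta_{mk}\, (2\lambda)^m\, m!, \\
\langle \pi_\delta(\bar{z})\, \bar{y}^m,\ \bar{y}^{m+1} \rangle
  &= (2\lambda)^{m+1} (m+1)!
   = \langle \bar{y}^m,\ \pi_\delta(z)\, \bar{y}^{m+1} \rangle .
\end{align*}
```

The commutator in the third line is the formal counterpart of the canonical commutation relation, with $2\lambda$ in the role of the quantum parameter.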

(ii) The Weyl–Moyal star product $\star_{\mathrm{Weyl}}$ and the Schrödinger functional $\omega$ as in [14] give the usual Schrödinger representation as GNS representation. We obtain $\mathcal{H}_\omega = C_0^\infty(\mathbb{R}^n)[[\lambda]]$ with inner product

$$\langle \phi, \psi \rangle = \int_{\mathbb{R}^n} \overline{\phi(q)}\, \psi(q)\, d^nq \qquad [21]$$

and $\pi_\omega(f) = \varrho_{\mathrm{Weyl}}(f)$ as in [4] with $\hbar$ replaced by $\lambda$. The Schrödinger representation as a particular case of a GNS representation has been generalized to arbitrary cotangent bundles including representations on sections of line bundles over the configuration space (Dirac's representation for magnetic monopoles) by Bordemann, Neumaier, Pflaum, and Waldmann (1999, 2003). In this context, the WKB expansion can also be formulated.

(iii) For the positive trace tr, the GNS pre-Hilbert space is simply the space $\mathcal{H}_{\mathrm{tr}} = C_0^\infty(M)[[\lambda]]$ with inner product $\langle f, g \rangle = \mathrm{tr}(\bar{f} \star g)$. The corresponding GNS representation is the left regular representation $\pi_{\mathrm{tr}}(f)g = f \star g$. Note that in this case the commutant of the representation is (anti-)isomorphic to the observable algebra and given by all the right multiplications. Thus, $\pi_{\mathrm{tr}}$ is highly reducible and the size of the commutant indicates a “thermodynamical” interpretation of this representation. Indeed, one can take this GNS representation, and more generally the one for arbitrary KMS functionals, as a starting point of a preliminary version of a Tomita–Takesaki theory for deformation quantization as shown by Waldmann (1999).

After these fundamental examples, we now reconsider the question of superpositions: in general, two (pure) states $\omega_1, \omega_2$ cannot be realized as vector states inside a single irreducible representation. One encounters superselection rules. Usually, for instance, in algebraic quantum field theory, the existence of superselection rules indicates the presence of charges. In particular, it is not sufficient to consider one single representation of the observable algebra $\mathcal{A}$. Instead, one has to investigate (as well as possible) all superselection sectors of the representation theory $^*\text{-Rep}(\mathcal{A})$ of $\mathcal{A}$ and find physically motivated criteria to select distinguished representations. In usual quantum mechanics on $\mathbb{R}^{2n}$, this turns out to be rather simple, thanks to the (nontrivial) uniqueness theorem of von Neumann: one has a unique irreducible representation of the Weyl algebra up to unitary equivalence. In infinite dimensions or in topologically nontrivial situations, however, von Neumann's theorem does not apply and one indeed has superselection rules.

In deformation quantization, some parts of these superselection rules have been understood well: again, for cotangent bundles $T^*Q$, one can classify the unitary equivalence classes of Schrödinger-like representations on $C_0^\infty(Q)[[\lambda]]$ by topological classes of nontrivial vector potentials. Thus, one arrives at the interpretation of the Aharonov–Bohm effect as superselection rule where the classification is essentially given by $H^1_{\mathrm{deRham}}(Q, \mathbb{C}) / 2\pi i H^1_{\mathrm{deRham}}(Q, \mathbb{Z})$.

General Representation Theory

Although it is very much desirable to determine the structure and the superselection sectors in $^*\text{-Rep}(\mathcal{A})$ completely, this is only achievable in the very simplest examples. Moreover, for formal star products, many artifacts due to the purely algebraic nature have to be expected: the Bargmann–Fock and Schrödinger representations in Example 5 are unitarily inequivalent and thus define a superselection rule; even the pre-Hilbert spaces are nonisomorphic. However, these artifacts vanish immediately when one imposes the suitable convergence conditions together with appropriate topological completions (von Neumann's theorem). Given such problems, it is very difficult to find “hard” superselection rules which indeed have physical significance already at the formal level. Nevertheless, the example of the Aharonov–Bohm effect shows that this is possible.

In any case, new techniques for investigating $^*\text{-Rep}(\mathcal{A})$ have to be developed. It turns out that comparing $^*\text{-Rep}(\mathcal{A})$ with some other $^*\text{-Rep}(\mathcal{B})$ is much simpler but still gives some nontrivial insight in the structure of the representation theory. Here the Morita theory provides a highly sophisticated tool.

The classical notion of Morita equivalence as well as Rieffel's more specialized strong Morita equivalence for $C^*$-algebras have been transferred to deformation quantization and, more generally, to $^*$-algebras $\mathcal{A}$ over $\mathsf{C} = \mathsf{R}(i)$ by Bursztyn and Waldmann (2001). The aim is to construct functors

$$F : {}^*\text{-Rep}(\mathcal{A}) \to {}^*\text{-Rep}(\mathcal{B}) \qquad [22]$$

which allow us to compare these categories and determine whether they are equivalent. But even if they are not equivalent, functors such as [22] are interesting. As an example, one considers the situation of classical phase space reduction $M \rightsquigarrow M_{\mathrm{red}}$ as it is present in every constraint system or gauge theory. Suppose one succeeded with the (highly nontrivial) problem of quantizing both classical phase spaces in a reasonable way whence one has quantum observable algebras $\mathcal{A}$ and $\mathcal{A}_{\mathrm{red}}$. Then, of course, a relation between $^*\text{-Rep}(\mathcal{A})$ and $^*\text{-Rep}(\mathcal{A}_{\mathrm{red}})$ is of

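particular physical interest although one cannot expect both representation theories to be equivalent: $\mathcal{A}$ contains additional but physically irrelevant structure leading to possibly “more” representations.

To get a clear picture of the Morita theory, one has to extend the notion of $^*$-representations to the following framework: for an auxiliary $^*$-algebra $\mathcal{D}$ over $\mathsf{C}$, one defines a pre-Hilbert right $\mathcal{D}$-module to be a right $\mathcal{D}$-module $\mathcal{H}$ together with a $\mathsf{C}$-sesquilinear $\mathcal{D}$-valued inner product $\langle \cdot, \cdot \rangle : \mathcal{H} \times \mathcal{H} \to \mathcal{D}$ such that $\langle \phi, \psi \rangle = \langle \psi, \phi \rangle^*$ and $\langle \phi, \psi \cdot d \rangle = \langle \phi, \psi \rangle d$ for $d \in \mathcal{D}$ and such that $\langle \cdot, \cdot \rangle$ is completely positive. This means $(\langle \phi_i, \phi_j \rangle) \in M_n(\mathcal{D})^+$ for all $\phi_1, \ldots, \phi_n$, where, in general, an algebra element $a \in \mathcal{A}$ is called positive, $a \in \mathcal{A}^+$, if $\omega(a) \ge 0$ for all positive linear functionals $\omega : \mathcal{A} \to \mathsf{C}$.

Then one defines $\mathfrak{B}(\mathcal{H})$ analogously as for pre-Hilbert spaces leading to a definition of a $^*$-representation $\pi$ of $\mathcal{A}$ on a pre-Hilbert right $\mathcal{D}$-module $\mathcal{H}$. The corresponding category of $^*$-representations is denoted by $^*\text{-Rep}_{\mathcal{D}}(\mathcal{A})$. Clearly, elements in $^*\text{-Rep}_{\mathcal{D}}(\mathcal{A})$ are in particular $(\mathcal{A}, \mathcal{D})$-bimodules.

The advantage is that now one has a tensor product $\widehat{\otimes}$ taking care of the inner products as well. For $^*$-algebras $\mathcal{A}, \mathcal{B}, \mathcal{C}$, one has a functor

$$\widehat{\otimes} : {}^*\text{-Rep}_{\mathcal{B}}(\mathcal{C}) \times {}^*\text{-Rep}_{\mathcal{A}}(\mathcal{B}) \to {}^*\text{-Rep}_{\mathcal{A}}(\mathcal{C}) \qquad [23]$$

which, on objects, is essentially given by $\otimes_{\mathcal{B}}$. In fact, for $\mathcal{F} \in {}^*\text{-Rep}_{\mathcal{B}}(\mathcal{C})$ and $\mathcal{E} \in {}^*\text{-Rep}_{\mathcal{A}}(\mathcal{B})$, one defines on the $(\mathcal{C}, \mathcal{A})$-bimodule $\mathcal{F} \otimes_{\mathcal{B}} \mathcal{E}$ an $\mathcal{A}$-valued inner product by $\langle x \otimes \phi, y \otimes \psi \rangle = \langle \phi, \langle x, y \rangle \cdot \psi \rangle$, which turns out to be well defined and completely positive again. Then $\mathcal{F} \,\widehat{\otimes}\, \mathcal{E}$ is $\mathcal{F} \otimes_{\mathcal{B}} \mathcal{E}$ equipped with this inner product modulo its possibly nonempty degeneracy space.

By fixing one of the arguments of $\widehat{\otimes}$, one obtains the functor of Rieffel induction

$$R_{\mathcal{E}} : {}^*\text{-Rep}_{\mathcal{D}}(\mathcal{A}) \to {}^*\text{-Rep}_{\mathcal{D}}(\mathcal{B}) \qquad [24]$$

where $\mathcal{E} \in {}^*\text{-Rep}_{\mathcal{A}}(\mathcal{B})$ is fixed and $R_{\mathcal{E}}(\mathcal{H}) = \mathcal{E} \,\widehat{\otimes}\, \mathcal{H}$ for $\mathcal{H} \in {}^*\text{-Rep}_{\mathcal{D}}(\mathcal{A})$.

The idea of strong Morita equivalence is then to search for such bimodules $\mathcal{E}$ where $R_{\mathcal{E}}$ gives an equivalence of categories. In detail, this is accomplished by the following definition, where, for simplicity, only unital $^*$-algebras are considered.

Definition 6 A $(\mathcal{B}, \mathcal{A})$-bimodule $\mathcal{E}$ is called a strong Morita equivalence bimodule if it is equipped with completely positive inner products $\langle \cdot, \cdot \rangle_{\mathcal{A}}$ and $\langle \cdot, \cdot \rangle_{\mathcal{B}}$ such that both inner products are full, in the sense that

$$\mathsf{C}\text{-span}\{\langle x, y \rangle_{\mathcal{A}} \mid x, y \in \mathcal{E}\} = \mathcal{A} \qquad [25]$$

and analogously for $\langle \cdot, \cdot \rangle_{\mathcal{B}}$, and compatible, in the sense that

$$\langle b \cdot x, y \rangle_{\mathcal{A}} = \langle x, b^* \cdot y \rangle_{\mathcal{A}}, \qquad \langle x \cdot a, y \rangle_{\mathcal{B}} = \langle x, y \cdot a^* \rangle_{\mathcal{B}} \qquad [26]$$

$$\langle x, y \rangle_{\mathcal{B}} \cdot z = x \cdot \langle y, z \rangle_{\mathcal{A}} \qquad [27]$$

In this case, $\mathcal{A}$ and $\mathcal{B}$ are called strongly Morita equivalent.

It turns out that this is indeed an equivalence relation and that strong Morita equivalence implies the equivalence of the representation theories:

Theorem 7 For unital $^*$-algebras over $\mathsf{C}$, strong Morita equivalence is an equivalence relation.

Theorem 8 If $\mathcal{E}$ is a strong Morita equivalence bimodule, then $R_{\mathcal{E}}$ as in [24] is an equivalence of categories.

Example 9 The fundamental example in Morita theory is that a unital $^*$-algebra $\mathcal{A}$ is strongly Morita equivalent to the matrices $M_n(\mathcal{A})$ via the $(M_n(\mathcal{A}), \mathcal{A})$-bimodule $\mathcal{A}^n$ where the inner product is $\langle x, y \rangle_{\mathcal{A}} = \sum_{i=1}^{n} x_i^* y_i$ and $\langle \cdot, \cdot \rangle_{M_n(\mathcal{A})}$ is uniquely determined by the compatibility condition [27].

An efficient way to encode the whole Morita theory of unital $^*$-algebras over $\mathsf{C}$ is to collect all strong Morita equivalence bimodules modulo isometric isomorphisms of bimodules. Then the tensor product $\widehat{\otimes}$ makes this into a “large” groupoid whose units are the $^*$-algebras themselves. This so-called Picard groupoid Pic then encodes everything one can say about strong Morita equivalence. In particular, the orbits of this groupoid are precisely the strong Morita equivalence classes of $^*$-algebras. The isotropy groups are the Picard groups $\mathrm{Pic}(\mathcal{A})$, which generalize the (outer) automorphism groups.

Strong Morita Equivalence of Star Products

This section considers star products from the viewpoint of the Morita equivalence. Here one can show that for $\mathcal{A} = (C^\infty(M)[[\lambda]], \star)$, the possible candidates of equivalence bimodules are formal power series of sections $\Gamma^\infty(E)[[\lambda]]$ of vector bundles $E \to M$. This follows as, on the one hand, strong Morita equivalence is compatible with the classical limit $\lambda = 0$ in the sense that it implies strong Morita equivalence of the classical limits. On the other hand, any (classical or quantum) equivalence bimodule is finitely generated and projective as right $\mathcal{A}$-module. Thus, by the Serre–Swan theorem one obtains the sections of a vector bundle in the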

classical limit. Now one can show that every vector bundle can uniquely (up to equivalence) be deformed such that $\Gamma^\infty(E)[[\lambda]]$ becomes a right $\mathcal{A}$-module. Thus, the only thing to be computed is which deformation $\star'$ is induced by this deformation of $E$ for the endomorphisms $\Gamma^\infty(\mathrm{End}(E))[[\lambda]]$, since one can show that then the result will always be a strong Morita equivalence bimodule. The inner products come from deformations of a Hermitian fiber metric on $E$.

Since every vector bundle $E \to M$ can be deformed in this manner in an essentially unique way, we arrive at a general global construction of a noncommutative field theory where the fields are sections of $E$ endowed with a deformed bimodule structure. In the case where $M$ is even a symplectic manifold, a simple extension of Fedosov's construction of a star product $\star$ gives a rather explicit formula for the deformed bimodule structure of $\Gamma^\infty(E)[[\lambda]]$ including a construction of the deformation $(\Gamma^\infty(\mathrm{End}(E))[[\lambda]], \star')$ which acts from the left. As usual in Fedosov's approach, the construction depends functorially on the choice of a connection $\nabla^E$ for $E$.

Returning to the question of strong Morita equivalence of star products, we see that the vector bundle $E$ has to be a line bundle $L$ since only in this case we have $\Gamma^\infty(\mathrm{End}(E)) \cong C^\infty(M)$. Since the deformation of the Hermitian fiber metric is always possible and since two equivalent Hermitian star products are always $^*$-equivalent, one can show that strong Morita equivalence is already implied by ring-theoretic Morita equivalence (the converse is true in general).

Theorem 10 Star products are strongly Morita equivalent if and only if they are Morita equivalent.

An analogous statement holds for $C^*$-algebras, known as Beer's theorem (1982).

In the symplectic case, the characteristic class $c(\star')$ of the induced star product $\star'$ can be computed explicitly leading to the following classification by Bursztyn and Waldmann (2002):

Theorem 11 Let $\star$, $\star'$ be star products on a symplectic manifold $M$. Then $\star'$ is (strongly) Morita equivalent to $\star$ if and only if there exists a symplectomorphism $\psi$ such that

$$\psi^* c(\star') - c(\star) \in 2\pi i H^2_{\mathrm{deRham}}(M, \mathbb{Z}) \qquad [28]$$

A similar result in the general Poisson case was given by Jurčo, Schupp, and Wess (2002) based on Kontsevich's formality theorem. This approach is motivated by a careful investigation of noncommutative (scalar) field theories.

Finally, it is worth mentioning that [28] has a very simple physical interpretation. Consider again a cotangent bundle $T^*Q$ with a topologically nontrivial configuration space $Q$, for example, $\mathbb{R}^3 \setminus \{0\}$. Then there is a canonical Weyl-type star product $\star_{\mathrm{Weyl}}$ depending on the choice of a connection $\nabla$ and an integration density $\mu > 0$, generalizing [7] to a curved situation. Now let $B$ be a magnetic field, modeled as a closed 2-form on $Q$. Minimal coupling leads to a new star product $\star^B_{\mathrm{Weyl}}$ describing an electrically charged particle moving in $Q$ in the external field $B$. Then the two star products $\star_{\mathrm{Weyl}}$ and $\star^B_{\mathrm{Weyl}}$ are (strongly) Morita equivalent if and only if the magnetic field satisfies Dirac's integrality condition for the (possibly nontrivial) magnetic charges described by $B$. Thus, Dirac's condition is responsible for the very strong statement that the quantizations with and without magnetic field are Morita equivalent. In particular, the $^*$-representation theories of $\star_{\mathrm{Weyl}}$ and $\star^B_{\mathrm{Weyl}}$ are equivalent. Even more specifically, using $B$ to construct a line bundle $L \to Q$ one obtains the result that Dirac's $^*$-representation of $\star^B_{\mathrm{Weyl}}$ on $\Gamma_0^\infty(L)[[\lambda]]$ is precisely the Rieffel induction of the Schrödinger representation of $\star_{\mathrm{Weyl}}$ on $C_0^\infty(Q)[[\lambda]]$.

See also: Aharonov–Bohm Effect; Algebraic Approach to Quantum Field Theory; Deformation Quantization; Deformation Theory; Deformations of the Poisson Bracket on a Symplectic Manifold; Fedosov Quantization.

Further Reading

Bayen F, Flato M, Frønsdal C, Lichnerowicz A, and Sternheimer D (1978) Deformation theory and quantization. Annals of Physics 111: 61–151.
Bertelson M, Cahen M, and Gutt S (1997) Equivalence of star products. Class. Quant. Grav. 14: A93–A107.
Bieliavsky P, Dito G, Maeda Y, and Waldmann S (2002) The deformation quantization homepage. http://idefix.physik.uni- star/en/index.html (regularly updated online bibliography and short introductory articles).
Bordemann M and Waldmann S (1998) Formal GNS construction and states in deformation quantization. Communications in Mathematical Physics 195: 549–583.
Bordemann M, Neumaier N, Pflaum MJ, and Waldmann S (2003) On representations of star product algebras over cotangent spaces on Hermitian line bundles. Journal of Functional Analysis 199: 1–47.
Bursztyn H and Waldmann S (2001) Algebraic Rieffel induction, formal Morita equivalence and applications to deformation quantization. Journal of Geometry and Physics 37: 307–364.
Bursztyn H and Waldmann S (2002) The characteristic classes of Morita equivalent star products on symplectic manifolds. Communications in Mathematical Physics 228: 103–121.
Deligne P (1995) Déformations de l'algèbre des fonctions d'une variété symplectique: comparaison entre Fedosov et DeWilde, Lecomte. Sel. Math. New Series 1(4): 667–697.

DeWilde M and Lecomte PBA (1983) Existence of star-products and of formal deformations of the Poisson Lie algebra of arbitrary symplectic manifolds. Letters in Mathematical Physics 7: 487–496.
Dito G and Sternheimer D (2002) Deformation quantization: genesis, developments and metamorphoses. In: Halbout G (ed.) Deformation Quantization, IRMA Lectures in Mathematics and Theoretical Physics, vol. 1, pp. 9–54. Berlin: Walter de Gruyter.
Fedosov BV (1986) Quantization and the index. Soviet Physics Doklady 31(11): 877–878.
Gutt S (2000) Variations on deformation quantization. In: Dito G and Sternheimer D (eds.) Conférence Moshé Flato 1999. Quantization, Deformations, and Symmetries, Mathematical Physics Studies, vol. 21, pp. 217–254. Dordrecht: Kluwer Academic.
Jurčo B, Schupp P, and Wess J (2002) Noncommutative line bundles and Morita equivalence. Letters in Mathematical Physics 61: 171–186.
Kontsevich M (2003) Deformation quantization of Poisson manifolds. Letters in Mathematical Physics 66: 157–216.
Nest R and Tsygan B (1995) Algebraic index theorem. Communications in Mathematical Physics 172: 223–262.
Omori H, Maeda Y, and Yoshioka A (1991) Weyl manifolds and deformation quantization. Advances in Mathematics 85: 224–255.
Waldmann S (2002) On the representation theory of deformation quantization. In: Halbout G (ed.) Deformation Quantization, IRMA Lectures in Mathematics and Theoretical Physics, vol. 1, pp. 107–133. Berlin: Walter de Gruyter.
Waldmann S (2005) States and representation theory in deformation quantization. Reviews of Mathematical Physics 17: 15–75.
Weinstein A and Xu P (1998) Hochschild cohomology and characteristic classes for star-products. In: Khovanskij A, Varchenko A, and Vassiliev V (eds.) Geometry of Differential Equations, pp. 177–194. Dedicated to VI Arnol'd on the occasion of his 60th birthday. Providence, RI: American Mathematical Society.

Deformation Theory
M J Pflaum, Johann Wolfgang Goethe-Universität, Frankfurt, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction and Historical Remarks

In mathematical deformation theory one studies how an object in a certain category of spaces can be varied as a function of the points of a parameter space. In other words, deformation theory deals with the structure of families of objects like varieties, singularities, vector bundles, coherent sheaves, algebras, or differentiable maps. Deformation problems appear in various areas of mathematics, in particular in algebra, algebraic and analytic geometry, and mathematical physics. According to Deligne, there is a common philosophy behind all deformation problems in characteristic zero. It is the goal of this survey to explain this point of view. Moreover, we will provide several examples with relevance for mathematical physics.

Historically, modern deformation theory has its roots in the work of Grothendieck, Artin, Quillen, Schlessinger, Kodaira–Spencer, Kuranishi, Deligne, Grauert, Gerstenhaber, and Arnol’d. The application of deformation methods to quantization theory goes back to Bayen–Flato–Fronsdal–Lichnerowicz–Sternheimer, and has led to the concept of a star product on symplectic and Poisson manifolds. The existence of such star products has been proved by de Wilde–Lecomte and Fedosov for symplectic and by Kontsevich for Poisson manifolds.

Recently, Fukaya and Kontsevich have found a far-reaching connection between general deformation theory, the theory of moduli, and mirror symmetry. Thus, deformation theory comes back to its origins, which lie in the desire to construct moduli spaces. Briefly, a moduli problem can be described as the attempt to collect all isomorphism classes of spaces of a certain type into one single object, the moduli space, and then to study its geometric and analytic properties. The observations by Fukaya and Kontsevich have led to new insight into the algebraic geometry of mirror varieties and their application to string theory.

Basic Definitions and Examples

Deformation theory is based on the notion of a ringed space, so we briefly recall its definition.

Definition 1 Let $k$ be a field. By a $k$-ringed space one understands a topological space $X$ together with a sheaf $\mathcal{A}$ of unital $k$-algebras on $X$. The sheaf $\mathcal{A}$ will be called the structure sheaf of the ringed space. In case each of the stalks $\mathcal{A}_x$, $x \in X$, is a local algebra, that is, has a unique maximal ideal $\mathfrak{m}_x$, one calls $(X, \mathcal{A})$ a locally $k$-ringed space. Likewise, one defines a commutative $k$-ringed space as a ringed space such that the stalks of the structure sheaf are all commutative.

Given two $k$-ringed spaces $(X, \mathcal{A})$ and $(Y, \mathcal{B})$, a morphism from $(X, \mathcal{A})$ to $(Y, \mathcal{B})$ is a pair $(f, \varphi)$, where
$f : X \to Y$ is a continuous mapping and $\varphi : f^{-1}\mathcal{B} \to \mathcal{A}$ a morphism of sheaves of algebras. This means in particular that for every point $x \in X$ there is a homomorphism of algebras $\varphi_x : \mathcal{B}_{f(x)} \to \mathcal{A}_x$ induced by $\varphi$. Under the assumption that both ringed spaces are local, $(f, \varphi)$ is called a morphism of locally ringed spaces, if each $\varphi_x$ is a homomorphism of local $k$-algebras, that is, maps the maximal ideal of $\mathcal{B}_{f(x)}$ to the one of $\mathcal{A}_x$.

Clearly, $k$-ringed spaces (resp. locally or commutative $k$-ringed spaces) together with their morphisms form a category. The following is a list of examples of ringed spaces, in particular of those which will be needed later.

Example 2
(i) Denote by $\mathcal{C}^\infty$ the sheaf of smooth functions on $\mathbb{R}^n$, by $\mathcal{C}^\omega$ the sheaf of real analytic functions, and let $\mathcal{O}$ be the sheaf of holomorphic functions on $\mathbb{C}^n$. Then $(\mathbb{R}^n, \mathcal{C}^\infty)$, $(\mathbb{R}^n, \mathcal{C}^\omega)$, and $(\mathbb{C}^n, \mathcal{O})$ are ringed spaces over $\mathbb{R}$ resp. $\mathbb{C}$.
(ii) A differentiable manifold of dimension $n$ can be understood as a locally $\mathbb{R}$-ringed space $(M, \mathcal{C}^\infty_M)$ which locally is isomorphic to $(\mathbb{R}^n, \mathcal{C}^\infty)$. Likewise, a real analytic manifold is a ringed space $(M, \mathcal{C}^\omega_M)$ which locally can be modeled by $(\mathbb{R}^n, \mathcal{C}^\omega)$, and a complex manifold is an $(M, \mathcal{O}_M)$ which locally looks like $(\mathbb{C}^n, \mathcal{O})$.
(iii) Let $D$ be a domain in $\mathbb{C}^n$, and $\mathcal{J}$ an ideal sheaf in $\mathcal{O}_D$ of finite type, which means that $\mathcal{J}$ is locally finitely generated over $\mathcal{O}_D$. Let $Y$ be the support of the quotient sheaf $\mathcal{O}_D/\mathcal{J}$. The pair $(Y, \mathcal{O}_Y)$, where $\mathcal{O}_Y$ denotes the restriction of $\mathcal{O}_D/\mathcal{J}$ to $Y$, then is a ringed space, called a complex model space. A complex space now is a ringed space $(X, \mathcal{O}_X)$ which locally looks like a complex model space (cf. Grauert and Remmert 1984).
(iv) Let $k$ be an algebraically closed field, and $\mathbb{A}^n$ the affine space over $k$ of dimension $n$. Then $\mathbb{A}^n$, together with the sheaf of regular functions, is a ringed space.
(v) Given a ring $A$, its spectrum $\operatorname{Spec} A$ together with the sheaf of regular functions $\mathcal{O}_A$ forms a ringed space (cf. Hartshorne (1977), section II.2). One calls $(\operatorname{Spec} A, \mathcal{O}_A)$ an affine scheme. More generally, a scheme is a ringed space $(X, \mathcal{O}_X)$ which locally can be modeled by affine schemes.
(vi) Finally, if $A$ is a local $k$-algebra, the pair $(*, A)$ can be understood as a locally ringed space. With $A$ the algebra of formal power series $k[[t]]$ in one variable $t$, this example plays an important role in the theory of formal deformations of algebras.

Definition 3 A morphism $(f, \varphi) : (Y, \mathcal{B}) \to (P, \mathcal{S})$ of ringed spaces is called fibered, if the following conditions are fulfilled:
(i) $(P, \mathcal{S})$ is a commutative locally ringed space;
(ii) $f : Y \to P$ is surjective; and
(iii) $\varphi_y : \mathcal{S}_{f(y)} \to \mathcal{B}_y$ maps $\mathcal{S}_{f(y)}$ into the center of $\mathcal{B}_y$ for each $y \in Y$.

The fiber of $(f, \varphi)$ over a point $p \in P$ then is the ringed space $(Y_p, \mathcal{B}_p)$ defined by

$$ Y_p = f^{-1}(p), \qquad \mathcal{B}_p = \mathcal{B}|_{f^{-1}(p)} \,/\, \mathfrak{m}_p\, \mathcal{B}|_{f^{-1}(p)} $$

where $\mathfrak{m}_p$ is the maximal ideal of $\mathcal{S}_p$ which acts on $\mathcal{B}|_{f^{-1}(p)}$ via $\varphi$.

A fibered morphism of ringed spaces can be pictured in Figure 1.

Figure 1 A fibered space.

In addition to this intuitive picture, conditions (i)–(iii) imply that the stalks $\mathcal{B}_y$ are central extensions of $\mathcal{B}_y / \mathfrak{m}_{f(y)} \mathcal{B}_y$ by $\mathcal{S}_{f(y)}$.

Definition 4 Let $(P, \mathcal{S})$ be a commutative locally ringed space over a field $k$ with $P$ connected, let $*$ be a fixed point in $P$, and $(X, \mathcal{A})$ a $k$-ringed space. A deformation of $(X, \mathcal{A})$ over the parameter space $(P, \mathcal{S})$ with distinguished point $*$ then is a fibered morphism $(f, \varphi) : (Y, \mathcal{B}) \to (P, \mathcal{S})$ over $k$ together with an isomorphism $(i, \iota) : (X, \mathcal{A}) \to (Y_*, \mathcal{B}_*)$ such that for all $p \in P$ and $y \in f^{-1}(p)$ the homomorphism $\varphi_y : \mathcal{S}_p \to \mathcal{B}_y$ is flat.

The condition of flatness in the definition of a deformation serves as a substitute for ‘‘local triviality’’ and works also in the presence of singularities (see Palamodov (1990), section 3, for a discussion of this point).

In the remainder of this section, we provide a list of some of the most important deformation problems in mathematics, and show how these can be formulated within the above language.

Products of k-Ringed Spaces

Let $(X, \mathcal{A})$ be any $k$-ringed space and $(P, \mathcal{S})$ a $k$-scheme. For any closed point $* \in P$, the product
$(X \times P, \mathcal{B}) = (X, \mathcal{A}) \times_k (P, \mathcal{S})$ then is a flat deformation of $(X, \mathcal{A})$ with distinguished point $*$. This can be seen easily from the fact that $\mathcal{B}_{(x,p)} = \mathcal{A}_x \otimes_k \mathcal{S}_p$ for every $x \in X$ and $p \in P$.

Families of Matrices as Deformations

Let $(P, \mathcal{O}_P)$ be a complex space with distinguished point $*$ and $A_P : P \to \operatorname{Mat}(n \times n, \mathbb{C})$ a holomorphic family of complex $n \times n$ matrices over $P$. By the following construction, $A_P$ can be understood as a deformation, more precisely as a deformation of the matrix $A := A_P(*)$. Let $Y$ be the graph of $A_P$ in the product space $P \times \operatorname{Mat}(n \times n, \mathbb{C})$ and $f : Y \to P$ be the restriction of the projection onto the first coordinate. Define the sheaf $\mathcal{B}$ as the inverse image sheaf $f^{-1}\mathcal{S}$, and let $\varphi$ be the sheaf morphism which for every $y \in Y$ is induced by the identity map $\varphi_y : \mathcal{S}_{f(y)} \to \mathcal{B}_y := \mathcal{S}_{f(y)}$. It is then immediately clear that $(f, \varphi)$ is a deformation of the fiber $f^{-1}(*)$ and that this fiber coincides with the matrix $A$.

Now let $A$ be an arbitrary complex $n \times n$-matrix, and choose a $GL(n, \mathbb{C})$-slice through $A$, that is, a submanifold $P$ containing $A$ which is transversal to the $GL(n, \mathbb{C})$-orbit through $A$. Hereby, it is assumed that $GL(n, \mathbb{C})$ acts by the adjoint action on $\operatorname{Mat}(n \times n, \mathbb{C})$. The family $A_P$ given by the canonical embedding $P \hookrightarrow \operatorname{Mat}(n \times n, \mathbb{C})$ now is a deformation of $A$. The germ of this deformation at $*$ is versal in the sense defined in the next section.

Deformation of a Scheme à la Grothendieck

Assume that $(P, \mathcal{S})$ is a connected scheme over $k$. A deformation of a scheme $(X, \mathcal{A})$ then is a deformation $(f, \varphi) : (Y, \mathcal{B}) \to (P, \mathcal{S})$ in the sense defined above, together with the requirement that $f : Y \to P$ is a proper map, that is, $f^{-1}(K)$ is compact for every compact $K \subset P$. As a particular example, consider the $k$-scheme $Y = \operatorname{Spec} k[x, y, t]/(xy - t)$. It gives rise to a fibration $Y \to \operatorname{Spec} k[t]$, whose fibers $Y_a$ with $a \in k$ are hyperbolas $xy = a$, when $a \neq 0$, and consist of the two axes $x = 0$ and $y = 0$, when $a = 0$. For $k = \mathbb{R}$, this deformation can be illustrated as in Figure 2.

Figure 2 Deformation of the coordinate axes (fibers shown in the $(x, y)$-plane for $a = 0$, $a = 0.5$, and $a = 1$).

For further information on this and similar examples, see Hartshorne (1977), in particular example 3.3.2.

Deformation of a Complex Space

According to Grothendieck, one understands by a deformation of a complex space $(X, \mathcal{A})$ a morphism of complex spaces $(f, \varphi) : (Y, \mathcal{B}) \to (P, \mathcal{S})$ which is both a proper flat morphism of complex spaces and a deformation of $(X, \mathcal{A})$ as a ringed space. In case $(X, \mathcal{A})$ and $(P, \mathcal{S})$ are complex manifolds and if $P$ is connected, each of the fibers $Y_p$ is a compact complex manifold. Moreover, the family $(Y_p)_{p \in P}$ then is a family of compact complex manifolds in the sense of Kodaira–Spencer (cf. Palamodov (1990)).

Deformation of Singularities

Let $x$ be a point of some $\mathbb{C}^n$. Two complex spaces $(X, \mathcal{O}_X) \subset (\mathbb{C}^n, \mathcal{O})$ and $(X', \mathcal{O}_{X'}) \subset (\mathbb{C}^n, \mathcal{O})$ with $x \in X \cap X'$ are then called germ equivalent at $x$ if there exists an open neighborhood $U \subset \mathbb{C}^n$ of $x$ such that $X \cap U = X' \cap U$. Obviously, germ equivalence at $x$ is indeed an equivalence relation. We denote the equivalence class of $X$ by $[X]_x$. Clearly, if $[X]_x = [X']_x$, then one has $\mathcal{O}_{X,x} = \mathcal{O}_{X',x}$ for the stalks at $x$. By a singularity one understands a pair $([X]_x, \mathcal{O}_{X,x})$. In the literature, such a singularity is often denoted by $(X, x)$. The singularity $(X, x)$ is called nonsingular or regular if $\mathcal{O}_{X,x}$ is isomorphic to an algebra of convergent power series $\mathbb{C}\{z_1, \ldots, z_d\}$. A deformation of a complex singularity $(X, x)$ over a complex germ $(P, *)$ is a morphism of ringed spaces $([Y]_x, \mathcal{O}_{Y,x}) \to ([P]_*, \mathcal{O}_{P,*})$ which is induced by a holomorphic map and which is a deformation of $([X]_x, \mathcal{O}_{X,x})$ as a ringed space. See Artin (1976) and the overview article by Greuel (1992) for further details and a variety of examples.

First-Order Deformation of Algebras

Consider a $k$-algebra $A$ and the truncated polynomial algebra $S = k[\varepsilon]/\varepsilon^2 k[\varepsilon]$. Furthermore, let $\alpha : A \times A \to A$ be a Hochschild 2-cocycle of $A$; in other words, assume that the relation

$$ a_1 \alpha(a_2, a_3) - \alpha(a_1 a_2, a_3) + \alpha(a_1, a_2 a_3) - \alpha(a_1, a_2) a_3 = 0 \qquad [1] $$

holds for all $a_1, a_2, a_3 \in A$. Then one can define a new $k$-algebra $B$, whose underlying linear structure
is isomorphic to $A \otimes_k S$ and whose product is given by the following construction: any element $b \in B$ can be written uniquely in the form $b = a_0 + a_1 \varepsilon$, with $a_0, a_1 \in A$. Then the product of $b = a_0 + a_1 \varepsilon \in B$ and $b' = a_0' + a_1' \varepsilon \in B$ is given by

$$ b \cdot b' = a_0 a_0' + \bigl( \alpha(a_0, a_0') + a_0 a_1' + a_1 a_0' \bigr) \varepsilon \qquad [2] $$

By condition [1], this product is associative. One thus obtains a flat deformation $S \to B$ of the algebra $A$ and calls it the first-order or infinitesimal deformation of $A$ along the Hochschild cocycle $\alpha$. For further information on this and the connection between deformation theory and Hochschild cohomology, see the overview article by Gerstenhaber and Schack (1986).

Formal Deformation of an Algebra

Let us generalize the preceding example and explain the concept of a formal deformation of an algebra by Gerstenhaber. Assume again $A$ to be an arbitrary $k$-algebra and choose bilinear maps $\alpha_n : A \times A \to A$ for $n \in \mathbb{N}$ such that $\alpha_0$ is the product on $A$ and $\alpha_1$ is a Hochschild cocycle. Furthermore, let $S$ be the algebra $k[[t]]$ of formal power series in one variable over $k$. Then define on the linear space $B = A[[t]]$ of formal power series in one variable with coefficients in $A$ the following bilinear map:

$$ \star : B \times B \to B, \qquad \Bigl( \sum_{n \in \mathbb{N}} a_n t^n, \sum_{n \in \mathbb{N}} b_n t^n \Bigr) \mapsto \sum_{n \in \mathbb{N}} \; \sum_{\substack{k, l, m \in \mathbb{N} \\ k + l + m = n}} \alpha_m(a_k, b_l)\, t^n \qquad [3] $$

If $B$ together with $\star$ becomes a $k$-algebra or, in other words, if $\star$ is associative, one can easily see that it gives a flat deformation of $A$ over $S = k[[t]]$. In that case, one says that $B$ is a formal deformation of $A$ by the family $(\alpha_n)_{n \in \mathbb{N}}$. In contrast to the preceding example, there might not exist for every Hochschild cocycle $\alpha$ on $A$ a formal deformation $B$ of $A$ defined by a family $(\alpha_n)_{n \in \mathbb{N}}$ such that $\alpha_1 = \alpha$. In case it exists, we will say that the deformation $B$ of $A$ is in the direction of $\alpha$. If the third Hochschild cohomology group $H^3(A, A)$ vanishes, there exists for every Hochschild cocycle $\alpha$ on $A$ a deformation $B$ of $A$ in the direction of $\alpha$ (see again Gerstenhaber and Schack (1986) for further details).

Formal Deformation Quantization of Symplectic and Poisson Manifolds

Let us consider the last two examples for the case where $A$ is the algebra $\mathcal{C}^\infty(M)$ of smooth functions on a symplectic or Poisson manifold $M$. Then the Poisson bracket $\{\,,\}$ gives a Hochschild cocycle on $\mathcal{C}^\infty(M)$. There exists a first-order deformation of $\mathcal{C}^\infty(M)$ along $(1/2i)\{\,,\}$ and, even though $HH^3(A, A)$ might not always vanish, a deformation quantization of $M$, that means a formal deformation of $\mathcal{C}^\infty(M)$ in the direction of the Poisson bracket $(1/2i)\{\,,\}$. For the symplectic case, this fact has been proved first by deWilde–Lecomte using methods from Hochschild cohomology theory. A more geometric and intuitive proof has been given by Fedosov (1996). The Poisson case has been settled in the work of Kontsevich (2003) (see also the section ‘‘Deformation quantization of Poisson manifolds’’).

Quantized Universal Enveloping Algebras According to Drinfeld

A quantized universal enveloping algebra for a complex Lie algebra $\mathfrak{g}$ is a Hopf algebra $A$ over $\mathbb{C}[[t]]$ such that $A$ is a topologically free $\mathbb{C}[[t]]$-module (i.e., $A = (A/tA)[[t]]$ as left $\mathbb{C}[[t]]$-module) and $A/tA$ is the universal enveloping algebra $U\mathfrak{g}$ of $\mathfrak{g}$. Because $A$ is a topologically free $\mathbb{C}[[t]]$-module, $A$ is a flat $\mathbb{C}[[t]]$-module and thus a deformation of $U\mathfrak{g}$ over $\mathbb{C}[[t]]$. See Drinfel’d (1986) and the monograph by Kassel (1995) for further details and examples of quantized universal enveloping algebras.

Quantum Plane

Consider the tensor algebra $T = \bigoplus_{n \in \mathbb{N}} (\mathbb{R}^2)^{\otimes n}$ of the two-dimensional real vector space $\mathbb{R}^2$, and let $(x, y)$ be the canonical basis of $\mathbb{R}^2$. Then form the tensor product sheaf $\mathcal{T}_{\mathbb{C}^*} = T \otimes_{\mathbb{R}} \mathcal{O}_{\mathbb{C}^*}$ and let $\mathcal{I}_{\mathbb{C}^*}$ be the ideal sheaf in $\mathcal{T}_{\mathbb{C}^*}$ generated by the relation

$$ x \otimes y - z\, y \otimes x = 0 \qquad [4] $$

where $z : \mathbb{C}^* \to \mathbb{C}$ is the identity function. The quotient sheaf $\mathcal{B} = \mathcal{B}_{\mathbb{C}^*} = \mathcal{T}_{\mathbb{C}^*} / \mathcal{I}_{\mathbb{C}^*}$ then is a sheaf of $\mathbb{C}$-algebras and an $\mathcal{O}_{\mathbb{C}^*}$-module. Using eqn [4], now move all occurrences of $x$ in an element of $\mathcal{B}_{\mathbb{C}^*}$ to the right of all $y$’s. Since $1/z$ is an element of $\mathcal{O}(\mathbb{C}^*)$, one can thus show that $\mathcal{B}_{\mathbb{C}^*}$ is a free $\mathcal{O}_{\mathbb{C}^*}$-module. Hence, $\mathcal{B}_{\mathbb{C}^*}$ is flat over $\mathcal{O}_{\mathbb{C}^*}$. Further, it is easy to see that for every $q \in \mathbb{C}^*$ the $\mathbb{C}$-algebra $A_q = \mathcal{B}_q / \mathfrak{m}_q \mathcal{B}_q$ is freely generated by elements $x, y$ with relations

$$ x \otimes y - q\, y \otimes x = 0 \qquad [5] $$

We call $A_q$ the $q$-deformed quantum plane and $B = \mathcal{B}(\mathbb{C}^*)$ the universally deformed quantum plane over $\mathbb{C}^*$. Altogether, one can interpret $\mathcal{B}$ as a deformation of $A_q$ over $\mathbb{C}^*$, in particular as a deformation of $A_1 = \mathbb{C}[x, y]$, the algebra of complex polynomials in two generators.

In the same way, one can deform function algebras on higher-dimensional vector spaces as well as function algebras on certain Lie groups. In this manner, one obtains the quantum group
$SU_q(2)$ as a deformation of a Hopf algebra of functions on $SU(2)$. See, for example, the work of Faddeev–Reshetikhin–Takhtajan (1990), Manin (1988) and Wess–Zumino (1990) for more information on $q$-deformations of vector spaces, Lie groups, differential calculi, etc.

Versal Deformations

In this section, and the ones that follow, we consider only germs of deformations, that is, deformations over parameter spaces of the form $(*, S)$. This means in particular that the structure sheaf only consists of its stalk $S$ at $*$, a commutative local $k$-algebra. Let us now suppose that the sheaf morphism $\varphi : (Y, \mathcal{B}) \to (*, S)$ (over the canonical map $Y \to *$) is a deformation of the ringed space $(X, \mathcal{A})$ and that $\tau : T \to S$ is a homomorphism of commutative local $k$-algebras. Then the sheaf morphism $\tau^*\varphi : \mathcal{B} \otimes_S T \to T$ with $(\tau^*\varphi)_y(t) = 1 \otimes t$ for $y \in Y$ and $t \in T$ is a deformation of $(X, \mathcal{A})$ over the parameter space $(*, T)$. One says that the deformation $\tau^*\varphi$ is induced by the homomorphism $\tau$.

Definition 5 A deformation $\varphi : (Y, \mathcal{B}) \to (*, S)$ of $(X, \mathcal{A})$ is called versal if every (germ of a) deformation of $(X, \mathcal{A})$ is isomorphic to a deformation germ induced by a homomorphism of $k$-algebras $\tau : T \to S$. A versal deformation is called universal, if the inducing homomorphism $\tau : T \to S$ is unique, and miniversal if $S$ is of minimal dimension.

Example 6
(i) In the section ‘‘Families of matrices as deformations,’’ the construction of a versal deformation of a complex matrix $A$ has been sketched.
(ii) According to Kuranishi, every compact complex manifold has a versal deformation by an analytic germ. See Kuranishi (1971) for a detailed exposition and the section ‘‘The Kodaira–Spencer algebra controlling deformations of compact complex manifolds’’ for a description of the principal ideas.
(iii) Grauert has shown that for isolated singularities there exists a versal analytic deformation.
(iv) By the work of Douady–Verdier, Grauert, and Palamodov one knows that for every compact complex space there exists a miniversal analytic deformation. One of the essential methods in the existence proof hereby is Palamodov’s construction of the cotangent complex (see Palamodov (1990)).
(v) Bingener (1987) has further established Palamodov’s approach and thus has provided a unified and quite general method for constructing versal deformations in analytic geometry.
(vi) Fialowski–Fuchs have constructed miniversal deformations of Lie algebras.

Schlessinger’s Theorem

According to Grothendieck, spaces in algebraic geometry are represented by functors from a category of commutative rings to the category of sets. In this picture, an affine algebraic variety $X$ over the base field $k$ and with coordinate ring $A$ is equivalently described by the functor $\operatorname{Hom}_{\mathrm{alg}}(A, -)$ defined on the category of commutative $k$-algebras. As will be shown by examples in the next section, versal deformations are often encoded by functors representing spaces. More precisely, a deformation problem leads to a so-called functor of Artin rings, which means a covariant functor $F$ from the category of (local) Artinian $k$-algebras to the category of sets such that the set $F(k)$ has exactly one element. The question now arises as to under which conditions the functor $F$ is representable, that is, there exists a commutative $k$-algebra $A$ such that $F \cong \operatorname{Hom}_{\mathrm{alg}}(A, -)$. In the work of Schlessinger (1968), the structure of functors of Artin rings has been studied in detail. Moreover, criteria have been established for when such a functor is pro-representable, which means that it can be represented by a complete local algebra $\hat{A}$, where ‘‘completeness’’ is understood with respect to the $\mathfrak{m}$-adic topology. Because of its importance for deformation theory, we will state Schlessinger’s theorem in this section. Before we come to its details, let us recall some notation.

Definition 7 By an Artinian $k$-algebra over a field $k$ one understands a commutative $k$-algebra $R$ which satisfies the following descending chain condition:

(Dec) Every descending chain $I_1 \supset \cdots \supset I_k \supset I_{k+1} \supset \cdots$ of ideals in $R$ becomes stationary.

Among others, an Artinian algebra $R$ has the following properties:
1. $R$ is Noetherian, that is, it satisfies the ascending chain condition.
2. Every prime ideal in $R$ is maximal.
3. (Chinese remainder theorem) $R$ is isomorphic to a finite product $\prod_{i=1}^n R_i$, where each $R_i$ is a local Artinian algebra.
4. Every maximal ideal $\mathfrak{m}$ of $R$ is nilpotent, that is, $\mathfrak{m}^k = 0$ for some $k \in \mathbb{N}$.
5. Every quotient $R/\mathfrak{m}^k$ with $\mathfrak{m}$ maximal is finite dimensional.
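The nilpotency of the maximal ideal (property 4) can be checked by hand in the simplest local Artinian algebra, a truncated polynomial ring $\mathbb{Q}[t]/(t^n)$ with maximal ideal $\mathfrak{m} = (t)$. The following Python sketch is only an illustration under stated assumptions: the choice of this particular algebra, the representation of its elements as coefficient lists, and the helper names `mul` and `power` are introduced here and do not appear in the text.

```python
def mul(a, b, n):
    """Multiply two elements of Q[t]/(t^n), given as length-n coefficient lists."""
    out = [0] * n
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < n:              # monomials t^(i+j) with i+j >= n vanish
                out[i + j] += x * y
    return out


def power(a, e, n):
    """Return the e-th power of a in Q[t]/(t^n)."""
    result = [1] + [0] * (n - 1)       # the unit element 1
    for _ in range(e):
        result = mul(result, a, n)
    return result


n = 3
x = [0, 2, 5]                          # 2t + 5t^2, an element of m = (t)
print(power(x, 2, n))                  # the square need not vanish
print(power(x, n, n))                  # but every n-fold product from m does

u = [1, 1, 0]                          # 1 + t lies outside m, hence is a unit
u_inv = [1, -1, 1]                     # candidate inverse 1 - t + t^2
print(mul(u, u_inv, n))                # coefficient list of the product u * u_inv
```

The algebra is finite dimensional over the base field (here of dimension $n$), and every element outside $\mathfrak{m}$ is invertible, which is what makes it local.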
Definition 8 Assume that $f : B \to A$ is a surjective homomorphism in the category $k\text{-}\mathrm{Alg}_{l,\mathrm{Art}}$ of local Artinian $k$-algebras. Then $f$ is called a small extension if $\ker f$ is a nonzero principal ideal $(b)$ in $B$ such that $\mathfrak{m} b = (0)$, where $\mathfrak{m}$ is the maximal ideal of $B$.

Theorem 9 (Schlessinger (1968, theorem 2.11)). Let $F$ be a functor of Artin rings (over the base field $k$). Assume that $A' \to A$ and $A'' \to A$ are morphisms in $k\text{-}\mathrm{Alg}_{l,\mathrm{Art}}$, and consider the map

$$ F(A' \times_A A'') \to F(A') \times_{F(A)} F(A'') \qquad [6] $$

Then $F$ is pro-representable if and only if $F$ has the following properties:
(H1) The map [6] is a surjection whenever $A'' \to A$ is a small extension.
(H2) The map [6] is a bijection, when $A = k$ and $A'' = k[\varepsilon]$.
(H3) One has $\dim_k(t_F) < \infty$ for the tangent space $t_F := F(k[\varepsilon])$.
(H4) For every small extension $A' \to A$, the map $F(A' \times_A A') \to F(A') \times_{F(A)} F(A')$ is an isomorphism.

Suppose that the functor $F$ satisfies conditions (H1)–(H4), and let $\hat{A}$ be an arbitrary complete local $k$-algebra. By Yoneda’s lemma, every element

$$ \xi = \varprojlim_{n \in \mathbb{N}} \xi_n \in \varprojlim_{n \in \mathbb{N}} F(\hat{A}/\mathfrak{m}^n \hat{A}) $$

induces a natural transformation

$$ \operatorname{Hom}_{\mathrm{alg}}(\hat{A}, -) \to F, \qquad (u : \hat{A} \to R) \mapsto F(u_n)(\xi_n) \qquad [7] $$

where $n \in \mathbb{N}$ is chosen large enough such that the homomorphism $u : \hat{A} \to R$ factors through some $u_n : \hat{A}/\mathfrak{m}^n \to R$. This is indeed possible, since $R$ is Artinian. In the course of the proof of Schlessinger’s theorem, $\hat{A}$ and the element $\xi$ are now constructed in such a way that [7] is an isomorphism.

Differential Graded Lie Algebras and Deformation Problems

According to a philosophy going back to Deligne ‘‘every deformation problem in characteristic zero is controlled by a differential graded Lie algebra, with quasi-isomorphic differential graded Lie algebras giving the same deformation theory’’ (cf. Goldman and Millson (1988), p. 48). In the following, we will explain the main idea of this concept and apply it to two particular examples.

Differential Graded Lie Algebras

Definition 10 By a graded algebra over a field $k$ one understands a graded $k$-vector space $A^\bullet = \bigoplus_{k \in \mathbb{Z}} A^k$ together with a bilinear map

$$ \mu : A^\bullet \times A^\bullet \to A^\bullet, \qquad (a, b) \mapsto a \cdot b = \mu(a, b) $$

such that $A^k \cdot A^l \subset A^{k+l}$ for all $k, l \in \mathbb{Z}$. The graded algebra $A^\bullet$ is called associative if $(ab)c = a(bc)$ for all $a, b, c \in A^\bullet$.

A graded subalgebra of $A^\bullet$ is a graded subspace $B^\bullet = \bigoplus_{k \in \mathbb{Z}} B^k \subset A^\bullet$ which is closed under $\mu$; a graded ideal is a graded subalgebra $I^\bullet \subset A^\bullet$ such that $I^\bullet \cdot A^\bullet \subset I^\bullet$ and $A^\bullet \cdot I^\bullet \subset I^\bullet$.

A homomorphism between graded algebras $A^\bullet$ and $B^\bullet$ is a homogeneous map $f : A^\bullet \to B^\bullet$ of degree 0 such that $f(a \cdot b) = f(a) \cdot f(b)$ for all $a, b \in A^\bullet$.

From now on, assume that $k$ has characteristic $\neq 2, 3$. A graded Lie algebra then is a graded $k$-vector space $\mathfrak{g}^\bullet = \bigoplus_{k \in \mathbb{Z}} \mathfrak{g}^k$ together with a bilinear map

$$ [\cdot, \cdot] : \mathfrak{g}^\bullet \times \mathfrak{g}^\bullet \to \mathfrak{g}^\bullet, \qquad (a, b) \mapsto [a, b] $$

such that the following axioms hold true:
1. $[\mathfrak{g}^k, \mathfrak{g}^l] \subset \mathfrak{g}^{k+l}$ for all $k, l \in \mathbb{Z}$.
2. $[\xi, \zeta] = -(-1)^{kl} [\zeta, \xi]$ for all $\xi \in \mathfrak{g}^k$, $\zeta \in \mathfrak{g}^l$.
3. $(-1)^{k_1 k_3} [[\xi_1, \xi_2], \xi_3] + (-1)^{k_2 k_1} [[\xi_2, \xi_3], \xi_1] + (-1)^{k_3 k_2} [[\xi_3, \xi_1], \xi_2] = 0$ for all $\xi_i \in \mathfrak{g}^{k_i}$ with $i = 1, 2, 3$.

By axiom (1), it is clear that a graded Lie algebra is in particular a graded algebra. So the above-defined notions of a graded ideal, homomorphism, etc., apply as well to graded Lie algebras.

Example 11 Let $A^\bullet = \bigoplus_{k \in \mathbb{Z}} A^k$ be a graded associative algebra. Then $A^\bullet$ becomes a graded Lie algebra with the bracket

$$ [a, b] = ab - (-1)^{kl} ba \quad \text{for } a \in A^k \text{ and } b \in A^l $$

The space $A^\bullet$ regarded as a graded Lie algebra is often denoted by $\mathrm{lie}(A^\bullet)$.

Definition 12 A linear map $D : A^\bullet \to A^\bullet$ defined on a graded algebra $A^\bullet$ is called a derivation of degree $l$ if

$$ D(ab) = (Da)b + (-1)^{kl} a(Db) \quad \text{for all } a \in A^k \text{ and } b \in A^\bullet $$

A graded (Lie) algebra $A^\bullet$ together with a derivation $d$ of degree 1 is called a differential graded (Lie) algebra if $d \circ d = 0$. Then $(A^\bullet, d)$ becomes a cochain complex. Since $\ker d$ is a graded
subalgebra of $A^\bullet$ and $\operatorname{im} d$ a graded ideal in $\ker d$, the cohomology space

$$ H^\bullet(A^\bullet, d) = \ker d / \operatorname{im} d $$

inherits the structure of a graded (Lie) algebra from $A^\bullet$.

Let $f : A^\bullet \to B^\bullet$ be a homomorphism of differential graded (Lie) algebras $(A^\bullet, d)$ and $(B^\bullet, \partial)$. Assume further that $f$ is a cochain map, that is, that $f \circ d = \partial \circ f$. Then one says that $f$ is a quasi-isomorphism or that the differential graded (Lie) algebras $A^\bullet$ and $B^\bullet$ are quasi-isomorphic if the induced homomorphism on the cohomology level $f : H^\bullet(A^\bullet, d) \to H^\bullet(B^\bullet, \partial)$ is an isomorphism. Finally, a differential graded (Lie) algebra $(A^\bullet, d)$ is called formal if it is quasi-isomorphic to its cohomology $(H^\bullet(A^\bullet, d), 0)$.

Maurer–Cartan Equation

Assume that $(\mathfrak{g}^\bullet, [\cdot, \cdot], d)$ is a differential graded Lie algebra over $\mathbb{C}$. Define the space $MC(\mathfrak{g}^\bullet)$ of solutions of the Maurer–Cartan equation by

$$ MC(\mathfrak{g}^\bullet) := \{ \omega \in \mathfrak{g}^1 \mid d\omega - \tfrac{1}{2}[\omega, \omega] = 0 \} \qquad [8] $$

In case the differential graded Lie algebra $\mathfrak{g}^\bullet$ is nilpotent, this space naturally possesses a groupoid structure or, in other words, a set of arrows which are all invertible. The reason for this is that, under the assumption of nilpotency, the space $\mathfrak{g}^0$ is equipped with the Campbell–Hausdorff multiplication

$$ \mathfrak{g}^0 \times \mathfrak{g}^0 \to \mathfrak{g}^0, \qquad (X, Y) \mapsto \log(\exp X \cdot \exp Y) $$

and the group $\mathfrak{g}^0$ acts on $\mathfrak{g}^1$ by the exponential function. More precisely, in this situation one can define for two objects $\alpha, \beta \in MC(\mathfrak{g}^\bullet)$ the space of arrows $\alpha \to \beta$ as the set of all $\lambda \in \mathfrak{g}^0$ such that $\exp \lambda \cdot \alpha = \beta$.

We have now the means to define for every complex differential graded Lie algebra $\mathfrak{g}^\bullet$ its deformation functor $\mathrm{Def}_{\mathfrak{g}}$. This functor maps the category of local Artinian $\mathbb{C}$-algebras to the category of groupoids and is defined on objects as follows:

$$ \mathrm{Def}_{\mathfrak{g}}(R) := MC(\mathfrak{g}^\bullet \otimes \mathfrak{m}) \qquad [9] $$

Hereby, $R$ is a complex local Artinian algebra, and $\mathfrak{m}$ its maximal ideal. Note that since $R$ is Artinian, $\mathfrak{g}^\bullet \otimes \mathfrak{m}$ is a nilpotent differential graded Lie algebra, hence $\mathrm{Def}_{\mathfrak{g}}(R)$ carries a groupoid structure as constructed above. Clearly, $\mathrm{Def}_{\mathfrak{g}}$ is also a functor of Artin rings as defined in the previous section.

With appropriate choices of the differential graded Lie algebra $\mathfrak{g}^\bullet$, essentially all deformation problems from the section ‘‘Basic definitions and examples’’ can be recovered via a functor of the form $\mathrm{Def}_{\mathfrak{g}}$. Below, we will show in some detail how this works for two examples, namely the deformation theory of complex manifolds and the deformation quantization of Poisson manifolds. But before we come to this, let us state a result which shows how the deformation functor behaves under quasi-isomorphisms of the underlying differential graded Lie algebra. This result is crucial in the sense that it allows one to describe a deformation problem controlled by $\mathfrak{g}^\bullet$ equivalently by any other differential graded Lie algebra within the quasi-isomorphism class of $\mathfrak{g}^\bullet$. So, in particular in the case where the differential graded Lie algebra is formal, one often obtains a direct solution of the deformation problem.

Theorem 13 (Deligne, Goldman–Millson). Assume that $f : \mathfrak{g}^\bullet \to \mathfrak{h}^\bullet$ is a quasi-isomorphism of differential graded Lie algebras. For every local Artinian $\mathbb{C}$-algebra $R$ the induced functor $f : \mathrm{Def}_{\mathfrak{g}}(R) \to \mathrm{Def}_{\mathfrak{h}}(R)$ then is an equivalence of groupoids.

The Kodaira–Spencer Algebra Controlling Deformations of Compact Complex Manifolds

Let $M$ be a compact complex $n$-dimensional manifold. Recall that then the complexified tangent bundle $T_{\mathbb{C}}M$ has a decomposition into a holomorphic tangent bundle $T^{1,0}M$ and an antiholomorphic tangent bundle $T^{0,1}M$. This leads to a decomposition of the space of complex $n$-forms into the spaces $\Omega^{p,q}M$ of forms on $M$ of type $(p, q)$. More generally, a smooth subbundle $J^{0,1} \subset T_{\mathbb{C}}M$ which induces a decomposition of the form $T_{\mathbb{C}}M = J^{1,0} \oplus J^{0,1}$, where $J^{1,0} := \overline{J^{0,1}}$, is called an almost complex structure on $M$. Clearly, the decomposition of $T_{\mathbb{C}}M$ into the holomorphic and antiholomorphic part is an almost complex structure, and an almost complex structure which is induced by a complex structure is called integrable. Assume that an almost complex structure $J^{0,1}$ is given on $M$ and that it has finite distance to the complex structure on $M$. The latter means that the restriction $\varrho^{0,1}_J$ of the projection $\varrho : T_{\mathbb{C}}M \to T^{0,1}M$ along $T^{1,0}M$ to the subbundle $J^{0,1}$ is an isomorphism. Denote by $\iota$ the inverse of $\varrho^{0,1}_J$, and let $\omega \in \Omega^{0,1}(M, T^{1,0}M)$ be the composition of $\iota$ with the projection $T_{\mathbb{C}}M \to T^{1,0}M$ along $T^{0,1}M$. One checks immediately that every almost complex structure with finite distance to the complex structure on $M$ is uniquely characterized by a section $\omega \in \Omega^{0,1}(M, T^{1,0}M)$ and that every element of $\Omega^{0,1}(M, T^{1,0}M)$ comes from an almost complex structure on $M$.

As a consequence of the Newlander–Nirenberg theorem, one can now show that the almost
complex structure J0, 1 resp. ! is integrable if and The bracket is the Gerstenhaber bracket
only if the equation
½;  : gk1  gk2 ! gk1 þk2
@!  12½!; ! ¼ 0 ½10
½f1 ; f2  :¼ f1 f2  ð1Þk1 k2 f2 f1
is fulfilled. But this is nothing else than the Maurer– where
Cartan equation in the Kodaira–Spencer differential
graded Lie algebra f1 f2 ða0      ak1 þk2 Þ
! X

L ; @; ½;  ¼ 0;p 1;0
 ðM; T MÞ; @; ½ ;  :¼ ð1Þik2 f1 a0      ai1  f2 ðai      aiþk2 Þ
 aiþk2 þ1      ak1 þk2
0, p 1, 0 1, 0
Hereby,  (M, T M) denotes the T M-valued
differential forms on M of type (0, p), @ : The triple (g , b, [  ,  ]) then is a differential graded
0,p (M, T 1, 0 M) ! 0, pþ1 (M, T 1, 0 M) the Dolbeault Lie algebra.
operator, and [  ,  ] is induced by the Lie bracket Consider the Maurer–Cartan equation b

of holomorphic vector fields. As a consequence of (1=2)[
] = 0 in g1 . Obviously, it is equivalent to
these considerations, deformations of the complex the equality
manifold M can equivalently be described by families
ða1 ; a2 Þ 
ða0 a1 ; a2 Þ þ
ða0 ; a1 a2 Þ 
ða0 ; a1 Þa2
(!p )p2P  L1 which satisfy eqn [10] and ! = 0. Thus,
it remains to determine the associated deformation ¼
ða0 ; a1 Þ; a2 Þ 
ða0 ;
ða1 ; a2 ÞÞ
functor DefL . for a0 ; a1 ; a2 2 A ½12
According to Schlessinger’s theorem, the functor
$\mathrm{Def}_L$ is pro-representable. Hence, there exists a local $\mathbb{C}$-algebra $R_L$ complete with respect to the $\mathfrak{m}$-adic topology such that

$$\mathrm{Def}_L(R) = \mathrm{Hom}_{\mathrm{alg}}(R_L, R)$$   [11]

for every local Artinian $\mathbb{C}$-algebra $R$. Moreover, by Artin's theorem, there exists a "convergent" solution of the Maurer–Cartan equation, that is, $R_L$ can be replaced in eqn [11] by a ring $R_L$ representing an analytic germ.

Theorem 14 (Kodaira–Spencer, Kuranishi). The ringed space $(R_L, (0))$ is a miniversal deformation of the complex structure on $M$.

Deformation Quantization of Poisson Manifolds

Let $A$ be an associative $k$-algebra with char $k = 0$. Put for every integer $k \ge -1$

$$\mathfrak{g}^k := \mathrm{Hom}_k(A^{\otimes (k+1)}, A)$$

Then $\mathfrak{g}^\bullet$ becomes a graded vector space. Let us impose a differential and a bracket on $\mathfrak{g}^\bullet$. The differential is the usual Hochschild coboundary $b : \mathfrak{g}^k \to \mathfrak{g}^{k+1}$,

$$bf(a_0 \otimes \cdots \otimes a_{k+1}) := a_0 f(a_1 \otimes \cdots \otimes a_{k+1}) + \sum_{i=0}^{k} (-1)^{i+1} f(a_0 \otimes \cdots \otimes a_i a_{i+1} \otimes \cdots \otimes a_{k+1}) + (-1)^k f(a_0 \otimes \cdots \otimes a_k) a_{k+1}$$

If one defines now for some $\gamma \in \mathfrak{g}^1$ the bilinear map $m : A \times A \to A$ by $m(a, b) = ab + \gamma(a, b)$, then [12] implies that $m$ is associative if and only if $\gamma$ satisfies the Maurer–Cartan equation.

Let us apply these observations to the case where $A$ is the algebra $C^\infty(M)[[t]]$ of formal power series in one variable with coefficients in the space of smooth functions on a Poisson manifold $M$. By (a variant of) the theorem of Hochschild–Kostant–Rosenberg and Connes, one knows that in this case the cohomology of $(\mathfrak{g}^\bullet, b)$ is given by formal power series with coefficients in the space $\Gamma^\infty(\Lambda^\bullet TM)$ of antisymmetric multivector fields. Now, $\Gamma^\infty(\Lambda^\bullet TM)$ carries a natural Lie algebra bracket as well, namely the Schouten bracket. Thus, one obtains a second differential graded Lie algebra $(\Gamma^\infty(\Lambda^\bullet TM)[[t]], 0, [\cdot, \cdot])$. Unfortunately, the projection onto cohomology $(\mathfrak{g}^\bullet, b) \to \Gamma^\infty(\Lambda^\bullet TM)[[t]]$ does not preserve the natural brackets, hence is not a quasi-isomorphism in the category of differential graded Lie algebras. It has been the fundamental observation by Kontsevich that this defect can be cured as follows.

Theorem 15 (Kontsevich 2003). For every Poisson manifold $M$ the differential graded Lie algebra $(\mathfrak{g}^\bullet, b, [\cdot, \cdot])$ is formal in the sense that there exists a quasi-isomorphism $(\mathfrak{g}^\bullet, b, [\cdot, \cdot]) \to (\Gamma^\infty(\Lambda^\bullet TM)[[t]], 0, [\cdot, \cdot])$ in the category of $L_\infty$-algebras.

Note that the theorem only claims the existence of a quasi-isomorphism in the category of $L_\infty$-algebras or, in other words, in the category of homotopy Lie algebras. This is a notion somewhat weaker than a differential graded Lie algebra, but Theorem 13 also holds in the context of $L_\infty$-algebras.
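The equivalence between associativity of $m$ and the Maurer–Cartan equation can be checked by expanding the associator and comparing with the coboundary formula for $k = 1$. The following is only a sketch: eqn [12] is not reproduced in this section, so the bracket normalization $\tfrac12[\gamma,\gamma](a_0,a_1,a_2) = \gamma(\gamma(a_0,a_1),a_2) - \gamma(a_0,\gamma(a_1,a_2))$ used below is an assumption (the usual Gerstenhaber convention), and overall signs may differ in other conventions.

```latex
% Associator of m(a,b) = ab + \gamma(a,b); the product of A is associative.
\begin{align*}
m(m(a_0,a_1),a_2) - m(a_0,m(a_1,a_2))
 &= \gamma(a_0,a_1)\,a_2 + \gamma(a_0a_1,a_2)
    - a_0\,\gamma(a_1,a_2) - \gamma(a_0,a_1a_2) \\
 &\quad + \gamma(\gamma(a_0,a_1),a_2) - \gamma(a_0,\gamma(a_1,a_2)) \\
 &= -(b\gamma)(a_0\otimes a_1\otimes a_2)
    + \tfrac12[\gamma,\gamma](a_0,a_1,a_2),
\end{align*}
% where, from the coboundary formula with k = 1,
\[
(b\gamma)(a_0\otimes a_1\otimes a_2)
 = a_0\,\gamma(a_1,a_2) - \gamma(a_0a_1,a_2)
   + \gamma(a_0,a_1a_2) - \gamma(a_0,a_1)\,a_2 .
\]
% Hence m is associative iff b\gamma - \tfrac12[\gamma,\gamma] = 0.
```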

Since the solutions of the Maurer–Cartan equation in $(\Gamma^\infty(\Lambda^\bullet TM)[[t]], 0, [\cdot, \cdot])$ are exactly the formal paths of Poisson bivector fields on $M$, Kontsevich's formality theorem entails:

Corollary 16 Every Poisson manifold has a formal deformation quantization.

See also: Deformation Quantization; Deformation Quantization and Representation Theory; Deformations of the Poisson Bracket on a Symplectic Manifold; Fedosov Quantization; Holonomic Quantum Fields; Operads.

Further Reading

Artin M (1976) Lectures on Deformations of Singularities. Lectures on Mathematics and Physics, vol. 54. Bombay: Tata Institute of Fundamental Research. II.
Bayen F, Flato M, Fronsdal C, Lichnerowicz A, and Sternheimer D (1978) Deformation theory and quantization, I and II. Annals of Physics 111: 61–151.
Bingener J (1987) Lokale Modulräume in der analytischen Geometrie, Band 1 und 2, Aspekte der Mathematik. Braunschweig: Vieweg.
Drinfel'd VG (1986) Quantum Groups, Proc. of the ICM, 1986, pp. 798–820.
Faddeev–Reshetikhin–Takhtjan (1990) Quantization of Lie groups and Lie algebras. Leningrad Mathematical Journal 1(1): 193–225.
Fedosov B (1996) Deformation Quantization and Index Theory. Berlin: Akademie-Verlag.
Gerstenhaber M and Schack S (1986) Algebraic cohomology and deformation theory. Deformation theory of algebras and structures and applications, Nato Advanced Study Institute, Castelvecchio-Pascoli/Italy. NATO ASI series, C247: 11–264.
Goldman WM and Millson JJ (1988) The Deformation Theory of Representations of Fundamental Groups of Compact Kähler Manifolds, Publ. Math. Inst. Hautes Étud. Sci, vol. 67, pp. 43–96.
Grauert H and Remmert R (1984) Coherent Analytic Sheaves. Grundlehren der Mathematischen Wissenschaften, vol. 265. Berlin: Springer.
Greuel G-M (1992) Deformation und Klassifikation von Singularitäten und Moduln, Jahresbericht der DMV, Jubiläumstag., 100 Jahre DMV, Bremen/Dtschl. 1990, 177–238.
Hartshorne R (1977) Algebraic Geometry. Graduate Texts in Mathematics, vol. 52. Berlin: Springer.
Kassel Ch (1995) Quantum Groups, Graduate Texts in Mathematics, vol. 155. New York: Springer.
Kontsevich M (2003) Deformation quantization of Poisson manifolds. Letters in Mathematical Physics 66(3): 157–216.
Kuranishi M (1971) Deformations of Compact Complex Manifolds, Séminaire de Mathématiques Supérieures. Été 1969. Montreal: Les Presses de l'Université de Montreal.
Manin (1988) Quantum Groups and Non-Commutative Geometry. Montreal: Les Publications CRM.
Palamodov VP (1990) Deformation of complex spaces. In: Gindikin SG and Khenkin GM (eds.) Several Complex Variables IV, Encyclopaedia of Mathematical Sciences, vol. 10. Berlin: Springer.
Schlessinger M (1968) Functors of Artin rings. Transactions of the American Mathematical Society 130: 208–222.
Wess–Zumino (1990) Covariant differential calculus on the quantum hyperplane. Nuclear Physics B, Proceedings Supplement 18B: 302–312.

Deformations of the Poisson Bracket on a Symplectic Manifold

S Gutt, Université Libre de Bruxelles, Brussels, Belgium
S Waldmann, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction to Deformation Quantization

The framework of classical mechanics, in its Hamiltonian formulation on the motion space, employs a symplectic manifold (or more generally a Poisson manifold). Observables are families of smooth functions on that manifold $M$. The dynamics is defined in terms of a Hamiltonian $H \in C^\infty(M)$ and the time evolution of an observable $f_t \in C^\infty(M \times \mathbb{R})$ is governed by the equation $(d/dt) f_t = \{H, f_t\}$.

The quantum-mechanical framework, in its usual Heisenberg formulation, employs a Hilbert space (states are rays in that space). Observables are families of self-adjoint operators on the Hilbert space. The dynamics is defined in terms of a Hamiltonian $H$, which is a self-adjoint operator, and the time evolution of an observable $A_t$ is governed by the equation $dA_t/dt = (i/\hbar)[H, A_t]$.

Quantization of a classical system is a way to pass from classical to quantum results. A first idea for quantization is to define a correspondence $Q : f \mapsto Q(f)$ mapping a function $f$ to a self-adjoint operator $Q(f)$ on a Hilbert space $\mathcal{H}$ in such a way that $Q(1) = \mathrm{Id}$ and $[Q(f), Q(g)] = i\hbar\, Q(\{f, g\})$. Unfortunately, there is no such correspondence defined on all smooth functions on $M$ when one imposes an irreducibility requirement (which is necessary not to violate Heisenberg's principle).

Different mathematical treatments of quantization have appeared:

• Geometric quantization of Kostant and Souriau: first, prequantization of a symplectic manifold $(M, \omega)$ where one builds a Hilbert space and a correspondence $Q$ defined on all smooth functions on $M$ but with no irreducibility; second, polarization to "cut down the number of variables."
• Berezin's quantization where one builds on a particular class of symplectic manifolds (some

Kähler manifolds) a family of associative algebras using a symbolic calculus, that is, a dequantization procedure.
• Deformation quantization introduced by Flato, Lichnerowicz, and Sternheimer in 1976 where they "suggest that quantization be understood as a deformation of the structure of the algebra of classical observables rather than a radical change in the nature of the observables."

This deformation approach to quantization is part of a general deformation approach to physics (a seminal idea stressed by Flato): one looks at some level of a theory in physics as a deformation of another level.

Deformation quantization is defined in terms of a star product which is a formal deformation of the algebraic structure of the space of smooth functions on a Poisson manifold. The associative structure given by the usual product of functions and the Lie structure given by the Poisson bracket are simultaneously deformed.

In this article we concentrate on some mathematical results concerning deformations of the Poisson bracket on a symplectic manifold, classification of star products on symplectic manifolds, group actions on star products, convergence properties of some star products, and star products on cotangent bundles.

Deformations of the Poisson Bracket on a Symplectic Manifold

Definition 1 A Poisson bracket defined on the space of smooth functions on a manifold $M$ is an $\mathbb{R}$-bilinear map on $C^\infty(M)$, $(u, v) \mapsto \{u, v\}$ such that for any $u, v, w \in C^\infty(M)$:
(i) $\{u, v\} = -\{v, u\}$;
(ii) $\{\{u, v\}, w\} + \{\{v, w\}, u\} + \{\{w, u\}, v\} = 0$;
(iii) $\{u, vw\} = \{u, v\}w + \{u, w\}v$.

A Poisson bracket is given in terms of a contravariant skew-symmetric 2-tensor $P$ on $M$ (called the Poisson tensor) by $\{u, v\} = P(du \wedge dv)$. The Jacobi identity for the Poisson bracket is equivalent to the vanishing of the Schouten bracket $[P, P] = 0$. (The Schouten bracket is the extension, as a graded derivation for the exterior product, of the bracket of vector fields to skew-symmetric contravariant tensor fields.) A Poisson manifold $(M, P)$ is a manifold $M$ with a Poisson bracket defined by $P$.

A particular class of Poisson manifolds, essential in classical mechanics, is the class of "symplectic manifolds." If $(M, \omega)$ is a symplectic manifold (i.e., $\omega$ is a closed nondegenerate 2-form on $M$) and if $u, v \in C^\infty(M)$, the Poisson bracket of $u$ and $v$ is

$$\{u, v\} := X_u(v) = \omega(X_v, X_u)$$

where $X_u$ denotes the Hamiltonian vector field corresponding to the function $u$, that is, such that $i(X_u)\omega = du$. In coordinates the components of the Poisson tensor $P^{ij}$ form the inverse matrix of the components $\omega_{ij}$ of $\omega$.

Duals of Lie algebras form the class of linear Poisson manifolds. If $\mathfrak{g}$ is a Lie algebra, then its dual $\mathfrak{g}^*$ is endowed with the Poisson tensor $P$ defined by $P_\xi(X, Y) := \xi([X, Y])$ for $X, Y \in \mathfrak{g} = (\mathfrak{g}^*)^* \sim (T_\xi \mathfrak{g}^*)^*$.

Definition 2 A Poisson deformation of the Poisson bracket on a Poisson manifold $(M, P)$ is a Lie algebra deformation of $(C^\infty(M), \{\,,\})$ which is a derivation in each argument, that is, of the form $\{u, v\}_\nu = P_\nu(du, dv)$, where $P_\nu = P + \sum_{k \ge 1} \nu^k P_k$ is a series of skew-symmetric contravariant 2-tensors on $M$ (such that $[P_\nu, P_\nu] = 0$).

Two Poisson deformations $P_\nu$ and $P'_\nu$ of the Poisson bracket $P$ on a Poisson manifold $(M, P)$ are equivalent if there exists a formal path in the diffeomorphism group of $M$, starting at the identity, that is, a series $T = \exp D = \mathrm{Id} + \sum_j (1/j!) D^j$ for $D = \sum_{r \ge 1} \nu^r D_r$ where the $D_r$ are vector fields on $M$, such that

$$T\{u, v\}_\nu = \{Tu, Tv\}'_\nu$$

where $\{u, v\}_\nu = P_\nu(du, dv)$ and $\{u, v\}'_\nu = P'_\nu(du, dv)$.

Proposition 3 (Flato et al. 1975, Lecomte 1987). On a symplectic manifold $(M, \omega)$, any Poisson deformation of the Poisson bracket corresponds to a series of closed 2-forms on $M$, $\Omega = \omega + \sum_{r > 0} \nu^r \omega_r$, and is given by

$$\{u, v\}_\nu = P^\Omega(du, dv) = -\Omega(X^\Omega_u, X^\Omega_v)$$

with $i(X^\Omega_u)\Omega = du$. The equivalence classes of Poisson deformations of the Poisson bracket $P$ are parametrized by $H^2(M; \mathbb{R})[[\nu]]$.

Poisson deformations are used in classical mechanics to express some constraints on the system. To deal with quantum mechanics, Flato et al. (1976) introduced star products. These give, by skew-symmetrization, Lie deformations of the Poisson bracket.

Definition 4 A "star product" on $(M, P)$ is an $\mathbb{R}[[\nu]]$-bilinear associative product $\star$ on $C^\infty(M)[[\nu]]$ given by

$$u \star v = u \star_\nu v := \sum_{r \ge 0} \nu^r C_r(u, v)$$

for $u, v \in C^\infty(M)$ (we consider here real-valued functions; the results for complex-valued functions are similar), such that $C_0(u, v) = uv$, $C_1(u, v) - C_1(v, u) = \{u, v\}$, $1 \star u = u \star 1 = u$.

When the $C_r$'s are bidifferential operators on $M$, one speaks of a differential star product. When each $C_r$ is a bidifferential operator of order at most $r$ in each argument, one speaks of a natural star product.

One finds in the literature other normalizations for the skew-symmetric part of $C_1$ such as $(i/2)\{\,,\}$; these amount to a rescaling of the parameter $\nu$. For physical applications, in the above convention for the formal parameter, $\nu$ corresponds to $i\hbar$, where $\hbar$ is Planck's constant.

In the case of complex-valued functions, one can add the further requirement that the complex conjugation is a $\star$-involution for $\star$, that is, $\overline{f \star g} = \bar{g} \star \bar{f}$. According to the interpretation of $\nu$ as being $i\hbar$, we have to require $\bar{\nu} = -\nu$. Star products satisfying this additional property are called symmetric or Hermitian.

A star product can also be defined not on the whole of $C^\infty(M)$ but on a subspace $N$ which is stable under pointwise multiplication and Poisson bracket.

The simplest example of a deformation quantization is the Moyal product for the Poisson structure $P$ on a vector space $V = \mathbb{R}^m$ with constant coefficients:

$$P = \sum_{i,j} P^{ij} \partial_i \wedge \partial_j, \qquad P^{ij} = -P^{ji} \in \mathbb{R}$$

where $\partial_i = \partial/\partial y^i$ is the partial derivative in the direction of the coordinate $y^i$, $i = 1, \ldots, m$. The formula for the Moyal product is

$$(u \star_M v)(z) = \exp\left(\frac{\nu}{2} P^{rs} \partial_{y^r} \partial_{y'^s}\right) (u(y) v(y')) \Big|_{y = y' = z}$$   [1]

When $P$ is nondegenerate (so $V = \mathbb{R}^{2n}$), the space of formal power series of polynomials on $V$ with Moyal product is called the formal Weyl algebra $W = (S(V^*)[[\nu]], \star_M)$.

Let $\mathfrak{g}^*$ be the dual of a Lie algebra $\mathfrak{g}$. The algebra of polynomials on $\mathfrak{g}^*$ is identified with the symmetric algebra $S(\mathfrak{g})$. One defines a new associative law on this algebra by a transfer of the product $\circ$ in the universal enveloping algebra $U(\mathfrak{g})$, via the bijection between $S(\mathfrak{g})$ and $U(\mathfrak{g})$ given by the total symmetrization $\sigma$:

$$\sigma : S(\mathfrak{g}) \to U(\mathfrak{g}) : X_1 \cdots X_k \mapsto \frac{1}{k!} \sum_{\rho \in S_k} X_{\rho(1)} \circ \cdots \circ X_{\rho(k)}$$

Then $U(\mathfrak{g}) = \oplus_{n \ge 0} U_n$, where $U_n := \sigma(S^n(\mathfrak{g}))$, and we decompose an element $u \in U(\mathfrak{g})$ accordingly, $u = \sum u_n$. We define, for $P \in S^p(\mathfrak{g})$ and $Q \in S^q(\mathfrak{g})$,

$$P \star Q = \sum_{n \ge 0} (\nu)^n \sigma^{-1}\big((\sigma(P) \circ \sigma(Q))_{p+q-n}\big)$$   [2]

This yields a differential star product on $\mathfrak{g}^*$ (Gutt 1983). This star product can be written with an integral formula (for $\nu = 2\pi i$) (Drinfeld 1987):

$$u \star v(\xi) = \int_{\mathfrak{g} \times \mathfrak{g}} \hat{u}(X) \hat{v}(Y) e^{2i\pi \langle \xi, \mathrm{CBH}(X, Y) \rangle} \, dX \, dY$$

where $\hat{u}(X) = \int_{\mathfrak{g}^*} u(\eta) e^{-2i\pi \langle \eta, X \rangle} \, d\eta$ and CBH denotes the Campbell–Baker–Hausdorff formula for the product of elements in the group in a logarithmic chart ($\exp X \exp Y = \exp \mathrm{CBH}(X, Y)$ for all $X, Y \in \mathfrak{g}$). We call this the standard (or CBH) star product on the dual of a Lie algebra.

De Wilde and Lecomte (1983) proved that on any symplectic manifold there exists a differential star product. Fedosov (1994) gave a recursive construction of a star product on a symplectic manifold $(M, \omega)$ constructing flat connections on the Weyl bundle. Omori et al. (1991) gave an alternative proof of existence of a differential star product on a symplectic manifold, gluing local Moyal star products. In 1997, Kontsevich gave a proof of the existence of a star product on any Poisson manifold and gave an explicit formula for a star product for any Poisson structure on $V = \mathbb{R}^m$. This appeared as a consequence of the proof of his formality theorem.

Fedosov's Construction of Star Products

Fedosov's construction gives a star product on a symplectic manifold $(M, \omega)$, when one has chosen a symplectic connection and a sequence of closed 2-forms on $M$. The star product is obtained by identifying the space $C^\infty(M)[[\nu]]$ with an algebra of flat sections of the so-called Weyl bundle endowed with a flat connection whose construction is related to the choice of the sequence of closed 2-forms on $M$.

Definition 5 The symplectic group $Sp(n, \mathbb{R})$ acts by automorphisms on the formal Weyl algebra $W$. If $(M, \omega)$ is a symplectic manifold, we can form its bundle $F(M)$ of symplectic frames which is a principal $Sp(n, \mathbb{R})$-bundle over $M$. The associated bundle $\mathcal{W} = F(M) \times_{Sp(n, \mathbb{R})} W$ is a bundle of associative algebras on $M$ called the Weyl bundle. Sections of the Weyl bundle have the form of formal series

$$a(x, y, \nu) = \sum_{2k + l \ge 0} \nu^k a_{(k) i_1 \ldots i_l}(x) y^{i_1} \cdots y^{i_l}$$

where the coefficients $a_{(k)}$ are symmetric covariant $l$-tensor fields on $M$. The product of two sections taken pointwise makes the space of sections into an algebra, and in terms of the above representation of sections the multiplication has the form

$$(a \circ b)(x, y, \nu) = \exp\left(\frac{\nu}{2} P^{ij} \frac{\partial}{\partial y^i} \frac{\partial}{\partial z^j}\right) a(x, y, \nu) b(x, z, \nu) \Big|_{y = z}$$

Note that the center of this algebra coincides with $C^\infty(M)[[\nu]]$.

A symplectic connection on $M$ is a linear torsion-free connection $\nabla$ such that $\nabla \omega = 0$.

Remark 6 It is well known that such connections always exist but, unlike the Riemannian case, are not unique. To see the existence, take any torsion-free connection $\nabla^0$ and set $T(X, Y, Z) = (\nabla^0_X \omega)(Y, Z)$. Define $S$ by $\omega(S(X, Y), Z) = (1/3)(T(X, Y, Z) + T(Y, X, Z))$; then $\nabla_X Y = \nabla^0_X Y + S(X, Y)$ defines a symplectic connection.

The connection $\nabla$ induces a covariant derivative on sections of the Weyl bundle, denoted $\partial$. The idea is to try to modify it to have zero curvature. Consider $Da = \partial a - \delta(a) - (1/\nu)[r, a]$, where $r$ is a 1-form with values in $\mathcal{W}$, with $[a, a' dx] = (a \circ a' - a' \circ a) dx$ and $\delta(a) = (1/\nu)[\sum_{ij} \omega_{ij} y^i dx^j, a]$.

Theorem 7 (Fedosov 1994). For a given series $\Omega = \sum_{i \ge 1} \nu^i \omega_i$ of closed 2-forms on $M$, there is a unique $r \in \Gamma(\mathcal{W} \otimes \Lambda^1)$ satisfying some normalization condition, so that $Da = \partial a - \delta(a) - (1/\nu)[r, a]$ is flat.
For any $a_0 \in C^\infty(M)[[\nu]]$, there is a unique $a$ in the subspace $\mathcal{W}_D$ of flat sections of $\mathcal{W}$, such that $a(x, 0, \nu) = a_0(x, \nu)$. The use of this linear isomorphism to transport the algebra structure of $\mathcal{W}_D$ to $C^\infty(M)[[\nu]]$ defines the star product of Fedosov $\star_{\nabla, \Omega}$.
Writing $\star_{\nabla, \Omega} = \sum_{r \ge 0} \nu^r C^\Omega_r$, $C^\Omega_r$ only depends on $\omega_i$ for $i < r$ and $C^\Omega_{r+1}(u, v) = c\, \omega_r(X_u, X_v) + \tilde{C}_{r+1}(u, v)$, where $c \in \mathbb{R}$ and the last term does not depend on $\omega_r$.

Classification of Star Products on a Symplectic Manifold

Star products on a manifold $M$ are examples of deformations of associative algebras (in the sense of Gerstenhaber). Their study uses the Hochschild cohomology of the algebra (here $C^\infty(M)$ with values in $C^\infty(M)$) where $p$-cochains are $p$-linear maps from $(C^\infty(M))^p$ to $C^\infty(M)$ and where the Hochschild coboundary operator maps the $p$-cochain $C$ to the $(p + 1)$-cochain

$$(\partial C)(u_0, \ldots, u_p) = u_0 C(u_1, \ldots, u_p) + \sum_{r=1}^{p} (-1)^r C(u_0, \ldots, u_{r-1} u_r, \ldots, u_p) + (-1)^{p+1} C(u_0, \ldots, u_{p-1}) u_p$$

For differential star products, we consider differential cochains given by differential operators on each argument. The associativity condition for a star product at order $k$ in the parameter $\nu$ reads

$$(\partial C_k)(u, v, w) = \sum_{r + s = k,\; r, s > 0} \big( C_r(C_s(u, v), w) - C_r(u, C_s(v, w)) \big)$$

If one has cochains $C_j$, $j < k$, such that the star product they define is associative to order $k - 1$, then the right-hand side above is a cocycle ($\partial(\mathrm{RHS}) = 0$) and one can extend the star product to order $k$ if it is a coboundary ($\mathrm{RHS} = \partial(C_k)$).

Denoting by $m$ the usual multiplication of functions, and writing $\star = m + C$, where $C$ is a formal series of multidifferential operators, the associativity also reads $\partial C = [C, C]$ where the bracket on the right-hand side is the graded Lie algebra bracket on $D_{\mathrm{poly}}(M)[[\nu]] = \{\text{multidifferential operators}\}$.

Theorem 8 (Vey 1975). Every differential $p$-cocycle $C$ on a manifold $M$ is the sum of the coboundary of a differential $(p - 1)$-cochain and a 1-differential skew-symmetric $p$-cocycle $A$: $C = \partial B + A$. In particular, a cocycle is a coboundary if and only if its total skew-symmetrization, which is automatically 1-differential in each argument, vanishes. Given a connection $\nabla$ on $M$, $B$ can be defined from $C$ by universal formulas (Cahen and Gutt 1982). Also

$$H^p_{\mathrm{diff}}(C^\infty(M), C^\infty(M)) = \Gamma(\Lambda^p TM)$$

The similar result about continuous cochains is due to Connes (1985). In the somewhat pathological case of completely general cochains, the full cohomology is not known.

Definition 9 Two star products $\star$ and $\star'$ on $(M, P)$ are said to be equivalent if there is a series of linear operators on $C^\infty(M)$, $T = \mathrm{Id} + \sum_{r=1}^{\infty} \nu^r T_r$, such that

$$T(f \star g) = Tf \star' Tg$$   [3]

Remark that the $T_r$ automatically vanish on constants since 1 is a unit for $\star$ and for $\star'$.

If $\star$ and $\star'$ are equivalent differential star products, then the equivalence is given by differential operators $T_r$; if they are natural, the equivalence is given by $T = \mathrm{Exp}\, E$ with $E = \sum_{r=1}^{\infty} \nu^r E_r$, where the $E_r$ are differential operators of order at most $r + 1$.

Nest and Tsygan (1995), then Deligne (1995) and Bertelson et al. (1995, 1997) proved that any differential star product on a symplectic manifold $(M, \omega)$ is equivalent to a Fedosov star product and that its equivalence class is parametrized by the corresponding element in $H^2(M; \mathbb{R})[[\nu]]$.
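The defining properties of a star product and the order-by-order associativity condition can be checked by hand on the simplest example, the Moyal product on $\mathbb{R}^2$ restricted to polynomials, where the exponential series terminates. The sketch below is illustrative only: it assumes coordinates $(q, p)$ with $\{q, p\} = 1$, stores polynomials in $q$, $p$, $\nu$ as dictionaries, and all function names (`moyal`, `bracket`, `C`, `coboundary_C2`) are hypothetical helpers introduced here, not notation from the article.

```python
from fractions import Fraction
from math import comb, factorial

# Polynomials in q, p and the formal parameter nu are stored as dicts
# {(a, b, n): coeff} standing for coeff * q**a * p**b * nu**n.

def add(u, v, sign=1):
    """Return u + sign*v, dropping zero coefficients."""
    out = dict(u)
    for key, c in v.items():
        total = out.get(key, 0) + sign * c
        if total == 0:
            out.pop(key, None)
        else:
            out[key] = total
    return out

def mul(u, v):
    """Pointwise (commutative) product."""
    out = {}
    for (a, b, n), c in u.items():
        for (a2, b2, n2), c2 in v.items():
            key = (a + a2, b + b2, n + n2)
            out[key] = out.get(key, 0) + c * c2
    return {k: c for k, c in out.items() if c != 0}

def diff(u, dq, dp):
    """Apply (d/dq)**dq (d/dp)**dp."""
    out = {}
    for (a, b, n), c in u.items():
        if a >= dq and b >= dp:
            falling = (factorial(a) // factorial(a - dq)) * (factorial(b) // factorial(b - dp))
            out[(a - dq, b - dp, n)] = c * falling
    return out

def degree(u):
    """Total degree in q and p (the power of nu is ignored)."""
    return max((a + b for (a, b, _) in u), default=0)

def moyal(u, v):
    """Moyal product for {q, p} = 1; the series stops on polynomials."""
    out = {}
    for r in range(min(degree(u), degree(v)) + 1):
        for s in range(r + 1):
            coeff = Fraction((-1) ** s * comb(r, s), 2 ** r * factorial(r))
            term = mul(diff(u, r - s, s), diff(v, s, r - s))
            shifted = {(a, b, n + r): coeff * c for (a, b, n), c in term.items()}
            out = add(out, shifted)
    return out

def bracket(u, v):
    """Poisson bracket {u, v} = u_q v_p - u_p v_q."""
    return add(mul(diff(u, 1, 0), diff(v, 0, 1)),
               mul(diff(u, 0, 1), diff(v, 1, 0)), sign=-1)

def C(r, u, v):
    """Coefficient C_r(u, v) of nu**r in the product, for nu-free u and v."""
    return {(a, b, 0): c for (a, b, n), c in moyal(u, v).items() if n == r}

def coboundary_C2(u, v, w):
    """Hochschild coboundary of C_2 evaluated on (u, v, w)."""
    out = mul(u, C(2, v, w))
    out = add(out, C(2, mul(u, v), w), sign=-1)
    out = add(out, C(2, u, mul(v, w)))
    return add(out, mul(C(2, u, v), w), sign=-1)

q = {(1, 0, 0): Fraction(1)}
p = {(0, 1, 0): Fraction(1)}
u, v, w = mul(mul(q, q), p), mul(q, mul(p, p)), add(mul(p, p), q)

# Order-nu part of the star commutator, to compare with {u, v}.
commutator = add(moyal(u, v), moyal(v, u), sign=-1)
order_one = {(a, b, 0): c for (a, b, n), c in commutator.items() if n == 1}
# Right-hand side of the associativity condition at order k = 2.
rhs = add(C(1, C(1, u, v), w), C(1, u, C(1, v, w)), sign=-1)
```

With these sample polynomials one can compare `order_one` with `bracket(u, v)`, compare `moyal(moyal(u, v), w)` with `moyal(u, moyal(v, w))`, and compare `coboundary_C2(u, v, w)` with `rhs`. Exact rational arithmetic is used so that the comparisons are equalities of dictionaries rather than floating-point approximations.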

Kontsevich (IHES preprint 97) proved that the coincidence of the set of equivalence classes of star and Poisson deformations is true for general Poisson manifolds:

Theorem 10 (Kontsevich). The set of equivalence classes of differential star products on a Poisson manifold $(M, P)$ can be naturally identified with the set of equivalence classes of Poisson deformations of $P$: $P_\nu = P\nu + P_2 \nu^2 + \cdots \in \nu\Gamma(X, \wedge^2 T_X)[[\nu]]$, $[P_\nu, P_\nu] = 0$.

Deligne (1995) defines cohomological classes associated to differential star products on a symplectic manifold; this leads to an intrinsic way to parametrize the equivalence class of such a differential star product. The characteristic class $c(\star)$ is given in terms of the skew-symmetric part of the term of order 2 in $\nu$ in the star product and in terms of local ("$\nu$-Euler") derivations of the form $D = \nu(\partial/\partial\nu) + X + \sum_{r \ge 1} \nu^r D'_r$. This characteristic class has the following properties:

• The map $C$ from equivalence classes of star products on $(M, \omega)$ to the affine space $[\omega]/\nu + H^2(M; \mathbb{R})[[\nu]]$ mapping $[\star]$ to $c(\star)$ is a bijection.
• The characteristic class is natural relative to diffeomorphisms and is equivariant under a change of parameter (Gutt and Rawnsley 1995).
• The characteristic class $c(\star)$ coincides (cf. Deligne (1995) and Neumaier (1999)) for Fedosov-type star products with their characteristic class introduced by Fedosov as the de Rham class of the curvature of the generalized connection used to build them (up to a sign and factors of 2).

Index theory has been introduced in the framework of deformation quantization by Fedosov (1996) and by Nest and Tsygan (1995, 1996). We refer to the papers of Bressler, Nest, and Tsygan for further developments in that subject. A first tool in that theory is the existence of a trace for the deformed algebra; this trace is essentially unique in the framework of symplectic manifolds (an elementary proof is given in Karabegov (1998) and Gutt and Rawnsley (2003)); the trace is not unique for more general Poisson manifolds.

Definition 11 A homomorphism from a differential star product $\star$ on $(M, P)$ to a differential star product $\star'$ on $(M', P')$ is an $\mathbb{R}$-linear map $A : C^\infty(M)[[\nu]] \to C^\infty(M')[[\nu]]$, continuous in the $\nu$-adic topology, such that

$$A(u \star v) = Au \star' Av$$

It is an isomorphism if the map is bijective.

Any isomorphism between two differential star products on symplectic manifolds is the combination of a change of parameter and a $\nu$-linear isomorphism. Any $\nu$-linear isomorphism between two star products $\star$ on $(M, \omega)$ and $\star'$ on $(M', \omega')$ is the combination of the action on functions of a symplectomorphism $\psi : M' \to M$ and an equivalence between $\star$ and the pullback via $\psi$ of $\star'$. It exists if and only if those two star products are equivalent, that is, if and only if $(\psi^{-1})^* c(\star') = c(\star)$, where $(\psi^{-1})^*$ denotes the action of $\psi^{-1}$ on the second de Rham cohomology space. In particular, a symplectomorphism $\psi$ of a symplectic manifold can be extended to a $\nu$-linear automorphism of a given differential star product on $(M, \omega)$ if and only if $(\psi)^* c(\star) = c(\star)$ (Gutt and Rawnsley 1999).

The notion of homomorphism and its relation to modules has been studied by Bordemann (2004).

The link between the notion of star product on a symplectic manifold and symplectic connections already appears in the seminal paper of Bayen et al. (1978), and was further developed by Lichnerowicz (1982), who showed that any Vey star product (i.e., a star product defined by bidifferential operators whose principal symbols at each order coincide with those of the Moyal star product) determines a unique symplectic connection. Fedosov's construction yields a Vey star product on any symplectic manifold starting from a symplectic connection and a formal series of closed 2-forms on the manifold. Furthermore, any star product is equivalent to a Fedosov star product and the de Rham class of the formal 2-form determines the equivalence class of the star product. On the other hand, many star products which appear in natural contexts (e.g., cotangent bundles or Kähler manifolds) are not Vey star products but are natural star products.

Theorem 12 (Gutt and Rawnsley 2004). Any natural star product on a symplectic manifold $(M, \omega)$ determines uniquely
(i) A symplectic connection $\nabla = \nabla(\star)$.
(ii) A formal series of closed 2-forms $\Omega = \Omega(\star) \in \nu\Lambda^2(M)[[\nu]]$.
(iii) A formal series $E = \sum_{r \ge 1} \nu^r E_r$ of differential operators of order $\le r + 1$ ($E_2$ of order $\le 2$), with $E_r u = \sum_{k=2}^{r+1} (E^{(k)}_r)^{i_1 \ldots i_k} \nabla^k_{i_1 \ldots i_k} u$, where the $E^{(k)}_r$ are symmetric contravariant $k$-tensor fields
such that

$$u \star v = \exp(-E) \big( (\exp E\, u) \star_{\nabla, \Omega} (\exp E\, v) \big)$$   [4]

We denote $\star = \star_{\nabla, \Omega, E}$. If $\psi$ is a diffeomorphism of $M$ then the data for $\psi \cdot \star$ is $\psi \cdot \nabla$, $\psi \cdot \Omega$, and $\psi \cdot E$. In

particular, a vector field $X$ is a derivation of a natural star product $\star$ if and only if $L_X \omega = 0$, $L_X \Omega = 0$, $L_X \nabla = 0$, and $L_X E = 0$.

Group Actions on Star Products

Symmetries in quantum theories are automorphisms of an algebra of observables. In the framework where quantization is defined in terms of a star product, a symmetry $\sigma$ of a star product $\star$ is an automorphism of the $\mathbb{R}[[\nu]]$-algebra $C^\infty(M)[[\nu]]$ with multiplication given by $\star$:

$$\sigma(u \star v) = \sigma(u) \star \sigma(v), \qquad \sigma(1) = 1$$

where $\sigma$, being determined by what it does on $C^\infty(M)$, will be a formal series $\sigma(u) = \sum_{r \ge 0} \nu^r \sigma_r(u)$ of linear maps $\sigma_r : C^\infty(M) \to C^\infty(M)$. We denote by $\mathrm{Aut}_{\mathbb{R}[[\nu]]}(M, \star)$ the set of those symmetries.

Any such automorphism $\sigma$ of $\star$ then can be written as $\sigma(u) = T(u \circ \tau^{-1})$, where $\tau$ is a Poisson diffeomorphism of $(M, P)$ and $T = \mathrm{Id} + \sum_{r \ge 1} \nu^r T_r$ is a formal series of linear maps. If $\star$ is differential, then the $T_r$ are differential operators; if $\star$ is natural, then $T = \mathrm{Exp}\, E$ with $E = \sum_{r \ge 1} \nu^r E_r$ and $E_r$ is a differential operator of order at most $r + 1$.

If $\sigma_t$ is a one-parameter group of symmetries of the star product $\star$, then its generator $D$ will be a derivation of $\star$. Denote the Lie algebra of $\nu$-linear derivations of $\star$ by $\mathrm{Der}_{\mathbb{R}[[\nu]]}(M, \star)$.

An action of a Lie group $G$ on a star product $\star$ on a Poisson manifold $(M, P)$ is a homomorphism $\sigma : G \to \mathrm{Aut}_{\mathbb{R}[[\nu]]}(M, \star)$; then $\sigma_g = (\tau_g^{-1})^* + O(\nu)$ and there is an induced Poisson action $\tau$ of $G$ on $(M, P)$. Given a Poisson action $\tau$ of $G$ on $(M, P)$, a star product is said to be "invariant" under $G$ if all the $(\tau_g^{-1})^*$ are automorphisms of $\star$.

An action of a Lie group $G$ on $\star$ induces a homomorphism of Lie algebras $D : \mathfrak{g} \to \mathrm{Der}_{\mathbb{R}[[\nu]]}(M, \star)$. For each $\xi \in \mathfrak{g}$, $D_\xi = \xi^* + \sum_{r \ge 1} \nu^r D^r_\xi$, where $\xi^*$ is the fundamental vector field on $M$ defined by $\xi$; hence,

$$\xi^*(x) = \frac{d}{dt}\Big|_0 \tau(\exp(-t\xi))\, x$$

Such a homomorphism $D : \mathfrak{g} \to \mathrm{Der}_{\mathbb{R}[[\nu]]}(M, \star)$ is called an action of the Lie algebra $\mathfrak{g}$ on $\star$.

Proposition 13 (Arnal et al. 1983). Given $D : \mathfrak{g} \to \mathrm{Der}_{\mathbb{R}[[\nu]]}(M, \star)$ a homomorphism so that for each $\xi \in \mathfrak{g}$, $D_\xi = \xi^* + \sum_{r \ge 1} \nu^r D^r_\xi$, where the $\xi^*$ are the fundamental vector fields on $M$ defined by an action of $G$ on $M$ and the $D^r_\xi$ are differential operators, then there exists a local homomorphism $\sigma : U \subset G \to \mathrm{Aut}_{\mathbb{R}[[\nu]]}(M, \star)$ so that $\sigma_* = D$.

If we want the analog in our framework to the requirement that operators should correspond to the infinitesimal actions of a Lie algebra, we should ask the derivations to be inner so that functions are associated to the elements of the Lie algebra.

A derivation $D \in \mathrm{Der}_{\mathbb{R}[[\nu]]}(M, \star)$ is said to be essentially inner or Hamiltonian if $D = (1/\nu)\, \mathrm{ad}_\star u$ for some $u \in C^\infty(M)[[\nu]]$. We call an action of a Lie group almost $\star$-Hamiltonian if each $D_\xi$ is essentially inner; this is equivalent to the knowledge of a linear map $\lambda : \mathfrak{g} \to C^\infty(M)[[\nu]]$, $\xi \mapsto \lambda_\xi$, so that $\mathrm{ad}_\star\, (1/\nu)[\lambda_\xi, \lambda_\eta]_\star = \mathrm{ad}_\star \lambda_{[\xi, \eta]}$.

We say the action is $\star$-Hamiltonian if $\lambda$ can be chosen to make

$$\lambda : \mathfrak{g} \to C^\infty(M)[[\nu]], \qquad \xi \mapsto \lambda_\xi$$

a homomorphism of Lie algebras, where $C^\infty(M)[[\nu]]$ is endowed with the bracket $(1/\nu)[\,,]_\star$. Such a homomorphism is called a quantization in Arnal et al. (1983) and is called a generalized moment map in Bordemann et al. (1998).

When a map $\lambda_0 : \mathfrak{g} \to C^\infty(M)$ is a generalized moment map, that is,

$$\frac{1}{\nu} \big( (\lambda_0)_\xi \star (\lambda_0)_\eta - (\lambda_0)_\eta \star (\lambda_0)_\xi \big) = (\lambda_0)_{[\xi, \eta]}$$

the star product is said to be covariant under $\mathfrak{g}$.

When a map $\lambda : \mathfrak{g} \to C^\infty(M)[[\nu]]$ is a generalized moment map, so that $D_\xi$ has no terms in $\nu$ of degree $> 0$, thus $D_\xi = \xi^*$, this map is called a quantum moment map (Xu 1998). Clearly in that situation, the star product is invariant under the action of $\mathfrak{g}$ on $M$.

Covariant star products have been considered to study the representation theory of some classes of Lie groups in terms of star products. In particular, an autonomous star formulation of the theory of representations of nilpotent Lie groups has been given by Arnal and Cortet (1984, 1985).

Consider a differential star product $\star$ on a symplectic manifold, admitting an algebra $\mathfrak{g}$ of vector fields on $M$ consisting of derivations of $\star$, and assume there is a symplectic connection $\nabla$ which is invariant under $\mathfrak{g}$; then $\star$ is equivalent, through an equivariant equivalence ($T$ with $L_X T = 0$), to a Fedosov star product $\star_{\nabla, \Omega}$; this leads to a classification of such invariant star products (Bertelson et al. 1998).

Proposition 14 (Kravchenko, Gutt and Rawnsley, Müller-Bahns, Neumaier, and Hamachi). Consider a Fedosov star product $\star_{\nabla, \Omega}$ on a symplectic manifold. A vector field $X$ is a derivation of $\star_{\nabla, \Omega}$ if and only if $L_X \omega = 0$, $L_X \Omega = 0$, and $L_X \nabla = 0$. A vector field $X$ is an inner derivation of $\star = \star_{\nabla, \Omega}$ if

and only if $L_X \nabla = 0$ and there exists a series of functions $\lambda_X$ such that

$$i(X)\omega - i(X)\Omega = d\lambda_X$$

In this case $X(u) = (1/\nu)(\mathrm{ad}_\star \lambda_X)(u)$.

On a symplectic manifold $(M, \omega)$, a vector field $X$ is an inner derivation of the natural star product $\star = \star_{\nabla, \Omega, E}$ if and only if $L_X \nabla = 0$, $L_X E = 0$, and there exists a series of functions $\lambda_X$ such that

$$i(X)\omega - i(X)\Omega = d\lambda_X$$

Then $X = (1/\nu)\, \mathrm{ad}_\star \mu_X$ with $\mu_X = \mathrm{Exp}(E^{-1}) \lambda_X$.

Let $G$ be a compact Lie group of symplectomorphisms of $(M, \omega)$ and $\mathfrak{g}$ the corresponding Lie algebra of symplectic vector fields on $M$. Consider a star product $\star$ on $M$ which is invariant under $G$. The Lie algebra $\mathfrak{g}$ consists of inner derivations for $\star$ if and only if there exists a series of functions $\lambda_X$ and a representative $(1/\nu)(\omega - \Omega)$ of the characteristic class of $\star$ such that $i(X)\omega - i(X)\Omega = d\lambda_X$.

Star products which are invariant and covariant are used in the problem of reduction: this is a device in symplectic geometry which allows one to reduce the number of variables. An important issue in quantization is to know if and how quantization commutes with reduction. This problem has been studied by Fedosov for the action of a compact group on the particular star products constructed by him with trivial characteristic class ($\star_{\nabla, 0}$). Here, one indeed obtains some "quantization commutes with reduction" statements. More generally, Bordemann, Herbig, and Waldmann considered covariant star products. In this case, one can construct a classical and quantum BRST complex whose cohomology describes the algebra of observables for the reduced system. While this is well known classically (at least under some regularity assumptions on the group action), for the quantized situation the nontrivial question is whether the quantum BRST cohomology is "as large as" the classical one. Clearly, from the physical point of view, this is crucial. It turns out that whereas for strongly invariant star products one indeed obtains a quantization of the reduced phase space, in general the quantum BRST cohomology might be too small. More general situations of reduction have also been discussed by, for example, Bordemann as well as Cattaneo and Felder, when a coisotropic (i.e., first class) constraint manifold is given.

Convergence of Some Star Products on a Subclass of Functions

Let $(M, P)$ be a Poisson manifold and let $\star$ be a differential star product on it with 1 acting as the identity. Observe that if there exists a value $k$ of $\nu$ such that

$$u \star v = \sum_{r=0}^{\infty} \nu^r C_r(u, v)$$

converges (for the pointwise convergence of functions), for all $u, v \in C^\infty(M)$, to $F_k(u, v)$ in such a way that $F_k$ is associative, then $F_k(u, v) = uv$. This is easy to see as the order of differentiation in the $C_r$ necessarily is at least $r$ in each argument and thus the Borel lemma immediately gives the result. So assuming "too much" convergence kills all deformations. On the other hand, in any physical situation, one needs some convergence properties to be able to compute the spectrum of quantum observables in terms of a star product (as in Bayen et al. 1978).

In the example of Moyal star product on the symplectic vector space $(\mathbb{R}^{2n}, \omega)$, the formal formula

$$(u \star_M v)(z) = \exp\left(\frac{\nu}{2} P^{rs} \partial_{x^r} \partial_{y^s}\right) (u(x) v(y)) \Big|_{x = y = z}$$

obviously converges when $u$ and $v$ are polynomials. On the other hand, there is an integral formula for Moyal star product given by

$$(u \star v)(\xi) = (\pi\hbar)^{-2n} \int u(\xi') v(\xi'') \exp\left(\frac{2i}{\hbar} \big( \omega(\xi, \xi'') + \omega(\xi'', \xi') + \omega(\xi', \xi) \big)\right) d\xi' \, d\xi''$$

and this product $\star$ gives a structure of associative algebra on the space of rapidly decreasing functions $\mathcal{S}(\mathbb{R}^{2n})$. The formal formula converges (for $\nu = i\hbar$) in the topology of $\mathcal{S}'$ for $u$ and $v$ with compactly supported Fourier transform.

Some works have been done about convergence of star products.

• The method of quantization of Kähler manifolds due to Berezin as the inverse of taking symbols of operators, to construct on Hermitian symmetric spaces star products which are convergent on a large class of functions on the manifold (Moreno, Cahen, Gutt and Rawnsley, Karabegov, Schlichenmaier).

• The constructions of operator representations of star products (Fedosov, Bordemann, Neumaier, and Waldmann).
• The work of Rieffel and the notion of strict deformation quantization. Examples of strict (Fréchet) quantization have been given by Omori, Maeda, Miyazaki, and Yoshioka, and by Bieliavsky.

Convergence of Berezin-Type Star Products on Hermitian Symmetric Spaces

The method to construct a star product involves making a correspondence between operators and functions using coherent states, transferring the operator composition to the symbols, introducing a suitable parameter into this Berezin composition of symbols, taking the asymptotic expansion in this parameter on a large algebra of functions, and then showing that the coefficients of this expansion satisfy the cocycle conditions to define a star product on the smooth functions (Cahen et al. 1995). The idea of an asymptotic expansion appears in Berezin (1975) and in Moreno and Ortega-Navarro (1983, 1986).

This asymptotic expansion exists for compact $M$, and defines an associative multiplication on formal power series in $k^{-1}$ with coefficients in $C^\infty(M)$ for compact coadjoint orbits. For $M$ a Hermitian symmetric space of compact type and more generally for compact coadjoint orbits (i.e., flag manifolds), this formal power series converges on the space of symbols (Karabegov 1998).

Let $A : \mathcal{H} \to \mathcal{H}$ be a bounded linear operator and let

$$\hat{A}(x) = \frac{\langle A e_q, e_q \rangle}{\langle e_q, e_q \rangle}, \qquad q \in \mathcal{L}^x, \; x \in M$$

be its symbol. The function $\hat{A}$ has an analytic continuation to an open neighborhood of the diagonal in $M \times \bar{M}$ given by

$$\hat{A}(x, y) = \frac{\langle A e_{q'}, e_q \rangle}{\langle e_{q'}, e_q \rangle}, \qquad q \in \mathcal{L}^x, \; q' \in \mathcal{L}^y$$

which is holomorphic in $x$ and antiholomorphic in $y$. We denote by $\hat{E}(L)$ the space of symbols of bounded operators on $\mathcal{H}$. We can extend this definition of symbols to some unbounded operators provided everything is well defined.

The composition of operators on $\mathcal{H}$ gives rise to an associative product $*$ for the corresponding symbols:

$$(\hat{A} * \hat{B})(x) = \int_M \hat{A}(x, y) \hat{B}(y, x) \psi(x, y) \epsilon(y) \frac{\omega^n(y)}{n!}$$

where

$$\psi(x, y) = \frac{|\langle e_{q'}, e_q \rangle|^2}{\|e_{q'}\|^2 \|e_q\|^2}, \qquad q \in \mathcal{L}^x, \; q' \in \mathcal{L}^y$$

is a globally defined real analytic function on $M \times M$ provided $\epsilon$ has no zeros ($\psi(x, y) \le 1$ every-
For general Hermitian symmetric spaces of non- where, with equality where the lines spanned by eq
compact type, using their realization as bounded and eq0 coincide).
domains, one defines an analogous algebra of Let k be a positive integer. The bundle (Lk = k L,
symbols of polynomial differential operators. r , hk ) is a quantization bundle for (M, k!, J) and

Reshetikhin and Takhtajan have constructed an we denote by H k the corresponding space of

holomorphic sections and by E(L ^ k ) the space of
associative formal star product given by an asymp-
totic expansion on any Kähler manifold. This they symbols of linear operators on H k . We let (k) be the
do in two steps, first building an associative product corresponding function. We say that the quantiza-
for which 1 is not a unit element, then passing to a tion is regular if (k) is a nonzero constant for all
star product. non-negative k and if (x, y) = 1 implies x = y.
We denote by (L,r, h) a quantization bundle for (Remark that if the quantization is homogeneous,
the Kähler manifold (M, !, J) (i.e., a holomorphic all (k) are constants.)
line bundle L with connection r admitting an Theorem 15 (Cahen et al.). Let (M, !, J) be a
invariant Hermitian structure h, such that the Kähler manifold and (L,r, h) be a regular quan-
curvature is curv(r) =2i!). We denote by H the tization bundle over M. Let A, b B
b be in B , where
Hilbert space of square-integrable holomorphic B  C1 (M) consists of functions f which have an
sections of L which we assume to be nontrivial. analytic continuation in M  M  so that f (x, y)
The coherent states are vectors eq 2 H such that (x, y)l is globally defined, smooth and bounded on
sðxÞ ¼ hs; eq iq; 8q 2 L x ; x 2 M; s 2 H K  M and M  K for each compact subset K of M
for some positive power l. Then
(L is the complement of the zero section in L). The Z
function (x) = jqj2 keq k2 , q 2 L x , is well defined and b k BÞðxÞ
b b yÞBðy;
b xÞ k !n ðyÞ
ðA ¼ Aðx; ðx; yÞ ðkÞ kn ðyÞ
real analytic. M n!

defined for $k$ sufficiently large, admits an asymptotic expansion in $k^{-1}$ as $k \to \infty$

$$(\hat A *_k \hat B)(x) \sim \sum_{r \ge 0} k^{-r}\, C_r(\hat A, \hat B)(x)$$

and the cochains $C_r$ are smooth bidifferential operators, invariant under the automorphisms of the quantization and determined by the geometry alone. Furthermore, $C_0(\hat A, \hat B) = \hat A \hat B$ and $C_1(\hat A, \hat B) - C_1(\hat B, \hat A) = (i/\pi)\{\hat A, \hat B\}$.

If $M$ is a flag manifold, this defines a star product on $C^\infty(M)$ and the $*_k$ product of two symbols is convergent (it is a rational function of $k$ without pole at infinity) (cf. Karabegov in that generality).

If $D$ is a bounded symmetric domain and $E$ the algebra of symbols of polynomial differential operators on a homogeneous holomorphic line bundle $L$ over $D$ which gives a realization of a holomorphic discrete series representation of $G_0$, then for $f$ and $g$ in $E$ the Berezin product $f *_k g$ has an asymptotic expansion in powers of $k^{-1}$ which converges to a rational function of $k$. The coefficients of the asymptotic expansion are bidifferential operators which define an invariant and covariant star product on $C^\infty(D)$.

Star Products on Cotangent Bundles

Since from the physical point of view cotangent bundles $\pi : T^*Q \to Q$ over some configuration space $Q$, endowed with their canonical symplectic structure $\omega_0$, are one of the most important phase spaces, any quantization scheme should be tested and exemplified for this class of classical mechanical systems.

We first recall that on $T^*Q$ there is a canonical vector field $\xi$, the Euler or Liouville vector field which is locally given by $\xi = p_k\,(\partial/\partial p_k)$. Here and in the following, we use local bundle coordinates $(q^k, p_k)$ induced by local coordinates $x^k$ on $Q$. Using $\xi$ we can characterize those functions $f \in C^\infty(T^*Q)$ which are polynomial in the fibers of degree $k$ by $\mathcal{L}_\xi f = kf$. They are denoted by $\mathrm{Pol}^k(T^*Q)$, whereas $\mathrm{Pol}^\bullet(T^*Q)$ denotes the subalgebra of all functions which are polynomial in the fibers. Clearly, most of the physically relevant observables such as the kinetic energy, potentials, and generators of point transformations are in $\mathrm{Pol}^\bullet(T^*Q)$. Moreover, $\mathrm{Pol}^\bullet(T^*Q)$ is a Poisson subalgebra with

$$\bigl\{\mathrm{Pol}^k(T^*Q),\ \mathrm{Pol}^\ell(T^*Q)\bigr\} \subseteq \mathrm{Pol}^{k+\ell-1}(T^*Q) \qquad [5]$$

since $\xi$ is conformally symplectic, $\mathcal{L}_\xi \omega_0 = \omega_0$.

All this suggests that for a quantization of $T^*Q$, the polynomials $\mathrm{Pol}^\bullet(T^*Q)$ should play a crucial role. In deformation quantization this is accomplished by the notion of a homogeneous star product (De Wilde and Lecomte 1983). If the operator

$$H = \nu \frac{\partial}{\partial \nu} + \mathcal{L}_\xi \qquad [6]$$

is a derivation of a formal star product $\star$, then $\star$ is called homogeneous. It immediately follows that $\mathrm{Pol}^\bullet(T^*Q)[\nu] \subseteq C^\infty(T^*Q)[[\nu]]$ is a subalgebra over the ring $\mathbb{C}[\nu]$ of polynomials in $\nu$. Hence for homogeneous star products, the question of convergence (in general quite delicate) has a simple answer.

Let us now describe a simple construction of a homogeneous star product (following Bordemann et al. (1998)). We choose a torsion-free connection $\nabla$ on $Q$ and consider the operator of the symmetrized covariant derivative, locally given by

$$D = dx^k \vee \nabla_{\partial/\partial x^k} : \Gamma^\infty(S^k T^*Q) \to \Gamma^\infty(S^{k+1} T^*Q) \qquad [7]$$

Clearly, $D$ is a global object and a derivation of the symmetric algebra $\bigoplus_{k=0}^{\infty} \Gamma^\infty(S^k T^*Q)$. Let now $f \in \mathrm{Pol}^\bullet(T^*Q)$ and $\psi \in C^\infty(Q)$ be given. Then one defines the standard-ordered quantization $\varrho_{\mathrm{Std}}(f)$ of $f$ with respect to $\nabla$ to be the differential operator $\varrho_{\mathrm{Std}}(f) : C^\infty(Q) \to C^\infty(Q)$ locally given by

$$\varrho_{\mathrm{Std}}(f)\psi = \sum_{r=0}^{\infty} \frac{(-\nu)^r}{r!} \left.\frac{\partial^r f}{\partial p_{k_1} \cdots \partial p_{k_r}}\right|_{p=0} i_s\!\left(\frac{\partial}{\partial x^{k_1}}\right) \cdots i_s\!\left(\frac{\partial}{\partial x^{k_r}}\right) \frac{1}{r!}\, D^r \psi \qquad [8]$$

where $i_s$ denotes the symmetric insertion of vector fields in symmetric forms. Again, this is independent of the coordinate system $x^k$. The infinite sum is actually finite as long as $f \in \mathrm{Pol}^\bullet(T^*Q)$ whence we can safely set $\nu = i\hbar$ in this case. Indeed, [8] is the well-known symbol calculus for differential operators and it establishes a linear bijection

$$\varrho_{\mathrm{Std}} : \mathrm{Pol}^\bullet(T^*Q) \to \mathrm{DiffOp}(C^\infty(Q)) \qquad [9]$$

which generalizes the usual canonical quantization in the flat case of $T^*Q = T^*\mathbb{R}^n = \mathbb{R}^{2n}$. Using this linear bijection, we can define a new product $\star_{\mathrm{Std}}$ for $\mathrm{Pol}^\bullet(T^*Q)$ by

$$f \star_{\mathrm{Std}} g = \varrho_{\mathrm{Std}}^{-1}\bigl(\varrho_{\mathrm{Std}}(f)\,\varrho_{\mathrm{Std}}(g)\bigr) = \sum_{r=0}^{\infty} \nu^r C_r(f, g) \qquad [10]$$

It is now easy to see that $\star_{\mathrm{Std}}$ fulfills all requirements of a homogeneous star product except for the fact that the $C_r(\cdot, \cdot)$ are bidifferential. In this approach

this is far from being obvious as we only worked with functions polynomial in the fibers so far. Nevertheless, it is true whence $\star_{\mathrm{Std}}$ indeed defines a star product for $C^\infty(T^*Q)[[\nu]]$.

In fact, there is a different characterization of $\star_{\mathrm{Std}}$ using a slightly modified Fedosov construction: first one uses $\nabla$ to define a torsion-free symplectic connection on $T^*Q$ by a fairly standard lifting. Moreover, using $\nabla$ one can define a standard-ordered fiberwise product $\circ_{\mathrm{Std}}$ for the formal Weyl algebra bundle over $T^*Q$, being the starting point of the Fedosov construction of star products. With these two ingredients one finally obtains $\star_{\mathrm{Std}}$ from the Fedosov construction with the big advantage that now the order of differentiation in the $C_r$ can easily be determined to be $r$ in each argument, whence $\star_{\mathrm{Std}}$ is even a natural star product. Moreover, $C_r$ differentiates the first argument only in momentum directions which reflects the standard ordering.

Already in the flat situation the standard ordering is not an appropriate quantization scheme from the physical point of view as it maps real-valued functions to differential operators which are not symmetric in general. To pose this question in a geometric framework, we have to specify a positive density $\mu \in \Gamma^\infty(|\Lambda^n| T^*Q)$ on the configuration space $Q$ first, as for functions there is no invariant meaning of integration. Specifying $\mu$ we can consider the pre-Hilbert space $C_0^\infty(Q)$ with inner product

$$\langle \phi, \psi \rangle = \int_Q \bar\phi\, \psi\, \mu \qquad [11]$$

Now the adjoint with respect to [11] of $\varrho_{\mathrm{Std}}(f)$ can be computed explicitly. We first consider the second-order differential operator

$$\Delta = \frac{\partial^2}{\partial q^k\, \partial p_k} + p_k \Gamma^k_{\ell m} \frac{\partial^2}{\partial p_\ell\, \partial p_m} + \Gamma^k_{k\ell} \frac{\partial}{\partial p_\ell} \qquad [12]$$

where $\Gamma^k_{\ell m}$ are the Christoffel symbols of $\nabla$. In fact, $\Delta$ is defined independently of the coordinates and coincides with the Laplacian of the pseudo-Riemannian metric on $T^*Q$ which is obtained from the natural pairing of vertical and horizontal spaces defined by using $\nabla$. Moreover, we need the 1-form $\alpha$ defined by $\nabla_X \mu = \alpha(X)\mu$ and the corresponding vertical vector field $\alpha^v \in \Gamma^\infty(T(T^*Q))$ locally given by $\alpha^v = \alpha_k\,(\partial/\partial p_k)$. Then

$$\varrho_{\mathrm{Std}}(f)^\dagger = \varrho_{\mathrm{Std}}(N^2 \bar f), \qquad N = e^{-(\nu/2)(\Delta + \alpha^v)}$$

Note that due to the curvature contributions, this statement is a highly nontrivial partial integration compared to the flat case. Note also that for $f \in \mathrm{Pol}^\bullet(T^*Q)[\nu]$, we have $Nf \in \mathrm{Pol}^\bullet(T^*Q)[\nu]$ as well, and $N$ commutes with $H$. As in the flat case this allows one to define a Weyl-ordered quantization by

$$\varrho_{\mathrm{Weyl}}(f) = \varrho_{\mathrm{Std}}(Nf) \qquad [14]$$

together with a so-called Weyl-ordered star product

$$f \star_{\mathrm{Weyl}} g = N^{-1}(Nf \star_{\mathrm{Std}} Ng) \qquad [15]$$

which is now a Hermitian and homogeneous star product such that $\varrho_{\mathrm{Weyl}}$ becomes a $*$-representation of $\star_{\mathrm{Weyl}}$, that is, we have $\varrho_{\mathrm{Weyl}}(f \star_{\mathrm{Weyl}} g) = \varrho_{\mathrm{Weyl}}(f)\,\varrho_{\mathrm{Weyl}}(g)$ and $\varrho_{\mathrm{Weyl}}(f)^\dagger = \varrho_{\mathrm{Weyl}}(\bar f)$. Note that in the flat case this is precisely the Moyal star product $*_M$ from [1].

The star products $\star_{\mathrm{Std}}$ and $\star_{\mathrm{Weyl}}$ have been extensively studied by Bordemann, Neumaier, Pflaum, and Waldmann and provide now a well-understood quantization on cotangent bundles. We summarize a few highlights of this theory:

1. In the particular case of a Levi-Civita connection $\nabla$ for some Riemannian metric $g$ and the corresponding volume density $\mu_g$, the 1-form $\alpha$ vanishes. This simplifies the operator $N$ and describes the physically most interesting situation.
2. If the configuration space is a Lie group $G$, then its cotangent bundle $T^*G \cong G \times \mathfrak{g}^*$ is trivial by using, for example, left-invariant 1-forms. In this case the star products $\star_{\mathrm{Weyl}}$ and $\star_{\mathrm{Std}}$ restrict to the CBH star product on $\mathfrak{g}^*$. Moreover, $\star_{\mathrm{Weyl}}$ coincides with the star product found by Gutt (1983) on $T^*G$.
3. Using the operator $N$ one can interpolate between the two different ordering descriptions $\varrho_{\mathrm{Std}}$ and $\varrho_{\mathrm{Weyl}}$ by inserting an additional ordering parameter $\kappa$ in the exponent, that is, $N_\kappa = \exp(-\kappa\nu(\Delta + \alpha^v))$. Thus, one obtains $\kappa$-ordered representations $\varrho_\kappa$ together with corresponding $\kappa$-ordered star products $\star_\kappa$, where $\kappa = 0$ corresponds to standard ordering and $\kappa = 1/2$ corresponds to Weyl ordering. For $\kappa = 1$, one obtains antistandard ordering and in general one has the relation $\overline{f \star_\kappa g} = \bar g \star_{1-\kappa} \bar f$ as well as $\varrho_\kappa(f)^\dagger = \varrho_{1-\kappa}(\bar f)$.
4. One can describe also the quantization of an electrically charged particle moving in a magnetic background field $B$. This is modeled by a closed 2-form $B \in \Gamma^\infty(\Lambda^2 T^*Q)$ on $Q$. Using local vector potentials $A \in \Gamma^\infty(T^*Q)$ with $B = dA$ locally, and by minimal coupling, one obtains a star product $\star_B$ which depends only on $B$ and not on the local potentials $A$. It will be equivalent to $\star_{\mathrm{Weyl}}$ if and only if $B$ is exact. In general, its characteristic class is, up to a factor, given by the class $[B]$ of the magnetic field $B$. While the observable

algebra always exists, a Schrödinger-like representation of $\star_B$ only exists if $B$ satisfies the usual integrality condition. In this case, there exists a representation on sections of a line bundle whose first Chern class is given by $[B]$. This manifests Dirac's quantization condition for magnetic charges in deformation quantization. Another equivalent interpretation of this result is obtained by Morita theory: the star products $\star_{\mathrm{Weyl}}$ and $\star_B$ are Morita equivalent if and only if $B$ satisfies Dirac's integrality condition.
5. Analogously, one can determine the unitary equivalence classes of representations for a fixed, exact magnetic field $B$. It turns out that the representations depend on the choice of the global vector potential $A$ and are unitarily equivalent if the difference between the two vector potentials satisfies an integrality condition known from the Aharonov–Bohm effect. This way, the Aharonov–Bohm effect can be formulated within the representation theory of deformation quantization.
6. There are several variations of the representations $\varrho_{\mathrm{Std}}$ and $\varrho_{\mathrm{Weyl}}$. In particular, one can construct a representation on half-forms instead of functions, thereby avoiding the choice of the integration density $\mu$. Moreover, all the Weyl-ordered representations can be understood as GNS representations coming from a particular positive functional, the Schrödinger functional. For $\varrho_{\mathrm{Weyl}}$ this functional is just the integration over the configuration space $Q$.
7. All the (formal) star products and their representations can be understood as coming from formal asymptotic expansions of integral formulas. From this point of view, the formal representations and star products are a particular kind of global symbol calculus.
8. At least for a projectible Lagrangian submanifold $L$ of $T^*Q$, one finds representations of the star product algebras on the functions on $L$. This leads to explicit formulas for the WKB expansion corresponding to this Lagrangian submanifold.
9. The relation between configuration space symmetries, the corresponding phase-space reduction, and the reduced star products has been analyzed extensively by Kowalzig, Neumaier, and Pflaum.

See also: Classical r-Matrices, Lie Bialgebras, and Poisson Lie Groups; Deformation Quantization; Deformation Quantization and Representation Theory; Deformation Theory; Fedosov Quantization; Operads.

Further Reading

Bayen F, Flato M, Frønsdal C, Lichnerowicz A, and Sternheimer D (1978) Deformation theory and quantization. Annals of Physics 111: 61–151.
Cattaneo A (Notes By Indelicato D) Formality and star products. In: Gutt S, Rawnsley J, and Sternheimer D (eds.) Poisson Geometry, Deformation Quantisation and Group Representations. LMS Lecture Note Series 323, pp. 79–144. Cambridge: Cambridge University Press.
Dito G and Sternheimer D (2002) Deformation quantization: genesis, developments and metamorphoses. In: Halbout G (ed.) Deformation Quantization, IRMA Lectures in Mathematics and Theoretical Physics, vol. 1, pp. 9–54. Berlin: Walter de Gruyter.
Gutt S (2000) Variations on deformation quantization. In: Dito G and Sternheimer D (eds.) Conférence Moshé Flato 1999. Quantization, Deformations, and Symmetries, Mathematical Physics Studies, vol. 21, pp. 217–254. Dordrecht: Kluwer Academic.
Waldmann S (2005) States and representations in deformation quantization. Reviews in Mathematical Physics 17: 15–75.

$\bar\partial$-Approach to Integrable Systems
P G Grinevich, L D Landau Institute for Theoretical Physics, Moscow, Russia
© 2006 Elsevier Ltd. All rights reserved.

The $\bar\partial$-approach is one of the most generic methods for constructing solutions of completely integrable systems. Taking into account that most soliton systems are represented as compatibility conditions for a set of linear differential operators (Lax pairs, zero-curvature representations, L–A–B Manakov triples), it is sufficient to construct these operators. Such compatible families can be defined by presenting their common eigenfunctions. If it is possible to show that some analytic constraints imply that a function is a common eigenfunction of a family of operators, solutions of the original nonlinear system are also generated.

The main idea of the $\bar\partial$ method is to impose the following analytic constraints: if $\lambda$ denotes the spectral parameter and $x$ the physical variables, then, for arbitrary fixed values $x$, the $\bar\partial$-derivative of the wave function is expressed as a linear combination of the wave functions at other values of $\lambda$ with $x$-independent coefficients. In specific examples, this property is either derived from the


direct spectral transform or imposed a priori. Of course, the specific realization of this scheme depends critically on the nonlinear system.

The origin of the $\bar\partial$-method came from the following observation. A solution of the one-dimensional inverse-scattering problem (the problem of reconstructing the potential by discrete spectrum and scattering amplitude at positive energies) for the one-dimensional time-independent Schrödinger operator

$$L = -\partial_x^2 + u(x) \qquad [1]$$

was obtained by Gelfand, Levitan, and Marchenko in the 1950s. It essentially used analytic continuation of the wave function from the real momenta to the complex ones. If the potential $u(x)$ decays sufficiently fast as $|x| \to \infty$, then the eigenfunction equation

$$L\psi(k, x) = k^2 \psi(k, x) \qquad [2]$$

has two solutions $\psi_+(k, x)$ and $\psi_-(k, x)$ such that

1. $\psi_\pm(k, x) = (1 + o(1))\,e^{ikx}$ as $x \to \pm\infty$.
2. The functions $\psi_+(k, x)$, $\psi_-(k, x)$ are holomorphic in $k$ in the upper half-plane and the lower half-plane, respectively.

Existence of analytic continuation to complex momenta is typical for one-dimensional systems. But in the multidimensional case the situation is different. For example, wave functions for the multidimensional Schrödinger operator constructed by Faddeev are well defined for all complex momenta $k$, but they are nonholomorphic in $k$, and they become holomorphic only after restriction to some special one-dimensional subspaces. The last property was one of the key points in the Faddeev approach.

Beals and Coifman (1981–82) and Ablowitz et al. (1983) discovered that departure from holomorphicity for multidimensional wave functions can be interpreted as spectral data. Such spectral transforms proved to be very natural and suit perfectly the purposes of the soliton theory. Some other famous methods, including the Riemann–Hilbert problem, can be interpreted as special reductions of the $\bar\partial$ method.

Nonlocal $\bar\partial$-Problem and Local $\bar\partial$-Problem

The most generic formulation of the $\bar\partial$-method is the nonlocal $\bar\partial$-problem. Assume that the following data is given:

1. A rational $n \times n$ matrix-valued normalization function $\eta(\lambda)$.
2. An $n \times n$ matrix-valued function $g(\lambda, x)$ (it describes the dynamics) such that
   • $g(\lambda, x)$ depends on the spectral parameter $\lambda \in \mathbb{C}$ and “physical” variables $x = (x_1, \ldots, x_N)$; the physical variables $x_k$ are either continuous ($x_k$ belongs to a domain in $\mathbb{R}$ or in $\mathbb{C}$) or discrete ($x_k$ takes integer values);
   • $g(\lambda, x)$ is analytic in $\lambda$, defined for all $\lambda \in \mathbb{C}$, except for a finite number of singular points, and is single valued; and
   • $\det g(\lambda, x)$ has only a finite number of zeros.
   For problems with continuous physical variables the typical form of $g(\lambda, x)$ is $g(\lambda, x) = \exp\bigl(\sum_j x_j K_j(\lambda)\bigr)$, where $K_j(\lambda)$ are meromorphic matrices, mutually commuting for all $\lambda$. The discrete variables are usually encoded in orders of poles and zeros.
3. An $n \times n$ matrix-valued function $R(\lambda, \mu)$ – the “generalized spectral data.” Usually, it is a regular function of four real variables $\Re\lambda$, $\Im\lambda$, $\Re\mu$, $\Im\mu$. (We write this as a function of two complex variables, but we do not assume it to be holomorphic. It would be more precise to write it as $R(\lambda, \mu, \bar\lambda, \bar\mu)$, but to avoid long notations we omit the $\bar\lambda$, $\bar\mu$ dependence.) To avoid analytical complications, the function $R(\lambda, \mu)$ is usually assumed to vanish as $\lambda$ or $\mu$ tend to singular points of $\eta(\lambda)$, $g(\lambda, x)$.

Then the wave function $\psi$ is defined by the data using the following properties:

1. $\psi = \psi(\lambda, x)$ takes values in complex $n \times n$ matrices:

$$\psi(\lambda, x) = \begin{bmatrix} \psi_{11}(\lambda, x) & \ldots & \psi_{1n}(\lambda, x) \\ \vdots & \ddots & \vdots \\ \psi_{n1}(\lambda, x) & \ldots & \psi_{nn}(\lambda, x) \end{bmatrix} \qquad [3]$$

2. For all $\lambda \in \mathbb{C}$ outside the singular points of $\eta(\lambda)$, $g(\lambda, x)$, the wave function $\psi$ satisfies the $\bar\partial$-equation of the inverse-spectral problem,

$$\frac{\partial \psi(\lambda, x)}{\partial \bar\lambda} = \iint_{\mu \in \mathbb{C}} d\mu \wedge d\bar\mu\ \psi(\mu, x)\, R(\mu, \lambda) \qquad [4]$$

It is important that condition [4] is $x$-independent.

3. The function $\chi(\lambda, x) - \eta(\lambda)$, where $\chi(\lambda, x) = \psi(\lambda, x)\, g^{-1}(\lambda, x)$, is regular for all $\lambda \in \mathbb{C}$ and

$$\chi(\lambda, x) - \eta(\lambda) \to 0 \quad \text{as } |\lambda| \to \infty \qquad [5]$$

The wave function $\psi(\lambda, x)$ is calculated by employing the data $\eta(\lambda)$, $g(\lambda, x)$, $R(\lambda, \mu)$ using the following procedure. Taking into account that the

functions $\eta(\lambda)$, $g(\lambda, x)$ are holomorphic in $\lambda$, eqn [4] can be rewritten in terms of $\chi(\lambda, x)$:

$$\frac{\partial [\chi(\lambda, x) - \eta(\lambda)]}{\partial \bar\lambda} = \iint_{\mu \in \mathbb{C}} d\mu \wedge d\bar\mu\ \chi(\mu, x)\, g(\mu, x)\, R(\mu, \lambda)\, g^{-1}(\lambda, x) \qquad [6]$$

The right-hand side of [6] is regular; therefore, this relation is valid for all complex $\lambda$ values.

Equation [6] with the boundary condition [5] is equivalent to the following integral equation:

$$\chi(\lambda, x) = \eta(\lambda) + \frac{1}{2\pi i} \iint_{\nu \in \mathbb{C}} \frac{d\nu \wedge d\bar\nu}{\nu - \lambda} \iint_{\mu \in \mathbb{C}} d\mu \wedge d\bar\mu\ \chi(\mu, x)\, g(\mu, x)\, R(\mu, \nu)\, g^{-1}(\nu, x) \qquad [7]$$

This equation can be derived using the generalized Cauchy formula. Let $f(z)$ be a smooth (not necessarily holomorphic) function in a bounded domain $D$ in the complex plane. Then

$$f(z) = \frac{1}{2\pi i} \oint_{\partial D} \frac{d\zeta}{\zeta - z}\, f(\zeta) + \frac{1}{2\pi i} \iint_D \frac{d\zeta \wedge d\bar\zeta}{\zeta - z}\, \frac{\partial f(\zeta)}{\partial \bar\zeta} \qquad [8]$$

If the kernel $g(\mu, x) R(\mu, \lambda) g^{-1}(\lambda, x)$ is sufficiently good (e.g., it is sufficient to assume that $(1 + |\mu|)^{1+\varepsilon}\, g(\mu, x) R(\mu, \lambda) g^{-1}(\lambda, x)\, (1 + |\lambda|)^2$, $\varepsilon > 0$, is a continuous function at both finite and infinite points), then we have a Fredholm equation (the operator on the right-hand side of [7] is compact). If it has no unit eigenvalues, eqn [7] is uniquely solvable. But, for some values of $x$, one of the eigenvalues may become equal to 1, and it results in singularities of solutions.

If the norm of the integral operator is smaller than 1, eqn [7] is uniquely solvable. To generate solutions that are regular for all values of physical variables, it is natural to restrict the class of admissible spectral data by assuming the kernel $g(\mu, x) R(\mu, \lambda) g^{-1}(\lambda, x)$ to be bounded in $x$ for all $\lambda$, $\mu$. In the scalar case $n = 1$, this restriction implies:

$$R(\mu, \lambda) = 0 \quad \text{for all } \lambda, \mu \text{ such that } g(\mu, x)\, g^{-1}(\lambda, x) \text{ is unbounded in } x \qquad [9]$$

For specific examples like the Kadomtsev–Petviashvili-II (KP-II), direct scattering transform automatically generates spectral data satisfying [9]. In KP-II, [9] implies

$$R(\mu, \lambda) = A(\lambda)\, \delta(\mu - \lambda) + T(\lambda)\, \delta(\mu - \bar\lambda) \qquad [10]$$

The coefficient $A(\lambda)$ can be eliminated by multiplying the wave function by an appropriate function of $\lambda$; therefore, in standard texts, $A(\lambda) \equiv 0$.

If for every $\lambda$ the kernel $R(\mu, \lambda)$ is equal to 0 everywhere except at a finite number of points $\mu_1(\lambda), \ldots, \mu_k(\lambda)$, one has the so-called local $\bar\partial$-problem. Such kernels are rather typical.

Examples of Soliton Systems Integrable by the $\bar\partial$-Problem Method

Let us discuss some important examples.

The KP-II Hierarchy

The first nontrivial equation from the KP hierarchy has the following form:

$$(u_t + 6uu_x - u_{xxx})_x = 3\alpha^2 u_{yy} \qquad [11]$$

From a physical point of view, the case of real $\alpha^2$ is the most interesting one. Equation [11] is called KP-I if $\alpha^2 = -1$ and KP-II if $\alpha^2 = +1$. The Lax pair for KP-II reads:

$$[L, A] = 0$$

where

$$L = \partial_y - \partial_x^2 + u(x, y, t), \qquad A = \partial_t - 4\partial_x^3 + 6u(x, y, t)\,\partial_x + 3u_x(x, y, t) + 3w(x, y, t) \qquad [12]$$

The Cauchy problem for initial data $u(x, y, 0)$ decaying at infinity is solved by using the nonlocal Riemann problem for KP-I and the local $\bar\partial$-problem for KP-II. The wave function is assumed to be scalar valued ($n = 1$), and the $\bar\partial$-equation [4] takes the following form:

$$\frac{\partial \psi(\lambda, x, y, t)}{\partial \bar\lambda} = T(\lambda)\, \psi(\bar\lambda, x, y, t) \qquad [13]$$

The wave function $\psi(\lambda, x, y, t)$ is assumed to be regular for finite $\lambda$'s and to have the following essential singularity as $\lambda = \infty$:

$$\psi(\lambda, x, y, t) = \exp(\lambda x + \lambda^2 y + 4\lambda^3 t)\,(1 + o(1)) \qquad [14]$$

Equivalently, $\eta(\lambda) \equiv 1$ and the function $g(\lambda, x, t)$ has one essential singularity at $\lambda = \infty$,

$$g(\lambda, x, t) = \exp(\lambda x + \lambda^2 y + 4\lambda^3 t) \qquad [15]$$
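As a quick numerical sanity check of [12] and [15] (a sketch only, not part of the inverse-scattering construction itself), one can verify that for vanishing potential, $u = w = 0$, the exponential [15] is annihilated by both operators of the Lax pair, and that the ratio $g(\mu)/g(\lambda)$ has modulus one for $\mu = \bar\lambda$ while it grows in $x$ for a generic $\mu$, which is the mechanism behind [9] and [10]. The function names `g`, `lax_residuals`, and `ratio_modulus`, the finite-difference step, and the sample points are illustrative assumptions and do not come from the text.

```python
import cmath


def g(lam, x, y, t):
    """Exponential of eq. [15]: exp(lam*x + lam**2*y + 4*lam**3*t)."""
    return cmath.exp(lam * x + lam**2 * y + 4 * lam**3 * t)


def diff(f, h=1e-3):
    """Return the central finite-difference derivative of a one-argument function."""
    return lambda s: (f(s + h) - f(s - h)) / (2 * h)


def lax_residuals(lam, x, y, t):
    """Residuals of L g = 0 and A g = 0, assuming the free case u = w = 0 of [12]."""
    in_x = lambda s: g(lam, s, y, t)
    in_y = lambda s: g(lam, x, s, t)
    in_t = lambda s: g(lam, x, y, s)
    g_y = diff(in_y)(y)
    g_t = diff(in_t)(t)
    g_xx = diff(diff(in_x))(x)
    g_xxx = diff(diff(diff(in_x)))(x)
    res_L = g_y - g_xx        # L = d_y - d_x^2 at u = 0
    res_A = g_t - 4 * g_xxx   # A = d_t - 4 d_x^3 at u = w = 0
    return res_L, res_A


def ratio_modulus(mu, lam, x, y, t):
    """|g(mu) / g(lam)| at real x, y, t."""
    return abs(g(mu, x, y, t) / g(lam, x, y, t))


if __name__ == "__main__":
    lam = 0.3 + 0.7j
    print(lax_residuals(lam, 0.4, -0.2, 0.1))                    # both close to zero
    print(ratio_modulus(lam.conjugate(), lam, 5.0, -3.0, 2.0))   # close to 1
    print(ratio_modulus(lam + 0.5, lam, 50.0, 0.0, 0.0))         # grows like exp(0.5 * x)
```

The residuals vanish only up to finite-difference error, so they should be compared with a tolerance rather than with exact zero.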


Higher times $t_k$ from the KP hierarchy are incorporated into this scheme by assuming that

$$g(\lambda, t) = \exp\Bigl(\sum_{k=1}^{\infty} \lambda^k t_k\Bigr) \qquad [16]$$

Here $x = t_1$, $y = t_2$, $4t = t_3$.

Equation [13] was originally derived (Ablowitz et al. 1983) from the direct spectral transform. If the potential $u(x, y)$ is sufficiently small and $u(x, y) = O(1/(x^2 + y^2)^{1+\varepsilon})$ for $x^2 + y^2 \to \infty$, then the wave function $\psi(\lambda, x, y)$ for the $L$-operator [12]

$$L\psi(\lambda, x, y) = 0, \qquad \psi(\lambda, x, y) = \exp(\lambda x + \lambda^2 y)\,[1 + o(1)] \quad \text{for } x^2 + y^2 \to \infty \qquad [17]$$

can be constructed by solving the following integral equation for the pre-exponent $\chi(\lambda, x, y) = \psi(\lambda, x, y) \exp(-\lambda x - \lambda^2 y)$:

$$\chi(\lambda, x, y) = 1 - \iint G(\lambda, x - x', y - y')\, u(x', y')\, \chi(\lambda, x', y')\, dx'\, dy' \qquad [18]$$

where the Green function $G(\lambda, x, y)$ is given by

$$G(\lambda, x, y) = \frac{1}{4\pi^2} \iint \frac{e^{i(p_x x + p_y y)}}{p_x^2 + i p_y - 2i\lambda p_x}\, dp_x\, dp_y \qquad [19]$$

It is not holomorphic in $\lambda$, but

$$\frac{\partial G(\lambda, x, y)}{\partial \bar\lambda} = -\frac{i}{2\pi}\, \mathrm{sgn}(\Re\lambda)\, e^{-2i\Re\lambda\, x - 4i\Re\lambda\, \Im\lambda\, y} \qquad [20]$$

The nonholomorphicity of $G(\lambda, x, y)$ results in the special nonholomorphicity of $\chi(\lambda, x, y)$ of the form [13].

Remark We see that one function of two real variables $T(\lambda)$ is sufficient to solve the Cauchy problem in the plane. But it is also possible to construct solutions of KP-II starting from generic nonlocal kernels $R(\mu, \lambda)$ (to guarantee at least local existence of solutions, it is enough to assume that $R(\mu, \lambda)$ is small and has finite support). It looks like a paradox, but the situation is exactly the same in the linear case. In the standard Fourier method, only exponents with real momenta are used, but local solutions can be constructed as combinations of exponents with arbitrary complex momenta.

Novikov–Veselov Hierarchy and Two-Dimensional Schrödinger Operator

Equations from this hierarchy admit representation in terms of Manakov L–A–B triples,

$$\frac{\partial L}{\partial t_n} = [A_n, L] + B_n L \qquad [21]$$

where

$$L = -4\partial_z \partial_{\bar z} + u(z, t), \qquad A_n = 2^{2n+1}\bigl(\partial_z^{2n+1} + \partial_{\bar z}^{2n+1}\bigr) + \cdots \qquad [22]$$

The order of $B_n$ is smaller than $2n + 1$. In particular, for $n = 1$,

$$A_1 = 8\bigl(\partial_z^3 + \partial_{\bar z}^3\bigr) + 2\bigl(w\,\partial_z + \bar w\,\partial_{\bar z}\bigr), \qquad B_1 = w_z + \bar w_{\bar z} \qquad [23]$$

$$u_t = 8\partial_z^3 u + 8\partial_{\bar z}^3 u + 2\partial_{\bar z}(u \bar w) + 2\partial_z(u w)$$

where

$$u(z, t) = \bar u(z, t), \qquad \partial_{\bar z} w(z, t) = -3\,\partial_z u(z, t) \qquad [25]$$

This hierarchy is integrated using the scattering transform at zero energy for the two-dimensional Schrödinger operator $L$. If Cauchy data with asymptotic

$$u(z) \to -E_0, \qquad w(z) \to 0, \qquad \text{for } |z| \to \infty \qquad [26]$$

is considered, the scattering transform for the operator $\tilde L = L + E_0$ with the potential $\tilde u(z) = u(z) + E_0$ at fixed energy $E_0$ and decaying at infinity is used. In fact, the fixed-energy scattering problem is one of the basic problems of mathematical physics, and the Novikov–Veselov hierarchy can be treated as an infinite-dimensional Abelian symmetry algebra for this problem. The scattering transform essentially depends on the sign of $E_0$. The case $E_0 = 0$, studied by Boiti, Leon, Manna, and Pempinelli, is the most complicated from the analytic point of view, and we do not discuss it now.

If $E_0 < 0$, the wave function satisfies a pure local $\bar\partial$-relation

$$\frac{\partial \psi(\lambda, z)}{\partial \bar\lambda} = T(\lambda)\, \psi\!\left(\frac{1}{\bar\lambda}, z\right) \qquad [27]$$

with $\eta(\lambda) \equiv 1$, and

$$g(\lambda, z) = e^{(\kappa/2)(\lambda \bar z + z/\lambda)}, \qquad \kappa^2 = -E_0 \qquad [28]$$

Starting from generic spectral data $T(\lambda)$, one obtains a fixed-energy eigenfunction for a second-order operator

$$\tilde L \psi(\lambda, z) = E_0\, \psi(\lambda, z), \qquad \tilde L = -4\partial_z \partial_{\bar z} + V(z)\,\partial_z + u$$

To generate pure potential operators ($V(z) \equiv 0$), it is necessary to impose additional symmetry constraints on the spectral data (see the section “Reductions on the $\bar\partial$ data”).
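The form of the exponent [28] and of the shifted argument in [27] can be checked directly. The following short verification is a sketch for vanishing potential, assuming real $\kappa > 0$; the symbols $\mu$ and $c$ are introduced only for this computation.

```latex
% Free operator at u = 0: L_0 = -4 \partial_z \partial_{\bar z}
\begin{align*}
g(\lambda,z) &= \exp\!\Bigl[\tfrac{\kappa}{2}\bigl(\lambda\bar z + z/\lambda\bigr)\Bigr],
\qquad
\partial_{\bar z}\, g = \tfrac{\kappa\lambda}{2}\, g,
\qquad
\partial_z\, g = \tfrac{\kappa}{2\lambda}\, g, \\
-4\,\partial_z\partial_{\bar z}\, g &= -\kappa^2 g = E_0\, g . \\[4pt]
\left|\frac{g(\mu,z)}{g(\lambda,z)}\right|
&= \exp\!\Bigl\{\tfrac{\kappa}{2}\,\mathrm{Re}\Bigl[\bar z\Bigl(\mu - \lambda
   + \tfrac{1}{\bar\mu} - \tfrac{1}{\bar\lambda}\Bigr)\Bigr]\Bigr\}.
\end{align*}
% The ratio is bounded for all z iff  mu + 1/\bar\mu = lambda + 1/\bar\lambda =: c.
% Writing mu = r e^{i\theta}:  (r + 1/r) e^{i\theta} = c,  so theta = arg c and
% r + 1/r = |c|, which has exactly the two roots r = |lambda| and r = 1/|lambda|:
\[
\mu = \lambda \quad\text{or}\quad \mu = 1/\bar\lambda .
\]
```

The first root corresponds to the term that can be removed by renormalizing the wave function, and the second is the argument appearing in [27]. Repeating the computation with $\kappa$ replaced by $i\kappa$ gives $\mu = \lambda$ or $\mu = -1/\bar\lambda$, which is the argument in the relation [30] for $E_0 > 0$.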

If E0 > 0, there are two types of generalized integrated by using the following zero-curvature

spectral data – @-data and nonlocal Riemann representation:
problem data. The wave function satisfies the ! !

@-relation: @z 0 1 0 qðz; tÞ
¼  ½39
0 @z 2 qðz; tÞ 0
@ð; zÞ 1
¼ TðÞ   ; z ; jj 6¼ 1 ½30
0 1
2i@z2 þ ig iqz  iq@ z
and has a jump at the unit circle jj = 1. The @t  ¼ @ A ½40
boundary values  (, z) = ((1  0), z) are i q
z þ i q@z 2i@z2  ig
connected by the following relation:
The wave function satisfies the following ‘‘scatter-
ing’’ equation:
þ ð; zÞ ¼  ð; zÞ þ Rð; Þ ð; zÞjdj ½31 0 1 0 1
jj¼1 @k 0 0 
@ A ¼ @
T A T ½41
gð; zÞ ¼ eið =2Þðzþz=Þ ; 2 ¼ E0 ½32 0 @k bðkÞ 0
Here T denotes the transposed matrix. Let us point
Constraints on the spectral data associated with out the amazing symmetry between the direct and
pure potential operators were found by Novikov inverse transforms.
and Grinevich for R(, ), and by Manakov and
Grinevich for T(). Existence of two different types Discrete Systems
of generalized scattering data has a very transparent
In the examples discussed above, continuous vari-
physical meaning: there is a one-to-one correspon-
ables are ‘‘encoded’’ in essential singularities of
dence between the classical scattering amplitude at
g(, x). Discrete variables correspond to orders of
energy E0 and the nonlocal Riemann problem data
 zeros and poles. For example, assuming that the
R(, ). The @-data T() can be treated as a
function g(, t) in the KP integration scheme
complete set of additional parameters enumerating
depends on extra continuous variables t1 , t2 , . . . ,
all potentials with a given scattering amplitude at
tn , . . . and discrete variable t0 = n,
one energy. 0 1
X 1
Davey–Stewartson-II and Ishimori-I Equations gð; tÞ ¼ t0 exp@ k tk A ½42
The Davey–Stewartson-II (DS-II) equation k6¼0

  one obtains solutions of the so-called two-dimensional

i@t q þ 2 @z2 þ @z2 q þ ðg þ 
gÞq ¼ 0 ½33 Toda–KP hierarchy.
Assume that we have a nonlocal @-problem  for a
scalar function with   1 and
@z g ¼  @z jqj2 ½34
  Pk nk
gð; n1 ; . . . ; nk Þ ¼ ½43
q ¼ qðz; tÞ; g ¼ gðz; tÞ; z ¼ x þ iy ½35 j¼1

can be treated as an integrable (2 þ 1)-dimensional The wave function defines a map Zk ! CN ,

extension of nonlinear Schrödinger equation. The ðn1 ; . . . ; nk Þ ! ðð1 ; n1 ; . . . ; nk Þ; . . . ;
Ishimori-I equation
ðN ; n1 ; . . . ; nk ÞÞ ½44
@t S þ S  @x2 S þ @y2 S þ @x w@y S þ @y w@x S ¼ 0 ½36 where 1 , . . . , N are some points in C. This
construction generates the so-called quadrilateral
lattices (each two-dimensional face is planar).
@x2 w  @y2 w þ 2Sð@x S  @y SÞ ¼ 0 ½37
Multidimensional Problems

The @-approach can also be applied to multidimen-
S ¼ Sðx;y;tÞ; S ¼ ðS1 ;S2 ;S3 Þ; S2 ¼ 1 ½38
sional inverse-scattering problems, but typically the
is an integrable (2 þ 1)-dimensional extension of the scattering data are overdetermined and satisfy
Heisenberg magnetic equation. Both systems are additional nonlinear compatibility conditions. For


example, the Faddeev wave functions for the n-dimensional stationary Schrödinger operator

$$\left(-\partial_{x_1}^2 - \cdots - \partial_{x_n}^2 + u(x)\right)\psi(k,x) = (k\cdot k)\,\psi(k,x) \qquad [45]$$

$$\psi(k,x) = e^{ik\cdot x}(1 + o(1)) \qquad [46]$$

in the nonphysical domain $k_I \neq 0$ ($k_R$ and $k_I$ denote the real and imaginary parts of $k$, respectively) satisfy the following ∂̄-equation:

$$\frac{\partial\psi(k,x)}{\partial\bar k_j} = -2\pi \int_{\xi\in\mathbb{R}^n} \xi_j\, h(k, k_R + \xi)\,\psi(k + \xi, x)\,\delta(\xi\cdot\xi + 2k\cdot\xi)\,d\xi_1 \cdots d\xi_n \qquad [47]$$

The characterization of admissible spectral data $h(k,l)$, $k \in \mathbb{C}^n$, $l \in \mathbb{R}^n$ is based on the following compatibility equation:

$$\frac{\partial h(k,l)}{\partial\bar k_j} + \frac{1}{2}\frac{\partial h(k,l)}{\partial l_j} = -2\pi \int_{\xi\in\mathbb{R}^n} \xi_j\, h(k, k_R + \xi)\,h(k + \xi, l)\,\delta(\xi\cdot\xi + 2k\cdot\xi)\,d\xi_1 \cdots d\xi_n \qquad [48]$$

More details can be found in Novikov and Henkin (1987).

Reductions on the ∂̄-Data

The most generic ∂̄-data usually result in solutions from the wrong functional class (they may, e.g., be complex or singular), or extra constraints on the auxiliary linear operators are necessary to obtain solutions of the zero-curvature representation. For example, to obtain real KP-II solutions using the local ∂̄-problem [13], the following reduction on the ∂̄-data should be imposed:

[eqn [49]: reduction condition on $T(\lambda)$; not legible in the source]

It can be easily derived from the direct transform. But it is not always the case. For example, the selection of pure potential two-dimensional Schrödinger operators originally was not so evident. To formulate the answer, it is convenient to introduce a new function $b(\lambda)$, $T(\lambda) = b(\lambda)\,\mathrm{sgn}(\lambda\bar\lambda - 1)/\bar\lambda$.

For $E_0 < 0$, the following constraints select real potential operators:

[eqn [50]: two symmetry constraints on $b(\lambda)$; not legible in the source]

In some situations, the problem of finding appropriate reductions is the most difficult part of the integration procedure. It is true not only for the ∂̄-approach, but also for other techniques including the finite-gap method. For example, the inverse-spectral transform for the two-dimensional Schrödinger operator was first developed for finite-gap (quasiperiodic) potentials and only later for the decaying ones. For operators with finite gap at one energy the first-order terms were constructed by Dubrovin, Krichever, and Novikov in 1976, but only in 1984 the potentiality reduction was found by Novikov and Veselov.

Nonsingular Solutions

As mentioned above, one can construct regular solutions by choosing sufficiently small (in an appropriate norm) scattering data. But for some special systems the regularity follows automatically from reality reductions. For example, for arbitrary large ∂̄-data, real KP-II solutions constructed by the local ∂̄-problem [13] with reduction [49] are regular. The proof is based on the theory of generalized analytic functions (in the Vekua sense). Another example is the two-dimensional Schrödinger operator at a fixed negative energy $E_0 < 0$. The potentiality and reality constraints imply that the potential is nonsingular for arbitrary large $T(\lambda)$. But, unfortunately, the ∂̄-problem with regular data covers only a part of the space of potentials. In fact, each such operator possesses a strictly positive real eigenfunction at the level $E_0$, exponentially growing in all directions (it also follows from the generalized analytic functions theory). Existence of such a function implies that the whole discrete spectrum is located above the energy $E_0$, and it gives a restriction on the potential. (For more details, see the review by Grinevich (2000).)

Some Explicit Solutions

The generic ∂̄-problem results in potentials that could not be expressed in terms of elementary or standard special functions. But for degenerate kernels, a solution of the inverse-spectral problem can be written explicitly. For example, if

$$R(\lambda, \mu) = \sum_{k=1} r_k(\lambda)\,s_k(\mu) \qquad [51]$$

the wave function and solutions can be expressed in quadratures.

In particular, if all $r_k(\lambda)$ and $s_k(\mu)$ are $\delta$-functions, $r_k(\lambda) = R_k\,\delta(\lambda - \lambda_k)$ and $s_k(\mu) = S_k\,\delta(\mu - \mu_k)$, the wave function is rational in $\lambda$ and can be expressed as a rational combination of exponents of $x_k$. If for some $k$ and $l$, $\mu_k = \lambda_l$, this procedure needs some

regularization. For example, it is possible to assume that $\delta(\lambda - \lambda_0)/(\lambda - \lambda_0) = 0$.

If for all $k$, $\mu_k = \lambda_k$, the ∂̄-problem generates rational in $x$ solutions (lumps). It is possible to show that the Novikov–Veselov real rational solitons for $E_0 > 0$ are always nonsingular, decay at $\infty$ as $1/(x^2 + y^2)$, and the potential $u(z)$ has zero scattering in all directions for the waves with energy $E_0$.

The ∂̄-Problem on Riemann Surfaces

In all examples discussed above, the spectral variable is defined in a Riemann sphere. It is natural to generalize it by considering wave functions depending on a spectral parameter defined on a Riemann surface of higher genus. Spectral transforms of such type arise in the theory of localized perturbation of periodic solutions. Assume that the KP-II potential $u(x,y)$ has the form

$$u(x,y) = u_0(x,y) + u_1(x,y) \qquad [52]$$

where $u_0(x,y)$ is a real nonsingular finite-gap potential and $u_1(x,y)$ decays sufficiently fast at infinity. Denote by $\psi_0(\gamma, x, y)$ the wave function of the operator $L_0 = \partial_y - \partial_x^2 + u_0(x,y)$, where $\gamma \in \Gamma$, the spectral curve $\Gamma$ is a compact Riemann surface of genus $g$ with a distinguished point $\infty$. In addition to an essential singularity at the point $\infty$, the wave function $\psi_0(\gamma, x, y)$ has $g$ simple poles at points $\gamma_1, \ldots, \gamma_g$ and is holomorphic in $\Gamma$ outside these singular points. For a real nonsingular potential, $\Gamma$ is an M-curve, that is, there exists an antiholomorphic involution $\tau : \Gamma \to \Gamma$, $\tau\infty = \infty$, the set of fixed points form $g + 1$ ovals $a_0, \ldots, a_g$, $\infty \in a_0$, $\gamma_k \in a_k$.

The wave function $\psi(\gamma, x, y)$ of the perturbed operator $L = \partial_y - \partial_x^2 + u(x,y)$ is defined at the same spectral curve $\Gamma$, but it is not holomorphic any more. It has the following properties:

1. At the point $\infty$, the wave function $\psi(\gamma, x, y)$ has an essential singularity: $\psi(\gamma, x, y) = \psi_0(\gamma, x, y)(1 + o(1))$.
2. In the neighborhoods of the points $\gamma_k$, $\psi(\gamma, x, y)$ can be written as a product of a continuous function by a simple pole at $\gamma_k$.
3. The wave function $\psi(\gamma, x, y)$ satisfies the ∂̄ equation

$$\frac{\partial\psi(\gamma, x, y, t)}{\partial\bar\gamma} = T(\gamma)\,\psi(\tau\gamma, x, y, t) \qquad [53]$$

where the (0, 1)-form $T(\gamma) = t(\gamma)\,d\bar\gamma$ is regular outside the divisor points $\gamma_k$ and in the neighborhood of $\gamma_k$ it is possible to define a local coordinate such that $t(\gamma) = \mathrm{sgn}(\mathrm{Im}\,\gamma)\,t_1(\gamma)(\bar\gamma - \bar\gamma_k)/(\gamma - \gamma_k)$, $t_1(\gamma)$ is regular.

A solution of the inverse problem can be obtained by using appropriate analogs of Cauchy kernels on Riemann surfaces.

Quasiclassical Limit

The systems integrable by the ∂̄-method usually describe integrable systems with high-order derivatives. It is well known that by applying some limiting procedures to integrable systems one can construct new completely integrable equations, but integration methods for these equations are based on completely different analytic tools. One of the most important examples is the theory of dispersionless hierarchies. The limiting procedure for the ∂̄-problem (quasiclassical ∂̄-problem) was developed by Konopelchenko and collaborators. In the KP case, the quasiclassical limit of the wave function $\psi(\lambda, t)$ is assumed to have the following form:

$$\psi\!\left(\lambda, \frac{t}{\varepsilon}\right) = \hat\psi(\lambda, t, \varepsilon)\,\exp\!\left(\frac{S(\lambda, t)}{\varepsilon}\right) \qquad [54]$$

It is possible to show that the function $S(\lambda, t)$ satisfies a Beltrami-type equation:

$$\frac{\partial S(\lambda, t)}{\partial\bar\lambda} = W\!\left(\lambda, \frac{\partial S(\lambda, t)}{\partial\lambda}\right) \qquad [55]$$

which is treated as a dispersionless limit of [13]. Higher-order corrections were also discussed in the literature (see Konopelchenko and Moro (2003)).

See also: Boundary-Value Problems for Integrable Equations; Integrable Systems and Algebraic Geometry; Integrable Systems and the Inverse Scattering Method; Integrable Systems: Overview; Integrable Systems and Discrete Geometry; Riemann–Hilbert Methods in Integrable Systems.

Further Reading

Ablowitz MJ, Bar Jaakov D, and Fokas AS (1983) On the inverse scattering of the time-dependent Schrödinger equation and the associated Kadomtsev–Petviashvili equation. Studies in Applied Mathematics 69(2): 135–143.
Beals R and Coifman RR (1981–82) Scattering, transformations spectrales et equations d'evolution nonlineare. I, II: Seminaire Goulaouic – Meyer–Schwartz – Exp. 22; Exp. 21. Palaiseau: Ecole Polytechnique.
Beals R and Coifman RR (1985) Multidimensional inverse scattering and nonlinear partial differential equations. Proceedings of Symposia in Pure Mathematics 43: 45–70.
Bogdanov LV and Konopelchenko BG (1995) Lattice and q-difference Darboux–Zakharov–Manakov systems via ∂̄-dressing method. Journal of Physics A 28(5): L173–L178.
Bogdanov LV and Manakov SV (1988) The nonlocal ∂̄ problem and (2 + 1)-dimensional soliton equations. Journal of Physics A 21(10): L537–L544.
Grinevich PG (2000) The scattering transform for the two-dimensional Schrödinger operator with a potential that decreases at infinity at fixed nonzero energy. Russian Mathematical Surveys 55(6): 1015–1083.
Konopelchenko B and Moro A (2003) Quasi-classical ∂̄-dressing approach to the weakly dispersive KP hierarchy. Journal of Physics A 36(47): 11837–11851.
Nachman AJ and Ablowitz MJ (1984) A multidimensional inverse-scattering method. Studies in Applied Mathematics 71(3): 243–250.
Novikov RG and Henkin GM (1987) The ∂̄-equation in the multidimensional inverse scattering problem. Russian Mathematical Surveys 42(4): 109–180.

Derived Categories
E R Sharpe, University of Utah, Salt Lake City, UT, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

In this article we shall briefly outline derived categories and their relevance for physics. Derived categories (and their enhancements) classify off-shell states in a two-dimensional topological field theory on Riemann surfaces with boundary known as the open-string B model. We briefly review pertinent aspects of that topological field theory and its relation to derived categories, the Bondal–Kapranov enhancement and its relation to the open-string B model, as well as B model twists of two-dimensional theories known as Landau–Ginzburg models, and how information concerning stability of D-branes is encoded in this language. We concentrate on more physical aspects of derived categories; for a very readable short review concentrating on the mathematics, see, for example, Thomas (2000).

Sheaves and Derived Categories in the Open-String B Model

Derived categories are mathematical constructions which are believed to be related to D-branes in the open-string B model. We shall begin by briefly reviewing the B model, as well as D-branes.

The A and B models are two-dimensional topological field theories, closely related to nonlinear sigma models, which are supersymmetrizations of theories summing over maps from a Riemann surface (the world sheet of the string) into some "target space" $X$. In both the A and B models, one considers only certain special correlation functions, involving correlators closed under the action of a nilpotent scalar operator known as the "BRST operator," $Q$, which is part of the original supersymmetry transformations. In considering the pertinent correlation functions, only certain types of maps contribute. The A model has the properties of being invariant under complex structure deformations of the target space $X$, and its pertinent correlation functions are computed by summing over holomorphic maps into the target $X$. The A model will not be relevant for us here. The B model has the properties of being invariant under Kähler moduli of the target $X$, and its pertinent correlation functions are computed by summing over constant maps into the target $X$. In the closed-string B model, the states of the theory are counted by the cohomology groups $H^\bullet(X, \Lambda^\bullet TX)$, where $X$ is constrained to be Calabi–Yau. The BRST operator in the B model $Q$ can be identified with $\bar\partial$ for many purposes. The open-string B model is the same topological field theory, but now defined on a Riemann surface with boundaries. As with all open-string theories, we specify boundary conditions on the fields, which force the ends of the string to live on some submanifold of the target, and we associate to the boundaries degrees of freedom (known as the Chan–Paton factors) which describe a (possibly twisted) vector bundle over the submanifold. In the case of the B model, the submanifold is a complex submanifold, and the vector bundle is forced to be a holomorphic vector bundle over that submanifold.

To lowest order, that combination of a submanifold $S$ of $X$ together with a (possibly twisted) holomorphic vector bundle over that submanifold, is a "D-brane" in the open-string B model. We shall denote the twisted bundle by $\mathcal{E} \otimes \sqrt{K_S}$, where $K_S$ is the canonical bundle of $S$, and the $\sqrt{K_S}$ factor is an explicit incorporation of something known as the Freed–Witten anomaly. Now, if $i : S \hookrightarrow X$ is the inclusion map, then to this D-brane we can associate a sheaf $i_*\mathcal{E}$.

Technically, a sheaf is defined by associating sets, or modules, or rings, to each open set on the underlying space, together with restriction maps saying how data associated to larger open sets restricts to smaller open sets, obeying the obvious consistency conditions, together with some gluing conditions that say how local sections can be patched back together. A vector bundle defines a

sheaf by associating to any open set sections of the bundle over that open set. Sheaves of the form "$i_*\mathcal{E}$" look like, intuitively, vector bundles over submanifolds, with vanishing fibers off the submanifold. A more detailed discussion of sheaves is beyond the scope of this article; see instead, for example, Sharpe.

To "associate a sheaf" means finding a sheaf such that physical properties of the D-brane system are well modeled by mathematics of the sheaf. (In particular, the physical definition of D-brane has, on the face of it, nothing at all to do with the mathematical definition of a sheaf, so one cannot directly argue that they are the same, but one can still use one to give a mathematical model of the other.) For example, the spectrum of open-string states in the B model stretched between two D-branes, associated to sheaves $i_*\mathcal{E}$ and $j_*\mathcal{F}$, turns out to be calculated by a cohomology group known as $\mathrm{Ext}^*_X(i_*\mathcal{E}, j_*\mathcal{F})$.

There are many more sheaves not of the form $i_*\mathcal{E}$, that is, that do not look like vector bundles over submanifolds. It is not known in general whether they also correspond to (on-shell) D-branes, but in some special cases the answer has been worked out. For example, structure sheaves of nonreduced schemes turn out to correspond to D-branes with nonzero nilpotent Higgs vevs.

For a set of ordinary D-branes, the description above suffices. However, more generally one would like to describe collections of D-branes and anti-D-branes, and tachyons. An anti-D-brane has all the same physical properties as an ordinary D-brane, modulo the fact that they try to annihilate each other. The open-string spectrum between coincident D-branes and anti-D-branes contains tachyons. One can give an (off-shell) vacuum expectation value to such tachyons, and then the unstable brane–antibrane–tachyon system will evolve to some other, usually simpler, configuration. For example, given a single D-brane wrapped on a curve, with trivial line bundle, and an anti-D-brane wrapped on the same curve, with line bundle $\mathcal{O}(-1)$, and a nonzero tachyon $\mathcal{O}(-1) \to \mathcal{O}$, then one expects that the system will dynamically evolve to a smaller D-brane sitting at a point on the curve. Now, one would like to find some mathematics that describes such systems, and gives information about the endpoints of their evolution. Technically, one would like to classify universality classes of world-sheet boundary renormalization group flow.

It has been conjectured that derived categories of sheaves provide such a classification. To properly explain derived categories is well beyond the scope of this article (see instead the "Further reading" section at the end), but we shall give a short outline below.

Mathematically, derived categories of sheaves concern complexes of sheaves, that is, sets of sheaves $\mathcal{E}^i$ together with maps $d_i : \mathcal{E}^i \to \mathcal{E}^{i+1}$

$$\cdots \longrightarrow \mathcal{E}^i \xrightarrow{\;d_i\;} \mathcal{E}^{i+1} \xrightarrow{\;d_{i+1}\;} \mathcal{E}^{i+2} \xrightarrow{\;d_{i+2}\;} \cdots$$

such that $d_{i+1} \circ d_i = 0$. A category is defined by a collection of "objects" together with maps between the objects, known as morphisms. In a derived category of coherent sheaves, the objects are complexes of sheaves, and the maps are equivalence classes of maps between complexes.

Physically, if the complex consists of locally free sheaves (equivalently, vector bundles), then we can associate a brane/antibrane/tachyon system, by identifying the $\mathcal{E}^i$ for $i$ even, say, with D-branes, and the $\mathcal{E}^i$ for $i$ odd with anti-D-branes. If the $\mathcal{E}^i$ are all locally free sheaves, then there are tachyons between the branes and antibranes, and we can identify the $d_i$'s with those tachyons. In the open-string world-sheet theory, giving a tachyon a vacuum expectation value modifies the BRST operator $Q$, and a necessary condition for the new theory to still be a topological field theory is that $Q^2 = 0$, a condition which turns out to imply that $d_{i+1} \circ d_i = 0$.

To re-create the structure of a derived category, we need to impose some equivalence relations. To see what sorts of equivalence relations one would like to impose, note the following. Physically, we would like to identify, for example, a configuration consisting of a brane, an antibrane, and a tachyon, which we can describe as a complex

$$\mathcal{O}(-D) \longrightarrow \mathcal{O}$$

with a one-element complex

$$\mathcal{O}_D$$

corresponding to the D-brane which we believe is the endpoint of the evolution of the brane/antibrane configuration.

One natural mathematical way to create identifications of this form is to identify complexes that differ by "quasi-isomorphisms," meaning, a set of maps $(f^n : C^n \to D^n)$ compatible with $d$'s, and inducing an isomorphism $\tilde f^n : H^n(C) \cong H^n(D)$ on the cohomologies of the complexes. In particular, in the example above, there is a natural set of maps

$$\begin{array}{ccc} \mathcal{O}(-D) & \longrightarrow & \mathcal{O} \\ \downarrow & & \downarrow \\ 0 & \longrightarrow & \mathcal{O}_D \end{array}$$

that define a quasi-isomorphism.
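As a rough illustration of the quasi-isomorphism condition, the following sketch replaces sheaves by finite-dimensional vector spaces over the rationals, so that cohomology reduces to rank computations. It is only a toy stand-in, not a computation with sheaves: the complex `C` (an injective map from a line into a plane) plays the role of the two-term complex, and `D` (a single line) plays the role of its cokernel. The helper names `rank`, `compose`, `is_complex`, and `cohomology_dims` are hypothetical, and the check relies on the standard fact that a map of complexes is a quasi-isomorphism exactly when its mapping cone has vanishing cohomology.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) over the rationals, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    if not rows or not rows[0]:
        return 0
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def compose(second, first):
    """Matrix product second * first, i.e. apply `first`, then `second`."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*first)]
            for row in second]


def is_complex(maps):
    """Check that consecutive differentials compose to zero."""
    return all(
        all(entry == 0 for row in compose(maps[i + 1], maps[i]) for entry in row)
        for i in range(len(maps) - 1)
    )


def cohomology_dims(dims, maps):
    """dim H^i = dim C^i - rank(d_i) - rank(d_{i-1}) at each position i."""
    result = []
    for i, dim in enumerate(dims):
        rank_out = rank(maps[i]) if i < len(maps) else 0
        rank_in = rank(maps[i - 1]) if i > 0 else 0
        result.append(dim - rank_out - rank_in)
    return result


# C: k --d--> k^2 with d(x) = (x, 0); a toy analog of O(-D) -> O.
C_dims, C_maps = [1, 2], [[[1], [0]]]
# D: the single space k; a toy analog of O_D.
D_dims, D_maps = [1], []
# Mapping cone of f: C -> D, where f projects k^2 onto its second coordinate.
# Written out by hand: k --(-d)--> k^2 --f--> k.
cone_dims, cone_maps = [1, 2, 1], [[[-1], [0]], [[0, 1]]]

print(is_complex(cone_maps))                  # True
print(cohomology_dims(C_dims, C_maps))        # [0, 1]
print(cohomology_dims(D_dims, D_maps))        # [1]
print(cohomology_dims(cone_dims, cone_maps))  # [0, 0, 0]
```

The two complexes have the same cohomology, and the vanishing cohomology of the cone confirms that the projection `f` induces the identification, even though `C` and `D` are not isomorphic as complexes.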

More generally, in homological algebra, one typically does computations by replacing ordinary objects with projective or injective resolutions, that is, complexes with special properties, in which the desired computation becomes trivial, and defining the result for the original object to be the same as the result for the resolution. To formalize this procedure, one would like a mathematical setup in which objects and their projective and injective resolutions are isomorphic.

However, to define an equivalence relation, one usually needs an isomorphism, and the quasi-isomorphisms above are not, in general, isomorphisms.

Creating an equivalence from nonisomorphisms, to resolve this problem, can be done through a process known as "localization" (generalizing the notion of localization in commutative algebra). The resulting equivalence relations on maps between complexes define the derived category.

The derived category is a category whose objects are complexes, and whose morphisms $C \to D$ are equivalence classes of pairs $(s, t)$ where $s : G \to C$ is a quasi-isomorphism between $C$ and another complex $G$, and $t : G \to D$ is a map of complexes. We take two such pairs $(s, t)$, $(s', t')$ to be equivalent if there exists another pair $(r, h)$ between the auxiliary complexes $G$, $G'$, making the obvious diagram commute. This is, in a nutshell, what is meant by localization, and by working with such equivalence classes, this allows us to formally invert maps that are otherwise noninvertible. (We encourage the reader to consult the "Further reading" section for more details.)

Mathematically, this technology gives a very elegant way to rethink, for example, homological algebra. There is a notion of a derived functor, a special kind of functor between derived categories, and notions from homological algebra such as Ext and Tor can be re-expressed as cohomologies of the image complexes under the action of a derived functor, thus replacing cohomologies with complexes.

Physically, looking back at the physical realization of complexes, we see a basic problem: different representatives of (isomorphic) objects in the derived category are described by very different physical theories. For example, the sheaf $\mathcal{O}_D$ corresponds to a single D-brane, defined by a two-dimensional boundary conformal field theory (CFT), whereas the brane/antibrane/tachyon collection $\mathcal{O}(-D) \to \mathcal{O}$ is defined by a massive nonconformal two-dimensional theory. These are very different physical theories. If we want "localization on quasi-isomorphisms" to happen in physics, we have to explain which properties of the physical theories we are interested in, because clearly the entire physical theories are not and cannot be isomorphic.

Although the entire physical theories are not isomorphic, we can hope that under renormalization group flow, the theories will become isomorphic. That is certainly the physical content of the statement that the brane/antibrane system $\mathcal{O}(-D) \to \mathcal{O}$ should describe the D-brane corresponding to $\mathcal{O}_D$ – after world-sheet boundary renormalization group flow, the nonconformal two-dimensional theory describing the brane/antibrane system becomes a CFT describing a single D-brane.

More globally, this is the general prescription for finding physical meanings of many categories: we can associate physical theories to particular types of representatives of isomorphism classes of objects, and then although distinct representatives of the same object may give rise to very different physical theories, those physical theories at least lie in the same universality class of world-sheet renormalization group flow. In other words, (equivalence classes of) objects are in one-to-one correspondence with universality classes of physical theories.

Showing such a statement directly is usually not possible – it is usually technically impractical to follow renormalization group flow explicitly. There is no symmetry reason or other basic physics reason why renormalization group flow must respect quasi-isomorphism. The strongest constraint that is clearly applied by physics is that renormalization group flow must preserve D-brane charges (Chern characters, or more properly, K-theory), but objects in a derived category contain much more information than that.

However, indirect tests can be performed, and because many indirect tests are satisfied, the result is generally believed.

The reader might ask why it is not more efficient to just work with the cohomology complexes $H^\bullet(C)$ themselves, rather than the original complexes. One reason is that the original complexes contain more information than the cohomology – passing to cohomology loses information. For example, there exist examples of complexes that have the same cohomology, yet are not quasi-isomorphic, and so are not identified within the derived category, and so physically are believed to lie in different universality classes of boundary renormalization group flow.

Another motivation for relating physics to derived categories is Kontsevich's approach to mirror symmetry. Mirror symmetry relates pairs of Calabi–Yau manifolds, of the same dimension, in a fashion such that easy classical computations in one Calabi–Yau

are mapped to difficult "quantum" computations involving sums over holomorphic curves in the other Calabi–Yau. Because of this property, mirror symmetry has proved a fertile ground for algebraic geometers to study. Kontsevich proposed that mirror symmetry should be understood as a relation between derived categories of coherent sheaves on one Calabi–Yau and derived Fukaya categories on the other Calabi–Yau. At the time he made this proposal, no one had any idea how either could be realized in physics, but since that time, physicists have come to believe that Kontsevich was secretly talking about D-branes in the A and B models.

Bondal–Kapranov Enhancements

Mathematically, derived categories are not quite as ideal as one would like. For example, the cone construction used in triangulated categories does not behave functorially – the cone depends upon the representative of the equivalence class defining an object in a derived category, and not just the object itself.

Physically, our discussion of brane/antibrane systems was not the most general possible. One can give vacuum expectation values to more general vertex operators, not just the tachyons.

Curiously, these two issues solve each other. By incorporating a more general class of boundary vertex operators, one realizes a more general mathematical structure, due to Bondal and Kapranov, which repairs many of the technical deficiencies of ordinary derived categories. Ordinary complexes are replaced by generalized complexes in which arrows can map between non-neighboring elements of the complex. Schematically, the BRST operator is deformed by boundary vacuum expectation values to the form

$$Q = \bar\partial + \sum_a \phi_a$$

and demanding that the BRST operator square to zero implies that

$$\sum_a \bar\partial\phi_a + \sum_{a,b} \phi_b \circ \phi_a = 0$$

which is the same as the condition for a generalized complex. Note that for ordinary complexes, the condition above factors into

$$\bar\partial\phi_n = 0$$

$$\phi_{n+1} \circ \phi_n = 0$$

which yields an ordinary complex of sheaves (Figure 1).

Figure 1. Example of generalized complex. Each arrow is labeled by the degree of the corresponding vertex operator.

Landau–Ginzburg Models

So far we have described how derived categories are relevant to geometric compactifications, that is, sigma models on Calabi–Yau manifolds. However, there are also "nongeometric" theories – CFTs that do not come from sigma models on manifolds, of which Landau–Ginzburg models and their orbifolds are prominent examples. Landau–Ginzburg models can also be twisted into topological field theories, and the B-type topological twist of (an orbifold of) a Landau–Ginzburg model is believed to be isomorphic, as a topological field theory, to the B model obtained from a nonlinear sigma model, of the form we outlined earlier. Landau–Ginzburg models have a very different form than nonlinear sigma models, and so sometimes there can be practical computational advantages to working with one rather than the other.

A Landau–Ginzburg model is an ungauged sigma model with a nonzero superpotential (a holomorphic function over the target space that defines a bosonic potential and Yukawa couplings). (In "typical" cases, the target space is a vector space.) Because of the superpotential, a Landau–Ginzburg model is a massive theory – not itself a CFT, but many Landau–Ginzburg models are believed to flow to CFTs under the renormalization group.

In formulating open strings based on Landau–Ginzburg models, naive attempts fail because of something known as the Warner problem: if the superpotential is nonzero, then the obvious ways to try to define the theory on a Riemann surface with boundary have the undesirable property that the supersymmetry transformations only close up to a nonzero boundary term, proportional to derivatives of the superpotential. In order to find a description of open strings in which the supersymmetry transformations close, one must take a very nonobvious formulation of the boundary data. Specifically, to solve the Warner problem, one is led to work with pairs of matrices whose product is proportional to the superpotential.

This method of solving the Warner problem is known as matrix factorization, and D-branes in this theory are defined by the factorization chosen, that is, the choice of pairs of matrices. In simple cases, we can be more explicit as follows. Choose a set of

polynomials $F_\alpha$, $G_\alpha$ such that the Landau–Ginzburg superpotential $W$ is given by

$$W = \sum_\alpha F_\alpha G_\alpha + \text{constant}.$$

The $F_\alpha$ and $G_\alpha$ are used to define the boundary action – the $F$'s appear as part of the boundary superpotential and the $G$'s appear as part of the supersymmetry transformations of boundary fermi multiplets. The $F_\alpha$ and $G_\alpha$, that is, the factorization of $W$, determine the D-brane in the Landau–Ginzburg theory. We can also think of having a pair of holomorphic vector bundles $\mathcal{E}_1$, $\mathcal{E}_2$ of the same rank, and interpret $F$ and $G$ as holomorphic sections of $\mathcal{E}_1^\vee \otimes \mathcal{E}_2$ and $\mathcal{E}_2^\vee \otimes \mathcal{E}_1$, respectively, obeying $FG \propto W \cdot \mathrm{Id}$ and $GF \propto W \cdot \mathrm{Id}$, up to additive constants.

Although a Landau–Ginzburg model is not the same thing as a sigma model on a Calabi–Yau, orbifolds of Landau–Ginzburg models are often on the same Kähler moduli space. Perhaps the most famous example of this relates sigma models on quintic hypersurfaces in $\mathbb{P}^4$ to a $\mathbb{Z}_5$ orbifold of a Landau–Ginzburg model over $\mathbb{C}^5$ with five chiral superfields $x_1, x_2, x_3, x_4, x_5$, and a superpotential of the form

$$W = x_1^5 + x_2^5 + x_3^5 + x_4^5 + x_5^5 + \psi\, x_1 x_2 x_3 x_4 x_5$$

for $\psi$ a complex number, corresponding to the equation of the degree-5 hypersurface in $\mathbb{P}^4$. The (complexified) Kähler moduli space in this example is a $\mathbb{P}^1$, with the sigma model on the quintic at one pole, the zero-volume limit of the sigma model along the equator, and the Landau–Ginzburg orbifold at the opposite pole.

Since the closed-string topological B model is independent of Kähler moduli, and the sigma model on the quintic and the Landau–Ginzburg orbifold above lie on the same Kähler moduli space, one would expect them both to have the same spectrum of D-branes, and indeed this is believed to be true.

Pi-Stability

So far we have discussed D-branes in the topological B model, a topological twist of a physical sigma model. If we untwist back to a physical sigma model, then the stability of those D-branes becomes an issue.

To begin to understand what we mean by stability in this context, consider a set of $N$ D-branes wrapped on, say, a K3 surface, at large radius (so that world-sheet instanton corrections are small). On the world volume of the D-branes, we have a rank-$N$ vector bundle, and in the physical theory on that world volume we have a consistency condition for supersymmetric vacua, that the vector bundle be "Mumford–Takemoto stable." To understand what is meant by this condition on a Kähler manifold, let $\omega$ denote the Kähler form, and define the "slope" $\mu$ of a vector bundle $\mathcal{E}$ on a manifold $X$ of complex dimension $n$ to be given by

$$\mu(\mathcal{E}) = \frac{\int_X \omega^{n-1} \wedge c_1(\mathcal{E})}{\mathrm{rank}\,\mathcal{E}}$$

where $\omega$ is the Kähler form. Then, we say that $\mathcal{E}$ is (semi-)stable if for all subsheaves $\mathcal{F}$ satisfying certain consistency conditions, $\mu(\mathcal{F})\,(\leq)\,<\,\mu(\mathcal{E})$. Since the slope of a bundle depends upon the Kähler form, whether a given bundle is Mumford–Takemoto stable depends upon the metric. In general, on a Kähler manifold, the Kähler cone breaks up into subcones, with a different moduli space of (stable) holomorphic vector bundles in each subcone.

This is a mathematical notion of stability, but it also corresponds to physical stability, at least in a regime in which quantum corrections are small. If a given bundle is only stable in a proper subset of the Kähler cone, then when it reaches the boundary of the subcone in which it is stable, the gauge field configuration that satisfies the Donaldson–Uhlenbeck–Yau partial differential equation splits into a sum of two separate bundles. In a heterotic string compactification, this leads to a low-energy enhanced U(1) gauge symmetry and D-terms which realize the change in moduli space. In D-branes, this means the formerly bound state of D-branes (described by an irreducible holomorphic vector bundle) becomes only marginally bound; a decay becomes possible.

Pi-stability is a proposal for generalizing the considerations above to D-branes no longer wrapping the entire Calabi–Yau, and including quantum corrections.

In order to define pi-stability, we must first introduce a notion of grading $\varphi$ of a D-brane. Specifically, for a D-brane wrapped on the entire Calabi–Yau $X$ with holomorphic vector bundle $\mathcal{E}$, grading is defined as the mirror to the expression $\int_X \mathrm{ch}(\mathcal{E}) \wedge \Omega$, where $\Omega$ encodes the periods. Close to the large-radius limit, this has the form:

$$\varphi(\mathcal{E}) = \frac{1}{\pi}\,\mathrm{Im}\,\log \int_X \exp(B + i\omega) \wedge \mathrm{ch}(\mathcal{E}) \wedge \sqrt{\mathrm{td}(TX)} + \cdots$$

where $B$ is a 2-form, the "B field." As defined $\varphi$ is clearly $S^1$-valued; however, we must choose a particular sheet of the log Riemann surface, to obtain an $\mathbb{R}$-valued function.
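The choice of sheet can be made concrete with a small numerical sketch. It assumes that the integral inside the logarithm has already been evaluated to a nonzero complex number, here called `Z` (a hypothetical stand-in, not a quantity computed from any geometry), and it normalizes the grading by $1/\pi$, which is also an assumption of the sketch. On the principal sheet the grading lies in $[-1, 1]$ and jumps when `Z` crosses the negative real axis; following `Z` continuously along a path and adding even integers gives a real-valued lift. The function names are illustrative only.

```python
import cmath
import math


def grading_principal(Z):
    """(1/pi) * Im log Z on the principal sheet; the value lies in [-1, 1]."""
    return cmath.phase(Z) / math.pi


def grading_along_path(values):
    """Continuous real-valued lift of the grading along a sampled path.

    `values` is a list of nonzero complex numbers sampled finely enough that
    the grading changes by less than 1 between neighbouring samples.
    """
    lifted = [grading_principal(values[0])]
    for Z in values[1:]:
        phi = grading_principal(Z)
        # Shift by the even integer that brings phi closest to the previous value.
        shift = 2 * round((lifted[-1] - phi) / 2)
        lifted.append(phi + shift)
    return lifted


# Hypothetical path: Z winds one and a half times around the origin.
path = [cmath.exp(1j * math.pi * (n / 10)) for n in range(31)]
principal = [grading_principal(Z) for Z in path]
lifted = grading_along_path(path)

print(round(lifted[-1], 6))                  # 3.0
print(all(-1 <= p <= 1 for p in principal))  # True
```

The principal values stay in a window of width 2, while the lifted values grow steadily from 0 to 3, which is the sense in which a sheet must be chosen before gradings of different D-branes can be compared as real numbers.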

This notion of grading of D-branes is an ansatz, introduced as part of the definition of pi-stability. Physically, it is believed that the difference in grading between two D-branes corresponds to the fractional charge of the boundary-condition-changing vacuum between the two D-branes, though we know of no convincing first-principles derivation of that statement. In particular, unlike closed-string computations, the degree of the Ext group element corresponding to a particular boundary R-sector state is not always the same as the $U(1)_R$ charge – for example, it is often determined by the $U(1)_R$ charge minus the charge of the vacuum. The grading gives us the mathematical significance of that vacuum charge. This mismatch between Ext degrees and $U(1)_R$ charges is necessary for the grading to make sense: Ext group degrees are integral, after all, yet we want the grading to be able to vary continuously, so the grading had better not be the same as an Ext group degree.

Given an $\mathbb{R}$-valued function from a particular definition of log in the definition of $\varphi$ above, the statement of pi-stability is then that for all subsheaves $\mathcal{F}$, as in the statement of Mumford–Takemoto stability,

$$\varphi(\mathcal{F}) \leq \varphi(\mathcal{E})$$

Before trying to understand the physical meaning of $\varphi$, or the extension of these ideas to derived categories, let us try to confirm that Mumford–Takemoto stability emerges as a limit of pi-stability.

However, there is a technical problem that limits such an extension. Specifically, in a derived category, there is no meaningful notion of "subobject." Thus, a notion of stability formulated in terms of subobjects cannot be immediately applied to derived categories. There are two (equivalent) workarounds to this issue that have been discussed in the math and physics literatures, which can be briefly summarized as follows:

1. One workaround involves picking a subcategory of the derived category that does allow you to make sense of subobjects. Such a structure is known, loosely, as a "T-structure," and so one can imagine formulating stability by first picking a T-structure, then specifying a slope function on the elements of the subcategory picked out by the T-structure.
2. Another (equivalent) workaround is to work with a notion of "relative stability." Instead of speaking about whether a D-brane is stable against decay into any other object, one only speaks about whether it is stable against decay into pairs of specified objects.

In this fashion, one can make sense of pi-stability for derived categories.

See also: Fourier–Mukai Transform in String Theory; Mirror Symmetry: A Geometric Survey; Spectral Sequences; Superstring Theories; Topological Quantum Field Theory: Overview.
For simplicity, suppose that X is a Calabi–Yau Further Reading
3-fold. Then, for large Kähler form !, we can
Bondal A and Kapranov M (1991) Enhanced triangulated
expand ’(E) as, categories. Mathematics of the USSR Sbornik 70: 92–107.
 Z  Bridgeland T (2002) Stability conditions on triangulated cate-
1 i
Im log  !3 ðrk EÞ gories, math.AG/0212237.
 3! X Caldararu A (2005) Derived categories of sheaves: a skimming,
R 2
3 ! ^ c1 ðEÞ math.AG/0501094.
þ RX 3 þ  Hartshorne R (1966) Residues and Duality. Lecture Notes in
 X ! ðrk EÞ Mathematics, vol. 20. Berlin: Springer.
Kapustin A and Li Y (2003) D-branes in Landau–Ginzburg
Thus, we see that to leading order in the Kähler models and algebraic geometry. Journal of High Energy
form !, ’(F )  ’(E) if and only if Physics, 0312: 005 (hep-th/0210296).
R 2 R 2 Kontsevich M (1995) Homological algebra of mirror symmetry.
X ! ^ c1 ðF Þ ! ^ c1 ðEÞ In: Chatterji SD (ed.) Proceedings of International Congress of
rk F rk E Mathematicians (Zurich, 1994), pp. 120–139, (alg-geom/
9411018). Boston: Birkhäuser.
which is precisely the statement of Mumford– Sharpe E (2003) Lectures on D-branes and sheaves, hep-th/
Takemoto stability on a 3-fold X. 0307245.
One can define a notion of (classical) stability for Thomas R (2000) Derived categories for the working mathema-
more general sheaves, but what one wants is to tician, math.AG/0001045.
Weibel C (1994) An Introduction to Homological Algebra.
apply pi-stability to derived categories, not just
Cambridge Studies in Advanced Mathematics, vol. 38.
sheaves. Cambridge: Cambridge University Press.
Determinantal Random Fields

A Soshnikov, University of California at Davis, Davis
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The theory of random point fields has its origins in such diverse areas of science as life tables, particle physics, population processes, and communication engineering. A standard reference to the subject is the monograph by Daley and Vere-Jones (1988).

This article is concerned with a special class of random point fields, introduced by Macchi in the mid-1970s. The model that Macchi considered describes the statistical distribution of a fermion system in thermal equilibrium. Macchi proposed to call the new class of random point processes the fermion random point processes. The characteristic property of this family of random point processes is the condition that k-point correlation functions have the form of determinants built from a correlation kernel. This implies that the particles obey the Pauli exclusion principle. Until the mid-1990s, fermion random point processes attracted only limited interest in the mathematics and physics communities, with the exception of two important works by Spohn (1987) and Costin and Lebowitz (1995). This situation changed dramatically at the end of the last century, as the subject greatly benefited from the newly discovered connections to random matrix theory, representation theory, random growth models, combinatorics, and number theory. The subject is developing rapidly at the moment. Even the terminology is not yet set in stone. Many experts currently use the term "determinantal random point fields" instead of "fermion random point fields." We follow this trend in our article.

This article is intended as a short introduction to the subject. The next section builds a mathematical framework and gives a formal mathematical definition of the determinantal random point fields. Then we discuss examples of determinantal random point fields from quantum mechanics, random matrix theory, random growth models, combinatorics, and representation theory. This is followed by a discussion of the ergodic properties of translation-invariant determinantal random point fields. We then discuss the Gibbsian property of determinantal random point fields. Central-limit theorem type results for the counting functions and similar linear statistics are also discussed. The final section is devoted to some generalizations of determinantal point fields, namely immanantal and Pfaffian random point fields.

Mathematical Framework

We start by building a standard mathematical framework for the theory of random point processes. Let $E$ be a one-particle space and $X$ a space of finite or countable configurations of particles in $E$. In general, $E$ can be a separable Hausdorff space. However, for our purposes it suffices to consider $E = \mathbb{R}^d$ or $E = \mathbb{Z}^d$. We usually assume in this section that $E = \mathbb{R}^d$, with the understanding that all constructions can be easily extended to the discrete case.

We assume that each configuration $\xi = (x_i)$, $x_i \in E$, $i \in \mathbb{Z}^1$ (or $i \in \mathbb{Z}^1_+$ for $d > 1$), is locally finite. In other words, for every compact $K \subset E$, the number of particles in $K$, $\#_K(\xi) = \#(x_i \in K)$, is finite.

In order to introduce a $\sigma$-algebra of measurable subsets of $X$, we first define the cylinder sets. Let $B \subset E$ be a bounded Borel set and let $n \ge 0$. We call $C_n^B = \{\xi \in X : \#_B(\xi) = n\}$ a cylinder set. We define $\mathcal{B}$ as the $\sigma$-algebra generated by all cylinder sets (i.e., $\mathcal{B}$ is the minimal $\sigma$-algebra that contains all $C_n^B$).

Definition 1 A random point field is a triplet $(X, \mathcal{B}, \Pr)$, where $\Pr$ is a probability measure on $(X, \mathcal{B})$.

It was observed in the 1960s–1970s (see, e.g., Lenard (1973, 1975)) that in many cases the most convenient way to define a probability measure on $(X, \mathcal{B})$ is via the point correlation functions. Let $E = \mathbb{R}^d$, equipped with the underlying Lebesgue measure.

Definition 2 A locally integrable function $\rho_k : E^k \to \mathbb{R}^1_+$ is called a k-point correlation function of the random point field $(X, \mathcal{B}, \Pr)$ if, for any disjoint bounded Borel subsets $A_1, \ldots, A_m$ of $E$ and for any $k_i \in \mathbb{Z}^1_+$, $i = 1, \ldots, m$, $\sum_{i=1}^m k_i = k$, the following formula holds:

\[ \mathbb{E} \prod_{i=1}^m \frac{(\#_{A_i})!}{(\#_{A_i} - k_i)!} = \int_{A_1^{k_1} \times \cdots \times A_m^{k_m}} \rho_k(x_1, \ldots, x_k)\,dx_1 \cdots dx_k \qquad [1] \]

where by $\mathbb{E}$ we denote the mathematical expectation with respect to $\Pr$. In particular, $\rho_1(x)$ is the particle density, since

\[ \mathbb{E}\,\#_A = \int_A \rho_1(x)\,dx \]

for any bounded Borel $A \subset E$. In general, $\rho_k(x_1, \ldots, x_k)$ has the following probabilistic interpretation. Let $[x_i, x_i + dx_i]$, $i = 1, \ldots, k$, be infinitesimally small boxes around $x_i$; then $\rho_k(x_1, x_2, \ldots, x_k)\,dx_1 \cdots dx_k$ is the probability of finding a particle in each of these boxes.
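The factorial-moment formula [1] can be checked by hand in the simplest case. The following Python sketch assumes a homogeneous Poisson point field of intensity c on a bounded set A (an example chosen here purely for illustration; it is not one of the fields studied in this article), for which $\rho_k \equiv c^k$ and $\#_A$ is Poisson distributed with mean $\mu = c|A|$. With m = 1, formula [1] then reads $\mathbb{E}[\#_A!/(\#_A - k)!] = \mu^k$. The function name and the truncation parameter `terms` are hypothetical.

```python
import math

def poisson_factorial_moment(mu: float, k: int, terms: int = 200) -> float:
    """Return E[N! / (N - k)!] for N ~ Poisson(mu) by summing the pmf directly.

    The series is truncated after `terms` terms, which is ample for small mu.
    """
    total = 0.0
    pmf = math.exp(-mu)                      # P(N = 0)
    for n in range(terms):
        if n >= k:
            total += pmf * math.perm(n, k)   # n! / (n - k)!
        pmf *= mu / (n + 1)                  # P(N = n + 1) from P(N = n)
    return total

mu = 2.5                                     # assumed mean number of particles in A
for k in (1, 2, 3):
    lhs = poisson_factorial_moment(mu, k)    # left-hand side of [1] with m = 1
    rhs = mu ** k                            # integral of rho_k = c^k over A^k
    print(k, lhs, rhs)
```

The two printed columns should agree up to rounding, which is the content of [1] for this particular field.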
In the discrete case $E = \mathbb{Z}^d$, the construction of a random point field is very similar. The probability space $X$ and the $\sigma$-algebra $\mathcal{B}$ are constructed essentially in the same way as before. Moreover, in the discrete case, the set of the countable configurations of particles can be identified with the set of all subsets of $E$. Therefore, $X = \{0, 1\}^E$, and $\mathcal{B}$ is generated by the events $\{C_x, x \in E\}$, where $C_x = \{x \in \xi\}$. The k-point correlation function $\rho_k(x_1, \ldots, x_k)$ is then just the probability that a configuration $\xi$ contains the sites $x_1, \ldots, x_k$. In other words, $\rho_k(x_1, \ldots, x_k) = \Pr(\bigcap_{i=1}^k C_{x_i})$. In particular, the one-point correlation function $\rho_1(x)$, $x \in \mathbb{Z}^d$, is the probability that a configuration contains the site $x$, that is, $\rho_1(x) = \Pr(C_x)$.

The problem of the existence and the uniqueness of a random point field defined by its correlation functions was studied by Lenard (1973–1975). It is not surprising that Lenard's papers revealed many parallels to the classical moment problem. In particular, the random point field is uniquely defined by its correlation functions if the distribution of random variables $\{\#_A\}$ for bounded Borel sets $A$ is uniquely determined by its moments.

In this article we study a special class of random point fields introduced by Macchi (1975). To shorten the exposition, we give the definitions only in the continuous case $E = \mathbb{R}^d$. In the discrete case, the definitions are essentially the same.

Let $K : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$ be an integral locally trace-class operator. The last condition means that for any compact $B \subset \mathbb{R}^d$ the operator $K\chi_B$ is trace class, where $\chi_B(x)$ is the indicator of $B$. The kernel of $K$ is defined up to a set of measure zero in $\mathbb{R}^d \times \mathbb{R}^d$. For our purposes, it is convenient to choose it in such a way that for any bounded measurable $B$ and any positive integer $n$

\[ \mathrm{tr}(\chi_B K \chi_B) = \int_B K(x, x)\,dx \qquad [2] \]

We refer the reader to Soshnikov (2000, p. 927) for the discussion. We are now ready to define a determinantal (fermion) random point field on $\mathbb{R}^d$.

Definition 3 A random point field on $E$ is said to be determinantal (or fermion) if its n-point correlation functions are of the form

\[ \rho_n(x_1, \ldots, x_n) = \det(K(x_i, x_j))_{1 \le i,j \le n} \qquad [3] \]

Remark 1 If the kernel is Hermitian-symmetric, then the non-negativity of n-point correlation functions implies that the kernel $K(x, y)$ is non-negative definite and, therefore, $K$ must be a non-negative operator. It should be noted, however, that there exist determinantal random point fields corresponding to non-Hermitian kernels (see, e.g., [18] later). The kernel $K(x, y)$ is usually called a correlation kernel of the determinantal random point process.

In the Hermitian case, the necessary and sufficient conditions on the operator $K$ to define a determinantal random point field were established by Soshnikov (2000); see also Macchi (1975).

Theorem 1 A Hermitian locally trace-class operator $K$ on $L^2(E)$ determines a determinantal random point field if and only if $0 \le K \le 1$ (in other words, both $K$ and $1 - K$ are non-negative operators). If the corresponding random point field exists, it is unique.

The main technical part of the proof is the following proposition.

Proposition 1 Let $(X, \mathcal{B}, P)$ be a determinantal random point field with the Hermitian-symmetric correlation kernel $K$. Let $f$ be a non-negative continuous function with compact support. Then

\[ \mathbb{E}\,e^{-\langle \xi, f \rangle} = \det\left(\mathrm{Id} - (1 - e^{-f})^{1/2} K (1 - e^{-f})^{1/2}\right) \qquad [4] \]

where $\langle \xi, f \rangle$ is the value of the linear statistics defined by the test function $f$ on the configuration $\xi = (x_i)$; in other words, $\langle \xi, f \rangle = \sum_i f(x_i)$.

Remark 2 The right-hand side (RHS) of [4] is well defined as the Fredholm determinant of a trace-class operator. Letting $f = \sum_{i=1}^k s_i \chi_{I_i}$, one obtains $\mathbb{E}\,e^{-\langle \xi, f \rangle} = \mathbb{E} \prod_{i=1}^k z_i^{\#_{I_i}}$, with $z_i = e^{-s_i}$. In this case, the left-hand side (LHS) of [4] becomes the generating function of the joint distribution of the counting random variables $\#_{I_i}$, $i = 1, \ldots, k$.

Unfortunately, there are very few known results in the non-Hermitian case. In particular, the necessary and sufficient condition on $K$ for the existence of the determinantal random point field with the non-Hermitian correlation kernel is not known.

We end this section with the introduction of the Janossy densities (a.k.a. density distributions, exclusion probability densities, etc.) of a random point field. The term Janossy densities in the theory of random point processes was introduced by Srinivasan in 1969, who referred to the 1950 paper by Janossy on particle showers. Let us assume that all point correlation functions exist and are locally integrable, and let $I$ be a bounded Borel subset of
$\mathbb{R}^d$. Intuitively, one can think of the Janossy density $J_{k,I}(x_1, \ldots, x_k)$, $x_1, \ldots, x_k \in I$, as

\[ J_{k,I}(x_1, \ldots, x_k) \prod_{i=1}^k dx_i = \Pr\{\text{there are exactly } k \text{ particles in } I \text{ and there is a particle in each of the } k \text{ infinitesimal boxes } (x_i, x_i + dx_i),\ i = 1, \ldots, k\} \qquad [5] \]

To give a formal definition, we express point correlation functions in terms of Janossy densities and vice versa:

\[ \rho_k(x_1, \ldots, x_k) = \sum_{j=0}^{\infty} \frac{1}{j!} \int_{I^j} J_{k+j,I}(x_1, \ldots, x_k, x_{k+1}, \ldots, x_{k+j})\,dx_{k+1} \cdots dx_{k+j} \qquad [6] \]

\[ J_{k,I}(x_1, \ldots, x_k) = \sum_{j=0}^{\infty} \frac{(-1)^j}{j!} \int_{I^j} \rho_{k+j}(x_1, \ldots, x_k, x_{k+1}, \ldots, x_{k+j})\,dx_{k+1} \cdots dx_{k+j} \qquad [7] \]

A very useful property of the Janossy densities is that

\[ \Pr\{\text{there are exactly } k \text{ particles in } I\} = \frac{1}{k!} \int_{I^k} J_{k,I}(x_1, \ldots, x_k)\,dx_1 \cdots dx_k \qquad [8] \]

In the case of determinantal random point fields, Janossy densities also have a determinantal form,

\[ J_{k,I}(x_1, \ldots, x_k) = \det(\mathrm{Id} - K_I) \cdot \det(L_I(x_i, x_j))_{1 \le i,j \le k} \qquad [9] \]

In the last equation, $K_I$ is the restriction of the operator $K$ to $L^2(I)$. In other words, $K_I(x, y) = \chi_I(x) K(x, y) \chi_I(y)$, where $\chi_I$ is the indicator of $I$. The operator $L_I$ is expressed in terms of $K_I$ as $L_I = (\mathrm{Id} - K_I)^{-1} K_I$. For further results on the Janossy densities of determinantal random point processes we refer the reader to Soshnikov (2004) and references therein.

Examples of Determinantal Random Point Fields

Fermion Gas

Let $H = -d^2/dx^2 + V(x)$ be a Schrödinger operator with discrete spectrum on $L^2(E)$. We denote by $\{\varphi_\ell\}_{\ell=0}^{\infty}$ an orthonormal basis of the eigenfunctions, $H\varphi_\ell = \lambda_\ell \cdot \varphi_\ell$, $\lambda_0 < \lambda_1 \le \lambda_2 \le \cdots$. To define a Fermi gas, we consider the nth exterior power of $H$, $\wedge^n(H) : \wedge^n(L^2(E)) \to \wedge^n(L^2(E))$, where $\wedge^n(L^2(E))$ is the space of square-integrable antisymmetric functions of $n$ variables and $\wedge^n(H) = \sum_{i=1}^n (-d^2/dx_i^2 + V(x_i))$. The eigenstates of the Fermi gas are given by the normalized Slater determinants

\[ \psi_{k_1, \ldots, k_n}(x_1, \ldots, x_n) = \frac{1}{\sqrt{n!}} \sum_{\sigma \in S_n} (-1)^{\sigma} \prod_{i=1}^n \varphi_{k_i}(x_{\sigma(i)}) = \frac{1}{\sqrt{n!}} \det(\varphi_{k_i}(x_j))_{1 \le i,j \le n} \qquad [10] \]

where $0 \le k_1 < k_2 < \cdots < k_n$. A probability distribution of $n$ particles in the Fermi gas is given by the squared absolute value of the eigenstate:

\[ p_n(x_1, \ldots, x_n) = |\psi(x_1, \ldots, x_n)|^2 = \frac{1}{n!} \det(\varphi_{k_i}(x_j))_{1 \le i,j \le n} \cdot \det(\overline{\varphi_{k_j}(x_i)})_{1 \le i,j \le n} = \frac{1}{n!} \det(K_n(x_i, x_j))_{1 \le i,j \le n} \qquad [11] \]

where $K_n(x, y) = \sum_{i=1}^n \varphi_{k_i}(x)\,\overline{\varphi_{k_i}(y)}$ is the kernel of the orthogonal projector onto the subspace spanned by the $n$ eigenfunctions $\{\varphi_{k_i}\}$ of $H$. The n-dimensional probability distribution [11] defines a determinantal random point field with $n$ particles. The k-point correlation functions are given by

\[ \rho_k(x_1, \ldots, x_k) = \frac{n!}{(n-k)!} \int p_n(x_1, \ldots, x_n)\,dx_{k+1} \cdots dx_n = \det(K_n(x_i, x_j))_{1 \le i,j \le k} \qquad [12] \]

Random Matrix Models

Some of the most important ensembles of random matrices fall into the class of determinantal random point processes.

The archetypal ensemble of Hermitian random matrices is the so-called Gaussian unitary ensemble (GUE). Let us consider the space of $n \times n$ Hermitian matrices $\{A = (A_{ij})_{1 \le i,j \le n},\ \mathrm{Re}(A_{ij}) = \mathrm{Re}(A_{ji}),\ \mathrm{Im}(A_{ij}) = -\mathrm{Im}(A_{ji})\}$. A GUE random matrix is defined by its probability distribution

\[ P(dA) = \mathrm{const}_n \cdot \exp(-\mathrm{tr}\,A^2)\,dA \qquad [13] \]

where $dA$ is a Lebesgue measure, that is, $dA = \prod_{i<j} d\mathrm{Re}(A_{ij})\,d\mathrm{Im}(A_{ij}) \prod_{k=1}^n dA_{kk}$. The eigenvalues of a random Hermitian matrix are real random
variables, whose joint probability distribution is a determinantal random point process of $n$ particles on the real line. The correlation kernel has the Christoffel–Darboux form built from the Hermite polynomials.

The GUE ensemble of random matrices is invariant under the unitary transformation $A \to UAU^{-1}$, $U \in U(n)$. An important generalization of [13] that preserves the unitary invariance is

\[ P(dA) = \mathrm{const}_n \exp(-\mathrm{tr}\,V(A))\,dA \qquad [14] \]

where, for example, $V(x)$ is a polynomial of even degree with a positive leading coefficient. The correlation functions of the eigenvalues in [14] are again determinantal, and the Hermite polynomials in the correlation kernel have to be replaced by the orthonormal polynomials with respect to the weight $\exp(-V(x))$. For details, we refer the reader to the monographs by Mehta (2004) and Deift (2000).

There are many other ensembles of random matrices for which the joint distribution of the eigenvalues has determinantal point correlation functions: classical compact groups with respect to the Haar measure, complex non-Hermitian Gaussian random matrices, positive Hermitian random matrices of the Wishart type, and chains of correlated Hermitian matrices. We refer the reader to Soshnikov (2000) for more information.

Discrete Translation-Invariant Determinantal Random Point Fields

Let $g : \mathbb{T}^d \to [0, 1]$ be a Lebesgue-measurable function on the d-dimensional torus $\mathbb{T}^d$. Assume that $0 \le g \le 1$. A configuration $\xi$ in $\mathbb{Z}^d$ can be thought of as a 0–1 function on $\mathbb{Z}^d$, that is, $\xi(x) = 1$ if $x \in \xi$ and $\xi(x) = 0$ otherwise. We define a $\mathbb{Z}^d$-invariant probability measure $\Pr$ on the Borel sets of $X = \{0, 1\}^{\mathbb{Z}^d}$ in such a way that

\[ \rho_k(x_1, \ldots, x_k) = \Pr(\xi(x_1) = 1, \ldots, \xi(x_k) = 1) := \det(\hat{g}(x_i - x_j))_{1 \le i,j \le k} \qquad [15] \]

for $x_1, \ldots, x_k \in \mathbb{Z}^d$. In the above formula, $\{\hat{g}(n)\}$ are the Fourier coefficients of $g$, that is, $g(x) = \sum_n \hat{g}(n) e^{i n x}$. It is clear from Definition 3 that [15] defines a determinantal random point field on $\mathbb{Z}^d$ with the translation-invariant kernel $K(x, y) = \hat{g}(x - y)$. Below we discuss several examples that fall into this category. For further discussion we refer the reader to Lyons (2003) and Soshnikov (2000).

1. In the trivial case when $g$ is identically a constant $p \in [0, 1]$, we obtain the i.i.d. Bernoulli probability measure.
2. The edges of the uniform spanning tree in $\mathbb{Z}^2$ parallel to the horizontal axis can be viewed as the determinantal random point field in $\mathbb{Z}^2$ with

\[ g(x, y) = \frac{\sin^2 \pi x}{\sin^2 \pi x + \sin^2 \pi y} \]

   Similarly, the edges of the uniform spanning forest in $\mathbb{Z}^d$ parallel to the $x_1$-axis correspond to the function

\[ g(x_1, \ldots, x_d) = \frac{\sin^2 \pi x_1}{\sum_{i=1}^d \sin^2 \pi x_i} \]

   (the uniform spanning forest on $\mathbb{Z}^d$ is a tree only for $d \le 4$). The result is due to Burton and Pemantle (1993).
3. Let $d = 1$ and $\rho$ be a parameter between 0 and 1. Consider

\[ g(x) = \frac{(1 - \rho)^2}{|e^{2\pi i x} - \rho|^2} \]

   The corresponding probability measure is a renewal process and

\[ K(n) = \hat{g}(n) = \frac{1 - \rho}{1 + \rho}\,\rho^{|n|} \]

   (see Soshnikov (2000)).
4. The process with $g(x) = \chi_I(x)$, where $I$ is an arbitrary arc of a unit circle, appeared in the work of Borodin and Olshanski (2000). The corresponding correlation kernel is known as the discrete sine kernel. The determinantal random point process on $\mathbb{Z}^1$ with the discrete sine kernel describes the typical form of large Young diagrams "in the bulk" (see the next subsection).
5. The discrete sine correlation kernel with $g = \chi_{[0,1/2]}$ appeared in the zig-zag process (Johansson 2002) derived from the uniform domino tilings in the plane.

Determinantal Measures on Partitions

By a partition of $n = 1, 2, \ldots$ we understand a collection of non-negative integers $\lambda = (\lambda_1, \ldots, \lambda_m)$ such that $\lambda_1 + \cdots + \lambda_m = n$ and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m$. We shall use the notation $\mathrm{Par}(n)$ for the set of all partitions of $n$.

The Plancherel measure $M_n$ on the set $\mathrm{Par}(n)$ is defined as

\[ M_n(\lambda) = \frac{(\dim \lambda)^2}{n!} \qquad [16] \]

where $\dim \lambda$ is the dimension of the corresponding irreducible representation of the symmetric group
$S_n$. Let $\mathrm{Par} = \bigsqcup_{n=0}^{\infty} \mathrm{Par}(n)$. Consider a probability measure $M^\theta$ on $\mathrm{Par}$

\[ M^\theta(\lambda) = e^{-\theta}\,\frac{\theta^n}{n!}\,M_n(\lambda), \quad \text{where } \lambda \in \mathrm{Par}(n),\ n = 0, 1, 2, \ldots;\ 0 \le \theta < \infty \qquad [17] \]

$M^\theta$ is called the Poissonization of the measures $M_n$. The analysis of the asymptotic properties of $M_n$ and $M^\theta$ has been important in connection to the famous Ulam problem and related questions in representation theory.

It was shown by Borodin and Okounkov (2000), and, independently, Johansson (2001) that $M^\theta$ is a determinantal random point field. The corresponding correlation kernel $K$ (in the so-called modified Frobenius coordinates) is a so-called discrete Bessel kernel on $\mathbb{Z}^1$,

\[ K(x, y) = \begin{cases} \sqrt{\theta}\,\dfrac{J_{|x|-1/2}(2\sqrt{\theta})\,J_{|y|+1/2}(2\sqrt{\theta}) - J_{|x|+1/2}(2\sqrt{\theta})\,J_{|y|-1/2}(2\sqrt{\theta})}{|x| - |y|} & \text{if } xy > 0 \\[2ex] \sqrt{\theta}\,\dfrac{J_{|x|-1/2}(2\sqrt{\theta})\,J_{|y|-1/2}(2\sqrt{\theta}) - J_{|x|+1/2}(2\sqrt{\theta})\,J_{|y|+1/2}(2\sqrt{\theta})}{x - y} & \text{if } xy < 0 \end{cases} \qquad [18] \]

where $J_x(\cdot)$ is the Bessel function of order $x$. One can observe that the kernel $K(x, y)$ is not Hermitian, but the restriction of this kernel to the positive and negative semiaxis is Hermitian.

$M^\theta$ is a special case of an infinite parameter family of probability measures on $\mathrm{Par}$, called the Schur measures, and defined as

\[ M(\lambda) = \frac{1}{Z}\,s_\lambda(x)\,s_\lambda(y) \qquad [19] \]

where $s_\lambda$ are the Schur functions, $x = (x_1, x_2, \ldots)$ and $y = (y_1, y_2, \ldots)$ are parameters such that

\[ Z = \sum_{\lambda \in \mathrm{Par}} s_\lambda(x)\,s_\lambda(y) = \prod_{i,j} (1 - x_i y_j)^{-1} \qquad [20] \]

is finite and $\{x_i\}_{i=1}^{\infty} = \{\bar{y}_i\}_{i=1}^{\infty}$. It was shown by Okounkov (2001) that the Schur measures belong to the class of the determinantal random point fields.

Nonintersecting Paths of a Markov Process

Let $p_{t,s}(x, y)$ be the transition probability of a Markov process $\xi(t)$ on $\mathbb{R}$ with continuous trajectories and let $(\xi_1(t), \xi_2(t), \ldots, \xi_n(t))$ be $n$ independent copies of the process. A classical result of Karlin and McGregor (1959) states that if $n$ particles start at the positions $x_1^{(0)} < x_2^{(0)} < \cdots < x_n^{(0)}$, then the probability density of their joint distribution at time $t_1 > 0$, given that their paths have not intersected for all $0 \le t \le t_1$, is equal to

\[ \pi_{t_1}(x_1^{(1)}, \ldots, x_n^{(1)}) = \frac{1}{Z} \det\left(p_{0,t_1}(x_i^{(0)}, x_j^{(1)})\right)_{i,j=1}^n \]

provided the process $(\xi_1(t), \xi_2(t), \ldots, \xi_n(t))$ in $\mathbb{R}^n$ has a strong Markovian property.

Let $0 < t_1 < t_2 < \cdots < t_{M+1}$. The conditional probability density that the particles are in the positions $x_1^{(1)} < x_2^{(1)} < \cdots < x_n^{(1)}$ at time $t_1$, at the positions $x_1^{(2)} < x_2^{(2)} < \cdots < x_n^{(2)}$ at time $t_2$, ..., at the positions $x_1^{(M)} < x_2^{(M)} < \cdots < x_n^{(M)}$ at time $t_M$, given that at time $t_{M+1}$ they are at the positions $x_1^{(M+1)} < x_2^{(M+1)} < \cdots < x_n^{(M+1)}$ and their paths have not intersected, is then equal to

\[ \pi_{t_1, t_2, \ldots, t_M}(x_1^{(1)}, \ldots, x_n^{(M)}) = \frac{1}{Z_{n,M}} \prod_{l=0}^{M} \det\left(p_{t_l, t_{l+1}}(x_i^{(l)}, x_j^{(l+1)})\right)_{i,j=1}^n \qquad [21] \]

where $t_0 = 0$.

It is not difficult to show that [21] can be viewed as a determinantal random point process (see, e.g., Johansson (2003)). Formulas of a similar type also appeared in the papers by Johansson, Prähofer, Spohn, Ferrari, Forrester, Nagao, Katori, and Tanemura in the analysis of polynuclear growth models, random walks on a discrete circle, and related problems.

Ergodic Properties

As before, let $(X, \mathcal{B}, \Pr)$ be a random point field with a one-particle space $E$. Hence, $X$ is a space of the locally finite configurations of particles in $E$, $\mathcal{B}$ a Borel $\sigma$-algebra of measurable subsets of $X$, and $\Pr$ a probability measure on $(X, \mathcal{B})$. Throughout this section, we assume $E = \mathbb{R}^d$ or $\mathbb{Z}^d$. We define an action $\{T^t\}_{t \in E}$ of the additive group $E$ on $X$ in the following natural way:

\[ T^t : X \to X, \quad (T^t \xi)_i = (\xi)_i + t \qquad [22] \]

We recall that a random point field $(X, \mathcal{B}, P)$ is called translation invariant if, for any $A \in \mathcal{B}$, any $t \in E$, $\Pr(T^{-t} A) = \Pr(A)$. The translation invariance of the correlation kernel $K(x, y) = K(x - y, 0) =: K(x - y)$ implies the translation invariance of k-point correlation functions

\[ \rho_k(x_1 + t, \ldots, x_k + t) = \rho_k(x_1, \ldots, x_k) \quad \text{a.e.},\ k = 1, 2, \ldots;\ t \in E \qquad [23] \]

which, in turn, implies the translation invariance of the random point field. The ergodic properties
of such point fields were studied by several mathematicians (Soshnikov 2000, Shirai and Takahashi 2003, Lyons and Steif 2003). The first general result in this direction was obtained by Soshnikov (2000).

Theorem 2 Let $(X, \mathcal{B}, P)$ be a determinantal random point field with a translation-invariant correlation kernel. Then the dynamical system $(X, \mathcal{B}, P, \{T^t\})$ is ergodic, has the mixing property of any multiplicity, and its spectrum is absolutely continuous.

We refer the reader to the article on ergodic theory for the definitions of ergodicity, mixing property, absolutely continuous spectrum of the dynamical system, etc.

In the discrete case [15], $E = \mathbb{Z}^d$, more is known. Lyons and Steif (2003) proved that the shift dynamical system is Bernoulli, that is, it is isomorphic (in the ergodic theory sense) to an i.i.d. process. Under the additional conditions $\mathrm{Spec}(K) \subset (0, 1)$ and $\sum_n |n|\,|K(n)|^2 < \infty$, Shirai and Takahashi (2003a) proved the uniform mixing property.

Gibbsian Properties

Costin and Lebowitz (1995) were the first to question the Gibbsian nature of the determinantal random point fields; they studied the continuous determinantal random point process on $\mathbb{R}^1$ with a so-called sine correlation kernel

\[ K(x, y) = \frac{\sin(\pi(x - y))}{\pi(x - y)} \]

The first rigorous result (in the discrete case) was established by Shirai and Takahashi (2003b).

Theorem 3 Let $E$ be a countable discrete space and $K$ a symmetric bounded operator on $l^2(E)$. Assume that $\mathrm{Spec}(K) \subset (0, 1)$. Then $(X, \mathcal{B}, P)$ is a Gibbs measure with the potential $U$ given by $U(x|\xi) = -\log(J(x, x) - \langle J_\xi^{-1} j_\xi^x, j_\xi^x \rangle)$, where $x \in E$, $\xi \in X$, $\{x\} \cap \xi = \emptyset$. Here $J(x, y)$ stands for the kernel of the operator $J = (\mathrm{Id} - K)^{-1} K$, and we set $J_\xi = (J(y, z))_{y,z \in \xi}$ and $j_\xi^x = (J(x, y))_{y \in \xi}$.

We recall that the Gibbsian property of the probability measure $P$ on $(X, \mathcal{B})$ means that

\[ \mathbb{E}[F|\mathcal{B}_{\Lambda^c}](\xi) = \frac{1}{Z_{\Lambda,\xi}} \sum_{\zeta \subset \Lambda} e^{-U(\zeta|\xi_{\Lambda^c})}\,F(\zeta \cup \xi_{\Lambda^c}) \]

where $\Lambda$ is a finite subset of $E$, $Z_{\Lambda,\xi}$ is the normalizing constant, $\mathcal{B}_{\Lambda^c}$ is the $\sigma$-algebra generated by the $\mathcal{B}$-measurable functions with the support outside of $\Lambda$, and $\mathbb{E}[F|\mathcal{B}_{\Lambda^c}]$ is the conditional mathematical expectation of the integrable function $F$ on $(X, \mathcal{B}, P)$ with respect to the $\sigma$-algebra $\mathcal{B}_{\Lambda^c}$. The potential $U$ is uniquely defined by the values of $U(x|\xi)$, as follows from the following recursive relation:

\[ U(\{x_1, \ldots, x_n\}|\xi) = U(x_n|\{x_1, \ldots, x_{n-1}\} \cup \xi) + U(x_{n-1}|\{x_1, \ldots, x_{n-2}\} \cup \xi) + \cdots + U(x_1|\xi) \]

For additional information about the Gibbsian property, see Introductory Articles: Equilibrium Statistical Mechanics. Much less is known in the continuous case. Some generalized form of Gibbsianness, under quite restrictive conditions, was recently established by Georgii and Yoo (2005).

Central Limit Theorem for Counting Function

In this section, we discuss the central-limit theorem type results for the linear statistics. The first important result in this direction was established by Costin and Lebowitz in 1995, who proved the central-limit theorem for the number of particles in the growing box, $\#_{[-L,L]}$, $L \to \infty$, in the case of the determinantal random point process on $\mathbb{R}^1$ with the sine correlation kernel

\[ K(x, y) = \frac{\sin(\pi(x - y))}{\pi(x - y)} \]

Below we formulate the Costin–Lebowitz theorem in its general form due to Soshnikov (1999, 2000).

Theorem 4 Let $E$ be a one-particle space, $\{0 \le K_t \le 1\}$ a family of locally trace-class operators in $L^2(E)$, $\{(X, \mathcal{B}, P_t)\}$ a family of the corresponding determinantal random point fields in $E$, and $\{I_t\}$ a family of measurable subsets in $E$ such that

\[ \mathrm{Var}\,\#_{I_t} = \mathrm{tr}\left(K_t \cdot \chi_{I_t} - (K_t \cdot \chi_{I_t})^2\right) \to \infty \quad \text{as } t \to \infty \qquad [24] \]

Then the distribution of the normalized number of particles in $I_t$ (with respect to $P_t$) converges to the normal law, that is,

\[ \frac{\#_{I_t} - \mathbb{E}\,\#_{I_t}}{\sqrt{\mathrm{Var}\,\#_{I_t}}} \xrightarrow{\;w\;} N(0, 1) \]

An analogous result holds for the joint distribution of the counting functions $\{\#_{I_t^1}, \ldots, \#_{I_t^k}\}$, where $I_t^1, \ldots, I_t^k$ are disjoint measurable subsets in $E$.
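Condition [24] can be explored numerically in a discrete setting. The sketch below is an illustration under stated assumptions rather than part of the theorem: it takes $E = \mathbb{Z}$, the translation-invariant discrete sine kernel written as $K(m, n) = \sin(\pi a (m - n)) / (\pi (m - n))$ with $K(n, n) = a$ (the form for a centered arc of relative length $a$, an assumed normalization), and $I_L = \{1, \ldots, L\}$. Since $K$ is Hermitian, $\mathrm{tr}(K_I - K_I^2) = \sum_{n \in I} K(n, n) - \sum_{m,n \in I} |K(m, n)|^2$. The function names are hypothetical.

```python
import math

def sine_kernel(d: int, a: float) -> float:
    """Discrete sine kernel K(m, n) as a function of d = m - n, with density a in (0, 1)."""
    if d == 0:
        return a
    return math.sin(math.pi * a * d) / (math.pi * d)

def counting_variance(L: int, a: float = 0.5) -> float:
    """Var #_I = tr(K_I - K_I^2) for I = {1, ..., L}, using the Toeplitz structure of K."""
    trace_k = L * a
    # Sum of K(m - n)^2 over m, n in I: the offset d occurs (L - |d|) times.
    trace_k2 = L * a * a + 2.0 * sum(
        (L - d) * sine_kernel(d, a) ** 2 for d in range(1, L)
    )
    return trace_k - trace_k2

for L in (25, 100, 400, 1600):
    v = counting_variance(L)
    print(L, v, v / math.log(L))
```

In this example the variance grows without bound as $L$ increases, but only logarithmically, so it stays far smaller than the mean $aL$; the hypothesis of Theorem 4 asks only that the variance diverge, not that it grow at any particular rate.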
The proof of the Costin–Lebowitz theorem uses the k-point cluster functions. In the determinantal case, the cluster functions have a simple form

\[ r_k(x_1, \ldots, x_k) = (-1)^k\,\frac{1}{k} \sum_{\sigma \in S_k} K(x_{\sigma(1)}, x_{\sigma(2)})\,K(x_{\sigma(2)}, x_{\sigma(3)}) \cdots K(x_{\sigma(k)}, x_{\sigma(1)}) \qquad [25] \]

The importance of the cluster function stems from the fact that the integrals of the k-point cluster function over the k-cube with a side $I$ can be expressed as a linear combination of the first $k$ cumulants of the counting random variable $\#_I$. In other words,

\[ \int_{I \times \cdots \times I} r_k(x_1, \ldots, x_k)\,dx_1 \cdots dx_k = \sum_{l=1}^{k} \beta_{kl}\,C_l(\#_I) \qquad [26] \]

It follows from [25] that the integral at the LHS of [26] equals, up to a factor $(-1)^k (k - 1)!$, the trace of the kth power of the restriction of $K$ to $I$. This allows one to estimate the cumulants of the counting random variable $\#_I$. For details, we refer the reader to Soshnikov (2000). The central-limit theorem for a general class of linear statistics, under some technical assumptions on the correlation kernel, was proved in Soshnikov (2002). Finally, we refer the reader to Soshnikov (2000) for the functional central-limit theorem for the empirical distribution function of the nearest spacings.

Generalizations: Immanantal and Pfaffian Point Processes

In this section, we discuss two important generalizations of the determinantal point processes.

Immanantal Processes

Immanantal random point processes were introduced by P Diaconis and S N Evans in 2000. Let $\lambda$ be a partition of $n$. Denote by $\chi^\lambda$ the character of the corresponding irreducible representation of the symmetric group $S_n$. Let $K(x, y)$ be a non-negative-definite, Hermitian kernel. An immanantal random point process is defined through the correlation functions

\[ \rho_n(x_1, \ldots, x_n) = \sum_{\sigma \in S_n} \chi^\lambda(\sigma) \prod_{i=1}^{n} K(x_i, x_{\sigma(i)}) \qquad [27] \]

In other words, the correlation functions are given by the immanants of the matrix with the entries $K(x_i, x_j)$. We will denote the RHS of [27] by $K^\lambda[x_1, \ldots, x_n]$.

In the special case $\lambda = (1^n)$ (i.e., $\lambda$ consists of $n$ parts, all of which are equal to 1), one obtains that $\chi^\lambda(\sigma) = (-1)^\sigma$, and $K^\lambda[x_1, \ldots, x_n] = \det(K(x_i, x_j))$. Therefore, in the case $\lambda = (1^n)$ the random point process with the correlation functions [27] is a determinantal random point process. When $\lambda = (n)$ (i.e., the partition has only one part, namely $n$) we have $\chi^\lambda = 1$ identically, and $K^\lambda[x_1, \ldots, x_n] = \mathrm{per}(K(x_i, x_j))$, the permanent of the matrix $K(x_i, x_j)$. The corresponding random point process is known as the boson random point process.

Pfaffian Processes

Let

\[ K(x, y) = \begin{pmatrix} K_{11}(x, y) & K_{12}(x, y) \\ K_{21}(x, y) & K_{22}(x, y) \end{pmatrix} \]

be an antisymmetric $2 \times 2$ matrix-valued kernel, that is, $K_{ij}(x, y) = -K_{ji}(y, x)$, $i, j = 1, 2$. The kernel defines an integral operator acting on $L^2(E) \oplus L^2(E)$, which we assume to be locally trace class. A random point process on $E$ is called Pfaffian if its point correlation functions have a Pfaffian form

\[ \rho_k(x_1, \ldots, x_k) = \mathrm{pf}(K(x_i, x_j))_{i,j=1,\ldots,k}, \quad k \ge 1 \qquad [28] \]

The RHS of [28] is the Pfaffian of the $2k \times 2k$ antisymmetric matrix (since each entry $K(x_i, x_j)$ is a $2 \times 2$ block). Determinantal random point processes are a special case of the Pfaffian processes, corresponding to the matrix kernel of the form

\[ K(x, y) = \begin{pmatrix} 0 & \tilde{K}(x, y) \\ -\tilde{K}(y, x) & 0 \end{pmatrix} \]

where $\tilde{K}$ is a scalar kernel. The best-known examples of the Pfaffian random point processes that cannot be reduced to determinantal form are the $\beta = 1$ and $\beta = 4$ polynomial ensembles of random matrices and their limits (in the bulk and at the edge of the spectrum), as the size of a matrix goes to infinity.

Acknowledgment

The research of A Soshnikov was supported in part by the NSF grant DMS-0405864.

See also: Dimer Problems; Ergodic Theory; Growth Processes in Random Matrix Theory; Integrable Systems in Random Matrix Theory; Percolation Theory; Quantum Ergodicity and Mixing of Eigenfunctions; Random Matrix Theory in Physics; Random Partitions; Statistical Mechanics and Combinatorial Problems; Symmetry Classes in Random Matrix Theory.
Further Reading

Borodin A and Olshanski G (2000) Distribution on partitions, point processes, and the hypergeometric kernel. Communications in Mathematical Physics 211: 335–358.
Daley DJ and Vere-Jones D (1988) An Introduction to the Theory of Point Processes. New York: Springer.
Diaconis P and Evans SN (2000) Immanants and finite point processes. Journal of Combinatorial Theory Series A 91(1–2): 305–321.
Georgii H-O and Yoo HJ (2005) Conditional intensity and Gibbsianness of determinantal point processes. Journal of Statistical Physics 118(1/2): 55–84.
Johansson K (2003) Discrete polynuclear growth and determinantal processes. Communications in Mathematical Physics 242(1–2): 277–329.
Lyons R (2003) Determinantal probability measures. Publications Mathématiques Institut de Hautes Études Scientifiques 98: 167–212.
Lyons R and Steif J (2003) Stationary determinantal processes: phase multiplicity, Bernoullicity, entropy, and domination. Duke Mathematical Journal 120(3): 515–575.
Macchi O (1975) The coincidence approach to stochastic point processes. Advances in Applied Probability 7: 83–122.
Okounkov A (2001) Infinite wedge and random partitions. Selecta Mathematica. New Series 7(1): 57–81.
Shirai T and Takahashi Y (2003a) Random point fields associated with certain Fredholm determinants, I. Fermion, Poisson and boson point processes. Journal of Functional Analysis 205(2): 414–463.
Shirai T and Takahashi Y (2003b) Random point fields associated with certain Fredholm determinants, II. Fermion shifts and their ergodic and Gibbs properties. The Annals of Probability 31(3): 1533–1564.
Soshnikov A (2000) Determinantal random point fields. Russian Mathematical Surveys 55: 923–975.
Soshnikov A (2002) Gaussian limit for determinantal random point fields. The Annals of Probability 30(1): 171–187.
Soshnikov A (2004) Janossy densities of coupled random matrices. Communications in Mathematical Physics 251: 447–471.
Spohn H (1987) Interacting Brownian particles: a study of Dyson's model. In: Papanicolau G (ed.) Hydrodynamic Behavior and Interacting Particle Systems, IMA Vol. Math. Appl. vol. 9, pp. 157–179. New York: Springer.
Tracy CA and Widom H (1998) Correlation functions, cluster functions, and spacing distributions for random matrices. Journal of Statistical Physics 92(5/6): 809–835.

Diagrammatic Techniques in Perturbation Theory

G Gentile, Università degli Studi "Roma Tre", Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Consider the dynamical system on $\mathbb{R}^d$ described by the equation

\dot{u} = \frac{du}{dt} = G(u) + \varepsilon F(u)   [1]

where $F, G : S \subset \mathbb{R}^d \to \mathbb{R}^d$ are analytic functions and $\varepsilon$ a real (small) parameter. Suppose also that for $\varepsilon = 0$ a solution $u_0 : \mathbb{R} \to S$ (for some initial condition $u_0(0) = \bar{u}$) is known.

We look for a solution of [1] which is a perturbation of $u_0$, that is, for a solution $u$ which can be written in the form $u = u_0 + U$, with $U = O(\varepsilon)$ and $U(0) = \bar{U} \equiv u(0) - \bar{u}$. Then we consider the variational equation

\dot{U} = M(t) U + \Phi(t), \qquad M_{ij}(t) = \partial_{u_i} G_j(u_0(t))   [2]

where $\Phi(t) = \tilde{\Phi}(u_0(t), U)$, with $\tilde{\Phi}(u_0, U) = G(u_0 + U) - G(u_0) - \partial_u G(u_0) U + \varepsilon F(u_0 + U)$. By defining the Wronskian matrix $W$ as the solution of the matrix equation $\dot{W} = M(t) W$ such that $W(0) = 1$ (the columns of $W$ are given by $d$ independent solutions of the linear equation $\dot{u} = M(t) u$), we can write

U(t) = W(t) \bar{U} + W(t) \int_0^t d\tau \, W^{-1}(\tau) \Phi(\tau)   [3]

If we expect the solution $U$ to be of order $\varepsilon$, we can try to write it as a Taylor series in $\varepsilon$, that is,

U(t) = \sum_{k=1}^{\infty} \varepsilon^k U^{(k)}(t)   [4]

and, by inserting [4] into [3] and equating the coefficients with the same Taylor order, we obtain

U^{(k)}(t) = W(t) \bar{U}^{(k)} + W(t) \int_0^t d\tau \, W^{-1}(\tau) \Phi^{(k)}(\tau)   [5]

where $\Phi^{(k)}(t)$ is defined as

\Phi^{(1)}(t) = F(u_0(t))
\Phi^{(k)}(t) = \sum_{p=2}^{\infty} \frac{1}{p!} \frac{\partial^p G}{\partial u^p}(u_0(t)) \sum_{k_1 + \cdots + k_p = k} U^{(k_1)} \cdots U^{(k_p)} + \sum_{p=1}^{\infty} \frac{1}{p!} \frac{\partial^p F}{\partial u^p}(u_0(t)) \sum_{k_1 + \cdots + k_p = k-1} U^{(k_1)} \cdots U^{(k_p)}, \qquad k \ge 2   [6]
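As an illustration of how [5] and [6] are used order by order, the following minimal sketch treats a scalar toy problem that is not taken from the text: $d = 1$, $G(u) = -u$, $F(u) = u^2$, $u(0) = \bar{u} = 1$, so that $u_0(t) = e^{-t}$, $W(t) = e^{-t}$ and $\bar{U}^{(k)} = 0$. Since $G$ is linear, only the $F$ part of [6] survives, with $p = 1, 2$. The function name, the grid size and the trapezoidal quadrature are assumptions made for the example. For this particular equation the exact coefficients are $U^{(k)}(t) = e^{-t}(1 - e^{-t})^k$, which gives a check on the recursion.

```python
import math

def perturbation_coefficients(kmax, t_end=1.0, n=2000):
    """Coefficients U^(k) on a uniform grid for u' = -u + eps*u**2, u(0) = 1.

    Toy model (an assumption of this sketch): G(u) = -u, F(u) = u**2.
    """
    h = t_end / n
    ts = [i * h for i in range(n + 1)]
    u0 = [math.exp(-t) for t in ts]   # unperturbed solution u_0(t)
    w = u0                            # Wronskian W(t) = exp(-t), W(0) = 1
    U = {}
    for k in range(1, kmax + 1):
        if k == 1:
            phi = [x * x for x in u0]                 # Phi^(1) = F(u_0)
        else:
            phi = []
            for i in range(n + 1):
                val = 2.0 * u0[i] * U[k - 1][i]       # p = 1 term of [6]
                val += sum(U[j][i] * U[k - 1 - j][i]  # p = 2 term of [6]
                           for j in range(1, k - 1))
                phi.append(val)
        # U^(k)(t) = W(t) * int_0^t W^{-1}(tau) Phi^(k)(tau) dtau, as in [5]
        integrand = [phi[i] / w[i] for i in range(n + 1)]
        integral, acc = [0.0], 0.0
        for i in range(1, n + 1):
            acc += 0.5 * h * (integrand[i - 1] + integrand[i])
            integral.append(acc)
        U[k] = [w[i] * integral[i] for i in range(n + 1)]
    return ts, U

if __name__ == "__main__":
    ts, U = perturbation_coefficients(3)
    for k in (1, 2, 3):
        exact = math.exp(-1.0) * (1.0 - math.exp(-1.0)) ** k
        print(k, U[k][-1], exact)
```

Each $\Phi^{(k)}$ in the loop is built only from $U^{(1)}, \ldots, U^{(k-1)}$, which is what makes the construction recursive.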

Hence $\Phi^{(k)}(t)$ depends only on coefficients of orders strictly less than $k$. In this way, we obtain an algorithm useful for constructing the solution recursively, so that the problem is solved, up to (substantial) convergence problems.

Historical Excursus

The study of a system like [1] by following the strategy outlined above can be hopeless if we do not make some further assumptions on the types of motions we are looking for.

We shall see later, in a concrete example, that the coefficients $U^{(k)}(t)$ can increase in time, in a $k$-dependent way, thus preventing the convergence of the series for large $t$. This is a general feature of this class of problems: if no care is taken in the choice of the initial datum, the algorithm can provide a reliable description of the dynamics only for a very short time.

However, if one looks for solutions having a special dependence on time, things can work better. This happens, for instance, if one looks for quasiperiodic solutions, that is, functions which depend on time through the variable $\psi = \omega t$, with $\omega \in \mathbb{R}^N$ a vector with rationally independent components, that is, such that $\omega \cdot \nu \neq 0$ for all $\nu \in \mathbb{Z}^N \setminus \{0\}$ (the dot denotes the standard inner product, $\omega \cdot \nu = \omega_1 \nu_1 + \cdots + \omega_N \nu_N$). A typical problem of interest is: what happens to a quasiperiodic solution $u_0(t)$ when a perturbation $\varepsilon F$ is added to the unperturbed vector field $G$, as in [1]? Situations of this type arise when considering perturbations of integrable systems: a classical example is provided by planetary motion in celestial mechanics.

Perturbation series such as [4] have been extensively studied by astronomers in order to obtain a more accurate description of the celestial motions compared to that following from Kepler's theory (in which all interactions between planets are neglected and the planets themselves are considered as points). In particular, we recall the works of Newcomb and Lindstedt (series such as [4] are now known as Lindstedt series). At the end of the nineteenth century, Poincaré showed that the series describing quasiperiodic motions are well defined up to any perturbation order $k$ (at least if the perturbation is a trigonometric polynomial), provided that the components of $\omega$ are assumed to be rationally independent: this means that, under this condition, the coefficients $U^{(k)}(t)$ are defined for all $k \in \mathbb{N}$. However, Poincaré also showed that, in general, the series are divergent; this is due to the fact that, as seen later, in the perturbation series small divisors $\omega \cdot \nu$ appear, which, even if they do not vanish, can be arbitrarily close to zero.

The convergence of the series can indeed be proved (more generally for analytic perturbations, or even those that are differentiably smooth enough) by assuming on $\omega$ a stronger nonresonance condition, such as the Diophantine condition

|\omega \cdot \nu| > \frac{C_0}{|\nu|^{\tau}} \qquad \forall \nu \in \mathbb{Z}^N \setminus \{0\}   [7]

where $|\nu| = |\nu_1| + \cdots + |\nu_N|$, and $C_0$ and $\tau$ are positive constants. We note that the set of vectors satisfying [7] for some positive constant $C_0$ has full measure in $\mathbb{R}^N$ provided one takes $\tau > N - 1$.

Such a result is part of the Kolmogorov–Arnold–Moser (KAM) theorem, and it was first proved by Kolmogorov in 1954, following an approach quite different from the one described here. New proofs were given in 1962 by Arnol'd and by Moser, but only very recently, in 1988, Eliasson gave a proof in which a bound $C^k$ is explicitly derived for the coefficients $U^{(k)}(t)$, again implying convergence for $\varepsilon$ small enough.

Eliasson's work was not immediately known widely, and only after publication of papers by Gallavotti and by Chierchia and Falcolini, in which Eliasson's ideas were revisited, did his work become fully appreciated. The study of perturbation series [4] employs techniques very similar to those typical of a very different field of mathematical physics, quantum field theory, even if such an analogy was stressed and used to full extent only in subsequent papers.

The techniques have so far been applied to a wide class of problems of dynamical systems: a list of original results is given at the end.

A Paradigmatic Example

Consider the case $S = \mathcal{A} \times \mathbb{T}^N$, with $\mathcal{A}$ an open subset of $\mathbb{R}^N$, and let $H_0 : \mathcal{A} \to \mathbb{R}$ and $f : \mathcal{A} \times \mathbb{T}^N \to \mathbb{R}$ be two analytic functions. Then consider the Hamiltonian system with Hamiltonian $H(A, \alpha) = H_0(A) + \varepsilon f(A, \alpha)$. The corresponding equations describe a dynamical system of the form [1], with $u = (A, \alpha)$, which can be written explicitly:

\dot{A} = -\varepsilon \, \partial_\alpha f(A, \alpha)
\dot{\alpha} = \partial_A H_0(A) + \varepsilon \, \partial_A f(A, \alpha)   [8]

Suppose, for simplicity, $H_0(A) = A^2/2$ and $f(A, \alpha) = f(\alpha)$, where $A^2 = A \cdot A$. Then, we obtain for $\alpha$ the following closed equation:

\ddot{\alpha} = -\varepsilon \, \partial_\alpha f(\alpha)   [9]

while $A$ can be obtained by direct integration once [9] has been solved. For $\varepsilon = 0$, [9] gives trivially $\alpha = \alpha_0(t) \equiv \alpha_0 + \omega t$, where $\omega = \partial_A H_0(A_0) = A_0$ is

called the rotation (or frequency) vector. Hence, for $\varepsilon = 0$ all solutions are quasiperiodic. We are interested in the preservation of quasiperiodic solutions when $\varepsilon \neq 0$.

For $\varepsilon \neq 0$, we can write, as in [3],

\alpha = \alpha_0(t) + a(t), \qquad a(t) = \sum_{k=1}^{\infty} \varepsilon^k a^{(k)}(t)   [10]

where $a^{(k)}$ is determined as the solution of the equation

a^{(k)}(t) = \bar{a}^{(k)} + t \bar{A}^{(k)} - \int_0^t d\tau \int_0^{\tau} d\tau' \, [\partial_\alpha f(\alpha(\tau'))]^{(k-1)}   [11]

with $[\partial_\alpha f(\alpha(\tau'))]^{(k-1)}$ expressed as in [6].

The quasiperiodic solutions with rotation vector $\omega$ could be written as a Fourier series, by

a^{(k)}(t) = \sum_{\nu \in \mathbb{Z}^N} e^{i \nu \cdot \omega t} a^{(k)}_{\nu}   [12]

with $\omega$ as before. If the series [10], with the Taylor coefficients as in [12], exists, it will describe a quasiperiodic solution analytic in $\varepsilon$, and in such a case we say that it is obtained by continuation of the unperturbed one with rotation vector $\omega$, that is, $\alpha_0(t)$.

Suppose that the integrand $[\partial_\alpha f(\alpha(\tau'))]^{(k-1)}$ in [11] has vanishing average. Then the integral over $\tau'$ in [11] produces a quasiperiodic function, which in general has a nonvanishing average, so that the integral over $\tau$ produces a quasiperiodic function plus a term linear in $t$. If we choose $\bar{A}^{(k)}$ in [11] so as to cancel out exactly the term linear in time, we end up with a quasiperiodic function. In Fourier space, an explicit calculation gives, for all $\nu \neq 0$,

a^{(1)}_{\nu} = \frac{1}{(\omega \cdot \nu)^2} \, i \nu f_{\nu}
a^{(k)}_{\nu} = \frac{1}{(\omega \cdot \nu)^2} \sum_{p=1}^{\infty} \sum_{\substack{k_1 + \cdots + k_p = k-1 \\ \nu_0 + \nu_1 + \cdots + \nu_p = \nu}} \frac{(i \nu_0)^{p+1}}{p!} f_{\nu_0} \, a^{(k_1)}_{\nu_1} \cdots a^{(k_p)}_{\nu_p}, \qquad k \ge 2   [13]

which again is suitable for an iterative construction of the solution. The coefficients $a^{(k)}_0$ are left undetermined, and we can fix them (arbitrarily) as identically vanishing.

Of course, the property that the integrand in [11] has zero average is fundamental; otherwise, terms increasing as powers of $t$ would appear (the so-called secular terms). Indeed, it is easy to realize that, if this happened, to order $k$ terms proportional to $t^{2k}$ could be present, thus requiring, at best, $|\varepsilon| < |t|^{-2}$ for convergence up to time $t$. This would exclude a fortiori the possibility of quasiperiodic solutions.

The aforementioned property of zero average can be verified only if the rotation vector is nonresonant, that is, if its components are rationally independent or, more particularly, if the Diophantine condition [7] is satisfied. Such a result was first proved by Poincaré, and it holds irrespective of how the parameters $\bar{a}^{(k)}$ appearing in [11] are fixed. This reflects the fact that quasiperiodic motions take place on invariant surfaces (KAM tori), which can be parameterized in terms of the angle variables $\alpha(t)$, so that the values $\bar{a}^{(k)}$ contribute to the initial phases, and the latter can be arbitrarily fixed.

The recursive equations [13] can be suitably studied by introducing a diagrammatic representation, as explained below.

Graphs and Trees

A (connected) graph $G$ is a collection of points, called vertices, and lines connecting all of them. We denote with $V(G)$ and $L(G)$ the set of vertices and the set of lines, respectively. A path between two vertices is a minimal subset of $L(G)$ connecting the two vertices. A graph is planar if it can be drawn in a plane without graph lines crossing.

A tree is a planar graph $G$ containing no closed loops (cycles); in other words, it is a connected acyclic graph. One can consider a tree $G$ with a single special vertex $v_0$: this introduces a natural partial ordering on the set of lines and vertices, and one can imagine that each line carries an arrow pointing toward the vertex $v_0$. We can add an extra oriented line $\ell_0$ connecting the special vertex $v_0$ to another point which will be called the root of the tree; the added line will be called the root line. In this way, we obtain a rooted tree $\theta$ defined by $V(\theta) = V(G)$ and $L(\theta) = L(G) \cup \ell_0$. A labeled tree is a rooted tree $\theta$ together with a label function defined on the sets $V(\theta)$ and $L(\theta)$.

Two rooted trees which can be transformed into each other by continuously deforming the lines in the plane in such a way that the latter do not cross each other (i.e., without destroying the graph structure) will be said to be equivalent. This notion of equivalence can also be extended to labeled trees, simply by considering equivalent two labeled trees if they can be transformed into each other in such a way that the labels also match.

Given two vertices $v, w \in V(\theta)$, we say that $w \preceq v$ if $v$ is on the path connecting $w$ to the root line. One

can identify a line with the vertices it connects; given a line $\ell = (v, w)$, one says that $\ell$ enters $v$ and exits $w$. For each vertex $v$, we define the branching number as the number $p_v$ of lines entering $v$.

The number of unlabeled trees with $k$ vertices can be bounded by the number of random walks with $2k$ steps, that is, by $4^k$.

The labels are as follows: with each vertex $v$ we associate a mode label $\nu_v \in \mathbb{Z}^N$, and with each line we associate a momentum $\nu_\ell \in \mathbb{Z}^N$, such that the momentum of the line leaving the vertex $v$ is given by the sum of the mode labels of all vertices preceding $v$ (with $v$ being included): if $\ell = (v', v)$ then $\nu_\ell = \sum_{w \preceq v} \nu_w$. Note that for a fixed unlabeled tree the branching labels are uniquely determined, and, for a given assignment of the mode labels, the momenta of the lines are also uniquely determined.

Define

V_v = \frac{(i \nu_v)^{p_v + 1}}{p_v!} f_{\nu_v}, \qquad g_\ell = \frac{1}{(\omega \cdot \nu_\ell)^2}   [14]

where the tensor $V_v$ is referred to as the node factor of $v$ and the scalar $g_\ell$ as the propagator of the line $\ell$. One has $|f_\nu| \le F e^{-\kappa |\nu|}$, for suitable positive constants $F$ and $\kappa$, by the analyticity assumption. Then one can check that the coefficients $a^{(k)}_\nu$, defined in [12], for $\nu \neq 0$, can be expressed in terms of trees as

a^{(k)}_{\nu} = \sum_{\theta \in \Theta_{k,\nu}} \mathrm{Val}(\theta), \qquad \mathrm{Val}(\theta) = \Bigl( \prod_{v \in V(\theta)} V_v \Bigr) \Bigl( \prod_{\ell \in L(\theta)} g_\ell \Bigr)   [15]

where $\Theta_{k,\nu}$ denotes the set of all inequivalent trees with $k$ vertices and with momentum $\nu$ associated with the root line, while the coefficients $a^{(k)}_0$ can be fixed $a^{(k)}_0 = 0$ for all $k \ge 1$, by the arbitrariness of the initial phases previously remarked. The property that $[\partial_\alpha f(\alpha(\tau'))]^{(k-1)}$ in [11] has zero average for all $k \ge 1$ implies that for all lines $\ell \in L(\theta)$ one has $g_\ell = (\omega \cdot \nu_\ell)^{-2}$ only for $\nu_\ell \neq 0$, whereas $g_\ell = 1$ for $\nu_\ell = 0$, so that the numerical values $\mathrm{Val}(\theta)$ are well defined for all trees $\theta$. If $a^{(k)}_0 = 0$ for all $k \ge 1$, then $\nu_\ell \neq 0$ for all $\ell \in L(\theta)$.

The proof of [15] can be performed by induction on $k$. Alternatively, we can start from the recursive definition [13], whereby the trees naturally arise in the following way.

Represent graphically the coefficient $a^{(k)}_\nu$ as in Figure 1; to keep track of the labels $k$ and $\nu$, we assign $k$ to the black bullet and $\nu$ to the line. For $k = 1$, the black bullet is meant as a grey vertex (like the ones appearing in Figure 3).

Figure 1 Graphical representation of $a^{(k)}_\nu$.

Figure 2 Graphical representation of the recursive equation [13].

Figure 3 An example of tree to be summed over in [15] for $k = 39$. The labels are not explicitly shown. The momentum of the root line is $\nu$, so that the mode labels satisfy the constraint $\sum_{v \in V(\theta)} \nu_v = \nu$.

Then recursive equation [13] can be graphically represented as the diagram in Figure 2, provided that we associate with the (grey) vertex $v_0$ the node factor $V_{v_0}$, with $\nu_{v_0} = \nu_0$ and $p_{v_0} = p$ denoting the number of lines entering $v_0$, and with the lines $\ell_i$, $i = 1, \ldots, p$, entering $v_0$ the momenta $\nu_{\ell_i}$, respectively. Of course, the sums over $p$ and over the possible assignments of the labels $\{k_i\}_{i=1}^{p}$ and $\{\nu_i\}_{i=0}^{p}$ are understood. Each black bullet on the right-hand side of Figure 2, together with its exiting line, looks like the diagram on the left-hand side, so that it represents $a^{(k_i)}_{\nu_i}$, $i = 1, \ldots, p$. Note that Figure 2 has to be interpreted in the following way: if one associates with the diagram as drawn in the right-hand side a numerical value (as

described above) and one sums all the values over the assignments of the labels, then the resulting quantity is precisely $a^{(k)}_\nu$.

The (fundamental) difference between the black bullets on the right- and left-hand sides is that the labels $k_i$ of the former are strictly less than $k$, hence we can iterate the diagrammatic decomposition simply by expressing again each $a^{(k_i)}_{\nu_i}$ as $a^{(k)}_\nu$ in [13], and so on, until one obtains a tree with $k$ grey vertices and no black bullets; see Figure 3, where the labels are not explicitly written. This corresponds to the tree expansion [15].

Any tree appearing in [15] is an example of what physicists call a Feynman graph, while the diagrammatic rules one has to follow in order to associate to the tree $\theta$ its right numerical value $\mathrm{Val}(\theta)$ are usually called the Feynman rules for the model under consideration. Such a terminology is borrowed from quantum field theory.

Multiscale Analysis and Clusters

Suppose we replace [9] with $\alpha = -\varepsilon \, \partial_\alpha f(\alpha)$, so that no small divisors appear (that is, $g_\ell = 1$ in [14]). Then convergence is easily proved for $\varepsilon$ small enough, since (by using the identity $\sum_{v \in V(\theta)} p_v = k - 1$ and the inequality $e^{-x} x^k / k! \le 1$ for all $x \in \mathbb{R}_+$ and all $k \in \mathbb{N}$), one finds

\prod_{v \in V(\theta)} |V_v| \le \Bigl( \frac{4^2 F}{\kappa^2} \Bigr)^{k} e^{-\kappa |\nu| / 4} \Bigl( \prod_{v \in V(\theta)} e^{-\kappa |\nu_v| / 4} \Bigr)   [16]

and the sum over the mode labels can be performed by using the exponential decay factors $e^{-\kappa |\nu_v| / 4}$, while the sum over all possible unlabeled trees gives $4^k$. In particular, analyticity in $t$ follows.

Of course, the interesting case is when the propagators are present. In such a case, even if no division by zero occurs, as $\omega \cdot \nu_\ell \neq 0$ (by the assumed Diophantine condition [7] and the absence of secular terms discussed previously), the quantities $\omega \cdot \nu_\ell$ in [14] can be very small.

Then we can introduce a scale $h$ characterizing the size of each propagator: we say that a line $\ell$ has scale $h_\ell = h \ge 0$ if $\omega \cdot \nu_\ell$ is of order $2^{-h} C_0$ and scale $h_\ell = -1$ if $\omega \cdot \nu_\ell$ is greater than $C_0$ (of course, a more formal definition can be easily envisaged, for which the reader is referred to the original papers). Then, we can bound $|\omega \cdot \nu_\ell| \ge 2^{-h} C_0$ for any $\ell \in L(\theta)$, and write

\prod_{\ell \in L(\theta)} |g_\ell| \le C_0^{-2k} \prod_{h=0}^{\infty} 2^{2 h N_h(\theta)} \le C_0^{-2k} \, 2^{2 h_0 k} \exp \Bigl( 2 \log 2 \sum_{h=h_0}^{\infty} h N_h(\theta) \Bigr)   [17]

where $N_h(\theta)$ is the number of lines in $L(\theta)$ with scale $h$ and $h_0$ is a (so far arbitrary) positive integer. The problem is then reduced to that of finding an estimate for $N_h(\theta)$.

To identify which kinds of tree are the source of problems, we introduce the notion of a cluster and a self-energy graph. A cluster $T$ with scale $h_T$ is a connected set of nodes linked by a continuous path of lines with the same scale label $h_T$ or a lower one and which is maximal, namely all the lines not belonging to $T$ but connected to it have scales higher than $h_T$ and at least one line in $T$ has scale $h_T$. An inclusion relation is established between clusters, in such a way that the innermost clusters are the clusters with lowest scale, and so on. Each cluster $T$ can have an arbitrary number of lines coming into it (entering lines), but only one or zero lines coming out from it (exiting line): lines of $T$ which either enter or exit $T$ are called external lines. A cluster $T$ with only one entering line $\ell_T^2$ and with one exiting line $\ell_T^1$ such that one has $\nu_{\ell_T^1} = \nu_{\ell_T^2}$ will be called a self-energy graph (SEG) or resonance. In such a case, the line $\ell_T^1$ is called a resonant line. Examples of clusters and SEGs are suggested by the bubbles in Figure 4; the mode labels are not represented, whereas the scales of the lines are explicitly written.

Figure 4 Examples of clusters and SEGs. Note that the tree itself is a cluster (with scale 6), and each of the two clusters with one entering and one exiting line is a SEG only if the momenta of its external lines are equal to each other.

If $S_h(\theta)$ is the number of SEGs whose resonant lines have scales $h$, then $N_h^*(\theta) = N_h(\theta) - S_h(\theta)$ will denote the number of nonresonant lines with scale $h$.

A fundamental result, known as the Siegel–Bryuno lemma, shows that, for some positive constant $c$, one has

N_h^*(\theta) \le 2^{-h/\tau} \, c \sum_{v \in V(\theta)} |\nu_v|   [18]

which, if inserted into [17] instead of $N_h(\theta)$, would give a convergent series; then $h_0$ should be chosen in such a way that the sum of the series in [17] is less than, say, $\kappa \sum_{v \in V(\theta)} |\nu_v| / 8$.

The bound [18] is a very deep one, and was originally proved by Siegel for a related problem (Siegel's problem), in which, in the formalism followed here, SEGs do not occur; such a bound essentially shows that accumulation of small divisors is possible only in the presence of SEGs. A possible tree with $k$ vertices whose value can be proportional to some power of $k!$ is represented in Figure 5, where a chain of $(k - 1)/2$ SEGs, $k$ odd, is drawn with external lines carrying a momentum $\nu$ such that $\omega \cdot \nu \approx C_0 |\nu|^{-\tau}$.

Figure 5 Example of tree whose value grows like a factorial.

In order to take into account the resonant lines, we have to add a factor $(\omega \cdot \nu_\ell)^{-2}$ for each resonant line $\ell$. It is a remarkable fact that, even if there are trees whose value cannot be bounded as a constant to the power $k$, there are compensations (that is, partial cancellations) between the values of all trees with the same number of vertices, such that the sum of all such trees admits a bound of this kind.

The cancellations can be described graphically as follows. Consider a tree $\theta$ with a SEG $T$. Then take all trees which can be obtained by shifting the external lines of $T$, that is, by attaching such lines to all possible vertices internal to $T$, and sum together the values of all such trees. An example is given in Figure 6. The corresponding sum turns out to be proportional to $(\omega \cdot \nu)^2$, if $\nu$ is the momentum of the resonant line of $T$, and such a factor compensates exactly the propagator of this line. The argument above can be repeated for all SEGs: this requires a little care because there are SEGs which are inside some other SEGs. Again, for details and a more formal discussion, the reader is referred to original papers.

Figure 6 Example of SEGs whose values have to be summed together in order to produce the cancellation discussed in the text. The mode labels are all fixed.

The conclusion is that we can take into account the resonant lines: this simply adds an extra constant raised to the power $k$, so that an overall estimate $C^k$, for some $C > 0$, holds for $U^{(k)}(t)$, and the convergence of the series follows.

Other Examples and Applications

The discussion carried out so far proves a version of the KAM theorem, for the system described by [9], and it is inspired by the original papers by Eliasson (1996) and, mostly, by Gallavotti (1994).

Here we list some problems in which original results have been proved by means of the diagrammatic techniques described above, or by some variants of them. These are discussed in the following.

The first generalization one can think of is the problem of conservation in quasi-integrable systems of resonant tori (that is, invariant tori whose frequency vectors have rationally dependent components). Even if most of such tori disappear as an effect of the perturbation, some of them are conserved as lower-dimensional tori, which, generically, become of either elliptic or hyperbolic or mixed type according to the sign of $\varepsilon$ and the perturbation. With techniques extending those described here (introducing also, in particular, a suitable resummation procedure for divergent series), this has been done by Gallavotti and Gentile; see Gallavotti et al. (2004) and Gallavotti and Gentile (2005) for an account.

An expansion like the one considered so far can be envisaged also for the motions occurring on the stable and unstable manifolds of hyperbolic lower-dimensional tori for perturbations of Hamiltonians describing a system of rotators (as in the previous case) plus $n$ pendulum-like systems. In such a case, the function $G(u)$ has a less simple form. For $n = 1$, one can look for solutions which depend on time through two variables, $\psi = \omega t$ and $x = e^{-gt}$, with $(\omega, g) \in \mathbb{R}^{N+1}$, and $\omega$ Diophantine as before and $g$ related to the timescale of the pendulum. This has been worked out by Gallavotti (1994), and then used by Gallavotti et al. (1999) to study a class of three-timescale systems, in order to obtain a lower

bound on the homoclinic angles (i.e., the angles between the stable and unstable manifolds of hyperbolic tori which are preserved by the perturbation). The formalism becomes a little more involved, essentially because of the entries of the Wronskian matrix appearing in [5]. In such a case, the unperturbed solution $u_0(t)$ corresponds to the rotators moving linearly with rotation vector $\omega$ and the pendulum moving along its separatrix; a nontrivial fact is that if $g_0$ denotes the Lyapunov exponent of the pendulum in the absence of the perturbation, then one has to look for an expansion in $x = e^{-gt}$ with $g = g_0 + O(\varepsilon)$, because the perturbation changes the value of such an exponent.

The same techniques have also been applied to study the relation of the radius of convergence of the Lindstedt series for the standard map, an area-preserving diffeomorphism from the cylinder to itself, which has been widely studied in the literature since the original papers by Greene and by Chirikov, both of which appeared in 1979, with the arithmetical properties of the rotation vector (which is, in this case, just a number). In particular, it has been proved that the radius of convergence is naturally interpolated through a function of the rotation number known as the Bryuno function (which has been introduced by Yoccoz as the solution of a suitable functional equation completely independent of the dynamics); see Berretti and Gentile (2001) for a review of results of this and related problems.

Also the generalized Riccati equation $\dot{u} - i u^2 - 2 i f(\omega t) + i \varepsilon^2 = 0$, where $\omega \in \mathbb{T}^d$ is Diophantine and $f$ is an analytic periodic function of $\psi = \omega t$, has been studied with the diagrammatic technique by Gentile (2003). Such an equation is related to two-level quantum systems (as first used by Barata), and existence of quasiperiodic solutions of the generalized Riccati equation for a large measure set $E$ of values of $\varepsilon$ can be exploited to prove that the spectrum of the corresponding two-level system is pure point for those values of $\varepsilon$; analogously, one can prove that, for fixed $\varepsilon$, one can impose some further nonresonance conditions on $\omega$, still leaving a full measure set, in such a way that the spectrum is pure point. (We note, in addition, that, technically, such a problem is very similar to that of studying conservation of elliptic lower-dimensional tori with one normal frequency.)

Finally we mention a problem of partial differential equations, where, of course, the scheme described above has to be suitably adapted: this is the study of periodic solutions for the nonlinear wave equation $u_{tt} - u_{xx} + m u = \varphi(u)$, with Dirichlet boundary conditions, where $m$ is a real parameter (mass) and $\varphi(u)$ is a strictly nonlinear analytic odd function. Gentile and Mastropietro (2004) reproduced the result of Craig and Wayne for the existence of periodic solutions for a large measure set of periods, and, in a subsequent paper by the same authors with Procesi (2005), an analogous result was proved in the case $m = 0$, which had previously remained an open problem in the literature.

See also: Averaging Methods; Integrable Systems and Discrete Geometry; KAM Theory and Celestial Mechanics; Stability Theory and KAM.

Further Reading

Berretti A and Gentile G (2001) Renormalization group and field theoretic techniques for the analysis of the Lindstedt series. Regular and Chaotic Dynamics 6: 389–420.
Chierchia L and Falcolini C (1994) A direct proof of a theorem by Kolmogorov in Hamiltonian systems. Annali della Scuola Normale Superiore di Pisa 21: 541–593.
Eliasson LH (1996) Absolutely convergent series expansions for quasi periodic motions. Mathematical Physics Electronic Journal 2, paper 4 (electronic), Preprint 1988.
Gallavotti G (1994) Twistless KAM tori, quasi flat homoclinic intersections, and other cancellations in the perturbation series of certain completely integrable Hamiltonian systems. A review. Reviews in Mathematical Physics 6: 343–411.
Gallavotti G, Bonetti F, and Gentile G (2004) Aspects of the Ergodic, Qualitative and Statistical Theory of Motion. Berlin: Springer.
Gallavotti G and Gentile G (2005) Degenerate elliptic tori. Communications in Mathematical Physics 257: 319–362.
Gallavotti G, Gentile G, and Mastropietro V (1999) Separatrix splitting for systems with three time scales. Communications in Mathematical Physics 202: 197–236.
Gentile G (2003) Quasi-periodic solutions for two-level systems. Communications in Mathematical Physics 242: 221–250.
Gentile G and Mastropietro V (2004) Construction of periodic solutions of the nonlinear wave equation with Dirichlet boundary conditions by the Lindstedt series method. Journal de Mathématiques Pures et Appliquées 83: 1019–1065.
Gentile G, Mastropietro V, and Procesi M (2005) Periodic solutions for completely resonant nonlinear wave equations with Dirichlet boundary conditions. Communications in Mathematical Physics 256: 437–490.
Harary F and Palmer EM (1973) Graphical Enumeration. New York: Academic Press.
Poincaré H (1892–99) Les méthodes nouvelles de la mécanique céleste, vol. I–III. Paris: Gauthier-Villars.
Dimer Problems 61

Dimer Problems
R Kenyon, University of British Columbia, Vancouver,
BC, Canada
ª 2006 Elsevier Ltd. All rights reserved.

The dimer model arose in the mid-twentieth century
as an example of an exactly solvable statistical
mechanical model in two dimensions with a phase
transition. It is used to model a number of physical
processes: free fermions in 1 dimension, the two-
dimensional Ising model, and various other
two-dimensional statistical-mechanical models at Figure 2 Honeycomb dimers (solid) and the corresponding
‘‘lozenge’’ tilings (gray).
restricted parameter values, such as the 6- and
8-vertex models and O(n) models. A number of
observable quantities such as the ‘‘height function’’ and densities of motifs have been shown to have conformal invariance properties in the scaling limit (when the lattice spacing tends to zero).
More recently, the model has also been used as an elementary model of crystalline surfaces in $\mathbb{R}^3$.
A dimer covering, or perfect matching, of a graph is a set of edges (‘‘dimers’’) which covers every vertex exactly once. In other words, it is a pairing of adjacent vertices (see Figure 1a, which is a dimer covering of an $8 \times 8$ grid). Dimer coverings of a grid are sometimes represented as domino tilings, that is, tilings with $2 \times 1$ rectangles (Figure 1b). The dimer model is the study of the set of dimer coverings of a graph. Typically, the underlying graph is taken to be a regular lattice in two dimensions, for example, the square grid or the honeycomb lattice, or a finite part of such a lattice.
Dimer coverings of the honeycomb graph are in bijection with tilings of plane regions with $60^\circ$ rhombi, also known as lozenges (see Figure 2). These tilings in turn are projections of piecewise-linear surfaces in $\mathbb{R}^3$ composed of unit squares in the 2-skeleton of $\mathbb{Z}^3$. So one can think of honeycomb dimer coverings as modeling discrete surfaces in $\mathbb{R}^3$. These surfaces are monotone in the sense that the orthogonal projection to the plane $P_{111} = \{(x, y, z) \mid x + y + z = 0\}$ is injective.

Figure 1 (a) A dimer covering of a grid and (b) the corresponding domino tiling.

Other models related to the dimer model are:
• The spanning tree model on planar graphs. The set of spanning trees on a planar graph is in bijection with the set of dimer coverings on an associated bipartite planar graph. Conversely, dimer coverings of a bipartite planar graph are in bijection with directed spanning trees on an associated graph.
• The Ising model on a planar graph with zero external field can be modeled with dimers on an associated planar graph.
• Plane partitions (three-dimensional versions of integer partitions). Viewing a plane partition along the (1, 1, 1)-direction, one sees a lozenge tiling of the plane.
• Annihilating random walks in one dimension can be modeled with dimers on an associated planar graph.
• The monomer-dimer model, where one allows a certain density of holes (monomers) in a dimer covering. This model is unsolved at present, although some partial results have been obtained.

Gibbs Measures
The most general setting in which the dimer model can be solved is that of an arbitrary planar graph with energies on the edges. We define here the corresponding measure.
Let $G = (V, E)$ be a graph and $M(G)$ the set of dimer coverings of $G$. Let $\mathcal{E}$ be a real-valued function on the edges of $G$, with $\mathcal{E}(e)$ representing the energy associated to a dimer on the bond $e$. One defines the energy of a dimer covering as the sum of the energies of those bonds covered with dimers.
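The definitions above can be made concrete on a small grid. The following is a minimal sketch, not taken from the article: the function names `dimer_coverings` and `covering_energy`, and the representation of an edge as a pair of (row, column) cells, are illustrative assumptions. It enumerates all dimer coverings of a rectangular grid by backtracking and evaluates the energy of a covering as the sum of the energies of its dimers.

```python
def dimer_coverings(rows, cols):
    """Yield every dimer covering of the rows x cols grid graph.

    A covering is a list of edges; an edge is a pair of (row, col) cells.
    (Hypothetical helper for illustration only.)
    """
    covered = [[False] * cols for _ in range(rows)]
    dimers = []

    def place(pos):
        # Skip cells that are already covered (row-major order).
        while pos < rows * cols and covered[pos // cols][pos % cols]:
            pos += 1
        if pos == rows * cols:
            yield list(dimers)
            return
        r, c = divmod(pos, cols)
        covered[r][c] = True
        # The first uncovered cell must be paired with its right or lower neighbor.
        for dr, dc in ((0, 1), (1, 0)):
            r2, c2 = r + dr, c + dc
            if r2 < rows and c2 < cols and not covered[r2][c2]:
                covered[r2][c2] = True
                dimers.append(((r, c), (r2, c2)))
                yield from place(pos + 1)
                dimers.pop()
                covered[r2][c2] = False
        covered[r][c] = False

    yield from place(0)


def covering_energy(covering, energy):
    """Energy of a covering: the sum of the energies of its dimers."""
    return sum(energy(edge) for edge in covering)


if __name__ == "__main__":
    print(sum(1 for _ in dimer_coverings(4, 4)))  # number of coverings of the 4 x 4 grid
```

Brute-force enumeration grows exponentially with the size of the grid, so it is useful only as a check on very small graphs.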
62 Dimer Problems

The partition function of the model on $(G, \mathcal{E})$ is then the sum
\[ Z = \sum_{C \in M(G)} e^{-\mathcal{E}(C)/kT} \]
where the sum is over dimer coverings. In what follows we will take $kT = 1$ for simplicity. Note that $Z$ depends on both $G$ and $\mathcal{E}$.
The partition function is well defined for a finite graph and defines the Gibbs measure, which is by definition the probability measure $\mu = \mu_{\mathcal{E}}$ on the set $M(G)$ of dimer coverings satisfying $\mu(C) = (1/Z)\, e^{-\mathcal{E}(C)}$ for a covering $C$.
For an infinite graph $G$ with fixed energy function $\mathcal{E}$, a Gibbs measure on $M(G)$ is by definition any measure which is a limit of the Gibbs measures on a sequence of finite subgraphs which fill out $G$. There may be many Gibbs measures on an infinite graph, since this limit typically depends on the sequence of finite graphs. When $G$ is an infinite periodic graph (and $\mathcal{E}$ is periodic as well), it is natural to consider translation-invariant Gibbs measures; one can show that in the case of a bipartite, periodic planar graph the translation-invariant and ergodic Gibbs measures form a two-parameter family (see Theorem 3 below).
For a translation-invariant Gibbs measure $\mu$ which is a limit of Gibbs measures on an increasing sequence of finite graphs $G_n$, one can define the partition function per vertex of $\mu$ to be the limit
\[ Z = \lim_{n \to \infty} Z(G_n)^{1/|G_n|} \]
where $|G_n|$ is the number of vertices of $G_n$. The free energy, or surface tension, of $\mu$ is $-\log Z$.

Partition Function
One can compute the partition function for dimer coverings on a finite planar graph $G$ as the Pfaffian (square root of the determinant) of a certain antisymmetric matrix, the Kasteleyn matrix. The Kasteleyn matrix is an oriented adjacency matrix of $G$, indexed by the vertices $V$: orient the edges of a graph embedded in the plane so that each face has an odd number of clockwise oriented edges. Then define $K = (K_{vv'})$ with
\[ K_{vv'} = \pm e^{-\mathcal{E}(vv')} \]
if $G$ has an edge $vv'$, with a sign according to the orientation of that edge, and $K_{vv'} = 0$ if $v, v'$ are not adjacent. We then have the following result of Kasteleyn:
Theorem 1 $Z = |\mathrm{Pf}(K)| = \sqrt{|\det K|}$.
Here $\mathrm{Pf}(K)$ denotes the Pfaffian of $K$.
Such an orientation of edges (which always exists for planar graphs) is called a Kasteleyn orientation; any two such orientations can be obtained from one another by a sequence of operations consisting of reversing the orientations of all edges at a vertex.
If $G$ is a bipartite graph, that is, the vertices can be colored black and white with no neighbors having the same color, then the Pfaffian of $K$ is the determinant of the submatrix whose rows index the white vertices and columns index the black vertices. For bipartite graphs, instead of orienting the edges one can alternatively multiply the edge weights by a complex number of modulus 1, with the condition that the alternating product around each face (the first, divided by the second, times the third, and so on) is real and negative.
For nonplanar graphs, one can compute the partition function as a sum of Pfaffians; for a graph embedded on a surface of Euler characteristic $\chi$, this requires in general $2^{2-\chi}$ Pfaffians.

Local Statistics
The inverse of the Kasteleyn matrix can be used to compute the local statistics, that is, the probability that a given set of edges occurs in a random dimer covering (random with respect to the Gibbs measure $\mu$).
Theorem 2 Let $S = \{(v_1, v_2), \ldots, (v_{2k-1}, v_{2k})\}$ be a set of edges of $G$. The probability that all these edges occur in a $\mu$-random covering is
\[ \Pr(S) = \Bigl( \prod_{i=1}^{k} K_{v_{2i-1}, v_{2i}} \Bigr)\, \mathrm{Pf}_{2k \times 2k}\bigl( (K^{-1})_{v_i, v_j} \bigr) \]
Again, for bipartite graphs the Pfaffian can be made into a determinant.

Heights
Bipartite graphs Suppose $G$ is a bipartite planar graph. A 1-form on $G$ is simply a function on the set of oriented edges which is antisymmetric with respect to reversing the edge orientation: $f(-e) = -f(e)$ for an edge $e$. A 1-form can be identified with a flow: just flow by $f(e)$ along oriented edge $e$. The divergence of the flow $f$ is then $d^{*} f$. Let $\Omega$ be the space of flows on edges of $G$, with divergence 1 at each white vertex and divergence $-1$ at each black vertex, and such that the flow along each edge from white to black is in $[0, 1]$. From a dimer covering $M$ one can construct such a flow $\omega(M) \in \Omega$: just flow
one unit along each dimer, and zero on the remaining edges. The set $\Omega$ is a convex polyhedron in $\mathbb{R}^{E}$ and its vertices can be seen to be exactly the dimer coverings.
Given any two flows $\omega_1, \omega_2 \in \Omega$, their difference is a divergence-free flow. Its dual $(\omega_1 - \omega_2)^{*}$ (or conjugate flow) defined on the planar dual of $G$ is therefore the gradient of a function $h$ on the faces of $G$, that is, $(\omega_1 - \omega_2)^{*} = dh$, where $h$ is well defined up to an additive constant.
When $\omega_1$ and $\omega_2$ come from dimer coverings, $h$ is integer valued, and is called the height difference of the coverings. The level sets of the function $h$ are just the cycles formed by the union of the two matchings. If we fix a ‘‘base point’’ covering $\omega_0$ and a face $f_0$ of $G$, we can then define the height function of any dimer covering (with flow $\omega$) to be the function $h$ with value zero at $f_0$ and which satisfies $dh = (\omega - \omega_0)^{*}$.
Nonbipartite graphs On a nonbipartite planar graph the height function can be similarly defined modulo 2. Fix a base covering $\omega_0$; for any other covering $\omega$, the superposition of $\omega_0$ and $\omega$ is a set of cycles and doubled edges of $G$; the function $h$ is constant on the complementary components of these cycles and changes by 1 mod 2 across each cycle. We can think of the height modulo 2 as taking two values, or spins, on the faces of $G$, and the dimer chains are the spin-domain boundaries. In particular, dimers on a nonbipartite graph can in this way model the Ising model on an associated dual planar graph.

Thermodynamic Limit
By periodic planar graph we mean a graph $G$, with energy function on edges, for which translations by elements of $\mathbb{Z}^2$ or some other rank-2 lattice $\Lambda \subset \mathbb{R}^2$ are isomorphisms of $G$ preserving the edge energies, and such that the quotient $G/\mathbb{Z}^2$ is a finite graph. Without loss of generality we can take $\Lambda = \mathbb{Z}^2$. The standard example is $G = \mathbb{Z}^2$ with $\mathcal{E} \equiv 0$, which we refer to as ‘‘dimers on the grid.’’ However, other examples display different global behaviors and so it is worthwhile to remain in this generality.
For a periodic planar graph $G$, an ergodic probability measure on $M(G)$ is one which is translation invariant (the measure of a set is the same as any $\mathbb{Z}^2$-translate of that set) and whose invariant subsets have measure 0 or 1.
We will be interested in probability measures which are both ergodic and Gibbs (we refer to them as ergodic Gibbs measures, dropping the term ‘‘probability’’). When $G$ is bipartite, there are multiple ergodic Gibbs measures (see Theorem 3 below). When $G$ is nonbipartite, it is conjectured that there is a single ergodic Gibbs measure.
In the remainder of this section we assume that $G$ is bipartite, and assume also that the $\mathbb{Z}^2$-action preserves the coloring of the vertices as black and white (simply pass to an index-2 sublattice if not). For integer $n > 0$ let $G_n = G/n\mathbb{Z}^2$, a finite graph on a torus (in other words, with periodic boundary conditions). For a dimer covering $M$ of $G_n$, we define $(h_x, h_y) \in \mathbb{Z}^2$ to be the horizontal and vertical height change of $M$ around the torus, that is, the net flux of $\omega(M) - \omega_0$ across a horizontal, respectively vertical, cut around the torus (in other words, $h_x, h_y$ are the horizontal and vertical periods around the torus of the 1-form $\omega(M) - \omega_0$). The characteristic polynomial $P(z, w)$ of $G$ is by definition
\[ P(z, w) = \sum_{M \in M(G_1)} e^{-\mathcal{E}(M)}\, z^{h_x} w^{h_y} (-1)^{h_x h_y} \]
where the sum is over dimer coverings $M$ of $G_1 = G/\mathbb{Z}^2$, and $h_x, h_y$ depend on $M$. The polynomial $P$ depends on the base point $\omega_0$ only by a multiplicative factor involving a power of $z$ and $w$. From this polynomial most of the large-scale behavior of the ergodic Gibbs measures can be extracted.
The Gibbs measure on $G_n$ converges as $n \to \infty$ to the (unique) ergodic Gibbs measure $\mu$ with smallest free energy $F = -\log Z$. The uniqueness of this measure follows from the strict convexity of the free energy of ergodic Gibbs measures as a function of the slope, see below. The free energy $F$ of the minimal free energy measure is
\[ F = -\frac{1}{(2\pi i)^2} \int_{S^1 \times S^1} \log P(z, w)\, \frac{dz}{z} \frac{dw}{w} \]
that is, minus the Mahler measure of $P$.
For any translation-invariant measure $\mu$ on $M(G)$, the average slope $(s, t)$ of the height function for $\mu$-almost every tiling is by definition the expected horizontal and vertical height change over one fundamental domain, that is, $s = \mathbb{E}[h(f + (1, 0)) - h(f)]$ and $t = \mathbb{E}[h(f + (0, 1)) - h(f)]$ where $f$ is any face. This quantity $(s, t)$ lies in the Newton polygon $N(P)$ of $P(z, w)$ (the convex hull in $\mathbb{R}^2$ of the set of exponents of monomials of $P$). In fact, the points in the Newton polygon are in bijection with the ergodic Gibbs measures on $M(G)$:
Theorem 3 When $G$ is a periodic bipartite planar graph, any ergodic Gibbs measure has average slope $(s, t)$ lying in $N(P)$. Moreover, for every point $(s, t) \in N(P)$ there is a unique ergodic Gibbs measure $\mu(s, t)$ with that average slope.
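Kasteleyn's result (Theorem 1 in the Partition Function section) is easy to check numerically on a rectangular grid with all energies zero, where Z is simply the number of dimer coverings. The sketch below is illustrative rather than the article's own construction: the function names are hypothetical, and the particular orientation used (horizontal edges pointing right, vertical edges pointing up in even columns and down in odd columns) is one standard choice that gives every unit face an odd number of clockwise edges. The determinant is computed exactly with rational arithmetic.

```python
from fractions import Fraction
from math import isqrt


def kasteleyn_matrix(rows, cols):
    """Antisymmetric signed adjacency matrix of the rows x cols grid (zero energies)."""
    index = {(x, y): y * cols + x for y in range(rows) for x in range(cols)}
    n = rows * cols
    K = [[0] * n for _ in range(n)]
    for (x, y), v in index.items():
        if x + 1 < cols:                      # horizontal edge, oriented to the right
            u = index[(x + 1, y)]
            K[v][u], K[u][v] = 1, -1
        if y + 1 < rows:                      # vertical edge, sign alternates by column
            u = index[(x, y + 1)]
            s = 1 if x % 2 == 0 else -1
            K[v][u], K[u][v] = s, -s
    return K


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            if factor != 0:
                for c in range(col, n):
                    a[r][c] -= factor * a[col][c]
    return det


def count_dimer_coverings(rows, cols):
    """Z = sqrt(|det K|) for the grid with all edge energies equal to zero."""
    d = int(abs(determinant(kasteleyn_matrix(rows, cols))))
    root = isqrt(d)
    assert root * root == d, "det K should be a perfect square"
    return root


if __name__ == "__main__":
    print(count_dimer_coverings(8, 8))  # domino tilings of the 8 x 8 grid
```

In contrast with direct enumeration, the cost here is that of one determinant, polynomial in the number of vertices.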

In particular, this gives a complete description of the set of all ergodic Gibbs measures. The ergodic Gibbs measure $\mu(s, t)$ of slope $(s, t)$ can be obtained as the limit of the Gibbs measures on $G_n$, when one conditions the configurations to have a particular slope approximating $(s, t)$.

Ronkin Function and Surface Tension

The Ronkin function of $P$ is a map $R : \mathbb{R}^2 \to \mathbb{R}$ defined for $(B_x, B_y) \in \mathbb{R}^2$ by
\[ R(B_x, B_y) = \frac{1}{(2\pi i)^2} \int_{S^1 \times S^1} \log P(z e^{B_x}, w e^{B_y})\, \frac{dz}{z} \frac{dw}{w} \]
The Ronkin function is convex and its graph is piecewise linear on the complement of the amoeba $A(P)$ of $P$, which is the image of the zero set $\{(z, w) \in \mathbb{C}^2 \mid P(z, w) = 0\}$ under the map $(z, w) \mapsto (\log |z|, \log |w|)$ (see Figures 3 and 4 for an example).

Figure 4 Minus the Ronkin function of $P(z, w) = 5 + z + 1/z + w + 1/w$.

The free energy $F(\mu(s, t))$ of $\mu(s, t)$, as a function of $(s, t) \in N(P)$, is the Legendre dual of the Ronkin function of $P(z, w)$: we have
\[ F(\mu(s, t)) = -R(B_x, B_y) + s B_x + t B_y \]
where
\[ s = \frac{\partial R(B_x, B_y)}{\partial B_x}, \qquad t = \frac{\partial R(B_x, B_y)}{\partial B_y} \]
The continuous map $\nabla R : \mathbb{R}^2 \to N(P)$ which takes $(B_x, B_y)$ to $(s, t)$ is injective on the interior of $A(P)$, collapses each bounded complementary component of $A(P)$ to an integer point in the interior of $N(P)$, and collapses each unbounded complementary component of $A(P)$ to an integer point on the boundary of $N(P)$.
Under the Legendre duality, the facets in the graph of the Ronkin function (i.e., maximal regions on which $R$ is linear) give points of nondifferentiability of the free energy $F$, as defined on $N(P)$. We refer to these points of nondifferentiability as ‘‘cusps.’’ Cusps occur only at integer slopes $(s, t)$ (see Figure 5 for the free energy associated to the Ronkin function in Figure 4).

Figure 3 The amoeba of $P(z, w) = 5 + z + 1/z + w + 1/w$, which is the characteristic polynomial for dimers on the periodic ‘‘square-octagon’’ lattice.

Figure 5 (Negative of) the free energy for dimers on the square-octagon lattice.

By Theorem 3, the coordinates $(B_x, B_y)$ can also be used to parametrize the set of Gibbs measures $\mu(s, t)$ (but only those with slope $(s, t)$ in the interior of $N(P)$ or on the corners of $N(P)$ and boundary integer points). This parametrization is not one-to-one since when $(B_x, B_y)$ varies in a complementary component of the amoeba, the measure $\mu(s, t)$ does not change. On the interior of the amoeba the parametrization is one-to-one.
The remaining Gibbs measures, whose slopes are on the boundary of $N(P)$, can be obtained by taking limits of $(B_x, B_y)$ along the ‘‘tentacles’’ of the amoeba.

The Gibbs measures $\mu(s, t)$ can be partitioned into three classes, or phases, according to the behavior of the fluctuations of the height function. If we measure the height at two distant points $x_1$ and $x_2$ in $G$, the average height difference, $\mathbb{E}[h(x_1) - h(x_2)]$, is a linear function of $x_1 - x_2$ determined by the average slope of the measure. The height fluctuation is defined to be the random variable $h(x_1) - h(x_2) - \mathbb{E}[h(x_1) - h(x_2)]$. This random variable depends on
the two points and we are interested in its behavior when $x_1$ and $x_2$ are far apart.
We say $\mu(s, t)$ is
1. ‘‘Frozen’’ if the height fluctuations are bounded almost surely.
2. ‘‘Rough’’ (or ‘‘liquid’’) if the covariance in the height function $\mathbb{E}[h(x_1) h(x_2)] - \mathbb{E}[h(x_1)]\,\mathbb{E}[h(x_2)]$ is unbounded as $|x_1 - x_2| \to \infty$.
3. ‘‘Smooth’’ (or ‘‘gaseous’’) if the covariance of the height function is bounded but the height fluctuations are unbounded.
The height fluctuations can be related to the decay of the entries of $K^{-1}$, which are in turn related to the decay of the Fourier coefficients of $1/P$. In particular, we have
Theorem 4 The measure $\mu(s, t)$ is respectively frozen, rough, or smooth according to whether $(B_x, B_y) = (\nabla R)^{-1}(s, t)$ is in the closure of an unbounded complementary component of $A(P)$, in the interior of $A(P)$, or in the closure of a bounded component of $A(P)$.
The characteristic polynomials $P$ which occur in the dimer model are not arbitrary: their algebraic curves $\{P = 0\}$ are all of a special type known as Harnack curves, which are characterized by the fact that the map from the zero-set of $P$ in $\mathbb{C}^2$ to its amoeba in $\mathbb{R}^2$ is at most two-to-one. In fact:
Theorem 5 By varying the edge energies all Harnack curves can be obtained as the characteristic polynomial of a planar dimer model.

Local Statistics
In the thermodynamic limit (on a periodic planar graph), local statistics of dimer coverings for the Gibbs measure of minimal free energy can be obtained from the limit of the inverse of the Kasteleyn matrix on the finite toroidal graphs $G_n$. This in turn can be computed from the Fourier coefficients of $1/P$.
As an example, let $G$ be the square grid $\mathbb{Z}^2$ and take $\mathcal{E} = 0$ (which corresponds to the uniform measure on configurations for finite graphs). An appropriate choice of signs for the Kasteleyn matrix is to put weights $1, -1$ on alternate horizontal edges and $i, -i$ on alternate vertical edges in such a way that around each white vertex the weights are cyclically $1, i, -1, -i$. For this choice of signs we have
\[ K^{-1}_{(0,0),(x,y)} = \frac{1}{(2\pi)^2} \int_0^{2\pi} \int_0^{2\pi} \frac{e^{i(x\theta + y\phi)}}{2 \sin\theta + 2i \sin\phi}\, d\theta\, d\phi \]
This integral can be evaluated explicitly (see Figure 6 for values of $K^{-1}_{(0,0),(x,y)}$ near the origin; by translation invariance $K^{-1}_{(x_0,y_0),(x,y)} = K^{-1}_{(0,0),(x-x_0,y-y_0)}$, and values in other quadrants can be obtained by $K^{-1}_{(0,0),(x,y)} = i K^{-1}_{(0,0),(-y,x)}$).

Figure 6 Values of $K^{-1}$ on $\mathbb{Z}^2$ with zero energies.

As a sample computation, using Theorem 2, the probability that the dimer covering the origin points to the right and, simultaneously, the one covering $(0, 1)$ points upwards is
\[ K_{(0,0),(1,0)}\, K_{(0,2),(0,1)} \det \begin{pmatrix} K^{-1}_{(0,0),(1,0)} & K^{-1}_{(0,0),(0,1)} \\ K^{-1}_{(0,2),(1,0)} & K^{-1}_{(0,2),(0,1)} \end{pmatrix} = -i \det \begin{pmatrix} \frac{1}{4} & -\frac{i}{4} \\ -\frac{1}{4} + \frac{1}{\pi} & \frac{i}{4} \end{pmatrix} = \frac{1}{4\pi} \]
Another computation which follows is the decay of the edge covariances. If $e_1, e_2$ are two edges at distance $d$, then $\Pr(e_1 \,\&\, e_2) - \Pr(e_1)\Pr(e_2)$ decays quadratically in $1/d$, since $K^{-1}((0, 0), (x, y))$ decays like $1/(|x| + |y|)$.

Scaling Limits
The scaling limit of the dimer model is the limit when the lattice spacing tends to zero.
Let us define the scaling limit in the following way. Let $\epsilon\mathbb{Z}^2$ be the square grid scaled by $\epsilon$, so the lattice mesh size is $\epsilon$. Fix a Jordan domain $U \subset \mathbb{R}^2$ and consider for each $\epsilon$ a subgraph $U_\epsilon$ of $\epsilon\mathbb{Z}^2$, bounded by a simple polygon, which tends to $U$ as $\epsilon \to 0$. We are interested in limiting properties of random dimer coverings of $U_\epsilon$, in the limit as $\epsilon \to 0$, for example, the fluctuations of the height function and edge densities.
The limit depends on the (sequence of) boundary conditions, that is, on the exact choice of approximating regions $U_\epsilon$. By changing $U_\epsilon$ one can change the limiting rescaled height function along the boundary. It is conjectured that the limit of the height function along the boundary of $U_\epsilon$ (scaled by $\epsilon$ . . . and assuming this limit exists) determines essentially all of the limiting behavior in the interior, in particular the limiting local statistics.
Therefore, let $u$ be a real-valued continuous function on the boundary of $U$. Consider a sequence of subgraphs $U_\epsilon$ of $\epsilon\mathbb{Z}^2$, as $\epsilon \to 0$ as above, and whose height function along the boundary, when scaled by $\epsilon$, is approximating $u$. We discuss the limit of the model in this setting.

Crystalline Surfaces
The height function allows us to view dimer coverings as random surfaces in $\mathbb{R}^3$: to a dimer covering of $G$, one associates the graph of its height function, extended in a piecewise linear fashion over the edges and faces of the dual $G^{*}$. These surfaces are then piecewise linear random surfaces, which resemble crystal surfaces in the sense that microscopically (on the scale of the lattice) they are rough, whereas their long-range behavior is smooth and facetted, as we now describe.
In the scaling limit, boundary conditions as described in the last paragraph of the previous section are referred to as ‘‘wire-frame’’ boundary conditions, since the graph of the height function can be thought of as a (random) surface spanning the wire frame defined by its boundary values.
In the scaling limit, there is a law of large numbers which says that the Gibbs measure on random surfaces (which is unique since we are dealing with a finite graph) concentrates, for fixed wire-frame boundary conditions, on a single surface $S_0$. That is, as the lattice spacing $\epsilon$ tends to zero, with probability tending to 1 the random surface lies close to a limiting surface $S_0$. The surface $S_0$ is the unique surface which minimizes the total surface tension, or free energy, for its fixed boundary values, that is, minimizes the integral over the surface of $F(\mu(s, t))$, where $(s, t)$ is the slope of the surface at the point being integrated over. Existence and uniqueness of the minimizer follow from the strict convexity of the free energy/surface tension as a function of the slope.
At a point where the free energy has a cusp, the crystal surface $S_0$ will in general have a facet, that is, a region on which it is linear. Outside of the facets, one expects that $S_0$ is analytic, since the free energy is analytic outside the cusps.

Fluctuations
While the scaled height function $\epsilon h$ in the scaling limit converges to its mean value $h_0$ (whose graph is the surface $S_0$), the fluctuations of the unrescaled height function $h - (1/\epsilon) h_0$ will converge in law to a random process on $U$.
In the simplest setting, that of honeycomb dimers with $\mathcal{E} \equiv 0$, and in the absence of facets, the height fluctuations converge to a continuous Gaussian process, the image of the Gaussian free field on the unit disk $D$ under a certain diffeomorphism $\Phi$ (depending on $h_0$) of $D$ to $U$.
In the particular case $h_0 = 0$, $\Phi$ is the Riemann map from $D$ to $U$ and the law of the height fluctuations is just the Gaussian free field on $U$ (defined to be the Gaussian process whose covariance kernel is the Dirichlet Green’s function). The conformal invariance of the Gaussian free field is the basis for a number of conformal invariance properties of the honeycomb dimer model.

Densities of Motifs
Another observable of interest is the density field of a motif. A motif is a finite collection of edges, taken up to translation. For example, consider, for the square grid, the ‘‘L’’ motif consisting of a horizontal domino and a vertical domino aligned to form an ‘‘L,’’ which we showed above to have a density $1/(4\pi)$ in the thermodynamic limit. The probability of seeing this motif at any given place is $1/(4\pi)$. However, in the scaling limit one can ask about the fluctuations of the occurrences of this motif: in a large ball around a point $x$, what is the distribution of $N_L - A/(4\pi)$, where $N_L$ is the number of occurrences of the motif, and $A$ is the area of the ball? These fluctuations form a random field, since there is a long-range correlation between occurrences of the motif.
It is known that on $\mathbb{Z}^2$, for the minimal free energy ergodic Gibbs measure, the rescaled density field
\[ \frac{1}{\sqrt{A}} \Bigl( N_L - \frac{A}{4\pi} \Bigr) \]
converges as $\epsilon \to 0$ weakly to a Gaussian random field which is a linear combination of a directional derivative of the Gaussian free field and an independent white noise. A similar result holds for other motifs.
The joint distribution of densities of several motifs can also be shown to be Gaussian.

See also: Combinatorics: Overview; Determinantal Random Fields; Growth Processes in Random Matrix Theory; Statistical Mechanics and Combinatorial Problems; Statistical Mechanics of Interfaces.
Dirac Fields in Gravitation and Nonabelian Gauge Theory 67

Further Reading
Cohn H, Kenyon R, and Propp J (2001) A variational principle for domino tilings. Journal of the American Mathematical Society 14(2): 297–346.
Kasteleyn P (1967) Graph theory and crystal physics. In: Graph Theory and Theoretical Physics, pp. 43–110. London: Academic Press.
Kenyon R, Okounkov A, and Sheffield S (2005) Dimers and Amoebae. Annals of Mathematics, math-ph/0311005.

Dirac Fields in Gravitation and Nonabelian Gauge Theory

J A Smoller, University of Michigan, Ann Arbor, MI, USA
© 2006 Elsevier Ltd. All rights reserved.

In this article we describe some recent results (Finster et al. 1999a,b, 2000a–c, 2002a) concerning the existence of both particle-like and black hole solutions of the coupled Einstein–Dirac–Yang–Mills (EDYM) equations. We show that there are stable globally defined static, spherically symmetric solutions. We also show that for static black hole solutions, the Dirac wave function must vanish identically outside the event horizon. The latter result indicates that the Dirac particle (fermion) must either enter the black hole or tend to infinity.
The plan of the article is as follows. The next section describes the background material. It is followed by a discussion of the coupled EDYM equations for static, spherically symmetric particle-like and black hole solutions. The final section of the article is devoted to a discussion of these results.

Background Material
Einstein’s Equations
We begin by describing the Einstein equation for the gravitational field (for more details, see, e.g., Adler et al. (1975)). We first note Einstein’s hypotheses of general relativity (GR):
(E1) The gravitational field is the metric $g_{ij}$ in 3 + 1 spacetime dimensions. The metric is assumed to be symmetric.
(E2) At each point in spacetime, the metric can be diagonalized as $\mathrm{diag}(-1, 1, 1, 1)$.
(E3) The equations which describe the gravitational field should be covariant; that is, independent of the choice of coordinate system.
The hypothesis (E1) is Einstein’s brilliant insight, whereby he ‘‘geometrizes’’ the gravitational field. (E2) means that there are inertial frames at each point (but not globally), and guarantees that special relativity (SR) is included in GR, while (E3) implies that the gravitational field equations must be tensor equations; that is, coordinates are an artifact, and physics should not depend on the choice of coordinates.

Einstein’s Equations of GR
The metric $g_{ij} = g_{ij}(x)$, $i, j = 0, 1, 2, 3$, $x = (x^0, x^1, x^2, x^3)$, $x^0 = ct$ ($c$ = speed of light, $t$ = time), is the metric tensor defined on four-dimensional spacetime. Einstein’s equations are ten (tensor) equations for the unknown metric $g_{ij}$ (gravitational field), and take the form
\[ R_{ij} - \tfrac{1}{2} R g_{ij} = \kappa T_{ij} \qquad [1] \]
where the left-hand side $G_{ij} = R_{ij} - \tfrac{1}{2} R g_{ij}$ is the Einstein tensor and depends only on the geometry, $\kappa = 8\pi G/c^4$, where $G$ is Newton’s gravitational constant, while $T_{ij}$, the energy–momentum tensor, represents the source of the gravitational field, and encodes the distribution of matter. (The word ‘‘matter’’ in GR refers to everything which can produce a gravitational field, including elementary particles, electromagnetic or Yang–Mills (YM) fields.)
From the Bianchi identities in geometry (cf. Adler et al. (1975)), the (covariant) divergence of the Einstein tensor, $G_{ij}$, vanishes identically, namely
\[ G^{j}_{i;j} = 0 \]
so, on solutions of Einstein’s equations,
\[ T^{j}_{i;j} = 0 \]
and this in turn expresses the conservation of energy and momentum. The quantities which comprise the Einstein tensor are given as follows: first, from the metric tensor $g_{ij}$, we form the Levi-Civita connection $\Gamma^{k}_{ij}$ defined by:
\[ \Gamma^{k}_{ij} = \frac{1}{2} g^{k\ell} \left( \frac{\partial g_{\ell j}}{\partial x^i} + \frac{\partial g_{i\ell}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^\ell} \right) \]
where ($4 \times 4$ matrix) $[g^{k\ell}] = [g_{k\ell}]^{-1}$, and summation convention is employed; namely, an index which appears as both a subscript and a superscript is to be summed from 0 to 3. With the aid of $\Gamma^{k}_{ij}$, we can
construct the celebrated Riemann curvature tensor $R^{i}_{qk\ell}$:
\[ R^{i}_{qk\ell} = \frac{\partial \Gamma^{i}_{q\ell}}{\partial x^k} - \frac{\partial \Gamma^{i}_{qk}}{\partial x^\ell} + \Gamma^{i}_{pk} \Gamma^{p}_{q\ell} - \Gamma^{i}_{p\ell} \Gamma^{p}_{qk} \]
Finally, the terms $R_{ij}$ and $R$ which appear in the Einstein tensor $G_{ij}$ are given by
\[ R_{ij} = R^{s}_{isj} \]
(the Ricci tensor), and
\[ R = g^{ij} R_{ij} \]
is the scalar curvature.
From the above definitions, one sees at once the enormous complexity of the Einstein equations. For this reason, one usually seeks solutions which have a high degree of symmetry, and in what follows, in this section, we shall only consider static, spherically symmetric solutions; that is, solutions which depend only on $r = |x| = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}$. In this case, the metric $g_{ij}$ takes the form
\[ ds^2 = -T(r)^{-2}\, dt^2 + A(r)^{-1}\, dr^2 + r^2\, d\Omega^2 \qquad [2] \]
where $d\Omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ is the standard metric on the unit 2-sphere, $r, \theta, \varphi$ are the usual spherical coordinates, and $t$ denotes time.

Black Hole Solutions
Consider the problem of finding the gravitational field outside a ball of mass $M$ in $\mathbb{R}^3$; that is, there is no matter exterior to the ball. Solving Einstein’s equations $G_{ij} = 0$ gives the famous Schwarzschild solution (1916):
\[ ds^2 = -\left( 1 - \frac{2m}{r} \right) c^2\, dt^2 + \left( 1 - \frac{2m}{r} \right)^{-1} dr^2 + r^2\, d\Omega^2 \qquad [3] \]
where $m = GM/c^2$. Since $2m$ has the dimensions of length, it is called the Schwarzschild radius. Observe that when $r = 2m$, the metric is singular; namely, $g_{tt} = 0$ and $g_{rr} = \infty$. By transforming the metric [3] to the so-called Kruskal coordinates (cf. Adler et al. (1975)), one observes that the Schwarzschild sphere $r = 2m$ has the physical characteristics of a black hole: light and nearby particles can enter the region $r < 2m$, nothing can exit this region, and there is an intrinsic (nonremovable) singularity at the center $r = 0$.
For the general metric [2], we define a black hole solution of Einstein’s equations to be a solution which satisfies, for some $\rho > 0$,
\[ A(\rho) = 0, \qquad A(r) > 0 \ \text{if}\ r > \rho \]
$\rho$ is called the radius of the black hole, or the event horizon.

Yang–Mills Equations
The YM equations generalize Maxwell’s equations. To see how this comes about, we first write Maxwell’s equations in an invariant way. Thus, let $A$ denote a scalar-valued 1-form:
\[ A = A_i\, dx^i, \qquad A_i \in \mathbb{R} \]
which is called the electromagnetic potential (by physicists), or a connection (by geometers). The electromagnetic field (curvature) is the 2-form
\[ F = dA \]
In local coordinates,
\[ F = F_{ij}\, dx^i \wedge dx^j, \qquad F_{ij} = \frac{\partial A_j}{\partial x^i} - \frac{\partial A_i}{\partial x^j} \]
In this framework, Maxwell’s equations are given by
\[ d^{*}F = 0, \qquad dF = 0 \qquad [4] \]
where $*$ is the Hodge star operator, mapping 2-forms to 2-forms (in $\mathbb{R}^4$), and is defined by
\[ ({}^{*}F)_{k\ell} = \tfrac{1}{2} \sqrt{|g|}\, \varepsilon_{ijk\ell} F^{ij} \]
where $g = \det(g_{ij})$ and $\varepsilon_{ijk\ell}$ is the completely antisymmetric symbol defined by $\varepsilon_{ijk\ell} = \mathrm{sgn}(ijk\ell)$. As usual, indices are raised (or lowered) via the metric, so that, for example,
\[ F^{ij} = g^{\ell i} g^{mj} F_{\ell m} \]
It is important to notice that ${}^{*}F$ depends on the metric. Note also that Maxwell’s equations are linear equations for the $A_i$’s.
The YM equations generalize Maxwell’s equations and can be described as follows. With each YM field (described below) is associated a compact Lie group $G$ called the gauge group. For such $G$, we denote its Lie algebra by $\mathfrak{g}$, defined to be the tangent space at the identity of $G$. Now let $A$ be a $\mathfrak{g}$-valued 1-form
\[ A = A_i\, dx^i \]
where each $A_i$ is in $\mathfrak{g}$. In this case, the curvature 2-form is defined by
\[ F = dA + A \wedge A \]
or, in local coordinates,
\[ F_{ij} = \frac{\partial A_j}{\partial x^i} - \frac{\partial A_i}{\partial x^j} + [A_i, A_j] \]
The commutator $[A_i, A_j] = 0$ if $G$ is an abelian group, but is generally nonzero if $G$ is a matrix group. In this framework, the YM equations can be written in the form $d^{*}F = 0$, where now $d$ is an appropriately defined covariant exterior derivative. For Maxwell’s equations, the gauge group $G = U(1)$ (the circle group $\{e^{i\theta} : \theta \in \mathbb{R}\}$) so $\mathfrak{g}$ is abelian and we recover Maxwell’s equations from the YM equations. Observe that if $G$ is nonabelian, then the YM equations $d^{*}F = 0$ are nonlinear equations for the connection coefficients $A_i$.

The Dirac Equation in Curved Spacetime
The Dirac equation is a generalization of Schrödinger’s equation, in a relativistic setting (Bjorken and Drell 1964). It thus combines quantum mechanics with the theory of relativity. In addition, the Dirac equation also describes the intrinsic ‘‘spin’’ of fermions and, for this reason, solutions of the Dirac equation are often called spinors.
The Dirac equation can be written as
\[ (G - m)\Psi = 0 \qquad [5] \]
where $G$ is the Dirac operator, $m$ is the mass of the Dirac particle (fermion), and $\Psi$ is a complex-valued 4-vector called the wave function, or spinor. The Dirac operator $G$ is of the form
\[ G = i G^{j}(x) \frac{\partial}{\partial x^j} + B(x) \qquad [6] \]
where $G^{j}$ as well as $B$ are $4 \times 4$ matrices, $m$ is the (rest) mass of the fermion, and $i = \sqrt{-1}$. The Dirac equation is thus a linear equation for the spinors. The $G^{j}$ (called Dirac matrices) and the Lorentzian metric $g^{ij}$ are related by
\[ g^{jk} I = \tfrac{1}{2} \{G^{j}, G^{k}\} \qquad [7] \]
where $\{G^{j}, G^{k}\}$ is the anticommutator
\[ \{G^{j}, G^{k}\} = G^{j} G^{k} + G^{k} G^{j} \]
Thus, the Dirac matrices depend on the underlying metric in four-dimensional spacetime.
Suppose that $H$ is a spacelike hypersurface in $\mathbb{R}^4$, with future-directed normal vector $\nu = \nu(x)$, and let $d\mu$ be the invariant measure on $H$ induced by the metric $g_{ij}$. We define a scalar product on solutions $\Psi, \Phi$ of the Dirac equation by
\[ \langle \Psi | \Phi \rangle = \int_{H} \bar{\Psi} G^{j} \Phi\, \nu_j\, d\mu \qquad [8] \]
This scalar product is positive definite, and because of current conservation (cf. Finster (1988))
\[ \nabla_j \bar{\Psi} G^{j} \Phi = 0 \]
it is also independent of $H$. By generalizing the expression (due to Dirac), $\bar{\Psi} \gamma^0 \Psi = |\Psi|^2$, in Minkowski space, where $\gamma^0$ and $\bar{\Psi}$, the adjoint spinor, are defined by
\[ \gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \bar{\Psi} = \Psi^{*} \gamma^0 \]
where $*$ denotes complex conjugation, and $1$ is the $2 \times 2$ identity matrix, the quantity $\bar{\Psi} G^{j} \Psi\, \nu_j$ is interpreted as the probability density of the Dirac particle. We normalize solutions of the Dirac equation by requiring
\[ \langle \Psi | \Psi \rangle = 1 \qquad [9] \]

Spherically Symmetric EDYM Equations
In the remainder of this article we assume that all fields are spherically symmetric, so they depend only on the variable $r = |x|$. In this case, the Lorentzian metric in polar coordinates $(t, r, \theta, \varphi)$ takes the form [2]. The Dirac wave function can be (Finster et al. 2000b) described by two real functions, $(\alpha(r), \beta(r))$, and the potential $w(r)$ corresponds to the magnetic component of an SU(2) YM field. As shown in Finster et al. (2000b), the EDYM equations are
\[ \sqrt{A}\, \alpha' = \frac{w}{r}\, \alpha - (m + \omega T)\, \beta \qquad [10] \]
\[ \sqrt{A}\, \beta' = (-m + \omega T)\, \alpha - \frac{w}{r}\, \beta \qquad [11] \]
\[ r A' = 1 - A - \frac{1}{e^2} \frac{(1 - w^2)^2}{r^2} - 2\omega T^2 (\alpha^2 + \beta^2) - \frac{2}{e^2} A w'^2 \qquad [12] \]
\[ 2 r A \frac{T'}{T} = -1 + A + \frac{1}{e^2} \frac{(1 - w^2)^2}{r^2} + 2 m T (\alpha^2 - \beta^2) - 2\omega T^2 (\alpha^2 + \beta^2) + 4 \frac{T}{r} w \alpha\beta - \frac{2}{e^2} A w'^2 \qquad [13] \]
\[ r^2 A w'' = -(1 - w^2) w + e^2 r T \alpha\beta - r^2\, \frac{A' T - 2 A T'}{2T}\, w' \qquad [14] \]
Equations [10] and [11] are the Dirac equations, [12] and [13] are the Einstein equations, and [14] is

the YM equation. The constants m, !, and e denote, crossing the horizon. Assumption 3 is considerably
respectively, the rest mass of the Dirac particle, its weaker than the corresponding assumption in

energy, and the YM coupling constant. Finster et al. (1999b), where, indeed, it was assumed
that the function A(r) obeyed a power law
A(r) = c(r  )s þ O((r  )sþ1 ), with positive con-
Nonexistence of Black Hole Solutions stants c and s, for r > .
Let the surface r =  > 0 represent a black hole event The main result in this subsection is the following
horizon: theorem:
Theorem 1 Every black hole solution of the
AðÞ ¼ 0; AðrÞ > 0 if r >  ½15
EDYM equations [10]–[14] satisfying the regularity
In this case, the normalization condition [9] is conditions 1–3 cannot be normalized and coincides
replaced by with a Bartnik–McKinnon (BM) black hole of the
pffiffiffiffi corresponding Einstein–Yang–Mills (EYM) equa-
Z 1
2 2 T tions; that is, the spinors and
must vanish
ð þ
Þ dr < 1; for every r0 >  ½16 identically outside the event horizon.
r0 A

In addition, we assume that the following global Remark Smoller and Wasserman (1998) proved
conditions hold: that any black hole solution of the EYM equations
  that has finite mass (i.e., that satisfies [17]) must be
lim r 1  AðrÞ ¼ M < 1 ½17 one of the BM black hole solutions (Bartnik and
McKinnon 1988) whose existence was first demon-

(finite mass), strated in Smoller et al. (1993). Thus, amending the

EYM equations by taking quantum-mechanical
lim TðrÞ ¼ 1 ½18 effects into account – in the sense that both the
gravitational and YM fields can interact with Dirac
(gravitational field is asymptotically flat Minkows-
particles – does not yield any new types of black
kian), and
hole solutions.
lim wðrÞ2 ; w0 ðrÞ ¼ ð1; 0Þ ½19 The present strategy in proving this theorem is to
assume that we have a black hole solution of the
(the YM field is well behaved). EDYM equations [10]–[18] satisfying assumptions
Concerning the event horizon r = , we make the 1–3, where the spinors do not vanish identically
following regularity assumptions: outside of the black hole. We shall show that this
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi leads to a contradiction. The proof is broken up
1. The volume element j det gij j = j sin jr2 A1 T 2 into two cases: either A1=2 is integrable or
is smooth and nonzero on the horizon; that is, nonintegrable near the event horizon. We shall
only discuss the proof for the case when A1=2 is
T 2 A1 ; T 2 A 2 C1 ð½; 1ÞÞ integrable near the event horizon, leaving the bi

alternate case for the reader to view in Finster

2. The strength of the YM field Fij is given by et al. (2000a).
If A1=2 is integrable, then one shows that there
2Aw02 ð1  w2 Þ2 are positive constants c, " such that
trðFij Fij Þ ¼ þ
r4 r4
bi 1
(cf. Bartnik and McKinnon 1988). We assume that c  2 ðrÞ þ
2 ðrÞ  ; if <r<þ" ½21
this scalar is bounded near the horizon; that is, c
outside the event horizon and near r = , assume
that Indeed, multiplying [10] by , and [11] by
adding gives an estimate of the form
w and Aw02 are bounded ½20 pffiffiffiffi 2
Að þ
2 Þ0  ð 2 þ
2 Þ
3. The function A(r) is monotone increasing outside
of and near the event horizon. Upon dividing by A( 2 þ
2 ) and integrating from

As discussed in Finster et al. (1999a), if assumption r >  to  þ " gives

1 or 2 were violated, then an observer freely falling
into the black hole would feel strong forces when j logð 2 þ
2 Þð þ "Þ  logð 2 þ
2 ÞðrÞj  const:

from which the desired result follows. Next, from holes; that is, the spinors must vanish identically. In
[12] and [13], other words, the EDYM equations do not admit
normalizable black hole solutions. Thus, in the
rðAT 2 Þ0 ¼ 4  !T 4 ð 2 þ
2 Þ presence of quantum-mechanical Dirac particles, static

4w and spherically symmetric black hole solutions do not

þ T 3 2mð 2 
2 Þ þ

r exist. Another interpretation of our result is that

4 Dirac particles can only either disappear into the black
 2 ðAw0 Þ2 T 2 ½22 hole or escape to infinity. These results were proved
under very weak regularity assumptions on the form of
Using assumption 2 together with the last theorem, the event horizon (see assumptions 1–3).
we see that the coefficients of T^4, T^3, and T^2 on the
right-hand side of [22] are bounded near the horizon, and from
assumption 1 the left-hand side of [22] is bounded Particle-Like Solutions
near . Since assumption 1 implies T(r) ! 1 as
By a particle-like (bound state) solution of the (SU(2))
r & , we see that ! = 0. Since ! = 0, the Dirac
EDYM equations, we mean a smooth solution of
equations simplify and we can show that
is a
eqns [10]–[14], which is defined for all r  0, and
positive decreasing function which tends to 0 as
satisfies condition [9], which explicitly becomes
r ! 1. Then the YM equation can be written in the pffiffiffiffi
Z r
form 2 2 T
ð þ
Þ dr ¼ 1 ½24
r2 ðAw0 Þ0 ¼  wð1  wÞ2 0
2 rðT AÞ
ðAT 2 Þ0 In addition, we demand that [17]–[19] also hold. It
þe pffiffiffiffi þ r2 ðAw0 Þ ½23 is easily shown that, near r = 0, we must have
A 2AT 2
From assumption 2, Aw02 is bounded so A2 w02 ! 0 as
r & . Thus, from [23] we can write, for r near , wðrÞ ¼ 1  r2 þ Oðr2 Þ ½25
ðAw0 Þ0 ðrÞ  c1 þ pffiffiffiffiffiffiffiffiffiffi where is a real parameter. From this, via a Taylor
expansion, one finds that
where c1 and c2 are positive constants. Using this
ðrÞ ¼ 1 r þ Oðr3 Þ
inequality, we can show that for r near , ½26

ðrÞ ¼ 12 ð!T0  mÞ 1 r2 þ Oðr3 Þ
AðrÞ ¼ ðr  ÞBðrÞ
where 0 < limr& B(r) < 1. It follows that A() = 0
and A0 () > 0. Thus, the Einstein metric has the AðrÞ ¼ 1 þ Oðr2 Þ; TðrÞ ¼ T0 þ Oðr2 Þ ½27
same qualitative features as the Schwarzschild
metric near the event horizon. Hence, the metric with two parameters 1 and T0 > 0. Using linearity of
singularity can be removed via a Kruskal transfor-
the Dirac equation, we can always assume that 1 > 0.
mation (Adler et al. 1975). In these Kruskal Under all realistic conditions, the coupling of
coordinates, the YM potential is continuous and Dirac particles to the YM field (describing the weak
bounded (as is easily verified). As a consequence, the
or strong interactions) is much stronger than the
arguments in Finster et al. (2000c) go through and coupling to the gravitational field. Thus, we are
show that the spinors must vanish identically outside particularly interested in the case of weak gravitational
the horizon. For this, one must note that continuous coupling. As shown in Finster et al. (2000b),
zero-order terms in the Dirac operator are irrelevant the gravitational field is essential for the formation
for the derivation of the matching conditions in
of bound states. However, for arbitrarily weak
Finster et al. (2000c, section 2.4). Thus, the matching
gravitational coupling, we can hope to find bound
conditions (equations (2.31), (2.34) of Finster et al. states. It is even conceivable that these bound-state
(2000c)) are valid without changes in the presence solutions might have a well-defined limit when the
of our YM field. Using conservation of the (electro- gravitational coupling tends to zero, if we let the
magnetic) Dirac current and its positivity in timelike
YM coupling go to infinity at the same time. Our
directions, the arguments in Finster et al. (2000c, idea is that this limiting case might yield a system of
section 4) all carry over. This completes the proof. equations which is simpler than the full EDYM
We have thus proved that the only black hole system, and can thus serve as a physically interesting
solutions of our EDYM equations are the BM black starting point for the analysis of the coupled

interactions described by the EDYM equations. regarded as Newton’s equation with the Newtonian
Expressed in dimensionless quantities, we shall thus potential ’. Thus, the limiting case [34] for
consider the limits the gravitational field corresponds to taking the
Newtonian limit. Finally, the normalization con-
m2 ! 0 and e2 ! 1 ½28 dition [16] reduces to
Z 1
That is, we ask whether weak gravitational coupling ðrÞ2 dr ¼ 1 ½36
can give rise to bound states. Using numerical methods, 0
we find particle-like solutions which are stable, even The boundary conditions [17]–[19], [24]–[26] are
for arbitrarily weak gravitational coupling. transformed into
Now assuming that [27] holds (weak gravitational
coupling), so that (A, T)  (1, 1), then we find that
wðrÞ ¼ 1  r 2 þ Oðr3 Þ; lim wðrÞ ¼ 1 ½37
the Dirac equations have a meaningful limit only 2 r!1

under the assumptions that converges and that ^ ¼ Oðr3 Þ

ðrÞ ¼ 1 r þ Oðr3 Þ;
ðrÞ ½38
ðrÞ !
ðrÞ; 2
m ðTðrÞ  1Þ ! ’
½29 ’ðrÞ ¼ ’0 þ Oðr3 Þ; lim ’ðrÞ < 1 ½39
mð!  mÞ ! E r!1

with the three parameters , 1 , and ’0 . We point

ˆ ’ and a real parameter E.
with two real functions
, out that the limiting system contains only one
Multiplying [29] with m and taking the limits [28] coupling constant q. According to [31] and [33],
as well as A, T ! 1, the Dirac equations become q is in dimensionless form given by
0 ¼  2
^ ½30 e2 m2 ! q ½40
Hence, in dimensionless quantities, the limit [17]

^0 ¼ ðE þ ’Þ 
^ ½31 describes the situation where the gravitational cou-
r pling goes to zero, while the YM coupling constant
We next consider the YM equation [14]. The last goes to infinity like e2
(m2 )1 . Therefore, this
term in [14] drops out in the limit of weak limiting case is called the reciprocal coupling limit
gravitational coupling [27]. The second summand (RCL). The reciprocal coupling system is given by
converges only under the assumption that eqns [29], [30], [32], and [34] together with the
normalization conditions [35] and the boundary
!q ½32 conditions [36]–[38]. According to [28], the para-
m meter E coincides up to a scaling factor with !  m,
with q a real parameter, playing the role of an and thus has the interpretation as the (properly
‘‘effective’’ coupling constant. Together with [27], scaled) energy of the Dirac particle. As in Newtonian
this implies that m ! 1. The YM equations thus mechanics, the potential ’ is determined only up to a
have the limit constant  2 R; namely, the reciprocal limit equa-
tions are invariant under the transformation
r2 w00 ¼ ð1  wÞ2 w þ qr
^ ½33
’ ! ’ þ ; E!E ½41
In order to get a well-defined and nontrivial limit of
the Einstein equations [13] and [14], we need to To simplify the connection between the EDYM
assume that the parameter m3 has a finite, nonzero equations, and the RCL equations, we introduce a
limit. Since this parameter has the dimension of parameter " in such a way that as " ! 0, EDYM !
inverse length, we can arrange by a scaling of our RCL; namely,
coordinates that
m3 ! 1 ½34 "¼
We differentiate the T-equation [13] with respect to r Notice that " describes the relative strength of gravity
and substitute [12]. Taking the limits [28] and [33], a versus the YM interaction. For realistic physical
straightforward calculation yields the equation situations, the gravitational coupling is weak;
namely, m2 1, but the YM coupling constant is
r2 ’ ¼  2 ½35
of order 1 : e2
1. So we investigate the parameter
where  = r2 @r (r2 @r ) is the radial Laplacian in range " 1, q
0. These form the starting points for
Euclidean R^3. Indeed, this equation can be the numerics below.
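As a hedged aside, two properties of the limiting system can be checked in a few lines. The sketch below rests on assumptions about symbols that are hard to read in this copy: the radial Laplacian is taken as $\Delta = r^{-2}\partial_r(r^2\partial_r)$, eqn [31] is assumed to contain the potential only through the sum $E + \varphi$, and eqn [35] is assumed to read $r^2\Delta\varphi = -\alpha^2$, where $\alpha$ is the Dirac component normalized by [36]. The name $\alpha$, the sign in [35], and the constant $\varphi_\infty$ are illustrative assumptions, not statements taken from the source.

```latex
% Assumed forms (see the lead-in): 
%   \Delta = r^{-2}\,\partial_r\bigl(r^2\,\partial_r\bigr), \qquad r^2\,\Delta\varphi = -\alpha^2 .
\begin{align*}
% 1. Invariance [41]: a constant shift \mu drops out of the Laplacian,
\Delta(\varphi + \mu) &= r^{-2}\,\partial_r\!\bigl(r^2\,\partial_r(\varphi + \mu)\bigr) = \Delta\varphi ,\\
% and the Dirac equation [31] only sees the combination E + \varphi:
(E - \mu) + (\varphi + \mu) &= E + \varphi .\\
% 2. Since r^2\Delta\varphi = \partial_r(r^2\varphi'), integrating once from 0 to r
%    (with \varphi regular at r = 0) gives
r^2\,\varphi'(r) &= -\int_0^r \alpha(s)^2\,ds \;\xrightarrow[r\to\infty]{}\; -1
   \qquad\text{by the normalization [36]},\\
\varphi(r) &= \varphi_\infty + \frac{1}{r} + o\!\left(\frac{1}{r}\right), \qquad r \to \infty .
\end{align*}
```

Under these assumptions the potential has a finite limit at infinity, as required by the boundary conditions, and its $1/r$ tail is the Newtonian potential of a unit mass, in line with the remark that [35] can be regarded as Newton's equation.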

We seek stable bound states for weak gravita- Discussion

tional coupling. For this purpose, we consider the
In this article we have considered the SU(2) EDYM
total binding energy
equations. Our first result shows that the only black
B¼Mm ½42 hole solutions of these equations are the BM black
holes; that is, the spinors must vanish identically outside
where M is the ADM mass defined by [17] and m is the
of the black hole. In other words, the EDYM equations
rest mass of the Dirac particle. B is thus the amount of
do not admit normalizable black hole solutions. Thus,
energy set free when the binding is broken. If B < 0,
as mentioned earlier, this result indicates that the Dirac
then energy is needed to break up the binding.
particle either enters the black hole or escapes to
According to Lee (1987), a solution is stable if B < 0.
infinity. In two recent publications (Finster et al. 2002a,b),
In order to find solutions of the RCL equations with
we consider the Cauchy problem for a massive Dirac
B < 0, Lee’s treatment and a new two-parameter
equation in a charged, rotating-black-hole geometry
shooting method (Finster et al. 2000b) can be used.
(the non-extreme Kerr–Newman black hole), with
Stable solutions of these RCL equations then follow
compactly supported initial data outside the black
(see Finster et al. (2000b) for details).
hole. We prove that, in this case, the probability that the
We now turn to the full EDYM equations. Here
Dirac particle lies in any compact set tends to zero as
are the key steps of our method:
t ! 1. This means that the Dirac particle indeed either
1. Find solutions which are small perturbations of enters the black hole or tends to infinity. We also show
the limiting (RCL) solutions. that the wave function decays at a rate t5=6 on any
2. Trace these solutions by gradually changing the compact set outside of the event horizon.
coupling constants. For particle-like solutions of the SU(2) EDYM
3. This should yield a one-parameter family of equations, we find stable bound states for arbitrarily
solutions which are ‘‘far’’ from the known limit- weak gravitational coupling. This shows that as weak
ing solutions. as the gravitational interaction is, it has a regularizing
effect on the equations. The stability of particle-like
The point is that we use the RCL solutions as a
solutions of the EDYM equations is in sharp contrast
starting point for numerics, and we ‘‘continue’’ these
to the EYM equations, where the particle-like solu-
solutions to solutions of the full EDYM equations.

tions are all unstable (Straumann and Zhou 1990).

To be somewhat more specific, we see that if we
fix " and q, we have two parameters:
1 ¼ 0 ð0Þ and E¼!m
This work was partially supported by the NSF,
and two conditions at 1:
Contract Number DMS-010-3998.
2 þ
2 ! 0; w2 ! 1
See also: Abelian and Nonabelian Gauge Theories using
We consider the EDYM equations with weaker Differential Forms; Black Hole Mechanics; Bosons and
side conditions Fermions in External Fields; Dirac Operator and Dirac
pffiffiffiffi Field; Einstein Equations: Exact Solutions; Einstein
Z 1
2 2 2 T Equations: Initial Value Formulation; Noncommutative
0< ð þ
Þ dr < 1 Geometry and the Standard Model; Relativistic Wave
0 A
Equations Including Higher Spin Fields; Symmetry
0 < ¼ lim TðrÞ < 1 Classes in Random Matrix Theory.
lim w2 ðrÞ ¼ 1
 ¼ lim rð1  AðrÞÞ < 1 Further Reading
Adler R, Bazin M, and Schiffer M (1975) Introduction to General
Then we rescale these solutions to obtain the true Relativity, 2nd edn. New York: McGraw-Hill.
side conditions via the transformations Bartnik R and McKinnon J (1988) Particle-like solutions of the
pffiffiffi Einstein–Yang–Mills equations. Physical Review Letters 61:
~ ¼ 2 ð 2 rÞ
ðrÞ 141–144.
~ ¼ pffiffiffi

ðrÞ 2
ð 2 rÞ
Bjorken J and Drell S (1964) Relativistic Quantum Mechanics.
New York: McGraw-Hill.
AðrÞ ¼ Að 2 rÞ; ~
TðrÞ ¼ 1 Tð 2 rÞ Finster F (1988) Local U(2, 2) symmetry in relativistic quantum
mechanics. Journal of Mathematical Physics 39: 6276–6290.
~ ¼ 2 m;
m ~ ¼ 2 !
! Finster F, Smoller J, and Yau S-T (1999a) Particle-like solutions of
the Einstein–Dirac equations. Physical Review D 59: 104020/
~ ¼ 6 ; ~e2 ¼ 2 e2 1–104020/19.

Finster F, Smoller J, and Yau S-T (2000a) The interaction of Dirac particles with non-abelian gauge fields and gravitation-black holes. Michigan Mathematical Journal 47: 199–208.
Finster F, Smoller J, and Yau S-T (2000b) The interaction of Dirac particles with non-abelian gauge fields and gravitation-bound states. Nuclear Physics B 584: 387–414.
Finster F, Smoller J, and Yau S-T (1999b) Nonexistence of black hole solutions for a spherically symmetric, static Einstein–Dirac–Maxwell system. Communications in Mathematical Physics 205: 249–262.
Finster F, Smoller J, and Yau S-T (2000c) Nonexistence of time-periodic solutions of the Dirac equation in a Reissner–Nordström black hole background. Journal of Mathematical Physics 41: 2173–2194.
Finster F, Kamran N, Smoller J, and Yau S-T (2002a) The long time dynamics of Dirac particles in the Kerr–Newman black hole geometry. Advances in Theoretical and Mathematical Physics 7: 25–52.
Finster F, Kamran N, Smoller J, and Yau S-T (2002b) Decay rates and probability estimates for Dirac particles in the Kerr–Newman black hole geometry. Communications in Mathematical Physics 230: 201–244.
Lee TD (1987) Mini-soliton stars. Physical Review D 25: 3640–3657.
Smoller J, Wasserman A, and Yau S-T (1993) Existence of infinitely-many smooth global solutions of the Einstein–Yang/Mills equations. Communications in Mathematical Physics 151: 303–325.
Smoller J and Wasserman A (1998) Extendability of solutions of the Einstein–Yang/Mills equations. Communications in Mathematical Physics 194: 707–733.
Straumann N and Zhou Z (1990) Instability of the Bartnik–McKinnon solution of the Einstein–Yang–Mills equations. Physics Letters B 237: 353–356.

Dirac Operator and Dirac Field

S N M Ruijsenaars, Centre for Mathematics and Computer Science, Amsterdam, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

The Dirac equation arose in the early days of quantum mechanics, inspired by the problem of taking special relativity into account in the quantum mechanical description of a freely moving electron. From the outset, however, Dirac looked for an equation that also accommodated the electron spin and that could be modified to include interaction with an external electromagnetic field. The equation he discovered satisfies all of these requirements. On the other hand, when it is rewritten in Hamiltonian form, the spectrum of the resulting Dirac operator includes not only the desired interval $[mc^2, \infty)$ (where m is the electron mass and c the speed of light), but also an interval $(-\infty, -mc^2]$.

Dirac himself already considered this negative part of the spectrum as unphysical, since no such negative energies had been observed and their presence would entail instability of the electron. This physical flaw of the "first-quantized" description of a relativistic electron led to the introduction of "second quantization," as encoded in quantum field theory. In the field-theoretic version of the Dirac theory, the unphysical negative energies are obviated by a prescription that originated in Dirac's hole theory.

Specifically, Dirac postulated that the negative-energy states of his equation were occupied by a sea of unobservable particles, the Pauli principle forbidding an occupancy greater than one. In this heuristic picture, the annihilation of a negative-energy electron yields a hole in the sea, observable as a new type of positive-energy particle with the same mass, but opposite charge. This led Dirac to predict that the electron should have an oppositely charged partner.

His prediction was soon confirmed experimentally, the partner of the negatively charged electron showing up as the positively charged positron. More generally, all electrically charged particles (not only spin-1/2 particles described by the Dirac equation) have turned out to have oppositely charged antiparticles. Furthermore, some electrically neutral particles also have distinct antiparticles.

Returning to the second-quantized Dirac theory, this involves a Dirac quantum field in which the creation/annihilation operators of negative-energy states are replaced by annihilation/creation operators of positive-energy holes, respectively. The hole theory substitution therefore leads to a Hilbert space (called Fock space) that accommodates an arbitrary number of particles and antiparticles with the same mass and opposite charge.

Soon after the introduction of the Dirac equation (which dates from 1928), it turned out that the number of particles and antiparticles is not conserved in a high-energy collision. Such creation and annihilation processes admit a natural description in the Fock spaces associated with relativistic quantum field theories. The very comprehensive mathematical description of real-world elementary particle phenomena that is now called the standard model arose some 30 years ago, and has been abundantly confirmed by experiment ever since. It involves

various relativistic quantum fields with nonlinear interactions. The Dirac quantum field is an essential ingredient, inasmuch as it is used to describe all spin-1/2 particles and antiparticles in the model (including quarks, electrons, neutrinos, etc.).

After this survey (which is not only very brief, but also biased toward the physical concepts at issue), the contents of this article will be sketched. The free Dirac equation associated with the physical Minkowski spacetime $\mathbb{R}^4$ is first detailed. The exposition and notation are slightly unconventional in some respects. This is because we are partly preparing the ground for a mathematically precise account of the second-quantized version of the free Dirac theory. For example, momentum space (as opposed to position space) is emphasized, since the variable x in the Dirac equation does not have a clear physical significance and should be discarded in the Hilbert space formulation of the second-quantized Dirac field. The latter acts on a Fock space of multi-particle and -antiparticle wave functions depending on momentum and spin variables, and the spacetime dependence of the Dirac field is solely a consequence of relativistic covariance. (In particular, the variable x in the Dirac field $\Psi(t, x)$ should not be viewed as the position of particles and antiparticles created and annihilated by the field.)

To be sure, there is much more to the Dirac theory than its free first- and second-quantized versions for Minkowski spacetime $\mathbb{R}^4$. The primary purpose here is, however, to present these foundational versions in some detail. A much more sketchy account of further developments can be found in subsequent sections. First, the one-particle theory is reconsidered. Generalizations of the free theory to arbitrary dimensions and Euclidean settings are sketched and interactions with external fields are described, touching on various aspects and applications.

The next focus is on relations with index theory that arise when the massless Euclidean Dirac operator is generalized to geometric settings, namely l-dimensional Riemannian manifolds allowing a spin structure. We illustrate the general Atiyah–Singer index theory for the Dirac framework with some simple examples for l = 1 (Toeplitz operators) and l = 2 (the manifold $S^1 \times S^1$).

More information on the many-particle Dirac theory appears in the final section. Brief remarks on the Dirac field in interaction with other quantized fields are followed by an elaboration of the far simpler situation of the Dirac field interacting with external fields. Among the S-operators corresponding to such fields there is a special class of unitary matrix multipliers; the external field then vanishes for t < 0 and equals the pure gauge field corresponding to the unitary matrix for $t \ge 0$. Specializing to an even spacetime dimension and choosing special "kink" type unitaries, the associated Fock-space quadratic forms can be made to converge to the free Dirac field.

As mentioned already, Dirac's second quantization procedure was invented to get rid of the unphysical negative energies of the first-quantized (one-particle) theory. It is an amazing fact that the resulting formalism for the simplest case (namely the massless Dirac operator in a two-dimensional spacetime) can be exploited for quite different purposes. In particular, this setting can be tied in with various soliton equations and the representation theory of certain infinite-dimensional groups and Lie algebras. In conclusion, some of these applications are briefly sketched, namely the construction of special solutions to the Kadomtsev–Petviashvili (KP) equation (including the KP solitons and finite-gap solutions) and special representations of Kac–Moody and Virasoro algebras.

The Free One-Particle Dirac Equation in $\mathbb{R}^4$

The free time-dependent Dirac equation is a linear hyperbolic evolution equation for a function $\psi(t, x)$ on spacetime $\mathbb{R}^4$ with values in $\mathbb{C}^4$. It involves four $4 \times 4$ matrices $\gamma^\mu$, $\mu = 0, 1, 2, 3$, satisfying the $\gamma$-algebra

\[ \gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2 g^{\mu\nu} \mathbf{1}_4, \qquad g = \mathrm{diag}(1, -1, -1, -1) \tag*{[1]} \]

Using the Pauli matrices

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag*{[2]} \]

one can choose for example

\[ \gamma^0 = \begin{pmatrix} 0 & \mathbf{1}_2 \\ \mathbf{1}_2 & 0 \end{pmatrix}, \qquad \gamma^k = \begin{pmatrix} 0 & -\sigma_k \\ \sigma_k & 0 \end{pmatrix}, \qquad k = 1, 2, 3 \tag*{[3]} \]

Now the free Dirac equation reads

\[ \bigl( i\hbar \gamma^0 \partial_t + i\hbar c\, \gamma \cdot \nabla - mc^2 \mathbf{1}_4 \bigr) \psi(t, x) = 0 \tag*{[4]} \]

where $\hbar$ is Planck's constant, c the speed of light, and m the particle mass. Using from now on units such that $\hbar = c = 1$, this can be abbreviated as

\[ \Bigl( i \sum_{\mu=0}^{3} \gamma^\mu \partial_\mu - m \Bigr) \psi(x) = 0, \qquad x = (x^0, x^1, x^2, x^3), \qquad \partial_\mu = \partial/\partial x^\mu, \quad \mu = 0, 1, 2, 3 \tag*{[5]} \]

The relativistic invariance of this equation can be understood as follows. First, since the equation does not explicitly involve the spacetime coordinates, it is invariant under spacetime translations. (If $\psi(t, x)$ solves [5], then also $\psi(t - a_0, x - a)$ is a solution for all $(a_0, a) \in \mathbb{R}^4$.) Second, it is invariant under Lorentz transformations (rotations and boosts). Indeed, if $\psi(x)$ is a solution and $L \in SO(1, 3)$, then $S(L)\psi(L^{-1}x)$ solves [5] too, where $S(L)$ denotes a (suitably normalized) matrix satisfying

\[ S(L)^{-1} \gamma^\mu S(L) = \sum_{\nu=0}^{3} L^\mu{}_\nu \gamma^\nu \tag*{[6]} \]

(The matrices on the right-hand side of [6] satisfy the $\gamma$-algebra [1]. From this, the existence of a representation $S(L)$ of $SO(1, 3)$ satisfying [6] is readily deduced.)

As a consequence, the Poincaré group (inhomogeneous Lorentz group) acts in a natural way on the space of solutions to the time-dependent Dirac equation, expressing its independence of the choice of inertial frame. For quantum mechanical purposes, however, one needs to choose a frame and use the associated time variable to rewrite the equation as a Hilbert space evolution equation.

The relevant Hilbert space $\check{\mathcal{H}}$ is the space of four-component functions that are square integrable over space,

\[ \check{\mathcal{H}} = L^2(\mathbb{R}^3, dx) \otimes \mathbb{C}^4 \tag*{[7]} \]

To obtain a self-adjoint Hamiltonian on $\check{\mathcal{H}}$, one multiplies [5] by $\gamma^0$ and introduces the Hermitian matrices

\[ \beta = \gamma^0, \qquad \alpha_k = \gamma^0 \gamma^k, \quad k = 1, 2, 3 \tag*{[8]} \]

Then, one obtains the Schrödinger type equation

\[ i \frac{d\psi}{dt} = \check{H} \psi \tag*{[9]} \]

where $\check{H}$ is the Dirac operator,

\[ \check{H} = -i \alpha \cdot \nabla + \beta m \tag*{[10]} \]

Under Fourier transformation,

\[ \mathcal{F} : \check{\mathcal{H}} \to L^2(\mathbb{R}^3, dp) \otimes \mathbb{C}^4, \qquad \psi(x) \mapsto \hat{\psi}(p) = (2\pi)^{-3/2} \int_{\mathbb{R}^3} dx\, \exp(-i x \cdot p)\, \psi(x) \tag*{[11]} \]

eqn [9] turns into

\[ i \frac{d\hat{\psi}}{dt} = D(p) \hat{\psi}, \qquad D(p) = \alpha \cdot p + \beta m \tag*{[12]} \]

The matrix $D(p)$ is Hermitian and has square $E_p^2 \mathbf{1}_4$, where $E_p$ is the relativistic energy,

\[ E_p = (p \cdot p + m^2)^{1/2} \tag*{[13]} \]

corresponding to a momentum p. Now, we have

\[ U_C\, \overline{D(p)} = -D(-p)\, U_C \tag*{[14]} \]

where $U_C$ is the charge conjugation matrix,

\[ U_C = i \gamma^2 \tag*{[15]} \]

Hence, the four eigenvalues of $D(p)$ are given by $E_p$, $E_p$, $-E_p$, and $-E_p$. Therefore, the matrices

\[ P_\pm(p) = \frac{1}{2} \left( \mathbf{1}_4 \pm \frac{D(p)}{E_p} \right) \tag*{[16]} \]

are projections on the positive and negative spectral subspaces of $D(p)$.

As an orthonormal base for the positive-energy subspace, we can now choose

\[ w_{+,j}(p) = \left( \frac{2 E_p}{E_p + m} \right)^{1/2} P_+(p)\, b_j, \qquad j = 1, 2 \tag*{[17]} \]

where

\[ b_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \qquad b_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix} \tag*{[18]} \]

Next, setting

\[ w_{-,j}(p) = U_C\, \overline{w_{+,j}(p)}, \qquad j = 1, 2 \tag*{[19]} \]

an orthonormal base $w_{-,1}(p)$, $w_{-,2}(p)$ for the negative-energy subspace of $D(-p)$ is obtained; cf. [14].

The upshot is that the time-independent Dirac equation

\[ \check{H} \psi = E \psi \tag*{[20]} \]


gives rise to bounded eigenfunctions transform into the operators

eþ;j ðx; pÞ ðPk f Þ

ðpÞ ¼
pk f
¼ þ;  ½31
¼ ð2 Þ expðix  pÞwþ;j ðpÞ; j ¼ 1; 2
e;j ðx; pÞ ðCf Þ
ðpÞ ¼ f
¼ þ;  ½32
¼ ð2 Þ expðix  pÞw;j ðpÞ; j ¼ 1; 2
ðPf Þ
ðpÞ ¼
¼ þ;  ½33
with eigenvalues $E = E_p$ and $E = -E_p$, respectively. Clearly,
they are not square-integrable, but they can be used
as the kernel of a unitary transformation between ðTf Þ
ðpÞ ¼ i2 f
¼ þ;  ½34
Ȟ [7] and the Hilbert space
Note that Pk , P, and T leave the positive- and
H ¼ H þ H  ¼ Pþ H P H negative-energy subspaces Hþ and H invariant,
½22 whereas C interchanges them.
Hþ ; H ¼ L2 ðR3 ; dpÞ  C2
To conclude this section, we describe some salient
Specifically, we have features of the unitary representation of the (identity
component of the) Poincaré group on H, which
W : H ! Ȟ follows from the representation on solutions to [5]
f ðpÞ ¼ ðfþ ðpÞ; f ðpÞÞ already sketched. The spacetime translations over
X XZ a 2 R4 are represented by the unitary operator
7! ðxÞ ¼ dp e
;j ðx; pÞf
;j ðpÞ ½23 exp (ia0 H þ ia  P); explicitly,

¼þ; j¼1;2 R3
ðexpðia0 H þ ia  PÞf Þ
which entails ¼ expði
ða0 Ep  a  pÞÞf
¼ þ;  ½35
ðW 1 Þ
;j ðpÞ ¼ dx e
;j ðx; pÞ  ðxÞ ½24 The representation of the Lorentz group involves
R3 unitary 2  2 matrices U(k, A), where k is an
(Here and throughout this article, a bar denotes arbitrary 4-vector satisfying k k = 1 and A the
complex conjugation.) matrix in SL(2, C) representing L 2 SO(1, 3). (Recall
From the above, it is clear that the Dirac that SL(2, C) can be viewed as a 2-fold cover of
Hamiltonian Ȟ acting on the Hilbert space Ȟ is SO(1, 3).) In particular, U(k, A) does not depend on
unitarily equivalent to the multiplication operator on k for rotations,
H [22] given by Uðk; AÞ ¼ A
; 8A 2 SUð2Þ ½36
ðHf Þ
ðpÞ ¼
Ep f
¼ þ;  ½25 (Here and henceforth, we use
to denote the
Hermitian adjoint of matrices and operators.) For
Indeed, W is a diagonalizing transformation for Ȟ, boosts, however, there is dependence on the vector
the relation k, which is the image of the vector (1, 0) under the
boost. We refrain from a more detailed description
H ¼ W 1 ȞW ½26 of U(k, A), as this would carry us too far afield.
yielding an explicit realization of the spectral The unitary SO(1, 3) representation leaves the
theorem. decomposition H = Pþ H P H invariant. On the
Using the same notational convention, the positive-energy subspace Hþ , it is given by
momentum, charge conjugation, parity, and time-  L 1=2 
p0 p 

reversal operators on Ȟ, given by ðUðLÞf Þþ ðpÞ ¼ U ; A fþ ðpL Þ ½37

p0 m
ðP̌k ÞðxÞ ¼ i@k ðxÞ ½27

ðČ ÞðxÞ ¼ UC ðxÞ ½28 p ¼ ðEp ; pÞ; pL ¼ L1 p ½38

On H , it is given by the complex-conjugate
ðP̌ ÞðxÞ ¼ UP ðxÞ; UP ¼  0 ½29 representation,
 1=2  t
pL0 p
ðUðLÞf Þ ðpÞ ¼ U ; A f ðpL Þ ½39
ðŤ ÞðxÞ ¼ UT ðxÞ; UT ¼  1  3 ½30 p0 m

just as for the spacetime translations, cf. [35]. (The superscript t is used to denote the transpose matrix.) This feature is crucial for the second-quantized Dirac theory, which is discussed next.

The Free Dirac Field in $\mathbb{R}^4$

The free Dirac field is an operator-valued distribution on a Fock space that describes an arbitrary number of spin-1/2 particles and antiparticles in terms of momentum space wave functions. Since spin-1/2 particles are fermions (which encodes the Pauli exclusion principle), an M-particle wave function $F^{+,\dots,+}_{j_1,\dots,j_M}(\mathbf{p}_1,\dots,\mathbf{p}_M)$ (where $j_l \in \{1,2\}$ is the spin index) is antisymmetric under any interchange of a pair $(j_i,\mathbf{p}_i)$ and $(j_k,\mathbf{p}_k)$. Likewise, N-antiparticle wave functions $F^{-,\dots,-}_{k_1,\dots,k_N}(\mathbf{q}_1,\dots,\mathbf{q}_N)$ are antisymmetric. But a wave function $F^{+,-}_{j,k}(\mathbf{p},\mathbf{q})$ describing a particle–antiparticle pair need not have any symmetry property, since a particle and an antiparticle can be distinguished by their charge.

The relevant Fock space is therefore the tensor product of two antisymmetric Fock spaces built over the one-particle and one-antiparticle spaces $L^2(\mathbb{R}^3, d\mathbf{p}) \otimes \mathbb{C}^2$. For later purposes, it is important to view these spaces as the summands $\mathcal{H}_+$ and $\mathcal{H}_-$ of the space $\mathcal{H}$ from the previous section. Thus, the arena for the free Dirac field is the Hilbert space

$\mathcal{F}_a(\mathcal{H}) \simeq \mathcal{F}_a(\mathcal{H}_+) \otimes \mathcal{F}_a(\mathcal{H}_-)$   [40]

where, for example,

$\mathcal{F}_a(\mathcal{H}) = \overline{\left(\mathbb{C} \oplus \mathcal{H} \oplus (\mathcal{H}\otimes\mathcal{H})_a \oplus \cdots\right)}$   [41]

where the bar denotes the completion of the infinite direct sum in the obvious inner product. The tensor (1, 0, 0, . . .) is viewed as the vacuum (the "filled Dirac sea") and denoted by $\Omega$.

To get around in Fock space, one employs the creation and annihilation operators $c^{(*)}(f)$, $f \in \mathcal{H}$. The creation operator $c^*(f)$, $f \in \mathcal{H}$, is defined by linear and continuous extension of its action on the vacuum $\Omega$ and on elementary antisymmetric tensors, recursively given by

$c^*(f)\Omega = f, \quad c^*(f)f_1 = f \wedge f_1, \ \dots$
$c^*(f)\, f_1 \wedge \cdots \wedge f_N = f \wedge f_1 \wedge \cdots \wedge f_N, \ \dots$   [42]

Its adjoint, the annihilation operator $c(f)$, satisfies

$c(f)\Omega = 0, \quad c(f)f_1 = (f, f_1)\Omega, \ \dots$
$c(f)\, f_1 \wedge \cdots \wedge f_N = \sum_{j=1}^{N} (-)^{j-1}(f, f_j)\, f_1 \wedge \cdots \wedge \widehat{f_j} \wedge \cdots \wedge f_N, \ \dots$   [43]

Accordingly, the operators $c^{(*)}(f)$ satisfy the canonical anticommutation relations (CARs) over $\mathcal{H}$,

$\{c(f), c(g)\} = 0,$
$\{c(f), c^*(g)\} = (f, g), \quad \forall f, g \in \mathcal{H}$   [44]

where $\{A, B\}$ denotes the anticommutator $AB + BA$. (From this, one readily deduces that $c^{(*)}(f)$ is bounded with norm $\|f\|$.)

Next, recalling the direct sum decomposition [22], a notation change

$c^{(*)}(P_+f) \to a^{(*)}(P_+f), \quad c^{(*)}(P_-f) \to b^{(*)}(P_-f)$   [45]

is made, thus indicating that $a^{(*)}$ and $b^{(*)}$ should be viewed as the creation/annihilation operators of particles and antiparticles, resp. Since $\mathcal{H}_+$ and $\mathcal{H}_-$ are copies of $L^2(\mathbb{R}^3, d\mathbf{p}) \otimes \mathbb{C}^2$, a given function $(f_1(\mathbf{p}), f_2(\mathbf{p}))$ in the latter space can occur both as an argument of $a^{(*)}(\cdot)$ and of $b^{(*)}(\cdot)$; it can also be viewed as a smearing function for unsmeared quantities $a_j^{(*)}(\mathbf{p})$ and $b_j^{(*)}(\mathbf{p})$, $j = 1, 2$, that are often referred to as operators as well (even though they are only quadratic forms). Thus, one has, for example,

$b^*(f) = \sum_{j=1}^{2} \int_{\mathbb{R}^3} d\mathbf{p}\, b_j^*(\mathbf{p}) f_j(\mathbf{p})$
$b(f) = \sum_{j=1}^{2} \int_{\mathbb{R}^3} d\mathbf{p}\, b_j(\mathbf{p}) \overline{f_j(\mathbf{p})}$   [46]

As explained shortly, the smeared time-zero Dirac field takes the form

$\Phi(f) = a(P_+f) + b^*(KP_-f), \quad f \in \mathcal{H}$   [47]

Here and below, $K$ denotes complex conjugation on $\mathcal{H}$, $\mathcal{H}_+$, and $\mathcal{H}_-$. Just as the operators $c^{(*)}(f)$, the operators $\Phi^{(*)}(f)$ satisfy the CARs over $\mathcal{H}$,

$\{\Phi(f), \Phi(g)\} = 0,$
$\{\Phi(f), \Phi^*(g)\} = (f, g), \quad \forall f, g \in \mathcal{H}$   [48]

as is readily verified using [44]–[45]. But this $\Phi$-representation is not unitarily equivalent to the $c$-representation [44]. This becomes clear in particular from the consideration of a crucial type of CAR automorphism that is considered next.

To this end, we fix a unitary operator $U$ on $\mathcal{H}$. Then it is plain that the operators

$\tilde{c}^{(*)}(f) = c^{(*)}(Uf)$   [49]

$\tilde{\Phi}^{(*)}(f) = \Phi^{(*)}(Uf)$   [50]

also satisfy the CARs. The CAR-algebra automorphism $c^{(*)}(f) \mapsto \tilde{c}^{(*)}(f)$ can be unitarily implemented in $\mathcal{F}_a(\mathcal{H})$, since one has

$\tilde{c}^{(*)}(f) = \Gamma(U)\, c^{(*)}(f)\, \Gamma(U^*)$   [51]

where $\Gamma(U)$ denotes the Fock-space product operator corresponding to $U$. Thus, for example,

$\Gamma(U)\Omega = \Omega, \quad \Gamma(U)f = Uf, \ \dots$
$\Gamma(U)\, f_1 \wedge \cdots \wedge f_N = Uf_1 \wedge \cdots \wedge Uf_N, \ \dots$   [52]

For the CAR automorphism $\Phi^{(*)}(f) \mapsto \tilde{\Phi}^{(*)}(f)$ this is not true, however. Rewriting it in terms of the annihilation and creation operators $a^{(*)}$ and $b^{(*)}$ via [47], it amounts to a linear transformation (Bogoliubov transformation), whose unitary implementability has been clarified several decades ago. To be specific, the necessary and sufficient condition for unitary implementability is that the off-diagonal parts

$U_{+-} = P_+UP_-, \quad U_{-+} = P_-UP_+$   [53]

in the 2 × 2 matrix decomposition of operators on $\mathcal{H}$ be Hilbert–Schmidt operators. Therefore, no problem arises when $U$ is diagonal with respect to this decomposition. Indeed, in that case one can choose as unitary implementer the product operator

$\tilde{\Gamma}(U) = \Gamma(U_{++}) \otimes \Gamma(KU_{--}K)$   [54]

(cf. the tensor product structure [40] of $\mathcal{F}_a(\mathcal{H})$).

In particular, the automorphism

$\Phi(f) \mapsto \Phi(e^{itH}f)$   [55]

where $H$ is the free diagonalized Dirac Hamiltonian [25], is implemented by the operator

$\tilde{\Gamma}(e^{itH}) = \Gamma(e^{itE}) \otimes \Gamma(e^{itE})$   [56]

where $E$ denotes multiplication by $E_p$ on $\mathcal{H}_+$ and $\mathcal{H}_-$. The change of CAR representation, therefore, entails that the unphysical negative energies of the one-particle theory are replaced by positive energies of antiparticles. Hence, we obtain a mathematically precise version of Dirac's hole theory substitution $b_j(\mathbf{p}) \to b_j^*(\mathbf{p})$, $b_j^*(\mathbf{p}) \to b_j(\mathbf{p})$.

More generally, if one chooses for $U$ the Poincaré group representation (given by [35] and [37]–[39]), then the Fock-space implementer [54] is the tensor product of two product operators with the same action on $\mathcal{F}_a(L^2(\mathbb{R}^3, d\mathbf{p}) \otimes \mathbb{C}^2)$. Observe that this is also true for the Fock-space version $\tilde{\Gamma}(T) = \Gamma(T)$ of the time-reversal operator [34]. By contrast, the Fock-space parity operator $\tilde{\Gamma}(P) = \Gamma(P)$ gives rise to two product operators with slightly different actions, cf. [33]. Accordingly, particles and antiparticles have opposite parity.

The map

$\Phi(f) \mapsto \Phi^*(\mathcal{C}f)$   [57]

also yields a CAR automorphism. It is unitarily implemented by the Fock-space charge-conjugation operator

$\Gamma(C), \quad C = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$   [58]

which interchanges particles and antiparticles. Notice that $\Gamma(C)$ is unitary, whereas $\mathcal{C}$ is antiunitary.

It remains to establish the precise relation of the above to the customary free Dirac field $\Psi(t, \mathbf{x})$. This is a quadratic form on $\mathcal{F}_a(\mathcal{H})$ given by

$\Psi(t, \mathbf{x}) = (2\pi)^{-3/2} \int_{\mathbb{R}^3} d\mathbf{p} \sum_{j=1,2} \left( a_j(\mathbf{p})\, w_{+,j}(\mathbf{p})\, e^{-iE_pt + i\mathbf{p}\cdot\mathbf{x}} + b_j^*(\mathbf{p})\, w_{-,j}(\mathbf{p})\, e^{iE_pt - i\mathbf{p}\cdot\mathbf{x}} \right)$   [59]

(Its expectation $\langle F_1, \Psi(t, \mathbf{x})F_2 \rangle$ is, for example, well defined for $F_1, F_2$ in the dense subspace of $\mathcal{F}_a(\mathcal{H})$ that consists of vectors with finitely many particles and antiparticles and wave functions in Schwartz space.) It satisfies the time-dependent Dirac equation

$i\partial_t\Psi = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\Psi$   [60]

in the sense of quadratic forms. Furthermore, smearing it with a function $\psi(\mathbf{x})$ in the Hilbert space $\check{\mathcal{H}}$ (7), we obtain

$\int_{\mathbb{R}^3} d\mathbf{x}\, \overline{\psi}(\mathbf{x}) \cdot \Psi(t, \mathbf{x}) = \Phi(e^{itH}W^{-1}\psi) = \tilde{\Gamma}(e^{itH})\, \Phi(W^{-1}\psi)\, \tilde{\Gamma}(e^{-itH}), \quad \psi \in \check{\mathcal{H}}$   [61]

As announced, the time evolution of the free Dirac field is, therefore, given by the unitary one-parameter group [56], whose generator (the second-quantized Dirac Hamiltonian) has spectrum $\{0\} \cup [m, \infty)$.

The Dirac field $\Psi(t, \mathbf{x})$ can also be smeared with a test function $F(t, \mathbf{x})$ in the Schwartz space $S(\mathbb{R}^4)^4$, yielding a bounded operator

$\Psi(F) = \int_{\mathbb{R}^4} dx\, \overline{F}(x) \cdot \Psi(x)$   [62]

Then one obtains the relativistic covariance relation

$\tilde{\Gamma}(U(a, L))\, \Psi(F)\, \tilde{\Gamma}(U(a, L))^* = \Psi(F_{a,L})$   [63]

where

$F_{a,L}(x) = S(L^{-1})^t F(L^{-1}(x - a))$   [64]

and $U(a, L)$ denotes the Poincaré group representation on $\mathcal{H}$, cf. [35] and [37]–[39]. Likewise, one gets the inversion formulas

$\tilde{\Gamma}(I)\, \Psi(F)\, \tilde{\Gamma}(I)^* = \Psi(F_I), \quad I = P, T$   [65]

with

$F_P(t, \mathbf{x}) = U_P^t F(t, -\mathbf{x}), \quad F_T(t, \mathbf{x}) = U_T^t \overline{F}(-t, \mathbf{x})$   [66]

while the Fock-space charge-conjugation operator [58] transforms the Dirac field as

[67]

with

$F_C(x) = U_C^t \overline{F}(x)$   [68]

Finally, let us consider the global U(1) gauge transformations $f \mapsto e^{i\alpha}f$, where $\alpha \in \mathbb{R}$ and $f \in \mathcal{H}$. They can be implemented by

$\tilde{\Gamma}(e^{i\alpha}) = \Gamma(e^{i\alpha}) \otimes \Gamma(e^{-i\alpha})$   [69]

and one has

$\tilde{\Gamma}(e^{i\alpha})\, \Psi(F)\, \tilde{\Gamma}(e^{-i\alpha}) = \Psi(F_\alpha)$   [70]

with

$F_\alpha(x) = e^{i\alpha}F(x)$   [71]

The generator $Q$ of the one-parameter group $\alpha \mapsto \tilde{\Gamma}(e^{i\alpha})$ is the charge operator: on wave functions describing $N_+$ particles and $N_-$ antiparticles, it has eigenvalue $N_+ - N_-$.

More on the One-Particle Dirac Theory

Even for the free one-particle setting, the account given earlier is far from complete. To begin with, the free Dirac equation admits a specialization to massless particles. In the Weyl representation of the $\gamma$-algebra adopted above, the choice $m = 0$ entails that the p-space equation [12] decouples into two 2 × 2 equations for spinors that can be labeled by their chirality ("handedness"). This refers to their eigenvalue with respect to the chirality matrix

$\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix} 1_2 & 0 \\ 0 & -1_2 \end{pmatrix}$   [72]

and this notion derives from the noninvariance of the separate 2 × 2 equations under parity. (A positive-chirality spinor is mapped to a negative-chirality spinor under the parity operator $\check{P}$ (33) and vice versa.) Since the weak interaction breaks parity symmetry, the two 2 × 2 equations (often called Weyl equations) do have physical relevance. Indeed, the associated quantum fields are a crucial ingredient of the standard model.

Next, we point out that it is possible to switch to a representation in which the gamma matrices are real. This so-called Majorana representation is convenient (but not indispensable) in the description of neutral spin-1/2 particles. By definition, such particles are equal to their antiparticles, so that the second-quantized formalism of the previous section must be adapted: one needs the neutral CAR algebra over $\mathcal{H}$ (also known as self-dual CAR).

For various purposes, it is important to formulate the free Dirac equation for a spacetime whose spatial dimension is arbitrary. Then one needs, first of all, gamma matrices satisfying the (Minkowski) Clifford algebra relations

$\gamma^\mu\gamma^\lambda + \gamma^\lambda\gamma^\mu = 2g^{\mu\lambda}1_\nu, \quad g = \mathrm{diag}(1, -1_n)$   [73]

where $n$ is the space dimension and the minimal size $\nu$ of the gamma matrices is to be determined. Clearly, for $n = 1$ and $n = 2$, one can take $\nu = 2$, choosing, for example,

$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \gamma^1 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad \gamma^2 = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$   [74]

to fulfill [73]. For $n = 4$, one can take $\nu = 4$, just as for $n = 3$, supplementing [1] with the matrix $i\gamma^5$, cf. [72].

More generally, for $n = 2N - 1$ and $n = 2N$, one can take $\nu = 2^N$ in [73]. Indeed, a representation on the $2^N$-dimensional fermion Fock space $\mathcal{F}_a(\mathbb{C}^N)$ (cf. [41]) is readily constructed using the creation and annihilation operators described in the previous section. Once this has been taken care of, most of the discussion on the free one-particle Dirac equation in $\mathbb{R}^4$ can be easily generalized. Of special importance in this regard is the straightforward adaptation of the formulas [7]–[26], which form the foundation for the second-quantized version. Indeed, the discussion of the last section applies nearly verbatim for arbitrary spacetime dimension.

In several applications, the so-called Euclidean version of the free Dirac theory in spacetime dimension $n + 1$ is important. Basically, this version is obtained upon replacing $i\partial_0$ by $\partial_{n+1}$ in the Dirac equation, a substitution that changes the character of the equation from hyperbolic to elliptic. Provided that the mass vanishes, the Euclidean Dirac equation admits a reinterpretation as a time-independent zero-eigenvalue Weyl equation in a Minkowski spacetime of dimension $n + 2$. (This equation is often called the zero-mode equation.)
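The statement above that gamma matrices of size $2^N$ can be built from creation and annihilation operators on the fermion Fock space over $\mathbb{C}^N$ can be checked numerically. The following is a minimal sketch, not the article's own construction: it assumes a Jordan–Wigner matrix realization of the annihilation operators, and the helper names (`kron`, `annihilators`, `gamma_matrices`, `satisfies_clifford`) are hypothetical. It uses only the Python standard library and tests the Clifford relations [73] with metric diag(1, −1, ..., −1).

```python
from functools import reduce

I2 = [[1, 0], [0, 1]]
SZ = [[1, 0], [0, -1]]      # sign factor of the (assumed) Jordan-Wigner string
LOWER = [[0, 1], [0, 0]]    # annihilation operator for a single fermionic mode


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(x).conjugate() for x in col] for col in zip(*a)]


def combine(a, b, s):
    """Entrywise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def annihilators(N):
    """Matrices c_1..c_N on the 2^N-dimensional Fock space obeying the CARs."""
    return [reduce(kron, [SZ] * j + [LOWER] + [I2] * (N - j - 1))
            for j in range(N)]


def gamma_matrices(n):
    """Return gamma^0..gamma^n of size 2^N, where n = 2N - 1 or n = 2N."""
    N = (n + 1) // 2
    e = []  # Hermitian generators with e_a e_b + e_b e_a = 2 delta_ab
    for c in annihilators(N):
        e.append(combine(c, dagger(c), 1))            # c + c*
        e.append(scale(1j, combine(c, dagger(c), -1)))  # i (c - c*)
    if n == 2 * N:
        # One extra generator: i^N times the product of the 2N others.
        e.append(scale(1j ** N, reduce(matmul, e)))
    # gamma^0 squares to +1; the spatial ones get a factor i to square to -1.
    return [e[0]] + [scale(1j, ek) for ek in e[1:n + 1]]


def satisfies_clifford(g, tol=1e-12):
    """Check g[mu] g[lam] + g[lam] g[mu] = 2 diag(1, -1, ..., -1)[mu][lam] 1."""
    dim = len(g[0])
    for mu, g_mu in enumerate(g):
        for lam, g_lam in enumerate(g):
            anti = combine(matmul(g_mu, g_lam), matmul(g_lam, g_mu), 1)
            target = 0 if mu != lam else (2 if mu == 0 else -2)
            for r in range(dim):
                for col in range(dim):
                    expected = target if r == col else 0
                    if abs(anti[r][col] - expected) > tol:
                        return False
    return True


if __name__ == "__main__":
    for n in range(1, 7):
        g = gamma_matrices(n)
        print(n, len(g[0]), satisfies_clifford(g))
```

The sketch multiplies the spatial generators by $i$ so that they square to $-1$, matching the signature in [73]; any other unitarily equivalent choice would serve equally well.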

Let us now turn to the description of the interaction with an external electromagnetic potential $A_\mu(t, \mathbf{x})$. This can be taken into account via the minimal substitution,

$\partial_\mu \to \partial_\mu + ieA_\mu$   [75]

also known as the covariant derivative, in the time-dependent Dirac equation [5].

For the electron in the Coulomb field of a nucleus of charge $Ze$, one has

$A_k = 0, \ k = 1, 2, 3, \quad A_0 = -\frac{Ze}{4\pi|\mathbf{x}|}$   [76]

and the time-independent equation

$\left( -i\boldsymbol{\alpha}\cdot\nabla + \beta m - \frac{Ze^2}{4\pi|\mathbf{x}|} \right)\psi = E\psi$   [77]

can be solved explicitly. This leads to a bound-state spectrum that is more accurate than its nonrelativistic counterpart. In particular, one finds that energy levels that are degenerate in the nonrelativistic theory split up into slightly different levels. The resulting fine structure of the Dirac levels can be understood as a consequence of the coupling between the spin of the electron and its orbital motion.

In spite of this better agreement with the experimental levels, the physical interpretation of the Dirac electron in a Coulomb field is enigmatic. This is not only because of the persistence of the negative-energy states of the free theory (which turn into scattering states), but also because of unphysical properties of the position operator.

More general time-independent external fields (such as step potentials $A_0(x)$ with a step height larger than $2m$) can cause transitions between positive- and negative-energy states (Klein paradox). This phenomenon is enhanced when time dependence is allowed. In particular, any external field that is given by functions in $C_0^\infty(\mathbb{R}^4)$ leads to a scattering operator $S$ on the one-particle space $\mathcal{H}$ [22] that has nonzero off-diagonal parts $S_{\pm\mp}$. Hence, a positive-energy wave packet scattering at such a time- and space-localized field has a nonzero probability to show up as a negative-energy wave packet.

When one tensors the one-particle space $\check{\mathcal{H}}$ with an internal symmetry space $\mathbb{C}^k$, one can also couple external Yang–Mills fields $A_\mu$ taking values in the $k \times k$ matrices via the substitution [75]. (From a geometric viewpoint, this can be rephrased as tensoring the spinor bundle with a vector bundle equipped with a connection $A$.) The generalization of this external gauge field coupling to a Minkowski spacetime or Euclidean space of arbitrary dimension is straightforward. An adaptation of the resulting interacting one-particle Dirac theory in arbitrary dimension to quite general geometric settings also yields a crucial starting point for index theory.

Before turning to the latter area, we conclude this section with another striking application of the one-particle framework, namely the massless Dirac equation in two spacetime dimensions with special external fields. Specifically, the relevant Dirac operator is of the form

$\begin{pmatrix} i\,\dfrac{d}{dx} & -iq(x) \\ ir(x) & -i\,\dfrac{d}{dx} \end{pmatrix}$   [78]

where $r(x)$ and $q(x)$ are not necessarily real valued. (Note that this operator is in general not self-adjoint.) With suitable restrictions on $r$ and $q$, the direct and inverse scattering theory associated with the Dirac operator [78] can be applied to various nonlinear PDEs in two spacetime dimensions to solve their Cauchy problems in considerable detail. As a crucial special case, initial conditions yielding vanishing reflection give rise to soliton solutions for the pertinent equation.

The first example in this framework was found by Zakharov and Shabat (the nonlinear Schrödinger equation); with other choices of $r$ and $q$ several other soliton PDEs (including the sine-Gordon and modified Korteweg–de Vries equations) were handled by Ablowitz, Kaup, Newell, and Segur, who studied a quite general class of external fields $r$ and $q$.

The Dirac Operator and Index Theory

Thus far, we have considered various versions of the Dirac operator associated with the spaces $\mathbb{R}^l$ for some $l \ge 1$. For applications in the area of index theory, however, one needs to generalize this base manifold. Indeed, one can define a Dirac operator for any $l$-dimensional oriented Riemannian manifold $M$ that admits a spin structure. This is a lifting of the transition functions of the tangent bundle $TM$ (which may be assumed to take values in SO($l$)) to the simply connected twofold cover Spin($l$) (taking $l \ge 3$).

Choosing first $l = 2N + 1$, the spin group has a faithful irreducible representation on $\mathbb{C}^{2^N}$. Hence, one obtains a $\mathbb{C}^{2^N}$-bundle over $M$, the spinor bundle. The Levi-Civita connection on $M$ derived from the metric can now be lifted to a connection on the spinor bundle. From the covariant derivative corresponding to the spin connection and the

Clifford algebra generators $\gamma^1, \dots, \gamma^l$, one can then construct a first-order elliptic differential operator that acts on sections of the spinor bundle. (For the case $M = \mathbb{R}^{2N+1}$ with its Euclidean metric, this construction yields the massless positive-chirality Dirac operator acting on wave functions with $2^N$ components, as considered above.)

The massless Dirac operator thus obtained is self-adjoint as an operator on the $L^2$-space $\mathcal{H}$ associated with the spinor bundle, and it has infinite-dimensional positive and negative spectral subspaces $\mathcal{H}_+$ and $\mathcal{H}_-$. (In this section the check accent on position-space quantities is omitted.) Specializing to the case of compact $M$, a continuous map from $M$ to $\mathbb{C}^*$ gives rise to a Fredholm operator on $\mathcal{H}_+$, and more generally a continuous map from $M$ to GL($k$, $\mathbb{C}$) yields a Fredholm operator on $\mathcal{H}_+ \otimes \mathbb{C}^k$.

For a smooth map, the Fredholm index of this operator can be written in terms of an integral over $M$ involving certain closed differential forms. The value of this integral does not change when exact forms are added, since $M$ has no boundary. Hence, one is dealing with de Rham cohomology classes. In this context, the class involved ("characteristic class") is determined by the Riemann curvature tensor of $M$ and the topological ("winding") characteristics of the map.

The simplest example of this state of affairs arises for $l = 1$ and $M = S^1$ with its obvious spin structure (periodic boundary conditions). Writing $\psi \in \mathcal{H} = L^2(S^1)$ as

$\psi(z) = \sum_{n \in \mathbb{Z}} a_n z^n, \quad z \in S^1$   [79]

the Dirac operator $H$ on $\mathcal{H}$ reads

$H = z\,\frac{d}{dz}$   [80]

It has eigenfunctions $z^n$, $n \in \mathbb{Z}$. Thus, we may take

$(P_+\psi)(z) = \sum_{n \ge 0} a_n z^n, \quad (P_-\psi)(z) = \sum_{n < 0} a_n z^n$   [81]

As a consequence, the functions in $\mathcal{H}_+$ ($\mathcal{H}_-$) are $L^2$-boundary values of holomorphic functions in $|z| < 1$ ($|z| > 1$). Operators of the form

$T_\varphi = P_+ M_\varphi P_+$   [82]

where $\varphi$ is a continuous function on $S^1$ and $M_\varphi$ denotes multiplication by $\varphi$, are called Toeplitz operators. It is not hard to see that they are Fredholm (viewed as operators on $\mathcal{H}_+$), provided that $\varphi$ does not vanish on $S^1$. (Recall a bounded operator $B$ is Fredholm if it has finite-dimensional kernel $K$ and cokernel $C$. Its Fredholm index is given by

$\mathrm{index}(B) = \dim K - \dim C$   [83]

and is norm continuous and invariant under addition of a compact operator.) Assuming $\varphi(S^1) \subset \mathbb{C}^*$ from now on, the curve $\varphi(S^1)$ has a well-defined winding number $w(\varphi)$ with respect to the origin. The equality

$\mathrm{index}(T_\varphi) = -w(\varphi)$   [84]

between objects from the area of analysis on the left-hand side and from the areas of topology and geometry on the right-hand side is the simplest example of an Atiyah–Singer type index formula. When $\varphi$ is not only continuous but also smooth, the index formula can be rewritten as

$\mathrm{index}(T_\varphi) = -\frac{1}{2\pi i} \int_{S^1} \frac{d\varphi}{\varphi}$   [85]

yielding a characteristic class version.

It should be noted that the operator $M_\varphi$ on $\mathcal{H}$ has a bounded inverse $M_{1/\varphi}$ when $0 \notin \varphi(S^1)$, hence a trivially vanishing index. Therefore, the compression [82] involving the spectral projection of the Dirac operator is needed to get a nonzero index. Observe also that the equality [84] is quite easily verified for the case $\varphi(z) = z^n$, since $T_\varphi$ yields a power of the right ($n > 0$) or left ($n < 0$) shift on $P_+\mathcal{H} \simeq l^2(\mathbb{N})$.

We proceed to the case of even-dimensional manifolds, $l = 2N$. Then the fiber $\mathbb{C}^{2^N}$ of the spinor bundle splits into a direct sum of even and odd spinors, corresponding to two distinct representations of Spin($2N$) on $\mathbb{C}^{2^{N-1}}$. (Here it is assumed that $N > 3$; recall the Lie algebra isomorphisms so(4) ≃ so(3) ⊕ so(3) and so(6) ≃ su(4).) With respect to this decomposition, the Dirac operator can be written as

$H = \begin{pmatrix} 0 & D^* \\ D & 0 \end{pmatrix}$   [86]

where $D$ and $D^*$ are again first-order elliptic differential operators expressed in terms of Clifford algebra generators and the spin connection. Tensoring the spinor bundle with a vector bundle equipped with a connection $A$, one can define a Dirac operator on the tensor product which involves $A$ and takes the form

$H_A = \begin{pmatrix} 0 & D_A^* \\ D_A & 0 \end{pmatrix}$   [87]

with respect to the even/odd spinor decomposition. Once more, the index of $D_A$ (viewed as a Fredholm operator between two different Hilbert spaces) can be expressed as an integral over $M$ involving characteristic classes that depend on the curvatures of the two connections.

Probably the simplest example of the constructions just sketched is given by the torus $M = S^1 \times S^1$ with its flat metric. Employing the above coordinate and spin structure on $S^1$, one can take

$\mathcal{H} = L^2(S^1 \times S^1) \otimes \mathbb{C}^2, \quad D = z_1\frac{\partial}{\partial z_1} + iz_2\frac{\partial}{\partial z_2}$   [88]

Since the curvature vanishes, the index theorem for this situation implies index($D$) = 0. (Note that this is also plain from [88]: both kernel and cokernel of $D$ are spanned by the constant sections.) On the other hand, when one tensors the spinor bundle with a line bundle with connection $A$, the index formula reads

$\mathrm{index}(D_A) = \frac{1}{2\pi} \int_{S^1 \times S^1} F$   [89]

where $F$ is the curvature 2-form corresponding to $A$.

The Atiyah–Singer index theorem for Dirac operators has far-reaching applications. It can be used to derive other results in this area, such as the Gauss–Bonnet–Chern theorem, the Hirzebruch signature theorem, and (when $M$ is a Kähler manifold) Riemann–Roch type theorems. From this, one can obtain information on various questions, such as the existence of positive scalar curvature metrics or zeros of vector fields on $M$. Other applications include insights on topological invariants of manifolds obtained from "simple" manifolds (such as spheres and tori) by glueing or covering operations. This hinges on the additive properties of the index that are clear from its being given by an integral over the manifold. Conversely, the integrality of Fredholm indices can be used to deduce that certain rational cohomology classes are actually integral on manifolds that admit the structure that is required for the pertinent index theorem to apply, that certain manifolds do not admit such structures, since one knows that the relevant class is not integral, etc.

More on the Dirac Field

As mentioned earlier, the free-field formalism can be easily generalized to an arbitrary spacetime dimension $d$. For $d > 4$, however, no renormalizable interacting quantum field models involving the Dirac field are known. For the physical case $d = 4$ the standard model involves various Dirac fields interacting with quantized gauge fields and Klein–Gordon fields. Although its perturbation theory is renormalizable, its mathematical existence is to date wide open.

It is far beyond the scope of this article to elaborate on the analytical difficulties of relativistic quantum field theories, let alone those associated with the standard model. Even for $d = 2$ and 3, a nonperturbative construction of interacting quantum field models involving the Dirac field is an extremely difficult enterprise. Apart from some rigorous results on certain self-interacting Dirac field models, the only interacting model that is reasonably well understood from the constructive field theory viewpoint is the Yukawa model for $d = 2$ and 3. This describes the interaction between the Dirac field $\Psi$ and a Klein–Gordon field $\phi$, the interaction term being formally given by $g\,\phi\,(\Psi^*\gamma^0\Psi)$.

On the other hand, the interaction of the quantized Dirac field with external classical fields is much more easily understood and analytically controlled. As a bonus, within this context, one can make contact with various issues of physical and mathematical relevance. We now proceed to sketch the external-field framework and some of its applications.

Let us first consider the addition of an external field term $gV(t, \mathbf{x})$ to the free Dirac operator $\check{H}$ on

$\check{\mathcal{H}} = L^2(\mathbb{R}^n, d\mathbf{x}) \otimes \mathbb{C}^\nu \otimes \mathbb{C}^k$   [90]

We assume from now on that the coupling $g$ is real and that $V$ is a self-adjoint $k \times k$ matrix-valued function on spacetime $\mathbb{R}^{n+1}$ with matrix elements that are in $C_0^\infty(\mathbb{R}^{n+1})$. Then the (interaction picture) scattering operator $S$ exists. It is unitary and has off-diagonal Hilbert–Schmidt parts $S_{\pm\mp}$, so that a unitary Fock-space S-operator $\tilde{\Gamma}(S)$ implementing the Bogoliubov transformation generated by $S$ exists:

$\tilde{\Gamma}(S)\, \Phi(f)\, \tilde{\Gamma}(S)^* = \Phi(Sf), \quad \forall f \in \mathcal{H}$   [91]

The arbitrary phase in $\tilde{\Gamma}(S)$ can be fixed by requiring that the vacuum expectation value of $\tilde{\Gamma}(S)$ be positive. More precisely, this number is generically nonzero and satisfies

$|(\Omega, \tilde{\Gamma}(S)\Omega)| = \det(1 + T_S)^{-1/2}$   [92]

where $T_S$ is a positive trace class operator determined by $S$.

The vector $\tilde{\Gamma}(S)\Omega$ is a superposition of wave functions with an equal and arbitrary number of particles and antiparticles. More generally, the Fock-space S-operator $\tilde{\Gamma}(S)$ leaves the subspaces of $\mathcal{F}_a(\mathcal{H})$ with a fixed eigenvalue $q \in \mathbb{Z}$ of the charge operator $Q$ invariant, and can create and

annihilate an arbitrary number of particle–antiparticle pairs.

The unitary propagator $U(T_1, T_2)$ corresponding to $V(t, \mathbf{x})$ does not have Hilbert–Schmidt off-diagonal parts (unless the spacetime dimension is sufficiently small and special external fields are chosen). Even so, the diagonal parts are Fredholm with vanishing index, and the off-diagonal parts are compact. Omitting the ill-defined determinantal factor, these properties imply that one obtains a renormalized quadratic form $\tilde{\Gamma}_{\mathrm{ren}}(U(T_1, T_2))$ satisfying the implementing relation

$\tilde{\Gamma}_{\mathrm{ren}}(U(T_1, T_2))\, \Phi(f) = \Phi(U(T_1, T_2)f)\, \tilde{\Gamma}_{\mathrm{ren}}(U(T_1, T_2)), \quad \forall f \in \mathcal{H}$   [93]

in the quadratic form sense.

The above unitary operators on $\mathcal{H}$ yield Fredholm diagonal parts whose indices vanish. (They are norm continuous in $g$ and reduce to the identity for $g = 0$.) This is why their Fock-space implementers leave the charge sectors invariant. Indeed, for a unitary operator $U$ on $\mathcal{H}$ with compact off-diagonal parts the implementer maps the charge-$q$ sector to the charge-$(q + q(U))$ sector, where

$q(U) = \mathrm{index}(U_{--})$   [94]

Specializing to the case

$n = 2N - 1, \quad \nu = 2\kappa, \quad \kappa = 2^{N-1}$   [95]

a unitary ($k \times k$)-matrix multiplier $\check{U}$ on $\check{\mathcal{H}}$ does not have compact off-diagonal parts in general. But when it is of the form

$\check{U} = \begin{pmatrix} 1_\kappa \otimes u_+(x) & 0 \\ 0 & 1_\kappa \otimes u_-(x) \end{pmatrix}$   [96]

with respect to the chiral decomposition (the generalization of the $\gamma^5$-decomposition [72] to even spacetime dimension), then it suffices for compactness of the off-diagonal parts that the matrices $u_\pm(x) \in U(k)$ are continuous and converge to $1_k$ for $|x| \to \infty$.

Viewing $\mathbb{R}^{2N-1}$ as arising from $S^{2N-1}$ via stereographic projection, the latter unitaries can be viewed as continuous maps from $S^{2N-1}$ to $U(k)$, reducing to $1_k$ at the north pole. As such, they yield elements of the homotopy group $\pi_{2N-1}(U(k))$. By virtue of Bott's periodicity theorem, the latter group equals $\mathbb{Z}$ for $k \ge N$. Thus, the maps $u_\pm$ have a well-defined "winding number" $w(u_\pm) \in \mathbb{Z}$ for $k \ge N$. From the index formula

$\mathrm{index}(U_{--}) = w(u_+) - w(u_-)$   [97]

and [94] one now deduces that one can obtain implementers $\tilde{\Gamma}_{\mathrm{ren}}(U)$ effecting a nonzero charge change from unitary maps with nonzero winding number.

In particular, choosing $k = \kappa = 2^{N-1} \ge N$, there exist quite special "kink maps"

$u_{\varepsilon,a}(x) \in U(\kappa), \quad \varepsilon > 0, \quad a \in \mathbb{R}^{2N-1}$   [98]

with winding number 1 and such that the quadratic form implementers of the unitary multiplication operators

$\check{U}_{+,\varepsilon,a} = \begin{pmatrix} 1_\kappa \otimes u_{\varepsilon,a}(x) & 0 \\ 0 & 1_\kappa \otimes 1_\kappa \end{pmatrix}$
$\check{U}_{-,\varepsilon,a} = \begin{pmatrix} 1_\kappa \otimes 1_\kappa & 0 \\ 0 & 1_\kappa \otimes u_{\varepsilon,a}(x) \end{pmatrix}$   [99]

converge to (a linear combination of the chiral components of) the free Dirac field $\Psi(0, a)$ as the kink size parameter $\varepsilon$ goes to 0.

For the special case $N = 1$, one can take

$u_{\varepsilon,a}(x) = \frac{x - a - i\varepsilon}{x - a + i\varepsilon}$   [100]

and the off-diagonal parts of $U_{\pm,\varepsilon,a}$ are actually Hilbert–Schmidt. Thus, the implementers can be chosen to be unitary operators. But to get convergence to the Dirac field components $\Psi(0, a)$ as $\varepsilon \to 0$, the unitary implementers $\tilde{\Gamma}(U_{\pm,\varepsilon,a})$ should be renormalized by a multiplicative factor.

For the $N = 1$ case, the unitary multipliers [96] give rise to loop groups. Indeed, requiring

$\lim_{x \to \pm\infty} u_\delta(x) = 1_k, \quad \delta = +, -$   [101]

we are dealing with continuous maps $S^1 \to U(k)$. From the viewpoint of the Dirac theory, these groups are local gauge groups. The convergence to the Dirac field just sketched can be used to great advantage to clarify the structure of the corresponding Fock-space gauge groups. Their Lie algebras yield representations of Kac–Moody algebras, a topic which is considered shortly.

Before doing so, it should be pointed out that under some mild smoothness assumptions all of the above unitary matrix multipliers can also be viewed as S-operators associated with very special external fields. Indeed, the gauge-transformed Dirac operator

$\check{H}_U = \check{U}^* \check{H} \check{U}$   [102]

is of the form

$\check{H}_U = \check{H} + V(x)$   [103]

where $V(x)$ is a self-adjoint $k \times k$ matrix on $\mathbb{R}^{2N-1}$ (a "pure gauge" field). If one now defines a time-dependent external field by

$V(t, x) = \begin{cases} V(x), & t \ge 0 \\ 0, & t < 0 \end{cases}$   [104]

then $\check{U}$ equals the S-operator for $V(t, x)$. (Equivalently, $\check{U}$ is the $t \to \infty$ wave operator for the time-independent external field $V(x)$.)

To conclude this section we sketch some applications of the second-quantized Dirac formalism for the special case $N = 1$, $m = 0$, and positive chirality. Even though we could stick to the massless positive-chirality Dirac operator $-i\,d/dx$ on the line, it is simpler and more natural to start from its counterpart on the circle already considered in the last section, cf. [80]. (Under the Cayley transform, the positive- and negative-energy subspaces of $-i\,d/dx$ on $L^2(\mathbb{R})$ correspond to those of $z\,d/dz$ on $L^2(S^1)$, given by [81].) Letting $z = e^{i\theta}$, we then obtain

$\check{H} = -i\,d/d\theta, \quad \check{\mathcal{H}} = L^2([0, 2\pi], d\theta)$
$\mathcal{H} = l^2(\mathbb{Z}), \quad \mathcal{H}_+ = l^2(\mathbb{N}), \quad \mathcal{H}_- = l^2(\mathbb{Z}_-)$   [105]

and a corresponding Dirac field

$\Psi(t, \theta) = (2\pi)^{-1/2} \left( \sum_{n=0}^{\infty} a_n e^{-int + in\theta} + \sum_{n=1}^{\infty} b_n^* e^{int - in\theta} \right), \quad (t, \theta) \in \mathbb{R} \times [0, 2\pi]$   [106]

where

$a_l = c(e_l), \ l \ge 0, \quad b_{-l} = c(e_l), \ l < 0$   [107]

and $\{e_l\}_{l \in \mathbb{Z}}$ is the canonical basis of $l^2(\mathbb{Z})$.

Consider now the group GL($\mathcal{H}$) of bounded operators on $\mathcal{H}$ with bounded inverses. The transformation

$\Phi^*(f) \mapsto \Phi^*(Gf), \quad \Phi(f) \mapsto \Phi(G^{-1*}f), \quad f \in \mathcal{H}, \ G \in \mathrm{GL}(\mathcal{H})$   [108]

leaves the CAR [48] invariant. Provided that $G$ belongs to the subgroup

$G_2(\mathcal{H}) = \{G \in \mathrm{GL}(\mathcal{H}) \mid G_{\pm\mp} \text{ Hilbert–Schmidt}\}$   [109]

there exists an implementer $\tilde{\Gamma}(G)$ on $\mathcal{F}_a(\mathcal{H})$:

$\tilde{\Gamma}(G)\, \Phi^*(f) = \Phi^*(Gf)\, \tilde{\Gamma}(G),$
$\tilde{\Gamma}(G)\, \Phi(f) = \Phi(G^{-1*}f)\, \tilde{\Gamma}(G), \quad \forall f \in \mathcal{H}$   [110]

In particular, the multiplication operator

$\exp(h(x)), \quad h(x) = \sum_{k=1}^{\infty} x_k z^k, \quad z = e^{i\theta}$   [111]

belongs to $G_2(\mathcal{H})$ provided the sequence $x_k$ vanishes sufficiently fast as $k \to \infty$. Thus, one obtains an implementer $\tilde{\Gamma}(e^{h(x)})$, the so-called KP evolution operator. This designation is justified by the vacuum expectation value

$\tau(x) = (\Omega, \tilde{\Gamma}(e^{h(x)})\, \tilde{\Gamma}(G)\, \Omega), \quad G \in G_2(\mathcal{H})$   [112]

being a tau-function solving the hierarchy of KP evolution equations in Hirota bilinear form, as first shown by Sato and his Kyoto school. For example, the KP equation itself,

$u_{yy} = \partial_x\left( \tfrac{4}{3}u_t - 2uu_x - \tfrac{1}{3}u_{xxx} \right)$   [113]

has the bilinear form

$\left( \frac{\partial^4}{\partial y_1^4} - 4\frac{\partial^2}{\partial y_1 \partial y_3} + 3\frac{\partial^2}{\partial y_2^2} \right) \tau(x + y)\tau(x - y)\Big|_{y=0} = 0$   [114]

the relation being given by

$x_1 = x, \quad x_2 = y, \quad x_3 = t, \quad u = 2\partial_1^2 \ln\tau$   [115]

The class of solutions to [113] thus obtained includes not only the rational and soliton solutions (which correspond to choosing $\check{G}$ as multiplication by a rational function of $z = e^{i\theta}$ that does not vanish on $S^1$), but also the finite-gap solutions associated with compact Riemann surfaces. Moreover, for suitable subgroups of $G_2(\mathcal{H})$, one obtains tau-functions for related soliton hierarchies, including the Korteweg–de Vries, Boussinesq and Hirota–Satsuma hierarchies.

Even though the class of solutions associated with $G_2(\mathcal{H})$ via the Dirac formalism is large, it should be noted that from the perspective of the Cauchy problem for the pertinent evolution equations the solutions are nongeneric, inasmuch as the initial data are real-analytic functions.

Finally, we consider Lie algebra representations related to the above special starting point [105] for the second-quantized Dirac framework. Assume that $\exp(tA)$ is a one-parameter group of bounded operators on $\mathcal{H}$ with generator $A$ in the Lie algebra of $G_2(\mathcal{H})$,

$g_2(\mathcal{H}) = \{A \text{ bounded} \mid A_{\pm\mp} \text{ Hilbert–Schmidt}\}$   [116]

Then one can take

$\tilde{\Gamma}(\exp(tA)) = \exp(t\, d\tilde{\Gamma}(A))$   [117]

where $d\tilde{\Gamma}(A)$ is the Fock-space operator uniquely determined up to an additive constant by its commutation relation

$[d\tilde{\Gamma}(A), \Phi^*(f)] = \Phi^*(Af), \quad \forall f \in \mathcal{H}$   [118]

with the smeared Dirac field $\Phi(f)$. Fixing the constant by requiring
$$(\Omega, d\tilde{\Gamma}(A)\,\Omega) = 0 \tag{119}$$
the map $A \mapsto d\tilde{\Gamma}(A)$ satisfies the Lie algebra relations
$$[d\tilde{\Gamma}(A), d\tilde{\Gamma}(B)] = d\tilde{\Gamma}([A,B]) + C(A,B)\,\mathbf{1} \tag{120}$$
so that the term
$$C(A,B) = \mathrm{tr}\,(A_{-+}B_{+-} - B_{-+}A_{+-}) \tag{121}$$
encodes a central extension of the Lie algebra $g_2(\mathcal{H})$ [116].

The developments sketched in the previous paragraph are in fact independent of the specific form of the Hilbert space $\mathcal{H}$ and its $\mathcal{H}_+ \oplus \mathcal{H}_-$ decomposition. But the special feature of the choice [105] and its $S^1 \to \mathbb{R}$ analog is that the smeared Dirac current
$$\int_0^{2\pi} d\theta\, \alpha(\theta)\, {:}\Psi^{*}(0,\theta)\Psi(0,\theta){:}\,, \qquad \alpha \in C^{\infty}(S^1) \tag{122}$$
(where the double dots denote normal ordering, that is, the replacement of terms involving $b_k b_l^{*}$ by $-b_l^{*} b_k$) is of the form $d\tilde{\Gamma}(A_\alpha)$ with $A_\alpha \in g_2(\mathcal{H})$ determined by $\alpha$. (For spacetime dimension $d > 2$, this is no longer true, as the Hilbert–Schmidt condition is violated.) Moreover, [120] reduces to
$$[d\tilde{\Gamma}(A_\alpha), d\tilde{\Gamma}(A_\beta)] = C(A_\alpha, A_\beta)\,\mathbf{1} \tag{123}$$
with the central extension explicitly given by
$$C(A_\alpha, A_\beta) = \frac{i}{2\pi}\int_0^{2\pi} d\theta\, \alpha'(\theta)\beta(\theta) \tag{124}$$

We have just sketched the details of the (simplest version of the) Dirac current algebra: the term [124] is commonly known as the Schwinger term, so that the central extension featuring in [120]–[121] may be viewed as a generalization. The above setup can also be slightly generalized so as to obtain representations of the Virasoro algebra, which is a central extension of the Lie algebra of polynomial vector fields on $S^1$. The general framework has a quite similar version for the neutral Dirac field (Majorana field), described in terms of the self-dual CAR algebra. In the neutral setting, one can construct the Neveu–Schwarz and Ramond representations of the Virasoro algebra, which are crucial in string theory.

Tensoring $\check{\mathcal{H}}$ with an internal symmetry space $\mathbb{C}^k$ and starting from the Lie algebra of rational maps $S^1 \to sl(k,\mathbb{C})$, $z \mapsto M(z)$, with poles occurring solely at $z = 0$ and $z = \infty$ (regarded as multiplication operators on $L^2(S^1)^k$), the Fock-space counterparts obtained via the $d\tilde{\Gamma}$-operation yield representations of the Kac–Moody Lie algebra $A^{(1)}_{k-1}$. Specifically, on the charge-0 sector of $\mathcal{F}_a(\mathcal{H})$, one obtains the so-called basic representation, whereas the charge-$q$ sectors with $q = 1, \ldots, k-1$, yield the fundamental representations. Using the neutral version of Dirac’s second quantization, one can also obtain the basic and a fundamental representation of the Kac–Moody algebras $B^{(1)}_l$ (for $k = 2l+1$) and $D^{(1)}_l$ (for $k = 2l$).

See also: Bosons and Fermions in External Fields; Clifford Algebras and Their Representations; Current Algebra; Dirac Fields in Gravitation and Nonabelian Gauge Theory; Gerbes in Quantum Field Theory; Holonomic Quantum Fields; Index Theorems; Quantum Field Theory in Curved Spacetime; Quantum Chromodynamics; Random Walks in Random Environments; Relativistic Wave Equations Including Higher Spin Fields; Solitons and Kac–Moody Lie Algebras; Spinors and Spin Coefficients; Symmetry Classes in Random Matrix Theory.

Further Reading

Bjorken JD and Drell SD (1964) Relativistic Quantum Mechanics. New York: McGraw-Hill.
Carey AL and Ruijsenaars SNM (1987) On fermion gauge groups, current algebras and Kac–Moody algebras. Acta Applicandae Mathematicae 10: 1–86.
Date E, Jimbo M, Kashiwara M, and Miwa T (1983) Transformation groups for soliton equations. In: Jimbo M and Miwa T (eds.) Proceedings of RIMS Symposium, Nonlinear Integrable Systems – Classical Theory and Quantum Theory, pp. 39–119. Singapore: World Scientific.
Dirac PAM (1928) The quantum theory of the electron. Proceedings of the Royal Society of London. Series A 117: 610–624.
Dirac PAM (1928) The quantum theory of the electron, II. Proceedings of the Royal Society of London. Series A 118:
Glimm J and Jaffe A (1981) Quantum Physics. New York: Springer.
Itzykson C and Zuber JB (1980) Quantum Field Theory. New York: McGraw-Hill.
Pressley A and Segal G (1986) Loop Groups. Oxford: Clarendon.
Rose ME (1961) Relativistic Electron Theory. New York: Wiley.
Ruijsenaars SNM (1989) Index formulas for generalized Wiener–Hopf operators and boson–fermion correspondence in 2N dimensions. Communications in Mathematical Physics 124: 553–593.
Schweber SS (1961) An Introduction to Relativistic Quantum Field Theory. Evanston, IL: Row-Peterson.
Streater RF and Wightman AS (1964) PCT, Spin and Statistics, and All That. New York: Benjamin.
Thaller B (1992) The Dirac Equation. New York: Springer.
Dispersion Relations
J Bros, CEA/DSM/SPhT, CEA/Saclay, Gif-sur-Yvette, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Dispersion relations constitute a basic chapter of mathematical physics which covers various types of classical and quantum scattering phenomena and illustrates in a typical way the importance of general principles in theoretical physics, among which causality plays a major role. Each such phenomenon is described in terms of a scattering amplitude $F(\omega)$, which is a complex-valued function of a frequency variable $\omega$; in quantum physics, this variable becomes an energy variable called $E$ (or $s$ in particle physics), as follows from the fundamental de Broglie relation $E = \hbar\omega$. The real and imaginary parts of $F(\omega)$, which are called respectively the dispersive part $D(\omega)$ and the absorptive part $A(\omega)$ of $F$, have well-defined physical interpretations for all these phenomena; they represent quantities which are essentially accessible to measurements. The term dispersion relations refers to linear integral equations which relate the functions $D(\omega)$ and $A(\omega)$; such integral equations are always closely related to the Cauchy integral representation of an underlying holomorphic function $\hat{F}(\omega^{(c)})$ of the complexified frequency (or energy) variable $\omega^{(c)}$. $\hat{F}(\omega^{(c)})$ is called the holomorphic scattering function or in short the scattering function, and the scattering amplitude appears as the boundary value of the latter, taken at positive real values of $\omega$ from the upper half-plane of $\omega^{(c)}$, namely
$$F(\omega) = \lim_{\varepsilon \to 0,\ \varepsilon > 0} \hat{F}(\omega + i\varepsilon)$$

Historically, the first relations of that type to be obtained were the Kramers–Kronig relations (1926), which concern the propagation of light in a dielectric medium. In this basic example, $F(\omega)$ represents the complex refractive index of the medium $n'(\omega) = n(\omega) + i\kappa(\omega)$ for a monochromatic wave with frequency $\omega$. The dispersive part $D(\omega)$ is the real refractive index $n(\omega)$, which is the inverse ratio of the phase velocity of the wave in the medium to its velocity $c$ in the vacuum: the fact that it depends on the frequency $\omega$ corresponds precisely to the phenomenon of dispersion of light in a dielectric medium. A slab of the latter thus appears as a prototype of a macroscopic scatterer. The absorptive part $A(\omega)$ is the rate of exponential damping $\kappa(\omega)$ of the wave, caused by the absorption of energy in the medium.

It became apparent much later that for many scattering phenomena, dispersion relations can be derived from an appropriate set of general physical principles. This means that inside a certain axiomatic framework these relations are model independent with respect to the detailed structure of the scatterer or to the detailed type of particle interaction in the quantum case.

In a very short and oversimplified way, the following logical scheme holds. First, one can say that any mathematical formulation of a physical principle of causality results in support-type properties with respect to a time variable $t$ of an appropriate “causal structural function” $R(t)$ of the physical system considered: typically, such a causal function should vanish for negative values of $t$. It follows that its Fourier transform $\tilde{R}$ admits an analytic continuation $\tilde{R}^{(c)}$ in the upper half-plane of the corresponding conjugate variable, interpreted as a frequency (or an energy in the quantum case): this is the general reason for the occurrence of complex frequencies and of holomorphic functions of such variables. In fact, the relevant holomorphic scattering function $\hat{F}(\omega^{(c)})$ always appears as generated by $\tilde{R}^{(c)}$ via some (more or less sophisticated) procedure: in the simplest case, $\hat{F}$ coincides with $\tilde{R}^{(c)}$ itself, but this is not so in general. Finally, the derivation of suitable analyticity and boundedness properties of $\hat{F}(\omega^{(c)})$ in a domain whose typical form is the upper half-plane allows one to apply a Cauchy-type integral representation to this function; the dispersion relations directly follow from the latter.

The first part of this article aims to describe the most typical dispersion relations and their link with the Cauchy integral. It then presents two basic illustrations of these relations, which are: (1) in classical physics, the Kramers–Kronig relations mentioned above, and (2) in quantum physics, the dispersion relations for the forward scattering of equal-mass particles. The aim of the subsequent parts is to give as complete as possible accounts of the derivation of the relevant analyticity domains inside appropriate axiomatic frameworks which, respectively, contain the previous two examples. The simplest axiomatic framework is the one which governs all the phenomena of linear response: in the latter, the proof of analyticity and dispersion relations most easily follows the logical line sketched above. It will be presented together with its application to the derivation of
the Kramers–Kronig relations. The rest of the article is devoted to the derivation of the so-called crossing analyticity domains which are the relevant background of dispersion relations for the two-particle scattering (or collision) amplitudes in particle physics. This derivation relies on the general axiomatic framework of relativistic quantum field theory (QFT) (see Axiomatic Quantum Field Theory) and more specifically on the “analytic program in complex momentum space” of the latter. This framework, whose rigorous mathematical form was settled around 1960, represents the safest conceptual approach for describing the particle collision processes in a range of energies which covers by far all those that can be produced and will be produced in the accelerators for several decades. A simple account of the field-theoretical axiomatic framework and of the logical line of the derivation of dispersion relations will be presented here for the simplest kinematical situations. A broader presentation of the analytic program including an extended class of analyticity properties for the general structure functions and (two-particle and multiparticle) collision amplitudes in QFT can also be found in this encyclopedia (see Scattering in Relativistic Quantum Field Theory: The Analytic Program). For brevity, we shall not treat here the derivation of dispersion relations in the framework of nonrelativistic potential theory. Concerning the latter, the interested reader can refer to the book by Nussenzveig (1972). A collection of old basic papers on field-theoretical dispersion relations can be found in the review book edited by Klein (1961). For a recent and well-documented review of the multiplicity of versions and applications of dispersion relations and their experimental checking, the reader can consult the article by Vernov (1996).

Typical Dispersion Relations

The possibility of defining the scattering function $\hat{F}(\omega^{(c)})$ in the full upper half-plane and of exploiting the corresponding boundary value $F$ of $\hat{F}$ on the negative part as well as on the positive part of the real axis will depend on the framework of considered phenomena. For the moment, we do not consider the more general situations which also occur in particle physics and will be described later (“crossing domains” and “quasi-dispersion-relations”).

In the simplest cases, the real and imaginary parts $D$ and $A$ of $F$ are extended to negative values of the variable $\omega$ via additional symmetry relations resulting from appropriate “reality conditions.” As a typical and basic example, there occurs the symmetry relation $\hat{F}(\omega^{(c)}) = \overline{\hat{F}(-\bar{\omega}^{(c)})}$ (with $\omega^{(c)}$ and $-\bar{\omega}^{(c)}$ in the upper half-plane) and correspondingly $D(\omega) = D(-\omega)$, $A(\omega) = -A(-\omega)$ on the reals; we shall call (S) this symmetry relation.

The simplest case of dispersion relations is then obtained when $D$ and $A$ are linked by the reciprocal Hilbert transformations:
$$D(\omega) = \frac{1}{\pi}\, P\!\!\int_{-\infty}^{+\infty} A(\omega')\, \frac{1}{\omega' - \omega}\, d\omega' \tag{1a}$$
$$A(\omega) = -\frac{1}{\pi}\, P\!\!\int_{-\infty}^{+\infty} D(\omega')\, \frac{1}{\omega' - \omega}\, d\omega' \tag{1b}$$
where $P$ denotes Cauchy’s principal value, defined for any differentiable function $\varphi(x)$ (sufficiently regular at infinity) by
$$P\!\!\int_{-\infty}^{+\infty} \varphi(x)\, \frac{dx}{x} = \lim_{\varepsilon \to 0} \left[ \int_{-\infty}^{-\varepsilon} \varphi(x)\, \frac{dx}{x} + \int_{\varepsilon}^{+\infty} \varphi(x)\, \frac{dx}{x} \right] \tag{2}$$

As a matter of fact, the pair of equations [1a], [1b] is equivalent to the following relation for $F = D + iA$:
$$F(\omega) = \frac{1}{2\pi i} \int_{-\infty}^{+\infty} F(\omega') \lim_{\varepsilon \to 0} \frac{1}{\omega' - \omega - i\varepsilon}\, d\omega' \tag{3}$$
The latter is obtained as a limiting case of the Cauchy formula
$$\hat{F}(\omega^{(c)}) = \frac{1}{2\pi i} \int_{-\infty}^{+\infty} \frac{F(\omega')}{\omega' - \omega^{(c)}}\, d\omega' \tag{4}$$
expressing the fact that $\hat{F}$ is holomorphic and sufficiently decreasing at infinity in the upper half-plane $\mathcal{I}^{+}$ of the complex variable $\omega^{(c)}$ and that $F(\omega)$ is the boundary value of $\hat{F}(\omega^{(c)})$ on all the reals.

Finally, one checks that in view of the symmetry relation (S), the Hilbert integral relations between $D$ and $A$ given above reduce to the following dispersion relations:
$$D(\omega) = \frac{2}{\pi}\, P\!\!\int_{0}^{+\infty} A(\omega')\, \frac{\omega'}{\omega'^2 - \omega^2}\, d\omega' \tag{5a}$$
$$A(\omega) = -\frac{2\omega}{\pi}\, P\!\!\int_{0}^{+\infty} D(\omega')\, \frac{1}{\omega'^2 - \omega^2}\, d\omega' \tag{5b}$$

Two Basic Examples

1. The Kramers–Kronig relation in classical optics
It will be shown in the next part that the complex refractive index $n'(\omega) = n(\omega) + i\kappa(\omega)$ of a dielectric medium is the boundary value of a holomorphic
function $\hat{n}'(\omega^{(c)})$ in $\mathcal{I}^{+}$ satisfying the symmetry relation (S), and such that the integral $\int_{-\infty}^{+\infty} |\hat{n}'(\omega + i\eta) - 1|^2\, d\omega$ is uniformly bounded for all $\eta > 0$.

It follows that all the previous relations are satisfied by the function $\hat{F}(\omega^{(c)}) = \hat{n}'(\omega^{(c)}) - 1$. In particular, the real refractive index $n(\omega)$ and the “extinction coefficient” $\beta(\omega) = 2\omega\kappa(\omega)/c$ ($c$ being the velocity of light in the vacuum) are linked by the following Kramers–Kronig dispersion relation (corresponding to eqn [5a]):
$$n(\omega) - 1 = \frac{c}{\pi}\, P\!\!\int_{0}^{+\infty} \frac{\beta(\omega')}{\omega'^2 - \omega^2}\, d\omega' \tag{6}$$

2. Dispersion relation for the forward two-particle scattering amplitude in relativistic quantum physics
One considers the following collision phenomenon in particle physics. A particle $\Pi_2$ with mass $m$, called the target and sitting at rest in the laboratory, is struck by an identical particle $\Pi_1$ with relativistic energy $\omega$ larger than $m$ ($= mc^2$; in high-energy physics, one usually chooses units such that $c = 1$). After the collision, the particle $\Pi_1$ is scattered in all possible directions, $\theta$, of space, according to a certain quantum scattering amplitude $T_\theta(\omega)$, whose modulus is essentially the rate of probability for detecting $\Pi_1$ in the direction $\theta$. The forward scattering amplitude $T_0(\omega)$ corresponds to the detection of $\Pi_1$ in the forward longitudinal direction with respect to its incidence direction towards the target. Let us also assume that the particles carry no charge of any kind, so that each particle coincides with its “antiparticle.” In that case, $T_0(\omega)$ is shown to be the boundary value of a scattering function $\hat{T}_0(\omega^{(c)})$ enjoying the following properties:
1. it is a holomorphic function in $\mathcal{I}^{+}$ satisfying the symmetry relation (S);
2. its behavior at infinity in $\mathcal{I}^{+}$ is such that the integral
$$\int_{-\infty}^{+\infty} \left| \frac{\hat{T}_0(\omega + i\eta)}{(\omega + i\eta)^2} \right|^2 d\omega$$
is uniformly bounded for all $\eta > 0$; and
3. under more specific assumptions on the mass spectrum of the underlying theory, the “absorptive part” $A(\omega) = \mathrm{Im}\, T_0(\omega)$ vanishes for $|\omega| < m$.

Then by applying eqn [5a] to the function $D(\omega) := \mathrm{Re}[(T_0(\omega) - T_0(0))/\omega^2]$ (regular at $\omega = 0$), one obtains the following dispersion relation:
$$\mathrm{Re}\, T_0(\omega) = T_0(0) + \frac{2\omega^2}{\pi}\, P\!\!\int_{m}^{+\infty} A(\omega')\, \frac{1}{\omega'(\omega'^2 - \omega^2)}\, d\omega' \tag{7}$$

Remark In view of (3), the scattering function $\hat{T}_0(\omega^{(c)})$ admits an analytic continuation as an even function of $\omega^{(c)}$ (still called $\hat{T}_0$) in the cut-plane $\mathbb{C}^{(\mathrm{cut})}_m = \mathbb{C} \setminus \{\omega \in \mathbb{R};\ |\omega| \ge m\}$. In fact, in view of (S) and (3), the boundary value $T_0$ of $\hat{T}_0$ satisfies the relation $T_0(\omega) = T_0(-\omega)$ in the real interval $\delta_m = \{\omega \in \mathbb{R};\ -m < \omega < m\}$. Let us then introduce the function $\hat{T}_0^{-}(\omega^{(c)}) := \hat{T}_0(-\omega^{(c)})$ as a holomorphic function of $\omega^{(c)}$ in $\mathcal{I}^{-}$: one sees that the boundary values of $\hat{T}_0$ and $\hat{T}_0^{-}$ from the respective domains $\mathcal{I}^{+}$ and $\mathcal{I}^{-}$ coincide on $\delta_m$ and therefore admit a common analytic continuation throughout this real interval (in view of “Painlevé’s lemma” or “one-dimensional edge-of-the-wedge theorem”). One also notes that in view of (S) the extended function $\hat{T}_0$ satisfies the “reality condition” $\hat{T}_0(\bar{\omega}^{(c)}) = \overline{\hat{T}_0(\omega^{(c)})}$ in $\mathbb{C}^{(\mathrm{cut})}_m$. The fact that $\hat{T}_0$ is well defined as an even holomorphic function in the cut-plane $\mathbb{C}^{(\mathrm{cut})}_m$ has been established in the general framework of QFT, as explained in the last part of this article.

Phenomena of Linear Response: Causality and Dispersion Relations in the Classical Domain

The subsequent axiomatic framework and results (due to J S Toll (1952, 1956)) concern any physical system which exhibits the following type of phenomena: whenever it receives some excitation signal, called the input and represented by a real-valued function of time $f_{\mathrm{in}}(t)$ with compact support, the system emits a response signal, called the output and represented by a corresponding real-valued function $f_{\mathrm{out}}(t)$, in such a way that the following postulates are satisfied:
(P1) Linearity. To every linear combination of inputs $a_1 f_{\mathrm{in},1} + a_2 f_{\mathrm{in},2}$, there corresponds the output $a_1 f_{\mathrm{out},1} + a_2 f_{\mathrm{out},2}$.
(P2) Reproducibility or time-translation invariance. Let $\tau$ be a time-translation parameter taking arbitrary real values; to every “time-translated input” $f_{\mathrm{in}}^{(\tau)}(t) := f_{\mathrm{in}}(t - \tau)$, there corresponds the output $f_{\mathrm{out}}^{(\tau)}(t) := f_{\mathrm{out}}(t - \tau)$.
(P3) Causality. The effect cannot precede the cause, namely if $t_{\mathrm{in}}$ and $t_{\mathrm{out}}$ denote respectively the lower bounds of the supports of $f_{\mathrm{in}}(t)$ and $f_{\mathrm{out}}(t)$, then there always holds the inequality $t_{\mathrm{in}} \le t_{\mathrm{out}}$.
(P4) Continuity of the response. There exists some continuity inequality which expresses the fact that a certain norm of the output is majorized by a corresponding norm of the input. The case of an $L^2$-norm inequality of the
form $|f_{\mathrm{out}}| \le |f_{\mathrm{in}}|$ is particularly significant: when the norm $|f| = [\int |f(t)|^2\, dt]^{1/2}$ is interpretable as an energy (for the output as well as for the input), it acquires the meaning of a “dissipation” property of the system.

The postulate of linear dependence (P1) of $f_{\mathrm{out}}$ with respect to $f_{\mathrm{in}}$ is obviously satisfied if the response is described by any general kernel $K(t, t')$ such that the following formula makes sense:
$$f_{\mathrm{out}}(t) = \int_{-\infty}^{+\infty} K(t, t')\, f_{\mathrm{in}}(t')\, dt' \tag{8}$$
Conversely, the existence of a distribution kernel $K$ can be established rigorously under the continuity assumption postulated in (P4) by using the Schwartz nuclear theorem. In full generality (see our comment in the next paragraph), the kernel $K(t, t')$ appears to be a tempered distribution in the pair of variables $(t, t')$ and the previous integral formula holds in the sense of distributions, which means that both sides of eqn [8] must be considered as tempered distributions (in $t$) acting on any smooth test-function $g(t)$ in the Schwartz space $\mathcal{S}$. (Note, for instance, that the trivial linear map $f_{\mathrm{out}} = f_{\mathrm{in}}$ is represented by the kernel $K(t, t') = \delta(t - t')$.)

From the reproducibility postulate (P2), it follows that the distribution $K$ can be identified with a distribution of the single variable $\tau = t - t'$, namely $K(t, t') := R(t - t')$. Moreover, the real-valuedness condition imposed on the pairs $(f_{\mathrm{in}}, f_{\mathrm{out}})$ entails that $R$ is real. Finally, the causality postulate (P3) implies that the support of the distribution $R$ is contained in the positive real axis, so that one can write, in the sense of distributions,
$$f_{\mathrm{out}}(t) = \int_{-\infty}^{t} R(t - t')\, f_{\mathrm{in}}(t')\, dt' \tag{9}$$
The convolution kernel $R(t - t')$ is typically what one calls in physics a “retarded kernel.”

If we now introduce the frequency variable $\omega$, which is the conjugate of the time variable $t$, by the Fourier transformation
$$\tilde{f}(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{i\omega t}\, dt$$
we see that the convolution equation [9] is equivalent to the following one:
$$\tilde{f}_{\mathrm{out}}(\omega) = \tilde{R}(\omega)\, \tilde{f}_{\mathrm{in}}(\omega) \tag{10}$$
In the latter, the Fourier transform $\tilde{R}(\omega)$ of $R$ is a tempered distribution, which is the boundary value from the upper half-plane $\mathcal{I}^{+}$ of a holomorphic function $\tilde{R}^{(c)}(\omega^{(c)})$, called the Fourier–Laplace transform of $R$. $\tilde{R}^{(c)}$ is defined for all $\omega^{(c)} = \omega + i\eta$, with $\eta > 0$, by the following formula in which the exponential is a good test-function for the distribution $R$ (since it is exponentially decreasing for $t \to +\infty$):
$$\tilde{R}^{(c)}(\omega^{(c)}) = \int_{0}^{+\infty} R(t)\, e^{i\omega^{(c)} t}\, dt \tag{11}$$
More precisely, the tempered-distribution character of $R$ is strictly equivalent to the fact that $\tilde{R}^{(c)}$ is of moderate growth both at infinity and near the reals in $\mathcal{I}^{+}$, namely that it satisfies a majorization of the following form for some real positive numbers $p$ and $q$:
$$|\tilde{R}^{(c)}(\omega + i\eta)| \le C\, \frac{(1 + |\omega|^2 + \eta^2)^q}{\eta^p} \tag{12}$$
We thus conclude from eqn [10] that each phenomenon of linear response is represented very simply in the frequency variable by the multiplicative operator $\tilde{R}(\omega)$, whose analytic continuation $\tilde{R}^{(c)}(\omega^{(c)})$ is called the (causal) response function.

A Typical Illustration: The Damped Harmonic Oscillator

We consider the motion $x = x(t)$ of a damped harmonic oscillator of mass $m$ subjected to an external force $F(t)$. The force is the input ($f_{\mathrm{in}} = F$) and the resulting motion is the output, namely $f_{\mathrm{out}}(t) = x(t)$. All the previous general postulates (P1)–(P4) are then satisfied, but this particular model is, of course, governed by its dynamical equation
$$x''(t) + 2\gamma\, x'(t) + \omega_0^2\, x(t) = \frac{F(t)}{m} \tag{13}$$
where $\omega_0$ is the eigenfrequency of the oscillator and $\gamma$ is the damping constant ($\gamma > 0$). The relevant solution of this second-order differential equation with constant coefficients is readily obtained in terms of the Fourier transforms $\tilde{x}(\omega)$ of $x(t)$ and $\tilde{F}(\omega)$ of $F(t)$. One can in fact replace eqn [13] by the equivalent equation
$$(-\omega^2 - 2i\gamma\omega + \omega_0^2)\, \tilde{x}(\omega) = \frac{\tilde{F}(\omega)}{m} \tag{14}$$
whose solution is of the form [10], namely $\tilde{x}(\omega) = \tilde{R}(\omega)\, \tilde{F}(\omega)$, with
$$\tilde{R}(\omega) = \frac{-1}{m(\omega^2 + 2i\gamma\omega - \omega_0^2)} = \frac{-1}{m(\omega - \omega_1)(\omega - \omega_2)} \tag{15a}$$
$$\omega_{1,2} = \pm(\omega_0^2 - \gamma^2)^{1/2} - i\gamma \tag{15b}$$
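As a numerical illustration (not part of the original article), the following Python sketch evaluates the response function of eqn [15a], lists the poles of eqn [15b], and checks that its real and imaginary parts on the real axis satisfy the Hilbert relation [1a]. The parameter values `M`, `OMEGA0`, `GAMMA`, the function names, and the quadrature settings are assumptions chosen for illustration. The principal value is approximated by a midpoint rule on a grid symmetric about $\omega$, so that the singular contributions cancel in pairs.

```python
import numpy as np

# Illustrative parameter values (assumptions, not taken from the article).
M, OMEGA0, GAMMA = 1.0, 2.0, 0.3


def response(omega):
    """Response function of eqn [15a]; works for real or complex frequencies."""
    return -1.0 / (M * (omega**2 + 2j * GAMMA * omega - OMEGA0**2))


def poles():
    """Poles omega_1, omega_2 of eqn [15b]."""
    root = np.sqrt(complex(OMEGA0**2 - GAMMA**2))
    return root - 1j * GAMMA, -root - 1j * GAMMA


def hilbert_of_absorptive(omega, half_width=400.0, step=1e-3):
    """Right-hand side of eqn [1a] with A = Im R, by a midpoint rule.

    The grid is symmetric about omega and never hits omega itself, so the
    1/(omega' - omega) singularity cancels pairwise (principal value).
    """
    n = int(round(half_width / step))
    offsets = (np.arange(-n, n) + 0.5) * step      # omega' - omega
    absorptive = response(omega + offsets).imag    # A(omega')
    return np.sum(absorptive / offsets) * step / np.pi


if __name__ == "__main__":
    print("poles:", poles())
    for w in (0.5, 2.0, 3.5):
        # Dispersive part computed directly and via the Hilbert relation [1a].
        print(w, response(w).real, hilbert_of_absorptive(w))
```

For any $\gamma > 0$ both poles have negative imaginary part, and the two printed columns should agree up to quadrature error. The symmetric window is truncated at `half_width`, which is harmless here because the absorptive part decays like $1/\omega'^3$.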
It is clear that the rational function defined by eqns [15] admits an analytic continuation in the full complex plane of $\omega^{(c)}$ minus the pair of simple poles $(\omega_1, \omega_2)$ which lie in the lower half-plane. In particular, it is holomorphic (and decreasing at infinity) in $\mathcal{I}^{+}$, as expected from the previous general result. Moreover, this example suggests that for any particular phenomenon of linear response, the details of the dynamics are encoded in the singularities of the holomorphic scattering function $\tilde{R}^{(c)}(\omega^{(c)})$, which all lie in the lower half-plane. The validity of a dispersion relation only expresses the analyticity (and decrease at infinity) of that function in the upper half-plane, which is model independent.

Remark The same mathematical analysis applies to any electric oscillatory circuit, in which the capacitance, inductance, and resistance are involved in place of the parameters $m$, $\omega_0$ and $\gamma$: $f_{\mathrm{in}}$ and $f_{\mathrm{out}}$ correspond respectively to an external electric potential and to the current induced in the circuit; the response function is the admittance of the circuit.

Application to the Kramers–Kronig Relation

The background of the Kramers–Kronig relation [6], namely the analyticity and boundedness properties of the complex refractive index $\hat{n}'(\omega^{(c)})$ in $\mathcal{I}^{+}$, is provided by the previous axiomatic framework. However, it is not the quantity $\hat{n}'(\omega^{(c)})$ itself but appropriate functions of the latter which play the role of causal response functions; two phenomena can in fact be exhibited, which both contribute to proving the relevant properties of $\hat{n}'(\omega^{(c)})$.

1. Propagation of light in a dielectric slab with thickness $\delta$. One considers the wave front $f_{\mathrm{in}}(t)$ of an incoming wave normally incident upon the slab, with Fourier decomposition
$$f_{\mathrm{in}}(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \tilde{f}_{\mathrm{in}}(\omega)\, e^{-i\omega t}\, d\omega \tag{16}$$
After having traveled through the medium, it gives rise to an outgoing wave $f_{\mathrm{out}}(t)$ on the exit face of the slab, whose Fourier decomposition can be written as follows (provided the thickness $\delta$ of the slab is very small):
$$f_{\mathrm{out}}(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \tilde{f}_{\mathrm{in}}(\omega)\, e^{-i\omega(t - \delta\, n'(\omega)/c)}\, d\omega \tag{17}$$
In the latter, the real part of $n'(\omega)/c$ is the inverse of the light velocity in the medium, while its imaginary part takes into account the exponential damping of the wave. The output $f_{\mathrm{out}}$ thus appears as a causal linear response with respect to $f_{\mathrm{in}}$ (since $f_{\mathrm{out}}$ “starts after” $f_{\mathrm{in}}$). According to the general formula [10], the corresponding response function $\tilde{R}_\delta^{(c)}$ can be directly computed from eqns [16] and [17], which yields:
$$\tilde{R}_\delta^{(c)}(\omega^{(c)}) = e^{i\omega^{(c)} \delta\, \hat{n}'(\omega^{(c)})/c} \tag{18}$$
In view of the previous axiomatic analysis, $\tilde{R}_\delta^{(c)}$ has to be holomorphic and of moderate growth in $\mathcal{I}^{+}$, and since this holds for all sufficiently small $\delta$, it can be shown that the function $\hat{n}'(\omega^{(c)})$ itself is holomorphic and of moderate growth in $\mathcal{I}^{+}$ (no logarithmic singularity can be produced).

2. Polarization of the medium produced by an electric field. The dielectric polarization signal $P(t)$ produced at a point of a medium by an external electric field $E(t)$ is also a phenomenon of linear response which obeys the postulates (P1)–(P4); the corresponding formula [10] reads
$$\tilde{P}(\omega) = \tilde{\chi}'(\omega)\, \tilde{E}(\omega) \tag{19a}$$
where $\tilde{\chi}'$ is the complex dielectric susceptibility of the medium, which is related to $n'$ by Maxwell’s relation
$$\tilde{\chi}'(\omega) = \frac{n'^2(\omega) - 1}{4\pi}$$
One thus recovers the fact that $\tilde{\chi}'$ admits an analytic continuation in $\mathcal{I}^{+}$; one can also show by a physical argument that $\tilde{\chi}'(\omega)$, and thereby $n'(\omega) - 1$, tends to zero as a constant divided by $\omega^2$ when $\omega$ tends to infinity. This behavior at infinity extends to $\hat{n}'(\omega^{(c)}) - 1$ in $\mathcal{I}^{+}$ in view of the Phragmén–Lindelöf theorem, since $\hat{n}'$ is known (from (1)) to be of moderate growth. This justifies the analytic background of the Kramers–Kronig relation.

From Relativistic QFT to the Dispersion Relations of Particle Physics: Historical Considerations and General Survey

In the quantum domain, the derivation of dispersion relations for the two-particle scattering (or collision) amplitudes of particle physics has represented, since 1956 and throughout the 1960s, an important conceptual progress for the theoretical treatment of that branch of physics. These phenomena are described in a quantum-theoretical framework in which the basic kinematical variables are the energies and momenta of the particles involved. These variables play the role of the frequency of light in the optical scattering phenomena. Moreover, since large energies and momenta are involved, which allow the occurrence of particle creation
according to the conservation laws of special relativity, it is necessary to use a relativistic quantum-mechanical framework. Around 1950, the success of the quantum electrodynamics formalism for computing the electron–photon, electron–electron, and electron–positron scattering amplitudes revealed the importance of the concept of relativistic quantum field for the understanding of particle physics. However, the methods of perturbation theory, which had ensured the success of quantum electrodynamics in view of the small value of the coupling parameter of that theory (namely the electric charge of the electron), were at that time inapplicable to the strong nuclear interaction phenomena of high-energy physics. This failure motivated an important school of mathematical physicists to work out a model-independent axiomatic approach to relativistic QFT (e.g., Lehmann, Symanzik, Zimmermann (1954), Wightman (1956), and Bogoliubov (1960); see Axiomatic Quantum Field Theory). Their main purpose was to provide a conceptually satisfactory treatment of relativistic quantum collisions, at least for the case of massive particles. Among various postulates expressing the invariance of the theory under the Poincaré group in an appropriate quantum-mechanical Hilbert-space framework, the approach basically includes a certain formulation of the principle of causality, called microcausality or local commutativity. This axiomatic approach to QFT was followed by a conceptually important variant, namely the algebraic approach to QFT (Haag, Kastler, Araki 1960), whose most important developments are presented in the book by Haag (1992) (see Algebraic Approach to Quantum Field Theory). From the historical viewpoint, and in view of the analyticity properties that they also generate, one can say that all these (closely related) approaches parallel the axiomatic approach of linear response phenomena with, of course, a much higher degree of complexity. In particular, the characterization of scattering (or collision) amplitudes in terms of appropriate structure functions of the basic quantum fields of the theory is a nontrivial preliminary step which was taken at an early stage of the theory under the name of “asymptotic theory and reduction formulae” (Lehmann, Symanzik, Zimmermann 1954–57, Haag–Ruelle 1962, Hepp 1965). There again, in the field-theoretical axiomatic framework, causality generates analyticity through Fourier–Laplace transformation, but several complex variables now play the role which was played by the complex frequency in the axiomatics of linear response phenomena: they are obtained by complexifying the relativistic energy–momentum variables of the (Fourier transforms of the) quantum fields involved in the high-energy collision processes. In fact, the holomorphic functions which play the role of the causal response function $\tilde{R}(\omega)$ are the QFT structure functions or “Green functions in energy–momentum space.” The study of all possible analyticity properties of these functions resulting from the QFT axiomatic framework is called the analytic program (see Scattering in Relativistic Quantum Field Theory: The Analytic Program). The primary basic scope of the latter concerns the derivation of analyticity properties for the scattering functions of two-particle collision processes, which appears to be a genuine challenge for the following reason. The basic Einstein relation $E = mc^2$, which applies to all the incoming and outgoing particles of the collisions, operates as a geometrical constraint on the corresponding physical energy–momentum vectors: according to the Minkowskian geometry, the latter have to belong to mass hyperboloids, which define the so-called “mass shell” of the collision considered. It is on the corresponding complexified mass-shell manifold that the scattering functions are required to be defined as holomorphic functions. In the analytic program of QFT, the derivation of such analyticity domains and of corresponding dispersion relations in the complex plane of the squared total energy variable, $s$, of each given collision process then relies on techniques of complex geometry in several variables. As a matter of fact, the scattering amplitude is a function (or distribution) of two variables $F(s, t)$, where $t$ is a second important variable, called the squared momentum transfer, which plays the role of a fixed parameter for the derivation of dispersion relations in the variable $s$. The value $t = 0$ corresponds to the special kinematical situation which has been described above (for the case of equal-mass particles $\Pi_1$ and $\Pi_2$) under the name of forward scattering, and the variable $s$ is a simple affine function of the energy $\omega$ of the colliding particle $\Pi_1$ in the laboratory Lorentz frame (namely $s = 2m^2 + 2m\omega$ in the equal-mass case). It is for the corresponding scattering amplitude $T_0(\omega) := F_0(s) := F(s, t)|_{t=0}$ that a dispersion relation such as eqn [7] can be derived, although this derivation is far from being as simple as for the phenomena of linear response in classical physics: even in that simplest case, it already necessitates the use of analytic completion techniques in several complex variables. The first proof of this dispersion relation was performed by K Symanzik in 1956. In the case of general kinematical situations of measurements, the direction of observation of the scattered particle includes a nonzero angle with the incidence direction, which always corresponds to a negative value of $t$. The derivation of dispersion relations at fixed $t = t_0 < 0$, namely for the scattering amplitude
$F_{t_0}(s) = F(s,t)|_{t=t_0}$ requires further arguments of complex geometry, and it is submitted to subtle limitations of the form $t_1 < t_0 \le 0$, where $t_1$ depends on the mass spectrum of the particles involved in the theory. The first rigorous proof of dispersion relations at $t < 0$ was performed by N N Bogoliubov in 1960.

Three conceptually important features of the dispersion relations in particle physics deserve to be pointed out.

1. In comparison with the dispersion relations of classical optics, a feature which appears to be new is the so-called “crossing property,” which is characteristic of high-energy physics since it relies basically on the relativistic kinematics. According to that property, the boundary values of the analytic scattering function $\hat F_t(s)$ at positive and negative values of $s$ from the respective half-planes $\mathrm{Im}\, s > 0$ and $\mathrm{Im}\, s < 0$ are interpreted, respectively, as the scattering amplitudes of two physically different collision processes, which are deduced from each other by replacing the incident particle by the corresponding antiparticle; one also says that “these two collision processes are related by crossing.” A typical example is provided by the proton–proton and proton–antiproton collisions, whose scattering amplitudes are therefore mutually related by the property of analytic continuation. This type of relationship between the values of the scattering function at positive and negative values of $s$ generalizes in a nontrivial way the symmetry relation (S) satisfied by the forward scattering function $\hat T_0(\omega^{(c)})$ when each particle coincides with its antiparticle (see the second basic example above). No nontrivial crossing property holds in that special case and the fact that $\hat T_0$ is an even function of $\omega$ precisely expresses the identity of the two-collision processes related by crossing. In the general case, for $t = 0$ as well as for $t = t_0 < 0$ for any value of $t_0$, the analyticity domain that one obtains for the scattering function is not the full cut-plane of $s$: in its general form, a “crossing domain” may exclude some bounded region $B_{t_0}$ from the cut-plane, but it always contains an infinite region which is the exterior of a circle minus cuts along the two infinite parts of the real $s$-axis (Bros, Epstein, Glaser 1965): these cuts are along the physical regions of the two collision processes related by crossing. In that general case, the scattering function $\hat F_t(s)$ still satisfies what can be called a quasi-dispersion-relation, in which the right-hand side contains an additional Cauchy integral, taken along the boundary of $B_t$.

2. A second important feature concerns the behavior at large values of $s$ of the scattering functions $\hat F_t(s)$ in their analyticity domain. As indicated in the presentation of the second basic example, a “precise-increase” property was expected to be satisfied by the forward scattering amplitude $T_0(\omega)$ for $\omega$ (or $s$) tending to infinity. This “precise-increase” property implied the necessity of writing the corresponding dispersion relation [7] for the function $(T_0(\omega) - T_0(0))/\omega^2$: this is what one calls a “dispersion relation with a subtraction.” As a matter of fact, the existence of such restrictive bounds on the total cross sections at high energies had been discovered in 1961 by M Froissart: his derivation relied basically on the use of the unitarity of the scattering operator (expressing the quantum principle of conservation of probabilities), but also on a strong analyticity postulate for the scattering function not implied by the general field-theoretical approach (namely the Mandelstam domain of “double dispersion relations”). In the general framework of QFT, Froissart-type bounds appeared to be closely linked to a further nontrivial extension of the range of “admissible” values of $t$ for which $\hat F_t(s)$ can be analytically continued in a cut-plane or crossing domain. In fact, the extension of this range to positive (i.e., “unphysical”) and even complex values of $t$, and as a second step the proof of Froissart-type bounds in $s(\log s)^2$ for $F_t(s)$ at all these admissible values of $t$, were performed in 1966 by A Martin. They rely on a subtle conspiracy of the analyticity properties deduced from the QFT axiomatic framework and of positivity and unitarity properties expressing the basic Hilbertian structure of the quantum collision theory. The consequence of these bounds on the exact form of the dispersion relations is that, as in formula [7] of the case $t = 0$, it is justified to write a (the so-called “subtracted”) dispersion relation for $(F_t(s) - F_t(0) - sF_t'(0))/s^2$: for the general case when the crossing property replaces the symmetry (S), such a dispersion relation involves two subtractions (since $F_t'(0) \neq 0$). Detailed information concerning the interplay of analyticity and unitarity on the mass shell and the derivation of refined forms of dispersion relations and various boundedness properties for the scattering functions are given in the book by Martin (1969).

3. Constraints imposed by dispersion relations and experimental checks. The conceptual importance of dispersion relations incorporating the above features (1) and (2) is displayed by such spectacular application as the relationship between the high-energy behaviors of proton–proton and proton–antiproton cross sections. Even though the closest forms of relationship between these cross sections (e.g., the existence of equal high-energy limits) necessitate for their proof some extra assumption concerning, for instance, the behavior

of the ratio between the dispersive and absorptive parts of the forward scattering amplitude, one can speak of an actual model-independent implication of general QFT that imposes nontrivial constraints on phenomena. Otherwise stated, checking experimentally the previous type of relationship up to the limits of high energies imposed by the present technology of accelerators constitutes an indirect, but important test of the validity of the general principles of QFT.

As a matter of fact, it has also appeared frequently in the literature of high-energy physics during the last 40 years that the Froissart bound by itself was considered as a key criterion to be satisfied by any sensible phenomenological model in particle physics. As already stated above, the Froissart bound is one of the deepest consequences of the analytic program of general QFT, since its derivation also incorporates in the most subtle way the quantum principle of probability conservation. Would it be only for the previous basic results, the derivation of dispersion relations (and, more generally, the results of the analytic program) in QFT appear as an important conceptual bridge between a fundamental theoretical framework of relativistic quantum physics and the phenomenology of high-energy particle physics.

Basic Concepts and Main Steps in the QFT Derivation of Dispersion Relations

The rest of this article outlines the derivation of the analytic background of dispersion relations for the forward scattering amplitudes in the framework of axiomatic QFT. After a brief introduction on relativistic scattering processes and the problematics of causality in particle physics, it gives an account of the Wightman axioms and the simplest reduction formula which relates the forward scattering amplitude to a retarded product of the field operators. Then it describes how the latter can be used for justifying a certain type of analyticity domain for the forward scattering functions, namely a crossing domain or in the best cases a cut-plane in the squared energy variable $s$. This is the basic result that allows one to write dispersion relations (or quasi-dispersion-relations) at $t = 0$; the exact form of the latter, including at most two subtractions, relies on the use of Hilbertian positivity and of the unitarity of the scattering operator.

Relativistic Quantum Scattering as a Phenomenon of Linear Response

Collisions of quantum particles may be seen as phenomena of linear response, but in a way which differs greatly from what has been previously described.

Particles in Minkowskian geometry

Each state of a relativistic classical particle with mass $m$ is characterized by its energy–momentum vector or 4-momentum $p = (p_0, \mathbf{p})$ satisfying the mass-shell condition $p^2 := p_0^2 - \mathbf{p}^2 = m^2$ (in units such that $c = 1$). In view of the condition of positivity of the energy $p_0 > 0$ the “physical mass shell” thus coincides with the positive sheet $H_m^+$ of the mass hyperboloid $H_m$ with equation $p^2 = m^2$.

The set of all energy–momentum configurations characterizing the collisions of two relativistic classical particles with initial (resp. final) 4-momenta $p_1, p_2$ (resp. $p_1', p_2'$) is the mass-shell manifold $M$ defined by the conditions

$$p_i^2 = m^2,\quad p_{i,0} > 0;\qquad p_i'^2 = m^2,\quad p'_{i,0} > 0;\qquad i = 1, 2$$

$$p_1 + p_2 = p_1' + p_2'$$

where the latter equation expresses the relativistic law of total energy–momentum conservation. $M$ is an eight-dimensional manifold, invariant under the (six-dimensional) Lorentz group: the orbits of this group that constitute a foliation of $M$ are parametrized by two variables, namely the squared total energy $s = (p_1 + p_2)^2 = (p_1' + p_2')^2$ and the squared momentum transfer $t = (p_1 - p_1')^2 = (p_2 - p_2')^2$ (or $u = (p_1 - p_2')^2 = 4m^2 - s - t$). In these variables, called the Mandelstam variables, the “physical region” of the collision is represented by the set of pairs $(s, t)$ (or triplets $(s, t, u)$ with $s + t + u = 4m^2$) such that $t \le 0$, $u \le 0$, and therefore $s \ge 4m^2$.

Correspondingly, each state of a relativistic quantum particle with mass $m$ is characterized by a wave packet $\hat f(\mathbf{p})$ on $H_m^+$, which is an element of unit norm of $L^2(H_m^+; \mu_m(\mathbf{p}))$, with $\mu_m(\mathbf{p}) = d\mathbf{p}/(\mathbf{p}^2 + m^2)^{1/2}$. In Minkowskian spacetime with coordinates $x = (x_0, \mathbf{x})$, any such state is represented by a wave function $f(x)$ whose Fourier transform is the tempered distribution (with support in $H_m^+$) $\hat f(\mathbf{p})\,\delta(p^2 - m^2)$: $f(x)$ is a positive-energy solution of the Klein–Gordon equation $(\partial^2/\partial x_0^2 - \Delta_{\mathbf{x}} + m^2) f(x) = 0$. A free two-particle state is a symmetric wave packet $\hat f(p_1, p_2)$ on $H_m^+ \times H_m^+$ in the Hilbert space $L^2(H_m^+ \times H_m^+; \mu_m \otimes \mu_m)$.

Scattering kernels as response kernels: distribution character

While the input to be considered is a free wave packet $\hat f_{in}(p_1, p_2)$ on $H_m^+ \times H_m^+$, representing the preparation of an initial two-particle state, the output corresponds to the detection of a final two-particle state also characterized by a wave packet $\hat g_{out}(p_1', p_2')$ on $H_m^+ \times H_m^+$. In quantum mechanics, linearity is linked to the “superposition principle” of states,

which allows one to state that collisions are described by a certain bilinear form $(\hat f_{in}, \hat g_{out}) \to S(\hat f_{in}, \hat g_{out})$, called the “scattering matrix.” This bilinear form is bicontinuous with respect to the Hilbertian norms of the wave packets, and it then results from the Schwartz nuclear theorem that it is represented by a distribution kernel $S(p_1, p_2; p_1', p_2')$, namely a tempered distribution with support contained in $M$, in such a way that (formally)

$$S(\hat f_{in}, \hat g_{out}) = \int \hat f_{in}(p_1, p_2)\,\hat g_{out}(p_1', p_2')\,S(p_1, p_2; p_1', p_2')\,\mu_m(p_1)\,\mu_m(p_2)\,\mu_m(p_1')\,\mu_m(p_2') \tag{20}$$

If there were no interaction, $S(\hat f_{in}, \hat g_{out})$ would reduce to the Hilbertian scalar product $\langle \hat g_{out}, \hat f_{in} \rangle$ in $L^2(H_m^+ \times H_m^+; \mu_m \otimes \mu_m)$ and the corresponding kernel $S$ would be the identity kernel

$$I(p_1, p_2; p_1', p_2') = \frac{1}{2}\left[\delta(p_1 - p_1')\,\delta(p_2 - p_2') + \delta(p_1 - p_2')\,\delta(p_2 - p_1')\right]$$

In the general case, the interaction is therefore described by the scattering kernel $T(p_1, p_2; p_1', p_2') := S(p_1, p_2; p_1', p_2') - I(p_1, p_2; p_1', p_2')$. The action of $T$ as a bilinear form (defined in the same way as the action of $S$ in eqn [20]) may be seen as the quantum analog of the classical response formula [10]. Note, however, the difference in the mathematical treatment of the output: instead of being considered as the direct response ($\tilde f_{out}$) to the input, it is now explored by Hilbertian duality in terms of detection wave packets $\hat g_{out}$, in conformity with the principles of quantum theory. Finally, in view of the invariance of the collision process under the Lorentz group, the scattering kernel $T$ is constant along the orbits of this group in $M$ and it then defines a distribution $F(s, t) := T(p_1, p_2; p_1', p_2')$ with support in the physical region: this is what is called the scattering amplitude.

What becomes of causality?

One can show that the positive-energy solutions of the Klein–Gordon equation cannot vanish in any open set of Minkowski spacetime; they necessarily spread out in the whole spacetime. This makes it impossible to formulate a causality condition comparable to eqn [9] in terms of the spacetime wave functions $f_{in}$ and $g_{out}$ corresponding to the input and output wave packets $\hat f_{in}, \hat g_{out}$. In this connection, it is, however, appropriate to note that (after various attempts at “weak causality conditions”) a certain condition called “macrocausality” (Iagolnitzer and Stapp 1969; see the book by Iagolnitzer (1992)) has been shown to be equivalent to some local properties of analyticity of the scattering kernel $T$; but it is not our purpose to develop that point here for two reasons: (1) the interpretation of that condition is rather involved, because it integrates a very weak form of causality together with the spatial short-range character of the strong nuclear interactions between the elementary particles; (2) the domains of analyticity obtained are far too small with respect to those necessary for writing dispersion relations. The reason for this failure is that the scattering kernel only represents an asymptotic quantum observable, in the sense that it is intended to describe observations far apart from the extremely small spacetime region where the particles strongly interact, namely in regions where this interaction is asymptotically small. Although well adapted to what is actually observed in the detection experiments, the concept of scattering kernel is not sufficient for describing the fundamental interactions of physics: it must be enriched by other theoretical concepts which might explicitly take into account the microscopic interactions in spacetime. This motivates the introduction of quantum fields as basic quantities in particle physics.

Relativistic Quantum Fields: Microcausality and the Retarded and Advanced Kernels; Analyticity in Complex Energy–Momentum Space

By an idealization of the concept of quantum electromagnetic field and a generalization to all types of microscopic interactions of matter, one considers that all the phenomena involving such interactions can be described by fields $\Phi_i(x)$, whose amplitude can, in principle, be measured in arbitrarily small regions of Minkowski spacetime. In the quantum framework, one is thus led to the notion of local observable $O$ (emphasized as a basic concept in the axiomatic approach of Araki, Haag, and Kastler). In the Wightman field-theoretical framework, a local observable corresponds to the measuring process of a weighted average of a field $\Phi_i(x)$ of the form $O = \Phi_i[f] = \int \Phi_i(x) f(x)\,dx$. In the latter, $f(x)$ denotes a smooth real-valued test-function with (arbitrary) compact support $K$ in spacetime; the observable $O$ is then said to be localized in $K$. Each observable $O = \Phi_i(f)$ has to be a self-adjoint (unbounded) operator acting in (a dense domain of) the Hilbert space $H$ generated by all the states of the system of fundamental fields $\{\Phi_i\}$; therefore, the correct mathematical concept of relativistic quantum field $\Phi(x)$ is an “operator-valued tempered distribution on Minkowski spacetime.” Here the additional “temperateness assumption” is a convenient technical assumption which in particular allows the passage to the energy–momentum space by making use of the Fourier transformation.
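
The kinematical statements used in this section (the relation $s + t + u = 4m^2$, the physical region $t \le 0$, $u \le 0$, $s \ge 4m^2$, and the fact that $s$ and $t$ are constant along Lorentz orbits, so that the scattering kernel can be written as a function $F(s, t)$) can be checked numerically. The following is only an illustrative sketch: the helper names (`minkowski_sq`, `boost_z`, `elastic_cm_configuration`, `mandelstam`) and the numerical values are assumptions made for the example and do not appear in the article.

```python
import math


def minkowski_sq(p):
    """Minkowskian square p^2 = p0^2 - |p|^2 (units with c = 1)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def add(p, q):
    return tuple(a + b for a, b in zip(p, q))


def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def boost_z(p, beta):
    """Lorentz boost along the z-axis with velocity beta (|beta| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    e, px, py, pz = p
    return (gamma * (e + beta * pz), px, py, gamma * (pz + beta * e))


def elastic_cm_configuration(m, k, angle):
    """Equal-mass elastic collision in the centre-of-mass frame.

    k is the modulus of the 3-momentum and angle the scattering angle in
    radians. Returns (p1, p2, p1', p2'), all on the positive mass shell
    and satisfying p1 + p2 = p1' + p2'.
    """
    e = math.sqrt(k ** 2 + m ** 2)
    p1 = (e, 0.0, 0.0, k)
    p2 = (e, 0.0, 0.0, -k)
    p1_prime = (e, k * math.sin(angle), 0.0, k * math.cos(angle))
    p2_prime = (e, -k * math.sin(angle), 0.0, -k * math.cos(angle))
    return p1, p2, p1_prime, p2_prime


def mandelstam(p1, p2, p1_prime, p2_prime):
    """Mandelstam variables s, t, u of a two-body collision."""
    s = minkowski_sq(add(p1, p2))
    t = minkowski_sq(sub(p1, p1_prime))
    u = minkowski_sq(sub(p1, p2_prime))
    return s, t, u


# Hypothetical sample values, chosen only for illustration.
m, k, angle = 1.0, 2.0, 0.7
config = elastic_cm_configuration(m, k, angle)
s, t, u = mandelstam(*config)

# The same configuration seen from a boosted frame (same Lorentz orbit).
boosted = tuple(boost_z(p, 0.6) for p in config)
s_b, t_b, u_b = mandelstam(*boosted)

print("s, t, u        =", s, t, u)
print("s + t + u      =", s + t + u, "(expected 4 m^2 =", 4 * m ** 2, ")")
print("boosted s, t, u =", s_b, t_b, u_b)
```

The boosted configuration yields the same values of $s$, $t$, $u$ up to rounding, which is the numerical counterpart of the statement that the kernel is constant along the orbits of the Lorentz group in $M$.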

In this QFT framework, it is natural to express a certain form of causality by assuming that two observables $\Phi(f)$ and $\Phi(f')$ commute if the supports of $f$ and $f'$ are spacelike-separated regions in spacetime, which means that no signal with velocity smaller than or equal to the velocity of light can propagate from either one of these regions to the other. This expresses the idea that these two observables should be independent, that is, “compatible as quantum observables.” This postulate is equivalent to the following condition, called microcausality or local commutativity, and understood in the sense of operator-valued tempered distributions:

$$[\Phi(x_1), \Phi(x_1')] = 0 \quad \text{for } (x_1 - x_1')^2 < 0 \tag{21}$$

where $(x_1 - x_1')^2$ is the squared Minkowskian pseudonorm of $x = x_1 - x_1' = (x_0, \mathbf{x})$, namely $x^2 = x_0^2 - \mathbf{x}^2$. It follows that for every admissible pair of states $\Psi, \Psi'$ in $H$, the tempered distribution

$$C_{\Psi,\Psi'}(x_1, x_1') := \langle \Psi, [\Phi(x_1), \Phi(x_1')]\,\Psi' \rangle \tag{22}$$

has its support contained in the union of the sets $\bar V^+$: $x_1 - x_1' \in \bar V^+$ and $\bar V^-$: $x_1 - x_1' \in \bar V^-$, where $\bar V^+$ and $\bar V^-$ are, respectively, the closures of the forward and backward cones $V^+ := \{x = (x_0, \mathbf{x});\ x_0 > |\mathbf{x}|\}$, $V^- := -V^+$ in Minkowski spacetime. It is always possible to decompose the previous distribution as

$$C_{\Psi,\Psi'}(x_1, x_1') = R_{\Psi,\Psi'}(x_1, x_1') - A_{\Psi,\Psi'}(x_1, x_1') \tag{23}$$

in such a way that the supports of the distributions $R_{\Psi,\Psi'}(x_1, x_1')$ and $A_{\Psi,\Psi'}(x_1, x_1')$ belong, respectively, to $\bar V^+$ and $\bar V^-$. $R_{\Psi,\Psi'}$ and $A_{\Psi,\Psi'}$ are called, respectively, retarded and advanced kernels and they are often formally expressed (for convenience) as follows:

$$R_{\Psi,\Psi'}(x_1, x_1') = \theta(x_{1,0} - x'_{1,0})\,C_{\Psi,\Psi'}(x_1, x_1')$$

$$A_{\Psi,\Psi'}(x_1, x_1') = -\theta\bigl(-(x_{1,0} - x'_{1,0})\bigr)\,C_{\Psi,\Psi'}(x_1, x_1')$$

in terms of the Heaviside step function $\theta(t)$ of the time-coordinate difference $t = x_{1,0} - x'_{1,0}$. For every pair $(\Psi, \Psi')$, $R_{\Psi,\Psi'}(x_1, x_1')$ appears as a relativistic generalization of the retarded kernel $R(t - t')$ of eqn [10]: its support property in spacetime, similar to the support property of $R$ in time, expresses a relativistic form of causality, or “Einstein causality.”

There exists a several-variable extension of the theory of Fourier–Laplace transforms of tempered distributions which is based on a formula similar to eqn [11]. We introduce the vector variables $X = (x_1 + x_1')/2$, $x = x_1 - x_1'$ and a complex 4-momentum $k = p + iq = (k_0, \mathbf{k})$ as the conjugate vector variable of $x$ with respect to the Minkowskian scalar product $k \cdot x = k_0 x_0 - \mathbf{k} \cdot \mathbf{x}$, and we define

$$\tilde R^{(c)}_{\Psi,\Psi'}(k, X) = \int R_{\Psi,\Psi'}\!\left(X + \frac{x}{2},\, X - \frac{x}{2}\right) e^{ik \cdot x}\,dx \tag{24}$$

Since $q \cdot x > 0$ for all pairs $(q, x)$ such that $q \in V^+$, $x \in \bar V^+$, it follows that $\tilde R^{(c)}_{\Psi,\Psi'}(k, X)$ is holomorphic with respect to $k$ in the domain $T^+$ containing all $k = p + iq$ such that $q$ belongs to $V^+$. Moreover, in the limit $q \to 0$ this holomorphic function tends (in the sense of distributions) to the Fourier transform $\tilde R_{\Psi,\Psi'}(p, X)$ of $R_{\Psi,\Psi'}(X + x/2, X - x/2)$ with respect to $x$. The domain $T^+$, which is called the “forward tube,” is the analog of the domain $I^+$ of the $\omega$-plane; bounds of moderate type comparable to those of [12] apply to the holomorphic function $\tilde R^{(c)}_{\Psi,\Psi'}$ in $T^+$. Similarly, the advanced kernel $A_{\Psi,\Psi'}(X + x/2, X - x/2)$ admits a Fourier–Laplace transform $\tilde A^{(c)}_{\Psi,\Psi'}(k, X)$, which is holomorphic and of moderate growth in the “backward tube” $T^-$ containing all $k = p + iq$ such that $q$ belongs to $V^-$. In view of [23], the Fourier transform $\tilde C_{\Psi,\Psi'}(p, X)$ of $C_{\Psi,\Psi'}(X + x/2, X - x/2)$ then appears as the difference between the boundary values of $\tilde R^{(c)}_{\Psi_1,\Psi_2}$ and $\tilde A^{(c)}_{\Psi_1,\Psi_2}$ on the reals (from the respective domains $T^+$ and $T^-$).

The Field-Theoretical Axiomatic Framework and the Passage from the Structure Functions of QFT to the Scattering Kernels (Case of Forward Scattering)

The postulates (Wightman axioms)

Apart from the causality postulate, which we have already presented above in view of its distinguished role for generating analyticity properties in complex energy–momentum space, the field-theoretical axiomatic approach to collision theory is based on the following postulates (for all the fundamental developments of axiomatic field theory, the interested reader may consult the books by Streater and Wightman (1980) and by Jost (1965); see Axiomatic Quantum Field Theory).

1. There exists a unitary representation $g \to U(g)$ of the Poincaré group $G$ in the Hilbert space of states $H$; in this representation, the abelian subgroup of translations of space and time has a Lie algebra whose generators are interpreted as the four self-adjoint (commuting) operators $P_\mu$ of total energy–momentum of the system.
2. The quantum field operators $\Phi(x)$ transform covariantly under that representation; in the simplest case of scalar fields (considered here), $\Phi(gx) = U(g)\,\Phi(x)\,U(g^{-1})$.
3. There exists a unique state $\Omega$, called the vacuum, such that the action of all polynomials of field operators on $\Omega$ generates a dense subset of $H$;

moreover,  is assumed to be invariant under the particles considered earlier). Equations [22]–[24] are
representation U of G, and thereby such that then applied to the case when  and 0 coincide
P  = 0. with a one-particle state of 2 at rest, namely with
4. Spectral condition or positivity of energy in all 4-momentum p2 = p02 along the time axis: p2 =
physical states. The joint spectrum  of the ((p2 )0 , 0), (p2 )0 = m2 . This describes in a simple way
operators P is contained in the closed forward the case of forward scattering, since in view of the
cone V þ of energy–momentum space. In order to energy–momentum conservation law p1 þ p2 = p01 þ
perform the collision theory of massive particles, p02 , the choice p2 = p02 also implies that p1 = p01 .
one needs a more detailed ‘‘mass-gap assump- (The possibility of restricting the distribution
tion’’:  is the union of the origin O, of one or T(p1 , p2 ; p01 , p02 ) to such fixed values of the energy–
several positive sheets of hyperboloid Hm i
and momenta is shown to be mathematically well
of a region VM defined by the conditions p2  justified). The advantage of this simple case is that
M2 , p0 > 0, with M larger than all the mi . the corresponding kernels [22], [23] of (x1 , x01 ) are
invariant under spacetime translations and therefore
The Hilbert space H is correspondingly decom-
depend only on x (and not on X). We can thus
posed as the direct sum of the vacuum subspace (or
rewrite eqns [22], [23] with simplified notations as
zero-particle subspace) generated by , of subspaces
of stable one-particle states with masses mi iso-
þ h x
morphic to L2 (Hm , mi ), and of a remaining sub- :
i Cp2 ðxÞ ¼ <p2 ;  ;  p2 >
space H . As a result of the construction of 2 2
‘‘asymptotic states,’’ H0 can be shown to contain ¼ Rp2 ðxÞ  Ap2 ðxÞ ½25
two subspaces H0in and H0out , generated, respectively,
by N-particle incoming states (with N arbitrary and which can be shown to give correspondingly by
 2) and by N-particle outgoing states. The collision Fourier transformation
operator S is then defined as the partially isometric ~ p ðpÞ ¼ <p2 ; ðpÞ
C ~ ðpÞp
~ ~ ~
2 >  <p2 ; ðpÞðpÞp2 >
operator from H0out onto H0in , which maps a 2

reference basis of outgoing states onto the corre- ¼R ~ p ðpÞ

~ p ðpÞ  A ½26
2 2

sponding basis of incoming states.

If the particle 1 appears in the asymptotic states of
the field , the scattering kernel T(p1 , p2 ; p01 , p02 ) is
An independent postulate: asymptotic completeness
then given in the forward configurations p1 = p01 2
(see Scattering, Asymptotic Completeness and þ
Hm , p2 = p02 2 Hm
, by the following reduction for-
Bound States and Scattering in Relativistic 1 2
mula in which s = (p1 þ p2 )2 :
Quantum Field Theory: Fundamental Concepts
and Tools) The theory is said to satisfy the F0 ðsÞ ¼ Tðp1 ; p2 ; p1 ; p2 Þ
property of asymptotic completeness if all the states   
¼ p2  m2 R ~ p ðp1 Þ þ ½27
of H can be interpreted as superpositions of various 1 1 2 jHm 1

N-particle states (either in the incoming or in the

outgoing state basis), namely if one has
H0 = H0in = H0out . This property is not implied by the Analyticity Domains in Energy–Momentum Space:
previous postulates on quantum fields, but its From the ‘‘Primitive Off-Shell Domains’’ of QFT to
physical interpretation and its role in the analytic the Crossing Manifolds on the Mass Shell
program are of primary importance (see Scattering For simplicity, we shall restrict ourselves to the
in Relativistic Quantum Field Theory: The Analytic consideration of forward scattering amplitudes,
Program). Let us simply note here that asymptotic namely to the derivation of crossing analyticity
completeness implies as a by-product the unitarity domains and (quasi-)dispersion relations at t = 0
property of the collision operator S on the full for two-particle collision processes of the form 1 þ
Hilbert space H0 (i.e., SS
= S
S = I). 2 ! 1 þ 2 , 1 and 2 being given massive
particles with arbitrary spins and charges.
Connection between retarded kernel and scattering
kernel for the forward scattering case; a simple The holomorphic function Hp2 (k) and its primitive
‘‘reduction formula’’ We consider the scattering of domain D. Nontriviality of dispersion relations for
a particle 1 with mass m1 on a target consisting of the scattering amplitudes As suggested by eqn [24],
a particle 2 with mass m2 and denote by we can exploit the analyticity properties of the
T(p1 , p2 ; p01 , p02 ) the corresponding scattering kernel Fourier–Laplace transforms of the retarded and
(defined similarly as for the case of equal-mass ~ p (p) and
advanced kernels Rp2 and Ap2 : in fact, R 2

~ p (p) are, respectively, the boundary values of the

A restriction of the holomorphic function H ^ p ( , s) to
2 2
holomorphic functions the physical mass-shell value = m21 . However, it
Z turns out that the section of D by the complex mass-
~ ðcÞ
R p2 ðkÞ ¼ Rp2 ðxÞ eik x dx shell manifold M(c) with equation k2 = m21 is empty:
½28 this geometrical fact is responsible for the nontrivi-
~ ðcÞ
Ap2 ðkÞ ¼  Ap2 ðxÞ e dx ik x ality of the proof of dispersion relations for
V the physical quantity F ^0 (s) on the mass shell. In
fact, the tube T þ [ T  which constitutes the basic
from the corresponding domains T þ and T  . Accord- part of the domain D and is given by the field-
ing to the reduction formula [27], it is appropriate to theoretical microcausality postulate, is a ‘‘purely off-
consider correspondingly the functions Hpþ2 (k) ¼ (k2  shell’’ complex domain, as it can be easily checked: if a
(c) :
~ (k) and H  (k) ¼ (k2  m2 )A (c)
~ (k), which are
m21 )R p2 p2 1 p2 complex point k = p þ iq is such that q2 > 0, the
also, respectively, holomorphic in T þ and T  . Then corresponding squared mass = k2 = p2  q2 þ 2ip q
the forward scattering amplitude F0 (s) = F(s, 0) = is real if and only if p q = 0, which implies p2 < 0
T(p1 , p2 ; p1 , p2 ) appears as the restriction to the (i.e., p spacelike) and therefore = p2  q2 < 0.
hyperboloid sheet p 2 Hm 1
of the boundary value
þ þ
Hp2 (p) of Hp2 (k) on the reals.
Moreover, it can be seen that the two boundary ‘‘Off-shell dispersion relations’’ as a first step The
values Hpþ2 (p) = (p2  m21 )R ~ p (p) and H  (p) = (p2  starting point, which is easy to obtain from the
2 p2
2 ~
m1 )Ap2 (p) coincide as distributions in the region domain D, is the analyticity of the holomorphic
function H ^ p ( , s) in a cut-plane of the variable s for
R ¼ fp 2 R4 ; ðp þ p2 Þ2 < ðm1 þ m2 Þ2 ; all negative values of the squared mass variable .
ðp  p2 Þ2 < ðm1 þ m2 Þ2 g ½29 This cut-plane  is always the complement in C
(i.e., the complex s-plane) of the union of the s-cut
This follows from the intermediate expression in (s real (m1 þ m2 )2 ) and of the u-cut (u = 2 þ
eqn [26] and from the fact that a state of the form 2m22  s real (m1 þ m2 )2 ). This analyticity property
(p2  m21 )(p)p2 > is a state of energy–momentum thus justifies ‘‘off-shell dispersion relations’’ at fixed
p þ p2 and therefore vanishes (in view of the negative values of for the field-theoretical structure
spectral condition) if (p þ p2 )2 < (m1 þ m2 )2 (here function H ^ p ( , s).

we also use a simplifying assumption according to The latter property and the subsequent analysis
which no one-particle bound state is present in this concerning the process of analytic continuation of H ^p

channel). to positive values of will be more easily understood

The situation obtained concerning the holo- geometrically if one reduces the complex space of k to
morphic functions Hpþ2 (k) and Hp2 (k) parallels (in a two-dimensional complex space, which is legitimate
complex dimension four) the case of a pair of in view of the equality Hp2 (k) = H ^ p ( , s).

holomorphic functions in the upper and lower half- Having chosen the k0 -axis along p2 , we reduce the
planes whose boundary values on the reals coincide orthogonal space coordinates k of k to the radial
on a certain interval playing the role of R. As in this variable kr . One thus gets the following expressions
one-dimensional case there is a theorem, called the of the variables and s (resp. u):
‘‘edge-of-the-wedge theorem’’ (see below), which
¼ k20  k2r ; s ¼ þ m22 þ 2m2 k0
implies that Hpþ2 (k) and Hp2 (k) have a common
analytic continuation Hp2 (k): this function is holo- ðresp: u ¼ þ m22  2m2 k0 Þ
morphic in a domain D which is the union of :
^ p ( , s) ¼
Then we can write H Hp2 (k0 , kr ) =
T þ , T  and of a complex neighborhood of R; D is 2
Hp2 (k0 , kr ), and describe the image Dr of the
called the primitive domain of Hp2 (k).
domain D in the variables k = (k0 , kr ) = p þ iq as
Moreover, it follows from the postulate of invar-
Trþ [ Tr [ N (Rr ), where:
iance of the field (x) under the action of the Poincaré
group (see postulate (2)) that the holomorphic func- 1. Tr is defined by the condition q2 ¼ q20  q2r > 0,
tion Hp2 (k) only depends of the two complex variables q0 > 0 or q0 < 0,
= k2 (= k20  k2 ) and k p2 or equivalently s = (k þ 2. N is a complex neighborhood of the real region
p2 )2 = þ m22 þ 2k p2 ; it thus defines a correspond- Rr defined as follows. Let hþ 
s , hu be the two
^ p ( , s) ¼
ing holomorphic function H 2
Hp2 (k) in the branches of hyperbolae with respective equations:
image of D in these variables. 2 2 2
In view of the reduction formula [27], the hþ
s : ðp0 þ m2 Þ  pr ¼ ðm1 þ m2 Þ ; p0 þ m2 > 0
scattering function F ^0 (s) should appear as the h 2 2 2
u : ðp0  m2 Þ  pr ¼ ðm1 þ m2 Þ ; p0  m2 < 0

Then Rr is the intersection of the region situated analytically continued as a holomorphic function of
below hþ 
s and of the region situated above hu . s in the cut-plane  and thereby satisfies the
corresponding dispersion relations.
Let us now consider any complex hyperbola
: The physical mass shell hyperbola h(c) [m21 ] thus
h(c) [ ] with equation k2 ¼ k20  k2r = . On such a
appears as a limiting case of the previous family (for
complex curve either one of the variables k0 or s or
tending to m21 from below). The analyticity of
u is a good parameter for holomorphic functions ^ p (m2 , s) in  2 can then be justified provided one
H 1 m1
which are even in kr , like Hp2 (k0 , kr ). If is real, any 2
knows that this function is analytic at at least one
complex point k = p þ iq of h(c) [ ] is such that p2
point of m2 : but this additional information results
and q2 have opposite signs (since p q = 0). There- 1
from a more thorough exploitation of the analyticity
fore, the sign of q2 is always opposite to the sign of
properties resulting from the QFT postulates. This
(= p2  q2 ): if is negative, all the complex points
will be now briefly outlined below.
of h(c) [ ] thus belong to Trþ [ Tr ; the union of all
these points with the real points of h(c) [ ] in Rr is
therefore a subset of Dr , which is represented in the Further information coming from the four-point
complex plane of s by the cut-plane  . The function function in complex momentum space It is
H^ p ( , s) is therefore analytic (and univalent) in  possible to obtain further analyticity properties of
2 :
^ p ( , s) ¼
for each < 0. Moreover, the existence of moderate H 2
Hp2 (k) by considering the latter as
bounds of type [12] on Hp2 in D (resulting from the the restriction to the submanifold k1 = k3 = k;
temperateness assumption) then implies the validity k2 = k4 = p2 of a master analytic function
of dispersion relations (with subtractions) for H4 (k1 , k2 , k3 , k4 ), called the four-point function of
H^ p ( , s) in  . the field  in complex energy–momentum space (see
Scattering in Relativistic Quantum Field Theory:
The Analytic Program). This function is holo-
The problem of analytic completion to the complex morphic in a well-defined primitive domain D4 of
mass-shell hyperbola h(c) [m21 ]: what is provided by the linear submanifold k1 þ k2 þ k3 þ k4 = 0. It is
the Jost–Lehmann–Dyson domain A basic fact in then possible to compute some local parts situated
complex geometry in n variables, with n  2, is the near the reals of the holomorphy envelope of D4 ,
existence of a distinguished class of domains, called which implies, as a by-product, that the function
holomorphy domains: for each domain U in this H^ p ( , s) can be analytically continued in a set  of
class, there exists at least one function which is the form
holomorphic in U and cannot be analytically
continued at any point of the boundary of U. In  ¼ fð ; sÞ; 2 ; s 2 V s1 ð Þg
one dimension, every domain is a holomorphy [ fð ;sÞ; 2 ; u ¼ 2 þ 2m22  s 2 V u1 ð Þg ½30
domain. In dimension larger than one, a general
domain U is not a holomorphy domain, but it with the following specifications:
admits a holomorphy envelope U, ^ which is a 1.  is a domain in the -plane, which is a complex
holomorphy domain containing U, such that every neighborhood of a real interval of the form
function holomorphic in U admits an analytic a < < M21 ; here M1 denotes a spectral mass
continuation in U. ^ threshold in the theory such that M1 > m1 ;
It turns out that the domain Dr considered above 2. for each , V s1 ( ) (resp. V u1 ( )) is a cut-
in the last subsection) is not a holomorphy domain; neighborhood in the s-plane of the real half-line
its holomorphy envelope D ^ r (obtained geometrically s > s1 (resp. of the half-line u = 2 þ 2m22 
by Bros, Messiah, and Stora in 1961) coincides with s real >u1 ); s1 and u1 denote appropriate real
a domain introduced by Jost–Lehmann (1957) and numbers independent of .
Dyson (1958) by methods of wave equations. This
domain can be characterized as the union of Dr with The final analytic completion: crossing domains on
all the complex points of all the hyperbolae with h(c) [m21 ]. Dispersion relations for 0 –0 meson
equations (k0  a)2  (kr  b)2 = c2 (for all a, b, c scattering and ‘‘quasi-dispersion-relations’’ for
real, including the complex straight lines for which proton–proton scattering We now wish to describe
c = 0) whose both branches have a nonempty briefly the final step of analytic completion, which
intersection with the real region Rr . displays the existence of a ‘‘quasi-cut-plane domain’’
In particular, one easily sees that all the hyperbo- in s for the function H ^ p (m2 , s), even in the more
2 1
lae h(c) [ ] with 0  < m21 belong to the previous general case when the s-cut and u-cut are associated
class. It follows that for any in this positive with different scattering channels, whose respective
interval, the function H ^ p ( , s) can still be mass thresholds s = M212 and u = M02
2 12 are unequal.

This general situation may occur as soon as one It is only for the neutral case, where M12 =
charged particle 1 of the s-channel is replaced by M012 = m1 þ m2 , that a more favorable scenario
the corresponding antiparticle 1 in the u-channel, occurs, as explained earlier: in this case, the interval
in contrast with the case of neutral particles (like the { 2 ] a, 0[} of the set S0 is replaced by
0 meson) which coincide with their own antiparti- { 2 ] a, m21 [}, so that the whole cut-plane domain
cles. Here it is important to note that the two real m2 is obtained in the result of the previous crossing
branches hþ [m21 ] and h [m21 ] of the mass shell lemma. The scattering amplitudes of 0 –0 meson
hyperbola h(c) [m21 ] correspond, respectively, to the scattering and of  meson–proton scattering enjoy
physical region of the ‘‘direct scattering channel’’ of this property and, therefore, satisfy genuine disper-
the reaction 1 þ 2 ! 1 þ 2 with squared total sion relations in which the scattering function is
energy s, and to the physical region of the ‘‘crossed even (see the second basic example described at the
scattering channel’’ of the reaction 1 þ 2 ! 1 þ beginning of this article). In the general case of
2 with squared total energy u. A typical and crossing domains obtained above, corresponding
important example is the case of proton–proton Cauchy integral relations have been written and
scattering in the s-channel, where M12 equals twice used under the name of ‘‘quasi-dispersion-relations.’’
the mass m(= m1 = m2 ) of the proton, while the
corresponding u-channel refers to the proton–anti- Complementary results Some comments can now
proton scattering, whose threshold M012 equals twice be added concerning the passage from the purely
the mass of the  meson. geometrical results (i.e., analyticity domains)
In that general case, the analysis of the subsection described above to the writing of precise (quasi-)
‘‘‘Off-shell dispersion relations’ as a first step’’ still dispersion relations with two subtractions:
applies, so that the function H ^ p ( , s) is always Polynomial bounds and dispersion relations with
analytic in a set of the form N subtractions The previous methods of analytic
completion also allow one to control the bounds at
S0 ¼ fð ; sÞ; a < < 0; s 2  g ½31 infinity in the relevant complex domains. As it has
been noticed after eqn [24], the Fourier–Laplace
Then, the additional information described above in transforms of the retarded and advanced kernels, and
the last subsection allows one to use the following thereby the holomorphic functions Hp2 (k) discussed
crucial property of analytic completion, which we at the start of this section are bounded at most by a
call power of a suitable norm of k in their respective tubes
T  . Correspondingly, the holomorphic function
Crossing lemma If a function G( , s) is holomorphic
Hp2 (k) (resp. Hp2 (k0 , kr )) admits the same type of
in a domain which contains the union of the sets 
bound in its primitive analyticity domain D (resp.
and S0 (see eqns [30] and [31]), then it admits an
Dr ). These bounds are a consequence of the tempered
analytic continuation in a set of the following form:
distribution character of the structure functions of the
fields which is built-in in the Wightman field-
fð ; sÞ; 2 ; s 2  ;
theoretical framework. Then it can be checked that
js   m22 j ¼ ju   m22 j > Rð Þg in the holomorphy envelope D ^ r of Dr , and thereby in
the cut-plane (or crossing) domains obtained in the
By applying this property to the function H ^ p ( , s) intersection of D ^ r and of the complex mass shell
2 (c) 2
and restricting to the mass-shell value m1 which h [m1 ], the same type of power bound is still valid:
belongs to , one obtains the analyticity of the ^0 (s) is therefore bounded by some power jsjN1 of jsj
: ^
^0 (s) ¼
scattering function F Hp2 (m21 , s) in a crossing and thus satisfies a (quasi-)dispersion relation with N
domain of the complex mass shell hyperbola subtractions. The same type of argument holds for all
h(c) [m21 ]: the crossing between the two physical the similar cut-domains (or crossing domains) in s
regions hþ [m21 ] (s  M212 ) and h [m21 ] (u  M02 ^t (s) for all negative value of t.
12 ) is obtained for F
ensured by a complex domain of h(c) [m21 ] whose image It is also worthwhile to mention that a similar
in the s-plane  is the ‘‘cut-neighborhood
 infinity’’ remarkable (since not at all predictable) result was
{s; s 2 m2 , s  m21  m22  = u  m21  m22  > R(m21 )}. also obtained in the Haag, Kastler, and Araki frame-
Note that the relevant boundary values of F ^0 for work of algebraic QFT (Epstein, Glaser, Martin,
obtaining the scattering amplitudes of the two 1969; see Scattering in Relativistic Quantum Field
collision processes with respective physical regions Theory: The Analytic Program for further comments).
hþ [m21 ] and h [m21 ] have to be taken from the In this connection, one can also mention a more
respective sides Im s > 0 and Im u = Im s > 0 of the recent result. In the Buchholz–Fredenhagen axio-
corresponding s- and u-cuts. matic approach of charged fields (1982), in which
Dissipative Dynamical Systems of Infinite Dimension 101

locality is replaced by the more general notion of "stringlike locality" (see Algebraic Approach to Quantum Field Theory, Axiomatic Quantum Field Theory, and Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools), a proof of forward dispersion relations has again been obtained (Bros, Epstein, 1994).

The extension of the analyticity domains by positivity and the derivation of bounds by unitarity (Martin 1966; see the book by Martin (1969)). The following ingredients have been used:

1. Positivity conditions on the absorptive part of $F(s, t)$, which are expressed by the infinite set of inequalities $(d/dt)^n \operatorname{Im} F(s, t)|_{t=0} \ge 0$ (for all integers n),
2. The existence of a two-dimensional complex neighborhood of some point $(s = s_0, t = 0)$ in the analyticity domain resulting from QFT.

The following results have then been obtained:

(a) It is justified to differentiate the forward (subtracted) dispersion relations with respect to t at any order.
(b) $\hat{F}(s, t)$ can be analytically continued in a fixed circle $|t| < t_{\max}$ for all values of s. The latter implies the extension of dispersion relations in s to positive (and complex) values of t.
(c) In a last step, the use of unitarity conditions for the "partial waves" $f_\ell(s)$ of $F(s, t)$ (see Scattering in Relativistic Quantum Field Theory: The Analytic Program) allows one to obtain Froissart-type bounds on the scattering amplitudes and thereby to justify the writing of (quasi-)dispersion relations with at most two subtractions for all the admissible values of t.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Perturbation Theory and its Techniques; Scattering in Relativistic Quantum Field Theory: The Analytic Program; Scattering, Asymptotic Completeness and Bound States; Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools.

Further Reading

Haag R (1992) Local Quantum Physics. Berlin: Springer.
Iagolnitzer D (1992) Scattering in Quantum Field Theories: The Axiomatic and Constructive Approaches. Princeton Series in Physics. Princeton: Princeton University Press.
Jost R (1965) The General Theory of Quantized Fields. Providence, RI: American Mathematical Society.
Klein L (ed.) (1961) Dispersion Relations and the Abstract Approach to Field Theory. New York: Gordon and Breach.
Martin A (1969) Scattering Theory: Unitarity, Analyticity and Crossing, Lecture Notes in Physics. Berlin: Springer.
Nussenzveig HM (1972) Causality and Dispersion Relations. New York: Academic Press.
Streater RF and Wightman AS (1964, 1980) PCT, Spin and Statistics, and all that. Princeton: Princeton University Press.
Vernov YuS (1996) Dispersion Relations in the Historical Aspect, IHEP Publications, Protvino Conf.: Fundamental Problems of High Energy Physics and Field Theory.

Dissipative Dynamical Systems of Infinite Dimension

M Efendiev and S Zelik, Universität Stuttgart, Stuttgart, Germany
A Miranville, Université de Poitiers, Chasseneuil, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A dynamical system (DS) is a system which evolves with respect to the time. To be more precise, a DS $(S(t), \Phi)$ is determined by a phase space $\Phi$ which consists of all possible values of the parameters describing the state of the system and an evolution map $S(t) : \Phi \to \Phi$ that allows one to find the state of the system at time $t > 0$ if the initial state at $t = 0$ is known. Very often, in mechanics and physics, the evolution of the system is governed by systems of differential equations. If the system is described by ordinary differential equations (ODEs),

$$\frac{d}{dt} y(t) = F(t, y(t)), \quad y(0) = y_0, \quad y(t) := (y_1(t), \dots, y_N(t)) \qquad [1]$$

for some nonlinear function $F : \mathbb{R}_+ \times \mathbb{R}^N \to \mathbb{R}^N$, we have a so-called finite-dimensional DS. In that case, the phase space $\Phi$ is some (invariant) subset of $\mathbb{R}^N$ and the evolution operator $S(t)$ is defined by

$$S(t) y_0 := y(t), \quad y(t) \text{ solves } [1] \qquad [2]$$

We also recall that, in the case where eqn [1] is autonomous (i.e., does not depend explicitly on the time), the evolution operators $S(t)$ generate a semigroup on the phase space $\Phi$, that is,

$$S(t_1 + t_2) = S(t_1) \circ S(t_2), \quad t_1, t_2 \in \mathbb{R}_+ \qquad [3]$$
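
The semigroup identity [3] can be checked numerically on a simple autonomous ODE. The sketch below is only an illustration under stated assumptions: the logistic equation y' = y(1 - y), the classical Runge-Kutta scheme, the step size, and the names `rhs`, `flow`, and `exact` are choices made here and do not come from the article.

```python
import math

def rhs(y):
    """Right-hand side of the autonomous logistic ODE y' = y (1 - y) (illustrative choice)."""
    return y * (1.0 - y)

def flow(t, y0, h=0.01):
    """Approximate the evolution operator S(t) y0 with classical RK4 steps of size h."""
    y = y0
    for _ in range(round(t / h)):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * h * k1)
        k3 = rhs(y + 0.5 * h * k2)
        k4 = rhs(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y

def exact(t, y0):
    """Closed-form solution of the logistic equation, used only as a reference."""
    return y0 * math.exp(t) / (1.0 - y0 + y0 * math.exp(t))

y0, t1, t2 = 0.1, 0.5, 0.7
direct = flow(t1 + t2, y0)            # S(t1 + t2) y0
composed = flow(t1, flow(t2, y0))     # S(t1) S(t2) y0
semigroup_gap = abs(direct - composed)
exact_gap = abs(direct - exact(t1 + t2, y0))
```

Because the right-hand side does not depend on time, evolving for t2 and then for t1 repeats the same sequence of steps as evolving for t1 + t2, so `semigroup_gap` stays at the level of rounding error. For a nonautonomous equation the second leg would have to start at time t2 rather than at 0, and the identity would fail in this form.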

Now, in the case of a distributed system whose initial state is described by functions $u_0 = u_0(x)$ depending on the spatial variable x, the evolution is usually governed by partial differential equations (PDEs) and the corresponding phase space $\Phi$ is some infinite-dimensional function space (e.g., $\Phi := L^2(\Omega)$ or $\Phi := L^\infty(\Omega)$ for some domain $\Omega \subset \mathbb{R}^N$). Such DSs are usually called infinite dimensional.

The qualitative study of DSs of finite dimensions goes back to the beginning of the twentieth century, with the pioneering works of Poincaré on the N-body problem (one should also acknowledge the contributions of Lyapunov on the stability and of Birkhoff on the minimal sets and the ergodic theorem). One of the most surprising and significant facts discovered at the very beginning of the theory is that even relatively simple equations can generate very complicated chaotic behaviors. Moreover, these types of systems are extremely sensitive to initial conditions (the trajectories with close but different initial data diverge exponentially). Thus, in spite of the deterministic nature of the system (we recall that it is generated by a system of ODEs, for which we usually have the unique solvability theorem), its temporal evolution is unpredictable on timescales larger than some critical time $T_0$ (which depends obviously on the error of approximation and on the rate of divergence of close trajectories) and can show typical stochastic behaviors. To the best of our knowledge, one of the first ODEs for which such types of behaviors were established is the physical pendulum parametrically perturbed by time-periodic external forces,

$$y''(t) + \sin(y(t)) (1 + \varepsilon \sin(\omega t)) = 0 \qquad [4]$$

where $\omega$ and $\varepsilon > 0$ are physical parameters. We also mention the more recent (and more relevant for our topic) famous example of the Lorenz system which is defined by the following system of ODEs in $\mathbb{R}^3$:

$$\begin{cases} x' = \sigma (y - x) \\ y' = -xz + rx - y \\ z' = xy - bz \end{cases} \qquad [5]$$

where $\sigma$, r, and b are some parameters. These equations are obtained by truncation of the Navier–Stokes equations and give an approximate description of a horizontal fluid layer heated from below. The warmer fluid formed at the bottom tends to rise, creating convection currents. This is similar to what happens in the Earth's atmosphere. For a sufficiently intense heating, the time evolution has a sensitive dependence on the initial conditions, thus representing a very irregular and chaotic convection. This fact was used by Lorenz to justify the so-called "butterfly effect," a metaphor for the imprecision of weather forecasts.

The theory of DSs in finite dimensions was extensively developed during the twentieth century, due to the efforts of many famous mathematicians (such as Anosov, Arnold, LaSalle, Sinai, Smale, etc.) and, nowadays, much is known on the chaotic behaviors in such systems, at least in low dimensions. In particular, it is known that, very often, the trajectories of a chaotic system are localized, up to a transient process, in some subset of the phase space having a very complicated fractal geometric structure (e.g., locally homeomorphic to the Cartesian product of $\mathbb{R}^m$ and some Cantor set) which, thus, accumulates the nontrivial dynamics of the system (the so-called strange attractor). The chaotic dynamics on such sets are usually described by symbolic dynamics generated by Bernoulli shifts on the space of sequences. We also note that, nowadays, a mathematician has a large number of different concepts and methods for the extensive study of concrete chaotic DSs in finite dimensions. In particular, we mention here different types of bifurcation theories (including the KAM theory and the homoclinic bifurcation theory with related Shilnikov chaos), the theory of hyperbolic sets, stochastic description of deterministic processes, Lyapunov exponents and entropy theory, dynamical analysis of time series, etc.

We now turn to infinite-dimensional DSs generated by PDEs. A first important difficulty which arises here is related to the fact that the analytic structure of a PDE is essentially more complicated than that of an ODE and, in particular, we do not have in general the unique solvability theorem as for ODEs, so that even finding the proper phase space and the rigorous construction of the associated DS can be a highly nontrivial problem. In order to indicate the level of difficulties arising here, it suffices to recall that, for the three-dimensional Navier–Stokes system (which is one of the most important equations of mathematical physics), the required associated DS has not been constructed yet. Nevertheless, there exists a large number of equations for which the problem of the global existence and uniqueness of a solution has been solved. Thus, the question of extending the highly developed finite-dimensional DS theory to infinite dimensions arises naturally.

One of the first and most significant results in that direction was the development of the theory of integrable Hamiltonian systems in infinite dimensions and the explicit resolution (by inverse-scattering methods) of several important conservative equations
of mathematical physics (such as the Korteweg–de Vries (and the generalized Kadomtsev–Petviashvili hierarchy), the sine-Gordon, and the nonlinear Schrödinger equations). Nevertheless, it is worth noting that integrability is a very rare phenomenon, even among ODEs, and this theory is clearly insufficient to understand the dynamics arising in PDEs. In particular, there exist many important equations which are essentially out of reach of this theory.

One of the most important classes of such equations consists of the so-called dissipative PDEs which are the main subject of our study. As hinted by this denomination, these systems exhibit some energy dissipation process (in contrast to conservative systems for which the energy is preserved) and, of course, in order to have nontrivial dynamics, these models should also account for the energy income. Roughly speaking, the complicated chaotic behaviors in such systems usually arise from the interaction of the following mechanisms:

1. energy dissipation in the higher part of the Fourier spectrum;
2. external energy income in its lower part;
3. energy flux from lower to higher Fourier modes provided by the nonlinear terms of the equation.

We chose not to give a rigorous definition of a dissipative system here (although the concepts of energy dissipation and related dissipative systems are more or less obvious from the physical point of view, they seem too general to have an adequate mathematical definition). Instead, we only indicate several basic classes of equations of mathematical physics which usually exhibit the above behaviors.

The first example is, of course, the Navier–Stokes system, which describes the motion of a viscous incompressible fluid in a bounded domain $\Omega$ (we will only consider here the two-dimensional case $\Omega \subset \mathbb{R}^2$, since the adequate formulation in three dimensions is still an open problem):

$$\begin{cases} \partial_t u - (u, \nabla_x) u = \nu \Delta_x u + \nabla_x p + g(x) \\ \operatorname{div} u = 0, \quad u|_{t=0} = u_0, \quad u|_{\partial\Omega} = 0 \end{cases} \qquad [6]$$

Here, $u(t, x) = (u_1(t, x), u_2(t, x))$ is the unknown velocity vector, $p = p(t, x)$ is the unknown pressure, $\Delta_x$ is the Laplacian with respect to x, $\nu > 0$ and g are given kinematic viscosity and external forces, respectively, and $(u, \nabla_x) u$ is the inertial term ($[(u, \nabla_x) u]_i = \sum_{j=1}^{2} u_j \partial_{x_j} u_i$, $i = 1, 2$). The unique global solvability of [6] has been proved by Ladyzhenskaya. Thus, this equation generates an infinite-dimensional DS in the phase space $\Phi$ of divergence-free square-integrable vector fields.

The second example is the damped nonlinear wave equation in $\Omega \subset \mathbb{R}^n$:

$$\partial_t^2 u + \partial_t u - \Delta_x u + f(u) = 0, \quad u|_{\partial\Omega} = 0, \quad u|_{t=0} = u_0, \quad \partial_t u|_{t=0} = u_0' \qquad [7]$$

which models, for example, the dynamics of a Josephson junction driven by a current source (sine-Gordon equation). It is known that, under natural sign and growth assumptions on the nonlinear interaction function f, this equation generates a DS in the energy phase space E of pairs of functions $(u, \partial_t u)$ such that $\partial_t u$ and $\nabla_x u$ are square integrable.

The last class of equations that we will consider here consists of reaction–diffusion systems in a domain $\Omega \subset \mathbb{R}^n$:

$$\partial_t u = a \Delta_x u - f(u), \quad u|_{t=0} = u_0 \qquad [8]$$

(endowed with Dirichlet ($u|_{\partial\Omega} = 0$) or Neumann ($\partial_n u|_{\partial\Omega} = 0$) boundary conditions), which describes some chemical reaction in $\Omega$. Here, $u = (u_1, \dots, u_N)$ is an unknown vector-valued function which describes the concentrations of the reactants, $f(u)$ is a given interaction function, and a is a diffusion matrix. It is known that, under natural assumptions on f and a, these equations also generate an infinite-dimensional DS, for example, in the phase space $\Phi := [L^\infty(\Omega)]^N$.

We emphasize once more that the phase spaces $\Phi$ in all these examples are appropriate infinite-dimensional function spaces. Nevertheless, it was observed in experiments that, up to a transient process, the trajectories of the DS considered are localized inside a "very thin" invariant subset of the phase space having a complicated geometric structure which, thus, accumulates all the nontrivial dynamics of the system. It was conjectured a little later that these invariant sets are, in some proper sense, finite dimensional and that the dynamics restricted to these sets can be effectively described by a finite number of parameters. Thus (when this conjecture is true), in spite of the infinite-dimensional initial phase space, the effective dynamics (reduced to this invariant set) is finite dimensional and can be studied by using the algorithms and concepts of the classical finite-dimensional DS theory. In particular, this means that the infinite dimensionality plays here only the role of (possibly essential) technical difficulties, which cannot, however, produce any new dynamical phenomena which are not observed in the finite-dimensional theory.

The above finite-dimensional reduction principle of dissipative PDEs in bounded domains has been given solid mathematical grounds (based on the concept of the so-called global attractor) over the
last three decades, starting from the pioneering papers of Ladyzhenskaya. This theory is considered in more detail here.

The finite-dimensional reduction theory has some limitations. Of course, the first and most obvious restriction of this principle is the effective dimension of the reduced finite-dimensional DS. Indeed, it is known that, typically, this dimension grows at least linearly with respect to the volume $\operatorname{vol}(\Omega)$ of the spatial domain $\Omega$ of the DS considered (and the growth of the size of $\Omega$ is the same (up to a rescaling) as the decay of the viscosity coefficient $\nu$ or the diffusion matrix a, see eqns [6]–[8]). So, for sufficiently large domains $\Omega$, the reduced DS can be too large for reasonable investigations.

The next, less obvious, but much more essential, restriction is the growing spatial complexity of the DS. Indeed, as shown by Babin–Bunimovich, the spatial complexity of the system (e.g., the number of topologically different equilibria) grows exponentially with respect to $\operatorname{vol}(\Omega)$. Thus, even in the case of relatively small dimensions, the reduced system can be beyond the reach of reasonable investigations, due to its extremely complicated structure.

Therefore, the approach based on the finite-dimensional reduction does not look so attractive for large domains. It seems, instead, more natural, at least from the physical point of view, to replace large bounded domains by their limit unbounded ones (e.g., $\Omega = \mathbb{R}^n$ or cylindrical domains). Of course, this approach requires a systematic study of dissipative DSs associated with PDEs in unbounded domains.

The dynamical study of PDEs in unbounded domains started from the pioneering paper of Kolmogorov–Petrovskij–Piskunov, in which the traveling wave solutions of reaction–diffusion equations in a strip were constructed and the convergence of the trajectories (for specific initial data) to these traveling wave solutions was established. Starting from this, many results on the dynamics of PDEs in unbounded domains have been obtained. However, for a long period, the general features of such dynamics remained completely unclear. The main problems arising here are:

1. the essential infinite dimensionality of the DS considered (absence of any finite-dimensional reduction), which leads to essentially new dynamical effects that are not observed in finite-dimensional theories;
2. the additional spatial "unbounded" directions lead to the so-called spatial chaos and the interaction between spatial and temporal chaotic modes generates the spatio-temporal chaos, which also has no analog in finite dimensions.

Nevertheless, several ideas are mentioned in the following which (from the authors' point of view) were the most important for the development of these topics. The first one is the pioneering paper of Kirchgässner, in which dynamical methods were applied to the study of the spatial structure of solutions of elliptic equations in cylinders (which can be considered as equilibria equations for evolution PDEs in unbounded cylindrical domains). The second is the Sinai–Bunimovich model of spacetime chaos in discrete lattice DSs. Finally, the third is the adaptation of the concept of a global attractor to unbounded domains by Abergel and Babin–Vishik.

We note that the situation on the understanding of the general features of the dynamics in unbounded domains, however, seems to have changed in the last several years, due to the works of Collet–Eckmann and Zelik. This is the reason why a section of this review is devoted to a more detailed discussion on this topic.

Other important questions are the object of current studies and we only briefly mention some of them. We mention, for instance, the study of attractors for nonautonomous systems (i.e., systems in which the time appears explicitly). This situation is much more delicate and is not completely understood; notions of attractors for such systems have been proposed by Chepyzhov–Vishik, Haraux, and Kloeden–Schmalfuss. We also mention that theories of (global) attractors for non-well-posed problems have been proposed by Babin–Vishik, Ball, Chepyzhov–Vishik, Melnik–Valero, and Sell.

Global Attractors and Finite-Dimensional Reduction

Global Attractors: The Abstract Setting

As already mentioned, one of the main concepts of the modern theory of DSs in infinite dimensions is that of the global attractor. We give below its definition for an abstract semigroup $S(t)$ acting on a metric space $\Phi$, although, without loss of generality, the reader may think that $(S(t), \Phi)$ is just a DS associated with one of the PDEs ([6]–[8]) described in the introduction.

To this end, we first recall that a subset K of the phase space $\Phi$ is an attracting set of the semigroup $S(t)$ if it attracts the images of all the bounded subsets of $\Phi$, that is, for every bounded set B and every $\varepsilon > 0$, there exists a time T (depending in general on B and $\varepsilon$) such that the image $S(t)B$ belongs to the
$\varepsilon$-neighborhood of K if $t \ge T$. This property can be rewritten in the equivalent form

$$\lim_{t \to \infty} \operatorname{dist}_H(S(t)B, K) = 0 \qquad [9]$$

where $\operatorname{dist}_H(X, Y) := \sup_{x \in X} \inf_{y \in Y} d(x, y)$ is the nonsymmetric Hausdorff distance between subsets of $\Phi$.

The following definition of a global attractor is due to Babin–Vishik.

Definition 1 A set $\mathcal{A} \subset \Phi$ is a global attractor for the semigroup $S(t)$ if

(i) $\mathcal{A}$ is compact in $\Phi$;
(ii) $\mathcal{A}$ is strictly invariant: $S(t)\mathcal{A} = \mathcal{A}$, for all $t \ge 0$;
(iii) $\mathcal{A}$ is an attracting set for the semigroup $S(t)$.

Thus, the second and third properties guarantee that a global attractor, if it exists, is unique and that the DS reduced to the attractor contains all the nontrivial dynamics of the initial system. Furthermore, the first property indicates that the reduced phase space $\mathcal{A}$ is indeed "thinner" than the initial phase space $\Phi$ (we recall that, in infinite dimensions, a compact set cannot contain, e.g., balls and should thus be nowhere dense).

In most applications, one can use the following existence theorem for attractors.

Theorem 1 Let a DS $(S(t), \Phi)$ possess a compact attracting set and the operators $S(t) : \Phi \to \Phi$ be continuous for every fixed t. Then, this system possesses the global attractor $\mathcal{A}$ which is generated by all the trajectories of $S(t)$ which are defined for all $t \in \mathbb{R}$ and are globally bounded.

The strategy for applying this theorem to concrete equations of mathematical physics is the following. In a first step, one verifies a so-called dissipative estimate which has usually the form

$$\|S(t) u_0\|_\Phi \le Q(\|u_0\|_\Phi) e^{-\alpha t} + C_*, \quad u_0 \in \Phi \qquad [10]$$

where $\|\cdot\|_\Phi$ is a norm in the function space $\Phi$ and the positive constants $\alpha$ and $C_*$ and the monotonic function Q are independent of t and $u_0 \in \Phi$ (usually, this estimate follows from energy estimates and is sometimes even used in order to "define" a dissipative system). This estimate obviously gives the existence of an attracting set for $S(t)$ (e.g., the ball of radius $2C_*$ in $\Phi$), which is, however, noncompact in $\Phi$. In order to overcome this problem, one usually derives, in a second step, a smoothing property for the solutions, which can be formulated as follows:

$$\|S(1) u_0\|_{\Phi_1} \le Q_1(\|u_0\|_\Phi), \quad u_0 \in \Phi \qquad [11]$$

where $\Phi_1$ is another function space which is compactly embedded into $\Phi$. In applications, $\Phi$ is usually the space $L^2(\Omega)$ of square integrable functions, $\Phi_1$ is the Sobolev space $H^1(\Omega)$ of the functions u such that u and $\nabla_x u$ belong to $L^2(\Omega)$ and estimate [11] is a classical smoothing property for solutions of parabolic equations (for hyperbolic equations, a slightly more complicated asymptotic smoothing property should be used instead of [11]).

Since the continuity of the operators $S(t)$ usually poses no difficulty (if the uniqueness is proven), the above scheme gives indeed the existence of the global attractor for most of the PDEs of mathematical physics in bounded domains.

Dimension of the Global Attractor

In this subsection, we start by discussing one of the basic questions of the theory: in which sense is the dynamics on the global attractor finite dimensional? As already mentioned, the global attractor is usually not a manifold, but has a rather complicated geometric structure. So, it is natural to use the definitions of dimensions adopted for the study of fractal sets here. We restrict ourselves to the so-called fractal (or box-counting, entropy) dimension, although other dimensions (e.g., Hausdorff, Lyapunov, etc.) are also used in the theory of attractors.

In order to define the fractal dimension, we first recall the concept of Kolmogorov's $\varepsilon$-entropy, which comes from information theory and plays a fundamental role in the theory of DSs in unbounded domains considered in the next section.

Definition 2 Let $\mathcal{A}$ be a compact subset of a metric space $\Phi$. For every $\varepsilon > 0$, we define $N_\varepsilon(\mathcal{A})$ as the minimal number of $\varepsilon$-balls which are necessary to cover $\mathcal{A}$. Then, Kolmogorov's $\varepsilon$-entropy $H_\varepsilon(\mathcal{A}) = H_\varepsilon(\mathcal{A}, \Phi)$ of $\mathcal{A}$ is the binary logarithm of this number, $H_\varepsilon(\mathcal{A}) := \log_2 N_\varepsilon(\mathcal{A})$. We recall that $H_\varepsilon(\mathcal{A})$ is finite for every $\varepsilon > 0$, due to the Hausdorff criterion. The fractal dimension $d_f(\mathcal{A}) \in [0, \infty]$ of $\mathcal{A}$ is then defined by

$$d_f(\mathcal{A}) := \limsup_{\varepsilon \to 0} \frac{H_\varepsilon(\mathcal{A})}{\log_2 (1/\varepsilon)} \qquad [12]$$

We also recall that, although this dimension coincides with the usual dimension of the manifold for Lipschitz manifolds, it can be noninteger for more complicated sets. For instance, the fractal dimension of the standard ternary Cantor set in $[0, 1]$ is $\ln 2 / \ln 3$.

The so-called Mañé theorem (which can be considered as a generalization of the classical Whitney embedding theorem for fractal sets) plays an important role in the finite-dimensional reduction theory.
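
The value ln 2 / ln 3 quoted for the ternary Cantor set can be reproduced from the entropy definition [12]. The following is a minimal sketch, assuming that counting occupied grid boxes of side 3^-k is an acceptable stand-in for the minimal number of covering balls (the two counts differ by at most a bounded factor, which does not affect the limit). The function names and the construction depth are choices made for this illustration.

```python
import math

def cantor_points(level):
    """Left endpoints of the level-`level` Cantor intervals, as integers scaled by 3**level."""
    points = [0]
    for _ in range(level):
        # Each interval keeps its left third and its right third.
        points = [3 * p for p in points] + [3 * p + 2 for p in points]
    return points

def box_count(points, level, k):
    """Number of grid boxes of side 3**-k containing at least one of the points."""
    scale = 3 ** (level - k)
    return len({p // scale for p in points})

level = 12
points = cantor_points(level)
estimates = []
for k in range(1, 9):
    n_eps = box_count(points, level, k)      # stands in for N_eps(A) with eps = 3**-k
    entropy = math.log2(n_eps)               # Kolmogorov eps-entropy H_eps(A)
    estimates.append(entropy / math.log2(3 ** k))

cantor_dimension = math.log(2) / math.log(3)
```

Working with integers scaled by 3^level avoids floating-point misclassification of points that sit exactly on box boundaries. At scale 3^-k exactly 2^k boxes are occupied, so every entry of `estimates` equals ln 2 / ln 3 up to rounding and the limsup in [12] is attained at each scale.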

Theorem 2 Let $\Phi$ be a Banach space and $\mathcal{A}$ be a compact set such that $d_f(\mathcal{A}) < N$ for some $N \in \mathbb{N}$. Then, for “almost all” $(2N + 1)$-dimensional planes $L$ in $\Phi$, the corresponding projector $\Pi_L : \Phi \to L$ restricted to the set $\mathcal{A}$ is a Hölder continuous homeomorphism.

Thus, if the finite fractal dimensionality of the attractor is established, then, fixing a hyperplane $L$ satisfying the assumptions of the Mané theorem and projecting the attractor $\mathcal{A}$ and the DS $S(t)$ restricted to $\mathcal{A}$ onto this hyperplane ($\bar{\mathcal{A}} := \Pi_L \mathcal{A}$ and $\bar{S}(t) := \Pi_L \circ S(t) \circ \Pi_L^{-1}$), we obtain, indeed, a reduced DS $(\bar{S}(t), \bar{\mathcal{A}})$ which is defined on a finite-dimensional set $\bar{\mathcal{A}} \subset L \sim \mathbb{R}^{2N+1}$. Moreover, this DS will be Hölder continuous with respect to the initial data.

Estimates on the Fractal Dimension

Obviously, good estimates on the dimension of the attractors in terms of the physical parameters are crucial for the finite-dimensional reduction described above, and (consequently) there exists a highly developed machinery for obtaining such estimates. The best-known upper estimates are usually obtained by the so-called volume contraction method, which is based on the study of the evolution of infinitesimal $k$-dimensional volumes in the neighborhood of the attractor (and, if the DS considered contracts the $k$-dimensional volumes, then the fractal dimension of the attractor is less than $k$). Lower bounds on the dimension are usually based on the observation that the global attractor always contains the unstable manifolds of the (hyperbolic) equilibria. Thus, the instability index of a properly constructed equilibrium gives a lower bound on the dimension of the attractor.

In the following, several estimates for the classes of equations given in the introduction are formulated, beginning with the most-studied case of the reaction–diffusion system [8]. For this system, sharp upper and lower bounds are known, namely

\[ C_1 \operatorname{vol}(\Omega) \le d_f(\mathcal{A}) \le C_2 \operatorname{vol}(\Omega) \qquad [13] \]

where the constants $C_1$ and $C_2$ depend on $a$ and $f$ (and, possibly, on the shape of $\Omega$), but are independent of its size. The same types of estimates also hold for the hyperbolic equation [7]. Concerning the Navier–Stokes system [6] in general two-dimensional domains $\Omega$, the asymptotics of the fractal dimension as $\nu \to 0$ is not known. The best-known upper bound has the form $d_f(\mathcal{A}) \le C\nu^{-2}$ and was obtained by Foias–Temam by using the so-called Lieb–Thirring inequalities. Nevertheless, for periodic boundary conditions, Constantin–Foias–Temam and Liu obtained upper and lower bounds of the same order (up to a logarithmic correction):

\[ C_1 \nu^{-4/3} \le d_f(\mathcal{A}) \le C_2 \nu^{-4/3} \left(1 + \ln(\nu^{-1})\right)^{1/3} \qquad [14] \]

Global Lyapunov Functions and the Structure of Global Attractors

Although the global attractor usually has a very complicated geometric structure, there exists one exceptional class of DS for which the global attractor has a relatively simple structure which is completely understood, namely the DS having a global Lyapunov function. We recall that a continuous function $L : \Phi \to \mathbb{R}$ is a global Lyapunov function if

1. $L$ is nonincreasing along the trajectories, that is, $L(S(t)u_0) \le L(u_0)$, for all $t \ge 0$;
2. $L$ is strictly decreasing along all nonequilibrium solutions, that is, $L(S(t)u_0) = L(u_0)$ for some $t > 0$ and $u_0$ implies that $u_0$ is an equilibrium of $S(t)$.

For instance, in the scalar case $N = 1$, the reaction–diffusion equations [8] possess the global Lyapunov function $L(u_0) := \int_\Omega [a|\nabla_x u_0(x)|^2 + F(u_0(x))]\,dx$, where $F(v) := \int_0^v f(u)\,du$. Indeed, multiplying eqn [8] by $\partial_t u$ and integrating over $\Omega$, we have

\[ \frac{d}{dt} L(u(t)) = -2\|\partial_t u(t)\|^2_{L^2(\Omega)} \le 0 \qquad [15] \]

Analogously, in the scalar case $N = 1$, multiplying the hyperbolic equation [7] by $\partial_t u(t)$ and integrating over $\Omega$, we obtain the standard global Lyapunov function for this equation.

It is well known that, if a DS possesses a global Lyapunov function, then, at least under the generic assumption that the set $\mathcal{R}$ of equilibria is finite, every trajectory $u(t)$ stabilizes to one of these equilibria as $t \to +\infty$. Moreover, every complete bounded trajectory $u(t)$, $t \in \mathbb{R}$, belonging to the attractor is a heteroclinic orbit joining two equilibria. Thus, the global attractor $\mathcal{A}$ can be described as follows:

\[ \mathcal{A} = \bigcup_{u_0 \in \mathcal{R}} \mathcal{M}^+(u_0) \qquad [16] \]

where $\mathcal{M}^+(u_0)$ is the so-called unstable set of the equilibrium $u_0$ (which is generated by all heteroclinic orbits of the DS which start from the given equilibrium $u_0 \in \mathcal{A}$). It is also known that, if the equilibrium $u_0$ is hyperbolic (generic assumption), then the set $\mathcal{M}^+(u_0)$ is a finite-dimensional submanifold of $\Phi$ whose dimension equals the instability index of $u_0$. Thus, under the generic hyperbolicity assumption on the equilibria, the
attractor $\mathcal{A}$ of a DS having a global Lyapunov function is a finite union of smooth finite-dimensional submanifolds of the phase space $\Phi$. These attractors are called regular (following Babin–Vishik).

It is also worth emphasizing that, in contrast to general global attractors, regular attractors are robust under perturbations. Moreover, in some cases, it is also possible to verify the so-called transversality conditions (for the intersection of stable and unstable manifolds of the equilibria) and, thus, verify that the DS considered is a Morse–Smale system. In particular, this means that the dynamics restricted to the regular attractor $\mathcal{A}$ is also preserved (up to homeomorphisms) under perturbations.

A disadvantage of the approach of using a regular attractor is the fact that, except for scalar parabolic equations in one space dimension, it is usually extremely difficult to verify the “generic” hyperbolicity and transversality assumptions for concrete values of the physical parameters and the associated hyperbolicity constants, as a rule, cannot be expressed in terms of these parameters.

Inertial Manifolds

It should be noted that the scheme for the finite-dimensional reduction described above has essential drawbacks. Indeed, the reduced system $(\bar{S}(t), \bar{\mathcal{A}})$ is only Hölder continuous and, consequently, cannot be realized as a DS generated by a system of ODEs (and reasonable conditions on the attractor $\mathcal{A}$ which guarantee the Lipschitz continuity of the Mané projections are not known). On the other hand, the complicated geometric structure of the attractor $\mathcal{A}$ (or $\bar{\mathcal{A}}$) makes the use of this finite-dimensional reduction in computations hazardous (in fact, only the heuristic information on the number of unknowns which are necessary to capture all the dynamical effects in approximations can be extracted).

In order to overcome these problems, the concept of an inertial manifold (which allows one to embed the global attractor into a smooth manifold) has been suggested by Foias–Sell–Temam. To be more precise, a Lipschitz finite-dimensional manifold $\mathcal{M} \subset \Phi$ is an inertial manifold for the DS $(S(t), \Phi)$ if

1. $\mathcal{M}$ is semiinvariant, that is, $S(t)\mathcal{M} \subset \mathcal{M}$, for all $t \ge 0$;
2. $\mathcal{M}$ satisfies the following asymptotic completeness property: for every $u_0 \in \Phi$, there exists $v_0 \in \mathcal{M}$ such that

\[ \|S(t)u_0 - S(t)v_0\|_{\Phi} \le Q(\|u_0\|_{\Phi})\, e^{-\alpha t} \qquad [17] \]

where the positive constant $\alpha$ and the monotonic function $Q$ are independent of $u_0$.

We can see that an inertial manifold, if it exists, confirms in a perfect way the heuristic conjecture on the finite dimensionality formulated in the introduction. Indeed, the dynamics of $S(t)$ restricted to an inertial manifold can be, obviously, described by a system of ODEs (which is called the inertial form of the initial PDE). On the other hand, the asymptotic completeness gives (in a very strong form) the equivalence of the initial DS $(S(t), \Phi)$ with its inertial form $(S(t), \mathcal{M})$. Moreover, in turbulence, the existence of an inertial manifold would yield an exact interaction law between the small and large structures of the flow.

Unfortunately, all the known constructions of inertial manifolds are based on a very restrictive condition, the so-called spectral gap condition, which requires arbitrarily large gaps in the spectrum of the linearization of the initial PDE and which can usually be verified only in one space dimension. So, the existence of an inertial manifold is still an open problem for many important equations of mathematical physics (including in particular the two-dimensional Navier–Stokes equations; some nonexistence results have also been proven by Mallet-Paret).

Exponential Attractors

We first recall that Definition 1 of a global attractor only guarantees that the images $S(t)B$ of all the bounded subsets converge to the attractor, without saying anything on the rate of convergence (in contrast to inertial manifolds, for which this rate of convergence can be controlled). Furthermore, as elementary examples show, this convergence can be arbitrarily slow, so that, until now, we have no effective way for estimating this rate of convergence in terms of the physical parameters of the system (an exception is given by the regular attractors described earlier for which the rate of convergence can be estimated in terms of the hyperbolicity constants of the equilibria. However, even in this situation, it is usually very difficult to estimate these constants for concrete equations). Furthermore, there exist many physically relevant systems (e.g., the so-called slightly dissipative gradient systems) which have trivial global attractors, but very rich and physically relevant transient dynamics which are automatically forgotten under the global-attractor approach. Another important problem is the robustness of the global attractor under perturbations. In fact, global attractors are usually only upper semicontinuous under
perturbations (which means that they cannot explode) and the lower semicontinuity (which means that they also cannot implode) is much more delicate to prove and requires some hyperbolicity assumptions (which are usually impossible to verify for concrete equations).

In order to overcome these difficulties, Eden–Foias–Nicolaenko–Temam have introduced an intermediate object (between inertial manifolds and global attractors), namely an exponential attractor (also called an inertial set).

Definition 3 A compact set $\mathcal{M} \subset \Phi$ is an exponential attractor for the DS $(S(t), \Phi)$ if

(i) $\mathcal{M}$ has finite fractal dimension: $d_f(\mathcal{M}) < \infty$;
(ii) $\mathcal{M}$ is semi-invariant: $S(t)\mathcal{M} \subset \mathcal{M}$, for all $t \ge 0$;
(iii) $\mathcal{M}$ attracts exponentially the images of all the bounded sets $B \subset \Phi$:

\[ \operatorname{dist}_H(S(t)B, \mathcal{M}) \le Q(\|B\|_{\Phi})\, e^{-\alpha t} \qquad [18] \]

where the positive constant $\alpha$ and the monotonic function $Q$ are independent of $B$.

Thus, on the one hand, an exponential attractor remains finite dimensional (like the global attractor) and, on the other hand, estimate [18] allows one to control the rate of attraction (like an inertial manifold). We note, however, that the relaxation of strict invariance to semi-invariance allows this object to be nonunique. So, we have here the problem of the “best choice” of the exponential attractor. We also mention that an exponential attractor, if it exists, always contains the global attractor.

Although the initial construction of exponential attractors is based on the so-called squeezing property (and requires Zorn’s lemma), we formulate below a simpler construction, due to Efendiev–Miranville–Zelik, which is similar to the method proposed by Ladyzhenskaya to verify the finite dimensionality of global attractors. This is done for discrete times and for a DS generated by iterations of some map $S : \Phi \to \Phi$, since the passage from discrete to continuous times usually raises no difficulty (without loss of generality, the reader may think that $S = S(1)$ and $(S(t), \Phi)$ is one of the DS mentioned in the introduction).

Theorem 3 Let the phase space $\Phi_0$ be a closed bounded subset of some Banach space $H$ and let $H_1$ be another Banach space compactly embedded into $H$. Assume also that the map $S : \Phi_0 \to \Phi_0$ satisfies the following “smoothing” property:

\[ \|Su_1 - Su_2\|_{H_1} \le K\|u_1 - u_2\|_{H}, \quad u_1, u_2 \in \Phi_0 \qquad [19] \]

for some constant $K$ independent of $u_i$. Then, the DS $(S, \Phi_0)$ possesses an exponential attractor.

In applications, $\Phi_0$ is usually a bounded absorbing/attracting set whose existence is guaranteed by the dissipative estimate [10], $H := L^2(\Omega)$ and $H_1 := H^1(\Omega)$. Furthermore, estimate [19] simply follows from the classical parabolic smoothing property, but now applied to the equation of variations (as in [11], hyperbolic equations require a slightly more complicated analogue of [19]). These simple arguments show that exponential attractors are as general as global attractors and, to the best of our knowledge, exponential attractors exist indeed for all the equations of mathematical physics for which the finite dimensionality of the global attractor can be established. Moreover, since $\mathcal{A} \subset \mathcal{M}$, this scheme can also be used to prove the finite dimensionality of global attractors.

It is finally worth emphasizing that the control on the rate of convergence provided by [18] makes exponential attractors much more robust than global attractors. In particular, they are upper and lower semicontinuous under perturbations (of course, up to the “best choice,” since they are not unique), as shown by Efendiev–Miranville–Zelik.

Essentially Infinite-Dimensional Dynamical Systems – The Case of Unbounded Domains

As already mentioned in the introduction, the theory of dissipative DS in unbounded domains is developing only now and the results given here are not as complete as for bounded domains. Nevertheless, we indicate below several of the most interesting (from our point of view) results concerning the general description of the dynamics generated by such problems by considering a system of reaction–diffusion equations [8] in $\mathbb{R}^n$ with phase space $\Phi = L^\infty(\mathbb{R}^n)$ as a model example (although all the results formulated below are general and depend weakly on the choice of equation).

Generalization of the Global Attractor and Kolmogorov’s ε-Entropy

We first note that Definition 1 of a global attractor is too strong for equations in unbounded domains. Indeed, as seen earlier, the compactness of the attractor is usually based on the compactness of the embedding $H^1(\Omega) \subset L^2(\Omega)$, which does not hold in unbounded domains. Furthermore, an attractor, in the sense of Definition 1, does not exist for most of the interesting examples of eqns [8] in $\mathbb{R}^n$.
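The loss of compactness in unbounded domains noted above can be illustrated on translates of a single bump. The sketch below is an illustration only and is not taken from the article: it assumes the one-dimensional domain $\mathbb{R}$, a hat function as the bump, and a plain Riemann sum for the $L^2$ norm; the names `hat` and `l2_distance` are hypothetical. All translates have the same $H^1$ norm, yet any two with disjoint supports stay at the same $L^2$ distance, $\sqrt{4/3}$, so the family has no convergent subsequence in $L^2(\mathbb{R})$.

```python
import math


def hat(x):
    """Compactly supported bump max(0, 1 - |x|); it belongs to H^1(R)."""
    return max(0.0, 1.0 - abs(x))


def l2_distance(shift_a, shift_b, half_width=50.0, dx=0.01):
    """Riemann-sum approximation of the L^2(R) distance between the
    translates hat(. - shift_a) and hat(. - shift_b).
    Both supports must lie inside [-half_width, half_width]."""
    steps = int(round(2 * half_width / dx))
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        diff = hat(x - shift_a) - hat(x - shift_b)
        total += diff * diff * dx
    return math.sqrt(total)


if __name__ == "__main__":
    # Translates by 0, 3, 6, ... have pairwise disjoint supports.
    shifts = [3.0 * k for k in range(5)]
    for a in shifts:
        print([round(l2_distance(a, b), 3) for b in shifts])
```

On a bounded domain the same translates would eventually leave the domain, which is why the construction only obstructs compactness when the domain is unbounded.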
It is natural to use instead the concept of the so-called locally compact global attractor which is well adapted to unbounded domains. This attractor $\mathcal{A}$ is only bounded in the phase space $\Phi = L^\infty(\mathbb{R}^n)$, but its restrictions $\mathcal{A}|_{\Omega}$ to all bounded domains $\Omega$ are compact in $L^\infty(\Omega)$. Moreover, the attraction property should also be understood in the sense of a local topology in $L^\infty(\mathbb{R}^n)$. It is known that this generalized global attractor $\mathcal{A}$ exists indeed for problem [8] in $\mathbb{R}^n$ (of course, under some “natural” assumptions on the nonlinearity $f$ and the diffusion matrix $a$). As for bounded domains, its existence is based on the dissipative estimate [10], the smoothing property [11], and the compactness of the embedding $H^1_{loc}(\mathbb{R}^n) \subset L^2_{loc}(\mathbb{R}^n)$ (we need to use the local topology only to have this compactness).

The next natural question that arises here is how to control the “size” of the attractor $\mathcal{A}$ if its fractal dimension is infinite (which is usually the case in unbounded domains). One of the most natural ways to handle this problem (which was first suggested by Chepyzhov–Vishik in the different context of uniform attractors associated with nonautonomous equations in bounded domains and appears as extremely fruitful for the theory of dissipative PDE in unbounded domains) is to study the asymptotics of Kolmogorov’s $\varepsilon$-entropy of the attractor. Actually, since the attractor $\mathcal{A}$ is compact only in a local topology, it is natural to study the entropy of its restrictions, say, to balls $B^R_{x_0}$ of $\mathbb{R}^n$ of radius $R$ centered at $x_0$ with respect to the three parameters $R$, $x_0$, and $\varepsilon$. A more or less complete answer to this question is given by the following estimate:

\[ H_\varepsilon\left(\mathcal{A}|_{B^R_{x_0}}\right) \le C\left(R + \log_2 (1/\varepsilon)\right)^n \log_2 (1/\varepsilon) \qquad [20] \]

where the constant $C$ is independent of $\varepsilon \le 1$, $R$, and $x_0$. Moreover, it can be shown that this estimate is sharp for all $R$ and $\varepsilon$ under the very weak additional assumption that eqn [8] possesses at least one exponentially unstable spatially homogeneous equilibrium.

Thus, formula [20] (whose proof is also based on a smoothing property for the equation of variations) can be interpreted as a natural generalization of the heuristic principle of finite dimensionality of global attractors to unbounded domains. It is also worth recalling that the entropy of the embedding of a ball $B_k$ of the space $C^k(B^R_{x_0})$ into $C(B^R_{x_0})$ has the asymptotic $H_\varepsilon(B) \sim C R^n (1/\varepsilon)^{n/k}$, which is essentially worse than [20]. So, [20] is not based on the smoothness of the attractor $\mathcal{A}$ and, therefore, reflects deeper properties of the equation.

Spatial Dynamics and Spatial Chaos

The next main difference with bounded domains is the existence of unbounded spatial directions which can generate the so-called spatial chaos (in addition to the “usual” temporal chaos arising under the evolution). In order to describe this phenomenon, it is natural to consider the group $\{T_h, h \in \mathbb{R}^n\}$ of spatial translations acting on the attractor $\mathcal{A}$:

\[ (T_h u_0)(x) := u_0(x + h), \quad T_h : \mathcal{A} \to \mathcal{A} \qquad [21] \]

as a DS (with multidimensional “times” if $n > 1$) acting on the phase space $\mathcal{A}$ and to study its dynamical properties.

In particular, it is worth noting that the lower bounds on the $\varepsilon$-entropy that one can derive imply that the topological entropy of this spatial DS is infinite and, consequently, the classical symbolic dynamics with a finite number of symbols is not adequate to clarify the nature of chaos in [21]. In order to overcome this difficulty, it was suggested by Zelik to use Bernoulli shifts with an infinite number of symbols, belonging to the whole interval $\omega \in [0, 1]$. To be more precise, let us consider the Cartesian product $\mathcal{M}_n := [0, 1]^{\mathbb{Z}^n}$ endowed with the Tikhonov topology. Then, this set can be interpreted as the space of all the functions $v : \mathbb{Z}^n \to [0, 1]$, endowed with the standard local topology. We define a DS $\{T^l, l \in \mathbb{Z}^n\}$ on $\mathcal{M}_n$ by

\[ (T^l v)(m) := v(m + l), \quad v \in \mathcal{M}_n, \quad l, m \in \mathbb{Z}^n \qquad [22] \]

Based on this model, the following description of spatial chaos was obtained.

Theorem 4 Let eqn [8] in $\Omega = \mathbb{R}^n$ possess at least one exponentially unstable spatially homogeneous equilibrium. Then, there exist $\alpha > 0$ and a homeomorphic embedding $\tau : \mathcal{M}_n \to \mathcal{A}$ such that

\[ T_{\alpha l} \circ \tau(v) = \tau \circ T^l(v), \quad \forall l \in \mathbb{Z}^n, \quad v \in \mathcal{M}_n \qquad [23] \]

Thus, the spatial dynamics, restricted to the set $\tau(\mathcal{M}_n)$, is conjugated to the symbolic dynamics on $\mathcal{M}_n$. Moreover, there exists a dynamical invariant (the so-called mean topological dimension) which is always finite for the spatial DS [21] and strictly positive for the Bernoulli scheme $\mathcal{M}_n$. So, the embedding [23] clarifies, indeed, the nature of chaos arising in the spatial DS [21].

Spatio-Temporal Chaos

To conclude, we briefly discuss an extension of Theorem 4, which takes into account the temporal modes and, thus, gives a description of the spatio-temporal chaos. In order to do so, we first note
that the spatial DS [21] obviously commutes with the temporal evolution operators $S(t)$ and, consequently, an extended $(n + 1)$-parametric semigroup $\{S(t, h), (t, h) \in \mathbb{R}_+ \times \mathbb{R}^n\}$ acts on the attractor:

\[ S(t, h) := S(t) \circ T_h, \quad S(t, h) : \mathcal{A} \to \mathcal{A}, \quad t \in \mathbb{R}_+, \quad h \in \mathbb{R}^n \qquad [24] \]

Then, this semigroup (interpreted as a DS with multidimensional times) is responsible for all the spatio-temporal dynamical phenomena in the initial PDE [8] and, consequently, the question of finding adequate dynamical characteristics is of great interest. Moreover, it is also natural to consider the subsemigroups $S_{V_k}(t, h)$ associated with the $k$-dimensional planes $V_k$ of the spacetime $\mathbb{R}_+ \times \mathbb{R}^n$, $k < n + 1$.

Although finding an adequate description of the dynamics of [24] seems to be an extremely difficult task, some particular results in this direction have already been obtained. Thus, it has been proved by Zelik that the semigroup [24] has finite topological entropy and the entropy of its subsemigroups $S_{V_k}(t, h)$ is usually infinite if $k < n + 1$. Moreover (adding a natural transport term of the form $(L, \nabla_x)u$ to eqn [8]), it was proved that the analog of Theorem 4 holds for the subsemigroups $S_{V_n}(t, h)$ associated with the $n$-dimensional hyperplanes $V_n$ of the spacetime. Thus, the infinite-dimensional Bernoulli shifts introduced in the previous subsection can be used to describe the temporal evolution in unbounded domains as well.

In particular, as a consequence of this embedding, the topological entropy of the initial purely temporal evolution semigroup $S(t)$ is also infinite, which indicates that (even without considering the spatial directions) we have indeed here essential new levels of dynamical complexity which are not observed in the classical DS theory of ODEs.

See also: Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Ergodic Theory; Evolution Equations: Linear and Nonlinear; Fractal Dimensions in Dynamics; Inviscid flows; Lyapunov Exponents and Strange Attractors.

Further Reading

Babin AV and Vishik MI (1992) Attractors of Evolution Equations. Amsterdam: North-Holland.
Chepyzhov VV and Vishik MI (2002) Attractors for Equations of Mathematical Physics. American Mathematical Society Colloquium Publications, vol. 49. Providence, RI: American Mathematical Society.
Faddeev LD and Takhtajan LA (1987) Hamiltonian Methods in the Theory of Solitons. Springer Series in Soviet Mathematics. Berlin: Springer.
Hale JK (1988) Asymptotic Behavior of Dissipative Systems. Mathematical Surveys and Monographs, vol. 25. Providence, RI: American Mathematical Society.
Katok A and Hasselblatt B (1995) Introduction to the Modern Theory of Dynamical Systems. Encyclopedia of Mathematics and its Applications, vol. 54. Cambridge: Cambridge University Press.
Ladyzhenskaya OA (1991) Attractors for Semigroups and Evolution Equations. Cambridge: Cambridge University Press.
Temam R (1997) Infinite-Dimensional Dynamical Systems in Mechanics and Physics. Applied Mathematical Sciences, 2nd edn., vol. 68. New York: Springer.
Zelik S (2004) Multiparametrical semigroups and attractors of reaction–diffusion equations in $\mathbb{R}^n$. Proceedings of the Moscow Mathematical Society 65: 69–130.

Donaldson Invariants see Gauge Theoretic Invariants of 4-Manifolds

Donaldson–Witten Theory
M Mariño, CERN, Geneva, Switzerland
© 2006 Elsevier Ltd. All rights reserved.

Since they were introduced by Witten in 1988, topological quantum field theories (TQFTs) have had a tremendous impact in mathematical physics (see Birmingham et al. (1991) and Cordes et al. for a review). These quantum field theories are constructed in such a way that the correlation functions of certain operators provide topological invariants of the spacetime manifold where the theory is defined. This means that one can use the methods and insights of quantum field theory in order to obtain information about topological invariants of low-dimensional manifolds.

Historically, the first TQFT was Donaldson–Witten theory, also called topological Yang–Mills theory. This theory was constructed by Witten (1988) starting from N = 2 super Yang–Mills by a procedure called
“topological twisting.” The resulting model is topological and the famous Donaldson invariants of 4-manifolds are then recovered as certain correlation functions in the topological theory. The analysis of Witten (1988) did not indicate any new method to compute the invariants, but in 1994 the progress in understanding the nonperturbative dynamics of N = 2 theories (Seiberg and Witten 1994a, b) led to an alternative way of computing correlation functions in Donaldson–Witten theory. As Witten (1994) showed, Donaldson–Witten theory can be reduced to another, simpler topological theory consisting of a twisted abelian gauge theory coupled to spinor fields. This theory leads to a different set of 4-manifold invariants, the so-called “Seiberg–Witten invariants,” and Donaldson invariants can be expressed in terms of these invariants through Witten’s “magic formula.” The connection between Seiberg–Witten and Donaldson invariants was streamlined and extended by Moore and Witten by using the so-called u-plane integral (Moore and Witten 1998). This has led to a rather complete understanding of Donaldson–Witten theory from a physical point of view.

In this article we provide a brief review of Donaldson–Witten theory. First, we describe the construction of the model, from both a mathematical and a physical point of view, and state the main results for the Donaldson–Witten generating functional. In the next section, we present the basic results of the u-plane integral of Moore and Witten and sketch how it can be used to solve Donaldson–Witten theory. In the final section, we mention some generalizations of the basic framework. For a complete exposition of Donaldson–Witten theory, the reader is referred to the book by Labastida and Mariño (2005). A short review of the u-plane integral can be found in Mariño and Moore (1998a).

Donaldson–Witten Theory: Basic Construction and Results

Donaldson–Witten Theory According to Donaldson

Donaldson theory as formulated in Donaldson (1990), Donaldson and Kronheimer (1990), and Friedman and Morgan (1991) starts with a principal $G = SO(3)$ bundle $V \to X$ over a compact, oriented, Riemannian 4-manifold $X$, with fixed instanton number $k$ and Stiefel–Whitney class $w_2(V)$ ($SO(3)$ bundles on a 4-manifold are classified up to isomorphism by these topological data). The moduli space of anti-self-dual (ASD) connections is then defined as

\[ \mathcal{M}_{ASD} = \{A : F^+(A) = 0\}/\mathcal{G} \qquad [1] \]

where $F^+(A)$ is the self-dual part of the curvature, and $\mathcal{G}$ is the group of gauge transformations. To construct the Donaldson polynomials, one considers the universal bundle

\[ P = (V \times \mathcal{A}^*)/(G \times \mathcal{G}) \qquad [2] \]

where $\mathcal{A}^*$ is the space of irreducible $G$-connections on $V$. This is a $G$-bundle over $\mathcal{B}^* \times X$, where $\mathcal{B}^* = \mathcal{A}^*/\mathcal{G}$ is the space of irreducible connections modulo gauge transformations, and as such has a Pontrjagin class

\[ p_1(P) \in H^*(\mathcal{B}^*) \otimes H^*(X) \qquad [3] \]

One can then obtain differential forms on $\mathcal{B}^*$ by taking the slant product of $p_1(P)$ with homology classes in $X$. In this way we obtain the Donaldson map:

\[ \mu : H_i(X) \to H^{4-i}(\mathcal{B}^*) \qquad [4] \]

After restriction to $\mathcal{M}_{ASD}$, we obtain the following differential forms on the moduli space of ASD connections:

\[ x \in H_0(X) \to \mathcal{O}(x) \in H^4(\mathcal{M}_{ASD}), \qquad S \in H_2(X) \to I_2(S) \in H^2(\mathcal{M}_{ASD}) \qquad [5] \]

If the manifold $X$ has $b_1(X) \neq 0$, there are also cohomology classes associated to 1-cycles and 3-cycles, but we will not consider them here.

We can now formally define the Donaldson invariants as follows. Consider the space

\[ A(X) = \operatorname{Sym}(H_0(X) \oplus H_2(X)) \qquad [6] \]

with a typical element written as $x^\ell S_{i_1} \cdots S_{i_p}$. The Donaldson invariant corresponding to this element of $A(X)$ is the following intersection number:

\[ D_X^{w_2(V),k}(x^\ell S_{i_1} \cdots S_{i_p}) = \int_{\mathcal{M}_{ASD}} \mathcal{O}^\ell \wedge I_2(S_{i_1}) \wedge \cdots \wedge I_2(S_{i_p}) \qquad [7] \]

where $\mathcal{M}_{ASD}$ is the moduli space of ASD connections with second Stiefel–Whitney class $w_2(V)$ and instanton number $k$. The integral in [7] will be different from zero only if the degrees of the forms add up to $\dim(\mathcal{M}_{ASD})$.

It is very convenient to pack all Donaldson invariants in a generating functional. Let $\{S_i\}_{i=1,\ldots,b_2}$ be a basis of 2-cycles. We introduce the formal sum

\[ S = \sum_{i=1}^{b_2} v_i S_i \qquad [8] \]
where $v_i$ are complex numbers. We then define the Donaldson–Witten generating functional as

\[ Z_{DW}^{w_2(V)}(p, v_i) = \sum_{k=0}^{\infty} D_X^{w_2(V),k}(e^{px + S}) \qquad [9] \]

where on the right-hand side we are summing over all instanton numbers, that is, we are summing over all topological configurations of the $SO(3)$ gauge field with a fixed $w_2(V)$. This gives a formal power series in $p$ and $v_i$.

For $b_2^+(X) > 1$, the generating functional [9] is a diffeomorphism invariant of $X$; therefore, it is potentially a powerful tool in four-dimensional topology. When $b_2^+(X) = 1$, Donaldson invariants are metric dependent. The metric dependence can be described in more detail as follows. Define the period point as the harmonic 2-form satisfying

\[ *\omega = \omega, \quad \omega^2 = 1 \qquad [10] \]

which depends on the conformal class of the metric. As the conformal class of the metric varies, $\omega$ describes a curve in the cone

\[ V_+ = \{\omega \in H^2(X, \mathbb{R}) : \omega^2 > 0\} \qquad [11] \]

Let $\zeta \in H^2(X)$ satisfy

\[ \zeta \equiv w_2(V) \bmod 2, \quad \zeta^2 < 0, \quad (\zeta, \omega) = 0 \qquad [12] \]

Such an element $\zeta$ defines a “wall” in $V_+$:

\[ W_\zeta = \{\omega : (\zeta, \omega) = 0\} \qquad [13] \]

The complements of these walls are called “chambers,” and the cone $V_+$ is then divided in chambers separated by walls. A class $\zeta$ satisfying [12] is the first Chern class associated to a reducible solution of the ASD equations, and it causes a singularity in moduli space: the Donaldson invariants jump when we pass through such a wall. Therefore, when $b_2^+(X) = 1$, Donaldson invariants are metric independent in each chamber. A basic problem in Donaldson–Witten theory is to determine the jump in the generating function as we cross a wall,

\[ Z_+(p, S) - Z_-(p, S) = WC_\zeta(p, S) \qquad [14] \]

The jump term $WC_\zeta(p, S)$ is usually called the “wall-crossing” term.

The basic goal of Donaldson theory is to study the properties of the generating functional [9] and to compute it for different 4-manifolds $X$. On the mathematical side, many results have been obtained on $Z_{DW}$, and some of them can be found in Donaldson and Kronheimer (1990), Friedman and Morgan (1991), Stern (1998), and Göttsche (1996). On the other hand, Donaldson theory can be formulated as a topological field theory, and many of these results can be obtained by using quantum field theory techniques. This will be our main focus for the rest of the article.

Donaldson–Witten Theory According to Witten

Witten (1988) constructed a twisted version of N = 2 super Yang–Mills theory which has a nilpotent Becchi–Rouet–Stora–Tyutin (BRST) charge (modulo gauge transformations)

\[ \overline{\mathcal{Q}} = \epsilon^{\dot\alpha A}\, \overline{Q}_{\dot\alpha A} \qquad [15] \]

where $\overline{Q}_{\dot\alpha A}$ are the supersymmetric (SUSY) charges. Here $\dot\alpha$ is a chiral spinor index and $A$ has its origin in the $SU(2)$ R-symmetry. The field content of the theory is the standard twisted N = 2 vector multiplet:

\[ A, \quad \psi_\mu = \psi_{\alpha\dot\alpha}, \quad \phi, \quad D^+_{\mu\nu}, \quad \chi^+_{\mu\nu} = \overline{\psi}_{\dot\alpha\dot\beta}, \quad \overline{\phi}, \quad \eta = \overline{\psi}^{\dot\alpha}{}_{\dot\alpha} \qquad [16] \]

where $(1/2) D^+_{\mu\nu}\, dx^\mu dx^\nu$ is a self-dual 2-form derived from the auxiliary fields, etc. All fields are valued in the adjoint representation of the gauge group. After twisting, the theory is well defined on any Riemannian 4-manifold, since the fields are naturally interpreted as differential forms and the $\overline{\mathcal{Q}}$ charge is a scalar (Witten 1988).

The observables of the theory are $\overline{\mathcal{Q}}$ cohomology classes of operators, and they can be constructed from 0-form observables $\mathcal{O}^{(0)}$ using the descent procedure. This amounts to solving the equations

\[ d\mathcal{O}^{(i)} = \{\overline{\mathcal{Q}}, \mathcal{O}^{(i+1)}\}, \quad i = 0, \ldots, 3 \qquad [17] \]

The integration over $i$-cycles $\gamma^{(i)}$ in $X$ of the operators $\mathcal{O}^{(i)}$ is then an observable. These descent equations have a canonical solution: the 1-form-valued operator $K_{\alpha\dot\alpha} = -i\delta_\alpha^A\, \overline{Q}_{\dot\alpha A}/4$ verifies

\[ d = \{\overline{\mathcal{Q}}, K\} \qquad [18] \]

as a consequence of the supersymmetry algebra. The operators $\mathcal{O}^{(i)} = K^i \mathcal{O}^{(0)}$ solve the descent equations [17] and are canonical representatives. When the gauge group is $SU(2)$, the observables are obtained by the descent procedure from the operator

\[ \mathcal{O} = \operatorname{tr}(\phi^2) \qquad [19] \]

The topological descendant $\mathcal{O}^{(2)}$ is given by

\[ \mathcal{O}^{(2)} = -\frac{1}{2} \operatorname{tr}\left( \frac{1}{\sqrt{2}}\, \phi\, (F^-_{\mu\nu} + D^+_{\mu\nu}) - \frac{1}{4}\, \psi_\mu \psi_\nu \right) dx^\mu \wedge dx^\nu \qquad [20] \]

and the resulting observable is

\[ I_2(S) = \int_S \mathcal{O}^{(2)} \qquad [21] \]
O and I2 (S) correspond to the cohomology classes in

space. We will label Spinc structures by the class
[5]. One of the main results of Witten (1988) is that  = c1 (L1=2 ) 2 H 2 (X, Z) þ w2 (X)=2. We say that  is
the semiclassical approximation in the twisted a Seiberg–Witten basic class if the corresponding
N = 2 Yang–Mills theory is exact. The semiclassical Seiberg–Witten invariants are not all zero. If MS W
evaluation of correlation functions of the observa- is zero dimensional, the Seiberg–Witten invariant
bles above leads directly to the definition of depends only on the Spinc structure associated to
Donaldson invariants, and the generating functional  = c1 (L1=2 ), and is denoted by SW().
[9] can be written as a correlation function of the 3. A manifold X is said to be of Seiberg–
twisted theory. One then has Witten simple type if all the Seiberg–Witten basic
D classes have a zero-dimensional moduli space. For
w2 ðVÞ  E
ZDW ðp; SÞ ¼ exp pO þ I2 ðSÞ ½22 simply connected 4-manifolds of Seiberg–Witten
simple type and with bþ 2 (X) > 1, Witten determined
the Seiberg–Witten contribution and proposed the
following ‘‘magic formula’’ for ZDW (Witten 1994b):
Results for the Donaldson–Witten
X h
Generating Function 2 2
ZDW ¼ 21þ7 =4þ11 =4 e2ið0 þ0 Þ e2pþS =2 e2ðS;Þ
The basic results that have emerged from the 
physical approach to Donaldson–Witten theory are 2 2
þ i h w2 ðVÞ e2pS =2 2iðS;Þ
e SWðÞ ½25
the following.
1. The Donaldson–Witten generating functional In this equation, , are the Euler characteristic and
is in general the sum of the two terms, signature of X, respectively, h = ( þ )=4 is the
holomorphic Euler characteristic of X, and 0
ZDW ¼ Zu þ ZSW ½23 is an integer lifting of w2 (V). This formula gen-
eralizes previous results by Witten (1994a) for
(We have omitted the Stiefel–Whitney class for
Kähler manifolds. It also follows from this formula
convenience.) The first term, Zu , is called the
that the Donaldson–Witten generating function of
‘‘u-plane integral.’’ It is given by a complicated
simply connected 4-manifolds of Seiberg–Witten
integral over C which can be written, in turn, as an
simple type and with bþ 2 (X) > 1 satisfies
integral over a fundamental domain of the con-
gruence subgroup 0 (4) of SL(2, Z). Zu depends  
only on the cohomology ring of X, and therefore  4 ZDW ¼ 0
does not contain any information beyond the one @p2
provided by classical topology. Finally, Zu vanishes
if bþ which is the Donaldson simple type condition
2 (X) > 1, and it is responsible for the wall- bi

crossing behavior of ZDW when bþ introduced by Kronheimer and Mrowka (1994).

2 (X) = 1.
2. The second term of [23], ZSW , is called the 4. Using the u-plane integral, one can find explicit
Seiberg–Witten contribution. This contribution expressions for ZDW in more general situations (like
involves the Seiberg–Witten invariants of X, which non-simply-connected manifolds or manifolds which
are obtained by considering the moduli problem are not of Seiberg–Witten simple type).
defined by the Seiberg–Witten monopole equations
In the next section we explain the formalism of bi
(Witten 1994b): the u-plane integral introduced by Moore and
Witten (1998), which makes possible a detailed
 ð_ M _ ¼ 0
_ þ 4iM
Þ derivation of the above results.
DL_ M_ ¼ 0

In these equations, M_ is a section of the spinor The u -Plane Integral

bundle Sþ  L1=2 , L is the determinant line bundle
of a Spinc structure on X, Fþ_
_ =  Fþ is the Definition of the u -Plane Integral
self-dual part of the curvature of a U(1) connection The evaluation of the Donaldson–Witten generating bi
on L, and DL is the Dirac operator for the bundle function can be made by using the results of Seiberg
Sþ  L1=2 . The solutions of these equations modulo and Witten (1994 a, b) on the low-energy dynamics
gauge equivalence form the moduli space MS W, of SU(2), N = 2 Yang–Mills theory. In their work,
and the Seiberg–Witten invariants are defined by Seiberg and Witten determined the exact low-energy
integrating suitable differential forms on this moduli effective action of the model up to two derivatives.
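Returning briefly to the simple type condition quoted in item 3: the variable p enters the magic formula [25] only through the factors $e^{2p}$ and $e^{-2p}$, so the operator $\partial^2/\partial p^2 - 4$ annihilates each term. The sketch below checks this numerically with a central finite difference. The coefficients `A` and `B` are arbitrary placeholders standing in for the p-independent parts of [25]; they are assumptions made for illustration, not values taken from the text.

```python
import math

def z_model(p, A=1.3, B=-0.7):
    """Toy p-dependence of the magic formula [25]: A*e^{2p} + B*e^{-2p}.

    A and B are hypothetical placeholders for the p-independent factors
    (Seiberg-Witten invariants, phases, S-dependence).
    """
    return A * math.exp(2 * p) + B * math.exp(-2 * p)

def simple_type_residual(p, h=1e-3):
    """Central-difference estimate of (d^2/dp^2 - 4) Z at the point p."""
    second = (z_model(p + h) - 2 * z_model(p) + z_model(p - h)) / h**2
    return second - 4 * z_model(p)

for p in (-0.5, 0.0, 0.8):
    # The residual is zero up to discretization and rounding error.
    print(p, simple_type_residual(p))
```

The printed residuals are small compared with the size of the function itself, and they shrink further as the step `h` is reduced (until rounding error takes over).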

From a physical point of view, there are certainly corrections to the two-derivative effective action which are difficult to evaluate. Fortunately, the computation in the twisted version of the theory can be done by just considering the Seiberg–Witten effective action. This is because the correlation functions in the twisted theory are invariant under rescalings of the metric, so we can evaluate them in the limit of large distances or equivalently of very low energies. The effective action up to two derivatives is sufficient for that purpose.

One way of describing the main result of the work of Seiberg and Witten is that the moduli space of Q-fixed points of the twisted SO(3) N = 2 theory on a compact 4-manifold has two branches, which we refer to as the Coulomb and Seiberg–Witten branches. On the Coulomb branch the expectation value

$$u = \langle \mathrm{tr}\,\phi^2 \rangle$$

breaks SO(3) → U(1) via the standard Higgs mechanism. The Coulomb branch is simply a copy of the complex u-plane. The low-energy effective theory on this branch is simply the abelian N = 2 gauge theory. However, at two points, $u = \pm 1$, there is a singularity where the moduli space meets a second branch, the Seiberg–Witten branch. At these points, the effective action is given by the magnetic dual of the U(1), N = 2 gauge theory coupled to a monopole matter hypermultiplet. Therefore, this branch consists of solutions to the Seiberg–Witten equations [24].

Since the manifold X is compact, the partition function of the twisted theory is a sum over ‘‘all’’ vacuum states. Equation [23] then follows. In this equation, $Z_u$ comes from ‘‘integrating over the u-plane,’’ while $Z_{SW}$ corresponds to the points $u = \pm 1$. As we stated before, $Z_u$ vanishes for manifolds of $b_2^+(X) > 1$, but once this piece has been determined an argument originally presented in Moore and Witten (1998) allows one to derive the form of $Z_{SW}$ as well for arbitrary $b_2^+(X) \geq 1$.

The computation of $Z_u$ is presented in detail in Moore and Witten (1998). The starting point of the computation is the untwisted low-energy theory, which has been described in detail in Seiberg and Witten (1994a, b) and Witten (1995). It is an N = 2 theory characterized by a prepotential $\mathcal{F}$ which depends on an N = 2 vector multiplet. The effective gauge coupling is given by $\tau(a) = \mathcal{F}''(a)$, where $a$ is the scalar component of the vector multiplet. The Euclidean Lagrange density for the u-plane theory can be obtained simply by twisting the physical theory. It can be written as

$$\frac{i}{6\pi} K^4 \mathcal{F}(a) + \frac{1}{16\pi}\left\{Q, \bar{\mathcal{F}}''\chi(D + F_+)\right\} - \frac{i\sqrt{2}}{32\pi}\left\{Q, \bar{\mathcal{F}}'\, d{*}\psi\right\} - \frac{\sqrt{2}\,i}{3\cdot 2^5\pi}\left\{Q, \bar{\mathcal{F}}'''\chi_{\mu\nu}\chi^{\nu\lambda}\chi_\lambda{}^{\mu}\right\}\sqrt{g}\,d^4x + A(u)\,\mathrm{tr}\,R\wedge\tilde R + B(u)\,\mathrm{tr}\,R\wedge R \qquad [26]$$

where $A(u)$, $B(u)$ describe the coupling to gravity, and after integration of the corresponding differential forms we obtain terms proportional to the signature and Euler characteristic of X. The data of the low-energy effective action can be encoded in an elliptic curve of the form

$$y^2 = x^3 - u x^2 + \tfrac{1}{4} x \qquad [27]$$

and $\tau$ is the modulus of the curve. The monodromy group of this curve is $\Gamma^0(4)$. All the quantities involved in the action can be obtained by integrating a certain meromorphic differential on the curve, and they can be expressed in terms of modular forms.

As for the operators, we have $u = O(P)$ by definition. We may then obtain the 2-observables from the descent procedure. The result is that $I(S) \to \tilde I(S) = \int_S K^2 u = \int_S (du/da)(D_+ + F_-) + \cdots$. Here $D_+$ is the auxiliary field. Although one has $I(S) \to \tilde I(S)$ in going from the microscopic theory to the effective theory, it does not necessarily follow that $I(S_1) I(S_2) \to \tilde I(S_1)\tilde I(S_2)$ because there can be contact terms. If $S_1$ and $S_2$ intersect, then in passing to the low-energy theory we integrate out massive modes. This can induce delta function corrections to the operator product expansion modifying the mapping to the low-energy theory as follows:

$$\exp(I(S)) \to \exp\left(\tilde I(S) + S^2\, T(u)\right) \qquad [28]$$

where $T(u)$ is the contact term. Such contact terms were observed in Witten (1994a) and studied in detail in Losev et al. (1998). It can be shown that

$$T(u) = -\frac{1}{24} E_2(\tau) \left(\frac{du}{da}\right)^2 + \frac{1}{3} u \qquad [29]$$

where $E_2(\tau)$ is Eisenstein's series and $da/du$ is one of the periods of the elliptic curve [27].

The final result of Moore and Witten is the following expression:

$$Z_u(p, S) = \int_{\mathbb{C}} \frac{du\, d\bar u}{y^{1/2}}\, \mu(\tau)\, e^{2pu + S^2 \hat T(u)}\, \Psi \qquad [30]$$
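The statement that the Coulomb branch is singular at u = ±1 can be checked directly on the curve [27]: the cubic on its right-hand side degenerates exactly where its discriminant vanishes. The following minimal Python sketch evaluates the standard cubic discriminant with exact rational arithmetic. The function names are illustrative choices, and the normalization is the textbook one for a cubic, which may differ by a constant factor from the discriminant Δ used by Moore and Witten.

```python
from fractions import Fraction

def cubic_discriminant(a, b, c, d):
    """Discriminant of the cubic a*x^3 + b*x^2 + c*x + d."""
    return (b**2 * c**2 - 4 * a * c**3 - 4 * b**3 * d
            - 27 * a**2 * d**2 + 18 * a * b * c * d)

def curve_discriminant(u):
    """Discriminant of x^3 - u*x^2 + x/4, the right-hand side of [27]."""
    u = Fraction(u)
    return cubic_discriminant(Fraction(1), -u, Fraction(1, 4), Fraction(0))

for u in (-1, 0, 1, 2):
    # Exact rational output; zero signals a degenerate (singular) curve.
    print(u, curve_discriminant(u))
```

For this curve the discriminant works out to $(u^2 - 1)/16$, so in the finite u-plane it vanishes only at u = ±1, the two points where the Seiberg–Witten branch meets the Coulomb branch.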

In the expression [30],

$$\mu(\tau) = \frac{d\bar\tau}{d\bar u}\left(\frac{da}{du}\right)^{1 - \chi/2} \Delta^{\sigma/8}, \qquad \hat T(u) = T(u) + \frac{(du/da)^2}{8\pi y}$$

where $y = \mathrm{Im}\,\tau$ and $\Delta$ is the discriminant of the curve [27]. The quantity $\Psi$ is essentially a Narain–Siegel theta function associated to the lattice $H^2(X, \mathbb{Z})$. Notice that this lattice is Lorentzian and has signature $(1, (-1)^{b_2^-(X)})$ (since $b_2^+(X) = 1$). The self-dual projection of a 2-form $\lambda$ can be done with the period point $\omega$ as $\lambda_+ = (\lambda, \omega)\omega$. The lattice is shifted by half the second Stiefel–Whitney class of the bundle, $w_2(V)$, that is,

$$\Gamma = H^2(X, \mathbb{Z}) + \tfrac{1}{2} w_2(V)$$

and

$$\Psi = \exp\left[-\frac{1}{8\pi y}\left(\frac{du}{da}\right)^2 S_-^2\right] e^{2\pi i \lambda_0^2} \sum_{\lambda\in\Gamma} (-1)^{(\lambda - \lambda_0)\cdot w_2(X)} \left[(\lambda, \omega) + \frac{i}{4\pi y}\frac{du}{da}(S, \omega)\right] \exp\left[-i\pi\bar\tau(\lambda_+)^2 - i\pi\tau(\lambda_-)^2 - i\frac{du}{da}(S, \lambda_-)\right] \qquad [32]$$

Here, $w_2(X)$ is the second Stiefel–Whitney class of X, and $\lambda_0$ is a choice of lifting of $w_2(V)$ to $H^2(X, \mathbb{Z})$. This expression can be extended to the non-simply-connected case (see Mariño and Moore (1999) and Moore and Witten (1998)).

The study of the u-plane integral leads to a systematic derivation of many important results in Donaldson–Witten theory. We will discuss in detail two such applications, Göttsche's wall-crossing formula and Witten's ‘‘magic formula.’’

Wall-Crossing Formula

As shown by Moore and Witten, the u-plane integral is well defined and does not depend on the period point (hence on the metric on X) except for discontinuous behavior at walls. There are two kinds of walls, associated, respectively, to the singularities at $u = \infty$ (the semiclassical region of the underlying Yang–Mills theory) and at $u = \pm 1$, given by

$$\begin{aligned} u = \infty:&\quad \lambda_+ = 0, \quad \lambda \in H^2(X, \mathbb{Z}) + \tfrac{1}{2} w_2(V) \\ u = \pm 1:&\quad \lambda_+ = 0, \quad \lambda \in H^2(X, \mathbb{Z}) + \tfrac{1}{2} w_2(X) \end{aligned} \qquad [33]$$

The first type of walls is precisely the one that appears in Donaldson theory on manifolds of $b_2^+(X) = 1$. The discontinuity of the u-plane integral at these walls can be easily computed from eqn [33]:

$$WC_{\xi = 2\lambda}(p, S) = -\frac{i}{2}(-1)^{(\lambda - \lambda_0, w_2(X))}\, e^{2\pi i \lambda_0^2} \left[ q^{-\lambda^2/2}\, h_\infty(\tau)^{-2}\, \vartheta_4^{\sigma}\, f_{1\infty} \exp\left(2p u_\infty + S^2 T_\infty - i(\lambda, S)/h_\infty\right) \right]_{q^0} \qquad [34]$$

This expression involves the modular forms $h_\infty$, $f_{1\infty}$, $u_\infty$, and $T_\infty$ (the subscript $\infty$ refers to the fact that they are computed at the ‘‘electric’’ frame which is appropriate for the Seiberg–Witten curve at $u \to \infty$). They can be written in terms of Jacobi theta functions $\vartheta_i(q)$, with $q = e^{2\pi i\tau}$, and their explicit expression is

$$\begin{aligned} h_\infty(q) &= \tfrac{1}{2}\vartheta_2(q)\vartheta_3(q) \\ f_{1\infty}(q) &= \frac{\vartheta_2(q)\vartheta_3(q)}{2\vartheta_4^8(q)} \\ u_\infty(q) &= \frac{\vartheta_2^4(q) + \vartheta_3^4(q)}{2(\vartheta_2(q)\vartheta_3(q))^2} \\ T_\infty(q) &= -\frac{1}{24}\frac{E_2(q)}{h_\infty^2(q)} + \frac{1}{3} u_\infty(q) \end{aligned}$$

The subindex $q^0$ means that in the expansion in $q$ of the modular forms, we pick the constant term. The formula [34] agrees with the formula of Göttsche (1996) for the wall crossing of the Donaldson–Witten generating functional.

The Seiberg–Witten Contribution and Witten's Magic Formula

At $u = \pm 1$, $Z_u$ jumps at the second type of walls [33], which are called Seiberg–Witten (SW) walls. In fact, these walls are labeled by classes $\lambda \in H^2(X; \mathbb{Z}) + (1/2) w_2(X)$, which correspond to Spin$^c$ structures on X. At these walls, the Seiberg–Witten invariants have wall-crossing behavior. Since the Donaldson polynomials do not jump at SW walls, it must happen that the change of $Z_u$ at $u = \pm 1$ is canceled by the change of $Z_{SW}$. As shown by Moore and Witten, this actually allows one to obtain a precise expression for $Z_{SW}$ for general 4-manifolds of $b_2^+(X) \geq 1$.

On general grounds, $Z_{SW}$ is given by the sum of the generating functionals at $u = \pm 1$. These involve a magnetic U(1), N = 2 vector multiplet coupled to a hypermultiplet (the monopole field). The twisted

Lagrangian for such a system involves the magnetic prepotential $\tilde{\mathcal{F}}_D(a_D)$, and it can be written as

$$\{Q, W\} + \frac{i}{16\pi}\,\tilde\tau_D\, F\wedge F + p(u)\,\mathrm{tr}\,R\wedge\tilde R + \ell(u)\,\mathrm{tr}\,R\wedge R - \frac{i\sqrt{2}}{32\pi}\frac{d\tilde\tau_D}{da_D}(\psi\wedge\psi)\wedge F + \frac{i}{3\cdot 2^7\pi}\frac{d^2\tilde\tau_D}{da_D^2}\,\psi\wedge\psi\wedge\psi\wedge\psi \qquad [36]$$

where $\tilde\tau_D = \tilde{\mathcal{F}}_D''(a_D)$. Using the cancellation of wall crossings, one can actually compute the functions $\tilde{\mathcal{F}}_D(a_D)$, $p(u)$, $\ell(u)$ and determine the precise form of the Seiberg–Witten contributions. One finds that a Spin$^c$ structure $\lambda$ at $u = 1$ gives the following contribution to the Donaldson–Witten generating functional:

$$Z^{SW}_{u=1,\lambda} = \frac{SW(\lambda)}{16}\, e^{2i\pi(\lambda_0^2 - \lambda_0\cdot\lambda)} \left[ q_D^{-\lambda^2/2}\, \frac{\vartheta_2^{8+\sigma}}{a_D h_M} \left(2i\frac{a_D}{h_M^2}\right)^{\chi_h} \exp\left(2p u_M + i(\lambda, S)/h_M + S^2 T_M\right) \right]_{q_D^0} \qquad [37]$$

Here, $a_D$, $h_M$, $u_M$, and $T_M$ are modular forms that can be expressed as well in terms of Jacobi theta functions $\vartheta_i(q_D)$, where $q_D = \exp(2\pi i\tau_D)$. The subscript M refers to the monopole point, and they are related by an S-transformation to the quantities obtained in the ‘‘electric’’ frame at $u \to \infty$. Their explicit expression is

$$\begin{aligned} a_D(q_D) &= -\frac{i}{6}\,\frac{2E_2(q_D) - \vartheta_3^4(q_D) - \vartheta_4^4(q_D)}{\vartheta_3(q_D)\vartheta_4(q_D)} \\ h_M(q_D) &= \frac{1}{2i}\vartheta_3(q_D)\vartheta_4(q_D) \\ u_M(q_D) &= \frac{1}{2}\,\frac{\vartheta_3^4(q_D) + \vartheta_4^4(q_D)}{(\vartheta_3(q_D)\vartheta_4(q_D))^2} \\ T_M(q_D) &= -\frac{1}{24}\frac{E_2(q_D)}{h_M^2(q_D)} + \frac{1}{3} u_M(q_D) \end{aligned} \qquad [38]$$

The contribution at $u = -1$ is related to the contribution at $u = 1$ by a $u \to -u$ symmetry:

$$Z_{u=-1}(p, S) = e^{-2\pi i\lambda_0^2}\, i^{(\chi + \sigma)/4}\, Z_{u=1}(-p, iS) \qquad [39]$$

If the manifold has $b_2^+(X) > 1$ and is of Seiberg–Witten simple type, [37] reduces to

$$(-1)^{\chi_h}\, 2^{1 + 7\chi/4 + 11\sigma/4}\, e^{2p + S^2/2}\, e^{-2(S,\lambda)}\, e^{2i\pi(\lambda_0^2 - \lambda_0\cdot\lambda)}\, SW(\lambda) \qquad [40]$$

This leads to Witten's ‘‘magic formula’’ [25] which expresses the Donaldson invariants in terms of Seiberg–Witten invariants.

Other Applications of the u-Plane Integral

The u-plane integral makes it possible to derive other results on the Donaldson–Witten generating functional.

The blow-up formula. This relates the function $Z_{DW}$ on X to $Z_{DW}$ on the blown-up manifold $\hat X$. The u-plane integral leads directly to the general blow-up formula of Fintushel and Stern (1996).

Direct evaluations. The u-plane integral can be evaluated directly in many cases, and this leads to explicit formulas for the Donaldson–Witten generating functional of certain 4-manifolds with $b_2^+(X) = 1$, on certain chambers, and in terms of modular forms. For example, there are explicit formulas for the Donaldson–Witten generating functional of product ruled surfaces of the form $S^2 \times \Sigma_g$ in the limiting chambers in which $S^2$ or $\Sigma_g$ are very small (Moore and Witten 1998, Mariño and Moore 1999). Moore and Witten (1998) have also derived an explicit formula for the Donaldson invariants of $\mathbb{CP}^2$ in terms of Hurwitz class numbers.

Extensions of Donaldson–Witten Theory

Donaldson–Witten theory is a twisted version of SU(2), N = 2 Yang–Mills theory. The twisting of more general N = 2 gauge theories, involving other gauge groups and/or matter content, leads to other topological field theories that give interesting generalizations of Donaldson–Witten theory. We now briefly list some of these extensions and their most important properties.

Higher-rank theories. The extension of Donaldson–Witten theory to other gauge groups has been studied in detail in Mariño and Moore (1998b) and Losev et al. (1998). One can study the higher-rank generalization of the u-plane integral, and as shown in Mariño and Moore (1998b), this leads to a fairly explicit formula for the Donaldson–Witten generating function in the SU(N) case, for manifolds with $b_2^+ > 1$ and of Seiberg–Witten simple type. Mathematically, higher-rank generalizations of Donaldson theory turn out to be much more complicated, but they can be studied. In particular, higher-rank generalizations of the Donaldson invariants can be defined and computed (Kronheimer 2004), and the results so far agree with the predictions of Mariño and Moore (1998b). Unfortunately it seems that these higher-rank generalizations do not contain new topological information, besides the one encoded in the Seiberg–Witten invariants.

Theories with matter. Twisted SU(2), N = 2 theories with hypermultiplets lead to generalizations of Donaldson–Witten theory involving nonabelian


monopole equations (see Mariño (1997) and Labastida and Mariño (2005) for a review of these models and some of their properties). The u-plane integral leads to explicit formulas for the generating functionals of these theories, which for manifolds of $b_2^+ > 1$ can be written in terms of Seiberg–Witten invariants. Again, no new topological information seems to be encoded in these theories. One can however exploit new physical phenomena arising in the theories with hypermultiplets (in particular, the presence of superconformal points) to obtain new information about the Seiberg–Witten invariants (see Mariño et al. (1999) for these developments).

Vafa–Witten theory. The so-called Vafa–Witten theory is a close cousin of Donaldson–Witten theory, and was introduced by Vafa and Witten (1994) as a topological twist of N = 4 Yang–Mills theory. In some cases, the partition function of this theory counts the Euler characteristic of the moduli space of instantons on the 4-manifold X. For a review of some properties of this theory, see Lozano (1999).

See also: Duality in Topological Quantum Field Theory; Mathai–Quillen Formalism; Seiberg–Witten Theory; Topological Quantum Field Theory: Overview.

Further Reading

Birmingham D, Blau M, Rakowski M, and Thompson G (1991) Topological field theories. Physics Reports 209: 129.
Cordes S, Moore G, and Ramgoolam S (1994) Lectures on 2D Yang–Mills theory, equivariant cohomology and topological field theory (arXiv:hep-th/9411210).
Donaldson SK (1990) Polynomial invariants for smooth four-manifolds. Topology 29: 257.
Donaldson SK and Kronheimer PB (1990) The Geometry of Four-Manifolds. Oxford: Clarendon.
Fintushel R and Stern RJ (1996) The blowup formula for Donaldson invariants. Annals of Mathematics 143: 529 (arXiv:alg-geom/9405002).
Friedman R and Morgan JW (1991) Smooth Four-Manifolds and Complex Surfaces. New York: Springer.
Göttsche L (1996) Modular forms and Donaldson invariants for four-manifolds with b+ = 1. Journal of American Mathematical Society 9: 827 (arXiv:alg-geom/9506018).
Kronheimer P (2004) Four-manifold invariants from higher-rank bundles (arXiv:math.GT/0407518).
Kronheimer P and Mrowka T (1994) Recurrence relations and asymptotics for four-manifold invariants. Bulletin of the American Mathematical Society 30: 215.
Kronheimer P and Mrowka T (1995) Embedded surfaces and the structure of Donaldson's polynomial invariants. Journal of Differential Geometry 33: 573.
Labastida J and Mariño M (2005) Topological Quantum Field Theory and Four-Manifolds. New York: Springer.
Losev A, Nekrasov N, and Shatashvili S (1998) Issues in topological gauge theory. Nuclear Physics B 534: 549 (arXiv:hep-th/9711108).
Lozano C (1999) Duality in topological quantum field theories (arXiv:hep-th/9907123).
Mariño M (1997) The geometry of supersymmetric gauge theories in four dimensions (arXiv:hep-th/9701128).
Mariño M and Moore G (1998a) Integrating over the Coulomb branch in N = 2 gauge theory. Nuclear Physics Proceedings Supplement 68: 336 (arXiv:hep-th/9712062).
Mariño M and Moore GW (1998b) The Donaldson–Witten function for gauge groups of rank larger than one. Communications in Mathematical Physics 199: 25 (arXiv:hep-th/9802185).
Mariño M and Moore GW (1999) Donaldson invariants for non-simply connected manifolds. Communications in Mathematical Physics 203: 249 (arXiv:hep-th/9804104).
Mariño M, Moore GW, and Peradze G (1999) Superconformal invariance and the geography of four-manifolds. Communications in Mathematical Physics 205: 691 (arXiv:hep-th/9812055).
Moore G and Witten E (1998) Integration over the u-plane in Donaldson theory. Advances in Theoretical and Mathematical Physics 1: 298 (arXiv:hep-th/9709193).
Seiberg N and Witten E (1994a) Electric–magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory. Nuclear Physics B 426: 19.
Seiberg N and Witten E (1994b) Electric–magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory: erratum. Nuclear Physics B 430: 485 (arXiv:hep-th/9407087).
Stern R (1998) Computing Donaldson invariants. In: Gauge Theory and the Topology of Four-Manifolds, IAS/Park City Mathematical Series. Providence, RI: American Mathematical Society.
Vafa C and Witten E (1994) A strong coupling test of S duality. Nuclear Physics B 431: 3 (arXiv:hep-th/9408074).
Witten E (1988) Topological quantum field theory. Communications in Mathematical Physics 117: 353.
Witten E (1994a) Supersymmetric Yang–Mills theory on a four manifold. Journal of Mathematical Physics 35: 5101 (arXiv:hep-th/9403195).
Witten E (1994b) Monopoles and four manifolds. Mathematical Research Letters 1: 769 (arXiv:hep-th/9411102).
Witten E (1995) On S duality in Abelian gauge theory. Selecta Mathematica 1: 383 (arXiv:hep-th/9505186).
118 Duality in Topological Quantum Field Theory

Duality in Topological Quantum Field Theory

C Lozano, INTA, Madrid, Spain
J M F Labastida, CSIC, Madrid, Spain
© 2006 Elsevier Ltd. All rights reserved.

Introduction

There have been many exciting interactions between physics and mathematics in the past few decades. A prominent role in these interactions has been played by certain field theories, known as topological quantum field theories (TQFTs). These are quantum field theories whose correlation functions are metric independent and, in fact, compute certain mathematical invariants (Birmingham et al. 1991, Cordes et al. 1996, Labastida and Lozano 1998). Well-known examples of TQFTs are, in two dimensions, the topological sigma models (Witten 1988a), which are related to Gromov–Witten invariants and enumerative geometry; in three dimensions, Chern–Simons theory (Witten 1989), which is related to knot and link invariants; and in four dimensions, topological Yang–Mills theory (or Donaldson–Witten theory) (Witten 1988b), which is related to the Donaldson invariants. The two- and four-dimensional theories above are examples of cohomological (also Witten-type) TQFTs. As such, they are related to an underlying supersymmetric quantum field theory (the N = 2 nonlinear sigma model, and the N = 2 supersymmetric Yang–Mills theory, respectively) and there is no difference between the topological and the standard version on flat space. However, when one considers curved spaces, the topological version differs from the supersymmetric theory on flat space in that some of the fields have modified Lorentz transformation properties (spins). This unconventional spin assignment is also known as twisting, and it comes about basically to preserve supersymmetry on curved space. In fact, the twisting gives rise to at least one nilpotent scalar supercharge Q, which is a certain linear combination of the original (spinor) supersymmetry generators.

In these theories the energy momentum tensor is Q-exact, that is,

$$T_{\mu\nu} = \{Q, \Lambda_{\mu\nu}\}$$

for some $\Lambda_{\mu\nu}$, which (barring potential anomalies) leads to the statement that the correlation functions of operators in the cohomology of Q are all metric independent. Furthermore, the corresponding path integrals are localized to field configurations that are annihilated by Q, and this typically leads to some moduli problem related to the computation of certain mathematical invariants.

On the other hand, in Chern–Simons theory, as a representative of the so-called Schwarz-type topological theories, the topological character is manifest: one starts with an action which is explicitly independent of the metric on the 3-manifold, and thus correlation functions of metric-independent operators are topological invariants as long as quantization does not introduce any undesired metric dependence.

Even though the primary motivation for introducing TQFTs may be to shed light onto awkward mathematical problems, they have proved to be a valuable tool to gain insight into many questions of interest in physics as well. One such question where TQFTs can (and in fact do) play a role is duality. In what follows, an overview of the manifestations of duality is provided in the context of TQFTs.

The notion of duality is at the heart of some of the most striking recent breakthroughs in physics and mathematics. In broad terms, a duality (in physics) is an equivalence between different (and often complementary) descriptions of the same physical system. The prototypical example is electric–magnetic (abelian) duality. Other, more sophisticated, examples are the various string-theory dualities, such as T-duality (and its more specialized realization, mirror symmetry) and strong/weak coupling S-duality, as well as field theory dualities such as Montonen–Olive duality and Seiberg–Witten effective duality.

Also, the original ’t Hooft conjecture, stating that SU(N) gauge theories are equivalent (or dual), at large N, to string theories, has recently been revived by Maldacena (1998) by explicitly identifying the string-theory duals of certain (supersymmetric) gauge theories.

One could wonder whether similar duality symmetries work for TQFTs as well. As noted in the following, this is indeed the case.

In two dimensions, topological sigma models come under two different versions, known as types A and B, respectively, which correspond to the two different ways in which N = 2 supersymmetry can be twisted in two dimensions. Computations in each model localize on different moduli spaces and, for a given target manifold, give different results, but it turns out that if one considers mirror pairs of Calabi–Yau manifolds,

computations in one manifold with the A-model are equivalent to computations in the mirror manifold with the B-model.

Also, in three dimensions, a program has been initiated to explore the duality between large N Chern–Simons gauge theory and topological strings, thereby establishing a link between enumerative geometry and knot and link invariants (Gopakumar and Vafa 1998).

Perhaps the most impressive consequences of the interplay between duality and TQFTs have come out in four dimensions, on which we will focus in what follows.

Duality in Twisted N = 2 Theories

As mentioned above, topological Yang–Mills theory (or Donaldson–Witten theory) can be constructed by twisting the pure N = 2 supersymmetric Yang–Mills theory with gauge group SU(2). This theory contains a gauge field $A$, a pair of chiral spinors $\lambda_1$, $\lambda_2$, and a complex scalar field $B$. The twisted theory contains a gauge field $A$, bosonic scalars $\phi$, $\lambda$, a Grassmann-odd scalar $\eta$, a Grassmann-odd vector $\psi$, and a Grassmann-odd self-dual 2-form $\chi$.

On a 4-manifold X, and for gauge group G, the twisted action has the form

$$S = \int_X d^4x\,\sqrt{g}\;\mathrm{tr}\left[ F^{+2} - i\chi^{\mu\nu}D_\mu\psi_\nu + i\eta D_\mu\psi^\mu + \frac{1}{4}\phi\{\chi_{\mu\nu}, \chi^{\mu\nu}\} + \frac{i}{4}\lambda\{\psi_\mu, \psi^\mu\} - \lambda D_\mu D^\mu\phi + \frac{i}{2}\phi\{\eta, \eta\} + \frac{1}{8}[\phi, \lambda]^2 \right] \qquad [1]$$

where $F^+$ is the self-dual part of the Yang–Mills field strength $F$. The action [1] is invariant under the transformations generated by the scalar supercharge Q:

$$\begin{aligned} \{Q, A_\mu\} &= \psi_\mu, & \{Q, \chi\} &= F^+ \\ \{Q, \psi\} &= d_A\phi, & \{Q, \eta\} &= i[\lambda, \phi] \\ \{Q, \phi\} &= 0, & \{Q, \lambda\} &= \eta \end{aligned} \qquad [2]$$

In these transformations, $Q^2$ is a gauge transformation with gauge parameter $\phi$, modulo field equations. Observables are, therefore, related to the G-equivariant cohomology of Q (i.e., the cohomology of Q restricted to gauge invariant operators).

Auxiliary fields can be introduced so that the action [1] is Q-exact, that is,

$$S = \{Q, \Psi\} \qquad [3]$$

for $\Psi$ a certain functional of the fields of the theory which comes under the name of gauge fermion, a BRST-inspired terminology which reflects the formal resemblance of topological cohomological field theories with some aspects of the BRST approach to the quantization of gauge theories. Before constructing the topological observables of the theory, we begin by pointing out that for each independent Casimir of the gauge group G it is possible to construct an operator $W_0$, from which operators $W_i$ can be defined recursively through the descent equations $\{Q, W_i\} = dW_{i-1}$. For example, for the quadratic Casimir,

$$W_0 = \frac{1}{8\pi^2}\,\mathrm{tr}(\phi^2) \qquad [4]$$

which generates the following family of operators:

$$\begin{aligned} W_1 &= \frac{1}{4\pi^2}\,\mathrm{tr}(\phi\psi) \\ W_2 &= \frac{1}{4\pi^2}\,\mathrm{tr}\left(\frac{1}{2}\psi\wedge\psi + \phi\wedge F\right) \\ W_3 &= \frac{1}{4\pi^2}\,\mathrm{tr}(\psi\wedge F) \end{aligned} \qquad [5]$$

Using these one defines the following observables:

$$O^{(k)} = \int_{\gamma_k} W_k \qquad [6]$$

where $\gamma_k \in H_k(X)$ is a k-cycle on the 4-manifold X. The descent equations imply that they are Q-closed and depend only on the homology class of $\gamma_k$.

Topological invariants are constructed by taking vacuum expectation values of products of the operators $O^{(k)}$:

$$\left\langle O^{(k_1)} O^{(k_2)} \cdots O^{(k_p)} \right\rangle = \int O^{(k_1)} O^{(k_2)} \cdots O^{(k_p)}\, e^{-S/e^2} \qquad [7]$$

where the integration has to be understood on the space of field configurations modulo gauge transformations, and $e$ is a coupling constant. Standard arguments show that due to the Q-exactness of the action S, the quantities obtained in [7] are independent of $e$. This implies that the observables of the theory can be obtained either in the weak-coupling limit $e \to 0$ (also short-distance or ultraviolet regime, since the N = 2 theory is asymptotically free), where perturbative methods apply, or in the strong-coupling (also long-distance or infrared) limit $e \to \infty$, where one is forced to consider a nonperturbative approach.

In the weak-coupling limit one proves that the correlation functions [7] descend to polynomials in the product cohomology of the moduli space of anti-self-dual (ASD) instantons $H^{k_1}(\mathcal{M}_{ASD}) \times H^{k_2}(\mathcal{M}_{ASD}) \times \cdots \times H^{k_p}(\mathcal{M}_{ASD})$, which are precisely

the Donaldson polynomial invariants of X. However, the weak-coupling analysis does not add any new ingredient to the problem of the actual computation of the invariants. The difficulties that one has to face in the field theory representation are similar to those in ordinary Donaldson theory. Nevertheless, the field theory connection is very important since in this theory the strong- and weak-coupling limits are exact, and therefore the door is open to find a strong-coupling description which could lead to a new, simpler representation for the Donaldson invariants.

This alternative strategy was pursued by Witten (1994a), who found the strong-coupling realization of the Donaldson–Witten theory after using the results on the strong-coupling behavior of N = 2 supersymmetric gauge theories which he and Seiberg (Seiberg and Witten 1994a–c) had discovered. The key ingredient in Witten's derivation was to assume that the strong-coupling limit of Donaldson–Witten theory is equivalent to the ‘‘sum’’ over the twisted effective low-energy descriptions of the corresponding N = 2 physical theory. This ‘‘sum’’ is not entirely a sum, as in general it has a part which contains a continuous integral. The ‘‘sum’’ is now known as integration over the u-plane after the work of Moore and Witten (1998). Witten's (1994a) assumption can be simply stated as saying that the weak-/strong-coupling limit and the twist commute. In other words, to study the strong-coupling limit of the topological theory, one first untwists, then works out the strong-coupling limit of the physical theory and, finally, one twists back. From such a viewpoint, the twisted effective (strong-coupling) theory can be regarded as a TQFT dual to the original one. In addition, one could ask for the dual moduli problem associated to this dual TQFT. It turns out that in many interesting situations ($b_2^+(X) > 1$) the dual moduli space is an abelian system corresponding to the Seiberg–Witten or monopole equations (Witten 1994a). The topological invariants associated with this new moduli space are the celebrated Seiberg–Witten invariants.

Generalizations of Donaldson–Witten theory, with either different gauge groups and/or additional matter content (such as, e.g., twisted N = 2 Yang–Mills multiplets coupled to twisted N = 2 matter multiplets) are possible, and some of the possibilities have in fact been explored (see Moore and Witten (1998) and references therein). The main conclusion that emerges from these analyses is that, in all known cases, the relevant topological information is captured by the Seiberg–Witten invariants, irrespective of the gauge group and matter content of the theory under consideration. These cases are not reviewed here, but rather the attention is turned to the twisted theories which emerge from N = 4 supersymmetric gauge theories.

Duality in Twisted N = 4 Theories

Unlike the N = 2 supersymmetric case, the N = 4 supersymmetric Yang–Mills theory in four dimensions is unique once the gauge group G is fixed. The microscopic theory contains a gauge or gluon field, four chiral spinors (the gluinos) and six real scalars. All these fields are massless and take values in the adjoint representation of the gauge group. The theory is finite and conformally invariant, and is conjectured to have a duality symmetry exchanging strong and weak coupling and exchanging electric and magnetic fields, which extends to a full SL(2, Z) symmetry acting on the microscopic complexified coupling (Montonen and Olive 1977)

$$\tau = \frac{\theta}{2\pi} + \frac{4\pi i}{e^2} \qquad [8]$$

As in the N = 2 case, the N = 4 theory can be twisted to obtain a topological model, only that, in this case, the topological twist can be performed in three inequivalent ways, giving rise to three different TQFTs (Vafa and Witten 1994). A natural question to answer is whether the duality properties of the N = 4 theory are shared by its twisted counterparts and, if so, whether one can take advantage of the calculability of topological theories to shed some light on the behavior and properties of duality.

The answer is affirmative, but it is instructive to clarify a few points. First, as mentioned above, the topological observables in twisted N = 2 theories are independent of the coupling constant $e$, so the question arises as to how the twisted N = 4 theories come to depend on the coupling constant. As it turns out, twisted N = 2 supersymmetric gauge theories have an off-shell formulation such that the TQFT action can be expressed as a Q-exact expression, where Q is the generator of the topological symmetry. Actually, this is true only up to a topological $\theta$-term $\int_X \mathrm{tr}(F\wedge F)$,

$$S = \frac{1}{2e^2}\int_X \sqrt{g}\, d^4x\, \{Q, \Psi\} - 2\pi i\tau\, \frac{1}{16\pi^2}\int_X \mathrm{tr}(F\wedge F) \qquad [9]$$

for some $\Psi$. However, the N = 2 supersymmetric gauge theories possess a global U(1) chiral symmetry which is generically anomalous, so one can actually

get rid of the $\theta$-term with a chiral rotation. As a result of this, the observables in the topological theory are insensitive to $\theta$-terms (and hence to $\theta$ and e) up to a rescaling.
On the other hand, in N = 4 supersymmetric gauge theories $\theta$-terms are observable. There is no chiral anomaly and these terms cannot be shifted away as in the N = 2 case. This means that in the twisted theories one might have a dependence on the coupling constant $\tau$, and that, up to anomalies, this dependence should be holomorphic (resp. antiholomorphic if one reverses the orientation of the 4-manifold). In fact, on general grounds, one would expect for the partition functions of the twisted theories on a 4-manifold X and for gauge group G to take the generic form
$$Z_X(G) = q^{-c(X,G)} \sum_k q^k\, \Upsilon(\mathcal{M}_k) \qquad [10]$$
where $q = e^{2\pi i \tau}$, c is a universal constant (depending on X and G), $k = (1/16\pi^2)\int_X \mathrm{tr}(F \wedge F)$ is the instanton number, and $\Upsilon(\mathcal{M}_k)$ encodes the topological information corresponding to a sector of the moduli space of the theory with instanton number k.
Now we can be more precise as to how we expect to see the Montonen–Olive duality in the twisted N = 4 theories. First, under $\tau \to -1/\tau$ the gauge group G gets exchanged with its dual group $\hat{G}$. Correspondingly, the partition functions should behave as modular forms
$$Z_G(-1/\tau) = \Phi(X, G)\, \tau^{w}\, Z_{\hat{G}}(\tau) \qquad [11]$$
where $\Phi$ is a constant (depending on X and G), and the modular weight w should depend on X in such a way that it vanishes on flat space.
In addition to this, in the N = 4 theory all the fields take values in the adjoint representation of G. Hence, if $H^2(X, \pi_1(G)) \neq 0$, it is possible to consider nontrivial G/Center(G) gauge configurations with discrete magnetic 't Hooft flux through the 2-cycles of X. In fact, G/Center(G) bundles on X are classified by the instanton number and a characteristic class $v \in H^2(X, \pi_1(G))$. For example, if G = SU(2), we have $\hat{G} = SU(2)/Z_2 = SO(3)$ and v is the second Stiefel–Whitney class $w_2(E)$ of the gauge bundle E. This Stiefel–Whitney class can be represented in de Rham cohomology by a class in $H^2(X, Z)$ defined modulo 2, that is, $w_2(E)$ and $w_2(E) + 2\omega$, with $\omega \in H^2(X, Z)$, represent the same 't Hooft flux, so if $w_2(E) = 2\lambda$, for some $\lambda \in H^2(X, Z)$, then the gauge configuration is trivial in SO(3) (it has no 't Hooft flux).
Similarly, for G = SU(N) (for which $\hat{G} = SU(N)/Z_N$), one can fix fluxes in $H^2(X, Z_N)$ (the corresponding Stiefel–Whitney class is defined modulo N). One has, therefore, a family of partition functions $Z_v(\tau)$, one for each magnetic flux v. The SU(N) partition function is obtained by considering the zero flux partition function (up to a constant factor), while the (dual) $SU(N)/Z_N$ partition function is obtained by summing over all v, and both are to be exchanged under $\tau \to -1/\tau$. The action of SL(2, Z) on the $Z_v$ should be compatible with this exchange, and thus the $\tau \to -1/\tau$ operation mixes the $Z_v$ by a discrete Fourier transform which, for G = SU(N), reads
$$Z_v(-1/\tau) = \Phi(X, G)\, \tau^{w} \sum_u e^{2\pi i u \cdot v / N}\, Z_u(\tau) \qquad [12]$$
We are now in a position to examine the (three) twisted theories in some detail. For further details and references, the reader is referred to Lozano (1999).
The first twisted theory considered here possesses only one scalar supercharge (and hence comes under the name of "half-twisted theory"). It is a nonabelian generalization of the Seiberg–Witten abelian monopole theory, but with the monopole multiplets taking values in the adjoint representation of the gauge group. The theory can be perturbed by giving masses to the monopole multiplets while still retaining its topological character. The resulting theory is the twisted version of the mass-deformed N = 4 theory, which preserves N = 2 supersymmetry and whose low-energy effective description is known. This connection with N = 2 theories, and its topological character, makes it possible to go to the long-distance limit and compute in terms of the twisted version of the low-energy effective description of the supersymmetric theory. Below, we review how the u-plane approach works for gauge group SU(2).
The twisted theory for gauge group SU(2) has a U(1) global symmetry (the ghost number) which has an anomaly $-3(2\chi + 3\sigma)/4$ on gravitational backgrounds (i.e., on curved manifolds). Nontrivial topological invariants are thus obtained by considering the vacuum expectation value of products of observables with ghost numbers adding up to $-3(2\chi + 3\sigma)/4$. The relevant observables for this theory and gauge group SU(2) or SO(3) are precisely the same as in the Donaldson–Witten theory (eqns [4] and [5]). In addition to this, it is possible to enrich the theory by including sectors with nontrivial nonabelian electric and magnetic 't Hooft fluxes which, as pointed out above, should behave under SL(2, Z) duality in a well-defined fashion.
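The discrete Fourier transform of eqn [12] can be illustrated with a small numerical sketch. It rests on assumptions that are not in the text: the fluxes are labelled by a single integer modulo N with pairing u·v = uv, the prefactor of [12] is replaced by the normalization N^(-1/2), and the input values Z_u are arbitrary numbers chosen for illustration. Under these assumptions the transform exchanges the zero-flux component with the sum over all fluxes, which mirrors the exchange of the SU(N) and SU(N)/Z_N partition functions described above.

```python
import cmath
import math


def flux_fourier_transform(z):
    """Toy version of eqn [12]: Z'_v = N^(-1/2) * sum_u exp(2*pi*i*u*v/N) * Z_u.

    Assumptions (not from the article): fluxes are integers mod N with
    pairing u.v = u*v, and the tau-dependent prefactor of [12] is dropped.
    """
    n = len(z)
    return [
        sum(cmath.exp(2j * math.pi * u * v / n) * z[u] for u in range(n))
        / math.sqrt(n)
        for v in range(n)
    ]


# Hypothetical values for Z_0, ..., Z_4 (N = 5); they carry no physical meaning.
z = [1.0 + 0.5j, 0.3, -0.2j, 0.7, 0.1 + 0.1j]
n = len(z)
zt = flux_fourier_transform(z)

# The transformed zero-flux sector is the sum over all original fluxes,
# and the sum over all transformed fluxes is the original zero-flux sector.
zero_flux_after = zt[0]
sum_after = sum(zt)
print(abs(zero_flux_after - sum(z) / math.sqrt(n)))
print(abs(sum_after - math.sqrt(n) * z[0]))
```

Both printed differences are at the level of floating-point rounding, because the sum over v of exp(2πiuv/N) vanishes unless u = 0.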

The generating function for these correlation functions is given as an integration over the moduli space of vacua of the physical theory (the u-plane), which, for generic values of the mass parameter, forms a one-dimensional complex compact manifold (described by a complex variable customarily denoted by u, hence the name), which parametrizes a family of elliptic curves that encodes all the relevant information about the low-energy effective description of the theory. At a generic point in the moduli space of vacua, the only contribution to the topological correlation functions comes from a twisted N = 2 abelian vector multiplet. Additional contributions come from points in the moduli space where the low-energy effective description is singular (i.e., where the associated elliptic curve degenerates).
The total contribution to the generating function therefore consists of an integration over the moduli space with the singularities removed, which is nonvanishing only for $b_2^+(X) = 1$ (Moore and Witten 1998), plus a discrete sum over the contributions of the twisted effective theories at each of the three singularities of the low-energy effective description (Seiberg and Witten 1994a, b, c). The effective theory at a given singularity contains, together with the appropriate dual photon multiplet, one charged hypermultiplet, which corresponds to the state becoming massless at the singularity. The complete effective action for these massless states also contains certain measure factors and contact terms among the observables, which reproduce the effect of the massive states that have been integrated out as well as incorporate the coupling to gravity (i.e., explicit nonminimal couplings to the metric of the 4-manifold). How to determine these a priori unknown functions was explained in Moore and Witten (1998). The idea is as follows. At points on the u-plane where the (imaginary part of the) effective coupling diverges, the integral is discontinuous at anti-self-dual abelian gauge configurations. This is commonly referred to as "wall crossing." Wall crossing can take place at the singularities of the moduli space, where the appropriate local effective coupling $\tau_{\mathrm{eff}}$ diverges, and, in the case of the asymptotically free theories, at the point at infinity, where the effective electric coupling diverges owing to asymptotic freedom.
On the other hand, the final expression for the invariants can exhibit a wall-crossing behavior at most at $u \to \infty$, so the contribution to wall crossing from the integral at the singularities at finite values of u must cancel against the contributions coming from the effective theories there, which also display wall-crossing discontinuities. Imposing this cancellation fixes almost completely the unknown functions in the contributions to the topological correlation functions from the singularities. The final result for the contributions from the singularities (which give the complete answer for the correlation functions when $b_2^+(X) > 1$) is written explicitly and completely in terms of the fundamental periods da/du (written in the appropriate local variables) and the discriminant of the elliptic curve comprising the Seiberg–Witten solution for the physical theory. For simply connected spin 4-manifolds of simple type the generating function is given by
$$\begin{aligned}
\langle e^{pO + I(S)} \rangle_v ={}& 2^{-(\nu/2 + (2\chi + 3\sigma)/8)}\, m^{-(3\chi + 7\sigma)/8}\, (\eta(\tau))^{-12\nu} \\
&\times \Bigg\{ (\kappa_1)^{\nu} \left(\frac{da}{du}\right)_1^{-(\nu + \sigma/4)} e^{2pu_1 + S^2 T_1} \sum_x \delta_{[x/2], v}\, n_x\, e^{(i/2)(du/da)_1\, x \cdot S} \\
&\quad + 2^{-b_2/2} (-1)^{\sigma/8} (\kappa_2)^{\nu} \left(\frac{da}{du}\right)_2^{-(\nu + \sigma/4)} e^{2pu_2 + S^2 T_2} \sum_x (-1)^{v \cdot x/2}\, n_x\, e^{(i/2)(du/da)_2\, x \cdot S} \\
&\quad + 2^{-b_2/2}\, i^{-v^2} (\kappa_3)^{\nu} \left(\frac{da}{du}\right)_3^{-(\nu + \sigma/4)} e^{2pu_3 + S^2 T_3} \sum_x (-1)^{v \cdot x/2}\, n_x\, e^{(i/2)(du/da)_3\, x \cdot S} \Bigg\}
\end{aligned} \qquad [13]$$
where x is a Seiberg–Witten basic class (and $n_x$ is the corresponding Seiberg–Witten invariant), m is the mass parameter of the theory, $\nu = (\chi + \sigma)/4$, $v \in H^2(X, Z_2)$ is a 't Hooft flux, S is the formal sum $S = \sum_a \alpha_a \Sigma_a$ (and, correspondingly, $I(S) = \sum_a \alpha_a I(\Sigma_a)$, with $I(\Sigma_a) = \int_{\Sigma_a} W_2$), where $\{\Sigma_a\}_{a = 1, \ldots, b_2(X)}$ form a basis of $H_2(X)$ and $\alpha_a$ are constant parameters, while $\eta(\tau)$ is the Dedekind function, $\kappa_i = (du/dq_{\mathrm{eff}})_{u = u_i}$ (with $q_{\mathrm{eff}} = \exp(2\pi i \tau_{\mathrm{eff}})$, and $\tau_{\mathrm{eff}}$ is the ratio of the fundamental periods of the elliptic curve), and the contact terms $T_i$ have the form
$$T_i = -\frac{1}{12}\left(\frac{du}{da}\right)_i^2 + E_2(\tau)\, \frac{u_i}{6} + \frac{m^2}{72}\, E_4(\tau) \qquad [14]$$
with $E_2$ and $E_4$ the Eisenstein series of weights 2 and 4, respectively. Evaluating the quantities in [13] gives the final result as a function of the physical parameters $\tau$ and m, and of topological data of X such as the Euler characteristic $\chi$, the signature $\sigma$ and the basic classes x. The expression [13] has to be understood as a formal power series in p and $\alpha_a$, whose coefficients give the vacuum expectation values of products of $O = W_0$ and $I(\Sigma_a)$.

The generating function [13] has nice properties under the modular group. For the partition function $Z_v$,
$$\begin{aligned}
Z_v(\tau + 1) &= (-1)^{\sigma/8}\, i^{-v^2}\, Z_v(\tau) \\
Z_v(-1/\tau) &= 2^{-b_2/2} (-1)^{\sigma/8} \left(\frac{\tau}{i}\right)^{-\chi/2} \sum_w (-1)^{w \cdot v}\, Z_w(\tau)
\end{aligned} \qquad [15]$$
Also, with $Z_{SU(2)} = 2^{-1} Z_{v=0}$ and $Z_{SO(3)} = \sum_v Z_v$,
$$\begin{aligned}
Z_{SU(2)}(\tau + 1) &= (-1)^{\sigma/8}\, Z_{SU(2)}(\tau) \\
Z_{SO(3)}(\tau + 2) &= Z_{SO(3)}(\tau) \\
Z_{SU(2)}(-1/\tau) &= (-1)^{\sigma/8}\, 2^{-\chi/2}\, \tau^{-\chi/2}\, Z_{SO(3)}(\tau)
\end{aligned} \qquad [16]$$
Notice that the last of these three equations corresponds precisely to the strong–weak coupling duality transformation conjectured by Montonen and Olive (1977).
As for the correlation functions, one finds the following behavior under the inversion of the coupling:
$$\begin{aligned}
\left\langle \frac{1}{8\pi^2}\, \mathrm{tr}\, \phi^2 \right\rangle^{SU(2)}_{\tau} &= \langle O \rangle^{SU(2)}_{\tau} = \frac{1}{\tau^2}\, \langle O \rangle^{SO(3)}_{-1/\tau} \\
\left\langle \frac{1}{8\pi^2} \int_S \mathrm{tr}\, (2\phi F + \psi \wedge \psi) \right\rangle^{SU(2)}_{\tau} &= \langle I(S) \rangle^{SU(2)}_{\tau} = \frac{1}{\tau^2}\, \langle I(S) \rangle^{SO(3)}_{-1/\tau} \\
\langle I(S)\, I(S) \rangle^{SU(2)}_{\tau} &= \left(\frac{\tau}{i}\right)^{-4} \langle I(S)\, I(S) \rangle^{SO(3)}_{-1/\tau} + \frac{i}{\pi}\, \frac{1}{\tau^3}\, \langle O \rangle^{SO(3)}_{-1/\tau}\; \sharp(S \cap S)
\end{aligned} \qquad [17]$$
Therefore, as expected, the partition function of the twisted theory transforms as a modular form, while the topological correlation functions turn out to transform covariantly under SL(2, Z), following a pattern which can be reproduced with a far simpler topological abelian model.
The second example considered next is the Vafa–Witten (1994) theory. This theory possesses two scalar supercharges, and has the unusual feature that the virtual dimension of its moduli space is exactly zero (it is an example of balanced TQFT), and therefore the only nontrivial topological observable is the partition function itself. Furthermore, the twisted theory does not contain spinors, so it is well defined on any compact, oriented 4-manifold.
Now this theory computes, with the subtleties explained in Vafa and Witten (1994), the Euler characteristic of instanton moduli spaces. In fact, in this case in the generic partition function [10],
$$Z_X(G) = q^{-c(X,G)} \sum_k q^k\, \chi(\mathcal{M}_k) \qquad [18]$$
$\chi(\mathcal{M}_k)$ is the Euler characteristic of a suitable compactification of the kth instanton moduli space $\mathcal{M}_k$ of gauge group G in X.
As in the previous example, it is possible to consider nontrivial gauge configurations in G/Center(G) and compute the partition function for a fixed value of the 't Hooft flux $v \in H^2(X, \pi_1(G))$. In this case, however, the Seiberg–Witten approach is not available, but, as conjectured by Vafa and Witten, one can nevertheless carry out computations in terms of the vacuum degrees of freedom of the N = 1 theory which results from giving bare masses to all three chiral multiplets of the N = 4 theory. It should be noted that a similar approach was introduced by Witten (1994b) to obtain the first explicit results for the Donaldson–Witten theory just before the far more powerful Seiberg–Witten approach was available.
As explained in detail by Vafa and Witten (1994), the twisted massive theory is topological on Kähler 4-manifolds with $h^{2,0} \neq 0$, and the partition function is actually invariant under the perturbation. In the long-distance limit, the partition function is given as a finite sum over the contributions of the discrete massive vacua of the resulting N = 1 theory. In the case at hand, it turns out that, for G = SU(N), the number of such vacua is given by the sum of the positive divisors of N. The contribution of each vacuum is universal (because of the mass gap), and can be fixed by comparing with known mathematical results (Vafa and Witten 1994). However, this is not the end of the story. In the twisted theory, the chiral superfields of the N = 4 theory are no longer scalars, so the mass terms cannot be invariant under the holonomy group of the manifold unless one of the mass parameters is a holomorphic 2-form $\omega$. (Incidentally, this is the origin of the constraint $h^{(2,0)} \neq 0$ mentioned above.) This spatially dependent mass term vanishes where $\omega$ does, and we will assume as in Vafa and Witten (1994) and Witten (1994b) that $\omega$ vanishes with multiplicity 1 on a union of disjoint, smooth complex curves $C_i$, $i = 1, \ldots, n$, of genus $g_i$ which represent the canonical divisor K of X. The vanishing of $\omega$ introduces corrections involving K whose precise form is not known a priori. In the G = SU(2) case, each of the N = 1 vacua bifurcates along each of the components $C_i$ of the canonical divisor into two strongly coupled massive vacua. This vacuum degeneracy is believed to stem from the spontaneous breaking of a $Z_2$ chiral symmetry which is unbroken in bulk (see, e.g., Vafa and Witten (1994) and Witten (1994b)).
The structure of the corrections for G = SU(N) (see [19] below) suggests that the mechanism at work in this case is not chiral symmetry breaking.

Indeed, near any of the $C_i$'s, there is an N-fold bifurcation of the vacuum. A plausible explanation for this degeneracy could be found in the spontaneous breaking of the center of the gauge group (which for G = SU(N) is precisely $Z_N$). In any case, the formula for SU(N) can be computed (at least when N is prime) along the lines explained by Vafa and Witten (1994) and assuming that the resulting partition function satisfies a set of nontrivial constraints which are described below.
Then, for a given 't Hooft flux $v \in H^2(X, Z_N)$, the partition function for gauge group SU(N) (with prime N) is given by
$$\begin{aligned}
Z_v ={}& \left( \sum_{\vec{\varepsilon}} \delta_{v, w_N(\vec{\varepsilon})} \prod_{i=1}^{n} \prod_{\lambda=0}^{N-1} \left(\frac{\chi_\lambda}{\eta}\right)^{(1 - g_i)\, \delta_{\varepsilon_i, \lambda}} \right) \left(\frac{G(q^N)}{N^2}\right)^{\nu/2} \\
&+ N^{1 - b_1} \sum_{m=0}^{N-1} \left[ \prod_{i=1}^{n} \left( \sum_{\lambda=0}^{N-1} \left(\frac{\chi_{m,\lambda}}{\eta}\right)^{1 - g_i} e^{(2\pi i/N)\, \lambda\, v \cdot [C_i]_N} \right) \right] e^{i\pi((N-1)/N)\, m v^2} \left(\frac{G(\omega^m q^{1/N})}{N^2}\right)^{\nu/2}
\end{aligned} \qquad [19]$$
where $\omega = \exp(2\pi i/N)$, $G(q) = \eta(q)^{-24}$ (with $\eta(q)$ the Dedekind function), $\chi_\lambda$ are the SU(N) characters at level 1 and $\chi_{m,\lambda}$ are certain linear combinations thereof. $[C_i]_N$ is the reduction modulo N of the Poincaré dual of $C_i$, and
$$w_N(\vec{\varepsilon}) = \sum_{i=1}^{n} \varepsilon_i\, [C_i]_N \qquad [20]$$
where $\varepsilon_i = 0, 1, \ldots, N - 1$ are chosen independently.
Equation [19] has the expected properties under the modular group:
$$\begin{aligned}
Z_v(\tau + 1) &= e^{(i\pi/12) N (2\chi + 3\sigma)}\, e^{-i\pi((N-1)/N)\, v^2}\, Z_v(\tau) \\
Z_v(-1/\tau) &= N^{-b_2/2} \left(\frac{\tau}{i}\right)^{-\chi/2} \sum_u e^{2\pi i u \cdot v/N}\, Z_u(\tau)
\end{aligned} \qquad [21]$$
Also, with $Z_{SU(N)} = N^{b_1 - 1} Z_0$ and $Z_{SU(N)/Z_N} = \sum_v Z_v$,
$$Z_{SU(N)}(-1/\tau) = N^{-\chi/2} \left(\frac{\tau}{i}\right)^{-\chi/2} Z_{SU(N)/Z_N}(\tau) \qquad [22]$$
which is, up to some correction factors that vanish in flat space, the original Montonen–Olive conjecture!
There is a further property to be checked which concerns the behavior of [19] under blow-ups. This property was heavily used by Vafa and Witten (1994) and demanding it in the present case was essential in deriving the above formula. Blowing up a point on a Kähler manifold X replaces it with a new Kähler manifold $\hat{X}$ whose second cohomology lattice is $H^2(\hat{X}, Z) = H^2(X, Z) \oplus I^-$, where $I^-$ is the one-dimensional lattice spanned by the Poincaré dual of the exceptional divisor B created by the blow-up. Any allowed $Z_N$ flux $\hat{v}$ on $\hat{X}$ is of the form $\hat{v} = v \oplus r$, where v is a flux in X and $r = \lambda B$, $\lambda = 0, 1, \ldots, N - 1$. The main result concerning [19] is that under blowing up a point on a Kähler 4-manifold with canonical divisor as above, the partition functions for fixed 't Hooft fluxes have a factorization as
$$Z_{\hat{X}, \hat{v}}(\tau_0) = Z_{X, v}(\tau_0)\, \frac{\chi_\lambda(\tau_0)}{\eta(\tau_0)} \qquad [23]$$
Precisely the same behavior under blow-ups of the partition function [19] has been proved for the generating function of Euler characteristics of instanton moduli spaces on Kähler manifolds. This should not come as a surprise since, as mentioned above, on certain 4-manifolds, the partition function of Vafa–Witten theory computes the Euler characteristics of instanton moduli spaces. Therefore, [19] can be seen as a prediction for the Euler numbers of instanton moduli spaces on those 4-manifolds.
Finally, the third twisted N = 4 theory also possesses two scalar supercharges, and is believed to be a certain deformation of the four-dimensional BF theory, and as such it describes essentially intersection theory on the moduli space of complexified gauge connections. In addition to this, the theory is "amphicheiral," which means that it is invariant under a reversal of the orientation of the spacetime manifold. The terminology is borrowed from knot theory, where an oriented knot is said to be amphicheiral if, crudely speaking, it is equivalent to its mirror image. From this property, it follows that the topological invariants of the theory are completely independent of the complexified coupling constant $\tau$.

See also: Donaldson–Witten Theory; Electric–Magnetic Duality; Hopf Algebras and q-Deformation Quantum Groups; Large-N and Topological Strings; Seiberg–Witten Theory; Topological Quantum Field Theory:

Further Reading
Birmingham D, Blau M, Rakowski M, and Thompson G (1991) Topological field theories. Physics Reports 209: 129–340.
Cordes S, Moore G, and Ramgoolam S (1996) Lectures on 2D Yang–Mills theory, equivariant cohomology and topological field theory. In: David F, Ginsparg P, and Zinn-Justin J (eds.) Fluctuating Geometries in Statistical Mechanics and Field Theory, Les Houches Session LXII, p. 505, hep-th/9411210. Elsevier.

Gopakumar R and Vafa C (1998) Topological gravity as large N topological gauge theory. Advances in Theoretical and Mathematical Physics 2: 413–442, hep-th/9802016; On the gauge theory/geometry correspondence. Advances in Theoretical and Mathematical Physics 3: 1415–1443, hep-th/9811131.
Labastida JMF and Lozano C (1998) Lectures on topological quantum field theory. In: Falomir H, Gamboa R, and Schaposnik F (eds.) Proceedings of the CERN-Santiago de Compostela-La Plata Meeting, Trends in Theoretical Physics, pp. 54–93, hep-th/9709192. New York: American Institute of Physics.
Lozano C (1999) Duality in Topological Quantum Field Theories. Ph.D. thesis, U. Santiago de Compostela, arXiv: hep-th/9907123.
Maldacena J (1998) The large N limit of superconformal field theories and supergravity. Advances in Theoretical and Mathematical Physics 2: 231–252, hep-th/9711200.
Montonen C and Olive D (1977) Magnetic monopoles as gauge particles? Physics Letters B 72: 117–120.
Moore G and Witten E (1998) Integration over the u-plane in Donaldson theory. Advances in Theoretical and Mathematical Physics 1: 298–387, hep-th/9709193.
Seiberg N and Witten E (1994a) Electric–magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory. Nuclear Physics B 426: 19–52, hep-th/9407087.
Seiberg N and Witten E (1994b) Electric–magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory. – erratum. Nuclear Physics B 430: 485–486.
Seiberg N and Witten E (1994c) Monopoles, duality and chiral symmetry breaking in N = 2 supersymmetric QCD. Nuclear Physics B 431: 484–550, hep-th/9408099.
Vafa C and Witten E (1994) A strong coupling test of S-duality. Nuclear Physics B 431: 3–77, hep-th/9408074.
Witten E (1988b) Topological quantum field theory. Communications in Mathematical Physics 117: 353–386.
Witten E (1988a) Topological sigma models. Communications in Mathematical Physics 118: 411–449.
Witten E (1989) Quantum field theory and the Jones polynomial. Communications in Mathematical Physics 121: 351–399.
Witten E (1994a) Monopoles and four-manifolds. Mathematical Research Letters 1: 769–796, hep-th/9411102.
Witten E (1994b) Supersymmetric Yang–Mills theory on a four-manifold. Journal of Mathematical Physics 35: 5101–5135, hep-th/9403195.

Dynamical Systems and Thermodynamics

A Carati, L Galgani and A Giorgilli, Università di Milano, Milan, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The relations between thermodynamics and dynamics are dealt with by statistical mechanics. For a given dynamical system of Hamiltonian type in a classical framework, it is usually assumed that a dynamical foundation for equilibrium statistical mechanics, namely for the use of the familiar Gibbs ensembles, is guaranteed if one can prove that the system is ergodic, that is, has no integrals of motion apart from the Hamiltonian itself. One of the main consequences is then that classical mechanics fails in explaining thermodynamics at low temperatures (e.g., the specific heats of crystals or of polyatomic molecules at low temperatures, or the black body problem), because the classical equilibrium ensembles lead to equipartition of energy for a system of weakly coupled oscillators, against Nernst's third principle. This is actually the problem that historically led to the birth of quantum mechanics, equipartition being replaced by Planck's law. At a given temperature T, the mean energy of an oscillator of angular frequency $\omega$ is not $k_B T$ ($k_B$ being the Boltzmann constant), and thus is not independent of frequency (equipartition), but decreases to zero exponentially fast as frequency increases.
Thus, the problem of a dynamical foundation for classical statistical mechanics would be reduced to ascertaining whether the Hamiltonian systems of physical interest are ergodic or not. It is just in this spirit that many recent mathematical works have aimed at proving ergodicity for systems of hard spheres, or more generally for systems which are expected to be not only ergodic but even hyperbolic. However, a new perspective was opened in the year 1955, with the celebrated paper of Fermi, Pasta, and Ulam (FPU), which constituted the last scientific work of Fermi.
The FPU paper was concerned with numerical computations on a system of N (actually, 32 or 64) equal particles on a line, each interacting with the two adjacent ones through nonlinear springs, certain boundary conditions having been assigned (fixed ends). The model mimics a one-dimensional crystal (or also a string), and can be described in the familiar way as a perturbation of a system of N normal modes, which diagonalize the corresponding linearized system. The initial conditions corresponded to the excitation of only a few low-frequency modes, and it was expected that energy would rather quickly flow to the high-frequency modes, thus establishing equipartition of energy, in agreement with the predictions of classical equilibrium statistical mechanics. But this did not occur within the available computation times, and the

energy rather appeared to remain confined within a packet of low-frequency modes having a certain width, as if being in a state of apparent equilibrium of a nonstandard type. This fact can be called "the FPU paradox." In the words of Ulam, written as a comment in Fermi's Collected Papers, this is described as follows: "The results of the computations were interesting and quite surprising to Fermi. He expressed the opinion that they really constituted a little discovery in providing intimations that the prevalent beliefs in the universality of mixing and thermalization in nonlinear systems may not be always justified."
The FPU paper immediately had a very strong impact on the theory of dynamical systems, because it motivated all the modern theory of infinite-dimensional integrable systems and solitons (KdV equation), starting from the works of Zabusky and Kruskal (1965). But in this way the FPU paradox was somehow enhanced, because the FPU system turned out to be associated to the class of integrable systems, namely the systems having a number of integrals of motion equal to the number of degrees of freedom, which are in a sense the most antithermodynamic systems. The merit of establishing a bridge towards ergodicity goes to Izrailev and Chirikov (1966). Making reference to the most advanced results then available in the perturbation theory for nearly integrable systems (KAM theory), these authors pointed out that ergodicity, and thus equipartition, would be recovered if one took initial data with a sufficiently large energy. And this was actually found to be the case. Moreover, it turned out that their work, and its subsequent completion by Shepelyanski, was often interpreted as supporting the conjecture that the FPU paradox would disappear in the thermodynamic limit (infinitely many particles, with finite density and energy density). The opposite conjecture was advanced in the year 1970 by Bocchieri, Scotti, Bearzi, and Loinger, and its relevance for the relations between classical and quantum mechanics was immediately pointed out by Cercignani, Galgani, and Scotti. A long debate then followed. Possibly, some misunderstandings occurred, because in the discussions concerning the dynamical aspects of the problem reference was generally made to notions involving infinite times. In fact, it had not yet been conceived that the FPU equilibrium might actually be an apparent one, corresponding to some type of intermediate metaequilibrium state. This was for the first time suggested by researchers in Parisi's group in the year 1982. The analogy of such a situation with that occurring in glasses was pointed out more recently.
In the present article, the state of the art of the FPU problem is discussed. The thesis of the present authors is that the FPU phenomenon survives in the thermodynamic limit, in the last mentioned sense, namely that at sufficiently low temperatures there exists a kind of metaequilibrium state surviving for extremely long times. The corresponding thermodynamics turns out to be different from the standard one predicted by the equilibrium ensembles, inasmuch as it presents qualitatively some quantum-like features (typically, specific heats in agreement with Nernst's third principle). The key point, with respect to equilibrium statistical mechanics, is that the internal thermodynamic energy should be identified not with the whole mechanical energy, but only with a suitable fraction of it, to be identified through its dynamical properties, as was suggested more than a century ago by Boltzmann himself, and later by Nernst.
Here, it is first discussed why nearly integrable systems can be expected to present the FPU phenomenon. Then the latter is illustrated. Finally, some hints are given for the corresponding thermodynamics.

Nearly Integrable versus Hyperbolic Systems, and the Question of the Rates of Thermalization

As mentioned above, it is usually assumed that the problem of providing a dynamical foundation to classical statistical mechanics is reduced to the mathematical problem of ascertaining whether the Hamiltonian systems of physical interest are ergodic or not. However, there remains open a subtler problem. Indeed, the notion of ergodicity involves the limit of an infinite time (time averages should converge to ensemble averages as $t \to \infty$), while intermediate times might be relevant. In this connection it is convenient to distinguish between two classes of dynamical systems, namely the hyperbolic and the nearly integrable ones.
The first class, in a sense the prototype of chaotic systems, should include the systems of hard spheres (extensively studied after the classical works of Sinai), or more generally the systems of mass points with mutual repulsive interactions. For such systems it can be expected that the time averages of the relevant dynamical quantities in an extremely short time converge to the corresponding ensemble averages, so that the classical equilibrium ensembles could be safely used.
A completely different situation occurs for the dynamical systems such as the FPU systems, which are nearly integrable, that is, are perturbations of

systems having a number of integrals of motion reference to Figures 1–8, which are the results of
equal to the number of degrees of freedom. Indeed, numerical integrations of the FPU dynamical system.
in such a case ergodicity means that the addition of If x1 , . . . , xN denote the positions of the particles (of
an interaction, no matter how small, makes an unitary mass), or more precisely the displacements
integrable system lose all of its integrals of motion, from their equilibrium positions, and pi the corre-
apart from the Hamiltonian itself. And, in fact, this sponding momenta, the Hamiltonian is
quite remarkable property was already proved to be
p2 X
N þ1
generic by Poincaré, through a set of considerations H¼ i
þ Vðri Þ
which had a fundamental impact on the theory of i¼1
2 i¼1
dynamical systems itself. In view of its importance
for the foundations of statistical mechanics, the where ri = xi  xi1 and one has taken a potential
proof given by Poincaré was reconsidered by Fermi, V(r) = r2 =2 þ r3 =3 þ r4 =4 depending on two
who added a subtle contribution concerning the role positive parameters  and . Boundary conditions
of single invariant surfaces. It is just to such a paper with fixed ends, namely x0 = xNþ1 = 0, are consid-
that Ulam makes reference in his comment to the ered. We recall that the angular frequencies
FPU work mentioned above, when he says: ‘‘Fermi’s
earlier interest in the ergodic theory is one motive’’
for the FPU work.
The point is that the picture which looks at the ergodicity induced on an integrable system by the addition of a perturbation, no matter how small, somehow lacks continuity. One might expect that, in situations in which the nonlinear interaction which destroys the integrals of motion is very small (i.e., at low temperatures), the underlying integrable structure should somehow be still appreciable, in some continuous way. In fact, continuity should be recovered by posing the question in terms of times, namely by considering the rates of thermalization (to use the very FPU phrase), or equivalently the relaxation times, namely the times needed for the time averages of the relevant dynamical quantities to converge to the corresponding ensemble averages. By continuity, one clearly expects that the relaxation times diverge as the perturbation tends to zero. But more complicated situations might occur, as, for example, the existence of two (or more) relevant timescales.

The point of view that timescales of different orders of magnitude might occur in dynamical systems (with the exhibition of an interesting example), and that this might be relevant for statistical mechanics, was discussed by Poincaré himself in the year 1906. Indeed, he denotes as ‘‘first-order very large time’’ a time which is sufficient for a system to reach a ‘‘provisional equilibrium,’’ whereas he denotes as ‘‘second-order very large time’’ a time which is necessary for the system to reach its ‘‘definitive equilibrium.’’
Figure 1 The FPU paradox: normal-mode energies $E_j$ versus time (left) and energy spectrum, namely time average of $E_j$ versus $j$ (right) for three different timescales. The energy, initially given to the lowest-frequency mode, does not flow to the high-frequency modes within the accessible observation time. Here, $N = 32$ and $E = 0.05$. [Panels: energies $E_1, \ldots, E_8$ versus time (K-units) on the left, with time ranges 0 to 10, 0 to 100, and 0 to 1000 from top to bottom; spectrum versus mode number on the right.]

The FPU Phenomenon: Historical and Conceptual Developments

We now illustrate the FPU phenomenon, following essentially its historical development. We will make


Figure 2 The FPU paradox: time averages of the energies of the modes 1, 2, . . . , 8 (from top to bottom) versus time for the same run as Figure 1. The spectrum has reached an apparent equilibrium, different from that of equipartition predicted by classical equilibrium statistical mechanics. An exponential decay of the tail is clearly exhibited. [Plot: time-averaged energies versus time (K-units), with time-axis ticks at $10^0$, $10^2$, $10^4$.]

of the corresponding normal modes are $\omega_j = 2 \sin\left[\frac{j\pi}{2(N+1)}\right]$, with $j = 1, \ldots, N$; it is thus convenient to take as time unit the value $\pi$, which is essentially, for any $N$, the period of the fastest normal mode.
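As a quick numerical check of the frequency formula and of the choice of time unit, here is a minimal sketch. The function names and the use of NumPy are illustrative assumptions, not part of the original article. It compares the closed-form frequencies with the eigenvalues of the harmonic part of the chain with fixed ends.

```python
import numpy as np

def mode_frequencies(n_particles):
    """Closed form omega_j = 2 sin(j*pi / (2(N+1))), j = 1..N (fixed ends)."""
    j = np.arange(1, n_particles + 1)
    return 2.0 * np.sin(j * np.pi / (2.0 * (n_particles + 1)))

def stiffness_matrix(n_particles):
    """Matrix A such that the harmonic potential sum_i r_i^2 / 2 equals x.A.x / 2."""
    return (2.0 * np.eye(n_particles)
            - np.eye(n_particles, k=1)
            - np.eye(n_particles, k=-1))

N = 32  # value used in the figures of the text
omega = mode_frequencies(N)

# The eigenvalues of A are omega_j^2 (eigvalsh returns them in ascending order).
eig = np.linalg.eigvalsh(stiffness_matrix(N))
assert np.allclose(np.sqrt(eig), omega)

# Period of the fastest mode, to be compared with the time unit pi.
fastest_period = 2.0 * np.pi / omega[-1]
print("fastest period / pi =", fastest_period / np.pi)
```

The ratio printed at the end stays close to 1 for any N, since the largest frequency approaches 2 as N grows.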
The original FPU result is illustrated in Figures 1 and 2. Here $N = 32$, $\alpha = \beta = 1/4$, and the total energy is $E = 0.05$; the energy was given initially to the first normal mode (with vanishing potential energy). Three timescales (increasing from top to bottom) are considered, the top one corresponding to the timescale of the original FPU paper. In the boxes on the left the energies $E_j(t)$ of modes $j$ are reported versus time ($j = 1, \ldots, 8$ at top, $j = 1$ at center and bottom). In the boxes on the right we report the corresponding spectra, namely the time average (up to the respective final times) of the energy of mode $j$ versus $j$, for $1 \le j \le N$. In Figure 2 we report, for the same run of Figure 1, the time averages of the energies of the various modes versus time; this figure corresponds to the last one of the original FPU work. The facts to be noticed in connection with these two figures are the following: (1) the spectrum (namely the distribution of energy among the modes, in time average) appears to have relaxed very quickly to some form, which remains essentially unchanged up to the maximum observed time; (2) there is no global equipartition, but only a partial one, because the energy remains confined within a group of low-frequency modes, which form a small packet of a certain definite width; and (3) the time evolutions of the mode energies appear to be of quasiperiodic type, since longer and longer quasiperiods can be observed as the total time increases.

Figure 3 The Izrailev–Chirikov contribution: for a fixed observation time, equipartition is attained if the initial energy $E$ is high enough. Here, from top to bottom, $E = 0.1, 1, 10$. [Panels: energies $E_1, \ldots, E_8$ versus time (K-units, 0 to 10) on the left; spectrum versus mode number on the right.]

Figure 4 The Izrailev–Chirikov contribution: time averages of the mode energies versus time for the same run as at bottom of Figure 3. [Plot: time-averaged energies versus time (K-units), with a time axis running from $10^{-1}$ to $10^{2}$.]
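The run described above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' code: the velocity Verlet integrator, the step size `dt`, and all function names are choices made here for the example. The parameters $N = 32$, $\alpha = \beta = 1/4$, $E = 0.05$ and the initial condition (all energy in the first mode, with vanishing potential energy) are taken from the text.

```python
import numpy as np

# Parameters quoted in the text.
N, ALPHA, BETA, E_TOTAL = 32, 0.25, 0.25, 0.05

j = np.arange(1, N + 1)
omega = 2.0 * np.sin(j * np.pi / (2.0 * (N + 1)))
# Orthonormal sine modes for fixed ends; row k is the shape of mode k + 1.
S = np.sqrt(2.0 / (N + 1)) * np.sin(np.pi * np.outer(j, j) / (N + 1))

def force(x):
    """Force -dH/dx_i for V(r) = r^2/2 + alpha r^3/3 + beta r^4/4."""
    r = np.diff(np.concatenate(([0.0], x, [0.0])))  # r_i = x_i - x_{i-1}
    dV = r + ALPHA * r**2 + BETA * r**3             # V'(r_i), i = 1..N+1
    return dV[1:] - dV[:-1]                         # V'(r_{i+1}) - V'(r_i)

def total_energy(x, p):
    """Full Hamiltonian H = sum p_i^2/2 + sum V(r_i)."""
    r = np.diff(np.concatenate(([0.0], x, [0.0])))
    return 0.5 * p @ p + np.sum(r**2 / 2 + ALPHA * r**3 / 3 + BETA * r**4 / 4)

def mode_energies(x, p):
    """Harmonic mode energies E_j = (P_j^2 + omega_j^2 Q_j^2) / 2."""
    Q, P = S @ x, S @ p
    return 0.5 * (P**2 + (omega * Q)**2)

def run(t_final, dt=0.05):
    """Integrate with velocity Verlet (an assumed scheme); return the final
    state and the time averages of the mode energies."""
    x = np.zeros(N)                      # vanishing potential energy
    p = np.sqrt(2.0 * E_TOTAL) * S[0]    # all energy in mode 1, as kinetic energy
    steps = int(round(t_final / dt))
    avg = np.zeros(N)
    f = force(x)
    for _ in range(steps):
        p = p + 0.5 * dt * f
        x = x + dt * p
        f = force(x)
        p = p + 0.5 * dt * f
        avg += mode_energies(x, p)
    return x, p, avg / steps

if __name__ == "__main__":
    # 100 time units of the text, i.e. 100 * pi in the natural units of H.
    x, p, avg = run(100 * np.pi)
    print("relative energy drift:", abs(total_energy(x, p) - E_TOTAL) / E_TOTAL)
    print("time-averaged energies of modes 1..8:", avg[:8])
```

Plotting `avg` against the mode index on a logarithmic scale should give a spectrum concentrated in the low-frequency modes, in line with point (2) above; much longer runs than this one are needed to explore the longer timescales shown in the figures.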

After the works of Zabusky and Kruskal, by which the FPU system was somehow assimilated to an integrable system, the bridge toward ergodicity was made by Izrailev and Chirikov (1966), through the idea that there should exist a stochasticity threshold. Making reference to KAM theory, which had just been formulated in the framework of perturbation theory for nearly integrable systems, their main remark was as follows. It is known that KAM theory, which essentially guarantees a behavior similar to that of an integrable system, applies only if the perturbation is smaller than a certain threshold; on the other hand, in the FPU model the natural perturbation parameter is the energy $E$ of the system. Thus, the FPU phenomenon can be expected to disappear above a certain threshold energy $E_c$. This is indeed the case, as illustrated in Figures 3 and 4. The parameters $\alpha$, $\beta$ and the class of initial data are as in Figure 1. In Figure 3 the total time is kept fixed (at 10 000 units), whereas the energy $E$ is increased in passing from top to bottom, actually from $E = 0.1$ to $E = 1$ and $E = 10$. One sees that at $E = 10$ equipartition is attained within the given observation time; correspondingly, the motion of the modes visually appears to be nonregular. The approach to equipartition at $E = 10$ is clearly exhibited in Figure 4, where the time averages of the energies are reported versus time.

There naturally arose the problem of the dependence of the threshold $E_c$ on the number $N$ of degrees of freedom (and also on the class of initial data). Certain semianalytical considerations of Izrailev and Chirikov were generally interpreted as suggesting that the threshold should vanish in the thermodynamic limit for initial excitations of high-frequency modes. Recently, Shepelyanski completed the analysis by showing that the threshold should vanish also for initial excitations of the low-frequency modes, as in the original FPU work (see, however, the subsequent paper by Ponno mentioned below). If this were true, the FPU phenomenon would disappear in the thermodynamic limit. In particular, the equipartition principle would be dynamically justified at all temperatures.

The opposite conjecture was advanced by Bocchieri et al. (1970). This was based on numerical calculations, which indicated that the energy threshold should be proportional to $N$, namely that the FPU phenomenon persists in the thermodynamic limit provided the specific energy $\varepsilon = E/N$ is below a critical value $\varepsilon_c$, which should be definitely nonvanishing. Actually, the computations were performed on a slightly different model, in which nearby particles were interacting through a more physical Lennard-Jones potential. By taking concrete values having a physical significance, namely the values commonly assumed for argon, for the threshold of the specific energy they found the value $\varepsilon_c \simeq 0.04\,V_0$, where $V_0$ is the depth of the Lennard-Jones potential well. This corresponds to a critical temperature of the order of a few kelvin. The relevance of such a conjecture (persistence of the FPU phenomenon in the thermodynamic limit) was soon strongly emphasized by Cercignani, Galgani, and Scotti, who also tried to establish a connection between the FPU spectrum and Planck’s distribution.

Up to this point, the discussion was concerned with the alternative whether the FPU system is ergodic or not, and thus reference was made to properties holding in the limit $t \to \infty$. Correspondingly, one was making reference to KAM theory, namely to the possible existence of surfaces ($N$-dimensional tori) which should be dynamically invariant (for all times). The first paper in which attention was drawn to the problem of estimating the relaxation times to equilibrium was by Fucito et al. (1982). The model considered was actually a different one (the so-called $\phi^4$ model), but the results can also be extended to the FPU model. Analytical and numerical indications were given for the existence of two timescales. In a short time the system was found to relax to a state characterized by an FPU-like spectrum, with a plateau at the low frequencies, followed by an exponential tail. This, however, appeared as being a sort of metastable state. In their words: ‘‘The nonequilibrium spectrum may persist for extremely long times, and may be mistaken for a stationary state if the observation time is not sufficiently long.’’ Indeed, on a second much larger timescale the slope of the exponential tail was found to increase logarithmically with time, with a rate which decreases to zero with the energy. This is an indication that the time for equipartition should increase as an exponential with the inverse of the energy.

This is indeed the picture that the present authors consider to be essentially correct, being supported by very recent numerical computations, and by analytical considerations. Curiously enough, however, such a picture was not fully appreciated until quite recently. Possibly, the reason is that the scientific community had to wait until becoming acquainted with two relevant aspects of the theory of dynamical systems, namely Nekhoroshev theory and the relations between KdV equation and resonant normal-form theory.

The first step was the passage from KAM theory to Nekhoroshev theory. Let us recall that, whereas in KAM theory one looks for surfaces which are invariant (for all times), in Nekhoroshev theory one

looks instead for a kind of weak stability involving finite times, albeit ‘‘extremely long’’ ones, as they are found to increase as stretched exponentials with the inverse of the perturbative parameter. Thus, one meets with situations in which one can have instability over infinite times, while having a kind of practical stability up to exponentially long times.
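As a reminder of what a ‘‘stretched exponential’’ means here, the standard form of a Nekhoroshev-type estimate can be written as below. The notation is an assumption introduced for this illustration and does not come from the present text: $H = h(I) + \varepsilon f(I, \varphi)$ is a nearly integrable Hamiltonian in action-angle variables, and $C$, $T_*$, $\varepsilon_*$, $a$, $b$ are positive constants.

```latex
\[
  \| I(t) - I(0) \| \le C\, \varepsilon^{b}
  \qquad \text{for} \qquad
  |t| \le T_{*} \exp\!\left[ \left( \frac{\varepsilon_{*}}{\varepsilon} \right)^{a} \right] .
\]
```

The exponents $a$ and $b$ depend on the number of degrees of freedom and on the properties of $h$. The actions are thus not claimed to stay close to their initial values forever, but only over a time that grows faster than any power of $1/\varepsilon$ as $\varepsilon \to 0$.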

Notice that Nekhoroshev’s theory was formulated only in the year 1974, and that it started to be known in the West only in the early 1980s, just because of its interest for the FPU problem. Another interesting point is that just in those years one started to become acquainted with a related historical fact. Indeed, the idea that equipartition might require extremely long times, so that one would be confronted with situations of a practical lack of equipartition, has in fact a long tradition in statistical mechanics, going back to Boltzmann and Jeans, and later (in connection with sound dispersion in gases of polyatomic molecules) to Landau and Teller.

Figure 5 Analogy with glasses: the specific energy $u$ of an FPU system is plotted versus temperature $T$ for a cooling process (upper curve) and a heating process (lower curve). The FPU system is kept in contact with a heat reservoir, whose temperature is changed at a given rate. At low temperatures the system does not have time to reach the equilibrium curve $u = T$ (with $k_B = 1$). [Plot: specific energy $u$ ($\times 10^{-3}$) versus temperature $T$ ($\times 10^{-3}$), both axes from 0 to 2.]
In this way the idea of the existence of extremely analytic type mentione