
October 12, 2011

Quantum Field Theory I

Ulrich Haisch
Rudolf Peierls Centre for Theoretical Physics, University of Oxford
OX1 3PN Oxford, United Kingdom

Abstract
This course deals with modern applications of quantum field theory, with emphasis on
the quantization of theories involving scalar and spinor fields.

Recommended Books and Resources


There is a vast array of quantum field theory texts, many of them with redeeming features.
Here I mention a few of them, mostly the ones that I used or looked at when preparing this
course. To a large extent, I will follow the first section of
M. Peskin and D. Schroeder, An Introduction to Quantum Field Theory
This is a very clear and comprehensive book, covering essentially everything in this course
as well as many advanced aspects of quantum field theory that go (far) beyond the scope of
this lecture.
S. Weinberg, The Quantum Theory of Fields: Volume 1, Foundations
This is the first in a three volume series by one of the masters of quantum field theory.
It takes a unique route through the subject, focussing initially on particles rather than fields.
Since it has a very particular viewpoint, it is difficult to digest, but certainly worth reading.
L. Ryder, Quantum Field Theory
This elementary text has a nice discussion of much of the material in this course. It is
good for a first reading.
A. Zee, Quantum Field Theory in a Nutshell
This is a charming book, where emphasis is placed on physical understanding and the
author isn't afraid to hide the ugly truth when necessary. It contains many gems.
By browsing the web, I also found interesting material. Nice introductions to quantum
field theory (of different length and viewpoint) have been written by C. Anastasiou and D.
Tong. The corresponding scripts can be found at:
http://www.phys.ethz.ch/babis/Teaching/QFTI/qft1.pdf
http://www.damtp.cam.ac.uk/user/tong/qft/qft.pdf
Other links to useful resources can be found on the web page of D. Tong:
http://www.damtp.cam.ac.uk/user/tong/qft.html
For completeness, I will also give relevant references at the end of each section of this
script. The interested reader can consult them for further details on the discussed topics.

Contents
1 Introduction
1.1 Why QFT?
1.2 Scales and Units
2 Elements of Classical Field Theory
2.1 Dynamics of Fields
2.2 Noether's Theorem
2.3 Example: Electrodynamics
2.4 Space-Time Symmetries
2.5 Problems
3 Klein-Gordon Theory
3.1 Klein-Gordon Field as Harmonic Oscillators
3.2 Structure of Vacuum
3.3 Particle States
3.4 Two Real Klein-Gordon Fields
3.5 Complex Klein-Gordon Field
3.6 Heisenberg Picture
3.7 Klein-Gordon Correlators
3.8 Non-Relativistic Limit
3.9 Problems
4 Interacting Fields
4.1 Classification of Interactions
4.2 Interaction Picture
4.3 First Look at Scattering Processes
4.4 Wick's Theorem
4.5 Second Look at Scattering Processes
4.6 Feynman Diagrams
4.7 Third Look at Scattering Processes
4.8 Yukawa Potential
4.9 Connected and Amputated Feynman Diagrams
4.10 From Correlation Functions to Scattering Matrix Elements
4.11 Decay Widths and Cross Sections
4.12 Problems
5 Dirac Theory
5.1 Spinor Representation
5.2 Discrete Symmetries of Dirac Theory
5.3 Continuous Symmetries of Dirac Theory
5.4 Solutions to Dirac Equation
5.5 Quantization of Dirac Theory
5.6 Problems

1 Introduction

As the term quantum field theory (QFT) suggests, QFT is the application of quantum mechanics (QM) to dynamical systems of fields, in the same sense that QM is concerned mainly
with the quantization of dynamical systems of particles. QFT is not only a subject that is
absolutely essential to understand the current state of elementary particle physics as well as
modern aspects of cosmology, but also plays a crucial role in many active areas of research,
ranging from atomic over nuclear and condensed-matter physics to pure mathematics. Since
the ultimate goal of this course is to gain a basic understanding of the fundamental laws of
nature, we will in the following focus mainly on the physics of elementary particles and hence
deal mostly with relativistic fields.

1.1

Why QFT?

The primary reason for introducing the concept of fields in classical physics is to construct laws
of nature that are local. The old laws of Newton (Coulomb) involve action at a distance.
This means that the force felt by a planet (an electron) changes immediately if a distant
star (proton) moves. The laws of Newton and Coulomb thus feature non-local interactions.
The field theories of Einstein (general relativity) and Maxwell (electrodynamics) remedied the
situation, with all interactions mediated in a local fashion by fields. The requirement of locality
remains a strong motivation for studying QFTs. However, there are further good reasons to
treat the quantum field (and not the particle) as fundamental (or as Steven Weinberg puts it
in [1]: "Quantum fields are the basic ingredients of the universe, and particles are just bundles
of energy and momentum made out of them.").
QM and Special Relativity
A first reason is that the combination of QM and special relativity implies that particle number
is not conserved. Consider a particle of mass $m$ trapped in a box of size $L$. Heisenberg's
uncertainty principle tells us that the uncertainty in the momentum of our particle is
$\Delta p \gtrsim \hbar/L$. In the relativistic limit, momentum and energy can be treated on an equal footing,
and one has an uncertainty in the energy of order $\Delta E \gtrsim \hbar c/L$. Yet, if $\Delta E = 2mc^2$, there is
enough energy available to create a virtual particle-antiparticle pair from the vacuum (Dirac
sea). This little exercise shows that when a particle with mass $m$ is localized within a distance
$\lambda_{\rm Compton} = \hbar/(mc)$, talking about a single particle loses its sense. For distances smaller than
this Compton wavelength there is a high probability that we will detect particle-antiparticle
pairs swarming around the single particle that we initially put into the box. Notice that
$\lambda_{\rm Compton}$ is always smaller than the de Broglie wavelength given by $\lambda_{\rm de\,Broglie} = \hbar/|\mathbf{p}|$ (footnote 1). If
you like, the de Broglie wavelength is the distance at which the wavelike nature of particles
becomes apparent, while the Compton wavelength is the distance at which the concept of
a single pointlike particle breaks down and one has to start thinking about how to describe
multiparticle states.

Footnote 1: Throughout this course we will use boldface type (ordinary italic type) to denote 3-vectors (4-vectors).

The presence of a multitude of particles and antiparticles at short distances (or high
energies) tells us that any attempt to write down a relativistic version of the one-particle
Schrödinger equation is doomed to fail. There is no mechanism in standard non-relativistic
QM to deal with changes in the particle number. Indeed, any attempt to naively write down a
relativistic version of the one-particle Schrödinger equation meets serious problems: negative
probabilities, infinite towers of negative energy states, or a breakdown of causality are the
common issues that arise.
QM and Causality
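Let us have a closer look at the issue of breakdown of causality. Consider the amplitude
$$ A(t) = \langle \mathbf{y} |\, e^{-iEt/\hbar} \,| \mathbf{x} \rangle , \tag{1.1} $$
that describes the propagation of a free particle from the point $\mathbf{x}$ to $\mathbf{y}$. In non-relativistic QM
one has $E = \mathbf{p}^2/(2m)$ and hence (footnote 2)
$$
\begin{aligned}
A(t) &= \langle \mathbf{y} | \exp\!\left[ -i\, \frac{\mathbf{p}^2}{2m}\, \frac{t}{\hbar} \right] | \mathbf{x} \rangle \\
&= \int \frac{d^3p}{(2\pi\hbar)^3}\, \langle \mathbf{y} | \exp\!\left[ -i\, \frac{\mathbf{p}^2}{2m}\, \frac{t}{\hbar} \right] | \mathbf{p} \rangle \langle \mathbf{p} | \mathbf{x} \rangle \\
&= \int \frac{d^3p}{(2\pi\hbar)^3}\, \exp\!\left[ -i\, \frac{\mathbf{p}^2}{2m}\, \frac{t}{\hbar} \right] \exp\!\left[ i\, \mathbf{p} \cdot (\mathbf{y} - \mathbf{x})/\hbar \right] \\
&= \left( \frac{m}{2\pi i \hbar t} \right)^{3/2} \exp\!\left[ \frac{i m\, (\mathbf{y} - \mathbf{x})^2}{2 \hbar t} \right] .
\end{aligned}
\tag{1.2}
$$
Here we have made use of the completeness relation $\int d^3p/(2\pi\hbar)^3\, |\mathbf{p}\rangle\langle\mathbf{p}| = 1$ of the momentum
eigenstates $|\mathbf{p}\rangle$ and a little bit of algebra (a Gaussian integral). The expression (1.2) is non-zero
for all $\mathbf{y}$ and $t$, indicating that a particle can
propagate between any two points in an arbitrarily short time. In a relativistic theory, this
conclusion would signal a violation of causality. One might hope that using the relativistic
expression $E = \sqrt{\mathbf{p}^2 c^2 + m^2 c^4}$ for the energy would cure the problem, but it does not. In fact,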
in the relativistic case one has
$$
\begin{aligned}
A(t) &= \langle \mathbf{y} | \exp\!\left[ -\frac{it}{\hbar} \sqrt{\mathbf{p}^2 c^2 + m^2 c^4} \right] | \mathbf{x} \rangle \\
&= \int \frac{d^3p}{(2\pi\hbar)^3}\, \exp\!\left[ -\frac{it}{\hbar} \sqrt{\mathbf{p}^2 c^2 + m^2 c^4} \right] \exp\!\left[ i\, \mathbf{p} \cdot (\mathbf{y} - \mathbf{x})/\hbar \right] \\
&= \frac{1}{2\pi^2 \hbar^2\, |\mathbf{y} - \mathbf{x}|} \int_0^\infty dp\; p\, \sin\!\left( p\, |\mathbf{y} - \mathbf{x}|/\hbar \right) \exp\!\left[ -\frac{it}{\hbar} \sqrt{p^2 c^2 + m^2 c^4} \right] .
\end{aligned}
\tag{1.3}
$$
This integral can be evaluated explicitly in terms of Bessel functions, but for our purposes it
is sufficient to consider its asymptotic behavior for $L^2 = |\mathbf{y} - \mathbf{x}|^2 \gg c^2 t^2$, i.e., separations well
outside the light-cone. We use the method of stationary phase. The relevant phase function
$pL - t \sqrt{p^2 c^2 + m^2 c^4}$ has a stationary point at $p = i m c L/\sqrt{L^2 - c^2 t^2}$. Plugging this value into
(1.3), we find that (up to a rational function of $L$ and $t$),
$$ A(t) \sim \exp\!\left[ -\frac{mc}{\hbar} \sqrt{L^2 - c^2 t^2} \right] . \tag{1.4} $$
This expression is small but non-zero outside the light-cone and causality is again violated.
In both cases, the observed failure is telling us that we need a new formalism to preserve
causality. This formalism is QFT. It solves the causality problem in a miraculous way. We
will see later that in QFT the propagation of a particle across a space-like interval is indistinguishable from the propagation of an antiparticle in the opposite direction. When we ask
whether an observation made at point $x$ can affect an observation made at point $y$, we will
find that the amplitudes for particle and antiparticle propagation cancel in such a way that
causality is preserved.

Footnote 2: The symbol $\mathbf{p}$ denotes the momentum operator, which in many QM books is indicated by a hat, $\hat{\mathbf{p}}$. To avoid
clutter, I will not use the latter notation, but simply write $\mathbf{p}$.
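The stationary-phase step behind (1.4) can be checked numerically. The following is only a minimal sketch in natural units ($c = \hbar = 1$); the function names and the sample values of $m$, $L$, $t$ are illustrative assumptions, not part of the lecture. It verifies that the derivative of the phase function vanishes at $p = imL/\sqrt{L^2 - t^2}$ and that the resulting factor equals $\exp(-m\sqrt{L^2 - t^2})$.

```python
import cmath
import math

def phase(p, L, t, m):
    """Phase function p*L - t*sqrt(p^2 + m^2) in units with c = hbar = 1."""
    return p * L - t * cmath.sqrt(p * p + m * m)

def phase_derivative(p, L, t, m):
    """Analytic derivative of the phase: L - t*p/sqrt(p^2 + m^2)."""
    return L - t * p / cmath.sqrt(p * p + m * m)

def stationary_point(L, t, m):
    """Purely imaginary stationary point, valid outside the light-cone (L > t)."""
    return complex(0.0, m * L / math.sqrt(L * L - t * t))

# Illustrative (assumed) values with L > t, i.e. a space-like separation.
m, L, t = 1.0, 5.0, 3.0
p_star = stationary_point(L, t, m)

residual = abs(phase_derivative(p_star, L, t, m))       # should be ~0
amplitude = abs(cmath.exp(1j * phase(p_star, L, t, m)))  # |exp(i * phase)|
expected = math.exp(-m * math.sqrt(L * L - t * t))       # right-hand side of (1.4)

print(residual, amplitude, expected)
```

The sketch only reproduces the exponential factor; the rational prefactor mentioned in the text is not computed.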
What else is QFT good for?
Besides solving the causality problem, QFT also provides an elegant framework to describe
transitions between states of different particle number and type. An example of such a physical process, exhaustively studied (from 1989 until 2000) at the Large Electron Positron (LEP)
collider in Geneva, is the production of a muon ($\mu^-$) and its antiparticle ($\mu^+$) out of the
annihilation of an electron ($e^-$) and its antiparticle (the positron $e^+$):
$$ e^+ + e^- \to \mu^+ + \mu^- . \tag{1.5} $$

The experimental confirmation of the QFT predictions for processes such as (1.5), often to an
unprecedented level of accuracy, is our real reason for studying QFT. But the power of QFT
does not end here. In traditional QM the relation between spin and statistics has to be put
in by hand. To agree with experiment, one should choose Bose statistics (no minus sign if
one exchanges two identical particles) for integer spin particles, and Fermi statistics (minus
sign if one exchanges two identical particles) for half-integer spin particles. On the other
hand, in QFT the relationship between spin and statistics is a consequence of the framework,
following from the commutation quantization conditions for boson fields and anticommutation
quantization conditions for fermion fields.

1.2

Scales and Units

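There are three fundamental dimensionful constants in nature: the speed of light $c$, Planck's
constant $\hbar$ (divided by $2\pi$), and Newton's constant $G_N$. Their dimensions are
$$
\begin{aligned}
[c] &= \text{length} \cdot \text{time}^{-1} , \\
[\hbar] &= \text{length}^2 \cdot \text{mass} \cdot \text{time}^{-1} , \\
[G_N] &= \text{length}^3 \cdot \text{mass}^{-1} \cdot \text{time}^{-2} .
\end{aligned}
\tag{1.6}
$$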

In order to avoid unnecessary clutter, we will work throughout this course in natural units,
defined by $c = \hbar = 1$ (footnote 3). This allows us to express all dimensionful quantities in terms of a
single scale which we choose to be mass or, equivalently, energy (since $E = mc^2$ has become
$E = m$). Energies will be given in units of eV (the electron volt) or more often GeV $= 10^9$ eV
or TeV $= 10^{12}$ eV, since we are typically dealing with high energies. To convert the unit of
energy back to units of length or time, we have to insert the relevant powers of $c$ and $\hbar$. E.g.,
the length scale associated to a mass $m$ is $\lambda = h/(mc)$. Remembering that
$$ hc \simeq 1.24 \cdot 10^{-6}\ \text{eV m} , \tag{1.7} $$
one finds that the length scale corresponding to the electron with mass $m_e \simeq 511$ keV is
$\lambda_e \simeq 2 \cdot 10^{-12}$ m.
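The conversion from a mass to its length scale is simple arithmetic, and a few lines of code make it concrete. The sketch below uses only the value of $hc$ quoted in (1.7); the function name and the selection of masses are illustrative assumptions.

```python
HC_EV_M = 1.24e-6  # h*c in eV*m, the rounded value quoted in (1.7)

def length_scale_m(mass_ev):
    """Length scale lambda = h/(m c) in metres for a mass (energy) given in eV."""
    return HC_EV_M / mass_ev

# A few masses from the text, converted to eV.
masses_ev = {
    "electron": 511e3,
    "muon": 106e6,
    "top quark": 175e9,
    "LHC energy": 14e12,
}

for name, mass in masses_ev.items():
    print(f"{name:12s} {length_scale_m(mass):.1e} m")
```

For the electron this gives roughly $2.4 \cdot 10^{-12}$ m, in line with the rounded value quoted in the text.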
Throughout this course we will refer to the dimension of a quantity, meaning the mass
dimension. Newton's constant, e.g., has $[G_N] = -2$ and defines a mass scale
$$ G_N = M_P^{-2} , \tag{1.8} $$
where $M_P \simeq 10^{19}$ GeV is the Planck scale. This energy corresponds to a length scale $L_P \simeq
10^{-35}$ m, the Planck length. The Planck length is believed to be the smallest length scale that
makes sense: beyond this scale quantum gravity effects are likely to become important and
it's no longer clear that the concept of space-time can be applied. The largest length scale we
can talk of is the size of the cosmological horizon, roughly $10^{60}\, L_P$.
A number of masses relevant to particle physics and cosmology, together with the corresponding
length scales, are shown in Table 1. Let me go through the list and spend some words on the most
important quantities. After the size of the observable universe, the first scale we encounter is
the cosmological constant ($\Lambda$), measured to be around $10^{-3}$ eV. Since nobody can really explain
why the cosmological constant has this particular value, let's forget about it real quick and
turn our attention to the masses of the known elementary particles. These range from less
than 1 eV for the neutrinos ($\nu$'s) to around 175 GeV for the top quark ($t$). The (in)famous
Higgs boson ($h$), which is the only not yet observed ingredient of the standard model (SM) of
elementary particle physics, is believed to weigh in at about 100 to 200 GeV. For scales around
1 TeV, i.e., the "terascale", the predictive power of the SM is expected to break down. This is
precisely the energy regime that the Large Hadron Collider (LHC) at CERN in Geneva has
started to explore, having a design center-of-mass energy of 14 TeV. Beyond the electroweak
scale ($v$) of around 250 GeV, again nobody knows with certainty what is going on. One could
find a plethora of new (elementary) particles or a "great desert". There are experimental
hints that the coupling constants of electromagnetism, and the weak and strong forces unify
at around $M_{\rm GUT} = 10^{16}$ GeV, i.e., the grand unification scale (GUT). Everything is topped off
at the Planck scale where a QFT description might no longer be possible and a quantum theory
including the effects of gravity is needed to describe the physics of fundamental interactions.
The most likely possibility for such a theory seems to be some kind of string theory. But
also many other ideas such as loop quantum gravity, Hořava-Lifshitz gravity, etc. exist. In
fact, the "theory of everything" (TOE) could also be a QFT, but one in which the finite or
infinite number of renormalized couplings do not run off to infinity with increasing energy,
but hit a fixed point of the renormalization group equation. This possibility goes by the
name of asymptotic safety. Don't worry if you haven't understood a single word of what I have
mumbled about possible TOEs. All this is way too advanced to be covered in this course. I
only mentioned it to make propaganda for the research of Joe Conlon (string theory), Andre
Lukas (string theory), and John Wheater (quantum gravity), who work on such theories
here in Oxford. Ask them if you want to know more about it.

Footnote 3: The whole point of units is that you can choose whatever units are most convenient!

Quantity                        Mass                 Length
Observable universe             10^{-33} eV          10^{27} m (about 2 x 10^{10} ly)
Cosmological constant (Lambda)  10^{-3} eV           10^{-3} m
Neutrinos (nu's)                <~ 1 eV              >~ 10^{-6} m
Electron (e^-)                  511 keV              2 x 10^{-12} m
Muon (mu^-)                     106 MeV              10^{-14} m
Charm quark (c)                 1.3 GeV              10^{-15} m
Tau (tau^-)                     1.78 GeV             7 x 10^{-16} m
Bottom quark (b)                4.6 GeV              3 x 10^{-16} m
Top quark (t)                   175 GeV              7 x 10^{-18} m
Higgs boson (h)                 [100, 200] GeV       [6, 12] x 10^{-18} m
Electroweak scale (v)           250 GeV              5 x 10^{-18} m
LHC energy                      14 TeV               9 x 10^{-20} m
GUT scale (M_GUT)               10^{16} GeV          10^{-31} m
Planck scale (M_P)              10^{19} GeV          10^{-35} m

Table 1: An assortment of masses and corresponding length scales that appear in the
context of particle physics and cosmology.

References
[1] S. Weinberg, "What is quantum field theory, and what did we think it was?", arXiv:hep-th/9702027.
[2] S. Weinberg, "The Search for Unity: Notes for a History of Quantum Field Theory",
Daedalus, Vol. 106, No. 4, Discoveries and Interpretations: Studies in Contemporary
Scholarship, Volume II (1977), 17 p.
[3] Chapter 1 of S. Weinberg, The Quantum Theory of Fields. Vol. 1: Foundations, Cambridge, UK, Univ. Pr. (1995), 609 p.
[4] F. Wilczek, Rev. Mod. Phys. 71, S85 (1999) [arXiv:hep-th/9803075].

2 Elements of Classical Field Theory

In this second section we will discuss various aspects of classical fields. We will cover only
the bare minimum ground necessary before turning to the quantum theory, and will return
to classical field theory at several later stages in this course when we need to introduce new
concepts or ideas.

2.1

Dynamics of Fields

A field is a quantity defined at every space-time point $x = (t, \mathbf{x})$. While classical particle
mechanics deals with a finite number of generalized coordinates $q_a(t)$, indexed by a label $a$, in
field theory we are interested in the dynamics of fields
$$ \phi_a(t, \mathbf{x}) , \tag{2.1} $$
where both $a$ and $\mathbf{x}$ are considered as labels. We are hence dealing with an infinite number of
degrees of freedom (dofs), at least one for each point $\mathbf{x}$ in space. Notice that the concept of
position has been relegated from a dynamical variable in particle mechanics to a mere label
in field theory.
Lagrangian and Action
The dynamics of the fields is governed by the Lagrangian. In all the systems we will study
in this course, the Lagrangian is a function of the fields $\phi_a$ and their derivatives $\partial_\mu \phi_a$ (footnote 4), and
given by
$$ L(t) = \int d^3x\; \mathcal{L}(\phi_a, \partial_\mu \phi_a) , \tag{2.2} $$
where the official name for $\mathcal{L}$ is Lagrangian density. Like everybody else we will, however, simply call it Lagrangian from now on. For any time interval $t \in [t_1, t_2]$, the action corresponding
to (2.2) reads
$$ S = \int_{t_1}^{t_2} dt \int d^3x\; \mathcal{L} = \int d^4x\; \mathcal{L} . \tag{2.3} $$
Recall that in classical mechanics $L$ depends only on $q_a$ and $\dot{q}_a$, but not on the second time
derivatives of the generalized coordinates. In field theory we similarly restrict to Lagrangians
$\mathcal{L}$ depending on $\phi_a$ and $\dot{\phi}_a$. Furthermore, with an eye on Lorentz invariance, we will only
consider Lagrangians depending on $\nabla \phi_a$ and not higher derivatives.
Notice that since we have set $\hbar = 1$, using the convention described in Section 1.2, the
dimension of the action is $[S] = 0$. With (2.3) and $[d^4x] = -4$, it follows that the Lagrangian
must necessarily have $[\mathcal{L}] = 4$. Other objects that we will use frequently to construct Lagrangians are derivatives, masses, couplings, and most importantly fields. The dimensions of
the former two objects are $[\partial_\mu] = 1$ and $[m] = 1$, while the dimensions of the latter two quantities depend on the specific type of coupling and field one considers. We therefore postpone
the discussion of the mass dimension of couplings and fields to the point when we meet the
relevant building blocks.

Footnote 4: If there is no (or only little) room for confusion, we will often drop the arguments of functions and write
$\phi_a = \phi_a(x)$ etc. to keep the notation short.
Principle of Least Action
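The dynamical behavior of fields can be determined by the principle of least action. This
principle states that when a system evolves from one given configuration to another between
times $t_1$ and $t_2$ it does so along the path in configuration space for which the action is an
extremum (usually a minimum) and hence satisfies $\delta S = 0$. This condition can be rewritten,
using partial integration, as follows
$$
\begin{aligned}
\delta S &= \int d^4x \left[ \frac{\partial \mathcal{L}}{\partial \phi_a}\, \delta\phi_a + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta(\partial_\mu \phi_a) \right] \\
&= \int d^4x \left\{ \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) \right] \delta\phi_a + \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta\phi_a \right) \right\} = 0 .
\end{aligned}
\tag{2.4}
$$
The last term is a total derivative and vanishes for any $\delta\phi_a$ that decays at spatial infinity and
obeys $\delta\phi_a(t_1, \mathbf{x}) = \delta\phi_a(t_2, \mathbf{x}) = 0$. For all such paths, we obtain the Euler-Lagrange equations
of motion (EOMs) for the fields $\phi_a$, namely
$$ \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) - \frac{\partial \mathcal{L}}{\partial \phi_a} = 0 . \tag{2.5} $$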
Hamiltonian Formalism
The link between the Lagrangian formalism and the quantum theory goes via the path integral.
While this is a powerful formalism, we will for the time being use canonical quantization, since
it makes the transition to QM easier. For this we need the Hamiltonian formalism of field
theory. We start by defining the momentum density $\pi^a(x)$ conjugate to $\phi_a(x)$,
$$ \pi^a = \frac{\partial \mathcal{L}}{\partial \dot{\phi}_a} . \tag{2.6} $$
In terms of $\pi^a$, $\dot{\phi}_a$, and $\mathcal{L}$ the Hamiltonian density is given by
$$ \mathcal{H} = \pi^a \dot{\phi}_a - \mathcal{L} , \tag{2.7} $$
where, as in classical mechanics, we have eliminated $\dot{\phi}_a$ in favor of $\pi^a$ everywhere in $\mathcal{H}$. The
Hamiltonian then simply takes the form
$$ H = \int d^3x\; \mathcal{H} . \tag{2.8} $$

2.2

Noether's Theorem

The role of symmetries in field theory is possibly even more important than in particle mechanics. There are Lorentz symmetry, internal symmetries, gauge symmetries, supersymmetries,
etc. We start here by recasting Noether's theorem in a field theoretic framework.

Currents and Charges
Noether's theorem states that every continuous symmetry of the Lagrangian gives rise to a
conserved current $J^\mu(x)$, so that the EOMs (2.5) imply
$$ \partial_\mu J^\mu = 0 , \tag{2.9} $$
or in components $dJ^0/dt + \nabla \cdot \mathbf{J} = 0$. To every conserved current there exists also a conserved
(global) charge $Q$, i.e., a physical quantity which retains the same value at all times, defined as
$$ Q = \int_{\mathbb{R}^3} d^3x\; J^0 . \tag{2.10} $$
The latter statement is readily shown by taking the time derivative of $Q$,
$$ \frac{dQ}{dt} = \int_{\mathbb{R}^3} d^3x\; \frac{dJ^0}{dt} = -\int_{\mathbb{R}^3} d^3x\; \nabla \cdot \mathbf{J} , \tag{2.11} $$
which is zero, if one assumes that $\mathbf{J}$ falls off sufficiently fast as $|\mathbf{x}| \to \infty$. Notice, however, that
the existence of the conserved current $J^\mu$ is much stronger than the existence of the (global)
charge $Q$, because it implies that charge is in fact conserved locally. To see this, we define the
charge in a finite volume $V$ by
$$ Q_V = \int_V d^3x\; J^0 . \tag{2.12} $$
Repeating the above analysis, we find
$$ \frac{dQ_V}{dt} = -\int_V d^3x\; \nabla \cdot \mathbf{J} = -\oint_S d\mathbf{S} \cdot \mathbf{J} , \tag{2.13} $$
where $S$ denotes the area bounding $V$, $d\mathbf{S}$ is a shorthand for $\mathbf{n}\, dS$ with $\mathbf{n}$ being the outward
pointing unit normal vector of the boundary $S$, and we have used Gauss's theorem. In physical
terms the result means that any charge leaving $V$ must be accounted for by a flow of the
current 3-vector $\mathbf{J}$ out of the volume. This kind of local conservation law of charge holds in
any local field theory.
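The local conservation law (2.13) can be illustrated with a toy model. The sketch below is a one-dimensional analogue under stated assumptions: the Gaussian charge density, the drift velocity `V`, and all function names are hypothetical choices made for illustration. The density $J^0 = \rho(x - Vt)$ and current $J^1 = V\rho$ satisfy $\partial_t J^0 + \partial_x J^1 = 0$, so the rate of change of the charge in an interval $[a, b]$ should equal minus the net current flowing out through its end points.

```python
import math

V = 0.5  # drift velocity of the charge profile (assumed value)

def rho(t, x):
    """Charge density J^0: a Gaussian profile moving rigidly with velocity V."""
    return math.exp(-(x - V * t) ** 2)

def current(t, x):
    """Current J^1 = V * rho, so that d(rho)/dt + d(J^1)/dx = 0."""
    return V * rho(t, x)

def charge_in_interval(t, a, b, n=2000):
    """Trapezoidal estimate of Q_V(t), the integral of rho over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (rho(t, a) + rho(t, b))
    for i in range(1, n):
        total += rho(t, a + i * h)
    return total * h

def charge_rate(t, a, b, dt=1e-4):
    """Central finite-difference estimate of dQ_V/dt."""
    return (charge_in_interval(t + dt, a, b) - charge_in_interval(t - dt, a, b)) / (2 * dt)

def net_outflow(t, a, b):
    """One-dimensional version of the surface term: J(b) - J(a)."""
    return current(t, b) - current(t, a)

t0, a, b = 0.3, -1.0, 0.5
print(charge_rate(t0, a, b), -net_outflow(t0, a, b))  # the two numbers should agree
```

In one dimension the surface integral in (2.13) reduces to the difference of the current at the two end points, which is what `net_outflow` computes.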
Proof of Theorem
In order to prove Noether's theorem, we'll consider infinitesimal transformations. This is
always possible in the case of a continuous symmetry. We say that $\delta\phi_a$ is a symmetry of the
theory, if the Lagrangian changes by a total derivative
$$ \delta\mathcal{L}(\phi_a) = \partial_\mu \mathcal{J}^\mu(\phi_a) , \tag{2.14} $$
for a set of functions $\mathcal{J}^\mu$. We then consider the transformation of $\mathcal{L}$ under an arbitrary change
of field $\delta\phi_a$. Glancing at (2.4) tells us that in this case
$$ \delta\mathcal{L} = \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) \right] \delta\phi_a + \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta\phi_a \right) . \tag{2.15} $$
When the EOMs are satisfied, the term in square brackets vanishes so that we are simply
left with the total derivative term. For a symmetry transformation satisfying (2.14),
the relation (2.15) hence takes the form
$$ \partial_\mu \mathcal{J}^\mu = \delta\mathcal{L} = \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta\phi_a \right) , \tag{2.16} $$
or simply $\partial_\mu J^\mu = 0$ with
$$ J^\mu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta\phi_a - \mathcal{J}^\mu , \tag{2.17} $$
which completes the proof. Notice that if the Lagrangian is invariant under the infinitesimal
transformation $\delta\phi_a$, i.e., $\delta\mathcal{L} = 0$, then $\mathcal{J}^\mu = 0$ and $J^\mu$ contains only the first term on the
right-hand side of (2.17).
We stress that our proof only goes through for continuous transformations for which
there exists a choice of the transformation parameters resulting in a unit transformation, i.e.,
no transformation. An example is a Lorentz boost with some velocity $\mathbf{v}$, where for $\mathbf{v} = 0$
the coordinates $x$ remain unchanged. There are examples of symmetry transformations where
this does not occur. E.g., a parity transformation does not have this property, and Noether's
theorem is not applicable then.
Energy-Momentum Tensor
Recall that in classical particle mechanics, spatial translation invariance gives rise to the conservation of momentum, while invariance under time translations is responsible for the conservation of energy. What happens in classical field theory? To figure it out, let's have a look
at infinitesimal translations
$$ x^\nu \to x^\nu - \epsilon^\nu \quad \Longrightarrow \quad \phi_a(x) \to \phi_a(x + \epsilon) = \phi_a(x) + \epsilon^\nu \partial_\nu \phi_a(x) , \tag{2.18} $$
where the sign in the field transformation is plus, instead of minus, because we are doing an
active, as opposed to passive, transformation. If the Lagrangian does not explicitly depend
on $x$ but only through $\phi_a(x)$ (which will always be the case in the Lagrangians discussed in
this course), the Lagrangian transforms under the infinitesimal translation as
$$ \mathcal{L} \to \mathcal{L} + \epsilon^\nu \partial_\nu \mathcal{L} . \tag{2.19} $$
Since the change in $\mathcal{L}$ is a total derivative, we can invoke Noether's theorem which gives
us four conserved currents $T^{\mu\nu} = (J^\mu)^\nu$, one for each of the translations $\epsilon^\nu$ ($\nu = 0, 1, 2, 3$).
From (2.18) we readily read off the explicit expressions for $T^{\mu\nu}$,
$$ T^{\mu\nu} = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \partial^\nu \phi_a - \eta^{\mu\nu} \mathcal{L} . \tag{2.20} $$
This quantity is called the energy-momentum (or stress-energy) tensor. It has dimension
$[T^{\mu\nu}] = 4$ and satisfies
$$ \partial_\mu T^{\mu\nu} = 0 . \tag{2.21} $$
The four conserved charges are ($\nu = 0, 1, 2, 3$)
$$ P^\nu = \int d^3x\; T^{0\nu} . \tag{2.22} $$
Specifically, the time component of $P^\nu$ is
$$ P^0 = \int d^3x\; T^{00} = \int d^3x \left( \pi^a \dot{\phi}_a - \mathcal{L} \right) , \tag{2.23} $$
which (looking at (2.7) and (2.8)) is nothing but the Hamiltonian $H$. We thus conclude that
the charge $P^0$ is the total energy of the field configuration, and it is conserved. In field theory,
energy conservation is thus a pure consequence of time translation symmetry, like it was in
particle mechanics. Similarly, we can identify the charges $P^i$ ($i = 1, 2, 3$),
$$ P^i = \int d^3x\; T^{0i} = \int d^3x\; \pi^a\, \partial^i \phi_a , \tag{2.24} $$
as the momentum components of the field configuration in the three space directions, and they
are of course also conserved.
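As a short worked illustration of (2.20), (2.23), and (2.24), consider a single real scalar field with the Lagrangian $\mathcal{L} = \frac{1}{2} (\partial_\mu \phi)^2 - \frac{1}{2} m^2 \phi^2$ (the example Lagrangian that appears in Section 2.4). The steps below are a sketch under that assumption, using the metric $\eta^{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$.

```latex
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} &= \partial^\mu \phi ,
\qquad \pi = \frac{\partial \mathcal{L}}{\partial \dot{\phi}} = \dot{\phi} , \\
T^{\mu\nu} &= \partial^\mu \phi\, \partial^\nu \phi - \eta^{\mu\nu} \mathcal{L} , \\
T^{00} &= \dot{\phi}^2 - \mathcal{L}
        = \tfrac{1}{2} \dot{\phi}^2 + \tfrac{1}{2} (\nabla \phi)^2 + \tfrac{1}{2} m^2 \phi^2 , \\
T^{0i} &= \dot{\phi}\, \partial^i \phi = \pi\, \partial^i \phi .
\end{aligned}
```

In this example $T^{00}$ is a sum of squares and therefore non-negative, and $T^{\mu\nu}$ happens to be symmetric in $\mu$ and $\nu$.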

2.3

Example: Electrodynamics

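As a simple application of the formalism we have developed so far in this section, let us try
to derive Maxwell's equations using the field theory formulation. In terms of the electric and
magnetic fields $\mathbf{E}$ and $\mathbf{B}$ and the charge density $\rho$ and 3-vector current $\mathbf{j}$, these equations
take the well-known form
$$ \nabla \cdot \mathbf{B} = 0 , \tag{2.25} $$
$$ \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0 , \tag{2.26} $$
$$ \nabla \cdot \mathbf{E} = \rho , \tag{2.27} $$
$$ \nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = \mathbf{j} . \tag{2.28} $$
The $\mathbf{E}$ and $\mathbf{B}$ fields are spatial 3-vectors and can be expressed in terms of the components
of the 4-vector field $A^\mu = (\phi, \mathbf{A})$ by
$$ \mathbf{E} = -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t} , \qquad \mathbf{B} = \nabla \times \mathbf{A} . \tag{2.29} $$
This definition ensures that the two homogeneous Maxwell equations (2.25) and (2.26)
are automatically satisfied,
$$ \nabla \cdot (\nabla \times \mathbf{A}) = \epsilon^{ijk}\, \partial_i \partial_j A^k = 0 , \tag{2.30} $$
$$ \nabla \times \left( -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t} \right) + \frac{\partial}{\partial t} (\nabla \times \mathbf{A}) = -\nabla \times (\nabla \phi) = -\epsilon^{ijk}\, \partial_j \partial_k \phi = 0 . \tag{2.31} $$
Here $\epsilon^{ijk}$ is the fully antisymmetric Levi-Civita tensor with $\epsilon^{123} = \epsilon_{123} = +1$ and the indices
$i, j, k = 1, 2, 3$ are summed over if they appear twice.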
The remaining two inhomogeneous Maxwell equations (2.27) and (2.28) follow from the
Lagrangian
$$ \mathcal{L} = -\frac{1}{2} (\partial_\mu A_\nu)(\partial^\mu A^\nu) + \frac{1}{2} (\partial_\mu A^\mu)^2 - A_\mu J^\mu , \tag{2.32} $$
with $J^\mu = (\rho, \mathbf{j})$. From the rules presented in Section 2.1, we gather that the dimension of the
vector field and current is $[A^\mu] = 1$ and $[J^\mu] = 3$, respectively. The funny minus sign of the first
term on the right-hand side is required to ensure that the kinetic term $\frac{1}{2} \dot{A}_i^2$ is positive using
the Minkowski metric. Notice also that the Lagrangian (2.32) has no kinetic term $\frac{1}{2} \dot{A}_0^2$ and
hence $A_0$ is not dynamical. Why this is and necessarily has to be the case will only become
fully clear if you attend the advanced QFT course. Yet, we can already get an idea what
is going on by remembering that the photon (the quantum of electrodynamics) has only two
polarization states, i.e., two physical dofs, while the massless vector field $A^\mu$ has obviously four
dofs. The fact that the time component $A_0$ is not dynamical reduces the number of independent
dofs in $A^\mu$ from four to three. But this is still one too many. The last unwanted dof can be
gauged away using the gauge symmetry of the quantum version of electromagnetism, aka
quantum electrodynamics (QED).
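Enough said, let's do serious business and compute something. To see that the statement
made before (2.32) is indeed correct, we first evaluate
$$ \frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} = -\partial^\mu A^\nu + (\partial_\rho A^\rho)\, \eta^{\mu\nu} , \qquad \frac{\partial \mathcal{L}}{\partial A_\nu} = -J^\nu , \tag{2.33} $$
from which we derive the EOMs,
$$ 0 = \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} \right) - \frac{\partial \mathcal{L}}{\partial A_\nu} = -\partial^2 A^\nu + \partial^\nu (\partial_\rho A^\rho) + J^\nu = -\partial_\mu \left( \partial^\mu A^\nu - \partial^\nu A^\mu \right) + J^\nu . \tag{2.34} $$
Introducing now the field-strength tensor
$$ F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu , \tag{2.35} $$
we can write (2.32) and (2.34) quite compactly,
$$ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - J_\mu A^\mu , \tag{2.36} $$
$$ \partial_\mu F^{\mu\nu} = J^\nu . \tag{2.37} $$
Does this look familiar? I hope so. Notice that $[F^{\mu\nu}] = 2$. In order to see that (2.37) indeed
captures the physics of (2.27) and (2.28), we compute the components of $F^{\mu\nu}$. We find
$$
\begin{aligned}
F^{0i} &= -F^{i0} = \partial^0 A^i - \partial^i A^0 = \left( \frac{\partial \mathbf{A}}{\partial t} + \nabla \phi \right)^i = -E^i , \\
F^{ij} &= -F^{ji} = \partial^i A^j - \partial^j A^i = -\epsilon^{ijk} B^k ,
\end{aligned}
\tag{2.38}
$$
while all other components are zero. With this in hand, we then obtain from $\partial_\mu F^{\mu 0} = \rho$ and
$\partial_\mu F^{\mu 1} = j^1$,
$$
\begin{aligned}
\partial_\mu F^{\mu 0} &= \partial_0 F^{00} + \partial_i F^{i0} = \nabla \cdot \mathbf{E} = \rho , \\
\partial_\mu F^{\mu 1} &= \partial_0 F^{01} + \partial_i F^{i1} = -\frac{\partial E^1}{\partial t} + \frac{\partial B^3}{\partial x^2} - \frac{\partial B^2}{\partial x^3} = \left( \nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} \right)^1 = j^1 .
\end{aligned}
\tag{2.39}
$$
Similar relations hold for the remaining components $i = 2, 3$. Taken together this proves the
second inhomogeneous Maxwell equation (2.28).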
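The component formulas (2.38) are easy to turn into a small consistency check. The following sketch (function names and sample field values are illustrative assumptions) builds the matrix $F^{\mu\nu}$ from given $\mathbf{E}$ and $\mathbf{B}$, and evaluates the contraction $F_{\mu\nu} F^{\mu\nu}$ that enters (2.36), which should equal $2(\mathbf{B}^2 - \mathbf{E}^2)$.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}, standing for 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def field_strength(E, B):
    """F^{mu nu} as a 4x4 nested list, using F^{0i} = -E^i and F^{ij} = -eps^{ijk} B^k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = -E[i]
        F[i + 1][0] = E[i]
        for j in range(3):
            F[i + 1][j + 1] = -sum(levi_civita(i, j, k) * B[k] for k in range(3))
    return F

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric

def invariant(F):
    """F_{mu nu} F^{mu nu}; indices are lowered with the diagonal metric."""
    return sum(ETA[m] * ETA[n] * F[m][n] ** 2 for m in range(4) for n in range(4))

# Illustrative (assumed) field values.
E = [1.0, 2.0, 3.0]
B = [0.5, -1.0, 2.0]
F = field_strength(E, B)

E2 = sum(e * e for e in E)
B2 = sum(b * b for b in B)
print(invariant(F), 2 * (B2 - E2))  # both expressions should coincide
```

With this contraction the Lagrangian (2.36) without sources becomes $\frac{1}{2} (\mathbf{E}^2 - \mathbf{B}^2)$.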
Let me also derive the energy-momentum tensor T of electrodynamics, ignoring for the
moment the source term A J . Using (2.33) one finds
T = ( A )( A ) ( A )( A ) +

1
F F .
4

(2.40)

Notice that the first term in (2.40) is not symmetric, which implies that T 6= T . In fact,
this is not really surprising since the definition of the energy-momentum tensor (2.20) does
not exhibit an explicit symmetry in the indices and . Nevertheless, there is typically a way
to massage the energy-momentum tensor of any theory into a symmetric form.5 To learn how
this can be done in the case under consideration is the objective of a homework problem.

2.4

Space-Time Symmetries

One of the main motivations to develop QFT is to reconcile QM with special relativity. We
thus want to construct field theories in which space and time are placed on an equal footing
and the theory is invariant under Lorentz transformations,
x (x0 ) = x ,

(2.41)

= ,

(2.42)

with
so that the distance ds2 = dx dx is preserved. Here = = diag (1, 1, 1, 1)
denotes the Minkowski metric. E.g., a rotation by the angle about the z-axis, and a boost
by v < 1 along the x-axis are respectively described by the following Lorentz transformations

1
0
0
0
v 0 0
0 cos sin 0
v 0 0

=
=
(2.43)
,
,
0 sin cos 0
0
0 1 0
0
0
0
1
0
0 0 1

with $\gamma = 1/\sqrt{1 - v^2}$. The Lorentz transformations form a Lie group under matrix multiplication.
You can learn more about this if you attend the lecture course on group theory held
by Andre Lukas. Alternatively, you can study the group theory crash course written by
Martin Bauer (a PhD student at Mainz University). It can be found on my Oxford homepage.

5 One (but not the only) reason that you might want to have a symmetric energy-momentum tensor $T^{\mu\nu}$
is to make contact with general relativity, since such an object sits on the right-hand side of Einstein's field
equations.
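The defining condition (2.42) can be checked numerically for the rotation and the boost of (2.43). The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$ and the sign conventions $-\sin\theta$ and $-\gamma v$ in the upper off-diagonal entries; the helper names (`rotation_z`, `boost_x`, `preserves_metric`) are illustrative and not part of the lecture.

```python
import math

# Minkowski metric, signature (+, -, -, -) (assumed convention)
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rotation_z(theta):
    """Rotation by the angle theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def boost_x(v):
    """Boost with velocity v (|v| < 1, units with c = 1) along the x-axis."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0, 0], [-g * v, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def preserves_metric(lam, tol=1e-12):
    """Check Lambda eta Lambda^T = eta, the matrix form of (2.42)."""
    lhs = matmul(matmul(lam, ETA), transpose(lam))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

if __name__ == "__main__":
    print(preserves_metric(rotation_z(0.3)))
    print(preserves_metric(boost_x(0.6)))
    # The product of two Lorentz transformations is again one (group property).
    print(preserves_metric(matmul(boost_x(0.6), rotation_z(0.3))))
```

A matrix that rescales time, such as diag(2, 1, 1, 1), fails the same check, which is a quick way to convince oneself that the test is not trivially satisfied.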
The various fields belong to different representations of the Lorentz group. The simplest
example is the scalar field $\phi$, which under the Lorentz transformation $x \to \Lambda x$,$^6$ transforms as
\[
\phi(x) \to \phi'(x) = \phi(\Lambda^{-1} x) \,.
\tag{2.44}
\]
The inverse $\Lambda^{-1}$ appears in the argument because we are dealing with an active transformation,
in which the field is truly shifted. To see why this means that the inverse appears, it will suffice
to consider a non-relativistic example such as a temperature field. Suppose we start with an
initial field $\phi(\mathbf{x})$ which has a hotspot at, say, $\mathbf{x} = (1, 0, 0)$. Let's now make a rotation $\mathbf{x} \to R\mathbf{x}$
about the z-axis so that the hotspot ends up at $\mathbf{x} = (0, 1, 0)$. If we want to express the new
field $\phi'(\mathbf{x})$ in terms of the old field $\phi(\mathbf{x})$, we have to place ourselves at $\mathbf{x} = (0, 1, 0)$ and ask
what the old field looked like at the point $R^{-1}\mathbf{x} = (1, 0, 0)$ we came from. This $R^{-1}$ is the
origin of the $\Lambda^{-1}$ factor in the argument of the transformed field in (2.44).
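The hotspot argument can be played through numerically. The sketch below is only an illustration: the Gaussian profile, its width and the function names are assumptions made for the example. It rotates the field actively by 90 degrees about the z-axis and evaluates the new field through the old one at the rotated-back point.

```python
import math

def temperature(x, y):
    """Old field: a narrow Gaussian hotspot centred at (x, y) = (1, 0) (assumed profile)."""
    return math.exp(-((x - 1.0) ** 2 + y ** 2) / 0.1)

def rotate(theta, x, y):
    """Rotate the point (x, y) by the angle theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * y, s * x + c * y

def transformed_temperature(theta, x, y):
    """New field phi'(x) = phi(R^{-1} x); R^{-1} is the rotation by -theta."""
    x_old, y_old = rotate(-theta, x, y)
    return temperature(x_old, y_old)

if __name__ == "__main__":
    quarter_turn = math.pi / 2
    # After the active rotation the hotspot sits at (0, 1) ...
    print(transformed_temperature(quarter_turn, 0.0, 1.0))
    # ... and the old location (1, 0) is cold.
    print(transformed_temperature(quarter_turn, 1.0, 0.0))
```

Using $R$ instead of $R^{-1}$ in `transformed_temperature` would move the hotspot to $(0, -1, 0)$, i.e., it would rotate the field in the wrong direction.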
The Lagrangian formulation of field theory makes it especially easy to discuss Lorentz
invariance, since an EOM is automatically Lorentz invariant if it follows from a Lagrangian
that is a Lorentz scalar. This is an immediate consequence of the principle of least action. If a
Lorentz transformation leaves the Lagrangian unchanged, the transformation of an extremum
in the action will be another extremum. To give an example, let's look at the following
Lagrangian
\[
\mathcal{L} = \frac{1}{2} (\partial_\mu \phi)^2 - \frac{1}{2} m^2 \phi^2 \,,
\tag{2.45}
\]
where $\phi$ is a real scalar and, as we will see later, $m$ is the mass of $\phi$ (for now just think
of $m$ as a parameter). Obviously, the mass dimension of the field is $[\phi] = 1$. You will show in a
homework assignment that the EOM corresponding to (2.45) takes the form
\[
\partial_\mu \partial^\mu \phi + m^2 \phi = 0 \,.
\tag{2.46}
\]
This equation is the famous Klein-Gordon equation. The Laplacian in Minkowski space is
sometimes denoted by $\Box$. In this notation, the Klein-Gordon equation reads $(\Box + m^2)\, \phi = 0$.
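A quick numerical sanity check of the Klein-Gordon equation: a plane wave solves it only if frequency and wave number obey $\omega^2 = k^2 + m^2$. The sketch below works in 1+1 dimensions for brevity and approximates the second derivatives by central finite differences; the step size `h` and the function names are choices made for this illustration.

```python
import math

def plane_wave(t, x, k, omega):
    """Real plane wave cos(omega t - k x) in 1+1 dimensions."""
    return math.cos(omega * t - k * x)

def kg_residual(t, x, k, m, omega=None, h=1e-3):
    """Finite-difference value of (d^2/dt^2 - d^2/dx^2 + m^2) phi at (t, x).

    If omega is not given, the on-shell value sqrt(k^2 + m^2) is used.
    """
    if omega is None:
        omega = math.sqrt(k * k + m * m)
    f = lambda tt, xx: plane_wave(tt, xx, k, omega)
    d2t = (f(t + h, x) - 2.0 * f(t, x) + f(t - h, x)) / h ** 2
    d2x = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / h ** 2
    return d2t - d2x + m * m * f(t, x)

if __name__ == "__main__":
    # On-shell wave: residual is zero up to discretization error.
    print(kg_residual(0.4, -0.2, k=1.3, m=0.7))
    # Off-shell frequency: the residual is of order one.
    print(kg_residual(0.4, -0.2, k=1.3, m=0.7, omega=1.0))
```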
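Let us first check that a Lorentz transformation leaves the Lagrangian (2.45) and its action invariant.
According to (2.44), the mass term transforms as $\frac{1}{2} m^2 \phi^2(x) \to \frac{1}{2} m^2 \phi^2(x')$
with $x' = \Lambda^{-1} x$. The transformation of $\partial_\mu \phi$ is
\[
(\partial_\mu \phi)(x) \to \partial_\mu (\phi(x')) = (\Lambda^{-1})^\nu{}_\mu \, (\partial_\nu \phi)(x') \,.
\tag{2.47}
\]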

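Using (2.42), which also holds for $\Lambda^{-1}$, we thus find that the derivative term in the Klein-Gordon
Lagrangian behaves as
\[
\begin{aligned}
\frac{1}{2} (\partial_\mu \phi(x))^2 &\to \frac{1}{2} \, \eta^{\mu\nu} (\Lambda^{-1})^\rho{}_\mu (\partial_\rho \phi)(x') \, (\Lambda^{-1})^\sigma{}_\nu (\partial_\sigma \phi)(x') \\
&= \frac{1}{2} \, \eta^{\rho\sigma} (\partial_\rho \phi)(x') (\partial_\sigma \phi)(x') \\
&= \frac{1}{2} (\partial_\mu \phi(x'))^2 \,,
\end{aligned}
\tag{2.48}
\]
under the Lorentz transformation $\Lambda$. Putting things together, we find that the action of the
Klein-Gordon theory is indeed Lorentz invariant,
\[
S = \int d^4x \, \mathcal{L}(x) \to \int d^4x \, \mathcal{L}(x') = \int d^4x' \, \mathcal{L}(x') = S \,.
\tag{2.49}
\]
Notice that changing the integration variables from $d^4x$ to $d^4x'$ in principle introduces a Jacobian
factor $\det(\Lambda)$. This factor is, however, equal to 1 for the Lorentz transformations connected
to the identity that we are dealing with.

6 To shorten the notation we will often use matrix notation and drop the indices $\mu$, $\nu$, etc.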
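A similar calculation shows that, as promised, the EOM of the Klein-Gordon field
is also invariant,
\[
\begin{aligned}
\left( \partial^2 + m^2 \right) \phi(x) &\to \left( \partial^2 + m^2 \right) \phi(x') \\
&= \left[ \eta^{\mu\nu} (\Lambda^{-1})^\rho{}_\mu \partial_\rho \, (\Lambda^{-1})^\sigma{}_\nu \partial_\sigma + m^2 \right] \phi(x') \\
&= \left( \eta^{\rho\sigma} \partial_\rho \partial_\sigma + m^2 \right) \phi(x') = 0 \,.
\end{aligned}
\tag{2.50}
\]
In the case of the Klein-Gordon theory, we hence conclude that the statements made before
(2.45) are indeed correct.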
Representations of Lorentz Group
The transformation law (2.44) is the simplest possible transformation law for a field. In fact,
it is the only possibility for a one-component field aka a real scalar. Yet, it is also clear that in
order to describe nature (think only about electromagnetism) we need multicomponent fields,
which have more complicated transformation properties. The most familiar case is that of a
vector field, such as the vector potential $A_\mu$, which we have already met in Section 2.3. In
this case the quantity that is distributed in space-time also carries an orientation which must
be rotated and/or boosted.
In fact, we will learn in this course that the Lorentz group has a variety of representations, corresponding to particles with integer (bosons) and half-integer spins (fermions) in
QFT. These representations are normally constructed out of spinors. To start this general
(and somewhat formal) discussion, let me examine the allowed possibilities for linear field
transformations
\[
\phi^a(x) \to \phi'^a(x') = M(\Lambda)^a{}_b \, \phi^b(x) \,,
\tag{2.51}
\]
under (2.41). The first important point to notice is that the Lorentz transformations form a
group. This means that two successive Lorentz transformations,
\[
x \to x' = \Lambda x \,, \qquad x' \to x'' = \Lambda' x' \,,
\tag{2.52}
\]
can also be described in terms of a single one
\[
x \to x'' = \Lambda'' x \,,
\tag{2.53}
\]
with
\[
\Lambda'' = \Lambda' \Lambda \,.
\tag{2.54}
\]

What happens to (2.51) under this set of Lorentz transformations? For $x \to x'' = \Lambda'' x$, we
have (in matrix notation)
\[
\phi(x) \to \phi''(x'') = M(\Lambda'') \, \phi(x) \,.
\tag{2.55}
\]
On the other hand, for $x \to x' = \Lambda x \to x'' = \Lambda' x'$, we get
\[
\phi(x) \to \phi''(x'') = M(\Lambda') \, \phi'(x') = M(\Lambda') M(\Lambda) \, \phi(x) \,.
\tag{2.56}
\]
In order for the last two equations to be consistent with each other, the field transformations
$M$ must obviously fulfill
\[
M(\Lambda' \Lambda) = M(\Lambda') M(\Lambda) \,.
\tag{2.57}
\]
In group theory terminology, this means that the matrices M furnish a representation of the
Lorentz group. Field Lorentz transformations are therefore not random, but they can be found
if we find all (finite dimensional) representations of the Lorentz group.
So what do the common representations of the Lorentz group look like and how do we get
all of them? While both questions will be answered in this lecture, I believe it is best to do
it case-by-case whenever we meet a new type of (quantum) field. Since we already talked
about the real scalar $\phi$ (Klein-Gordon field) and the vector $A_\mu$ (potential in electrodynamics),
it nevertheless makes sense to give the representations for these two types of fields already at
this point.
Since a scalar field by definition does not change under Lorentz transformations,
$\phi(x) \to \phi'(x') = \phi(x)$, the scalar representation of the Lorentz group is simply
\[
M(\Lambda) = 1 \,.
\tag{2.58}
\]
This was easy! The representation of the vector $A^\mu$ is also not difficult to figure out. Let me
for the time being only state the result. One finds
\[
M(\Lambda) = \Lambda \,,
\tag{2.59}
\]
which means that a vector field $A^\mu$ transforms under a Lorentz transformation as (restoring
indices)
\[
A^\mu(x) \to (A')^\mu(x') = \Lambda^\mu{}_\nu \, A^\nu(x) \,.
\tag{2.60}
\]
It is important to notice that the latter transformation property implies that any term built
out of $A_\mu$ and $\partial_\mu$, where all Lorentz indices are contracted, is invariant under Lorentz
transformations. As an exercise you are supposed to show this explicitly for terms like $\partial_\mu A^\mu$,
etc.
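Both statements, the representation property (2.57) for $M(\Lambda) = \Lambda$ and the invariance of fully contracted expressions, can be illustrated with a few lines of code. The sketch assumes the signature $(+,-,-,-)$ and uses boosts along the x-axis only; the component values of the test vectors are arbitrary and the helper names are illustrative.

```python
import math

ETA_DIAG = (1.0, -1.0, -1.0, -1.0)  # Minkowski metric, signature (+, -, -, -)

def boost_x(v):
    """Boost with velocity v (|v| < 1) along the x-axis."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0], [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(lam, vec):
    """Vector representation: A'^mu = Lambda^mu_nu A^nu."""
    return [sum(lam[m][n] * vec[n] for n in range(4)) for m in range(4)]

def dot(a, b):
    """Fully contracted product eta_{mu nu} A^mu B^nu."""
    return sum(ETA_DIAG[m] * a[m] * b[m] for m in range(4))

if __name__ == "__main__":
    lam1, lam2 = boost_x(0.6), boost_x(-0.3)
    a = [1.0, 0.2, -0.5, 0.3]
    b = [0.4, -1.1, 0.7, 2.0]
    # Contracted indices: the scalar product is unchanged by the boost.
    print(dot(a, b), dot(transform(lam1, a), transform(lam1, b)))
    # Representation property: acting twice equals acting with the product.
    print(transform(matmul(lam2, lam1), a))
    print(transform(lam2, transform(lam1, a)))
```

For collinear boosts the product is again a boost, with velocity $(v_1 + v_2)/(1 + v_1 v_2)$, which gives a simple closed-form check of (2.54).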
Angular Momentum
In classical particle mechanics, rotational invariance gives rise to conservation of angular
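momentum. What is the analogue in field theory? Moreover, we now have further Lorentz
transformations, namely boosts. What conserved quantity do they correspond to? In order to
address these questions, we first need the infinitesimal form of the Lorentz transformations
\[
\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu \,,
\tag{2.61}
\]
where $\omega^\mu{}_\nu$ is infinitesimal. The condition (2.42) for $\Lambda$ to be a Lorentz transformation becomes
in infinitesimal form
\[
\eta^{\mu\nu} = \left( \delta^\mu{}_\rho + \omega^\mu{}_\rho \right) \left( \delta^\nu{}_\sigma + \omega^\nu{}_\sigma \right) \eta^{\rho\sigma}
 = \eta^{\mu\nu} + \omega^{\mu\nu} + \omega^{\nu\mu} + O(\omega^2) \,,
\tag{2.62}
\]
which implies that $\omega^{\mu\nu}$ must be an antisymmetric matrix,
\[
\omega^{\mu\nu} = -\omega^{\nu\mu} \,.
\tag{2.63}
\]
Notice that an antisymmetric $4 \times 4$ matrix has six independent parameters, which agrees with
the number of different Lorentz transformations, i.e., three rotations and three boosts.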
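The counting and the first-order statement can be made concrete. The sketch below builds an antisymmetric $\omega^{\mu\nu}$ from six parameters, lowers the second index with the metric (signature $(+,-,-,-)$ assumed), and measures how strongly $\Lambda = 1 + \omega$ violates the finite condition (2.42). The violation should scale like the square of the small parameter; the parameter values and names are arbitrary choices for the example.

```python
ETA_DIAG = (1.0, -1.0, -1.0, -1.0)  # Minkowski metric, signature (+, -, -, -)

# The six independent index pairs: three boosts (0i) and three rotations (ij).
PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def omega_mixed(params, eps):
    """Return omega^mu_nu = omega^{mu rho} eta_{rho nu} for antisymmetric omega^{mu nu}."""
    upper = [[0.0] * 4 for _ in range(4)]
    for (m, n), p in zip(PAIRS, params):
        upper[m][n] = eps * p
        upper[n][m] = -eps * p
    return [[upper[m][n] * ETA_DIAG[n] for n in range(4)] for m in range(4)]

def metric_violation(params, eps):
    """Largest entry of Lambda eta Lambda^T - eta for Lambda = 1 + omega."""
    w = omega_mixed(params, eps)
    lam = [[(1.0 if m == n else 0.0) + w[m][n] for n in range(4)] for m in range(4)]
    worst = 0.0
    for m in range(4):
        for n in range(4):
            value = sum(lam[m][r] * ETA_DIAG[r] * lam[n][r] for r in range(4))
            value -= ETA_DIAG[m] if m == n else 0.0
            worst = max(worst, abs(value))
    return worst

if __name__ == "__main__":
    params = (0.3, -0.2, 0.5, 0.1, 0.4, -0.6)
    for eps in (1e-2, 1e-3):
        print(eps, metric_violation(params, eps))
```

Reducing `eps` by a factor of 10 reduces the violation by a factor of 100, as expected for an $O(\omega^2)$ remainder.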
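Applying the infinitesimal Lorentz transformation to our real scalar field $\phi$, we have
\[
\phi(x) \to \phi(x^\mu - \omega^\mu{}_\nu x^\nu) = \phi(x) - \omega^\mu{}_\nu x^\nu \, \partial_\mu \phi(x) \,,
\tag{2.64}
\]
where the minus sign arises from the factor $\Lambda^{-1}$ in (2.44). The variation of the field $\phi$ under
an infinitesimal Lorentz transformation is hence given by
\[
\delta\phi = -\omega^\mu{}_\nu x^\nu \, \partial_\mu \phi \,.
\tag{2.65}
\]
By the same line of reasoning, one shows that the variation of the Lagrangian is
\[
\delta\mathcal{L} = -\omega^\mu{}_\nu x^\nu \, \partial_\mu \mathcal{L} = -\partial_\mu \left( \omega^\mu{}_\nu x^\nu \mathcal{L} \right) ,
\tag{2.66}
\]
where in the last step we used the fact that $\omega^\mu{}_\mu = 0$ due to its antisymmetry. The Lagrangian
changes by a total derivative, so we can apply Noether's theorem (2.17) with $\mathcal{J}^\mu = -\omega^\mu{}_\nu x^\nu \mathcal{L}$
to find the conserved current,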
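\[
\begin{aligned}
J^\mu &= -\frac{\partial \mathcal{L}}{\partial (\partial_\mu \phi)} \, \omega^\rho{}_\nu x^\nu \, \partial_\rho \phi + \omega^\mu{}_\nu x^\nu \mathcal{L} \\
&= -\omega^\rho{}_\nu \left[ \frac{\partial \mathcal{L}}{\partial (\partial_\mu \phi)} \, \partial_\rho \phi - \delta^\mu{}_\rho \, \mathcal{L} \right] x^\nu
 = -\omega^\rho{}_\nu \, T^\mu{}_\rho \, x^\nu \,.
\end{aligned}
\tag{2.67}
\]
Stripping off $\omega^\rho{}_\nu$, we obtain six different currents, which we write as
\[
(J^\mu)^{\rho\sigma} = x^\rho \, T^{\mu\sigma} - x^\sigma \, T^{\mu\rho} \,.
\tag{2.68}
\]
These currents satisfy
\[
\partial_\mu (J^\mu)^{\rho\sigma} = 0 \,,
\tag{2.69}
\]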
and give (as usual) rise to six conserved charges. For $\rho, \sigma \neq 0$, the Lorentz transformation
is a rotation and the three conserved charges give the total angular momentum of the field
($i, j = 1, 2, 3$):
\[
Q^{ij} = \int d^3x \left( x^i \, T^{0j} - x^j \, T^{0i} \right) .
\tag{2.70}
\]
What about the boosts? In this case, the conserved charges are
\[
Q^{0i} = \int d^3x \left( x^0 \, T^{0i} - x^i \, T^{00} \right) .
\tag{2.71}
\]

The fact that these are conserved tells us that
\[
\begin{aligned}
0 = \frac{dQ^{0i}}{dt} &= \int d^3x \, T^{0i} + t \int d^3x \, \frac{dT^{0i}}{dt} - \frac{d}{dt} \int d^3x \, x^i \, T^{00} \\
&= P^i + t \, \frac{dP^i}{dt} - \frac{d}{dt} \int d^3x \, x^i \, T^{00} \,.
\end{aligned}
\tag{2.72}
\]
Yet, also the momentum $P^i$ is conserved, i.e., $dP^i/dt = 0$, and we conclude that
\[
\frac{d}{dt} \int d^3x \, x^i \, T^{00} = \mathrm{const.}
\tag{2.73}
\]
This is the statement that the center of energy of the field travels with a constant velocity. In
a sense it's a field theoretical version of Newton's first law but, rather surprisingly, appearing
here as a conservation law. Notice that after restoring the label $a$ our results for $(J^\mu)^{\rho\sigma}$ etc.
also apply in the case of multicomponent fields.
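Poincaré Invariance
We now require that a physical system possesses both space-time translation (2.18) and Lorentz
transformation symmetry (2.41). The symmetry group that includes both transformations is
called the Poincaré group. Notice that for any Poincaré-invariant theory the two charge
conservation equations (2.21) and (2.69) should hold. This is only possible if the
energy-momentum tensor $T^{\mu\nu}$ is symmetric. Indeed,
\[
\begin{aligned}
0 = \partial_\mu (J^\mu)^{\rho\sigma} &= \partial_\mu \left( x^\rho \, T^{\mu\sigma} - x^\sigma \, T^{\mu\rho} \right) \\
&= x^\rho \, \partial_\mu T^{\mu\sigma} + T^{\mu\sigma} \, \partial_\mu x^\rho - x^\sigma \, \partial_\mu T^{\mu\rho} - T^{\mu\rho} \, \partial_\mu x^\sigma \\
&= T^{\mu\sigma} \, \delta^\rho{}_\mu - T^{\mu\rho} \, \delta^\sigma{}_\mu = T^{\rho\sigma} - T^{\sigma\rho} \,.
\end{aligned}
\tag{2.74}
\]
Since Maxwell's theory is Poincaré invariant, this general result tells us that the expression
of the energy-momentum tensor in (2.40) can be made symmetric without changing physics.
The key to actually doing it lies in making use of the conservation law (2.21) in an appropriate
way.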

2.5 Problems

i) Suppose that a Lagrangian $\mathcal{L}$, not specified further, depends not only on $\phi$ and $\partial_\mu \phi$ but
also on the second derivatives of the fields:$^7$
\[
\mathcal{L} = \mathcal{L}(\phi, \partial_\mu \phi, \partial_\mu \partial_\nu \phi) \,.
\tag{2.75}
\]
For the case that the variations $\delta\phi$ vanish at the endpoints and that
$\delta(\partial_{\mu_1} \ldots \partial_{\mu_N} \phi) = \partial_{\mu_1} \ldots \partial_{\mu_N} (\delta\phi)$ holds, derive the Euler-Lagrange EOMs for such a theory.

7 For the sake of brevity, we have omitted the subscript $a$ labelling the different fields.

Apply your result to obtain the EOMs for the field with Lagrangian
L=

(t ) (x ) + (x )3 ( )2 .
2
6
2

(2.76)

ii) Let us study the dynamics of acoustic waves in an elastic medium (e.g. air), as described
by the Lagrangian
\[
\mathcal{L} = \frac{1}{2} \, \rho \left( \frac{\partial y}{\partial t} \right)^2 - \frac{1}{2} \, \rho \, v_{\rm sound}^2 \, (\nabla y)^2 \,,
\tag{2.77}
\]
with $\rho$ the density of the medium and $v_{\rm sound}$ the speed of sound.
Find the Euler-Lagrange EOMs for the system and their solutions. What do they describe? Calculate the Hamiltonian $H$.
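iii) Consider the Klein-Gordon Lagrangian (2.45). Derive the kinetic and potential energy
($T$ and $V$ with $L = T - V$) as well as the Euler-Lagrange EOMs for the field $\phi$. Write
down the energy-momentum tensor $T^{\mu\nu}$ and show that it indeed satisfies $\partial_\mu T^{\mu\nu} = 0$.
Give the expressions for the conserved energy $E$ and momentum $P^i$.
iv) Using (2.60) show that the terms $\partial_\mu A^\mu$, $(\partial_\mu A^\mu)^2$, and $(\partial_\mu A_\nu)(\partial^\mu A^\nu)$ are Lorentz invariant.
What are the dimensions of these terms?
v) We saw that in the case of electrodynamics in vacuum using (2.20) leads to an
energy-momentum tensor $T^{\mu\nu}$ that is not symmetric. To remedy that, one can add to $T^{\mu\nu}$
a term of the form $\partial_\lambda \Gamma^{\lambda\mu\nu}$, where $\Gamma^{\lambda\mu\nu}$ is antisymmetric in its first two indices, i.e.,
$\Gamma^{\lambda\mu\nu} = -\Gamma^{\mu\lambda\nu}$.
Show that such an object is automatically divergenceless, i.e., it obeys $\partial_\mu \partial_\lambda \Gamma^{\lambda\mu\nu} = 0$.
This feature implies that instead of $T^{\mu\nu}$ one can also use
\[
\Theta^{\mu\nu} = T^{\mu\nu} + \partial_\lambda \Gamma^{\lambda\mu\nu} \,,
\tag{2.78}
\]
without changing the physics, since $\Theta^{\mu\nu}$ has the same globally conserved energy and
momentum as $T^{\mu\nu}$.
Show that this construction, with
\[
\Gamma^{\lambda\mu\nu} = F^{\mu\lambda} A^\nu \,,
\tag{2.79}
\]
leads to an energy-momentum tensor $\Theta^{\mu\nu}$ that is symmetric and yields the standard
formulas for the electromagnetic energy and momentum densities:
\[
\mathcal{E} = \frac{1}{2} \left( \mathbf{E}^2 + \mathbf{B}^2 \right) , \qquad \mathbf{S} = \mathbf{E} \times \mathbf{B} \,.
\tag{2.80}
\]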

References
[1] Chapter 4 of L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields, Fourth
Edition: Vol. 2 (Course of Theoretical Physics Series), Butterworth-Heinemann (1975),
481 p.
[2] Chapter 5 of B. Thide, Electromagnetic Field Theory, revised and extended 2nd edition, http://www.plasma.uu.se/CED/Book/index.html


Klein-Gordon Theory

In QM, canonical quantization is a recipe that takes us from the Hamiltonian formalism
of classical dynamics to the quantum theory. The recipe tells us to take the generalized
coordinates $q_a$ and their conjugate momenta $p^a = \partial L / \partial \dot{q}_a$ and promote them to operators.
The Poisson bracket structure of classical mechanics descends to the structure of commutation
relations between operators, namely
\[
[q_a, q_b] = [p^a, p^b] = 0 \,, \qquad [q_a, p^b] = i \, \delta_a^b \,,
\tag{3.1}
\]
where $[a, b] = ab - ba$ is the usual commutator.


If one wants to construct a QFT, one can proceed in a similar fashion. The idea is to
start with the classical field theory and then to quantize it, i.e., reinterpret the dynamical
variables as operators that obey canonical commutation relations,$^8$
\[
[\phi_a(\mathbf{x}), \phi_b(\mathbf{y})] = [\pi^a(\mathbf{x}), \pi^b(\mathbf{y})] = 0 \,, \qquad
[\phi_a(\mathbf{x}), \pi^b(\mathbf{y})] = i \, \delta^{(3)}(\mathbf{x} - \mathbf{y}) \, \delta_a^b \,.
\tag{3.2}
\]
Here $\phi_a(\mathbf{x})$ are field operators and the Kronecker delta in (3.1) has been replaced by a delta
function since the momentum conjugates $\pi^a(\mathbf{x})$ are densities. Notice that for now, we are
working in the Schrödinger picture, which means that the operators $\phi_a(\mathbf{x})$ and $\pi^a(\mathbf{x})$ depend
only on the spatial coordinates but not on time. The time dependence sits in the states
$|\psi\rangle$, which obey the usual Schrödinger equation
\[
i \, \frac{d}{dt} |\psi\rangle = H |\psi\rangle \,.
\tag{3.3}
\]
While all this looks pretty much the same as good old QM, there is an important difference.
The wavefunction $|\psi\rangle$ in QFT is a functional, i.e., a function of every possible configuration
of the field $\phi_a$, and not a simple function.$^9$ So things are more complicated in QFT than in
QM after all.
The Hamiltonian $H$, being a function of $\phi_a$ and $\pi^a$, also becomes an operator in QFT. In
order to solve the theory, one task is to find the spectrum, i.e., the eigenvalues and eigenstates
of $H$. This is usually very difficult, since there is an infinite number of dofs within QFT, at
least one for each point $\mathbf{x}$ in space. However, for certain theories, called free theories, one can
find a way to write the dynamics such that each dof evolves independently from all the others.
Free field theories typically have Lagrangians which are quadratic in the fields, so that the
EOMs are linear.

8 This procedure is sometimes referred to as second quantization. We will not use this terminology here.

9 In functional analysis, a functional is a map from a vector space to the field underlying the vector space,
which is usually the real numbers. In other words, it is a function that takes a vector as its argument or input
and returns a scalar. Commonly, the vector space is a space of functions, so the functional takes a function
as its argument, and so it is sometimes referred to as a function of a function. The use of functionals goes
back to the calculus of variations where one searches for a function which minimizes a certain functional. A
particularly important application in physics is to search for a state of a system which minimizes the energy
functional.

3.1 Klein-Gordon Field as Harmonic Oscillators

So far the discussion in this section was rather general. Let us be more specific and consider
the simplest relativistic free theory as a practical example. It is provided by the classical
Klein-Gordon equation (2.46). To exhibit the coordinates in which the dofs decouple from
each other, we only have to Fourier transform the field $\phi$,
\[
\phi(t, \mathbf{x}) = \int \frac{d^3p}{(2\pi)^3} \, e^{i \mathbf{p} \cdot \mathbf{x}} \, \phi(t, \mathbf{p}) \,.
\tag{3.4}
\]
In momentum space (2.46) simply reads
\[
\left( \frac{\partial^2}{\partial t^2} + |\mathbf{p}|^2 + m^2 \right) \phi(t, \mathbf{p}) = 0 \,,
\tag{3.5}
\]
which tells us that for each value of $\mathbf{p}$, the Fourier transform $\phi(t, \mathbf{p})$ solves the equation of a
harmonic oscillator with frequency
\[
\omega_{\mathbf{p}} = \sqrt{|\mathbf{p}|^2 + m^2} \,.
\tag{3.6}
\]
We see that the most general solution of the classical Klein-Gordon equation is a linear superposition
of simple harmonic oscillators, each vibrating at a different frequency with a different
amplitude. In order to quantize the field $\phi$, we must hence only quantize this infinite number
of harmonic oscillators (as Sidney Coleman once said [1]: "The career of a young theoretical
physicist consists of treating the harmonic oscillator in ever-increasing levels of abstraction.").
Let's recall how to do it in QM.
Harmonic Oscillator in QM
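Consider the QM Hamiltonian
\[
H = \frac{1}{2} \, p^2 + \frac{1}{2} \, \omega^2 q^2 \,,
\tag{3.7}
\]
with the canonical commutation relations $[q, p] = i$. In order to find the spectrum of the
system, we define annihilation and creation operators (also known as lowering and raising or
ladder operators)
\[
a = \sqrt{\frac{\omega}{2}} \, q + \frac{i}{\sqrt{2\omega}} \, p \,, \qquad
a^\dagger = \sqrt{\frac{\omega}{2}} \, q - \frac{i}{\sqrt{2\omega}} \, p \,.
\tag{3.8}
\]
Expressing $q$ and $p$ through $a$ and $a^\dagger$ gives
\[
q = \frac{1}{\sqrt{2\omega}} \left( a + a^\dagger \right) , \qquad
p = -i \sqrt{\frac{\omega}{2}} \left( a - a^\dagger \right) .
\tag{3.9}
\]
The commutator of the operators introduced in (3.8) is readily computed. One finds $[a, a^\dagger] = 1$.
Expressing the Hamiltonian (3.7) through $a$ and $a^\dagger$ gives
\[
H = \frac{\omega}{2} \left( a a^\dagger + a^\dagger a \right) = \omega \left( a^\dagger a + \frac{1}{2} \right) .
\tag{3.10}
\]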

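It is also easy to show that the commutator of $H$ with $a$ and $a^\dagger$ takes the form
\[
[H, a^\dagger] = \omega \, a^\dagger \,, \qquad [H, a] = -\omega \, a \,.
\tag{3.11}
\]
These relations imply that if $|\psi\rangle$ is an eigenstate of $H$ with energy $E$, i.e., $H|\psi\rangle = E|\psi\rangle$, then
we can construct other eigenstates by acting with the operators $a$ and $a^\dagger$ on $|\psi\rangle$:
\[
H a^\dagger |\psi\rangle = (E + \omega) \, a^\dagger |\psi\rangle \,, \qquad
H a |\psi\rangle = (E - \omega) \, a |\psi\rangle \,.
\tag{3.12}
\]
This feature explains why $a$ ($a^\dagger$) is called annihilation (creation) operator. From the latter
equation it is also clear that the spectrum of (3.7) has a ladder structure,
$\ldots, E - 2\omega, E - \omega, E, E + \omega, E + 2\omega, \ldots$. If the energy is bounded from below, there must be a ground state $|0\rangle$,
which satisfies $a|0\rangle = 0$. This state has the ground state or zero-point energy $\omega/2$, i.e., $H|0\rangle = \omega/2 \, |0\rangle$.
Excited states $|n\rangle$ are then created by the repeated action of $a^\dagger$,
\[
(a^\dagger)^n |0\rangle = \sqrt{n \, (n - 1) \cdots 1} \; |n\rangle = \sqrt{n!} \; |n\rangle \,,
\tag{3.13}
\]
and satisfy
\[
H |n\rangle = \omega \left( N + \frac{1}{2} \right) |n\rangle = \omega \left( n + \frac{1}{2} \right) |n\rangle \,,
\tag{3.14}
\]
where $N = a^\dagger a$ is the number operator with $N|n\rangle = n|n\rangle$. The prefactor on the right-hand
side of (3.13) is needed to guarantee that the states $|n\rangle$ are normalized to 1, i.e., $\langle n|n\rangle = 1$.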
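The ladder-operator algebra can be checked with finite matrices. The sketch below represents $a$ and $a^\dagger$ in the number basis truncated to `dim` states (a choice made for this illustration), rebuilds $q$ and $p$ from (3.9), and evaluates the Hamiltonian (3.7). Because of the truncation, the commutator $[a, a^\dagger] = 1$ and the spectrum $\omega(n + 1/2)$ are reproduced on all but the highest basis state.

```python
import math

def ladder_matrices(dim):
    """Matrices of a and a^dagger in the number basis |0>, ..., |dim-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)  # <n-1| a |n> = sqrt(n)
    a_dag = [[a[j][i] for j in range(dim)] for i in range(dim)]
    return a, a_dag

def matmul(x, y):
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    dim = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(dim)] for i in range(dim)]

def hamiltonian(omega, dim):
    """H = p^2/2 + omega^2 q^2/2 with q and p expressed through a and a^dagger."""
    a, a_dag = ladder_matrices(dim)
    rng = range(dim)
    q = [[(a[i][j] + a_dag[i][j]) / math.sqrt(2.0 * omega) for j in rng] for i in rng]
    p = [[-1j * math.sqrt(omega / 2.0) * (a[i][j] - a_dag[i][j]) for j in rng] for i in rng]
    q2, p2 = matmul(q, q), matmul(p, p)
    return [[0.5 * p2[i][j] + 0.5 * omega ** 2 * q2[i][j] for j in rng] for i in rng]

if __name__ == "__main__":
    a, a_dag = ladder_matrices(6)
    comm = commutator(a, a_dag)
    print([comm[n][n] for n in range(6)])   # last entry is a truncation artefact
    h = hamiltonian(2.0, 6)
    print([h[n][n].real for n in range(6)])  # omega (n + 1/2) except for the last state
```

The off-diagonal entries of `h` vanish up to rounding, which is the matrix version of the cancellation of the $a^2$ and $(a^\dagger)^2$ terms in (3.10).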
Quantization of Real Klein-Gordon Field
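If we treat each Fourier mode of the field $\phi$ as an independent harmonic oscillator, we can apply
canonical quantization to the real Klein-Gordon theory, and in this way find the spectrum of
the corresponding Hamiltonian. In analogy to (3.9), we write $\phi$ and $\pi$ as a linear sum of an
infinite number of operators $a_{\mathbf{p}}$ and $a_{\mathbf{p}}^\dagger$, labelled by the 3-momentum $\mathbf{p}$,
\[
\begin{aligned}
\phi(\mathbf{x}) &= \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\mathbf{p}}}}
 \left[ a_{\mathbf{p}} \, e^{i \mathbf{p} \cdot \mathbf{x}} + a_{\mathbf{p}}^\dagger \, e^{-i \mathbf{p} \cdot \mathbf{x}} \right] , \\
\pi(\mathbf{x}) &= \int \frac{d^3p}{(2\pi)^3} \, (-i) \sqrt{\frac{\omega_{\mathbf{p}}}{2}}
 \left[ a_{\mathbf{p}} \, e^{i \mathbf{p} \cdot \mathbf{x}} - a_{\mathbf{p}}^\dagger \, e^{-i \mathbf{p} \cdot \mathbf{x}} \right] .
\end{aligned}
\tag{3.15}
\]
The commutation relations (3.2) become
\[
[a_{\mathbf{p}}, a_{\mathbf{q}}] = [a_{\mathbf{p}}^\dagger, a_{\mathbf{q}}^\dagger] = 0 \,, \qquad
[a_{\mathbf{p}}, a_{\mathbf{q}}^\dagger] = (2\pi)^3 \, \delta^{(3)}(\mathbf{p} - \mathbf{q}) \,.
\tag{3.16}
\]
Let us assume that the latter equations hold. It then follows that
\[
\begin{aligned}
[\phi(\mathbf{x}), \pi(\mathbf{y})] &= \int \frac{d^3p \, d^3q}{(2\pi)^6} \, \frac{(-i)}{2} \sqrt{\frac{\omega_{\mathbf{q}}}{\omega_{\mathbf{p}}}}
 \left( -[a_{\mathbf{p}}, a_{\mathbf{q}}^\dagger] \, e^{i \mathbf{p} \cdot \mathbf{x} - i \mathbf{q} \cdot \mathbf{y}}
 + [a_{\mathbf{p}}^\dagger, a_{\mathbf{q}}] \, e^{-i \mathbf{p} \cdot \mathbf{x} + i \mathbf{q} \cdot \mathbf{y}} \right) \\
&= \int \frac{d^3p \, d^3q}{(2\pi)^6} \, \frac{(-i)}{2} \sqrt{\frac{\omega_{\mathbf{q}}}{\omega_{\mathbf{p}}}} \,
 (2\pi)^3 \, \delta^{(3)}(\mathbf{p} - \mathbf{q})
 \left( -e^{i \mathbf{p} \cdot \mathbf{x} - i \mathbf{q} \cdot \mathbf{y}} - e^{-i \mathbf{p} \cdot \mathbf{x} + i \mathbf{q} \cdot \mathbf{y}} \right) \\
&= \int \frac{d^3p}{(2\pi)^3} \, \frac{(-i)}{2}
 \left( -e^{i \mathbf{p} \cdot (\mathbf{x} - \mathbf{y})} - e^{-i \mathbf{p} \cdot (\mathbf{x} - \mathbf{y})} \right)
 = i \, \delta^{(3)}(\mathbf{x} - \mathbf{y}) \,,
\end{aligned}
\tag{3.17}
\]
where we have dropped terms $[a_{\mathbf{p}}, a_{\mathbf{q}}] = [a_{\mathbf{p}}^\dagger, a_{\mathbf{q}}^\dagger] = 0$ from the very beginning. To show that
$[\phi(\mathbf{x}), \phi(\mathbf{y})] = [\pi(\mathbf{x}), \pi(\mathbf{y})] = 0$ is left as an exercise.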
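In terms of the ladder operators $a_{\mathbf{p}}$ and $a_{\mathbf{p}}^\dagger$ the Hamiltonian of the real Klein-Gordon theory
takes the form
\[
\begin{aligned}
H &= \frac{1}{2} \int d^3x \left[ \pi^2 + (\nabla\phi)^2 + m^2 \phi^2 \right] \\
&= \frac{1}{2} \int \frac{d^3x \, d^3p \, d^3q}{(2\pi)^6} \Bigg[
 -\frac{\sqrt{\omega_{\mathbf{p}} \omega_{\mathbf{q}}}}{2}
 \left( a_{\mathbf{p}} e^{i \mathbf{p} \cdot \mathbf{x}} - a_{\mathbf{p}}^\dagger e^{-i \mathbf{p} \cdot \mathbf{x}} \right)
 \left( a_{\mathbf{q}} e^{i \mathbf{q} \cdot \mathbf{x}} - a_{\mathbf{q}}^\dagger e^{-i \mathbf{q} \cdot \mathbf{x}} \right) \\
&\qquad + \frac{1}{2 \sqrt{\omega_{\mathbf{p}} \omega_{\mathbf{q}}}}
 \left( i \mathbf{p} \, a_{\mathbf{p}} e^{i \mathbf{p} \cdot \mathbf{x}} - i \mathbf{p} \, a_{\mathbf{p}}^\dagger e^{-i \mathbf{p} \cdot \mathbf{x}} \right) \cdot
 \left( i \mathbf{q} \, a_{\mathbf{q}} e^{i \mathbf{q} \cdot \mathbf{x}} - i \mathbf{q} \, a_{\mathbf{q}}^\dagger e^{-i \mathbf{q} \cdot \mathbf{x}} \right) \\
&\qquad + \frac{m^2}{2 \sqrt{\omega_{\mathbf{p}} \omega_{\mathbf{q}}}}
 \left( a_{\mathbf{p}} e^{i \mathbf{p} \cdot \mathbf{x}} + a_{\mathbf{p}}^\dagger e^{-i \mathbf{p} \cdot \mathbf{x}} \right)
 \left( a_{\mathbf{q}} e^{i \mathbf{q} \cdot \mathbf{x}} + a_{\mathbf{q}}^\dagger e^{-i \mathbf{q} \cdot \mathbf{x}} \right) \Bigg] \\
&= \frac{1}{4} \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{\omega_{\mathbf{p}}} \Big[
 \left( -\omega_{\mathbf{p}}^2 + \mathbf{p}^2 + m^2 \right) \left( a_{\mathbf{p}} a_{-\mathbf{p}} + a_{\mathbf{p}}^\dagger a_{-\mathbf{p}}^\dagger \right)
 + \left( \omega_{\mathbf{p}}^2 + \mathbf{p}^2 + m^2 \right) \left( a_{\mathbf{p}} a_{\mathbf{p}}^\dagger + a_{\mathbf{p}}^\dagger a_{\mathbf{p}} \right) \Big] ,
\end{aligned}
\tag{3.18}
\]
where we have first used the expressions for $\phi$ and $\pi$ given in (3.15) and then integrated over
$d^3x$ to get delta functions $\delta^{(3)}(\mathbf{p} \pm \mathbf{q})$, which, in turn, allow us to perform the $d^3q$ integral.
Inserting finally the expression (3.6) for the frequency, the first term in the last line of (3.18)
vanishes and we are left with
\[
\begin{aligned}
H &= \frac{1}{2} \int \frac{d^3p}{(2\pi)^3} \, \omega_{\mathbf{p}} \left( a_{\mathbf{p}} a_{\mathbf{p}}^\dagger + a_{\mathbf{p}}^\dagger a_{\mathbf{p}} \right)
 = \int \frac{d^3p}{(2\pi)^3} \, \omega_{\mathbf{p}} \left( a_{\mathbf{p}}^\dagger a_{\mathbf{p}} + \frac{1}{2} \, [a_{\mathbf{p}}, a_{\mathbf{p}}^\dagger] \right) \\
&= \int \frac{d^3p}{(2\pi)^3} \, \omega_{\mathbf{p}} \left( a_{\mathbf{p}}^\dagger a_{\mathbf{p}} + \frac{1}{2} \, (2\pi)^3 \, \delta^{(3)}(0) \right) .
\end{aligned}
\tag{3.19}
\]
We see that the result contains a delta function, evaluated at zero where it has an infinite
spike. This contribution arises from the infinite sum over all modes vibrating with the
zero-point energy $\omega_{\mathbf{p}}/2$. Moreover, the integral over $\omega_{\mathbf{p}}$ diverges at large momenta $|\mathbf{p}|$. To better
understand what is going on let us have a look at the ground state $|0\rangle$ where the former infinity
first becomes apparent.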

3.2 Structure of Vacuum

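As in the case of the harmonic oscillator in QM, we define the vacuum $|0\rangle$ through the condition
that it is annihilated by the action of all $a_{\mathbf{p}}$,
\[
a_{\mathbf{p}} |0\rangle = 0 \,, \qquad \forall \, \mathbf{p} \,.
\tag{3.20}
\]
With this definition the energy $E_0$ of the vacuum comes entirely from the second term in the
last line of (3.19),
\[
H |0\rangle = E_0 |0\rangle = \left[ \int \frac{d^3p}{(2\pi)^3} \, \frac{\omega_{\mathbf{p}}}{2} \, (2\pi)^3 \, \delta^{(3)}(0) \right] |0\rangle = \infty \, |0\rangle \,.
\tag{3.21}
\]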

In fact, the latter expression contains not only one but two infinities. The first arises
because space is infinitely large. Infinities of this kind are often referred to as infrared (IR)
divergences. To isolate this infinity, we put the theory into a box with sides of length $L$ and
impose periodic boundary conditions (BCs) on the field. Then, taking the limit $L \to \infty$, we
get
\[
(2\pi)^3 \, \delta^{(3)}(0) = \lim_{L \to \infty} \int_{-L/2}^{L/2} d^3x \, e^{i \mathbf{p} \cdot \mathbf{x}} \Big|_{\mathbf{p} = 0}
 = \lim_{L \to \infty} \int_{-L/2}^{L/2} d^3x = V \,,
\tag{3.22}
\]
where $V$ denotes the volume of the box. This result tells us that the delta function singularity
arises because we try to compute the total energy $E_0$ of the system rather than its energy
density $\mathcal{E}_0$. The energy density is simply calculated from $E_0$ by dividing through the volume
$V$. One finds
\[
\mathcal{E}_0 = \frac{E_0}{V} = \int \frac{d^3p}{(2\pi)^3} \, \frac{\omega_{\mathbf{p}}}{2} \,,
\tag{3.23}
\]
which is still divergent and resembles the sum of zero-point energies for each harmonic oscillator.
Since $\mathcal{E}_0 \to \infty$ in the limit $|\mathbf{p}| \to \infty$, i.e., high frequencies (or short distances), this
singularity is an ultraviolet (UV) divergence. This divergence arises because we want too much.
We have assumed that our theory is valid to arbitrarily short distance scales, corresponding
to arbitrarily high energies. Recalling the discussion of energy scales in Section 1.2, this
assumption is clearly absurd. The integral should be cut off at high momentum, reflecting the
fact that our theory presumably breaks down at some point (most likely far below the GUT
or Planck scale).
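To see how strongly the zero-point energy density depends on where the momentum integral is cut off, one can evaluate (3.23) with a sharp cut-off $|\mathbf{p}| < \Lambda_{\rm UV}$. After the angular integration the integrand becomes $p^2 \sqrt{p^2 + m^2}/(4\pi^2)$, and for $m = 0$ the result is $\Lambda_{\rm UV}^4/(16\pi^2)$. The sketch below does the radial integral with Simpson's rule; the sharp cut-off, the function name and the number of steps are assumptions of this illustration, not a regularization prescribed in the text.

```python
import math

def vacuum_energy_density(cutoff, mass=0.0, steps=2000):
    """Zero-point energy density with a sharp momentum cut-off (natural units).

    Integrates p^2 sqrt(p^2 + m^2) / (4 pi^2) from 0 to cutoff with Simpson's rule.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = cutoff / steps

    def integrand(p):
        return p * p * math.sqrt(p * p + mass * mass) / (4.0 * math.pi ** 2)

    total = integrand(0.0) + integrand(cutoff)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    for cutoff in (10.0, 20.0, 40.0):
        print(cutoff, vacuum_energy_density(cutoff),
              cutoff ** 4 / (16.0 * math.pi ** 2))
```

Doubling the cut-off multiplies the massless result by 16, which is the quartic sensitivity to the UV scale that the cut-off argument above is warning about.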
Fortunately, the infinite energy shift in (3.19) is harmless if we want to measure the energy
difference of the energy eigenstates from the vacuum. We can therefore recalibrate our
energy levels (by an infinite constant), removing from the Hamiltonian operator the energy of
the vacuum,
\[
{:}H{:} \; = H - E_0 = H - \langle 0|H|0\rangle \,.
\tag{3.24}
\]
With this definition one has ${:}H{:} \, |0\rangle = 0$. In fact, the difference between the latter Hamiltonian
and the previous one is merely an ordering ambiguity in moving from the classical to the
quantum theory. E.g., if we had defined our Hamiltonian to take the form
\[
H = \frac{1}{2} \left( \omega q - i p \right) \left( \omega q + i p \right) ,
\tag{3.25}
\]
which is classically the same as our original definition (3.7), then after quantization instead of
(3.10), we would have gotten
\[
H = \omega \, a^\dagger a \,.
\tag{3.26}
\]
This type of ordering ambiguity arises often in field theories. The method that we have
used above to deal with it is called normal ordering. In practice, normal ordering works by
placing all annihilation operators $a_{\mathbf{p}}$ in products of field operators to the right. Applied to the
Hamiltonian of the real Klein-Gordon theory this prescription leads to
\[
{:}H{:} \; = \int \frac{d^3p}{(2\pi)^3} \, \omega_{\mathbf{p}} \, a_{\mathbf{p}}^\dagger a_{\mathbf{p}} \,.
\tag{3.27}
\]
In the remainder of this section, we will normal order all operators in this manner (dropping
the : : for simplicity).

Cosmological Constant
Above we concluded that as long as we are interested in the differences between energy levels
the infinite total energy E0 of the vacuum does not matter (which effectively means that E0
has no effect on particle physics phenomenology). So is the value of E0 unobservable then?
No, in fact, not at all, since gravity is supposed to see all energy densities. In particular, the
sum of all the zero-point energies should contribute to Einstein's equations,
\[
R_{\mu\nu} - \frac{R}{2} \, g_{\mu\nu} + \Lambda \, g_{\mu\nu} = 8\pi G_N \, T_{\mu\nu} \,,
\tag{3.28}
\]
in the form of a cosmological constant $\Lambda = E_0/V$. Here $R_{\mu\nu}$ is the Ricci curvature tensor, $R$
the scalar curvature (for their definitions please consult a text on general relativity), $g_{\mu\nu}$ is
the metric tensor (not to be mixed up with the Minkowski metric $\eta_{\mu\nu}$), $G_N$ denotes Newton's
constant, which we have already met in (1.8), and $T_{\mu\nu}$ is the energy-momentum tensor in its
symmetric form. Unfortunately, I do not have time to explain (3.28) in detail. If you want to
learn more about Einstein's equations, I suggest that you attend Andrew Steane's course on
general relativity. In order to be able to follow this lecture, it is sufficient to know that these
equations contain a term proportional to $E_0/V$.
An assortment of observations (cosmic microwave background, type-Ia supernovae, baryon
acoustic oscillations, etc.) tells us that 74% of the energy density in the universe has the
properties of a cosmological constant. This constant energy density filling space homogeneously
is one form of dark energy. Another possibility of dark energy would be a scalar field such
as quintessence, a dynamic quantity whose energy density can vary in space. The rest of the
composition of today's cosmos is made up by dark matter, amounting to 22%, and visible
matter (atoms, etc.), giving the missing 4%. Dark matter is dark in the sense that it is inferred
to exist from gravitational effects on visible matter and background radiation, but is
undetectable by emitted or scattered electromagnetic radiation. So in conclusion, fully 96%
of the universe seems to be composed of stuff we've never seen directly on earth.
But our lack of understanding does not end there. In the last subsection, we have argued
that integrating in (3.23) up to infinity is not the right thing to do, but that one should
only consider modes up to a certain UV cut-off $\Lambda_{\rm UV}$, where one stops trusting the underlying
theory. The resulting energy density $\mathcal{E}_0$ then scales like $\Lambda_{\rm UV}^4$. While it is not clear which precise
value we should take for $\Lambda_{\rm UV}$, let us be not very ambitious and take a value for this scale up
to which we truly believe that we understand the physics of fundamental interactions. The
electron mass $m_e = 511$ keV could be such a choice. In consequence,
\[
\mathcal{E}_0^{\rm predicted} \approx (511 \ \mathrm{keV})^4 \approx 7 \cdot 10^{22} \ \mathrm{eV}^4 \,,
\tag{3.29}
\]
where the superscript "predicted" should probably better read "guessed". Glancing at Table 1,
we see that the observed value of $\mathcal{E}_0$ is
\[
\mathcal{E}_0^{\rm observed} \approx (10^{-3} \ \mathrm{eV})^4 \approx 10^{-12} \ \mathrm{eV}^4 \,,
\tag{3.30}
\]
so it is clearly non-zero but unfortunately also roughly 35 orders of magnitude smaller than
our prediction. Notice that the choice $\Lambda_{\rm UV} = m_e$ that led to (3.29) was, in fact, a conservative
one, because other educated guesses such as $\Lambda_{\rm UV} = v$, $M_{\rm GUT}$, etc. would have led to a much
bigger disagreement of up to 120 orders of magnitude for the choice $\Lambda_{\rm UV} = M_P$.
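The size of the mismatch follows from simple arithmetic: if the energy density scales like $\Lambda_{\rm UV}^4$ and the observed value corresponds to a scale of about $10^{-3}$ eV, the discrepancy in orders of magnitude is $4 \log_{10}(\Lambda_{\rm UV} / 10^{-3}\,\mathrm{eV})$. The short sketch below evaluates this for a few cut-offs. The electroweak and Planck values used here (246 GeV and $1.22 \cdot 10^{19}$ GeV) are assumed round numbers inserted for illustration and are not taken from this section.

```python
import math

OBSERVED_SCALE_EV = 1e-3  # energy scale of the observed vacuum energy density, cf. (3.30)

def mismatch_orders(cutoff_ev):
    """Orders of magnitude between (cutoff)^4 and the observed energy density."""
    return 4.0 * math.log10(cutoff_ev / OBSERVED_SCALE_EV)

# Cut-off scales in eV; the last two entries are assumed illustrative values.
CUTOFFS_EV = {
    "electron mass": 5.11e5,
    "electroweak scale v (assumed 246 GeV)": 2.46e11,
    "Planck mass (assumed 1.22e19 GeV)": 1.22e28,
}

if __name__ == "__main__":
    for name, scale in CUTOFFS_EV.items():
        print(f"{name}: {mismatch_orders(scale):.1f} orders of magnitude")
```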
From the point of view of QFT, the net cosmological constant is the sum of a number of
apparently disparate contributions, including zero-point fluctuations of each field theory dof
and potential energies from scalar fields, as well as a bare cosmological constant. There is no
obstacle to imagining that all of the large and apparently unrelated contributions add together,
with different signs, to produce a net cosmological constant consistent with the limit (3.30),
other than the fact that it seems ridiculous. We know of no special symmetry which could
enforce a vanishing vacuum energy while remaining consistent with the known laws of physics.
This conundrum is the cosmological constant problem. While no widely accepted solution to
this problem exists, there are many proposed ones ranging from the anthropic principle to the
string-theory landscape. Don't worry if you have never even heard of any of them; it is not
important at all for what follows.
Casimir Effect
Using the normal ordering prescription we can happily set $E_0 = 0$, while chanting the mantra
that only energy differences can be measured. However, it should be possible to see that the
vacuum energy is different if, for some reason, the fields vanish in some region of space
or if some frequencies $\omega_{\mathbf{p}}$ do not contribute to the vacuum energy. Such a set-up can be
realized by forcing the real Klein-Gordon field $\phi$ to satisfy appropriate BCs. Let us assume
that $\phi$ vanishes on the planes with $x = 0$ and $x = L$,
\[
\phi(0, y, z) = \phi(L, y, z) = 0 \,.
\tag{3.31}
\]
The presence of these BCs affects the Fourier decomposition of the field and, in particular,
leads to a quantization of the momentum of the field between the planes ($k \in \mathbb{Z}^+$),
\[
\mathbf{p} = \left( \frac{\pi k}{L}, \, p_y, \, p_z \right) .
\tag{3.32}
\]
For simplicity let us consider a massless real scalar field. In this case the ground-state energy
per unit area $S$ between the planes is given by the following expression
\[
\frac{E_0(L)}{S} = \sum_{k=1}^{\infty} \int \frac{d^2p_\perp}{(2\pi)^2} \, \frac{1}{2} \sqrt{\left( \frac{\pi k}{L} \right)^2 + p_\perp^2} \,.
\tag{3.33}
\]
Notice that we only integrate over the perpendicular directions $\mathbf{p}_\perp = (p_y, p_z)$, since the
momentum $p_x$ is discretized. Consequently, the volume integral has to be replaced by a surface
integral over the planes. In analogy to (3.22), this gives a factor $S/(2\pi)^2$ instead of $V/(2\pi)^3$.
Let us see if we are able to calculate (3.33). We first switch to polar coordinates,
\[
\frac{E_0(L)}{S} = \sum_{k=1}^{\infty} \int_0^\infty \frac{dp_\perp \, p_\perp}{2\pi} \, \frac{1}{2} \sqrt{\left( \frac{\pi k}{L} \right)^2 + p_\perp^2} \,.
\tag{3.34}
\]

As it stands this integral is divergent in the limit $p_\perp \to \infty$. We can regulate this singularity
in a number of different ways. One way to do it is to introduce a UV cut-off $a \ll L$, so that
modes of momentum much bigger than $a^{-1}$ are removed. E.g., multiplying the integrand in
(3.34) by the factor $\exp[-a \, ((\pi k/L)^2 + p_\perp^2)^{1/2}]$ would do the job, since the resulting expression
has the property that as $a \to 0$, one regains the full, infinite result (3.33). The drawback of
this method is that the new integral is quite difficult to perform (though doable), so let's see
if we can find an easier way.
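The trick is to consider (3.33) not in $d = 4$ dimensions, but to work in fewer dimensions,
say, $d = 4 - 2\epsilon$ with $\epsilon > 0$. While this looks very weird at first sight, let me mention that
in general there exists a value of $\epsilon$ for which the integral is well-defined. We shall perform
our calculation for such a value, and then try to analytically continue the result to $\epsilon = 0$. In
$d = 4 - 2\epsilon$ dimensions the integral (3.34) takes the form
\[
\frac{E_0(L)}{S} = \sum_{k=1}^{\infty} \int_0^\infty \frac{dp_\perp \, p_\perp^{1 - 2\epsilon}}{2\pi} \, \frac{1}{2} \sqrt{\left( \frac{\pi k}{L} \right)^2 + p_\perp^2} \,.
\tag{3.35}
\]
To evaluate this expression, we first change variables $p_\perp \to (\pi k/L) \, l_\perp$. We then obtain
\[
\begin{aligned}
\frac{E_0(L)}{S} &= \frac{1}{4\pi} \left( \frac{\pi}{L} \right)^{3 - 2\epsilon} \left( \sum_{k=1}^{\infty} k^{3 - 2\epsilon} \right)
 \int_0^\infty dl_\perp \, l_\perp^{1 - 2\epsilon} \sqrt{1 + l_\perp^2} \\
&= \frac{1}{8\pi} \left( \frac{\pi}{L} \right)^{3 - 2\epsilon} \zeta(2\epsilon - 3)
 \int_0^\infty dl_\perp^2 \, (l_\perp^2)^{-\epsilon} \sqrt{1 + l_\perp^2} \,,
\end{aligned}
\tag{3.36}
\]
where we have identified the infinite sum with a Riemann zeta function, employing
\[
\sum_{k=1}^{\infty} \frac{1}{k^a} = \zeta(a) \,.
\tag{3.37}
\]
Performing now the change of variables $l_\perp^2 \to x/(1 - x)$, we arrive at
\[
\int_0^\infty dl_\perp^2 \, (l_\perp^2)^{-\epsilon} \sqrt{1 + l_\perp^2}
 = \int_0^1 dx \, x^{-\epsilon} (1 - x)^{\epsilon - 5/2} = B(1 - \epsilon, \epsilon - 3/2) \,,
\tag{3.38}
\]
where in the last step we have used the definition of the Euler beta function,
\[
B(a, b) = \frac{\Gamma(a) \Gamma(b)}{\Gamma(a + b)} = \int_0^1 dx \, x^{a - 1} (1 - x)^{b - 1} \,.
\tag{3.39}
\]
Putting everything together, the final result in $d = 4 - 2\epsilon$ dimensions reads
\[
\frac{E_0(L)}{S} = \frac{1}{8\pi} \left( \frac{\pi}{L} \right)^{3 - 2\epsilon} \zeta(2\epsilon - 3) \, B(1 - \epsilon, \epsilon - 3/2) \,.
\tag{3.40}
\]
Amazingly,$^{10}$ we can even take the limit $\epsilon \to 0$. Using $\Gamma(a + 1) = a \, \Gamma(a)$ with $\Gamma(1) = 1$ and
$\Gamma(1/2) = \sqrt{\pi}$, so that $B(1, -3/2) = -2/3$, and recalling that $\zeta(-3) = 1/120$, we arrive at the
finite expression
\[
\frac{E_0(L)}{S} = -\frac{\pi^2}{1440 \, L^3} \,.
\tag{3.41}
\]

10 Many subtleties have been swept under the carpet in this calculation. E.g., the dimensions of the
expressions in (3.35) to (3.40) are wrong by $2\epsilon$. All cheats will become clear when the method of dimensional
regularization is properly introduced.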

This result implies that the vacuum energy depends on the distance between the two planes,
on which vanishes. Can we realize this in an experiment?
Remember that the electromagnetic field is zero inside a conductor. If we place two uncharged conducting plates parallel to each other at a distance L, then we can reproduce the
BCs of the set-up that we have just studied. While the quantization of the electromagnetic
field is more complicated than the real Klein-Gordon field, which we have used to model the
effect, this difference becomes (almost) immaterial as far as the vacuum energy is concerned.
Our analysis leads to an amazing prediction: two electrically neutral metal plates attract
each other. This is known as the Casimir-Polder force, first predicted in 1948 [4]. Notice
that the energy of the vacuum gets smaller when the conducting plates are closer, as indicated
by the minus sign in (3.41). Therefore, there is an attractive force between them. This is an
effect that has by now been verified experimentally with great precision.11 In our example,
the force per unit area (pressure or rather anti-pressure) between the two conductor plates is
given by
\[
F = -\frac{1}{S} \frac{\partial E_0(L)}{\partial L} = -\frac{\pi^2}{480 L^4} \,. \tag{3.42}
\]
In fact, the true Casimir-Polder force is twice as large as the latter result, due to the two
polarization states of the photon.
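To get a feeling for the size of the effect, one can restore the factors of $\hbar$ and $c$ that are set to one in (3.42). The sketch below assumes the replacement $1/L^4 \to \hbar c/L^4$ (which follows from dimensional analysis), ideal plates, and a hypothetical helper name `casimir_pressure`; it is an order-of-magnitude illustration only.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s
C = 299792458.0         # speed of light in m / s


def casimir_pressure(separation_m, polarizations=2):
    """Pressure in Pa between ideal plates at the given separation (in metres).

    polarizations=1 corresponds to the scalar result (3.42),
    polarizations=2 to the photon with its two polarization states.
    A negative value means that the plates attract each other.
    """
    scalar = -math.pi**2 * HBAR * C / (480.0 * separation_m**4)
    return polarizations * scalar


if __name__ == "__main__":
    # Plates one micrometre apart: roughly a millipascal of attraction.
    print(casimir_pressure(1.0e-6))
```

Halving the distance increases the pressure by a factor of 16, which is why the effect is only measurable at sub-micrometre separations.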

3.3 Particle States

After the discussion of the properties of the vacuum, we can now turn to the excitations of
$\phi$. It's easy to verify (and therefore left as an exercise) that, in full analogy to (3.11), the
Hamiltonian and the ladder operators of the real Klein-Gordon theory obey the following
commutation relations
\[
[H, a^\dagger_{\mathbf{p}}] = \omega_{\mathbf{p}}\, a^\dagger_{\mathbf{p}} \,, \qquad [H, a_{\mathbf{p}}] = -\omega_{\mathbf{p}}\, a_{\mathbf{p}} \,. \tag{3.43}
\]
These relations imply that we can construct energy eigenstates by acting on the vacuum state
$|0\rangle$ with $a^\dagger_{\mathbf{p}}$ (remember that they also imply that $a_{\mathbf{p}} |0\rangle = 0$, $\forall \mathbf{p}$). We define
\[
|\mathbf{p}\rangle = a^\dagger_{\mathbf{p}} |0\rangle \,. \tag{3.44}
\]
This state has energy
\[
H |\mathbf{p}\rangle = E_{\mathbf{p}} |\mathbf{p}\rangle = \omega_{\mathbf{p}} |\mathbf{p}\rangle \,, \tag{3.45}
\]
with $\omega_{\mathbf{p}}$ given in (3.6), which is nothing but the relativistic energy of a particle with 3-momentum $\mathbf{p}$ and mass $m$. We thus interpret the state $|\mathbf{p}\rangle$ as the momentum eigenstate
of a single scalar particle of mass $m$.
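Let us check this interpretation by studying the other quantum numbers of $|\mathbf{p}\rangle$. We begin
with the total momentum $\mathbf{P}$ introduced in (2.24). Turning this expression into an operator,
we arrive, after normal ordering, at
\[
\mathbf{P} = -\int d^3x\, \pi\, \nabla \phi = \int \frac{d^3p}{(2\pi)^3}\, \mathbf{p}\, a^\dagger_{\mathbf{p}} a_{\mathbf{p}} \,. \tag{3.46}
\]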
11

The first experimental test of the Casimir-Polder force was conducted by Marcus Sparnaay in 1958, in a
delicate and difficult experiment with parallel plates. Due to the large experimental errors, his results could
neither prove the theoretical prediction right nor wrong.


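Acting with $\mathbf{P}$ on our state $|\mathbf{p}\rangle$ gives
\[
\mathbf{P} |\mathbf{p}\rangle = \int \frac{d^3q}{(2\pi)^3}\, \mathbf{q}\, a^\dagger_{\mathbf{q}} a_{\mathbf{q}} a^\dagger_{\mathbf{p}} |0\rangle
= \int \frac{d^3q}{(2\pi)^3}\, \mathbf{q}\, a^\dagger_{\mathbf{q}} \left[ (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) + a^\dagger_{\mathbf{p}} a_{\mathbf{q}} \right] |0\rangle = \mathbf{p}\, |\mathbf{p}\rangle \,, \tag{3.47}
\]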

where we have employed the second line in (3.16) and used the fact that an annihilation
operator acting on the vacuum is zero. The latter result tells us that the state $|\mathbf{p}\rangle$ has
momentum $\mathbf{p}$. Another property of $|\mathbf{p}\rangle$ that we can study is its angular momentum. Again
we take the classical expression for the total angular momentum (2.67) and turn it into an
operator,
\[
J^i = \epsilon^{ijk} \int d^3x\, (J^0)^{jk} \,. \tag{3.48}
\]
It is a good exercise to show that by acting with $J^i$ on the one-particle state with zero
momentum one gets
\[
J^i\, |\mathbf{p} = 0\rangle = 0 \,. \tag{3.49}
\]
This result tells us that the particle carries no internal angular momentum. In other words,
quantizing the real Klein-Gordon field gives rise to a spin-zero particle, also known as a scalar.
Multiparticle States
Acting multiple times with the creation operators on the vacuum we can create multiparticle
states. We interpret the state
\[
|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = a^\dagger_{\mathbf{p}_1} \cdots a^\dagger_{\mathbf{p}_n} |0\rangle \,, \tag{3.50}
\]
as an $n$-particle state. Since one has $[a^\dagger_{\mathbf{p}_i}, a^\dagger_{\mathbf{p}_j}] = 0$, the state (3.50) is symmetric under exchange
of any two particles. E.g.,
\[
|\mathbf{p}, \mathbf{q}\rangle = a^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} |0\rangle = a^\dagger_{\mathbf{q}} a^\dagger_{\mathbf{p}} |0\rangle = |\mathbf{q}, \mathbf{p}\rangle \,. \tag{3.51}
\]
This means that the particles corresponding to the real Klein-Gordon theory are bosons. We
see that, as promised already in Section 1.1, the relationship between spin and statistics
is, in fact, a consequence of the QFT framework, following, in the case at hand, from the
commutation quantization conditions for boson fields (3.2).
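The full Hilbert space of our theory is spanned by acting on the vacuum with all possible
combinations of creation operators,
\[
|0\rangle \,, \qquad a^\dagger_{\mathbf{p}} |0\rangle \,, \qquad a^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} |0\rangle \,, \qquad a^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} a^\dagger_{\mathbf{r}} |0\rangle \,, \qquad \ldots \,. \tag{3.52}
\]
This space is known as the Fock space and is simply the sum of the $n$-particle Hilbert spaces,
for all $n \geq 0$. Like in QM, there is also an operator which counts the number of particles in a
given state in the Fock space. It is the number operator
\[
N = \int \frac{d^3p}{(2\pi)^3}\, a^\dagger_{\mathbf{p}} a_{\mathbf{p}} \,, \tag{3.53}
\]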

which satisfies $N |\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = n\, |\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle$. Notice that the number operator commutes
with the Hamiltonian, i.e., $[N, H] = 0$, ensuring that particle number is conserved. This
means that we can place ourselves in the n-particle sector, and will remain there. This is a
property of free theories, but will no longer be true when we consider interactions. Interactions
create and destroy particles, taking us between the different sectors in the Fock space.
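The bookkeeping of the free Fock space can be mimicked with a few lines of code. The sketch below is only an illustration under simplifying assumptions: momenta are treated as discrete labels (3-tuples), a basis state is stored as a multiset of occupied momenta, and all normalization factors are ignored. The helper names `create`, `number`, `energy` and `momentum` are hypothetical.

```python
import math
from collections import Counter


def create(state, p):
    """Act with a creation operator of momentum p (a 3-tuple) on a basis state."""
    new_state = Counter(state)
    new_state[p] += 1
    return new_state


def number(state):
    """Eigenvalue of the number operator N, cf. (3.53)."""
    return sum(state.values())


def energy(state, m):
    """Eigenvalue of the normal-ordered H: sum of (p^2 + m^2)^(1/2) over all particles."""
    return sum(n * math.sqrt(sum(c * c for c in p) + m * m) for p, n in state.items())


def momentum(state):
    """Eigenvalue of the total momentum P, cf. (3.46)."""
    return tuple(sum(n * p[i] for p, n in state.items()) for i in range(3))


vacuum = Counter()
p, q = (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)
state_pq = create(create(vacuum, q), p)  # a_p^dagger a_q^dagger |0>
state_qp = create(create(vacuum, p), q)  # a_q^dagger a_p^dagger |0>

if __name__ == "__main__":
    print(state_pq == state_qp)       # the order of creation does not matter, cf. (3.51)
    print(number(state_pq), momentum(state_pq), energy(state_pq, m=0.0))
```

Because the state is stored as a multiset, the symmetry (3.51) under exchange of the two particles is built in, which is exactly the statement that the quanta are bosons.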
Operator-Valued Distributions
We have referred to the states $|\mathbf{p}\rangle$ as particles. Yet, this name is somewhat misleading,
since these states are momentum eigenstates and therefore not localized in space. Recall that
in QM both the position and momentum eigenstates are not good elements of the Hilbert
space since they are not normalizable (they normalize to delta functions). Similarly, in QFT
neither the operators $\phi(\mathbf{x})$, nor $a_{\mathbf{p}}$ and $a^\dagger_{\mathbf{p}}$ are good operators acting on the Fock space. This
is because these operators all produce states that are not normalizable:
\[
\langle 0 | \phi(\mathbf{x}) \phi(\mathbf{x}) | 0 \rangle = \delta^{(3)}(0) \,, \qquad \langle 0 | a_{\mathbf{p}} a^\dagger_{\mathbf{p}} | 0 \rangle = (2\pi)^3 \delta^{(3)}(0) \,. \tag{3.54}
\]
This feature implies that they are operator-valued distributions and not functions. In the case
of $\phi(\mathbf{x})$ one has that although the field operator has a well-defined vacuum expectation value
(VEV), $\langle 0 | \phi(\mathbf{x}) | 0 \rangle = 0$, the fluctuations $\langle 0 | \phi(\mathbf{x}) \phi(\mathbf{x}) | 0 \rangle$ of the operator at a fixed point are
infinite. We can construct well-defined operators by smearing these distributions over space.
E.g., we can create a wavepacket
\[
|\varphi\rangle = \int \frac{d^3p}{(2\pi)^3}\, e^{-i \mathbf{p} \cdot \mathbf{x}}\, \varphi(\mathbf{p})\, |\mathbf{p}\rangle \,, \tag{3.55}
\]
which is partially localized in both position and momentum space. A typical state might be
described by the Gaussian $\varphi(\mathbf{p}) = \exp[-\mathbf{p}^2/(2m^2)]$.
Relativistic Normalization
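The vacuum $|0\rangle$ is normalized as $\langle 0|0\rangle = 1$. The one-particle states $|\mathbf{p}\rangle = a^\dagger_{\mathbf{p}} |0\rangle$ then satisfy
\[
\langle \mathbf{p} | \mathbf{q} \rangle = \langle 0 | a_{\mathbf{p}} a^\dagger_{\mathbf{q}} | 0 \rangle = \langle 0 | \left[ (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) + a^\dagger_{\mathbf{q}} a_{\mathbf{p}} \right] | 0 \rangle = (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) \,, \tag{3.56}
\]
where we have made use of (3.16) and (3.20) to arrive at the final answer. Since the latter
expression depends on 3-momenta, an immediate question that arises is whether it is Lorentz
invariant. What could go wrong? Suppose we perform a Lorentz transformation
\[
p^\mu \to (p')^\mu = \Lambda^\mu{}_\nu\, p^\nu \,, \tag{3.57}
\]
such that $\mathbf{p} \to \mathbf{p}'$. In our QFT it would be preferable if the state $|\mathbf{p}\rangle$ changed under this Lorentz
transformation as
\[
|\mathbf{p}\rangle \to |\mathbf{p}'\rangle = U(\Lambda)\, |\mathbf{p}\rangle \,, \tag{3.58}
\]
with $U(\Lambda)$ being unitary, i.e., $U(\Lambda)^\dagger U(\Lambda) = U(\Lambda) U(\Lambda)^\dagger = 1$. In such a case the normalization
of $|\mathbf{p}\rangle$ would remain unchanged
\[
\langle \mathbf{p} | \mathbf{p} \rangle \to \langle \mathbf{p}' | \mathbf{p}' \rangle = \langle \mathbf{p} | U(\Lambda)^\dagger U(\Lambda) | \mathbf{p} \rangle = \langle \mathbf{p} | \mathbf{p} \rangle \,. \tag{3.59}
\]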

In order to find out whether or not the original and the Lorentz-transformed state, $|\mathbf{p}\rangle$ and
$|\mathbf{p}'\rangle$, are related by a unitary transformation, we should look at an object which we know
is Lorentz invariant. One such object is the identity operator (which is really the projection
operator onto one-particle states). With the normalization (3.56) we know that it is given by
\[
1 = \int \frac{d^3p}{(2\pi)^3}\, |\mathbf{p}\rangle \langle \mathbf{p}| \,. \tag{3.60}
\]
This operator is Lorentz invariant, but it consists of two terms: the measure $\int d^3p$ and the
projector $|\mathbf{p}\rangle \langle \mathbf{p}|$. Are these two objects Lorentz invariant by themselves?
In fact, they are not.
In order to prove this statement, we start with the measure $\int d^4p$ which is obviously Lorentz
invariant. The relativistic dispersion relation for a massive particle, i.e., $p^2 = m^2$, and hence
$p_0^2 = E_{\mathbf{p}}^2 = \mathbf{p}^2 + m^2$ is also Lorentz invariant. Solving for $p_0$, there are two branches of solutions,
namely $p_0 = \pm E_{\mathbf{p}}$. But the choice of branch is another Lorentz-invariant concept. Putting
everything together tells us that
\[
\int d^4p\, \delta(p_0^2 - \mathbf{p}^2 - m^2) \Big|_{p_0 > 0} = \int \frac{d^3p}{2 p_0} \bigg|_{p_0 = E_{\mathbf{p}}} = \int \frac{d^3p}{2 E_{\mathbf{p}}} \tag{3.61}
\]
is Lorentz invariant. From the latter result we can figure out everything else. E.g., the
Lorentz-invariant delta function for 3-momenta is
\[
2 E_{\mathbf{p}}\, \delta^{(3)}(\mathbf{p} - \mathbf{q}) \,, \tag{3.62}
\]
since
\[
\int \frac{d^3p}{2 E_{\mathbf{p}}}\, 2 E_{\mathbf{p}}\, \delta^{(3)}(\mathbf{p} - \mathbf{q}) = 1 \,. \tag{3.63}
\]
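The invariance of the measure in (3.61) can also be seen numerically. The sketch below assumes a boost with velocity $\beta$ along the $z$-axis, under which the transverse momentum is unchanged, so that it suffices to compare $dp_z/E$ before and after the boost. The function names and the finite-difference step are illustrative choices.

```python
import math


def energy(pz, pt2, m):
    """On-shell energy for longitudinal momentum pz and squared transverse momentum pt2."""
    return math.sqrt(pz * pz + pt2 + m * m)


def boost_pz(pz, pt2, m, beta):
    """Longitudinal momentum after a boost with velocity beta along z."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (pz + beta * energy(pz, pt2, m))


def measure_ratio(pz, pt2, m, beta, h=1.0e-6):
    """(dp_z'/E') / (dp_z/E), estimated with a central finite difference.

    The ratio equals one if d^3p / (2 E_p) is invariant under the boost.
    """
    jacobian = (boost_pz(pz + h, pt2, m, beta) - boost_pz(pz - h, pt2, m, beta)) / (2.0 * h)
    e_before = energy(pz, pt2, m)
    e_after = energy(boost_pz(pz, pt2, m, beta), pt2, m)
    return jacobian * e_before / e_after


if __name__ == "__main__":
    print(measure_ratio(pz=0.3, pt2=0.5, m=1.0, beta=0.6))
```

The Jacobian $dp_z'/dp_z$ alone differs from one, so $d^3p$ by itself is not invariant; only the combination with $1/E_{\mathbf{p}}$ is.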

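This finally tells us that the relativistically normalized momentum eigenstates are given by12
\[
|p\rangle = \sqrt{2 E_{\mathbf{p}}}\, |\mathbf{p}\rangle = \sqrt{2 E_{\mathbf{p}}}\, a^\dagger_{\mathbf{p}} |0\rangle \,, \tag{3.64}
\]
and satisfy
\[
\langle p | q \rangle = (2\pi)^3\, 2 E_{\mathbf{p}}\, \delta^{(3)}(\mathbf{p} - \mathbf{q}) \,. \tag{3.65}
\]
We can also express the identity operator in terms of the $|p\rangle$ states. One has
\[
1 = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2 E_{\mathbf{p}}}\, |p\rangle \langle p| \,. \tag{3.66}
\]
We remark that some texts on QFT also define relativistically normalized annihilation (creation)
operators by $a(p) = \sqrt{2 E_{\mathbf{p}}}\, a_{\mathbf{p}}$ ($a^\dagger(p) = \sqrt{2 E_{\mathbf{p}}}\, a^\dagger_{\mathbf{p}}$). In order to avoid (further) confusion,
we won't make use of this notation here.

12 Our notation is rather subtle here, since the relativistically normalized momentum states $|p\rangle$ differ from
$|\mathbf{p}\rangle$ just by the fact that they are not set in boldface type.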


3.4 Two Real Klein-Gordon Fields

Our task is to describe all known particles and their interactions. It is then interesting to
study the quantization of a system with more than one field. In order to keep things simple,
let us try to describe a system of two real Klein-Gordon fields $\phi_{1,2}$ which differ only in their
mass parameters ($m_1 \neq m_2$),
\[
\mathcal{L} = \sum_{i=1,2} \left[ \frac{1}{2} (\partial_\mu \phi_i)^2 - \frac{1}{2} m_i^2 \phi_i^2 \right] . \tag{3.67}
\]
This Lagrangian leads to two independent Klein-Gordon equations,
\[
(\partial^2 + m_i^2)\, \phi_i = 0 \,. \tag{3.68}
\]

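The Hamiltonian, the total momentum, and the number operator of the system are given by
\[
H = H_1 + H_2 \,, \qquad \mathbf{P} = \mathbf{P}_1 + \mathbf{P}_2 \,, \qquad N = N_1 + N_2 \,, \tag{3.69}
\]
where
\[
H_i = \int \frac{d^3p}{(2\pi)^3}\, \omega_{i,\mathbf{p}}\, a^\dagger_{i,\mathbf{p}} a_{i,\mathbf{p}} \,, \qquad
\mathbf{P}_i = \int \frac{d^3p}{(2\pi)^3}\, \mathbf{p}\, a^\dagger_{i,\mathbf{p}} a_{i,\mathbf{p}} \,, \qquad
N_i = \int \frac{d^3p}{(2\pi)^3}\, a^\dagger_{i,\mathbf{p}} a_{i,\mathbf{p}} \,, \tag{3.70}
\]
with $\omega_{i,\mathbf{p}} = (\mathbf{p}^2 + m_i^2)^{1/2}$. It should be clear that we can construct particle states in the same
fashion as we did with the Lagrangian of just a single real Klein-Gordon field. Products of $a^\dagger_{1,\mathbf{p}}$
operators acting on $|0\rangle$ create relativistic particles with mass $m_1$, while $a^\dagger_{2,\mathbf{p}}$ operators create
particles with mass $m_2$. E.g., the states
\[
|S_1\rangle = a^\dagger_{1,\mathbf{p}} |0\rangle \,, \qquad |S_2\rangle = a^\dagger_{2,\mathbf{p}} |0\rangle \,, \tag{3.71}
\]
satisfy
\[
H |S_i\rangle = \omega_{i,\mathbf{p}} |S_i\rangle \,, \qquad \mathbf{P} |S_i\rangle = \mathbf{p}\, |S_i\rangle \,, \qquad N |S_i\rangle = 1\, |S_i\rangle \,. \tag{3.72}
\]
These relations tell us that the states $|S_{1,2}\rangle$ are degenerate in the sense that they are single-particle states with the same momentum $\mathbf{p}$. However, they can be distinguished by measuring
the energy of the particles as long as the masses $m_{1,2}$ are different (which we have assumed
for the time being).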
Equal-Mass Case
Admittedly the case of two real Klein-Gordon fields with different masses $m_{1,2}$ is pretty boring.
Things get a little bit more interesting if we consider the special case $m_1 = m_2 = m$. Why?
Because in this case the system possesses an additional rotation symmetry in the space of fields
$\phi_{1,2}$. According to Noether's theorem this should lead to a new conserved charge. In order to
be able to identify the additional charge, we first write the Lagrangian (3.67) in a form that
exhibits the symmetry
\[
\mathcal{L} = \frac{1}{2} (\partial_\mu \vec{\phi}^{\,T}) (\partial^\mu \vec{\phi}) - \frac{1}{2} m^2\, \vec{\phi}^{\,T} \vec{\phi} \,. \tag{3.73}
\]

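Here we have introduced the field vector $\vec{\phi} = (\phi_1, \phi_2)^T$.
Obviously, the latter Lagrangian is invariant under the orthogonal transformations (O(2)
transformations or two-dimensional rotations),
\[
\vec{\phi} \to \vec{\phi}' = R\, \vec{\phi} \,, \tag{3.74}
\]
with $R^T = R^{-1}$. To calculate the conserved current, we again consider infinitesimal symmetry
transformations ($i, j = 1, 2$)
\[
R_{ij} = \delta_{ij} + \theta_{ij} + O(\theta^2) \,. \tag{3.75}
\]
The orthogonality of the matrix $R$,
\[
\delta_{ij} + \theta_{ji} = R^T_{ij} = R^{-1}_{ij} = \delta_{ij} - \theta_{ij} \,, \tag{3.76}
\]
tells us that the matrix $\theta_{ij}$ is antisymmetric. The infinitesimal transformation of the field $\phi_1$
under (3.74) is
\[
\phi_1 \to \phi_1' = R_{1i}\, \phi_i = (\delta_{1i} + \theta_{1i})\, \phi_i = \phi_1 + \theta_{11} \phi_1 + \theta_{12} \phi_2 = \phi_1 + \theta_{12} \phi_2 \,, \tag{3.77}
\]
which tells us that the variation of $\phi_1$ is
\[
\delta \phi_1 = \theta_{12}\, \phi_2 \,. \tag{3.78}
\]
An analogous calculation gives
\[
\delta \phi_2 = \theta_{21}\, \phi_1 = -\theta_{12}\, \phi_1 \,. \tag{3.79}
\]
Knowing the variations $\delta \phi_{1,2}$ of the fields, the conserved current corresponding to (3.74) is
readily written down,
\[
J^\mu = \frac{\partial \mathcal{L}}{\partial (\partial_\mu \phi_i)}\, \delta \phi_i = \theta_{12} \left[ (\partial^\mu \phi_1)\, \phi_2 - (\partial^\mu \phi_2)\, \phi_1 \right] , \tag{3.80}
\]
so the conserved charge is
\[
Q = \int d^3x \left[ (\partial^0 \phi_1)\, \phi_2 - (\partial^0 \phi_2)\, \phi_1 \right] . \tag{3.81}
\]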

Substituting in the above expression the physical solutions (3.15) for the fields $\phi_{1,2}$, and
performing the integration over the space coordinates, one obtains (the actual computation is
part of an exercise)13
\[
Q = i \int \frac{d^3p}{(2\pi)^3} \left[ a^\dagger_{1,\mathbf{p}} a_{2,\mathbf{p}} - a^\dagger_{2,\mathbf{p}} a_{1,\mathbf{p}} \right] , \tag{3.82}
\]
which is a hermitian operator, i.e., it satisfies $Q^\dagger = Q$. There is an ambiguity worth noting
when applying Noether's theorem to find the conserved charge under the transformation (3.74).
Obviously, if $Q$ is conserved, then so is every other operator $c_1 Q + c_2$ with $c_{1,2}$ constant
numbers. The expression for $Q$ in (3.82) is therefore unique only up to a multiplicative and an
additive constant. The ambiguity on the additive constant is removed when we remove the
contribution of the vacuum to the charge of particle states (as we have done for the energy).

13 This expression has not been normal ordered.
The normal-ordered charge operator
\[
{:}Q{:} = Q - \langle 0 | Q | 0 \rangle \,, \tag{3.83}
\]
is ambiguous only up to a multiplicative factor, which essentially denotes the units in which
we measure the charge of a state. Notice that we have already used this ambiguity in (3.81)
and simply ignored the factor $\theta_{12}$. In the following, we will use the normalization (3.83) of $Q$,
dropping as before the colons to avoid unnecessary clutter.
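So far so good. Next we would like to determine the spectrum of $Q$. This is most easily
done using the technique of ladder operators. We first define the following linear combinations,
\[
a_{\pm,\mathbf{p}} = \frac{1}{\sqrt{2}} \left( a_{1,\mathbf{p}} \pm i\, a_{2,\mathbf{p}} \right) , \tag{3.84}
\]
of annihilation operators (an analogous definition holds for the hermitian conjugate operators).
It is left as a homework problem to show that these new operators satisfy the following
commutation relations
\[
[Q, a_{\pm,\mathbf{p}}] = \mp a_{\pm,\mathbf{p}} \,, \qquad [Q, a^\dagger_{\pm,\mathbf{p}}] = \pm a^\dagger_{\pm,\mathbf{p}} \,. \tag{3.85}
\]
The latter relations imply that we can obtain states with charge $q \pm 1$ from a state $|S\rangle$ of
charge $q$, i.e., $Q |S\rangle = q |S\rangle$, by the action of $a^\dagger_{\pm,\mathbf{p}}$,
\[
Q \left( a^\dagger_{\pm,\mathbf{p}} |S\rangle \right) = (q \pm 1) \left( a^\dagger_{\pm,\mathbf{p}} |S\rangle \right) . \tag{3.86}
\]
In other words the operators $a^\dagger_{\pm,\mathbf{p}}$ are ladder operators with respect to $Q$. Since $a_{\pm,\mathbf{p}}$ are linear
combinations of $a_{1,\mathbf{p}}$ and $a_{2,\mathbf{p}}$, which are ladder operators for the Hamiltonian $H$ and the total
momentum operator $\mathbf{P}$, so are $a_{\pm,\mathbf{p}}$.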
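To find now all the common eigenstates of the charge operator $Q$, it is sufficient to start
from a single common eigenstate and then to act with $a^\dagger_{\pm,\mathbf{p}}$ on this state. It is not surprising
that the vacuum $|0\rangle$ is also an eigenstate of $Q$, namely the one with zero charge14
\[
Q |0\rangle = 0\, |0\rangle = 0 \,. \tag{3.87}
\]
Repeated application of the ladder operators,
\[
|S^\pm_n\rangle = \prod_{i=1}^{n} a^\dagger_{\pm,\mathbf{p}_i} |0\rangle \,, \tag{3.88}
\]
then creates $n$-particle states with positive ($|S^+_n\rangle$) and negative ($|S^-_n\rangle$) charge. Consequently,
one has
\[
\begin{aligned}
H |S^\pm_n\rangle &= \left( \sum_{i=1}^{n} \omega_{\mathbf{p}_i} \right) |S^\pm_n\rangle \,, &
\mathbf{P} |S^\pm_n\rangle &= \left( \sum_{i=1}^{n} \mathbf{p}_i \right) |S^\pm_n\rangle \,, \\
N |S^\pm_n\rangle &= n\, |S^\pm_n\rangle \,, &
Q |S^\pm_n\rangle &= \pm n\, |S^\pm_n\rangle \,.
\end{aligned} \tag{3.89}
\]

14 Notice that the normal ordering (3.83) of $Q$ plays an essential role here.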


The main results of this subsection can be summarized as follows. The mass degeneracy of
the Klein-Gordon fields $\phi_{1,2}$ results in a new O(2) symmetry of the Lagrangian. This gives rise
to a new conserved quantity, the charge Q. A particle state is then characterized by its mass
(or equivalently its energy), its momentum, and its charge, which can be either positive or
negative. States with the same energy and momentum, but opposite charge, can be interpreted
as particles and antiparticles. Notice that for a single real Klein-Gordon field there is only a
single type of particle, since a real scalar particle is its own antiparticle.

3.5 Complex Klein-Gordon Field

We can gain further insight into the theory by rewriting the Lagrangian (3.73) a little bit,
\[
\mathcal{L} = (\partial_\mu \phi^*) (\partial^\mu \phi) - m^2 \phi^* \phi \,, \tag{3.90}
\]
where
\[
\phi = \frac{1}{\sqrt{2}} \left( \phi_1 + i \phi_2 \right) , \tag{3.91}
\]
denotes the complex Klein-Gordon field. We could now compute the Hamiltonian and momentum operators directly in terms of $\phi$ and $\phi^*$, arriving at the same expressions as in the
representation with two real fields (if you don't believe me you are free to check this yourself).
In order to compute the charge $Q$, we then need to identify the internal symmetry of the new
Lagrangian. In fact, it is easy to see that (3.90) is invariant under a field phase-redefinition,
also known as a global U(1) transformation,
\[
\phi \to \phi' = e^{i\theta} \phi \,, \qquad \phi^* \to (\phi')^* = e^{-i\theta} \phi^* \,. \tag{3.92}
\]

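Notice that this transformation is the equivalent of the rotation symmetry transformation
(3.74) that we have found earlier, in the real field representation. We verify this by using the
explicit form of the matrix $R$ in terms of sine and cosine of the rotation angle $\theta$,
\[
\begin{aligned}
& \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} \to \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} , \quad \Longrightarrow \quad
\begin{pmatrix} \phi_1 \\ i \phi_2 \end{pmatrix} \to \begin{pmatrix} \cos\theta & i \sin\theta \\ i \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \phi_1 \\ i \phi_2 \end{pmatrix} , \\
& \Longrightarrow \quad \phi_1 + i \phi_2 \to e^{i\theta} (\phi_1 + i \phi_2) \,, \quad \Longrightarrow \quad \phi \to e^{i\theta} \phi \,.
\end{aligned} \tag{3.93}
\]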
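The chain of implications in (3.93) is easy to test for arbitrary field values. The following sketch assumes the counter-clockwise form of the rotation matrix quoted above and treats $\phi_{1,2}$ at a given point simply as two real numbers; the helper names are illustrative.

```python
import cmath
import math


def rotate(phi1, phi2, theta):
    """O(2) rotation of the real field doublet (phi1, phi2) by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return c * phi1 - s * phi2, s * phi1 + c * phi2


def complex_field(phi1, phi2):
    """Complex combination (3.91) of the two real fields."""
    return (phi1 + 1j * phi2) / math.sqrt(2.0)


def mismatch(phi1, phi2, theta):
    """|phi(rotated doublet) - exp(i theta) phi(original doublet)|."""
    lhs = complex_field(*rotate(phi1, phi2, theta))
    rhs = cmath.exp(1j * theta) * complex_field(phi1, phi2)
    return abs(lhs - rhs)


if __name__ == "__main__":
    print(mismatch(0.7, -1.3, 0.9))  # vanishes up to rounding errors
```

Rotating the doublet and then forming the complex field gives the same result as multiplying the complex field by the phase $e^{i\theta}$, which is the content of (3.93).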
So why should we bother about the complex Klein-Gordon Lagrangian if (3.73) and (3.90)
are equivalent? The reason is that the complex field representation is more suggestive of the
fact that we have both particle and antiparticle states. To see this we rederive the expression
for the charge operator (3.82). The variations of the fields $\phi$ and $\phi^*$ (treated as independent)
under (3.92) are
\[
\delta \phi = i \theta\, \phi \,, \qquad \delta \phi^* = -i \theta\, \phi^* \,. \tag{3.94}
\]
Now we can again use the machinery of Noether's theorem to calculate $Q$. I spare you the
details of this computation and simply quote the final result after normal ordering. One finds15
\[
Q = \int \frac{d^3p}{(2\pi)^3} \left[ a^\dagger_{+,\mathbf{p}} a_{+,\mathbf{p}} - a^\dagger_{-,\mathbf{p}} a_{-,\mathbf{p}} \right] = N_+ - N_- \,, \tag{3.95}
\]
where in the last step we have introduced the number operators
\[
N_\pm = \int \frac{d^3p}{(2\pi)^3}\, a^\dagger_{\pm,\mathbf{p}} a_{\pm,\mathbf{p}} \,. \tag{3.96}
\]
The expression (3.95) implies that $Q$ counts the number of antiparticles (created by $a^\dagger_{+,\mathbf{p}}$)
minus the number of particles (created by $a^\dagger_{-,\mathbf{p}}$). Since $[H, Q] = 0$, this difference is a conserved
quantity in our quantum theory. Of course, in a free field theory this isn't such a big deal
because both $N_+$ and $N_-$, i.e., the numbers of positively and negatively charged states, are
separately conserved. However, we will see soon that in interacting theories $Q$ survives as a
conserved quantity, while $N_\pm$ individually do not.

15 You can obtain this expression by simply reexpressing (3.82) in terms of $a_{\pm,\mathbf{p}}$ and $a^\dagger_{\pm,\mathbf{p}}$ using the inverse
of (3.84) and its hermitian conjugate analog.
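The step from (3.82) to (3.95) only involves the $2 \times 2$ structure in the space of the two fields, so it can be checked with small matrices. The sketch below is an illustration under the following assumptions: for a fixed momentum, $Q$ is written as $\sum_{ij} a_i^\dagger M_{ij} a_j$ with $M$ read off from (3.82), the change of basis (3.84) is encoded in a matrix `U`, and the small helpers `matmul` and `dagger` are written by hand to keep the example self-contained.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


s = 1.0 / math.sqrt(2.0)
# Rows express (a_+, a_-) in terms of (a_1, a_2), cf. (3.84).
U = [[s, 1j * s], [s, -1j * s]]
# Q = sum_ij a_i^dagger M_ij a_j with M read off from (3.82).
M = [[0, 1j], [-1j, 0]]
# N_+ - N_- is diagonal in the (a_+, a_-) basis.
SIGMA_Z = [[1, 0], [0, -1]]

# Rewrite N_+ - N_- in the (a_1, a_2) basis and compare with M.
M_from_ladder = matmul(dagger(U), matmul(SIGMA_Z, U))
max_dev = max(abs(M_from_ladder[i][j] - M[i][j]) for i in range(2) for j in range(2))

if __name__ == "__main__":
    print(M_from_ladder)
    print(max_dev)
```

Since `U` is unitary, the new operators $a_{\pm,\mathbf{p}}$ obey the same commutation relations as $a_{1,2,\mathbf{p}}$, and the eigenvalues $\pm 1$ of $M$ are the charges carried by the two kinds of quanta.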

3.6 Heisenberg Picture

Although we started with a Lorentz-invariant Lagrangian, we slowly butchered it as we quantized the theory, introducing a preferred time coordinate $t$. It's not at all obvious that the
theory is still Lorentz invariant after quantization. E.g., the various field operators $\phi(\mathbf{x})$ we
met depend on space, but not on time. Yet, the one-particle states obey the Schrödinger
equation,
\[
i\, \frac{d |\mathbf{p}(t)\rangle}{dt} = H\, |\mathbf{p}(t)\rangle \,, \tag{3.97}
\]
which means that they evolve in time according to
\[
|\mathbf{p}(t)\rangle = e^{-i E_{\mathbf{p}} t}\, |\mathbf{p}\rangle \,. \tag{3.98}
\]

Things start to look better in the Heisenberg picture where the time dependence is assigned
to the operators $O$,
\[
O_H = e^{iHt}\, O_S\, e^{-iHt} \,, \tag{3.99}
\]
so that
\[
\begin{aligned}
\frac{dO_H}{dt} &= \left( \frac{d}{dt} e^{iHt} \right) O_S\, e^{-iHt} + e^{iHt}\, O_S \left( \frac{d}{dt} e^{-iHt} \right) \\
&= iH\, e^{iHt} O_S\, e^{-iHt} + e^{iHt} O_S\, e^{-iHt} (-iH) = i\, [H, O_H] \,.
\end{aligned} \tag{3.100}
\]
Here the subscripts $S$ and $H$ tell us whether the operator is in the Schrödinger or Heisenberg
picture. In QFT, we drop these subscripts and we will denote the picture by specifying whether
the fields depend on space $\phi(\mathbf{x})$ (the Schrödinger picture) or space-time $\phi(t, \mathbf{x}) = \phi(x)$ (the
Heisenberg picture).
The operators in the two pictures agree at a fixed time, say, $t = 0$. The commutation
relations (3.2) become equal-time commutation relations in the Heisenberg picture. In the
case of the real Klein-Gordon theory (2.45),
\[
[\phi(t, \mathbf{x}), \phi(t, \mathbf{y})] = [\pi(t, \mathbf{x}), \pi(t, \mathbf{y})] = 0 \,, \qquad
[\phi(t, \mathbf{x}), \pi(t, \mathbf{y})] = i\, \delta^{(3)}(\mathbf{x} - \mathbf{y}) \,. \tag{3.101}
\]

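Now that our operators depend on time, we can study how they evolve when the clock starts
ticking. For the field operator $\phi$, we have
\[
\begin{aligned}
\dot{\phi}(x) = i\, [H, \phi(x)] &= \frac{i}{2} \left[ \int d^3y \left\{ \pi^2(y) + (\nabla \phi(y))^2 + m^2 \phi^2(y) \right\} , \phi(x) \right] \\
&= i \int d^3y\, \pi(y)\, (-i)\, \delta^{(3)}(\mathbf{x} - \mathbf{y}) = \pi(x) \,.
\end{aligned} \tag{3.102}
\]
Similarly, we get for the conjugate operator $\pi$,
\[
\begin{aligned}
\dot{\pi}(x) = i\, [H, \pi(x)] &= \frac{i}{2} \left[ \int d^3y \left\{ \pi^2(y) + (\nabla \phi(y))^2 + m^2 \phi^2(y) \right\} , \pi(x) \right] \\
&= \frac{i}{2} \int d^3y \left\{ \nabla_y [\phi(y), \pi(x)] \cdot \nabla \phi(y) + \nabla \phi(y) \cdot \nabla_y [\phi(y), \pi(x)] + 2i\, m^2 \phi(y)\, \delta^{(3)}(\mathbf{x} - \mathbf{y}) \right\} \\
&= \left( \nabla^2 - m^2 \right) \phi(x) \,,
\end{aligned} \tag{3.103}
\]
where we have included the subscript $y$ on $\nabla_y$ when there may be some confusion about
which argument the derivative is acting on. To reach the last line, we have simply integrated
by parts. Putting (3.102) and (3.103) together, we then find that $\phi$ satisfies (as one could have guessed)
the Klein-Gordon equation (2.46). Things start to look more relativistic.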
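We can also write the Fourier expansion (3.15) of the field by using the definition of
Heisenberg operators (3.99). We first note that
\[
\begin{aligned}
(a_{\mathbf{p}})_H = e^{iHt} a_{\mathbf{p}}\, e^{-iHt} &= \left( [e^{iHt}, a_{\mathbf{p}}] + a_{\mathbf{p}}\, e^{iHt} \right) e^{-iHt} \\
&= \left( e^{-i E_{\mathbf{p}} t} a_{\mathbf{p}}\, e^{iHt} - a_{\mathbf{p}}\, e^{iHt} + a_{\mathbf{p}}\, e^{iHt} \right) e^{-iHt} = e^{-i E_{\mathbf{p}} t}\, a_{\mathbf{p}} \,,
\end{aligned} \tag{3.104}
\]
where we have applied repeatedly
\[
H^n a_{\mathbf{p}} = a_{\mathbf{p}}\, (H - E_{\mathbf{p}})^n \,, \tag{3.105}
\]
which holds for any $n$ and follows from the commutation relations (3.43), after expanding the
exponential in a power series (this step is actually not shown). A similar relation (with $-$
replaced by $+$) holds for $a^\dagger_{\mathbf{p}}$. In the case of $a^\dagger_{\mathbf{p}}$, we hence have
\[
(a^\dagger_{\mathbf{p}})_H = e^{iHt} a^\dagger_{\mathbf{p}}\, e^{-iHt} = e^{i E_{\mathbf{p}} t}\, a^\dagger_{\mathbf{p}} \,. \tag{3.106}
\]
Using (3.104) and (3.106) then gives
\[
\phi(x) = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2 E_{\mathbf{p}}}} \left( a_{\mathbf{p}}\, e^{-ipx} + a^\dagger_{\mathbf{p}}\, e^{ipx} \right) , \tag{3.107}
\]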

which looks pretty much like (3.15) except that the exponentials are now written in terms
of 4-vectors, $px = E_{\mathbf{p}} t - \mathbf{p} \cdot \mathbf{x}$. Note also that the sign has flipped in the exponent due to
the Minkowski metric. It's a simple exercise to check that (3.107) indeed satisfies the Klein-Gordon equation (2.45), and is therefore left as a homework. For completeness let me also
give the result for the conjugate field in the Heisenberg picture. One finds
\[
\pi(x) = \int \frac{d^3p}{(2\pi)^3}\, (-i) \sqrt{\frac{E_{\mathbf{p}}}{2}} \left( a_{\mathbf{p}}\, e^{-ipx} - a^\dagger_{\mathbf{p}}\, e^{ipx} \right) , \tag{3.108}
\]
as you might have guessed immediately from looking at (3.15) and (3.107).
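The time dependence (3.104) of the ladder operators can be made concrete for a single momentum mode. The sketch below is a toy check under stated assumptions: only one mode is kept, its Hamiltonian is taken to be the normal-ordered $H = E\, a^\dagger a$, and the Fock space is truncated to the first few number states, so that all operators become small matrices. The function names are illustrative.

```python
import cmath
import math


def annihilation(dim):
    """Matrix of the annihilation operator in the number basis |0>, ..., |dim-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a


def heisenberg(op, energies, t):
    """e^{iHt} op e^{-iHt} for a diagonal H with the given eigenvalues."""
    dim = len(energies)
    return [[cmath.exp(1j * energies[m] * t) * op[m][n] * cmath.exp(-1j * energies[n] * t)
             for n in range(dim)] for m in range(dim)]


E, t, dim = 1.7, 0.4, 6
a = annihilation(dim)
a_H = heisenberg(a, [E * n for n in range(dim)], t)

# Compare with the prediction e^{-iEt} a of (3.104), entry by entry.
phase = cmath.exp(-1j * E * t)
max_dev = max(abs(a_H[m][n] - phase * a[m][n]) for m in range(dim) for n in range(dim))

if __name__ == "__main__":
    print(max_dev)  # vanishes up to rounding errors
```

The check works exactly even in the truncated space, because $a$ only connects neighbouring number states whose energies differ by $E$.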
The equation (3.107) makes explicit the dual particle and wave interpretations of the
quantum field $\phi$. On the one hand, $\phi$ is written as an operator, which creates and destroys
the particles that are the quanta of field excitation. On the other hand, $\phi$ is written as a
linear combination of solutions (the exponentials) of the Klein-Gordon equation. Both signs
of the time dependence, i.e., $\mp i p^0 t$ with $p^0 > 0$, appear in the exponential. If these were
single-particle wavefunctions, they would correspond to states of positive and negative energy.
Let us refer to them more generally as positive- and negative-frequency modes. The connection
between the particle-creation operators and the waveforms displayed here is always valid for
free quantum fields. A positive-frequency solution of the field equation has as its coefficient
the operator that destroys a particle in that single-particle wavefunction, while a negative-frequency solution of the field equation (being the hermitian conjugate of a positive-frequency
solution) has as its coefficient the operator that creates a particle in that positive-energy
single-particle wavefunction. In this way, the fact that relativistic wave equations have both
positive- and negative-frequency solutions is reconciled with the requirement that a sensible
quantum theory should contain only positive excitation energies.
Causality
It looks like we are approaching something Lorentz invariant in the Heisenberg picture, where
the field operator $\phi$ satisfies the Klein-Gordon equation. Yet, there is still a hint of non-Lorentz invariance because $\phi$ and $\pi$ satisfy the equal-time commutation relations (3.101). The
question that we thus have to address is, what happens for arbitrary space-time separations? In
particular, for our theory to be causal, we must require that all space-like separated operators
commute,
\[
[O_1(x), O_2(y)] = 0 \,, \qquad \forall\, (x - y)^2 < 0 \,. \tag{3.109}
\]
This ensures that a measurement at $x$ cannot affect a measurement at $y$, when $x$ and $y$ are not
causally connected (outside the light-cone). A graphical representation of the latter equation
is given in Figure 3.1.
Does our theory satisfy the requirement (3.109)? To answer this question, we first define
\[
\Delta(x - y) = [\phi(x), \phi(y)] \,. \tag{3.110}
\]

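Figure 3.1: Picture of space-like separated operators $O_1(x)$ and $O_2(y)$.

While the objects on the right-hand side are operators, it is seen (after a short calculation)
that the left-hand side is simply a complex number,
\[
\begin{aligned}
\Delta(x - y) &= \left[ \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2 E_{\mathbf{p}}}} \left( a_{\mathbf{p}}\, e^{-ipx} + a^\dagger_{\mathbf{p}}\, e^{ipx} \right) , \int \frac{d^3q}{(2\pi)^3} \frac{1}{\sqrt{2 E_{\mathbf{q}}}} \left( a_{\mathbf{q}}\, e^{-iqy} + a^\dagger_{\mathbf{q}}\, e^{iqy} \right) \right] \\
&= \int \frac{d^3p\, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_{\mathbf{p}} E_{\mathbf{q}}}} \left( [a_{\mathbf{p}}, a^\dagger_{\mathbf{q}}]\, e^{-ipx + iqy} + [a^\dagger_{\mathbf{p}}, a_{\mathbf{q}}]\, e^{ipx - iqy} \right) \\
&= \int \frac{d^3p\, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_{\mathbf{p}} E_{\mathbf{q}}}}\, (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) \left( e^{-ipx + iqy} - e^{ipx - iqy} \right) \\
&= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_{\mathbf{p}}} \left( e^{-ip(x - y)} - e^{ip(x - y)} \right) .
\end{aligned} \tag{3.111}
\]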
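So what do we know about $\Delta(x - y)$? First of all, it is Lorentz invariant thanks to the
Lorentz-invariant measure $\int d^3p/(2 E_{\mathbf{p}})$ that we have introduced in (3.61). Second, it does
not vanish for time-like separations. E.g., taking $x = (t, 0, 0, 0)$ and $y = (0, 0, 0, 0)$ gives
$[\phi(x), \phi(y)] \sim \exp(-imt) - \exp(imt)$, where the $\exp(-imt)$ term arises from
\[
\begin{aligned}
\int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_{\mathbf{p}}}\, e^{-i E_{\mathbf{p}} t} &= \frac{4\pi}{(2\pi)^3} \int_0^\infty dp\, \frac{p^2}{2 \sqrt{p^2 + m^2}}\, e^{-it \sqrt{p^2 + m^2}} \\
&= \frac{1}{4\pi^2} \int_m^\infty dE\, \sqrt{E^2 - m^2}\, e^{-iEt} \\
&= \frac{m}{8\pi t} \left[ Y_1(mt) + i J_1(mt) \right] \underset{t \to \infty}{\sim} e^{-imt} \,,
\end{aligned} \tag{3.112}
\]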
where to arrive at the second line, we have simply changed variables $p \to (E^2 - m^2)^{1/2}$. In
order to obtain the final answer, one only needs to know that the Bessel functions of first and
second kind, $J_1(x)$ and $Y_1(x)$, behave like $J_1(x) \propto (\sin x - \cos x)/\sqrt{x}$ and $Y_1(x) \propto -(\sin x + \cos x)/\sqrt{x}$ in
the relevant limit $x \to \infty$. An analogous calculation gives the $\exp(imt)$ term. Third, it vanishes
for space-like separations. This follows by realizing that $\Delta(x - y) = 0$ at equal times for all
$(x - y)^2 = -(\mathbf{x} - \mathbf{y})^2 < 0$, which can be seen explicitly by writing
\[
[\phi(t, \mathbf{x}), \phi(t, \mathbf{y})] = \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 \sqrt{\mathbf{p}^2 + m^2}} \left( e^{i \mathbf{p} \cdot (\mathbf{x} - \mathbf{y})} - e^{-i \mathbf{p} \cdot (\mathbf{x} - \mathbf{y})} \right) = 0 \,. \tag{3.113}
\]
Notice that in order to arrive at the final result, we have flipped the sign of $\mathbf{p}$ in the second
exponent. This obviously does not change the result since $\mathbf{p}$ is an integration variable and
$(\mathbf{p}^2 + m^2)^{1/2}$ is invariant under such a change. But since $\Delta(x - y)$ is Lorentz invariant, it can
only be a function of $(x - y)^2$ and must hence vanish for all $(x - y)^2 < 0$.
Taken together the above findings imply that the real Klein-Gordon theory is indeed causal
with commutators vanishing outside the light-cone. This property will continue to hold in the
interacting theory. Indeed, it is usually given as one of the axioms of local QFTs. Let me
mention, however, that the fact that $[\phi(x), \phi(y)]$ is a complex function, rather than an operator,
is a property of free fields only and does not hold in an interacting theory.

3.7 Klein-Gordon Correlators

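The causal structure of the real Klein-Gordon theory (2.45) can also be probed in a different
way. Let's create a particle at the space-time point $y$. What is the amplitude to find it at
point $x$? This question can be answered by calculating
\[
\begin{aligned}
D(x - y) = \langle 0 | \phi(x) \phi(y) | 0 \rangle &= \int \frac{d^3p\, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_{\mathbf{p}} E_{\mathbf{q}}}}\, \langle 0 | a_{\mathbf{p}} a^\dagger_{\mathbf{q}} | 0 \rangle\, e^{-ipx + iqy} \\
&= \int \frac{d^3p\, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_{\mathbf{p}} E_{\mathbf{q}}}}\, \langle 0 | [a_{\mathbf{p}}, a^\dagger_{\mathbf{q}}] | 0 \rangle\, e^{-ipx + iqy} \\
&= \int \frac{d^3p\, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_{\mathbf{p}} E_{\mathbf{q}}}}\, (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q})\, e^{-ipx + iqy} \\
&= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_{\mathbf{p}}}\, e^{-ip(x - y)} \,.
\end{aligned} \tag{3.114}
\]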
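The function $D(x - y)$ is called the propagator and is a Lorentz-invariant 3-momentum integral.
Let us now evaluate (3.114) for purely space-like separations, i.e., $x - y = (0, \mathbf{r})$.16 The
propagator is then
\[
\begin{aligned}
D(x - y) &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_{\mathbf{p}}}\, e^{i \mathbf{p} \cdot \mathbf{r}} = \frac{2\pi}{(2\pi)^3} \int_0^\infty dp\, \frac{p^2}{2 E_{\mathbf{p}}}\, \frac{e^{ipr} - e^{-ipr}}{ipr} \\
&= \frac{-i}{2 (2\pi)^2 r} \int_{-\infty}^{\infty} dp\, \frac{p\, e^{ipr}}{\sqrt{p^2 + m^2}} \,.
\end{aligned} \tag{3.115}
\]

16 Notice that for purely time-like separations one would obtain the result (3.112).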


Figure 3.2: Branch cuts of the propagator $D(x - y)$ for a space-like transition, shown in the complex $p$-plane with branch points at $p = \pm im$.

Here we have first introduced spherical coordinates, then performed the integration over the
azimuthal and polar angles, and finally changed variables in the second term from $p \to -p$ in
order to combine the result into one term. The integrand in (3.115), considered as a complex
function of $p$, has branch cuts on the imaginary axis starting at $\pm im$. In order to evaluate the
integral we push the contour up to wrap around the upper branch cut. The chosen integration
contour is shown in Figure 3.2. Defining $\rho = -ip$, we then recast (3.115) into
\[
D(x - y) = \frac{1}{4\pi^2 r} \int_m^\infty d\rho\, \frac{\rho\, e^{-\rho r}}{\sqrt{\rho^2 - m^2}} = \frac{1}{4\pi^2 r}\, m K_1(mr) \underset{r \to \infty}{\sim} e^{-mr} \,, \tag{3.116}
\]
where the modified Bessel function $K_1(x)$ scales like $K_1(x) = \left[ \sqrt{\pi/(2x)} + O(x^{-3/2}) \right] e^{-x}$ in the
limit of $x \to \infty$. The latter equation tells us that the propagator $D(x - y)$ decays exponentially
quickly outside the light-cone but, nonetheless, it is non-vanishing. The quantum field appears
to leak out of the causal region. Yet, we have just seen in (3.113) that space-like measurements
commute and the theory is causal. How do we reconcile these two facts?
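The exponential tail in (3.116) can be made quantitative without any special-function library. The sketch below assumes the standard integral representation $K_1(x) = \int_0^\infty dt\, \cosh t\, e^{-x \cosh t}$, which follows from the $\rho$-integral in (3.116) via the substitution $\rho = m \cosh t$, and evaluates it with a simple trapezoidal rule. The cutoff `t_max`, the number of steps and the function names are illustrative choices.

```python
import math


def scaled_k1(x, t_max=12.0, steps=24000):
    """exp(x) * K_1(x) from K_1(x) = int_0^inf dt cosh(t) exp(-x cosh(t))."""
    h = t_max / steps
    total = 0.5  # integrand at t = 0 equals one; trapezoidal weight 1/2
    for i in range(1, steps + 1):
        c = math.cosh(i * h)
        total += c * math.exp(-x * (c - 1.0))  # negligible at the upper cutoff
    return h * total


def k1(x):
    """Modified Bessel function K_1(x) for x > 0."""
    return math.exp(-x) * scaled_k1(x)


def propagator_spacelike(r, m):
    """D(x - y) of (3.116) for a purely space-like separation r."""
    return m * k1(m * r) / (4.0 * math.pi**2 * r)


if __name__ == "__main__":
    print(k1(1.0))
    # Ratio to the leading asymptotic form sqrt(pi / (2x)) exp(-x) at x = 20.
    print(scaled_k1(20.0) / math.sqrt(math.pi / 40.0))
    print(propagator_spacelike(r=5.0, m=1.0))
```

The propagator is small but strictly positive for every space-like distance, which is the leakage out of the light-cone discussed in the text.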
We get a first clue of how this puzzle is resolved by realizing that the relation (3.113),
expressed in terms of propagators, takes the form
\[
\Delta(x - y) = [\phi(x), \phi(y)] = D(x - y) - D(y - x) = 0 \,. \tag{3.117}
\]
What is the physical meaning of this result? It simply means that for $(x - y)^2 < 0$, there is no
Lorentz-invariant way to order events. If a particle can travel in a space-like direction from $x$
to $y$, it can just as easily travel from $y$ to $x$.17 In any measurement, the amplitudes for these
two possible events cancel, so that the underlying QFT is causal.
17 When $x - y$ is space-like, a continuous Lorentz transformation can take $x - y$ to $y - x$.


Another way to think about the cancellation of the two contributions in (3.117) is in terms
of amplitudes of particles and antiparticles. Let us first consider the case of a complex scalar
field. If we look at the equation $[\phi(x), \phi^\dagger(y)] = 0$ outside the light-cone, the physical interpretation of (3.117) (or better its analog) is that the amplitude for the particle to propagate
from $x$ to $y$ cancels the amplitude for the antiparticle to travel from $y$ to $x$. In fact, this
interpretation also applies (maybe in a less obvious way) to the case of the real scalar field,
because the particle is then its own antiparticle.
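Green's Functions
In fact, the statements made after (3.117) can be put on mathematically solid ground. Let's
see how this goes. We start by considering the amplitude
\[
\langle 0 | [\phi(x), \phi(y)] | 0 \rangle = \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_{\mathbf{p}}} \left( e^{-ip(x - y)} - e^{ip(x - y)} \right) , \tag{3.118}
\]
and assume for now that $x^0 > y^0$. In this case we can rewrite the 3-momentum integral on
the right-hand side of (3.118) as a 4-momentum integral,
\[
\begin{aligned}
\langle 0 | [\phi(x), \phi(y)] | 0 \rangle &= \int \frac{d^3p}{(2\pi)^3} \left\{ \frac{1}{2 E_{\mathbf{p}}}\, e^{-ip(x - y)} \Big|_{p^0 = E_{\mathbf{p}}} + \frac{1}{-2 E_{\mathbf{p}}}\, e^{-ip(x - y)} \Big|_{p^0 = -E_{\mathbf{p}}} \right\} \\
&\underset{x^0 > y^0}{=} \int \frac{d^3p}{(2\pi)^3} \int \frac{dp^0}{2\pi i}\, \frac{-1}{p^2 - m^2}\, e^{-ip(x - y)} \\
&= \int \frac{d^4p}{(2\pi)^4}\, \frac{i}{p^2 - m^2}\, e^{-ip(x - y)} \,.
\end{aligned} \tag{3.119}
\]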
Notice that this is the first time in this course that we have integrated over 4-momentum.
Until now, we integrated only over 3-momentum, with $p^0$ fixed by the mass-shell condition to
be $p^0 = E_{\mathbf{p}}$.
Barring possible typos, the calculation in (3.119) is certainly correct, but this fact might not
be obvious to everybody in the audience right away. So let me do some reverse engineering.
First notice that the denominator in the last line of (3.119) can be written as
\[
p^2 - m^2 = (p^0)^2 - \mathbf{p}^2 - m^2 = (p^0)^2 - E_{\mathbf{p}}^2 = (p^0 - E_{\mathbf{p}})(p^0 + E_{\mathbf{p}}) \,, \tag{3.120}
\]
which implies that, for each value of $\mathbf{p}$, the denominator produces a pole in the integrand at
$p^0 = \pm E_{\mathbf{p}} = \pm (\mathbf{p}^2 + m^2)^{1/2}$. The 4-momentum integration is hence ill-defined and we need
a prescription for avoiding the singularities on the real $p^0$-axis. How do we have to choose
the integration contour in order to arrive at (3.119)? It is not difficult to see that in the
case $x^0 > y^0$ the contour has to be chosen as shown in Figure 3.3. Notice that closing the
contour in the lower half-plane, where $p^0 \to -i\infty$, ensures that the integrand vanishes since
$\exp(-i p^0 (x^0 - y^0)) \to 0$. The integral over $p^0$ then picks up the residues at $p^0 = \pm E_{\mathbf{p}}$ which
are $-2\pi i/(p^0 \pm E_{\mathbf{p}}) \big|_{p^0 = \pm E_{\mathbf{p}}} = -2\pi i/(\pm 2 E_{\mathbf{p}})$, where the relative minus sign arises because we
take a clockwise contour. Combining these elements shows that the calculation that led to the
final result in (3.119) is in fact correct.

Figure 3.3: Integration contour for the retarded Green's function $D_R(x - y)$ in the complex $p^0$-plane, with poles at $p^0 = \pm E_{\mathbf{p}}$ and the contour closed at $p^0 \to -i\infty$.

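In the following, we will call the last line of (3.119) together with the prescription for going
around the poles the retarded Green's function,
\[
D_R(x - y) = \theta(x^0 - y^0)\, \langle 0 | [\phi(x), \phi(y)] | 0 \rangle = \theta(x^0 - y^0) \left[ D(x - y) - D(y - x) \right] , \tag{3.121}
\]
where the Heaviside step function $\theta(x)$ is defined as $\theta(x) = 0$ for $x < 0$ and $\theta(x) = 1$ for $x > 0$.
It seldom matters what value is used for $\theta(0)$, since $\theta(x)$ is mostly used as a distribution (in
the half-maximum convention one has $\theta(0) = 1/2$). The name retarded Green's function is in
fact the correct one for $D_R(x - y)$, since this mathematical object obeys18
\[
\begin{aligned}
\left( \partial^2 + m^2 \right) D_R(x - y) &= \left( \partial^2 \theta(x^0 - y^0) \right) \langle 0 | [\phi(x), \phi(y)] | 0 \rangle \\
&\quad + 2 \left( \partial_\mu \theta(x^0 - y^0) \right) \left( \partial^\mu \langle 0 | [\phi(x), \phi(y)] | 0 \rangle \right) \\
&\quad + \theta(x^0 - y^0) \left( \partial^2 + m^2 \right) \langle 0 | [\phi(x), \phi(y)] | 0 \rangle \\
&= -\delta(x^0 - y^0)\, \langle 0 | [\pi(x), \phi(y)] | 0 \rangle \\
&\quad + 2\, \delta(x^0 - y^0)\, \langle 0 | [\pi(x), \phi(y)] | 0 \rangle \\
&\quad + 0 \\
&= -i\, \delta^{(4)}(x - y) \,,
\end{aligned} \tag{3.122}
\]
and vanishes for $x^0 < y^0$ by definition. Here all derivatives are understood with respect to
$x$. In order to obtain the second line we have used the two relations $\partial_x \theta(x) = \delta(x)$ and
$\left( \partial_x^2 \theta(x) \right) f(x) = -\delta(x)\, \partial_x f(x)$, the latter of which is shown easily by partial integration, and
paid tribute to the fact that $\phi(x)$ obeys the Klein-Gordon equation. The last line then follows
by employing the second equal-time commutation relation in (3.101).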
The retarded Green's function is useful in classical field theory if we know the initial value of some field configuration and want to figure out what it evolves into in the presence of a source, meaning that we want to know the solution to the inhomogeneous Klein-Gordon equation, $(\partial^2 + m^2)\,\phi(x) = J(x)$, for some fixed background function $J(x)$ acting as a static source. Similarly, one can define the advanced Green's function $D_A(x-y)$, which vanishes when $x^0 > y^0$ and is useful if we know the end point of a field configuration and want to figure out where it came from. The integration contour corresponding to the advanced Green's function is shown in Figure 3.4. You will get more familiar with the advanced Green's function in an exercise.

Figure 3.4: Integration contour for the advanced Green's function $D_A(x-y)$. [Plot in the complex $p^0$-plane (axes $\mathrm{Re}\,p^0$, $\mathrm{Im}\,p^0$): the contour passes the poles at $p^0 = -E_{\mathbf{p}}$ and $p^0 = +E_{\mathbf{p}}$ and is closed in the upper half-plane, $p^0 \to +i\infty$.]

$^{18}$ Notice that the same result is obtained by applying the differential operator $(\partial^2 + m^2)$ directly to the expression in the last line of (3.119).
Feynman Propagator
In fact, the most important quantity in interacting field theory is neither the retarded nor the advanced Green's function but the Feynman propagator,
\[
D_F(x-y) = \langle 0|T\phi(x)\phi(y)|0\rangle = \theta(x^0 - y^0)\,D(x-y) + \theta(y^0 - x^0)\,D(y-x)\,, \tag{3.123}
\]
where $T$ stands for time ordering, i.e., placing all operators evaluated at later times to the left, so that, e.g.,
\[
T\phi(x)\phi(y) = \theta(x^0 - y^0)\,\phi(x)\phi(y) + \theta(y^0 - x^0)\,\phi(y)\phi(x)\,. \tag{3.124}
\]
Given the similarity of (3.118) and (3.123), it does not come as a surprise that the Feynman propagator can be written as
\[
D_F(x-y) = \int \frac{d^4p}{(2\pi)^4}\, \frac{i}{p^2 - m^2}\, e^{-ip(x-y)}\,. \tag{3.125}
\]
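Figure 3.5: Integration contour for the Feynman propagator $D_F(x-y)$ for $x^0 > y^0$. In the case $y^0 > x^0$, the integration contour is closed in the upper half-plane. [Plot in the complex $p^0$-plane (axes $\mathrm{Re}\,p^0$, $\mathrm{Im}\,p^0$) with poles at $p^0 = -E_{\mathbf{p}}$ and $p^0 = +E_{\mathbf{p}}$; the contour is closed at $p^0 \to -i\infty$ and encloses only the pole at $+E_{\mathbf{p}}$.]

Again we distinguish the cases $x^0 > y^0$ and $y^0 > x^0$. In the former case, we perform the $p^0$ integration following the contour shown in Figure 3.5, which encloses the pole at $p^0 = +E_{\mathbf{p}}$ with residue $-2\pi i/(2E_{\mathbf{p}})$, where the minus sign arises again since the path has a clockwise orientation. Consequently, one obtains
\[
\begin{aligned}
D_F(x-y) &= \int \frac{d^3p}{(2\pi)^4}\, \frac{-2\pi i}{2E_{\mathbf{p}}}\; i\, e^{-iE_{\mathbf{p}}(x^0 - y^0) + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})} \\
&= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}}\, e^{-ip(x-y)} = D(x-y)\,.
\end{aligned} \tag{3.126}
\]
In contrast, in the case $y^0 > x^0$ one finds
\[
\begin{aligned}
D_F(x-y) &= \int \frac{d^3p}{(2\pi)^4}\, \frac{2\pi i}{(-2E_{\mathbf{p}})}\; i\, e^{iE_{\mathbf{p}}(x^0 - y^0) + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})} \\
&= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}}\, e^{-iE_{\mathbf{p}}(y^0 - x^0) - i\mathbf{p}\cdot(\mathbf{y}-\mathbf{x})} \\
&= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_{\mathbf{p}}}\, e^{-ip(y-x)} = D(y-x)\,,
\end{aligned} \tag{3.127}
\]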

where the integration contour is chosen as in Figure 3.5, but the path is closed in the upper half-plane (due to the counter-clockwise orientation of the half-circle the residue does not pick up a minus sign). To go from the second line in (3.127) to the third, we have flipped the sign of $\mathbf{p}$, which is valid since we integrate over $d^3p$ and all other quantities depend only on $\mathbf{p}^2$. Taken together, the latter two relations prove the equality of (3.123) and (3.125).
Like $D_R(x-y)$ and $D_A(x-y)$, the Feynman propagator is also a Green's function of the Klein-Gordon equation,
\[
\begin{aligned}
\left(\partial^2 + m^2\right) D_F(x-y) &= \int \frac{d^4p}{(2\pi)^4}\, \frac{i}{p^2 - m^2}\, \left(-p^2 + m^2\right) e^{-ip(x-y)} \\
&= -i \int \frac{d^4p}{(2\pi)^4}\, e^{-ip(x-y)} = -i\,\delta^{(4)}(x-y)\,.
\end{aligned} \tag{3.128}
\]
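Figure 3.6: Schematic picture of the $i\epsilon$ prescription for $x^0 > y^0$. In the case $y^0 > x^0$, the integration contour is closed in the upper half-plane. [Plot in the complex $p^0$-plane (axes $\mathrm{Re}\,p^0$, $\mathrm{Im}\,p^0$): the pole near $-E_{\mathbf{p}}$ is shifted by $+i\epsilon$, the pole near $+E_{\mathbf{p}}$ by $-i\epsilon$; the contour runs along the real axis and is closed at $p^0 \to -i\infty$.]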

Notice that instead of specifying the contour, we may write the Feynman propagator as follows,
\[
D_F(x-y) = \int \frac{d^4p}{(2\pi)^4}\, \frac{i}{p^2 - m^2 + i\epsilon}\, e^{-ip(x-y)}\,, \tag{3.129}
\]
with $\epsilon > 0$ and infinitesimal. As shown in Figure 3.6, this has the effect of shifting the poles slightly off the real $p^0$-axis, so that the integration along this axis is equivalent to the integration contour displayed in Figure 3.5. This way of writing $D_F(x-y)$ is, for obvious reasons, called the $i\epsilon$ prescription.
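The effect of the $i\epsilon$ shift can be checked numerically for the $p^0$ integral alone. The following is only a sketch under stated assumptions: it keeps $\epsilon$ small but finite, cuts the integral off at a large $|p^0|$, and compares with the closed form $e^{-i\omega|t|}/(2\omega)$, $\omega = (E^2 - i\epsilon)^{1/2}$, that the residue theorem gives for the shifted poles. The function names and the parameter values are illustrative choices, not part of the lecture.

```python
import numpy as np

def p0_integral(t, energy=1.0, eps=0.1, cutoff=200.0, step=1e-3):
    """Numerically evaluate  int dp0/(2 pi)  i exp(-i p0 t) / (p0^2 - E^2 + i eps)."""
    p0 = np.arange(-cutoff, cutoff + step, step)
    integrand = 1j * np.exp(-1j * p0 * t) / (p0**2 - energy**2 + 1j * eps)
    # Trapezoidal rule along the real p0 axis, written out by hand.
    total = step * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return total / (2.0 * np.pi)

def residue_result(t, energy=1.0, eps=0.1):
    """Closed form from the residue theorem for the shifted poles."""
    omega = np.sqrt(energy**2 - 1j * eps)  # principal branch, Im(omega) < 0
    return np.exp(-1j * omega * abs(t)) / (2.0 * omega)

if __name__ == "__main__":
    for t in (1.3, -0.7):
        print(t, p0_integral(t), residue_result(t))
```

For $\epsilon \to 0$ the closed form reduces to $e^{-iE|t|}/(2E)$, i.e., the same time dependence that appears in (3.126) for $x^0 > y^0$ and in (3.127) for $y^0 > x^0$.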

3.8 Non-Relativistic Limit

In order to study the non-relativistic limit of our theory, we return to the classical complex Klein-Gordon field (for reasons that will become clear later on). We decompose it as$^{19}$
\[
\psi(x) = e^{-imt}\, \tilde\psi(x)\,, \tag{3.130}
\]
to single out the large kinematical part of the momentum of $\psi$. In terms of the new field $\tilde\psi$, the Klein-Gordon equation reads
\[
\left(\partial^2 + m^2\right)\psi = \left(\partial_t^2 - \nabla^2 + m^2\right) e^{-imt}\,\tilde\psi = e^{-imt}\left[\ddot{\tilde\psi} - 2im\,\dot{\tilde\psi} - \nabla^2\tilde\psi\right] = 0\,, \tag{3.131}
\]
where the explicit $m^2$ term cancelled against the time derivatives. The non-relativistic limit is $m \gg |\mathbf{p}|$, which after a Fourier transform is equivalent to saying that $|\ddot{\tilde\psi}| \ll m\,|\dot{\tilde\psi}|$. We are hence allowed to neglect the $\ddot{\tilde\psi}$ term in (3.131), so that the Klein-Gordon equation in the limit $m \to \infty$ becomes
\[
i\,\frac{d}{dt}\tilde\psi = -\frac{1}{2m}\,\nabla^2\tilde\psi\,. \tag{3.132}
\]
This looks very similar to the Schrödinger equation for a non-relativistic free particle of mass $m$, except that it does not have any probability interpretation. It is simply a classical field evolving through an equation that is first order in time derivatives.

$^{19}$ The exponential factor removes the large frequency part from the $x$-dependence in $\psi$. Consequently, the $x$-dependence of $\tilde\psi$ is only governed by the small residual momentum, and derivatives acting on $\tilde\psi$ are suppressed by powers of $1/m$. This way of decomposing a field is often the starting point for the construction of an effective field theory that captures the physics of the full theory in the kinematical limit $m \gg |\mathbf{p}|$. The most well-known example of such a theory in particle physics is heavy quark effective theory.
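How good the non-relativistic approximation is can be seen from the frequencies of a single plane-wave mode. After stripping off $e^{-imt}$, a mode with momentum $\mathbf{p}$ oscillates with the residual frequency $(\mathbf{p}^2 + m^2)^{1/2} - m$, while (3.132) assigns it $\mathbf{p}^2/(2m)$. The sketch below compares the two; the function names are hypothetical and the numbers are arbitrary test values in units where $m = 1$.

```python
import math

def residual_frequency(p, m):
    """Exact frequency left after removing exp(-i m t): sqrt(p^2 + m^2) - m.

    Written as p^2 / (sqrt(p^2 + m^2) + m) to avoid cancellation for p << m.
    """
    return p * p / (math.sqrt(p * p + m * m) + m)

def schroedinger_frequency(p, m):
    """Frequency implied by the first-order equation: p^2 / (2 m)."""
    return p * p / (2.0 * m)

def relative_error(p, m):
    """Relative deviation of the non-relativistic frequency from the exact one."""
    exact = residual_frequency(p, m)
    return abs(schroedinger_frequency(p, m) - exact) / exact

if __name__ == "__main__":
    for p in (0.01, 0.1, 1.0):
        print(f"|p|/m = {p:5.2f}   relative error = {relative_error(p, 1.0):.3e}")
```

For $|\mathbf{p}| \ll m$ the relative error scales like $\mathbf{p}^2/(4m^2)$, which is the first correction dropped in the limit above.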
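It is also worthwhile to consider the Lagrangian of the complex scalar field itself and to investigate what happens to (3.90) in the non-relativistic limit. We again take the limit $|\dot{\tilde\psi}| \ll m\,|\tilde\psi|$, and obtain after a straightforward calculation (where in the last step we have divided by $2m$),
\[
\mathcal{L} = i\,\tilde\psi^*\,\dot{\tilde\psi} - \frac{1}{2m}\,(\nabla\tilde\psi^*)\cdot(\nabla\tilde\psi)\,. \tag{3.133}
\]
This Lagrangian has a conserved current related to its invariance under the global phase transformation $\tilde\psi \to e^{i\alpha}\,\tilde\psi$. Employing Noether's theorem (2.17), we find that the conserved current takes the form
\[
J^\mu = \left(\tilde\psi^*\tilde\psi\,,\; \frac{i}{2m}\left[\tilde\psi\,\nabla\tilde\psi^* - \tilde\psi^*\,\nabla\tilde\psi\right]\right)\,. \tag{3.134}
\]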
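To get the Hamiltonian we compute the conjugate momentum
\[
\pi = \frac{\partial\mathcal{L}}{\partial\dot{\tilde\psi}} = i\,\tilde\psi^*\,, \tag{3.135}
\]
which does not contain a time derivative. This looks a little disconcerting, but it is fully consistent for a theory which is first order in time derivatives. In order to determine the full trajectory of the field, we only need to specify initial conditions for $\tilde\psi$ and $\tilde\psi^*$ at some point in time, say $t = 0$ (knowing the time derivatives on the initial slice is not necessary).
Since the Lagrangian (3.133) already contains a term $p\,\dot q$ (and not the usual $\tfrac{1}{2}\,p\,\dot q$), the time derivatives drop out when one computes the Hamiltonian,
\[
\mathcal{H} = \frac{1}{2m}\,(\nabla\tilde\psi^*)\cdot(\nabla\tilde\psi)\,. \tag{3.136}
\]
In order to quantize the system, we impose in the Schrödinger picture,
\[
[\tilde\psi(\mathbf{x}), \tilde\psi(\mathbf{y})] = [\tilde\psi^\dagger(\mathbf{x}), \tilde\psi^\dagger(\mathbf{y})] = 0\,, \qquad [\tilde\psi(\mathbf{x}), \tilde\psi^\dagger(\mathbf{y})] = \delta^{(3)}(\mathbf{x}-\mathbf{y})\,, \tag{3.137}
\]
and expand the field into its Fourier components,
\[
\tilde\psi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3}\; a_{\mathbf{p}}\, e^{i\mathbf{p}\cdot\mathbf{x}}\,. \tag{3.138}
\]
Inserting this into the commutation relations (3.137) leads to
\[
[a_{\mathbf{p}}, a_{\mathbf{q}}^\dagger] = (2\pi)^3\, \delta^{(3)}(\mathbf{p}-\mathbf{q})\,, \tag{3.139}
\]
where the trivial expressions have been skipped. As usual the vacuum satisfies $a_{\mathbf{p}}|0\rangle = 0$, and the excitations are $a_{\mathbf{p}_1}^\dagger \ldots a_{\mathbf{p}_n}^\dagger |0\rangle$. The one-particle states $|\mathbf{p}\rangle = a_{\mathbf{p}}^\dagger |0\rangle$ have energy
\[
H\,|\mathbf{p}\rangle = \frac{\mathbf{p}^2}{2m}\,|\mathbf{p}\rangle\,, \tag{3.140}
\]
which is the non-relativistic dispersion relation. From the above, we conclude that quantizing the first-order Lagrangian (3.133) gives rise to non-relativistic particles of mass $m$.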
Some comments seem to be in order. Notice that we have a complex field but only a single type of particle. The antiparticle is not in the spectrum. The existence of antiparticles is a consequence of relativity. A related fact is that the conserved charge $Q = \int d^3x\, :\!\tilde\psi^\dagger \tilde\psi\!:$ is the particle number. This remains conserved even if we include interactions in the Lagrangian of the form $(\tilde\psi^\dagger\tilde\psi)^2$ etc., which are invariant under a global phase rotation. So in non-relativistic theories, particle number is conserved. It is only with relativity, and the appearance of antiparticles, that particle number can change. Finally, there is no non-relativistic limit of a real scalar field. In the relativistic theory, the particles are their own antiparticles, and there is no way to construct a multiparticle theory that conserves particle number.
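Recovering QM
In QM, we talk about the position and momentum operators $\mathbf{X}$ and $\mathbf{P}$. On the other hand, as we saw below (2.1), in QFT position is relegated to a label. How do we get back to good old QM? We already have the operator for the total momentum of the field, namely (3.46). When acting on a single-particle state, it gives $\mathbf{P}\,|\mathbf{p}\rangle = \mathbf{p}\,|\mathbf{p}\rangle$. It is also not too difficult to write down the position operator $\mathbf{X}$. Let's do it in the non-relativistic limit. In this case the operator
\[
\tilde\psi^\dagger(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3}\; a_{\mathbf{p}}^\dagger\, e^{-i\mathbf{p}\cdot\mathbf{x}}\,, \tag{3.141}
\]
creates a particle localized with a delta function at $\mathbf{x}$. We hence write $|\mathbf{x}\rangle = \tilde\psi^\dagger(\mathbf{x})|0\rangle$. It is now natural to define the position operator $\mathbf{X}$ as
\[
\mathbf{X} = \int d^3x\; \mathbf{x}\; \tilde\psi^\dagger(\mathbf{x})\,\tilde\psi(\mathbf{x})\,, \tag{3.142}
\]
since it has the sought property,
\[
\begin{aligned}
\mathbf{X}\,|\mathbf{x}\rangle &= \int d^3y\; \mathbf{y}\; \tilde\psi^\dagger(\mathbf{y})\,\tilde\psi(\mathbf{y})\,\tilde\psi^\dagger(\mathbf{x})|0\rangle \\
&= \int d^3y\; \mathbf{y}\; \tilde\psi^\dagger(\mathbf{y}) \left[\delta^{(3)}(\mathbf{y}-\mathbf{x}) + \tilde\psi^\dagger(\mathbf{x})\,\tilde\psi(\mathbf{y})\right] |0\rangle = \mathbf{x}\,|\mathbf{x}\rangle\,.
\end{aligned} \tag{3.143}
\]
We can now construct a state $|\varphi\rangle$ by taking a superposition of the one-particle states $|\mathbf{x}\rangle$,
\[
|\varphi\rangle = \int d^3x\; \varphi(\mathbf{x})\, |\mathbf{x}\rangle\,. \tag{3.144}
\]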

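Notice that the weight function $\varphi(\mathbf{x})$ is what we would usually call the Schrödinger wavefunction (in the position representation). Let's make sure that it indeed has the right properties. First, it is clear that as far as $\mathbf{X}$ is concerned it behaves correctly, namely
\[
\mathbf{X}\,|\varphi\rangle = \int d^3x\; \mathbf{x}\, \varphi(\mathbf{x})\,|\mathbf{x}\rangle\,. \tag{3.145}
\]
What about the momentum operator $\mathbf{P}$? A straightforward calculation gives
\[
\begin{aligned}
\mathbf{P}\,|\varphi\rangle &= \int \frac{d^3x\, d^3p}{(2\pi)^3}\; \mathbf{p}\; a_{\mathbf{p}}^\dagger a_{\mathbf{p}}\, \varphi(\mathbf{x})\, \tilde\psi^\dagger(\mathbf{x})|0\rangle = \int \frac{d^3x\, d^3p}{(2\pi)^3}\; \mathbf{p}\; a_{\mathbf{p}}^\dagger\, e^{-i\mathbf{p}\cdot\mathbf{x}}\, \varphi(\mathbf{x})|0\rangle \\
&= \int \frac{d^3x\, d^3p}{(2\pi)^3}\; a_{\mathbf{p}}^\dagger \left(i\nabla e^{-i\mathbf{p}\cdot\mathbf{x}}\right) \varphi(\mathbf{x})|0\rangle = \int \frac{d^3x\, d^3p}{(2\pi)^3}\; e^{-i\mathbf{p}\cdot\mathbf{x}} \left(-i\nabla\varphi(\mathbf{x})\right) a_{\mathbf{p}}^\dagger|0\rangle \\
&= \int d^3x \left(-i\nabla\varphi(\mathbf{x})\right) |\mathbf{x}\rangle\,.
\end{aligned} \tag{3.146}
\]
This tells us that $\mathbf{P}$ acts as the familiar derivative on wavefunctions $\varphi(\mathbf{x})$. To obtain the final result in (3.146), we have used in a first step the relationship $[a_{\mathbf{p}}, \tilde\psi^\dagger(\mathbf{x})] = e^{-i\mathbf{p}\cdot\mathbf{x}}$, which can be easily checked, and in the third step an integration by parts. We learn that when acting on one-particle states, the operators $\mathbf{X}$ and $\mathbf{P}$ act as position and momentum operators in QM, with $[X^i, P^j]\,|\varphi\rangle = i\,\delta^{ij}\,|\varphi\rangle$ and $i, j = 1, 2, 3$.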
But what about dynamics? In particular, how does our wavefunction $\varphi(\mathbf{x})$ change with time? To address this question, we first express the Hamiltonian corresponding to the density (3.136) through ladder operators,
\[
H = \int d^3x\; \frac{1}{2m}\,(\nabla\tilde\psi^\dagger)\cdot(\nabla\tilde\psi) = \int \frac{d^3p}{(2\pi)^3}\; \frac{\mathbf{p}^2}{2m}\; a_{\mathbf{p}}^\dagger a_{\mathbf{p}}\,, \tag{3.147}
\]
which implies that
\[
i\,\frac{d\varphi}{dt} = -\frac{1}{2m}\,\nabla^2\varphi\,, \tag{3.148}
\]
which formally looks exactly like the time evolution of the original field $\tilde\psi$ given in (3.132). Yet this time, it is really the Schrödinger equation, complete with the usual probabilistic interpretation for the wavefunction (and not just a first-order differential equation). Note in particular that the conserved charge arising from the current (3.134) is $Q = \int d^3x\, |\varphi(\mathbf{x})|^2$, which is the total probability.
Historically, the fact that the equation for the classical field (3.132) and the one for the one-particle wavefunction (3.148) coincide caused some confusion. It was thought that perhaps one is quantizing the wavefunction itself, and the resulting name "second quantization" is still sometimes used today to mean QFT. However, it is important to stress that, despite the name, nothing is quantized twice. One simply quantizes a classical field once. Nonetheless, it is good to know that, if one treats the one-particle Schrödinger equation as a quantum field, then it will give the correct generalization to multiparticle states.
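The statement that the conserved charge is the total probability can be illustrated with a short numerical experiment. The following sketch evolves a wave packet with the free equation (3.148), restricted to one spatial dimension and a periodic grid for brevity (both are simplifying assumptions, as are the function names and the Gaussian initial state). Each Fourier mode only acquires the phase $e^{-ip^2t/(2m)}$, so the norm cannot change.

```python
import numpy as np

def evolve_free(wavefunction, dx, mass, time):
    """Solve i d(phi)/dt = -(1/2m) d^2(phi)/dx^2 on a periodic 1D grid via FFT."""
    p = 2.0 * np.pi * np.fft.fftfreq(wavefunction.size, d=dx)
    phase = np.exp(-1j * p**2 * time / (2.0 * mass))
    return np.fft.ifft(phase * np.fft.fft(wavefunction))

def total_probability(wavefunction, dx):
    """Discretized version of Q = int dx |phi(x)|^2."""
    return float(np.sum(np.abs(wavefunction) ** 2) * dx)

def mean_position(wavefunction, x, dx):
    """Expectation value of the position operator in the state phi."""
    return float(np.sum(x * np.abs(wavefunction) ** 2) * dx)

# Gaussian packet with mean momentum 2 (arbitrary illustrative values).
x = np.linspace(-40.0, 40.0, 1024, endpoint=False)
dx = x[1] - x[0]
packet = np.exp(-x**2 / 2.0 + 2.0j * x)
packet = packet / np.sqrt(total_probability(packet, dx))
later = evolve_free(packet, dx, mass=1.0, time=3.0)

if __name__ == "__main__":
    print(total_probability(packet, dx), total_probability(later, dx))
    print(mean_position(packet, x, dx), mean_position(later, x, dx))
```

The packet spreads and its centre moves with velocity $\langle p\rangle/m$, while the discretized $Q$ stays fixed.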

3.9 Problems

i) Consider the Klein-Gordon equation with the mass term set equal to zero and a dilatation
transformation with parameter ,
x x 0 = e x ,

(x) 0 (x0 ) = (x) ed .

(3.149)

Show that this transformation is a global symmetry of $\mathcal{L}$ if one chooses the scaling dimension $d$ in an appropriate way. Compute the associated Noether current and verify that it is conserved. Is the symmetry preserved if you add a quartic term $\lambda/(4!)\,\phi^4$ to the Lagrangian? What happens if you add a mass term $m^2\phi^2$?
ii) Consider the Lagrangian
1

1
L = ( M 2 2 ) + ( m2 ) 2 ,
2
2
2

(3.150)

with m  M and derive the EOMs for the fields and .


Express the heavy field through the light field and insert it back into the Lagrangian.
Expand your result in 1/M 2 . What has changed compared to the original Lagrangian?
Up to which energy scale would you trust the predictions of this effective Lagrangian?
iii) Show that the first relation in (3.2) is satisfied if $[a_{\mathbf{p}}, a_{\mathbf{q}}] = [a_{\mathbf{p}}^\dagger, a_{\mathbf{q}}^\dagger] = 0$ holds. Prove the commutation relations (3.43). Calculate $J^i\,|\mathbf{p} = 0\rangle$ with $J^i$ defined in (3.48). Show that the number operator $N$ defined in (3.53) commutes with the Hamiltonian $H$ of (3.27) and satisfies $N\,|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = n\,|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle$, where $|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle$ denotes the $n$-particle state introduced in (3.50).
iv) Consider a real scalar field $\phi(t, x)$ living on a two-dimensional space-time and defined on an interval $x \in [0, L]$ with Dirichlet BCs $\phi(t, 0) = \phi(t, L) = 0$. Show that the (classical) positive- and negative-frequency solutions to the Klein-Gordon equation that also satisfy the BCs have the form
\[
\phi_n^{(\pm)}(t, x) = \frac{1}{\sqrt{\omega_n L}}\; e^{\mp i\omega_n t}\, \sin(k_n x)\,. \tag{3.151}
\]
Give the expression for $k_n$ in terms of $L$. How is $\omega_n$ related to $k_n$? We now quantize the field $\phi(t, x)$, keeping in mind that momentum here is discretized, i.e.,
\[
\phi(t, x) = \sum_{n=1}^{\infty} \left[\phi_n^{(+)}(t, x)\, a_n + \phi_n^{(-)}(t, x)\, a_n^\dagger\right]\,, \tag{3.152}
\]
with the ladder operators satisfying $[a_n, a_m] = [a_n^\dagger, a_m^\dagger] = 0$ and $[a_n, a_m^\dagger] = \delta_{mn}$. Compute the VEV $\langle 0|\mathcal{H}|0\rangle$ of the Hamiltonian density
\[
\mathcal{H} = \frac{1}{2}\left[\pi^2 + (\partial_x\phi)^2 + m^2\phi^2\right]\,. \tag{3.153}
\]
Integrate your result over the interval $[0, L]$ and show that the total vacuum energy is
\[
E_0(L) = \frac{1}{2}\sum_{n=1}^{\infty} \omega_n\,. \tag{3.154}
\]
Since this quantity is infinite, we need some form of regularization in order to handle the divergence. Let us introduce an exponentially damping function $\exp(-\epsilon\,\omega_n)$ with $\epsilon > 0$ in the sum, and consider for simplicity the case of a massless field. Prove that in this case the vacuum energy can be written as
\[
E_0(L, \epsilon) = \frac{\pi}{8L}\, \sinh^{-2}\!\left(\frac{\epsilon\pi}{2L}\right)\,. \tag{3.155}
\]
Take the limit $\epsilon \to 0$ and determine the vacuum energy for the case when no BCs are imposed. With all this at hand, calculate the Casimir force.
v) Derive the charge operator Q for the U (1) invariant Lagrangian (3.90) using infinitesimal transformations. Show that the result expressed through creation and annihilation
operators takes the form of (3.95). Prove that Q satisfies (3.85). Verify that the charge
is conserved (via [H, Q] = 0). Are the operators N as defined in (3.96) conserved as
well? What happens if you add an interaction term
L =

2
( ) ,
4!

(3.156)

to the Lagrangian? What does this result imply for the case in which particles are their
own antiparticles?
vi) Consider a theory with two complex scalar fields 1 and 2 . Write down all possible terms of the Lagrangian which are Lorentz invariant and renormalizable, i.e., have
mass dimension of four and couplings with non-negative mass dimensions. Which terms
survive, if there is an additional discrete symmetry
(1 , 2 ) (1 , 2 ) ,

(3.157)

(1 , 2 ) (1 , 2 ) ,

(3.158)

and
under which the Lagrangian remains invariant?
Assume further, that both fields have the same mass, m1 = m2 , and that all dimensionless couplings are identical. You can now rewrite the Lagrangian in a more economic
way if you introduce the scalar doublet
!
1
=
,
(3.159)
2


and its hermitian conjugate. The theory at hand has four conserved global charges. One
charge follows from the U (1) invariance,
ei ,

= i ,

(3.160)

of the theory, that is already present in the case of a single complex scalar. The other
three charges correspond to the mixing of the scalar fields under an SU (2) transformation,
j
eij ,
= ij j ,
(3.161)
where the indices j = 1, 2, 3 are summed over and j = j /2 with j being the usual
Pauli matrices.
Compute the four conserved charges using Noethers theorem. For the SU (2) charges
you should find
Z

j
Q = d3 x i a ( j )ab b a ( j )ab b ,
(3.162)
where a, b = 1, 2 are field labels.
Show further, that the latter charges fulfill the SU (2) commutation relation,
[Qj , Qk ] = i jkl Ql .

(3.163)

What symmetries survive if you allow for different masses and dimensionless couplings?
vii) Consider the Lagrangian for a free complex scalar field (3.90) which is invariant under
global U (1) transformations (3.92). Is this Lagrangian invariant, if the global gets promoted to a local symmetry, i.e., e (x), where e is just a universal constant and
(x) a function of space-time?
If you now add a vector field A to the Lagrangian with a coupling


LA = i ( ) ( ) A + 2 (A )(A ) ,

(3.164)

how do the vector field and the coupling constant have to transform under the local
U (1), if the Lagrangian L + LA should remain invariant under phase redefinitions?
Compute the Noether current for this local symmetry. Add a kinetic term for the vector
field to the Lagrangian,
1
(3.165)
LA = F F ,
4
and derive the EOMs for the field A considering the full Lagrangian L + LA + LA .
viii) Compute the advanced Green's function $D_A(x-y)$ for the Klein-Gordon equation using the integration contour in Figure 3.4. Recall which initial conditions one assumes in electrodynamics and why they lead to the use of the retarded Green's function. Can you imagine physical BCs in which the advanced propagator would be the right choice? Prove and explain in this context the following two relations
\[
D_R(x) = D_A(-x)\,, \qquad D_F(x) = D_F(-x)\,. \tag{3.166}
\]

ix) Explicitly perform the steps that lead to the Lagrangian (3.133). Show that the corresponding EOM (vary with respect to $\tilde\psi^*$) is the Schrödinger equation. The Lagrangian has a global $U(1)$ symmetry, $\tilde\psi \to e^{i\alpha}\,\tilde\psi$. Verify the correctness of the expression for the Noether current (3.134) and discuss the physical meaning of the conserved charge. Based on your findings, give a reason why there is no non-relativistic limit for a real scalar field.
x) Prove that the position operator $\mathbf{X}$ given in (3.142) satisfies $\mathbf{X}\,|\mathbf{x}\rangle = \mathbf{x}\,|\mathbf{x}\rangle$. Furthermore, show that (3.144), (3.146), and (3.148) are correct.

References
[1] S. R. Coleman, Physics 253: Quantum Field Theory, Course given at Harvard
University, 1975 and 1976, http://www.damtp.cam.ac.uk/user/tong/qft/col1.pdf,
http://www.damtp.cam.ac.uk/user/tong/qft/col2.pdf,
http://www.physics.harvard.edu/about/Phys253.html
[2] N. Straumann, The history of the cosmological constant problem, arXiv:gr-qc/0208027.
[3] S. M. Carroll, The Cosmological Constant, Living Rev. Relativity 3, 1 (2001),
http://relativity.livingreviews.org/Articles/lrr-2001-1
[4] H. B. G. Casimir and D. Polder, The Influence of retardation on the London-van der
Waals forces, Phys. Rev. 73, 360 (1948).
[5] K. A. Milton, The Casimir effect: Recent controversies and progress, J. Phys. A 37,
R209 (2004) [arXiv:hep-th/0406024].

4 Interacting Fields

Often in QM, we are interested in particles moving in some fixed background potential $V(\mathbf{x})$. This can be easily incorporated into field theory by working with a Lagrangian with explicit $\mathbf{x}$ dependence. E.g., in the case of our non-relativistic complex scalar field $\tilde\psi$ discussed in Section 3.8, we could simply add a term
\[
\Delta\mathcal{L} = -V(\mathbf{x})\, \tilde\psi^*(x)\, \tilde\psi(x)\,, \tag{4.1}
\]
to the Lagrangian (3.133). Since this interaction does not respect translational symmetry, we won't have the associated energy-momentum tensor. While such Lagrangians are useful in condensed matter physics, we rarely (or never) come across them in high-energy physics, where all equations obey translational (and Lorentz) invariance.
One can of course also consider interactions between particles. Obviously, these are only important for $n$-particle states with $n \geq 2$. We therefore expect them to arise from additions to the Lagrangian (3.133) of the form
\[
\Delta\mathcal{L} = \tilde\psi^*(x)\, \tilde\psi^*(x)\, \tilde\psi(x)\, \tilde\psi(x)\,, \tag{4.2}
\]
which, in QFT, is an operator that destroys two particles before creating two new ones. Such terms in the Lagrangian will indeed lead to inter-particle forces, both in the non-relativistic and relativistic setting. In the following, we will explore these types of interactions in detail for relativistic theories.

4.1 Classification of Interactions

The free QFTs we have discussed so far are special. We can determine their spectrum, but they are dull since, as their name suggests, nothing happens. They have particle excitations, but these do not interact.
To make things more interesting (i.e., more complicated) let us include interactions in our theory. These will take the form of higher-order terms in the Lagrangian. We start by asking what kind of small perturbations we can add to the theory. E.g., let us consider the Lagrangian for a real scalar field (2.45) and add the infinite tower of additional terms
\[
\Delta\mathcal{L} = \sum_{n=3}^{\infty} \mathcal{L}_n\,, \qquad \mathcal{L}_n = -\frac{\lambda_n}{n!}\, \phi^n\,, \tag{4.3}
\]
to it. Here the coefficients $\lambda_n$ are called coupling constants. The first question that we have to address is which restrictions the coupling constants have to satisfy in order for the additional terms to be small perturbations. Naively one would think that one simply has to require that $\lambda_n \ll 1$. But this turns out to be not quite right. In order to see why the naive guess is not correct, we perform a dimensional analysis. Applying the rules gathered in Section 2.1, we find that the dimensions of the coupling constants are
\[
[\lambda_n] = 4 - n\,. \tag{4.4}
\]
This result makes clear why we cannot simply say $\lambda_n \ll 1$, because this statement is only sensible for dimensionless quantities, but not dimensionful ones.
The interaction terms in (4.3) fall into three different categories. First, dimension-three operators with $[\lambda_3] = 1$. For such terms, we can define a dimensionless parameter $\lambda_3/E$, where $E$ has dimension of mass and represents the energy scale of the process of interest. This means that $\mathcal{L}_3 = -\lambda_3\phi^3/(3!)$ is a small perturbation at high energies, i.e., $E \gg \lambda_3$, but a big one at low energies, i.e., $E \ll \lambda_3$. Such terms are called relevant, because they are most relevant at low energies, which, after all, is where most of the physics that we experience lies. In a relativistic QFT, we have $E > m$, which means that we can always make this sort of perturbation small by taking $\lambda_3 \ll m$. Second, terms of dimension four with $[\lambda_4] = 0$, e.g., $\mathcal{L}_4 = -\lambda_4\phi^4/(4!)$. Such terms are small if $\lambda_4 \ll 1$ and are called marginal. Third, operators with dimension higher than four, having $[\lambda_n] < 0$. In this case the appropriate dimensionless parameter is $\lambda_n E^{n-4}$, and terms $\mathcal{L}_n = -\lambda_n\phi^n/(n!)$ with $n \geq 5$ are small (large) at low (high) energies. Such contributions are called irrelevant, since in daily life, meaning $\lambda_n E^{n-4} \ll 1$, these operators do not matter.
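The three categories can be summarized in a few lines of code. This is only an illustrative sketch of the dimensional analysis above for a scalar field in four space-time dimensions; the function names are hypothetical and the numerical values are arbitrary.

```python
def coupling_dimension(n):
    """Mass dimension [lambda_n] = 4 - n of the coefficient of phi^n."""
    return 4 - n

def classify_operator(n):
    """Label the phi^n interaction as relevant, marginal or irrelevant."""
    if n < 3:
        raise ValueError("interaction terms start at n = 3")
    dimension = coupling_dimension(n)
    if dimension > 0:
        return "relevant"
    if dimension == 0:
        return "marginal"
    return "irrelevant"

def dimensionless_strength(n, coupling, energy):
    """Dimensionless combination lambda_n * E^(n-4) at the energy scale E."""
    return coupling * energy ** (n - 4)

if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        # Same numerical coupling, evaluated at a low and a high energy.
        low = dimensionless_strength(n, 1.0, 0.1)
        high = dimensionless_strength(n, 1.0, 10.0)
        print(n, classify_operator(n), low, high)
```

Raising the energy shrinks the dimensionless strength of the $n = 3$ term, leaves $n = 4$ untouched, and enhances all terms with $n \geq 5$.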
As we will see later, it is typically impossible to avoid high-energy processes in QFT. We have already seen a glimpse of this feature when we were discussing the structure of the vacuum in Section 3.2, which involved the calculation of an integral over infinitely large frequencies of a harmonic oscillator. We hence might expect problems with irrelevant operators, which become important at high energies. Indeed, these operators lead to non-renormalizable QFTs in which one cannot make sense of the infinities at arbitrarily high energies. This does not mean that these theories are useless; it just means that they become incomplete at some energy scale and need to be embedded into an appropriate complete theory, aka a UV completion. Let me also add that the above naive assignment of relevant, marginal, and irrelevant operators is not always carved in stone, since quantum corrections can sometimes change the character of an operator.
Low-Energy Description
In typical applications of QFT only the relevant and marginal couplings are important. This is due to the fact that the irrelevant couplings become small at low energies, as we have seen above. In practice this saves us, since instead of considering the infinite number of interaction terms in (4.3), only a handful are actually needed. E.g., in the case of the real scalar field described earlier, we only have to take into account two operators, namely $\mathcal{L}_3 = -\lambda_3\phi^3/(3!)$ and $\mathcal{L}_4 = -\lambda_4\phi^4/(4!)$, in the low-energy limit.
Let us have a closer look at this issue. Suppose that some day we discover the true superduper theory, aka the TOE, that describes the world at very high energy scales, say the GUT scale, or, if you wish, even the Planck scale. Whatever this scale is, let's call it $\Lambda$. Since it is an energy scale, we obviously have $[\Lambda] = 1$. What we want to understand are the laws of physics at energy scales $E$ that we can probe directly in a laboratory, which, given today's standards, means $E \ll \Lambda$. Let us further suppose that at energies of order $E$, the laws of physics are described by a real scalar field.$^{20}$ This scalar field will have some complicated interaction terms (4.3), where the precise form is dictated by all the stuff that is going on in the TOE. Can we get an idea about the interactions? Well, we can write our dimensionful coupling constants $\lambda_n$ in terms of dimensionless couplings $g_n$, multiplied by a suitable power of the relevant scale $\Lambda$,
\[
\lambda_n = \frac{g_n}{\Lambda^{n-4}}\,. \tag{4.5}
\]
The exact values of the dimensionless couplings $g_n$ depend on the details of the TOE,$^{21}$ so we have to do some guesswork. Since the couplings $g_n$ are dimensionless, 1 looks like a pretty good and somehow natural guess. Since we are not completely sure, let's say $g_n = O(1)$. This means that in a laboratory with $E \ll \Lambda$ the interaction terms $\mathcal{L}_n = -\lambda_n\phi^n/(n!)$ of (4.3) will be suppressed by powers of $(E/\Lambda)^{n-4}$ if $n \geq 5$. Given the LHC energy of around 1 TeV, this is a suppression by many orders of magnitude. E.g., for $\Lambda = M_P$ one has $E/\Lambda = 10^{-16}$. It is this simple argument based on dimensional analysis that ensures that we need to focus only on the first few terms in the interaction, namely those that are relevant and marginal. It also means that if we only have access to low-energy experiments, it is going to be very difficult to figure out the precise nature of the TOE, because its effects are highly diluted except for the relevant and marginal interactions. Some people therefore call the superduper theory that everybody is looking for not TOE, but TOENAIL, which stands for "theory of everything not accessible in laboratories". The discussion given above is a poor man's version of the ideas of effective field theory and Wilson's renormalization group, about which you can learn much more by asking Matthias Neubert.

$^{20}$ Of course, we know that this assumption is plain wrong, since the SM is a non-abelian gauge theory with chiral fermions, but the same argument applies in that case.
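To get a feeling for the size of the suppression, the estimate $(E/\Lambda)^{n-4}$ can be evaluated for the numbers quoted above. The sketch below takes $E = 1$ TeV and, as an order-of-magnitude assumption, $\Lambda = M_P \approx 10^{19}$ GeV together with $g_n = 1$; the constant and function names are illustrative.

```python
LHC_ENERGY_GEV = 1.0e3      # "around 1 TeV", as quoted in the text
PLANCK_SCALE_GEV = 1.0e19   # order-of-magnitude value assumed for M_P

def suppression_factor(n, energy_gev, cutoff_gev):
    """Size of (E / Lambda)^(n - 4) for the phi^n operator with g_n = O(1)."""
    return (energy_gev / cutoff_gev) ** (n - 4)

if __name__ == "__main__":
    for n in range(4, 9):
        factor = suppression_factor(n, LHC_ENERGY_GEV, PLANCK_SCALE_GEV)
        print(f"n = {n}:  (E/Lambda)^(n-4) = {factor:.1e}")
```

Each additional power of $\phi$ costs another sixteen orders of magnitude, which is why only the relevant and marginal terms are visible at low energies.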
Weakly Coupled Theories
In this course we will only deal with weakly coupled QFTs, i.e., theories that can be truly considered as small perturbations of the free field theory at all energies. We will look in more detail at two specific examples.
The first example of a weakly coupled QFT we will study is the $\phi^4$ theory,
\[
\mathcal{L} = \frac{1}{2}\,(\partial_\mu\phi)^2 - \frac{m^2}{2}\,\phi^2 - \frac{\lambda}{4!}\,\phi^4\,, \tag{4.6}
\]
where $\phi$ is our well-known real scalar field. For (4.6) to be weakly coupled we have to require $\lambda \ll 1$. We can get a hint of what the effects of the additional $\phi^4$ term will be. Expanding it in terms of ladder operators, we find terms like
\[
a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger\,, \qquad a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger a_{\mathbf{p}}\,, \tag{4.7}
\]
etc., which create and destroy particles. This signals that the $\phi^4$ Lagrangian (4.6) describes a theory in which particle number is not conserved. In fact, it is not too difficult to check that the number operator $N$ does not commute with the Hamiltonian, i.e., $[H, N] \neq 0$.
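The non-conservation of particle number can be made concrete in a toy model. The sketch below replaces the field by a single oscillator mode truncated to a finite number of levels (a drastic simplification that is an assumption of this example, not the field theory itself) and checks that the number operator commutes with the free Hamiltonian but not with one that contains a quartic term. All names and parameter values are illustrative.

```python
import numpy as np

def ladder_operators(levels):
    """Annihilation and creation operators truncated to `levels` states."""
    a = np.diag(np.sqrt(np.arange(1, levels)), k=1)
    return a, a.T

def commutator(x, y):
    return x @ y - y @ x

levels = 12
coupling = 0.1
a, a_dag = ladder_operators(levels)
number = a_dag @ a                      # N = a^dagger a
position = (a + a_dag) / np.sqrt(2.0)   # single-mode analogue of the field
h_free = number + 0.5 * np.eye(levels)
h_quartic = h_free + coupling / 24.0 * np.linalg.matrix_power(position, 4)

if __name__ == "__main__":
    print("|[H_free, N]|    =", np.linalg.norm(commutator(h_free, number)))
    print("|[H_quartic, N]| =", np.linalg.norm(commutator(h_quartic, number)))
```

The quartic term contains products such as four creation operators, which change the occupation number by four units and therefore fail to commute with $N$.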
The second example we will look at is a scalar Yukawa theory. Its Lagrangian is given by
\[
\mathcal{L} = (\partial_\mu\psi^*)(\partial^\mu\psi) + \frac{1}{2}\,(\partial_\mu\phi)^2 - M^2\,\psi^*\psi - \frac{1}{2}\,m^2\phi^2 - g\,\psi^*\psi\,\phi\,, \tag{4.8}
\]
with $g \ll M, m$. This theory couples a complex scalar $\psi$ to a real scalar $\phi$. In this theory the individual particle numbers for $\psi$ and $\phi$ are not conserved. Yet, the Lagrangian (4.8) is invariant under global phase rotations of $\psi$, which ensures that there will be a conserved charge $Q$ obeying $[H, Q] = 0$. In fact, we have met this charge already in (3.95). In consequence, in the scalar Yukawa theory the number of $\psi$ particles minus the number of $\psi$ antiparticles is conserved. Notice also that the potential in (4.8) has a stable minimum at $\psi = \phi = 0$, but it is unbounded from below if $-g\phi$ becomes too large. This means that we should not mess too much with the scalar Yukawa theory.

$^{21}$ If we knew the precise structure of the TOE we could, in fact, calculate the couplings $g_n$.

4.2 Interaction Picture

In QM, there is a useful viewpoint called the interaction picture, which allows one to deal with small perturbations to a well-understood Hamiltonian. Let me briefly recall how this works. In the Schrödinger picture, the states evolve as $i\,d/dt\,|\psi\rangle_S = H\,|\psi\rangle_S$, while the operators $O_S$ are time independent. In contrast, in the Heisenberg picture the states do not evolve with time, but the operators change with time, namely one has $|\psi\rangle_H = e^{iHt}\,|\psi\rangle_S$ and $O_H = e^{iHt}\, O_S\, e^{-iHt}$. The interaction picture is a hybrid of the two. We split the Hamiltonian as
\[
H = H_0 + H_{\rm int}\,, \tag{4.9}
\]
where in the interaction picture the time dependence of operators $O_I$ is governed by $H_0$, while the time dependence of the states $|\psi\rangle_I$ is governed by $H_{\rm int}$. While this split is arbitrary, things are easiest if one is able to solve the Hamiltonian $H_0$, e.g., if $H_0$ is the Hamiltonian of a free theory. From what I have said so far, it follows that
\[
|\psi\rangle_I = e^{iH_0 t}\,|\psi\rangle_S\,, \qquad O_I = e^{iH_0 t}\, O_S\, e^{-iH_0 t}\,. \tag{4.10}
\]
Since the Hamiltonian is itself an operator, the latter equation also applies to the interaction Hamiltonian $H_{\rm int}$. In consequence, one has
\[
H_I = (H_{\rm int})_I = e^{iH_0 t}\, H_{\rm int}\, e^{-iH_0 t}\,. \tag{4.11}
\]
The Schrödinger equation in the interaction picture is readily derived starting from the Schrödinger picture,
\[
\begin{aligned}
i\,\frac{d}{dt}\,|\psi\rangle_S = H\,|\psi\rangle_S \quad &\Longrightarrow \quad i\,\frac{d}{dt}\left(e^{-iH_0 t}\,|\psi\rangle_I\right) = (H_0 + H_{\rm int})\, e^{-iH_0 t}\,|\psi\rangle_I\,, \\
&\Longrightarrow \quad i\,\frac{d}{dt}\,|\psi\rangle_I = e^{iH_0 t}\, H_{\rm int}\, e^{-iH_0 t}\,|\psi\rangle_I\,, \\
&\Longrightarrow \quad i\,\frac{d}{dt}\,|\psi\rangle_I = H_I\,|\psi\rangle_I\,.
\end{aligned} \tag{4.12}
\]
Dyson's Formula
In order to solve the system described by the Hamiltonian (4.9), we have to find a solution to the Schrödinger equation in the interaction picture (4.12). Let us write the solution as
\[
|\psi(t)\rangle_I = U(t, t_0)\,|\psi(t_0)\rangle_I\,, \tag{4.13}
\]
where $U(t, t_0)$ is a unitary time-evolution operator satisfying $U(t, t) = 1$, $U(t_1, t_2)\,U(t_2, t_3) = U(t_1, t_3)$, and $U(t_1, t_3)\,[U(t_2, t_3)]^\dagger = U(t_1, t_2)$. Inserting (4.13) into the last line of (4.12) gives
\[
i\,\frac{d}{dt}\,U(t, t_0) = H_I(t)\,U(t, t_0)\,. \tag{4.14}
\]
If $H_I$ were a function, the solution to the differential equation (4.14) would read
\[
U(t, t_0) \overset{?}{=} \exp\left(-i\int_{t_0}^{t} dt'\, H_I(t')\right)\,. \tag{4.15}
\]
Yet, $H_I$ is not a function but an operator, and this causes ordering issues. Let's have a closer look at the exponential to understand where the trouble comes from. The exponential is defined through its power expansion,
\[
\exp\left(-i\int_{t_0}^{t} dt'\, H_I(t')\right) = 1 - i\int_{t_0}^{t} dt'\, H_I(t') + \frac{(-i)^2}{2}\left(\int_{t_0}^{t} dt'\, H_I(t')\right)^2 + \ldots\,. \tag{4.16}
\]
When we differentiate this with respect to $t$, the third term on the right-hand side gives
\[
-\frac{1}{2}\left(\int_{t_0}^{t} dt'\, H_I(t')\right) H_I(t) - \frac{1}{2}\, H_I(t)\left(\int_{t_0}^{t} dt'\, H_I(t')\right)\,. \tag{4.17}
\]
The second term of this expression looks good, since it is part of $H_I(t)\,U(t, t_0)$ appearing on the right-hand side of (4.14), but the first term is no good, because the $H_I(t)$ sits on the wrong side of the integral, and we cannot commute it through, given that $[H_I(t'), H_I(t)] \neq 0$ when $t \neq t'$. So what is the correct expression for $U(t, t_0)$ then?
The correct answer is provided by Dyson's formula,$^{22}$ which reads
\[
U(t, t_0) = T\exp\left(-i\int_{t_0}^{t} dt'\, H_I(t')\right)\,. \tag{4.18}
\]
Here $T$ denotes time ordering as defined in (3.124). It is easy to prove the latter statement. We start by expanding out (4.18), which leads to
\[
\begin{aligned}
U(t, t_0) &= 1 - i\int_{t_0}^{t} dt'\, H_I(t') \\
&\quad + \frac{(-i)^2}{2}\left[\int_{t_0}^{t} dt' \int_{t'}^{t} dt''\, H_I(t'')\,H_I(t') + \int_{t_0}^{t} dt' \int_{t_0}^{t'} dt''\, H_I(t')\,H_I(t'')\right] + \ldots\,.
\end{aligned} \tag{4.19}
\]
In fact, the terms in the last line are actually the same, since
\[
\begin{aligned}
\int_{t_0}^{t} dt' \int_{t'}^{t} dt''\, H_I(t'')\,H_I(t') &= \int_{t_0}^{t} dt'' \int_{t_0}^{t''} dt'\, H_I(t'')\,H_I(t') \\
&= \int_{t_0}^{t} dt' \int_{t_0}^{t'} dt''\, H_I(t')\,H_I(t'')\,,
\end{aligned} \tag{4.20}
\]
where the range of integration in the first expression is over $t'' \geq t'$, while in the second expression one integrates over $t' \leq t''$, which is, of course, the same thing. The final expression is simply obtained by relabelling $t'$ and $t''$. In fact, it is not too difficult to show that one has
\[
\int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2 \ldots \int_{t_0}^{t_{n-1}} dt_n\, H_I(t_1) \ldots H_I(t_n) = \frac{1}{n!}\int_{t_0}^{t} dt_1 \ldots dt_n\, T\left(H_I(t_1) \ldots H_I(t_n)\right)\,. \tag{4.21}
\]

$^{22}$ Essentially figured out by Paul Dirac, but in its compact notation due to Freeman Dyson.
Putting things together this means that the power expansion of (4.18) takes the form
$$ U(t,t_0) = 1 - i\int_{t_0}^{t} dt'\,H_I(t')
   + (-i)^2\int_{t_0}^{t} dt'\int_{t_0}^{t'} dt''\,H_I(t')\,H_I(t'') + \ldots . \tag{4.22} $$
The proof of Dyson's formula is straightforward. First, observe that under the T operation,
all operators commute, since their order is already fixed by time ordering. Thus,
$$ \begin{aligned}
i\,\frac{d}{dt}\,U(t,t_0)
 &= i\,\frac{d}{dt}\left[T\exp\left(-i\int_{t_0}^{t} dt'\,H_I(t')\right)\right]
  = T\left[H_I(t)\exp\left(-i\int_{t_0}^{t} dt'\,H_I(t')\right)\right] \\
 &= H_I(t)\,T\exp\left(-i\int_{t_0}^{t} dt'\,H_I(t')\right) = H_I(t)\,U(t,t_0) .
\end{aligned} \tag{4.23} $$
Notice that t, being the upper limit of the integral, is the latest time, so that the factor
HI (t) can be pulled out to the left.
Before moving on, I have to say that Dyson's formula is rather formal. In practice, it turns
out to be very difficult to compute the time-ordered exponential in (4.18). The power of (4.18)
comes from the expansion (4.22), which is valid when HI is a small perturbation to H0 .
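
The difference between the naive exponential (4.15) and the time-ordered exponential (4.18) can be seen numerically in a toy model. The sketch below is only an illustration under stated assumptions: the 2x2 Hamiltonian H_I(t) = cos(t) sigma_x + sin(t) sigma_z and all function names are made up for this example and are not part of the field theory discussed in the text. It approximates (4.18) by a product of short-time factors with later times standing to the left, and checks how well each candidate satisfies the differential equation (4.14).

```python
import math


def pauli_combo(c0, cx, cy, cz):
    """Return the 2x2 matrix c0*1 + cx*sigma_x + cy*sigma_y + cz*sigma_z."""
    return [[c0 + cz, cx - 1j * cy], [cx + 1j * cy, c0 - cz]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def h_vec(t):
    """Pauli components of the toy H_I(t); [H_I(t), H_I(t')] != 0 for t != t'."""
    return (math.cos(t), 0.0, math.sin(t))


def exp_minus_i(vec, dt):
    """exp(-i dt vec.sigma) = cos(|vec| dt) - i sin(|vec| dt) vec.sigma / |vec|."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return pauli_combo(1.0, 0.0, 0.0, 0.0)
    c, s = math.cos(norm * dt), math.sin(norm * dt)
    nx, ny, nz = (v / norm for v in vec)
    return pauli_combo(c, -1j * s * nx, -1j * s * ny, -1j * s * nz)


def u_time_ordered(t0, t, steps=2000):
    """Approximate (4.18): short-time factors multiplied with later times to the left."""
    dt = (t - t0) / steps
    u = pauli_combo(1.0, 0.0, 0.0, 0.0)
    for k in range(steps):
        u = matmul(exp_minus_i(h_vec(t0 + (k + 0.5) * dt), dt), u)
    return u


def u_naive(t0, t, steps=2000):
    """The guess (4.15): ordinary exponential of the integrated Hamiltonian."""
    dt = (t - t0) / steps
    total = [0.0, 0.0, 0.0]
    for k in range(steps):
        for j, comp in enumerate(h_vec(t0 + (k + 0.5) * dt)):
            total[j] += comp * dt
    return exp_minus_i(tuple(total), 1.0)


def ode_residual(u_func, t0, t, eps=1e-4):
    """Frobenius norm of i dU/dt - H_I(t) U, using a central difference for dU/dt."""
    up, um, u = u_func(t0, t + eps), u_func(t0, t - eps), u_func(t0, t)
    hu = matmul(pauli_combo(0.0, *h_vec(t)), u)
    sq = 0.0
    for i in range(2):
        for j in range(2):
            sq += abs(1j * (up[i][j] - um[i][j]) / (2 * eps) - hu[i][j]) ** 2
    return math.sqrt(sq)


if __name__ == "__main__":
    print("residual of (4.14), time-ordered:", ode_residual(u_time_ordered, 0.0, 2.0))
    print("residual of (4.14), naive       :", ode_residual(u_naive, 0.0, 2.0))
```

In this toy model the time-ordered product satisfies (4.14) up to discretization error, whereas the naive exponential leaves a residual of order one, which is the ordering problem identified in (4.17).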

4.3 First Look at Scattering Processes

Let us now try to apply the interaction picture to QFT, starting with an easy example, namely
the interaction Hamiltonian of the Yukawa theory,
$$ H_{\rm int} = g\int d^3x\;\psi^\dagger\psi\,\phi . \tag{4.24} $$
Unlike the free theories discussed in Section 2, this interaction does not conserve the particle
number of the individual fields, allowing particles of one type to morph into others. In order to
see why this is the case, we look at the evolution of the state, i.e., $|\psi(t)\rangle = U(t,t_0)\,|\psi(t_0)\rangle$, in
the interaction picture. If $g \ll M, m$, where M and m are the masses of $\psi$ and $\phi$, respectively,
the perturbation (4.24) is small and we can approximate the full time-evolution operator
U (t, t0 ) in (4.18) by (4.22). Notice that (4.22) is, in fact, an expansion in powers of Hint . The
interaction Hamiltonian Hint contains ladder operators for each type of particle. In particular,
glancing at (3.107) tells us that the $\phi$ field contains the operators $a^\dagger$ and $a$ that create or
destroy $\phi$ particles.²³ Let's call these particles mesons (M ). On the other hand, from the
discussion in Sections 3.4 and 3.5, it follows that the $\psi$ field contains the operators $a_+$ and
$a_-^\dagger$, which implies that it creates antiparticles and destroys particles. We will call these
particles nucleons (N ).²⁴ Finally, the action of $\psi^\dagger$ is to create nucleons through $a_+^\dagger$ and to
destroy antinucleons via $a_-$.

²³ The additional subscript p of the ladder operators $a$ and $a^\dagger$ etc. is dropped hereafter in the text.

While the individual particle number is not conserved, it is important to emphasize that
$Q = N_+ - N_-$ as defined in (3.95) is conserved not only in the free theory, but also in the
presence of Hint . At first order in Hint , one will have terms of the form $a_+^\dagger a_-^\dagger a$, which destroy
a meson and create a nucleon-antinucleon pair, $M \to N\bar{N}$. At second order in Hint , we have
more complicated processes. E.g., the combination of ladder operators $(a_+^\dagger a_-^\dagger a)(a_+ a_- a^\dagger)$ gives
rise to the scattering process $N\bar{N} \to M \to N\bar{N}$. The rest of this section is devoted to calculating
the quantum amplitudes for such processes to occur.
In order to calculate the amplitude, we have to make an important, but slightly dodgy,
assumption. We require that the initial state $|i\rangle$ at $t \to -\infty$ (final state $|f\rangle$ at $t \to +\infty$) is
an eigenstate of the free theory described by the Hamiltonian H0 . At some level, this sounds
like a reasonable approximation. If at $t \to \mp\infty$ the particles are well separated they do not
feel the effects of each other. Moreover, we intuitively expect that the states $|i\rangle$ and $|f\rangle$ are
eigenstates of the individual number operators $N$ and $N_\pm$. These operators commute with H0 ,
but not with Hint . As the particles approach each other, they interact briefly, before departing
again, each going on its own way. The amplitude to go from $|i\rangle$ to $|f\rangle$ is given by
$$ \lim_{t_\mp \to \mp\infty} \langle f|\,U(t_+,t_-)\,|i\rangle = \langle f|\,S\,|i\rangle , \tag{4.25} $$
where the unitary operator S is known as the S-matrix. Needless to say, the S in S-matrix
stands for scattering.
There are a number of reasons why the assumption of non-interacting initial and final states
$|i\rangle$ and $|f\rangle$ is shaky. First, one cannot describe bound states. E.g., naively this formalism
cannot deal with the scattering of an electron ($e^-$) and a proton (p) which collide, bind, and leave as a
hydrogen atom. It is possible to circumvent this objection, since it turns out that bound
states show up as poles in the S-matrix. Second, and more importantly, a single particle,
a long way from its neighbors, is never alone in field theory. This is true even in classical
electrodynamics, where the electron sources the electromagnetic field from which it can never
escape. In QED, a related fact is that there is a cloud of virtual photons surrounding the
electron. This line of thought gets us into the issues of renormalization and you will hear
more on this later. For the time being, let me simply use the assumption of non-interacting
asymptotic states. After developing the basics of scattering theory, we will revisit the latter
problem.
Example: Meson Decay
Let us consider the relativistically normalized initial and final states,
$$ |i\rangle = \sqrt{2E_{p_1}}\;a_{p_1}^\dagger\,|0\rangle , \qquad
   |f\rangle = \sqrt{4E_{q_1}E_{q_2}}\;a_{+,q_1}^\dagger\,a_{-,q_2}^\dagger\,|0\rangle . \tag{4.26} $$

²⁴ Of course, in reality nucleons are spin-1/2 particles, and do not arise from the quantization of a scalar
field. Our scalar Yukawa theory is therefore only a toy model for nucleons interacting with mesons.


The initial state contains a meson with momentum p1 , while the final state contains a
nucleon-antinucleon pair of momenta q1 and q2 . At leading order in the interaction Hint (4.24), the
amplitude for the process $M \to N\bar{N}$ is given by
$$ \langle f|S|i\rangle = -ig\,\langle f|\int d^4x\;\psi_I^\dagger(x)\,\psi_I(x)\,\phi_I(x)\,|i\rangle . \tag{4.27} $$
Let us calculate this matrix element step by step. We first express $\phi_I$ in terms of $a$ and $a^\dagger$
using (3.107). Notice that it is correct to apply the latter equation, since the $\phi_I$ field in (4.27)
is in the interaction picture, which is the same as the Heisenberg picture of the free theory.
The annihilation operator in (3.107) will turn $|i\rangle$ into something proportional to $|0\rangle$, while the
piece containing a creation operator will turn $|i\rangle$ into a two-meson state. A two-meson state
has however no overlap with $\langle f|$, which is a $N\bar{N}$ state, and the ladder operators appearing in
$\psi_I^\dagger$ and $\psi_I$ cannot change this situation. So we have
$$ \begin{aligned}
\langle f|S|i\rangle
 &= -ig\,\langle f|\int d^4x\;\psi_I^\dagger(x)\psi_I(x)\int\frac{d^3k}{(2\pi)^3}
    \frac{\sqrt{2E_{p_1}}}{\sqrt{2E_k}}\;a_k\,a_{p_1}^\dagger\,e^{-ikx}\,|0\rangle \\
 &= -ig\,\langle f|\int d^4x\;\psi_I^\dagger(x)\psi_I(x)\int\frac{d^3k}{(2\pi)^3}
    \frac{\sqrt{2E_{p_1}}}{\sqrt{2E_k}}
    \left[(2\pi)^3\delta^{(3)}(p_1-k) + a_{p_1}^\dagger a_k\right] e^{-ikx}\,|0\rangle \\
 &= -ig\,\langle f|\int d^4x\;\psi_I^\dagger(x)\psi_I(x)\,e^{-ip_1x}\,|0\rangle .
\end{aligned} \tag{4.28} $$
Now we do the same for $\psi_I^\dagger$ and $\psi_I$. To get a non-zero overlap with our nucleon-antinucleon
final state, we have to pick up the creation operators $a_+^\dagger$ and $a_-^\dagger$ from the Fourier expansion
of the field operators. Altogether we then have
$$ \begin{aligned}
\langle f|S|i\rangle
 &= -ig\,\langle 0|\int\frac{d^4x\,d^3k_1\,d^3k_2}{(2\pi)^6}
    \frac{\sqrt{4E_{q_1}E_{q_2}}}{\sqrt{4E_{k_1}E_{k_2}}}\;
    a_{-,q_2}\,a_{+,q_1}\,a_{+,k_1}^\dagger\,a_{-,k_2}^\dagger\,|0\rangle\;e^{i(k_1+k_2-p_1)x} \\
 &= -ig\int\frac{d^4x\,d^3k_1\,d^3k_2}{(2\pi)^6}
    \frac{\sqrt{4E_{q_1}E_{q_2}}}{\sqrt{4E_{k_1}E_{k_2}}}\;
    (2\pi)^6\,\delta^{(3)}(q_1-k_1)\,\delta^{(3)}(q_2-k_2)\;e^{i(k_1+k_2-p_1)x} \\
 &= -ig\,(2\pi)^4\,\delta^{(4)}(q_1+q_2-p_1) ,
\end{aligned} \tag{4.29} $$
where we have repeatedly made use of the commutation relations of the ladder operators as
given in (3.16) and ignored contributions where annihilation operators act on the vacuum,
since these vanish by definition. We have drawn first blood: the result in (4.29) is our first
QFT amplitude.
Notice that the delta function constrains the possible $M \to N\bar{N}$ decays. In particular,
the decay can only happen at all if the mass of the meson is larger than or equal to the mass of the
nucleon-antinucleon state, i.e., $m \geq 2M$. In order to see this, we simply boost our reference
frame so that the meson is at rest, $p_1 = (m, 0, 0, 0)$. This is always possible. Momentum
conservation, as imposed by the delta function, then implies that the nucleon and antinucleon
are produced back-to-back, $\vec{q}_1 = -\vec{q}_2$, and that $m = 2\,(M^2 + |\vec{q}_{1,2}|^2)^{1/2} \geq 2M$.
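
The threshold condition can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a meson decaying at rest; the function name and the numerical masses are made up for illustration. It solves m = 2 (M^2 + |q|^2)^(1/2) for the magnitude of the nucleon momentum and reports when no real solution exists.

```python
import math


def decay_momentum(m_meson, m_nucleon):
    """|q| of each decay product for a meson decaying at rest, or None below threshold."""
    if m_meson < 2.0 * m_nucleon:
        return None  # energy conservation cannot be satisfied for m < 2M
    return math.sqrt(m_meson**2 / 4.0 - m_nucleon**2)


if __name__ == "__main__":
    # Illustrative masses in arbitrary units (not taken from the text).
    for m, big_m in [(3.0, 1.0), (2.0, 1.0), (1.5, 1.0)]:
        q = decay_momentum(m, big_m)
        if q is None:
            print(f"m = {m}, M = {big_m}: decay forbidden since m < 2M")
        else:
            # 2E must reproduce the meson mass m.
            print(f"m = {m}, M = {big_m}: |q| = {q:.4f}, 2E = {2 * math.hypot(big_m, q):.4f}")
```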

4.4 Wick's Theorem

Using Dyson's formulas (4.18) and (4.22), we want to compute matrix elements such as
$$ \langle f|\,T\{H_I(x_1)\ldots H_I(x_n)\}\,|i\rangle , \tag{4.30} $$
where $|i\rangle$ and $|f\rangle$ are assumed to be asymptotically free states. The ordering of the operators
HI is fixed by time ordering. However, since the interaction Hamiltonian contains certain
creation and annihilation operators, it would be convenient if we could move all
annihilation operators to the right, where they can start eliminating particles in $|i\rangle$. Recall
that this is precisely normal ordering as defined in (3.27). Wick's theorem tells us how
to go from time-ordered products to normal-ordered products. Before stating Wick's theorem
in its full generality, let's keep it simple and try to rederive something that we know already.
This is always a good idea.
Case of Two Fields
The simplest matrix element of the form (4.30) is
$$ \langle 0|\,T\,\phi_I(x)\phi_I(y)\,|0\rangle . \tag{4.31} $$

We already calculated this object in Section 3.7 and gave it the name Feynman propagator.
What we want to do now is to rewrite it in such a way that it is easy to evaluate and to
generalize the obtained result to the case with more than two fields. We start by decomposing
the real scalar field in the interaction picture as
$$ \phi_I(x) = \phi_I^+(x) + \phi_I^-(x) , \tag{4.32} $$
with²⁵
$$ \phi_I^+(x) = \int\frac{d^3k}{(2\pi)^3}\frac{1}{\sqrt{2E_k}}\;a_k\,e^{-ikx} , \qquad
   \phi_I^-(x) = \int\frac{d^3k}{(2\pi)^3}\frac{1}{\sqrt{2E_k}}\;a_k^\dagger\,e^{ikx} . \tag{4.33} $$
This decomposition can be done for any free field. It is useful since
$$ \phi_I^+(x)\,|0\rangle = 0 , \qquad \langle 0|\,\phi_I^-(x) = 0 . \tag{4.34} $$

Now we consider the case $x^0 > y^0$ and compute the time-ordered product of the two scalar fields,
$$ \begin{aligned}
T\,\phi_I(x)\phi_I(y) &= \phi_I(x)\phi_I(y)
   = \big(\phi_I^+(x) + \phi_I^-(x)\big)\big(\phi_I^+(y) + \phi_I^-(y)\big) \\
 &= \phi_I^+(x)\phi_I^+(y) + \phi_I^-(x)\phi_I^+(y) + \phi_I^-(y)\phi_I^+(x) + \phi_I^-(x)\phi_I^-(y) \\
 &\quad + \big[\phi_I^+(x), \phi_I^-(y)\big] ,
\end{aligned} \tag{4.35} $$
where we have normal ordered the last line, i.e., brought all $\phi_I^+$'s to the right. To get rid of
the $\phi_I^+(x)\phi_I^-(y)$ term, we have added the commutator $[\phi_I^+(x), \phi_I^-(y)]$. In the case $x^0 < y^0$, we
find, repeating the above exercise,
$$ T\,\phi_I(x)\phi_I(y) = {:}\,\phi_I(x)\phi_I(y)\,{:} + \big[\phi_I^+(y), \phi_I^-(x)\big] , \tag{4.36} $$
where we have made use of the fact that the first four terms in the last line of (4.35) are simply
the normal-ordered product of the two fields, ${:}\,\phi(x)\phi(y)\,{:}$.

²⁵ The superscripts ± do not make much sense, but I just follow Pauli and Heisenberg here. If you have
to, complain with them.

In order to combine the results (4.35) and (4.36) into one equation, we define the contraction
of two fields,
$$ \wick{\c\phi_I(x)\,\c\phi_I(y)} =
   \begin{cases}
   \big[\phi_I^+(x), \phi_I^-(y)\big] , & x^0 > y^0 , \\
   \big[\phi_I^+(y), \phi_I^-(x)\big] , & y^0 > x^0 .
   \end{cases} \tag{4.37} $$
This definition implies that the contraction of two $\phi_I$ fields is nothing but the Feynman
propagator:
$$ \wick{\c\phi_I(x)\,\c\phi_I(y)} = D_F(x-y) . \tag{4.38} $$
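
The step from the definition (4.37) to the propagator (4.38) can be spelled out using the mode expansions (4.33). The sketch below assumes the normalization $[a_k, a_{k'}^\dagger] = (2\pi)^3\delta^{(3)}(k-k')$, which is the one that produces the factor $(2\pi)^3\delta^{(3)}(p_1-k)$ in (4.28); the symbol $D(x-y)$ is introduced here only as shorthand for the resulting function.

```latex
\begin{aligned}
\big[\phi_I^+(x), \phi_I^-(y)\big]
 &= \int\frac{d^3k\,d^3k'}{(2\pi)^6}\frac{1}{\sqrt{4E_kE_{k'}}}\;
    \big[a_k, a_{k'}^\dagger\big]\;e^{-ikx+ik'y} \\
 &= \int\frac{d^3k}{(2\pi)^3}\frac{1}{2E_k}\;e^{-ik(x-y)} \;\equiv\; D(x-y) , \\[4pt]
\langle 0|\,\phi_I(x)\phi_I(y)\,|0\rangle
 &= \langle 0|\,\phi_I^+(x)\phi_I^-(y)\,|0\rangle
  = \langle 0|\,\big[\phi_I^+(x), \phi_I^-(y)\big]\,|0\rangle = D(x-y) .
\end{aligned}
```

The commutator is therefore an ordinary function rather than an operator, and the two cases in (4.37) combine to $\theta(x^0-y^0)\,D(x-y) + \theta(y^0-x^0)\,D(y-x)$, which is the vacuum expectation value of the time-ordered product (4.31).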

For a string of field operators $\phi_I$, the contraction of a pair of fields means replacing the
contracted operators with the Feynman propagator, leaving all other operators untouched.
Equipped with the definition (4.37), the relation between time-ordered and normal-ordered
products of two fields can now be simply written as
$$ T\,\phi_I(x)\phi_I(y) = {:}\,\phi_I(x)\phi_I(y)\,{:} + \wick{\c\phi_I(x)\,\c\phi_I(y)} . \tag{4.39} $$

Let me emphasize that while both $T\,\phi_I(x)\phi_I(y)$ and ${:}\,\phi_I(x)\phi_I(y)\,{:}$ are operators, their difference
is a complex function, namely the Feynman propagator or the contraction of two $\phi_I$ fields.
The formalism of contractions is also straightforwardly extended to our complex scalar
field $\psi_I$. One has
$$ T\,\psi_I(x)\psi_I^\dagger(y) = {:}\,\psi_I(x)\psi_I^\dagger(y)\,{:} + \wick{\c\psi_I(x)\,\c\psi_I^\dagger(y)} , \tag{4.40} $$
prompting us to define the contraction in this case as
$$ \wick{\c\psi_I(x)\,\c\psi_I^\dagger(y)} = D_F(x-y) , \qquad
   \wick{\c\psi_I(x)\,\c\psi_I(y)} = \wick{\c\psi_I^\dagger(x)\,\c\psi_I^\dagger(y)} = 0 . \tag{4.41} $$
For convenience and brevity, I will from here on often drop the subscript I, whenever I
calculate matrix elements of the form (4.30). There is however little room for confusion, since
contractions will always involve interaction-picture fields.
Strings of Fields
With all this new notation at hand, the generalization to arbitrarily many fields is also easy
to write down:
$$ T\{\phi(x_1)\ldots\phi(x_n)\} = {:}\,\{\phi(x_1)\ldots\phi(x_n) + \text{all possible contractions}\}\,{:} . \tag{4.42} $$
This identity is known as Wick's theorem. Notice that for n = 2 the latter equation is
equivalent to (4.39). Before proving Wick's theorem, let me tell you what the phrase 'all
possible contractions' means by giving a simple example.

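For n = 4 we have, writing $\phi_i$ instead of $\phi(x_i)$ for brevity,
$$ \begin{aligned}
T\{\phi_1\phi_2\phi_3\phi_4\} = {:}\,\Big\{ & \phi_1\phi_2\phi_3\phi_4
   + \wick{\c\phi_1\c\phi_2}\phi_3\phi_4 + \wick{\c\phi_1\phi_2\c\phi_3}\phi_4 + \wick{\c\phi_1\phi_2\phi_3\c\phi_4} \\
 & + \phi_1\wick{\c\phi_2\c\phi_3}\phi_4 + \phi_1\wick{\c\phi_2\phi_3\c\phi_4} + \phi_1\phi_2\wick{\c\phi_3\c\phi_4} \\
 & + \wick{\c1\phi_1\c1\phi_2\c2\phi_3\c2\phi_4} + \wick{\c1\phi_1\c2\phi_2\c1\phi_3\c2\phi_4}
   + \wick{\c1\phi_1\c2\phi_2\c2\phi_3\c1\phi_4} \Big\}\,{:} .
\end{aligned} \tag{4.43} $$
When the contracted field operators are not adjacent, we still define the contraction to give a factor of $D_F$.
E.g.,
$$ {:}\,\phi_1\wick{\c\phi_2\phi_3\c\phi_4}\,{:} = D_F(x_2-x_4)\;{:}\,\phi_1\phi_3\,{:} . \tag{4.44} $$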

Since the VEV of any normal-ordered operator vanishes, i.e., $\langle 0|\,{:}\,O\,{:}\,|0\rangle = 0$, sandwiching
any term of (4.43) in which there remain uncontracted field operators between vacuum states
$|0\rangle$ gives zero. This means that only the three fully contracted terms in the last line of that
equation survive and they are all complex functions. We therefore have
$$ \begin{aligned}
\langle 0|\,T\{\phi_1\phi_2\phi_3\phi_4\}\,|0\rangle = {} & D_F(x_1-x_2)\,D_F(x_3-x_4) \\
 & + D_F(x_1-x_3)\,D_F(x_2-x_4) \\
 & + D_F(x_1-x_4)\,D_F(x_2-x_3) ,
\end{aligned} \tag{4.45} $$
which is a rather simple result and has, as we will see in the next section, a nice pictorial
interpretation.
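
The phrase 'all possible contractions' can be made concrete by enumeration. The sketch below (the function name is made up for this illustration) generates every set of disjoint pairs that can be drawn from n field labels, including the empty set, so that its output is in one-to-one correspondence with the terms on the right-hand side of (4.42). Keeping only the fully contracted sets reproduces the propagator products that survive in the vacuum expectation value.

```python
def contraction_sets(labels):
    """Yield every set of disjoint pairs (contractions) drawn from `labels`.

    The empty set (no contraction at all) is included, so each result
    corresponds to one term on the right-hand side of Wick's theorem.
    """
    labels = list(labels)
    if len(labels) < 2:
        yield ()
        return
    first, rest = labels[0], labels[1:]
    # Case 1: the first field stays uncontracted.
    yield from contraction_sets(rest)
    # Case 2: the first field is contracted with one of the remaining fields.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in contraction_sets(remaining):
            yield ((first, partner),) + tail


if __name__ == "__main__":
    terms = list(contraction_sets([1, 2, 3, 4]))
    print("terms in Wick's theorem for n = 4:", len(terms))
    # Only fully contracted terms survive between vacuum states.
    for term in terms:
        if len(term) == 2:
            print(" ".join(f"D_F(x{a} - x{b})" for a, b in term))
```

For n = 4 the enumeration finds one term without contractions, six with a single contraction, and three that are fully contracted, matching the ten terms of (4.43).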
Proof of Wick's Theorem
We would still like to prove Wick's theorem. Naturally this is done by induction. We have already
proved the case n = 2. So let's assume that (4.42) is valid for n − 1 and try to show that
the latter equation also holds for n field operators. Without loss of generality we can assume
that $x_1^0 > \ldots > x_n^0$, since if this is not the case we simply relabel the points in an appropriate
way. Such a relabeling leaves both sides of (4.42) unchanged. Then applying Wick's theorem
to the string $\phi_2\ldots\phi_n$, we arrive at
$$ \begin{aligned}
T\{\phi_1\ldots\phi_n\} &= \phi_1\ldots\phi_n \\
 &= \phi_1\;{:}\,\{\phi_2\ldots\phi_n + \text{all contractions not involving } \phi_1\}\,{:} \\
 &= \big(\phi_1^+ + \phi_1^-\big)\;{:}\,\{\phi_2\ldots\phi_n + \text{all contractions not involving } \phi_1\}\,{:} .
\end{aligned} \tag{4.46} $$

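We now want to move the $\phi_1^\pm$'s into the ${:}\ldots{:}$. For $\phi_1^-$ this is easy, since moving it in, it is
already on the left-hand side and thus the resulting term is normal ordered. The term with
$\phi_1^+$ is more complicated because we have to bring it into normal order by commuting $\phi_1^+$ to
the right. E.g., consider the term without contractions,
$$ \begin{aligned}
\phi_1^+\;{:}\,\phi_2\ldots\phi_n\,{:}
 &= {:}\,\phi_2\ldots\phi_n\,{:}\;\phi_1^+ + \big[\phi_1^+, {:}\,\phi_2\ldots\phi_n\,{:}\big] \\
 &= {:}\,\phi_1^+\phi_2\ldots\phi_n\,{:}
    + {:}\,\Big(\big[\phi_1^+, \phi_2^-\big]\phi_3\ldots\phi_n + \phi_2\big[\phi_1^+, \phi_3^-\big]\phi_4\ldots\phi_n + \ldots\Big)\,{:} \\
 &= {:}\,\Big(\phi_1^+\phi_2\ldots\phi_n + \wick{\c\phi_1\c\phi_2}\phi_3\ldots\phi_n
    + \wick{\c\phi_1\phi_2\c\phi_3}\phi_4\ldots\phi_n + \ldots\Big)\,{:} .
\end{aligned} \tag{4.47} $$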

Here we first used the fact that the commutator of a single operator and a string of operators
can be written as a sum of all possible strings of operators with two adjacent operators put
into a commutator. The simplest relation of this type reads $[\phi_1, \phi_2\phi_3] = [\phi_1, \phi_2]\phi_3 + \phi_2[\phi_1, \phi_3]$
and is easy to prove. In the last step we then realized that under the assumption $x_1^0 > \ldots > x_n^0$
all commutators of two operators are equivalent to a contraction of the relevant fields.
The first term in the last line of (4.47) combines with the $\phi_1^-$ term of (4.46) to give
${:}\,\phi_1\ldots\phi_n\,{:}$, meaning that we have derived the first term on the right-hand side of Wick's
theorem as well as all terms involving only one contraction of $\phi_1$ with another field in (4.42).
It is not too difficult to understand that repeating the above exercise (4.47) with all the
remaining terms in (4.46) will then give all possible contractions of all the fields, including
those of $\phi_1$. Hence the induction step is complete and Wick's theorem is proved.

4.5 Second Look at Scattering Processes

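In order to see the real power of Wick's theorem let's put it to work and try to calculate
$NN \to NN$ scattering in the Yukawa theory (4.24). We first write down the expressions for
the initial and final states,
$$ \begin{aligned}
|i\rangle &= \sqrt{4E_{p_1}E_{p_2}}\;a_{+,p_1}^\dagger\,a_{+,p_2}^\dagger\,|0\rangle = |p_1, p_2\rangle , \\
|f\rangle &= \sqrt{4E_{q_1}E_{q_2}}\;a_{+,q_1}^\dagger\,a_{+,q_2}^\dagger\,|0\rangle = |q_1, q_2\rangle .
\end{aligned} \tag{4.48} $$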
We now look at the expansion of $\langle f|S|i\rangle$ in powers of the coupling constant g. In order to
isolate the interesting part of the S-matrix, i.e., the part due to interactions, we define the
T -matrix by
$$ S = 1 + iT , \tag{4.49} $$
where the 1 describes the situation where nothing happens. The leading contribution to iT
occurs at second order in the interaction (4.24). We find
$$ \frac{(-ig)^2}{2}\int d^4x\,d^4y\;T\big\{\psi^\dagger(x)\psi(x)\phi(x)\;\psi^\dagger(y)\psi(y)\phi(y)\big\} . \tag{4.50} $$
Applying Wick's theorem to the time-ordered product entering this expression, we get (besides
others) a term
$$ D_F(x-y)\;{:}\,\psi^\dagger(x)\psi(x)\,\psi^\dagger(y)\psi(y)\,{:} , \tag{4.51} $$
which features a contraction of the two $\phi$ fields. This term will contribute to the scattering,
because the operator ${:}\,\psi^\dagger(x)\psi(x)\,\psi^\dagger(y)\psi(y)\,{:}$ destroys the two nucleons in the initial state
and generates those appearing in the final state. In fact, (4.51) is the only contribution to
the process $NN \to NN$, since any other ordering of the field operators would lead to a
vanishing matrix element. The matrix element of the normal-ordered operator in (4.51) is
readily computed:
$$ \begin{aligned}
\langle q_1, q_2|\,{:}\,\psi^\dagger(x)\psi(x)\,\psi^\dagger(y)\psi(y)\,{:}\,|p_1, p_2\rangle
 &= \langle q_1, q_2|\,\psi^\dagger(x)\psi^\dagger(y)\,|0\rangle\,\langle 0|\,\psi(x)\psi(y)\,|p_1, p_2\rangle \\
 &= \Big(e^{i(q_1x+q_2y)} + e^{i(q_1y+q_2x)}\Big)\Big(e^{-i(p_1x+p_2y)} + e^{-i(p_1y+p_2x)}\Big) \\
 &= e^{i[(q_1-p_1)x+(q_2-p_2)y]} + e^{i[(q_2-p_1)x+(q_1-p_2)y]} + (x \leftrightarrow y) ,
\end{aligned} \tag{4.52} $$

where, in going to the second line, we have used the fact that for relativistically normalized
states, $\langle 0|\psi(x)|p\rangle = e^{-ipx}$. Putting things together, the matrix element (4.50) takes the form
$$ \frac{(-ig)^2}{2}\int d^4x\,d^4y\,\frac{d^4k}{(2\pi)^4}
   \Big[e^{i[(q_1-p_1)x+(q_2-p_2)y]} + e^{i[(q_2-p_1)x+(q_1-p_2)y]} + (x \leftrightarrow y)\Big]
   \frac{i\,e^{ik(x-y)}}{k^2-m^2+i\epsilon} , \tag{4.53} $$
where the term in square brackets arises from (4.52), while the final factor stems from the
expression for the propagator (3.129). The $(x \leftrightarrow y)$ terms double up with the others to
cancel the factor of 1/2 in the prefactor $(-ig)^2/2$, while the x and y integrals give delta
functions. One arrives at
$$ \begin{aligned}
(-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i\,(2\pi)^8}{k^2-m^2+i\epsilon}
 \Big[ & \delta^{(4)}(q_1-p_1+k)\,\delta^{(4)}(q_2-p_2-k) \\
 & + \delta^{(4)}(q_2-p_1+k)\,\delta^{(4)}(q_1-p_2-k)\Big] .
\end{aligned} \tag{4.54} $$
Finally, we perform the k integration using the delta functions. We obtain
$$ i\,(-ig)^2\left[\frac{1}{(p_1-q_1)^2-m^2+i\epsilon} + \frac{1}{(p_1-q_2)^2-m^2+i\epsilon}\right]
   (2\pi)^4\,\delta^{(4)}(p_1+p_2-q_1-q_2) , \tag{4.55} $$

where the delta function, like in (4.29), imposes momentum conservation. Let me note that,
in fact, we can drop the $i\epsilon$ in the propagators, since the denominators cannot become zero. In
order to see this, we go to the center-of-mass (CM) frame, where $\vec{p}_1 = -\vec{p}_2$ and, by momentum
conservation, $|\vec{p}_1| = |\vec{q}_1|$. This ensures that the 4-momentum of the meson is $k = (0, \vec{p}_1 - \vec{q}_1)$,
and in consequence $k^2 \leq 0$. We will see shortly another, much simpler way to reproduce the
result (4.55) using Feynman diagrams. This will also shed light on the physical interpretation.
Notice that the above calculation is also relevant for the scatterings $\bar{N}\bar{N} \to \bar{N}\bar{N}$ and
$N\bar{N} \to N\bar{N}$. Both reactions arise from the term (4.51) in Wick's theorem. However, we will
never find a term that contributes to $NN \to \bar{N}\bar{N}$ or $\bar{N}\bar{N} \to NN$, because these transitions
would violate the conservation of the charge Q introduced in (3.95).

4.6 Feynman Diagrams

As the above example demonstrates, to actually compute scattering amplitudes using Wick's
theorem is (still) rather tedious. There's a much better way, which starts by drawing pretty
pictures. These pictures represent the expansion of $\langle f|S|i\rangle$ and we will learn how to associate
mathematical expressions with those pictures. The pictures, you probably already guessed
it, are the famous Feynman diagrams. The Feynman-diagram approach turns out to be a
powerful tool to calculate QFT amplitudes (or as Schwinger puts it in [1]: 'Like the silicon
chips of more recent years, the Feynman diagram was bringing computation to the masses.').
We again start simple and consider the case of four fields, all at different space-time points,
which we have already worked out in (4.45). Let us represent each of the points x1 to x4 by a
point and the propagators $D_F(x_1-x_2)$ etc. by a line joining the relevant points. Then the

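right-hand side of (4.45) can be represented as a sum of three Feynman diagrams,
$$ \langle 0|\,T\{\phi_1\phi_2\phi_3\phi_4\}\,|0\rangle
   = \big[\text{lines } 1\text{-}2 \text{ and } 3\text{-}4\big]
   + \big[\text{lines } 1\text{-}3 \text{ and } 2\text{-}4\big]
   + \big[\text{lines } 1\text{-}4 \text{ and } 2\text{-}3\big] . \tag{4.56} $$
[Diagrams (4.56): in each graph the four points 1 to 4 are joined in pairs by two propagator lines.]
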

While this matrix element is not a measurable quantity, the pictures suggest a physical
interpretation. Two particles are generated at two points and then each propagates to one of
the other points, where they are both annihilated. This can happen in three possible ways
corresponding to the three shown graphs. The total amplitude for this process is the sum of
the three Feynman diagrams.
Things get more interesting, if one considers expressions like (4.56) that contain field
operators evaluated at the same space-time point. So let us have a look at the expansion of
the propagator (4.31) of the real scalar field,
$$ \langle 0|\,T\left\{\phi(x)\phi(y) + \phi(x)\phi(y)\left[-i\int dt\,H_I(t)\right] + \ldots\right\}|0\rangle , \tag{4.57} $$
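in the presence of the interaction term $H_I = \lambda/(4!)\,\phi^4$ of the $\phi^4$ theory (4.6). The first term
gives the free-field result, $\langle 0|T\phi(x)\phi(y)|0\rangle = D_F(x-y)$, while the second term takes the form
$$ \begin{aligned}
 & \langle 0|\,T\left\{\phi(x)\phi(y)\,(-i)\int dt\int d^3z\;\frac{\lambda}{4!}\,\phi^4(z)\right\}|0\rangle \\
 &\quad = \langle 0|\,T\left\{\phi(x)\phi(y)\left(\frac{-i\lambda}{4!}\right)\int d^4z\;\phi(z)\phi(z)\phi(z)\phi(z)\right\}|0\rangle .
\end{aligned} \tag{4.58} $$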
Now let's apply Wick's theorem (4.42) to (4.58). We get one term for each possible way to
contract the six different $\phi$s with each other in pairs. There are 15 such possibilities, but
fortunately only two of these possibilities are really different. If we contract $\phi(x)$ and $\phi(y)$,
there are 3 possible ways to contract the remaining $\phi(z)$s. The other possibility is to contract
$\phi(x)$ with $\phi(z)$ (four choices) and $\phi(y)$ with $\phi(z)$ (three choices), and $\phi(z)$ with $\phi(z)$ (one
choice). There are 12 possible ways to do this, all giving the same result. In consequence, we
have
$$ \begin{aligned}
 & \langle 0|\,T\left\{\phi(x)\phi(y)\,(-i)\int dt\int d^3z\;\frac{\lambda}{4!}\,\phi^4(z)\right\}|0\rangle \\
 &\quad = 3\left(\frac{-i\lambda}{4!}\right) D_F(x-y)\int d^4z\;D_F(z-z)\,D_F(z-z) \\
 &\qquad + 12\left(\frac{-i\lambda}{4!}\right)\int d^4z\;D_F(x-z)\,D_F(y-z)\,D_F(z-z) .
\end{aligned} \tag{4.59} $$
We can understand the latter expression better if we represent each term as a Feynman
graph. Again we draw each propagator as a line and each point as a dot. This time we have
however to distinguish between the external points x and y and the internal point z, which is
associated with a factor $(-i\lambda)\int d^4z$. Neglecting the overall factors, we see that the expression
(4.59) is equal to the sum of the following two diagrams
[Diagrams (4.60): (i) a line joining x and y, next to a disconnected figure-eight loop at z;
(ii) a line from x through z to y, with a closed loop attached at z.]

We refer to the lines in these diagrams as propagators, since they represent the propagation
amplitudes $D_F(x-y)$ etc. Internal points where four lines meet are called vertices. Since
$D_F(x-y)$ is the amplitude for a free Klein-Gordon particle to propagate between x and y,
the diagrams actually interpret the analytic formula as a process of creation, propagation, and
annihilation which takes place in space-time.
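Let's now move to a more complicated contraction that arises at order $\lambda^3$ in the $\phi^4$
interaction ($\phi_x = \phi(x)$ etc.):
$$ \begin{aligned}
 & \langle 0|\,\wick{\c1\phi_x\,\c2\phi_y\;\frac{1}{3!}\left(\frac{-i\lambda}{4!}\right)^{3}
   \int d^4z\;\c1\phi_z\c3\phi_z\c3\phi_z\c4\phi_z
   \int d^4w\;\c4\phi_w\c2\phi_w\c5\phi_w\c6\phi_w
   \int d^4u\;\c5\phi_u\c6\phi_u\c7\phi_u\c7\phi_u}\,|0\rangle \\
 &\quad = \frac{1}{3!}\left(\frac{-i\lambda}{4!}\right)^{3}\int d^4z\,d^4w\,d^4u\;
   D_F(x-z)\,D_F(z-z)\,D_F(z-w) \\
 &\qquad\qquad \times D_F(w-y)\,D_F^2(w-u)\,D_F(u-u) .
\end{aligned} \tag{4.61} $$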
The number of different contractions that give this result is large. One has
$$ 3! \times 4\cdot 3 \times 4\cdot 3\cdot 2 \times 4\cdot 3 \times 1/2 , \tag{4.62} $$
which means a total number of 10 368 possibilities. Here the factor 3! arises from the
interchange of the vertices z, w, and u, while the first $4\cdot 3$ factor describes the placement of the
contractions into the z vertex. The factor $4\cdot 3\cdot 2$ characterizes the placement of the
contractions into the w vertex whereas the second $4\cdot 3$ factor is associated with the placement of the
contractions into the u vertex. Finally, the factor of 1/2 is due to the interchange of the w-u
contractions. The product in (4.62) is roughly 1/13 of the total number of 135 135 contractions
of 14 different field operators. The particular contraction (4.61) can be represented by the
following cactus diagram:
[Diagram (4.63): cactus diagram with the external points x and y and the three vertices z, w, and u.]


It is conventional, for obvious reasons, to let this one diagram represent the sum of all 10 368
identical terms.
In practical applications one always draws the Feynman diagram first, using it as a
mnemonic device to write down the analytic expression. If this is done, one still has to
figure out the overall multiplicative factor. Of course, one can do this as we have done it
above by associating a factor $\int d^4z\,(-i\lambda/(4!))$ with each vertex, putting in the 1/n! factor
from the Taylor expansion, and then doing the combinatorics by writing out the product of fields
as in (4.61) and counting. Yet, typically the 1/n! factor from the Taylor series will cancel the
n! factor arising from interchanging the vertices, so that one can simply forget about these
factors. Furthermore, the generic vertex has four different lines coming from four different
places, so that the various contractions into the $\phi^4$ operator generate a factor of 4! (as in
the case of the w vertex in the above example). This factor of 4! cancels the denominator of
$-i\lambda/(4!)$. It is therefore conventional to associate the expression $\int d^4z\,(-i\lambda)$ with each vertex.
Applying this scheme to the Feynman graph in (4.63) gives a multiplicative factor that is
too large by a factor of $S = 8 = 2\cdot 2\cdot 2$, which is called the symmetry factor of the diagram.
Two factors of 2 come from lines that start and end on the same vertex, since the diagram is
symmetric under the interchange of the ends of such lines (z and u in our case). The other
factor of 2 comes from the two propagators connecting w and u, since the graph is symmetric
under the interchange of these two lines. A third type of symmetry (not arising in the case at
hand) is the equivalence of two vertices. In order to arrive at the correct overall factor, one
has to divide by the symmetry factor, which is in general the number of possibilities to change
parts of the diagrams without changing the result of the Feynman graph.
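
The counting arguments above can be checked by brute force. The sketch below is an illustration only: the function names are made up, and the cactus topology is assumed to be the one implied by the counting in (4.62), with x attached to z, y attached to w, one line between z and w, two lines between w and u, and one closed loop each at z and u. It enumerates all complete pairings of the field operators, groups them by which points they join, and then forms the symmetry factor as n! (4!)^n divided by the number of equivalent contractions.

```python
import math
from collections import Counter
from itertools import permutations


def pairings(items):
    """Yield all complete pairings (full contractions) of the list `items`."""
    if not items:
        yield ()
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield ((first, partner),) + tail


def key_of(edges):
    """Canonical key for a collection of propagators given as pairs of points."""
    return tuple(sorted(tuple(sorted(edge)) for edge in edges))


def topology_counts(points):
    """Count full contractions by which space-time points they connect.

    `points` lists the point of every field operator, e.g.
    ['x', 'y', 'z', 'z', 'z', 'z'] for the order-lambda term.
    """
    counts = Counter()
    for pairing in pairings(list(range(len(points)))):
        counts[key_of((points[a], points[b]) for a, b in pairing)] += 1
    return counts


def cactus_count(counts):
    """Number of contractions with the assumed cactus topology, summed over vertex labels."""
    total = 0
    for z, w, u in permutations("zwu"):
        edges = [("x", z), (z, z), (z, w), (w, "y"), (w, u), (w, u), (u, u)]
        total += counts[key_of(edges)]
    return total


if __name__ == "__main__":
    first_order = topology_counts(["x", "y"] + ["z"] * 4)
    print("order lambda:", sum(first_order.values()), "contractions,", dict(first_order))

    # This enumeration visits all contractions of 14 fields and takes a few seconds.
    third_order = topology_counts(["x", "y"] + ["z"] * 4 + ["w"] * 4 + ["u"] * 4)
    n_cactus = cactus_count(third_order)
    print("order lambda^3:", sum(third_order.values()), "contractions,", n_cactus, "of cactus type")
    print("symmetry factor:", math.factorial(3) * math.factorial(4) ** 3 // n_cactus)
```

At order λ the enumeration separates the 15 contractions into the groups of 3 and 12 that appear in (4.59). At order λ³ the same procedure counts the contractions of cactus type among all contractions of the 14 field operators.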
Most people never need to evaluate Feynman graphs with a symmetry factor larger than
2, so there is no need to worry too much about these technicalities. But for completeness let
me give some examples of nontrivial symmetry factors. Here they are (dropping the labels x
and y at the external points):
[Diagrams (4.64): four example graphs with symmetry factors $S = 2$, $S = 2\cdot 2\cdot 2 = 8$,
$S = 3! = 6$, and $S = 3!\cdot 2 = 12$.]

Clearly, if you are in doubt about the symmetry factor you can always determine it by counting
equivalent contractions, as we did above.
We are now ready to summarize our rules needed to find the analytic expression for each
piece of a given Feynman diagram in the $\phi^4$ theory:
1. For each propagator (a line joining x and y) one has $D_F(x-y)$.
2. For each vertex (an internal point z) one has $(-i\lambda)\int d^4z$.
3. For each external point x one has a factor 1.
4. Divide by the symmetry factor.

Since these rules are written in terms of space-time points x, y, z, etc., they are called
position-space Feynman rules. One way to interpret these rules is to think of the factor $(-i\lambda)$
as the amplitude for the emission and/or absorption of particles at a vertex. The integral
$\int d^4z$ tells us that we have to sum over all points where this process can occur. This
is nothing but the superposition principle of QM: when a process can happen in
different ways, we add the amplitudes for each possibility. Furthermore, in order to calculate
each individual amplitude the Feynman rules tell us to multiply the amplitudes (propagators
and vertices) for each independent part of the process.
The above Feynman rules are given in position space. Yet, in actual calculations it is (often)
more convenient to work in momentum space by introducing the Fourier transformation of
the propagator (3.129). To such a propagator one has to assign a 4-momentum p, indicating
in general the direction of the momentum with an arrow (since $D_F(x-y) = D_F(y-x)$ the
direction of p is arbitrary). The z-dependent factors of the vertices in a diagram are then
given by
$$ \big[\text{vertex with } p_1, p_2, p_3 \text{ flowing in and } p_4 \text{ flowing out}\big] \;\leftrightarrow\;
   \int d^4z\;e^{-i(p_1+p_2+p_3-p_4)z} = (2\pi)^4\,\delta^{(4)}(p_1+p_2+p_3-p_4) . \tag{4.65} $$

In other words, momentum is conserved at each vertex. The delta functions from the vertices
can now be used to perform some of the momentum integrals from the propagators. We are
left with the following momentum-space Feynman rules:
1. For each propagator (a line carrying momentum p) one has $\dfrac{i}{p^2-m^2+i\epsilon}$.
2. For each vertex one has $-i\lambda$.
3. For each external point x one has $e^{-ipx}$.
4. Impose momentum conservation at each vertex.
5. Integrate over each undetermined momentum, $\displaystyle\int\frac{d^4l}{(2\pi)^4}$.
6. Divide by the symmetry factor.


Again, we can interpret each factor as the amplitude for that part of the process, with the
integrations coming from the superposition principle. The exponential factor for an external
point is just the amplitude for a particle at that point to have the needed momentum, or,
depending on the direction of the arrow, for a particle with a certain momentum to be found
at the specific point.

4.7 Third Look at Scattering Processes

Let us now apply the things that we have learned to the case of $NN \to NN$ scattering. At
order $g^2$ we have to consider the two diagrams shown in Figure 4.1. Employing the relevant
momentum-space Feynman rules, it is readily seen that the analytic expression for the sum of
the displayed graphs agrees with the final result (4.55) of the calculation that we performed
earlier in Section 4.5. In fact, there is a nice physical interpretation of the graphs. We talk,
rather loosely, of the nucleons exchanging a meson which, in the first diagram, has momentum
$k = p_1 - q_1 = q_2 - p_2$. This meson does not satisfy the usual energy dispersion relation, because
$k^2 \neq m^2$, where m is the mass of the meson. The meson is called a virtual particle and is said
to be off-shell (or, sometimes, off mass-shell). Heuristically, it can't live long enough for its
energy to be measured to great accuracy. In contrast, the momenta on the external nucleon
legs satisfy $p_1^2 = p_2^2 = q_1^2 = q_2^2 = M^2$, which means that the nucleons, having mass M , are
on-shell. Similar considerations apply to the second diagram. It is important to notice that
the appearance of the two diagrams above ensures that the particles satisfy Bose statistics.
The diagrams describing the scattering of a nucleon and an antinucleon, $N\bar{N} \to N\bar{N}$, are
a little bit different from the ones for $NN \to NN$. At lowest order, the corresponding graphs
are shown in Figure 4.2. It is a simple matter to write down the amplitude using the relevant
Feynman rules,
$$ i\,(-ig)^2\left[\frac{1}{(p_1+p_2)^2-m^2+i\epsilon} + \frac{1}{(p_1-q_1)^2-m^2+i\epsilon}\right]
   (2\pi)^4\,\delta^{(4)}(p_1+p_2-q_1-q_2) . \tag{4.66} $$
Figure 4.1: Feynman diagrams contributing to $NN \to NN$ scattering at order $g^2$. [Two graphs in which
nucleons with momenta p1 and p2 exchange a meson and leave with momenta q1 and q2.]

Notice that in the CM frame, $\vec{p}_1 = -\vec{p}_2$, the denominator of the first term in the square
bracket is $4\,(M^2 + \vec{p}_1^{\,2}) - m^2$. If m < 2M , then this term never vanishes and we may drop the
$i\epsilon$. In contrast, if m > 2M , then the amplitude corresponding to the first diagram diverges at
some value of $\vec{p}_1$. In this case it turns out that we may also neglect the $i\epsilon$ term, although for a
different reason. The meson is unstable when m > 2M and thus has a finite width
$\Gamma$. When correctly treated, this instability adds a finite imaginary piece $i\Gamma$ to the denominator
which makes the application of the $i\epsilon$ prescription unnecessary. Nonetheless, the increase in
the scattering amplitude which we see in the first diagram when $4\,(M^2 + \vec{p}_1^{\,2}) = m^2$ is what
allows us to discover new particles. These appear as a resonance (a peak or bump) in the
cross section (roughly the amplitude squared).
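
The resonance shape can be illustrated with the s-channel term alone. The sketch below is not a cross-section calculation: it only evaluates the squared modulus of $1/(s - m^2 + i\,\Gamma)$, taking the imaginary piece in the form stated in the text, and the function name and numerical values are made up for illustration.

```python
def s_channel_sq(s, m, im_part):
    """|1 / (s - m^2 + i * im_part)|^2 for the s-channel propagator factor."""
    return 1.0 / ((s - m**2) ** 2 + im_part**2)


if __name__ == "__main__":
    m, im_part = 3.0, 0.5  # illustrative values in arbitrary units
    grid = [6.0 + 0.01 * k for k in range(601)]  # scan s between 6 and 12
    peak = max(grid, key=lambda s: s_channel_sq(s, m, im_part))
    print("peak position s =", round(peak, 2), "and m^2 =", m**2)
    print("peak height     =", s_channel_sq(peak, m, im_part))
```

The squared modulus is largest where s equals m², and the imaginary piece keeps the peak finite, so that a smaller imaginary piece gives a taller and narrower bump.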
We see that the amplitudes (4.55) and (4.66) (and in general all processes that include
the exchange of just a single particle) depend on the same combinations of momenta in the
denominators. There are standard names for various sums and differences of momenta that
are known as Mandelstam variables. They are
$$ \begin{aligned}
s &= (p_1+p_2)^2 = (q_1+q_2)^2 , \\
t &= (p_1-q_1)^2 = (p_2-q_2)^2 , \\
u &= (p_1-q_2)^2 = (p_2-q_1)^2 ,
\end{aligned} \tag{4.67} $$
where, as in the explicit examples above, p1 and p2 are the momenta of the two initial-state
particles, and q1 and q2 are the momenta of the two final-state particles. In order to get a
feel for what these variables mean, let us assume (for simplicity) that all four particles are the
same. In the CM frame, the initial two particles have the following 4-momenta
$$ p_1 = (E, 0, 0, p) , \qquad p_2 = (E, 0, 0, -p) . \tag{4.68} $$
The particles then scatter at some angle $\theta$ and leave with momenta
$$ q_1 = (E, 0, p\sin\theta, p\cos\theta) , \qquad q_2 = (E, 0, -p\sin\theta, -p\cos\theta) . \tag{4.69} $$
Then from the definitions (4.67), we have that
$$ s = 4E^2 , \qquad t = -2p^2\,(1-\cos\theta) , \qquad u = -2p^2\,(1+\cos\theta) . \tag{4.70} $$

We see that the variable s measures the total center of mass energy of the collision, while the
variables t and u are measures of the energy exchanged between particles (they are basically
equivalent, just with the outgoing particles swapped around). Now the amplitudes that involve
exchange of a single particle can be written simply in terms of the Mandelstam variables. E.g.,
for nucleon-nucleon scattering, the amplitude (4.55) is proportional to$^{26}$
$$\mathcal{A}(NN \to NN) \propto \frac{1}{t - m^2} + \frac{1}{u - m^2}, \tag{4.71}$$
while in the case of nucleon-antinucleon scattering one finds
$$\mathcal{A}(N\bar{N} \to N\bar{N}) \propto \frac{1}{s - m^2} + \frac{1}{t - m^2}. \tag{4.72}$$

$^{26}$ Here and in the following we simply drop all $i\epsilon$ terms.

Figure 4.2: Feynman diagrams contributing to $N\bar{N} \to N\bar{N}$ scattering at order $g^2$.

We say that the first case involves $t$- and $u$-channel diagrams. On the other hand, the nucleon-antinucleon scattering is said to involve $s$- and $t$-channel exchange.
Note finally that there is a relationship between the Mandelstam variables. In the cases of $NN \to NN$ and $N\bar{N} \to N\bar{N}$ scattering, which involve external particles with the same mass, one has
$$s + t + u = 4M^2. \tag{4.73}$$
When the masses of the external particles are different this becomes $s + t + u = \sum_{i=1}^{4} m_i^2$, where $m_i$ denotes the individual masses of the initial- and final-state particles.
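The kinematics of (4.68) to (4.70) and the sum rule (4.73) can be checked numerically. The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$ and equal masses $M$ for all four particles; the function names are illustrative and not part of the lecture.

```python
import math

def minkowski_sq(p):
    """Minkowski square p^2 = E^2 - |p|^2, assuming signature (+,-,-,-)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def mandelstam(M, p, theta):
    """Return (s, t, u) in the CM frame for equal-mass elastic 2 -> 2 scattering."""
    E = math.sqrt(M ** 2 + p ** 2)
    p1 = (E, 0.0, 0.0, p)
    p2 = (E, 0.0, 0.0, -p)
    q1 = (E, 0.0, p * math.sin(theta), p * math.cos(theta))
    q2 = (E, 0.0, -p * math.sin(theta), -p * math.cos(theta))
    s = minkowski_sq(add(p1, p2))
    t = minkowski_sq(sub(p1, q1))
    u = minkowski_sq(sub(p1, q2))
    return s, t, u

# Example values (arbitrary units): M = 1, |p| = 0.5, theta = 0.7 rad.
s, t, u = mandelstam(1.0, 0.5, 0.7)
print(s, t, u, s + t + u)
```

For any scattering angle the sum $s + t + u$ stays at $4M^2$, while $t$ and $u$ individually follow the closed forms in (4.70).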
Let us now consider the case of meson-meson scattering, $MM \to MM$. The simplest diagram we can draw that describes this process is shown in Figure 4.3. It has a single loop, and momentum conservation at each vertex is no longer sufficient to determine every momentum passing through the diagram. Assigning the single undetermined momentum $l$ to the right-hand propagator, all other momenta are fixed by the kinematics (the actual momenta assignments are not displayed in the figure). The amplitude corresponding to the displayed diagram is
$$i^4 (-ig)^4 \int \frac{d^4 l}{(2\pi)^4} \, \frac{1}{l^2 - M^2} \, \frac{1}{(l + q_1)^2 - M^2} \, \frac{1}{(l - p_1 + q_1)^2 - M^2} \, \frac{1}{(l - q_2)^2 - M^2} \, (2\pi)^4 \delta^{(4)}(p_1 + p_2 - q_1 - q_2). \tag{4.74}$$
While an explicit calculation of this loop integral is beyond the scope of this lecture, notice that for large $l$ this integral goes as $\int d^4 l / l^8$, which means that it is UV finite (the integral is also IR finite since all propagators are massive). In general, however, loop integrals can have both UV ($l^2 \to \infty$) and IR ($l^2 \to 0$) singularities.
The delta function follows from the conservation of 4-momentum which, in turn, follows from space-time translational invariance. It is common to all $S$-matrix elements. We will define the amplitude $\mathcal{A}(f \leftarrow i)$ by stripping off this momentum-conserving delta function,
$$\langle f | S - 1 | i \rangle = i \langle f | T | i \rangle = i \, (2\pi)^4 \delta^{(4)}(p_f - p_i) \, \mathcal{A}(f \leftarrow i), \tag{4.75}$$
where $p_f$ ($p_i$) is the sum of the final (initial) 4-momenta, and the factor of $i$ out front is a convention which is there to match non-relativistic QM.

Figure 4.3: Lowest order contribution to $MM \to MM$ scattering. The momentum assignments are not explicitly shown.

4.8 Yukawa Potential

So far we have calculated the quantum amplitudes for various scattering processes. But these quantities are a little bit abstract. In order to make contact with experiment, let me show in the following how to translate the amplitude (4.55) for nucleon-nucleon scattering into something familiar from Newtonian mechanics, namely a potential, or force, between the particles.
We start by asking a simple question in classical field theory that will turn out to be relevant in order to calculate the quantum process. Suppose that we have a fixed delta-function source for our real scalar field $\phi$ that persists for all times. What is the profile of $\phi(\mathbf{x})$? In order to answer this question, we have to solve the static Klein-Gordon equation,
$$\left(-\nabla^2 + m^2\right) \phi(\mathbf{x}) = \delta^{(3)}(\mathbf{x}). \tag{4.76}$$
We can solve this equation by going to momentum space, $\phi(\mathbf{x}) = \int d^3p/(2\pi)^3 \, e^{i\mathbf{p}\cdot\mathbf{x}} \, \tilde{\phi}(\mathbf{p})$. After this Fourier transformation the relation (4.76) takes the form $(\mathbf{p}^2 + m^2) \, \tilde{\phi}(\mathbf{p}) = 1$, which means that we can write the field as
$$\phi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3} \, \frac{e^{i\mathbf{p}\cdot\mathbf{x}}}{\mathbf{p}^2 + m^2}. \tag{4.77}$$
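Let us compute this integral explicitly. Changing to polar coordinates, and writing $\mathbf{p}\cdot\mathbf{x} = pr\cos\theta$, we get
$$\begin{aligned}
\phi(\mathbf{x}) &= \frac{1}{(2\pi)^2} \int_0^\infty dp \, \frac{p^2}{p^2 + m^2} \, \frac{2\sin(pr)}{pr} \\
&= \frac{1}{(2\pi)^2 r} \int_{-\infty}^{\infty} dp \, \frac{p \sin(pr)}{p^2 + m^2} \\
&= \frac{1}{2\pi r} \, \mathrm{Re} \int_{-\infty}^{\infty} \frac{dp}{2\pi i} \, \frac{p \, e^{ipr}}{p^2 + m^2}.
\end{aligned} \tag{4.78}$$
We evaluate the last integral by closing the contour in the upper half plane, $p \to +i\infty$, picking up the pole at $p = im$. This gives
$$\phi(\mathbf{x}) = \frac{1}{4\pi r} \, e^{-mr}. \tag{4.79}$$
We see that the field dies off exponentially quickly at distances $r \gg 1/m$, where $1/m$ is the Compton wavelength of the meson.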
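As a cross-check of (4.79), one can verify numerically that the profile $e^{-mr}/(4\pi r)$ solves the static Klein-Gordon equation (4.76) away from the origin, and that it carries unit source strength at the origin. The sketch below is an illustration under stated assumptions: it uses the radial form $\nabla^2\phi = (1/r)\, d^2(r\phi)/dr^2$ with central finite differences, and the function names and step sizes are choices made here, not part of the lecture.

```python
import math

def phi(r, m):
    """Static profile (4.79) of the field sourced by a unit delta function."""
    return math.exp(-m * r) / (4.0 * math.pi * r)

def kg_residual(r, m, h=1e-3):
    """(-laplacian + m^2) phi at r > 0, using the radial Laplacian
    (1/r) d^2(r phi)/dr^2 approximated by central differences."""
    u = lambda x: x * phi(x, m)
    laplacian = (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h * r)
    return -laplacian + m * m * phi(r, m)

def outward_flux(r, m, h=1e-6):
    """Flux of -grad(phi) through a sphere of radius r. Integrating (4.76)
    over a small ball shows that it tends to 1 as r -> 0."""
    dphi_dr = (phi(r + h, m) - phi(r - h, m)) / (2.0 * h)
    return -4.0 * math.pi * r * r * dphi_dr

print(kg_residual(1.0, 1.0))    # close to zero away from the source
print(outward_flux(1e-3, 1.0))  # close to one near the source
```

The first check confirms the exponential fall-off, the second confirms the overall normalization $1/(4\pi)$.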
It is now interesting to ask how the profile of the field $\phi$ (the meson) and the force between the particles (the nucleons) are related. Realize that in electrostatics, where a charged particle acts as a delta-function source for the gauge potential $A^0$ with $A^\mu = (\phi, \mathbf{A})$, we face a similar problem. In this case one has $-\nabla^2 A^0 = \delta^{(3)}(\mathbf{x})$, which is solved by $A^0 = 1/(4\pi r)$. The profile of $A^0$ then acts as the potential energy for another charged (test) particle moving in this background. Is such an interpretation also possible in the case of $\phi$? Or phrased slightly differently, is there a classical limit of the scalar Yukawa theory where the nucleons act as delta-function sources for the meson field, creating the profile (4.79)? And, if so, is this profile then felt as a static potential? The answer is essentially yes, at least in the limit $M \gg m$. But the correct way to describe the potential felt by the nucleons is not to talk about classical fields at all, but instead work directly with the quantum amplitudes.
Let us see explicitly how this goes. We first compare the result of the first diagram in Figure 4.1 to the corresponding amplitude in non-relativistic QM which describes the interaction of two particles through a potential. In order for the comparison to be meaningful, we have to take the non-relativistic limit of (4.55). We work in the CM frame with $\mathbf{p} = \mathbf{p}_1 = -\mathbf{p}_2$ and $\mathbf{q} = \mathbf{q}_1 = -\mathbf{q}_2$, with $|\mathbf{p}| = |\mathbf{q}|$ for elastic scattering. In the non-relativistic limit one has $|\mathbf{p}| \ll M$, which by momentum conservation implies $|\mathbf{q}| \ll M$. It is easy to check that in this limit the first term in (4.55) turns into
$$\frac{i g^2}{(\mathbf{p} - \mathbf{q})^2 + m^2}. \tag{4.80}$$

We should now compare this result to the scattering amplitude in QM. In order to do this, we consider two particles separated by a distance $\mathbf{x}$, interacting through a potential $V(\mathbf{x})$. The amplitude for the particles to scatter from $\mathbf{p}$ into $\mathbf{q}$ can be computed in perturbation theory, using techniques familiar from non-relativistic QM. In Born approximation, i.e., to leading order in the perturbative expansion, the sought amplitude is given by
$$\langle \mathbf{q} | V(\mathbf{x}) | \mathbf{p} \rangle = -i \int d^3x \, V(\mathbf{x}) \, e^{-i(\mathbf{p} - \mathbf{q})\cdot\mathbf{x}}. \tag{4.81}$$
Taking into account that there is a relative factor of $(2M)^2$ that arises in comparing the QFT amplitude to $\langle \mathbf{q} | V(\mathbf{x}) | \mathbf{p} \rangle$, which can be traced to the relativistic normalization of the states $|p_1, p_2\rangle$,$^{27}$ we find after equating (4.80) and (4.81) the following relation
$$\int d^3x \, V(\mathbf{x}) \, e^{-i(\mathbf{p} - \mathbf{q})\cdot\mathbf{x}} = \frac{-\lambda^2}{(\mathbf{p} - \mathbf{q})^2 + m^2}. \tag{4.82}$$

$^{27}$ Notice that this factor is also necessary to get the dimensions of the potential to work out correctly.

Here we have introduced the dimensionless parameter $\lambda = g/(2M)$. The latter equation is trivially inverted, giving
$$V(\mathbf{x}) = -\lambda^2 \int \frac{d^3p}{(2\pi)^3} \, \frac{e^{i\mathbf{p}\cdot\mathbf{x}}}{\mathbf{p}^2 + m^2} = -\frac{\lambda^2}{4\pi r} \, e^{-mr}, \tag{4.83}$$
where in the last step we have used the results (4.77) through (4.79). The potential $V(\mathbf{x})$ is the famous Yukawa potential. The force has a range $1/m$, and the minus sign in (4.83) tells us that the potential is attractive. Hideki Yukawa made this potential the basis for his theory of the nuclear force and worked backwards from the range of the force (of about 1 fm) to predict the mass (of about 200 MeV) of the required boson, the pion [2]. It is important to realize that QFT has given us an entirely new perspective on the nature of forces between particles. Rather than being a fundamental concept, the force arises from the virtual exchange of other particles, in this case the meson.
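Yukawa's estimate amounts to restoring units in the relation range $= 1/m$, i.e., $m c^2 = \hbar c / (\text{range})$. A minimal sketch of this arithmetic follows, assuming the standard conversion constant $\hbar c \approx 197.327$ MeV fm; the function names and the sample coupling are illustrative choices.

```python
import math

HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm (standard conversion constant)

def mediator_mass_mev(range_fm):
    """Mass (in MeV) of the exchanged boson for a force of the given range."""
    return HBAR_C_MEV_FM / range_fm

def yukawa_potential(r_fm, lam, mass_mev):
    """Yukawa potential (4.83) in natural units, returned in units of 1/fm.
    The coupling lam is a free, dimensionless input chosen for illustration."""
    m = mass_mev / HBAR_C_MEV_FM  # mass converted to inverse length (1/fm)
    return -lam ** 2 * math.exp(-m * r_fm) / (4.0 * math.pi * r_fm)

print(mediator_mass_mev(1.0))                            # about 200 MeV for a 1 fm range
print(yukawa_potential(1.0, 1.0, mediator_mass_mev(1.0)))  # negative, i.e. attractive
```

A range of 1 fm thus corresponds to a mass of roughly 200 MeV, the value quoted above.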

4.9 Connected and Amputated Feynman Diagrams

We have explained in some detail how to compute scattering amplitudes by drawing all Feynman diagrams and by writing down the corresponding analytic expression for them using
Feynman rules. In fact, there are a couple of caveats about what Feynman diagrams one
should draw and calculate. Both of these caveats are related to the assumption made so far
that the initial and final states are eigenstates of the free theory which, as we have mentioned
before, is not correct.
The two caveats are as follows. First, we consider only connected Feynman diagrams, where every part of the diagram is connected to at least one external line. We shall see shortly that this is related to the fact that the vacuum $|0\rangle$ of the free theory is not the true vacuum $|\Omega\rangle$ of the interacting theory. An example of a disconnected diagram (or piece) is shown on the left-hand side in Figure 4.4. Second, we do not consider diagrams with loops on external lines, so-called unamputated graphs. An example of such a diagram is depicted on the right-hand side of the latter figure. These diagrams are related to the fact that the one-particle states of the free theory are not the same as the one-particle states of the interacting theory. In particular, correctly dealing with these diagrams will account for the fact that particles in interacting QFTs are always surrounded by a swarm of virtual particles. We will refer to diagrams in which all loops on external legs have been removed as amputated graphs.
Figure 4.4: Example of a disconnected (left-hand side) and an unamputated (right-hand side) Feynman diagram in $\phi^4$ theory.

Vacuum of the Interacting Theory
We start out by discussing the properties of the vacuum $|\Omega\rangle$ of the interacting theory. We will normalize the state $|\Omega\rangle$ as $\langle\Omega|\Omega\rangle = 1$ and denote its energy by $E_0$, i.e., $H|\Omega\rangle = E_0|\Omega\rangle$. Since $|\Omega\rangle$ is the ground state of $H$, we can isolate it by the following procedure. Imagine starting with the vacuum $|0\rangle$ of the free theory (i.e., $H_0|0\rangle = 0$) and evolving it with $H$,
$$e^{-iHt}|0\rangle = \sum_n e^{-iE_n t} \, |n\rangle\langle n|0\rangle, \tag{4.84}$$
where $E_n$ ($|n\rangle$) are the eigenvalues (eigenstates) of $H$. We must assume that $|\Omega\rangle$ and $|0\rangle$ have some overlap, i.e., $\langle\Omega|0\rangle \neq 0$. If this were not the case, the interaction term $H_I$ would not be a small perturbation compared to $H_0$. Under this assumption, we can rewrite (4.84) as follows
$$e^{-iHt}|0\rangle = e^{-iE_0 t} \, |\Omega\rangle\langle\Omega|0\rangle + \sum_{n \neq 0} e^{-iE_n t} \, |n\rangle\langle n|0\rangle, \tag{4.85}$$
where $E_0 = \langle\Omega|H|\Omega\rangle$. Since $E_n > E_0$ for all $n \neq 0$, we can get rid of the second term in (4.85) by sending $t$ to infinity in a slightly imaginary direction, $t \to \infty(1 - i\epsilon)$.$^{28}$ It follows that
$$|\Omega\rangle = \lim_{t \to \infty(1 - i\epsilon)} \left( e^{-iE_0 t} \langle\Omega|0\rangle \right)^{-1} e^{-iHt}|0\rangle. \tag{4.86}$$

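Since $t$ is very large we can shift it by a small amount, let's say $t_0$, so that
$$\begin{aligned}
|\Omega\rangle &= \lim_{t \to \infty(1 - i\epsilon)} \left( e^{-iE_0(t + t_0)} \langle\Omega|0\rangle \right)^{-1} e^{-iH(t + t_0)}|0\rangle \\
&= \lim_{t \to \infty(1 - i\epsilon)} \left( e^{-iE_0(t_0 - (-t))} \langle\Omega|0\rangle \right)^{-1} e^{-iH(t_0 - (-t))} \, e^{-iH_0(-t - t_0)}|0\rangle \\
&= \lim_{t \to \infty(1 - i\epsilon)} \left( e^{-iE_0(t_0 - (-t))} \langle\Omega|0\rangle \right)^{-1} U(t_0, -t)|0\rangle.
\end{aligned} \tag{4.87}$$
Here we have used in the second line that $H_0|0\rangle = 0$ and employed in the third line the relation $U(t, t') = \exp[iH_0(t - t_0)] \exp[-iH(t - t')] \exp[-iH_0(t' - t_0)]$, which follows from (4.14). We see that (ignoring the prefactor) we can get the ket $|\Omega\rangle$ from $|0\rangle$ by simply evolving from $-t$ to $t_0$ with the time-evolution operator $U$. Similarly, we find for the bra $\langle\Omega|$ the expression
$$\langle\Omega| = \lim_{t \to \infty(1 - i\epsilon)} \langle 0| U(t, t_0) \left( e^{-iE_0(t - t_0)} \langle 0|\Omega\rangle \right)^{-1}. \tag{4.88}$$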
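The projection onto the ground state in (4.86) can be illustrated with a finite-dimensional toy model. The sketch below is an illustration under stated assumptions, not the field-theory calculation: it uses a hypothetical three-level Hamiltonian, takes the time to be purely imaginary ($t = -i\tau$) instead of slightly imaginary, and approximates $e^{-H\tau}$ by repeated application of $(1 - \Delta\tau\, H)$ with renormalization after each step, which removes the prefactor in (4.86).

```python
import math

# Hypothetical toy Hamiltonian: H0 = diag(0, 1, 2) plus a small off-diagonal
# "interaction". The free vacuum |0> is the first basis vector, with H0|0> = 0.
H = [[0.0, 0.3, 0.0],
     [0.3, 1.0, 0.3],
     [0.0, 0.3, 2.0]]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def project_ground_state(H, start, dtau=0.1, steps=500):
    """Apply (1 - dtau * H) repeatedly to the start vector, normalizing each
    time. Components along excited states are damped relative to the ground
    state, provided the start vector overlaps with it."""
    v = normalize(start)
    for _ in range(steps):
        Hv = matvec(H, v)
        v = normalize([x - dtau * hx for x, hx in zip(v, Hv)])
    Hv = matvec(H, v)
    energy = sum(x * hx for x, hx in zip(v, Hv))
    residual = math.sqrt(sum((hx - energy * x) ** 2 for x, hx in zip(v, Hv)))
    return v, energy, residual

omega, e0, residual = project_ground_state(H, [1.0, 0.0, 0.0])
print(omega, e0, residual)
```

The resulting vector is an eigenvector of the full toy Hamiltonian with an energy slightly below zero, and it retains a large overlap with the free vacuum, mirroring the assumption $\langle\Omega|0\rangle \neq 0$.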

$^{28}$ Since the BCs of the Feynman propagator $D_F(x - y)$ are such that the integration contour is slightly rotated away from the Re $p^0$-axis, the contribution of the imaginary piece of $t$ does not alter the final result.

Correlation Functions
There are many questions we want to ask in QFT that are not directly related to scattering experiments. E.g., we might want to compute the viscosity of the quark-gluon plasma, or understand the response of a condensed matter system to an experimental probe, or figure out the non-Gaussianity of density perturbations arising in the cosmic microwave background from novel models of inflation. All of these questions are answered in the framework of QFT by computing elementary objects known as correlation functions. In the following we will define correlation functions, explain how to compute them using Feynman diagrams, and then relate them back to scattering amplitudes.
In order to keep the following discussion as simple as possible, we will work in the real Klein-Gordon theory. We start by defining the $n$-point correlation (or Green's) function
$$G^{(n)}(x_1, \ldots, x_n) = \langle\Omega| T\left( \phi_H(x_1) \ldots \phi_H(x_n) \right) |\Omega\rangle, \tag{4.89}$$
where $\phi_H$ denotes the field in the Heisenberg picture of the full theory, rather than the interaction picture that we have been dealing with so far. The first question that one can ask is how to compute $G^{(n)}$ in terms of matrix elements evaluated on $|0\rangle$, the vacuum of the free theory. Let me first state the result and then prove it. The result reads
$$G^{(n)}(x_1, \ldots, x_n) = \lim_{t \to \infty(1 - i\epsilon)} \frac{\langle 0| T\left( \phi_I(x_1) \ldots \phi_I(x_n) \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right) |0\rangle}{\langle 0| T\left( \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right) |0\rangle}. \tag{4.90}$$

Notice that both the numerator and denominator appearing on the right-hand side of the latter equation can be calculated using the methods developed for $S$-matrix elements, namely Feynman diagrams (or alternatively Dyson's formula and Wick's theorem) after expanding the exponentials into a Taylor series.
After stating the result (4.90), we still have to prove it. Without loss of generality we assume that $x_1^0 > \ldots > x_n^0 > t_0$. If this is not the case we simply relabel the points in an appropriate way. Such a relabeling leaves both sides of (4.90) unchanged. We then have
$$\begin{aligned}
G^{(n)}(x_1, \ldots, x_n) &= \langle\Omega| \phi_H(x_1) \ldots \phi_H(x_n) |\Omega\rangle \\
&= \lim_{t \to \infty(1 - i\epsilon)} \left( e^{-iE_0(t - t_0)} \langle 0|\Omega\rangle \right)^{-1} \langle 0| U(t, t_0) \\
&\qquad \times U^\dagger(x_1^0, t_0) \, \phi_I(x_1) \, U(x_1^0, t_0) \, U^\dagger(x_2^0, t_0) \, \phi_I(x_2) \, U(x_2^0, t_0) \ldots \\
&\qquad \times U^\dagger(x_n^0, t_0) \, \phi_I(x_n) \, U(x_n^0, t_0) \, U(t_0, -t) |0\rangle \left( e^{-iE_0(t_0 - (-t))} \langle\Omega|0\rangle \right)^{-1} \\
&= \lim_{t \to \infty(1 - i\epsilon)} \left( e^{-iE_0(2t)} |\langle 0|\Omega\rangle|^2 \right)^{-1} \\
&\qquad \times \langle 0| U(t, x_1^0) \, \phi_I(x_1) \, U(x_1^0, x_2^0) \ldots U(x_{n-1}^0, x_n^0) \, \phi_I(x_n) \, U(x_n^0, -t) |0\rangle \\
&= \lim_{t \to \infty(1 - i\epsilon)} \frac{\langle 0| U(t, x_1^0) \, \phi_I(x_1) \, U(x_1^0, x_2^0) \ldots U(x_{n-1}^0, x_n^0) \, \phi_I(x_n) \, U(x_n^0, -t) |0\rangle}{\langle 0| U(t, -t) |0\rangle}.
\end{aligned} \tag{4.91}$$
Here we have first used (4.87) and (4.88) and rewritten all Heisenberg fields $\phi_H$ in terms of interaction-picture fields,
$$\phi_H(x) = U^\dagger(x^0, t_0) \, \phi_I(x) \, U(x^0, t_0). \tag{4.92}$$
Remember that $U$ satisfies $U(t_1, t_2) \, U(t_2, t_3) = U(t_1, t_3)$ and $U(t_1, t_3) \, U^\dagger(t_2, t_3) = U(t_1, t_2)$. In order to arrive at the last line, we have finally employed
$$1 = \langle\Omega|\Omega\rangle = \left( e^{-iE_0(2t)} |\langle 0|\Omega\rangle|^2 \right)^{-1} \langle 0| U(t, -t) |0\rangle. \tag{4.93}$$
The proof of (4.90) is complete after noticing that all fields in (4.91) are in time order and that the product of $U$ operators in the numerator reduces to $U(t, -t) = T \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right]$. Hence the last line in (4.91) is nothing but the right-hand side of (4.90).
Exponentiation of Bubble Diagrams
By means of (4.90) we can now (in principle) calculate any $n$-point correlation function. But what is the physical interpretation of this equation? We first express the denominator of (4.90) in terms of Feynman diagrams,
$$\lim_{t \to \infty(1 - i\epsilon)} \langle 0| U(t, -t) |0\rangle = 1 + \text{[vacuum-bubble diagrams]} + \ldots. \tag{4.94}$$
The disconnected Feynman diagrams appearing on the right-hand side of this relation are called vacuum bubbles. What is the value of the first non-trivial graph? Restoring the position label and the integration momenta,
$$\text{[vacuum-bubble diagram with loop momenta } l_1 \text{ and } l_2 \text{]} \tag{4.95}$$
it is readily seen that momentum conservation requires $l_1 = l_2$, so that the diagram evaluates to $\propto (2\pi)^4 \delta^{(4)}(0)$. This factor is also easily derived in position space, where one has
$$\int d^4z \, (\text{const.}) \propto 2t \cdot V. \tag{4.96}$$
This result just tells us that the space-time process (4.95) can happen at any place in space, and at any time between $-t$ and $t$. Every disconnected diagram will have one such $(2\pi)^4 \delta^{(4)}(0) = 2t \cdot V$ factor, where $V$ denotes the volume of space.
In fact, the contributions to $G^{(n)}$ from disconnected diagrams can be shown to exponentiate. To prove the linked-cluster theorem, we first label the various possible disconnected pieces:
$$V_i \in \left\{ \text{[vacuum-bubble diagrams]}, \ldots \right\}. \tag{4.97}$$
Now we assume that a given Feynman diagram has $n_i$ pieces of the form $V_i$ for each $i$, in addition to its one piece that is connected. If we also denote the value of $V_i$ by $v_i$, the value of a single Feynman graph is
$$(\text{value of connected piece}) \cdot \prod_i \frac{(v_i)^{n_i}}{(n_i)!}, \tag{4.98}$$
where $1/((n_i)!)$ is the symmetry factor associated with interchanging the $n_i$ copies of the piece $V_i$. The value of the sum of all diagrams is then given by
$$\sum_{\text{all connected diagrams}} \ \sum_{\text{all } \{n_i\}} (\text{value of connected piece}) \cdot \left( \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \right), \tag{4.99}$$
where "all $\{n_i\}$" means all ordered sets $\{n_1, n_2, \ldots\}$ of non-negative integers. The sum of the connected diagrams factors out of this expression, giving
$$\left( \sum_{\text{all connected diagrams}} (\text{value of connected piece}) \right) \cdot \left( \sum_{\text{all } \{n_i\}} \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \right). \tag{4.100}$$
In fact, not only the connected pieces factorize, but also the disconnected ones. One has
$$\sum_{\text{all } \{n_i\}} \prod_i \frac{(v_i)^{n_i}}{(n_i)!} = \prod_i \left( \sum_{n_i} \frac{(v_i)^{n_i}}{(n_i)!} \right) = \prod_i \exp(v_i) = \exp\left( \sum_i v_i \right). \tag{4.101}$$
We see that the combinatoric factors (as well as the symmetry factors) associated with each diagram are such that the whole series of disconnected pieces sums to an exponential. Taken together, (4.99) through (4.101) imply that the sum of all diagrams is equal to the sum of all connected diagrams multiplied by the exponential of the sum of all disconnected graphs.
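The combinatorial identity (4.101) is easy to test numerically. The following sketch assumes a toy setup with three disconnected pieces whose values $v_i$ are small real numbers (in the actual theory the $v_i$ are complex and divergent), and truncates each multiplicity $n_i$ at a finite maximum; the function name and the sample values are illustrative.

```python
import math
from itertools import product

def sum_over_multiplicities(values, n_max):
    """Left-hand side of (4.101): sum over all sets {n_i}, with each n_i
    truncated at n_max, of the product of v_i**n_i / n_i!."""
    total = 0.0
    for ns in product(range(n_max + 1), repeat=len(values)):
        term = 1.0
        for v, n in zip(values, ns):
            term *= v ** n / math.factorial(n)
        total += term
    return total

values = [0.3, -0.2, 0.05]   # hypothetical values v_i of three bubble pieces
lhs = sum_over_multiplicities(values, n_max=20)
rhs = math.exp(sum(values))  # right-hand side of (4.101)
print(lhs, rhs)
```

The factor $1/(n_i)!$ is exactly what turns the sum over copies of each piece into an exponential series, so the two numbers agree up to the truncation error.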
Applying our findings concerning the exponentiation of bubble diagrams to (4.94), we arrive at the following pictorial identity
$$\lim_{t \to \infty(1 - i\epsilon)} \langle 0| T \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] |0\rangle = \exp\Big( \text{[sum of vacuum-bubble diagrams]} + \ldots \Big). \tag{4.102}$$
The exponentiation of disconnected diagrams is also relevant in the case of the numerator of the right-hand side of (4.90). Let us consider the two-point correlation function $G^{(2)}$ for simplicity. In this case the numerator takes the form
$$\lim_{t \to \infty(1 - i\epsilon)} \langle 0| T\left( \phi_I(x) \, \phi_I(y) \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right) |0\rangle = \Big( \text{[connected diagrams with external points } x, y \text{]} + \ldots \Big) \exp\Big( \text{[sum of vacuum-bubble diagrams]} + \ldots \Big). \tag{4.103}$$
Combining now (4.102) and (4.103), it follows that the exponentials involving the sum of disconnected diagrams cancel between the numerator and denominator in the formula for the correlation functions. In the case of the two-point function, the final form of (4.90) is thus
$$G^{(2)}(x, y) = \text{[connected diagrams with external points } x, y \text{]} + \ldots. \tag{4.104}$$
The generalization to higher correlation functions is straightforward and reads
$$G^{(n)}(x_1, \ldots, x_n) = \langle\Omega| T\left( \phi_H(x_1) \ldots \phi_H(x_n) \right) |\Omega\rangle = \left( \begin{array}{c} \text{sum of all connected graphs} \\ \text{with } n \text{ external points} \end{array} \right). \tag{4.105}$$
The disconnected diagrams exponentiate, factor, and cancel as before. It is important to remember that by "disconnected" we mean "disconnected from all external points". In higher correlation functions, diagrams can also be disconnected in another sense. Consider, e.g., the four-point function
$$G^{(4)}(x_1, x_2, x_3, x_4) = \text{[four-point diagrams]} + \ldots. \tag{4.106}$$
In many of the displayed diagrams, external points are disconnected from each other. Such diagrams neither exponentiate nor factor; they contribute to the amplitude just as do the fully connected diagrams in which any point can be reached from any other by traveling along the lines.
Energy Density of Vacuum
An immediate consequence of the linked-cluster theorem is that all vacuum bubbles cancel when calculating correlation functions. Does this mean that the disconnected diagrams have no physical meaning at all? The place to look for the answer to this question is (4.91) and (4.93). Taken together these two equations imply that
$$\lim_{t \to \infty(1 - i\epsilon)} \langle 0| T\left( \phi_I(x_1) \ldots \phi_I(x_n) \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right) |0\rangle = \langle\Omega| T\left( \phi_H(x_1) \ldots \phi_H(x_n) \right) |\Omega\rangle \lim_{t \to \infty(1 - i\epsilon)} \left( e^{-iE_0(2t)} |\langle 0|\Omega\rangle|^2 \right). \tag{4.107}$$
Looking only at the $t$-dependent parts on both sides, it follows that
$$\exp\Big[ \sum_i v_i \Big] \propto \exp\Big[ -iE_0(2t) \Big]. \tag{4.108}$$
The sum of all vacuum bubbles is therefore related to the difference in the ground-state zero-point energies of the interacting and the free theory, the latter of which was defined to be zero. Because each bubble graph $V_i$ contains a single factor of $(2\pi)^4 \delta^{(4)}(0) = 2t \cdot V$, one explicitly finds that the energy density of the ground state of the (interacting) $\phi^4$ theory reads
$$\mathcal{E}_0 = \frac{E_0}{V} = i \, \Big[ \text{[sum of vacuum-bubble diagrams]} + \ldots \Big] \Big[ (2\pi)^4 \delta^{(4)}(0) \Big]^{-1}. \tag{4.109}$$
Notice that the IR divergence arising from the infinite extent of the space-time volume, which we first met in Section 3.2 and then again in (4.96), has been removed in $\mathcal{E}_0$, leaving behind a highly UV-divergent expression that reflects our ignorance about the physics governing the high-energy regime.
One-Particle States in Interacting Theory
We now have an extremely beautiful formula (4.105) for computing an extremely abstract quantity, the $n$-point correlation function. Our next task is to relate these objects back to $S$-matrix elements (4.25), or equivalently $T$-matrix elements (4.49), which will allow us to compute quantities that can actually be measured, namely decay rates and cross sections.
In order to achieve this goal, we still have to learn how to deal with diagrams involving loops on the external lines. Let us first try to understand the problem with such graphs, looking at a specific example. We consider the following Feynman diagram
[diagram with external momenta $p_1$, $p_2$, $q_1$, $q_2$, internal momentum $p_3$, and loop momentum $l$]
$$= \frac{1}{2} \int \frac{d^4p_3}{(2\pi)^4} \, \frac{i}{p_3^2 - m^2} \int \frac{d^4l}{(2\pi)^4} \, \frac{i}{l^2 - m^2} \, (-i\lambda) \, (2\pi)^4 \delta^{(4)}(p_2 + p_3 - q_1 - q_2) \, (-i\lambda) \, (2\pi)^4 \delta^{(4)}(p_1 - p_3), \tag{4.110}$$
appearing in $\phi^4$ theory. We can integrate over $p_3$ using the second delta function. It tells us to evaluate
$$\left. \frac{1}{p_3^2 - m^2} \right|_{p_3 = p_1} = \frac{1}{p_1^2 - m^2} = \frac{1}{0}. \tag{4.111}$$
We get an infinity, since $p_1$, being the momentum of an external particle, is on-shell, i.e., $p_1^2 = m^2$. This is not good! Clearly, diagrams like (4.110) should not contribute to the $S$-matrix elements. In fact, this is physically reasonable, since the external leg corrections,
$$\text{[external-leg correction diagrams]} + \ldots, \tag{4.112}$$
represent the evolution of the one-particle state of the free theory into the one-particle state of the interacting theory, in the same way that the vacuum-bubble diagrams (4.97) represent the evolution of $|0\rangle$ into $|\Omega\rangle$. Since these corrections have nothing to do with the scattering process itself, it is plausible that one should exclude them from the calculation of the $S$-matrix.
For a generic Feynman diagram with external legs, we define amputation in the following way. Starting from the tip of each external leg, find the last point at which the diagram can be cut by removing a single propagator, such that this operation separates the leg from the rest of the diagram. Cut there. Let me give a non-trivial example of a diagram that appears at $\mathcal{O}(\lambda^{10})$, if one wants to compute scattering in $\phi^4$ theory. Here it is:
$$\text{[unamputated diagram]} \ \overset{\text{amputation}}{=} \ \text{[amputated diagram]} \tag{4.113}$$
So far we have learnt about the problem with external-leg corrections, which become infinite for on-shell external states as implied by (4.111), and gave a simple prescription of how to solve the issue, i.e., by simply removing these corrections by amputation. A practitioner or an experimental physicist might be happy at this point, but as theorists we want more. So let's have a closer look at the connection between $G^{(n)}$ and $S$.

4.10 From Correlation Functions to Scattering Matrix Elements

Before we start, let me warn you that this subsection will be more abstract than the preceding ones. Its main theme will be the singularities of Feynman diagrams viewed as analytic functions of their external momenta. Yet, we will see rather soon that this apparently esoteric subject is full of physical implications, and that it illuminates the relation between Feynman diagrams and the general principles of QFT.

Källén-Lehmann Spectral Representation
We already know that in the free theory the matrix element $\langle 0| T\phi(x)\phi(y) |0\rangle$ has a simple physical interpretation. It gives the amplitude for a particle to propagate from $y$ to $x$. To what extent does this carry over to the interacting theory? In order to answer this question, we will have a look at the two-point correlation function (4.104). Our analysis of $G^{(2)}$ will rely only on general principles of special relativity and QM, but will neither depend on the nature of the interactions nor on an expansion in perturbation theory. Yet, to simplify matters, we will restrict our consideration to the case of the real scalar field $\phi$. Similar results can be obtained for correlation functions of fields with spin.
We begin by studying the excited states of the interacting theory, with the corresponding energies being defined relative to the ground-state energy $E_0$. Let $|\lambda_0\rangle$ be an excited eigenstate of the full Hamiltonian with vanishing total 3-momentum, i.e., $\mathbf{P}|\lambda_0\rangle = 0$. That $|\lambda_0\rangle$ can be an eigenstate of both $H$ and $\mathbf{P}$ follows from the fact that $[H, \mathbf{P}] = 0$. Such a state can consist of an arbitrary number of particles or it can even be a bound state. The simultaneous eigenvalues of $H - E_0$ and $\mathbf{P}$ can be combined into a 4-vector $p_0^\mu = (m_\lambda, \mathbf{0})$, where $m_\lambda$ denotes the mass of the particular zero-momentum state. Being the generator of space-time translations, $P^\mu = (H - E_0, \mathbf{P})$ transforms as a contravariant 4-vector under boosts, i.e., $U^{-1}(\Lambda) P^\mu U(\Lambda) = \Lambda^\mu{}_\nu P^\nu$, where $U(\Lambda)$ is the unitary operator that implements the Lorentz boost. This implies that by boosting $|\lambda_0\rangle$ one can generate a new state $|\lambda_p\rangle$, which can have any 3-momentum $\mathbf{p}$ and is an eigenstate of $H - E_0$ with energy $E_p(\lambda) = (|\mathbf{p}|^2 + m_\lambda^2)^{1/2}$. Or the other way round, any eigenstate with explicit 3-momentum can be boosted to a zero-momentum eigenstate. You are kindly asked to prove this statement explicitly. The sets of eigenvalues $p^\mu = (E - E_0, \mathbf{p})$ are thus organized into hyperboloids, as is shown in Figure 4.5. The lowest-lying isolated hyperboloid corresponds to the one-particle states of the interacting theory, whereas the other ones correspond to bound states that may or may not be present. Above a certain threshold value of $m_\lambda$, a continuum of multiparticle states starts.
From the above it follows that the states $|\lambda_p\rangle$ form a complete set of states in the interacting theory, in the same way the states $|p\rangle$ do in the free theory. In turn, the completeness relation of the one-particle states in the free theory (3.66) is replaced by
$$1 = |\Omega\rangle\langle\Omega| + \sum_\lambda \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2E_p(\lambda)} \, |\lambda_p\rangle\langle\lambda_p|, \tag{4.114}$$
where the first term corresponds to the ground state and the second one to all excited states.
We now insert (4.114) into the two-point function $G^{(2)}(x, y) = \langle\Omega| T\phi(x)\phi(y) |\Omega\rangle$.$^{29}$ In the case $x^0 > y^0$, we obtain
$$\langle\Omega| \phi(x)\phi(y) |\Omega\rangle = \langle\Omega| \phi(x) |\Omega\rangle \langle\Omega| \phi(y) |\Omega\rangle + \sum_\lambda \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2E_p(\lambda)} \, \langle\Omega| \phi(x) |\lambda_p\rangle \langle\lambda_p| \phi(y) |\Omega\rangle. \tag{4.115}$$

$^{29}$ For the sake of brevity, the labels $H$ indicating Heisenberg fields will be dropped hereafter, whenever we discuss the properties of correlation functions.

Figure 4.5: The eigenvalues of $P^\mu = (H, \mathbf{P})$ are hyperboloids in the $\mathbf{P}$-$H$ plane. For a typical theory the states consist of one or more particles of mass $m$. In consequence, there is a hyperboloid of one-particle states and a continuum of hyperboloids of two-, three-particle states, and so on. There may also be one or more bound-state hyperboloids below the threshold for creation of two free particles.


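In the absence of preferred directions in the universe, the vacuum $|\Omega\rangle$ should be invariant under space-time translations and Lorentz transformations, i.e., $e^{-iP\cdot x}|\Omega\rangle = |\Omega\rangle$ and $U(\Lambda)|\Omega\rangle = |\Omega\rangle$. As part of an exercise you will show that this implies that
$$\langle\Omega| \phi(x) |\Omega\rangle = \langle\Omega| \phi(0) |\Omega\rangle = v, \tag{4.116}$$
where $v$ denotes the VEV of the field $\phi(x)$, usually taken to be zero. If $v \neq 0$ then one should reformulate the theory using the shifted field $\tilde{\phi}(x) = \phi(x) - v$, which by definition has vanishing VEV. By an appropriate choice of the degrees of freedom of the interacting theory one hence can always get rid of the first term in (4.115). The matrix elements entering the second term can be manipulated as follows
$$\begin{aligned}
\langle\Omega| \phi(x) |\lambda_p\rangle &= \langle\Omega| e^{iP\cdot x} \phi(0) e^{-iP\cdot x} |\lambda_p\rangle = e^{-ip\cdot x} \, \langle\Omega| \phi(0) |\lambda_p\rangle \Big|_{p^0 = E_p(\lambda)} \\
&= e^{-ip\cdot x} \, \langle\Omega| U^{-1}(\Lambda) U(\Lambda) \phi(0) U^{-1}(\Lambda) U(\Lambda) |\lambda_p\rangle \Big|_{p^0 = E_p(\lambda)} \\
&= e^{-ip\cdot x} \, \langle\Omega| \phi(0) |\lambda_0\rangle \Big|_{p^0 = E_p(\lambda)},
\end{aligned} \tag{4.117}$$
where $U(\Lambda)$ implements a boost from $\mathbf{p}$ to $\mathbf{0}$. In order to arrive at the final expression, we have made use of the fact that $|\Omega\rangle$ and $\phi(0)$ are Lorentz invariant.$^{30}$

$^{30}$ For a field with spin we would need to keep track of its non-trivial transformation properties under the Lorentz group.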

Figure 4.6: The spectral density $\rho(s)$ for a typical interacting theory. The one-particle states contribute a delta function at $m^2$, i.e., the square of the physical mass of the particle. Multiparticle states form a continuous spectrum starting at $(2m)^2$. There may also be bound states below the two-particle threshold.

Leaving out the VEV and using (4.117), the two-point correlation function (4.115) then takes the form ($x^0 > y^0$)
$$\begin{aligned}
\langle\Omega| \phi(x)\phi(y) |\Omega\rangle &= \sum_\lambda |\langle\Omega| \phi(0) |\lambda_0\rangle|^2 \int \frac{d^3p}{(2\pi)^3} \, \frac{e^{-ip\cdot(x - y)}}{2E_p(\lambda)} \bigg|_{p^0 = E_p(\lambda)} \\
&= \sum_\lambda |\langle\Omega| \phi(0) |\lambda_0\rangle|^2 \int \frac{d^4p}{(2\pi)^4} \, \frac{i}{p^2 - m_\lambda^2 + i\epsilon} \, e^{-ip\cdot(x - y)},
\end{aligned} \tag{4.118}$$
where to arrive at the final result we have introduced an integration over $p^0$ employing (3.120). The integral in the last line of (4.118) is the Feynman propagator $D_F(x - y; m_\lambda^2)$ belonging to a particle with mass $m_\lambda$. We see that the particle interpretation has in fact changed in the interacting theory from free particles to dressed particles (quasi-particles), so the particles we are dealing with here are not the particles that we know from the free theory.
An expression analogous to the one in (4.118) holds in the case $x^0 < y^0$. Combining both cases one arrives at the Källén-Lehmann spectral representation of the two-point correlation function
$$G^{(2)}(x, y) = \int_0^\infty \frac{ds}{2\pi} \, \rho(s) \, D_F(x - y; s), \tag{4.119}$$
where $\rho(s)$ depends on the squared invariant mass $s$. This spectral density function is positive definite and given by
$$\rho(s) = \sum_\lambda 2\pi \, \delta(s - m_\lambda^2) \, |\langle\Omega| \phi(0) |\lambda_0\rangle|^2. \tag{4.120}$$

Figure 4.7: Analytic structure in the complex $p^2$-plane of the Fourier transform of the two-point correlation function for a typical interacting theory. The one-particle states lead to an isolated pole at $p^2 = m^2$. States of two or more free particles give a branch cut, while possible bound states show up as additional poles below $(2m)^2$.

The spectral density for a typical theory is plotted in Figure 4.6. We see that the states in the interacting theory that describe one-particle states correspond to an isolated delta function in the spectral density,
$$\rho(s) = 2\pi \, \delta(s - m^2) \, Z + \left( \text{nothing else until } s \gtrsim (2m)^2 \right). \tag{4.121}$$
The factor
$$Z = |\langle\Omega| \phi(0) |\lambda_0\rangle|^2, \tag{4.122}$$
is called the field-strength renormalization. It is the probability for $\phi(0)$ to create a one-particle state out of the vacuum $|\Omega\rangle$, and $m$ denotes the physical mass of the associated particle, being the energy eigenvalue in its rest frame. Notice that this physical mass is in general not equal to the bare mass parameter occurring in the Lagrangian of the $\phi^4$ theory (4.6). To make the distinction between physical and bare quantities manifest, we will hereafter indicate bare quantities by a subscript 0. It is important to realize that only the physical mass $m$ is directly observable, while the bare mass $m_0$ is not.
In momentum space the spectral decomposition (4.119) reads
$$\begin{aligned}
\tilde{G}^{(2)}(p^2) &= \int d^4x \, e^{ip\cdot x} \, G^{(2)}(x, 0) = \int_0^\infty \frac{ds}{2\pi} \, \rho(s) \, \frac{i}{p^2 - s + i\epsilon} \\
&= \frac{iZ}{p^2 - m^2 + i\epsilon} + \int_{\gtrsim (2m)^2}^\infty \frac{ds}{2\pi} \, \rho(s) \, \frac{i}{p^2 - s + i\epsilon}.
\end{aligned} \tag{4.123}$$
The analytic structure of this function in the complex $p^2$-plane is depicted in Figure 4.7. The first term gives an isolated simple pole at $p^2 = m^2$, while the second term contributes a branch cut beginning at $p^2 = (2m)^2$. If there are any two-particle bound states these will appear as additional delta functions in $\rho(s)$ and thus as additional poles in (4.123) below the cut.
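The pole structure of (4.123) can be made concrete with a toy spectral density. The sketch below is an illustration under stated assumptions: the continuum part of $\rho(s)$ is a hypothetical smooth function chosen only so that the integral converges, $p^2$ is taken real and below $(2m)^2$ so that the $i\epsilon$ can be dropped, and the integral over $s$ is mapped to the unit interval via $s = (2m)^2/x$ and evaluated with the midpoint rule.

```python
import math

def continuum_density(s, m, strength):
    """Hypothetical smooth multiparticle density, nonzero only above (2m)^2."""
    threshold = 4.0 * m * m
    if s <= threshold:
        return 0.0
    return strength * math.sqrt(1.0 - threshold / s) / s

def propagator_over_i(p2, m, Z, strength, n=20000):
    """G(p^2) / i from (4.123) for real p^2 < (2m)^2: the one-particle pole
    plus the continuum integral, evaluated with the substitution s = (2m)^2/x."""
    pole = Z / (p2 - m * m)
    threshold = 4.0 * m * m
    cut = 0.0
    for k in range(n):
        x = (k + 0.5) / n               # midpoint in (0, 1)
        s = threshold / x
        jacobian = threshold / (x * x)  # |ds/dx|
        cut += jacobian * continuum_density(s, m, strength) / (2.0 * math.pi) / (p2 - s)
    return pole + cut / n

m, Z, strength = 1.0, 0.8, 0.5  # toy parameters
p2 = m * m + 1e-6               # just above the one-particle pole
print((p2 - m * m) * propagator_over_i(p2, m, Z, strength))
```

Multiplying by $(p^2 - m^2)$ and approaching the pole isolates the residue $Z$: the continuum contribution stays finite there and drops out, which is the property exploited by the LSZ reduction formula.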
Let us compare the results we have obtained in this subsection to those found in Section 3.7 for the free theory. The Fourier transform of the Feynman propagator (i.e., the two-point correlation function in the theory of a free scalar field) reads ($x^0 > 0$)
$$\tilde{D}_F(p^2) = \int d^4x \, e^{ip\cdot x} \, D_F(x) = \int d^4x \, e^{ip\cdot x} \, \langle 0| T\phi(x)\phi(0) |0\rangle = \frac{i}{p^2 - m_0^2 + i\epsilon}, \tag{4.124}$$
and is the amplitude for a particle to propagate from $0$ to $x$. The relation (4.123) implies that the two-point correlation function of the most general theory of an interacting real scalar field takes a very similar form. The general expression is essentially a sum of scalar propagation amplitudes for states generated from the vacuum by the field $\phi(0)$. There are however two important differences between (4.123) and (4.124). First, (4.123) contains the field renormalization factor $Z$, which is one in the case of the free fields. The latter statement is easily shown explicitly by evaluating the matrix elements $\langle 0| \phi(0) |p\rangle$ and thus left as an exercise. Second, (4.123) contains contributions from multiparticle intermediate states with a continuous mass spectrum. In the free field theory, $\phi(0)$ can create only a single particle from $|0\rangle$. Notice that the generation of multiparticle states is the reason why the factor $Z$ in general differs from unity in the interacting theory.
Lehmann-Symanzik-Zimmermann Reduction Formula
So far we have seen that the Fourier transform of the two-point correlation function (4.123), considered as an analytic function of $p^2$, has a simple pole at the square of the physical mass of the one-particle states, while multiparticle intermediate states give weaker branch-cut singularities. In the following we will find that this rather formal observation generalizes to higher-point correlation functions and plays a crucial role in the derivation of a general relation between Green's functions and $S$-matrix elements. This relation was first derived by Harry Lehmann, Kurt Symanzik, and Wolfhart Zimmermann [3] and is today known as the LSZ reduction formula. Combining the LSZ reduction formula with our Feynman rules for computing correlation functions (4.105) will then give us a master formula for $S$-matrix elements in terms of Feynman diagrams. For simplicity, we will again carry out the whole analysis for the case of a real scalar field.
In the following we would like to use the single-particle pole structure of $\tilde{G}^{(2)}(p^2)$ in the vicinity of $p^2 \to m^2$ to obtain the asymptotic "in" and "out" states of the theory and in particular their matrix elements,
$$
{}_{\rm out}\langle q_1, \ldots, q_n | p_A, p_B \rangle_{\rm in} = \langle q_1, \ldots, q_n | S | p_A, p_B \rangle \,.
\tag{4.125}
$$

These matrix elements are plane-wave amplitudes that describe the scattering of an initial two-particle momentum state $|p_A, p_B\rangle_{\rm in}$, constructed in the far past ($t = t_-$), into an $n$-particle momentum state $|q_1, \ldots, q_n\rangle_{\rm out}$, which represents the final-state particles in the far future ($t = t_+$).$^{31}$
The basic idea to derive the desired master formula is as follows. In order to calculate the $S$-matrix element for a $2 \to n$ scattering process, we start with the correlation function involving $(n + 2)$ Heisenberg fields. If we Fourier-transform this function with respect to the coordinate of any one of these fields, we will find a pole of the form (4.123) in the corresponding Fourier-transformed variable. We will argue that the one-particle states associated with these poles are in fact asymptotic states, i.e., states given by the limit of well-separated wave packets as they become concentrated around definite momenta. Taking the limit in which all $(n + 2)$ external particles go on-shell, we can then interpret the coefficient of the multiple pole as an $S$-matrix element.

31 Because human-built detectors are in general not able to resolve positions down to the de Broglie wavelengths of the particles, it is correct to work with plane-wave states in the Heisenberg picture rather than wave packets to describe the collision.

We first study the Fourier transform of the $(n + 2)$-point correlation function with respect to one argument $x$,
$$
\int d^4x \, e^{ipx} \, \langle\Omega|T(\phi_x \phi_1 \ldots \phi_{n+1})|\Omega\rangle \,.
\tag{4.126}
$$
Here the shorthands $\phi_x = \phi(x)$, $\phi_1 = \phi(y_1)$, etc. have been used and all $\phi$'s are Heisenberg fields. We would now like to identify poles in the variable $p^0$. To do this, we divide the integral over $x^0$ into three regions,
$$
\int_{-\infty}^{\infty} dx^0 = \int_{-\infty}^{t_-} dx^0 + \int_{t_-}^{t_+} dx^0 + \int_{t_+}^{\infty} dx^0 \,,
\tag{4.127}
$$
where $t_- < \min\{y_i^0\}$ and $t_+ > \max\{y_i^0\}$ with $i = 1, \ldots, n + 1$. In the region $x^0 \in [t_-, t_+]$ the result of the integral is an analytic function of $p^0$ without poles, since the region is bounded and the integrand depends on $p^0$ through the analytic function $\exp(ip^0 x^0)$. In the other two regions the integrand still has no poles, but the integration intervals are unbounded. Therefore singularities in $p^0$ may develop upon integration.
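Consider the third region, i.e., $x^0 \in [t_+, \infty[$. In this case $x^0$ is the latest time, so $\phi_x$ stands first in the time-ordered product. In order to determine the pole structure of (4.126), we insert the completeness relation (4.114), assuming that the field $\phi$ has a vanishing VEV.$^{32}$ The integral over the third region then becomes
$$
\int_{t_+}^{\infty} dx^0 \int d^3x \, e^{i(p^0 x^0 - \mathbf{p}\cdot\mathbf{x})} \sum_\lambda \int \frac{d^3k}{(2\pi)^3} \frac{1}{2E_{\mathbf{k}}(\lambda)} \, \langle\Omega|\phi(x)|\lambda_{\mathbf{k}}\rangle \langle\lambda_{\mathbf{k}}|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle \,.
\tag{4.128}
$$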

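Using (4.117) and including a damping factor $\exp(-\epsilon x^0)$ with infinitesimal $\epsilon$ to ensure that the integral is well-defined,$^{33}$ the above integral takes the form
$$
\begin{aligned}
&\sum_\lambda \int_{t_+}^{\infty} dx^0 \int \frac{d^3k}{(2\pi)^3} \frac{1}{2E_{\mathbf{k}}(\lambda)} \, e^{i(p^0 - k^0 + i\epsilon)x^0} \, \langle\Omega|\phi(0)|\lambda_0\rangle \, (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{k}) \, \langle\lambda_{\mathbf{k}}|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle \Big|_{k^0 = E_{\mathbf{k}}(\lambda)} \\
&\quad = \sum_\lambda \frac{1}{2E_{\mathbf{p}}(\lambda)} \, \frac{i \, e^{i(p^0 - E_{\mathbf{p}}(\lambda) + i\epsilon) t_+}}{p^0 - E_{\mathbf{p}}(\lambda) + i\epsilon} \, \langle\Omega|\phi(0)|\lambda_0\rangle \langle\lambda_{\mathbf{p}}|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle \,.
\end{aligned}
\tag{4.129}
$$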

32 If this is not the case, we reformulate the theory in terms of the shifted field $\phi - \langle\Omega|\phi|\Omega\rangle$.
33 This regularization is equivalent to the $i\epsilon$ prescription used in (3.129) and the tilted time-axis prescription introduced in (4.86).

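Here we have used $\int d^3x \, \exp(-i(\mathbf{p} - \mathbf{k})\cdot\mathbf{x}) = (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{k})$. The expression (4.129) has the same residue at $p^0 = E_{\mathbf{p}}(\lambda) - i\epsilon$ as the term $i/(p^2 - m_\lambda^2 + i\epsilon) = i/\big((p^0)^2 - (E_{\mathbf{p}}(\lambda))^2 + i\epsilon\big)$ appearing in the two-point correlation function (4.118). As before, this singularity will be either a simple pole or a branch cut, depending on whether the rest energy $m_\lambda$ is isolated or not. The one-particle state in the far future corresponds to an isolated pole at the on-shell energy $p^0 = E_{\mathbf{p}}$. In this case, (4.129) gives
$$
\int d^4x \, e^{ipx} \, \langle\Omega|T(\phi_x \phi_1 \ldots \phi_{n+1})|\Omega\rangle \; \underset{p^0 \to E_{\mathbf{p}}}{\sim} \; \frac{i\sqrt{Z}}{p^2 - m^2 + i\epsilon} \; {}_{\rm out}\langle \mathbf{p}|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle \,.
\tag{4.130}
$$
In order to obtain this result we have identified the matrix element $\langle\Omega|\phi(0)|\lambda_0\rangle$ appearing in (4.129) with $Z^{1/2}$ using (4.122), absorbing the leftover phase into the definition of $|\lambda_0\rangle$. We have furthermore used the notation $|\mathbf{p}\rangle_{\rm out} = |\lambda_{\mathbf{p}}\rangle_{\text{one-particle}}$ for a one-particle eigenstate with momentum $\mathbf{p}$ that is created at asymptotically large times in the future.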
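The only integral that has to be done explicitly on the way to (4.129) is the time integral over the third region, $\int_{t_+}^{\infty} dx^0 \, e^{i(p^0 - E + i\epsilon)x^0} = i \, e^{i(p^0 - E + i\epsilon)t_+}/(p^0 - E + i\epsilon)$. The following minimal sketch checks this closed form numerically with a trapezoid rule. The function names, the finite damping `eps`, the upper cutoff `t_max` and the sample values are illustrative assumptions and are not part of the lecture notes.

```python
import cmath


def time_integral_closed_form(p0, energy, eps, t_plus):
    """Closed form i * exp(i a t_plus) / a with a = p0 - energy + i eps."""
    a = complex(p0 - energy, eps)
    return 1j * cmath.exp(1j * a * t_plus) / a


def time_integral_numeric(p0, energy, eps, t_plus, t_max=400.0, steps=400_000):
    """Trapezoid estimate of the integral of exp(i a x0) from t_plus to t_max.

    The damping factor exp(-eps * x0) makes the contribution beyond t_max
    negligible as long as eps * t_max is large (assumed here).
    """
    a = complex(p0 - energy, eps)
    h = (t_max - t_plus) / steps
    total = 0.5 * (cmath.exp(1j * a * t_plus) + cmath.exp(1j * a * t_max))
    for k in range(1, steps):
        total += cmath.exp(1j * a * (t_plus + k * h))
    return total * h


if __name__ == "__main__":
    exact = time_integral_closed_form(1.3, 1.0, 0.05, 2.0)
    approx = time_integral_numeric(1.3, 1.0, 0.05, 2.0)
    print(exact, approx)
```

The closed form makes the pole explicit: as $p^0 \to E$ the modulus of the result grows like $1/\epsilon$, which is the singularity that is matched to the propagator denominator in (4.130).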
In order to evaluate the contribution from the first region, i.e., $x^0 \in \, ]-\infty, t_-]$, one puts $\phi_x$ last in the time-ordered product. Performing steps similar to the ones for the third integration interval (the actual calculation is left as an exercise), one finds that the one-particle state in the far past corresponds to an isolated pole at the on-shell energy $p^0 = -E_{\mathbf{p}}$,
$$
\int d^4x \, e^{ipx} \, \langle\Omega|T(\phi_x \phi_1 \ldots \phi_{n+1})|\Omega\rangle \; \underset{p^0 \to -E_{\mathbf{p}}}{\sim} \; \frac{i\sqrt{Z}}{p^2 - m^2 + i\epsilon} \, \langle\Omega|T(\phi_1 \ldots \phi_{n+1})|{-\mathbf{p}}\rangle_{\rm in} \,,
\tag{4.131}
$$
where $|{-\mathbf{p}}\rangle_{\rm in} = |\lambda_{-\mathbf{p}}\rangle_{\text{one-particle}}$ denotes the one-particle eigenstate with momentum $-\mathbf{p}$ which is constructed at asymptotically large times in the past.
We now want to repeat the same exercise for the remaining field coordinates $y_1$, etc. In the asymptotic treatment of multiparticle states it is, however, better to use normalized wave packets. In that case $x$ is constrained to lie within a small band about the trajectory of a particle with momentum $\mathbf{p}$, with the spatial extent of the band being determined by the wave packet. In this way the particles do not interfere and can effectively be considered free at asymptotic times, unlike plane-wave states. Instead of a simple Fourier transform $\int d^4x \, \exp(ipx)$, we should hence have used
$$
\int \frac{d^3q}{(2\pi)^3} \int d^4x \, e^{ip^0 x^0} e^{-i\mathbf{q}\cdot\mathbf{x}} \, \varphi(\mathbf{q}) \,,
\tag{4.132}
$$
in (4.126), where $\varphi(\mathbf{q})$ is a function that is peaked around $\mathbf{p}$, and at the end taken the limit of a sharply peaked wave packet, $\varphi(\mathbf{q}) \to (2\pi)^3 \delta^{(3)}(\mathbf{q} - \mathbf{p})$.
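With this modification the right-hand side in (4.129) would turn into
$$
\begin{aligned}
&\sum_\lambda \int \frac{d^3q}{(2\pi)^3} \, \varphi(\mathbf{q}) \, \frac{1}{2E_{\mathbf{q}}(\lambda)} \, \frac{i}{p^0 - E_{\mathbf{q}}(\lambda) + i\epsilon} \, \langle\Omega|\phi(0)|\lambda_0\rangle \langle\lambda_{\mathbf{q}}|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle \\
&\quad \underset{p^0 \to E_{\mathbf{p}}}{\sim} \; \int \frac{d^3q}{(2\pi)^3} \, \varphi(\mathbf{q}) \, \frac{i\sqrt{Z}}{\tilde{p}^2 - m^2 + i\epsilon} \; {}_{\rm out}\langle \mathbf{q}|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle \,,
\end{aligned}
\tag{4.133}
$$
where $\tilde{p} = (p^0, \mathbf{q})$. We see that the one-particle singularity is now a branch cut, whose length is the width in momentum space of the wave packet $\varphi(\mathbf{q})$. It follows that if the width of $\varphi(\mathbf{q})$ is taken to zero, the branch cut sharpens up to a pole. In this limit (4.133) reduces to the simple form (4.130). The same line of reasoning applies to the pole structure that appears in the far past. In this case one recovers (4.131).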
The procedure described above can be generalized to the (n + 2)-particle case we are
interested in by integrating each of the coordinates against a wave packet. Let me spare you
the gory details of the actual calculation and only tell you about the final result. It turns out
that by smearing each coordinate one can extract the leading singularities that turn out to
be products of poles in the separate energy variables. The physics behind this factorization is
that an (n + 2)-particle asymptotic state is created/annihilated by (n + 2) field operators that
are constrained to lie in distant wave packets and therefore are effectively localized. Under
these conditions an (n + 2)-particle excitation in the continuum can be represented by (n + 2)
distinct (i.e., independent) one-particle excitations of the ground state.
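At the end one arrives at
$$
\begin{aligned}
\tilde{G}^{(n+2)}(p_A, p_B, q_1, \ldots, q_n) &= \Bigg( \prod_{i=A,B} \int d^4x_i \, e^{-ip_i x_i} \Bigg) \Bigg( \prod_{j=1}^{n} \int d^4y_j \, e^{iq_j y_j} \Bigg) \langle\Omega|T(\phi_A \phi_B \phi_1 \ldots \phi_n)|\Omega\rangle \\
&\sim \Bigg( \prod_{i=A,B} \frac{i\sqrt{Z}}{p_i^2 - m^2 + i\epsilon} \Bigg) \Bigg( \prod_{j=1}^{n} \frac{i\sqrt{Z}}{q_j^2 - m^2 + i\epsilon} \Bigg) \, {}_{\rm out}\langle q_1, \ldots, q_n | p_A, p_B \rangle_{\rm in} \\
&= \Bigg( \prod_{i=A,B} \frac{i\sqrt{Z}}{p_i^2 - m^2 + i\epsilon} \Bigg) \Bigg( \prod_{j=1}^{n} \frac{i\sqrt{Z}}{q_j^2 - m^2 + i\epsilon} \Bigg) \, \langle q_1, \ldots, q_n | S | p_A, p_B \rangle \,,
\end{aligned}
\tag{4.134}
$$
which holds in the limit $p_i^0 \to E_{\mathbf{p}_i}$ and $q_j^0 \to E_{\mathbf{q}_j}$, and where the use of $\exp(-ip_i x_i)$ for the incoming fields ensures that the particles in the "in" state have positive energy. The latter relation is the famous LSZ reduction formula. It implies that the $S$-matrix element involving two particles in the "in" state and $n$ particles in the "out" state can be obtained from the corresponding Fourier-transformed $(n + 2)$-point correlation function by extracting the leading singularities in the energies $p_i^0$ and $q_j^0$, which coincide with the situations where the external particles become on-shell.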
Diagrammatic Master Formula
Our final goal is to reformulate the above procedure in the language of Feynman diagrams. For concreteness, we will first analyze the relation between the diagrammatic expansion of the scalar field four-point function and the $S$-matrix element and then generalize this result to the case of $2 \to n$ scattering. We will consider explicitly the fully connected Feynman diagrams contributing to the Fourier-transformed correlation functions. By a similar analysis, it is straightforward to show that disconnected diagrams should be disregarded, because they do not have the singularity structure with a product of four (or $(n + 2)$) poles appearing on the right-hand side of the LSZ reduction formula (4.134).
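[Figure 4.8: Structure of the exact four-point correlation function in scalar field theory. The external legs carry the momenta $p_A$, $p_B$, $q_1$ and $q_2$; the central blob is labeled "amp.".]

The exact four-point correlation function is shown in Figure 4.8. In this figure we have indicated explicitly the diagrammatic corrections on each external leg. The light gray blob in the centre of the diagram represents the sum of all amputated four-point graphs,

[Diagrammatic equation (4.135): the blob labeled "amp." equals the sum of all amputated four-point diagrams, written as a series of diagrams + ... .]
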

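while the dark gray circles indicate the two-point Green's function, also known as the full propagator. The full propagator can be written as a Dyson series,

[Diagrammatic equation (4.136): full propagator = free propagator + (free propagator)(1PI)(free propagator) + (free propagator)(1PI)(free propagator)(1PI)(free propagator) + ... .]

where

[Diagrammatic equation (4.137): the blob labeled "1PI" $= -i\Sigma(p^2) =$ sum of self-energy diagrams + ... ,]

is the collection of all one-particle irreducible (1PI) self-energy diagrams. Diagrams are called 1PI if they cannot be split in two by removing a single line. The Dyson series (4.136) is in fact a geometric series, which can be summed up according to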
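$$
\begin{aligned}
\text{(full propagator)} &= \frac{i}{p^2 - m_0^2 + i\epsilon} + \frac{i}{p^2 - m_0^2 + i\epsilon} \, \big({-i\Sigma(p^2)}\big) \, \frac{i}{p^2 - m_0^2 + i\epsilon} + \ldots \\
&= \frac{i}{p^2 - m_0^2 - \Sigma(p^2) + i\epsilon} \,.
\end{aligned}
\tag{4.138}
$$
We see that the full propagator has a simple pole located at the physical mass $m$, which is shifted away from the bare mass $m_0$ by the self-energy:
$$
\Big[ p^2 - m_0^2 - \Sigma(p^2) \Big]_{p^2 = m^2} = 0 \quad \Longrightarrow \quad m^2 = m_0^2 + \Sigma(m^2) \,.
\tag{4.139}
$$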

Notice that our sign convention for the 1PI self-energy $\Sigma(p^2)$ implies that a positive contribution to $\Sigma(p^2)$ corresponds to a positive shift of the scalar particle mass.
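The resummation of the Dyson series in (4.138) can be illustrated with a few lines of code. The sketch below treats the self-energy as a constant toy number `sigma` (an assumption made purely for illustration, as are the function names and the omission of the $i\epsilon$ term away from the pole) and compares partial sums of the geometric series with the closed form.

```python
def free_propagator(p2, m0_sq):
    """Free propagator i / (p^2 - m0^2), evaluated away from the pole."""
    return 1j / (p2 - m0_sq)


def dyson_partial_sum(p2, m0_sq, sigma, terms):
    """Sum the first `terms` terms of the Dyson series.

    Each additional term attaches one 1PI insertion (-i * sigma)
    and one more free propagator.
    """
    d0 = free_propagator(p2, m0_sq)
    ratio = (-1j * sigma) * d0  # equals sigma / (p^2 - m0^2)
    total = 0j
    term = d0
    for _ in range(terms):
        total += term
        term *= ratio
    return total


def full_propagator(p2, m0_sq, sigma):
    """Closed form i / (p^2 - m0^2 - sigma) of the resummed series."""
    return 1j / (p2 - m0_sq - sigma)


if __name__ == "__main__":
    print(dyson_partial_sum(5.0, 1.0, 0.8, 60))
    print(full_propagator(5.0, 1.0, 0.8))
```

The partial sums only converge for $|\Sigma/(p^2 - m_0^2)| < 1$, while the closed form remains meaningful everywhere except at its pole $p^2 = m_0^2 + \Sigma$, which is the statement (4.139) for a constant self-energy.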
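Close to its simple pole at $p^2 \to m^2$ the denominator of the full propagator (4.138) can be expanded in the following way
$$
p^2 - m_0^2 - \Sigma(p^2) = \big(p^2 - m^2\big) \big(1 - \Sigma'(m^2)\big) + O\big((p^2 - m^2)^2\big) \,,
\tag{4.140}
$$
where $\Sigma'(m^2)$ stands for $\partial\Sigma(p^2)/\partial(p^2)\big|_{p^2 = m^2}$. This implies that just like in the Källén-Lehmann spectral representation (4.123), the full propagator has a single-particle pole of the form
$$
\text{(full propagator)} \; \underset{p^0 \to E_{\mathbf{p}}}{\sim} \; \frac{iZ}{p^2 - m^2 + i\epsilon} + (\text{regular terms}) \,,
\tag{4.141}
$$
with
$$
Z = \frac{1}{1 - \Sigma'(m^2)} \,.
\tag{4.142}
$$
As a result, the sum of all fully connected $2 \to 2$ diagrams contains a product of four poles
$$
\frac{iZ}{p_A^2 - m^2 + i\epsilon} \, \frac{iZ}{p_B^2 - m^2 + i\epsilon} \, \frac{iZ}{q_1^2 - m^2 + i\epsilon} \, \frac{iZ}{q_2^2 - m^2 + i\epsilon} \,,
\tag{4.143}
$$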

multiplying the amputated four-point diagrams. This is exactly the singularity on the right-hand side of the LSZ reduction formula (4.134). Comparing the coefficients of the product of poles, we conclude that the $S$-matrix element of the process $\phi(p_A)\phi(p_B) \to \phi(q_1)\phi(q_2)$ can be expressed through
$$
\langle q_1, q_2 | S | p_A, p_B \rangle = \big(\sqrt{Z}\big)^4 \times \big[\text{amputated four-point blob "amp." with external momenta } p_A, p_B, q_1, q_2 \big] \,,
\tag{4.144}
$$
where the light gray blob represents the sum of amputated four-point diagrams with all external momenta being on-shell. This is the sought diagrammatic master formula for the case of $2 \to 2$ scattering of scalar fields.
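The relation between the pole position (4.139) and the residue (4.142) can be checked on a toy model. The sketch below assumes a polynomial self-energy $\Sigma(p^2) = a + b\,p^2 + c\,p^4$ with made-up coefficients, solves $m^2 = m_0^2 + \Sigma(m^2)$ by fixed-point iteration, and compares $Z = 1/(1 - \Sigma'(m^2))$ with the residue extracted directly from the propagator denominator. All names and numbers are illustrative assumptions; the fixed-point iteration is just one convenient way to find the pole.

```python
# Toy coefficients of Sigma(p^2) = A + B * p^2 + C * p^4 (assumed values).
A, B, C = 0.30, -0.10, -0.01


def sigma(p2):
    """Toy self-energy as a function of p^2."""
    return A + B * p2 + C * p2 * p2


def sigma_prime(p2):
    """Derivative of the toy self-energy with respect to p^2."""
    return B + 2.0 * C * p2


def physical_mass_sq(m0_sq, iterations=200):
    """Solve m^2 = m0^2 + Sigma(m^2) by fixed-point iteration."""
    m_sq = m0_sq
    for _ in range(iterations):
        m_sq = m0_sq + sigma(m_sq)
    return m_sq


def z_from_derivative(m_sq):
    """Field-strength renormalization Z = 1 / (1 - Sigma'(m^2))."""
    return 1.0 / (1.0 - sigma_prime(m_sq))


def z_from_residue(m0_sq, m_sq, delta=1e-6):
    """Estimate Z as (p^2 - m^2) / (p^2 - m0^2 - Sigma(p^2)) close to the pole."""
    p2 = m_sq + delta
    return delta / (p2 - m0_sq - sigma(p2))


if __name__ == "__main__":
    m_sq = physical_mass_sq(1.0)
    print(m_sq, z_from_derivative(m_sq), z_from_residue(1.0, m_sq))
```

With these coefficients the slope $\Sigma'(m^2)$ is negative, so the toy model gives $Z < 1$.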
An identical analysis can be applied to the Fourier-transformed $(n + 2)$-point correlator. In this case the relation between the $S$-matrix element and the Feynman graphs reads$^{34}$
$$
\langle q_1, \ldots, q_n | S | p_A, p_B \rangle = \big(\sqrt{Z}\big)^{n+2} \times \big[\text{amputated } (n+2)\text{-point blob "amp." with external momenta } p_A, p_B, q_1, \ldots, q_n \big] \,.
\tag{4.145}
$$
Notice that the renormalization factors $Z^{1/2}$ are irrelevant for calculations at the leading order of perturbation theory, but are important in the calculation of higher-order corrections. This completes the derivation of the connection between scattering matrix elements and fully connected amputated Feynman diagrams.

34 If the external particles are of different species, each has its own renormalization factor $Z^{1/2}$. Furthermore, if the particles have spin, there will be additional polarization factors on the right-hand side of the equation.

4.11 Decay Widths and Cross Sections

As in ordinary QM, the probabilities for things to happen in QFT are given by the (modulus) square of the quantum amplitudes. In this subsection we will compute these probabilities, known as decay widths and cross sections. One small subtlety here is that any $T$-matrix element (4.75) comes with a factor of $(2\pi)^4 \delta^{(4)}(p_f - p_i)$, so that we end up with the square of a delta function. As we will see in a moment, this subtlety is a result of the fact that we are working in an infinite space.
Fermi's Golden Rule
In order to start the discussion, let me derive something familiar, namely Fermi's golden rule using Dyson's formula (4.18). For two energy eigenstates $|m\rangle$ and $|n\rangle$ with $E_m \neq E_n$, one has in Born approximation
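$$
\begin{aligned}
\langle n|U(t)|m\rangle &= -i \, \langle n| \int_0^t dt' \, H_I(t') |m\rangle \\
&= -i \, \langle n|H_{\rm int}|m\rangle \int_0^t dt' \, e^{i\omega t'} \\
&= -\langle n|H_{\rm int}|m\rangle \, \frac{e^{i\omega t} - 1}{\omega} \,,
\end{aligned}
\tag{4.146}
$$
where $\omega = E_n - E_m$ and we have used in the first step the equality (4.11) to express $H_I$ in terms of $H_{\rm int}$. The probability $P_{m \to n}(t)$ for the transition from $|m\rangle$ to $|n\rangle$ to happen in the time $t$ is thus given by
$$
P_{m \to n}(t) = |\langle n|U(t)|m\rangle|^2 = 2 \, |\langle n|H_{\rm int}|m\rangle|^2 \, \frac{1 - \cos(\omega t)}{\omega^2} \,.
\tag{4.147}
$$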

The function $(1 - \cos(\omega t))/\omega^2$ is visualized in Figure 4.9. The $\omega$-dependence indicates that most transitions occur in a region between energy eigenstates separated by $\Delta E = 2\pi/t$, i.e., the half-width of the function. Looking at the figure one furthermore observes that as $t \to \infty$, the function shown in the plot approaches a delta function. In order to find the normalization, we evaluate
$$
\int_{-\infty}^{\infty} d\omega \, \frac{1 - \cos(\omega t)}{\omega^2} = \pi t \,.
\tag{4.148}
$$
This implies that
$$
\frac{1 - \cos(\omega t)}{\omega^2} \; \xrightarrow{\; t \to \infty \;} \; \pi t \, \delta(\omega) \,.
\tag{4.149}
$$
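The normalization (4.148) is easy to confirm numerically. The sketch below integrates $(1 - \cos(\omega t))/\omega^2$ with a trapezoid rule on a finite window and adds the analytic estimate $2/\Lambda$ for the two tails beyond the cutoff $\Lambda$, where the cosine averages out. The function names, the cutoff and the step count are illustrative choices, not part of the original text.

```python
import math


def golden_rule_kernel(omega, t):
    """(1 - cos(omega t)) / omega^2, using the limit t^2 / 2 near omega = 0."""
    if abs(omega * t) < 1e-6:
        return 0.5 * t * t
    return (1.0 - math.cos(omega * t)) / (omega * omega)


def kernel_area(t, cutoff=2000.0, steps=400_000):
    """Trapezoid estimate of the integral over all omega.

    The window [-cutoff, cutoff] is integrated numerically; the two tails
    contribute approximately 2 / cutoff in total.
    """
    h = 2.0 * cutoff / steps
    total = 0.5 * (golden_rule_kernel(-cutoff, t) + golden_rule_kernel(cutoff, t))
    for k in range(1, steps):
        total += golden_rule_kernel(-cutoff + k * h, t)
    return total * h + 2.0 / cutoff


if __name__ == "__main__":
    for t in (1.0, 3.0):
        print(t, kernel_area(t), math.pi * t)
```

The peak height $t^2/2$ grows quadratically while the width shrinks like $1/t$, so the area grows only linearly in $t$, which is what turns the transition probability into a constant rate.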
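Consider now a distribution of final states with density $\rho(E_n)$. In this case one has to integrate over $E_n$ and obtains
$$
\begin{aligned}
P_{m \to n}(t) &= \int dE_n \, \rho(E_n) \, |\langle n|U(t)|m\rangle|^2 = \int dE_n \, \rho(E_n) \, 2 \, |\langle n|H_{\rm int}|m\rangle|^2 \, \frac{1 - \cos(\omega t)}{\omega^2} \\
&\xrightarrow{\; t \to \infty \;} \; 2\pi t \, |\langle n|H_{\rm int}|m\rangle|^2 \, \rho(E_m) \,.
\end{aligned}
\tag{4.150}
$$

[Figure 4.9: Graphical representation of $(1 - \cos(\omega t))/\omega^2$ appearing in $P_{m \to n}(t)$. The peak height is $t^2/2$ and the axis is marked at $\omega = \pm 2\pi/t$.]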

It follows that the probability for the transition per unit time for states around the same energy $E_m \approx E_n = E$ takes the form
$$
\dot{P}_{m \to n}(t) = 2\pi \, |\langle n|H_{\rm int}|m\rangle|^2 \, \rho(E) \,.
\tag{4.151}
$$
This result is known as Fermi's golden rule.
In the above derivation, we were rather careful with taking the limit $t \to \infty$. Suppose we were a little bit sloppier, and first chose to compute the amplitude for the initial state $|m\rangle$ at $t \to -\infty$ to evolve into the final state $|n\rangle$ at $t \to +\infty$. Then we would get
$$
-i \, \langle n| \int_{-\infty}^{\infty} dt' \, H_I(t') |m\rangle = -i \, \langle n|H_{\rm int}|m\rangle \, 2\pi \, \delta(\omega) \,.
\tag{4.152}
$$
Now when squaring the amplitude, we find $P_{m \to n}(t) = |\langle n|H_{\rm int}|m\rangle|^2 \, (2\pi)^2 \, [\delta(\omega)]^2$. Tracking through the previous computation, we realize that the extra infinity arises because $P_{m \to n}(t)$ is the probability for the transition to happen in infinite time. We thus can write the delta functions as $(2\pi)^2 \, [\delta(\omega)]^2 = 2\pi \, \delta(\omega) \, t$, where $t$ is a shorthand for $t \to \infty$. The reason that we have stressed this point is that the $T$-matrix element in (4.75) has been computed in the same way as (4.152), which means that we have to reinterpret the square of the delta function arising from $|\langle f|T|i\rangle|^2$ as a space-time volume factor.
Decay Rates
We would now like to calculate the probability for a single-particle initial state $|i\rangle$ of momentum $p_i$ and rest mass $m$ to decay into the final state $|f\rangle$ consisting of $n$ particles with total momentum $p_f = \sum_{j=1}^{n} q_j$. This quantity is given by the ratio
$$
P_n = \frac{|\langle f|S|i\rangle|^2}{\langle i|i\rangle \langle f|f\rangle} \,.
\tag{4.153}
$$
The states $|i\rangle$ and $|f\rangle$ obey the relativistic normalization formula (3.64),
$$
\langle i|i\rangle = (2\pi)^3 \, 2E_{\mathbf{p}_i} \, \delta^{(3)}(0) = 2E_{\mathbf{p}_i} V \,,
\tag{4.154}
$$
where we have replaced the delta function $(2\pi)^3 \delta^{(3)}(0)$ by the volume $V$ of the space. Similarly, one has for the final state
$$
\langle f|f\rangle = \prod_{j=1}^{n} 2E_{\mathbf{q}_j} V \,.
\tag{4.155}
$$

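If the initial-state particle is at rest, i.e., $E_{\mathbf{p}_i} = m$ and $\mathbf{p}_i = 0$, we get using (4.75) for the $i \to f$ decay probability
$$
\begin{aligned}
P_n &= \frac{1}{2mV} \prod_{j=1}^{n} \frac{1}{2E_{\mathbf{q}_j} V} \, \Big[ (2\pi)^4 \delta^{(4)}(p_f - p_i) \Big]^2 \, |A(i \to f)|^2 \\
&= \frac{1}{2mV} \, (2\pi)^4 \delta^{(4)}(p_f - p_i) \, |A(i \to f)|^2 \, V t \prod_{j=1}^{n} \frac{1}{2E_{\mathbf{q}_j} V} \,.
\end{aligned}
\tag{4.156}
$$
Notice that in order to arrive at the second line we have replaced one of the delta functions $(2\pi)^4 \delta^{(4)}(0)$ by the space-time volume $V t$.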
We can now divide out $t$ to get the transition probability per unit time. After integrating over all possible momenta of the final-state particles, i.e., $V \int d^3q_j/(2\pi)^3$, we then obtain in terms of the relativistically invariant $n$-body phase-space element$^{35}$
$$
d\Pi_n = (2\pi)^4 \delta^{(4)}(p_f - p_i) \prod_{j=1}^{n} \frac{d^3q_j}{(2\pi)^3} \frac{1}{2E_{\mathbf{q}_j}} \,,
\tag{4.157}
$$
the following expression for the partial decay width into the considered $n$-particle final state
$$
\Gamma_n = \frac{1}{2m} \int d\Pi_n \, |A(i \to f)|^2 \,.
\tag{4.158}
$$
Notice that the factors of the spatial volume $V$ in the measure $V \int d^3q_j/(2\pi)^3$ have cancelled those in (4.156), while the factors $1/(2E_{\mathbf{q}_j})$ in (4.156) have conspired with the 3-momentum integrals in $V \int d^3q_j/(2\pi)^3$ to produce Lorentz-invariant measures (3.61). In consequence, the density of final states (4.157) is a Lorentz-invariant quantity.
After summation over all possible $n$-particle final states, one finally finds the so-called total decay width
$$
\Gamma = \frac{1}{2m} \sum_n \int d\Pi_n \, |A(i \to f)|^2 \,,
\tag{4.159}
$$
with $d\Pi_n$ corresponding to a given final state. The total decay width is equal to the reciprocal of the mean lifetime $\tau = 1/\Gamma$ of the decaying particle. If the decaying particle is not at rest, the decay rate becomes $m\Gamma/E_{\mathbf{p}_i}$. This leads to an increased lifetime $\tau E_{\mathbf{p}_i}/m = \tau/\sqrt{1 - v^2} = \gamma\tau$, where $v$ is the velocity of the decaying particle. Of course, this is a well-known effect related to time dilation. E.g., taking the laboratory value of about $2.2\,\mu{\rm s}$ for the muon lifetime at rest, the lifetime of a cosmic-ray-produced muon traveling at 98% of the speed of light is about five times longer.
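The muon example can be reproduced with a few lines of arithmetic. In the sketch below the rest-frame lifetime of $2.2\,\mu{\rm s}$ is the rounded value quoted above, natural units with $c = 1$ are used, and the function names are illustrative.

```python
import math

MUON_LIFETIME_AT_REST_US = 2.2  # rounded rest-frame lifetime in microseconds


def lorentz_gamma(v):
    """Lorentz factor 1 / sqrt(1 - v^2) for a velocity v in units of c."""
    return 1.0 / math.sqrt(1.0 - v * v)


def moving_decay_rate(width, mass, energy):
    """Decay rate m * Gamma / E of a particle with energy E and mass m."""
    return mass * width / energy


def dilated_lifetime(tau_rest, v):
    """Lifetime gamma * tau observed for a particle moving with velocity v."""
    return lorentz_gamma(v) * tau_rest


if __name__ == "__main__":
    gamma = lorentz_gamma(0.98)
    print(gamma, dilated_lifetime(MUON_LIFETIME_AT_REST_US, 0.98))
```

For $v = 0.98$ the Lorentz factor is close to 5, in line with the statement that the lifetime is about five times longer.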
35 This object is in some textbooks denoted by $d{\rm PS}_n$.

In terms of the partial and total decay width, (4.158) and (4.159), the branching ratio (or branching fraction) for the $n$-particle decay $i \to f$ reads
$$
B(i \to f) = \frac{\Gamma_n}{\Gamma} \,.
\tag{4.160}
$$
Needless to say, $B(i \to f) \in [0, 1]$ and $\sum_f B(i \to f) = 1$.

Cross Sections
Consider now a beam of particles of type $B$ hitting a target at rest consisting of particles of type $A$. The case of two colliding particle beams like $e^+e^-$ (LEP), $p\bar{p}$ (Tevatron) or $pp$ (LHC) can be obtained from this by an appropriate Lorentz boost. Let's start by assuming constant densities $\rho_A$ and $\rho_B$ in the target and the beam over their whole extents $\ell_A$ and $\ell_B$. The number of scattering events will then be proportional to
$$
(\rho_A \ell_A) \, (\rho_B \ell_B) \, O \,,
\tag{4.161}
$$
where $O$ denotes the cross-sectional overlap area common to both the beam and the target. The experimental set-up is illustrated in Figure 4.10. The ratio
$$
\sigma = \frac{\#\,\text{scattering events}}{(O \rho_A \ell_A) \, (O \rho_B \ell_B)/O} = \frac{\#\,\text{scattering events}}{N_A N_B/O} \,,
\tag{4.162}
$$
defines the cross section $\sigma$ as the effective area of a chunk taken out of the beam by each particle in the target. The quantities $N_A$ and $N_B$ are the numbers of $A$ and $B$ particles that are relevant for scattering, i.e., the particles that at some point in time belong to the overlap between target and beam. Notice that all of this can be equally well formulated in terms of time-related quantities like the scattering rate and the incoming particle flux. Simply replace the number (#) of scattering events by the number of scattering events per second and $\ell_B \rho_B$ by the flux $v_B \rho_B$ of beam particles.
In reality $\rho_A$ and $\rho_B$ are not constant, since the colliding particles are described by wave packets and both target and beam have a density profile. However, the range of the interaction between the colliding particles is much smaller than the width of the individual wave packets perpendicular to the beam, which in turn is much smaller than the actual diameter of the beam. Therefore, to very good approximation $\rho_A$ and $\rho_B$ can be considered as locally constant on QM (i.e., interaction) length scales, whereas the density profiles inside the target and beam can be incorporated properly by averaging over the overlap region
$$
\ell_A \ell_B \int d^2x_\perp \, \rho_A(x_\perp) \, \rho_B(x_\perp) = N_A N_B/O \,.
\tag{4.163}
$$
Here $x_\perp$ is the spatial coordinate perpendicular to the beam. From this it follows that
$$
\#\,\text{scattering events} = \sigma \, N_A N_B/O \,,
\tag{4.164}
$$
where $\sigma$ can be calculated for effectively constant values of $\rho_A$ and $\rho_B$ corresponding to approximately plane-wave initial states. By the way, we do not have to restrict ourselves to the total number of scattering events. In a similar way we can study the cross section for scattering into the region $d^3q_1 \ldots d^3q_n$ around the $n$-particle final-state momentum point $\mathbf{q}_1, \ldots, \mathbf{q}_n$. This is actually what detectors usually do,$^{36}$ since they detect particles with energy and momentum in certain finite bins, which are given by the detector resolution. These bins cannot resolve the momentum spread of any of the wave packets, so in the final state we should use plane waves as well.

[Figure 4.10: Incident beam of particles with density $\rho_B$, extent $\ell_B$, and velocity $v_B$ hitting a target of density $\rho_A$ and extent $\ell_A$. The overlap area of the beam and target is denoted by $O$.]

Calculating cross sections therefore amounts to computing transition probabilities in momentum space. These transition probabilities are universal in the sense that they are independent of details of the experiment, like the properties of the beams, the targets or the preparation of the initial-state particles. Consider an initial state consisting of one target and one beam particle in the momentum state $|i\rangle = |p_A, p_B\rangle$ scattering into a final state $|f\rangle = |q_1, \ldots, q_n\rangle$. In analogy with the calculation that led to (4.159), the corresponding differential transition probability per unit time and flux is given by
$$
d\sigma = \frac{1}{F} \, \frac{1}{4E_{\mathbf{p}_A} E_{\mathbf{p}_B} V} \, d\Pi_n \, |A(i \to f)|^2 \,,
\tag{4.165}
$$

which is usually referred to as the differential cross section. In the latter expression $F$ stands for the flux associated with the incoming beam of particles. In the CM frame of the collision this flux reads
$$
F = \frac{|\mathbf{v}_{\rm rel}|}{V} = \frac{|\mathbf{v}_A - \mathbf{v}_B|}{V} = \frac{|\mathbf{p}_A/E_A - \mathbf{p}_B/E_B|}{V} = \frac{p_{\rm CM} E_{\rm CM}}{E_A E_B V} \,,
\tag{4.166}
$$
where $E_{\rm CM} = E_A + E_B$ is the total CM energy and $p_{\rm CM}$ is the momentum of either of the particles in the CM frame. To find this result we have used that the 4-momentum of a massive particle reads $p_0 = (E_0, \mathbf{p}_0) = (m, \mathbf{0})$ in its rest frame, and becomes $p = \big(\gamma (E_0 + \mathbf{v}\cdot\mathbf{p}_0), \gamma (\mathbf{p}_0 + E_0 \mathbf{v})\big) = m\gamma \, (1, \mathbf{v})$ upon a boost with velocity $\mathbf{v}$. In the CM frame we thus find the expression
$$
\sigma_{\rm CM} = \int \frac{d\Pi_n}{4 \, |E_A \mathbf{p}_B - E_B \mathbf{p}_A|} \, |A(i \to f)|^2 \,,
\tag{4.167}
$$
for the total cross section. The flux factor $\tfrac{1}{4} |E_A \mathbf{p}_B - E_B \mathbf{p}_A|^{-1}$ is not Lorentz invariant, but invariant under boosts along the beam direction, as expected for a cross-sectional area perpendicular to the beam.

36 Provided that the particle positions cannot be resolved at the level of the de Broglie wavelengths of the particles, which typically is the case in human-built detectors.
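The chain of equalities in (4.166) and the denominator of (4.167) can be checked numerically for collinear CM kinematics, $\mathbf{p}_A = -\mathbf{p}_B$. The sketch below works with the beam-axis components only; the function names and sample masses are illustrative assumptions.

```python
import math


def energy(mass, momentum):
    """Relativistic energy sqrt(m^2 + p^2) in natural units."""
    return math.sqrt(mass * mass + momentum * momentum)


def relative_velocity(m_a, m_b, p_cm):
    """|v_A - v_B| = |p_A / E_A - p_B / E_B| for p_A = +p_cm and p_B = -p_cm."""
    e_a, e_b = energy(m_a, p_cm), energy(m_b, p_cm)
    return abs(p_cm / e_a - (-p_cm) / e_b)


def relative_velocity_closed_form(m_a, m_b, p_cm):
    """p_CM * E_CM / (E_A * E_B), the last expression in (4.166) times V."""
    e_a, e_b = energy(m_a, p_cm), energy(m_b, p_cm)
    return p_cm * (e_a + e_b) / (e_a * e_b)


def flux_denominator(m_a, m_b, p_cm):
    """4 |E_A p_B - E_B p_A| appearing in the total cross section (4.167)."""
    e_a, e_b = energy(m_a, p_cm), energy(m_b, p_cm)
    return 4.0 * abs(e_a * (-p_cm) - e_b * p_cm)


if __name__ == "__main__":
    print(relative_velocity(0.5, 1.5, 2.0), relative_velocity_closed_form(0.5, 1.5, 2.0))
    print(flux_denominator(0.5, 1.5, 2.0))
```

In this frame the denominator of (4.167) reduces to $4 \, p_{\rm CM} E_{\rm CM}$, which is the form that is most convenient in practice.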
Notice that the expression for $d\sigma$ as given in (4.165) is also valid for identical particles in the final state. Finding a set of particles in the required momentum bin effectively identifies the particles. However, when integrating $d\sigma$ to obtain the total cross section $\sigma_{\rm CM}$ for the scattering into the $n$ particles one has to restrict this integration to inequivalent configurations. E.g., the total cross section for the $2 \to 2$ reaction $N\bar{N} \to MM$ in the scalar Yukawa theory is obtained as $\sigma_{\rm CM} = 1/2 \int d\sigma$.

4.12 Problems

i) Draw the Feynman diagrams that contribute to $N\bar{N} \to MM$ at $O(g^2)$ in the Yukawa theory. Write down the corresponding amplitude and express your result through the Mandelstam variables. Can the amplitude develop a pole?
Calculate the leading-order contribution (4.74) to $MM \to MM$ scattering in the limit of small external momenta, i.e., $p_1^2 \ll M^2$ etc. To do so, first relate the $d$-dimensional integral
$$
\int \frac{d^dl}{(2\pi)^d} \, \frac{1}{l^2 - M^2} = -\frac{i}{(4\pi)^{d/2}} \, \Gamma(1 - d/2) \, (M^2)^{d/2 - 1} \,,
\tag{4.168}
$$
to the Feynman integral appearing in (4.74), then set $d = 4 - 2\epsilon$, and finally take the limit $\epsilon \to 0$. Can you recover the qualitative behavior of your result in the effective theory obtained after integrating out the nucleon fields? Think in terms of higher-dimensional operators.
ii) Calculate the Born-level potential for $N\bar{N} \to N\bar{N}$ scattering in the Yukawa theory. Compare your result to (4.83). What does your finding tell you about the nature of scalar interactions?
Compute the leading-order potential for $\phi\phi \to \phi\phi$ scattering in $\phi^4$ theory. What is the physical meaning of your result?
iii) Consider the anharmonic oscillator with Hamiltonian
$$
H = \frac{1}{2} p^2 + \frac{1}{2} \omega^2 x^2 + \frac{\lambda}{3!} x^3 \,.
\tag{4.169}
$$
The goal of this exercise is to calculate matrix elements of the form
$$
\langle\Omega|x^n(0)|\Omega\rangle \,,
\tag{4.170}
$$
where $n = 1, 2, \ldots$ and $|\Omega\rangle$ denotes the ground state of the perturbed Hamiltonian (we write $x^n(0)$ rather than $x^n$ because the time associated with these operators is important) using Feynman diagrams and comparing the obtained results with those following from standard time-independent perturbation theory.
There are two types of vertices in the possible Feynman diagrams, namely external and internal vertices. External vertices have a single line and correspond to the $x(0)$ factors entering the matrix elements (4.170). They are labeled by the time the operators are evaluated, $t = 0$ in our case. Internal vertices have three lines, corresponding to the perturbation $\lambda/(3!) \, x^3$ in (4.169) and are labeled by a parameter $t$. Each internal vertex has a different parameter. The Feynman diagrams are constructed using the following Feynman rules:
$$
\begin{aligned}
[\text{line ending at time } 0] &= 1 \,, && \text{(external vertex)} \,, \\
[\text{three-line vertex at time } t] &= \frac{(-i\lambda)}{3!} \int dt \,, && \text{(internal vertex)} \,, \\
[\text{line from } s \text{ to } t] = D(s, t) &= \frac{1}{2\omega} \, e^{-i\omega|s - t|} \,, && \text{(propagator)} \,.
\end{aligned}
\tag{4.171}
$$
Here the limits of the $t$-integration are $\pm\infty(1 - i\epsilon)$. We need the $i\epsilon$ because without it, the Feynman integrals would not converge. Yet, the final results of the integrals turn out to be independent of $\epsilon$, so that in practice we could omit the $i\epsilon$ from our notation.
Draw the relevant Feynman diagrams that contribute at $O(1)$ and $O(\lambda)$ to the matrix elements (4.170) with $n = 1, 2, 3$. Determine the corresponding weight factors and calculate the graphs using ($a > 0$)
$$
\int dt \, e^{-ia|t|} = \frac{2}{ia} \,.
\tag{4.172}
$$
Try to reproduce your results using standard time-independent perturbation theory.
Proceed to compute the $O(\lambda^2)$ correction to $\langle\Omega|x^2(0)|\Omega\rangle$. Remember to employ (4.105). The Feynman integrals appearing in this case are of the form ($a, b, c > 0$)
$$
\int ds \int dt \, e^{-ia|s|} \, e^{-ib|t|} \, e^{-ic|s - t|} = -\left[ \frac{2}{(a + b)(b + c)} + \frac{2}{(a + b)(a + c)} + \frac{2}{(a + c)(b + c)} \right] .
\tag{4.173}
$$
This result can be easily derived from (4.172). If you want, verify that you get the same answer for the $O(\lambda^2)$ correction to $\langle\Omega|x^2(0)|\Omega\rangle$ using standard QM perturbation theory.
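The integrals (4.172) and (4.173) can be cross-checked numerically after a rotation to imaginary time, $t = -i\tau$. Each time integral then picks up a factor $-i$, so that (4.172) becomes $-i \int d\tau \, e^{-a|\tau|} = 2/(ia)$, while the double integral in (4.173) is $(-i)^2 = -1$ times its Euclidean counterpart. The sketch below compares the Euclidean double integral, evaluated with a simple midpoint rule, against the closed form. The function names, grid size and integration window are illustrative choices.

```python
import math


def euclidean_closed_form(a, b, c):
    """Closed form of the double integral of exp(-a|s| - b|t| - c|s - t|)."""
    return (2.0 / ((a + b) * (b + c))
            + 2.0 / ((a + b) * (a + c))
            + 2.0 / ((a + c) * (b + c)))


def euclidean_numeric(a, b, c, half_width=12.0, cells=480):
    """Midpoint-rule estimate on the square [-half_width, half_width]^2."""
    h = 2.0 * half_width / cells
    total = 0.0
    for i in range(cells):
        s = -half_width + (i + 0.5) * h
        for j in range(cells):
            t = -half_width + (j + 0.5) * h
            total += math.exp(-a * abs(s) - b * abs(t) - c * abs(s - t))
    return total * h * h


def real_time_value(a, b, c):
    """Value of the real-time integral: (-i)^2 times the Euclidean result."""
    return -euclidean_closed_form(a, b, c)


if __name__ == "__main__":
    print(euclidean_numeric(1.0, 1.0, 1.0), euclidean_closed_form(1.0, 1.0, 1.0))
    print(real_time_value(1.0, 1.0, 1.0))
```

For $a = b = c = 1$ the Euclidean integral equals $3/2$, so the real-time integral evaluates to $-3/2$.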
iv) Consider a Lorentz transformation $\Lambda$ that boosts $p_0 = (m_\lambda, \mathbf{0})$ to $p' = (E_{\mathbf{p}}(\lambda), \mathbf{p})$. Show that $|\lambda_{\mathbf{p}}\rangle = U(\Lambda)|\lambda_0\rangle$ satisfies $P^\mu |\lambda_{\mathbf{p}}\rangle = p'^\mu |\lambda_{\mathbf{p}}\rangle$, where $P^\mu = (H - E_0, \mathbf{P})$.
The ground state $|\Omega\rangle$ of any interacting theory has to be Poincaré invariant, since the vacuum ought not to have a preferred direction. Show that $\langle\Omega|\phi(x)|\Omega\rangle = v$ for any $\phi(x)$ if $\langle\Omega|\phi(0)|\Omega\rangle = v$.
Compute the spectral density function $\rho(s)$ and the field renormalization factor $Z$ of the free scalar theory by explicitly calculating $\langle 0|\phi(0)|p\rangle$.
v) Evaluate the leading singularity of the Fourier-transformed $(n + 2)$-point correlation function (4.126) arising from the first integration region in (4.127). You should find the result given in (4.131).
vi) In Section 4.10 we learnt that the two-point correlation function of the $\phi^4$ theory viewed as an analytic function of the momentum $p^2$ has a branch-cut singularity associated with multiparticle intermediate states. This finding should not come as a surprise to those familiar with non-relativistic scattering theory, where the amplitudes considered as a function of energy have branch cuts on the positive real axis. The imaginary part of the scattering amplitude appears as a discontinuity across this branch cut. By the optical theorem the imaginary part of the forward-scattering amplitude is then proportional to the total cross section. In this exercise we will derive the QFT version of the optical theorem for a $2 \to 2$ scattering process.
Derive an equation for the product $T^\dagger T$ involving the $T$-matrix starting from the unitarity of the $S$-matrix. What is the physical reason for the unitarity of the $S$-matrix?
Calculate the matrix element of this relation between the two-particle initial and final states $|p_1, p_2\rangle$ and $|q_1, q_2\rangle$. In order to compute the matrix element of $T^\dagger T$ insert a complete set of states $|k_i\rangle$ with $i = 1, \ldots, \infty$. Give a pictorial representation of the resulting identity.
Set $p_1 = q_1$ and $p_2 = q_2$ and relate the matrix element of $T^\dagger T$ to a total cross section. Use this result to derive the standard form of the optical theorem, i.e.,
$$
{\rm Im} \, A(p_1, p_2 \to p_1, p_2) = 2 E_{\rm CM} \, p_{\rm CM} \, \sigma(p_1, p_2 \to \text{anything}) \,.
\tag{4.174}
$$
Here $E_{\rm CM}$ is the total CM energy, $p_{\rm CM}$ is the momentum of either of the particles in the CM frame, and $\sigma(p_1, p_2 \to \text{anything})$ is the total cross section for the production of all final states.
vii) The generalized optical theorem
$$
2 \, {\rm Im} \, A(a \to b) = \sum_f \int d\Pi_f \, A(a \to f) \, A^*(b \to f)
\tag{4.175}
$$
is true not only for $S$-matrix elements, but for any amplitude that we can define in terms of Feynman diagrams. Here $a$ and $b$ denote asymptotic states, the sum over $f$ runs over all possible sets of final states, and $d\Pi_f$ is the corresponding phase-space element (4.157). In this exercise we will learn that the optical theorem can also be used to deal with unstable particles, which never appear in asymptotic states.
Recall that the exact two-point function of a scalar field takes the form (4.138). Use the diagrammatic master formula (4.145) to derive a relation between the amplitude $A(p \to p)$ describing $1 \to 1$ scattering and the quantity $-i\Sigma(p^2)$ that is the sum of all 1PI insertions into the boson propagator (4.137).
The latter relation can be used to study the imaginary part of $\Sigma(p^2)$. In order to do so, we change the definition of the physical mass of the $\phi$-particle from (4.139) into
$$
m^2 = m_0^2 + {\rm Re} \, \Sigma(m^2) \,.
\tag{4.176}
$$
Assume now that the full propagator (4.138) appears in the $s$ channel of a $2 \to 2$ Feynman diagram. Compute the cross section for the process in the vicinity of the resonance. Neglect all overall factors.
Compare your result to the relativistic Breit-Wigner formula for the cross section in the region of a resonance,
$$
\sigma \propto \left| \frac{1}{s - m^2 + i m \Gamma} \right|^2 .
\tag{4.177}
$$
Here $m$ is the mass of the resonance and $\Gamma$ is its width. Identify the width $\Gamma$ with the imaginary part of $\Sigma(p^2)$ assuming that the resonance is narrow, i.e., $\Gamma \ll m$. What does this mean for the lifetime of the particle? Calculate ${\rm Im} \, \Sigma(m^2)$, and hence $\Gamma$, using the optical theorem (4.175). You should recover the result (4.159).
viii) Derive the explicit form of the 2-body phase-space element $\int d\Pi_2$ from (4.157). Using your result calculate the angular differential cross section $(d\sigma/d\Omega)_{\rm CM}$ for a generic $2 \to 2$ process in the CM frame. The solid angle $d\Omega$ is given by $d\phi \, d\cos\theta$ where $\theta \in [0, \pi]$ is the polar scattering angle (with respect to the beam axis) and $\phi \in [0, 2\pi[$ the azimuthal scattering angle (around the beam axis). Consider also the case where the external particles all have the same mass. In this case you should obtain
$$
\left( \frac{d\sigma}{d\Omega} \right)_{\rm CM} = \frac{|A(p_1, p_2 \to q_1, q_2)|^2}{64\pi^2 s} \,,
\tag{4.178}
$$
where $A(p_1, p_2 \to q_1, q_2)$ represents the relevant scattering matrix element.
ix) The interactions of pions at low energy can be described by a phenomenological model called the linear sigma model,
$$ H = \int d^3x \left[ \frac{1}{2} \Pi_i^2 + \frac{1}{2} (\nabla \phi_i)^2 + V(\phi^2) \right] . \tag{4.179} $$
Here $\phi_i$ with $i = 1, \ldots, N$ are real scalar fields and $\Pi_i$ denotes the conjugate momentum derived from $\phi_i$. The potential is given by
$$ V(\phi^2) = \frac{1}{2} m^2 (\phi_i)^2 + \frac{\lambda}{4} (\phi_i^2)^2 . \tag{4.180} $$
Note that for $m^2 > 0$ and $\lambda = 0$ the above Hamiltonian just consists of $N$ copies of the Klein-Gordon Hamiltonian. If one now assumes $\lambda$ to be a small perturbation, one can calculate scattering amplitudes in a series expansion in $\lambda$.
Show that the propagator of the $\phi_i$ fields is
$$ \overbracket{\phi_i(x) \phi_j(y)} = \delta_{ij} D_F(x - y) \,, \tag{4.181} $$
where $D_F(x - y)$ is the standard Klein-Gordon propagator with mass $m$. Show furthermore that there is one type of vertex given by
$$ [\text{four-point vertex with legs } i, j, k, l] = -2 i \lambda \, (\delta_{ij} \delta_{kl} + \delta_{il} \delta_{jk} + \delta_{ik} \delta_{jl}) \,. \tag{4.182} $$
A vertex involving two $\phi_1$ and two $\phi_2$ thus has the value $-2i\lambda$, while a vertex where four fields of the same type attach receives a factor $-6i\lambda$.
Compute the cross sections for $\phi_1 \phi_2 \to \phi_1 \phi_2$, $\phi_1 \phi_1 \to \phi_2 \phi_2$, and $\phi_1 \phi_1 \to \phi_1 \phi_1$ scattering to first order in $\lambda$. Work in the CM frame.
Now consider the case $m^2 < 0$. In this case, the potential has a local maximum rather than a minimum at $\phi_i = 0$. Since the potential is symmetric under $SO(N)$ rotations of $\phi = (\phi_1, \ldots, \phi_N)$, we can choose to write the fields close to the new minimum as
$$ \phi(x) = \big( \pi_1(x), \ldots, \pi_{N-1}(x), v + \sigma(x) \big)^T , \tag{4.183} $$
where $v$ is a constant chosen to minimize the potential (4.180), $\sigma(x)$ is a small deviation, and $\pi_i(x)$ denote the remaining fields, called pions. Show that with such a potential we have a theory of one massive sigma field and $(N - 1)$ massless pion fields, interacting through cubic and quartic potential terms. Assign Feynman rules to the propagators
$$ \overbracket{\sigma(x) \sigma(y)} = [\text{diagram: sigma propagator line}] \,, \qquad \overbracket{\pi_i(x) \pi_j(y)} = [\text{diagram: pion propagator line joining } i \text{ and } j] \,, \tag{4.184} $$
and the vertices
$$ [\text{diagrams: cubic and quartic vertices of the sigma and pion fields}] \,. \tag{4.185} $$

References
[1] J. Schwinger, Renormalization Theory of Quantum Electrodynamics: An Individual
View, in The Birth of Particle Physics, Cambridge University Press (1983), 329 p.
[2] H. Yukawa, Proc. Phys. Math. Soc. Jap. 17, 48 (1935).
[3] H. Lehmann, K. Symanzik and W. Zimmermann, Nuovo Cim. 1, 205 (1955).


5 Dirac Theory

We have seen that quantization of scalar fields gives rise to spin-zero particles. But most
particles in nature have an intrinsic angular momentum or spin. These arise naturally in field
theory by considering fields which themselves transform non-trivially under the Lorentz group.
In this section we will describe the Dirac equation, whose quantization gives rise to fermionic
spin-1/2 particles. In order to motivate the Dirac equation, we will start by studying the
appropriate representation of the Lorentz group.
We already know from Section 2.4 that if one considers infinitesimal Lorentz transformations (2.61), the matrices $\omega^{\mu\nu}$ (2.63) entering the transformations have to be antisymmetric. Such an object has six independent parameters, which agrees with the number of transformations of the Lorentz group, i.e., three rotations and three boosts. In the following it will turn out to be useful to introduce a basis of these six $4 \times 4$ antisymmetric matrices. We call our matrices $(M^{\rho\sigma})^{\mu\nu}$ with $\rho, \sigma, \mu, \nu = 0, 1, 2, 3$ and write the basis of six matrices as
$$ (M^{\rho\sigma})^{\mu\nu} = \eta^{\rho\mu} \eta^{\sigma\nu} - \eta^{\sigma\mu} \eta^{\rho\nu} \,, \tag{5.1} $$
where the indices $\rho$ and $\sigma$ denote which basis element we are dealing with, while $\mu$ and $\nu$ belong to the $4 \times 4$ matrices. Notice that $(M^{\rho\sigma})^{\mu\nu}$ is antisymmetric in both $\rho, \sigma$ and $\mu, \nu$. If we use these matrices in practical applications (e.g., if we want to multiply them together or act on some field) we will typically need to lower one index,
$$ (M^{\rho\sigma})^{\mu}{}_{\nu} = \eta^{\rho\mu} \delta^{\sigma}{}_{\nu} - \eta^{\sigma\mu} \delta^{\rho}{}_{\nu} \,. \tag{5.2} $$
Since we lowered the index with the Minkowski metric, we pick up various minus signs, which means that when written in this form, the matrices are no longer necessarily antisymmetric. E.g., one has

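$$ (M^{01})^{\mu}{}_{\nu} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \qquad (M^{12})^{\mu}{}_{\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{5.3} $$
The matrix $M^{01}$, which is real and symmetric, generates boosts in the x direction, while $M^{12}$ is real and antisymmetric and generates rotations in the xy plane. In terms of $M^{\rho\sigma}$, we can now write any infinitesimal $\omega^{\mu}{}_{\nu}$ as
$$ \omega^{\mu}{}_{\nu} = \frac{1}{2} \, \Omega_{\rho\sigma} (M^{\rho\sigma})^{\mu}{}_{\nu} \,, \tag{5.4} $$
where the matrix $\Omega_{\rho\sigma}$ contains six numbers and is antisymmetric in $\rho$ and $\sigma$. This matrix parametrizes the Lorentz transformation we are doing. The basis of the six matrices $M^{\rho\sigma}$ forms the generators of the Lorentz transformations. The generators obey the Lorentz Lie algebra relations,
$$ [M^{\rho\sigma}, M^{\tau\nu}] = \eta^{\sigma\tau} M^{\rho\nu} - \eta^{\rho\tau} M^{\sigma\nu} + \eta^{\rho\nu} M^{\sigma\tau} - \eta^{\sigma\nu} M^{\rho\tau} \,. \tag{5.5} $$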

Here the matrix indices have been suppressed. A finite Lorentz transformation can be constructed from (5.4) by building the exponential
$$ \Lambda = \exp \left( \frac{1}{2} \, \Omega_{\rho\sigma} M^{\rho\sigma} \right) . \tag{5.6} $$
Let me stress again what each of these objects is: the $M^{\rho\sigma}$ are six $4 \times 4$ basis elements of the Lorentz Lie algebra, while the $\Omega_{\rho\sigma}$ are six numbers telling us what kind of Lorentz transformation we are doing.
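The algebra (5.5) can be checked by brute force. The following is a minimal sketch in plain Python, assuming the metric signature (+,-,-,-) and the mixed-index form (5.2) of the generators; the function names (`generator`, `algebra_rhs`, `lorentz_algebra_holds`) are my own and purely illustrative.

```python
from itertools import product

# Minkowski metric with signature (+,-,-,-); an assumption of this sketch.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def generator(rho, sigma):
    """(M^{rho sigma})^mu_nu = eta^{rho mu} delta^sigma_nu - eta^{sigma mu} delta^rho_nu."""
    return [[ETA[rho][mu] * (sigma == nu) - ETA[sigma][mu] * (rho == nu)
             for nu in range(4)] for mu in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def combine(terms):
    """Linear combination sum_k c_k * m_k of 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)] for i in range(4)]

def commutator(a, b):
    return combine([(1, matmul(a, b)), (-1, matmul(b, a))])

def algebra_rhs(rho, sigma, tau, nu):
    """Right-hand side of the Lorentz Lie algebra relation."""
    return combine([(ETA[sigma][tau], generator(rho, nu)),
                    (-ETA[rho][tau], generator(sigma, nu)),
                    (ETA[rho][nu], generator(sigma, tau)),
                    (-ETA[sigma][nu], generator(rho, tau))])

def lorentz_algebra_holds():
    """Check the commutation relations for all index combinations (exact integers)."""
    return all(commutator(generator(r, s), generator(t, n)) == algebra_rhs(r, s, t, n)
               for r, s, t, n in product(range(4), repeat=4))
```

Calling `generator(0, 1)` and `generator(1, 2)` reproduces the boost and rotation generators of (5.3), and `lorentz_algebra_holds()` loops over all 256 index combinations.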

5.1 Spinor Representation

We now want to find other matrices which satisfy the Lorentz algebra commutation relations
(5.5). In the following, we will construct the spinor representation of the Lorentz group using
a trick due to Dirac. We start by defining something which, at first sight, has nothing to do
with the Lorentz group. It is the Clifford algebra (or Dirac algebra)
$$ \{ \gamma^\mu, \gamma^\nu \} = 2 \eta^{\mu\nu} \, \mathbb{1} \,, \tag{5.7} $$
where $\{a, b\} = ab + ba$ is the usual anticommutator, $\gamma^\mu$ denotes a set of four matrices (the Dirac matrices), and $\mathbb{1}$ is the $n \times n$ unit matrix with $n$ being the dimensionality of the representation. The relation (5.7) implies that we have to look for matrices that satisfy $\gamma^\mu \gamma^\nu = -\gamma^\nu \gamma^\mu$ when $\mu \neq \nu$,
$$ (\gamma^0)^2 = \mathbb{1} \,, \tag{5.8} $$
and
$$ (\gamma^i)^2 = -\mathbb{1} \,, \tag{5.9} $$
for $i = 1, 2, 3$. It is not difficult to convince oneself that the simplest representation of the Clifford algebra (for four-dimensional Minkowski space) is in terms of $4 \times 4$ matrices. In fact, there are many $4 \times 4$ matrices that obey (5.7).$^{37}$ E.g., we may take the so-called Weyl or chiral representation,
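$$ \gamma^0 = \begin{pmatrix} 0 & \mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix} , \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} , \tag{5.10} $$
where each element is a $2 \times 2$ matrix itself and $\sigma^i$ denotes the Pauli matrices
$$ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{5.11} $$
The latter matrices themselves satisfy $\{ \sigma^i, \sigma^j \} = 2 \delta^{ij}$. Using these properties one easily shows that (5.10) indeed satisfies (5.7). This is left as a homework problem.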
37 One can construct any other representation of the Clifford algebra from a specific one by taking $\gamma^\mu \to M \gamma^\mu M^{-1}$ for any invertible matrix $M$. However, up to this equivalence, it turns out that there is a unique irreducible representation of the Clifford algebra, and the matrices (5.10) provide an example.

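So what is the connection between the Clifford algebra and the Lorentz group? In order to answer the question, we consider the commutator of two Dirac matrices $\gamma^\mu$,
$$ S^{\rho\sigma} = \frac{1}{4} \, [\gamma^\rho, \gamma^\sigma] \,. \tag{5.12} $$
In our representation, the $0i$ and $ij$ components of $S^{\rho\sigma}$ are given explicitly by
$$ S^{0i} = \frac{1}{2} \begin{pmatrix} -\sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix} , \tag{5.13} $$
and
$$ S^{ij} = -\frac{i}{2} \, \epsilon^{ijk} \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix} . \tag{5.14} $$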

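It is straightforward to show, and thus part of an exercise, that these matrices (irrespective of the chosen representation) satisfy
$$ [S^{\mu\nu}, \gamma^\rho] = \gamma^\mu \eta^{\nu\rho} - \gamma^\nu \eta^{\rho\mu} \,, \tag{5.15} $$
and
$$ [S^{\rho\sigma}, S^{\tau\nu}] = \eta^{\sigma\tau} S^{\rho\nu} - \eta^{\rho\tau} S^{\sigma\nu} + \eta^{\rho\nu} S^{\sigma\tau} - \eta^{\sigma\nu} S^{\rho\tau} \,. \tag{5.16} $$
The latter equality tells us that the matrices $S^{\rho\sigma}$ form a representation of the Lorentz algebra (5.5). We now also understand the physical meaning of (5.13) and (5.14). The former object induces a Lorentz boost, while the latter generates a three-dimensional rotation.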
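Both claims of this subsection, that the chiral matrices obey the Clifford algebra and that the commutators $S^{\rho\sigma}$ obey the Lorentz algebra, can be verified numerically. The sketch below is only illustrative: it assumes the metric signature (+,-,-,-) and a chiral representation with $\gamma^i$ having $\sigma^i$ in the upper-right block, but the two checks do not depend on that sign choice. All helper names are hypothetical.

```python
from itertools import product

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]  # metric (+,-,-,-)
ONE2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble the 4x4 matrix ((a, b), (c, d)) from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def combine(terms):
    """Linear combination sum_k c_k * m_k of 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

IDENTITY = block(ONE2, ZERO2, ZERO2, ONE2)
# Assumed chiral representation: gamma^0 = ((0, 1), (1, 0)), gamma^i = ((0, s_i), (-s_i, 0)).
GAMMA = [block(ZERO2, ONE2, ONE2, ZERO2)] + [
    block(ZERO2, s, [[-x for x in row] for row in s], ZERO2) for s in PAULI]

def spin_generator(rho, sigma):
    """S^{rho sigma} = [gamma^rho, gamma^sigma] / 4."""
    return combine([(0.25, matmul(GAMMA[rho], GAMMA[sigma])),
                    (-0.25, matmul(GAMMA[sigma], GAMMA[rho]))])

def clifford_holds():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} for all mu, nu."""
    return all(close(combine([(1, matmul(GAMMA[m], GAMMA[n])),
                              (1, matmul(GAMMA[n], GAMMA[m]))]),
                     combine([(2 * ETA[m][n], IDENTITY)]))
               for m, n in product(range(4), repeat=2))

def spinor_lorentz_algebra_holds():
    """Check that the S^{rho sigma} satisfy the same algebra as the vector generators."""
    S = spin_generator
    def commutator(a, b):
        return combine([(1, matmul(a, b)), (-1, matmul(b, a))])
    return all(close(commutator(S(r, s), S(t, n)),
                     combine([(ETA[s][t], S(r, n)), (-ETA[r][t], S(s, n)),
                              (ETA[r][n], S(s, t)), (-ETA[s][n], S(r, t))]))
               for r, s, t, n in product(range(4), repeat=4))
```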
Dirac Spinors
The $S^{\rho\sigma}$ are $4 \times 4$ matrices, because the $\gamma^\mu$ are. So far we haven't given an index to the rows and columns of these matrices. Let's call the indices $\alpha$ and $\beta$. We furthermore need a field that the $(S^{\rho\sigma})^{\alpha}{}_{\beta}$ act upon. The sought field has to have four complex components labelled by $\alpha$ and we call it $\psi^\alpha(x)$. This object is the famous Dirac spinor. Under Lorentz transformations, we have
$$ \psi^\alpha(x) \to S(\Lambda)^{\alpha}{}_{\beta} \, \psi^\beta(x') \,, \tag{5.17} $$
where $x' = \Lambda^{-1} x$ and the full Lorentz transformation $S(\Lambda)$ takes the form
$$ S(\Lambda) = \exp \left( \frac{1}{2} \, \Omega_{\rho\sigma} S^{\rho\sigma} \right) , \tag{5.18} $$
and the expression for $\Lambda$ is given in (5.6). Although the bases of generators $S^{\rho\sigma}$ and $M^{\rho\sigma}$ are different, we use the same six numbers $\Omega_{\rho\sigma}$ in both $S(\Lambda)$ and $\Lambda$. This ensures that we are doing the same Lorentz transformation on $\psi$ and $x$.
Both $S(\Lambda)$ and $\Lambda$ are $4 \times 4$ matrices. So how can we be sure that the spinor representation (5.18) is something new, and isn't equivalent to the familiar vector representation (2.60)?


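In order to convince ourselves that the two representations are truly different, we look at rotations. If we write the rotation parameters as $\Omega_{ij} = -\epsilon_{ijk} \varphi^k$, then (5.6) and (5.18) become
$$ \Lambda = \exp \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & \varphi^3 & -\varphi^2 \\ 0 & -\varphi^3 & 0 & \varphi^1 \\ 0 & \varphi^2 & -\varphi^1 & 0 \end{pmatrix} , \qquad S(\Lambda) = \begin{pmatrix} e^{i \vec\varphi \cdot \vec\sigma / 2} & 0 \\ 0 & e^{i \vec\varphi \cdot \vec\sigma / 2} \end{pmatrix} , \tag{5.19} $$
where in order to arrive at the right-hand sides one has to remember that $\Omega_{12} = -\Omega_{21} = -\varphi^3$, etc. We now consider a rotation by $2\pi$ around the z-axis, which means taking $\vec\varphi = (0, 0, 2\pi)$. It follows that
$$ \Lambda = \exp \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 2\pi & 0 \\ 0 & -2\pi & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = \mathbb{1} \,, \qquad S(\Lambda) = \begin{pmatrix} e^{i \pi \sigma^3} & 0 \\ 0 & e^{i \pi \sigma^3} \end{pmatrix} = -\mathbb{1} \,. \tag{5.20} $$
This implies that under a $2\pi$ rotation a vector and a spinor transform as follows
$$ A^\mu(x) \to A^\mu(x) \,, \qquad \psi(x) \to -\psi(x) \,. \tag{5.21} $$
The latter relation tells us that spinors have the unintuitive property that a $2\pi$ rotation does not return them to their initial state, but a $4\pi$ rotation does. So $S(\Lambda)$ definitely differs from the vector representation $\Lambda$.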
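The sign flip of a spinor under a $2\pi$ rotation is easy to see numerically. Here is a small sketch, assuming the rotation generators in the xy plane are $M^{12}$ in mixed-index form and $S^{12} = -\frac{i}{2}\,\mathrm{diag}(\sigma^3, \sigma^3)$, together with the parametrization $\Omega_{12} = -\varphi^3$; the matrix exponential is a plain truncated Taylor series and the helper names are my own.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[complex(i == j) for j in range(n)] for i in range(n)]

def expm(a, terms=80):
    """Matrix exponential via a truncated Taylor series (adequate for these small matrices)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

# Generators of rotations in the xy plane (assumed conventions, see lead-in).
M12 = [[0, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
S12 = [[-0.5j * s if i == j else 0 for j in range(4)] for i, s in enumerate((1, -1, 1, -1))]

def rotate_about_z(phi):
    """Return (Lambda, S(Lambda)) for a rotation by the angle phi around the z-axis."""
    omega12 = -phi  # Omega_12 = -phi^3; the exponent is Omega_12 times the generator
    lam = expm([[omega12 * x for x in row] for row in M12])
    spin = expm([[omega12 * x for x in row] for row in S12])
    return lam, spin

def max_deviation(a, b):
    """Largest absolute entry-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

For `phi = 2 * math.pi` the vector matrix comes back to the identity while the spinor matrix comes back to minus the identity; only at `phi = 4 * math.pi` do both return to the identity.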
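For later convenience let me also give explicitly the analogs of (5.19) for Lorentz boosts. Writing the boost parameter as $\Omega_{i0} = -\Omega_{0i} = \chi_i$, one finds
$$ \Lambda = \exp \begin{pmatrix} 0 & -\chi_1 & -\chi_2 & -\chi_3 \\ -\chi_1 & 0 & 0 & 0 \\ -\chi_2 & 0 & 0 & 0 \\ -\chi_3 & 0 & 0 & 0 \end{pmatrix} , \qquad S(\Lambda) = \begin{pmatrix} e^{\vec\chi \cdot \vec\sigma / 2} & 0 \\ 0 & e^{-\vec\chi \cdot \vec\sigma / 2} \end{pmatrix} . \tag{5.22} $$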
Another important question to ask is whether or not $S(\Lambda)$ is a unitary representation of the Lorentz group.$^{38}$ From (5.18), we infer that $S(\Lambda)$ is unitary if $S^{\rho\sigma}$ is anti-hermitian, i.e., $(S^{\rho\sigma})^\dagger = -S^{\rho\sigma}$. But we have
$$ (S^{\rho\sigma})^\dagger = -\frac{1}{4} \, [(\gamma^\rho)^\dagger, (\gamma^\sigma)^\dagger] \,, \tag{5.23} $$
which can be anti-hermitian if all $\gamma^\mu$ are hermitian or all are anti-hermitian. However, we can never arrange for this to happen since (5.8) and (5.9) imply that $S^{\rho\sigma}$ has both real and imaginary eigenvalues, and an anti-hermitian matrix ought only to have imaginary ones. E.g., in the Weyl representation (5.10), we have the property
$$ (\gamma^0)^\dagger = \gamma^0 \,, \qquad (\gamma^i)^\dagger = -\gamma^i \,. \tag{5.24} $$
In fact, the Lorentz group, being non-compact, has no non-trivial finite-dimensional representations that are unitary. But this does not matter to us, since our spinor $\psi$ is not a QM wavefunction, but a classical field.
38 Notice that using the Weyl representation the relations (5.13) and (5.14) already tell us explicitly that rotations are unitary while boosts are not. This observation is also true for the vector representation.
Dirac Action
With the new field $\psi$ at hand we now want to construct Lorentz-invariant EOMs involving it. In order to do this we try to write down a Lorentz-invariant action that is bi-linear in $\psi$. We consider the product
$$ \psi^\dagger(x) \psi(x) = (\psi^*)^T(x) \, \psi(x) \,, \tag{5.25} $$
where $\psi^\dagger(x)$ is the usual adjoint of a multi-component object. Under a Lorentz transformation $\Lambda$, one has
$$ \psi^\dagger(x) \psi(x) \to \psi^\dagger(x') \, S^\dagger(\Lambda) S(\Lambda) \, \psi(x') \,, \tag{5.26} $$
which is not Lorentz invariant since $S(\Lambda)$ is not unitary, i.e., $S^\dagger(\Lambda) S(\Lambda) \neq \mathbb{1}$. This means that $\psi^\dagger \psi$ is not a Lorentz scalar and thus not the right building block for constructing the action.
Yet, it is easy to see what went wrong and to correct for it. From (5.24) we find that for $\mu = 0, 1, 2, 3$, one has
$$ (\gamma^\mu)^\dagger = \gamma^0 \gamma^\mu \gamma^0 \,, \tag{5.27} $$
which in turn implies that
$$ (S^{\rho\sigma})^\dagger = -\gamma^0 S^{\rho\sigma} \gamma^0 \,, \tag{5.28} $$
and
$$ S^\dagger(\Lambda) = \gamma^0 S^{-1}(\Lambda) \gamma^0 \,. \tag{5.29} $$
This suggests that instead of $\psi^\dagger$ we should better use
$$ \bar\psi(x) = \psi^\dagger(x) \gamma^0 \,, \tag{5.30} $$
as a building block in our Dirac action. This object is called the adjoint Dirac spinor.
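To see concretely why the adjoint spinor is the right object, one can build $S(\Lambda)$ for a boost and compare $S^\dagger S$ with $S^\dagger \gamma^0 S$. The sketch below assumes explicit chiral matrices for $\gamma^0$ and $\gamma^1$ (with $\sigma^1$ in the upper-right block of $\gamma^1$), a boost along x generated by $S^{01} = \frac{1}{2} \gamma^0 \gamma^1$, and an arbitrary rapidity; the outcome of the two checks does not depend on the sign conventions chosen. Helper names are illustrative only.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def dagger(a):
    """Hermitian conjugate of a 4x4 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def identity():
    return [[complex(i == j) for j in range(4)] for i in range(4)]

def expm(a, terms=40):
    """Matrix exponential via a truncated Taylor series."""
    result, term = identity(), identity()
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

# Assumed chiral representation of gamma^0 and gamma^1.
GAMMA0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
S01 = [[0.5 * x for x in row] for row in matmul(GAMMA0, GAMMA1)]  # S^{01} = gamma^0 gamma^1 / 2

def boost_spinor(chi):
    """S(Lambda) for a boost along x with rapidity chi (overall sign convention is irrelevant here)."""
    return expm([[chi * x for x in row] for row in S01])

def max_deviation(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def unitarity_defect(chi):
    """How far S^dagger S is from the identity; nonzero for boosts."""
    s = boost_spinor(chi)
    return max_deviation(matmul(dagger(s), s), identity())

def adjoint_defect(chi):
    """How far S^dagger gamma^0 S is from gamma^0; vanishes, so psi-bar psi is invariant."""
    s = boost_spinor(chi)
    return max_deviation(matmul(dagger(s), matmul(GAMMA0, s)), GAMMA0)
```

For any nonzero rapidity `unitarity_defect` is of order one, while `adjoint_defect` vanishes up to rounding, which is the numerical content of (5.29).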
Equipped with $\psi$ and $\bar\psi$ let us now see what kind of Lorentz covariant objects we can form out of them. We first consider $\bar\psi \psi$. It is a simple exercise to show that this object transforms under a Lorentz transformation as
$$ \bar\psi(x) \psi(x) \to \bar\psi(x') \psi(x') \,, \tag{5.31} $$
which tells us that it is a Lorentz scalar. A term $\int d^4x \, \bar\psi(x) \psi(x)$ is thus Lorentz invariant since $\det(\Lambda) = 1$. Next we consider $\bar\psi \gamma^\mu \psi$. This term has the following transformation property
$$ \bar\psi(x) \gamma^\mu \psi(x) \to \Lambda^{\mu}{}_{\nu} \, \bar\psi(x') \gamma^\nu \psi(x') \,, \tag{5.32} $$
under Lorentz transformations. This claim is proven as part of an exercise. From (5.32) we infer that $\bar\psi \gamma^\mu \psi$ is a Lorentz vector. This means that we can treat the index $\mu$ on the $\gamma^\mu$ matrices as a true vector index. In particular, we can form Lorentz scalars by contracting it with other Lorentz indices. As a result terms like $\int d^4x \, \bar\psi(x) \slashed{A}(x) \psi(x)$ and $\int d^4x \, \bar\psi(x) \slashed{\partial} \psi(x)$ are Lorentz invariant. Here we have introduced the shorthand notation $\slashed{a} = \gamma_\mu a^\mu$ for any contravariant vector $a^\mu$. Finally, we consider $\bar\psi \sigma^{\mu\nu} \psi$ with $\sigma^{\mu\nu} = i/2 \, [\gamma^\mu, \gamma^\nu]$. Not surprisingly this object behaves like a Lorentz tensor
$$ \bar\psi(x) \sigma^{\mu\nu} \psi(x) \to \Lambda^{\mu}{}_{\rho} \Lambda^{\nu}{}_{\sigma} \, \bar\psi(x') \sigma^{\rho\sigma} \psi(x') \,. \tag{5.33} $$
This result is again easy to derive by considering separately the properties of $\bar\psi$, $\sigma^{\mu\nu}$, and $\psi$ under Lorentz transformations. From (5.33) it follows that terms like $\int d^4x \, \bar\psi(x) \sigma^{\mu\nu} \psi(x) F_{\mu\nu}(x)$, where all indices are contracted, are Lorentz invariant.

We are now equipped with the three bi-linears $\bar\psi \psi$, $\bar\psi \gamma^\mu \psi$, and $\bar\psi \sigma^{\mu\nu} \psi$, each of which transforms covariantly under the Lorentz group. We can try to build a Lorentz-invariant action from these. In fact, we need only the first two terms. We write
$$ S = \int d^4x \, \bar\psi(x) \left( i \slashed{\partial} - m \right) \psi(x) \,. \tag{5.34} $$
This is the Dirac action we were looking for. Since $[S] = 0$, $[d^4x] = -4$, $[\partial_\mu] = 1$, and $[m] = 1$, we can read off the mass dimension of the spinor field and its adjoint. We have $[\psi] = [\bar\psi] = 3/2$. The factor of $i$ is there to make the action (5.34) real. Upon complex conjugation, it cancels a minus sign that comes from integration by parts. As we will see soon, after quantization the Dirac theory describes particles and antiparticles of mass $|m|$ and spin 1/2. Notice that the Lagrangian is of first order, rather than the second-order Lagrangians we were working with for scalar fields. Also, the mass parameter appears in the Lagrangian as $m$, which can be positive or negative.
Dirac Equation
The EOMs for $\psi$ and $\bar\psi$ follow from (5.34) by varying independently with respect to $\bar\psi$ and $\psi$, respectively. In the first case, we obtain$^{39}$
$$ (i \slashed{\partial} - m) \, \psi = 0 \,. \tag{5.35} $$
This is the Dirac equation. In the second case it follows that
$$ \bar\psi \, (i \overleftarrow{\slashed{\partial}} + m) = 0 \,, \tag{5.36} $$
which is the hermitian-conjugate form of (5.35). Here the derivative acts to the left. Both (5.35) and (5.36) are first order in derivatives, yet miraculously Lorentz invariant. As a homework assignment you are asked to show this explicitly. In contrast, in the case of a scalar field a first-order EOM would necessarily break Lorentz invariance, because one would always need to introduce a privileged vector that saturates the open index of $\partial_\mu$. The matrices $\gamma^\mu$ provide this index in the case of the Dirac equation.
It is also important to realize that the Dirac equation mixes up different components of $\psi$ through $\gamma^\mu$. However, each individual component itself solves the Klein-Gordon equation (2.46). In order to see this, we compute
$$ 0 = (i \gamma^\mu \partial_\mu + m)(i \gamma^\nu \partial_\nu - m) \, \psi = -\left( \gamma^\mu \gamma^\nu \partial_\mu \partial_\nu + m^2 \right) \psi = -\left( \tfrac{1}{2} \{ \gamma^\mu, \gamma^\nu \} \, \partial_\mu \partial_\nu + m^2 \right) \psi = -\left( \partial_\mu \partial^\mu + m^2 \right) \psi \,. \tag{5.37} $$
The final expression contains no $\gamma^\mu$ matrices, and so applies to each component $\psi^\alpha$ of the spinor field separately.
39 Hereafter we will often drop the coordinate $x$ in $\psi(x)$ etc.
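The step from the Dirac to the Klein-Gordon equation has a simple momentum-space counterpart: for a plane wave $\psi \propto e^{-ip \cdot x}$ one has $i \partial_\mu \to p_\mu$, so the operator product in (5.37) becomes $(\slashed{p} + m)(\slashed{p} - m) = (p^2 - m^2) \, \mathbb{1}$. The following sketch checks this identity for arbitrary numbers; it assumes the metric (+,-,-,-) and a chiral representation of the Dirac matrices, although the result holds in any representation. The momentum and mass values are made up for illustration.

```python
ONE2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble the 4x4 matrix ((a, b), (c, d)) from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def combine(terms):
    """Linear combination sum_k c_k * m_k of 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

IDENTITY = block(ONE2, ZERO2, ZERO2, ONE2)
# Assumed chiral representation of the Dirac matrices.
GAMMA = [block(ZERO2, ONE2, ONE2, ZERO2)] + [
    block(ZERO2, s, [[-x for x in row] for row in s], ZERO2) for s in PAULI]

def slash(p):
    """p-slash = gamma^mu p_mu for contravariant components p = (E, px, py, pz)."""
    lowered = [p[0], -p[1], -p[2], -p[3]]
    return combine([(lowered[mu], GAMMA[mu]) for mu in range(4)])

def minkowski_square(p):
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def dirac_product(p, m):
    """(p-slash + m)(p-slash - m), expected to equal (p^2 - m^2) times the unit matrix."""
    ps = slash(p)
    return matmul(combine([(1, ps), (m, IDENTITY)]), combine([(1, ps), (-m, IDENTITY)]))

def is_multiple_of_identity(a, value, tol=1e-12):
    return all(abs(a[i][j] - (value if i == j else 0)) < tol
               for i in range(4) for j in range(4))
```

On the mass shell, $p^2 = m^2$, the product vanishes, so every component of a solution of the Dirac equation automatically satisfies the Klein-Gordon equation.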
Chiral Spinors
We have seen that in the chiral representation (5.10) both the spinor rotations (5.19) and boosts (5.22) are block diagonal. This means that the Dirac representation is reducible. It decomposes into two irreducible representations, acting only on two-component spinors $\psi_{L,R}$, which in the chiral representation are defined by
!
R
=
.
(5.38)
L
The two-component objects $\psi_{L,R}$ are called chiral spinors and the labels L, R stand for left- and right-handed chirality. They transform in the same way under rotations, but oppositely under boosts:
L,R ei/2 L,R ,
L,R e1/2 L,R ,
(5.39)
In group theory language $\psi_L$ is in the $(1/2, 0)$ representation of the Lorentz group, $\psi_R$ is in the $(0, 1/2)$ representation, and $\psi$ belongs to $(1/2, 0) \oplus (0, 1/2)$.$^{40}$ Strictly speaking, the Dirac spinor is a representation of the double cover of $SO^+(1, 3) \cong SL(2, \mathbb{C})/Z_2$. Here $SO^+(1, 3)$ denotes the proper, orthochronous or restricted Lorentz group, which consists of those Lorentz transformations that preserve the orientation of space and direction of time, while $SL(2, \mathbb{C})$ is the complex special linear group and $Z_2$ is the two element cyclic group. The fact that the Lorentz group is doubly connected is the source of the rotation-by-$4\pi$ property (5.21) of spinors.
The relations (5.38) and (5.39) correspond to the chiral representation. But what happens
if we choose a different representation of the Clifford algebra, where the Lorentz group
matrices S() are not block diagonal? Is there an invariant way to define chiral spinors? We
can do this by defining the fifth Dirac matrix
5 = i 0 1 2 3 =

i

,
4!

(5.40)

where  is the totally antisymmetric Levi-Civita tensor with 0123 = 0123 = 1. The
$\gamma^5$ matrix has the following properties, all of which can be verified using (5.40) and the anticommutation relations (5.7):
$$ (\gamma^5)^\dagger = \gamma^5 \,, \qquad (\gamma^5)^2 = \mathbb{1} \,, \qquad \{ \gamma^5, \gamma^\mu \} = 0 \,. \tag{5.41} $$
The reason that this Dirac matrix is called $\gamma^5$ is that the set of matrices $\gamma^M = \{ \gamma^\mu, i \gamma^5 \}$ satisfies the five-dimensional Clifford algebra, i.e., $\{ \gamma^M, \gamma^N \} = 2 \eta^{MN}$ where $M, N = 0, 1, 2, 3, 4$. It is also not difficult to check that
$$ [S^{\rho\sigma}, \gamma^5] = 0 \,, \tag{5.42} $$
40 Using this terminology, scalars belong to the $(0, 0)$ representation, vectors are in the $(1/2, 1/2)$ representation, and the electromagnetic field-strength tensor transforms as $(1, 0) \oplus (0, 1)$ under the Lorentz group.

which means that $\gamma^5$ is a scalar$^{41}$ under rotations and boosts. The latter relation also tells us that eigenvectors of $\gamma^5$ whose eigenvalues are different transform without mixing, and as a result the Dirac representation must be reducible. This criterion for reducibility is Schur's lemma. It follows that
$$ P_{L,R} = \frac{1 \mp \gamma^5}{2} \,, \tag{5.43} $$
form Lorentz-invariant projection operators. They satisfy (please show this)
$$ P_{L,R}^2 = P_{L,R} \,, \qquad P_{L,R} P_{R,L} = 0 \,, \qquad P_L + P_R = \mathbb{1} \,. \tag{5.44} $$
One can also check easily that for the Weyl representation (5.10), one has explicitly
$$ \gamma^5 = \begin{pmatrix} -\mathbb{1} & 0 \\ 0 & \mathbb{1} \end{pmatrix} . \tag{5.45} $$
We see that $P_{L,R}$ project onto left- and right-handed spinors, i.e., $\psi_{L,R} = P_{L,R} \psi$.
Weyl Equations
The Dirac Lagrangian can be written in terms of the chiral fields (5.38) as
$$ \mathcal{L} = \bar\psi \, (i \slashed{\partial} - m) \, \psi = i \left( \bar\psi_L \slashed{\partial} \psi_L + \bar\psi_R \slashed{\partial} \psi_R \right) - m \left( \bar\psi_L \psi_R + \bar\psi_R \psi_L \right) , \tag{5.46} $$
where $\bar\psi_{L,R} = \bar\psi P_{R,L}$. After a slight change of notation,
$$ \sigma^\mu = (1, \vec\sigma) \,, \qquad \bar\sigma^\mu = (1, -\vec\sigma) \,, \tag{5.47} $$
and multiplying with $\bar\psi = \bar\psi_L + \bar\psi_R$ from the left the corresponding EOMs read
$$ i \left( \psi_L^\dagger \bar\sigma^\mu \partial_\mu \psi_L + \psi_R^\dagger \sigma^\mu \partial_\mu \psi_R \right) - m \left( \psi_L^\dagger \psi_R + \psi_R^\dagger \psi_L \right) = 0 \,. \tag{5.48} $$
We see that a massive fermion requires both components $\psi_L$ and $\psi_R$, since they are coupled via the mass term, which is chirality flipping. The kinetic term on the other hand is chirality conserving. This means that a massless fermion can be described by a single Weyl spinor $\psi_L$ or $\psi_R$ alone. The corresponding Euler-Lagrange equations go by the name of Weyl equations:
$$ i \bar\sigma^\mu \partial_\mu \psi_L = 0 \,, \qquad i \sigma^\mu \partial_\mu \psi_R = 0 \,. \tag{5.49} $$
In many practical applications it is overwhelmingly convenient to employ two-component Weyl spinor notation, rather than the four-component Dirac spinors. This is due to the fact that the Lagrangian of the SM and essentially all of its extensions violate parity, i.e., the left- and right-handed fermionic components couple differently to the electroweak gauge group. If one uses four-component spinor notation, then there are a lot of clumsy left- and right-handed projection operators. This is not the case if one employs the two-component Weyl fermion notation, which treats fermionic dofs with different gauge quantum numbers separately from the start, as nature intended for us to do. Plenty of details on and many useful techniques to deal with Weyl fermions can be found in [1], which I highly recommend for further reading.
41 In fact, we will see soon that it is a pseudo-scalar and not a scalar.

Dofs Counting
At this point a couple of comments about the dofs counting seem to be in order. In classical mechanics, the number of dofs of a system is equal to the dimension of the configuration space or, equivalently, half the dimension of the phase space. In field theory we have an infinite number of dofs, but it makes sense to count the number of dofs per spatial point, which at least should be finite. E.g., in this sense a real scalar field $\phi$ has a single dof. At the quantum level, this translates to the fact that it gives rise to a single type of particle. A classical complex scalar field, on the other hand, has two dofs, corresponding to the particle and its antiparticle in the QFT.
But what about a Dirac spinor? One might think that there are eight dofs, since $\psi$ has four complex components. But this is wrong! Crucially, and in contrast to the scalar field, the EOM of $\psi$ is first order rather than second order. In particular, for the Dirac theory, the momentum conjugate to the spinor $\psi$ is given by
$$ \pi_\psi = \frac{\partial \mathcal{L}}{\partial \dot\psi} = i \psi^\dagger \,, \tag{5.50} $$
which is not proportional to the time derivative of $\psi$. The phase space of a spinor is hence parameterized by $\psi$ and $\psi^\dagger$, while for a scalar it is parameterized by $\phi$ and $\pi = \dot\phi$. So the phase space of the Dirac spinor $\psi$ has eight real dimensions and correspondingly the number of real dofs is four. We will learn soon that, in the QFT, this counting manifests itself as two dofs (i.e., spin up and down) for the particle, and another two for the antiparticle. A similar counting for the Weyl fermion tells us that it has two dofs.
Majorana Fermions
Our spinor $\psi$ is a complex object. It has to be, since the representation $S(\Lambda)$ is typically also complex. This means that if we were to try to make $\psi$ real, e.g., by imposing $\psi = \psi^*$, then it would not stay real once we make a Lorentz transformation. However, there is a way to impose a reality condition on $\psi$. In order to motivate this possibility, it's simplest to look at a novel basis for the Clifford algebra (5.7), known as the Majorana basis
$$ \gamma^0 = \begin{pmatrix} 0 & \sigma^2 \\ \sigma^2 & 0 \end{pmatrix} , \quad \gamma^1 = \begin{pmatrix} i \sigma^3 & 0 \\ 0 & i \sigma^3 \end{pmatrix} , \quad \gamma^2 = \begin{pmatrix} 0 & -\sigma^2 \\ \sigma^2 & 0 \end{pmatrix} , \quad \gamma^3 = \begin{pmatrix} -i \sigma^1 & 0 \\ 0 & -i \sigma^1 \end{pmatrix} . \tag{5.51} $$
What is special about these matrices is that they are all pure imaginary, i.e., $(\gamma^\mu)^* = -\gamma^\mu$. This implies that the generators (5.12), and hence the full Lorentz transformations (5.18), are real. In the specific basis (5.51), we can therefore work with a real spinor simply by imposing the condition,
$$ \psi = \psi^* \,, \tag{5.52} $$
which is preserved under Lorentz transformations. Such spinors are called Majorana spinors.
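The two defining features of the Majorana basis, namely that all four matrices are purely imaginary and that they still obey the Clifford algebra, can be confirmed with a few lines of code. The sketch below assumes the block structure $\gamma^0 = ((0, \sigma^2), (\sigma^2, 0))$, $\gamma^1 = \mathrm{diag}(i\sigma^3, i\sigma^3)$, $\gamma^2 = ((0, -\sigma^2), (\sigma^2, 0))$, $\gamma^3 = \mathrm{diag}(-i\sigma^1, -i\sigma^1)$ and the metric (+,-,-,-); the helper names are hypothetical.

```python
from itertools import product

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]  # metric (+,-,-,-)
ZERO2 = [[0, 0], [0, 0]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

def block(a, b, c, d):
    """Assemble the 4x4 matrix ((a, b), (c, d)) from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def times(c, m):
    return [[c * x for x in row] for row in m]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

# Assumed Majorana basis (see lead-in).
MAJORANA = [
    block(ZERO2, S2, S2, ZERO2),
    block(times(1j, S3), ZERO2, ZERO2, times(1j, S3)),
    block(ZERO2, times(-1, S2), S2, ZERO2),
    block(times(-1j, S1), ZERO2, ZERO2, times(-1j, S1)),
]

def is_purely_imaginary(m):
    """True if every entry of m has vanishing real part."""
    return all(complex(x).real == 0 for row in m for x in row)

def clifford_holds(gammas, tol=1e-12):
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} entry by entry."""
    for mu, nu in product(range(4), repeat=2):
        ab, ba = matmul(gammas[mu], gammas[nu]), matmul(gammas[nu], gammas[mu])
        for i, j in product(range(4), repeat=2):
            expected = 2 * ETA[mu][nu] * (i == j)
            if abs(ab[i][j] + ba[i][j] - expected) > tol:
                return False
    return True
```

Since the generators $S^{\rho\sigma}$ are built from products of two such matrices, they come out real, which is why a real spinor stays real under Lorentz transformations in this basis.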
Can this procedure be generalized to an arbitrary basis of Dirac matrices? We only ask that the basis satisfies (5.24). We then define the charge conjugate of a Dirac spinor $\psi$ as
$$ \psi^c = C \psi^* \,, \tag{5.53} $$
where $C$ is a $4 \times 4$ matrix obeying
$$ C^\dagger C = \mathbb{1} \,, \qquad C^\dagger \gamma^\mu C = -(\gamma^\mu)^* \,. \tag{5.54} $$
The first relation tells us that charge conjugation can be described by a unitary operator. Let us first check that (5.53) is a sensible definition, meaning that $\psi^c$ transforms nicely under Lorentz transformations. One has
$$ \psi^c(x) \to C \, S^*(\Lambda) \, \psi^*(x') = S(\Lambda) \, C \psi^*(x') = S(\Lambda) \, \psi^c(x') \,. \tag{5.55} $$
Here we made use of the properties (5.54) to commute the matrix $C$ past $S^*(\Lambda)$ to the right. Comparing the latter result to (5.17), we see that $\psi$ and $\psi^c$ transform in the same way under the Lorentz group. In fact, not only does $\psi^c$ transform nicely under rotations and boosts, but it satisfies the Dirac equation if $\psi$ does. This follows from
$$ (i \gamma^\mu \partial_\mu - m) \, \psi = 0 \;\; \Rightarrow \;\; (-i (\gamma^\mu)^* \partial_\mu - m) \, \psi^* = 0 \;\; \Rightarrow \;\; C \, (-i (\gamma^\mu)^* \partial_\mu - m) \, \psi^* = (i \gamma^\mu \partial_\mu - m) \, \psi^c = 0 \,, \tag{5.56} $$
where we have again employed (5.54). Finally, we can now impose the Lorentz-invariant reality condition on the Dirac spinor, to yield a Majorana spinor,
$$ \psi^c = \psi \,. \tag{5.57} $$
After quantization, the Majorana spinor gives rise to a Majorana fermion that is its own antiparticle. This is exactly the same as in the case of scalar fields, where we have seen that a real scalar field gives rise to a spin-zero boson that is its own antiparticle.
So what does the matrix $C$ look like? This, of course, depends a lot on the basis. In the Majorana basis (5.51), where all the Dirac matrices are purely imaginary, one simply has $C = \mathbb{1}$, and in consequence the condition (5.57) turns into (5.52). In the chiral basis (5.10), on the other hand, only $\gamma^2$ is imaginary, and we may take$^{42}$
$$ C = i \gamma^2 \,. \tag{5.58} $$

It is also interesting to see how the Majorana condition (5.57) looks in terms of the decomposition into left- and right-handed Weyl spinors. Plugging in the various definition, we find
R = i 2 L and L = i 2 R . In other words, a Majorana spinor can be written in terms of
chiral spinors as
!
R
=
.
(5.59)
i 2 R
Notice that it is not possible to impose the Majorana condition, $\psi = \psi^c$, at the same time as the Weyl condition, $\psi_L = 0$ or $\psi_R = 0$. Instead the Majorana condition relates left- and right-handed spinors via (5.59). In an exercise you will learn more about Majorana fermions. So let's move on.
42 Be aware, in many texts an extra factor of $\gamma^0$ is absorbed into the definition of $C$.

5.2 Discrete Symmetries of Dirac Theory

In addition to the continuous Lorentz transformations we have considered so far, there are two other space-time operations that are potential symmetries of any QFT, namely parity and time reversal. Parity, denoted by $P$, sends
$$ x^\mu = (t, \vec x) \xrightarrow{P} (t, -\vec x) = x_P^\mu \,, \tag{5.60} $$
reversing the handedness of space. Time reversal, denoted by $T$, sends
$$ x^\mu = (t, \vec x) \xrightarrow{T} (-t, \vec x) = x_T^\mu \,, \tag{5.61} $$
interchanging the forward and backward light-cone. Since parity has an important role to play in the SM and, in particular, the theory of the electroweak interactions, let's first have a look at the action of $P$ on spinors and bi-linears constructed from them.
Parity
In order to understand what happens to a spinor under parity, we consider how rotations and boosts act on Weyl spinors. In the chiral representation, the corresponding transformation properties have already been spelled out in (5.39). We also know that under parity rotations do not flip sign, while boosts do, since $P$ acting on a particle should reverse its momentum, but not its spin. This tells us that parity exchanges right- and left-handed spinors,
$$ \psi_{L,R}(x) \xrightarrow{P} \psi_{R,L}(x_P) \,. \tag{5.62} $$
Using this knowledge and the fact that applying parity twice is the identity, i.e., $P^2 = 1$, we see that the action of parity on $\psi$ can be described in the Weyl basis by
$$ P = \gamma^0 \,. \tag{5.63} $$
This $4 \times 4$ matrix satisfies
$$ P^\dagger P = \mathbb{1} \,, \qquad P^\dagger \gamma^\mu P = (\gamma^\mu)^\dagger \,, \tag{5.64} $$
so parity can also be implemented by a unitary operator. Our spinor transforms under $P$ as
$$ \psi(x) \xrightarrow{P} P \psi(x_P) \,. \tag{5.65} $$
Notice that if $\psi(x)$ satisfies the Dirac equation (5.35), so does the parity-transformed spinor $P \psi(x_P)$, since one has
$$ (i \gamma^0 \partial_t + i \gamma^i \partial_i - m) \, P \psi(t, -\vec x) = P \, (i \gamma^0 \partial_t - i \gamma^i \partial_i - m) \, \psi(t, -\vec x) = 0 \,. \tag{5.66} $$
Here the extra minus sign from passing $P$ through $\gamma^i$ is compensated by the derivative acting on $-\vec x$ instead of $\vec x$.
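Let me now consider how the covariant interaction terms we have constructed before transform under $P$. We start with $\bar\psi \psi$. Obviously, one has
$$ \bar\psi(x) \psi(x) \xrightarrow{P} \bar\psi(x_P) \psi(x_P) \,, \tag{5.67} $$
given that $(\gamma^0)^2 = 1$ and $(\gamma^0)^\dagger = \gamma^0$. This is the transformation of a scalar. In the case of $\bar\psi \gamma^\mu \psi$, we find instead
$$ \bar\psi(x) \gamma^\mu \psi(x) \xrightarrow{P} (-1)^\mu \, \bar\psi(x_P) \gamma^\mu \psi(x_P) \,, \tag{5.68} $$
where $(-1)^\mu = 1$ for $\mu = 0$ and $(-1)^\mu = -1$ for $\mu = 1, 2, 3$. Notice that the factor $(-1)^\mu$ arises from the combination of (5.24) and (5.27). The latter transformation property tells us that $\bar\psi \gamma^\mu \psi$ transforms as a vector, with the spatial part changing sign. You can also check easily that $\bar\psi \sigma^{\mu\nu} \psi$ transforms as a tensor, namely
$$ \bar\psi(x) \sigma^{\mu\nu} \psi(x) \xrightarrow{P} (-1)^\mu (-1)^\nu \, \bar\psi(x_P) \sigma^{\mu\nu} \psi(x_P) \,. \tag{5.69} $$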

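Using $\gamma^5$, we can form two more Lorentz-covariant objects, i.e., $\bar\psi \gamma^5 \psi$ and $\bar\psi \gamma^\mu \gamma^5 \psi$. How do these transform under parity? In the first case, we obtain
$$ \bar\psi(x) \gamma^5 \psi(x) \xrightarrow{P} -\bar\psi(x_P) \gamma^5 \psi(x_P) \,, \tag{5.70} $$
where we have used the last relation in (5.41) and $(\gamma^0)^2 = 1$. In the second case, a straightforward calculation gives
$$ \bar\psi(x) \gamma^\mu \gamma^5 \psi(x) \xrightarrow{P} -(-1)^\mu \, \bar\psi(x_P) \gamma^\mu \gamma^5 \psi(x_P) \,. \tag{5.71} $$
The minus signs in (5.70) and (5.71) earn the objects $\bar\psi \gamma^5 \psi$ and $\bar\psi \gamma^\mu \gamma^5 \psi$ the names pseudo-scalar and pseudo-vector (or axial-vector). To summarize, we have the following spinor bi-linears,
$$ \bar\psi \psi : \text{scalar} \,, \qquad \bar\psi \gamma^\mu \psi : \text{vector} \,, \qquad \bar\psi \sigma^{\mu\nu} \psi : \text{tensor} \,, \qquad \bar\psi \gamma^5 \psi : \text{pseudo-scalar} \,, \qquad \bar\psi \gamma^\mu \gamma^5 \psi : \text{pseudo-vector} \,. \tag{5.72} $$
The total number of bi-linears is $(1 + 4 + (4 \cdot 3)/2 + 4 + 1) = 16$, which is all we could hope for from a 4-component object.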
We are now equipped with new terms involving $\gamma^5$ that we can start to add to our Lagrangian to construct new theories. Typically such terms will break parity invariance of the theory, although this is not always true. E.g., the term $\phi \, \bar\psi \gamma^5 \psi$ does not break parity if $\phi$ is itself a pseudo-scalar. Nature makes use of these parity-violating interactions by using $\gamma^5$ in the electroweak force. A theory which treats $\psi_{L,R}$ on an equal footing is called a vector-like theory. In contrast, a theory in which $\psi_{L,R}$ appear differently is called a chiral theory.

Time Reversal
Another obvious question that we should address is how our building blocks in (5.72) transform under $T$. In order to answer this question, we first have to understand how the time-reversal symmetry is correctly implemented in a QM context. The implementation turns out to be more subtle than in the case of $C$ and $P$, since the relevant operator is in the case of $T$ not unitary but anti-unitary, i.e., $T^\dagger = T^{-1}$ with $\langle \psi | T^\dagger | \psi' \rangle = \langle T \psi | \psi' \rangle^*$, where $|\psi\rangle$ and $|\psi'\rangle$ denote arbitrary multiparticle quantum states. A straightforward way to realize that the operator implementing $T$ must be anti-unitary is to consider the behavior of the Schrödinger equation for a free particle under time reversal. In classical mechanics, a free particle has a time-reversal invariant motion, and it is reasonable that we would like to retain this property in QM as well. But the operator $\partial_t$ is $T$-odd while $\nabla^2$ is $T$-even. This is impossible to reconcile with the Schrödinger equation unless time reversal changes $i \to -i$ and $\psi \to \psi^*$. The operator thus has to be anti-unitary. Note that the anti-unitarity of $T$ implies that it does not have meaningful eigenvalues, contrary to what happens in the case of $C$ and $P$. As there is no quantum number associated with time reversal, no conservation law exists when the action is invariant under time reversal. Just as for parity, we define the time-reversal transformation in QFT by its action on states. We require that $T$ should reverse the particle momentum and its spin.
It is not too difficult to figure out that the transformation of the Dirac spinor under time reversal involves in the chiral basis the matrix
$$ T = i \gamma^1 \gamma^3 \,, \tag{5.73} $$
which satisfies
$$ T^\dagger T = \mathbb{1} \,, \qquad T^\dagger \gamma^\mu T = (-1)^\mu \, (\gamma^\mu)^* \,. \tag{5.74} $$
The transformation itself takes the following form
$$ \psi(x) \xrightarrow{T} T \psi^*(x_T) \,. \tag{5.75} $$
The first thing to notice is that if $\psi(x)$ obeys the Dirac equation, the same is true for $T \psi^*(x_T)$. This follows, because
$$ (i \gamma^0 \partial_t + i \gamma^i \partial_i - m) \, T \psi^*(-t, \vec x) = T \, (i (\gamma^0)^* \partial_t - i (\gamma^i)^* \partial_i - m) \, \psi^*(-t, \vec x) = 0 \,. \tag{5.76} $$
Notice that the minus sign between the $(\gamma^0)^*$ and $(\gamma^i)^*$ term is compensated by the derivative acting on $-t$ rather than $t$, and that the final result follows after complex conjugation which sends $i \to -i$.
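The matrix relations quoted for the three discrete symmetries are straightforward to confirm explicitly. The following sketch assumes a chiral representation of the Dirac matrices and the choices $C = i \gamma^2$, $P = \gamma^0$, and $T = i \gamma^1 \gamma^3$; it checks unitarity together with $C^\dagger \gamma^\mu C = -(\gamma^\mu)^*$, $P^\dagger \gamma^\mu P = (\gamma^\mu)^\dagger$, and $T^\dagger \gamma^\mu T = (-1)^\mu (\gamma^\mu)^*$. Function names are illustrative and not part of any library.

```python
ONE2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble the 4x4 matrix ((a, b), (c, d)) from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

IDENTITY = block(ONE2, ZERO2, ZERO2, ONE2)
# Assumed chiral representation of the Dirac matrices.
GAMMA = [block(ZERO2, ONE2, ONE2, ZERO2)] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in PAULI]

C = scale(1j, GAMMA[2])                    # charge conjugation matrix
P = GAMMA[0]                               # parity matrix
T = scale(1j, matmul(GAMMA[1], GAMMA[3]))  # time-reversal matrix
SIGN = [1, -1, -1, -1]                     # the factor (-1)^mu

def sandwich(u, g):
    """Compute u^dagger g u."""
    return matmul(dagger(u), matmul(g, u))

def is_unitary(u):
    return close(matmul(dagger(u), u), IDENTITY)

def charge_conjugation_ok():
    return is_unitary(C) and all(close(sandwich(C, g), scale(-1, conj(g))) for g in GAMMA)

def parity_ok():
    return is_unitary(P) and all(close(sandwich(P, g), dagger(g)) for g in GAMMA)

def time_reversal_ok():
    return is_unitary(T) and all(close(sandwich(T, g), scale(s, conj(g)))
                                 for s, g in zip(SIGN, GAMMA))
```

The checks only rely on which Dirac matrices are real and which are imaginary in the chiral basis, so flipping the overall sign of the spatial matrices would not change the outcome.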
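We are now ready to consider the transformation properties of the building blocks (5.72). I simply quote the results without proof, leaving the actual derivations to you as a useful exercise. For the scalar $\bar\psi \psi$ one finds that
$$ \bar\psi(x) \psi(x) \xrightarrow{T} \bar\psi(x_T) \psi(x_T) \,. \tag{5.77} $$
In the case of the vector $\bar\psi \gamma^\mu \psi$, one has instead
$$ \bar\psi(x) \gamma^\mu \psi(x) \xrightarrow{T} (-1)^\mu \, \bar\psi(x_T) \gamma^\mu \psi(x_T) \,. \tag{5.78} $$
This is exactly the transformation property we want for vectors, since it leaves $\bar\psi \slashed{\partial} \psi$ and $\bar\psi \slashed{A} \psi$ invariant under time reversal. Notice that the minus sign appearing for the space-components in (5.78) is cancelled by those appearing in the transformation of the derivative $\partial_\mu$ and the electromagnetic field $A_\mu$, respectively. One furthermore shows that the tensor $\bar\psi \sigma^{\mu\nu} \psi$ behaves like
$$ \bar\psi(x) \sigma^{\mu\nu} \psi(x) \xrightarrow{T} -(-1)^\mu (-1)^\nu \, \bar\psi(x_T) \sigma^{\mu\nu} \psi(x_T) \,, \tag{5.79} $$
under time reversal. We finally want to know the transformation properties of the covariants involving $\gamma^5$. For the pseudo-scalar $\bar\psi \gamma^5 \psi$, one obtains
$$ \bar\psi(x) \gamma^5 \psi(x) \xrightarrow{T} -\bar\psi(x_T) \gamma^5 \psi(x_T) \,, \tag{5.80} $$
while the action of $T$ on the pseudo-vector $\bar\psi \gamma^\mu \gamma^5 \psi$ is given by
$$ \bar\psi(x) \gamma^\mu \gamma^5 \psi(x) \xrightarrow{T} (-1)^\mu \, \bar\psi(x_T) \gamma^\mu \gamma^5 \psi(x_T) \,. \tag{5.81} $$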
Charge Conjugation
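The last of the three discrete symmetries is the particle-antiparticle symmetry $C$, which we already met at the end of Section 5.1 when discussing the properties of Majorana fermions. In physical terms, charge conjugation is conventionally defined to take a fermion with a given spin orientation into an antifermion with the same spin orientation. As we have seen in (5.56), this transformation is a symmetry of the Dirac equation.
Once again we want to know how $C$ acts on fermion bi-linears. I again quote the relevant results without giving the details of their derivation. The scalar $\bar\psi \psi$ transforms under $C$ as
$$ \bar\psi(x) \psi(x) \xrightarrow{C} \bar\psi(x) \psi(x) \,. \tag{5.82} $$
In the case of the vector $\bar\psi \gamma^\mu \psi$, one has instead
$$ \bar\psi(x) \gamma^\mu \psi(x) \xrightarrow{C} -\bar\psi(x) \gamma^\mu \psi(x) \,. \tag{5.83} $$
Under $C$ the tensor $\bar\psi \sigma^{\mu\nu} \psi$ behaves like
$$ \bar\psi(x) \sigma^{\mu\nu} \psi(x) \xrightarrow{C} -\bar\psi(x) \sigma^{\mu\nu} \psi(x) \,. \tag{5.84} $$
For the pseudo-scalar $\bar\psi \gamma^5 \psi$, one arrives at
$$ \bar\psi(x) \gamma^5 \psi(x) \xrightarrow{C} \bar\psi(x) \gamma^5 \psi(x) \,, \tag{5.85} $$
while the action of $C$ on the pseudo-vector $\bar\psi \gamma^\mu \gamma^5 \psi$ reads
$$ \bar\psi(x) \gamma^\mu \gamma^5 \psi(x) \xrightarrow{C} \bar\psi(x) \gamma^\mu \gamma^5 \psi(x) \,. \tag{5.86} $$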

CP and CP T Symmetry
We saw that the free Dirac equation (5.35) is invariant under P , T , and C separately. Yet,
we can build more general QFTs that violate any of these discrete symmetries by adding to
the Dirac Lagrangian appropriate perturbations. These additional terms must transform as a
Lorentz scalar. The various fermionic bi-linears that can be used to construct such terms are
shown in Table 1. The last line of this table tells us that all Lorentz-scalar combinations of
$\bar\psi$ and $\psi$ are invariant under the combined symmetry CPT. Actually, it is quite generally true
that one cannot build a Lorentz-invariant QFT with a hermitian Hamiltonian that violates
CP T . More precisely, one can prove the following three statements [2]: first, an interacting
theory that violates CP T invariance necessarily violates Lorentz invariance, second, CP T
invariance is not sufficient for out-of-cone Lorentz invariance, and third, theories that violate
CP T by having different particle and antiparticle masses must be non-local. This implies that
any study of CPT violation also includes Lorentz violation. Several experimental searches for
such violations have been performed during the last few years. A detailed list of the results of
these searches is given in [3]. So far no evidence for either CPT or
Lorentz violation has been found. The consequences of CPT invariance are far-reaching.
The most celebrated ones are the equality of the masses and of the total decay widths (or lifetimes) of
particles and antiparticles. Both statements are easy to prove. Try it!
What about the other discrete symmetries in nature? Are they conserved? Although P is
conserved in electromagnetism, strong interactions, and gravity, it turns out to be violated in
electroweak interactions. The SM incorporates parity violation by expressing the electroweak
interaction as a chiral gauge interaction. Only the left-handed components of particles and
right-handed components of antiparticles participate in the electroweak interactions in the SM.
This implies that P is not a symmetry of our universe, unless a hidden mirror sector exists
in which parity is violated in the opposite way (a left-right symmetry). It was suggested
several times and in different contexts that parity might not be conserved, but in the absence
of compelling evidence these suggestions were not taken seriously. A careful review by Tsung
Dao Lee and Chen Ning Yang [4] showed that while P conservation had been verified in decays
by the strong or electromagnetic interactions, it was untested in the electroweak interaction.
They proposed several possible direct experimental tests. These proposals were almost ignored, but Lee
was able to convince his colleague Chien-Shiung Wu to look for P violation. In 1957, Wu's
group conducted an ingenious experiment showing that in the case of the $\beta$-decay of $^{60}$Co,
nature knows left from right [5]. The discovery of P violation immediately explained the
outstanding puzzle related to the decay of charged kaons.
So P is broken in nature, what about CP then? The first thing to notice in this respect is
that a symmetry of a QM system can be restored if another symmetry can be found such that
the combined symmetry remains unbroken. This rather subtle point about the structure of
Hilbert space was realized shortly after the discovery of P violation, and it was proposed that
charge conjugation was the desired symmetry to restore order. As a result, the CP symmetry
was proposed in 1957 by Lev Landau as the true symmetry between matter and antimatter.
In other words, a process in which all particles are exchanged with their antiparticles was
assumed to be equivalent to the mirror image of the original process. The discovery of CP
violation in 1964 in the decays of neutral kaons [6], which resulted in the Nobel Prize in Physics
in 1980 for its discoverers James Cronin and Val Fitch, shocked particle physics and opened
the door to questions still at the core of particle physics and cosmology today. CP violation is
incorporated in the SM by including a complex phase in the matrix describing quark mixing.
In such a scheme a necessary condition for CP violation can then be shown to be the presence
of at least three generations of quarks. This possibility was suggested by Makoto Kobayashi
and Toshihide Maskawa in a seminal paper [7] in 1973, which earned them one half of the
Nobel Prize in Physics in 2008.

$$
\begin{array}{l|ccccc|ccc}
\text{Symmetry} & \bar\psi\psi & \bar\psi\gamma^\mu\psi & \bar\psi\sigma^{\mu\nu}\psi & \bar\psi\gamma^5\psi & \bar\psi\gamma^\mu\gamma^5\psi & \partial_\mu & A_\mu & F_{\mu\nu} \\
\hline
P   & +1 & (-1)^\mu  & (-1)^\mu(-1)^\nu  & -1 & -(-1)^\mu & (-1)^\mu  & (-1)^\mu  & (-1)^\mu(-1)^\nu \\
T   & +1 & (-1)^\mu  & -(-1)^\mu(-1)^\nu & -1 & (-1)^\mu  & -(-1)^\mu & (-1)^\mu  & -(-1)^\mu(-1)^\nu \\
C   & +1 & -1        & -1                & +1 & +1        & +1        & -1        & -1 \\
CP  & +1 & -(-1)^\mu & -(-1)^\mu(-1)^\nu & -1 & -(-1)^\mu & (-1)^\mu  & -(-1)^\mu & -(-1)^\mu(-1)^\nu \\
CPT & +1 & -1        & +1                & +1 & -1        & -1        & -1        & +1
\end{array}
$$

Table 1: Transformation properties of fermion bi-linears as well as $\partial_\mu$, $A_\mu$, and $F_{\mu\nu}$ under the discrete P, T, and C symmetries and the combinations CP and CPT.
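The statement that the CPT row of Table 1 only depends on the number of Lorentz indices can be checked mechanically. The following sketch multiplies the P, T, and C signs for every index choice. The sign assignments encoded in `rows` are the standard ones as I read them from Table 1 (in particular $C = -1$ for $A_\mu$ and $F_{\mu\nu}$); the names `rows`, `sgn`, and `cpt_signs` are purely illustrative.

```python
from itertools import product

def sgn(mu):
    """The symbol (-1)^mu of the text: +1 for mu = 0, -1 for mu = 1, 2, 3."""
    return 1 if mu == 0 else -1

def index_sign(idx):
    """Product of (-1)^mu over all Lorentz indices in idx."""
    out = 1
    for mu in idx:
        out *= sgn(mu)
    return out

# name: (number of Lorentz indices, P prefactor, T prefactor, C sign).
# P and T act as prefactor * (-1)^mu (-1)^nu ... on the free indices.
rows = {
    "psibar psi":                  (0, +1, +1, +1),
    "psibar gamma^mu psi":         (1, +1, +1, -1),
    "psibar sigma^{mu nu} psi":    (2, +1, -1, -1),
    "psibar gamma^5 psi":          (0, -1, -1, +1),
    "psibar gamma^mu gamma^5 psi": (1, -1, +1, +1),
    "partial_mu":                  (1, +1, -1, +1),
    "A_mu":                        (1, +1, +1, -1),
    "F_{mu nu}":                   (2, +1, -1, -1),
}

def cpt_signs(name):
    """Set of CPT signs obtained over all index values; a single element if uniform."""
    n, p_pre, t_pre, c = rows[name]
    return {
        c * (p_pre * index_sign(idx)) * (t_pre * index_sign(idx))
        for idx in product(range(4), repeat=n)
    }

cpt = {name: cpt_signs(name) for name in rows}
```

Each entry of `cpt` is a one-element set equal to $(-1)^n$ with $n$ the number of Lorentz indices, so any fully contracted combination of these objects is CPT even.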
The past decade has seen tremendous progress in the study of CP violation. In particular,
the so-called B factories (BaBar and Belle) have collected and analyzed an impressive amount
of experimental data, which led to the confirmation of the Kobayashi-Maskawa (KM) mechanism of CP violation. Yet, the dynamical origin of CP violation remains a puzzling mystery
that awaits unraveling. Another unsolved theoretical question in this context is why
the universe is made entirely of matter, rather than consisting of equal parts of matter and
antimatter. It can be demonstrated that, to create an imbalance between matter and antimatter
from an initial condition of balance, three necessary conditions [8] must be satisfied, one of
which is the existence of CP violation. The other two are baryon-number violation and the
presence of interactions out of thermal equilibrium. These conditions were first formulated
in 1967 by Andrei Sakharov. The SM contains only two sources that can break the CP
symmetry. The first of these involves the aforementioned KM phase, but can account for
only a small portion of the needed CP violation. The second resides in the quantum
chromodynamics (QCD) Lagrangian and goes by the name of the $\theta$ parameter. It has not been
found experimentally. The fact that one would expect the $\theta$ parameter to lead to either no
CP violation or CP violation that is way too large is the essence of the strong CP problem [9]. There
are several proposed solutions to this problem. The most well-known is based on an
idea originally due to Roberto Peccei and Helen Quinn [10], involving new scalar particles called
axions. Oops! Looks like I am getting carried away here. Let's get focused and return to the
discussion of the Dirac theory.

5.3 Continuous Symmetries of Dirac Theory

Besides the discrete P , T , and C symmetries, the Dirac action (5.34) enjoys a number of continuous symmetries. In the following we will discuss space-time translations, Lorentz transformations, the internal vector and axial-vector symmetry, and compute the associated conserved
currents.
Space-Time Translations
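Under infinitesimal space-time translations (2.18), the Dirac spinor transforms as

$$
\delta\psi = \epsilon^\mu\partial_\mu\psi .
\tag{5.87}
$$

Given that the Dirac Lagrangian depends on $\partial_\mu\psi$ but not on $\partial_\mu\bar\psi$, we can use the standard formula (2.20) to obtain the energy-momentum tensor

$$
T^{\mu\nu} = i\bar\psi\gamma^\mu\partial^\nu\psi - \eta^{\mu\nu}\mathcal{L} .
\tag{5.88}
$$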

Since a current is conserved only when the EOMs are obeyed, we do not lose anything by imposing the Euler-Lagrange equation already on $T^{\mu\nu}$. In the case of a scalar field this does not really buy us anything, because the EOMs are second order in derivatives, while the energy-momentum tensor is first order. However, for a spinor field the EOMs are first order (5.35). This means that we can ignore the second term in (5.88), leaving us with

$$
T^{\mu\nu} = i\bar\psi\gamma^\mu\partial^\nu\psi .
\tag{5.89}
$$

It follows that the total energy is given by

$$
E = \int d^3x\, T^{00} = \int d^3x\, i\bar\psi\gamma^0\dot\psi = \int d^3x\, \psi^\dagger\gamma^0\,(-i\boldsymbol{\gamma}\cdot\boldsymbol{\nabla} + m)\,\psi ,
\tag{5.90}
$$

where $\boldsymbol{\gamma}\cdot\boldsymbol{\nabla} = \gamma^i\partial_i$ and where in order to obtain the final expression we have employed the Dirac equation. The components of the total momentum are given by

$$
P^i = \int d^3x\, T^{0i} = \int d^3x\, i\bar\psi\gamma^0\partial^i\psi .
\tag{5.91}
$$

Both the total energy and momentum are of course conserved.
Lorentz Transformations
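Under a Lorentz transformation, the Dirac spinor transforms as (5.17) which, in infinitesimal form, reads

$$
\delta\psi^\alpha = -\omega^\mu{}_\nu\, x^\nu\partial_\mu\psi^\alpha + \frac{1}{2}\,\Omega_{\rho\sigma}\,(S^{\rho\sigma})^\alpha{}_\beta\,\psi^\beta .
\tag{5.92}
$$

From (5.6) it follows that $\omega^\mu{}_\nu = \frac{1}{2}\,\Omega_{\rho\sigma}(M^{\rho\sigma})^\mu{}_\nu$, where the generators of the Lorentz group $(M^{\rho\sigma})^\mu{}_\nu$ take the form (5.2). After direct substitution, this tells us that $\omega^{\mu\nu} = \Omega^{\mu\nu}$, and as a result (5.92) becomes

$$
\delta\psi^\alpha = -\omega^{\mu\nu}\, x_\nu\partial_\mu\psi^\alpha + \frac{1}{2}\,\omega_{\mu\nu}\,(S^{\mu\nu})^\alpha{}_\beta\,\psi^\beta .
\tag{5.93}
$$

The conserved current arising from Lorentz transformations now follows from the same calculation we saw for the scalar field (2.67). Yet, there are two small differences. First, we are allowed to neglect terms proportional to $\mathcal{L}$ in the computation and, second, we pick up an extra piece in the current from the second term in (5.92). At the end one has

$$
(J^\mu)^{\rho\sigma} = x^\rho T^{\mu\sigma} - x^\sigma T^{\mu\rho} - i\bar\psi\gamma^\mu S^{\rho\sigma}\psi .
\tag{5.94}
$$

After quantization, when $(J^\mu)^{\rho\sigma}$ is turned into an operator, this extra term will be responsible
for providing the single-particle states with internal angular momentum, telling us that the
quantization of a Dirac spinor gives rise to a particle carrying spin 1/2.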
Vector Symmetry
The Dirac Lagrangian is invariant under global phase rotations of the spinor, i.e.,

$$
\psi \to e^{i\alpha}\psi .
\tag{5.95}
$$

This symmetry gives rise to the conserved current

$$
j_V^\mu = \bar\psi\gamma^\mu\psi ,
\tag{5.96}
$$

where the index $V$ stands for vector, reflecting the fact that the left- and right-handed spinors $\psi_{L,R}$ transform in the same way under phase rotations. It is straightforward to check using (5.35) and (5.36) that $j_V^\mu$ is indeed conserved under the EOMs,

$$
\partial_\mu j_V^\mu = (\partial_\mu\bar\psi)\gamma^\mu\psi + \bar\psi\gamma^\mu(\partial_\mu\psi) = im\bar\psi\psi - im\bar\psi\psi = 0 .
\tag{5.97}
$$

The conserved quantity arising from the vector symmetry is

$$
Q = \int d^3x\, j_V^0 = \int d^3x\, \bar\psi\gamma^0\psi = \int d^3x\, \psi^\dagger\psi .
\tag{5.98}
$$

We will see shortly that this has the interpretation of electric charge, or particle number, for
fermions.
Axial-Vector Symmetry
In the case of massless fermions, the Dirac Lagrangian possesses an extra internal symmetry, which rotates left- and right-handed fermions in opposite directions,

$$
\psi \to e^{i\alpha\gamma^5}\psi , \qquad \bar\psi \to \bar\psi\, e^{i\alpha\gamma^5} .
\tag{5.99}
$$

Here the second transformation follows from the first by noticing that $\exp(-i\alpha\gamma^5)\,\gamma^0 = \gamma^0\exp(i\alpha\gamma^5)$ as a consequence of the anti-commutation relation in (5.41). Invariance under the global phase rotation (5.99) leads to the conserved current

$$
j_A^\mu = \bar\psi\gamma^\mu\gamma^5\psi ,
\tag{5.100}
$$

where the subscript $A$ stands for axial-vector. This current is only conserved if the mass parameter $m$ in the Dirac action (5.34) is equal to zero. Indeed, with the full Dirac Lagrangian we may compute

$$
\partial_\mu j_A^\mu = (\partial_\mu\bar\psi)\gamma^\mu\gamma^5\psi + \bar\psi\gamma^\mu\gamma^5(\partial_\mu\psi) = 2im\,\bar\psi\gamma^5\psi ,
\tag{5.101}
$$

which is non-vanishing only if $m \neq 0$. However, in the quantum theory things become more
interesting for the axial-vector current. When the theory is coupled to gauge fields, the axial
transformation remains a symmetry of the classical Lagrangian. But the symmetry does not
survive the quantization process [11-13]. It is the prototypical example of an anomaly: a
symmetry of the classical theory that is not preserved at the quantum level. In fact, the axial
anomaly has important physical implications. It not only determines the neutral pion
decay $\pi^0 \to 2\gamma$, but also provides an indirect way to determine the number of color degrees of freedom. For
further reading, I recommend the recent review article [14].
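The identity $\exp(-i\alpha\gamma^5)\gamma^0 = \gamma^0\exp(i\alpha\gamma^5)$ used for the axial rotation of $\bar\psi$ is easy to confirm numerically. The sketch below assumes the chiral representation, in which $\gamma^5 = \mathrm{diag}(-1,-1,+1,+1)$ and $\gamma^0$ exchanges the two Weyl blocks; the overall sign convention of $\gamma^5$ is an assumption and does not affect the check.

```python
import cmath

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Chiral representation (assumed): gamma^0 swaps the upper and lower 2-spinors,
# gamma^5 is diagonal with entries (-1, -1, +1, +1).
gamma0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
gamma5_diag = [-1, -1, 1, 1]

def exp_i_alpha_gamma5(alpha):
    """exp(i alpha gamma^5); diagonal because gamma^5 is diagonal in this basis."""
    return [[cmath.exp(1j * alpha * gamma5_diag[i]) if i == j else 0
             for j in range(4)] for i in range(4)]

alpha = 0.37
lhs = matmul(exp_i_alpha_gamma5(-alpha), gamma0)   # exp(-i alpha gamma^5) gamma^0
rhs = matmul(gamma0, exp_i_alpha_gamma5(alpha))    # gamma^0 exp(+i alpha gamma^5)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
```

The difference vanishes for any `alpha`, because $\gamma^0$ only connects entries on which $\gamma^5$ has opposite sign.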

5.4 Solutions to Dirac Equation

In order to get some feeling for the physics of the Dirac equation (5.35), we now discuss its plane-wave solutions. The fact that the Dirac field $\psi$ obeys the Klein-Gordon equation tells us that it can be written as a linear combination of plane waves. We make the ansatz

$$
\psi(x) = u(p)\, e^{-ip\cdot x} ,
\tag{5.102}
$$

where $u(p)$ is a 4-component spinor that is independent of $x$, but does depend on the 3-momentum $\mathbf{p}$.$^{43}$ Notice that (5.102) is a positive frequency solution, because $\psi \propto \exp(-iEt)$. Inserting the above ansatz, the Dirac equation takes the form

$$
(\slashed{p} - m)\, u(p) = \begin{pmatrix} -m & p\cdot\sigma \\ p\cdot\bar\sigma & -m \end{pmatrix} u(p) = 0 ,
\tag{5.103}
$$

where we have used the notation (5.47). In order to find the solution to this equation, we write $u(p) = (u_1, u_2)^T$. In terms of the two-component spinors $u_{1,2}$ the relation (5.103) reads

$$
(p\cdot\sigma)\, u_2 = m\, u_1 , \qquad (p\cdot\bar\sigma)\, u_1 = m\, u_2 ,
\tag{5.104}
$$

where $p\cdot\sigma = p_\mu\sigma^\mu$ and $p\cdot\bar\sigma = p_\mu\bar\sigma^\mu$. However, these equations are not independent of each other, since

$$
(p\cdot\sigma)(p\cdot\bar\sigma) = p_0^2 - p_i p_j\sigma^i\sigma^j = p_0^2 - p_i p_j\delta^{ij} = p_\mu p^\mu = m^2 .
\tag{5.105}
$$

We conclude that any spinor of the form

$$
u(p) = N \begin{pmatrix} (p\cdot\sigma)\,\xi' \\ m\,\xi' \end{pmatrix} ,
\tag{5.106}
$$

$^{43}$ In an abuse of notation we denote hereafter the 4-component Dirac spinors by $u(p)$ and not $u(\mathbf{p})$ etc.

with constant $N$ is a solution to (5.103). In order to make this more symmetric, we choose $N = 1/m$ and $\xi' = \sqrt{p\cdot\bar\sigma}\,\xi$. Then $u_1 = (p\cdot\sigma)\sqrt{p\cdot\bar\sigma}\,\xi/m = \sqrt{p\cdot\sigma}\,\xi$, and putting things together one obtains

$$
u(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi \\ \sqrt{p\cdot\bar\sigma}\,\xi \end{pmatrix} ,
\tag{5.107}
$$

where $\xi$ is a 2-component spinor that can be chosen to satisfy $\xi^\dagger\xi = 1$. Here it is understood that in taking the square root of a matrix, we take the positive root of each eigenvalue.

Further solutions to the Dirac equation follow from the ansatz

$$
\psi(x) = v(p)\, e^{ip\cdot x} .
\tag{5.108}
$$

These solutions oscillate in time as $\exp(iEt)$ and are therefore called negative frequency solutions. Realize however that both (5.102) and (5.108) are solutions to the classical field equations and both have positive total energy (5.90). The Dirac equation (5.35) requires that the 4-component spinor $v(p)$ satisfies

$$
(\slashed{p} + m)\, v(p) = \begin{pmatrix} m & p\cdot\sigma \\ p\cdot\bar\sigma & m \end{pmatrix} v(p) = 0 .
\tag{5.109}
$$

Following the line of reasoning that led to (5.106), it is easy to show that the latter equation is solved by

$$
v(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta \\ -\sqrt{p\cdot\bar\sigma}\,\eta \end{pmatrix} ,
\tag{5.110}
$$

for some constant 2-component spinor $\eta$ taken to be normalized as $\eta^\dagger\eta = 1$.
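As a sanity check of the plane-wave solutions, one can build the spinors numerically for a generic momentum. The sketch below is only an illustration under stated assumptions: it uses the closed form $\sqrt{p\cdot\sigma} = (p\cdot\sigma + m)/\sqrt{2(E+m)}$ (and likewise for $\bar\sigma$), which follows from $(p\cdot\sigma + m)^2 = 2(E+m)\,p\cdot\sigma$ but is not taken from the notes, and the helper names are hypothetical.

```python
import math

# Pauli matrices as nested lists.
S = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def p_sigma(E, p, sign):
    """E + sign * (p_vec . sigma_vec): sign=-1 gives p.sigma, sign=+1 gives p.sigma-bar."""
    return [[E * (i == j) + sign * sum(p[k] * S[k][i][j] for k in range(3))
             for j in range(2)] for i in range(2)]

def root(A, E, m):
    """Positive square root of A = p.sigma or p.sigma-bar (assumed closed form)."""
    n = math.sqrt(2 * (E + m))
    return [[(A[i][j] + m * (i == j)) / n for j in range(2)] for i in range(2)]

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

m = 1.0
p = [0.3, -0.4, 1.2]
E = math.sqrt(m * m + sum(c * c for c in p))
ps, psb = p_sigma(E, p, -1), p_sigma(E, p, +1)

# (5.105): (p.sigma)(p.sigma-bar) should be m^2 times the identity.
prod = matmul(ps, psb)

# (5.107) and (5.110) with xi = eta = (1, 0).
xi = [1, 0]
u1, u2 = apply(root(ps, E, m), xi), apply(root(psb, E, m), xi)
v1, v2 = u1, [-c for c in u2]

# Residuals of (5.103): -m u1 + (p.sigma) u2 and (p.sigma-bar) u1 - m u2.
res_u = [a - m * b for a, b in zip(apply(ps, u2), u1)] + \
        [a - m * b for a, b in zip(apply(psb, u1), u2)]
# Residuals of (5.109): m v1 + (p.sigma) v2 and (p.sigma-bar) v1 + m v2.
res_v = [a + m * b for a, b in zip(apply(ps, v2), v1)] + \
        [a + m * b for a, b in zip(apply(psb, v1), v2)]
max_residual = max(abs(c) for c in res_u + res_v)
```

Within floating-point accuracy `max_residual` is zero and `prod` equals $m^2$ times the unit matrix, in line with (5.103), (5.105), and (5.109).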


Spin-Up and Spin-Down Solutions
In order to make contact with QM, consider the positive frequency solution with mass $m$ and vanishing 3-momentum $\mathbf{p} = 0$, i.e., the rest frame of the associated particle. In this case the solution to (5.103) takes the form

$$
u(p) = \sqrt{m} \begin{pmatrix} \xi \\ \xi \end{pmatrix} ,
\tag{5.111}
$$

where $\xi$ is an arbitrary 2-component spinor. We can interpret the spinor $\xi$ by looking at the rotation generator (5.19). We see that $\xi$ transforms under rotations as an ordinary 2-component spinor of the rotation group, and therefore determines the spin orientation of the Dirac solution in the usual way. E.g., when $\xi^T = (1, 0)$, the corresponding field has spin up along the $z$-axis. After quantization, this will become the spin of the associated particle.$^{44}$

$^{44}$ In the rest of this section, we will indulge in an abuse of terminology and refer to the classical solutions to the Dirac equations as particles, even though they have no such interpretation before quantization.

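Starting from (5.111), we now consider the particle with spin $\xi^T = (1, 0)$ and boost it along the $z$-direction with $p^\mu = (E, 0, 0, p_z)$. The solution (5.107) to the Dirac equation becomes

$$
u(p) = \begin{pmatrix} \sqrt{E - p_z} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\[2mm] \sqrt{E + p_z} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix}
= \sqrt{m} \begin{pmatrix} e^{-y/2} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\[2mm] e^{y/2} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix} ,
\tag{5.112}
$$

where in the last step we have introduced the rapidity

$$
y = \frac{1}{2} \ln\left(\frac{E + p_z}{E - p_z}\right) ,
\tag{5.113}
$$

which is related to $E$ and $p_z$ via

$$
E = \frac{m}{2}\left(e^{y} + e^{-y}\right) , \qquad p_z = \frac{m}{2}\left(e^{y} - e^{-y}\right) .
\tag{5.114}
$$

Notice that rapidities are, unlike speeds at relativistic velocities, additive quantities. This feature explains why in particle physics rapidities are often used instead of velocities. For large boosts, i.e., $E \gg m$ or equivalently $y \gg 1$, the result (5.112) turns into

$$
u(p) \to \sqrt{2E} \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} .
\tag{5.115}
$$

In the same limit, one obtains for a particle with spin $\xi^T = (0, 1)$ the expression

$$
u(p) \to \sqrt{2E} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} .
\tag{5.116}
$$

This implies that in the limit $y \to \infty$ the states degenerate into the 2-component spinors of a massless particle. We now also understand the reason for the factor of $\sqrt{m}$ in (5.111). It is necessary to keep the spinor expressions finite in the massless limit.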
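The rapidity relations are simple enough to try out directly. The following sketch (with illustrative values for `m`, `y1`, `y2`, and `y_large` that are my own choices) checks that (5.113) inverts (5.114), that two successive boosts along $z$ add in rapidity although their velocities do not, and that the boosted spin-up spinor (5.112) approaches the form (5.115) for large $y$.

```python
import math

def rapidity(E, pz):
    """Rapidity y = (1/2) ln((E + pz) / (E - pz)), cf. (5.113)."""
    return 0.5 * math.log((E + pz) / (E - pz))

def boosted_spin_up(m, y):
    """The four components of u(p) in (5.112) for xi^T = (1, 0)."""
    return [math.sqrt(m) * math.exp(-y / 2), 0.0,
            math.sqrt(m) * math.exp(y / 2), 0.0]

m = 1.0
y1, y2 = 0.7, 1.1

# (5.114): E = m cosh(y), pz = m sinh(y); (5.113) must give y back.
E, pz = m * math.cosh(y1), m * math.sinh(y1)
y_back = rapidity(E, pz)

# Additivity: combine the two boosts with the relativistic velocity-addition law.
v1, v2 = math.tanh(y1), math.tanh(y2)
v12 = (v1 + v2) / (1 + v1 * v2)
y12 = math.atanh(v12)

# Large-boost limit: u(p) / sqrt(2E) should approach (0, 0, 1, 0), cf. (5.115).
y_large = 12.0
E_large = m * math.cosh(y_large)
u_scaled = [c / math.sqrt(2 * E_large) for c in boosted_spin_up(m, y_large)]
```

Here `y12` agrees with `y1 + y2`, while `v12` is clearly not `v1 + v2`. The upper component of `u_scaled` is suppressed by a factor $e^{-y}$ relative to the lower one.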
Helicity
The solutions (5.115) and (5.116) are the eigenstates of the helicity operator

$$
h = \frac{i}{2}\,\epsilon_{ijk}\,\hat{p}^i S^{jk} = \frac{1}{2}\,\hat{p}_i \begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix} ,
\tag{5.117}
$$

where $S^{ij}$ is the rotation generator (5.14). The massless field in (5.115) has helicity $1/2$ and is said to be right-handed, while the one in (5.116) has helicity $-1/2$ and is called left-handed. Notice that the helicity of a massive particle depends on the frame of reference, since one can always boost to a frame in which its momentum is in the opposite direction, but its spin is unchanged. For a massless particle, which travels at the speed of light, one cannot perform such a boost. This also explains the origin of the notation $\psi_{L,R}$ for Weyl spinors. The solutions of the Weyl equations (5.49) are states of definite helicity, corresponding to left- and right-handed particles, respectively. The Lorentz invariance of helicity (for a massless particle) is manifest in the notation of Weyl spinors, since $\psi_L$ and $\psi_R$ live in different representations of the Lorentz group.
Spinor Products
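There are a number of identities that will be very useful in the following section, regarding the (inner) products of the spinors $u(p)$ and $v(p)$. For convenience, we define a basis $\xi^r$ and $\eta^r$ with $r = 1, 2$ for the 2-component spinors such that

$$
\xi^{r\dagger}\xi^s = \delta^{rs} , \qquad \eta^{r\dagger}\eta^s = \delta^{rs} .
\tag{5.118}
$$

E.g., one can take

$$
\xi^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad \xi^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} ,
\tag{5.119}
$$

and similarly for $\eta^r$. Let us first look at the positive frequency solutions $u(p)$. We can take the inner product of 4-component spinors in two different ways, i.e., $u^{r\dagger}(p)\, u^s(p)$ or $\bar u^r(p)\, u^s(p)$. Of course, only the latter object is Lorentz invariant, but it will turn out that the former is needed when we quantize the theory. So let me state both. One has

$$
\begin{aligned}
u^{r\dagger}(p)\, u^s(p) &= \left(\xi^{r\dagger}\sqrt{p\cdot\sigma} ,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^s \\ \sqrt{p\cdot\bar\sigma}\,\xi^s \end{pmatrix} \\
&= \xi^{r\dagger}(p\cdot\sigma)\,\xi^s + \xi^{r\dagger}(p\cdot\bar\sigma)\,\xi^s = 2\,\xi^{r\dagger} p_0\,\xi^s = 2 p_0\,\delta^{rs} ,
\end{aligned}
\tag{5.120}
$$

while the Lorentz-invariant inner product is

$$
\begin{aligned}
\bar u^r(p)\, u^s(p) &= \left(\xi^{r\dagger}\sqrt{p\cdot\sigma} ,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^s \\ \sqrt{p\cdot\bar\sigma}\,\xi^s \end{pmatrix} \\
&= \xi^{r\dagger}\sqrt{p\cdot\sigma}\sqrt{p\cdot\bar\sigma}\,\xi^s + \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\sqrt{p\cdot\sigma}\,\xi^s = 2m\,\delta^{rs} .
\end{aligned}
\tag{5.121}
$$

Here we have used (5.105) in order to arrive at the final expression. For the negative frequency solutions $v(p)$, one derives in an analogous way

$$
v^{r\dagger}(p)\, v^s(p) = 2 p_0\,\delta^{rs} , \qquad \bar v^r(p)\, v^s(p) = -2m\,\delta^{rs} .
\tag{5.122}
$$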

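We can also compute the Lorentz-invariant inner product between $u^r(p)$ and $v^s(p)$. We find

$$
\begin{aligned}
\bar u^r(p)\, v^s(p) &= \left(\xi^{r\dagger}\sqrt{p\cdot\sigma} ,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta^s \\ -\sqrt{p\cdot\bar\sigma}\,\eta^s \end{pmatrix} \\
&= \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\sqrt{p\cdot\sigma}\,\eta^s - \xi^{r\dagger}\sqrt{p\cdot\sigma}\sqrt{p\cdot\bar\sigma}\,\eta^s = 0 ,
\end{aligned}
\tag{5.123}
$$

and similarly $\bar v^r(p)\, u^s(p) = 0$. The solutions $u(p)$ and $v(p)$ are thus orthogonal to each other. Let us furthermore calculate $u^{r\dagger}(p)\, v^s(-p)$ and $v^{r\dagger}(p)\, u^s(-p)$.$^{45}$ Defining $\tilde p^\mu = (p^0, -\mathbf{p})$, one has in the first case

$$
\begin{aligned}
u^{r\dagger}(p)\, v^s(-p) &= \left(\xi^{r\dagger}\sqrt{p\cdot\sigma} ,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} \sqrt{\tilde p\cdot\sigma}\,\eta^s \\ -\sqrt{\tilde p\cdot\bar\sigma}\,\eta^s \end{pmatrix} \\
&= \xi^{r\dagger}\sqrt{(p\cdot\sigma)(\tilde p\cdot\sigma)}\,\eta^s - \xi^{r\dagger}\sqrt{(p\cdot\bar\sigma)(\tilde p\cdot\bar\sigma)}\,\eta^s .
\end{aligned}
\tag{5.124}
$$

Here the term under the first square root is given by $(p\cdot\sigma)(\tilde p\cdot\sigma) = (p^0 - p^i\sigma^i)(p^0 + p^i\sigma^i) = p_0^2 - \mathbf{p}^2 = m^2$. The same result holds for $(p\cdot\bar\sigma)(\tilde p\cdot\bar\sigma)$. This means that the two terms in the last line of (5.124) cancel, leaving us with

$$
u^{r\dagger}(p)\, v^s(-p) = v^{r\dagger}(p)\, u^s(-p) = 0 .
\tag{5.125}
$$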

Spin Sums
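In evaluating Feynman diagrams, we will often wish to sum over the polarization states of a fermion. We can derive the relevant spin sums (or completeness relations) by simple calculations. We start by computing

$$
\begin{aligned}
\sum_{r=1,2} u^r(p)\,\bar u^r(p) &= \sum_{r=1,2} \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^r \\ \sqrt{p\cdot\bar\sigma}\,\xi^r \end{pmatrix} \left(\xi^{r\dagger}\sqrt{p\cdot\sigma} ,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\right) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \\
&= \begin{pmatrix} \sqrt{p\cdot\sigma}\sqrt{p\cdot\bar\sigma} & \sqrt{p\cdot\sigma}\sqrt{p\cdot\sigma} \\ \sqrt{p\cdot\bar\sigma}\sqrt{p\cdot\bar\sigma} & \sqrt{p\cdot\bar\sigma}\sqrt{p\cdot\sigma} \end{pmatrix}
= \begin{pmatrix} m & p\cdot\sigma \\ p\cdot\bar\sigma & m \end{pmatrix} = \slashed{p} + m .
\end{aligned}
\tag{5.126}
$$

Notice that the two spinors appearing on the left-hand side of (5.126) are not contracted. In the derivation of the latter equation, we have used that

$$
\sum_{r=1,2} \xi^r\xi^{r\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} .
\tag{5.127}
$$

Similarly, one derives

$$
\sum_{r=1,2} v^r(p)\,\bar v^r(p) = \slashed{p} - m .
\tag{5.128}
$$

Again, it is crucial that

$$
\sum_{r=1,2} \eta^r\eta^{r\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} .
\tag{5.129}
$$

$^{45}$ Our notation is such that with $u(-p)$ we in fact mean $u(-\mathbf{p})$ etc.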
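The spinor products and spin sums of this section lend themselves to a quick numerical cross-check. The sketch below assumes the chiral representation, the basis (5.119) for both $\xi^r$ and $\eta^r$, and the closed form $\sqrt{p\cdot\sigma} = (p\cdot\sigma + m)/\sqrt{2(E+m)}$ for the matrix square root, which is an assumption not taken from the notes; all helper names are illustrative.

```python
import math

S = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def p_sigma(E, p, sign):
    """E + sign * (p_vec . sigma_vec): sign=-1 gives p.sigma, sign=+1 gives p.sigma-bar."""
    return [[E * (i == j) + sign * sum(p[k] * S[k][i][j] for k in range(3))
             for j in range(2)] for i in range(2)]

def root(A, E, m):
    """Positive square root of p.sigma or p.sigma-bar (assumed closed form)."""
    n = math.sqrt(2 * (E + m))
    return [[(A[i][j] + m * (i == j)) / n for j in range(2)] for i in range(2)]

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

m, p = 1.5, [0.2, 0.5, -0.9]
E = math.sqrt(m * m + sum(c * c for c in p))
ps, psb = p_sigma(E, p, -1), p_sigma(E, p, +1)
rs, rsb = root(ps, E, m), root(psb, E, m)
basis = [[1, 0], [0, 1]]          # xi^r = eta^r for r = 1, 2

# 4-component spinors (5.107) and (5.110) as flat lists of length 4.
u = [apply(rs, x) + apply(rsb, x) for x in basis]
v = [apply(rs, x) + [-c for c in apply(rsb, x)] for x in basis]

def dagger_dot(a, b):
    """a^dagger b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def bar_dot(a, b):
    """a-bar b = a^dagger gamma^0 b; gamma^0 swaps upper and lower 2-spinors."""
    return dagger_dot(a, b[2:] + b[:2])

def bar(a):
    """Row vector a-bar = a^dagger gamma^0."""
    row = [x.conjugate() for x in a]
    return row[2:] + row[:2]

# Left-hand side of (5.126) as a 4x4 matrix.
spin_sum = [[sum(u[r][i] * bar(u[r])[j] for r in range(2)) for j in range(4)]
            for i in range(4)]

# Right-hand side of (5.126): blocks (m, p.sigma; p.sigma-bar, m).
pslash_plus_m = [[m * (i == j) for j in range(4)] for i in range(4)]
for i in range(2):
    for j in range(2):
        pslash_plus_m[i][j + 2] += ps[i][j]
        pslash_plus_m[i + 2][j] += psb[i][j]
```

With these definitions `dagger_dot(u[r], u[s])` reproduces $2p_0\delta^{rs}$, `bar_dot(u[r], u[s])` gives $2m\delta^{rs}$, `bar_dot(v[r], v[s])` gives $-2m\delta^{rs}$, `bar_dot(u[r], v[s])` vanishes, and `spin_sum` agrees entry by entry with `pslash_plus_m`.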

5.5 Quantization of Dirac Theory

We are now ready to construct the quantum version of the free Dirac field, starting from the relevant action (5.34). We will first proceed naively and treat $\psi$ as we have done in the case of the scalar field. Yet, we will see pretty fast that things go wrong, and we will have to reconsider how to quantize the Dirac theory. Walking down this blind alley will, however, allow us to better understand the relation between spin and statistics. So in the end, it will be quite a useful detour.
Little Detour
We start in the usual way by calculating the momentum conjugate to $\psi$. In fact, we already did this in (5.50), and know that $\pi = i\psi^\dagger$, which does not involve the time derivative of $\psi$. This makes perfect sense, because the Dirac equation is first order in time, so that we need only to specify $\psi$ and $\psi^\dagger$ on an initial time slice to determine the full evolution.
In order to quantize the theory we then proceed in analogy with the Klein-Gordon field, and promote $\psi$ and $\psi^\dagger$ to operators, satisfying the following canonical (equal time) commutation relations

$$
\begin{aligned}
[\psi_\alpha(\mathbf{x}), \psi_\beta(\mathbf{y})] &= [\psi^\dagger_\alpha(\mathbf{x}), \psi^\dagger_\beta(\mathbf{y})] = 0 , \\
[\psi_\alpha(\mathbf{x}), \psi^\dagger_\beta(\mathbf{y})] &= \delta_{\alpha\beta}\,\delta^{(3)}(\mathbf{x} - \mathbf{y}) ,
\end{aligned}
\tag{5.130}
$$

where $\alpha$ and $\beta$ denote spinor indices. This already looks peculiar. If $\psi$ were real-valued the left-hand side would be antisymmetric under exchange of $\mathbf{x}$ and $\mathbf{y}$, while the right-hand side is symmetric. But $\psi$ is complex, so we do not have a contradiction yet. In fact, we will soon learn that much worse problems arise when we impose commutation relations on the Dirac field. But it is instructive to see how far we can get, in order to better understand the relation between spin and statistics. So let's press on.
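Since we are dealing with a free theory, where any classical solution is a sum of plane waves, we may write the quantum operators in the Schrödinger picture as

$$
\begin{aligned}
\psi(\mathbf{x}) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^r_{\mathbf{p}}\, u^r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} + b^{r\dagger}_{\mathbf{p}}\, v^r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \right] , \\
\psi^\dagger(\mathbf{x}) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^{r\dagger}_{\mathbf{p}}\, u^{r\dagger}(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} + b^{r}_{\mathbf{p}}\, v^{r\dagger}(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} \right] ,
\end{aligned}
\tag{5.131}
$$

where the operators $a^{r\dagger}_{\mathbf{p}}$ and $b^{r\dagger}_{\mathbf{p}}$ create particles associated to the positive energy solutions $u^r(p)\exp(-ip\cdot x)$ and negative energy solutions $v^r(p)\exp(ip\cdot x)$, respectively. The $a^r_{\mathbf{p}}$ and $b^r_{\mathbf{p}}$ are the corresponding annihilation operators. Like in the case of scalar fields the commutation relations of the fields (5.130) lead to commutation relations for the ladder operators. The non-vanishing commutators are

$$
\begin{aligned}
[a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}] &= (2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} , \\
[b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}] &= -(2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} .
\end{aligned}
\tag{5.132}
$$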

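Notice that the commutator $[b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}]$ has a strange minus sign on the right-hand side. It is not obvious that this sign causes trouble, but we should be aware of it. With the commutation relations (5.132) at hand, it is straightforward to show that the relations (5.130) hold. One has

$$
\begin{aligned}
[\psi(\mathbf{x}), \psi^\dagger(\mathbf{y})] &= \sum_{r,s=1,2} \int \frac{d^3p\, d^3q}{(2\pi)^6} \frac{1}{\sqrt{4E_{\mathbf{p}}E_{\mathbf{q}}}} \Big[ [a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}]\, u^r(p)\, u^{s\dagger}(q)\, e^{i(\mathbf{p}\cdot\mathbf{x} - \mathbf{y}\cdot\mathbf{q})} \\
&\qquad\qquad + [b^{r\dagger}_{\mathbf{p}}, b^{s}_{\mathbf{q}}]\, v^r(p)\, v^{s\dagger}(q)\, e^{-i(\mathbf{p}\cdot\mathbf{x} - \mathbf{y}\cdot\mathbf{q})} \Big] \\
&= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \Big[ u^r(p)\,\bar u^r(p)\,\gamma^0\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} + v^r(p)\,\bar v^r(p)\,\gamma^0\, e^{-i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} \Big] .
\end{aligned}
\tag{5.133}
$$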
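In order to simplify this further, we now employ the completeness relations (5.126) and (5.128). It follows that

$$
\begin{aligned}
[\psi(\mathbf{x}), \psi^\dagger(\mathbf{y})] &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \Big[ (\slashed{p} + m)\,\gamma^0\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} + (\slashed{p} - m)\,\gamma^0\, e^{-i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} \Big] \\
&= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \Big[ \left(p_0\gamma^0 + p_i\gamma^i + m\right)\gamma^0 + \left(p_0\gamma^0 - p_i\gamma^i - m\right)\gamma^0 \Big]\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} \\
&= \int \frac{d^3p}{(2\pi)^3}\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} = \delta^{(3)}(\mathbf{x} - \mathbf{y}) ,
\end{aligned}
\tag{5.134}
$$

as promised. Notice that to obtain the second line we have changed the integration variable from $\mathbf{p}$ to $-\mathbf{p}$ in the second term. We also see that the minus sign in the second relation of (5.132) is crucial here, since it is necessary so that the terms $p_i\gamma^i$ cancel in the final expression. It is also easy to show that the first commutation relation in (5.130) is satisfied once the equations (5.132) are imposed. I leave it to the reader to perform the explicit computation.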
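Equipped with (5.132), we can find the explicit form of the Dirac Hamiltonian in terms of ladder operators. The Hamiltonian density can simply be read off from (5.90), since $E = H = \int d^3x\, \mathcal{H}$. Hence, we have

$$
\mathcal{H} = \bar\psi\,(-i\boldsymbol{\gamma}\cdot\boldsymbol{\nabla} + m)\,\psi ,
\tag{5.135}
$$

as a starting point, which we would like to turn into an operator. We first look at

$$
(-i\boldsymbol{\gamma}\cdot\boldsymbol{\nabla} + m)\,\psi = \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \Big[ a^r_{\mathbf{p}}\,(\boldsymbol{\gamma}\cdot\mathbf{p} + m)\, u^r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} + b^{r\dagger}_{\mathbf{p}}\,(-\boldsymbol{\gamma}\cdot\mathbf{p} + m)\, v^r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \Big] .
\tag{5.136}
$$

In order to find this result it is important to notice that the exponentials involve the 3-vector product $\mathbf{p}\cdot\mathbf{x} = p^i x^i$, so that $-i\boldsymbol{\gamma}\cdot\boldsymbol{\nabla}$ acting on $e^{\pm i\mathbf{p}\cdot\mathbf{x}}$ gives $\pm\boldsymbol{\gamma}\cdot\mathbf{p}$ with $\boldsymbol{\gamma}\cdot\mathbf{p} = \gamma^i p^i$, which explains the relative minus sign of the $\boldsymbol{\gamma}\cdot\mathbf{p}$ terms. Using now (5.103) and (5.109) to replace the $\boldsymbol{\gamma}\cdot\mathbf{p}$ terms leads to

$$
(-i\boldsymbol{\gamma}\cdot\boldsymbol{\nabla} + m)\,\psi = \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \sqrt{\frac{E_{\mathbf{p}}}{2}}\;\gamma^0 \Big[ a^r_{\mathbf{p}}\, u^r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} - b^{r\dagger}_{\mathbf{p}}\, v^r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \Big] .
\tag{5.137}
$$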

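We now use this expression to write the Hamiltonian as

$$
\begin{aligned}
H &= \sum_{r,s=1,2} \int \frac{d^3x\, d^3p\, d^3q}{(2\pi)^6} \sqrt{\frac{E_{\mathbf{p}}}{4E_{\mathbf{q}}}}\, \Big[ a^{s\dagger}_{\mathbf{q}}\, u^{s\dagger}(q)\, e^{-i\mathbf{q}\cdot\mathbf{x}} + b^{s}_{\mathbf{q}}\, v^{s\dagger}(q)\, e^{i\mathbf{q}\cdot\mathbf{x}} \Big] \cdot \Big[ a^r_{\mathbf{p}}\, u^r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} - b^{r\dagger}_{\mathbf{p}}\, v^r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \Big] \\
&= \sum_{r,s=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2} \Big[ a^{s\dagger}_{\mathbf{p}} a^r_{\mathbf{p}}\, u^{s\dagger}(p)\, u^r(p) - b^{s}_{\mathbf{p}} b^{r\dagger}_{\mathbf{p}}\, v^{s\dagger}(p)\, v^r(p) \\
&\qquad\qquad - a^{s\dagger}_{\mathbf{p}} b^{r\dagger}_{-\mathbf{p}}\, u^{s\dagger}(p)\, v^r(-p) + b^{s}_{\mathbf{p}} a^{r}_{-\mathbf{p}}\, v^{s\dagger}(p)\, u^r(-p) \Big] ,
\end{aligned}
\tag{5.138}
$$

where in the last two terms we have changed $\mathbf{p}$ to $-\mathbf{p}$. Now is the right time to employ the formulas in (5.120), (5.122), and (5.125), which allow us to get rid of the spinor products. We arrive at the simple result

$$
\begin{aligned}
H &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_{\mathbf{p}} \Big[ a^{r\dagger}_{\mathbf{p}} a^r_{\mathbf{p}} - b^{r}_{\mathbf{p}} b^{r\dagger}_{\mathbf{p}} \Big] \\
&= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_{\mathbf{p}} \Big[ a^{r\dagger}_{\mathbf{p}} a^r_{\mathbf{p}} - b^{r\dagger}_{\mathbf{p}} b^{r}_{\mathbf{p}} + (2\pi)^3\,\delta^{(3)}(0) \Big] .
\end{aligned}
\tag{5.139}
$$

The delta-function term should be familiar to you by now. It is easily dealt with by normal ordering. However, the term $-b^{r\dagger}_{\mathbf{p}} b^{r}_{\mathbf{p}}$ is a complete mess, since it implies that the Hamiltonian is not bounded below, meaning that our quantum theory makes no sense. Taken seriously it would tell us that we could tumble to states of lower and lower energy by continually producing particles by the action of $b^{r\dagger}_{\mathbf{p}}$. Since the above calculation was a little subtle, you might think that it's possible to rescue the theory and get the minus signs to work out right. You can play around with different things, but you'll always find this minus sign cropping up somewhere. And, in fact, it's telling us something important that we missed.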
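Further insight into the structure of the Dirac theory can be gained by investigating the causality of the theory. To do this we should calculate $[\psi(x), \psi^\dagger(y)]$, or more conveniently $[\psi(x), \bar\psi(y)]$, at non-equal times and hope to find that this commutator is zero outside the light-cone. We start this exercise by switching to the Heisenberg picture, thereby restoring the time-dependence of $\psi$ and $\bar\psi$. From (3.104) and (3.106), we infer that

$$
(a^r_{\mathbf{p}})_H = e^{iHt}\, a^r_{\mathbf{p}}\, e^{-iHt} = e^{-iE_{\mathbf{p}}t}\, a^r_{\mathbf{p}} , \qquad
(a^{r\dagger}_{\mathbf{p}})_H = e^{iHt}\, a^{r\dagger}_{\mathbf{p}}\, e^{-iHt} = e^{iE_{\mathbf{p}}t}\, a^{r\dagger}_{\mathbf{p}} ,
\tag{5.140}
$$

while

$$
(b^r_{\mathbf{p}})_H = e^{iHt}\, b^r_{\mathbf{p}}\, e^{-iHt} = e^{-iE_{\mathbf{p}}t}\, b^r_{\mathbf{p}} , \qquad
(b^{r\dagger}_{\mathbf{p}})_H = e^{iHt}\, b^{r\dagger}_{\mathbf{p}}\, e^{-iHt} = e^{iE_{\mathbf{p}}t}\, b^{r\dagger}_{\mathbf{p}} .
\tag{5.141}
$$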

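It immediately follows that

$$
\begin{aligned}
\psi(x) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^r_{\mathbf{p}}\, u^r(p)\, e^{-ip\cdot x} + b^{r\dagger}_{\mathbf{p}}\, v^r(p)\, e^{ip\cdot x} \right] , \\
\bar\psi(x) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^{r\dagger}_{\mathbf{p}}\, \bar u^r(p)\, e^{ip\cdot x} + b^{r}_{\mathbf{p}}\, \bar v^r(p)\, e^{-ip\cdot x} \right] .
\end{aligned}
\tag{5.142}
$$

We can now compute the commutator. One has

$$
\begin{aligned}
[\psi_\alpha(x), \bar\psi_\beta(y)] &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \Big[ \left(u^r(p)\,\bar u^r(p)\right)_{\alpha\beta}\, e^{-ip\cdot(x - y)} + \left(v^r(p)\,\bar v^r(p)\right)_{\alpha\beta}\, e^{ip\cdot(x - y)} \Big] \\
&= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \Big[ (\slashed{p} + m)_{\alpha\beta}\, e^{-ip\cdot(x - y)} + (\slashed{p} - m)_{\alpha\beta}\, e^{ip\cdot(x - y)} \Big] \\
&= (i\slashed{\partial}_x + m)_{\alpha\beta} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \left( e^{-ip\cdot(x - y)} - e^{ip\cdot(x - y)} \right) .
\end{aligned}
\tag{5.143}
$$

Looking back at (3.110) and (3.111), we see that this means that

$$
[\psi_\alpha(x), \bar\psi_\beta(y)] = (i\slashed{\partial}_x + m)_{\alpha\beta}\, \Delta(x - y) .
\tag{5.144}
$$

This expression vanishes outside the light-cone, because the commutator of the real scalar field
$\Delta(x - y) = [\phi(x), \phi(y)]$ does. As a result the quantum version of the Dirac theory is causal.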
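Although there is no problem with causality, it is worthwhile to stare at the commutator in (5.144) a bit longer. If $|0\rangle$ is the vacuum state of the theory,

$$
a^r_{\mathbf{p}}\,|0\rangle = b^r_{\mathbf{p}}\,|0\rangle = 0 ,
\tag{5.145}
$$

for all $r$ and $\mathbf{p}$, then

$$
\begin{aligned}
[\psi_\alpha(x), \bar\psi_\beta(y)] &= \langle 0|\, [\psi_\alpha(x), \bar\psi_\beta(y)]\, |0\rangle \\
&= \langle 0|\, \psi_\alpha(x)\, \bar\psi_\beta(y)\, |0\rangle - \langle 0|\, \bar\psi_\beta(y)\, \psi_\alpha(x)\, |0\rangle .
\end{aligned}
\tag{5.146}
$$

It is important to realize now that the first (second) matrix element receives contributions only from terms containing the $u^r(p)$ ($v^r(p)$) spinors. Explicitly, one has in the first case

$$
\langle 0|\, \psi_\alpha(x)\, \bar\psi_\beta(y)\, |0\rangle = \sum_{r,s=1,2} \int \frac{d^3p\, d^3q}{(2\pi)^6} \frac{1}{\sqrt{4E_{\mathbf{p}}E_{\mathbf{q}}}} \left(u^r(p)\,\bar u^s(q)\right)_{\alpha\beta}\, e^{-i(p\cdot x - q\cdot y)}\, \langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle ,
\tag{5.147}
$$

and a similar expression holds in the second case.
It is now crucial to ask the following questions. Can we say something about the matrix elements $\langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle$ based on the classical symmetries of the Dirac theory? In particular, how does Lorentz invariance constrain the form of the relevant matrix elements? For the ground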
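state $|0\rangle$ to be invariant under translations, we must have $\exp(iP\cdot x)\,|0\rangle = |0\rangle$. In analogy to (5.140) the action of $\exp(iP\cdot x)$ on the ladder operators can be shown to lead to

$$
e^{iP\cdot x}\, a^r_{\mathbf{p}}\, e^{-iP\cdot x} = e^{-ip\cdot x}\, a^r_{\mathbf{p}} , \qquad
e^{iP\cdot x}\, a^{r\dagger}_{\mathbf{p}}\, e^{-iP\cdot x} = e^{ip\cdot x}\, a^{r\dagger}_{\mathbf{p}} .
\tag{5.148}
$$

Analogous expressions hold in the case of $b^r_{\mathbf{p}}$ and $b^{r\dagger}_{\mathbf{p}}$. Therefore,

$$
\langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle = \langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, e^{iP\cdot x}\, |0\rangle = e^{i(p - q)\cdot x}\, \langle 0|\, e^{iP\cdot x}\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle = e^{i(p - q)\cdot x}\, \langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle .
\tag{5.149}
$$

This implies that the matrix element can only be non-zero if $\mathbf{p} = \mathbf{q}$. Similarly, it can be shown that rotational invariance of $|0\rangle$ requires that $r = s$, which should be intuitively clear. From these considerations, one concludes that the matrix element can be written as

$$
\langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle = (2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs}\, A(p) ,
\tag{5.150}
$$

where $A(p)$ is an arbitrary function that is so far undetermined. Inserting the latter result into (5.147) gives

$$
\begin{aligned}
\langle 0|\, \psi_\alpha(x)\, \bar\psi_\beta(y)\, |0\rangle &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \left(u^r(p)\,\bar u^r(p)\right)_{\alpha\beta}\, e^{-ip\cdot(x - y)}\, A(p) \\
&= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}}\, (\slashed{p} + m)_{\alpha\beta}\, e^{-ip\cdot(x - y)}\, A(p) .
\end{aligned}
\tag{5.151}
$$

For this expression to be invariant under boosts, we have to require that $A(p)$ must be a Lorentz scalar, i.e., $A(p) = A(p^2)$. In fact, since $p^2 = m^2$ it follows that $A$ has to be a positive constant. The positivity of $A$ is the result of the positivity of the norm of states in any self-respecting Hilbert space. Hence,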

$$
\langle 0|\, \psi_\alpha(x)\, \bar\psi_\beta(y)\, |0\rangle = A\,(i\slashed{\partial}_x + m)_{\alpha\beta} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}}\, e^{-ip\cdot(x - y)} .
\tag{5.152}
$$

In a similar fashion, we can also calculate the second matrix element in (5.146). The final result reads

$$
\langle 0|\, \bar\psi_\beta(y)\, \psi_\alpha(x)\, |0\rangle = -B\,(i\slashed{\partial}_x + m)_{\alpha\beta} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}}\, e^{ip\cdot(x - y)} ,
\tag{5.153}
$$

where $B$ is another positive constant. The minus sign is important. It arises from the completeness relation (5.128) of the $v^r(p)$ spinors and the sign of $x$ in the exponential. From (5.152) and (5.153) we see that the two terms in the last line of (5.146) would indeed cancel if $A = -B$. Yet, this is impossible since $A$ and $B$ must both be positive.
So how to resolve this apparent contradiction? Setting $A = B = 1$, it follows from (5.152) and (5.153) that (outside the light-cone)

$$
\langle 0|\, \psi_\alpha(x)\, \bar\psi_\beta(y)\, |0\rangle = -\langle 0|\, \bar\psi_\beta(y)\, \psi_\alpha(x)\, |0\rangle ,
\tag{5.154}
$$

which means that the spinor fields anticommute at space-like separation. This suggests that postulating the commutation relations (5.130) for the spinor fields was the mistake that led to the negative energy problem in (5.139).

Fermionic Quantization
The key piece of physics that we obviously missed before is that spin-1/2 particles are fermions,
meaning that they obey Fermi-Dirac statistics with the quantum state picking up a minus sign
upon the interchange of any two particles as indicated by (5.154). This fact is embedded into
the structure of relativistic QFT: the spin-statistics theorem tells us that integer spin fields
must be quantized as bosons, while half-integer spin fields must be quantized as fermions. Any
attempt to do otherwise will lead to an inconsistency.
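All inconsistencies are removed by postulating the equal-time anticommutation relations for the Dirac field,

$$
\begin{aligned}
\{\psi_\alpha(\mathbf{x}), \psi_\beta(\mathbf{y})\} &= \{\psi^\dagger_\alpha(\mathbf{x}), \psi^\dagger_\beta(\mathbf{y})\} = 0 , \\
\{\psi_\alpha(\mathbf{x}), \psi^\dagger_\beta(\mathbf{y})\} &= \delta_{\alpha\beta}\,\delta^{(3)}(\mathbf{x} - \mathbf{y}) ,
\end{aligned}
\tag{5.155}
$$

instead of (5.130). In this case we still have the expansions (5.131) and (5.142) in terms of the ladder operators $a^r_{\mathbf{p}}$, $a^{r\dagger}_{\mathbf{p}}$, $b^r_{\mathbf{p}}$, and $b^{r\dagger}_{\mathbf{p}}$, but the line of reasoning that led to (5.132) now tells us that

$$
\begin{aligned}
\{a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}\} &= (2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} , \\
\{b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}\} &= (2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} ,
\end{aligned}
\tag{5.156}
$$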

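while all other anticommutators vanish identically. Using these anticommutator relations, we can now compute the Hamiltonian again, finding

$$
\begin{aligned}
H &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_{\mathbf{p}} \Big[ a^{r\dagger}_{\mathbf{p}} a^r_{\mathbf{p}} - b^{r}_{\mathbf{p}} b^{r\dagger}_{\mathbf{p}} \Big] \\
&= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_{\mathbf{p}} \Big[ a^{r\dagger}_{\mathbf{p}} a^r_{\mathbf{p}} + b^{r\dagger}_{\mathbf{p}} b^{r}_{\mathbf{p}} - (2\pi)^3\,\delta^{(3)}(0) \Big] .
\end{aligned}
\tag{5.157}
$$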
We see that the anticommutators have saved us from the indignity of an unbounded Hamiltonian. Notice that when normal ordering, we now throw away a negative infinite contribution
proportional to (2)3 (3) (0) and not a positive one as in the case of the scalar field (3.19). In
principle, the negative contribution from fermionic fields could (partially) cancel the positive
contribution arising from bosonic fields. So one could hope that if there is a symmetry relating fermions and bosons to each other, a so-called supersymmetry, the cosmological constant
problem might be solvable. In fact, it can be shown that supersymmetry solves the cosmological constant problem halfway, but does not render a complete solution. If you want to
figure out what "halfway" actually means, I recommend having a look at the excellent review
[15] and the relevant references therein.
For completeness let me also quote the expression for the momentum operator. Inserting
the expansions (5.131) into (5.91), one finds after a straightforward calculation and normal
ordering the following result
\[
\mathbf{P} = \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \mathbf{p} \left[ a^{r\dagger}_p a^r_p + b^{r\dagger}_p b^r_p \right] .
\tag{5.158}
\]

Fermi-Dirac Statistics
Although the ladder operators now obey anticommutation relations, the Hamiltonian (5.157)
has nice commutation relations with them. You can check easily that
\[
[H, a^r_p] = -E_p\, a^r_p , \qquad [H, a^{r\dagger}_p] = E_p\, a^{r\dagger}_p ,
\tag{5.159}
\]
and likewise in the case of $b^r_p$ and $b^{r\dagger}_p$. As in the scalar case (3.43), this implies that we can
again construct a tower of energy eigenstates by acting on $|0\rangle$ with $a^{r\dagger}_p$ and $b^{r\dagger}_p$ to create
particles and antiparticles. E.g., we have the one-particle state
\[
|\mathbf{p}, r\rangle = a^{r\dagger}_p |0\rangle ,
\tag{5.160}
\]
with momentum $\mathbf{p}$ and spin quantum number $r$. The two-particle state
\[
|\mathbf{p}_1, r_1; \mathbf{p}_2, r_2\rangle = a^{r_1\dagger}_{p_1} a^{r_2\dagger}_{p_2} |0\rangle ,
\tag{5.161}
\]
obeys
\[
|\mathbf{p}_1, r_1; \mathbf{p}_2, r_2\rangle = a^{r_1\dagger}_{p_1} a^{r_2\dagger}_{p_2} |0\rangle = -a^{r_2\dagger}_{p_2} a^{r_1\dagger}_{p_1} |0\rangle = -|\mathbf{p}_2, r_2; \mathbf{p}_1, r_1\rangle ,
\tag{5.162}
\]
due to (5.156). This confirms that the particles do satisfy Fermi-Dirac statistics as anticipated.
In particular, we have the Pauli exclusion principle $|\mathbf{p}, r; \mathbf{p}, r\rangle = 0$ for all $\mathbf{p}$ and $r$. Finally, if one
wants to be sure about the spin of the particle, one could act with the angular momentum
operator $J^i = \epsilon^{ijk} \int d^3x\, (\mathcal{J}^0)^{jk}$ constructed from (5.94) to confirm that a stationary particle
$|\mathbf{p} = 0, r\rangle$ does indeed carry intrinsic angular momentum 1/2. This exercise, which is left to
the reader, will show that in the case of $|\mathbf{p} = 0, r\rangle$ only the third term in (5.94) will give a
non-vanishing contribution to the internal angular momentum.
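The antisymmetry (5.162) and the exclusion principle can be made concrete in a finite toy model. The sketch below assumes just two fermionic modes, represented on a four-dimensional Fock space via a Jordan-Wigner construction (a standard device that is not used in these notes); all names are illustrative.

```python
# Two fermionic modes on the 4-dimensional Fock space |n1> (x) |n2>.
# The Jordan-Wigner string Z is an assumption of this toy construction; it
# makes operators of different modes anticommute.

def kron(A, B):
    """Kronecker product of two nested-list matrices."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def transpose(A):
    return [list(col) for col in zip(*A)]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

b = [[0, 1], [0, 0]]      # single-mode annihilation operator
Z = [[1, 0], [0, -1]]     # (-1)^(occupation number) of the first mode
ONE = [[1, 0], [0, 1]]

a1 = kron(b, ONE)         # annihilates mode 1
a2 = kron(Z, b)           # annihilates mode 2, with sign string
a1_dag, a2_dag = transpose(a1), transpose(a2)   # real matrices: dagger = transpose

vacuum = [1, 0, 0, 0]
state_12 = apply(a1_dag, apply(a2_dag, vacuum))   # a1^dagger a2^dagger |0>
state_21 = apply(a2_dag, apply(a1_dag, vacuum))   # a2^dagger a1^dagger |0>
double_occupancy = apply(a1_dag, apply(a1_dag, vacuum))

print(state_12, state_21, double_occupancy)
```

The two orderings produce the same state with opposite sign, and creating the same mode twice gives the zero vector, which is the finite-dimensional analogue of $|\mathbf{p}, r; \mathbf{p}, r\rangle = 0$.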
Dirac's Hole Interpretation
Before discussing the propagator of the Dirac field, a historical remark seems to be in order.
Dirac originally viewed his equation (5.35) as a relativistic version of the Schrödinger equation,
considering $\psi$ as the wavefunction of a single particle with spin 1/2 (a fact which is put in by
hand in Dirac's theory). In order to reinforce this interpretation, he wrote (5.35) as
\[
i \frac{\partial \psi}{\partial t} = \left( -i \boldsymbol{\alpha} \cdot \nabla + m \beta \right) \psi = \hat{H} \psi ,
\tag{5.163}
\]
with $\boldsymbol{\alpha} = \gamma^0 \boldsymbol{\gamma}$ and $\beta = \gamma^0$. The operator $\hat{H}$ appearing in the above equation is then
understood as the one-particle Hamiltonian. Notice that this viewpoint is quite different from
the one we held so far, where $\psi$ is a classical field that gets quantized. In Dirac's view, the
Hamiltonian is defined by (5.163), while for us the Hamiltonian is given by the field operator
(5.157). But for the moment let's stick to (5.163) and see where it led Dirac and where it leads us.
With the interpretation of $\psi$ as a single-particle wavefunction, the plane-wave solutions
(5.102) and (5.108) are thought of as energy eigenstates, satisfying
\[
i \frac{\partial \psi}{\partial t} = i \frac{\partial}{\partial t}\, u(p)\, e^{-ipx} = E_p\, u(p)\, e^{-ipx} = E_p\, \psi ,
\tag{5.164}
\]

and an analogous relation for $\psi = v(p)\, e^{ipx}$ with $E_p$ replaced by $-E_p$. The plane-wave solutions
thus look like positive and negative energy solutions. The spectrum is again unbounded from
below, because there are states $v(p)$ with arbitrarily low energy $-E_p$. At first glance this is
disastrous, just like the unbounded field theory Hamiltonian of (5.157). Dirac's ingenious
solution to this problem was to turn to the Pauli exclusion principle. In 1930, Dirac proposed
that in the true vacuum of the universe, all the negative energy states are filled, so that only
the positive energy states are accessible. The filled negative energy states are referred to as the
Dirac sea. Although you might worry about the infinite negative charge of the vacuum, Dirac
argued that only charge differences would be observable (a trick reminiscent of the normal
ordering prescription we use for field operators).
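That the one-particle Hamiltonian has eigenvalues $\pm E_p$ can be checked without diagonalizing anything: in momentum space $\hat{H} = \boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$ squares to $E_p^2$ times the unit matrix and is traceless, so two eigenvalues are $+E_p$ and two are $-E_p$. The sketch below assumes the chiral representation with $\gamma^0$ off-diagonal and $\gamma^i$ built from $\pm\sigma^i$; this convention and the sample numbers are assumptions, not taken from the notes.

```python
# Momentum-space Dirac Hamiltonian H = alpha.p + beta m in an assumed chiral
# representation, where alpha = gamma^0 gamma = diag(-sigma, sigma) and
# beta = gamma^0 exchanges the two chiralities.

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dirac_hamiltonian(p, m):
    """Return the 4x4 matrix alpha.p + beta m for the 3-momentum p."""
    sp = [[sum(p[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
          for i in range(2)]
    H = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            H[i][j] = -sp[i][j]          # upper-left block: -sigma.p
            H[i + 2][j + 2] = sp[i][j]   # lower-right block: +sigma.p
        H[i][i + 2] = m                  # off-diagonal blocks: m
        H[i + 2][i] = m
    return H

p, m = (0.3, -1.2, 0.5), 1.0             # arbitrary sample values
H = dirac_hamiltonian(p, m)
H_squared = matmul(H, H)
energy_squared = sum(x * x for x in p) + m * m
trace = sum(H[i][i] for i in range(4))
max_deviation = max(abs(H_squared[i][j] - energy_squared * (i == j))
                    for i in range(4) for j in range(4))
print(energy_squared, abs(trace), max_deviation)
```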
Having avoided the problem with the anomalous negative-energy quantum states by introducing an infinite sea of occupied negative energy states, Dirac realized that
his theory made a shocking prediction. Suppose that a negative energy state is excited to a
positive energy state, leaving behind a hole in the Dirac sea. The hole would have all the
properties of the electron, except it would carry positive charge. After flirting with the idea
that it may be the proton,46 Dirac concluded that the hole is a new particle, the positron. It
took only a couple of years before the positron was discovered experimentally in 1932 by Carl
Anderson, with all the physical properties predicted for the Dirac hole.
Although Dirac's physical insight led him to the right answer, we now understand that
the interpretation of the solution of the Dirac equation as a single-particle wavefunction is not really correct.
E.g., Dirac's argument for antimatter relies crucially on the particles being fermions while, as
we have seen already in this course, antiparticles exist for both fermions and bosons. What
we really learn from Dirac's analysis is that there is no consistent way to interpret the solution of the Dirac
equation as a single-particle wavefunction. It is instead to be thought of as a classical field
which has only positive energy solutions, since the Hamiltonian (5.90) is positive definite.
Quantization of this field then gives rise to both particle and antiparticle excitations and
makes the vacuum the state in which no particles exist instead of an infinite sea of particles.
This picture is much more convincing, especially since it recaptures all the valid predictions of
the Dirac sea, such as electron-positron annihilation. On the other hand, the field formulation
does not eliminate all the difficulties raised by the Dirac sea. In particular, the problem of the
vacuum possessing infinite energy is still present.
Feynman Propagator

We now look at the anticommutator of the fields $\psi_\alpha(x)$ and $\bar\psi_\beta(y)$. Dropping the indices $\alpha$ and $\beta$
from here on, we simply write
\[
iS(x - y) = \{\psi(x), \bar\psi(y)\} .
\tag{5.165}
\]

46 Robert Oppenheimer pointed out that an electron and its hole would be able to annihilate each other,
releasing energy on the order of the electron's rest energy in the form of energetic photons. If holes were
protons, stable atoms would thus not exist, which is clearly in contradiction with observations. Hermann Weyl
also noted that a hole should have the same mass as an electron, whereas the proton is about 2000 times
heavier.


Inserting the expansions (5.142), we essentially only have to repeat the calculation that led
to (5.143), to obtain
\[
iS(x - y) = (i\slashed{\partial}_x + m) \left( D(x - y) - D(y - x) \right) ,
\tag{5.166}
\]
where $D(x - y)$ is the propagator (3.114) of the real scalar field. The object $iS(x - y)$ is called
the fermionic propagator.
Some comments seem to be in order here. For space-like separated points $(x - y)^2 < 0$, we
have already seen in (3.117) that $D(x - y) - D(y - x) = 0$. In the bosonic theory, we made
a big deal out of this, since it ensured that $[\phi(x), \phi(y)] = 0$ for $(x - y)^2 < 0$, which we took
as a proof of causality. However, in the case of fermions we now have $\{\psi(x), \bar\psi(y)\} = 0$ for
$(x - y)^2 < 0$. What happened to causality? The best that we can say is that all our observables
are bi-linear in $\psi$ and $\bar\psi$, e.g., the Hamiltonian operator (5.157) or the momentum operator
(5.158). These objects still commute outside the light-cone. The theory remains causal as long
as individual fermionic operators are not observable. If you think this is a weak argument,
remember that no one has ever seen a physical device come back to minus itself when you
rotate by $2\pi$! Notice furthermore that the propagator satisfies $(i\slashed{\partial}_x - m) S(x - y) = 0$, since
$(\partial_x^2 + m^2) D(x - y) = 0$ using the on-shell condition $p^2 = m^2$.
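By a similar calculation to that above, we can determine the VEVs of the bi-linears,
\[
\begin{aligned}
\langle 0|\psi(x) \bar\psi(y)|0\rangle &= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_p}\, (\slashed{p} + m)\, e^{-ip(x - y)} , \\
\langle 0|\bar\psi(y) \psi(x)|0\rangle &= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2E_p}\, (\slashed{p} - m)\, e^{ip(x - y)} ,
\end{aligned}
\tag{5.167}
\]
which allows us to define the fermionic Feynman propagator $S_F(x - y)$, which is a $4 \times 4$ matrix,
as the following time-ordered product
\[
S_F(x - y) = \langle 0|T \psi(x) \bar\psi(y)|0\rangle
= \theta(x^0 - y^0)\, \langle 0|\psi(x) \bar\psi(y)|0\rangle - \theta(y^0 - x^0)\, \langle 0|\bar\psi(y) \psi(x)|0\rangle ,
\tag{5.168}
\]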

where the minus sign in front of the second term is crucial in the QFT of fermions. When
$(x - y)^2 < 0$, there is no invariant way to determine whether $x^0 > y^0$ or $x^0 < y^0$. In this case
the minus sign is necessary to make the two definitions agree since $\{\psi(x), \bar\psi(y)\} = 0$ outside
the light-cone.
In full analogy to the scalar case, there is also a 4-momentum integral representation for
the Feynman propagator. It reads
\[
S_F(x - y) = i \int \frac{d^4p}{(2\pi)^4}\, e^{-ip(x - y)}\, \frac{\slashed{p} + m}{p^2 - m^2 + i\epsilon} ,
\tag{5.169}
\]
and satisfies
\[
(i\slashed{\partial}_x - m)\, S_F(x - y) = i\, \delta^{(4)}(x - y) ,
\tag{5.170}
\]
which means that $S_F(x - y)$ is a Green's function of the Dirac operator.
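The Green's function property rests on the algebraic identity $(\slashed{p} - m)(\slashed{p} + m) = (p^2 - m^2)\, 1$, which cancels the denominator of the momentum-space integrand. A minimal numerical check, assuming the chiral representation of the Dirac matrices and an arbitrary off-shell sample momentum (both assumptions made only for this sketch):

```python
# Check (p-slash - m)(p-slash + m) = (p^2 - m^2) * identity for an off-shell p.
# The chiral representation below is an assumed convention; the identity
# itself only relies on the Clifford algebra.

I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
METRIC = [1, -1, -1, -1]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

GAMMA = [block(ZERO2, I2, I2, ZERO2)] + [
    block(ZERO2, s, [[-x for x in row] for row in s], ZERO2) for s in PAULI
]

def slash(p):
    """gamma^mu p_mu for contravariant components p = (p0, p1, p2, p3)."""
    return [[sum(METRIC[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def shifted(A, c):
    """Return A + c * identity."""
    return [[A[i][j] + (c if i == j else 0) for j in range(4)] for i in range(4)]

p, m = (2.0, 0.3, -1.2, 0.5), 1.0
p_squared = sum(METRIC[mu] * p[mu] ** 2 for mu in range(4))
product = matmul(shifted(slash(p), -m), shifted(slash(p), m))
max_deviation = max(abs(product[i][j] - (p_squared - m ** 2) * (i == j))
                    for i in range(4) for j in range(4))
print(p_squared - m ** 2, max_deviation)
```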


The minus sign that we see in (5.168) also occurs for any string of operators inside any
time-ordered product. While bosonic operators commute inside $T$, fermionic operators anticommute. We have this same behavior for normal-ordered products as well, with fermionic
operators receiving a minus sign when their order is changed. With the understanding that
all fermionic operators anticommute inside time- or normal-ordered products, Wick's theorem
proceeds just as in the bosonic case, which has been outlined in great detail in Section 4.4. In
the fermionic case, we define the contraction as
\[
\overbracket{\psi(x)\, \bar\psi(y)} = T\big(\psi(x) \bar\psi(y)\big) - {:}\psi(x) \bar\psi(y){:} = S_F(x - y) .
\tag{5.171}
\]

Yukawa Theory
Based on the experience gained in Section 4.6, it is now straightforward to work out the Feynman rules needed to calculate fermion correlation functions. Let us for definiteness consider
the case of the Yukawa theory,
\[
\mathcal{L} = \frac{1}{2} (\partial_\mu \phi)^2 - \frac{1}{2} \mu^2 \phi^2 + \bar\psi (i\slashed{\partial} - m) \psi - \lambda\, \phi\, \bar\psi \psi ,
\tag{5.172}
\]
which describes the interaction of a scalar field $\phi$ with mass $\mu$ and a Dirac field $\psi$ with mass
$m$. Couplings of this type appear in the SM, between fermions and the Higgs boson, and
give mass to the fermionic dofs after electroweak symmetry breaking. In that context, the
fermions can be charged leptons (possibly neutrinos) or quarks. If you wish, (5.172) is thus the
proper version of the scalar Yukawa theory of (4.8). Notice, however, that there is an important
difference following from the dimensions of the involved fields. We still have $[\phi] = 1$, but the
kinetic term of the fermion requires that $[\psi] = 3/2$. Thus, unlike in the case with only scalars,
the coupling is dimensionless, i.e., $[\lambda] = 0$.
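The dimension counting behind this statement takes three lines. As a short worked sketch, assuming natural units $\hbar = c = 1$ in four space-time dimensions, so that the action is dimensionless and every term in the Lagrangian density has mass dimension 4:

```latex
% Mass dimensions in d = 4 with [\partial_\mu] = 1 and [\mathcal{L}] = 4 (assumed conventions).
\begin{align*}
[(\partial_\mu \phi)^2] = 4 \;&\Rightarrow\; 2\,(1 + [\phi]) = 4 \;\Rightarrow\; [\phi] = 1 , \\
[\bar\psi\, i\gamma^\mu \partial_\mu\, \psi] = 4 \;&\Rightarrow\; 2\,[\psi] + 1 = 4 \;\Rightarrow\; [\psi] = 3/2 , \\
[\lambda\, \phi\, \bar\psi \psi] = 4 \;&\Rightarrow\; [\lambda] + 1 + 3 = 4 \;\Rightarrow\; [\lambda] = 0 .
\end{align*}
```

For comparison, a cubic coupling of three scalar fields carries mass dimension $4 - 3 = 1$, which is the difference to the purely scalar case alluded to above.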
In order to get a grip on the Feynman rules, let us study $\psi\psi \to \psi\psi$ scattering. This
is pretty much the same calculation we have already performed in Section 4.5. The only
minor modification is that now the particles that scatter have spin, while the nucleons $N$ we
considered earlier on are scalars. In analogy to (4.48) we write the initial and final states as
\[
\begin{aligned}
|i\rangle &= \sqrt{4 E_{p_1} E_{p_2}}\; a^{r_1\dagger}_{p_1} a^{r_2\dagger}_{p_2} |0\rangle = |\mathbf{p}_1, r_1; \mathbf{p}_2, r_2\rangle , \\
|f\rangle &= \sqrt{4 E_{q_1} E_{q_2}}\; a^{s_1\dagger}_{q_1} a^{s_2\dagger}_{q_2} |0\rangle = |\mathbf{q}_1, s_1; \mathbf{q}_2, s_2\rangle .
\end{aligned}
\tag{5.173}
\]
Notice that for these states one has to be careful when one takes the adjoint since the fermionic
creation operators anticommute. E.g., the final-state bra is
\[
\langle f| = \sqrt{4 E_{q_1} E_{q_2}}\; \langle 0|\, a^{s_2}_{q_2} a^{s_1}_{q_1} .
\tag{5.174}
\]
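To get a contribution to the scattering of two fermions, we have to calculate the $O(\lambda^2)$ corrections to the $T$-matrix element $i \langle f|T|i\rangle$. The relevant contribution to $iT$ takes the form
\[
\frac{(-i\lambda)^2}{2} \int d^4x\, d^4y\; T\Big( \bar\psi(x) \psi(x) \phi(x)\; \bar\psi(y) \psi(y) \phi(y) \Big) ,
\tag{5.175}
\]
where all fields are in the interaction picture. Just like in the case of the bosonic calculation, the
contribution to scattering comes from the term where the two $\phi$ fields are contracted,
\[
D_F(x - y)\; {:}\bar\psi(x) \psi(x)\, \bar\psi(y) \psi(y){:} \; .
\tag{5.176}
\]

Figure 5.1: Feynman diagrams contributing to $\psi\psi \to \psi\psi$ scattering at order $\lambda^2$. The incoming fermions carry momenta $p_1$ and $p_2$, the outgoing fermions carry momenta $q_1$ and $q_2$.
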

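We can now study the action of the fermionic operators on $|i\rangle$. By expanding the $\psi$
operators, but not the $\bar\psi$ fields, we find
\[
\begin{aligned}
{:}\bar\psi(x) \psi(x)\, \bar\psi(y) \psi(y){:}\; a^{r_1\dagger}_{p_1} a^{r_2\dagger}_{p_2} |0\rangle = {} & -\int \frac{d^3k_1\, d^3k_2}{(2\pi)^6}\, \big( \bar\psi(x)\, u^{t_1}(k_1) \big) \big( \bar\psi(y)\, u^{t_2}(k_2) \big) \\
& \times \frac{e^{-i(k_1 x + k_2 y)}}{\sqrt{4 E_{k_1} E_{k_2}}}\; a^{t_1}_{k_1} a^{t_2}_{k_2} a^{r_1\dagger}_{p_1} a^{r_2\dagger}_{p_2} |0\rangle .
\end{aligned}
\tag{5.177}
\]
Here the $b^{t_1\dagger}_{k_1}$ and $b^{t_2\dagger}_{k_2}$ terms in the expansion of $\psi$ have been ignored since they do not contribute to the considered process at $O(\lambda^2)$, a sum over the spin labels $t_1$ and $t_2$ is understood, and the brackets indicate how the spinor indices are
contracted. Notice finally that the overall minus sign arises from moving $\psi(x)$ past $\bar\psi(y)$.
By anticommuting the annihilation operators past the creation operators and performing the momentum integrations using the delta functions, we then get for the right-hand side of (5.177)
the following expression
\[
\begin{aligned}
\frac{1}{\sqrt{4 E_{p_1} E_{p_2}}} \Big[ & \big( \bar\psi(x)\, u^{r_1}(p_1) \big) \big( \bar\psi(y)\, u^{r_2}(p_2) \big)\, e^{-i(p_1 x + p_2 y)} \\
& - \big( \bar\psi(x)\, u^{r_2}(p_2) \big) \big( \bar\psi(y)\, u^{r_1}(p_1) \big)\, e^{-i(p_1 y + p_2 x)} \Big] |0\rangle .
\end{aligned}
\tag{5.178}
\]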
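Note the minus sign between the two individual terms. We now let $\langle f|$ act on this expression
from the left. Let us first have a look at what happens to the first term in (5.178). Ignoring
prefactors and exponentials, we have
\[
\begin{aligned}
\langle 0|\, a^{s_2}_{q_2} a^{s_1}_{q_1} \big( \bar\psi(x)\, u^{r_1}(p_1) \big) \big( \bar\psi(y)\, u^{r_2}(p_2) \big) |0\rangle = {} & \frac{e^{i(q_1 x + q_2 y)}}{\sqrt{4 E_{q_1} E_{q_2}}}\, \big( \bar u^{s_1}(q_1)\, u^{r_1}(p_1) \big) \big( \bar u^{s_2}(q_2)\, u^{r_2}(p_2) \big) \\
& - \frac{e^{i(q_1 y + q_2 x)}}{\sqrt{4 E_{q_1} E_{q_2}}}\, \big( \bar u^{s_1}(q_1)\, u^{r_2}(p_2) \big) \big( \bar u^{s_2}(q_2)\, u^{r_1}(p_1) \big) .
\end{aligned}
\tag{5.179}
\]
In fact, after relabeling $x \leftrightarrow y$ under the integral, the second term in (5.178) can be shown to give the same result up to a sign, which is compensated by the relative minus sign in (5.178). Both
terms thus add, which cancels the factor of 1/2 in (5.175). Furthermore, the square roots of
energies in (5.179) cancel against the relativistic normalizations of the states (5.173). Putting
everything together and including the Feynman propagator of the $\phi$ field, we end up with
\[
\begin{aligned}
(-i\lambda)^2 \int \frac{d^4x\, d^4y\, d^4k}{(2\pi)^4}\, \frac{i\, e^{ik(x - y)}}{k^2 - \mu^2 + i\epsilon} \Big[ & \big( \bar u^{s_1}(q_1)\, u^{r_1}(p_1) \big) \big( \bar u^{s_2}(q_2)\, u^{r_2}(p_2) \big)\, e^{i[(q_1 - p_1) x + (q_2 - p_2) y]} \\
& - \big( \bar u^{s_1}(q_1)\, u^{r_2}(p_2) \big) \big( \bar u^{s_2}(q_2)\, u^{r_1}(p_1) \big)\, e^{i[(q_2 - p_1) x + (q_1 - p_2) y]} \Big] .
\end{aligned}
\tag{5.180}
\]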
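Performing the integrations over $x$ and $y$ and suppressing a factor $i (2\pi)^4$, which will end up
in $i \langle f|T|i\rangle = i (2\pi)^4\, \delta^{(4)}(p_1 + p_2 - q_1 - q_2)\, \mathcal{A}(\psi\psi \to \psi\psi)$, this becomes
\[
\begin{aligned}
(-i\lambda)^2 \int \frac{d^4k}{k^2 - \mu^2 + i\epsilon} \Big[ & \big( \bar u^{s_1}(q_1)\, u^{r_1}(p_1) \big) \big( \bar u^{s_2}(q_2)\, u^{r_2}(p_2) \big)\, \delta^{(4)}(q_1 - p_1 + k)\, \delta^{(4)}(q_2 - p_2 - k) \\
& - \big( \bar u^{s_1}(q_1)\, u^{r_2}(p_2) \big) \big( \bar u^{s_2}(q_2)\, u^{r_1}(p_1) \big)\, \delta^{(4)}(q_2 - p_1 + k)\, \delta^{(4)}(q_1 - p_2 - k) \Big] ,
\end{aligned}
\tag{5.181}
\]
from which we can immediately read off the result for the scattering amplitude
\[
\mathcal{A}(\psi\psi \to \psi\psi) = (-i\lambda)^2 \left[ \frac{\big( \bar u^{s_1}(q_1)\, u^{r_1}(p_1) \big) \big( \bar u^{s_2}(q_2)\, u^{r_2}(p_2) \big)}{(p_1 - q_1)^2 - \mu^2 + i\epsilon} - \frac{\big( \bar u^{s_1}(q_1)\, u^{r_2}(p_2) \big) \big( \bar u^{s_2}(q_2)\, u^{r_1}(p_1) \big)}{(p_1 - q_2)^2 - \mu^2 + i\epsilon} \right] .
\tag{5.182}
\]
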
Honestly, the derivation of the expression for $\mathcal{A}(\psi\psi \to \psi\psi)$ was a bit tedious. Can it be
done more easily? Yes, it can! Of course, the trick is again to use Feynman diagrams and
rules. The lowest-order Feynman graphs for the scattering of two fermions into two fermions
are shown in Figure 5.1. Staring at those diagrams as well as (5.182), it is, in fact, easy to
guess the Feynman rules that reproduce the final result for the scattering amplitude.
The relevant momentum-space Feynman rules involving fermions and antifermions turn out
to be:
1. For each fermion propagator carrying momentum $p$ one has $\dfrac{i (\slashed{p} + m)}{p^2 - m^2 + i\epsilon}$.
2. For each vertex one has $-i\lambda$.
3. For each external fermion one has $u^s(p)$ (initial state) or $\bar u^s(p)$ (final state).
4. For each external antifermion one has $\bar v^s(p)$ (initial state) or $v^s(p)$ (final state).
5. Impose momentum conservation at each vertex.
6. Integrate over each undetermined momentum, $\displaystyle\int \frac{d^4l}{(2\pi)^4}$.
7. Figure out the overall sign of the diagram.
The Feynman rule for the propagator of the scalar field (indicated by a dashed line) has
already been given in Section 4.6 and external scalar legs just give a trivial factor of 1.
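The rules can be turned into numbers directly. The sketch below evaluates the tree-level amplitude (5.182) for one kinematic point and checks that it changes sign when the two final-state fermions are exchanged. It is only an illustration under stated assumptions: the chiral representation of the Dirac matrices, spinors built as $u(p) \propto (\slashed{p} + m)\, u(0)$ with the normalization $\bar u u = 2m$, sample values for $m$, $\mu$, $\lambda$, and the $i\epsilon$ dropped because $t, u < 0$ in physical scattering. None of the function names come from the notes.

```python
import math

I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

GAMMA0 = block(ZERO2, I2, I2, ZERO2)
GAMMAS = [block(ZERO2, s, [[-x for x in row] for row in s], ZERO2) for s in PAULI]

def slash(p):
    """p-slash = gamma^0 p^0 - gamma^i p^i for p = (p0, p1, p2, p3)."""
    return [[p[0] * GAMMA0[i][j] - sum(p[k + 1] * GAMMAS[k][i][j] for k in range(3))
             for j in range(4)] for i in range(4)]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def u_spinor(p, m, xi):
    """u(p) = (p-slash + m) u(0) / sqrt(2m(E+m)), with u(0) = sqrt(m) (xi, xi)."""
    rest = [math.sqrt(m) * x for x in (xi[0], xi[1], xi[0], xi[1])]
    boosted = [a + m * b for a, b in zip(apply(slash(p), rest), rest)]
    return [x / math.sqrt(2 * m * (p[0] + m)) for x in boosted]

def bar(u):
    """Dirac adjoint u-bar = u^dagger gamma^0 as a row vector."""
    return [sum(complex(u[i]).conjugate() * GAMMA0[i][j] for i in range(4))
            for j in range(4)]

def inner(ubar, u):
    return sum(a * b for a, b in zip(ubar, u))

def minkowski_sq(p):
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def amplitude(p1, r1, p2, r2, q1, s1, q2, s2, m, mu, lam):
    """Tree-level psi psi -> psi psi amplitude: t-channel minus u-channel."""
    u1, u2 = u_spinor(p1, m, r1), u_spinor(p2, m, r2)
    ubar1, ubar2 = bar(u_spinor(q1, m, s1)), bar(u_spinor(q2, m, s2))
    t_var = minkowski_sq([a - b for a, b in zip(p1, q1)])
    u_var = minkowski_sq([a - b for a, b in zip(p1, q2)])
    t_channel = inner(ubar1, u1) * inner(ubar2, u2) / (t_var - mu ** 2)
    u_channel = inner(ubar1, u2) * inner(ubar2, u1) / (u_var - mu ** 2)
    return (-1j * lam) ** 2 * (t_channel - u_channel)

# Sample centre-of-mass kinematics (illustrative values)
m, mu, lam, energy, theta = 1.0, 0.5, 0.7, 2.0, 0.8
mom = math.sqrt(energy ** 2 - m ** 2)
p1, p2 = (energy, 0.0, 0.0, mom), (energy, 0.0, 0.0, -mom)
q1 = (energy, mom * math.sin(theta), 0.0, mom * math.cos(theta))
q2 = (energy, -mom * math.sin(theta), 0.0, -mom * math.cos(theta))
up, down = (1, 0), (0, 1)

u_test = u_spinor(p1, m, up)
dirac_residual = max(abs(a - m * b) for a, b in zip(apply(slash(p1), u_test), u_test))
amp_direct = amplitude(p1, up, p2, down, q1, up, q2, down, m, mu, lam)
amp_swapped = amplitude(p1, up, p2, down, q2, down, q1, up, m, mu, lam)
print(dirac_residual, amp_direct, amp_swapped)
```

The first printed number confirms that the constructed spinor solves $(\slashed{p} - m)\, u(p) = 0$; the two amplitudes differ only by a sign, as Fermi-Dirac statistics demands.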
Several comments regarding the above rules are in order. First, the direction of the momentum on a fermion line is significant. On external lines, the direction of the momentum is
always ingoing (outgoing) for initial-state (final-state) particles. This follows from the expansion of the operators $\psi$ and $\bar\psi$, where the annihilation (creation) operators are multiplied by
$\exp(-ipx)$ ($\exp(ipx)$) as can be seen from (5.142). On internal lines, represented by propagators, the momentum must be assigned in the direction of the particle-number flow (for
electrons, this is the direction of the negative charge flow). It is conventional to draw arrows
on fermion lines to represent the direction of the particle-number flow. The momentum assigned to a fermion then flows in the direction of this arrow, while in the case of an antifermion
particle-number and momentum flow are opposite to each other. Hence an additional arrow,
identifying the momentum flow, has been drawn next to the antifermion line.
Second, in the case of the Yukawa theory the $1/n!$ factor from the Taylor expansion of the
time-ordered exponential is always cancelled by the $n!$ ways of interchanging the vertices to
obtain the same contraction. In the case at hand there is thus no need for symmetry factors,
given that the fields in the interaction term $\phi\, \bar\psi \psi$ cannot replace each other in a contraction.
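Third, the Dirac indices contract together along fermion lines. This happened in the case
of $\psi\psi \to \psi\psi$ scattering (5.182), but will also happen in more complicated diagrams like, e.g., a fermion line carrying the momenta $p_1$, $p_2$, $p_3$, and $p_4$ in succession, which gives
\[
\sim\; \bar u(p_4)\, \frac{i (\slashed{p}_3 + m)}{(p_3^2 - m^2)}\, \frac{i (\slashed{p}_2 + m)}{(p_2^2 - m^2)}\, u(p_1) .
\tag{5.183}
\]
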

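Fourth and finally, we should understand how to determine the correct overall sign of the
diagrams. Let us return to the case of fermion-fermion scattering (5.182). Here the $t$-channel
diagram has a plus sign, while the $u$-channel contribution receives a minus sign. Where
does the relative minus sign between the two graphs come from? Let us look at the Wick
contractions. For the contractions corresponding to the $t$-channel diagram in Figure 5.1, we
have
\[
\langle 0|\, a^{s_2}_{q_2} a^{s_1}_{q_1}\; \bar\psi_x \psi_x\, \bar\psi_y \psi_y\; a^{r_1\dagger}_{p_1} a^{r_2\dagger}_{p_2} |0\rangle ,
\tag{5.184}
\]
where $a^{s_1}_{q_1}$ is contracted with $\bar\psi_x$, $a^{s_2}_{q_2}$ with $\bar\psi_y$, $\psi_x$ with $a^{r_1\dagger}_{p_1}$, and $\psi_y$ with $a^{r_2\dagger}_{p_2}$.
This contraction can be untangled by moving $\bar\psi_y = \bar\psi(y)$ two spaces to the left, and so one picks
up a factor $(-1)^2 = 1$. On the other hand, the contraction corresponding to the $u$-channel
diagram in Figure 5.1 reads
\[
\langle 0|\, a^{s_2}_{q_2} a^{s_1}_{q_1}\; \bar\psi_x \psi_x\, \bar\psi_y \psi_y\; a^{r_1\dagger}_{p_1} a^{r_2\dagger}_{p_2} |0\rangle ,
\tag{5.185}
\]
now with $a^{s_1}_{q_1}$ contracted with $\bar\psi_y$ and $a^{s_2}_{q_2}$ with $\bar\psi_x$, while $\psi_x$ and $\psi_y$ are contracted as before.
Here we only have to move $\bar\psi_y$ one space to the left, giving a factor of $-1$. The relative minus
sign between the two diagrams is a reflection of the Fermi-Dirac statistics. In more complicated
graphs the overall sign can be determined most easily by noting that $(\bar\psi \psi)_x = \bar\psi(x) \psi(x)$,
as well as any other pair of fermions, commutes with any operator. Thus, e.g.,
\[
\begin{aligned}
\ldots (\bar\psi \psi)_x (\bar\psi \psi)_y (\bar\psi \psi)_z (\bar\psi \psi)_w \ldots &= \ldots (+1)\, (\bar\psi \psi)_x (\bar\psi \psi)_z (\bar\psi \psi)_y (\bar\psi \psi)_w \ldots \\
&= \ldots S_F(x - z)\, S_F(z - y)\, S_F(y - w) \ldots ,
\end{aligned}
\tag{5.186}
\]
where $\psi_x$ is contracted with $\bar\psi_z$, $\psi_z$ with $\bar\psi_y$, and $\psi_y$ with $\bar\psi_w$, and with $S_F(x - y)$ given in (5.169). Notice that in the case of the simplest closed fermion loop in
the Yukawa theory the latter prescription leads to
\[
\text{(closed fermion loop through $x$ and $y$)} = (\bar\psi \psi)_x (\bar\psi \psi)_y \Big|_{\text{fully contracted}}
= (-1)\, \mathrm{tr} \Big[ \overbracket{\psi_y \bar\psi_x}\; \overbracket{\psi_x \bar\psi_y} \Big]
= (-1)\, \mathrm{tr} \big[ S_F(y - x)\, S_F(x - y) \big] .
\tag{5.187}
\]
Due to the cyclic property of the trace, changing the ordering of $S_F(y - x)$ and $S_F(x - y)$ of
course gives the same result. The result (5.187) extends straightforwardly to all closed fermion
lines. A fermion loop hence always gives a factor of $-1$ and the trace of the product of fermion
propagators that make up the loop. Equipped with the Feynman rules for the Yukawa theory,
we can now calculate the cross sections for some simple scattering processes. This is quite a
good exercise. You should try it!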
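The bookkeeping of "how many spaces an operator has to be moved" can be automated. A standard way to phrase it, used here as an assumption rather than as the method of these notes, is that the sign of a full contraction of fermionic operators, relative to the product of the pairwise contractions taken in the order in which the operators appear, is $(-1)$ raised to the number of crossing contraction lines. A minimal sketch for the two contractions (5.184) and (5.185), with the operators numbered from left to right:

```python
from itertools import combinations

def contraction_sign(pairs):
    """Sign (-1)^(number of crossings) of a full contraction.

    Each pair (i, j) with i < j gives the positions of two contracted
    operators in the original operator string.
    """
    crossings = 0
    for (a, b), (c, d) in combinations(sorted(pairs), 2):
        # sorted() guarantees a < c; the two lines cross if c < b < d
        if c < b < d:
            crossings += 1
    return (-1) ** crossings

# Positions in <0| a_q2 a_q1 psibar_x psi_x psibar_y psi_y a+_p1 a+_p2 |0>:
#   0: a_q2, 1: a_q1, 2: psibar_x, 3: psi_x, 4: psibar_y, 5: psi_y,
#   6: a+_p1, 7: a+_p2
t_channel = [(1, 2), (0, 4), (3, 6), (5, 7)]   # a_q1-psibar_x, a_q2-psibar_y
u_channel = [(0, 2), (1, 4), (3, 6), (5, 7)]   # a_q2-psibar_x, a_q1-psibar_y

sign_t = contraction_sign(t_channel)
sign_u = contraction_sign(u_channel)
print(sign_t, sign_u)
```

Both contraction patterns involve the same types of pairs, so the crossing count directly reproduces the relative sign between the two diagrams. For a closed loop one additionally has to account for pairs that appear in the order $\bar\psi \ldots \psi$, which is where the extra factor of $-1$ in (5.187) comes from.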

5.6 Problems

i) Show explicitly that the Weyl representation (5.10) satisfies the Clifford algebra (5.7).
Derive the properties (5.15) and (5.16) of the matrices $S$ introduced in (5.12).
Show that the fermion bilinears appearing in (5.30), (5.31), and (5.32) transform as stated there, i.e.,
as a Lorentz scalar, vector, and tensor, respectively. It might be advantageous to look
at infinitesimal transformations and consider separately the transformation properties
of the individual factors under the action of (5.6).
Calculate the transformation properties of (5.35) and (5.36) under Lorentz transformations. You should find that these EOMs are form invariant.
Verify that the fifth Dirac matrix $\gamma_5$ defined as in (5.40) satisfies (5.41) and (5.42). Prove
that the chiral projectors $P_{L,R}$ introduced in (5.43) obey the relations (5.44).
ii) Prove the Gordon identity
\[
\bar u(p')\, \gamma^\mu\, u(p) = \bar u(p') \left[ \frac{(p' + p)^\mu}{2m} + \frac{i \sigma^{\mu\nu} q_\nu}{2m} \right] u(p) .
\tag{5.188}
\]
Here $q^\mu = (p' - p)^\mu$.
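iii) Use (5.7) to show that the following identities involving contractions of 4-dimensional
Dirac matrices are correct:
\[
\begin{aligned}
\gamma^\mu \gamma_\mu &= 4 , \\
\gamma^\mu \gamma^\nu \gamma_\mu &= -2 \gamma^\nu , \\
\gamma^\mu \gamma^\nu \gamma^\rho \gamma_\mu &= 4 \eta^{\nu\rho} , \\
\gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma \gamma_\mu &= -2 \gamma^\sigma \gamma^\rho \gamma^\nu .
\end{aligned}
\tag{5.189}
\]
Employ the anticommutation relation (5.7) in combination with the cyclic property of
the trace to prove the following identities:
\[
\begin{aligned}
\mathrm{tr}\, (1) &= 4 , \\
\mathrm{tr}\, (\text{any odd number of Dirac matrices}) &= 0 , \\
\mathrm{tr}\, (\gamma^\mu \gamma^\nu) &= 4 \eta^{\mu\nu} , \\
\mathrm{tr}\, (\gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma) &= 4 \left( \eta^{\mu\nu} \eta^{\rho\sigma} - \eta^{\mu\rho} \eta^{\nu\sigma} + \eta^{\mu\sigma} \eta^{\nu\rho} \right) , \\
\mathrm{tr}\, (\gamma_5) &= 0 , \\
\mathrm{tr}\, (\gamma^\mu \gamma^\nu \gamma_5) &= 0 , \\
\mathrm{tr}\, (\gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma \gamma_5) &= -4i\, \epsilon^{\mu\nu\rho\sigma} .
\end{aligned}
\tag{5.190}
\]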
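Before proving such identities algebraically it can be reassuring to confirm a few of them numerically. The sketch below checks only relations that do not depend on the sign conventions for $\gamma_5$ and $\epsilon^{\mu\nu\rho\sigma}$; it assumes the chiral representation and the metric signature $(+,-,-,-)$, and the helper names are illustrative.

```python
# Numerical spot checks of gamma-matrix contraction and trace identities in an
# assumed chiral representation with metric eta = diag(1, -1, -1, -1).

I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ETA = [1, -1, -1, -1]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def trace(A):
    return sum(A[i][i] for i in range(4))

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

def scaled(c, A):
    return [[c * x for x in row] for row in A]

def matsum(mats):
    return [[sum(M[i][j] for M in mats) for j in range(4)] for i in range(4)]

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
GAMMA = [block(ZERO2, I2, I2, ZERO2)] + [
    block(ZERO2, s, [[-x for x in row] for row in s], ZERO2) for s in PAULI
]

# tr(gamma^mu gamma^nu) = 4 eta^{mu nu}
trace_error = max(abs(trace(matmul(GAMMA[mu], GAMMA[nu])) - 4 * ETA[mu] * (mu == nu))
                  for mu in range(4) for nu in range(4))

# gamma^mu gamma_mu = 4, with gamma_mu = eta_{mu mu} gamma^mu
contraction = matsum([scaled(ETA[mu], matmul(GAMMA[mu], GAMMA[mu])) for mu in range(4)])
contraction_error = max_abs_diff(contraction, scaled(4, IDENTITY))

# gamma^mu gamma^nu gamma_mu = -2 gamma^nu
sandwich_error = max(
    max_abs_diff(
        matsum([scaled(ETA[mu], matmul(matmul(GAMMA[mu], GAMMA[nu]), GAMMA[mu]))
                for mu in range(4)]),
        scaled(-2, GAMMA[nu]))
    for nu in range(4))

# The trace of three Dirac matrices vanishes
odd_trace_error = max(abs(trace(matmul(matmul(GAMMA[a], GAMMA[b]), GAMMA[c])))
                      for a in range(4) for b in range(4) for c in range(4))

print(trace_error, contraction_error, sandwich_error, odd_trace_error)
```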

iv) Products of Dirac bi-linears obey relations known as Fierz identities. The simplest of
these formulas reads
\[
(\bar u_1 \gamma^\mu P_L u_2)\, (\bar u_3 \gamma_\mu P_L u_4) = -(\bar u_1 \gamma^\mu P_L u_4)\, (\bar u_3 \gamma_\mu P_L u_2) ,
\tag{5.191}
\]
where $u_i$ with $i = 1, \ldots, 4$ are 4-component Dirac spinors (the momentum dependence
has been dropped here for simplicity) and $P_L$ is the left-handed projector introduced in
(5.43). In fact, there are similar rearrangement formulas for any product
\[
(\bar u_1 \Gamma^A u_2)\, (\bar u_3 \Gamma^B u_4) .
\tag{5.192}
\]
Here $\Gamma^A$ and $\Gamma^B$ are any of the 16 combinations of Dirac matrices listed in (5.72). The
goal of this exercise is to derive these Fierz identities.
To begin with, normalize the 16 matrices $\Gamma^A$ such that
\[
\mathrm{tr}\, (\Gamma^A \Gamma^B) = 4 \delta^{AB} .
\tag{5.193}
\]
This gives $\Gamma^A = \{1, \gamma^0, i\gamma^j, \ldots\}$. Write down all elements of this set.
The general form of the Fierz identity is
\[
(\bar u_1 \Gamma^A u_2)\, (\bar u_3 \Gamma^B u_4) = \sum_{M,N} C^{AB}_{MN}\, (\bar u_1 \Gamma^M u_4)\, (\bar u_3 \Gamma^N u_2) ,
\tag{5.194}
\]
with unknown coefficients $C^{AB}_{MN}$. Using the completeness of the set $\Gamma^A$, show that
\[
C^{AB}_{MN} = \frac{1}{16}\, \mathrm{tr}\, (\Gamma^M \Gamma^A \Gamma^N \Gamma^B) .
\tag{5.195}
\]
Employing (5.194) and (5.195) prove (5.191). In addition work out the explicit Fierz
transformation of the product $(\bar u_1 u_2)(\bar u_3 u_4)$.
v) In Section 4.8 we saw that the Yukawa potential for $NN \to NN$ scattering is attractive.
Repeat the calculation for $\psi\psi \to \psi\psi$, $\psi\bar\psi \to \psi\bar\psi$, and $\bar\psi\bar\psi \to \bar\psi\bar\psi$ scattering in the
non-relativistic limit. You might want to use (5.120) to (5.122).
If you understood how to calculate the Yukawa potential, the derivation of the Coulomb
potential, which encodes the interactions of electrons/positrons and the photon field
$A_\mu$ in the non-relativistic limit, is also not difficult. Consider again the three different
cases of particle-particle, particle-antiparticle, and antiparticle-antiparticle scattering.
The QED Feynman rules for the photon ($\gamma$) propagator and vertex between electron
($Q = -1$) and positron ($Q = 1$) are given by
\[
\text{(photon propagator, momentum $p$)} = \frac{-i g_{\mu\nu}}{p^2 + i\epsilon} , \qquad
\text{(vertex)} = i Q e\, \gamma^\mu .
\]


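The propagator of a tensor boson, such as the graviton ($G$), i.e., the force carrier of
gravity, looks like
\[
\text{(graviton propagator, momentum $p$)} = \frac{1}{2} \Big[ (-g_{\mu\rho}) (-g_{\nu\sigma}) + (-g_{\mu\sigma}) (-g_{\nu\rho}) \Big]\, \frac{i}{p^2 + i\epsilon} .
\]
Can you derive from this result the orientation of the gravitational force?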
vi) The Furry theorem states that the sum of all Feynman graphs in QED with an odd
number of external photons (off or on the photon mass shell) and no other external lines
vanishes. In order to prove Furry's theorem, consider ($n = 0, 1, \ldots$)
\[
\langle \Omega|\, T\, j_V^{\mu_1}(x_1) \ldots j_V^{\mu_{2n+1}}(x_{2n+1})\, |\Omega\rangle ,
\tag{5.196}
\]
and show by invoking symmetry arguments that a matrix element of this form vanishes.
Here $|\Omega\rangle$ denotes the true vacuum of the interacting theory and $j_V^\mu(x)$ is the vector
current introduced in (5.96).
vii) The goal of this exercise is to introduce the spinor method and to derive some identities
that will be very useful to calculate scattering amplitudes in the high-energy limit, where
the involved particles can be treated as massless.
Derive an explicit solution for the Dirac equation
\[
\slashed{p}\, u(p) = 0 ,
\tag{5.197}
\]
of a massless fermion. To do so, write out $\slashed{p}$ using the basis
\[
\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad
\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} , \qquad
\gamma_5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} ,
\tag{5.198}
\]
of Dirac matrices. To keep the notation compact you might want to introduce
\[
p^\pm = p^0 \pm p^3 , \qquad e^{\pm i \varphi_p} = \frac{p^1 \pm i p^2}{\sqrt{(p^1)^2 + (p^2)^2}} .
\tag{5.199}
\]
Use the projection operators $P_{L,R}$ in the basis (5.198) to decompose the original solution
into two helicity solutions
\[
u_\pm(p) = P_{R,L}\, u(p) .
\tag{5.200}
\]
Give the explicit form of $u_+$ and $u_-$. Show that $u_\pm^\dagger u_\pm = 2p^0$, which fixes the normalization of the spinors. Relate $u_+$ ($\bar u_+$) with $(\bar u_-)^T$ ($(u_-)^T$) using $\gamma^0$ and $\gamma^2$. What is the
physics behind these relations? So far we have only talked about the positive-energy
solutions $u$. How do the negative-energy solutions $v$ fit into the picture? In particular,
how are $u_\pm$ ($\bar u_\pm$) and $v_\mp$ ($\bar v_\mp$) related in the case of a massless fermion?
Consider now a set of massless momenta $p_i$ with $i = 1, 2, \ldots, n$. We introduce a bra and
ket notation with the spinor labelled by the index $i$ corresponding to the momentum $p_i$,
\[
|i^\pm\rangle = u_\pm(p_i) , \qquad \langle i^\pm| = \bar u_\pm(p_i) .
\tag{5.201}
\]
The basic spinor products are defined as
\[
\langle ij \rangle = \langle i^-|j^+\rangle , \qquad [ij] = \langle i^+|j^-\rangle .
\tag{5.202}
\]
What happens to $\langle i^\pm|j^\pm\rangle$? Show the antisymmetry of the spinor products, i.e.,
\[
\langle ij \rangle = -\langle ji \rangle , \qquad [ij] = -[ji] ,
\tag{5.203}
\]
by using either the explicit expressions for $u_+$ and $u_-$ you have derived earlier or the
charge conjugation properties of the spinors.
For the case when both energies are positive, i.e., $p_i^0 > 0$ and $p_j^0 > 0$, derive analytic
expressions for the spinor products (5.202). Express your result through
\[
s_{ij} = (p_i + p_j)^2 = 2 p_i \cdot p_j ,
\tag{5.204}
\]
and
\[
\cos \varphi_{ij} = \frac{p_i^1 p_j^+ - p_j^1 p_i^+}{\sqrt{|s_{ij}|\, p_i^+ p_j^+}} , \qquad
\sin \varphi_{ij} = \frac{p_i^2 p_j^+ - p_j^2 p_i^+}{\sqrt{|s_{ij}|\, p_i^+ p_j^+}} .
\tag{5.205}
\]
So what is the connection between spinor products and Lorentz products of momenta?
Use your explicit result to show that the two types of spinor products are related by
complex conjugation,
\[
\langle ij \rangle = [ji]^* .
\tag{5.206}
\]
Since spinor products should have simple properties under crossing symmetry, one defines
the spinor product $\langle ij \rangle$ for negative energies by analytic continuation from the positive-energy case, but with $p_{i,j}$ replaced by $-p_{i,j}$ if $p^0_{i,j} < 0$. The spinor product $[ij]$ is then
defined through the identity
\[
\langle ij \rangle [ji] = \mathrm{tr}\, (P_L\, \slashed{p}_i\, \slashed{p}_j) = s_{ij} .
\tag{5.207}
\]
Consider now the spinor string
\[
[i|\gamma^\mu|j\rangle = \bar u_+(p_i)\, \gamma^\mu\, u_+(p_j) ,
\tag{5.208}
\]
a quantity that naturally appears as the current describing the emission of a vector
boson from a right-handed massless fermion line. Notice that the helicity labels on the
spinors can always be suppressed in favor of angle or square brackets as in the spinor
products. So one has $\langle i| = \langle i^-|$, $[i| = \langle i^+|$, $|i\rangle = |i^+\rangle$, and $|i] = |i^-\rangle$. Show the charge
conjugation property of the current
\[
[i|\gamma^\mu|j\rangle = \langle j|\gamma^\mu|i] .
\tag{5.209}
\]
Prove that
\[
|i\rangle [i| = P_R\, \slashed{p}_i , \qquad |i] \langle i| = P_L\, \slashed{p}_i ,
\tag{5.210}
\]
and use these projection operators to show the correctness of the Gordon identity
\[
[i|\gamma^\mu|i\rangle = \langle i|\gamma^\mu|i] = 2 p_i^\mu .
\tag{5.211}
\]
Show by the use of (5.208) that
\[
|i\rangle \langle j| - |j\rangle \langle i| = \langle ji \rangle\, P_R ,
\tag{5.212}
\]
holds and derive from this relation the Schouten identity
\[
\langle ij \rangle \langle kl \rangle = \langle ik \rangle \langle jl \rangle + \langle il \rangle \langle kj \rangle .
\tag{5.213}
\]
The same identity applies when angle brackets are replaced by square brackets. In
explicit calculations (5.208) is a powerful tool since its application can lead to enormous
algebraic simplifications.
Use the Fierz transformation
\[
(\gamma^\mu P_R)_{ij}\, (\gamma_\mu P_L)_{kl} = 2\, (P_L)_{il}\, (P_R)_{kj} ,
\tag{5.214}
\]
to show the simple relation
\[
\langle i|\gamma^\mu|j]\, [k|\gamma_\mu|l\rangle = -2\, \langle il \rangle [jk] ,
\tag{5.215}
\]
as well as the Fierz identity
\[
[i|\gamma^\mu|j\rangle\, \gamma_\mu = 2 \big( |i] \langle j| + |j\rangle [i| \big) .
\tag{5.216}
\]
A similar relation holds for $\langle i|\gamma^\mu|j]$.
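Several of the relations above are easy to test numerically once the spinor products are written out. The sketch below does not solve the exercise; it assumes the closed form $\langle ij \rangle = \sqrt{|s_{ij}|}\, e^{i\varphi_{ij}}$ that follows from (5.204) and (5.205) for positive-energy momenta, obtains $[ij]$ from (5.206), and then checks (5.203), (5.207), and (5.213) for randomly chosen sample momenta. The function names and the numerical values are illustrative.

```python
import math

def massless_momentum(energy, theta, phi):
    """Light-like 4-momentum (E, px, py, pz) from energy and polar angles."""
    return (energy,
            energy * math.sin(theta) * math.cos(phi),
            energy * math.sin(theta) * math.sin(phi),
            energy * math.cos(theta))

def dot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def angle(p_i, p_j):
    """<ij> = sqrt(|s_ij|) exp(i phi_ij), assuming p_i^0 > 0 and p_j^0 > 0."""
    plus_i, plus_j = p_i[0] + p_i[3], p_j[0] + p_j[3]
    perp_i, perp_j = complex(p_i[1], p_i[2]), complex(p_j[1], p_j[2])
    return (perp_i * plus_j - perp_j * plus_i) / math.sqrt(plus_i * plus_j)

def square(p_i, p_j):
    """[ij], obtained from <ji> = [ij]^* as in (5.206)."""
    return angle(p_j, p_i).conjugate()

# Four sample momenta with p^+ = p^0 + p^3 > 0 (illustrative values)
k1 = massless_momentum(1.0, 0.4, 0.3)
k2 = massless_momentum(2.5, 1.1, 2.0)
k3 = massless_momentum(0.7, 2.0, -1.0)
k4 = massless_momentum(1.8, 0.9, 4.0)

# <12>[21] = s_12 = 2 k1.k2
s_check = abs(angle(k1, k2) * square(k2, k1) - 2 * dot(k1, k2))
# <12> = -<21>
antisymmetry_check = abs(angle(k1, k2) + angle(k2, k1))
# Schouten identity <12><34> = <13><24> + <14><32>
schouten_check = abs(angle(k1, k2) * angle(k3, k4)
                     - angle(k1, k3) * angle(k2, k4)
                     - angle(k1, k4) * angle(k3, k2))
print(s_check, antisymmetry_check, schouten_check)
```

The Schouten identity holds here for a simple reason: the angle product is a $2 \times 2$ determinant of two-component objects, and any three two-component vectors are linearly dependent.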


viii) The goal of this exercise is to calculate the squared tree-level matrix elements of the
processes $d\bar u \to e^- \bar\nu_e$ and $d\bar u \to e^- \bar\nu_e\, g$ using the spinor formalism developed above.
The first process $d\bar u \to e^- \bar\nu_e$ describes the production of a massive $W^-$ boson from
the collision of a down ($d$) and an antiup quark ($\bar u$) and the subsequent decay of the
$W^-$ boson into an electron ($e^-$) and an electron antineutrino ($\bar\nu_e$). Draw the relevant
Feynman diagram and write down the corresponding amplitude in spinor notation using
the Feynman rules:
\[
\text{($W$ propagator, momentum $p$)} = \frac{-i g_{\mu\nu}}{p^2 - M_W^2 + i\epsilon} , \qquad
\text{($W \bar u d$ vertex)} = i\, \frac{g_w}{\sqrt{2}}\, V_{ud}\, \gamma^\mu P_L , \qquad
\text{($W e \nu_e$ vertex)} = i\, \frac{g_w}{\sqrt{2}}\, \gamma^\mu P_L .
\]
Here $M_W$ denotes the mass of the $W$ boson, $g_w$ is the weak gauge coupling, and
$V_{ud} \approx 1$ is the complex 11 element of the Cabibbo-Kobayashi-Maskawa (CKM) matrix,
which describes quark mixing in the SM. Notice that the $W$ boson only couples to the
left-handed component of the quark and lepton fields. The deeper significance of this
property will become clear once you learn more about the SM of particle physics.
Simplify your result for the amplitude using the charge conjugation property of the
current (5.209) and the Fierz identity (5.215). Calculate the squared matrix element
and express your result in terms of scalar products of momenta.
The second process is similar to the first one, but more complicated since it involves
the emission of an additional gluon ($g$) from one of the external quark legs. Draw the
possible Feynman graphs for $d\bar u \to e^- \bar\nu_e\, g$ at tree level and write down the amplitude.
In addition to the Feynman rules given already you will need:
\[
\text{(quark-gluon vertex)} = i g_s\, \gamma^\mu T^a , \qquad
\text{(external gluon)} = \epsilon_\mu(p) \ \text{(initial state)} , \quad \epsilon^*_\mu(p) \ \text{(final state)} .
\]
Here $g_s$ is the coupling constant of QCD and $T^a$ with $a = 1, \ldots, 8$ are the generators of
the associated gauge group, i.e., $SU(3)_c$. The symbol $\epsilon_\mu(p)$ stands for the polarization
vector of the initial- or final-state gluon.
In order to calculate the squared matrix element for $d\bar u \to e^- \bar\nu_e\, g$, we also need to
introduce a spinor representation for the polarization vector for gluons with definite
helicity $a = \pm$,
\[
\epsilon_+^\mu(p, \kappa) = \frac{[p|\gamma^\mu|\kappa\rangle}{\sqrt{2}\, \langle \kappa p \rangle} , \qquad
\epsilon_-^\mu(p, \kappa) = \frac{\langle p|\gamma^\mu|\kappa]}{\sqrt{2}\, [p \kappa]} ,
\tag{5.217}
\]
where $p$ is the gluon momentum and $\kappa$ is an auxiliary massless vector, called the reference momentum, reflecting the freedom of on-shell gauge transformations. The objects
introduced in (5.217) have the following properties. Since $\slashed{p}\, |p^\pm\rangle = 0$, the polarization
vector $\epsilon_\pm(p, \kappa)$ is transverse to $p$, i.e.,
\[
\epsilon_\pm(p, \kappa) \cdot p = 0 ,
\tag{5.218}
\]

for any choice of $\kappa$ with $p \cdot \kappa \neq 0$. Complex conjugation acts on the polarization vectors
like
\[
\big( \epsilon_\pm^\mu(p, \kappa) \big)^* = \epsilon_\mp^\mu(p, \kappa) ,
\tag{5.219}
\]
and they are normalized as follows
\[
\epsilon_\pm(p, \kappa) \cdot \big( \epsilon_\pm(p, \kappa) \big)^* = -1 , \qquad
\epsilon_\pm(p, \kappa) \cdot \big( \epsilon_\mp(p, \kappa) \big)^* = 0 .
\tag{5.220}
\]
They also fulfill a completeness relation, which reads
\[
\sum_{a = \pm} \epsilon_a^\mu(p, \kappa)\, \big( \epsilon_a^\nu(p, \kappa) \big)^* = -\eta^{\mu\nu} + \frac{p^\mu \kappa^\nu + p^\nu \kappa^\mu}{p \cdot \kappa} .
\tag{5.221}
\]
Equipped with the definition and the properties of the polarization vectors you can now
actually calculate the matrix element for $W^- + g$ production. Consider the case of the
emission of a positive and negative helicity gluon separately and keep the gauge vector
$\kappa$ arbitrary. Use the charge conjugation, Schouten, and Fierz identities, (5.209), (5.213),
and (5.215), as well as the projection operators (5.210), to reduce both amplitudes to
combinations of basic spinor products (5.202). Also employ momentum conservation,
$\sum_{i=1}^n p_i = 0$, which leads to the identity
\[
\sum_{i=1,\, i \neq j,k}^{n} [ji]\, \langle ik \rangle = 0 .
\tag{5.222}
\]
It is important that your final result for the helicity amplitudes is independent of the
choice of $\kappa$. Could you have obtained your results far more simply by a specific choice of $\kappa$?
Square the $d\bar u \to e^- \bar\nu_e\, g$ amplitude and simplify your answer as much as possible. In the
fundamental representation the generators $T^a$ fulfill $\mathrm{tr}\, (T^a T^b) = T_F\, \delta^{ab}$ with $T_F = 1/2$
and $T^a T^a = C_F\, 1$ where $C_F = (N_c^2 - 1)/(2 N_c) = 4/3$ for $N_c = 3$ corresponding to the
QCD gauge group $SU(3)_c$.
ix) Heavy quark effective theory (HQET) is an effective field theory designed to systematically exploit the simplifications of the interactions of QCD in the heavy-quark limit for
the case of hadrons containing a single heavy quark such as the B and D meson. The
first goal of this exercise is to derive the interactions of a heavy quark with the light dofs
starting from the Lagrangian
(iD
L=Q
/ mQ ) Q ,

(5.223)

where Q is a Dirac spinor representing the heavy quark of mass mQ and


D = igs T a Ga ,

(5.224)

is the covariant derivative, which describes the minimal coupling of quarks to a gluon.
It depends on the QCD coupling constant gs , the gluon fields Ga with a = 1, . . . , 8, and
the generators T a of SU (3)c . The obtained effective description will not only allow us to
150

show that the Lagrangian (5.223) has a spin-flavor symmetry in the limit mQ , but
also provides a systematic and rigorous way to obtain corrections to the infinite mass
limit.
To warm up solve the free Dirac equation for a heavy quark at rest. Use the decomposition
Q(x) = eimQ t Q(0) ,
(5.225)
and plague it into (5.35). What do you observe?
The heavy-quark momentum p can always be decomposed as
p = mQ v + k ,

(5.226)

where v is the 4-velocity of the heavy hadron. Once mQ v , the large kinematical part of
the momentum is singled out, the remaining component k is determined by soft QCD
bound-state interactions, and thus k 2  m2Q . In order to work in an arbitrary frame
one defines
1 v/
,
(5.227)
P =
2
with v/v/ = v 2 = 1. Show that P are projection operators and find the explicit form of
them in the rest frame.
Remove the large-frequency part of the x-dependence in Q(x) resulting from the large
momentum mQ v by plugging

Q(x) = eimQ vx Q(x)


h
i

= eimQ vx P+ Q(x)
+ P Q(x)

(5.228)



= eimQ vx hv (x) + Hv (x) ,
into the Lagrangian (5.223). Notice that (5.228) is the covariant generalization of decomposing Q(x) into upper and lower components. Why?
To decouple the simplified Dirac equation, multiply it by the projection operators and use

$$P_\pm \, \slashed{a} = \slashed{a}_\perp P_\mp \pm v \cdot a \, P_\pm \,, \tag{5.229}$$

where $a_\perp^\mu = a^\mu - v \cdot a \, v^\mu$ for any 4-vector $a^\mu$. From the two resulting
equations derive a relation between $H_v(x)$ and $h_v(x)$ valid up to terms of $\mathcal{O}(1/m_Q^2)$.
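The projector identity used here can be spot-checked numerically for a random 4-vector. This is a minimal sketch, assuming the identity reads $P_\pm \slashed{a} = \slashed{a}_\perp P_\mp \pm v \cdot a \, P_\pm$ with $a_\perp^\mu = a^\mu - v \cdot a \, v^\mu$, the Dirac representation, and the signature $(+,-,-,-)$; the function names `dot`, `slash` and `identity_holds` are hypothetical.

```python
import numpy as np

# Dirac representation and (+,-,-,-) metric: assumed conventions.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
gamma = [np.block([[I2, Z2], [Z2, -I2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in (s1, s2, s3)
]
metric = np.diag([1.0, -1.0, -1.0, -1.0])


def dot(a, b):
    """Minkowski product a.b of two 4-vectors with upper indices."""
    return a @ metric @ b


def slash(a):
    """a-slash = gamma^mu a_mu."""
    a_lower = metric @ a
    return sum(g * c for g, c in zip(gamma, a_lower))


rng = np.random.default_rng(0)
beta = np.array([0.1, 0.4, -0.3])           # some 3-velocity with |beta| < 1
boost = 1.0 / np.sqrt(1.0 - beta @ beta)
v = np.concatenate(([boost], boost * beta))  # 4-velocity with v.v = 1
a = rng.normal(size=4)                       # arbitrary real 4-vector
a_perp = a - dot(v, a) * v                   # part of a orthogonal to v

one = np.eye(4)
P = {+1: (one + slash(v)) / 2, -1: (one - slash(v)) / 2}


def identity_holds(sign):
    """Check P_sign a-slash == a_perp-slash P_(-sign) + sign * (v.a) * P_sign."""
    lhs = P[sign] @ slash(a)
    rhs = slash(a_perp) @ P[-sign] + sign * dot(v, a) * P[sign]
    return np.allclose(lhs, rhs)


checks = [identity_holds(+1), identity_holds(-1)]
print(checks, np.isclose(dot(v, a_perp), 0.0))
```

The check relies only on $\{\gamma^\mu, \gamma^\nu\} = 2 g^{\mu\nu}$ and $v^2 = 1$, so any other representation of the gamma matrices should pass it as well.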
Employ the relation between $H_v(x)$ and $h_v(x)$ to eliminate the field $H_v(x)$ from the
system of equations. Using

$$[D_\mu, D_\nu] = -i g_s G_{\mu\nu} \,, \tag{5.230}$$

find the final form of the EOM of the heavy-quark field $h_v(x)$. In (5.230), we have
introduced the QCD field strength tensor $G_{\mu\nu} = G^a_{\mu\nu} T^a$. In its explicit form the
field strength is given by
$G^a_{\mu\nu} = \partial_\mu G^a_\nu - \partial_\nu G^a_\mu + g_s f^{abc} G^b_\mu G^c_\nu$
with $[T^a, T^b] = i f^{abc} T^c$, where $f^{abc}$ are the fully antisymmetric structure constants.
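The structure constants entering the non-abelian term of the field strength can be tabulated numerically from the commutation relation. The sketch below assumes the standard Gell-Mann basis with $T^a = \lambda^a/2$ and the normalization $\mathrm{tr}(T^a T^b) = \delta^{ab}/2$, neither of which is fixed by the exercise text; under that assumption $f^{abc} = -2i \, \mathrm{tr}([T^a, T^b] T^c)$.

```python
import numpy as np

# Gell-Mann matrices lambda^1 ... lambda^8 (standard basis, assumed here).
inv_sqrt3 = 1.0 / np.sqrt(3.0)
lam = np.zeros((8, 3, 3), dtype=complex)
lam[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
lam[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
lam[2] = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
lam[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
lam[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]
lam[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
lam[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
lam[7] = np.diag([inv_sqrt3, inv_sqrt3, -2 * inv_sqrt3])

T = lam / 2  # generators with tr(T^a T^b) = delta^{ab} / 2


def commutator(x, y):
    """Matrix commutator [x, y]."""
    return x @ y - y @ x


# f^{abc} = -2i tr([T^a, T^b] T^c), using zero-based indices a, b, c.
f = np.zeros((8, 8, 8))
for a in range(8):
    for b in range(8):
        c_ab = commutator(T[a], T[b])
        for c in range(8):
            f[a, b, c] = np.real(-2j * np.trace(c_ab @ T[c]))

# Verify the algebra [T^a, T^b] = i f^{abc} T^c for all pairs (a, b).
closure = all(
    np.allclose(commutator(T[a], T[b]), 1j * np.einsum("c,cij->ij", f[a, b], T))
    for a in range(8)
    for b in range(8)
)
print("algebra closes:", closure)
print("f^{123} =", round(f[0, 1, 2], 6))
```

Because the loop fills every index combination, the total antisymmetry of $f^{abc}$ can be read off directly from the array `f` rather than being imposed by hand.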

Write down the Lagrangian that leads to the EOM for $h_v(x)$ including $\mathcal{O}(1/m_Q)$ terms.
Discuss the spin and flavor properties of the leading term and the power corrections in
the $1/m_Q$ expansion. By going to the heavy-quark rest frame, determine the physical
meaning of the two $\mathcal{O}(1/m_Q)$ corrections. Explain the appearance of the spin and flavor
symmetry (and its breaking) in physical terms. Compare your findings for heavy-light
meson systems with the physics of the hydrogen atom. Point out similarities and differences.
Derive the Feynman rule for the heavy-quark propagator in two ways: once starting from the
HQET Lagrangian and once by expanding the propagator of the free Dirac theory. Also give
the Feynman rule for the interaction of the heavy quark with the gluon.
The masses of the vector and pseudoscalar $B$ and $D$ mesons are experimentally determined to be
$M_{B^*} = 5.33$ GeV, $M_B = 5.28$ GeV and $M_{D^*} = 2.00$ GeV, $M_D = 1.86$ GeV,
respectively. These numbers imply that

$$M_{B^*}^2 - M_B^2 = 0.53 \; \text{GeV}^2 \,, \qquad M_{D^*}^2 - M_D^2 = 0.54 \; \text{GeV}^2 \,, \tag{5.231}$$

which suggests that the difference between the square of a heavy-light vector meson mass
and the square of a heavy-light pseudoscalar meson mass is a constant. Can you explain
this behavior qualitatively using the heavy-quark symmetries you have derived above?
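The arithmetic behind the quoted splittings is quick to reproduce. The sketch below takes the larger mass of each pair to be the vector meson (labelled `B*` and `D*` here, an assumed reading of the quoted numbers) and uses the four mass values exactly as given in the text; the function name is hypothetical.

```python
# Meson masses in GeV as quoted in the exercise; the vector/pseudoscalar
# labelling of each pair is an assumption of this sketch.
masses_gev = {"B*": 5.33, "B": 5.28, "D*": 2.00, "D": 1.86}


def squared_mass_splitting(m_vector, m_pseudoscalar):
    """Return M_V^2 - M_P^2 in GeV^2."""
    return m_vector**2 - m_pseudoscalar**2


split_b = squared_mass_splitting(masses_gev["B*"], masses_gev["B"])
split_d = squared_mass_splitting(masses_gev["D*"], masses_gev["D"])

# Linear splittings M_V - M_P for comparison (in GeV).
linear_b = masses_gev["B*"] - masses_gev["B"]
linear_d = masses_gev["D*"] - masses_gev["D"]

print(f"squared splittings: {split_b:.2f} GeV^2 (B), {split_d:.2f} GeV^2 (D)")
print(f"linear splittings:  {linear_b:.2f} GeV (B), {linear_d:.2f} GeV (D)")
```

With the quoted masses the squared splittings come out as 0.53 and 0.54 GeV$^2$, while the linear splittings differ by almost a factor of three (0.05 GeV versus 0.14 GeV). Since $M_V^2 - M_P^2 = (M_V - M_P)(M_V + M_P)$, the near-constancy of the squared splitting is a statement about how the linear splitting scales with the meson mass.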

References
[1] S. P. Martin, arXiv:hep-ph/9709356.
[2] O. W. Greenberg, Phys. Rev. Lett. 89, 231602 (2002) [arXiv:hep-ph/0201258].
[3] V. A. Kostelecky and N. Russell, arXiv:0801.0287 [hep-ph].
[4] T. D. Lee and C. N. Yang, Phys. Rev. 104 (1956) 254.
[5] C. S. Wu, E. Ambler, R. W. Hayward, D. D. Hoppes and R. P. Hudson, Phys. Rev. 105,
1413 (1957).
[6] J. H. Christenson, J. W. Cronin, V. L. Fitch and R. Turlay, Phys. Rev. Lett. 13, 138
(1964).
[7] M. Kobayashi and T. Maskawa, Prog. Theor. Phys. 49, 652 (1973).
[8] A. D. Sakharov, Pisma Zh. Eksp. Teor. Fiz. 5 (1967) 32 [JETP Lett. 5 (1967) 24] [Sov.
Phys. Usp. 34 (1991) 392] [Usp. Fiz. Nauk 161 (1991) 61].
[9] M. Dine, arXiv:hep-ph/0011376.
[10] R. D. Peccei and H. R. Quinn, Phys. Rev. Lett. 38, 1440 (1977).
[11] S. L. Adler, Phys. Rev. 177, 2426 (1969).
[12] J. S. Bell and R. Jackiw, Nuovo Cim. A 60, 47 (1969).
152

[13] S. L. Adler and W. A. Bardeen, Phys. Rev. 182, 1517 (1969).


[14] S. L. Adler, arXiv:hep-th/0405040.
[15] S. M. Carroll, The Cosmological Constant, Living Rev. Relativity 3, 1 (2001),
http://relativity.livingreviews.org/Articles/lrr-2001-1

153
