Michele Maggiore
Département de Physique Théorique
Université de Genève
Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Michele Maggiore 2023
The moral rights of the author have been asserted
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2023935108
ISBN 9780192867421
ISBN 9780192867438 (pbk.)
DOI: 10.1093/oso/9780192867421.001.0001
Printed and bound by
CPI Group (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
A Maura, Sara, Ilaria e Lorenzo
Preface
This book evolved from the notes of a course that I teach at the University of Geneva, for undergraduate physics students. For many generations of physicists, including mine, the classic references for classical electrodynamics have been the textbook by Jackson and that by Landau and Lifschits.[1] The former is still much used, although more modern and excellent textbooks with a somewhat similar structure, such as Garg (2012) or Zangwill (2013), now exist, while the latter is by now rarely used, even as an auxiliary reference text for a course. Because of my field-theoretical background, as my notes were growing I realized that they were naturally drifting toward what looked to me as a modern version of Landau and Lifschits, and this stimulated me to expand them further into a book.

[1] In the text we will refer to the latest editions of these books, Jackson (1998) and Landau and Lifschits (1975). However, these books went through many editions: the first edition of Jackson appeared in 1962, while that of Landau and Lifschits even dates back to 1939.
While this book is meant as a modern introduction to classical elec-
trodynamics, it is by no means intended as a first introduction to the
subject. The reader is assumed to have already had a first course on
electrodynamics, at a level covered for instance by Griffiths (2017). This
also implies a different structure of the presentation. In a first course
of electrodynamics, it is natural to take a ‘bottom-up’ approach, where
one starts from experimental observations in the simple settings of elec-
trostatics and magnetostatics, and then moves toward time-dependent
phenomena and electromagnetic induction, which eventually leads to
generalizing the equations governing electrostatics and magnetostatics
into the synthesis provided by the full Maxwell's equations. This approach is the natural one for a first introduction because, first of all, it gives the correct historical perspective and shows how Maxwell's equations emerged from the unification of a large body of observations; furthermore, it also allows one to start with more elementary mathematical tools, for the benefit of the student who meets some of them for the first time, while at the same time discovering all these new and fundamental physics concepts. The price that is paid is that the approach, following the historical developments, is sometimes heuristic, and the logic of the arguments and derivations is not always tight.
For this more advanced text, I have chosen instead a 'top-down' approach. Maxwell's equations are introduced immediately (after an introductory chapter on mathematical tools) as the 'definition' of the theory, and their consequences are then systematically developed. This has the advantage of better logical clarity. It will also allow us to always go into the 'real story', rather than presenting at first a simpler version, to be later improved.
of charge, and therefore of current, is derived from the three basic units of length, mass and time.

The SI system is the natural one for applications to the macroscopic world: currents are measured in amperes, voltages in volts, and so on. This makes the SI system the obvious choice for laboratory applications and in electrical engineering, and SI units are by now the almost universal standard for electrodynamics courses at the undergraduate level. The Gaussian system, on the other hand, has advantages in other contexts, and in particular leads to neater formulas when relativistic effects are important.[4]

This state of affairs has led to a rather peculiar situation. In general, undergraduate textbooks of classical electromagnetism always use SI units; in contrast, more advanced textbooks of classical electrodynamics are often split between SI and Gaussian units, and all textbooks on quantum electrodynamics and quantum field theory use Gaussian units. The difficulty of the choice is exemplified by Jackson's textbook, which has been the 'bible' of classical electrodynamics for generations of physicists. The second edition (1975), as the first, used Gaussian units throughout. However, the third edition (1998) switched to SI units for the first 10 chapters, in recognition of the fact that almost all other undergraduate-level textbooks used SI units; then, from Chapter 11 (Special Theory of Relativity) on, it goes back to Gaussian units, in recognition of the fact that they are more appropriate than SI units for relativistic phenomena.[5] Gaussian units are also the most common choice in quantum mechanics textbooks: when computing the energy levels of the hydrogen atom, almost all textbooks use a Coulomb potential in Gaussian units, −e²/r, rather than the SI expression −e²/(4πε₀r).[6]

In this book we will use SI units, since this is nowadays the almost universal standard for an undergraduate textbook on classical electrodynamics. However, it is important to be familiar also with the Gaussian system, as a bridge toward graduate and more specialized courses. This is particularly important for the student who wishes to go into theoretical high-energy physics where, eventually, only the Gaussian system will be used. We will then discuss in Section 2.2 how to quickly translate from SI to Gaussian units, and, in Appendix A, we will provide an explicit translation into Gaussian units of the most important results and formulas of the main text.

[4] Actually, its real virtues appear when combining electromagnetism with quantum mechanics. In this case, the reduction from four to three base units obtained with the Gaussian system can be pushed further, using a system of units where one also sets ℏ = c = 1, with the result that one remains with a single base unit, typically taken to be the mass unit. In quantum field theory this system is so convenient that units ℏ = c = 1 are called natural units (we will briefly mention them in Section 2.2). As a consequence, all generalizations from classical electrodynamics to quantum electrodynamics (and its extensions such as the Standard Model of electroweak and strong interactions) are nowadays uniquely discussed using the Gaussian system (or, rather, a variant of it, Heaviside–Lorentz or rationalized Gaussian units, differing just by the placing of some 4π factors, that we will also introduce in Chapter 2), supplemented by units ℏ = c = 1.

[5] Among the other 'old-time' classics, Landau and Lifschits (1975) used Gaussian units, while the Feynman Lectures on Physics, Feynman et al. (1964), used SI. The first two editions of the classic textbook by Purcell used Gaussian units, but switched to SI for the 3rd edition, Purcell and Morin (2013). Among more recent books, SI is used in Griffiths (2017), Zangwill (2013) and Tong (2015), while Gaussian units are used in Garg (2012) (with frequent translations to SI units).

[6] The most notable exception is the quantum mechanics textbook Griffiths (2004), which uses SI units, consistently with the classical electrodynamics book by the same author.

Finally, I wish to thank Enis Belgacem, Francesco Iacovelli and Michele Mancarella, who gave the exercise sessions of the course for various years, and Stefano Foffa for useful discussions. I thank again Francesco Iacovelli for producing a very large number of figures of the book. I am grateful to Stephen Blundell for extremely useful comments on the manuscript. Last but not least, as with my previous books with OUP, I wish to thank Sönke Adlung, for his friendly and always very useful advice, as well as all the staff at OUP.
1 Mathematical tools 1
1.1 Vector algebra 1
1.2 Differential operators on scalar and vector fields 3
1.3 Integration of vector fields. Gauss’s and Stokes’s theorems 5
1.4 Dirac delta 7
1.5 Fourier transform 13
1.6 Tensors and rotations 16
1.7 Groups and representations 18
1.7.1 Reducible and irreducible representations 20
1.7.2 The rotation group and its irreducible tensor representations 22
2 Systems of units 27
2.1 The SI system 27
2.2 Gaussian units 31
2.3 SI or Gaussian? 36
3 Maxwell’s equations 39
3.1 Maxwell’s equations in vector form 39
3.1.1 Local form of Maxwell’s equations 39
3.1.2 Integrated form of Maxwell’s equations 42
3.2 Conservation laws 43
3.2.1 Conservation of the electric charge 43
3.2.2 Energy, momentum, and angular momentum of
the electromagnetic field 45
3.3 Gauge potentials and gauge invariance 52
3.4 Symmetries of Maxwell’s equations 55
5 Electromagnetic energy 99
5.1 Work and energy in electrostatics 99
5.2 Energy stored in a static electric field 101
5.2.1 Continuous charge distribution 101
5.2.2 The point-like limit and particle self-energies 104
5.2.3 Energy of charges in an external electric field 107
5.2.4 Energy of a system of conductors 108
5.3 Work and energy in magnetostatics 109
5.4 Energy stored in a static magnetic field 111
5.5 Forces and mechanical potentials 114
5.5.1 Mechanical potentials for conductors 118
5.5.2 Mechanical potentials in magnetostatics 124
5.6 Solved problems 128
References 431
Index 433
1 Mathematical tools

Classical electrodynamics requires a good familiarity with a set of mathematical tools, which will then find applications basically everywhere in physics. We find it convenient to begin by recalling some of these concepts so that, later, the understanding of the physics will not be obscured by the mathematical manipulations. We will focus here on tools that will be of more immediate use. Further mathematical tools will be discussed along the way, in the rest of the book, as they will be needed.

1.1 Vector algebra

A vector a has Cartesian components a_i. We use the convention that repeated indices are summed over, so

   a·b = Σ_i a_i b_i ≡ a_i b_i .   (1.1)
With the convention of the sum over repeated indices, the right-hand side becomes a_i b_i c_j d_j. Notice that it was important here to use two different letters, i, j, for the dummy indices involved.
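The summation convention maps directly onto numerical index contractions. As a minimal illustration (a Python sketch using numpy's einsum; the sample arrays are our own, not from the text):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
c = np.array([1.0, 0.0, 1.0])
d = np.array([2.0, 2.0, 2.0])

# a·b = a_i b_i: the repeated index i is summed over.
dot = np.einsum('i,i->', a, b)

# (a·b)(c·d) = a_i b_i c_j d_j: two independent dummy indices i and j.
prod = np.einsum('i,i,j,j->', a, b, c, d)

assert dot == a @ b
assert prod == (a @ b) * (c @ d)
```

Using the same letter for both dummy pairs in the second contraction would compute a different (and wrong) quantity, which is exactly the point made above.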
where ε_ijk is the totally antisymmetric tensor (or the Levi–Civita tensor), defined by ε_123 = +1, together with the condition that it is antisymmetric under any exchange of indices. Therefore ε_ijk = 0 if two indices have the same value and, e.g., ε_213 = −ε_123 = −1. This also implies that the tensor is cyclic, ε_ijk = −ε_jik = ε_jki, i.e., it is unchanged under a cyclic permutation of the indices. Note that a × b = −b × a.
We will see in Section 1.6 that the tensors δij and ijk play a special role
in the theory of representations of the rotation group. Unit vectors are
denoted by a hat; for instance, x̂, ŷ, and ẑ are the unit vectors along
the x, y and z axes, respectively. Note that, e.g.,
x̂ × ŷ = ẑ . (1.6)
(Prove it!) A particularly useful identity, which we will use often, is

   ε_ijk ε_ilm = δ_jl δ_km − δ_jm δ_kl .   (1.7)

Note the structure of the indices: on the left, the index i is summed over, so it does not appear on the right-hand side. It is a dummy index, and we could have used a different name for it. For instance, the left-hand side of eq. (1.7) could be written as ε_pjk ε_plm, with a different letter p. In contrast, the indices j, k, l, m are free indices so, if they appear on the left-hand side, they must also appear on the right-hand side. Note also that, because of the cyclic property of the epsilon tensor, the left-hand side of eq. (1.7) can also be written as ε_jki ε_ilm.
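The identity (1.7) lends itself to a direct numerical check. The following sketch (assuming numpy; the explicit construction of ε is our own) builds the Levi-Civita tensor and compares the two sides component by component:

```python
import numpy as np

# Levi-Civita tensor in three dimensions (0-based indices).
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0   # even permutations
    eps[i, k, j] = -1.0  # odd permutations

delta = np.eye(3)

# Left-hand side: eps_ijk eps_ilm, with i summed over.
lhs = np.einsum('ijk,ilm->jklm', eps, eps)
# Right-hand side: delta_jl delta_km - delta_jm delta_kl.
rhs = (np.einsum('jl,km->jklm', delta, delta)
       - np.einsum('jm,kl->jklm', delta, delta))

assert np.allclose(lhs, rhs)
```

Renaming the dummy index i in the einsum string (e.g. to p) leaves the result unchanged, as it must.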
   ∇·(∇ × v) = ∂_i (ε_ijk ∂_j v_k)
             = ε_ijk ∂_i ∂_j v_k
             = 0 .   (1.18)
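Equation (1.18) can be checked on any concrete smooth field. A sketch with sympy, for an arbitrarily chosen vector field v (the choice of components is ours):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
# An arbitrary smooth vector field v (illustrative choice).
v = (x**2 * y, sp.sin(y) * z, sp.exp(x) * z**2)

# Curl in Cartesian components: (curl v)_i = eps_ijk d_j v_k.
curl = (sp.diff(v[2], y) - sp.diff(v[1], z),
        sp.diff(v[0], z) - sp.diff(v[2], x),
        sp.diff(v[1], x) - sp.diff(v[0], y))

# Divergence of the curl vanishes identically, as in eq. (1.18).
div_curl = sp.diff(curl[0], x) + sp.diff(curl[1], y) + sp.diff(curl[2], z)
assert sp.simplify(div_curl) == 0
```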
For future reference, we give the expression of the gradient and Laplacian of a scalar field, and of the divergence and curl of a vector field, in Cartesian coordinates (x, y, z), in polar coordinates (r, θ, φ), and in cylindrical coordinates (ρ, ϕ, z).[1,2] Denoting by {x̂, ŷ, ẑ}, {r̂, θ̂, φ̂}, and {ρ̂, ϕ̂, ẑ}, respectively, the unit vectors in the corresponding directions, we have

   ∇f = (∂_x f) x̂ + (∂_y f) ŷ + (∂_z f) ẑ   (1.22)
      = (∂_r f) r̂ + (1/r)(∂_θ f) θ̂ + (1/(r sin θ))(∂_φ f) φ̂   (1.23)
      = (∂_ρ f) ρ̂ + (1/ρ)(∂_ϕ f) ϕ̂ + (∂_z f) ẑ ,   (1.24)

   ∇²f = ∂²_x f + ∂²_y f + ∂²_z f   (1.25)
       = (1/r²) ∂_r (r² ∂_r f) + (1/(r² sin θ)) ∂_θ (sin θ ∂_θ f) + (1/(r² sin²θ)) ∂²_φ f   (1.26)
       = (1/ρ) ∂_ρ (ρ ∂_ρ f) + (1/ρ²) ∂²_ϕ f + ∂²_z f ,   (1.27)

   ∇·v = ∂_x v_x + ∂_y v_y + ∂_z v_z   (1.28)
       = (1/r²) ∂_r (r² v_r) + (1/(r sin θ)) ∂_θ (v_θ sin θ) + (1/(r sin θ)) ∂_φ v_φ   (1.29)
       = (1/ρ) ∂_ρ (ρ v_ρ) + (1/ρ) ∂_ϕ v_ϕ + ∂_z v_z ,   (1.30)

[1] Our convention on the polar angles is such that spherical coordinates are related to Cartesian coordinates by x = r sin θ cos φ, y = r sin θ sin φ, z = r cos θ, with θ ∈ [0, π] and φ ∈ [0, 2π]. Cylindrical coordinates are related to Cartesian coordinates by x = ρ cos ϕ, y = ρ sin ϕ (with ϕ ∈ [0, 2π]) and z = z. Then, as ϕ increases, we rotate counterclockwise with respect to the z axis. This means that ρ̂ × ϕ̂ = +ẑ.

[2] The Laplacian of a vector field has been defined from eq. (1.20) in terms of the Cartesian coordinates and, in this case, on each component of the vector field, it has the same form as the Laplacian acting on a scalar. This is no longer true in polar or cylindrical coordinates. In that case, ∇²v can be more easily obtained from eq. (1.21), using the corresponding expressions of ∇·v and ∇ × v.
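The equivalence of the Cartesian and spherical forms of the Laplacian, eqs. (1.25) and (1.26), can be verified symbolically; a sketch with sympy, for a sample scalar field of our own choosing:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
r, th, ph = sp.symbols('r theta phi', positive=True)

f_cart = x**2 + x*y*z                                       # sample field (arbitrary)
lap_cart = sum(sp.diff(f_cart, s, 2) for s in (x, y, z))    # eq. (1.25)

# The same field expressed in spherical coordinates (convention of Note 1).
sub = {x: r*sp.sin(th)*sp.cos(ph), y: r*sp.sin(th)*sp.sin(ph), z: r*sp.cos(th)}
f_sph = sp.expand(f_cart.subs(sub))

# Spherical Laplacian, eq. (1.26).
lap_sph = (sp.diff(r**2 * sp.diff(f_sph, r), r) / r**2
           + sp.diff(sp.sin(th) * sp.diff(f_sph, th), th) / (r**2 * sp.sin(th))
           + sp.diff(f_sph, ph, 2) / (r**2 * sp.sin(th)**2))

# The two expressions agree (here lap_cart is the constant 2).
assert sp.simplify(lap_sph - lap_cart.subs(sub)) == 0
```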
and therefore

   ∮_C ε_ijk dℓ_i u_j = ∫_S ds_i ∂_k u_i − ∫_S ds_k ∂_i u_i .   (1.44)

where A is the area of the surface and n̂ is the unit vector normal to it. We therefore obtain an elegant formula for the area A of a planar surface S, bounded by a curve C,

   A n̂ = (1/2) ∮_C x × dℓ .   (1.47)
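For a polygonal curve, eq. (1.47) turns into a discrete sum of cross products (in the plane, this is the familiar shoelace formula). A sketch in Python (the function name and the test square are our own choices):

```python
import numpy as np

def polygon_area_vector(vertices):
    """(1/2) * sum of x cross dl over the edges of a closed polygon in 3D,
    the discrete version of eq. (1.47)."""
    A = np.zeros(3)
    n = len(vertices)
    for i in range(n):
        x0, x1 = vertices[i], vertices[(i + 1) % n]
        A += 0.5 * np.cross(x0, x1 - x0)   # x cross dl on this edge
    return A

# Unit square in the z = 0 plane, traversed counterclockwise.
square = [np.array(p, float) for p in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]]
A = polygon_area_vector(square)
assert np.allclose(A, [0.0, 0.0, 1.0])   # area 1, normal +z
```

Traversing the square clockwise instead flips the sign of the normal, consistent with the orientation convention in Stokes's theorem.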
We will make use very often of both Gauss's and Stokes's theorems.[4] A vector field such that ∇ × v = 0 everywhere is called irrotational, or curl-free. We have seen in eq. (1.19) that, if v is the gradient of a function, v = ∇f, then it is irrotational. A sort of converse of this statement holds:

Theorem for curl-free fields. Let v be a vector field such that ∇ × v = 0 everywhere in a simply connected region V (i.e., such that

[4] Proceeding as for Stokes's theorem, if we set v(x) = ψ(x) w, with w constant, we also get the useful identity

   ∫_V d³x ∇ψ = ∫_S ds ψ ,   (1.49)

while, setting v(x) = u(x) × w, we get

   ∫_V d³x ∇ × u = ∫_S ds × u .   (1.50)

1.4 Dirac delta

The Dirac delta δ(x − x₀) is characterized by the properties

   δ(x − x₀) = 0   if x ≠ x₀ ,   (1.51)

and that, for any function f(x) regular in an integration region I that includes x₀,

   ∫_I dx δ(x − x₀) f(x) = f(x₀) .   (1.52)

Note that the integral on the left-hand side vanishes if I does not include x₀, because of eq. (1.51) [and of the assumed regularity of f(x)]. On the other hand, again because of eq. (1.51), the integral on the left-hand side of eq. (1.52) is independent of I, as long as x₀ ∈ I. In the following we will set for definiteness I = (−∞, +∞), so eq. (1.52) reads

   ∫_{−∞}^{+∞} dx δ(x − x₀) f(x) = f(x₀) .   (1.53)
Observe that, applying this definition to the case of the function f(x) = 1, we get the normalization condition

   ∫_{−∞}^{+∞} dx δ(x) = 1 .   (1.54)

Since the Dirac delta vanishes at all points x ≠ x₀, but still the integral on the left-hand side of eq. (1.53) is non-zero, it must be singular in x₀. Actually, the Dirac delta is not a proper function, but can be defined by considering a sequence of functions δ_n(x, x₀) such that, as n → ∞, δ_n(x, x₀) → 0 for x ≠ x₀, and δ_n(x, x₀) → +∞ for x = x₀, while maintaining the normalization condition (1.54). As an example, one can take a sequence of gaussians centered on x₀, with smaller and smaller width,

   δ_n(x − x₀) = (1/(√(2π) σ_n)) e^{−(x−x₀)²/(2σ_n²)} ,   (1.55)

with σ_n = 1/n. Another option could be to use

   δ_n(x − x₀) = { n   for |x − x₀| < 1/(2n)
                 { 0   for |x − x₀| > 1/(2n) .   (1.56)
These two sequences of functions are shown in Fig. 1.1. In both cases, the limit of δ_n(x − x₀) for n → ∞ does not exist, since it diverges when x = x₀, and therefore does not define a proper function δ(x). However, using for instance the sequence (1.56),

   lim_{n→∞} ∫_{−∞}^{+∞} dx δ_n(x − x₀) f(x) = lim_{n→∞} n ∫_{x₀−1/(2n)}^{x₀+1/(2n)} dx f(x)
                                              = lim_{n→∞} n [ (x₀ + 1/(2n)) − (x₀ − 1/(2n)) ] f(x₀)
                                              = f(x₀) .   (1.57)

[Fig. 1.1: the two sequences δ_n(x) of approximations to the Dirac delta; for the gaussians, n → ∞ corresponds to σ_n → 0.]
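The limiting behavior in eq. (1.57) can be seen numerically with the gaussian sequence (1.55); a sketch (the test function cos x and the grid parameters are our own):

```python
import numpy as np

def delta_n(x, x0, n):
    """Gaussian approximation (1.55) with sigma_n = 1/n."""
    s = 1.0 / n
    return np.exp(-(x - x0)**2 / (2 * s**2)) / (np.sqrt(2 * np.pi) * s)

x0 = 0.3
x = np.linspace(x0 - 8, x0 + 8, 400001)
dx = x[1] - x[0]

# integral of delta_n(x - x0) cos(x) dx  ->  cos(x0) as n grows
vals = [np.sum(delta_n(x, x0, n) * np.cos(x)) * dx for n in (1, 10, 100)]
errs = [abs(v - np.cos(x0)) for v in vals]

assert errs[0] > errs[1] > errs[2] and errs[2] < 1e-4
```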
   x δ(x) = 0 .   (1.62)

Another useful notion is the derivative of the Dirac delta, δ′(x), which, in the sense of distributions, is defined from

   ∫_{−∞}^{+∞} dx δ′(x − x₀) f(x) ≡ − ∫_{−∞}^{+∞} dx δ(x − x₀) f′(x) = −f′(x₀) .   (1.63)
of eq. (1.68).[6] This result could have also been proved using a sequence of approximations to the Dirac delta, and showing that, plugging it on the left-hand side of eq. (1.68), we get a continuous and differentiable approximation to the theta function.

The Dirac delta has an extremely useful integral representation. A simple way to obtain it is to use the sequence of gaussians (1.55). Then,

   ∫_{−∞}^{+∞} (dk/2π) e^{−k²/(2n²) + ikx} = (n/√(2π)) e^{−n²x²/2}
                                            = δ_n(x) .   (1.73)

[6] [...] = e^{−k²/(2n²)} ,   (1.72)

where we introduced x′ = x + ik/n². More precisely, we have actually considered a closed contour in the complex plane z = x + iy, composed by the x axis y = 0, and by the parallel line y = ik/n², and closed the contour joining [...]
However, we have stressed that this only holds for r ≠ 0, since these manipulations become undefined at r = 0. To understand the behavior of ∇·v in r = 0, we use Gauss's theorem (1.48) in a volume V given by a spherical ball of radius R. Its boundary is the sphere S², and on the boundary v = r̂/R², while ds = R² dΩ r̂, where dΩ is the infinitesimal solid angle.[9] Then,

   ∫_V d³x ∇·v = ∫_{S²} ds · v
               = ∫_{S²} R² dΩ r̂ · (r̂/R²)
               = ∫_{S²} dΩ
               = 4π .   (1.86)

This shows that ∇·v cannot be zero everywhere. Rather, since ∇·v = 0 for r ≠ 0, but still its integral in d³x over any volume V is equal to 4π, we must have

   ∇·(r̂/r²) = 4π δ⁽³⁾(x) .   (1.87)

From this, we can obtain another very useful result. Using the expression (1.23) of the gradient in polar coordinates, we get

   ∇(1/r) = −r̂/r² .   (1.88)

Writing

   ∇²(1/r) = ∇·(∇(1/r)) = −∇·(r̂/r²) ,   (1.89)

and using eq. (1.87), we get ∇²(1/r) = −4π δ⁽³⁾(x) or, shifting the origin to a generic point x₀,

   ∇²(1/|x − x₀|) = −4π δ⁽³⁾(x − x₀) .   (1.91)

We will use this result in Section 4.1.2, when we will introduce the notion of Green's function of the Laplacian.

[9] Recall that the infinitesimal solid angle dΩ is defined from the transformation from Cartesian to spherical coordinates. In two dimensions, from x = r cos θ and y = r sin θ, it follows that dx dy = r dr dθ. Similarly, in three dimensions, the relation between Cartesian and spherical coordinates (as we already mentioned in Note 1 on page 4) is x = r sin θ cos φ, y = r sin θ sin φ, z = r cos θ, with θ ∈ [0, π] and φ ∈ [0, 2π]. Computing the Jacobian of the transformation,

   dx dy dz = r² sin θ dr dθ dφ = r² dr dΩ ,   (1.82)

where

   dΩ = sin θ dθ dφ .   (1.83)

Then, the total integral over the solid angle is

   ∫ dΩ = ∫₀^π dθ sin θ ∫₀^{2π} dφ = 4π ,   (1.84)

so the total solid angle in three dimensions is 4π. Observe that we can also write dΩ = d cos θ dφ, reabsorbing the minus sign from d cos θ = −sin θ dθ into a change of the integration limits, so that

   ∫ dΩ = ∫_{−1}^{1} d cos θ ∫₀^{2π} dφ .   (1.85)
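The content of eq. (1.87), namely that all of the flux of r̂/r² comes from the origin, can be illustrated numerically by integrating the flux through a closed surface that is not a sphere, e.g. a cube: by Gauss's theorem the result must still be 4π, independently of the size. A sketch (grid resolution and function names are our own):

```python
import numpy as np

def cube_flux(a, n=1000):
    """Flux of v = r_hat/r^2 = x/r^3 through the boundary of a cube of
    half-side a centered on the origin: by symmetry, 6 times the flux
    through the face z = a, computed with a midpoint rule."""
    h = 2 * a / n
    s = -a + (np.arange(n) + 0.5) * h          # midpoint grid on the face
    X, Y = np.meshgrid(s, s, indexing='ij')
    vz = a / (X**2 + Y**2 + a**2)**1.5         # z-component of x/r^3 on z = a
    return 6 * np.sum(vz) * h * h

for a in (0.5, 1.0, 10.0):
    assert abs(cube_flux(a) - 4 * np.pi) < 1e-3
```

The flux is the same for every a, as eq. (1.87) requires: the source is entirely concentrated at r = 0.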
If f(x) is real, f̃*(k) = f̃(−k). The convolution theorem now tells us that, if

   F(x) = ∫ d³x′ f(x′) g(x − x′) ,   (1.101)

then

   F̃(k) = f̃(k) g̃(k) .   (1.102)

For a function of time f(t) we will denote the integration variable that enters in the Fourier transform by ω, and we will also use a different sign convention, defining the Fourier transform f̃(ω) as

   f̃(ω) = ∫ dt f(t) e^{iωt} ,   (1.103)

so that

   f(t) = ∫ (dω/2π) f̃(ω) e^{−iωt} .   (1.104)

In the context of Special Relativity, the advantage of using a different sign convention between the spatial and temporal Fourier transforms is that the Fourier transform of a function f(t, x), with respect to both t and x, can be written as

   f(t, x) = ∫ (dω d³k/(2π)⁴) f̃(ω, k) e^{−i(ωt − k·x)} ,   (1.105)

i.e.,

   f̃(ω, k) = ∫ dt d³x f(t, x) e^{i(ωt − k·x)} ,   (1.106)

and, as we will see, the combination (ωt − k·x) is more natural from the point of view of Special Relativity and Lorentz invariance.
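The convolution theorem (1.102) has an exact discrete analogue for the DFT, where the product of transforms corresponds to circular convolution. A sketch in Python (numpy's FFT conventions differ from (1.103)-(1.104) in signs and 2π factors, so this only illustrates the structure of the theorem):

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(64)
g = rng.standard_normal(64)
N = len(f)

# Circular (periodic) convolution: F_m = sum_k f_k g_{(m-k) mod N}.
F = np.array([sum(f[k] * g[(m - k) % N] for k in range(N)) for m in range(N)])

# Discrete convolution theorem: the transform of the convolution is the
# product of the transforms.
assert np.allclose(np.fft.fft(F), np.fft.fft(f) * np.fft.fft(g))
```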
Example 1.2. In this example, we compute the three-dimensional Fourier transform of the function

   f(r) = 1/r ,   (1.107)

that appears in the Coulomb potential. A direct computation leads to a problem of convergence.[14] Indeed, from eqs. (1.99), (1.82), and (1.85),

   f̃(k) = ∫ d³x (1/r) e^{−ik·x}
        = ∫₀^∞ dr r² (1/r) ∫₀^{2π} dφ ∫_{−1}^{1} d cos θ e^{−ikr cos θ} ,   (1.108)

where, to perform the integral, we have written d³x in polar coordinates with k as polar axis, so θ is the angle between k and x. The integral over dφ just gives a 2π factor, and the integral over cos θ is also elementary,

   ∫_{−1}^{1} dα e^{−ikrα} = (2/(kr)) sin(kr) ,   (1.109)

so we get

   f̃(k) = (4π/k²) ∫₀^∞ du sin u ,   (1.110)

[14] Indeed, this function does not belong to L¹(R³), compare with Note 10 on page 13.
where k = |k| and u = kr. The integral over du, however, does not converge at u = ∞, and is not well defined without a prescription. We then start by computing the Fourier transform of the function

   V(r) = e^{−μr}/r ,   (1.111)

where μ > 0 has the dimensions of the inverse of a length. This function, which corresponds to an interaction potential known as the Yukawa potential, reduces to the Coulomb potential as μ → 0⁺. Performing the same passages as before gives

   Ṽ(k) = (4π/k²) ∫₀^∞ du sin u e^{−εu} ,   (1.112)

where ε = μ/k. For non-zero ε, i.e., non-zero μ, the factor e^{−εu} ensures the convergence of the integral. Writing sin u = (e^{iu} − e^{−iu})/(2i), the integral is elementary,

   Ṽ(k) = (4π/k²) (1/(2i)) [ (1/(i − ε)) e^{(i−ε)u} + (1/(i + ε)) e^{−(i+ε)u} ]₀^∞
        = (4π/k²) (1/(1 + ε²))
        = 4π/(k² + μ²) .   (1.113)

Therefore,

   V(x) = e^{−μr}/r   ⟺   Ṽ(k) = 4π/(k² + μ²) .   (1.114)

Since the limit μ → 0⁺ of these expressions is well defined, we can now define the Fourier transform of the Coulomb potential as the limit for μ → 0⁺ of the Fourier transform of the Yukawa potential,[15] so

   f(x) = 1/r   ⟺   f̃(k) = 4π/k² .   (1.115)

The correctness of this limiting procedure can be checked observing that, if we rather start from f̃(k) = 4π/k² and compute the inverse Fourier transform, we get 1/r without the need of regulating any divergence. Indeed, in this case we get

   f(r) = ∫ (d³k/(2π)³) (4π/k²) e^{ik·x}
        = (4π/(8π³)) 2π ∫₀^∞ k² dk ∫_{−1}^{1} d cos θ (1/k²) e^{ikr cos θ}
        = (1/r) (2/π) ∫₀^∞ du (sin u)/u
        = 1/r ,   (1.116)

since ∫₀^∞ du (sin u)/u converges, and has the value π/2.

[15] A note for the advanced reader. In quantum field theory, the Coulomb potential is understood as the interaction between two static charges mediated by the exchange of a massless particle, the photon. The exchange of a massive particle produces instead a Yukawa potential, with the constant μ related to the mass m of the exchanged particle by μ = mc/ℏ, see e.g., Section 6.6 of Maggiore (2005). This way of regularizing the integral therefore corresponds, physically, to assigning a small mass m_γ to the photon and then taking the limit m_γ → 0. Indeed, one way to put limits on the photon mass is to assume that the Coulomb interaction is replaced by a Yukawa potential, and obtain limits on m_γ. Currently, the strongest limit on the photon mass coming from a direct test of the Coulomb law gives m_γc² < 1 × 10⁻¹⁴ eV. Writing μ = 1/r₀, so that r₀ = ℏ/(m_γc), this translates into a limit r₀ > 2 × 10⁷ m, i.e., there is no sign of an exponential decay corresponding to a Yukawa potential up to such scales. Other limits on the mass of the photon, not based on a direct measurement of the Coulomb law, are even stronger, with the most stringent being m_γc² < 1 × 10⁻¹⁸ eV, corresponding to r₀ > 2 × 10¹¹ m, see https://pdg.lbl.gov/2021/listings/contents_listings.html. Another way to search for deviations from Coulomb's law is to look for a force proportional to 1/r^{2+ε}, and set limits on ε. This is a purely phenomenological parametrization and, contrary to the case of the Yukawa potential (and to statements in some textbooks), it has no field-theoretical interpretation in terms of a photon mass.
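The key steps of the regularization, eqs. (1.112)-(1.115), can be verified symbolically; a sketch with sympy:

```python
import sympy as sp

u, eps, k, mu = sp.symbols('u epsilon k mu', positive=True)

# The regulated integral in eq. (1.112): integral of sin(u) e^{-eps u}.
I = sp.integrate(sp.sin(u) * sp.exp(-eps * u), (u, 0, sp.oo))
assert sp.simplify(I - 1 / (1 + eps**2)) == 0

# With eps = mu/k this gives the Yukawa transform of eq. (1.113) ...
Vt = (4 * sp.pi / k**2) * I.subs(eps, mu / k)
assert sp.simplify(Vt - 4 * sp.pi / (k**2 + mu**2)) == 0

# ... which reduces to the Coulomb result 4 pi / k^2 as mu -> 0+.
limit = sp.limit(Vt, mu, 0, '+')
assert sp.simplify(limit - 4 * sp.pi / k**2) == 0
```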
We now use this relation to define rotations and vectors, in any number of dimensions, as follows. We consider a set of d objects v_i, i = 1, ..., d, and a linear transformation of the form (1.120), where now R_ij is a d × d matrix. We define rotations in d dimensions as the transformations R_ij that preserve the norm of v,[18] so that eq. (1.121) (with i = 1, ..., d) holds. A set of objects v_i that transforms as in eq. (1.120) when R_ij is a rotation matrix is then called a 'vector under rotations' or, more simply, a vector.

[18] We will qualify this more precisely at the end of this section and in Section 1.7, where we will see that 'proper' rotations are obtained factoring out some discrete parity symmetry.
This definition allows us to characterize rotations as follows. Plugging eq. (1.120) into eq. (1.121), using eq. (1.3), and renaming the dummy indices on the right-hand side as i → k, j → i, k → j, the condition that the norm is preserved can be rewritten in the form

   R_ki R_kj = δ_ij .

Recall that, if R_ij are the matrix elements of the matrix R, the transpose matrix Rᵀ has matrix elements (Rᵀ)_ij = R_ji. Then the previous equation reads (Rᵀ)_ik R_kj = δ_ij or, in matrix form,

   Rᵀ R = I .   (1.125)

This means that Rᵀ = R⁻¹, and therefore also

   R Rᵀ = Rᵀ R = I ,   (1.128)

or, in components,

   R_ki R_kj = R_ik R_jk = δ_ij .   (1.129)

Equation (1.128) is just the definition of orthogonal matrices. We have then found that rotations in d dimensions are given by d × d orthogonal matrices, and vectors are defined by the transformation property (1.120) under rotations.
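These defining properties are easy to check on a concrete rotation matrix; a sketch in Python (the choice of axis and angle is ours):

```python
import numpy as np

def rotation_z(theta):
    """Rotation by an angle theta about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

R = rotation_z(0.7)
I = np.eye(3)

# Orthogonality, eq. (1.128): R R^T = R^T R = I.
assert np.allclose(R @ R.T, I) and np.allclose(R.T @ R, I)

# Norm preservation: |R v| = |v| for any vector v.
v = np.array([1.0, -2.0, 0.5])
assert np.isclose(np.linalg.norm(R @ v), np.linalg.norm(v))

# A proper rotation: det R = +1.
assert np.isclose(np.linalg.det(R), 1.0)
```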
The transformation property of vectors can be generalized to objects with different transformation properties. In particular, in any dimension d we can define a tensor T_ij with two indices as an object that, under rotations, transforms as

   T′_ij = R_ii′ R_jj′ T_i′j′   (1.130)

(with the indices running over two values {x, y} in two dimensions, over {x, y, z} in three dimensions, and more generally over d values 1, ..., d in d dimensions), with the same matrix R_ij that appears in the transformation of vectors. Similarly, a tensor T_ijk with three indices is defined as an object that, under rotations, transforms as

   T′_ijk = R_ii′ R_jj′ R_kk′ T_i′j′k′ ,   (1.131)

and so on. Note that, for tensors, the connection with the notion of arrow is completely lost, and a tensor is only defined by its transformation property (1.130).
Consider now a tensor T_ij that, in a given reference frame, has components δ_ij. Normally, the numerical values of the components of a tensor change if we perform a rotation, according to eq. (1.130). In this case, however, in the new frame

   δ′_ij = R_ii′ R_jj′ δ_i′j′ = R_ik R_jk = δ_ij ,

where in the last equality we have used eq. (1.129). Thus, δ_ij is a very special tensor, whose components have the same numerical values in all frames. It is then called an invariant tensor. In three dimensions, the only other invariant tensor of the rotation group is ε_ijk, since it transforms as

   ε_ijk → R_ii′ R_jj′ R_kk′ ε_i′j′k′ .   (1.133)

However, from the definition of the determinant of a 3 × 3 matrix, we can check that the right-hand side of eq. (1.133) is equal to (det R) ε_ijk. Equation (1.128), together with the fact that, for two matrices A and B, det(AB) = det(A) det(B), and det(Rᵀ) = det(R), implies that (det R)² = 1, so det R = ±1. 'Proper' rotations are defined as the transformations with det R = +1 (while transformations with det R = −1 correspond to parity transformations, that change the orientation of one axis, or of all three axes, possibly combined with proper rotations), so ε_ijk is indeed invariant under (proper) rotations.
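That ε_ijk picks up exactly a factor det R under the transformation (1.133) can be checked numerically; a sketch (the particular rotation and parity matrices are our own choices):

```python
import numpy as np

# Levi-Civita tensor (0-based indices).
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0

def transform(T, R):
    """Three-index tensor transformation, T'_ijk = R_ii' R_jj' R_kk' T_i'j'k'."""
    return np.einsum('ia,jb,kc,abc->ijk', R, R, R, T)

c, s = np.cos(0.4), np.sin(0.4)
R_proper = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])  # det = +1
P = np.diag([1.0, 1.0, -1.0])                            # parity on z, det = -1

# eps is invariant under the proper rotation ...
assert np.allclose(transform(eps, R_proper), eps)
# ... and changes sign, i.e., picks up det(P) = -1, under parity.
assert np.allclose(transform(eps, P), -eps)
```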
   g ↦ D_R(g) ,   (1.134)

such that

   D_R(g₁ ∘ g₂) = D_R(g₁) D_R(g₂) .   (1.135)

This condition means that the mapping preserves the group structure, i.e., that it is the same to compose the elements at the group level (with the ∘ operation) and then represent the resulting element g₁ ∘ g₂, or first represent g₁ and g₂ in terms of linear operators and then compose the resulting linear operators. Note that the composition operation in D_R(g₁)D_R(g₂) is the one between linear operators (such as the matrix product when the linear operators are matrices, see below), while the composition operation in g₁ ∘ g₂ is the one at the abstract group level. Setting g₁ = g, g₂ = e in eq. (1.135), we find that, for all g ∈ G, D_R(g)D_R(e) = D_R(g); similarly, setting g₁ = e, g₂ = g, we get D_R(e)D_R(g) = D_R(g). This implies that the identity element of the group e must be mapped to the identity operator I, i.e., D_R(e) = I. Similarly, we can show that D_R(g⁻¹) = [D_R(g)]⁻¹.
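The homomorphism property and its consequences can be made concrete with the simplest example, rotations in the plane represented by 2 × 2 matrices; a sketch in Python:

```python
import numpy as np

def D(theta):
    """2x2 rotation matrices: a matrix representation of rotations in the
    plane, where group composition is addition of angles."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

t1, t2 = 0.3, 1.1

# Homomorphism property, eq.-(1.135)-style: D(g1) D(g2) = D(g1 o g2).
assert np.allclose(D(t1) @ D(t2), D(t1 + t2))

# The identity element maps to the identity operator.
assert np.allclose(D(0.0), np.eye(2))

# Inverses: D(g^{-1}) = [D(g)]^{-1}.
assert np.allclose(D(-t1), np.linalg.inv(D(t1)))
```

Here the angle-addition formulas for sine and cosine are precisely what makes D a homomorphism.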
The vector space on which the operators DR act is called the basis,
or the base space, for the representation R and, as we have mentioned,
it can be a real vector space or a complex vector space. We will be
particularly interested in matrix representations. In this case, the base space is a vector space of finite (real or complex) dimension n, and an abstract group element g is represented by an n × n matrix [D_R(g)]_ij, with i, j = 1, ..., n. The dimension of the representation is defined as the dimension n of the base space.
Writing a generic element of the base space as a vector v with com-
ponents (v1 , . . . , vn ), a group element g can be associated with a linear
operator [DR (g)]ij acting on the base space, and therefore to a linear
transformation of the base space
vi → [DR (g)]ij vj , (1.136)
with our usual summation convention on repeated indices. The impor-
tant point is that eq. (1.136) allows us to attach a physical meaning
to a group element: before introducing the concept of representation,
a group element g is just an abstract mathematical object defined by
its composition rules with the other group members. Choosing a spe-
cific representation, instead, allows us to interpret g as a transformation
acting on a vector space.[22]

[22] Apart from matrix representations, the other typical situation encountered in physics is when a group element is represented by a differential operator acting on a space of functions. Since the space of functions is infinite-dimensional, this representation is infinite-dimensional.

1.7.1 Reducible and irreducible representations

The different representations of a group describe all possible ways in which objects can transform under the action of the group, for instance under rotations. However, not all possible representations describe genuinely different possibilities. Given a representation R of a group G,
whose basis is a space X, and another representation R′ of G, whose basis is a space X′, R and R′ are called equivalent if there is an isomorphism S : X → X′ such that, for all g ∈ G,

   D_R(g) = S⁻¹ D_{R′}(g) S .   (1.137)

Comparing with eq. (1.136), we see that, in the case of representations of finite dimension, equivalent representations correspond to a change
1.7 Groups and representations 21
wx wx0 cos θ − sin θ wx
→ = , (1.138)
wy wy0 sin θ cos θ wy
which expresses the fact that, in two dimensions, vectors are represen-
tations of dimension two of the rotation group. Naively, we might think
that we can find a new type of representation of dimension four, by
putting together the components of v and w into a single object with
components (vx , vy , wx , wy ). Under rotations,
( v_x )   ( v_x' )   ( cos θ  −sin θ    0       0    ) ( v_x )
( v_y )   ( v_y' )   ( sin θ   cos θ    0       0    ) ( v_y )
( w_x ) → ( w_x' ) = (   0       0    cos θ  −sin θ  ) ( w_x )
( w_y )   ( w_y' )   (   0       0    sin θ   cos θ  ) ( w_y )
                                                        (1.139)
However, here we have not discovered a genuinely new type of representation of dimension four; we have simply stacked together two vectors.
Mathematically, the fact that this is not a genuinely new representation
is revealed by the fact that the matrix in eq. (1.139) is block diagonal
(for all rotations, so in this case for all values of θ), so that there is no
rotation that mixes the components of v with those of w.
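The block-diagonal structure of eq. (1.139) can be verified directly; a minimal sketch, with numerical values chosen arbitrarily:

```python
import numpy as np

def rot2(theta):
    """2x2 rotation matrix acting on a single vector."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def rot4(theta):
    """Block-diagonal 4x4 matrix of eq. (1.139), acting on (v_x, v_y, w_x, w_y)."""
    D = np.zeros((4, 4))
    D[:2, :2] = rot2(theta)
    D[2:, 2:] = rot2(theta)
    return D

# Arbitrary numerical values for the two vectors.
v = np.array([1.0, 2.0])
w = np.array([-3.0, 0.5])
out = rot4(0.7) @ np.concatenate([v, w])

# Each block transforms on its own: no rotation mixes v with w.
assert np.allclose(out[:2], rot2(0.7) @ v)
assert np.allclose(out[2:], rot2(0.7) @ w)
```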
This example motivates the definition of reducible and irreducible rep-
resentations. A representation R is called reducible if it has an invariant
subspace, i.e., if the action of any DR (g) on the vectors in the subspace
gives another vector of the subspace. This corresponds to the fact that
[possibly after a suitable change of basis of the form (1.137)] there is
a subset of components of the basis vector that never mixes with the
others, for all transformations DR (g). A representation R is called com-
pletely reducible if, for all elements g, the matrices DR (g) have a block
diagonal form, or can be put in a block diagonal form with a change
of basis corresponding to the equivalence relation (1.137). Conversely,
a representation with no invariant subspace is called irreducible. Irreducible representations describe the genuinely different ways in which physical quantities can transform under the group of transformations in question.
To put some flesh into these rather abstract notions, we now look at
the simplest irreducible representations of the rotation group.
where, to get the second equality, we used the symmetry of S_kl and, in the third, we renamed the dummy indices k → l and l → k. Similarly, an antisymmetric tensor remains antisymmetric. Therefore, S_ij and A_ij do not mix under rotations. Consider now the trace of the symmetric part, S = δ_ij S_ij. Using eq. (1.130) and the property (1.129) of orthogonal matrices, we see that it is invariant under spatial rotations. It is then
convenient to separate S_ij as

S_ij = (S_ij − (1/3) δ_ij S) + (1/3) δ_ij S
     ≡ S^T_ij + (1/3) δ_ij S .   (1.144)
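A quick numerical check of the split (1.144): the trace is invariant under rotations, and S^T_ij is traceless. The sketch below uses a rotation about the z axis as a convenient example:

```python
import numpy as np

def rot_z(theta):
    """A 3x3 rotation about the z axis (any rotation would do for this check)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

T = np.arange(9.0).reshape(3, 3)     # a generic tensor T_ij (arbitrary numbers)
Sym = 0.5 * (T + T.T)                # its symmetric part S_ij
S = np.trace(Sym)                    # the trace S = delta_ij S_ij
ST = Sym - (S / 3.0) * np.eye(3)     # the traceless part S^T_ij of eq. (1.144)

R = rot_z(0.9)
Sym_rot = np.einsum('ik,jl,kl->ij', R, R, Sym)   # S_ij -> R_ik R_jl S_kl

assert np.isclose(np.trace(Sym_rot), S)   # the trace is rotation invariant
assert np.isclose(np.trace(ST), 0.0)      # S^T_ij is traceless
```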
The term S^T_ij is traceless (as can be shown contracting it with δ_ij and using δ_ij δ_ij = δ_ii = 3).^24 Since the trace is invariant, under rotations it remains traceless, and therefore it does not mix with the term δ_ij S.

Footnote 24: We are working for simplicity in three dimensions. In d dimensions, the traceless combination is S_ij − (1/d) δ_ij S.
We have therefore separated the nine components of a generic tensor T_ij into its symmetric traceless part S^T_ij (which has five components, corresponding to the six independent components of a symmetric 3 × 3 matrix, minus the condition of zero trace), its trace S (one component), and an antisymmetric tensor A_ij which, in three dimensions, has three independent components,

T_ij = S^T_ij + (1/3) δ_ij S + A_ij .   (1.145)
The trace S = δ_ij T_ij, being invariant under rotations, corresponds to the scalar representation that we already encountered. The antisymmetric tensor might look like a new representation of the rotation group, but in fact, introducing

A_i = (1/2) ε_ijk A_jk ,   (1.146)
one sees that A_i is just a spatial vector. We have inserted a factor of 1/2 for convenience since, for instance, in this way A_1 = (1/2) ε_1jk A_jk = (1/2)(ε_123 A_23 + ε_132 A_32) = A_23 [using the antisymmetry of ε_ijk and of A_jk with respect to (j, k)]. So, A_23 = A_1, and similarly one finds A_12 = A_3, A_31 = A_2, so the inversion of eq. (1.146) can be written compactly as

A_ij = ε_ijk A_k .   (1.147)
Therefore, the three independent components of an antisymmetric tensor
can be rearranged into the three independent components of a vector.
This means that these two representations are equivalent, in the sense
of eq. (1.137), and we have not discovered a genuinely new irreducible
representation of rotations.
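The duality between A_ij and A_i in eqs. (1.146) and (1.147) can be verified numerically; a sketch with an arbitrarily chosen antisymmetric tensor:

```python
import numpy as np

# The Levi-Civita symbol eps_ijk in three dimensions.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[i, k, j] = -1.0

# A generic antisymmetric tensor A_jk (arbitrary numbers).
A = np.array([[ 0.0,  1.5, -2.0],
              [-1.5,  0.0,  0.7],
              [ 2.0, -0.7,  0.0]])

# A_i = (1/2) eps_ijk A_jk, eq. (1.146); indices are 0-based here.
a = 0.5 * np.einsum('ijk,jk->i', eps, A)
assert np.isclose(a[0], A[1, 2])          # e.g. A_1 = A_23

# The inversion A_ij = eps_ijk A_k, eq. (1.147), recovers the tensor.
assert np.allclose(np.einsum('ijk,k->ij', eps, a), A)
```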
In contrast, the traceless symmetric tensor is irreducible (since there is no further symmetry that prevents its components from mixing among themselves, and we can always find rotations that mix a given component of S^T_ij with any other component). We have therefore found a new irreducible
representation of the rotation group, of dimension five, the traceless
symmetric tensors, and eq. (1.145) can be rewritten as
T_ij = S^T_ij + (1/3) δ_ij S + ε_ijk A_k ,   (1.148)
in terms of the invariant tensors δ_ij and ε_ijk, and of the irreducible representations provided by the scalar S, the vector A_i, and the traceless symmetric tensor S^T_ij. Note how the nine independent components of
Tij have been shared between a scalar (one component), a vector (three
components), and a traceless symmetric tensor (five components). One
can work out similarly the decomposition into irreducible representations
of tensors with more indices, such as Tijk .
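The decomposition (1.145) and the component counting can be checked in a few lines; a sketch with a generic numerical tensor:

```python
import numpy as np

T = np.arange(9.0).reshape(3, 3) + 0.5   # a generic 3x3 tensor (arbitrary numbers)

Sym  = 0.5 * (T + T.T)                   # symmetric part
Asym = 0.5 * (T - T.T)                   # antisymmetric part A_ij (3 components)
S    = np.trace(Sym)                     # scalar S (1 component)
ST   = Sym - (S / 3.0) * np.eye(3)       # traceless symmetric S^T_ij (5 components)

# eq. (1.145): the three irreducible pieces reassemble the full tensor.
assert np.allclose(ST + (S / 3.0) * np.eye(3) + Asym, T)

# Component counting: 1 + 3 + 5 = 9 = number of components of T_ij.
assert 1 + 3 + 5 == T.size
```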
Observe that the dimensions of the scalar, vector, and traceless symmetric tensor representations can all be written as 2s + 1, with s = 0, 1, and 2, respectively.
An infinitesimal rotation can be written as

R_ij = δ_ij + ω_ij ,   (1.149)

with ω_ij infinitesimal parameters, which describe the deviation from the identity transformation. Requiring that this satisfies eq. (1.129), to first order in ω we get

δ_ij = (δ_ik + ω_ik + O(ω²)) (δ_jk + ω_jk + O(ω²))
     = δ_ij + ω_ij + ω_ji + O(ω²) ,   (1.150)

so ω_ij is antisymmetric, ω_ij = −ω_ji.
Writing ω_ij = −ε_ijk δθ_k in terms of three independent parameters δθ_k, the coordinates transform as

x_i → x_i + ω_ij x_j
    = x_i − ε_ijk x_j δθ_k .   (1.153)
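The first-order expansion can be verified numerically; a sketch comparing a small rotation about the z axis with eq. (1.153):

```python
import numpy as np

# Levi-Civita symbol eps_ijk.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[i, k, j] = -1.0

dtheta = 1e-6                            # a small rotation angle about the z axis
c, s = np.cos(dtheta), np.sin(dtheta)
R = np.array([[c, -s, 0.0],
              [s,  c, 0.0],
              [0.0, 0.0, 1.0]])

# omega_ij = R_ij - delta_ij is antisymmetric to first order in dtheta.
omega = R - np.eye(3)
assert np.allclose(omega, -omega.T, atol=2 * dtheta**2)

# x_i -> x_i - eps_ijk x_j dtheta_k, eq. (1.153), with dtheta_k along z.
x = np.array([1.0, 2.0, 3.0])
dvec = np.array([0.0, 0.0, dtheta])
x_first_order = x - np.einsum('ijk,j,k->i', eps, x, dvec)
assert np.allclose(R @ x, x_first_order, atol=1e-10)
```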