CONTENTS
• Special relativity; Lorentz covariance of Maxwell equations
• Scalar and vector potentials, and gauge invariance
• Relativistic motion of charged particles
• Action principle for electromagnetism; energy-momentum tensor
• Electromagnetic waves; waveguides
• Fields due to moving charges
• Radiation from accelerating charges
• Antennae
• Radiation reaction
• Magnetic monopoles, duality, Yang-Mills theory
Contents
1 Electrodynamics and Special Relativity
1.1 Introduction
1.2 The Lorentz Transformation
1.3 4-vectors and 4-tensors
1.4 Lorentz tensors
1.5 Proper time and 4-velocity
2 Electrodynamics and Maxwell's Equations
2.1 Natural units
2.2 Gauge potentials and gauge invariance
2.3 Maxwell's equations in 4-tensor notation
2.4 Lorentz transformation of $\vec{E}$ and $\vec{B}$
2.5 The Lorentz force
2.6 Action principle for charged particles
2.7 Gauge invariance of the action
2.8 Canonical momentum, and Hamiltonian
3 Particle Motion in Static Electromagnetic Fields
3.1 Description in terms of potentials
3.2 Particle motion in static uniform $\vec{E}$ and $\vec{B}$ fields
4 Action Principle for Electrodynamics
4.1 Invariants of the electromagnetic field
4.2 Action for Electrodynamics
4.3 Inclusion of sources
4.4 Energy density and energy flux
4.5 Energy-momentum tensor
4.6 Energy-momentum tensor for the electromagnetic field
4.7 Inclusion of massive charged particles
5 Coulomb's Law
5.1 Potential of a point charge
5.2 Electrostatic energy
5.3 Field of a uniformly moving charge
5.4 Motion of a charge in a Coulomb potential
5.5 The multipole expansion
6 Electromagnetic Waves
6.1 Wave equation
6.2 Monochromatic plane waves
6.3 Motion of a point charge in a linearly-polarised E.M. wave
6.4 Circular and elliptical polarisation
6.5 General superposition of plane waves
6.6 Gauge invariance and electromagnetic fields
6.7 Fourier decomposition of electrostatic fields
6.8 Waveguides
6.9 Resonant cavities
7 Fields Due to Moving Charges
7.1 Retarded potentials
7.2 Liénard-Wiechert potentials
7.3 Electric and magnetic fields of a moving charge
7.4 Radiation by accelerated charges
7.5 Applications of Larmor formula
7.6 Angular distribution of the radiated power
7.7 Frequency distribution of radiated energy
7.8 Frequency spectrum for relativistic circular motion
7.9 Frequency spectrum for periodic motion
7.10 Cerenkov radiation
7.11 Thomson scattering
8 Radiating Systems
8.1 Fields due to localised oscillating sources
8.2 Electric dipole radiation
8.3 Higher multipoles
8.4 Linear antenna
9 Electromagnetism and Quantum Mechanics
9.1 The Schrödinger equation and gauge transformations
9.2 Magnetic monopoles
9.3 Dirac quantisation condition
10 Local Gauge Invariance and Yang-Mills Theory
10.1 Relativistic quantum mechanics
10.2 Yang-Mills theory
1 Electrodynamics and Special Relativity
1.1 Introduction
In Newtonian mechanics, the fundamental laws of physics, such as the dynamics of moving
objects, are valid in all inertial frames (i.e. all non-accelerating frames). If $S$ is an inertial
frame, then the set of all inertial frames comprises all frames that are in uniform motion
relative to $S$. Suppose that two inertial frames $S$ and $S'$ are parallel, and that their origins
coincide at $t = 0$. If $S'$ is moving with uniform velocity $\vec{v}$ relative to $S$, then a point
$P$ with position vector $\vec{r}$ with respect to $S$ will have position vector $\vec{r}\,'$ with respect to $S'$,
where
$$ \vec{r}\,' = \vec{r} - \vec{v}\, t . \tag{1.1} $$
In Newtonian mechanics it is always understood that time is absolute, and so
the times $t$ and $t'$ measured by observers in the frames $S$ and $S'$ are the same:
$$ t' = t . \tag{1.2} $$
The transformations (1.1) and (1.2) form part of what is called the Galilean Group. The
full Galilean group also includes rotations of the spatial Cartesian coordinate system, so
that we can define
$$ \vec{r}\,' = M\, \vec{r} - \vec{v}\, t , \qquad t' = t , \tag{1.3} $$
where $M$ is a constant orthogonal $3 \times 3$ matrix acting by matrix multiplication on the
components of the position vector:
$$ \vec{r} \leftrightarrow \begin{pmatrix} x \\ y \\ z \end{pmatrix} , \qquad M\, \vec{r} \leftrightarrow M \begin{pmatrix} x \\ y \\ z \end{pmatrix} , \tag{1.4} $$
where $M^T M = 1$.
Returning to our simplifying assumption that the two frames are parallel, i.e. that
$M = \mathbb{1}$, it follows that if a particle having position vector $\vec{r}$ in $S$ moves with velocity
$\vec{u} = d\vec{r}/dt$, then its velocity $\vec{u}\,' = d\vec{r}\,'/dt$ as measured with respect to the frame $S'$
is given by
$$ \vec{u}\,' = \vec{u} - \vec{v} . \tag{1.5} $$
Suppose, for example, that $\vec{v}$ lies along the $x$ axis of $S$; i.e. that $S'$ is moving along
the $x$ axis of $S$ with speed $v = |\vec{v}|$. If a beam of light were moving along the $x$ axis of $S$
with speed $c$, then the prediction of Newtonian mechanics and the Galilean transformation
would therefore be that in the frame $S'$, the speed $c'$ of the light beam would be
$$ c' = c - v . \tag{1.6} $$
As is well known, this contradicts experiment. As far as we can tell, with
experiments of ever-increasing accuracy, the true state of affairs is that the speed of the
light beam is the same in all inertial frames. Thus the predictions of Newtonian mechanics
and the Galilean transformation are falsified by experiment.
It should be emphasised that the discrepancies between experiment and the
Galilean transformations are negligible if the relative speed $v$ between the two inertial
frames is of a typical “everyday” magnitude, such as the speed of a car or a plane. But if
$v$ becomes appreciable in comparison to the speed of light, then the discrepancy
becomes appreciable too.
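By contrast, it turns out that Maxwell's equations of electromagnetism do predict a
constant speed of light, independent of the choice of inertial frame. To be precise, let us
begin with the free-space Maxwell equations,
$$ \begin{aligned} \vec{\nabla} \cdot \vec{E} &= \frac{1}{\epsilon_0}\, \rho , & \vec{\nabla} \times \vec{B} - \mu_0 \epsilon_0 \frac{\partial \vec{E}}{\partial t} &= \mu_0 \vec{J} , \\ \vec{\nabla} \cdot \vec{B} &= 0 , & \vec{\nabla} \times \vec{E} + \frac{\partial \vec{B}}{\partial t} &= 0 , \end{aligned} \tag{1.7} $$
where $\vec{E}$ and $\vec{B}$ are the electric and magnetic fields, $\rho$ and $\vec{J}$ are the charge density and
current density, and $\epsilon_0$ and $\mu_0$ are the permittivity and permeability of free space.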
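To see the electromagnetic wave solutions, we can consider a region of space where there
are no sources, i.e. where $\rho = 0$ and $\vec{J} = 0$. Taking the curl of the last equation in (1.7),
and then using the second equation with $\vec{J} = 0$, we shall have
$$ \vec{\nabla} \times (\vec{\nabla} \times \vec{E}) = -\frac{\partial}{\partial t}\, \vec{\nabla} \times \vec{B} = -\mu_0 \epsilon_0 \frac{\partial^2 \vec{E}}{\partial t^2} . \tag{1.8} $$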
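But using the vector identity $\vec{\nabla} \times (\vec{\nabla} \times \vec{E}) = \vec{\nabla} (\vec{\nabla} \cdot \vec{E}) - \nabla^2 \vec{E}$, it follows from
$\vec{\nabla} \cdot \vec{E} = 0$ that the electric field satisfies the wave equation
$$ \nabla^2 \vec{E} - \mu_0 \epsilon_0 \frac{\partial^2 \vec{E}}{\partial t^2} = 0 . \tag{1.9} $$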
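This admits plane-wave solutions of the form
$$ \vec{E} = \vec{E}_0\, e^{i (\vec{k} \cdot \vec{r} - \omega t)} , \tag{1.10} $$
where $\vec{E}_0$ and $\vec{k}$ are constant vectors, and $\omega$ is also a constant, where
$$ k^2 = \mu_0 \epsilon_0\, \omega^2 . \tag{1.11} $$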
Here $k$ means $|\vec{k}|$, the magnitude of the wave-vector $\vec{k}$. Thus we see that the waves travel
at speed $c$ given by
$$ c = \frac{\omega}{k} = \frac{1}{\sqrt{\mu_0 \epsilon_0}} . \tag{1.12} $$
Putting in the numbers, this gives $c \approx 3 \times 10^8$ metres per second, i.e. the familiar speed of
light.
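As a quick numerical check of (1.12), the following minimal Python sketch evaluates $1/\sqrt{\mu_0 \epsilon_0}$. The SI values of $\mu_0$ and $\epsilon_0$ used here are assumptions supplied for illustration (they are not quoted in these notes), and the function name `wave_speed` is hypothetical.

```python
import math

# SI values of the vacuum constants (assumed here; not quoted in the notes).
MU_0 = 4.0e-7 * math.pi        # permeability of free space, N/A^2 (approximate value)
EPSILON_0 = 8.8541878128e-12   # permittivity of free space, F/m

def wave_speed(mu_0: float, epsilon_0: float) -> float:
    """Phase speed omega/k of the plane wave (1.10), from equation (1.12)."""
    return 1.0 / math.sqrt(mu_0 * epsilon_0)

c = wave_speed(MU_0, EPSILON_0)
print(f"c = {c:.6e} m/s")
```

The printed value should agree with the defined value 299,792,458 metres per second to within the precision of the input constants.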
A similar calculation shows that the magnetic field $\vec{B}$ also satisfies an identical wave
equation, and in fact $\vec{B}$ and $\vec{E}$ are related by
$$ \vec{B} = \frac{1}{\omega}\, \vec{k} \times \vec{E} . \tag{1.13} $$
The situation, then, is that if the Maxwell equations (1.7) hold in a given frame of
reference, then they predict that the speed of light will be $c \approx 3 \times 10^8$ metres per second
in that frame. Therefore, if we assume that the Maxwell equations hold in all inertial
frames, then they predict that the speed of light will have that same value in all inertial
frames. Since this prediction is in agreement with experiment, we can reasonably expect
that the Maxwell equations will indeed hold in all inertial frames. Since the prediction
contradicts the implications of the Galilean transformations, it follows that the Maxwell
equations are not invariant under Galilean transformations. This is just as well, since the
Galilean transformations are wrong!
In fact, as we shall see, the transformations that correctly describe the relation between
observations in different inertial frames in uniform motion are the Lorentz Transformations
of Special Relativity. Furthermore, even though the Maxwell equations were written down in
the pre-relativity days of the nineteenth century, they are in fact perfectly invariant¹ under
the Lorentz transformations. No further modification is required in order to incorporate
Maxwell's theory of electromagnetism into special relativity.
¹ Strictly, as will be explained later, we should say covariant rather than invariant.
However, the Maxwell equations as they stand, written in the form given in equation
(1.7), do not look manifestly covariant with respect to Lorentz transformations. This is
because they are written in the language of 3-vectors. To make the Lorentz transformations
look nice and simple, we should instead express them in terms of 4-vectors, where the extra
component is associated with the time direction.
In order to give an elegant treatment of the Lorentz transformation properties of
the Maxwell equations, we should therefore first reformulate special relativity in terms of
4-vectors and 4-tensors. Since there are many different conventions on offer in the marketplace,
we shall begin with a review of special relativity in the notation that we shall be using in
this course.
1.2 The Lorentz Transformation
The derivation of the Lorentz transformation follows from Einstein’s two postulates:
• The laws of physics are the same for all inertial observers.
• The speed of light is the same for all inertial observers.
To derive the Lorentz transformation, let us suppose that we have two inertial frames
$S$ and $S'$, whose origins coincide at time zero, that is to say, at $t = 0$ in the frame $S$, and
at $t' = 0$ in the frame $S'$. If a flash of light is emitted at the origin at time zero, then it will
spread out over a spherical wavefront given by
$$ x^2 + y^2 + z^2 - c^2 t^2 = 0 \tag{1.14} $$
in the frame $S$, and by
$$ x'^2 + y'^2 + z'^2 - c^2 t'^2 = 0 \tag{1.15} $$
in the frame $S'$. Note that, following the second of Einstein's postulates, we have used the
same speed of light $c$ for both inertial frames. Our goal is to derive the relation between
the coordinates $(x, y, z, t)$ and $(x', y', z', t')$ in the two inertial frames.
Consider for simplicity the case where $S'$ is parallel to $S$, and moves along the $x$ axis
with velocity $v$. Clearly we must have
$$ y' = y , \qquad z' = z . \tag{1.16} $$
Furthermore, the transformation between $(x, t)$ and $(x', t')$ must be a linear one, since
otherwise it would not be invariant under spatial translations or time translations. Thus we may
say that
$$ x' = A x + B t , \qquad t' = C x + D t , \tag{1.17} $$
for constants $A$, $B$, $C$ and $D$ to be determined.
Now, if $x' = 0$, this must, by definition, correspond to the equation $x = v t$ in the frame
$S$, and so from the first equation in (1.17) we have $B = -A v$. For convenience we will
change the name of the constant $A$ to $\gamma$, and thus we have
$$ x' = \gamma (x - v t) . \tag{1.18} $$
By the same token, if we consider taking $x = 0$ then this will correspond to $x' = -v t'$ in
the frame $S'$. It follows that
$$ x = \gamma (x' + v t') . \tag{1.19} $$
Note that it must be the same constant $\gamma$ in both these equations, since the two really just
correspond to reversing the direction of the $x$ axis, and the physics must be the same for
the two cases.
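Now we bring in the postulate that the speed of light is the same in the two frames, so
if we have $x = c t$ then this must imply $x' = c t'$. Substituting into (1.18) and (1.19) gives
$$ c t' = \gamma (c - v)\, t , \qquad c t = \gamma (c + v)\, t' . \tag{1.20} $$
Multiplying these two equations together gives $c^2 = \gamma^2 (c^2 - v^2)$, and so, solving
for $\gamma$, we obtain
$$ \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} . \tag{1.21} $$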
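To get a feeling for the size of $\gamma$ in (1.21), the short Python sketch below evaluates $\gamma - 1$ for a few speeds. The particular speeds are illustrative round numbers chosen here (assumptions, not values from the notes), and the function name `gamma` is likewise just a convenient label.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v: float, c: float = C) -> float:
    """Lorentz factor of equation (1.21)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

# Illustrative speeds (assumed round numbers, not taken from the notes).
for label, v in [("car, 30 m/s", 30.0), ("airliner, 250 m/s", 250.0), ("0.6 c", 0.6 * C)]:
    print(f"{label:>18}: gamma - 1 = {gamma(v) - 1.0:.3e}")
```

For everyday speeds $\gamma - 1$ is so small that it sits close to the rounding limit of double-precision arithmetic, which is one way of seeing why the Galilean transformation works so well in ordinary circumstances.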
Solving $x^2 - c^2 t^2 = x'^2 - c^2 t'^2$ for $t'$, after using (1.18), we find $t'^2 = \gamma^2 (t - v x/c^2)^2$ and
hence
$$ t' = \gamma \Big( t - \frac{v}{c^2}\, x \Big) . \tag{1.22} $$
(We must choose the positive square root since it must reduce to $t' = +t$ at zero relative
velocity, $v$.) Thus we arrive at the Lorentz transformation
$$ x' = \gamma (x - v t) , \qquad y' = y , \qquad z' = z , \qquad t' = \gamma \Big( t - \frac{v}{c^2}\, x \Big) , \tag{1.23} $$
where $\gamma$ is given by (1.21), for the special case where $S'$ is moving along the $x$ direction
with velocity $v$.
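The transformation (1.23) can be checked numerically. The sketch below, a minimal illustration with made-up numbers, boosts an event lying on the light ray $x = c t$ and confirms that it also lies on $x' = c t'$ in $S'$. The function name `boost_x` and the chosen values of $c$, $v$ and $t$ are assumptions for the example.

```python
import math

def boost_x(x, t, v, c=1.0):
    """Apply the Lorentz transformation (1.23) to the event (x, t)."""
    g = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    return g * (x - v * t), g * (t - v * x / (c * c))

c, v = 3.0e8, 1.8e8            # illustrative values: S' moves at 0.6 c
t = 2.0
x = c * t                      # an event on the light ray x = c t
xp, tp = boost_x(x, t, v, c)
print(xp / tp)                 # speed of the same ray as measured in S'
```

The ratio $x'/t'$ comes out equal to $c$ up to rounding, in contrast to the Galilean prediction $c - v$ of (1.6).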
At this point, for notational convenience, we shall introduce the simplification of working
in a system of units in which the speed of light is set equal to 1. We can do this because the
speed of light is the same for all inertial observers, and so we may as well choose to measure
length in terms of the time it takes for light in vacuo to traverse the distance. In fact, the
metre is nowadays defined to be the distance travelled by light in vacuo in 1/299,792,458
of a second. By making the small change of taking the light-second as the basic unit of
length, rather than the 1/299,792,458th of a light-second, we end up with a system of units
in which $c = 1$. In these units, the Lorentz transformation (1.23) becomes
$$ x' = \gamma (x - v t) , \qquad y' = y , \qquad z' = z , \qquad t' = \gamma (t - v x) , \tag{1.24} $$
where
$$ \gamma = \frac{1}{\sqrt{1 - v^2}} . \tag{1.25} $$
It will be convenient to generalise the Lorentz transformation (1.24) to the case where
the frame $S'$ is moving with (constant) velocity $\vec{v}$ in an arbitrary direction, rather than
specifically along the $x$ axis. It is rather straightforward to do this. We know that there is a
complete rotational symmetry in the three-dimensional space parameterised by the $(x, y, z)$
coordinate system. Therefore, if we can first rewrite the special case described by (1.24) in
terms of 3-vectors, where the 3-vector velocity $\vec{v}$ happens to be simply $\vec{v} = (v, 0, 0)$, then
generalisation will be immediate. It is easy to check that with $\vec{v}$ taken to be $(v, 0, 0)$, the
Lorentz transformation (1.24) can be written as
$$ \vec{r}\,' = \vec{r} + \frac{\gamma - 1}{v^2}\, (\vec{v} \cdot \vec{r})\, \vec{v} - \gamma\, \vec{v}\, t , \qquad t' = \gamma (t - \vec{v} \cdot \vec{r}) , \tag{1.26} $$
with $\gamma = (1 - v^2)^{-1/2}$ and $v \equiv |\vec{v}|$, and with $\vec{r} = (x, y, z)$. Since these equations are manifestly
covariant under 3-dimensional spatial rotations (i.e. they are written entirely in a 3-vector
notation), it must be that they are the correct form of the Lorentz transformations for an
arbitrary direction for the velocity 3-vector $\vec{v}$.
The Lorentz transformations (1.26) are what are called the pure boosts. It is easy to
check that they have the property of preserving the spherical light-front condition, in the
sense that points on the expanding spherical shell given by $r^2 = t^2$ of a light-pulse emitted
at the origin at $t = 0$ in the frame $S$ will also satisfy the equivalent condition $r'^2 = t'^2$ in
the primed reference frame $S'$. (Note that $r^2 = x^2 + y^2 + z^2$.) In fact, a stronger statement
is true: The Lorentz transformation (1.26) satisfies the equation
$$ x^2 + y^2 + z^2 - t^2 = x'^2 + y'^2 + z'^2 - t'^2 . \tag{1.27} $$
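The invariance statement (1.27) for a boost in an arbitrary direction is easy to test numerically. The following Python sketch implements (1.26) in units with $c = 1$ and compares the quadratic form before and after the boost. The helper names `dot`, `boost` and `interval`, and the sample event and velocity, are illustrative choices rather than anything fixed by the notes.

```python
import math

def dot(a, b):
    """Euclidean dot product of two 3-tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))

def boost(r, t, v):
    """General pure boost (1.26) in units with c = 1; r and v are 3-tuples."""
    v2 = dot(v, v)
    if v2 == 0.0:
        return tuple(r), t     # zero velocity: identity transformation
    g = 1.0 / math.sqrt(1.0 - v2)
    vr = dot(v, r)
    r_new = tuple(ri + (g - 1.0) / v2 * vr * vi - g * vi * t for ri, vi in zip(r, v))
    return r_new, g * (t - vr)

def interval(r, t):
    """The quadratic form x^2 + y^2 + z^2 - t^2 of equation (1.27)."""
    return dot(r, r) - t * t

r, t, v = (1.0, -2.0, 0.5), 3.0, (0.3, 0.4, -0.2)   # arbitrary illustrative values
r_p, t_p = boost(r, t, v)
print(interval(r, t), interval(r_p, t_p))
```

The two printed numbers should agree up to rounding. Setting `v = (0.6, 0.0, 0.0)` reduces `boost` to the special case (1.24).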
1.3 4-vectors and 4-tensors
The Lorentz transformations given in (1.26) are linear in the space and time coordinates.
They can be written more succinctly if we first define the set of four spacetime coordinates
denoted by $x^\mu$, where $\mu$ is an index, or label, that ranges over the values 0, 1, 2 and 3. The
case $\mu = 0$ corresponds to the time coordinate $t$, while $\mu = 1$, 2 and 3 correspond to the
space coordinates $x$, $y$ and $z$ respectively. Thus we have²
$$ (x^0, x^1, x^2, x^3) = (t, x, y, z) . \tag{1.28} $$
Of course, once the abstract index label $\mu$ is replaced, as here, by the specific index values
0, 1, 2 and 3, one has to be very careful when reading a formula to distinguish between, for
example, $x^2$ meaning the symbol $x$ carrying the spacetime index $\mu = 2$, and $x^2$ meaning
the square of $x$. It should generally be obvious from the context which is meant.
² The choice to put the index label $\mu$ as a superscript, rather than a subscript, is purely conventional. But,
unlike the situation with many arbitrary conventions, in this case the coordinate index is placed upstairs in
all modern literature.
The invariant quadratic form appearing on the left-hand side of (1.27) can now be
written in a nice way, if we first introduce the 2-index quantity $\eta_{\mu\nu}$, defined to be given by
$$ \eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{1.29} $$
What this means is that the rows of the matrix on the right are labelled by the index $\mu$
and the columns are labelled by the index $\nu$. In other words, (1.29) is saying that the only
non-vanishing components of $\eta_{\mu\nu}$ are given by
$$ \eta_{00} = -1 , \qquad \eta_{11} = \eta_{22} = \eta_{33} = 1 , \tag{1.30} $$
with $\eta_{\mu\nu} = 0$ if $\mu \neq \nu$. Note that $\eta_{\mu\nu}$ is symmetric:
$$ \eta_{\mu\nu} = \eta_{\nu\mu} . \tag{1.31} $$
Using $\eta_{\mu\nu}$, the quadratic form on the left-hand side of (1.27) can be rewritten as
$$ x^2 + y^2 + z^2 - t^2 = \sum_{\mu=0}^{3} \sum_{\nu=0}^{3} \eta_{\mu\nu}\, x^\mu x^\nu . \tag{1.32} $$
At this point, it is very convenient to introduce the Einstein Summation Convention.
This makes the writing of expressions such as (1.32) much less cumbersome. The summation
convention works as follows:
In an expression such as (1.32), if an index appears exactly twice in a term, then it will
be understood that the index is summed over the natural index range (0, 1, 2, 3 in our
present case), and the explicit summation symbol will be omitted. An index that occurs
twice in a term, and is thus understood to be summed over, is called a Dummy Index.
Since in (1.32) both $\mu$ and $\nu$ occur exactly twice, we can rewrite the expression, using
the Einstein summation convention, as simply
$$ x^2 + y^2 + z^2 - t^2 = \eta_{\mu\nu}\, x^\mu x^\nu . \tag{1.33} $$
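For readers who find it helpful to see the implied sums spelled out, here is a minimal Python sketch that evaluates $\eta_{\mu\nu}\, x^\mu x^\nu$ with the double sum of (1.32) written as explicit loops. The names `ETA` and `quadratic_form`, and the sample event, are illustrative assumptions.

```python
# Minkowski metric eta_{mu nu} of equation (1.29), as a nested list indexed [mu][nu].
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

def quadratic_form(x):
    """eta_{mu nu} x^mu x^nu, with the implied double sum of (1.33) written out."""
    return sum(ETA[mu][nu] * x[mu] * x[nu] for mu in range(4) for nu in range(4))

t, x, y, z = 2.0, 1.0, 3.0, -1.0       # arbitrary illustrative event
print(quadratic_form([t, x, y, z]))     # compare with x^2 + y^2 + z^2 - t^2
```

Each repeated index in (1.33) corresponds to one `for` clause in the generator expression; the list passed in is ordered as $(x^0, x^1, x^2, x^3) = (t, x, y, z)$, as in (1.28).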
One might at first think there would be a great potential for ambiguity, but this is not the
case. The point is that in any valid vectorial (or, more generally, tensorial) expression, the
only time that a particular index can ever occur exactly twice in a term is when it is summed
over. Thus, there is no ambiguity resulting from agreeing to omit the explicit summation
symbol, since it is logically inevitable that a summation is intended.³
Now let us return to the Lorentz transformations. The pure boosts written in (1.26),
being linear in the space and time coordinates, can be written in the form
$$ x'^\mu = \Lambda^\mu{}_\nu\, x^\nu , \tag{1.34} $$
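where $\Lambda^\mu{}_\nu$ are constants, and the Einstein summation convention is operative for the dummy
index $\nu$. By comparing (1.34) carefully with (1.26), we can see that the components $\Lambda^\mu{}_\nu$
are given by
$$ \begin{aligned} \Lambda^0{}_0 &= \gamma , & \Lambda^0{}_i &= -\gamma\, v_i , \\ \Lambda^i{}_0 &= -\gamma\, v_i , & \Lambda^i{}_j &= \delta_{ij} + \frac{\gamma - 1}{v^2}\, v_i v_j , \end{aligned} \tag{1.35} $$
where $\delta_{ij}$ is the Kronecker delta symbol,
$$ \delta_{ij} = 1 \ \text{ if } \ i = j , \qquad \delta_{ij} = 0 \ \text{ if } \ i \neq j . \tag{1.36} $$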
A couple of points need to be explained here. Firstly, we are introducing Latin indices here,
namely the i and j indices, which range only over the three spatial index values, i = 1, 2
and 3. Thus the 4index µ can be viewed as µ = (0, i), where i = 1, 2 and 3. This piece
of notation is useful because the three spatial index values always occur on a completely
symmetric footing, whereas the time index value $\mu = 0$ is a bit different. This can be seen,
for example, in the definition of $\eta_{\mu\nu}$ in (1.29) or (1.30).
The second point is that when we consider spatial indices (for example when $\mu$ takes the
values $i = 1$, 2 or 3), it actually makes no difference whether we write the index $i$ upstairs
or downstairs. Sometimes, as in (1.35), it will be convenient to be rather relaxed about
whether we put spatial indices upstairs or downstairs. By contrast, when the index takes
the value 0, it is very important to be careful about whether it is upstairs or downstairs.
The reason why we can be cavalier about the Latin indices, but not the Greek, will become
clearer as we proceed.
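We already saw that the Lorentz boost transformations (1.26), re-expressed in terms of
$\Lambda^\mu{}_\nu$ in (1.35), have the property that $\eta_{\mu\nu}\, x^\mu x^\nu = \eta_{\mu\nu}\, x'^\mu x'^\nu$. Thus from (1.34) we have
$$ \eta_{\mu\nu}\, x^\mu x^\nu = \eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, x^\rho x^\sigma . \tag{1.37} $$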
³ As a side remark, it should be noted that in a valid vectorial or tensorial expression, a specific index can
NEVER appear more than twice in a given term. If you have written down a term where a given index
occurs 3, 4 or more times then there is no need to look further at it; it is WRONG. Thus, for example, it
is totally meaningless to write $\eta_{\mu\mu}\, x^\mu x^\mu$. If you ever find such an expression in a calculation then you must
stop, and go back to find the place where an error was made.
(Note that we have been careful to choose two different dummy indices for the two implicit
summations over $\rho$ and $\sigma$!) On the left-hand side, we can replace the dummy indices $\mu$ and
$\nu$ by $\rho$ and $\sigma$, and thus write
$$ \eta_{\rho\sigma}\, x^\rho x^\sigma = \eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, x^\rho x^\sigma . \tag{1.38} $$
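This can be grouped together as
$$ (\eta_{\rho\sigma} - \eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma)\, x^\rho x^\sigma = 0 , \tag{1.39} $$
and, since it is true for any $x^\mu$ and the quantity in brackets is symmetric in $\rho$ and $\sigma$, we must have that
$$ \eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma = \eta_{\rho\sigma} . \tag{1.40} $$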
(This can also be verified directly from (1.35).) The full set of $\Lambda$'s that satisfy (1.40) are
the Lorentz Transformations. The Lorentz Boosts, given by (1.35), are examples, but they
are just a subset of the full set of Lorentz transformations that satisfy (1.40). Essentially,
the additional Lorentz transformations consist of rotations of the three-dimensional spatial
coordinates. Thus, one can really say that the Lorentz boosts (1.35) are the “interesting”
Lorentz transformations, i.e. the ones that rotate space and time into one another. The
remainder are just rotations of our familiar old 3-dimensional Euclidean space.
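The direct verification of (1.40) from (1.35) can also be carried out numerically. The sketch below builds the boost matrix for a sample velocity and measures how far $\eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma$ is from $\eta_{\rho\sigma}$. The function names `boost_matrix` and `metric_defect` are hypothetical, the sample velocity is arbitrary, and the code assumes a non-zero velocity (so that the division by $v^2$ is defined) in units with $c = 1$.

```python
import math

ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

def boost_matrix(v):
    """Components Lambda^mu_nu of (1.35), indexed [mu][nu], for a non-zero 3-velocity v."""
    v2 = sum(vi * vi for vi in v)
    g = 1.0 / math.sqrt(1.0 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = g
    for i in range(3):
        lam[0][i + 1] = lam[i + 1][0] = -g * v[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            lam[i + 1][j + 1] = delta + (g - 1.0) / v2 * v[i] * v[j]
    return lam

def metric_defect(lam):
    """Largest |eta_{mu nu} Lambda^mu_rho Lambda^nu_sigma - eta_{rho sigma}|, cf. (1.40)."""
    worst = 0.0
    for rho in range(4):
        for sigma in range(4):
            total = sum(ETA[m][n] * lam[m][rho] * lam[n][sigma]
                        for m in range(4) for n in range(4))
            worst = max(worst, abs(total - ETA[rho][sigma]))
    return worst

print(metric_defect(boost_matrix((0.3, 0.4, -0.2))))   # should be at rounding level
```

The same `metric_defect` test can be applied to a spatial rotation matrix embedded in the lower-right $3 \times 3$ block, which illustrates that rotations satisfy (1.40) as well.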
The coordinates $x^\mu = (x^0, x^i)$ live in a four-dimensional spacetime, known as Minkowski
Spacetime. This is the four-dimensional analogue of the three-dimensional Euclidean Space
described by the Cartesian coordinates $x^i = (x, y, z)$. The quantity $\eta_{\mu\nu}$ is called the
Minkowski Metric, and for reasons that we shall see presently, it is called a tensor. It is
called a metric because it provides the rule for measuring distances in the four-dimensional
Minkowski spacetime. The distance, or to be more precise, the interval, between two
infinitesimally-separated points $(x^0, x^1, x^2, x^3)$ and $(x^0 + dx^0, x^1 + dx^1, x^2 + dx^2, x^3 + dx^3)$
in spacetime is written as $ds$, and is given by
$$ ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu . \tag{1.41} $$
Clearly, this is the Minkowskian generalisation of the three-dimensional distance $ds_E$
between neighbouring points $(x, y, z)$ and $(x + dx, y + dy, z + dz)$ in Euclidean space, which,
by Pythagoras' theorem, is given by
$$ ds_E^2 = dx^2 + dy^2 + dz^2 = \delta_{ij}\, dx^i dx^j . \tag{1.42} $$
The Euclidean metric (1.42) is invariant under arbitrary constant rotations of the $(x, y, z)$
coordinate system. (This is clearly true because the distance between the neighbouring
points must obviously be independent of how the axes of the Cartesian coordinate system
are oriented.) By the same token, the Minkowski metric (1.41) is invariant under arbitrary
Lorentz transformations. In other words, as can be seen to follow immediately from (1.40),
the spacetime interval $ds'^2 = \eta_{\mu\nu}\, dx'^\mu dx'^\nu$ calculated in the primed frame is identical to the
interval $ds^2$ calculated in the unprimed frame:
$$ \begin{aligned} ds'^2 = \eta_{\mu\nu}\, dx'^\mu dx'^\nu &= \eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, dx^\rho dx^\sigma , \\ &= \eta_{\rho\sigma}\, dx^\rho dx^\sigma = ds^2 . \end{aligned} \tag{1.43} $$
For this reason, we do not need to distinguish between $ds^2$ and $ds'^2$, since it is the same in
all inertial frames. It is what is called a Lorentz Scalar.
The Lorentz transformation rule of the coordinate differential $dx^\mu$, i.e.
$$ dx'^\mu = \Lambda^\mu{}_\nu\, dx^\nu , \tag{1.44} $$
can be taken as the prototype for more general 4-vectors. Thus, we may define any set
of four quantities $U^\mu$, for $\mu = 0$, 1, 2 and 3, to be the components of a Lorentz 4-vector
(often, we shall just abbreviate this to simply a 4-vector) if they transform, under Lorentz
transformations, according to the rule
$$ U'^\mu = \Lambda^\mu{}_\nu\, U^\nu . \tag{1.45} $$
The Minkowski metric $\eta_{\mu\nu}$ may be thought of as a $4 \times 4$ matrix, whose rows are labelled
by $\mu$ and columns labelled by $\nu$, as in (1.29). Clearly, the inverse of this matrix takes
the same form as the matrix itself. We denote the components of the inverse matrix by
$\eta^{\mu\nu}$. This is called, not surprisingly, the inverse Minkowski metric. Clearly it satisfies the
relation
$$ \eta_{\mu\nu}\, \eta^{\nu\rho} = \delta_\mu^\rho , \tag{1.46} $$
where the 4-dimensional Kronecker delta is defined to equal 1 if $\mu = \rho$, and to equal 0 if
$\mu \neq \rho$. Note that like $\eta_{\mu\nu}$, the inverse $\eta^{\mu\nu}$ is symmetric also: $\eta^{\mu\nu} = \eta^{\nu\mu}$.
The Minkowski metric and its inverse may be used to lower or raise the indices on other
quantities. Thus, for example, if $U^\mu$ are the components of a 4-vector, then we may define
$$ U_\mu = \eta_{\mu\nu}\, U^\nu . \tag{1.47} $$
This is another type of 4-vector. To distinguish the two, we call a 4-vector with an upstairs
index a contravariant 4-vector, while one with a downstairs index is called a covariant
4-vector. Note that if we raise the lowered index in (1.47) again using $\eta^{\mu\nu}$, then we get back
to the starting point:
$$ \eta^{\mu\nu}\, U_\nu = \eta^{\mu\nu}\, \eta_{\nu\rho}\, U^\rho = \delta^\mu_\rho\, U^\rho = U^\mu . \tag{1.48} $$
It is for this reason that we can use the same symbol $U$ for the covariant 4-vector $U_\mu = \eta_{\mu\nu}\, U^\nu$
as we used for the contravariant 4-vector $U^\mu$.
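The lowering and raising operations (1.47) and (1.48) amount to a sign flip of the time component. A minimal Python sketch, with hypothetical function names `lower` and `raise_index` and an arbitrary sample vector, makes the round trip explicit:

```python
ETA_LOWER = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # eta_{mu nu}
ETA_UPPER = [row[:] for row in ETA_LOWER]   # eta^{mu nu}: same entries, cf. (1.46)

def lower(u_up):
    """U_mu = eta_{mu nu} U^nu, equation (1.47)."""
    return [sum(ETA_LOWER[m][n] * u_up[n] for n in range(4)) for m in range(4)]

def raise_index(u_down):
    """U^mu = eta^{mu nu} U_nu, the first step of (1.48)."""
    return [sum(ETA_UPPER[m][n] * u_down[n] for n in range(4)) for m in range(4)]

u_up = [5, 1, 2, 3]               # arbitrary illustrative components U^mu
u_down = lower(u_up)
print(u_down)                     # only the time component changes sign
print(raise_index(u_down))        # back to the starting components
```

Because the spatial block of the metric is the unit matrix, the components with index 1, 2 and 3 are untouched, which is the reason one can be relaxed about the position of Latin indices.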
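In a similar fashion, we may define the quantities $\Lambda_\mu{}^\nu$ by
$$ \Lambda_\mu{}^\nu = \eta_{\mu\rho}\, \eta^{\nu\sigma}\, \Lambda^\rho{}_\sigma . \tag{1.49} $$
It is then clear that (1.40) can be restated as
$$ \Lambda^\mu{}_\nu\, \Lambda_\mu{}^\rho = \delta_\nu^\rho . \tag{1.50} $$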
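We can also then invert the Lorentz transformation $x'^\mu = \Lambda^\mu{}_\nu\, x^\nu$, by multiplying it by
$\Lambda_\mu{}^\rho$ and using (1.50), to give
$$ x^\mu = \Lambda_\nu{}^\mu\, x'^\nu . \tag{1.51} $$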
It now follows from (1.45) that the components of the covariant 4-vector $U_\mu$ defined by
(1.47) transform under Lorentz transformations according to the rule
$$ U'_\mu = \Lambda_\mu{}^\nu\, U_\nu . \tag{1.52} $$
Any set of 4 quantities $U_\mu$ which transform in this way under Lorentz transformations will
be called a covariant 4-vector.
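The consistency of (1.45), (1.47), (1.49) and (1.52) can be illustrated numerically: transforming $U_\mu$ with $\Lambda_\mu{}^\nu$ should give the same result as transforming $U^\mu$ with $\Lambda^\mu{}_\nu$ and then lowering the index. The sketch below does this for a boost along $x$ in units with $c = 1$. The function names, the boost speed 0.6 and the sample vector are all illustrative assumptions.

```python
import math

ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]  # eta_{mu nu} = eta^{mu nu}

def boost_x(v):
    """Lambda^mu_nu, indexed [mu][nu], for the boost (1.24) along x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0, 0], [-g * v, g, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

def lowered_raised(lam):
    """Lambda_mu^nu = eta_{mu rho} eta^{nu sigma} Lambda^rho_sigma, equation (1.49)."""
    return [[sum(ETA[m][r] * ETA[n][s] * lam[r][s] for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]

def apply(mat, vec):
    """Contract the second index of mat with the index of vec."""
    return [sum(mat[a][b] * vec[b] for b in range(4)) for a in range(4)]

lam = boost_x(0.6)
lam_cov = lowered_raised(lam)
u_up = [2.0, 1.0, -1.0, 0.5]                 # arbitrary illustrative U^mu
u_down = apply(ETA, u_up)                    # lower the index, (1.47)
route_a = apply(lam_cov, u_down)             # transform U_mu directly via (1.52)
route_b = apply(ETA, apply(lam, u_up))       # transform U^mu via (1.45), then lower
print(route_a)
print(route_b)
```

The two printed lists should agree up to rounding, which is the content of the statement that (1.52) follows from (1.45).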
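Using (1.51), we can see that the gradient operator $\partial/\partial x^\mu$ transforms as a covariant
4-vector. Using the chain rule for partial differentiation we have
$$ \frac{\partial}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\, \frac{\partial}{\partial x^\nu} . \tag{1.53} $$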
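But from (1.51) we have (after a relabelling of indices) that
$$ \frac{\partial x^\nu}{\partial x'^\mu} = \Lambda_\mu{}^\nu , \tag{1.54} $$
and hence (1.53) gives
$$ \frac{\partial}{\partial x'^\mu} = \Lambda_\mu{}^\nu\, \frac{\partial}{\partial x^\nu} . \tag{1.55} $$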
As can be seen from (1.52), this is precisely the transformation rule for a covariant
4-vector. The gradient operator arises sufficiently often that it is useful to use a special symbol
to denote it. We therefore define
$$ \partial_\mu \equiv \frac{\partial}{\partial x^\mu} . \tag{1.56} $$
Thus the Lorentz transformation rule (1.55) is now written as
$$ \partial'_\mu = \Lambda_\mu{}^\nu\, \partial_\nu . \tag{1.57} $$
1.4 Lorentz tensors
Having seen how contravariant and covariant 4-vectors transform under Lorentz
transformations (as given in (1.45) and (1.52) respectively), we can now define the transformation
rules for more general objects called tensors. These objects carry multiple indices, and each
one transforms with a $\Lambda$ factor, of either the (1.45) type if the index is upstairs, or of the
(1.52) type if the index is downstairs. Thus, for example, a tensor $T^{\mu\nu}$ transforms under
Lorentz transformations according to the rule
$$ T'^{\mu\nu} = \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, T^{\rho\sigma} . \tag{1.58} $$
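More generally, a tensor $T^{\mu_1 \cdots \mu_m}{}_{\nu_1 \cdots \nu_n}$ will transform according to the rule
$$ T'^{\mu_1 \cdots \mu_m}{}_{\nu_1 \cdots \nu_n} = \Lambda^{\mu_1}{}_{\rho_1} \cdots \Lambda^{\mu_m}{}_{\rho_m}\, \Lambda_{\nu_1}{}^{\sigma_1} \cdots \Lambda_{\nu_n}{}^{\sigma_n}\, T^{\rho_1 \cdots \rho_m}{}_{\sigma_1 \cdots \sigma_n} . \tag{1.59} $$
Note that scalars are just special cases of tensors with no indices, while vectors are special
cases with just one index.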
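It is easy to see that products of tensors give rise again to tensors. For example, if $U^\mu$
and $V^\mu$ are two contravariant vectors then $T^{\mu\nu} \equiv U^\mu V^\nu$ is a tensor, since, using the known
transformation rules for $U$ and $V$ we have
$$ \begin{aligned} T'^{\mu\nu} = U'^\mu V'^\nu &= \Lambda^\mu{}_\rho\, U^\rho\, \Lambda^\nu{}_\sigma\, V^\sigma , \\ &= \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, T^{\rho\sigma} . \end{aligned} \tag{1.60} $$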
Note that the gradient operator $\partial_\mu$ can also be used to map a tensor into another
tensor. For example, if $U^\mu$ is a vector field (i.e. a vector that changes from place to place in
spacetime) then $S_\mu{}^\nu \equiv \partial_\mu U^\nu$ is a tensor field.
We may also define the operation of Contraction, which reduces a tensor to one with
a smaller number of indices. A contraction is performed by setting an upstairs index on a
tensor equal to a downstairs index. The Einstein summation convention then automatically
comes into play, and the result is that one has an object with one fewer upstairs index and
one fewer downstairs index. Furthermore, a simple calculation shows that the new object
is itself a tensor. Consider, for example, a tensor $T^\mu{}_\nu$. This, of course, transforms as
$$ T'^\mu{}_\nu = \Lambda^\mu{}_\rho\, \Lambda_\nu{}^\sigma\, T^\rho{}_\sigma \tag{1.61} $$
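under Lorentz transformations. If we form the contraction and define $\phi \equiv T^\mu{}_\mu$, then we see
that under Lorentz transformations we shall have
$$ \begin{aligned} \phi' \equiv T'^\mu{}_\mu &= \Lambda^\mu{}_\rho\, \Lambda_\mu{}^\sigma\, T^\rho{}_\sigma , \\ &= \delta_\rho^\sigma\, T^\rho{}_\sigma = \phi . \end{aligned} \tag{1.62} $$
Since $\phi' = \phi$, it follows, by definition, that $\phi$ is a scalar.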
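The calculation (1.62) can be mirrored numerically: transform a mixed tensor with (1.61) and check that its contraction is unchanged. The following Python sketch does so for a boost along $x$ with $c = 1$. The component values of `T`, the boost speed, and the function names are arbitrary illustrative choices.

```python
import math

ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

def boost_x(v):
    """Lambda^mu_nu, indexed [mu][nu], for a boost along x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0, 0], [-g * v, g, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

def transform_mixed(lam, t_mixed):
    """T'^mu_nu = Lambda^mu_rho Lambda_nu^sigma T^rho_sigma, equation (1.61)."""
    lam_cov = [[sum(ETA[n][a] * ETA[s][b] * lam[a][b] for a in range(4) for b in range(4))
                for s in range(4)] for n in range(4)]           # Lambda_nu^sigma, (1.49)
    return [[sum(lam[m][r] * lam_cov[n][s] * t_mixed[r][s]
                 for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]

def contract(t_mixed):
    """phi = T^mu_mu: set the upper index equal to the lower one and sum."""
    return sum(t_mixed[m][m] for m in range(4))

# Arbitrary illustrative components T^mu_nu, indexed [mu][nu].
T = [[1.0, 2.0, 0.0, -1.0], [0.5, -3.0, 4.0, 0.0], [2.0, 1.0, 0.25, 1.0], [0.0, -2.0, 3.0, 5.0]]
T_prime = transform_mixed(boost_x(0.6), T)
print(contract(T), contract(T_prime))
```

The individual components of `T_prime` differ from those of `T`, but the two contractions agree up to rounding, as (1.62) requires.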
An essentially identical calculation shows that for a tensor with arbitrary numbers of
upstairs and downstairs indices, if one makes an index contraction of one upstairs with one
downstairs index, the result is a tensor with the corresponding reduced numbers of indices.
Of course multiple contractions work in the same way.
The Minkowski metric $\eta_{\mu\nu}$ is itself a tensor, but of a rather special type, known as an
invariant tensor. This is because, unlike a generic 2-index tensor, the Minkowski metric is
identical in all Lorentz frames. This can be seen from (1.40), which can be rewritten as the
statement
$$ \eta'_{\mu\nu} \equiv \Lambda_\mu{}^\rho\, \Lambda_\nu{}^\sigma\, \eta_{\rho\sigma} = \eta_{\mu\nu} . \tag{1.63} $$
The same is also true for the inverse metric $\eta^{\mu\nu}$.
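We already saw that the gradient operator $\partial_\mu \equiv \partial/\partial x^\mu$ transforms as a covariant vector.
If we define, in the standard way, $\partial^\mu \equiv \eta^{\mu\nu}\, \partial_\nu$, then it is evident from what we have seen
above that the operator
$$ \Box \equiv \partial^\mu \partial_\mu = \eta^{\mu\nu}\, \partial_\mu \partial_\nu \tag{1.64} $$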
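transforms as a scalar under Lorentz transformations. This is a very important operator,
which is otherwise known as the wave operator, or d'Alembertian:
$$ \Box = -\partial_0 \partial_0 + \partial_i \partial_i = -\frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} . \tag{1.65} $$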
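As a rough numerical illustration of why (1.65) is called the wave operator, the sketch below estimates $\Box f$ by central finite differences for a function that travels at unit speed and for one that does not. It is a simplified check under stated assumptions: only one spatial dimension is retained, the step size `h` and sample point are arbitrary, and the names `box`, `wave` and `not_wave` are hypothetical.

```python
import math

def box(f, t, x, h=1.0e-3):
    """Finite-difference estimate of (-d^2/dt^2 + d^2/dx^2) f at (t, x), cf. (1.65).

    Only one spatial dimension is kept; the y and z derivatives vanish for the
    test functions used below.
    """
    f_tt = (f(t + h, x) - 2.0 * f(t, x) + f(t - h, x)) / (h * h)
    f_xx = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / (h * h)
    return -f_tt + f_xx

k = 2.0
wave = lambda t, x: math.cos(k * x - k * t)            # omega = k: speed 1 in c = 1 units
not_wave = lambda t, x: math.cos(k * x - 0.5 * k * t)  # omega = k/2: wrong dispersion relation

print(box(wave, 0.3, 0.7))        # close to zero
print(box(not_wave, 0.3, 0.7))    # not close to zero
```

A function of $x - t$ alone is annihilated by $\Box$, which is the $c = 1$ counterpart of the wave equation (1.9) with the dispersion relation (1.11).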
It is worth commenting further at this stage about a remark that was made earlier.
Notice that in (1.65) we have been cavalier about the location of the Latin indices, which
of course range only over the three spatial directions $i = 1$, 2 and 3. We can get away with
this because the metric that is used to raise or lower the Latin indices is just the Minkowski
metric restricted to the index values 1, 2 and 3. But since we have
$$ \eta_{00} = -1 , \qquad \eta_{ij} = \delta_{ij} , \qquad \eta_{0i} = \eta_{i0} = 0 , \tag{1.66} $$
this means that Latin indices are lowered and raised using the Kronecker delta $\delta_{ij}$ and
its inverse $\delta^{ij}$. But these are just the components of the unit matrix, and so raising or
lowering Latin indices has no effect. It is because of the minus sign associated with the $\eta_{00}$
component of the Minkowski metric that we have to pay careful attention to the process of
raising and lowering Greek indices. Thus, we can get away with writing $\partial_i \partial_i$, but we cannot
write $\partial_\mu \partial_\mu$.
1.5 Proper time and 4-velocity
We defined the Lorentz-invariant interval $ds$ between infinitesimally-separated spacetime events by
$$ ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu = -dt^2 + dx^2 + dy^2 + dz^2 . \tag{1.67} $$
This is the Minkowskian generalisation of the spatial interval in Euclidean space. Note that $ds^2$ can be positive, negative or zero. These cases correspond to what are called spacelike, timelike or null separations, respectively.
On occasion, it is useful to define the negative of $ds^2$, and write
$$ d\tau^2 = -ds^2 = -\eta_{\mu\nu}\, dx^\mu dx^\nu = dt^2 - dx^2 - dy^2 - dz^2 . \tag{1.68} $$
This is called the Proper Time interval, and $\tau$ is the proper time. Since $ds$ is a Lorentz scalar, it is obvious that $d\tau$ is a scalar too.
We know that $dx^\mu$ transforms as a contravariant 4-vector. Since $d\tau$ is a scalar, it follows that
$$ U^\mu \equiv \frac{dx^\mu}{d\tau} \tag{1.69} $$
is a contravariant 4-vector also. If we think of a particle following a path, or worldline, in spacetime parameterised by the proper time $\tau$, i.e. it follows the path $x^\mu = x^\mu(\tau)$, then $U^\mu$ defined in (1.69) is called the 4-velocity of the particle.
It is useful to see how the 4-velocity is related to the usual notion of 3-velocity of a particle. By definition, the 3-velocity $\vec u$ is a 3-vector with components $u^i$ given by
$$ u^i = \frac{dx^i}{dt} . \tag{1.70} $$
From (1.68), it follows that
$$ d\tau^2 = dt^2 \left[ 1 - (dx/dt)^2 - (dy/dt)^2 - (dz/dt)^2 \right] = dt^2 (1 - u^2) , \tag{1.71} $$
where $u = |\vec u|$, or in other words, $u = \sqrt{u^i u^i}$. In view of the definition of the $\gamma$ factor in (1.25), it is natural to define
$$ \gamma \equiv \frac{1}{\sqrt{1 - u^2}} . \tag{1.72} $$
Thus we have $d\tau = dt/\gamma$, and so from (1.69) the 4-velocity can be written as
$$ U^\mu = \frac{dt}{d\tau}\, \frac{dx^\mu}{dt} = \gamma\, \frac{dx^\mu}{dt} . \tag{1.73} $$
Since $dx^0/dt = 1$ and $dx^i/dt = u^i$, we therefore have that
$$ U^0 = \gamma , \qquad U^i = \gamma\, u^i . \tag{1.74} $$
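Note that $U^\mu U_\mu = -1$, since, from (1.68), we have
$$ U^\mu U_\mu = \eta_{\mu\nu}\, U^\mu U^\nu = \frac{\eta_{\mu\nu}\, dx^\mu dx^\nu}{(d\tau)^2} = \frac{-(d\tau)^2}{(d\tau)^2} = -1 . \tag{1.75} $$
We shall sometimes find it convenient to rewrite (1.74) as
$$ U^\mu = (\gamma, \gamma\, u^i) \qquad \text{or} \qquad U^\mu = (\gamma, \gamma\, \vec u) . \tag{1.76} $$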
Having set up the 4-vector formalism, it is now completely straightforward to write down how velocities transform under Lorentz transformations. We know that the 4-velocity $U^\mu$ will transform according to (1.45), and this is identical to the way that the coordinates $x^\mu$ transform:
$$ U'^\mu = \Lambda^{\mu}{}_{\nu}\, U^\nu , \qquad x'^\mu = \Lambda^{\mu}{}_{\nu}\, x^\nu . \tag{1.77} $$
Therefore, if we want to know how the 3-velocity transforms, we need only write down the Lorentz transformations for $(t, x, y, z)$, and then replace $(t, x, y, z)$ by $(U^0, U^1, U^2, U^3)$. Finally, using (1.76) to express $(U^0, U^1, U^2, U^3)$ in terms of $\vec u$ will give the result.
Consider, for simplicity, the case where $S'$ is moving along the $x$ axis with velocity $v$. The Lorentz transformation for $U^\mu$ can therefore be read off from (1.24) and (1.25):
$$ U'^0 = \gamma_v (U^0 - v U^1) , \qquad U'^1 = \gamma_v (U^1 - v U^0) , \qquad U'^2 = U^2 , \qquad U'^3 = U^3 , \tag{1.78} $$
where we are now using $\gamma_v \equiv (1 - v^2)^{-1/2}$ to denote the gamma factor of the Lorentz transformation, to distinguish it from the $\gamma$ constructed from the 3-velocity $\vec u$ of the particle in the frame $S$, which is defined in (1.72). Thus from (1.76) we have
$$ \gamma' = \gamma\, \gamma_v (1 - v u_x) , \qquad \gamma' u'_x = \gamma\, \gamma_v (u_x - v) , \qquad \gamma' u'_y = \gamma\, u_y , \qquad \gamma' u'_z = \gamma\, u_z , \tag{1.79} $$
where, of course, $\gamma' = (1 - u'^2)^{-1/2}$ is the analogue of $\gamma$ in the frame $S'$. Thus we find
$$ u'_x = \frac{u_x - v}{1 - v u_x} , \qquad u'_y = \frac{u_y}{\gamma_v (1 - v u_x)} , \qquad u'_z = \frac{u_z}{\gamma_v (1 - v u_x)} . \tag{1.80} $$
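The route from (1.78) to (1.80) can be sketched numerically. The code below, which assumes units with $c = 1$ and a boost along $x$, computes the transformed 3-velocity in two ways: directly from (1.80), and by boosting the 4-velocity with (1.78) and then dividing the spatial components by $U'^0$. The function names are illustrative only.

```python
import math

def gamma(speed):
    """Gamma factor for a given speed (units with c = 1)."""
    return 1.0 / math.sqrt(1.0 - speed * speed)

def transform_velocity(u, v):
    """Apply (1.80) to the 3-velocity u = (ux, uy, uz); S' moves with speed v along x."""
    ux, uy, uz = u
    denom = 1.0 - v * ux
    gv = gamma(v)
    return ((ux - v) / denom, uy / (gv * denom), uz / (gv * denom))

def transform_via_four_velocity(u, v):
    """Boost U^mu = (gamma, gamma u) using (1.78), then recover u' = U'^i / U'^0."""
    g = gamma(math.sqrt(sum(c * c for c in u)))
    U = (g, g * u[0], g * u[1], g * u[2])
    gv = gamma(v)
    U0 = gv * (U[0] - v * U[1])
    U1 = gv * (U[1] - v * U[0])
    return (U1 / U0, U[2] / U0, U[3] / U0)

if __name__ == "__main__":
    u, v = (0.5, 0.3, 0.1), 0.6
    print(transform_velocity(u, v))
    print(transform_via_four_velocity(u, v))
```

The two printed triples should agree up to rounding. Note also that `transform_velocity((1.0, 0.0, 0.0), v)` returns a speed of 1 for any $|v| < 1$, consistent with the frame independence of the speed of light.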
2 Electrodynamics and Maxwell’s Equations
2.1 Natural units
We saw earlier that the supposition of the universal validity of Maxwell's equations in all inertial frames, which in particular would imply that the speed of light should be the same in all frames, is consistent with experiment. It is therefore reasonable to expect that Maxwell's equations should be compatible with special relativity. However, written in their standard form (1.7), this compatibility is by no means apparent. Our next task will be to re-express the Maxwell equations, in terms of 4-tensors, in a way that makes their Lorentz covariance manifest.
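We shall begin by changing units from the S.I. system in which the Maxwell equations are given in (1.7). The first step is to change to Gaussian units, by performing the rescalings
$$ \vec E \longrightarrow \frac{1}{\sqrt{4\pi\epsilon_0}}\, \vec E , \qquad \vec B \longrightarrow \sqrt{\frac{\mu_0}{4\pi}}\, \vec B , \qquad \rho \longrightarrow \sqrt{4\pi\epsilon_0}\, \rho , \qquad \vec J \longrightarrow \sqrt{4\pi\epsilon_0}\, \vec J . \tag{2.1} $$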
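Bearing in mind that the speed of light is given by $c = 1/\sqrt{\mu_0 \epsilon_0}$, we see that the Maxwell equations (1.7) become
$$ \vec\nabla \cdot \vec E = 4\pi \rho , \qquad \vec\nabla \times \vec B - \frac{1}{c} \frac{\partial \vec E}{\partial t} = \frac{4\pi}{c}\, \vec J , $$
$$ \vec\nabla \cdot \vec B = 0 , \qquad \vec\nabla \times \vec E + \frac{1}{c} \frac{\partial \vec B}{\partial t} = 0 . \tag{2.2} $$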
Finally, we pass from Gaussian units to Natural units, by choosing our units of length and time so that $c = 1$, as we did in our discussion of special relativity. Thus, in natural units, the Maxwell equations become
$$ \vec\nabla \cdot \vec E = 4\pi \rho , \qquad \vec\nabla \times \vec B - \frac{\partial \vec E}{\partial t} = 4\pi\, \vec J , \tag{2.3} $$
$$ \vec\nabla \cdot \vec B = 0 , \qquad \vec\nabla \times \vec E + \frac{\partial \vec B}{\partial t} = 0 . \tag{2.4} $$
The equations (2.3), which have sources on the right-hand side, are called the Field Equations. The equations (2.4) are called Bianchi Identities. We shall elaborate on this a little later.
2.2 Gauge potentials and gauge invariance
We already remarked that the two Maxwell equations (2.4) are known as Bianchi identities. They are not field equations, since there are no sources; rather, they impose constraints on the electric and magnetic fields. The first equation in (2.4), i.e. $\vec\nabla \cdot \vec B = 0$, can be solved by writing
$$ \vec B = \vec\nabla \times \vec A , \tag{2.5} $$
where $\vec A$ is the magnetic 3-vector potential. Note that (2.5) identically solves $\vec\nabla \cdot \vec B = 0$, because of the vector identity div curl $\equiv 0$. Substituting (2.5) into the second equation in (2.4), we obtain
$$ \vec\nabla \times \left( \vec E + \frac{\partial \vec A}{\partial t} \right) = 0 . \tag{2.6} $$
This can be solved, again identically (since curl grad $\equiv 0$), by writing
$$ \vec E + \frac{\partial \vec A}{\partial t} = -\vec\nabla \phi , \tag{2.7} $$
where $\phi$ is the electric scalar potential. Thus we can solve the Bianchi identities (2.4) by writing $\vec E$ and $\vec B$ in terms of scalar and 3-vector potentials $\phi$ and $\vec A$:
$$ \vec E = -\vec\nabla \phi - \frac{\partial \vec A}{\partial t} , \qquad \vec B = \vec\nabla \times \vec A . \tag{2.8} $$
Although we have now “disposed of” the two Maxwell equations in (2.4), it has been achieved at a price, in that there is a redundancy in the choice of gauge potentials $\phi$ and $\vec A$. First, we may note that $\vec B$ in (2.8) is unchanged if we make the replacement
$$ \vec A \longrightarrow \vec A + \vec\nabla \lambda , \tag{2.9} $$
where $\lambda$ is an arbitrary function of position and time. The expression for $\vec E$ will also be invariant, if we simultaneously make the replacement
$$ \phi \longrightarrow \phi - \frac{\partial \lambda}{\partial t} . \tag{2.10} $$
To summarise, if a given set of electric and magnetic fields $\vec E$ and $\vec B$ are described by a scalar potential $\phi$ and 3-vector potential $\vec A$ according to (2.8), then the identical physical situation (i.e. identical electric and magnetic fields) is equally well described by a new pair of scalar and 3-vector potentials, related to the original pair by the Gauge Transformations given in (2.9) and (2.10), where $\lambda$ is an arbitrary function of position and time.
We can in fact use the gauge invariance to our advantage, by making a convenient and simplifying gauge choice for the scalar and 3-vector potentials. We have one arbitrary function (i.e. $\lambda(t, \vec r)$) at our disposal, and so this allows us to impose one functional relation on the potentials $\phi$ and $\vec A$. For our present purposes, the most useful gauge choice is to use this freedom to impose the Lorentz gauge condition,
$$ \vec\nabla \cdot \vec A + \frac{\partial \phi}{\partial t} = 0 . \tag{2.11} $$
Substituting (2.8) into the remaining Maxwell equations (i.e. (2.3)), and using the Lorentz gauge condition (2.11), we therefore find
$$ \nabla^2 \phi - \frac{\partial^2 \phi}{\partial t^2} = -4\pi \rho , \qquad \nabla^2 \vec A - \frac{\partial^2 \vec A}{\partial t^2} = -4\pi\, \vec J . \tag{2.12} $$
The important thing, which we shall make use of shortly, is that in each case we have on the left-hand side the d'Alembertian operator $\Box = \partial^\mu \partial_\mu$, which we discussed earlier.
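For completeness, the intermediate steps leading to (2.12) can be written out as follows. This is a sketch that uses only (2.3), (2.8), (2.11) and the standard identity $\vec\nabla \times (\vec\nabla \times \vec A) = \vec\nabla (\vec\nabla \cdot \vec A) - \nabla^2 \vec A$.

$$ \begin{aligned}
\vec\nabla \cdot \vec E = 4\pi\rho \quad &\Longrightarrow \quad -\nabla^2 \phi - \frac{\partial}{\partial t} (\vec\nabla \cdot \vec A) = 4\pi \rho \\
&\Longrightarrow \quad -\nabla^2 \phi + \frac{\partial^2 \phi}{\partial t^2} = 4\pi \rho \qquad \text{using } \vec\nabla \cdot \vec A = -\frac{\partial \phi}{\partial t} , \\
\vec\nabla \times \vec B - \frac{\partial \vec E}{\partial t} = 4\pi \vec J \quad &\Longrightarrow \quad \vec\nabla (\vec\nabla \cdot \vec A) - \nabla^2 \vec A + \vec\nabla \frac{\partial \phi}{\partial t} + \frac{\partial^2 \vec A}{\partial t^2} = 4\pi \vec J \\
&\Longrightarrow \quad \vec\nabla \Big( \vec\nabla \cdot \vec A + \frac{\partial \phi}{\partial t} \Big) - \nabla^2 \vec A + \frac{\partial^2 \vec A}{\partial t^2} = 4\pi \vec J .
\end{aligned} $$

The bracket in the last line vanishes by (2.11), and multiplying both results by $-1$ gives the two equations in (2.12).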
2.3 Maxwell's equations in 4-tensor notation
The next step is to write the Maxwell equations in terms of four-dimensional quantities. Since the 3-vectors describing the electric and magnetic fields have three components each, there is clearly no way in which they can be “assembled” into 4-vectors. However, we may note that in four dimensions a two-index antisymmetric tensor has $(4 \times 3)/2 = 6$ independent components. Since this is equal to $3 + 3$, it suggests that perhaps we should be grouping the electric and magnetic fields together into a single 2-index antisymmetric tensor. This is in fact exactly what is needed. Thus we introduce a tensor $F_{\mu\nu}$, satisfying
$$ F_{\mu\nu} = -F_{\nu\mu} . \tag{2.13} $$
It turns out that we should define its components in terms of $\vec E$ and $\vec B$ as follows:
$$ F_{0i} = -E_i , \qquad F_{i0} = E_i , \qquad F_{ij} = \epsilon_{ijk}\, B_k . \tag{2.14} $$
Here $\epsilon_{ijk}$ is the usual totally-antisymmetric tensor of 3-dimensional vector calculus. It is equal to $+1$ if $(ijk)$ is an even permutation of $(123)$, to $-1$ if it is an odd permutation, and to zero if it is no permutation (i.e. if two or more of the indices $(ijk)$ are equal). In other words, we have
$$ F_{23} = B_1 , \qquad F_{31} = B_2 , \qquad F_{12} = B_3 , $$
$$ F_{32} = -B_1 , \qquad F_{13} = -B_2 , \qquad F_{21} = -B_3 . \tag{2.15} $$
Viewing $F_{\mu\nu}$ as a matrix with rows labelled by $\mu$ and columns labelled by $\nu$, we shall have
$$ F_{\mu\nu} = \begin{pmatrix} 0 & -E_1 & -E_2 & -E_3 \\ E_1 & 0 & B_3 & -B_2 \\ E_2 & -B_3 & 0 & B_1 \\ E_3 & B_2 & -B_1 & 0 \end{pmatrix} . \tag{2.16} $$
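The component definitions (2.14) translate directly into a short routine that assembles the matrix (2.16). This is a minimal sketch; the function names `levi_civita` and `field_tensor` are illustrative, and Python indices 0, 1, 2 stand for the spatial labels 1, 2, 3.

```python
def levi_civita(i, j, k):
    """epsilon_{ijk} for indices in {0, 1, 2}: +1, -1 or 0."""
    return (i - j) * (j - k) * (k - i) // 2

def field_tensor(E, B):
    """Return F_{mu nu} (both indices down) as a 4x4 nested list, following (2.14)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = -E[i]   # F_{0i} = -E_i
        F[i + 1][0] = E[i]    # F_{i0} = +E_i
        for j in range(3):
            # F_{ij} = epsilon_{ijk} B_k
            F[i + 1][j + 1] = sum(levi_civita(i, j, k) * B[k] for k in range(3))
    return F

if __name__ == "__main__":
    for row in field_tensor((1, 2, 3), (4, 5, 6)):
        print(row)
```

With $\vec E = (1, 2, 3)$ and $\vec B = (4, 5, 6)$ the rows reproduce the pattern of (2.16), and the result is antisymmetric as required by (2.13).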
We also need to combine the charge density $\rho$ and the 3-vector current density $\vec J$ into a four-dimensional quantity. This is easy: we just define a 4-vector $J^\mu$, whose spatial components $J^i$ are just the usual 3-vector current components, and whose time component $J^0$ is equal to the charge density $\rho$:
$$ J^0 = \rho , \qquad J^i = J_i . \tag{2.17} $$
A word of caution is in order here. Although we have defined objects $F_{\mu\nu}$ and $J^\mu$ that have the appearance of a 4-tensor and a 4-vector, we are only entitled to call them such if we have verified that they transform in the proper way under Lorentz transformations. In fact they do, and we shall justify this a little later.
For now, we shall proceed to see how the Maxwell equations look when expressed in terms of $F_{\mu\nu}$ and $J^\mu$. The answer is that they become
$$ \partial_\mu F^{\mu\nu} = -4\pi J^\nu , \tag{2.18} $$
$$ \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} + \partial_\rho F_{\mu\nu} = 0 . \tag{2.19} $$
Two very nice things have happened. First of all, the original four Maxwell equations (2.3) and (2.4) have become just two four-dimensional equations; (2.18) is the field equation, and (2.19) is the Bianchi identity. Secondly, the equations are manifestly Lorentz covariant; i.e. they transform tensorially under Lorentz transformations. This means that they keep exactly the same form in all Lorentz frames. If we start with (2.18) and (2.19) in the unprimed frame $S$, then we know that in the frame $S'$, related to $S$ by the Lorentz transformation (1.34), the equations will look identical, except that they will now have primes on all the quantities.
We should first verify that indeed (2.18) and (2.19) are equivalent to the Maxwell equations (2.3) and (2.4). Consider first (2.18). This equation is vector-valued, since it has the free index $\nu$. Therefore, to reduce it down to three-dimensional equations, we have two cases to consider, namely $\nu = 0$ or $\nu = j$. For $\nu = 0$ we have
$$ \partial_i F^{i0} = -4\pi J^0 , \tag{2.20} $$
which therefore corresponds (see (2.14) and (2.17)) to
$$ -\partial_i E_i = -4\pi \rho , \qquad \text{i.e.} \qquad \vec\nabla \cdot \vec E = 4\pi \rho . \tag{2.21} $$
For $\nu = j$, we shall have
$$ \partial_0 F^{0j} + \partial_i F^{ij} = -4\pi J^j , \tag{2.22} $$
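which gives
$$ \partial_0 E_j + \epsilon_{ijk}\, \partial_i B_k = -4\pi J^j . \tag{2.23} $$
This is just (see footnote 4)
$$ -\frac{\partial \vec E}{\partial t} + \vec\nabla \times \vec B = 4\pi\, \vec J . \tag{2.24} $$
Thus (2.18) is equivalent to the two Maxwell field equations in (2.3).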
Turning now to (2.19), it follows from the antisymmetry (2.13) of $F_{\mu\nu}$ that the left-hand side is totally antisymmetric in $(\mu\nu\rho)$ (i.e. it changes sign under any exchange of a pair of indices). Therefore there are two distinct assignments of indices, after we make the $1 + 3$ decomposition $\mu = (0, i)$ etc. Either one of the indices is a 0 with the other two Latin, or else all three are Latin. Consider first $(\mu, \nu, \rho) = (0, i, j)$:
$$ \partial_0 F_{ij} + \partial_i F_{j0} + \partial_j F_{0i} = 0 , \tag{2.25} $$
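which, from (2.14), means
$$ \epsilon_{ijk}\, \frac{\partial B_k}{\partial t} + \partial_i E_j - \partial_j E_i = 0 . \tag{2.26} $$
Since this is antisymmetric in $ij$ there is no loss of generality involved in contracting with $\epsilon_{ij\ell}$, which gives (see footnote 5)
$$ 2\, \frac{\partial B_\ell}{\partial t} + 2\, \epsilon_{ij\ell}\, \partial_i E_j = 0 . \tag{2.27} $$
This is just the statement that
$$ \vec\nabla \times \vec E + \frac{\partial \vec B}{\partial t} = 0 , \tag{2.28} $$
which is the second of the Maxwell equations in (2.4).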
The other distinct possibility for assigning decomposed indices in (2.19) is to take $(\mu, \nu, \rho) = (i, j, k)$, giving
$$ \partial_i F_{jk} + \partial_j F_{ki} + \partial_k F_{ij} = 0 . \tag{2.29} $$
Since this is totally antisymmetric in $(i, j, k)$, no generality is lost by contracting it with $\epsilon_{ijk}$, giving
$$ 3\, \epsilon_{ijk}\, \partial_i F_{jk} = 0 . \tag{2.30} $$
From (2.14), this implies
$$ 3\, \epsilon_{ijk}\, \epsilon_{jk\ell}\, \partial_i B_\ell = 0 , \qquad \text{and hence} \qquad 6\, \partial_i B_i = 0 . \tag{2.31} $$
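Footnote 4: Recall that the $i$'th component of $\vec\nabla \times \vec V$ is given by $(\vec\nabla \times \vec V)_i = \epsilon_{ijk}\, \partial_j V_k$ for any 3-vector $\vec V$.
Footnote 5: Recall that $\epsilon_{ijm}\, \epsilon_{k\ell m} = \delta_{ik}\delta_{j\ell} - \delta_{i\ell}\delta_{jk}$, and hence $\epsilon_{ijm}\, \epsilon_{kjm} = 2\delta_{ik}$. These identities are easily proven by considering the possible assignments of indices and explicitly verifying that the two sides of the identities agree.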
This has just reproduced the first Maxwell equation in (2.4), i.e. $\vec\nabla \cdot \vec B = 0$.
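The two $\epsilon$ identities used in this reduction can be verified by brute force, in the spirit of the remark that they are proven by considering all possible assignments of indices. The sketch below is self-contained; `eps`, `delta` and `check_identities` are illustrative names.

```python
from itertools import product

def eps(i, j, k):
    """epsilon_{ijk} for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def check_identities():
    """Check eps_{ijm} eps_{klm} = d_ik d_jl - d_il d_jk and eps_{ijm} eps_{kjm} = 2 d_ik."""
    idx = range(3)
    first = all(
        sum(eps(i, j, m) * eps(k, l, m) for m in idx)
        == delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k)
        for i, j, k, l in product(idx, repeat=4)
    )
    second = all(
        sum(eps(i, j, m) * eps(k, j, m) for j in idx for m in idx) == 2 * delta(i, k)
        for i, k in product(idx, repeat=2)
    )
    return first, second

if __name__ == "__main__":
    print(check_identities())
```

Both entries of the returned pair are `True` when the identities hold for every index assignment.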
We have now demonstrated that the equations (2.18) and (2.19) are equivalent to the four Maxwell equations (2.3) and (2.4). Since (2.18) and (2.19) are written in a four-dimensional notation, it is highly suggestive that they are indeed Lorentz covariant. However, we should be a little more careful, in order to be sure about this point. Not every set of objects $V^\mu$ can be viewed as a Lorentz 4-vector, after all. The test is whether they transform properly, as in (1.45), under Lorentz transformations.
We may begin by considering the quantities $J^\mu = (\rho, J^i)$. Note first that by applying $\partial_\nu$ to the Maxwell field equation (2.18), we get identically zero on the left-hand side, since partial derivatives commute and $F^{\mu\nu}$ is antisymmetric. Thus from the right-hand side we get
$$ \partial_\mu J^\mu = 0 . \tag{2.32} $$
This is the equation of charge conservation. Decomposed into the $3 + 1$ language, it takes the familiar form
$$ \frac{\partial \rho}{\partial t} + \vec\nabla \cdot \vec J = 0 . \tag{2.33} $$
By integrating over a closed 3-volume $V$ and using the divergence theorem on the second term, we learn that the rate of change of charge inside $V$ is balanced by the flow of charge through its boundary $S$:
$$ \frac{\partial}{\partial t} \int_V \rho\, dV = -\int_S \vec J \cdot d\vec S . \tag{2.34} $$
Now we are in a position to show that $J^\mu = (\rho, \vec J)$ is indeed a 4-vector. Considering $J^0 = \rho$ first, we may note that
$$ dQ \equiv \rho\, dx\, dy\, dz \tag{2.35} $$
is clearly Lorentz invariant, since it is an electric charge. Clearly, for example, all Lorentz observers will agree on the number of electrons in a given closed spatial region, and so they will agree on the amount of charge. Another quantity that is Lorentz invariant is
$$ dv = dt\, dx\, dy\, dz , \tag{2.36} $$
the volume of an infinitesimal region in spacetime. This can be seen from the fact that the Jacobian $\mathcal{J}$ of the transformation from $dv$ to $dv' = dt'\, dx'\, dy'\, dz'$ is given by
$$ \mathcal{J} = \det\left( \frac{\partial x'^\mu}{\partial x^\nu} \right) = \det(\Lambda^{\mu}{}_{\nu}) . \tag{2.37} $$
Now the defining property (1.40) of the Lorentz transformation can be written in a matrix notation as
$$ \Lambda^T\, \eta\, \Lambda = \eta , \tag{2.38} $$
and hence taking the determinant, we get $(\det \Lambda)^2 = 1$ and hence
$$ \det \Lambda = \pm 1 . \tag{2.39} $$
Assuming that we restrict attention to Lorentz transformations without reflections, then they will be connected to the identity (we can take the boost velocity $\vec v$ to zero and/or the rotation angle to zero and continuously approach the identity transformation), and so $\det \Lambda = 1$. Thus it follows from (2.37) that for Lorentz transformations without reflections, the 4-volume element $dt\, dx\, dy\, dz$ is Lorentz invariant.
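The statements (2.38) and (2.39) are easy to test on explicit matrices. The following sketch assumes a boost along $x$ composed with a rotation about $z$, and contrasts it with a spatial reflection; all function names are illustrative, and the determinant is computed by a simple cofactor expansion.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def boost_x(v):
    """Boost with speed v along x (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0], [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    """Spatial rotation by angle theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def preserves_metric(L, tol=1e-12):
    """Check the defining property Lambda^T eta Lambda = eta of (2.38)."""
    M = matmul(matmul(transpose(L), ETA), L)
    return all(abs(M[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

if __name__ == "__main__":
    L = matmul(boost_x(0.8), rotation_z(0.3))
    P = [[1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    print(preserves_metric(L), det(L))
    print(preserves_metric(P), det(P))
```

Both matrices satisfy (2.38), but only the boost-plus-rotation has determinant $+1$ (up to rounding); the reflection has determinant $-1$ and is not continuously connected to the identity.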
Comparing $dQ = \rho\, dx\, dy\, dz$ and $dv = dt\, dx\, dy\, dz$, both of which we have argued are Lorentz invariant, we can conclude that $\rho$ must transform in the same way as $dt$ under Lorentz transformations. In other words, $\rho$ must transform like the 0 component of a 4-vector. Thus writing, as we did, that $J^0 = \rho$, is justified.
In the same way, we may consider the spatial components $J^i$ of the putative 4-vector $J^\mu$. Considering $J^1$, for example, we know that $J^1\, dy\, dz$ is the current flowing through the area element $dy\, dz$. Therefore in time $dt$, there will have been a flow of charge $J^1\, dt\, dy\, dz$. Being a charge, this must be Lorentz invariant, and so it follows from the known Lorentz invariance of $dv = dt\, dx\, dy\, dz$ that $J^1$ must transform the same way as $dx$ under Lorentz transformations. Thus $J^1$ does indeed transform like the 1 component of a 4-vector. Similar arguments apply to $J^2$ and $J^3$. (It is important in this argument that, because of the charge-conservation equation (2.32) or (2.34), the flow of charges we are discussing when considering the $J^i$ components are the same charges we discussed when considering the $J^0$ component.)
We have now established that $J^\mu = (\rho, J^i)$ is indeed a Lorentz 4-vector, where $\rho$ is the charge density and $J^i$ the 3-vector current density.
At this point, we recall that by choosing the Lorentz gauge (2.11), we were able to reduce the Maxwell field equations (2.3) to (2.12). Furthermore, we can write these equations together as
$$ \Box A^\mu = -4\pi J^\mu , \tag{2.40} $$
where
$$ A^\mu = (\phi, \vec A) , \tag{2.41} $$
and where the d'Alembertian, or wave operator, $\Box = \partial^\mu \partial_\mu = \partial_i \partial_i - \partial_0^2$ was introduced in (1.65). We saw that it is manifestly a Lorentz scalar, since it is built from the contraction of indices on the two Lorentz-vector gradient operators. Since we have already established that $J^\mu$ is a 4-vector, it therefore follows that $A^\mu$ is a 4-vector. Note, en passant, that the Lorentz gauge condition (2.11) that we imposed earlier translates, in the four-dimensional language, into
$$ \partial_\mu A^\mu = 0 , \tag{2.42} $$
which is nicely Lorentz invariant (hence the name “Lorentz gauge condition”).
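The final step is to note that our definition (2.14) is precisely consistent with (2.41) and (2.8), if we write
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{2.43} $$
First, we note from (2.41) that because of the $\eta_{00} = -1$ needed when lowering the 0 index, we shall have
$$ A_\mu = (-\phi, \vec A) . \tag{2.44} $$
Therefore we find
$$ F_{0i} = \partial_0 A_i - \partial_i A_0 = \frac{\partial A_i}{\partial t} + \partial_i \phi = -E_i , $$
$$ F_{ij} = \partial_i A_j - \partial_j A_i = \epsilon_{ijk}\, (\vec\nabla \times \vec A)_k = \epsilon_{ijk}\, B_k . \tag{2.45} $$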
In summary, we have shown that $J^\mu$ is a 4-vector, and hence, using (2.40), that $A^\mu$ is a 4-vector. Then, it is manifest from (2.43) that $F_{\mu\nu}$ is a 4-tensor. Hence, we have established that the Maxwell equations, written in the form (2.18) and (2.19), are indeed expressed in terms of 4-tensors and 4-vectors, and so the manifest Lorentz covariance of the Maxwell equations is established.
Finally, it is worth remarking that in the 4-tensor description, the way in which the gauge invariance arises is very straightforward. First, it is manifest that the Bianchi identity (2.19) is solved identically by writing
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu , \tag{2.46} $$
for some 4-vector $A_\mu$. This is because (2.19) is totally antisymmetric in $\mu\nu\rho$, and so, when (2.46) is substituted into it, one gets identically zero since partial derivatives commute. (Try making the substitution and verify this explicitly. The vanishing because of the commutativity of partial derivatives is essentially the same as the reason why curl grad $\equiv 0$ and div curl $\equiv 0$.) It is also clear from (2.46) that $F_{\mu\nu}$ will be unchanged if we make the replacement
$$ A_\mu \longrightarrow A_\mu + \partial_\mu \lambda , \tag{2.47} $$
where $\lambda$ is an arbitrary function of position and time. Again, the reason is that partial derivatives commute. Comparing (2.47) with (2.44), we see that (2.47) implies
$$ \phi \longrightarrow \phi - \frac{\partial \lambda}{\partial t} , \qquad A_i \longrightarrow A_i + \partial_i \lambda , \tag{2.48} $$
and so we have reproduced the gauge transformations (2.9) and (2.10).
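For readers who want to see the suggested substitution carried out, the following short calculation is a sketch of both statements, using only (2.46), (2.47) and the commutativity of partial derivatives.

$$ \begin{aligned}
\partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} + \partial_\rho F_{\mu\nu}
&= \partial_\mu \partial_\nu A_\rho - \partial_\mu \partial_\rho A_\nu
 + \partial_\nu \partial_\rho A_\mu - \partial_\nu \partial_\mu A_\rho
 + \partial_\rho \partial_\mu A_\nu - \partial_\rho \partial_\nu A_\mu \\
&= (\partial_\mu \partial_\nu - \partial_\nu \partial_\mu) A_\rho
 + (\partial_\nu \partial_\rho - \partial_\rho \partial_\nu) A_\mu
 + (\partial_\rho \partial_\mu - \partial_\mu \partial_\rho) A_\nu = 0 , \\
F_{\mu\nu} \;\longrightarrow\; \partial_\mu (A_\nu + \partial_\nu \lambda) - \partial_\nu (A_\mu + \partial_\mu \lambda)
&= F_{\mu\nu} + (\partial_\mu \partial_\nu - \partial_\nu \partial_\mu) \lambda = F_{\mu\nu} .
\end{aligned} $$

In each case the terms cancel in pairs, which is the four-dimensional counterpart of the identities curl grad $\equiv 0$ and div curl $\equiv 0$.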
It should have become clear by now that all the familiar features of the Maxwell equations are equivalently described in the spacetime formulation in terms of 4-vectors and 4-tensors. The only difference is that everything is described much more simply and elegantly in the four-dimensional language.
2.4 Lorentz transformation of $\vec E$ and $\vec B$
Although for many purposes the four-dimensional description of the Maxwell equations is the most convenient, it is sometimes useful to revert to the original description in terms of $\vec E$ and $\vec B$. For example, we may easily derive the Lorentz transformation properties of $\vec E$ and $\vec B$, making use of the four-dimensional formulation. In terms of $F_{\mu\nu}$, there is no work needed to write down its behaviour under Lorentz transformations. Raising the indices for convenience, we shall have
$$ F'^{\mu\nu} = \Lambda^{\mu}{}_{\rho}\, \Lambda^{\nu}{}_{\sigma}\, F^{\rho\sigma} . \tag{2.49} $$
From this, and the fact (see (2.14)) that $F^{0i} = E_i$, $F^{ij} = \epsilon_{ijk}\, B_k$, we can then immediately read off the Lorentz transformations for $\vec E$ and $\vec B$.
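From the expressions (1.35) for the most general Lorentz boost transformation, we may first calculate $\vec E'$ from
$$ \begin{aligned}
E'_i = F'^{0i} &= \Lambda^{0}{}_{\rho}\, \Lambda^{i}{}_{\sigma}\, F^{\rho\sigma} \\
&= \Lambda^{0}{}_{0}\, \Lambda^{i}{}_{k}\, F^{0k} + \Lambda^{0}{}_{k}\, \Lambda^{i}{}_{0}\, F^{k0} + \Lambda^{0}{}_{k}\, \Lambda^{i}{}_{\ell}\, F^{k\ell} \\
&= \gamma \Big( \delta_{ik} + \frac{\gamma - 1}{v^2}\, v_i v_k \Big) E_k - \gamma^2 v_i v_k E_k - \gamma\, v_k \Big( \delta_{i\ell} + \frac{\gamma - 1}{v^2}\, v_i v_\ell \Big) \epsilon_{k\ell m}\, B_m \\
&= \gamma E_i + \gamma\, \epsilon_{ijk}\, v_j B_k - \frac{\gamma - 1}{v^2}\, v_i v_k E_k .
\end{aligned} \tag{2.50} $$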
(Note that because $F^{\mu\nu}$ is antisymmetric, there is no $F^{00}$ term on the right-hand side.) Thus, in terms of 3-vector notation, the Lorentz boost transformation of the electric field is given by
$$ \vec E' = \gamma (\vec E + \vec v \times \vec B) - \frac{\gamma - 1}{v^2}\, (\vec v \cdot \vec E)\, \vec v . \tag{2.51} $$
An analogous calculation shows that the Lorentz boost transformation of the magnetic field is given by
$$ \vec B' = \gamma (\vec B - \vec v \times \vec E) - \frac{\gamma - 1}{v^2}\, (\vec v \cdot \vec B)\, \vec v . \tag{2.52} $$
Suppose, for example, that in the frame $S$ there is just a magnetic field $\vec B$, while $\vec E = 0$. An observer in a frame $S'$ moving with uniform velocity $\vec v$ relative to $S$ will therefore observe not only a magnetic field, given by
$$ \vec B' = \gamma \vec B - \frac{\gamma - 1}{v^2}\, (\vec v \cdot \vec B)\, \vec v , \tag{2.53} $$
but also an electric field, given by
$$ \vec E' = \gamma\, \vec v \times \vec B . \tag{2.54} $$
This, of course, is the principle of the dynamo (see footnote 6).
It is instructive to write out the Lorentz transformations explicitly in the case when the boost is along the $x$ direction, $\vec v = (v, 0, 0)$. Equations (2.51) and (2.52) become
$$ E'_x = E_x , \qquad E'_y = \gamma (E_y - v B_z) , \qquad E'_z = \gamma (E_z + v B_y) , $$
$$ B'_x = B_x , \qquad B'_y = \gamma (B_y + v E_z) , \qquad B'_z = \gamma (B_z - v E_y) . \tag{2.55} $$
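The transformation rules (2.51) and (2.52) can be sketched in a few lines of code and compared against the special case (2.55). The function `boost_fields` and its helpers are illustrative names; the sketch assumes natural units and a non-zero boost velocity with $|\vec v| < 1$. As an additional check it uses the standard fact, not derived in this section, that $\vec E \cdot \vec B$ and $\vec E^2 - \vec B^2$ are unchanged by a boost.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def boost_fields(E, B, v):
    """Apply (2.51) and (2.52): fields seen in a frame moving with velocity v (0 < |v| < 1)."""
    v2 = dot(v, v)
    g = 1.0 / math.sqrt(1.0 - v2)
    k = (g - 1.0) / v2
    vxB, vxE = cross(v, B), cross(v, E)
    vE, vB = dot(v, E), dot(v, B)
    E_new = tuple(g * (E[i] + vxB[i]) - k * vE * v[i] for i in range(3))
    B_new = tuple(g * (B[i] - vxE[i]) - k * vB * v[i] for i in range(3))
    return E_new, B_new

if __name__ == "__main__":
    E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
    # Boost along x with v = 0.6, so gamma = 1.25; compare with (2.55).
    print(boost_fields(E, B, (0.6, 0.0, 0.0)))
    # A boost in a general direction leaves E.B and E^2 - B^2 unchanged.
    E2, B2 = boost_fields(E, B, (0.3, -0.2, 0.5))
    print(dot(E, B), dot(E2, B2))
    print(dot(E, E) - dot(B, B), dot(E2, E2) - dot(B2, B2))
```

For the $x$ boost, the components parallel to $\vec v$ are unchanged while the transverse components mix $\vec E$ and $\vec B$, exactly as in (2.55).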
2.5 The Lorentz force
Consider a point particle following the path, or worldline, $x^i = x^i(t)$. It has 3-velocity $u^i = dx^i/dt$, and, as we saw earlier, 4-velocity
$$ U^\mu = (\gamma, \gamma\, \vec u) , \qquad \text{where} \qquad \gamma = \frac{1}{\sqrt{1 - u^2}} . \tag{2.56} $$
Multiplying by the rest mass $m$ of the particle gives another 4-vector, namely the 4-momentum
$$ p^\mu = m U^\mu = (m\gamma, m\gamma\, \vec u) . \tag{2.57} $$
The quantity $p^0 = m\gamma$ is called the relativistic energy $E$, and $p^i = m\gamma\, u^i$ is called the relativistic 3-momentum. Note that since $U^\mu U_\mu = -1$, we shall have
$$ p^\mu p_\mu = -m^2 . \tag{2.58} $$
We now define the relativistic 4-force $f^\mu$ acting on the particle to be
$$ f^\mu = \frac{dp^\mu}{d\tau} , \tag{2.59} $$
where $\tau$ is the proper time. Clearly $f^\mu$ is indeed a 4-vector, since it is the 4-vector $dp^\mu$ divided by the scalar $d\tau$.
Using (2.57), we can write the 4-force as
$$ f^\mu = \Big( m\gamma^3\, \vec u \cdot \frac{d\vec u}{d\tau} ,\; m\gamma^3 \Big( \vec u \cdot \frac{d\vec u}{d\tau} \Big) \vec u + m\gamma\, \frac{d\vec u}{d\tau} \Big) . \tag{2.60} $$
Footnote 6: In a practical dynamo the rotor is moving with a velocity $v$ which is much less than the speed of light, i.e. $v \ll 1$ in natural units. This means that the gamma factor $\gamma = (1 - v^2)^{-1/2}$ is approximately equal to unity in such cases.
It follows that if we move to the instantaneous rest frame of the particle, i.e. the frame in which $\vec u = 0$ at the particular moment we are considering, then $f^\mu$ reduces to
$$ f^\mu \Big|_{\text{rest frame}} = (0, \vec F) , \tag{2.61} $$
where
$$ \vec F = m\, \frac{d\vec u}{dt} \tag{2.62} $$
is the Newtonian force measured in the rest frame of the particle (see footnote 7). Thus, we should interpret the 4-force physically as describing the Newtonian 3-force when measured in the instantaneous rest frame of the accelerating particle.
If we now suppose that the particle has electric charge $e$, and that it is moving under the influence of an electromagnetic field $F_{\mu\nu}$, then its motion is given by the Lorentz force equation
$$ f^\mu = e F^{\mu\nu} U_\nu . \tag{2.63} $$
One can more or less justify this equation on the grounds of “what else could it be?”, since we know that there must exist a relativistic equation (i.e. a Lorentz-covariant equation) that describes the motion. In fact it is easy to see that (2.63) is correct. We calculate the spatial components:
$$ f^i = e F^{i\nu} U_\nu = e F^{i0} U_0 + e F^{ij} U_j = e (-E_i)(-\gamma) + e\, \epsilon_{ijk}\, B_k\, \gamma\, u_j , \tag{2.64} $$
and thus
$$ \vec f = e \gamma\, (\vec E + \vec u \times \vec B) . \tag{2.65} $$
But $f^\mu = dp^\mu/d\tau$, and so $\vec f = d\vec p/d\tau = \gamma\, d\vec p/dt$ (recall from section 1.5 that $d\tau = dt/\gamma$), and so we have
$$ \frac{d\vec p}{dt} = e\, (\vec E + \vec u \times \vec B) , \tag{2.66} $$
where $d\vec p/dt$ is the rate of change of relativistic 3-momentum. This is indeed the standard expression for the motion of a charged particle under the Lorentz force.
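The component calculation (2.64) can also be checked numerically. The sketch below builds $F^{\mu\nu}$ with raised indices ($F^{0i} = E_i$, $F^{i0} = -E_i$, $F^{ij} = \epsilon_{ijk} B_k$), contracts it with $U_\nu = (-\gamma, \gamma \vec u)$, and compares the spatial part with $e\gamma(\vec E + \vec u \times \vec B)$. The function names are illustrative, and the field and velocity values in the usage example are arbitrary test inputs.

```python
import math

def levi_civita(i, j, k):
    """epsilon_{ijk} for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    """(a x b)_i = epsilon_{ijk} a_j b_k."""
    return tuple(sum(levi_civita(i, j, k) * a[j] * b[k]
                     for j in range(3) for k in range(3)) for i in range(3))

def upper_field_tensor(E, B):
    """F^{mu nu} with both indices raised: F^{0i} = E_i, F^{i0} = -E_i, F^{ij} = eps_{ijk} B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
        for j in range(3):
            F[i + 1][j + 1] = sum(levi_civita(i, j, k) * B[k] for k in range(3))
    return F

def four_force(e, E, B, u):
    """f^mu = e F^{mu nu} U_nu, with U_nu = (-gamma, gamma u) as in (2.63)."""
    g = 1.0 / math.sqrt(1.0 - sum(c * c for c in u))
    U_lower = [-g] + [g * c for c in u]
    F = upper_field_tensor(E, B)
    return [e * sum(F[mu][nu] * U_lower[nu] for nu in range(4)) for mu in range(4)]

if __name__ == "__main__":
    e, E, B, u = 2.0, (0.1, 0.2, 0.3), (0.4, -0.5, 0.6), (0.3, 0.1, -0.2)
    g = 1.0 / math.sqrt(1.0 - sum(c * c for c in u))
    uxB = cross(u, B)
    print(four_force(e, E, B, u)[1:])
    print([e * g * (E[i] + uxB[i]) for i in range(3)])
```

The two printed lists should agree, reproducing (2.65). The time component of the same contraction works out to $e\gamma\, \vec E \cdot \vec u$, the rate at which the field does work on the particle, rescaled by $\gamma$.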
Footnote 7: Note that we can replace the proper time $\tau$ by the coordinate time $t$ in the instantaneous rest frame, since the two are the same.
2.6 Action principle for charged particles
In this section, we shall show how the equations of motion for a charged particle moving in an electromagnetic field can be derived from an action principle. To begin, we shall consider an uncharged particle of mass $m$, with no forces acting on it. It will, of course, move in a straight line. It turns out that its equation of motion can be derived from the Lorentz-invariant action
$$ S = -m \int_{\tau_1}^{\tau_2} d\tau , \tag{2.67} $$
where $\tau$ is the proper time along the trajectory $x^\mu(\tau)$ of the particle, starting at proper time $\tau = \tau_1$ and ending at $\tau = \tau_2$. The action principle then states that if we consider all possible paths between the initial and final spacetime points on the path, then the actual path followed by the particle will be such that the action $S$ is stationary. In other words, if we consider small variations of the path around the actual path, then to first order in the variations we shall have $\delta S = 0$.
To see how this works, we note that $d\tau^2 = dt^2 - dx^i dx^i = dt^2 (1 - v^i v^i) = dt^2 (1 - v^2)$, where $v^i = dx^i/dt$ is the 3-velocity of the particle. Thus
$$ S = -m \int_{t_1}^{t_2} (1 - v^2)^{1/2}\, dt = -m \int_{t_1}^{t_2} (1 - \dot x^i \dot x^i)^{1/2}\, dt . \tag{2.68} $$
In other words, the Lagrangian $L$, for which $S = \int_{t_1}^{t_2} L\, dt$, is given by
$$ L = -m (1 - \dot x^i \dot x^i)^{1/2} . \tag{2.69} $$
As a check, if we expand (2.69) for small velocities (i.e. small compared with the speed of light, so $|\dot x^i| \ll 1$), we shall have
$$ L = -m + \tfrac{1}{2} m v^2 + \cdots . \tag{2.70} $$
Since the Lagrangian is given by $L = T - V$ we see that $T$ is just the usual kinetic energy $\tfrac{1}{2} m v^2$ for a non-relativistic particle of mass $m$, while the potential energy is just $m$. Of course if we were not using units where the speed of light were unity, this energy would be $mc^2$. Since it is just a constant, it does not affect the equations of motion that will follow from the action principle.
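The quality of the approximation (2.70) is easy to probe numerically. This is a small sketch in natural units; the function names are illustrative, and the leading correction $m v^4 / 8$ quoted in the comment is the next term of the binomial expansion of $(1 - v^2)^{1/2}$.

```python
import math

def lagrangian_free(m, v):
    """Exact free-particle Lagrangian (2.69) for speed v (units with c = 1)."""
    return -m * math.sqrt(1.0 - v * v)

def lagrangian_nonrel(m, v):
    """The first two terms of the small-velocity expansion (2.70)."""
    return -m + 0.5 * m * v * v

if __name__ == "__main__":
    m = 1.0
    for v in (0.5, 0.1, 0.01):
        diff = lagrangian_free(m, v) - lagrangian_nonrel(m, v)
        # The difference should approach m v^4 / 8 as v becomes small.
        print(v, diff, m * v ** 4 / 8.0)
```

The printed differences shrink like $v^4$, so the rest-mass term plus the Newtonian kinetic energy capture the Lagrangian very accurately once $v \ll 1$.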
Now let us consider small variations $\delta x^i(t)$ around the path $x^i(t)$ followed by the particle. The action will vary according to
$$ \delta S = m \int_{t_1}^{t_2} (1 - \dot x^j \dot x^j)^{-1/2}\, \dot x^i\, \delta \dot x^i\, dt . \tag{2.71} $$
Integrating by parts then gives
$$ \delta S = -m \int_{t_1}^{t_2} \frac{d}{dt} \Big( (1 - \dot x^j \dot x^j)^{-1/2}\, \dot x^i \Big)\, \delta x^i\, dt + m \Big[ (1 - \dot x^j \dot x^j)^{-1/2}\, \dot x^i\, \delta x^i \Big]_{t_1}^{t_2} . \tag{2.72} $$
As usual in an action principle, we restrict to variations of the path that vanish at the endpoints, so $\delta x^i(t_1) = \delta x^i(t_2) = 0$ and the boundary term can be dropped. The variation $\delta x^i$ is allowed to be otherwise arbitrary in the time interval $t_1 < t < t_2$, and so we conclude from the requirement of stationary action $\delta S = 0$ that
$$ \frac{d}{dt} \Big( (1 - \dot x^j \dot x^j)^{-1/2}\, \dot x^i \Big) = 0 . \tag{2.73} $$
Now, recalling that we define $\gamma = (1 - v^2)^{-1/2}$, we see that
$$ \frac{d(m\gamma \vec v)}{dt} = 0 , \tag{2.74} $$
or, in other words,
$$ \frac{d\vec p}{dt} = 0 , \tag{2.75} $$
where $\vec p = m\gamma \vec v$ is the relativistic 3-momentum. We have, of course, derived the equation for straight-line motion in the absence of any forces acting.
Now we extend the discussion to the case of a particle of mass $m$ and charge $e$, moving under the influence of an electromagnetic field $F_{\mu\nu}$. This field will be written in terms of a 4-vector potential:
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{2.76} $$
The action will now be the sum of the free-particle action (2.68) above plus a term describing the interaction of the particle with the electromagnetic field. The total action turns out to be
$$ S = \int_{\tau_1}^{\tau_2} \big( -m\, d\tau + e A_\mu\, dx^\mu \big) . \tag{2.77} $$
Note that it is again Lorentz invariant.
From (2.44) we have $A_\mu = (-\phi, \vec A)$, and so
$$ A_\mu\, dx^\mu = A_\mu \frac{dx^\mu}{dt}\, dt = (A_0 + A_i \dot x^i)\, dt = (-\phi + A_i \dot x^i)\, dt . \tag{2.78} $$
Thus we have $S = \int_{t_1}^{t_2} L\, dt$ with the Lagrangian $L$ given by
$$ L = -m (1 - \dot x^j \dot x^j)^{1/2} - e\phi + e A_i \dot x^i , \tag{2.79} $$
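where the potentials $\phi$ and $A_i$ depend on $t$ and $\vec x$. The first-order variation of the action under a variation $\delta x^i$ in the path gives
$$ \begin{aligned}
\delta S &= \int_{t_1}^{t_2} \Big[ m (1 - \dot x^j \dot x^j)^{-1/2}\, \dot x^i\, \delta \dot x^i - e\, \partial_i \phi\, \delta x^i + e A_i\, \delta \dot x^i + e\, \partial_j A_i\, \dot x^i\, \delta x^j \Big]\, dt \\
&= \int_{t_1}^{t_2} \Big[ -\frac{d}{dt} (m\gamma \dot x^i) - e\, \partial_i \phi - e\, \frac{dA_i}{dt} + e\, \partial_i A_j\, \dot x^j \Big]\, \delta x^i\, dt .
\end{aligned} \tag{2.80} $$
(We have dropped the boundary terms immediately, since $\delta x^i$ is again assumed to vanish at the endpoints.) Thus the principle of stationary action $\delta S = 0$ implies
$$ \frac{d(m\gamma \dot x^i)}{dt} = -e\, \partial_i \phi - e\, \frac{dA_i}{dt} + e\, \partial_i A_j\, \dot x^j . \tag{2.81} $$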
Now, the total time derivative $dA_i/dt$ has two contributions, and we may write it as
$$ \frac{dA_i}{dt} = \frac{\partial A_i}{\partial t} + \partial_j A_i\, \frac{dx^j}{dt} = \frac{\partial A_i}{\partial t} + \partial_j A_i\, \dot x^j . \tag{2.82} $$
This arises because first of all, $A_i$ can depend explicitly on the time coordinate; this contribution is $\partial A_i/\partial t$. Additionally, $A_i$ depends on the spatial coordinates $x^i$, and along the path followed by the particle, $x^i$ depends on $t$ because the path is $x^i = x^i(t)$. This accounts for the second term.
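Putting all this together, we have
$$ \frac{d(m\gamma \dot x^i)}{dt} = e \Big( -\partial_i \phi - \frac{\partial A_i}{\partial t} \Big) + e\, (\partial_i A_j - \partial_j A_i)\, \dot x^j = e\, (E_i + \epsilon_{ijk}\, \dot x^j B_k) . \tag{2.83} $$
In other words, we have
$$ \frac{d\vec p}{dt} = e\, (\vec E + \vec v \times \vec B) , \tag{2.84} $$
which is the Lorentz force equation (2.66).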
It is worth noting that although we gave a “three-dimensional” derivation of the equations of motion following from the action (2.77), we can also instead directly derive the four-dimensional equation $dp^\mu/d\tau = e F^{\mu\nu} U_\nu$. To begin, we write the proper time interval as $d\tau = (-\eta_{\rho\sigma}\, dx^\rho dx^\sigma)^{1/2}$, and so its variation under a variation of the path $x^\mu(\tau)$ gives
$$ \delta(d\tau) = -(-\eta_{\rho\sigma}\, dx^\rho dx^\sigma)^{-1/2}\, \eta_{\mu\nu}\, dx^\mu\, d\delta x^\nu = -\eta_{\mu\nu}\, \frac{dx^\mu}{d\tau}\, d\delta x^\nu = -U_\mu\, d\delta x^\mu , \tag{2.85} $$
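where $U^\mu$ is the 4-velocity. Thus the variation of the action (2.77) gives
$$ \begin{aligned}
\delta S &= \int_{\tau_1}^{\tau_2} \big( m U_\mu\, d\delta x^\mu + e A_\mu\, d\delta x^\mu + e\, \partial_\nu A_\mu\, \delta x^\nu\, dx^\mu \big) \\
&= \int_{\tau_1}^{\tau_2} \big( -m\, dU_\mu\, \delta x^\mu - e\, dA_\mu\, \delta x^\mu + e\, \partial_\mu A_\nu\, \delta x^\mu\, dx^\nu \big) \\
&= \int_{\tau_1}^{\tau_2} \Big( -m \frac{dU_\mu}{d\tau} - e \frac{dA_\mu}{d\tau} + e\, \partial_\mu A_\nu \frac{dx^\nu}{d\tau} \Big)\, \delta x^\mu\, d\tau .
\end{aligned} \tag{2.86} $$
Now we have
$$ \frac{dA_\mu}{d\tau} = \partial_\nu A_\mu\, \frac{dx^\nu}{d\tau} = \partial_\nu A_\mu\, U^\nu , \tag{2.87} $$
and so
$$ \delta S = \int_{\tau_1}^{\tau_2} \Big( -m \frac{dU_\mu}{d\tau} - e\, \partial_\nu A_\mu\, U^\nu + e\, \partial_\mu A_\nu\, U^\nu \Big)\, \delta x^\mu\, d\tau . \tag{2.88} $$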
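Requiring $\delta S = 0$ for all variations (that vanish at the endpoints) we therefore obtain the equation of motion
$$ m \frac{dU_\mu}{d\tau} = e\, (\partial_\mu A_\nu - \partial_\nu A_\mu)\, U^\nu = e F_{\mu\nu}\, U^\nu . \tag{2.89} $$
Thus we have reproduced the Lorentz force equation in its four-dimensional form (2.63), whose spatial components give (2.66).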
2.7 Gauge invariance of the action
In writing down the relativistic action (2.77) for a charged particle we had to make use of the 4-vector potential $A_\mu$. This is itself not physically observable, since, as we noted earlier, $A_\mu$ and $A'_\mu = A_\mu + \partial_\mu \lambda$ describe the same physics, where $\lambda$ is any arbitrary function in spacetime, since $A_\mu$ and $A'_\mu$ give rise to the same electromagnetic field $F_{\mu\nu}$. One might worry, therefore, that the action itself would be gauge dependent, and therefore might not properly describe the required physical situation. However, all is in fact well. This already can be seen from the fact that, as we demonstrated, the variational principle for the action (2.77) does in fact produce the correct gauge-invariant Lorentz force equation (2.66).
It is instructive also to examine the effects of a gauge transformation directly at the level of the action. If we make the gauge transformation $A_\mu \to A'_\mu = A_\mu + \partial_\mu \lambda$, we see from (2.77) that the action $S$ transforms to $S'$ given by
$$ S' = \int_{\tau_1}^{\tau_2} \big( -m\, d\tau + e A_\mu\, dx^\mu + e\, \partial_\mu \lambda\, dx^\mu \big) = S + e \int_{\tau_1}^{\tau_2} \partial_\mu \lambda\, dx^\mu = S + e \int_{\tau_1}^{\tau_2} d\lambda , \tag{2.90} $$
and so
$$ S' = S + e\, [\lambda(\tau_2) - \lambda(\tau_1)] . \tag{2.91} $$
Thus provided we restrict ourselves to gauge transformations that vanish at the endpoints, the action will be gauge invariant, $S' = S$.
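The statement that the gauge term in (2.90) integrates to a pure endpoint contribution can be illustrated numerically. In the sketch below the gauge function `lam` and the worldline `path` are arbitrary choices made only for illustration, and the integral $\int \partial_\mu \lambda\, dx^\mu$ is approximated by a midpoint rule along the path.

```python
import math

def lam(t, x, y, z):
    """A hypothetical gauge function lambda(t, x, y, z), chosen only for illustration."""
    return math.sin(t) * x + y * y * z

def grad_lam(t, x, y, z):
    """Analytic components of partial_mu lambda, ordered (t, x, y, z)."""
    return (math.cos(t) * x, math.sin(t), 2.0 * y * z, y * y)

def path(s):
    """A hypothetical worldline x^mu(s) for 0 <= s <= 1, with speeds well below 1."""
    return (2.0 * s, 0.3 * math.sin(s), 0.2 * s * s, 0.1 * s)

def gauge_shift(e, n=20000):
    """Midpoint-rule estimate of e * integral of partial_mu lambda dx^mu along the path."""
    total = 0.0
    for k in range(n):
        a, b = path(k / n), path((k + 1) / n)
        mid = tuple(0.5 * (p + q) for p, q in zip(a, b))
        g = grad_lam(*mid)
        total += sum(gi * (q - p) for gi, p, q in zip(g, a, b))
    return e * total

if __name__ == "__main__":
    e = 1.5
    print(gauge_shift(e))
    print(e * (lam(*path(1.0)) - lam(*path(0.0))))
```

The two printed numbers should agree to within the discretisation error, which is the content of (2.91): the change in the action depends only on the values of $\lambda$ at the two ends of the worldline.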
2.8 Canonical momentum, and Hamiltonian
Given any Lagrangian $L(x^i, \dot x^i, t)$ one defines the canonical momentum $\pi_i$ as
$$ \pi_i = \frac{\partial L}{\partial \dot x^i} . \tag{2.92} $$
The relativistic Lagrangian for the charged particle is given by (2.79), and so we have
$$ \pi_i = m (1 - \dot x^j \dot x^j)^{-1/2}\, \dot x^i + e A_i , \tag{2.93} $$
or, in other words,
$$ \pi_i = m\gamma\, \dot x^i + e A_i \tag{2.94} $$
$$ \phantom{\pi_i} = p_i + e A_i , \tag{2.95} $$
where $p_i$ as usual is the standard mechanical relativistic 3-momentum of the particle.
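As usual, the Hamiltonian for the system is given by
$$ H = \pi_i\, \dot x^i - L , \tag{2.96} $$
and so we find
$$ H = m\gamma\, \dot x^i \dot x^i + \frac{m}{\gamma} + e\phi . \tag{2.97} $$
Now, $\dot x^i = v^i$ and $m\gamma v^2 + m/\gamma = m\gamma\, (v^2 + (1 - v^2)) = m\gamma$, so we have
$$ H = m\gamma + e\phi . \tag{2.98} $$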
The Hamiltonian is to be viewed as a function of the coordinates $x^i$ and the canonical
momenta $\pi_i$. To express $\gamma$ in terms of $\pi_i$, we note from (2.94) that $m\gamma\,\dot x^i = \pi_i - eA_i$, and so
squaring, we get $m^2\gamma^2 v^2 = m^2 v^2/(1 - v^2) = (\pi_i - eA_i)^2$. Solving for $v^2$, and hence for $\gamma$, we
find that $m^2\gamma^2 = (\pi_i - eA_i)^2 + m^2$, and so finally, from (2.98), we arrive at the Hamiltonian
$$ H = \sqrt{(\pi_i - eA_i)^2 + m^2} + e\phi . \tag{2.99} $$
Note that Hamilton's equations, which will necessarily give rise to the same Lorentz
force equations of motion we encountered previously, are given by
$$ \frac{\partial H}{\partial \pi_i} = \dot x^i , \qquad \frac{\partial H}{\partial x^i} = -\dot\pi_i . \tag{2.100} $$
As a check of the correctness of the Hamiltonian (2.99) we may examine it in the
non-relativistic limit when $(\pi_i - eA_i)^2$ is much less than $m^2$. We then extract an $m^2$ factor from
inside the square root in $\sqrt{(\pi_i - eA_i)^2 + m^2}$ and expand to get
$$ H = m\,\sqrt{1 + (\pi_i - eA_i)^2/m^2} + e\phi
 = m + \frac{1}{2m}\,(\pi_i - eA_i)^2 + e\phi + \cdots . \tag{2.101} $$
The first term is the rest-mass energy, which is just a constant, and the remaining terms
presented explicitly in (2.101) give the standard non-relativistic Hamiltonian for a charged
particle
$$ H_{\rm non\text{-}rel.} = \frac{1}{2m}\,(\pi_i - eA_i)^2 + e\phi . \tag{2.102} $$
This should be familiar from quantum mechanics, when one writes down the Schrödinger
equation for the wave function for a charged particle in an electromagnetic field.
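As a quick numerical illustration of (2.98), (2.99) and (2.101), the sketch below builds the canonical momentum from a chosen velocity via (2.94) and compares the two forms of the Hamiltonian. All numerical values (mass, charge, potentials, velocities) are made-up assumptions, in units with c = 1.

```python
import math

def hamiltonian(pi, A, phi, m, e):
    """Relativistic H = sqrt((pi - eA)^2 + m^2) + e*phi, as in (2.99)."""
    kinetic_sq = sum((p - e * a) ** 2 for p, a in zip(pi, A))
    return math.sqrt(kinetic_sq + m * m) + e * phi

def hamiltonian_nonrel(pi, A, phi, m, e):
    """Leading terms of the expansion (2.101), rest-mass energy included."""
    kinetic_sq = sum((p - e * a) ** 2 for p, a in zip(pi, A))
    return m + kinetic_sq / (2.0 * m) + e * phi

def canonical_momentum(v, A, m, e):
    """pi_i = m*gamma*v_i + e*A_i, as in (2.94); also returns gamma."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v))
    return tuple(m * gamma * x + e * a for x, a in zip(v, A)), gamma

# Illustrative (assumed) values.
m, e, phi = 1.0, 0.5, 0.2
A = (0.1, -0.3, 0.05)

# Relativistic velocity: (2.99) should reproduce m*gamma + e*phi.
pi_fast, gamma_fast = canonical_momentum((0.3, 0.1, -0.2), A, m, e)
H_fast = hamiltonian(pi_fast, A, phi, m, e)

# Slow velocity: (2.99) and the non-relativistic form should nearly agree.
pi_slow, _ = canonical_momentum((1e-3, 0.0, 0.0), A, m, e)
H_slow = hamiltonian(pi_slow, A, phi, m, e)
H_slow_nr = hamiltonian_nonrel(pi_slow, A, phi, m, e)
```

The difference between `H_slow` and `H_slow_nr` is of fourth order in the velocity, which is the size of the terms omitted in (2.101).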
3 Particle Motion in Static Electromagnetic Fields
In this chapter, we discuss the motion of a charged particle in static (i.e. time-independent)
electromagnetic fields.
3.1 Description in terms of potentials
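If we are describing static electric and magnetic fields, $\vec E = \vec E(\vec r)$ and $\vec B = \vec B(\vec r)$, it is natural
(and always possible) to describe them in terms of scalar and 3-vector potentials that are
also static, $\phi = \phi(\vec r)$, $\vec A = \vec A(\vec r)$. Thus we write
$$ \vec E = -\vec\nabla\phi - \frac{\partial\vec A}{\partial t} = -\vec\nabla\phi(\vec r) , \qquad
 \vec B = \vec\nabla\times\vec A(\vec r) . \tag{3.1} $$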
We can still perform gauge transformations, as given in (2.9) and (2.10). The most general
gauge transformation that preserves the time-independence of the potentials is therefore
given by taking the parameter $\lambda$ to be of the form
$$ \lambda(\vec r, t) = \lambda(\vec r) + k\,t , \tag{3.2} $$
where $k$ is an arbitrary constant. This implies that $\phi$ and $\vec A$ will transform according to
$$ \phi \longrightarrow \phi - k , \qquad \vec A \longrightarrow \vec A + \vec\nabla\lambda(\vec r) . \tag{3.3} $$
Note, in particular, that the electrostatic potential $\phi$ can just be shifted by an arbitrary
constant. This is the familiar freedom that one typically uses to set $\phi = 0$ at infinity.
Recall that the Hamiltonian for a particle of mass $m$ and charge $e$ in an electromagnetic
field is given by (2.98)
$$ H = m\gamma + e\phi , \tag{3.4} $$
where $\gamma = (1 - v^2)^{-1/2}$. In the present situation with static fields, the Hamiltonian does not
depend explicitly on time, i.e. $\partial H/\partial t = 0$. In this circumstance, it follows that the energy
$\mathcal{E}$ is conserved, and is given simply by $H$:
$$ \mathcal{E} = H = m\gamma + e\phi . \tag{3.5} $$
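The time-independence of $\mathcal{E}$ can be seen from Hamilton's equations (2.100):
$$ \frac{d\mathcal{E}}{dt} = \frac{dH}{dt}
 = \frac{\partial H}{\partial t} + \frac{\partial H}{\partial x^i}\,\dot x^i + \frac{\partial H}{\partial \pi_i}\,\dot\pi_i
 = 0 - \dot\pi_i\,\dot x^i + \dot x^i\,\dot\pi_i = 0 . \tag{3.6} $$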
We may think of the first term in $\mathcal{E}$ as being the mechanical term,
$$ \mathcal{E}_{\rm mech} = m\gamma , \tag{3.7} $$
since this is just the total energy of a particle of rest mass $m$ moving with velocity $\vec v$. The
second term, $e\phi$, is the contribution to the total energy from the electric field. Note that the
magnetic field, described by the 3-vector potential $\vec A$, does not contribute to the conserved
energy. This is because the magnetic field $\vec B$ does no work on the charge.
Recall that the Lorentz force equation can be written as
$$ \frac{d(m\gamma v_i)}{dt} = e\,(E_i + \epsilon_{ijk}\,v_j B_k) . \tag{3.8} $$
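Multiplying by $v_i$ we therefore have
$$ m\gamma\,v_i\,\frac{dv_i}{dt} + m\,v_i v_i\,\frac{d\gamma}{dt} = e\,v_i E_i . \tag{3.9} $$
Now $\gamma = (1 - v^2)^{-1/2}$, so
$$ \frac{d\gamma}{dt} = (1 - v^2)^{-3/2}\,v_i\,\frac{dv_i}{dt} = \gamma^3\,v_i\,\frac{dv_i}{dt} , \tag{3.10} $$
and so (3.9) gives
$$ m\,\frac{d\gamma}{dt} = e\,v_i E_i . \tag{3.11} $$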
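Since $\mathcal{E}_{\rm mech} = m\gamma$, and $m$ is a constant, we therefore have
$$ \frac{d\mathcal{E}_{\rm mech}}{dt} = e\,\vec v\cdot\vec E . \tag{3.12} $$
Thus, the mechanical energy of the particle is changed only by the electric field, and not
by the magnetic field.
Note that another derivation of the constancy of $\mathcal{E} = m\gamma + e\phi$ is as follows:
$$ \frac{d\mathcal{E}}{dt} = \frac{d(m\gamma)}{dt} + e\,\frac{d\phi}{dt}
 = \frac{d\mathcal{E}_{\rm mech}}{dt} + e\,\partial_i\phi\,\frac{dx^i}{dt}
 = e\,\vec v\cdot\vec E - e\,\vec v\cdot\vec E = 0 . \tag{3.13} $$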
3.2 Particle motion in static uniform E and B fields
Let us consider the case where a charged particle is moving in static (i.e. time-independent)
uniform $\vec E$ and $\vec B$ fields. In other words, $\vec E$ and $\vec B$ are constant vectors, independent of
time and of position. In this situation, it is easy to write down explicit expressions for the
corresponding scalar and 3-vector potentials. For the scalar potential, we can take
$$ \phi = -\vec E\cdot\vec r = -E_i\,x_i . \tag{3.14} $$
Clearly this gives the correct electric field, since
$$ -\partial_i\phi = \partial_i(E_j x_j) = E_j\,\partial_i x_j = E_j\,\delta_{ij} = E_i . \tag{3.15} $$
(It is, of course, essential that $E_j$ is constant for this calculation to be valid.)
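Turning now to the uniform $\vec B$ field, it is easily seen that this can be written as
$\vec B = \vec\nabla\times\vec A$, with the 3-vector potential given by
$$ \vec A = \tfrac12\,\vec B\times\vec r . \tag{3.16} $$
It is easiest to check this using index notation. We have
$$ (\vec\nabla\times\vec A)_i = \epsilon_{ijk}\,\partial_j A_k
 = \epsilon_{ijk}\,\partial_j\left(\tfrac12\,\epsilon_{k\ell m}\,B_\ell\,x_m\right)
 = \tfrac12\,\epsilon_{ijk}\,\epsilon_{\ell m k}\,B_\ell\,\partial_j x_m
 = \tfrac12\,\epsilon_{ijk}\,\epsilon_{\ell j k}\,B_\ell
 = \delta_{i\ell}\,B_\ell = B_i . \tag{3.17} $$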
Of course the potentials we have written above are not unique, since we can still perform
gauge transformations. If we restrict attention to transformations that maintain the
time-independence of $\phi$ and $\vec A$, then for $\phi$ the only remaining freedom is to add an arbitrary
constant to $\phi$. For the 3-vector potential, we can still add $\vec\nabla\lambda(\vec r)$ to $\vec A$, where $\lambda(\vec r)$ is an
arbitrary function of position. It is sometimes helpful, for calculational reasons, to do this.
Suppose, for example, that the uniform $\vec B$ field lies along the $z$ axis: $\vec B = (0, 0, B)$. From
(3.16), we may therefore write the 3-vector potential
$$ \vec A = \left(-\tfrac12 B y,\ \tfrac12 B x,\ 0\right) . \tag{3.18} $$
Another choice is to take $\vec A' = \vec A + \vec\nabla\lambda(\vec r)$, with $\lambda = -\tfrac12 B x y$. This gives
$$ \vec A' = (-B y,\ 0,\ 0) . \tag{3.19} $$
One easily verifies that indeed $\vec\nabla\times\vec A' = (0, 0, B)$.
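The statement that (3.18) and (3.19) describe the same magnetic field can be checked with a finite-difference curl. This is only an illustrative sketch: the field strength, the sample point and the function names are arbitrary assumptions.

```python
def curl(A, r, h=1e-5):
    """Central-difference curl of a vector field A(x, y, z) at the point r."""
    def d(component, axis):
        plus, minus = list(r), list(r)
        plus[axis] += h
        minus[axis] -= h
        return (A(*plus)[component] - A(*minus)[component]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

B = 2.0  # illustrative field strength along z

def A_symmetric(x, y, z):
    """The potential (3.18): A = (1/2) B x r for B along z."""
    return (-0.5 * B * y, 0.5 * B * x, 0.0)

def A_shifted(x, y, z):
    """The gauge-transformed potential (3.19)."""
    return (-B * y, 0.0, 0.0)

point = (0.3, -1.1, 0.7)  # arbitrary sample point
curl_symmetric = curl(A_symmetric, point)
curl_shifted = curl(A_shifted, point)
```

Both curls come out as (0, 0, B) up to rounding, since the two potentials differ only by the gradient of $\lambda = -\tfrac12 Bxy$.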
3.2.1 Motion in a static uniform electric ﬁeld
From the Lorentz force equation, we shall have
$$ \frac{d\vec p}{dt} = e\,\vec E , \tag{3.20} $$
where $\vec p = m\gamma\vec v$ is the relativistic 3-momentum. Without loss of generality, we may take
the electric field to lie along the $x$ axis, and so we will have
$$ \frac{dp_x}{dt} = eE , \qquad \frac{dp_y}{dt} = 0 , \qquad \frac{dp_z}{dt} = 0 . \tag{3.21} $$
Since there is a rotational symmetry in the $(y, z)$ plane, we can, without loss of generality,
choose to take $p_z = 0$, since the motion in the $(y, z)$ plane is evidently, from (3.21), simply
linear. Thus we may take the solution to (3.21) to be
$$ p_x = eEt , \qquad p_y = \bar p , \qquad p_z = 0 , \tag{3.22} $$
where $\bar p$ is a constant. We have also chosen the origin for the time coordinate $t$ such that
$p_x = 0$ at $t = 0$.
Recalling that the 4-momentum is given by $p^\mu = (m\gamma, \vec p) = (\mathcal{E}_{\rm mech}, \vec p)$, and that
$p^\mu p_\mu = m^2\,U^\mu U_\mu = -m^2$, we see that
$$ \mathcal{E}_{\rm mech} = \sqrt{m^2 + p_x^2 + p_y^2} = \sqrt{m^2 + \bar p^2 + (eEt)^2} , \tag{3.23} $$
and hence we may write
$$ \mathcal{E}_{\rm mech} = \sqrt{\mathcal{E}_0^2 + (eEt)^2} , \tag{3.24} $$
where $\mathcal{E}_0^2 = m^2 + \bar p^2$ is the square of the mechanical energy at time $t = 0$.
We have $\vec p = m\gamma\,\vec v = \mathcal{E}_{\rm mech}\,\vec v$, and so
$$ \frac{dx}{dt} = \frac{p_x}{\mathcal{E}_{\rm mech}} = \frac{eEt}{\sqrt{\mathcal{E}_0^2 + (eEt)^2}} , \tag{3.25} $$
which can be integrated to give
$$ x = \frac{1}{eE}\,\sqrt{\mathcal{E}_0^2 + (eEt)^2} . \tag{3.26} $$
(The constant of integration has been absorbed into a choice of origin for the $x$ coordinate.)
Note from (3.25) that the $x$-component of the 3-velocity asymptotically approaches 1 as $t$
goes to infinity. Thus the particle is accelerated closer and closer to the speed of light, but
never reaches it.
We also have
$$ \frac{dy}{dt} = \frac{p_y}{\mathcal{E}_{\rm mech}} = \frac{\bar p}{\sqrt{\mathcal{E}_0^2 + (eEt)^2}} . \tag{3.27} $$
This can be integrated by changing variable from $t$ to $u$, defined by
$$ eEt = \mathcal{E}_0\,\sinh u . \tag{3.28} $$
This gives $y = \bar p\,u/(eE)$, and hence
$$ y = \frac{\bar p}{eE}\,{\rm arcsinh}\left(\frac{eEt}{\mathcal{E}_0}\right) . \tag{3.29} $$
(Again, the constant of integration has been absorbed into the choice of origin for $y$.)
The solutions (3.26) and (3.29) for $x$ and $y$ as functions of $t$ can be combined to give $x$
as a function of $y$, leading to
$$ x = \frac{\mathcal{E}_0}{eE}\,\cosh\left(\frac{eEy}{\bar p}\right) . \tag{3.30} $$
This is a catenary.
In the non-relativistic limit when $|v| \ll 1$, we have $\bar p \approx m\bar v$ and then, expanding (3.30)
we find the standard "Newtonian" parabolic motion
$$ x \approx {\rm constant} + \frac{eE}{2m\bar v^2}\,y^2 . \tag{3.31} $$
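The relation between the parametric solution (3.26), (3.29) and the catenary (3.30) can be confirmed numerically. The sketch below uses made-up values for the mass, charge, field strength and transverse momentum, in units with c = 1; the function names are illustrative only.

```python
import math

def trajectory(t, m, e, E, p_bar):
    """Return (x, y) at time t from (3.26) and (3.29)."""
    E0 = math.sqrt(m * m + p_bar * p_bar)
    x = math.sqrt(E0 * E0 + (e * E * t) ** 2) / (e * E)
    y = (p_bar / (e * E)) * math.asinh(e * E * t / E0)
    return x, y

def catenary_x(y, m, e, E, p_bar):
    """x as a function of y from (3.30)."""
    E0 = math.sqrt(m * m + p_bar * p_bar)
    return (E0 / (e * E)) * math.cosh(e * E * y / p_bar)

def velocity_x(t, m, e, E, p_bar):
    """dx/dt from (3.25); its magnitude stays below 1 at all times."""
    E0 = math.sqrt(m * m + p_bar * p_bar)
    return e * E * t / math.sqrt(E0 * E0 + (e * E * t) ** 2)

# Illustrative (assumed) parameters.
params = dict(m=1.0, e=1.0, E=0.5, p_bar=0.8)
samples = [trajectory(t, **params) for t in (0.0, 0.5, 2.0, 10.0)]
mismatch = max(abs(x - catenary_x(y, **params)) for x, y in samples)
```

Here `mismatch` is at the level of rounding error, and `velocity_x` approaches but never reaches 1 as `t` grows.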
3.2.2 Motion in a static uniform magnetic ﬁeld
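From the Lorentz force equation we shall have
$$ \frac{d\vec p}{dt} = e\,\vec v\times\vec B . \tag{3.32} $$
Recalling (3.11), we see that in the absence of an electric field we shall have $\gamma$ = constant,
and hence $d\vec p/dt = d(m\gamma\vec v)/dt = m\gamma\,d\vec v/dt$, leading to
$$ \frac{d\vec v}{dt} = \frac{e}{m\gamma}\,\vec v\times\vec B = \frac{e}{\mathcal{E}}\,\vec v\times\vec B , \tag{3.33} $$
since $\mathcal{E} = m\gamma + e\phi = m\gamma$ (a constant) here.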
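Without loss of generality we may choose the uniform $\vec B$ field to lie along the $z$ axis:
$\vec B = (0, 0, B)$. Defining
$$ \omega \equiv \frac{eB}{\mathcal{E}} = \frac{eB}{m\gamma} , \tag{3.34} $$
we then find
$$ \frac{dv_x}{dt} = \omega\,v_y , \qquad \frac{dv_y}{dt} = -\omega\,v_x , \qquad \frac{dv_z}{dt} = 0 . \tag{3.35} $$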
From this, it follows that
$$ \frac{d(v_x + i\,v_y)}{dt} = -i\,\omega\,(v_x + i\,v_y) , \tag{3.36} $$
and so the first two equations in (3.35) can be integrated to give
$$ v_x + i\,v_y = v_0\,e^{-i(\omega t + \alpha)} , \tag{3.37} $$
where $v_0$ is a real constant, and $\alpha$ is a constant (real) phase. Thus after further integrations
we obtain
$$ x = x_0 + r_0\,\sin(\omega t + \alpha) , \qquad y = y_0 + r_0\,\cos(\omega t + \alpha) , \qquad z = z_0 + \bar v_z\,t , \tag{3.38} $$
for constants $r_0$, $x_0$, $y_0$, $z_0$ and $\bar v_z$, with
$$ r_0 = \frac{v_0}{\omega} = \frac{m\gamma v_0}{eB} = \frac{\bar p}{eB} , \tag{3.39} $$
where $\bar p$ is the relativistic 3-momentum in the $(x, y)$ plane. The particle therefore follows a
helical path, of radius $r_0$.
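A short numerical check of (3.35), (3.38) and (3.39) is sketched below. The parameter values are made up, the motion is taken to lie purely in the $(x, y)$ plane (so that $\gamma$ is fixed by $v_0$ alone, an assumption of this example), and the acceleration is estimated by a central difference.

```python
import math

def gyro_state(t, v0, omega, alpha=0.0):
    """Position relative to the orbit centre, and velocity, in the (x, y) plane."""
    phase = omega * t + alpha
    r0 = v0 / omega
    position = (r0 * math.sin(phase), r0 * math.cos(phase))
    velocity = (v0 * math.cos(phase), -v0 * math.sin(phase))
    return position, velocity

# Illustrative (assumed) values, units with c = 1, no motion along z.
m, e, B = 1.0, 1.0, 2.0
v0 = 0.6
gamma = 1.0 / math.sqrt(1.0 - v0 * v0)
omega = e * B / (m * gamma)      # gyration frequency (3.34)
r0 = v0 / omega                  # orbit radius (3.39)
p_bar = m * gamma * v0           # transverse relativistic momentum

# Central-difference acceleration at an arbitrary time.
t, h = 0.7, 1e-6
_, v_plus = gyro_state(t + h, v0, omega)
_, v_minus = gyro_state(t - h, v0, omega)
pos, vel = gyro_state(t, v0, omega)
accel = ((v_plus[0] - v_minus[0]) / (2 * h), (v_plus[1] - v_minus[1]) / (2 * h))
```

With these values `accel` matches $(\omega v_y, -\omega v_x)$ and `r0` equals `p_bar / (e * B)`, as (3.35) and (3.39) require.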
3.2.3 Adiabatic invariant
In any conservative system with a periodic motion, it can be shown that the quantity
$$ I \equiv \oint \pi_i\,dx^i , \tag{3.40} $$
integrated over a complete cycle of the coordinates $x^i$, is conserved under slow (adiabatic)
changes of the external parameters. Specifically, if there is an external parameter $a$, then
$dI/dt$ is of order $O(\dot a^2, \ddot a)$, but there is no linear dependence on the first derivative $\dot a$.
In our previous discussion, of a charged particle moving under the influence of a uniform
magnetic field $\vec B$ that lies along the $z$ direction, we may consider the invariant $I$ that one
obtains by integrating around its closed path in the $(x, y)$ plane. We shall have
$$ I \equiv \oint \pi_i\,dx^i = \oint (p_i + eA_i)\,dx^i , \tag{3.41} $$
and
$$ \oint p_i\,dx^i = 2\pi r_0\,\bar p = 2\pi r_0^2\,eB . \tag{3.42} $$
We shall also have
$$ e\oint A_i\,dx^i = e\int_S \vec B\cdot d\vec S = -e\,\pi r_0^2\,B . \tag{3.43} $$
Hence we find
$$ I = 2\pi r_0^2\,eB - \pi r_0^2\,eB , \tag{3.44} $$
and so
$$ I = \pi r_0^2\,eB = \frac{\pi\bar p^2}{eB} . \tag{3.45} $$
The statement is that since $I$ is an adiabatic invariant, it will remain essentially
unchanged if $B$, which we can view here as the external parameter, is slowly changed. Thus
we may say that
$$ r_0 \propto B^{-1/2} , \qquad {\rm or} \qquad \bar p \propto B^{1/2} . \tag{3.46} $$
Note that since $\pi r_0^2 = A$, the area of the loop, it follows from (3.45) that
$$ I = e\,\Phi , \tag{3.47} $$
where $\Phi = AB$ is the magnetic flux threading the loop. Thus if we make a slow change to
the magnetic field, then the radius of the particle's orbit adjusts itself so that the magnetic
flux through the loop remains constant.
As an application, we may consider a charged particle moving in a static magnetic field
that changes gradually with position. We have already seen that $\mathcal{E}_{\rm mech}$ is constant in a pure
magnetic field. Since we have
$$ p^\mu p_\mu = -\mathcal{E}_{\rm mech}^2 + \vec p^{\,2} = -m^2 , \tag{3.48} $$
it follows that $|\vec p|$ is also a constant. In our discussion of the particle motion in the magnetic
field, we defined $\bar p$ to be the component of transverse 3-momentum; i.e. the component in
the $(x, y)$ plane. Thus we shall have
$$ \vec p^{\,2} = \bar p^2 + p_L^2 , \tag{3.49} $$
where $p_L$ denotes the longitudinal component of 3-momentum. It follows that
$$ p_L^2 = \vec p^{\,2} - \bar p^2 = \vec p^{\,2} - \frac{eIB}{\pi} . \tag{3.50} $$
Since $\vec p^{\,2}$ is a constant, it follows that as the particle penetrates into a region where the
magnetic field increases, the longitudinal momentum $p_L$ (i.e. the momentum in the direction
of its forward motion) gets smaller and smaller. If the $B$ field becomes large enough, the
forward motion will be brought to zero, and the particle will be repelled out of the region
of high magnetic field.
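The turning point described above can be made quantitative under the adiabatic assumption. Since $I = \pi\bar p^2/(eB)$ stays fixed, $\bar p^2$ grows in proportion to $B$, and (3.50) then gives the field strength at which $p_L$ vanishes. The sketch below is illustrative only: the function names and numbers are assumptions, and it presumes the field varies slowly enough for the invariant to hold.

```python
def longitudinal_momentum_sq(B, B0, p_total, p_perp0):
    """p_L^2 from (3.49)-(3.50), using p_bar^2 = p_perp0^2 * (B / B0)."""
    return p_total ** 2 - p_perp0 ** 2 * (B / B0)

def mirror_field(B0, p_total, p_perp0):
    """Field strength at which p_L^2 reaches zero and the particle turns back."""
    return B0 * (p_total / p_perp0) ** 2

# Illustrative (assumed) initial conditions in the weak-field region.
B0, p_total, p_perp0 = 1.0, 2.0, 1.0
B_turn = mirror_field(B0, p_total, p_perp0)
p_L_sq_start = longitudinal_momentum_sq(B0, B0, p_total, p_perp0)
p_L_sq_turn = longitudinal_momentum_sq(B_turn, B0, p_total, p_perp0)
```

For these values the particle starts with $p_L^2 = 3$ and is reflected where the field has grown to four times its initial strength; a particle with a smaller initial transverse momentum penetrates further.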
3.2.4 Motion in uniform E and B fields
Having considered the case of particle motion in a uniform $\vec E$ field, and in a uniform $\vec B$
field, we may also consider the situation of motion in uniform $\vec E$ and $\vec B$ fields together. To
discuss this in detail is quite involved, and we shall not pursue it extensively here. Instead,
consider the situation where we take
$$ \vec B = (0, 0, B) , \qquad \vec E = (0, E_y, E_z) , \tag{3.51} $$
(there is no loss of generality in choosing axes so that this is the case), and we make the
simplifying assumption that the motion is non-relativistic, i.e. $|v| \ll 1$. The equations of
motion will therefore be
$$ m\,\frac{d\vec v}{dt} = e\,(\vec E + \vec v\times\vec B) , \tag{3.52} $$
and so
$$ m\ddot x = eB\,\dot y , \qquad m\ddot y = eE_y - eB\,\dot x , \qquad m\ddot z = eE_z . \tag{3.53} $$
We can immediately solve for $z$, finding
$$ z = \frac{e}{2m}\,E_z\,t^2 + \bar v\,t , \tag{3.54} $$
where we have chosen the $z$ origin so that $z = 0$ at $t = 0$. The $x$ and $y$ equations can be
combined into
$$ \frac{d}{dt}(\dot x + i\,\dot y) + i\,\omega\,(\dot x + i\,\dot y) = \frac{i\,e}{m}\,E_y , \tag{3.55} $$
where $\omega = eB/m$. Thus we find
$$ \dot x + i\,\dot y = a\,e^{-i\omega t} + \frac{e}{m\omega}\,E_y = a\,e^{-i\omega t} + \frac{E_y}{B} . \tag{3.56} $$
Choosing the origin of time so that $a$ is real, we have
$$ \dot x = a\,\cos\omega t + \frac{E_y}{B} , \qquad \dot y = -a\,\sin\omega t . \tag{3.57} $$
Taking the time averages, we see that
$$ \langle\dot x\rangle = \frac{E_y}{B} , \qquad \langle\dot y\rangle = 0 . \tag{3.58} $$
The averaged velocity along the $x$ direction is called the drift velocity. Notice that it is
perpendicular to $\vec E$ and $\vec B$. It can be written in general as
$$ \vec v_{\rm drift} = \frac{\vec E\times\vec B}{B^2} . \tag{3.59} $$
For our assumption that $|v| \ll 1$ to be valid, we must have $|\vec E\times\vec B| \ll B^2$, i.e. $|E_y| \ll |B|$.
Integrating (3.57) once more, we find
$$ x = \frac{a}{\omega}\,\sin\omega t + \frac{E_y}{B}\,t , \qquad y = \frac{a}{\omega}\,(\cos\omega t - 1) , \tag{3.60} $$
where the origins of $x$ and $y$ have been chosen so that $x = y = 0$ at $t = 0$. These equations
describe the projection of the particle's motion onto the $(x, y)$ plane. The curve is called a
trochoid. If $|a| > E_y/B$ there will be loops in the motion, and in the special case $a = -E_y/B$
the curve becomes a cycloid, with cusps:
$$ x = \frac{E_y}{\omega B}\,(\omega t - \sin\omega t) , \qquad y = \frac{E_y}{\omega B}\,(1 - \cos\omega t) . \tag{3.61} $$
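The drift velocity (3.58) and the cusp of the cycloid (3.61) can be read off numerically from (3.57) and (3.60). The following sketch uses made-up non-relativistic values with $|E_y| \ll |B|$; the function names are illustrative assumptions.

```python
import math

def velocity(t, a, omega, E_y, B):
    """(x_dot, y_dot) from (3.57)."""
    return (a * math.cos(omega * t) + E_y / B, -a * math.sin(omega * t))

def position(t, a, omega, E_y, B):
    """(x, y) from (3.60), with x = y = 0 at t = 0."""
    x = (a / omega) * math.sin(omega * t) + (E_y / B) * t
    y = (a / omega) * (math.cos(omega * t) - 1.0)
    return (x, y)

def mean_velocity_over_period(a, omega, E_y, B):
    """Average velocity over one gyration period T = 2*pi/omega."""
    T = 2.0 * math.pi / omega
    x0, y0 = position(0.0, a, omega, E_y, B)
    x1, y1 = position(T, a, omega, E_y, B)
    return ((x1 - x0) / T, (y1 - y0) / T)

# Illustrative (assumed) values.
m, e, B, E_y = 1.0, 1.0, 1.0, 0.01
omega = e * B / m
drift = mean_velocity_over_period(a=0.03, omega=omega, E_y=E_y, B=B)

# Cycloid case a = -E_y/B: the particle is momentarily at rest at t = 0.
cusp_velocity = velocity(0.0, -E_y / B, omega, E_y, B)
```

The averaged velocity comes out as $(E_y/B, 0)$ whatever the value of `a`, and in the cycloid case the velocity vanishes at the cusp.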
4 Action Principle for Electrodynamics
In this section, we shall show how the Maxwell equations themselves can be derived from
an action principle. We shall also introduce the notion of the energy-momentum tensor for
the electromagnetic field. We begin with a discussion of Lorentz invariant quantities that
can be built from the Maxwell field strength tensor $F_{\mu\nu}$.
4.1 Invariants of the electromagnetic ﬁeld
As we shall now show, it is possible to build two independent Lorentz invariants that are
quadratic in the electromagnetic field. One of these will turn out to be just what is needed
in order to construct an action for electrodynamics.
4.1.1 The ﬁrst invariant
The first quadratic invariant is very simple; we may write
$$ I_1 \equiv F_{\mu\nu}\,F^{\mu\nu} . \tag{4.1} $$
Obviously this is Lorentz invariant, since it is built from the product of two Lorentz tensors,
with all indices contracted. It is instructive to see what this looks like in terms of the electric
and magnetic fields. From the expressions given in (2.14), we see that
$$ I_1 = F_{0i}F^{0i} + F_{i0}F^{i0} + F_{ij}F^{ij}
 = 2F_{0i}F^{0i} + F_{ij}F^{ij}
 = -2E_iE_i + \epsilon_{ijk}B_k\,\epsilon_{ij\ell}B_\ell
 = -2E_iE_i + 2B_iB_i , \tag{4.2} $$
and so
$$ I_1 \equiv F_{\mu\nu}\,F^{\mu\nu} = 2\,(\vec B^2 - \vec E^2) . \tag{4.3} $$
One could, of course, verify from the Lorentz transformations (2.51) and (2.52) for $\vec E$
and $\vec B$ that indeed $(\vec B^2 - \vec E^2)$ was invariant, i.e. $I'_1 = I_1$ under Lorentz transformations. This
would be quite an involved computation. However, the great beauty of the 4-dimensional
language is that there is absolutely no work needed at all; one can see by inspection that
$F_{\mu\nu}F^{\mu\nu}$ is Lorentz invariant.
4.1.2 The second invariant
The second quadratic invariant that we can write down is given by
$$ I_2 \equiv \tfrac12\,\epsilon^{\mu\nu\rho\sigma}\,F_{\mu\nu}\,F_{\rho\sigma} . \tag{4.4} $$
First, we need to explain the tensor $\epsilon^{\mu\nu\rho\sigma}$. This is the four-dimensional Minkowski spacetime
generalisation of the totally-antisymmetric tensor $\epsilon_{ijk}$ of three-dimensional Cartesian tensor
analysis. The tensor $\epsilon^{\mu\nu\rho\sigma}$ is also totally antisymmetric in all its indices. That means that
it changes sign if any two indices are exchanged. For example (see footnote 8),
$$ \epsilon^{\mu\nu\rho\sigma} = -\epsilon^{\nu\mu\rho\sigma} = -\epsilon^{\mu\nu\sigma\rho} = -\epsilon^{\sigma\nu\rho\mu} . \tag{4.5} $$
Since all the non-vanishing components of $\epsilon^{\mu\nu\rho\sigma}$ are related by the antisymmetry, we need
only specify one non-vanishing component in order to define the tensor completely. We
shall define
$$ \epsilon^{0123} = -1 . \tag{4.6} $$
Thus $\epsilon^{\mu\nu\rho\sigma}$ is $-1$, $+1$ or $0$ according to whether $(\mu\nu\rho\sigma)$ is an even permutation of $(0123)$,
an odd permutation, or no permutation at all. We use this definition of $\epsilon^{\mu\nu\rho\sigma}$ in all frames.
This can be done because, like the Minkowski metric $\eta_{\mu\nu}$, the tensor $\epsilon^{\mu\nu\rho\sigma}$ is an invariant
tensor, as we shall now discuss.
Actually, to be more precise, $\epsilon^{\mu\nu\rho\sigma}$ is an invariant pseudotensor. This means that
under Lorentz transformations that are connected to the identity (pure boosts and/or pure
rotations), it is truly an invariant tensor. However, it reverses its sign under Lorentz
transformations that involve a reflection. To see this, let us calculate what the transformation
of $\epsilon^{\mu\nu\rho\sigma}$ would be if we assume it behaves as an ordinary Lorentz tensor:
$$ \epsilon'^{\mu\nu\rho\sigma} \equiv \Lambda^\mu{}_\alpha\,\Lambda^\nu{}_\beta\,\Lambda^\rho{}_\gamma\,\Lambda^\sigma{}_\delta\,\epsilon^{\alpha\beta\gamma\delta}
 = (\det\Lambda)\,\epsilon^{\mu\nu\rho\sigma} . \tag{4.7} $$
The last equality can easily be seen by writing out all the terms. (It is easier to play around
with the analogous identity in 2 or 3 dimensions, to convince oneself of it in an example
with fewer terms to write down.) Now, we already saw in section 2.3 that $\det\Lambda = \pm 1$,
with $\det\Lambda = +1$ for pure boosts and/or rotations, and $\det\Lambda = -1$ if there is a reflection as
well. (See the discussion leading up to equation (2.39).) Thus we see from (4.7) that $\epsilon^{\mu\nu\rho\sigma}$
behaves like an invariant tensor, taking the same values in all Lorentz frames, provided
there is no reflection. (Lorentz transformations connected to the identity, i.e. where there
is no reflection, are sometimes called proper Lorentz transformations.) In practice, we shall
almost always be considering only proper Lorentz transformations, and so the distinction
between a tensor and a pseudotensor will not concern us.

Footnote 8: Beware that in an odd dimension, such as 3, the process of "cycling" the indices on $\epsilon_{ijk}$ (for example,
pushing one off the right-hand end and bringing it to the front) is an even permutation; $\epsilon_{kij} = \epsilon_{ijk}$. By
contrast, in an even dimension, such as 4, the process of cycling is an odd permutation; $\epsilon^{\sigma\mu\nu\rho} = -\epsilon^{\mu\nu\rho\sigma}$.
This is an elementary point, but easily overlooked if one is familiar only with three dimensions!
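The sign conventions for $\epsilon^{\mu\nu\rho\sigma}$, including the point made in footnote 8 about cycling indices, are easy to check by counting transpositions. The sketch below is illustrative: the function names are assumptions, and indices are taken to be integers in the range 0 to n-1.

```python
def permutation_sign(perm):
    """+1 for an even permutation of (0, ..., n-1), -1 for odd, 0 if an index repeats."""
    if len(set(perm)) != len(perm):
        return 0
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        # Swap the entry at position i into its home slot until position i is correct.
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def epsilon_upper(mu, nu, rho, sigma):
    """epsilon^{mu nu rho sigma} with the convention epsilon^{0123} = -1 of (4.6)."""
    return -permutation_sign((mu, nu, rho, sigma))

# Cycling three indices is an even permutation; cycling four indices is odd.
cycle_3 = permutation_sign((2, 0, 1))
cycle_4 = permutation_sign((3, 0, 1, 2))
```

With this convention `epsilon_upper(0, 1, 2, 3)` is $-1$, exchanging any two indices flips the sign, and any repeated index gives zero.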
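Returning now to the second quadratic invariant, (4.4), we shall have
$$ I_2 = \tfrac12\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}
 = \tfrac12\cdot 4\cdot\epsilon^{0ijk}F_{0i}F_{jk}
 = 2\,(-\epsilon_{ijk})(-E_i)\,\epsilon_{jk\ell}B_\ell
 = 4E_iB_i = 4\,\vec E\cdot\vec B . \tag{4.8} $$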
Thus, to summarise, we have the two quadratic invariants
$$ I_1 = F_{\mu\nu}F^{\mu\nu} = 2\,(\vec B^2 - \vec E^2) , \qquad
 I_2 = \tfrac12\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma} = 4\,\vec E\cdot\vec B . \tag{4.9} $$
Since the two quantities $I_1$ and $I_2$ are (manifestly) Lorentz invariant, this means that,
even though it is not directly evident in the three-dimensional language without quite a lot
of work, the two quantities
$$ \vec B^2 - \vec E^2 , \qquad {\rm and} \qquad \vec E\cdot\vec B \tag{4.10} $$
are Lorentz invariant; i.e. they take the same values in all Lorentz frames. This has a
number of consequences. For example:
1. If $\vec E$ and $\vec B$ are perpendicular in one Lorentz frame, then they are perpendicular in
all Lorentz frames.
2. In particular, if there exists a Lorentz frame where the electromagnetic field is purely
electric ($\vec B = 0$), or purely magnetic ($\vec E = 0$), then $\vec E$ and $\vec B$ are perpendicular in any
other frame.
3. If $|\vec E| > |\vec B|$ in one frame, then it is true in all frames. Conversely, if $|\vec E| < |\vec B|$ in one
frame, then it is true in all frames.
4. By making an appropriate Lorentz transformation, we can, at a given point, make $\vec E$
and $\vec B$ equal to any values we like, subject only to the conditions that we cannot alter
the values of $(\vec B^2 - \vec E^2)$ and $\vec E\cdot\vec B$ at that point.
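The invariance of the two quantities in (4.10) can be illustrated with an explicit boost. The sketch below assumes the standard transformation of $\vec E$ and $\vec B$ under a boost with velocity $v$ along the $x$ axis, in Gaussian units with c = 1; it is presumed, not verified here, that these agree with the document's equations (2.51) and (2.52). The field values are arbitrary.

```python
import math

def boost_fields_x(E, B, v):
    """Fields seen in a frame moving with velocity v along x (assumed standard form)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, g * (Ey - v * Bz), g * (Ez + v * By))
    B_new = (Bx, g * (By + v * Ez), g * (Bz - v * Ey))
    return E_new, B_new

def invariants(E, B):
    """Return (B^2 - E^2, E.B), the two quantities in (4.10)."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return dot(B, B) - dot(E, E), dot(E, B)

# Arbitrary illustrative fields and boost velocity.
E, B = (0.3, -1.2, 0.5), (0.7, 0.4, -0.9)
before = invariants(E, B)
after = invariants(*boost_fields_x(E, B, 0.6))

# A purely electric field acquires a magnetic part that stays perpendicular to E.
pure_E_after = invariants(*boost_fields_x((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), 0.6))
```

The pairs `before` and `after` agree to rounding error, and in the purely electric example $\vec E\cdot\vec B$ remains zero after the boost, in line with consequence 2 above.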
4.2 Action for Electrodynamics
We have already discussed the action principle for a charged particle moving in an
electromagnetic field. In that discussion, the electromagnetic field was just a specified background,
which, of course, would be a solution of the Maxwell equations. We can also derive the
Maxwell equations themselves from an action principle, as we shall now show.
We begin by introducing the notion of Lagrangian density. This is a quantity that is
integrated over a three-dimensional spatial volume (typically, all of 3-space) to give the
Lagrangian:
$$ L = \int \mathcal{L}\,d^3x . \tag{4.11} $$
Then, the Lagrangian is integrated over a time interval $t_1 \le t \le t_2$ to give the action,
$$ S = \int_{t_1}^{t_2} L\,dt = \int \mathcal{L}\,d^4x . \tag{4.12} $$
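Consider first the vacuum Maxwell equations without sources,
$$ \partial_\mu F^{\mu\nu} = 0 , \qquad \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} + \partial_\rho F_{\mu\nu} = 0 . \tag{4.13} $$
We immediately solve the second equation (the Bianchi identity) by writing $F_{\mu\nu}$ in terms
of a potential:
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{4.14} $$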
Since the Maxwell field equations are linear in the fields, it is natural to expect that the
action should be quadratic. In fact, it turns out that the first invariant we considered above
provides the appropriate Lagrangian density. We take
$$ \mathcal{L} = -\frac{1}{16\pi}\,F_{\mu\nu}F^{\mu\nu} , \tag{4.15} $$
and so the action will be
$$ S = -\frac{1}{16\pi}\int F_{\mu\nu}F^{\mu\nu}\,d^4x . \tag{4.16} $$
We can now derive the source-free Maxwell equations by requiring that this action be
stationary with respect to variations of the gauge field $A_\mu$. It must be emphasised that we
treat $A_\mu$ as the fundamental field here.
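The derivation goes as follows. We shall have
$$ \begin{aligned}
\delta S &= -\frac{1}{16\pi}\int (\delta F_{\mu\nu}\,F^{\mu\nu} + F_{\mu\nu}\,\delta F^{\mu\nu})\,d^4x
 = -\frac{1}{8\pi}\int \delta F_{\mu\nu}\,F^{\mu\nu}\,d^4x \\
&= -\frac{1}{8\pi}\int F^{\mu\nu}\,(\partial_\mu\delta A_\nu - \partial_\nu\delta A_\mu)\,d^4x
 = -\frac{1}{4\pi}\int F^{\mu\nu}\,\partial_\mu\delta A_\nu\,d^4x \\
&= -\frac{1}{4\pi}\int \partial_\mu(F^{\mu\nu}\,\delta A_\nu)\,d^4x + \frac{1}{4\pi}\int (\partial_\mu F^{\mu\nu})\,\delta A_\nu\,d^4x \\
&= -\frac{1}{4\pi}\int_\Sigma F^{\mu\nu}\,\delta A_\nu\,d\Sigma_\mu + \frac{1}{4\pi}\int (\partial_\mu F^{\mu\nu})\,\delta A_\nu\,d^4x \\
&= \frac{1}{4\pi}\int (\partial_\mu F^{\mu\nu})\,\delta A_\nu\,d^4x .
\end{aligned} \tag{4.17} $$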
Note that in the final steps, we have used the 4-dimensional analogue of the divergence
theorem to turn the 4-volume integral of the divergence of a vector into a 3-volume integral
over the bounding surface $\Sigma$. The next step is to say that this integral vanishes, because
we restrict attention to variations $\delta A_\mu$ that vanish on $\Sigma$. Finally, we argue that if $\delta S$ is to
vanish for all possible variations $\delta A_\mu$ (that vanish on $\Sigma$), it must be that
$$ \partial_\mu F^{\mu\nu} = 0 . \tag{4.18} $$
Thus we have derived the source-free Maxwell field equation. Of course the Bianchi identity
has already been taken care of by writing $F_{\mu\nu}$ in terms of the 4-vector potential $A_\mu$.
The action (4.16), whose variation gave the Maxwell field equation, is written in what
is called second-order formalism; that is, the action is expressed in terms of the 4-vector
potential $A_\mu$ as the fundamental field, with $F_{\mu\nu}$ just being a shorthand notation for
$\partial_\mu A_\nu - \partial_\nu A_\mu$. It is sometimes convenient to use instead the first-order formalism, in which one
treats $A_\mu$ and $F_{\mu\nu}$ as independent fields. In this formalism, the equation of motion coming
from demanding that $S$ be stationary under variations of $F_{\mu\nu}$ will yield the equation
$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. To do this, we need a different action as our starting point, namely
$$ S_{\rm f.o.} = \frac{1}{4\pi}\int \left(\tfrac14\,F^{\mu\nu}F_{\mu\nu} - F^{\mu\nu}\,\partial_\mu A_\nu\right) d^4x . \tag{4.19} $$
First, consider the variation of $F_{\mu\nu}$, now treated as an independent fundamental field. This
gives
$$ \begin{aligned}
\delta S_{\rm f.o.} &= \frac{1}{4\pi}\int \left(\tfrac12\,F^{\mu\nu}\,\delta F_{\mu\nu} - \delta F^{\mu\nu}\,\partial_\mu A_\nu\right) d^4x \\
&= \frac{1}{4\pi}\int \left[\tfrac12\,F_{\mu\nu}\,\delta F^{\mu\nu} - \tfrac12\,\delta F^{\mu\nu}\,(\partial_\mu A_\nu - \partial_\nu A_\mu)\right] d^4x ,
\end{aligned} \tag{4.20} $$
where, in getting to the second line, we have used the fact that $F_{\mu\nu}$ is antisymmetric. The
reason for doing this is that when we vary $F_{\mu\nu}$ we can take $\delta F_{\mu\nu}$ to be arbitrary, but it must
still be antisymmetric. Thus it is helpful to force an explicit antisymmetrisation on the
$\partial_\mu A_\nu$ that multiplies it, since the symmetric part automatically gives zero when contracted
onto the antisymmetric $\delta F^{\mu\nu}$. Requiring $\delta S_{\rm f.o.} = 0$ for arbitrary $\delta F^{\mu\nu}$ then implies the
integrand must vanish. This gives, as promised, the equation of motion
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{4.21} $$
Varying $S_{\rm f.o.}$ in (4.19) instead with respect to $A_\mu$, we get
$$ \delta S_{\rm f.o.} = -\frac{1}{4\pi}\int F^{\mu\nu}\,\partial_\mu\delta A_\nu\,d^4x
 = \frac{1}{4\pi}\int (\partial_\mu F^{\mu\nu})\,\delta A_\nu\,d^4x , \tag{4.22} $$
and hence requiring that the variation of $S_{\rm f.o.}$ with respect to $A_\mu$ vanish gives the Maxwell
field equation
$$ \partial_\mu F^{\mu\nu} = 0 \tag{4.23} $$
again. Note that in this calculation, we immediately dropped the boundary term coming
from the integration by parts, for the usual reason that we only allow variations that vanish
on the boundary.
In practice, we shall usually use the previous, second-order, formalism.
4.3 Inclusion of sources
In general, the Maxwell field equation reads
$$ \partial_\mu F^{\mu\nu} = -4\pi J^\nu . \tag{4.24} $$
So far, we have seen that by varying the second-order action (4.16) with respect to $A_\mu$, we
obtain
$$ \delta S = \frac{1}{4\pi}\int \partial_\mu F^{\mu\nu}\,\delta A_\nu\,d^4x . \tag{4.25} $$
To derive the Maxwell field equation with a source current $J^\mu$, we can simply add a term
to the action, to give
$$ S = \int \left(-\frac{1}{16\pi}\,F_{\mu\nu}F^{\mu\nu} + J^\mu A_\mu\right) d^4x . \tag{4.26} $$
Treating $J^\mu$ as independent of $A_\mu$, we therefore find
$$ \delta S = \int \left(\frac{1}{4\pi}\,\partial_\mu F^{\mu\nu} + J^\nu\right)\delta A_\nu\,d^4x , \tag{4.27} $$
and so requiring $\delta S = 0$ gives the Maxwell field equation (4.24) with the source on the
right-hand side.
The form of the source current $J^\mu$ depends, of course, on the details of the situation
one is considering. One might simply have a situation where $J^\mu$ is an externally-supplied
source field. Alternatively, the source $J^\mu$ might itself be given dynamically in terms of some
charged matter fields, or in terms of a set of moving point charges. Let us consider this
possibility in more detail.
If there is a single point charge $q$ at the location $\vec r_0$, then it will be described by the
charge density
$$ \rho = q\,\delta^3(\vec r - \vec r_0) , \tag{4.28} $$
where the three-dimensional delta-function $\delta^3(\vec r)$, with $\vec r = (x, y, z)$, means
$$ \delta^3(\vec r) = \delta(x)\,\delta(y)\,\delta(z) . \tag{4.29} $$
If the charge is moving, so that its location at time $t$ is at $\vec r = \vec r_0(t)$, then of course we shall
have
$$ \rho = q\,\delta^3(\vec r - \vec r_0(t)) . \tag{4.30} $$
The 3-vector current will be given by
$$ \vec J = q\,\delta^3(\vec r - \vec r_0(t))\,\frac{d\vec r_0}{dt} , \tag{4.31} $$
and so the 4-current is
$$ J^\mu = (\rho, \rho\,\vec v) , \qquad {\rm where} \quad \vec v = \frac{d\vec r_0}{dt} , \tag{4.32} $$
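and $\rho$ is given by (4.30). We can verify that this is the correct current vector, by checking
that it properly satisfies the charge-conservation equation $\partial_\mu J^\mu = \partial\rho/\partial t + \partial_i J^i = 0$. Thus
we have
$$ \begin{aligned}
\frac{\partial\rho}{\partial t} &= q\,\frac{\partial}{\partial t}\,\delta^3(\vec r - \vec r_0(t))
 = q\,\frac{\partial}{\partial x_0^i}\,\delta^3(\vec r - \vec r_0(t))\,\frac{dx_0^i}{dt} \\
&= -q\,\frac{\partial}{\partial x^i}\,\delta^3(\vec r - \vec r_0(t))\,\frac{dx_0^i}{dt}
 = -\partial_i\rho\,\frac{dx_0^i}{dt} \\
&= -\partial_i(\rho\,v^i) = -\partial_i J^i .
\end{aligned} \tag{4.33} $$
Note that we used the chain rule for differentiation in the first line, and that in getting
to the second line we used the standard result for a delta-function that $\partial/\partial x\,\delta(x - y) =
-\partial/\partial y\,\delta(x - y)$. It is also useful to note that we can write (4.32) as
$$ J^\mu = \rho\,\frac{dx_0^\mu}{dt} , \tag{4.34} $$
where we simply define $x_0^\mu$ with $\mu = 0$ to be $t$.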
Note that the integral $\int J^\mu A_\mu\,d^4x$ for the point charge gives a contribution to the action
that is precisely of the form we saw in equation (2.77):
$$ \int J^\mu A_\mu\,d^4x = \int q\,\delta^3(\vec r - \vec r_0)\,\frac{dx_0^\mu}{dt}\,A_\mu\,d^3x\,dt
 = \int_{\rm path} q\,\frac{dx^\mu}{dt}\,A_\mu\,dt = q\int_{\rm path} A_\mu\,dx^\mu . \tag{4.35} $$
Suppose now we have $N$ charges $q_a$, following paths $\vec r_a(t)$. Then the total charge density
will be given by
$$ \rho = \sum_{a=1}^{N} q_a\,\delta^3(\vec r - \vec r_a(t)) . \tag{4.36} $$
Since we have alluded several times to the fact that $\partial_\mu J^\mu = 0$ is the equation of charge
conservation, it is appropriate to examine this in a little more detail. The total charge $Q$
at time $t_1$ is given by integrating the charge density over the spatial 3-volume:
$$ Q(t_1) = \int_{t=t_1} J^0\,d\Sigma_0 , \qquad {\rm where} \quad d\Sigma_0 = dx\,dy\,dz . \tag{4.37} $$
This can be written covariantly as
$$ Q(t_1) = \int_{t=t_1} J^\mu\,d\Sigma_\mu , \tag{4.38} $$
where we define also
$$ d\Sigma_1 = -dt\,dy\,dz , \qquad d\Sigma_2 = -dt\,dz\,dx , \qquad d\Sigma_3 = -dt\,dx\,dy . \tag{4.39} $$
Because the integral in (4.37) is defined to be over the 3-surface at constant $t$, it follows
that the extra terms, for $\mu = 1, 2, 3$, in (4.38) do not contribute.
If we now calculate the charge at a later time $t_2$, and then take the difference between
the two charges, we will obtain
$$ Q(t_2) - Q(t_1) = \int_\Sigma J^\mu\,d\Sigma_\mu , \tag{4.40} $$
where $\Sigma$ is the cylindrical closed spatial 3-volume bounded by the "end caps" formed by the
surfaces $t = t_1$ and $t = t_2$, and by the sides at spatial infinity. We are assuming the charges
are confined to a finite region, and so the current $J^\mu$ is zero on the sides of the cylinder.
By the 4-dimensional analogue of the divergence theorem we shall have
$$ \int_\Sigma J^\mu\,d\Sigma_\mu = \int_V \partial_\mu J^\mu\,d^4x , \tag{4.41} $$
where $V$ is the 4-volume bounded by $\Sigma$. Thus we have
$$ Q(t_2) - Q(t_1) = \int_V \partial_\mu J^\mu\,d^4x = 0 , \tag{4.42} $$
since $\partial_\mu J^\mu = 0$. Thus we see that $\partial_\mu J^\mu = 0$ implies that the total charge in an isolated
finite region is independent of time.
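Note that the equation of charge conservation implies the gauge invariance of the action.
We have
$$ S = \int \left(-\frac{1}{16\pi}\,F_{\mu\nu}F^{\mu\nu} + J^\mu A_\mu\right) d^4x , \tag{4.43} $$
and so under a gauge transformation $A_\mu \to A_\mu + \partial_\mu\lambda$, we find
$$ \begin{aligned}
S &\longrightarrow \int \left(-\frac{1}{16\pi}\,F_{\mu\nu}F^{\mu\nu} + J^\mu A_\mu\right) d^4x + \int J^\mu\,\partial_\mu\lambda\,d^4x \\
&= S + \int J^\mu\,\partial_\mu\lambda\,d^4x = S - \int \lambda\,\partial_\mu J^\mu\,d^4x \\
&= S .
\end{aligned} \tag{4.44} $$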
4.4 Energy density and energy ﬂux
Here, we review the calculation of energy density and energy flux in the 3-dimensional
language. After that, we shall give the more elegant 4-dimensional description.
Consider the two Maxwell equations
$$ \vec\nabla\times\vec B - \frac{\partial\vec E}{\partial t} = 4\pi\vec J , \qquad
 \vec\nabla\times\vec E + \frac{\partial\vec B}{\partial t} = 0 . \tag{4.45} $$
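From these, we can deduce
$$ \begin{aligned}
\vec E\cdot\frac{\partial\vec E}{\partial t} + \vec B\cdot\frac{\partial\vec B}{\partial t}
&= \vec E\cdot(\vec\nabla\times\vec B - 4\pi\vec J) - \vec B\cdot(\vec\nabla\times\vec E) \\
&= \epsilon_{ijk}\,(E_i\,\partial_j B_k - B_i\,\partial_j E_k) - 4\pi\,\vec J\cdot\vec E \\
&= -\epsilon_{ijk}\,(B_i\,\partial_j E_k + E_k\,\partial_j B_i) - 4\pi\,\vec J\cdot\vec E \\
&= -\partial_j(\epsilon_{jki}\,E_k B_i) - 4\pi\,\vec J\cdot\vec E \\
&= -\vec\nabla\cdot(\vec E\times\vec B) - 4\pi\,\vec J\cdot\vec E .
\end{aligned} \tag{4.46} $$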
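We then define the Poynting vector
$$ \vec S \equiv \frac{1}{4\pi}\,\vec E\times\vec B , \tag{4.47} $$
and so
$$ \frac12\,\frac{\partial}{\partial t}(\vec E^2 + \vec B^2) = -4\pi\,\vec\nabla\cdot\vec S - 4\pi\,\vec J\cdot\vec E , \tag{4.48} $$
since $\vec E\cdot\partial\vec E/\partial t = \tfrac12\,\partial/\partial t\,(\vec E^2)$, etc.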
We now assume that the $\vec E$ and $\vec B$ fields are confined to some finite region of space.
Integrating (4.48) over all space, we obtain
$$ \int \vec J\cdot\vec E\,d^3x + \frac{1}{8\pi}\,\frac{d}{dt}\int (\vec E^2 + \vec B^2)\,d^3x
 = -\int \vec\nabla\cdot\vec S\,d^3x = -\int_\Sigma \vec S\cdot d\vec\Sigma = 0 . \tag{4.49} $$
We get zero on the right-hand side because, having used the divergence theorem to convert
it to an integral over $\Sigma$, the "sphere at infinity," the integral vanishes since $\vec E$ and $\vec B$, and
hence $\vec S$, are assumed to vanish there.
If the current $\vec J$ is assumed to be due to the motion of a set of charges $q_a$ with 3-velocities
$\vec v_a$ and rest masses $m_a$, we shall have from (4.31) that
$$ \int \vec J\cdot\vec E\,d^3x = \sum_a q_a\,\vec v_a\cdot\vec E = \frac{d\mathcal{E}_{\rm mech}}{dt} , \tag{4.50} $$
where
$$ \mathcal{E}_{\rm mech} = \sum_a m_a\gamma_a \tag{4.51} $$
is the total mechanical energy for the set of particles, as defined in (3.7). Note that here
$$ \gamma_a \equiv (1 - v_a^2)^{-1/2} . \tag{4.52} $$
Thus we conclude that
$$ \frac{d}{dt}\left(\mathcal{E}_{\rm mech} + \frac{1}{8\pi}\int (\vec E^2 + \vec B^2)\,d^3x\right) = 0 . \tag{4.53} $$
This is the equation of total energy conservation. It says that the sum of the total mechanical
energy plus the energy contained in the electromagnetic fields is a constant. Thus we
interpret
$$ W \equiv \frac{1}{8\pi}\,(\vec E^2 + \vec B^2) \tag{4.54} $$
as the energy density of the electromagnetic field.
Returning now to equation (4.48), we can consider integrating it over just a finite volume
$V$, bounded by a closed 2-surface $\Sigma$. We will have
$$ \frac{d}{dt}\left(\mathcal{E}_{\rm mech} + \int_V W\,d^3x\right) = -\int_\Sigma \vec S\cdot d\vec\Sigma . \tag{4.55} $$
We now know that the left-hand side should be interpreted as the rate of change of total
energy in the volume $V$ and so clearly, since the total energy must be conserved, we should
interpret the right-hand side as the flux of energy passing through the boundary surface $\Sigma$.
Thus we see that the Poynting vector
$$ \vec S = \frac{1}{4\pi}\,\vec E\times\vec B \tag{4.56} $$
is to be interpreted as the energy flux across the boundary; i.e. the energy per unit area per
unit time.
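The definitions (4.54) and (4.56) are simple to evaluate for given field values. The sketch below, in the Gaussian units with c = 1 used here, takes perpendicular fields of equal magnitude as an illustrative choice; the function names and numbers are assumptions made for the example.

```python
import math

def cross(a, b):
    """Cartesian cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def poynting_vector(E, B):
    """S = (E x B) / (4 pi), as in (4.56)."""
    return tuple(c / (4.0 * math.pi) for c in cross(E, B))

def energy_density(E, B):
    """W = (E^2 + B^2) / (8 pi), as in (4.54)."""
    return (sum(x * x for x in E) + sum(x * x for x in B)) / (8.0 * math.pi)

# Illustrative fields: E along y, B along z, equal magnitudes.
E = (0.0, 2.0, 0.0)
B = (0.0, 0.0, 2.0)
S = poynting_vector(E, B)
W = energy_density(E, B)
```

For this configuration the energy flux points along $x$ and its magnitude equals the energy density, i.e. the field energy is transported at unit speed in these units.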
4.5 Energymomentum tensor
The discussion above was presented within the 3-dimensional framework. In this section
we shall give a 4-dimensional spacetime description, which involves the introduction of the
energy-momentum tensor. We shall begin with a rather general introduction. In order to
simplify this discussion, we shall first describe the construction of the energy-momentum
tensor for a scalar field $\phi(x^\mu)$. When we then apply these ideas to electromagnetism, we
shall need to make the rather simple generalisation to the case of a Lagrangian for the
vector field $A_\mu(x^\nu)$.
We begin by considering a Lagrangian density $\mathcal{L}$ for the scalar field $\phi$. We shall
assume that this depends on $\phi$, and on its first derivatives $\partial_\nu\phi$, but that it has no explicit
dependence (see footnote 9) on the spacetime coordinates $x^\mu$:
$$ \mathcal{L} = \mathcal{L}(\phi, \partial_\nu\phi) . \tag{4.57} $$
The action is then given by
$$ S = \int \mathcal{L}(\phi, \partial_\nu\phi)\,d^4x . \tag{4.58} $$
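The Euler-Lagrange equations for the scalar field then follow from requiring that the
action be stationary. Thus we have (see footnote 10)
$$ \begin{aligned}
\delta S &= \int \left[\frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi + \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\nu\delta\phi\right] d^4x \\
&= \int \left[\frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi - \partial_\nu\!\left(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\right)\delta\phi\right] d^4x
 + \int_\Sigma \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\delta\phi\,d\Sigma_\nu \\
&= \int \left[\frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi - \partial_\nu\!\left(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\right)\delta\phi\right] d^4x ,
\end{aligned} \tag{4.60} $$
where, in getting to the last line, we have as usual dropped the surface term integrated over
the boundary cylinder $\Sigma$, since we shall insist that $\delta\phi$ vanishes on $\Sigma$. Thus the requirement
that $\delta S = 0$ for all such $\delta\phi$ implies the Euler-Lagrange equations
$$ \frac{\partial\mathcal{L}}{\partial\phi} - \partial_\nu\!\left(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\right) = 0 . \tag{4.61} $$

Footnote 9: This is the analogue of a Lagrangian in classical mechanics that depends on the coordinates $q_i$ and
velocities $\dot q_i$, but which does not have explicit time dependence. Energy is conserved in a system described
by such a Lagrangian.

Footnote 10: Note that $\partial\mathcal{L}/\partial\partial_\nu\phi$ means taking the partial derivative of $\mathcal{L}$ viewed as a function of $\phi$ and $\partial_\mu\phi$, with
respect to $\partial_\nu\phi$. For example, if $\mathcal{L} = -\tfrac12\,(\partial_\mu\phi)(\partial^\mu\phi) + \tfrac12\,m^2\phi$, then
$$ \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi} = -(\partial^\mu\phi)\,\frac{\partial(\partial_\mu\phi)}{\partial\partial_\nu\phi} = -(\partial^\mu\phi)\,\delta^\nu_\mu = -\partial^\nu\phi . \tag{4.59} $$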
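Now consider the expression $\partial_\rho\mathcal{L} = \partial\mathcal{L}/\partial x^\rho$. Since we are assuming $\mathcal{L}$ has no explicit dependence on the spacetime coordinates, it follows that $\partial_\rho\mathcal{L}$ is given by the chain rule,
$$ \partial_\rho\mathcal{L} = \frac{\partial\mathcal{L}}{\partial\phi}\,\partial_\rho\phi + \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\partial_\nu\phi \,. \tag{4.62} $$
Now, using the Euler-Lagrange equations (4.61), we can write this as
$$ \begin{aligned}
\partial_\rho\mathcal{L} &= \partial_\nu\Big(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\Big)\,\partial_\rho\phi + \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\nu\partial_\rho\phi \,, \\
&= \partial_\nu\Big(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\phi\Big) \,,
\end{aligned} \tag{4.63} $$
and thus we have
$$ \partial_\nu\Big(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\phi - \delta^\nu_\rho\,\mathcal{L}\Big) = 0 \,. \tag{4.64} $$
We are therefore led to define the 2-index tensor
$$ T_\rho{}^\nu \equiv -\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\phi + \delta^\nu_\rho\,\mathcal{L} \,, \tag{4.65} $$
which then satisfies
$$ \partial_\nu T_\rho{}^\nu = 0 \,. \tag{4.66} $$
$T^{\mu\nu}$ is called the energy-momentum tensor.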
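
The conservation law (4.66) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it takes the free massive scalar Lagrangian $\mathcal{L} = -\frac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi) - \frac{1}{2}m^2\phi^2$ in 1+1 dimensions with signature $(-,+)$ (a choice made here, not fixed by the text), for which (4.65) gives $T^{00} = \frac{1}{2}\dot\phi^2 + \frac{1}{2}\phi'^2 + \frac{1}{2}m^2\phi^2$ and $T^{0x} = -\dot\phi\,\phi'$. The plane wave, the values of `M` and `K`, and all function names are hypothetical.

```python
import math

M = 0.7                              # field mass (assumed value)
K = 1.3                              # wavenumber (assumed value)
OMEGA = math.sqrt(K * K + M * M)     # dispersion relation of the free field equation


def phi(t, x):
    """Plane-wave solution of the assumed field equation."""
    return math.cos(K * x - OMEGA * t)


def dphi_dt(t, x):
    return OMEGA * math.sin(K * x - OMEGA * t)


def dphi_dx(t, x):
    return -K * math.sin(K * x - OMEGA * t)


def energy_density(t, x):
    """T^{00} = (1/2) phi_t^2 + (1/2) phi_x^2 + (1/2) m^2 phi^2."""
    return 0.5 * (dphi_dt(t, x) ** 2 + dphi_dx(t, x) ** 2 + (M * phi(t, x)) ** 2)


def energy_flux(t, x):
    """T^{0x} = -phi_t phi_x."""
    return -dphi_dt(t, x) * dphi_dx(t, x)


def divergence(t, x, h=1e-4):
    """Central-difference estimate of d_t T^{00} + d_x T^{0x}."""
    dT00_dt = (energy_density(t + h, x) - energy_density(t - h, x)) / (2 * h)
    dT0x_dx = (energy_flux(t, x + h) - energy_flux(t, x - h)) / (2 * h)
    return dT00_dt + dT0x_dx


if __name__ == "__main__":
    print(divergence(0.3, 1.1))      # vanishes up to finite-difference error
```

The divergence vanishes only because the plane wave obeys the field equation; replacing `OMEGA` by a value that violates the dispersion relation makes the residual non-zero, which mirrors the use of (4.61) in passing from (4.62) to (4.63).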
We saw previously that the equation $\partial_\mu J^\mu = 0$ for the 4-vector current density $J^\mu$ implies that there is a conserved charge
$$ Q = \int_{t={\rm const}} J^0\, d\Sigma_0 = \int_{t={\rm const}} J^\mu\, d\Sigma_\mu \,, \tag{4.67} $$
where $d\Sigma_0 = dx\,dy\,dz$, etc. By an identical argument, it follows that the equation $\partial_\nu T_\rho{}^\nu = 0$ implies that there is a conserved 4-vector:
$$ P^\mu \equiv \int_{t={\rm const}} T^{\mu 0}\, d\Sigma_0 = \int_{t={\rm const}} T^{\mu\nu}\, d\Sigma_\nu \,. \tag{4.68} $$
(Of course $T^{\mu\nu} = \eta^{\mu\rho}\, T_\rho{}^\nu$.) Thus we may check
$$ \begin{aligned}
\frac{dP^\mu}{dt} &= \partial_0 \int_{t={\rm const}} T^{\mu 0}\, d^3x = \int_{t={\rm const}} \partial_0 T^{\mu 0}\, d^3x = -\int_{t={\rm const}} \partial_i T^{\mu i}\, d^3x \,, \\
&= -\int_S T^{\mu i}\, dS_i = 0 \,,
\end{aligned} \tag{4.69} $$
where in the last line we have used the divergence theorem to turn the integral into a 2-dimensional integral over the boundary sphere $S$ at infinity. This vanishes since we shall assume the fields are zero at infinity.
Notice that $T^{00} = -T_0{}^0$ and from (4.65) we therefore have
$$ T^{00} = \frac{\partial\mathcal{L}}{\partial\partial_0\phi}\,\partial_0\phi - \mathcal{L} \,. \tag{4.70} $$
Now for a Lagrangian $L = L(q^i, \dot q^i)$ we have the canonical momentum $\pi_i = \partial L/\partial\dot q^i$, and the Hamiltonian
$$ H = \pi_i\, \dot q^i - L \,. \tag{4.71} $$
Since there is no explicit time dependence, $H$ is conserved, and is equal to the total energy of the system. Comparing with (4.70), we can recognise that $T^{00}$ is the energy density. From (4.68) we therefore have that
$$ P^0 = \int T^{00}\, d^3x \tag{4.72} $$
is the total energy. Since it is manifest from its construction that $P^\mu$ is a 4-vector, and since its 0 component is the energy, it follows that $P^\mu$ is the 4-momentum.
The essential point in the discussion above is that $P^\mu$ given in (4.68) should be conserved, which requires $\partial_\nu T_\rho{}^\nu = 0$. The quantity $T_\rho{}^\nu$ we constructed is not the unique tensor with this property. We can define a new one, according to
$$ T_\rho{}^\nu \longrightarrow T_\rho{}^\nu + \partial_\sigma \psi_\rho{}^{\nu\sigma} \,, \tag{4.73} $$
where $\psi_\rho{}^{\nu\sigma}$ is an arbitrary tensor that is antisymmetric in its last two indices,
$$ \psi_\rho{}^{\nu\sigma} = -\psi_\rho{}^{\sigma\nu} \,. \tag{4.74} $$
We shall take $\psi_\rho{}^{\nu\sigma}$ to vanish at spatial infinity.
The antisymmetry implies, since partial derivatives commute, that
$$ \partial_\nu\partial_\sigma \psi_\rho{}^{\nu\sigma} = 0 \,, \tag{4.75} $$
and hence that the modified energy-momentum tensor defined by (4.73) is conserved too. Furthermore, the modification to $T_\rho{}^\nu$ does not alter $P^\mu$, since, from (4.68), the extra term will be
$$ \begin{aligned}
\int_{t={\rm const}} \partial_\sigma \psi^{\mu\nu\sigma}\, d\Sigma_\nu &= \int_{t={\rm const}} \partial_\sigma \psi^{\mu 0\sigma}\, d\Sigma_0 \,, \\
&= \int_{t={\rm const}} \partial_i \psi^{\mu 0 i}\, d^3x \,, \\
&= \int_S \psi^{\mu 0 i}\, dS_i = 0 \,,
\end{aligned} \tag{4.76} $$
where $S$ is the sphere at spatial infinity. The modification to $P^\mu$ therefore vanishes since we are requiring that $\psi_\rho{}^{\nu\sigma}$ vanishes at spatial infinity.
The energy-momentum tensor can be pinned down uniquely by requiring that the four-dimensional angular momentum $M^{\mu\nu}$, defined by
$$ M^{\mu\nu} = \int (x^\mu\, dP^\nu - x^\nu\, dP^\mu) \tag{4.77} $$
be conserved. First, let us make a remark about angular momentum in four dimensions. In three dimensions, we define the angular momentum 3-vector as $\vec L = \vec r \times \vec p$. In other words,
$$ L_i = \epsilon_{ijk}\, x^j p^k = \tfrac{1}{2}\epsilon_{ijk}\,(x^j p^k - x^k p^j) = \tfrac{1}{2}\epsilon_{ijk}\, M^{jk} \,, \tag{4.78} $$
where $M^{jk} \equiv x^j p^k - x^k p^j$. Thus taking $M^{\mu\nu} = x^\mu p^\nu - x^\nu p^\mu$ in four dimensions is a plausible-looking generalisation. It should be noted that in a general dimension, angular momentum is described by a 2-index antisymmetric tensor; in other words, angular momentum is associated with a rotation in a 2-dimensional plane. It is a very special feature of three dimensions that we can use the $\epsilon_{ijk}$ tensor to map the 2-index antisymmetric tensor $M^{jk}$ into the vector $L_i = \frac{1}{2}\epsilon_{ijk} M^{jk}$. Put another way, a very special feature of three dimensions is that a rotation in the $(x, y)$ plane can equivalently be described as a rotation about the orthogonal (i.e. $z$) axis. In higher dimensions, rotations do not occur around axes, but rather, in 2-planes. It is amusing, therefore, to try to imagine what the analogue of an axle is for a higher-dimensional being!
Getting back to our discussion of angular momentum and the energy-momentum tensor in four dimensions, we are defining
$$ M^{\mu\nu} = \int (x^\mu\, dP^\nu - x^\nu\, dP^\mu) = \int (x^\mu T^{\nu\rho} - x^\nu T^{\mu\rho})\, d\Sigma_\rho \,. \tag{4.79} $$
By analogous arguments to those we used earlier, this will be conserved (i.e. $dM^{\mu\nu}/dt = 0$) if
$$ \partial_\rho\,(x^\mu T^{\nu\rho} - x^\nu T^{\mu\rho}) = 0 \,. \tag{4.80} $$
Distributing the derivative, we therefore have the requirement that
$$ \delta^\mu_\rho\, T^{\nu\rho} + x^\mu\, \partial_\rho T^{\nu\rho} - \delta^\nu_\rho\, T^{\mu\rho} - x^\nu\, \partial_\rho T^{\mu\rho} = 0 \,, \tag{4.81} $$
and hence, since $\partial_\rho T^{\mu\rho} = 0$, that $T^{\mu\nu}$ is symmetric,
$$ T^{\mu\nu} = T^{\nu\mu} \,. \tag{4.82} $$
Using the freedom to add $\partial_\sigma \psi^{\mu\nu\sigma}$ to $T^{\mu\nu}$, as we discussed earlier, it is always possible to arrange for $T^{\mu\nu}$ to be symmetric. From now on, we shall assume that this is done.
We already saw that $P^\mu = \int T^{\mu 0}\, d^3x$ is the 4-momentum, so $T^{00}$ is the energy density, and $T^{i0}$ is the 3-momentum density. Let us now look at the conservation equation $\partial_\nu T^{\mu\nu} = 0$ in more detail. Taking $\mu = 0$, we have $\partial_\nu T^{0\nu} = 0$, or
$$ \frac{\partial}{\partial t} T^{00} + \partial_j T^{0j} = 0 \,. \tag{4.83} $$
Integrating over a spatial 3-volume $V$ with boundary $S$, we therefore find
$$ \frac{\partial}{\partial t} \int_V T^{00}\, d^3x = -\int_V \partial_j T^{0j}\, d^3x = -\int_S T^{0j}\, dS_j \,. \tag{4.84} $$
The left-hand side is the rate of change of field energy in the volume $V$, and so we can deduce, from energy conservation, that $T^{0j}$ is the energy flux 3-vector. But since we are now working with a symmetric energy-momentum tensor, we have that $T^{0j} = T^{j0}$, and we already identified $T^{j0}$ as the 3-momentum density. Thus we have that
$$ \text{energy flux} = \text{momentum density} \,. \tag{4.85} $$
From the $\mu = i$ components of $\partial_\nu T^{\mu\nu} = 0$, we have
$$ \frac{\partial}{\partial t} T^{i0} + \partial_j T^{ij} = 0 \,, \tag{4.86} $$
and so, integrating over the 3-volume $V$, we get
$$ \frac{\partial}{\partial t} \int_V T^{i0}\, d^3x = -\int_V \partial_j T^{ij}\, d^3x = -\int_S T^{ij}\, dS_j \,. \tag{4.87} $$
The left-hand side is the rate of change of 3-momentum, and so we deduce that $T^{ij}$ is the 3-tensor of momentum flux density. It gives the $i$ component of 3-momentum that flows, per unit time, through the 2-surface perpendicular to the $x^j$ axis. $T^{ij}$ is sometimes called the 3-dimensional stress tensor.
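4.6 Energy-momentum tensor for the electromagnetic field
Recall that for a scalar field $\phi$, the original construction of the energy-momentum tensor $T_\rho{}^\nu$ (which we later modified by adding $\partial_\sigma \psi_\rho{}^{\nu\sigma}$ where $\psi_\rho{}^{\nu\sigma} = -\psi_\rho{}^{\sigma\nu}$) was given by
$$ T_\rho{}^\nu = -\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\phi + \delta^\nu_\rho\,\mathcal{L} \,. \tag{4.88} $$
If we have a set of $N$ scalar fields $\phi_a$, then it is easy to see that the analogous conserved tensor is
$$ T_\rho{}^\nu = -\sum_{a=1}^{N} \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi_a}\,\partial_\rho\phi_a + \delta^\nu_\rho\,\mathcal{L} \,. \tag{4.89} $$
A similar calculation shows that if we consider instead a vector field $A_\sigma$, with Lagrangian density $\mathcal{L}(A_\sigma, \partial_\nu A_\sigma)$, the construction will give a conserved energy-momentum tensor
$$ T_\rho{}^\nu = -\frac{\partial\mathcal{L}}{\partial\partial_\nu A_\sigma}\,\partial_\rho A_\sigma + \delta^\nu_\rho\,\mathcal{L} \,. \tag{4.90} $$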
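Let us apply this to the Lagrangian density for pure electrodynamics (without sources),
$$ \mathcal{L} = -\frac{1}{16\pi}\, F_{\mu\nu} F^{\mu\nu} \,. \tag{4.91} $$
We have
$$ \delta\mathcal{L} = -\frac{1}{8\pi}\, F^{\mu\nu}\, \delta F_{\mu\nu} = -\frac{1}{4\pi}\, F^{\mu\nu}\, \partial_\mu \delta A_\nu \,, \tag{4.92} $$
and so
$$ \frac{\partial\mathcal{L}}{\partial\partial_\mu A_\nu} = -\frac{1}{4\pi}\, F^{\mu\nu} \,. \tag{4.93} $$
Thus from (4.90) we find
$$ T_\rho{}^\nu = \frac{1}{4\pi}\, F^{\nu\sigma}\, \partial_\rho A_\sigma - \frac{1}{16\pi}\, \delta^\nu_\rho\, F^{\sigma\lambda} F_{\sigma\lambda} \,, \tag{4.94} $$
and so
$$ T^{\mu\nu} = \frac{1}{4\pi}\, F^{\nu\sigma}\, \partial^\mu A_\sigma - \frac{1}{16\pi}\, \eta^{\mu\nu}\, F^{\sigma\lambda} F_{\sigma\lambda} \,. \tag{4.95} $$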
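This expression is not symmetric in $\mu$ and $\nu$. However, following our previous discussion, we can add a term $\partial_\sigma \psi^{\mu\nu\sigma}$ to it, where $\psi^{\mu\nu\sigma} = -\psi^{\mu\sigma\nu}$, without upsetting the conservation condition $\partial_\nu T^{\mu\nu} = 0$. Specifically, we shall choose to add
$$ \begin{aligned}
\partial_\sigma \psi^{\mu\nu\sigma} &= -\frac{1}{4\pi}\, \partial_\sigma (A^\mu F^{\nu\sigma}) \,, \\
&= -\frac{1}{4\pi}\, (\partial_\sigma A^\mu) F^{\nu\sigma} - \frac{1}{4\pi}\, A^\mu\, \partial_\sigma F^{\nu\sigma} = -\frac{1}{4\pi}\, (\partial_\sigma A^\mu) F^{\nu\sigma} \,.
\end{aligned} \tag{4.96} $$
(The $\partial_\sigma F^{\nu\sigma}$ term drops out as a consequence of the source-free field equation.) This leads to the new energy-momentum tensor
$$ T^{\mu\nu} = \frac{1}{4\pi}\, F^{\nu\sigma} (\partial^\mu A_\sigma - \partial_\sigma A^\mu) - \frac{1}{16\pi}\, \eta^{\mu\nu}\, F^{\sigma\lambda} F_{\sigma\lambda} \,, \tag{4.97} $$
or, in other words,
$$ T^{\mu\nu} = \frac{1}{4\pi} \Big( F^\mu{}_\sigma F^{\nu\sigma} - \frac{1}{4}\, \eta^{\mu\nu}\, F^{\sigma\lambda} F_{\sigma\lambda} \Big) \,. \tag{4.98} $$
This is indeed manifestly symmetric in $\mu$ and $\nu$. From now on, it will be understood when we speak of the energy-momentum tensor for electrodynamics that this is the one we mean.
It is a straightforward exercise to verify directly, using the source-free Maxwell field equation and the Bianchi identity, that indeed $T^{\mu\nu}$ given by (4.98) is conserved, $\partial_\nu T^{\mu\nu} = 0$. Note that it has another simple property, namely that it is trace-free, in the sense that
$$ \eta_{\mu\nu}\, T^{\mu\nu} = 0 \,. \tag{4.99} $$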
This is easily seen from (4.98), as a consequence of the fact that $\eta^{\mu\nu}\eta_{\mu\nu} = 4$ in four dimensions. The trace-free property is related to a special feature of the Maxwell equations in four dimensions, known as conformal invariance.
Having obtained the energy-momentum tensor (4.98) for the electromagnetic field, it is instructive to look at its components from the three-dimensional point of view. First, recall that we showed earlier that
$$ F_{\sigma\lambda} F^{\sigma\lambda} = 2(\vec B^2 - \vec E^2) \,. \tag{4.100} $$
Then, we find
$$ \begin{aligned}
T^{00} &= \frac{1}{4\pi} \Big( F^0{}_\sigma F^{0\sigma} - \frac{1}{4}\, \eta^{00}\, F^{\sigma\lambda} F_{\sigma\lambda} \Big) \,, \\
&= \frac{1}{4\pi} \Big( F^{0i} F^{0i} + \frac{1}{2}\vec B^2 - \frac{1}{2}\vec E^2 \Big) \,, \\
&= \frac{1}{4\pi} \Big( \vec E^2 + \frac{1}{2}\vec B^2 - \frac{1}{2}\vec E^2 \Big) \,, \\
&= \frac{1}{8\pi}\, (\vec E^2 + \vec B^2) \,.
\end{aligned} \tag{4.101} $$
Thus $T^{00}$ is equal to the energy density $W$ that we introduced in (4.54).
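Now consider $T^{0i}$. Since $\eta^{0i} = 0$, we have
$$ \begin{aligned}
T^{0i} &= \frac{1}{4\pi}\, F^0{}_\sigma F^{i\sigma} = \frac{1}{4\pi}\, F^0{}_j F^{ij} \,, \\
&= \frac{1}{4\pi}\, E_j\, \epsilon_{ijk} B_k = S_i \,,
\end{aligned} \tag{4.102} $$
where $\vec S = \frac{1}{4\pi}\, \vec E \times \vec B$ is the Poynting vector introduced in (4.47). Thus $T^{0i}$ is the energy flux. As we remarked earlier, since we now have $T^{0i} = T^{i0}$, it can be equivalently interpreted as the 3-momentum density vector.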
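Finally, we consider the components $T^{ij}$. We have
$$ \begin{aligned}
T^{ij} &= \frac{1}{4\pi} \Big( F^i{}_\sigma F^{j\sigma} - \frac{1}{4}\, \eta^{ij}\, 2(\vec B^2 - \vec E^2) \Big) \,, \\
&= \frac{1}{4\pi} \Big( F^i{}_0 F^{j0} + F^i{}_k F^{jk} - \frac{1}{2}\, \delta_{ij} (\vec B^2 - \vec E^2) \Big) \,, \\
&= \frac{1}{4\pi} \Big( -E_i E_j + \epsilon_{ik\ell}\, \epsilon_{jkm}\, B_\ell B_m - \frac{1}{2}\, \delta_{ij} (\vec B^2 - \vec E^2) \Big) \,, \\
&= \frac{1}{4\pi} \Big( -E_i E_j + \delta_{ij}\, \vec B^2 - B_i B_j - \frac{1}{2}\, \delta_{ij} (\vec B^2 - \vec E^2) \Big) \,, \\
&= \frac{1}{4\pi} \Big( -E_i E_j - B_i B_j + \frac{1}{2}\, \delta_{ij} (\vec E^2 + \vec B^2) \Big) \,.
\end{aligned} \tag{4.103} $$
To summarise, we have
$$ T^{\mu\nu} = \begin{pmatrix} T^{00} & T^{0j} \\ T^{i0} & \sigma_{ij} \end{pmatrix} = \begin{pmatrix} W & S_j \\ S_i & \sigma_{ij} \end{pmatrix} \,, \tag{4.104} $$
where $W$ and $\vec S$ are the energy density and Poynting flux,
$$ W = \frac{1}{8\pi}\, (\vec E^2 + \vec B^2) \,, \qquad \vec S = \frac{1}{4\pi}\, \vec E \times \vec B \,, \tag{4.105} $$
and, reading off from the last line of (4.103) with $\frac{1}{8\pi}(\vec E^2 + \vec B^2) = W$,
$$ \sigma_{ij} \equiv \frac{1}{4\pi}\, (-E_i E_j - B_i B_j) + W\, \delta_{ij} \,. \tag{4.106} $$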
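
The component identifications (4.101), (4.102) and (4.103), together with the trace-free property (4.99), can be checked numerically for arbitrary field values. The following is a minimal sketch under assumptions that are not spelled out in this section: metric signature $(-,+,+,+)$, and field-strength components $F^{0i} = E_i$, $F^{ij} = \epsilon_{ijk} B_k$, chosen because they reproduce the intermediate steps of (4.101) to (4.103). All function names are hypothetical.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of the assumed metric, signature (-,+,+,+)


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2


def field_strength(E, B):
    """F^{mu nu} with the assumed conventions F^{0i} = E_i, F^{ij} = eps_{ijk} B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
        for j in range(3):
            F[i + 1][j + 1] = sum(levi_civita(i, j, k) * B[k] for k in range(3))
    return F


def stress_tensor(E, B):
    """T^{mu nu} from (4.98); indices are lowered with the diagonal metric."""
    F = field_strength(E, B)
    invariant = sum(F[s][l] ** 2 * ETA[s] * ETA[l] for s in range(4) for l in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            contraction = sum(F[m][s] * ETA[s] * F[n][s] for s in range(4))
            eta_up = ETA[m] if m == n else 0.0
            T[m][n] = (contraction - 0.25 * eta_up * invariant) / (4 * math.pi)
    return T


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


if __name__ == "__main__":
    E, B = (1.0, 2.0, 3.0), (-0.5, 0.4, 2.0)
    T = stress_tensor(E, B)
    W = (sum(c * c for c in E) + sum(c * c for c in B)) / (8 * math.pi)
    S = [c / (4 * math.pi) for c in cross(E, B)]
    print(T[0][0] - W, [T[0][i + 1] - S[i] for i in range(3)])
```

With these conventions the routine also reproduces the two special cases in the remarks that follow: parallel fields give a diagonal tensor, and perpendicular fields of equal magnitude give the null form with $T^{00} = T^{03} = T^{33} = W$.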
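Remarks
• Unless $\vec E$ and $\vec B$ are perpendicular and equal in magnitude, we can always choose a Lorentz frame where $\vec E$ and $\vec B$ are parallel at a point. (In the case that $\vec E$ and $\vec B$ are perpendicular (but unequal in magnitude), one or other of $\vec E$ or $\vec B$ will be zero, at the point, in the new Lorentz frame.)
Let the direction of $\vec E$ and $\vec B$ then be along $z$:
$$ \vec E = (0, 0, E) \,, \qquad \vec B = (0, 0, B) \,. \tag{4.107} $$
Then we have $\vec S = \frac{1}{4\pi}\, \vec E \times \vec B = 0$ and
$$ \sigma_{11} = \sigma_{22} = W \,, \qquad \sigma_{33} = -W \,, \qquad \sigma_{ij} = 0 \ \text{otherwise} \,, \tag{4.108} $$
and so $T^{\mu\nu}$ is diagonal, given by
$$ T^{\mu\nu} = \begin{pmatrix} W & 0 & 0 & 0 \\ 0 & W & 0 & 0 \\ 0 & 0 & W & 0 \\ 0 & 0 & 0 & -W \end{pmatrix} \,, \tag{4.109} $$
with $W = \frac{1}{8\pi}(E^2 + B^2)$.
• If $\vec E$ and $\vec B$ are perpendicular and $|\vec E| = |\vec B|$ at a point, then at that point we can choose axes so that
$$ \vec E = (E, 0, 0) \,, \qquad \vec B = (0, B, 0) = (0, E, 0) \,. \tag{4.110} $$
Then we have
$$ W = \frac{1}{4\pi}\, E^2 \,, \qquad \vec S = (0, 0, W) \,, \qquad \sigma_{11} = \sigma_{22} = 0 \,, \quad \sigma_{33} = W \,, \quad \sigma_{ij} = 0 \ \text{otherwise} \,, \tag{4.111} $$
and therefore $T^{\mu\nu}$ is given by
$$ T^{\mu\nu} = \begin{pmatrix} W & 0 & 0 & W \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ W & 0 & 0 & W \end{pmatrix} \,. \tag{4.112} $$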
4.7 Inclusion of massive charged particles
We now consider the energy-momentum tensor for a particle with rest mass $m$. We proceed by analogy with the construction of the 4-current density $J^\mu$ for charged non-interacting particles. Thus we define first a mass density, $\varepsilon$, for a point mass $m$ located at $\vec r = \vec r_0(t)$. This will simply be given by a 3-dimensional delta function, with strength $m$, located at the instantaneous position of the mass point:
$$ \varepsilon = m\, \delta^3(\vec r - \vec r_0(t)) \,. \tag{4.113} $$
The energy density $T^{00}$ for the particle will then be its mass density times the corresponding $\gamma$ factor, where $\gamma = (1 - v^2)^{-1/2}$, and $\vec v = d\vec r_0(t)/dt$ is the velocity of the particle. Since the coordinate time $t$ and the proper time $\tau$ in the frame of the particle are related, as usual, by $dt = \gamma\, d\tau$, we then have
$$ T^{00} = \varepsilon\, \frac{dt}{d\tau} \,. \tag{4.114} $$
The 3-momentum density will be
$$ T^{0i} = \varepsilon\gamma\, \frac{dx^i}{dt} = \varepsilon\, \frac{dt}{d\tau}\, \frac{dx^i}{dt} \,. \tag{4.115} $$
We can therefore write
$$ T^{0\nu} = \varepsilon\, \frac{dt}{d\tau}\, \frac{dx^\nu}{dt} = \varepsilon\, \frac{dx^0}{d\tau}\, \frac{dx^\nu}{dt} \,. \tag{4.116} $$
On general grounds of Lorentz covariance, it must therefore be that
$$ \begin{aligned}
T^{\mu\nu} &= \varepsilon\, \frac{dx^\mu}{d\tau}\, \frac{dx^\nu}{dt} \,, \\
&= \varepsilon\, \frac{dx^\mu}{d\tau}\, \frac{dx^\nu}{d\tau}\, \frac{d\tau}{dt} \,.
\end{aligned} \tag{4.117} $$
By writing it as we have done in the second line here, it becomes manifest that $T^{\mu\nu}$ for the particle is symmetric in $\mu$ and $\nu$.
Consider now a system consisting of a particle with mass $m$ and charge $q$, moving in an electromagnetic field. Clearly, since the particle interacts with the field, we should not expect either the energy-momentum tensor (4.98) for the electromagnetic field or the energy-momentum tensor (4.117) for the particle to be conserved separately. This is because energy and momentum are being exchanged between the particle and the field. We can expect, however, the total energy-momentum tensor for the system, i.e. the sum of (4.98) and (4.117), to be conserved.
In order to distinguish clearly between the various energy-momentum tensors, let us define
$$ T^{\mu\nu}_{\rm tot.} = T^{\mu\nu}_{\rm e.m.} + T^{\mu\nu}_{\rm part.} \,, \tag{4.118} $$
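where $T^{\mu\nu}_{\rm e.m.}$ and $T^{\mu\nu}_{\rm part.}$ are the energy-momentum tensors for the electromagnetic field and the particle respectively:
$$ \begin{aligned}
T^{\mu\nu}_{\rm e.m.} &= \frac{1}{4\pi} \Big( F^\mu{}_\sigma F^{\nu\sigma} - \frac{1}{4}\, \eta^{\mu\nu}\, F^{\sigma\lambda} F_{\sigma\lambda} \Big) \,, \\
T^{\mu\nu}_{\rm part.} &= \varepsilon\, \frac{dx^\mu}{d\tau}\, \frac{dx^\nu}{dt} \,,
\end{aligned} \tag{4.119} $$
where $\varepsilon = m\, \delta^3(\vec r - \vec r_0(t))$.
Consider $T^{\mu\nu}_{\rm e.m.}$ first. Taking the divergence, we find
$$ \begin{aligned}
\partial_\nu T^{\mu\nu}_{\rm e.m.} &= \frac{1}{4\pi} \Big( \partial_\nu F^\mu{}_\sigma\, F^{\nu\sigma} + F^\mu{}_\sigma\, \partial_\nu F^{\nu\sigma} - \frac{1}{2}\, F^{\sigma\lambda}\, \partial^\mu F_{\sigma\lambda} \Big) \,, \\
&= \frac{1}{4\pi} \Big( \partial_\nu F^\mu{}_\sigma\, F^{\nu\sigma} + F^\mu{}_\sigma\, \partial_\nu F^{\nu\sigma} + \frac{1}{2}\, F^{\sigma\lambda}\, \partial_\sigma F_\lambda{}^\mu + \frac{1}{2}\, F^{\sigma\lambda}\, \partial_\lambda F^\mu{}_\sigma \Big) \,, \\
&= \frac{1}{4\pi} \Big( \partial_\nu F^\mu{}_\sigma\, F^{\nu\sigma} - \frac{1}{2}\, F^{\sigma\lambda}\, \partial_\sigma F^\mu{}_\lambda - \frac{1}{2}\, F^{\lambda\sigma}\, \partial_\lambda F^\mu{}_\sigma + F^\mu{}_\sigma\, \partial_\nu F^{\nu\sigma} \Big) \,, \\
&= \frac{1}{4\pi}\, F^\mu{}_\sigma\, \partial_\nu F^{\nu\sigma} \,, \\
&= -F^\mu{}_\nu\, J^\nu \,.
\end{aligned} \tag{4.120} $$
In getting to the second line we used the Bianchi identity on the last term in the top line. The third line is obtained by swapping indices on a field strength in the terms with the $\frac{1}{2}$ factors, and this reveals that all except one term cancel, leading to the result. As expected, the energy-momentum tensor for the electromagnetic field by itself is not conserved when there are sources.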
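Now we want to show that this non-conservation is balanced by an equal and opposite non-conservation for the energy-momentum tensor of the particle, which is given in (4.119). We have
$$ \partial_\nu T^{\mu\nu}_{\rm part.} = \partial_\nu \Big( \varepsilon\, \frac{dx^\nu}{dt} \Big)\, \frac{dx^\mu}{d\tau} + \varepsilon\, \frac{dx^\nu}{dt}\, \partial_\nu \Big( \frac{dx^\mu}{d\tau} \Big) \,. \tag{4.121} $$
The first term is zero. This can be seen from the fact that the calculation is identical to the one which we used a while back in section 4.3 to show that the 4-current $J^\mu = \rho\, dx^\mu/dt$ for a charged particle is conserved. Thus we have
$$ \begin{aligned}
\partial_\nu T^{\mu\nu}_{\rm part.} &= \varepsilon\, \frac{dx^\nu}{dt}\, \partial_\nu \Big( \frac{dx^\mu}{d\tau} \Big) = \varepsilon\, \frac{dx^\nu}{dt}\, \partial_\nu U^\mu \,, \\
&= \varepsilon\, \frac{dU^\mu}{dt} \,.
\end{aligned} \tag{4.122} $$
By the Lorentz force equation $m\, dU^\mu/d\tau = q\, F^\mu{}_\nu U^\nu$, we have
$$ \varepsilon\, \frac{dU^\mu}{d\tau} = \rho\, F^\mu{}_\nu U^\nu = \rho\, F^\mu{}_\nu\, \frac{dx^\nu}{d\tau} \,, \tag{4.123} $$
and so
$$ \varepsilon\, \frac{dU^\mu}{dt} = \rho\, F^\mu{}_\nu\, \frac{dx^\nu}{dt} = F^\mu{}_\nu\, J^\nu \,, \tag{4.124} $$
since $J^\mu = \rho\, dx^\mu/dt$. Thus we conclude that
$$ \partial_\nu T^{\mu\nu}_{\rm part.} = F^\mu{}_\nu\, J^\nu \,, \tag{4.125} $$
and so, combining this with (4.120), we conclude that the total energy-momentum tensor for the particle plus electromagnetic field, defined in (4.118), is conserved,
$$ \partial_\nu T^{\mu\nu}_{\rm tot.} = 0 \,. \tag{4.126} $$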
5 Coulomb's Law
5.1 Potential of a point charge
Consider first a static point charge, for which the Maxwell equations reduce to
$$ \vec\nabla \times \vec E = 0 \,, \qquad \vec\nabla \cdot \vec E = 4\pi\rho \,. \tag{5.1} $$
The first equation implies, of course, that we can write
$$ \vec E = -\vec\nabla\phi \,, \tag{5.2} $$
and then the second equation implies that $\phi$ satisfies the Poisson equation
$$ \nabla^2\phi = -4\pi\rho \,. \tag{5.3} $$
If the point charge is located at the origin, and the charge is $e$, then the charge density $\rho$ is given by
$$ \rho = e\, \delta^3(\vec r) \,. \tag{5.4} $$
Away from the origin, (5.3) implies that $\phi$ should satisfy the Laplace equation,
$$ \nabla^2\phi = 0 \,, \qquad |\vec r| > 0 \,. \tag{5.5} $$
Since the charge density (5.4) is spherically symmetric, we can assume that $\phi$ will be spherically symmetric too, $\phi(\vec r) = \phi(r)$, where $r = |\vec r|$. From $r^2 = x_j x_j$ we deduce, by acting with $\partial_i$, that
$$ \partial_i r = \frac{x_i}{r} \,. \tag{5.6} $$
From this it follows by the chain rule that
$$ \partial_i \phi = \phi'\, \partial_i r = \phi'\, \frac{x_i}{r} \,, \tag{5.7} $$
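where $\phi' \equiv d\phi/dr$, and hence
$$ \begin{aligned}
\nabla^2\phi = \partial_i \partial_i \phi = \partial_i \Big( \phi'\, \frac{x_i}{r} \Big) &= \phi''\, \frac{x_i}{r}\, \frac{x_i}{r} + \phi'\, \frac{\partial_i x_i}{r} + \phi'\, x_i\, \partial_i \frac{1}{r} \,, \\
&= \phi'' + \frac{2}{r}\, \phi' \,.
\end{aligned} \tag{5.8} $$
Thus the Laplace equation (5.5) can be written as
$$ (r^2 \phi')' = 0 \,, \qquad r > 0 \,, \tag{5.9} $$
which integrates to give
$$ \phi = \frac{q}{r} \,, \tag{5.10} $$
where $q$ is a constant, and we have dropped an additive constant of integration by using the gauge freedom to choose $\phi(\infty) = 0$.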
To determine the constant $q$, we integrate the Poisson equation (5.3) over the interior $V_R$ of a sphere of radius $R$ centred on the origin, and use the divergence theorem:
$$ \begin{aligned}
\int_{V_R} \nabla^2\phi\, d^3x &= -4\pi e \int_{V_R} \delta^3(\vec r)\, d^3x = -4\pi e \,, \\
&= \int_{S_R} \vec\nabla\phi \cdot d\vec S = \int_{S_R} \partial_i \Big( \frac{q}{r} \Big)\, dS_i \,, \\
&= -q \int_{S_R} \frac{x_i\, dS_i}{r^3} = -q \int_{S_R} \frac{n_i\, dS_i}{R^2} \,,
\end{aligned} \tag{5.11} $$
where $S_R$ is the surface of the sphere of radius $R$ that bounds the volume $V_R$, and $n_i \equiv x_i/r$ is the outward-pointing unit vector. Clearly we have
$$ n_i\, dS_i = R^2\, d\Omega \,, \tag{5.12} $$
where $d\Omega$ is the area element on the unit-radius sphere, and so
$$ -q \int_{S_R} \frac{n_i\, dS_i}{R^2} = -q \int d\Omega = -4\pi q \,, \tag{5.13} $$
and so we conclude that $q$ is equal to $e$, the charge on the point charge at $r = 0$.
Note that if the point charge $e$ were located at $\vec r\,'$, rather than at the origin, then by trivially translating the coordinate system we will have the potential
$$ \phi(\vec r) = \frac{e}{|\vec r - \vec r\,'|} \,, \tag{5.14} $$
and this will satisfy
$$ \nabla^2\phi = -4\pi e\, \delta^3(\vec r - \vec r\,') \,. \tag{5.15} $$
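
The content of (5.11) to (5.15) is that the flux of $\vec E = -\vec\nabla\phi$ through a closed surface is $4\pi$ times the enclosed charge, wherever inside the surface the charge sits. The following sketch estimates that flux for the potential (5.14) with a midpoint rule over the sphere. The function name, grid sizes and test positions are hypothetical choices made for illustration.

```python
import math


def flux_through_sphere(charge, position, radius, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the outward flux of E through a sphere centred on the origin.

    E is the Coulomb field of a point charge at `position`, i.e. minus the gradient of (5.14).
    """
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            n = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)   # outward unit normal
            sep = [radius * n[i] - position[i] for i in range(3)]       # r - r'
            dist = math.sqrt(sum(s * s for s in sep))
            e_dot_n = charge * sum(sep[i] * n[i] for i in range(3)) / dist ** 3
            total += e_dot_n * radius ** 2 * sin_t * d_theta * d_phi
    return total


if __name__ == "__main__":
    inside = flux_through_sphere(1.5, (0.3, 0.0, 0.2), 1.0)
    outside = flux_through_sphere(1.5, (0.0, 0.0, 3.0), 1.0)
    print(inside / (4 * math.pi), outside)
```

Moving the charge around inside the sphere leaves the estimate at $4\pi e$ to within quadrature error, while a charge outside the sphere contributes no net flux, in line with the delta-function source in (5.15).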
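5.2 Electrostatic energy
In general, the energy density of an electromagnetic field is given by $W = \frac{1}{8\pi}(\vec E^2 + \vec B^2)$. A purely electrostatic system therefore has a field energy $U$ given by
$$ \begin{aligned}
U = \int W\, d^3x &= \frac{1}{8\pi} \int \vec E^2\, d^3x \,, \\
&= -\frac{1}{8\pi} \int \vec E \cdot \vec\nabla\phi\, d^3x \,, \\
&= -\frac{1}{8\pi} \int \vec\nabla \cdot (\vec E\, \phi)\, d^3x + \frac{1}{8\pi} \int (\vec\nabla \cdot \vec E)\, \phi\, d^3x \,, \\
&= -\frac{1}{8\pi} \int_S \vec E\, \phi \cdot d\vec S + \frac{1}{2} \int \rho\, \phi\, d^3x \,, \\
&= \frac{1}{2} \int \rho\, \phi\, d^3x \,.
\end{aligned} \tag{5.16} $$
Note that the surface integral over the sphere at infinity gives zero because the electric field is assumed to die away to zero there. Thus we conclude that the electrostatic field energy is given by
$$ U = \frac{1}{2} \int \rho\, \phi\, d^3x \,. \tag{5.17} $$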
We can apply this formula to a system of $N$ charges $q_a$, located at points $\vec r_a$, for which we shall have
$$ \rho = \sum_{a=1}^{N} q_a\, \delta^3(\vec r - \vec r_a) \,. \tag{5.18} $$
However, a naive application of (5.17) would give nonsense, since we find
$$ U = \frac{1}{2} \sum_{a=1}^{N} q_a \int \delta^3(\vec r - \vec r_a)\, \phi(\vec r)\, d^3x = \frac{1}{2} \sum_{a=1}^{N} q_a\, \phi(\vec r_a) \,, \tag{5.19} $$
where $\phi(\vec r)$ is obtained by superposing potentials of the form (5.14),
$$ \phi(\vec r) = \sum_{a=1}^{N} \frac{q_a}{|\vec r - \vec r_a|} \,. \tag{5.20} $$
This means that (5.19) will give infinity since $\phi(\vec r)$, not unreasonably, diverges at the location of each point charge.
This is the classic "self-energy" problem, which one encounters even for a single point charge. There is no totally satisfactory way around this in classical electromagnetism, and so one has to adopt a "fudge." The fudge consists of observing that the true self-energy of a charge, whatever that might mean, is a constant. Naively, it appears to be an infinite constant, but that is clearly the result of making the idealised assumption that the charge is literally located at a single point. In any case, one can argue that the constant self-energy will not be observable, as far as energy-conservation considerations are concerned, and so one might as well just drop it for now. Thus the way to make sense of the ostensibly divergent energy (5.19) for the system of point charges is to replace $\phi(\vec r_a)$, which means the potential at $\vec r = \vec r_a$ due to all the charges, by $\phi_a$, which is defined to be the potential at $\vec r = \vec r_a$ due to all the charges except the charge $q_a$ that is itself located at $\vec r = \vec r_a$. Thus we have
$$ \phi_a \equiv \sum_{b \ne a} \frac{q_b}{|\vec r_a - \vec r_b|} \,, \tag{5.21} $$
and so (5.19) is now interpreted to mean that the total energy of the system of charges is
$$ U = \frac{1}{2} \sum_a \sum_{b \ne a} \frac{q_a q_b}{|\vec r_a - \vec r_b|} \,. \tag{5.22} $$
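
The double sum (5.22) translates directly into code. The sketch below evaluates it literally, including the factor of $\frac{1}{2}$ that compensates for counting each pair twice; the function name and the example configurations are hypothetical, and Gaussian units are used as in the text.

```python
import math


def interaction_energy(charges, positions):
    """Electrostatic energy (5.22) of point charges, with the self-energy terms b = a omitted."""
    total = 0.0
    for a, (q_a, r_a) in enumerate(zip(charges, positions)):
        for b, (q_b, r_b) in enumerate(zip(charges, positions)):
            if a != b:                       # skip the divergent self-energy term
                total += q_a * q_b / math.dist(r_a, r_b)
    return 0.5 * total                       # each pair was counted twice


if __name__ == "__main__":
    print(interaction_energy([1.0, 2.0], [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```

For two charges the result reduces to $q_1 q_2/|\vec r_1 - \vec r_2|$, which is the familiar Coulomb interaction energy and a quick sanity check on the factor of $\frac{1}{2}$.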
5.3 Field of a uniformly moving charge
Suppose a charge $e$ is moving with uniform velocity $\vec v$ in the Lorentz frame $S$. We may transform to a frame $S'$, moving with velocity $\vec v$ relative to $S$, in which the charge is at rest. For convenience, we shall choose the origin of axes so that the charge is located at the origin of the frame $S'$.
It follows that in the frame $S'$, the field due to the charge can be described purely by the electric scalar potential $\phi'$:
$$ \text{In } S': \qquad \phi' = \frac{e}{r'} \,, \qquad \vec A\,' = 0 \,. \tag{5.23} $$
(Note that the primes here all signify that the quantities are those of the primed frame $S'$.)
We know that $A^\mu = (\phi, \vec A)$ is a 4-vector, and so the components $A^\mu$ transform under Lorentz boosts in exactly the same way as the components of $x^\mu$. Thus we shall have
$$ \phi' = \gamma\, (\phi - \vec v \cdot \vec A) \,, \qquad \vec A\,' = \vec A + \frac{\gamma - 1}{v^2}\, (\vec v \cdot \vec A)\, \vec v - \gamma\, \vec v\, \phi \,, \tag{5.24} $$
where $\gamma = (1 - v^2)^{-1/2}$. Clearly the inverse Lorentz transformation is obtained by sending $\vec v \to -\vec v$, and so we shall have
$$ \phi = \gamma\, (\phi' + \vec v \cdot \vec A\,') \,, \qquad \vec A = \vec A\,' + \frac{\gamma - 1}{v^2}\, (\vec v \cdot \vec A\,')\, \vec v + \gamma\, \vec v\, \phi' \,. \tag{5.25} $$
From (5.23), we therefore find that the potentials in the frame $S$, in which the particle is moving with velocity $\vec v$, are given by
$$ \phi = \gamma\, \phi' = \frac{e\gamma}{r'} \,, \qquad \vec A = \gamma\, \vec v\, \phi' = \frac{e\gamma\, \vec v}{r'} \,. \tag{5.26} $$
Note that we still have $r'$ appearing in the denominator, which we would now like to express in terms of the unprimed coordinates.
Suppose, for example, that we orient the axes so that $\vec v$ lies along the $x$ direction. Then we shall have
$$ x' = \gamma\, (x - vt) \,, \qquad y' = y \,, \qquad z' = z \,, \tag{5.27} $$
and so
$$ r'^2 = x'^2 + y'^2 + z'^2 = \gamma^2 (x - vt)^2 + y^2 + z^2 \,. \tag{5.28} $$
It follows therefore from (5.26) that the scalar and 3-vector potentials in the frame $S$ are given by
$$ \phi = \frac{e}{R_*} \,, \qquad \vec A = \frac{e\, \vec v}{R_*} \,, \tag{5.29} $$
where we have defined
$$ R_*^2 \equiv (x - vt)^2 + (1 - v^2)(y^2 + z^2) \,. \tag{5.30} $$
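
The step from (5.26) to (5.29) uses $r' = \gamma R_*$, which follows from (5.28) and (5.30). A small sketch can confirm that the two routes give the same potentials at an arbitrary event. The function names and sample values are hypothetical; units with $c = 1$ and $\vec v$ along $x$ are assumed, as in the text.

```python
import math


def potentials_via_rest_frame(e, v, t, x, y, z):
    """phi and A in frame S from (5.26), with r' expressed through (5.27) and (5.28)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    x_prime = gamma * (x - v * t)
    r_prime = math.sqrt(x_prime ** 2 + y * y + z * z)
    phi = e * gamma / r_prime
    return phi, (v * phi, 0.0, 0.0)          # A = gamma v phi' = v phi, along x


def potentials_closed_form(e, v, t, x, y, z):
    """phi and A in frame S from (5.29), using R_* defined in (5.30)."""
    r_star = math.sqrt((x - v * t) ** 2 + (1.0 - v * v) * (y * y + z * z))
    return e / r_star, (e * v / r_star, 0.0, 0.0)


if __name__ == "__main__":
    args = (1.0, 0.6, 0.5, 2.0, -1.0, 0.7)   # e, v, t, x, y, z
    print(potentials_via_rest_frame(*args))
    print(potentials_closed_form(*args))
```

Both functions return the same values up to rounding, which confirms that the factor of $\gamma$ in (5.26) is absorbed exactly by the change from $r'$ to $R_*$.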
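The electric and magnetic fields can now be calculated in the standard way from $\phi$ and $\vec A$, as in (2.8). Alternatively, and equivalently, we can first calculate $\vec E\,'$ and $\vec B\,'$ in the primed frame, and then Lorentz transform these back to the unprimed frame. In the frame $S'$, we shall of course have
$$ \vec E\,' = \frac{e\, \vec r\,'}{r'^3} \,, \qquad \vec B\,' = 0 \,. \tag{5.31} $$
The transformation to the unprimed frame is then given by inverting the standard results (2.51) and (2.52) that express $\vec E\,'$ and $\vec B\,'$ in terms of $\vec E$ and $\vec B$. Again, this is simply achieved by interchanging the primed and unprimed fields, and sending $\vec v$ to $-\vec v$. This gives
$$ \begin{aligned}
\vec E &= \gamma\, (\vec E\,' - \vec v \times \vec B\,') - \frac{\gamma - 1}{v^2}\, (\vec v \cdot \vec E\,')\, \vec v \,, \\
\vec B &= \gamma\, (\vec B\,' + \vec v \times \vec E\,') - \frac{\gamma - 1}{v^2}\, (\vec v \cdot \vec B\,')\, \vec v \,,
\end{aligned} \tag{5.32} $$
and so from (5.31), we find that $\vec E$ and $\vec B$ in the frame $S$ are given by
$$ \begin{aligned}
\vec E &= \frac{e\gamma\, \vec r\,'}{r'^3} - \frac{\gamma - 1}{v^2}\, \frac{e\, \vec v \cdot \vec r\,'}{r'^3}\, \vec v \,, \\
\vec B &= \gamma\, \vec v \times \vec E\,' = \frac{e\gamma\, \vec v \times \vec r\,'}{r'^3} \,.
\end{aligned} \tag{5.33} $$
Let us again assume that we orient the axes so that $\vec v$ lies along the $x$ direction. Then from the above we find that
$$ E_x = \frac{e\, x'}{r'^3} \,, \qquad E_y = \frac{e\gamma\, y'}{r'^3} \,, \qquad E_z = \frac{e\gamma\, z'}{r'^3} \,, \tag{5.34} $$
and so
$$ E_x = \frac{e\gamma\, (x - vt)}{r'^3} \,, \qquad E_y = \frac{e\gamma\, y}{r'^3} \,, \qquad E_z = \frac{e\gamma\, z}{r'^3} \,. \tag{5.35} $$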
Since the charge is located at the point $(vt, 0, 0)$ in the frame $S$, it follows that the vector from the charge to the point $\vec r = (x, y, z)$ is
$$ \vec R = (x - vt, y, z) \,. \tag{5.36} $$
From (5.35), we then find that the electric field is given by
$$ \vec E = \frac{e\gamma\, \vec R}{r'^3} = \frac{e\, (1 - v^2)\, \vec R}{R_*^3} \,, \tag{5.37} $$
where $R_*$ was defined in (5.30).
If we now define $\theta$ to be the angle between the vector $\vec R$ and the $x$ axis, then the coordinates $(x, y, z)$ of the observation point $P$ will be such that
$$ y^2 + z^2 = R^2 \sin^2\theta \,, \qquad \text{where } R^2 = |\vec R|^2 = (x - vt)^2 + y^2 + z^2 \,. \tag{5.38} $$
This implies, from (5.30), that
$$ R_*^2 = R^2 - v^2 (y^2 + z^2) = R^2\, (1 - v^2 \sin^2\theta) \,, \tag{5.39} $$
and so the electric field due to the moving charge is
$$ \vec E = \frac{e\, \vec R}{R^3}\, \frac{1 - v^2}{(1 - v^2 \sin^2\theta)^{3/2}} \,. \tag{5.40} $$
For an observation point $P$ located on the $x$ axis, the electric field will be $E_\parallel$ (parallel to the $x$ axis), and given by setting $\theta = 0$ in (5.40). On the other hand, we can define the electric field $E_\perp$ in the $(y, z)$ plane (corresponding to $\theta = \pi/2$). From (5.40) we therefore have
$$ E_\parallel = \frac{e\, (1 - v^2)}{R^2} \,, \qquad E_\perp = \frac{e\, (1 - v^2)^{-1/2}}{R^2} \,. \tag{5.41} $$
Note that $E_\parallel$ has the smallest magnitude, and $E_\perp$ has the largest magnitude, that $\vec E$ attains as a function of $\theta$.
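When the velocity is very small, the electric field is (as one would expect) more or less independent of $\theta$. However, as $v$ approaches 1 (the speed of light), we find that $E_\parallel$ decreases to zero, while $E_\perp$ diverges. Thus for $v$ near to the speed of light the electric field is very sharply peaked around $\theta = \pi/2$. If we set
$$ \theta = \frac{\pi}{2} - \psi \,, \tag{5.42} $$
then, for small $\psi$, we have $1 - v^2\cos^2\psi \approx 1 - v^2 + v^2\psi^2 \approx 1 - v^2 + \psi^2$, and so
$$ |\vec E| = \frac{e\, (1 - v^2)}{R^2\, (1 - v^2 \cos^2\psi)^{3/2}} \approx \frac{e\, (1 - v^2)}{R^2\, (1 - v^2 + \psi^2)^{3/2}} \tag{5.43} $$
if $v \approx 1$. Thus the angular width of the peak is of the order of
$$ \psi \sim \sqrt{1 - v^2} \,. \tag{5.44} $$
We saw previously that the magnetic field in the frame $S$ is given by $\vec B = \gamma\, \vec v \times \vec E\,'$. From (5.33) we have $\vec v \times \vec E = \gamma\, \vec v \times \vec E\,'$, and therefore
$$ \vec B = \vec v \times \vec E = \frac{e\, (1 - v^2)\, \vec v \times \vec R}{R_*^3} \,. \tag{5.45} $$
Note that if $|\vec v| \ll 1$ we get the usual non-relativistic expressions
$$ \vec E \approx \frac{e\, \vec R}{R^3} \,, \qquad \vec B \approx \frac{e\, \vec v \times \vec R}{R^3} \,. \tag{5.46} $$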
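
The two routes to the field of the moving charge, the component form (5.35) written with $r'$ and the closed form (5.37) written with $R_*$, can be compared numerically, together with the limiting values (5.41) and the magnetic field (5.45). The sketch below assumes $c = 1$ and $\vec v$ along $x$, as in the text; the function names and sample values are hypothetical.

```python
import math


def field_via_rest_frame(e, v, t, x, y, z):
    """E in frame S from (5.35): the boosted Coulomb field, written with r'."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    x_prime = gamma * (x - v * t)
    r_prime_cubed = (x_prime ** 2 + y * y + z * z) ** 1.5
    return tuple(e * gamma * c / r_prime_cubed for c in (x - v * t, y, z))


def field_closed_form(e, v, t, x, y, z):
    """E from (5.37) and B = v x E from (5.45), for v = (v, 0, 0)."""
    R = (x - v * t, y, z)
    r_star_cubed = (R[0] ** 2 + (1.0 - v * v) * (y * y + z * z)) ** 1.5
    E = tuple(e * (1.0 - v * v) * c / r_star_cubed for c in R)
    B = (0.0, -v * E[2], v * E[1])           # cross product of (v, 0, 0) with E
    return E, B


if __name__ == "__main__":
    e, v = 1.0, 0.8
    E_par, _ = field_closed_form(e, v, 0.0, 2.0, 0.0, 0.0)    # theta = 0
    E_perp, _ = field_closed_form(e, v, 0.0, 0.0, 2.0, 0.0)   # theta = pi/2
    print(E_par[0], E_perp[1])
```

At $v = 0.8$ the ratio $E_\perp / E_\parallel = (1 - v^2)^{-3/2}$ is already about 4.6, which gives a feel for how quickly the field flattens into the plane transverse to the motion.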
5.4 Motion of a charge in a Coulomb potential
We shall consider a particle of mass $m$ and charge $e$ moving in the field of a static charge $Q$. The classic "Newtonian" result is very familiar, with the orbit of the particle being either an ellipse, a parabola or a hyperbola, depending on the charges and the orbital parameters. In this section we shall consider the fully relativistic problem, when the velocity of the particle is not necessarily small compared with the speed of light.
The Lagrangian for the system is given by (2.79), with $\phi = Q/r$ and $\vec A = 0$:
$$ L = -m\, (1 - \dot x^i \dot x^i)^{1/2} - \frac{eQ}{r} \,, \tag{5.47} $$
where $\dot x^i = dx^i/dt$, and $r^2 = x^i x^i$. The charges occur in the combination $eQ$ throughout the calculation, and so for convenience we shall define
$$ q \equiv eQ \,. \tag{5.48} $$
It is convenient to introduce spherical polar coordinates in the standard way,
$$ x = r \sin\theta \cos\varphi \,, \qquad y = r \sin\theta \sin\varphi \,, \qquad z = r \cos\theta \,, \tag{5.49} $$
and then the Lagrangian becomes
$$ L = -m\, (1 - \dot r^2 - r^2 \dot\theta^2 - r^2 \sin^2\theta\, \dot\varphi^2)^{1/2} - \frac{q}{r} \,. \tag{5.50} $$
The Lagrangian is of the form $L = L(q_i, \dot q_i)$ for coordinates $q_i$ and velocities $\dot q_i$ (don't confuse the coordinates $q_i$ with the product of charges $q = eQ$!). The Euler-Lagrange equations are
$$ \frac{\partial L}{\partial q_i} - \frac{d}{dt} \Big( \frac{\partial L}{\partial \dot q_i} \Big) = 0 \,. \tag{5.51} $$
Note that if $L$ is independent of a particular coordinate, say $q_j$, there is an associated conserved quantity
$$ \frac{\partial L}{\partial \dot q_j} \,. \tag{5.52} $$
The Euler-Lagrange equation for $\theta$ gives
$$ r^2 \sin\theta \cos\theta\, \dot\varphi^2\, (1 - \dot r^2 - r^2 \dot\theta^2 - r^2 \sin^2\theta\, \dot\varphi^2)^{-1/2} - \frac{d}{dt} \Big( r^2 \dot\theta\, (1 - \dot r^2 - r^2 \dot\theta^2 - r^2 \sin^2\theta\, \dot\varphi^2)^{-1/2} \Big) = 0 \,. \tag{5.53} $$
It can be seen that a solution to this equation is to take $\theta = \pi/2$, and $\dot\theta = 0$. In other words, if the particle starts out moving in the $\theta = \pi/2$ plane (i.e. the $(x, y)$ plane at $z = 0$), it will remain in this plane. This is just the familiar result that the motion of a particle moving under a central force lies in a plane. We may therefore assume now, without loss of generality, that $\theta = \pi/2$ for all time. We are left with just $r$ and $\varphi$ as polar coordinates in the $(x, y)$ plane. The Lagrangian for the reduced system, where we can consistently set $\theta = \pi/2$, is then simply
$$ L = -m\, (1 - \dot r^2 - r^2 \dot\varphi^2)^{1/2} - \frac{q}{r} \,. \tag{5.54} $$
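We note that $\partial L/\partial\varphi = 0$, and so there is a conserved quantity
$$ \frac{\partial L}{\partial \dot\varphi} = m r^2 \dot\varphi\, (1 - \dot r^2 - r^2 \dot\varphi^2)^{-1/2} = \ell \,, \tag{5.55} $$
where $\ell$ is a constant. Since $(1 - \dot r^2 - r^2 \dot\varphi^2)^{-1/2} = \gamma$, we simply have
$$ m \gamma r^2 \dot\varphi = \ell \,. \tag{5.56} $$
Note that we can also write this as
$$ m r^2\, \frac{d\varphi}{d\tau} = \ell \,. \tag{5.57} $$
Since the Lagrangian does not depend explicitly on $t$, the total energy $\mathcal{E}$ is also conserved. Thus
$$ \mathcal{E} = H = \sqrt{\vec p^{\,2} + m^2} + \frac{q}{r} \tag{5.58} $$
is a constant. Here,
$$ \begin{aligned}
\vec p^{\,2} = m^2 \gamma^2\, \vec v^{\,2} &= m^2 \gamma^2 \dot r^2 + m^2 \gamma^2 r^2 \dot\varphi^2 \,, \\
&= m^2 \Big( \frac{dr}{d\tau} \Big)^2 + m^2 r^2 \Big( \frac{d\varphi}{d\tau} \Big)^2 \,,
\end{aligned} \tag{5.59} $$
since, as usual, coordinate time and proper time are related by $d\tau = dt/\gamma$.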
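We therefore have
$$ \Big( \mathcal{E} - \frac{q}{r} \Big)^2 = \vec p^{\,2} + m^2 = m^2 \Big( \frac{dr}{d\tau} \Big)^2 + m^2 r^2 \Big( \frac{d\varphi}{d\tau} \Big)^2 + m^2 \,. \tag{5.60} $$
We now perform the standard change of variables in orbit calculations, and let
$$ r = \frac{1}{u} \,. \tag{5.61} $$
This implies
$$ \frac{dr}{d\tau} = -\frac{1}{u^2}\, \frac{du}{d\tau} = -\frac{1}{u^2}\, \frac{du}{d\varphi}\, \frac{d\varphi}{d\tau} = -\frac{\ell}{m}\, u' \,, \tag{5.62} $$
where we have used (5.57) and also we have defined
$$ u' \equiv \frac{du}{d\varphi} \,. \tag{5.63} $$
It now follows that (5.60) becomes
$$ (\mathcal{E} - q u)^2 = \ell^2 u'^2 + \ell^2 u^2 + m^2 \,. \tag{5.64} $$
This ordinary differential equation can be solved in order to find $u$ as a function of $\varphi$, and hence $r$ as a function of $\varphi$. The solution determines the shape of the orbit of the particle around the fixed charge $Q$.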
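Rewriting (5.64) as
$$ \ell^2 u'^2 = \Big( u \sqrt{q^2 - \ell^2} - \frac{q\, \mathcal{E}}{\sqrt{q^2 - \ell^2}} \Big)^2 - m^2 - \frac{\mathcal{E}^2 \ell^2}{q^2 - \ell^2} \,, \tag{5.65} $$
we see that it is convenient to make a change of variable from $u$ to $w$, defined by
$$ u \sqrt{q^2 - \ell^2} - \frac{q\, \mathcal{E}}{\sqrt{q^2 - \ell^2}} = \pm \sqrt{m^2 + \frac{\mathcal{E}^2 \ell^2}{q^2 - \ell^2}}\; \cosh w \,, \tag{5.66} $$
where the $+$ sign is chosen if $q < 0$, and the $-$ sign if $q > 0$. We can then integrate (5.65), to obtain
$$ \pm \frac{\ell}{\sqrt{q^2 - \ell^2}}\, w = \varphi \,, \tag{5.67} $$
(making a convenient choice, without loss of generality, for the constant of integration), and hence we have
$$ \sqrt{q^2 - \ell^2}\; u = \pm \sqrt{m^2 + \frac{\mathcal{E}^2 \ell^2}{q^2 - \ell^2}}\; \cosh \Big[ \Big( \frac{q^2}{\ell^2} - 1 \Big)^{1/2} \varphi \Big] + \frac{q\, \mathcal{E}}{\sqrt{q^2 - \ell^2}} \,. \tag{5.68} $$
In other words, the orbit is given, in terms of $r = r(\varphi)$, by
$$ \frac{q^2 - \ell^2}{r} = \pm \sqrt{\mathcal{E}^2 \ell^2 + m^2 (q^2 - \ell^2)}\; \cosh \Big[ \Big( \frac{q^2}{\ell^2} - 1 \Big)^{1/2} \varphi \Big] + q\, \mathcal{E} \,. \tag{5.69} $$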
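The solution (5.69) is presented for the case where $|\ell| < |q|$. If instead $|\ell| > |q|$, it becomes
$$ \frac{\ell^2 - q^2}{r} = \sqrt{\mathcal{E}^2 \ell^2 - m^2 (\ell^2 - q^2)}\; \cos \Big[ \Big( 1 - \frac{q^2}{\ell^2} \Big)^{1/2} \varphi \Big] - q\, \mathcal{E} \,. \tag{5.70} $$
Finally, if $|\ell| = |q|$, it is easier to go back to the equation (5.65) and re-solve it directly in this case, leading to
$$ \frac{2 q\, \mathcal{E}}{r} = \mathcal{E}^2 - m^2 - \mathcal{E}^2 \varphi^2 \,. \tag{5.71} $$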
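
As a consistency check, the trigonometric solution (5.70) can be substituted back into the first-order orbit equation (5.64). The sketch below does this numerically, with $u' = du/d\varphi$ estimated by a central difference. The parameter values (an attractive case with $|\ell| > |q|$) and the function names are hypothetical.

```python
import math


def orbit_u(phi, m, q, ell, energy):
    """u = 1/r from the solution (5.70), valid for |ell| > |q|."""
    d = ell * ell - q * q
    amplitude = math.sqrt(energy ** 2 * ell ** 2 - m * m * d)
    k = math.sqrt(1.0 - q * q / (ell * ell))
    return (amplitude * math.cos(k * phi) - q * energy) / d


def orbit_residual(phi, m, q, ell, energy, h=1e-5):
    """Left-hand side minus right-hand side of (5.64), with u' from a central difference."""
    u = orbit_u(phi, m, q, ell, energy)
    du = (orbit_u(phi + h, m, q, ell, energy) - orbit_u(phi - h, m, q, ell, energy)) / (2 * h)
    return (energy - q * u) ** 2 - ell ** 2 * du ** 2 - ell ** 2 * u ** 2 - m ** 2


if __name__ == "__main__":
    params = dict(m=1.0, q=-0.3, ell=1.0, energy=0.98)   # bound orbit, attractive force
    for phi in (0.0, 0.7, 2.0, 5.5):
        print(phi, orbit_residual(phi, **params))
```

The residual stays at the level of the finite-difference error for every angle, so (5.70) does solve (5.64) for these parameters. The same substitution can be repeated with `math.cosh` to test (5.69) when $|\ell| < |q|$.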
The situation described above for relativistic orbits should be contrasted with what happens in the non-relativistic case. In this limit, the Lagrangian (after restricting to motion in the $(x, y)$ plane again) is simply given by
$$ L = \frac{1}{2}\, m\, (\dot r^2 + r^2 \dot\varphi^2) - \frac{q}{r} \,. \tag{5.72} $$
Note that this can be obtained from the relativistic Lagrangian (5.54) we studied above, by taking $\dot r$ and $r \dot\varphi$ to be small compared to 1 (the speed of light), and then expanding the square root to quadratic order in velocities. As discussed previously, one can ignore the leading-order term $-m$ in the expansion, since this is just a constant (the rest-mass energy of the particle) and so it does not enter in the Euler-Lagrange equations. The analysis of the Euler-Lagrange equations for the non-relativistic Lagrangian (5.72) is a standard one. There are conserved quantities
$$ E = \frac{1}{2}\, m\, (\dot r^2 + r^2 \dot\varphi^2) + \frac{q}{r} \,, \qquad \ell = m r^2 \dot\varphi \,. \tag{5.73} $$
Substituting the latter into the former gives the standard radial equation, whose solution implies closed elliptical orbits given by
$$ \frac{1}{r} = \frac{m q}{\ell^2} \Big( \sqrt{1 + \frac{2 E \ell^2}{m q^2}}\; \cos\varphi - 1 \Big) \,. \tag{5.74} $$
(This is for the bound case $-m q^2/(2\ell^2) \le E < 0$ with $q < 0$, for which the square root is real and less than 1. If $E > 0$ the orbits are hyperbolae, while in the intermediate case $E = 0$ the orbits are parabolic.)
The key difference in the relativistic case is that the orbits are never closed, even when $|\ell| > |q|$, as in (5.70), for which the radius $r$ is a trigonometric function of $\varphi$. The reason for this is that the argument of the trigonometric function is
$$ \Big( 1 - \frac{q^2}{\ell^2} \Big)^{1/2} \varphi \,, \tag{5.75} $$
and so $\varphi$ has to increase through an angle $\Delta\varphi$ given by
$$ \Delta\varphi = 2\pi \Big( 1 - \frac{q^2}{\ell^2} \Big)^{-1/2} \tag{5.76} $$
before the cosine completes one cycle. If we assume that $|q/\ell|$ is small compared with 1, then the shape of the orbit is still approximately like an ellipse, except that the "perihelion" of the ellipse advances by an angle
$$ \delta\varphi = 2\pi \Big[ \Big( 1 - \frac{q^2}{\ell^2} \Big)^{-1/2} - 1 \Big] \approx \frac{\pi q^2}{\ell^2} \tag{5.77} $$
per orbit.
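
To see how good the leading-order estimate in (5.77) is, one can compare it with the exact expression for a few values of $q/\ell$. The next term in the expansion is $\frac{3\pi}{4}(q/\ell)^4$, so the relative error should be about $\frac{3}{4}(q/\ell)^2$. The function names below are hypothetical.

```python
import math


def advance_exact(q, ell):
    """Perihelion advance per orbit from the exact expression in (5.77)."""
    return 2 * math.pi * ((1.0 - (q / ell) ** 2) ** -0.5 - 1.0)


def advance_leading_order(q, ell):
    """Small-|q/ell| approximation pi q^2 / ell^2 from (5.77)."""
    return math.pi * (q / ell) ** 2


if __name__ == "__main__":
    for ratio in (0.3, 0.1, 0.01):
        exact = advance_exact(ratio, 1.0)
        approx = advance_leading_order(ratio, 1.0)
        print(ratio, exact, approx, (exact - approx) / exact)
```

The approximation always underestimates the advance, and the exact expression diverges as $|q/\ell| \to 1$, which signals the change of behaviour discussed next for $|\ell| \le |q|$.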
If on the other hand $|\ell| \le |q|$, then if $q < 0$ (which means $eQ < 0$ and hence an attractive force between the charges), the particle spirals inwards and eventually reaches $r = 0$ within a finite time. This can never happen in the non-relativistic case; the orbit of the particle can never reach the origin at $r = 0$, unless the angular momentum $\ell$ is exactly zero. The reason for this is that the centrifugal potential term $\ell^2/r^2$ always throws the particle away from the origin if $r$ tries to get too small. By contrast, in the relativistic case the effect of the centrifugal term is reduced at large velocity, and it cannot prevent the collapse of the orbit to $r = 0$. This can be seen by looking at the conserved quantity $\mathcal{E}$ in the fully relativistic analysis, which, from our discussion above, can be written as
$$ \mathcal{E} = \Big( m^2 + m^2 \Big( \frac{dr}{d\tau} \Big)^2 + \frac{\ell^2}{r^2} \Big)^{1/2} + \frac{q}{r} \,. \tag{5.78} $$
First, consider the non-relativistic limit, for which the rest-mass term dominates inside the square root:
$$ \mathcal{E} \approx m + \frac{1}{2}\, m \Big( \frac{dr}{dt} \Big)^2 + \frac{\ell^2}{2 m r^2} + \frac{q}{r} \,. \tag{5.79} $$
Here, we see that even if $q < 0$ (an attractive force), the repulsive centrifugal term always wins over the attractive charge term $q/r$ at small enough $r$.
On the other hand, if we keep the full relativistic expression (5.78), then at small enough $r$ the competition between the centrifugal term and the charge term becomes "evenly matched,"
$$ \mathcal{E} \approx \frac{|\ell|}{r} + \frac{q}{r} \,, \tag{5.80} $$
and clearly if $q < -|\ell|$ the attraction between the charges wins the contest.
5.5 The multipole expansion
Consider the electrostatic potential of $N$ point charges $q_a$, located at fixed positions $\vec r_a$. It is given by
$$\phi(\vec r) = \sum_{a=1}^{N} \frac{q_a}{|\vec r - \vec r_a|} . \tag{5.81}$$
In the continuum limit, the potential due to a charge distribution characterised by the charge density $\rho(\vec r)$ is given by
$$\phi(\vec r) = \int \frac{\rho(\vec r\,')\, d^3\vec r\,'}{|\vec r - \vec r\,'|} . \tag{5.82}$$
Since we shall assume that the charges are confined to a finite region, it is useful to perform a multipole expansion of the potential far from the region where the charges are located. This amounts to an expansion in inverse powers of $r = |\vec r|$. This can be achieved by performing a Taylor expansion of $1/|\vec r - \vec r\,'|$ in the integrand of (5.82).
Recall that in one dimension, Taylor’s theorem gives
$$f(x + a) = f(x) + a f'(x) + \frac{a^2}{2!} f''(x) + \frac{a^3}{3!} f'''(x) + \cdots . \tag{5.83}$$
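In three dimensions, the analogous expansion is
$$f(\vec r + \vec a) = f(\vec r) + a_i \partial_i f(\vec r) + \frac{1}{2!} a_i a_j \partial_i \partial_j f(\vec r) + \frac{1}{3!} a_i a_j a_k \partial_i \partial_j \partial_k f(\vec r) + \cdots . \tag{5.84}$$
We now apply this 3-dimensional Taylor expansion to the function $f(\vec r) = 1/|\vec r| = 1/r$, taking $\vec a = -\vec r\,'$. This gives
$$\frac{1}{|\vec r - \vec r\,'|} = \frac{1}{r} - x'_i \partial_i \frac{1}{r} + \frac{1}{2!} x'_i x'_j \partial_i \partial_j \frac{1}{r} - \frac{1}{3!} x'_i x'_j x'_k \partial_i \partial_j \partial_k \frac{1}{r} + \cdots . \tag{5.85}$$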
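Now since $r^2 = x_j x_j$, it follows that $\partial_i r^2 = 2 r\, \partial_i r = 2 x_i$, and so
$$\partial_i r = \frac{x_i}{r} . \tag{5.86}$$
Note that we have (assuming $r \ne 0$) that
$$\partial_i \partial_i \frac{1}{r} = \partial_i \Big(-\frac{x_i}{r^3}\Big) = -\frac{3}{r^3} + \frac{3 x_i}{r^4} \frac{x_i}{r} = 0 , \tag{5.87}$$
or, in other words
$$\nabla^2 \frac{1}{r} = 0 . \tag{5.88}$$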
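A consequence of this is that the multiple derivatives
$$\partial_i \partial_j \frac{1}{r} , \qquad \partial_i \partial_j \partial_k \frac{1}{r} , \qquad \partial_i \partial_j \partial_k \partial_\ell \frac{1}{r} , \qquad \cdots \tag{5.89}$$
are all traceless on any pair of indices:
$$\delta_{ij} \partial_i \partial_j \frac{1}{r} = 0 , \qquad \delta_{ij} \partial_i \partial_j \partial_k \frac{1}{r} = 0 , \qquad \text{etc.} \tag{5.90}$$
We can use this property in order to replace the quantities
$$x'_i x'_j , \qquad x'_i x'_j x'_k , \qquad \cdots \tag{5.91}$$
that multiply the derivative terms in (5.85) by the totally trace-free quantities
$$\big(x'_i x'_j - \tfrac{1}{3} \delta_{ij}\, r'^2\big) , \qquad \big(x'_i x'_j x'_k - \tfrac{1}{5} [x'_i \delta_{jk} + x'_j \delta_{ik} + x'_k \delta_{ij}]\, r'^2\big) , \qquad \cdots \tag{5.92}$$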
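where $r'^2 = x'_i x'_i$. (We can do this because the trace terms that we are subtracting out here give zero when they are contracted onto the multiple derivatives of $1/r$ in (5.85).) It therefore follows from (5.82) and (5.85) that we have
$$\begin{aligned}
\phi(\vec r) = {} & \frac{1}{r} \int \rho(\vec r\,')\, d^3\vec r\,' - \Big(\partial_i \frac{1}{r}\Big) \int x'_i\, \rho(\vec r\,')\, d^3\vec r\,' \\
& + \frac{1}{2!} \Big(\partial_i \partial_j \frac{1}{r}\Big) \int \big(x'_i x'_j - \tfrac{1}{3} \delta_{ij}\, r'^2\big) \rho(\vec r\,')\, d^3\vec r\,' \\
& - \frac{1}{3!} \Big(\partial_i \partial_j \partial_k \frac{1}{r}\Big) \int \big(x'_i x'_j x'_k - \tfrac{1}{5} [x'_i \delta_{jk} + x'_j \delta_{ik} + x'_k \delta_{ij}]\, r'^2\big) \rho(\vec r\,')\, d^3\vec r\,' + \cdots .
\end{aligned} \tag{5.93}$$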
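The expansion here can be written as
$$\phi(\vec r) = \frac{Q}{r} - p_i\, \partial_i \frac{1}{r} + \frac{1}{2!} Q_{ij}\, \partial_i \partial_j \frac{1}{r} - \frac{1}{3!} Q_{ijk}\, \partial_i \partial_j \partial_k \frac{1}{r} + \cdots \tag{5.94}$$
where
$$\begin{aligned}
Q &= \int \rho(\vec r\,')\, d^3\vec r\,' , \\
p_i &= \int x'_i\, \rho(\vec r\,')\, d^3\vec r\,' , \\
Q_{ij} &= \int \big(x'_i x'_j - \tfrac{1}{3} \delta_{ij}\, r'^2\big) \rho(\vec r\,')\, d^3\vec r\,' , \\
Q_{ijk} &= \int \big(x'_i x'_j x'_k - \tfrac{1}{5} [x'_i \delta_{jk} + x'_j \delta_{ik} + x'_k \delta_{ij}]\, r'^2\big) \rho(\vec r\,')\, d^3\vec r\,' ,
\end{aligned} \tag{5.95}$$
and so on. The quantity $Q$ is the total charge of the system, $p_i$ is the dipole moment, $Q_{ij}$ is the quadrupole moment, and $Q_{ijk}$, $Q_{ijk\ell}$, etc., are the higher multipole moments. Note that by construction, all the multipole moments with two or more indices are symmetric and traceless on all indices.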
Note that the terms in the multipole expansion (5.94) do indeed fall off with increasing inverse powers of $r$. For example, the dipole term is given by
$$\phi_{\rm Dipole} = -p_i\, \partial_i \frac{1}{r} = \frac{p_i x_i}{r^3} , \tag{5.96}$$
which falls off like $1/r^2$. The quadrupole term is given by
$$\phi_{\rm Quadrupole} = \frac{1}{2} Q_{ij}\, \partial_i \partial_j \frac{1}{r} = \frac{1}{2} Q_{ij} \frac{(3 x_i x_j - r^2 \delta_{ij})}{r^5} = \frac{3}{2} Q_{ij} \frac{x_i x_j}{r^5} , \tag{5.97}$$
which falls off like $1/r^3$. (The last equality above follows because $Q_{ij}$ is traceless.)
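The truncated expansion can be tested numerically for a handful of point charges, for which the integrals in (5.95) become sums. The sketch below compares the exact potential (5.81) with the monopole, dipole and quadrupole terms from (5.96) and (5.97). The charge configuration and function names are hypothetical, chosen only for illustration.

```python
import math

def exact_potential(charges, r):
    """phi(r) = sum_a q_a / |r - r_a|, as in (5.81)."""
    return sum(q / math.dist(r, ra) for q, ra in charges)

def multipole_potential(charges, r):
    """Monopole + dipole + quadrupole terms of (5.94), using (5.96) and (5.97)."""
    rr = math.sqrt(sum(x * x for x in r))
    total_charge = sum(q for q, _ in charges)
    dipole = [sum(q * ra[i] for q, ra in charges) for i in range(3)]
    # Traceless quadrupole moment (5.95), with the integral replaced by a sum.
    quad = [[sum(q * (ra[i] * ra[j]
                      - (sum(c * c for c in ra) / 3.0 if i == j else 0.0))
                 for q, ra in charges)
             for j in range(3)] for i in range(3)]
    phi_dipole = sum(dipole[i] * r[i] for i in range(3)) / rr**3
    phi_quad = 1.5 * sum(quad[i][j] * r[i] * r[j]
                         for i in range(3) for j in range(3)) / rr**5
    return total_charge / rr + phi_dipole + phi_quad

# Hypothetical configuration: three charges within about 0.25 of the origin.
charges = [(1.0, (0.1, 0.0, 0.0)), (-2.0, (0.0, 0.2, 0.1)), (1.5, (-0.1, 0.1, -0.2))]
r = (3.0, 4.0, 5.0)
print(exact_potential(charges, r), multipole_potential(charges, r))
```

The residual difference is controlled by the octopole term, which falls off like $1/r^4$, so it shrinks rapidly as the observation point moves further from the charges.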
The total charge $Q$ (the electric monopole moment) is of course a single quantity. The dipole moment $p_i$ is a 3-vector, so it has three independent components in general. The quadrupole moment $Q_{ij}$ is a symmetric 2-index tensor in three dimensions, which would mean $3 \times 4/2 = 6$ independent components. But it is also traceless, $Q_{ii} = 0$, which is one condition. Thus there are $6 - 1 = 5$ independent components.
The octopole moment $Q_{ijk}$ is a 3-index symmetric tensor, which would mean $3 \times 4 \times 5/3! = 10$ independent components. But it is also traceless, $Q_{iij} = 0$, which is 3 conditions. Thus the octopole has in general $10 - 3 = 7$ independent components. It is straightforward to see in the same way that the $2^\ell$-pole moment
$$Q_{i_1 i_2 \cdots i_\ell} = \int \big(x'_{i_1} x'_{i_2} \cdots x'_{i_\ell} - \text{traces}\big)\, \rho(\vec r\,')\, d^3\vec r\,' \tag{5.98}$$
has $(2\ell + 1)$ independent components.
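The counting can be spelled out for general $\ell$: a totally symmetric rank-$\ell$ tensor in three dimensions has $\binom{\ell+2}{\ell}$ components, and its trace is a symmetric tensor of rank $\ell - 2$, which supplies the conditions to subtract. A minimal sketch of this count (the function names are illustrative):

```python
import math

def symmetric_components(rank, dim=3):
    """Independent components of a totally symmetric tensor of given rank."""
    return math.comb(dim + rank - 1, rank)

def traceless_symmetric_components(rank, dim=3):
    """Subtract the trace conditions, which form a symmetric tensor of rank - 2."""
    if rank < 2:
        return symmetric_components(rank, dim)
    return symmetric_components(rank, dim) - symmetric_components(rank - 2, dim)

for ell in range(5):
    print(ell, traceless_symmetric_components(ell), 2 * ell + 1)
```

For $\ell = 2$ and $\ell = 3$ this reproduces the counts $6 - 1 = 5$ and $10 - 3 = 7$ found above.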
In fact, the multipole expansion (5.94) is equivalent to an expansion in spherical polar coordinates, using the spherical harmonics $Y_{\ell m}(\theta, \varphi)$:
$$\phi(r, \theta, \varphi) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} C_{\ell m}\, Y_{\ell m}(\theta, \varphi)\, \frac{1}{r^{\ell+1}} . \tag{5.99}$$
At a given value of $\ell$ the terms fall off like $r^{-\ell-1}$, and there are $(2\ell + 1)$ of them, with coefficients $C_{\ell m}$, since $m$ ranges over the integers $-\ell \le m \le \ell$. For each value of $\ell$, there is a linear relationship between the $(2\ell + 1)$ components of $C_{\ell m}$ and the $(2\ell + 1)$ components of the corresponding multipole moment $Q$, $p_i$, $Q_{ij}$, $Q_{ijk}$, etc. Likewise, for each $\ell$ there is a linear relationship between $r^{-\ell-1}\, Y_{\ell m}(\theta, \varphi)$ and the set of functions $\partial_{i_1} \partial_{i_2} \cdots \partial_{i_\ell}\, r^{-1}$.
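Consider, for example, $\ell = 1$. The three functions $Z_i \equiv \partial_i r^{-1} = -x_i/r^3$ are given by
$$Z_1 = -\frac{\sin\theta \cos\varphi}{r^2} , \qquad Z_2 = -\frac{\sin\theta \sin\varphi}{r^2} , \qquad Z_3 = -\frac{\cos\theta}{r^2} , \tag{5.100}$$
when expressed in terms of spherical polar coordinates (see (5.49)). On the other hand, the $\ell = 1$ spherical harmonics are given by
$$Y_{11} = -\sqrt{\frac{3}{8\pi}}\, \sin\theta\, e^{i\varphi} , \qquad Y_{10} = \sqrt{\frac{3}{4\pi}}\, \cos\theta , \qquad Y_{1,-1} = \sqrt{\frac{3}{8\pi}}\, \sin\theta\, e^{-i\varphi} . \tag{5.101}$$
Thus we see that
$$Z_1 = \sqrt{\frac{8\pi}{3}}\, \frac{(Y_{11} - Y_{1,-1})}{2 r^2} , \qquad Z_2 = \sqrt{\frac{8\pi}{3}}\, \frac{(Y_{11} + Y_{1,-1})}{2 i\, r^2} , \qquad Z_3 = -\sqrt{\frac{4\pi}{3}}\, \frac{Y_{10}}{r^2} . \tag{5.102}$$
Analogous relations can be seen for all higher values of $\ell$.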
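The relations (5.102) are easy to confirm numerically at an arbitrary point. The following sketch evaluates both sides using only the explicit formulas (5.100) and (5.101); the function names and the sample point are illustrative assumptions.

```python
import cmath
import math

def spherical_harmonics_l1(theta, phi):
    """Return (Y_11, Y_10, Y_1,-1) as given in (5.101)."""
    y11 = -math.sqrt(3.0 / (8.0 * math.pi)) * math.sin(theta) * cmath.exp(1j * phi)
    y10 = math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta)
    y1m1 = math.sqrt(3.0 / (8.0 * math.pi)) * math.sin(theta) * cmath.exp(-1j * phi)
    return y11, y10, y1m1

def z_from_harmonics(r, theta, phi):
    """Right-hand sides of (5.102)."""
    y11, y10, y1m1 = spherical_harmonics_l1(theta, phi)
    c = math.sqrt(8.0 * math.pi / 3.0)
    z1 = c * (y11 - y1m1) / (2.0 * r**2)
    z2 = c * (y11 + y1m1) / (2.0j * r**2)
    z3 = -math.sqrt(4.0 * math.pi / 3.0) * y10 / r**2
    return z1, z2, z3

def z_direct(r, theta, phi):
    """Z_i = -x_i / r^3 with x_i written in spherical polar coordinates."""
    x = (r * math.sin(theta) * math.cos(phi),
         r * math.sin(theta) * math.sin(phi),
         r * math.cos(theta))
    return tuple(-xi / r**3 for xi in x)

print(z_from_harmonics(1.7, 0.6, 2.1))
print(z_direct(1.7, 0.6, 2.1))
```

The first tuple is complex-valued with imaginary parts at the level of rounding error, and its real parts match the second tuple.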
6 Electromagnetic Waves
6.1 Wave equation
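As discussed at the beginning of the course (see section 1.1), Maxwell’s equations admit wave-like solutions. These solutions can exist in free space, in a region where there are no source currents, for which the equations take the form
$$\vec\nabla \cdot \vec E = 0 , \qquad \vec\nabla \times \vec B - \frac{\partial \vec E}{\partial t} = 0 , \qquad \vec\nabla \cdot \vec B = 0 , \qquad \vec\nabla \times \vec E + \frac{\partial \vec B}{\partial t} = 0 . \tag{6.1}$$
As discussed in section 1.1, taking the curl of the $\vec\nabla \times \vec E$ equation, and using the $\vec\nabla \times \vec B$ equation, one finds
$$\nabla^2 \vec E - \frac{\partial^2 \vec E}{\partial t^2} = 0 , \tag{6.2}$$
and similarly,
$$\nabla^2 \vec B - \frac{\partial^2 \vec B}{\partial t^2} = 0 . \tag{6.3}$$
Thus each component of $\vec E$ and each component of $\vec B$ satisfies d’Alembert’s equation
$$\nabla^2 f - \frac{\partial^2 f}{\partial t^2} = 0 . \tag{6.4}$$
This can, of course, be written as
$$\Box f \equiv \partial^\mu \partial_\mu f = 0 , \tag{6.5}$$
which shows that d’Alembert’s operator is Lorentz invariant.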
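The wave equation (6.4) admits plane-wave solutions, where $f$ depends on $t$ and on a single linear combination of the $x$, $y$ and $z$ coordinates. By choosing the orientation of the axes appropriately, we can make this linear combination become simply $x$. Thus we may seek solutions of (6.4) of the form $f = f(t, x)$. The function $f$ will then satisfy
$$\frac{\partial^2 f}{\partial x^2} - \frac{\partial^2 f}{\partial t^2} = 0 , \tag{6.6}$$
which can be written in the factorised form
$$\Big(\frac{\partial}{\partial x} - \frac{\partial}{\partial t}\Big) \Big(\frac{\partial}{\partial x} + \frac{\partial}{\partial t}\Big) f(t, x) = 0 . \tag{6.7}$$
Now introduce “light-cone coordinates”
$$u = x - t , \qquad v = x + t . \tag{6.8}$$
We see that
$$\frac{\partial}{\partial x} = \frac{\partial}{\partial u} + \frac{\partial}{\partial v} , \qquad \frac{\partial}{\partial t} = -\frac{\partial}{\partial u} + \frac{\partial}{\partial v} , \tag{6.9}$$
and so (6.7) becomes
$$\frac{\partial^2 f}{\partial u\, \partial v} = 0 . \tag{6.10}$$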
The general solution to this is
$$f = f_+(u) + f_-(v) = f_+(x - t) + f_-(x + t) , \tag{6.11}$$
where $f_+$ and $f_-$ are arbitrary functions.
The functions $f_\pm$ determine the profile of a wave-like disturbance that propagates at the speed of light (i.e. at speed 1). In the case of a wave described by $f_+(x - t)$, the disturbance propagates at the speed of light in the positive $x$ direction. This can be seen from the fact that if we sit at a given point on the profile (i.e. at a fixed value of the argument of the function $f_+$), then as $t$ increases the $x$ value must increase too. This means that the disturbance moves, with speed 1, along the positive $x$ direction. Likewise, a wave described by $f_-(x + t)$ moves in the negative $x$ direction as time increases.
More generally, we can consider a plane-wave disturbance moving along the direction of a unit 3-vector $\vec n$:
$$f(t, \vec r) = f_+(\vec n \cdot \vec r - t) + f_-(\vec n \cdot \vec r + t) . \tag{6.12}$$
The $f_+$ wave moves in the direction of $\vec n$ as $t$ increases, while the $f_-$ wave moves in the direction of $-\vec n$. The previous case of propagation along the $x$ axis corresponds to taking $\vec n = (1, 0, 0)$.
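Let us now return to the discussion of electromagnetic waves. Following the discussion above, there will exist plane-wave solutions of (6.2), propagating along the $\vec n$ direction, of the form
$$\vec E = \vec E(\vec n \cdot \vec r - t) . \tag{6.13}$$
From the Maxwell equation $\partial \vec B/\partial t = -\vec\nabla \times \vec E$, we shall therefore have
$$\begin{aligned}
\frac{\partial B_i}{\partial t} &= -\epsilon_{ijk}\, \partial_j E_k(n_\ell x_\ell - t) , \\
&= -\epsilon_{ijk}\, n_j\, E'_k(n_\ell x_\ell - t) ,
\end{aligned} \tag{6.14}$$
where $E'_k$ denotes the derivative of $E_k$ with respect to its argument. We also have that $\partial E_k(n_\ell x_\ell - t)/\partial t = -E'_k(n_\ell x_\ell - t)$, and so
$$\frac{\partial B_i}{\partial t} = \epsilon_{ijk}\, n_j\, \frac{\partial}{\partial t} E_k(n_\ell x_\ell - t) . \tag{6.15}$$
We can integrate this, and drop the constant of integration since an additional static $\vec B$ field term is of no interest to us when discussing electromagnetic waves. Thus we have
$$B_i = \epsilon_{ijk}\, n_j E_k , \qquad \text{i.e.} \qquad \vec B = \vec n \times \vec E . \tag{6.16}$$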
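The source-free Maxwell equation $\vec\nabla \cdot \vec E = 0$ implies
$$\partial_i E_i(n_j x_j - t) = n_i\, E'_i(n_j x_j - t) = -\frac{\partial}{\partial t}\, \vec n \cdot \vec E = 0 . \tag{6.17}$$
Again, we can drop the constant of integration, and conclude that for the plane wave
$$\vec n \cdot \vec E = 0 . \tag{6.18}$$
Since $\vec B = \vec n \times \vec E$, it immediately follows that $\vec n \cdot \vec B = 0$ and $\vec E \cdot \vec B = 0$ also. Thus we see that for a plane electromagnetic wave propagating along the $\vec n$ direction, the $\vec E$ and $\vec B$ vectors are orthogonal to $\vec n$ and also orthogonal to each other:
$$\vec n \cdot \vec E = 0 , \qquad \vec n \cdot \vec B = 0 , \qquad \vec E \cdot \vec B = 0 . \tag{6.19}$$
It also follows from $\vec B = \vec n \times \vec E$ that
$$|\vec E| = |\vec B| , \qquad \text{i.e.} \qquad E = B . \tag{6.20}$$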
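Thus we find that the energy density $W$ is given by
$$W = \frac{1}{8\pi} (E^2 + B^2) = \frac{1}{4\pi} E^2 . \tag{6.21}$$
The Poynting flux $\vec S = (\vec E \times \vec B)/(4\pi)$ is given by
$$\begin{aligned}
S_i &= \frac{1}{4\pi}\, \epsilon_{ijk}\, E_j\, \epsilon_{k\ell m}\, n_\ell E_m = \frac{1}{4\pi}\, n_i E_j E_j - \frac{1}{4\pi}\, E_i\, n_j E_j , \\
&= \frac{1}{4\pi}\, n_i E_j E_j ,
\end{aligned} \tag{6.22}$$
and so we have
$$W = \frac{1}{4\pi} E^2 , \qquad \vec S = \frac{1}{4\pi}\, \vec n\, E^2 = \vec n\, W . \tag{6.23}$$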
Note that the argument $\vec n \cdot \vec r - t$ can be written as
$$\vec n \cdot \vec r - t = n_\mu x^\mu , \tag{6.24}$$
where $n_\mu = (-1, \vec n)$ and hence
$$n^\mu = (1, \vec n) . \tag{6.25}$$
Since $\vec n$ is a unit vector, $\vec n \cdot \vec n = 1$, we have
$$n^\mu n_\mu = \eta_{\mu\nu}\, n^\mu n^\nu = 0 . \tag{6.26}$$
$n^\mu$ is called a Null Vector. This is a non-vanishing vector whose norm $n^\mu n_\mu$ vanishes. Such vectors can arise because of the minus sign in the $\eta_{00}$ component of the 4-metric. By contrast, in a metric of positive-definite signature, such as the 3-dimensional Euclidean metric $\delta_{ij}$, a vector whose norm vanishes is itself necessarily zero.
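We can now evaluate the various components of the energy-momentum tensor, which are given by (4.104) and the equations that follow it. Thus we have
$$\begin{aligned}
T^{00} &= W = \frac{1}{4\pi} E^2 = \frac{1}{4\pi} B^2 , \\
T^{0i} &= T^{i0} = S_i = n_i W , \\
T^{ij} &= \frac{1}{4\pi} \big(-E_i E_j - B_i B_j + \tfrac{1}{2} (E^2 + B^2)\, \delta_{ij}\big) , \\
&= \frac{1}{4\pi} \big(-E_i E_j - \epsilon_{ik\ell}\, \epsilon_{jmn}\, n_k n_m E_\ell E_n + E^2 \delta_{ij}\big) , \\
&= \frac{1}{4\pi} \big(-E_i E_j - \delta_{ij} E^2 - n_i n_k E_k E_j - n_j n_\ell E_\ell E_i + \delta_{ij}\, n_k n_\ell E_k E_\ell \\
&\qquad\quad + n_i n_j E_\ell E_\ell + n_k n_k E_i E_j + E^2 \delta_{ij}\big) , \\
&= \frac{1}{4\pi}\, n_i n_j E^2 = n_i n_j W .
\end{aligned} \tag{6.27}$$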
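Note that in deriving this last result, we have used the identity
$$\epsilon_{ik\ell}\, \epsilon_{jmn} = \delta_{ij} \delta_{km} \delta_{\ell n} + \delta_{im} \delta_{kn} \delta_{\ell j} + \delta_{in} \delta_{kj} \delta_{\ell m} - \delta_{im} \delta_{kj} \delta_{\ell n} - \delta_{ij} \delta_{kn} \delta_{\ell m} - \delta_{in} \delta_{km} \delta_{\ell j} , \tag{6.28}$$
together with $n_k E_k = 0$ and $n_k n_k = 1$.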
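Since each index in (6.28) takes only three values, the identity can be confirmed by brute force over all $3^6$ index assignments. The sketch below does this with a compact closed form for the Levi-Civita symbol; the helper names are illustrative.

```python
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0

def rhs(i, k, l, j, m, n):
    """Right-hand side of (6.28), term by term."""
    return (delta(i, j) * delta(k, m) * delta(l, n)
            + delta(i, m) * delta(k, n) * delta(l, j)
            + delta(i, n) * delta(k, j) * delta(l, m)
            - delta(i, m) * delta(k, j) * delta(l, n)
            - delta(i, j) * delta(k, n) * delta(l, m)
            - delta(i, n) * delta(k, m) * delta(l, j))

identity_holds = all(
    eps(i, k, l) * eps(j, m, n) == rhs(i, k, l, j, m, n)
    for i, k, l, j, m, n in product(range(3), repeat=6)
)
print(identity_holds)
```

The six terms are just the expansion of the $3 \times 3$ determinant of Kronecker deltas with rows labelled by $(i, k, \ell)$ and columns by $(j, m, n)$.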
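The expressions for $T^{00}$, $T^{0i}$ and $T^{ij}$ can be combined into the single Lorentz-covariant expression
$$T^{\mu\nu} = n^\mu n^\nu\, W . \tag{6.29}$$
From this, we can compute the conserved 4-momentum
$$\begin{aligned}
P^\mu &= \int_{t = {\rm const.}} T^{\mu\nu}\, d\Sigma_\nu = \int T^{\mu 0}\, d^3x , \\
&= \int n^\mu\, W\, d^3x = n^\mu \int W\, d^3x ,
\end{aligned} \tag{6.30}$$
and hence we have
$$P^\mu = n^\mu\, \mathcal{E} , \tag{6.31}$$
where
$$\mathcal{E} = \int W\, d^3x , \tag{6.32}$$
the total energy of the electromagnetic field. Note that $P^\mu$ is also a null vector,
$$P^\mu P_\mu = \mathcal{E}^2\, n^\mu n_\mu = 0 . \tag{6.33}$$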
6.2 Monochromatic plane waves
In the discussion above, we considered plane electromagnetic waves with an arbitrary profile. A special case is to consider the situation when the plane wave has a definite frequency $\omega$, so that its time dependence is of the form $\cos\omega t$. Thus we can write
$$\vec E = \vec E_0\, e^{i(\vec k \cdot \vec r - \omega t)} , \qquad \vec B = \vec B_0\, e^{i(\vec k \cdot \vec r - \omega t)} , \tag{6.34}$$
where $\vec E_0$ and $\vec B_0$ are (possibly complex) constants. The physical $\vec E$ and $\vec B$ fields are obtained by taking the real parts of $\vec E$ and $\vec B$. (Since the Maxwell equations are linear, we can always choose to work in such a complex notation, with the understanding that we take the real parts to get the physical quantities.)
As we shall discuss in some detail later, the more general plane-wave solutions discussed previously, with an arbitrary profile for the wave, can be built up as linear combinations of the monochromatic plane-wave solutions.
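Of course, for the fields in (6.34) to solve the Maxwell equations, there must be relations among the constants $\vec k$, $\omega$, $\vec E_0$ and $\vec B_0$. Specifically, since $\vec E$ and $\vec B$ must satisfy the wave equations (6.2) and (6.3), we must have
$$\vec k^{\,2} = \omega^2 , \tag{6.35}$$
and since $\vec\nabla \cdot \vec E = 0$ and $\vec\nabla \cdot \vec B = 0$, we must have
$$\vec k \cdot \vec E_0 = 0 , \qquad \vec k \cdot \vec B_0 = 0 . \tag{6.36}$$
Finally, following the discussion in the more general case above, it follows from $\vec\nabla \times \vec E + \partial \vec B/\partial t = 0$ and $\vec\nabla \times \vec B - \partial \vec E/\partial t = 0$ that
$$\vec B = \frac{\vec k \times \vec E}{\omega} . \tag{6.37}$$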
It is natural, therefore, to introduce the 4-vector
$$k^\mu = (\omega, \vec k) = \omega\, n^\mu , \tag{6.38}$$
where $n^\mu = (1, \vec n)$ and $\vec n = \vec k/|\vec k| = \vec k/\omega$. Equation (6.35) then becomes simply the statement that $k^\mu$ is a null vector,
$$k^\mu k_\mu = 0 . \tag{6.39}$$
Note that the argument of the exponentials in (6.34) can now be written as
$$\vec k \cdot \vec r - \omega t = k_\mu x^\mu , \tag{6.40}$$
which we shall commonly write as $k \cdot x$. Thus we may rewrite (6.34) more briefly as
$$\vec E = \vec E_0\, e^{i\, k \cdot x} , \qquad \vec B = \vec B_0\, e^{i\, k \cdot x} . \tag{6.41}$$
As usual, we have a plane transverse wave, propagating in the direction of the unit 3-vector $\vec n = \vec k/\omega$. The term “transverse” here signifies that $\vec E$ and $\vec B$ are perpendicular to the direction in which the wave is propagating. In fact, we have
$$\vec n \cdot \vec E = \vec n \cdot \vec B = 0 , \qquad \vec B = \vec n \times \vec E , \tag{6.42}$$
and so we have also that $\vec E$ and $\vec B$ are perpendicular to each other, and that $|\vec E| = |\vec B|$.
Consider the case where $\vec E_0$ is taken to be real, which means that $\vec B_0$ is real too. Then the physical fields (obtained by taking the real parts of the fields given in (6.34)), are given by
$$\vec E = \vec E_0 \cos(\vec k \cdot \vec r - \omega t) , \qquad \vec B = \vec B_0 \cos(\vec k \cdot \vec r - \omega t) . \tag{6.43}$$
The energy density is then given by
$$W = \frac{1}{8\pi} (E^2 + B^2) = \frac{1}{4\pi}\, E_0^2 \cos^2(\vec k \cdot \vec r - \omega t) . \tag{6.44}$$
If we define the time average of $W$ by
$$\langle W \rangle \equiv \frac{1}{T} \int_0^T W\, dt , \tag{6.45}$$
where $T = 2\pi/\omega$ is the period of the oscillation, then we shall have
$$\langle W \rangle = \frac{1}{8\pi} E_0^2 = \frac{1}{8\pi} B_0^2 . \tag{6.46}$$
Note that in terms of the complex expressions (6.34), we can write this as
$$\langle W \rangle = \frac{1}{8\pi}\, \vec E \cdot \vec E^* = \frac{1}{8\pi}\, \vec B \cdot \vec B^* , \tag{6.47}$$
where the $*$ denotes complex conjugation, since the time and position dependence of $\vec E$ or $\vec B$ is cancelled when multiplied by the complex conjugate field.$^{11}$
In general, when $\vec E_0$ and $\vec B_0$ are not real, we shall also have the same expressions (6.47) for the time-averaged energy density.
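The claim that (6.47) also holds for complex amplitudes can be checked numerically by averaging the physical energy density over one period. In the sketch below the amplitude `E0`, the frequency and the fixed phase standing in for $\vec k \cdot \vec r$ are arbitrary assumed values.

```python
import cmath
import math

def time_average(func, period, samples=2000):
    """Midpoint-rule average of func(t) over one period."""
    dt = period / samples
    return sum(func((n + 0.5) * dt) for n in range(samples)) / samples

omega = 2.0
phase = 0.3                      # stands in for k.r at a fixed point (assumed value)
E0 = (1.0 + 2.0j, 0.5j, 0.0j)    # complex amplitude, transverse to n = (0, 0, 1)

def energy_density(t):
    """W = E_phys^2 / (4 pi), using |B| = |E| for a plane wave as in (6.44)."""
    e_phys = [(c * cmath.exp(1j * (phase - omega * t))).real for c in E0]
    return sum(e * e for e in e_phys) / (4.0 * math.pi)

numeric = time_average(energy_density, 2.0 * math.pi / omega)
complex_form = sum(abs(c) ** 2 for c in E0) / (8.0 * math.pi)   # right side of (6.47)
print(numeric, complex_form)
```

The two numbers agree to rounding error, reflecting the fact that $\cos^2$ and $\sin^2$ each average to $\tfrac{1}{2}$ while the cross term averages to zero.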
In a similar manner, we can evaluate the time average of the Poynting flux vector $\vec S = (\vec E \times \vec B)/(4\pi)$. If we first consider the case where $\vec E_0$ is real, we shall have
$$\vec S = \frac{1}{4\pi}\, \vec E \times \vec B = \frac{1}{4\pi}\, \vec E_0 \times \vec B_0 \cos^2(\vec k \cdot \vec r - \omega t) = \frac{1}{4\pi}\, \vec n\, E_0^2 \cos^2(\vec k \cdot \vec r - \omega t) , \tag{6.48}$$
and so
$$\langle \vec S \rangle = \frac{1}{8\pi}\, \vec E_0 \times \vec B_0 = \frac{1}{8\pi}\, \vec n\, E_0^2 . \tag{6.49}$$
$^{11}$ This “trick,” of expressing the time-averaged energy density in terms of the dot product of the complex field with its complex conjugate, is rather specific to this situation, where the quantity being time averaged is quadratic in the electric and magnetic fields.
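In general, even if $\vec E_0$ and $\vec B_0$ are not real, we can write $\langle \vec S \rangle$ in terms of the complex $\vec E$ and $\vec B$ fields as
$$\langle \vec S \rangle = \frac{1}{8\pi}\, \vec E \times \vec B^* = \frac{1}{8\pi}\, \vec n\, \vec E \cdot \vec E^* , \tag{6.50}$$
and so we have
$$\langle \vec S \rangle = \vec n\, \langle W \rangle . \tag{6.51}$$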
6.3 Motion of a point charge in a linearly-polarised E.M. wave
Consider a plane wave propagating in the $z$ direction, with
$$\vec E = (E_0 \cos\omega(z - t), 0, 0) , \qquad \vec B = (0, E_0 \cos\omega(z - t), 0) . \tag{6.52}$$
Suppose now that there is a particle of mass $m$ and charge $e$ in this field. By the Lorentz force equation we shall have
$$\frac{d\vec p}{dt} = e \vec E + e\, \vec v \times \vec B . \tag{6.53}$$
For simplicity, we shall make the assumption that the motion of the particle can be treated non-relativistically, and so
$$\vec p = m \vec v = m \frac{d\vec r}{dt} . \tag{6.54}$$
Let us suppose that the particle is initially located at the point $z = 0$, and that it moves only by a small amount in comparison to the wavelength $2\pi/\omega$ of the electromagnetic wave. Therefore, to a good approximation, we can assume that the particle is sitting in the uniform, although time-dependent, electromagnetic field obtained by setting $z = 0$ in (6.52). Thus
$$\vec E = (E_0 \cos\omega t, 0, 0) , \qquad \vec B = (0, E_0 \cos\omega t, 0) , \tag{6.55}$$
and so the Lorentz force equation gives
$$\begin{aligned}
m \ddot x &= e E_0 \cos\omega t - e\, \dot z\, E_0 \cos\omega t \approx e E_0 \cos\omega t , \\
m \ddot y &= 0 , \\
m \ddot z &= e\, \dot x\, E_0 \cos\omega t .
\end{aligned} \tag{6.56}$$
Note that the approximation in the first line follows from our assumption that the motion of the particle is non-relativistic, so $|\dot z| \ll 1$.
With convenient and inessential choices for the constants of integration, we first obtain
$$\dot x = \frac{e E_0}{m \omega} \sin\omega t , \qquad x = -\frac{e E_0}{m \omega^2} \cos\omega t . \tag{6.57}$$
Substituting into the $z$ equation then gives
$$\ddot z = \frac{e^2 E_0^2}{m^2 \omega} \sin\omega t \cos\omega t = \frac{e^2 E_0^2}{2 m^2 \omega} \sin 2\omega t , \tag{6.58}$$
which integrates to give (dropping inessential constants of integration)
$$z = -\frac{e^2 E_0^2}{8 m^2 \omega^3} \sin 2\omega t . \tag{6.59}$$
The motion in the $y$ direction is purely linear, and since we are not interested in the case where the particle drifts uniformly through space, we can just focus on the solution where $y$ is constant, say $y = 0$.
Thus the interesting motion of the particle in the electromagnetic field is of the form
$$x = \alpha \cos\omega t , \qquad z = \beta \sin 2\omega t = 2\beta \sin\omega t \cos\omega t , \tag{6.60}$$
which means, since $\sin\omega t = \pm\sqrt{1 - x^2/\alpha^2}$,
$$z = \pm \frac{2\beta}{\alpha}\, x \sqrt{1 - \frac{x^2}{\alpha^2}} . \tag{6.61}$$
This describes a “figure of eight” lying on its side in the $(x, z)$ plane, with the two signs of the square root giving the two halves of each loop. The assumptions we made in deriving this, namely non-relativistic motion and a small $z$ displacement relative to the wavelength of the electromagnetic wave, can be seen to be satisfied provided the amplitude $E_0$ of the wave is sufficiently small.
The response of the charged particle to the electromagnetic wave provides a model for how the electrons in a receiving antenna behave in the presence of an electromagnetic wave. This shows how the wave is converted into oscillatory currents in the antenna, which are then amplified and processed into the final output signal in a radio receiver.
6.4 Circular and elliptical polarisation
The electromagnetic wave described in section 6.2 is linearly polarised. For example, we could consider the solution with
$$\vec E_0 = (0, E_0, 0) , \qquad \vec B_0 = (0, 0, B_0) , \qquad \vec n = (1, 0, 0) . \tag{6.62}$$
This corresponds to a linearly polarised electromagnetic wave propagating along the $x$ direction.
By taking a linear superposition of waves propagating along a given direction $\vec n$, we can obtain circularly polarised, or more generally, elliptically polarised, waves. Let $\vec e$ and $\vec f$ be two orthogonal unit vectors, that are also both orthogonal to $\vec n$:
$$\vec e \cdot \vec e = 1 , \quad \vec f \cdot \vec f = 1 , \quad \vec n \cdot \vec n = 1 , \qquad \vec e \cdot \vec f = 0 , \quad \vec n \cdot \vec e = 0 , \quad \vec n \cdot \vec f = 0 . \tag{6.63}$$
Suppose now we consider a plane wave given by
$$\vec E = (E_0\, \vec e + \bar E_0\, \vec f\,)\, e^{i(\vec k \cdot \vec r - \omega t)} , \qquad \vec B = \vec n \times \vec E , \tag{6.64}$$
where $E_0$ and $\bar E_0$ are complex constants. If $E_0$ and $\bar E_0$ both have the same phase (i.e. $\bar E_0/E_0$ is real), then we again have a linearly-polarised electromagnetic wave. If instead the phases of $E_0$ and $\bar E_0$ are different, then the wave is in general elliptically polarised.
Consider as an example the case where
$$\bar E_0 = \pm i\, E_0 , \tag{6.65}$$
with $E_0$ real, for which the electric field will be given by
$$\vec E = E_0\, (\vec e \pm i \vec f\,)\, e^{i(\vec k \cdot \vec r - \omega t)} . \tag{6.66}$$
Taking the real part, to get the physical electric field, we obtain
$$\vec E = E_0\, \vec e \cos(\vec k \cdot \vec r - \omega t) \mp E_0\, \vec f \sin(\vec k \cdot \vec r - \omega t) . \tag{6.67}$$
For example, if we choose
$$\vec n = (0, 0, 1) , \qquad \vec e = (1, 0, 0) , \qquad \vec f = (0, 1, 0) , \tag{6.68}$$
then the electric field is given by
$$E_x = E_0 \cos\omega(z - t) , \qquad E_y = \mp E_0 \sin\omega(z - t) . \tag{6.69}$$
It is clear from this that the magnitude of the electric field is constant,
$$|\vec E| = E_0 . \tag{6.70}$$
If we fix a value of $z$, then the $\vec E$ vector can be seen to be rotating around the $z$ axis (the direction of motion of the wave). This rotation is anticlockwise in the $(x, y)$ plane if we choose the plus sign in (6.65), and clockwise if we choose the minus sign instead. These two choices correspond to having a circularly polarised wave of positive or negative helicity respectively. (Positive helicity means the rotation is parallel to the direction of propagation, while negative helicity means the rotation is anti-parallel to the direction of propagation.)
In more general cases, where the magnitudes of $E_0$ and $\bar E_0$ are unequal, or where the phase angle between them is not equal to 0 (linear polarisation) or 90 degrees, the electromagnetic wave will be elliptically polarised. Consider, for example, the case where the electric field is given by
$$\vec E = (a_1 e^{i\delta_1}, a_2 e^{i\delta_2}, 0)\, e^{i\omega(z - t)} , \tag{6.71}$$
with the propagation direction being $\vec n = (0, 0, 1)$. Then we shall have
$$\vec B = \vec n \times \vec E = (-a_2 e^{i\delta_2}, a_1 e^{i\delta_1}, 0)\, e^{i\omega(z - t)} . \tag{6.72}$$
The real constants $a_1$, $a_2$, $\delta_1$ and $\delta_2$ determine the nature of this plane wave propagating along the $z$ direction. Of course the overall phase is unimportant, so really it is only the difference $\delta_2 - \delta_1$ between the phase angles that is important.
The magnitude and phase information is sometimes expressed in terms of the Stokes Parameters $(s_0, s_1, s_2, s_3)$, which are defined by
$$\begin{aligned}
s_0 &= E_x E_x^* + E_y E_y^* = a_1^2 + a_2^2 , & s_1 &= E_x E_x^* - E_y E_y^* = a_1^2 - a_2^2 , \\
s_2 &= 2\, \Re(E_x^* E_y) = 2 a_1 a_2 \cos(\delta_2 - \delta_1) , & s_3 &= 2\, \Im(E_x^* E_y) = 2 a_1 a_2 \sin(\delta_2 - \delta_1) .
\end{aligned} \tag{6.73}$$
(The last two involve the real and imaginary parts of $(E_x^* E_y)$ respectively.) The four Stokes parameters are not independent:
$$s_0^2 = s_1^2 + s_2^2 + s_3^2 . \tag{6.74}$$
The parameter $s_0$ characterises the intensity of the electromagnetic wave, while $s_1$ characterises the amount of $x$ polarisation versus $y$ polarisation, with
$$-s_0 \le s_1 \le s_0 . \tag{6.75}$$
The third independent parameter, which could be taken to be $s_2$, characterises the phase difference between the $x$ and the $y$ polarised waves. Circular polarisation with $\pm$ helicity corresponds to
$$s_1 = 0 , \qquad s_2 = 0 , \qquad s_3 = \pm s_0 . \tag{6.76}$$
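The definitions (6.73) translate directly into a few lines of code. The sketch below computes the Stokes parameters from the two complex amplitudes and illustrates the constraint (6.74) and the circular-polarisation case (6.76). The function name `stokes` and the sample amplitudes are illustrative assumptions.

```python
import cmath

def stokes(Ex, Ey):
    """Stokes parameters (s0, s1, s2, s3) of (6.73) from complex amplitudes."""
    s0 = abs(Ex) ** 2 + abs(Ey) ** 2
    s1 = abs(Ex) ** 2 - abs(Ey) ** 2
    cross = Ex.conjugate() * Ey          # E_x^* E_y
    return s0, s1, 2.0 * cross.real, 2.0 * cross.imag

# Generic elliptical polarisation: a1 = 2, delta1 = 0.3, a2 = 0.7, delta2 = 1.1.
s0, s1, s2, s3 = stokes(cmath.rect(2.0, 0.3), cmath.rect(0.7, 1.1))
print(s0**2, s1**2 + s2**2 + s3**2)

# Circular polarisation, the plus sign in (6.65): Ey = +i Ex.
print(stokes(1.0 + 0.0j, 1.0j))
```

For the circular case the function returns $s_1 = s_2 = 0$ and $s_3 = s_0$, in agreement with (6.76).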
6.5 General superposition of plane waves
So far in the discussion of electromagnetic waves, we have considered the case where there is a single direction of propagation (i.e. a plane wave), and a single frequency (monochromatic). The most general wave-like solutions of the Maxwell equations can be expressed as linear combinations of these basic monochromatic plane-wave solutions.
In order to discuss the general wave solutions, it is helpful to work with the gauge potential $A^\mu = (\phi, \vec A)$. Recall that we have the freedom to make gauge transformations $A_\mu \to A_\mu + \partial_\mu \lambda$, where $\lambda$ is an arbitrary function. For the present purposes, of describing wave solutions, a convenient choice of gauge is to set $\phi = 0$. Such a gauge choice would not be convenient when discussing solutions in electrostatics, but in the present case, where we know that the wave solutions are necessarily time-dependent, it is quite helpful.
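Thus, we shall first write a single monochromatic plane wave in terms of the 3-vector potential, as
$$\vec A = a\, \vec e\, e^{i(\vec k \cdot \vec r - \omega t)} , \tag{6.77}$$
where $\vec e$ is a unit polarisation vector, and $a$ is a constant. As usual, we must have $|\vec k|^2 = \omega^2$. The electric and magnetic fields will be given by
$$\begin{aligned}
\vec E &= -\vec\nabla \phi - \frac{\partial \vec A}{\partial t} = i\, a\, \omega\, \vec e\, e^{i(\vec k \cdot \vec r - \omega t)} , \\
\vec B &= \vec\nabla \times \vec A = i\, a\, \vec k \times \vec e\, e^{i(\vec k \cdot \vec r - \omega t)} = \frac{\vec k \times \vec E}{\omega} .
\end{aligned} \tag{6.78}$$
We can immediately see that $\vec E$ and $\vec B$ satisfy the wave equation, and that we must impose $\vec e \cdot \vec k = 0$ in order to satisfy $\vec\nabla \cdot \vec E = 0$.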
We have established, therefore, that (6.77) describes a monochromatic plane wave propagating along the $\vec k$ direction, with electric field along $\vec e$, provided that $\vec e \cdot \vec k = 0$ and $|\vec k| = \omega$. More precisely, the gauge potential that gives the physical (i.e. real) electric and magnetic fields is given by taking the real part of $\vec A$ in (6.77). Thus, when we want to describe the actual physical quantities, we shall write
$$\vec A = a\, \vec e\, e^{i(\vec k \cdot \vec r - \omega t)} + a^*\, \vec e\, e^{-i(\vec k \cdot \vec r - \omega t)} . \tag{6.79}$$
(We have absorbed a factor of $\tfrac{1}{2}$ here into a rescaling of $a$, in order to avoid carrying $\tfrac{1}{2}$ factors around in all the subsequent equations.) For brevity, we shall usually write the “physical” $\vec A$ as
$$\vec A = a\, \vec e\, e^{i(\vec k \cdot \vec r - \omega t)} + \text{c.c.} , \tag{6.80}$$
where c.c. stands for “complex conjugate.”
Now consider a general linear superposition of monochromatic plane waves, with different wave-vectors $\vec k$, different polarisation vectors $\vec e$, and different amplitudes $a$. We shall therefore label the polarisation vectors and amplitudes as follows:
$$\vec e \longrightarrow \vec e_\lambda(\vec k) , \qquad a \longrightarrow a_\lambda(\vec k) . \tag{6.81}$$
Here $\lambda$ is an index which ranges over the values 1 and 2, which labels 2 real orthonormal vectors $\vec e_1(\vec k)$ and $\vec e_2(\vec k)$ that span the 2-plane perpendicular to $\vec k$. The general wave solution can then be written as the sum over all such monochromatic plane waves of the form (6.80). Since a continuous range of wave-vectors is allowed, the summation over these will be a 3-dimensional integral. Thus we can write
$$\vec A = \sum_{\lambda=1}^{2} \int \frac{d^3\vec k}{(2\pi)^3} \Big[\vec e_\lambda(\vec k)\, a_\lambda(\vec k)\, e^{i(\vec k \cdot \vec r - \omega t)} + \text{c.c.}\Big] , \tag{6.82}$$
where $\omega = |\vec k|$, and
$$\vec k \cdot \vec e_\lambda(\vec k) = 0 . \tag{6.83}$$
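For many purposes, it will be convenient to expand $\vec A$ in a basis of circularly-polarised monochromatic plane waves, rather than linearly-polarised waves. In this case, we should choose the 2-dimensional basis of polarisation vectors $\vec\epsilon_\pm$, related to the previous basis by
$$\vec\epsilon_\pm = \frac{1}{\sqrt{2}}\, (\vec e_1 \pm i\, \vec e_2) . \tag{6.84}$$
Since we have $\vec e_i \cdot \vec e_j = \delta_{ij}$, it follows that
$$\vec\epsilon_+ \cdot \vec\epsilon_+ = 0 , \qquad \vec\epsilon_- \cdot \vec\epsilon_- = 0 , \qquad \vec\epsilon_+ \cdot \vec\epsilon_- = 1 . \tag{6.85}$$
Note that $\vec\epsilon_\pm^{\,*} = \vec\epsilon_\mp$. We can label the $\vec\epsilon_\pm$ basis vectors by $\vec\epsilon_\lambda$, where $\lambda$ is now understood to take the two “values” $+$ and $-$. We then write the general wave solution as
$$\vec A = \sum_{\lambda=\pm} \int \frac{d^3\vec k}{(2\pi)^3} \Big[\vec\epsilon_\lambda(\vec k)\, a_\lambda(\vec k)\, e^{i(\vec k \cdot \vec r - \omega t)} + \text{c.c.}\Big] . \tag{6.86}$$
Of course, we also have $\vec k \cdot \vec\epsilon_\lambda = 0$, and $\omega = |\vec k|$.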
6.5.1 Helicity and energy of circularly-polarised waves
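The angular-momentum tensor $M^{\mu\nu}$ for the electromagnetic field is defined by
$$M^{\mu\nu} = \int_{t = {\rm const}} (x^\mu T^{\nu\rho} - x^\nu T^{\mu\rho})\, d\Sigma_\rho , \tag{6.87}$$
and so the three-dimensional components $M^{ij}$ are
$$\begin{aligned}
M^{ij} &= \int_{t = {\rm const}} (x^i T^{j\rho} - x^j T^{i\rho})\, d\Sigma_\rho = \int (x^i T^{j0} - x^j T^{i0})\, d^3x , \\
&= \int (x^i S^j - x^j S^i)\, d^3x .
\end{aligned} \tag{6.88}$$
Thus, since $\vec S = (\vec E \times \vec B)/(4\pi)$, the three-dimensional angular momentum $L_i = \tfrac{1}{2} \epsilon_{ijk} M^{jk}$ is given by
$$L_i = \int \epsilon_{ijk}\, x^j S^k\, d^3x , \tag{6.89}$$
i.e.
$$\vec L = \frac{1}{4\pi} \int \vec r \times (\vec E \times \vec B)\, d^3x . \tag{6.90}$$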
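Now, since $\vec B = \vec\nabla \times \vec A$, we have
$$\begin{aligned}
[\vec r \times (\vec E \times \vec B)]_i &= \epsilon_{ijk}\, \epsilon_{k\ell m}\, x_j E_\ell B_m , \\
&= \epsilon_{ijk}\, \epsilon_{k\ell m}\, \epsilon_{mpq}\, x_j E_\ell\, \partial_p A_q , \\
&= \epsilon_{ijk}\, (\delta_{kp} \delta_{\ell q} - \delta_{kq} \delta_{\ell p})\, x_j E_\ell\, \partial_p A_q , \\
&= \epsilon_{ijk}\, x_j E_\ell\, \partial_k A_\ell - \epsilon_{ijk}\, x_j E_\ell\, \partial_\ell A_k ,
\end{aligned} \tag{6.91}$$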
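and so
$$\begin{aligned}
L_i &= \frac{1}{4\pi} \int \big(\epsilon_{ijk}\, x_j E_\ell\, \partial_k A_\ell - \epsilon_{ijk}\, x_j E_\ell\, \partial_\ell A_k\big)\, d^3x , \\
&= \frac{1}{4\pi} \int \big(-\epsilon_{ijk}\, \partial_k (x_j E_\ell)\, A_\ell + \epsilon_{ijk}\, \partial_\ell (x_j E_\ell)\, A_k\big)\, d^3x , \\
&= \frac{1}{4\pi} \int \big(-\epsilon_{ijk}\, x_j\, (\partial_k E_\ell)\, A_\ell + \epsilon_{ijk}\, E_j A_k\big)\, d^3x .
\end{aligned} \tag{6.92}$$
Note that in performing the integrations by parts here, we have, as usual, assumed that the fields fall off fast enough at infinity that the surface term can be dropped. We have also used the source-free Maxwell equation $\partial_\ell E_\ell = 0$ in getting to the final line. Thus, we conclude that the angular momentum 3-vector can be expressed as
$$\vec L = \frac{1}{4\pi} \int \big(\vec E \times \vec A - A_i\, (\vec r \times \vec\nabla) E_i\big)\, d^3x . \tag{6.93}$$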
The two terms in (6.93) can be interpreted as follows. The second term can be viewed as an “orbital angular momentum,” since it clearly depends on the choice of origin. It is rather analogous to an $\vec r \times \vec p$ contribution to the angular momentum of a system of particles. On the other hand, the first term in (6.93) can be viewed as an “intrinsic spin” term, since it is constructed purely from the electromagnetic fields themselves, and is independent of the choice of origin. We shall calculate this spin contribution,
$$\vec L_{\rm spin} = \frac{1}{4\pi} \int \vec E \times \vec A\, d^3x \tag{6.94}$$
to the angular momentum in the case of the sum over circularly-polarised waves that we introduced in the previous section. Recall that for this sum, the 3-vector potential is given by
$$\vec A = \sum_{\lambda'=\pm} \int \frac{d^3\vec k\,'}{(2\pi)^3} \Big[\vec\epsilon_{\lambda'}(\vec k\,')\, a_{\lambda'}(\vec k\,')\, e^{i(\vec k\,' \cdot \vec r - \omega' t)} + \text{c.c.}\Big] . \tag{6.95}$$
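The electric field is then given by
$$\vec E = -\frac{\partial \vec A}{\partial t} = \sum_{\lambda=\pm} \int \frac{d^3\vec k}{(2\pi)^3} \Big[i\, \omega\, \vec\epsilon_\lambda(\vec k)\, a_\lambda(\vec k)\, e^{i(\vec k \cdot \vec r - \omega t)} + \text{c.c.}\Big] . \tag{6.96}$$
Note that we have put primes on the summation and integration variables $\lambda$ and $\vec k$ in the expression for $\vec A$. This is so that we can take the product $\vec E \times \vec A$ and not have a clash of “dummy” summation variables, in what will follow below. We have also written the frequency as $\omega' \equiv |\vec k\,'|$ in the expression for $\vec A$.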
Our interest will be to calculate the time average
$$\langle \vec L_{\rm spin} \rangle \equiv \frac{1}{T} \int_0^T \vec L_{\rm spin}\, dt . \tag{6.97}$$
Since we are considering a wave solution with an entire “chorus” of frequencies now, we define the time average by taking $T$ to infinity. (It is easily seen that this coincides with the previous definition of the time average for a monochromatic wave of frequency $\omega$, where $T$ was taken to be $2\pi/\omega$.) Note that the time average will be zero for any quantity whose time dependence is of the oscillatory form $e^{i\nu t}$ with $\nu \ne 0$, because we would have
$$\frac{1}{T} \int_0^T e^{i\nu t}\, dt = \frac{1}{i\nu T}\, (e^{i\nu T} - 1) , \tag{6.98}$$
which clearly goes to zero as $T$ goes to infinity. Since the time dependence of all the quantities we shall consider is precisely of the form $e^{i\nu t}$, it follows that in order to survive the time averaging, it must be that $\nu = 0$.
We are interested in calculating the time average of $\vec E \times \vec A$, where $\vec A$ and $\vec E$ are given by (6.95) and (6.96). The quantities $\omega$ appearing there are, by definition, positive, since we have defined $\omega \equiv |\vec k|$. The only way that we shall get terms in $\vec E \times \vec A$ that have zero frequency (i.e. $\nu = 0$) is from the product of one of the terms that is explicitly written times one of the “c.c.” terms, since these, of course, have the opposite sign for their frequency dependence.
The upshot of this discussion is that when we evaluate the time average of $\vec E \times \vec A$, with $\vec A$ and $\vec E$ given by (6.95) and (6.96), the only terms that survive will be coming from the product of the explicitly-written terms for $\vec E$ times the “c.c.” term for $\vec A$, plus the “c.c.” term for $\vec E$ times the explicitly-written term for $\vec A$. Furthermore, in order for the products to have zero frequency, and therefore survive the time averaging, it must be that $\omega' = \omega$.
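We therefore find
$$\begin{aligned}
\langle \vec E \times \vec A \rangle = \sum_{\lambda\lambda'} \int \frac{d^3\vec k}{(2\pi)^3} \frac{d^3\vec k\,'}{(2\pi)^3}\, i\, \omega\, \Big[ & \vec\epsilon_\lambda(\vec k) \times \vec\epsilon^{\,*}_{\lambda'}(\vec k\,')\, a_\lambda(\vec k)\, a^*_{\lambda'}(\vec k\,')\, e^{i(\vec k - \vec k\,') \cdot \vec r} \\
& - \vec\epsilon^{\,*}_\lambda(\vec k) \times \vec\epsilon_{\lambda'}(\vec k\,')\, a^*_\lambda(\vec k)\, a_{\lambda'}(\vec k\,')\, e^{-i(\vec k - \vec k\,') \cdot \vec r} \Big] .
\end{aligned} \tag{6.99}$$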
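We now need to integrate $\langle \vec E \times \vec A \rangle$ over all 3-space, which we shall write as
$$\int \langle \vec E \times \vec A \rangle\, d^3\vec r . \tag{6.100}$$
We now make use of the result from the theory of delta functions that
$$\int e^{i(\vec k - \vec k\,') \cdot \vec r}\, d^3\vec r = (2\pi)^3\, \delta^3(\vec k - \vec k\,') . \tag{6.101}$$
Therefore, from (6.99) we find
$$\int \langle \vec E \times \vec A \rangle\, d^3\vec r = \sum_{\lambda\lambda'} \int \frac{d^3\vec k}{(2\pi)^3}\, i\, \omega\, \Big[ \vec\epsilon_\lambda(\vec k) \times \vec\epsilon^{\,*}_{\lambda'}(\vec k)\, a_\lambda(\vec k)\, a^*_{\lambda'}(\vec k) - \vec\epsilon^{\,*}_\lambda(\vec k) \times \vec\epsilon_{\lambda'}(\vec k)\, a^*_\lambda(\vec k)\, a_{\lambda'}(\vec k) \Big] . \tag{6.102}$$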
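Finally, we recall that the polarisation vectors $\vec\epsilon_\pm(\vec k)$ span the 2-dimensional space orthogonal to the wave-vector $\vec k$. In terms of the original real basis unit vectors $\vec e_1(\vec k)$ and $\vec e_2(\vec k)$ we have
$$\vec e_1(\vec k) \times \vec e_2(\vec k) = \frac{\vec k}{\omega} , \tag{6.103}$$
and so it follows from (6.84) that
$$\vec\epsilon_+(\vec k) \times \vec\epsilon^{\,*}_+(\vec k) = -\frac{i\, \vec k}{\omega} , \qquad \vec\epsilon_-(\vec k) \times \vec\epsilon^{\,*}_-(\vec k) = \frac{i\, \vec k}{\omega} . \tag{6.104}$$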
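The algebra of the circular polarisation vectors in (6.85) and (6.104) can be checked with explicit components. The sketch below assumes a wave-vector along the $z$ axis, with $\vec e_1$ and $\vec e_2$ along $x$ and $y$; the helper names are illustrative, and the dot product is deliberately taken without complex conjugation, as in (6.85).

```python
import math

def cross(u, v):
    """Ordinary 3-vector cross product, applied componentwise to complex vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Bilinear dot product (no complex conjugation)."""
    return sum(a * b for a, b in zip(u, v))

def conj(v):
    return tuple(complex(c).conjugate() for c in v)

e1 = (1.0, 0.0, 0.0)
e2 = (0.0, 1.0, 0.0)
k_hat = cross(e1, e2)            # unit vector k / omega, here (0, 0, 1), as in (6.103)

eps_plus = tuple((a + 1j * b) / math.sqrt(2.0) for a, b in zip(e1, e2))
eps_minus = tuple((a - 1j * b) / math.sqrt(2.0) for a, b in zip(e1, e2))

print(dot(eps_plus, eps_plus), dot(eps_minus, eps_minus), dot(eps_plus, eps_minus))
print(cross(eps_plus, conj(eps_plus)))    # expected to be -i * k_hat
print(cross(eps_minus, conj(eps_minus)))  # expected to be +i * k_hat
```

The opposite signs of the two cross products are what produce the relative minus sign between the $a_+$ and $a_-$ contributions to the spin.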
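From this, together with $\vec\epsilon_\pm \times \vec\epsilon^{\,*}_\mp = \vec\epsilon_\pm \times \vec\epsilon_\pm = 0$, it follows that (6.102) becomes
$$\int \langle \vec E \times \vec A \rangle\, d^3\vec r = 2 \int \frac{d^3\vec k}{(2\pi)^3}\, \vec k\, \big[a_+(\vec k)\, a^*_+(\vec k) - a_-(\vec k)\, a^*_-(\vec k)\big] , \tag{6.105}$$
and so we have
$$\langle \vec L_{\rm spin} \rangle = \frac{1}{2\pi} \int \frac{d^3\vec k}{(2\pi)^3}\, \vec k\, \big(|a_+(\vec k)|^2 - |a_-(\vec k)|^2\big) . \tag{6.106}$$
It can be seen from this result that the modes associated with the coefficients $a_+(\vec k)$ correspond to circularly-polarised waves of positive helicity; i.e. their spin is parallel to the wave-vector $\vec k$. Conversely, the modes with coefficients $a_-(\vec k)$ correspond to circularly-polarised waves of negative helicity; i.e. with spin that is anti-parallel to the wave-vector $\vec k$.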
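In a similar fashion, we may evaluate the energy of the general wave solution as a sum over the individual modes. The total energy $\mathcal{E}$ is given by
$$\mathcal{E} = \frac{1}{8\pi} \int (E^2 + B^2)\, d^3x = \frac{1}{4\pi} \int E^2\, d^3x . \tag{6.107}$$
(Recall that $E^2 = B^2$ here.) Since $\vec E = -\partial \vec A/\partial t$ here, we have
$$\begin{aligned}
\langle E^2 \rangle = \sum_{\lambda,\lambda'} \int \frac{d^3\vec k}{(2\pi)^3} \frac{d^3\vec k\,'}{(2\pi)^3}\, \omega^2\, \Big[ & \vec\epsilon_\lambda(\vec k) \cdot \vec\epsilon^{\,*}_{\lambda'}(\vec k\,')\, a_\lambda(\vec k)\, a^*_{\lambda'}(\vec k\,')\, e^{i(\vec k - \vec k\,') \cdot \vec r} \\
& + \vec\epsilon^{\,*}_\lambda(\vec k) \cdot \vec\epsilon_{\lambda'}(\vec k\,')\, a^*_\lambda(\vec k)\, a_{\lambda'}(\vec k\,')\, e^{-i(\vec k - \vec k\,') \cdot \vec r} \Big] ,
\end{aligned} \tag{6.108}$$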
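where again, the time-averaging has picked out only the terms whose total frequency adds to zero. The integration over all space then again gives a three-dimensional delta function $\delta^3(\vec k - \vec k\,')$, and so we find
$$\int \langle E^2 \rangle\, d^3\vec r = \sum_{\lambda,\lambda'} \int \frac{d^3\vec k}{(2\pi)^3}\, \omega^2\, \Big[ \vec\epsilon_\lambda(\vec k) \cdot \vec\epsilon^{\,*}_{\lambda'}(\vec k)\, a_\lambda(\vec k)\, a^*_{\lambda'}(\vec k) + \vec\epsilon^{\,*}_\lambda(\vec k) \cdot \vec\epsilon_{\lambda'}(\vec k)\, a^*_\lambda(\vec k)\, a_{\lambda'}(\vec k) \Big] . \tag{6.109}$$
Finally, using the orthogonality relations (6.85), and the conjugation identity $\vec\epsilon_\pm = \vec\epsilon^{\,*}_\mp$, we obtain
$$\langle \mathcal{E} \rangle = \frac{1}{2\pi} \int \frac{d^3\vec k}{(2\pi)^3}\, \omega^2\, \big(|a_+(\vec k)|^2 + |a_-(\vec k)|^2\big) . \tag{6.110}$$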
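From the two results (6.106) and (6.110), we see that for a given mode characterised by helicity $\lambda$ and wave-vector $\vec k$, we have
$$\begin{aligned}
\langle \vec L_{\rm spin} \rangle_{\vec k, \lambda} &= \frac{1}{2\pi}\, \vec k\, |a_\lambda(\vec k)|^2\, ({\rm sign}\, \lambda) , \\
\langle \mathcal{E} \rangle_{\vec k, \lambda} &= \frac{1}{2\pi}\, \omega^2\, |a_\lambda(\vec k)|^2 ,
\end{aligned} \tag{6.111}$$
where $({\rm sign}\, \lambda)$ is $+1$ for $\lambda = +$ and $-1$ for $\lambda = -$. The helicity $\sigma$, which is the component of spin along the direction of the wave-vector $\vec k$, is therefore given by
$$\begin{aligned}
\sigma &= \frac{1}{2\pi}\, |\vec k|\, |a_\lambda(\vec k)|^2\, ({\rm sign}\, \lambda) , \\
&= \frac{1}{2\pi}\, \omega\, |a_\lambda(\vec k)|^2\, ({\rm sign}\, \lambda) , \\
&= \frac{1}{\omega}\, \langle \mathcal{E} \rangle_{\vec k, \lambda}\, ({\rm sign}\, \lambda) .
\end{aligned} \tag{6.112}$$
In other words, we have that
$$\text{energy} = \pm (\text{helicity})\, \omega , \tag{6.113}$$
and so we can write
$$\mathcal{E} = |\sigma|\, \omega . \tag{6.114}$$
This can be compared with the result in quantum mechanics, that
$$E = \hbar\, \omega . \tag{6.115}$$
Planck’s constant $\hbar$ has the units of angular momentum, and in fact the basic “unit” of angular momentum for the photon is one unit of $\hbar$. In the transition from classical to quantum physics, the helicity of the electromagnetic field becomes the spin of the photon.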
6.6 Gauge invariance and electromagnetic fields
In the previous discussion, we described electromagnetic waves in terms of the gauge potential $A_\mu = (-\phi, \vec A)$, working in the gauge where $\phi = 0$, i.e. $A_0 = 0$. Since the gauge symmetry of Maxwell's equations is
$$A_\mu \longrightarrow A_\mu + \partial_\mu\lambda , \tag{6.116}$$
one might think that all the gauge freedom had been used up when we imposed the condition
φ = 0, on the grounds that one arbitrary function (the gauge parameter λ) has been used
in order to set one function (the scalar potential φ) to zero. This is, in fact, not the case.
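To see this, recall that for the electromagnetic wave we wrote $\vec A$ as a superposition of terms of the form
$$\vec A = \vec c\; e^{i(\vec k\cdot\vec r - \omega t)} , \tag{6.117}$$
which implied that
$$\vec E = -\frac{\partial\vec A}{\partial t} = i\,\omega\,\vec c\; e^{i(\vec k\cdot\vec r - \omega t)} . \tag{6.118}$$
From this we have
$$\vec\nabla\cdot\vec E = -\omega\,\vec k\cdot\vec c\; e^{i(\vec k\cdot\vec r - \omega t)} , \tag{6.119}$$
and so the Maxwell equation $\vec\nabla\cdot\vec E = 0$ implies that $\vec k\cdot\vec c = 0$, and hence
$$\vec k\cdot\vec A = 0 . \tag{6.120}$$
This means that as well as having $A_0 = -\phi = 0$, we also have a component of $\vec A$ vanishing, namely the projection along $\vec k$.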
To see how this can happen, it is helpful to go back to a Lorentz-covariant gauge choice instead. First, consider the Maxwell field equation, in the absence of source currents:
$$\partial^\mu F_{\mu\nu} = 0 . \tag{6.121}$$
Since $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, this implies
$$\partial^\mu\partial_\mu A_\nu - \partial^\mu\partial_\nu A_\mu = 0 . \tag{6.122}$$
We now choose the Lorentz gauge condition,
$$\partial^\mu A_\mu = 0 . \tag{6.123}$$
The field equation (6.122) then reduces to
$$\partial^\mu\partial_\mu A_\nu = 0 , \quad\text{i.e.}\quad \Box A_\mu = 0 . \tag{6.124}$$
One might again think that all the gauge symmetry had been "used up" in imposing the Lorentz gauge condition (6.123), on the grounds that the arbitrary function $\lambda$ in the gauge transformation
$$A_\mu \longrightarrow A_\mu + \partial_\mu\lambda \tag{6.125}$$
that allowed one to impose (6.123) would no longer allow any freedom to impose further conditions on $A_\mu$. This is not quite true, however.
To see this, let us suppose we are already in Lorentz gauge, and then try performing a further gauge transformation, as in (6.125), insisting that we must remain in the Lorentz gauge. This means that $\lambda$ should satisfy
$$\partial^\mu\partial_\mu\lambda = 0 , \quad\text{i.e.}\quad \Box\lambda = 0 . \tag{6.126}$$
Non-trivial such functions $\lambda$ can of course exist; any solution of the wave equation will work.
To see what this implies, let us begin with a general solution of the wave equation (6.124), working in the Lorentz gauge (6.123). We can decompose this solution as a sum over plane waves, where a typical mode in the sum is
$$A_\mu = a_\mu\, e^{i(\vec k\cdot\vec r - \omega t)} = a_\mu\, e^{i k_\nu x^\nu} \equiv a_\mu\, e^{i k\cdot x} , \tag{6.127}$$
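where $a_\mu$ and $k_\nu$ are constant. Substituting into the wave equation (6.124) we find
$$0 = \Box A_\mu = \partial^\sigma\partial_\sigma\big(a_\mu\, e^{i k_\nu x^\nu}\big) = -k^\sigma k_\sigma\, a_\mu\, e^{i k_\nu x^\nu} , \tag{6.128}$$
whilst the Lorentz gauge condition (6.123) implies
$$0 = \partial^\mu A_\mu = \partial^\mu\big(a_\mu\, e^{i k_\nu x^\nu}\big) = i\,k^\mu a_\mu\, e^{i k_\nu x^\nu} . \tag{6.129}$$
In other words, $k_\mu$ and $a_\mu$ must satisfy
$$k^\mu k_\mu = 0 , \qquad k^\mu a_\mu = 0 . \tag{6.130}$$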
The first of these equations implies that $k_\mu$ is a null vector, as we had seen earlier. The second equation implies that 1 of the 4 independent components that a 4-vector $a_\mu$ generically has is restricted in this case, so that $a_\mu$ has only 3 independent components.
Now we perform the further gauge transformation $A_\mu \to A_\mu + \partial_\mu\lambda$, where, as discussed above, $\Box\lambda = 0$ so that we keep the gauge-transformed $A_\mu$ in Lorentz gauge. Specifically, we shall choose
$$\lambda = i\,h\, e^{i k_\nu x^\nu} , \tag{6.131}$$
where $h$ is a constant. Thus we shall have
$$A_\mu \longrightarrow A_\mu - h\,k_\mu\, e^{i k_\nu x^\nu} . \tag{6.132}$$
With $A_\mu$ given by (6.127) this means we shall have
$$a_\mu\, e^{i k_\nu x^\nu} \longrightarrow a_\mu\, e^{i k_\nu x^\nu} - h\,k_\mu\, e^{i k_\nu x^\nu} , \tag{6.133}$$
which implies
$$a_\mu \longrightarrow a_\mu - h\,k_\mu . \tag{6.134}$$
As a check, we can see that the redefined $a_\mu$ indeed still satisfies $k^\mu a_\mu = 0$, as it should, since $k_\mu$ is a null vector.
The upshot of this discussion is that the freedom to take the constant $h$ to be anything we like allows us to place a second restriction on the components of $a_\mu$. Thus not merely are its ostensible 4 components reduced to 3 by virtue of $k^\mu a_\mu = 0$, but a further component can be eliminated by means of the residual gauge freedom, leaving just 2 independent components in the polarisation vector $a_\mu$. Since the physical degrees of freedom are, by definition, the independent quantities that cannot be changed by making gauge transformations, we see that there are 2 degrees of freedom in the electromagnetic wave, and not 3 as one might naively have supposed.
These 2 physical degrees of freedom can be organised as the $+$ and $-$ helicity states, just as we did in our earlier discussion. These are the circularly-polarised waves rotating anticlockwise and clockwise, respectively. In other words, these are the states whose spin is either parallel, or antiparallel, to the direction of propagation. One way of understanding why we have only 2, and not 3, allowed states is that the wave is travelling at the speed of light, and so it is not possible for it to have a helicity that projects other than fully parallel or antiparallel to its direction of propagation.
We can make contact with the $\phi = 0$ gauge choice that we made in our previous discussion of electromagnetic waves. Starting in Lorentz gauge, we make use of the residual gauge transformation (6.134) by choosing $h$ so that
$$a_0 - h\,k_0 = 0 , \quad\text{i.e.}\quad h = -\frac{a_0}{\omega} . \tag{6.135}$$
This means that after performing the residual gauge transformation we shall have
$$a_0 = 0 , \tag{6.136}$$
and so, from (6.127), we shall have
$$A_0 = 0 , \quad\text{i.e.}\quad \phi = 0 . \tag{6.137}$$
The original Lorentz gauge condition (6.123) then reduces to
$$\partial_i A_i = 0 , \quad\text{i.e.}\quad \vec\nabla\cdot\vec A = 0 . \tag{6.138}$$
This implies $\vec k\cdot\vec A = 0$, and so we have reproduced precisely the $\phi = 0$, $\vec k\cdot\vec A = 0$ gauge conditions that we used previously in our analysis of the general electromagnetic wave solutions. The choice $\phi = 0$ and $\vec\nabla\cdot\vec A = 0$ is known as Radiation Gauge.
In $D$ spacetime dimensions, the analogous result can easily be seen to be that the electromagnetic wave has $(D-2)$ degrees of freedom.
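The counting above can be checked numerically on a single mode. The following is a minimal sketch, assuming covariant components with signature $(-,+,+,+)$ and $c = 1$, so that $k_0 = -\omega$; the helper names `contract` and `to_radiation_gauge` and the sample numbers are illustrative choices, not part of the notes.

```python
# Illustrative sketch only: covariant components, signature (-,+,+,+), c = 1.
# The helper names below are hypothetical, not taken from the notes.

def contract(u, v):
    """Return u^mu v_mu for two 4-vectors given by their covariant components."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]

def to_radiation_gauge(a, k):
    """Shift a_mu -> a_mu - h k_mu, with h fixed by a_0 - h k_0 = 0 as in (6.135)."""
    h = a[0] / k[0]  # k_0 = -omega, so this equals -a_0 / omega
    return [a_mu - h * k_mu for a_mu, k_mu in zip(a, k)]

omega = 2.0
k = [-omega, 0.0, 0.0, omega]   # null wave vector along z
a = [0.5, 1.0, -0.3, -0.5]      # chosen to satisfy the Lorentz condition k^mu a_mu = 0

a_new = to_radiation_gauge(a, k)
```

For this sample mode the shifted polarisation vector has vanishing time component and vanishing component along $\vec k$, so only the two transverse entries survive, in line with the count of 2 physical degrees of freedom.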
6.7 Fourier decomposition of electrostatic fields
We saw earlier in section 6.5 that an electromagnetic wave, expressed in the radiation gauge in terms of the 3-vector potential $\vec A$, could be decomposed into Fourier modes as in (6.86). For each mode $\vec A_{(\vec k,\lambda)}$ in the sum, we have $\vec\epsilon_\lambda(\vec k)\cdot\vec k = 0$, and so each mode of the electric field $\vec E_{(\vec k,\lambda)} = -\partial\vec A_{(\vec k,\lambda)}/\partial t$ satisfies the transversality condition
$$\vec k\cdot\vec E_{(\vec k,\lambda)} = 0 . \tag{6.139}$$
By contrast, an electrostatic field $\vec E$ is longitudinal. Consider, for example, a point charge at the origin, whose potential therefore satisfies
$$\nabla^2\phi = -4\pi e\,\delta^3(\vec r) . \tag{6.140}$$
We can express $\phi(\vec r)$ in terms of its Fourier transform $\Phi(\vec k)$ as
$$\phi(\vec r) = \int\frac{d^3k}{(2\pi)^3}\,\Phi(\vec k)\, e^{i\vec k\cdot\vec r} . \tag{6.141}$$
This is clearly a sum over zero-frequency waves, as one would expect since the fields are static.
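It follows from (6.141) that
$$\nabla^2\phi(\vec r) = -\int\frac{d^3k}{(2\pi)^3}\;\vec k^{\,2}\,\Phi(\vec k)\, e^{i\vec k\cdot\vec r} . \tag{6.142}$$
We also note that the delta-function in (6.140) can be written as
$$\delta^3(\vec r) = \int\frac{d^3k}{(2\pi)^3}\, e^{i\vec k\cdot\vec r} . \tag{6.143}$$
It follows that if we substitute (6.141) into (6.140) we shall obtain $-\vec k^{\,2}\,\Phi(\vec k) = -4\pi e$, and hence
$$\Phi(\vec k) = \frac{4\pi e}{k^2} . \tag{6.144}$$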
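The electric field is given by $\vec E = -\vec\nabla\phi$, and so
$$\vec E = -i\int\frac{d^3k}{(2\pi)^3}\;\vec k\,\Phi(\vec k)\, e^{i\vec k\cdot\vec r} . \tag{6.145}$$
If we define $\vec G(\vec k)$ to be the Fourier transform of $\vec E$, so that
$$\vec E(\vec r) = \int\frac{d^3k}{(2\pi)^3}\;\vec G(\vec k)\, e^{i\vec k\cdot\vec r} , \tag{6.146}$$
then we see that
$$\vec G(\vec k) = -i\,\vec k\,\Phi(\vec k) = -\frac{4\pi i\,e}{k^2}\,\vec k . \tag{6.147}$$
Thus we see that $\vec G(\vec k)$ is parallel to $\vec k$, which proves that the electrostatic field is longitudinal.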
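As a small numerical illustration of the longitudinal character of (6.147), the sketch below evaluates $\vec G(\vec k)$ for one sample wavevector and checks that $\vec k\times\vec G$ vanishes while $\vec k\cdot\vec G = -4\pi i\,e$. The function names and the sample values are assumptions made for the illustration.

```python
import math

def coulomb_transform(k, e=1.0):
    """Fourier transform G(k) of the Coulomb field, following (6.147)."""
    k_squared = sum(c * c for c in k)
    return [(-4j * math.pi * e / k_squared) * c for c in k]

def cross(u, v):
    """Ordinary 3-vector cross product (works for complex components too)."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

k = [0.3, -1.2, 2.0]            # arbitrary sample wavevector
G = coulomb_transform(k)

transverse_part = cross(k, G)   # vanishes when G is parallel to k
longitudinal_part = sum(ki * gi for ki, gi in zip(k, G))  # equals -4 pi i e
```

This is the opposite situation to the radiation modes of (6.139), where it is the scalar product with $\vec k$ that vanishes.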
6.8 Waveguides
For our purposes, we shall define a waveguide to be a hollow, perfectly conducting, cylinder, essentially of infinite length. For convenience we shall take the axis of the cylinder to lie along the $z$ direction. The cross-section of the cylinder, in the $(x, y)$ plane, can for now be arbitrary, but it is the same for all values of $z$. Thus, the cross-section through the cylinder is a closed curve.
We shall consider an electromagnetic wave propagating down the cylinder, with angular frequency $\omega$. It will therefore have $z$ and $t$ dependence of the form
$$e^{i(kz - \omega t)} . \tag{6.148}$$
Note that $k$ and $\omega$ will not in general be equal; i.e., the wave will not propagate at the speed of light.
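The assumed form of the time dependence in (6.148) implies that, from the source-free Maxwell equations (which hold inside the waveguide), we shall have
$$\vec\nabla\cdot\vec E = 0 , \qquad \vec\nabla\times\vec E = i\,\omega\,\vec B , \qquad \vec\nabla\cdot\vec B = 0 , \qquad \vec\nabla\times\vec B = -i\,\omega\,\vec E . \tag{6.149}$$
Because of the assumed form of the $z$ dependence in (6.148), we may write
$$\vec E(x, y, z, t) = \vec E(x, y)\, e^{i(kz - \omega t)} , \qquad \vec B(x, y, z, t) = \vec B(x, y)\, e^{i(kz - \omega t)} . \tag{6.150}$$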
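It is convenient also to define certain transverse quantities, as follows:
$$\vec\nabla_\perp \equiv \Big(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, 0\Big) , \qquad \vec E \equiv (\vec E_\perp, E_z) , \qquad \vec B \equiv (\vec B_\perp, B_z) . \tag{6.151}$$
From (6.149), the Maxwell equations become
$$\begin{aligned}
\vec\nabla_\perp\cdot\vec E_\perp &= -i\,k\,E_z , & \vec\nabla_\perp\cdot\vec B_\perp &= -i\,k\,B_z , \\
i\,k\,\vec E_\perp + i\,\omega\,\vec m\times\vec B_\perp &= \vec\nabla_\perp E_z , & \vec m\cdot(\vec\nabla_\perp\times\vec E_\perp) &= i\,\omega\,B_z , \\
i\,k\,\vec B_\perp - i\,\omega\,\vec m\times\vec E_\perp &= \vec\nabla_\perp B_z , & \vec m\cdot(\vec\nabla_\perp\times\vec B_\perp) &= -i\,\omega\,E_z ,
\end{aligned} \tag{6.152}$$
where we have defined the unit vector $\vec m$ along the $z$ direction (the axis of the waveguide):
$$\vec m = (0, 0, 1) . \tag{6.153}$$
Note that the cross product of any pair of transverse vectors, $\vec U_\perp\times\vec V_\perp$, lies purely in the $z$ direction, i.e. parallel to $\vec m$.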
6.8.1 TEM modes
There are various types of modes that can be considered. First, we may dispose of an "uninteresting" possibility, called TEM modes. The acronym stands for "transverse electric and magnetic," meaning that
$$E_z = 0 , \qquad B_z = 0 . \tag{6.154}$$
From the equations in (6.152) for $\vec E_\perp$, we see that
$$\vec\nabla_\perp\cdot\vec E_\perp = 0 , \qquad \vec\nabla_\perp\times\vec E_\perp = 0 . \tag{6.155}$$
These are the equations for electrostatics in the 2-dimensional $(x, y)$ plane. The second equation implies we can write $\vec E_\perp = -\vec\nabla_\perp\phi$, and then the first equation implies that the electrostatic potential $\phi$ satisfies the 2-dimensional Laplace equation
$$\nabla_\perp^2\phi = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = 0 . \tag{6.156}$$
Since the cross-section of the waveguide in the $(x, y)$ plane is a closed curve, at a fixed potential (since it is a conductor), we can deduce that $\phi$ is constant everywhere inside the waveguide:
$$0 = \int dx\,dy\;\phi\,\nabla_\perp^2\phi = -\int dx\,dy\;|\vec\nabla_\perp\phi|^2 , \tag{6.157}$$
which implies $\vec\nabla_\perp\phi = 0$ inside the waveguide, and hence $\phi = \text{constant}$ and so $\vec E = 0$. Similar considerations imply $\vec B = 0$ for the TEM mode also. [footnote 12]
6.8.2 TE and TM modes
In order to have non-trivial modes propagating in the waveguide, we must relax the TEM assumption. There are two basic types of non-trivial modes we may consider, where either $\vec E$ or $\vec B$ (but not both) is taken to be transverse. These are called TE modes and TM modes respectively.
To analyse these modes, we first need to consider the boundary conditions at the conducting surface of the cylinder. The component of $\vec E$ parallel to the surface must vanish (seen by integrating $\vec E$ around a loop comprising a line segment just inside the waveguide, and closed by a line segment just inside the conductor, where $\vec E = 0$ by definition). Then, if we define $\vec n$ to be the unit normal vector at the surface, we may say that $\vec n\times\vec E = 0$. Next, taking the scalar product of $\vec n$ with the $\vec\nabla\times\vec E = i\,\omega\,\vec B$ Maxwell equation, we get
$$i\,\omega\,\vec n\cdot\vec B = \vec n\cdot(\vec\nabla\times\vec E) = -\vec\nabla\cdot(\vec n\times\vec E) = 0 . \tag{6.158}$$
Thus, we have
$$\vec n\times\vec E = 0 , \qquad \vec n\cdot\vec B = 0 \tag{6.159}$$
on the surface of the waveguide. We may restate these boundary conditions as
$$E_z\Big|_S = 0 , \qquad \vec n\cdot\vec B_\perp\Big|_S = 0 , \tag{6.160}$$
where $S$ denotes the surface of the cylindrical waveguide.
[Footnote 12] If the waveguide were replaced by coaxial conducting cylinders then TEM modes could exist in the gap between the inner and outer cylinders, since the potentials on the two cylinders need not be equal.
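The two boundary conditions above imply also that
$$\vec n\cdot\vec\nabla_\perp B_z\Big|_S = 0 . \tag{6.161}$$
This follows by taking the scalar product of $\vec n$ with the penultimate equation in (6.152):
$$\begin{aligned}
\vec n\cdot\vec\nabla_\perp B_z &= i\,k\,\vec n\cdot\vec B_\perp - i\,\omega\,\vec n\cdot(\vec m\times\vec E_\perp) \\
&= i\,k\,\vec n\cdot\vec B_\perp + i\,\omega\,\vec m\cdot(\vec n\times\vec E_\perp) ,
\end{aligned} \tag{6.162}$$
and then restricting to the surface $S$ of the cylinder. The condition (6.161) may be rewritten as
$$\frac{\partial B_z}{\partial n}\Big|_S = 0 , \tag{6.163}$$
where $\partial/\partial n \equiv \vec n\cdot\vec\nabla$ is the normal derivative.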
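With the assumption (6.148), the wave equations for $\vec E$ and $\vec B$ become
$$\nabla_\perp^2\vec E + (\omega^2 - k^2)\,\vec E = 0 , \qquad \nabla_\perp^2\vec B + (\omega^2 - k^2)\,\vec B = 0 , \tag{6.164}$$
where $\nabla_\perp^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2$ is the 2-dimensional Laplacian. The third and fifth equations in (6.152) become, in terms of components,
$$\begin{aligned}
i\,k\,E_x - i\,\omega\,B_y &= \partial_x E_z , & i\,k\,E_y + i\,\omega\,B_x &= \partial_y E_z , \\
i\,k\,B_x + i\,\omega\,E_y &= \partial_x B_z , & i\,k\,B_y - i\,\omega\,E_x &= \partial_y B_z .
\end{aligned} \tag{6.165}$$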
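These can be solved for $E_x$, $E_y$, $B_x$ and $B_y$ in terms of $E_z$ and $B_z$, giving
$$\begin{aligned}
E_x &= \frac{i}{\omega^2 - k^2}\,\big(\omega\,\partial_y B_z + k\,\partial_x E_z\big) , \\
E_y &= \frac{i}{\omega^2 - k^2}\,\big(-\omega\,\partial_x B_z + k\,\partial_y E_z\big) , \\
B_x &= \frac{i}{\omega^2 - k^2}\,\big(-\omega\,\partial_y E_z + k\,\partial_x B_z\big) , \\
B_y &= \frac{i}{\omega^2 - k^2}\,\big(\omega\,\partial_x E_z + k\,\partial_y B_z\big) .
\end{aligned} \tag{6.166}$$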
This means that we can concentrate on solving for $E_z$ and $B_z$; after having done so, substitution into (6.166) gives the expressions for $E_x$, $E_y$, $B_x$ and $B_y$.
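The algebra leading from (6.165) to (6.166) is easy to get wrong by a sign, so a quick numerical substitution is a useful cross-check. The sketch below is only an illustration: it picks arbitrary complex values for the transverse derivatives of $E_z$ and $B_z$ at one point, builds the transverse fields from (6.166), and evaluates the residuals of the four equations in (6.165). The function names and sample values are hypothetical.

```python
def transverse_fields(omega, k, dEz, dBz):
    """Evaluate (6.166). dEz = (d_x E_z, d_y E_z) and dBz = (d_x B_z, d_y B_z)."""
    pref = 1j / (omega**2 - k**2)
    Ex = pref * (omega * dBz[1] + k * dEz[0])
    Ey = pref * (-omega * dBz[0] + k * dEz[1])
    Bx = pref * (-omega * dEz[1] + k * dBz[0])
    By = pref * (omega * dEz[0] + k * dBz[1])
    return Ex, Ey, Bx, By

def residuals_of_6_165(omega, k, dEz, dBz):
    """Left-hand side minus right-hand side for each equation in (6.165)."""
    Ex, Ey, Bx, By = transverse_fields(omega, k, dEz, dBz)
    return [
        1j * k * Ex - 1j * omega * By - dEz[0],
        1j * k * Ey + 1j * omega * Bx - dEz[1],
        1j * k * Bx + 1j * omega * Ey - dBz[0],
        1j * k * By - 1j * omega * Ex - dBz[1],
    ]

# Arbitrary sample values at a single point (omega != k is required).
sample = residuals_of_6_165(3.0, 1.7, (0.4 - 0.2j, 1.1 + 0.5j), (-0.8 + 0.3j, 0.25j))
```

All four residuals vanish to rounding error, which confirms the signs in (6.166) for generic $\omega \neq k$.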
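As mentioned earlier, we can now distinguish two different categories of wave solution in the waveguide. These are
$$\text{TE waves:}\qquad E_z = 0 , \quad\text{and}\quad \frac{\partial B_z}{\partial n}\Big|_S = 0 , \qquad \vec B_\perp = \frac{i\,k}{\omega^2 - k^2}\,\vec\nabla B_z , \qquad \vec E = -\frac{\omega}{k}\,\vec m\times\vec B_\perp , \tag{6.167}$$
$$\text{TM waves:}\qquad B_z = 0 , \quad\text{and}\quad E_z\Big|_S = 0 , \qquad \vec E_\perp = \frac{i\,k}{\omega^2 - k^2}\,\vec\nabla E_z , \qquad \vec B = \frac{\omega}{k}\,\vec m\times\vec E_\perp . \tag{6.168}$$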
Note that the vanishing of $E_z$ or $B_z$ in the two cases means that this field component vanishes everywhere inside the waveguide, and not just on the cylindrical conductor. Note also that the second condition in each case is just the residual content of the boundary conditions in (6.160) and (6.161), after having imposed the transversality condition $E_z = 0$ or $B_z = 0$ respectively. The second line in each of the TE and TM cases gives the results from (6.166), written now in a slightly more compact way. In each case, the basic wave solution is given by solving the 2-dimensional Helmholtz equation
$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \Omega^2\,\psi = 0 , \tag{6.169}$$
where
$$\Omega^2 \equiv \omega^2 - k^2 , \tag{6.170}$$
and $\psi$ is equal to $B_z$ or $E_z$ in the case of TE or TM waves respectively. We also have the boundary conditions:
$$\text{TE waves:}\qquad \frac{\partial\psi}{\partial n}\Big|_S = 0 , \tag{6.171}$$
$$\text{TM waves:}\qquad \psi\Big|_S = 0 . \tag{6.172}$$
Equation (6.169), together with the boundary condition (6.171) or (6.172), defines an eigenfunction/eigenvalue problem. Since the cross-section of the waveguide is a closed loop in the $(x, y)$ plane, the equation (6.169) is to be solved in a compact closed region, and so the eigenvalue spectrum for $\Omega^2$ will be discrete; there will be a semi-infinite set of eigenvalues, unbounded above, discretely separated from each other.
Consider, as an example, TM waves propagating down a waveguide with rectangular cross-section:
$$0 \le x \le a , \qquad 0 \le y \le b . \tag{6.173}$$
For TM waves, we must satisfy the boundary condition that $\psi$ vanishes on the edges of the rectangle. It follows from an elementary calculation, in which one separates variables in (6.169) by writing $\psi(x, y) = X(x)\,Y(y)$, that the eigenfunctions and eigenvalues, labelled by integers $(m, n)$, are given by [footnote 13]
$$\psi_{mn} = e_{mn}\,\sin\Big(\frac{m\pi x}{a}\Big)\,\sin\Big(\frac{n\pi y}{b}\Big) , \qquad \Omega^2_{mn} = \frac{m^2\pi^2}{a^2} + \frac{n^2\pi^2}{b^2} . \tag{6.174}$$
The wavenumber $k$ and the angular frequency $\omega$ for the $(m, n)$ mode are then related by
$$k^2 = \omega^2 - \Omega^2_{mn} . \tag{6.175}$$
Notice that this means there is a minimum frequency $\omega_{\min} = \Omega_{mn}$ at which a wave can propagate down the waveguide in the $(m, n)$ mode. If one tried to transmit a lower-frequency wave in this mode, it would have imaginary wavenumber, and so from (6.150) it would die off exponentially with $z$. This is called an evanescent wave.
The absolute lowest bound on the angular frequency of a TM wave that can propagate down the waveguide is clearly given by $\Omega_{1,1}$, since $\psi_{mn}$ vanishes identically if either $m$ or $n$ is zero. In other words, the lowest angular frequency of TM wave that can propagate down the rectangular waveguide is given by
$$\omega_{\min} = \pi\,\sqrt{\frac{1}{a^2} + \frac{1}{b^2}} . \tag{6.176}$$
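To make the cutoff behaviour concrete, here is a minimal sketch in natural units ($c = 1$) that evaluates $\Omega_{mn}$ from (6.174) and the wavenumber from (6.175) for a rectangular guide. The function names and the choice $a = b = 1$ are illustrative assumptions. Below cutoff the wavenumber comes out purely imaginary, which is the evanescent case.

```python
import cmath
import math

def cutoff_frequency(m, n, a, b):
    """Omega_mn for the TM (m, n) mode of an a-by-b rectangular guide, from (6.174)."""
    return math.pi * math.sqrt((m / a) ** 2 + (n / b) ** 2)

def axial_wavenumber(omega, m, n, a, b):
    """Solve k^2 = omega^2 - Omega_mn^2, (6.175); imaginary k signals an evanescent wave."""
    return cmath.sqrt(omega ** 2 - cutoff_frequency(m, n, a, b) ** 2)

omega_min = cutoff_frequency(1, 1, 1.0, 1.0)       # pi * sqrt(2), as in (6.176)
k_above = axial_wavenumber(5.0, 1, 1, 1.0, 1.0)    # real: the mode propagates
k_below = axial_wavenumber(3.0, 1, 1, 1.0, 1.0)    # imaginary: the mode decays with z
```

With an imaginary $k = i\kappa$, the factor $e^{i k z}$ in (6.150) becomes $e^{-\kappa z}$, so the amplitude falls by a factor $e$ over a distance $1/\kappa$.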
In view of the relation (6.170) between the angular frequency and the wavenumber, we see that the phase velocity $v_{ph}$ and the group velocity $v_{gr}$ are given by
$$v_{ph} = \frac{\omega}{k} = \Big(1 - \frac{\Omega^2}{\omega^2}\Big)^{-1/2} , \qquad v_{gr} = \frac{d\omega}{dk} = \Big(1 - \frac{\Omega^2}{\omega^2}\Big)^{1/2} . \tag{6.177}$$
Note that because of the particular form of the dispersion relation, i.e. the equation (6.170) relating $\omega$ to $k$, it is the case here that
$$v_{ph}\,v_{gr} = 1 . \tag{6.178}$$
We see that while the group velocity satisfies
$$v_{gr} \le 1 , \tag{6.179}$$
the phase velocity satisfies
$$v_{ph} \ge 1 . \tag{6.180}$$
There is nothing wrong with this, even though it means the phase velocity exceeds the speed of light, since nothing material, and no signal, is transferred faster than the speed of light. In fact, as we shall now verify, energy and information travel at the group velocity $v_{gr}$, which is always less than or equal to the speed of light.
[Footnote 13] If we were instead solving for TE modes, we would have the boundary condition $\partial\psi/\partial n = 0$ on the edges of the rectangle, rather than $\psi = 0$ on the edges. This would give different eigenfunctions, involving cosines rather than sines.
Note that the group velocity approaches the speed of light (from below) as ω goes
to inﬁnity. To be more precise, the group velocity approaches the speed of light as ω
becomes large compared to the eigenvalue Ω associated with the mode of propagation under
discussion. An example where this limit is (easily) approached is if you look through a length
of metal drainpipe. Electromagnetic waves in the visible spectrum have a frequency vastly
greater than the lowest TM or TE modes of the drainpipe, and they propagate through the
pipe as if it wasn’t there. The story would be diﬀerent if one tried to channel waves from
a microwave down the drainpipe.
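The behaviour of the two velocities in (6.177) as $\omega$ grows relative to $\Omega$ can be tabulated with a few lines of code. This is a minimal sketch in natural units ($c = 1$); the function names and the sample frequencies are illustrative assumptions.

```python
import math

def phase_velocity(omega, Omega):
    """v_ph = (1 - Omega^2/omega^2)^(-1/2), from (6.177); needs omega > Omega."""
    return 1.0 / math.sqrt(1.0 - (Omega / omega) ** 2)

def group_velocity(omega, Omega):
    """v_gr = (1 - Omega^2/omega^2)^(1/2), from (6.177); needs omega > Omega."""
    return math.sqrt(1.0 - (Omega / omega) ** 2)

Omega = 3.0
# (omega, v_ph, v_gr) for frequencies just above cutoff up to far above it
table = [(w, phase_velocity(w, Omega), group_velocity(w, Omega))
         for w in (3.1, 5.0, 30.0, 3000.0)]
```

Just above cutoff the group velocity is small and the phase velocity large; far above cutoff both approach 1, which is the drainpipe regime described above. In every row the product of the two velocities is 1, as in (6.178).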
Let us now investigate the flow of energy down the waveguide. This is obtained by working out the time average of the Poynting flux,
$$\langle\vec S\rangle = \frac{1}{8\pi}\,\vec E\times\vec B^{\,*} . \tag{6.181}$$
Note that here the fields $\vec E$ and $\vec B$ are taken to be complex, and we are using the result discussed earlier about taking time averages of quadratic products of the physical $\vec E$ and $\vec B$ fields.
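If we consider TM modes, then we shall have
$$\vec E_\perp = \frac{i\,k}{\Omega^2}\,\vec\nabla\psi , \qquad E_z = \psi , \qquad \vec B_\perp = \frac{\omega}{k}\,\vec m\times\vec E_\perp = \frac{i\,\omega}{\Omega^2}\,\vec m\times\vec\nabla\psi , \qquad B_z = 0 . \tag{6.182}$$
(Recall that $\vec m = (0, 0, 1)$.) Note that the expressions for $\vec E$ and $\vec B$ can be condensed down to
$$\vec E = \frac{i\,k}{\Omega^2}\,\vec\nabla\psi + \vec m\,\psi , \qquad \vec B = \frac{i\,\omega}{\Omega^2}\,\vec m\times\vec\nabla\psi . \tag{6.183}$$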
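We therefore have
$$\vec E\times\vec B^{\,*} = \Big(\frac{i\,k}{\Omega^2}\,\vec\nabla\psi + \vec m\,\psi\Big)\times\Big(-\frac{i\,\omega}{\Omega^2}\,\vec m\times\vec\nabla\psi^*\Big) . \tag{6.184}$$
Using the vector identity $\vec A\times(\vec B\times\vec C) = (\vec A\cdot\vec C)\,\vec B - (\vec A\cdot\vec B)\,\vec C$, we then find
$$\vec E\times\vec B^{\,*} = \frac{\omega k}{\Omega^4}\,(\vec\nabla\psi\cdot\vec\nabla\psi^*)\,\vec m + \frac{i\,\omega}{\Omega^2}\,\psi\,\vec\nabla\psi^* , \tag{6.185}$$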
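since $\vec m\cdot\vec\nabla\psi = 0$. Along the $z$ direction (i.e. along $\vec m$), we therefore have
$$\langle\vec S\rangle_z = \frac{\omega k}{8\pi\Omega^4}\,(\vec\nabla\psi\cdot\vec\nabla\psi^*) = \frac{\omega k}{8\pi\Omega^4}\,|\vec\nabla\psi|^2 . \tag{6.186}$$
(The second term in (6.185) describes the circulation of energy within the cross-sectional plane of the waveguide.)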
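The total transmitted power $P$ is obtained by integrating $\langle\vec S\rangle_z$ over the cross-sectional area $\Sigma$ of the waveguide. This gives
$$\begin{aligned}
P = \int_\Sigma dx\,dy\;\langle\vec S\rangle_z &= \frac{\omega k}{8\pi\Omega^4}\int_\Sigma dx\,dy\;\vec\nabla\psi^*\cdot\vec\nabla\psi \\
&= \frac{\omega k}{8\pi\Omega^4}\int_\Sigma dx\,dy\;\Big[\vec\nabla\cdot(\psi^*\,\vec\nabla\psi) - \psi^*\,\nabla^2\psi\Big] \\
&= \frac{\omega k}{8\pi\Omega^4}\oint_C \psi^*\,\frac{\partial\psi}{\partial n}\,d\ell - \frac{\omega k}{8\pi\Omega^4}\int_\Sigma dx\,dy\;\psi^*\,\nabla^2\psi \\
&= -\frac{\omega k}{8\pi\Omega^4}\int_\Sigma dx\,dy\;\psi^*\,\nabla^2\psi = \frac{\omega k}{8\pi\Omega^2}\int_\Sigma dx\,dy\;\psi^*\psi ,
\end{aligned} \tag{6.187}$$
and so we have
$$P = \frac{\omega k}{8\pi\Omega^2}\int_\Sigma dx\,dy\;|\psi|^2 . \tag{6.188}$$
Note that in (6.187), the boundary term over the closed loop $C$ that forms the boundary of the waveguide in the $(x, y)$ plane gives zero because $\psi$ vanishes everywhere on the cylinder. The remaining term was then simplified by using (6.169).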
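We may also work out the total energy per unit length of the waveguide. The total time-averaged energy density is given by
$$\begin{aligned}
\langle W\rangle = \frac{1}{8\pi}\,\vec E\cdot\vec E^{\,*} &= \frac{1}{8\pi}\,\Big(\frac{i\,k}{\Omega^2}\,\vec\nabla\psi + \vec m\,\psi\Big)\cdot\Big(-\frac{i\,k}{\Omega^2}\,\vec\nabla\psi^* + \vec m\,\psi^*\Big) \\
&= \frac{k^2}{8\pi\Omega^4}\,\vec\nabla\psi^*\cdot\vec\nabla\psi + \frac{1}{8\pi}\,\psi\psi^* .
\end{aligned} \tag{6.189}$$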
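The energy per unit length $U$ is then obtained by integrating $\langle W\rangle$ over the cross-sectional area, which gives
$$\begin{aligned}
U = \int_\Sigma dx\,dy\;\langle W\rangle &= \frac{k^2}{8\pi\Omega^4}\int_\Sigma dx\,dy\;\vec\nabla\psi^*\cdot\vec\nabla\psi + \frac{1}{8\pi}\int_\Sigma dx\,dy\;|\psi|^2 \\
&= \frac{k^2}{8\pi\Omega^2}\int_\Sigma dx\,dy\;|\psi|^2 + \frac{1}{8\pi}\int_\Sigma dx\,dy\;|\psi|^2 ,
\end{aligned} \tag{6.190}$$
where we have again integrated by parts in the first term, dropped the boundary term because $\psi$ vanishes on the cylinder, and used (6.169) to simplify the result. Since $k^2 + \Omega^2 = \omega^2$, we thus find
$$U = \frac{\omega^2}{8\pi\Omega^2}\int_\Sigma dx\,dy\;|\psi|^2 . \tag{6.191}$$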
Having obtained the expression (6.188) for the power $P$ passing through the waveguide, and the expression (6.191) for the energy per unit length in the waveguide, we may note that
$$P = \frac{k}{\omega}\,U = \frac{1}{v_{ph}}\,U = v_{gr}\,U . \tag{6.192}$$
This demonstrates that the energy flows down the waveguide at the group velocity $v_{gr}$.
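The relation (6.192) can be cross-checked numerically without using the integration by parts, by integrating $|\vec\nabla\psi|^2$ and $|\psi|^2$ directly over the cross-section. The sketch below does this with a midpoint rule for the rectangular TM mode (6.174), taking $e_{mn} = 1$ and $c = 1$; the function names, grid size and sample parameters are assumptions made for the illustration.

```python
import math

def tm_mode_integrals(m, n, a, b, steps=200):
    """Midpoint-rule integrals of |psi|^2 and |grad psi|^2 over the a-by-b rectangle."""
    hx, hy = a / steps, b / steps
    psi2 = grad2 = 0.0
    for i in range(steps):
        x = (i + 0.5) * hx
        sx, cx = math.sin(m * math.pi * x / a), math.cos(m * math.pi * x / a)
        for j in range(steps):
            y = (j + 0.5) * hy
            sy, cy = math.sin(n * math.pi * y / b), math.cos(n * math.pi * y / b)
            psi2 += (sx * sy) ** 2
            grad2 += (m * math.pi / a * cx * sy) ** 2 + (n * math.pi / b * sx * cy) ** 2
    return psi2 * hx * hy, grad2 * hx * hy

def power_and_energy(omega, m, n, a, b):
    """P from the first line of (6.187) and U from the first line of (6.190)."""
    Omega2 = math.pi ** 2 * ((m / a) ** 2 + (n / b) ** 2)
    k = math.sqrt(omega ** 2 - Omega2)          # assumes omega is above cutoff
    psi2, grad2 = tm_mode_integrals(m, n, a, b)
    P = omega * k / (8 * math.pi * Omega2 ** 2) * grad2
    U = k ** 2 / (8 * math.pi * Omega2 ** 2) * grad2 + psi2 / (8 * math.pi)
    return P, U, k

P, U, k = power_and_energy(5.0, 1, 1, 2.0, 1.0)
ratio = P / U   # should reproduce the group velocity k / omega
```

The ratio agrees with $k/\omega$, and the two integrals satisfy $\int|\vec\nabla\psi|^2 = \Omega^2\int|\psi|^2$, which is the content of the integration by parts used in (6.187) and (6.190).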
6.9 Resonant cavities
A resonant cavity is a hollow, closed conducting "container," inside which is an electromagnetic field. A simple example would be to take a length of waveguide of the sort we have considered in section 6.8, and turn it into a closed cavity by attaching conducting plates at each end of the cylinder. Let us suppose that the length of the cavity is $d$.
Consider, as an example, TM modes in the cavity. We solve the same 2-dimensional Helmholtz equation (6.169) as before,
$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \Omega^2\,\psi = 0 , \tag{6.193}$$
subject again to the TM boundary condition that $\psi$ must vanish on the surface of the cylinder. The $\vec E$ and $\vec B$ fields are given, as before, by
$$\vec E_\perp = \frac{i\,k}{\Omega^2}\, e^{i(kz - \omega t)}\,\vec\nabla\psi , \qquad E_z = \psi\, e^{i(kz - \omega t)} , \qquad \vec B_\perp = \frac{\omega}{k}\,\vec m\times\vec E_\perp , \tag{6.194}$$
where $\vec m = (0, 0, 1)$. Now, however, we have the additional boundary conditions that $\vec E_\perp$ must vanish on the two conducting plates, which we shall take to be at $z = 0$ and $z = d$. This is because the component of $\vec E$ parallel to a conductor must vanish at the conducting surface.
In order to arrange that $\vec E_\perp$ vanish, for all $t$, at $z = 0$ and $z = d$, it must be that there is a superposition of right-moving and left-moving waves. (These correspond to $z$ and $t$ dependences $e^{i(\pm kz - \omega t)}$ respectively.) Thus we need to take the combination that makes a standing wave,
$$\vec E_\perp = -\frac{k}{\Omega^2}\,\sin kz\; e^{-i\omega t}\,\vec\nabla\psi , \tag{6.195}$$
in order to have $\vec E_\perp = 0$ at $z = 0$. Furthermore, in order to have also that $\vec E_\perp = 0$ at $z = d$, it must be that the wavenumber $k$ is now quantised, according to
$$k = \frac{p\,\pi}{d} , \tag{6.196}$$
where $p$ is an integer. Note that we also have
$$E_z = \psi\,\cos kz\; e^{-i\omega t} . \tag{6.197}$$
Recall that in the waveguide, we had already found that $\Omega^2 \equiv \omega^2 - k^2$ was quantised, being restricted to a semi-infinite discrete set of eigenvalues for the 2-dimensional Helmholtz equation. In the waveguide, that still allowed $k$ and $\omega$ to take continuous values, subject to the constraint (dispersion relation)
$$\omega^2 = \Omega^2 + k^2 . \tag{6.198}$$
In the resonant cavity we now have the further restriction that $k$ is quantised, according to (6.196). This means that the spectrum of allowed frequencies $\omega$ is now discrete, and given by
$$\omega^2 = \Omega^2 + \frac{p^2\pi^2}{d^2} . \tag{6.199}$$
If, for example, we consider the previous example of TM modes in a rectangular waveguide whose cross-section has sides of lengths $a$ and $b$, but now with the added end-caps at $z = 0$ and $z = d$, then $\Omega^2$ is given by (6.174), and so the resonant frequencies in the cavity are given by
$$\omega^2 = \pi^2\,\Big(\frac{m^2}{a^2} + \frac{n^2}{b^2} + \frac{p^2}{d^2}\Big) , \tag{6.200}$$
for positive integers $(m, n, p)$.
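As an illustration of (6.200), the following sketch lists the lowest few resonant angular frequencies of a rectangular cavity in natural units ($c = 1$), restricting to positive integers as stated above. The function names and the sample dimensions are hypothetical choices.

```python
import math

def cavity_frequency(m, n, p, a, b, d):
    """Resonant angular frequency of the (m, n, p) mode, from (6.200)."""
    return math.pi * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (p / d) ** 2)

def lowest_modes(a, b, d, max_index=3, count=5):
    """Return the `count` lowest (omega, (m, n, p)) pairs with 1 <= m, n, p <= max_index."""
    modes = [(cavity_frequency(m, n, p, a, b, d), (m, n, p))
             for m in range(1, max_index + 1)
             for n in range(1, max_index + 1)
             for p in range(1, max_index + 1)]
    return sorted(modes)[:count]

spectrum = lowest_modes(a=2.0, b=1.0, d=3.0)
```

Because the cavity is longest along $z$ and shortest along $y$ in this example, the low-lying modes differ first in $p$ and $m$, while raising $n$ is the most costly.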
7 Fields Due to Moving Charges
7.1 Retarded potentials
If we solve the Bianchi identity by writing $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, the remaining Maxwell equation (i.e. the field equation)
$$\partial^\mu F_{\mu\nu} = -4\pi\,J_\nu \tag{7.1}$$
becomes
$$\partial^\mu\partial_\mu A_\nu - \partial^\mu\partial_\nu A_\mu = -4\pi\,J_\nu . \tag{7.2}$$
If we choose to work in the Lorentz gauge,
$$\partial^\mu A_\mu = 0 , \tag{7.3}$$
then (7.2) becomes simply
$$\Box A^\mu = -4\pi\,J^\mu . \tag{7.4}$$
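Since $A^\mu = (\phi, \vec A)$ and $J^\mu = (\rho, \vec J)$, this means we shall have
$$\Box\phi = -4\pi\,\rho , \qquad \Box\vec A = -4\pi\,\vec J , \tag{7.5}$$
or, in the three-dimensional language,
$$\nabla^2\phi - \frac{\partial^2\phi}{\partial t^2} = -4\pi\,\rho , \qquad \nabla^2\vec A - \frac{\partial^2\vec A}{\partial t^2} = -4\pi\,\vec J . \tag{7.6}$$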
In general, we can write the solutions to (7.6) as the sums of a particular integral of the inhomogeneous equation (i.e. the one with the source term on the right-hand side) plus the general solution of the homogeneous equation (the one with the right-hand side set to zero). Our interest now will be in finding the particular integral. Solving this problem in the case of static sources and fields will be very familiar from electrostatics and magnetostatics. Now, however, we wish to solve for the particular integral in the case where there is time dependence too. Consider the equation for $\phi$ first.
First consider the situation where there is just an infinitesimal amount of charge $de(t)$ in an infinitesimal volume. (We allow for it to be time dependent, in general.) Thus the charge density is
$$\rho = de(t)\,\delta^3(\vec R) , \tag{7.7}$$
where $\vec R$ is the position vector from the origin to the location of the infinitesimal charge. We therefore wish to solve
$$\nabla^2\phi - \frac{\partial^2\phi}{\partial t^2} = -4\pi\,de(t)\,\delta^3(\vec R) . \tag{7.8}$$
When $\vec R \neq 0$, we have simply $\nabla^2\phi - \partial^2\phi/\partial t^2 = 0$.
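Clearly, $\phi$ depends on $\vec R$ only through its magnitude $R \equiv |\vec R|$, and so $\phi = \phi(t, R)$. Now, with $\vec R = (x_1, x_2, x_3)$, we have $R^2 = x_i x_i$ and so $\partial_i R = x_i/R$. Consequently, we shall have
$$\partial_i\phi = \frac{x_i}{R}\,\phi' , \tag{7.9}$$
where $\phi' \equiv \partial\phi/\partial R$, and then
$$\nabla^2\phi = \partial_i\partial_i\phi = \phi'' + \frac{2}{R}\,\phi' . \tag{7.10}$$
Letting $\Phi = R\,\phi$, we have
$$\phi' = \frac{1}{R}\,\Phi' - \frac{1}{R^2}\,\Phi , \qquad \phi'' = \frac{1}{R}\,\Phi'' - \frac{2}{R^2}\,\Phi' + \frac{2}{R^3}\,\Phi . \tag{7.11}$$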
This means that for $\vec R \neq 0$, we shall have
$$\frac{\partial^2\Phi}{\partial R^2} - \frac{\partial^2\Phi}{\partial t^2} = 0 . \tag{7.12}$$
The general solution to this equation is
$$\Phi(t, R) = f_1(t - R) + f_2(t + R) , \tag{7.13}$$
where $f_1$ and $f_2$ are arbitrary functions.
The solution with $f_1$ is called the retarded solution, and the solution with $f_2$ is called the advanced solution. The reason for this terminology is that in the retarded solution, the "effect" occurs after the "cause," in the sense that the profile of the function $f_1$ propagates outwards from the origin where the charge $de(t)$ is located. By contrast, in the advanced solution the effect precedes the cause; the disturbance propagates inwards as time increases. The advanced solution is acausal, and therefore unphysical, and so we shall keep only the causal solution, i.e. the retarded solution. The upshot is that for $R \neq 0$, the solution is
$$\phi = \frac{1}{R}\,\Phi(t - R) . \tag{7.14}$$
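A quick finite-difference check makes it plausible that any profile of the form (7.14) solves the source-free wave equation away from $R = 0$. The sketch below is an illustration only: it takes the sample profile $\Phi = \sin(t - R)$, uses the radial Laplacian (7.10), and evaluates $\phi'' + (2/R)\,\phi' - \partial^2\phi/\partial t^2$ with central differences. The function names, step size and sample points are assumptions.

```python
import math

def phi(t, R, profile=math.sin):
    """Retarded spherical wave phi = Phi(t - R) / R, as in (7.14)."""
    return profile(t - R) / R

def wave_residual(t, R, h=1e-3):
    """Central-difference estimate of phi'' + (2/R) phi' - d^2 phi / dt^2."""
    d2_dR2 = (phi(t, R + h) - 2.0 * phi(t, R) + phi(t, R - h)) / h ** 2
    d_dR = (phi(t, R + h) - phi(t, R - h)) / (2.0 * h)
    d2_dt2 = (phi(t + h, R) - 2.0 * phi(t, R) + phi(t - h, R)) / h ** 2
    return d2_dR2 + 2.0 * d_dR / R - d2_dt2

residual = wave_residual(t=0.7, R=2.0)   # small, limited only by discretisation error
```

The same check with $\sin(t + R)$ in place of $\sin(t - R)$ also gives a small residual, which reflects the fact that the wave equation alone does not distinguish the retarded from the advanced solution; the choice is made on grounds of causality.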
We clearly expect that $\phi$ will go to infinity as $R$ approaches zero, since the charge (albeit infinitesimal) is located there. Consequently, it will be the case that the derivatives $\partial/\partial R$ will dominate over the time derivatives $\partial/\partial t$ near to $R = 0$, and so in that region we can write
$$\nabla^2\phi \approx -4\pi\,de(t)\,\delta^3(\vec R) . \tag{7.15}$$
This therefore has the usual solution that is familiar from electrostatics, namely
$$\phi \approx \frac{de(t)}{R} , \tag{7.16}$$
or, in other words,
$$\Phi \approx de(t) \tag{7.17}$$
near $R = 0$. Since $\Phi$ is already established to depend on $t$ and $R$ only through $\Phi = \Phi(t - R)$, we can therefore immediately write down the solution valid for all $R$, namely
$$\Phi(t - R) = de(t - R) . \tag{7.18}$$
From (7.14), we therefore have that
$$\phi(\vec R, t) = \frac{de(t - R)}{R} . \tag{7.19}$$
This solution is valid for the particular case of an infinitesimal charge $de(t)$ located at $R = 0$. For a general time-dependent charge distribution $\rho(\vec r, t)$, we just exploit the linearity of the Maxwell equations and sum up the contributions from all the charges in the distribution. This therefore gives
$$\phi(\vec r, t) = \int\frac{\rho(\vec r\,', t - R)}{R}\; d^3r' , \tag{7.20}$$
where $\vec R \equiv \vec r - \vec r\,'$ and $R = |\vec R|$. This solution of the inhomogeneous equation is the one that is "forced" by the source term, in the sense that it vanishes if the source charge density $\rho$ vanishes. The general solution is given by this particular integral plus an arbitrary solution of the homogeneous equation $\Box\phi = 0$. The solution (7.20) can be written as
$$\phi(\vec r, t) = \int\frac{\rho(\vec r\,', t - |\vec r - \vec r\,'|)}{|\vec r - \vec r\,'|}\; d^3r' . \tag{7.21}$$
In an identical fashion, we can see that the solution for the 3-vector potential $\vec A$ in the presence of a 3-vector current source $\vec J(\vec r, t)$ will be
$$\vec A(\vec r, t) = \int\frac{\vec J(\vec r\,', t - |\vec r - \vec r\,'|)}{|\vec r - \vec r\,'|}\; d^3r' . \tag{7.22}$$
The solutions for $\phi(\vec r, t)$ and $\vec A(\vec r, t)$ that we have obtained here are called the Retarded Potentials. The analogous "advanced potentials" would correspond to having $t + |\vec r - \vec r\,'|$ instead of $t - |\vec r - \vec r\,'|$ as the time argument of the charge and current densities inside the integrals. It is clear that the retarded potentials are the physically sensible ones, in that the potentials at the present time $t$ depend upon the charge and current densities at times $\le t$. In the advanced potentials, by contrast, the potentials at the current time $t$ would be influenced by what the charge and current densities will be in the future. This would be unphysical, since it would violate causality.
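The superposition that leads to (7.21) can be sketched for a discrete set of charge elements, each contributing $de(t - R)/R$ as in (7.19). This is a minimal illustration in units with $c = 1$; the function name `retarded_potential` and the sample sources are hypothetical, and the time-dependent point elements are to be understood as building blocks of a distribution, as with $de(t)$ above.

```python
import math

def retarded_potential(r, t, sources):
    """Sum de(t - R) / R over point elements, a discrete version of (7.21).

    `sources` is a list of (position, charge_function) pairs, where
    charge_function(t) returns the charge of that element at time t.
    """
    total = 0.0
    for position, charge_of_time in sources:
        R = math.dist(r, position)            # |r - r'|
        total += charge_of_time(t - R) / R    # evaluated at the retarded time
    return total

sources = [((0.0, 0.0, 0.0), math.cos)]                       # one element, de(t) = cos t
value = retarded_potential((0.0, 0.0, 2.0), 2.0, sources)     # uses de(2 - 2) = cos 0

static = [((0.0, 0.0, 0.0), lambda t: 1.0)]                   # time-independent check
coulomb = retarded_potential((3.0, 4.0, 0.0), 10.0, static)   # reduces to 1/R = 1/5
```

Replacing `t - R` by `t + R` in the sum would give the corresponding advanced potential.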
Since the procedure by which we arrived at the retarded potential solutions (7.21) and (7.22) may have seemed slightly "unrigorous," it is perhaps worthwhile to go back and check that they are indeed correct. This can be done straightforwardly, simply by substituting them into the original wave equations (7.6). One finds that they do indeed yield exact solutions of the equations. We leave this as an exercise for the reader.
7.2 Lienard-Wiechert potentials
We now turn to a discussion of the electromagnetic fields produced by a point charge $e$ moving along an arbitrary path $\vec r = \vec r_0(t)$. We already considered a special case of this in section 5.3, where we worked out the fields produced by a charge in uniform motion (i.e. moving at constant velocity). In that case, we could work out the electromagnetic fields by using the trick of transforming to the Lorentz frame in which the particle was at rest, doing the very simple calculation of the fields in that frame, and then transforming back to the frame where the particle was in uniform motion.
Now, we are going to study the more general case where the particle can be accelerating; i.e., where its velocity is not uniform. This means that there does not exist an inertial frame in which the particle is at rest for all time, and so we cannot use the previous trick.
It is worth emphasising that even though the particle is accelerating, this does not mean that we cannot solve the problem using special relativity. The point is that we shall only ever study the fields from the viewpoint of an observer who is in an inertial frame, and so for this observer, the laws of special relativity apply. Only if we wanted to study the problem from the viewpoint of an observer in an accelerating frame, such as the rest-frame of the particle, would we need to use the laws of general relativity.
Note that although we cannot use special relativity to study the problem in the rest frame of the accelerating particle, we can, and sometimes will, make use of an instantaneous rest frame. This is an inertial frame whose velocity just happens to match exactly the velocity of the particle at a particular instant of time. Since the particle is accelerating, then a moment later the particle will no longer be at rest in this frame. We could, if we wished, then choose an "updated" instantaneous rest frame, and use special relativity to study the problem (for an instant) in the new inertial frame. We shall find it expedient at times to make use of the concept of an instantaneous rest frame, in order to simplify intermediate calculations. Ultimately, of course, we do not want to restrict ourselves to having to hop onto a new instantaneous rest frame every time we discuss the problem, and so the goal is to obtain results that are valid in any inertial frame.
Now, on with the problem. We can expect, on grounds of causality, that the electromagnetic fields at $(\vec r, t)$ will be determined by the position and state of motion of the particle at earlier times $t'$, as measured in the chosen inertial frame, for which the time of propagation of information from $\vec r_0(t')$, where the particle was at time $t'$, to $\vec r$ at the time $t$ is $t - t'$. It is useful therefore to define
$$\vec R(t') \equiv \vec r - \vec r_0(t') . \tag{7.23}$$
This is the radius vector from the location $\vec r_0(t')$ of the charge at the time $t'$ to the observation point $\vec r$. The time $t'$ is then determined by
$$t - t' = R(t') , \quad\text{where}\quad R(t') = |\vec R(t')| . \tag{7.24}$$
There is one solution for $t'$, for each choice of $t$.
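Equation (7.24) is an implicit equation for $t'$, and for a general trajectory it has to be solved numerically. The sketch below uses bisection, relying on the fact that $g(t') = t - t' - R(t')$ is strictly decreasing in $t'$ whenever the speed of the charge is less than 1 (in units with $c = 1$). The function name `retarded_time` and the sample trajectory are illustrative assumptions.

```python
import math

def retarded_time(r, t, trajectory, iterations=200):
    """Solve t - t' = |r - r0(t')| for t' by bisection; needs |velocity| < 1."""
    def g(tp):
        return (t - tp) - math.dist(r, trajectory(tp))

    hi = t                      # g(t) = -R(t) <= 0
    step = 1.0
    lo = t - step
    while g(lo) < 0.0:          # walk back in time until the root is bracketed
        step *= 2.0
        lo = t - step
    for _ in range(iterations): # g is decreasing, so keep the sign change inside [lo, hi]
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def uniform_motion(tp):
    """Sample path: charge moving along x with speed 0.6."""
    return (0.6 * tp, 0.0, 0.0)

t_ret = retarded_time((0.0, 3.0, 0.0), 0.0, uniform_motion)
```

For this sample path the equation can also be solved by hand: $-t' = \sqrt{0.36\,t'^2 + 9}$ gives $t' = -3.75$, which the bisection reproduces.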
In the Lorentz frame where the particle is at rest at $t'$, the potential at time $t$ will be given by
$$\phi = \frac{e}{R(t')} , \qquad \vec A = 0 . \tag{7.25}$$
We can determine the 4-vector potential $A^\mu$ in an arbitrary Lorentz frame simply by inventing a 4-vector expression that reduces to (7.25) under the specialisation that the velocity $\vec v \equiv d\vec r_0(t')/dt'$ of the charge is zero at time $t'$.
Let the 4-velocity of the charge, in the observer's inertial frame, be $U^\mu$. If the charge is at rest, its 4-velocity will be
$$U^\mu = (1, \vec 0) . \tag{7.26}$$
Thus to write a 4-vector expression for $A^\mu = (\phi, \vec A)$ that reduces to (7.25) if $U^\mu$ is given by (7.26), we just have to find a scalar $f$ such that
$$A^\mu = f\,U^\mu , \tag{7.27}$$
with $f$ becoming $e/R(t')$ in the special case. Let us define the 4-vector
$$R^\mu = \big(t - t', \vec r - \vec r_0(t')\big) = \big(t - t', \vec R(t')\big) . \tag{7.28}$$
(This is clearly a 4-vector, because $(t, \vec r)$ is a 4-vector, and $(t', \vec r_0(t'))$, the spacetime coordinates of the particle, is a 4-vector.) Then, we can write $f$ as the scalar
$$f = \frac{e}{(-U^\nu R_\nu)} , \tag{7.29}$$
since clearly if $U^\mu$ is given by (7.26), we shall have $-U^\nu R_\nu = -R_0 = R^0 = t - t' = R(t')$.
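Having written $A^\mu$ as a 4-vector expression that reduces to (7.25) under the specialisation (7.26), we know that it must be the correct expression in any Lorentz frame. Now, we have
$$U^\mu = (\gamma, \gamma\,\vec v) , \quad\text{where}\quad \gamma = \frac{1}{\sqrt{1 - v^2}} , \tag{7.30}$$
and so we see that
$$\begin{aligned}
\phi(\vec r, t) = A^0 &= \frac{e\,\gamma}{(t - t')\,\gamma - \gamma\,\vec v\cdot\vec R} = \frac{e}{t - t' - \vec v\cdot\vec R} = \frac{e}{R - \vec v\cdot\vec R} , \\
\vec A(\vec r, t) &= \frac{e\,\gamma\,\vec v}{(t - t')\,\gamma - \gamma\,\vec v\cdot\vec R} = \frac{e\,\vec v}{R - \vec v\cdot\vec R} .
\end{aligned} \tag{7.31}$$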
To summarise, we have concluded that the gauge potentials for a charge $e$ moving along
the path $\vec r = \vec r_0(t')$, as seen from the point $\vec r$ at time $t$, are given by
$$\phi(\vec r, t) = \frac{e}{R - \vec v\cdot\vec R} , \qquad
  \vec A(\vec r, t) = \frac{e\,\vec v}{R - \vec v\cdot\vec R} , \tag{7.32}$$
where all quantities on the right-hand sides are evaluated at the time $t'$, i.e. $\vec R$ means
$\vec R(t')$ and $\vec v$ means $d\vec r_0(t')/dt'$, with
$$\vec R(t') = \vec r - \vec r_0(t') , \tag{7.33}$$
and $t'$ is determined by solving the equation
$$R(t') = t - t' , \qquad \text{where}\quad R(t') \equiv |\vec R(t')| . \tag{7.34}$$
These potentials are known as the Liénard-Wiechert potentials.
The next step will be to calculate the electric and magnetic fields from the Liénard-Wiechert
potentials. However, before doing so, it is perhaps worthwhile to pause and give
an alternative derivation of the result for the potentials. People's taste in what constitutes
a satisfying proof of a result can differ, but I have to say that I personally find the derivation
above rather unsatisfying. I would regard it as a bit of a “hand-waving argument,” which one
maybe would use after having first given a “proper” derivation, in order to try to give a
“physical picture” of what is going on. The basic premise of the derivation above is that the
potentials “here and now” will be given precisely by applying Coulomb's law at the position
the particle was in “a light-travel time” ago. I find it far from obvious that this should give
the right answer. It is in fact very interesting that this does give the right answer, but
I would view this as a remarkable fact that emerges after one has first given a “proper”
derivation of the result, rather than as a solid derivation in its own right.
A “proper” derivation of the Liénard-Wiechert potentials can be given as follows. We
take as the starting point the expressions (7.21) and (7.22) for the retarded potentials due to
a time-dependent charge and current source. These expressions can themselves be regarded
as solid and rigorous, since one only has to verify by direct substitution into (7.6) that they
are indeed correct. Consider first the retarded potential for $\phi$, given in (7.21). We can
rewrite this as a 4-dimensional integral by introducing a delta-function in the time variable,
so that
$$\phi(\vec r, t) = \int\!\!\int \frac{\rho(\vec r\,', t'')}{|\vec r - \vec r\,'|}\,
  \delta(t'' - t + |\vec r - \vec r\,'|)\, dt''\, d^3\vec r\,' . \tag{7.35}$$
The charge density for a point charge $e$ moving along the path $\vec r = \vec r_0(t)$ is given by
$$\rho(\vec r, t) = e\, \delta^3(\vec r - \vec r_0(t)) . \tag{7.36}$$
This means that we shall have
$$\phi(\vec r, t) = \int\!\!\int \frac{e\, \delta^3(\vec r\,' - \vec r_0(t''))}{|\vec r - \vec r\,'|}\,
  \delta(t'' - t + |\vec r - \vec r\,'|)\, dt''\, d^3\vec r\,' , \tag{7.37}$$
and so after performing the spatial integrations we obtain
$$\phi(\vec r, t) = \int \frac{e}{|\vec r - \vec r_0(t'')|}\,
  \delta(t'' - t + |\vec r - \vec r_0(t'')|)\, dt'' . \tag{7.38}$$
To evaluate the time integral, we need to make use of a basic result about the Dirac
delta-function, namely that if a function $f(x)$ has a zero at $x = x_0$, then (see footnote 14)
$$\delta(f(x)) = \delta(x - x_0)\, \left|\frac{df}{dx}\right|^{-1} , \tag{7.39}$$
[Footnote 14] To prove this, consider the integral $I = \int dx\, h(x)\,\delta(f(x))$ for an
arbitrary function $h(x)$. Next, change
where $df/dx$ is evaluated at $x = x_0$. (The result given here is valid if $f(x)$ vanishes only at
the point $x = x_0$. If it vanishes at more than one point, then there will be a sum of terms
of the type given in (7.39).)
To evaluate (7.38), we note that
$$\begin{aligned}
\frac{\partial}{\partial t''}\Big(t'' - t + |\vec r - \vec r_0(t'')|\Big)
 &= 1 + \frac{\partial}{\partial t''}\Big[(\vec r - \vec r_0(t''))\cdot(\vec r - \vec r_0(t''))\Big]^{1/2} , \\
 &= 1 + \Big[(\vec r - \vec r_0(t''))\cdot(\vec r - \vec r_0(t''))\Big]^{-1/2}\,
      (\vec r - \vec r_0(t''))\cdot\frac{\partial(-\vec r_0(t''))}{\partial t''} , \\
 &= 1 - \frac{\vec v\cdot(\vec r - \vec r_0(t''))}{|\vec r - \vec r_0(t'')|} , \\
 &= 1 - \frac{\vec v\cdot\vec R(t'')}{R(t'')} ,
\end{aligned} \tag{7.40}$$
where $\vec v = d\vec r_0(t'')/dt''$. Following the rule (7.39) for handling a “delta-function of a
function,” we therefore take the function in the integrand of (7.38) that multiplies the
delta-function, evaluate it at the time $t'' = t'$ for which the argument of the delta-function
vanishes, and divide by the absolute value of the derivative of the argument of the delta-function.
This therefore gives
$$\phi(\vec r, t) = \frac{e}{R(t') - \vec v\cdot\vec R(t')} , \tag{7.41}$$
where $t'$ is the solution of $t - t' = R(t')$, and so we have reproduced the previous expression
for the Liénard-Wiechert potential for $\phi$ in (7.32). The derivation for $\vec A$ is very similar.
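As a numerical illustration of the retarded-time condition $t - t' = R(t')$ and of the potentials (7.32), the following is a minimal sketch in units with $c = 1$. The function names `retarded_time` and `lw_potentials`, the bisection strategy, and the finite-difference estimate of the velocity are all illustrative assumptions, not part of the derivation above; the trajectory is assumed to have speed below 1 so that the root is unique.

```python
import math


def retarded_time(r, t, r0):
    """Solve t - t' = |r - r0(t')| for t' by bisection (units with c = 1).

    r  : observation point (x, y, z)
    t  : observation time
    r0 : callable t' -> (x, y, z); a hypothetical trajectory with speed below 1
    """
    def g(tp):
        # g decreases monotonically in t' when |v| < 1, so it has a single root
        return (t - tp) - math.dist(r, r0(tp))

    step = 1.0
    lo, hi = t - step, t          # g(hi) = -|r - r0(t)| <= 0
    while g(lo) < 0.0:            # walk back in time until the root is bracketed
        step *= 2.0
        lo = t - step
    for _ in range(200):          # plain bisection down to floating-point precision
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lw_potentials(r, t, r0, e=1.0, h=1e-6):
    """Return (phi, A, t') from (7.32); v is estimated by a central difference."""
    tp = retarded_time(r, t, r0)
    x0 = r0(tp)
    v = [(p - m) / (2.0 * h) for p, m in zip(r0(tp + h), r0(tp - h))]
    R_vec = [a - b for a, b in zip(r, x0)]
    R = math.sqrt(sum(c * c for c in R_vec))
    denom = R - sum(vi * Ri for vi, Ri in zip(v, R_vec))
    return e / denom, [e * vi / denom for vi in v], tp
```

For a charge at rest at the origin this reduces to the Coulomb potential $e/R$, and for a charge moving uniformly along the $x$ axis it can be compared with a hand calculation of the retarded time.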
7.3 Electric and magnetic ﬁelds of a moving charge
Having obtained the Liénard-Wiechert potentials $\phi$ and $\vec A$ of a moving charge, the next
step is to calculate the associated electric and magnetic fields,
$$\vec E = -\vec\nabla\phi - \frac{\partial\vec A}{\partial t} , \qquad
  \vec B = \vec\nabla\times\vec A . \tag{7.42}$$
[Footnote 14, continued] variable to $z = f(x)$, so $dx = dz/(df/dx)$. Then we have
$$I = \int dz\, h(x)\, \frac{\delta(z)}{|df/dx|} = \frac{h(x_0)}{|df/dx|}\int \delta(z)\, dz
    = \frac{h(x_0)}{|df/dx|} ,$$
where $df/dx$ is evaluated at $x = x_0$. Thus we have
$$I = \frac{h(x_0)}{|df/dx|_{x_0}} = \int dx\, h(x)\, \frac{\delta(x - x_0)}{|df/dx|_{x_0}} ,$$
which proves (7.39). (The reason for the absolute value on $df/dx$ is that it is to be understood
that the direction of the limits of the $z$ integration should be the standard one (negative to
positive). If the gradient of $f$ is negative at $x = x_0$ then one has to insert a minus sign to
achieve this. This is therefore handled by the absolute-value sign.)
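To do this, we shall need the following results. First, we note that
$$\frac{\partial R}{\partial t} = \frac{\partial R}{\partial t'}\, \frac{\partial t'}{\partial t} , \tag{7.43}$$
and so, since $R^2 = R_i R_i$, we have
$$\frac{\partial R}{\partial t'} = \frac{R_i}{R}\, \frac{\partial R_i}{\partial t'}
  = -\frac{v_i(t')\, R_i}{R} = -\frac{\vec v\cdot\vec R}{R} . \tag{7.44}$$
(Recall that $\vec R$ means $\vec R(t')$, and that it is given by (7.33).) Equation (7.43) therefore
becomes
$$\frac{\partial R}{\partial t} = -\frac{\vec v\cdot\vec R}{R}\, \frac{\partial t'}{\partial t} , \tag{7.45}$$
and so, since we have from (7.34) that $R(t') = t - t'$, it follows that
$$1 - \frac{\partial t'}{\partial t} = -\frac{\vec v\cdot\vec R}{R}\, \frac{\partial t'}{\partial t} . \tag{7.46}$$
Solving for $\partial t'/\partial t$, we therefore have the results that
$$\frac{\partial t'}{\partial t} = \left(1 - \frac{\vec v\cdot\vec R}{R}\right)^{-1} , \tag{7.47}$$
$$\frac{\partial R}{\partial t} = -\frac{\vec v\cdot\vec R}{R - \vec v\cdot\vec R} . \tag{7.48}$$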
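Some other expressions we shall also need are as follows. First, from $t - t' = R(t')$ it
follows that $\partial_i t' = -\partial_i R(t')$. Now $\vec R(t') = \vec r - \vec r_0(t')$, and so
$$R^2 = (x_j - x_{0j}(t'))(x_j - x_{0j}(t')) . \tag{7.49}$$
From this, by acting with $\partial_i$, we obtain
$$\begin{aligned}
2R\,\partial_i R &= 2(\delta_{ij} - \partial_i x_{0j}(t'))(x_j - x_{0j}(t')) , \\
 &= 2R_i - 2\,\frac{\partial x_{0j}(t')}{\partial t'}\, \frac{\partial t'}{\partial x_i}\,(x_j - x_{0j}(t')) , \\
 &= 2R_i - 2\,\vec v\cdot\vec R\; \partial_i t' .
\end{aligned} \tag{7.50}$$
From this and $\partial_i t' = -\partial_i R(t')$ it follows that
$$\partial_i t' = -\frac{R_i}{R - \vec v\cdot\vec R} , \qquad
  \partial_i R = \frac{R_i}{R - \vec v\cdot\vec R} . \tag{7.51}$$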
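Further results that follow straightforwardly are
$$\begin{aligned}
\partial_i R_j &= \partial_i(x_j - x_{0j}(t')) = \delta_{ij} - \frac{\partial x_{0j}(t')}{\partial t'}\,\partial_i t'
   = \delta_{ij} + \frac{v_j R_i}{R - \vec v\cdot\vec R} , \\
\partial_i v_j &= \frac{\partial v_j}{\partial t'}\,\partial_i t' = -\frac{\dot v_j R_i}{R - \vec v\cdot\vec R} , \\
\frac{\partial v_i}{\partial t} &= \frac{\partial v_i}{\partial t'}\,\frac{\partial t'}{\partial t}
   = \frac{\dot v_i R}{R - \vec v\cdot\vec R} , \\
\frac{\partial R}{\partial t} &= -\frac{\vec v\cdot\vec R}{R - \vec v\cdot\vec R} , \\
\frac{\partial\vec R}{\partial t} &= \frac{\partial\vec R}{\partial t'}\,\frac{\partial t'}{\partial t}
   = -\vec v\,\frac{\partial t'}{\partial t} = -\frac{\vec v\, R}{R - \vec v\cdot\vec R} .
\end{aligned} \tag{7.52}$$
Note that $\dot v_i$ means $\partial v_i/\partial t'$; we shall define the acceleration $\vec a$ of the
particle by
$$\vec a \equiv \frac{\partial\vec v}{\partial t'} . \tag{7.53}$$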
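We are now ready to evaluate the electric and magnetic fields. From (7.32) and the
results above, we have
$$\begin{aligned}
E_i &= -\partial_i\phi - \frac{\partial A_i}{\partial t} , \\
 &= \frac{e}{(R - \vec v\cdot\vec R)^2}\,\big(\partial_i R - \partial_i(v_j R_j)\big)
    - \frac{e}{R - \vec v\cdot\vec R}\,\frac{\partial v_i}{\partial t}
    + \frac{e\, v_i}{(R - \vec v\cdot\vec R)^2}\left(\frac{\partial R}{\partial t}
    - \frac{\partial(\vec v\cdot\vec R)}{\partial t}\right) , \\
 &= \frac{e}{(R - \vec v\cdot\vec R)^3}\Big\{ R_i - v_i(R - \vec v\cdot\vec R) - v^2 R_i
    + \vec a\cdot\vec R\, R_i - a_i R\,(R - \vec v\cdot\vec R) \\
 &\qquad\qquad\qquad\quad - v_i\, \vec v\cdot\vec R - v_i\, \vec a\cdot\vec R\, R + v^2 v_i R \Big\} , \\
 &= \frac{e(1 - v^2)(R_i - v_i R)}{(R - \vec v\cdot\vec R)^3}
    + \frac{e\,[\vec a\cdot\vec R\,(R_i - v_i R) - a_i (R - \vec v\cdot\vec R)\, R]}{(R - \vec v\cdot\vec R)^3} .
\end{aligned} \tag{7.54}$$
This can be rewritten as
$$\vec E = \frac{e(1 - v^2)(\vec R - \vec v\, R)}{(R - \vec v\cdot\vec R)^3}
  + \frac{e\,\vec R\times[(\vec R - \vec v\, R)\times\vec a]}{(R - \vec v\cdot\vec R)^3} . \tag{7.55}$$
An analogous calculation of $\vec B$ shows that it can be written as
$$\vec B = \frac{\vec R\times\vec E}{R} . \tag{7.56}$$
Note that this means that $\vec B$ is perpendicular to $\vec E$.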
The first term in (7.55) is independent of the acceleration $\vec a$, and so it represents a
contribution that is present even if the charge is in uniform motion. It is easily seen that
at large distance, where $R \to \infty$, it falls off like $1/R^2$. If the charge is moving with uniform
velocity $\vec v$ then we shall have
$$\vec r_0(t) = \vec r_0(t') + \vec v\,(t - t') , \tag{7.57}$$
and so
$$\begin{aligned}
\vec R(t') - \vec v\, R(t') &= \vec r - \vec r_0(t') - \vec v\,(t - t') , \\
 &= \vec r - \vec r_0(t) + \vec v\,(t - t') - \vec v\,(t - t') , \\
 &= \vec R(t) .
\end{aligned} \tag{7.58}$$
In other words, in this case of uniform motion, $\vec R(t') - \vec v\, R(t')$ is equal to the vector
$\vec R(t)$ that gives the line joining the charge to the point of observation at the time the
observation is made. We shall also then have
$$\begin{aligned}
R(t') - \vec v\cdot\vec R(t') &= R(t') - v^2 R(t') - \vec v\cdot\vec R(t) , \\
 &= (1 - v^2)\, R(t') - \vec v\cdot\vec R(t) .
\end{aligned} \tag{7.59}$$
If we now introduce the angle $\theta$ between $\vec v$ and $\vec R(t)$, we shall have
$\vec v\cdot\vec R(t) = v\, R(t)\cos\theta$. Since, as we saw above,
$\vec R(t') = \vec v\, R(t') + \vec R(t)$, we obtain, by squaring,
$$R^2(t') = v^2 R^2(t') + 2 v R(t) R(t')\cos\theta + R^2(t) , \tag{7.60}$$
and this quadratic equation for $R(t')$ can be solved to give
$$R(t') = \frac{v R(t)\cos\theta + R(t)\sqrt{1 - v^2\sin^2\theta}}{1 - v^2} . \tag{7.61}$$
Equation (7.59) then gives
$$R(t') - \vec v\cdot\vec R(t') = v R(t)\cos\theta + R(t)\sqrt{1 - v^2\sin^2\theta} - v R(t)\cos\theta
  = R(t)\sqrt{1 - v^2\sin^2\theta} . \tag{7.62}$$
For a uniformly moving charge we therefore obtain the result
$$\vec E = \frac{e\,\vec R(t)}{R^3(t)}\, \frac{1 - v^2}{(1 - v^2\sin^2\theta)^{3/2}} , \tag{7.63}$$
which has reproduced the result (5.40) that we had previously obtained by boosting from
the rest frame of the charged particle.
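The agreement between the velocity term of (7.55), which is written in terms of retarded quantities, and the closed form (7.63), which uses only the present-time separation $\vec R(t)$, can be checked numerically. The sketch below is an illustration under stated assumptions: the velocity is taken along the $x$ axis, $c = 1$, and the function names are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def field_from_retarded_point(R_now, speed, e=1.0):
    """Velocity term of (7.55), with R(t') reconstructed from R(t) via (7.61).

    R_now : present-time separation vector R(t); the velocity is along +x.
    """
    v = (speed, 0.0, 0.0)
    R = math.sqrt(_dot(R_now, R_now))
    cos_t = R_now[0] / R                      # theta is the angle between v and R(t)
    sin2_t = 1.0 - cos_t ** 2
    R_ret = (speed * R * cos_t
             + R * math.sqrt(1.0 - speed ** 2 * sin2_t)) / (1.0 - speed ** 2)
    # vector R(t') = R(t) + v R(t'), from (7.58)
    R_ret_vec = [R_now[i] + v[i] * R_ret for i in range(3)]
    denom = R_ret - _dot(v, R_ret_vec)
    return [e * (1.0 - speed ** 2) * (R_ret_vec[i] - v[i] * R_ret) / denom ** 3
            for i in range(3)]


def field_closed_form(R_now, speed, e=1.0):
    """Closed-form field (7.63) of a uniformly moving charge."""
    R = math.sqrt(_dot(R_now, R_now))
    sin2_t = 1.0 - (R_now[0] / R) ** 2
    factor = e * (1.0 - speed ** 2) / (R ** 3 * (1.0 - speed ** 2 * sin2_t) ** 1.5)
    return [factor * c for c in R_now]
```

At $\theta = \pi/2$ the closed form gives $|\vec E| = \gamma\, e/R^2$, while at $\theta = 0$ it gives $e/(\gamma^2 R^2)$, so the field is enhanced transverse to the motion and suppressed along it.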
The second term in (7.55) is proportional to $\vec a$, and so it occurs only for an accelerating
charge. At large distance, this term falls off like $1/R$, in other words, much less rapidly
than the $1/R^2$ fall-off of the first term in (7.55). In fact the $1/R$ fall-off of the acceleration
term is characteristic of an electromagnetic wave, as we shall now discuss.
7.4 Radiation by accelerated charges
A charge at rest generates a purely electric field, and if it is in uniform motion it generates
both $\vec E$ and $\vec B$ fields. In neither case, of course, does it radiate any energy. However, if
the charge is accelerating, then it actually emits electromagnetic radiation.
The easiest case to consider is when the velocity of the charge is small compared with
the speed of light. In this case the acceleration term in (7.55) is approximated by
$$\vec E = \frac{e\,\vec R\times(\vec R\times\vec a)}{R^3} = \frac{e\,\vec n\times(\vec n\times\vec a)}{R} , \tag{7.64}$$
where
$$\vec n \equiv \frac{\vec R}{R} . \tag{7.65}$$
Note that $\vec n\cdot\vec E = 0$, and that $\vec E$ is also perpendicular to $\vec n\times\vec a$. This means
that the polarisation of $\vec E$ lies in the plane containing $\vec n$ and $\vec a$, and is perpendicular
to $\vec n$.
From (7.56) we shall also have
$$\vec B = \vec n\times\vec E . \tag{7.66}$$
As usual, all quantities here in the expressions for $\vec E$ and $\vec B$ are evaluated at the retarded
time $t'$.
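The energy flux, given by the Poynting vector, is given by
$$\vec S = \frac{1}{4\pi}\,\vec E\times\vec B = \frac{1}{4\pi}\,\vec E\times(\vec n\times\vec E)
  = \frac{1}{4\pi}\, E^2\,\vec n - \frac{1}{4\pi}\,(\vec n\cdot\vec E)\,\vec E , \tag{7.67}$$
and so, since $\vec n\cdot\vec E = 0$, we have
$$\vec S = \frac{1}{4\pi}\, E^2\,\vec n . \tag{7.68}$$
Let us define $\theta$ to be the angle between the unit vector $\vec n$ and the acceleration $\vec a$. Then
we shall have
$$\vec E = \frac{e}{R}\,\big((\vec n\cdot\vec a)\,\vec n - \vec a\big) = \frac{e}{R}\,(a\,\vec n\cos\theta - \vec a) , \tag{7.69}$$
and so
$$E^2 = \frac{e^2}{R^2}\,(a^2\cos^2\theta - 2a^2\cos^2\theta + a^2) = \frac{e^2 a^2\sin^2\theta}{R^2} , \tag{7.70}$$
implying that the energy flux is
$$\vec S = \frac{e^2 a^2\sin^2\theta}{4\pi R^2}\,\vec n . \tag{7.71}$$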
The area element $d\vec\Sigma$ can be written as
$$d\vec\Sigma = R^2\,\vec n\, d\Omega , \tag{7.72}$$
where $d\Omega = \sin\theta\, d\theta\, d\varphi$ is the area element on the unit-radius sphere (i.e. the
solid angle element). The power radiated into the area element $d\vec\Sigma$ is
$dP = \vec S\cdot d\vec\Sigma = R^2\,\vec n\cdot\vec S\, d\Omega$, and so we find that
$$\frac{dP}{d\Omega} = \frac{e^2 a^2}{4\pi}\,\sin^2\theta \tag{7.73}$$
is the power radiated per unit solid angle.
The total power radiated in all directions is given by
$$\begin{aligned}
P &= \int \frac{dP}{d\Omega}\, d\Omega = \frac{e^2 a^2}{4\pi}\int_0^\pi \sin^3\theta\, d\theta\int_0^{2\pi} d\varphi , \\
  &= \tfrac12 e^2 a^2\int_0^\pi \sin^3\theta\, d\theta = \tfrac12 e^2 a^2\int_{-1}^{1}(1 - c^2)\, dc
   = \tfrac23 e^2 a^2 ,
\end{aligned} \tag{7.74}$$
where, to evaluate the $\theta$ integral, we change variable to $c = \cos\theta$. The expression
$$P = \tfrac23 e^2 a^2 \tag{7.75}$$
is known as the Larmor formula for a non-relativistic accelerating charge.
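The angular integration leading to the Larmor formula can be reproduced numerically. The following is a minimal sketch, assuming Gaussian units with $c = 1$ as in the text; the function name and the midpoint-rule quadrature are illustrative choices only.

```python
import math


def larmor_power_numeric(e, a, n_theta=2000):
    """Integrate dP/dOmega = e^2 a^2 sin^2(theta) / (4 pi) over the unit sphere.

    The phi integral is trivial (factor 2 pi); theta uses the midpoint rule.
    """
    d_theta = math.pi / n_theta
    total = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta
        dP_dOmega = e * e * a * a * math.sin(theta) ** 2 / (4.0 * math.pi)
        total += dP_dOmega * math.sin(theta) * d_theta   # sin(theta) from dOmega
    return 2.0 * math.pi * total
```

For `e = a = 1` the result should agree with the analytic value $2/3$ to within the quadrature error.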
The Larmor formula can be generalised to the relativistic result fairly easily. In principle,
we could simply repeat the argument given above, but without making the approximation
that $v$ is small compared to 1 (the speed of light). Note that in terms of the unit vector
$\vec n = \vec R/R$, the expression (7.55) for the electric field becomes
$$\vec E = \frac{e(1 - v^2)(\vec n - \vec v)}{R^2 (1 - \vec n\cdot\vec v)^3}
  + \frac{e\,\vec n\times[(\vec n - \vec v)\times\vec a]}{R\,(1 - \vec n\cdot\vec v)^3} . \tag{7.76}$$
We can, in fact, obtain the relativistic Larmor formula by a simple trick. First, we note from
(7.76) that since $\vec S = (\vec E\times\vec B)/(4\pi)$ and $\vec B = \vec n\times\vec E$, the energy flux
from the acceleration term must be quadratic in the acceleration $\vec a$. We can also note that the
total radiated power $P$ is a Lorentz scalar (since it is energy per unit time, and each of these
quantities transforms as the 0 component of a 4-vector). Thus, the task is to find a
Lorentz-invariant expression for $P$ that reduces to the non-relativistic Larmor result (7.75) in the
limit when $v$ goes to zero.
First, we note that the non-relativistic Larmor formula (7.75) can be written as
$$P = \tfrac23 e^2 a^2 = \frac{2e^2}{3m^2}\left(\frac{d\vec p}{dt}\right)^2 . \tag{7.77}$$
There is only one Lorentz-invariant quantity, quadratic in $\vec a$, that reduces to this expression
in the limit that $v$ goes to zero. It is given by
$$P = \frac{2e^2}{3m^2}\,\frac{dp^\mu}{d\tau}\,\frac{dp_\mu}{d\tau} , \tag{7.78}$$
where $p^\mu$ is the 4-momentum of the particle and $\tau$ is the proper time along its path. Noting
that $p^\mu = m(\gamma, \gamma\vec v)$, we see that
$$\frac{dp^\mu}{d\tau} = \gamma\,\frac{dp^\mu}{dt}
  = m\gamma\,\big(\gamma^3\,\vec v\cdot\vec a,\; \gamma^3(\vec v\cdot\vec a)\,\vec v + \gamma\,\vec a\big) , \tag{7.79}$$
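and so
$$\begin{aligned}
\frac{dp^\mu}{d\tau}\,\frac{dp_\mu}{d\tau}
 &= m^2\gamma^2\,\big[-\gamma^6(\vec v\cdot\vec a)^2 + \gamma^6 v^2 (\vec v\cdot\vec a)^2
    + 2\gamma^4(\vec v\cdot\vec a)^2 + \gamma^2 a^2\big] , \\
 &= m^2\gamma^2\,\big[\gamma^4(\vec v\cdot\vec a)^2 + \gamma^2 a^2\big] .
\end{aligned} \tag{7.80}$$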
Now consider the quantity
$$\begin{aligned}
a^2 - (\vec v\times\vec a)^2 &= a^2 - \epsilon_{ijk}\epsilon_{i\ell m}\, v_j a_k v_\ell a_m , \\
 &= a^2 - v^2 a^2 + (\vec v\cdot\vec a)^2 = \frac{a^2}{\gamma^2} + (\vec v\cdot\vec a)^2 ,
\end{aligned} \tag{7.81}$$
which shows that we can write
$$\frac{dp^\mu}{d\tau}\,\frac{dp_\mu}{d\tau}
  = m^2\gamma^6\left[\frac{a^2}{\gamma^2} + (\vec v\cdot\vec a)^2\right]
  = m^2\gamma^6\,\big[a^2 - (\vec v\times\vec a)^2\big] . \tag{7.82}$$
Thus we see that the scalar $P$ given in (7.78) is given by
$$P = \tfrac23 e^2\gamma^6\,\big[a^2 - (\vec v\times\vec a)^2\big] . \tag{7.83}$$
This indeed reduces to the non-relativistic Larmor formula (7.75) if the velocity $\vec v$ is sent to
zero. For the reasons we described above, it must therefore be the correct fully-relativistic
Larmor result for the total power radiated by an accelerating charge.
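The algebra connecting the invariant form (7.78) to the explicit form (7.83) can be spot-checked numerically for arbitrary $\vec v$ and $\vec a$. This is a minimal sketch, assuming the mostly-plus metric signature used in the text and $c = 1$; the function names are illustrative.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def power_from_four_momentum(v, a, e=1.0, m=1.0):
    """Evaluate (7.78) using the components of dp^mu/dtau given in (7.79)."""
    gamma = 1.0 / math.sqrt(1.0 - _dot(v, v))
    va = _dot(v, a)
    dp0 = m * gamma * gamma ** 3 * va
    dp = [m * gamma * (gamma ** 3 * va * v[i] + gamma * a[i]) for i in range(3)]
    invariant = -dp0 ** 2 + _dot(dp, dp)      # signature (-, +, +, +)
    return 2.0 * e ** 2 / (3.0 * m ** 2) * invariant


def power_larmor_relativistic(v, a, e=1.0):
    """Evaluate the explicit form (7.83)."""
    gamma = 1.0 / math.sqrt(1.0 - _dot(v, v))
    vxa = _cross(v, a)
    return (2.0 / 3.0) * e ** 2 * gamma ** 6 * (_dot(a, a) - _dot(vxa, vxa))
```

The two functions should agree for any velocity with $|\vec v| < 1$, and both reduce to $\tfrac23 e^2 a^2$ when $\vec v = 0$.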
7.5 Applications of Larmor formula
7.5.1 Linear accelerator
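In a linear accelerator, a charged massive particle is accelerated along a straight-line
trajectory, and so its velocity $\vec v$ and acceleration $\vec a$ are parallel. Defining
$p = |\vec p| = m\gamma|\vec v|$, we have
$$\frac{dp}{dt} = m\gamma\,\frac{dv}{dt} + m v\,\frac{d\gamma}{dt} , \tag{7.84}$$
where $v = |\vec v|$ and $\gamma = (1 - v^2)^{-1/2}$. Clearly we have
$$v\,\frac{dv}{dt} = \vec v\cdot\frac{d\vec v}{dt} = \vec v\cdot\vec a = v a , \qquad
  \frac{d\gamma}{dt} = \gamma^3 v\,\frac{dv}{dt} = \gamma^3 v a , \tag{7.85}$$
and so
$$\frac{dp}{dt} = m\gamma^3 a . \tag{7.86}$$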
With $\vec v$ and $\vec a$ parallel, the relativistic Larmor formula (7.83) gives
$P = \tfrac23 e^2\gamma^6 a^2$, and so we have
$$P = \frac{2e^2}{3m^2}\left(\frac{dp}{dt}\right)^2 . \tag{7.87}$$
The expression (7.87) gives the power that is radiated by the charge as it is accelerated
along a straight-line trajectory. In a particle accelerator, the goal, obviously, is to accelerate
the particles to as high a velocity as possible. Equation (7.87) describes the power that
is lost through radiation when the particle is being accelerated. The energy $\mathcal{E}$ of the
particle is related to its rest mass $m$ and 3-momentum $\vec p$ by the standard formula
$$\mathcal{E}^2 = p^2 + m^2 . \tag{7.88}$$
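The rate of change of energy with distance travelled, $d\mathcal{E}/dx$, is therefore given by
$$\mathcal{E}\,\frac{d\mathcal{E}}{dx} = p\,\frac{dp}{dx} , \tag{7.89}$$
and so we have
$$\frac{d\mathcal{E}}{dx} = \frac{p}{\mathcal{E}}\,\frac{dp}{dx} = \frac{m\gamma v}{m\gamma}\,\frac{dp}{dx}
  = v\,\frac{dp}{dx} = \frac{dx}{dt}\,\frac{dp}{dx} = \frac{dp}{dt} . \tag{7.90}$$
This means that (7.87) can be rewritten as
$$P = \frac{2e^2}{3m^2}\left(\frac{d\mathcal{E}}{dx}\right)^2 . \tag{7.91}$$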
The “energy-loss factor” of the accelerator can be judged by taking the ratio of the
power radiated divided by the power supplied. By energy conservation, the power supplied
is equal to the rate of change of energy of the particle, $d\mathcal{E}/dt$. Thus we have
$$\begin{aligned}
\frac{\text{Power radiated}}{\text{Power supplied}}
 &= \frac{P}{(d\mathcal{E}/dt)} = \frac{P}{(d\mathcal{E}/dx)}\,\frac{dt}{dx} = \frac{P}{v\,(d\mathcal{E}/dx)} , \\
 &= \frac{2e^2}{3m^2 v}\,\frac{d\mathcal{E}}{dx} .
\end{aligned} \tag{7.92}$$
In the relativistic limit, where $v$ is very close to the speed of light (as is typically achieved
in a powerful linear accelerator), we therefore have
$$\frac{\text{Power radiated}}{\text{Power supplied}} \approx \frac{2e^2}{3m^2}\,\frac{d\mathcal{E}}{dx} . \tag{7.93}$$
A typical electron linear accelerator achieves an energy input of about 10 MeV per metre,
and this translates into an energy-loss factor of about $10^{-13}$. In other words, very little of
the applied power being used to accelerate the electron is lost through Larmor radiation.
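The order of magnitude quoted above can be reproduced with a short calculation. In Gaussian units $e^2/(m c^2)$ is the classical electron radius $r_e$, so (7.93) becomes $\tfrac23\,(r_e / m c^2)\, d\mathcal{E}/dx$. The sketch below assumes the standard values of $r_e$ and the electron rest energy; the constant and function names are illustrative.

```python
# Standard values, assumed here rather than taken from the text:
R_E_METRES = 2.8179403e-15   # classical electron radius e^2 / (m c^2), in metres
M_E_MEV = 0.51099895         # electron rest energy m c^2, in MeV


def linac_loss_factor(gradient_mev_per_m):
    """Ratio (7.93) of radiated to supplied power for an ultra-relativistic electron.

    gradient_mev_per_m : energy gain per unit length, dE/dx, in MeV per metre.
    """
    return (2.0 / 3.0) * (R_E_METRES / M_E_MEV) * gradient_mev_per_m


factor = linac_loss_factor(10.0)   # the 10 MeV per metre gradient mentioned in the text
```

With these inputs the factor comes out at a few times $10^{-14}$, which is consistent in order of magnitude with the rough figure quoted in the text.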
7.5.2 Circular accelerator
The situation is very different in the case of a circular accelerator, since the transverse
acceleration necessary to keep the particle in a circular orbit is typically very much larger than
the linear acceleration discussed above. In other words, the direction of the 3-momentum
$\vec p$ is changing rapidly, while, by contrast, the energy, and hence the magnitude of $\vec p$, is
relatively slowly-changing. In fact the change in $|\vec p|$ per revolution is rather small, and we can
study the power loss by assuming that the particle is in an orbit of fixed angular frequency
$\omega$. This means that we shall have
$$\left|\frac{d\vec p}{dt}\right| = \omega\,|\vec p| , \tag{7.94}$$
and so
$$\left|\frac{d\vec p}{d\tau}\right| = \gamma\omega\,|\vec p| , \tag{7.95}$$
where $d\tau = dt/\gamma$ is the proper-time interval. Since the energy is constant in this
approximation, we therefore have
$$\frac{dp^0}{d\tau} = 0 , \qquad \text{and so}\quad
  \frac{dp^\mu}{d\tau}\,\frac{dp_\mu}{d\tau} = \left|\frac{d\vec p}{d\tau}\right|^2 = \gamma^2\omega^2 p^2 . \tag{7.96}$$
Using equation (7.78) for the Larmor power radiation, we therefore have
$$P = \frac{2e^2}{3m^2}\,\gamma^2\omega^2 p^2 = \tfrac23 e^2\gamma^4\omega^2 v^2 . \tag{7.97}$$
If the radius of the accelerator is $R$ then the angular and linear velocities of the particle are
related by $\omega = v/R$ and so the power loss is given by
$$P = \frac{2e^2\gamma^4 v^4}{3R^2} . \tag{7.98}$$
The radiative energy loss per revolution, $\Delta\mathcal{E}$, is given by the product of $P$ with the
period of the orbit, namely
$$\Delta\mathcal{E} = \frac{2\pi R P}{v} = \frac{4\pi e^2\gamma^4 v^3}{3R} . \tag{7.99}$$
A typical example would be a 10 GeV electron synchrotron, for which the radius $R$ is about
100 metres. Plugging in the numbers, this implies an energy loss of about 10 MeV per
revolution, or about 0.1% of the energy of the particle. Bearing in mind that the time
taken to complete an orbit is very small (the electron is travelling at nearly the speed of
light), it is necessary to supply energy at a very high rate in order to replenish the radiative
loss. It also implies that there will be a considerable amount of radiation being emitted by
the accelerator.
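The numbers in this example can be reproduced from (7.99) by writing $e^2 = r_e\, m c^2$, where $r_e$ is the classical electron radius. The following is a minimal sketch, assuming standard values for $r_e$ and the electron rest energy; the names are illustrative.

```python
import math

# Standard values, assumed here rather than taken from the text:
R_E_METRES = 2.8179403e-15   # classical electron radius e^2 / (m c^2), in metres
M_E_MEV = 0.51099895         # electron rest energy m c^2, in MeV


def synchrotron_loss_per_turn_mev(energy_mev, radius_m):
    """Energy radiated per revolution, from (7.99), for an electron of given energy."""
    gamma = energy_mev / M_E_MEV
    v = math.sqrt(1.0 - 1.0 / gamma ** 2)          # speed in units of c
    e_squared = R_E_METRES * M_E_MEV               # e^2 in MeV * metres
    return 4.0 * math.pi * e_squared * gamma ** 4 * v ** 3 / (3.0 * radius_m)


loss = synchrotron_loss_per_turn_mev(10_000.0, 100.0)   # 10 GeV, R = 100 m
fraction = loss / 10_000.0
```

With these inputs the loss is a little under 9 MeV per revolution, i.e. roughly 0.1% of the particle energy, in line with the estimate above. The steep $\gamma^4$ dependence means that doubling the energy at fixed radius increases the loss sixteen-fold.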
7.6 Angular distribution of the radiated power
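We saw previously that for a non-relativistic charged particle whose acceleration $\vec a$ makes
an angle $\theta$ with respect to the position vector $\vec R$, the angular distribution of the radiated
power is given by (see (7.73))
$$\frac{dP}{d\Omega} = \frac{e^2 a^2}{4\pi}\,\sin^2\theta . \tag{7.100}$$
In the general (i.e. relativistic) case, where the velocity $\vec v$ is large, we have, from (7.76),
that at large $R$ the electric and magnetic fields are dominated by the radiation-field term:
$$\vec E = \frac{e\,\vec n\times[(\vec n - \vec v)\times\vec a]}{R\,(1 - \vec n\cdot\vec v)^3} , \qquad
  \vec B = \vec n\times\vec E . \tag{7.101}$$
The radial component of the Poynting vector, $\vec n\cdot\vec S$, is therefore given by
$$\begin{aligned}
\vec n\cdot\vec S &= \frac{1}{4\pi}\,\vec n\cdot(\vec E\times\vec B)
   = \frac{1}{4\pi}\,\vec n\cdot[\vec E\times(\vec n\times\vec E)] , \\
 &= \frac{1}{4\pi}\,\vec n\cdot[E^2\,\vec n - (\vec n\cdot\vec E)\,\vec E] = \frac{1}{4\pi}\, E^2 ,
\end{aligned} \tag{7.102}$$
since $\vec n\cdot\vec E = 0$. Thus we have
$$\vec n\cdot\vec S = \frac{e^2}{4\pi R^2}\,
  \left|\frac{\vec n\times[(\vec n - \vec v)\times\vec a]}{(1 - \vec n\cdot\vec v)^3}\right|^2 , \tag{7.103}$$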
where as usual all quantities on the right-hand side are evaluated at the retarded time $t'$
calculated from the equation $t - t' = R(t')$, with $\vec R(t') = \vec r - \vec r_0(t')$. It is
conventional to denote the quantity in (7.103) by $[\vec n\cdot\vec S]_{\rm ret}$, to indicate that it is
evaluated at the retarded time $t'$.
The associated energy radiated during the time interval from $t' = T'_1$ to $t' = T'_2$ is
therefore given by
$$\mathcal{E} = \int_{T_1}^{T_2} [\vec n\cdot\vec S]_{\rm ret}\, dt , \tag{7.104}$$
where $T_i$ is the time $t$ that corresponds to the retarded time $t' = T'_i$. The integral can
therefore be rewritten as
$$\mathcal{E} = \int_{T'_1}^{T'_2} [\vec n\cdot\vec S]_{\rm ret}\,\frac{dt}{dt'}\, dt' . \tag{7.105}$$
The quantity $[\vec n\cdot\vec S]_{\rm ret}\,(dt/dt')$ is the power radiated per unit area as measured with
respect to the charge's retarded time $t'$, and so we have the result that
$$\frac{dP(t')}{d\Omega} = R^2\,[\vec n\cdot\vec S]_{\rm ret}\,\frac{dt}{dt'}
  = R^2\,(1 - \vec n\cdot\vec v)\,[\vec n\cdot\vec S]_{\rm ret} . \tag{7.106}$$
(Note that we used the result (7.47) here.)
7.6.1 Angular power distribution for linear acceleration
As an example, consider the situation when the charge is accelerated uniformly for only a
short time, so that $\vec v$ as well as $\vec a$ are approximately constant during the time interval of
the acceleration. This means that $\vec n$ and $R$ are approximately constant, and so from (7.103)
and (7.106) we obtain the angular distribution
$$\frac{dP(t')}{d\Omega} = \frac{e^2}{4\pi}\,
  \frac{|\vec n\times[(\vec n - \vec v)\times\vec a]|^2}{(1 - \vec n\cdot\vec v)^5} . \tag{7.107}$$
If we now suppose that the acceleration is linear, i.e. that $\vec v$ and $\vec a$ are parallel, then we
obtain
$$\frac{dP(t')}{d\Omega} = \frac{e^2 a^2}{4\pi}\,\frac{\sin^2\theta}{(1 - v\cos\theta)^5} , \tag{7.108}$$
where as before we define $\theta$ to be the angle between $\vec a$ and $\vec n$.
When $|\vec v| \ll 1$, the expression (7.108) clearly reduces to the non-relativistic result given
in (7.73). In this limit, the angular radiated power distribution is described by a figure-of-eight,
oriented perpendicularly to the direction of the acceleration. As the velocity becomes
larger, the two lobes of the figure-of-eight start to tilt forwards, along the direction of the
acceleration. This is illustrated for the non-relativistic and relativistic cases in Figures 1
and 2 below. In each case, the acceleration is to the right along the horizontal axis.
Figure 1: The angular power distribution in the non-relativistic case
The angle at which the radiated power is largest is found by solving $d(dP/d\Omega)/d\theta = 0$.
This gives
$$2(1 - v\cos\theta)\cos\theta - 5v\sin^2\theta = 0 , \tag{7.109}$$
and hence
$$\theta_{\rm max} = \arccos\left(\frac{\sqrt{1 + 15v^2} - 1}{3v}\right) . \tag{7.110}$$
In the case of a highly relativistic particle, for which $v$ is very close to the speed of light,
the velocity itself is not a very convenient parameter, and instead we can more usefully
characterise it by $\gamma = (1 - v^2)^{-1/2}$, which becomes very large in the relativistic limit. Thus,
substituting $v = \sqrt{1 - \gamma^{-2}}$ into (7.110), we obtain
$$\theta_{\rm max} = \arccos\left(\frac{4\sqrt{1 - \tfrac{15}{16}\gamma^{-2}} - 1}{3\sqrt{1 - \gamma^{-2}}}\right) . \tag{7.111}$$
At large $\gamma$ we can expand the argument as a power series in $\gamma^{-2}$, finding that
$$\theta_{\rm max} \approx \arccos\big(1 - \tfrac18\gamma^{-2}\big) . \tag{7.112}$$
This implies that $\theta_{\rm max}$ is close to 0 when $\gamma$ is very large. In this regime we have
$\cos\theta_{\rm max} \approx 1 - \tfrac12\theta_{\rm max}^2$, and so in the highly relativistic case we have
$$\theta_{\rm max} \approx \frac{1}{2\gamma} . \tag{7.113}$$

Figure 2: The angular power distribution in the relativistic case ($v = 4/5$)

We see that the lobes of the angular power distribution tilt forward sharply, so that they
are directed nearly parallel to the direction of acceleration of the particle.
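The closed-form angle (7.110) and its large-$\gamma$ limit (7.113) can be compared with a brute-force search for the maximum of the distribution (7.108). This is a minimal sketch with illustrative function names; the distribution is expressed in units of $e^2 a^2/4\pi$, and the scan resolution is an arbitrary choice.

```python
import math


def dP_dOmega_linear(theta, v):
    """Angular distribution (7.108) in units of e^2 a^2 / (4 pi)."""
    return math.sin(theta) ** 2 / (1.0 - v * math.cos(theta)) ** 5


def theta_max_exact(v):
    """Angle of maximum emission from (7.110)."""
    return math.acos((math.sqrt(1.0 + 15.0 * v * v) - 1.0) / (3.0 * v))


def theta_max_scan(v, n=200000):
    """Locate the maximum of (7.108) on a uniform grid of n points in (0, pi)."""
    best = max(range(1, n), key=lambda k: dP_dOmega_linear(k * math.pi / n, v))
    return best * math.pi / n
```

For $v = 4/5$ the two estimates should agree to within the grid spacing, at roughly 0.35 radians, and for $\gamma = 100$ the exact angle should be close to $1/(2\gamma) = 0.005$ radians.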
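Continuing with the highly-relativistic limit, we may consider the profile of the angular
power distribution for all small angles $\theta$. Substituting
$$v = \sqrt{1 - \gamma^{-2}} , \qquad \sin\theta \approx \theta , \qquad \cos\theta \approx 1 - \tfrac12\theta^2 \tag{7.114}$$
into (7.108), and expanding in inverse powers of $\gamma$, we find that
$$\frac{dP(t')}{d\Omega} \approx
  \frac{e^2 a^2\theta^2}{4\pi\,\Big(1 - \sqrt{1 - \gamma^{-2}}\,\big(1 - \tfrac12\theta^2\big)\Big)^5}
  \approx \frac{8 e^2 a^2\theta^2}{\pi\,(\gamma^{-2} + \theta^2)^5} , \tag{7.115}$$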
which can be written as
$$\frac{dP(t')}{d\Omega} \approx \frac{8 e^2 a^2\gamma^8}{\pi}\,\frac{(\gamma\theta)^2}{[1 + (\gamma\theta)^2]^5} . \tag{7.116}$$
This shows that indeed there are two lobes, of characteristic width $\Delta\theta \sim 1/\gamma$, on each side
of $\theta = 0$. The radiated power is zero in the exactly forward direction $\theta = 0$.
We can straightforwardly integrate our result (7.108) for the angular power distribution
for a linearly-accelerated particle, to find the total radiated power. We obtain
$$P = \int \frac{dP(t')}{d\Omega}\, d\Omega
  = \frac{e^2 a^2}{4\pi}\, 2\pi\int_0^\pi \frac{\sin^2\theta}{(1 - v\cos\theta)^5}\,\sin\theta\, d\theta
  = \tfrac12 e^2 a^2\int_{-1}^{1} \frac{(1 - c^2)\, dc}{(1 - vc)^5} , \tag{7.117}$$
where $c = \cos\theta$. The integral is elementary, giving the result
$$P = \tfrac23 e^2\gamma^6 a^2 . \tag{7.118}$$
This can be seen to be in agreement with our earlier result (7.83), under the specialisation
that $\vec a$ and $\vec v$ are parallel.
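The elementary integral in (7.117) can also be checked numerically against the closed form (7.118). The sketch below uses a simple midpoint rule; the function name and the number of quadrature points are illustrative assumptions.

```python
def linear_total_power(v, e=1.0, a=1.0, n=200000):
    """Midpoint-rule evaluation of (7.117) in the variable c = cos(theta)."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        c = -1.0 + (k + 0.5) * h
        total += (1.0 - c * c) / (1.0 - v * c) ** 5
    return 0.5 * e * e * a * a * total * h
```

For example, with $v = 0.6$ one has $\gamma^6 = (1/0.64)^3$, and the numerical value should match $\tfrac23\gamma^6$ with `e = a = 1`.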
7.6.2 Angular power distribution for circular motion
For a second example, consider the situation of a charge that is in uniform circular motion.
For these purposes, we need only assume that it is instantaneously in such motion; the
complete path of the particle could be something more complicated than a circle, but such
that at some instant it can be described by a circular motion.
Circular motion implies that the velocity $\vec v$ and the acceleration $\vec a$ are perpendicular.
At the instant under consideration, we may choose a system of Cartesian axes oriented so
that the velocity $\vec v$ lies along the $z$ direction, and the acceleration lies along the $x$ direction.
The unit vector $\vec n = \vec R/R$ can then be parameterised by spherical polar coordinates
$(\theta, \varphi)$ defined in the usual way; i.e. $\theta$ measures the angle between $\vec n$ and the $z$ axis,
and $\varphi$ is the azimuthal angle, measured from the $x$ axis, of the projection of $\vec n$ onto the
$(x, y)$ plane. Thus we shall have
$$\vec n = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta) , \qquad \vec v = (0, 0, v) , \qquad
  \vec a = (a, 0, 0) . \tag{7.119}$$
Of course, in particular, we have $\vec n\cdot\vec v = v\cos\theta$.
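From (7.103) and (7.106), we have the general expression
$$\frac{dP(t')}{d\Omega} = \frac{e^2}{4\pi}\,
  \frac{|\vec n\times[(\vec n - \vec v)\times\vec a]|^2}{(1 - \vec n\cdot\vec v)^5} , \tag{7.120}$$
for the angular distribution of the radiated power. Using the fact that $\vec v\cdot\vec a = 0$ in the
case of circular motion, we have
$$\begin{aligned}
|\vec n\times[(\vec n - \vec v)\times\vec a]|^2
 &= |(\vec n\cdot\vec a)(\vec n - \vec v) - (1 - \vec n\cdot\vec v)\,\vec a|^2 , \\
 &= (\vec n\cdot\vec a)^2 (1 - 2\vec n\cdot\vec v + v^2) + (1 - \vec n\cdot\vec v)^2 a^2
    - 2(\vec n\cdot\vec a)^2 (1 - \vec n\cdot\vec v) , \\
 &= -(\vec n\cdot\vec a)^2 (1 - v^2) + (1 - \vec n\cdot\vec v)^2 a^2 , \\
 &= (1 - v\cos\theta)^2 a^2 - \gamma^{-2} a^2\sin^2\theta\cos^2\varphi ,
\end{aligned} \tag{7.121}$$
and so for instantaneous circular motion we have
$$\frac{dP(t')}{d\Omega} = \frac{e^2 a^2}{4\pi (1 - v\cos\theta)^3}
  \left[1 - \frac{\sin^2\theta\cos^2\varphi}{\gamma^2 (1 - v\cos\theta)^2}\right] . \tag{7.122}$$
We see that as $v$ tends to 1, the angular distribution is peaked in the forward direction, i.e.
in the direction of the velocity $\vec v$, meaning that $\theta$ is close to 0.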
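The total power is obtained by integrating $dP(t')/d\Omega$ over all solid angles:
$$\begin{aligned}
P(t') &= \int \frac{dP(t')}{d\Omega}\, d\Omega
   = \int_0^{2\pi} d\varphi\int_0^\pi \sin\theta\, d\theta\, \frac{dP(t')}{d\Omega} , \\
 &= \int_0^{2\pi} d\varphi\int_0^\pi \sin\theta\, d\theta\, \frac{e^2 a^2}{4\pi (1 - v\cos\theta)^3}
    \left[1 - \frac{\sin^2\theta\cos^2\varphi}{\gamma^2 (1 - v\cos\theta)^2}\right] , \\
 &= \int_0^\pi \sin\theta\, d\theta\, \frac{e^2 a^2}{2 (1 - v\cos\theta)^3}
    \left[1 - \frac{\sin^2\theta}{2\gamma^2 (1 - v\cos\theta)^2}\right] , \\
 &= \int_{-1}^{1} \frac{e^2 a^2}{2 (1 - vc)^3}
    \left[1 - \frac{1 - c^2}{2\gamma^2 (1 - vc)^2}\right] dc ,
\end{aligned} \tag{7.123}$$
where $c = \cos\theta$. After performing the integration, we obtain
$$P(t') = \tfrac23 e^2\gamma^4 a^2 . \tag{7.124}$$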
This expression can be compared with the general result (7.83), specialised to the case
where $\vec v$ and $\vec a$ are perpendicular. Noting that then
$$(\vec v\times\vec a)^2 = \epsilon_{ijk}\epsilon_{i\ell m}\, v_j a_k v_\ell a_m
  = v_j v_j a_k a_k - v_j a_j v_k a_k = v_j v_j a_k a_k = v^2 a^2 , \tag{7.125}$$
we see that (7.83) indeed agrees with (7.124) in this case.
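The angular integration of (7.122) can be carried out numerically as a cross-check of (7.124). The following is a minimal sketch using a midpoint rule in both angles; the function name and grid sizes are illustrative assumptions, and the result is in units with $c = 1$.

```python
import math


def circular_total_power(v, e=1.0, a=1.0, n_theta=2000, n_phi=64):
    """Integrate the distribution (7.122) over the full solid angle."""
    gamma2 = 1.0 / (1.0 - v * v)
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        s, c = math.sin(theta), math.cos(theta)
        k = 1.0 - v * c
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            bracket = 1.0 - (s * math.cos(phi)) ** 2 / (gamma2 * k * k)
            dP = e * e * a * a / (4.0 * math.pi * k ** 3) * bracket
            total += dP * s * d_theta * d_phi     # s = sin(theta) from dOmega
    return total
```

With $v = 0.6$ the expected value is $\tfrac23\gamma^4 = \tfrac23 (1/0.64)^2$ for `e = a = 1`, and at $v = 0$ the result should reduce to the non-relativistic value $2/3$.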
The total power radiated in the case of linear acceleration, with its $\gamma^6$ factor as in
(7.118), is larger by a factor of $\gamma^2$ than the total power radiated in the case of circular
motion, provided we take the acceleration $a$ to be the same in the two cases. However, this
is not always the most relevant comparison to make. Another way to make the comparison
is to take the magnitude of the applied force, $|d\vec p/dt|$, to be the same in the two cases. For
circular motion we have that $v$ is constant, and so
$$\left|\frac{d\vec p}{dt}\right| = m\gamma\left|\frac{d\vec v}{dt}\right| = m\gamma a . \tag{7.126}$$
Thus for circular motion, we have from (7.124) that
$$P(t') = \frac{2e^2\gamma^2}{3m^2}\left|\frac{d\vec p}{dt}\right|^2 . \tag{7.127}$$
By contrast, for linear acceleration, where $\vec v$ is parallel to $\vec a$, we have
$$\frac{d\vec p}{dt} = m\gamma\,\vec a + m\gamma^3 (\vec v\cdot\vec a)\,\vec v = m\gamma^3\,\vec a , \tag{7.128}$$
and so this gives
$$P(t') = \frac{2e^2}{3m^2}\left|\frac{d\vec p}{dt}\right|^2 . \tag{7.129}$$
Thus if we hold $|d\vec p/dt|$ fixed when comparing the two, we see that it is the particle in
circular motion whose radiated power is larger than that of the linearly-accelerated particle,
by a factor of $\gamma^2$.
7.7 Frequency distribution of radiated energy
In this section, we shall discuss the spectrum of frequencies of the electromagnetic radiation
emitted by an accelerating charge. The basic technique for doing this will be to perform a
Fourier transform of the time dependence of the radiated power.
In general, we have
$$\frac{dP(t)}{d\Omega} = [R^2\,\vec n\cdot\vec S]_{\rm ret} = \frac{1}{4\pi}\,\big|[R\,\vec E]_{\rm ret}\big|^2 . \tag{7.130}$$
Let
$$\vec G(t) = \frac{1}{\sqrt{4\pi}}\,[R\,\vec E]_{\rm ret} , \tag{7.131}$$
so that we shall have
$$\frac{dP(t)}{d\Omega} = |\vec G(t)|^2 . \tag{7.132}$$
Note that here $dP(t)/d\Omega$ is expressed in the observer's time $t$, and not the retarded time
$t'$. This is because our goal here will be to determine the frequency spectrum of the
electromagnetic radiation as measured by the observer.
Suppose that the acceleration of the charge occurs only for a finite period of time, so
that the total energy emitted is finite. We shall assume that the observation point is far
enough away from the charge that the spatial region spanned by the charge while it is
accelerating subtends only a small angle as seen by the observer.
The total energy radiated per unit solid angle is given by
$$\frac{dW}{d\Omega} = \int_{-\infty}^{\infty} \frac{dP}{d\Omega}\, dt = \int_{-\infty}^{\infty} |\vec G(t)|^2\, dt . \tag{7.133}$$
We now define the Fourier transform $\vec g(\omega)$ of $\vec G(t)$:
$$\vec g(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \vec G(t)\, e^{i\omega t}\, dt . \tag{7.134}$$
In the usual way, the inverse transform is then
$$\vec G(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \vec g(\omega)\, e^{-i\omega t}\, d\omega . \tag{7.135}$$
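It follows that
$$\frac{dW}{d\Omega} = \int_{-\infty}^{\infty} |\vec G(t)|^2\, dt
  = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} d\omega'\;
    \vec g^{\,*}(\omega')\cdot\vec g(\omega)\, e^{i(\omega' - \omega)t} . \tag{7.136}$$
The $t$ integration can be performed, using
$$\int_{-\infty}^{\infty} dt\, e^{i(\omega' - \omega)t} = 2\pi\,\delta(\omega' - \omega) , \tag{7.137}$$
and so
$$\frac{dW}{d\Omega} = \int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} d\omega'\;
  \vec g^{\,*}(\omega')\cdot\vec g(\omega)\,\delta(\omega' - \omega)
  = \int_{-\infty}^{\infty} d\omega\; \vec g^{\,*}(\omega)\cdot\vec g(\omega) , \tag{7.138}$$
i.e.
$$\frac{dW}{d\Omega} = \int_{-\infty}^{\infty} d\omega\, |\vec g(\omega)|^2 . \tag{7.139}$$
(The result that (7.133) can be expressed as (7.139) is known as Parseval's Theorem in
Fourier transform theory.)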
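Parseval's theorem, with the normalisation of the transform given in (7.134), can be illustrated numerically for a single real component of $\vec G(t)$. The sketch below is an illustration under stated assumptions: the pulse shapes, integration ranges and grid sizes are arbitrary choices, the pulse is assumed to decay rapidly so that truncating the integrals is harmless, and the function names are hypothetical.

```python
import cmath
import math


def fourier_transform(G, omega, t_max=10.0, n=1000):
    """g(omega) = (2 pi)^(-1/2) * integral of G(t) exp(i omega t) dt, as in (7.134)."""
    h = 2.0 * t_max / n
    total = 0.0 + 0.0j
    for k in range(n + 1):
        t = -t_max + k * h
        total += G(t) * cmath.exp(1j * omega * t)
    return total * h / math.sqrt(2.0 * math.pi)


def energy_time(G, t_max=10.0, n=1000):
    """Integral of |G(t)|^2 over t, the time-domain side of Parseval's theorem."""
    h = 2.0 * t_max / n
    return sum(abs(G(-t_max + k * h)) ** 2 for k in range(n + 1)) * h


def energy_freq(G, w_max=8.0, m=320):
    """Integral of |g(omega)|^2 over omega, the frequency-domain side."""
    h = 2.0 * w_max / m
    return sum(abs(fourier_transform(G, -w_max + j * h)) ** 2
               for j in range(m + 1)) * h
```

For a Gaussian pulse $G(t) = e^{-t^2/2}$ both sides should equal $\sqrt{\pi}$, and the equality should hold equally for other rapidly decaying pulses such as $G(t) = t\, e^{-t^2/2}$.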
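We can re-express (7.139) as
$$\frac{dW}{d\Omega} = \int_0^{\infty} d\omega\, \frac{d^2 I(\omega, \vec n)}{d\omega\, d\Omega} , \tag{7.140}$$
where
$$\frac{d^2 I(\omega, \vec n)}{d\omega\, d\Omega} = |\vec g(\omega)|^2 + |\vec g(-\omega)|^2 \tag{7.141}$$
is the energy emitted per unit solid angle per unit frequency interval. If
$\vec G(t) = [R\,\vec E]_{\rm ret}/\sqrt{4\pi}$ is real, then
$$\vec g(-\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dt\, \vec G(t)\, e^{-i\omega t} = \vec g^{\,*}(\omega) , \tag{7.142}$$
and then
$$\frac{d^2 I(\omega, \vec n)}{d\omega\, d\Omega} = 2\,|\vec g(\omega)|^2 . \tag{7.143}$$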
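Using the expression for $\vec E$ in (7.101), the Fourier transform $\vec g(\omega)$, given by (7.134) with
(7.131), is
$$\vec g(\omega) = \frac{e}{2\sqrt2\,\pi}\int_{-\infty}^{\infty} e^{i\omega t}
  \left[\frac{\vec n\times[(\vec n - \vec v)\times\vec a]}{(1 - \vec n\cdot\vec v)^3}\right]_{\rm ret} dt , \tag{7.144}$$
where as usual, the subscript “ret” is a reminder that the quantity is evaluated at the
retarded time $t'$. Since
$$dt = \frac{dt}{dt'}\, dt' = (1 - \vec n\cdot\vec v)\, dt' , \tag{7.145}$$
we therefore have
$$\vec g(\omega) = \frac{e}{2\sqrt2\,\pi}\int_{-\infty}^{\infty} e^{i\omega (t' + R(t'))}\,
  \frac{\vec n\times[(\vec n - \vec v)\times\vec a]}{(1 - \vec n\cdot\vec v)^2}\, dt' . \tag{7.146}$$
(We have now dropped the “ret” reminder, since everything inside the integrand now
depends on the retarded time $t'$.)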
We are assuming that the observation point is far away from the accelerating charge,
and that the period over which the acceleration occurs is short enough that the vector
$\vec n = \vec R(t')/R(t')$ is approximately constant during this time interval. It is convenient to
choose the origin to be near to the particle during its period of acceleration. With the
observer being far away, at position vector $\vec r$, it follows from
$\vec R(t') = \vec r - \vec r_0(t')$ that to a good approximation we have
$$R^2(t') \approx r^2 - 2\,\vec r\cdot\vec r_0(t') , \tag{7.147}$$
and so
$$R(t') \approx r\left(1 - \frac{2\,\vec r\cdot\vec r_0(t')}{r^2}\right)^{1/2}
  \approx r - \frac{\vec r\cdot\vec r_0(t')}{r} . \tag{7.148}$$
Furthermore, we can also approximate $\vec n \equiv \vec R(t')/R(t')$ by $\vec r/r$, and so
$$R(t') \approx r - \vec n\cdot\vec r_0(t') . \tag{7.149}$$
Substituting this into (7.146), there will be a phase factor $e^{i\omega r}$ that can be taken outside
the integral, since it is independent of $t'$. This overall phase factor is unimportant (it will
cancel out when we calculate $|\vec g(\omega)|^2$), and so we may drop it and write
$$\vec g(\omega) = \frac{e}{2\sqrt2\,\pi}\int_{-\infty}^{\infty} e^{i\omega (t' - \vec n\cdot\vec r_0(t'))}\,
  \frac{\vec n\times[(\vec n - \vec v)\times\vec a]}{(1 - \vec n\cdot\vec v)^2}\, dt' . \tag{7.150}$$
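From (7.143) we therefore have
$$\frac{d^2 I(\omega, \vec n)}{d\omega\, d\Omega} = \frac{e^2}{4\pi^2}
  \left|\int_{-\infty}^{\infty} e^{i\omega (t' - \vec n\cdot\vec r_0(t'))}\,
  \frac{\vec n\times[(\vec n - \vec v)\times\vec a]}{(1 - \vec n\cdot\vec v)^2}\, dt'\right|^2 , \tag{7.151}$$
as the energy per unit solid angle per unit frequency interval.
The integral can be neatened up by observing that we can write
$$\frac{\vec n\times[(\vec n - \vec v)\times\vec a]}{(1 - \vec n\cdot\vec v)^2}
  = \frac{d}{dt'}\left[\frac{\vec n\times(\vec n\times\vec v)}{1 - \vec n\cdot\vec v}\right] , \tag{7.152}$$
under the assumption that $\vec n$ is a constant. This can be seen by distributing the derivative,
to obtain
$$\begin{aligned}
\frac{d}{dt'}\left[\frac{\vec n\times(\vec n\times\vec v)}{1 - \vec n\cdot\vec v}\right]
 &= \frac{\vec n\times(\vec n\times\vec a)}{1 - \vec n\cdot\vec v}
    + \frac{\vec n\times(\vec n\times\vec v)\,(\vec n\cdot\vec a)}{(1 - \vec n\cdot\vec v)^2} , \\
 &= \frac{(1 - \vec n\cdot\vec v)\big(\vec n\,(\vec n\cdot\vec a) - \vec a\big)
    + \big(\vec n\,(\vec n\cdot\vec v) - \vec v\big)(\vec n\cdot\vec a)}{(1 - \vec n\cdot\vec v)^2} , \\
 &= \frac{(\vec n\cdot\vec a)(\vec n - \vec v) - (1 - \vec n\cdot\vec v)\,\vec a}{(1 - \vec n\cdot\vec v)^2} , \\
 &= \frac{\vec n\times[(\vec n - \vec v)\times\vec a]}{(1 - \vec n\cdot\vec v)^2} .
\end{aligned} \tag{7.153}$$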
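This allows us to integrate (7.151) by parts, to give
$$\frac{d^2 I(\omega, \vec n)}{d\omega\, d\Omega} = \frac{e^2}{4\pi^2}
  \left|-\int_{-\infty}^{\infty} \frac{\vec n\times(\vec n\times\vec v)}{1 - \vec n\cdot\vec v}\,
  \frac{d}{dt'}\, e^{i\omega (t' - \vec n\cdot\vec r_0(t'))}\, dt'\right|^2 , \tag{7.154}$$
and hence
$$\frac{d^2 I(\omega, \vec n)}{d\omega\, d\Omega} = \frac{e^2\omega^2}{4\pi^2}
  \left|\int_{-\infty}^{\infty} \vec n\times(\vec n\times\vec v)\,
  e^{i\omega (t' - \vec n\cdot\vec r_0(t'))}\, dt'\right|^2 . \tag{7.155}$$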
It should be remarked here that the effect of having integrated by parts is that the
acceleration $\vec a$ no longer appears in the expression (7.155). Prior to the integration by
parts, the fact that we were taking the acceleration to be non-zero for only a finite time
interval ensured that the integration over all $t'$ from $-\infty$ to $\infty$ would be cut down to
an integration over only the finite time interval during which $\vec a$ was non-zero. After the
integration by parts, the integrand in (7.155) no longer vanishes outside the time interval
of the non-zero acceleration, and so one might worry about issues of convergence, and the
validity of having dropped the boundary terms at $t' = \pm\infty$ coming from the integration by
parts. In fact, it can be verified that all is well, and any problem with convergence can be
handled by introducing a convergence factor $e^{-\epsilon |t'|}$, and then sending $\epsilon$ to zero.
We shall make use of the result (7.155) in two applications. In the first, we shall calculate
the frequency spectrum for a relativistic particle in instantaneous circular motion.
7.8 Frequency spectrum for relativistic circular motion
Consider a particle which, at some instant, is following a circular arc of radius $\rho$. We shall
choose axes so that the arc lies in the $(x, y)$ plane, and choose the origin so that at $t = 0$
the particle is located at the origin, $x = y = 0$. Without loss of generality, we may choose
the unit vector $\vec n$ (which points in the direction of the observation point) to lie in the $(x, z)$
plane. We shall, for notational convenience, drop the prime from the time $t'$, so from now
on $t$ will denote the retarded time.
The position vector of the particle at time $t$ will be given by
$$\vec r_0 = \left(\rho\sin\frac{vt}{\rho},\; \rho\cos\frac{vt}{\rho} - \rho,\; 0\right) , \tag{7.156}$$
where $v = |\vec v|$ is its speed. Since $\vec v = d\vec r_0(t)/dt$, we shall have
$$\vec v = \left(v\cos\frac{vt}{\rho},\; -v\sin\frac{vt}{\rho},\; 0\right) . \tag{7.157}$$
We may parameterise the unit vector n, which we are taking to lie in the (x, z) plane, in
terms of the angle θ between n and the x axis:
n = (cos θ, 0, sin θ) . (7.158)
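We then have
$$\vec n\times(\vec n\times\vec v) = (\vec n\cdot\vec v)\,\vec n - \vec v = \Big(-v\sin^2\theta\cos\frac{vt}{\rho}\,,\ -v\sin\frac{vt}{\rho}\,,\ v\sin\theta\cos\theta\cos\frac{vt}{\rho}\Big)\,. \tag{7.159}$$
We shall write this as
$$\vec n\times(\vec n\times\vec v) = -v\sin\frac{vt}{\rho}\;\vec e_\parallel + v\sin\theta\cos\frac{vt}{\rho}\;\vec e_\perp\,, \tag{7.160}$$
where
$$\vec e_\parallel = (0, 1, 0) \quad\hbox{and}\quad \vec e_\perp = \vec n\times\vec e_\parallel = (-\sin\theta, 0, \cos\theta)\,. \tag{7.161}$$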
We shall consider a particle whose velocity is highly relativistic. It will be recalled from
our earlier discussions that for such a particle, the electromagnetic radiation will be more
or less completely concentrated in the range of angles $\theta$ very close to 0. Thus, to a good
approximation we shall have $\vec e_\perp \approx (0, 0, 1)$, which is the unit normal to the plane of the
circular motion. In what follows, we shall make approximations that are valid for small $\theta$,
and also for small $t$. We shall also assume that $v$ is very close to 1 (the speed of light).
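From (7.156) and (7.158), we find
$$\begin{aligned}
t - \vec n\cdot\vec r_0(t) &= t - \rho\cos\theta\,\sin\frac{vt}{\rho} \approx t - \rho\,(1-\tfrac12\theta^2)\Big[\frac{vt}{\rho} - \frac16\Big(\frac{vt}{\rho}\Big)^3\Big]\,,\\
&\approx (1-v)\,t + \tfrac12\theta^2\,v t + \frac{v^3 t^3}{6\rho^2}\,,\\
&\approx \tfrac12(1+v)(1-v)\,t + \tfrac12\theta^2\, t + \frac{t^3}{6\rho^2}\,,\\
&= \frac{t}{2\gamma^2} + \tfrac12\theta^2\, t + \frac{t^3}{6\rho^2}\,.
\end{aligned} \tag{7.162}$$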
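The quality of the approximation (7.162) can be checked numerically. The following is a minimal sketch, assuming units with $c = 1$ and $\rho = 1$; the function names and the sample values of $\gamma$, $\theta$ and $t$ are illustrative choices, not part of the notes. It compares the exact phase $t - \rho\cos\theta\,\sin(vt/\rho)$ with the last line of (7.162) for $t$ and $\theta$ of order $1/\gamma$.

```python
import math

def exact_phase(t, theta, gamma, rho=1.0):
    """Exact t - n.r0(t) for the arc (7.156) and direction (7.158), with c = 1."""
    v = math.sqrt(1.0 - 1.0 / gamma**2)
    return t - rho * math.cos(theta) * math.sin(v * t / rho)

def approx_phase(t, theta, gamma, rho=1.0):
    """Small-angle, small-time form given on the last line of (7.162)."""
    return t / (2.0 * gamma**2) + 0.5 * theta**2 * t + t**3 / (6.0 * rho**2)

def relative_error(t, theta, gamma, rho=1.0):
    """Relative difference between the approximate and exact phases."""
    exact = exact_phase(t, theta, gamma, rho)
    return abs(approx_phase(t, theta, gamma, rho) - exact) / abs(exact)

if __name__ == "__main__":
    gamma = 50.0  # illustrative value (assumption)
    for scale in (0.5, 1.0, 2.0):
        t = scale / gamma              # times of order rho / gamma
        theta = scale / (2.0 * gamma)  # angles of order 1 / gamma
        print(scale, relative_error(t, theta, gamma))
```

For these sample values the relative error stays small, which is the regime in which the subsequent Airy-integral analysis is applied.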
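From (7.160), we find
$$\vec n\times(\vec n\times\vec v) \approx -\frac{t}{\rho}\;\vec e_\parallel + \theta\;\vec e_\perp\,. \tag{7.163}$$
We therefore find from (7.155) that
$$\frac{d^2I}{d\omega\,d\Omega} \approx \frac{e^2\omega^2}{4\pi^2}\,\Big| -g_\parallel(\omega)\,\vec e_\parallel + g_\perp(\omega)\,\vec e_\perp \Big|^2 = \frac{e^2\omega^2}{4\pi^2}\,\Big( |g_\parallel(\omega)|^2 + |g_\perp(\omega)|^2 \Big)\,, \tag{7.164}$$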
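where
$$g_\parallel(\omega) = \frac{1}{\rho}\int_{-\infty}^{\infty} t\; e^{i\omega[(\gamma^{-2}+\theta^2)\,t + \frac13 t^3\rho^{-2}]/2}\,dt\,, \qquad
g_\perp(\omega) = \theta \int_{-\infty}^{\infty} e^{i\omega[(\gamma^{-2}+\theta^2)\,t + \frac13 t^3\rho^{-2}]/2}\,dt\,. \tag{7.165}$$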
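Letting
$$u = \frac{t}{\rho}\,(\gamma^{-2}+\theta^2)^{-1/2}\,, \qquad \xi = \tfrac13\,\omega\rho\,(\gamma^{-2}+\theta^2)^{3/2}\,, \tag{7.166}$$
leads to
$$g_\parallel(\omega) = \rho\,(\gamma^{-2}+\theta^2)\int_{-\infty}^{\infty} u\, e^{3i\xi(u+u^3/3)/2}\,du\,, \qquad
g_\perp(\omega) = \rho\,\theta\,(\gamma^{-2}+\theta^2)^{1/2}\int_{-\infty}^{\infty} e^{3i\xi(u+u^3/3)/2}\,du\,. \tag{7.167}$$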
These integrals are related to Airy integrals, or modified Bessel functions:
$$\int_0^\infty u\,\sin[3\xi(u+u^3/3)/2]\,du = \frac{1}{\sqrt3}\,K_{2/3}(\xi)\,, \qquad
\int_0^\infty \cos[3\xi(u+u^3/3)/2]\,du = \frac{1}{\sqrt3}\,K_{1/3}(\xi)\,, \tag{7.168}$$
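and so we have
$$\frac{d^2I}{d\omega\,d\Omega} \approx \frac{e^2\omega^2\rho^2}{3\pi^2}\,(\gamma^{-2}+\theta^2)^2\,\Big[ \big(K_{2/3}(\xi)\big)^2 + \frac{\theta^2}{\gamma^{-2}+\theta^2}\,\big(K_{1/3}(\xi)\big)^2 \Big]\,. \tag{7.169}$$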
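The asymptotic forms of the modified Bessel functions $K_\nu(x)$, for small $x$ and large $x$, are
$$K_\nu(x) \longrightarrow \tfrac12\,\Gamma(\nu)\,\Big(\frac{2}{x}\Big)^{\nu}\,; \quad x\longrightarrow 0\,, \qquad
K_\nu(x) \longrightarrow \sqrt{\frac{\pi}{2x}}\; e^{-x}\,; \quad x\longrightarrow\infty\,. \tag{7.170}$$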
It therefore follows from (7.169) that $d^2I/(d\omega\,d\Omega)$ falls off rapidly when $\xi$ becomes large.
Bearing in mind that $\gamma^{-2}$ is small (since the velocity of the particle is very near to the
speed of light), and that $\theta$ has been assumed to be small, we see from (7.166) that there is
a regime where $\xi$ can be large, whilst still fulfilling our assumptions, if $\omega\rho$ is large enough.
The value of $\xi$ can then become very large if $\theta$ increases sufficiently (whilst still being small
compared to 1), and so the radiation is indeed concentrated around very small angles $\theta$.
If $\omega$ becomes sufficiently large that $\omega\rho\gamma^{-3}$ is much greater than 1 then $\xi$ will be very
large even if $\theta = 0$. Thus, there is an effective high-frequency cut-off for all angles. It is
convenient to define a "cut-off" frequency $\omega_c$ for which $\xi = 1$ at $\theta = 0$:
$$\omega_c = \frac{3\gamma^3}{\rho} = \frac{3}{\rho}\,\Big(\frac{\mathcal E}{m}\Big)^3\,, \tag{7.171}$$
where $\mathcal E = \gamma m$ is the energy of the particle. If the particle is following a uniform periodic circular orbit, with angular frequency
$\omega_0 = v/\rho \approx 1/\rho$, then we shall have
$$\omega_c = 3\,\omega_0\,\Big(\frac{\mathcal E}{m}\Big)^3\,. \tag{7.172}$$
The radiation in this case of a charged particle in a highly relativistic circular orbit is known
as "Synchrotron Radiation."
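Consider the frequency spectrum of the radiation in the orbital plane, $\theta = 0$. In the two
regimes $\omega \ll \omega_c$ and $\omega \gg \omega_c$ we shall therefore have
$$\begin{aligned}
\omega \ll \omega_c:&\qquad \frac{d^2I}{d\omega\,d\Omega}\Big|_{\theta=0} \approx e^2\,\Big(\frac{\Gamma(2/3)}{\pi}\Big)^2\,\Big(\frac34\Big)^{1/3}\,(\omega\rho)^{2/3}\,,\\
\omega \gg \omega_c:&\qquad \frac{d^2I}{d\omega\,d\Omega}\Big|_{\theta=0} \approx \frac{3e^2\gamma^2}{2\pi}\;\frac{\omega}{\omega_c}\; e^{-2\omega/\omega_c}\,.
\end{aligned} \tag{7.173}$$
This shows that the power per unit solid angle per unit frequency increases from 0 like $\omega^{2/3}$
for small $\omega$, reaches a peak around $\omega = \omega_c$, and then falls off exponentially rapidly once $\omega$ is
significantly greater than $\omega_c$.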
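The two limiting forms in (7.173) can be compared with the full in-plane expression (7.169) numerically. The sketch below is an illustration under stated assumptions: units with $c = 1$ and $e = 1$, and a hand-rolled `bessel_k` that evaluates $K_\nu(x)$ from the standard integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$ by the trapezoid rule (this representation is not used in the notes; it is swapped in here to avoid external libraries). All function names are hypothetical.

```python
import math

def bessel_k(nu, x, steps=20000):
    """K_nu(x) for x > 0 via K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Beyond t_max the factor exp(-x cosh t) underflows to zero in double precision.
    t_max = math.acosh(max(1.0, 700.0 / x)) + 1.0
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # trapezoid endpoint at t = 0
    for i in range(1, steps + 1):
        t = i * h
        weight = 0.5 if i == steps else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def cutoff_frequency(rho, gamma):
    """omega_c = 3 gamma^3 / rho, as in (7.171)."""
    return 3.0 * gamma**3 / rho

def spectrum_in_plane(omega, rho, gamma):
    """(7.169) evaluated at theta = 0, with e = 1."""
    xi = omega * rho / (3.0 * gamma**3)
    return (omega * rho)**2 / (3.0 * math.pi**2 * gamma**4) * bessel_k(2.0 / 3.0, xi)**2

def low_frequency_limit(omega, rho):
    """First line of (7.173), with e = 1."""
    return (math.gamma(2.0 / 3.0) / math.pi)**2 * 0.75**(1.0 / 3.0) * (omega * rho)**(2.0 / 3.0)

def high_frequency_limit(omega, rho, gamma):
    """Second line of (7.173), with e = 1."""
    ratio = omega / cutoff_frequency(rho, gamma)
    return 3.0 * gamma**2 / (2.0 * math.pi) * ratio * math.exp(-2.0 * ratio)

if __name__ == "__main__":
    rho, gamma = 1.0, 10.0  # illustrative values (assumption)
    wc = cutoff_frequency(rho, gamma)
    for ratio in (1e-4, 1e-2, 1.0, 5.0, 20.0):
        w = ratio * wc
        print(ratio, spectrum_in_plane(w, rho, gamma),
              low_frequency_limit(w, rho), high_frequency_limit(w, rho, gamma))
```

Printing the three columns side by side shows where each limiting form tracks the full expression and where it does not; near $\omega \sim \omega_c$ neither limit is expected to be accurate.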
It is clear that one could continue with the investigation of the properties of the
synchrotron radiation in considerably more depth. For example, one could consider the
detailed angular distribution of the radiation as a function of $\theta$, and one could consider the
total power per unit frequency interval, obtained by integrating over all solid angles:
$$\frac{dI}{d\omega} = \int \frac{d^2I}{d\omega\,d\Omega}\,d\Omega\,. \tag{7.174}$$
A discussion of further details along these lines can be found in almost any of the advanced
electrodynamics textbooks.
7.9 Frequency spectrum for periodic motion
Suppose that the motion of the charged particle is exactly periodic, with period $T = 2\pi/\omega_0$,
where $\omega_0$ is the angular frequency of the particle's motion. This means that $\vec n\cdot\vec r_0(t)$ will be
periodic with period $T$, and so the factor $e^{-i\omega\,\vec n\cdot\vec r_0(t)}$ in (7.155) will have time dependence
of the general form
$$H(t) = \sum_{n\ge1} b_n\, e^{-i n\omega_0 t}\,. \tag{7.175}$$
(We are again using $t$ to denote the retarded time here, to avoid a profusion of primes.) The
Fourier transform $h(\omega)$ of the function $H(t)$ is zero except when $\omega$ is an integer multiple of
$\omega_0$, and for these values it is proportional to a delta function:
$$h(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{i\omega t}\,H(t)\,dt = \frac{1}{\sqrt{2\pi}}\sum_{n\ge1} b_n \int_{-\infty}^{\infty} e^{i(\omega-n\omega_0)t}\,dt = \sqrt{2\pi}\,\sum_{n\ge1} b_n\,\delta(\omega-n\omega_0)\,. \tag{7.176}$$
In fact, it is more appropriate to work with Fourier series, rather than Fourier transforms,
in this situation with a discrete frequency spectrum.
Going back to section 7.7, we therefore now expand $\vec G(t)$ in the Fourier series
$$\vec G(t) = \sum_{n\ge1} \vec a_n\, e^{-i n\omega_0 t}\,. \tag{7.177}$$
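Multiplying by $e^{i m\omega_0 t}$ and integrating over the period $T = 2\pi/\omega_0$ gives
$$\frac{1}{T}\int_0^T e^{i m\omega_0 t}\,\vec G(t)\,dt = \frac{1}{T}\sum_{n\ge1}\vec a_n\int_0^T e^{i(m-n)\omega_0 t}\,dt = \vec a_m\,, \tag{7.178}$$
since the integral of $e^{i(m-n)\omega_0 t}$ vanishes unless $n = m$:
$$\frac{1}{T}\int_0^T e^{i(m-n)\omega_0 t}\,dt = \delta_{m,n}\,. \tag{7.179}$$
Thus the coefficients $\vec a_n$ in the Fourier series (7.177) are given by
$$\vec a_n = \frac{1}{T}\int_0^T e^{i n\omega_0 t}\,\vec G(t)\,dt\,. \tag{7.180}$$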
The analogue of Parseval's theorem for the case of the discrete Fourier series is now
given by considering
$$\frac{1}{T}\int_0^T |\vec G(t)|^2\,dt = \frac{1}{T}\int_0^T \sum_{m,n}\vec a_n\cdot\vec a_m^{\,*}\; e^{i(m-n)\omega_0 t}\,dt = \sum_n |\vec a_n|^2\,. \tag{7.181}$$
The time average of the power per unit solid angle is therefore given by
$$\Big\langle \frac{dP}{d\Omega}\Big\rangle \equiv \frac{1}{T}\int_0^T \frac{dP}{d\Omega}\,dt = \frac{1}{T}\int_0^T |\vec G(t)|^2\,dt = \sum_n |\vec a_n|^2\,. \tag{7.182}$$
The term $|\vec a_n|^2$ in this summation therefore has the interpretation of being the time-averaged
power per unit solid angle in the $n$'th mode, which we shall denote by $dP_n/d\Omega$:
$$\frac{dP_n}{d\Omega} = |\vec a_n|^2\,. \tag{7.183}$$
It is now a straightforward matter, using (7.180), to obtain an expression for $|\vec a_n|^2$ in
terms of the integral of the retarded electric field. The steps follow exactly in parallel
with those we described in section 7.7, except that the integral $\int_{-\infty}^{\infty} dt$ is now replaced by
$T^{-1}\int_0^T dt$. The upshot is that the expression (7.155) for $d^2I/(d\omega\,d\Omega)$ is replaced by [15]
$$\frac{dP_n}{d\Omega} = \frac{e^2 n^2\omega_0^4}{4\pi^2}\,\left|\int_0^T \vec n\times(\vec n\times\vec v)\; e^{i n\omega_0(t-\vec n\cdot\vec r_0(t))}\,dt\right|^2\,, \tag{7.184}$$
where $T = 2\pi/\omega_0$. This gives the expression for the time-averaged power per unit solid
angle in the $n$'th Fourier mode.
[15] The integer $n$ labelling the modes is not to be confused with the unit vector $\vec n$, of course!
Since we are assuming the observer (at $\vec r$) is far away from the particle, and since the
integral in (7.184) is taken over the finite time interval $T = 2\pi/\omega_0$, it follows that to a good
approximation we can freely take the unit vector $\vec n$ outside the integral. Thus we may make
the replacement
$$\int_0^T \vec n\times(\vec n\times\vec v)\; e^{i n\omega_0(t-\vec n\cdot\vec r_0(t))}\,dt \longrightarrow \vec n\times\Big(\vec n\times\int_0^T \vec v\; e^{i n\omega_0(t-\vec n\cdot\vec r_0(t))}\,dt\Big)\,. \tag{7.185}$$
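Now, for any vector $\vec V$, we have that
$$|\vec n\times(\vec n\times\vec V)|^2 = |(\vec n\cdot\vec V)\,\vec n - \vec V|^2 = V^2 - (\vec n\cdot\vec V)^2\,, \tag{7.186}$$
and on the other hand we also have
$$|\vec n\times\vec V|^2 = (\vec n\times\vec V)\cdot(\vec n\times\vec V) = \vec n\cdot[\vec V\times(\vec n\times\vec V)] = \vec n\cdot[V^2\,\vec n - (\vec n\cdot\vec V)\,\vec V] = V^2 - (\vec n\cdot\vec V)^2\,. \tag{7.187}$$
Thus $|\vec n\times(\vec n\times\vec V)|^2 = |\vec n\times\vec V|^2$, and so we can re-express (7.184) as
$$\frac{dP_n}{d\Omega} = \frac{e^2 n^2\omega_0^4}{4\pi^2}\,\left|\int_0^T \vec v\times\vec n\; e^{i n\omega_0(t-\vec n\cdot\vec r_0(t))}\,dt\right|^2\,, \tag{7.188}$$
where $T = 2\pi/\omega_0$ and $\omega_0$ is the angular frequency of the periodic motion. Recall that
throughout this section, we are using $t$ to denote the retarded time, in order to avoid
writing the primes on $t'$ in all the formulae.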
7.10 Cerenkov radiation
So far, all the situations we have considered have involved electromagnetic fields in a
vacuum, i.e. in the absence of any dielectric or magnetically permeable media. In this section,
we shall take a brief foray into a situation where there is a dielectric medium.
It will be recalled that if a medium has permittivity $\epsilon$ and permeability $\mu$, then
electromagnetic waves in the medium will propagate with speed $\tilde c = 1/\sqrt{\epsilon\mu}$. This means in general
that the "speed of light" in the medium will be less than the speed of light in vacuum. A
consequence of this is that a charged particle, such as an electron, can travel faster than
the local speed of light inside the medium. This leads to an interesting effect, known as
Cerenkov Radiation. In practice, the types of media of interest are those that are optically
transparent, such as glass or water, and these have magnetic permeability $\mu$ very nearly
equal to 1, while the electric permittivity $\epsilon$ can be quite significantly greater than 1. Thus
for the purposes of our discussion, we shall assume that $\mu = 1$ and that the local speed of
light is reduced because $\epsilon$ is significantly greater than 1.
We shall make use of the result (7.155) for the radiated power spectrum, in order to
study the Cerenkov radiation. First, we shall need to introduce the dielectric constant into
the formula. This can be done by a simple scaling argument. We shall also, just for the
purposes of this section, restore the explicit symbol $c$ for the speed of light. This can be
done by sending
$$t \longrightarrow c\,t\,, \qquad \omega \longrightarrow \frac{\omega}{c}\,. \tag{7.189}$$
(Of course any other quantity that involves time will also need to be rescaled appropriately
too. This is just dimensional analysis.)
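Referring back to the discussion in section 2.1, it can be seen that the dielectric constant $\epsilon$
can be introduced into the vacuum Maxwell equations by making the scalings
$$\rho \longrightarrow \frac{\rho}{\sqrt\epsilon}\,, \qquad \vec E \longrightarrow \sqrt\epsilon\;\vec E\,, \qquad \vec B \longrightarrow \vec B\,, \qquad c \longrightarrow \frac{c}{\sqrt\epsilon}\,. \tag{7.190}$$
Of course the scaling of the charge density $\rho$ implies that we must also rescale the charge $e$
of the particle, according to
$$e \longrightarrow \frac{e}{\sqrt\epsilon}\,. \tag{7.191}$$
Note that $c$ continues to mean the speed of light in vacuum. The speed of light inside the
dielectric medium is given by
$$\tilde c = \frac{c}{\sqrt\epsilon}\,. \tag{7.192}$$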
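The expression (7.155) for the radiated power per unit solid angle per unit frequency
interval now becomes
$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon}{4\pi^2c^3}\,\left|\int_{-\infty}^{\infty}\vec n\times(\vec n\times\vec v)\; e^{i\omega(t'-\sqrt\epsilon\;\vec n\cdot\vec r_0(t')/c)}\,dt'\right|^2\,. \tag{7.193}$$
For a charge moving at constant velocity $\vec v$, we shall have
$$\vec r_0(t') = \vec v\,t'\,, \tag{7.194}$$
and so (7.193) gives
$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon}{4\pi^2c^3}\,|\vec n\times\vec v|^2\,\left|\int_{-\infty}^{\infty} e^{i\omega t'(1-\sqrt\epsilon\;\vec n\cdot\vec v/c)}\,dt'\right|^2\,, \tag{7.195}$$
since $|\vec n\times(\vec n\times\vec v)|^2 = |(\vec n\cdot\vec v)\,\vec n - \vec v|^2 = v^2 - (\vec n\cdot\vec v)^2 = |\vec n\times\vec v|^2$.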
The integration over $t'$ produces a delta-function [16]. Defining $\theta$ to be the angle between
$\vec n$ and $\vec v$, so that $\vec n\cdot\vec v = v\cos\theta$, we therefore have
$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon}{c^3}\,v^2\sin^2\theta\;\big|\delta\big(\omega(1-\sqrt\epsilon\,(v/c)\cos\theta)\big)\big|^2\,, \tag{7.196}$$
and so (using $\delta(ax) = \delta(x)/a$)
$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\sqrt\epsilon}{c^3}\,v^2\sin^2\theta\;\big|\delta\big(1-\sqrt\epsilon\,(v/c)\cos\theta\big)\big|^2\,. \tag{7.197}$$
[16] The occurrence of the delta-function is because of the unphysical assumption that the particle has been
moving in the medium forever. Below, we shall obtain a more realistic expression by supposing that the
particle travels through a slab of medium of finite thickness.
This expression shows that all the radiation is emitted at a single angle $\theta_c$, known as the
Cerenkov Angle, given by
$$\cos\theta_c = \frac{c}{v\sqrt\epsilon}\,. \tag{7.198}$$
Note that in terms of $\tilde c$, the speed of light in the medium, as given in (7.192), we have
$$\cos\theta_c = \frac{\tilde c}{v}\,. \tag{7.199}$$
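As a small worked illustration of (7.198) and (7.199), the sketch below computes the Cerenkov angle for a given $\beta = v/c$ and permittivity $\epsilon$, and reports that there is no emission when $v \le \tilde c$. The function name and the sample medium are assumptions made for the example: the refractive index $\sqrt\epsilon \approx 1.33$ is a commonly quoted approximate value for water at optical frequencies and is not taken from the notes.

```python
import math

def cerenkov_angle(beta, eps):
    """Cerenkov angle in radians from cos(theta_c) = 1 / (beta * sqrt(eps)), cf. (7.198).

    Returns None when beta * sqrt(eps) <= 1, i.e. when the particle is not
    moving faster than the local speed of light c / sqrt(eps).
    """
    n_beta = beta * math.sqrt(eps)
    if n_beta <= 1.0:
        return None
    return math.acos(1.0 / n_beta)

if __name__ == "__main__":
    eps_water = 1.33**2  # assumed approximate optical value for water
    for beta in (0.5, 0.76, 0.9, 0.99):
        angle = cerenkov_angle(beta, eps_water)
        label = "below threshold" if angle is None else f"{math.degrees(angle):.2f} deg"
        print(beta, label)
```

Because the angle depends only on $v/\tilde c$, measuring $\theta_c$ in a medium of known $\epsilon$ determines the particle's speed.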
This makes clear that the phenomenon of Cerenkov radiation occurs only if $v > \tilde c$, i.e. if
the charged particle is moving through the medium at a velocity that is greater than the
local velocity of light in the medium. In fact one can understand the Cerenkov radiation as
a kind of "shock wave," very like the acoustic shock wave that occurs when an aircraft is
travelling faster than the speed of sound. The Cerenkov angle $\theta_c$ is given by a very simple
geometric construction, shown in Figure 3 below. The circles show the light-fronts of light
emitted by the particle. Since the particle is travelling faster than the speed of light in
the medium, it "outruns" the circles, leaving a trail of light-fronts tangent to the angled
line in the figure. This is the light-front of the Cerenkov radiation.
As mentioned above, the squared delta-function in (7.197) is the result of making the
unrealistic assumption that the particle has been ploughing through the medium for ever,
at a speed greater than the local speed of light. A more realistic situation would be to
consider a charged particle entering a thin slab of dielectric medium, such that it enters at
time $t' = -T$ and exits at $t' = +T$. The expression (7.195) is then replaced by
$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon}{4\pi^2c^3}\,|\vec n\times\vec v|^2\,\left|\int_{-T}^{T} e^{i\omega t'(1-\sqrt\epsilon\;\vec n\cdot\vec v/c)}\,dt'\right|^2\,, \tag{7.200}$$
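which, using $\int_{-T}^{T} dt\, e^{i b t} = 2 b^{-1}\sin bT$, therefore implies that
$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon\;v^2T^2\sin^2\theta}{\pi^2c^3}\,\left(\frac{\sin[\omega T(1-\sqrt\epsilon\,(v/c)\cos\theta)]}{\omega T(1-\sqrt\epsilon\,(v/c)\cos\theta)}\right)^2\,. \tag{7.201}$$
This is sharply peaked around the Cerenkov angle $\theta_c$ given by (7.198).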
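Integrating over all angles we obtain the total energy per unit frequency interval
$$\frac{dI}{d\omega} = \int\frac{d^2I}{d\omega\,d\Omega}\,d\Omega \approx \frac{2e^2\omega^2\sqrt\epsilon\;v^2T^2\sin^2\theta_c}{\pi c^3}\int_0^\pi\left(\frac{\sin[\omega T(1-\sqrt\epsilon\,(v/c)\cos\theta)]}{\omega T(1-\sqrt\epsilon\,(v/c)\cos\theta)}\right)^2\sin\theta\,d\theta\,. \tag{7.202}$$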
Figure 3: The Cerenkov angle $\theta_c$ is given by $\cos\theta_c = (\tilde c\,t)/(v\,t) = \tilde c/v$. (Labels in the figure: $\tilde c\,t$, $v\,t$, Cerenkov angle.)
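(The integrand is peaked sharply around $\theta = \theta_c$, so to a good approximation we can take
the $\sin^2\theta$ factor outside the integral, calling it $\sin^2\theta_c$.) Letting $x = \cos\theta$, the remaining
integral can be written as
$$\int_{-1}^{1}\left(\frac{\sin[\omega T(1-\sqrt\epsilon\,(v/c)\,x)]}{\omega T(1-\sqrt\epsilon\,(v/c)\,x)}\right)^2 dx \approx \int_{-\infty}^{\infty}\left(\frac{\sin[\omega T(1-\sqrt\epsilon\,(v/c)\,x)]}{\omega T(1-\sqrt\epsilon\,(v/c)\,x)}\right)^2 dx\,. \tag{7.203}$$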
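(The limits of integration can, to a good approximation, be extended to $\pm\infty$ because the
integrand is peaked around $x = \cos\theta_c$.) Letting $\omega T - \omega T\sqrt\epsilon\;v\,x/c = -y$, the integral
becomes
$$\frac{c}{\omega T\sqrt\epsilon\;v}\int_{-\infty}^{\infty}\frac{\sin^2 y}{y^2}\,dy = \frac{\pi c}{\omega T\sqrt\epsilon\;v}\,, \tag{7.204}$$
and so expression (7.202) for the total energy per unit frequency interval becomes
$$\frac{dI}{d\omega} \approx \frac{2e^2\,v\,\omega\,T\sin^2\theta_c}{c^2}\,. \tag{7.205}$$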
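The step from (7.203) to (7.204), where the limits are extended to $\pm\infty$, can be tested numerically. The following sketch is an illustration under assumptions: units with $c = 1$, the combination $\sqrt\epsilon\,v/c$ passed in as a single parameter `n_beta`, a plain midpoint rule on $[-1, 1]$, and sample values chosen only for the demonstration. The function names are hypothetical.

```python
import math

def sinc_squared(y):
    """(sin y / y)^2, with the removable singularity at y = 0 handled explicitly."""
    if abs(y) < 1e-8:
        return 1.0
    s = math.sin(y) / y
    return s * s

def angular_integral(omega_T, n_beta, steps=200_000):
    """Midpoint-rule value of the finite-range integral on the left of (7.203)."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        total += sinc_squared(omega_T * (1.0 - n_beta * x))
    return total * h

def closed_form(omega_T, n_beta):
    """pi c / (omega T sqrt(eps) v) from (7.204), with n_beta = sqrt(eps) v / c."""
    return math.pi / (omega_T * n_beta)

if __name__ == "__main__":
    n_beta = 1.33 * 0.99  # assumed sample: sqrt(eps) = 1.33, v/c = 0.99
    for omega_T in (50.0, 500.0):
        print(omega_T, angular_integral(omega_T, n_beta), closed_form(omega_T, n_beta))
```

The agreement improves as $\omega T$ grows, since the neglected tails of $\sin^2 y/y^2$ outside the physical range of $x$ then carry a smaller share of the total.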
The distance through the slab is given by $2vT$, and so dividing by this, we obtain an
expression for the total energy of Cerenkov radiation per unit frequency interval per unit
path length:
$$\frac{d^2I}{d\omega\,d\ell} = \frac{e^2\omega}{c^2}\,\sin^2\theta_c = \frac{e^2\omega}{c^2}\,\Big(1-\frac{c^2}{v^2\epsilon}\Big)\,. \tag{7.206}$$
This is known as the Frank-Tamm relation. Note that this expression grows linearly with
$\omega$, which means that the bulk of the energy is concentrated in the higher frequencies of
electromagnetic radiation. Of course there must be some limit, which arises because the
dielectric constant will fall off with increasing frequency, and so the Cerenkov effect will cease
to operate at high enough frequencies [17]. In practice, the peak of the frequency spectrum
for Cerenkov radiation is in the ultraviolet.
The bluish-green glow visible in pictures of nuclear fuel rods immersed in water is a
familiar example of Cerenkov radiation. Apart from looking nice, the Cerenkov effect is
also of practical use, for measuring the velocity of charged particles moving at relativistic
speeds. One can determine the velocity by allowing the particles to move through a slab of
suitably-chosen dielectric material, and measuring the Cerenkov angle.
7.11 Thomson scattering
Another application of the Larmor formula is in the phenomenon known as Thomson
scattering. Consider a plane electromagnetic wave incident on a particle of charge $e$ and
mass $m$. The charge will oscillate back and forth in the electric field of the wave, and so
it will emit electromagnetic radiation itself. The net effect is that the electron
"scatters" some of the incoming wave.
In most circumstances, we can assume that the induced oscillatory motion of the electron
will be non-relativistic. As we saw in (7.73), if $\Theta$ is the angle between the acceleration $\vec a$
and the unit vector $\vec n$ (which lies along the line from the electron to the observation point),
then the power radiated per unit solid angle is
$$\frac{dP}{d\Omega} = \frac{e^2a^2}{4\pi}\,\sin^2\Theta\,. \tag{7.207}$$
[17] At sufficiently high frequencies, which implies very small wavelengths, the approximation in which the
medium is viewed as a continuum with an effective dielectric constant breaks down, and it looks more and
more like empty space with isolated charges present. At such length scales the electron is more or less
propagating through a vacuum, and so there is no possibility of its exceeding the local speed of light. Thus
the Cerenkov effect tails off at sufficiently high frequencies.
Let us suppose that the plane electromagnetic wave has electric field given by (the real
part of)
$$\vec E = E_0\,\vec\epsilon\; e^{i(\vec k\cdot\vec r-\omega t)}\,, \tag{7.208}$$
and that the wave-vector $\vec k$ lies along the $z$ axis. The unit polarisation vector $\vec\epsilon$, which must
therefore lie in the $(x, y)$ plane, may be parameterised as
$$\vec\epsilon = (\cos\psi, \sin\psi, 0)\,. \tag{7.209}$$
Using standard spherical polar coordinates, the unit vector n will be given by
n = (sin θ cos ϕ, sin θ sin ϕ, cos θ) . (7.210)
In particular, this means
$$\vec n\cdot\vec\epsilon = \sin\theta\,(\cos\varphi\cos\psi + \sin\varphi\sin\psi) = \sin\theta\cos(\varphi-\psi)\,. \tag{7.211}$$
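The acceleration of the electron will be given by
$$m\,\vec a = e\,\vec E\,, \quad\hbox{so}\quad \vec a = \frac{e}{m}\,E_0\,\vec\epsilon\; e^{i\omega(z-t)}\,. \tag{7.212}$$
Note that this means
$$\vec n\cdot\vec a = \frac{e}{m}\,E_0\,\vec n\cdot\vec\epsilon\; e^{i\omega(z-t)} = \frac{e}{m}\,E_0\, e^{i\omega(z-t)}\,\sin\theta\cos(\varphi-\psi)\,. \tag{7.213}$$
Since $\vec n\cdot\vec a = a\cos\Theta$, it follows that (7.207) becomes
$$\frac{dP}{d\Omega} = \frac{e^2}{4\pi}\,\big(a^2 - (\vec n\cdot\vec a)^2\big)\,, \tag{7.214}$$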
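and so the time average will be given by
$$\Big\langle\frac{dP}{d\Omega}\Big\rangle = \frac{e^4}{8\pi m^2}\,|E_0|^2\,[1-(\vec n\cdot\vec\epsilon)^2]\,. \tag{7.215}$$
Thus we find
$$\Big\langle\frac{dP}{d\Omega}\Big\rangle = \frac{e^4}{8\pi m^2}\,|E_0|^2\,[1-\sin^2\theta\cos^2(\varphi-\psi)]\,. \tag{7.216}$$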
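The direction of the polarisation (in the $(x, y)$ plane) of the incoming electromagnetic
wave is parameterised by the angle $\psi$. For unpolarised incoming waves, we should average
over all angles $\psi$. Thus we obtain
$$\Big\langle\Big\langle\frac{dP}{d\Omega}\Big\rangle\Big\rangle_\psi \equiv \frac{1}{2\pi}\int_0^{2\pi}d\psi\,\Big\langle\frac{dP}{d\Omega}\Big\rangle = \frac{e^4}{8\pi m^2}\,|E_0|^2\,(1-\tfrac12\sin^2\theta) = \frac{e^4}{16\pi m^2}\,|E_0|^2\,(1+\cos^2\theta)\,. \tag{7.217}$$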
The scattering cross section $d\sigma/d\Omega$ is then defined by
$$\frac{d\sigma}{d\Omega} = \frac{\hbox{Energy radiated/unit time/unit solid angle}}{\hbox{Incident energy flux/unit area/unit time}}\,. \tag{7.218}$$
The denominator here will just be $|E_0|^2/(8\pi)$, which is the time average of the Poynting
flux for the incoming wave. Thus we arrive at the Thomson Formula for the cross section:
$$\frac{d\sigma}{d\Omega} = \frac{e^4\,(1+\cos^2\theta)}{2m^2}\,. \tag{7.219}$$
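The total scattering cross section is obtained by integrating $d\sigma/d\Omega$ over all solid angles,
which gives
$$\begin{aligned}
\sigma &= \int\frac{d\sigma}{d\Omega}\,d\Omega = 2\pi\int_0^\pi\frac{d\sigma}{d\Omega}\,\sin\theta\,d\theta\,,\\
&= \frac{\pi e^4}{m^2}\int_0^\pi(1+\cos^2\theta)\,\sin\theta\,d\theta = \frac{\pi e^4}{m^2}\int_{-1}^{1}(1+c^2)\,dc\,,
\end{aligned} \tag{7.220}$$
where $c = \cos\theta$, and so we find
$$\sigma = \frac{8\pi e^4}{3m^2}\,. \tag{7.221}$$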
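Both the polarisation average in (7.217) and the total cross section (7.221) can be verified by direct numerical integration. The sketch below is a minimal check, assuming units with $e = m = 1$ by default; the function names and the sample angles are illustrative choices.

```python
import math

def polarised_pattern(theta, phi, psi):
    """Angular factor 1 - sin^2(theta) cos^2(phi - psi) appearing in (7.216)."""
    return 1.0 - (math.sin(theta) * math.cos(phi - psi))**2

def psi_average(theta, phi, samples=360):
    """Uniform average of the angular factor over the polarisation angle psi."""
    total = 0.0
    for k in range(samples):
        psi = 2.0 * math.pi * k / samples
        total += polarised_pattern(theta, phi, psi)
    return total / samples

def differential_cross_section(theta, e=1.0, m=1.0):
    """Thomson formula (7.219): e^4 (1 + cos^2 theta) / (2 m^2)."""
    return e**4 * (1.0 + math.cos(theta)**2) / (2.0 * m**2)

def total_cross_section(e=1.0, m=1.0, steps=20000):
    """sigma = 2 pi int_0^pi (d sigma / d Omega) sin(theta) d theta, by the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += differential_cross_section(theta, e, m) * math.sin(theta)
    return 2.0 * math.pi * total * h

if __name__ == "__main__":
    print(psi_average(0.7, 0.3), 0.5 * (1.0 + math.cos(0.7)**2))
    print(total_cross_section(), 8.0 * math.pi / 3.0)
```

The first printed pair checks that averaging over $\psi$ removes the dependence on $\varphi$ and leaves $\tfrac12(1+\cos^2\theta)$; the second compares the integrated cross section with $8\pi e^4/(3m^2)$.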
8 Radiating Systems
8.1 Fields due to localised oscillating sources
Consider a localised system of oscillating charge, at a single frequency $\omega$. The charge density
and current density can therefore be written as
$$\rho(\vec r, t) = \rho(\vec r)\,e^{-i\omega t}\,, \qquad \vec J(\vec r, t) = \vec J(\vec r)\,e^{-i\omega t}\,. \tag{8.1}$$
From the expressions (7.21) and (7.22) for the retarded potentials, we shall have
$$\phi(\vec r, t) = \int\frac{\rho(\vec r', t-|\vec r-\vec r'|)}{|\vec r-\vec r'|}\,d^3\vec r' = e^{-i\omega t}\int\frac{\rho(\vec r')}{|\vec r-\vec r'|}\,e^{ik|\vec r-\vec r'|}\,d^3\vec r'\,. \tag{8.2}$$
Note that here $k$ is simply equal to $\omega$, and we have switched to the symbol $k$ in the
exponential inside the integral because it looks more conventional. In a similar fashion, we
shall have
$$\vec A(\vec r, t) = e^{-i\omega t}\int\frac{\vec J(\vec r')}{|\vec r-\vec r'|}\,e^{ik|\vec r-\vec r'|}\,d^3\vec r'\,. \tag{8.3}$$
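From these expressions for $\phi$ and $\vec A$, we can calculate $\vec E = -\vec\nabla\phi - \partial\vec A/\partial t$ and
$\vec B = \vec\nabla\times\vec A$. In fact, because of the simple monochromatic nature of the time dependence, we
can calculate $\vec E$ easily, once we know $\vec B$, from the Maxwell equation
$$\vec\nabla\times\vec B - \frac{\partial\vec E}{\partial t} = 4\pi\,\vec J\,. \tag{8.4}$$
Away from the localised source region we have $\vec J = 0$. From the time dependence we have
$\partial\vec E/\partial t = -i\omega\,\vec E = -ik\,\vec E$, and so we shall have
$$\vec E = \frac{i}{k}\,\vec\nabla\times\vec B\,. \tag{8.5}$$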
Let us suppose that the region where the source charges and currents are non-zero is of
scale size $d$. The wavelength of the monochromatic waves that they generate will then be
given by
$$\lambda = \frac{2\pi}{\omega} = \frac{2\pi}{k}\,. \tag{8.6}$$
We shall assume that $d \ll \lambda$, i.e. the scale size of the source region is very small compared
with the wavelength of the electromagnetic waves that are produced. It will be convenient
to choose the origin of the coordinate system to lie within the neighbourhood of the source
region, so that we may therefore assume
$$|\vec r'| \ll \lambda \tag{8.7}$$
for all integration points $\vec r'$ in the expressions (8.2) and (8.3).
The discussion of the electromagnetic fields generated by these sources can then be
divided, like all Gaul, into three parts:
$$\begin{aligned}
\hbox{Near zone, or Static zone}:&\qquad d \ll r \ll \lambda\,,\\
\hbox{Intermediate zone, or Induction zone}:&\qquad d \ll r \sim \lambda\,,\\
\hbox{Far zone, or Radiation zone}:&\qquad d \ll \lambda \ll r\,.
\end{aligned} \tag{8.8}$$
8.1.1 The static zone
First, let us consider the near zone, where $r \ll \lambda$. Equivalently, we may say that
$$\hbox{Static zone}: \quad kr \ll 1\,. \tag{8.9}$$
Since we are also assuming $d \ll \lambda$, and that the origin of the coordinate system is located
in the neighbourhood of the source region, it follows that in the near zone, we can just
approximate $e^{ik|\vec r-\vec r'|}$ by 1. Thus we shall have
$$\vec A(\vec r, t) \approx e^{-i\omega t}\int\frac{\vec J(\vec r')}{|\vec r-\vec r'|}\,d^3\vec r'\,, \tag{8.10}$$
in the near zone. Aside from the time-dependent factor $e^{-i\omega t}$, this is just like the expression
for the magnetostatic case. We can make a standard expansion, in terms of spherical
harmonics:
$$\frac{1}{|\vec r-\vec r'|} = \sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}\frac{4\pi}{2\ell+1}\,\frac{r_<^{\ell}}{r_>^{\ell+1}}\;Y^*_{\ell m}(\theta',\varphi')\,Y_{\ell m}(\theta,\varphi)\,, \tag{8.11}$$
where $r_>$ is the larger of $r = |\vec r|$ and $r' = |\vec r'|$, and $r_<$ is the smaller of $r$ and $r'$, and $(\theta, \varphi)$
and $(\theta', \varphi')$ are the spherical polar angles for $\vec r$ and $\vec r'$ respectively. Thus we shall have,
since in our case $r' \ll r$,
$$\vec A(\vec r, t) = e^{-i\omega t}\sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}\frac{4\pi}{2\ell+1}\,\frac{1}{r^{\ell+1}}\;Y_{\ell m}(\theta,\varphi)\int\vec J(\vec r')\,r'^{\,\ell}\;Y^*_{\ell m}(\theta',\varphi')\,d^3\vec r'\,. \tag{8.12}$$
In the near zone, therefore, the electromagnetic fields will just be like static fields, except
that they are oscillating in time.
8.1.2 The radiation zone
Next, we shall consider the far zone, or radiation zone. Here, we shall have $kr \gg 1$, while
at the same time $kr' \ll 1$. In other words, the source is (as always) small compared with
the wavelength, but the fields are being observed from a distance which is much
larger than the wavelength. This means that
$$|\vec r-\vec r'|^2 = r^2 - 2\,\vec r\cdot\vec r' + r'^2 \approx r^2 - 2\,\vec r\cdot\vec r'\,, \tag{8.13}$$
and so
$$|\vec r-\vec r'| \approx r\,\Big(1-\frac{2\,\vec r\cdot\vec r'}{r^2}\Big)^{1/2} \approx r - \frac{\vec r\cdot\vec r'}{r}\,, \tag{8.14}$$
and hence
$$|\vec r-\vec r'| \approx r - \vec n\cdot\vec r'\,, \tag{8.15}$$
where $\vec n$ is the unit vector along $\vec r$:
$$\vec n = \frac{\vec r}{r}\,. \tag{8.16}$$
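The size of the error made in (8.15) is easy to examine numerically. The sketch below compares the exact distance $|\vec r - \vec r'|$ with $r - \vec n\cdot\vec r'$ for a distant observation point and a source point near the origin; the specific vectors are arbitrary sample values and the function names are hypothetical.

```python
import math

def exact_distance(r_vec, rp_vec):
    """|r - r'| computed directly."""
    return math.sqrt(sum((a - b)**2 for a, b in zip(r_vec, rp_vec)))

def far_zone_distance(r_vec, rp_vec):
    """r - n.r' with n = r / |r|, the approximation (8.15)."""
    r = math.sqrt(sum(a * a for a in r_vec))
    n_dot_rp = sum(a * b for a, b in zip(r_vec, rp_vec)) / r
    return r - n_dot_rp

if __name__ == "__main__":
    r_vec = (300.0, 400.0, 1200.0)  # observation point, |r| = 1300 (sample value)
    rp_vec = (0.3, -0.2, 0.1)       # source point near the origin (sample value)
    err = abs(exact_distance(r_vec, rp_vec) - far_zone_distance(r_vec, rp_vec))
    print(err)
```

The leading neglected term is of order $r'^2/(2r)$, so the approximation in the phase $k|\vec r-\vec r'|$ is good when $k r'^2/r$ is small, which holds comfortably in the regime $kr' \ll 1 \ll kr$.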
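From (8.3) we shall therefore have
$$\vec A(\vec r, t) \approx e^{i(kr-\omega t)}\int\frac{\vec J(\vec r')}{r-\vec n\cdot\vec r'}\;e^{-ik\,\vec n\cdot\vec r'}\,d^3\vec r'\,, \tag{8.17}$$
in the far zone, and hence, to a very good approximation,
$$\vec A(\vec r, t) \approx \frac{1}{r}\,e^{i(kr-\omega t)}\int\vec J(\vec r')\;e^{-ik\,\vec n\cdot\vec r'}\,d^3\vec r'\,. \tag{8.18}$$
(Recall that $k$ and $\omega$ are just two names for the same quantity here.)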
The magnetic field is given by $\vec B = \vec\nabla\times\vec A$, or $B_i = \epsilon_{ijk}\,\partial_j A_k$. We are interested in
the contribution that dominates at large distance, and this therefore comes from the term
where the derivative lands on the $e^{ikr}$ factor rather than the $1/r$ factor. Thus we shall have
$$\begin{aligned}
B_i &\approx \epsilon_{ijk}\,\frac{1}{r}\,(\partial_j e^{ikr})\,e^{-i\omega t}\int J_k(\vec r')\;e^{-ik\,\vec n\cdot\vec r'}\,d^3\vec r'\,,\\
&= ik\,\frac{x_j}{r}\,\epsilon_{ijk}\,\frac{1}{r}\,e^{ikr}\,e^{-i\omega t}\int J_k(\vec r')\;e^{-ik\,\vec n\cdot\vec r'}\,d^3\vec r'\,,
\end{aligned} \tag{8.19}$$
and so
$$\vec B \approx ik\;\vec n\times\vec A\,. \tag{8.20}$$
Note that the magnetic field in this leading-order approximation falls off like $1/r$. This is
characteristic of electromagnetic radiation, as we have seen previously.
The electric field can be calculated from the magnetic field using (8.5), and again the
leading-order behaviour comes from the term where the gradient operator lands on the $e^{ikr}$
factor. The rule again is therefore that $\vec\nabla \rightarrow ik\,\vec n$, and so we find
$$\vec E \approx -ik\;\vec n\times(\vec n\times\vec A)\,. \tag{8.21}$$
Note that (8.20) and (8.21) imply
$$\vec n\cdot\vec B = 0\,, \qquad \vec n\cdot\vec E = 0\,, \qquad \vec E\cdot\vec B = 0\,. \tag{8.22}$$
Thus $\vec E$ and $\vec B$ are transverse and orthogonal. This is characteristic of radiation fields.
Since we are assuming that the characteristic size $d$ of the source is very small compared
with the wavelength, $d \ll \lambda = 2\pi/k$, it follows that $kd \ll 1$ and so the quantity $k\,|\vec n\cdot\vec r'|$
appearing in the exponential in the integrand in (8.18) is much smaller than 1. This means
that it is useful to expand the exponential in a Taylor series, giving
$$\vec A(\vec r, t) \approx \frac{1}{r}\,e^{i(kr-\omega t)}\sum_{m\ge0}\frac{(-ik)^m}{m!}\int\vec J(\vec r')\,(\vec n\cdot\vec r')^m\,d^3\vec r'\,, \tag{8.23}$$
where the terms in the sum fall off rapidly with $m$.
8.1.3 The induction zone
This is the intermediate zone, where $d \ll \lambda \sim r$, which means that the fields are being
observed from a distance that is comparable with the wavelength, and so $kr \sim 1$. In this
case, we need to consider the exact expansion of $e^{ik|\vec r-\vec r'|}\,|\vec r-\vec r'|^{-1}$. It turns out that this
can be written as
$$\frac{e^{ik|\vec r-\vec r'|}}{|\vec r-\vec r'|} = 4\pi i k\sum_{\ell\ge0} j_\ell(kr')\,h^{(1)}_\ell(kr)\sum_{m=-\ell}^{\ell}Y^*_{\ell m}(\theta',\varphi')\,Y_{\ell m}(\theta,\varphi)\,, \tag{8.24}$$
where $j_\ell(x)$ are spherical Bessel functions and $h^{(1)}_\ell(x)$ are spherical Hankel functions of the
first kind. We shall not pursue the investigation of the induction zone further.
8.2 Electric dipole radiation
In the radiation zone, we obtained (8.23)
$$\vec A(\vec r, t) = \frac{1}{r}\,e^{i(kr-\omega t)}\sum_{m\ge0}\frac{(-ik)^m}{m!}\int\vec J(\vec r')\,(\vec n\cdot\vec r')^m\,d^3\vec r'\,. \tag{8.25}$$
The terms in this expansion correspond to the terms in a multipole expansion.
Consider first the $m = 0$ term, for which
$$\vec A(\vec r, t) = \frac{1}{r}\,e^{i(kr-\omega t)}\int\vec J(\vec r')\,d^3\vec r'\,. \tag{8.26}$$
This actually corresponds to an electric dipole term. To see this, consider the identity
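$$\partial'_i\big(x'_j\,J_i(\vec r')\big) = \delta_{ij}\,J_i(\vec r') + x'_j\,\partial'_i J_i(\vec r') = J_j(\vec r') + x'_j\,\vec\nabla'\cdot\vec J(\vec r')\,. \tag{8.27}$$
The integral of the left-hand side over all space gives zero, since it can be turned into a
boundary integral over the sphere at infinity (where the localised sources must vanish):
$$\int\partial'_i\big(x'_j\,J_i(\vec r')\big)\,d^3\vec r' = \int_S x'_j\,J_i(\vec r')\,dS'_i = 0\,. \tag{8.28}$$
We also have the charge conservation equation
$$\vec\nabla'\cdot\vec J(\vec r', t) + \frac{\partial\rho(\vec r', t)}{\partial t} = 0\,, \tag{8.29}$$
and so with the time dependence $e^{-i\omega t}$ that we are assuming, this gives
$$\vec\nabla'\cdot\vec J(\vec r') = i\omega\,\rho(\vec r') = ik\,\rho(\vec r')\,. \tag{8.30}$$
Thus we conclude that
$$\int\vec J(\vec r')\,d^3\vec r' = -ik\int\vec r'\,\rho(\vec r')\,d^3\vec r'\,, \tag{8.31}$$
and so
$$\vec A(\vec r, t) = -\frac{ik}{r}\,e^{i(kr-\omega t)}\int\vec r'\,\rho(\vec r')\,d^3\vec r'\,. \tag{8.32}$$
The integral here is just the electric dipole moment,
$$\vec p = \int\vec r'\,\rho(\vec r')\,d^3\vec r'\,, \tag{8.33}$$
and so we have
$$\vec A(\vec r, t) = -\frac{ik\,\vec p}{r}\,e^{i(kr-\omega t)}\,. \tag{8.34}$$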
Note that this leading-order term in the expansion of the radiation field corresponds to
an electric dipole, and not an electric monopole. The reason for this is that a monopole term
would require that the total electric charge in the source region should oscillate in time.
This would be impossible, because the total charge in this isolated system must remain
constant, by charge conservation.
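It is convenient to factor out the time-dependence factor $e^{-i\omega t}$ that accompanies all the
expressions we shall be working with, and to write
$$\vec A(\vec r, t) = \vec A(\vec r)\,e^{-i\omega t}\,, \qquad \vec B(\vec r, t) = \vec B(\vec r)\,e^{-i\omega t}\,, \qquad \vec E(\vec r, t) = \vec E(\vec r)\,e^{-i\omega t}\,. \tag{8.35}$$
Thus for the electric dipole field we shall have
$$\vec A(\vec r) = -\frac{ik\,\vec p}{r}\,e^{ikr}\,. \tag{8.36}$$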
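Then from $\vec B(\vec r) = \vec\nabla\times\vec A(\vec r)$ we find
$$\begin{aligned}
B_i = \epsilon_{ijk}\,\partial_j A_k &= -ik\,\epsilon_{ijk}\,p_k\,\partial_j\Big(\frac{1}{r}\,e^{ikr}\Big)\,,\\
&= -ik\,\epsilon_{ijk}\,p_k\,\Big(-\frac{x_j}{r^3} + ik\,\frac{x_j}{r^2}\Big)\,e^{ikr}\,,
\end{aligned} \tag{8.37}$$
and so
$$\vec B = k^2\,(\vec n\times\vec p)\,\frac{e^{ikr}}{r}\,\Big(1+\frac{i}{kr}\Big)\,. \tag{8.38}$$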
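From (8.5) we then have
$$\begin{aligned}
E_i &= \frac{i}{k}\,\epsilon_{ijk}\,\epsilon_{k\ell m}\,p_m\,\partial_j\Big[k^2\,x_\ell\,\Big(\frac{1}{r^2}+\frac{i}{kr^3}\Big)\,e^{ikr}\Big]\,,\\
&= ik\,(\delta_{i\ell}\,\delta_{jm}-\delta_{im}\,\delta_{j\ell})\,p_m\,\Big[\delta_{j\ell}\,\Big(\frac{1}{r^2}+\frac{i}{kr^3}\Big) - \frac{2x_jx_\ell}{r^4} - \frac{3i\,x_jx_\ell}{kr^5} + ik\,\frac{x_jx_\ell}{r}\,\Big(\frac{1}{r^2}+\frac{i}{kr^3}\Big)\Big]\,e^{ikr}\,,\\
&= \frac{k^2}{r}\,(p_i - n_i\,\vec n\cdot\vec p)\,e^{ikr} + \frac{ik}{r^2}\,(p_i - 3\,\vec n\cdot\vec p\;n_i)\,e^{ikr} - \frac{1}{r^3}\,(p_i - 3\,\vec n\cdot\vec p\;n_i)\,e^{ikr}\,.
\end{aligned} \tag{8.39}$$
In 3-vector language, this gives
$$\vec E = -k^2\,\vec n\times(\vec n\times\vec p)\,\frac{e^{ikr}}{r} + [3(\vec n\cdot\vec p)\,\vec n - \vec p\,]\,\Big(\frac{1}{r^3}-\frac{ik}{r^2}\Big)\,e^{ikr}\,. \tag{8.40}$$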
The reason why we have kept all terms in these expressions for $\vec B$ and $\vec E$, rather than just
the leading-order $1/r$ terms, is that if one makes a multipole expansion, in which the electric
dipole contribution we have obtained here is the first term in the series, the expressions are
in fact exact to all orders in $1/r$. We shall discuss this in more detail later.
Note that we have $\vec n\cdot\vec B = 0$ everywhere, but that $\vec n\cdot\vec E = 0$ only in the radiation zone
(i.e. at order $1/r$). In the radiation zone we have
$$\vec B = k^2\,(\vec n\times\vec p)\,\frac{e^{ikr}}{r}\,, \qquad \vec E = -k^2\,\vec n\times(\vec n\times\vec p)\,\frac{e^{ikr}}{r} = -\vec n\times\vec B\,. \tag{8.41}$$
Note that we have $|\vec B| = |\vec E|$, as usual for radiation fields.
Since (8.38) and (8.40) are in fact valid everywhere, we can also use these expressions
in the static zone (i.e. the near zone). Here, in the regime $kr \ll 1$, we therefore have
$$\vec B = ik\,(\vec n\times\vec p)\,\frac{1}{r^2}\,, \qquad \vec E = [3(\vec n\cdot\vec p)\,\vec n - \vec p\,]\,\frac{1}{r^3}\,. \tag{8.42}$$
The electric field here is precisely like that of a static electric dipole, except that it is
oscillating in time. Note that in the near zone we have $|\vec B| \sim (kr)\,|\vec E|$, which means
$|\vec B| \ll |\vec E|$.
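The statements about the two regimes can be checked by evaluating (8.38) and (8.40) directly. The sketch below is a minimal illustration: vectors are plain tuples, the dipole moment, direction and wavenumber are arbitrary sample values, and the helper names (`cross`, `dot`, `norm`, `dipole_fields`) are hypothetical.

```python
import cmath
import math

def cross(a, b):
    """Vector product of two 3-vectors (components may be complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Bilinear dot product (no complex conjugation)."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Magnitude of a complex 3-vector."""
    return math.sqrt(sum(abs(x)**2 for x in a))

def dipole_fields(p, n, r, k):
    """E and B of an oscillating electric dipole from (8.40) and (8.38), without exp(-i w t)."""
    phase = cmath.exp(1j * k * r)
    n_x_p = cross(n, p)
    B = tuple(k**2 * c * phase / r * (1.0 + 1j / (k * r)) for c in n_x_p)
    n_x_n_x_p = cross(n, n_x_p)
    n_dot_p = dot(n, p)
    static = tuple(3.0 * n_dot_p * ni - pi for ni, pi in zip(n, p))
    E = tuple(-k**2 * a * phase / r + s * (1.0 / r**3 - 1j * k / r**2) * phase
              for a, s in zip(n_x_n_x_p, static))
    return E, B

if __name__ == "__main__":
    p = (0.0, 0.0, 1.0)                                # dipole along z (sample value)
    theta = 0.7
    n = (math.sin(theta), 0.0, math.cos(theta))        # unit observation direction
    for r in (1e-3, 1.0, 1e4):                         # near, intermediate, far (k = 1)
        E, B = dipole_fields(p, n, r, 1.0)
        print(r, norm(B) / norm(E), abs(dot(n, B)), abs(dot(n, E)) / norm(E))
```

For large $kr$ the ratio $|\vec B|/|\vec E|$ approaches 1 and the radial component of $\vec E$ becomes negligible, while for small $kr$ the ratio is of order $kr$, in line with (8.41) and (8.42).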
Returning now to the radiation zone, we may calculate the radiated power in the usual
way, using the Poynting vector. In particular, we saw previously that with the electric and
magnetic fields written in the complex notation, the time average of the Poynting flux is
given by
$$\langle\vec S\rangle = \frac{1}{8\pi}\,\vec E\times\vec B^*\,. \tag{8.43}$$
Then the power radiated into the solid angle $d\Omega$ is given by
$$\begin{aligned}
dP &= \langle\vec S\rangle\cdot\vec n\;r^2\,d\Omega\,,\\
&= \frac{1}{8\pi}\,[(-\vec n\times\vec B)\times\vec B^*]\cdot\vec n\;r^2\,d\Omega\,,\\
&= \frac{1}{8\pi}\,|\vec B|^2\,r^2\,d\Omega\,.
\end{aligned} \tag{8.44}$$
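From (8.41) we therefore have
$$\frac{dP}{d\Omega} = \frac{k^4}{8\pi}\,|\vec n\times\vec p\,|^2 = \frac{k^4}{8\pi}\,\big(|\vec p\,|^2 - (\vec n\cdot\vec p\,)^2\big)\,. \tag{8.45}$$
If we take $\theta$ to be the angle between $\vec p$ and $\vec n$, so that $\vec n\cdot\vec p = p\cos\theta$, then this gives
$$\frac{dP}{d\Omega} = \frac{k^4}{8\pi}\,|\vec p\,|^2\,\sin^2\theta\,. \tag{8.46}$$
Since $d\Omega = \sin\theta\,d\theta\,d\varphi$, the total power radiated by the oscillating dipole is then given by
$$P = \int\frac{dP}{d\Omega}\,d\Omega = 2\pi\,\frac{k^4}{8\pi}\,|\vec p\,|^2\int_0^\pi\sin^3\theta\,d\theta = \tfrac13\,k^4\,|\vec p\,|^2\,. \tag{8.47}$$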
As a concrete example, consider a dipole antenna comprising two thin conducting rods
running along the $z$ axis, meeting (but not touching) at the origin, and extending to $z = \pm\tfrac12 d$
respectively. The antenna is driven at the centre ($z = 0$) by an alternating current source
with angular frequency $\omega$. The current will fall off as a function of $z$, becoming zero at
the tips of the antenna at $z = \pm\tfrac12 d$. A reasonable approximation, in the regime we are
considering here where $kd \ll 1$, is that this fall-off is linear in $z$. Thus we may assume
$$I(z, t) = I(z)\,e^{-i\omega t} = I_0\,\Big(1-\frac{2|z|}{d}\Big)\,e^{-i\omega t}\,. \tag{8.48}$$
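The equation of charge conservation, $\vec\nabla\cdot\vec J + \partial\rho/\partial t = 0$, then allows us to solve for the
charge density. The current (8.48) is essentially confined to the line $x = y = 0$, since we are
assuming the conducting rods that form the antenna are thin. Thus really, we have
$$\vec J(\vec r, t) = I(z, t)\,\delta(x)\,\delta(y)\;\hat z\,, \tag{8.49}$$
where $\hat z$ is the unit vector along the $z$ axis. Similarly, the charge density will be given by
$$\rho(\vec r, t) = \lambda(z, t)\,\delta(x)\,\delta(y)\,, \tag{8.50}$$
where $\lambda(z, t)$ is the charge per unit length in the rods. The charge conservation equation
therefore becomes
$$\frac{\partial I(z, t)}{\partial z} + \frac{\partial\lambda(z, t)}{\partial t} = 0\,, \tag{8.51}$$
and so, in view of the time dependence, which implies also $\lambda(z, t) = \lambda(z)\,e^{-i\omega t}$, we have
$$\frac{\partial I(z)}{\partial z} - i\omega\,\lambda(z) = 0\,. \tag{8.52}$$
Thus we shall have
$$\lambda(z) = -\frac{i}{\omega}\,\frac{\partial I(z)}{\partial z} = -\frac{i}{\omega}\,I_0\,\frac{\partial}{\partial z}\Big(1-\frac{2|z|}{d}\Big)\,. \tag{8.53}$$
This implies
$$\lambda(z) = \frac{2i\,I_0}{\omega d}\,, \quad z > 0\,, \qquad \lambda(z) = -\frac{2i\,I_0}{\omega d}\,, \quad z < 0\,. \tag{8.54}$$
The dipole moment $\vec p$ is directed along the $z$ axis, $\vec p = (0, 0, p)$, and is given by
$$p = \int_{-d/2}^{d/2} z\,\lambda(z)\,dz = \frac{2i\,I_0}{\omega d}\int_0^{d/2} z\,dz - \frac{2i\,I_0}{\omega d}\int_{-d/2}^{0} z\,dz = \frac{i\,I_0\,d}{2\omega}\,. \tag{8.55}$$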
From (8.46), we therefore find that the power per unit solid angle is given by
$$\frac{dP}{d\Omega} = \frac{k^4\,|\vec p\,|^2}{8\pi}\,\sin^2\theta = \frac{I_0^2\,(kd)^2}{32\pi}\,\sin^2\theta\,, \tag{8.56}$$
where $\theta$ is the angle between $\vec n = \vec r/r$ and the $z$ axis. The total radiated power is therefore
given by
$$P = \tfrac{1}{12}\,I_0^2\,(kd)^2\,. \tag{8.57}$$
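The chain from the line charge (8.54) to the total power (8.57) can be reproduced numerically. The sketch below is illustrative only: it assumes units with $c = 1$ (so $k = \omega$), uses a midpoint rule for the integral in (8.55), and picks arbitrary sample values for $I_0$, $\omega$ and $d$. The function names are hypothetical.

```python
import math

def line_charge(z, I0, omega, d):
    """lambda(z) from (8.54): +2i I0/(omega d) for z > 0, -2i I0/(omega d) for z < 0."""
    amplitude = 2j * I0 / (omega * d)
    return amplitude if z > 0 else -amplitude

def dipole_moment(I0, omega, d, steps=1000):
    """p = int z lambda(z) dz over the antenna, cf. (8.55), by the midpoint rule.

    With an even number of steps no sample point falls on z = 0.
    """
    h = d / steps
    total = 0j
    for i in range(steps):
        z = -0.5 * d + (i + 0.5) * h
        total += z * line_charge(z, I0, omega, d)
    return total * h

def total_power(p, k):
    """P = k^4 |p|^2 / 3, from (8.47)."""
    return k**4 * abs(p)**2 / 3.0

if __name__ == "__main__":
    I0, omega, d = 2.0, 1.5, 0.1  # sample values with k d << 1 (assumption)
    p = dipole_moment(I0, omega, d)
    print(p, 1j * I0 * d / (2.0 * omega))
    print(total_power(p, omega), I0**2 * (omega * d)**2 / 12.0)
```

The two printed pairs compare the numerical dipole moment with $i I_0 d/(2\omega)$ and the resulting power with $I_0^2 (kd)^2/12$.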
8.3 Higher multipoles
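As mentioned previously, in a multipole expansion we can obtain exact expressions, term
by term, for the electric and magnetic fields. To do this, we go back to the general integral
expression
$$\vec A(\vec r) = \int\vec J(\vec r')\,\frac{e^{ik|\vec r-\vec r'|}}{|\vec r-\vec r'|}\,d^3\vec r'\,, \tag{8.58}$$
giving $\vec A(\vec r, t) = \vec A(\vec r)\,e^{-i\omega t}$. Let
$$\frac{1}{r}\,e^{ikr} = f(\vec r) = f(r)\,. \tag{8.59}$$
(Note that $f(\vec r) = f(r)$, i.e. it depends only on the magnitude of $\vec r$.) It follows that
$$\frac{e^{ik|\vec r-\vec r'|}}{|\vec r-\vec r'|} = f(\vec r-\vec r')\,, \tag{8.60}$$
which we can therefore express as the Taylor series
$$\begin{aligned}
f(\vec r-\vec r') &= f(\vec r) - x'_i\,\partial_i f(\vec r) + \frac{1}{2!}\,x'_i\,x'_j\,\partial_i\partial_j f(\vec r) + \cdots\\
&= f(r) - x'_i\,\partial_i f(r) + \frac{1}{2!}\,x'_i\,x'_j\,\partial_i\partial_j f(r) + \cdots\,,\\
&= f(r) - x'_i\,(\partial_i r)\,f'(r) + \tfrac12\,x'_i\,x'_j\,\big[(\partial_i\partial_j r)\,f'(r) + (\partial_i r)(\partial_j r)\,f''(r)\big] + \cdots\,.
\end{aligned} \tag{8.61}$$
Thus we find
$$\frac{e^{ik|\vec r-\vec r'|}}{|\vec r-\vec r'|} = \frac{1}{r}\,e^{ikr} + \Big(\frac{1}{r^2}-\frac{ik}{r}\Big)\,(\vec n\cdot\vec r')\,e^{ikr} + \cdots\,. \tag{8.62}$$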
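The two-term expansion (8.62) can be compared with the exact kernel to confirm that the remainder is second order in $\vec r'$. The sketch below is a minimal check with arbitrary sample vectors and wavenumber; the function names are hypothetical.

```python
import cmath
import math

def exact_kernel(r_vec, rp_vec, k):
    """exp(i k |r - r'|) / |r - r'| evaluated directly."""
    dist = math.sqrt(sum((a - b)**2 for a, b in zip(r_vec, rp_vec)))
    return cmath.exp(1j * k * dist) / dist

def two_term_kernel(r_vec, rp_vec, k):
    """The first two terms on the right-hand side of (8.62)."""
    r = math.sqrt(sum(a * a for a in r_vec))
    n_dot_rp = sum(a * b for a, b in zip(r_vec, rp_vec)) / r
    return cmath.exp(1j * k * r) * (1.0 / r + (1.0 / r**2 - 1j * k / r) * n_dot_rp)

def truncation_error(r_vec, rp_vec, k):
    """Magnitude of the difference between the exact kernel and the two-term form."""
    return abs(exact_kernel(r_vec, rp_vec, k) - two_term_kernel(r_vec, rp_vec, k))

if __name__ == "__main__":
    r_vec = (3.0, 4.0, 12.0)       # |r| = 13 (sample value)
    rp_vec = (0.01, -0.02, 0.015)  # small source displacement (sample value)
    k = 0.5
    half = tuple(0.5 * x for x in rp_vec)
    print(truncation_error(r_vec, rp_vec, k), truncation_error(r_vec, half, k))
```

Halving $\vec r'$ should reduce the error by roughly a factor of four, as expected if the first neglected term in (8.61) is quadratic in $x'_i$.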
The ﬁrst term in this series gives the electric dipole contribution that we found in the
previous section, in (8.26). The second term gives contributions from an electric quadrupole
term and a magnetic dipole term. This gives
A(r) = e
i kr
1
r
2
−
i k
r
(n r
)
J(r
) d
3
r
. (8.63)
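In order to interpret this expression, we need to manipulate the integrand a bit. Its i'th component is given by
$$
\begin{aligned}
n_j x'_j J_i &= \tfrac{1}{2} (J_i x'_j - J_j x'_i)\, n_j + \tfrac{1}{2} (J_i x'_j + J_j x'_i)\, n_j, \\
&= \tfrac{1}{2}\, \epsilon_{ijk}\, \epsilon_{\ell m k}\, J_\ell\, x'_m\, n_j + \tfrac{1}{2} (J_i x'_j + J_j x'_i)\, n_j, \\
&= -\epsilon_{ijk}\, n_j\, \mathcal{M}_k + \tfrac{1}{2} (J_i x'_j + J_j x'_i)\, n_j,
\end{aligned} \tag{8.64}
$$
where
$$ \mathcal{M}_i = \tfrac{1}{2}\, \epsilon_{ijk}\, x'_j J_k, \quad \text{i.e.} \quad \vec{\mathcal{M}} = \tfrac{1}{2}\, \vec r\,' \times \vec J(\vec r\,') \tag{8.65} $$
is the magnetisation resulting from the current density $\vec J$.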
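The algebraic identity (8.64) holds pointwise for any vectors, so it can be spot-checked with a few lines of Python. The helper names below are hypothetical and the sample vectors are arbitrary.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def lhs(n, x, J):
    """(n . r') J, the integrand of (8.63)."""
    return [dot(n, x) * Ji for Ji in J]

def rhs(n, x, J):
    """-n x M plus the symmetric term, as in the last line of (8.64)."""
    M = [0.5 * c for c in cross(x, J)]          # magnetisation (8.65)
    n_cross_M = cross(n, M)
    sym = [0.5 * sum((J[i] * x[j] + J[j] * x[i]) * n[j] for j in range(3))
           for i in range(3)]
    return [-n_cross_M[i] + sym[i] for i in range(3)]

if __name__ == "__main__":
    n, x, J = (0.0, 0.0, 1.0), (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
    print(lhs(n, x, J), rhs(n, x, J))
```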
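The remaining term in (8.64), i.e. the symmetric term $\tfrac{1}{2}(J_i x'_j + J_j x'_i)\, n_j$, can be analysed as follows. Consider
$$
\begin{aligned}
\partial'_k (x'_i x'_j n_j J_k) &= \delta_{ik}\, x'_j n_j J_k + \delta_{jk}\, x'_i n_j J_k + x'_i x'_j n_j\, \partial'_k J_k, \\
&= (x'_i J_j + x'_j J_i)\, n_j + i\, x'_i x'_j n_j\, \omega \rho,
\end{aligned} \tag{8.66}
$$
where the last step uses the charge-conservation equation $\partial'_k J_k = i\omega\rho$ for sources with time dependence $e^{-i\omega t}$.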
Integrating this over all space, the left-hand side can be turned into a surface integral over the sphere at infinity, which therefore gives zero. Thus we conclude that
$$ \int (x'_i J_j + x'_j J_i)\, n_j\, d^3\vec r\,' = -i\omega \int x'_i x'_j n_j\, \rho\, d^3\vec r\,'. \tag{8.67} $$
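The upshot is that
$$ \int (\vec n \cdot \vec r\,')\, \vec J(\vec r\,')\, d^3\vec r\,' = -\vec n \times \int \vec{\mathcal{M}}\, d^3\vec r\,' - \frac{i\omega}{2} \int \vec r\,' (\vec n \cdot \vec r\,')\, \rho(\vec r\,')\, d^3\vec r\,'. \tag{8.68} $$
Defining the magnetic dipole moment $\vec m$ by
$$ \vec m = \int \vec{\mathcal{M}}\, d^3\vec r\,' = \frac{1}{2} \int \vec r\,' \times \vec J(\vec r\,')\, d^3\vec r\,', \tag{8.69} $$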
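we conclude that
$$ \vec A(\vec r) = e^{ikr} \left( \frac{ik}{r} - \frac{1}{r^2} \right) \vec n \times \vec m + \frac{ik}{2}\, e^{ikr} \left( \frac{ik}{r} - \frac{1}{r^2} \right) \int \vec r\,' (\vec n \cdot \vec r\,')\, \rho(\vec r\,')\, d^3\vec r\,'. \tag{8.70} $$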
8.3.1 Magnetic dipole term
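Consider the magnetic dipole term in (8.70) first:
$$ \vec A(\vec r) = e^{ikr} \left( \frac{ik}{r} - \frac{1}{r^2} \right) \vec n \times \vec m. \tag{8.71} $$
Let
$$ f \equiv e^{ikr} \left( \frac{ik}{r^2} - \frac{1}{r^3} \right), \tag{8.72} $$
so $\vec A = r f\, \vec n \times \vec m = f\, \vec r \times \vec m$. Then from $\vec B = \vec\nabla \times \vec A$ we shall have
$$
\begin{aligned}
B_i = \epsilon_{ijk}\, \partial_j A_k &= \epsilon_{ijk}\, \epsilon_{k\ell m}\, \partial_j (f x_\ell)\, m_m, \\
&= \epsilon_{ijk}\, \epsilon_{k\ell m} \left( f' \frac{x_\ell x_j}{r} + f \delta_{j\ell} \right) m_m, \\
&= (\delta_{i\ell}\, \delta_{jm} - \delta_{im}\, \delta_{j\ell}) \left( f' \frac{x_\ell x_j}{r} + f \delta_{j\ell} \right) m_m, \\
&= r f'\, n_i\, \vec n \cdot \vec m - r f'\, m_i - 2 f\, m_i.
\end{aligned} \tag{8.73}
$$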
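From (8.72) we have
$$ r f' = e^{ikr} \left( \frac{3}{r^3} - \frac{3ik}{r^2} - \frac{k^2}{r} \right) = -\frac{k^2}{r}\, e^{ikr} - 3f, \tag{8.74} $$
and so we find
$$ \vec B = -k^2\, \vec n \times (\vec n \times \vec m)\, \frac{e^{ikr}}{r} + \left[ 3\vec n (\vec n \cdot \vec m) - \vec m \right] \left( \frac{1}{r^3} - \frac{ik}{r^2} \right) e^{ikr}. \tag{8.75} $$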
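A direct way to gain confidence in (8.75) is to take the curl of $\vec A = f\, \vec r \times \vec m$ numerically and compare. The sketch below uses central finite differences; the function names, step size and sample point are assumptions made for illustration.

```python
import cmath
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def vec_potential(x, m, k):
    """A = f r x m with f from (8.72)."""
    r = math.sqrt(sum(c * c for c in x))
    f = cmath.exp(1j * k * r) * (1j * k / r**2 - 1 / r**3)
    return [f * c for c in cross(x, m)]

def curl(field, x, h=1e-5):
    """Central-difference curl of a vector field at the point x."""
    def d(comp, axis):
        xp, xm = list(x), list(x)
        xp[axis] += h
        xm[axis] -= h
        return (field(xp)[comp] - field(xm)[comp]) / (2 * h)
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]

def B_formula(x, m, k):
    """Closed form (8.75)."""
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    n_dot_m = sum(a * b for a, b in zip(n, m))
    ncnm = cross(n, cross(n, m))
    phase = cmath.exp(1j * k * r)
    return [-k**2 * ncnm[i] * phase / r
            + (3 * n[i] * n_dot_m - m[i]) * (1 / r**3 - 1j * k / r**2) * phase
            for i in range(3)]

if __name__ == "__main__":
    x, m, k = [1.0, 2.0, 0.5], [0.3, -0.2, 1.0], 1.5
    print(curl(lambda p: vec_potential(p, m, k), x))
    print(B_formula(x, m, k))
```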
Note that this is identical to the expression (8.40) for the electric field of an electric dipole source, with the electric dipole $\vec p$ replaced by the magnetic dipole, and the electric field replaced by the magnetic field:
$$ \vec p \longrightarrow \vec m, \qquad \vec E \longrightarrow \vec B. \tag{8.76} $$
The electric field of the magnetic dipole can be obtained from (8.5). However, a simpler way to find it here is to note that from the Maxwell equation $\vec\nabla \times \vec E = -\partial \vec B/\partial t$ we have
$$ \vec\nabla \times \vec E = i\omega \vec B = ik \vec B, \tag{8.77} $$
and so
$$ \vec B = -\frac{i}{k}\, \vec\nabla \times \vec E. \tag{8.78} $$
Now, we already saw from the calculations for the electric dipole that when the $\vec B$ field (8.38) is substituted into (8.5), it gives rise to the $\vec E$ field given in (8.40). As we just noted, in the present magnetic dipole case, the expression for the $\vec B$ field is just like the expression for the $\vec E$ field in the electric dipole case, and we already know that in the electric case, the $\vec B$ field is given by (8.38). Therefore, we can conclude that in the present magnetic case, the $\vec E$ field that would yield, using (8.78), the result (8.75) for the $\vec B$ field will be just the negative of the expression for $\vec B$ in the electric case (with $\vec p$ replaced by $\vec m$). (The reason for the minus sign is that (8.78) has a minus sign, as compared with (8.5), under the exchange of $\vec E$ and $\vec B$.) Thus the upshot is that the electric field for the magnetic dipole radiation will be given by
$$ \vec E = -k^2\, (\vec n \times \vec m)\, \frac{e^{ikr}}{r} \left( 1 + \frac{i}{kr} \right). \tag{8.79} $$
This result can alternatively be verified (after a rather involved calculation) by directly substituting (8.75) into (8.5). [18]
[18] The only "gap" in the simple argument we just presented is that any other vector $\vec E\,' = \vec E + \vec\nabla h$ would also give the same $\vec B$ field when plugged into (8.78), where h is an arbitrary function. However, we know that $\vec\nabla \cdot \vec E$ should vanish (we are in a region away from sources), and it is obvious almost by inspection that the answer given in (8.79) satisfies this condition. Thus if we had arrived at the wrong answer for $\vec E$, it could be wrong only by a term $\vec\nabla h$ where $\nabla^2 h = 0$. There is no such function with an exponential factor $e^{ikr}$, and so there is no possibility of our answer (8.79) being wrong. If any doubts remain, the reader is invited to substitute (8.75) into (8.5) to verify (8.79) directly.
An observation from the calculations of the electric and magnetic fields for electric dipole radiation and magnetic dipole radiation is that there is a discrete symmetry under which the two situations interchange:
$$ \vec p \longrightarrow \vec m, \qquad \vec E \longrightarrow \vec B, \qquad \vec B \longrightarrow -\vec E. \tag{8.80} $$
This is an example of what is known as "electric/magnetic duality" in Maxwell's equations.
8.3.2 Electric quadrupole term
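We now return to the electric quadrupole term in (8.70), namely
$$ \vec A(\vec r) = \frac{ik}{2}\, e^{ikr} \left( \frac{ik}{r} - \frac{1}{r^2} \right) \int \vec r\,' (\vec n \cdot \vec r\,')\, \rho(\vec r\,')\, d^3\vec r\,'. \tag{8.81} $$
For simplicity, we shall keep only the leading-order radiation term in this expression,
$$ \vec A(\vec r) = -\frac{1}{2}\, k^2\, \frac{e^{ikr}}{r} \int \vec r\,' (\vec n \cdot \vec r\,')\, \rho(\vec r\,')\, d^3\vec r\,', \tag{8.82} $$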
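and furthermore when calculating the $\vec B$ and $\vec E$ fields, we shall keep only the leading-order 1/r terms that come from the derivatives hitting $e^{ikr}$. Thus, from $\vec B = \vec\nabla \times \vec A$ we shall have
$$
\begin{aligned}
\vec B &= -\frac{1}{2}\, k^2\, (ik)\, \vec n \times \frac{e^{ikr}}{r} \int \vec r\,' (\vec n \cdot \vec r\,')\, \rho(\vec r\,')\, d^3\vec r\,', \\
&= -\frac{i k^3}{2}\, \frac{e^{ikr}}{r} \int (\vec n \times \vec r\,')(\vec n \cdot \vec r\,')\, \rho(\vec r\,')\, d^3\vec r\,'.
\end{aligned} \tag{8.83}
$$
This radiation field can therefore be written simply as
$$ \vec B = ik\, \vec n \times \vec A. \tag{8.84} $$
In fact, in any expression where we keep only the leading-order term in which the derivative lands on $e^{ikr}$, we shall have the rule
$$ \vec\nabla \longrightarrow ik\, \vec n. \tag{8.85} $$
For the electric field, we have, using (8.5) and (8.85),
$$ \vec E = \frac{i}{k}\, \vec\nabla \times \vec B = -\vec n \times \vec B = -ik\, \vec n \times (\vec n \times \vec A). \tag{8.86} $$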
The electric quadrupole moment tensor $Q_{ij}$ is defined by
$$ Q_{ij} = \int (3 x_i x_j - r^2 \delta_{ij})\, \rho(\vec r)\, d^3\vec r. \tag{8.87} $$
Define the vector $\vec Q(\vec n)$, whose components $Q(\vec n)_i$ are given by
$$ Q(\vec n)_i \equiv Q_{ij}\, n_j. \tag{8.88} $$
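Consider the expression $\tfrac{1}{3}\, \vec n \times \vec Q(\vec n)$. We shall have
$$
\begin{aligned}
\left[ \tfrac{1}{3}\, \vec n \times \vec Q(\vec n) \right]_i &= \tfrac{1}{3}\, \epsilon_{ijk}\, n_j\, Q_{k\ell}\, n_\ell, \\
&= \tfrac{1}{3}\, \epsilon_{ijk}\, n_j n_\ell \int (3 x_k x_\ell - r^2 \delta_{k\ell})\, \rho(\vec r)\, d^3\vec r, \\
&= \int (\vec n \times \vec r)_i\, (\vec n \cdot \vec r)\, \rho(\vec r)\, d^3\vec r,
\end{aligned} \tag{8.89}
$$
(since the trace term gives zero). This implies that the expression (8.83) for the electric quadrupole $\vec B$ field can be written as
$$ \vec B = -\frac{i k^3}{6 r}\, e^{ikr}\, \vec n \times \vec Q(\vec n). \tag{8.90} $$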
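Since we have $\vec E = \vec B \times \vec n$ (see (8.86)), it follows that the time-averaged power per unit solid angle will be given by
$$
\begin{aligned}
\frac{dP}{d\Omega} &= \frac{1}{8\pi}\, (\vec E \times \vec B^*) \cdot \vec n\; r^2, \\
&= \frac{k^6}{288\pi}\, \left| (\vec n \times \vec Q(\vec n)) \times \vec n \right|^2 = \frac{k^6}{288\pi} \left( |\vec Q(\vec n)|^2 - |\vec n \cdot \vec Q(\vec n)|^2 \right),
\end{aligned} \tag{8.91}
$$
and so
$$ \frac{dP}{d\Omega} = \frac{k^6}{288\pi}\, |\vec n \times \vec Q(\vec n)|^2. \tag{8.92} $$
Written using indices, this is therefore
$$ \frac{dP}{d\Omega} = \frac{k^6}{288\pi} \left( Q_{ki}\, Q^*_{kj}\, n_i n_j - Q_{ij}\, Q^*_{k\ell}\, n_i n_j n_k n_\ell \right). \tag{8.93} $$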
As always, having obtained an expression for the power radiated per unit solid angle, it
is natural to integrate this up over the sphere, in order to obtain the total radiated power.
In this case, we shall need to evaluate
$$ \int n_i n_j\, d\Omega, \quad \text{and} \quad \int n_i n_j n_k n_\ell\, d\Omega. \tag{8.94} $$
One way to do this is to parameterise the unit vector n in terms of spherical polar angles
(θ, ϕ) in the usual way,
n = (sin θ cos ϕ, sin θ sin ϕ, cos θ) , (8.95)
and slog out the integrals with dΩ = sin θdθdϕ.
A more elegant way to evaluate the integrals in (8.94) is as follows. For the first integral, we note that the answer, whatever it is, must be a symmetric 2-index tensor. It must also be completely isotropic, since by the time we have integrated over all solid angles it is not possible for the result to be "biased" so that it favours any direction in space over any other. There is only one possibility for the symmetric isotropic tensor; it must be a constant multiple of the Kronecker delta,
$$ \int n_i n_j\, d\Omega = c\, \delta_{ij}. \tag{8.96} $$
The constant c can be determined by taking the trace, and using $n_i n_i = 1$:
$$ 4\pi = \int d\Omega = 3c, \tag{8.97} $$
and so we have
$$ \int n_i n_j\, d\Omega = \frac{4\pi}{3}\, \delta_{ij}. \tag{8.98} $$
In case one doubted this result, it is not too hard in this case to conﬁrm the result by
evaluating all the integrals explicitly using (8.95).
Turning now to the second integral in (8.94), we can use a similar argument. The answer must be a 4-index totally symmetric isotropic tensor. In fact the only symmetric isotropic tensors are those that can be made by taking products of Kronecker deltas, and so in this case it must be that
$$ \int n_i n_j n_k n_\ell\, d\Omega = b\, (\delta_{ij}\, \delta_{k\ell} + \delta_{ik}\, \delta_{j\ell} + \delta_{i\ell}\, \delta_{jk}), \tag{8.99} $$
for some constant b. We can determine the constant by multiplying both sides by $\delta_{ij}\, \delta_{k\ell}$, giving
$$ 4\pi = \int d\Omega = (9 + 3 + 3)\, b = 15 b, \tag{8.100} $$
and so
$$ \int n_i n_j n_k n_\ell\, d\Omega = \frac{4\pi}{15}\, (\delta_{ij}\, \delta_{k\ell} + \delta_{ik}\, \delta_{j\ell} + \delta_{i\ell}\, \delta_{jk}). \tag{8.101} $$
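For readers who prefer to "slog out" the integrals, the results (8.98) and (8.101) can be confirmed with a simple midpoint-rule quadrature over the sphere. This is a minimal sketch; the function names and grid sizes are illustrative assumptions.

```python
import math

def angular_moment(indices, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of the integral over the unit sphere of the
    product n_i n_j ... for the given tuple of indices (0, 1, 2 = x, y, z)."""
    h_t = math.pi / n_theta
    h_p = 2 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * h_t
        st, ct = math.sin(theta), math.cos(theta)
        for b in range(n_phi):
            phi = (b + 0.5) * h_p
            n = (st * math.cos(phi), st * math.sin(phi), ct)   # (8.95)
            term = 1.0
            for i in indices:
                term *= n[i]
            total += term * st * h_t * h_p
    return total

def delta(i, j):
    return 1.0 if i == j else 0.0

def isotropic_four(i, j, k, l):
    """Right-hand side of (8.101)."""
    return 4 * math.pi / 15 * (delta(i, j) * delta(k, l)
                               + delta(i, k) * delta(j, l)
                               + delta(i, l) * delta(j, k))

if __name__ == "__main__":
    print(angular_moment((2, 2)), 4 * math.pi / 3)
    print(angular_moment((0, 0, 1, 1)), isotropic_four(0, 0, 1, 1))
```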
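With these results we shall have from (8.93) that
$$
\begin{aligned}
P = \int \frac{dP}{d\Omega}\, d\Omega &= \frac{k^6}{288\pi} \left[ Q_{ki}\, Q^*_{kj} \int n_i n_j\, d\Omega - Q_{ij}\, Q^*_{k\ell} \int n_i n_j n_k n_\ell\, d\Omega \right], \\
&= \frac{k^6}{288\pi} \left[ \frac{4\pi}{3}\, Q_{ki}\, Q^*_{kj}\, \delta_{ij} - \frac{4\pi}{15}\, Q_{ij}\, Q^*_{k\ell}\, (\delta_{ij}\, \delta_{k\ell} + \delta_{ik}\, \delta_{j\ell} + \delta_{i\ell}\, \delta_{jk}) \right], \\
&= \frac{k^6}{216} \left[ Q_{ij}\, Q^*_{ij} - \frac{2}{5}\, Q_{ij}\, Q^*_{ij} - \frac{1}{5}\, Q_{ii}\, Q^*_{jj} \right], \\
&= \frac{k^6}{360}\, Q_{ij}\, Q^*_{ij}.
\end{aligned} \tag{8.102}
$$
(Recall that $Q_{ij}$ is symmetric and traceless.)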
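The total power (8.102) can be cross-checked by integrating (8.93) numerically for a sample quadrupole tensor. The sketch below assumes a real, symmetric, traceless Q chosen arbitrarily; the function names are hypothetical.

```python
import math

def power_per_solid_angle(Q, n, k):
    """(8.93) for real Q: k^6/(288 pi) * (|Q n|^2 - (n.Q.n)^2)."""
    Qn = [sum(Q[i][j] * n[j] for j in range(3)) for i in range(3)]
    nQn = sum(n[i] * Qn[i] for i in range(3))
    return k**6 / (288 * math.pi) * (sum(c * c for c in Qn) - nQn**2)

def total_power_numeric(Q, k, n_theta=200, n_phi=200):
    """Midpoint-rule integral of (8.93) over the unit sphere."""
    h_t, h_p = math.pi / n_theta, 2 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * h_t
        st, ct = math.sin(theta), math.cos(theta)
        for b in range(n_phi):
            phi = (b + 0.5) * h_p
            n = (st * math.cos(phi), st * math.sin(phi), ct)
            total += power_per_solid_angle(Q, n, k) * st * h_t * h_p
    return total

def total_power_closed_form(Q, k):
    """(8.102): k^6/360 * Q_ij Q_ij for real Q."""
    return k**6 / 360 * sum(Q[i][j]**2 for i in range(3) for j in range(3))

if __name__ == "__main__":
    Q = [[1.0, 0.5, 0.0], [0.5, -3.0, 2.0], [0.0, 2.0, 2.0]]   # symmetric, traceless
    print(total_power_numeric(Q, 1.2), total_power_closed_form(Q, 1.2))
```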
Since the quadrupole moment tensor $Q_{ij}$ is symmetric, it is always possible to choose an orientation for the Cartesian coordinate system such that $Q_{ij}$ becomes diagonal. (This is because the matrix U that diagonalises Q, $Q \to Q_{\rm diag} = U^T Q\, U$, is itself orthogonal, $U^T U = 1$, and therefore the diagonalisation is achieved by an orthogonal transformation of the coordinates.) Thus, having chosen an appropriate orientation for the Cartesian axes, we can assume that
$$ Q_{ij} = \begin{pmatrix} Q_1 & 0 & 0 \\ 0 & Q_2 & 0 \\ 0 & 0 & Q_3 \end{pmatrix}, \quad \text{where} \quad Q_1 + Q_2 + Q_3 = 0. \tag{8.103} $$
The expression (8.93) for the angular power distribution will give
$$ \frac{dP}{d\Omega} = \frac{k^6}{288\pi} \left[ Q_1^2 n_1^2 + Q_2^2 n_2^2 + Q_3^2 n_3^2 - (Q_1 n_1^2 + Q_2 n_2^2 + Q_3 n_3^2)^2 \right]. \tag{8.104} $$
One can substitute (8.95) into this in order to obtain an explicit expression for the dP/dΩ
in terms of spherical polar angles (θ, ϕ).
Consider for simplicity the special case where $Q_1 = Q_2$. This means that there is an axial symmetry around the z axis, and also we shall have
$$ Q_1 = Q_2 = -\tfrac{1}{2}\, Q_3. \tag{8.105} $$
Substituting (8.95) and (8.105) into (8.104), we obtain
$$ \frac{dP}{d\Omega} = \frac{k^6 Q_3^2}{128\pi}\, \sin^2\theta\, \cos^2\theta = \frac{k^6 Q_3^2}{512\pi}\, \sin^2 2\theta. \tag{8.106} $$
This is indeed, as expected, azimuthally symmetric (it does not depend on $\varphi$). It describes a quadrafoil-like power distribution, with four lobes, unlike the figure-of-eight power distribution of the electric dipole radiation. Note also that its frequency dependence is proportional to $\omega^6$ ($= k^6$), unlike the electric dipole radiation that is proportional to $\omega^4$. A plot of the power distribution for quadrupole radiation is given in Figure 4 below.
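The reduction from (8.104) to (8.106) is easy to confirm numerically, and the same functions can be used to tabulate the four-lobed pattern. This is only a sketch; the function names and the sample value of $Q_3$ are illustrative.

```python
import math

def pattern_general(Q1, Q2, Q3, theta, phi, k):
    """Angular distribution (8.104) for a diagonal quadrupole tensor."""
    n1 = math.sin(theta) * math.cos(phi)
    n2 = math.sin(theta) * math.sin(phi)
    n3 = math.cos(theta)
    quad = Q1**2 * n1**2 + Q2**2 * n2**2 + Q3**2 * n3**2
    lin = Q1 * n1**2 + Q2 * n2**2 + Q3 * n3**2
    return k**6 / (288 * math.pi) * (quad - lin**2)

def pattern_axisymmetric(Q3, theta, k):
    """Axially symmetric special case (8.106)."""
    return k**6 * Q3**2 / (512 * math.pi) * math.sin(2 * theta)**2

if __name__ == "__main__":
    Q3, k = 2.0, 1.0
    for deg in (0, 30, 45, 60, 90):
        th = math.radians(deg)
        print(deg, pattern_general(-Q3 / 2, -Q3 / 2, Q3, th, 0.7, k),
              pattern_axisymmetric(Q3, th, k))
```

The pattern vanishes along the symmetry axis and in the equatorial plane, and peaks at θ = 45° and 135°, which is the origin of the four lobes.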
Figure 4: The angular power distribution for electric quadrupole radiation
8.4 Linear antenna
In the later part of section 8.2, we considered a centre-fed dipole antenna. In that section we made the assumption that the wavelength of the electromagnetic radiation was very large compared with the length of the dipole, i.e. that $kd \ll 1$. In that limit, one could assume to a good approximation that the current in each arm of the dipole antenna fell off in a linear fashion as a function of z (the axis along which the dipole is located). Thus, with the dipole arms extending over the intervals
$$ -\tfrac{1}{2} d \le z < 0 \quad \text{and} \quad 0 < z \le \tfrac{1}{2} d, \tag{8.107} $$
we assumed there that the current in each arm was proportional to $(d/2 - |z|)$.
In this section, we shall consider the case where the dipole arms are not assumed to be short compared to the wavelength. Under these circumstances, it can be shown that the current distribution in the dipole arms takes the form
$$ \vec J(\vec r, t) = I_0\, \sin k(d/2 - |z|)\; e^{-i\omega t}\, \delta(x)\, \delta(y)\, \vec Z, \qquad |z| \le \tfrac{1}{2} d, \tag{8.108} $$
where $\vec Z = (0, 0, 1)$ is the unit vector along the z axis, which is the axis along which the dipole is located.
We then have $\vec A(\vec r, t) = \vec A(\vec r)\, e^{-i\omega t}$, where
$$ \vec A(\vec r) = \int \frac{\vec J(\vec r\,', t - |\vec r - \vec r\,'|)}{|\vec r - \vec r\,'|}\, d^3\vec r\,'. \tag{8.109} $$
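Thus in the radiation zone, with $|\vec r - \vec r\,'| \approx r - \vec n \cdot \vec r\,'$ as usual, we therefore have
$$
\begin{aligned}
\vec A(\vec r) &\approx \vec Z\, \frac{I_0\, e^{ikr}}{r} \int_{-d/2}^{d/2} \sin k(d/2 - |z|)\; e^{-ikz\cos\theta}\, dz, \\
&= \vec Z\, \frac{2 I_0\, e^{ikr}}{k r}\; \frac{\cos(\tfrac{1}{2} kd \cos\theta) - \cos(\tfrac{1}{2} kd)}{\sin^2\theta}.
\end{aligned} \tag{8.110}
$$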
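The integral over the antenna in (8.110) evaluates to $2[\cos(\tfrac{1}{2}kd\cos\theta) - \cos(\tfrac{1}{2}kd)]/(k\sin^2\theta)$. The following sketch checks this closed form against a direct midpoint-rule quadrature; the function names and sample parameters are assumptions for illustration.

```python
import cmath
import math

def current_integral(k, d, theta, n=20000):
    """Midpoint-rule estimate of
    int_{-d/2}^{d/2} sin(k(d/2 - |z|)) exp(-i k z cos(theta)) dz."""
    h = d / n
    total = 0j
    for j in range(n):
        z = -d / 2 + (j + 0.5) * h
        total += (math.sin(k * (d / 2 - abs(z)))
                  * cmath.exp(-1j * k * z * math.cos(theta)) * h)
    return total

def closed_form(k, d, theta):
    """2 [cos(kd cos(theta)/2) - cos(kd/2)] / (k sin^2(theta))."""
    num = math.cos(0.5 * k * d * math.cos(theta)) - math.cos(0.5 * k * d)
    return 2 * num / (k * math.sin(theta)**2)

if __name__ == "__main__":
    print(current_integral(2.0, 3.0, 1.0), closed_form(2.0, 3.0, 1.0))
```

The imaginary part of the numerical result vanishes because the current distribution is even in z.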
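As we saw earlier, the magnetic field is given by $ik\, \vec n \times \vec A$ in the radiation zone, and $\vec E = -\vec n \times \vec B$. Therefore the radiated power per unit solid angle is given by
$$ \frac{dP}{d\Omega} = \frac{r^2}{8\pi}\, |\vec E \times \vec B^*| = \frac{r^2}{8\pi}\, |(\vec B \cdot \vec B^*)\, \vec n| = \frac{r^2}{8\pi}\, |\vec B|^2. \tag{8.111} $$
Here we have
$$ |\vec B|^2 = |ik\, \vec n \times \vec A|^2 = k^2 \left( |\vec A|^2 - |\vec n \cdot \vec A|^2 \right) = k^2\, |\vec A|^2\, \sin^2\theta, \tag{8.112} $$
since $\vec n \cdot \vec Z = \cos\theta$, and so the radiated power per unit solid angle is given by
$$ \frac{dP}{d\Omega} = \frac{I_0^2}{2\pi} \left[ \frac{\cos(\tfrac{1}{2} kd \cos\theta) - \cos(\tfrac{1}{2} kd)}{\sin\theta} \right]^2. \tag{8.113} $$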
We can now consider various special cases:
8.4.1 kd << 1:
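In this case, we can make Taylor expansions of the trigonometric functions in the numerator in (8.113), leading to
$$
\begin{aligned}
\frac{dP}{d\Omega} &\approx \frac{I_0^2}{2\pi} \left[ \frac{1 - \tfrac{1}{2} (\tfrac{1}{2} kd)^2 \cos^2\theta - 1 + \tfrac{1}{2} (\tfrac{1}{2} kd)^2}{\sin\theta} \right]^2, \\
&= \frac{I_0^2}{2\pi} \left[ \frac{\tfrac{1}{2} (\tfrac{1}{2} kd)^2 \sin^2\theta}{\sin\theta} \right]^2, \\
&= \frac{I_0^2\, (kd)^4\, \sin^2\theta}{128\pi}.
\end{aligned} \tag{8.114}
$$
This agrees with the result (8.56), after making allowance for the fact that for $kd \ll 1$ the current (8.108) reduces to $I(z) \approx \tfrac{1}{2} kd\, I_0\, (1 - 2|z|/d)$, so that the peak current $I_0$ appearing in (8.56) is replaced here by $\tfrac{1}{2} kd\, I_0$.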
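The short-antenna limit can be checked numerically: expanding (8.113) for small kd gives $I_0^2 (kd)^4 \sin^2\theta/(128\pi)$, and the ratio of the exact expression to this limit should approach 1. The function names below are illustrative assumptions.

```python
import math

def pattern_exact(I0, k, d, theta):
    """Angular power distribution (8.113) of the centre-fed linear antenna."""
    num = math.cos(0.5 * k * d * math.cos(theta)) - math.cos(0.5 * k * d)
    return I0**2 / (2 * math.pi) * (num / math.sin(theta))**2

def pattern_short(I0, k, d, theta):
    """Leading small-kd behaviour obtained by Taylor-expanding (8.113)."""
    return I0**2 * (k * d)**4 * math.sin(theta)**2 / (128 * math.pi)

if __name__ == "__main__":
    for kd in (1.0, 0.1, 0.01):
        print(kd, pattern_exact(1.0, 1.0, kd, 1.0) / pattern_short(1.0, 1.0, kd, 1.0))
```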
8.4.2 kd = π:
In this case, each arm of the dipole has a length equal to $\tfrac{1}{4}$ of the wavelength, and so $I(z) = I_0\, \sin \tfrac{1}{2}\pi (1 - 2|z|/d)$. In this case, (8.113) becomes
$$ \frac{dP}{d\Omega} = \frac{I_0^2}{2\pi}\; \frac{\cos^2(\tfrac{1}{2}\pi \cos\theta)}{\sin^2\theta}. \tag{8.115} $$
8.4.3 kd = 2π:
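In this case, each dipole arm has a length equal to $\tfrac{1}{2}$ of the wavelength, and $I(z) = I_0\, \sin \pi (1 - 2|z|/d)$. Using $\cos(\pi\cos\theta) + 1 = 2\cos^2(\tfrac{1}{2}\pi\cos\theta)$, in this case (8.113) becomes
$$ \frac{dP}{d\Omega} = \frac{2 I_0^2}{\pi}\; \frac{\cos^4(\tfrac{1}{2}\pi \cos\theta)}{\sin^2\theta}. \tag{8.116} $$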
9 Electromagnetism and Quantum Mechanics
9.1 The Schrödinger equation and gauge transformations
We saw at the end of chapter 2, in equation (2.102), that in the non-relativistic limit the Hamiltonian describing a particle of mass m and charge e in the presence of electromagnetic fields given by potentials $\phi$ and $\vec A$ is
$$ H = \frac{1}{2m}\, (\pi_i - e A_i)^2 + e\phi, \tag{9.1} $$
where $\pi_i$ is the canonical 3-momentum. In quantum mechanics, the standard prescription for writing down the Schrödinger equation for the wavefunction $\psi$ describing the particle is to interpret $\pi_i$ as an operator, and to write
$$ H \psi = i\hbar\, \frac{\partial\psi}{\partial t}. \tag{9.2} $$
In the position representation we shall have
$$ \pi_i = -i\hbar\, \partial_i, \quad \text{or} \quad \vec\pi = -i\hbar\, \vec\nabla. \tag{9.3} $$
Thus the Schrödinger equation for a particle of mass m and charge e in an electromagnetic field is
$$ -\frac{\hbar^2}{2m} \left( \vec\nabla - \frac{ie}{\hbar}\, \vec A \right)^2 \psi + e\phi\, \psi = i\hbar\, \frac{\partial\psi}{\partial t}. \tag{9.4} $$
The Schrödinger equation (9.4) is written in terms of the scalar and vector potentials $\phi$ and $\vec A$ that describe the electromagnetic field. Thus, if we perform a gauge transformation
$$ \vec A \longrightarrow \vec A\,' = \vec A + \vec\nabla\lambda, \qquad \phi \longrightarrow \phi' = \phi - \frac{\partial\lambda}{\partial t}, \tag{9.5} $$
the Schrödinger equation will change its form. On the other hand, we expect that the physics should be unaltered by a mere gauge transformation, since this leaves the physically observable electric and magnetic fields unchanged. It turns out that if we simultaneously perform a very specific phase transformation on the wavefunction $\psi$,
$$ \psi \longrightarrow \psi' = e^{ie\lambda/\hbar}\, \psi, \tag{9.6} $$
then the Schrödinger equation expressed entirely in terms of the primed quantities (i.e. wavefunction $\psi'$ and electromagnetic potentials $\phi'$ and $\vec A\,'$) will take the identical form to the original unprimed equation (9.4). Thus, we may say that the Schrödinger equation transforms covariantly under gauge transformations.
To see the details of how this works, it is useful first to define what are called covariant derivatives. We do this both for the three spatial derivatives, and also for the time derivative. Thus we define
$$ D_i \equiv \partial_i - \frac{ie}{\hbar}\, A_i, \qquad D_0 \equiv \frac{\partial}{\partial t} + \frac{ie}{\hbar}\, \phi. \tag{9.7} $$
Note that the original Schrödinger equation (9.4) is now written simply as
$$ -\frac{\hbar^2}{2m}\, D_i D_i\, \psi - i\hbar\, D_0 \psi = 0. \tag{9.8} $$
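Next, perform the transformations
$$ \vec A \longrightarrow \vec A\,' = \vec A + \vec\nabla\lambda, \qquad \phi \longrightarrow \phi' = \phi - \frac{\partial\lambda}{\partial t}, \qquad \psi \longrightarrow \psi' = e^{ie\lambda/\hbar}\, \psi. \tag{9.9} $$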
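The crucial point about this is that we have the following:
$$
\begin{aligned}
D'_i \psi' \equiv \left( \partial_i - \frac{ie}{\hbar}\, A'_i \right) \psi' &= \left( \partial_i - \frac{ie}{\hbar}\, A_i - \frac{ie}{\hbar}\, (\partial_i \lambda) \right) \left( e^{ie\lambda/\hbar}\, \psi \right), \\
&= e^{ie\lambda/\hbar} \left( \partial_i - \frac{ie}{\hbar}\, A_i - \frac{ie}{\hbar}\, (\partial_i \lambda) + \frac{ie}{\hbar}\, (\partial_i \lambda) \right) \psi, \\
&= e^{ie\lambda/\hbar} \left( \partial_i - \frac{ie}{\hbar}\, A_i \right) \psi,
\end{aligned} \tag{9.10}
$$
and
$$
\begin{aligned}
D'_0 \psi' \equiv \left( \frac{\partial}{\partial t} + \frac{ie}{\hbar}\, \phi' \right) \psi' &= \left( \frac{\partial}{\partial t} + \frac{ie}{\hbar}\, \phi - \frac{ie}{\hbar}\, \frac{\partial\lambda}{\partial t} \right) \left( e^{ie\lambda/\hbar}\, \psi \right), \\
&= e^{ie\lambda/\hbar} \left( \frac{\partial}{\partial t} + \frac{ie}{\hbar}\, \phi - \frac{ie}{\hbar}\, \frac{\partial\lambda}{\partial t} + \frac{ie}{\hbar}\, \frac{\partial\lambda}{\partial t} \right) \psi, \\
&= e^{ie\lambda/\hbar} \left( \frac{\partial}{\partial t} + \frac{ie}{\hbar}\, \phi \right) \psi.
\end{aligned} \tag{9.11}
$$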
In other words, we have
$$ D'_i \psi' = e^{ie\lambda/\hbar}\, D_i \psi, \qquad D'_0 \psi' = e^{ie\lambda/\hbar}\, D_0 \psi, \tag{9.12} $$
which means that $D_i\psi$ and $D_0\psi$ transform the same way as $\psi$ itself under the gauge transformations (9.9), namely just with a homogeneous phase transformation factor $e^{ie\lambda/\hbar}$. This would not, of course, be the case for the "ordinary" derivatives $\partial_i\psi$ and $\partial_0\psi$, because for these, there would be an extra additive term, where the derivative lands on the spacetime-dependent gauge parameter $\lambda$.
The covariance property (9.12) is a non-trivial statement, since the gauge parameter $\lambda$ is an arbitrary function of space and time. Had we been considering standard partial derivatives $\partial_i$ and $\partial/\partial t$ rather than the covariant derivatives defined in (9.7), it would most certainly not have been true. For example,
$$ \partial_i \psi' = \partial_i \left( e^{ie\lambda/\hbar}\, \psi \right) = e^{ie\lambda/\hbar}\, \partial_i \psi + e^{ie\lambda/\hbar}\, \frac{ie}{\hbar}\, (\partial_i \lambda)\, \psi \ne e^{ie\lambda/\hbar}\, \partial_i \psi, \tag{9.13} $$
precisely because the derivative can land on the spacetime-dependent gauge-transformation parameter $\lambda$ and thus give the second term, which spoils the covariance of the transformation. The point about the covariant derivatives is that the contributions from the gauge transformation of the gauge potentials precisely cancel the "unwanted" second term in (9.13).
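The covariance property (9.12), and its failure for ordinary derivatives as in (9.13), can be illustrated in one spatial dimension with finite differences. In this sketch the wavefunction, potential and gauge parameter are arbitrary made-up functions, and ħ = 1 is assumed purely for convenience.

```python
import cmath
import math

HBAR = 1.0        # assumption of this sketch: units with hbar = 1
E_CHARGE = 0.7    # arbitrary illustrative charge

def psi(x):       # hypothetical sample wavefunction
    return math.exp(-x * x) * cmath.exp(2j * x)

def A(x):         # hypothetical sample vector potential (x component)
    return math.sin(x)

def lam(x):       # hypothetical gauge parameter lambda(x)
    return 0.3 * x**3

def deriv(f, x, h=1e-5):
    """Central finite-difference derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

def covariant_derivative(psi_fn, A_fn, x):
    """D psi = (d/dx - i e A / hbar) psi, the spatial part of (9.7)."""
    return deriv(psi_fn, x) - 1j * E_CHARGE / HBAR * A_fn(x) * psi_fn(x)

def psi_prime(x):
    return cmath.exp(1j * E_CHARGE * lam(x) / HBAR) * psi(x)

def A_prime(x):
    return A(x) + deriv(lam, x)

if __name__ == "__main__":
    x = 0.4
    phase = cmath.exp(1j * E_CHARGE * lam(x) / HBAR)
    print(covariant_derivative(psi_prime, A_prime, x),
          phase * covariant_derivative(psi, A, x))   # these agree
    print(deriv(psi_prime, x), phase * deriv(psi, x))  # these do not
```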
By iterating the calculation, it also follows that $D'_i D'_i \psi' = e^{ie\lambda/\hbar}\, D_i D_i \psi$, and so we see that the Schrödinger equation (9.8) written in terms of the primed fields, i.e.
$$ -\frac{\hbar^2}{2m}\, D'_i D'_i\, \psi' - i\hbar\, D'_0 \psi' = 0, \tag{9.14} $$
just implies the Schrödinger equation in terms of unprimed fields, since
$$
\begin{aligned}
0 &= -\frac{\hbar^2}{2m}\, D'_i D'_i\, \psi' - i\hbar\, D'_0 \psi', \\
&= e^{ie\lambda/\hbar} \left( -\frac{\hbar^2}{2m}\, D_i D_i\, \psi - i\hbar\, D_0 \psi \right).
\end{aligned} \tag{9.15}
$$
What we have proved above is that the Schrödinger equation transforms covariantly under electromagnetic gauge transformations, provided that at the same time the wavefunction is scaled by a spacetime-dependent phase factor, as in (9.9). Note that we use the term "covariant transformation" here in the same sense as we used it earlier in the course when discussing the behaviour of the Maxwell equations under Lorentz transformations. The actual transformation is totally different in the two contexts; here we are discussing the behaviour of the Schrödinger equation under gauge transformations rather than Lorentz transformations, but in each case the essential point, which is characteristic of a covariance of any equation under a symmetry transformation, is that the equation expressed in terms of the symmetry-transformed (primed) variables is identical in form to the original equation for the unprimed variables, but with a prime placed on every field.
It is worth noting that the two definitions of the spatial and time covariant derivatives in (9.7) can be unified into the single 4-dimensional definition
$$ D_\mu = \partial_\mu - \frac{ie}{\hbar}\, A_\mu, \tag{9.16} $$
since we have $A^\mu = (\phi, \vec A)$, and hence $A_\mu = (-\phi, \vec A)$.
We shall have more to say about this 4-dimensional covariant derivative later. For now, we shall just make the obvious (if slightly provocative) remark that anyone who speaks of the Schrödinger equation as if it were the ultimate "holy grail" of physics should not be taken seriously, since it is an approximation that is not even Lorentz invariant. (It is manifest that time is treated on a different footing from space in (9.8).) Thus the Schrödinger equation does not respect any relativistic notion of causality. Furthermore, it gives answers that are measurably in disagreement with experiments. For example, to treat the electron properly in quantum physics it is necessary first to have a relativistic theory (the Dirac equation), and secondly one must move beyond quantum mechanics to quantum field theory. This lies outside the scope of the present course. We shall, however, return to the subject of relativistic quantum mechanics a little later.
9.2 Magnetic monopoles
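The Maxwell equations
$$ \partial_\mu F^{\mu\nu} = -4\pi\, J^\nu, \qquad \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} + \partial_\rho F_{\mu\nu} = 0 \tag{9.17} $$
take on a more symmetrical-looking form if we introduce the dual of the field-strength tensor, defined by
$$ \bar F_{\mu\nu} = \tfrac{1}{2}\, \epsilon_{\mu\nu\rho\sigma}\, F^{\rho\sigma}. \tag{9.18} $$
In terms of $\bar F_{\mu\nu}$, the second equation in (9.17) (i.e. the Bianchi identity) becomes
$$ \partial_\mu \bar F^{\mu\nu} = 0. \tag{9.19} $$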
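From $F_{0i} = -E_i$ and $F_{ij} = \epsilon_{ijk}\, B_k$, it is easy to see that
$$ \bar F_{0i} = B_i, \qquad \bar F_{ij} = \epsilon_{ijk}\, E_k. \tag{9.20} $$
It follows that $\bar F_{\mu\nu}$ is obtained from $F_{\mu\nu}$ by making the replacements
$$ \vec E \longrightarrow -\vec B, \qquad \vec B \longrightarrow \vec E. \tag{9.21} $$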
The symmetry between the two Maxwell equations would become even more striking if there were a current on the right-hand side of (9.19), analogous to the electric 4-current density on the right-hand side of the first Maxwell equation in (9.17). Since the roles of $\vec E$ and $\vec B$ are exchanged when passing from $F_{\mu\nu}$ to $\bar F_{\mu\nu}$, it is evident that the 4-current needed on the right-hand side of (9.19) must be a magnetic 4-current density, $J^\mu_M$. Let us now attach a subscript E to the standard electric 4-current density, in order to emphasise which is which in the following. The generalised Maxwell equations will now be written as
$$ \partial_\mu F^{\mu\nu} = -4\pi\, J^\nu_E, \qquad \partial_\mu \bar F^{\mu\nu} = -4\pi\, J^\nu_M. \tag{9.22} $$
Particles with magnetic charge, known as magnetic monopoles, have never been seen in nature. However, there seems to be no reason in principle why they should not exist, and it is of interest to explore their properties in a little more detail. A point electric charge e has an electric field given by
$$ \vec E = \frac{e\, \vec r}{r^3}. \tag{9.23} $$
Thus by analogy, a point magnetic monopole, with magnetic charge g, will have a magnetic field given by
$$ \vec B = \frac{g\, \vec r}{r^3}. \tag{9.24} $$
This satisfies
$$ \vec\nabla \cdot \vec B = 4\pi\, \rho_M, \qquad \rho_M = g\, \delta^3(\vec r), \tag{9.25} $$
where $\rho_M = J^0_M$ is the magnetic charge density.
We shall be interested in studying the quantum mechanics of electrically-charged particles in the background of a magnetic monopole. Since the Schrödinger equation is written in terms of the potentials $\phi$ and $\vec A$, we shall therefore need to write down the 3-vector potential $\vec A$ for the magnetic monopole. To do this, we introduce Cartesian coordinates (x, y, z), related to spherical polar coordinates $(r, \theta, \varphi)$ in the standard way,
$$ x = r \sin\theta \cos\varphi, \qquad y = r \sin\theta \sin\varphi, \qquad z = r \cos\theta, \tag{9.26} $$
and we also define
$$ \rho^2 = x^2 + y^2. \tag{9.27} $$
Consider the 3-vector potential
$$ \vec A = g \left( \frac{zy}{r\rho^2},\; -\frac{zx}{r\rho^2},\; 0 \right). \tag{9.28} $$
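Using
$$ \frac{\partial r}{\partial x} = \frac{x}{r}, \quad \frac{\partial r}{\partial y} = \frac{y}{r}, \quad \frac{\partial r}{\partial z} = \frac{z}{r}, \qquad \frac{\partial\rho}{\partial x} = \frac{x}{\rho}, \quad \frac{\partial\rho}{\partial y} = \frac{y}{\rho}, \quad \frac{\partial\rho}{\partial z} = 0, \tag{9.29} $$
it is easily seen that
$$ B_x = \partial_y A_z - \partial_z A_y = g\, \partial_z \left( \frac{zx}{r\rho^2} \right) = \frac{gx}{r\rho^2} - \frac{gxz^2}{r^3\rho^2} = \frac{gx}{r^3}, \tag{9.30} $$
and similarly
$$ B_y = \frac{gy}{r^3}, \qquad B_z = \frac{gz}{r^3}. \tag{9.31} $$
Thus indeed we find that
$$ \vec\nabla \times \vec A = \frac{g\, \vec r}{r^3}, \tag{9.32} $$
and so the 3-vector potential (9.28) describes the magnetic monopole field (9.24).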
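The claim that the curl of (9.28) reproduces the monopole field (9.24) away from the z axis can be checked numerically with central finite differences. The function names, step size and sample point below are illustrative assumptions.

```python
import math

def monopole_A(x, y, z, g=1.0):
    """Vector potential (9.28); singular on the z axis, where rho = 0."""
    r = math.sqrt(x * x + y * y + z * z)
    rho2 = x * x + y * y
    return (g * z * y / (r * rho2), -g * z * x / (r * rho2), 0.0)

def curl(field, p, h=1e-5):
    """Central-difference curl of field(x, y, z) at the point p."""
    def d(comp, axis):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        return (field(*plus)[comp] - field(*minus)[comp]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def monopole_B(x, y, z, g=1.0):
    """Monopole field (9.24): g r / r^3."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (g * x / r3, g * y / r3, g * z / r3)

if __name__ == "__main__":
    p = (0.7, -0.4, 0.9)
    print(curl(monopole_A, p))
    print(monopole_B(*p))
```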
In terms of spherical polar coordinates we have $\rho^2 = x^2 + y^2 = r^2 \sin^2\theta$, and so (9.28) can be written as
$$ \vec A = \frac{g \cot\theta}{r}\, (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.33} $$
Not surprisingly, this potential is singular at r = 0, since we are describing an idealised point magnetic charge. In exactly the same way, the potential $\phi = e/r$ describing a point electric charge diverges at r = 0 also. However, the potential (9.33) also diverges everywhere along the z axis, i.e. at $\theta = 0$ and $\theta = \pi$. It turns out that these latter singularities are "unphysical," in the sense that they can be removed by making gauge transformations. This is not too surprising, when we note that the magnetic field itself, given by (9.24), has no singularity along the z axis. It is, of course, genuinely divergent at r = 0, so that is a real physical singularity.
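To see the unphysical nature of the singularities in (9.33) along $\theta = 0$ and $\theta = \pi$, we need to make gauge transformations, under which
$$ \vec A \longrightarrow \vec A + \vec\nabla\lambda. \tag{9.34} $$
Consider first taking
$$ \lambda = g\, \varphi = g \arctan\frac{y}{x}. \tag{9.35} $$
From this, we find
$$ \vec\nabla\lambda = -\frac{g}{r}\, \mathrm{cosec}\,\theta\; (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.36} $$
Letting the gauge-transformed potential be $\vec A\,'$, we therefore find
$$ \vec A\,' = \vec A + \vec\nabla\lambda = \frac{g}{r}\, \frac{\cos\theta - 1}{\sin\theta}\, (\sin\varphi,\; -\cos\varphi,\; 0) = -\frac{g}{r}\, \tan\tfrac{1}{2}\theta\; (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.37} $$
It can be seen that $\vec A\,'$ is completely non-singular along $\theta = 0$ (i.e. along the positive z axis). It is, however, singular along $\theta = \pi$ (i.e. along the negative z axis).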
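A small numerical experiment makes the movement of the string singularity concrete: adding the gradient of λ = gφ to (9.33) reproduces (9.37), and the result stays finite as θ → 0 while the original potential blows up. The function names (`A_original`, `A_north`) are hypothetical labels introduced for this sketch.

```python
import math

def A_original(r, theta, phi, g=1.0):
    """Potential (9.33): (g cot(theta) / r) (sin phi, -cos phi, 0)."""
    c = g / (r * math.tan(theta))
    return (c * math.sin(phi), -c * math.cos(phi), 0.0)

def grad_lambda(r, theta, phi, g=1.0):
    """Gradient of lambda = g*phi: -(g / (r sin theta)) (sin phi, -cos phi, 0)."""
    c = -g / (r * math.sin(theta))
    return (c * math.sin(phi), -c * math.cos(phi), 0.0)

def A_north(r, theta, phi, g=1.0):
    """Gauge-transformed potential (9.37), regular on the positive z axis."""
    c = -g / r * math.tan(0.5 * theta)
    return (c * math.sin(phi), -c * math.cos(phi), 0.0)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

if __name__ == "__main__":
    for theta in (1.0, 1e-2, 1e-4, 1e-6):
        print(theta, norm(A_original(1.0, theta, 0.3)), norm(A_north(1.0, theta, 0.3)))
```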
We could, on the other hand, perform a gauge transformation with $\lambda$ given by
$$ \lambda = -g\, \varphi = -g \arctan\frac{y}{x} \tag{9.38} $$
instead of (9.35). Defining the gauge-transformed potential as $\vec A\,''$ in this case, we find
$$ \vec A\,'' = \frac{g}{r}\, \cot\tfrac{1}{2}\theta\; (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.39} $$
This time, we have obtained a gauge potential that is non-singular along $\theta = \pi$ (i.e. the negative z axis), but it is singular along $\theta = 0$ (the positive z axis).
There is no single choice of gauge in which the 3-vector potential for the magnetic monopole is completely free of singularities away from the origin r = 0. We have obtained two expressions for the vector potential, one of which, $\vec A\,'$, is non-singular along the positive z axis, and the other, $\vec A\,''$, is non-singular along the negative z axis. The singularity that each has is known as a "string singularity," since it lies along a line, or string. By making gauge transformations the location of the string can be moved around, but it can never be removed altogether.
In the discussion above, the z axis appears to have played a preferred role, but this is, of course, just an artefact of our gauge choices. We could equally well have chosen a different expression for $\vec A$, related by a gauge transformation, for which the string singularity ran along any desired line, or curve, emanating from the origin.
9.3 Dirac quantisation condition
We have seen that gauge potentials for the magnetic monopole, free of singularities on the positive and negative z axes respectively, are given by
$$ \vec A\,' = -\frac{g}{r}\, \tan\tfrac{1}{2}\theta\; (\sin\varphi,\; -\cos\varphi,\; 0), \qquad \vec A\,'' = \frac{g}{r}\, \cot\tfrac{1}{2}\theta\; (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.40} $$
The two are themselves related by a gauge transformation, namely
$$ \vec A\,'' = \vec A\,' + \vec\nabla(-2g\varphi). \tag{9.41} $$
Now let us consider the quantum mechanics of an electron in the background of the magnetic monopole. As we discussed in section 9.1, the Schrödinger equation for the electron is given by (9.4), where e is its charge, and m is its mass. We shall consider the Schrödinger equation in two different gauges, related as in (9.41). Denoting the corresponding electron wavefunctions by $\psi'$ and $\psi''$, we see from (9.9) and (9.41) that we shall have
$$ \psi'' = e^{-2ieg\varphi/\hbar}\, \psi'. \tag{9.42} $$
However, we have seen that the gauge transformation is not physical, but merely corresponds to shifting the string singularity of the magnetic monopole from the negative z axis to the positive z axis. Quantum mechanically, the physics will only be unchanged if the electron wavefunction remains single valued under a complete $2\pi$ rotation around the z axis. This means that the phase factor in the relation (9.42) must be equal to unity at $\varphi = 2\pi$, and so it must be that
$$ \frac{2eg}{\hbar}\, 2\pi = 2\pi\, n, \tag{9.43} $$
where n is an integer. Thus the product of the electric charge e on the electron and the magnetic charge g on the magnetic monopole must satisfy the so-called Dirac quantisation condition,
$$ 2 e g = n\hbar. \tag{9.44} $$
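The single-valuedness argument can be made concrete by evaluating the phase factor in (9.42) after a full rotation φ → φ + 2π. In this sketch the function names are hypothetical and ħ = 1 is assumed by default.

```python
import cmath
import math

def transition_phase(e, g, phi, hbar=1.0):
    """Phase factor exp(-2 i e g phi / hbar) relating the two gauges in (9.42)."""
    return cmath.exp(-2j * e * g * phi / hbar)

def is_single_valued(e, g, hbar=1.0, tol=1e-9):
    """True if the phase factor returns to 1 after a full 2*pi rotation,
    i.e. if 2 e g / hbar is an integer as required by (9.44)."""
    return abs(transition_phase(e, g, 2 * math.pi, hbar) - 1) < tol

if __name__ == "__main__":
    g = 0.5
    for e in (1.0, 2.0, 2.5, 3.0):
        print(e, 2 * e * g, is_single_valued(e, g))
```

With g = 0.5 the allowed charges are the integers, in line with the quantisation in units of ħ/(2g).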
It is interesting to note that although a magnetic monopole has never been observed, it would only take the existence of a single monopole, maybe somewhere in another galaxy, to imply that electric charges everywhere in the universe must be quantised in units of
$$ \frac{\hbar}{2g}, \tag{9.45} $$
where g is the magnetic charge of the lonely magnetic monopole. In fact all observed electric charges are indeed quantised; in integer multiples of the charge e on the electron, in everyday life, and in units of $\tfrac{1}{3} e$ in the quarks of the theory of strong interactions. It is tempting to speculate that the reason for this may be the existence of a magnetic monopole somewhere out in the vastness of space, in a galaxy far far away.
10 Local Gauge Invariance and Yang-Mills Theory
10.1 Relativistic quantum mechanics
We saw in the previous section that the ordinary non-relativistic quantum mechanics of a charged particle in an electromagnetic field has the feature that it is covariant under electromagnetic gauge transformations, provided that the usual gauge transformation of the 4-vector potential is combined with a phase transformation of the wavefunction for the charged particle:
$$ A_\mu \longrightarrow A_\mu + \partial_\mu \lambda, \qquad \psi \longrightarrow e^{ie\lambda/\hbar}\, \psi. \tag{10.1} $$
The essential point here is that the gauge transformation parameter $\lambda$ can be an arbitrary function of the spacetime coordinates, and so the phase transformation of the wavefunction is a spacetime-dependent one. Such spacetime-dependent transformations are known as local transformations.
One could turn this around, and view the introduction of the electromagnetic field as the necessary addition to quantum mechanics in order to allow the theory to be covariant under local phase transformations of the wavefunction. If we started with quantum mechanics in the absence of electromagnetism, so that for a free particle of mass m we have
$$ -\frac{\hbar^2}{2m}\, \nabla^2 \psi = i\hbar\, \frac{\partial\psi}{\partial t}, \tag{10.2} $$
then the Schrödinger equation is obviously covariant under constant phase transformations of the wavefunction,
$$ \psi \longrightarrow e^{ic}\, \psi, \tag{10.3} $$
where c is an arbitrary constant. And, indeed, the physics described by the wavefunction is invariant under this phase transformation, since all physical quantities are constructed using a product of $\psi$ and its complex conjugate $\bar\psi$ (for example, the probability density $|\psi|^2$), for which the phase factors cancel out. Also, clearly, the Schrödinger equation (10.2) does not transform nicely under local phase transformations, since the derivatives will now land on the (spacetime-dependent) phase factor in (10.3) and give a lot of messy terms.
As we now know, the way to achieve a nice covariant transformation of the Schrödinger equation under local phase transformations is to replace the partial derivatives $\partial_i$ and $\partial_0$ in (10.2) by covariant derivatives
$$ D_i = \partial_i - \frac{ie}{\hbar}\, A_i, \qquad D_0 = \partial_0 + \frac{ie}{\hbar}\, \phi, \tag{10.4} $$
where $A_i$ and $\phi$ transform in the standard way for the electromagnetic gauge potentials at the same time as the local phase transformation for $\psi$ is performed, as in (10.1). From this point of view, it could be said that we have derived electromagnetism as the field needed in order to allow the Schrödinger equation to transform covariantly under local phase transformations of the wavefunction.
The idea now is to extend this to more general situations. By again demanding covariance under local "phase" transformations of some quantum-mechanical equation, we will now be able to derive a generalisation of electromagnetism known as Yang-Mills theory.
Working with a non-relativistic equation like the Schrödinger equation is rather clumsy, because of the way in which space and time arise on such different footings. It is more elegant (and simpler) to switch at this point to the consideration of a relativistic quantum-mechanical equation. There are various possible equations one could consider, but they all lead to equivalent conclusions about the generalisation of electromagnetism. Examples one could consider include the Dirac equation, which provides a relativistic description of the electron, or any other fermionic particle with spin $\tfrac{1}{2}$. A simpler option is to consider the Klein-Gordon equation for a relativistic particle of spin 0 (otherwise known as a scalar field). The Klein-Gordon equation for a free scalar field $\varphi$ with mass m is very simple, namely
$$ \Box\varphi - m^2 \varphi = 0, \tag{10.5} $$
where $\Box = \partial^\mu \partial_\mu$ is the usual d'Alembertian operator, which, as we know, is Lorentz invariant.[19] Note that from now on, we shall use units where Planck's constant $\hbar$ is set equal to 1.
In what follows, we shall make the simplifying assumption that the scalar field is massless, and so its Klein-Gordon equation is simply
$$ \Box\varphi = 0. \tag{10.6} $$
We shall do this because no essential feature that we wish to explore will be lost, and it will slightly shorten the equations. It is completely straightforward to add the mass term back in if desired.[20]
The KleinGordon equation (10.6) can be derived from the Lagrangian density
L = −
1
2
∂
µ
ϕ∂
µ
ϕ. (10.7)
Varying the action I =
d
4
xL, we ﬁnd
δI = −
d
4
x∂
µ
ϕ∂
µ
δϕ =
d
4
x(∂
µ
∂
µ
ϕ) δϕ, (10.8)
$^{19}$ The nonrelativistic Schrödinger equation can be derived from the Klein-Gordon equation (10.5) in an
appropriate limit. The leading-order time dependence of a field with mass (i.e. energy) $m$ will be $e^{-i m t}$
(in units where we set $\hbar = 1$). Thus the appropriate nonrelativistic approximation is one where the
wavefunction $\varphi$ is assumed to be of the form $\varphi \sim e^{-i m t}\, \psi$, with $\psi$ only slowly varying in time, which
means that the term $\partial^2\psi/\partial t^2$ can be neglected in comparison to the others. Substituting into (10.5) and
dropping the $\partial^2\psi/\partial t^2$ term gives precisely the Schrödinger equation for the free massive particle, namely
$-(1/2m)\nabla^2 \psi = i\, \partial\psi/\partial t$.
$^{20}$ Note that we can only discuss a nonrelativistic limit for the massive Klein-Gordon equation. This is
because the nonrelativistic approximation (discussed in the previous footnote) involved assuming that each
time derivative of $\psi$ with respect to $t$ was small compared with $m$ times $\psi$. Clearly this would no longer
be true if $m$ were zero. Put another way, a massless particle is inherently relativistic, since it must travel
at the speed of light (like the photon). We shall not be concerned with taking the nonrelativistic limit in
what follows, and so working with a massless field will not be a problem.
(dropping the boundary term at infinity as usual), and so demanding that the action be
stationary under arbitrary variations $\delta\varphi$ implies the Klein-Gordon equation (10.6).
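The nonrelativistic limit quoted in the footnote to (10.5) can be spelled out in a few lines. The sketch below assumes that, with $c = 1$, the d'Alembertian takes the form $\Box = -\partial_t^2 + \nabla^2$, which is the sign convention consistent with (10.5) and with the Schrödinger equation stated in that footnote; the shorthand $\partial_t \equiv \partial/\partial t$ is introduced here only for brevity.

```latex
\begin{aligned}
\varphi &= e^{-i m t}\,\psi , \\
\partial_t^2 \varphi &= e^{-i m t}\left( \partial_t^2\psi - 2 i m\, \partial_t\psi - m^2 \psi \right) , \\
\Box\varphi - m^2\varphi &= e^{-i m t}\left( \nabla^2\psi - \partial_t^2\psi + 2 i m\, \partial_t\psi \right) = 0 , \\
\Longrightarrow\quad -\frac{1}{2m}\,\nabla^2\psi &= i\,\frac{\partial\psi}{\partial t}
\qquad \text{after dropping } \partial_t^2\psi .
\end{aligned}
```

The two $m^2\psi$ terms cancel identically, which is why the factor $e^{-i m t}$ is the natural one to strip off before taking the limit.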
Before moving on to the generalisation to Yang-Mills theory, we shall first review, again,
the derivation of electromagnetism as the field needed in order to turn a global phase
invariance into a local invariance, this time from the viewpoint of the relativistic
Klein-Gordon equation. To do this, we first need to enlarge the system of wavefunctions from one
real scalar to two. Suppose, then, we have two real scalars called $\varphi_1$ and $\varphi_2$, each satisfying
a Klein-Gordon equation. These equations can therefore be derived from the Lagrangian
density
$$ \mathcal{L} = -\tfrac{1}{2}\, \partial^\mu\varphi_1\, \partial_\mu\varphi_1 - \tfrac{1}{2}\, \partial^\mu\varphi_2\, \partial_\mu\varphi_2 . \tag{10.9} $$
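We can conveniently combine the two real fields into a complex scalar field $\phi$, defined by
$$ \phi = \frac{1}{\sqrt{2}}\, (\varphi_1 + i\, \varphi_2) . \tag{10.10} $$
Since $\partial^\mu\bar\phi\, \partial_\mu\phi = \tfrac{1}{2}\, (\partial^\mu\varphi_1\, \partial_\mu\varphi_1 + \partial^\mu\varphi_2\, \partial_\mu\varphi_2)$, the Lagrangian density (10.9) can then be written as
$$ \mathcal{L} = -\partial^\mu\bar\phi\, \partial_\mu\phi . \tag{10.11} $$
The complex field $\phi$ therefore satisfies the Klein-Gordon equation
$$ \Box\phi = 0 . \tag{10.12} $$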
It is clear that the complex field $\phi$ has a global phase invariance, under
$$ \phi \longrightarrow e^{i\alpha}\, \phi , \tag{10.13} $$
where $\alpha$ is a constant. (The term "global" is used to describe such phase transformations,
which are identical at every point in spacetime.) This can be seen at the level of the
Klein-Gordon equation (10.12), since the constant phase factor simply passes straight through the
d'Alembertian operator. It can also be seen at the level of the Lagrangian density, since
again the derivatives do not land on the phase factor, and furthermore, the $e^{i\alpha}$ phase factor
from transforming $\phi$ is cancelled by the $e^{-i\alpha}$ phase factor from transforming $\bar\phi$.
It is also clear that the Lagrangian density is not invariant under local phase
transformations, where $\alpha$ is assumed now to be spacetime dependent. This is because we now
have
$$ \partial_\mu\phi \longrightarrow \partial_\mu (e^{i\alpha}\phi) = e^{i\alpha}\, \partial_\mu\phi + i\, e^{i\alpha}\, (\partial_\mu\alpha)\, \phi . \tag{10.14} $$
It is the second term, where the derivative lands on $\alpha$, that spoils the invariance.
The remedy, not surprisingly, is to introduce a gauge potential $A_\mu$, and replace the
partial derivatives by covariant derivatives
$$ D_\mu = \partial_\mu - i\, e\, A_\mu , \tag{10.15} $$
where now $\phi$ will be interpreted as describing a complex scalar field with electric charge
$e$. As we saw before when discussing the Schrödinger equation, the covariant derivative
acting on $\phi$ has a nice transformation property under the local phase transformations of $\phi$,
provided at the same time we transform $A_\mu$:
$$ \phi \longrightarrow e^{i\alpha}\, \phi , \qquad A_\mu \longrightarrow A_\mu + \frac{1}{e}\, \partial_\mu\alpha . \tag{10.16} $$
This implies that $D_\mu\phi$ transforms nicely as
$$ D_\mu\phi \longrightarrow e^{i\alpha}\, D_\mu\phi , \tag{10.17} $$
and so the new Lagrangian density
$$ \mathcal{L} = -\overline{(D^\mu\phi)}\, (D_\mu\phi) \tag{10.18} $$
is indeed invariant. This is the "derivation" of ordinary electromagnetism.
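The transformation property (10.17) is quick to verify directly. The following is a sketch of that check, in which primes (a notation introduced here for convenience) denote the transformed quantities of (10.16):

```latex
\begin{aligned}
D'_\mu \phi' &= \left( \partial_\mu - i e A_\mu - i\, \partial_\mu\alpha \right) \left( e^{i\alpha}\phi \right) \\
&= e^{i\alpha} \left( \partial_\mu\phi + i (\partial_\mu\alpha)\, \phi - i e A_\mu \phi - i (\partial_\mu\alpha)\, \phi \right) \\
&= e^{i\alpha}\, D_\mu\phi , \\
\overline{(D'^\mu\phi')}\, (D'_\mu\phi') &= e^{-i\alpha}\, e^{i\alpha}\; \overline{(D^\mu\phi)}\, (D_\mu\phi)
= \overline{(D^\mu\phi)}\, (D_\mu\phi) .
\end{aligned}
```

The unwanted term $i(\partial_\mu\alpha)\phi$ from (10.14) is cancelled precisely by the shift of $A_\mu$, which is the whole purpose of the covariant derivative.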
This viewpoint, in which we derive electromagnetism by requiring the local phase
invariance of the theory under (10.13), has not yet given any dynamics to the gauge field
$A_\mu$. Indeed, one cannot derive the dynamics of $A_\mu$, because in fact there is no
unique answer. What one can do, however, is to introduce a dynamical term "by hand,"
which has all the natural properties one would like. First of all, we want a dynamical term
that respects the gauge invariance we have already achieved in the rest of the Lagrangian.
Secondly, we expect that it should give rise to a second-order dynamical equation for $A_\mu$.
We are back to the discussion of section 4.2, where we derived Maxwell's equations from an
action principle. The steps leading to the answer are as follows.
First, to make a gauge-invariant term we need to use the gauge-invariant field strength
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{10.19} $$
as the basic "building block." Then, to make a Lorentz-invariant term, the lowest-order
possibility is to form the quadratic invariant $F^{\mu\nu} F_{\mu\nu}$. Taking the standard normalisation
as discussed in section 4.2, we are therefore led to propose the total Lagrangian density
$$ \mathcal{L} = -\overline{(D^\mu\phi)}\, (D_\mu\phi) - \frac{1}{16\pi}\, F^{\mu\nu} F_{\mu\nu} , \tag{10.20} $$
where $D_\mu = \partial_\mu - i\, e\, A_\mu$. It is easily verified that the Euler-Lagrange equations resulting
from this Lagrangian density are as follows. Requiring the stationarity of the action under
variations of the wavefunction $\phi$ implies
$$ D^\mu D_\mu \phi = 0 , \qquad \text{i.e.} \qquad (\partial^\mu - i\, e A^\mu)(\partial_\mu - i\, e A_\mu)\, \phi = 0 . \tag{10.21} $$
This is the gauge-covariant generalisation of the original uncharged Klein-Gordon equation.
Requiring stationarity under variations of $A_\mu$ implies
$$ \partial_\mu F^{\mu\nu} = -4\pi J^\nu , \tag{10.22} $$
where
$$ J^\mu = -i\, e \left( \bar\phi\, D^\mu\phi - \overline{(D^\mu\phi)}\, \phi \right) . \tag{10.23} $$
Thus $A_\mu$ satisfies the Maxwell field equation, with a source current density given by (10.23).
This is exactly what one would hope for; the complex field $\phi$ carries electric charge $e$, and
so it is to be expected that it should act as a source for the electromagnetic field. In the
process of giving dynamics to the electromagnetic field we have, as a bonus, derived the
current density for the scalar field.
10.2 Yang-Mills theory
At the end of the previous subsection we rederived electromagnetism as the field needed in
order to turn the global phase invariance of a complex scalar field that satisfies the
Klein-Gordon equation into a local phase invariance. The phase factor $e^{i\alpha}$ is a unit-modulus
complex number. The set of all unit-modulus complex numbers forms the group $U(1)$; i.e.
$1 \times 1$ complex matrices $U$ satisfying $U^\dagger U = 1$. (For $1 \times 1$ matrices, which are just numbers,
there is of course no distinction between Hermitean conjugation, denoted by a dagger, and
complex conjugation.)
In order to derive the generalisation of electromagnetism to Yang-Mills theory we need to
start with an extended system of scalar fields, each satisfying the Klein-Gordon equation,
whose Lagrangian is invariant under a larger, non-abelian, group.$^{21}$ We shall take the
example of the group $SU(2)$ in order to illustrate the basic ideas. One can in fact construct
a Yang-Mills theory based on any Lie group.
$^{21}$ An abelian group is one where the order of combination of group elements makes no difference. By
contrast, for a non-abelian group, if two elements $U$ and $V$ are combined in the two different orderings, the
results are, in general, different. Thus, for a group realised by matrices under multiplication, for example,
one has in general that $UV \neq VU$.
The group $SU(2)$ should be familiar from quantum mechanics, where it arises when
one discusses systems with intrinsic spin $\frac{1}{2}$. The group can be defined as the set of $2 \times 2$
complex matrices $U$ subject to the conditions
$$ U^\dagger U = 1 , \qquad \det U = 1 . \tag{10.24} $$
It can therefore be parameterised in the form
$$ U = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} , \tag{10.25} $$
where $a$ and $b$ are complex numbers subject to the constraint
$$ |a|^2 + |b|^2 = 1 . \tag{10.26} $$
If we write $a = x_1 + i\, x_2$, $b = x_3 + i\, x_4$, the constraint is described by the surface
$$ x_1^2 + x_2^2 + x_3^2 + x_4^2 = 1 \tag{10.27} $$
in Euclidean 4-space, and so the elements of the group $SU(2)$ are in one-to-one
correspondence with the points on a unit 3-dimensional sphere. Clearly $SU(2)$ has three independent
parameters.$^{22}$
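The group $SU(2)$ can be generated by exponentiating the three Pauli matrices $\tau_a$, where
$$ \tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad
\tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad
\tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{10.28} $$
They satisfy the commutation relations
$$ [\tau_a, \tau_b] = 2 i\, \epsilon_{abc}\, \tau_c , \tag{10.29} $$
i.e. $[\tau_1, \tau_2] = 2 i\, \tau_3$, and cyclic permutations.
Let
$$ T_a = \frac{1}{2i}\, \tau_a . \tag{10.30} $$
We shall therefore have
$$ [T_a, T_b] = \epsilon_{abc}\, T_c . \tag{10.31} $$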
$^{22}$ For comparison, the group $U(1)$, whose elements $U$ can be parameterised as $U = e^{i\alpha}$ with $0 \le \alpha < 2\pi$,
has one parameter. Since $e^{i\alpha}$ is periodic in $\alpha$ the elements of $U(1)$ are in one-to-one correspondence with
the points on a unit circle, or 1-dimensional sphere. In fact the circle, $S^1$, and the 3-sphere, $S^3$, are the only
spheres that are isomorphic to groups.
Note that the $T_a$, which are called the generators of the Lie algebra of $SU(2)$, are
anti-Hermitean,
$$ T_a^\dagger = -T_a . \tag{10.32} $$
They are also, of course, traceless.
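The algebraic statements (10.29) to (10.32) can be checked mechanically. The following is a minimal sketch in plain Python, storing each 2 x 2 matrix as nested tuples; the helper names (`matmul`, `commutator`, `epsilon`, `check_algebra` and so on) are illustrative choices made here, not notation from the text.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def add(X, Y, sign=1):
    """Entrywise X + sign * Y."""
    return tuple(tuple(X[i][j] + sign * Y[i][j] for j in range(2)) for i in range(2))

def scale(c, X):
    """Multiply every entry of X by the number c."""
    return tuple(tuple(c * X[i][j] for j in range(2)) for i in range(2))

def dagger(X):
    """Hermitean conjugate: transpose plus complex conjugation."""
    return tuple(tuple(complex(X[j][i]).conjugate() for j in range(2)) for i in range(2))

def commutator(X, Y):
    return add(matmul(X, Y), matmul(Y, X), sign=-1)

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

def epsilon(a, b, c):
    """Levi-Civita symbol for index values 0, 1, 2 (standing for 1, 2, 3)."""
    return (a - b) * (b - c) * (c - a) // 2

# Pauli matrices (10.28) and the generators T_a = tau_a / (2i) of (10.30)
TAU = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
T = tuple(scale(1 / 2j, tau) for tau in TAU)

def check_algebra():
    """Verify anti-Hermiticity, tracelessness and [T_a, T_b] = eps_abc T_c."""
    for a in range(3):
        assert close(dagger(T[a]), scale(-1, T[a]))        # (10.32)
        assert abs(T[a][0][0] + T[a][1][1]) < 1e-12        # traceless
        for b in range(3):
            rhs = ((0, 0), (0, 0))
            for c in range(3):
                rhs = add(rhs, scale(epsilon(a, b, c), T[c]))
            assert close(commutator(T[a], T[b]), rhs)      # (10.31)
    return True
```

Calling `check_algebra()` runs through all nine index pairs; the factor $1/(2i)$ in (10.30) is what removes the $2i$ from (10.29) and at the same time turns the Hermitean $\tau_a$ into anti-Hermitean $T_a$.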
The $SU(2)$ group elements can be written as
$$ U = e^{\alpha_a T_a} , \tag{10.33} $$
where $\alpha_a$ are three real parameters. (This is the analogue of writing the $U(1)$ elements $U$
as $U = e^{i\alpha}$.) It is easy to check that the unitarity of $U$, i.e. $U^\dagger U = 1$, follows from the
anti-Hermiticity of the generators $T_a$:
$$ U^\dagger U = \left( e^{\alpha_a T_a} \right)^\dagger e^{\alpha_b T_b} = e^{\alpha_a T_a^\dagger}\, e^{\alpha_b T_b} = e^{-\alpha_a T_a}\, e^{\alpha_b T_b} = 1 . \tag{10.34} $$
The unit-determinant property follows from the tracelessness of the $T_a$, bearing in mind
that for any matrix $X$ we have $\det X = \exp(\mathrm{tr} \log X)$:
$$ \det U = \det(e^{\alpha_a T_a}) = \exp[\mathrm{tr} \log(e^{\alpha_a T_a})] = \exp[\mathrm{tr}(\alpha_a T_a)] = \exp[0] = 1 . \tag{10.35} $$
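As a numerical illustration of (10.33) to (10.35), one can sum the exponential series for a sample choice of the parameters $\alpha_a$ and confirm that the result is unitary, has unit determinant, and has the form (10.25). The sketch below uses plain Python; the function names, the truncation at 40 terms and the sample values of $\alpha_a$ are assumptions made here for illustration.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dagger(X):
    """Hermitean conjugate: transpose plus complex conjugation."""
    return tuple(tuple(complex(X[j][i]).conjugate() for j in range(2)) for i in range(2))

def algebra_element(alpha):
    """alpha_a T_a with T_a = tau_a / (2i), written out entry by entry."""
    a1, a2, a3 = alpha
    return (
        (-0.5j * a3, -0.5j * a1 - 0.5 * a2),
        (-0.5j * a1 + 0.5 * a2, 0.5j * a3),
    )

def su2_element(alpha, terms=40):
    """Truncated power series for U = exp(alpha_a T_a), as in (10.33)."""
    X = algebra_element(alpha)
    U = ((1 + 0j, 0j), (0j, 1 + 0j))
    term = U
    for n in range(1, terms + 1):
        # term becomes X^n / n!
        term = tuple(tuple(z / n for z in row) for row in matmul(term, X))
        U = tuple(tuple(U[i][j] + term[i][j] for j in range(2)) for i in range(2))
    return U

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

# Sample parameters (arbitrary illustrative values)
U = su2_element((0.3, -1.1, 0.7))
UdU = matmul(dagger(U), U)   # should be the identity, cf. (10.34)
detU = det(U)                # should be 1, cf. (10.35)
```

Up to rounding, `UdU` is the identity and `detU` equals 1; the entries of `U` also satisfy `U[1][1] == conj(U[0][0])` and `U[1][0] == -conj(U[0][1])`, matching the parameterisation (10.25).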
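Suppose now that we take a pair of complex scalar fields, called $\phi_1$ and $\phi_2$, each of
which satisfies the massless Klein-Gordon equation. We may assemble them into a complex
2-vector, which we shall call $\phi$:
$$ \phi = \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} . \tag{10.36} $$
This vector-valued field therefore satisfies the Klein-Gordon equation
$$ \Box\phi = 0 , \tag{10.37} $$
which can be derived from the Lagrangian density
$$ \mathcal{L} = -(\partial^\mu \phi^\dagger)(\partial_\mu \phi) . \tag{10.38} $$
It is obvious that the Lagrangian density (10.38) is invariant under global $SU(2)$
transformations
$$ \phi \longrightarrow U \phi , \tag{10.39} $$
where $U$ is a constant $SU(2)$ matrix. Thus, we have
$$ \mathcal{L} \longrightarrow -\partial^\mu (\phi^\dagger U^\dagger)\, \partial_\mu (U\phi) = -(\partial^\mu \phi^\dagger)\, U^\dagger U\, (\partial_\mu \phi) = -(\partial^\mu \phi^\dagger)(\partial_\mu \phi) = \mathcal{L} . \tag{10.40} $$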
Obviously, $\mathcal{L}$ would not be invariant if we allowed $U$ to be spacetime dependent, for
the usual reason that we would get extra terms where the derivatives landed on the $U$
transformation matrix. Based on our experience with the local $U(1)$ phase invariance of
the theory coupled to electromagnetism, we can expect that again we could achieve a local
$SU(2)$ invariance by introducing appropriate analogues of the electromagnetic field. In this
case, since the $SU(2)$ group is characterised by 3 parameters rather than the 1 parameter
characterising $U(1)$, we can expect that we will need 3 gauge fields rather than 1. We shall
call these $A^a_\mu$, where $1 \le a \le 3$. In fact it is convenient to assemble the three gauge fields
into a $2 \times 2$ matrix, by defining
$$ A_\mu = A^a_\mu\, T_a , \tag{10.41} $$
where $T_a$ are the generators of the $SU(2)$ algebra that we introduced earlier.
We next define the covariant derivative $D_\mu$, whose action on the complex 2-vector of
scalar fields $\phi$ is defined by
$$ D_\mu \phi = \partial_\mu \phi + A_\mu \phi . \tag{10.42} $$
Since we don't, a priori, know how $A_\mu$ should transform we shall work backwards and
demand that its transformation rule should be such that $D_\mu$ satisfies the nice property we
should expect of a covariant derivative in this case, namely that if we transform $\phi$ under a
local $SU(2)$ transformation
$$ \phi \longrightarrow \phi' = U \phi , \tag{10.43} $$
then we should also have that
$$ (D_\mu \phi) \longrightarrow (D_\mu \phi)' = U (D_\mu \phi) . \tag{10.44} $$
Working this out, we shall have
$$ \begin{aligned}
(D_\mu \phi)' = D'_\mu \phi' &= (\partial_\mu + A'_\mu)(U \phi) , \\
&= (\partial_\mu U)\, \phi + U\, \partial_\mu \phi + A'_\mu U \phi , \\
&= U D_\mu \phi = U\, \partial_\mu \phi + U A_\mu \phi .
\end{aligned} \tag{10.45} $$
Equating the last two lines, and noting that we want this to be true for all possible $\phi$, we
conclude that
$$ \partial_\mu U + A'_\mu U = U A_\mu . \tag{10.46} $$
Multiplying on the right with $U^\dagger$ then gives the result that
$$ A'_\mu = U A_\mu U^\dagger - (\partial_\mu U)\, U^\dagger . \tag{10.47} $$
This, then, will be the gauge transformation rule for the Yang-Mills potentials $A_\mu$. In other
words, the full set of local $SU(2)$ transformations comprises$^{23}$
$$ \begin{aligned}
A_\mu &\longrightarrow A'_\mu = U A_\mu U^\dagger - (\partial_\mu U)\, U^\dagger , \\
\phi &\longrightarrow \phi' = U \phi .
\end{aligned} \tag{10.48} $$
What we have established is that if we replace the Lagrangian density (10.38) by
$$ \mathcal{L} = -(D^\mu \phi)^\dagger (D_\mu \phi) , \tag{10.49} $$
then it will be invariant under the local $SU(2)$ transformations given by (10.48). The
proof is now identical to the previous proof of the invariance of (10.38) under global $SU(2)$
transformations. The essential point is that the local transformation matrix $U$ "passes
through" the covariant derivative, in the sense that $(D_\mu \phi)' = D'_\mu (U\phi) = U D_\mu \phi$.
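It may help to see explicitly how the non-abelian rule (10.48) contains the familiar electromagnetic gauge transformation. The sketch below specialises to $U(1)$; it assumes the identification $A_\mu = -i\, e\, A^{\rm em}_\mu$ (a labelling introduced here, chosen so that $D_\mu = \partial_\mu + A_\mu$ in (10.42) agrees with $D_\mu = \partial_\mu - i\, e\, A_\mu$ in (10.15)):

```latex
\begin{aligned}
U &= e^{i\alpha} , \qquad A_\mu = -i\, e\, A^{\rm em}_\mu , \\
A'_\mu &= U A_\mu U^\dagger - (\partial_\mu U)\, U^\dagger = A_\mu - i\, \partial_\mu\alpha , \\
\Longrightarrow\quad A'^{\,\rm em}_\mu &= A^{\rm em}_\mu + \frac{1}{e}\, \partial_\mu\alpha .
\end{aligned}
```

Since $U$ commutes with everything in the abelian case, the conjugation $U A_\mu U^\dagger$ is trivial and only the inhomogeneous term survives, reproducing (10.16).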
So far, we have succeeded in constructing a theory with a local $SU(2)$ symmetry, but as
yet, the Yang-Mills potentials $A^a_\mu$ that we introduced do not have any dynamics of their own.
Following the strategy we applied to the case of electromagnetism and local $U(1)$ invariance,
we should now look for a suitable term to add to the Lagrangian density (10.49) that will do
the job. Guided by the example of electromagnetism, we should first find a field strength
tensor for the Yang-Mills fields, which will be the analogue of the electromagnetic field
strength
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{10.50} $$
It is clear that the expression (10.50) is not suitable in the Yang-Mills case. If one
were to try adopting (10.50) as a definition for the field strength, then a simple calculation
shows that under the $SU(2)$ gauge transformation for $A_\mu$ given in (10.48), the field strength
would transform into a complete mess. It turns out that the appropriate generalisation that
is needed for Yang-Mills is to define
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu] . \tag{10.51} $$
Of course this would reduce to the standard electromagnetic field strength in the abelian
$U(1)$ case, since the commutator $[A_\mu, A_\nu] \equiv A_\mu A_\nu - A_\nu A_\mu$ would then vanish.
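In terms of the three component fields $A^a_\mu$ of (10.41), the matrix definition (10.51) unpacks as follows. This is a sketch using only the commutation relations (10.31); the component notation $F^a_{\mu\nu}$ is introduced here by analogy with $A^a_\mu$.

```latex
\begin{aligned}
[A_\mu, A_\nu] &= A^b_\mu A^c_\nu\, [T_b, T_c] = \epsilon_{abc}\, A^b_\mu A^c_\nu\, T_a , \\
F_{\mu\nu} &= F^a_{\mu\nu}\, T_a , \qquad
F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + \epsilon_{abc}\, A^b_\mu A^c_\nu .
\end{aligned}
```

The term quadratic in the potentials is what makes the Yang-Mills fields self-interacting; it is absent for $U(1)$, where there is a single generator and no structure constants.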
$^{23}$ Note that this non-abelian result, which takes essentially the same form for any group, reduces to the
previous case of electromagnetic theory if we specialise to the abelian group $U(1)$. Essentially, we would just
write $U = e^{i\alpha}$ and plug into the transformations (10.48). Since left and right multiplication are the same in
the abelian case, the previous results for electromagnetic gauge invariance can be recovered.
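The first task is to check how $F_{\mu\nu}$ defined by (10.51) transforms under the $SU(2)$ gauge
transformation of $A_\mu$ given in (10.48). Using $\partial_\mu U^\dagger = -U^\dagger (\partial_\mu U)\, U^\dagger$, which follows from
differentiating $U U^\dagger = 1$, we shall have
$$ \begin{aligned}
F_{\mu\nu} \longrightarrow F'_{\mu\nu} &= \partial_\mu A'_\nu + A'_\mu A'_\nu - (\mu \leftrightarrow \nu) , \\
&= \partial_\mu \left( U A_\nu U^\dagger - (\partial_\nu U) U^\dagger \right)
 + \left( U A_\mu U^\dagger - (\partial_\mu U) U^\dagger \right) \left( U A_\nu U^\dagger - (\partial_\nu U) U^\dagger \right) - (\mu \leftrightarrow \nu) , \\
&= (\partial_\mu U) A_\nu U^\dagger + U (\partial_\mu A_\nu) U^\dagger - U A_\nu U^\dagger (\partial_\mu U) U^\dagger - (\partial_\mu \partial_\nu U) U^\dagger \\
&\quad + (\partial_\nu U) U^\dagger (\partial_\mu U) U^\dagger + U A_\mu U^\dagger U A_\nu U^\dagger - U A_\mu U^\dagger (\partial_\nu U) U^\dagger \\
&\quad - (\partial_\mu U) U^\dagger U A_\nu U^\dagger + (\partial_\mu U) U^\dagger (\partial_\nu U) U^\dagger - (\mu \leftrightarrow \nu) , \\
&= U \left( \partial_\mu A_\nu - \partial_\nu A_\mu + A_\mu A_\nu - A_\nu A_\mu \right) U^\dagger ,
\end{aligned} \tag{10.52} $$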
where the notation $-(\mu \leftrightarrow \nu)$ means that one subtracts off from the terms written explicitly
the same set of terms with the indices $\mu$ and $\nu$ exchanged. Comparing with (10.51) we see
that the upshot is that under the $SU(2)$ gauge transformation for $A_\mu$ given in (10.48), the
field strength $F_{\mu\nu}$ defined in (10.51) transforms as
$$ F_{\mu\nu} \longrightarrow F'_{\mu\nu} = U F_{\mu\nu} U^\dagger . \tag{10.53} $$
This means that $F_{\mu\nu}$ transforms covariantly under $SU(2)$ gauge transformations. It would, of
course, reduce to the invariance of the electromagnetic field strength ($F'_{\mu\nu} = F_{\mu\nu}$)
in the abelian case.
It is now a straightforward matter to write down a suitable term to add to the Lagrangian
density (10.49). As for electromagnetism, we want a gauge-invariant and Lorentz-invariant
quantity that is quadratic in fields. Thus we shall take
$$ \mathcal{L} = -(D^\mu \phi)^\dagger (D_\mu \phi) + \frac{1}{8\pi}\, \mathrm{tr}(F^{\mu\nu} F_{\mu\nu}) . \tag{10.54} $$
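The proof that $\mathrm{tr}(F^{\mu\nu} F_{\mu\nu})$ is gauge invariant is very simple; under the $SU(2)$ gauge
transformation we shall have
$$ \begin{aligned}
\mathrm{tr}(F^{\mu\nu} F_{\mu\nu}) \longrightarrow \mathrm{tr}(F'^{\mu\nu} F'_{\mu\nu})
&= \mathrm{tr}(U F^{\mu\nu} U^\dagger U F_{\mu\nu} U^\dagger) = \mathrm{tr}(U F^{\mu\nu} F_{\mu\nu} U^\dagger) \\
&= \mathrm{tr}(F^{\mu\nu} F_{\mu\nu} U^\dagger U) = \mathrm{tr}(F^{\mu\nu} F_{\mu\nu}) .
\end{aligned} \tag{10.55} $$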
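The cyclic-trace argument in (10.55) is easy to test numerically. In the minimal sketch below, two arbitrary elements of the $SU(2)$ Lie algebra stand in for one pair of components $F^{\mu\nu}$ and $F_{\mu\nu}$, and the matrix $U$ is built from the parameterisation (10.25). All function names and numerical values are illustrative assumptions, not taken from the text.

```python
import cmath
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dagger(X):
    """Hermitean conjugate: transpose plus complex conjugation."""
    return tuple(tuple(complex(X[j][i]).conjugate() for j in range(2)) for i in range(2))

def trace(X):
    return X[0][0] + X[1][1]

def algebra_element(f):
    """f_a T_a with T_a = tau_a / (2i), written out entry by entry."""
    f1, f2, f3 = f
    return (
        (-0.5j * f3, -0.5j * f1 - 0.5 * f2),
        (-0.5j * f1 + 0.5 * f2, 0.5j * f3),
    )

def su2_from_angles(theta, chi1, chi2):
    """SU(2) matrix of the form (10.25) with |a|^2 + |b|^2 = 1."""
    a = math.cos(theta) * cmath.exp(1j * chi1)
    b = math.sin(theta) * cmath.exp(1j * chi2)
    return ((a, b), (-b.conjugate(), a.conjugate()))

def gauge_rotate(F, U):
    """Covariant transformation F -> U F U^dagger of (10.53)."""
    return matmul(matmul(U, F), dagger(U))

# Stand-ins for one component pair F^{mu nu}, F_{mu nu} (arbitrary values)
f_up = (0.2, -0.5, 1.3)
f_dn = (-0.7, 0.1, 0.4)
F_up, F_dn = algebra_element(f_up), algebra_element(f_dn)
U = su2_from_angles(0.4, 0.3, -1.2)

before = trace(matmul(F_up, F_dn))
after = trace(matmul(gauge_rotate(F_up, U), gauge_rotate(F_dn, U)))
```

The two traces agree to rounding error. Both are also equal to $-\frac{1}{2} f_a g_a$ for components $f_a$, $g_a$, reflecting the fact that $\mathrm{tr}(T_a T_b) = -\frac{1}{2}\delta_{ab}$ for the generators (10.30).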
The equations of motion for the $\phi$ and $A_\mu$ fields can be derived from (10.54) in the
standard way, as the Euler-Lagrange equations that follow from requiring that the action
$I = \int d^4x\, \mathcal{L}$ be stationary under variations of $\phi$ and $A_\mu$ respectively. First, let us just consider
the source-free Yang-Mills equations that will result if we just consider the Lagrangian
density for the Yang-Mills fields alone,
$$ \mathcal{L}_{YM} = \frac{1}{8\pi}\, \mathrm{tr}(F^{\mu\nu} F_{\mu\nu}) . \tag{10.56} $$
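The positive coefficient $\frac{1}{8\pi}$ in (10.56) may look at odds with the $-\frac{1}{16\pi}$ of the Maxwell Lagrangian in (10.20). The following sketch shows that the two agree once the trace over the anti-Hermitean generators is evaluated. It uses the standard identity $\mathrm{tr}(\tau_a \tau_b) = 2\, \delta_{ab}$ for the Pauli matrices and the component notation $F_{\mu\nu} = F^a_{\mu\nu} T_a$, which is assumed here by analogy with (10.41).

```latex
\begin{aligned}
\mathrm{tr}(T_a T_b) &= \left( \frac{1}{2i} \right)^{2} \mathrm{tr}(\tau_a \tau_b)
= -\frac{1}{4} \cdot 2\, \delta_{ab} = -\frac{1}{2}\, \delta_{ab} , \\
\mathcal{L}_{YM} &= \frac{1}{8\pi}\, \mathrm{tr}(F^{\mu\nu} F_{\mu\nu})
= \frac{1}{8\pi}\, F^{a\, \mu\nu} F^{b}_{\mu\nu}\, \mathrm{tr}(T_a T_b)
= -\frac{1}{16\pi}\, F^{a\, \mu\nu} F^{a}_{\mu\nu} .
\end{aligned}
```

Each of the three component fields therefore enters with the same normalisation as the electromagnetic field.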
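Writing $I_{YM} = \int d^4x\, \mathcal{L}_{YM}$, we shall have
$$ \begin{aligned}
\delta I_{YM} &= \frac{1}{8\pi}\, 2\, \mathrm{tr} \int d^4x\, \delta F_{\mu\nu}\, F^{\mu\nu} , \\
&= \frac{1}{4\pi}\, \mathrm{tr} \int d^4x\, \left( \partial_\mu \delta A_\nu - \partial_\nu \delta A_\mu + [\delta A_\mu, A_\nu] + [A_\mu, \delta A_\nu] \right) F^{\mu\nu} , \\
&= \frac{1}{2\pi}\, \mathrm{tr} \int d^4x\, \left( \partial_\mu \delta A_\nu + [A_\mu, \delta A_\nu] \right) F^{\mu\nu} , \\
&= \frac{1}{2\pi}\, \mathrm{tr} \int d^4x\, \left( -\delta A_\nu\, \partial_\mu F^{\mu\nu} + A_\mu\, \delta A_\nu\, F^{\mu\nu} - \delta A_\nu\, A_\mu F^{\mu\nu} \right) , \\
&= -\frac{1}{2\pi}\, \mathrm{tr} \int d^4x\, \delta A_\nu \left( \partial_\mu F^{\mu\nu} + [A_\mu, F^{\mu\nu}] \right) ,
\end{aligned} \tag{10.57} $$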
and so requiring that the action be stationary implies
$$ \partial_\mu F^{\mu\nu} + [A_\mu, F^{\mu\nu}] = 0 . \tag{10.58} $$
These are the source-free Yang-Mills equations, which are the generalisation of the
source-free Maxwell equations $\partial_\mu F^{\mu\nu} = 0$. Obviously the Yang-Mills equations reduce to the
Maxwell equations in the abelian $U(1)$ case.
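If we now include the $-(D^\mu \phi)^\dagger (D_\mu \phi)$ term in the above calculation, we shall find that
$$ \delta I = \int d^4x\, \left( \phi^\dagger\, \delta A_\nu\, D^\nu \phi - (D^\nu \phi)^\dagger\, \delta A_\nu\, \phi \right)
 - \frac{1}{2\pi}\, \mathrm{tr} \int d^4x\, \delta A_\nu \left( \partial_\mu F^{\mu\nu} + [A_\mu, F^{\mu\nu}] \right) . \tag{10.59} $$
Requiring stationarity under the variations $\delta A_\nu$ now gives the Yang-Mills equations with
sources,
$$ \partial_\mu F^{\mu\nu} + [A_\mu, F^{\mu\nu}] = 2\pi\, J^\nu , \tag{10.60} $$
where
$$ J^\mu = (D^\mu \phi)\, \phi^\dagger - \phi\, (D^\mu \phi)^\dagger . \tag{10.61} $$
Note that this is a $2 \times 2$ matrix current, as it should be, since the Hermitean-conjugated
2-vector sits to the right of the unconjugated 2-vector.$^{24}$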
This completes this brief introduction to Yang-Mills theory. As far as applications are
concerned, it is fair to say that Yang-Mills theory lies at the heart of modern fundamental
physics. The weak nuclear force is described by the Weinberg-Salam model, based on the
Yang-Mills gauge group $SU(2)$. The $W$ and $Z$ bosons, which have been seen in particle
accelerators such as the one at CERN, are the $SU(2)$ gauge fields. The strong nuclear
force is described by a Yang-Mills theory with $SU(3)$ gauge group, and the 8 gauge fields
associated with this theory are the gluons that mediate the strong interactions. Thus one
may say that almost all of modern particle physics relies upon Yang-Mills theory.
$^{24}$ It is helpful to introduce an index notation to label the rows and columns of the $2 \times 2$ matrices, in order
to verify that (10.61) is the correct expression.
If S is an inertial frame. and that their origins coincide at at t = 0.e.1) and (1. (1. If a beam of light were moving along the x axis of S 4 . are parallel.4) Returning to our simplifying assumption that the two frames are parallel. so that we can deﬁne r = M · r −vt.e.5) Suppose. M ·r ↔ M y . for example.1) Of course. it is always understood in Newtonian mechanics that time is absolute. Suppose that two inertial frames S and S . the times t and t measured by observers in the frames S and S are the same: t = t. and so The transformations (1. components of the position vector: r ↔ y . that M = 1 it follows that if a particle having position vector r in S moves with velocity l. such as the dynamics of moving objects. The full Galilean group includes also rotations of the spatial Cartesian coordinate system.3) where M is an orthogonal 3 × 3 constant matrix acting by matrix multiplication on the x z x z where M T M = 1. (1.e.1 Electrodynamics and Special Relativity Introduction In Newtonian mechanics. where r = r − vt. are valid in all inertial frames (i. all nonaccelerating frames). u = dr/dt. then the set of all inertial frames comprises all frames that are in uniform motion relative to S. t = t. (1. that v lies along the x axis of S. i.1 1. that S is moving along the x axis of S with speed v = v. If S is moving with uniform velocity v relative to S. i. the fundamental laws of physics. then its velocity u = dr /dt as measured with respect to the frame S is given by u = u−v. (1.2) (1. then a point P with position vector r with respect to S will have position vector r with respect to S .2) form part of what is called the Galilean Group.
independent of the choice of inertial frame.10) ω2 . and ω is also a constant. (1. Then we shall have ×( But using the vector identity × E) = − ×( 2 ∂ ∂t × B = −µ0 ( · E) − 0 ∂2E . the speed c of the light beam would be c = c−v. Of course. let us begin with the freespace Maxwell’s equations. it should be emphasised that the discrepancies between experiment and the Galilean transformations are rather negligible if the relative speed v between the two inertial frames is of a typical “everyday” magnitude.11) . such as the speed of a car or a plane. To be precise. where E0 and k are constant vectors. this contradicts experiment.9) This admits planewave solutions of the form E = E0 ei(k·r−ωt) . i. ·E = 1 0 ρ. ρ and J are the charge density and 0 and µ0 are the permittivity and permeability of free space. (1.8) ·E = 0 × E) = it follows from that the electric ﬁeld satisﬁes the wave equation E − µ0 0 ∂2E = 0. To see the electromagnetic wave solutions. the true state of aﬀairs is that the speed of the light beam is the same in all inertial frames. ∂t2 2 E. current density.7) · B = 0. then the prediction of Newtonian mechanics and the Galilean transformation would therefore be that in the frame S . where ρ = 0 and J = 0. as is well known. we can consider a region of space where there are no sources. Thus the predictions of Newtonian mechanics and the Galilean transformation are falsiﬁed by experiment.6) Of course.e. where k 2 = µ0 5 0 (1. As far as we can tell. (1.with speed c. and ∂B = 0. then the discrepancy becomes appreciable too. it turns out that Maxwell’s equations of electromagnetism do predict a constant speed of light. ∂t (1. × B − µ0 ×E+ 0 ∂E = µ0 J . But if v begins to become appreciable in comparison to the speed of light. ∂t2 (1. with experiments of everincreasing accuracy. By contrast. ∂t where E and B are the electric and magnetic ﬁelds.
as we shall see. Since there are many diﬀerent conventions on oﬀer in the marketplace. To make the Lorentz transformations look nice and simple. Furthermore. 1 Strictly. they are in fact perfectly invariant 1 under the Lorentz transformations. where the extra component is associated with the time direction. Therefore.12) Putting in the numbers. the magnitude of the wavevector k. we should ﬁrst therefore reformulate special relativity in terms of 4vectors and 4tensors. A similar calculation shows that the magnetic ﬁeld B also satisﬁes an identical wave equation. 0 (1. In order to give a nice elegant treatment of the Lorentz transformation properties of the Maxwell equations. then they predict that the speed of light will be c ≈ 3 × 10 8 metres per second in that frame. is that if the Maxwell equations (1. ω (1. the familiar speed of light.7).7) hold in a given frame of reference.Here k means k. it follows that the Maxwell equations are not invariant under Galilean transformations. No further modiﬁcation is required in order to incorporate Maxwell’s theory of electromagnetism into special relativity. This is because they are written in the language of 3vectors. we should instead express them in terms of 4vectors.e. if we assume that the Maxwell equations hold in all inertial frames. Since this prediction is in agreement with experiment. even though the Maxwell equations were written down in the prerelativity days of the nineteenth century. then. do not look manifestly covariant with respect to Lorentz transformations. 6 . and in fact B and E are related by B= 1 k×E. However. the transformations that correctly describe the relation between observations in diﬀerent inertial frames in uniform motion are the Lorentz Transformations of Special Relativity. as will be explained later. we can reasonably expect that the Maxwell equations will indeed hold in all inertial frames. we should say covariant rather than invariant. 
Since the prediction contradicts the implications of the Galilean transformations.13) The situation. This is just as well. the Maxwell equations as they stand. i. since the Galilean transformations are wrong! In fact. Thus we see that the waves travel at speed c given by c= 1 ω =√ k µ0 . then they predict that the speed of light will have that same value in all inertial frames. this gives c ≈ 3 × 10 8 metres per second. written in the form given in equation (1.
(1. correspond to the equation x = vt in the fame S. that is to say. If a ﬂash of light is emitted at the origin at time zero. and moves along the x axis with velocity v. t) and (x . whose origins coincide at time zero.14) (1. z . z. To derive the Lorentz transformation. this must. following the second of Einstein’s postulates. if x = 0. y.we shall begin with a review of special relativity in the notation that we shall be using in this course.2 The Lorentz Transformation The derivation of the Lorentz transformation follows from Einstein’s two postulates: • The laws of physics are the same for all inertial observers. Consider for simplicity the case where S is parallel to S. and so from the ﬁrst equation in (1. and by x + y + z − c2 t = 0 2 2 2 2 (1. Our goal is to derive the relation between the coordinates (x.18) . since otherwise it would not be translationinvariant or timetranslation invariant. C and D to be determined. t ) in the two inertial frames. Thus we may say that x = Ax + Bt . (1. then it will spread out over a spherical wavefront given by x2 + y 2 + z 2 − c 2 t2 = 0 in the frame S. B .17) we have B = −Av. by deﬁnition. let us suppose that we have two inertial frames S and S . t = Cx + Dt . t) and (x . and thus we have x = γ(x − vt) . at t = 0 in the frame S. and at t = 0 in the frame S .15) in the frame S . z =z.16) Furthermore. t ) must be a linear one.17) for constants A. 7 (1. Note that. the transformation between (x. Clearly we must have y = y. we have used the same speed of light c for both inertial frames. Now. • The speed of light is the same for all inertial observers. 1. For convenience we will change the name of the constant A to γ. y .
t = γ(t − v x) . we shall introduce the simpliﬁcation of working in a system of units in which the speed of light is set equal to 1. for notational convenience. we obtain γ= 1 . where 1 . so if we have x = ct then this must imply x = ct . z = z. z = z. It follows that x = γ(x + vt ) .21) ct = γ(c + v)t (1.) Thus we arrive at the Lorentz transformation x = γ(x − vt) . (1. y = y. if we consider taking x = 0 then this will correspond to x = −vt in the frame S . At this point.19) Note that it must be the same constant γ in both these equations.By the same token. for γ.18). for the special case where S is moving along the x direction with velocity v. c2 (1. In fact. By making the small change of taking the lightsecond as the basic unit of length. c2 (1. Solving the resulting two equations ct = γ(c − v)t . γ=√ 1 − v2 8 (1.25) y = y.458 of a second.24) .792. and the physics must be the same for the two cases. 1 − v 2 /c2 (1. v. since the two really just correspond to reversing the direction of the x axis.23) becomes x = γ(x − vt) . the metre is nowadays deﬁned to be the distance travelled by light in vacuo in 1/299. and so we may as well choose to measure length in terms of the time it takes for light in vacuo to traverse the distance.23) where γ is given by (1.792. t = γ(t − vx) . after using (1. rather than the 1/299. (1. Now we bring in the postulate that the speed of light is the same in the two frames.20) Solving x2 − c2 t2 = x 2 − c2 t 2 for t . We can do this because the speed of light is the same for all inertial observers.458 th of a lightsecond. we ﬁnd t 2 = γ 2 (t − vx/c2 )2 and hence t = γ(t − v x) .22) (We must choose the positive square root since it must reduce to t = +t at zero relative velocity. the Lorentz transformation (1. we end up with a system of units in which c = 1.21). In these units.
It will be convenient to generalise the Lorentz transformation (1.24) to the case where the frame $S'$ is moving with (constant) velocity $\mathbf{v}$ in an arbitrary direction, rather than specifically along the $x$ axis. It is rather straightforward to do this. We know that there is a complete rotational symmetry in the three-dimensional space parameterised by the $(x, y, z)$ coordinate system. Therefore, if we can first rewrite the special case described by (1.24) in terms of 3-vectors, where the 3-vector velocity $\mathbf{v}$ happens to be simply $\mathbf{v} = (v, 0, 0)$, then generalisation will be immediate. It is easy to check that with $\mathbf{v}$ taken to be $(v, 0, 0)$, and with $\mathbf{r} = (x, y, z)$, the Lorentz transformation (1.24) can be written as

$$\mathbf{r}' = \mathbf{r} + \frac{\gamma - 1}{v^2}\, (\mathbf{v} \cdot \mathbf{r})\, \mathbf{v} - \gamma\, \mathbf{v}\, t\,, \qquad t' = \gamma (t - \mathbf{v} \cdot \mathbf{r})\,, \tag{1.26}$$

with $\gamma = (1 - v^2)^{-1/2}$ and $v \equiv |\mathbf{v}|$. Since these equations are manifestly covariant under 3-dimensional spatial rotations (i.e. they are written entirely in a 3-vector notation), it must be that they are the correct form of the Lorentz transformations for an arbitrary direction for the velocity 3-vector $\mathbf{v}$.

The Lorentz transformations (1.26) are what are called the pure boosts. It is easy to check that they have the property of preserving the spherical light-front condition, in the sense that points on the expanding spherical shell given by $r^2 = t^2$ of a light-pulse emitted at the origin at $t = 0$ in the frame $S$ will also satisfy the equivalent condition $r'^2 = t'^2$ in the primed reference frame $S'$. (Note that $r^2 = x^2 + y^2 + z^2$.) In fact, a stronger statement is true: the Lorentz transformation (1.26) satisfies the equation

$$x^2 + y^2 + z^2 - t^2 = x'^2 + y'^2 + z'^2 - t'^2\,. \tag{1.27}$$

1.3 4-vectors and 4-tensors

The Lorentz transformations given in (1.26) are linear in the space and time coordinates. They can be written more succinctly if we first define the set of four spacetime coordinates denoted by $x^\mu$, where $\mu$ is an index, or label, that ranges over the values 0, 1, 2 and 3. The case $\mu = 0$ corresponds to the time coordinate $t$, while $\mu = 1$, 2 and 3 corresponds to the space coordinates $x$, $y$ and $z$ respectively. Thus we have (see footnote 2)

$$(x^0, x^1, x^2, x^3) = (t, x, y, z)\,. \tag{1.28}$$

Footnote 2: The choice to put the index label $\mu$ as a superscript, rather than a subscript, is purely conventional. But, unlike the situation with many arbitrary conventions, in this case the coordinate index is placed upstairs in all modern literature.

Of course, once the abstract index label $\mu$ is replaced, as here, by the specific index values 0, 1, 2 and 3, one has to be very careful when reading a formula to distinguish between, for
example, $x^2$ meaning the symbol $x$ carrying the spacetime index $\mu = 2$, and $x^2$ meaning the square of $x$. It should generally be obvious from the context which is meant.

The invariant quadratic form appearing on the left-hand side of (1.27) can now be written in a nice way, if we first introduce the 2-index quantity $\eta_{\mu\nu}$, defined to be given by

$$\eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{1.29}$$

What this means is that the rows of the matrix on the right are labelled by the index $\mu$ and the columns are labelled by the index $\nu$. In other words, (1.29) is saying that the only non-vanishing components of $\eta_{\mu\nu}$ are given by

$$\eta_{00} = -1\,, \qquad \eta_{11} = \eta_{22} = \eta_{33} = 1\,, \tag{1.30}$$

with $\eta_{\mu\nu} = 0$ if $\mu \neq \nu$. Note that $\eta_{\mu\nu}$ is symmetric:

$$\eta_{\mu\nu} = \eta_{\nu\mu}\,. \tag{1.31}$$

Using $\eta_{\mu\nu}$, the quadratic form on the left-hand side of (1.27) can be rewritten as

$$x^2 + y^2 + z^2 - t^2 = \sum_{\mu=0}^{3} \sum_{\nu=0}^{3} \eta_{\mu\nu}\, x^\mu x^\nu\,. \tag{1.32}$$

At this point, it is very convenient to introduce the Einstein Summation Convention. This makes the writing of expressions such as (1.32) much less cumbersome. The summation convention works as follows: In an expression such as (1.32), if an index appears exactly twice in a term, then it will be understood that the index is summed over the natural index range (0, 1, 2, 3 in our present case), and the explicit summation symbol will be omitted. An index that occurs twice in a term, and thus is understood to be summed over, is called a Dummy Index.

Since in (1.32) both $\mu$ and $\nu$ occur exactly twice, we can rewrite the expression, using the Einstein summation convention, as simply

$$x^2 + y^2 + z^2 - t^2 = \eta_{\mu\nu}\, x^\mu x^\nu\,. \tag{1.33}$$

One might at first think there would be a great potential for ambiguity, but this is not the case. The point is that in any valid vectorial (or, more generally, tensorial) expression, the only time that a particular index can ever occur exactly twice in a term is when it is summed
over. Thus, there is no ambiguity resulting from agreeing to omit the explicit summation symbol, since it is logically inevitable that a summation is intended (see footnote 3).

Footnote 3: As a side remark, it should be noted that in a valid vectorial or tensorial expression, a specific index can NEVER appear more than twice in a given term. If you have written down a term where a given index occurs 3, 4 or more times then there is no need to look further at it; it is WRONG. Thus, for example, it is totally meaningless to write $\eta_{\mu\mu}\, x^\mu x^\mu$. If you ever find such an expression in a calculation then you must stop, and go back to find the place where an error was made.

Now let us return to the Lorentz transformations. The pure boosts written in (1.26), being linear in the space and time coordinates, can be written in the form

$$x'^\mu = \Lambda^\mu{}_\nu\, x^\nu\,, \tag{1.34}$$

where $\Lambda^\mu{}_\nu$ are constants, and the Einstein summation convention is operative for the dummy index $\nu$. By comparing (1.34) carefully with (1.26), we can see that the components $\Lambda^\mu{}_\nu$ are given by

$$\Lambda^0{}_0 = \gamma\,, \qquad \Lambda^0{}_i = -\gamma\, v_i\,, \qquad \Lambda^i{}_0 = -\gamma\, v_i\,, \qquad \Lambda^i{}_j = \delta_{ij} + \frac{\gamma - 1}{v^2}\, v_i v_j\,, \tag{1.35}$$

where $\delta_{ij}$ is the Kronecker delta symbol,

$$\delta_{ij} = 1 \ \ \text{if } i = j\,, \qquad \delta_{ij} = 0 \ \ \text{if } i \neq j\,. \tag{1.36}$$

A couple of points need to be explained here. Firstly, we are introducing Latin indices here, namely the $i$ and $j$ indices, which range only over the three spatial index values, $i = 1$, 2 and 3. Thus the 4-index $\mu$ can be viewed as $\mu = (0, i)$, where $i = 1$, 2 and 3. This piece of notation is useful because the three spatial index values always occur on a completely symmetric footing, whereas the time index value $\mu = 0$ is a bit different. This can be seen, for example, in the definition of $\eta_{\mu\nu}$ in (1.29) or (1.30).

The second point is that when we consider spatial indices (for example when $\mu$ takes the values $i = 1$, 2 or 3), it actually makes no difference whether we write the index $i$ upstairs or downstairs. Sometimes, as in (1.35), it will be convenient to be rather relaxed about whether we put spatial indices upstairs or downstairs. By contrast, when the index takes the value 0, it is very important to be careful about whether it is upstairs or downstairs. The reason why we can be cavalier about the Latin indices, but not the Greek, will become clearer as we proceed.

We already saw that the Lorentz boost transformations (1.26), re-expressed in terms of $\Lambda^\mu{}_\nu$ in (1.35), have the property that $\eta_{\mu\nu}\, x^\mu x^\nu = \eta_{\mu\nu}\, x'^\mu x'^\nu$. Thus from (1.34) we have

$$\eta_{\mu\nu}\, x^\mu x^\nu = \eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, x^\rho x^\sigma\,. \tag{1.37}$$
(Note that we have been careful to choose two different dummy indices for the two implicit summations over $\rho$ and $\sigma$!) On the left-hand side, we can replace the dummy indices $\mu$ and $\nu$ by $\rho$ and $\sigma$, and thus write

$$\eta_{\rho\sigma}\, x^\rho x^\sigma = \eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, x^\rho x^\sigma\,. \tag{1.38}$$

This can be grouped together as

$$(\eta_{\rho\sigma} - \eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma)\, x^\rho x^\sigma = 0\,, \tag{1.39}$$

and, since it is true for any $x^\mu$, we must have that

$$\eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma = \eta_{\rho\sigma}\,. \tag{1.40}$$

(This can also be verified directly from (1.35).) The full set of $\Lambda$'s that satisfy (1.40) are the Lorentz Transformations. The Lorentz Boosts, given by (1.35), are examples, but they are just a subset of the full set of Lorentz transformations that satisfy (1.40). Essentially, the additional Lorentz transformations consist of rotations of the three-dimensional spatial coordinates. Thus, one can really say that the Lorentz boosts (1.35) are the "interesting" Lorentz transformations, i.e. the ones that rotate space and time into one another. The remainder are just rotations of our familiar old 3-dimensional Euclidean space.

The coordinates $x^\mu = (x^0, x^i)$ live in a four-dimensional spacetime, known as Minkowski Spacetime. This is the four-dimensional analogue of the three-dimensional Euclidean Space described by the Cartesian coordinates $x^i = (x, y, z)$. The quantity $\eta_{\mu\nu}$ is called the Minkowski Metric, and for reasons that we shall see presently, it is called a tensor. It is called a metric because it provides the rule for measuring distances in the four-dimensional Minkowski spacetime. The distance, or to be more precise, the interval, between two infinitesimally-separated points $(x^0, x^1, x^2, x^3)$ and $(x^0 + dx^0, x^1 + dx^1, x^2 + dx^2, x^3 + dx^3)$ in spacetime is written as $ds$, and is given by

$$ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu\,. \tag{1.41}$$

Clearly, this is the Minkowskian generalisation of the three-dimensional distance $ds_E$ between neighbouring points $(x, y, z)$ and $(x + dx, y + dy, z + dz)$ in Euclidean space, which, by Pythagoras' theorem, is given by

$$ds_E^2 = dx^2 + dy^2 + dz^2 = \delta_{ij}\, dx^i dx^j\,. \tag{1.42}$$

The Euclidean metric (1.42) is invariant under arbitrary constant rotations of the $(x, y, z)$ coordinate system. (This is clearly true because the distance between the neighbouring
points must obviously be independent of how the axes of the Cartesian coordinate system are oriented.) By the same token, the Minkowski metric (1.41) is invariant under arbitrary Lorentz transformations. In other words, as can be seen to follow immediately from (1.40), the spacetime interval $ds'^2 = \eta_{\mu\nu}\, dx'^\mu dx'^\nu$ calculated in the primed frame is identical to the interval $ds^2$ calculated in the unprimed frame:

$$ds'^2 = \eta_{\mu\nu}\, dx'^\mu dx'^\nu = \eta_{\mu\nu}\, \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, dx^\rho dx^\sigma = \eta_{\rho\sigma}\, dx^\rho dx^\sigma = ds^2\,. \tag{1.43}$$

For this reason, we do not need to distinguish between $ds^2$ and $ds'^2$, since it is the same in all inertial frames. It is what is called a Lorentz Scalar.

The Lorentz transformation rule of the coordinate differential $dx^\mu$, i.e.

$$dx'^\mu = \Lambda^\mu{}_\nu\, dx^\nu\,, \tag{1.44}$$

can be taken as the prototype for more general 4-vectors. Thus, we may define any set of four quantities $U^\mu$, for $\mu = 0$, 1, 2 and 3, to be the components of a Lorentz 4-vector (often, we shall just abbreviate this to simply a 4-vector) if they transform, under Lorentz transformations, according to the rule

$$U'^\mu = \Lambda^\mu{}_\nu\, U^\nu\,. \tag{1.45}$$

The Minkowski metric $\eta_{\mu\nu}$ may be thought of as a $4 \times 4$ matrix, whose rows are labelled by $\mu$ and columns labelled by $\nu$, as in (1.29). Clearly, the inverse of this matrix takes the same form as the matrix itself. We denote the components of the inverse matrix by $\eta^{\mu\nu}$. This is called, not surprisingly, the inverse Minkowski metric. Clearly it satisfies the relation

$$\eta_{\mu\nu}\, \eta^{\nu\rho} = \delta_\mu^\rho\,, \tag{1.46}$$

where the 4-dimensional Kronecker delta is defined to equal 1 if $\mu = \rho$, and to equal 0 if $\mu \neq \rho$. Note that like $\eta_{\mu\nu}$, the inverse $\eta^{\mu\nu}$ is symmetric also: $\eta^{\mu\nu} = \eta^{\nu\mu}$.

The Minkowski metric and its inverse may be used to lower or raise the indices on other quantities. Thus, for example, if $U^\mu$ are the components of a 4-vector, then we may define

$$U_\mu = \eta_{\mu\nu}\, U^\nu\,. \tag{1.47}$$

This is another type of 4-vector. To distinguish the two, we call a 4-vector with an upstairs index a contravariant 4-vector, while one with a downstairs index is called a covariant 4-vector. Note that if we raise the lowered index in (1.47) again using $\eta^{\mu\nu}$, then we get back
to the starting point:

$$\eta^{\mu\nu}\, U_\nu = \eta^{\mu\nu}\, \eta_{\nu\rho}\, U^\rho = \delta^\mu_\rho\, U^\rho = U^\mu\,. \tag{1.48}$$

It is for this reason that we can use the same symbol $U$ for the covariant 4-vector $U_\mu = \eta_{\mu\nu}\, U^\nu$ as we used for the contravariant 4-vector $U^\mu$.

In a similar fashion, we may define the quantities $\Lambda_\mu{}^\nu$ by

$$\Lambda_\mu{}^\nu = \eta_{\mu\rho}\, \eta^{\nu\sigma}\, \Lambda^\rho{}_\sigma\,. \tag{1.49}$$

It is then clear that (1.40) can be restated as

$$\Lambda_\mu{}^\nu\, \Lambda^\mu{}_\rho = \delta^\nu_\rho\,. \tag{1.50}$$

We can also then invert the Lorentz transformation $x'^\mu = \Lambda^\mu{}_\nu\, x^\nu$ to give

$$x^\mu = \Lambda_\nu{}^\mu\, x'^\nu\,. \tag{1.51}$$

It now follows from (1.45) that the components of the covariant 4-vector $U_\mu$ defined by (1.47) transform under Lorentz transformations according to the rule

$$U'_\mu = \Lambda_\mu{}^\nu\, U_\nu\,. \tag{1.52}$$

Any set of 4 quantities $U_\mu$ which transform in this way under Lorentz transformations will be called a covariant 4-vector.

Using (1.51), we can see that the gradient operator $\partial/\partial x^\mu$ transforms as a covariant 4-vector. Using the chain rule for partial differentiation we have

$$\frac{\partial}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\, \frac{\partial}{\partial x^\nu}\,. \tag{1.53}$$

But from (1.51) we have (after a relabelling of indices) that

$$\frac{\partial x^\nu}{\partial x'^\mu} = \Lambda_\mu{}^\nu\,, \tag{1.54}$$

and hence (1.53) gives

$$\frac{\partial}{\partial x'^\mu} = \Lambda_\mu{}^\nu\, \frac{\partial}{\partial x^\nu}\,. \tag{1.55}$$

As can be seen from (1.52), this is precisely the transformation rule for a covariant 4-vector. The gradient operator arises sufficiently often that it is useful to use a special symbol to denote it. We therefore define

$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu}\,. \tag{1.56}$$

Thus the Lorentz transformation rule (1.55) is now written as

$$\partial'_\mu = \Lambda_\mu{}^\nu\, \partial_\nu\,. \tag{1.57}$$
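The index manipulations above can be checked on a concrete boost. The following is a minimal Python sketch, using only the standard library and assuming $c = 1$; the names `ETA`, `boost_matrix`, `preserves_metric`, `lower_raise` and `is_inverse_pair` are illustrative choices of ours. It builds $\Lambda^\mu{}_\nu$ from (1.35) for a boost in an arbitrary direction, and tests the defining property (1.40) and its restatement (1.50).

```python
import math

# eta_{mu nu}; the inverse metric eta^{mu nu} has the same entries.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def boost_matrix(v):
    """Lambda^mu_nu of (1.35) for a 3-velocity v = (v1, v2, v3), 0 < |v| < 1.

    lam[m][n] holds Lambda^m_n (upper index = row, lower index = column)."""
    v2 = sum(c * c for c in v)
    g = 1.0 / math.sqrt(1.0 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = g
    for i in range(3):
        lam[0][i + 1] = -g * v[i]
        lam[i + 1][0] = -g * v[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            lam[i + 1][j + 1] = delta + (g - 1.0) * v[i] * v[j] / v2
    return lam

def preserves_metric(lam, tol=1e-12):
    """Check eta_{mu nu} Lambda^mu_rho Lambda^nu_sigma = eta_{rho sigma}, eq. (1.40)."""
    for r in range(4):
        for s in range(4):
            total = sum(ETA[m][n] * lam[m][r] * lam[n][s]
                        for m in range(4) for n in range(4))
            if abs(total - ETA[r][s]) > tol:
                return False
    return True

def lower_raise(lam):
    """Lambda_mu^nu = eta_{mu rho} eta^{nu sigma} Lambda^rho_sigma, eq. (1.49).

    The result low[m][n] holds Lambda_m^n."""
    return [[sum(ETA[m][r] * ETA[n][s] * lam[r][s]
                 for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]

def is_inverse_pair(lam, tol=1e-12):
    """Check Lambda_mu^nu Lambda^mu_rho = delta^nu_rho, eq. (1.50)."""
    low = lower_raise(lam)
    for n in range(4):
        for r in range(4):
            total = sum(low[m][n] * lam[m][r] for m in range(4))
            if abs(total - (1.0 if n == r else 0.0)) > tol:
                return False
    return True

if __name__ == "__main__":
    lam = boost_matrix((0.3, -0.4, 0.5))
    print(preserves_metric(lam), is_inverse_pair(lam))
```

Storing the upper index as the row and the lower index as the column is only a bookkeeping convention chosen for this sketch; the checks themselves are index-by-index transcriptions of (1.40) and (1.50).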
1.4 Lorentz tensors

Having seen how contravariant and covariant 4-vectors transform under Lorentz transformations (as given in (1.45) and (1.52) respectively), we can now define the transformation rules for more general objects called tensors. These objects carry multiple indices, and each one transforms with a $\Lambda$ factor, of either the (1.45) type if the index is upstairs, or of the (1.52) type if the index is downstairs. Thus, for example, a tensor $T_{\mu\nu}$ transforms under Lorentz transformations according to the rule

$$T'_{\mu\nu} = \Lambda_\mu{}^\rho\, \Lambda_\nu{}^\sigma\, T_{\rho\sigma}\,. \tag{1.58}$$

More generally, a tensor $T^{\mu_1 \cdots \mu_m}{}_{\nu_1 \cdots \nu_n}$ will transform according to the rule

$$T'^{\mu_1 \cdots \mu_m}{}_{\nu_1 \cdots \nu_n} = \Lambda^{\mu_1}{}_{\rho_1} \cdots \Lambda^{\mu_m}{}_{\rho_m}\, \Lambda_{\nu_1}{}^{\sigma_1} \cdots \Lambda_{\nu_n}{}^{\sigma_n}\, T^{\rho_1 \cdots \rho_m}{}_{\sigma_1 \cdots \sigma_n}\,. \tag{1.59}$$

Note that scalars are just special cases of tensors with no indices, while vectors are special cases with just one index.

It is easy to see that products of tensors give rise again to tensors. For example, if $U^\mu$ and $V^\mu$ are two contravariant vectors then $T^{\mu\nu} \equiv U^\mu V^\nu$ is a tensor, since, using the known transformation rules for $U$ and $V$ we have

$$T'^{\mu\nu} = U'^\mu V'^\nu = \Lambda^\mu{}_\rho\, U^\rho\, \Lambda^\nu{}_\sigma\, V^\sigma = \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, T^{\rho\sigma}\,. \tag{1.60}$$

Note that the gradient operator $\partial_\mu$ can also be used to map a tensor into another tensor. For example, if $U_\mu$ is a vector field (i.e. a vector that changes from place to place in spacetime) then $S_{\mu\nu} \equiv \partial_\mu U_\nu$ is a tensor field.

We may also define the operation of Contraction, which reduces a tensor to one with a smaller number of indices. A contraction is performed by setting an upstairs index on a tensor equal to a downstairs index. The Einstein summation convention then automatically comes into play, and the result is that one has an object with one fewer upstairs indices and one fewer downstairs indices. Furthermore, a simple calculation shows that the new object is itself a tensor. Consider, for example, a tensor $T^\mu{}_\nu$. This, of course, transforms as

$$T'^\mu{}_\nu = \Lambda^\mu{}_\rho\, \Lambda_\nu{}^\sigma\, T^\rho{}_\sigma \tag{1.61}$$

under Lorentz transformations. If we form the contraction and define $\phi \equiv T^\mu{}_\mu$, then we see that under Lorentz transformations we shall have

$$\phi' \equiv T'^\mu{}_\mu = \Lambda^\mu{}_\rho\, \Lambda_\mu{}^\sigma\, T^\rho{}_\sigma = \delta^\sigma_\rho\, T^\rho{}_\sigma = \phi\,. \tag{1.62}$$
Since $\phi' = \phi$, it follows, by definition, that $\phi$ is a scalar. An essentially identical calculation shows that for a tensor with arbitrary numbers of upstairs and downstairs indices, if one makes an index contraction of one upstairs with one downstairs index, the result is a tensor with the corresponding reduced numbers of indices. Of course multiple contractions work in the same way.

The Minkowski metric $\eta_{\mu\nu}$ is itself a tensor, but of a rather special type, known as an invariant tensor. This is because, unlike a generic 2-index tensor, the Minkowski metric is identical in all Lorentz frames. This can be seen from (1.40), which can be rewritten as the statement

$$\eta'_{\mu\nu} \equiv \Lambda_\mu{}^\rho\, \Lambda_\nu{}^\sigma\, \eta_{\rho\sigma} = \eta_{\mu\nu}\,. \tag{1.63}$$

The same is also true for the inverse metric $\eta^{\mu\nu}$.

We already saw that the gradient operator $\partial_\mu \equiv \partial/\partial x^\mu$ transforms as a covariant vector. If we define, in the standard way, $\partial^\mu \equiv \eta^{\mu\nu}\, \partial_\nu$, then it is evident from what we have seen above that the operator

$$\Box \equiv \partial^\mu \partial_\mu = \eta^{\mu\nu}\, \partial_\mu \partial_\nu \tag{1.64}$$

transforms as a scalar under Lorentz transformations. This is a very important operator, which is otherwise known as the wave operator, or d'Alembertian:

$$\Box = -\partial_0 \partial_0 + \partial_i \partial_i = -\frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\,. \tag{1.65}$$

It is worth commenting further at this stage about a remark that was made earlier. Notice that in (1.65) we have been cavalier about the location of the Latin indices, which of course range only over the three spatial directions $i = 1$, 2 and 3. We can get away with this because the metric that is used to raise or lower the Latin indices is just the Minkowski metric restricted to the index values 1, 2 and 3. But since we have

$$\eta_{00} = -1\,, \qquad \eta_{ij} = \delta_{ij}\,, \qquad \eta_{0i} = \eta_{i0} = 0\,, \tag{1.66}$$

this means that Latin indices are lowered and raised using the Kronecker delta $\delta_{ij}$ and its inverse $\delta^{ij}$. But these are just the components of the unit matrix, and so raising or lowering Latin indices has no effect. It is because of the minus sign associated with the $\eta_{00}$ component of the Minkowski metric that we have to pay careful attention to the process of raising and lowering Greek indices. Thus, we can get away with writing $\partial_i \partial_i$, but we cannot write $\partial_\mu \partial_\mu$.
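The point about Greek versus Latin indices can be made concrete with a few lines of code. The following is a minimal Python sketch; the names `ETA_DIAG`, `lower` and `contract` are ours, chosen for illustration. Lowering an index with $\eta_{\mu\nu}$ flips the sign of the time component and leaves the spatial components alone, which is why $A^\mu B_\mu$ differs from the naive sum of products of upstairs components.

```python
ETA_DIAG = (-1, 1, 1, 1)   # diagonal entries of eta_{mu nu}, as in (1.66)

def lower(u_up):
    """U_mu = eta_{mu nu} U^nu: only the time component changes sign."""
    return tuple(e * c for e, c in zip(ETA_DIAG, u_up))

def contract(a_up, b_up):
    """The Lorentz scalar A^mu B_mu = eta_{mu nu} A^mu B^nu."""
    return sum(a * b for a, b in zip(a_up, lower(b_up)))

if __name__ == "__main__":
    a = (2, 1, 0, 3)                       # components A^mu
    print(lower(a))                        # spatial entries are unchanged
    print(contract(a, a))                  # -t^2 + x^2 + y^2 + z^2
    print(sum(c * c for c in a))           # naive sum, not a Lorentz scalar
```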
1.5 Proper time and 4-velocity

We defined the Lorentz-invariant interval $ds$ between infinitesimally-separated spacetime events by

$$ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu = -dt^2 + dx^2 + dy^2 + dz^2\,. \tag{1.67}$$

This is the Minkowskian generalisation of the spatial interval in Euclidean space. Note that $ds^2$ can be positive, negative or zero. These cases correspond to what are called spacelike, timelike or null separations, respectively.

On occasion, it is useful to define the negative of $ds^2$, and write

$$d\tau^2 = -ds^2 = -\eta_{\mu\nu}\, dx^\mu dx^\nu = dt^2 - dx^2 - dy^2 - dz^2\,. \tag{1.68}$$

This is called the Proper Time interval, and $\tau$ is the proper time. Since $ds$ is a Lorentz scalar, it is obvious that $d\tau$ is a scalar too.

We know that $dx^\mu$ transforms as a contravariant 4-vector. Since $d\tau$ is a scalar, it follows that

$$U^\mu \equiv \frac{dx^\mu}{d\tau} \tag{1.69}$$

is a contravariant 4-vector also. If we think of a particle following a path, or worldline, in spacetime parameterised by the proper time $\tau$, i.e. it follows the path $x^\mu = x^\mu(\tau)$, then $U^\mu$ defined in (1.69) is called the 4-velocity of the particle.

It is useful to see how the 4-velocity is related to the usual notion of 3-velocity of a particle. By definition, the 3-velocity $\mathbf{u}$ is a 3-vector with components $u^i$ given by

$$u^i = \frac{dx^i}{dt}\,. \tag{1.70}$$

From (1.68), it follows that

$$d\tau^2 = dt^2 \left[1 - (dx/dt)^2 - (dy/dt)^2 - (dz/dt)^2\right] = dt^2\, (1 - u^2)\,, \tag{1.71}$$

where $u = |\mathbf{u}|$, or in other words, $u = \sqrt{u^i u^i}$. In view of the definition of the $\gamma$ factor in (1.25), it is natural to define

$$\gamma \equiv \frac{1}{\sqrt{1 - u^2}}\,. \tag{1.72}$$

Thus we have $d\tau = dt/\gamma$, and so from (1.69) the 4-velocity can be written as

$$U^\mu = \frac{dt}{d\tau}\, \frac{dx^\mu}{dt} = \gamma\, \frac{dx^\mu}{dt}\,. \tag{1.73}$$

Since $dx^0/dt = 1$ and $dx^i/dt = u^i$, we therefore have that

$$U^0 = \gamma\,, \qquad U^i = \gamma\, u^i\,. \tag{1.74}$$
Note that $U^\mu U_\mu = -1$, since, from (1.68), we have

$$U^\mu U_\mu = \eta_{\mu\nu}\, U^\mu U^\nu = \frac{\eta_{\mu\nu}\, dx^\mu dx^\nu}{(d\tau)^2} = \frac{-(d\tau)^2}{(d\tau)^2} = -1\,. \tag{1.75}$$

We shall sometimes find it convenient to rewrite (1.74) as

$$U^\mu = (\gamma, \gamma\, u^i) \qquad \text{or} \qquad U^\mu = (\gamma, \gamma\, \mathbf{u})\,. \tag{1.76}$$

Having set up the 4-vector formalism, it is now completely straightforward to write down how velocities transform under Lorentz transformations. We know that the 4-velocity $U^\mu$ will transform according to (1.45), and this is identical to the way that the coordinates $x^\mu$ transform:

$$U'^\mu = \Lambda^\mu{}_\nu\, U^\nu\,, \qquad x'^\mu = \Lambda^\mu{}_\nu\, x^\nu\,. \tag{1.77}$$

Therefore, if we want to know how the 3-velocity transforms, we need only write down the Lorentz transformations for $(t, x, y, z)$, and then replace $(t, x, y, z)$ by $(U^0, U^1, U^2, U^3)$. Finally, using (1.76) to express $(U^0, U^1, U^2, U^3)$ in terms of $\mathbf{u}$ will give the result.

Consider, for simplicity, the case where $S'$ is moving along the $x$ axis with velocity $v$. The Lorentz transformation for $U^\mu$ can therefore be read off from (1.24) and (1.25):

$$U'^0 = \gamma_v\, (U^0 - v\, U^1)\,, \qquad U'^1 = \gamma_v\, (U^1 - v\, U^0)\,, \qquad U'^2 = U^2\,, \qquad U'^3 = U^3\,, \tag{1.78}$$

where we are now using $\gamma_v \equiv (1 - v^2)^{-1/2}$ to denote the gamma factor of the Lorentz transformation, to distinguish it from the $\gamma$ constructed from the 3-velocity $\mathbf{u}$ of the particle in the frame $S$, which is defined in (1.72). Thus from (1.76) we have

$$\gamma' = \gamma\, \gamma_v\, (1 - v\, u_x)\,, \qquad \gamma'\, u'_x = \gamma\, \gamma_v\, (u_x - v)\,, \qquad \gamma'\, u'_y = \gamma\, u_y\,, \qquad \gamma'\, u'_z = \gamma\, u_z\,, \tag{1.79}$$

where, of course, $\gamma' = (1 - u'^2)^{-1/2}$ is the analogue of $\gamma$ in the frame $S'$. Thus we find

$$u'_x = \frac{u_x - v}{1 - v\, u_x}\,, \qquad u'_y = \frac{u_y}{\gamma_v\, (1 - v\, u_x)}\,, \qquad u'_z = \frac{u_z}{\gamma_v\, (1 - v\, u_x)}\,. \tag{1.80}$$
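The route from (1.76) through (1.78) to (1.80) can be followed numerically. The following is a minimal Python sketch, assuming $c = 1$ and a boost along $x$; all function names are ours, introduced for illustration. It boosts the 4-velocity, reads off the 3-velocity as $U'^i / U'^0$, and compares the result with the closed form (1.80).

```python
import math

def four_velocity(u):
    """U^mu = (gamma, gamma u) of (1.76) for a 3-velocity u with |u| < 1."""
    g = 1.0 / math.sqrt(1.0 - sum(c * c for c in u))
    return (g, g * u[0], g * u[1], g * u[2])

def boost_four_velocity(U, v):
    """Eq. (1.78): boost along the x axis with velocity v."""
    gv = 1.0 / math.sqrt(1.0 - v * v)
    return (gv * (U[0] - v * U[1]), gv * (U[1] - v * U[0]), U[2], U[3])

def transformed_velocity(u, v):
    """3-velocity in S', obtained as U'^i / U'^0 (cf. (1.79))."""
    Up = boost_four_velocity(four_velocity(u), v)
    return tuple(Up[i] / Up[0] for i in (1, 2, 3))

def velocity_formula(u, v):
    """The closed form (1.80)."""
    gv = 1.0 / math.sqrt(1.0 - v * v)
    d = 1.0 - v * u[0]
    return ((u[0] - v) / d, u[1] / (gv * d), u[2] / (gv * d))

def minkowski_norm(U):
    """U^mu U_mu with eta = diag(-1, 1, 1, 1)."""
    return -U[0] ** 2 + U[1] ** 2 + U[2] ** 2 + U[3] ** 2

if __name__ == "__main__":
    u, v = (0.5, 0.3, -0.2), 0.6
    print(transformed_velocity(u, v))
    print(velocity_formula(u, v))            # should match the line above
    print(minkowski_norm(four_velocity(u)))  # should be -1 up to rounding
```

Working with $U^\mu$ and dividing at the end avoids having to remember (1.80) at all, which is the practical content of the remark that velocities transform "completely straightforwardly" once the 4-vector formalism is set up.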
2 Electrodynamics and Maxwell's Equations

2.1 Natural units

We saw earlier that the supposition of the universal validity of Maxwell's equations in all inertial frames, which in particular would imply that the speed of light should be the same in all frames, is consistent with experiment. It is therefore reasonable to expect that Maxwell's equations should be compatible with special relativity. However, written in their standard form (1.7), this compatibility is by no means apparent. Our next task will be to re-express the Maxwell equations, in terms of 4-tensors, in a way that makes their Lorentz covariance manifest.

We shall begin by changing units from the S.I. system in which the Maxwell equations are given in (1.7). The first step is to change to Gaussian units, by performing the rescalings

$$\mathbf{E} \longrightarrow \frac{1}{\sqrt{4\pi \epsilon_0}}\, \mathbf{E}\,, \qquad \mathbf{B} \longrightarrow \sqrt{\frac{\mu_0}{4\pi}}\, \mathbf{B}\,, \qquad \rho \longrightarrow \sqrt{4\pi \epsilon_0}\, \rho\,, \qquad \mathbf{J} \longrightarrow \sqrt{4\pi \epsilon_0}\, \mathbf{J}\,. \tag{2.1}$$

Bearing in mind that the speed of light is given by $c = 1/\sqrt{\mu_0 \epsilon_0}$, we see that the Maxwell equations (1.7) become

$$\nabla \cdot \mathbf{E} = 4\pi \rho\,, \qquad \nabla \times \mathbf{B} - \frac{1}{c}\, \frac{\partial \mathbf{E}}{\partial t} = \frac{4\pi}{c}\, \mathbf{J}\,, \qquad \nabla \cdot \mathbf{B} = 0\,, \qquad \nabla \times \mathbf{E} + \frac{1}{c}\, \frac{\partial \mathbf{B}}{\partial t} = 0\,. \tag{2.2}$$

Finally, we pass from Gaussian units to Natural units, by choosing our units of length and time so that $c = 1$, as we did in our discussion of special relativity. Thus, in natural units, the Maxwell equations become

$$\nabla \cdot \mathbf{E} = 4\pi \rho\,, \qquad \nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = 4\pi\, \mathbf{J}\,, \tag{2.3}$$

$$\nabla \cdot \mathbf{B} = 0\,, \qquad \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0\,. \tag{2.4}$$

The equations (2.3), which have sources on the right-hand side, are called the Field Equations. The equations (2.4) are called Bianchi Identities. We shall elaborate on this a little later.

2.2 Gauge potentials and gauge invariance

We already remarked that the two Maxwell equations (2.4) are known as Bianchi identities. They are not field equations, since there are no sources; rather, they impose constraints on
the electric and magnetic fields. The first equation in (2.4), i.e. $\nabla \cdot \mathbf{B} = 0$, can be solved by writing

$$\mathbf{B} = \nabla \times \mathbf{A}\,, \tag{2.5}$$

where $\mathbf{A}$ is the magnetic 3-vector potential. Note that (2.5) identically solves $\nabla \cdot \mathbf{B} = 0$, because of the vector identity that div curl $\equiv 0$. Substituting (2.5) into the second equation in (2.4), we obtain

$$\nabla \times \Big(\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t}\Big) = 0\,. \tag{2.6}$$

This can be solved, again identically, by writing

$$\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t} = -\nabla \phi\,, \tag{2.7}$$

where $\phi$ is the electric scalar potential. Thus we can solve the Bianchi identities (2.4) by writing $\mathbf{E}$ and $\mathbf{B}$ in terms of scalar and 3-vector potentials $\phi$ and $\mathbf{A}$:

$$\mathbf{E} = -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t}\,, \qquad \mathbf{B} = \nabla \times \mathbf{A}\,. \tag{2.8}$$

Although we have now "disposed of" the two Maxwell equations in (2.4), it has been achieved at a price, in that there is a redundancy in the choice of gauge potentials $\phi$ and $\mathbf{A}$. First, we may note that $\mathbf{B}$ in (2.8) is unchanged if we make the replacement

$$\mathbf{A} \longrightarrow \mathbf{A} + \nabla \lambda\,, \tag{2.9}$$

where $\lambda$ is an arbitrary function of position and time. The expression for $\mathbf{E}$ will also be invariant, if we simultaneously make the replacement

$$\phi \longrightarrow \phi - \frac{\partial \lambda}{\partial t}\,. \tag{2.10}$$

To summarise, if a given set of electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$ are described by a scalar potential $\phi$ and 3-vector potential $\mathbf{A}$ according to (2.8), then the identical physical situation (i.e. identical electric and magnetic fields) is equally well described by a new pair of scalar and 3-vector potentials, related to the original pair by the Gauge Transformations given in (2.9) and (2.10), where $\lambda$ is an arbitrary function of position and time.

We can in fact use the gauge invariance to our advantage, by making a convenient and simplifying gauge choice for the scalar and 3-vector potentials. We have one arbitrary function (i.e. $\lambda(t, \mathbf{r})$) at our disposal, and so this allows us to impose one functional relation on the potentials $\phi$ and $\mathbf{A}$. For our present purposes, the most useful gauge choice is to use this freedom to impose the Lorentz gauge condition,

$$\nabla \cdot \mathbf{A} + \frac{\partial \phi}{\partial t} = 0\,. \tag{2.11}$$
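As a short worked check of the gauge invariance just described, one can substitute (2.9) and (2.10) directly into (2.8). The primed symbols $\mathbf{E}'$, $\mathbf{B}'$, $\mathbf{A}'$ and $\phi'$ below denote the gauge-transformed quantities (our notation for this check only, not a change of Lorentz frame). The last line indicates, as a sketch, why one arbitrary function $\lambda$ suffices to reach the gauge (2.11): it only requires solving a wave equation for $\lambda$.

```latex
\begin{aligned}
\mathbf{B}' &= \nabla\times(\mathbf{A} + \nabla\lambda)
             = \nabla\times\mathbf{A} + \nabla\times\nabla\lambda
             = \mathbf{B}\,, \\[4pt]
\mathbf{E}' &= -\nabla\Big(\phi - \frac{\partial\lambda}{\partial t}\Big)
               - \frac{\partial}{\partial t}\big(\mathbf{A} + \nabla\lambda\big)
             = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}
               + \nabla\frac{\partial\lambda}{\partial t}
               - \frac{\partial}{\partial t}\nabla\lambda
             = \mathbf{E}\,, \\[4pt]
\nabla\cdot\mathbf{A}' + \frac{\partial\phi'}{\partial t}
            &= \nabla\cdot\mathbf{A} + \frac{\partial\phi}{\partial t}
               + \nabla^2\lambda - \frac{\partial^2\lambda}{\partial t^2}\,.
\end{aligned}
```

The first line uses curl grad $\equiv 0$, and the second uses the fact that partial derivatives commute. From the third line, if the original potentials do not satisfy (2.11), one chooses $\lambda$ so that $\nabla^2 \lambda - \partial^2 \lambda / \partial t^2$ cancels the non-vanishing left-over term.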
Substituting (2.8) into the remaining Maxwell equations (i.e. (2.3)), and using the Lorentz gauge condition (2.11), we therefore find

$$\nabla^2 \phi - \frac{\partial^2 \phi}{\partial t^2} = -4\pi \rho\,, \qquad \nabla^2 \mathbf{A} - \frac{\partial^2 \mathbf{A}}{\partial t^2} = -4\pi\, \mathbf{J}\,. \tag{2.12}$$

The important thing, which we shall make use of shortly, is that in each case we have on the left-hand side the d'Alembertian operator $\Box = \partial^\mu \partial_\mu$, which we discussed earlier.

2.3 Maxwell's equations in 4-tensor notation

The next step is to write the Maxwell equations in terms of four-dimensional quantities. Since the 3-vectors describing the electric and magnetic fields have three components each, there is clearly no way in which they can be "assembled" into 4-vectors. However, we may note that in four dimensions a two-index antisymmetric tensor has $(4 \times 3)/2 = 6$ independent components. Since this is equal to $3 + 3$, it suggests that perhaps we should be grouping the electric and magnetic fields together into a single 2-index antisymmetric tensor. This is in fact exactly what is needed. Thus we introduce a tensor $F_{\mu\nu}$, satisfying

$$F_{\mu\nu} = -F_{\nu\mu}\,. \tag{2.13}$$

It turns out that we should define its components in terms of $\mathbf{E}$ and $\mathbf{B}$ as follows:

$$F_{0i} = -E_i\,, \qquad F_{i0} = E_i\,, \qquad F_{ij} = \epsilon_{ijk}\, B_k\,. \tag{2.14}$$

Here $\epsilon_{ijk}$ is the usual totally-antisymmetric tensor of 3-dimensional vector calculus. It is equal to $+1$ if $(ijk)$ is an even permutation of $(123)$, to $-1$ if it is an odd permutation, and to zero if it is no permutation (i.e. if two or more of the indices $(ijk)$ are equal). In other words, we have

$$F_{23} = B_1\,, \quad F_{31} = B_2\,, \quad F_{12} = B_3\,, \qquad F_{32} = -B_1\,, \quad F_{13} = -B_2\,, \quad F_{21} = -B_3\,. \tag{2.15}$$

Viewing $F_{\mu\nu}$ as a matrix with rows labelled by $\mu$ and columns labelled by $\nu$, we shall have

$$F_{\mu\nu} = \begin{pmatrix} 0 & -E_1 & -E_2 & -E_3 \\ E_1 & 0 & B_3 & -B_2 \\ E_2 & -B_3 & 0 & B_1 \\ E_3 & B_2 & -B_1 & 0 \end{pmatrix}. \tag{2.16}$$
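The definitions (2.14) to (2.16) translate directly into code. The following is a minimal Python sketch; the function names `levi_civita`, `field_strength` and `raise_both` are ours, chosen for illustration. It assembles $F_{\mu\nu}$ from given components of $\mathbf{E}$ and $\mathbf{B}$, and raises both indices with $\eta^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$, which flips the sign of the mixed time-space components only.

```python
def levi_civita(i, j, k):
    """epsilon_{ijk} for i, j, k in {1, 2, 3}: +1, -1 or 0."""
    return (i - j) * (j - k) * (k - i) // 2

def field_strength(E, B):
    """F_{mu nu} from (2.14): F_{0i} = -E_i, F_{i0} = E_i, F_{ij} = eps_{ijk} B_k.

    E and B are 3-component sequences; the result is a 4x4 nested list
    with rows labelled by mu and columns by nu, as in (2.16)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in (1, 2, 3):
        F[0][i] = -E[i - 1]
        F[i][0] = E[i - 1]
        for j in (1, 2, 3):
            F[i][j] = sum(levi_civita(i, j, k) * B[k - 1] for k in (1, 2, 3))
    return F

def raise_both(F):
    """F^{mu nu} = eta^{mu rho} eta^{nu sigma} F_{rho sigma} for diagonal eta."""
    eta = (-1, 1, 1, 1)
    return [[eta[m] * eta[n] * F[m][n] for n in range(4)] for m in range(4)]

if __name__ == "__main__":
    F = field_strength((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
    for row in F:
        print(row)
    # With both indices raised, F^{0i} = +E_i while F^{ij} = F_{ij}.
    print(raise_both(F)[0])
```

The sign flip $F^{0i} = -F_{0i} = E_i$ is what is used later when the field equation is decomposed into its $\nu = 0$ and $\nu = j$ parts.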
18) (2.19) are equivalent to the Maxwell equations (2. except that they will now have primes on all the quantities. whose spatial components J i are just the usual 3vector current components. Therefore. the original four Maxwell equations (2. (2. 22 (2.18) and (2.17)) to −∂i Ei = −4πρ . i.4). we shall proceed to see how the Maxwell equations look when expressed in terms of Fµν and J µ . we are only entitled to call them such if we have veriﬁed that they transform in the proper way under Lorentz transformations. and (2.19) Two very nice things have happened. then we know that in the frame S .3) and (2. For now.18) and (2. = 0. they transform tensorially under Lorentz transformations. we shall have ∂0 F 0j + ∂i F ij = −4πJ j .3) and (2. related to S by the Lorentz transformation (1. and we shall justify this a little later. namely ν = 0 or ν = j. since it has the free index ν. and whose time component J 0 is equal to the charge density ρ: J0 = ρ . the equations will look identical.19) in the unprimed frame S. Although we have deﬁned objects F µν and J µ that have the appearance of a 4tensor and a 4vector. In fact they do. Ji = Ji . (2.20) .18) is the ﬁeld equation. · E = 4πρ . For ν = j. First of all. which therefore corresponds (see (2. The answer is that they become ∂µ F µν ∂µ Fνρ + ∂ν Fρµ + ∂ρ Fµν = −4πJ ν . This is easy we just deﬁne a 4vector J µ . Secondly.e. Consider ﬁrst (2. (2.21) (2. to reduce it down to threedimensional equations.4) have become just two fourdimensional equations.34).18). We should ﬁrst verify that indeed (2. This means that they keep exactly the same form in all Lorentz frames.e. we have two cases to consider.19) is the Bianchi identity.22) i. This equation is vectorvalued. the equations are manifestly Lorentz covariant. If we start with (2.14) and (2. (2. For ν = 0 we have ∂i F i0 = −4πJ 0 .We also need to combine the charge density ρ and the 3vector current density J into a fourdimensional quantity.17) A word of caution is in order here.
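which gives

$$\partial_0 E_j + \epsilon_{ijk}\, \partial_i B_k = -4\pi\, J^j\,. \tag{2.23}$$

This is just (see footnote 4)

$$-\frac{\partial \mathbf{E}}{\partial t} + \nabla \times \mathbf{B} = 4\pi\, \mathbf{J}\,. \tag{2.24}$$

Thus (2.18) is equivalent to the two Maxwell field equations in (2.3).

Turning now to (2.19), it follows from the antisymmetry (2.13) of $F_{\mu\nu}$ that the left-hand side is totally antisymmetric in $(\mu\nu\rho)$ (i.e. it changes sign under any exchange of a pair of indices). Therefore there are two distinct assignments of indices, after we make the $1 + 3$ decomposition $\mu = (0, i)$ etc. Either one of the indices is a 0 with the other two Latin, or else all three are Latin. Consider first $(\mu, \nu, \rho) = (0, i, j)$:

$$\partial_0 F_{ij} + \partial_i F_{j0} + \partial_j F_{0i} = 0\,, \tag{2.25}$$

which, from (2.14), means

$$\epsilon_{ijk}\, \frac{\partial B_k}{\partial t} + \partial_i E_j - \partial_j E_i = 0\,. \tag{2.26}$$

Since this is antisymmetric in $ij$ there is no loss of generality involved in contracting with $\epsilon_{ij\ell}$, which gives (see footnote 5)

$$2\, \frac{\partial B_\ell}{\partial t} + 2\, \epsilon_{ij\ell}\, \partial_i E_j = 0\,. \tag{2.27}$$

This is just the statement that

$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0\,, \tag{2.28}$$

which is the second of the Maxwell equations in (2.4).

The other distinct possibility for assigning decomposed indices in (2.19) is to take $(\mu, \nu, \rho) = (i, j, k)$, giving

$$\partial_i F_{jk} + \partial_j F_{ki} + \partial_k F_{ij} = 0\,. \tag{2.29}$$

Since this is totally antisymmetric in $(i, j, k)$, no generality is lost by contracting it with $\epsilon_{ijk}$, giving

$$3\, \epsilon_{ijk}\, \partial_i F_{jk} = 0\,. \tag{2.30}$$

From (2.14), this implies

$$3\, \epsilon_{ijk}\, \epsilon_{jk\ell}\, \partial_i B_\ell = 0\,, \qquad \text{and hence} \qquad 6\, \partial_i B_i = 0\,. \tag{2.31}$$

Footnote 4: Recall that the $i$'th component of $\nabla \times \mathbf{V}$ is given by $(\nabla \times \mathbf{V})_i = \epsilon_{ijk}\, \partial_j V_k$ for any 3-vector $\mathbf{V}$.

Footnote 5: Recall that $\epsilon_{ijm}\, \epsilon_{k\ell m} = \delta_{ik}\, \delta_{j\ell} - \delta_{i\ell}\, \delta_{jk}$, and hence $\epsilon_{ijm}\, \epsilon_{kjm} = 2\, \delta_{ik}$. These identities are easily proven by considering the possible assignments of indices and explicitly verifying that the two sides of the identities agree.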
This has just reproduced the first Maxwell equation in (2.4), i.e. $\nabla \cdot \mathbf{B} = 0$.
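The $\epsilon$ identities recalled in footnote 5 are said there to follow from checking every index assignment, and that exhaustive check is small enough to automate. The following is a minimal Python sketch; the names `eps`, `delta` and `check_epsilon_identities` are ours, chosen for illustration. It loops over all index values and compares the two sides.

```python
from itertools import product

def eps(i, j, k):
    """epsilon_{ijk} for i, j, k in {1, 2, 3}: +1, -1 or 0."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def check_epsilon_identities():
    """Verify eps_{ijm} eps_{klm} = d_ik d_jl - d_il d_jk and
    eps_{ijm} eps_{kjm} = 2 d_ik for every assignment of the free indices."""
    idx = (1, 2, 3)
    for i, j, k, l in product(idx, repeat=4):
        lhs = sum(eps(i, j, m) * eps(k, l, m) for m in idx)
        rhs = delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k)
        if lhs != rhs:
            return False
    for i, k in product(idx, repeat=2):
        lhs = sum(eps(i, j, m) * eps(k, j, m) for j in idx for m in idx)
        if lhs != 2 * delta(i, k):
            return False
    return True

if __name__ == "__main__":
    print(check_epsilon_identities())
```

The doubly-contracted identity is the one that produces the factors of 2 in (2.27) and, combined with the overall 3 from (2.30), the factor of 6 in (2.31).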
We have now demonstrated that the equations (2.18) and (2.19) are equivalent to the four Maxwell equations (2.3) and (2.4). Since (2.18) and (2.19) are written in a four-dimensional notation, it is highly suggestive that they are indeed Lorentz covariant. However, we should be a little more careful, in order to be sure about this point. Not every set of objects $V^\mu$ can be viewed as a Lorentz 4-vector, after all. The test is whether they transform properly, as in (1.45), under Lorentz transformations.

We may begin by considering the quantities $J^\mu = (\rho, J^i)$. Note first that by applying $\partial_\nu$ to the Maxwell field equation (2.18), we get identically zero on the left-hand side, since partial derivatives commute and $F^{\mu\nu}$ is antisymmetric. Thus from the right-hand side we get

$$\partial_\mu J^\mu = 0\,. \tag{2.32}$$
This is the equation of charge conservation. Decomposed into the $3 + 1$ language, it takes the familiar form

$$\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J} = 0\,. \tag{2.33}$$
By integrating over a closed 3-volume $V$ and using the divergence theorem on the second term, we learn that the rate of change of charge inside $V$ is balanced by the flow of charge through its boundary $S$:

$$\frac{\partial}{\partial t} \int_V \rho\, dV = -\int_S \mathbf{J} \cdot d\mathbf{S}\,. \tag{2.34}$$

Now we are in a position to show that $J^\mu = (\rho, \mathbf{J})$ is indeed a 4-vector. Considering $J^0 = \rho$ first, we may note that

$$dQ \equiv \rho\, dx\, dy\, dz \tag{2.35}$$
is clearly Lorentz invariant, since it is an electric charge. Clearly, for example, all Lorentz observers will agree on the number of electrons in a given closed spatial region, and so they will agree on the amount of charge. Another quantity that is Lorentz invariant is dv = dtdxdydz , (2.36)
the volume of an infinitesimal region in spacetime. This can be seen from the fact that the Jacobian $J$ of the transformation from $dv$ to $dv' = dt'\, dx'\, dy'\, dz'$ is given by

$$J = \det\Big(\frac{\partial x'^\mu}{\partial x^\nu}\Big) = \det(\Lambda^\mu{}_\nu)\,. \tag{2.37}$$

Now the defining property (1.40) of the Lorentz transformation can be written in a matrix notation as

$$\Lambda^T\, \eta\, \Lambda = \eta\,, \tag{2.38}$$

and hence taking the determinant, we get $(\det \Lambda)^2 = 1$ and hence

$$\det \Lambda = \pm 1\,. \tag{2.39}$$
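As a worked illustration of (2.38) and (2.39), consider the special case of the boost (1.24) along the $x$ axis. Writing it as a matrix acting on $(t, x, y, z)$ (the explicit matrix below is our transcription of (1.24), with $c = 1$), the determinant can be evaluated directly:

```latex
\Lambda =
\begin{pmatrix}
\gamma & -\gamma v & 0 & 0 \\
-\gamma v & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix},
\qquad
\det\Lambda = \gamma^2 - \gamma^2 v^2 = \gamma^2 (1 - v^2) = 1 ,
\qquad
\det(\Lambda^T \eta\, \Lambda) = (\det\Lambda)^2 \det\eta = \det\eta .
```

So for this boost the positive sign in (2.39) is realised for every $|v| < 1$, consistent with the boost being continuously connected to the identity at $v = 0$.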
Assuming that we restrict attention to Lorentz transformations without reflections, then they will be connected to the identity (we can take the boost velocity $\mathbf{v}$ to zero and/or the rotation angle to zero and continuously approach the identity transformation), and so $\det \Lambda = 1$. Thus it follows from (2.37) that for Lorentz transformations without reflections, the 4-volume element $dt\, dx\, dy\, dz$ is Lorentz invariant.

Comparing $dQ = \rho\, dx\, dy\, dz$ and $dv = dt\, dx\, dy\, dz$, both of which we have argued are Lorentz invariant, we can conclude that $\rho$ must transform in the same way as $dt$ under Lorentz transformations. In other words, $\rho$ must transform like the 0 component of a 4-vector. Thus writing, as we did, that $J^0 = \rho$, is justified.

In the same way, we may consider the spatial components $J^i$ of the putative 4-vector $J^\mu$. Considering $J^1$, for example, we know that $J^1\, dy\, dz$ is the current flowing through the area element $dy\, dz$. Therefore in time $dt$, there will have been a flow of charge $J^1\, dt\, dy\, dz$. Being a charge, this must be Lorentz invariant, and so it follows from the known Lorentz invariance of $dv = dt\, dx\, dy\, dz$ that $J^1$ must transform the same way as $dx$ under Lorentz transformations. Thus $J^1$ does indeed transform like the 1 component of a 4-vector. Similar arguments apply to $J^2$ and $J^3$. (It is important in this argument that, because of the charge-conservation equation (2.32) or (2.34), the flow of charges we are discussing when considering the $J^i$ components are the same charges we discussed when considering the $J^0$ component.)

We have now established that $J^\mu = (\rho, J^i)$ is indeed a Lorentz 4-vector, where $\rho$ is the charge density and $J^i$ the 3-vector current density.

At this point, we recall that by choosing the Lorentz gauge (2.11), we were able to reduce the Maxwell field equations (2.3) to (2.12). Furthermore, we can write these equations together as

$$\Box A^\mu = -4\pi\, J^\mu\,, \tag{2.40}$$

where

$$A^\mu = (\phi, \mathbf{A})\,, \tag{2.41}$$

and where the d'Alembertian, or wave operator, $\Box = \partial^\mu \partial_\mu = \partial_i \partial_i - \partial_0^2$ was introduced in (1.65).
We saw that it is manifestly a Lorentz scalar, since it is built from the contraction of indices on the two Lorentzvector gradient operators. Since we have already established that J µ is a 4vector, it therefore follows that A µ is a 4vector. Note, en passant, that the Lorentz 25
gauge condition (2.11) that we imposed earlier translates, in the fourdimensional language, into ∂µ Aµ = 0 , which is nicely Lorentz invariant (hence the name “Lorentz gauge condition”). The ﬁnal step is to note that our deﬁnition (2.14) is precisely consistent with (2.41) and (2.8), if we write Fµν = ∂µ Aν − ∂ν Aµ . we shall have Aµ = (−φ, A) . Therefore we ﬁnd F0i = ∂0 Ai − ∂i A0 = Fij ∂Ai + ∂i φ = −Ei , ∂t = ∂i Aj − ∂j Ai = ijk ( × A)k = ijk Bk . (2.44) (2.43) (2.42)
First, we note from (2.41) that because of the η 00 = −1 needed when lowering the 0 index,
(2.45)
In summary, we have shown that $J^\mu$ is a 4-vector, and hence, using (2.40), that $A^\mu$ is a 4-vector. Then, it is manifest from (2.43) that $F_{\mu\nu}$ is a 4-tensor. Hence, we have established that the Maxwell equations, written in the form (2.18) and (2.19), are indeed expressed in terms of 4-tensors and 4-vectors, and so the manifest Lorentz covariance of the Maxwell equations is established.

Finally, it is worth remarking that in the 4-tensor description, the way in which the gauge invariance arises is very straightforward. First, it is manifest that the Bianchi identity (2.19) is solved identically by writing
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu , \tag{2.46}$$
for some 4-vector $A_\mu$. This is because (2.19) is totally antisymmetric in $\mu\nu\rho$, and so, when (2.46) is substituted into it, one gets identically zero since partial derivatives commute. (Try making the substitution and verify this explicitly. The vanishing because of the commutativity of partial derivatives is essentially the same as the reason why curl grad $\equiv 0$ and div curl $\equiv 0$.) It is also clear from (2.46) that $F_{\mu\nu}$ will be unchanged if we make the replacement
$$A_\mu \longrightarrow A_\mu + \partial_\mu \lambda , \tag{2.47}$$
where $\lambda$ is an arbitrary function of position and time. Again, the reason is that partial derivatives commute. Comparing (2.47) with (2.44), we see that (2.47) implies
$$\phi \longrightarrow \phi - \frac{\partial \lambda}{\partial t} , \qquad A_i \longrightarrow A_i + \partial_i \lambda , \tag{2.48}$$
and so we have reproduced the gauge transformations (2.9) and (2.10).

It should have become clear by now that all the familiar features of the Maxwell equations are equivalently described in the spacetime formulation in terms of 4-vectors and 4-tensors. The only difference is that everything is described much more simply and elegantly in the four-dimensional language.

2.4 Lorentz transformation of E and B

Although for many purposes the four-dimensional description of the Maxwell equations is the most convenient, it is sometimes useful to revert to the original description in terms of $\mathbf{E}$ and $\mathbf{B}$. For example, we may easily derive the Lorentz transformation properties of $\mathbf{E}$ and $\mathbf{B}$, making use of the four-dimensional formulation. In terms of $F_{\mu\nu}$, there is no work needed to write down its behaviour under Lorentz transformations. Raising the indices for convenience, we shall have
$$F'^{\mu\nu} = \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, F^{\rho\sigma} . \tag{2.49}$$
From this, and the fact (see (2.14)) that $F^{0i} = E_i$, $F^{ij} = \epsilon_{ijk} B_k$, we can then immediately read off the Lorentz transformations for $\mathbf{E}$ and $\mathbf{B}$.

From the expressions (1.35) for the most general Lorentz boost transformation, we may first calculate $\mathbf{E}'$, from
$$\begin{aligned}
E'_i = F'^{0i} &= \Lambda^0{}_\rho\, \Lambda^i{}_\sigma\, F^{\rho\sigma} \\
&= \Lambda^0{}_0\, \Lambda^i{}_k\, F^{0k} + \Lambda^0{}_k\, \Lambda^i{}_0\, F^{k0} + \Lambda^0{}_k\, \Lambda^i{}_\ell\, F^{k\ell} \\
&= \gamma \Big(\delta_{ik} + \frac{\gamma - 1}{v^2}\, v_i v_k\Big) E_k - \gamma^2 v_i v_k E_k - \gamma\, v_k \Big(\delta_{i\ell} + \frac{\gamma - 1}{v^2}\, v_i v_\ell\Big) \epsilon_{k\ell m} B_m \\
&= \gamma E_i + \gamma\, \epsilon_{ijk} v_j B_k - \frac{\gamma - 1}{v^2}\, v_i v_k E_k .
\end{aligned} \tag{2.50}$$
(Note that because $F^{\mu\nu}$ is antisymmetric, there is no $F^{00}$ term on the right-hand side.) Thus, in terms of 3-vector notation, the Lorentz boost transformation of the electric field is given by
$$\mathbf{E}' = \gamma (\mathbf{E} + \mathbf{v} \times \mathbf{B}) - \frac{\gamma - 1}{v^2}\, (\mathbf{v} \cdot \mathbf{E})\, \mathbf{v} . \tag{2.51}$$
An analogous calculation shows that the Lorentz boost transformation of the magnetic field is given by
$$\mathbf{B}' = \gamma (\mathbf{B} - \mathbf{v} \times \mathbf{E}) - \frac{\gamma - 1}{v^2}\, (\mathbf{v} \cdot \mathbf{B})\, \mathbf{v} . \tag{2.52}$$

Suppose, for example, that in the frame $S$ there is just a magnetic field $\mathbf{B}$, while $\mathbf{E} = 0$. An observer in a frame $S'$ moving with uniform velocity $\mathbf{v}$ relative to $S$ will therefore observe
not only a magnetic field, given by
$$\mathbf{B}' = \gamma \mathbf{B} - \frac{\gamma - 1}{v^2}\, (\mathbf{v} \cdot \mathbf{B})\, \mathbf{v} , \tag{2.53}$$
but also an electric field, given by
$$\mathbf{E}' = \gamma\, \mathbf{v} \times \mathbf{B} . \tag{2.54}$$
This, of course, is the principle of the dynamo.[6]

[6] In a practical dynamo the rotor is moving with a velocity $v$ which is much less than the speed of light, i.e. $|v| \ll 1$ in natural units. This means that the gamma factor $\gamma = (1 - v^2)^{-1/2}$ is approximately equal to unity in such cases.

It is instructive to write out the Lorentz transformations explicitly in the case when the boost is along the $x$ direction, $\mathbf{v} = (v, 0, 0)$. Equations (2.51) and (2.52) become
$$\begin{aligned}
E'_x &= E_x , & E'_y &= \gamma (E_y - v B_z) , & E'_z &= \gamma (E_z + v B_y) , \\
B'_x &= B_x , & B'_y &= \gamma (B_y + v E_z) , & B'_z &= \gamma (B_z - v E_y) .
\end{aligned} \tag{2.55}$$

2.5 The Lorentz force

Consider a point particle following the path, or worldline, $x^i = x^i(t)$. It has 3-velocity $u^i = dx^i/dt$, and, as we saw earlier, 4-velocity
$$U^\mu = (\gamma, \gamma\, \mathbf{u}) , \qquad \hbox{where}\quad \gamma = \frac{1}{\sqrt{1 - u^2}} . \tag{2.56}$$
Multiplying by the rest mass $m$ of the particle gives another 4-vector, namely the 4-momentum
$$p^\mu = m U^\mu = (m\gamma, m\gamma\, \mathbf{u}) . \tag{2.57}$$
The quantity $p^0 = m\gamma$ is called the relativistic energy $\mathcal{E}$, and $p^i = m\gamma u^i$ is called the relativistic 3-momentum. Note that since $U^\mu U_\mu = -1$, we shall have
$$p^\mu p_\mu = -m^2 . \tag{2.58}$$
We now define the relativistic 4-force $f^\mu$ acting on the particle to be
$$f^\mu = \frac{dp^\mu}{d\tau} , \tag{2.59}$$
where $\tau$ is the proper time. Clearly $f^\mu$ is indeed a 4-vector, since it is the 4-vector $dp^\mu$ divided by the scalar $d\tau$. Using (2.57), we can write the 4-force as
$$f^\mu = \Big( m\gamma^3\, \mathbf{u} \cdot \frac{d\mathbf{u}}{d\tau} ,\; m\gamma^3 \Big(\mathbf{u} \cdot \frac{d\mathbf{u}}{d\tau}\Big) \mathbf{u} + m\gamma\, \frac{d\mathbf{u}}{d\tau} \Big) . \tag{2.60}$$
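As a numerical cross-check of the field transformations of section 2.4, the following minimal sketch compares the general boost formulas (2.51) and (2.52) with the component form (2.55) for a boost along $x$. The function names and the sample field values are illustrative assumptions, not part of the notes; natural units with $c = 1$ are assumed throughout.

```python
import math

def cross(a, b):
    """3-vector cross product a x b."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """3-vector dot product."""
    return sum(x*y for x, y in zip(a, b))

def boost_fields(E, B, v):
    """Fields (E', B') seen in a frame moving with velocity v, from (2.51), (2.52)."""
    v2 = dot(v, v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    k = (gamma - 1.0) / v2              # the factor (gamma - 1)/v^2
    vxB, vxE = cross(v, B), cross(v, E)
    vE, vB = dot(v, E), dot(v, B)
    E_new = tuple(gamma*(E[i] + vxB[i]) - k*vE*v[i] for i in range(3))
    B_new = tuple(gamma*(B[i] - vxE[i]) - k*vB*v[i] for i in range(3))
    return E_new, B_new

def boost_fields_x(E, B, v):
    """Component form (2.55) for a boost of speed v along the x axis."""
    gamma = 1.0 / math.sqrt(1.0 - v*v)
    E_new = (E[0], gamma*(E[1] - v*B[2]), gamma*(E[2] + v*B[1]))
    B_new = (B[0], gamma*(B[1] + v*E[2]), gamma*(B[2] - v*E[1]))
    return E_new, B_new

# Illustrative field values and boost speed (assumptions for this example).
E, B, v = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), 0.6
print(boost_fields(E, B, (v, 0.0, 0.0)))
print(boost_fields_x(E, B, v))
```

The two printed pairs should agree component by component. Setting `E = (0, 0, 0)` in `boost_fields` gives the dynamo case, where the primed electric field reduces to $\gamma\, \mathbf{v} \times \mathbf{B}$ as in (2.54).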
It follows that if we move to the instantaneous rest frame of the particle, i.e. the frame in which $\mathbf{u} = 0$ at the particular moment we are considering, then $f^\mu$ reduces to
$$f^\mu \Big|_{\rm rest\ frame} = (0, \mathbf{F}) , \tag{2.61}$$
where
$$\mathbf{F} = m\, \frac{d\mathbf{u}}{dt} \tag{2.62}$$
is the Newtonian force measured in the rest frame of the particle.[7] Thus, we should interpret the 4-force physically as describing the Newtonian 3-force when measured in the instantaneous rest frame of the accelerating particle.

[7] Note that we can replace the proper time $\tau$ by the coordinate time $t$ in the instantaneous rest frame, since the two are the same.

If we now suppose that the particle has electric charge $e$, and that it is moving under the influence of an electromagnetic field $F_{\mu\nu}$, then its motion is given by the Lorentz force equation
$$f^\mu = e F^{\mu\nu} U_\nu . \tag{2.63}$$
One can more or less justify this equation on the grounds of "what else could it be?", since we know that there must exist a relativistic equation (i.e. a Lorentz covariant equation) that describes the motion. In fact it is easy to see that (2.63) is correct. We calculate the spatial components:
$$f^i = e F^{i\nu} U_\nu = e F^{i0} U_0 + e F^{ij} U_j = e (-E_i)(-\gamma) + e\, \epsilon_{ijk} B_k\, \gamma u_j , \tag{2.64}$$
and thus
$$\mathbf{f} = e \gamma\, (\mathbf{E} + \mathbf{u} \times \mathbf{B}) . \tag{2.65}$$
But $f^\mu = dp^\mu/d\tau$, and so $\mathbf{f} = d\mathbf{p}/d\tau = \gamma\, d\mathbf{p}/dt$ (recall from section 1.5 that $d\tau = dt/\gamma$), where $d\mathbf{p}/dt$ is the rate of change of relativistic 3-momentum, so we have
$$\frac{d\mathbf{p}}{dt} = e\, (\mathbf{E} + \mathbf{u} \times \mathbf{B}) . \tag{2.66}$$
This is indeed the standard expression for the motion of a charged particle under the Lorentz force.

2.6 Action principle for charged particles

In this section, we shall show how the equations of motion for a charged particle moving in an electromagnetic field can be derived from an action principle. To begin, we shall
consider an uncharged particle of mass $m$, with no forces acting on it. It will, of course, move in a straight line. It turns out that its equation of motion can be derived from the Lorentz-invariant action
$$S = -m \int_{\tau_1}^{\tau_2} d\tau , \tag{2.67}$$
where $\tau$ is the proper time along the trajectory $x^\mu(\tau)$ of the particle, starting at proper time $\tau = \tau_1$ and ending at $\tau = \tau_2$. The action principle then states that if we consider all possible paths between the initial and final spacetime points on the path, then the actual path followed by the particle will be such that the action $S$ is stationary. In other words, if we consider small variations of the path around the actual path, then to first order in the variations we shall have $\delta S = 0$.

To see how this works, we note that $d\tau^2 = dt^2 - dx^i dx^i = dt^2 (1 - v_i v_i) = dt^2 (1 - v^2)$, where $v_i = dx^i/dt$ is the 3-velocity of the particle. Thus
$$S = -m \int_{t_1}^{t_2} (1 - v^2)^{1/2}\, dt = -m \int_{t_1}^{t_2} (1 - \dot x_i \dot x_i)^{1/2}\, dt . \tag{2.68}$$
In other words, the Lagrangian $L$, for which $S = \int L\, dt$, is given by
$$L = -m (1 - \dot x_i \dot x_i)^{1/2} . \tag{2.69}$$
As a check, if we expand (2.69) for small velocities (i.e. small compared with the speed of light, so $|\dot x_i| \ll 1$), we shall have
$$L = -m + \tfrac{1}{2} m v^2 + \cdots . \tag{2.70}$$
Since the Lagrangian is given by $L = T - V$ we see that $T$ is just the usual kinetic energy $\tfrac{1}{2} m v^2$ for a non-relativistic particle of mass $m$, while the potential energy is just $m$. Of course if we were not using units where the speed of light were unity, this energy would be $mc^2$. Since it is just a constant, it does not affect the equations of motion that will follow from the action principle.

Now let us consider small variations $\delta x^i(t)$ around the path $x^i(t)$ followed by the particle. The action will vary according to
$$\delta S = m \int_{t_1}^{t_2} (1 - \dot x_j \dot x_j)^{-1/2}\, \dot x_i\, \delta \dot x_i\, dt . \tag{2.71}$$
Integrating by parts then gives
$$\delta S = -m \int_{t_1}^{t_2} \frac{d}{dt} \Big( (1 - \dot x_j \dot x_j)^{-1/2}\, \dot x_i \Big)\, \delta x_i\, dt + m \Big[ (1 - \dot x_j \dot x_j)^{-1/2}\, \dot x_i\, \delta x_i \Big]_{t_1}^{t_2} . \tag{2.72}$$
As usual in an action principle, we restrict to variations of the path that vanish at the endpoints, so $\delta x^i(t_1) = \delta x^i(t_2) = 0$ and the boundary term can be dropped. The variation $\delta x^i$ is allowed to be otherwise arbitrary in the time interval $t_1 < t < t_2$, and so we conclude from the requirement of stationary action $\delta S = 0$ that
$$\frac{d}{dt} \Big( (1 - \dot x_j \dot x_j)^{-1/2}\, \dot x_i \Big) = 0 . \tag{2.73}$$
Now, recalling that we define $\gamma = (1 - v^2)^{-1/2}$, we see that
$$\frac{d(m\gamma \mathbf{v})}{dt} = 0 , \tag{2.74}$$
or, in other words,
$$\frac{d\mathbf{p}}{dt} = 0 , \tag{2.75}$$
where $\mathbf{p} = m\gamma \mathbf{v}$ is the relativistic 3-momentum. We have, of course, derived the equation for straight-line motion in the absence of any forces acting.

Now we extend the discussion to the case of a particle of mass $m$ and charge $e$, moving under the influence of an electromagnetic field $F_{\mu\nu}$. This field will be written in terms of a 4-vector potential:
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{2.76}$$
The action will now be the sum of the free-particle action (2.68) above plus a term describing the interaction of the particle with the electromagnetic field. The total action turns out to be
$$S = \int_{\tau_1}^{\tau_2} \big( -m\, d\tau + e A_\mu\, dx^\mu \big) . \tag{2.77}$$
Note that it is again Lorentz invariant.

From (2.44) we have $A_\mu = (-\phi, \mathbf{A})$, and so
$$A_\mu\, dx^\mu = A_\mu\, \frac{dx^\mu}{dt}\, dt = (A_0 + A_i \dot x_i)\, dt = (-\phi + A_i \dot x_i)\, dt . \tag{2.78}$$
Thus we have $S = \int_{t_1}^{t_2} L\, dt$ with the Lagrangian $L$ given by
$$L = -m (1 - \dot x_j \dot x_j)^{1/2} - e\phi + e A_i \dot x_i , \tag{2.79}$$
where the potentials $\phi$ and $A_i$ depend on $t$ and $\mathbf{x}$. The first-order variation of the action under a variation $\delta x^i$ in the path gives
$$\begin{aligned}
\delta S &= \int_{t_1}^{t_2} \Big[ m (1 - \dot x_j \dot x_j)^{-1/2}\, \dot x_i\, \delta \dot x_i - e\, \partial_i \phi\, \delta x_i + e A_i\, \delta \dot x_i + e\, \partial_j A_i\, \dot x_i\, \delta x_j \Big]\, dt \\
&= \int_{t_1}^{t_2} \Big[ -\frac{d}{dt} (m\gamma \dot x_i) - e\, \partial_i \phi - e\, \frac{dA_i}{dt} + e\, \partial_i A_j\, \dot x_j \Big]\, \delta x_i\, dt .
\end{aligned} \tag{2.80}$$
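(We have dropped the boundary terms immediately, since $\delta x^i$ is again assumed to vanish at the endpoints.) Thus the principle of stationary action $\delta S = 0$ implies
$$\frac{d(m\gamma \dot x_i)}{dt} = -e\, \partial_i \phi - e\, \frac{dA_i}{dt} + e\, \partial_i A_j\, \dot x_j . \tag{2.81}$$
Now, the total time derivative $dA_i/dt$ has two contributions, and we may write it as
$$\frac{dA_i}{dt} = \frac{\partial A_i}{\partial t} + \partial_j A_i\, \frac{dx_j}{dt} = \frac{\partial A_i}{\partial t} + \partial_j A_i\, \dot x_j . \tag{2.82}$$
This arises because first of all, $A_i$ can depend explicitly on the time coordinate; this contribution is $\partial A_i/\partial t$. Additionally, $A_i$ depends on the spatial coordinates $x^i$, and along the path followed by the particle, $x^i$ depends on $t$ because the path is $x^i = x^i(t)$. This accounts for the second term.

Putting all this together, we have
$$\frac{d(m\gamma \dot x_i)}{dt} = e \Big( -\partial_i \phi - \frac{\partial A_i}{\partial t} \Big) + e\, (\partial_i A_j - \partial_j A_i)\, \dot x_j = e\, (E_i + \epsilon_{ijk}\, \dot x_j B_k) . \tag{2.83}$$
In other words, we have
$$\frac{d\mathbf{p}}{dt} = e\, (\mathbf{E} + \mathbf{v} \times \mathbf{B}) , \tag{2.84}$$
which is the Lorentz force equation (2.66).

It is worth noting that although we gave a "three-dimensional" derivation of the equations of motion following from the action (2.77), we can also instead directly derive the four-dimensional equation $dp^\mu/d\tau = e F^{\mu\nu} U_\nu$. To begin, we write the proper time interval as $d\tau = (-\eta_{\rho\sigma}\, dx^\rho dx^\sigma)^{1/2}$, and so its variation under a variation of the path $x^\mu(\tau)$ gives
$$\delta(d\tau) = -(-\eta_{\rho\sigma}\, dx^\rho dx^\sigma)^{-1/2}\, \eta_{\mu\nu}\, dx^\mu\, d\delta x^\nu = -\eta_{\mu\nu}\, \frac{dx^\mu}{d\tau}\, d\delta x^\nu = -U_\mu\, d\delta x^\mu , \tag{2.85}$$
where $U_\mu$ is the 4-velocity. Thus the variation of the action (2.77) gives
$$\begin{aligned}
\delta S &= \int_{\tau_1}^{\tau_2} \big( m U_\mu\, d\delta x^\mu + e A_\mu\, d\delta x^\mu + e\, \partial_\nu A_\mu\, \delta x^\nu\, dx^\mu \big) \\
&= \int_{\tau_1}^{\tau_2} \big( -m\, dU_\mu\, \delta x^\mu - e\, dA_\mu\, \delta x^\mu + e\, \partial_\mu A_\nu\, \delta x^\mu\, dx^\nu \big) \\
&= \int_{\tau_1}^{\tau_2} \Big( -m\, \frac{dU_\mu}{d\tau} - e\, \frac{dA_\mu}{d\tau} + e\, \partial_\mu A_\nu\, \frac{dx^\nu}{d\tau} \Big)\, \delta x^\mu\, d\tau .
\end{aligned} \tag{2.86}$$
Now we have
$$\frac{dA_\mu}{d\tau} = \partial_\nu A_\mu\, \frac{dx^\nu}{d\tau} = \partial_\nu A_\mu\, U^\nu , \tag{2.87}$$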
and so
$$\delta S = \int_{\tau_1}^{\tau_2} \Big( -m\, \frac{dU_\mu}{d\tau} - e\, \partial_\nu A_\mu\, U^\nu + e\, \partial_\mu A_\nu\, U^\nu \Big)\, \delta x^\mu\, d\tau . \tag{2.88}$$
Requiring $\delta S = 0$ for all variations (that vanish at the endpoints) we therefore obtain the equation of motion
$$m\, \frac{dU_\mu}{d\tau} = e\, (\partial_\mu A_\nu - \partial_\nu A_\mu)\, U^\nu = e F_{\mu\nu} U^\nu . \tag{2.89}$$
Thus we have reproduced the Lorentz force equation (2.66).

2.7 Gauge invariance of the action

In writing down the relativistic action (2.77) for a charged particle we had to make use of the 4-vector potential $A_\mu$. This is itself not physically observable, since, as we noted earlier, $A_\mu$ and $A'_\mu = A_\mu + \partial_\mu \lambda$ describe the same physics, where $\lambda$ is any arbitrary function in spacetime, since $A_\mu$ and $A'_\mu$ give rise to the same electromagnetic field $F_{\mu\nu}$. One might worry, therefore, that the action itself would be gauge dependent, and therefore might not properly describe the required physical situation. However, all is in fact well. This already can be seen from the fact that, as we demonstrated, the variational principle for the action (2.77) does in fact produce the correct gauge-invariant Lorentz force equation (2.66).

It is instructive also to examine the effects of a gauge transformation directly at the level of the action. If we make the gauge transformation $A_\mu \to A'_\mu = A_\mu + \partial_\mu \lambda$, we see from (2.77) that the action $S$ transforms to $S'$ given by
$$S' = \int_{\tau_1}^{\tau_2} \big( -m\, d\tau + e A_\mu\, dx^\mu + e\, \partial_\mu \lambda\, dx^\mu \big) = S + e \int_{\tau_1}^{\tau_2} \partial_\mu \lambda\, dx^\mu = S + e \int_{\tau_1}^{\tau_2} d\lambda , \tag{2.90}$$
and so
$$S' = S + e\, [\lambda(\tau_2) - \lambda(\tau_1)] . \tag{2.91}$$
Thus provided we restrict ourselves to gauge transformations that vanish at the endpoints, the action will be gauge invariant, $S' = S$.

2.8 Canonical momentum, and Hamiltonian

Given any Lagrangian $L(x^i, \dot x^i, t)$ one defines the canonical momentum $\pi_i$ as
$$\pi_i = \frac{\partial L}{\partial \dot x^i} . \tag{2.92}$$
The relativistic Lagrangian for the charged particle is given by (2.79), and so we have
$$\pi_i = m (1 - \dot x_j \dot x_j)^{-1/2}\, \dot x_i + e A_i , \tag{2.93}$$
or, in other words,
$$\pi_i = m\gamma\, \dot x_i + e A_i = p_i + e A_i , \tag{2.94}$$
where $p_i$ as usual is the standard mechanical relativistic 3-momentum of the particle.

As usual, the Hamiltonian for the system is given by
$$H = \pi_i \dot x_i - L , \tag{2.95}$$
and so we find
$$H = m\gamma\, \dot x_i \dot x_i + \frac{m}{\gamma} + e\phi . \tag{2.96}$$
Now, $\dot x_i = v_i$ and
$$m\gamma v^2 + \frac{m}{\gamma} = m\gamma \big( v^2 + (1 - v^2) \big) = m\gamma , \tag{2.97}$$
so we have
$$H = m\gamma + e\phi . \tag{2.98}$$
The Hamiltonian is to be viewed as a function of the coordinates $x_i$ and the canonical momenta $\pi_i$. To express $\gamma$ in terms of $\pi_i$, we note from (2.94) that $m\gamma\, \dot x_i = \pi_i - e A_i$, and so squaring, we get $m^2 \gamma^2 v^2 = m^2 v^2/(1 - v^2) = (\pi_i - e A_i)^2$. Solving for $v^2$, and hence for $\gamma$, we find that $m^2 \gamma^2 = (\pi_i - e A_i)^2 + m^2$, and so finally, from (2.98), we arrive at the Hamiltonian
$$H = \sqrt{(\pi_i - e A_i)^2 + m^2} + e\phi . \tag{2.99}$$
Note that Hamilton's equations, which will necessarily give rise to the same Lorentz force equations of motion we encountered previously, are given by
$$\frac{\partial H}{\partial \pi_i} = \dot x_i , \qquad \frac{\partial H}{\partial x_i} = -\dot \pi_i . \tag{2.100}$$

As a check of the correctness of the Hamiltonian (2.99) we may examine it in the non-relativistic limit when $(\pi_i - e A_i)^2$ is much less than $m^2$. We then extract an $m^2$ factor from inside the square root in $\sqrt{(\pi_i - e A_i)^2 + m^2}$ and expand to get
$$H = m \sqrt{1 + (\pi_i - e A_i)^2/m^2} + e\phi = m + \frac{1}{2m} (\pi_i - e A_i)^2 + e\phi + \cdots . \tag{2.101}$$
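The algebra leading from (2.94) to (2.99) and (2.101) can be spot-checked numerically. The sketch below is an illustration only: the function names and the sample values of the potentials at a single point are assumptions made for this example, and units with $c = 1$ are used.

```python
import math

def gamma_factor(v):
    """gamma = (1 - v^2)^(-1/2) for a 3-velocity v."""
    return 1.0 / math.sqrt(1.0 - sum(vi*vi for vi in v))

def canonical_momentum(v, A, m, e):
    """pi_i = m*gamma*v_i + e*A_i, as in (2.94)."""
    g = gamma_factor(v)
    return tuple(m*g*vi + e*ai for vi, ai in zip(v, A))

def hamiltonian(pi, A, phi, m, e):
    """H = sqrt((pi - eA)^2 + m^2) + e*phi, as in (2.99)."""
    kin2 = sum((p - e*a)**2 for p, a in zip(pi, A))
    return math.sqrt(kin2 + m*m) + e*phi

def hamiltonian_nonrel(pi, A, phi, m, e):
    """Leading terms of (2.101): m + (pi - eA)^2/(2m) + e*phi."""
    kin2 = sum((p - e*a)**2 for p, a in zip(pi, A))
    return m + kin2/(2.0*m) + e*phi

# Illustrative values (assumptions): potentials evaluated at one spacetime point.
m, e = 1.0, 1.0
A, phi = (0.1, -0.2, 0.3), 0.5
v = (0.3, 0.4, 0.0)                       # |v| = 0.5
pi = canonical_momentum(v, A, m, e)

# (2.99) evaluated on pi should reproduce H = m*gamma + e*phi of (2.98).
print(hamiltonian(pi, A, phi, m, e), m*gamma_factor(v) + e*phi)
```

For a slow particle, for instance `v = (0.01, 0.0, 0.0)`, `hamiltonian` and `hamiltonian_nonrel` should differ only at the order of the first neglected term in the expansion.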
The first term is the rest-mass energy, which is just a constant, and the remaining terms presented explicitly in (2.101) give the standard non-relativistic Hamiltonian for a charged particle
$$H_{\rm non\text{-}rel.} = \frac{1}{2m} (\pi_i - e A_i)^2 + e\phi . \tag{2.102}$$
This should be familiar from quantum mechanics, when one writes down the Schrödinger equation for the wave function for a charged particle in an electromagnetic field.

3 Particle Motion in Static Electromagnetic Fields

In this chapter, we discuss the motion of a charged particle in static (i.e. time-independent) electromagnetic fields.

3.1 Description in terms of potentials

If we are describing static electric and magnetic fields, $\mathbf{E} = \mathbf{E}(\mathbf{r})$ and $\mathbf{B} = \mathbf{B}(\mathbf{r})$, it is natural (and always possible) to describe them in terms of scalar and 3-vector potentials that are also static, $\phi = \phi(\mathbf{r})$, $\mathbf{A} = \mathbf{A}(\mathbf{r})$. Thus we write
$$\mathbf{E} = -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t} = -\nabla \phi(\mathbf{r}) , \qquad \mathbf{B} = \nabla \times \mathbf{A}(\mathbf{r}) . \tag{3.1}$$
We can still perform gauge transformations, as given in (2.9) and (2.10). The most general gauge transformation that preserves the time-independence of the potentials is therefore given by taking the parameter $\lambda$ to be of the form
$$\lambda(\mathbf{r}, t) = \lambda(\mathbf{r}) + k\, t , \tag{3.2}$$
where $k$ is an arbitrary constant. This implies that $\phi$ and $\mathbf{A}$ will transform according to
$$\phi \longrightarrow \phi - k , \qquad \mathbf{A} \longrightarrow \mathbf{A} + \nabla \lambda(\mathbf{r}) . \tag{3.3}$$
Note, in particular, that the electrostatic potential $\phi$ can just be shifted by an arbitrary constant. This is the familiar freedom that one typically uses to set $\phi = 0$ at infinity.

Recall that the Hamiltonian for a particle of mass $m$ and charge $e$ in an electromagnetic field is given by (2.98),
$$H = m\gamma + e\phi , \tag{3.4}$$
where $\gamma = (1 - v^2)^{-1/2}$. In the present situation with static fields, the Hamiltonian does not depend explicitly on time, i.e. $\partial H/\partial t = 0$. In this circumstance, it follows that the energy $\mathcal{E}$ is conserved, and is given simply by $H$:
$$\mathcal{E} = H = m\gamma + e\phi . \tag{3.5}$$
The time-independence of $\mathcal{E}$ can be seen from Hamilton's equations (2.100):
$$\frac{d\mathcal{E}}{dt} = \frac{dH}{dt} = \frac{\partial H}{\partial t} + \frac{\partial H}{\partial x^i}\, \dot x^i + \frac{\partial H}{\partial \pi_i}\, \dot \pi_i = 0 - \dot \pi_i \dot x^i + \dot x^i \dot \pi_i = 0 . \tag{3.6}$$
We may think of the first term in $\mathcal{E}$ as being the mechanical term,
$$\mathcal{E}_{\rm mech} = m\gamma , \tag{3.7}$$
since this is just the total energy of a particle of rest mass $m$ moving with velocity $\mathbf{v}$. The second term, $e\phi$, is the contribution to the total energy from the electric field. Note that the magnetic field, described by the 3-vector potential $\mathbf{A}$, does not contribute to the conserved energy. This is because the magnetic field $\mathbf{B}$ does no work on the charge. Recall that the Lorentz force equation can be written as
$$\frac{d(m\gamma v^i)}{dt} = e\, (E_i + \epsilon_{ijk}\, v^j B_k) . \tag{3.8}$$
Multiplying by $v^i$ we therefore have
$$m\gamma\, v^i\, \frac{dv^i}{dt} + m\, v^i v^i\, \frac{d\gamma}{dt} = e\, v^i E_i . \tag{3.9}$$
Now $\gamma = (1 - v^2)^{-1/2}$, so
$$\frac{d\gamma}{dt} = (1 - v^2)^{-3/2}\, v^i\, \frac{dv^i}{dt} = \gamma^3\, v^i\, \frac{dv^i}{dt} , \tag{3.10}$$
and so (3.9) gives
$$m\, \frac{d\gamma}{dt} = e\, v^i E_i . \tag{3.11}$$
Since $\mathcal{E}_{\rm mech} = m\gamma$, and $m$ is a constant, we therefore have
$$\frac{d\mathcal{E}_{\rm mech}}{dt} = e\, \mathbf{v} \cdot \mathbf{E} . \tag{3.12}$$
Thus, the mechanical energy of the particle is changed only by the electric field, and not by the magnetic field.

Note that another derivation of the constancy of $\mathcal{E} = m\gamma + e\phi$ is as follows:
$$\frac{d\mathcal{E}}{dt} = \frac{d(m\gamma)}{dt} + e\, \frac{d\phi}{dt} = \frac{d\mathcal{E}_{\rm mech}}{dt} + e\, \partial_i \phi\, \frac{dx^i}{dt} = e\, \mathbf{v} \cdot \mathbf{E} - e\, \mathbf{v} \cdot \mathbf{E} = 0 . \tag{3.13}$$
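The conservation law (3.13) can be illustrated numerically. The following is a minimal sketch, not part of the notes: it integrates the Lorentz force equation with a hand-written fourth-order Runge-Kutta step and monitors $m\gamma + e\phi$. It assumes uniform static fields, for which the scalar potential may be taken as $\phi = -\mathbf{E} \cdot \mathbf{r}$, and the function names, step size and initial data are illustrative choices.

```python
import math

def cross(a, b):
    """3-vector cross product a x b, returned as a list."""
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def rhs(state, e, m, E, B):
    """d/dt of (x, p): dx/dt = v and dp/dt = e(E + v x B), with v = p/sqrt(m^2 + p^2)."""
    p = state[3:]
    e_mech = math.sqrt(m*m + sum(pi*pi for pi in p))     # m*gamma
    v = [pi / e_mech for pi in p]
    vxB = cross(v, B)
    return v + [e*(E[i] + vxB[i]) for i in range(3)]

def rk4_step(state, dt, e, m, E, B):
    """One classical Runge-Kutta step for the six-component state (x, p)."""
    def shifted(k, h):
        return [s + h*ki for s, ki in zip(state, k)]
    k1 = rhs(state, e, m, E, B)
    k2 = rhs(shifted(k1, 0.5*dt), e, m, E, B)
    k3 = rhs(shifted(k2, 0.5*dt), e, m, E, B)
    k4 = rhs(shifted(k3, dt), e, m, E, B)
    return [s + dt*(a + 2*b + 2*c + d)/6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def energies(state, e, m, E):
    """Return (E_mech, E_mech + e*phi), taking phi = -E.r for the uniform field."""
    x, p = state[:3], state[3:]
    e_mech = math.sqrt(m*m + sum(pi*pi for pi in p))
    phi = -sum(Ei*xi for Ei, xi in zip(E, x))
    return e_mech, e_mech + e*phi

def evolve(state, steps, dt, e, m, E, B):
    """Advance the state by the given number of RK4 steps."""
    for _ in range(steps):
        state = rk4_step(state, dt, e, m, E, B)
    return state

# Illustrative initial data and fields (assumptions for this example), c = 1.
e, m = 1.0, 1.0
start = [0.0, 0.0, 0.0, 0.5, 0.2, 0.1]                   # (x, y, z, px, py, pz)
E, B = [0.3, 0.0, 0.1], [0.0, 0.0, 1.0]
end = evolve(start, 2000, 1e-3, e, m, E, B)
print(energies(start, e, m, E), energies(end, e, m, E))
```

Up to integration error, the second entry of each printed pair (the total energy) should be unchanged. Setting `E = [0.0, 0.0, 0.0]` leaves the mechanical energy itself constant, in line with (3.12).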
3.2 Particle motion in static uniform E and B fields

Let us consider the case where a charged particle is moving in static (i.e. time-independent) uniform $\mathbf{E}$ and $\mathbf{B}$ fields. In other words, $\mathbf{E}$ and $\mathbf{B}$ are constant vectors, independent of time and of position. In this situation, it is easy to write down explicit expressions for the corresponding scalar and 3-vector potentials. For the scalar potential, we can take
$$\phi = -\mathbf{E} \cdot \mathbf{r} = -E_i x_i . \tag{3.14}$$
Clearly this gives the correct electric field, since
$$-\partial_i \phi = \partial_i (E_j x_j) = E_j\, \partial_i x_j = E_j \delta_{ij} = E_i . \tag{3.15}$$
(It is, of course, essential that $E_j$ is constant for this calculation to be valid.)

Turning now to the uniform $\mathbf{B}$ field, it is easily seen that this can be written as $\mathbf{B} = \nabla \times \mathbf{A}$, with the 3-vector potential given by
$$\mathbf{A} = \tfrac{1}{2}\, \mathbf{B} \times \mathbf{r} . \tag{3.16}$$
It is easiest to check this using index notation. We have
$$(\nabla \times \mathbf{A})_i = \epsilon_{ijk}\, \partial_j A_k = \epsilon_{ijk}\, \partial_j \big( \tfrac{1}{2}\, \epsilon_{k\ell m} B_\ell x_m \big) = \tfrac{1}{2}\, \epsilon_{ijk}\, \epsilon_{k\ell m} B_\ell\, \partial_j x_m = \tfrac{1}{2}\, \epsilon_{ijk}\, \epsilon_{\ell jk} B_\ell = \delta_{i\ell} B_\ell = B_i . \tag{3.17}$$

Of course the potentials we have written above are not unique, since we can still perform gauge transformations. If we restrict attention to transformations that maintain the time-independence of $\phi$ and $\mathbf{A}$, then for $\phi$ the only remaining freedom is to add an arbitrary constant to $\phi$. For the 3-vector potential, we can still add $\nabla \lambda(\mathbf{r})$ to $\mathbf{A}$, where $\lambda(\mathbf{r})$ is an arbitrary function of position. It is sometimes helpful, for calculational reasons, to do this. Suppose, for example, that the uniform $\mathbf{B}$ field lies along the $z$ axis: $\mathbf{B} = (0, 0, B)$. From (3.16), we may therefore write the 3-vector potential
$$\mathbf{A} = \big( -\tfrac{1}{2} B y ,\; \tfrac{1}{2} B x ,\; 0 \big) . \tag{3.18}$$
Another choice is to take $\mathbf{A}' = \mathbf{A} + \nabla \lambda(\mathbf{r})$, with $\lambda = -\tfrac{1}{2} B x y$. This gives
$$\mathbf{A}' = (-B y, 0, 0) . \tag{3.19}$$
One easily verifies that indeed $\nabla \times \mathbf{A}' = (0, 0, B)$.
3.2.1 Motion in a static uniform electric field

From the Lorentz force equation, we shall have
$$\frac{d\mathbf{p}}{dt} = e\, \mathbf{E} , \tag{3.20}$$
where $\mathbf{p} = m\gamma \mathbf{v}$ is the relativistic 3-momentum. Without loss of generality, we may take the electric field to lie along the $x$ axis, and so we will have
$$\frac{dp_x}{dt} = e E , \qquad \frac{dp_y}{dt} = 0 , \qquad \frac{dp_z}{dt} = 0 . \tag{3.21}$$
Since there is a rotational symmetry in the $(y, z)$ plane, we can, without loss of generality, choose to take $p_z = 0$, since the motion in the $(y, z)$ plane is evidently, from (3.21), simply linear. Thus we may take the solution to (3.21) to be
$$p_x = e E t , \qquad p_y = \bar p , \qquad p_z = 0 , \tag{3.22}$$
where $\bar p$ is a constant. We have also chosen the origin for the time coordinate $t$ such that $p_x = 0$ at $t = 0$.

Recalling that the 4-momentum is given by $p^\mu = (m\gamma, \mathbf{p}) = (\mathcal{E}_{\rm mech}, \mathbf{p})$, and that $p^\mu p_\mu = m^2 U^\mu U_\mu = -m^2$, we see that
$$\mathcal{E}_{\rm mech} = \sqrt{m^2 + p_x^2 + p_y^2} = \sqrt{m^2 + \bar p^2 + (e E t)^2} , \tag{3.23}$$
and hence we may write
$$\mathcal{E}_{\rm mech} = \sqrt{\mathcal{E}_0^2 + (e E t)^2} , \tag{3.24}$$
where $\mathcal{E}_0^2 = m^2 + \bar p^2$ is the square of the mechanical energy at time $t = 0$. We have $\mathbf{p} = m\gamma \mathbf{v} = \mathcal{E}_{\rm mech}\, \mathbf{v}$, and so
$$\frac{dx}{dt} = \frac{p_x}{\mathcal{E}_{\rm mech}} = \frac{e E t}{\sqrt{\mathcal{E}_0^2 + (e E t)^2}} , \tag{3.25}$$
which can be integrated to give
$$x = \frac{1}{e E} \sqrt{\mathcal{E}_0^2 + (e E t)^2} . \tag{3.26}$$
(The constant of integration has been absorbed into a choice of origin for the $x$ coordinate.) Note from (3.25) that the $x$-component of the 3-velocity asymptotically approaches 1 as $t$ goes to infinity. Thus the particle is accelerated closer and closer to the speed of light, but never reaches it.
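The solution (3.26) and the limiting behaviour of (3.25) can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the function names and the parameter values are illustrative, `E` denotes the field strength along $x$, and $c = 1$.

```python
import math

def e_mech(t, e, E, m, p_bar):
    """Mechanical energy (3.24): sqrt(E0^2 + (eEt)^2), with E0^2 = m^2 + p_bar^2."""
    return math.sqrt(m*m + p_bar*p_bar + (e*E*t)**2)

def x_of_t(t, e, E, m, p_bar):
    """Position along the field direction, equation (3.26)."""
    return e_mech(t, e, E, m, p_bar) / (e*E)

def vx_of_t(t, e, E, m, p_bar):
    """Velocity along the field direction, equation (3.25)."""
    return e*E*t / e_mech(t, e, E, m, p_bar)

# Illustrative parameter values (assumptions for this example).
params = dict(e=1.0, E=0.5, m=1.0, p_bar=0.3)
h = 1e-5
for t in (0.5, 2.0, 10.0, 1000.0):
    # Central-difference slope of x(t), to compare with the closed form for dx/dt.
    slope = (x_of_t(t + h, **params) - x_of_t(t - h, **params)) / (2*h)
    print(t, vx_of_t(t, **params), slope)
```

The two velocity columns should agree to within the finite-difference error, and the velocity stays below 1 however large $t$ becomes.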
We also have
$$\frac{dy}{dt} = \frac{p_y}{\mathcal{E}_{\rm mech}} = \frac{\bar p}{\sqrt{\mathcal{E}_0^2 + (e E t)^2}} . \tag{3.27}$$
This can be integrated by changing variable from $t$ to $u$, defined by
$$e E t = \mathcal{E}_0 \sinh u . \tag{3.28}$$
This gives $y = \bar p\, u/(e E)$, and hence
$$y = \frac{\bar p}{e E}\, {\rm arcsinh}\, \frac{e E t}{\mathcal{E}_0} . \tag{3.29}$$
(Again, the constant of integration has been absorbed into the choice of origin for $y$.)

The solutions (3.26) and (3.29) for $x$ and $y$ as functions of $t$ can be combined to give $x$ as a function of $y$, leading to
$$x = \frac{\mathcal{E}_0}{e E} \cosh \frac{e E y}{\bar p} . \tag{3.30}$$
This is a catenary.

In the non-relativistic limit when $|v| \ll 1$, we have $\bar p \approx m \bar v$ and then, expanding (3.30), we find the standard "Newtonian" parabolic motion
$$x \approx {\rm constant} + \frac{e E}{2 m \bar v^2}\, y^2 . \tag{3.31}$$

3.2.2 Motion in a static uniform magnetic field
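From the Lorentz force equation we shall have
$$\frac{d\mathbf{p}}{dt} = e\, \mathbf{v} \times \mathbf{B} . \tag{3.32}$$
Recalling (3.11), we see that in the absence of an electric field we shall have $\gamma$ = constant, and hence $d\mathbf{p}/dt = d(m\gamma \mathbf{v})/dt = m\gamma\, d\mathbf{v}/dt$, leading to
$$\frac{d\mathbf{v}}{dt} = \frac{e}{m\gamma}\, \mathbf{v} \times \mathbf{B} = \frac{e}{\mathcal{E}}\, \mathbf{v} \times \mathbf{B} , \tag{3.33}$$
since $\mathcal{E} = m\gamma + e\phi = m\gamma$ (a constant) here.

Without loss of generality we may choose the uniform $\mathbf{B}$ field to lie along the $z$ axis: $\mathbf{B} = (0, 0, B)$. Defining
$$\omega \equiv \frac{e B}{\mathcal{E}} = \frac{e B}{m\gamma} , \tag{3.34}$$
we then find
$$\frac{dv_x}{dt} = \omega\, v_y , \qquad \frac{dv_y}{dt} = -\omega\, v_x , \qquad \frac{dv_z}{dt} = 0 . \tag{3.35}$$
From this, it follows that
$$\frac{d(v_x + i\, v_y)}{dt} = -i\, \omega\, (v_x + i\, v_y) , \tag{3.36}$$
and so the first two equations in (3.35) can be integrated to give
$$v_x + i\, v_y = v_0\, e^{-i (\omega t + \alpha)} , \tag{3.37}$$
where $v_0$ is a real constant, and $\alpha$ is a constant (real) phase. Thus after further integrations we obtain
$$x = x_0 + r_0 \sin(\omega t + \alpha) , \qquad y = y_0 + r_0 \cos(\omega t + \alpha) , \qquad z = z_0 + \bar v_z\, t , \tag{3.38}$$
for constants $r_0$, $x_0$, $y_0$, $z_0$ and $\bar v_z$, with
$$r_0 = \frac{v_0}{\omega} = \frac{m\gamma v_0}{e B} = \frac{\bar p}{e B} , \tag{3.39}$$
where $\bar p$ is the relativistic 3-momentum in the $(x, y)$ plane. The particle therefore follows a helical path, of radius $r_0$.

3.2.3 Adiabatic invariant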
In any conservative system with a periodic motion, it can be shown that the quantity
$$I \equiv \oint \pi_i\, dx^i , \tag{3.40}$$
integrated over a complete cycle of the coordinates $x^i$, is conserved under slow (adiabatic) changes of the external parameters. Specifically, if there is an external parameter $a$, then $dI/dt$ is of order $O(\dot a^2, \ddot a)$, but there is no linear dependence on the first derivative $\dot a$.

In our previous discussion, of a charged particle moving under the influence of a uniform magnetic field $\mathbf{B}$ that lies along the $z$ direction, we may consider the invariant $I$ that one obtains by integrating around its closed path in the $(x, y)$ plane. We shall have
$$I \equiv \oint \pi_i\, dx^i = \oint (p_i + e A_i)\, dx^i , \tag{3.41}$$
and
$$\oint p_i\, dx^i = 2\pi r_0\, \bar p = 2\pi r_0^2\, e B . \tag{3.42}$$
We shall also have
$$e \oint A_i\, dx^i = e \int_S \mathbf{B} \cdot d\mathbf{S} = -e \pi r_0^2 B . \tag{3.43}$$
Hence we find
$$I = 2\pi r_0^2\, e B - \pi r_0^2\, e B , \tag{3.44}$$
and so
$$I = \pi r_0^2\, e B = \frac{\pi \bar p^2}{e B} . \tag{3.45}$$
The statement is that since $I$ is an adiabatic invariant, it will remain essentially unchanged if $B$, which we can view here as the external parameter, is slowly changed. Thus we may say that
$$r_0 \propto B^{-1/2} , \qquad \hbox{or} \qquad \bar p \propto B^{1/2} . \tag{3.46}$$
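The scaling (3.46) follows from holding (3.45) fixed, and a short sketch makes the bookkeeping explicit. The function names and numerical values below are illustrative assumptions. The code does not simulate the slow change itself: it simply imposes $I$ = constant and reads off the consequences for $\bar p$ and $r_0$.

```python
import math

def orbit_radius(p_bar, e, B):
    """Orbit radius r0 = p_bar/(eB), from (3.39)."""
    return p_bar / (e*B)

def adiabatic_invariant(p_bar, e, B):
    """I = pi * p_bar^2 / (eB), from (3.45)."""
    return math.pi * p_bar**2 / (e*B)

def p_bar_after_slow_change(p_bar, B_old, B_new):
    """Transverse momentum after B changes slowly, holding I fixed (p_bar ~ B^(1/2))."""
    return p_bar * math.sqrt(B_new / B_old)

# Illustrative values (assumptions): the field is slowly increased by a factor of 4.
e, p_bar, B_old, B_new = 1.0, 0.2, 1.0, 4.0
p_new = p_bar_after_slow_change(p_bar, B_old, B_new)
print(adiabatic_invariant(p_bar, e, B_old), adiabatic_invariant(p_new, e, B_new))
print(orbit_radius(p_bar, e, B_old), orbit_radius(p_new, e, B_new))
```

With the field quadrupled, the transverse momentum doubles and the radius halves, while the two printed values of $I$ coincide.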
Note that since $\pi r_0^2 = A$, the area of the loop, it follows from (3.45) that
$$I = e\, \Phi , \tag{3.47}$$
where $\Phi = A B$ is the magnetic flux threading the loop. Thus if we make a slow change to the magnetic field, then the radius of the particle's orbit adjusts itself so that the magnetic flux through the loop remains constant.

As an application, we may consider a charged particle moving in a static magnetic field that changes gradually with position. We have already seen that $\mathcal{E}_{\rm mech}$ is constant in a pure magnetic field. Since we have
$$p^\mu p_\mu = -\mathcal{E}_{\rm mech}^2 + \mathbf{p}^2 = -m^2 , \tag{3.48}$$
it follows that $|\mathbf{p}|$ is also a constant. In our discussion of the particle motion in the magnetic field, we defined $\bar p$ to be the component of transverse 3-momentum; i.e. the component in the $(x, y)$ plane. Thus we shall have
$$\mathbf{p}^2 = \bar p^2 + p_L^2 , \tag{3.49}$$
where $p_L$ denotes the longitudinal component of 3-momentum. It follows that
$$p_L^2 = \mathbf{p}^2 - \bar p^2 = \mathbf{p}^2 - \frac{e I B}{\pi} . \tag{3.50}$$
Since $\mathbf{p}^2$ is a constant, it follows that as the particle penetrates into a region where the magnetic field increases, the longitudinal momentum $p_L$ (i.e. the momentum in the direction of its forward motion) gets smaller and smaller. If the $B$ field becomes large enough, the forward motion will be brought to zero, and the particle will be repelled out of the region of high magnetic field.

3.2.4 Motion in uniform E and B fields
Having considered the case of particle motion in a uniform $\mathbf{E}$ field, and in a uniform $\mathbf{B}$ field, we may also consider the situation of motion in uniform $\mathbf{E}$ and $\mathbf{B}$ fields together. To discuss this in detail is quite involved, and we shall not pursue it extensively here. Instead, consider the situation where we take
$$\mathbf{B} = (0, 0, B) , \qquad \mathbf{E} = (0, E_y, E_z) , \tag{3.51}$$
(there is no loss of generality in choosing axes so that this is the case), and we make the simplifying assumption that the motion is non-relativistic, i.e. $|v| \ll 1$. The equations of motion will therefore be
$$m\, \frac{d\mathbf{v}}{dt} = e\, (\mathbf{E} + \mathbf{v} \times \mathbf{B}) , \tag{3.52}$$
and so
$$m \ddot x = e B\, \dot y , \qquad m \ddot y = e E_y - e B\, \dot x , \qquad m \ddot z = e E_z . \tag{3.53}$$
We can immediately solve for $z$, finding
$$z = \frac{e}{2m}\, E_z\, t^2 + \bar v\, t , \tag{3.54}$$
where we have chosen the $z$ origin so that $z = 0$ at $t = 0$. The $x$ and $y$ equations can be combined into
$$\frac{d}{dt} (\dot x + i\, \dot y) + i\, \omega\, (\dot x + i\, \dot y) = \frac{i e}{m}\, E_y , \tag{3.55}$$
where $\omega = e B/m$. Thus we find
$$\dot x + i\, \dot y = a\, e^{-i \omega t} + \frac{e}{m \omega}\, E_y = a\, e^{-i \omega t} + \frac{E_y}{B} . \tag{3.56}$$
Choosing the origin of time so that $a$ is real, we have
$$\dot x = a \cos \omega t + \frac{E_y}{B} , \qquad \dot y = -a \sin \omega t . \tag{3.57}$$
Taking the time averages, we see that
$$\langle \dot x \rangle = \frac{E_y}{B} , \qquad \langle \dot y \rangle = 0 . \tag{3.58}$$
The averaged velocity along the $x$ direction is called the drift velocity. Notice that it is perpendicular to $\mathbf{E}$ and $\mathbf{B}$. It can be written in general as
$$\mathbf{v}_{\rm drift} = \frac{\mathbf{E} \times \mathbf{B}}{B^2} . \tag{3.59}$$
For our assumption that $|v| \ll 1$ to be valid, we must have $|\mathbf{E} \times \mathbf{B}| \ll B^2$, i.e. $|E_y| \ll |B|$.

Integrating (3.57) once more, we find
$$x = \frac{a}{\omega} \sin \omega t + \frac{E_y}{B}\, t , \qquad y = \frac{a}{\omega}\, (\cos \omega t - 1) , \tag{3.60}$$
where the origins of $x$ and $y$ have been chosen so that $x = y = 0$ at $t = 0$. These equations describe the projection of the particle's motion onto the $(x, y)$ plane. The curve is called a trochoid. If $|a| > E_y/B$ there will be loops in the motion, and in the special case $a = -E_y/B$ the curve becomes a cycloid, with cusps:
$$x = \frac{E_y}{\omega B}\, (\omega t - \sin \omega t) , \qquad y = \frac{E_y}{\omega B}\, (1 - \cos \omega t) . \tag{3.61}$$

4 Action Principle for Electrodynamics

In this section, we shall show how the Maxwell equations themselves can be derived from an action principle. We shall also introduce the notion of the energy-momentum tensor for the electromagnetic field. We begin with a discussion of Lorentz invariant quantities that can be built from the Maxwell field strength tensor $F_{\mu\nu}$.

4.1 Invariants of the electromagnetic field

As we shall now show, it is possible to build two independent Lorentz invariants that are quadratic in the electromagnetic field. One of these will turn out to be just what is needed in order to construct an action for electrodynamics.

4.1.1 The first invariant

The first quadratic invariant is very simple; we may write
$$I_1 \equiv F_{\mu\nu} F^{\mu\nu} . \tag{4.1}$$
Obviously this is Lorentz invariant, since it is built from the product of two Lorentz tensors, with all indices contracted. It is instructive to see what this looks like in terms of the electric and magnetic fields. From the expressions given in (2.14), we see that
$$\begin{aligned}
I_1 &= F_{0i} F^{0i} + F_{i0} F^{i0} + F_{ij} F^{ij} = 2 F_{0i} F^{0i} + F_{ij} F^{ij} \\
&= -2 E_i E_i + \epsilon_{ijk} B_k\, \epsilon_{ij\ell} B_\ell = -2 E_i E_i + 2 B_i B_i ,
\end{aligned} \tag{4.2}$$
and so
$$I_1 \equiv F_{\mu\nu} F^{\mu\nu} = 2 (\mathbf{B}^2 - \mathbf{E}^2) . \tag{4.3}$$
One could, of course, verify from the Lorentz transformations (2.51) and (2.52) for $\mathbf{E}$ and $\mathbf{B}$ that indeed $(\mathbf{B}^2 - \mathbf{E}^2)$ was invariant, i.e. $I_1' = I_1$ under Lorentz transformations. This
would be quite an involved computation. However, the great beauty of the 4-dimensional language is that there is absolutely no work needed at all; one can see by inspection that $F_{\mu\nu} F^{\mu\nu}$ is Lorentz invariant.

4.1.2 The second invariant

The second quadratic invariant that we can write down is given by
$$I_2 \equiv \tfrac{1}{2}\, \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu} F_{\rho\sigma} . \tag{4.4}$$
First, we need to explain the tensor $\epsilon^{\mu\nu\rho\sigma}$. This is the four-dimensional Minkowski spacetime generalisation of the totally-antisymmetric tensor $\epsilon_{ijk}$ of three-dimensional Cartesian tensor analysis. The tensor $\epsilon^{\mu\nu\rho\sigma}$ is also totally antisymmetric in all its indices. That means that it changes sign if any two indices are exchanged. For example,[8]
$$\epsilon^{\mu\nu\rho\sigma} = -\epsilon^{\nu\mu\rho\sigma} = -\epsilon^{\mu\nu\sigma\rho} = -\epsilon^{\sigma\nu\rho\mu} . \tag{4.5}$$
Since all the non-vanishing components of $\epsilon^{\mu\nu\rho\sigma}$ are related by the antisymmetry, we need only specify one non-vanishing component in order to define the tensor completely. We shall define
$$\epsilon^{0123} = -1 . \tag{4.6}$$
Thus $\epsilon^{\mu\nu\rho\sigma}$ is $-1$, $+1$ or $0$ according to whether $(\mu\nu\rho\sigma)$ is an even permutation of $(0123)$, an odd permutation, or no permutation at all. We use this definition of $\epsilon^{\mu\nu\rho\sigma}$ in all frames. This can be done because, like the Minkowski metric $\eta_{\mu\nu}$, the tensor $\epsilon^{\mu\nu\rho\sigma}$ is an invariant tensor, as we shall now discuss.

[8] Beware that in an odd dimension, such as 3, the process of "cycling" the indices on $\epsilon_{ijk}$ (for example, pushing one off the right-hand end and bringing it to the front) is an even permutation, $\epsilon_{kij} = \epsilon_{ijk}$. By contrast, in an even dimension, such as 4, the process of cycling is an odd permutation, $\epsilon^{\sigma\mu\nu\rho} = -\epsilon^{\mu\nu\rho\sigma}$. This is an elementary point, but easily overlooked if one is familiar only with three dimensions!

Actually, to be more precise, $\epsilon^{\mu\nu\rho\sigma}$ is an invariant pseudo-tensor. This means that under Lorentz transformations that are connected to the identity (pure boosts and/or pure rotations), it is truly an invariant tensor. However, it reverses its sign under Lorentz transformations that involve a reflection. To see this, let us calculate what the transformation of $\epsilon^{\mu\nu\rho\sigma}$ would be if we assume it behaves as an ordinary Lorentz tensor:
$$\epsilon'^{\mu\nu\rho\sigma} \equiv \Lambda^\mu{}_\alpha\, \Lambda^\nu{}_\beta\, \Lambda^\rho{}_\gamma\, \Lambda^\sigma{}_\delta\, \epsilon^{\alpha\beta\gamma\delta} = (\det \Lambda)\, \epsilon^{\mu\nu\rho\sigma} . \tag{4.7}$$
The last equality can easily be seen by writing out all the terms. (It is easier to play around with the analogous identity in 2 or 3 dimensions, to convince oneself of it in an example with fewer terms to write down.) Now, we already saw in section 2.3 that $\det \Lambda = \pm 1$, with $\det \Lambda = +1$ for pure boosts and/or rotations, and $\det \Lambda = -1$ if there is a reflection as well. (See the discussion leading up to equation (2.39).) Thus we see from (4.7) that $\epsilon^{\mu\nu\rho\sigma}$ behaves like an invariant tensor, taking the same values in all Lorentz frames, provided there is no reflection. (Lorentz transformations connected to the identity, i.e. where there is no reflection, are sometimes called proper Lorentz transformations.) In practice, we shall almost always be considering only proper Lorentz transformations, and so the distinction between a tensor and a pseudo-tensor will not concern us.

Returning now to the second quadratic invariant, (4.4), we shall have
$$\begin{aligned}
I_2 = \tfrac{1}{2}\, \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu} F_{\rho\sigma} &= \tfrac{1}{2} \times 4 \times \epsilon^{0ijk} F_{0i} F_{jk} \\
&= 2\, (-\epsilon_{ijk}) (-E_i)\, \epsilon_{jk\ell} B_\ell \\
&= 4 E_i B_i = 4\, \mathbf{E} \cdot \mathbf{B} .
\end{aligned} \tag{4.8}$$
Thus, to summarise, we have the two quadratic invariants
$$I_1 = F_{\mu\nu} F^{\mu\nu} = 2 (\mathbf{B}^2 - \mathbf{E}^2) , \qquad I_2 = \tfrac{1}{2}\, \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu} F_{\rho\sigma} = 4\, \mathbf{E} \cdot \mathbf{B} . \tag{4.9}$$
Since the two quantities $I_1$ and $I_2$ are (manifestly) Lorentz invariant, this means that, even though it is not directly evident in the three-dimensional language without quite a lot of work, the two quantities
$$\mathbf{B}^2 - \mathbf{E}^2 , \qquad \hbox{and} \qquad \mathbf{E} \cdot \mathbf{B} \tag{4.10}$$
are Lorentz invariant, i.e. they take the same values in all Lorentz frames. This has a number of consequences. For example

1. If $\mathbf{E}$ and $\mathbf{B}$ are perpendicular in one Lorentz frame, then they are perpendicular in all Lorentz frames.

2. In particular, if there exists a Lorentz frame where the electromagnetic field is purely electric ($\mathbf{B} = 0$), or purely magnetic ($\mathbf{E} = 0$), then $\mathbf{E}$ and $\mathbf{B}$ are perpendicular in any other frame.

3. If $|\mathbf{E}| > |\mathbf{B}|$ in one frame, then it is true in all frames. Conversely, if $|\mathbf{E}| < |\mathbf{B}|$ in one frame, then it is true in all frames.
A further consequence is that, by making an appropriate Lorentz transformation, we can, at a given point, make $\vec{E}$ and $\vec{B}$ equal to any values we like, subject only to the conditions that we cannot alter the values of $(B^2 - E^2)$ and $\vec{E}\cdot\vec{B}$ at that point.

4.2 Action for Electrodynamics

We have already discussed the action principle for a charged particle moving in an electromagnetic field. In that discussion, the electromagnetic field was just a specified background, which, of course, would be a solution of the Maxwell equations. We can also derive the Maxwell equations themselves from an action principle, as we shall now show.

We begin by introducing the notion of Lagrangian density. This is a quantity that is integrated over a three-dimensional spatial volume (typically, all of 3-space) to give the Lagrangian:

$$L = \int \mathcal{L}\, d^3x . \tag{4.11}$$

Then, the Lagrangian is integrated over a time interval $t_1 \le t \le t_2$ to give the action,

$$S = \int_{t_1}^{t_2} L\, dt = \int \mathcal{L}\, d^4x . \tag{4.12}$$

Consider first the vacuum Maxwell equations without sources,

$$\partial_\mu F^{\mu\nu} = 0 , \qquad \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} + \partial_\rho F_{\mu\nu} = 0 . \tag{4.13}$$

We immediately solve the second equation (the Bianchi identity) by writing $F_{\mu\nu}$ in terms of a potential:

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{4.14}$$

Since the Maxwell field equations are linear in the fields, it is natural to expect that the action should be quadratic. In fact, it turns out that the first invariant we considered above provides the appropriate Lagrangian density. We take

$$\mathcal{L} = -\frac{1}{16\pi} F_{\mu\nu}F^{\mu\nu} , \tag{4.15}$$

and so the action will be

$$S = -\frac{1}{16\pi}\int F_{\mu\nu}F^{\mu\nu}\, d^4x . \tag{4.16}$$

We can now derive the source-free Maxwell equations by requiring that this action be stationary with respect to variations of the gauge field $A_\mu$. It must be emphasised that we treat $A_\mu$ as the fundamental field here.
The derivation goes as follows. We shall have

$$\begin{aligned}
\delta S &= -\frac{1}{16\pi}\int (\delta F_{\mu\nu}F^{\mu\nu} + F_{\mu\nu}\delta F^{\mu\nu})\, d^4x = -\frac{1}{8\pi}\int \delta F_{\mu\nu}F^{\mu\nu}\, d^4x \\
&= -\frac{1}{8\pi}\int F^{\mu\nu}(\partial_\mu \delta A_\nu - \partial_\nu \delta A_\mu)\, d^4x = -\frac{1}{4\pi}\int F^{\mu\nu}\partial_\mu \delta A_\nu\, d^4x \\
&= -\frac{1}{4\pi}\int \partial_\mu (F^{\mu\nu}\delta A_\nu)\, d^4x + \frac{1}{4\pi}\int (\partial_\mu F^{\mu\nu})\,\delta A_\nu\, d^4x \\
&= -\frac{1}{4\pi}\int_\Sigma F^{\mu\nu}\delta A_\nu\, d\Sigma_\mu + \frac{1}{4\pi}\int (\partial_\mu F^{\mu\nu})\,\delta A_\nu\, d^4x \\
&= \frac{1}{4\pi}\int (\partial_\mu F^{\mu\nu})\,\delta A_\nu\, d^4x .
\end{aligned} \tag{4.17}$$

Note that in the final steps, we have used the 4-dimensional analogue of the divergence theorem to turn the 4-volume integral of the divergence of a vector into a 3-volume integral over the bounding surface $\Sigma$. The next step is to say that this integral vanishes, because we restrict attention to variations $\delta A_\mu$ that vanish on $\Sigma$. Finally, we argue that if $\delta S$ is to vanish for all possible variations $\delta A_\mu$ (that vanish on $\Sigma$), it must be that

$$\partial_\mu F^{\mu\nu} = 0 . \tag{4.18}$$

Thus we have derived the source-free Maxwell field equation. Of course the Bianchi identity has already been taken care of by writing $F_{\mu\nu}$ in terms of the 4-vector potential $A_\mu$.

The action (4.16), whose variation gave the Maxwell field equation, is written in what is called second-order formalism; that is, the action is expressed in terms of the 4-vector potential $A_\mu$ as the fundamental field, with $F_{\mu\nu}$ just being a short-hand notation for $\partial_\mu A_\nu - \partial_\nu A_\mu$. It is sometimes convenient to use instead the first-order formalism, in which one treats $A_\mu$ and $F_{\mu\nu}$ as independent fields. In this formalism, the equation of motion coming from demanding that $S$ be stationary under variations of $F_{\mu\nu}$ will yield the equation $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. To do this, we need a different action as our starting point, namely

$$S_{\rm f.o.} = \frac{1}{4\pi}\int \left(\tfrac{1}{4}F^{\mu\nu}F_{\mu\nu} - F^{\mu\nu}\partial_\mu A_\nu\right) d^4x . \tag{4.19}$$

First, consider the variation of $F^{\mu\nu}$, now treated as an independent fundamental field. This gives

$$\begin{aligned}
\delta S_{\rm f.o.} &= \frac{1}{4\pi}\int \left(\tfrac{1}{2}F_{\mu\nu}\delta F^{\mu\nu} - \delta F^{\mu\nu}\partial_\mu A_\nu\right) d^4x \\
&= \frac{1}{4\pi}\int \left[\tfrac{1}{2}F_{\mu\nu}\delta F^{\mu\nu} - \tfrac{1}{2}\delta F^{\mu\nu}(\partial_\mu A_\nu - \partial_\nu A_\mu)\right] d^4x ,
\end{aligned} \tag{4.20}$$

where, in getting to the second line, we have used the fact that $F^{\mu\nu}$ is antisymmetric. The reason for doing this is that when we vary $F^{\mu\nu}$ we can take $\delta F^{\mu\nu}$ to be arbitrary, but it must still be antisymmetric. Thus it is helpful to force an explicit antisymmetrisation on the $\partial_\mu A_\nu$ that multiplies it, since the symmetric part automatically gives zero when contracted onto the antisymmetric $\delta F^{\mu\nu}$. Requiring $\delta S_{\rm f.o.} = 0$ for arbitrary $\delta F^{\mu\nu}$ then implies the integrand must vanish. This gives, as promised, the equation of motion

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{4.21}$$

Varying $S_{\rm f.o.}$ in (4.19) instead with respect to $A_\mu$, we get

$$\delta S_{\rm f.o.} = -\frac{1}{4\pi}\int F^{\mu\nu}\partial_\mu \delta A_\nu\, d^4x = \frac{1}{4\pi}\int (\partial_\mu F^{\mu\nu})\,\delta A_\nu\, d^4x , \tag{4.22}$$

and hence requiring that the variation of $S_{\rm f.o.}$ with respect to $A_\mu$ vanish gives the Maxwell field equation

$$\partial_\mu F^{\mu\nu} = 0 \tag{4.23}$$

again. Note that in this calculation, we immediately dropped the boundary term coming from the integration by parts, for the usual reason that we only allow variations that vanish on the boundary.

In practice, we shall usually use the previous, second-order, formalism.

4.3 Inclusion of sources

In general, the Maxwell field equation reads

$$\partial_\mu F^{\mu\nu} = -4\pi J^\nu . \tag{4.24}$$

So far, we have seen that by varying the second-order action (4.16) with respect to $A_\mu$, we obtain

$$\delta S = \frac{1}{4\pi}\int \partial_\mu F^{\mu\nu}\,\delta A_\nu\, d^4x . \tag{4.25}$$

To derive the Maxwell field equation with a source current $J^\mu$, we can simply add a term to the action, to give

$$S = \int \left(-\frac{1}{16\pi}F_{\mu\nu}F^{\mu\nu} + J^\mu A_\mu\right) d^4x . \tag{4.26}$$

Treating $J^\mu$ as independent of $A_\mu$, we therefore find

$$\delta S = \int \left(\frac{1}{4\pi}\partial_\mu F^{\mu\nu} + J^\nu\right)\delta A_\nu\, d^4x , \tag{4.27}$$

and so requiring $\delta S = 0$ gives the Maxwell field equation (4.24) with the source on the right-hand side.
The form of the source current $J^\mu$ depends, of course, on the details of the situation one is considering. One might simply have a situation where $J^\mu$ is an externally-supplied source field. Alternatively, the source $J^\mu$ might itself be given dynamically in terms of some charged matter fields, or in terms of a set of moving point charges. Let us consider this possibility in more detail.

If there is a single point charge $q$ at the location $\vec{r}_0$, then it will be described by the charge density

$$\rho = q\,\delta^3(\vec{r} - \vec{r}_0) , \tag{4.28}$$

where the three-dimensional delta-function $\delta^3(\vec{r})$, with $\vec{r} = (x, y, z)$, means

$$\delta^3(\vec{r}) = \delta(x)\delta(y)\delta(z) . \tag{4.29}$$

If the charge is moving, so that its location at time $t$ is at $\vec{r} = \vec{r}_0(t)$, then of course we shall have

$$\rho = q\,\delta^3(\vec{r} - \vec{r}_0(t)) . \tag{4.30}$$

The 3-vector current will be given by

$$\vec{J} = q\,\delta^3(\vec{r} - \vec{r}_0(t))\,\frac{d\vec{r}_0}{dt} , \tag{4.31}$$

and so the 4-current is

$$J^\mu = (\rho, \rho\vec{v}) , \qquad \text{where} \quad \vec{v} = \frac{d\vec{r}_0}{dt} , \tag{4.32}$$

and $\rho$ is given by (4.30). We can verify that this is the correct current vector, by checking that it properly satisfies the charge-conservation equation $\partial_\mu J^\mu = \partial\rho/\partial t + \partial_i J^i = 0$. Thus we have

$$\begin{aligned}
\frac{\partial\rho}{\partial t} &= q\,\frac{\partial}{\partial t}\delta^3(\vec{r} - \vec{r}_0(t)) = q\,\frac{\partial}{\partial x_0^i}\delta^3(\vec{r} - \vec{r}_0(t))\,\frac{dx_0^i}{dt} \\
&= -q\,\frac{\partial}{\partial x^i}\delta^3(\vec{r} - \vec{r}_0(t))\,\frac{dx_0^i}{dt} = -\partial_i\left(\rho\,\frac{dx_0^i}{dt}\right) \\
&= -\partial_i(\rho v^i) = -\partial_i J^i .
\end{aligned} \tag{4.33}$$

Note that we used the chain rule for differentiation in the first line, and that in getting to the second line we used the standard result for a delta-function that $\partial/\partial x\,\delta(x - y) = -\partial/\partial y\,\delta(x - y)$. It is also useful to note that we can write (4.32) as

$$J^\mu = \rho\,\frac{dx_0^\mu}{dt} , \tag{4.34}$$

where we simply define $x_0^\mu$ with $\mu = 0$ to be $t$.
Note that the integral $\int J^\mu A_\mu\, d^4x$ for the point charge gives a contribution to the action that is precisely of the form we saw in equation (2.77):

$$\begin{aligned}
\int J^\mu A_\mu\, d^4x &= \int q\,\delta^3(\vec{r} - \vec{r}_0)\,\frac{dx_0^\mu}{dt}\,A_\mu\, d^3x\, dt \\
&= q\int_{\rm path}\frac{dx^\mu}{dt}\,A_\mu\, dt = q\int_{\rm path}A_\mu\, dx^\mu .
\end{aligned} \tag{4.35}$$

Suppose now we have $N$ charges $q_a$, following paths $\vec{r}_a(t)$. Then the total charge density will be given by

$$\rho = \sum_{a=1}^{N} q_a\,\delta^3(\vec{r} - \vec{r}_a(t)) . \tag{4.36}$$

Since we have alluded several times to the fact that $\partial_\mu J^\mu = 0$ is the equation of charge conservation, it is appropriate to examine this in a little more detail. The total charge $Q$ at time $t_1$ is given by integrating the charge density over the spatial 3-volume:

$$Q(t_1) = \int_{t=t_1} J^0\, d\Sigma_0 , \qquad \text{where} \quad d\Sigma_0 = dx\,dy\,dz . \tag{4.37}$$

This can be written covariantly as

$$Q(t_1) = \int_{t=t_1} J^\mu\, d\Sigma_\mu , \tag{4.38}$$

where we define also

$$d\Sigma_1 = -dt\,dy\,dz , \qquad d\Sigma_2 = -dt\,dz\,dx , \qquad d\Sigma_3 = -dt\,dx\,dy . \tag{4.39}$$

Because the integral in (4.37) is defined to be over the 3-surface at constant $t$, it follows that the extra terms, for $\mu = 1, 2, 3$, in (4.38) do not contribute. If we now calculate the charge at a later time $t_2$, and then take the difference between the two charges, we will obtain

$$Q(t_2) - Q(t_1) = \int_\Sigma J^\mu\, d\Sigma_\mu , \tag{4.40}$$

where $\Sigma$ is the closed cylindrical 3-surface bounded by the "end caps" formed by the surfaces $t = t_1$ and $t = t_2$, and by the sides at spatial infinity. We are assuming the charges are confined to a finite region, and so the current $J^\mu$ is zero on the sides of the cylinder. By the 4-dimensional analogue of the divergence theorem we shall have

$$\int_\Sigma J^\mu\, d\Sigma_\mu = \int_V \partial_\mu J^\mu\, d^4x , \tag{4.41}$$

where $V$ is the 4-volume bounded by $\Sigma$. Thus we have

$$Q(t_2) - Q(t_1) = \int_V \partial_\mu J^\mu\, d^4x = 0 , \tag{4.42}$$
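since $\partial_\mu J^\mu = 0$. Thus we see that $\partial_\mu J^\mu = 0$ implies that the total charge in an isolated finite region is independent of time.

Note that the equation of charge conservation implies the gauge invariance of the action. We have

$$S = \int \left(-\frac{1}{16\pi}F_{\mu\nu}F^{\mu\nu} + J^\mu A_\mu\right) d^4x , \tag{4.43}$$

and so under a gauge transformation $A_\mu \to A_\mu + \partial_\mu\lambda$, we find

$$\begin{aligned}
S \longrightarrow{} & \int \left(-\frac{1}{16\pi}F_{\mu\nu}F^{\mu\nu} + J^\mu A_\mu\right) d^4x + \int J^\mu\partial_\mu\lambda\, d^4x \\
={} & S + \int J^\mu\partial_\mu\lambda\, d^4x = S - \int \lambda\,\partial_\mu J^\mu\, d^4x \\
={} & S .
\end{aligned} \tag{4.44}$$

Here we have integrated by parts, dropping the boundary term, and then used $\partial_\mu J^\mu = 0$.

4.4 Energy density and energy flux

Here, we review the calculation of energy density and energy flux in the 3-dimensional language. After that, we shall give the more elegant 4-dimensional description.

Consider the two Maxwell equations

$$\vec{\nabla}\times\vec{B} - \frac{\partial\vec{E}}{\partial t} = 4\pi\vec{J} , \qquad \vec{\nabla}\times\vec{E} + \frac{\partial\vec{B}}{\partial t} = 0 . \tag{4.45}$$

From these, we can deduce

$$\begin{aligned}
\vec{E}\cdot\frac{\partial\vec{E}}{\partial t} + \vec{B}\cdot\frac{\partial\vec{B}}{\partial t} &= \vec{E}\cdot(\vec{\nabla}\times\vec{B} - 4\pi\vec{J}) - \vec{B}\cdot(\vec{\nabla}\times\vec{E}) \\
&= \epsilon_{ijk}(E_i\partial_j B_k - B_i\partial_j E_k) - 4\pi\,\vec{J}\cdot\vec{E} \\
&= -\epsilon_{ijk}(B_i\partial_j E_k + E_k\partial_j B_i) - 4\pi\,\vec{J}\cdot\vec{E} \\
&= -\partial_j(\epsilon_{jki}E_k B_i) - 4\pi\,\vec{J}\cdot\vec{E} \\
&= -\vec{\nabla}\cdot(\vec{E}\times\vec{B}) - 4\pi\,\vec{J}\cdot\vec{E} .
\end{aligned} \tag{4.46}$$

We then define the Poynting vector

$$\vec{S} \equiv \frac{1}{4\pi}\vec{E}\times\vec{B} , \tag{4.47}$$

and so

$$\frac{1}{2}\frac{\partial}{\partial t}(E^2 + B^2) = -4\pi\,\vec{\nabla}\cdot\vec{S} - 4\pi\,\vec{J}\cdot\vec{E} , \tag{4.48}$$

since $\vec{E}\cdot\partial\vec{E}/\partial t = \tfrac{1}{2}\,\partial/\partial t\,(E^2)$, etc.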
We now assume that the $\vec{E}$ and $\vec{B}$ fields are confined to some finite region of space. Integrating (4.48) over all space, we obtain

$$\begin{aligned}
\int \vec{J}\cdot\vec{E}\, d^3x + \frac{1}{8\pi}\frac{d}{dt}\int (E^2 + B^2)\, d^3x &= -\int \vec{\nabla}\cdot\vec{S}\, d^3x \\
&= -\int_\Sigma \vec{S}\cdot d\vec{\Sigma} \\
&= 0 .
\end{aligned} \tag{4.49}$$

We get zero on the right-hand side because, having used the divergence theorem to convert it to an integral over $\Sigma$, the "sphere at infinity," the integral vanishes since $\vec{E}$ and $\vec{B}$, and hence $\vec{S}$, are assumed to vanish there. If the current $\vec{J}$ is assumed to be due to the motion of a set of charges $q_a$ with 3-velocities $\vec{v}_a$ and rest masses $m_a$, we shall have from (4.31) that

$$\int \vec{J}\cdot\vec{E}\, d^3x = \sum_a q_a\,\vec{v}_a\cdot\vec{E} = \frac{dE_{\rm mech}}{dt} , \tag{4.50}$$

where

$$E_{\rm mech} = \sum_a m_a\gamma_a \tag{4.51}$$

is the total mechanical energy for the set of particles, as defined in (3.7). Note that here

$$\gamma_a \equiv (1 - v_a^2)^{-1/2} . \tag{4.52}$$
Thus we conclude that

$$\frac{d}{dt}\left(E_{\rm mech} + \frac{1}{8\pi}\int (E^2 + B^2)\, d^3x\right) = 0 . \tag{4.53}$$

This is the equation of total energy conservation. It says that the sum of the total mechanical energy plus the energy contained in the electromagnetic fields is a constant. Thus we interpret

$$W \equiv \frac{1}{8\pi}(E^2 + B^2) \tag{4.54}$$

as the energy density of the electromagnetic field.

Returning now to equation (4.48), we can consider integrating it over just a finite volume $V$, bounded by a closed 2-surface $\Sigma$. We will have

$$\frac{d}{dt}\left(E_{\rm mech} + \int_V W\, d^3x\right) = -\int_\Sigma \vec{S}\cdot d\vec{\Sigma} . \tag{4.55}$$

We now know that the left-hand side should be interpreted as the rate of change of total energy in the volume $V$ and so clearly, since the total energy must be conserved, we should interpret the right-hand side as the flux of energy passing through the boundary surface $\Sigma$. Thus we see that the Poynting vector

$$\vec{S} = \frac{1}{4\pi}\vec{E}\times\vec{B} \tag{4.56}$$

is to be interpreted as the energy flux across the boundary; i.e. the energy per unit area per unit time.
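4.5 Energy-momentum tensor

The discussion above was presented within the 3-dimensional framework. In this section we shall give a 4-dimensional spacetime description, which involves the introduction of the energy-momentum tensor. We shall begin with a rather general introduction. In order to simplify this discussion, we shall first describe the construction of the energy-momentum tensor for a scalar field $\phi(x^\mu)$. When we then apply these ideas to electromagnetism, we shall need to make the rather simple generalisation to the case of a Lagrangian for the vector field $A_\mu(x^\nu)$.

We begin by considering a Lagrangian density $\mathcal{L}$ for the scalar field $\phi$. We shall assume that this depends on $\phi$, and on its first derivatives $\partial_\nu\phi$, but that it has no explicit dependence [9] on the spacetime coordinates $x^\mu$:

$$\mathcal{L} = \mathcal{L}(\phi, \partial_\nu\phi) . \tag{4.57}$$

The action is then given by

$$S = \int \mathcal{L}(\phi, \partial_\nu\phi)\, d^4x . \tag{4.58}$$

The Euler-Lagrange equations for the scalar field then follow from requiring that the action be stationary. Thus we have [10]

$$\begin{aligned}
\delta S &= \int \left(\frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi + \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\nu\delta\phi\right) d^4x \\
&= \int \left(\frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi - \partial_\nu\Big(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\Big)\delta\phi\right) d^4x + \int_\Sigma \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\delta\phi\, d\Sigma_\nu \\
&= \int \left(\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\nu\Big(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\Big)\right)\delta\phi\, d^4x ,
\end{aligned} \tag{4.60}$$

where, in getting to the last line, we have as usual dropped the surface term integrated over the boundary cylinder $\Sigma$, since we shall insist that $\delta\phi$ vanishes on $\Sigma$. Thus the requirement that $\delta S = 0$ for all such $\delta\phi$ implies the Euler-Lagrange equations

$$\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\nu\Big(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\Big) = 0 . \tag{4.61}$$

[9] This is the analogue of a Lagrangian in classical mechanics that depends on the coordinates $q_i$ and velocities $\dot{q}_i$, but which does not have explicit time dependence. Energy is conserved in a system described by such a Lagrangian.

[10] Note that $\partial\mathcal{L}/\partial\partial_\nu\phi$ means taking the partial derivative of $\mathcal{L}$ viewed as a function of $\phi$ and $\partial_\mu\phi$, with respect to $\partial_\nu\phi$. For example, if $\mathcal{L} = -\tfrac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi) + \tfrac{1}{2}m^2\phi^2$, then

$$\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi} = -(\partial^\mu\phi)\,\frac{\partial(\partial_\mu\phi)}{\partial\partial_\nu\phi} = -(\partial^\mu\phi)\,\delta^\nu_\mu = -\partial^\nu\phi . \tag{4.59}$$

Now consider the expression $\partial_\rho\mathcal{L} = \partial\mathcal{L}/\partial x^\rho$. Since we are assuming $\mathcal{L}$ has no explicit dependence on the spacetime coordinates, it follows that $\partial_\rho\mathcal{L}$ is given by the chain rule,

$$\partial_\rho\mathcal{L} = \frac{\partial\mathcal{L}}{\partial\phi}\,\partial_\rho\phi + \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\partial_\nu\phi . \tag{4.62}$$

Now, using the Euler-Lagrange equations (4.61), we can write this as

$$\begin{aligned}
\partial_\rho\mathcal{L} &= \partial_\nu\Big(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\Big)\partial_\rho\phi + \frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\nu\partial_\rho\phi \\
&= \partial_\nu\Big(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\phi\Big) ,
\end{aligned} \tag{4.63}$$

and thus we have

$$\partial_\nu\Big(\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\phi - \delta^\nu_\rho\,\mathcal{L}\Big) = 0 . \tag{4.64}$$

We are therefore led to define the 2-index tensor

$$T_\rho{}^\nu \equiv -\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\phi + \delta^\nu_\rho\,\mathcal{L} , \tag{4.65}$$

which then satisfies

$$\partial_\nu T_\rho{}^\nu = 0 . \tag{4.66}$$

$T^{\mu\nu}$ is called the energy-momentum tensor.

We saw previously that the equation $\partial_\mu J^\mu = 0$ for the 4-vector current density $J^\mu$ implies that there is a conserved charge

$$Q = \int_{t={\rm const}} J^0\, d\Sigma_0 = \int_{t={\rm const}} J^\mu\, d\Sigma_\mu , \tag{4.67}$$

where $d\Sigma_0 = dx\,dy\,dz$, etc. By an identical argument, it follows that the equation $\partial_\nu T_\rho{}^\nu = 0$ implies that there is a conserved 4-vector:

$$P^\mu \equiv \int_{t={\rm const}} T^{\mu 0}\, d\Sigma_0 = \int_{t={\rm const}} T^{\mu\nu}\, d\Sigma_\nu . \tag{4.68}$$

(Of course $T^{\mu\nu} = \eta^{\mu\rho}T_\rho{}^\nu$.) Thus we may check

$$\begin{aligned}
\frac{dP^\mu}{dt} &= \partial_0\int_{t={\rm const}} T^{\mu 0}\, d^3x = \int_{t={\rm const}} \partial_0 T^{\mu 0}\, d^3x = -\int_{t={\rm const}} \partial_i T^{\mu i}\, d^3x \\
&= -\int_S T^{\mu i}\, dS_i = 0 ,
\end{aligned} \tag{4.69}$$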
where in the last line we have used the divergence theorem to turn the integral into a 2-dimensional integral over the boundary sphere $S$ at infinity. This vanishes since we shall assume the fields are zero at infinity.

Notice that $T^{00} = -T_0{}^0$ and from (4.65) we therefore have

$$T^{00} = \frac{\partial\mathcal{L}}{\partial\partial_0\phi}\,\partial_0\phi - \mathcal{L} . \tag{4.70}$$

Now for a Lagrangian $L = L(q^i, \dot{q}^i)$ we have the canonical momentum $\pi_i = \partial L/\partial\dot{q}^i$, and the Hamiltonian

$$H = \pi_i\dot{q}^i - L . \tag{4.71}$$

Since there is no explicit time dependence, $H$ is conserved, and is equal to the total energy of the system. Comparing with (4.70), we can recognise that $T^{00}$ is the energy density. From (4.68) we therefore have that

$$P^0 = \int T^{00}\, d^3x \tag{4.72}$$

is the total energy. Since it is manifest from its construction that $P^\mu$ is a 4-vector, and since its 0 component is the energy, it follows that $P^\mu$ is the 4-momentum.

The essential point in the discussion above is that $P^\mu$ given in (4.68) should be conserved, which requires $\partial_\nu T_\rho{}^\nu = 0$. The quantity $T_\rho{}^\nu$ we constructed is not the unique tensor with this property. We can define a new one, according to

$$T_\rho{}^\nu \longrightarrow T_\rho{}^\nu + \partial_\sigma\psi_\rho{}^{\nu\sigma} , \tag{4.73}$$

where $\psi_\rho{}^{\nu\sigma}$ is an arbitrary tensor that is antisymmetric in its last two indices,

$$\psi_\rho{}^{\nu\sigma} = -\psi_\rho{}^{\sigma\nu} . \tag{4.74}$$

We shall take $\psi_\rho{}^{\nu\sigma}$ to vanish at spatial infinity. The antisymmetry implies, since partial derivatives commute, that

$$\partial_\nu\partial_\sigma\psi_\rho{}^{\nu\sigma} = 0 , \tag{4.75}$$

and hence that the modified energy-momentum tensor defined by (4.73) is conserved too. Furthermore, the modification to $T_\rho{}^\nu$ does not alter $P^\mu$, since, from (4.68), the extra term will be

$$\begin{aligned}
\int_{t={\rm const}} \partial_\sigma\psi^{\mu\nu\sigma}\, d\Sigma_\nu &= \int_{t={\rm const}} \partial_\sigma\psi^{\mu 0\sigma}\, d\Sigma_0 \\
&= \int_{t={\rm const}} \partial_i\psi^{\mu 0 i}\, d^3x \\
&= \int_S \psi^{\mu 0 i}\, dS_i = 0 ,
\end{aligned} \tag{4.76}$$
where $S$ is the sphere at spatial infinity. The modification to $P^\mu$ therefore vanishes since we are requiring that $\psi_\rho{}^{\nu\sigma}$ vanishes at spatial infinity.

The energy-momentum tensor can be pinned down uniquely by requiring that the four-dimensional angular momentum $M^{\mu\nu}$, defined by

$$M^{\mu\nu} = \int (x^\mu\, dP^\nu - x^\nu\, dP^\mu) , \tag{4.77}$$

be conserved. First, let us make a remark about angular momentum in four dimensions. In three dimensions, we define the angular momentum 3-vector as $\vec{L} = \vec{r}\times\vec{p}$. In other words,

$$L_i = \epsilon_{ijk}x^j p^k = \tfrac{1}{2}\epsilon_{ijk}(x^j p^k - x^k p^j) = \tfrac{1}{2}\epsilon_{ijk}M^{jk} , \tag{4.78}$$

where $M^{jk} \equiv x^j p^k - x^k p^j$. Thus taking $M^{\mu\nu} = x^\mu p^\nu - x^\nu p^\mu$ in four dimensions is a plausible-looking generalisation. It should be noted that in a general dimension, angular momentum is described by a 2-index antisymmetric tensor; in other words, angular momentum is associated with a rotation in a 2-dimensional plane. It is a very special feature of three dimensions that we can use the $\epsilon_{ijk}$ tensor to map the 2-index antisymmetric tensor $M^{jk}$ into the vector $L_i = \tfrac{1}{2}\epsilon_{ijk}M^{jk}$. Put another way, a very special feature of three dimensions is that a rotation in the $(x, y)$ plane can equivalently be described as a rotation about the orthogonal (i.e. $z$) axis. In higher dimensions, rotations do not occur around axes, but rather, in 2-planes. It is amusing, therefore, to try to imagine what the analogue of an axle is for a higher-dimensional being!

Getting back to our discussion of angular momentum and the energy-momentum tensor in four dimensions, we are defining

$$M^{\mu\nu} = \int (x^\mu\, dP^\nu - x^\nu\, dP^\mu) = \int (x^\mu T^{\nu\rho} - x^\nu T^{\mu\rho})\, d\Sigma_\rho . \tag{4.79}$$

By analogous arguments to those we used earlier, this will be conserved (i.e. $dM^{\mu\nu}/dt = 0$) if

$$\partial_\rho(x^\mu T^{\nu\rho} - x^\nu T^{\mu\rho}) = 0 . \tag{4.80}$$

Distributing the derivative, we therefore have the requirement that

$$\delta^\mu_\rho T^{\nu\rho} + x^\mu\partial_\rho T^{\nu\rho} - \delta^\nu_\rho T^{\mu\rho} - x^\nu\partial_\rho T^{\mu\rho} = 0 , \tag{4.81}$$

and hence, since $\partial_\rho T^{\mu\rho} = 0$, that $T^{\mu\nu}$ is symmetric,

$$T^{\mu\nu} = T^{\nu\mu} . \tag{4.82}$$

Using the freedom to add $\partial_\sigma\psi^{\mu\nu\sigma}$ to $T^{\mu\nu}$, as we discussed earlier, it is always possible to arrange for $T^{\mu\nu}$ to be symmetric. From now on, we shall assume that this is done.
We already saw that $P^\mu = \int T^{\mu 0}\, d^3x$ is the 4-momentum, so $T^{00}$ is the energy density, and $T^{i0}$ is the 3-momentum density. Let us now look at the conservation equation $\partial_\nu T^{\mu\nu} = 0$ in more detail. Taking $\mu = 0$, we have $\partial_\nu T^{0\nu} = 0$, or

$$\frac{\partial}{\partial t}T^{00} + \partial_j T^{0j} = 0 . \tag{4.83}$$

Integrating over a spatial 3-volume $V$ with boundary $S$, we therefore find

$$\frac{\partial}{\partial t}\int_V T^{00}\, d^3x = -\int_V \partial_j T^{0j}\, d^3x = -\int_S T^{0j}\, dS_j . \tag{4.84}$$

The left-hand side is the rate of change of field energy in the volume $V$, and so we can deduce, from energy conservation, that $T^{0j}$ is the energy flux 3-vector. But since we are now working with a symmetric energy-momentum tensor, we have that $T^{0j} = T^{j0}$, and we already identified $T^{j0}$ as the 3-momentum density. Thus we have that

$$\text{energy flux} = \text{momentum density} . \tag{4.85}$$

From the $\mu = i$ components of $\partial_\nu T^{\mu\nu} = 0$, we have

$$\frac{\partial}{\partial t}T^{i0} + \partial_j T^{ij} = 0 , \tag{4.86}$$

and so, integrating over the 3-volume $V$, we get

$$\frac{\partial}{\partial t}\int_V T^{i0}\, d^3x = -\int_V \partial_j T^{ij}\, d^3x = -\int_S T^{ij}\, dS_j . \tag{4.87}$$

The left-hand side is the rate of change of 3-momentum, and so we deduce that $T^{ij}$ is the 3-tensor of momentum flux density. It gives the $i$ component of 3-momentum that flows, per unit time, through the 2-surface perpendicular to the $x^j$ axis. $T^{ij}$ is sometimes called the 3-dimensional stress tensor.

4.6 Energy-momentum tensor for the electromagnetic field

Recall that for a scalar field $\phi$, the original construction of the energy-momentum tensor $T_\rho{}^\nu$ (which we later modified by adding $\partial_\sigma\psi_\rho{}^{\nu\sigma}$ where $\psi_\rho{}^{\nu\sigma} = -\psi_\rho{}^{\sigma\nu}$) was given by

$$T_\rho{}^\nu = -\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi}\,\partial_\rho\phi + \delta^\nu_\rho\,\mathcal{L} . \tag{4.88}$$

If we have a set of $N$ scalar fields $\phi_a$, then it is easy to see that the analogous conserved tensor is

$$T_\rho{}^\nu = -\sum_{a=1}^{N}\frac{\partial\mathcal{L}}{\partial\partial_\nu\phi_a}\,\partial_\rho\phi_a + \delta^\nu_\rho\,\mathcal{L} . \tag{4.89}$$
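A similar calculation shows that if we consider instead a vector field $A_\sigma$, with Lagrangian density $\mathcal{L}(A_\sigma, \partial_\nu A_\sigma)$, the construction will give a conserved energy-momentum tensor

$$T_\rho{}^\nu = -\frac{\partial\mathcal{L}}{\partial\partial_\nu A_\sigma}\,\partial_\rho A_\sigma + \delta^\nu_\rho\,\mathcal{L} . \tag{4.90}$$

Let us apply this to the Lagrangian density for pure electrodynamics (without sources),

$$\mathcal{L} = -\frac{1}{16\pi}F_{\mu\nu}F^{\mu\nu} . \tag{4.91}$$

We have

$$\delta\mathcal{L} = -\frac{1}{8\pi}F^{\mu\nu}\delta F_{\mu\nu} = -\frac{1}{4\pi}F^{\mu\nu}\partial_\mu\delta A_\nu , \tag{4.92}$$

and so

$$\frac{\partial\mathcal{L}}{\partial\partial_\mu A_\nu} = -\frac{1}{4\pi}F^{\mu\nu} . \tag{4.93}$$

Thus from (4.90) we find

$$T_\rho{}^\nu = \frac{1}{4\pi}F^{\nu\sigma}\partial_\rho A_\sigma - \frac{1}{16\pi}\delta^\nu_\rho\,F_{\sigma\lambda}F^{\sigma\lambda} , \tag{4.94}$$

and so

$$T^{\mu\nu} = \frac{1}{4\pi}F^{\nu\sigma}\partial^\mu A_\sigma - \frac{1}{16\pi}\eta^{\mu\nu}F_{\sigma\lambda}F^{\sigma\lambda} . \tag{4.95}$$

This expression is not symmetric in $\mu$ and $\nu$. However, following our previous discussion, we can add a term $\partial_\sigma\psi^{\mu\nu\sigma}$ to it, where $\psi^{\mu\nu\sigma} = -\psi^{\mu\sigma\nu}$, without upsetting the conservation condition $\partial_\nu T^{\mu\nu} = 0$. Specifically, we shall choose to add

$$\begin{aligned}
\partial_\sigma\psi^{\mu\nu\sigma} &= -\frac{1}{4\pi}\partial_\sigma(A^\mu F^{\nu\sigma}) \\
&= -\frac{1}{4\pi}(\partial_\sigma A^\mu)F^{\nu\sigma} - \frac{1}{4\pi}A^\mu\partial_\sigma F^{\nu\sigma} = -\frac{1}{4\pi}(\partial_\sigma A^\mu)F^{\nu\sigma}
\end{aligned} \tag{4.96}$$

(the $\partial_\sigma F^{\nu\sigma}$ term drops as a consequence of the source-free field equation). This leads to the new energy-momentum tensor

$$T^{\mu\nu} = \frac{1}{4\pi}F^{\nu\sigma}(\partial^\mu A_\sigma - \partial_\sigma A^\mu) - \frac{1}{16\pi}\eta^{\mu\nu}F_{\sigma\lambda}F^{\sigma\lambda} , \tag{4.97}$$

or, in other words,

$$T^{\mu\nu} = \frac{1}{4\pi}\left(F^\mu{}_\sigma F^{\nu\sigma} - \tfrac{1}{4}\eta^{\mu\nu}F_{\sigma\lambda}F^{\sigma\lambda}\right) . \tag{4.98}$$

This is indeed manifestly symmetric in $\mu$ and $\nu$. From now on, it will be understood when we speak of the energy-momentum tensor for electrodynamics that this is the one we mean.

It is a straightforward exercise to verify directly, using the source-free Maxwell field equation and the Bianchi identity, that indeed $T^{\mu\nu}$ given by (4.98) is conserved, $\partial_\nu T^{\mu\nu} = 0$. Note that it has another simple property, namely that it is trace-free, in the sense that

$$\eta_{\mu\nu}T^{\mu\nu} = 0 . \tag{4.99}$$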
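This is easily seen from (4.98), as a consequence of the fact that $\eta^{\mu\nu}\eta_{\mu\nu} = 4$ in four dimensions. The trace-free property is related to a special feature of the Maxwell equations in four dimensions, known as conformal invariance.

Having obtained the energy-momentum tensor (4.98) for the electromagnetic field, it is instructive to look at its components from the three-dimensional point of view. First, recall that we showed earlier that

$$F_{\sigma\lambda}F^{\sigma\lambda} = 2(B^2 - E^2) . \tag{4.100}$$

Then, we find

$$\begin{aligned}
T^{00} &= \frac{1}{4\pi}\left(F^0{}_\sigma F^{0\sigma} - \tfrac{1}{4}\eta^{00}F_{\sigma\lambda}F^{\sigma\lambda}\right) \\
&= \frac{1}{4\pi}\left(F^{0i}F^{0i} + \tfrac{1}{2}B^2 - \tfrac{1}{2}E^2\right) \\
&= \frac{1}{4\pi}\left(E^2 + \tfrac{1}{2}B^2 - \tfrac{1}{2}E^2\right) \\
&= \frac{1}{8\pi}(E^2 + B^2) .
\end{aligned} \tag{4.101}$$

Thus $T^{00}$ is equal to the energy density $W$ that we introduced in (4.54).

Now consider $T^{0i}$. Since $\eta^{0i} = 0$, we have

$$\begin{aligned}
T^{0i} &= \frac{1}{4\pi}F^0{}_\sigma F^{i\sigma} = \frac{1}{4\pi}F^0{}_j F^{ij} \\
&= \frac{1}{4\pi}E_j\,\epsilon_{ijk}B_k = S_i ,
\end{aligned} \tag{4.102}$$

where $\vec{S} = 1/(4\pi)\,\vec{E}\times\vec{B}$ is the Poynting vector introduced in (4.47). Thus $T^{0i}$ is the energy flux. As we remarked earlier, since we now have $T^{0i} = T^{i0}$, it can be equivalently interpreted as the 3-momentum density vector.

Finally, we consider the components $T^{ij}$. We have

$$\begin{aligned}
T^{ij} &= \frac{1}{4\pi}\left(F^i{}_\sigma F^{j\sigma} - \tfrac{1}{4}\eta^{ij}\,2(B^2 - E^2)\right) \\
&= \frac{1}{4\pi}\left(F^i{}_0 F^{j0} + F^i{}_k F^{jk} - \tfrac{1}{2}\delta_{ij}(B^2 - E^2)\right) \\
&= \frac{1}{4\pi}\left(-E_iE_j + \epsilon_{ik\ell}\,\epsilon_{jkm}B_\ell B_m - \tfrac{1}{2}\delta_{ij}(B^2 - E^2)\right) \\
&= \frac{1}{4\pi}\left(-E_iE_j + \delta_{ij}B^2 - B_iB_j - \tfrac{1}{2}\delta_{ij}(B^2 - E^2)\right) \\
&= \frac{1}{4\pi}\left(-E_iE_j - B_iB_j + \tfrac{1}{2}\delta_{ij}(E^2 + B^2)\right) .
\end{aligned} \tag{4.103}$$

To summarise, we have

$$T^{\mu\nu} = \begin{pmatrix} T^{00} & T^{0j} \\ T^{i0} & \sigma_{ij} \end{pmatrix} = \begin{pmatrix} W & S_j \\ S_i & \sigma_{ij} \end{pmatrix} , \tag{4.104}$$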
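where $W$ and $\vec{S}$ are the energy density and Poynting flux,

$$W = \frac{1}{8\pi}(E^2 + B^2) , \qquad \vec{S} = \frac{1}{4\pi}\vec{E}\times\vec{B} , \tag{4.105}$$

and

$$\sigma_{ij} \equiv \frac{1}{4\pi}(-E_iE_j - B_iB_j) + W\delta_{ij} . \tag{4.106}$$

Remarks

• Unless $\vec{E}$ and $\vec{B}$ are perpendicular and equal in magnitude, we can always choose a Lorentz frame where $\vec{E}$ and $\vec{B}$ are parallel at a point. (In the case that $\vec{E}$ and $\vec{B}$ are perpendicular (but unequal in magnitude), one or other of $\vec{E}$ or $\vec{B}$ will be zero, at the point, in the new Lorentz frame.) Let the direction of $\vec{E}$ and $\vec{B}$ then be along $z$:

$$\vec{E} = (0, 0, E) , \qquad \vec{B} = (0, 0, B) . \tag{4.107}$$

Then we have $\vec{S} = 1/(4\pi)\,\vec{E}\times\vec{B} = 0$ and

$$\sigma_{11} = \sigma_{22} = W , \qquad \sigma_{33} = -W , \qquad \sigma_{ij} = 0 \ \text{otherwise} , \tag{4.108}$$

and so $T^{\mu\nu}$ is diagonal, given by

$$T^{\mu\nu} = \begin{pmatrix} W & 0 & 0 & 0 \\ 0 & W & 0 & 0 \\ 0 & 0 & W & 0 \\ 0 & 0 & 0 & -W \end{pmatrix} , \tag{4.109}$$

with $W = 1/(8\pi)(E^2 + B^2)$.

• If $\vec{E}$ and $\vec{B}$ are perpendicular and $|\vec{E}| = |\vec{B}|$ at a point, then at that point we can choose axes so that

$$\vec{E} = (E, 0, 0) , \qquad \vec{B} = (0, B, 0) = (0, E, 0) . \tag{4.110}$$

Then we have

$$W = \frac{1}{4\pi}E^2 , \qquad \vec{S} = (0, 0, W) , \qquad \sigma_{11} = \sigma_{22} = 0 , \qquad \sigma_{33} = W , \qquad \sigma_{ij} = 0 \ \text{otherwise} , \tag{4.111}$$

and therefore $T^{\mu\nu}$ is given by

$$T^{\mu\nu} = \begin{pmatrix} W & 0 & 0 & W \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ W & 0 & 0 & W \end{pmatrix} . \tag{4.112}$$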
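The component identifications above can be checked numerically. The following is a minimal sketch rather than part of the original notes: it assumes the metric signature $(-,+,+,+)$ and the component conventions $F^{0i} = E_i$, $F^{ij} = \epsilon_{ijk}B_k$ used in the calculation of $T^{00}$, $T^{0i}$ and $T^{ij}$, and the function names are purely illustrative.

```python
import math

# Minkowski metric with signature (-,+,+,+); numerically eta^{mu nu} = eta_{mu nu}.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def field_strength_upper(E, B):
    """F^{mu nu} with F^{0i} = E_i and F^{ij} = eps_{ijk} B_k (assumed conventions)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, Ex, Ey, Ez],
            [-Ex, 0.0, Bz, -By],
            [-Ey, -Bz, 0.0, Bx],
            [-Ez, By, -Bx, 0.0]]


def stress_energy(E, B):
    """T^{mu nu} = (1/4pi)(F^mu_sigma F^{nu sigma} - (1/4) eta^{mu nu} F_{sl} F^{sl})."""
    idx = range(4)
    F = field_strength_upper(E, B)
    # Lower the second index: F^mu_sigma = F^{mu rho} eta_{rho sigma}.
    F_mixed = [[sum(F[m][r] * ETA[r][s] for r in idx) for s in idx] for m in idx]
    # Lower both indices: F_{sl} = eta_{sa} F^{ab} eta_{bl}.
    F_lower = [[sum(ETA[s][a] * F[a][b] * ETA[b][l] for a in idx for b in idx)
                for l in idx] for s in idx]
    invariant = sum(F_lower[s][l] * F[s][l] for s in idx for l in idx)
    return [[(sum(F_mixed[m][s] * F[n][s] for s in idx)
              - 0.25 * ETA[m][n] * invariant) / (4.0 * math.pi)
             for n in idx] for m in idx]


def trace(T):
    """eta_{mu nu} T^{mu nu}; should vanish for the electromagnetic field."""
    return sum(ETA[m][m] * T[m][m] for m in range(4))


if __name__ == "__main__":
    T = stress_energy((3.0, 0.0, 0.0), (0.0, 3.0, 0.0))  # E perpendicular to B, |E| = |B|
    for row in T:
        print(["%.4f" % x for x in row])
    print("trace:", trace(T))
```

For parallel fields along $z$ the result should be diagonal, as in (4.109), and for perpendicular fields of equal magnitude only the $00$, $03$, $30$ and $33$ entries should be non-zero, as in (4.112).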
4.7 Inclusion of massive charged particles

We now consider the energy-momentum tensor for a particle with rest mass $m$. We proceed by analogy with the construction of the 4-current density $J^\mu$ for charged non-interacting particles. Thus we define first a mass density, $\varepsilon$, for a point mass $m$ located at $\vec{r} = \vec{r}_0(t)$. This will simply be given by a 3-dimensional delta function, with strength $m$, located at the instantaneous position of the mass point:

$$\varepsilon = m\,\delta^3(\vec{r} - \vec{r}_0(t)) . \tag{4.113}$$

The energy density $T^{00}$ for the particle will then be its mass density times the corresponding $\gamma$ factor, where $\gamma = (1 - v^2)^{-1/2}$, and $\vec{v} = d\vec{r}_0(t)/dt$ is the velocity of the particle. Since the coordinate time $t$ and the proper time $\tau$ in the frame of the particle are related, as usual, by $dt = \gamma\,d\tau$, we then have

$$T^{00} = \varepsilon\,\frac{dt}{d\tau} . \tag{4.114}$$

The 3-momentum density will be

$$T^{0i} = \varepsilon\gamma\,\frac{dx^i}{dt} = \varepsilon\,\frac{dt}{d\tau}\frac{dx^i}{dt} . \tag{4.115}$$

We can therefore write

$$T^{0\nu} = \varepsilon\,\frac{dt}{d\tau}\frac{dx^\nu}{dt} = \varepsilon\,\frac{dx^0}{d\tau}\frac{dx^\nu}{dt} . \tag{4.116}$$

On general grounds of Lorentz covariance, it must therefore be that

$$\begin{aligned}
T^{\mu\nu} &= \varepsilon\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{dt} \\
&= \varepsilon\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}\frac{d\tau}{dt} .
\end{aligned} \tag{4.117}$$

By writing it as we have done in the second line here, it becomes manifest that $T^{\mu\nu}$ for the particle is symmetric in $\mu$ and $\nu$.

Consider now a system consisting of a particle with mass $m$ and charge $q$, moving in an electromagnetic field. Clearly, since the particle interacts with the field, we should not expect either the energy-momentum tensor (4.98) for the electromagnetic field or the energy-momentum tensor (4.117) for the particle to be conserved separately. This is because energy and momentum are being exchanged between the particle and the field. We can expect, however, that the total energy-momentum tensor for the system, i.e. the sum of (4.98) and (4.117), will be conserved.

In order to distinguish clearly between the various energy-momentum tensors, let us define

$$T^{\mu\nu}_{\rm tot.} = T^{\mu\nu}_{\rm e.m.} + T^{\mu\nu}_{\rm part.} , \tag{4.118}$$
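where $T^{\mu\nu}_{\rm e.m.}$ and $T^{\mu\nu}_{\rm part.}$ are the energy-momentum tensors for the electromagnetic field and the particle respectively:

$$\begin{aligned}
T^{\mu\nu}_{\rm e.m.} &= \frac{1}{4\pi}\left(F^\mu{}_\sigma F^{\nu\sigma} - \tfrac{1}{4}\eta^{\mu\nu}F_{\sigma\lambda}F^{\sigma\lambda}\right) , \\
T^{\mu\nu}_{\rm part.} &= \varepsilon\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{dt} ,
\end{aligned} \tag{4.119}$$

where $\varepsilon = m\,\delta^3(\vec{r} - \vec{r}_0(t))$.

Consider $T^{\mu\nu}_{\rm e.m.}$ first. Taking the divergence, we find

$$\begin{aligned}
\partial_\nu T^{\mu\nu}_{\rm e.m.} &= \frac{1}{4\pi}\left(\partial_\nu F^\mu{}_\sigma\,F^{\nu\sigma} + F^\mu{}_\sigma\,\partial_\nu F^{\nu\sigma} - \tfrac{1}{2}F^{\sigma\lambda}\partial^\mu F_{\sigma\lambda}\right) \\
&= \frac{1}{4\pi}\left(\partial_\nu F^\mu{}_\sigma\,F^{\nu\sigma} + F^\mu{}_\sigma\,\partial_\nu F^{\nu\sigma} + \tfrac{1}{2}F^{\sigma\lambda}\partial_\sigma F_\lambda{}^\mu + \tfrac{1}{2}F^{\sigma\lambda}\partial_\lambda F^\mu{}_\sigma\right) \\
&= \frac{1}{4\pi}\left(\partial_\nu F^\mu{}_\sigma\,F^{\nu\sigma} - \tfrac{1}{2}F^{\sigma\lambda}\partial_\sigma F^\mu{}_\lambda - \tfrac{1}{2}F^{\lambda\sigma}\partial_\lambda F^\mu{}_\sigma + F^\mu{}_\sigma\,\partial_\nu F^{\nu\sigma}\right) \\
&= \frac{1}{4\pi}F^\mu{}_\sigma\,\partial_\nu F^{\nu\sigma} \\
&= -F^\mu{}_\nu J^\nu .
\end{aligned} \tag{4.120}$$

In getting to the second line we used the Bianchi identity on the last term in the top line. The third line is obtained by swapping indices on a field strength in the terms with the $\tfrac{1}{2}$ factors, and this reveals that all except one term cancel, leading to the result. As expected, the energy-momentum tensor for the electromagnetic field by itself is not conserved when there are sources.

Now we want to show that this non-conservation is balanced by an equal and opposite non-conservation for the energy-momentum tensor of the particle, which is given in (4.119). We have

$$\partial_\nu T^{\mu\nu}_{\rm part.} = \partial_\nu\Big(\varepsilon\,\frac{dx^\nu}{dt}\Big)\frac{dx^\mu}{d\tau} + \varepsilon\,\frac{dx^\nu}{dt}\,\partial_\nu\Big(\frac{dx^\mu}{d\tau}\Big) . \tag{4.121}$$

The first term is zero. This can be seen from the fact that the calculation is identical to the one which we used a while back in section 4.3 to show that the 4-current $J^\mu = \rho\,dx^\mu/dt$ for a charged particle is conserved. Thus we have

$$\begin{aligned}
\partial_\nu T^{\mu\nu}_{\rm part.} &= \varepsilon\,\frac{dx^\nu}{dt}\,\partial_\nu\Big(\frac{dx^\mu}{d\tau}\Big) = \varepsilon\,\frac{dx^\nu}{dt}\,\partial_\nu U^\mu \\
&= \varepsilon\,\frac{dU^\mu}{dt} .
\end{aligned} \tag{4.122}$$

By the Lorentz force equation $m\,dU^\mu/d\tau = q\,F^\mu{}_\nu U^\nu$, we have

$$\varepsilon\,\frac{dU^\mu}{d\tau} = \rho\,F^\mu{}_\nu U^\nu = \rho\,F^\mu{}_\nu\frac{dx^\nu}{d\tau} , \tag{4.123}$$

and so

$$\varepsilon\,\frac{dU^\mu}{dt} = \rho\,F^\mu{}_\nu\frac{dx^\nu}{dt} = F^\mu{}_\nu J^\nu , \tag{4.124}$$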
since $J^\mu = \rho\,dx^\mu/dt$. Thus we conclude that

$$\partial_\nu T^{\mu\nu}_{\rm part.} = F^\mu{}_\nu J^\nu , \tag{4.125}$$

and so, combining this with (4.120), we conclude that the total energy-momentum tensor for the particle plus electromagnetic field, defined in (4.118), is conserved,

$$\partial_\nu T^{\mu\nu}_{\rm tot.} = 0 . \tag{4.126}$$

5 Coulomb's Law

5.1 Potential of a point charge

Consider first a static point charge, for which the Maxwell equations therefore reduce to

$$\vec{\nabla}\times\vec{E} = 0 , \qquad \vec{\nabla}\cdot\vec{E} = 4\pi\rho . \tag{5.1}$$

The first equation implies, of course, that we can write

$$\vec{E} = -\vec{\nabla}\phi , \tag{5.2}$$

and then the second equation implies that $\phi$ satisfies the Poisson equation

$$\nabla^2\phi = -4\pi\rho . \tag{5.3}$$

If the point charge is located at the origin, and the charge is $e$, then the charge density $\rho$ is given by

$$\rho = e\,\delta^3(\vec{r}) . \tag{5.4}$$

Away from the origin, (5.3) implies that $\phi$ should satisfy the Laplace equation,

$$\nabla^2\phi = 0 , \qquad r > 0 . \tag{5.5}$$

Since the charge density (5.4) is spherically symmetric, we can assume that $\phi$ will be spherically symmetric too, $\phi(\vec{r}) = \phi(r)$, where $r = |\vec{r}|$. From $r^2 = x^j x^j$ we deduce, by acting with $\partial_i$, that

$$\partial_i r = \frac{x_i}{r} . \tag{5.6}$$

From this it follows by the chain rule that

$$\partial_i\phi = \phi'\,\partial_i r = \phi'\,\frac{x_i}{r} , \tag{5.7}$$
where $\phi' \equiv d\phi/dr$, and hence

$$\begin{aligned}
\nabla^2\phi = \partial_i\partial_i\phi = \partial_i\Big(\phi'\,\frac{x_i}{r}\Big) &= \phi''\,\frac{x_i}{r}\frac{x_i}{r} + \phi'\,\frac{\partial_i x_i}{r} + \phi'\,x_i\,\partial_i\frac{1}{r} \\
&= \phi'' + \frac{2}{r}\,\phi' .
\end{aligned} \tag{5.8}$$

Thus the Laplace equation (5.5) can be written as

$$(r^2\phi')' = 0 , \qquad r > 0 , \tag{5.9}$$

which integrates to give

$$\phi = \frac{q}{r} , \tag{5.10}$$

where $q$ is a constant, and we have dropped an additive constant of integration by using the gauge freedom to choose $\phi(\infty) = 0$.

To determine the constant $q$, we integrate the Poisson equation (5.3) over the interior $V_R$ of a sphere of radius $R$ centred on the origin, and use the divergence theorem:

$$\begin{aligned}
\int_{V_R}\nabla^2\phi\, d^3x &= -4\pi e\int_{V_R}\delta^3(\vec{r})\, d^3x = -4\pi e \\
&= \int_{S_R}\vec{\nabla}\phi\cdot d\vec{S} = -q\int_{S_R}\frac{x_i}{r^3}\, dS_i = -q\int_{S_R}\frac{n_i\, dS_i}{R^2} ,
\end{aligned} \tag{5.11}$$

where $S_R$ is the surface of the sphere of radius $R$ that bounds the volume $V_R$, and $n_i \equiv x_i/r$ is the outward-pointing unit vector. Clearly we have

$$n_i\, dS_i = R^2\, d\Omega , \tag{5.12}$$

where $d\Omega$ is the area element on the unit-radius sphere, and so

$$-q\int_{S_R}\frac{n_i\, dS_i}{R^2} = -q\int d\Omega = -4\pi q , \tag{5.13}$$

and so we conclude that $q$ is equal to $e$, the charge on the point charge at $r = 0$.

Note that if the point charge $e$ were located at $\vec{r}\,'$, rather than at the origin, then by trivially translating the coordinate system we will have the potential

$$\phi(\vec{r}) = \frac{e}{|\vec{r} - \vec{r}\,'|} , \tag{5.14}$$

and this will satisfy

$$\nabla^2\phi = -4\pi e\,\delta^3(\vec{r} - \vec{r}\,') . \tag{5.15}$$
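Both properties of the Coulomb potential used above can be checked numerically: that it satisfies the Laplace equation away from the charge, and that the flux of $\vec{\nabla}\phi$ through a sphere enclosing the charge is $-4\pi e$. The following is a minimal sketch using central finite differences and a midpoint-rule surface integral; the function names, step sizes and grid resolutions are illustrative assumptions rather than anything prescribed by the notes.

```python
import math


def phi(x, y, z, e=1.0, src=(0.0, 0.0, 0.0)):
    """Coulomb potential e/|r - r'| of a point charge e at src (Gaussian units)."""
    return e / math.dist((x, y, z), src)


def laplacian(f, x, y, z, h=1e-3):
    """Central-difference estimate of the Laplacian of f at (x, y, z)."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / h**2


def flux_of_gradient(f, R, n_theta=100, n_phi=200, h=1e-5):
    """Midpoint-rule estimate of the surface integral of grad(f).dS over |r| = R."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            ang = (j + 0.5) * d_phi
            n = (math.sin(theta) * math.cos(ang),
                 math.sin(theta) * math.sin(ang),
                 math.cos(theta))
            # Radial derivative n.grad(f), by central differences along n.
            outer = f(*[(R + h) * c for c in n])
            inner = f(*[(R - h) * c for c in n])
            total += (outer - inner) / (2.0 * h) * R * R * math.sin(theta) * d_theta * d_phi
    return total


if __name__ == "__main__":
    print(laplacian(phi, 1.0, 2.0, 2.0))   # close to zero away from the charge
    print(flux_of_gradient(phi, 1.5))      # close to -4*pi for e = 1
    print(-4.0 * math.pi)
```

The flux should be independent of the radius $R$, and of where inside the sphere the charge sits, in line with (5.11) and (5.15).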
5.2 Electrostatic energy

In general, the energy density of an electromagnetic field is given by $W = 1/(8\pi)(E^2 + B^2)$. A purely electrostatic system therefore has a field energy $U$ given by

$$\begin{aligned}
U = \int W\, d^3x &= \frac{1}{8\pi}\int E^2\, d^3x \\
&= -\frac{1}{8\pi}\int \vec{E}\cdot\vec{\nabla}\phi\, d^3x \\
&= -\frac{1}{8\pi}\int \vec{\nabla}\cdot(\vec{E}\,\phi)\, d^3x + \frac{1}{8\pi}\int (\vec{\nabla}\cdot\vec{E})\,\phi\, d^3x \\
&= -\frac{1}{8\pi}\int_S \vec{E}\,\phi\cdot d\vec{S} + \frac{1}{2}\int \rho\,\phi\, d^3x \\
&= \frac{1}{2}\int \rho\,\phi\, d^3x .
\end{aligned} \tag{5.16}$$

Note that the surface integral over the sphere at infinity gives zero because the electric field is assumed to die away to zero there. Thus we conclude that the electrostatic field energy is given by

$$U = \frac{1}{2}\int \rho\,\phi\, d^3x . \tag{5.17}$$

We can apply this formula to a system of $N$ charges $q_a$, located at points $\vec{r}_a$, for which we shall have

$$\rho = \sum_{a=1}^{N} q_a\,\delta^3(\vec{r} - \vec{r}_a) . \tag{5.18}$$

However, a naive application of (5.17) would give nonsense, since we find

$$U = \frac{1}{2}\sum_{a=1}^{N} q_a\int \delta^3(\vec{r} - \vec{r}_a)\,\phi(\vec{r})\, d^3x = \frac{1}{2}\sum_{a=1}^{N} q_a\,\phi(\vec{r}_a) , \tag{5.19}$$

where $\phi(\vec{r})$ is given by (5.14),

$$\phi(\vec{r}) = \sum_{a=1}^{N}\frac{q_a}{|\vec{r} - \vec{r}_a|} . \tag{5.20}$$

This means that (5.19) will give infinity since $\phi(\vec{r})$, not unreasonably, diverges at the location of each point charge. This is the classic "self-energy" problem, which one encounters even for a single point charge. There is no totally satisfactory way around this in classical electromagnetism, and so one has to adopt a "fudge." The fudge consists of observing that the true self-energy of a charge, whatever that might mean, is a constant. Naively, it appears to be an infinite constant, but that is clearly the result of making the idealised assumption that the charge is literally located at a single point. In any case, one can argue that the constant self-energy will not be observable, as far as energy-conservation considerations are concerned, and so
one might as well just drop it for now. Thus the way to make sense of the ostensibly divergent energy (5.19) for the system of point charges is to replace $\phi(\vec{r}_a)$, which means the potential at $\vec{r} = \vec{r}_a$ due to all the charges, by $\phi_a$, which is defined to be the potential at $\vec{r} = \vec{r}_a$ due to all the charges except the charge $q_a$ that is itself located at $\vec{r} = \vec{r}_a$. Thus we have

$$\phi_a \equiv \sum_{b\ne a}\frac{q_b}{|\vec{r}_a - \vec{r}_b|} , \tag{5.21}$$

and so (5.19) is now interpreted to mean that the total energy of the system of charges is

$$U = \frac{1}{2}\sum_a\sum_{b\ne a}\frac{q_a q_b}{|\vec{r}_a - \vec{r}_b|} . \tag{5.22}$$

5.3 Field of a uniformly moving charge

Suppose a charge $e$ is moving with uniform velocity $\vec{v}$ in the Lorentz frame $S$. We may transform to a frame $S'$, moving with velocity $\vec{v}$ relative to $S$, in which the charge is at rest. For convenience, we shall choose the origin of axes so that the charge is located at the origin of the frame $S'$. It follows that in the frame $S'$, the field due to the charge can be described purely by the electric scalar potential $\phi'$:

$$\text{In } S': \qquad \phi' = \frac{e}{r'} , \qquad \vec{A}' = 0 . \tag{5.23}$$

(Note that the primes here all signify that the quantities are those of the primed frame $S'$.)

We know that $A^\mu = (\phi, \vec{A})$ is a 4-vector, and so the components $A^\mu$ transform under Lorentz boosts in exactly the same way as the components of $x^\mu$. Thus we shall have

$$\phi' = \gamma(\phi - \vec{v}\cdot\vec{A}) , \qquad \vec{A}' = \vec{A} + \frac{\gamma - 1}{v^2}(\vec{v}\cdot\vec{A})\,\vec{v} - \gamma\vec{v}\,\phi , \tag{5.24}$$

where $\gamma = (1 - v^2)^{-1/2}$. Clearly the inverse Lorentz transformation is obtained by sending $\vec{v} \to -\vec{v}$, and so we shall have

$$\phi = \gamma(\phi' + \vec{v}\cdot\vec{A}') , \qquad \vec{A} = \vec{A}' + \frac{\gamma - 1}{v^2}(\vec{v}\cdot\vec{A}')\,\vec{v} + \gamma\vec{v}\,\phi' . \tag{5.25}$$

From (5.23), we therefore find that the potentials in the frame $S$, in which the particle is moving with velocity $\vec{v}$, are given by

$$\phi = \gamma\phi' = \frac{e\gamma}{r'} , \qquad \vec{A} = \gamma\vec{v}\,\phi' = \frac{e\gamma\vec{v}}{r'} . \tag{5.26}$$

Note that we still have $r'$ appearing in the denominator, which we would now like to express in terms of the unprimed coordinates.
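Suppose, for example, that we orient the axes so that $\mathbf{v}$ lies along the $x$ direction. Then we shall have

$$ x' = \gamma\,(x - vt) , \qquad y' = y , \qquad z' = z , \tag{5.27} $$

and so

$$ r'^2 = x'^2 + y'^2 + z'^2 = \gamma^2 (x - vt)^2 + y^2 + z^2 . \tag{5.28} $$

It follows therefore from (5.26) that the scalar and 3-vector potentials in the frame $S$ are given by

$$ \phi = \frac{e}{R_*} , \qquad \mathbf{A} = \frac{e\,\mathbf{v}}{R_*} , \tag{5.29} $$

where we have defined

$$ R_*^2 \equiv (x - vt)^2 + (1 - v^2)(y^2 + z^2) . \tag{5.30} $$

The electric and magnetic fields can now be calculated in the standard way from $\phi$ and $\mathbf{A}$, as in (2.8). Alternatively, and equivalently, we can first calculate $\mathbf{E}'$ and $\mathbf{B}'$ in the primed frame, and then Lorentz transform these back to the unprimed frame. In the frame $S'$, we shall of course have

$$ \mathbf{E}' = \frac{e\,\mathbf{r}'}{r'^3} , \qquad \mathbf{B}' = 0 . \tag{5.31} $$

The transformation to the unprimed frame is then given by inverting the standard results (2.51) and (2.52) that express $\mathbf{E}'$ and $\mathbf{B}'$ in terms of $\mathbf{E}$ and $\mathbf{B}$. Again, this is simply achieved by interchanging the primed and unprimed fields, and sending $\mathbf{v}$ to $-\mathbf{v}$. This gives

$$ \mathbf{E} = \gamma\,(\mathbf{E}' - \mathbf{v}\times\mathbf{B}') - \frac{\gamma - 1}{v^2}\,(\mathbf{v}\cdot\mathbf{E}')\,\mathbf{v} , \qquad \mathbf{B} = \gamma\,(\mathbf{B}' + \mathbf{v}\times\mathbf{E}') - \frac{\gamma - 1}{v^2}\,(\mathbf{v}\cdot\mathbf{B}')\,\mathbf{v} , \tag{5.32} $$

and so from (5.31), we find that $\mathbf{E}$ and $\mathbf{B}$ in the frame $S$ are given by

$$ \mathbf{E} = \frac{e\gamma\,\mathbf{r}'}{r'^3} - \frac{\gamma - 1}{v^2}\,\frac{e\,(\mathbf{v}\cdot\mathbf{r}')}{r'^3}\,\mathbf{v} , \qquad \mathbf{B} = \gamma\,\mathbf{v}\times\mathbf{E}' = \frac{e\gamma\,\mathbf{v}\times\mathbf{r}'}{r'^3} . \tag{5.33} $$

Let us again assume that we orient the axes so that $\mathbf{v}$ lies along the $x$ direction. Then from the above we find that

$$ E_x = \frac{e\,x'}{r'^3} , \qquad E_y = \frac{e\gamma\,y'}{r'^3} , \qquad E_z = \frac{e\gamma\,z'}{r'^3} , \tag{5.34} $$

and so

$$ E_x = \frac{e\gamma\,(x - vt)}{r'^3} , \qquad E_y = \frac{e\gamma\,y}{r'^3} , \qquad E_z = \frac{e\gamma\,z}{r'^3} . \tag{5.35} $$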
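The two routes to the electric field described above can be compared numerically. The sketch below is a minimal check, assuming units with c = 1 and illustrative values of $e$, $v$ and the observation point (none of which come from the notes). It evaluates $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/\partial t$ from the potentials (5.29) by central finite differences, and compares the result with the boosted Coulomb field (5.35).

```python
import math

e, v = 1.0, 0.6   # illustrative charge and speed along x (c = 1)

def potentials(t, x, y, z):
    """phi and A from (5.29), with R_* defined in (5.30)."""
    R_star = math.sqrt((x - v * t) ** 2 + (1 - v * v) * (y * y + z * z))
    return e / R_star, (e * v / R_star, 0.0, 0.0)

def E_from_potentials(t, x, y, z, h=1e-5):
    """E = -grad(phi) - dA/dt, using central finite differences."""
    phi = lambda *p: potentials(*p)[0]
    A = lambda *p: potentials(*p)[1]
    grad = ((phi(t, x + h, y, z) - phi(t, x - h, y, z)) / (2 * h),
            (phi(t, x, y + h, z) - phi(t, x, y - h, z)) / (2 * h),
            (phi(t, x, y, z + h) - phi(t, x, y, z - h)) / (2 * h))
    dA_dt = tuple((ap - am) / (2 * h)
                  for ap, am in zip(A(t + h, x, y, z), A(t - h, x, y, z)))
    return tuple(-g - d for g, d in zip(grad, dA_dt))

def E_from_boost(t, x, y, z):
    """E from (5.35): the rest-frame Coulomb field transformed to S."""
    gamma = 1.0 / math.sqrt(1 - v * v)
    r_prime_cubed = (gamma ** 2 * (x - v * t) ** 2 + y * y + z * z) ** 1.5
    return tuple(e * gamma * c / r_prime_cubed for c in (x - v * t, y, z))

point = (0.3, 1.0, 0.5, -0.7)      # (t, x, y, z), an arbitrary observation event
E_fd = E_from_potentials(*point)
E_lt = E_from_boost(*point)
```

The two tuples `E_fd` and `E_lt` should agree to within the finite-difference error.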
Since the charge is located at the point $(vt, 0, 0)$ in the frame $S$, it follows that the vector from the charge to the point $\mathbf{r} = (x, y, z)$ is

$$ \mathbf{R} = (x - vt,\; y,\; z) . \tag{5.36} $$

From (5.35), we then find that the electric field is given by

$$ \mathbf{E} = \frac{e\gamma\,\mathbf{R}}{r'^3} = \frac{e\,(1 - v^2)\,\mathbf{R}}{R_*^3} , \tag{5.37} $$

where $R_*$ was defined in (5.30). If we now define $\theta$ to be the angle between the vector $\mathbf{R}$ and the $x$ axis, then the coordinates $(x, y, z)$ of the observation point $P$ will be such that

$$ y^2 + z^2 = R^2 \sin^2\theta , \qquad \text{where} \qquad R^2 = |\mathbf{R}|^2 = (x - vt)^2 + y^2 + z^2 . \tag{5.38} $$

This implies, from (5.30), that

$$ R_*^2 = R^2 - v^2 (y^2 + z^2) = R^2\,(1 - v^2 \sin^2\theta) , \tag{5.39} $$

and so the electric field due to the moving charge is

$$ \mathbf{E} = \frac{e\,\mathbf{R}}{R^3}\; \frac{1 - v^2}{(1 - v^2 \sin^2\theta)^{3/2}} . \tag{5.40} $$

For an observation point $P$ located on the $x$ axis, the electric field will be $E_\parallel$ (parallel to the $x$ axis), and given by setting $\theta = 0$ in (5.40). On the other hand, we can define the electric field $E_\perp$ in the $(y, z)$ plane (corresponding to $\theta = \pi/2$). From (5.40) we therefore have

$$ E_\parallel = \frac{e\,(1 - v^2)}{R^2} , \qquad E_\perp = \frac{e\,(1 - v^2)^{-1/2}}{R^2} . \tag{5.41} $$

Note that $E_\parallel$ has the smallest magnitude, and $E_\perp$ has the largest magnitude, that $|\mathbf{E}|$ attains as a function of $\theta$.

When the velocity is very small, the electric field is (as one would expect) more or less independent of $\theta$. However, as $v$ approaches 1 (the speed of light), we find that $E_\parallel$ decreases to zero, while $E_\perp$ diverges. Thus for $v$ near to the speed of light the electric field is very sharply peaked around $\theta = \pi/2$. If we set

$$ \theta = \frac{\pi}{2} - \psi , \tag{5.42} $$

then

$$ |\mathbf{E}| = \frac{e\,(1 - v^2)}{R^2\,(1 - v^2 \cos^2\psi)^{3/2}} \approx \frac{e\,(1 - v^2)}{R^2\,(1 - v^2 + \psi^2)^{3/2}} \tag{5.43} $$
if $v \approx 1$. Thus the angular width of the peak is of the order of

$$ \psi \sim \sqrt{1 - v^2} . \tag{5.44} $$

We saw previously that the magnetic field in the frame $S$ is given by $\mathbf{B} = \gamma\,\mathbf{v}\times\mathbf{E}'$. From (5.33) we have $\mathbf{v}\times\mathbf{E} = \gamma\,\mathbf{v}\times\mathbf{E}'$, and so therefore

$$ \mathbf{B} = \mathbf{v}\times\mathbf{E} = \frac{e\,(1 - v^2)\,\mathbf{v}\times\mathbf{R}}{R_*^3} . \tag{5.45} $$

Note that if $|\mathbf{v}| \ll 1$ we get the usual non-relativistic expressions

$$ \mathbf{E} \approx \frac{e\,\mathbf{R}}{R^3} , \qquad \mathbf{B} \approx \frac{e\,\mathbf{v}\times\mathbf{R}}{R^3} . \tag{5.46} $$

5.4 Motion of a charge in a Coulomb potential

We shall consider a particle of mass $m$ and charge $e$ moving in the field of a static charge $Q$. The classic "Newtonian" result is very familiar, with the orbit of the particle being either an ellipse, a parabola or a hyperbola, depending on the charges and the orbital parameters. In this section we shall consider the fully relativistic problem, when the velocity of the particle is not necessarily small compared with the speed of light.

The Lagrangian for the system is given by (2.79), with $\phi = Q/r$ and $\mathbf{A} = 0$:

$$ L = -m\,(1 - \dot x_i \dot x_i)^{1/2} - \frac{eQ}{r} , \tag{5.47} $$

where $\dot x_i = dx_i/dt$, and $r^2 = x_i x_i$. The charges occur in the combination $eQ$ throughout the calculation, and so for convenience we shall define

$$ q \equiv eQ . \tag{5.48} $$

It is convenient to introduce spherical polar coordinates in the standard way,

$$ x = r \sin\theta \cos\varphi , \qquad y = r \sin\theta \sin\varphi , \qquad z = r \cos\theta , \tag{5.49} $$

and then the Lagrangian becomes

$$ L = -m\,(1 - \dot r^2 - r^2 \dot\theta^2 - r^2 \sin^2\theta\, \dot\varphi^2)^{1/2} - \frac{q}{r} . \tag{5.50} $$

The Lagrangian is of the form $L = L(q_i, \dot q_i)$ for coordinates $q_i$ and velocities $\dot q_i$ (don't confuse the coordinates $q_i$ with the product of charges $q = eQ$!). The Euler-Lagrange equations are

$$ \frac{\partial L}{\partial q_i} - \frac{d}{dt}\,\frac{\partial L}{\partial \dot q_i} = 0 . \tag{5.51} $$
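The step from the Cartesian Lagrangian (5.47) to the polar form (5.50) rests on the identity $\dot x_i \dot x_i = \dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\varphi^2$. As a quick sanity check, the sketch below differentiates (5.49) by the chain rule and evaluates both forms of the Lagrangian at one sample state. The function names and the numerical values are illustrative assumptions, chosen so that the speed is below 1.

```python
import math

def cartesian_velocity(r, th, ph, rd, thd, phd):
    """Time derivative of the coordinates (5.49), by the chain rule."""
    st, ct, sp, cp = math.sin(th), math.cos(th), math.sin(ph), math.cos(ph)
    xd = rd * st * cp + r * thd * ct * cp - r * phd * st * sp
    yd = rd * st * sp + r * thd * ct * sp + r * phd * st * cp
    zd = rd * ct - r * thd * st
    return xd, yd, zd

def L_cartesian(m, q, r, vel):
    """Lagrangian (5.47), with q = eQ."""
    return -m * math.sqrt(1 - sum(c * c for c in vel)) - q / r

def L_polar(m, q, r, th, rd, thd, phd):
    """Lagrangian (5.50)."""
    v2 = rd ** 2 + r ** 2 * thd ** 2 + r ** 2 * math.sin(th) ** 2 * phd ** 2
    return -m * math.sqrt(1 - v2) - q / r

m, q = 1.0, -0.5                                   # illustrative mass and charge product
state = dict(r=2.0, th=1.1, ph=0.4, rd=0.1, thd=0.05, phd=0.1)
vel = cartesian_velocity(**state)
L1 = L_cartesian(m, q, state["r"], vel)
L2 = L_polar(m, q, state["r"], state["th"], state["rd"], state["thd"], state["phd"])
```

The values `L1` and `L2` should coincide up to rounding.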
Note that if $L$ is independent of a particular coordinate, say $q_j$, there is an associated conserved quantity

$$ \frac{\partial L}{\partial \dot q_j} . \tag{5.52} $$

The Euler-Lagrange equation for $\theta$ gives

$$ r^2 \sin\theta \cos\theta\, \dot\varphi^2\, (1 - \dot r^2 - r^2\dot\theta^2 - r^2 \sin^2\theta\, \dot\varphi^2)^{-1/2} - \frac{d}{dt}\Big[ r^2 \dot\theta\, (1 - \dot r^2 - r^2\dot\theta^2 - r^2 \sin^2\theta\, \dot\varphi^2)^{-1/2} \Big] = 0 . \tag{5.53} $$

It can be seen that a solution to this equation is to take $\theta = \pi/2$, and $\dot\theta = 0$, for all time. In other words, if the particle starts out moving in the $\theta = \pi/2$ plane (i.e. the $(x, y)$ plane at $z = 0$), it will remain in this plane. This is just the familiar result that the motion of a particle moving under a central force lies in a plane. We may therefore assume now, without loss of generality, that $\theta = \pi/2$ for all time. We are left with just $r$ and $\varphi$ as polar coordinates in the $(x, y)$ plane. The Lagrangian for the reduced system, where we can consistently set $\theta = \pi/2$, is then simply

$$ L = -m\,(1 - \dot r^2 - r^2 \dot\varphi^2)^{1/2} - \frac{q}{r} . \tag{5.54} $$

We note that $\partial L/\partial\varphi = 0$, and so there is a conserved quantity

$$ \frac{\partial L}{\partial \dot\varphi} = m r^2 \dot\varphi\, (1 - \dot r^2 - r^2 \dot\varphi^2)^{-1/2} = \ell , \tag{5.55} $$

where $\ell$ is a constant. Since $(1 - \dot r^2 - r^2 \dot\varphi^2)^{-1/2} = \gamma$, we simply have

$$ m \gamma\, r^2 \dot\varphi = \ell . \tag{5.56} $$

Note that we can also write this as

$$ m r^2\, \frac{d\varphi}{d\tau} = \ell , \tag{5.57} $$

since, as usual, coordinate time and proper time are related by $d\tau = dt/\gamma$.

Since the Lagrangian does not depend explicitly on $t$, the total energy $E$ is also conserved. Thus we have that

$$ E = H = \sqrt{\mathbf{p}^2 + m^2} + \frac{q}{r} \tag{5.58} $$

is a constant. Here,

$$ \mathbf{p}^2 = m^2 \gamma^2 \mathbf{v}^2 = m^2 \gamma^2 \dot r^2 + m^2 \gamma^2 r^2 \dot\varphi^2 = m^2 \Big(\frac{dr}{d\tau}\Big)^2 + m^2 r^2 \Big(\frac{d\varphi}{d\tau}\Big)^2 . \tag{5.59} $$
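We therefore have

$$ \Big(E - \frac{q}{r}\Big)^2 = \mathbf{p}^2 + m^2 = m^2 \Big(\frac{dr}{d\tau}\Big)^2 + m^2 r^2 \Big(\frac{d\varphi}{d\tau}\Big)^2 + m^2 . \tag{5.60} $$

We now perform the standard change of variables in orbit calculations, and let

$$ r = \frac{1}{u} . \tag{5.61} $$

This implies

$$ \frac{dr}{d\tau} = -\frac{1}{u^2}\,\frac{du}{d\tau} = -\frac{1}{u^2}\,\frac{du}{d\varphi}\,\frac{d\varphi}{d\tau} = -\frac{\ell}{m}\,u' , \tag{5.62} $$

where we have used (5.57) and also we have defined

$$ u' \equiv \frac{du}{d\varphi} . \tag{5.63} $$

It now follows that (5.60) becomes

$$ (E - q u)^2 = \ell^2 u'^2 + \ell^2 u^2 + m^2 . \tag{5.64} $$

This ordinary differential equation can be solved in order to find $u$ as a function of $\varphi$, and hence $r$ as a function of $\varphi$. The solution determines the shape of the orbit of the particle around the fixed charge $Q$.

Rewriting (5.64) as

$$ \ell^2 u'^2 = \Big( u \sqrt{q^2 - \ell^2} - \frac{qE}{\sqrt{q^2 - \ell^2}} \Big)^2 - m^2 - \frac{E^2 \ell^2}{q^2 - \ell^2} , \tag{5.65} $$

we see that it is convenient to make a change of variable from $u$ to $w$, defined by

$$ u \sqrt{q^2 - \ell^2} - \frac{qE}{\sqrt{q^2 - \ell^2}} = \pm \sqrt{m^2 + \frac{E^2 \ell^2}{q^2 - \ell^2}}\; \cosh w , \tag{5.66} $$

where the $+$ sign is chosen if $q < 0$, and the $-$ sign if $q > 0$. We can then integrate (5.65), to obtain

$$ \ell\, w = \sqrt{q^2 - \ell^2}\; \varphi \tag{5.67} $$

(making a convenient choice, without loss of generality, for the constant of integration), and hence we have

$$ \sqrt{q^2 - \ell^2}\; u = \pm \sqrt{m^2 + \frac{E^2 \ell^2}{q^2 - \ell^2}}\; \cosh\Big[ \Big( \frac{q^2}{\ell^2} - 1 \Big)^{1/2} \varphi \Big] + \frac{qE}{\sqrt{q^2 - \ell^2}} . \tag{5.68} $$

In other words, the orbit is given, in terms of $r = r(\varphi)$, by

$$ \frac{q^2 - \ell^2}{r} = \pm \sqrt{E^2 \ell^2 + m^2 (q^2 - \ell^2)}\; \cosh\Big[ \Big( \frac{q^2}{\ell^2} - 1 \Big)^{1/2} \varphi \Big] + qE . \tag{5.69} $$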
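As a check on the algebra, the orbit (5.69) can be substituted back into the differential equation (5.64). The sketch below does this numerically for one illustrative choice of parameters with $|\ell| < |q|$ and $q < 0$ (so the $+$ sign applies). The parameter values and the function names are assumptions made for the example only.

```python
import math

m, E, q, ell = 1.0, 0.5, -2.0, 1.0     # illustrative values with |ell| < |q| and q < 0
D = q * q - ell * ell                  # q^2 - ell^2, positive in this regime
k = math.sqrt(q * q / (ell * ell) - 1.0)
amp = math.sqrt(E * E * ell * ell + m * m * D)
sign = 1.0 if q < 0 else -1.0          # sign convention stated below (5.66)

def u(phi):
    """u = 1/r read off from the orbit equation (5.69)."""
    return (sign * amp * math.cosh(k * phi) + q * E) / D

def du(phi):
    """Analytic derivative du/dphi of the expression above."""
    return sign * amp * k * math.sinh(k * phi) / D

def residual(phi):
    """Left side minus right side of (5.64)."""
    return (E - q * u(phi)) ** 2 - (ell ** 2 * du(phi) ** 2 + ell ** 2 * u(phi) ** 2 + m ** 2)

residuals = [residual(0.1 * n) for n in range(-20, 21)]
```

Every entry of `residuals` should vanish up to rounding error, and `u(0.0)` is positive for these values, so the solution describes a physical radius there.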
The solution (5.69) is presented for the case where $|\ell| < |q|$. If instead $|\ell| > |q|$, it becomes

$$ \frac{\ell^2 - q^2}{r} = \sqrt{E^2 \ell^2 - m^2 (\ell^2 - q^2)}\; \cos\Big[ \Big( 1 - \frac{q^2}{\ell^2} \Big)^{1/2} \varphi \Big] - qE . \tag{5.70} $$

Finally, if $|\ell| = |q|$, it is easier to go back to the equation (5.65) and re-solve it directly in this case, leading to

$$ \frac{2qE}{r} = E^2 - m^2 - E^2 \varphi^2 . \tag{5.71} $$

The situation described above for relativistic orbits should be contrasted with what happens in the non-relativistic case. In this limit, the Lagrangian (after restricting to motion in the $(x, y)$ plane again) is simply given by

$$ L = \tfrac{1}{2} m\,(\dot r^2 + r^2 \dot\varphi^2) - \frac{q}{r} . \tag{5.72} $$

Note that this can be obtained from the relativistic Lagrangian (5.54) we studied above, by taking $\dot r$ and $r\dot\varphi$ to be small compared to 1 (the speed of light), and then expanding the square root to quadratic order in velocities. As discussed previously, one can ignore the leading-order term $-m$ in the expansion, since this is just a constant (the rest-mass energy of the particle) and so it does not enter in the Euler-Lagrange equations. The analysis of the Euler-Lagrange equations for the non-relativistic Lagrangian (5.72) is a standard one. There are conserved quantities

$$ E = \tfrac{1}{2} m\,(\dot r^2 + r^2 \dot\varphi^2) + \frac{q}{r} , \qquad \ell = m r^2 \dot\varphi . \tag{5.73} $$

Substituting the latter into the former gives the standard radial equation, whose solution implies closed elliptical orbits given by

$$ \frac{1}{r} = \frac{m q}{\ell^2} \Big( \sqrt{1 + \frac{2 E \ell^2}{m q^2}}\; \cos\varphi - 1 \Big) . \tag{5.74} $$

(This is for an attractive force, $q < 0$, with $-m q^2/(2\ell^2) \le E < 0$, so that the square root in (5.74) is real and less than 1. If $E > 0$ the orbits are hyperbolae, while in the intermediate case $E = 0$ the orbits are parabolic.)

The key difference in the relativistic case is that the orbits are never closed, even when $|\ell| > |q|$, as in (5.70), for which the radius $r$ is a trigonometric function of $\varphi$. The reason for this is that the argument of the trigonometric function is

$$ \Big( 1 - \frac{q^2}{\ell^2} \Big)^{1/2} \varphi , \tag{5.75} $$

and so $\varphi$ has to increase through an angle $\Delta\varphi$ given by

$$ \Delta\varphi = 2\pi \Big( 1 - \frac{q^2}{\ell^2} \Big)^{-1/2} \tag{5.76} $$
before the cosine completes one cycle. If we assume that $|q/\ell|$ is small compared with 1, then the shape of the orbit is still approximately like an ellipse, except that the "perihelion" of the ellipse advances by an angle

$$ \delta\varphi = 2\pi \Big[ \Big( 1 - \frac{q^2}{\ell^2} \Big)^{-1/2} - 1 \Big] \approx \frac{\pi q^2}{\ell^2} \tag{5.77} $$

per orbit.

If on the other hand $|\ell| \le |q|$, then if $q < 0$ (which means $eQ < 0$ and hence an attractive force between the charges), the particle spirals inwards and eventually reaches $r = 0$ within a finite time. This can never happen in the non-relativistic case; the orbit of the particle can never reach the origin at $r = 0$, unless the angular momentum $\ell$ is exactly zero. The reason for this is that the centrifugal potential term $\ell^2/r^2$ always throws the particle away from the origin if $r$ tries to get too small. By contrast, in the relativistic case the effect of the centrifugal term is reduced at large velocity, and it cannot prevent the collapse of the orbit to $r = 0$. This can be seen by looking at the conserved quantity $E$ in the fully relativistic analysis, which, from our discussion above, can be written as

$$ E = \Big( m^2 + m^2 \Big(\frac{dr}{d\tau}\Big)^2 + \frac{\ell^2}{r^2} \Big)^{1/2} + \frac{q}{r} . \tag{5.78} $$

First, consider the non-relativistic limit, for which the rest-mass term dominates inside the square root:

$$ E \approx m + \tfrac{1}{2} m \Big(\frac{dr}{dt}\Big)^2 + \frac{\ell^2}{2 m r^2} + \frac{q}{r} . \tag{5.79} $$

Here, we see that even if $q < 0$ (an attractive force), the repulsive centrifugal term always wins over the attractive charge term $q/r$ at small enough $r$. On the other hand, if we keep the full relativistic expression (5.78), then at small enough $r$ the competition between the centrifugal term and the charge term becomes "evenly matched,"

$$ E \approx \frac{|\ell|}{r} + \frac{q}{r} , \tag{5.80} $$

and clearly if $q < -|\ell|$ the attraction between the charges wins the contest.

5.5 The multipole expansion

Consider the electrostatic potential of $N$ point charges $q_a$, located at fixed positions $\mathbf{r}_a$. It is given by

$$ \phi(\mathbf{r}) = \sum_{a=1}^{N} \frac{q_a}{|\mathbf{r} - \mathbf{r}_a|} . \tag{5.81} $$
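The sum (5.81) is easy to evaluate directly, which gives a concrete reference point for the expansion developed in this section. The sketch below is a minimal illustration with three made-up charges near the origin. It compares the exact potential at a distant point with the total-charge term $Q/r$ alone, and with $Q/r$ plus the dipole term $p_i x_i/r^3$. The charge values, positions and function names are assumptions for the example only.

```python
import math

# (q_a, r_a) pairs: illustrative charges confined near the origin
charges = [(1.0, (0.1, 0.0, 0.0)),
           (-0.5, (0.0, 0.2, 0.0)),
           (0.7, (0.0, 0.0, -0.1))]

def potential(r):
    """Direct evaluation of the sum (5.81)."""
    return sum(q / math.dist(r, ra) for q, ra in charges)

Q = sum(q for q, _ in charges)                                      # total charge
p = tuple(sum(q * ra[i] for q, ra in charges) for i in range(3))    # dipole moment

def monopole(r):
    return Q / math.hypot(*r)

def monopole_plus_dipole(r):
    rr = math.hypot(*r)
    return Q / rr + sum(pi * xi for pi, xi in zip(p, r)) / rr ** 3

r_far = (30.0, 20.0, 10.0)       # observation point far from the charges
exact = potential(r_far)
err_monopole = abs(exact - monopole(r_far))
err_dipole = abs(exact - monopole_plus_dipole(r_far))
```

Adding the dipole term should reduce the error, with the remainder controlled by the quadrupole and higher moments.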
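In the continuum limit, the potential due to a charge distribution characterised by the charge density $\rho(\mathbf{r})$ is given by

$$ \phi(\mathbf{r}) = \int \frac{\rho(\mathbf{r}')\, d^3 r'}{|\mathbf{r} - \mathbf{r}'|} . \tag{5.82} $$

Since we shall assume that the charges are confined to a finite region, it is useful to perform a multipole expansion of the potential far from the region where the charges are located. This amounts to an expansion in inverse powers of $r = |\mathbf{r}|$. This can be achieved by performing a Taylor expansion of $1/|\mathbf{r} - \mathbf{r}'|$.

Recall that in one dimension, Taylor's theorem gives

$$ f(x + a) = f(x) + a f'(x) + \frac{a^2}{2!} f''(x) + \frac{a^3}{3!} f'''(x) + \cdots . \tag{5.83} $$

In three dimensions, the analogous expansion is

$$ f(\mathbf{r} + \mathbf{a}) = f(\mathbf{r}) + a_i \partial_i f(\mathbf{r}) + \frac{1}{2!}\, a_i a_j\, \partial_i \partial_j f(\mathbf{r}) + \frac{1}{3!}\, a_i a_j a_k\, \partial_i \partial_j \partial_k f(\mathbf{r}) + \cdots . \tag{5.84} $$

We now apply this 3-dimensional Taylor expansion to the function $f(\mathbf{r}) = 1/|\mathbf{r}| = 1/r$, taking $\mathbf{a} = -\mathbf{r}'$. This gives

$$ \frac{1}{|\mathbf{r} - \mathbf{r}'|} = \frac{1}{r} - x'_i\, \partial_i \frac{1}{r} + \frac{1}{2!}\, x'_i x'_j\, \partial_i \partial_j \frac{1}{r} - \frac{1}{3!}\, x'_i x'_j x'_k\, \partial_i \partial_j \partial_k \frac{1}{r} + \cdots . \tag{5.85} $$

Now since $r^2 = x_j x_j$, it follows that $\partial_i r^2 = 2 r\, \partial_i r = 2 x_i$, and so

$$ \partial_i r = \frac{x_i}{r} . \tag{5.86} $$

Note that we have (assuming $r \ne 0$) that

$$ \partial_i \partial_i \frac{1}{r} = \partial_i \Big( -\frac{x_i}{r^3} \Big) = -\frac{3}{r^3} + \frac{3 x_i}{r^4}\, \frac{x_i}{r} = 0 , \tag{5.87} $$

or, in other words,

$$ \nabla^2 \frac{1}{r} = 0 . \tag{5.88} $$

A consequence of this is that the multiple derivatives

$$ \partial_i \partial_j \frac{1}{r} , \qquad \partial_i \partial_j \partial_k \frac{1}{r} , \qquad \partial_i \partial_j \partial_k \partial_\ell \frac{1}{r} , \qquad \cdots \tag{5.89} $$

are all traceless on any pair of indices:

$$ \delta_{ij}\, \partial_i \partial_j \frac{1}{r} = 0 , \qquad \delta_{ij}\, \partial_i \partial_j \partial_k \frac{1}{r} = 0 , \qquad \text{etc.} \tag{5.90} $$

We can use this property in order to replace the quantities

$$ x'_i x'_j , \qquad x'_i x'_j x'_k , \qquad \cdots \tag{5.91} $$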
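that multiply the derivative terms in (5.85) by the totally tracefree quantities

$$ \big( x'_i x'_j - \tfrac{1}{3}\, \delta_{ij}\, r'^2 \big) , \qquad \big( x'_i x'_j x'_k - \tfrac{1}{5}\, [x'_i \delta_{jk} + x'_j \delta_{ik} + x'_k \delta_{ij}]\, r'^2 \big) , \qquad \cdots \tag{5.92} $$

where $r'^2 = x'_i x'_i$. (We can do this because the trace terms that we are subtracting out here give zero when they are contracted onto the multiple derivatives of $1/r$ in (5.85).) It therefore follows from (5.82) and (5.85) that we have

$$
\begin{aligned}
\phi(\mathbf{r}) = {} & \frac{1}{r} \int \rho(\mathbf{r}')\, d^3 r' - \Big( \partial_i \frac{1}{r} \Big) \int x'_i\, \rho(\mathbf{r}')\, d^3 r' + \frac{1}{2!} \Big( \partial_i \partial_j \frac{1}{r} \Big) \int \big( x'_i x'_j - \tfrac{1}{3}\, \delta_{ij}\, r'^2 \big)\, \rho(\mathbf{r}')\, d^3 r' \\
& - \frac{1}{3!} \Big( \partial_i \partial_j \partial_k \frac{1}{r} \Big) \int \big( x'_i x'_j x'_k - \tfrac{1}{5}\, [x'_i \delta_{jk} + x'_j \delta_{ik} + x'_k \delta_{ij}]\, r'^2 \big)\, \rho(\mathbf{r}')\, d^3 r' + \cdots .
\end{aligned} \tag{5.93}
$$

The expansion here can be written as

$$ \phi(\mathbf{r}) = \frac{Q}{r} - p_i\, \partial_i \frac{1}{r} + \frac{1}{2!}\, Q_{ij}\, \partial_i \partial_j \frac{1}{r} - \frac{1}{3!}\, Q_{ijk}\, \partial_i \partial_j \partial_k \frac{1}{r} + \cdots , \tag{5.94} $$

where

$$
\begin{aligned}
Q &= \int \rho(\mathbf{r}')\, d^3 r' , \\
p_i &= \int x'_i\, \rho(\mathbf{r}')\, d^3 r' , \\
Q_{ij} &= \int \big( x'_i x'_j - \tfrac{1}{3}\, \delta_{ij}\, r'^2 \big)\, \rho(\mathbf{r}')\, d^3 r' , \\
Q_{ijk} &= \int \big( x'_i x'_j x'_k - \tfrac{1}{5}\, [x'_i \delta_{jk} + x'_j \delta_{ik} + x'_k \delta_{ij}]\, r'^2 \big)\, \rho(\mathbf{r}')\, d^3 r' ,
\end{aligned} \tag{5.95}
$$

and so on. The quantity $Q$ is the total charge of the system, $p_i$ is the dipole moment, $Q_{ij}$ is the quadrupole moment, and $Q_{ijk}$, $Q_{ijk\ell}$, etc., are the higher multipole moments. Note that by construction, all the multipole moments with two or more indices are symmetric and traceless on all indices.

Note that the terms in the multipole expansion (5.94) do indeed fall off with increasing inverse powers of $r$. For example, the dipole term is given by

$$ \phi_{\text{Dipole}} = -p_i\, \partial_i \frac{1}{r} = \frac{p_i x_i}{r^3} , \tag{5.96} $$

which falls off like $1/r^2$. The quadrupole term is given by

$$ \phi_{\text{Quadrupole}} = \tfrac{1}{2}\, Q_{ij}\, \partial_i \partial_j \frac{1}{r} = \tfrac{1}{2}\, Q_{ij}\, \frac{(3 x_i x_j - r^2 \delta_{ij})}{r^5} = \tfrac{3}{2}\, Q_{ij}\, \frac{x_i x_j}{r^5} , \tag{5.97} $$

which falls off like $1/r^3$. (The last equality above follows because $Q_{ij}$ is traceless.)

The total charge $Q$ (the electric monopole moment) is of course a single quantity. The dipole moment $p_i$ is a 3-vector, so it has three independent components in general. The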
quadrupole moment $Q_{ij}$ is a symmetric 2-index tensor in three dimensions, which would mean $3 \times 4/2 = 6$ independent components. But it is also traceless, $Q_{ii} = 0$, which is one condition. Thus there are $6 - 1 = 5$ independent components.

The octopole moment $Q_{ijk}$ is a 3-index symmetric tensor, which would mean $3 \times 4 \times 5/3! = 10$ independent components. But it is also traceless, $Q_{iij} = 0$, which is 3 conditions. Thus the octopole has in general $10 - 3 = 7$ independent components.

It is straightforward to see in the same way that the $2^\ell$-pole moment

$$ Q_{i_1 i_2 \cdots i_\ell} = \int \big( x'_{i_1} x'_{i_2} \cdots x'_{i_\ell} - \text{traces} \big)\, \rho(\mathbf{r}')\, d^3 r' \tag{5.98} $$

has $(2\ell + 1)$ independent components.

In fact, the multipole expansion (5.94) is equivalent to an expansion in spherical polar coordinates, using the spherical harmonics $Y_{\ell m}(\theta, \varphi)$:

$$ \phi(r, \theta, \varphi) = \sum_{\ell = 0}^{\infty}\, \sum_{m = -\ell}^{\ell} C_{\ell m}\, Y_{\ell m}(\theta, \varphi)\, \frac{1}{r^{\ell + 1}} . \tag{5.99} $$

At a given value of $\ell$ the terms fall off like $r^{-\ell - 1}$, and there are $(2\ell + 1)$ of them, with coefficients $C_{\ell m}$, since $m$ ranges over the integers $-\ell \le m \le \ell$. For each value of $\ell$, there is a linear relationship between the $(2\ell + 1)$ components of $C_{\ell m}$ and the $(2\ell + 1)$ components of the multipole moments $Q$, $p_i$, $Q_{ij}$, $Q_{ijk}$, etc. Likewise, for each $\ell$ there is a linear relationship between $r^{-\ell - 1}\, Y_{\ell m}(\theta, \varphi)$ and the set of functions $\partial_{i_1} \partial_{i_2} \cdots \partial_{i_\ell}\, r^{-1}$.

Consider, for example, $\ell = 1$. The three functions $Z_i \equiv \partial_i r^{-1} = -x_i/r^3$ are given by

$$ Z_1 = -\frac{\sin\theta \cos\varphi}{r^2} , \qquad Z_2 = -\frac{\sin\theta \sin\varphi}{r^2} , \qquad Z_3 = -\frac{\cos\theta}{r^2} , \tag{5.100} $$

when expressed in terms of spherical polar coordinates (see (5.49)). On the other hand, the $\ell = 1$ spherical harmonics are given by

$$ Y_{11} = -\sqrt{\frac{3}{8\pi}}\, \sin\theta\, e^{i\varphi} , \qquad Y_{10} = \sqrt{\frac{3}{4\pi}}\, \cos\theta , \qquad Y_{1,-1} = \sqrt{\frac{3}{8\pi}}\, \sin\theta\, e^{-i\varphi} . \tag{5.101} $$

Thus we see that

$$ Z_1 = \sqrt{\frac{8\pi}{3}}\, \frac{(Y_{11} - Y_{1,-1})}{2 r^2} , \qquad Z_2 = \sqrt{\frac{8\pi}{3}}\, \frac{(Y_{11} + Y_{1,-1})}{2 i\, r^2} , \qquad Z_3 = -\sqrt{\frac{4\pi}{3}}\, \frac{Y_{10}}{r^2} . \tag{5.102} $$

Analogous relations can be seen for all higher values of $\ell$.

6 Electromagnetic Waves

6.1 Wave equation

As discussed at the beginning of the course (see section 1.1), Maxwell's equations admit wave-like solutions. These solutions can exist in free space, in a region where there are no
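source currents, for which the equations take the form

$$ \nabla\cdot\mathbf{E} = 0 , \qquad \nabla\times\mathbf{B} - \frac{\partial\mathbf{E}}{\partial t} = 0 , \qquad \nabla\cdot\mathbf{B} = 0 , \qquad \nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0 . \tag{6.1} $$

As discussed in section 1.1, taking the curl of the $\nabla\times\mathbf{E}$ equation, and using the $\nabla\times\mathbf{B}$ equation, one finds

$$ \nabla^2 \mathbf{E} - \frac{\partial^2 \mathbf{E}}{\partial t^2} = 0 , \tag{6.2} $$

and similarly,

$$ \nabla^2 \mathbf{B} - \frac{\partial^2 \mathbf{B}}{\partial t^2} = 0 . \tag{6.3} $$

Thus each component of $\mathbf{E}$ and each component of $\mathbf{B}$ satisfies d'Alembert's equation

$$ \nabla^2 f - \frac{\partial^2 f}{\partial t^2} = 0 . \tag{6.4} $$

This can, of course, be written as

$$ \Box f \equiv \partial^\mu \partial_\mu f = 0 , \tag{6.5} $$

which shows that d'Alembert's operator is Lorentz invariant.

The wave equation (6.4) admits plane-wave solutions, where $f$ depends on $t$ and on a single linear combination of the $x$, $y$ and $z$ coordinates. By choosing the orientation of the axes appropriately, we can make this linear combination become simply $x$. Thus we may seek solutions of (6.4) of the form $f = f(t, x)$. The function $f$ will then satisfy

$$ \frac{\partial^2 f}{\partial x^2} - \frac{\partial^2 f}{\partial t^2} = 0 , \tag{6.6} $$

which can be written in the factorised form

$$ \Big( \frac{\partial}{\partial x} - \frac{\partial}{\partial t} \Big) \Big( \frac{\partial}{\partial x} + \frac{\partial}{\partial t} \Big) f(t, x) = 0 . \tag{6.7} $$

Now introduce "light-cone coordinates"

$$ u = x - t , \qquad v = x + t . \tag{6.8} $$

We see that

$$ \frac{\partial}{\partial x} = \frac{\partial}{\partial u} + \frac{\partial}{\partial v} , \qquad \frac{\partial}{\partial t} = -\frac{\partial}{\partial u} + \frac{\partial}{\partial v} , \tag{6.9} $$

and so (6.7) becomes

$$ \frac{\partial^2 f}{\partial u\, \partial v} = 0 . \tag{6.10} $$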
The general solution to this is

$$ f = f_+(u) + f_-(v) = f_+(x - t) + f_-(x + t) , \tag{6.11} $$

where $f_+$ and $f_-$ are arbitrary functions.

The functions $f_\pm$ determine the profile of a wave-like disturbance that propagates at the speed of light (i.e. at speed 1). In the case of a wave described by $f_+(x - t)$, the disturbance propagates at the speed of light in the positive $x$ direction. This can be seen from the fact that if we sit at a given point on the profile (i.e. at a fixed value of the argument of the function $f_+$), then as $t$ increases the $x$ value must increase too. This means that the disturbance moves, with speed 1, along the positive $x$ direction. Likewise, a wave described by $f_-(x + t)$ moves in the negative $x$ direction as time increases.

More generally, we can consider a plane-wave disturbance moving along the direction of a unit 3-vector $\mathbf{n}$:

$$ f(t, \mathbf{r}) = f_+(\mathbf{n}\cdot\mathbf{r} - t) + f_-(\mathbf{n}\cdot\mathbf{r} + t) . \tag{6.12} $$

The $f_+$ wave moves in the direction of $\mathbf{n}$ as $t$ increases, while the $f_-$ wave moves in the direction of $-\mathbf{n}$. The previous case of propagation along the $x$ axis corresponds to taking $\mathbf{n} = (1, 0, 0)$.

Let us now return to the discussion of electromagnetic waves. Following the discussion above, there will exist plane-wave solutions of (6.2), propagating along the $\mathbf{n}$ direction, of the form

$$ \mathbf{E} = \mathbf{E}(\mathbf{n}\cdot\mathbf{r} - t) . \tag{6.13} $$

From the Maxwell equation $\partial\mathbf{B}/\partial t = -\nabla\times\mathbf{E}$, we shall therefore have

$$ \frac{\partial B_i}{\partial t} = -\epsilon_{ijk}\, \partial_j E_k(n_\ell x_\ell - t) = -\epsilon_{ijk}\, n_j\, E'_k(n_\ell x_\ell - t) , \tag{6.14} $$

where $E'_k$ denotes the derivative of $E_k$ with respect to its argument. We also have that $\partial E_k(n_\ell x_\ell - t)/\partial t = -E'_k(n_\ell x_\ell - t)$, and so

$$ \frac{\partial B_i}{\partial t} = \epsilon_{ijk}\, n_j\, \frac{\partial}{\partial t} E_k(n_\ell x_\ell - t) . \tag{6.15} $$

We can integrate this, and drop the constant of integration since an additional static $\mathbf{B}$ field term is of no interest to us when discussing electromagnetic waves. Thus we have

$$ B_i = \epsilon_{ijk}\, n_j E_k , \qquad \text{i.e.} \qquad \mathbf{B} = \mathbf{n}\times\mathbf{E} . \tag{6.16} $$
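The result (6.16) can be tested numerically for a concrete profile. The sketch below is a minimal check under assumed choices: a Gaussian profile, a fixed polarisation vector orthogonal to $\mathbf{n}$, and central finite differences for the derivatives. It evaluates the left-hand sides of the two curl equations in (6.1) at one sample event. The direction, polarisation, profile and evaluation point are illustrative only.

```python
import math

n = (0.0, 0.6, 0.8)      # unit propagation direction (illustrative)
pol = (1.0, 0.0, 0.0)    # constant polarisation vector, orthogonal to n

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def profile(s):
    return math.exp(-s * s)            # arbitrary smooth profile f_+(s)

def E(t, r):
    s = sum(ni * xi for ni, xi in zip(n, r)) - t   # argument n.r - t, as in (6.13)
    return tuple(c * profile(s) for c in pol)

def B(t, r):
    return cross(n, E(t, r))           # B = n x E, equation (6.16)

def curl(F, t, r, h=1e-5):
    def d(i, j):                       # central difference for dF_i/dx_j
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        return (F(t, rp)[i] - F(t, rm)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def d_dt(F, t, r, h=1e-5):
    return tuple((a - b) / (2 * h) for a, b in zip(F(t + h, r), F(t - h, r)))

t0, r0 = 0.2, (0.3, -0.1, 0.5)
faraday = tuple(a + b for a, b in zip(curl(E, t0, r0), d_dt(B, t0, r0)))   # curl E + dB/dt
ampere = tuple(a - b for a, b in zip(curl(B, t0, r0), d_dt(E, t0, r0)))    # curl B - dE/dt
```

Both `faraday` and `ampere` should be zero to within the finite-difference error.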
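The source-free Maxwell equation $\nabla\cdot\mathbf{E} = 0$ implies

$$ \partial_i E_i(n_j x_j - t) = n_i\, E'_i(n_j x_j - t) = -\frac{\partial}{\partial t}\, \mathbf{n}\cdot\mathbf{E} = 0 . \tag{6.17} $$

Again, we can drop the constant of integration, and conclude that for the plane wave

$$ \mathbf{n}\cdot\mathbf{E} = 0 . \tag{6.18} $$

Since $\mathbf{B} = \mathbf{n}\times\mathbf{E}$, it immediately follows that $\mathbf{n}\cdot\mathbf{B} = 0$ and $\mathbf{E}\cdot\mathbf{B} = 0$ also. Thus we see that for a plane electromagnetic wave propagating along the $\mathbf{n}$ direction, the $\mathbf{E}$ and $\mathbf{B}$ vectors are orthogonal to $\mathbf{n}$ and also orthogonal to each other:

$$ \mathbf{n}\cdot\mathbf{E} = 0 , \qquad \mathbf{n}\cdot\mathbf{B} = 0 , \qquad \mathbf{E}\cdot\mathbf{B} = 0 . \tag{6.19} $$

It also follows from $\mathbf{B} = \mathbf{n}\times\mathbf{E}$ that

$$ |\mathbf{E}| = |\mathbf{B}| . \tag{6.20} $$

Thus we find that the energy density $W$ is given by

$$ W = \frac{1}{8\pi}\,(E^2 + B^2) = \frac{1}{4\pi}\, E^2 . \tag{6.21} $$

The Poynting flux $\mathbf{S} = (\mathbf{E}\times\mathbf{B})/(4\pi)$ is given by

$$ S_i = \frac{1}{4\pi}\, \epsilon_{ijk}\, E_j\, \epsilon_{k\ell m}\, n_\ell E_m = \frac{1}{4\pi}\, n_i E_j E_j - \frac{1}{4\pi}\, E_i\, n_j E_j = \frac{1}{4\pi}\, n_i E_j E_j , \tag{6.22} $$

and so we have

$$ W = \frac{1}{4\pi}\, E^2 , \qquad \mathbf{S} = \frac{1}{4\pi}\, \mathbf{n}\, E^2 = \mathbf{n}\, W . \tag{6.23} $$

Note that the argument $\mathbf{n}\cdot\mathbf{r} - t$ can be written as

$$ \mathbf{n}\cdot\mathbf{r} - t = n_\mu x^\mu , \tag{6.24} $$

where

$$ n_\mu = (-1, \mathbf{n}) \qquad \text{and hence} \qquad n^\mu = (1, \mathbf{n}) . \tag{6.25} $$

Since $\mathbf{n}$ is a unit vector, $\mathbf{n}\cdot\mathbf{n} = 1$, we have

$$ n^\mu n_\mu = \eta_{\mu\nu}\, n^\mu n^\nu = 0 . \tag{6.26} $$

$n^\mu$ is called a Null Vector. This is a non-vanishing vector whose norm $n^\mu n_\mu$ vanishes. Such vectors can arise because of the minus sign in the $\eta_{00}$ component of the 4-metric.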
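By contrast, in a metric of positive-definite signature, such as the 3-dimensional Euclidean metric $\delta_{ij}$, a vector whose norm vanishes is itself necessarily zero.

We can now evaluate the various components of the energy-momentum tensor, which are given by (4.104) and the equations that follow it. Thus we have

$$
\begin{aligned}
T^{00} &= W = \frac{1}{4\pi}\, E^2 = \frac{1}{4\pi}\, B^2 , \\
T^{0i} &= T^{i0} = S_i = n_i W , \\
T^{ij} &= \frac{1}{4\pi}\, \big( -E_i E_j - B_i B_j + \tfrac{1}{2}\,(E^2 + B^2)\, \delta_{ij} \big) \\
&= \frac{1}{4\pi}\, \big( -E_i E_j - \epsilon_{ik\ell}\, \epsilon_{jmn}\, n_k n_m E_\ell E_n + E^2 \delta_{ij} \big) \\
&= \frac{1}{4\pi}\, \big( -E_i E_j - \delta_{ij} E^2 - n_i n_k E_k E_j - n_j n_\ell E_\ell E_i + \delta_{ij}\, n_k n_\ell E_k E_\ell + n_i n_j E_\ell E_\ell + n_k n_k E_i E_j + E^2 \delta_{ij} \big) \\
&= \frac{1}{4\pi}\, n_i n_j E^2 = n_i n_j W .
\end{aligned} \tag{6.27}
$$

Note that in deriving this last result, we have used the identity

$$ \epsilon_{ik\ell}\, \epsilon_{jmn} = \delta_{ij} \delta_{km} \delta_{\ell n} + \delta_{im} \delta_{kn} \delta_{\ell j} + \delta_{in} \delta_{kj} \delta_{\ell m} - \delta_{im} \delta_{kj} \delta_{\ell n} - \delta_{ij} \delta_{kn} \delta_{\ell m} - \delta_{in} \delta_{km} \delta_{\ell j} . \tag{6.28} $$

The expressions for $T^{00}$, $T^{0i}$ and $T^{ij}$ can be combined into the single Lorentz-covariant expression

$$ T^{\mu\nu} = n^\mu n^\nu\, W . \tag{6.29} $$

From this, we can compute the conserved 4-momentum

$$ P^\mu = \int_{t = \text{const.}} T^{\mu\nu}\, d\Sigma_\nu = \int T^{\mu 0}\, d^3 x = n^\mu \int W\, d^3 x , \tag{6.30} $$

and hence we have

$$ P^\mu = n^\mu\, \mathcal{E} , \tag{6.31} $$

where

$$ \mathcal{E} = \int W\, d^3 x , \tag{6.32} $$

the total energy of the electromagnetic field. Note that $P^\mu$ is also a null vector,

$$ P^\mu P_\mu = \mathcal{E}^2\, n^\mu n_\mu = 0 . \tag{6.33} $$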
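The index manipulations leading to $T^{ij} = n_i n_j W$ can be spot-checked with explicit numbers. The sketch below is a minimal illustration with an assumed unit vector $\mathbf{n}$ and an assumed field $\mathbf{E}$ orthogonal to it (neither taken from the notes). It builds $\mathbf{B} = \mathbf{n}\times\mathbf{E}$, evaluates the first form of $T^{ij}$ in (6.27) directly, and compares with $n_i n_j W$ and with $\mathbf{S} = \mathbf{n} W$.

```python
import math

n = (2 / 3, 1 / 3, 2 / 3)     # illustrative unit vector
E = (1.0, -2.0, 0.0)          # illustrative field, chosen orthogonal to n

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

B = cross(n, E)                                   # plane-wave relation B = n x E
E2 = sum(c * c for c in E)
B2 = sum(c * c for c in B)

W = (E2 + B2) / (8 * math.pi)                     # energy density
S = tuple(c / (4 * math.pi) for c in cross(E, B)) # Poynting flux

# Spatial stress components from the first line of T^{ij} in (6.27)
T = [[(-E[i] * E[j] - B[i] * B[j] + 0.5 * (E2 + B2) * (i == j)) / (4 * math.pi)
      for j in range(3)] for i in range(3)]

max_T_error = max(abs(T[i][j] - n[i] * n[j] * W) for i in range(3) for j in range(3))
max_S_error = max(abs(S[i] - n[i] * W) for i in range(3))
```

Both `max_T_error` and `max_S_error` should vanish up to rounding.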
(6. k) = ω nµ . and since · E = 0 and · B = 0. n) and n = k/k = k/ω. there must be relations among the constants k.34) where E0 and B0 are (possibly complex) constants. 81 (6. with the understanding that we take the real parts to get the physical quantities.40) (6. The physical E and B ﬁelds are obtained by taking the real parts of E and B.2 Monochromatic plane waves In the discussion above.35) then becomes simply the statement that k µ is a null vector. we can always choose to work in such a complex notation. ω (6. Speciﬁcally. so that its time dependence is of the form cos ωt.39) .38) where nµ = (1. we must have k · E0 = 0 . (6.6.34) can now be written as k · r − ωt = kµ xµ .37) It is natural. we considered plane electromagnetic waves with an arbitrary proﬁle. (Since the Maxwell equations are linear. we must have k2 = ω2 .) As we shall discuss in some detail later. k · B0 = 0 .3). the more general planewave solutions discussed previously. ω. Note that the argument of the exponentials in (6. A special case is to consider the situation when the plane wave has a deﬁnite frequency ω. Of course. therefore. with an arbitrary proﬁle for the wave. to introduce the 4vector k µ = (ω. B = B0 ei(k·r−ωt) .2) and (6. following the discussion in the more general case above. Thus we can write E = E0 ei(k·r−ωt) .34) to solve the Maxwell equations. can be built up as linear combinations of the monochromatic planewave solutions. Equation (6.36) ×E+ (6. since E and B must satisfy the wave equations (6. E0 and B0 . for the ﬁelds in (6. it follows from ∂ B/∂t = 0 and × B − ∂ E/∂t = 0 that B= k×E . (6.35) Finally. k µ kµ = 0 .
which we shall commonly write as $k\cdot x$. Thus we may rewrite (6.34) more briefly as

$$ \mathbf{E} = \mathbf{E}_0\, e^{i\, k\cdot x} , \qquad \mathbf{B} = \mathbf{B}_0\, e^{i\, k\cdot x} . \tag{6.41} $$

As usual, we have a plane transverse wave, propagating in the direction of the unit 3-vector $\mathbf{n} = \mathbf{k}/\omega$. The term "transverse" here signifies that $\mathbf{E}$ and $\mathbf{B}$ are perpendicular to the direction in which the wave is propagating. In fact, we have

$$ \mathbf{n}\cdot\mathbf{E} = \mathbf{n}\cdot\mathbf{B} = 0 , \qquad \mathbf{B} = \mathbf{n}\times\mathbf{E} , \tag{6.42} $$

and so we have also that $\mathbf{E}$ and $\mathbf{B}$ are perpendicular to each other, and that $|\mathbf{E}| = |\mathbf{B}|$.

Consider the case where $\mathbf{E}_0$ is taken to be real, which means that $\mathbf{B}_0$ is real too. Then the physical fields (obtained by taking the real parts of the fields given in (6.34)) are given by

$$ \mathbf{E} = \mathbf{E}_0 \cos(\mathbf{k}\cdot\mathbf{r} - \omega t) , \qquad \mathbf{B} = \mathbf{B}_0 \cos(\mathbf{k}\cdot\mathbf{r} - \omega t) . \tag{6.43} $$

The energy density is then given by

$$ W = \frac{1}{8\pi}\,(E^2 + B^2) = \frac{1}{4\pi}\, E_0^2 \cos^2(\mathbf{k}\cdot\mathbf{r} - \omega t) . \tag{6.44} $$

If we define the time average of $W$ by

$$ \langle W \rangle \equiv \frac{1}{T} \int_0^T W\, dt , \tag{6.45} $$

where $T = 2\pi/\omega$ is the period of the oscillation, then we shall have

$$ \langle W \rangle = \frac{1}{8\pi}\, E_0^2 = \frac{1}{8\pi}\, B_0^2 . \tag{6.46} $$

Note that in terms of the complex expressions (6.34), we can write this as

$$ \langle W \rangle = \frac{1}{8\pi}\, \mathbf{E}\cdot\mathbf{E}^* = \frac{1}{8\pi}\, \mathbf{B}\cdot\mathbf{B}^* , \tag{6.47} $$

where the $*$ denotes complex conjugation, since the time and position dependence of $\mathbf{E}$ or $\mathbf{B}$ is cancelled when multiplied by the complex conjugate field. In general, when $\mathbf{E}_0$ and $\mathbf{B}_0$ are not real, we shall also have the same expressions (6.47) for the time-averaged energy density. (Footnote 11: This "trick," of expressing the time-averaged energy density in terms of the dot product of the complex field with its complex conjugate, is rather specific to this situation, where the quantity being time-averaged is quadratic in the electric and magnetic fields.)

In a similar manner, we can evaluate the time average of the Poynting flux vector $\mathbf{S} = (\mathbf{E}\times\mathbf{B})/(4\pi)$. If we first consider the case where $\mathbf{E}_0$ is real, we shall have

$$ \mathbf{S} = \frac{1}{4\pi}\, \mathbf{E}\times\mathbf{B} = \frac{1}{4\pi}\, \mathbf{E}_0\times\mathbf{B}_0\, \cos^2(\mathbf{k}\cdot\mathbf{r} - \omega t) = \frac{1}{4\pi}\, \mathbf{n}\, E_0^2 \cos^2(\mathbf{k}\cdot\mathbf{r} - \omega t) , \tag{6.48} $$
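and so

$$ \langle \mathbf{S} \rangle = \frac{1}{8\pi}\, \mathbf{E}_0\times\mathbf{B}_0 = \frac{1}{8\pi}\, \mathbf{n}\, E_0^2 . \tag{6.49} $$

In general, even if $\mathbf{E}_0$ and $\mathbf{B}_0$ are not real, we can write $\langle \mathbf{S} \rangle$ in terms of the complex $\mathbf{E}$ and $\mathbf{B}$ fields as

$$ \langle \mathbf{S} \rangle = \frac{1}{8\pi}\, \mathbf{E}\times\mathbf{B}^* = \frac{1}{8\pi}\, \mathbf{n}\; \mathbf{E}\cdot\mathbf{E}^* , \tag{6.50} $$

and so we have

$$ \langle \mathbf{S} \rangle = \mathbf{n}\, \langle W \rangle . \tag{6.51} $$

6.3 Motion of a point charge in a linearly-polarised E.M. wave

Consider a plane wave propagating in the $z$ direction, with

$$ \mathbf{E} = (E_0 \cos\omega(z - t),\; 0,\; 0) , \qquad \mathbf{B} = (0,\; E_0 \cos\omega(z - t),\; 0) . \tag{6.52} $$

Suppose now that there is a particle of mass $m$ and charge $e$ in this field. By the Lorentz force equation we shall have

$$ \frac{d\mathbf{p}}{dt} = e\,\mathbf{E} + e\,\mathbf{v}\times\mathbf{B} . \tag{6.53} $$

For simplicity, we shall make the assumption that the motion of the particle can be treated non-relativistically, and so

$$ \mathbf{p} = m\mathbf{v} = m\, \frac{d\mathbf{r}}{dt} . \tag{6.54} $$

Let us suppose that the particle is initially located at the point $z = 0$, and that it moves only by a small amount in comparison to the wavelength $2\pi/\omega$ of the electromagnetic wave. Therefore, to a good approximation, we can assume that the particle is sitting in the uniform, although time-dependent, electromagnetic field obtained by setting $z = 0$ in (6.52). Thus

$$ \mathbf{E} = (E_0 \cos\omega t,\; 0,\; 0) , \qquad \mathbf{B} = (0,\; E_0 \cos\omega t,\; 0) , \tag{6.55} $$

and so the Lorentz force equation gives

$$
\begin{aligned}
m \ddot x &= e E_0 \cos\omega t - e \dot z\, E_0 \cos\omega t \approx e E_0 \cos\omega t , \\
m \ddot y &= 0 , \\
m \ddot z &= e \dot x\, E_0 \cos\omega t .
\end{aligned} \tag{6.56}
$$

Note that the approximation in the first line follows from our assumption that the motion of the particle is non-relativistic, so $|\dot z| \ll 1$.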
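With convenient and inessential choices for the constants of integration, we first obtain

$$ \dot x = \frac{e E_0}{m \omega}\, \sin\omega t , \qquad x = -\frac{e E_0}{m \omega^2}\, \cos\omega t . \tag{6.57} $$

Substituting into the $z$ equation then gives

$$ \ddot z = \frac{e^2 E_0^2}{m^2 \omega}\, \sin\omega t \cos\omega t = \frac{e^2 E_0^2}{2 m^2 \omega}\, \sin 2\omega t , \tag{6.58} $$

which integrates to give (dropping inessential constants of integration)

$$ z = -\frac{e^2 E_0^2}{8 m^2 \omega^3}\, \sin 2\omega t . \tag{6.59} $$

The motion in the $y$ direction is purely linear, and since we are not interested in the case where the particle drifts uniformly through space, we can just focus on the solution where $y$ is constant, say $y = 0$. Thus the interesting motion of the particle in the electromagnetic field is of the form

$$ x = \alpha \cos\omega t , \qquad z = \beta \sin 2\omega t = 2\beta \sin\omega t \cos\omega t , \tag{6.60} $$

which means

$$ z = \pm \frac{2\beta x}{\alpha} \sqrt{1 - \frac{x^2}{\alpha^2}} , \tag{6.61} $$

where both signs of the square root occur over one cycle, since $\sin\omega t = \pm\sqrt{1 - x^2/\alpha^2}$.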
This describes a "figure of eight" lying on its side in the (x, z) plane. The assumptions we made in deriving this, namely non-relativistic motion and a small z displacement relative to the wavelength of the electromagnetic wave, can be seen to be satisfied provided the amplitude E0 of the wave is sufficiently small. The response of the charged particle to the electromagnetic wave provides a model for how the electrons in a receiving antenna behave in the presence of an electromagnetic wave. This shows how the wave is converted into oscillatory currents in the antenna, which are then amplified and processed into the final output signal in a radio receiver.
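The closed-form motion in (6.57) and (6.59) can be compared with a direct numerical integration of the approximate equations of motion (6.56). The sketch below is a minimal example using a hand-written fourth-order Runge-Kutta step. The parameter values are illustrative assumptions, chosen so that $eE_0/(m\omega) \ll 1$, and the initial data are taken from the analytic solution at $t = 0$.

```python
import math

e, m, E0, w = 1.0, 1.0, 0.01, 1.0               # illustrative values with e*E0/(m*w) << 1
alpha = -e * E0 / (m * w ** 2)                  # x amplitude, from (6.57)
beta = -(e * E0) ** 2 / (8 * m ** 2 * w ** 3)   # z amplitude, from (6.59)

def rhs(t, s):
    """Approximate equations of motion (6.56) for the state s = (x, xdot, z, zdot)."""
    x, xd, z, zd = s
    force = e * E0 * math.cos(w * t) / m        # small zdot term in the x equation dropped
    return (xd, force, zd, xd * force)

def rk4_step(t, s, h):
    k1 = rhs(t, s)
    k2 = rhs(t + h / 2, tuple(si + h / 2 * ki for si, ki in zip(s, k1)))
    k3 = rhs(t + h / 2, tuple(si + h / 2 * ki for si, ki in zip(s, k2)))
    k4 = rhs(t + h, tuple(si + h * ki for si, ki in zip(s, k3)))
    return tuple(si + h / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

steps = 2000
h = 2 * math.pi / w / steps                     # integrate over one period of the wave
t, s = 0.0, (alpha, 0.0, 0.0, 2 * w * beta)     # initial data matching (6.57) and (6.59)
max_err = 0.0
for _ in range(steps):
    s = rk4_step(t, s, h)
    t += h
    max_err = max(max_err,
                  abs(s[0] - alpha * math.cos(w * t)),
                  abs(s[2] - beta * math.sin(2 * w * t)))
```

The quantity `max_err` measures the largest deviation of the integrated trajectory from $x = \alpha\cos\omega t$, $z = \beta\sin 2\omega t$ over one period, and should be at the level of the integration error.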
6.4 Circular and elliptical polarisation
The electromagnetic wave described in section 6.2 is linearly polarised. For example, we could consider the solution with

$$ \mathbf{E}_0 = (0, E_0, 0) , \qquad \mathbf{B}_0 = (0, 0, B_0) , \qquad \mathbf{n} = (1, 0, 0) . \tag{6.62} $$
This corresponds to a linearly polarised electromagnetic wave propagating along the x direction.
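By taking a linear superposition of waves propagating along a given direction $\mathbf{n}$, we can obtain circularly polarised, or more generally, elliptically polarised, waves. Let $\mathbf{e}$ and $\mathbf{f}$ be two orthogonal unit vectors that are also both orthogonal to $\mathbf{n}$:

$$ \mathbf{e}\cdot\mathbf{e} = 1 , \quad \mathbf{f}\cdot\mathbf{f} = 1 , \quad \mathbf{n}\cdot\mathbf{n} = 1 , \qquad \mathbf{e}\cdot\mathbf{f} = 0 , \quad \mathbf{n}\cdot\mathbf{e} = 0 , \quad \mathbf{n}\cdot\mathbf{f} = 0 . \tag{6.63} $$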
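Suppose now we consider a plane wave given by

$$ \mathbf{E} = (E_0\, \mathbf{e} + \tilde E_0\, \mathbf{f})\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} , \qquad \mathbf{B} = \mathbf{n}\times\mathbf{E} , \tag{6.64} $$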
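where $E_0$ and $\tilde E_0$ are complex constants. If $E_0$ and $\tilde E_0$ both have the same phase (i.e. $\tilde E_0/E_0$ is real), then we again have a linearly-polarised electromagnetic wave. If instead the phases of $E_0$ and $\tilde E_0$ are different, then the wave is in general elliptically polarised. Consider as an example the case where

$$ \tilde E_0 = \pm i\, E_0 , \tag{6.65} $$

for which the electric field will be given by

$$ \mathbf{E} = E_0\, (\mathbf{e} \pm i\, \mathbf{f})\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} . \tag{6.66} $$

Taking the real part, to get the physical electric field (with $E_0$ taken to be real), we obtain

$$ \mathbf{E} = E_0\, \mathbf{e} \cos(\mathbf{k}\cdot\mathbf{r} - \omega t) \mp E_0\, \mathbf{f} \sin(\mathbf{k}\cdot\mathbf{r} - \omega t) . \tag{6.67} $$

For example, if we choose

$$ \mathbf{n} = (0, 0, 1) , \qquad \mathbf{e} = (1, 0, 0) , \qquad \mathbf{f} = (0, 1, 0) , \tag{6.68} $$

then the electric field is given by

$$ E_x = E_0 \cos\omega(z - t) , \qquad E_y = \mp E_0 \sin\omega(z - t) . \tag{6.69} $$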
It is clear from this that the magnitude of the electric field is constant,

$$ |\mathbf{E}| = E_0 . \tag{6.70} $$
If we fix a value of z, then the E vector can be seen to be rotating around the z axis (the direction of motion of the wave). This rotation is anticlockwise in the (x, y) plane if we choose the plus sign in (6.65), and clockwise if we choose the minus sign instead. These two choices correspond to having a circularly polarised wave of positive or negative helicity
respectively. (Positive helicity means the rotation is parallel to the direction of propagation, while negative helicity means the rotation is antiparallel to the direction of propagation.) In more general cases, where the magnitudes of $E_0$ and $\tilde E_0$ are unequal, or where the phase angle between them is not equal to 0 (linear polarisation) or 90 degrees, the electromagnetic wave will be elliptically polarised. Consider, for example, the case where the electric field is given by

$$ \mathbf{E} = (a_1 e^{i\delta_1},\; a_2 e^{i\delta_2},\; 0)\, e^{i\omega(z - t)} , \tag{6.71} $$

with the propagation direction being $\mathbf{n} = (0, 0, 1)$. Then we shall have

$$ \mathbf{B} = \mathbf{n}\times\mathbf{E} = (-a_2 e^{i\delta_2},\; a_1 e^{i\delta_1},\; 0)\, e^{i\omega(z - t)} . \tag{6.72} $$
The real constants $a_1$, $a_2$, $\delta_1$ and $\delta_2$ determine the nature of this plane wave propagating along the $z$ direction. Of course the overall phase is unimportant, so really it is only the difference $\delta_2 - \delta_1$ between the phase angles that is important. The magnitude and phase information is sometimes expressed in terms of the Stokes Parameters $(s_0, s_1, s_2, s_3)$, which are defined by

$$
\begin{aligned}
s_0 &= E_x E_x^* + E_y E_y^* = a_1^2 + a_2^2 , \\
s_1 &= E_x E_x^* - E_y E_y^* = a_1^2 - a_2^2 , \\
s_2 &= 2\, \Re(E_x^* E_y) = 2 a_1 a_2 \cos(\delta_2 - \delta_1) , \\
s_3 &= 2\, \Im(E_x^* E_y) = 2 a_1 a_2 \sin(\delta_2 - \delta_1) .
\end{aligned} \tag{6.73}
$$

(The last two involve the real and imaginary parts of $E_x^* E_y$ respectively.) The four Stokes parameters are not independent:

$$ s_0^2 = s_1^2 + s_2^2 + s_3^2 . \tag{6.74} $$
The parameter $s_0$ characterises the intensity of the electromagnetic wave, while $s_1$ characterises the amount of $x$ polarisation versus $y$ polarisation, with

$$ -s_0 \le s_1 \le s_0 . \tag{6.75} $$
The third independent parameter, which could be taken to be $s_2$, characterises the phase difference between the $x$ and the $y$ polarised waves. Circular polarisation with $\pm$ helicity corresponds to

$$ s_1 = 0 , \qquad s_2 = 0 , \qquad s_3 = \pm s_0 . \tag{6.76} $$
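The definitions (6.73) translate directly into a few lines of code. The following is a minimal sketch that computes the Stokes parameters from the complex amplitudes $E_x$ and $E_y$ (with the common factor $e^{i\omega(z-t)}$ dropped, since it cancels), and evaluates them for a general elliptical case, for the two circular cases, and for a linear case. The function name `stokes` and the sample amplitudes and phases are illustrative assumptions.

```python
import cmath
import math

def stokes(Ex, Ey):
    """Stokes parameters (6.73) from the complex amplitudes E_x and E_y."""
    s0 = abs(Ex) ** 2 + abs(Ey) ** 2
    s1 = abs(Ex) ** 2 - abs(Ey) ** 2
    cross_term = Ex.conjugate() * Ey            # E_x^* E_y
    return s0, s1, 2 * cross_term.real, 2 * cross_term.imag

a1, a2, d1, d2 = 1.5, 0.8, 0.3, 1.4             # illustrative amplitudes and phases
general = stokes(a1 * cmath.exp(1j * d1), a2 * cmath.exp(1j * d2))

circular_plus = stokes(1.0 + 0j, 1j)            # E_y = +i E_x, the plus sign in (6.65)
circular_minus = stokes(1.0 + 0j, -1j)          # E_y = -i E_x, the minus sign in (6.65)
linear = stokes(1.0 + 0j, 2.0 + 0j)             # equal phases: linear polarisation
```

The general case should satisfy the constraint (6.74), the circular cases should give $s_3 = \pm s_0$ with $s_1 = s_2 = 0$ as in (6.76), and the linear case should give $s_3 = 0$.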
6.5 General superposition of plane waves
So far in the discussion of electromagnetic waves, we have considered the case where there is a single direction of propagation (i.e. a plane wave), and a single frequency (monochromatic).
but in the present case. and a is a constant. the gauge potential that gives the physical (i. when we want to describe the actual physical quantities. 87 a −→ aλ (k) . as A = ae ei (k·r−ωt) . it is quite helpful. we shall ﬁrst write a single monochromatic plane wave in terms of the 3vector potential.c stands for “complex conjugate.c. The electric and magnetic ﬁelds will be given by E = − φ− B = ∂A = i aω e ei (k·r−ωt) . Such a gauge choice would not be convenient when discussing solutions in electrostatics. (6. (We have absorbed a factor of “physical” A as A = ae ei (k·r−ωt) + c. We shall therefore label the polarisation vectors and amplitudes as follows: e −→ eλ (k) . ∂t k×E .80) 1 2 (6. we shall write A = ae ei (k·r−ωt) + a∗ e e−i (k·r−ωt) . We have established. real) electric and magnetic ﬁelds is given by taking the real part of A in (6. therefore. that (6. provided that e · k = 0 and k = ω. For the present purposes. with electric ﬁeld along e. Thus.81) (6. Thus. in order to avoid carrying factors around in all the subsequent equations. with diﬀerent wavevectors k.77). a convenient choice of gauge is to set φ = 0. where c.79) 1 2 here into a rescaling of a. we shall usually write the . we must have  k2 = ω 2 .) For brevity. As usual.e. In order to discuss the general wave solutions. and diﬀerent amplitudes a. .77) describes a monochromatic plane wave propagating along the k direction.78) We can immediately see that E and B satisfy the wave equation. More precisely. diﬀerent polarisation vectors e. and that we must impose e · k = 0 in order to satisfy · E = 0.The most general wavelike solutions of the Maxwell equations can be expressed as linear cobinations of these basic monochromatic planewave solutions. it is helpful to work with the gauge potential Aµ = (φ.” Now consider a general linear superposition of monochromatic plane waves. Recall that we have the freedom to make gauge transformations Aµ → Aµ + ∂µ λ. (6. × A = i ak × e ei (k·r−ωt) = ω (6. 
where we know that the wave solutions are necessarily timedependent.77) where e is a unit polarisation vector. where λ is an arbitrary function. A). of describing wave solutions.
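Here $\lambda$ is an index which ranges over the values 1 and 2, which labels 2 real orthonormal vectors $\mathbf{e}_1(\mathbf{k})$ and $\mathbf{e}_2(\mathbf{k})$ that span the 2-plane perpendicular to $\mathbf{k}$. The general wave solution can then be written as the sum over all such monochromatic plane waves of the form (6.80). Since a continuous range of wavevectors is allowed, the summation over these will be a 3-dimensional integral. Thus we can write
$$\mathbf{A} = \sum_{\lambda=1}^{2} \int \frac{d^3k}{(2\pi)^3} \left[ \mathbf{e}_\lambda(\mathbf{k})\, a_\lambda(\mathbf{k})\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} + \text{c.c.} \right], \tag{6.82}$$
where $\omega = |\mathbf{k}|$, and
$$\mathbf{k}\cdot\mathbf{e}_\lambda(\mathbf{k}) = 0. \tag{6.83}$$

For many purposes, it will be convenient to expand $\mathbf{A}$ in a basis of circularly-polarised monochromatic plane waves, rather than linearly-polarised waves. In this case, we should choose the 2-dimensional basis of polarisation vectors $\boldsymbol{\epsilon}_\pm$, related to the previous basis by
$$\boldsymbol{\epsilon}_\pm = \frac{1}{\sqrt2}\,(\mathbf{e}_1 \pm i\,\mathbf{e}_2). \tag{6.84}$$
Since we have $\mathbf{e}_i\cdot\mathbf{e}_j = \delta_{ij}$, it follows that
$$\boldsymbol{\epsilon}_+\cdot\boldsymbol{\epsilon}_+ = 0, \qquad \boldsymbol{\epsilon}_-\cdot\boldsymbol{\epsilon}_- = 0, \qquad \boldsymbol{\epsilon}_+\cdot\boldsymbol{\epsilon}_- = 1. \tag{6.85}$$
Note that $\boldsymbol{\epsilon}_\pm^* = \boldsymbol{\epsilon}_\mp$. We can label the basis vectors by $\boldsymbol{\epsilon}_\lambda$, where $\lambda$ is now understood to take the two "values" $+$ and $-$. We then write the general wave solution as
$$\mathbf{A} = \sum_{\lambda=\pm} \int \frac{d^3k}{(2\pi)^3} \left[ \boldsymbol{\epsilon}_\lambda(\mathbf{k})\, a_\lambda(\mathbf{k})\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} + \text{c.c.} \right]. \tag{6.86}$$
Of course, we also have $\mathbf{k}\cdot\boldsymbol{\epsilon}_\lambda = 0$, and $\omega = |\mathbf{k}|$.

6.5.1 Helicity and energy of circularly-polarised waves

The angular-momentum tensor $M^{\mu\nu}$ for the electromagnetic field is defined by
$$M^{\mu\nu} = \int_{t=\text{const}} (x^\mu T^{\nu\rho} - x^\nu T^{\mu\rho})\, d\Sigma_\rho, \tag{6.87}$$
and so the three-dimensional components $M^{ij}$ are
$$M^{ij} = \int_{t=\text{const}} (x^i T^{j\rho} - x^j T^{i\rho})\, d\Sigma_\rho = \int (x^i T^{j0} - x^j T^{i0})\, d^3x = \int (x^i S^j - x^j S^i)\, d^3x. \tag{6.88}$$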
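Thus, the three-dimensional angular momentum $L_i = \tfrac12 \epsilon_{ijk} M^{jk}$ is given by
$$L_i = \int \epsilon_{ijk}\, x^j S^k\, d^3x, \tag{6.89}$$
i.e., since $\mathbf{S} = (\mathbf{E}\times\mathbf{B})/(4\pi)$,
$$\mathbf{L} = \frac{1}{4\pi} \int \mathbf{r}\times(\mathbf{E}\times\mathbf{B})\, d^3x. \tag{6.90}$$
Now, since $\mathbf{B} = \nabla\times\mathbf{A}$, we have
$$\begin{aligned}
[\mathbf{r}\times(\mathbf{E}\times\mathbf{B})]_i &= \epsilon_{ijk}\,\epsilon_{k\ell m}\, x_j E_\ell B_m \\
&= \epsilon_{ijk}\,\epsilon_{k\ell m}\,\epsilon_{mpq}\, x_j E_\ell\, \partial_p A_q \\
&= \epsilon_{ijk}\,(\delta_{kp}\delta_{\ell q} - \delta_{kq}\delta_{\ell p})\, x_j E_\ell\, \partial_p A_q \\
&= \epsilon_{ijk}\, x_j E_\ell\, \partial_k A_\ell - \epsilon_{ijk}\, x_j E_\ell\, \partial_\ell A_k,
\end{aligned} \tag{6.91}$$
and so
$$\begin{aligned}
L_i &= \frac{1}{4\pi} \int \left( \epsilon_{ijk}\, x_j E_\ell\, \partial_k A_\ell - \epsilon_{ijk}\, x_j E_\ell\, \partial_\ell A_k \right) d^3x \\
&= \frac{1}{4\pi} \int \left( -\epsilon_{ijk}\, \partial_k(x_j E_\ell)\, A_\ell + \epsilon_{ijk}\, \partial_\ell(x_j E_\ell)\, A_k \right) d^3x \\
&= \frac{1}{4\pi} \int \left( -\epsilon_{ijk}\, x_j (\partial_k E_\ell)\, A_\ell + \epsilon_{ijk}\, E_j A_k \right) d^3x.
\end{aligned} \tag{6.92}$$
Note that in performing the integrations by parts here, we have, as usual, assumed that the fields fall off fast enough at infinity that the surface term can be dropped. We have also used the source-free Maxwell equation $\partial_\ell E_\ell = 0$ in getting to the final line. Thus, we conclude that the angular momentum 3-vector can be expressed as
$$\mathbf{L} = \frac{1}{4\pi} \int \left( \mathbf{E}\times\mathbf{A} - A_i\, (\mathbf{r}\times\nabla) E_i \right) d^3x. \tag{6.93}$$

The two terms in (6.93) can be interpreted as follows. The second term can be viewed as an "orbital angular momentum," since it clearly depends on the choice of origin. It is rather analogous to an $\mathbf{r}\times\mathbf{p}$ contribution to the angular momentum of a system of particles. On the other hand, the first term in (6.93) can be viewed as an "intrinsic spin" term, since it is constructed purely from the electromagnetic fields themselves, and is independent of the choice of origin. We shall calculate this spin contribution,
$$\mathbf{L}_{\text{spin}} = \frac{1}{4\pi} \int \mathbf{E}\times\mathbf{A}\; d^3x, \tag{6.94}$$
to the angular momentum in the case of the sum over circularly-polarised waves that we introduced in the previous section. Recall that for this sum, the 3-vector potential is given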
by
$$\mathbf{A} = \sum_{\lambda'=\pm} \int \frac{d^3k'}{(2\pi)^3} \left[ \boldsymbol{\epsilon}_{\lambda'}(\mathbf{k}')\, a_{\lambda'}(\mathbf{k}')\, e^{i(\mathbf{k}'\cdot\mathbf{r} - \omega' t)} + \text{c.c.} \right]. \tag{6.95}$$
The electric field is then given by
$$\mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t} = \sum_{\lambda=\pm} \int \frac{d^3k}{(2\pi)^3} \left[ i\omega\, \boldsymbol{\epsilon}_\lambda(\mathbf{k})\, a_\lambda(\mathbf{k})\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} + \text{c.c.} \right]. \tag{6.96}$$
Note that we have put primes on the summation and integration variables $\lambda$ and $\mathbf{k}$ in the expression for $\mathbf{A}$. This is so that we can take the product $\mathbf{E}\times\mathbf{A}$ and not have a clash of "dummy" summation variables, in what will follow below. We have also written the frequency as $\omega' \equiv |\mathbf{k}'|$ in the expression for $\mathbf{A}$.

Our interest will be to calculate the time average
$$\langle \mathbf{L}_{\text{spin}} \rangle \equiv \frac{1}{T} \int_0^T \mathbf{L}_{\text{spin}}\, dt. \tag{6.97}$$
Since we are considering a wave solution with an entire "chorus" of frequencies now, we define the time average by taking $T$ to infinity. (It is easily seen that this coincides with the previous definition of the time average for a monochromatic wave of frequency $\omega$, where $T$ was taken to be $2\pi/\omega$.) Note that the time average will be zero for any quantity whose time dependence is of the oscillatory form $e^{i\nu t}$, because we would have
$$\frac{1}{T} \int_0^T e^{i\nu t}\, dt = \frac{1}{i\nu T}\,(e^{i\nu T} - 1), \tag{6.98}$$
which clearly goes to zero as $T$ goes to infinity. Since the time dependence of all the quantities we shall consider is precisely of the form $e^{i\nu t}$, it follows that in order to survive the time averaging, it must be that $\nu = 0$.

We are interested in calculating the time average of $\mathbf{E}\times\mathbf{A}$, where $\mathbf{A}$ and $\mathbf{E}$ are given by (6.95) and (6.96). The quantities $\omega$ appearing there are, by definition, positive, since we have defined $\omega \equiv |\mathbf{k}|$. The only way that we shall get terms in $\mathbf{E}\times\mathbf{A}$ that have zero frequency (i.e. $\nu = 0$) is from the product of one of the terms that is explicitly written times one of the "c.c." terms, since these, of course, have the opposite sign for their frequency dependence.

The upshot of this discussion is that when we evaluate the time average of $\mathbf{E}\times\mathbf{A}$, with $\mathbf{A}$ and $\mathbf{E}$ given by (6.95) and (6.96), the only terms that survive will be coming from the product of the explicitly-written terms for $\mathbf{E}$ times the "c.c." term for $\mathbf{A}$, plus the "c.c." term for $\mathbf{E}$ times the explicitly-written term for $\mathbf{A}$. Furthermore, in order for the products
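to have zero frequency, and therefore survive the time averaging, it must be that $\omega' = \omega$. We therefore find
$$\langle \mathbf{E}\times\mathbf{A} \rangle = \sum_{\lambda\lambda'} \int \frac{d^3k}{(2\pi)^3} \frac{d^3k'}{(2\pi)^3}\; i\omega \left[ \boldsymbol{\epsilon}_\lambda(\mathbf{k})\times\boldsymbol{\epsilon}^*_{\lambda'}(\mathbf{k}')\, a_\lambda(\mathbf{k})\, a^*_{\lambda'}(\mathbf{k}')\, e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}} - \boldsymbol{\epsilon}^*_\lambda(\mathbf{k})\times\boldsymbol{\epsilon}_{\lambda'}(\mathbf{k}')\, a^*_\lambda(\mathbf{k})\, a_{\lambda'}(\mathbf{k}')\, e^{-i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}} \right]. \tag{6.99}$$
We now need to integrate $\langle \mathbf{E}\times\mathbf{A} \rangle$ over all 3-space, which we shall write as
$$\int \langle \mathbf{E}\times\mathbf{A} \rangle\, d^3r. \tag{6.100}$$
We now make use of the result from the theory of delta functions that
$$\int e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}}\, d^3r = (2\pi)^3\, \delta^3(\mathbf{k}-\mathbf{k}'). \tag{6.101}$$
Therefore, from (6.99) we find
$$\int \langle \mathbf{E}\times\mathbf{A} \rangle\, d^3r = \sum_{\lambda\lambda'} \int \frac{d^3k}{(2\pi)^3}\; i\omega \left[ \boldsymbol{\epsilon}_\lambda(\mathbf{k})\times\boldsymbol{\epsilon}^*_{\lambda'}(\mathbf{k})\, a_\lambda(\mathbf{k})\, a^*_{\lambda'}(\mathbf{k}) - \boldsymbol{\epsilon}^*_\lambda(\mathbf{k})\times\boldsymbol{\epsilon}_{\lambda'}(\mathbf{k})\, a^*_\lambda(\mathbf{k})\, a_{\lambda'}(\mathbf{k}) \right]. \tag{6.102}$$
Finally, we recall that the polarisation vectors $\boldsymbol{\epsilon}_\pm(\mathbf{k})$ span the 2-dimensional space orthogonal to the wavevector $\mathbf{k}$. In terms of the original real basis unit vectors $\mathbf{e}_1(\mathbf{k})$ and $\mathbf{e}_2(\mathbf{k})$ we have
$$\mathbf{e}_1(\mathbf{k})\times\mathbf{e}_2(\mathbf{k}) = \frac{\mathbf{k}}{\omega}, \tag{6.103}$$
and so it follows from (6.84) that
$$\boldsymbol{\epsilon}_+(\mathbf{k})\times\boldsymbol{\epsilon}^*_+(\mathbf{k}) = -\frac{i\,\mathbf{k}}{\omega}, \qquad \boldsymbol{\epsilon}_-(\mathbf{k})\times\boldsymbol{\epsilon}^*_-(\mathbf{k}) = \frac{i\,\mathbf{k}}{\omega}, \qquad \boldsymbol{\epsilon}_\pm(\mathbf{k})\times\boldsymbol{\epsilon}^*_\mp(\mathbf{k}) = 0. \tag{6.104}$$
From this, it follows that (6.102) becomes
$$\int \langle \mathbf{E}\times\mathbf{A} \rangle\, d^3r = 2 \int \frac{d^3k}{(2\pi)^3}\; \mathbf{k}\, \left[ a_+(\mathbf{k})\, a^*_+(\mathbf{k}) - a_-(\mathbf{k})\, a^*_-(\mathbf{k}) \right], \tag{6.105}$$
and so we have
$$\langle \mathbf{L}_{\text{spin}} \rangle = \frac{1}{2\pi} \int \frac{d^3k}{(2\pi)^3}\; \mathbf{k}\, \left( |a_+(\mathbf{k})|^2 - |a_-(\mathbf{k})|^2 \right). \tag{6.106}$$
It can be seen from this result that the modes associated with the coefficients $a_+(\mathbf{k})$ correspond to circularly-polarised waves of positive helicity; i.e. their spin is parallel to the wavevector $\mathbf{k}$. Conversely, the modes with coefficients $a_-(\mathbf{k})$ correspond to circularly-polarised waves of negative helicity; i.e. with spin that is antiparallel to the wavevector $\mathbf{k}$.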
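The algebra of the circular polarisation vectors used in this calculation can be checked numerically. The sketch below assumes a concrete frame (an illustrative choice, not one made in the notes) in which $\mathbf{e}_1$ and $\mathbf{e}_2$ are the $x$ and $y$ unit vectors, so that $\mathbf{k}/\omega$ points along $z$; it builds $\boldsymbol{\epsilon}_\pm$ from (6.84) and evaluates the dot products of (6.85) and the cross products that fix the sign of the spin. The helper names `cross`, `dot`, `conj` and `spin_weight` are hypothetical.

```python
import math


def cross(u, v):
    """Cross product of two 3-component (possibly complex) vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Bilinear dot product (no complex conjugation), as used in (6.85)."""
    return sum(x * y for x, y in zip(u, v))


def conj(u):
    """Component-wise complex conjugate."""
    return tuple(complex(x).conjugate() for x in u)


# Assumed frame: e1 = x-hat, e2 = y-hat, so that k/omega = e1 x e2 = z-hat.
e1 = (1.0, 0.0, 0.0)
e2 = (0.0, 1.0, 0.0)
k_hat = cross(e1, e2)

s = 1.0 / math.sqrt(2.0)
eps_plus = tuple(s * (p + 1j * q) for p, q in zip(e1, e2))
eps_minus = tuple(s * (p - 1j * q) for p, q in zip(e1, e2))

plus_cross = cross(eps_plus, conj(eps_plus))     # expected: -i k/omega
minus_cross = cross(eps_minus, conj(eps_minus))  # expected: +i k/omega
mixed_cross = cross(eps_plus, conj(eps_minus))   # expected: 0


def spin_weight(a_plus, a_minus):
    """Per-mode factor |a_+|^2 - |a_-|^2 that multiplies k in the time-averaged spin."""
    return abs(a_plus) ** 2 - abs(a_minus) ** 2
```

A mode with only $a_+$ non-zero gives a positive `spin_weight` (spin along $\mathbf{k}$), and a mode with only $a_-$ non-zero gives a negative one.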
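In a similar fashion, we may evaluate the energy of the general wave solution as a sum over the individual modes. The total energy $\mathcal{E}$ is given by
$$\mathcal{E} = \frac{1}{8\pi} \int (E^2 + B^2)\, d^3x = \frac{1}{4\pi} \int E^2\, d^3x. \tag{6.107}$$
(Recall that $E^2 = B^2$ here.) Since $\mathbf{E} = -\partial\mathbf{A}/\partial t$ here, we have
$$\langle E^2 \rangle = \sum_{\lambda,\lambda'} \int \frac{d^3k}{(2\pi)^3} \frac{d^3k'}{(2\pi)^3}\; \omega^2 \left[ \boldsymbol{\epsilon}_\lambda(\mathbf{k})\cdot\boldsymbol{\epsilon}^*_{\lambda'}(\mathbf{k}')\, a_\lambda(\mathbf{k})\, a^*_{\lambda'}(\mathbf{k}')\, e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}} + \boldsymbol{\epsilon}^*_\lambda(\mathbf{k})\cdot\boldsymbol{\epsilon}_{\lambda'}(\mathbf{k}')\, a^*_\lambda(\mathbf{k})\, a_{\lambda'}(\mathbf{k}')\, e^{-i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}} \right], \tag{6.108}$$
where again, the time-averaging has picked out only the terms whose total frequency adds to zero. The integration over all space then again gives a three-dimensional delta function $\delta^3(\mathbf{k}-\mathbf{k}')$, and so we find
$$\int \langle E^2 \rangle\, d^3r = \sum_{\lambda,\lambda'} \int \frac{d^3k}{(2\pi)^3}\; \omega^2 \left[ \boldsymbol{\epsilon}_\lambda(\mathbf{k})\cdot\boldsymbol{\epsilon}^*_{\lambda'}(\mathbf{k})\, a_\lambda(\mathbf{k})\, a^*_{\lambda'}(\mathbf{k}) + \boldsymbol{\epsilon}^*_\lambda(\mathbf{k})\cdot\boldsymbol{\epsilon}_{\lambda'}(\mathbf{k})\, a^*_\lambda(\mathbf{k})\, a_{\lambda'}(\mathbf{k}) \right]. \tag{6.109}$$
Finally, using the orthogonality relations (6.85), and the conjugation identity $\boldsymbol{\epsilon}^*_\pm = \boldsymbol{\epsilon}_\mp$, we obtain
$$\langle \mathcal{E} \rangle = \frac{1}{2\pi} \int \frac{d^3k}{(2\pi)^3}\; \omega^2 \left( |a_+(\mathbf{k})|^2 + |a_-(\mathbf{k})|^2 \right). \tag{6.110}$$

From the two results (6.106) and (6.110), we see that for a given mode characterised by helicity $\lambda$ and wavevector $\mathbf{k}$, we have
$$\langle \mathbf{L}_{\text{spin}} \rangle_{\mathbf{k},\lambda} = \frac{1}{2\pi}\, \mathbf{k}\, |a_\lambda(\mathbf{k})|^2\, (\text{sign}\,\lambda), \qquad
\langle \mathcal{E} \rangle_{\mathbf{k},\lambda} = \frac{1}{2\pi}\, \omega^2\, |a_\lambda(\mathbf{k})|^2, \tag{6.111}$$
where $(\text{sign}\,\lambda)$ is $+1$ for $\lambda = +$ and $-1$ for $\lambda = -$. The helicity $\sigma$, which is the component of spin along the direction of the wavevector $\mathbf{k}$, is therefore given by
$$\begin{aligned}
\sigma &= \frac{1}{2\pi}\, |\mathbf{k}|\, |a_\lambda(\mathbf{k})|^2\, (\text{sign}\,\lambda) \\
&= \frac{1}{2\pi}\, \omega\, |a_\lambda(\mathbf{k})|^2\, (\text{sign}\,\lambda) \\
&= \frac{1}{\omega}\, \langle \mathcal{E} \rangle_{\mathbf{k},\lambda}\, (\text{sign}\,\lambda).
\end{aligned} \tag{6.112}$$
In other words, we have that
$$\text{energy} = \pm(\text{helicity})\; \omega, \tag{6.113}$$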
and so we can write
$$\langle \mathcal{E} \rangle = |\sigma|\, \omega. \tag{6.114}$$
This can be compared with the result in quantum mechanics, that
$$E = \hbar\,\omega. \tag{6.115}$$
Planck's constant $\hbar$ has the units of angular momentum, and in fact the basic "unit" of angular momentum for the photon is one unit of $\hbar$. In the transition from classical to quantum physics, the helicity of the electromagnetic field becomes the spin of the photon.

6.6 Gauge invariance and electromagnetic fields

In the previous discussion, we described electromagnetic waves in terms of the gauge potential $A_\mu = (-\phi, \mathbf{A})$, working in the gauge where $\phi = 0$, i.e. $A_0 = 0$. Since the gauge symmetry of Maxwell's equations is
$$A_\mu \longrightarrow A_\mu + \partial_\mu\lambda, \tag{6.116}$$
one might think that all the gauge freedom had been used up when we imposed the condition $\phi = 0$, on the grounds that one arbitrary function (the gauge parameter $\lambda$) has been used in order to set one function (the scalar potential $\phi$) to zero. This is, in fact, not the case. To see this, recall that for the electromagnetic wave we wrote $\mathbf{A}$ as a superposition of terms of the form
$$\mathbf{A} = \mathbf{c}\; e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{6.117}$$
which implied that
$$\mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t} = i\,\omega\,\mathbf{c}\; e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}. \tag{6.118}$$
From this we have
$$\nabla\cdot\mathbf{E} = -\omega\, \mathbf{k}\cdot\mathbf{c}\; e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{6.119}$$
and so the Maxwell equation $\nabla\cdot\mathbf{E} = 0$ implies that $\mathbf{k}\cdot\mathbf{c} = 0$, and hence
$$\mathbf{k}\cdot\mathbf{A} = 0. \tag{6.120}$$
This means that as well as having $A_0 = -\phi = 0$, we also have a component of $\mathbf{A}$ vanishing, namely the projection along $\mathbf{k}$.

To see how this can happen, it is helpful to go back to a Lorentz-covariant gauge choice instead. First, consider the Maxwell field equation, in the absence of source currents:
$$\partial^\mu F_{\mu\nu} = 0. \tag{6.121}$$
Since $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, this implies
$$\partial^\mu\partial_\mu A_\nu - \partial^\mu\partial_\nu A_\mu = 0. \tag{6.122}$$
We now choose the Lorentz gauge condition,
$$\partial^\mu A_\mu = 0. \tag{6.123}$$
The field equation (6.122) then reduces to
$$\partial^\mu\partial_\mu A_\nu = 0, \qquad \text{i.e.} \qquad \Box A_\mu = 0. \tag{6.124}$$
One might again think that all the gauge symmetry had been "used up" in imposing the Lorentz gauge condition (6.123), on the grounds that the arbitrary function $\lambda$ in the gauge transformation
$$A_\mu \longrightarrow A_\mu + \partial_\mu\lambda \tag{6.125}$$
that allowed one to impose (6.123) would no longer allow any freedom to impose further conditions on $A_\mu$. This is not quite true, however.

To see this, let us suppose we are already in Lorentz gauge, and then try performing a further gauge transformation, as in (6.125), insisting that we must remain in the Lorentz gauge. This means that $\lambda$ should satisfy
$$\partial^\mu\partial_\mu\lambda = 0, \qquad \text{i.e.} \qquad \Box\lambda = 0. \tag{6.126}$$
Non-trivial such functions $\lambda$ can of course exist; any solution of the wave equation will work.

To see what this implies, let us begin with a general solution of the wave equation (6.124), working in the Lorentz gauge (6.123). We can decompose this solution as a sum over plane waves, where a typical mode in the sum is
$$A_\mu = a_\mu\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = a_\mu\, e^{i k_\nu x^\nu} = a_\mu\, e^{i k\cdot x}, \tag{6.127}$$
where $a_\mu$ and $k_\nu$ are constant. Substituting into the wave equation (6.124) we find
$$0 = \Box A_\mu = \partial^\sigma\partial_\sigma (a_\mu\, e^{i k_\nu x^\nu}) = -k^\sigma k_\sigma\, a_\mu\, e^{i k_\nu x^\nu}, \tag{6.128}$$
whilst the Lorentz gauge condition (6.123) implies
$$0 = \partial^\mu A_\mu = \partial^\mu (a_\mu\, e^{i k_\nu x^\nu}) = i\, k^\mu a_\mu\, e^{i k_\nu x^\nu}. \tag{6.129}$$
In other words, $k_\mu$ and $a_\mu$ must satisfy
$$k^\mu k_\mu = 0, \qquad k^\mu a_\mu = 0. \tag{6.130}$$
The first of these equations implies that $k^\mu$ is a null vector, as we had seen earlier. The second equation implies that 1 of the 4 independent components that a 4-vector $a_\mu$ generically has is restricted in this case, so that $a_\mu$ has only 3 independent components.

Now we perform the further gauge transformation $A_\mu \to A_\mu + \partial_\mu\lambda$, where, as discussed above, $\Box\lambda = 0$ so that we keep the gauge-transformed $A_\mu$ in Lorentz gauge. Specifically, we shall choose
$$\lambda = i\, h\, e^{i k_\nu x^\nu}, \tag{6.131}$$
where $h$ is a constant. Thus we shall have
$$A_\mu \longrightarrow A_\mu - h\, k_\mu\, e^{i k_\nu x^\nu}. \tag{6.132}$$
With $A_\mu$ given by (6.127) this means we shall have
$$a_\mu\, e^{i k_\nu x^\nu} \longrightarrow a_\mu\, e^{i k_\nu x^\nu} - h\, k_\mu\, e^{i k_\nu x^\nu}, \tag{6.133}$$
which implies
$$a_\mu \longrightarrow a_\mu - h\, k_\mu. \tag{6.134}$$
As a check, we can see that the redefined $a_\mu$ indeed still satisfies $k^\mu a_\mu = 0$, as it should, since $k^\mu$ is a null vector.

The upshot of this discussion is that the freedom to take the constant $h$ to be anything we like allows us to place a second restriction on the components of $a_\mu$. Thus not merely are its ostensible 4 components reduced to 3 by virtue of $k^\mu a_\mu = 0$, but a further component can be eliminated by means of the residual gauge freedom, leaving just 2 independent components in the polarisation vector $a_\mu$. Since the physical degrees of freedom are, by definition, the independent quantities that cannot be changed by making gauge transformations, we see that there are 2 degrees of freedom in the electromagnetic wave, and not 3 as one might naively have supposed.

These 2 physical degrees of freedom can be organised as the $+$ and $-$ helicity states, just as we did in our earlier discussion. These are the circularly-polarised waves rotating anti-clockwise and clockwise, respectively. In other words, these are the states whose spin is either parallel, or anti-parallel, to the direction of propagation. One way of understanding why we have only 2, and not 3, allowed states is that the wave is travelling at the speed of light, and so it is not possible for it to have a helicity that projects other than fully parallel or anti-parallel to its direction of propagation.

We can make contact with the $\phi = 0$ gauge choice that we made in our previous discussion of electromagnetic waves. Starting in Lorentz gauge, we make use of the residual
gauge transformation (6.134) by choosing $h$ so that
$$a_0 - h\, k_0 = 0, \qquad \text{i.e.} \qquad h = -\frac{a_0}{\omega}; \tag{6.135}$$
this means that after performing the residual gauge transformation we shall have
$$a_0 = 0, \tag{6.136}$$
and so, from (6.127), we shall have
$$A_0 = 0, \qquad \text{i.e.} \qquad \phi = 0. \tag{6.137}$$
The original Lorentz gauge condition (6.123) then reduces to
$$\partial_i A_i = 0, \qquad \text{i.e.} \qquad \nabla\cdot\mathbf{A} = 0. \tag{6.138}$$
This implies $\mathbf{k}\cdot\mathbf{A} = 0$, and so we have reproduced precisely the $\phi = 0$, $\mathbf{k}\cdot\mathbf{A} = 0$ gauge conditions that we used previously in our analysis of the general electromagnetic wave solutions. The choice $\phi = 0$ and $\nabla\cdot\mathbf{A} = 0$ is known as Radiation Gauge.

In $D$ spacetime dimensions, the analogous result can easily be seen to be that the electromagnetic wave has $(D - 2)$ degrees of freedom.

6.7 Fourier decomposition of electrostatic fields

We saw earlier in 6.5 that an electromagnetic wave, expressed in the radiation gauge in terms of the 3-vector potential $\mathbf{A}$, could be decomposed into Fourier modes as in (6.86). For each mode $\mathbf{A}_{(\mathbf{k},\lambda)}$ in the sum, we have $\boldsymbol{\epsilon}_\lambda(\mathbf{k})\cdot\mathbf{k} = 0$, and so each mode of the electric field $\mathbf{E}_{(\mathbf{k},\lambda)} = -\partial\mathbf{A}_{(\mathbf{k},\lambda)}/\partial t$ satisfies the transversality condition
$$\mathbf{k}\cdot\mathbf{E}_{(\mathbf{k},\lambda)} = 0. \tag{6.139}$$
By contrast, an electrostatic field $\mathbf{E}$ is longitudinal. Consider, for example, a point charge at the origin, whose potential therefore satisfies
$$\nabla^2\phi = -4\pi e\, \delta^3(\mathbf{r}). \tag{6.140}$$
We can express $\phi(\mathbf{r})$ in terms of its Fourier transform $\Phi(\mathbf{k})$ as
$$\phi(\mathbf{r}) = \int \frac{d^3k}{(2\pi)^3}\; \Phi(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}. \tag{6.141}$$
This is clearly a sum over zero-frequency waves, as one would expect since the fields are static.
It follows from (6.141) that
$$\nabla^2\phi(\mathbf{r}) = -\int \frac{d^3k}{(2\pi)^3}\; k^2\, \Phi(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}. \tag{6.142}$$
We also note that the delta-function in (6.140) can be written as
$$\delta^3(\mathbf{r}) = \int \frac{d^3k}{(2\pi)^3}\; e^{i\mathbf{k}\cdot\mathbf{r}}. \tag{6.143}$$
It follows that if we substitute (6.141) into (6.140) we shall obtain $-k^2\, \Phi(\mathbf{k}) = -4\pi e$, and hence
$$\Phi(\mathbf{k}) = \frac{4\pi e}{k^2}. \tag{6.144}$$
The electric field is given by $\mathbf{E} = -\nabla\phi$, and so
$$\mathbf{E} = -i \int \frac{d^3k}{(2\pi)^3}\; \mathbf{k}\, \Phi(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}. \tag{6.145}$$
If we define $\mathbf{G}(\mathbf{k})$ to be the Fourier transform of $\mathbf{E}$, so that
$$\mathbf{E}(\mathbf{r}) = \int \frac{d^3k}{(2\pi)^3}\; \mathbf{G}(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{6.146}$$
then we see that
$$\mathbf{G}(\mathbf{k}) = -i\, \mathbf{k}\, \Phi(\mathbf{k}) = -\frac{4\pi i\, e}{k^2}\, \mathbf{k}. \tag{6.147}$$
Thus we see that $\mathbf{G}(\mathbf{k})$ is parallel to $\mathbf{k}$, which proves that the electrostatic field is longitudinal.

6.8 Waveguides

For our purposes, we shall define a waveguide to be a hollow, perfectly conducting, cylinder, essentially of infinite length. For convenience we shall take the axis of the cylinder to lie along the $z$ direction. The cross-section of the cylinder, in the $(x, y)$ plane, can for now be arbitrary, but it is the same for all values of $z$. Thus, the cross-section through the cylinder is a closed curve.

We shall consider an electromagnetic wave propagating down the cylinder, with angular frequency $\omega$. It will therefore have $z$ and $t$ dependence of the form
$$e^{i(kz - \omega t)}. \tag{6.148}$$
Note that $k$ and $\omega$ will not in general be equal; i.e., the wave will not propagate at the speed of light.
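The assumed form of the time dependence in (6.148) implies that, for the source-free Maxwell equations (which hold inside the waveguide), we shall have
$$\nabla\cdot\mathbf{E} = 0, \qquad \nabla\times\mathbf{E} = i\omega\,\mathbf{B}, \qquad \nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{B} = -i\omega\,\mathbf{E}. \tag{6.149}$$
Because of the assumed form of the $z$ dependence in (6.148), we may write
$$\mathbf{E}(x, y, z, t) = \mathbf{E}(x, y)\, e^{i(kz - \omega t)}, \qquad \mathbf{B}(x, y, z, t) = \mathbf{B}(x, y)\, e^{i(kz - \omega t)}. \tag{6.150}$$
It is convenient also to define certain transverse quantities, as follows:
$$\nabla_\perp \equiv \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, 0 \right), \qquad \mathbf{E} \equiv (\mathbf{E}_\perp, E_z), \qquad \mathbf{B} \equiv (\mathbf{B}_\perp, B_z). \tag{6.151}$$
From (6.149), the Maxwell equations become
$$\begin{aligned}
\nabla_\perp\cdot\mathbf{E}_\perp &= -i k\, E_z, & \nabla_\perp\cdot\mathbf{B}_\perp &= -i k\, B_z, \\
i k\, \mathbf{E}_\perp + i\omega\, \mathbf{m}\times\mathbf{B}_\perp &= \nabla_\perp E_z, & \mathbf{m}\cdot(\nabla_\perp\times\mathbf{E}_\perp) &= i\omega\, B_z, \\
i k\, \mathbf{B}_\perp - i\omega\, \mathbf{m}\times\mathbf{E}_\perp &= \nabla_\perp B_z, & \mathbf{m}\cdot(\nabla_\perp\times\mathbf{B}_\perp) &= -i\omega\, E_z,
\end{aligned} \tag{6.152}$$
(read row by row, so that the third and fifth equations are the ones for $\nabla_\perp E_z$ and $\nabla_\perp B_z$), where we have defined the unit vector $\mathbf{m}$ along the $z$ direction (the axis of the waveguide):
$$\mathbf{m} = (0, 0, 1). \tag{6.153}$$
Note that the cross product of any pair of transverse vectors, $\mathbf{U}_\perp\times\mathbf{V}_\perp$, lies purely in the $z$ direction, i.e. parallel to $\mathbf{m}$.

6.8.1 TEM modes

There are various types of modes that can be considered. First, we may dispose of an "uninteresting" possibility, called TEM modes. The acronym stands for "transverse electric and magnetic," meaning that
$$E_z = 0, \qquad B_z = 0. \tag{6.154}$$
From the equations in (6.152) for $\mathbf{E}_\perp$, we see that
$$\nabla_\perp\cdot\mathbf{E}_\perp = 0, \qquad \nabla_\perp\times\mathbf{E}_\perp = 0. \tag{6.155}$$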
These are the equations for electrostatics in the 2-dimensional $(x, y)$ plane. The second equation implies we can write $\mathbf{E}_\perp = -\nabla_\perp\phi$, and then the first equation implies that the electrostatic potential $\phi$ satisfies the 2-dimensional Laplace equation
$$\nabla_\perp^2\phi = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = 0. \tag{6.156}$$
Since the cross-section of the waveguide in the $(x, y)$ plane is a closed curve, at a fixed potential (since it is a conductor), we can deduce that $\phi$ is constant everywhere inside the waveguide:
$$0 = \int dx\,dy\; \phi\, \nabla_\perp^2\phi = -\int dx\,dy\; |\nabla_\perp\phi|^2, \tag{6.157}$$
which implies $\nabla_\perp\phi = 0$ inside the waveguide, and hence $\phi = \text{constant}$ and so $\mathbf{E} = 0$. Similar considerations imply $\mathbf{B} = 0$ for the TEM mode also.$^{12}$

6.8.2 TE and TM modes

In order to have non-trivial modes propagating in the waveguide, we must relax the TEM assumption. There are two basic types of non-trivial modes we may consider, where either $\mathbf{E}$ or $\mathbf{B}$ (but not both) are taken to be transverse. These are called TE modes and TM modes respectively.

To analyse these modes, we first need to consider the boundary conditions at the conducting surface of the cylinder. The component of $\mathbf{E}$ parallel to the surface must vanish (seen by integrating $\mathbf{E}$ around a loop comprising a line segment just inside the waveguide, and closed by a line segment just inside the conductor, where $\mathbf{E} = 0$ by definition). Then, if we define $\mathbf{n}$ to be the unit normal vector at the surface, we may say that $\mathbf{n}\times\mathbf{E} = 0$. Next, taking the scalar product of $\mathbf{n}$ with the $\nabla\times\mathbf{E} = i\omega\,\mathbf{B}$ Maxwell equation, we get
$$i\omega\, \mathbf{n}\cdot\mathbf{B} = \mathbf{n}\cdot(\nabla\times\mathbf{E}) = -\nabla\cdot(\mathbf{n}\times\mathbf{E}) = 0. \tag{6.158}$$
Thus, we have
$$\mathbf{n}\times\mathbf{E} = 0, \qquad \mathbf{n}\cdot\mathbf{B} = 0 \tag{6.159}$$
on the surface of the waveguide. We may restate these boundary conditions as
$$E_z\Big|_S = 0, \qquad \mathbf{n}\cdot\mathbf{B}_\perp\Big|_S = 0, \tag{6.160}$$
where $S$ denotes the surface of the cylindrical waveguide.

$^{12}$ If the waveguide were replaced by coaxial conducting cylinders then TEM modes could exist in the gap between the inner and outer cylinder, since the potentials on the two cylinders need not be equal.
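The two boundary conditions above imply also that
$$\mathbf{n}\cdot\nabla_\perp B_z\Big|_S = 0. \tag{6.161}$$
This follows by taking the scalar product of $\mathbf{n}$ with the penultimate equation in (6.152):
$$\mathbf{n}\cdot\nabla_\perp B_z = i k\, \mathbf{n}\cdot\mathbf{B}_\perp - i\omega\, \mathbf{n}\cdot(\mathbf{m}\times\mathbf{E}_\perp) = i k\, \mathbf{n}\cdot\mathbf{B}_\perp + i\omega\, \mathbf{m}\cdot(\mathbf{n}\times\mathbf{E}_\perp), \tag{6.162}$$
and then restricting to the surface $S$ of the cylinder. The condition (6.161) may be rewritten as
$$\frac{\partial B_z}{\partial n}\Big|_S = 0, \tag{6.163}$$
where $\partial/\partial n \equiv \mathbf{n}\cdot\nabla$ is the normal derivative.

With the assumption (6.148), the wave equations for $\mathbf{E}$ and $\mathbf{B}$ become
$$\nabla_\perp^2\mathbf{E} + (\omega^2 - k^2)\,\mathbf{E} = 0, \qquad \nabla_\perp^2\mathbf{B} + (\omega^2 - k^2)\,\mathbf{B} = 0, \tag{6.164}$$
where $\nabla_\perp^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2$ is the 2-dimensional Laplacian. The third and fifth equations in (6.152) become, in terms of components,
$$\begin{aligned}
i k\, E_x - i\omega\, B_y &= \partial_x E_z, & i k\, E_y + i\omega\, B_x &= \partial_y E_z, \\
i k\, B_x + i\omega\, E_y &= \partial_x B_z, & i k\, B_y - i\omega\, E_x &= \partial_y B_z.
\end{aligned} \tag{6.165}$$
These can be solved for $E_x$, $E_y$, $B_x$ and $B_y$ in terms of $E_z$ and $B_z$, giving
$$\begin{aligned}
E_x &= \frac{i}{\omega^2 - k^2}\,(\omega\, \partial_y B_z + k\, \partial_x E_z), \\
E_y &= \frac{i}{\omega^2 - k^2}\,(-\omega\, \partial_x B_z + k\, \partial_y E_z), \\
B_x &= \frac{i}{\omega^2 - k^2}\,(-\omega\, \partial_y E_z + k\, \partial_x B_z), \\
B_y &= \frac{i}{\omega^2 - k^2}\,(\omega\, \partial_x E_z + k\, \partial_y B_z).
\end{aligned} \tag{6.166}$$
This means that we can concentrate on solving for $E_z$ and $B_z$; after having done so, substitution into (6.166) gives the expressions for $E_x$, $E_y$, $B_x$ and $B_y$.

As mentioned earlier, we can now distinguish two different categories of wave solution in the waveguide. These are
$$\text{TE waves}: \qquad E_z = 0, \qquad \frac{\partial B_z}{\partial n}\Big|_S = 0, \qquad
\mathbf{B}_\perp = \frac{i k}{\omega^2 - k^2}\, \nabla_\perp B_z, \qquad \mathbf{E}_\perp = -\frac{\omega}{k}\, \mathbf{m}\times\mathbf{B}_\perp, \tag{6.167}$$
and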
$$\text{TM waves}: \qquad B_z = 0, \qquad E_z\Big|_S = 0, \qquad
\mathbf{E}_\perp = \frac{i k}{\omega^2 - k^2}\, \nabla_\perp E_z, \qquad \mathbf{B}_\perp = \frac{\omega}{k}\, \mathbf{m}\times\mathbf{E}_\perp. \tag{6.168}$$
Note that the vanishing of $E_z$ or $B_z$ in the two cases means that this field component vanishes everywhere inside the waveguide, and not just on the cylindrical conductor. Note also that the second condition in each case is just the residual content of the boundary conditions in (6.160) and (6.161), after having imposed the transversality condition $E_z = 0$ or $B_z = 0$ respectively. The second line in each of the TE and TM cases gives the results from (6.166), written now in a slightly more compact way. In each case, the basic wave solution is given by solving the 2-dimensional Helmholtz equation
$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \Omega^2\psi = 0, \tag{6.169}$$
where
$$\Omega^2 \equiv \omega^2 - k^2, \tag{6.170}$$
and $\psi$ is equal to $B_z$ or $E_z$ in the case of TE or TM waves respectively. We also have the boundary conditions:
$$\text{TE waves}: \qquad \frac{\partial\psi}{\partial n}\Big|_S = 0, \tag{6.171}$$
$$\text{TM waves}: \qquad \psi\Big|_S = 0. \tag{6.172}$$
Equation (6.169), together with the boundary condition (6.171) or (6.172), defines an eigenfunction/eigenvalue problem. Since the cross-section of the waveguide is a closed loop in the $(x, y)$ plane, the equation (6.169) is to be solved in a compact closed region, and so the eigenvalue spectrum for $\Omega^2$ will be discrete; there will be a semi-infinite number of eigenvalues, unbounded above, discretely separated from each other.

Consider, as an example, TM waves propagating down a waveguide with rectangular cross-section:
$$0 \le x \le a, \qquad 0 \le y \le b. \tag{6.173}$$
For TM waves, we must satisfy the boundary condition that $\psi$ vanishes on the edges of the rectangle. It follows from an elementary calculation, in which one separates variables in (6.169) by writing $\psi(x, y) = X(x)\,Y(y)$, that the eigenfunctions and eigenvalues, labelled
by integers $(m, n)$, are given by$^{13}$
$$\psi_{mn} = e_{mn}\, \sin\frac{m\pi x}{a}\, \sin\frac{n\pi y}{b}, \qquad \Omega_{mn}^2 = \frac{m^2\pi^2}{a^2} + \frac{n^2\pi^2}{b^2}. \tag{6.174}$$
The wave-number $k$ and the angular frequency $\omega$ for the $(m, n)$ mode are then related by
$$k^2 = \omega^2 - \Omega_{mn}^2. \tag{6.175}$$
Notice that this means there is a minimum frequency $\omega_{\min} = \Omega_{mn}$ at which a wave can propagate down the waveguide in the $(m, n)$ mode. If one tried to transmit a lower-frequency wave in this mode, it would have imaginary wave-number, and so from (6.150) it would die off exponentially with $z$. This is called an evanescent wave.

The absolute lowest bound on the angular frequency that can propagate down the waveguide is clearly given by $\Omega_{1,1}$. In other words, the lowest angular frequency of TM wave that can propagate down the rectangular waveguide is given by
$$\omega_{\min} = \pi \sqrt{\frac{1}{a^2} + \frac{1}{b^2}}. \tag{6.176}$$
In view of the relation (6.170) between the angular frequency and the wave-number, we see that the phase velocity $v_{\text{ph}}$ and the group velocity $v_{\text{gr}}$ are given by
$$v_{\text{ph}} = \frac{\omega}{k} = \left( 1 - \frac{\Omega^2}{\omega^2} \right)^{-1/2}, \qquad
v_{\text{gr}} = \frac{d\omega}{dk} = \left( 1 - \frac{\Omega^2}{\omega^2} \right)^{1/2}. \tag{6.177}$$
Note that because of the particular form of the dispersion relation, i.e. the equation (6.170) relating $\omega$ to $k$, it is the case here that
$$v_{\text{ph}}\, v_{\text{gr}} = 1. \tag{6.178}$$
We see that while the group velocity satisfies
$$v_{\text{gr}} \le 1, \tag{6.179}$$
the phase velocity satisfies
$$v_{\text{ph}} \ge 1. \tag{6.180}$$

$^{13}$ If we were instead solving for TE modes, we would have the boundary condition $\partial\psi/\partial n = 0$ on the edges of the rectangle, rather than $\psi = 0$ on the edges. This would give different eigenfunctions, involving cosines rather than sines.
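There is nothing wrong with this, even though it means the phase velocity exceeds the speed of light, since nothing material, and no signal, is transferred faster than the speed of light. In fact, as we shall now verify, energy and information travel at the group velocity $v_{\text{gr}}$, which is always less than or equal to the speed of light.

Note that the group velocity approaches the speed of light (from below) as $\omega$ goes to infinity. To be more precise, the group velocity approaches the speed of light as $\omega$ becomes large compared to the eigenvalue $\Omega$ associated with the mode of propagation under discussion. An example where this limit is (easily) approached is if you look through a length of metal drainpipe. Electromagnetic waves in the visible spectrum have a frequency vastly greater than the lowest TM or TE modes of the drainpipe, and they propagate through the pipe as if it wasn't there. The story would be different if one tried to channel waves from a microwave down the drainpipe.

Let us now investigate the flow of energy down the waveguide. This is obtained by working out the time average of the Poynting flux,
$$\langle \mathbf{S} \rangle = \frac{1}{8\pi}\, \Re\,(\mathbf{E}\times\mathbf{B}^*). \tag{6.181}$$
Note that here the fields $\mathbf{E}$ and $\mathbf{B}$ are taken to be complex, and we are using the result discussed earlier about taking time averages of quadratic products of the physical $\mathbf{E}$ and $\mathbf{B}$ fields.

If we consider TM modes, then we shall have
$$\mathbf{E}_\perp = \frac{i k}{\Omega^2}\, \nabla\psi, \qquad E_z = \psi, \qquad
\mathbf{B}_\perp = \frac{\omega}{k}\, \mathbf{m}\times\mathbf{E}_\perp = \frac{i\omega}{\Omega^2}\, \mathbf{m}\times\nabla\psi, \qquad B_z = 0. \tag{6.182}$$
(Recall that $\mathbf{m} = (0, 0, 1)$.) Note that the expressions for $\mathbf{E}$ and $\mathbf{B}$ can be condensed down to
$$\mathbf{E} = \frac{i k}{\Omega^2}\, \nabla\psi + \mathbf{m}\,\psi, \qquad \mathbf{B} = \frac{i\omega}{\Omega^2}\, \mathbf{m}\times\nabla\psi. \tag{6.183}$$
We therefore have
$$\mathbf{E}\times\mathbf{B}^* = \left( \frac{i k}{\Omega^2}\, \nabla\psi + \mathbf{m}\,\psi \right) \times \left( -\frac{i\omega}{\Omega^2}\, \mathbf{m}\times\nabla\psi^* \right). \tag{6.184}$$
Using the vector identity $\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\cdot\mathbf{C})\,\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\,\mathbf{C}$, we then find
$$\mathbf{E}\times\mathbf{B}^* = \frac{\omega k}{\Omega^4}\, (\nabla\psi\cdot\nabla\psi^*)\, \mathbf{m} + \frac{i\omega}{\Omega^2}\, \psi\, \nabla\psi^*, \tag{6.185}$$
since $\mathbf{m}\cdot\nabla\psi = 0$. Along the $z$ direction (i.e. along $\mathbf{m}$), we therefore have
$$\langle S \rangle_z = \frac{\omega k}{8\pi\Omega^4}\, (\nabla\psi\cdot\nabla\psi^*) = \frac{\omega k}{8\pi\Omega^4}\, |\nabla\psi|^2. \tag{6.186}$$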
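(The second term in (6.185) describes the circulation of energy within the cross-sectional plane of the waveguide.) The total transmitted power $P$ is obtained by integrating $\langle S \rangle_z$ over the cross-sectional area $\Sigma$ of the waveguide. This gives
$$\begin{aligned}
P = \int_\Sigma dx\,dy\; \langle S \rangle_z &= \frac{\omega k}{8\pi\Omega^4} \int_\Sigma dx\,dy\; \nabla\psi^*\cdot\nabla\psi \\
&= \frac{\omega k}{8\pi\Omega^4} \int_\Sigma dx\,dy\, \left[ \nabla\cdot(\psi^*\nabla\psi) - \psi^*\nabla^2\psi \right] \\
&= \frac{\omega k}{8\pi\Omega^4} \oint_C \psi^*\, \frac{\partial\psi}{\partial n}\, d\ell - \frac{\omega k}{8\pi\Omega^4} \int_\Sigma dx\,dy\; \psi^*\nabla^2\psi \\
&= -\frac{\omega k}{8\pi\Omega^4} \int_\Sigma dx\,dy\; \psi^*\nabla^2\psi = \frac{\omega k}{8\pi\Omega^2} \int_\Sigma dx\,dy\; \psi^*\psi,
\end{aligned} \tag{6.187}$$
and so we have
$$P = \frac{\omega k}{8\pi\Omega^2} \int_\Sigma dx\,dy\; |\psi|^2. \tag{6.188}$$
Note that in (6.187), the boundary term over the closed loop $C$ that forms the boundary of the waveguide in the $(x, y)$ plane gives zero because $\psi$ vanishes everywhere on the cylinder. The remaining term was then simplified by using (6.169).

We may also work out the total energy per unit length of the waveguide. The total time-averaged energy density is given by
$$\begin{aligned}
\langle W \rangle = \frac{1}{8\pi}\, \mathbf{E}\cdot\mathbf{E}^* &= \frac{1}{8\pi} \left( \frac{i k}{\Omega^2}\, \nabla\psi + \mathbf{m}\,\psi \right) \cdot \left( -\frac{i k}{\Omega^2}\, \nabla\psi^* + \mathbf{m}\,\psi^* \right) \\
&= \frac{k^2}{8\pi\Omega^4}\, \nabla\psi^*\cdot\nabla\psi + \frac{1}{8\pi}\, \psi\psi^*.
\end{aligned} \tag{6.189}$$
The energy per unit length $U$ is then obtained by integrating $\langle W \rangle$ over the cross-sectional area, which gives
$$\begin{aligned}
U = \int_\Sigma dx\,dy\; \langle W \rangle &= \frac{k^2}{8\pi\Omega^4} \int_\Sigma dx\,dy\; \nabla\psi^*\cdot\nabla\psi + \frac{1}{8\pi} \int_\Sigma dx\,dy\; |\psi|^2 \\
&= \frac{k^2}{8\pi\Omega^2} \int_\Sigma dx\,dy\; |\psi|^2 + \frac{1}{8\pi} \int_\Sigma dx\,dy\; |\psi|^2,
\end{aligned} \tag{6.190}$$
where we have again integrated by parts in the first term, dropped the boundary term because $\psi$ vanishes on the cylinder, and used (6.169) to simplify the result. Thus we find
$$U = \frac{\omega^2}{8\pi\Omega^2} \int_\Sigma dx\,dy\; |\psi|^2. \tag{6.191}$$
Having obtained the expression (6.188) for the power $P$ passing through the waveguide, and the expression (6.191) for the energy per unit length in the waveguide, we may note that
$$P = \frac{k}{\omega}\, U = \frac{1}{v_{\text{ph}}}\, U = v_{\text{gr}}\, U. \tag{6.192}$$
This demonstrates that the energy flows down the waveguide at the group velocity $v_{\text{gr}}$.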
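The results of this section can be illustrated numerically for the rectangular TM modes. The sketch below, in units with $c = 1$, computes $\Omega_{mn}$ from (6.174), the phase and group velocities from the dispersion relation (6.170), and then estimates the cross-sectional integrals of $|\psi|^2$ and $|\nabla\psi|^2$ by a midpoint rule so that the power and the energy per unit length can be compared. The function names, the dimensions $a = 2$, $b = 1$, the choice $\omega = 2\,\Omega_{11}$ and the normalisation $e_{mn} = 1$ are all illustrative assumptions, not values taken from the notes.

```python
import math


def tm_eigenvalue(m, n, a, b):
    """Omega_mn of (6.174) for a rectangular cross-section a x b (c = 1)."""
    return math.pi * math.sqrt((m / a) ** 2 + (n / b) ** 2)


def velocities(omega, Omega):
    """Return (v_ph, v_gr) using k^2 = omega^2 - Omega^2; needs omega > Omega."""
    if omega <= Omega:
        raise ValueError("below cutoff: the mode is evanescent")
    k = math.sqrt(omega ** 2 - Omega ** 2)
    return omega / k, k / omega


def mode_integrals(m, n, a, b, steps=200):
    """Midpoint-rule integrals of |psi|^2 and |grad psi|^2 over the cross-section (e_mn = 1)."""
    hx, hy = a / steps, b / steps
    kx, ky = m * math.pi / a, n * math.pi / b
    psi2 = 0.0
    grad2 = 0.0
    for i in range(steps):
        x = (i + 0.5) * hx
        sx, cx = math.sin(kx * x), math.cos(kx * x)
        for j in range(steps):
            y = (j + 0.5) * hy
            sy, cy = math.sin(ky * y), math.cos(ky * y)
            psi2 += (sx * sy) ** 2
            grad2 += (kx * cx * sy) ** 2 + (ky * sx * cy) ** 2
    return psi2 * hx * hy, grad2 * hx * hy


a, b = 2.0, 1.0                      # illustrative dimensions
Omega = tm_eigenvalue(1, 1, a, b)    # lowest TM eigenvalue
omega = 2.0 * Omega                  # illustrative frequency above cutoff
v_ph, v_gr = velocities(omega, Omega)
k = omega * v_gr

psi2, grad2 = mode_integrals(1, 1, a, b)

# Power and energy per unit length before the integration by parts,
# i.e. the first lines of (6.187) and (6.190).
P = omega * k / (8 * math.pi * Omega ** 4) * grad2
U = k ** 2 / (8 * math.pi * Omega ** 4) * grad2 + psi2 / (8 * math.pi)
```

Under these assumptions `grad2` should agree with `Omega**2 * psi2`, which is the content of the integration by parts, and the ratio `P / U` should agree with `v_gr`.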
6.9 Resonant cavities

A resonant cavity is a hollow, closed conducting "container," inside which is an electromagnetic field. A simple example would be to take a length of waveguide of the sort we have considered in section 6.8, and turn it into a closed cavity by attaching conducting plates at each end of the cylinder. Let us suppose that the length of the cavity is $d$.

Consider, as an example, TM modes in the cavity. We solve the same 2-dimensional Helmholtz equation (6.169) as before,
$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \Omega^2\psi = 0, \tag{6.193}$$
subject again to the TM boundary condition that $\psi$ must vanish on the surface of the cylinder. The $\mathbf{E}$ and $\mathbf{B}$ fields are given, as before, by
$$\mathbf{E}_\perp = \frac{i k}{\Omega^2}\, e^{i(kz - \omega t)}\, \nabla\psi, \qquad E_z = \psi\, e^{i(kz - \omega t)}, \qquad \mathbf{B}_\perp = \frac{\omega}{k}\, \mathbf{m}\times\mathbf{E}_\perp, \tag{6.194}$$
where $\mathbf{m} = (0, 0, 1)$. Now, however, we have the additional boundary conditions that $\mathbf{E}_\perp$ must vanish on the two conducting plates, which we shall take to be at $z = 0$ and $z = d$. This is because the component of $\mathbf{E}$ parallel to a conductor must vanish at the conducting surface.

In order to arrange that $\mathbf{E}_\perp$ vanish, for all $t$, at $z = 0$ and $z = d$, it must be that there is a superposition of right-moving and left-moving waves. (These correspond to $z$ and $t$ dependences $e^{i(\pm kz - \omega t)}$ respectively.) Thus we need to take the combination that makes a standing wave,
$$\mathbf{E}_\perp = -\frac{k}{\Omega^2}\, \sin kz\; e^{-i\omega t}\, \nabla\psi, \tag{6.195}$$
in order to have $\mathbf{E}_\perp = 0$ at $z = 0$. Furthermore, in order to have also that $\mathbf{E}_\perp = 0$ at $z = d$, it must be that the wave-number $k$ is now quantised, according to
$$k = \frac{p\pi}{d}, \tag{6.196}$$
where $p$ is an integer. Note that we also have
$$E_z = \psi\, \cos kz\; e^{-i\omega t}. \tag{6.197}$$
Recall that in the waveguide, we had already found that $\Omega^2 \equiv \omega^2 - k^2$ was quantised, being restricted to a semi-infinite discrete set of eigenvalues for the 2-dimensional Helmholtz
equation. In the waveguide, that still allowed $k$ and $\omega$ to take continuous values, subject to the constraint (dispersion relation)
$$\omega^2 = \Omega^2 + k^2. \tag{6.198}$$
In the resonant cavity we now have the further restriction that $k$ is quantised, according to (6.196). This means that the spectrum of allowed frequencies $\omega$ is now discrete, and given by
$$\omega^2 = \Omega^2 + \frac{p^2\pi^2}{d^2}. \tag{6.199}$$
If, for example, we consider the previous example of TM modes in a rectangular waveguide whose cross-section has sides of lengths $a$ and $b$, but now with the added end-caps at $z = 0$ and $z = d$, then $\Omega^2$ is given by (6.174), and so the resonant frequencies in the cavity are given by
$$\omega^2 = \pi^2 \left( \frac{m^2}{a^2} + \frac{n^2}{b^2} + \frac{p^2}{d^2} \right), \tag{6.200}$$
for positive integers $(m, n, p)$.

7 Fields Due to Moving Charges

7.1 Retarded potentials

If we solve the Bianchi identity by writing $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, the remaining Maxwell equation (i.e. the field equation)
$$\partial_\mu F^{\mu\nu} = -4\pi J^\nu \tag{7.1}$$
becomes
$$\partial_\mu\partial^\mu A^\nu - \partial_\mu\partial^\nu A^\mu = -4\pi J^\nu. \tag{7.2}$$
If we choose to work in the Lorentz gauge,
$$\partial_\mu A^\mu = 0, \tag{7.3}$$
then (7.2) becomes simply
$$\Box A^\mu = -4\pi J^\mu. \tag{7.4}$$
Since $A^\mu = (\phi, \mathbf{A})$ and $J^\mu = (\rho, \mathbf{J})$, this means we shall have
$$\Box\phi = -4\pi\rho, \qquad \Box\mathbf{A} = -4\pi\mathbf{J}, \tag{7.5}$$
or, in the three-dimensional language,
$$\nabla^2\phi - \frac{\partial^2\phi}{\partial t^2} = -4\pi\rho, \qquad \nabla^2\mathbf{A} - \frac{\partial^2\mathbf{A}}{\partial t^2} = -4\pi\mathbf{J}. \tag{7.6}$$
In general, we can write the solutions to (7.6) as the sums of a particular integral of the inhomogeneous equation (i.e. the one with the source term on the right-hand side) plus the general solution of the homogeneous equation (the one with the right-hand side set to zero). Our interest now will be in finding the particular integral. Solving this problem in the case of static sources and fields will be very familiar from electrostatics and magnetostatics. Now, however, we wish to solve for the particular integral in the case where there is time dependence too. Consider the equation for $\phi$ first.

First consider the situation where there is just an infinitesimal amount of charge $de(t)$ in an infinitesimal volume. (We allow for it to be time dependent, in general.) Thus the charge density is

$$\rho = de(t)\, \delta^3(\vec R)\,, \tag{7.7}$$

where $\vec R$ is the position vector from the origin to the location of the infinitesimal charge. We therefore wish to solve

$$\nabla^2\phi - \frac{\partial^2\phi}{\partial t^2} = -4\pi\, de(t)\, \delta^3(\vec R)\,. \tag{7.8}$$

When $\vec R \ne 0$, we have simply $\nabla^2\phi - \partial^2\phi/\partial t^2 = 0$.

Clearly, $\phi$ depends on $\vec R$ only through its magnitude $R \equiv |\vec R|$, and so $\phi = \phi(t,R)$. Now, with $\vec R = (x_1,x_2,x_3)$, we have $R^2 = x_i x_i$ and so $\partial_i R = x_i/R$. Consequently, we shall have

$$\partial_i\phi = \frac{x_i}{R}\,\phi'\,, \tag{7.9}$$

where $\phi' \equiv \partial\phi/\partial R$, and then

$$\nabla^2\phi = \partial_i\partial_i\phi = \phi'' + \frac{2}{R}\,\phi'\,. \tag{7.10}$$

Letting $\Phi = R\,\phi$, we have

$$\phi' = \frac{1}{R}\,\Phi' - \frac{1}{R^2}\,\Phi\,, \qquad \phi'' = \frac{1}{R}\,\Phi'' - \frac{2}{R^2}\,\Phi' + \frac{2}{R^3}\,\Phi\,. \tag{7.11}$$

This means that for $R \ne 0$, we shall have

$$\frac{\partial^2\Phi}{\partial R^2} - \frac{\partial^2\Phi}{\partial t^2} = 0\,. \tag{7.12}$$

The general solution to this equation is

$$\Phi(t,R) = f_1(t-R) + f_2(t+R)\,, \tag{7.13}$$

where $f_1$ and $f_2$ are arbitrary functions.
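As a quick numerical sanity check of (7.13), the sketch below evaluates the source-free wave operator on $\phi = f(t \mp R)/R$ by central differences, using the radial Laplacian of (7.10). The Gaussian profile `pulse`, the function names and the step size `h` are illustrative assumptions and are not taken from the notes; units are those of the text ($c = 1$).

```python
import math

def pulse(u):
    """Illustrative smooth profile f(u); any twice-differentiable function would do."""
    return math.exp(-u * u)

def phi(t, R, sign=-1.0):
    """phi = f(t - R)/R for sign = -1 (outgoing), f(t + R)/R for sign = +1 (incoming)."""
    return pulse(t + sign * R) / R

def wave_residual(t, R, sign=-1.0, h=1e-3):
    """Central-difference estimate of phi'' + (2/R) phi' - d^2 phi/dt^2 at R > 0."""
    d2R = (phi(t, R + h, sign) - 2.0 * phi(t, R, sign) + phi(t, R - h, sign)) / h**2
    d1R = (phi(t, R + h, sign) - phi(t, R - h, sign)) / (2.0 * h)
    d2t = (phi(t + h, R, sign) - 2.0 * phi(t, R, sign) + phi(t - h, R, sign)) / h**2
    return d2R + (2.0 / R) * d1R - d2t

if __name__ == "__main__":
    # Both residuals should be small compared with the individual terms (which are O(1)).
    print(wave_residual(1.0, 1.5, sign=-1.0))
    print(wave_residual(1.0, 1.5, sign=+1.0))
```

The residual is limited by the finite-difference truncation error, so it is small but not exactly zero; the check only applies away from $R = 0$, where the source term is absent.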
The solution with $f_1$ is called the retarded solution, and the solution with $f_2$ is called the advanced solution. The reason for this terminology is that in the retarded solution, the “effect” occurs after the “cause,” in the sense that the profile of the function $f_1$ propagates outwards from the origin where the charge $de(t)$ is located. By contrast, in the advanced solution the effect precedes the cause; the disturbance propagates inwards as time increases. The advanced solution is acausal, and therefore unphysical, and so we shall keep only the causal solution, i.e. the retarded solution. The upshot is that for $R \ne 0$, the solution is

$$\phi = \frac{1}{R}\,\Phi(t-R)\,. \tag{7.14}$$

We clearly expect that $\phi$ will go to infinity as $R$ approaches zero, since the charge (albeit infinitesimal) is located there. Consequently, it will be the case that the derivatives $\partial/\partial R$ will dominate over the time derivatives $\partial/\partial t$ near to $R=0$, and so in that region we can write

$$\nabla^2\phi \approx -4\pi\, de(t)\, \delta^3(\vec R)\,. \tag{7.15}$$

This therefore has the usual solution that is familiar from electrostatics, namely

$$\phi \approx \frac{de(t)}{R}\,, \tag{7.16}$$

or, in other words,

$$\Phi \approx de(t) \tag{7.17}$$

near $R=0$. Since $\Phi$ is already established to depend on $t$ and $R$ only through $\Phi = \Phi(t-R)$, we can therefore immediately write down the solution valid for all $R$, namely

$$\Phi(t-R) = de(t-R)\,. \tag{7.18}$$

From (7.14), we therefore have that

$$\phi(\vec R,t) = \frac{de(t-R)}{R}\,. \tag{7.19}$$

This solution is valid for the particular case of an infinitesimal charge $de(t)$ located at $R=0$. For a general time-dependent charge distribution $\rho(\vec r,t)$, we just exploit the linearity of the Maxwell equations and sum up the contributions from all the charges in the distribution. This therefore gives

$$\phi(\vec r,t) = \int \frac{\rho(\vec r\,', t-R)}{R}\, d^3\vec r\,'\,, \tag{7.20}$$

where $R \equiv |\vec r - \vec r\,'|$. This solution of the inhomogeneous equation is the one that is “forced” by the source term, in the sense that it vanishes if the source charge density $\rho$ vanishes.
e.3. r − r  (7. We already considered a special case of this in section 5. the potentials at the current time t would be inﬂuenced by what the charge and current densities will be in the future. moving at constant velocity).20) can be written as φ(r. in that the potentials at the present time t depend upon the charge and current densities at times ≤ t. we could work out the electromagnetic ﬁelds by using the trick of transforming to the Lorentz frame in which the particle was at rest. t) = J(r .The general solution is given by this particular integral plus an arbitrary solution of the homogeneous equation φ = 0. t) that we have obtained here are called the Retarded Potentials. where its velocity is not uniform. we are going to study the more general case where the particle can be accelerating.21) In an identical fashion. and then transforming back to the frame where the particle was in uniform motion.21) and (7. This means that there does not exist an inertial frame in which the particle is at rest for all time. This would be unphysical. One ﬁnds that they do indeed yield exact solutions of the equations. Now. We leave this as an exercise for the reader. t) and A(r. The solution (7. Since the procedure by which we arrived at the retarded potential solutions(7. . t − r − r ) 3 d r . This can be done straightforwardly. simply by substituting them into the original wave equations (7. 7.6). r − r  (7. since it would violate causality. we can see that the solution for the 3vector potential A in the presence of a 3vector current source J (r. In the advanced potentials.22) The solutions for φ(r. doing the very simple calculation of the ﬁelds in that frame. and so we cannot use the previous trick. In that case. t) = ρ(r . 
The analogous “advanced potentials” would correspond to having t + r − r  instead of t − r − r  as the time argument of the charge and current densities inside the integrals.2 LienardWiechert potentials We now turn to a discussion of the electromagnetic ﬁelds produced by a point charge e moving along an arbitrary path r = r 0 (t). where we worked out the ﬁelds produced by a charge in uniform motion (i. i. t) will be A(r. It is clear that the retarded potentials are the physically sensible ones. by contrast. 109 . t − r − r ) 3 d r .” it is perhaps worthwhile to go back and check that they are indeed correct.22) may have seemed slightly “unrigorous.e.
It is worth emphasising that even though the particle is accelerating, this does not mean that we cannot solve the problem using special relativity. The point is that we shall only ever study the fields from the viewpoint of an observer who is in an inertial frame, and so for this observer, the laws of special relativity apply. Only if we wanted to study the problem from the viewpoint of an observer in an accelerating frame, such as the rest-frame of the particle, would we need to use the laws of general relativity. Note that although we cannot use special relativity to study the problem in the rest frame of the accelerating particle, we can, and sometimes will, make use of an instantaneous rest frame. This is an inertial frame whose velocity just happens to match exactly the velocity of the particle at a particular instant of time. Since the particle is accelerating, then a moment later the particle will no longer be at rest in this frame. We could, if we wished, then choose an “updated” instantaneous rest frame, and use special relativity to study the problem (for an instant) in the new inertial frame. We shall find it expedient at times to make use of the concept of an instantaneous rest frame, in order to simplify intermediate calculations. Ultimately, of course, we do not want to restrict ourselves to having to hop onto a new instantaneous rest frame every time we discuss the problem, and so the goal is to obtain results that are valid in any inertial frame.

Now, on with the problem. We can expect, on grounds of causality, that the electromagnetic fields at $(\vec r,t)$ will be determined by the position and state of motion of the particle at earlier times $t'$, as measured in the chosen inertial frame, for which the time of propagation of information from $\vec r_0(t')$, where the particle was at time $t'$, to $\vec r$ at the time $t$ is $t-t'$. It is useful therefore to define

$$\vec R(t') \equiv \vec r - \vec r_0(t')\,. \tag{7.23}$$

This is the radius vector from the location $\vec r_0(t')$ of the charge at the time $t'$ to the observation point $\vec r$. The time $t'$ is then determined by

$$t - t' = R(t')\,, \qquad \hbox{where}\quad R(t') = |\vec R(t')|\,. \tag{7.24}$$

There is one solution for $t'$, for each choice of $t$.

In the Lorentz frame where the particle is at rest at $t'$, the potential at time $t$ will be given by

$$\phi = \frac{e}{R(t')}\,, \qquad \vec A = 0\,. \tag{7.25}$$

We can determine the 4-vector potential $A^\mu$ in an arbitrary Lorentz frame simply by inventing a 4-vector expression that reduces to (7.25) under the specialisation that the velocity $\vec v \equiv d\vec r_0(t')/dt'$ of the charge is zero at time $t'$.
Let the 4-velocity of the charge, in the observer's inertial frame, be $U^\mu$. If the charge is at rest, its 4-velocity will be

$$U^\mu = (1, \vec 0)\,. \tag{7.26}$$

Thus to write a 4-vector expression for $A^\mu = (\phi, \vec A)$ that reduces to (7.25) if $U^\mu$ is given by (7.26), we just have to find a scalar $f$ such that

$$A^\mu = f\, U^\mu\,, \tag{7.27}$$

with $f$ becoming $e/R(t')$ in the special case. Let us define the 4-vector

$$R^\mu = (t-t', \vec r - \vec r_0(t')) = (t-t', \vec R(t'))\,. \tag{7.28}$$

(This is clearly a 4-vector, because $(t,\vec r)$ is a 4-vector, and $(t', \vec r_0(t'))$, the spacetime coordinates of the particle, is a 4-vector.) Then, we can write $f$ as the scalar

$$f = \frac{e}{(-U^\nu R_\nu)}\,, \tag{7.29}$$

since clearly if $U^\mu$ is given by (7.26), we shall have $-U^\nu R_\nu = -R_0 = R^0 = t-t' = R(t')$.

Having written $A^\mu$ as a 4-vector expression that reduces to (7.25) under the specialisation (7.26), we know that it must be the correct expression in any Lorentz frame. Now, we have

$$U^\mu = (\gamma, \gamma\,\vec v)\,, \qquad \hbox{where}\quad \gamma = \frac{1}{\sqrt{1-v^2}}\,, \tag{7.30}$$

and so we see that

$$\begin{aligned}
\phi(\vec r,t) &= A^0 = \frac{e\gamma}{(t-t')\gamma - \gamma\,\vec v\cdot\vec R} = \frac{e}{t-t'-\vec v\cdot\vec R} = \frac{e}{R-\vec v\cdot\vec R}\,,\\
\vec A(\vec r,t) &= \frac{e\gamma\,\vec v}{(t-t')\gamma - \gamma\,\vec v\cdot\vec R} = \frac{e\,\vec v}{R-\vec v\cdot\vec R}\,.
\end{aligned} \tag{7.31}$$

To summarise, we have concluded that the gauge potentials for a charge $e$ moving along the path $\vec r = \vec r_0(t')$, as seen from the point $\vec r$ at time $t$, are given by

$$\phi(\vec r,t) = \frac{e}{R-\vec v\cdot\vec R}\,, \qquad \vec A(\vec r,t) = \frac{e\,\vec v}{R-\vec v\cdot\vec R}\,, \tag{7.32}$$

where all quantities on the right-hand sides are evaluated at the time $t'$, i.e. $\vec R$ means $\vec R(t')$ and $\vec v$ means $d\vec r_0(t')/dt'$, with

$$\vec R(t') = \vec r - \vec r_0(t')\,, \tag{7.33}$$

and $t'$ is determined by solving the equation

$$R(t') = t-t'\,, \qquad \hbox{where}\quad R(t') \equiv |\vec R(t')|\,. \tag{7.34}$$
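To make the recipe in (7.32)-(7.34) concrete, here is a minimal numerical sketch: it solves $t - t' = R(t')$ for the retarded time by bisection and then evaluates $\phi$ and $\vec A$ from (7.32). The function names, the bracketing strategy and the uniform-motion test trajectory are assumptions made for illustration only; units are those of the text ($c = 1$), and the trajectory must satisfy $|\vec v| < 1$ for the root to be unique.

```python
import math

def retarded_time(r, t, r0, iterations=200):
    """Solve t - t' = |r - r0(t')| for t' by bisection (assumes |v| < 1, so g is monotonic)."""
    def g(tp):
        return (t - tp) - math.dist(r, r0(tp))
    hi = t                      # g(t) = -|r - r0(t)| <= 0
    span = 1.0
    while g(t - span) < 0.0:    # step back in time until the light cone is crossed
        span *= 2.0
    lo = t - span               # g(lo) >= 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lw_potentials(r, t, r0, v, e=1.0):
    """Return (phi, A) of (7.32), with R and v evaluated at the retarded time."""
    tp = retarded_time(r, t, r0)
    Rvec = [x - x0 for x, x0 in zip(r, r0(tp))]
    R = math.sqrt(sum(c * c for c in Rvec))
    vel = v(tp)
    denom = R - sum(a * b for a, b in zip(vel, Rvec))
    return e / denom, [e * c / denom for c in vel]

# Hypothetical test trajectory: uniform motion along x with speed 0.6.
def r0_uniform(tp):
    return (0.6 * tp, 0.0, 0.0)

def v_uniform(tp):
    return (0.6, 0.0, 0.0)

if __name__ == "__main__":
    print(retarded_time((0.0, 2.0, 0.0), 0.0, r0_uniform))
    print(lw_potentials((0.0, 2.0, 0.0), 0.0, r0_uniform, v_uniform))
```

For this test case the retarded time can be found by hand: $-t' = \sqrt{0.36\,t'^2 + 4}$ gives $t' = -2.5$, so $R = 2.5$, $\vec v\cdot\vec R = 0.9$ and $\phi = e/1.6$, which the sketch reproduces.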
These potentials are known as the Lienard-Wiechert potentials.

The next step will be to calculate the electric and magnetic fields from the Lienard-Wiechert potentials. However, before doing so, it is perhaps worthwhile to pause and give an alternative derivation of the result for the potentials. People's taste in what constitutes a satisfying proof of a result can differ, but I have to say that I personally find the derivation above rather unsatisfying. I would regard it as a bit of “hand-waving argument,” which one maybe would use after having first given a “proper” derivation, in order to try to give a “physical picture” of what is going on. The basic premise of the derivation above is that the potentials “here and now” will be given precisely by applying Coulomb's law at the position the particle was in “a light-travel time” ago. I find it far from obvious that this should give the right answer. It is in fact very interesting that this does give the right answer, but for me, I would view this as a remarkable fact that emerges after one has first given a “proper” derivation of the result, rather than as a solid derivation in its own right.

A “proper” derivation of the Lienard-Wiechert potentials can be given as follows. We take as the starting point the expressions (7.21) and (7.22) for the retarded potentials due to a time-dependent charge and current source. These expressions can themselves be regarded as solid and rigorous, since one only has to verify by direct substitution into (7.6) that they are indeed correct. Consider first the retarded potential for $\phi$, given in (7.21). We can rewrite this as a 4-dimensional integral by introducing a delta-function in the time variable, so that

$$\phi(\vec r,t) = \int\!\!\int \frac{\rho(\vec r\,',t'')}{|\vec r-\vec r\,'|}\, \delta(t''-t+|\vec r-\vec r\,'|)\, dt''\, d^3\vec r\,'\,. \tag{7.35}$$

The charge density for a point charge $e$ moving along the path $\vec r = \vec r_0(t)$ is given by

$$\rho(\vec r,t) = e\,\delta^3(\vec r - \vec r_0(t))\,. \tag{7.36}$$

This means that we shall have

$$\phi(\vec r,t) = \int\!\!\int \frac{e\,\delta^3(\vec r\,'-\vec r_0(t''))}{|\vec r-\vec r\,'|}\, \delta(t''-t+|\vec r-\vec r\,'|)\, dt''\, d^3\vec r\,'\,, \tag{7.37}$$

and so after performing the spatial integrations we obtain

$$\phi(\vec r,t) = \int \frac{e}{|\vec r-\vec r_0(t'')|}\, \delta(t''-t+|\vec r-\vec r_0(t'')|)\, dt''\,. \tag{7.38}$$

To evaluate the time integral, we need to make use of a basic result about the Dirac delta-function, namely that if a function $f(x)$ has a zero at $x=x_0$, then (see footnote 14)

$$\delta(f(x)) = \delta(x-x_0)\, \Big|\frac{df}{dx}\Big|^{-1}\,, \tag{7.39}$$

where $df/dx$ is evaluated at $x=x_0$. (The result given here is valid if $f(x)$ vanishes only at the point $x=x_0$. If it vanishes at more than one point, then there will be a sum of terms of the type given in (7.39).)

[Footnote 14: To prove this, consider the integral $I = \int dx\, h(x)\,\delta(f(x))$ for an arbitrary function $h(x)$. Next, change variable to $z=f(x)$, so $dx = dz/(df/dx)$. Then we have

$$I = \int dz\, h(x)\, \frac{\delta(z)}{|df/dx|} = \frac{h(x_0)}{|df/dx|} \int \delta(z)\, dz = \frac{h(x_0)}{|df/dx|}\,,$$

where $df/dx$ is evaluated at $x=x_0$. Thus we have

$$I = \frac{h(x_0)}{|df/dx|_{x_0}} = \int dx\, h(x)\, \frac{\delta(x-x_0)}{|df/dx|_{x_0}}\,,$$

which proves (7.39). (The reason for the absolute value on $df/dx$ is that it is to be understood that the direction of the limits of the $z$ integration should be the standard one (negative to positive). If the gradient of $f$ is negative at $x=x_0$ then one has to insert a minus sign to achieve this. This is therefore handled by the absolute-value sign.)]

To evaluate (7.38), we note that

$$\begin{aligned}
\frac{\partial}{\partial t''}\Big(t''-t+|\vec r-\vec r_0(t'')|\Big) &= 1 + \frac{\partial}{\partial t''}\Big((\vec r-\vec r_0(t''))\cdot(\vec r-\vec r_0(t''))\Big)^{1/2}\\
&= 1 + \Big((\vec r-\vec r_0(t''))\cdot(\vec r-\vec r_0(t''))\Big)^{-1/2}\, (\vec r-\vec r_0(t''))\cdot\frac{\partial(-\vec r_0(t''))}{\partial t''}\\
&= 1 - \frac{\vec v\cdot(\vec r-\vec r_0(t''))}{|\vec r-\vec r_0(t'')|}\\
&= 1 - \frac{\vec v\cdot\vec R(t'')}{R(t'')}\,,
\end{aligned} \tag{7.40}$$

where $\vec v = d\vec r_0(t'')/dt''$. Following the rule (7.39) for handling a “delta-function of a function,” we therefore take the function in the integrand of (7.38) that multiplies the delta-function, evaluate it at the time $t'$ for which the argument of the delta-function vanishes, and divide by the absolute value of the derivative of the argument of the delta-function. This therefore gives

$$\phi(\vec r,t) = \frac{e}{R(t') - \vec v\cdot\vec R(t')}\,, \tag{7.41}$$

where $t'$ is the solution of $t-t' = R(t')$, and so we have reproduced the previous expression for the Lienard-Wiechert potential for $\phi$ in (7.32). The derivation for $\vec A$ is very similar.

7.3 Electric and magnetic fields of a moving charge

Having obtained the Lienard-Wiechert potentials $\phi$ and $\vec A$ of a moving charge, the next step is to calculate the associated electric and magnetic fields,

$$\vec E = -\vec\nabla\phi - \frac{\partial\vec A}{\partial t}\,, \qquad \vec B = \vec\nabla\times\vec A\,. \tag{7.42}$$
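To do this, we shall need the following results. First, we note that

$$\frac{\partial R}{\partial t} = \frac{\partial R}{\partial t'}\,\frac{\partial t'}{\partial t}\,, \tag{7.43}$$

and so, since $R^2 = R_i R_i$ we have

$$\frac{\partial R}{\partial t'} = \frac{R_i}{R}\,\frac{\partial R_i}{\partial t'} = -\frac{v_i(t')\,R_i}{R} = -\frac{\vec v\cdot\vec R}{R}\,. \tag{7.44}$$

(Recall that $\vec R$ means $\vec R(t')$, and that it is given by (7.33).) Equation (7.43) therefore becomes

$$\frac{\partial R}{\partial t} = -\frac{\vec v\cdot\vec R}{R}\,\frac{\partial t'}{\partial t}\,, \tag{7.45}$$

and so, since we have from (7.34) that $R(t') = t-t'$, it follows that

$$1 - \frac{\partial t'}{\partial t} = -\frac{\vec v\cdot\vec R}{R}\,\frac{\partial t'}{\partial t}\,. \tag{7.46}$$

Solving for $\partial t'/\partial t$, we therefore have the results that

$$\frac{\partial t'}{\partial t} = \Big(1-\frac{\vec v\cdot\vec R}{R}\Big)^{-1}\,, \tag{7.47}$$

$$\frac{\partial R}{\partial t} = -\frac{\vec v\cdot\vec R}{R-\vec v\cdot\vec R}\,. \tag{7.48}$$

Some other expressions we shall also need are as follows. First, from $t-t' = R(t')$ it follows that $\partial_i t' = -\partial_i R(t')$. Now $\vec R(t') = \vec r - \vec r_0(t')$, and so

$$R^2 = (x_j - x_{0j}(t'))(x_j - x_{0j}(t'))\,. \tag{7.49}$$

From this, by acting with $\partial_i$, we obtain

$$\begin{aligned}
2R\,\partial_i R &= 2\big(\delta_{ij} - \partial_i x_{0j}(t')\big)\big(x_j - x_{0j}(t')\big)\\
&= 2R_i - 2\,\frac{\partial x_{0j}(t')}{\partial t'}\,\frac{\partial t'}{\partial x_i}\,\big(x_j - x_{0j}(t')\big)\\
&= 2R_i - 2\,\vec v\cdot\vec R\;\partial_i t'\,.
\end{aligned} \tag{7.50}$$

From this and $\partial_i t' = -\partial_i R(t')$ it follows that

$$\partial_i t' = -\frac{R_i}{R-\vec v\cdot\vec R}\,, \qquad \partial_i R = \frac{R_i}{R-\vec v\cdot\vec R}\,. \tag{7.51}$$

Further results that follow straightforwardly are

$$\begin{aligned}
\partial_i R_j &= \partial_i\big(x_j - x_{0j}(t')\big) = \delta_{ij} - \frac{\partial x_{0j}(t')}{\partial t'}\,\partial_i t' = \delta_{ij} + \frac{v_j R_i}{R-\vec v\cdot\vec R}\,,\\
\partial_i v_j &= \frac{\partial v_j}{\partial t'}\,\partial_i t' = -\frac{\dot v_j R_i}{R-\vec v\cdot\vec R}\,,
\end{aligned}$$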
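$$\begin{aligned}
\frac{\partial v_i}{\partial t} &= \frac{\partial v_i}{\partial t'}\,\frac{\partial t'}{\partial t} = \frac{\dot v_i R}{R-\vec v\cdot\vec R}\,,\\
\frac{\partial R}{\partial t} &= -\frac{\vec v\cdot\vec R}{R-\vec v\cdot\vec R}\,,\\
\frac{\partial \vec R}{\partial t} &= \frac{\partial\vec R}{\partial t'}\,\frac{\partial t'}{\partial t} = -\vec v\,\frac{\partial t'}{\partial t} = -\frac{\vec v\,R}{R-\vec v\cdot\vec R}\,.
\end{aligned} \tag{7.52}$$

Note that $\dot v_i$ means $\partial v_i/\partial t'$; we shall define the acceleration $\vec a$ of the particle by

$$\vec a \equiv \frac{\partial\vec v}{\partial t'}\,. \tag{7.53}$$

We are now ready to evaluate the electric and magnetic fields. From (7.32) and the results above, we have

$$\begin{aligned}
E_i &= -\partial_i\phi - \frac{\partial A_i}{\partial t}\\
&= \frac{e}{(R-\vec v\cdot\vec R)^2}\,\big(\partial_i R - \partial_i(v_j R_j)\big) - \frac{e}{R-\vec v\cdot\vec R}\,\frac{\partial v_i}{\partial t} + \frac{e\,v_i}{(R-\vec v\cdot\vec R)^2}\Big(\frac{\partial R}{\partial t} - \frac{\partial(\vec v\cdot\vec R)}{\partial t}\Big)\\
&= \frac{e}{(R-\vec v\cdot\vec R)^3}\,\Big\{R_i - v_i(R-\vec v\cdot\vec R) - v^2 R_i + \vec a\cdot\vec R\, R_i - a_i R\,(R-\vec v\cdot\vec R)\\
&\qquad\qquad\qquad\qquad - v_i\,\vec v\cdot\vec R - v_i\,\vec a\cdot\vec R\, R + v^2 v_i R\Big\}\\
&= \frac{e(1-v^2)(R_i - v_i R)}{(R-\vec v\cdot\vec R)^3} + \frac{e\big[\vec a\cdot\vec R\,(R_i - v_i R) - a_i\,(R-\vec v\cdot\vec R)\,R\big]}{(R-\vec v\cdot\vec R)^3}\,.
\end{aligned} \tag{7.54}$$

This can be rewritten as

$$\vec E = \frac{e(1-v^2)(\vec R - \vec v\,R)}{(R-\vec v\cdot\vec R)^3} + \frac{e\,\vec R\times[(\vec R-\vec v\,R)\times\vec a]}{(R-\vec v\cdot\vec R)^3}\,. \tag{7.55}$$

An analogous calculation of $\vec B$ shows that it can be written as

$$\vec B = \frac{\vec R\times\vec E}{R}\,. \tag{7.56}$$

Note that this means that $\vec B$ is perpendicular to $\vec E$.

The first term in (7.55) is independent of the acceleration $\vec a$, and so it represents a contribution that is present even if the charge is in uniform motion. It is easily seen that at large distance, where $R\to\infty$, it falls off like $1/R^2$. If the charge is moving with uniform velocity $\vec v$ then we shall have

$$\vec r_0(t) = \vec r_0(t') + \vec v\,(t-t')\,, \tag{7.57}$$

and so

$$\begin{aligned}
\vec R(t') - \vec v\,R(t') &= \vec r - \vec r_0(t') - \vec v\,(t-t')\\
&= \vec r - \vec r_0(t) + \vec v\,(t-t') - \vec v\,(t-t')\\
&= \vec R(t)\,.
\end{aligned} \tag{7.58}$$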
In other words, in this case of uniform motion, $\vec R(t') - \vec v\,R(t')$ is equal to the vector $\vec R(t)$ that gives the line joining the charge to the point of observation at the time the observation is made. We shall also then have

$$R(t') - \vec v\cdot\vec R(t') = R(t') - v^2 R(t') - \vec v\cdot\vec R(t) = (1-v^2)\,R(t') - \vec v\cdot\vec R(t)\,. \tag{7.59}$$

If we now introduce the angle $\theta$ between $\vec v$ and $\vec R(t)$, we shall have $\vec v\cdot\vec R(t) = v\,R(t)\cos\theta$. Since, as we saw above, $\vec R(t') = \vec v\,R(t') + \vec R(t)$, we obtain, by squaring,

$$R^2(t') = v^2 R^2(t') + 2v\,R(t)\,R(t')\cos\theta + R^2(t)\,, \tag{7.60}$$

and this quadratic equation for $R(t')$ can be solved to give

$$R(t') = \frac{v\,R(t)\cos\theta + R(t)\sqrt{1-v^2\sin^2\theta}}{1-v^2}\,. \tag{7.61}$$

Equation (7.59) then gives

$$R(t') - \vec v\cdot\vec R(t') = v\,R(t)\cos\theta + R(t)\sqrt{1-v^2\sin^2\theta} - v\,R(t)\cos\theta = R(t)\sqrt{1-v^2\sin^2\theta}\,. \tag{7.62}$$

For a uniformly moving charge we therefore obtain the result

$$\vec E = \frac{e\,\vec R(t)}{R^3(t)}\, \frac{1-v^2}{(1-v^2\sin^2\theta)^{3/2}}\,, \tag{7.63}$$

which has reproduced the result (5.40) that we had previously obtained by boosting from the rest frame of the charged particle.

The second term in (7.55) is proportional to $\vec a$, and so it occurs only for an accelerating charge. At large distance, this term falls off like $1/R$, in other words, much less rapidly than the $1/R^2$ fall-off of the first term in (7.55). In fact the $1/R$ fall-off of the acceleration term is characteristic of an electromagnetic wave, as we shall now discuss.

7.4 Radiation by accelerated charges

A charge at rest generates a purely electric field, and if it is in uniform motion it generates both $\vec E$ and $\vec B$ fields. In neither case, of course, does it radiate any energy. However, if the charge is accelerating, then it actually emits electromagnetic radiation.

The easiest case to consider is when the velocity of the charge is small compared with the speed of light. In this case the acceleration term in (7.55) is approximated by

$$\vec E = \frac{e\,\vec R\times(\vec R\times\vec a)}{R^3} = \frac{e\,\vec n\times(\vec n\times\vec a)}{R}\,, \tag{7.64}$$
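where

$$\vec n \equiv \frac{\vec R}{R}\,. \tag{7.65}$$

Note that $\vec n\cdot\vec E = 0$, and that $\vec E$ is also perpendicular to $\vec n\times\vec a$. This means that the polarisation of $\vec E$ lies in the plane containing $\vec n$ and $\vec a$, and is perpendicular to $\vec n$.

From (7.56) we shall also have

$$\vec B = \vec n\times\vec E\,. \tag{7.66}$$

As usual, all quantities here in the expressions for $\vec E$ and $\vec B$ are evaluated at the retarded time $t'$.

The energy flux, given by the Poynting vector, is given by

$$\vec S = \frac{1}{4\pi}\,\vec E\times\vec B = \frac{1}{4\pi}\,\vec E\times(\vec n\times\vec E) = \frac{1}{4\pi}\,E^2\,\vec n - \frac{1}{4\pi}\,(\vec n\cdot\vec E)\,\vec E\,, \tag{7.67}$$

and so, since $\vec n\cdot\vec E = 0$ we have

$$\vec S = \frac{1}{4\pi}\,E^2\,\vec n\,. \tag{7.68}$$

Let us define $\theta$ to be the angle between the unit vector $\vec n$ and the acceleration $\vec a$. Then we shall have

$$\vec E = \frac{e}{R}\,(\vec n\cdot\vec a\;\vec n - \vec a) = \frac{e}{R}\,(a\,\vec n\cos\theta - \vec a)\,, \tag{7.69}$$

and so

$$E^2 = \frac{e^2}{R^2}\,(a^2\cos^2\theta - 2a^2\cos^2\theta + a^2) = \frac{e^2a^2\sin^2\theta}{R^2}\,, \tag{7.70}$$

implying that the energy flux is

$$\vec S = \frac{e^2a^2\sin^2\theta}{4\pi R^2}\,\vec n\,. \tag{7.71}$$

The area element $d\vec\Sigma$ can be written as

$$d\vec\Sigma = R^2\,\vec n\,d\Omega\,, \tag{7.72}$$

where $d\Omega = \sin\theta\,d\theta\,d\varphi$ is the area element on the unit-radius sphere (i.e. the solid angle element). The power radiated into the area element $d\vec\Sigma$ is $dP = \vec S\cdot d\vec\Sigma = R^2\,\vec n\cdot\vec S\,d\Omega$, and so we find that

$$\frac{dP}{d\Omega} = \frac{e^2a^2}{4\pi}\,\sin^2\theta \tag{7.73}$$

is the power radiated per unit solid angle.

The total power radiated in all directions is given by

$$\begin{aligned}
P &= \int\frac{dP}{d\Omega}\,d\Omega = \frac{e^2a^2}{4\pi}\int_0^\pi \sin^3\theta\,d\theta\int_0^{2\pi}d\varphi\\
&= \tfrac12\,e^2a^2\int_0^\pi\sin^3\theta\,d\theta = \tfrac12\,e^2a^2\int_{-1}^{1}(1-c^2)\,dc = \tfrac23\,e^2a^2\,,
\end{aligned} \tag{7.74}$$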
where, to evaluate the $\theta$ integral we change variable to $c = \cos\theta$. The expression

$$P = \tfrac23\,e^2a^2 \tag{7.75}$$

is known as the Larmor Formula for a non-relativistic accelerating charge.

The Larmor formula can be generalised to the relativistic result fairly easily. In principle, we could simply repeat the argument given above, but without making the approximation that $v$ is small compared to 1 (the speed of light). Note that in terms of the unit vector $\vec n = \vec R/R$, the expression (7.55) for the electric field becomes

$$\vec E = \frac{e(1-v^2)(\vec n-\vec v)}{R^2\,(1-\vec n\cdot\vec v)^3} + \frac{e\,\vec n\times[(\vec n-\vec v)\times\vec a]}{R\,(1-\vec n\cdot\vec v)^3}\,. \tag{7.76}$$

We can, in fact, obtain the relativistic Larmor formula by a simple trick. First, we note from (7.76) that since $\vec S = (\vec E\times\vec B)/(4\pi)$ and $\vec B = \vec n\times\vec E$, the energy flux from the acceleration term must be quadratic in the acceleration $\vec a$. We can also note that the total radiated power $P$ is a Lorentz scalar (since it is energy per unit time, and each of these quantities transforms as the 0 component of a 4-vector). Thus, the task is to find a Lorentz-invariant expression for $P$ that reduces to the non-relativistic Larmor result (7.75) in the limit when $v$ goes to zero.

First, we note that the non-relativistic Larmor formula (7.75) can be written as

$$P = \tfrac23\,e^2a^2 = \frac{2e^2}{3m^2}\Big(\frac{d\vec p}{dt}\Big)^2\,. \tag{7.77}$$

There is only one Lorentz-invariant quantity, quadratic in $\vec a$, that reduces to this expression in the limit that $v$ goes to zero. It is given by

$$P = \frac{2e^2}{3m^2}\,\frac{dp^\mu}{d\tau}\,\frac{dp_\mu}{d\tau}\,, \tag{7.78}$$

where $p^\mu$ is the 4-momentum of the particle and $\tau$ is the proper time along its path. Noting that $p^\mu = m(\gamma, \gamma\vec v)$, we see that

$$\frac{dp^\mu}{d\tau} = \gamma\,\frac{dp^\mu}{dt} = m\gamma\,\big(\gamma^3\,\vec v\cdot\vec a\,,\; \gamma^3(\vec v\cdot\vec a)\,\vec v + \gamma\,\vec a\big)\,, \tag{7.79}$$

and so

$$\begin{aligned}
\frac{dp^\mu}{d\tau}\,\frac{dp_\mu}{d\tau} &= m^2\gamma^2\,\big[-\gamma^6(\vec v\cdot\vec a)^2 + \gamma^6v^2(\vec v\cdot\vec a)^2 + 2\gamma^4(\vec v\cdot\vec a)^2 + \gamma^2a^2\big]\\
&= m^2\gamma^2\,\big[\gamma^4(\vec v\cdot\vec a)^2 + \gamma^2a^2\big]\,.
\end{aligned} \tag{7.80}$$

Now consider the quantity

$$\begin{aligned}
a^2 - (\vec v\times\vec a)^2 &= a^2 - \epsilon_{ijk}\,\epsilon_{i\ell m}\,v_j a_k v_\ell a_m\\
&= a^2 - v^2a^2 + (\vec v\cdot\vec a)^2 = \frac{a^2}{\gamma^2} + (\vec v\cdot\vec a)^2\,,
\end{aligned} \tag{7.81}$$
which shows that we can write

$$\frac{dp^\mu}{d\tau}\,\frac{dp_\mu}{d\tau} = m^2\gamma^6\Big[\frac{a^2}{\gamma^2} + (\vec v\cdot\vec a)^2\Big] = m^2\gamma^6\,\big[a^2 - (\vec v\times\vec a)^2\big]\,. \tag{7.82}$$

Thus we see that the scalar $P$ given in (7.78) is given by

$$P = \tfrac23\,e^2\gamma^6\,\big[a^2 - (\vec v\times\vec a)^2\big]\,. \tag{7.83}$$

This indeed reduces to the non-relativistic Larmor formula (7.75) if the velocity $\vec v$ is sent to zero. For the reasons we described above, it must therefore be the correct fully-relativistic Larmor result for the total power radiated by an accelerating charge.

7.5 Applications of Larmor formula

7.5.1 Linear accelerator

In a linear accelerator, a charged massive particle is accelerated along a straight-line trajectory, and so its velocity $\vec v$ and acceleration $\vec a$ are parallel. Defining $p = |\vec p| = m\gamma|\vec v|$, we have

$$\frac{dp}{dt} = m\gamma\,\frac{dv}{dt} + m v\,\frac{d\gamma}{dt}\,, \tag{7.84}$$

where $v = |\vec v|$ and $\gamma = (1-v^2)^{-1/2}$. Clearly we have

$$v\,\frac{dv}{dt} = \vec v\cdot\frac{d\vec v}{dt} = \vec v\cdot\vec a = v\,a\,, \qquad \frac{d\gamma}{dt} = \gamma^3\,\vec v\cdot\frac{d\vec v}{dt} = \gamma^3\,v\,a\,, \tag{7.85}$$

and so

$$\frac{dp}{dt} = m\gamma^3 a\,. \tag{7.86}$$

With $\vec v$ and $\vec a$ parallel, the relativistic Larmor formula (7.83) gives $P = \tfrac23 e^2\gamma^6a^2$, and so we have

$$P = \frac{2e^2}{3m^2}\Big(\frac{dp}{dt}\Big)^2\,. \tag{7.87}$$

The expression (7.87) gives the power that is radiated by the charge as it is accelerated along a straight-line trajectory. In a particle accelerator, the goal, obviously, is to accelerate the particles to as high a velocity as possible. Equation (7.87) describes the power that is lost through radiation when the particle is being accelerated. The energy $E$ of the particle is related to its rest mass $m$ and 3-momentum $\vec p$ by the standard formula

$$E^2 = p^2 + m^2\,. \tag{7.88}$$

The rate of change of energy with distance travelled, $dE/dx$, is therefore given by

$$E\,\frac{dE}{dx} = p\,\frac{dp}{dx}\,, \tag{7.89}$$
and so we have

$$\frac{dE}{dx} = \frac{p}{E}\,\frac{dp}{dx} = \frac{m\gamma v}{m\gamma}\,\frac{dp}{dx} = v\,\frac{dp}{dx} = \frac{dx}{dt}\,\frac{dp}{dx} = \frac{dp}{dt}\,. \tag{7.90}$$

This means that (7.87) can be rewritten as

$$P = \frac{2e^2}{3m^2}\Big(\frac{dE}{dx}\Big)^2\,. \tag{7.91}$$

The “energy-loss factor” of the accelerator can be judged by taking the ratio of the power radiated divided by the power supplied. By energy conservation, the power supplied is equal to the rate of change of energy of the particle, $dE/dt$. Thus we have

$$\frac{\hbox{Power radiated}}{\hbox{Power supplied}} = \frac{P}{(dE/dt)} = \frac{P}{(dE/dx)}\,\frac{dt}{dx} = \frac{P}{v\,(dE/dx)} = \frac{2e^2}{3m^2v}\,\frac{dE}{dx}\,. \tag{7.92}$$

In the relativistic limit, where $v$ is very close to the speed of light (as is typically achieved in a powerful linear accelerator), we therefore have

$$\frac{\hbox{Power radiated}}{\hbox{Power supplied}} \approx \frac{2e^2}{3m^2}\,\frac{dE}{dx}\,. \tag{7.93}$$

A typical electron linear accelerator achieves an energy input of about 10 MeV per metre, and this translates into an energy-loss factor of about $10^{-13}$. In other words, very little of the applied power being used to accelerate the electron is lost through Larmor radiation.

7.5.2 Circular accelerator

The situation is very different in the case of a circular accelerator, since the transverse acceleration necessary to keep the particle in a circular orbit is typically very much larger than the linear acceleration discussed above. In other words, the direction of the 3-momentum $\vec p$ is changing rapidly, while, by contrast, the energy, and hence the magnitude of $\vec p$, is relatively slowly-changing. In fact the change in $|\vec p|$ per revolution is rather small, and we can study the power loss by assuming that the particle is in an orbit of fixed angular frequency $\omega$. This means that we shall have

$$\Big|\frac{d\vec p}{dt}\Big| = \omega\,|\vec p|\,, \tag{7.94}$$

and so

$$\Big|\frac{d\vec p}{d\tau}\Big| = \gamma\,\omega\,|\vec p|\,, \tag{7.95}$$

where $d\tau = dt/\gamma$ is the proper-time interval. Since the energy is constant in this approximation, we therefore have

$$\frac{dp^0}{d\tau} = 0\,, \qquad \hbox{and so}\qquad \frac{dp^\mu}{d\tau}\,\frac{dp_\mu}{d\tau} = \Big(\frac{d\vec p}{d\tau}\Big)^2 = \gamma^2\omega^2p^2\,. \tag{7.96}$$
Using equation (7.78) for the Larmor power radiation, we therefore have

$$P = \frac{2e^2}{3m^2}\,\gamma^2\omega^2p^2 = \tfrac23\,e^2\gamma^4\omega^2v^2\,. \tag{7.97}$$

If the radius of the accelerator is $R$ then the angular and linear velocities of the particle are related by $\omega = v/R$ and so the power loss is given by

$$P = \frac{2e^2\gamma^4v^4}{3R^2}\,. \tag{7.98}$$

The radiative energy loss per revolution, $\Delta E$, is given by the product of $P$ with the period of the orbit, namely

$$\Delta E = \frac{2\pi R\,P}{v} = \frac{4\pi e^2\gamma^4v^3}{3R}\,. \tag{7.99}$$

A typical example would be a 10 GeV electron synchrotron, for which the radius $R$ is about 100 metres. Plugging in the numbers, this implies an energy loss of about 10 MeV per revolution, or about 0.1% of the energy of the particle. Bearing in mind that the time taken to complete an orbit is very small (the electron is travelling at nearly the speed of light), it is necessary to supply energy at a very high rate in order to replenish the radiative loss. It also implies that there will be a considerable amount of radiation being emitted by the accelerator.

7.6 Angular distribution of the radiated power

We saw previously that for a non-relativistic charged particle whose acceleration $\vec a$ makes an angle $\theta$ with respect to the position vector $\vec R$, the angular distribution of the radiated power is given by (see (7.73))

$$\frac{dP}{d\Omega} = \frac{e^2a^2}{4\pi}\,\sin^2\theta\,. \tag{7.100}$$

In the general (i.e. relativistic) case, where the velocity $\vec v$ is large, we have, from (7.76), that at large $R$ the electric and magnetic fields are dominated by the radiation-field term:

$$\vec E = \frac{e\,\vec n\times[(\vec n-\vec v)\times\vec a]}{R\,(1-\vec n\cdot\vec v)^3}\,, \qquad \vec B = \vec n\times\vec E\,. \tag{7.101}$$

The radial component of the Poynting vector, $\vec n\cdot\vec S$, is therefore given by

$$\begin{aligned}
\vec n\cdot\vec S &= \frac{1}{4\pi}\,\vec n\cdot(\vec E\times\vec B) = \frac{1}{4\pi}\,\vec n\cdot[\vec E\times(\vec n\times\vec E)]\\
&= \frac{1}{4\pi}\,\vec n\cdot[E^2\,\vec n - (\vec n\cdot\vec E)\,\vec E] = \frac{1}{4\pi}\,E^2\,,
\end{aligned} \tag{7.102}$$

since $\vec n\cdot\vec E = 0$. Thus we have

$$\vec n\cdot\vec S = \frac{e^2}{4\pi R^2}\,\Big|\frac{\vec n\times[(\vec n-\vec v)\times\vec a]}{(1-\vec n\cdot\vec v)^3}\Big|^2\,, \tag{7.103}$$
where as usual all quantities on the right-hand side are evaluated at the retarded time $t'$ calculated from the equation $t-t' = R(t')$, with $\vec R(t') = \vec r - \vec r_0(t')$. It is conventional to denote the quantity in (7.103) by $[\vec n\cdot\vec S]_{\rm ret}$, to indicate that it is evaluated at the retarded time $t'$. The associated energy radiated during the time interval from $t=T_1$ to $t=T_2$ is therefore given by

$$E = \int_{T_1}^{T_2}[\vec n\cdot\vec S]_{\rm ret}\,dt\,, \tag{7.104}$$

where $T_i$ is the time $t$ that corresponds to the retarded time $t' = T_i'$. The integral can therefore be rewritten as

$$E = \int_{T_1'}^{T_2'}[\vec n\cdot\vec S]_{\rm ret}\,\frac{dt}{dt'}\,dt'\,. \tag{7.105}$$

The quantity $[\vec n\cdot\vec S]_{\rm ret}\,(dt/dt')$ is the power radiated per unit area as measured with respect to the charge's retarded time $t'$, and so we have the result that

$$\frac{dP(t')}{d\Omega} = R^2\,[\vec n\cdot\vec S]_{\rm ret}\,\frac{dt}{dt'} = R^2\,(1-\vec n\cdot\vec v)\,[\vec n\cdot\vec S]_{\rm ret}\,. \tag{7.106}$$

(Note that we used the result (7.47) here.)

7.6.1 Angular power distribution for linear acceleration

As an example, consider the situation when the charge is accelerated uniformly for only a short time, so that $\vec v$ as well as $\vec a$ are approximately constant during the time interval of the acceleration. This means that $\vec n$ and $R$ are approximately constant, and so from (7.103) and (7.106) we obtain the angular distribution

$$\frac{dP(t')}{d\Omega} = \frac{e^2}{4\pi}\,\frac{|\vec n\times[(\vec n-\vec v)\times\vec a]|^2}{(1-\vec n\cdot\vec v)^5}\,. \tag{7.107}$$

If we now suppose that the acceleration is linear, i.e. that $\vec v$ and $\vec a$ are parallel, then we obtain

$$\frac{dP(t')}{d\Omega} = \frac{e^2a^2}{4\pi}\,\frac{\sin^2\theta}{(1-v\cos\theta)^5}\,, \tag{7.108}$$

where as before we define $\theta$ to be the angle between $\vec a$ and $\vec n$.

When $|v| \ll 1$, the expression (7.108) clearly reduces to the non-relativistic result given in (7.73). In this limit, the angular radiated power distribution is described by a figure-of-eight, oriented perpendicularly to the direction of the acceleration. As the velocity becomes larger, the two lobes of the figure-of-eight start to tilt forwards, along the direction of the acceleration. This is illustrated for the non-relativistic and relativistic cases in Figures 1 and 2 below. In each case, the acceleration is to the right along the horizontal axis.
Figure 1: The angular power distribution in the non-relativistic case

Figure 2: The angular power distribution in the relativistic case ($v = 4/5$)

The angle at which the radiated power is largest is found by solving $d(dP/d\Omega)/d\theta = 0$. This gives

$$2(1-v\cos\theta)\cos\theta - 5v\sin^2\theta = 0\,, \tag{7.109}$$

and hence

$$\theta_{\rm max.} = \arccos\Big(\frac{\sqrt{1+15v^2}-1}{3v}\Big)\,. \tag{7.110}$$

In the case of a highly relativistic particle, for which $v$ is very close to the speed of light, the velocity itself is not a very convenient parameter, and instead we can more usefully characterise it by $\gamma = (1-v^2)^{-1/2}$, which becomes very large in the relativistic limit. Thus, substituting $v = \sqrt{1-\gamma^{-2}}$ into (7.110), we obtain

$$\theta_{\rm max.} = \arccos\Bigg(\frac{4\sqrt{1-\tfrac{15}{16}\gamma^{-2}}-1}{3\sqrt{1-\gamma^{-2}}}\Bigg)\,. \tag{7.111}$$

At large $\gamma$ we can expand the argument as a power series in $\gamma^{-2}$, finding that

$$\theta_{\rm max.} \approx \arccos\big(1-\tfrac18\gamma^{-2}\big)\,. \tag{7.112}$$

This implies that $\theta_{\rm max.}$ is close to 0 when $\gamma$ is very large. In this regime we have $\cos\theta_{\rm max.} \approx 1-\tfrac12\theta_{\rm max.}^2$, and so in the highly relativistic case we have

$$\theta_{\rm max.} \approx \frac{1}{2\gamma}\,. \tag{7.113}$$

We see that the lobes of the angular power distribution tilt forward sharply, so that they are directed nearly parallel to the direction of acceleration of the particle.

Continuing with the highly-relativistic limit, we may consider the profile of the angular power distribution for all small angles $\theta$. Substituting

$$v = \sqrt{1-\gamma^{-2}}\,, \qquad \sin\theta\approx\theta\,, \qquad \cos\theta\approx 1-\tfrac12\theta^2 \tag{7.114}$$

into (7.108), and expanding in inverse powers of $\gamma$, we find that

$$\frac{dP(t')}{d\Omega} \approx \frac{e^2a^2\theta^2}{4\pi\Big[1-\sqrt{1-\gamma^{-2}}\,\big(1-\tfrac12\theta^2\big)\Big]^5} \approx \frac{8e^2a^2\theta^2}{\pi\,(\gamma^{-2}+\theta^2)^5}\,, \tag{7.115}$$

which can be written as

$$\frac{dP(t')}{d\Omega} \approx \frac{8e^2a^2\gamma^8}{\pi}\,\frac{(\gamma\theta)^2}{[1+(\gamma\theta)^2]^5}\,. \tag{7.116}$$

This shows that indeed there are two lobes, of characteristic width $\Delta\theta\sim 1/\gamma$, on each side of $\theta=0$. The radiated power is zero in the exactly forward direction $\theta=0$.

We can straightforwardly integrate our result (7.108) for the angular power distribution for a linearly-accelerated particle, to find the total radiated power. We obtain

$$P = \int\frac{dP(t')}{d\Omega}\,d\Omega = \frac{e^2a^2}{4\pi}\,2\pi\int_0^\pi\frac{\sin^2\theta}{(1-v\cos\theta)^5}\,\sin\theta\,d\theta = \tfrac12\,e^2a^2\int_{-1}^{1}\frac{(1-c^2)\,dc}{(1-vc)^5}\,, \tag{7.117}$$

where $c=\cos\theta$. The integral is elementary, giving the result

$$P = \tfrac23\,e^2\gamma^6a^2\,. \tag{7.118}$$

This can be seen to be in agreement with our earlier result (7.83), under the specialisation that $\vec a$ and $\vec v$ are parallel.
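The statements above about the linear-acceleration distribution (7.108) can be checked numerically. The sketch below evaluates the distribution, the peak angle from (7.110), and a Simpson-rule estimate of the total power in (7.117), for comparison with $\tfrac23 e^2\gamma^6a^2$. The function names, the default values $e = a = 1$ and the grid size are illustrative assumptions, in units with $c = 1$.

```python
import math

def dP_dOmega_linear(theta, v, e=1.0, a=1.0):
    """Angular distribution (7.108) for velocity parallel to acceleration."""
    return e * e * a * a / (4.0 * math.pi) * math.sin(theta) ** 2 / (1.0 - v * math.cos(theta)) ** 5

def theta_max(v):
    """Angle of maximum emission from (7.110); requires 0 < v < 1."""
    return math.acos((math.sqrt(1.0 + 15.0 * v * v) - 1.0) / (3.0 * v))

def total_power_linear(v, e=1.0, a=1.0, n=4000):
    """Composite Simpson estimate of the solid-angle integral (7.117); n must be even."""
    h = math.pi / n
    total = 0.0
    for k in range(n + 1):
        theta = k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * dP_dOmega_linear(theta, v, e, a) * math.sin(theta)
    return 2.0 * math.pi * total * h / 3.0

if __name__ == "__main__":
    v = 0.8                                  # the value used in Figure 2
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    print(theta_max(v))                      # peak angle in radians
    print(total_power_linear(v), 2.0 / 3.0 * gamma ** 6)
```

For large $\gamma$ the value returned by `theta_max` approaches $1/(2\gamma)$, in line with (7.113).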
7.6.2 Angular power distribution for circular motion

For a second example, consider the situation of a charge that is in uniform circular motion. Of course, the complete path of the particle could be something more complicated than a circle, but such that at some instant it can be described by a circular motion. For these purposes, we need only assume that it is instantaneously in such motion. Circular motion implies that the velocity $\vec v$ and the acceleration $\vec a$ are perpendicular.

At the instant under consideration, we may choose a system of Cartesian axes oriented so that the velocity $\vec v$ lies along the $z$ direction, and the acceleration lies along the $x$ direction. The unit vector $\vec n = \vec R/R$ can then be parameterised by spherical polar coordinates $(\theta,\varphi)$ defined in the usual way; i.e. $\theta$ measures the angle between $\vec n$ and the $z$ axis, and $\varphi$ is the azimuthal angle, measured from the $x$ axis, of the projection of $\vec n$ onto the $(x,y)$ plane. Thus we shall have

$$\vec n = (\sin\theta\cos\varphi,\ \sin\theta\sin\varphi,\ \cos\theta)\,,\qquad \vec v = (0,0,v)\,,\qquad \vec a = (a,0,0)\,. \tag{7.119}$$

Of course, in particular, we have $\vec n\cdot\vec v = v\cos\theta$.

From (7.103) and (7.106), we have the general expression

$$\frac{dP(t')}{d\Omega} = \frac{e^2}{4\pi}\,\frac{|\vec n\times[(\vec n-\vec v)\times\vec a]|^2}{(1-\vec n\cdot\vec v)^5} \tag{7.120}$$

for the angular distribution of the radiated power. Using the fact that $\vec v\cdot\vec a = 0$ in the case of circular motion, we have

$$\begin{aligned}
|\vec n\times[(\vec n-\vec v)\times\vec a]|^2 &= |(\vec n\cdot\vec a)(\vec n-\vec v) - (1-\vec n\cdot\vec v)\,\vec a|^2\\
&= (\vec n\cdot\vec a)^2(1-2\vec n\cdot\vec v+v^2) + (1-\vec n\cdot\vec v)^2a^2 - 2(\vec n\cdot\vec a)^2(1-\vec n\cdot\vec v)\\
&= -(\vec n\cdot\vec a)^2(1-v^2) + (1-\vec n\cdot\vec v)^2a^2\\
&= (1-v\cos\theta)^2a^2 - \gamma^{-2}a^2\sin^2\theta\cos^2\varphi\,,
\end{aligned} \tag{7.121}$$

and so for instantaneous circular motion we have

$$\frac{dP(t')}{d\Omega} = \frac{e^2a^2}{4\pi(1-v\cos\theta)^3}\left[1 - \frac{\sin^2\theta\cos^2\varphi}{\gamma^2(1-v\cos\theta)^2}\right]. \tag{7.122}$$

We see that as $v$ tends to 1, the angular distribution is peaked in the forward direction, i.e. in the direction of the velocity $\vec v$, meaning that $\theta$ is close to 0.

The total power is obtained by integrating over all solid angles:

$$\begin{aligned}
P(t') = \int\frac{dP(t')}{d\Omega}\,d\Omega &= \int_0^{2\pi}d\varphi\int_0^\pi \sin\theta\,d\theta\,\frac{dP(t')}{d\Omega}\\
&= \int_0^{2\pi}d\varphi\int_0^\pi\sin\theta\,d\theta\,\frac{e^2a^2}{4\pi(1-v\cos\theta)^3}\left[1-\frac{\sin^2\theta\cos^2\varphi}{\gamma^2(1-v\cos\theta)^2}\right]\\
&= \int_0^\pi\sin\theta\,d\theta\,\frac{e^2a^2}{2(1-v\cos\theta)^3}\left[1-\frac{\sin^2\theta}{2\gamma^2(1-v\cos\theta)^2}\right]\\
&= \int_{-1}^{1}\frac{e^2a^2}{2(1-vc)^3}\left[1-\frac{1-c^2}{2\gamma^2(1-vc)^2}\right]dc\,,
\end{aligned} \tag{7.123}$$

where $c = \cos\theta$. After performing the integration, we obtain

$$P(t') = \tfrac23\,e^2\gamma^4a^2\,. \tag{7.124}$$

This expression can be compared with the general result (7.83), specialised to the case where $\vec v$ and $\vec a$ are perpendicular. Noting that then

$$(\vec v\times\vec a)^2 = \epsilon_{ijk}\epsilon_{i\ell m}v_ja_kv_\ell a_m = v_jv_ja_ka_k - v_ja_jv_ka_k = v_jv_ja_ka_k = v^2a^2\,, \tag{7.125}$$

we see that (7.83) indeed agrees with (7.124) in this case.

The total power radiated in the case of linear acceleration, with its $\gamma^6$ factor as in (7.118), is larger by a factor of $\gamma^2$ than the total power radiated in the case of circular motion, provided we take the acceleration $a$ to be the same in the two cases. However, this is not always the most relevant comparison to make. Another way to make the comparison is to take the magnitude of the applied force, $|d\vec p/dt|$, to be the same in the two cases. For circular motion we have that the speed $v$ is constant, and so

$$\frac{d\vec p}{dt} = m\gamma\,\frac{d\vec v}{dt} = m\gamma\,\vec a\,. \tag{7.126}$$

Thus for circular motion, we have from (7.124) that

$$P(t') = \frac{2e^2\gamma^2}{3m^2}\left|\frac{d\vec p}{dt}\right|^2. \tag{7.127}$$

By contrast, for linear acceleration, where $\vec v$ is parallel to $\vec a$, we have

$$\frac{d\vec p}{dt} = m\gamma\,\vec a + m\gamma^3(\vec v\cdot\vec a)\,\vec v = m\gamma^3\vec a\,, \tag{7.128}$$

and so this gives

$$P(t') = \frac{2e^2}{3m^2}\left|\frac{d\vec p}{dt}\right|^2. \tag{7.129}$$

Thus if we hold $|d\vec p/dt|$ fixed when comparing the two, we see that it is the particle in circular motion whose radiated power is larger than that of the linearly-accelerated particle, by a factor of $\gamma^2$.
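The two comparisons just made (at fixed acceleration, and at fixed applied force) can be illustrated numerically. The sketch below integrates the last line of (7.123) to check the $\gamma^4$ result (7.124), and then evaluates (7.127) and (7.129) at equal force. All function names, the default values $e = m = 1$ and the integration step count are assumptions chosen only for this illustration.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def circular_power_numeric(v, e=1.0, a=1.0):
    """Last line of (7.123), integrated over c = cos(theta) from -1 to 1."""
    gamma2 = 1 / (1 - v * v)
    def integrand(c):
        bracket = 1 - (1 - c * c) / (2 * gamma2 * (1 - v * c) ** 2)
        return bracket / (2 * (1 - v * c) ** 3)
    return e**2 * a**2 * simpson(integrand, -1.0, 1.0)

def circular_power_closed(v, e=1.0, a=1.0):
    """Closed form (7.124): (2/3) e^2 gamma^4 a^2."""
    return (2 / 3) * e**2 * a**2 / (1 - v * v) ** 2

def power_at_fixed_force(v, force, e=1.0, m=1.0):
    """Return (circular, linear) radiated power for the same |dp/dt|, from (7.127) and (7.129)."""
    gamma2 = 1 / (1 - v * v)
    linear = 2 * e**2 * force**2 / (3 * m**2)
    return gamma2 * linear, linear

if __name__ == "__main__":
    print(circular_power_numeric(0.8), circular_power_closed(0.8))
    print(power_at_fixed_force(0.8, force=1.0))
```

At fixed force the ratio of the two returned powers is $\gamma^2$, in line with the conclusion of this subsection.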
7.7 Frequency distribution of radiated energy

In this section, we shall discuss the spectrum of frequencies of the electromagnetic radiation emitted by an accelerating charge. The basic technique for doing this will be to perform a Fourier transform of the time dependence of the radiated power.

In general, we have

$$\frac{dP(t)}{d\Omega} = [R^2\,\vec n\cdot\vec S]_{\rm ret} = \frac{1}{4\pi}\,\big|[R\vec E]_{\rm ret}\big|^2\,. \tag{7.130}$$

Let

$$\vec G(t) = \frac{1}{\sqrt{4\pi}}\,[R\vec E]_{\rm ret}\,, \tag{7.131}$$

so that we shall have

$$\frac{dP(t)}{d\Omega} = |\vec G(t)|^2\,. \tag{7.132}$$

Note that here $dP(t)/d\Omega$ is expressed in the observer's time $t$, and not the retarded time $t'$. This is because our goal here will be to determine the frequency spectrum of the electromagnetic radiation as measured by the observer.

Suppose that the acceleration of the charge occurs only for a finite period of time, so that the total energy emitted is finite. We shall assume that the observation point is far enough away from the charge that the spatial region spanned by the charge while it is accelerating subtends only a small angle as seen by the observer.

The total energy radiated per unit solid angle is given by

$$\frac{dW}{d\Omega} = \int_{-\infty}^{\infty}\frac{dP}{d\Omega}\,dt = \int_{-\infty}^{\infty}|\vec G(t)|^2\,dt\,. \tag{7.133}$$

We now define the Fourier transform $\vec g(\omega)$ of $\vec G(t)$:

$$\vec g(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\vec G(t)\,e^{i\omega t}\,dt\,. \tag{7.134}$$

In the usual way, the inverse transform is then

$$\vec G(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\vec g(\omega)\,e^{-i\omega t}\,d\omega\,. \tag{7.135}$$

It follows that

$$\frac{dW}{d\Omega} = \int_{-\infty}^{\infty}|\vec G(t)|^2\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}dt\int_{-\infty}^{\infty}d\omega\int_{-\infty}^{\infty}d\omega'\ \vec g^{\,*}(\omega')\cdot\vec g(\omega)\,e^{i(\omega'-\omega)t}\,. \tag{7.136}$$

The $t$ integration can be performed, using

$$\int_{-\infty}^{\infty}dt\,e^{i(\omega'-\omega)t} = 2\pi\,\delta(\omega'-\omega)\,, \tag{7.137}$$
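and so

$$\frac{dW}{d\Omega} = \int_{-\infty}^{\infty}d\omega\int_{-\infty}^{\infty}d\omega'\ \vec g^{\,*}(\omega')\cdot\vec g(\omega)\,\delta(\omega'-\omega) = \int_{-\infty}^{\infty}d\omega\ \vec g^{\,*}(\omega)\cdot\vec g(\omega)\,, \tag{7.138}$$

i.e.

$$\frac{dW}{d\Omega} = \int_{-\infty}^{\infty}d\omega\,|\vec g(\omega)|^2\,. \tag{7.139}$$

(The result that (7.133) can be expressed as (7.139) is known as Parseval's Theorem in Fourier transform theory.) We can re-express (7.139) as

$$\frac{dW}{d\Omega} = \int_0^\infty d\omega\,\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega}\,, \tag{7.140}$$

where

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = |\vec g(\omega)|^2 + |\vec g(-\omega)|^2 \tag{7.141}$$

is the energy emitted per unit solid angle per unit frequency interval. If $\vec G(t) = [R\vec E]_{\rm ret}/\sqrt{4\pi}$ is real, then

$$\vec g(-\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}dt\,\vec G(t)\,e^{-i\omega t} = \vec g^{\,*}(\omega)\,, \tag{7.142}$$

and then

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = 2\,|\vec g(\omega)|^2\,. \tag{7.143}$$

Using the expression for $\vec E$ in (7.101), the Fourier transform $\vec g(\omega)$, given by (7.134) with (7.131), is

$$\vec g(\omega) = \frac{e}{2\sqrt2\,\pi}\int_{-\infty}^{\infty}e^{i\omega t}\left[\frac{\vec n\times[(\vec n-\vec v)\times\vec a]}{(1-\vec n\cdot\vec v)^3}\right]_{\rm ret}dt\,, \tag{7.144}$$

where as usual, the subscript "ret" is a reminder that the quantity is evaluated at the retarded time $t'$. Since

$$dt = \frac{dt}{dt'}\,dt' = (1-\vec n\cdot\vec v)\,dt'\,, \tag{7.145}$$

we therefore have

$$\vec g(\omega) = \frac{e}{2\sqrt2\,\pi}\int_{-\infty}^{\infty}e^{i\omega(t'+R(t'))}\,\frac{\vec n\times[(\vec n-\vec v)\times\vec a]}{(1-\vec n\cdot\vec v)^2}\,dt'\,. \tag{7.146}$$

(We have now dropped the "ret" reminder, since everything inside the integrand now depends on the retarded time $t'$.)

We are assuming that the observation point is far away from the accelerating charge, and that the period over which the acceleration occurs is short enough that the vector $\vec n = \vec R(t')/R(t')$ is approximately constant during this time interval. It is convenient to choose the origin to be near to the particle during its period of acceleration. With the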
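observer being far away, at position vector $\vec r$, it follows from $\vec R(t') = \vec r - \vec r_0(t')$ that to a good approximation we have

$$R^2(t') \approx r^2 - 2\,\vec r\cdot\vec r_0(t')\,, \tag{7.147}$$

and so

$$R(t') \approx r\left(1 - \frac{2\,\vec r\cdot\vec r_0(t')}{r^2}\right)^{1/2} \approx r - \frac{\vec r\cdot\vec r_0(t')}{r}\,. \tag{7.148}$$

Furthermore, we can also approximate $\vec n \equiv \vec R(t')/R(t')$ by $\vec r/r$, and so

$$R(t') \approx r - \vec n\cdot\vec r_0(t')\,. \tag{7.149}$$

Substituting this into (7.146), there will be a phase factor $e^{i\omega r}$ that can be taken outside the integral, since it is independent of $t'$. This overall phase factor is unimportant (it will cancel out when we calculate $|\vec g(\omega)|^2$), and so we may drop it and write

$$\vec g(\omega) = \frac{e}{2\sqrt2\,\pi}\int_{-\infty}^{\infty}e^{i\omega(t'-\vec n\cdot\vec r_0(t'))}\,\frac{\vec n\times[(\vec n-\vec v)\times\vec a]}{(1-\vec n\cdot\vec v)^2}\,dt'\,. \tag{7.150}$$

From (7.143) we therefore have

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2}{4\pi^2}\left|\int_{-\infty}^{\infty}e^{i\omega(t'-\vec n\cdot\vec r_0(t'))}\,\frac{\vec n\times[(\vec n-\vec v)\times\vec a]}{(1-\vec n\cdot\vec v)^2}\,dt'\right|^2 \tag{7.151}$$

as the energy per unit solid angle per unit frequency interval.

The integral can be neatened up by observing that we can write

$$\frac{\vec n\times[(\vec n-\vec v)\times\vec a]}{(1-\vec n\cdot\vec v)^2} = \frac{d}{dt'}\left[\frac{\vec n\times(\vec n\times\vec v)}{1-\vec n\cdot\vec v}\right] \tag{7.152}$$

under the assumption that $\vec n$ is a constant. This can be seen by distributing the derivative, to obtain

$$\begin{aligned}
\frac{d}{dt'}\left[\frac{\vec n\times(\vec n\times\vec v)}{1-\vec n\cdot\vec v}\right] &= \frac{\vec n\times(\vec n\times\vec a)}{1-\vec n\cdot\vec v} + \frac{\vec n\times(\vec n\times\vec v)\,(\vec n\cdot\vec a)}{(1-\vec n\cdot\vec v)^2}\\
&= \frac{(1-\vec n\cdot\vec v)\big(\vec n\,(\vec n\cdot\vec a)-\vec a\big) + \big(\vec n\,(\vec n\cdot\vec v)-\vec v\big)(\vec n\cdot\vec a)}{(1-\vec n\cdot\vec v)^2}\\
&= \frac{(\vec n\cdot\vec a)(\vec n-\vec v) - (1-\vec n\cdot\vec v)\,\vec a}{(1-\vec n\cdot\vec v)^2}\\
&= \frac{\vec n\times[(\vec n-\vec v)\times\vec a]}{(1-\vec n\cdot\vec v)^2}\,.
\end{aligned} \tag{7.153}$$

This allows us to integrate (7.151) by parts, to give

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2}{4\pi^2}\left|-\int_{-\infty}^{\infty}\frac{\vec n\times(\vec n\times\vec v)}{1-\vec n\cdot\vec v}\,\frac{d}{dt'}\,e^{i\omega(t'-\vec n\cdot\vec r_0(t'))}\,dt'\right|^2, \tag{7.154}$$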
and hence

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2}{4\pi^2}\left|\int_{-\infty}^{\infty}\vec n\times(\vec n\times\vec v)\,e^{i\omega(t'-\vec n\cdot\vec r_0(t'))}\,dt'\right|^2. \tag{7.155}$$

It should be remarked here that the effect of having integrated by parts is that the acceleration $\vec a$ no longer appears in the expression (7.155). Prior to the integration by parts, the fact that we were taking the acceleration to be non-zero for only a finite time interval ensured that the integration over all $t'$ from $-\infty$ to $\infty$ would be cut down to an integration over only the finite time interval during which $\vec a$ was non-zero. After the integration by parts, the integrand in (7.155) no longer vanishes outside the time interval of the non-zero acceleration, and so one might worry about issues of convergence, and the validity of having dropped the boundary terms at $t' = \pm\infty$ coming from the integration by parts. In fact, it can be verified that all is well, and any problem with convergence can be handled by introducing a convergence factor $e^{-\epsilon|t'|}$, and then sending $\epsilon$ to zero.

We shall make use of the result (7.155) in two applications. In the first, we shall calculate the frequency spectrum for a relativistic particle in instantaneous circular motion.

7.8 Frequency spectrum for relativistic circular motion

Consider a particle which, at some instant, is following a circular arc of radius $\rho$. We shall choose axes so that the arc lies in the $(x,y)$ plane, and choose the origin so that at $t = 0$ the particle is located at the origin, $x = y = 0$. Without loss of generality, we may choose the unit vector $\vec n$ (which points in the direction of the observation point) to lie in the $(x,z)$ plane. We shall, for notational convenience, drop the prime from the time $t'$, so from now on $t$ will denote the retarded time.

The position vector of the particle at time $t$ will be given by

$$\vec r_0 = \Big(\rho\sin\frac{vt}{\rho},\ \rho\cos\frac{vt}{\rho}-\rho,\ 0\Big)\,, \tag{7.156}$$

where $v = |\vec v|$ is its speed. Since $\vec v = d\vec r_0(t)/dt$, we shall have

$$\vec v = \Big(v\cos\frac{vt}{\rho},\ -v\sin\frac{vt}{\rho},\ 0\Big)\,. \tag{7.157}$$

We may parameterise the unit vector $\vec n$, which we are taking to lie in the $(x,z)$ plane, in terms of the angle $\theta$ between $\vec n$ and the $x$ axis:

$$\vec n = (\cos\theta,\ 0,\ \sin\theta)\,. \tag{7.158}$$

We then have

$$\vec n\times(\vec n\times\vec v) = (\vec n\cdot\vec v)\,\vec n - \vec v = \Big(-v\sin^2\theta\cos\frac{vt}{\rho},\ v\sin\frac{vt}{\rho},\ v\sin\theta\cos\theta\cos\frac{vt}{\rho}\Big)\,. \tag{7.159}$$
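We shall write this as

$$\vec n\times(\vec n\times\vec v) = v\sin\frac{vt}{\rho}\ \vec e_\parallel + v\sin\theta\cos\frac{vt}{\rho}\ \vec e_\perp\,, \tag{7.160}$$

where

$$\vec e_\parallel = (0,1,0) \qquad\hbox{and}\qquad \vec e_\perp = \vec n\times\vec e_\parallel = (-\sin\theta,\ 0,\ \cos\theta)\,. \tag{7.161}$$

(The overall sign of the $\vec e_\parallel$ term is fixed by the components in (7.159); it plays no role in what follows, since only $|g_\parallel|^2$ enters the final result.)

We shall consider a particle whose velocity is highly-relativistic. It will be recalled from our earlier discussions that for such a particle, the electromagnetic radiation will be more or less completely concentrated in the range of angles $\theta$ very close to 0. Thus, to a good approximation we shall have $\vec e_\perp \approx (0,0,1)$, which is the unit normal to the plane of the circular motion. In what follows, we shall make approximations that are valid for small $\theta$, and also for small $t$. We shall also assume that $v$ is very close to 1 (the speed of light).

From (7.156) and (7.158), we find

$$\begin{aligned}
t - \vec n\cdot\vec r_0(t) = t - \rho\cos\theta\,\sin\frac{vt}{\rho} &\approx t - \rho\,(1-\tfrac12\theta^2)\left[\frac{vt}{\rho} - \frac16\Big(\frac{vt}{\rho}\Big)^3\right]\\
&\approx (1-v)\,t + \tfrac12\theta^2vt + \frac{v^3t^3}{6\rho^2}\\
&\approx \tfrac12(1+v)(1-v)\,t + \tfrac12\theta^2t + \frac{t^3}{6\rho^2}\\
&= \frac{t}{2\gamma^2} + \tfrac12\theta^2t + \frac{t^3}{6\rho^2}\,.
\end{aligned} \tag{7.162}$$

From (7.160), we find

$$\vec n\times(\vec n\times\vec v) \approx \frac{t}{\rho}\,\vec e_\parallel + \theta\,\vec e_\perp\,. \tag{7.163}$$

We therefore find from (7.155) that

$$\frac{d^2I}{d\omega\,d\Omega} \approx \frac{e^2\omega^2}{4\pi^2}\,\big|g_\parallel(\omega)\,\vec e_\parallel + g_\perp(\omega)\,\vec e_\perp\big|^2 = \frac{e^2\omega^2}{4\pi^2}\,\Big(|g_\parallel(\omega)|^2 + |g_\perp(\omega)|^2\Big)\,, \tag{7.164}$$

where

$$g_\parallel(\omega) = \frac1\rho\int_{-\infty}^{\infty}t\,e^{i\omega[(\gamma^{-2}+\theta^2)t+\frac13t^3\rho^{-2}]/2}\,dt\,,\qquad g_\perp(\omega) = \theta\int_{-\infty}^{\infty}e^{i\omega[(\gamma^{-2}+\theta^2)t+\frac13t^3\rho^{-2}]/2}\,dt\,. \tag{7.165}$$

Letting

$$u = \frac t\rho\,(\gamma^{-2}+\theta^2)^{-1/2}\,,\qquad \xi = \tfrac13\,\omega\rho\,(\gamma^{-2}+\theta^2)^{3/2} \tag{7.166}$$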
leads to

$$g_\parallel(\omega) = \rho\,(\gamma^{-2}+\theta^2)\int_{-\infty}^{\infty}u\,e^{3i\xi(u+u^3/3)/2}\,du\,,\qquad g_\perp(\omega) = \rho\,\theta\,(\gamma^{-2}+\theta^2)^{1/2}\int_{-\infty}^{\infty}e^{3i\xi(u+u^3/3)/2}\,du\,. \tag{7.167}$$

These integrals are related to Airy integrals, or modified Bessel functions:

$$\int_0^\infty u\sin[3\xi(u+u^3/3)/2]\,du = \frac{1}{\sqrt3}\,K_{2/3}(\xi)\,,\qquad \int_0^\infty\cos[3\xi(u+u^3/3)/2]\,du = \frac{1}{\sqrt3}\,K_{1/3}(\xi)\,, \tag{7.168}$$

and so we have

$$\frac{d^2I}{d\omega\,d\Omega} \approx \frac{e^2\omega^2\rho^2}{3\pi^2}\,(\gamma^{-2}+\theta^2)^2\left[(K_{2/3}(\xi))^2 + \frac{\theta^2}{\gamma^{-2}+\theta^2}\,(K_{1/3}(\xi))^2\right]. \tag{7.169}$$

The asymptotic forms of the modified Bessel functions $K_\nu(x)$, for small $x$ and large $x$, are

$$K_\nu(x) \longrightarrow \tfrac12\,\Gamma(\nu)\Big(\frac2x\Big)^\nu\,,\quad x\longrightarrow0\,;\qquad K_\nu(x) \longrightarrow \sqrt{\frac{\pi}{2x}}\ e^{-x}\,,\quad x\longrightarrow\infty\,. \tag{7.170}$$

It therefore follows from (7.169) that $d^2I/(d\omega\,d\Omega)$ falls off rapidly when $\xi$ becomes large. Bearing in mind that $\gamma^{-2}$ is small (since the velocity of the particle is very near to the speed of light), and that $\theta$ has been assumed to be small, we see from (7.166) that there is a regime where $\xi$ can be large, whilst still fulfilling our assumptions, if $\omega\rho$ is large enough. The value of $\xi$ can then become very large if $\theta$ increases sufficiently (whilst still being small compared to 1), and so the radiation is indeed concentrated around very small angles $\theta$.

If $\omega$ becomes sufficiently large that $\omega\rho\gamma^{-3}$ is much greater than 1 then $\xi$ will be very large even if $\theta = 0$. Thus, there is an effective high-frequency cut-off for all angles. It is convenient to define a "cut-off" frequency $\omega_c$ for which $\xi = 1$ at $\theta = 0$:

$$\omega_c = \frac{3\gamma^3}{\rho} = \frac3\rho\Big(\frac Em\Big)^3\,. \tag{7.171}$$

If the particle is following a uniform periodic circular orbit, with angular frequency $\omega_0 = v/\rho \approx 1/\rho$, then we shall have

$$\omega_c = 3\,\omega_0\Big(\frac Em\Big)^3\,. \tag{7.172}$$

The radiation in this case of a charged particle in a highly relativistic circular orbit is known as "Synchrotron Radiation."
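To get a feel for the scales involved, the short sketch below evaluates the parameter $\xi$ of (7.166) and the cut-off frequency $\omega_c$ of (7.171) in natural units ($c = 1$). The function names and the sample numbers are illustrative assumptions, not values taken from the discussion above.

```python
def xi(omega, rho, gamma, theta):
    """The parameter xi = (1/3) * omega * rho * (gamma^-2 + theta^2)^(3/2) of (7.166)."""
    return omega * rho * (gamma**-2 + theta**2) ** 1.5 / 3.0

def cutoff_frequency(rho, gamma):
    """The cut-off frequency omega_c = 3 * gamma^3 / rho of (7.171)."""
    return 3.0 * gamma**3 / rho

if __name__ == "__main__":
    rho, gamma = 2.0, 50.0          # hypothetical orbit radius and Lorentz factor
    omega_c = cutoff_frequency(rho, gamma)
    # By construction xi = 1 at theta = 0 when omega = omega_c.
    print(omega_c, xi(omega_c, rho, gamma, 0.0))
    # Moving off the orbital plane by a few times 1/gamma already makes xi large.
    for theta in (0.0, 1 / gamma, 3 / gamma):
        print(theta, xi(omega_c, rho, gamma, theta))
```

Because $\xi$ grows like $(\gamma^{-2}+\theta^2)^{3/2}$, the exponential fall-off in (7.170) sets in at angles of only a few times $1/\gamma$, which is the quantitative content of the statement that the radiation is concentrated near the orbital plane.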
Consider the frequency spectrum of the radiation in the orbital plane, $\theta = 0$. In the two regimes $\omega \ll \omega_c$ and $\omega \gg \omega_c$ we shall therefore have

$$\begin{aligned}
\omega \ll \omega_c:\qquad \frac{d^2I}{d\omega\,d\Omega}\bigg|_{\theta=0} &\approx e^2\left[\frac{\Gamma(2/3)}{\pi}\right]^2\Big(\frac34\Big)^{1/3}(\omega\rho)^{2/3}\,,\\
\omega \gg \omega_c:\qquad \frac{d^2I}{d\omega\,d\Omega}\bigg|_{\theta=0} &\approx \frac{3e^2\gamma^2}{2\pi}\,\frac{\omega}{\omega_c}\,e^{-2\omega/\omega_c}\,.
\end{aligned} \tag{7.173}$$

This shows that the power per unit solid angle per unit frequency increases from 0 like $\omega^{2/3}$ for small $\omega$, reaches a peak around $\omega = \omega_c$, and then falls off exponentially rapidly once $\omega$ is significantly greater than $\omega_c$.

It is clear that one could continue with the investigation of the properties of the synchrotron radiation in considerably more depth. For example, one could consider the detailed angular distribution of the radiation as a function of $\theta$, and one could consider the total power per unit frequency interval, obtained by integrating over all solid angles:

$$\frac{dI}{d\omega} = \int\frac{d^2I}{d\omega\,d\Omega}\,d\Omega\,. \tag{7.174}$$

A discussion of further details along these lines can be found in almost any of the advanced electrodynamics textbooks.

7.9 Frequency spectrum for periodic motion

Suppose that the motion of the charged particle is exactly periodic, with period $T = 2\pi/\omega_0$, where $\omega_0$ is the angular frequency of the particle's motion. This means that $\vec n\cdot\vec r_0(t)$ will be periodic with period $T$, and so the factor $e^{-i\omega\,\vec n\cdot\vec r_0(t)}$ in (7.155) will have time dependence of the general form

$$H(t) = \sum_{n\ge1}b_n\,e^{-in\omega_0t}\,. \tag{7.175}$$

(We are again using $t$ to denote the retarded time here, to avoid a profusion of primes.) The Fourier transform $h(\omega)$ of the function $H(t)$ is zero except when $\omega$ is an integer multiple of $\omega_0$, and for these values it is proportional to a delta function:

$$h(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{i\omega t}H(t)\,dt = \frac{1}{\sqrt{2\pi}}\sum_{n\ge1}b_n\int_{-\infty}^{\infty}e^{i(\omega-n\omega_0)t}\,dt = \sqrt{2\pi}\,\sum_{n\ge1}b_n\,\delta(\omega-n\omega_0)\,. \tag{7.176}$$

In fact, in this situation with a discrete frequency spectrum, it is more appropriate to work with Fourier series, rather than Fourier transforms.
Going back to section 7.7, we therefore now expand $\vec G(t)$ in the Fourier series

$$\vec G(t) = \sum_{n\ge1}\vec a_n\,e^{-in\omega_0t}\,. \tag{7.177}$$

Multiplying by $e^{im\omega_0t}$ and integrating over the period $T = 2\pi/\omega_0$ gives

$$\frac1T\int_0^Te^{im\omega_0t}\,\vec G(t)\,dt = \frac1T\sum_{n\ge1}\vec a_n\int_0^Te^{i(m-n)\omega_0t}\,dt = \vec a_m\,, \tag{7.178}$$

since the integral of $e^{i(m-n)\omega_0t}$ vanishes unless $n = m$:

$$\frac1T\int_0^Te^{i(m-n)\omega_0t}\,dt = \delta_{m,n}\,. \tag{7.179}$$

Thus the coefficients $\vec a_n$ in the Fourier series (7.177) are given by

$$\vec a_n = \frac1T\int_0^Te^{in\omega_0t}\,\vec G(t)\,dt\,. \tag{7.180}$$

The analogue of Parseval's theorem for the case of the discrete Fourier series is now given by considering

$$\frac1T\int_0^T|\vec G(t)|^2\,dt = \frac1T\int_0^T\sum_{m,n}\vec a_n\cdot\vec a_m^{\,*}\,e^{i(m-n)\omega_0t}\,dt = \sum_n|\vec a_n|^2\,. \tag{7.181}$$

The time average of the power per unit solid angle is therefore given by

$$\Big\langle\frac{dP}{d\Omega}\Big\rangle = \frac1T\int_0^T\frac{dP}{d\Omega}\,dt = \frac1T\int_0^T|\vec G(t)|^2\,dt = \sum_n|\vec a_n|^2\,. \tag{7.182}$$

The term $|\vec a_n|^2$ in this summation therefore has the interpretation of being the time-averaged power per unit solid angle in the $n$'th mode, which we shall denote by $dP_n/d\Omega$:

$$\frac{dP_n}{d\Omega} = |\vec a_n|^2\,. \tag{7.183}$$

It is now a straightforward matter, using (7.180), to obtain an expression for $|\vec a_n|^2$ in terms of the integral of the retarded electric field. The steps follow exactly in parallel with those we described in section 7.7, except that the integral $\int_{-\infty}^{\infty}dt$ is now replaced by $T^{-1}\int_0^Tdt$. The upshot is that the expression (7.155) for $d^2I/(d\omega\,d\Omega)$ is replaced by [15]

$$\frac{dP_n}{d\Omega} = \frac{e^2n^2\omega_0^4}{4\pi^2}\left|\int_0^T\vec n\times(\vec n\times\vec v)\,e^{in\omega_0(t-\vec n\cdot\vec r_0(t))}\,dt\right|^2, \tag{7.184}$$

where $T = 2\pi/\omega_0$. This gives the expression for the time-averaged power per unit solid angle in the $n$'th Fourier mode.

[15] The integer $n$ labelling the modes is not to be confused with the unit vector $\vec n$, of course!
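The Fourier-series statements (7.180) and (7.181) are easy to check numerically for a simple test signal. The sketch below builds a scalar function $G(t)$ from a handful of hypothetical coefficients, recovers one of them by sampling the integral in (7.180), and compares the time average of $|G|^2$ with $\sum_n |a_n|^2$. Treating $G$ as a scalar rather than a vector, and the particular coefficients and sample count, are simplifying assumptions made for illustration.

```python
import cmath
import math

def make_series(coeffs, omega0):
    """Return G(t) = sum_n a_n exp(-i n omega0 t), as in (7.177), for a dict {n: a_n}."""
    def G(t):
        return sum(a * cmath.exp(-1j * n * omega0 * t) for n, a in coeffs.items())
    return G

def fourier_coefficient(G, n, omega0, samples=4096):
    """Approximate a_n = (1/T) int_0^T exp(i n omega0 t) G(t) dt, eq. (7.180), by a Riemann sum."""
    T = 2 * math.pi / omega0
    dt = T / samples
    total = sum(cmath.exp(1j * n * omega0 * k * dt) * G(k * dt) for k in range(samples))
    return total * dt / T

def mean_square(G, omega0, samples=4096):
    """Approximate (1/T) int_0^T |G(t)|^2 dt, the left-hand side of (7.181)."""
    T = 2 * math.pi / omega0
    dt = T / samples
    return sum(abs(G(k * dt)) ** 2 for k in range(samples)) / samples

if __name__ == "__main__":
    omega0 = 2.0
    coeffs = {1: 0.5, 2: 0.25j, 3: -0.1}   # hypothetical mode amplitudes
    G = make_series(coeffs, omega0)
    print(fourier_coefficient(G, 2, omega0))
    print(mean_square(G, omega0), sum(abs(a) ** 2 for a in coeffs.values()))
```

For a finite sum of harmonics the equally spaced Riemann sum over one full period is essentially exact, so the two numbers on the last line should agree up to rounding error.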
Since we are assuming the observer (at $\vec r$) is far away from the particle, and since the integral in (7.184) is taken over the finite time interval $T = 2\pi/\omega_0$, it follows that to a good approximation we can freely take the unit vector $\vec n$ outside the integral. Thus we may make the replacement

$$\int_0^T\vec n\times(\vec n\times\vec v)\,e^{in\omega_0(t-\vec n\cdot\vec r_0(t))}\,dt \longrightarrow \vec n\times\Big(\vec n\times\int_0^T\vec v\,e^{in\omega_0(t-\vec n\cdot\vec r_0(t))}\,dt\Big)\,. \tag{7.185}$$

Now, for any vector $\vec V$, we have that

$$|\vec n\times(\vec n\times\vec V)|^2 = |(\vec n\cdot\vec V)\,\vec n - \vec V|^2 = V^2 - (\vec n\cdot\vec V)^2\,, \tag{7.186}$$

and on the other hand we also have

$$|\vec n\times\vec V|^2 = (\vec n\times\vec V)\cdot(\vec n\times\vec V) = \vec n\cdot[\vec V\times(\vec n\times\vec V)] = \vec n\cdot[V^2\,\vec n - (\vec n\cdot\vec V)\,\vec V] = V^2 - (\vec n\cdot\vec V)^2\,. \tag{7.187}$$

Thus $|\vec n\times(\vec n\times\vec V)|^2 = |\vec n\times\vec V|^2$, and so we can re-express (7.184) as

$$\frac{dP_n}{d\Omega} = \frac{e^2n^2\omega_0^4}{4\pi^2}\left|\int_0^T\vec v\times\vec n\ e^{in\omega_0(t-\vec n\cdot\vec r_0(t))}\,dt\right|^2, \tag{7.188}$$

where $T = 2\pi/\omega_0$ and $\omega_0$ is the angular frequency of the periodic motion. Recall that throughout this section, we are using $t$ to denote the retarded time, in order to avoid writing the primes on $t'$ in all the formulae.

7.10 Cerenkov radiation

So far, all the situations we have considered have involved electromagnetic fields in a vacuum, i.e. in the absence of any dielectric or magnetically permeable media. In this section, we shall take a brief foray into a situation where there is a dielectric medium.

It will be recalled that if a medium has permittivity $\epsilon$ and permeability $\mu$, then electromagnetic waves in the medium will propagate with speed $\tilde c = 1/\sqrt{\epsilon\mu}$. This means in general that the "speed of light" in the medium will be less than the speed of light in vacuum. A consequence of this is that a charged particle, such as an electron, can travel faster than the local speed of light inside the medium. This leads to an interesting effect, known as Cerenkov Radiation.

In practice, the types of media of interest are those that are optically transparent, such as glass or water, and these have magnetic permeability $\mu$ very nearly equal to 1, while the electric permittivity $\epsilon$ can be quite significantly greater than 1. Thus for the purposes of our discussion, we shall assume that $\mu = 1$ and that the local speed of light is reduced because $\epsilon$ is significantly greater than 1.
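The threshold condition described here, that the particle must move faster than the local speed of light $\tilde c = 1/\sqrt{\epsilon}$ (with $\mu = 1$), can be made concrete with a small sketch. It also evaluates the emission angle using the relation $\cos\theta_c = \tilde c/v$ that is obtained in (7.199) later in this section. The function names are hypothetical, units are such that the vacuum speed of light is 1, and the value $\epsilon \approx 1.77$ (refractive index of about 1.33, roughly that of water at optical frequencies) is an assumed illustrative number.

```python
import math

def threshold_speed(epsilon):
    """Local speed of light in the medium, 1/sqrt(epsilon), for mu = 1 and vacuum c = 1."""
    return 1.0 / math.sqrt(epsilon)

def cerenkov_angle(v, epsilon):
    """Return the Cerenkov angle in radians from cos(theta_c) = c_tilde / v,
    or None if the particle is not faster than light in the medium."""
    c_tilde = threshold_speed(epsilon)
    if v <= c_tilde:
        return None  # below threshold: no Cerenkov radiation
    return math.acos(c_tilde / v)

if __name__ == "__main__":
    epsilon = 1.77  # assumed permittivity, roughly water at optical frequencies
    print("threshold speed:", threshold_speed(epsilon))
    for v in (0.5, 0.8, 0.99):
        print(v, cerenkov_angle(v, epsilon))
```

Returning `None` below threshold is just one way of signalling that no real angle exists; raising an exception would be an equally reasonable design.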
We shall make use of the result (7.155) for the radiated power spectrum, in order to study the Cerenkov radiation. First, we shall need to introduce the dielectric constant into the formula. This can be done by a simple scaling argument. We shall also, just for the purposes of this section, restore the explicit symbol $c$ for the speed of light. This can be done by sending

$$t \longrightarrow c\,t\,,\qquad \omega \longrightarrow \frac\omega c\,. \tag{7.189}$$

(Of course any other quantity that involves time will also need to be rescaled appropriately too. This is just dimensional analysis.)

Referring back to the discussion in section 2.1, it can be seen that the dielectric constant can be introduced into the vacuum Maxwell equations by making the scalings

$$\rho \longrightarrow \frac{\rho}{\sqrt\epsilon}\,,\qquad \vec E \longrightarrow \sqrt\epsilon\,\vec E\,,\qquad \vec B \longrightarrow \vec B\,,\qquad c \longrightarrow \frac{c}{\sqrt\epsilon}\,. \tag{7.190}$$

Of course the scaling of the charge density $\rho$ implies that we must also rescale the charge $e$ of the particle, according to

$$e \longrightarrow \frac{e}{\sqrt\epsilon}\,. \tag{7.191}$$

Note that $c$ continues to mean the speed of light in vacuum. The speed of light inside the dielectric medium is given by

$$\tilde c = \frac{c}{\sqrt\epsilon}\,. \tag{7.192}$$

The expression (7.155) for the radiated power per unit solid angle per unit frequency interval now becomes

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon}{4\pi^2c^3}\left|\int_{-\infty}^{\infty}\vec n\times(\vec n\times\vec v)\,e^{i\omega(t'-\sqrt\epsilon\,\vec n\cdot\vec r_0(t')/c)}\,dt'\right|^2. \tag{7.193}$$

For a charge moving at constant velocity $\vec v$, we shall have

$$\vec r_0(t') = \vec v\,t'\,, \tag{7.194}$$

and so (7.193) gives

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon}{4\pi^2c^3}\,|\vec n\times\vec v|^2\left|\int_{-\infty}^{\infty}e^{i\omega t'(1-\sqrt\epsilon\,\vec n\cdot\vec v/c)}\,dt'\right|^2, \tag{7.195}$$

since $|\vec n\times(\vec n\times\vec v)|^2 = |(\vec n\cdot\vec v)\,\vec n-\vec v|^2 = v^2-(\vec n\cdot\vec v)^2 = |\vec n\times\vec v|^2$.

The integration over $t'$ produces a delta-function. [16] Defining $\theta$ to be the angle between $\vec n$ and $\vec v$, so that $\vec n\cdot\vec v = v\cos\theta$, we therefore have

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon}{c^3}\,v^2\sin^2\theta\ \Big|\delta\big(\omega(1-\sqrt\epsilon\,(v/c)\cos\theta)\big)\Big|^2\,, \tag{7.196}$$

[16] The occurrence of the delta-function is because of the unphysical assumption that the particle has been moving in the medium forever. Below, we shall obtain a more realistic expression by supposing that the particle travels through a slab of medium of finite thickness.
and so (using $\delta(ax) = \delta(x)/a$)

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\sqrt\epsilon}{c^3}\,v^2\sin^2\theta\ \Big|\delta\big(1-\sqrt\epsilon\,(v/c)\cos\theta\big)\Big|^2\,. \tag{7.197}$$

This expression shows that all the radiation is emitted at a single angle $\theta_c$, known as the Cerenkov Angle, given by

$$\cos\theta_c = \frac{c}{v\sqrt\epsilon}\,. \tag{7.198}$$

Note that in terms of $\tilde c$, the speed of light in the medium, as given in (7.192), we have

$$\cos\theta_c = \frac{\tilde c}{v}\,. \tag{7.199}$$

This makes clear that the phenomenon of Cerenkov radiation occurs only if $v > \tilde c$, i.e. if the charged particle is moving through the medium at a velocity that is greater than the local velocity of light in the medium.

In fact one can understand the Cerenkov radiation as a kind of "shock wave," very like the acoustic shock wave that occurs when an aircraft is travelling faster than the speed of sound. The Cerenkov angle $\theta_c$ is given by a very simple geometric construction, shown in Figure 3 below. The circles show the light-fronts of light emitted by the particle. Since the particle is travelling faster than the speed of light in the medium, it "outruns" the circles, leaving a trail of light-fronts tangent to the angled line in the figure. This is the light-front of the Cerenkov radiation.

As mentioned above, the squared delta-function in (7.197) is the result of making the unrealistic assumption that the particle has been ploughing through the medium for ever, at a speed greater than the local speed of light. A more realistic situation would be to consider a charged particle entering a thin slab of dielectric medium, such that it enters at time $t' = -T$ and exits at $t' = +T$. The expression (7.195) is then replaced by

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon}{4\pi^2c^3}\,|\vec n\times\vec v|^2\left|\int_{-T}^{T}e^{i\omega t'(1-\sqrt\epsilon\,\vec n\cdot\vec v/c)}\,dt'\right|^2,$$

which, using

$$\int_{-T}^{T}dt\,e^{ibt} = 2b^{-1}\sin bT\,, \tag{7.200}$$

therefore implies that

$$\frac{d^2I(\omega,\vec n)}{d\omega\,d\Omega} = \frac{e^2\omega^2\sqrt\epsilon\,v^2T^2\sin^2\theta}{\pi^2c^3}\left[\frac{\sin[\omega T(1-\sqrt\epsilon\,(v/c)\cos\theta)]}{\omega T(1-\sqrt\epsilon\,(v/c)\cos\theta)}\right]^2. \tag{7.201}$$

This is sharply peaked around the Cerenkov angle $\theta_c$ given by (7.198). Integrating over all angles we obtain the total energy per unit frequency interval

$$\frac{dI}{d\omega} = \int\frac{d^2I}{d\omega\,d\Omega}\,d\Omega \approx \frac{2e^2\omega^2\sqrt\epsilon\,v^2T^2\sin^2\theta_c}{\pi c^3}\int_0^\pi\left[\frac{\sin[\omega T(1-\sqrt\epsilon\,(v/c)\cos\theta)]}{\omega T(1-\sqrt\epsilon\,(v/c)\cos\theta)}\right]^2\sin\theta\,d\theta\,. \tag{7.202}$$
[Figure 3: The Cerenkov angle $\theta_c$ is given by $\cos\theta_c = (\tilde c\,t)/(v\,t) = \tilde c/v$.]

(The integrand is peaked sharply around $\theta = \theta_c$, so to a good approximation we can take the $\sin^2\theta$ factor outside the integral, calling it $\sin^2\theta_c$.) Letting $x = \cos\theta$, the remaining integral can be written as

$$\int_{-1}^{1}\left[\frac{\sin[\omega T(1-\sqrt\epsilon\,(v/c)\,x)]}{\omega T(1-\sqrt\epsilon\,(v/c)\,x)}\right]^2dx \approx \int_{-\infty}^{\infty}\left[\frac{\sin[\omega T(1-\sqrt\epsilon\,(v/c)\,x)]}{\omega T(1-\sqrt\epsilon\,(v/c)\,x)}\right]^2dx\,. \tag{7.203}$$

(The limits of integration can, to a good approximation, be extended to $\pm\infty$ because the integrand is peaked around $x = \cos\theta_c$.) Letting $\omega T - \omega T\sqrt\epsilon\,x\,v/c = -y$, the integral becomes

$$\frac{c}{\omega T\sqrt\epsilon\,v}\int_{-\infty}^{\infty}\frac{\sin^2y}{y^2}\,dy = \frac{\pi c}{\omega T\sqrt\epsilon\,v}\,, \tag{7.204}$$

and so expression (7.202) for the total energy per unit frequency interval becomes

$$\frac{dI}{d\omega} \approx \frac{2e^2\,v\,\omega T\sin^2\theta_c}{c^2}\,. \tag{7.205}$$

The distance through the slab is given by $2vT$, and so dividing by this, we obtain an expression for the total energy of Cerenkov radiation per unit frequency interval per unit
path length:

$$\frac{d^2I}{d\omega\,d\ell} = \frac{e^2\omega}{c^2}\,\sin^2\theta_c = \frac{e^2\omega}{c^2}\Big(1-\frac{c^2}{v^2\epsilon}\Big)\,. \tag{7.206}$$

This is known as the Frank-Tamm relation. Note that this expression grows linearly with $\omega$, which means that the bulk of the energy is concentrated in the higher frequencies of electromagnetic radiation. Of course there must be some limit, which arises because the dielectric constant will fall off with increasing frequency, and so the Cerenkov effect will cease to operate at high enough frequencies. [17] In practice, the peak of the frequency spectrum for Cerenkov radiation is in the ultra-violet.

The bluish-green glow visible in pictures of nuclear fuel rods immersed in water is a familiar example of Cerenkov radiation. Apart from looking nice, the Cerenkov effect is also of practical use, for measuring the velocity of charged particles moving at relativistic speeds. One can determine the velocity by allowing the particles to move through a slab of suitably-chosen dielectric material, and measuring the Cerenkov angle.

[17] At sufficiently high frequencies, which implies very small wavelengths, the approximation in which the medium is viewed as a continuum with an effective dielectric constant breaks down, and it looks more and more like empty space with isolated charges present. At such length scales the electron is more or less propagating through a vacuum, and so there is no possibility of its exceeding the local speed of light. Thus the Cerenkov effect tails off at sufficiently high frequencies.

7.11 Thomson scattering

Another application of the Larmor formula is in the phenomenon known as Thomson scattering. Consider a plane electromagnetic wave incident on a particle of charge $e$ and mass $m$. The charge will oscillate back and forth in the electric field of the wave, and so it will therefore emit electromagnetic radiation itself. The net effect is that the electron "scatters" some of the incoming wave.

In most circumstances, we can assume that the induced oscillatory motion of the electron will be non-relativistic. As we saw in (7.73), if $\Theta$ is the angle between the acceleration $\vec a$ and the unit vector $\vec n$ (which lies along the line from the electron to the observation point), then the power radiated per unit solid angle is

$$\frac{dP}{d\Omega} = \frac{e^2a^2}{4\pi}\,\sin^2\Theta\,. \tag{7.207}$$

Let us suppose that the plane electromagnetic wave has electric field given by (the real part of)

$$\vec E = E_0\,\vec\epsilon\ e^{i(\vec k\cdot\vec r-\omega t)}\,, \tag{7.208}$$
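and that the wave-vector $\vec k$ lies along the $z$ axis. The unit polarisation vector $\vec\epsilon$, which must therefore lie in the $(x,y)$ plane, may be parameterised as

$$\vec\epsilon = (\cos\psi,\ \sin\psi,\ 0)\,. \tag{7.209}$$

Using standard spherical polar coordinates, the unit vector $\vec n$ will be given by

$$\vec n = (\sin\theta\cos\varphi,\ \sin\theta\sin\varphi,\ \cos\theta)\,. \tag{7.210}$$

In particular, this means

$$\vec n\cdot\vec\epsilon = \sin\theta\,(\cos\varphi\cos\psi + \sin\varphi\sin\psi) = \sin\theta\cos(\varphi-\psi)\,. \tag{7.211}$$

The acceleration of the electron will be given by

$$m\,\vec a = e\vec E\,, \tag{7.212}$$

so

$$\vec a = \frac em\,E_0\,\vec\epsilon\ e^{i\omega(z-t)}\,. \tag{7.213}$$

Note that this means

$$\vec n\cdot\vec a = \frac em\,E_0\,\vec n\cdot\vec\epsilon\ e^{i\omega(z-t)} = \frac em\,E_0\,e^{i\omega(z-t)}\sin\theta\cos(\varphi-\psi)\,. \tag{7.214}$$

Since $\vec n\cdot\vec a = a\cos\Theta$, it follows that (7.207) becomes

$$\frac{dP}{d\Omega} = \frac{e^2}{4\pi}\,\big(a^2-(\vec n\cdot\vec a)^2\big)\,, \tag{7.215}$$

and so the time average will be given by

$$\Big\langle\frac{dP}{d\Omega}\Big\rangle = \frac{e^4}{8\pi m^2}\,|E_0|^2\,[1-(\vec n\cdot\vec\epsilon)^2]\,.$$

Thus we find

$$\Big\langle\frac{dP}{d\Omega}\Big\rangle = \frac{e^4}{8\pi m^2}\,|E_0|^2\,[1-\sin^2\theta\cos^2(\varphi-\psi)]\,. \tag{7.216}$$

The direction of the polarisation (in the $(x,y)$ plane) of the incoming electromagnetic wave is parameterised by the angle $\psi$. For unpolarised incoming waves, we should average over all angles $\psi$. Thus we obtain

$$\Big\langle\Big\langle\frac{dP}{d\Omega}\Big\rangle\Big\rangle_\psi \equiv \frac{1}{2\pi}\int_0^{2\pi}d\psi\,\Big\langle\frac{dP}{d\Omega}\Big\rangle = \frac{e^4}{8\pi m^2}\,|E_0|^2\,(1-\tfrac12\sin^2\theta) = \frac{e^4}{16\pi m^2}\,|E_0|^2\,(1+\cos^2\theta)\,. \tag{7.217}$$

The scattering cross section $d\sigma/d\Omega$ is then defined by

$$\frac{d\sigma}{d\Omega} = \frac{\hbox{Energy radiated/unit time/unit solid angle}}{\hbox{Incident energy flux/unit area/unit time}}\,. \tag{7.218}$$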
The denominator here will just be $|E_0|^2/(8\pi)$, which is the time average of the Poynting flux for the incoming wave. Thus we arrive at the Thomson Formula for the cross section:

$$\frac{d\sigma}{d\Omega} = \frac{e^4\,(1+\cos^2\theta)}{2m^2}\,. \tag{7.219}$$

The total scattering cross section is obtained by integrating $d\sigma/d\Omega$ over all solid angles, which gives

$$\sigma = \int\frac{d\sigma}{d\Omega}\,d\Omega = 2\pi\int_0^\pi\frac{d\sigma}{d\Omega}\,\sin\theta\,d\theta = \frac{\pi e^4}{m^2}\int_0^\pi(1+\cos^2\theta)\sin\theta\,d\theta = \frac{\pi e^4}{m^2}\int_{-1}^{1}(1+c^2)\,dc\,, \tag{7.220}$$

and so we find

$$\sigma = \frac{8\pi e^4}{3m^2}\,. \tag{7.221}$$

8 Radiating Systems

8.1 Fields due to localised oscillating sources

Consider a localised system of oscillating charge, at a single frequency $\omega$. The charge density and current density can therefore be written as

$$\rho(\vec r,t) = \rho(\vec r)\,e^{-i\omega t}\,,\qquad \vec J(\vec r,t) = \vec J(\vec r)\,e^{-i\omega t}\,. \tag{8.1}$$

From the expressions (7.21) and (7.22) for the retarded potentials, we shall have

$$\phi(\vec r,t) = \int\frac{\rho(\vec r\,',\,t-|\vec r-\vec r\,'|)}{|\vec r-\vec r\,'|}\,d^3\vec r\,' = e^{-i\omega t}\int\frac{\rho(\vec r\,')}{|\vec r-\vec r\,'|}\,e^{ik|\vec r-\vec r\,'|}\,d^3\vec r\,'\,. \tag{8.2}$$

Note that here $k$ is simply equal to $\omega$, and we have switched to the symbol $k$ in the exponential inside the integral because it looks more conventional. In a similar fashion, we shall have

$$\vec A(\vec r,t) = e^{-i\omega t}\int\frac{\vec J(\vec r\,')}{|\vec r-\vec r\,'|}\,e^{ik|\vec r-\vec r\,'|}\,d^3\vec r\,'\,. \tag{8.3}$$

From these expressions for $\phi$ and $\vec A$, we can calculate $\vec E = -\vec\nabla\phi - \partial\vec A/\partial t$ and $\vec B = \vec\nabla\times\vec A$. In fact, because of the simple monochromatic nature of the time dependence, we can calculate $\vec E$ easily, once we know $\vec B$, from the Maxwell equation

$$\vec\nabla\times\vec B - \frac{\partial\vec E}{\partial t} = 4\pi\vec J\,. \tag{8.4}$$
Away from the localised source region we have $\vec J = 0$. From the time dependence we have $\partial\vec E/\partial t = -i\omega\vec E = -ik\vec E$, and so we shall have

$$\vec E = \frac ik\,\vec\nabla\times\vec B\,. \tag{8.5}$$

Let us suppose that the region where the source charges and currents are non-zero is of scale size $d$, and that the origin of the coordinate system is located in the neighbourhood of the source region. The wavelength of the monochromatic waves that they generate will then be given by

$$\lambda = \frac{2\pi}{\omega} = \frac{2\pi}{k}\,. \tag{8.6}$$

We shall assume that

$$d \ll \lambda\,, \tag{8.7}$$

i.e. the scale size of the source region is very small compared with the wavelength of the electromagnetic waves that are produced. It will be convenient to choose the origin of the coordinate system to lie within the neighbourhood of the source region, so that we may therefore assume $|\vec r\,'| \ll \lambda$ for all integration points $\vec r\,'$ in the expressions (8.2) and (8.3).

The discussion of the electromagnetic fields generated by these sources can then be divided, like all Gaul, into three parts:

$$\begin{aligned}
\hbox{Near zone, or Static zone}:&\qquad d \ll r \ll \lambda\,,\\
\hbox{Intermediate zone, or Induction zone}:&\qquad d \ll r \sim \lambda\,,\\
\hbox{Far zone, or Radiation zone}:&\qquad d \ll \lambda \ll r\,.
\end{aligned} \tag{8.8}$$

8.1.1 The static zone

First, let us consider the near zone, where $r \ll \lambda$. Equivalently, we may say that

$$\hbox{Static zone}:\qquad kr \ll 1\,. \tag{8.9}$$

Since we are also assuming $d \ll \lambda$, it follows that in the near zone, we can just approximate $e^{ik|\vec r-\vec r\,'|}$ by 1. Thus we shall have

$$\vec A(\vec r,t) \approx e^{-i\omega t}\int\frac{\vec J(\vec r\,')}{|\vec r-\vec r\,'|}\,d^3\vec r\,' \tag{8.10}$$

in the near zone. Aside from the time-dependent factor $e^{-i\omega t}$, this is just like the expression for the magnetostatic case. We can make a standard expansion, in terms of spherical harmonics:

$$\frac{1}{|\vec r-\vec r\,'|} = \sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}\frac{4\pi}{2\ell+1}\,\frac{r_<^\ell}{r_>^{\ell+1}}\,Y^*_{\ell m}(\theta',\varphi')\,Y_{\ell m}(\theta,\varphi)\,, \tag{8.11}$$
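where $r_>$ is the larger of $r = |\vec r|$ and $r' = |\vec r\,'|$, and $r_<$ is the smaller of $r$ and $r'$, and $(\theta,\varphi)$ and $(\theta',\varphi')$ are the spherical polar angles for $\vec r$ and $\vec r\,'$ respectively. Thus we shall have, since in our case $r' \ll r$,

$$\vec A(\vec r,t) = e^{-i\omega t}\sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}\frac{4\pi}{2\ell+1}\,\frac{1}{r^{\ell+1}}\,Y_{\ell m}(\theta,\varphi)\int\vec J(\vec r\,')\,r'^{\,\ell}\,Y^*_{\ell m}(\theta',\varphi')\,d^3\vec r\,'\,. \tag{8.12}$$

In the near zone, therefore, the electromagnetic fields will just be like static fields, except that they are oscillating in time.

8.1.2 The radiation zone

Next, we shall consider the far zone, or radiation zone. Here, the source is (as always) small compared with the wavelength, but the fields are being observed from a large distance which is much larger than the wavelength. In other words, we shall have $kr \gg 1$, while at the same time $kr' \ll 1$. This means that

$$|\vec r-\vec r\,'|^2 = r^2 - 2\,\vec r\cdot\vec r\,' + r'^{\,2} \approx r^2 - 2\,\vec r\cdot\vec r\,'\,, \tag{8.13}$$

and so

$$|\vec r-\vec r\,'| \approx r\Big(1-\frac{2\,\vec r\cdot\vec r\,'}{r^2}\Big)^{1/2} \approx r - \frac{\vec r\cdot\vec r\,'}{r}\,, \tag{8.14}$$

and hence

$$|\vec r-\vec r\,'| \approx r - \vec n\cdot\vec r\,'\,, \tag{8.15}$$

where $\vec n$ is the unit vector along $\vec r$:

$$\vec n = \frac{\vec r}{r}\,. \tag{8.16}$$

From (8.3) we shall therefore have

$$\vec A(\vec r,t) \approx e^{i(kr-\omega t)}\int\frac{\vec J(\vec r\,')}{r-\vec n\cdot\vec r\,'}\,e^{-ik\,\vec n\cdot\vec r\,'}\,d^3\vec r\,' \tag{8.17}$$

in the far zone, and hence, to a very good approximation,

$$\vec A(\vec r,t) \approx \frac1r\,e^{i(kr-\omega t)}\int\vec J(\vec r\,')\,e^{-ik\,\vec n\cdot\vec r\,'}\,d^3\vec r\,'\,. \tag{8.18}$$

(Recall that $k$ and $\omega$ are just two names for the same quantity here.)

The magnetic field is given by $\vec B = \vec\nabla\times\vec A$, or $B_i = \epsilon_{ijk}\partial_jA_k$. We are interested in the contribution that dominates at large distance, and this therefore comes from the term where the derivative lands on the $e^{ikr}$ factor rather than the $1/r$ factor. Thus we shall have

$$B_i \approx \epsilon_{ijk}\,\frac1r\,(\partial_je^{ikr})\,e^{-i\omega t}\int J_k(\vec r\,')\,e^{-ik\,\vec n\cdot\vec r\,'}\,d^3\vec r\,' = ik\,\epsilon_{ijk}\,\frac{x_j}{r}\,\frac1r\,e^{i(kr-\omega t)}\int J_k(\vec r\,')\,e^{-ik\,\vec n\cdot\vec r\,'}\,d^3\vec r\,'\,, \tag{8.19}$$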
and so

$$\vec B \approx ik\,\vec n\times\vec A\,. \tag{8.20}$$

Note that the magnetic field in this leading-order approximation falls off like $1/r$. This is characteristic of radiation fields. The electric field can be calculated from the magnetic field using (8.5), and again the leading-order behaviour comes from the term where the gradient operator lands on the $e^{ikr}$ factor. The rule again is therefore that $\vec\nabla \to ik\,\vec n$, and so we find $\vec E \approx (i/k)(ik\,\vec n)\times\vec B = -\vec n\times\vec B$. This is

$$\vec E \approx -ik\,\vec n\times(\vec n\times\vec A)\,. \tag{8.21}$$

Note that (8.20) and (8.21) imply

$$\vec n\cdot\vec B = 0\,,\qquad \vec n\cdot\vec E = 0\,,\qquad \vec E\cdot\vec B = 0\,. \tag{8.22}$$

Thus $\vec E$ and $\vec B$ are transverse and orthogonal, as we have seen previously; this is characteristic of electromagnetic radiation.

Since we are assuming that the characteristic size $d$ of the source is very small compared with the wavelength, $d \ll \lambda = 2\pi/k$, it follows that $kd \ll 1$ and so the quantity $k\,|\vec n\cdot\vec r\,'|$ appearing in the exponential in the integrand in (8.18) is much smaller than 1. This means that it is useful to expand the exponential in a Taylor series, giving

$$\vec A(\vec r,t) \approx \frac1r\,e^{i(kr-\omega t)}\sum_{m\ge0}\frac{(-ik)^m}{m!}\int\vec J(\vec r\,')\,(\vec n\cdot\vec r\,')^m\,d^3\vec r\,'\,, \tag{8.23}$$

where the terms in the sum fall off rapidly with $m$.

8.1.3 The induction zone

This is the intermediate zone, where $d \ll \lambda \sim r$, and so $kr \sim 1$, which means that the fields are being observed from a distance that is comparable with the wavelength. In this case, we need to consider the exact expansion of $e^{ik|\vec r-\vec r\,'|}\,|\vec r-\vec r\,'|^{-1}$. It turns out that this can be written as

$$\frac{e^{ik|\vec r-\vec r\,'|}}{|\vec r-\vec r\,'|} = 4\pi ik\sum_{\ell\ge0}j_\ell(kr')\,h^{(1)}_\ell(kr)\sum_{m=-\ell}^{\ell}Y^*_{\ell m}(\theta',\varphi')\,Y_{\ell m}(\theta,\varphi)\,, \tag{8.24}$$

where $j_\ell(x)$ are spherical Bessel functions and $h^{(1)}_\ell(x)$ are spherical Hankel functions of the first kind. We shall not pursue the investigation of the induction zone further.
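The radiation-zone results of section 8.1 all rest on the approximation (8.15), $|\vec r - \vec r\,'| \approx r - \vec n\cdot\vec r\,'$. The following sketch compares the exact distance with this approximation for a source point of fixed size and increasingly distant observation points. The function name and the sample vectors are hypothetical choices made for illustration only.

```python
import math

def far_zone_error(r, n, r_prime):
    """Return |r n - r'| - (r - n . r'), the error of approximation (8.15).

    r        observation distance
    n        unit vector towards the observer (3-tuple)
    r_prime  source point (3-tuple), assumed small compared with r
    """
    exact = math.sqrt(sum((r * ni - pi) ** 2 for ni, pi in zip(n, r_prime)))
    approx = r - sum(ni * pi for ni, pi in zip(n, r_prime))
    return exact - approx

if __name__ == "__main__":
    n = (0.0, 0.0, 1.0)
    r_prime = (0.3, -0.2, 0.1)   # hypothetical source point, |r'| of order 1
    for r in (10.0, 100.0, 1000.0):
        print(r, far_zone_error(r, n, r_prime))
```

The error decreases roughly like $1/r$; to leading order it is $\big(r'^{\,2} - (\vec n\cdot\vec r\,')^2\big)/(2r)$, which is the first term dropped in passing from (8.13) to (8.15).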
8.2 Electric dipole radiation

In the radiation zone, we obtained (8.23)

$$\vec A(\vec r,t) = \frac1r\,e^{i(kr-\omega t)}\sum_{m\ge0}\frac{(-ik)^m}{m!}\int\vec J(\vec r\,')\,(\vec n\cdot\vec r\,')^m\,d^3\vec r\,'\,. \tag{8.25}$$

The terms in this expansion correspond to the terms in a multipole expansion. Consider first the $m = 0$ term, for which

$$\vec A(\vec r,t) = \frac1r\,e^{i(kr-\omega t)}\int\vec J(\vec r\,')\,d^3\vec r\,'\,. \tag{8.26}$$

This actually corresponds to an electric dipole term. To see this, consider the identity

$$\partial'_i\big(x'_jJ_i(\vec r\,')\big) = \delta_{ij}J_i(\vec r\,') + x'_j\,\partial'_iJ_i(\vec r\,') = J_j(\vec r\,') + x'_j\,\vec\nabla'\cdot\vec J(\vec r\,')\,. \tag{8.27}$$

The integral of the left-hand side over all space gives zero, since it can be turned into a boundary integral over the sphere at infinity (where the localised sources must vanish):

$$\int\partial'_i\big(x'_jJ_i(\vec r\,')\big)\,d^3\vec r\,' = \int_Sx'_jJ_i(\vec r\,')\,dS'_i = 0\,. \tag{8.28}$$

We also have the charge conservation equation

$$\vec\nabla'\cdot\vec J(\vec r\,',t) + \frac{\partial\rho(\vec r\,',t)}{\partial t} = 0\,, \tag{8.29}$$

and so with the time dependence $e^{-i\omega t}$ that we are assuming, this gives

$$\vec\nabla'\cdot\vec J(\vec r\,') = i\omega\,\rho(\vec r\,') = ik\,\rho(\vec r\,')\,. \tag{8.30}$$

Thus we conclude that

$$\int\vec J(\vec r\,')\,d^3\vec r\,' = -ik\int\vec r\,'\rho(\vec r\,')\,d^3\vec r\,'\,, \tag{8.31}$$

and so

$$\vec A(\vec r,t) = -\frac{ik}{r}\,e^{i(kr-\omega t)}\int\vec r\,'\rho(\vec r\,')\,d^3\vec r\,'\,. \tag{8.32}$$

The integral here is just the electric dipole moment,

$$\vec p = \int\vec r\,'\rho(\vec r\,')\,d^3\vec r\,'\,, \tag{8.33}$$

and so we have

$$\vec A(\vec r,t) = -\frac{ik\,\vec p}{r}\,e^{i(kr-\omega t)}\,. \tag{8.34}$$

Note that this leading-order term in the expansion of the radiation field corresponds to an electric dipole, and not an electric monopole. The reason for this is that a monopole term
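would require that the total electric charge in the source region should oscillate in time. This would be impossible, because the total charge in this isolated system must remain constant, by charge conservation.

It is convenient to factor out the time-dependence factor $e^{-i\omega t}$ that accompanies all the expressions we shall be working with, and to write

$$\mathbf{A}(\mathbf{r},t) = \mathbf{A}(\mathbf{r})\, e^{-i\omega t}, \qquad \mathbf{B}(\mathbf{r},t) = \mathbf{B}(\mathbf{r})\, e^{-i\omega t}, \qquad \mathbf{E}(\mathbf{r},t) = \mathbf{E}(\mathbf{r})\, e^{-i\omega t}. \tag{8.35}$$

Thus for the electric dipole field we shall have

$$\mathbf{A}(\mathbf{r}) = -\frac{i k\, \mathbf{p}}{r}\, e^{i k r}. \tag{8.36}$$

Then from $\mathbf{B}(\mathbf{r}) = \nabla\times\mathbf{A}(\mathbf{r})$ we find

$$B_i = \epsilon_{ijk}\, \partial_j A_k = -i k\, \epsilon_{ijk}\, p_k\, \partial_j \Big(\frac{1}{r}\, e^{i k r}\Big) = -i k\, \epsilon_{ijk}\, p_k\, \frac{x_j}{r} \Big(\frac{i k}{r} - \frac{1}{r^2}\Big) e^{i k r}, \tag{8.37}$$

and so

$$\mathbf{B} = k^2\, (\mathbf{n}\times\mathbf{p})\, \frac{e^{i k r}}{r} \Big(1 + \frac{i}{k r}\Big). \tag{8.38}$$

From (8.5) we then have

$$E_i = \frac{i}{k}\, \epsilon_{ijk}\, \partial_j B_k = i k\, \epsilon_{ijk}\, \epsilon_{k\ell m}\, p_m\, \partial_j \Big[ x_\ell \Big(\frac{1}{r^2} + \frac{i}{k r^3}\Big) e^{i k r} \Big]$$

$$= i k\, (\delta_{i\ell}\delta_{jm} - \delta_{im}\delta_{j\ell})\, p_m \Big[ \delta_{j\ell} \Big(\frac{1}{r^2} + \frac{i}{k r^3}\Big) - \frac{2 x_j x_\ell}{r^4} - \frac{3 i\, x_j x_\ell}{k r^5} + i k\, \frac{x_j x_\ell}{r} \Big(\frac{1}{r^2} + \frac{i}{k r^3}\Big) \Big] e^{i k r}$$

$$= \frac{k^2}{r}\, (p_i - n_i\, \mathbf{n}\cdot\mathbf{p})\, e^{i k r} + \frac{i k}{r^2}\, (p_i - 3\, \mathbf{n}\cdot\mathbf{p}\; n_i)\, e^{i k r} - \frac{1}{r^3}\, (p_i - 3\, \mathbf{n}\cdot\mathbf{p}\; n_i)\, e^{i k r}. \tag{8.39}$$

In 3-vector language, this gives

$$\mathbf{E} = -k^2\, \mathbf{n}\times(\mathbf{n}\times\mathbf{p})\, \frac{e^{i k r}}{r} + \big[3(\mathbf{n}\cdot\mathbf{p})\, \mathbf{n} - \mathbf{p}\big] \Big(\frac{1}{r^3} - \frac{i k}{r^2}\Big) e^{i k r}. \tag{8.40}$$

The reason why we have kept all terms in these expressions for $\mathbf{B}$ and $\mathbf{E}$, rather than just the leading-order $1/r$ terms, is that if one makes a multipole expansion, in which the electric dipole contribution we have obtained here is the first term in the series, the expressions are in fact exact to all orders in $1/r$. We shall discuss this in more detail later.

Note that we have $\mathbf{n}\cdot\mathbf{B} = 0$ everywhere, but that $\mathbf{n}\cdot\mathbf{E} = 0$ only in the radiation zone (i.e. at order $1/r$). In the radiation zone we have

$$\mathbf{B} = k^2\, (\mathbf{n}\times\mathbf{p})\, \frac{e^{i k r}}{r}, \qquad \mathbf{E} = -k^2\, \mathbf{n}\times(\mathbf{n}\times\mathbf{p})\, \frac{e^{i k r}}{r} = -\mathbf{n}\times\mathbf{B}. \tag{8.41}$$

Note that we have $|\mathbf{B}| = |\mathbf{E}|$, as usual for radiation fields.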
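Since the step from (8.38) to (8.40) involves a fair amount of index manipulation, it may be reassuring to confirm it numerically. The sketch below assumes that (8.5) is the relation $\mathbf{E} = (i/k)\,\nabla\times\mathbf{B}$ used elsewhere in these notes, and takes the curl by central finite differences; the function names, the step size `h` and the sample point are illustrative choices only.

```python
import cmath
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def B_dipole(x, p, k):
    """Exact electric-dipole B field, equation (8.38)."""
    r = math.sqrt(sum(c * c for c in x))
    n = tuple(c / r for c in x)
    amp = k * k * cmath.exp(1j * k * r) / r * (1 + 1j / (k * r))
    return tuple(amp * c for c in cross(n, p))

def E_dipole(x, p, k):
    """Exact electric-dipole E field, equation (8.40)."""
    r = math.sqrt(sum(c * c for c in x))
    n = tuple(c / r for c in x)
    n_dot_p = sum(a * b for a, b in zip(n, p))
    phase = cmath.exp(1j * k * r)
    nxnxp = cross(n, cross(n, p))
    near = (1 / r**3 - 1j * k / r**2) * phase
    return tuple(-k * k * phase / r * nxnxp[i] + near * (3 * n_dot_p * n[i] - p[i])
                 for i in range(3))

def curl(F, x, h=1e-4):
    """Central-difference curl of a vector field F at the point x."""
    def d(i, j):  # dF_i / dx_j
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        return (F(xp)[i] - F(xm)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

# Hypothetical sample values for the check.
k = 2.0
p = (0.0, 0.0, 1.0)
x = (0.9, -0.4, 1.1)

E_direct = E_dipole(x, p, k)
E_from_B = tuple((1j / k) * c for c in curl(lambda y: B_dipole(y, p, k), x))
err = max(abs(a - b) for a, b in zip(E_direct, E_from_B))
print("max |E(8.40) - (i/k) curl B(8.38)| =", err)
```

The discrepancy is limited only by the finite-difference step, which is consistent with (8.38) and (8.40) being exact at all distances and not merely in the radiation zone.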
Since (8.38) and (8.40) are in fact valid everywhere, we can also use these expressions in the static zone (i.e. the near zone). Here, in the regime $kr \ll 1$, we therefore have

$$\mathbf{B} = i k\, (\mathbf{n}\times\mathbf{p})\, \frac{1}{r^2}, \qquad \mathbf{E} = \big[3(\mathbf{n}\cdot\mathbf{p})\, \mathbf{n} - \mathbf{p}\big]\, \frac{1}{r^3}. \tag{8.42}$$

The electric field here is precisely like that of a static electric dipole, except that it is oscillating in time. Note that in the near zone we have $|\mathbf{B}| \sim (kr)\, |\mathbf{E}|$, which means $|\mathbf{B}| \ll |\mathbf{E}|$.

Returning now to the radiation zone, we may calculate the radiated power in the usual way, using the Poynting vector. In particular, we saw previously that with the electric and magnetic fields written in the complex notation, the time average of the Poynting flux is given by

$$\langle \mathbf{S} \rangle = \frac{1}{8\pi}\, \mathbf{E}\times\mathbf{B}^*. \tag{8.43}$$

Then the power radiated into the solid angle $d\Omega$ is given by

$$dP = \langle \mathbf{S} \rangle\cdot\mathbf{n}\; r^2\, d\Omega = \frac{1}{8\pi}\, \big[(-\mathbf{n}\times\mathbf{B})\times\mathbf{B}^*\big]\cdot\mathbf{n}\; r^2\, d\Omega = \frac{1}{8\pi}\, |\mathbf{B}|^2\, r^2\, d\Omega. \tag{8.44}$$

From (8.41) we therefore have

$$\frac{dP}{d\Omega} = \frac{k^4}{8\pi}\, |\mathbf{n}\times\mathbf{p}|^2 = \frac{k^4}{8\pi}\, \big(|\mathbf{p}|^2 - |\mathbf{n}\cdot\mathbf{p}|^2\big). \tag{8.45}$$

If we take $\theta$ to be the angle between $\mathbf{p}$ and $\mathbf{n}$, so that $\mathbf{n}\cdot\mathbf{p} = p\cos\theta$, then this gives

$$\frac{dP}{d\Omega} = \frac{k^4}{8\pi}\, |\mathbf{p}|^2 \sin^2\theta. \tag{8.46}$$

Since $d\Omega = \sin\theta\, d\theta\, d\varphi$, the total power radiated by the oscillating dipole is then given by

$$P = \int \frac{dP}{d\Omega}\, d\Omega = 2\pi\, \frac{k^4}{8\pi}\, |\mathbf{p}|^2 \int_0^\pi \sin^3\theta\, d\theta = \tfrac{1}{3}\, k^4\, |\mathbf{p}|^2. \tag{8.47}$$

As a concrete example, consider a dipole antenna comprising two thin conducting rods running along the $z$ axis, meeting (but not touching) at the origin, and extending to $z = \pm\tfrac{1}{2} d$ respectively. The antenna is driven at the centre ($z = 0$) by an alternating current source with angular frequency $\omega$. The current will fall off as a function of $z$, becoming zero at the tips of the antenna at $z = \pm\tfrac{1}{2} d$. A reasonable approximation, in the regime we are considering here where $kd \ll 1$, is that this fall-off is linear in $z$. Thus we may assume

$$I(z,t) = I(z)\, e^{-i\omega t} = I_0 \Big(1 - \frac{2|z|}{d}\Big)\, e^{-i\omega t}. \tag{8.48}$$
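The angular integration leading to (8.47) is easy to reproduce numerically. The following sketch integrates the distribution (8.46) over the sphere with a simple midpoint rule; the function names and the sample values of $k$ and $|\mathbf{p}|$ are assumptions made purely for illustration.

```python
import math

def dP_dOmega(theta, k, p):
    """Angular power distribution (8.46) for a dipole of magnitude p."""
    return k**4 * p**2 * math.sin(theta)**2 / (8 * math.pi)

def total_power(k, p, steps=2000):
    """Integrate (8.46) over the sphere; the phi integral contributes 2*pi."""
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta          # midpoint of the i-th theta slice
        total += dP_dOmega(theta, k, p) * math.sin(theta) * dtheta
    return 2 * math.pi * total

k, p = 1.7, 0.6                              # hypothetical sample values
numeric = total_power(k, p)
closed_form = k**4 * p**2 / 3                # equation (8.47)
print(numeric, closed_form)
```

The two printed numbers should agree to many decimal places, reflecting the elementary integral $\int_0^\pi \sin^3\theta\, d\theta = 4/3$.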
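The equation of charge conservation, $\nabla\cdot\mathbf{J} + \partial\rho/\partial t = 0$, then allows us to solve for the charge density. The current (8.48) is essentially confined to the line $x = y = 0$, since we are assuming the conducting rods that form the antenna are thin. Thus really, we have a current density directed along the $z$ axis,

$$\mathbf{J}(\mathbf{r},t) = I(z,t)\, \delta(x)\, \delta(y)\, \hat{\mathbf{z}}. \tag{8.49}$$

Similarly, the charge density will be given by

$$\rho(\mathbf{r},t) = \lambda(z,t)\, \delta(x)\, \delta(y), \tag{8.50}$$

where $\lambda(z,t)$ is the charge per unit length in the rods. The charge conservation equation therefore becomes

$$\frac{\partial I(z,t)}{\partial z} + \frac{\partial \lambda(z,t)}{\partial t} = 0, \tag{8.51}$$

and so, in view of the time dependence, which implies also $\lambda(z,t) = \lambda(z)\, e^{-i\omega t}$, we have

$$\frac{\partial I(z)}{\partial z} - i\omega\, \lambda(z) = 0. \tag{8.52}$$

Thus we shall have

$$\lambda(z) = -\frac{i}{\omega}\, \frac{\partial I(z)}{\partial z} = -\frac{i}{\omega}\, I_0\, \frac{\partial}{\partial z} \Big(1 - \frac{2|z|}{d}\Big). \tag{8.53}$$

This implies

$$\lambda(z) = \frac{2 i I_0}{\omega d}, \quad z > 0; \qquad \lambda(z) = -\frac{2 i I_0}{\omega d}, \quad z < 0. \tag{8.54}$$

The dipole moment $\mathbf{p}$ is directed along the $z$ axis, $\mathbf{p} = (0, 0, p)$, and is given by

$$p = \int_{-d/2}^{d/2} z\, \lambda(z)\, dz = \frac{2 i I_0}{\omega d} \int_0^{d/2} z\, dz - \frac{2 i I_0}{\omega d} \int_{-d/2}^0 z\, dz = \frac{i I_0 d}{2\omega}. \tag{8.55}$$

From (8.46), we therefore find that the power per unit solid angle is given by

$$\frac{dP}{d\Omega} = \frac{k^4\, |\mathbf{p}|^2}{8\pi}\, \sin^2\theta = \frac{I_0^2\, (kd)^2}{32\pi}\, \sin^2\theta, \tag{8.56}$$

where $\theta$ is the angle between $\mathbf{n} = \mathbf{r}/r$ and the $z$ axis. The total radiated power is therefore given by

$$P = \tfrac{1}{12}\, I_0^2\, (kd)^2. \tag{8.57}$$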
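The chain from the line charge (8.54) to the dipole moment (8.55) and the total power (8.57) can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the sample values of $I_0$, $k$ and $d$ are arbitrary, and $\omega = k$ is used because the notes work in units where the two coincide.

```python
import math

def line_charge(z, I0, omega, d):
    """Charge per unit length lambda(z) from (8.54): imaginary and odd in z."""
    sign = 1.0 if z > 0 else -1.0
    return sign * 2j * I0 / (omega * d)

def dipole_moment(I0, omega, d, steps=1000):
    """Midpoint-rule evaluation of p = integral of z * lambda(z), as in (8.55)."""
    h = d / steps
    total = 0j
    for i in range(steps):
        z = -0.5 * d + (i + 0.5) * h   # midpoints never sit exactly at z = 0
        total += z * line_charge(z, I0, omega, d) * h
    return total

def total_power(I0, k, d):
    """Total power k^4 |p|^2 / 3 from (8.47), with omega = k."""
    p = dipole_moment(I0, k, d)
    return k**4 * abs(p)**2 / 3

I0, k, d = 2.0, 0.3, 0.5               # hypothetical values with k*d << 1
p_numeric = dipole_moment(I0, k, d)
p_closed = 1j * I0 * d / (2 * k)       # equation (8.55)
P_numeric = total_power(I0, k, d)
P_closed = I0**2 * (k * d)**2 / 12     # equation (8.57)
print(p_numeric, p_closed)
print(P_numeric, P_closed)
```

Because the integrand $z\,\lambda(z)$ is piecewise linear with its kink on a cell boundary, the midpoint rule reproduces (8.55) essentially exactly here.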
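8.3 Higher multipoles

As mentioned previously, in a multipole expansion we can obtain exact expressions, term by term, for the electric and magnetic fields. To do this, we go back to the general integral expression

$$\mathbf{A}(\mathbf{r}) = \int \mathbf{J}(\mathbf{r}')\, \frac{e^{i k|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\, d^3r', \tag{8.58}$$

giving $\mathbf{A}(\mathbf{r},t) = \mathbf{A}(\mathbf{r})\, e^{-i\omega t}$. Let

$$\frac{1}{r}\, e^{i k r} = f(\mathbf{r}) = f(r). \tag{8.59}$$

(Note that $f(\mathbf{r}) = f(r)$, i.e. it depends only on the magnitude of $\mathbf{r}$.) It follows that

$$\frac{e^{i k|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|} = f(\mathbf{r}-\mathbf{r}'), \tag{8.60}$$

which we can therefore express as the Taylor series

$$f(\mathbf{r}-\mathbf{r}') = f(\mathbf{r}) - x'_i\, \partial_i f(\mathbf{r}) + \frac{1}{2!}\, x'_i x'_j\, \partial_i \partial_j f(\mathbf{r}) + \cdots$$

$$= f(r) - x'_i\, (\partial_i r)\, f'(r) + \tfrac{1}{2}\, x'_i x'_j \big[ (\partial_i \partial_j r)\, f'(r) + (\partial_i r)(\partial_j r)\, f''(r) \big] + \cdots. \tag{8.61}$$

Thus we find

$$\frac{e^{i k|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|} = \frac{1}{r}\, e^{i k r} + \Big(\frac{1}{r^2} - \frac{i k}{r}\Big)\, (\mathbf{n}\cdot\mathbf{r}')\, e^{i k r} + \cdots. \tag{8.62}$$

The first term in this series gives the electric dipole contribution that we found in the previous section. The second term gives contributions from an electric quadrupole term and a magnetic dipole term. This gives

$$\mathbf{A}(\mathbf{r}) = e^{i k r} \Big(\frac{1}{r^2} - \frac{i k}{r}\Big) \int (\mathbf{n}\cdot\mathbf{r}')\, \mathbf{J}(\mathbf{r}')\, d^3r'. \tag{8.63}$$

In order to interpret this expression, we need to manipulate the integrand a bit. Its $i$'th component is given by

$$n_j x'_j J_i = \tfrac{1}{2}\, (J_i x'_j - J_j x'_i)\, n_j + \tfrac{1}{2}\, (J_i x'_j + J_j x'_i)\, n_j$$

$$= \tfrac{1}{2}\, \epsilon_{ijk}\, \epsilon_{\ell m k}\, J_\ell\, x'_m\, n_j + \tfrac{1}{2}\, (J_i x'_j + J_j x'_i)\, n_j$$

$$= -\epsilon_{ijk}\, n_j M_k + \tfrac{1}{2}\, (J_i x'_j + J_j x'_i)\, n_j, \tag{8.64}$$

where

$$M_i = \tfrac{1}{2}\, \epsilon_{ijk}\, x'_j J_k, \qquad \text{i.e.} \quad \mathbf{M} = \tfrac{1}{2}\, \mathbf{r}'\times\mathbf{J}(\mathbf{r}') \tag{8.65}$$

is the magnetisation resulting from the current density $\mathbf{J}$.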
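The remaining term in (8.64), i.e. the symmetric term $\tfrac{1}{2}(J_i x'_j + J_j x'_i)\, n_j$, can be analysed as follows. Consider

$$\partial'_k (x'_i x'_j n_j J_k) = \delta_{ik}\, x'_j n_j J_k + \delta_{jk}\, x'_i n_j J_k + x'_i x'_j n_j\, \partial'_k J_k = (x'_i J_j + x'_j J_i)\, n_j + i\, x'_i x'_j n_j\, \omega \rho. \tag{8.66}$$

Integrating this over all space, the left-hand side can be turned into a surface integral over the sphere at infinity, which therefore gives zero. Thus we conclude that

$$\int (x'_i J_j + x'_j J_i)\, n_j\, d^3r' = -i\omega \int x'_i x'_j n_j\, \rho\, d^3r'. \tag{8.67}$$

The upshot is that

$$\int (\mathbf{n}\cdot\mathbf{r}')\, \mathbf{J}(\mathbf{r}')\, d^3r' = -\mathbf{n}\times\int \mathbf{M}\, d^3r' - \frac{i\omega}{2} \int \mathbf{r}'\, (\mathbf{n}\cdot\mathbf{r}')\, \rho(\mathbf{r}')\, d^3r'. \tag{8.68}$$

Defining the magnetic dipole moment $\mathbf{m}$ by

$$\mathbf{m} = \int \mathbf{M}\, d^3r' = \tfrac{1}{2} \int \mathbf{r}'\times\mathbf{J}(\mathbf{r}')\, d^3r', \tag{8.69}$$

we conclude that

$$\mathbf{A}(\mathbf{r}) = e^{i k r} \Big(\frac{i k}{r} - \frac{1}{r^2}\Big)\, \mathbf{n}\times\mathbf{m} + \frac{i k}{2}\, e^{i k r} \Big(\frac{i k}{r} - \frac{1}{r^2}\Big) \int \mathbf{r}'\, (\mathbf{n}\cdot\mathbf{r}')\, \rho(\mathbf{r}')\, d^3r'. \tag{8.70}$$

8.3.1 Magnetic dipole term

Consider the magnetic dipole term in (8.70) first:

$$\mathbf{A}(\mathbf{r}) = e^{i k r} \Big(\frac{i k}{r} - \frac{1}{r^2}\Big)\, \mathbf{n}\times\mathbf{m}. \tag{8.71}$$

Let

$$f \equiv e^{i k r} \Big(\frac{i k}{r^2} - \frac{1}{r^3}\Big), \tag{8.72}$$

so $\mathbf{A} = r f\, \mathbf{n}\times\mathbf{m} = f\, \mathbf{r}\times\mathbf{m}$. Then from $\mathbf{B} = \nabla\times\mathbf{A}$ we shall have

$$B_i = \epsilon_{ijk}\, \partial_j A_k = \epsilon_{ijk}\, \epsilon_{k\ell m}\, \partial_j (f x_\ell)\, m_m = \epsilon_{ijk}\, \epsilon_{k\ell m} \Big( f'\, \frac{x_\ell x_j}{r} + f\, \delta_{j\ell} \Big)\, m_m$$

$$= (\delta_{i\ell}\delta_{jm} - \delta_{im}\delta_{j\ell}) \Big( f'\, \frac{x_\ell x_j}{r} + f\, \delta_{j\ell} \Big)\, m_m = r f'\, n_i\, \mathbf{n}\cdot\mathbf{m} - r f'\, m_i - 2 f\, m_i. \tag{8.73}$$

From (8.72) we have

$$f' = e^{i k r} \Big(\frac{3}{r^4} - \frac{3 i k}{r^3} - \frac{k^2}{r^2}\Big) = -\frac{k^2}{r^2}\, e^{i k r} - \frac{3 f}{r}, \tag{8.74}$$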
and so we find

$$\mathbf{B} = -k^2\, \mathbf{n}\times(\mathbf{n}\times\mathbf{m})\, \frac{e^{i k r}}{r} + \big[3\, \mathbf{n}\, (\mathbf{n}\cdot\mathbf{m}) - \mathbf{m}\big] \Big(\frac{1}{r^3} - \frac{i k}{r^2}\Big) e^{i k r}. \tag{8.75}$$

Note that this is identical to the expression (8.40) for the electric field of an electric dipole source, with the electric dipole $\mathbf{p}$ replaced by the magnetic dipole, and the electric field replaced by the magnetic field:

$$\mathbf{p} \longrightarrow \mathbf{m}, \qquad \mathbf{E} \longrightarrow \mathbf{B}. \tag{8.76}$$

The electric field of the magnetic dipole can be obtained by substituting (8.75) into (8.5). However, a simpler way to find it here is to note that from the Maxwell equation

$$\nabla\times\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{8.77}$$

we have $\nabla\times\mathbf{E} = i\omega\, \mathbf{B} = i k\, \mathbf{B}$, and so

$$\mathbf{B} = -\frac{i}{k}\, \nabla\times\mathbf{E}. \tag{8.78}$$

Now, we already saw from the calculations for the electric dipole that when the $\mathbf{B}$ field (8.38) is substituted into (8.5), it gives rise to the $\mathbf{E}$ field given in (8.40). As we just noted, in the present magnetic dipole case, the expression for the $\mathbf{B}$ field is just like the expression for the $\mathbf{E}$ field in the electric dipole case, and we already know that in the electric case, the $\mathbf{B}$ field is given by (8.38). Therefore, we can conclude that in the present magnetic case, the $\mathbf{E}$ field that would yield, using (8.78), the result (8.75) for the $\mathbf{B}$ field will be just the negative of the expression for $\mathbf{B}$ in the electric case (with $\mathbf{p}$ replaced by $\mathbf{m}$). (The reason for the minus sign is that (8.78) has a minus sign, as compared with (8.5), under the exchange of $\mathbf{E}$ and $\mathbf{B}$.) Thus the upshot is that the electric field for the magnetic dipole radiation will be given by

$$\mathbf{E} = -k^2\, (\mathbf{n}\times\mathbf{m})\, \frac{e^{i k r}}{r} \Big(1 + \frac{i}{k r}\Big). \tag{8.79}$$

This result can alternatively be verified (after a rather involved calculation) by directly substituting (8.75) into (8.5). [Footnote 18]

Footnote 18: The only "gap" in the simple argument we just presented is that any other vector $\mathbf{E}' = \mathbf{E} + \nabla h$ would also give the same $\mathbf{B}$ field when plugged into (8.78), where $h$ is an arbitrary function. However, we know that $\nabla\cdot\mathbf{E}$ should vanish (we are in a region away from sources), and it is obvious almost by inspection that the answer given in (8.79) satisfies this condition. Thus if we had arrived at the wrong answer for $\mathbf{E}$, it could be wrong only by a term $\nabla h$ where $\nabla^2 h = 0$. There is no such function with an exponential factor $e^{i k r}$, and so there is no possibility of our answer (8.79) being wrong. If any doubts remain, the reader is invited to substitute (8.75) into (8.5) to verify (8.79) directly.
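An observation from the calculations of the electric and magnetic fields for electric dipole radiation and magnetic dipole radiation is that there is a discrete symmetry under which the two situations interchange:

$$\mathbf{p} \longrightarrow \mathbf{m}, \qquad \mathbf{E} \longrightarrow \mathbf{B}, \qquad \mathbf{B} \longrightarrow -\mathbf{E}. \tag{8.80}$$

This is an example of what is known as "electric/magnetic duality" in Maxwell's equations.

8.3.2 Electric quadrupole term

We now return to the electric quadrupole term in (8.70), namely

$$\mathbf{A}(\mathbf{r}) = \frac{i k}{2}\, e^{i k r} \Big(\frac{i k}{r} - \frac{1}{r^2}\Big) \int \mathbf{r}'\, (\mathbf{n}\cdot\mathbf{r}')\, \rho(\mathbf{r}')\, d^3r'. \tag{8.81}$$

For simplicity, we shall keep only the leading-order radiation term in this expression,

$$\mathbf{A}(\mathbf{r}) = -\tfrac{1}{2}\, k^2\, \frac{e^{i k r}}{r} \int \mathbf{r}'\, (\mathbf{n}\cdot\mathbf{r}')\, \rho(\mathbf{r}')\, d^3r', \tag{8.82}$$

and furthermore when calculating the $\mathbf{B}$ and $\mathbf{E}$ fields, we shall keep only the leading-order $1/r$ terms that come from the derivatives hitting $e^{i k r}$. Thus, from $\mathbf{B} = \nabla\times\mathbf{A}$ we shall have

$$\mathbf{B} = -\tfrac{1}{2}\, k^2\, (i k)\, \mathbf{n}\times \frac{e^{i k r}}{r} \int \mathbf{r}'\, (\mathbf{n}\cdot\mathbf{r}')\, \rho(\mathbf{r}')\, d^3r' = -\frac{i k^3}{2}\, \frac{e^{i k r}}{r} \int (\mathbf{n}\times\mathbf{r}')\, (\mathbf{n}\cdot\mathbf{r}')\, \rho(\mathbf{r}')\, d^3r'. \tag{8.83}$$

This radiation field can therefore be written simply as

$$\mathbf{B} = i k\, \mathbf{n}\times\mathbf{A}. \tag{8.84}$$

In fact, in any expression where we keep only the leading-order term in which the derivative lands on $e^{i k r}$, we shall have the rule

$$\nabla \longrightarrow i k\, \mathbf{n}. \tag{8.85}$$

For the electric field, we have, using (8.5) and (8.85),

$$\mathbf{E} = \frac{i}{k}\, \nabla\times\mathbf{B} = -\mathbf{n}\times\mathbf{B} = -i k\, \mathbf{n}\times(\mathbf{n}\times\mathbf{A}). \tag{8.86}$$

The electric quadrupole moment tensor $Q_{ij}$ is defined by

$$Q_{ij} = \int (3 x_i x_j - r^2\, \delta_{ij})\, \rho(\mathbf{r})\, d^3r. \tag{8.87}$$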
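Define the vector $\mathbf{Q}(\mathbf{n})$, whose components $Q(\mathbf{n})_i$ are given by

$$Q(\mathbf{n})_i \equiv Q_{ij}\, n_j. \tag{8.88}$$

Consider the expression $\tfrac{1}{3}\, \mathbf{n}\times\mathbf{Q}(\mathbf{n})$. We shall have

$$\big[\tfrac{1}{3}\, \mathbf{n}\times\mathbf{Q}(\mathbf{n})\big]_i = \tfrac{1}{3}\, \epsilon_{ijk}\, n_j\, Q_{k\ell}\, n_\ell = \tfrac{1}{3}\, \epsilon_{ijk}\, n_j n_\ell \int (3 x_k x_\ell - r^2\, \delta_{k\ell})\, \rho(\mathbf{r})\, d^3r = \int (\mathbf{n}\times\mathbf{r})_i\, (\mathbf{n}\cdot\mathbf{r})\, \rho(\mathbf{r})\, d^3r, \tag{8.89}$$

(since the trace term gives zero). This implies that the expression (8.83) for the electric-quadrupole $\mathbf{B}$ field can be written as

$$\mathbf{B} = -\frac{i k^3}{6 r}\, e^{i k r}\, \mathbf{n}\times\mathbf{Q}(\mathbf{n}). \tag{8.90}$$

Since we have $\mathbf{E} = \mathbf{B}\times\mathbf{n}$ (see (8.86)), it follows that the time-averaged power per unit solid angle will be given by

$$\frac{dP}{d\Omega} = \frac{1}{8\pi}\, (\mathbf{E}\times\mathbf{B}^*)\cdot\mathbf{n}\; r^2 = \frac{k^6}{288\pi}\, |\mathbf{n}\times\mathbf{Q}(\mathbf{n})|^2, \tag{8.91}$$

and so

$$\frac{dP}{d\Omega} = \frac{k^6}{288\pi}\, \big|(\mathbf{n}\times\mathbf{Q}(\mathbf{n}))\times\mathbf{n}\big|^2 = \frac{k^6}{288\pi}\, \big(|\mathbf{Q}(\mathbf{n})|^2 - |\mathbf{n}\cdot\mathbf{Q}(\mathbf{n})|^2\big). \tag{8.92}$$

Written using indices, this is therefore

$$\frac{dP}{d\Omega} = \frac{k^6}{288\pi}\, \big(Q_{ki}\, Q^*_{kj}\, n_i n_j - Q_{ij}\, Q^*_{k\ell}\, n_i n_j n_k n_\ell\big). \tag{8.93}$$

As always, having obtained an expression for the power radiated per unit solid angle, it is natural to integrate this up over the sphere, in order to obtain the total radiated power. In this case, we shall need to evaluate

$$\int n_i n_j\, d\Omega, \qquad \text{and} \qquad \int n_i n_j n_k n_\ell\, d\Omega. \tag{8.94}$$

One way to do this is to parameterise the unit vector $\mathbf{n}$ in terms of spherical polar angles $(\theta, \varphi)$ in the usual way,

$$\mathbf{n} = (\sin\theta\cos\varphi,\; \sin\theta\sin\varphi,\; \cos\theta), \tag{8.95}$$

and slog out the integrals with $d\Omega = \sin\theta\, d\theta\, d\varphi$.

A more elegant way to evaluate the integrals in (8.94) is as follows. For the first integral, we note that the answer, whatever it is, must be a symmetric 2-index tensor. It must also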
be completely isotropic, since by the time we have integrated over all solid angles it is not possible for the result to be "biased" so that it favours any direction in space over any other. There is only one possibility for the symmetric isotropic tensor; it must be a constant multiple of the Kronecker delta,

$$\int n_i n_j\, d\Omega = c\, \delta_{ij}. \tag{8.96}$$

The constant $c$ can be determined by taking the trace, and using $n_i n_i = 1$:

$$4\pi = \int d\Omega = 3c, \tag{8.97}$$

and so we have

$$\int n_i n_j\, d\Omega = \frac{4\pi}{3}\, \delta_{ij}. \tag{8.98}$$

In case one doubted this result, it is not too hard in this case to confirm the result by evaluating all the integrals explicitly using (8.95).

Turning now to the second integral in (8.94), we can use a similar argument. The answer must be a 4-index totally symmetric isotropic tensor. In fact the only symmetric isotropic tensors are those that can be made by taking products of Kronecker deltas, and so in this case it must be that

$$\int n_i n_j n_k n_\ell\, d\Omega = b\, (\delta_{ij}\delta_{k\ell} + \delta_{ik}\delta_{j\ell} + \delta_{i\ell}\delta_{jk}), \tag{8.99}$$

for some constant $b$. We can determine the constant by multiplying both sides by $\delta_{ij}\delta_{k\ell}$, giving

$$4\pi = \int d\Omega = (9 + 3 + 3)\, b = 15 b, \tag{8.100}$$

and so

$$\int n_i n_j n_k n_\ell\, d\Omega = \frac{4\pi}{15}\, (\delta_{ij}\delta_{k\ell} + \delta_{ik}\delta_{j\ell} + \delta_{i\ell}\delta_{jk}). \tag{8.101}$$

With these results we shall have from (8.93) that

$$P = \int \frac{dP}{d\Omega}\, d\Omega = \frac{k^6}{288\pi} \Big[ Q_{ki}\, Q^*_{kj} \int n_i n_j\, d\Omega - Q_{ij}\, Q^*_{k\ell} \int n_i n_j n_k n_\ell\, d\Omega \Big]$$

$$= \frac{k^6}{288\pi} \Big[ \frac{4\pi}{3}\, Q_{ki}\, Q^*_{kj}\, \delta_{ij} - \frac{4\pi}{15}\, Q_{ij}\, Q^*_{k\ell}\, (\delta_{ij}\delta_{k\ell} + \delta_{ik}\delta_{j\ell} + \delta_{i\ell}\delta_{jk}) \Big]$$

$$= \frac{k^6}{216} \Big[ Q_{ij}\, Q^*_{ij} - \tfrac{2}{5}\, Q_{ij}\, Q^*_{ij} - \tfrac{1}{5}\, Q_{ii}\, Q^*_{jj} \Big] = \frac{k^6}{360}\, Q_{ij}\, Q^*_{ij}. \tag{8.102}$$

(Recall that $Q_{ij}$ is symmetric and traceless.)
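For readers who would rather "slog out" the integrals, the isotropy results (8.98) and (8.101) and the total power (8.102) can be spot-checked by brute-force quadrature over the sphere. The sketch below uses a midpoint rule in $(\theta, \varphi)$; the grid sizes, the sample quadrupole tensor `Q` (real, symmetric and traceless) and the value of `k` are illustrative assumptions.

```python
import math

def sphere_nodes(n_theta=100, n_phi=200):
    """Midpoint-rule nodes on the unit sphere: yields (n, dOmega) pairs."""
    dth = math.pi / n_theta
    dph = 2 * math.pi / n_phi
    for a in range(n_theta):
        th = (a + 0.5) * dth
        s, c = math.sin(th), math.cos(th)
        for b in range(n_phi):
            ph = (b + 0.5) * dph
            yield (s * math.cos(ph), s * math.sin(ph), c), s * dth * dph

def dP_dOmega(n, Q, k):
    """Angular distribution (8.93), specialised to a real quadrupole tensor Q."""
    Qn = [sum(Q[i][j] * n[j] for j in range(3)) for i in range(3)]
    nQn = sum(n[i] * Qn[i] for i in range(3))
    return k**6 / (288 * math.pi) * (sum(c * c for c in Qn) - nQn**2)

# Hypothetical symmetric, traceless quadrupole tensor.
Q = [[1.0, 0.5, 0.0],
     [0.5, -3.0, 2.0],
     [0.0, 2.0, 2.0]]
k = 1.3

I_xx = I_xy = I_zzzz = I_xxyy = P = 0.0
for n, w in sphere_nodes():
    I_xx += n[0] * n[0] * w
    I_xy += n[0] * n[1] * w
    I_zzzz += n[2]**4 * w
    I_xxyy += n[0]**2 * n[1]**2 * w
    P += dP_dOmega(n, Q, k) * w

QQ = sum(Q[i][j]**2 for i in range(3) for j in range(3))
P_closed = k**6 * QQ / 360                       # equation (8.102)

print(I_xx, 4 * math.pi / 3)                     # (8.98), i = j
print(I_xy, 0.0)                                 # (8.98), i != j
print(I_zzzz, 3 * 4 * math.pi / 15)              # (8.101), all indices equal
print(I_xxyy, 4 * math.pi / 15)                  # (8.101), two distinct pairs
print(P, P_closed)
```

The quadrature is deliberately crude, so agreement is expected only to a few parts in ten thousand; a finer grid tightens it.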
Since the quadrupole moment tensor $Q_{ij}$ is symmetric, it is always possible to choose an orientation for the Cartesian coordinate system such that $Q_{ij}$ becomes diagonal. (This is because the matrix $U$ that diagonalises $Q$, $Q \to Q_{\rm diag} = U^T Q\, U$, is itself orthogonal, i.e. $U^T U = 1$, and therefore the diagonalisation is achieved by an orthogonal transformation of the coordinates.) Thus, having chosen an appropriate orientation for the Cartesian axes, we can assume that

$$Q_{ij} = \begin{pmatrix} Q_1 & 0 & 0 \\ 0 & Q_2 & 0 \\ 0 & 0 & Q_3 \end{pmatrix}, \qquad \text{where} \quad Q_1 + Q_2 + Q_3 = 0. \tag{8.103}$$

The expression (8.93) for the angular power distribution will give

$$\frac{dP}{d\Omega} = \frac{k^6}{288\pi} \Big[ Q_1^2 n_1^2 + Q_2^2 n_2^2 + Q_3^2 n_3^2 - (Q_1 n_1^2 + Q_2 n_2^2 + Q_3 n_3^2)^2 \Big]. \tag{8.104}$$

One can substitute (8.95) into this in order to obtain an explicit expression for $dP/d\Omega$ in terms of spherical polar angles $(\theta, \varphi)$.

Consider for simplicity the special case where $Q_1 = Q_2$. This means that there is an axial symmetry around the $z$ axis, and also we shall have

$$Q_1 = Q_2 = -\tfrac{1}{2}\, Q_3. \tag{8.105}$$

Substituting (8.95) and (8.105) into (8.104), we obtain

$$\frac{dP}{d\Omega} = \frac{k^6\, Q_3^2}{128\pi}\, \sin^2\theta \cos^2\theta = \frac{k^6\, Q_3^2}{512\pi}\, \sin^2 2\theta. \tag{8.106}$$

This is indeed, as expected, azimuthally symmetric (it does not depend on $\varphi$). It describes a quadrafoil-like power distribution, with four lobes, unlike the figure-of-eight power distribution of the electric dipole radiation. Note also that its frequency dependence is proportional to $\omega^6$ ($= k^6$), unlike the electric dipole radiation that is proportional to $\omega^4$. A plot of the power distribution for quadrupole radiation is given in Figure 4 below.

8.4 Linear antenna

In the later part of section 8.2, we considered a centre-fed dipole antenna. In that section we made the assumption that the wavelength of the electromagnetic radiation was very large compared with the length of the dipole, i.e. that $kd \ll 1$. In that limit, one could assume to a good approximation that the current in each arm of the dipole antenna fell off
[Figure 4: The angular power distribution for electric quadrupole radiation]

in a linear fashion as a function of $z$ (the axis along which the dipole is located). Thus, with the dipole arms extending over the intervals

$$-\tfrac{1}{2} d \le z < 0 \qquad \text{and} \qquad 0 < z \le \tfrac{1}{2} d, \tag{8.107}$$

we assumed there that the current in each arm was proportional to $(d/2 - |z|)$.

In this section, we shall consider the case where the dipole arms are not assumed to be short compared to the wavelength. Under these circumstances, it can be shown that the current distribution in the dipole arms takes the form

$$\mathbf{J}(\mathbf{r},t) = I_0\, \sin k(d/2 - |z|)\; e^{-i\omega t}\, \delta(x)\, \delta(y)\, \mathbf{Z}, \qquad |z| \le \tfrac{1}{2} d, \tag{8.108}$$

where $\mathbf{Z} = (0, 0, 1)$ is the unit vector along the $z$-axis, which is the axis along which the dipole is located. We then have

$$\mathbf{A}(\mathbf{r},t) = \int \frac{\mathbf{J}(\mathbf{r}',\, t - |\mathbf{r}-\mathbf{r}'|)}{|\mathbf{r}-\mathbf{r}'|}\, d^3r'. \tag{8.109}$$

Thus in the radiation zone, with $|\mathbf{r}-\mathbf{r}'| \approx r - \mathbf{n}\cdot\mathbf{r}'$ as usual, we therefore have $\mathbf{A}(\mathbf{r},t) = \mathbf{A}(\mathbf{r})\, e^{-i\omega t}$, where

$$\mathbf{A}(\mathbf{r}) \approx \mathbf{Z}\, \frac{I_0\, e^{i k r}}{r} \int_{-d/2}^{d/2} \sin k(d/2 - |z|)\; e^{-i k z \cos\theta}\, dz$$
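$$= \mathbf{Z}\, \frac{2 I_0\, e^{i k r}}{k r}\, \frac{\cos(\tfrac{1}{2} kd \cos\theta) - \cos(\tfrac{1}{2} kd)}{\sin^2\theta}. \tag{8.110}$$

As we saw earlier, the magnetic field is given by $i k\, \mathbf{n}\times\mathbf{A}$ in the radiation zone, and $\mathbf{E} = -\mathbf{n}\times\mathbf{B}$. Therefore the radiated power per unit solid angle is given by

$$\frac{dP}{d\Omega} = \frac{r^2}{8\pi}\, |\mathbf{E}\times\mathbf{B}^*| = \frac{r^2}{8\pi}\, (\mathbf{B}\cdot\mathbf{B}^*)\, |\mathbf{n}| = \frac{r^2}{8\pi}\, |\mathbf{B}|^2. \tag{8.111}$$

Here we have

$$|\mathbf{B}|^2 = |i k\, \mathbf{n}\times\mathbf{A}|^2 = k^2 \big(|\mathbf{A}|^2 - |\mathbf{n}\cdot\mathbf{A}|^2\big) = k^2\, |\mathbf{A}|^2 \sin^2\theta, \tag{8.112}$$

since $\mathbf{n}\cdot\mathbf{Z} = \cos\theta$, and so the radiated power per unit solid angle is given by

$$\frac{dP}{d\Omega} = \frac{I_0^2}{2\pi} \left[ \frac{\cos(\tfrac{1}{2} kd \cos\theta) - \cos(\tfrac{1}{2} kd)}{\sin\theta} \right]^2. \tag{8.113}$$

We can now consider various special cases:

8.4.1 $kd \ll 1$:

In this case, we can make Taylor expansions of the trigonometric functions in the numerator in (8.113), leading to

$$\frac{dP}{d\Omega} \approx \frac{I_0^2}{2\pi} \left[ \frac{1 - \tfrac{1}{2} (\tfrac{1}{2} kd)^2 \cos^2\theta - 1 + \tfrac{1}{2} (\tfrac{1}{2} kd)^2}{\sin\theta} \right]^2 = \frac{I_0^2}{2\pi} \left[ \frac{\tfrac{1}{2} (\tfrac{1}{2} kd)^2 \sin^2\theta}{\sin\theta} \right]^2 = \frac{I_0^2\, (kd)^4}{128\pi}\, \sin^2\theta. \tag{8.114}$$

This agrees with the result (8.56), after making allowance for the normalisation of the current: for $kd \ll 1$ the present current $I_0 \sin k(d/2 - |z|)$ reduces to $\tfrac{1}{2} kd\, I_0\, (1 - 2|z|/d)$, so the amplitude called $I_0$ in the calculation leading to (8.56) corresponds to $\tfrac{1}{2} kd\, I_0$ in the present calculation.

8.4.2 $kd = \pi$:

In this case, each arm of the dipole has a length equal to $\tfrac{1}{4}$ of the wavelength, and so $I(z) = I_0 \sin \tfrac{1}{2}\pi (1 - 2|z|/d)$. In this case (8.113) becomes

$$\frac{dP}{d\Omega} = \frac{I_0^2}{2\pi}\, \frac{\cos^2(\tfrac{1}{2}\pi \cos\theta)}{\sin^2\theta}. \tag{8.115}$$

8.4.3 $kd = 2\pi$:

In this case, each dipole arm has a length equal to $\tfrac{1}{2}$ of the wavelength, and $I(z) = I_0 \sin \pi (1 - 2|z|/d)$. In this case (8.113) becomes

$$\frac{dP}{d\Omega} = \frac{I_0^2}{2\pi}\, \frac{\big[\cos(\pi \cos\theta) + 1\big]^2}{\sin^2\theta} = \frac{2 I_0^2}{\pi}\, \frac{\cos^4(\tfrac{1}{2}\pi \cos\theta)}{\sin^2\theta}. \tag{8.116}$$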
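The closed form of the current integral in (8.110), and the three special cases of the pattern (8.113), can be checked with a few lines of numerics. In this sketch the function names, the quadrature resolution and the sample values of $k$, $d$ and $\theta$ are all illustrative assumptions; `pattern` is the square-bracket factor of (8.113), without the prefactor $I_0^2/2\pi$.

```python
import cmath
import math

def current_integral(theta, k, d, steps=4000):
    """Midpoint rule for the z-integral of sin k(d/2-|z|) exp(-i k z cos(theta))."""
    h = d / steps
    c = math.cos(theta)
    total = 0j
    for i in range(steps):
        z = -0.5 * d + (i + 0.5) * h
        total += math.sin(k * (0.5 * d - abs(z))) * cmath.exp(-1j * k * z * c) * h
    return total

def closed_form(theta, k, d):
    """Closed form of the same integral, as used in (8.110)."""
    num = math.cos(0.5 * k * d * math.cos(theta)) - math.cos(0.5 * k * d)
    return 2 * num / (k * math.sin(theta)**2)

def pattern(theta, kd):
    """Angular factor of (8.113), i.e. dP/dOmega divided by I0^2 / (2 pi)."""
    num = math.cos(0.5 * kd * math.cos(theta)) - math.cos(0.5 * kd)
    return (num / math.sin(theta))**2

theta, k, d = 1.1, 2.0, 1.7                      # hypothetical sample values
print(current_integral(theta, k, d), closed_form(theta, k, d))

half_wave = math.cos(0.5 * math.pi * math.cos(theta))**2 / math.sin(theta)**2
full_wave = 4 * math.cos(0.5 * math.pi * math.cos(theta))**4 / math.sin(theta)**2
short = (0.01)**4 * math.sin(theta)**2 / 64
print(pattern(theta, math.pi), half_wave)        # section 8.4.2
print(pattern(theta, 2 * math.pi), full_wave)    # section 8.4.3
print(pattern(theta, 0.01), short)               # section 8.4.1
```

The short-antenna comparison agrees only up to corrections of relative order $(kd)^2$, as expected from the Taylor expansion used in section 8.4.1.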
9 Electromagnetism and Quantum Mechanics

9.1 The Schrödinger equation and gauge transformations

We saw at the end of chapter 2, in equation (2.102), that in the non-relativistic limit the Hamiltonian describing a particle of mass $m$ and charge $e$ in the presence of electromagnetic fields given by potentials $\phi$ and $\mathbf{A}$ is

$$H = \frac{1}{2m}\, (\pi_i - e A_i)^2 + e\phi, \tag{9.1}$$

where $\pi_i$ is the canonical 3-momentum. In quantum mechanics, the standard prescription for writing down the Schrödinger equation for the wavefunction $\psi$ describing the particle is to interpret $\pi_i$ as an operator, and to write

$$H \psi = i\hbar\, \frac{\partial \psi}{\partial t}. \tag{9.2}$$

In the position representation we shall have

$$\pi_i = -i\hbar\, \partial_i, \qquad \text{or} \qquad \boldsymbol{\pi} = -i\hbar\, \nabla. \tag{9.3}$$

Thus the Schrödinger equation for a particle of mass $m$ and charge $e$ in an electromagnetic field is

$$-\frac{\hbar^2}{2m} \Big(\nabla - \frac{i e}{\hbar}\, \mathbf{A}\Big)^2 \psi + e\phi\, \psi = i\hbar\, \frac{\partial \psi}{\partial t}. \tag{9.4}$$

The Schrödinger equation (9.4) is written in terms of the scalar and vector potentials $\phi$ and $\mathbf{A}$ that describe the electromagnetic field. Thus, if we perform a gauge transformation

$$\mathbf{A} \longrightarrow \mathbf{A}' = \mathbf{A} + \nabla\lambda, \qquad \phi \longrightarrow \phi' = \phi - \frac{\partial \lambda}{\partial t}, \tag{9.5}$$

the Schrödinger equation will change its form. On the other hand, we expect that the physics should be unaltered by a mere gauge transformation, since this leaves the physically-observable electric and magnetic fields unchanged. It turns out that we should simultaneously perform a very specific phase transformation on the wavefunction $\psi$,

$$\psi \longrightarrow \psi' = e^{i e\lambda/\hbar}\, \psi; \tag{9.6}$$

then the Schrödinger equation expressed entirely in terms of the primed quantities (i.e. wavefunction $\psi'$ and electromagnetic potentials $\phi'$ and $\mathbf{A}'$) will take the identical form to the original unprimed equation (9.4). Thus, we may say that the Schrödinger equation transforms covariantly under gauge transformations.
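To see the details of how this works, it is useful first to define what are called covariant derivatives. We do this both for the three spatial derivatives, and also for the time derivative. Thus we define

$$D_i \equiv \partial_i - \frac{i e}{\hbar}\, A_i, \qquad D_0 \equiv \frac{\partial}{\partial t} + \frac{i e}{\hbar}\, \phi. \tag{9.7}$$

Note that the original Schrödinger equation (9.4) is now written simply as

$$-\frac{\hbar^2}{2m}\, D_i D_i\, \psi - i\hbar\, D_0\, \psi = 0. \tag{9.8}$$

Next, perform the transformations

$$\mathbf{A} \longrightarrow \mathbf{A}' = \mathbf{A} + \nabla\lambda, \qquad \phi \longrightarrow \phi' = \phi - \frac{\partial \lambda}{\partial t}, \qquad \psi \longrightarrow \psi' = e^{i e\lambda/\hbar}\, \psi. \tag{9.9}$$

The crucial point about this is that we have the following:

$$D'_i \psi' \equiv \Big(\partial_i - \frac{i e}{\hbar}\, A'_i\Big) \psi' = \Big(\partial_i - \frac{i e}{\hbar}\, A_i - \frac{i e}{\hbar}\, (\partial_i \lambda)\Big) \big(e^{i e\lambda/\hbar}\, \psi\big)$$

$$= e^{i e\lambda/\hbar} \Big(\partial_i - \frac{i e}{\hbar}\, A_i - \frac{i e}{\hbar}\, (\partial_i \lambda) + \frac{i e}{\hbar}\, (\partial_i \lambda)\Big) \psi = e^{i e\lambda/\hbar} \Big(\partial_i - \frac{i e}{\hbar}\, A_i\Big) \psi, \tag{9.10}$$

and

$$D'_0 \psi' \equiv \Big(\frac{\partial}{\partial t} + \frac{i e}{\hbar}\, \phi'\Big) \psi' = \Big(\frac{\partial}{\partial t} + \frac{i e}{\hbar}\, \phi - \frac{i e}{\hbar}\, \frac{\partial \lambda}{\partial t}\Big) \big(e^{i e\lambda/\hbar}\, \psi\big)$$

$$= e^{i e\lambda/\hbar} \Big(\frac{\partial}{\partial t} + \frac{i e}{\hbar}\, \phi - \frac{i e}{\hbar}\, \frac{\partial \lambda}{\partial t} + \frac{i e}{\hbar}\, \frac{\partial \lambda}{\partial t}\Big) \psi = e^{i e\lambda/\hbar} \Big(\frac{\partial}{\partial t} + \frac{i e}{\hbar}\, \phi\Big) \psi. \tag{9.11}$$

In other words, we have

$$D'_i \psi' = e^{i e\lambda/\hbar}\, D_i \psi, \qquad D'_0 \psi' = e^{i e\lambda/\hbar}\, D_0 \psi, \tag{9.12}$$

which means that $D_i \psi$ and $D_0 \psi$ transform the same way as $\psi$ itself under the gauge transformations (9.9), namely just by acquiring the homogeneous phase factor $e^{i e\lambda/\hbar}$. This is a non-trivial statement, since the gauge parameter $\lambda$ is an arbitrary function of space and time. This would not, of course, be the case for the "ordinary" derivatives $\partial_i \psi$ and $\partial_0 \psi$, because for these there would be an extra additive term, where the derivative lands on the spacetime-dependent gauge parameter $\lambda$.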
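The covariance statement (9.12) can also be illustrated numerically. The sketch below works in one space dimension with $\hbar = 1$ and compares $D'\psi'$ with $e^{i e\lambda/\hbar} D\psi$ using central differences. Everything specific in it is an assumption made for illustration: the sample wavefunction, the potentials, the gauge function `lam`, the charge value and the evaluation point.

```python
import cmath
import math

E_CHARGE = 0.8   # hypothetical value of the charge e
HBAR = 1.0       # units with hbar = 1

def psi(x, t):   return cmath.exp(-x * x + 1j * (1.5 * x - t))   # sample wavefunction
def A(x, t):     return 0.3 * x + 0.1 * t                        # sample vector potential
def phi(x, t):   return 0.2 * x * x - t                          # sample scalar potential
def lam(x, t):   return 0.7 * math.sin(x + 2 * t)                # sample gauge function
def dlam_dx(x, t): return 0.7 * math.cos(x + 2 * t)
def dlam_dt(x, t): return 1.4 * math.cos(x + 2 * t)

# Gauge-transformed quantities, following (9.9).
def phase(x, t): return cmath.exp(1j * E_CHARGE * lam(x, t) / HBAR)
def psi_p(x, t): return phase(x, t) * psi(x, t)
def A_p(x, t):   return A(x, t) + dlam_dx(x, t)
def phi_p(x, t): return phi(x, t) - dlam_dt(x, t)

def ddx(f, x, t, h=1e-5): return (f(x + h, t) - f(x - h, t)) / (2 * h)
def ddt(f, x, t, h=1e-5): return (f(x, t + h) - f(x, t - h)) / (2 * h)

def D_x(f, a, x, t):
    """Spatial covariant derivative from (9.7)."""
    return ddx(f, x, t) - 1j * E_CHARGE / HBAR * a(x, t) * f(x, t)

def D_0(f, ph, x, t):
    """Time covariant derivative from (9.7)."""
    return ddt(f, x, t) + 1j * E_CHARGE / HBAR * ph(x, t) * f(x, t)

x0, t0 = 0.4, 0.3
err_space = abs(D_x(psi_p, A_p, x0, t0) - phase(x0, t0) * D_x(psi, A, x0, t0))
err_time = abs(D_0(psi_p, phi_p, x0, t0) - phase(x0, t0) * D_0(psi, phi, x0, t0))
err_plain = abs(ddx(psi_p, x0, t0) - phase(x0, t0) * ddx(psi, x0, t0))
print(err_space, err_time, err_plain)
```

The first two numbers should be at the level of finite-difference error, while the third is of order one: the ordinary derivative does not transform by a pure phase.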
Had we been considering standard partial derivatives $\partial_i$ and $\partial/\partial t$ rather than the covariant derivatives defined in (9.7), it would most certainly not have been true. For example,

$$\partial_i \psi' = \partial_i \big(e^{i e\lambda/\hbar}\, \psi\big) = e^{i e\lambda/\hbar}\, \partial_i \psi + e^{i e\lambda/\hbar}\, \frac{i e}{\hbar}\, (\partial_i \lambda)\, \psi \ne e^{i e\lambda/\hbar}\, \partial_i \psi, \tag{9.13}$$

precisely because the derivative can land on the spacetime dependent gauge-transformation parameter $\lambda$ and thus give the second term, which spoils the covariance of the transformation. The point about the covariant derivatives is that the contributions from the gauge transformation of the gauge potentials precisely cancel the "unwanted" second term in (9.13).

By iterating the calculation, it also follows that $D'_i D'_i \psi' = e^{i e\lambda/\hbar}\, D_i D_i \psi$, and so we see that the Schrödinger equation (9.8) written in terms of the primed fields, i.e.

$$-\frac{\hbar^2}{2m}\, D'_i D'_i\, \psi' - i\hbar\, D'_0\, \psi' = 0, \tag{9.14}$$

just implies the Schrödinger equation in terms of unprimed fields, since

$$0 = -\frac{\hbar^2}{2m}\, D'_i D'_i\, \psi' - i\hbar\, D'_0\, \psi' = e^{i e\lambda/\hbar} \Big( -\frac{\hbar^2}{2m}\, D_i D_i\, \psi - i\hbar\, D_0\, \psi \Big). \tag{9.15}$$

What we have proved above is that the Schrödinger equation transforms covariantly under electromagnetic gauge transformations, provided that at the same time the wave function is scaled by a space-time dependent phase factor, as in (9.9). Note that we use the term "covariant transformation" here in the same sense as we used it earlier in the course when discussing the behaviour of the Maxwell equations under Lorentz transformations. The actual transformation is totally different in the two contexts; here we are discussing the behaviour of the Schrödinger equation under gauge transformations rather than Lorentz transformations, but in each case the essential point, which is characteristic of a covariance of any equation under a symmetry transformation, is that the equation expressed in terms of the symmetry-transformed (primed) variables is identical in form to the original equation for the unprimed variables, but with a prime placed on every field.

It is worth noting that the two definitions of the spatial and time covariant derivatives in (9.7) can be unified into the single 4-dimensional definition

$$D_\mu = \partial_\mu - \frac{i e}{\hbar}\, A_\mu, \tag{9.16}$$

since we have $A^\mu = (\phi, \mathbf{A})$, and hence $A_\mu = (-\phi, \mathbf{A})$.
We shall have more to say about this 4-dimensional covariant derivative later. For now, we shall just make the obvious (if slightly provocative) remark that anyone who speaks of the Schrödinger equation as if it were the ultimate "holy grail" of physics should not be taken seriously, since it is an approximation that is not even Lorentz invariant. (It is manifest that time is treated on a different footing from space in (9.8).) Thus the Schrödinger equation does not respect any relativistic notion of causality. Furthermore, it gives answers that are measurably in disagreement with experiments. For example, to treat the electron properly in quantum physics it is necessary first to have a relativistic theory (the Dirac equation), and secondly one must move beyond quantum mechanics to quantum field theory. This lies outside the scope of the present course. We shall, however, return to the subject of relativistic quantum mechanics a little later.

9.2 Magnetic monopoles

The Maxwell equations

$$\partial_\mu F^{\mu\nu} = -4\pi J^\nu, \qquad \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} + \partial_\rho F_{\mu\nu} = 0 \tag{9.17}$$

take on a more symmetrical-looking form if we introduce the dual of the field-strength tensor, defined by

$$\widetilde{F}_{\mu\nu} = \tfrac{1}{2}\, \epsilon_{\mu\nu\rho\sigma}\, F^{\rho\sigma}. \tag{9.18}$$

In terms of $\widetilde{F}_{\mu\nu}$, the second equation in (9.17) (i.e. the Bianchi identity) becomes

$$\partial_\mu \widetilde{F}^{\mu\nu} = 0. \tag{9.19}$$

From $F_{0i} = -E_i$ and $F_{ij} = \epsilon_{ijk}\, B_k$, it is easy to see that

$$\widetilde{F}_{0i} = B_i, \qquad \widetilde{F}_{ij} = \epsilon_{ijk}\, E_k. \tag{9.20}$$

It follows that $\widetilde{F}_{\mu\nu}$ is obtained from $F_{\mu\nu}$ by making the replacements

$$\mathbf{E} \longrightarrow -\mathbf{B}, \qquad \mathbf{B} \longrightarrow \mathbf{E}. \tag{9.21}$$

The symmetry between the two Maxwell equations would become even more striking if there were a current on the right-hand side of (9.19), analogous to the electric 4-current density on the right-hand side of the first Maxwell equation in (9.17). Since the rôles of $\mathbf{E}$ and $\mathbf{B}$ are exchanged when passing from $F_{\mu\nu}$ to $\widetilde{F}_{\mu\nu}$, it is evident that the 4-current needed
on the right-hand side of (9.19) must be a magnetic 4-current density, $J^\mu_M$. Let us now attach a subscript $E$ to the standard electric 4-current density, in order to emphasise which is which in the following. The generalised Maxwell equations will now be written as

$$\partial_\mu F^{\mu\nu} = -4\pi J^\nu_E, \qquad \partial_\mu \widetilde{F}^{\mu\nu} = -4\pi J^\nu_M. \tag{9.22}$$

Particles with magnetic charge, known as magnetic monopoles, have never been seen in nature. However, there seems to be no reason in principle why they should not exist, and it is of interest to explore their properties in a little more detail. A point electric charge $e$ has an electric field given by

$$\mathbf{E} = \frac{e\, \mathbf{r}}{r^3}. \tag{9.23}$$

Thus by analogy, a point magnetic monopole, with magnetic charge $g$, will have a magnetic field given by

$$\mathbf{B} = \frac{g\, \mathbf{r}}{r^3}. \tag{9.24}$$

This satisfies

$$\nabla\cdot\mathbf{B} = 4\pi \rho_M, \qquad \rho_M = g\, \delta^3(\mathbf{r}), \tag{9.25}$$

where $\rho_M = J^0_M$ is the magnetic charge density.

We shall be interested in studying the quantum mechanics of electrically-charged particles in the background of a magnetic monopole. Since the Schrödinger equation is written in terms of the potentials $\phi$ and $\mathbf{A}$, we shall therefore need to write down the 3-vector potential $\mathbf{A}$ for the magnetic monopole. To do this, we introduce Cartesian coordinates $(x, y, z)$, related to spherical polar coordinates $(r, \theta, \varphi)$ in the standard way,

$$x = r \sin\theta \cos\varphi, \qquad y = r \sin\theta \sin\varphi, \qquad z = r \cos\theta, \tag{9.26}$$

and we also define

$$\rho^2 = x^2 + y^2. \tag{9.27}$$

Consider the 3-vector potential

$$\mathbf{A} = g\, \Big( \frac{z y}{r \rho^2},\; -\frac{z x}{r \rho^2},\; 0 \Big). \tag{9.28}$$

Using

$$\frac{\partial r}{\partial x} = \frac{x}{r}, \quad \frac{\partial r}{\partial y} = \frac{y}{r}, \quad \frac{\partial r}{\partial z} = \frac{z}{r}, \qquad \frac{\partial \rho}{\partial x} = \frac{x}{\rho}, \quad \frac{\partial \rho}{\partial y} = \frac{y}{\rho}, \quad \frac{\partial \rho}{\partial z} = 0, \tag{9.29}$$
it is easily seen that

$$B_x = \partial_y A_z - \partial_z A_y = g\, \partial_z \Big( \frac{z x}{r \rho^2} \Big) = \frac{g x}{r \rho^2} - \frac{g x z^2}{r^3 \rho^2} = \frac{g x}{r^3}, \tag{9.30}$$

and similarly

$$B_y = \frac{g y}{r^3}, \qquad B_z = \frac{g z}{r^3}. \tag{9.31}$$

Thus indeed we find that

$$\nabla\times\mathbf{A} = \frac{g\, \mathbf{r}}{r^3}, \tag{9.32}$$

and so the 3-vector potential (9.28) describes the magnetic monopole field (9.24).

In terms of spherical polar coordinates we have $\rho^2 = x^2 + y^2 = r^2 \sin^2\theta$, and so (9.28) can be written as

$$\mathbf{A} = \frac{g \cot\theta}{r}\, (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.33}$$

Not surprisingly, this potential is singular at $r = 0$, since we are describing an idealised point magnetic charge. In exactly the same way, the potential $\phi = e/r$ describing a point electric charge diverges at $r = 0$ also. However, the potential (9.33) also diverges everywhere along the $z$ axis, i.e. at $\theta = 0$ and $\theta = \pi$. It turns out that these latter singularities are "unphysical," in the sense that they can be removed by making gauge transformations. This is not too surprising, when we note that the magnetic field itself, given by (9.24), has no singularity along the $z$ axis. It is, of course, genuinely divergent at $r = 0$, so that is a real physical singularity.

To see the unphysical nature of the singularities in (9.33) along $\theta = 0$ and $\theta = \pi$, we need to make gauge transformations, under which

$$\mathbf{A} \longrightarrow \mathbf{A} + \nabla\lambda. \tag{9.34}$$

Consider first taking

$$\lambda = g\, \varphi = g \arctan\frac{y}{x}. \tag{9.35}$$

From this, we find

$$\nabla\lambda = -\frac{g}{r}\, {\rm cosec}\,\theta\; (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.36}$$

Letting the gauge-transformed potential be $\mathbf{A}'$, we therefore find

$$\mathbf{A}' = \mathbf{A} + \nabla\lambda = \frac{g}{r}\, \frac{\cos\theta - 1}{\sin\theta}\, (\sin\varphi,\; -\cos\varphi,\; 0) = -\frac{g}{r}\, \tan\tfrac{1}{2}\theta\; (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.37}$$

It can be seen that $\mathbf{A}'$ is completely non-singular along $\theta = 0$ (i.e. along the positive $z$ axis). It is, however, singular along $\theta = \pi$ (i.e. along the negative $z$ axis).
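Both statements above, that $\nabla\times\mathbf{A}$ reproduces the monopole field (9.24) and that the gauge-transformed potential $\mathbf{A}'$ is regular on the positive $z$ axis, can be checked with a finite-difference curl. This is a sketch only: the function names are hypothetical, the Cartesian form of $\mathbf{A}'$ is obtained from (9.37) using $\tan\tfrac{1}{2}\theta = \rho/(r+z)$, and the charge $g$ and sample points are arbitrary.

```python
import math

def A_original(x, y, z, g):
    """Potential (9.28); singular along the whole z axis (rho = 0)."""
    r = math.sqrt(x * x + y * y + z * z)
    rho2 = x * x + y * y
    return (g * z * y / (r * rho2), -g * z * x / (r * rho2), 0.0)

def A_prime(x, y, z, g):
    """Potential (9.37) in Cartesian form; singular only where r + z = 0."""
    r = math.sqrt(x * x + y * y + z * z)
    return (-g * y / (r * (r + z)), g * x / (r * (r + z)), 0.0)

def B_monopole(x, y, z, g):
    """Monopole field (9.24)."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (g * x / r3, g * y / r3, g * z / r3)

def curl(F, x, y, z, g, h=1e-5):
    """Central-difference curl of the vector field F at (x, y, z)."""
    def d(i, axis):  # derivative of component i along the given axis
        pp, pm = [x, y, z], [x, y, z]
        pp[axis] += h
        pm[axis] -= h
        return (F(*pp, g)[i] - F(*pm, g)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

g = 1.5
point = (0.3, -0.5, 0.8)            # generic point, away from the z axis
on_axis = (0.0, 0.0, 1.0)           # point on the positive z axis

def max_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

err_A = max_diff(curl(A_original, *point, g), B_monopole(*point, g))
err_Ap = max_diff(curl(A_prime, *point, g), B_monopole(*point, g))
err_axis = max_diff(curl(A_prime, *on_axis, g), B_monopole(*on_axis, g))
print(err_A, err_Ap, err_axis)
```

Evaluating `A_original` at `on_axis` would divide by zero, whereas `A_prime` is perfectly finite there, which is the numerical counterpart of the string having been moved onto the negative $z$ axis.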
We could, instead, perform a gauge transformation with $\lambda$ given by

$$\lambda = -g\, \varphi = -g \arctan\frac{y}{x} \tag{9.38}$$

instead of (9.35). Defining the gauge-transformed potential as $\mathbf{A}''$ in this case, we find

$$\mathbf{A}'' = \frac{g}{r}\, \cot\tfrac{1}{2}\theta\; (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.39}$$

This time, we have obtained a gauge potential that is non-singular along $\theta = \pi$ (i.e. the negative $z$ axis), but it is singular along $\theta = 0$ (the positive $z$ axis).

There is no single choice of gauge in which the 3-vector potential for the magnetic monopole is completely free of singularities away from the origin $r = 0$. We have obtained two expressions for the vector potential, one of which, $\mathbf{A}'$, is non-singular along the positive $z$ axis, and the other, $\mathbf{A}''$, is non-singular along the negative $z$ axis. The singularity that each has is known as a "string singularity," since it lies along a line, or string. By making gauge transformations the location of the string can be moved around, but it can never be removed altogether.

In the discussion above, the $z$ axis appears to have played a preferred rôle, but this is, of course, just an artefact of our gauge choices. We could equally well have chosen a different expression for $\mathbf{A}$, related by a gauge transformation, for which the string singularity ran along any desired line, or curve, emanating from the origin.

9.3 Dirac quantisation condition

We have seen that gauge potentials for the magnetic monopole, free of singularities on the positive and negative $z$ axes respectively, are given by

$$\mathbf{A}' = -\frac{g}{r}\, \tan\tfrac{1}{2}\theta\; (\sin\varphi,\; -\cos\varphi,\; 0), \qquad \mathbf{A}'' = \frac{g}{r}\, \cot\tfrac{1}{2}\theta\; (\sin\varphi,\; -\cos\varphi,\; 0). \tag{9.40}$$

The two are themselves related by a gauge transformation, namely

$$\mathbf{A}'' = \mathbf{A}' + \nabla(-2 g \varphi). \tag{9.41}$$

Now let us consider the quantum mechanics of an electron in the background of the magnetic monopole. As we discussed in section 9.1, the Schrödinger equation for the electron is given by (9.4), where $e$ is its charge, and $m$ is its mass. We shall consider the Schrödinger equation in two different gauges, related as in (9.41). Denoting the corresponding electron wavefunctions by $\psi'$ and $\psi''$, we see from (9.9) and (9.41) that we shall have

$$\psi'' = e^{-2 i e g \varphi/\hbar}\, \psi'. \tag{9.42}$$
However, we have seen that the gauge transformation is not physical, but merely corresponds to shifting the string singularity of the magnetic monopole from the negative $z$ axis to the positive $z$ axis. Quantum mechanically, the physics will only be unchanged if the electron wavefunction remains single valued under a complete $2\pi$ rotation around the $z$ axis. This means that the phase factor in the relation (9.42) must be equal to unity when $\varphi$ increases by $2\pi$, and so it must be that

$$\frac{2 e g}{\hbar}\, 2\pi = 2\pi\, n, \tag{9.43}$$

where $n$ is an integer. Thus it must be that the product of the electric charge $e$ on the electron, and the magnetic charge $g$ on the magnetic monopole, must satisfy the so-called Dirac quantisation condition,

$$2 e g = n\, \hbar. \tag{9.44}$$

It is interesting to note that although a magnetic monopole has never been observed, it would only take the existence of a single monopole, maybe somewhere in another galaxy, to imply that electric charges everywhere in the universe must be quantised in units of

$$\frac{\hbar}{2 g}, \tag{9.45}$$

where $g$ is the magnetic charge of the lonely magnetic monopole. In fact all observed electric charges are indeed quantised; in integer multiples of the charge $e$ on the electron, in everyday life, and in units of $\tfrac{1}{3} e$ in the quarks of the theory of strong interactions. It is tempting to speculate that the reason for this may be the existence of a magnetic monopole somewhere out in the vastness of space, in a galaxy far far away.

10 Local Gauge Invariance and Yang-Mills Theory

10.1 Relativistic quantum mechanics

We saw in the previous section that the ordinary non-relativistic quantum mechanics of a charged particle in an electromagnetic field has the feature that it is covariant under electromagnetic gauge transformations, provided that the usual gauge transformation of the 4-vector potential is combined with a phase transformation of the wavefunction for the charged particle:

$$A_\mu \longrightarrow A_\mu + \partial_\mu \lambda, \qquad \psi \longrightarrow e^{i e\lambda/\hbar}\, \psi. \tag{10.1}$$

The essential point here is that the gauge transformation parameter $\lambda$ can be an arbitrary function of the spacetime coordinates, and so the phase transformation of the wavefunction
is a spacetime-dependent one. Such spacetime-dependent transformations are known as local transformations.

One could turn this around, and view the introduction of the electromagnetic field as the necessary addition to quantum mechanics in order to allow the theory to be covariant under local phase transformations of the wavefunction. If we started with quantum mechanics in the absence of electromagnetism, so that for a free particle of mass $m$ we have

$$-\frac{\hbar^2}{2m}\, \nabla^2 \psi = i \hbar\, \frac{\partial \psi}{\partial t} , \tag{10.2}$$

then the Schrödinger equation is obviously covariant under constant phase transformations of the wavefunction,

$$\psi \longrightarrow e^{i c}\, \psi , \tag{10.3}$$

where $c$ is an arbitrary constant. And, indeed, the physics described by the wavefunction is invariant under this phase transformation, since all physical quantities are constructed using a product of $\psi$ and its complex conjugate $\bar\psi$ (for example, the probability density $|\psi|^2$), for which the phase factors cancel out. Also, clearly, the Schrödinger equation (10.2) does not transform nicely under local phase transformations, since the derivatives will now land on the (spacetime-dependent) phase factor in (10.3) and give a lot of messy terms.

As we now know, the way to achieve a nice covariant transformation of the Schrödinger equation under local phase transformations is to replace the partial derivatives $\partial_i$ and $\partial_0$ in (10.2) by covariant derivatives

$$D_i = \partial_i - \frac{i e}{\hbar}\, A_i , \qquad D_0 = \partial_0 + \frac{i e}{\hbar}\, \phi , \tag{10.4}$$

where $A_i$ and $\phi$ transform in the standard way for the electromagnetic gauge potentials at the same time as the local phase transformation for $\psi$ is performed, as in (10.1). From this point of view, it could be said that we have derived electromagnetism as the field needed in order to allow the Schrödinger equation to transform covariantly under local phase transformations of the wavefunction.

The aim now is to extend this idea to more general situations. By again demanding covariance under local "phase" transformations of some quantum-mechanical equation, we will now be able to derive a generalisation of electromagnetism known as Yang-Mills theory.

Working with a non-relativistic equation like the Schrödinger equation is rather clumsy, because of the way in which space and time arise on such different footings. It is more elegant (and simpler) to switch at this point to the consideration of a relativistic quantum-mechanical equation. There are various possible equations one could consider, but they
all lead to equivalent conclusions about the generalisation of electromagnetism. Examples one could consider include the Dirac equation, which provides a relativistic description of the electron, or any other fermionic particle with spin $\tfrac{1}{2}$. A simpler option is to consider the Klein-Gordon equation for a relativistic particle of spin 0 (otherwise known as a scalar field). The Klein-Gordon equation for a free scalar field $\varphi$ with mass $m$ is very simple, namely

$$\Box \varphi - m^2 \varphi = 0 , \tag{10.5}$$

where $\Box = \partial^\mu \partial_\mu$ is the usual d'Alembertian operator, which, as we know, is Lorentz invariant.$^{19}$ Note that from now on, we shall use units where Planck's constant $\hbar$ is set equal to 1.

$^{19}$ The non-relativistic Schrödinger equation can be derived from the Klein-Gordon equation (10.5) in an appropriate limit. The leading-order time dependence of a field with mass (i.e. energy) $m$ will be $e^{-i m t}$ (in units where we set $\hbar = 1$). Thus the appropriate non-relativistic approximation is where the wavefunction $\varphi$ is assumed to be of the form $\varphi \sim e^{-i m t}\, \psi$, with $\psi$ only slowly varying in time, which means that the term $\partial^2\psi/\partial t^2$ can be neglected in comparison to the others. Substituting into (10.5) and dropping the $\partial^2\psi/\partial t^2$ term gives precisely the Schrödinger equation for the free massive particle, namely $-(1/2m)\, \nabla^2 \psi = i\, \partial\psi/\partial t$.

In what follows, we shall make the simplifying assumption that the scalar field is massless, and so its Klein-Gordon equation is simply

$$\Box \varphi = 0 . \tag{10.6}$$

We shall do this because no essential feature that we wish to explore will be lost, and it will slightly shorten the equations. It is completely straightforward to add the mass term back in if desired.$^{20}$

$^{20}$ Note that we can only discuss a non-relativistic limit for the massive Klein-Gordon equation. This is because the non-relativistic approximation (discussed in the previous footnote) involved assuming that each time derivative of $\psi$ with respect to $t$ was small compared with $m$ times $\psi$. Clearly this would no longer be true if $m$ were zero. Put another way, a massless particle is inherently relativistic, since it must travel at the speed of light (like the photon). We shall not be concerned with taking the non-relativistic limit in what follows, and so working with a massless field will not be a problem.

The Klein-Gordon equation (10.6) can be derived from the Lagrangian density

$$\mathcal{L} = -\tfrac{1}{2}\, \partial^\mu \varphi\, \partial_\mu \varphi . \tag{10.7}$$

Varying the action $I = \int d^4x\, \mathcal{L}$, we find

$$\delta I = -\int d^4x\, \partial^\mu \varphi\, \partial_\mu \delta\varphi = \int d^4x\, (\partial_\mu \partial^\mu \varphi)\, \delta\varphi , \tag{10.8}$$
(dropping the boundary term at infinity as usual), and so demanding that the action be stationary under arbitrary variations $\delta\varphi$ implies the Klein-Gordon equation (10.6).

Before moving on to the generalisation to Yang-Mills theory, we shall first review, again, the derivation of electromagnetism as the field needed in order to turn a global phase invariance into a local invariance, this time from the viewpoint of the relativistic Klein-Gordon equation. To do this, we first need to enlarge the system of wavefunctions from one real scalar to two. Suppose, then, we have two real scalars called $\varphi_1$ and $\varphi_2$, each satisfying a Klein-Gordon equation. These equations can therefore be derived from the Lagrangian density

$$\mathcal{L} = -\tfrac{1}{2}\, \partial^\mu \varphi_1\, \partial_\mu \varphi_1 - \tfrac{1}{2}\, \partial^\mu \varphi_2\, \partial_\mu \varphi_2 . \tag{10.9}$$

We can conveniently combine the two real fields into a complex scalar field $\phi$, defined by

$$\phi = \frac{1}{\sqrt{2}}\, (\varphi_1 + i\, \varphi_2) . \tag{10.10}$$

The Lagrangian density can then be written as

$$\mathcal{L} = -\tfrac{1}{2}\, \partial^\mu \bar\phi\, \partial_\mu \phi . \tag{10.11}$$

The complex field $\phi$ therefore satisfies the Klein-Gordon equation

$$\Box \phi = 0 . \tag{10.12}$$

It is clear that the complex field $\phi$ has a global phase invariance, under

$$\phi \longrightarrow e^{i \alpha}\, \phi , \tag{10.13}$$

where $\alpha$ is a constant. (The term "global" is used to describe such phase transformations, which are identical at every point in spacetime.) This can be seen at the level of the Klein-Gordon equation (10.12), since the constant phase factor simply passes straight through the d'Alembertian operator. It can also be seen at the level of the Lagrangian density, since again the derivatives do not land on the phase factor, and furthermore, the $e^{i\alpha}$ phase factor from transforming $\phi$ is cancelled by the $e^{-i\alpha}$ phase factor from transforming $\bar\phi$.

It is also clear that the Lagrangian density is not invariant under local phase transformations, where $\alpha$ is assumed now to be spacetime dependent. This is because we now have

$$\partial_\mu \phi \longrightarrow \partial_\mu (e^{i\alpha}\, \phi) = e^{i\alpha}\, \partial_\mu \phi + i\, e^{i\alpha}\, (\partial_\mu \alpha)\, \phi . \tag{10.14}$$

It is the second term, where the derivative lands on $\alpha$, that spoils the invariance.
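The statement that a local phase produces an extra term $i\, e^{i\alpha} (\partial_\mu \alpha)\, \phi$, while a global phase does not, can be checked numerically. The following is a minimal sketch in one coordinate using central finite differences. The field profile `phi`, the phase functions `alpha_local` and `alpha_global`, and the sample point `x0` are hypothetical choices made only for illustration; they are not taken from the notes.

```python
import cmath

def phi(x):
    # Hypothetical smooth complex field profile (illustration only).
    return cmath.exp(1j * 1.3 * x) * (1.0 + 0.5 * x * x)

def alpha_local(x):
    # Hypothetical spacetime-dependent phase parameter.
    return 0.7 * x + 0.2 * x * x

def alpha_global(x):
    # Constant phase parameter: a global transformation.
    return 0.7

def ddx(f, x, h=1e-5):
    # Central finite-difference approximation to df/dx.
    return (f(x + h) - f(x - h)) / (2 * h)

def extra_term(alpha, x):
    """Numerical value of d/dx(e^{i alpha} phi) - e^{i alpha} d(phi)/dx."""
    def transformed(y):
        return cmath.exp(1j * alpha(y)) * phi(y)
    return ddx(transformed, x) - cmath.exp(1j * alpha(x)) * ddx(phi, x)

x0 = 0.4
# Term expected from the product rule: i e^{i alpha} (d alpha/dx) phi.
predicted = 1j * cmath.exp(1j * alpha_local(x0)) * ddx(alpha_local, x0) * phi(x0)

local_extra = extra_term(alpha_local, x0)      # non-zero: spoils the invariance
local_residual = abs(local_extra - predicted)  # agreement with the product rule
global_extra = abs(extra_term(alpha_global, x0))  # vanishes up to rounding

print(abs(local_extra), local_residual, global_extra)
```

The design choice here is deliberately crude: finite differences avoid any symbolic-algebra dependency, at the cost of agreement only up to discretisation and rounding error.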
The remedy, not surprisingly, is to introduce a gauge potential $A_\mu$, and replace the partial derivatives by covariant derivatives

$$D_\mu = \partial_\mu - i\, e\, A_\mu , \tag{10.15}$$

where now $\phi$ will be interpreted as describing a complex scalar field with electric charge $e$. As we saw before when discussing the Schrödinger equation, the covariant derivative acting on $\phi$ has a nice transformation property under the local phase transformations of $\phi$, provided at the same time we transform $A_\mu$:

$$\phi \longrightarrow e^{i\alpha}\, \phi , \qquad A_\mu \longrightarrow A_\mu + \frac{1}{e}\, \partial_\mu \alpha . \tag{10.16}$$

This implies that $D_\mu \phi$ transforms nicely as

$$D_\mu \phi \longrightarrow e^{i\alpha}\, D_\mu \phi , \tag{10.17}$$

and so the new Lagrangian density

$$\mathcal{L} = -\tfrac{1}{2}\, \overline{(D^\mu \phi)}\, (D_\mu \phi) \tag{10.18}$$

is indeed invariant. This is the "derivation" of ordinary electromagnetism.

This viewpoint, in which we derive electromagnetism by requiring the local phase invariance of the theory under (10.13), has not yet given any dynamics to the gauge field $A_\mu$. Indeed, one cannot derive a dynamical existence for $A_\mu$, because in fact there is no unique answer. What one can do, however, is to introduce a dynamical term "by hand," which has all the natural properties one would like. First of all, we want a dynamical term that respects the gauge invariance we have already achieved in the rest of the Lagrangian. Secondly, we expect that it should give rise to a second-order dynamical equation for $A_\mu$. We are back to the discussion of section 4.2, where we derived Maxwell's equations from an action principle. The steps leading to the answer are as follows.

First, to make a gauge-invariant term we need to use the gauge-invariant field strength

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{10.19}$$

as the basic "building block." Then, to make a Lorentz-invariant term, the lowest-order possibility is to form the quadratic invariant $F^{\mu\nu} F_{\mu\nu}$. Taking the standard normalisation as discussed in section 4.2, we are therefore led to propose the total Lagrangian density

$$\mathcal{L} = -\tfrac{1}{2}\, \overline{(D^\mu \phi)}\, (D_\mu \phi) - \frac{1}{16\pi}\, F^{\mu\nu} F_{\mu\nu} , \tag{10.20}$$
where $D_\mu = \partial_\mu - i\, e\, A_\mu$. It is easily verified that the Euler-Lagrange equations resulting from this Lagrangian density are as follows. Requiring the stationarity of the action under variations of the wavefunction $\phi$ implies

$$D^\mu D_\mu \phi = 0 , \qquad \text{i.e.} \qquad (\partial^\mu - i\, e A^\mu)(\partial_\mu - i\, e A_\mu)\, \phi = 0 . \tag{10.21}$$

This is the gauge-covariant generalisation of the original uncharged Klein-Gordon equation. Requiring stationarity under variations of $A_\mu$ implies

$$\partial_\mu F^{\mu\nu} = -4\pi\, J^\nu , \tag{10.22}$$

where

$$J_\mu = -i\, e\, \big( \bar\phi\, D_\mu \phi - \overline{(D_\mu \phi)}\, \phi \big) . \tag{10.23}$$

Thus $A_\mu$ satisfies the Maxwell field equation, with a source current density given by (10.23). This is exactly what one would hope for: the complex field $\phi$ carries electric charge $e$, and so it is to be expected that it should act as a source for the electromagnetic field. In the process of giving dynamics to the electromagnetic field we have, as a bonus, derived the current density for the scalar field.

10.2 Yang-Mills theory

At the end of the previous subsection we rederived electromagnetism as the field needed in order to turn the global phase invariance of a complex scalar field that satisfies the Klein-Gordon equation into a local phase invariance. The phase factor $e^{i\alpha}$ is a unit-modulus complex number. The set of all unit-modulus complex numbers forms the group $U(1)$, i.e. the $1 \times 1$ complex matrices $U$ satisfying $U^\dagger U = 1$. (For $1 \times 1$ matrices, which are just numbers, there is of course no distinction between Hermitean conjugation, denoted by a dagger, and complex conjugation.)

In order to derive the generalisation of electromagnetism to Yang-Mills theory we need to start with an extended system of scalar fields, each satisfying the Klein-Gordon equation, whose Lagrangian is invariant under a larger, non-abelian, group.$^{21}$ We shall take the example of the group $SU(2)$ in order to illustrate the basic ideas. One can in fact construct a Yang-Mills theory based on any Lie group.

$^{21}$ An abelian group is one where the order of combination of group elements makes no difference. By contrast, for a non-abelian group, if two elements $U$ and $V$ are combined in the two different orderings, the results are, in general, different. Thus, for example, for a group realised by matrices under multiplication, one has in general that $U V \neq V U$.
The group $SU(2)$ should be familiar from quantum mechanics, where it arises when one discusses systems with intrinsic spin $\tfrac{1}{2}$. The group can be defined as the set of $2 \times 2$ complex matrices $U$ subject to the conditions

$$U^\dagger U = 1 , \qquad \det U = 1 . \tag{10.24}$$

It can therefore be parameterised in the form

$$U = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} , \tag{10.25}$$

where $a$ and $b$ are complex numbers subject to the constraint

$$|a|^2 + |b|^2 = 1 . \tag{10.26}$$

If we write $a = x_1 + i\, x_2$, $b = x_3 + i\, x_4$, the constraint is described by the surface

$$x_1^2 + x_2^2 + x_3^2 + x_4^2 = 1 \tag{10.27}$$

in Euclidean 4-space, and so the elements of the group $SU(2)$ are in one-to-one correspondence with the points on a unit 3-dimensional sphere. Clearly $SU(2)$ has three independent parameters.$^{22}$

$^{22}$ For comparison, the group $U(1)$, whose elements $U$ can be parameterised as $U = e^{i\alpha}$ with $0 \le \alpha < 2\pi$, has one parameter. Since $e^{i\alpha}$ is periodic in $\alpha$, the elements of $U(1)$ are in one-to-one correspondence with the points on a unit circle, or 1-dimensional sphere, $S^1$. In fact the circle, $S^1$, and the 3-sphere, $S^3$, are the only spheres that are isomorphic to groups.

The group $SU(2)$ can be generated by exponentiating the three Pauli matrices $\tau_a$, where

$$\tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{10.28}$$

They satisfy the commutation relations

$$[\tau_a, \tau_b] = 2 i\, \epsilon_{abc}\, \tau_c , \tag{10.29}$$

i.e. $[\tau_1, \tau_2] = 2 i\, \tau_3$, and cyclic permutations. Let

$$T_a = \frac{1}{2i}\, \tau_a . \tag{10.30}$$

We shall therefore have

$$[T_a, T_b] = \epsilon_{abc}\, T_c . \tag{10.31}$$
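The commutation relations for the Pauli matrices and for the rescaled generators $T_a = \tau_a/(2i)$ can be verified by direct matrix multiplication. The sketch below uses plain nested lists for $2 \times 2$ matrices; the helper names (`matmul`, `comm`, `scale`, `eps`, `algebra_holds`) are hypothetical and introduced only for this illustration.

```python
def matmul(a, b):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comm(a, b):
    # Commutator [a, b] = ab - ba.
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def eps(a, b, c):
    # Levi-Civita symbol for indices drawn from {0, 1, 2}.
    return (a - b) * (b - c) * (c - a) / 2

# Pauli matrices tau_1, tau_2, tau_3 (stored with indices 0, 1, 2).
tau = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Rescaled generators T_a = tau_a / (2i).
T = [scale(1 / 2j, t) for t in tau]

def algebra_holds(gens, factor):
    """Check [g_a, g_b] = factor * eps_abc * g_c for every pair (a, b)."""
    for a in range(3):
        for b in range(3):
            rhs = [[0, 0], [0, 0]]
            for c in range(3):
                term = scale(factor * eps(a, b, c), gens[c])
                rhs = [[rhs[i][j] + term[i][j] for j in range(2)]
                       for i in range(2)]
            if not close(comm(gens[a], gens[b]), rhs):
                return False
    return True

pauli_ok = algebra_holds(tau, 2j)  # [tau_a, tau_b] = 2i eps_abc tau_c
T_ok = algebra_holds(T, 1)         # [T_a, T_b] = eps_abc T_c
print(pauli_ok, T_ok)
```

The rescaling by $1/(2i)$ is what removes the factor $2i$ from the structure constants, at the price of making the generators anti-Hermitean rather than Hermitean.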
Note that the $T_a$, which are called the generators of the Lie algebra of $SU(2)$, are anti-Hermitean,

$$T_a^\dagger = -T_a . \tag{10.32}$$

They are also, of course, traceless. The $SU(2)$ group elements can be written as

$$U = e^{\alpha_a T_a} , \tag{10.33}$$

where $\alpha_a$ are three real parameters. (This is the analogue of writing the $U(1)$ elements $U$ as $U = e^{i\alpha}$.) It is easy to check that the unitarity of $U$, i.e. $U^\dagger U = 1$, follows from the anti-Hermiticity of the generators $T_a$:

$$U^\dagger U = \big( e^{\alpha_a T_a} \big)^\dagger\, e^{\alpha_b T_b} = e^{\alpha_a T_a^\dagger}\, e^{\alpha_b T_b} = e^{-\alpha_a T_a}\, e^{\alpha_b T_b} = 1 . \tag{10.34}$$

The unit-determinant property follows from the tracelessness of the $T_a$, bearing in mind that for any matrix $X$ we have $\det X = \exp(\mathrm{tr} \log X)$:

$$\det U = \det(e^{\alpha_a T_a}) = \exp[\mathrm{tr} \log(e^{\alpha_a T_a})] = \exp[\mathrm{tr}(\alpha_a T_a)] = \exp[0] = 1 . \tag{10.35}$$

Suppose now that we take a pair of complex scalar fields, called $\phi_1$ and $\phi_2$, each of which satisfies the massless Klein-Gordon equation. We may assemble them into a complex 2-vector, which we shall call $\phi$:

$$\phi = \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} . \tag{10.36}$$

This vector-valued field therefore satisfies the Klein-Gordon equation

$$\Box \phi = 0 , \tag{10.37}$$

which can be derived from the Lagrangian density

$$\mathcal{L} = -(\partial^\mu \phi^\dagger)(\partial_\mu \phi) . \tag{10.38}$$

It is obvious that the Lagrangian density (10.38) is invariant under global $SU(2)$ transformations

$$\phi \longrightarrow U \phi , \tag{10.39}$$

where $U$ is a constant $SU(2)$ matrix. Thus, we have

$$\mathcal{L} \longrightarrow -\partial^\mu (\phi^\dagger U^\dagger)\, \partial_\mu (U \phi) = -(\partial^\mu \phi^\dagger)\, U^\dagger U\, (\partial_\mu \phi) = -(\partial^\mu \phi^\dagger)(\partial_\mu \phi) = \mathcal{L} . \tag{10.40}$$
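The unitarity and unit-determinant properties of $U = e^{\alpha_a T_a}$, and the invariance of a contraction of the form $\phi^\dagger \phi$ under $\phi \to U\phi$ that underlies (10.40), can be illustrated numerically. The sketch below builds the exponential from a truncated power series; the helper names, the parameter values in `alpha`, and the sample doublet `phi` are hypothetical choices for illustration only.

```python
I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
TAU = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# Anti-Hermitean generators T_a = tau_a / (2i).
T = [[[e / 2j for e in row] for row in t] for t in TAU]

def mm(a, b):
    # Product of two 2x2 matrices.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    # Hermitean conjugate.
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def expm(x, terms=40):
    # Matrix exponential from the truncated series sum_n x^n / n!.
    result = [row[:] for row in I2]
    term = [row[:] for row in I2]
    for n in range(1, terms):
        prod = mm(term, x)
        term = [[e / n for e in row] for row in prod]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def su2(alpha):
    # Group element U = exp(alpha_a T_a) for three real parameters.
    x = [[sum(alpha[a] * T[a][i][j] for a in range(3)) for j in range(2)]
         for i in range(2)]
    return expm(x)

alpha = (0.3, -1.1, 0.8)  # arbitrary illustrative parameters
U = su2(alpha)

UdU = mm(dagger(U), U)
unitarity_error = max(abs(UdU[i][j] - I2[i][j])
                      for i in range(2) for j in range(2))
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]

# Parameterisation (10.25): U = [[a, b], [-conj(b), conj(a)]].
shape_error = max(abs(U[1][0] + U[0][1].conjugate()),
                  abs(U[1][1] - U[0][0].conjugate()))

# phi^dagger phi is unchanged by phi -> U phi because U^dagger U = 1.
phi = [complex(0.2, -1.0), complex(1.5, 0.4)]
U_phi = [sum(U[i][k] * phi[k] for k in range(2)) for i in range(2)]
norm_before = sum(abs(c) ** 2 for c in phi)
norm_after = sum(abs(c) ** 2 for c in U_phi)

print(unitarity_error, abs(det_U - 1), shape_error, abs(norm_before - norm_after))
```

A truncated series is adequate here because the matrices involved have norm of order one; for larger parameters a dedicated matrix-exponential routine would be the safer choice.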
Obviously, $\mathcal{L}$ would not be invariant if we allowed $U$ to be spacetime dependent, for the usual reason that we would get extra terms where the derivatives landed on the $U$ transformation matrix. Based on our experience with the local $U(1)$ phase invariance of the theory coupled to electromagnetism, we can expect that again we could achieve a local $SU(2)$ invariance by introducing appropriate analogues of the electromagnetic field. In this case, since the $SU(2)$ group is characterised by 3 parameters rather than the 1 parameter characterising $U(1)$, we can expect that we will need 3 gauge fields rather than 1. We shall call these $A^a_\mu$, where $1 \le a \le 3$. In fact it is convenient to assemble the three gauge fields into a $2 \times 2$ matrix, by defining

$$A_\mu = A^a_\mu\, T_a , \tag{10.41}$$

where $T_a$ are the generators of the $SU(2)$ algebra that we introduced earlier.

We next define the covariant derivative $D_\mu$, whose action on the complex 2-vector of scalar fields $\phi$ is defined by

$$D_\mu \phi = \partial_\mu \phi + A_\mu \phi . \tag{10.42}$$

Since we don't, a priori, know how $A_\mu$ should transform, we shall work backwards and demand that its transformation rule should be such that $D_\mu$ satisfies the nice property we should expect of a covariant derivative in this case, namely that if we transform $\phi$ under a local $SU(2)$ transformation

$$\phi \longrightarrow \phi' = U \phi , \tag{10.43}$$

then we should also have that

$$(D_\mu \phi) \longrightarrow (D_\mu \phi)' = U (D_\mu \phi) . \tag{10.44}$$

Working this out, we shall have

$$\begin{aligned} (D_\mu \phi)' = D'_\mu \phi' &= (\partial_\mu + A'_\mu)(U \phi) \\ &= (\partial_\mu U)\, \phi + U\, \partial_\mu \phi + A'_\mu U \phi \\ &= U D_\mu \phi = U\, \partial_\mu \phi + U A_\mu \phi . \end{aligned} \tag{10.45}$$

Equating the last two lines, and noting that we want this to be true for all possible $\phi$, we conclude that

$$\partial_\mu U + A'_\mu U = U A_\mu . \tag{10.46}$$

Multiplying on the right with $U^\dagger$ then gives the result that

$$A'_\mu = U A_\mu U^\dagger - (\partial_\mu U)\, U^\dagger . \tag{10.47}$$
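The covariance property $D'_\mu (U\phi) = U D_\mu \phi$ can be tested numerically along a single coordinate, with derivatives replaced by central finite differences. This is only a sketch: the local transformation `U(x)`, the potential `A(x)`, the doublet `phi(x)` and the sample point are hypothetical functions chosen for illustration. The `inhomogeneous` flag switches off the $-(\partial_\mu U) U^\dagger$ term, to show that the covariance fails without it.

```python
import math

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
TAU = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
T = [[[e / 2j for e in row] for row in t] for t in TAU]  # T_a = tau_a / (2i)

def mm(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mv(a, v):
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]

def lin(c1, a, c2, b):
    # Linear combination c1*a + c2*b of 2x2 matrices.
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def expm(x, terms=40):
    # Matrix exponential from a truncated power series.
    result, term = I2, I2
    for n in range(1, terms):
        term = [[e / n for e in row] for row in mm(term, x)]
        result = lin(1, result, 1, term)
    return result

def algebra(coeffs):
    # Element c_a T_a of the SU(2) Lie algebra.
    out = [[0j, 0j], [0j, 0j]]
    for c, t in zip(coeffs, T):
        out = lin(1, out, c, t)
    return out

def U(x):    # hypothetical local SU(2) transformation
    return expm(algebra([0.5 * x, math.sin(x), 0.3 * x * x]))

def A(x):    # hypothetical gauge potential component A = A^a T_a
    return algebra([math.cos(x), 0.4 * x, 1.0 + x])

def phi(x):  # hypothetical scalar doublet
    return [complex(1.0 + x, 0.5 * x * x), complex(math.sin(x), 2.0 - x)]

def d_mat(f, x, h=1e-5):
    return lin(1 / (2 * h), f(x + h), -1 / (2 * h), f(x - h))

def d_vec(f, x, h=1e-5):
    p, m = f(x + h), f(x - h)
    return [(p[i] - m[i]) / (2 * h) for i in range(2)]

def covariance_residual(x, inhomogeneous=True):
    """Largest component of D'(U phi) - U D phi at the point x."""
    u, a = U(x), A(x)
    a_new = mm(mm(u, a), dagger(u))                        # U A U^dagger
    if inhomogeneous:
        a_new = lin(1, a_new, -1, mm(d_mat(U, x), dagger(u)))  # - (dU) U^dagger
    def phi_new(y):
        return mv(U(y), phi(y))                            # phi' = U phi
    d_new, a_phi_new = d_vec(phi_new, x), mv(a_new, phi_new(x))
    lhs = [d_new[i] + a_phi_new[i] for i in range(2)]      # D' phi'
    d_old, a_phi = d_vec(phi, x), mv(a, phi(x))
    rhs = mv(u, [d_old[i] + a_phi[i] for i in range(2)])   # U D phi
    return max(abs(lhs[i] - rhs[i]) for i in range(2))

residual = covariance_residual(0.6)
residual_naive = covariance_residual(0.6, inhomogeneous=False)
print(residual, residual_naive)
```

With the full rule the residual is at the level of the finite-difference error, while the homogeneous rule $A' = U A U^\dagger$ alone leaves a mismatch of order one.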
This, then, will be the gauge transformation rule for the Yang-Mills potentials $A_\mu$. In other words, the full set of local $SU(2)$ transformations comprise$^{23}$

$$A_\mu \longrightarrow A'_\mu = U A_\mu U^\dagger - (\partial_\mu U)\, U^\dagger , \qquad \phi \longrightarrow \phi' = U \phi . \tag{10.48}$$

$^{23}$ Note that this non-abelian result, which takes essentially the same form for any group, reduces to the previous case of electromagnetic theory if we specialise to the abelian group $U(1)$. Essentially, we would just write $U = e^{i\alpha}$ and plug into the transformations (10.48). Since left and right multiplication are the same in the abelian case, the previous results for electromagnetic gauge invariance can be recovered.

What we have established is that if we replace the Lagrangian density (10.38) by

$$\mathcal{L} = -(D^\mu \phi)^\dagger (D_\mu \phi) , \tag{10.49}$$

then it will be invariant under the local $SU(2)$ transformations given by (10.48). The proof is now identical to the previous proof of the invariance of (10.38) under global $SU(2)$ transformations. The essential point is that the local transformation matrix $U$ "passes through" the covariant derivative, in the sense that $(D_\mu \phi)' = D'_\mu (U \phi) = U D_\mu \phi$.

So far, we have succeeded in constructing a theory with a local $SU(2)$ symmetry, but as yet, the Yang-Mills potentials $A^a_\mu$ that we introduced do not have any dynamics of their own. Following the strategy we applied to the case of electromagnetism and local $U(1)$ invariance, we should now look for a suitable term to add to the Lagrangian density (10.49) that will do the job. Guided by the example of electromagnetism, we should first find a field strength tensor for the Yang-Mills fields, which will be the analogue of the electromagnetic field strength

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{10.50}$$

It is clear that the expression (10.50) is not suitable in the Yang-Mills case. If one were to try adopting (10.50) as a definition for the field strength, then a simple calculation shows that under the $SU(2)$ gauge transformation for $A_\mu$ given in (10.48), the field strength would transform into a complete mess. It turns out that the appropriate generalisation that is needed for Yang-Mills is to define

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu] . \tag{10.51}$$

Of course this would reduce to the standard electromagnetic field strength in the abelian $U(1)$ case, since the commutator $[A_\mu, A_\nu] \equiv A_\mu A_\nu - A_\nu A_\mu$ would then vanish.
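The first task is to check how $F_{\mu\nu}$ defined by (10.51) transforms under the $SU(2)$ gauge transformation of $A_\mu$ given in (10.48). We shall have

$$\begin{aligned} F_{\mu\nu} \longrightarrow F'_{\mu\nu} &= \partial_\mu A'_\nu + A'_\mu A'_\nu - (\mu \leftrightarrow \nu) , \\ &= \partial_\mu \big( U A_\nu U^\dagger - (\partial_\nu U) U^\dagger \big) + \big( U A_\mu U^\dagger - (\partial_\mu U) U^\dagger \big) \big( U A_\nu U^\dagger - (\partial_\nu U) U^\dagger \big) - (\mu \leftrightarrow \nu) , \\ &= (\partial_\mu U) A_\nu U^\dagger + U (\partial_\mu A_\nu) U^\dagger - U A_\nu U^\dagger (\partial_\mu U) U^\dagger - (\partial_\mu \partial_\nu U) U^\dagger + (\partial_\nu U) U^\dagger (\partial_\mu U) U^\dagger \\ &\quad + U A_\mu U^\dagger U A_\nu U^\dagger - U A_\mu U^\dagger (\partial_\nu U) U^\dagger - (\partial_\mu U) U^\dagger U A_\nu U^\dagger + (\partial_\mu U) U^\dagger (\partial_\nu U) U^\dagger - (\mu \leftrightarrow \nu) , \\ &= U \big( \partial_\mu A_\nu - \partial_\nu A_\mu + A_\mu A_\nu - A_\nu A_\mu \big) U^\dagger , \end{aligned} \tag{10.52}$$

where the notation $-(\mu \leftrightarrow \nu)$ means that one subtracts off from the terms written explicitly the same set of terms with the indices $\mu$ and $\nu$ exchanged. Comparing with (10.51) we see that the upshot is that under the $SU(2)$ gauge transformation for $A_\mu$ given in (10.48), the field strength $F_{\mu\nu}$ defined in (10.51) transforms as

$$F_{\mu\nu} \longrightarrow F'_{\mu\nu} = U F_{\mu\nu} U^\dagger . \tag{10.53}$$

This means that $F_{\mu\nu}$ transforms covariantly under $SU(2)$ gauge transformations. It would, of course, reduce to the invariance of the electromagnetic field strength ($F'_{\mu\nu} = F_{\mu\nu}$) in the abelian case.

It is now a straightforward matter to write down a suitable term to add to the Lagrangian density (10.49). As for electromagnetism, we want a gauge-invariant and Lorentz-invariant quantity that is quadratic in fields. Thus we shall take

$$\mathcal{L} = -(D^\mu \phi)^\dagger (D_\mu \phi) + \frac{1}{8\pi}\, \mathrm{tr}(F^{\mu\nu} F_{\mu\nu}) . \tag{10.54}$$

The proof that $\mathrm{tr}(F^{\mu\nu} F_{\mu\nu})$ is gauge invariant is very simple; under the $SU(2)$ gauge transformation we shall have

$$\mathrm{tr}(F^{\mu\nu} F_{\mu\nu}) \longrightarrow \mathrm{tr}(F'^{\mu\nu} F'_{\mu\nu}) = \mathrm{tr}(U F^{\mu\nu} U^\dagger U F_{\mu\nu} U^\dagger) = \mathrm{tr}(U F^{\mu\nu} F_{\mu\nu} U^\dagger) = \mathrm{tr}(F^{\mu\nu} F_{\mu\nu} U^\dagger U) = \mathrm{tr}(F^{\mu\nu} F_{\mu\nu}) . \tag{10.55}$$

The equations of motion for the $\phi$ and $A_\mu$ fields can be derived from (10.54) in the standard way, as the Euler-Lagrange equations that follow from requiring that the action $I = \int d^4x\, \mathcal{L}$ be stationary under variations of $\phi$ and $A_\mu$ respectively. First, let us just consider the source-free Yang-Mills equations that will result if we just consider the Lagrangian density for the Yang-Mills fields alone,

$$\mathcal{L}_{YM} = \frac{1}{8\pi}\, \mathrm{tr}(F^{\mu\nu} F_{\mu\nu}) . \tag{10.56}$$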
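The covariant transformation $F'_{\mu\nu} = U F_{\mu\nu} U^\dagger$ and the gauge invariance of $\mathrm{tr}(F^{\mu\nu} F_{\mu\nu})$ can be checked numerically in a two-coordinate toy setting, where the only independent component is $F_{xy}$. The sketch below uses nested central finite differences, so agreement is only up to discretisation error. The functions `U(x, y)` and `A(mu, x, y)` and the sample point are hypothetical choices made for illustration.

```python
import math

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
TAU = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
T = [[[e / 2j for e in row] for row in t] for t in TAU]  # T_a = tau_a / (2i)

def mm(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(c1, a, c2, b):
    # Linear combination c1*a + c2*b of 2x2 matrices.
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def expm(x, terms=40):
    # Matrix exponential from a truncated power series.
    result, term = I2, I2
    for n in range(1, terms):
        term = [[e / n for e in row] for row in mm(term, x)]
        result = lin(1, result, 1, term)
    return result

def algebra(coeffs):
    out = [[0j, 0j], [0j, 0j]]
    for c, t in zip(coeffs, T):
        out = lin(1, out, c, t)
    return out

def U(x, y):       # hypothetical local SU(2) transformation
    return expm(algebra([0.4 * x * y, math.sin(x), 0.3 * y]))

def A(mu, x, y):   # hypothetical potentials; mu = 0 (x) or mu = 1 (y)
    if mu == 0:
        return algebra([math.cos(y), 0.5 * x, x * y])
    return algebra([0.2 * x * x, math.sin(x + y), 1.0])

def partial(f, mu, x, y, h=1e-4):
    # Central difference of a matrix-valued function along direction mu.
    dx, dy = (h, 0.0) if mu == 0 else (0.0, h)
    return lin(1 / (2 * h), f(x + dx, y + dy), -1 / (2 * h), f(x - dx, y - dy))

def A_new(mu, x, y):
    """Gauge-transformed potential A' = U A U^dagger - (dU) U^dagger."""
    u = U(x, y)
    ud = dagger(u)
    return lin(1, mm(mm(u, A(mu, x, y)), ud), -1, mm(partial(U, mu, x, y), ud))

def field_strength(pot, x, y):
    """F_xy = d_x A_y - d_y A_x + [A_x, A_y] for the given potential."""
    ax, ay = pot(0, x, y), pot(1, x, y)
    curl = lin(1, partial(lambda s, t: pot(1, s, t), 0, x, y),
               -1, partial(lambda s, t: pot(0, s, t), 1, x, y))
    return lin(1, curl, 1, lin(1, mm(ax, ay), -1, mm(ay, ax)))

def trace(a):
    return a[0][0] + a[1][1]

x0, y0 = 0.3, -0.5
u0 = U(x0, y0)
F = field_strength(A, x0, y0)
F_new = field_strength(A_new, x0, y0)
expected = mm(mm(u0, F), dagger(u0))  # U F U^dagger

covariance_residual = max(abs(F_new[i][j] - expected[i][j])
                          for i in range(2) for j in range(2))
trace_residual = abs(trace(mm(F_new, F_new)) - trace(mm(F, F)))
print(covariance_residual, trace_residual)
```

The commutator term in `field_strength` is essential: dropping it leaves only the "curl" part, which does not transform as $U F U^\dagger$ under a local transformation.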
Writing $I_{YM} = \int d^4x\, \mathcal{L}_{YM}$, we shall have

$$\begin{aligned} \delta I_{YM} &= \frac{1}{8\pi}\, 2\, \mathrm{tr} \int d^4x\, \delta F_{\mu\nu}\, F^{\mu\nu} , \\ &= \frac{1}{4\pi}\, \mathrm{tr} \int d^4x\, \big( \partial_\mu \delta A_\nu - \partial_\nu \delta A_\mu + [\delta A_\mu, A_\nu] + [A_\mu, \delta A_\nu] \big) F^{\mu\nu} , \\ &= \frac{1}{2\pi}\, \mathrm{tr} \int d^4x\, \big( \partial_\mu \delta A_\nu + [A_\mu, \delta A_\nu] \big) F^{\mu\nu} , \\ &= \frac{1}{2\pi}\, \mathrm{tr} \int d^4x\, \big( -\delta A_\nu\, \partial_\mu F^{\mu\nu} + A_\mu\, \delta A_\nu\, F^{\mu\nu} - \delta A_\nu\, A_\mu\, F^{\mu\nu} \big) , \\ &= -\frac{1}{2\pi}\, \mathrm{tr} \int d^4x\, \delta A_\nu \big( \partial_\mu F^{\mu\nu} + [A_\mu, F^{\mu\nu}] \big) , \end{aligned} \tag{10.57}$$

and so requiring that the action be stationary implies

$$\partial_\mu F^{\mu\nu} + [A_\mu, F^{\mu\nu}] = 0 . \tag{10.58}$$

These are the source-free Yang-Mills equations, which are the generalisation of the source-free Maxwell equations $\partial_\mu F^{\mu\nu} = 0$. Obviously the Yang-Mills equations reduce to the Maxwell equations in the abelian $U(1)$ case.

If we now include the $-(D^\mu \phi)^\dagger (D_\mu \phi)$ term in the above calculation, we shall find that

$$\delta I = \int d^4x\, \big( \phi^\dagger\, \delta A_\nu\, D^\nu \phi - (D^\nu \phi)^\dagger\, \delta A_\nu\, \phi \big) - \frac{1}{2\pi}\, \mathrm{tr} \int d^4x\, \delta A_\nu \big( \partial_\mu F^{\mu\nu} + [A_\mu, F^{\mu\nu}] \big) . \tag{10.59}$$

Requiring stationarity under the variations $\delta A_\nu$ now gives the Yang-Mills equations with sources,

$$\partial_\mu F^{\mu\nu} + [A_\mu, F^{\mu\nu}] = 2\pi\, J^\nu , \tag{10.60}$$

where

$$J_\mu = (D_\mu \phi)\, \phi^\dagger - \phi\, (D_\mu \phi)^\dagger . \tag{10.61}$$

Note that this is a $2 \times 2$ matrix current, as it should be, since the Hermitean-conjugated 2-vector sits to the right of the unconjugated 2-vector.$^{24}$

$^{24}$ It is helpful to introduce an index notation to label the rows and columns of the $2 \times 2$ matrices, in order to verify that (10.61) is the correct expression.

This completes this brief introduction to Yang-Mills theory. As far as applications are concerned, it is fair to say that Yang-Mills theory lies at the heart of modern fundamental physics. The weak nuclear force is described by the Weinberg-Salam model, based on the Yang-Mills gauge group $SU(2)$. The W and Z bosons, which have been seen in particle accelerators such as the one at CERN, are the $SU(2)$ gauge fields. The strong nuclear force is described by a Yang-Mills theory with $SU(3)$ gauge group, and the 8 gauge fields associated with this theory are the gluons that mediate the strong interactions. Thus one may say that almost all of modern particle physics relies upon Yang-Mills theory.