
Special Relativity

Newtonian mechanics is invariant under the set of Galilean transformations,

\[
\begin{aligned}
\mathbf{x}' &= \mathbf{x} - \mathbf{v}t \\
t' &= t
\end{aligned}
\]

where v is any constant velocity, because the second law only involves the force and the acceleration. For
the acceleration we have
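\[
\begin{aligned}
\mathbf{a}' &= \frac{d^2 \mathbf{x}'}{dt'^2} \\
&= \frac{d^2 (\mathbf{x} - \mathbf{v}t)}{dt^2} \\
&= \frac{d^2 \mathbf{x}}{dt^2} - \mathbf{v}\frac{d^2 t}{dt^2} \\
&= \mathbf{a}
\end{aligned}
\]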

Now suppose the force is the gradient of a potential that only depends on the separations of particles,
\[
\mathbf{F} = -\nabla \sum_{i,j} V(\mathbf{x}_i - \mathbf{x}_j)
\]

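Then in any other Galilean frame, we have both
\[
\begin{aligned}
\nabla'_i &= \frac{\partial}{\partial x'_i} \\
&= \sum_j \frac{\partial x_j}{\partial x'_i}\frac{\partial}{\partial x_j} + \frac{\partial t}{\partial x'_i}\frac{\partial}{\partial t} \\
&= \sum_j \frac{\partial (x'_j + v_j t)}{\partial x'_i}\frac{\partial}{\partial x_j} + \frac{\partial t}{\partial x'_i}\frac{\partial}{\partial t} \\
&= \sum_j \delta_{ij}\frac{\partial}{\partial x_j} \\
&= \frac{\partial}{\partial x_i}
\end{aligned}
\]
(the time term drops out because $t = t'$ does not depend on $x'_i$), so that $\nabla' = \nabla$, and
\[
\begin{aligned}
\mathbf{x}'_i - \mathbf{x}'_j &= (\mathbf{x}_i - \mathbf{v}t) - (\mathbf{x}_j - \mathbf{v}t) \\
&= \mathbf{x}_i - \mathbf{x}_j
\end{aligned}
\]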

Newton’s second law is therefore invariant under Galilean transformations.


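The same is not true of electrodynamics. For example, in the absence of sources, Maxwell's equations lead to the wave equation,
\[
-\frac{1}{c^2}\frac{\partial^2 \psi}{\partial t^2} + \nabla^2 \psi = 0
\]
for the fields and potentials. If the same equation is to hold in a second Galilean frame, then rewriting it in terms of the original coordinates gives
\[
\begin{aligned}
0 &= -\frac{1}{c^2}\frac{\partial^2}{\partial t'^2}\psi(\mathbf{x}', t') + \nabla'^2 \psi(\mathbf{x}', t') \\
&= -\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\psi(\mathbf{x} - \mathbf{v}t, t) + \nabla^2 \psi(\mathbf{x} - \mathbf{v}t, t) \\
&= -\frac{1}{c^2}\frac{\partial}{\partial t}\left(\frac{\partial}{\partial t}\psi(\mathbf{x} - \mathbf{v}t, t) - \sum_j v_j \frac{\partial}{\partial x_j}\psi(\mathbf{x} - \mathbf{v}t, t)\right) + \nabla^2 \psi(\mathbf{x} - \mathbf{v}t, t) \\
&= -\frac{1}{c^2}\left(\frac{\partial^2}{\partial t^2}\psi - 2\sum_j v_j \frac{\partial}{\partial x_j}\frac{\partial}{\partial t}\psi + \sum_{j,k} v_j v_k \frac{\partial}{\partial x_j}\frac{\partial}{\partial x_k}\psi\right) + \nabla^2 \psi(\mathbf{x} - \mathbf{v}t, t) \\
&= -\frac{1}{c^2}\frac{\partial^2 \psi(\mathbf{x}', t)}{\partial t^2} + \frac{2}{c^2}\,\mathbf{v}\cdot\nabla\frac{\partial \psi}{\partial t}(\mathbf{x}', t) - \frac{1}{c^2}(\mathbf{v}\cdot\nabla)(\mathbf{v}\cdot\nabla)\psi(\mathbf{x}', t) + \nabla^2 \psi(\mathbf{x} - \mathbf{v}t, t)
\end{aligned}
\]
The extra velocity-dependent terms show that the wave equation does not keep its form under Galilean transformations.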

Postulates of special relativity


Special relativity is a combination of two fundamental ideas: the equivalence of inertial frames, and the
invariance of the speed of light. Inertial frames are the same in relativistic mechanics as they are in Newtonian
mechanics, i.e., frames of reference (sets of orthonormal basis vectors) in which Newton’s second law holds.
Newton’s second law
F = ma
is unchanged by 10 distinct transformations:

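\[
\begin{aligned}
x'_i &= \sum_j O_{ij} x_j && \text{Rotations} \\
\mathbf{x}' &= \mathbf{x} + \mathbf{a} && \text{Translations} \\
t' &= t + t_0 && \text{Origin of time} \\
\mathbf{x}' &= \mathbf{x} + \mathbf{v}t && \text{Boosts (change of velocity)}
\end{aligned}
\]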

There are three independent parameters describing rotations (for example, specify a direction in space by
giving two angles (θ, ϕ) then specify a third angle, ψ, of rotation around that direction). Translations can
be in the x, y or z directions, giving three more parameters. Three components for the velocity vector and
one more to specify the origin of time gives a total of 10 parameters. If we know one frame of reference in
which Newton’s second law holds, then these transformations give us a 10 parameter family of equivalent
frames of reference. Einstein’s first postulate is that these inertial frames of reference are indistinguishable.
This means that there is no such thing as absolute rest. We can say that two frames move with constant
relative velocity, but it is incorrect to say that one is at rest and the other moves.
The constancy of the speed of light, or perhaps better, the existence of a limiting velocity, is demonstrated
by the Michelson-Morley experiment. By measuring the speed of light in two different directions, at different
times of the year so that the motion relative to the “fixed stars” is different for each, they found no effect of
the motion on the travel time of the light. Difficulties explaining this, and especially difficulties when models
were based on the disturbance travelling in a medium, lend support to the idea that light always travels in
empty space with the limiting velocity, c. Notice that this is in dramatic conflict with our normal idea of
addition of velocities. In Euclidean 3-space, if observer A moves with velocity v with respect to observer B,
and A throws a ball with velocity u, then the velocity of the ball with respect to B is u + v. But according
to this postulate of special relativity, if A shines a light beam it travels with speed c relative to both A and
B. We must have $c\hat{\mathbf{n}} + \mathbf{v} = c\hat{\mathbf{n}}'$, regardless of the directions of the unit vectors $\hat{\mathbf{n}}$, $\hat{\mathbf{n}}'$.
With some basic assumptions about the nature of space, the whole theory follows from the two postulates:
1. The laws of physics are the same in all inertial frames of reference.
2. There is a limiting velocity to all physical phenomena, c, found experimentally to be the speed at which
light travels in vacuum (and theorized to be the speed at which gravitational waves travel in nearly
flat spacetime). This velocity is independent of inertial frame, so that if in one inertial frame an object
moves with speed c, then it moves with speed c in all inertial frames.
There are some other basic ideas we will use. Since there is strong evidence for the conservation of momentum,
with momenta additively conserved, we still need the physical arena to be a vector space, with linear
combinations of vectors giving other vectors. Also, we need notions of straight lines and distance. We will
assume that un-forced particles travel in straight lines, along their initial direction. What constitutes the
length turns out to be the central difference between classical and relativistic models.
Another approach to these supplementary assumptions is given in problem 11.1. The assumption that
spacetime is homogeneous and isotropic places strong constraints on the allowed transformations, since the
transformation cannot depend on location or time. This approach also rules out position or time dependence
of a scale factor, Λ, (see below). However, just as the 2-dimensional surface of a sphere is isotropic and
homogeneous, there exist constant curvature 4-dimensional spaces which are homogeneous and isotropic, so
some further assumption is required.

Lorentz transformations
In order for a position, x, and time, t, to describe a vector in every frame of reference, we need to restrict
possible transformations to linear transformations. Only linear transformations preserve the additivity properties of vectors. This means that the position and time in any two frames of reference must be related by
a matrix,
\[
\begin{pmatrix} x' \\ y' \\ z' \\ t' \end{pmatrix}
=
\begin{pmatrix}
M_{00} & M_{01} & M_{02} & M_{03} \\
M_{10} & M_{11} & M_{12} & M_{13} \\
M_{20} & M_{21} & M_{22} & M_{23} \\
M_{30} & M_{31} & M_{32} & M_{33}
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ t \end{pmatrix}
\]
It is convenient to write this in index form, and it simplifies the notation if we define
\[
\begin{aligned}
x_0 &= ct \\
x_1 &= x \\
x_2 &= y \\
x_3 &= z
\end{aligned}
\]
where c is the postulated universal physical constant with units of velocity. Then we may write the transformation as
\[
x'_a = \sum_{b=0}^{3} M_{ab}\, x_b
\]

Now consider two inertial frames, with origins coinciding at time $t = t' = 0$, in which a pulse of light is
emitted at time $t = 0$. In each frame, picture an expanding spherical wave, with radius $ct$ in one frame and $ct'$ in the other. Then we must have
\[
\begin{aligned}
ct &= \sqrt{x^2 + y^2 + z^2} \\
ct' &= \sqrt{x'^2 + y'^2 + z'^2}
\end{aligned}
\]
Write these relations as
\[
\begin{aligned}
x^2 + y^2 + z^2 - c^2 t^2 &= 0 \\
x'^2 + y'^2 + z'^2 - c^2 t'^2 &= 0
\end{aligned}
\]

Each of these must hold if the other does, and since the primed and unprimed coordinates are linearly related
to one another, they must be proportional,

\[
x^2 + y^2 + z^2 - c^2 t^2 = \Lambda(v)\left(x'^2 + y'^2 + z'^2 - c^2 t'^2\right)
\]
To restrict $\Lambda$, suppose we relate $x'_a$ to a third frame, $x''_a$. If the relative velocity is $u$, then we must have
\[
\begin{aligned}
x^2 + y^2 + z^2 - c^2 t^2 &= \Lambda(v)\left(x'^2 + y'^2 + z'^2 - c^2 t'^2\right) \\
&= \Lambda(v)\,\Lambda(u)\left(x''^2 + y''^2 + z''^2 - c^2 t''^2\right)
\end{aligned}
\]
Now choose $u = -v$, so that we are back to the original frame, $x''_a = x_a$. Then we require
\[
\Lambda(v)\,\Lambda(-v) = 1
\]
and therefore
\[
\Lambda(v) = e^{f(v)}
\]
where f (−v) = −f (v). Conventionally, we take f (v) = 0, but there exist generalizations of relativity
involving nontrivial factors. Setting Λ = 1 is equivalent to assuming that clocks maintain the same rate as
they move from place to place. However, as long as the clock rates in different places are related by a single
multiplicative function, there is no measurable effect comparing magnitudes of times that could demonstrate
it. From here on, we will take f (v) = 0 and Λ = 1.
We therefore define
\[
s^2 \equiv x^2 + y^2 + z^2 - c^2 t^2
\]
and require $s'^2 = s^2$ between any two inertial frames. This equivalence defines the Lorentz transformations.
Any linear transformation preserving the quantity $s^2$ is a Lorentz transformation.
We will find all Lorentz transformations below, but for now we check that for motion of $O'$ along the
positive x-axis of $O$, we have
\[
\begin{aligned}
ct' &= \gamma\,(ct - \beta x) \\
x' &= \gamma\,(x - \beta ct) \\
y' &= y \\
z' &= z
\end{aligned}
\]
where
\[
\gamma \equiv \frac{1}{\sqrt{1 - \beta^2}}, \qquad \beta \equiv \frac{v}{c}
\]
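Substituting into $s'^2$ we have
\[
\begin{aligned}
s'^2 &= x'^2 + y'^2 + z'^2 - c^2 t'^2 \\
&= \left[\gamma (x - \beta ct)\right]^2 + y^2 + z^2 - \left[\gamma (ct - \beta x)\right]^2 \\
&= \gamma^2\left(x^2 - 2\beta x ct + \beta^2 c^2 t^2\right) + y^2 + z^2 - \gamma^2\left(c^2 t^2 - 2\beta x ct + \beta^2 x^2\right) \\
&= \left(\gamma^2 - \gamma^2\beta^2\right)x^2 + \left(2\gamma^2\beta - 2\gamma^2\beta\right)x ct - \left(\gamma^2 - \gamma^2\beta^2\right)c^2 t^2 + y^2 + z^2
\end{aligned}
\]
and since
\[
\begin{aligned}
\gamma^2 - \gamma^2\beta^2 &= \left(\frac{1}{\sqrt{1 - \beta^2}}\right)^2\left(1 - \beta^2\right) \\
&= \frac{1}{1 - \beta^2}\left(1 - \beta^2\right) \\
&= 1
\end{aligned}
\]
we have
\[
\begin{aligned}
s'^2 &= \left(\gamma^2 - \gamma^2\beta^2\right)x^2 + \left(2\gamma^2\beta - 2\gamma^2\beta\right)x ct - \left(\gamma^2 - \gamma^2\beta^2\right)c^2 t^2 + y^2 + z^2 \\
&= x^2 - c^2 t^2 + y^2 + z^2 \\
&= s^2
\end{aligned}
\]
proving that the transformation is a Lorentz transformation.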
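The algebra above can also be spot-checked numerically. The following is a minimal sketch, not part of the original notes: the function names `boost_x` and `interval` and the sample event are illustrative assumptions, and units are chosen so that the first coordinate is already ct.

```python
import math

def boost_x(ct, x, y, z, beta):
    """Boost along +x with beta = v/c, assuming |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)

def interval(ct, x, y, z):
    """s^2 = x^2 + y^2 + z^2 - c^2 t^2."""
    return x**2 + y**2 + z**2 - ct**2

event = (2.0, 1.0, -0.5, 3.0)            # arbitrary sample (ct, x, y, z)
boosted = boost_x(*event, beta=0.6)

# The two printed values should agree up to rounding error.
print(interval(*event), interval(*boosted))
```

Any other event or any |beta| < 1 should give the same agreement, since the cancellation does not depend on the particular numbers.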
Notice that we can use a hyperbolic substitution to rewrite the Lorentz transformation. Define the
rapidity, ζ, by
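\[
\beta \equiv \tanh\zeta
\]
Then
\[
\begin{aligned}
\gamma &= \frac{1}{\sqrt{1 - \beta^2}} \\
&= \frac{1}{\sqrt{1 - \tanh^2\zeta}} \\
&= \frac{1}{\sqrt{1 - \frac{\sinh^2\zeta}{\cosh^2\zeta}}} \\
&= \frac{\cosh\zeta}{\sqrt{\cosh^2\zeta - \sinh^2\zeta}} \\
&= \cosh\zeta
\end{aligned}
\]
and $\gamma\beta = \sinh\zeta$. Then, with $x_0 = ct$, we have
\[
\begin{aligned}
x'_0 &= x_0\cosh\zeta - x_1\sinh\zeta \\
x'_1 &= -x_0\sinh\zeta + x_1\cosh\zeta \\
x'_2 &= x_2 \\
x'_3 &= x_3
\end{aligned}
\]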
The similarity to a rotation,
\[
\begin{aligned}
x' &= x\cos\theta - y\sin\theta \\
y' &= x\sin\theta + y\cos\theta
\end{aligned}
\]
is not accidental, but will become clear when we find all Lorentz transformations.
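Now suppose the velocity is in an arbitrary direction, $\boldsymbol{\beta}$. We can project the position coordinate $\mathbf{x}$ parallel
and perpendicular to $\boldsymbol{\beta}$,
\[
\begin{aligned}
\mathbf{x}_\parallel &= \frac{1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{x})\,\boldsymbol{\beta} \\
\mathbf{x}_\perp &= \mathbf{x} - \frac{1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{x})\,\boldsymbol{\beta}
\end{aligned}
\]
with $\mathbf{x} = \mathbf{x}_\perp + \mathbf{x}_\parallel$. The component $\mathbf{x}_\parallel$ will behave just like the x-direction in the formula above, while the
perpendicular directions $\mathbf{x}_\perp$ will be unchanged. The time transforms as before, so we have
\[
\begin{aligned}
ct' &= \gamma\,(ct - \boldsymbol{\beta}\cdot\mathbf{x}) \\
\mathbf{x}'_\parallel &= \gamma\left(\mathbf{x}_\parallel - \boldsymbol{\beta}ct\right) \\
\mathbf{x}'_\perp &= \mathbf{x}_\perp
\end{aligned}
\]
The last two may be combined as
\[
\begin{aligned}
\mathbf{x}' &= \mathbf{x}'_\parallel + \mathbf{x}'_\perp \\
&= \gamma\left(\mathbf{x}_\parallel - \boldsymbol{\beta}ct\right) + \mathbf{x}_\perp \\
&= \gamma\left(\frac{1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{x})\,\boldsymbol{\beta} - \boldsymbol{\beta}ct\right) + \mathbf{x} - \frac{1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{x})\,\boldsymbol{\beta} \\
&= \mathbf{x} + \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{x})\,\boldsymbol{\beta} - \gamma\boldsymbol{\beta}ct
\end{aligned}
\]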
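As a sanity check on the general-direction boost, the sketch below implements the combined formula and confirms that the interval is unchanged. The helper names `boost` and `interval` are illustrative assumptions rather than anything defined in these notes, and the boost velocity is assumed to be nonzero with magnitude below 1 (in units of c).

```python
import math

def boost(ct, x, beta):
    """General boost of (ct, x); x and beta are 3-tuples, 0 < |beta| < 1."""
    b2 = sum(b * b for b in beta)                    # beta^2
    gamma = 1.0 / math.sqrt(1.0 - b2)
    bx = sum(b * xi for b, xi in zip(beta, x))       # beta . x
    ct_new = gamma * (ct - bx)
    # x' = x + [(gamma - 1)/beta^2 (beta . x) - gamma ct] beta
    coeff = (gamma - 1.0) / b2 * bx - gamma * ct
    x_new = tuple(xi + coeff * b for xi, b in zip(x, beta))
    return ct_new, x_new

def interval(ct, x):
    """s^2 = |x|^2 - c^2 t^2."""
    return sum(xi * xi for xi in x) - ct**2

ct, x = 2.0, (1.0, -0.5, 3.0)                        # arbitrary sample event
ct_b, x_b = boost(ct, x, (0.3, -0.2, 0.5))
print(interval(ct, x), interval(ct_b, x_b))          # should agree up to rounding
```

Setting the boost velocity to (beta, 0, 0) reduces this to the x-direction transformation given earlier.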
Spacetime
We now consider properties of the 4-dimensional physical arena called spacetime. The defining properties are
that it is a 4-dimensional vector space in which the squared length of any vector, from $(x_1, y_1, z_1, ct_1)$ to
$(x_2, y_2, z_2, ct_2)$, is given by
\[
s^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 - c^2 (t_2 - t_1)^2
\]
We have seen that the value of $s^2$ is independent of the inertial frame of reference. If the time term
$c^2 (t_2 - t_1)^2$ is larger than the squared spatial separation, so that $s^2 < 0$, we use the equivalent length
\[
c^2\tau^2 = c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2
\]
To distinguish from 3-dimensional names, $s$ is called the proper length and $\tau$ is called the proper time.

Contravariant Vectors
We will discuss the reasons for this notation later, but from now on, the coordinate labels will be written
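raised. Thus, for Greek indices $\alpha, \beta, \ldots \in (0, 1, 2, 3)$, we write
\[
x^\alpha = \left(x^0, x^1, x^2, x^3\right)
\]
where $x^0 = ct$ and $x^i$ for Latin indices $i = 1, 2, 3$ are the usual spatial $(x, y, z)$. This means that use of
the Greek or Latin alphabet tells us whether an object is four or three dimensional. We know how the
coordinates $x^\alpha$ change when we change to a different frame of reference. We now define a (contravariant)
vector, or 4-vector, to be any set of four quantities,
\[
\begin{aligned}
A^\alpha &= \left(A^0, A^1, A^2, A^3\right) \\
&= \left(A^0, A^i\right) \\
&= \left(A^0, \mathbf{A}\right)
\end{aligned}
\]
that transform in the same way, i.e.,
\[
\begin{aligned}
A'^0 &= \gamma\left(A^0 - \boldsymbol{\beta}\cdot\mathbf{A}\right) \\
\mathbf{A}'_\parallel &= \gamma\left(\mathbf{A}_\parallel - \boldsymbol{\beta}A^0\right) \\
\mathbf{A}'_\perp &= \mathbf{A}_\perp
\end{aligned}
\]
It follows immediately that the quantity
\[
\left\|A^\alpha\right\|^2 = -\left(A^0\right)^2 + \mathbf{A}\cdot\mathbf{A}
\]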

is the same in any inertial reference frame. This is the length of the 4-vector Aα . We will give alternative
notation for this later.
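Now let $A^\alpha$ and $B^\alpha$ be any two 4-vectors. Their scalar product or inner product is given by
\[
-A^0 B^0 + \mathbf{A}\cdot\mathbf{B}
\]
This is seen to be invariant by noting that $A^\alpha + B^\alpha$ is also a vector, and writing it as
\[
\begin{aligned}
-A^0 B^0 + \mathbf{A}\cdot\mathbf{B} &= \frac{1}{2}\left[\left(-\left(A^0 + B^0\right)^2 + (\mathbf{A} + \mathbf{B})\cdot(\mathbf{A} + \mathbf{B})\right) - \left(-\left(A^0\right)^2 + \mathbf{A}\cdot\mathbf{A}\right) - \left(-\left(B^0\right)^2 + \mathbf{B}\cdot\mathbf{B}\right)\right] \\
&= \frac{1}{2}\left(\left\|A^\alpha + B^\alpha\right\|^2 - \left\|A^\alpha\right\|^2 - \left\|B^\alpha\right\|^2\right)
\end{aligned}
\]
Since each of the three terms on the right is invariant (that is, unchanged by change of reference frame), their
combination is invariant too, so the inner product is the same in every inertial frame.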
More generally, if we write the general form of a Lorentz transformation as
\[
x'^\alpha = \sum_{\beta=0}^{3} M^\alpha{}_\beta\, x^\beta
\]
then a 4-vector is any set of four functions $A^\alpha$ which transform as
\[
A'^\alpha = \sum_{\beta=0}^{3} M^\alpha{}_\beta\, A^\beta
\]

Causality
In graphing spacetime, time is generally taken as the vertical axis. Points in spacetime are called events and
denoted P (t, x). The invariant separation between two events P (t1 , x1 ) , P (t2 , x2 ) is given by the invariant
interval
\[
s^2 = |\mathbf{x}_1 - \mathbf{x}_2|^2 - c^2 (t_1 - t_2)^2
\]
or
\[
c^2\tau^2 = c^2 (t_1 - t_2)^2 - |\mathbf{x}_1 - \mathbf{x}_2|^2
\]
whichever is positive. When $s^2 > 0$ the separation is called spacelike and when $\tau^2 > 0$ the separation is
called timelike. When $s^2 = c^2\tau^2 = 0$, the separation of the two events is called lightlike or null.
The minus sign between the time and space coordinates in the expression for the interval is responsible
for causal relations in spacetime. Consider the lightlike lines from any fixed spacetime event, P. This set of
null lines is called the light cone, and its position in spacetime is agreed on by all inertial observers. As a
result, the region above these lines is agreed by all inertial observers to occur at later times (larger values
of t) and constitutes the future of P . Events lying below the lowest set of lightlike lines are agreed to have
earlier values of t, and this region is therefore called the past of P . The remaining points of spacetime are
called elsewhere.
Any object travelling from P at the speed of light must follow a null curve; objects travelling slower than
the speed of light follow curves contained inside the future light cone. Moreover, the path of a particle
travelling slower than the speed of light lies in the future of every event on its path. Such a path is called
the world line of the particle and is said to be a timelike curve.
Suppose P1 and P2 are two events on the world line of a particle. Then there exists a frame of reference
in which P1 and P2 occur at the same spatial location. In this frame of reference,
\[
\begin{aligned}
c^2\tau_{12}^2 &= c^2 (t_1 - t_2)^2 - |\mathbf{x}_1 - \mathbf{x}_2|^2 \\
&= c^2 (t_1 - t_2)^2
\end{aligned}
\]
so that in this particular frame of reference, the proper time interval equals the difference in time coordinates,
$\tau_{12} = t_1 - t_2$.
Similarly, suppose two events P1 and P2 have spacelike separation
\[
s_{12}^2 = |\mathbf{x}_1 - \mathbf{x}_2|^2 - c^2 (t_1 - t_2)^2 > 0
\]
Then there exists a frame of reference in which the two events occur at the same value of t, and the proper
interval becomes equal to the spatial separation of the events:
\[
s_{12}^2 = |\mathbf{x}_1 - \mathbf{x}_2|^2
\]

Now consider the world line of a particle. We know (and will demonstrate later) that such a particle
always moves with speed less than c. The proper time along its world line is the physical time for the particle.
Consider two infinitesimally separated points on the world line. Choose a frame of reference (any will do!)
and specify the position of the particle in that frame of reference by $\mathbf{x}(t)$, so that the infinitesimal change
in proper time is
\[
\begin{aligned}
d\tau &= \sqrt{dt^2 - \frac{1}{c^2}\,d\mathbf{x}^2} \\
&= dt\,\sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{x}}{dt}\right)^2}
\end{aligned}
\]
We may now integrate along the world line between any two events A, B, to find the elapsed proper time for
the particle,
\[
\begin{aligned}
\tau_{AB} &= \int_{t_A}^{t_B} dt\,\sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{x}}{dt}\right)^2} \\
&= \int_{t_A}^{t_B} dt\,\sqrt{1 - \frac{v^2(t)}{c^2}}
\end{aligned}
\]

This shows that the elapsed time for physical processes depends on the motion.
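To make the dependence on the motion concrete, the proper-time integral can be evaluated numerically for any speed profile. This is a minimal sketch under stated assumptions: the function name `proper_time`, the midpoint rule, and the constant-speed example are illustrative choices, not part of the notes.

```python
import math

def proper_time(speed, t_a, t_b, c=1.0, steps=10000):
    """Midpoint-rule estimate of the integral of sqrt(1 - v(t)^2/c^2) dt.

    `speed` is a function returning |v| at coordinate time t, with |v| < c.
    """
    dt = (t_b - t_a) / steps
    total = 0.0
    for k in range(steps):
        t = t_a + (k + 0.5) * dt
        total += math.sqrt(1.0 - (speed(t) / c) ** 2) * dt
    return total

# A particle moving at constant speed 0.8c for 10 units of coordinate time
# should accumulate sqrt(1 - 0.64) * 10 = 6 units of proper time.
tau = proper_time(lambda t: 0.8, 0.0, 10.0)
print(tau)
```

A particle at rest in the chosen frame recovers the coordinate time itself, and any motion makes the result smaller.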

Addition of 4-velocities
The ordinary 3-velocity no longer adds as a vector, because it does not transform linearly under boosts. To
see this, consider an infinitesimal boost in the x-direction,

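\[
\begin{aligned}
c\,dt' &= \gamma\,(c\,dt - \beta\,dx) \\
dx' &= \gamma\,(dx - \beta c\,dt) \\
dy' &= dy \\
dz' &= dz
\end{aligned}
\]
If a particle moves with velocity
\[
u = \frac{dx}{dt}
\]
in the initial frame, then in the primed frame it moves with velocity
\[
\begin{aligned}
\frac{dx'}{dt'} &= \frac{\gamma\,(dx - \beta c\,dt)}{\gamma\left(dt - \frac{1}{c}\beta\,dx\right)} \\
&= \frac{\frac{dx}{dt} - \beta c}{1 - \frac{1}{c}\beta\frac{dx}{dt}} \\
&= \frac{u - v}{1 - \frac{uv}{c^2}}
\end{aligned}
\]
so the velocities do not add,
\[
u' \neq u - v
\]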
We can see this in the vector case as well. Using the differential of the general transform

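\[
\begin{aligned}
c\,dt' &= \gamma\,(c\,dt - \boldsymbol{\beta}\cdot d\mathbf{x}) \\
d\mathbf{x}' &= d\mathbf{x} + \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot d\mathbf{x})\,\boldsymbol{\beta} - \gamma\boldsymbol{\beta}c\,dt
\end{aligned}
\]
we find
\[
\begin{aligned}
\mathbf{u}' &= \frac{d\mathbf{x}'}{dt'} \\
&= \frac{d\mathbf{x} + \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot d\mathbf{x})\,\boldsymbol{\beta} - \gamma\boldsymbol{\beta}c\,dt}{\gamma\left(dt - \frac{1}{c}\boldsymbol{\beta}\cdot d\mathbf{x}\right)} \\
&= \frac{\frac{d\mathbf{x}}{dt} + \frac{\gamma - 1}{\beta^2}\left(\boldsymbol{\beta}\cdot\frac{d\mathbf{x}}{dt}\right)\boldsymbol{\beta} - \gamma\boldsymbol{\beta}c}{\gamma\left(1 - \frac{1}{c}\boldsymbol{\beta}\cdot\frac{d\mathbf{x}}{dt}\right)} \\
&= \frac{\mathbf{u} + \gamma\,(\hat{\mathbf{v}}\cdot\mathbf{u})\,\hat{\mathbf{v}} - (\hat{\mathbf{v}}\cdot\mathbf{u})\,\hat{\mathbf{v}} - \gamma\mathbf{v}}{\gamma\left(1 - \frac{\mathbf{v}\cdot\mathbf{u}}{c^2}\right)} \\
&= \frac{\mathbf{u} - (\hat{\mathbf{v}}\cdot\mathbf{u})\,\hat{\mathbf{v}}}{\gamma\left(1 - \frac{\mathbf{v}\cdot\mathbf{u}}{c^2}\right)} - \frac{\mathbf{v} - (\hat{\mathbf{v}}\cdot\mathbf{u})\,\hat{\mathbf{v}}}{1 - \frac{\mathbf{v}\cdot\mathbf{u}}{c^2}} \\
&= \frac{\mathbf{u}_\parallel - \mathbf{v}}{1 - \frac{\mathbf{v}\cdot\mathbf{u}}{c^2}} + \frac{\mathbf{u}_\perp}{\gamma\left(1 - \frac{\mathbf{v}\cdot\mathbf{u}}{c^2}\right)}
\end{aligned}
\]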

so that not only does the component of u along v not simply add, the perpendicular component is modified
nonlinearly as well.
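The final expression for the transformed 3-velocity is easy to get wrong by hand, so a short numerical sketch may help. The function name `transform_velocity` and the sample velocities are illustrative assumptions; the frame velocity v is assumed nonzero with |v| < c, and c = 1 by default.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def transform_velocity(u, v, c=1.0):
    """3-velocity u as seen from a frame moving with 3-velocity v (v != 0)."""
    v2 = dot(v, v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    uv = dot(u, v)
    denom = 1.0 - uv / c**2
    u_par = tuple(uv / v2 * vi for vi in v)            # component of u along v
    u_perp = tuple(ui - pi for ui, pi in zip(u, u_par))
    return tuple((p - vi) / denom + q / (gamma * denom)
                 for p, vi, q in zip(u_par, v, u_perp))

# A light ray moving along y, seen from a frame moving along x at 0.6c:
u_new = transform_velocity((0.0, 1.0, 0.0), (0.6, 0.0, 0.0))
print(u_new, math.sqrt(dot(u_new, u_new)))   # direction changes, speed stays c
```

The perpendicular component picks up the factor 1/γ while the speed of the light ray remains c, which is the aberration behaviour implied by the formula.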
Nonetheless, spacetime is a vector space, and we should be able to add vectors. The problem here is
that u does not transform as a contravariant vector because the time derivative is taken with respect to the
coordinate time instead of the proper time. If we use the time, τ , which all inertial observers agree on, then
it is natural to define the 4-velocity as
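\[
U^\alpha = \frac{dx^\alpha}{d\tau}
\]
This is easily seen to be a 4-vector. Since $d\tau' = d\tau$, we have
\[
\begin{aligned}
U'^\alpha &= \frac{dx'^\alpha}{d\tau'} \\
&= \frac{dx'^\alpha}{d\tau} \\
&= \frac{d}{d\tau}\left(\sum_\beta M^\alpha{}_\beta\, x^\beta\right) \\
&= \sum_\beta M^\alpha{}_\beta\,\frac{dx^\beta}{d\tau} \\
&= \sum_\beta M^\alpha{}_\beta\, U^\beta
\end{aligned}
\]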

and this is exactly the transformation law for a 4-vector. Since the transformation is linear, linear combina-
tions of 4-vectors will also transform as 4-vectors.
To see that this gives the correct result, we write the sum of two 4-vectors in terms of their time and
space components. Using the relationship between time and proper time,
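\[
d\tau = dt\,\sqrt{1 - \frac{v^2}{c^2}}
\]
derived above, and the notation $x^\alpha = (ct, \mathbf{x})$ we have for a single 4-vector,
\[
\begin{aligned}
U^\alpha &= \frac{dx^\alpha}{d\tau} \\
&= \frac{dx^\alpha}{\sqrt{1 - \frac{v^2}{c^2}}\;dt} \\
&= \gamma\left(c\,\frac{dt}{dt}, \frac{d\mathbf{x}}{dt}\right) \\
&= \gamma\,(c, \mathbf{v})
\end{aligned}
\]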

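Every 4-velocity can be written in this way for some 3-velocity, $\mathbf{v}$. Now recall that the importance of this
transformation law is that the magnitude of 4-vectors is Lorentz invariant. This means that $\left\|U^\alpha\right\|^2$ must be
some number that all inertial observers agree on. We can check this directly:
\[
\begin{aligned}
\left\|U^\alpha\right\|^2 &= -\left(U^0\right)^2 + \left(U^1\right)^2 + \left(U^2\right)^2 + \left(U^3\right)^2 \\
&= -\gamma^2 c^2 + \gamma^2\,\mathbf{v}\cdot\mathbf{v} \\
&= -\frac{1}{1 - \beta^2}\left(c^2 - \mathbf{v}\cdot\mathbf{v}\right) \\
&= -\frac{c^2}{1 - \beta^2}\left(1 - \frac{v^2}{c^2}\right) \\
&= -c^2
\end{aligned}
\]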

The 4-velocity is therefore a normalized 4-vector.
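The normalization can be confirmed numerically for any sub-light 3-velocity. In this minimal sketch the names `four_velocity` and `norm_squared` are illustrative assumptions, and the norm uses the same sign convention as the calculation above (minus sign on the time component).

```python
import math

def four_velocity(v, c=1.0):
    """U = gamma (c, v) for a 3-velocity v given as a tuple, |v| < c."""
    v2 = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    return (gamma * c,) + tuple(gamma * vi for vi in v)

def norm_squared(U):
    """||U||^2 = -(U^0)^2 + U.U (spatial part)."""
    return -U[0] ** 2 + sum(ui * ui for ui in U[1:])

U = four_velocity((0.3, -0.4, 0.5))
print(norm_squared(U))    # should equal -c^2 = -1 up to rounding
```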


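Now suppose a particle with 4-velocity
\[
\begin{aligned}
U^\alpha &= \gamma_u\,(c, \mathbf{u}) \\
&= c\,(\cosh\zeta_u, \mathbf{n}\sinh\zeta_u)
\end{aligned}
\]
where
\[
\begin{aligned}
\beta &\equiv \tanh\zeta \\
\gamma &\equiv \cosh\zeta \\
\gamma\beta &\equiv \sinh\zeta
\end{aligned}
\]
in a given initial frame, is viewed from a second frame of reference moving with 3-velocity $\mathbf{v} = v\,\hat{\mathbf{i}}$. Taking the particle to move along the x-axis as well, $\mathbf{n} = \hat{\mathbf{i}}$, a
Lorentz transformation of $U^\alpha$ gives the velocity in the new frame,
\[
\begin{aligned}
c\cosh\zeta' &= U^0\cosh\zeta_v - U^1\sinh\zeta_v \\
&= c\,(\cosh\zeta_u\cosh\zeta_v - \sinh\zeta_u\sinh\zeta_v) \\
&= c\cosh(\zeta_u - \zeta_v) \\
c\sinh\zeta' &= c\,(-\cosh\zeta_u\sinh\zeta_v + \sinh\zeta_u\cosh\zeta_v) \\
&= c\sinh(\zeta_u - \zeta_v) \\
U'^2 &= 0 \\
U'^3 &= 0
\end{aligned}
\]
so the rapidities simply add,
\[
\zeta' = \zeta_u - \zeta_v
\]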

In terms of the velocities,
\[
\begin{aligned}
\beta' &= \tanh\zeta' \\
&= \tanh(\zeta_u - \zeta_v) \\
&= \frac{\sinh\zeta_u\cosh\zeta_v - \cosh\zeta_u\sinh\zeta_v}{\cosh\zeta_u\cosh\zeta_v - \sinh\zeta_u\sinh\zeta_v}
\end{aligned}
\]
Dividing both the numerator and the denominator by $\cosh\zeta_u\cosh\zeta_v$ gives
\[
\begin{aligned}
\beta' &= \frac{\tanh\zeta_u - \tanh\zeta_v}{1 - \tanh\zeta_u\tanh\zeta_v} \\
&= \frac{\beta_u - \beta_v}{1 - \beta_u\beta_v}
\end{aligned}
\]
which is the same as the previous result.
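The equivalence of the two descriptions, subtracting rapidities versus applying the velocity formula, can be checked with a few lines of code. This is a small sketch for collinear motion only; the function names are illustrative assumptions.

```python
import math

def beta_from_rapidities(beta_u, beta_v):
    """Relative beta obtained by subtracting rapidities (collinear motion)."""
    return math.tanh(math.atanh(beta_u) - math.atanh(beta_v))

def beta_from_formula(beta_u, beta_v):
    """Relative beta from (beta_u - beta_v) / (1 - beta_u beta_v)."""
    return (beta_u - beta_v) / (1.0 - beta_u * beta_v)

# A particle at 0.9c viewed from a frame moving at -0.9c along the same axis:
print(beta_from_rapidities(0.9, -0.9), beta_from_formula(0.9, -0.9))
```

Both results stay below 1 however close the inputs are to 1, because tanh never reaches 1 for finite rapidity.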

Relativistic energy and momentum


We next need to generalize the energy and momentum of a particle. The momentum should add as a 4-vector
and should reduce to p = mv in the c −→ ∞ limit. To be linear in the 3-velocity, we take it linear in the
4-velocity,
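\[
P^\alpha = mU^\alpha
\]
where we might have
\[
\begin{aligned}
m &= m(v) \\
\lim_{v \to 0} m(v) &= m_{\text{Newton}}
\end{aligned}
\]
and in order for this to transform as a 4-vector, its norm must be Lorentz invariant. Computing,
\[
\begin{aligned}
\left\|P^\alpha\right\|^2 &= \left\|mU^\alpha\right\|^2 \\
&= m^2\left\|U^\alpha\right\|^2 \\
&= -m^2 c^2
\end{aligned}
\]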
Since we already know that c2 is invariant, the mass m must also be invariant in order for P α to be a
4-vector.
Now look at the various expressions for the 4-velocity:
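\[
\begin{aligned}
U^\alpha &= \gamma\,(c, \mathbf{u}) \\
&= c\,(\cosh\zeta, \mathbf{n}\sinh\zeta)
\end{aligned}
\]
We have similar expressions for the 4-momentum,
\[
\begin{aligned}
P^\alpha &= mU^\alpha \\
&= (m\gamma c, m\gamma\mathbf{u}) \\
&= (mc\cosh\zeta_u, mc\,\mathbf{n}\sinh\zeta_u)
\end{aligned}
\]
From this we identify the relativistic 3-momentum,
\[
\mathbf{p} = m\gamma\mathbf{u}
\]
and notice that it has the correct limit,
\[
\begin{aligned}
\lim_{\beta \to 0}\mathbf{p} &= \lim_{\beta \to 0}\frac{m\mathbf{u}}{\sqrt{1 - \beta^2}} \\
&= m\mathbf{u}
\end{aligned}
\]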
The meaning of the remaining component follows from the same limit, but we need to keep the first order
term:

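\[
\begin{aligned}
P^0 c &= m\gamma c^2 \\
\lim_{\beta \to 0} P^0 c &= \lim_{\beta \to 0}\frac{mc^2}{\sqrt{1 - \beta^2}} \\
&= \lim_{\beta \to 0} mc^2\left(1 + \frac{1}{2}\beta^2 + \ldots\right) \\
&= mc^2 + \frac{1}{2}mv^2 + \ldots
\end{aligned}
\]
We immediately recognize the classical kinetic energy, together with the constant mass-energy. We therefore
define the relativistic energy,
\[
E = P^0 c = m\gamma c^2
\]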
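We may also write the energy and momentum in terms of the rapidity,
\[
\begin{aligned}
P^\alpha &= \left(\frac{E}{c}, \mathbf{p}\right) \\
&= (m\gamma c, m\gamma\mathbf{u}) \\
&= (mc\cosh\zeta, mc\,\mathbf{n}\sinh\zeta)
\end{aligned}
\]
The invariant mass gives the important relationship between relativistic energy, momentum and mass:
\[
\begin{aligned}
-m^2 c^2 &= \left\|P^\alpha\right\|^2 \\
&= -\left(\frac{E}{c}\right)^2 + \mathbf{p}^2 \\
E &= \sqrt{\mathbf{p}^2 c^2 + m^2 c^4}
\end{aligned}
\]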

Though we take the positive square root here, the negative root also turns out to be important in quantum
field theory.
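The energy-momentum relation can be verified directly from the definitions E = mγc² and p = mγu. The sketch below is illustrative only; the function name `energy_momentum` and the sample values are assumptions.

```python
import math

def energy_momentum(m, u, c=1.0):
    """Return (E, p) for mass m and 3-velocity u (tuple), with |u| < c."""
    u2 = sum(ui * ui for ui in u)
    gamma = 1.0 / math.sqrt(1.0 - u2 / c**2)
    energy = m * gamma * c**2
    momentum = tuple(m * gamma * ui for ui in u)
    return energy, momentum

m, c = 2.0, 3.0                                   # arbitrary sample values
E, p = energy_momentum(m, (1.0, -0.5, 2.0), c)
p2 = sum(pi * pi for pi in p)

# Compare E with the positive root sqrt(p^2 c^2 + m^2 c^4).
print(E, math.sqrt(p2 * c**2 + m**2 * c**4))
```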
We now generalize Newton’s second law,
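\[
\mathbf{F} = m\mathbf{a}
\]
by replacing the vectors on each side with 4-vectors,
\[
F^\alpha = m\frac{dU^\alpha}{d\tau}
\]
where the acceleration 4-vector is defined as the proper time rate of change of the 4-velocity. Notice that it
is orthogonal to the 4-velocity,
\[
\begin{aligned}
U_\alpha\frac{dU^\alpha}{d\tau} &= \eta_{\alpha\beta}\,U^\beta\frac{dU^\alpha}{d\tau} \\
&= \frac{1}{2}\frac{d}{d\tau}\left(\eta_{\alpha\beta}\,U^\beta U^\alpha\right) \\
&= \frac{1}{2}\frac{d}{d\tau}\left(-c^2\right) \\
&= 0
\end{aligned}
\]
This means that the force is also orthogonal to the velocity,
\[
U_\alpha F^\alpha = 0
\]
and since, in the rest frame of a particle, $U^\alpha = (c, \mathbf{0})$, we must have
\[
\begin{aligned}
0 &= U_\alpha F^\alpha \\
&= -cF^0
\end{aligned}
\]
In this frame, therefore, the 4-force may be written in terms of the 3-force,
\[
\left[F^\alpha\right]_{\text{rest frame}} = \left(0, \mathbf{F}\right)
\]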

From this, we can boost to find the 4-force in any inertial frame.
Since the mass is invariant, this may also be written as
\[
F^\alpha = \frac{dP^\alpha}{d\tau}
\]
and, as a vector equation, may be summed over many particles to give the total force and total momentum.
For an isolated system (by definition, one with no net external force acting on it), we have conservation of
total momentum,
\[
\begin{aligned}
\frac{dP^\alpha_{\text{total}}}{d\tau} &= 0 \\
P^\alpha_{\text{total, initial}} &= P^\alpha_{\text{total, final}}
\end{aligned}
\]

This relationship is used for collision problems.
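As an illustration of how conservation of total 4-momentum is applied, consider a hypothetical head-on collision in which two equal masses stick together. The scenario, the function names `four_momentum` and `invariant_mass`, and the numbers are all assumptions chosen for this sketch; the mass of the final object is read off from the norm of the conserved total 4-momentum.

```python
import math

def four_momentum(m, u, c=1.0):
    """P = (E/c, p) for mass m and 3-velocity u (tuple), with |u| < c."""
    u2 = sum(ui * ui for ui in u)
    gamma = 1.0 / math.sqrt(1.0 - u2 / c**2)
    return (m * gamma * c,) + tuple(m * gamma * ui for ui in u)

def invariant_mass(P, c=1.0):
    """Mass m from -m^2 c^2 = -(P^0)^2 + p.p."""
    return math.sqrt(P[0] ** 2 - sum(pi * pi for pi in P[1:])) / c

p1 = four_momentum(1.0, (0.6, 0.0, 0.0))
p2 = four_momentum(1.0, (-0.6, 0.0, 0.0))
total = tuple(a + b for a, b in zip(p1, p2))      # conserved in the collision

# The final object is at rest (zero total 3-momentum) and its mass exceeds
# the sum of the initial masses, because kinetic energy contributes to it.
print(total, invariant_mass(total))
```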


The 4-vector force contains information about the

Tensors
In order to describe more than particle motion in spacetime, we need to define objects more complicated than
vectors, analogous to matrices and their generalizations. In order to construct meaningful physical quantities
– invariants – we need to keep track of how these objects transform under Lorentz transformations. This
leads us to define Lorentz tensors as those objects which transform linearly and homogeneously under Lorentz
transformations. It is possible to define tensors for other groups of transformations as well. We will write our
definition in a way that actually applies to the group of general coordinate transformations (or, in another
guise, diffeomorphisms).
Our definition parallels our definition of a 4-vector as any set of four quantities transforming as
\[
A'^\alpha = \sum_{\beta=0}^{3} M^\alpha{}_\beta\, A^\beta
\]
where
\[
x'^\alpha = \sum_{\beta=0}^{3} M^\alpha{}_\beta\, x^\beta
\]

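is a Lorentz transformation. Notice that the Jacobian matrix of partial derivatives is just the transformation matrix,
\[
\begin{aligned}
\frac{\partial x'^\alpha}{\partial x^\mu} &= \frac{\partial}{\partial x^\mu}\sum_{\beta=0}^{3} M^\alpha{}_\beta\, x^\beta \\
&= \sum_{\beta=0}^{3} M^\alpha{}_\beta\,\frac{\partial x^\beta}{\partial x^\mu} \\
&= \sum_{\beta=0}^{3} M^\alpha{}_\beta\,\delta^\beta_\mu \\
&= M^\alpha{}_\mu
\end{aligned}
\]
To generalize to arbitrary coordinate transformations instead of just the constant Lorentz transformations,
we simply replace $M^\alpha{}_\mu$ by the matrix $\frac{\partial x'^\alpha}{\partial x^\mu}$, where $x'^\alpha = x'^\alpha\left(x^\beta\right)$ may be any coordinate transformation.
The transformation of a vector then becomes
\[
A'^\alpha = \sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\,A^\beta
\]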
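Now suppose we have a vector equation of the form
\[
V^\alpha = \sum_{\beta=0}^{3} T^\alpha{}_\beta\, S^\beta
\]
for two vectors and a matrix. We say such an equation is covariant if it has the same form in any other
coordinates. In order for this to be the case we require
\[
\begin{aligned}
V'^\alpha &= \sum_{\mu=0}^{3} T'^\alpha{}_\mu\, S'^\mu \\
\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\,V^\beta &= \sum_{\mu=0}^{3} T'^\alpha{}_\mu\sum_{\nu=0}^{3}\frac{\partial x'^\mu}{\partial x^\nu}\,S^\nu \\
\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\left(\sum_{\rho=0}^{3} T^\beta{}_\rho\, S^\rho\right) &= \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} T'^\alpha{}_\mu\,\frac{\partial x'^\mu}{\partial x^\nu}\,S^\nu \\
\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\left(\sum_{\nu=0}^{3} T^\beta{}_\nu\, S^\nu\right) &= \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} T'^\alpha{}_\mu\,\frac{\partial x'^\mu}{\partial x^\nu}\,S^\nu \\
\sum_{\nu=0}^{3}\left(\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\,T^\beta{}_\nu - \sum_{\mu=0}^{3} T'^\alpha{}_\mu\,\frac{\partial x'^\mu}{\partial x^\nu}\right)S^\nu &= 0
\end{aligned}
\]
If this equation holds for all vectors $S^\nu$, then the matrix in parentheses must vanish,
\[
\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\,T^\beta{}_\nu - \sum_{\mu=0}^{3} T'^\alpha{}_\mu\,\frac{\partial x'^\mu}{\partial x^\nu} = 0
\]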

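Using the inverse transformation,
\[
\begin{aligned}
\sum_{\mu=0}^{3}\frac{\partial x'^\alpha}{\partial x^\mu}\frac{\partial x^\mu}{\partial x'^\beta} &= \frac{\partial x'^\alpha}{\partial x'^\beta} \\
&= \delta^\alpha_\beta
\end{aligned}
\]
we have
\[
\begin{aligned}
0 &= \sum_{\nu=0}^{3}\left(\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\,T^\beta{}_\nu - \sum_{\mu=0}^{3} T'^\alpha{}_\mu\,\frac{\partial x'^\mu}{\partial x^\nu}\right)\frac{\partial x^\nu}{\partial x'^\sigma} \\
&= \sum_{\nu=0}^{3}\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\,T^\beta{}_\nu\,\frac{\partial x^\nu}{\partial x'^\sigma} - \sum_{\mu=0}^{3} T'^\alpha{}_\mu\sum_{\nu=0}^{3}\frac{\partial x'^\mu}{\partial x^\nu}\frac{\partial x^\nu}{\partial x'^\sigma} \\
&= \sum_{\nu=0}^{3}\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\,T^\beta{}_\nu\,\frac{\partial x^\nu}{\partial x'^\sigma} - \sum_{\mu=0}^{3} T'^\alpha{}_\mu\,\delta^\mu_\sigma \\
&= \sum_{\nu=0}^{3}\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\,T^\beta{}_\nu\,\frac{\partial x^\nu}{\partial x'^\sigma} - T'^\alpha{}_\sigma
\end{aligned}
\]
and we have the transformation law for the matrix,
\[
T'^\alpha{}_\sigma = \sum_{\nu=0}^{3}\sum_{\beta=0}^{3}\frac{\partial x'^\alpha}{\partial x^\beta}\,T^\beta{}_\nu\,\frac{\partial x^\nu}{\partial x'^\sigma}
\]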

Several things become evident from this calculation. First, we do not want to be writing all those
summation symbols! Looking closely, we see that there are two types of index, raised and lowered. In every
case where we have a sum, we have exactly two identical indices, and one is raised and one is lowered. From
here on, we employ the Einstein summation convention, and omit the explicit $\sum$ symbol. Instead we sum
automatically whenever we find a pair of matching indices with one raised and one lowered. The indices in
such a summed pair are called dummy indices, because they can be changed at will. Other indices, which
occur singly in each term, must always match in name and position with a corresponding index in each term.
These are called free indices, and they tell us what type of object we have. For example, in the expression
above, α and σ are free indices. They each occur once on each side of the equation, and α is raised and σ
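lowered. Using the summation convention, we rewrite:
\[
T'^\alpha{}_\sigma = \frac{\partial x'^\alpha}{\partial x^\beta}\,T^\beta{}_\nu\,\frac{\partial x^\nu}{\partial x'^\sigma}
\]
and because $\beta$ and $\nu$ are dummy indices, this means the same thing as
\[
\begin{aligned}
T'^\alpha{}_\sigma &= \frac{\partial x'^\alpha}{\partial x^\mu}\,T^\mu{}_\lambda\,\frac{\partial x^\lambda}{\partial x'^\sigma} \\
&= \frac{\partial x'^\alpha}{\partial x^\mu}\frac{\partial x^\lambda}{\partial x'^\sigma}\,T^\mu{}_\lambda
\end{aligned}
\]
Notice a second property of this transformation. $T^\mu{}_\lambda$ has one index of each type, and they transform
differently. The raised index transforms just like a contravariant vector, with the Jacobian matrix $\frac{\partial x'^\alpha}{\partial x^\mu}$.
However, the lowered index transforms with the inverse matrix, $\frac{\partial x^\alpha}{\partial x'^\mu}$. Raised indices are called contravariant
and lowered indices are called covariant.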


Suppose we have a covariant vector and a contravariant vector,

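\[
A_\alpha,\; B^\beta
\]
We define the inner product or scalar product to be the sum
\[
A_\alpha B^\alpha = B^\alpha A_\alpha
\]
and this quantity is invariant,
\[
\begin{aligned}
A'_\alpha B'^\alpha &= \left(\frac{\partial x^\mu}{\partial x'^\alpha}\,A_\mu\right)\left(\frac{\partial x'^\alpha}{\partial x^\nu}\,B^\nu\right) \\
&= \left(\frac{\partial x^\mu}{\partial x'^\alpha}\frac{\partial x'^\alpha}{\partial x^\nu}\right)A_\mu B^\nu \\
&= \delta^\mu_\nu\,A_\mu B^\nu \\
&= A_\mu B^\mu
\end{aligned}
\]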

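As we have seen, Lorentz transformations preserve the quadratic form
\[
ds^2 = \left(dx^0\right)^2 - \left(dx^1\right)^2 - \left(dx^2\right)^2 - \left(dx^3\right)^2
\]
We can write this as a double sum
\[
ds^2 = g_{\alpha\beta}\,dx^\alpha dx^\beta
\]
where
\[
g_{\alpha\beta} = \eta_{\alpha\beta} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\]
Notice that there is a sign ambiguity here. We could equally well use the opposite sign of $ds^2$ to define the
metric, in which case we would have
\[
\eta_{\alpha\beta} =
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]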
This is actually the more common practice (see, e.g., Misner, Thorne & Wheeler, or Weinberg), but here
we will stay with Jackson’s convention. We will use the symbol gαβ to refer to a general metric which may
have functions as entries, and reserve the symbol ηαβ for this particular, constant, orthonormal matrix. For
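example, if we write the line element $ds^2$ in spherical coordinates,
\[
ds^2 = \left(dx^0\right)^2 - \left(dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\varphi^2\right)
\]
then
\[
g_{\alpha\beta} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -r^2 & 0 \\
0 & 0 & 0 & -r^2\sin^2\theta
\end{pmatrix}
\]
Since $g_{\alpha\beta}$ has two covariant indices, we say it is type $\binom{0}{2}$. We define the inverse to $g_{\alpha\beta}$ with the
symbol $g^{\alpha\beta}$ so that
\[
g^{\alpha\beta}g_{\beta\mu} = \delta^\alpha_\mu
\]
For the orthonormal metric $\eta_{\alpha\beta}$ we see that
\[
\eta^{\alpha\beta} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\]
while in spherical coordinates we write
\[
g^{\alpha\beta} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -\frac{1}{r^2} & 0 \\
0 & 0 & 0 & -\frac{1}{r^2\sin^2\theta}
\end{pmatrix}
\]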

Now look at the inner product of two vectors,

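\[
\begin{aligned}
A^0 B^0 - \mathbf{A}\cdot\mathbf{B} &= A^0 B^0 - A^1 B^1 - A^2 B^2 - A^3 B^3 \\
&= \eta_{\alpha\beta}\,A^\alpha B^\beta
\end{aligned}
\]
This inner product is the same as the covariant-contravariant sum if we define
\[
B_\alpha = \eta_{\alpha\beta}\,B^\beta
\]
for then we have
\[
\eta_{\alpha\beta}\,A^\alpha B^\beta = A^\alpha B_\alpha
\]
Because $\eta_{\alpha\beta}$ is symmetric, we could equally well write
\[
\begin{aligned}
\eta_{\alpha\beta}\,A^\alpha B^\beta &= \eta_{\beta\alpha}\,A^\alpha B^\beta \\
&= A_\beta B^\beta \\
&= A_\alpha B^\alpha
\end{aligned}
\]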
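Index lowering with the metric and the invariance of the resulting inner product can be illustrated in a few lines. This is a sketch under the metric signature used in this part of the notes, (+, -, -, -); the names `ETA`, `lower`, `inner` and `boost_x`, and the sample vectors, are illustrative assumptions.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)    # diagonal entries of eta_{alpha beta}

def lower(vec):
    """B_alpha = eta_{alpha beta} B^beta for a diagonal metric."""
    return tuple(e * v for e, v in zip(ETA, vec))

def inner(a, b):
    """A^alpha B_alpha, summed over alpha = 0..3."""
    return sum(ai * bi for ai, bi in zip(a, lower(b)))

def boost_x(vec, beta):
    """Boost a contravariant 4-vector along +x with |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    a0, a1, a2, a3 = vec
    return (gamma * (a0 - beta * a1), gamma * (a1 - beta * a0), a2, a3)

A = (3.0, 1.0, 2.0, 0.5)          # arbitrary sample components
B = (2.0, -1.0, 0.0, 4.0)

# The inner product should be the same before and after boosting both vectors.
print(inner(A, B), inner(boost_x(A, 0.8), boost_x(B, 0.8)))
```

Lowering either vector gives the same number, which reflects the symmetry of the metric noted above.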
 
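We say that contravariant vectors, $A^\alpha$, are rank-1 tensors of type $\binom{1}{0}$ while covariant vectors, $A_\alpha$, are
rank-1 tensors of type $\binom{0}{1}$.
We can now define rank-2 tensors of types $\binom{2}{0}$, $\binom{1}{1}$ and $\binom{0}{2}$ as 4 × 4 matrices which transform
as follows:
\[
\begin{aligned}
\tilde{R}^{\alpha\beta} &= R^{\mu\nu}\,\frac{\partial\tilde{x}^\alpha}{\partial x^\mu}\frac{\partial\tilde{x}^\beta}{\partial x^\nu} && \text{type } \tbinom{2}{0} \\
\tilde{S}^\alpha{}_\beta &= S^\mu{}_\nu\,\frac{\partial\tilde{x}^\alpha}{\partial x^\mu}\frac{\partial x^\nu}{\partial\tilde{x}^\beta} && \text{type } \tbinom{1}{1} \\
\tilde{T}_{\alpha\beta} &= T_{\mu\nu}\,\frac{\partial x^\mu}{\partial\tilde{x}^\alpha}\frac{\partial x^\nu}{\partial\tilde{x}^\beta} && \text{type } \tbinom{0}{2}
\end{aligned}
\]
Every raised index transforms with a factor of $\frac{\partial\tilde{x}^\alpha}{\partial x^\mu}$, while every lowered index transforms with the inverse
matrix, $\frac{\partial x^\mu}{\partial\tilde{x}^\alpha}$. This means that whenever we sum over one raised and one lowered index, the transformation
matrices cancel, leaving the sum invariant. For example, even though
\[
J_\alpha T^{\alpha\beta}
\]
has three indices, the two $\alpha$-indices are dummy indices and the sum over them is invariant. Therefore, the
quantity transforms as a $\binom{1}{0}$ tensor, and not a $\binom{2}{1}$ tensor:
\[
\tilde{J}_\alpha\tilde{T}^{\alpha\beta} = \left(J_\alpha T^{\alpha\mu}\right)\frac{\partial\tilde{x}^\beta}{\partial x^\mu}
\]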
 
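We could go on to describe higher rank tensors of type $\binom{p}{q}$ with $n = p + q$ indices, but we will not need
them for electrodynamics.
A particularly important $\binom{0}{1}$ tensor is the partial derivative operator,
\[
\partial_\alpha \equiv \frac{\partial}{\partial x^\alpha} = \left(\frac{\partial}{\partial x^0}, \frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right) = \left(\frac{\partial}{\partial x^0}, \nabla\right)
\]
The chain rule shows that this is a covariant vector,
\[
\frac{\partial}{\partial\tilde{x}^\alpha} = \frac{\partial x^\beta}{\partial\tilde{x}^\alpha}\frac{\partial}{\partial x^\beta}
\]
This is also an operator; acting on a contravariant vector, the result is a $\binom{1}{1}$ tensor,
\[
\begin{aligned}
T_\alpha{}^\beta &= \frac{\partial A^\beta}{\partial x^\alpha} \\
&= \partial_\alpha A^\beta
\end{aligned}
\]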
Setting α = β and summing gives an invariant, the divergence,

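\[
\begin{aligned}
\partial_\alpha A^\alpha &= \partial^\alpha A_\alpha \\
&= \frac{\partial A^0}{\partial x^0} + \nabla\cdot\mathbf{A}
\end{aligned}
\]
We can write the same operator in contravariant form,
\[
\begin{aligned}
\partial^\alpha &= \eta^{\alpha\beta}\,\partial_\beta \\
&= \eta^{\alpha\beta}\,\frac{\partial}{\partial x^\beta} \\
&= \left(\frac{\partial}{\partial x^0}, -\nabla\right)
\end{aligned}
\]
and using the two we may form the Lorentz invariant operator,
\[
\begin{aligned}
\Box &\equiv \partial^\alpha\partial_\alpha \\
&= \left(\frac{\partial}{\partial x^0}, -\nabla\right)\cdot\left(\frac{\partial}{\partial x^0}, \nabla\right) \\
&= \frac{\partial^2}{\partial\left(x^0\right)^2} - \nabla^2
\end{aligned}
\]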
This familiar wave operator is called the d’Alembertian.

Thomas precession (version 1)


Assigning the electron a spin, s, and consequent magnetic moment µ, related by
\[
\boldsymbol{\mu} = \frac{ge}{2mc}\,\mathbf{s}
\]
introduced by Uhlenbeck and Goudsmit in 1926, explains the anomalous Zeeman effect if g = 2 and produces
the correct multiplet splitting of spectral lines when an atom is in a magnetic field if g = 1. But it does not
solve both problems at once. The next year, Thomas showed that the conflict is resolved by a relativistic
effect.
First, we work the problem incorrectly, as was done before Thomas’ insight. The problem resides in the
use of an incorrect equation of motion. Let an electron of magnetic moment µ move in a magnetic field B.
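Since the electron is moving, it sees a magnetic field $\mathbf{B}'$, so the rate of change of its angular momentum $\mathbf{s}$ is
(incorrectly) given by the Newtonian formula
\[
\left(\frac{d\mathbf{s}}{dt}\right)_{\text{electron frame}} = \boldsymbol{\mu}\times\mathbf{B}'
\]
corresponding to an interaction energy of
\[
U = -\boldsymbol{\mu}\cdot\mathbf{B}'
\]
We show below that the magnetic field in the moving frame is given by
\[
\begin{aligned}
\mathbf{B}' &= \gamma\left(\mathbf{B} - \boldsymbol{\beta}\times\mathbf{E}\right) - \frac{\gamma^2}{1 + \gamma}\,\boldsymbol{\beta}\,(\boldsymbol{\beta}\cdot\mathbf{B}) \\
&\approx \mathbf{B} - \boldsymbol{\beta}\times\mathbf{E}
\end{aligned}
\]
where in the second line we neglect terms of order $\beta^2$ and higher.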
Now, the electron is in the electric field of the nucleus, which for single outer electron atoms may be
written as the gradient of a spherically symmetric potential,
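\[
\mathbf{E} = -\frac{dV(r)}{dr}\,\frac{\mathbf{r}}{r}
\]
With the orbital angular momentum of the electron given by
\[
\mathbf{L} = m\,\mathbf{r}\times\mathbf{v}
\]
we can write the energy as
\[
\begin{aligned}
U &= -\boldsymbol{\mu}\cdot\mathbf{B} + \boldsymbol{\mu}\cdot(\boldsymbol{\beta}\times\mathbf{E}) \\
&= -\frac{ge}{2mc}\,\mathbf{s}\cdot\mathbf{B} + \frac{ge}{2mc^2}\,\mathbf{s}\cdot(\mathbf{v}\times\mathbf{E}) \\
&= -\frac{ge}{2mc}\,\mathbf{s}\cdot\mathbf{B} - \frac{ge}{2mc^2}\,\mathbf{s}\cdot(\mathbf{v}\times\mathbf{r})\,\frac{1}{r}\frac{dV(r)}{dr} \\
&= -\frac{ge}{2mc}\,\mathbf{s}\cdot\mathbf{B} + \frac{ge}{2m^2c^2}\,\mathbf{s}\cdot\mathbf{L}\,\frac{1}{r}\frac{dV(r)}{dr}
\end{aligned}
\]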
so that the second term becomes a spin-orbit interaction. With g = 2, the first term is correct but the second
term gives a spin-orbit interaction that is double the actual value.
The problem with this calculation is that the equation of motion above holds only in an inertial frame
of reference. However, the true inertial frame of the electron is a rotating one. Let the rest frame of the
nucleus be the laboratory inertial frame, O. Then consider two frames of reference for the electron, one at
time t when the electron has velocity β, and one a moment later at time t + δt, when the electron velocity
is β + δβ. We can write a boost that takes us to the electron rest frame at either of these times by using
the appropriate velocity:
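\[
\begin{aligned}
(x')^\alpha &= \left[A_{\text{boost}}(\boldsymbol{\beta})\right]^\alpha{}_\beta\, x^\beta \\
(x'')^\alpha &= \left[A_{\text{boost}}(\boldsymbol{\beta} + \delta\boldsymbol{\beta})\right]^\alpha{}_\beta\, x^\beta
\end{aligned}
\]
We can relate $O''$ to $O'$ by combining these
\[
\begin{aligned}
(x'')^\alpha &= \left[A_{\text{boost}}(\boldsymbol{\beta} + \delta\boldsymbol{\beta})\right]^\alpha{}_\beta\, x^\beta \\
&= \left[A_{\text{boost}}(\boldsymbol{\beta} + \delta\boldsymbol{\beta})\right]^\alpha{}_\beta\left[A^{-1}_{\text{boost}}(\boldsymbol{\beta})\right]^\beta{}_\mu\,(x')^\mu \\
&= \left[A_{\text{boost}}(\boldsymbol{\beta} + \delta\boldsymbol{\beta})\right]^\alpha{}_\beta\left[A_{\text{boost}}(-\boldsymbol{\beta})\right]^\beta{}_\mu\,(x')^\mu
\end{aligned}
\]
Let $O'$ be moving in the x-direction, so that
\[
\left[A_{\text{boost}}(-\boldsymbol{\beta})\right]^\beta{}_\mu =
\begin{pmatrix}
\gamma & \beta\gamma & 0 & 0 \\
\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]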
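Let the velocity of $O''$ lie in the xy-plane. Then from our general formula
\[
\begin{aligned}
ct' &= \gamma\,(ct - \boldsymbol{\beta}\cdot\mathbf{x}) \\
\mathbf{x}' &= \mathbf{x} + \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{x})\,\boldsymbol{\beta} - \gamma\boldsymbol{\beta}ct
\end{aligned}
\]
Now substitute $\boldsymbol{\beta} + \delta\boldsymbol{\beta}$ for $\boldsymbol{\beta}$ and expand to keep only terms linear in $\delta\boldsymbol{\beta} = \delta\beta_1\,\mathbf{i} + \delta\beta_2\,\mathbf{j}$. With $\gamma = \frac{1}{\sqrt{1 - \beta^2}}$,
let $\gamma'' = \frac{1}{\sqrt{1 - (\boldsymbol{\beta} + \delta\boldsymbol{\beta})^2}}$, so that
\[
\begin{aligned}
\gamma'' &= \frac{1}{\sqrt{1 - (\boldsymbol{\beta} + \delta\boldsymbol{\beta})^2}} \\
&\approx \frac{1}{\sqrt{1 - \beta^2 - 2\boldsymbol{\beta}\cdot\delta\boldsymbol{\beta}}} \\
&= \frac{1}{\sqrt{1 - \beta^2}\,\sqrt{1 - 2\gamma^2\,\boldsymbol{\beta}\cdot\delta\boldsymbol{\beta}}} \\
&\approx \gamma\left(1 + \gamma^2\,\boldsymbol{\beta}\cdot\delta\boldsymbol{\beta}\right) \\
&= \gamma + \gamma^3\,\boldsymbol{\beta}\cdot\delta\boldsymbol{\beta}
\end{aligned}
\]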
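Then for the time component, using $\boldsymbol{\beta}\cdot\delta\boldsymbol{\beta} = \beta\,\delta\beta_{1}$ and $1+\gamma^{2}\beta^{2} = \gamma^{2}$,
\[
\begin{aligned}
ct'' &= \gamma''\left(ct - (\boldsymbol{\beta}+\delta\boldsymbol{\beta})\cdot\mathbf{x}\right) \\
&= \left(\gamma+\gamma^{3}\boldsymbol{\beta}\cdot\delta\boldsymbol{\beta}\right)\left(ct - \beta x - \delta\boldsymbol{\beta}\cdot\mathbf{x}\right) \\
&= \gamma\left(ct - \beta x - \delta\beta_{1}x - \delta\beta_{2}y\right) + \gamma^{3}\beta\,\delta\beta_{1}\left(ct - \beta x\right) \\
&= \left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right)ct - \left(\gamma\beta + \gamma\,\delta\beta_{1} + \gamma^{3}\beta^{2}\delta\beta_{1}\right)x - \gamma\,\delta\beta_{2}\,y \\
&= \left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right)ct - \left(\gamma\beta + \gamma\left(1+\gamma^{2}\beta^{2}\right)\delta\beta_{1}\right)x - \gamma\,\delta\beta_{2}\,y \\
&= \left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right)ct - \left(\gamma\beta + \gamma^{3}\delta\beta_{1}\right)x - \gamma\,\delta\beta_{2}\,y
\end{aligned}
\]
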
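For the spatial components, writing $(\boldsymbol{\beta}+\delta\boldsymbol{\beta})^{2} = \beta^{2}\left(1+\frac{2}{\beta}\delta\beta_{1}\right)$ to first order,
\[
\begin{aligned}
\mathbf{x}'' &= \mathbf{x} + \frac{\gamma''-1}{(\boldsymbol{\beta}+\delta\boldsymbol{\beta})^{2}}\left((\boldsymbol{\beta}+\delta\boldsymbol{\beta})\cdot\mathbf{x}\right)(\boldsymbol{\beta}+\delta\boldsymbol{\beta}) - \gamma''(\boldsymbol{\beta}+\delta\boldsymbol{\beta})\,ct \\
&= \mathbf{x} + \left(\frac{\gamma''-1}{(\boldsymbol{\beta}+\delta\boldsymbol{\beta})^{2}}\left(\beta x + \delta\beta_{1}x + \delta\beta_{2}y\right) - \gamma''ct\right)\boldsymbol{\beta} \\
&\quad + \left(\frac{\gamma''-1}{(\boldsymbol{\beta}+\delta\boldsymbol{\beta})^{2}}\left((\boldsymbol{\beta}+\delta\boldsymbol{\beta})\cdot\mathbf{x}\right) - \left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right)ct\right)\delta\boldsymbol{\beta} \\
&= \mathbf{x} + \left(\frac{\gamma+\gamma^{3}\boldsymbol{\beta}\cdot\delta\boldsymbol{\beta}-1}{\beta^{2}\left(1+\frac{2}{\beta}\delta\beta_{1}\right)}\left(\beta x + \delta\beta_{1}x + \delta\beta_{2}y\right) - \left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right)ct\right)\boldsymbol{\beta} \\
&\quad + \left(\frac{\gamma-1}{\beta}\,x - \gamma ct\right)\delta\boldsymbol{\beta} \\
&= \mathbf{x} + \left(\frac{\gamma-1}{\beta}\,x - \gamma ct\right)\delta\boldsymbol{\beta} \\
&\quad + \left(\left(\frac{\gamma-1}{\beta^{2}} + \frac{\gamma^{3}\beta\,\delta\beta_{1}}{\beta^{2}}\right)\left(\beta x - 2\delta\beta_{1}x + \delta\beta_{1}x + \delta\beta_{2}y\right) - \left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right)ct\right)\boldsymbol{\beta} \\
&= \mathbf{x} + \left(\frac{\gamma-1}{\beta}\,x - \gamma ct\right)\delta\boldsymbol{\beta} \\
&\quad + \left(\frac{\gamma-1}{\beta^{2}}\left(\beta x - \delta\beta_{1}x + \delta\beta_{2}y\right) + \gamma^{3}\delta\beta_{1}x - \left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right)ct\right)\boldsymbol{\beta}
\end{aligned}
\]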
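In components,
\[
\begin{aligned}
x'' &= \left(1 + \frac{\gamma-1}{\beta}\delta\beta_{1} + (\gamma-1) - \frac{\gamma-1}{\beta}\delta\beta_{1} + \gamma^{3}\beta\,\delta\beta_{1}\right)x + \frac{\gamma-1}{\beta}\delta\beta_{2}\,y + \left(-\left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right)\beta - \gamma\,\delta\beta_{1}\right)ct \\
&= \left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right)x + \frac{\gamma-1}{\beta}\delta\beta_{2}\,y + \left(-\beta\gamma - \gamma^{3}\delta\beta_{1}\right)ct \\
y'' &= y + \frac{\gamma-1}{\beta}\delta\beta_{2}\,x - \gamma\,\delta\beta_{2}\,ct \\
z'' &= z
\end{aligned}
\]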
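Now write this as a matrix equation,
\[
\begin{pmatrix} ct'' \\ x'' \\ y'' \\ z'' \end{pmatrix} =
\begin{pmatrix}
\gamma+\gamma^{3}\beta\,\delta\beta_{1} & -\gamma\beta-\gamma^{3}\delta\beta_{1} & -\gamma\,\delta\beta_{2} & 0 \\
-\beta\gamma-\gamma^{3}\delta\beta_{1} & \gamma+\gamma^{3}\beta\,\delta\beta_{1} & \frac{\gamma-1}{\beta}\delta\beta_{2} & 0 \\
-\gamma\,\delta\beta_{2} & \frac{\gamma-1}{\beta}\delta\beta_{2} & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}
\]
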
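so that we have
\[
\begin{aligned}
x^{\alpha}(t+\delta t) &= \left[A_{\rm boost}(\boldsymbol{\beta}+\delta\boldsymbol{\beta})\right]^{\alpha}{}_{\beta}\left[A_{\rm boost}(-\boldsymbol{\beta})\right]^{\beta}{}_{\mu}\,x^{\mu}(t) \\
&= \begin{pmatrix}
\gamma+\gamma^{3}\beta\,\delta\beta_{1} & -\gamma\beta-\gamma^{3}\delta\beta_{1} & -\gamma\,\delta\beta_{2} & 0 \\
-\beta\gamma-\gamma^{3}\delta\beta_{1} & \gamma+\gamma^{3}\beta\,\delta\beta_{1} & \frac{\gamma-1}{\beta}\delta\beta_{2} & 0 \\
-\gamma\,\delta\beta_{2} & \frac{\gamma-1}{\beta}\delta\beta_{2} & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\gamma & \beta\gamma & 0 & 0 \\
\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} x^{\mu}(t) \\
&= \begin{pmatrix}
\gamma\left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right) - \gamma\beta\left(\gamma\beta+\gamma^{3}\delta\beta_{1}\right) & \gamma\beta\left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right) - \gamma\left(\gamma\beta+\gamma^{3}\delta\beta_{1}\right) & -\gamma\,\delta\beta_{2} & 0 \\
\gamma\left(-\beta\gamma-\gamma^{3}\delta\beta_{1}\right) + \gamma\beta\left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right) & \gamma\beta\left(-\beta\gamma-\gamma^{3}\delta\beta_{1}\right) + \gamma\left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right) & \frac{\gamma-1}{\beta}\delta\beta_{2} & 0 \\
-\gamma^{2}\delta\beta_{2} + \gamma\beta\,\frac{\gamma-1}{\beta}\delta\beta_{2} & -\gamma^{2}\beta\,\delta\beta_{2} + \gamma\,\frac{\gamma-1}{\beta}\delta\beta_{2} & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} x^{\mu}(t)
\end{aligned}
\]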

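Simplifying,
\[
\begin{aligned}
\gamma\left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right) - \gamma\beta\left(\gamma\beta+\gamma^{3}\delta\beta_{1}\right) &= \gamma^{2} + \gamma^{4}\beta\,\delta\beta_{1} - \gamma^{2}\beta^{2} - \gamma^{4}\beta\,\delta\beta_{1} \\
&= \gamma^{2} - \gamma^{2}\beta^{2} \\
&= 1 \\
\gamma\beta\left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right) - \gamma\left(\gamma\beta+\gamma^{3}\delta\beta_{1}\right) &= \gamma^{2}\beta + \gamma^{4}\beta^{2}\delta\beta_{1} - \gamma^{2}\beta - \gamma^{4}\delta\beta_{1} \\
&= \gamma^{4}\left(\beta^{2}-1\right)\delta\beta_{1} \\
&= -\gamma^{2}\delta\beta_{1} \\
\beta\gamma\left(-\beta\gamma-\gamma^{3}\delta\beta_{1}\right) + \gamma\left(\gamma+\gamma^{3}\beta\,\delta\beta_{1}\right) &= -\beta^{2}\gamma^{2} - \beta\gamma^{4}\delta\beta_{1} + \gamma^{2} + \gamma^{4}\beta\,\delta\beta_{1} \\
&= \gamma^{2} - \beta^{2}\gamma^{2} \\
&= 1 \\
-\gamma^{2}\delta\beta_{2} + \gamma\beta\,\frac{\gamma-1}{\beta}\delta\beta_{2} &= -\gamma^{2}\delta\beta_{2} + \left(\gamma^{2}-\gamma\right)\delta\beta_{2} \\
&= -\gamma\,\delta\beta_{2} \\
-\gamma^{2}\beta\,\delta\beta_{2} + \gamma\,\frac{\gamma-1}{\beta}\delta\beta_{2} &= \frac{1}{\beta}\left(-\gamma^{2}\beta^{2} + \gamma^{2} - \gamma\right)\delta\beta_{2} \\
&= \frac{1}{\beta}\left(\gamma^{2}\left(1-\beta^{2}\right) - \gamma\right)\delta\beta_{2} \\
&= \frac{1-\gamma}{\beta}\,\delta\beta_{2}
\end{aligned}
\]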
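Therefore,
\[
x^{\alpha}(t+\delta t) =
\begin{pmatrix}
1 & -\gamma^{2}\delta\beta_{1} & -\gamma\,\delta\beta_{2} & 0 \\
-\gamma^{2}\delta\beta_{1} & 1 & \frac{\gamma-1}{\beta}\delta\beta_{2} & 0 \\
-\gamma\,\delta\beta_{2} & \frac{1-\gamma}{\beta}\delta\beta_{2} & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} x^{\mu}(t)
\]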

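The best way to think about this matrix is as the product of an infinitesimal rotation and an infinitesimal
boost. To first order in $\delta\boldsymbol{\beta}$,
\[
\begin{pmatrix}
1 & -\gamma^{2}\delta\beta_{1} & -\gamma\,\delta\beta_{2} & 0 \\
-\gamma^{2}\delta\beta_{1} & 1 & \frac{\gamma-1}{\beta}\delta\beta_{2} & 0 \\
-\gamma\,\delta\beta_{2} & \frac{1-\gamma}{\beta}\delta\beta_{2} & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
=
\left[1 + \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & \frac{\gamma-1}{\beta}\delta\beta_{2} & 0 \\
0 & -\frac{\gamma-1}{\beta}\delta\beta_{2} & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}\right]
\left[1 + \begin{pmatrix}
0 & -\gamma^{2}\delta\beta_{1} & -\gamma\,\delta\beta_{2} & 0 \\
-\gamma^{2}\delta\beta_{1} & 0 & 0 & 0 \\
-\gamma\,\delta\beta_{2} & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}\right]
\]
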
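Recall that for a rotation in the xy plane,
\[
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix}
\cos\theta & -\sin\theta & 0 \\
\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
\]

so if $\theta$ is infinitesimal the transformation matrix is
\[
\begin{pmatrix}
1 & -\theta & 0 \\
\theta & 1 & 0 \\
0 & 0 & 1
\end{pmatrix}
= 1 + \begin{pmatrix}
0 & -\theta & 0 \\
\theta & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
\]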

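so the first factor is a rotation by an angle
\[
\delta\theta = -\frac{\gamma-1}{\beta}\,\delta\beta_{2}
\]
around the z axis. We can get both angle and axis by writing this as
\[
\begin{aligned}
\delta\boldsymbol{\theta} &= -\frac{\gamma-1}{\beta^{2}}\,\boldsymbol{\beta}\times\delta\boldsymbol{\beta} \\
&= -\frac{\gamma-1}{\left(\frac{\gamma^{2}-1}{\gamma^{2}}\right)}\,\boldsymbol{\beta}\times\delta\boldsymbol{\beta} \\
&= -\frac{\gamma^{2}(\gamma-1)}{\gamma^{2}-1}\,\boldsymbol{\beta}\times\delta\boldsymbol{\beta} \\
&= -\frac{\gamma^{2}}{\gamma+1}\,\boldsymbol{\beta}\times\delta\boldsymbol{\beta}
\end{aligned}
\]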

The second factor in xα (t + δt) is a pure infinitesimal boost.
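
The first-order result can be checked numerically. The following is a minimal sketch, not part of the original derivation: the helper names `boost`, `matmul` and `first_order`, and the values chosen for β and δβ, are illustrative assumptions. It builds the exact product of two boosts from the general boost formula and compares it with the first-order matrix, then reads off the rotation angle from the antisymmetric part of the spatial block.

```python
import math

def boost(bx, by):
    """Boost matrix for velocity (bx, by, 0) in units of c, acting on (ct, x, y, z).

    Implements ct' = g (ct - b.x),  x' = x + (g - 1)/b^2 (b.x) b - g b ct.
    Requires a nonzero velocity.
    """
    b = [bx, by, 0.0]
    b2 = bx * bx + by * by
    g = 1.0 / math.sqrt(1.0 - b2)
    m = [[0.0] * 4 for _ in range(4)]
    m[0][0] = g
    for i in range(3):
        m[0][i + 1] = m[i + 1][0] = -g * b[i]
        for j in range(3):
            m[i + 1][j + 1] = (1.0 if i == j else 0.0) + (g - 1.0) / b2 * b[i] * b[j]
    return m

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def first_order(beta, d1, d2):
    """First-order form of A(beta + dbeta) A(-beta) for beta along x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    r = (g - 1.0) / beta * d2
    return [[1.0, -g * g * d1, -g * d2, 0.0],
            [-g * g * d1, 1.0, r, 0.0],
            [-g * d2, -r, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Illustrative values (assumptions): beta = 0.6 along x, small dbeta in the xy-plane.
beta, d1, d2 = 0.6, 1e-6, 2e-6
gamma = 1.0 / math.sqrt(1.0 - beta * beta)

exact = matmul(boost(beta + d1, d2), boost(-beta, 0.0))
approx = first_order(beta, d1, d2)
max_err = max(abs(exact[i][j] - approx[i][j]) for i in range(4) for j in range(4))

# Rotation angle from the antisymmetric part of the x-y block, compared with
# the z-component of -(gamma^2 / (gamma + 1)) beta x dbeta.
dtheta_numeric = -(exact[1][2] - exact[2][1]) / 2.0
dtheta_formula = -gamma ** 2 / (gamma + 1.0) * beta * d2
```

The difference `max_err` is second order in δβ, so shrinking `d1` and `d2` by a factor of ten should reduce it by roughly a factor of one hundred.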


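The relationship between the laboratory frame and the electron rest frame is given by undoing both the
total boost and the rotation
\[
x^{\alpha}_{\rm lab} = \left[A_{\rm rotation}(-\delta\boldsymbol{\theta})\right]^{\alpha}{}_{\beta}\left[A_{\rm boost}(-\boldsymbol{\beta}-\delta\boldsymbol{\beta})\right]^{\beta}{}_{\mu}\,x^{\mu}_{\rm electron}
\]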

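The rotation transformation results in an angular velocity, the Thomas precession, given by
\[
\begin{aligned}
\boldsymbol{\omega}_{T} &= -\frac{d\boldsymbol{\theta}}{dt} \\
&= -\frac{1}{\gamma}\frac{d\boldsymbol{\theta}}{d\tau} \\
&= \frac{\gamma}{\gamma+1}\,\boldsymbol{\beta}\times\frac{d\boldsymbol{\beta}}{d\tau} \\
&= -\frac{\gamma}{\gamma+1}\,\frac{\mathbf{a}\times\mathbf{v}}{c^{2}}
\end{aligned}
\]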

In the case of our atom, the (non-relativistic) acceleration is produced by the potential $V(r)$,
\[
\begin{aligned}
\mathbf{a} &= \frac{e\mathbf{E}}{m} \\
&= -\frac{e}{m}\frac{dV(r)}{dr}\,\frac{\mathbf{r}}{r}
\end{aligned}
\]
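so that, setting $\gamma \approx 1$ and using $\mathbf{L} = m\,\mathbf{r}\times\mathbf{v}$,
\[
\begin{aligned}
\boldsymbol{\omega}_{T} &= -\frac{\gamma}{\gamma+1}\,\frac{\mathbf{a}\times\mathbf{v}}{c^{2}} \\
&= \frac{e}{m}\,\frac{1}{r}\frac{dV(r)}{dr}\,\frac{1}{2}\,\frac{\mathbf{r}\times\mathbf{v}}{c^{2}} \\
&= \frac{e}{m}\,\frac{1}{r}\frac{dV(r)}{dr}\,\frac{1}{2}\,\frac{\mathbf{L}}{mc^{2}} \\
&= \frac{e}{2m^{2}c^{2}}\,\frac{1}{r}\frac{dV(r)}{dr}\,\mathbf{L}
\end{aligned}
\]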
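Now return to the equation of motion for the electron. Since we have a rotating frame of reference,
we must replace
\[
\left(\frac{d\mathbf{s}}{dt}\right)_{\text{electron frame}} = \boldsymbol{\mu}\times\left(\mathbf{B} - \boldsymbol{\beta}\times\mathbf{E}\right)
\]
with
\[
\begin{aligned}
\left(\frac{d}{dt} + \boldsymbol{\omega}_{T}\times\right)\mathbf{s} &= \boldsymbol{\mu}\times\left(\mathbf{B} - \boldsymbol{\beta}\times\mathbf{E}\right) \\
\frac{d\mathbf{s}}{dt} &= \boldsymbol{\mu}\times\left(\mathbf{B} - \boldsymbol{\beta}\times\mathbf{E}\right) - \boldsymbol{\omega}_{T}\times\mathbf{s} \\
&= \frac{ge}{2mc}\,\mathbf{s}\times\mathbf{B} - \frac{ge}{2mc}\,\mathbf{s}\times\left(\boldsymbol{\beta}\times\mathbf{E}\right) + \mathbf{s}\times\boldsymbol{\omega}_{T} \\
&= \frac{ge}{2mc}\,\mathbf{s}\times\mathbf{B} - \mathbf{s}\times\left(\frac{ge}{2mc}\left(\boldsymbol{\beta}\times\mathbf{E}\right) - \boldsymbol{\omega}_{T}\right)
\end{aligned}
\]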
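and then the energy,
\[
\begin{aligned}
U &= -\boldsymbol{\mu}\cdot\mathbf{B} + \boldsymbol{\mu}\cdot\left(\boldsymbol{\beta}\times\mathbf{E}\right) \\
&= -\frac{ge}{2mc}\,\mathbf{s}\cdot\mathbf{B} + \frac{ge}{2mc^{2}}\,\mathbf{s}\cdot\left(\mathbf{v}\times\mathbf{E}\right)
\end{aligned}
\]
is modified to
\[
\begin{aligned}
U &= -\frac{ge}{2mc}\,\mathbf{s}\cdot\mathbf{B} + \mathbf{s}\cdot\left(\frac{ge}{2mc}\,\boldsymbol{\beta}\times\mathbf{E} - \boldsymbol{\omega}_{T}\right) \\
&= -\frac{ge}{2mc}\,\mathbf{s}\cdot\mathbf{B} + \frac{ge}{2m^{2}c^{2}}\,\mathbf{s}\cdot\mathbf{L}\,\frac{1}{r}\frac{dV(r)}{dr} - \mathbf{s}\cdot\boldsymbol{\omega}_{T} \\
&= -\frac{ge}{2mc}\,\mathbf{s}\cdot\mathbf{B} + \frac{ge}{2m^{2}c^{2}}\,\mathbf{s}\cdot\mathbf{L}\,\frac{1}{r}\frac{dV(r)}{dr} - \frac{e}{2m^{2}c^{2}}\,\mathbf{s}\cdot\mathbf{L}\,\frac{1}{r}\frac{dV(r)}{dr} \\
&= -\frac{ge}{2mc}\,\mathbf{s}\cdot\mathbf{B} + \frac{(g-1)\,e}{2m^{2}c^{2}}\,\mathbf{s}\cdot\mathbf{L}\,\frac{1}{r}\frac{dV(r)}{dr}
\end{aligned}
\]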
The coefficient of the spin-orbit interaction energy is changed from g = 2 to g − 1 = 1, giving the required
factor of 2.

Covariance of electrodynamics
Just as we modified Newton’s second law to write a covariant equation,
\[
F^{\alpha} = \frac{dP^{\alpha}}{d\tau}
\]
we must rewrite the Lorentz force law, Maxwell equations and continuity equation in a manifestly covariant
form.

Lorentz force law
We begin with the Lorentz force law,
\[
\mathbf{F} = q\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right)
\]
We may replace the left side with the rate of change of the relativistic momentum, $F^{\alpha} = \frac{dP^{\alpha}}{d\tau}$, making the right side into a
4-vector. To determine the covariant form of the electric and magnetic fields, we need to know that the
electric charge is invariant. Jackson discusses a number of experiments that test this to parts in $10^{19}$, so the
velocity-independence of electric charge is well established. Furthermore, the right side of the equation is
linear in the velocity, and so must be proportional to the 4-velocity, leaving us with an equation of the form
\[
F^{\alpha} = q\left(?\left[\mathbf{E},\mathbf{B}\right]\right)U^{\beta}
\]
where our goal is to find the nature of the unknown tensor $\left(?\left[\mathbf{E},\mathbf{B}\right]\right)$. The most general suitable tensor will
be of type $\binom{1}{1}$,
\[
F^{\alpha} = q\,F^{\alpha}{}_{\beta}\,U^{\beta}
\]
It is more convenient to write this in terms of a $\binom{2}{0}$ tensor,
\[
F^{\alpha} = q\,F^{\alpha\beta}\,U_{\beta}
\]
called the Faraday tensor. This is sufficient to find the full form of the Faraday tensor, $F^{\alpha\beta}$, in terms of
$\mathbf{E}$, $\mathbf{B}$, but we will use the Maxwell equations instead.

Levi-Civita tensor
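There is one more tensor we require. Just as we have the Levi-Civita tensor, $\varepsilon_{ijk}$, in 3 dimensions, there is
a Levi-Civita tensor in 4 dimensions,
\[
\varepsilon_{\alpha\beta\mu\nu} =
\begin{cases}
+1 & \text{even permutations of } 0, 1, 2, 3 \\
-1 & \text{odd permutations of } 0, 1, 2, 3 \\
0 & \text{otherwise}
\end{cases}
\]
With it we can form a second field tensor dual to the Faraday tensor,
\[
\mathcal{F}_{\alpha\beta} = \frac{1}{2}\,\varepsilon_{\alpha\beta\mu\nu}F^{\mu\nu}
\]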

Continuity equation
The continuity equation is an easy place to start, and it gives us the 4-vector form of the source current. We
have
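\[
\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{J} = 0
\]
and we know that the gradient operator transforms as a covariant vector,
\[
\begin{aligned}
\partial_{\alpha} &= \frac{\partial}{\partial x^{\alpha}} \\
&= \left(\frac{1}{c}\frac{\partial}{\partial t},\ \nabla\right)
\end{aligned}
\]
From this we can recognize the form of the continuity equation as a vanishing divergence,
\[
\begin{aligned}
0 &= \partial_{\alpha}J^{\alpha} \\
&= \frac{1}{c}\frac{\partial J^{0}}{\partial t} + \nabla\cdot\mathbf{J}
\end{aligned}
\]
We immediately identify
\[
J^{\alpha} = \left(\rho c,\ \mathbf{J}\right)
\]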

so that the charge density times c, together with the current density, form a 4-vector.

The vector potential


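Using the Lorentz gauge, we have found the following wave equations for the scalar and vector potentials,
\[
\begin{aligned}
\Box\phi &= -4\pi\rho \\
\Box\mathbf{A} &= -\frac{4\pi}{c}\,\mathbf{J}
\end{aligned}
\]
Since
\[
J^{\alpha} = \left(\rho c,\ \mathbf{J}\right)
\]
is already known to be a 4-vector, and the d'Alembertian, $\Box = \partial_{\alpha}\partial^{\alpha}$, is shown to be a Lorentz-invariant
operator, the potentials must also form a 4-vector,
\[
A^{\alpha} = \left(\phi,\ \mathbf{A}\right)
\]
making the four equations into a single, 4-vector equation,
\[
\Box A^{\alpha} = -\frac{4\pi}{c}\,J^{\alpha}
\]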

Maxwell equations with sources


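Now consider the Maxwell equations,
\[
\begin{aligned}
\nabla\cdot\mathbf{B} &= 0 \\
\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} &= 0 \\
\nabla\cdot\mathbf{E} &= 4\pi\rho \\
\nabla\times\mathbf{B} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} &= \frac{4\pi}{c}\,\mathbf{J}
\end{aligned}
\]
Writing the fields in terms of the potentials,
\[
\begin{aligned}
\mathbf{B} &= \nabla\times\mathbf{A} \\
\mathbf{E} &= -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}
\end{aligned}
\]
the two equations with sources become
\[
\begin{aligned}
-\nabla^{2}\phi - \frac{1}{c}\frac{\partial\,(\nabla\cdot\mathbf{A})}{\partial t} &= 4\pi\rho \\
\nabla\left(\nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t}\right) + \frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}} - \nabla^{2}\mathbf{A} &= \frac{4\pi}{c}\,\mathbf{J}
\end{aligned}
\]
So with
\[
\nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t} = \partial_{\alpha}A^{\alpha} = 0
\]
we have
\[
\begin{aligned}
-\frac{1}{c^{2}}\frac{\partial^{2}\phi}{\partial t^{2}} + \nabla^{2}\phi &= -4\pi\rho \\
-\frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}} + \nabla^{2}\mathbf{A} &= -\frac{4\pi}{c}\,\mathbf{J}
\end{aligned}
\]
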
Now, we know that the Maxwell equations with sources go together,
\[
\begin{aligned}
\nabla\cdot\mathbf{E} &= 4\pi\rho \\
\nabla\times\mathbf{B} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} &= \frac{4\pi}{c}\,\mathbf{J}
\end{aligned}
\]
because $\rho$ and $\mathbf{J}$ belong to the same 4-vector. The left side is linear in the four derivatives, $\partial_{\alpha}$, and linear
in the fields. The object which we have that is linear in the fields is $F^{\alpha\beta}$, so we must combine $\partial_{\alpha}$ and $F^{\alpha\beta}$
in such a way as to give $\frac{4\pi}{c}J^{\alpha}$. The simplest possibility is
\[
\partial_{\alpha}F^{\alpha\beta} = \frac{4\pi}{c}\,J^{\beta}
\]
and we now determine whether there is any matrix $F^{\alpha\beta}$ which is linear in the components of $\mathbf{E}$ and $\mathbf{B}$, for which
this equation is equivalent to the two sourced Maxwell equations. The equation above is a vector equation,
so we can consider one component at a time.

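β = 0: When we look at the zero component we have
\[
\begin{aligned}
\partial_{\alpha}F^{\alpha 0} &= \frac{4\pi}{c}\,J^{0} \\
\partial_{0}F^{00} + \partial_{1}F^{10} + \partial_{2}F^{20} + \partial_{3}F^{30} &= \frac{4\pi}{c}\,\rho c
\end{aligned}
\]
Since this equation has $\rho$ on the right side, it must, if anything, be equivalent to
\[
\begin{aligned}
\nabla\cdot\mathbf{E} &= 4\pi\rho \\
\partial_{1}E^{1} + \partial_{2}E^{2} + \partial_{3}E^{3} &= \frac{4\pi}{c}\,\rho c
\end{aligned}
\]
Comparing the left hand sides,
\[
\partial_{1}E^{1} + \partial_{2}E^{2} + \partial_{3}E^{3} = \partial_{0}F^{00} + \partial_{1}F^{10} + \partial_{2}F^{20} + \partial_{3}F^{30}
\]
This holds correctly only if we choose
\[
\begin{aligned}
F^{00} &= 0 \\
F^{10} &= E^{1} \\
F^{20} &= E^{2} \\
F^{30} &= E^{3}
\end{aligned}
\]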

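β = 1: Turning to the next component of our guess at the form of the covariant equation, we have
\[
\partial_{0}F^{01} + \partial_{1}F^{11} + \partial_{2}F^{21} + \partial_{3}F^{31} = \frac{4\pi}{c}\,J^{1}
\]
and we compare this to the x-component of Ampere's law,
\[
\begin{aligned}
\left[\nabla\times\mathbf{B}\right]^{1} - \frac{1}{c}\frac{\partial E^{1}}{\partial t} &= \frac{4\pi}{c}\,J^{1} \\
\frac{\partial B^{3}}{\partial y} - \frac{\partial B^{2}}{\partial z} - \frac{1}{c}\frac{\partial E^{1}}{\partial t} &= \frac{4\pi}{c}\,J^{1}
\end{aligned}
\]
For these two equations to agree we must have
\[
\begin{aligned}
F^{01} &= -E^{1} \\
F^{11} &= 0 \\
F^{21} &= B^{3} \\
F^{31} &= -B^{2}
\end{aligned}
\]
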
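β = 2: Proceeding, we set β = 2, to find
\[
\partial_{0}F^{02} + \partial_{1}F^{12} + \partial_{2}F^{22} + \partial_{3}F^{32} = \frac{4\pi}{c}\,J^{2}
\]
which must give the y-component of Ampere's law,
\[
\begin{aligned}
\left[\nabla\times\mathbf{B}\right]^{2} - \frac{1}{c}\frac{\partial E^{2}}{\partial t} &= \frac{4\pi}{c}\,J^{2} \\
\frac{\partial B^{1}}{\partial z} - \frac{\partial B^{3}}{\partial x} - \frac{1}{c}\frac{\partial E^{2}}{\partial t} &= \frac{4\pi}{c}\,J^{2}
\end{aligned}
\]
so that we require
\[
\begin{aligned}
F^{02} &= -E^{2} \\
F^{12} &= -B^{3} \\
F^{22} &= 0 \\
F^{32} &= B^{1}
\end{aligned}
\]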

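β = 3: Finally, we set β = 3, to find
\[
\partial_{0}F^{03} + \partial_{1}F^{13} + \partial_{2}F^{23} + \partial_{3}F^{33} = \frac{4\pi}{c}\,J^{3}
\]
which must give the z-component of Ampere's law,
\[
\frac{\partial B^{2}}{\partial x} - \frac{\partial B^{1}}{\partial y} - \frac{1}{c}\frac{\partial E^{3}}{\partial t} = \frac{4\pi}{c}\,J^{3}
\]
and since the right sides match, we require
\[
\begin{aligned}
F^{03} &= -E^{3} \\
F^{13} &= B^{2} \\
F^{23} &= -B^{1} \\
F^{33} &= 0
\end{aligned}
\]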

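Putting all of the components together, we see that we have determined the entirety of the Faraday
tensor,
\[
F^{\alpha\beta} =
\begin{pmatrix}
0 & -E^{1} & -E^{2} & -E^{3} \\
E^{1} & 0 & -B^{3} & B^{2} \\
E^{2} & B^{3} & 0 & -B^{1} \\
E^{3} & -B^{2} & B^{1} & 0
\end{pmatrix}
\]
With this choice of the components of the Faraday tensor, two of the Maxwell equations are given correctly
by
\[
\partial_{\alpha}F^{\alpha\beta} = \frac{4\pi}{c}\,J^{\beta}
\]
Notice that $F^{\alpha\beta}$ is antisymmetric,
\[
F^{\alpha\beta} = -F^{\beta\alpha}
\]
This means that if we take two derivatives and contract,
\[
\partial_{\alpha}\partial_{\beta}F^{\alpha\beta} = -\partial_{\alpha}\partial_{\beta}F^{\beta\alpha}
\]
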
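But mixed partial derivatives commute,
\[
\begin{aligned}
\partial_{\alpha}\partial_{\beta} &= \frac{\partial^{2}}{\partial x^{\alpha}\partial x^{\beta}} \\
&= \frac{\partial^{2}}{\partial x^{\beta}\partial x^{\alpha}} \\
&= \partial_{\beta}\partial_{\alpha}
\end{aligned}
\]
so we may write the right hand side as
\[
-\partial_{\alpha}\partial_{\beta}F^{\beta\alpha} = -\partial_{\beta}\partial_{\alpha}F^{\beta\alpha}
\]
and renaming the dummy indices this is
\[
-\partial_{\alpha}\partial_{\beta}F^{\beta\alpha} = -\partial_{\alpha}\partial_{\beta}F^{\alpha\beta}
\]
Returning this to the original expression,
\[
\begin{aligned}
\partial_{\alpha}\partial_{\beta}F^{\alpha\beta} &= -\partial_{\alpha}\partial_{\beta}F^{\alpha\beta} \\
2\,\partial_{\alpha}\partial_{\beta}F^{\alpha\beta} &= 0
\end{aligned}
\]
so this double derivative vanishes identically by symmetry. As an immediate consequence, we have, taking
another derivative of our covariant expression,
\[
\begin{aligned}
\partial_{\alpha}F^{\alpha\beta} &= \frac{4\pi}{c}\,J^{\beta} \\
\partial_{\alpha}\partial_{\beta}F^{\alpha\beta} &= \frac{4\pi}{c}\,\partial_{\beta}J^{\beta} \\
\partial_{\beta}J^{\beta} &= 0
\end{aligned}
\]
showing that the Maxwell equations imply the conservation of charge.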

Homogeneous Maxwell equations


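Now consider the remaining Maxwell equations. In these,
\[
\begin{aligned}
\nabla\cdot\mathbf{B} &= 0 \\
\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} &= 0
\end{aligned}
\]
the roles of the electric and magnetic fields are reversed. In fact, we may get these two equations from
the sourced Maxwell equations by the replacements
\[
\begin{aligned}
\mathbf{E} &\longrightarrow -\mathbf{B} \\
\mathbf{B} &\longrightarrow \mathbf{E} \\
J^{\alpha} &\longrightarrow 0
\end{aligned}
\]
and this is exactly what happens if we take the dual of the Faraday tensor,
\[
\mathcal{F}_{\alpha\beta} = \frac{1}{2}\,\varepsilon_{\alpha\beta\mu\nu}F^{\mu\nu}
\]
We compute this in detail. First, notice that if α = β, the right side contains the same index twice and must
vanish. For example,
\[
\begin{aligned}
\mathcal{F}_{11} &= \frac{1}{2}\,\varepsilon_{11\mu\nu}F^{\mu\nu} \\
&= 0
\end{aligned}
\]
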
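Therefore, all diagonal elements of $\mathcal{F}_{\alpha\beta}$ vanish. Notice also that because the Levi-Civita tensor is antisymmetric, $\varepsilon_{\alpha\beta\mu\nu} = -\varepsilon_{\beta\alpha\mu\nu}$, we must have the dual field tensor antisymmetric as well,
\[
\mathcal{F}_{\alpha\beta} = -\mathcal{F}_{\beta\alpha}
\]
This leaves us with only six independent components to compute.

For the first, look at
\[
\mathcal{F}_{01} = \frac{1}{2}\,\varepsilon_{01\mu\nu}F^{\mu\nu}
\]
The sums on µ and ν range over 4 × 4 = 16 possibilities, but most of them vanish. If either µ or ν equals
either 0 or 1, the Levi-Civita tensor vanishes, so they must each be either 2 or 3, and they cannot both be
the same. This leaves only two possibilities,
\[
\begin{aligned}
\mathcal{F}_{01} &= \frac{1}{2}\,\varepsilon_{0123}F^{23} + \frac{1}{2}\,\varepsilon_{0132}F^{32} \\
&= \frac{1}{2}\times(+1)\times F^{23} + \frac{1}{2}\times(-1)\times F^{32} \\
&= \frac{1}{2}\times(+1)\times F^{23} + \frac{1}{2}\times(-1)\times\left(-F^{23}\right) \\
&= F^{23} \\
&= -B^{1}
\end{aligned}
\]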

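The remaining terms are similar,
\[
\begin{aligned}
\mathcal{F}_{02} &= \frac{1}{2}\,\varepsilon_{0213}F^{13} + \frac{1}{2}\,\varepsilon_{0231}F^{31} \\
&= F^{31} \\
&= -B^{2} \\
\mathcal{F}_{03} &= \frac{1}{2}\,\varepsilon_{0312}F^{12} + \frac{1}{2}\,\varepsilon_{0321}F^{21} \\
&= -B^{3} \\
\mathcal{F}_{12} &= \frac{1}{2}\,\varepsilon_{1203}F^{03} + \frac{1}{2}\,\varepsilon_{1230}F^{30} \\
&= -E^{3} \\
\mathcal{F}_{23} &= \frac{1}{2}\,\varepsilon_{2301}F^{01} + \frac{1}{2}\,\varepsilon_{2310}F^{10} \\
&= -E^{1} \\
\mathcal{F}_{31} &= \frac{1}{2}\,\varepsilon_{3102}F^{02} + \frac{1}{2}\,\varepsilon_{3120}F^{20} \\
&= -E^{2}
\end{aligned}
\]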

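and we may write the whole dual tensor,
\[
\mathcal{F}_{\alpha\beta} =
\begin{pmatrix}
0 & -B^{1} & -B^{2} & -B^{3} \\
B^{1} & 0 & -E^{3} & E^{2} \\
B^{2} & E^{3} & 0 & -E^{1} \\
B^{3} & -E^{2} & E^{1} & 0
\end{pmatrix}
\]
This has the electric and magnetic fields interchanged, so we may hope that we can use the dual field tensor
to write the remaining equations. Once again we require the $\binom{2}{0}$ form of the tensor,
\[
\begin{aligned}
\mathcal{F}^{\alpha\beta} &= \eta^{\alpha\mu}\eta^{\beta\nu}\mathcal{F}_{\mu\nu} \\
&= \eta^{\alpha\mu}\mathcal{F}_{\mu\nu}\eta^{\beta\nu} \\
&= [\eta]\,[\mathcal{F}]\,[\eta]^{t} \\
&= [\eta]\,[\mathcal{F}]\,[\eta] \\
&= \begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
0 & -B^{1} & -B^{2} & -B^{3} \\
B^{1} & 0 & -E^{3} & E^{2} \\
B^{2} & E^{3} & 0 & -E^{1} \\
B^{3} & -E^{2} & E^{1} & 0
\end{pmatrix}
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \\
&= \begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
0 & -B^{1} & -B^{2} & -B^{3} \\
-B^{1} & 0 & -E^{3} & E^{2} \\
-B^{2} & E^{3} & 0 & -E^{1} \\
-B^{3} & -E^{2} & E^{1} & 0
\end{pmatrix} \\
&= \begin{pmatrix}
0 & B^{1} & B^{2} & B^{3} \\
-B^{1} & 0 & -E^{3} & E^{2} \\
-B^{2} & E^{3} & 0 & -E^{1} \\
-B^{3} & -E^{2} & E^{1} & 0
\end{pmatrix}
\end{aligned}
\]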
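This is exactly what we need. It is the same as $F^{\alpha\beta}$, with the substitutions $\mathbf{E} \longrightarrow -\mathbf{B}$ and $\mathbf{B} \longrightarrow \mathbf{E}$.
Therefore, setting the source to zero, the correct field equations must be given by
\[
\partial_{\alpha}\mathcal{F}^{\alpha\beta} = 0
\]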
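
The index bookkeeping for the dual tensor is easy to get wrong by hand, so it can help to let a short program do the sums. This is a minimal sketch under stated assumptions: the function names `levi_civita`, `faraday`, `dual_lower` and `raise_both` are illustrative, the metric is taken as diag(−1, 1, 1, 1), and the sample field values are arbitrary. It evaluates the dual with lower indices from the Levi-Civita contraction, raises both indices, and compares the result with the Faraday tensor after the substitutions E → −B, B → E.

```python
def levi_civita(idx):
    """Sign of idx as a permutation of (0, 1, 2, 3); 0 if any index repeats."""
    if len(set(idx)) < 4:
        return 0
    p = list(idx)
    sign = 1
    for i in range(4):
        # Swap the entry at position i into its home slot until position i is correct.
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

def faraday(E, B):
    """F^{alpha beta} with F^{i0} = E^i and F^{12} = -B^3 (and cyclic)."""
    return [[0.0, -E[0], -E[1], -E[2]],
            [E[0], 0.0, -B[2], B[1]],
            [E[1], B[2], 0.0, -B[0]],
            [E[2], -B[1], B[0], 0.0]]

def dual_lower(F):
    """Dual tensor with lower indices: (1/2) eps_{a b m n} F^{m n}."""
    r = range(4)
    return [[0.5 * sum(levi_civita((a, b, m, n)) * F[m][n] for m in r for n in r)
             for b in r] for a in r]

def raise_both(T):
    """Raise both indices with the diagonal metric eta = diag(-1, 1, 1, 1)."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [[eta[a] * eta[b] * T[a][b] for b in range(4)] for a in range(4)]

# Arbitrary sample fields (assumed values, for illustration only).
E = [1.0, 2.0, 3.0]
B = [4.0, 5.0, 6.0]

lower = dual_lower(faraday(E, B))
upper = raise_both(lower)
swapped = faraday([-b for b in B], E)   # Faraday tensor with E -> -B, B -> E

max_diff = max(abs(upper[a][b] - swapped[a][b]) for a in range(4) for b in range(4))
```

Here `lower[0][1]` equals `-B[0]` and `lower[2][3]` equals `-E[0]`, and `max_diff` vanishes, so the raised dual is the Faraday tensor with the fields exchanged.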
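This is easy to check. Write the homogeneous equations in components:
\[
\begin{aligned}
\frac{\partial B^{1}}{\partial x} + \frac{\partial B^{2}}{\partial y} + \frac{\partial B^{3}}{\partial z} &= 0 \\
\frac{\partial E^{3}}{\partial y} - \frac{\partial E^{2}}{\partial z} + \frac{1}{c}\frac{\partial B^{1}}{\partial t} &= 0 \\
\frac{\partial E^{1}}{\partial z} - \frac{\partial E^{3}}{\partial x} + \frac{1}{c}\frac{\partial B^{2}}{\partial t} &= 0 \\
\frac{\partial E^{2}}{\partial x} - \frac{\partial E^{1}}{\partial y} + \frac{1}{c}\frac{\partial B^{3}}{\partial t} &= 0
\end{aligned}
\]
Consider the first of the curl equations,
\[
\frac{\partial E^{3}}{\partial y} - \frac{\partial E^{2}}{\partial z} + \frac{1}{c}\frac{\partial B^{1}}{\partial t} = 0
\]
We may write this in terms of $\mathcal{F}^{\alpha\beta}$ as
\[
\begin{aligned}
\frac{\partial\mathcal{F}^{21}}{\partial x^{2}} + \frac{\partial\mathcal{F}^{31}}{\partial x^{3}} + \frac{\partial\mathcal{F}^{01}}{\partial x^{0}} &= 0 \\
\frac{\partial\mathcal{F}^{11}}{\partial x^{1}} + \frac{\partial\mathcal{F}^{21}}{\partial x^{2}} + \frac{\partial\mathcal{F}^{31}}{\partial x^{3}} + \frac{\partial\mathcal{F}^{01}}{\partial x^{0}} &= 0 \\
\frac{\partial\mathcal{F}^{\beta 1}}{\partial x^{\beta}} &= 0
\end{aligned}
\]
The remaining components work in the same way.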

The Maxwell equations in covariant form


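We now can replace all four Maxwell equations with only two manifestly Lorentz covariant equations,
\[
\begin{aligned}
\partial_{\alpha}F^{\alpha\beta} &= \frac{4\pi}{c}\,J^{\beta} \\
\partial_{\alpha}\mathcal{F}^{\alpha\beta} &= 0
\end{aligned}
\]
We have seen that it follows from these that the current is conserved,
\[
\partial_{\beta}J^{\beta} = 0
\]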

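We can also now complete the form of the Lorentz force law. The law is
\[
\mathbf{F} = q\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right)
\]
and we deduced that for a force written as $F^{\alpha} = \frac{dP^{\alpha}}{d\tau}$, the law must be of the form
\[
\begin{aligned}
\frac{dP^{\alpha}}{d\tau} &= \lambda F^{\alpha\beta}U_{\beta} \\
\frac{d}{d\tau}\begin{pmatrix} P^{0} \\ P^{1} \\ P^{2} \\ P^{3} \end{pmatrix}
&= \lambda
\begin{pmatrix}
0 & -E^{1} & -E^{2} & -E^{3} \\
E^{1} & 0 & -B^{3} & B^{2} \\
E^{2} & B^{3} & 0 & -B^{1} \\
E^{3} & -B^{2} & B^{1} & 0
\end{pmatrix}
\begin{pmatrix} -\gamma c \\ \gamma v^{1} \\ \gamma v^{2} \\ \gamma v^{3} \end{pmatrix} \\
\frac{d}{d\tau}\begin{pmatrix} \gamma mc \\ \gamma mv^{1} \\ \gamma mv^{2} \\ \gamma mv^{3} \end{pmatrix}
&= \lambda
\begin{pmatrix}
-\gamma E^{1}v^{1} - \gamma E^{2}v^{2} - \gamma E^{3}v^{3} \\
-\gamma cE^{1} - \gamma v^{2}B^{3} + \gamma v^{3}B^{2} \\
-\gamma cE^{2} + \gamma v^{1}B^{3} - \gamma v^{3}B^{1} \\
-\gamma cE^{3} - \gamma v^{1}B^{2} + \gamma v^{2}B^{1}
\end{pmatrix} \\
\frac{d}{d\tau}\begin{pmatrix} E/c \\ \mathbf{p} \end{pmatrix}
&= -\gamma c\lambda
\begin{pmatrix} \frac{1}{c}\,\mathbf{E}\cdot\mathbf{v} \\ \mathbf{E} + \boldsymbol{\beta}\times\mathbf{B} \end{pmatrix}
\end{aligned}
\]
where the italic $E$ on the left is the energy of the particle and the bold $\mathbf{E}$ is the electric field. Therefore,
\[
\begin{aligned}
\frac{dE}{d\tau} &= -\gamma c\lambda\,\mathbf{E}\cdot\mathbf{v} \\
\frac{d\mathbf{p}}{d\tau} &= -\gamma c\lambda\left(\mathbf{E} + \boldsymbol{\beta}\times\mathbf{B}\right)
\end{aligned}
\]
Replacing
\[
\frac{d}{d\tau} = \gamma\,\frac{d}{dt}
\]
and setting the constant to
\[
\lambda = -\frac{q}{c}
\]
this becomes
\[
\begin{aligned}
\frac{dE}{dt} &= q\,\mathbf{E}\cdot\mathbf{v} \\
\frac{d\mathbf{p}}{dt} &= q\left(\mathbf{E} + \boldsymbol{\beta}\times\mathbf{B}\right)
\end{aligned}
\]
so we not only reproduce the Lorentz force law, but also get the correct expression for the rate of change of
energy. The covariant form of the Lorentz force law is therefore
\[
\frac{dP^{\alpha}}{d\tau} = -\frac{q}{c}\,F^{\alpha\beta}U_{\beta}
\]
or, using the antisymmetry of the Faraday tensor,
\[
\frac{dP^{\beta}}{d\tau} = \frac{q}{c}\,U_{\alpha}F^{\alpha\beta}
\]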
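
As a numerical cross-check of the covariant force law, the sketch below contracts the Faraday tensor with the lowered 4-velocity and compares the result with γq(E·v/c, E + β×B). The function names and the sample values of q, c, the fields and the velocity are assumptions made only for illustration; the metric signature is taken as (−, +, +, +), so that the lowered 4-velocity is (−γc, γv).

```python
import math

def faraday(E, B):
    """F^{alpha beta} with F^{i0} = E^i and F^{12} = -B^3 (and cyclic)."""
    return [[0.0, -E[0], -E[1], -E[2]],
            [E[0], 0.0, -B[2], B[1]],
            [E[1], B[2], 0.0, -B[0]],
            [E[2], -B[1], B[0], 0.0]]

def cross(a, b):
    """Ordinary 3-vector cross product."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def covariant_force(q, c, E, B, v):
    """Return (-(q/c) F^{alpha beta} U_beta, gamma) for a particle of velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - dot(v, v) / c ** 2)
    u_lower = [-gamma * c, gamma * v[0], gamma * v[1], gamma * v[2]]
    F = faraday(E, B)
    rhs = [-(q / c) * sum(F[a][b] * u_lower[b] for b in range(4)) for a in range(4)]
    return rhs, gamma

# Sample values (assumptions): units are arbitrary, |v| < c.
q, c = 1.5, 3.0
E = [0.3, -1.1, 0.7]
B = [0.9, 0.2, -0.5]
v = [0.9, -1.2, 1.5]

rhs, gamma = covariant_force(q, c, E, B, v)
v_cross_B = cross(v, B)
expected = [gamma * q * dot(E, v) / c] + \
           [gamma * q * (E[i] + v_cross_B[i] / c) for i in range(3)]
max_diff = max(abs(rhs[a] - expected[a]) for a in range(4))
```

Dividing the spatial components by γ recovers the three-dimensional force q(E + β×B), and the time component times c/γ gives the power q E·v.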

Lorentz transformation of the Maxwell equations
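Now that we have written the Maxwell equations in covariant form, we know exactly how they transform
under Lorentz transformations. Consider a boost in the x-direction, from $O$ to $\tilde{O}$, given by the transformation
matrix
\[
M^{\alpha}{}_{\beta} =
\begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]
Then, since the Faraday tensor is a $\binom{2}{0}$ tensor, it transforms as
\[
\begin{aligned}
\tilde{F}^{\alpha\beta} &= M^{\alpha}{}_{\mu}M^{\beta}{}_{\nu}F^{\mu\nu} \\
&= M^{\alpha}{}_{\mu}F^{\mu\nu}M^{\beta}{}_{\nu} \\
&= M^{\alpha}{}_{\mu}F^{\mu\nu}\left[M^{t}\right]_{\nu}{}^{\beta}
\end{aligned}
\]
where the rearrangement may now be written as the matrix product,
\[
\tilde{F} = M F M^{t}
\]
We find,
\[
\begin{aligned}
\tilde{F}^{\alpha\beta} &=
\begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
0 & -E^{1} & -E^{2} & -E^{3} \\
E^{1} & 0 & -B^{3} & B^{2} \\
E^{2} & B^{3} & 0 & -B^{1} \\
E^{3} & -B^{2} & B^{1} & 0
\end{pmatrix}
\begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \\
&= \begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\gamma\beta E^{1} & -\gamma E^{1} & -E^{2} & -E^{3} \\
\gamma E^{1} & -\gamma\beta E^{1} & -B^{3} & B^{2} \\
\gamma E^{2} - \gamma\beta B^{3} & -\gamma\beta E^{2} + \gamma B^{3} & 0 & -B^{1} \\
\gamma E^{3} + \gamma\beta B^{2} & -\gamma\beta E^{3} - \gamma B^{2} & B^{1} & 0
\end{pmatrix} \\
&= \begin{pmatrix}
0 & \left(-\gamma^{2} + \gamma^{2}\beta^{2}\right)E^{1} & -\gamma E^{2} + \gamma\beta B^{3} & -\gamma E^{3} - \gamma\beta B^{2} \\
\left(-\gamma^{2}\beta^{2} + \gamma^{2}\right)E^{1} & 0 & \gamma\beta E^{2} - \gamma B^{3} & \gamma\beta E^{3} + \gamma B^{2} \\
\gamma E^{2} - \gamma\beta B^{3} & -\gamma\beta E^{2} + \gamma B^{3} & 0 & -B^{1} \\
\gamma E^{3} + \gamma\beta B^{2} & -\gamma\beta E^{3} - \gamma B^{2} & B^{1} & 0
\end{pmatrix} \\
&= \begin{pmatrix}
0 & -E^{1} & -\gamma\left(E^{2} - \beta B^{3}\right) & -\gamma\left(E^{3} + \beta B^{2}\right) \\
E^{1} & 0 & -\gamma\left(B^{3} - \beta E^{2}\right) & \gamma\left(B^{2} + \beta E^{3}\right) \\
\gamma\left(E^{2} - \beta B^{3}\right) & \gamma\left(B^{3} - \beta E^{2}\right) & 0 & -B^{1} \\
\gamma\left(E^{3} + \beta B^{2}\right) & -\gamma\left(B^{2} + \beta E^{3}\right) & B^{1} & 0
\end{pmatrix}
\end{aligned}
\]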
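Comparing components with
\[
\tilde{F}^{\alpha\beta} =
\begin{pmatrix}
0 & -\tilde{E}^{1} & -\tilde{E}^{2} & -\tilde{E}^{3} \\
\tilde{E}^{1} & 0 & -\tilde{B}^{3} & \tilde{B}^{2} \\
\tilde{E}^{2} & \tilde{B}^{3} & 0 & -\tilde{B}^{1} \\
\tilde{E}^{3} & -\tilde{B}^{2} & \tilde{B}^{1} & 0
\end{pmatrix}
\]
we see that
\[
\begin{aligned}
\tilde{E}^{1} &= E^{1} \\
\tilde{E}^{2} &= \gamma\left(E^{2} - \beta B^{3}\right) \\
\tilde{E}^{3} &= \gamma\left(E^{3} + \beta B^{2}\right) \\
\tilde{B}^{1} &= B^{1} \\
\tilde{B}^{2} &= \gamma\left(B^{2} + \beta E^{3}\right) \\
\tilde{B}^{3} &= \gamma\left(B^{3} - \beta E^{2}\right)
\end{aligned}
\]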
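which we can write vectorially in terms of the components of $\mathbf{E}$ and $\mathbf{B}$ parallel and perpendicular to $\boldsymbol{\beta}$,
\[
\begin{aligned}
\mathbf{E}_{\parallel} &= \frac{1}{\beta^{2}}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta} \\
\mathbf{E}_{\perp} &= \mathbf{E} - \frac{1}{\beta^{2}}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta}
\end{aligned}
\]
and similarly for $\mathbf{B}$, as:
\[
\begin{aligned}
\tilde{\mathbf{E}}_{\parallel} &= \mathbf{E}_{\parallel} \\
\tilde{\mathbf{E}}_{\perp} &= \gamma\mathbf{E}_{\perp} + \gamma\boldsymbol{\beta}\times\mathbf{B}_{\perp} \\
\tilde{\mathbf{B}}_{\parallel} &= \mathbf{B}_{\parallel} \\
\tilde{\mathbf{B}}_{\perp} &= \gamma\mathbf{B}_{\perp} - \gamma\boldsymbol{\beta}\times\mathbf{E}_{\perp}
\end{aligned}
\]
(For $\boldsymbol{\beta}$ along x, $\left[\boldsymbol{\beta}\times\mathbf{B}\right]^{2} = -\beta B^{3}$ and $\left[\boldsymbol{\beta}\times\mathbf{B}\right]^{3} = \beta B^{2}$, so the electric field carries $+\boldsymbol{\beta}\times\mathbf{B}$ while the magnetic field carries $-\boldsymbol{\beta}\times\mathbf{E}$.) We may then reconstruct the full vectors,
\[
\begin{aligned}
\tilde{\mathbf{E}} &= \mathbf{E}_{\parallel} + \gamma\mathbf{E}_{\perp} + \gamma\boldsymbol{\beta}\times\mathbf{B}_{\perp} \\
&= \frac{1}{\beta^{2}}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta} + \gamma\left(\mathbf{E} - \frac{1}{\beta^{2}}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta}\right) + \gamma\boldsymbol{\beta}\times\left(\mathbf{B} - \frac{1}{\beta^{2}}\left(\boldsymbol{\beta}\cdot\mathbf{B}\right)\boldsymbol{\beta}\right) \\
&= \gamma\left(\mathbf{E} + \boldsymbol{\beta}\times\mathbf{B}\right) - \frac{\gamma-1}{\beta^{2}}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta} \\
&= \gamma\left(\mathbf{E} + \boldsymbol{\beta}\times\mathbf{B}\right) - \frac{\gamma-1}{1-\frac{1}{\gamma^{2}}}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta} \\
&= \gamma\left(\mathbf{E} + \boldsymbol{\beta}\times\mathbf{B}\right) - \frac{\gamma^{2}}{\gamma+1}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta} \\
\tilde{\mathbf{B}} &= \mathbf{B}_{\parallel} + \gamma\mathbf{B}_{\perp} - \gamma\boldsymbol{\beta}\times\mathbf{E}_{\perp} \\
&= \frac{1}{\beta^{2}}\left(\boldsymbol{\beta}\cdot\mathbf{B}\right)\boldsymbol{\beta} + \gamma\left(\mathbf{B} - \frac{1}{\beta^{2}}\left(\boldsymbol{\beta}\cdot\mathbf{B}\right)\boldsymbol{\beta}\right) - \gamma\boldsymbol{\beta}\times\left(\mathbf{E} - \frac{1}{\beta^{2}}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta}\right) \\
&= \gamma\left(\mathbf{B} - \boldsymbol{\beta}\times\mathbf{E}\right) - \frac{\gamma^{2}}{\gamma+1}\left(\boldsymbol{\beta}\cdot\mathbf{B}\right)\boldsymbol{\beta}
\end{aligned}
\]
so the complete transformation laws are
\[
\begin{aligned}
\tilde{\mathbf{E}} &= \gamma\left(\mathbf{E} + \boldsymbol{\beta}\times\mathbf{B}\right) - \frac{\gamma^{2}}{\gamma+1}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta} \\
\tilde{\mathbf{B}} &= \gamma\left(\mathbf{B} - \boldsymbol{\beta}\times\mathbf{E}\right) - \frac{\gamma^{2}}{\gamma+1}\left(\boldsymbol{\beta}\cdot\mathbf{B}\right)\boldsymbol{\beta}
\end{aligned}
\]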

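Suppose we have a pure electric field in the original frame, with $\mathbf{B} = 0$. Then in $\tilde{O}$, as the speed
approaches the speed of light, $\boldsymbol{\beta} \longrightarrow \hat{\boldsymbol{\beta}}$, and the electric field approaches
\[
\begin{aligned}
\tilde{\mathbf{E}} &= \gamma\mathbf{E} - \frac{\gamma^{2}}{\gamma+1}\left(\boldsymbol{\beta}\cdot\mathbf{E}\right)\boldsymbol{\beta} \\
&\longrightarrow \gamma\left(\mathbf{E} - \left(\hat{\boldsymbol{\beta}}\cdot\mathbf{E}\right)\hat{\boldsymbol{\beta}}\right) \\
&= \gamma\mathbf{E}_{\perp}
\end{aligned}
\]
so the electric field flattens into the plane orthogonal to the motion. At the same time, the magnetic field
approaches
\[
\tilde{\mathbf{B}} = -\gamma\,\hat{\boldsymbol{\beta}}\times\mathbf{E}
\]
which also lies in the orthogonal plane, and is perpendicular to $\mathbf{E}$.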
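
The signs in the vector transformation laws are easy to slip on, so a direct numerical comparison is worthwhile. The sketch below is illustrative only (the helper names and sample field values are assumptions): it transforms the Faraday tensor with the matrix product M F Mᵗ for a boost along x, reads the transformed fields off the tensor, and compares them with the vector forms Ẽ = γ(E + β×B) − γ²/(γ+1) (β·E)β and B̃ = γ(B − β×E) − γ²/(γ+1) (β·B)β, which are the forms implied by the component results such as Ẽ² = γ(E² − βB³).

```python
import math

def faraday(E, B):
    """F^{alpha beta} with F^{i0} = E^i and F^{12} = -B^3 (and cyclic)."""
    return [[0.0, -E[0], -E[1], -E[2]],
            [E[0], 0.0, -B[2], B[1]],
            [E[1], B[2], 0.0, -B[0]],
            [E[2], -B[1], B[0], 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Sample values (assumptions): boost speed 0.6 c along x, arbitrary fields.
beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta * beta)
E = [0.3, -1.1, 0.7]
B = [0.9, 0.2, -0.5]

M = [[gamma, -gamma * beta, 0.0, 0.0],
     [-gamma * beta, gamma, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
Ft = matmul(matmul(M, faraday(E, B)), transpose(M))

# Read the transformed fields off the tensor: E^i = F^{i0}, B^1 = F^{32}, etc.
E_tensor = [Ft[1][0], Ft[2][0], Ft[3][0]]
B_tensor = [Ft[3][2], Ft[1][3], Ft[2][1]]

# Vector form of the transformation laws.
b = [beta, 0.0, 0.0]
k = gamma ** 2 / (gamma + 1.0)
bxB, bxE = cross(b, B), cross(b, E)
E_vector = [gamma * (E[i] + bxB[i]) - k * dot(b, E) * b[i] for i in range(3)]
B_vector = [gamma * (B[i] - bxE[i]) - k * dot(b, B) * b[i] for i in range(3)]

max_diff = max(max(abs(E_tensor[i] - E_vector[i]) for i in range(3)),
               max(abs(B_tensor[i] - B_vector[i]) for i in range(3)))
```

Flipping the sign of the β×B term in `E_vector` makes the comparison fail in the y and z components, which is a quick way to confirm which sign belongs to which field.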

Thomas precession and the BMT equation
We are now in a position to consider the fully relativistic evolution of spin angular momentum. We know
that the equation of motion of the 3-dimensional spin vector, s, in the rest frame Õ of the particle is
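\[
\frac{d\mathbf{s}}{d\tilde{t}} = \frac{ge}{2mc}\,\mathbf{s}\times\tilde{\mathbf{B}}
\]
and want to generalize this to a covariant expression. The simplest way is to generalize the spin 3-vector to
a spin 4-vector, which in the rest frame $\tilde{O}$ reduces to
\[
\tilde{S}^{\alpha} = \left(0,\ \mathbf{s}\right)
\]
Notice that since $\tilde{U}^{\alpha} = \left(c,\ \mathbf{0}\right)$, we have
\[
\tilde{U}_{\alpha}\tilde{S}^{\alpha} = 0
\]
and this relation is invariant.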
To generalize the rest frame equation, we will clearly want the proper time derivative of the 4-vector spin, $\frac{dS^{\alpha}}{d\tau}$,
expressed in terms of tensors. The only tensors relevant to the problem are
\[
F^{\alpha\beta},\quad U^{\alpha},\quad S^{\alpha},\quad \frac{dU^{\alpha}}{d\tau}
\]
Notice that the last of these, $\frac{dU^{\alpha}}{d\tau}$, has a part expressible in terms of the first two but may also depend on
non-electromagnetic forces, so we need to consider it. From these, we need to construct a 4-vector,
and looking at the limit in the rest frame, we assume that the correct expression is at most linear in the
fields and each term is linear in the spin vector. Then there are only three possibilities:
\[
F^{\alpha\beta}S_{\beta},\qquad
\left(S_{\mu}F^{\mu\nu}U_{\nu}\right)U^{\alpha},\qquad
\left(S_{\beta}\frac{dU^{\beta}}{d\tau}\right)U^{\alpha}
\]
(Notice that $\left(U^{\beta}S_{\beta}\right)U^{\alpha} = 0$ and $\left(F^{\mu\nu}U_{\mu}U_{\nu}\right)U^{\alpha} = 0$, and since the acceleration is already linear in $F^{\alpha\beta}$ we
do not consider multiples of it.) Therefore, we may consider an arbitrary linear combination of these, and
ask what combination gives the right behavior in the particle rest frame,
\[
\frac{dS^{\alpha}}{d\tau} = A\,F^{\alpha\beta}S_{\beta} + B\left(S_{\mu}F^{\mu\nu}U_{\nu}\right)U^{\alpha} + C\left(S_{\beta}\frac{dU^{\beta}}{d\tau}\right)U^{\alpha}
\]
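First, notice that since $U^{\alpha}S_{\alpha} = 0$,
\[
\begin{aligned}
0 &= \frac{d}{d\tau}\left(U^{\alpha}S_{\alpha}\right) \\
&= \frac{dU^{\alpha}}{d\tau}S_{\alpha} + U^{\alpha}\frac{dS_{\alpha}}{d\tau}
\end{aligned}
\]
Writing the acceleration using the Lorentz force law, combined with other non-electromagnetic forces, $F^{\alpha}$,
\[
m\,\frac{dU^{\alpha}}{d\tau} = F^{\alpha} + \frac{e}{c}\,U_{\beta}F^{\beta\alpha}
\]
we keep the constraint in the form
\[
U^{\alpha}\frac{dS_{\alpha}}{d\tau} = -\frac{dU^{\alpha}}{d\tau}S_{\alpha}
\]
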
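and compare to our conjectured expression:
\[
\begin{aligned}
\frac{dS^{\alpha}}{d\tau} &= A\,F^{\alpha\beta}S_{\beta} + B\left(S_{\mu}F^{\mu\nu}U_{\nu}\right)U^{\alpha} + C\left(S_{\beta}\frac{dU^{\beta}}{d\tau}\right)U^{\alpha} \\
U_{\alpha}\frac{dS^{\alpha}}{d\tau} &= A\,U_{\alpha}F^{\alpha\beta}S_{\beta} + B\left(S_{\mu}F^{\mu\nu}U_{\nu}\right)U_{\alpha}U^{\alpha} + C\left(S_{\beta}\frac{dU^{\beta}}{d\tau}\right)U_{\alpha}U^{\alpha} \\
&= A\,U_{\alpha}F^{\alpha\beta}S_{\beta} - c^{2}B\left(S_{\mu}F^{\mu\nu}U_{\nu}\right) - c^{2}C\left(S_{\beta}\frac{dU^{\beta}}{d\tau}\right)
\end{aligned}
\]
Substituting our constraint we must have
\[
-\frac{dU^{\alpha}}{d\tau}S_{\alpha} = A\,U_{\alpha}F^{\alpha\beta}S_{\beta} - c^{2}B\left(S_{\mu}F^{\mu\nu}U_{\nu}\right) - c^{2}C\left(S_{\beta}\frac{dU^{\beta}}{d\tau}\right)
\]
and substituting for the acceleration,
\[
-\left(\frac{1}{m}F^{\alpha} + \frac{e}{mc}\,U_{\beta}F^{\beta\alpha}\right)S_{\alpha} = A\,U_{\alpha}F^{\alpha\beta}S_{\beta} - c^{2}B\left(S_{\mu}F^{\mu\nu}U_{\nu}\right) - c^{2}C\,S_{\alpha}\left(\frac{1}{m}F^{\alpha} + \frac{e}{mc}\,U_{\beta}F^{\beta\alpha}\right)
\]
and since the non-electromagnetic force is arbitrary, we must have both
\[
\begin{aligned}
-\frac{1}{m}F^{\alpha}S_{\alpha} &= -\frac{1}{m}\,c^{2}C\,S_{\alpha}F^{\alpha} \\
-\frac{e}{mc}\,U_{\beta}F^{\beta\alpha}S_{\alpha} &= A\,U_{\alpha}F^{\alpha\beta}S_{\beta} - c^{2}B\,S_{\mu}F^{\mu\nu}U_{\nu} - c^{2}C\,\frac{e}{mc}\,S_{\alpha}U_{\beta}F^{\beta\alpha}
\end{aligned}
\]
The first of these gives $C = \frac{1}{c^{2}}$, after which the second gives
\[
\begin{aligned}
-\frac{e}{mc}\,U_{\beta}F^{\beta\alpha}S_{\alpha} &= \left(A + c^{2}B - \frac{e}{mc}\right)U_{\alpha}F^{\alpha\beta}S_{\beta} \\
0 &= A + c^{2}B
\end{aligned}
\]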

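With these restrictions, the equation of motion reduces to
\[
\frac{dS^{\alpha}}{d\tau} = A\,F^{\alpha\beta}S_{\beta} - \frac{A}{c^{2}}\left(S_{\mu}F^{\mu\nu}U_{\nu}\right)U^{\alpha} + \frac{1}{c^{2}}\left(S_{\beta}\frac{dU^{\beta}}{d\tau}\right)U^{\alpha}
\]
Now consider the limit to the rest frame, where we have a pure magnetic field, $\mathbf{E} = 0$ in $F^{\alpha\beta}$, and the
4-velocity is given by
\[
U^{\alpha} = \left(c,\ \mathbf{0}\right)
\]
Then the spatial components give
\[
\begin{aligned}
\frac{ds^{i}}{d\tau} &= A\,F^{i\alpha}s_{\alpha} \\
\frac{ds^{i}}{dt} &= A
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & -B^{3} & B^{2} \\
0 & B^{3} & 0 & -B^{1} \\
0 & -B^{2} & B^{1} & 0
\end{pmatrix}
\begin{pmatrix} 0 \\ s^{1} \\ s^{2} \\ s^{3} \end{pmatrix} \\
&= A
\begin{pmatrix}
0 \\
B^{2}s^{3} - B^{3}s^{2} \\
B^{3}s^{1} - B^{1}s^{3} \\
B^{1}s^{2} - B^{2}s^{1}
\end{pmatrix}
\end{aligned}
\]
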
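or simply
\[
\frac{d\mathbf{s}}{dt} = -A\left(\mathbf{s}\times\mathbf{B}\right)
\]
and comparing this to
\[
\frac{d\mathbf{s}}{dt} = \frac{ge}{2mc}\,\mathbf{s}\times\mathbf{B}
\]
we see that
\[
A = -\frac{ge}{2mc}
\]
and the full equation of motion for $\mathbf{s}$ becomes
\[
\begin{aligned}
\frac{dS^{\alpha}}{d\tau} &= -\frac{ge}{2mc}\,F^{\alpha\beta}S_{\beta} + \frac{ge}{2mc}\,\frac{1}{c^{2}}\left(S_{\mu}F^{\mu\nu}U_{\nu}\right)U^{\alpha} + \frac{1}{c^{2}}\left(S_{\beta}\frac{dU^{\beta}}{d\tau}\right)U^{\alpha} \\
&= -\frac{ge}{2mc}\,F^{\alpha\beta}S_{\beta} + \frac{ge}{2mc}\,\frac{1}{c^{2}}\left(S_{\mu}F^{\mu\nu}U_{\nu}\right)U^{\alpha} + \frac{1}{c^{2}}\left(\frac{e}{mc}\,U_{\mu}F^{\mu\beta}S_{\beta}\right)U^{\alpha} \\
&= -\frac{ge}{2mc}\,F^{\alpha\beta}S_{\beta} + \frac{1}{c^{2}}\left(\frac{g}{2}-1\right)\frac{e}{mc}\left(S_{\mu}F^{\mu\nu}U_{\nu}\right)U^{\alpha}
\end{aligned}
\]
This is the BMT equation. The sign difference from Jackson in the first term is because we use the opposite
convention for the sign of the metric, $\eta_{\alpha\beta}$.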
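
Two properties of the final equation can be tested numerically: it should preserve the orthogonality of spin and 4-velocity, and in the rest frame it should reduce to ds/dt = (ge/2mc) s×B. The sketch below is a rough check under stated assumptions, not a definitive implementation: the function `bmt_rates` and all numerical values are illustrative, the metric is diag(−1, 1, 1, 1), only the electromagnetic force acts, and the time component of the spin is chosen as S⁰ = v·S/c so that the orthogonality condition holds initially.

```python
import math

def faraday(E, B):
    """F^{alpha beta} with F^{i0} = E^i and F^{12} = -B^3 (and cyclic)."""
    return [[0.0, -E[0], -E[1], -E[2]],
            [E[0], 0.0, -B[2], B[1]],
            [E[1], B[2], 0.0, -B[0]],
            [E[2], -B[1], B[0], 0.0]]

def lower(V):
    """Lower an index with eta = diag(-1, 1, 1, 1)."""
    return [-V[0], V[1], V[2], V[3]]

def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

def dot4(a_lower, b_upper):
    return sum(x * y for x, y in zip(a_lower, b_upper))

def bmt_rates(v, s, E, B, e, m, c, g):
    """Return (dS/dtau, dU/dtau, U, S) for velocity v and spatial spin part s."""
    gamma = 1.0 / math.sqrt(1.0 - dot3(v, v) / c ** 2)
    U = [gamma * c] + [gamma * x for x in v]
    S = [dot3(v, s) / c] + list(s)          # S^0 fixed by U_alpha S^alpha = 0
    F = faraday(E, B)
    U_l, S_l = lower(U), lower(S)
    FS = [sum(F[a][b] * S_l[b] for b in range(4)) for a in range(4)]  # F^{ab} S_b
    FU = [sum(F[a][b] * U_l[b] for b in range(4)) for a in range(4)]  # F^{ab} U_b
    SFU = dot4(S_l, FU)                                               # S_a F^{ab} U_b
    dS = [-(g * e / (2 * m * c)) * FS[a]
          + (1.0 / c ** 2) * (g / 2.0 - 1.0) * (e / (m * c)) * SFU * U[a]
          for a in range(4)]
    # Lorentz force only: dU^a/dtau = (e/mc) U_b F^{ba} = -(e/mc) F^{ab} U_b
    dU = [-(e / (m * c)) * FU[a] for a in range(4)]
    return dS, dU, U, S

# Sample values (assumptions, arbitrary units).
e, m, c, g = 1.0, 2.0, 3.0, 2.3
E = [0.3, -1.1, 0.7]
B = [0.9, 0.2, -0.5]
s = [0.2, -0.7, 0.4]

# Moving particle: d(U.S)/dtau should vanish.
dS, dU, U, S = bmt_rates([0.9, -1.2, 1.5], s, E, B, e, m, c, g)
orthogonality_rate = dot4(lower(U), dS) + dot4(lower(S), dU)

# Rest frame: spatial part should equal (g e / 2 m c) s x B.
dS_rest = bmt_rates([0.0, 0.0, 0.0], s, E, B, e, m, c, g)[0]
s_cross_B = [s[1] * B[2] - s[2] * B[1],
             s[2] * B[0] - s[0] * B[2],
             s[0] * B[1] - s[1] * B[0]]
rest_diff = max(abs(dS_rest[i + 1] - g * e / (2 * m * c) * s_cross_B[i])
                for i in range(3))
```

Both `orthogonality_rate` and `rest_diff` come out at the level of floating-point rounding for any value of `g`, which reflects the fact that the coefficients A, B and C were fixed by exactly these two requirements.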
