
CHAPTER 2

MINKOWSKI SPACETIME AND SPECIAL RELATIVITY

Scarcely anyone who truly understands relativity theory can
escape this magic.

A. Einstein

In the previous chapter we saw that tensors are a very good tool for writing covariant equations in
3-dimensional Euclidean space. In this chapter we will generalize the tensor concept to the framework
of the Special Theory of Relativity, the Minkowski spacetime. I will assume the reader to be familiar
at least with the rudiments of Special Relativity, avoiding therefore any kind of historical introduction
to the theory.

2.1 Einstein’s Relativity


Special Relativity is based on two basic axioms, formulated by Einstein in 1905¹:

1. Principle of Relativity (Galileo): The laws of physics are the same in all the inertial frames: No
experiment can measure the absolute velocity of an observer; the results of any experiment do not
depend on the speed of the observer relative to other observers not involved in the experiment.
2. Invariance of the speed of light: The speed of light in vacuum is the same in all the inertial
frames.

Instantaneous action at a distance is inconsistent with the second postulate and must be replaced
by retarded action at a distance. Absolute simultaneity will only apply as an approximation at low
velocities for nearby events.

2.2 Minkowski spacetime: new wine in an old bottle


¹ On the electrodynamics of moving bodies.

The framework of Special Relativity is a 4-dimensional manifold called Minkowski (or pseudo-Euclidean)
spacetime. The differential and topological structures of the Newtonian and Minkowskian spacetimes
coincide², but they differ in the metrical structure, i.e. in the definition of distances. While in
Newtonian spacetime the spatial and temporal distances are independent, in Minkowskian spacetime space
and time by themselves are doomed to fade away into mere shadows, and only a kind of union of the
two will preserve an independent reality³. Space and time are distinguished only by a sign, which will
nevertheless play a central role. For any inertial frame of reference in Minkowski spacetime there is a set
of coordinates⁴
$$\{x^\mu\} = \{t, x, y, z\} = \{x^0, x^1, x^2, x^3\} = \{x^0, x^i\} , \tag{2.1}$$
and a set of orthonormal basis vectors

$$\{e_\mu\} = \{e_t, e_x, e_y, e_z\} = \{e_0, e_1, e_2, e_3\} = \{e_0, e_i\} , \tag{2.2}$$

satisfying the Lorentz orthonormality condition⁵

$$e_\mu \cdot e_\nu = \eta_{\mu\nu} , \tag{2.3}$$

with
$$\eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \mathrm{diag}(-1, 1, 1, 1) . \tag{2.4}$$
The inverse of (2.4) is traditionally denoted by $\eta^{\mu\nu}$ and satisfies⁶

$$\eta^{\mu\nu} \eta_{\nu\rho} = \delta^\mu{}_\rho , \tag{2.5}$$

where the 4-dimensional Kronecker delta $\delta^\mu{}_\rho$ is the indexed version of the identity matrix, i.e. $\delta^\mu{}_\rho = 1$
if $\mu = \rho$ and zero otherwise. Note that $\eta_{\mu\nu}$ and $\eta^{\mu\nu}$ are numerically equal.
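The relation (2.5) can be checked numerically. The following is only a minimal sketch in plain Python; the names `ETA`, `ETA_INV`, `IDENTITY` and `matmul` are illustrative choices, not notation from the text.

```python
# Minkowski metric eta_{mu nu} with the (-,+,+,+) signature of Eq. (2.4).
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# eta^{mu nu} has the same entries as eta_{mu nu}, so we simply copy ETA.
ETA_INV = [row[:] for row in ETA]

# Kronecker delta as the 4x4 identity matrix.
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Check eta^{mu nu} eta_{nu rho} = delta^mu_rho.
print(matmul(ETA_INV, ETA) == IDENTITY)  # prints True
```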

Exercise
What is the value of $\delta^\mu{}_\mu$?

In terms of the basis vectors, the infinitesimal displacement $dS$ between two points in spacetime can
be expressed as
$$dS = dx^\mu e_\mu , \tag{2.6}$$
where $dx^\mu$ are the so-called contravariant components. Taking the scalar product of
the vector $dS$ with the basis vector $e_\mu$ gives

$$dS \cdot e_\mu = (dx^\nu e_\nu) \cdot e_\mu = dx^\nu (e_\nu \cdot e_\mu) = dx^\nu \eta_{\nu\mu} = dx_\mu , \tag{2.7}$$

where we have defined the covariant or dual components by lowering the index of the contravariant
components with the metric
$$dx_\mu \equiv \eta_{\mu\nu} dx^\nu . \tag{2.8}$$

² Both of them are smooth, continuous, homogeneous, isotropic, orientable, ...
³ Minkowski, 1908.
⁴ As in the previous chapter, we will not consider the quadruplet $\{x^\mu\}$ to be a vector in Minkowski spacetime. Coordinate indices will always be upper indices.
⁵ Note that the time coordinate is 4-dimensionally orthogonal to the spatial coordinates.
⁶ Einstein's summation convention is used.

Exercise
Starting with a covariant vector defined by Eq. (2.8), show that the inverse of the metric $\eta^{\mu\nu}$
can be used to raise indices
$$\eta^{\mu\nu} dx_\nu = dx^\mu . \tag{2.9}$$

As in the Euclidean case, contravariant and covariant vectors are just an appropriate way of
simplifying the notation and taking into account the summation convention. Note however that in
the present case lowering or raising indices changes the sign of the temporal component while keeping
intact the spatial ones
$$dx_0 = -dx^0 , \qquad dx_i = +dx^i . \tag{2.10}$$
The upper and lower index notation automatically keeps track of the minus signs associated with the
temporal component. The indefiniteness of the metric is automatically incorporated in the notation!
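The sign rule (2.10) can be illustrated with a short sketch. It assumes the diagonal metric (2.4); the helper names `lower` and `raise_index` and the sample components are made up for the example.

```python
# Diagonal entries of eta_{mu nu} = diag(-1, 1, 1, 1); eta^{mu nu} has the same entries.
ETA_DIAG = (-1, 1, 1, 1)

def lower(v_up):
    """Covariant components v_mu = eta_{mu nu} v^nu for a diagonal metric."""
    return [ETA_DIAG[mu] * v_up[mu] for mu in range(4)]

def raise_index(v_down):
    """Contravariant components v^mu = eta^{mu nu} v_nu."""
    return [ETA_DIAG[mu] * v_down[mu] for mu in range(4)]

dx_up = [2.0, 1.0, -3.0, 0.5]      # hypothetical components (dt, dx, dy, dz)
dx_down = lower(dx_up)

print(dx_down)                      # [-2.0, 1.0, -3.0, 0.5]
print(raise_index(dx_down) == dx_up)  # True: raising undoes lowering
```

Only the temporal component changes sign, which is all that the index gymnastics encodes in Minkowski spacetime.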

In some old-fashioned books and in 't Hooft's lecture notes you will find a fourth coordinate
$x^4 = it$ instead of the coordinate $x^0 = t$ used above. Written in terms of $x^4$, Minkowski
spacetime has the appearance of a positive-definite 4-dimensional Euclidean space

$$e_\mu \cdot e_\nu = \delta_{\mu\nu} \tag{2.11}$$

and there is no difference between lower and upper indices. This notation is however confusing,
since it hides the non-positive-definite character of the metric.

The square of the infinitesimal distance between two events in Minkowskian spacetime is given by

$$|dS|^2 \equiv ds^2 = \eta_{\mu\nu} dx^\mu dx^\nu = dx_\mu dx^\mu = -dt^2 + dX^2 , \tag{2.12}$$

where $\eta_{\mu\nu}$ is the Minkowski metric and $dX^2 \equiv dx^2 + dy^2 + dz^2$ denotes the spatial interval.

Note that we have arbitrarily chosen the spacelike convention $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ for the metric
signature, which keeps intact the notation used for Cartesian tensors in Euclidean space.
Some books use instead the timelike convention, taking $\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$.
Although the physics is independent of the convention used, the signs
appearing in the formulas in those books may differ from those in the expressions presented
here. For instance, using the convention $\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$, Eq. (2.40) would change to

$$p_\mu p^\mu = m^2 . \tag{2.13}$$

Note however that in both cases you recover $E^2 = p^2 + m^2$ after expanding the expression into
its components.

Note that, contrary to the Newtonian case, the metric $\eta_{\mu\nu}$ is not positive-definite. Given the
Lorentzian signature $(-+++)$, the interval (2.12) can be positive, zero, or negative:

• If $ds^2 = 0$, then $dX/dt = 1$ and the interval corresponds to the trajectory of a light ray. Such an
interval is called null or lightlike. The set of all lightlike worldlines leaving or arriving at a given
point $x^\mu$ spans the future or past lightcone of the event. There is a lightcone associated with each
point in spacetime.

Figure 2.1: Minkowski spacetime. (Labels: future light cone, past light cone, timelike separation, spacelike separation, massive particle, massless particle.)

• If $ds^2 < 0$ the interval is said to be timelike. It corresponds to the worldline of a particle with
nonzero rest mass moving with a velocity smaller than that of light, $dX/dt < 1$. Two events separated
by such an interval lie inside each other's lightcone and can be in causal contact. There will exist
a frame in which the two events happen at the same position but at different times.
• If $ds^2 > 0$ the interval is termed spacelike. There will exist a frame in which the two events
happen at the same time but at different places, without any causal relation between them.

The different concepts are summarized in Fig. 2.1.
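The three cases can be summarized in a few lines of code. This is a minimal sketch that assumes natural units (c = 1) and the signature of Eq. (2.12); the function names `interval` and `classify` are illustrative.

```python
def interval(dt, dx, dy, dz):
    """Squared interval ds^2 = -dt^2 + dx^2 + dy^2 + dz^2, Eq. (2.12), with c = 1."""
    return -dt**2 + dx**2 + dy**2 + dz**2

def classify(dt, dx, dy, dz):
    """Label a separation as timelike, lightlike or spacelike from the sign of ds^2."""
    ds2 = interval(dt, dx, dy, dz)
    if ds2 < 0:
        return "timelike"
    if ds2 > 0:
        return "spacelike"
    # Exact zero: only reliable for exactly representable inputs (e.g. integers).
    return "lightlike"

print(classify(2, 1, 0, 0))  # timelike
print(classify(1, 1, 0, 0))  # lightlike
print(classify(1, 2, 0, 0))  # spacelike
```

With floating-point input the lightlike case would normally be tested against a small tolerance rather than exact zero.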

2.3 Minkowski spacetime isometry group


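The transformations of Special Relativity are defined as those that do not change the Minkowski line
element (2.12) (not the spatial or temporal intervals separately!). Following the procedure outlined
in the previous chapter, and taking into account that $\eta_{\mu\nu}$ is also a constant metric, the requirement
$ds^2 = d\bar{s}^2$ gives rise to the following condition
$$\eta_{\rho\sigma} \frac{\partial \Lambda^\rho{}_\mu}{\partial x^\pi} \Lambda^\sigma{}_\nu = 0 \quad \rightarrow \quad \frac{\partial \Lambda^\rho{}_\mu}{\partial x^\pi} = 0 , \tag{2.14}$$
where we have defined
$$\Lambda^\mu{}_\nu \equiv \frac{\partial \bar{x}^\mu}{\partial x^\nu} . \tag{2.15}$$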
As before, the transformation relating the two reference frames must be linear⁷. This set of
transformations constitutes the so-called inhomogeneous Lorentz group or Poincaré group, which is a
combination of translations
$$x^\mu \rightarrow x^\mu + a^\mu \tag{2.16}$$
and linear homogeneous Lorentz transformations

$$\bar{x}^\mu = \Lambda^\mu{}_\nu x^\nu , \tag{2.17}$$

with Λ a 4 × 4 matrix, independent of the coordinates. The first (upper) index in $\Lambda^\mu{}_\nu$ labels rows,
while the second (lower) one labels columns.
⁷ Note that this is basically due to the fact that we are dealing with constant metrics.

In order to preserve the line element (2.12), the constant matrices $\Lambda^\mu{}_\nu$ are required to satisfy the
pseudo-orthogonality condition
$$\eta_{\mu\nu} = \eta_{\rho\sigma} \Lambda^\rho{}_\mu \Lambda^\sigma{}_\nu , \tag{2.18}$$
which, in matrix notation, becomes
$$\eta = \Lambda^T \eta \Lambda , \tag{2.19}$$
with $T$ denoting matrix transpose. Eq. (2.18) is the relativistic analogue of the orthogonality condition
(1.19). The determinant of the Λ matrices is also ±1. As we did in the previous chapter, we will not
consider the full Lorentz group⁸ (which is neither connected nor compact)

$$O(3,1) = L_+ \cup P L_+ \cup T L_+ \cup PT L_+ , \tag{2.20}$$

with $P$ and $T$ the parity $P^\mu{}_\nu = \mathrm{diag}(+1, -I)$ and time reversal $T^\mu{}_\nu = \mathrm{diag}(-1, +I)$ operations. We
will restrict ourselves to the continuous Lorentz transformations connected with the identity (the
proper Lorentz group)

$$L_+ \equiv SO(3,1) = \{\Lambda \,|\, \Lambda^T \eta \Lambda = \eta ,\ \Lambda^0{}_0 \geq 0 ,\ \det \Lambda = 1\} \tag{2.21}$$

with S denoting special or reflection-free. These transformations are the relativistic analogue of proper
rotations in Euclidean space.

Exercise
Verify that the restricted set of Lorentz transformations (2.21) forms a group:

• Closure: The product of any two Lorentz transformations is another Lorentz transformation.
• There is an identity transformation.
• Every Lorentz transformation has an inverse.
• The product of Lorentz transformations is associative.

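The fact that $\eta^2 = I_4$, with $I_4$ the identity matrix, allows us to easily compute the inverse Lorentz
transformation
$$\eta^2 = \eta \Lambda^T \eta \Lambda = \left(\eta \Lambda^T \eta\right) \Lambda = I_4 \quad \rightarrow \quad \Lambda^{-1} = \eta \Lambda^T \eta , \tag{2.22}$$
which, writing explicitly the components, becomes

$$\left(\Lambda^{-1}\right)^\mu{}_\nu = \eta^{\mu\lambda} \Lambda^\rho{}_\lambda \eta_{\rho\nu} = \Lambda_\nu{}^\mu . \tag{2.23}$$

The position of the indices is important: $\Lambda^\mu{}_\nu \neq \Lambda_\nu{}^\mu$ !!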

⁸ Note that now, not only the determinant of Λ, but also the element $\Lambda^0{}_0$, plays a special role in the splitting (2.20).

How many Lorentz transformations are there? Each Lorentz transformation is represented by a 4×4
matrix, which makes a total of 16 components. The pseudo-orthogonality condition (2.18) imposes
however some constraints. Taking the transpose of that equation leaves it unchanged, so it is symmetric
in µ and ν and its independent components are just the diagonal elements plus half of the off-diagonal
elements, i.e. 4 + 6 = 10 constraints. We are
left therefore with 16 − 10 = 6 independent Lorentz transformations. There are two different kinds of
homogeneous Lorentz transformations. The most obvious ones are spatial rotations
$$\Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 \\ 0 & R^i{}_j \end{pmatrix} , \tag{2.24}$$

where $R^i{}_j$ is a 3 × 3 orthogonal matrix, $\delta_{kl} = \delta_{ij} R^i{}_k R^j{}_l$, with $i, j$ running only over spatial directions.
There are three independent rotation matrices, one per spatial direction. For instance, a rotation of
angle θ around the z-axis takes the form
$$R^i{}_j = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} . \tag{2.25}$$
The difference with Euclidean transformations arises when considering the so-called boosts, which mix
the spatial and temporal components. There are three of them, each one associated with the mixing of
a particular spatial component with time. As an example consider a boost of rapidity $\eta = \tanh^{-1} v$
along the x direction
$$\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cosh\eta & -\sinh\eta & 0 & 0 \\ -\sinh\eta & \cosh\eta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{2.26}$$

where we have defined the parameters $\gamma = 1/\sqrt{1 - v^2}$ and $\beta = v$ (in natural units), with $v$ the 3-velocity. After the boost, the
temporal and spatial coordinates are linear and homogeneous combinations of the spatial and temporal
coordinates in the old frame.
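As a numerical sanity check, the boost (2.26) can be tested against the pseudo-orthogonality condition (2.19) and the inverse formula (2.22). The sketch below uses plain Python lists; the helper names (`boost_x`, `matmul`, `transpose`, `close`) and the sample velocity v = 0.6 are assumptions made for the illustration.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def boost_x(v):
    """Boost along x with velocity v (natural units, |v| < 1), as in Eq. (2.26)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v, 0.0, 0.0],
            [-gamma * v, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two 4x4 matrices within a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

L = boost_x(0.6)

# Pseudo-orthogonality: Lambda^T eta Lambda = eta, Eq. (2.19).
print(close(matmul(matmul(transpose(L), ETA), L), ETA))   # True

# Inverse: Lambda^{-1} = eta Lambda^T eta, Eq. (2.22).
L_inv = matmul(matmul(ETA, transpose(L)), ETA)
print(close(matmul(L_inv, L), IDENTITY))                  # True
print(close(L_inv, boost_x(-0.6)))                        # True: boost with -v
```

The last line reflects the expected physical statement that the inverse of a boost with velocity v is the boost with velocity −v.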

Exercise
Verify that Eq. (2.26) gives rise to the standard Lorentz transformation

$$t' = \gamma (t - v x) , \qquad x' = \gamma (x - v t) . \tag{2.27}$$

To do so, assume $\Lambda^\mu{}_\nu$ to be the transformation from the rest frame of a given inertial
observer to the frame of a second inertial observer moving with speed $v$ along the x axis, and
determine the relation between that velocity and η.

The convenience of using the rapidity parameter η instead of the velocity $v$ resides in the fact that
η combines additively. In fact, if we consider two consecutive boosts in the same direction, we have
$$\Lambda(\eta_1) \Lambda(\eta_2) = \Lambda(\eta_1 + \eta_2) . \tag{2.28}$$
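The additivity property (2.28) is easy to verify numerically on the t-x block of the boost matrix. This is a minimal sketch; the function names and the sample rapidities 0.3 and 1.1 are arbitrary choices for the illustration.

```python
import math

def boost_2d(eta):
    """t-x block of the boost (2.26) written in terms of the rapidity eta."""
    return [[math.cosh(eta), -math.sinh(eta)],
            [-math.sinh(eta), math.cosh(eta)]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

eta1, eta2 = 0.3, 1.1
lhs = matmul2(boost_2d(eta1), boost_2d(eta2))   # Lambda(eta1) Lambda(eta2)
rhs = boost_2d(eta1 + eta2)                     # Lambda(eta1 + eta2)

max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff < 1e-12)  # True
```

The agreement is nothing but the addition theorems for cosh and sinh, evaluated in floating point.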

Exercise
Consider the composition of 2 boosts with velocities $v_1$ and $v_2$ along the x direction. Show that
$\Lambda_1 \Lambda_2$ gives rise to a boost with 3-velocity
$$v = \frac{v_1 + v_2}{1 + v_1 v_2} . \tag{2.29}$$
What happens in the limit $v_1, v_2 \ll 1$? Generalize the previous result to an arbitrary direction. Is
the general result symmetric under the interchange of the two velocities? Note that all the
expressions are written in natural units.

Figure 2.2: A boost transformation. (Labels: the t and x axes of the two frames, and the lines of simultaneous events 1 and 2.)

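The form of Eq. (2.26),

$$\Lambda^\mu{}_\nu = \begin{pmatrix} \cosh\eta & -\sinh\eta & 0 & 0 \\ -\sinh\eta & \cosh\eta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \sim \begin{pmatrix} \cos i\eta & \sin i\eta & 0 & 0 \\ \sin i\eta & \cos i\eta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{2.30}$$

(a formal correspondence, since $\cos i\eta = \cosh\eta$ and $\sin i\eta = i \sinh\eta$, with the factors of $i$ and the signs absorbed by the imaginary coordinate $x^4 = it$)
and the property (2.28) closely resemble those of spatial rotations (2.25). The main difference is
the replacement of trigonometric functions by their hyperbolic analogues, reflecting the relative sign of the
temporal direction with respect to the spatial directions. Note however two important differences
between boosts and ordinary rotations: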

• The rotation parameter θ in Eq. (2.25) runs between 0 and 2π, with both points included. The
rapidity parameter η is non-compact and can take any value in ℝ.
• The boost matrix (2.26) is symmetric, which is not the case for ordinary rotations, cf. Eq.
(2.25).

Although we should not take the analogy between rotations and boosts too seriously, it is instructive
to look at the action of a Lorentz transformation on a spacetime diagram. As shown in Fig. 2.2, a
Lorentz boost rotates time and space by the same angle $\eta = \tanh^{-1} v$, but in opposite directions! In
Special Relativity the simultaneity of two events depends on the observer: the hyperplanes of constant
coordinate time do not have an invariant meaning. Note however that the light cone (i.e. the dashed
line at 45 degrees in the diagram) is invariant under Lorentz transformations.

Exercise
Use the diagram in Fig. 2.2 to derive the well-known effects of space contraction
$$\bar{L} = L \sqrt{1 - v^2} \quad \rightarrow \quad \bar{L} < L , \tag{2.31}$$

and time dilation

$$\bar{T} = \frac{T}{\sqrt{1 - v^2}} \quad \rightarrow \quad \bar{T} > T . \tag{2.32}$$
Hint: Don't be confused by drawing just a slice of Minkowski spacetime on a Euclidean sheet of paper.
Use the Minkowski metric!

2.4 Tensors in Minkowski spacetime


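Einstein's Principle of Relativity introduced at the beginning of this chapter implies that all
the laws of physics must retain their mathematical form in all the inertial frames, i.e. they must be
covariant under Lorentz transformations. As we learnt in the previous chapter, tensorial equations
automatically satisfy this requirement. Since the required discussion of tensors in Minkowski spacetime
closely follows that in Section 1.4 for Euclidean space⁹, we will simply summarize the results in
Table 2.1.

Lorentz transformations            $\partial \bar{x}^\mu / \partial x^\nu \equiv \Lambda^\mu{}_\nu$ are constants!

Scalar                             $\bar{\phi} = \phi$
Contravariant vector               $\bar{V}^\mu = \frac{\partial \bar{x}^\mu}{\partial x^\nu} V^\nu$
Covariant vector                   $\bar{V}_\mu = \frac{\partial x^\nu}{\partial \bar{x}^\mu} V_\nu$
Contravariant rank-2 tensor        $\bar{T}^{\mu\nu} = \frac{\partial \bar{x}^\mu}{\partial x^\rho} \frac{\partial \bar{x}^\nu}{\partial x^\sigma} T^{\rho\sigma}$
Covariant rank-2 tensor            $\bar{T}_{\mu\nu} = \frac{\partial x^\rho}{\partial \bar{x}^\mu} \frac{\partial x^\sigma}{\partial \bar{x}^\nu} T_{\rho\sigma}$
Mixed rank-2 tensor                $\bar{T}^\mu{}_\nu = \frac{\partial \bar{x}^\mu}{\partial x^\rho} \frac{\partial x^\sigma}{\partial \bar{x}^\nu} T^\rho{}_\sigma$

Table 2.1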

Exercise
How does the volume element $d^4x = dx^0 dx^1 dx^2 dx^3$ transform under Lorentz transformations?

2.5 Covariance and Relativistic Mechanics


In Newtonian spacetime the trajectory of a particle is described by the position 3-vector as a function
of time $x^i(t)$, with $t$ an absolute element of the theory. In Special Relativity a particle of mass $m$
follows a timelike worldline in Minkowski spacetime. The time depends on the chosen reference frame
and it is just another coordinate, on the same footing as the spatial coordinates. The trajectory $x^\mu(\sigma)$ can
be expressed in terms of a completely arbitrary parameter σ, which changes continuously along the
worldline and does not need to have any particular physical interpretation. However, an interesting
(and natural) possibility for the particular case of massive particles is to identify it with the proper
time τ of the particle. This proper time τ is defined as the time measured by an observer in the
particle's rest frame ($dX = 0$), which can always be reached by performing a Lorentz transformation.
The proper time interval $d\tau$ is therefore related to the Minkowski spacetime interval¹⁰ by

$$ds^2 = -d\tau^2 < 0 . \tag{2.33}$$


⁹ Note indeed that the only conceptual difference is the replacement of rotations by Lorentz transformations, the replacement of Latin indices by Greek indices, and the use of the Minkowski metric, instead of the Euclidean one, for lowering and raising indices.
¹⁰ Note that the proper time is not a useful parametrization for the worldline of massless particles, such as photons, since these particles move on the light cone and can travel any distance in zero proper time, $d\tau^2 = -ds^2 = 0$.

The connection between the proper time and the measurement made in an inertial reference frame
with coordinate time interval $dt$ is given by
$$\gamma \equiv \frac{dt}{d\tau} = \left(1 - v^2\right)^{-1/2} . \tag{2.34}$$

Note that γ is a growing function of the magnitude of the 3-velocity $v^i = dx^i/dt$ and it is never smaller than one. The
proper time goes by at a slower rate than the coordinate time $t$.
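To get a feeling for the size of the effect, Eq. (2.34) can be evaluated for a few velocities. The sketch below assumes natural units (v measured in units of c); the function names `lorentz_factor` and `proper_time` are illustrative.

```python
import math

def lorentz_factor(v):
    """gamma = dt/dtau = (1 - v^2)^(-1/2), Eq. (2.34), with v in units of c."""
    if abs(v) >= 1.0:
        raise ValueError("a massive particle requires |v| < 1")
    return 1.0 / math.sqrt(1.0 - v * v)

def proper_time(dt, v):
    """Proper time elapsed during the coordinate time dt at constant speed v."""
    return dt / lorentz_factor(v)

for v in (0.0, 0.1, 0.6, 0.99):
    print(v, lorentz_factor(v), proper_time(10.0, v))
```

For v = 0.6 the factor is close to 1.25, so roughly 8 units of proper time elapse during 10 units of coordinate time.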

4-velocity

Given τ, and in clear analogy with the 3-dimensional case, the 4-velocity $u^\mu$ along the trajectory is
given by the 4-vector
$$u^\mu \equiv \frac{dx^\mu}{d\tau} , \tag{2.35}$$

which is tangent to the worldline of the particle and automatically normalized

$$\eta_{\mu\nu} u^\mu u^\nu = u_\mu u^\mu = -1 . \tag{2.36}$$

In terms of its components, the 4-velocity $u^\mu$ can be written as

$$u^\mu = \frac{dx^\mu}{d\tau} = \frac{dt}{d\tau} \left(1, v^i\right)^T = \gamma \left(1, v^i\right)^T . \tag{2.37}$$
For an observer at rest γ = 1 and Eq. (2.37) becomes simply $u^\mu = (1, 0, 0, 0)^T$. In the Newtonian limit
$v \ll c$, $d\tau \rightarrow dt$ and $u^i \rightarrow v^i$.
Let us consider the behaviour of $u^\mu$ with respect to Lorentz transformations. Since the proper time
$d\tau$ is invariant under Lorentz transformations and $dx^\mu$ transforms as a contravariant vector, we have

$$\bar{u}^\mu = \Lambda^\mu{}_\nu u^\nu . \tag{2.38}$$

The 4-velocity is a timelike 4-vector.
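The normalization (2.36) holds for any 3-velocity below the speed of light, which can be checked directly. This is a minimal sketch in natural units; `four_velocity`, `minkowski_dot` and the sample velocity are assumptions of the example.

```python
import math

def four_velocity(vx, vy, vz):
    """u^mu = gamma (1, v^i), Eq. (2.37), for a 3-velocity with |v| < 1 (c = 1)."""
    v2 = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - v2)
    return [gamma, gamma * vx, gamma * vy, gamma * vz]

def minkowski_dot(a, b):
    """eta_{mu nu} a^mu b^nu with signature (-,+,+,+)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

u = four_velocity(0.3, -0.4, 0.5)            # hypothetical 3-velocity, v^2 = 0.5
print(abs(minkowski_dot(u, u) + 1.0) < 1e-12)  # True: u.u = -1

u_rest = four_velocity(0.0, 0.0, 0.0)
print(u_rest)                                # [1.0, 0.0, 0.0, 0.0]
```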

4-momentum

Since the mass $m$ of the particle is a scalar under Lorentz transformations, the 4-momentum

$$p^\mu = m u^\mu \tag{2.39}$$
is a Lorentz 4-vector with components $p^\mu = (E, p^i)^T = (m\gamma, m\gamma v^i)^T$.

In the instantaneous rest frame of the particle, $p^\mu = (m, \mathbf{0})^T$. This can be used to simplify many
computations. We can compute things in this particular frame and then re-express the result
in a form valid in any other inertial frame by appealing to covariance.

Using the normalization condition for the 4-velocity (2.36), the normalization condition for the
4-momentum becomes
$$p_\mu p^\mu = -m^2 , \tag{2.40}$$
which is nothing else than the well-known energy-momentum relation $E = \sqrt{p^2 + m^2}$ written in a
covariant way¹¹. In the Newtonian limit ($|p^i| \ll m$) this relation becomes the familiar kinetic energy of
the Newtonian theory together with the energy equivalent of the mass, $mc^2$, i.e. $E \simeq m + \frac{p^2}{2m}$. Note
that the 4-momentum remains well defined even for massless particles, where it has zero square norm
and becomes lightlike, $p_\mu p^\mu = 0$. We will take this as the definition of a massless classical particle. The
indefiniteness of the Minkowski metric allows for non-zero values of the temporal and spatial parts as long
as they cancel out in $p_\mu p^\mu$. In particular, we can always find a frame in which $p^\mu = (E, 0, 0, E)^T$.
¹¹ Note that $p$ denotes the three-momentum $p^i$, so that $p^2 = p_i p^i$.
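The quality of the Newtonian approximation can be seen by comparing the exact energy with $m + p^2/2m$ for decreasing momenta. This sketch works in natural units; the function names and the sample values are illustrative only.

```python
import math

def energy(p, m):
    """Exact relativistic energy E = sqrt(p^2 + m^2) in natural units."""
    return math.sqrt(p * p + m * m)

def newtonian_energy(p, m):
    """Rest energy plus Newtonian kinetic energy, E ~ m + p^2 / (2 m); needs m > 0."""
    return m + p * p / (2.0 * m)

m = 1.0
for p in (1.0, 0.1, 0.01):
    exact = energy(p, m)
    approx = newtonian_energy(p, m)
    print(p, exact, approx, approx - exact)

# Massless particle: the energy equals the magnitude of the momentum.
print(energy(2.0, 0.0))  # 2.0
```

The difference between the two expressions shrinks like $p^4/8m^3$, the first term neglected in the expansion.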

4-acceleration

It is important to remark that Special Relativity, like Newtonian mechanics, is concerned with the
relation between inertial observers and not with the behavior of the objects that they are studying.
In particular, the observed objects can be accelerating. The 4-acceleration in Minkowski spacetime
can be defined as¹²
$$a^\mu = \lim_{\epsilon \to 0} \frac{u^\mu(\tau + \epsilon) - u^\mu(\tau)}{\epsilon} \quad \rightarrow \quad a^\mu \equiv \frac{du^\mu}{d\tau} = \frac{d^2 x^\mu}{d\tau^2} . \tag{2.42}$$

The 4-acceleration (2.42) transforms in the proper way under Lorentz transformations since $\Lambda^\mu{}_\nu$
is constant and depends only on the relative velocity between the two frames. This fact allows it to pass
through the derivative $d^2/d\tau^2$. The acceleration $a^\mu$ is a spacelike 4-vector

$$\eta_{\mu\nu} a^\mu a^\nu > 0 \tag{2.43}$$

orthogonal to the timelike 4-velocity

$$a_\mu u^\mu = 0 , \tag{2.44}$$
as can be easily shown by computing the derivative of Eq. (2.36) with respect to the proper time.
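For completeness, the step mentioned above can be written out explicitly, using only the constancy of $\eta_{\mu\nu}$ and the normalization (2.36):

```latex
\frac{d}{d\tau}\left(\eta_{\mu\nu} u^\mu u^\nu\right)
  = \eta_{\mu\nu}\left(a^\mu u^\nu + u^\mu a^\nu\right)
  = 2\, a_\mu u^\mu
  = \frac{d}{d\tau}(-1) = 0
  \quad \Longrightarrow \quad a_\mu u^\mu = 0 .
```

In the instantaneous rest frame, where $u^\mu = (1, 0, 0, 0)^T$, this condition reads $a^0 = 0$, so that $\eta_{\mu\nu} a^\mu a^\nu = a_i a^i$ is positive for any non-vanishing acceleration, in agreement with (2.43).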

Energy-momentum conservation

The objects defined above allow us to easily generalize Newton's second law, which becomes
$$f^\mu = \frac{dp^\mu}{d\tau} = m a^\mu . \tag{2.45}$$
Note that Eq. (2.45) is a tensorial identity, which maintains its form under Lorentz transformations
and automatically satisfies Einstein's Principle of Relativity. The explicit expression of $f^\mu$ depends on
the considered interaction (cf. the first exercise in Section 2.7). The components of the 4-vector $f^\mu$
in Eq. (2.45) are proportional to the Newtonian force $F^i$ and to the work done by $F^i$ per unit time,
i.e. $f^\mu = \gamma \left(v_i F^i, F^i\right)^T$. Contrary to what happens in Newtonian physics, energy and momentum
conservation laws are not independent. The conservation laws of Newtonian mechanics in a given
collision between particles are replaced by a conservation law for the total 4-momentum¹³
$$\sum_{\rm in} p^\mu_{\rm in} = \sum_{\rm out} p^\mu_{\rm out} , \tag{2.46}$$

¹² You are probably wondering why I am making such a mess by writing out the explicit definition in (2.42) instead of
directly writing
$$a^\mu \equiv \frac{du^\mu}{d\tau} = \frac{d^2 x^\mu}{d\tau^2} . \tag{2.41}$$
The point I want you to notice is that the two vectors $u^\mu(\tau + \epsilon)$ and $u^\mu(\tau)$ are located at different points in spacetime.
What we are really doing when computing the acceleration in Special Relativity is trivially moving the two vectors
to the same point in spacetime before subtracting them. This trivial operation of moving a vector from one point to
another will turn out to be not so trivial in General Relativity. As you will see, we will need to introduce some extra
machinery in order to do that... but let's move one step at a time...
¹³ The interaction is assumed to be a contact interaction. Particles are free away from the interaction point.

with the subscripts in and out denoting the incoming and outgoing particles. Note that the conserva-
tion law is Lorentz covariant and reduces to the Newtonian momentum and energy conservation for
small velocities.

Exercise
Show that a photon cannot spontaneously decay into an electron-positron pair.

2.6 Relativistic Lagrangian for free particles


The equation of motion for a free particle following from (2.45)

$$\frac{dp^\mu}{d\tau} = 0 , \tag{2.47}$$

can also be derived from a Lagrangian formulation where the role of the generalized coordinates
is played by the spacetime coordinates $x^\mu$ and the classical time $t$ is replaced by an appropriate
parameter σ. The simplest guess for the relativistic action would be a naive generalization of the
Newtonian action, namely
$$S = \frac{1}{2} m \int d\tau \left( \eta_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} \right) . \tag{2.48}$$
Note however that the previous expression is not invariant under reparametrizations of the path¹⁴
$\tau \rightarrow f(\tau)$. The dynamics of the particle seems to depend on the “internal coordinate” τ used in the
description of the curve $x^\mu(\tau)$. Moreover, the action (2.48) does not contain any information about
the lightcone. On top of that, it does not have a smooth massless limit either.
To solve these problems, we will substitute the proper time by an arbitrary parameter σ and
introduce a non-dynamical function¹⁵ $e(\sigma)$, the so-called einbein. This quantity will be treated as an
additional generalized coordinate during the intermediate computations and fixed to a particular value
only at the end. To clarify the construction, we proceed in several steps. We start by replacing the
problematic mass appearing in the action (2.48) by $e^{-1}(\sigma)$. This gives rise to the following structure

$$S \sim \frac{1}{2} \int d\sigma \left( \frac{1}{e(\sigma)} \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} \right) . \tag{2.49}$$

In order for the previous action to be invariant under reparametrizations of the path¹⁶ $\sigma \rightarrow f(\sigma)$,
the einbein $e(\sigma)$ must be chosen to transform in the proper way. The transformation rule can be
determined by inspection: the product $e(\sigma) d\sigma$ must remain invariant. In other words, the infinitesimal
displacement $d\sigma$ and the einbein must transform in opposite ways
$$d\bar{\sigma} = \dot{f}(\sigma) \, d\sigma , \qquad \bar{e}(\bar{\sigma}) = \left[\dot{f}(\sigma)\right]^{-1} e(\sigma) . \tag{2.50}$$
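The invariance of (2.49) under these rules can be made explicit. Writing $\bar{\sigma} = f(\sigma)$, the chain rule gives $dx^\mu/d\bar{\sigma} = \dot{f}^{-1}\, dx^\mu/d\sigma$, and therefore

```latex
d\bar{\sigma}\, \frac{1}{\bar{e}(\bar{\sigma})}\,
  \eta_{\mu\nu} \frac{dx^\mu}{d\bar{\sigma}} \frac{dx^\nu}{d\bar{\sigma}}
  = \dot{f}\, d\sigma \; \frac{\dot{f}}{e(\sigma)} \; \frac{1}{\dot{f}^{2}}\,
    \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma}
  = d\sigma\, \frac{1}{e(\sigma)}\,
    \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} .
```

The two factors of $\dot{f}$ coming from the measure and from the einbein cancel the two factors of $\dot{f}^{-1}$ coming from the velocities, so the integrand keeps its form in the new parametrization.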

¹⁴ The reparametrization invariance of the action should be understood as a gauge symmetry: a redundancy of the description, not a symmetry relating different solutions of the theory.
¹⁵ I.e. no kinetic term for $e(\sigma)$ will be included.
¹⁶ Or, if you prefer, invariant under 1D general coordinate transformations.

With these transformations at hand, we proceed now to reintroduce the mass parameter $m$ in the
action (2.49). The form of the new term is essentially determined by dimensional arguments,
reparametrization invariance and the massless limit. In order for the new piece to be reparametrization
invariant, the integration measure $d\sigma$ must come together with a factor $e(\sigma)$. This gives a term
$d\sigma \, e(\sigma)$ with dimension $[E]^{-2}$, which must be compensated¹⁷ by something with dimension $[E]^2$ and
built out of $m$. There you are: the new term is $d\sigma \, e(\sigma) \, m^2$. The resulting action is the so-called
einbein action
$$S = \frac{1}{2} \int d\sigma \left( \frac{1}{e(\sigma)} \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} - m^2 e(\sigma) \right) \tag{2.51}$$
and gives rise to the following Euler-Lagrange equations for the generalized coordinates $x^\mu(\sigma)$ and $e(\sigma)$

$$\frac{d}{d\sigma} \left( e^{-1}(\sigma) \frac{dx^\mu}{d\sigma} \right) = 0 \qquad \text{and} \qquad \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = -m^2 e^2(\sigma) . \tag{2.52}$$

The massive or massless character of particles is automatically incorporated in the second equation.
Indeed, choosing $e(\sigma) = 1$ and taking the limit $m \rightarrow 0$, we obtain the equations of motion for a free
massless particle
$$\frac{d^2 x^\mu}{d\sigma^2} = 0 , \qquad \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = 0 . \tag{2.53}$$
On the other hand, the equations for massive particles can be obtained by choosing $e(\sigma) = 1/m$ and
using the proper time as affine parameter ($\sigma = \tau$)

$$m \frac{d^2 x^\mu}{d\tau^2} = 0 , \qquad \eta_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = -1 . \tag{2.54}$$
This kind of choice, in which $e(\sigma)$ = constant, is called affine and restricts the function $f(\sigma)$ to the
form
$$\dot{f} = 1 \quad \rightarrow \quad f(\sigma) = \sigma + \text{constant} . \tag{2.55}$$

Exercise
Consider the massive case. Show that the action (2.51) is equivalent to the geometrical action
$$S = -m \int d\tau . \tag{2.56}$$

2.7 Maxwell’s equations


In their traditional form, Maxwell's equations¹⁸ are given by

$$\nabla \times \mathbf{E} + \partial_t \mathbf{B} = 0 , \tag{2.57}$$
$$\nabla \cdot \mathbf{B} = 0 , \tag{2.58}$$
$$\nabla \cdot \mathbf{E} = \rho , \tag{2.59}$$
$$\nabla \times \mathbf{B} - \partial_t \mathbf{E} = \mathbf{J} , \tag{2.60}$$

with $\mathbf{E}$ and $\mathbf{B}$ the electric and magnetic fields, $\mathbf{J}$ the current density and ρ the charge density. They
are 8 coupled linear differential equations, in which the boundary conditions are usually taken to be
such that for infinite systems the fields $\mathbf{E}$ and $\mathbf{B}$ go to zero at infinity. Note also the symmetry $\mathbf{E} \leftrightarrow \mathbf{B}$
in the absence of sources.
¹⁷ Recall that, in natural units, the action is dimensionless.
¹⁸ Note that they are written using the Heaviside-Lorentz convention, in which no 4π factors appear.

The homogeneous Maxwell's equations (2.57) and (2.58) can be solved by introducing the so-called
electromagnetic potentials: a scalar potential $\varphi$ and a vector potential¹⁹ $\mathbf{A}$ satisfying

$$\mathbf{E} = -\nabla \varphi - \partial_t \mathbf{A} , \qquad \mathbf{B} = \nabla \times \mathbf{A} . \tag{2.61}$$

Using them, and working in the Lorenz gauge $\partial_t \varphi + \nabla \cdot \mathbf{A} = 0$, the inhomogeneous Maxwell's equations become

$$\partial_t^2 \varphi - \nabla^2 \varphi = \rho , \qquad \partial_t^2 \mathbf{A} - \nabla^2 \mathbf{A} = \mathbf{J} . \tag{2.62}$$

Given the electromagnetic potentials $\varphi$ and $\mathbf{A}$ in Eq. (2.61), the electromagnetic fields $\mathbf{E}$ and $\mathbf{B}$ are
completely determined, but not vice versa. $\mathbf{A}$ and $\varphi$ are gauge fields (see the exercise below).
Familiarity with Maxwell's equations soon leads to the appreciation of the unified nature of the
electromagnetic field and its relativistic nature. Although in their 19th century version Maxwell's
equations (2.57)-(2.60) do not seem at all invariant under Lorentz transformations, they can be written
in a more compact and elegant way that makes their covariant form explicit. Introducing the 4-vector
potential (gauge field) $A^\mu \equiv (\varphi, \mathbf{A})$ and the charge-current density 4-vector $J^\mu \equiv (\rho, \mathbf{J})$, we obtain²⁰

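$$\partial_\nu F^{\mu\nu} = J^\mu , \tag{2.63}$$
$$\epsilon_{\mu\nu\rho\sigma} \partial^\rho F^{\mu\nu} = 0 , \tag{2.64}$$

where the antisymmetric quantity $F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu$ is the gauge invariant (Faraday) field strength
with components
$$F^{0i} = E^i , \qquad F^{ij} = \epsilon^{ijk} B_k , \tag{2.65}$$
and the different ε are totally antisymmetric tensors in the corresponding dimension²¹ $n$

$$\epsilon^{\mu_1 \mu_2 \ldots \mu_n} = \begin{cases} +1 , & \text{if } \mu_1 \mu_2 \ldots \mu_n \text{ is an even permutation of } 01 \ldots (n-1) , \\ -1 , & \text{if } \mu_1 \mu_2 \ldots \mu_n \text{ is an odd permutation of } 01 \ldots (n-1) , \\ 0 , & \text{otherwise} . \end{cases} \tag{2.66}$$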

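In covariant notation, (2.62) becomes

$$\Box A^\mu = -J^\mu \tag{2.67}$$
where we have defined the d'Alembertian operator

$$\Box \equiv \partial_\mu \partial^\mu = -\partial_t^2 + \partial_i^2 , \tag{2.68}$$

so that the overall sign in (2.67) follows from comparing (2.62) with (2.68).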

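• The covariant components $\epsilon_{\mu\nu\rho\sigma}$ of the permutation tensor $\epsilon^{\mu\nu\rho\sigma}$ in Minkowski spacetime
are defined by lowering each of the indices with the metric tensor $\eta_{\mu\nu}$

$$\epsilon_{\mu\nu\rho\sigma} = \eta_{\mu\lambda} \eta_{\nu\kappa} \eta_{\rho\pi} \eta_{\sigma\tau} \epsilon^{\lambda\kappa\pi\tau} , \tag{2.69}$$

from which it follows that $\epsilon_{0123} = -\epsilon^{0123}$.

• Note that, whereas a cyclic permutation of the indices in the 3-dimensional permutation
symbol leaves it unchanged ($\epsilon^{ijk} = \epsilon^{jki}$), a cyclic permutation of the 4-dimensional
permutation symbol gives rise to a minus sign ($\epsilon^{\mu\nu\rho\sigma} = -\epsilon^{\nu\rho\sigma\mu}$).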
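Both remarks can be checked with a small implementation of the permutation symbol (2.66). This is only a sketch: the function `levi_civita` computes the sign by counting inversions, and the indices are labelled 0, 1, ..., n−1 as in (2.66); the names are illustrative.

```python
def levi_civita(*indices):
    """Permutation symbol of Eq. (2.66): +1 / -1 for even / odd permutations
    of 0, 1, ..., n-1 and 0 if any index is repeated or out of range."""
    n = len(indices)
    if sorted(indices) != list(range(n)):
        return 0
    # The parity of a permutation equals the parity of its number of inversions.
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if indices[i] > indices[j])
    return -1 if inversions % 2 else 1

# Three dimensions: a cyclic permutation leaves the symbol unchanged.
print(levi_civita(0, 1, 2), levi_civita(1, 2, 0))        # 1 1

# Four dimensions: a cyclic permutation flips the sign.
print(levi_civita(0, 1, 2, 3), levi_civita(1, 2, 3, 0))  # 1 -1

# Lowering all four indices with eta = diag(-1, 1, 1, 1), Eq. (2.69).
ETA_DIAG = (-1, 1, 1, 1)
eps_lower_0123 = (ETA_DIAG[0] * ETA_DIAG[1] * ETA_DIAG[2] * ETA_DIAG[3]
                  * levi_civita(0, 1, 2, 3))
print(eps_lower_0123)                                     # -1
```

The sign flip in four dimensions comes from the fact that a cyclic shift of four indices is made of three transpositions, while in three dimensions it is made of two.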
¹⁹ In the 3-dimensional sense!
²⁰ Eq. (2.64) is sometimes called a Bianchi identity. As you will see, this will not be the last time that we will find one of these identities.
²¹ Note that some books use the opposite sign convention for (2.66).

Exercise
Just for those of you who know Classical Field Theory. The previous equations of motion can be
obtained from the following Lagrangian density
$$\mathcal{L} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + J^\mu A_\mu , \tag{2.70}$$
where $J^\mu$ is treated as an external source.
• Check that, with the Lagrangian density (2.70), the action $S = \int \mathcal{L} \, d^4x$ is invariant under
Lorentz transformations

$$\bar{A}^\mu = \Lambda^\mu{}_\nu A^\nu , \qquad \bar{F}^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma F^{\rho\sigma} , \qquad \bar{J}^\mu = \Lambda^\mu{}_\nu J^\nu , \tag{2.71}$$

and gauge transformations

$$A^\mu \rightarrow A^\mu - \partial^\mu \chi , \tag{2.72}$$
where $\chi = \chi(t, \mathbf{x})$ is an arbitrary function of spacetime. The concept of purely electric
or magnetic fields and that of a static charge distribution with zero current become
meaningless, being a good description only in a particular reference frame.
• What is the Lorentz invariant generalization of the equation of motion $\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$?
Can you guess the associated Lagrangian (a)? Check the consistency of the two results by
computing the Euler-Lagrange equations (3.30) of the obtained Lagrangian.
(a) Hint: What is the generalization of the cross product in the 4-dimensional case? Consider the Lagrangian
of a charged particle in an electrostatic potential Φ.

Taking the 4-divergence of Eq. (2.63) we obtain

$$\partial_\mu J^\mu = \partial_\mu \partial_\nu F^{\mu\nu} = 0 , \tag{2.73}$$

where in the last step we used the fact that $F^{\mu\nu}$ is an antisymmetric tensor, i.e. $F^{\mu\nu} = -F^{\nu\mu}$.
Eq. (2.73) is nothing else than the continuity equation

$$\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J} = 0 . \tag{2.74}$$
The conservation of the total charge $Q(t)$
$$\dot{Q}(t) = \int_{\mathbb{R}^3} \dot{\rho}(t, x^i) \, d^3x = -\int_{\mathbb{R}^3} \partial_k J^k(t, x^i) \, d^3x = 0 , \tag{2.75}$$

is imposed by the field equations. If the charge is not conserved there is no solution!

Exercise
Prove that the product $S_{\mu\nu} A^{\mu\nu}$ of a symmetric tensor $S_{\mu\nu}$ and an antisymmetric tensor $A^{\mu\nu}$ is
zero.
