A. Einstein
In the previous chapter we saw that tensors are a very good tool for writing covariant equations in
3-dimensional Euclidean space. In this chapter we will generalize the tensor concept to the framework
of the Special Theory of Relativity, the Minkowski spacetime. I will assume the reader to be familiar
at least with the rudiments of Special Relativity, avoiding therefore any kind of historical introduction
to the theory.
1. Principle of Relativity (Galileo): The laws of physics are the same in all the inertial frames: No
experiment can measure the absolute velocity of an observer; the results of any experiment do not
depend on the speed of the observer relative to other observers not involved in the experiment.
2. Invariance of the speed of light: The speed of light in vacuum is the same in all the inertial
frames.
Instantaneous action at a distance is inconsistent with the second postulate and must be replaced
by retarded action at a distance. Absolute simultaneity will only apply as an approximation at low
velocities for nearby events.
coincide², but they differ in the metrical structure, i.e. in the definition of distances. While in Newtonian
spacetime the spatial and temporal distances are independent, in Minkowskian spacetime "space
and time by themselves are doomed to fade away into mere shadows, and only a kind of union of the
two will preserve an independent reality"³. Space and time are distinguished only by a sign, which will,
however, play a central role. For any inertial frame of reference in Minkowski spacetime there is a set
of coordinates⁴
$$\{x^\mu\} = \{t, x, y, z\} = \{x^0, x^1, x^2, x^3\} = \{x^0, x^i\} , \tag{2.1}$$
and a set of orthonormal basis vectors
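$$e_\mu \cdot e_\nu = \eta_{\mu\nu} , \tag{2.3}$$
with
$$\eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \mathrm{diag}(-1, 1, 1, 1) . \tag{2.4}$$
The inverse of (2.4) is traditionally denoted by $\eta^{\mu\nu}$ and satisfies⁶
$$\eta^{\mu\nu} \eta_{\nu\rho} = \delta^\mu{}_\rho , \tag{2.5}$$
where the 4-dimensional Kronecker delta $\delta^\mu{}_\rho$ is the indexed version of the identity matrix, i.e. $\delta^\mu{}_\rho = 1$
if $\mu = \rho$ and zero otherwise. Note that $\eta_{\mu\nu}$ and $\eta^{\mu\nu}$ are numerically equivalent.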
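As a quick numerical illustration of Eq. (2.5), the following minimal sketch stores the metric as a nested list and checks that contracting it with its inverse returns the identity. The names `ETA`, `ETA_INV`, `DELTA` and `matmul` are illustrative choices, not notation from the text.

```python
# Minkowski metric with signature (-, +, +, +), as in Eq. (2.4).
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


# The inverse metric has the same numerical entries as the metric itself.
ETA_INV = [row[:] for row in ETA]

# Kronecker delta, i.e. the 4x4 identity matrix.
DELTA = [[1 if mu == rho else 0 for rho in range(4)] for mu in range(4)]

# eta^{mu nu} eta_{nu rho}: the sum over nu is an ordinary matrix product.
product = matmul(ETA_INV, ETA)
```

Because the metric is diagonal with entries ±1, it is its own inverse, which is why the two index placements share the same numbers.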
Exercise
What is the value of $\delta^\mu{}_\mu$?
In terms of the basis vectors, the infinitesimal displacement $dS$ between two points in spacetime can
be expressed as
$$dS = dx^\mu e_\mu , \tag{2.6}$$
where $dx^\mu$ are the so-called contravariant components. These are computed via the scalar product of
the vector $dS$ and the corresponding basis vector $e_\mu$.
where we have defined the covariant or dual components by lowering the index of the contravariant
components with the metric
$$dx_\mu \equiv \eta_{\mu\nu} dx^\nu . \tag{2.8}$$
Exercise
Starting with a covariant vector defined by Eq. (2.8), show that the inverse of the metric $\eta^{\mu\nu}$
can be used to raise indices
$$\eta^{\mu\nu} dx_\nu = dx^\mu . \tag{2.9}$$
As in the Euclidean case, contravariant and covariant vectors are just an appropriate way of
simplifying the notation and taking into account the summation convention. Note however that in
the present case lowering or raising indices changes the sign of the temporal component while keeping
intact the spatial ones:
$$dx_0 = -dx^0 , \qquad dx_i = +dx^i . \tag{2.10}$$
The upper and lower index notation automatically keeps track of the minus signs associated with the
temporal component. The indefiniteness of the metric is automatically incorporated in the notation!
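The sign bookkeeping of Eq. (2.10) can be sketched in a few lines. This is a minimal example that exploits the diagonal form of the metric; the helper names `lower` and `raise_index` and the sample components are assumptions made for illustration.

```python
# Diagonal entries of eta_{mu nu} = diag(-1, 1, 1, 1); the inverse metric
# has the same entries, so one tuple serves for both lowering and raising.
ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)


def lower(v_up):
    """dx_mu = eta_{mu nu} dx^nu for a diagonal metric."""
    return [ETA_DIAG[mu] * v_up[mu] for mu in range(4)]


def raise_index(v_down):
    """dx^mu = eta^{mu nu} dx_nu for a diagonal metric."""
    return [ETA_DIAG[mu] * v_down[mu] for mu in range(4)]


dx_up = [2.0, 1.0, -3.0, 0.5]      # hypothetical contravariant components
dx_down = lower(dx_up)             # only the temporal component flips sign
round_trip = raise_index(dx_down)  # raising again recovers the original
```

Lowering and then raising an index is the identity operation, as expected from Eq. (2.5).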
In some old-fashioned books and in 't Hooft's lecture notes you will find a fourth coordinate
$x^4 = it$, instead of the coordinate $x^0 = t$ appearing before. Written in terms of $x^4$ the Minkowski
spacetime has the appearance of a positive-definite 4-dimensional Euclidean space
$$e_\mu \cdot e_\nu = \delta_{\mu\nu} \tag{2.11}$$
and there is no difference between lower and upper indices. This notation is however confusing
since it hides the non-positive definite character of the metric.
The square of the infinitesimal distance between two events in Minkowskian spacetime is given by
$$ds^2 = \eta_{\mu\nu} dx^\mu dx^\nu = -dt^2 + dX^2 , \tag{2.12}$$
where $\eta_{\mu\nu}$ is the Minkowski metric and $dX^2 \equiv dx^2 + dy^2 + dz^2$ denotes the spatial interval.
Note that we have arbitrarily chosen the spacelike convention $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ for the metric
signature, which keeps intact the notation used for Cartesian tensors in Euclidean space.
Some books use a different timelike convention for the signature of the metric, taking $\eta_{\mu\nu} =
\mathrm{diag}(1, -1, -1, -1)$. Although the physics is independent of the convention used, the signs
appearing in the formulas in those books may differ from those in the expressions presented
here. For instance, using the convention $\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$, Eq. (2.40) would change to
$$p^\mu p_\mu = m^2 . \tag{2.13}$$
Note however that in both cases you recover $E^2 = p^2 + m^2$ after expanding the expression into
the different components.
Note that, contrary to the Newtonian case, the metric $\eta_{\mu\nu}$ is not positive-definite. Given the
Lorentzian signature $(-,+,+,+)$, the interval (2.12) can be positive, zero, or negative:
• If $ds^2 = 0$, then $|dX/dt| = 1$ and the interval corresponds to the trajectory of a light ray. Such an interval
is called null or lightlike. The set of all lightlike worldlines leaving or arriving at a given
point $x^\mu$ spans the future or past lightcone of the event. There is a lightcone associated with each
point in spacetime.
2.3 Minkowski spacetime isometry group 18
[Figure: spacetime diagram showing the future and past lightcones of an event, the regions of timelike and spacelike separation, and the worldlines of a massive and a massless particle.]
• If $ds^2 < 0$ the interval is said to be timelike. It corresponds to the worldline of a particle with
nonzero rest mass moving with a velocity smaller than that of light, $|dX/dt| < 1$. Two events separated
by such an interval are both inside the lightcone and can be in causal contact. There will exist
a frame in which the two events happen at the same position but at different times.
• If $ds^2 > 0$ the interval is termed spacelike. There will exist a frame in which the two events
happen at the same time but at different places, without any causal relation between them.
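The three cases above can be summarized in a short sketch. It assumes natural units and the signature $(-,+,+,+)$ used in the text; the function names `interval` and `classify` and the tolerance `tol` are illustrative choices.

```python
def interval(dt, dx, dy, dz):
    """Squared interval ds^2 = -dt^2 + dx^2 + dy^2 + dz^2 (natural units)."""
    return -dt ** 2 + dx ** 2 + dy ** 2 + dz ** 2


def classify(ds2, tol=1e-12):
    """Label a squared interval according to its sign."""
    if abs(ds2) <= tol:
        return "lightlike"
    return "timelike" if ds2 < 0 else "spacelike"


light_ray = classify(interval(1.0, 1.0, 0.0, 0.0))      # |dX/dt| = 1
slow_particle = classify(interval(2.0, 1.0, 0.0, 0.0))  # |dX/dt| < 1
acausal_pair = classify(interval(1.0, 2.0, 0.0, 0.0))   # |dX/dt| > 1
```

With the opposite signature convention the roles of the two signs would simply be exchanged.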
$$\bar{x}^\mu = \Lambda^\mu{}_\nu x^\nu , \tag{2.17}$$
with $\Lambda$ a 4 × 4 matrix, independent of the coordinates. The first (upper) index in $\Lambda^\mu{}_\nu$ labels rows,
while the second (lower) one labels columns.
7 Note that this is basically due to the fact that we are dealing with constant metrics.
In order to preserve the line element (2.12), the constant matrices $\Lambda^\mu{}_\nu$ are required to satisfy the
pseudo-orthogonality condition
$$\eta_{\mu\nu} = \eta_{\rho\sigma} \Lambda^\rho{}_\mu \Lambda^\sigma{}_\nu , \tag{2.18}$$
which, in matrix notation, becomes
$$\eta = \Lambda^T \eta \Lambda , \tag{2.19}$$
with $T$ denoting matrix transpose. Eq. (2.18) is the relativistic analogue of the orthogonality condition
(1.19). The determinant of the $\Lambda$ matrices is also ±1. As we did in the previous chapter, we will not
consider the full Lorentz group⁸ (which is neither connected nor compact)
$$O(3, 1) = L_+ \cup P L_+ \cup T L_+ \cup P T L_+ , \tag{2.20}$$
with $P$ and $T$ the parity $P^\mu{}_\nu = \mathrm{diag}(+1, -I)$ and time reversal $T^\mu{}_\nu = \mathrm{diag}(-1, +I)$ operations. We
will restrict ourselves to the continuous Lorentz transformations connected with the identity (the
proper Lorentz group)
with S denoting special or reflection-free. These transformations are the relativistic analog of proper
rotations in Euclidean space.
Exercise
Verify that the restricted set of Lorentz transformations (2.21) forms a group:
• Closure: The product of any two Lorentz transformations is another Lorentz transformation.
• There is an identity transformation.
• Every Lorentz transformation has an inverse.
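The fact that $\eta^2 = I_4$, with $I_4$ the identity matrix, allows us to easily compute the inverse Lorentz
transformation
$$\eta^2 = \eta \Lambda^T \eta \Lambda = (\eta \Lambda^T \eta) \Lambda = I_4 \quad \rightarrow \quad \Lambda^{-1} = \eta \Lambda^T \eta , \tag{2.22}$$
which, writing explicitly the components, becomes
$$(\Lambda^{-1})^\mu{}_\nu = \eta^{\mu\lambda} \Lambda^\rho{}_\lambda \eta_{\rho\nu} = \Lambda_\nu{}^\mu . \tag{2.23}$$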
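Conditions (2.19) and (2.22) are easy to check numerically. The sketch below does so for one particular transformation, a boost along the x direction written in the standard form with $\gamma = 1/\sqrt{1 - v^2}$ in natural units. The helper names (`boost_x`, `matmul`, `transpose`, `close`) and the sample velocity are assumptions made for illustration.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    """Entry-by-entry comparison up to a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(4) for j in range(4))


def boost_x(v):
    """Boost with velocity v along x (natural units, |v| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v, 0.0, 0.0],
            [-gamma * v, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


lam = boost_x(0.6)
# Pseudo-orthogonality: Lambda^T eta Lambda should reproduce eta.
lhs = matmul(matmul(transpose(lam), ETA), lam)
# Inverse from Eq. (2.22): Lambda^{-1} = eta Lambda^T eta.
lam_inv = matmul(matmul(ETA, transpose(lam)), ETA)
```

For this example the inverse obtained from (2.22) coincides with the boost of opposite velocity, `boost_x(-0.6)`.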
8 Note that now, not only the determinant of $\Lambda$, but also the element $\Lambda^0{}_0$, plays a special role in the splitting (2.20).
How many Lorentz transformations are there? Each Lorentz transformation is represented by a 4×4
matrix, which makes a total of 16 components. The pseudo-orthogonality condition (2.18), however, imposes
some constraints. Indeed, taking the transpose of that equation leaves it unchanged, so it is a symmetric
matrix equation: its independent components are just the diagonal elements plus half the off-diagonal
elements, i.e. 4 + 6 = 10 constraints. We are
left therefore with 16 − 10 = 6 independent Lorentz transformations. There are two different kinds of
homogeneous Lorentz transformations. The most obvious ones are spatial rotations
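$$\Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 \\ 0 & R^i{}_j \end{pmatrix} , \tag{2.24}$$
where $R^i{}_j$ is a 3 × 3 orthogonal matrix, $\delta_{kl} = \delta_{ij} R^i{}_k R^j{}_l$, with $i, j$ running only over spatial directions.
There are three independent rotation matrices, one per spatial direction. For instance, a rotation of
angle $\theta$ around the z-axis takes the form
$$R^i{}_j = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} . \tag{2.25}$$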
The difference with Euclidean transformations arises when considering the so-called boosts, which mix
the spatial and temporal components. There are three of them, each one associated with the mixing of
a particular spatial component with time. As an example consider a boost of rapidity $\eta = \tanh^{-1} v$
along the x direction
$$\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cosh\eta & -\sinh\eta & 0 & 0 \\ -\sinh\eta & \cosh\eta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{2.26}$$
where we have defined the parameter $\gamma = 1/\sqrt{1 - v^2}$, with $v$ the 3-velocity (in natural units $\beta = v$). After the boost, the temporal
and spatial coordinates are a linear and homogeneous combination of the spatial and temporal
coordinates in the old frame.
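The two forms of the boost matrix (2.26) can be compared numerically. The following sketch builds the hyperbolic form from the rapidity, checks that $\cosh\eta = \gamma$ and $\sinh\eta = \gamma v$, and applies the boost to one event. The function names, the velocity 0.6 and the event coordinates are hypothetical values chosen for illustration.

```python
import math


def boost_from_rapidity(eta):
    """Boost along x written with hyperbolic functions, as in Eq. (2.26)."""
    ch, sh = math.cosh(eta), math.sinh(eta)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def apply(lam, x):
    """x-bar^mu = Lambda^mu_nu x^nu."""
    return [sum(lam[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]


v = 0.6                                # velocity in natural units (c = 1)
eta = math.atanh(v)                    # rapidity
gamma = 1.0 / math.sqrt(1.0 - v * v)   # Lorentz factor

event = [2.0, 1.0, 0.0, 0.0]           # (t, x, y, z) in the old frame
t_bar, x_bar, y_bar, z_bar = apply(boost_from_rapidity(eta), event)
```

The transverse coordinates are untouched, while `t_bar` and `x_bar` mix the old `t` and `x` with coefficients `gamma` and `gamma * v`.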
Exercise
Verify that Eq. (2.26) gives rise to the standard Lorentz transformation
To do so, assume $\Lambda^\mu{}_\nu$ to be the transformation from the rest frame of a given inertial
observer to the frame of a second inertial observer moving with speed $v$ along the x axis and
determine the relation between that velocity and $\eta$.
The convenience of using the rapidity parameter $\eta$ instead of the velocity $v$ resides in the fact that
$\eta$ combines additively. In fact, if we consider two consecutive boosts in the same direction, we have
$$\Lambda(\eta_1) \Lambda(\eta_2) = \Lambda(\eta_1 + \eta_2) . \tag{2.28}$$
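Property (2.28) can be checked numerically for any pair of rapidities. The sketch below multiplies two boosts along x and compares the result with a single boost of summed rapidity; the helper names and the two sample rapidities are assumptions made for illustration.

```python
import math


def boost_from_rapidity(eta):
    """Boost along x with rapidity eta, in the hyperbolic form of Eq. (2.26)."""
    ch, sh = math.cosh(eta), math.sinh(eta)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def close(a, b, tol=1e-12):
    """Entry-by-entry comparison up to a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(4) for j in range(4))


eta1, eta2 = 0.3, 1.1   # two arbitrary sample rapidities
composed = matmul(boost_from_rapidity(eta1), boost_from_rapidity(eta2))
direct = boost_from_rapidity(eta1 + eta2)
```

The agreement is a direct consequence of the addition formulas for the hyperbolic sine and cosine.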
Exercise
Consider the composition of two boosts with velocities $v_1$ and $v_2$ along the x direction. Show that
$\Lambda_1 \Lambda_2$ gives rise to a boost with 3-velocity
$$v = \frac{v_1 + v_2}{1 + v_1 v_2} . \tag{2.29}$$
What happens in the limit $v_1, v_2 \ll 1$? Generalize the previous result to an arbitrary direction. Is
the general result symmetric under the interchange of the two velocities? Note that all the
expressions are written in natural units.
[Figure: spacetime diagram with the original and the boosted t and x axes, showing the lines of simultaneous events for observers 1 and 2.]
Figure 2.2: A boost transformation
and the property (2.28) closely resemble those of spatial rotations (2.25). The main difference is
the replacement of the trigonometric functions by their hyperbolic analogues, reflecting the relative sign of the
temporal direction with respect to the spatial directions. Note, however, two important differences
between boosts and ordinary rotations:
• The rotation parameter $\theta$ in Eq. (2.25) runs between 0 and 2π, with both points included. The
rapidity parameter $\eta$ is non-compact and can take any value in $\mathbb{R}$.
• The boost matrix (2.26) is symmetric, which is not the case for ordinary rotations, cf. Eq.
(2.25).
Although we should not take the analogy between rotations and boosts too seriously, it is instructive
to look at the action of a Lorentz transformation on a spacetime diagram. As shown in Fig. 2.2, a
Lorentz boost rotates time and space by the same angle $\eta = \tanh^{-1} v$, but in opposite directions! In
Special Relativity the simultaneity of two events depends on the observer: the hyperplanes of constant
coordinate time do not have an invariant meaning. Note however that the light cone (i.e. the dashed
line at 45 degrees in the diagram) is invariant under Lorentz transformations.
Exercise
Use the diagram in Fig. 2.2 to derive the well-known effects of space contraction
$$\bar{L} = L \sqrt{1 - v^2} \quad \rightarrow \quad \bar{L} < L , \tag{2.31}$$
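Lorentz transformations: $\dfrac{\partial \bar{x}^\mu}{\partial x^\nu} \equiv \Lambda^\mu{}_\nu$ are constants!

| Object | Transformation law |
|---|---|
| Scalar | $\bar{\phi} = \phi$ |
| Contravariant vector | $\bar{V}^\mu = \dfrac{\partial \bar{x}^\mu}{\partial x^\nu} V^\nu$ |
| Covariant vector | $\bar{V}_\mu = \dfrac{\partial x^\nu}{\partial \bar{x}^\mu} V_\nu$ |
| Contravariant rank-2 tensor | $\bar{T}^{\mu\nu} = \dfrac{\partial \bar{x}^\mu}{\partial x^\rho} \dfrac{\partial \bar{x}^\nu}{\partial x^\sigma} T^{\rho\sigma}$ |
| Covariant rank-2 tensor | $\bar{T}_{\mu\nu} = \dfrac{\partial x^\rho}{\partial \bar{x}^\mu} \dfrac{\partial x^\sigma}{\partial \bar{x}^\nu} T_{\rho\sigma}$ |
| Mixed rank-2 tensor | $\bar{T}^\mu{}_\nu = \dfrac{\partial \bar{x}^\mu}{\partial x^\rho} \dfrac{\partial x^\sigma}{\partial \bar{x}^\nu} T^\rho{}_\sigma$ |

Table 2.1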
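The transformation laws of Table 2.1 can be tried out on explicit components. The sketch below transforms one contravariant and one covariant vector under a boost along x and checks that their contraction is unchanged. The helper names and the sample components are assumptions, and the inverse boost is taken to be the boost of opposite velocity.

```python
import math


def boost_x(v):
    """Boost with velocity v along x (natural units, |v| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v, 0.0, 0.0],
            [-gamma * v, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_contravariant(lam, w_up):
    """W-bar^mu = Lambda^mu_nu W^nu."""
    return [sum(lam[mu][nu] * w_up[nu] for nu in range(4)) for mu in range(4)]


def transform_covariant(lam_inv, v_down):
    """V-bar_mu = (Lambda^-1)^nu_mu V_nu, i.e. dx^nu / dx-bar^mu acting on V_nu."""
    return [sum(lam_inv[nu][mu] * v_down[nu] for nu in range(4))
            for mu in range(4)]


def contract(v_down, w_up):
    """Scalar V_mu W^mu."""
    return sum(a * b for a, b in zip(v_down, w_up))


speed = 0.8
lam = boost_x(speed)
lam_inv = boost_x(-speed)        # inverse boost: opposite velocity

W_up = [1.0, 0.2, -0.4, 0.3]     # hypothetical contravariant components
V_down = [0.5, -1.0, 2.0, 0.0]   # hypothetical covariant components

W_bar = transform_contravariant(lam, W_up)
V_bar = transform_covariant(lam_inv, V_down)
```

The contraction behaves as a scalar because the factor $\Lambda$ from the upper index cancels the factor $\Lambda^{-1}$ from the lower one.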
Exercise
How does the volume element $d^4x = dx^0 dx^1 dx^2 dx^3$ transform under Lorentz transformations?
replacement of latin indices by greek indices, and the use of the Minkowski metric, instead of the Euclidean one, for
lowering and raising indices.
10 Note that the proper time is not a useful parametrization for the worldline of massless particles, such as photons,
since these particles move on the light cone and can travel any distance in zero proper time, $d\tau^2 = -ds^2 = 0$.
2.5 Covariance and Relativistic Mechanics 23
The connection between the proper time and the measurement made in an inertial reference frame
with coordinate time interval $dt$ is given by
$$\gamma \equiv \frac{dt}{d\tau} = \left(1 - v^2\right)^{-1/2} . \tag{2.34}$$
Note that $\gamma$ is a growing function of the magnitude of the 3-velocity $v^i = dx^i/dt$ and is never smaller than one. The
proper time goes by at a slower rate than the coordinate time $t$.
4-velocity
Given $\tau$, and in clear analogy with the 3-dimensional case, the 4-velocity $u^\mu$ along the trajectory is
given by the 4-vector
$$u^\mu \equiv \frac{dx^\mu}{d\tau} , \tag{2.35}$$
which is tangent to the worldline of the particle and automatically normalized
$$\eta_{\mu\nu} u^\mu u^\nu = u^\mu u_\mu = -1 . \tag{2.36}$$
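The normalization (2.36) follows from $u^\mu = \gamma (1, v^i)$, which is Eq. (2.35) combined with $dt/d\tau = \gamma$. A minimal numerical sketch, with hypothetical helper names and an arbitrary sample 3-velocity, is:

```python
import math


def four_velocity(v3):
    """u^mu = gamma * (1, v^i) for a 3-velocity v3 in natural units."""
    speed2 = sum(c * c for c in v3)
    gamma = 1.0 / math.sqrt(1.0 - speed2)
    return [gamma] + [gamma * c for c in v3]


def minkowski_dot(a, b):
    """eta_{mu nu} a^mu b^nu with signature (-, +, +, +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


u = four_velocity([0.3, -0.4, 0.5])   # sample 3-velocity with |v| < 1
norm = minkowski_dot(u, u)            # should equal -1 up to rounding
```

The result does not depend on the chosen 3-velocity, since $\gamma^2 (-1 + v^2) = -1$ identically.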
$$\bar{u}^\mu = \Lambda^\mu{}_\nu u^\nu . \tag{2.38}$$
4-momentum
Since the mass $m$ of the particle is a scalar under Lorentz transformations, the 4-momentum
$$p^\mu = m u^\mu \tag{2.39}$$
is a Lorentz 4-vector with components $p^\mu = (E, p^i)^T = (m\gamma, m\gamma v^i)^T$.
In the instantaneous rest frame of the particle, $p^\mu = (m, 0)^T$. This can be used to simplify many
computations. We can compute things in this particular frame and then re-express the result
in a form valid in any other inertial frame by appealing to covariance.
Using the normalization condition for the 4-velocity (2.36), the normalization condition for the
4-momentum becomes
$$p^\mu p_\mu = -m^2 , \tag{2.40}$$
which is nothing else than the well-known energy-momentum relation $E = \sqrt{p^2 + m^2}$ written in a
covariant way¹¹. In the Newtonian limit ($|p^i| \ll m$) this relation becomes the familiar expression of
the Newtonian theory together with the energy equivalent of the mass $mc^2$, i.e. $E \simeq m + \frac{p^2}{2m}$. Note
that the 4-momentum remains well defined even for massless particles, where it has zero square norm
and becomes lightlike, $p^\mu p_\mu = 0$. We will take this as a definition of a massless classical particle. The
indefiniteness of the Minkowski metric allows for non-zero values of the temporal and spatial parts as long
as they cancel out in $p^\mu p_\mu$. In particular, we can always find a frame in which $p^\mu = (E, 0, 0, E)^T$.
11 Note that $p$ is the three-momentum $p^i$.
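The statements above are easy to check with numbers. The sketch below evaluates the energy-momentum relation in natural units, compares it with the Newtonian approximation for a small momentum, and looks at the massless case. The function name `energy` and the sample values of `m` and `p` are assumptions made for illustration.

```python
import math


def energy(m, p):
    """E = sqrt(p^2 + m^2), with p the magnitude of the 3-momentum."""
    return math.sqrt(p * p + m * m)


m, p = 1.0, 0.01                   # sample values with |p| << m
E = energy(m, p)
newtonian = m + p * p / (2.0 * m)  # rest energy plus Newtonian kinetic energy
norm = -E * E + p * p              # p^mu p_mu with signature (-, +, +, +)

E_massless = energy(0.0, 2.0)      # for m = 0 the energy equals |p|
norm_massless = -E_massless ** 2 + 2.0 ** 2
```

The difference between `E` and `newtonian` is of order $p^4/m^3$, which is why the two agree so closely for small momenta.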
4-acceleration
It is important to remark that Special Relativity, like Newtonian mechanics, is concerned with the
relation between inertial observers and not with the behavior of the objects that they are studying.
In particular, the observed objects can be accelerating. The 4-acceleration in Minkowski spacetime
can be defined as¹²
$$a^\mu = \lim_{\epsilon \to 0} \frac{u^\mu(\tau + \epsilon) - u^\mu(\tau)}{\epsilon} \quad \rightarrow \quad a^\mu \equiv \frac{du^\mu}{d\tau} = \frac{d^2 x^\mu}{d\tau^2} . \tag{2.42}$$
The 4-acceleration (2.42) transforms in the proper way under Lorentz transformations since $\Lambda^\mu{}_\nu$
is linear and depends only on the relative velocity between the two frames. This fact allows it to pass
through the derivative $d^2/d\tau^2$ untouched. The acceleration $a^\mu$ is a spacelike 4-vector
Energy-momentum conservation
The objects defined above allow us to easily generalize Newton's second law, which becomes
$$f^\mu = \frac{dp^\mu}{d\tau} = m a^\mu . \tag{2.45}$$
Note that Eq. (2.45) is a tensorial identity, which maintains its form under Lorentz transformations
and automatically satisfies Einstein's principle of Relativity. The explicit expression of $f^\mu$ depends on
the considered interaction (cf. the first exercise in Section 2.7). The components of the 4-vector $f^\mu$
in Eq. (2.45) are proportional to the Newtonian force $F^i$ and to the work done by $F^i$ per unit time,
i.e. $f^\mu = \gamma (v_i F^i, F^i)^T$. Contrary to what happens in Newtonian physics, energy and momentum
conservation laws are not independent. The conservation laws of Newtonian mechanics in a given
collision between particles are replaced by a conservation law for the total 4-momentum¹³
$$\sum_{\rm in} p^\mu_{\rm in} = \sum_{\rm out} p^\mu_{\rm out} , \tag{2.46}$$
12 You are probably wondering why I am making such a mess writing out the explicit definition in (2.42) instead of
directly writing
$$a^\mu \equiv \frac{du^\mu}{d\tau} = \frac{d^2 x^\mu}{d\tau^2} . \tag{2.41}$$
The point I want you to notice is that the two vectors $u^\mu(\tau + \epsilon)$ and $u^\mu(\tau)$ are located at different points in spacetime.
What we are really doing when computing the acceleration in Special Relativity is trivially moving the two vectors
to the same point in spacetime before subtracting them. This trivial operation of moving a vector from one point to
another will turn out to be not so trivial in General Relativity. As you will see, we will need to introduce some extra
machinery in order to do that... but let's move one step at a time...
13 The interaction is assumed to be a contact interaction. Particles are free away from the interaction point.
2.6 Relativistic Lagrangian for free particles 25
with the subscripts in and out denoting the incoming and outgoing particles. Note that the conserva-
tion law is Lorentz covariant and reduces to the Newtonian momentum and energy conservation for
small velocities.
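As a small worked instance of Eq. (2.46), consider a hypothetical particle of mass `M` at rest decaying into two particles of equal mass `m`, with `M >= 2 m`. This setup is an assumption chosen for illustration and does not come from the text. Conservation of the total 4-momentum fixes the outgoing energies and momenta:

```python
import math


def two_body_decay(M, m):
    """Outgoing 4-momenta (E, px, py, pz) for a decay at rest into two
    particles of equal mass m, emitted back to back along the z axis."""
    E = M / 2.0                    # energy is shared equally
    p = math.sqrt(E * E - m * m)   # from E^2 = p^2 + m^2
    return [E, 0.0, 0.0, p], [E, 0.0, 0.0, -p]


def minkowski_norm(p4):
    """p^mu p_mu with signature (-, +, +, +)."""
    return -p4[0] ** 2 + sum(c * c for c in p4[1:])


M, m = 3.0, 1.0
p_in = [M, 0.0, 0.0, 0.0]          # parent particle at rest
p1, p2 = two_body_decay(M, m)
p_out = [a + b for a, b in zip(p1, p2)]
```

All four components of the total 4-momentum match before and after the decay, while each outgoing particle separately satisfies the mass-shell condition (2.40).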
Exercise
Show that a photon cannot spontaneously decay into an electron-positron pair.
$$\frac{dp^\mu}{d\tau} = 0 , \tag{2.47}$$
can also be derived from a Lagrangian formulation where the role of the generalized coordinates
is played by the space-time coordinates $x^\mu$ and the classical time $t$ is replaced by an appropriate
parameter $\sigma$. The simplest guess for the relativistic action would be a naive generalization of the
Newtonian action, namely
$$S = \frac{1}{2} m \int d\tau \, \eta_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} . \tag{2.48}$$
Note however that the previous expression is not invariant under reparametrizations of the path¹⁴
$\tau \to f(\tau)$. The dynamics of the particle seem to depend on the "internal coordinate" $\tau$ used in the
description of the curve $x^\mu(\tau)$. Moreover, the action (2.48) does not contain any information about
the lightcone. On top of that, it does not have a smooth massless limit either.
To solve these problems, we will substitute the proper time by an arbitrary parameter $\sigma$ and
introduce a non-dynamical function¹⁵ $e(\sigma)$, the so-called einbein. This quantity will be treated as an
additional generalized coordinate during the intermediate computations and fixed to a particular value
only at the end. To clarify the construction, we proceed in several steps. We start by replacing the
problematic mass appearing in the action (2.48) by $e^{-1}(\sigma)$. This gives rise to the following structure
$$S \sim \frac{1}{2} \int d\sigma \, \frac{1}{e(\sigma)} \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} . \tag{2.49}$$
In order for the previous action to be invariant under reparametrizations of the path¹⁶ $\sigma \to f(\sigma)$,
the einbein $e(\sigma)$ must be chosen to transform in the proper way. The transformation rule can be
determined by inspection: the product $e(\sigma) d\sigma$ must remain invariant. In other words, the infinitesimal
displacement $d\sigma$ and the einbein must transform in opposite ways
$$d\bar{\sigma} = \dot{f}(\sigma) d\sigma , \qquad \bar{e}(\bar{\sigma}) = \left[\dot{f}(\sigma)\right]^{-1} e(\sigma) . \tag{2.50}$$
14 The reparametrization invariance of the action should be understood as a gauge symmetry: a redundancy of the
With these transformations at hand, we proceed now to reintroduce the mass parameter $m$ in the
action (2.49). The form of the new term is essentially determined by pure dimensional arguments,
reparametrization invariance and the massless limit. In order for the new piece to be reparametrization
invariant, the integration measure $d\sigma$ must come together with a factor $e(\sigma)$. This gives a term
$d\sigma \, e(\sigma)$ with dimension $[E]^{-2}$, which must be compensated¹⁷ by something with dimension $[E]^2$ and
proportional to $m$. There you are: the new term is $d\sigma \, e(\sigma) m^2$. The resulting action is the so-called
einbein action
$$S = \frac{1}{2} \int d\sigma \left( \frac{1}{e(\sigma)} \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} - m^2 e(\sigma) \right) \tag{2.51}$$
and gives rise to the following Euler-Lagrange equations for the generalized coordinates $x^\mu(\sigma)$ and $e(\sigma)$
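As a sketch of how these equations follow, one can vary the action (2.51) with respect to $x^\mu$ and $e$ in the standard Euler-Lagrange way. The form below is a reconstruction under that assumption (the overall normalization of each equation is a choice), and it reduces to Eqs. (2.53) and (2.54) for $e = 1$, $m \to 0$ and for $e = 1/m$, $\sigma = \tau$ respectively:

```latex
% Variation with respect to x^mu: the Lagrangian depends on x^mu only
% through dx^mu/dsigma, so the conjugate momentum is conserved.
\frac{d}{d\sigma}\left( \frac{1}{e(\sigma)} \frac{dx^\mu}{d\sigma} \right) = 0 ,
\qquad
% Variation with respect to e: an algebraic constraint, since no
% derivative of e appears in the action.
\eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} + m^2 e^2(\sigma) = 0 .
```

The second relation is a constraint rather than a dynamical equation, which is consistent with the einbein being non-dynamical.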
The massive or massless character of particles is automatically incorporated in the second equation.
Indeed, choosing $e(\sigma) = 1$ and taking the limit $m \to 0$, we obtain the equations of motion for a free
massless particle
$$\frac{d^2 x^\mu}{d\sigma^2} = 0 , \qquad \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = 0 . \tag{2.53}$$
On the other hand, the equations for massive particles can be obtained by choosing $e(\sigma) = 1/m$ and
using the proper time as affine parameter ($\sigma = \tau$)
$$m \frac{d^2 x^\mu}{d\tau^2} = 0 , \qquad \eta_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = -1 . \tag{2.54}$$
These kinds of choices, in which $e(\sigma)$ = constant, are called affine and restrict the function $f(\sigma)$ to the
form
$$\dot{f} = 1 \quad \rightarrow \quad f(\sigma) = \sigma + \text{constant} . \tag{2.55}$$
Exercise
Consider the massive case. Show that the action (2.51) is equivalent to the geometrical action
$$S = -m \int d\tau . \tag{2.56}$$
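$$\nabla \times \mathbf{E} + \partial_t \mathbf{B} = 0 , \tag{2.57}$$
$$\nabla \cdot \mathbf{B} = 0 , \tag{2.58}$$
$$\nabla \cdot \mathbf{E} = \rho , \tag{2.59}$$
$$\nabla \times \mathbf{B} - \partial_t \mathbf{E} = \mathbf{J} , \tag{2.60}$$
with $\mathbf{E}$ and $\mathbf{B}$ the electric and magnetic fields, $\mathbf{J}$ the current density and $\rho$ the charge density. They
are 8 coupled linear differential equations, in which the boundary conditions are usually taken to be
such that for infinite systems the fields $\mathbf{E}$ and $\mathbf{B}$ go to zero at infinity. Note also the symmetry $\mathbf{E} \leftrightarrow \mathbf{B}$
in the absence of sources.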
17 Recall that, in natural units, the action is dimensionless.
18 Note that they are written using the Heaviside-Lorentz convention, in which no 4π factors appear.
2.7 Maxwell’s equations 27
The homogeneous Maxwell's equations (2.57) and (2.58) can be solved by introducing the so-called
electromagnetic potentials: a scalar potential $\varphi$ and a vector potential¹⁹ $\mathbf{A}$ satisfying
$$\mathbf{E} = -\nabla \varphi - \partial_t \mathbf{A} , \qquad \mathbf{B} = \nabla \times \mathbf{A} . \tag{2.61}$$
Given the electromagnetic potentials $\varphi$ and $\mathbf{A}$ in Eq. (2.61), the electromagnetic fields $\mathbf{E}$ and $\mathbf{B}$ are
completely determined, but not vice versa. $\mathbf{A}$ and $\varphi$ are gauge fields (see the exercise below).
Familiarity with Maxwell's equations soon leads to the appreciation of the unified nature of the
electromagnetic field and its relativistic nature. Although in their 19th century version Maxwell's
equations (2.57)-(2.60) do not seem at all invariant under Lorentz transformations, they can be written
in a more compact and elegant way that makes their covariant form explicit. Introducing the 4-vector
potential (gauge field) $A^\mu \equiv (\varphi, \mathbf{A})$ and the charge-current density 4-vector $J^\mu \equiv (\rho, \mathbf{J})$, we obtain²⁰
$$\partial_\nu F^{\mu\nu} = J^\mu , \tag{2.63}$$
$$\epsilon_{\mu\nu\rho\sigma} \partial^\rho F^{\mu\nu} = 0 , \tag{2.64}$$
where the antisymmetric quantity $F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu$ is the gauge invariant (Faraday) field strength
with components
$$F^{0i} = E^i , \qquad F^{ij} = \epsilon^{ijk} B_k , \tag{2.65}$$
and the different $\epsilon$ are totally antisymmetric tensors in the corresponding dimension²¹ $n$
$$\epsilon^{\mu_1 \mu_2 \ldots \mu_n} = \begin{cases} +1, & \text{if } \mu_1 \mu_2 \ldots \mu_n \text{ is an even permutation of } 01\ldots(n-1) , \\ -1, & \text{if } \mu_1 \mu_2 \ldots \mu_n \text{ is an odd permutation of } 01\ldots(n-1) , \\ 0, & \text{otherwise} . \end{cases} \tag{2.66}$$
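Definition (2.66) and the component assignment (2.65) translate directly into code. The sketch below computes the permutation symbol by counting inversions and assembles the matrix $F^{\mu\nu}$ from given field components. The function names and the sample field values are assumptions, spatial indices 1, 2, 3 are stored 0-based, and $B_k$ is taken numerically equal to $B^k$ because lowering a spatial index does not change the sign.

```python
def levi_civita(*indices):
    """Permutation symbol of Eq. (2.66) for indices drawn from 0..n-1."""
    n = len(indices)
    if sorted(indices) != list(range(n)):
        return 0  # repeated or out-of-range index
    inversions = sum(1 for a in range(n) for b in range(a + 1, n)
                     if indices[a] > indices[b])
    return -1 if inversions % 2 else 1


def field_strength(E, B):
    """F^{mu nu} with F^{0i} = E^i and F^{ij} = eps^{ijk} B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]   # antisymmetry: F^{i0} = -F^{0i}
        for j in range(3):
            F[i + 1][j + 1] = sum(levi_civita(i, j, k) * B[k]
                                  for k in range(3))
    return F


E = [1.0, 2.0, 3.0]   # hypothetical electric field components
B = [4.0, 5.0, 6.0]   # hypothetical magnetic field components
F = field_strength(E, B)
```

For instance, `F[1][2]` holds $F^{12} = B_3$ and `F[1][3]` holds $F^{13} = -B_2$, and the whole matrix is antisymmetric by construction.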
• The covariant components $\epsilon_{\mu\nu\rho\sigma}$ of the permutation tensor $\epsilon^{\mu\nu\rho\sigma}$ in Minkowski spacetime
are defined by lowering each of the indices with the metric tensor $\eta_{\mu\nu}$
Exercise
Just for those of you who know Classical Field Theory. The previous equations of motion can be
obtained from the following Lagrangian density
$$\mathcal{L} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + J^\mu A_\mu , \tag{2.70}$$
where $J^\mu$ is treated as an external source.
• Check that with the Lagrangian density (2.70), the action $S = \int \mathcal{L} \, d^4x$ is invariant under
Lorentz transformations
$$\bar{A}^\mu = \Lambda^\mu{}_\nu A^\nu , \qquad \bar{F}^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma F^{\rho\sigma} , \qquad \bar{J}^\mu = \Lambda^\mu{}_\nu J^\nu , \tag{2.71}$$
$$\partial_\mu J^\mu = \partial_\mu \partial_\nu F^{\mu\nu} = 0 , \tag{2.73}$$
where in the last step we used the fact that $F^{\mu\nu}$ is an antisymmetric tensor, i.e. $F^{\mu\nu} = -F^{\nu\mu}$.
Eq. (2.73) is nothing else than the continuity equation
$$\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J} = 0 . \tag{2.74}$$
The conservation of total charge $Q(t)$
$$\dot{Q}(t) = \int_{\mathbb{R}^3} \dot{\rho}(t, x^i) \, d^3x = -\int_{\mathbb{R}^3} \partial_k J^k(t, x^i) \, d^3x = 0 , \tag{2.75}$$
is imposed by the field equations. If the charge is not conserved there is no solution!
Exercise
Prove that the product $S_{\mu\nu} A^{\mu\nu}$ of a symmetric tensor $S_{\mu\nu}$ and an antisymmetric tensor $A^{\mu\nu}$ is
zero.