Space-Times
SECOND EDITION
George F. R. Ellis
Distinguished Professor of Complex Systems
Mathematics Department, University of Cape Town
and
Ruth M. Williams
Fellow and Lecturer in Mathematics
Girton College and Assistant Director of Research,
Department of Applied Mathematics and
Theoretical Physics, University of Cambridge
Diagrams by
Mauro Carfora
Department of Nuclear and Theoretical Physics,
University of Pavia
OXFORD
UNIVERSITY PRESS
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford.
It furthers the University's objective of excellence in research, scholarship,
and education by publishing worldwide in
Oxford New York
Athens Auckland Bangkok Bogota Buenos Aires Calcutta
Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul
Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai
Nairobi Paris Sao Paulo Singapore Taipei Tokyo Toronto Warsaw
with associated companies in Berlin Ibadan.
You must not circulate this book in any other binding or cover
and you must impose this same condition on any acquirer
A catalogue record for this book is available from the British Library
This book grew out of a series of lectures and a summer school course given by
one of us (G.F.R.E.) at the University of Cape Town. A series of notes taken by a
student (Gavin Hough) was useful in preparing the text, the major part of which
was completed while G.F.R.E. was at the University of Texas and R.M.W. at the
Institute for Advanced Study, Princeton. We thank Marilyn Brink, Colin
Myburgh, Sasha Loncarevic, and Clive Khouny for useful criticisms of a draft of
the text. We decided to turn the notes into an introductory book because we
believed that, despite the proliferation of books on relativity theory, there was no
equivalent text available. We hope that the book will make a solid understanding
of flat and curved space-times accessible to a wider audience than hitherto.
We are extremely grateful to Dr Mauro Carfora for combining his artistic
skills with his knowledge of relativity to produce the diagrams in the book.
Relativity may at first seem to the reader to be an abstract theory, far removed
from the reality of everyday life. By the end of the book, we will have demon-
strated that this theory is of fundamental importance not only for elementary
particle physics and astronomy, but also in the way it affects conditions of life in
the world around us. We shall also see that the cover photograph, showing an
eclipse of the Sun by the Earth, as seen from an Apollo spacecraft, illustrates
several features of relativity. [Editor's note: the front cover of the second edition
has a different photograph; it shows the galaxy NGC 3377, which is believed
to have a black hole at its centre. As we shall see, this also illustrates several
features of relativity.]
We have been very pleased to prepare a second edition of this book, at the request
of Oxford University Press, in order to bring this presentation of relativity theory
up to date (and allowing us to correct some errors and areas of lack of clarity that
have been pointed out by readers). While the foundations of the subject remain
the same as ever, there has been marked progress in some areas of application of
relativity theory, particularly because of the vast explosion of new astronomical
data from powerful new ground-based telescopes such as Keck, and a series of
satellite observatories: IRAS (infra-red astronomical satellite), COBE (cosmic
background radiation explorer), the Hubble space telescope, ROSAT (X-ray),
and so on. Also, for example, gravitational radiation detectors have made enormous
strides, and major new-generation gravitational wave observatories will
come on line in the next five years or so, opening up a new astronomical channel
of observation. The observational situation is being transformed.
Thus this revision presents a substantial amount of new material that takes
these developments into account. However, we have not altered the basic
structure of the book, despite critical comments by some reviewers. The prime
cause of dissatisfaction to some is that we take so long to reach the Lorentz
transformation, traditionally an early part of many presentations. This policy
of ours is deliberate. We believe it is essential to get the grounding right first, and
that takes a long time and considerable thought; it should not be rushed. It is
possible to move quickly to the Lorentz transformations, and learn to manipulate
them mechanically, but that does not mean that what they represent is under-
stood in a serious way.
Our aim is to solidly lay the foundations, first deriving all the main relativity
results in a simple and well-grounded way, and only then use the Lorentz
transformations as a device for summarizing concisely what has been discovered.
The other way of presentation (effectively starting with the Lorentz transfor-
mation) is right for some readers; ours is right for many others, as readers'
comments testify. So the basic presentation is the same as before. We hope that
you will find it enlightening.
Introduction
2. Fundamentals of measurement
2.1 Time
2.2 Distance
2.3 Simultaneity
2.4 World maps, world pictures, and radar maps
8. Finale
Afterword
Appendices
A. Line integrals
B. Four-vectors and relativistic dynamics
C. Four-vectors, electromagnetism, and energy-momentum conservation
Index
Introduction
The aim of this book is to demonstrate the unifying power of the concept of a
space-time in understanding the nature of the physical world. It will do so firstly
by giving a good understanding of the nature and meaning of the flat space-time
of the special theory of relativity, and features of that theory such as length
contraction, time dilation, and the twin paradox. Secondly it will provide an
introduction to the nature and meaning of the curved space-times of the general
theory of relativity, including the concept of the expanding universe and the
nature of black holes. Both of these theories of relativity are due to Albert
Einstein (Fig. 0.1), the special theory being completed in 1905 and the general
theory in 1916.
Einstein's theories of relativity and their dramatic revelations of the unex-
pected nature of space-time are among the major scientific discoveries of this
century, replacing the ideas about space and time that had been believed since
Galileo and Newton. It is fundamental in approaching these topics that the
reader be prepared to drop his/her preconceived ideas about the nature of
distance measurements, time measurements, simultaneity, and causality. This is
perhaps Einstein's greatest single contribution to the understanding of space-
time: teaching us to question the commonplace ideas about these concepts. The
resulting revolution in understanding, leading to the discovery of length con-
traction, time dilation ('a moving clock goes slow'), the relativity of simultaneity,
and the fact that space-time geometry and causality are determined by the matter
in it, will be explored in depth in this book. One should note that the kinematic
effects discussed here are only dramatic when speeds near the speed of light are
involved; they are negligible in ordinary everyday life. That is why we do not
understand these effects intuitively as 'the way things are'. However, many of the
consequences of special relativity are significant in situations that do not involve
high-speed motion; in particular, the nature of magnetic forces and the possibility
of nuclear power are two such consequences of considerable importance.
The concept of space-time presented here is a model of reality used with great
success by theoretical physicists. It summarizes the nature of spatial and time
relationships in a concise way, and is a very good illustration of the use of geo-
metry in understanding physics. The point of a geometrical picture is that it
represents in a concise way many analytical relationships that are tedious to
describe in full, and are difficult to understand when they are written out in detail.
These pictures enable us to understand in a direct way the results of distance and
Fig. 0.1 Albert Einstein, who proposed the special theory of relativity in 1905 and the
general theory of relativity in 1916, thereby bringing the study of flat and curved
space-times into the mainstream of physics. (The photograph shows Einstein in 1933.)
(Photograph from the American Institute of Physics.)
time measurements and so are a very useful tool in making predictions about the
results of physical experiments. One should remember that the space-time view
embodied in relativity theory is a model of reality which has been tested by many
physical experiments, and depicts more correctly than other models the results of
these experiments. It is thus a way of summarizing much of what we know about
the physical universe. The understanding obtained through the concept of a
space-time shows how various features that we at first may regard as independent
of each other are in fact manifestations of the same underlying physical phe-
nomena. Thus this concept is not merely a tool to use in making predictions
efficiently, but also provides a way of understanding a deeper unity in nature than
is obvious on the surface.
Being able to understand fully the concept of a space-time implies being able
to calculate the results of measurements in particular space-times. We shall show
how this can be done without employing more than school-level mathematics
plus the simple concept of a line integral (explained in Appendix A). Thus we
believe that anyone with a good grasp of school algebra, some trigonometry, and
the concept of a function should be able to follow our detailed argument
including the calculations (in this respect our book is similar to Lilley's book
Discovering Relativity for Yourself, Cambridge University Press, 1981, which
gives a more extended introduction to the actual details of calculation than we do
here). In a few restricted places in the main text, the idea of a derivative is also
needed; omitting these sections will not impair understanding of the major thrust
of our argument. We recommend that the serious reader should indeed try to
follow all the calculations presented in the main text and attempt at least some of
the examples, both for the satisfaction this will afford and because this is the way
to fullest understanding of the concepts presented. Restrictions on the length of
the book meant that we were not able to include solutions to the exercises.
However, a set of notes containing a mixture of complete solutions, hints, and
answers to the problems may be obtained separately from the authors (please
write to Dr R. M. Williams). For fun, we have included some examples involving
writing programs for a microcomputer; these examples enable a good visual
presentation of some of the ideas, and are amusing to carry out, but again they are
not essential to understanding the text. We suggest that, if at any time you feel
that you are becoming stuck in detailed argument or calculations, you should just
note the general ideas presented and go on to the next section.
An acquaintance with school level physics will make the argument easier to
follow at some places, but a lack of this background will not prevent the reader
from grasping the main ideas. We show how the concepts of energy and
momentum are united through the concept of a space-time four-vector, leading
to the famous result E = mc^2 (Appendix B); and how electricity and magnetism
are united in a space-time tensor, leading to the fundamental understanding of a
magnetic field as being essentially an electric field viewed from a relatively
moving frame (Appendix C). These topics have been separated from the main
text because their full development requires somewhat more mathematics than
the main text (full appreciation of Appendix C requires sufficient knowledge of
partial derivatives to understand Maxwell's equations in vector notation). Thus
while this material will be interesting and useful to anyone who wishes to
understand these dynamical applications of relativity theory, it is not essential to
the understanding of the kinematics described in the main part of the book. With
these appendices, the book describes sufficient material on special relativity to
give adequate understanding for most first-year university physics courses on the
subject; however, the main text should be accessible to a wider circle of readers,
namely, any interested person with a reasonable knowledge of school mathe-
matics, and the will to follow the argument through (and indeed could serve as a
text for courses such as described by T. A. Roman in 'General relativity, black
holes and cosmology: a course for non-scientists', American Journal of Physics
54, 144, 1986). Should you not have a background in physics but wish to follow
through some of the physics arguments a bit further, the book Time, Space and
Things by B. K. Ridley (Cambridge, 1984) might be a good starting point.
This book focuses particularly on understanding relativity from a geometrical
viewpoint (perhaps the most similar other approaches being those in Geroch's
book General Relativity from A to B, University of Chicago Press, 1978, and in
Lilley's book mentioned above). We make particular use of Bondi's K-calculus to
determine the results of calculations in flat space-time (Hermann Bondi used this
approach in a successful BBC television series on relativity, and published
accounts of it in his books Relativity and Common Sense, Anchor Books, 1964,
and Assumption and Myth in Physical Theory, Cambridge University Press,
1967). Instead of starting off with the Lorentz transformation as the basis of the
argument, we arrive at this concept fairly late in our presentation, when it appears
as a convenient unified way of summarizing relationships we have previously
derived by use of the K-calculus. Our presentation of the nature of simple curved
space-times centres on showing the reader how he or she may deduce many
properties of these space-times directly from their interval. Further reading is
suggested in the concluding section of the main text ('Afterword'), and the reader
will find that the Index has been carefully prepared as a guide to the terms used
and ideas presented throughout the book.
While we have endeavoured to present the material covered thoroughly, we
have also tried to do so concisely so that the overall size of the volume will not be
excessive or daunting.
The first part of the book may seem to some to be rather leisurely, because all
the detail is spelt out. This is a conscious decision on our part: we feel that the
average textbook goes too fast through the fundamentals. The serious student
will probably be able to read the first few chapters fairly quickly, but will benefit
from this thorough grounding; he/she will find the main increment of difficulty is
in the Appendices, whose inclusion results in covering what is needed for a first
university course in relativity. On the other hand readers for whom they are too
technical may well omit these appendices. We believe that in their case the book
will provide a good opportunity for the interested non-specialist reader or early
student to understand the nature of flat and curved space-times, and how they
determine physical measurements of time, distances, and instantaneity, without
becoming bogged down in mathematical formalism. Thus the reader will become
familiar with one of the foundations of our modern understanding of the nature
of the physical world.
Fig. 1.1 Constructing a space-time. (a) A cine camera takes photographs of billiard balls
on a table. One ball moves relative to the others. (b) A series of photographs from the
film. (c) The photographs stacked together, later ones above the earlier ones. (d) The
photographs fused together to form a 'space-time', with time coordinate t and spatial
coordinates x, y.
histories of the stationary billiard balls are represented by vertical tubes in the
space-time, while the history of a ball moving to the left is represented by a tube
sloping over to the left. To recover the detailed history of motions of objects in the
space, simply consider a series of horizontal sections of the space-time (surfaces
of instantaneity) at later and later times. These sections intersect the tubes
representing the histories of the stationary balls at x and y coordinate positions
that stay constant (showing that they are indeed stationary), and intersect the
tube representing the ball moving to the left in positions that are successively
more to the left (showing it does indeed move to the left). In effect, by considering
a succession of time slices in this way one can reconstruct a series of images
corresponding to the photographs from which the space-time was initially
constructed, and then by considering these in turn one can visualize the motion of
the particles as in a cine film. The space-time therefore completely represents
these motions.
The space-time we have constructed is three-dimensional, representing the
histories of objects in a two-dimensional space (the surface of the table).
1.1 The concept of a space 7
*If you feel that the labels A and B for the different cameras and the corresponding observers are
antiseptically impersonal, you might like to substitute names such as Alfred or Angela for A,
Barbara or Bernard for B. While such labelling may well initially help the beginner to grasp what is
happening, ultimately it becomes an annoying distraction. We have chosen to use the more con-
venient abstract labels from the beginning.
Space-time diagrams and the foundations of special relativity
Fig. 1.2 Effect of the observer's motion on the space-time picture. (a) Camera A is
fixed above the billiard table. (b) Camera B moves with the moving billiard ball. (c) The
space-time view of the ball's history, constructed from A's photographs. (d) The
space-time constructed from B's photographs.
Fig. 1.3 Although they look different, A's and B's space-time views are equivalent:
sliding A's pictures sideways before fusing them together will give the same space-time
view as B's.
Fig. 1.4 A planet in circular motion around the Sun, describing a helix in space-time.
Examples of space-times
The ideas explained so far should become quite clear on carefully considering two
examples.
(A) A planet in circular motion around a sun. In the sun's frame of reference,
the sun is at rest in the spatial coordinates used, while the planet circles around it,
describing a helix in space-time (Fig. 1.4). To see that this is the correct space-
time picture, consider later and later time sections of the space-time; the positions
of the planet in the successive surfaces of instantaneity trace out a circle around
the sun, as required.
(B) A circular wave in a pond. Consider dropping a stone into a large pond at
some time t1, producing a spreading circular ripple in the pond (Fig. 1.5a).
Photographs of the crest of the circularly spreading wave taken from a camera
stationary above the point of impact (Fig. 1.5b) produce a space-time picture in
which the spreading wave is depicted as a cone with apex at time t = t1 (Fig. 1.5c).
Again considering later and later surfaces of instantaneity in the space-time, we
recover the series of images depicting the circularly spreading wave, starting
from the centre at time t1.
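Example (A) can be checked numerically. The following is a minimal sketch in Python, not a program from the book: the function name planet_event, the uniform circular motion, and the choice of orbital radius 1 and period 1 (arbitrary units) are all assumptions made for illustration. It lists events (t, x, y) on the planet's world-line and confirms that, on each surface of instantaneity, the planet lies on the same circle around the sun.

```python
import math

def planet_event(t, radius=1.0, period=1.0):
    """Return the event (t, x, y) on the planet's world-line at time t.

    Assumes uniform circular motion about a sun at rest at x = y = 0;
    radius and period are illustrative values in arbitrary units.
    """
    angle = 2.0 * math.pi * t / period
    return (t, radius * math.cos(angle), radius * math.sin(angle))

# Sample the helix on later and later surfaces of instantaneity (t = constant).
times = [0.0, 0.25, 0.5, 0.75, 1.0]
world_line = [planet_event(t) for t in times]

for t, x, y in world_line:
    # In every time slice the planet is at the same distance from the sun,
    # so the successive positions trace out a circle.
    r = math.hypot(x, y)
    print(f"t = {t:4.2f}  x = {x:+.3f}  y = {y:+.3f}  distance from sun = {r:.3f}")
```

Plotting the sampled events with t on the vertical axis would give the helix of Fig. 1.4; ignoring t and plotting only (x, y) gives the circular orbit.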
Points in space-time are called events. An event represents a particular
position in the physical world at a particular time, the set of all events repre-
senting the spatial and temporal locations of all possible physical occurrences.
A world-line is the path traced out in space-time by the events representing the
history of a particular particle or light ray. For example the helix in example (A) is
the world-line of the planet as it orbits around the sun. Not all lines in space-time
are possible world-lines; for example, if a line reaches a maximum time and then
slopes down again (Fig. 1.6), it does not represent a possible world-line of a
massive body, because time would start to go backwards along such a world-line,
where it slopes down. We shall discover further restrictions on allowable world-lines
after considering the limiting role played by the speed of light in relativity.
Summary
Space-time represents the histories of objects in space. When the space
represented is two-dimensional, the space-time is three-dimensional (three
Fig. 1.5 (a) Circular ripples produced by a stone thrown into a pond. (b) A succession of
photographs of the spreading wave. (c) A space-time view of the spreading wave.
coordinates are needed to characterize all events: the two spatial coordinates x
and y depicting the spatial position of the event, and the coordinate t representing
the time of the event). The full space-time needed to represent all events in the real
physical world is four-dimensional (with one time coordinate and three spatial
coordinates). Each surface (t = constant) tells us where each object was at the
time t, according to an observer using a particular coordinate system, say (x, y, z);
these surfaces are slices of instantaneity or simultaneity in the space-time
(Fig. 1.7).
Exercises
1.1 An observer O watches the engine of a train shunting on a straight track; he
chooses the x coordinate to measure distance along the track. Plot the world-line of the
engine in the (t, x) plane if, starting at a distance of 50 m from the observer, (i) it moves at
10 m/sec away from the observer for 5 seconds; (ii) then it is stationary for 7 seconds; (iii)
then it moves at 5 m/sec towards the observer for 8 seconds.
1.2 The motion of a rocket relative to observer A is shown in Fig. 1.8. What is the
distance of the rocket from A at t = 0 seconds? at t = 10 seconds? What is the speed of
motion of the rocket relative to A?
Fig. 1.8
1.3 Draw a space-time diagram representing the motion of the Moon about the Earth
(stating carefully what reference frame you are using). Indicate approximate time and
spatial scales on your diagram.
1.4 Suppose a particle in an accelerator moves in a circular orbit of radius 25 m,
speeding up all the time as it moves. Sketch a space-time diagram of its motion.
1.5 Two cars A and B, watched by a person C waiting to cross the street, collide and
then bounce apart. Sketch the world-lines of A, B, and C as seen by (i) the driver of one of
the cars; (ii) the driver of the other car; (iii) the person waiting to cross the street. [The
drivers are each securely seat-belted into their respective cars.]
So far, our discussion of space-times has been based on the everyday ideas of
Newtonian theory. The concept of a space-time applies equally in the case of
relativity theory, provided we take into account important relativity principles
which we examine in the next two sections.
Fig. 1.9 A graph of the square of the speed of a particle (v^2) against the energy of motion
given to it, showing the experimental result and the prediction of Newtonian theory. No matter
how much energy is given to the particle, the speed of light c is a limit to the speed it attains.
predicted by Newtonian theory. This happens in such a way that no matter how
much energy one imparts it is not possible to accelerate particles to move faster
than the speed of light (Fig. 1.9). The amount of energy needed to accelerate fast-
moving particles to higher speeds becomes larger and larger as the speed
increases; smaller and smaller speed increments result from each doubling of the
energy, and the speed of light is never reached. This is an experimental result that
has been proved many times over at a cost of many billions of dollars (since that is
the cost of the high energy particle accelerators now in use). One has to invest
large sums of money in accelerators to produce an observable effect, because the
speed of light is so large: the speed-of-light limit certainly does not act as a factor
restricting the speed of cars, aircraft, or other vehicles on the earth!
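The behaviour plotted in Fig. 1.9 can be reproduced numerically. The sketch below is illustrative only. It assumes the standard special-relativistic relation between energy of motion E and speed v, E = (gamma - 1) m c^2 with gamma = 1/sqrt(1 - v^2/c^2), which is not derived at this point in the text, and compares it with the Newtonian relation E = m v^2 / 2. The function names, and the choice of measuring energy in units of m c^2, are assumptions made for the example.

```python
import math

def newtonian_v2(energy):
    """(v/c)^2 from the Newtonian relation E = m v^2 / 2, with E in units of m c^2."""
    return 2.0 * energy

def relativistic_v2(energy):
    """(v/c)^2 from the relativistic relation E = (gamma - 1) m c^2 (assumed here)."""
    gamma = 1.0 + energy
    return 1.0 - 1.0 / gamma**2

# Double the energy repeatedly and watch the speed creep towards, but never reach, c.
energy = 0.25
for _ in range(6):
    speed = math.sqrt(relativistic_v2(energy))
    print(f"E = {energy:5.2f} mc^2   "
          f"Newtonian v^2/c^2 = {newtonian_v2(energy):6.2f}   "
          f"relativistic v^2/c^2 = {relativistic_v2(energy):.4f}   v/c = {speed:.4f}")
    energy *= 2.0
```

At low energies the two columns nearly agree; at high energies the Newtonian value of v^2/c^2 grows without bound while the relativistic value stays below 1, and each doubling of the energy produces a smaller increase in speed.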
Fig. 1.10 Distant galaxies and foreground stars. The foreground stars all belong to our
own galaxy, which is a spiral system of stars and dust like the galaxy M81 shown here. The
four 'nearby' galaxies visible in the photograph are at a distance of some millions of light
years from us (three fainter galaxies are even more distant) but the individual stars seen are
within a few thousand light years. The photograph dramatically illustrates the time delays
necessarily involved in all our observations of distant objects: we are seeing conditions at
the galaxies millions of years ago, and those in the stars up to a few thousand years ago.
Thus the images represent these objects as they were at times differing by millions of years.
(Photograph from the Hale Observatory.)
32 million light years from us, and so the image shows the galaxy as it was 32
million years ago. The back cover shows the COBE image (see p. 59) of the
surface of last scattering of light in the very early universe, approximately 10^10
years ago. The light that made this image has been travelling towards us for
that enormous time.
1.2 Causality and the speed of light 15
Fig. 1.11 (a) A camera above the centre of a pond: the distance d1 to the centre is clearly
shorter than the distance d2 to a point further out. Consequently, light arriving at the
camera from the centre set out later than light arriving at the same instant from the edge.
(b) Circles of constant imaging time on a photograph P1 of the pond, the larger circles
corresponding to earlier times. (c) Surfaces of simultaneity in a stack of photographs of the
pond (viewed edge-on, showing the finite thickness of each photograph). The photograph
P1 is shown shaded. (d) Distortion of the stack of photographs before fusing, to represent
correctly surfaces of simultaneity as exactly horizontal sections of space-time.
To explore this effect further, consider a camera 3 metres above the centre of a
circular pond of diameter 8 metres (Fig. 1.11a). The light has to travel a distance
of 3 metres from the centre of the pond to the camera, taking (3 m)/(3 x 10^8 m/sec)
= 10^-8 seconds to do so, but light from the edge of the pond, which is 4 metres
from the centre, has to travel a distance of 5 metres (by Pythagoras' theorem),
taking (5 m)/(3 x 10^8 m/sec) = 1.667 x 10^-8 seconds to do so.
Thus light from the edge takes 0.667 x 10^-8 seconds more to reach the camera than
light from the centre. A photograph records one instant when light reaches the
camera from different places within its field of view; if these places are at various
distances from the camera, the image obtained will represent the different times
when the light set out towards the camera. Hence, when the camera takes a
photograph of the pond, one will obtain images of the situation in different areas
of the pond at different times: light from the edge has to travel further and so has
to set out earlier in order to reach the lens at the same time as light from the centre.
If we sketch lines of exact simultaneity on a photo P1 of the pond taken by the
camera, they will form circles with the outer circle depicting the situation at the
pond earliest, say at a time t1, and the central point the situation at a time t2 which
is 0.667 x 10^-8 seconds later than t1 (Fig. 1.11b). A photograph taken by the
camera is not an instantaneous photograph of the pond! Hence, on stacking a
succession of photographs together and fusing them to obtain a representation of
space-time, horizontal sections will not represent exact simultaneity:* as one
moves out from the centre on a horizontal slice of space-time (which will be one
of the photographic images), the situation represented will be earlier and earlier
the further one is from the centre. There will be an earlier photo P0 in which the
situation at the central point is depicted at the time t1; this photograph will lie
below P1 in the stack (because later photographs lie above earlier ones). It follows
that exact surfaces of simultaneity in the space-time (e.g. t = t1) will be
lowest at the centre and will curve up as one moves from the centre to the edge
(Fig. 1.11c).
To correct this, i.e. to obtain a space-time representation in which horizontal
sections are indeed exactly simultaneous sections of the space-time, one will have
to distort the photographs of the pond by bending their outer regions downwards
before stacking them and fusing them together (Fig. 1.11d). One could in this
way allow for the light travel time, and obtain a space-time picture correctly
representing simultaneity as exactly horizontal surfaces.
In this particular case, the effect is negligible in practice. However, this will not
always be true. Consider, for example, the delays implied from the centre to the
edge of the photographic image where an observer in a spacecraft photographs
the disc of a galaxy from a distance of 30 000 light years above the centre of the
galaxy. If the galaxy has a radius of 40 000 light years, the delay represented in the
photograph will be 20 000 years, i.e. the situation at the centre will be depicted
20 000 years after that at the edge of the disc.
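Both delay estimates follow from the same geometry: the extra path length from the edge is the hypotenuse minus the height, divided by the speed of light. The following is a small sketch of that calculation in Python; the function names are hypothetical, and the speed of light is rounded to 3 x 10^8 m/sec as in the text.

```python
import math

C_M_PER_SEC = 3.0e8  # speed of light in m/sec, rounded as in the text

def path_length(height, radius):
    """Distance from a point at the given radius to a camera at the given height above the centre."""
    return math.hypot(height, radius)

def edge_delay(height, radius, speed_of_light):
    """Extra light travel time from the edge compared with the centre."""
    return (path_length(height, radius) - height) / speed_of_light

# Pond: camera 3 m above the centre, pond radius 4 m (diameter 8 m).
pond_delay = edge_delay(3.0, 4.0, C_M_PER_SEC)  # in seconds

# Galaxy: distances in light years, so the speed of light is 1 light year per year.
galaxy_delay = edge_delay(30_000.0, 40_000.0, 1.0)  # in years

print(f"pond:   edge light set out {pond_delay:.3e} seconds earlier")
print(f"galaxy: edge light set out {galaxy_delay:.0f} years earlier")
```

The pond delay is less than a hundred-millionth of a second, which is why the effect can be ignored at everyday scales, while the same geometry scaled up to a galaxy gives a delay of 20 000 years.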
*In Section 1.1, we ignored light travel time and so regarded horizontal slices as exactly
simultaneous. This will be a good approximation for slowly moving objects considered at everyday
time and length scales.
Fig. 1.12 (a) A light ray travelling in the x-direction after emission at the event 0 (x = 0,
t = 0). Its space-time position is shown at t = 1 and t = 2. (b) The same light ray depicted
using a spatial coordinate X = x/c (with units of light-seconds).
terms of coordinates X = x/c, Y = y/c, Z = z/c which are just the previous
spatial coordinates divided by the speed of light; they are the same distances
but measured in terms of 'light-times' (light-seconds, light-years, etc.). Then in
1 second the light would be at the position x = 1c cm, y = z = 0, so X =
(1c cm)/(c cm/sec) = 1 light-second, Y = Z = 0; at the time t = 2 seconds, it will
be at the position X = (2c cm)/(c cm/sec) = 2 light-seconds, Y = Z = 0; and so
on. At an arbitrary time t, it will be at the position X = (ct)/c = t light-sec,
Y = Z = 0 (Fig. 1.12b). The relation between this and the previous representation
is easily obtained on remembering that 1 light-second = (1 sec) x (c cm/sec)
= 3 x 10^10 cm = 300 000 km. Another way of thinking of the coordinates X,
Y, Z is that when they are used, we have effectively chosen units of measurement
for spatial distances so that the speed of light is 1 (because then light travels a
distance of 1 light-second in 1 second, etc).
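The change of units can be illustrated with a few lines of Python. This is only a sketch: the function names are hypothetical, and the speed of light is rounded to 3 x 10^10 cm/sec as in the text. It follows a light ray emitted at x = 0, t = 0 and shows that its position in light-seconds is numerically equal to the elapsed time in seconds.

```python
C_CM_PER_SEC = 3.0e10  # speed of light in cm/sec, rounded as in the text

def light_ray_position_cm(t_sec):
    """Position x (in cm) at time t of a light ray emitted at x = 0, t = 0."""
    return C_CM_PER_SEC * t_sec

def to_light_seconds(x_cm):
    """Convert a distance x in cm to X = x/c in light-seconds."""
    return x_cm / C_CM_PER_SEC

for t in (1.0, 2.0, 3.0):
    x = light_ray_position_cm(t)
    X = to_light_seconds(x)
    print(f"t = {t:.0f} sec   x = {x:.1e} cm   X = {X:.0f} light-sec")
```

In the X coordinate the light ray is simply the line X = t, which is the sense in which the speed of light is 1 in these units.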
In flat space, initially parallel light rays never meet each other because the
spatial distance between them stays constant (Fig. 1.13a); consequently in space-
time diagrams, they are represented by parallel straight lines that remain a
constant distance apart (Fig. 1.13b). We shall see later that this is not true in a
curved space-time.
Fig. 1.13 (a) Parallel light rays in a three-space with coordinates (x, y, z). (b) These rays
are represented by parallel straight lines in space-time.
Fig. 1.14 The future light cone of the event 0 is the set of all future-directed light rays
through 0.
light in a two-dimensional plane. The light will spread out circularly in this plane,
which is described by coordinates x and y. This is exactly analogous to the
spherical wave in the pond (Example (B) above). By exactly the same reasoning as
used in that example (leading to Fig. 1.5c), a three-dimensional space-time
diagram representing the spread of the light will show the wave front as a cone
originating at (x = y = 0, t = 0) and with radius ct at time t (Fig. 1.15b). As the
future light cone of the event O obtained in this way represents light travelling out
in all directions from the emission event O, it is generated by all the future light
rays that pass through O.
To represent this situation in a clear, standard way, it is convenient to use the
coordinates X = x/c, Y = y/c, Z = z/c introduced above. Their use has the
advantage that in these units the spatial distance travelled is equal to the time
elapsed (the effective speed of light is 1); for example, after a time of 1 second, the
1.2 Causality and the speed of light 19
Fig. 1.15 (a) A sphere of light spreading out from a flashbulb. (b) Representation of the
spherical light wave in a three-dimensional space-time diagram, giving the future light
cone of O.
light has spread to a sphere of radius 1 light-second. Consequently the light cone
makes an angle of 45° with the vertical axis, representing the fact that a unit
horizontal distance in these diagrams is traversed in a unit time; this makes it
particularly easy to draw the light cones when these units are used (Fig. 1.15b was
drawn using this convention).
It is often convenient to restrict our attention even further to a fixed value of
Y (say Y = 0) as well as a fixed value of Z. The light then spreads out in a one-
dimensional space with X as the spatial coordinate (this situation might be real-
ized, for example, if a pair of optical fibres convey the light from the flashbulb in
the positive and negative X directions, Fig. 1.16a). The corresponding two-dimensional
space-time diagram shows the light emitted from the event O as
travelling on lines at ±45° to the t axis (Fig. 1.16b); these are the two light rays
through O, because such lines are precisely those in which a unit (vertical) change
in time corresponds to a unit (horizontal) change in distance. This diagram is a
two-dimensional section (with one time and one space dimension represented)
of the three-dimensional Fig. 1.15b (representing one time and two space
dimensions). In this diagram we have extended the light rays to the past of O;
the light rays converging on O from the past generate its past light cone, representing
converging light pulses that arrive at the position (X = Y = Z = 0) at the
time t = 0.
The importance of the light cone of any event derives from the fact that it limits
the region of space-time which can be causally affected from that event. For
example, suppose President Lugarnev of Transylvania receives information at
noon that at 3:00 p.m. a nuclear missile is to be launched towards his castle on the
earth from a secret base on Mars. He instantly presses the button firing his Super-
Z lasers at the base on Mars, but he is too late: the energy bolts he has released,
travelling at the speed of light, will take 4 hours to reach Mars and so will destroy
the rocket launching pad 1 hour after the missile has left. Let the event where he
receives the information be O; this event (specified by a time and spatial position)
is then noon at his castle. The light cone of O is depicted in Fig. 1.17, where, for
20 Space-time diagrams and the foundations of special relativity
Fig. 1.16 (a) Light spreading from a flashbulb one-dimensionally along optical fibres.
(b) Representation of these light rays in a two-dimensional space-time diagram, generating
the future light cone of O. The past light cone of O (i.e. light rays converging
to O) is also shown.
Fig. 1.17 (a) A space-time diagram showing the event P (t = 3, X = 4) where missiles are
launched from Mars towards the Earth. At the time t = 0 on the Earth (at X = 0), it is
already too late to prevent the launching of these missiles; this is because a laser pulse
emitted at this event O will reach Mars at the event R (t = 4, X = 4), an hour after the
missiles were launched. (b) Depiction of this series of events by a sequence of instantaneous
spatial views. At t = 0, the castle fires a bolt towards the missile base; at t = 3, the base fires
a missile while the bolt is still a light-hour away from it; at t = 4, the base is destroyed but
the missile is on its way to the castle. Note the direct correspondence between these spatial
views and the space-time diagram. Event P cannot be influenced from event O
because P is outside O's future light cone (the light ray OR lies on this light cone).
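The light-cone test used in this example can be written as a short check. The following is only a minimal sketch, assuming times in hours and distances in light-hours measured from the event O (so that the speed of light is 1); the function name `causal_relation` is our own label and does not come from the text.

```python
def causal_relation(dt, dX):
    """Classify an event relative to a reference event O.

    dt: time of the event minus the time of O (hours).
    dX: spatial separation of the event from O (light-hours).
    In these units the speed of light is 1.
    """
    if dt == 0 and dX == 0:
        return "the event O itself"
    if dt > 0 and abs(dX) < dt:
        return "inside future light cone"
    if dt > 0 and abs(dX) == dt:
        return "on future light cone"
    if dt < 0 and abs(dX) < -dt:
        return "inside past light cone"
    if dt < 0 and abs(dX) == -dt:
        return "on past light cone"
    return "outside light cones"

# Launch event P (t = 3, X = 4) and laser-arrival event R (t = 4, X = 4)
print(causal_relation(3, 4))  # outside light cones
print(causal_relation(4, 4))  # on future light cone
```

Because P falls outside the future light cone of O, nothing sent from O can affect it; R lies on the cone and is reached only by a signal moving at the speed of light.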
which can be influenced by objects travelling from the event P at less than the
speed of light; the future light cone itself can be influenced from P by signals
travelling at the speed of light. The past light cone represents the set of events in
space-time from which signals sent at the speed of light arrive at the spatial
position and time represented by event P. Thus in a photograph of an object taken
at P, the light arriving at P records the situation at the instant where the object's
world-line intersects our past light cone (Fig. 1.20); the camera necessarily
records the resulting time delays (as in the cover photograph). The interior of the
past light cone C-(P) is the region in space-time from which the event P can be
influenced by objects travelling at less than the speed of light. The exterior of the
Fig. 1.18 A straight world-line passing through O and P represents motion relative to
the reference frame (t, X) at a speed v in the X-direction; at time t, it is at position
X = x/c = vt/c. The angle α of this world-line to the vertical is given by tan α = X/t = v/c.
For a light ray, v = c and tan α = 1.
Fig. 1.19 The future and past light cones C+(P), C-(P) of an event P determine the future
of P (the interior of the future light cone), and the past of P (the interior of the past light
cone). Events outside these light cones cannot be influenced from P or influence what
happens there.
light cones is the region which cannot be influenced by P and which cannot
influence P.
One can illustrate the latter feature by considering a particular event on the
surface of the Earth, when an astronaut on the Moon is observed through an
Fig. 1.20 A photograph taken by observer A at the event P depicts the event R in B's
history, where B's world-line intersects the past light cone of P.
Fig. 1.21 The past and future light cones of an event O in the history of an observer A on
the Earth, who (at the event O) sees event e (a threatening boulder starting to roll down) in
the history of an astronaut B on the Moon. Observer A immediately sends a warning signal
to B; but this arrives at event r, just after the boulder has hit the astronaut at the event b
in his history. Because b is outside the future light cone of O, the observer at O cannot
influence what happens there.
ultrapowerful telescope. Suppose that at this time one were to observe a boulder
rolling down a slope towards the astronaut. Since light takes 1.27 seconds to
reach the Earth from the Moon, we are observing an event 1.27 light-seconds
away and 1.27 light-seconds to the past, on the past light cone (Fig. 1.21). It is
already too late to radio a warning to the astronaut if the boulder will take
2 seconds to reach him, because the event where the boulder will reach him is
outside the causal future of the reception event. Given the restrictions on
communication resulting from the limiting nature of the speed of light, there is no
method of sending a warning signal in time.
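The arithmetic behind this example is brief enough to check directly. The sketch below is illustrative only: it assumes the 1.27-second light travel time quoted above, treats the Earth and Moon as at rest relative to each other, and uses a function name (`warning_arrives_in_time`) that is our own.

```python
LIGHT_DELAY = 1.27  # seconds for light to travel between Moon and Earth (value from the text)

def warning_arrives_in_time(time_to_impact, light_delay=LIGHT_DELAY):
    """Return True if a warning sent the instant the danger is seen arrives before impact.

    time_to_impact: seconds between the boulder starting to roll (event e)
    and the boulder reaching the astronaut (event b).
    """
    # The observer sees event e only after one light delay; the warning
    # then needs a second light delay to travel back to the Moon.
    earliest_arrival = 2 * light_delay
    return earliest_arrival < time_to_impact

print(warning_arrives_in_time(2.0))  # False: the 2.54 s round trip exceeds 2 s
print(warning_arrives_in_time(3.0))  # True
```

A warning can help only if the danger takes longer to develop than the round-trip light travel time.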
The causal limitations discussed here are fundamental, but will not sig-
nificantly affect ordinary everyday life in an obvious way because the speed of
light is so large: in the context of cars, aircraft, etc. on or near the surface of the
Earth, the resulting delays in communication are negligible. They become sig-
nificant either when large distances or times are involved, or if the time-scales
involved in some process are such that the speed of light is a significant limiting
factor. One example is supercomputers: an ultimate limit is imposed on their
possible speed of calculation because information cannot be conveyed from
one part of the computer to another at speeds greater than the speed of light;
this limits the number of calculations that can be performed per second. For
this reason, distances between their components must be kept small; thus
supercomputers of the future will be small machines.
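To get a feeling for the sizes involved, one can estimate how far a signal can travel during a single step of a calculation. The sketch below is an illustration under stated assumptions: the clock rates are hypothetical round figures chosen by us, the speed of light is rounded to 300 000 km/sec as elsewhere in this chapter, and the function name is our own.

```python
C_KM_PER_SEC = 300_000  # speed of light, rounded as in the text

def max_signal_distance_cm(clock_rate_hz):
    """Greatest distance (in cm) any signal can cover during one clock cycle."""
    seconds_per_cycle = 1.0 / clock_rate_hz
    distance_km = C_KM_PER_SEC * seconds_per_cycle
    return distance_km * 1e5  # 1 km = 10^5 cm

# Hypothetical clock rates: a million, a billion, and a trillion cycles per second
for rate in (1e6, 1e9, 1e12):
    print(f"{rate:.0e} cycles/sec: {max_signal_distance_cm(rate):.3g} cm per cycle")
```

At a billion cycles per second the limit is about 30 cm per cycle, which shows why components that must exchange information within one cycle have to sit close together.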
Exercises
1.6 A satellite takes survey pictures of a square region of the Earth, 800 km in width,
from 300 km above the Earth's surface. What is the delay from the centre of the image to
the edge? (Regard the Earth's surface as flat in order to simplify the calculation).
1.7 Suppose that a `mind reader' in London claims to know what his twin brother in
New Zealand says at any moment, within less than one-hundredth of a second after a word
is uttered. Is there anything extraordinary about this claim? [The radius of the Earth is
about 6000 km.]
1.8 A rocket R moves in the z direction relative to an observer A on Mars, at a speed v
where v/c = 1/2; their positions coincide at t = 0. Plot the world-lines of A and R in a (t, Z)
diagram. The rocket emits light signals in both the forward and backward directions at
t = 2 sec; draw the corresponding light rays in your space-time diagram. The observer A
signals to the rocket at the time t = 1 sec; what is the earliest time he can expect to get a
reply? [All distances and times are measured in the reference frame of the observer A.]
1.9 Draw a diagram to illustrate the fact that the `past' (i.e. the past light cone and its
interior) of any point P on any world-line, always includes the `past' of any earlier point Q
on that world-line. Interpret this result in physical terms.
Computer Exercise 1
Write a program that will either (a) take as input a spatial distance D (in miles or km) and
give as output the time T (in seconds, minutes, or hours) for light to travel that distance;
or (b) take as input a light travel time T, and give as output the corresponding distance D.
Try the program for suitable distances on the Earth, and in the solar system.
Now alter the program to print out additionally the rescaled distance D1 = D/c,
where c is the speed of light. Notice the simplification achieved. [This corresponds to use of
coordinates X, Y, Z discussed above, for which the speed of light is unity. Your output
should always state the units of time and distance being used.]
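One possible starting point for this exercise is sketched below. It is a minimal sketch rather than a model answer: the function names, the rounded value of c, and the Earth-Sun distance of about 150 million km used in the example are our own assumptions.

```python
C_KM_PER_SEC = 300_000.0  # speed of light in km/sec (rounded, as in the text)
KM_PER_MILE = 1.609344

def light_travel_time(distance, unit="km"):
    """Time in seconds for light to travel `distance`, given in km or miles."""
    if unit == "km":
        km = distance
    elif unit == "miles":
        km = distance * KM_PER_MILE
    else:
        raise ValueError("unit must be 'km' or 'miles'")
    return km / C_KM_PER_SEC

def light_travel_distance(seconds):
    """Distance in km covered by light in `seconds`."""
    return C_KM_PER_SEC * seconds

def describe(seconds):
    """Format a time with a convenient unit (seconds, minutes, or hours)."""
    if seconds < 60:
        return f"{seconds:.4g} sec"
    if seconds < 3600:
        return f"{seconds / 60:.4g} min"
    return f"{seconds / 3600:.4g} hours"

# Example: Earth to Sun, taking D as roughly 150 million km (an assumed round figure)
D = 150e6
T = light_travel_time(D)
print("D =", D, "km;  T =", describe(T))
# The rescaled distance D1 = D/c is the same number as T, now read as light-seconds
print("D1 = D/c =", D / C_KM_PER_SEC, "light-sec")
```

The second printed line shows the simplification the exercise refers to: once distances are divided by c, the distance in light-seconds and the light travel time in seconds are the same number.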
The laws of physics are the same for all non-accelerating observers.
In the Newtonian theory, this result is well established as far as the laws of
dynamics are concerned: there is no way for an experimenter to determine
absolute uniform motion by any dynamical experiment. For example, if one
carried out a series of experiments involving measuring the motion of colliding
billiard balls, timing pendulums, etc. in a compartment in a uniformly moving
train, the results are independent of the speed of motion of the train. Therefore,
one cannot determine the speed of motion of the train by any such experiments, as
they are not affected by this speed; indeed, the results of the experiments will be
exactly the same as if the train is at rest. Similarly, the results would be the same if
the experiments were done in the Concorde airliner flying smoothly at twice the
speed of sound. This set of results establishes the Newtonian principle of rela-
tivity, that the laws of dynamics of particles and rigid bodies are the same in all
non-accelerating frames.
The genius of Einstein lay in extending this principle to all the laws of physics
(it applies e.g. to optics, thermodynamics, electromagnetic effects, and elementary
particle physics). Thus, the special principle of relativity implies that no
physical experiment whatever can establish the absolute motion of any uniformly
moving body (one can easily establish motion relative to other bodies, but that is
not the issue: the point is that we cannot determine the motion of the Earth at
some instant as being say 350 km/sec in any particular direction, in an absolute
sense). This is because no experiment can detect such absolute motion; and that is
because the laws of physics are unaffected by any absolute uniform motion.
One can rephrase the principle of relativity as stating the equivalence of all
inertial reference frames. The set of coordinates used by an observer to describe
space-time, with himself at the origin (x = y = z = 0), constitutes his reference
frame. A reference frame is said to be inertial if it is non-rotating and non-
accelerating. Newton's laws of motion imply that a body experiences an accel-
eration relative to an inertial reference frame if and only if forces caused by other
bodies act on it; indeed this feature may be used to characterize inertial frames.
If one frame is inertial, any other frame moving uniformly relative to it is also
inertial. The claim then is that one may use any inertial reference frame and the
laws of physics will be unchanged.
At first this principle seems obscure, but after we have encountered it in
various contexts and seen its implications, its nature will become obvious. It is a
powerful unifying principle underlying all known laws of physics. It is already
clear that it is useful in the following sense: it implies that if a body is in uniform
motion, we do not have to specify that state of motion before being able to apply
the laws of physics to it. For example, the operation of the electric generators and
motors in an aircraft is unaffected by the motion of the aircraft, if it is moving
uniformly. Therefore we do not have to design the motors to take the speed of
operation into account; an electric motor that works on the surface of the earth
will work equally well in a rocket moving uniformly at 25 000 miles an hour
relative to the surface of the Earth. Engineering would be very difficult indeed if
this were not so.
The speed of light in empty space is the same for all observers, independent of
the motion of the source and of the observer.
If the speed of light were not independent of the motion of the observer, we could
detect absolute motion by measuring the speed of light in different directions, so
contradicting the principle of relativity. Given this invariance it is then clear that
the speed of light must be independent of the motion of the source also, or else its
absolute motion could be detected by measuring the speed of light it has emitted
(which would be measured to be the same by all observers). This principle is
supported by all available experimental evidence, in particular, by the famous
Michelson-Morley experiment which showed that the speed of light emitted by
distant stars is the same when measured from the Earth, whether the Earth in its
orbit around the Sun is moving towards or away from the stars (Fig. 1.22). In
addition, this principle is also a consequence of the relativity principle applied to
particle dynamics, because the speed of light is a limiting speed for particle
motion (cf. the previous section). This implies that if the speed of light were
different in different frames, it would be possible to use dynamical experiments
(aimed at determining the limiting velocity of motion) to determine the absolute
motion of each reference frame.
Given the validity of this result, it becomes starkly clear that we will have to
revise our ideas about many features we have previously taken for granted. To see
this, we consider three important effects of special relativity.
Fig. 1.22 In the Michelson-Morley experiment, the speed of light emitted by a star is
measured both when the Earth in its orbit around the Sun is moving towards the star and
away from it. The same result is obtained for the speed of light in both cases: this speed is
independent of the relative motion of the source and the observer.
[Fig. 1.23 labels: rocket; signal; speeds 150 000 km/sec and 300 000 km/sec; panels (a) and (b)]
Fig. 1.23 (a) An observer at rest on the Earth measuring the speed of a light signal emitted
from a fast-moving rocket. (b) The same situation but viewed from the rest frame of the
rocket.
(because the signals travel at the speed of light). Suppose such a light clock is
attached to a rocket (Fig. 1.25a); seen from the rocket's frame, the time measured
will be given by eqn (1.1) independently of its state of motion (because of the
principle of relativity). Now suppose the rocket moves past an identical clock on
the ground, at a speed v (Fig. 1.25b). Considered from the ground, the light
always travels at the same speed; therefore the interval between emission and
reception of light by the clock on the rocket is measured from the ground to be 2t',
where the distance travelled by the light is given by Pythagoras' theorem, so
\[ c^2 t'^2 = v^2 t'^2 + d_0^2 . \]
Fig. 1.24 A `light clock' consisting of two mirrors held at a fixed distance by a rigid rod,
and a pulsed light source. The `ticks' of the clock occur each time the pulse of light is reflected
by the bottom mirror.
1.3 Relative motion in special relativity 29
Fig. 1.25 (a) A light clock fixed to a rocket, viewed from the rest-frame of the rocket. The
light is reflected from a mirror at a distance $d_0$, and is received back after a time 2t. (b) A
light clock aboard a rocket moving at speed v relative to an identical clock on the ground.
An observer on the ground sees the light received back by the rocket after a time 2t'.
This implies
\[ t'^2 (c^2 - v^2) = d_0^2 \;\Rightarrow\; t'^2 (1 - v^2/c^2) = d_0^2/c^2 . \]
Taking the square root and dividing by $(1 - v^2/c^2)^{1/2}$ gives
\[ t' = (d_0/c)/(1 - v^2/c^2)^{1/2} . \]
But the rate of the clock on the ground is given by (1.1). Thus
\[ t' = t/(1 - v^2/c^2)^{1/2} . \tag{1.2a} \]
We see that even with identical light clocks, the ticks of clocks in relative motion
measure time at different rates. Since t' is larger than t, the moving clock is seen
from the ground to `run slow' in the ratio
\[ t'/t = 1/(1 - v^2/c^2)^{1/2} . \tag{1.2b} \]
This effect is significant when motion is at speeds near the speed of light. We shall
rederive this result in Section 3.4, and discuss its experimental verification in that
section and in Section 3.6.
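The light-clock argument can be checked numerically. The sketch below is illustrative only: it takes eqn (1.1) to be t = d0/c, which is what the caption of Fig. 1.25 suggests but which is not restated here, it works in units with c = 1, and the function names are our own.

```python
import math

def ground_half_tick(d0, v, c=1.0):
    """Solve c^2 t'^2 = v^2 t'^2 + d0^2 for t' (the Pythagoras relation)."""
    if abs(v) >= c:
        raise ValueError("speed must be less than c")
    return d0 / math.sqrt(c**2 - v**2)

def dilated_time(t, v, c=1.0):
    """Eqn (1.2a): t' = t / (1 - v^2/c^2)^(1/2)."""
    return t / math.sqrt(1.0 - (v / c) ** 2)

d0, c = 1.0, 1.0  # arm length of 1 light-second, units with c = 1
t = d0 / c        # assumed form of eqn (1.1): time for light to cross the arm at rest

for v in (0.1, 0.6, 0.9):
    # The geometric solution and eqn (1.2a) should agree for every speed
    assert math.isclose(ground_half_tick(d0, v, c), dilated_time(t, v, c))
    print(f"v/c = {v}: t'/t = {dilated_time(t, v, c) / t:.4f}")
```

At v/c = 0.1 the ratio differs from 1 by only about half a per cent, while at v/c = 0.6 it is about 1.25, which illustrates why the effect matters only near the speed of light.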
Fig. 1.26 The `twin paradox'. (a) Twin A stays at home while twin B goes on a long return
journey at high speed. (b) A clock actually measures time along its world-line in space-time
(each `tick' can be thought of as a marker on the world-line). (c) A space-time diagram of
the twins' histories: twin A's clock measures time t along his world-line, while twin B's
clock measures time t' along her world-line.
twins carry with them. For the effect to be significant, the relative motion must
take place at close to the speed of light.
This effect is not really surprising when one asks what a measurement of `time'
means in space-time. Remembering that clocks are mechanisms whose history is
represented by a world-line in space-time, we see that it is plausible that what they
really measure is `distance in space-time' along the world-lines representing their
history (Fig. 1.26b). Because the twins have followed different space-time paths
between the events when they are together initially and finally (Fig. 1.26c), it is
not too surprising that they have lived for different times. A similar effect occurs
on the surface of a table. The distance d from P to Q along the path C differs
from the distance d' along the route C' (Fig. 1.27a); the `twin paradox' is the analogous
effect in space-time. There is, however, a significant difference in the two effects:
from this analogy one might at first expect that the time measured by the twin who
moves out and back would be longer, as her world-line looks longer in Fig. 1.26c;
but the actual sign of the effect is the opposite. We must be careful; while such a
space-time diagram accurately represents instantaneous relative spatial pos-
itions and time measurements made by a single observer, we must not jump to
conclusions about what spatial or time measurements will be made by other
observers. In this case the diagram represents accurately measurements made by
the stay-at-home twin A, but does not in an obvious way represent measurements
made by the traveller B. What is clear from the diagram is that we may expect
Fig. 1.27 (a) Two paths between points P and Q on the surface of a table. The straight-line
path C is of length d, while the curved path C' is of length d'. (b) Four routes from town Q
to town P lying on opposite sides of the city C. Travel time is longest on route a through the
city centre; and is shortest on the apparently longer route d, a freeway that avoids even the
outer suburbs of the city. This provides a good analogy to the situation in Fig. 1.26c.
B's time measurements to differ from those of A, but we must not jump to
conclusions as to how they differ.
An analogy to the situation represented by the space-time diagram can be
given as follows: imagine towns P and Q lying respectively to the north and south
of an ancient city C. The roads of this city are very congested by heavy traffic
passing through narrow streets, so the closer one travels to the city centre the slower
travel by car is. One can choose routes from P to Q through the centre of the city,
through inner suburbs, through outer suburbs, or on a ring road that avoids the
city altogether; these further-out routes from P to Q of course involve travelling a
longer distance, as is at once apparent from a map (Fig. 1.27b). However, the
travel time from P to Q is shortest on the ring road and longest on the road
through the city centre. The map represents accurately the different possible
paths from P to Q, but not the different times it will take to travel on these routes;
the shortest travel time from the initial to the final points is associated with the
path that looks longest on the map. This gives us a good analogy to the space-
time situation represented in Fig. 1.26c. One can crudely understand the sign of
the effect in that case by remembering the example of the observer watching a
clock from a tram, which suggests that the nearer to the speed of light a clock
moves relative to an observer A, the slower it will appear to him that it is running.
We shall discuss the time dilation effect and `twin paradox' fully in Section 3.4.
Fig. 1.28 (a) Two cameras A and B over a billiard table: A is stationary above the centre,
and B is moving to the left. Light rays are emitted from the sides of the table as two balls R
and G are simultaneously pocketed; at the same instant, A and B coincide. (b) Both light
rays reach A at the same instant, but B receives the light from the left before the light from
the right. Thus he sees R fall into a pocket before G.
Fig. 1.29 (a) Surfaces of simultaneity for A and B, showing how relatively moving
observers determine different space-sections of space-time as being instantaneous.
(b) Cross-section (y = constant) of Figure (a). The surface of simultaneity for A is parallel
to the x-axis, but that for B is tilted relative to it.
different splittings of space-time into space and time. Space-time is a unit which
unifies space and time, but does so in different ways for different observers. The
argument above is indicative of the fundamental feature that simultaneity is
determined relative to the motion of the observer, but does not enable one to
understand the issues fully. A full technical examination of simultaneity and how
to measure it is given in Section 3.3.
Exercises
1.10 Which of the following properties would you expect for a correct relativistic
velocity addition law, combining parallel velocities $v_1$ and $v_2$ to produce a resultant
velocity $v_3$?
(i) $\{v_1/c \ll 1,\ v_2/c \ll 1\} \Rightarrow v_1 + v_2 \approx v_3$;
(ii) $\{v_1/c > \tfrac{1}{2},\ v_2/c > \tfrac{1}{2}\} \Rightarrow v_3/c > 1$;
(iii) $\{v_1/c < \tfrac{1}{2},\ v_2/c < \tfrac{1}{2}\} \Rightarrow v_3/c < 1$.
($\ll$ means `much less than'.)
1.11 A certain type of elementary particle is unstable: when at rest in the laboratory, it
is measured to decay on average after $10^{-5}$ seconds. Such particles are made to travel at a
speed of c in a linear accelerator. What will their average lifetime then appear to be to a
stationary observer?
1.12 A passenger A sitting in the middle of a coach in a moving train observes that
lightning strikes both ends of the coach at exactly the same time. At the instant when he
receives the light signals, he passes a stationary observer B standing near the railway track.
Which end of the coach does the lightning strike first, according to B? If the coach is
moving at 30 m/sec and is measured by the stationary observer to be 100 m long, what time
difference does he measure between the two lightning strikes? [Note that B is precisely
midway between the ends of the coach when he receives the signals.]
Computer Exercise 2
Write a program that takes as input a value V representing speed of relative motion as
a fraction of the speed of light, and a time T measured by a stationary observer; and gives
as output the corresponding time T' measured by the moving observer (see eqn (1.2)).
Make sure your program will only accept values of V with magnitude less than 1.
Verify from your program (a) that T' < T for all non-zero values of V (positive or
negative); (b) for any given T, T' → 0 as V → 1. What value do you find for T'/T when
V = 0? Interpret your results physically.
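A possible outline for this exercise is given below as a hedged sketch, not a definitive solution. It assumes that eqn (1.2), rearranged for the notation of this exercise, reads T' = T (1 - V^2)^(1/2), where T is the stationary observer's time and T' the moving observer's time; the function name is our own.

```python
import math

def moving_observer_time(V, T):
    """Time T' measured by the moving observer for stationary-observer time T.

    V is the relative speed as a fraction of the speed of light.
    Assumes T' = T * sqrt(1 - V^2), i.e. eqn (1.2) in this exercise's notation.
    """
    if not -1.0 < V < 1.0:
        raise ValueError("V must have magnitude less than 1")
    return T * math.sqrt(1.0 - V * V)

T = 10.0
for V in (0.0, 0.5, -0.5, 0.9, 0.999):
    T_prime = moving_observer_time(V, T)
    print(f"V = {V:6.3f}   T' = {T_prime:7.4f}   T'/T = {T_prime / T:.4f}")
```

Only the square of V enters, so positive and negative speeds give the same T'. Input with magnitude 1 or more is rejected with an error, which meets the requirement that the program accept only values of V with magnitude less than 1.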
Conclusion
Space-time diagrams give a very convenient description of spatial and temporal
relations, which enable us to clarify important features such as the nature of
causal relationships. The examples given so far show that in order to understand
relativity theory properly, and the way space-time represents space and time
measurements for different observers, we need to rethink carefully the nature of
space and time measurements. We shall do so in Chapter 2, and then work out
systematically the consequences for the geometry of the space-time of special
relativity in Chapter 3 (studying there in depth the concepts introduced in
this Chapter). The unifying theme of a space-time interval will be introduced
in Chapter 4, and used in later chapters to study some basic ideas of curved
space-times.
While all the preceding material is necessary for a full understanding of the
later chapters, so that ideally one should read them in sequence, nevertheless a
reader who wishes to proceed directly to the main ideas of curved space-times can
do so now by reading Chapter 5. However, understanding of the interesting
applications in Chapters 6 and 7 will be greatly benefited by a perusal at least of
the flat-space universes discussed in Section 4.3.
Although we shall mention it again in the Afterword, let us recommend as an
additional source of discussion and examples the book Space-Time Physics by
E. F. Taylor and J. A. Wheeler (Second edition: Freeman, 1992); this describes
special relativity (and a little beyond) in a highly readable way, with lots of
examples and pictures, and provides a useful parallel text which could be read in
conjunction with Chapters 1-4 of this book.
2
Fundamentals of measurement
2.1 Time
We assume the existence of ideal clocks which measure time accurately along
their world-lines. These clocks may for example be mechanical (e.g. based on an
escapement mechanism controlling the rate at which a spring unwinds), atomic
(e.g. depending on the half-life of a radioactive substance), electromechanical
(e.g. based on a crystal), or electronic (based on an electronic oscillator). The
notion of perfect measurement of time along a world-line is important because it
implies the universality of time measurement in the following sense. The equa-
tions determining the mechanical response of a body involve time, as do the
equations of electromagnetism and of atomic and nuclear structure. Until we
have investigated further, we are not entitled to assume that these and the times in
other physical laws are the same, or even simply related to each other. However,
to the accuracy so far measured, it turns out that the relevant time is the same for
all physical systems: we do not have to allow for different time variables in
mechanical systems, thermal systems, atomic systems, etc. Therefore, we do not
have to specify the kind of clock to be used by an observer: the universality of time
allows him to base his clock on any physical principle he chooses. Ideal clocks
constructed on the basis of any physical laws will all agree with each other.
The further point of importance to be emphasized is that a clock by itself
cannot determine a time measurement at some point away from itself (I cannot
obtain a reading from a clock remote from me unless transmitting and receiving
mechanisms are used to transfer data from it to where I am). Thus, clocks by
themselves cannot establish surfaces of instantaneity in space-time, but rather
measure time along a world-line (namely, the world-line of the clock in space-
time, Fig. 2.1). There is no implication here that the same time will be measured
from an initial to a final point along different world-lines, and indeed, in relativity
theory this is not expected to be true (cf. Fig. 1.26 and the discussion in Section
1.3). Experimental evidence shows that special relativity is correct: ideal clocks
have been flown around the world in airliners and compared with identical clocks
stationary on the ground. Their readings differ, in agreement with the prediction
of special relativity. Thus the Newtonian idea of a uniform flow of time that is the
same for all observers, is wrong.
Given any world-line, then, there is a unique time measured along that world-
line by any ideal clock moving along it. This is called proper time along that world-line.
All direct time measurements are measurements of proper time along
some world-line or other. To relate proper times measured along different world-
lines implies use of signalling devices that can transfer information between
distant observers; we shall deal with this in Section 2.3 below. Given this
understanding, there is one particular `time' that needs clarification: namely,
what is the significance of the time coordinate t specified in the standard coor-
dinates (t, X, Y, Z) used to describe space-time by an observer A (cf. Section 1.1)?
The answer is that it is the proper time measured by that observer along his own
world-line in space-time, which is the line (X = Y = Z = 0) in those coordinates
(Fig. 2.2). It does not directly indicate time measured along other arbitrary
world-lines. However, as we shall see later, it will correctly give the time measured
by any observer who is at rest in this coordinate system, i.e. who is stationary
relative to A.
Fig. 2.1 Measurement of time is based on the fact that a clock measures time t' along its
own world-line in space-time.
Fig. 2.2 The time t in the standard coordinate system of an observer A is time measured
by a clock stationary relative to him. It measures time along his world-line (the line
X = Y = 0, which is the origin of the spatial coordinates in his reference frame).
Exercise 2.1
The period of rotation of the Earth as measured by an electromagnetic crystal clock is
found to be increasing. Does this imply that (a) dynamical time (as measured by the
fundamental laws controlling the Earth's rotation) is different from electromagnetic time,
or (b) that the Earth's rotation is an imperfect clock for some reason?
2.2 Distance
In texts on elementary physics it is often stated that rulers or `rigid rods' are the
basis of measurement of distances. However, they are very imperfect measures of
distance; the length of a ruler varies with temperature, for example, and will be
different if it is held horizontally or vertically in a gravitational field (because of the
elastic response to stresses induced by gravity). Therefore, `corrections' must be
made to allow for the fact that a ruler does not in fact measure a constant distance
under all conditions. Further, it is impracticable to use a ruler (or series of rulers) to
measure accurately the distance from Rome to Venice or Dover to Calais, let alone
from the Earth to the Moon or Mars. Some more practical method must exist.
Measuring the distance of one object from another which is far from it implies
sending signals or information between these objects. The invariance of the speed
of light means that electromagnetic radiation is the best basis for standard
measuring devices in space-time. This is true in particular for the measurement of
distance. Thus, the proper basis for measuring distance in special relativity is
radar. This works as follows: to measure the distance between points P and Q, an
electromagnetic signal is emitted by a transmitter at P and reflected back to P
from Q (Fig. 2.3).
The emission time t1 and reception time t2 of the signals are measured by an
ideal clock at P. Let the difference between these times be t = t2 - t1; this is then
the light travel time for the outward and return journeys. If the distance between
P and Q is d, the distance travelled by the light is 2d. But light travels at the
invariant speed c; so t = 2d/c, and the distance measured is half the light travel
time:
d = ½ct. (2.1)
Fig. 2.3 A device to measure the distance between P and Q: a radar signal (usually a radio
wave) is sent from P at the time t1, reflected at Q, and the echo received by P at time t2. The
distance d then follows from the light travel time t2 - t1.
As an example, if the light is emitted at 12:01 and received at 12:03
then t1 = 12:01, t2 = 12:03, t = 2 minutes, and the distance is 1 light-minute
= 60 light-seconds = 60 sec × 300 000 km/sec = 18 000 000 km. By contrast,
if t = 2 µsec = 2 × 10^-6 sec, then d = 1 µsec = 300 metres. This use of radar
to measure distance, apart from being the fastest method, is in most cases the only
practical method. It is for example the basis of accurate measurement of distance
for mapping purposes by surveyors (e.g. through a device called a Tellumat, see
Fig. 2.4). It has been used to measure the distance to the Moon and to Mars with
unprecedented accuracy. It is routinely used by ships and aircraft to determine
distances to other ships and aircraft. Also, because of the problems with defining
Fig. 2.4 The Tellumat, an advanced distance measuring device based on the radar
principle. This instrument uses microwave radiation to measure distances between 20 m
and 25 km to within an accuracy of 5 mm. The distance measured appears directly as a
digital read-out on the hand-held control unit. (Photograph from Plessey plc.)
a length standard by means of a `rigid rod', the metre is now defined as the
distance light travels in a given time; thus the constancy of the speed of light (the
basis of radar) is also the basis now used to define length in a laboratory.
From now on, in this book we shall assume that radar is the practical means of
measuring distance.
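The radar relation (2.1) and the two worked examples above can be checked with a few lines of Python. This is only an illustrative sketch: the function name radar_distance_km is ours, not the text's, and the speed of light is rounded to 300 000 km/sec as in the examples.

```python
C_KM_PER_SEC = 300_000.0  # rounded speed of light, as used in the worked examples


def radar_distance_km(t_emit_sec, t_receive_sec):
    """Radar distance in km from emission and reception times (seconds) on one clock."""
    round_trip = t_receive_sec - t_emit_sec
    if round_trip < 0:
        raise ValueError("echo received before the signal was emitted")
    return C_KM_PER_SEC * round_trip / 2.0  # half the round trip, eqn (2.1)


# Emitted at 12:01 and received at 12:03: a round trip of 120 seconds.
d_two_minutes = radar_distance_km(0.0, 120.0)
# A round trip of 2 microseconds.
d_two_microsec = radar_distance_km(0.0, 2e-6)

print(d_two_minutes, "km")
print(d_two_microsec * 1000.0, "metres")
```

Only readings from the single clock at P enter the calculation, which is the point of the method.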
A space-time diagram of the use of radar to measure distance is given in
Fig. 2.5. Unless otherwise stated, we shall from now on use the coordinates
(t, X, Y, Z) introduced in the last chapter, scaled so that the speed of light is 1
(because lengths are measured in light travel times) and the light cone is at 45° to
the vertical in space-time diagrams. Then all world-lines of massive particles
must make an angle of less than 45° to the vertical in these diagrams, because they
cannot move faster than light. This convention has been used in Fig. 2.5. When
radar is used to measure distance, it is very natural to describe distances
in terms of light travel times (e.g. µsec, sec, years). To convert to ordinary
units, one just has to multiply by the speed of light. For example, 1 µsec is
(10^-6 sec) × (3 × 10^10 cm/sec) = 3 × 10^4 cm = 300 metres; 1 msec is 300 km; 1 sec
is 300 000 km. In these units, the mean distance from the Earth to the Moon
(381 550 km) is 1.27 sec; the mean distance from the Earth to the Sun
(149 600 000 km) is 8.31 minutes; the distance to the nearest star is 4.27
light-years.
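The conversions quoted above amount to multiplying or dividing by the speed of light. The following Python sketch reproduces the Earth-Moon and Earth-Sun figures; the helper names are hypothetical and the rounded value 300 000 km/sec is assumed throughout.

```python
C_KM_PER_SEC = 300_000.0  # rounded speed of light, as in the text


def light_seconds_to_km(seconds):
    """Convert a distance expressed as a light travel time (sec) to km."""
    return seconds * C_KM_PER_SEC


def km_to_light_seconds(km):
    """Convert a distance in km to a light travel time in seconds."""
    return km / C_KM_PER_SEC


moon_sec = km_to_light_seconds(381_550)            # mean Earth-Moon distance
sun_min = km_to_light_seconds(149_600_000) / 60.0  # mean Earth-Sun distance

print(round(moon_sec, 2), "sec")
print(round(sun_min, 2), "minutes")
print(light_seconds_to_km(1e-3), "km in 1 msec")
```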
We can now give a direct meaning to the standard spatial coordinates (X, Y, Z)
in an observer's space-time picture. Along the coordinate axes, they are just the
distances measured by him by radar from his world-line (X = Y = Z = 0) to
the event in question (Fig. 2.6), in units of light-travel time; for a general point,
the distance measured is d = (X^2 + Y^2 + Z^2)^(1/2). As in the case of time
measurements, one cannot assume that one can read distances measured by other
Fig. 2.6 The coordinate X in the standard coordinate system of an observer P is radar
distance measured by him from his position. Thus a series of radar signals establishes the
lines X = 0, X = 1, X = 2, etc. in space-time (X = 0 being his own world-line).
observers directly from the space-time diagram, since they are in general not
directly represented by the coordinates X, Y, Z.
An important feature of distance measurement by radar is that an observer at P
can measure the distance to Q purely by observations at his own position; he does
not have to go himself to Q, or attain any active collaboration from Q, to make
the measurement. Instead he sends light or radio waves to Q; all that is required is
that they are reflected back to P by some object at Q. This feature is what makes
radar so important in navigation and in military applications.
Finally, having defined distance in terms of radar, we can now understand the
common use of rulers to measure distances on scales of between 10^-3 metres and
10^-2 metres as being due to their being reasonably good approximations to `rigid
rods' (rods of constant length) in many circumstances. If any conflict were ever to
arise between ruler and radar measurements of distance, we would reject the ruler
result in favour of that determined by radar.
Exercises
2.2 Find the light travel time between the following locations: (i) your feet and your
eyes; (ii) Cambridge and London (90 km apart); (iii) the Earth and the planet Pluto (mean
distance 5900 million km). Calculate the distance in kilometres to astronomical objects
which are (1) one light-hour away, (2) one light-day away, (3) one light-year away.
2.3 A fighter aircraft sends out a signal that is reflected from a bomber aircraft; the
echo signal is received by the fighter after an elapse of 20 µsec. One second after sending the
first signal the fighter sends another signal; the echo signal is received after 15 µsec. Deduce
the distance measured by the fighter to the bomber on each occasion, and hence find the
relative speed of approach of the two aircraft.
2.3 Simultaneity
In order to synchronize a clock at a distant point Q with a clock at P, one has to
send information to Q about the state of the clock at P (or vice versa). An initial
suggestion might be that one should send an ideal clock C from P to Q,
after synchronizing C with P's clock; this will then enable synchronization of Q's
clock with C, and so with P (Fig. 2.7a). However, this will not work. This is
because, as we have already seen, the result obtained will depend on the path
through space-time taken by C from P to Q (Fig. 2.7b), that is, on the speed with
which C is moved from P to Q. Thus one cannot set up a consistent synchroni-
zation system this way that will give the same answer no matter how the clock C is
moved from P to Q (in mathematical terms, proper time is not an integrable
variable).
As in the case of distance measurement, one must turn to the use of electro-
magnetic signals ('light') to convey adequately the information needed for syn-
chronization from P to Q. In fact, determining which events are simultaneous
with particular events in the history of an inertial observer P is again best achieved
by radar.
Fig. 2.7 (a) A conceivable process for synchronizing distant clocks at P and Q by
transporting a third clock C between them, and a space-time diagram of this process. (b)
This procedure will not work, because the result is ambiguous: another clock C',
synchronized with C at P, will in general disagree with C on arrival at Q after travelling
from P to Q. Thus the result of such a synchronization process is arbitrary.
Fig. 2.8 The synchronization of clocks at P and Q using a radar signal. Because light
takes the same time to travel out and back, the reflection event r at Q must be simultaneous
with the event q at P half-way between emission and reception of the signal.
Fig. 2.9 Police car 1 is stationary relative to police stations A and B, but car 2 is
approaching B. Signals sent out simultaneously (as measured by car 1) from A and B at
events a and b will be received at the same time by car 1 at event p, but car 2 will receive the
signal from station B first (at event q) and the signal from station A second (at event r).
Thus car 2 will detect the emission event b before the emission event a.
Fig. 2.10 The surface of events in space-time simultaneous for the observer P (stationary
in the chosen coordinate system) with the event q at the origin of coordinates. P has to use a
whole series of radar signals (e.g. those shown establishing simultaneity of r and r' with q)
to determine this surface.
*e.g. by a radio or X-ray telescope (see The New Astronomy by N. Henbest and M. Marten,
Cambridge University Press, 1983), or by the human eye.
2.4 World maps, world pictures, and radar maps 45
Fig. 2.11 (a) A world map depicts the position of each object in the surface of
simultaneity of some event t = t0 on the observer's world-line. (b) A world picture depicts
the position of each object in the past light cone of some event t = t0 on the observer's
world-line (e.g. when a photograph was taken). (c) A radar map depicts the position of
each object in the future light cone of an event t = t0 on the observer's world-line (when a
radar pulse was emitted). (d) When ordinary units are used to describe everyday
occurrences, the light cones are extremely flat and so the three views are very similar,
because the spatial position of an object cannot change much between the events r and s
where its world-line intersects these light cones (except if the object viewed is moving at
close to the speed of light).
Exercises
2.4 Explain what practical problems will occur in using radar over very long distances,
and estimate the maximum distance over which radar is a practical distance-measuring
device.
2.5 Taking into account special relativity principles and the limiting nature of the
speed of light, see if you can propose some other method of determining simultaneity at a
distance. If you do so, convince yourself whether it is essentially equivalent to the radar
definition, or not.
2.6 Two volcanoes 100 km apart on Io (a satellite of Jupiter) are seen by an observer A
at rest on Io to erupt simultaneously. Observer B is the pilot of a rocket which according to
A is 10 km directly above the first volcano when it explodes, flying towards the second at a
speed of c. What will B see as happening at the second volcano at the moment when he
sees the first explode?
2.7 According to a nuclear treaty between two superpowers, if either strikes first the
second is entitled to destroy the first completely. The superpowers deploy two ships A and
B which move at a very high speed towards each other. Ship A sends off radar signals at
one-second intervals which are reflected back by B. At t = 0 in its coordinates, A fires a
weapon at B. At t = 4, A receives back the signal sent at t = -6, which detects B firing at A.
What can A conclude about who fired first? [In Chapter 3 we will consider if B would reach
the same conclusion.]
2.8 Ask various friends what time interval appropriately corresponds to various distance
measures: e.g. 1 cm; 1 metre; 1 kilometre. [In principle it is not possible to make such
Fig. 2.12 Radar used to control the movements of aircraft. (top) The radar antenna.
Pulses are transmitted and received by the unit at the focus of the curved antenna, which
rotates to cover all directions around the airfield. (bottom) The display (a `radar map'),
directly showing the spatial positions of aircraft relative to the airfield. (Photograph from
Plessey plc.)
a comparison, but in practice most people are able to make a reasonable correspondence
on the basis of their experience in daily life, e.g. using the speed of walking or driving to set
the relative scales.] Try to draw past and future light cones in space-time using `natural
units' (e.g. minutes and metres). Observe from this how the light cones closely define a
`surface of simultaneity' in everyday life.
Computer Exercise 3
Write a program that accepts as input from a radar set trained on a UFO, (a) the time T1 at
which a radar pulse is transmitted towards the UFO, (b) the time T2 at which an echo is
received from it; and gives as output, (i) the distance D measured to the UFO, (ii) the time
TR at which the radar pulse was reflected by it.
Suppose the radar set sends out a regular train of pulses a time T apart. What
condition should T satisfy to avoid confusion between different echo pulses? Modify your
program to print out also the relative speed of approach of the UFO as determined by the
echo pulses received from it. Ensure your program prints out a special warning message if
the speed determination for the UFO appears to violate a special relativity condition.
What might be an appropriate phrasing of this warning message?
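One possible starting point for this exercise, in Python, is sketched below. It is not the only way to structure the program: the function names, the rounded value of c, and the wording of the warning are all our own assumptions. The speed of approach is taken as the change in radar distance divided by the change in reflection time, and the sample input uses the figures of Exercise 2.3. The sketch also assumes each echo is matched to the correct pulse, which is only safe if the pulse spacing T exceeds the round-trip time.

```python
C_KM_PER_SEC = 300_000.0  # rounded speed of light, as used in the text


def radar_fix(t1, t2):
    """Return (D, TR): radar distance in km and reflection time in seconds.

    t1 is the transmission time and t2 the echo reception time, both read
    from the radar set's own clock.
    """
    if t2 < t1:
        raise ValueError("echo received before the pulse was transmitted")
    distance = C_KM_PER_SEC * (t2 - t1) / 2.0  # half the round trip, eqn (2.1)
    t_reflect = (t1 + t2) / 2.0                # midway between emission and reception
    return distance, t_reflect


def approach_speed(first_pulse, second_pulse):
    """Speed of approach in km/sec from two (t1, t2) pairs; negative if receding."""
    d_first, tr_first = radar_fix(*first_pulse)
    d_second, tr_second = radar_fix(*second_pulse)
    if tr_second == tr_first:
        raise ValueError("the two reflection times coincide")
    return (d_first - d_second) / (tr_second - tr_first)


def report(first_pulse, second_pulse):
    """Build a one-line report, adding a warning if the speed is not below c."""
    distance, t_reflect = radar_fix(*second_pulse)
    speed = approach_speed(first_pulse, second_pulse)
    line = f"D = {distance:.3f} km, TR = {t_reflect:.7f} sec, speed = {speed:.3f} km/sec"
    if abs(speed) >= C_KM_PER_SEC:
        # One possible phrasing of the warning the exercise asks for.
        line += " WARNING: apparent speed is not below the speed of light; echoes may be mismatched"
    return line


# Exercise 2.3: echoes after 20 µsec and 15 µsec, pulses sent 1 sec apart.
print(report((0.0, 20e-6), (1.0, 1.0 + 15e-6)))
```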
Conclusion
We have now determined methods for measuring the fundamental quantities
(time, distance, simultaneity) needed as a basis for all other kinematic mea-
surements, and have done this taking the limiting nature of the speed of light into
account. It is important to realize that (in view of the principle of relativity) every
observer is equivalent and so all will use the same method to determine time, to
measure distance, and to determine simultaneity, as outlined above. In the next
chapter, we will determine the consequences of these methods of measurement.
3
Fig. 3.1 Two radio signals sent out 1 year apart by space station A, as seen in A's
coordinates (the time t = 0 is chosen to be midday on 13 March 2010). The first signal is
received by astronaut B at the event a, whose coordinates are t = 0.5, X = 0.5. The second
is received by B at the event b, whose coordinates are t = 1.75, X = 0.75. Thus according to
A, the time interval between B's reception of these signals is 1.25 years. We are unable to
determine directly from this diagram the time interval B measures between these events.
Fig. 3.2 Light signals sent at an interval T by observer A, as measured by his clock, to
observer B moving relative to A. The signals are received by B at an interval T' as measured
by his clock; K is defined by the relation T' = KT.
of those happenings as measured at the space station. This is the effect we now
investigate.
Consider two inertial observers A and B in relative motion. A emits a light
signal, waits a time interval T as measured by his clock, and then sends a second
signal. B measures the time interval between reception of these signals to be T'
(Fig. 3.2). A quantity K is then defined as the ratio of these proper times:
K = T'/T. (3.1)
We shall see below that, when the speed of relative motion is non-zero, the time
intervals are different, i.e. K is unequal to 1. (The formulae relating K to the
relative velocity of the observers are (3.9) and (3.10) below.)
In principle, one can easily measure K directly from definition (3.1). For
example, if A's `vehicle' (be it a spacecraft, aircraft, the earth, or whatever) has
attached to it a radio beacon that emits signals at known regular intervals (say
every minute), B merely has to receive these signals and measure the time interval
between them to determine K. Thus, if B measures the time interval between
reception of the signals to be one and a half minutes, then T = 1 minute and
T' = 1.5 minutes, so K = 1.5/1 = 1.5. More hypothetically, suppose A and B
each possess identical accurate clocks, and B has a very powerful telescope
through which he can observe A's clock. He then merely has to watch A's clock
through the telescope, and compare the time it registers with that registered by his
own clock (e.g. noting the time interval T' elapsing according to his own clock
every time A's clock registers that an hour has passed; then K follows from (3.1)
with T = 1 hour). This is nothing other than the `thought experiment' mentioned
in Section 1.3, where an observer in the tram watched the clock tower in Berne.
That thought experiment already tells us that we expect K to become unboundedly
large as the relative velocity of the observers approaches the speed of light.
Redshift
Often the easiest practical way to measure the quantity K is by measuring the
observed wavelength of light, radio waves, or other electromagnetic radiation
emitted by the source, provided the intrinsic wavelength of this radiation is
known. This is the basis of the redshift measurements that are our major tool in
investigating the expansion of the universe.
Suppose that A emits electromagnetic radiation at wavelength λ_E. Then*
the period Δτ_E of this radiation (the time for one full oscillation, cf. Fig. 3.3a) is
given by λ_E = cΔτ_E. By eqn (3.1), the period of the radiation received by B is
measured by him to be Δτ_O = KΔτ_E (Fig. 3.3b). The wavelength λ_O that B
observes for the light is related to its period by the relation λ_O = cΔτ_O. Therefore
the wavelength of the received radiation is related to the wavelength of the
emitted radiation by
λ_O = Kλ_E. (3.2)
This change in wavelength is easy to measure directly from the spectrum of received
light. One identifies in the observed spectrum a line of known wavelength at the
source (e.g. the `alpha line' of wavelength 1215 angstroms in the spectrum of
hydrogen), measures its received wavelength, and so determines K from eqn (3.2).
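It is common to express the result of such measurements in terms of the redshift z,
the fractional change in wavelength between emission (wavelength λ_E) and
observation (wavelength λ_O):
z = (λ_O - λ_E)/λ_E, (3.3a)
so that, by eqn (3.2),
1 + z = K. (3.3b)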
*You can omit the details of the following derivation if you are prepared to accept eqn. (3.3b) as
correct.
52 Measurements in flat space-times
Fig. 3.3 (a) The amplitude of an electric field plotted against time, showing the period
Δτ_E (the time for one full oscillation). (b) An observer B measures a period Δτ_O for a signal
emitted by observer A with period Δτ_E.
Redshifts for distant galaxies are routinely measured by astronomers from their
spectra, and used to determine their speed of recession (Fig. 3.4; we will cover the
relation of redshift to velocity in Sections 3.2 and 4.3). The name `redshift' is used
because light in distant receding galaxies is observed to be displaced towards the
red end of the spectrum. This is because if z > 0, then K > 1 and the received
wavelength is longer than the emitted wavelength. The colour of light is directly
determined by its wavelength as follows: in units of 10^-5 cm, the wavelength of
red light is between 7.5 and 6.3, orange 6.3 to 5.9, yellow 5.9 to 5.3, green 5.3 to
4.9, blue 4.9 to 4.5, indigo 4.5 to 4.3, and violet 4.2 to 3.9, while infra-red is above
7.5 and ultraviolet is below 3.9. Thus, light emitted as blue may be seen as green,
that emitted as green may be seen as yellow, and so on, cf. Fig. 3.5a; so the light is
displaced towards the red end of the spectrum, as claimed. On the other hand if
-1 < z < 0, then 0 < K < 1, the received wavelength is shorter than the emitted
wavelength, and the light is blueshifted (light emitted as yellow may be seen as
green, light emitted as green may be seen as blue, and so on; see Fig. 3.5b).
3.1 The Doppler effect 53
Fig. 3.4 The relation between distance and redshift for distant galaxies. In each case the
galaxy spectrum is presented between reference laboratory spectra; redshifts are measured
directly from the frequency shift in the K and H lines of calcium between the spectra,
indicated here by arrows. The redshifts are then expressed as velocities (by use of the
Doppler shift formula). The distances of the galaxies are estimated from their apparent
luminosities, and expressed in light-years (one light-year is about 6 × 10^12 miles). The
relation between redshift and distance seen here is usually taken as evidence for the
expansion of the universe (see Chapter 7). (Photograph from the Hale Observatories.)
The effect, of course, applies to all electromagnetic radiation. If, for example,
A is broadcasting by radio and B is moving relative to A, then B will have to
retune his radio in order to receive the transmission if K is significantly different
from 1. As an example, suppose a transmitter sends out a signal at a frequency of
2 kHz. Frequency ν is related to wavelength λ by the relation c = νλ, so eqn (3.2)
shows
ν_O = ν_E/K. (3.4)
Fig. 3.5 (a) Redshift: the observed wavelength of light of different colours (red, orange,
yellow, green, blue, indigo, violet) is longer than that of the emitted wavelength, so the
colours appear to be shifted towards the red end of the spectrum. (b) Blueshift: the
observed wavelength of light is shorter than that of the emitted wavelength.
Say for definiteness that K = 2; then B will receive the signals at 1 kHz. Clearly, K
can be measured directly from the amount of retuning required. Because the
effect is essentially the same as that occurring in the Doppler shift of sound waves
(when sounds from a moving source are heard at a different frequency from their
emitted frequency), the parameter K can appropriately be called the Doppler shift
factor.
Uniformity of K
The first basic assumption we shall make about K (defined by eqn (3.1)) is that
when A and B are inertial observers, K is independent of T and constant in time.
Firstly, K is assumed independent of T; thus for example the same Doppler factor
will be measured whether the signals are emitted one second or one hour apart.
This implies that the spectral shift observed for a single source (eqns (3.2-4)) is the
same for all wavelengths. This is the hallmark of the effect: the same redshift must
be observed for all spectral lines in an observed spectrum. If the value measured
for z from light from a single source varies depending on which line is measured,
the change of wavelength is not due to the simple Doppler shift effect; some other
explanation must be found. Secondly, K is assumed constant in time if both A and
B are moving inertially; thus the value obtained for K will be the same at 1 o'clock
and 4 o'clock if the relative speed is constant. One can invert this: suppose that the
source A is moving inertially in the flat space-time of special relativity. Then one
can test whether B is moving inertially or not (that is, whether the sources are in
relative uniform motion) by seeing if K is constant in time. (Note that the results
mentioned here are true in the special theory of relativity; they do not always hold
true in the curved space-times of the general theory of relativity, as we shall see in
Chapter 5.)
An illustration of this result is as follows: suppose that an observer B moves
uniformly relative to observer A, and observes a K-factor of 2; B passes A at the
event O, and A sends signals to B at the event O and then at 1-second intervals for
10 seconds (see Fig. 3.6a). Then B will receive these signals regularly at 2-second
intervals; hence the whole period T1 of transmission recorded by A (10 seconds) is
related to the whole period T2 of reception of the signals by B (20 seconds) by the
relation T2 = 2T1, that is, T2 = KT1 (see Fig. 3.6b).
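The arithmetic of this illustration can be laid out explicitly. The short Python sketch below assumes, for convenience, that B also sets his clock to zero at the event where he passes A; the variable names are ours.

```python
K = 2.0  # Doppler factor between A and B in the illustration

# Times on A's clock at which signals are sent: at the passing event,
# then every second for 10 seconds.
emission_times = [float(n) for n in range(11)]

# Assumption: B's clock also reads zero at the passing event, so a signal
# sent at A's time t is received at B's time K * t.
reception_times = [K * t for t in emission_times]

intervals = [later - earlier
             for earlier, later in zip(reception_times, reception_times[1:])]
t1_total = emission_times[-1] - emission_times[0]    # transmission period on A's clock
t2_total = reception_times[-1] - reception_times[0]  # reception period on B's clock

print(intervals)
print(t1_total, t2_total)
```

Every reception interval comes out as 2 seconds, and the two totals are in the ratio K.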
Fig. 3.6 (a) An observer A sends regular signals for 10 seconds, which are received by
observer B during a period of 20 seconds because the K-factor is 2. (b) In general in this
situation, T2 = KT1.
Fig. 3.7 Relative motion at speed v for observers A and B, seen (a) in A's rest frame (A is
at rest and B moves to the right at speed v), and (b) in B's rest frame (B is at rest and A
moves to the left at speed v).
Reciprocity of K
The second basic assumption about K is a consequence of the principle of rela-
tivity. Suppose that as well as A sending signals to B, the observer B sends signals
to A. Then there is no intrinsic difference between the two situations: in each case
the source merely sends signals to the observer, who is in motion relative to the
source (see Fig. 3.7). In special relativity the factor K is simply a result of relative
motion in flat space-time. Since this space-time is isotropic (i.e. the same in all
directions), light propagation is the same in all directions. Because of the
equivalence of all inertial observers, the two K-factors measured must be the same:
K_AB = K_BA, (3.5)
where K_AB is the K-factor for light emitted from A and received at B, and K_BA is
the K-factor for light emitted from B and received at A. If this were not so, there
would be some intrinsic difference between light propagation from A to B and
from B to A, contrary to the relativity assumption; this intrinsic difference would
enable us to measure absolute motion. Thus the Doppler shift effect is completely
reciprocal: whatever relative time change is detected by B in observations of A, is
also detected by A in observations of B. If A measures a factor-2 increase in the
wavelengths of all light received from B, then B will also measure a factor-2
increase in the wavelengths of all light received from A. So A will have to retune
his receiver by a factor 2 to receive signals from B, and B will also have to retune
his receiver by a factor 2 to receive signals from A. The observer B will see A's
clock running slow by a factor of 2, and A will observe B's clock to be running
slow by a factor of 2. This symmetry allows us to omit the subscript `AB' from
K_AB when the context makes it clear which observers are concerned (see Fig. 3.8).
Measuring K by radar
A useful feature results from the symmetry relation (3.5): suppose A sends out
two pulses separated by a time interval T, which are reflected by B and received
again by A with a time separation T" (Fig. 3.9). By the definition of K, the time
between these pulses measured by B will be T' = KT, and then T" = KT' = K^2 T.
Thus, A merely has to observe the ratio T"/T to determine K from the relation
K = √(T"/T). (3.6)
The significance of this derives from the fact that to use relations (3.1-4) to
determine K, the observer A has to receive radiation emitted by B where this
radiation has to be of a known wavelength (or frequency). Thus, either the signal
Fig. 3.8 Signals sent by B at an interval T' (as measured by his clock) and received by A at
an interval T" (as measured by his clock). By the relativity principle, T" = KT'.
Fig. 3.9 Signals sent by A at an interval T, reflected by B at an interval T', and received
by A at an interval T".
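Relation (3.6) lets A find K from readings of his own clock alone. A minimal Python sketch follows; the function name and the sample readings (pulses sent 1 second apart, echoes returning 4 seconds apart) are hypothetical.

```python
import math


def k_from_radar(sent_interval, echo_interval):
    """Doppler factor K from the spacing of sent pulses and of their echoes, eqn (3.6)."""
    if sent_interval <= 0 or echo_interval <= 0:
        raise ValueError("both intervals must be positive")
    return math.sqrt(echo_interval / sent_interval)


# Hypothetical readings: pulses sent 1 second apart return 4 seconds apart.
k = k_from_radar(1.0, 4.0)
reflected_interval = k * 1.0  # T' = KT, the spacing measured by B at reflection

print(k, reflected_interval)
```

No knowledge of the wavelength or frequency of anything emitted by B is needed, only that B reflects the pulses.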
Summary
The discussion we have given shows that when K > 1 (which will be the case when
A and B are moving apart, as we shall see in the next section) the factor K gives the
relative time increase observed by B in all phenomena at A, and observed by A in
all phenomena at B. The fact that we commonly refer to this effect in terms of the
redshifting of light is just because this happens to be easy to observe. The time-
shift observed for all other effects will be the same. For example, suppose we
observe the radiation received from a quasi-stellar object at great distance to have
a redshift z = 3, and to vary in brightness on a time-scale of 8 hours. Then (since
K = z + 1 = 4) in fact these variations must have taken place on a time-scale of
2 hours at the source.
Exercises
3.1 A space-traveller moving away from the Earth at speed such that K = 2 tunes into
a television show transmitted from the Earth. In what way will the K-factor affect the
display he obtains and the way he receives it?
3.2 In order to perform a complicated docking manoeuvre, it is essential that two
spacecraft can be held at rest relative to each other. Devise a simple experiment to check
that this is so.
Computer Exercise 4
Write a programme that will accept as input (a) either a value for K or a value of z due to
relative motion of two observers, and (b) a time period T, a wavelength L, or a frequency F
measured by one of them; and give as output the corresponding time period T', wavelength
L' or frequency F' (as appropriate) measured by the other (given by eqns (3.1-4)). Now
modify your programme to accept as input a letter representing the colour of emitted light
(e.g. `B' for `blue') and to print out the colour of this light as seen by the relatively moving
observer. [Note that for high values of z, some light will be shifted out of the visual range
and some radiation into this range.]
If your computer has colour graphics, apply this change to any colour image you have
available to see visually the effect of redshifting (K > 1) or blueshifting (K < 1) an image.
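A possible skeleton for the first two parts of this exercise is sketched below in Python. It is a sketch under stated assumptions rather than a model answer: the function names are ours, the relation K = 1 + z is the one used in the Summary above, and the colour bands are those quoted in Section 3.1 (in units of 10^-5 cm), except that the upper edge of violet is set to 4.3 so that the bands leave no gap. Each colour letter is represented by the mid-point of its band.

```python
# Wavelength bands in units of 10^-5 cm, as quoted in the text.
BANDS = [
    ("red", 6.3, 7.5),
    ("orange", 5.9, 6.3),
    ("yellow", 5.3, 5.9),
    ("green", 4.9, 5.3),
    ("blue", 4.5, 4.9),
    ("indigo", 4.3, 4.5),
    ("violet", 3.9, 4.3),  # assumption: upper edge set to 4.3 so the bands leave no gap
]
LETTERS = {"R": "red", "O": "orange", "Y": "yellow", "G": "green",
           "B": "blue", "I": "indigo", "V": "violet"}


def k_from_z(z):
    """Doppler factor from redshift, K = 1 + z."""
    if z <= -1:
        raise ValueError("z must be greater than -1 so that K is positive")
    return 1.0 + z


def observed_period(k, period):
    return k * period        # T' = KT, eqn (3.1)


def observed_wavelength(k, wavelength):
    return k * wavelength    # eqn (3.2)


def observed_frequency(k, frequency):
    return frequency / k     # eqn (3.4)


def colour_of(wavelength):
    """Name the band containing a wavelength given in units of 10^-5 cm."""
    if wavelength > 7.5:
        return "infra-red"
    if wavelength < 3.9:
        return "ultraviolet"
    for name, lower, upper in BANDS:
        if lower <= wavelength <= upper:
            return name
    raise ValueError("wavelength not covered by the band table")


def shifted_colour(letter, k):
    """Colour seen by the other observer for light emitted at the middle of a band."""
    name = LETTERS[letter.upper()]
    lower, upper = next((lo, hi) for n, lo, hi in BANDS if n == name)
    return colour_of(observed_wavelength(k, (lower + upper) / 2.0))


print(observed_frequency(2.0, 2.0), "kHz")  # the 2 kHz transmitter with K = 2
print(shifted_colour("B", 1.1))             # blue light with a mild redshift
print(shifted_colour("G", 0.9))             # green light with a mild blueshift
```

A fuller program would shift a whole spectrum rather than one representative wavelength per colour, since a band can straddle two colours after shifting.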
(dE/dT)_O = (dE/dT)_E / (1 + z)^2. (3.8)
This determines the effect of motion on flux of radiation received from distant
objects (see eqns (4.35) and (7.11) below, and Fig. 7.13, for the cosmological
application). They look fainter if they are receding from us and brighter if they are
approaching.
Now we live in a universe bathed in cosmic background radiation ('CBR'),
the relic radiation from the `Hot Big Bang'; that is, black body radiation at a
temperature of 2.75 K (see pp. 272-4 below). This radiation is isotropic (i.e. is
measured to be the same in all directions) for any observer at rest relative to the
matter that emitted that radiation, that is, an observer moving at the average velocity
of all the matter in the universe. The implication of the above relation is that we
can detect any motion of our own Galaxy or Sun relative to the universe by
measuring a dipole anisotropy in this radiation: a higher temperature in one
part of the sky (the direction towards which we are moving) and a lower temperature
in the opposite part of the sky (the direction away from which we are moving). Actually
the effect is even stronger: the instruments we use measure intensity of radiation
(that is, flux of radiation received in a unit solid angle from many sources of
radiation), rather than flux from a single source; this brings in two more factors of
redshift (see (4.36) and (7.11,12) below), enhancing the dipole anisotropy effect
predicted.
3.2 Relative velocity 59
Fig. 3.10 The cosmic background radiation temperature anisotropy as measured in all
directions in the sky (the oval shape represents the entire sky). The section surrounded by
the lightest region is hotter by one part in a thousand than the dark section (which is the
opposite direction in the sky). This is caused by our motion relative to the rest frame of the
universe. The radiation was emitted at a redshift of about 1100.
Fig. 3.11 (a) The residual anisotropy once the dipole has been removed. Apart from the
major lane across the sky due to sources in our own galaxy, the anisotropy is only one part
in a hundred thousand. (b) The remnant anisotropy once the galaxy signal has been
subtracted. The primordial fluctuations detected represent inhomogeneities at the surface
of last scattering of the Cosmic Background Radiation. They provide the seeds for growth
of large-scale structures at much later times, such as the clusters of galaxies we see at the
present time, and the matter in them is the most distant matter we can detect by any form
of electromagnetic radiation (they form our visual horizon). (Images 3.10 and 3.11
reproduced by permission of the NASA Goddard Space Flight Center and the COBE
Science Working Group.)
measured by both their clocks; we can regard them as signalling to each other by
radio at that time (the distance is zero, so communication is instantaneous).
Suppose that a radio pulse is then emitted by A at a time T as measured by his
clock, which is reflected by B at a time T' as measured by B's clock, and received
again by A at a time T" measured by A's clock (Fig. 3.12). Remembering the
Fig. 3.12 Observer A emits a radio signal at time T, and observer B receives it at time T' at
event p. It is reflected back to A who receives it at time T". A measures the event q at time
$\tfrac{1}{2}(T + T'')$ to be simultaneous with p.
definition (3.1) of K and the reciprocity relation (3.5), we find (cf. Fig. 3.6 and the
derivation of eqn (3.6)) that
$$T' = KT, \qquad T'' = KT' = K^2 T.$$
According to A, the travel time for the radio pulse is therefore
$$T'' - T = K^2 T - T = (K^2 - 1)T.$$
By eqn (2.1) the radar distance measured by A between B and A is thus
$$D = \tfrac{1}{2} c (K^2 - 1) T. \tag{3.9}$$
$$V(K^2 + 1) = (K^2 - 1) \;\Longrightarrow\; K^2 (V - 1) = -(V + 1),$$
so
$$K^2 = -(V + 1)/(V - 1) = (1 + V)/(1 - V).$$
On taking the square root of this relation, the sign ambiguity is resolved because
K must always be positive (if B observes A's clock through a telescope, he will not
see it run backwards!). Thus the Doppler shift factor K resulting from a relative
radial velocity v is found to be
$$K = \sqrt{\frac{1 + V}{1 - V}}. \tag{3.12a}$$
For example,
if $v = \tfrac{1}{2}c$ then $K = \sqrt{3} = 1.732$;
if $v = \tfrac{3}{4}c$ then $K = \sqrt{7} = 2.646$;
if $v = \tfrac{9}{10}c$ then $K = \sqrt{19} = 4.359$;
if $v = \tfrac{99}{100}c$ then $K = \sqrt{199} = 14.107$.
Thus, as expected, high relative speeds cause large K-factors, and so large ratios
between times measured by two observers.
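The tabulated values can be checked numerically. The following is a minimal Python sketch of eqn (3.12a); the function name `k_factor` is our own choice for illustration, not notation used elsewhere in the text.

```python
import math

def k_factor(V):
    """Doppler shift factor K for a radial velocity V = v/c (eqn (3.12a)).

    V > 0 means recession, V < 0 means approach.
    """
    if not -1.0 < V < 1.0:
        raise ValueError("the relative speed must be less than the speed of light")
    return math.sqrt((1.0 + V) / (1.0 - V))

# Speeds of 1/2, 3/4, 9/10 and 99/100 of the speed of light.
for V in (0.5, 0.75, 0.9, 0.99):
    print(f"v = {V:4.2f} c   K = {k_factor(V):.3f}")
```

The printed K-factors agree with the values listed above to three decimal places.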
that is, receding observers each measure the other to be receding at the same
speed, and approaching observers each measure the other to be approaching at
the same speed. This result is in fact just a consequence of Einstein's relativity
principle, that physics should be the same for both inertial observers, since this
leads to the expressions (3.9-12) which treat both observers on exactly the same
footing. If this were untrue (e.g. if you measure me to be receding at 500 km/sec,
but I measure you to be receding at 250 km/sec) relative velocities would be very
difficult indeed to deal with. As in the case of K, we will omit the subscript `AB'
from vAB whenever no confusion results.
Fig. 3.13 A situation similar to that depicted in Fig. 3.12 but with the observers
approaching each other rather than receding. A sends a signal at time T" before the
observers meet and receives it back a time T before they meet, after B has reflected it at T'.
64 Measurements in flat space-times
Suppose V = 0; then (3.12a) shows K = 1. Similarly if K = 1, then (3.11)
shows V = 0. Thus the relations we have derived show
$$K = 1 \iff v = 0 \iff V = 0,$$
i.e. there is no Doppler shift effect if and only if the relative velocity is zero.
Considering now the relation of v to K and z implied by (3.3, 11, 12), we find that
K > 1 (a relative slowing down of time is observed) when observers recede from
each other, and K < 1 (a relative speeding up of time is observed) when they
approach each other:
Relative approach     -1 < V < 0    0 < K < 1    -1 < z < 0    (light blueshifted)
No relative motion    V = 0         K = 1        z = 0         (no Doppler effect)
Relative recession    0 < V < 1     1 < K        z > 0         (light redshifted)
Basically, this follows because when receding, each observes the other to be
positioned at steadily increasing distances and so light travelling either way has to
travel larger and larger distances; so we then expect the time intervals observed at
the receiver to be longer than the time intervals at the emitter, i.e. K > 1 (cf.
Fig. 3.1). Similarly, when approaching, light travelling either way will travel
shorter and shorter distances so the observed time intervals at the receiver will be
shorter than those at the emitter, i.e. K < 1.
Figure 3.14 shows the relation between v/c and K; one can read off the relation
either way from this graph (e.g. one can find the K-value corresponding to any
v/c, or the v/c value corresponding to any K). It is clear from this graph (and
follows from eqns (3.11, 12)) that as the relative speed of motion approaches the
speed of light, the relative time-change observed increases without limit. In the
case of relative approach,
$$v/c \to -1 \;\Longrightarrow\; K \to 0,$$
Exercise 3.3
(i) What relative radial velocity V corresponds to a K-factor of 3? Determine the
corresponding velocity v = cV in km/sec.
(ii) What relative radial velocity V corresponds to a K-factor of 3? Determine the
corresponding velocity v = cV in km/sec.
(iii) If A recedes radially from B at a speed v = 3 c, what is the K-factor observed by A?
What is the K-factor observed by B?
(iv) If A approaches B radially at a speed v = c, what is the K-factor observed by A?
What is the K-factor observed by B? 3
where VAB is the speed of B relative to A and VAC is the speed of C relative to A
measured as a fraction of the speed of light (we are using the sign conventions just
introduced). Therefore (3.12a) shows that
We have just proved that (3.13) implies (3.14). Similarly, one can show from eqn
(3.11) that (3.14) implies (3.13); that is, two K-factors are reciprocal to each other
if and only if the corresponding relative velocities are the same in magnitude but
opposite in sign (one corresponding to approach and the other to recession).
This is precisely the situation that will occur during a 'fly-by' (see Fig. 3.15a).
For example, suppose that B flies past A at a constant speed of $\tfrac{3}{5}c$. While B is
approaching A, we have $v_{AB}/c = -\tfrac{3}{5}$ and $K = \tfrac{1}{2}$. After B has reached A and is
receding, $v_{AB}/c = +\tfrac{3}{5}$ and $K = 2$. As B passes A, the K-factor suddenly changes
to its reciprocal (in this case, from $\tfrac{1}{2}$ to 2). There are good physical reasons for this
change: initially A points his receiving antenna to the left (B is approaching from
that side). As B passes, A has to swing the antenna round to receive signals from B,
Fig. 3.15 A 'fly-by'. (a) A watches B approaching from the left and then receding to the
right. (b) A space-time situation, showing the light rays by which A observes B when he is
approaching and receding.
which now come from the right. A then receives signals from B on a different
family of light rays than the family of light rays on which the signals were initially
travelling (Fig. 3.15b). As a consequence, A will also have to retune his receiver as
B passes; e.g. if B transmits radio signals at a wavelength of 1 metre, A will receive
the signals at a wavelength of 0.5 metres while B is approaching but at 2 metres
while B is receding. This is closely analogous to the corresponding effect in the
case of sound waves: as a train or car passes a stationary observer while emitting a
warning note, the tone heard drops from a high pitch to a low pitch. The Doppler
shift factor again changes discontinuously as approach changes to recession.
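The retuning described here can be checked with a short calculation. The sketch below is illustrative only: the helper names are ours, and the fly-by speed of 0.6 c is assumed because it is the speed for which eqn (3.12a) gives K = 2 on recession.

```python
import math

def k_factor(V):
    """K-factor for a radial velocity V = v/c (recession positive), eqn (3.12a)."""
    return math.sqrt((1.0 + V) / (1.0 - V))

def received_wavelength(emitted_wavelength, V):
    """Wavelength received from a source with radial velocity V = v/c.

    Received time intervals, and hence wavelengths, are K times the emitted ones.
    """
    return k_factor(V) * emitted_wavelength

EMITTED = 1.0  # metres, as in the example
SPEED = 0.6    # assumed fly-by speed as a fraction of c (gives K = 2 on recession)

approaching = received_wavelength(EMITTED, -SPEED)
receding = received_wavelength(EMITTED, +SPEED)
print(approaching, receding)

# Reciprocal K-factors for approach and recession multiply to 1.
print(k_factor(-SPEED) * k_factor(+SPEED))
```

The two wavelengths come out at about 0.5 metres and 2 metres, matching the values quoted for approach and recession.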
Exercise 3.4
Show that (3.14) implies (3.13), that is, reciprocal K-factors imply that the measured
radial speeds of approach and recession are the same.
Fig. 3.16 Observers A, B, C in relative motion, all moving in the same direction.
place in the x direction and their world-lines in a space-time diagram lie in the
(t, X) plane. Figure 3.17 is such a diagram drawn from the viewpoint of A. We can
immediately read off the relative velocities vAB and VAC from this diagram,
because the axes are marked off according to the measurements made by A; but
we cannot read off vBC, because it is not apparent from this diagram how the time
and space measurements made by B or C relate to those made by A.
Fig. 3.17 The world-lines of the observers A, B, C, seen from A's reference frame.
Fig. 3.18 A emits signals separated by a time interval T; they are received by B separated
by T', and by C separated by T".
the composition law for Doppler shift factors K. Squaring relation (3.16) to
obtain $K_{AC}^2 = K_{AB}^2 K_{BC}^2$ and using formula (3.12a), we obtain
$$\frac{1 + V_{AC}}{1 - V_{AC}} = \frac{(1 + V_{AB})(1 + V_{BC})}{(1 - V_{AB})(1 - V_{BC})},$$
which may be solved for $V_{AC}$ as follows: multiply through by the product of the
denominators to obtain
$$(1 + V_{AC})(1 - V_{AB})(1 - V_{BC}) = (1 + V_{AB})(1 + V_{BC})(1 - V_{AC}).$$
Now multiply out, cancel terms, and collect terms in $V_{AC}$ to give
$$V_{AC}(1 + V_{AB}V_{BC}) = V_{AB} + V_{BC}.$$
Dividing by $1 + V_{AB}V_{BC}$,
$$V_{AC} = (V_{AB} + V_{BC})/(1 + V_{AB}V_{BC}), \tag{3.17a}$$
that is,
$$v_{AC} = \frac{v_{AB} + v_{BC}}{1 + v_{AB}v_{BC}/c^2}, \tag{3.17b}$$
the relativistic velocity addition law for parallel velocities. When the speeds
involved are very small compared with the speed of light c ($|v_{AB}/c| \ll 1$,
$|v_{BC}/c| \ll 1$) the denominator is very nearly equal to 1 and this reduces to the
Newtonian result
$$v_{AC} = v_{AB} + v_{BC}. \tag{3.18}$$
However, for larger speeds the results given by eqns (3.17, 18) differ considerably.
For example, suppose $v_{AB} = v_{BC} = \tfrac{1}{2}c$. Then the relativity result is
$v_{AC} = (\tfrac{1}{2}c + \tfrac{1}{2}c)/(1 + \tfrac{1}{2} \times \tfrac{1}{2}) = \tfrac{4}{5}c$, from (3.17),
while the Newtonian result is
$v_{AC} = \tfrac{1}{2}c + \tfrac{1}{2}c = c$, from (3.18). Similarly if $v_{AB} = v_{BC} = \tfrac{3}{4}c$, the relativity result
is $v_{AC} = \tfrac{3}{2}c/(1 + \tfrac{9}{16}) = \tfrac{24}{25}c = (0.96)c$ while the Newtonian result is
$v_{AC} = 1.5c$.
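These two examples can be verified exactly with rational arithmetic. The sketch below uses Python's `fractions` module; the function names are ours. It also checks that the velocity addition law agrees with the squared composition law for K-factors from which it was derived.

```python
from fractions import Fraction

def k_squared(V):
    """Square of the K-factor, K^2 = (1 + V)/(1 - V), from eqn (3.12a)."""
    return (1 + V) / (1 - V)

def add_velocities(V_ab, V_bc):
    """Relativistic sum of parallel velocities given as fractions of c, eqn (3.17a)."""
    return (V_ab + V_bc) / (1 + V_ab * V_bc)

for V in (Fraction(1, 2), Fraction(3, 4)):
    V_ac = add_velocities(V, V)
    # Squared composition law: K_AC^2 = K_AB^2 * K_BC^2.
    assert k_squared(V_ac) == k_squared(V) * k_squared(V)
    print(f"V_AB = V_BC = {V}: relativistic V_AC = {V_ac}, Newtonian V_AC = {V + V}")
```

Exact fractions are used rather than floating-point numbers so that the equality test on the K-factors holds without rounding tolerances.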
light, so will A, no matter what the relative velocity of A and B is. This resolves the
velocity-addition problem we encountered in Section 1.2.
Finally, we note that if we consider situations where the relative motion of B
and C is not parallel to that of A and B, the relativistic result is more complex than
that derived here, but still guarantees consistency with the principle of invariance
of the speed of light (and so with the limiting nature of the speed of light for
motion of massive particles). The theory is self-consistent!
Exercises
3.5 Let rocket A move away from B to the left at $\tfrac{3}{4}c$, and rocket C move away from B to
the right at $\tfrac{3}{4}c$. Draw a space-time diagram of this situation from B's viewpoint, and show
that from this diagram B can determine the relative separation of A and C to be increasing
at a rate $1\tfrac{1}{2}c$. How is this consistent with the fact that the relative speed of motion measured
by two observers for each other cannot exceed the speed of light? What relative velocity
will A measure for C?
3.6 (i) Consider eqn (3.16) in the case when KBC = 1. Explain the situation occurring.
Is the result obtained reasonable? What particular conclusion can you draw if $K_{AB} = 1$ also?
(ii) Consider eqn (3.16) in the case when KAC = 1. Explain the situation occurring,
and hence rederive the result that K is replaced by 1 /K when a speed of approach v is
replaced by a speed of recession of the same magnitude.
3.7 (i) What value of K corresponds to a relative speed of approach of 1000 km/hr?
(a typical speed of approach of airliners). Is this measurable?
(ii) What is the value of K if v is 500 km/sec? (typical of the relative motions of galaxies in
our cluster).
(iii) If K is measured to be 5, what is the corresponding speed of relative motion?
(iv) A traffic officer measures a car 150 m from him to be travelling towards him at
100 km/hr in a 60 km/hr speed zone. How long does the radar echo take to reach him? If the
pulses emitted by his radar set are separated by 3 μsec, what is the separation measured by
his radar set between the echo pulses?
3.8 Prove from (3.17) that if $|V_{AB}| < 1$ and $|V_{BC}| < 1$, then $|V_{AC}| < 1$. [Hint: prove
that $(1 - V_{AB})(1 - V_{BC})/(1 + V_{AB}V_{BC}) = 1 - V_{AC}$ and a similar expression for $1 + V_{AC}$.]
Computer Exercises
5. Write a program that will either (a) accept as input a value for a radial relative
velocity V and compute the corresponding K-factor (from eqn (3.12)), or (b) accept as
input a K-factor and compute the corresponding radial relative velocity V (from eqn
(3.11)). [Ensure that your program accepts only relative speeds less than the speed of light,
and values of K greater than zero.]
Use your program to confirm (i) the form of Fig. 3.14, and (ii) the reciprocal K-relation
(3.14) for equal speeds of approach and recession.
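One possible starting point for this exercise is sketched below in Python. It is only an outline under stated assumptions: the function names are ours, and the inverse relation V = (K² − 1)/(K² + 1) is obtained by solving (3.12a) for V, which we take to be the content of eqn (3.11).

```python
import math

def k_from_v(V):
    """K-factor for a radial velocity V = v/c (recession positive), eqn (3.12a)."""
    if not -1.0 < V < 1.0:
        raise ValueError("need |V| < 1: speeds must be below the speed of light")
    return math.sqrt((1.0 + V) / (1.0 - V))

def v_from_k(K):
    """Radial velocity V = v/c for a given K-factor (the relation above solved for V)."""
    if K <= 0.0:
        raise ValueError("need K > 0")
    return (K * K - 1.0) / (K * K + 1.0)

def convert(kind, value):
    """Dispatch on the kind of input: 'V' returns K, 'K' returns V."""
    if kind == "V":
        return k_from_v(value)
    if kind == "K":
        return v_from_k(value)
    raise ValueError("kind must be 'V' or 'K'")

# (i) A few points on the curve of Fig. 3.14.
for V in (-0.9, -0.5, 0.0, 0.5, 0.9):
    print(f"V = {V:+.1f}   K = {convert('V', V):.3f}")

# (ii) The product of the K-factors for equal speeds of recession and approach.
for V in (0.1, 0.5, 0.9):
    print(V, k_from_v(V) * k_from_v(-V))
```

The products printed in part (ii) should all be 1 to within rounding error, which is the reciprocal relation (3.14).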
6. Write a program that will accept as input speeds VAB and VBC of relative motion,
and print out VAC, the speed of relative motion measured by A for C (calculated from eqn
(3.17); restrict the inputs to physically acceptable values).
Use your program to verify that VAC does not exceed the speed of light. Adjust the
program to print out the error if VAC is estimated by the corresponding Newtonian value
(3.18), and hence check that the Newtonian value is acceptably good in ordinary everyday
circumstances.
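A possible outline for this exercise is sketched below, working in km/sec with the rounded value c = 300 000 km/sec used elsewhere in the text. The function names are ours, and the Newtonian error is evaluated in an algebraically rearranged form (an implementation choice, not something prescribed by the exercise) so that very small errors are not lost to rounding.

```python
C_KM_PER_SEC = 300_000.0  # speed of light, rounded as in the text

def relativistic_sum(v_ab, v_bc, c=C_KM_PER_SEC):
    """Speed of C relative to A for parallel motion, eqn (3.17b)."""
    for v in (v_ab, v_bc):
        if abs(v) >= c:
            raise ValueError("input speeds must be less than the speed of light")
    return (v_ab + v_bc) / (1.0 + v_ab * v_bc / c**2)

def newtonian_error(v_ab, v_bc, c=C_KM_PER_SEC):
    """Newtonian value (3.18) minus the relativistic value (3.17b).

    Written as (v_ab + v_bc) * p / (1 + p) with p = v_ab * v_bc / c^2, which is
    the same difference but avoids subtracting two nearly equal numbers.
    """
    relativistic_sum(v_ab, v_bc, c)  # reuse the input validation
    p = v_ab * v_bc / c**2
    return (v_ab + v_bc) * p / (1.0 + p)

car = 100.0 / 3600.0  # 100 km/hr expressed in km/sec
print(relativistic_sum(car, car), newtonian_error(car, car))
print(relativistic_sum(150_000.0, 150_000.0), newtonian_error(150_000.0, 150_000.0))
```

For two cars the error is far below anything measurable, whereas for two speeds of half the speed of light it is 60 000 km/sec.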
3.3 Simultaneity
We have already seen in Section 1.3 that the surfaces of simultaneity or instant-
aneity for observers A and B in space-time depend on their motion. This is a key
feature: most of the `paradoxes' of relativity theory require an understanding of
the relativity of instantaneity for their resolution. We now examine this issue.
Simultaneity in the observer's rest frame
To have in mind a specific example, one can consider setting up a standard time
system throughout the solar system in order to facilitate communication between
space ships and assist space navigation. Initially the plan is to extend Greenwich
Mean Time out as far as Mars. The way to do this is for an observer A at
Greenwich to set up a standard clock, and then to use the concept of simultaneity
determined by radar (as explained in Section 2.3) to extend time measured by this
clock to other points in the solar system.
Just as one would intuitively expect, when the space-time is represented using
the standard coordinates (t, X) of A's reference frame, the surfaces of instantaneity
he determines by use of radar are the surfaces {t = constant} (Fig. 3.19a).
For example, if A emits a light signal at $t_1 = -1$ and receives its echo at $t_2 = +1$,
then since light moves at unit speed in these coordinates the reflection event P has
coordinates t = 0, X = 1. By eqn (2.2), A measures P to occur at the time
$T = \tfrac{1}{2}(-1 + 1) = 0$. Thus P is measured by A to be simultaneous with the event O
(at t = 0, X = 0) in his history (Fig. 3.19b). Similarly emitting light at t = -2 and
receiving it at t = +2, A determines the event Q at {t = 0, X = 2} also to be
simultaneous with O; and in fact A determines all points for which t = 0 to be
simultaneous with each other. This is not an accident; use of simultaneity (as
defined by radar) is the natural way observer A extends clock readings from his
world-line to other points in space-time, so he will naturally define the surfaces
Fig. 3.19 (a) The surfaces of simultaneity for an observer A (who is, by definition,
stationary in his own coordinate system (t, X)). (b) Observer A determines the event P at
(0, 1) to be simultaneous with O because light emitted by A at t = -1 is reflected at P and
received back by A at t = 1. Similarly A can determine Q at (0, 2) to be simultaneous with O.
{t = constant} to denote simultaneity with clock readings along his own world-line.
Essentially, we have simply verified that this natural interpretation is correct.
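The radar assignments used in this argument are simple enough to state as code. The sketch below assumes, as in the text, that the event time is the average of the emission and reception times (eqn (2.2)) and that the distance is half the round-trip time multiplied by c (eqn (2.1)); the function names are ours, and units are chosen so that c = 1.

```python
def radar_time(t_emit, t_receive):
    """Time assigned to the reflection event: the midpoint of emission and reception."""
    return 0.5 * (t_emit + t_receive)

def radar_distance(t_emit, t_receive, c=1.0):
    """Distance assigned to the reflection event: half the round-trip time times c."""
    return 0.5 * c * (t_receive - t_emit)

# Event P: signal emitted at t = -1 and received back at t = +1.
print(radar_time(-1.0, 1.0), radar_distance(-1.0, 1.0))

# Event Q: signal emitted at t = -2 and received back at t = +2.
print(radar_time(-2.0, 2.0), radar_distance(-2.0, 2.0))
```

Both events are assigned the time t = 0, at distances 1 and 2 respectively, so both lie on the surface {t = 0} as stated.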
Fig. 3.20 Observer B, moving relative to A, determines the event P' to be simultaneous
with O because O coincides with the point half-way between E, where B emitted a signal,
and R, when the signal was received back after reflection at P'.
The equal-angle rule
Consider the situation above, as represented in Fig. 3.21. Examination of the
geometry implied by the equality of the distances OE and OR, plus the fact that
the segments EP' and RP' are at 45° to the vertical, shows that the shaded tri-
angles ORS and OP'V are congruent to each other. One can convince oneself of
this result experimentally (for various values of the angle SOR, draw equal line
segments OE and OR accurately and then determine P' as the intersection of lines
at 45° from R and E), or by formal geometric proof based on ordinary Euclidean
geometry (such a proof is given at the end of this section). Consequently, the
angles SOR and VOP' are the same. This implies a simple rule characterizing
surfaces of simultaneity in space-time (Fig. 3.22): if a world-line A makes an angle
$\alpha$ with the vertical in a space-time diagram, surfaces of instantaneity for an observer
with world-line A tilt up by an angle $\alpha$ toward A.
Fig. 3.21 Figure 3.20 redrawn to illustrate the fact that triangles ORS and OP'V are
congruent.
Fig. 3.22 The angle $\alpha$ between the surfaces of simultaneity for A and B is the same as
the angle between their world-lines in a space-time diagram drawn from A's viewpoint.
Fig. 3.23 A point $(t_0, X_0)$ on B's world-line, where $X_0 = Vt_0$, and a point $(t_1, X_1)$ on B's
surface of simultaneity. Because of the equal-angle rule (Fig. 3.22), $t_1/X_1 = X_0/t_0 = \tan\alpha$.
$$t_1 = V X_1 = v x_1/c^2, \tag{3.19}$$
which is the equation for B's surface of instantaneity in terms of the variables
measured by A.
Two examples
As a first example, consider the observer A to be on the surface of the Earth; B is
in a rocket moving past at a speed $\tfrac{1}{2}c$ in the direction of the planet Mars, at a time
when the distance to Mars is 4 light-hours. Then Fig. 3.23 applies with $v/c = \tfrac{1}{2}$,
$X_1 = x_1/c = 4$ hours, and $t_1 = \tfrac{1}{2} \times 4 = 2$ hours (from eqn (3.19)). Thus, the event
P in Mars' history that A measures to be simultaneous with the event O when A
and B pass each other, is 2 hours prior to the event P' in Mars' history that B
measures to be simultaneous with O.
As a second example, the Andromeda Nebula is about 2 190 000 light years
from the Earth. Consider simultaneity between events on the Earth and at
Andromeda as measured by an observer A on the surface of the Earth, and an
observer B in an airliner flying at 300 km/hr above the Earth in the direction of
Andromeda. The relative speed of motion of these observers is V = v/c =
(300 km/hr) x (1/3600 hr/sec)/(300 000 km/sec) = 1/3 600 000, so by (3.19)
the difference in time between events at Andromeda they measure to be simul-
taneous with a single event on the Earth is tl = (2 190 000/3 600 000) = 0.61
years. Similarly, if observer C travels on a bus at 30 km/hr towards Andromeda,
he will disagree with A about simultaneity on Andromeda by 22 days.
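The arithmetic in these examples follows directly from eqn (3.19), and can be reproduced with the short Python sketch below. The helper names are ours; the speed of light is rounded to 300 000 km/sec as in the text, and a year of 365.25 days is assumed for the conversion to days.

```python
C_KM_PER_SEC = 300_000.0  # speed of light, rounded as in the text

def speed_fraction(km_per_hr):
    """Convert a speed in km/hr to V = v/c."""
    return km_per_hr / 3600.0 / C_KM_PER_SEC

def simultaneity_offset(V, X1):
    """Difference in simultaneity t1 = V * X1 (eqn (3.19)).

    X1 is the distance expressed as a light-travel time; t1 is in the same unit.
    """
    return V * X1

mars_hours = simultaneity_offset(0.5, 4.0)  # rocket at c/2, Mars at 4 light-hours
airliner_years = simultaneity_offset(speed_fraction(300.0), 2_190_000.0)
bus_days = simultaneity_offset(speed_fraction(30.0), 2_190_000.0) * 365.25

print(f"Mars:      {mars_hours:.1f} hours")
print(f"Airliner:  {airliner_years:.2f} years")
print(f"Bus:       {bus_days:.0f} days")
```

Note how the enormous distance to Andromeda compensates for the tiny value of V: even a bus produces a disagreement of about three weeks.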
Conclusion
This analysis confirms what we discovered previously, namely that space-time is
a unit which is split into space (surfaces of simultaneity) and time in different
ways by different observers (Fig. 3.24). The splitting depends on their relative
velocities; it is given by eqn (3.19), which is the analytic form of the simple `equal
tilt' result illustrated in Fig. 3.22. The analysis is inevitable once we have decided
to base the concept of simultaneity on measurable effects, and recognize that it is
best to do so on the basis of the speed of light because of the fundamental
importance of this speed in nature. As is the case for all relativity effects, the
relativity of simultaneity is completely reciprocal: viewed from B's reference
frame, his surfaces of simultaneity are horizontal and it is A's surfaces of
simultaneity that are tilted, inclining up towards A's world-line (see Fig. 3.25,
which is just Fig. 3.22 redrawn from B's viewpoint).
Finally we note that for small values of $|vx/c^2|$ the effect is very small; in
particular, it is negligible in everyday life (the differences for simultaneity of
different observers are in the region of $10^{-5}$ μsec). On the other hand, as v
increases towards c, $t_1 \to x_1/c$: that is, events simultaneous with O approach
closer and closer to the future light cone. Figure 3.22 shows that as v increases,
$\alpha \to 45^\circ$ and B's surface of simultaneity in space-time approaches closer and
closer to his world-line. If the limit when v/c = 1 could be attained, B's world-line
would be contained in his surface of simultaneity: time would cease to flow for
him. This corresponds to the fact that in this situation, if B attempted to use radar
Fig. 3.24 Space-time split differently into space (surfaces of simultaneity) and time
(measured along world-lines) for observers A and B in relative motion.
Exercises
3.9 An airliner flies at 500 km/hr towards a destination 1000 km away. What is the
resulting difference in simultaneity for the aircraft and the control tower at the destination?
Does the pilot have to allow for it?
3.10 Twin A on the Earth maintains radio contact with Twin B who is in a rocketship
moving away from him at a speed of $\tfrac{1}{2}c$. They decide to blow out candles on cakes
simultaneously at midday on January 10th (their birthday). At that moment the distance
between them, as measured by A, will be 2 light-years. What difference will there be
between the times they each consider the appropriate moment for each to blow out the
candles?
B turns around when her distance from A is measured by A to be 3 light-years, and
starts returning at a speed $\tfrac{1}{2}c$. Let P be the event in A's history that B measures to occur
immediately before the turn-around, and Q be the event in A's history that B measures to
occur immediately after the turn-around (for simplicity take this to happen instanta-
neously). What difference in time does A measure between the events P and Q?
3.11 Return to consideration of Exercise 2.7. Determine what instant in B's history she
measures to be simultaneous with the event when A fired at her. Does B reach the same
conclusion as A about who fires first?
Computer Exercise 7
Write a program that will accept as input (a) the relative speed of motion V of two
observers, and (b) a distance D; and will then print out the difference in simultaneity DT
measured by these observers at the distance D (given by eqn (3.19)). Verify the negligible
nature of the effect for everyday speeds of motion on the Earth.
Modify your program so that it can also calculate D from DT, given V, or V from D and
DT. Hence find, for example, what relative speed will cause a difference of simultaneity of
one hour at a distance of four light-hours. What limit can you deduce on the possible
magnitude of DT at this distance? What is the general form of the limit on the magnitude
of DT, given D?
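As a possible outline for the modified program, the sketch below solves eqn (3.19) for each of the three quantities in turn. It assumes units in which c = 1, so that D is given as a light-travel time (for example in light-hours) and DT in the matching time unit; the function names are ours.

```python
def _check_speed(V):
    """Reject relative speeds that are not below the speed of light."""
    if not -1.0 < V < 1.0:
        raise ValueError("need |V| < 1")

def offset_from(V, D):
    """DT = V * D, eqn (3.19), with D measured as a light-travel time."""
    _check_speed(V)
    return V * D

def distance_from(V, DT):
    """Distance D (as a light-travel time) at which the offset is DT."""
    _check_speed(V)
    if V == 0.0:
        raise ValueError("no offset arises at any distance when V = 0")
    return DT / V

def speed_from(D, DT):
    """Relative speed V = v/c giving an offset DT at distance D.

    The speed check enforces |DT| < D, i.e. |DT| < D/c in ordinary units.
    """
    if D <= 0.0:
        raise ValueError("need D > 0")
    V = DT / D
    _check_speed(V)
    return V

# One hour of disagreement at a distance of four light-hours.
print(speed_from(4.0, 1.0))

# An everyday case: 100 km/hr over 1000 km, with c = 300 000 km/sec (result in seconds).
print(offset_from(100.0 / 3600.0 / 300_000.0, 1000.0 / 300_000.0))
```

Because |V| < 1, the offset can never reach the light-travel time to the distance in question; the program reports this as an error rather than returning an unphysical speed.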
$\alpha_2 + \beta_2 = 45^\circ$ (the angle OTU exterior to triangle OET is equal to the sum of
the interior angles); and similarly
$\alpha_3 + \beta_3 = 45^\circ$.
These three equations show that $\alpha_2 = \alpha_3$. We also have $\alpha_1 = \alpha_2$ (opposite angles
are equal), therefore $\alpha_1 = \alpha_3$.
We see now that triangles ORS and OP'V are congruent with equal sides (OR
and OP') and two pairs of equal angles ($\alpha_1$ and $\alpha_3$, and the two right angles OSR
and OVP').
Fig. 3.26 Figure to prove congruence of the two shaded triangles in Fig. 3.21 (see text).
Light emitted at event E in B's history is reflected at events Q' and P' and returns to event R
in B's history.
3.4 Time dilation 77
Fig. 3.27 Observers A and B in relative motion. By sending a light signal at T and
receiving it back at T", the observer A determines the event Q to be simultaneous with the
reflection event P, which is at time T' according to B's clock.
cannot directly measure proper time t' for an observer B from a space-time
diagram drawn from the viewpoint of an observer A, because this diagram will be
calibrated in terms of A's variables (t, X, Y, Z), and we are not entitled simply to
assume what the relation between t and t' is. However, we can easily calculate this
relation (cf. the derivation of eqn (1.2)). In this section we shall work out the
magnitude of the time dilation effect in terms of the Doppler shift factor K and in
terms of the relative velocity v, consider direct evidence for time dilation, discuss
the symmetry of the time dilation effect, and investigate the `twin paradox'.
Fig. 3.28 The distinction between K and $\gamma$: (a) The K-factor relates observers' clocks by
observed Doppler shifts and depends on light signals travelling only in one direction.
(b) The $\gamma$-factor relates the observers' clocks by simultaneity determined by radar, and
depends on signals travelling both ways between the observers. (c) A situation where
information is conveyed by a one-way signal from A to B, but that information was
determined by previous radar measurements using reflected (two-way) signals and so is
based on the $\gamma$-factor.
that has been travelling towards us for over a thousand million years. We do not
obtain information about the object `now'. We can make these measurements to
such great distances because the object itself (perhaps a galaxy or quasi-stellar
object) provides the power supply for the signal. The information is relatively
easy to obtain, but is also relatively limited; in particular, neither distance nor
simultaneity are directly deducible from measurements of the K-factor.
By contrast, observations to determine directly the $\gamma$-factor depend on
obtaining an echo pulse; the experiments are essentially more complex, requiring
coordinated measurement of emitted and received signals. Correspondingly, they
give more information (distance and simultaneity can be deduced directly, and
indeed the Doppler shift factor is also directly measurable from a series of radar
pulses, see eqn (3.6)). The distance to which radar can be used is more limited,
both because of limits on practicalities of observing time delays, and because of
limits on power requirements, since we provide the power for the signal detected.
Unless either the radiation is emitted parallel (i.e. non-spreading) to very high
accuracy, or the target actively aids the process by amplifying and rebroadcasting
the signal, the power needed goes up as the fourth power of the distance because
of the need to obtain an echo pulse. It is hardly practical to use radar to measure
distances of more than a few light-years; at present, the maximum distance
measured by radar is about 8 light-hours. Clearly, the same limits will apply to the
use of radar for clock synchronization.
Finally, we note that in any complex situation one may have to consider
carefully before deciding which is the real effect in operation. As an example,
suppose observer A tracks a uniformly moving spacecraft B by radar for some
time, and then after suitable computations sends a message: `When you receive
this message, the time will be 12:00 noon' (Fig. 3.28c). Now, the final message is
one-way from A to B, so one could conceivably think that the information sent
was essentially a deduction from the K-factor effect. However, this would be
incorrect; the data sent is based on the two-way radar observations by A that took
place initially, the final signal merely transferring from A to B the results of these
previous measurements. The information B receives in this case is not about
conditions at the time of transmission, but rather about what conditions will be at
the time of reception: at that time, A's clock will simultaneously read 12:00
(where simultaneity is measured by A). Thus the information sent is about radar-
based determination of simultaneity, and the relative time dilation measured by
such observations will be determined by the $\gamma$-factor.
$$\gamma(1/K) = \frac{(1/K)^2 + 1}{2(1/K)} = \frac{1 + K^2}{2K} = \gamma(K).$$
We already know (Section 3.2) that K -> 1/K corresponds to changing from
approach to recession (or vice versa) at the same relative speed of motion.
Thus, we have shown that the time dilation effect (determined by radar com-
parison of clock setting) is the same for relative approach or recession at the same
speed. The above examples suggest, and further investigation confirms, that
$\gamma \geq 1$. Thus B's clock (moving past A) is measured by A to run slow relative to
A's clock (at rest in the chosen coordinate system), whether they are approaching
or receding from each other. Essentially, this symmetry is because the light used
to make the measurements travels both ways between A and B. It contrasts
with the Doppler shift effect, where A observes B's clock to be running slow if B
recedes, but to be running fast if B approaches; the difference between observed
consequences of approach and recession in this case is possible because the
light used to make the measurement travels only one way (either from A to B or
from B to A).
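The contrast between the two effects can be seen in a small exact calculation. The sketch below assumes the relation $\gamma = (K^2 + 1)/(2K)$ between the two factors and uses a relative speed of 3/5 of the speed of light, for which K = 2 on recession and K = 1/2 on approach; the function name is ours.

```python
from fractions import Fraction

def gamma_from_k(K):
    """Time dilation factor in terms of the Doppler factor: (K^2 + 1)/(2K)."""
    return (K * K + 1) / (2 * K)

K_RECEDING = Fraction(2)         # K for recession at 3/5 of the speed of light
K_APPROACHING = 1 / K_RECEDING   # the reciprocal K-factor for approach

# The one-way Doppler factors differ, but the two-way factor is the same.
print(K_RECEDING, K_APPROACHING)
print(gamma_from_k(K_RECEDING), gamma_from_k(K_APPROACHING))
```

Both cases give the value 5/4, which agrees with $1/\sqrt{1 - (3/5)^2}$.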
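$$\gamma = \left(\frac{1+V}{1-V} + 1\right) \Big/ \left(2\sqrt{\frac{1+V}{1-V}}\right) = \{(1 - V)(1 + V)\}^{-1/2}.$$
Therefore
$$\gamma = \frac{1}{\sqrt{1 - V^2}} = \frac{1}{\sqrt{1 - v^2/c^2}}, \tag{3.21}$$
which confirms the result already obtained by other means (eqn (1.2)). As
examples, if $v/c = \tfrac{1}{4}$, then $1 - V^2 = \tfrac{15}{16}$, $(1 - V^2)^{1/2} = 0.97$, and $\gamma = 1/0.97 = 1.033$.
Similarly,
if $v/c = \tfrac{1}{2}$ then $\gamma = 1/0.866 = 1.155$;
if $v/c = \tfrac{3}{4}$ then $\gamma = 1/0.661 = 1.512$;
if $v/c = \tfrac{9}{10}$ then $\gamma = 1/0.436 = 2.294$;
if $v/c = \tfrac{99}{100}$ then $\gamma = 1/0.141 = 7.089$.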
Fig. 3.29 A graph of the $\gamma$-factor against V = v/c, plotted from eqn (3.21). Note
that $\gamma$ becomes arbitrarily large as the relative speed approaches the speed of light.
Thus, as expected, high relative speeds cause large $\gamma$-factors and so large observed
time dilations.
It follows immediately from eqn (3.21) that (a) the effect is always one of an
observed slowing down of the moving clock ($\gamma \geq 1$); (b) it vanishes if and only if
there is no relative motion of the object and observer:
$$\gamma = 1 \iff v = 0;$$
and (c) the time dilation becomes indefinitely large as the relative speed
approaches the speed of light:
$$\gamma \to \infty \iff (v/c)^2 \to 1.$$
It also confirms (d) that time dilation depends only on the magnitude of the
relative velocity, not whether it is a speed of approach or recession:
$\gamma(-v) = \gamma(v)$. All these features are clear in the graph of $\gamma$ as a function of v/c
shown in Fig. 3.29 (plotted from eqn (3.21)).
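Properties (a) to (d) are easy to confirm numerically. The sketch below assumes the standard form $\gamma = 1/\sqrt{1 - V^2}$ with V = v/c; the function name is ours.

```python
import math

def gamma_factor(V):
    """Time dilation factor for a relative velocity V = v/c."""
    if not -1.0 < V < 1.0:
        raise ValueError("the relative speed must be less than the speed of light")
    return 1.0 / math.sqrt(1.0 - V * V)

# Values at 1/4, 1/2, 3/4, 9/10 and 99/100 of the speed of light.
for V in (0.25, 0.5, 0.75, 0.9, 0.99):
    print(f"v/c = {V:4.2f}   gamma = {gamma_factor(V):.3f}")

print(gamma_factor(0.0))                        # (b) no dilation without relative motion
print(gamma_factor(-0.6) == gamma_factor(0.6))  # (d) approach and recession agree
print(gamma_factor(0.999999))                   # (c) growth as v approaches c
```

The loop reproduces the factors 1.033, 1.155, 1.512, 2.294 and 7.089 quoted in this section.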
where $\gamma$ is given by (3.21). Hence $t_Q > t_P$: although OP looks longer than OQ, the
segment OP in the space-time diagram represents a smaller time measured by B,
Fig. 3.30 (a) Observer A measures the point Q on his world-line to be simultaneous with
P on B's world-line. (b) Observer B measures the point R on A's world-line to be
simultaneous with P on his world-line. Event R precedes event Q. (c) The same situation
redrawn in B's rest frame.
than the time A measures from O to Q (cf. Fig. 1.27b). A measures B's clock to be
running slow. How can B also measure A's clock to be running slow?
The key feature is that B does not measure Q on A's world-line to be simul-
taneous with P on his world-line. Rather, from what we have learned in Section
3.3, B measures a point R on A's world-line to be simultaneous with P, where R
precedes Q: i.e. tR = tP with tR < tQ (Fig. 3.30b). Exactly analogously to (3.22a),
B's analysis shows that
$$t_P = \gamma t_R, \tag{3.22b}$$
showing that B measures A's clock to be running slow. There is no contradiction
between these results; rather (3.22a, b) show that
$$t_Q = \gamma^2 t_R \tag{3.22c}$$
confirming the result tQ > tR, as required for consistency.
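A numerical illustration may help to fix the ordering of the three events. The sketch below is only an example under stated assumptions: the relative speed V = 0.6 and the clock reading of 16 units at R are our own choices, and the two relations used are those labelled (3.22a) and (3.22b).

```python
import math

def reciprocal_times(V, t_R):
    """Return (gamma, t_P, t_Q) for a given reading t_R on A's clock at R.

    t_P = gamma * t_R is B's clock reading at P (eqn (3.22b)), and
    t_Q = gamma * t_P is A's clock reading at Q (eqn (3.22a)).
    """
    gamma = 1.0 / math.sqrt(1.0 - V * V)
    t_P = gamma * t_R
    t_Q = gamma * t_P
    return gamma, t_P, t_Q

gamma, t_P, t_Q = reciprocal_times(0.6, 16.0)
print(gamma, t_P, t_Q)
```

With these numbers the factor is 1.25, so the readings are 16 at R, 20 at P and 25 at Q: each observer finds the other's clock to have recorded less time than his own, and the three values are mutually consistent.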
Hence, the key to understanding the way the time dilation effect can be reci-
procal is to note that A measures Q and P to be simultaneous, but B measures R
and P to be simultaneous. Finally, we note that Fig. 3.30b is drawn from A's
viewpoint. To understand fully the reciprocity, consider Fig. 3.30c which is the
identical space-time situation drawn from B's viewpoint. Relations (3.22)
therefore hold for Fig. 3.30c, just as they do for Fig. 3.30b. Note that one can
directly read off proper times t' measured by B from Fig. 3.30c, because it is
calibrated in terms of his variables (t', X', Y', Z'); however, one cannot directly
read off times measured by A from this diagram. Later (in Section 4.2) we shall
find out how to represent a time along B's world-line equal to the time OQ
measured by A.
Fig. 3.31 The `twin paradox'. (a) Twin B travels at speed v for 6 years, and then returns at
the same speed to rejoin A, who has remained at rest (in an inertial reference frame) during
B's journey. A light signal emitted by A at S is received by B at U, when she turns around.
(b) Twin A measures the event W on his world-line to be simultaneous with the event U on
B's world-line.
Fig. 3.32 An acceleration detector, consisting of a weight held between springs fitted with
detectors that record movement of the weight relative to the sides of the framework.
demanded in order that they meet again. This acceleration is physically detect-
able. Suppose each observer has with him or her an acceleration detector con-
sisting of a weight constrained to move within a framework by springs which are
fitted with strain detectors (Fig. 3.32). Since A moves inertially, his detector will
register no forces, but B's will; this shows that the distinction between their
Fig. 3.33 Various world-lines between events O and P. The straight-line path A is the one
with the longest proper time. It is uniquely characterized by the fact that an acceleration
detector will measure no acceleration along this path.
Conclusion
In summary, `a moving clock goes slow' in a way that is completely reciprocal for
any pair of inertial observers (each measures the other's clock to be going slowly).
This is consistent because they disagree about simultaneity. This time dilation
effect refers to a comparison of times measured by both clocks `now', i.e. it is
based on the idea of simultaneity. It must not be confused with the Doppler shift
effect relating observed times, which is also completely reciprocal, but relates
time measured by the observer now to time at the source of the radiation when
this radiation was emitted (which could be a very long time ago). Time dilation
gives rise to the `twin paradox': any observer who moves away from an inertial
observer and then back again will find he has experienced a smaller increase in
time than the inertial observer. This feature has been observed experimentally by
comparison of a clock in an aircraft with a clock stationary on the surface of the
Earth (the Hafele-Keating experiment described above).
Exercises
3.12 Consider the `twin paradox' example above (Fig. 3.31).
(1) Let light emitted by B at the event U be received by A at an event V. Using the K-
factor, determine the times A measures from O to V and from V to P; hence deduce the total
time measured by A from O to P.
(2) What event in A's history does B determine to be simultaneous with U, (a) just
before she turns around, (b) just after she turns around? Use the $\gamma$-factor based on B's view
of space-time during her inertial segments of motion to determine the time intervals in A's
history between O, these events, and P. Hence confirm that B can also use $\gamma$ to determine
the time A measures from O to P.
(3) Suppose A and B each observe the other by radar. Find the relative motions each
determines for the other. [This reveals a quite unexpected motion that B measures for A,
and so sharply shows the distinction between them.]
3.13 Let the world-line of an inertial observer A go from a space-time event O to P. Let
observer D move inertially from O to some event Q, and then inertially to P. Show that D
measures a shorter time interval from O to P than A does. [Hint: find the time in A's history
he measures to be simultaneous with Q; then use the relevant $\gamma$-factors separately for the
outward and return journey of D.]
Generalize to show that if D moves on any finite number of inertial segments from O to
P, he measures a shorter time interval from O to P than A does (unless he moves in an
unbroken geodesic from O to P, when he moves exactly as A and therefore measures the
same time interval).
3.14 Suppose that a spaceship cruises at v = c. Assuming that you may neglect the
a
times for acceleration and deceleration, find how much the earth will have aged during an
outward and return journey which takes 50 years as measured by the astronauts on board.
How far from the earth will the space-ship have travelled? What limits does this suggest to
what may be achieved in space travel?
3.15 The relation between velocity and K (and so redshift) considered so far has been
for the case of radial motion (the source moving directly towards or away from the
observer). Now consider the case of transverse motion; the source is moving at right angles
to the line of sight from the observer (Fig. 3.34). Then the distance between the source
and observer is unchanging instantaneously. Calculate the K-factor for light emitted by the
source and received by the observer, and hence the redshift measured in this case. [Hint:
the K-factor is simply due to the time dilation effect (3.21) in this case]. How large will be
Fig. 3.34 (diagram labels: light, source, observer)
the resultant effect on the measured CBR temperature in those directions? (See the
discussion of redshift and background radiation at the end of Section 3.1.)
Computer Exercise 8
Write a program which will accept as input any one of the three parameters: velocity
V (= v/c), time dilation factor G (= $\gamma$), Doppler shift factor K; and prints out the other two.
Use your program to plot carefully a graph of $\gamma$ and K against V for all allowed values of V.
Modify your program (a) to print out additionally the `slow-motion' approximations
$G_1 = 1 + \tfrac{1}{2}V^2$ and $K_1 = 1 + V$. Find out for what ranges of V $G_1$ and $K_1$ are good
approximations to G and K respectively. (b) Repeat this for the fast motion approximations
$G_2 = 1/\sqrt{2\epsilon}$ and $K_2 = \sqrt{2/\epsilon}$, where $\epsilon$ is defined by $V = 1 - \epsilon$.
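One possible starting point for this exercise is sketched below. It is not a definitive solution: it assumes the standard radial Doppler relation K = {(1 + V)/(1 - V)}^(1/2) with V positive for recession, and when only G is given it chooses V >= 0, since G fixes only the magnitude of V. The function names are hypothetical, and plotting is left to the reader.

```python
import math


def from_velocity(V):
    """Return (G, K) for a velocity V = v/c with |V| < 1."""
    G = 1.0 / math.sqrt(1.0 - V * V)
    K = math.sqrt((1.0 + V) / (1.0 - V))
    return G, K


def from_gamma(G):
    """Return (V, K) for a time dilation factor G >= 1, choosing V >= 0."""
    V = math.sqrt(1.0 - 1.0 / (G * G))
    return V, from_velocity(V)[1]


def from_k(K):
    """Return (V, G) for a Doppler factor K > 0."""
    V = (K * K - 1.0) / (K * K + 1.0)
    return V, from_velocity(V)[0]


def approximations(V):
    """Slow-motion (G1, K1) and fast-motion (G2, K2) estimates for velocity V."""
    eps = 1.0 - V   # small when V is close to 1
    return {
        "G1": 1.0 + 0.5 * V * V,
        "K1": 1.0 + V,
        "G2": 1.0 / math.sqrt(2.0 * eps),
        "K2": math.sqrt(2.0 / eps),
    }


for V in (0.01, 0.1, 0.5, 0.9, 0.999):
    G, K = from_velocity(V)
    a = approximations(V)
    print(f"V={V:<6} G={G:.5f} K={K:.5f} "
          f"G1={a['G1']:.5f} K1={a['K1']:.5f} G2={a['G2']:.5f} K2={a['K2']:.5f}")
```

Comparing the columns row by row shows where each approximation starts to fail; a tolerance (say 1 per cent) has to be chosen before "good approximation" can be given a definite range.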
Fig. 3.35 (a) A ruler with ends u and w and mid-point v. (b) A space-time diagram of the
ruler at rest in the reference frame of an observer A, showing the world-lines of the ends u
and w and of the mid-point v. Clearly, the entire strip between the world-lines of u and w
will represent histories of particles comprising the ruler. A surface of simultaneity for A is
horizontal in this diagram.
Fig. 3.36 (a) Observer A measures the instantaneous length of the ruler to be L, while
observer B (moving relative to A) measures the length L' in his surfaces of simultaneity.
(b) The same situation drawn in B's rest frame.
Fig. 3.37 Observers A and B both determine the length of the rod by radar, A emitting a
signal at Q and receiving the echo at S, while B emits a signal at P and receives the echo at R.
A measures U and W to be simultaneous, while B measures O and W as simultaneous
(OP and OR represent equal times).
$t_{OP} = t_{OR}$ and B determines O and W to be simultaneous. The light travel time
is $T' = 2t_{OP} = 2t_{OR}$, and the length of the rod is measured by B to be
Let the light emitted at the event P by B reach A's world-line at the event Q.
Suppose that A emits a light signal at the event Q. This light is reflected at the
event W and received back by A at the event S. Let $K_1$ be the K-factor for a relative
speed of approach v; this relates the time $t_{PO}$ to $t_{QO}$, so the time measured by A
from Q to O is $t_{QO} = K_1 \times \tfrac{1}{2}T'$. Let $K_2$ be the K-factor for a relative speed of
recession v; this relates the time $t_{OR}$ to $t_{OS}$, so the time measured by A from O to S
is $t_{OS} = K_2 \times \tfrac{1}{2}T'$. Let the total light travel time measured by A be T. Then
Now, $K_2 = 1/K_1$ because they relate approach and recession at the same
speed. Then
$L = \tfrac{1}{2} cT$. (3.23c)
Hence the ratio of the length of the rod measured by A to the length of the rod
measured by B is
$1/\gamma = (1 - v^2/c^2)^{1/2}$.
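The step from the two K-factors to the factor $\gamma$ can be checked numerically. In the sketch below, which is an illustration rather than the authors' own calculation, we assume the standard expression K = {(1 + v/c)/(1 - v/c)}^(1/2) for the K-factor (negative v for approach), and add the two times quoted above to obtain the total time A measures from Q to S. The function names are hypothetical.

```python
import math


def k_factor(beta):
    """K-factor for relative speed beta = v/c (assumed form; beta < 0 for approach)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def radar_time_ratio(beta):
    """Ratio T/T' obtained from t_QO = K1*T'/2 and t_OS = K2*T'/2, with T = t_QO + t_OS."""
    K1 = k_factor(-abs(beta))   # approach
    K2 = k_factor(abs(beta))    # recession, equal to 1/K1
    return 0.5 * (K1 + K2)


for beta in (0.1, 0.6, 0.99):
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    print(f"v/c = {beta}: T/T' = {radar_time_ratio(beta):.6f}, gamma = {g:.6f}")
```

The two printed columns agree, so the radar times measured by A and B differ by exactly the factor $\gamma$; this is the origin of the factor $1/\gamma$ relating the two measured lengths.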
Fig. 3.39 Measurement of two rods RA, with end-points u and w, and RB with end-points u'
and w'. Observer A measures both to have length L (the distance between events U and W).
Observer B measures them to have the lengths L' (between U and N) and L" (between
U and M) in his surface of simultaneity UM.
W are instantaneous for A, so A measures the same length L (the radar distance
measured by him between U and W) for both of them. Now, using radar, B
measures lengths in his surface of instantaneity, which is indicated as the line
UNM in the diagram, where N lies on the world-line of w and M on the world-line
of w'. He measures the length of RA as the radar distance L' between U and N,
and the length of RB as the radar distance L" between U and M. By the results
above, A measures the relatively moving rod RB short by a factor $\gamma$:
These results are consistent with each other. Indeed, they show that
consistent with the feature that L" > L' (apparent because the segment UM is
longer than UN).
In view of this reciprocity, it is apparent that given any rigid object, it will
appear longest to an observer for whom it is at rest (i.e. who moves at the same
speed as the object). We may use the name proper length to denote the length of
the object as measured by such an observer. Then every observer moving relative
to the rod will measure the length to be less than its proper length.
Transverse measurements
The length contraction effect is a longitudinal effect: that is, it is observed in the
direction of relative motion of the object (in the above calculation, the relative
motion was in the X-direction and the length contraction occurred in the length of
the object measured in that direction). No change of size is measured in directions
perpendicular to the motion, because there is no change of relative distances in
those directions. Thus, radar sets aligned along the Y and Z axes by A and B will
give the same measurements of distances along these axes, and one will find the
size of objects measured in the Y and Z directions unaffected by relative motion in
the X-direction. A body moving past will therefore be measured to be distorted
in shape, having the same Y and Z dimensions as when stationary but being
contracted in the X-direction.
Photographic images
The length contraction effect refers to measurements made by radar. This does
not mean that a photographic image will show the length contraction in an
obvious way, because such an image does not represent the state of the object `at
an instant'. To work out what the image will show, one must allow for the light
travel time from different parts of the object to the camera, and this works in the
opposite way to the length contraction. In general, the result is complex to work
out, but a simple example will make the principle clear. This detailed study is
peripheral to our main line of argument, and so may be omitted at a first reading.
Consider a rigid rod RB with edges u' and w' moving towards the observer A
(Fig. 3.40). As in the previous example, denote the proper length of the rod by L"
and the length A measures for it by L; these quantities are then related by (3.25a).
In the following, unless otherwise stated, all distances will be scaled according to
A's coordinate X, which is used to calibrate the x-axis in Fig. 3.40 and is nor-
malized so that the speed of light is 1. At event R, observer A takes a photograph
of RB. The light arriving at the event R has travelled up its past light cone; we
denote by U the event where this light left the edge u, and by W the event where it
Fig. 3.40 A photograph being taken by an observer A of a ruler with end-points u' and w'
moving towards the camera. The events U at u and W at w are recorded by the camera at
event R. By the time the light ray leaves W, the ruler is a distance d nearer the camera so its
apparent length is $L_0 = L + d$.
left the edge w. Suppose RB moves a distance d towards A while the light travels
from U to W; since RB is moving at a speed v/c towards A, we have $|d| = T|v/c|$
where T is the time the light takes to travel from U to W. Remembering our sign
convention for v, a speed of approach is represented by a negative value of v, so
d = -Tv/c. When the light arrives at W it has travelled a distance L + d towards
A, so T = L + d; consequently d = -(L + d)v/c. Solving for d shows that
d = {-(v/c)/(1 +v/c)}L. (3.26a)
Now, using (3.25a) and the expression (3.21) for $\gamma$ and simplifying, we can
show that
$L_0 = (1/K)\,L''$; (3.27)
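A short numerical check makes the chain from (3.26a) to (3.27) concrete. This is a sketch under stated assumptions: we take (3.25a) to be L = L"/$\gamma$, use the standard form K = {(1 + v/c)/(1 - v/c)}^(1/2), and follow the sign convention above in which approach corresponds to negative v. The function name and the sample values are hypothetical.

```python
import math


def photographed_length(proper_length, beta):
    """Return (L, d, L0, L''/K) for a rod of proper length L'' moving at beta = v/c.

    beta < 0 means the rod approaches the camera (sign convention of the text).
    """
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    L = proper_length / g                  # radar length measured by A (assumed form of 3.25a)
    d = -beta / (1.0 + beta) * L           # distance moved while light goes from U to W, (3.26a)
    L0 = L + d                             # apparent length on the photograph
    K = math.sqrt((1.0 + beta) / (1.0 - beta))
    return L, d, L0, proper_length / K     # last entry is the right-hand side of (3.27)


# Illustrative values: proper length 10 units, approach speed 0.6c.
L, d, L0, rhs = photographed_length(10.0, -0.6)
print(f"L = {L:.3f}, d = {d:.3f}, L0 = {L0:.3f}, L''/K = {rhs:.3f}")
```

For an approaching rod K < 1, so the photographed length exceeds even the proper length: the light travel time effect more than cancels the length contraction.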
Exercises
3.16 A section of the surface of a road has pressure studs laid into it, connected to a
measurement centre by cables that are all exactly the same length. A series of lights at the
centre indicate which studs are loaded by a vehicle in the road. An articulated lorry passes
over them at high speed. When the lorry is at rest, its length is measured to be 30 metres.
What length would this apparatus measure for the lorry, if its speed of travel were
v = 0.01c?
3.17 To measure the length of a high-speed train, an observer measures the time T it
takes to pass a fixed point on the track, and then determines its length L' from its speed of
motion v (which he also measures) by the relation L' = vT. Show that the length-con-
traction formula (3.24) relates L' to the (proper) length L measured for the train by an
observer moving with it.
3.18 A science-fiction story features a moon-buggy which has continuous contact
with the ground (via caterpillar tracks) and has its weight distributed uniformly along its
10-metre length. What is the upper limit to the speed at which it can travel directly across a
4-metre-wide chasm without falling into it? For speeds greater than this critical speed,
explain how it is possible, from the point of view of someone travelling in the buggy, for it
to fall into the chasm.
Computer Exercise 9
Write a program that will accept as input a relative velocity V(= v/c) and a proper length
L, and prints out the measured length L' given by eqn (3.24). Also print out the approximate
value $L_1' = L(1 - \tfrac{1}{2}V^2)$, and find for what range of V the estimate $L_1'$ is a good
approximation to L'. Apply your program (a) to a Concorde airliner at maximum speed,
(b) a space shuttle.
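A possible skeleton for this exercise is given below. It assumes that eqn (3.24) is the usual contraction formula L' = L(1 - V^2)^(1/2). At everyday speeds V^2 is so small that evaluating 1 - (1 - V^2)^(1/2) directly loses precision, so the fractional shortening is computed from an algebraically equivalent form. The speeds and lengths used at the end are rough placeholder figures chosen only for illustration, not data from the text.

```python
import math

C = 299_792_458.0   # speed of light in m/s


def measured_length(L, V):
    """Contracted length L' for proper length L and V = v/c (assumed form of eqn 3.24)."""
    return L * math.sqrt(1.0 - V * V)


def slow_motion_length(L, V):
    """Slow-motion estimate L1' = L * (1 - V^2 / 2)."""
    return L * (1.0 - 0.5 * V * V)


def contraction_fraction(V):
    """1 - L'/L, written as V^2 / (1 + sqrt(1 - V^2)) to avoid cancellation at small V."""
    return V * V / (1.0 + math.sqrt(1.0 - V * V))


# Placeholder inputs (assumptions): (label, speed in m/s, proper length in m).
for label, speed, length in [("airliner", 600.0, 60.0), ("orbiting shuttle", 7800.0, 37.0)]:
    V = speed / C
    shortening = length * contraction_fraction(V)
    print(f"{label}: V = {V:.3e}, shortening = {shortening:.3e} m")

# At higher speeds the exact and approximate lengths can be compared directly.
for V in (0.1, 0.3, 0.6, 0.9):
    print(f"V = {V}: L' = {measured_length(1.0, V):.5f}, L1' = {slow_motion_length(1.0, V):.5f}")
```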
Fig. 3.41 (a) Cosmic rays colliding with particles in the Earth's atmosphere to produce
muons which then decay into other particles. The muons travel at a speed of about 0.99c
relative to the Earth. (b) The same situation from the viewpoint of the muons, with the
Earth approaching at high speed.
illuminating examples, and will then consider how one can express the essential
features either through a single unified relation (the Lorentz transformation) or
through the concept of an invariant (the space-time interval), both of which will
be discussed in detail in Chapter 4.
Their mean flight time through the Earth's atmosphere, from where they are
created, to sea level is t2, given by
Defining f by
$f \equiv$ (mean time of flight)/(mean lifetime), (3.28c)
we find that $f = t_2/t_1 = (6.7 \times 10^{-5})/(2.2 \times 10^{-6}) \approx 30$. Now a statistical analysis
shows that during one mean lifetime $t_1$, the proportion of muons surviving
will be about $1/e$, where e is the transcendental number occurring in natural
logarithms ($e \approx 2.71828\ldots$); and during the time $t_2$, the fraction surviving should
be about $e^{-f} = e^{-30} \approx 10^{-13}$. However, when measurements are made of the
number of muons created high in the atmosphere and those arriving at sea level it
turns out that a much higher fraction arrives at sea level: about 1% = $10^{-2}$ of the
total number created. Thus, the prediction is entirely wrong: enormously more
particles survive than expected on the basis of this simple calculation. What has
gone wrong?
The essential point is that we have failed to take time dilation into account. In
considering any physical situation, one should make a definite decision as to
which frame will be used for the analysis, and then stick to this decision; mixing
results of measurements by two different observers will usually lead to incorrect
results. We first choose to look at the situation from the viewpoint of an observer
on the ground. Then eqn (3.28a) is an incorrect estimate of the measured muon
lifetime, because it is the lifetime measured by an observer moving with the muon.
The lifetime $t_1'$ measured by an observer stationary on the ground will differ by a
factor $\gamma$, where
$\gamma = (1 - v^2/c^2)^{-1/2} = \{1 - (0.99)^2\}^{-1/2} \approx 7.1$ (3.29a)
so
(a factor $1/\gamma$ times our previous estimate). Hence $e^{-f} = e^{-4.2} \approx 0.015$, an estimate
of the fraction surviving which is in good agreement with the experiment.
The time dilation effect therefore reconciles the theoretical and experimental
results in the Earth's frame; the observations in fact provide an experimental
verification of the time dilation effect. However, a problem is apparent if we
consider the situation from the viewpoint of an observer travelling with the
muon. This is because in that frame there is no time dilation effect for the decay:
the muon is stationary in the observer's reference frame (Fig. 3.41b), and has the
lifetime (3.28a). The previous analysis, which gave an incorrect answer, appears
to apply.
The resolution in this case is provided by remembering that we must apply all
the special relativity results in analyzing our observations. Seen from the muon's
reference frame (Fig. 3.41b), the Earth is approaching at the same speed
v ($|v/c| = 0.99$) as the observer on the Earth measured for the muon, because
both observers agree about the relative rate of approach (see eqn (3.12b)).
However, from this viewpoint the atmosphere is also moving by at high speed, so
the path through the atmosphere is measured to be much shorter because of the
length-contraction effect. In fact, the moving observer would measure the path
through the atmosphere, from creation of the muon until it is hit by the surface of
the Earth, to have a length of $20/\gamma$ km $= 20 \times 0.141 \approx 2.8$ km, instead of the
20 km measured by an observer on the Earth (at rest relative to the atmosphere).
Thus, for the moving observer the muon traverses this path in a time $t_2'$ given by
$t_2' = 2.8\ \text{km}/(0.99 \times 3 \times 10^5\ \text{km/sec}) \approx 9.4 \times 10^{-6}$ sec $= t_2/\gamma$. (3.30a)
Hence, evaluating both terms in (3.28c) in the muon's reference frame,
and we obtain exactly the same result (3.29c) as before. In the muon's reference
frame, we reconcile the theoretical and experimental results by use of the length-
contraction effect, and the experiment serves as a verification of this effect.
This analysis shows very clearly why one must consider length contraction and
time dilation together: they are the same phenomenon seen from different points of
view. From the stationary frame, theory and experiment agree because of time
dilation; from the moving frame, because of length contraction; the analysis would
be inconsistent if only one of the effects occurred. The experimental data for
muon decay serves to verify that both effects occur in the real physical world. The
interested reader will find more details about how the experiment is performed in
the book Special Relativity by A. P. French (published by the MIT Press).
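The arithmetic of the muon example can be reproduced in a few lines. The sketch below uses only the figures quoted in the text (speed 0.99c, a 20 km path, a mean lifetime of 2.2 x 10^-6 sec, and c rounded to 3 x 10^5 km/sec); the variable names are our own. Because the text rounds intermediate values, the unrounded results differ slightly from those quoted.

```python
import math

C_KM_PER_SEC = 3.0e5          # speed of light, as rounded in the text
BETA = 0.99                   # muon speed v/c relative to the Earth
PATH_KM = 20.0                # path through the atmosphere in the Earth's frame
MEAN_LIFETIME_SEC = 2.2e-6    # mean lifetime t1 in the muon's rest frame

gamma = 1.0 / math.sqrt(1.0 - BETA ** 2)

# Naive estimate: Earth-frame flight time divided by rest-frame lifetime (mixes two frames).
t2 = PATH_KM / (BETA * C_KM_PER_SEC)
f_naive = t2 / MEAN_LIFETIME_SEC

# Earth frame: flight time t2, lifetime dilated to gamma * t1.
f_earth = t2 / (gamma * MEAN_LIFETIME_SEC)

# Muon frame: path contracted to PATH_KM / gamma, lifetime t1 unchanged.
t2_muon = (PATH_KM / gamma) / (BETA * C_KM_PER_SEC)
f_muon = t2_muon / MEAN_LIFETIME_SEC

for label, f in [("naive", f_naive), ("Earth frame", f_earth), ("muon frame", f_muon)]:
    print(f"{label:12s} f = {f:6.2f}  surviving fraction = {math.exp(-f):.2e}")
```

The Earth-frame and muon-frame values of f coincide, as they must: one divides the lifetime by $1/\gamma$, the other multiplies the path by $1/\gamma$, and the ratio is the same either way.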
Fig. 3.42 (a) Observer B sees two rockets, A and C, accelerate simultaneously in the same
direction. The distance d between them stays constant because they accelerate identically.
(b) An idealized space-time diagram of the situation as seen by observer B. Very powerful
engines are switched on just before events s and s' and switched off just after these events.
(c) Initially the surfaces of simultaneity of A and C coincide with those of B, but after they
have finished accelerating they are tilted relative to those of B. Thus, just after he has
completed accelerating (at event s), A determines the event f' in C's history (before C
started accelerating) to be simultaneous with s. Thus his measurements show that, at that
instant, C is still to start accelerating.
If the rope still joins the two rockets, it is at rest relative to them; this must then be
its length (measured in its own rest frame). But it is inextensible; it cannot stretch
to this length. It will therefore have broken.
The problem arises when one considers how this can have happened. As
established above, B observed both rockets to accelerate in precisely the same
way. This seems to imply that the distance between them could not change, and
therefore that the rope did not break. They accelerated identically; how can the
distance between them have changed from 400 m (as measured by A initially) to
500 m (as measured by A finally)? Does the rope actually break or not?
As before, the problem is that we have not taken all the relativity effects into
account. The apparent paradox is resolved by considering the relativity of
simultaneity. Specifically, instantaneous surfaces in space-time for A and C
when they are moving at their final speed are tilted relative to their initial surfaces
of instantaneity, which coincide with those of B (see Fig. 3.42c). Thus, consider
events as determined by A. Just before he starts to fire his rocket engine (at the
event s in his history), C is also just about to start his (at the event s'). At this stage,
A and C both measure their distance apart to be 400 m, and they agree about
simultaneity. But, when A has finished firing his engine (just after the event s),
C has not yet started firing his (since A measures s to be simultaneous with the
event f' in C's history, which precedes s'). At this stage, A determines that he is
moving away from C, because he has finished accelerating but C has not yet
begun to accelerate. The distance between them increases and the rope snaps.
C then begins accelerating (just before the event s' in his history, measured by A to
be simultaneous with the event f in A's history). Finally, C ceases accelerating just
after the event s'. Both A and C now measure their distance apart to be 500 m,
and they agree about simultaneity. This explains why their final distance apart
is greater than their initial distance apart, which of course means that the rope
must break.
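The argument can be illustrated with a Lorentz transformation of the two firing events. The sketch below makes several assumptions that should be kept in mind: the acceleration is treated as instantaneous (as in the idealized diagram), A is taken to be the leading rocket, units are chosen with c = 1 so that times are in metres of light travel, and the final speed is inferred from the quoted separations by requiring $\gamma$ = 500/400. The function and variable names are hypothetical.

```python
import math


def lorentz(t, x, beta):
    """Coordinates (t', x') in a frame moving at beta = v/c along +X (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (t - beta * x), g * (x - beta * t)


d_initial = 400.0   # separation in metres before acceleration (from the text)
d_final = 500.0     # separation in metres afterwards (from the text)

# Assumption: the final speed is the one for which gamma = d_final / d_initial.
beta = math.sqrt(1.0 - (d_initial / d_final) ** 2)

# B's frame: A fires at s = (t=0, x=0); C, trailing A, fires at s' = (t=0, x=-d_initial).
t_s, _ = lorentz(0.0, 0.0, beta)
t_s_prime, _ = lorentz(0.0, -d_initial, beta)
delay = t_s_prime - t_s   # positive: in the final rest frame C fires after A

# After the burn both rockets move at beta in B's frame, still d_initial apart there.
# Transform one event on each final world-line to the rockets' final rest frame.
t = 1000.0   # any B-frame time after the burn
_, x_A = lorentz(t, beta * t, beta)
_, x_C = lorentz(t, beta * t - d_initial, beta)
final_separation = x_A - x_C

print(f"v/c = {beta:.3f}, C fires later by {delay:.1f} m of light travel time")
print(f"separation in the final rest frame = {final_separation:.1f} m")
```

Events s and s' are simultaneous for B but not in the rockets' final rest frame, where C fires later than A; in that frame the rockets end up 500 m apart although B still measures 400 m.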
As before, we see that consistency of the special relativity effects depends on
taking all of them into account; the puzzling `paradoxes' of relativity usually
result from ignoring one or other of them. The most difficult to appreciate
initially is the relativity of simultaneity; indeed, a rough rule of thumb is that
when a problem appears particularly paradoxical, it is usually because this effect
has been forgotten.
Exercises
3.19 Particles called pions decay into other particles at a rate such that (when
measured in their rest frame) on average half the number of pions present decay in
$18 \times 10^{-9}$ sec. Suppose now that in a high-energy collision experiment, pions are produced
with a speed 0.99c. How long will it take on average, as measured by a stationary observer,
for half their number to decay? How far will they have travelled in this time? [Compare
with the distance travelled by the muons described in the text.]
3.20 A car 5 metres long drives into a garage 4 metres long at a speed $v = \tfrac{3}{5}c$.
According to a stationary observer, the length of the car will appear to be reduced by a
factor of $1/\gamma$ to 4 metres and so it will fit into the garage exactly. On the other hand the
driver of the car will perceive the length of the garage to be reduced by $1/\gamma$ to 3.2 metres so
the car will not fit in. How would you resolve this apparent paradox? What wording in its
statement is not sufficiently precise?
3.21 Construct a space-time diagram to illustrate the possibility of causal paradox if
tachyons (particles travelling faster than light) were to exist. Observer A is at rest while
observer B moves past at relative speeds c. Draw in the surfaces showing which events are
simultaneous, according to B, with the events on A's world-line at t = 0, 1, 2, 3, 4, 5.
Suppose that at t = 1, A were to send a signal toward B with speed 3 c. Show that B will
determine that A sent the signal at a time when B had already received it. Hence according
to his (radar) measure of instantaneity, B could transmit an answer to the signal before
it had been sent! [Moral: consistency of relativity theory forbids sending signals faster
than light.]
3.22 Our analysis above of Example (b) (tied rockets) referred to instantaneous dis-
tances as deduced by A, B, and C from their surfaces of simultaneity. In practice, they
would measure their separation by radar signals which are not instantaneous measure-
ments; e.g. the surface of simultaneity (sf') would be determined by A from signals sent out
before event s and received after s.
Work out in detail the separation A would measure for C by radar measurements of
distance (cf. Example 3.12; note that this is a lengthy but interesting exercise).
of nuclear energy, and for understanding the processes taking place in the Sun).
The topics dealt with in this section are an important part of special relativity
theory, but are not essential for understanding the nature of space-time geometry
or measurements. Thus, the reader who wishes to concentrate on the geometry of
space-times can omit this section.
A: Mass
Just as we had to be prepared to question all our preconceived ideas about space-
time measurements, so we must also be prepared to revise our ideas about the
basic quantities involved in dynamics. In Newtonian theory, the mass of an object
is a quantity of considerable importance, since the energy and momentum of any
body are proportional to its mass. Thus, the mass of a rocket determines the
amount of energy needed to place it in an orbit around the Earth at a particular
distance; the mass of a meteorite determines the amount of kinetic energy it
dissipates when it crashes into the Moon at a particular speed and forms a new
crater; the masses of elementary particles determine the final speed each attains
after a collision; the mass of a car of given power determines the time it takes to
accelerate from rest to a speed of 100 km/hr.
In Newtonian theory, the mass m of an object is independent of the motion of
the observer who measures it. In relativity theory we must be prepared to ques-
tion whether this is still true or not. Accordingly, we will denote by m0 the mass
measured for an object by an observer when it is at rest relative to him. It will then
be an experimental question whether or not he should still regard its mass as m0,
when the body is in relative motion. It will turn out that the effective relativistic
mass m does indeed depend on relative motion (eqn (3.34) below). The second
important feature is that in Newtonian theory, total mass is conserved in inter-
actions; for example, if 10 kg of hydrogen and 80 kg of oxygen burn to form
water, it is predicted that the mass of water produced will be 90 kg. We shall see
below that mass conservation remains true in relativity theory, but in an extended
sense: mass can be converted to energy and energy to mass; it is the total of mass
and energy that is conserved.
B: Momentum
In Newtonian theory, the momentum of an object is its mass multiplied by its
velocity. The importance of momentum is that it underlies the basic conservation
laws of dynamical motion:
(M1) when no forces act on a body, its momentum is conserved;
(M2) when a collision takes place between particles or massive bodies, the total
momentum of all the objects involved in the collision is conserved.
Consider, for example, a space station of mass 100 tons and a meteorite of mass
50 tons approaching each other. In the reference frame of an inertial observer B
the space station is initially moving in the +X-direction at a speed $\tfrac{1}{10}c$ and the
meteorite in the -X-direction at a speed $\tfrac{1}{2}c$ (Fig. 3.43a). The initial momentum of
the space station is $100 \times \tfrac{1}{10}c = 10c$ (to the right, as implied by the positive sign), and the
initial momentum of the meteorite is $50 \times (-\tfrac{1}{2})c = -25c$ (to the left, as implied by
Fig. 3.43 (a) A space station moves right at $v = \tfrac{1}{10}c$ while a meteorite moves left at
$v = \tfrac{1}{2}c$. (b) After they collide and fuse together, the wreckage moves at speed v' in the
+X-direction.
the negative sign). As no forces are acting on them, by (M1) these momenta stay
constant; they therefore continue approaching each other at constant speeds.
They then collide, generate considerable heat, and fuse together. Let the
wreckage have mass M and speed v' in the +X-direction (Fig. 3.43b). The total
final momentum of the material involved is Mv'. By (M2), this is equal to the
total initial momentum of the space station plus the meteorite, which is
10c + (-25c) = -15c. Thus conservation of momentum tells us Mv' = -15c, so
the final speed is v' = -15c/M. Now in Newtonian theory, total mass is con-
served so the final mass of the wreckage is equal to the mass of the space station
plus the meteorite, i.e. M = 100 + 50 = 150. Thus $v' = -15c/150 = -\tfrac{1}{10}c$; that
is, the wreckage moves to the left at $\tfrac{1}{10}$ the speed of light (v'/c = -0.1).
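The Newtonian bookkeeping can be written out as a small function. This is a minimal sketch with hypothetical names; masses are in tons, velocities are in units of c, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction


def newtonian_fusion(bodies):
    """Fuse bodies given as (mass, velocity in units of c); return (total mass, final velocity)."""
    total_mass = sum(mass for mass, _ in bodies)
    total_momentum = sum(mass * velocity for mass, velocity in bodies)
    return total_mass, total_momentum / total_mass


# Space station: 100 tons at +c/10; meteorite: 50 tons at -c/2 (values from the text).
M, v_final = newtonian_fusion([(100, Fraction(1, 10)), (50, Fraction(-1, 2))])
print(M, v_final)   # prints: 150 -1/10
```

The same function applies to any number of bodies moving along the X axis, since momentum and mass are simply summed.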
In this example, the situation was particularly simple because all motion took
place parallel to the X axis. If the motion is in a general direction, we can write the
velocity vector v in terms of its components $(v_x, v_y, v_z)$ parallel to the X, Y, and Z
axes respectively; then the components $(p_x, p_y, p_z)$ of the momentum vector p
parallel to these axes are given by
$p_x = m v_x, \quad p_y = m v_y, \quad p_z = m v_z$. (3.31a)
We can conveniently combine these three relations in the single vector equation
$\mathbf{p} = m\mathbf{v}$ (3.31b)
giving the momentum p measured by an observer B for a particle of mass m
moving with velocity v. According to Newtonian theory, B will measure
each component (3.31a) of total momentum to be conserved when collisions
take place.
In relativity theory, on examining momentum conservation from a space-time
viewpoint (see Appendix B), it turns out that the quantity conserved is not p but
rather a vector $\boldsymbol{\pi}$, the relativistic three-momentum, defined by
$\boldsymbol{\pi} = m_0\,\gamma(v)\,\mathbf{v}$ (3.32a)
with components
where $m_0$ is a mass associated with the particle (which we later identify as its
`rest mass') and $\gamma(v) = \{1 - (v/c)^2\}^{-1/2}$ (see eqn (3.21)). Given this definition, the
relativity-theory prediction is that momentum $\boldsymbol{\pi}$ is conserved in collisions:
and from this one can work out the effects of collisions in relativity theory almost
identically to the way one does in Newtonian theory.
To see this, consider again the space station and meteorite in the example
above. We naturally assume that the masses stated previously are rest masses.
Relative to the observer B, the $\gamma$-factor for the space station is
$\gamma(\tfrac{1}{10}c) = \{1 - (\tfrac{1}{10})^2\}^{-1/2} = (\tfrac{99}{100})^{-1/2} = 1.005$, so the x component of its initial momentum
is $\pi_x = m_0\gamma(v)v_x = 100 \times 1.005 \times \tfrac{1}{10}c = 10.05c$. The $\gamma$-factor for the meteorite is
$\gamma(\tfrac{1}{2}c) = \{1 - (\tfrac{1}{2})^2\}^{-1/2} = (\tfrac{3}{4})^{-1/2} = 1.155$, so the x component of its initial momentum
is $50 \times 1.155 \times (-\tfrac{1}{2})c = -28.868c$. The total initial momentum is therefore
$10.05c - 28.868c = -18.818c$, which will be equal to the total final momentum,
so
$M_0\,\gamma(v')\,v' = -18.818c$, (*)
where $M_0$ is the rest mass of the wreckage. Completion of the calculation to find
v' demands that we work out the final total mass $M_0$.
According to Newtonian theory, total mass is conserved. Can we generalize
this result in a simple way? This depends on identifying a conserved quantity that
we should call `mass' in relativity theory. Now, on comparing (3.31) and (3.32) it
becomes clear that if we define the mass m of a moving particle by
$m = m_0\,\gamma(v)$ (3.34)
then the Newtonian and relativistic equations both take the same form: the
conserved momentum is given by `momentum = mass x velocity'. Further, given
this definition of a mass m that depends on the velocity relative to the observer
($m_0$ being independent of this velocity), the four-dimensional momentum conservation
equation shows that m is conserved in collisions (Appendix B). From
now on, we refer to m (determined from the rest mass and relative speed by
equation (3.34)) as the `mass' of an object, both because the momentum equa-
tions then preserve their form (cf. (3.31), (3.35)) and because this quantity is
conserved in collisions:
(total initial mass m) = (total final mass m). (3.36)
106 Measurements in flat space-times
When the body is at rest relative to the observer, (3.34) shows m = m0, hence the
name `rest mass' for m0. Clearly m ≥ m0, with m = m0 if and only if the body is at
rest relative to the observer.
Returning to our example, the initial mass of the space station relative to
the observer was m0 γ = 100 × 1.005 = 100.5 tons, and the initial mass of
the meteorite was 50 × 1.155 = 57.75 tons. Thus, the total initial mass was
100.5 + 57.75 = 158.25 tons. Provided that no mass has been lost any other way,
it follows that, by conservation of relativistic mass, this will be the final mass M
also; so
M = M0 γ(v') = 158.25 tons. (**)
Dividing this into the relation (*) above, v'/c = -18.818/158.25 = -0.119; then
substituting this value back into (**) shows M0 = 158.25/γ(0.119c) =
158.25/1.0071 = 157.13 tons, 7 tons more than the total rest mass of the bodies
that collided! The source of the extra rest mass would be conversion of some of
the kinetic energy of the two bodies into mass, as we shall discuss later in this
section. We shall then also see that the collision as discussed so far is over-
simplified; in practice radiation would be given off which we need to take into
account to get the full picture.
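The arithmetic of this example can be checked with a short program. The following is only a sketch: the function names gamma and coalesce are our own illustrative choices, speeds are entered as fractions of c, masses are in tons, and momentum is therefore in units of (tons × c).

```python
import math

def gamma(beta):
    """Lorentz factor for a speed beta = v/c (illustrative helper)."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def coalesce(bodies):
    """bodies: list of (rest_mass, beta) pairs moving along the x axis.

    Returns (beta_final, rest_mass_final) for the single lump of wreckage,
    using conservation of relativistic mass and of momentum.
    """
    total_mass = sum(m0 * gamma(b) for m0, b in bodies)          # relativistic mass
    total_momentum = sum(m0 * gamma(b) * b for m0, b in bodies)  # units of mass x c
    beta_final = total_momentum / total_mass                     # momentum = mass x velocity
    rest_mass_final = total_mass / gamma(beta_final)             # invert m = m0 * gamma
    return beta_final, rest_mass_final

# Space station: 100 tons at +c/10; meteorite: 50 tons at -c/2.
beta_f, M0 = coalesce([(100.0, 0.1), (50.0, -0.5)])
print(f"v'/c = {beta_f:.3f}, M0 = {M0:.2f} tons")
```

Because the program does not round the γ-factors at intermediate steps, its final rest mass may differ from the figure quoted in the text in the second decimal place.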
Exercise 3.23
Consider the example above when the mass of the meteorite is taken to be 20 tons, all
other conditions remaining unchanged. Show that then, according to Newtonian theory,
after the collision the wreckage remains at rest in the rest frame of Observer B, but
according to relativity theory this is not so. What is the final total rest mass in this case?
Fig. 3.44 Acceleration of a charged particle by an electric field (e.g. in a television display
tube). [Diagram labels: E field; charged plates.]
are used daily to produce particle collisions at very high energies, and many
thousands of such collisions have been analysed on the basis of conservation of
relativistic momentum (eqn (3.33)) and mass (eqn (3.36)). The theory enables us to
understand the collisions in each case, so these are among the best-tested laws
in physics.
C: Force
In Newtonian theory, if a force F acts on a body with momentum p the rate of
change of momentum is equal to the force acting: that is,*
This determines the motion of the body when acted on by any force. For example,
the electrons which generate the display in a television set are initially accelerated
from rest by an electric field. To analyse this, note that if a particle with electric
charge e moves non-relativistically in a uniform electric field E, parallel to the
field, the force exerted by the field on the particle will be F = eE (Fig. 3.44). If the
x axis is chosen parallel to the field, then since p = mv the motion of the particle is
determined by the equation eE = m dv/dt for the velocity component v in the x
direction, with solution v = (eE/m)t if it starts from rest. In principle, the particle
can eventually reach arbitrarily high speeds if it moves in a uniform electric field
long enough.
In relativity theory, the same equation of motion is valid; again
*Here dp/dt (the `derivative of p with respect to t') means the rate of change of p as time t evolves;
for example the velocity v is the rate of change of position, v = dx/dt, and acceleration a is rate of
change of velocity, a = dv/dt. If you have not learnt about derivatives in calculus courses, you will
simply have to accept as correct some of the results we quote below.
however, here `momentum' is now the relativistic momentum (3.32). Thus the
equations of motion are
This again determines the motion of a body acted on by any force, but now
correctly takes relativistic effects into account. Reconsidering the example above,
the equation for v now becomes
eE = m0 d{v/(1 - v^2/c^2)^(1/2)}/dt.
This leads to the relation v/{1 - (v/c)^2}^(1/2) = eEt/m0, which can be solved for v/c,
giving the result
v/c = (eEt/m0c)/{1 + (eEt/m0c)^2}^(1/2).
In this case, even an arbitrarily long acceleration period will not enable the
particle to exceed the speed of light (Fig. 3.45).
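To compare the two predictions numerically, one can tabulate the speed against the dimensionless quantity eEt/(m0 c). The sketch below assumes a particle starting from rest; the function names are illustrative, and the relativistic formula is the one obtained by solving v/{1 - (v/c)^2}^(1/2) = eEt/m0 for v/c.

```python
import math

def newtonian_speed(a_t):
    """v/c from v = (eE/m)t, where a_t = eEt/(m0 c)."""
    return a_t

def relativistic_speed(a_t):
    """v/c from v/{1 - (v/c)^2}^(1/2) = eEt/m0, where a_t = eEt/(m0 c)."""
    return a_t / math.sqrt(1.0 + a_t ** 2)

# The Newtonian value grows without bound; the relativistic value approaches 1.
for a_t in (0.1, 1.0, 10.0, 100.0):
    print(f"eEt/(m0 c) = {a_t:6.1f}   Newtonian v/c = {newtonian_speed(a_t):7.2f}"
          f"   relativistic v/c = {relativistic_speed(a_t):.5f}")
```

For small values of eEt/(m0 c) the two columns agree closely, which is the slow-motion limit; for large values only the relativistic column stays below 1.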
Again the question is: does the relativity force-law (3.37b) describe accurately
the effects of a force acting on a particle? The answer is similar to the previous
one: this force law has been tested many thousands of times up to very high
energies in many particle accelerators, and is a very well-established law of
motion in accord with all the experimental data.
Fig. 3.45 The speed of motion v/c of the charged particle as a function of time. No matter
how long the particle accelerates, it does not exceed the speed of light.
will be that expected of a particle of mass m (rather than mo). Hence the effective
mass of a particle moving relative to B will be measured by him to vary with its
relative speed of motion.
Clearly, the form of the momentum and force equations is very similar in
Newtonian theory and relativity theory; indeed, we can regard the only difference
as being that the effective mass m in the relativistic theory depends on the speed of
motion of the body relative to the observer according to formula (3.34), while in
Newtonian theory it is independent of this motion. Despite this close similarity in
form, variation of m with speed v results in a fundamental difference between the
Newtonian and relativistic cases. In Newtonian theory, m is a constant and there
is nothing special about the speed of light. In relativity theory, m is related to v by
(3.34); the relation is shown in Fig. 3.46. The crucial feature is that the effective
mass m diverges as V → ±1 (i.e. as v → ±c) and so the momentum π (given by
(3.35)) diverges then also; a graph of the magnitude of the momentum against the
magnitude of the relative velocity v is given in Fig. 3.47. The consequence is that
as one imparts more and more momentum to an object, either through collisions
or through exerting forces on it, it moves closer and closer to the speed of light as
its momentum increases, but never reaches that speed because the inertial mass
increases without limit and so the force needed to increase its speed by some given
amount also increases without limit. Thus, one cannot accelerate a particle to
faster than light in a particle accelerator, no matter how large the accelerator is
(see Fig. 3.45), nor can one accelerate a rocket to faster than light no matter how
much fuel one burns or how powerful the rocket motor is.
To see this in a specific case, suppose a projectile is moving at v/c = 4/5; then
γ = (1 - 16/25)^(-1/2) = 5/3, so its effective mass is (5/3)m0 and its momentum has magnitude
π = mv = (5/3)m0 (4/5)c = (4/3)m0 c. If its momentum is now doubled, then π = (8/3)m0 c.
Fig. 3.46 A graph of m/m0, the ratio of effective mass m to rest mass m0, against relative
speed of motion V = v/c.
Fig. 3.47 A graph of |π/m0|, the ratio of the magnitude of relativistic momentum to rest
mass, against V = v/c.
π = (8/3)m0 c ⇒ V = v/c = (64/73)^(1/2) = 0.936,
π = (64/3)m0 c ⇒ V = v/c = (4096/4105)^(1/2) = 0.999,
showing that less and less return is gained for each doubling of momentum and
the speed of light is not attained.
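The diminishing return can be tabulated directly. The sketch below is illustrative only: it measures momentum in units of m0 c, and it inverts π = m0 γ(v) v to get V = (π/m0c)/{1 + (π/m0c)^2}^(1/2), a rearrangement we supply here rather than a numbered equation of the text.

```python
import math

def speed_from_momentum(p):
    """V = v/c for a particle whose momentum is p, in units of m0*c."""
    return p / math.sqrt(1.0 + p * p)

# Start from the projectile with V = 4/5 (momentum 4/3 in units of m0*c)
# and keep doubling the momentum.
p = 4.0 / 3.0
speeds = []
for _ in range(5):
    speeds.append(speed_from_momentum(p))
    print(f"momentum = {p:7.3f} m0 c   ->   V = {speeds[-1]:.4f}")
    p *= 2.0
```

Each doubling moves V closer to 1 by a smaller amount than the one before, and no finite momentum gives V = 1.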
Because the effective mass diverges at the speed of light, one cannot by any
physical process accelerate a real object so that its final speed exceeds the speed of
light. Thus, the dynamical theory of special relativity is in agreement with the
basic assumption that the speed of light is a limiting speed for motion of all
massive bodies, and indeed ensures that this condition is fulfilled. There is no
inconsistency between the kinematics and dynamics of special relativity; they form
a consistent whole together, as long as we do not omit any of the relativistic effects.
Computer Exercises
10. Write a program that accepts as input the rest mass MO and speed of motion V1 of a
particle moving relative to an observer B in the X-direction of his reference frame, and
prints out its relativistic mass Ml and momentum P1. Use this program to verify the forms
of Figs 3.46 and 3.47, and so to check that no matter how much momentum may be
imparted to a particle its speed will not exceed the speed of light.
Print out also the slow-motion approximation M1 = M0(1 + (1/2)(V1/c)^2), and find out
for what ranges of V1 this is a good approximation to M1.
11. Write a program that will accept as inputs the rest masses M0(I) and speeds of
motion V1(I) of two particles labelled I (I = 1, 2) which collide in a particle accelerator and
are converted to two new particles (labelled J, J = 3, 4) in this collision, all particles moving
in the X-direction of the chosen coordinate axes. The program should additionally accept
as inputs the measured speeds V2(J) of the product particles, and then calculate and print
out their rest masses M0(J). [Find the total momentum and rest mass of the initial par-
ticles, use the mass and momentum conservation equations, and then solve for the final
rest masses.]
What happens if you enter a value of V2(J) greater than the speed of light? What happens
for a value equal to the speed of light?
Using eqns (3.34) and (3.41) in (3.39) shows that the total energy is
We now define the relativistic kinetic energy EK from the rest-mass energy Eo and
the total energy E by the relation
E=Eo+EK (3.43)
that is, the kinetic energy EK is defined to be precisely that part of the total energy
E due to motion of the body relative to the observer. Using the definitions (3.39),
(3.41) we recover eqn (3.38b) as a relation that is exactly true in relativity theory.
Also using (3.42) and (3.21), eqn (3.43) shows that
EK = m0 c^2 [1/{1 - (v/c)^2}^(1/2) - 1] (3.44)
is the exact relativity expression for kinetic energy. In the case of slow motion this
reduces to EK = (1/2)m0 v^2 + (small terms that may be neglected), thus recovering the
Newtonian expression (1/2)m0 v^2 as a good approximation to the kinetic energy when
(v/c)^2 << 1. When low speeds are involved, we may use either expression for kinetic
energy, for the difference between them will be negligible; when high speeds are
involved, we must use the relativity expressions for energy, or we shall obtain
wrong answers. Summing up, in relativity theory, eqn (3.43) splits the total
energy E into its rest-mass energy E0, that part of the total energy independent of
the motion of the body, and its kinetic energy EK, the part solely dependent on
that motion. Eo is given by (3.41) and EK by (3.44).
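How quickly the two expressions for kinetic energy part company can be seen numerically. The sketch below compares the ratio EK/E0 from (3.44) with the Newtonian ratio (1/2)(v/c)^2; the function names are our own, and the use of log1p and expm1 is simply a choice that keeps the result accurate at very low speeds.

```python
import math

def kinetic_ratio_exact(beta):
    """EK/E0 = gamma - 1 from (3.44), with beta = v/c.

    gamma - 1 is evaluated as expm1(-0.5 * log1p(-beta^2)) to avoid
    loss of precision when beta is very small.
    """
    return math.expm1(-0.5 * math.log1p(-beta ** 2))

def kinetic_ratio_newton(beta):
    """Newtonian (1/2) m0 v^2 divided by the rest energy m0 c^2."""
    return 0.5 * beta ** 2

for beta in (1e-6, 1e-2, 0.1, 0.5, 0.9, 0.99):
    exact = kinetic_ratio_exact(beta)
    newton = kinetic_ratio_newton(beta)
    print(f"v/c = {beta:<8g} exact = {exact:.6e}   Newtonian = {newton:.6e}"
          f"   Newtonian/exact = {newton / exact:.4f}")
```

At everyday speeds the last column is indistinguishable from 1; at high speeds the Newtonian expression increasingly underestimates the kinetic energy.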
These ideas are of fundamental importance in physics. We do not have space to
consider all their implications in detail, but will outline some of the most
important consequences.
Fig. 3.48 Two balls of mass 0.5 kg approach each other, each moving at a speed (1/2)c
relative to an observer B, and then move apart after colliding, each moving at a speed (1/4)c.
[Diagram panels: initial state, V = c/2; final state, V = c/4, rest masses M0.]
If the final rest mass of each ball is M0, the final total energy is
2 M0 c^2 γ(c/4) = 2 M0 c^2 {1 - (1/4)^2}^(-1/2).
showing that zero-rest-mass particles must move at the speed of light, and their
energy and momentum are the same (up to a factor of c required to convert
dimensions between these quantities). Thus, we can consistently conceive of
particles which have finite energy and carry finite momentum for which (3.32),
(3.42), (3.45), and (3.46) hold in a limiting form where (3.48) is true. Such particles
do indeed exist, for example the photon, which is the particle associated with light
(and so must of necessity move at the speed of light, as required by (3.48)). As is
familiar, photons are able to carry energy between distant points, e.g. one can
destroy a satellite in space by use of a suitable laser on the earth which focuses
light on the satellite; the photons which carry the energy from the laser to the
satellite will also carry momentum, so the laser will recoil as it is fired and the
satellite wreckage will be pushed into a new orbit by this momentum.
This kind of effect will occur for every collision involving zero-rest-mass
particles. Suppose for example that a photon with energy 1 MeV collides with a
stationary electron with rest mass 0.511 MeV. After the collision it is observed
that the photon has been deflected through an angle of 45°. Suppose that its
energy and momentum are then E' and π', and those of the electron are E'' and π''.
Now |π'| = E'/c, and conservation of energy gives
1 + 0.511 = E' + E''.
Conservation of momentum shows
π = π' + π'',
where π is the initial momentum of the photon. Rearranging the last equation and
taking the squared magnitude, we obtain
|π - π'|^2 = |π''|^2,
giving
|π|^2 + |π'|^2 - 2π·π' = |π''|^2.
Noting that the magnitude of π is 1/c (from (3.48)) and that π·π' =
|π| |π'| cos 45°, we find that
1 + (E')^2 - 2E' cos 45° = c^2 |π''|^2 = (E'')^2 - (0.511)^2
from (3.45), all energies being measured in MeV. Substitution for E'' from the equation of conservation of energy
leads to
1 + (E')^2 - 2 × (1/2)√2 E' = (1.511 - E')^2 - (0.511)^2.
Solving this equation gives E' = 0.636 MeV and we see that the photon has lost
energy as a result of the collision.
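The final equation is linear in E' once the (E')^2 terms cancel, so it is easily solved by program for any incident energy and deflection angle. The following sketch (function names are illustrative; energies in MeV) solves it in closed form and then checks that the answer satisfies the energy-momentum relation for the recoiling electron.

```python
import math

def scattered_photon_energy(E, rest_energy, angle_deg):
    """Energy E' of a photon of energy E after deflection through angle_deg
    by a stationary particle of rest energy m0 c^2 = rest_energy.

    From E^2 + E'^2 - 2 E E' cos(angle) = (E + rest_energy - E')^2 - rest_energy^2,
    the E'^2 terms cancel, leaving a linear equation for E'.
    """
    cos_a = math.cos(math.radians(angle_deg))
    return E * rest_energy / (rest_energy + E * (1.0 - cos_a))

def electron_check(E, rest_energy, angle_deg):
    """Return (E''^2 - rest_energy^2) - c^2 |pi''|^2, which should vanish."""
    E1 = scattered_photon_energy(E, rest_energy, angle_deg)
    E2 = E + rest_energy - E1                       # conservation of energy
    cos_a = math.cos(math.radians(angle_deg))
    p2_sq = E * E + E1 * E1 - 2.0 * E * E1 * cos_a  # conservation of momentum
    return (E2 * E2 - rest_energy ** 2) - p2_sq

E_prime = scattered_photon_energy(1.0, 0.511, 45.0)
print(f"E' = {E_prime:.3f} MeV")
print(f"residual of electron energy-momentum relation: {electron_check(1.0, 0.511, 45.0):.2e}")
```

Setting the angle to zero returns E' = E, as it should for a photon that is not deflected at all.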
The existence of zero-rest-mass particles leads to significant changes to the
equation of state of matter at very high temperatures, which in turn affects such
features as the equilibrium states of massive stars and the rate of expansion of the
early universe. As a specific example: because the temperatures there are so high,
the interior of a star like the Sun contains vast numbers of high-energy photons.
The gravitational forces trying to cause the sun to collapse are prevented from
doing so primarily by radiation pressure exerted by these photons because of the
momentum they necessarily carry (eqn (3.48)). Thus, this is one of the features
making a long life possible for stars like the Sun; it makes possible the stability of
the Sun, which in turn makes life possible on the Earth.
* 1 a.m.u. = `atomic mass unit' = 1.6605 × 10^-27 kg. Its energy equivalent is 931.5 MeV.
Fig. 3.49 A proton and a neutron fuse together, giving off energy, to form a deuteron
nucleus with mass less than the total mass of the constituent particles. [Diagram labels:
proton (mp) and neutron (mn) approach; binding energy radiated; deuteron nucleus.]
For most nuclei the binding energy per nucleon is about 8 MeV; the nucleus
with the largest binding energy per nucleon is iron, which is therefore the most
stable nucleus. Some of the lighter elements can give off energy by fusion, when
they combine to make heavier elements. A dramatic demonstration is the fusion
(in a series of steps) of hydrogen to helium, releasing the binding energy of the
helium; this is the process occurring in the hydrogen bomb, and is also the main
source of energy in the Sun. This is an immensely important consequence of
special relativity theory, because the Sun is the source of all the energy that
enables life to exist on the Earth. The stars in galaxy NGC 3377 on the cover of
this book would be invisible were it not for the release of fusion energy, not to
mention the fact that neither the photographer nor the reader would exist in the
absence of this process!
Fission Elements heavier than iron have a lower binding energy per nucleon
than iron, and so can give off energy by fission, as they split to make lighter
elements. The most famous example of this is when uranium 235 splits into two
nuclei, giving off the difference in binding energy between the initial uranium
nucleus and the two final nuclei. This is the process occurring in the original
atomic bomb, and the source of energy in many nuclear reactors used to supply
electricity. Thus, the tiny mass differences corresponding to the binding energy of
nuclei have very important consequences in the modern world.
Pair annihilation and creation In pair annihilation, an electron and its anti-
particle,* a positron, annihilate entirely and all their rest mass plus their kinetic
energy is converted to energy carried off in the form of electromagnetic radiation
(in the form of particles of light, i.e. photons). The rest-mass energy of an electron is
m0 c^2 = 0.511 MeV, so an energy greater than 1.022 MeV is released per electron-positron
pair annihilated. This process may be a powerful source of energy in
various astrophysical processes.
*A particle with equal mass but opposite values of other quantities such as charge.
The converse process of pair creation is also possible. If two photons collide
with a total energy greater than the threshold energy of 1.022 MeV, sufficient
energy is present to provide the rest mass of an electron-positron pair, so such a
pair can be created where none existed before. This does not violate the law of
conservation of mass, because energy has been turned into mass, and it is the total
of mass and energy that is conserved in the reaction. Similarly, a single photon of
sufficient energy can create an electron-positron pair, provided there is a nucleus
nearby to allow conservation of energy and momentum. The creation of matter
out of pure radiation is perhaps the most dramatic demonstration of the mass-
energy relation. It has been demonstrated many hundreds of thousands of times
in particle-accelerator experiments; Figure 3.50 shows the creation of an electron-positron
pair in a bubble chamber, where a high-energy photon enters the
chamber from the left and produces the pair near a nucleus which allows bal-
ancing of total energy and momentum. Neither the photon nor the nucleus leaves
a visible track in the chamber, so the tracks of the electron-positron pair seem to
appear out of nothing.
(the equality of E and E0 following from (3.43)). Thus, if an observer moves with a
particle, all he will measure will be its rest-mass energy. Now change to another
frame so that the particle is in relative motion at speed v; equations (3.32) for the
momentum and (3.42) for the mass then follow. The latter provides the basis for
(3.38) and so for (3.39).
Conclusion
We have seen how the application of relativity theory to dynamics leads to the
understanding of many important phenomena: the velocity-dependence of
effective mass, the equivalence of mass and energy, the inertia of all forms of
energy, the concept of `rest mass', and the possibility of converting mass to energy
and vice versa. While many of the consequences of relativity theory are important
only when high speed motion or large distances are involved, some of the
dynamical phenomena are important in everyday life; for example, nuclear fis-
sion is a source of power for many cities at the present time.
Fig. 3.50 Conversion of energy to matter according to Einstein's famous formula
E = mc^2 illustrated by pair production. A very energetic photon provides the energy
needed to create the rest masses of an electron-positron pair. The photon does not make a
visible track, but the tracks of the electron and positron after their creation are visible as
they move to the right in spiral tracks in the Brookhaven bubble chamber. (Photograph
provided by Brookhaven National Laboratory.)
Exercises
3.24 Suppose that a particle moves with speed (i) 10^-6 c (ii) 10^-2 c (iii) (1/2)c. Find in
each case the ratio of the kinetic energy to the rest energy.
3.25 A particle with rest mass M0 and speeds c collides with a stationary particle of rest
mass M0. They coalesce to form a new particle. Find its rest mass and speed of motion.
3.26 A particle of rest mass M0 is moving with speed 13 c at time t = 0. It is subject to a
constant force of magnitude iz M0 c parallel to its direction of motion. Find its speed when
t = 1. How inaccurate would the Newtonian result be in this case?
3.27 Obtain the energy-momentum relation for zero-rest-mass particles as follows. (i)
Square equation (3.32a). (ii) Square equation (3.42). (iii) Obtain equation (3.45), and solve
for E. (iv) Take the limit of this expression as m0 → 0. (v) Obtain (3.46) from (3.45),
and so show the speed v = c allows non-zero values for π and E even though m0 is zero.
[Note that relations (3.32) and (3.42) are indeterminate in this case.]
3.28 The energy received from the Sun at the Earth is 8 × 10^7 erg/cm^2-min on a surface
perpendicular to the rays of the Sun. At what rate (in kg per minute) is hydrogen
consumed in the Sun, in a chain of reactions fusing 4 protons (i.e. hydrogen nuclei)
together to form one helium nucleus, to provide this radiated energy? What does this
suggest about the lifetime of the Sun?
[Hint: (i) Find in a.m.u. the energy released when 4 protons form a helium nucleus (the
mass of a helium nucleus is 4.002603 a.m.u.). Convert this to ergs, using the relation
1 a.m.u. = 1.4916 × 10^-3 ergs. (ii) Find the surface area (4πr^2) of a sphere with radius
equal to the Earth-Sun distance of 1.496 × 10^13 cm. (iii) Find the total energy falling on
this sphere per minute if 8 × 10^7 ergs fall on each cm^2 of the sphere per minute.
(iv) Determine how many fusion events will be needed per minute to produce this energy.
(v) Convert the total mass of hydrogen needed per minute into kilograms. (vi) Estimate the
maximum lifetime of the Sun if all its mass (1.989 × 10^30 kg) is used up in the fusion
process.]
3.29 How much energy is released in the fission of a radium nucleus (consisting of 224
nucleons with an average binding energy of 7.5 MeV per nucleon) into 4 iron nuclei (each
with 56 nucleons with an average binding energy of 8.6 MeV per nucleon)?
3.30 What energy is released in annihilation of a proton-antiproton pair? What is the
threshold energy a pair of photons require in order to create a proton-antiproton pair? [A
proton has a rest mass 1836.1 times larger than the rest mass of an electron.]
Computer-Graphics Exercise 1
Write a program that will draw on the screen the (t, X) axes of an observer A. It should then
contain subroutines to do the following:
(a) Accept as input the speed of motion v of an observer B relative to A, and the spatial
position X0 of B at the time t = 0 as measured by A; and then draw on the screen B's world-line.
[Note: |v/c| < 1.]
(b) Accept as input a time t0 as measured by A, and then indicate the point P in B's
history that corresponds to that time.
(c) Draw a surface of simultaneity for B through any designated point Q on his
world-line.
(d) Draw the future and past light cones of any designated space-time point Q (with
coordinates (t1, X1)).
(e) Draw a series of light rays from A to B emitted at regular time intervals TO, and
print out the interval of reception of signals by B as measured by A [found directly from t-
coordinate values], and as measured by B [determined by use of the K-factor].
(f) Draw a radar signal sent from A to B at some time t1 and reflected back to A.
Use your program to depict (1) an observer A receiving radiation at some instant t0 from a
distant galaxy, indicating clearly the time t1 in the galaxy's history that is observable by A
at the time t0; (2) the front and back ends of a rocket B moving past A, measured by A to be
of length L, showing surfaces of simultaneity for A and for B; (3) observer B moving away
from observer A and then back at speed v, while A keeps track of his motion by radar. [You
should be able to think of many other elaborations and uses for this graphics program.]
4
To examine the unifying ideas set out in Section 3.6, we look in turn at Lorentz
transformations, at simple quantities invariant under these transformations, and
at the invariant interval of flat space-time. We complete our examination of flat
space-times by looking at three universe models based on these space-times.
An understanding of the invariant interval and its meaning (presented in
Section 4.2) is important for a full appreciation of the properties of the curved
space-times discussed in Chapters 6 and 7. From the viewpoint of this book, the
primary importance of the Lorentz-transformation discussion (in Section 4.1) is
that it enables us to prove (in Section 4.2) that the space-time interval is an
invariant, i.e. is the same for all observers.
*See `Navigation between the planets', W. G. Melbourne. Scientific American 234, June 1976, 58.
4.1 The Lorentz transformation 123
writing the latter as (x)2; and similarly for other powers of x that may occur.
However, in the context of using the coordinate names t, X, Y, Z we continue to
denote the squares simply by t2, X2, Y2, Z2, when no confusion will arise. We
refer to A's coordinate system (x^a) determined in this way as his reference frame
and denote it by the single letter F.
B's coordinates Observer B defines his coordinates (t', X', Y', Z') = (x^a') in the
identical way, by use of ideal clocks and radar. Then he has covered space-time
by a second coordinate grid that labels each event P by coordinates (t', X', Y', Z')
(Fig. 4.1b). In this case, the position of a point P is represented by a combination
of a spatial displacement in a surface of instantaneity {t' = constant} (tilted at
some angle α relative to the surface of constant time t) and a time displacement
along a line {X', Y', Z' constant} parallel to B's world-line (tilted at the same
angle α relative to A's world-line, cf. Fig. 3.20). Again one may make these displacements
in either order. We refer to B's coordinate system (x^a') as his reference
frame F'.
Fig. 4.2 The event E has coordinates (t, X) in A's frame and coordinates (t', X') in B's
frame. Lines IE and OD are simultaneous for A, while JE and OF are simultaneous for B.
Observer A measures DE to be at constant distance from his world-line OI, while observer
B measures EF to be at constant distance from his world-line OJ. The lines FG and JH are
parallel to the X-axis, and the lines CF and JK are parallel to the t-axis.
X1 = γ(v)X'. (4.1b)
(d) Because the angle HOJ is equal to the angle COF, we have HJ/HO = FC/CO,
that is, X0/t0 = t1/X1; so, using (4.1c), we obtain
t1 = X0 X1/t0 = V X1. (4.1d)
Hence
where
This is the required set of equations relating two observers' coordinates for the
same event.
As an example of their use, suppose observer B is in a rocket moving uniformly
at speed v = (4/5)c away from observer A at a base on the Earth, both observers
agreeing to measure time from the instant when B passed A at the space-time
event O. After a while B observes a tremendous explosion on a planet he is
observing by radar. He measures the coordinates of the event P where the
explosion took place to be (x^a') = (5, 1, 3, 0), that is, t' = 5, X' = 1, Y' = 3,
Z' = 0. He radios back to A, `Danger! Radioactive debris at position (1, 3, 0)
because of explosion at time t' = 5' (in this example, units are assumed to be years
and light-years). Standard coordinates are based on A's position and motion.
What coordinates should A assign to this event when broadcasting a warning to
other spacecraft?
128 The Lorentz transformation and the invariant interval
In this case, V = 4/5 and so, by (4.4), γ = γ(4/5) = {1 - (4/5)^2}^(-1/2) = 5/3; thus eqns (4.3)
become
t = (5/3)(5 + (4/5) × 1) = (5/3) × (29/5) = 29/3, X = (5/3)(1 + (4/5) × 5) = 25/3, Y = 3, Z = 0, (*)
showing the different coordinates allocated by the two observers to the explosion
at this event.
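A calculation of this kind is easily mechanized. The sketch below assumes the transformation in the form implied by its later uses in this section, t = γ(t' + VX'), X = γ(X' + Vt'), Y = Y', Z = Z', with V = v/c, times in years and distances in light-years; the function names b_to_a and a_to_b are our own. The inverse is obtained by reversing the sign of V.

```python
import math

def gamma(V):
    """Lorentz factor for V = v/c."""
    return 1.0 / math.sqrt(1.0 - V * V)

def b_to_a(V, event):
    """A's coordinates (t, X, Y, Z) from B's (t', X', Y', Z').

    Assumed form: t = gamma (t' + V X'), X = gamma (X' + V t'),
    with B moving at speed V in A's +X direction.
    """
    t_p, X_p, Y_p, Z_p = event
    g = gamma(V)
    return (g * (t_p + V * X_p), g * (X_p + V * t_p), Y_p, Z_p)

def a_to_b(V, event):
    """Inverse transformation: the same formulae with V replaced by -V."""
    return b_to_a(-V, event)

explosion_b = (5.0, 1.0, 3.0, 0.0)          # as measured by B
explosion_a = b_to_a(0.8, explosion_b)      # as A should broadcast it
print("A's coordinates:", explosion_a)
print("back in B's frame:", a_to_b(0.8, explosion_a))
```

Transforming to A's frame and back again recovers B's original coordinates (up to rounding), which is a useful check on any implementation.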
showing the different coordinates allocated by the two observers to the same
event P. This result is of course just the inverse of (*) above, as it must be because
(4.5) is the inverse of (4.3).
Equations (4.5) are identical in form to (4.3) except for the minus sign pre-
ceding v. The reason for the sign is as follows: according to A, the origin of B's
reference frame (the point X' = Y' = Z' = 0) moves in the positive x-direction at
a speed v; according to B, the origin of A's reference frame (the point X = Y =
Z = 0) moves in the negative x'-direction at the same speed. The Lorentz trans-
formation formulae (4.3, 5) are valid when A and B are approaching each other
for t, t' < 0 and receding from each other for t, t' > 0, i.e. when B moves in the
+x-direction as measured by A. However, we would find the opposite sign for v
if we calculated the result for B moving in the negative x-direction relative to A.
Therefore, the convention for the sign of v implied in the Lorentz transformation
formula is that v will be positive when the relative motion is in the +x-direction
and negative when it is in the -x-direction. Given this understanding (which is
different from that implied in the K-factor formulae above in Sections 3.1 and
3.2), eqns (4.5) are precisely what we would expect to determine B's coordinates
from those of A. This equivalence of the formulae is a direct consequence of the
relativity principle (each observer is equivalent to the other, so there must be no
essential difference in the transformation formulae between them).
Fig. 4.3 (a) A point Q on B's world-line has coordinates (t, X) in A's frame and
coordinates (t', 0) in B's frame; t and t' are related by the time dilation factor. (b) A point R
on B's surface of events simultaneous with the origin O has coordinates (t, X) in A's frame
and (0, X') in B's frame; X and X' are related by the length contraction factor. (c) For both
observers A and B, light rays have the same speed; according to A their equation is
t = x/c = X, and according to B it is t' = x'/c = X'.
x = γ(v)x', (4.8a)
t = γ(v)vx'/c^2 = vx/c^2. (4.8b)
The first result relates the distance from O to R as measured by A to the same
distance measured by B, and shows the standard length-contraction effect (3.23)
for B moving past A. The second is the formula (3.17) giving A's coordinates for
an arbitrary point R in B's surface of simultaneity with O. For example, if v = (4/5)c
so that γ = 5/3, a length of 1 light-year measured instantaneously by B (x' = 1) for
an object at rest relative to him parallel to v will correspond by (4.8a) to a length of
(5/3) × 1 = 5/3 light-years measured by A; also by (4.8b), A and B will disagree about
simultaneity over that distance by an amount (v/c)(x/c) = (4/5) × (5/3) = 4/3 years.
Again the reciprocal results follow on setting t = 0 in (4.5).
D: Invariance of the speed of light Thirdly, both observers should agree
about motion at the speed of light. We check this for light moving in the
+x-direction by setting t' = +x'/c in (4.3), finding x = γ(v)x'(1 + v/c) and t =
γ(v)(x'/c)(1 + v/c) = x/c; thus
t' = +x'/c ⇒ t = +x/c, (4.9a)
confirming that if B measures the speed of light in the +x-direction, then A agrees
(Fig. 4.3c). For example, if v = (4/5)c so that γ = 5/3 then, after 1 year, B will measure
light emitted at O (t' = 0, X' = 0) to be at event P (t' = 1, X' = 1). A will
determine the coordinates for event O to be {t = 0, X = 0} and the coordinates
for the event P to be {t = (5/3) × 1 × (9/5) = 3 years, X = (5/3) × 1 × (9/5) = 3 light-years},
confirming that A measures this light to be travelling at the speed c. Similarly,
showing that A agrees with B about the speed of light in the -x-direction.
E. Relativistic velocity addition Fourthly, suppose a third observer C moves
past B at a speed v' in the +x'-direction (Fig. 4.4). Let C's coordinates
(t", x", y", z ") be aligned in the standard way. Then (applying the results above
for B observing C) B's coordinates are related to C's by
where
γ(v') = {1 - (v')^2/c^2}^(-1/2). (4.11)
Fig. 4.4 Three observers in relative motion: A has coordinate system (t, X, Y, Z);
observer B has coordinate system (t', X', Y', Z') and is moving at speed v relative to A
in the X-direction; C has coordinates (t'', X'', Y'', Z'') and is moving with speed v' relative
to B in the X'-direction (which is parallel to the X-direction).
Now, the relation between A's and C's coordinates should again be a Lorentz
transformation of the form (4.3), because A and C are just two inertial observers
whose coordinates are related in the standard way. Indeed this is so: one can
substitute (4.10, 11) into (4.3, 5) and simplify, obtaining eventually (after some
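tedious algebra)
$$t = \gamma(v'')(t'' + v''x''/c^2), \quad x = \gamma(v'')(x'' + v''t''), \quad y = y'', \quad z = z'', \tag{4.12}$$
where
$$\gamma(v'') = \{1 - (v'')^2/c^2\}^{-1/2}, \tag{4.13}$$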
the quantity v" being defined by
v" = (v+v')/(1 +vv'/c2). (4.14)
This indeed shows that measurements made by A and C are related in the stan-
dard way, with the relative velocity of A and C given by (4.14). Thus, we have
confirmed that if A measures B to move at a speed v in the x-direction, and B
measures C to move at a speed v' in the (parallel) x'-direction, then A measures
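C to move at a speed v" in the x-direction where v" is given by the special-relativity
velocity addition formula (3.15). For example, if $v = \tfrac{2}{5}c$ and $v' = \tfrac{2}{5}c$, then
$1 + vv'/c^2 = 1 + \tfrac{2}{5} \times \tfrac{2}{5} = \tfrac{29}{25}$, so $v'' = (\tfrac{2}{5} + \tfrac{2}{5})c \times \tfrac{25}{29} = \tfrac{20}{29}c$ (which is less than
$\tfrac{4}{5}c$, as required).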
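The velocity addition law (4.14) is easy to experiment with numerically. The following is a minimal sketch, not part of the original text: the function name `add_velocities` is our own, speeds are expressed as fractions of c (so c = 1), and the sample values v = v' = 2/5 are chosen for illustration.

```python
from fractions import Fraction


def add_velocities(v, v_prime):
    """Compose two parallel velocities, each given as a fraction of c (eqn 4.14 with c = 1)."""
    return (v + v_prime) / (1 + v * v_prime)


v = Fraction(2, 5)        # speed of B relative to A, in units of c (illustrative value)
v_prime = Fraction(2, 5)  # speed of C relative to B, in units of c (illustrative value)

v_total = add_velocities(v, v_prime)
print(v_total)                # 20/29
print(v_total < v + v_prime)  # True: less than the Newtonian sum 4/5
```

Using exact fractions rather than floating-point numbers keeps the arithmetic identical to a hand calculation; composing any speed with the speed of light (`Fraction(1)`) returns 1 again.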
4.1 The Lorentz transformation 133
Reprise
We have now determined the Lorentz-transformation equations (4.3) and (4.5)
relating the measurements made by two observers with different velocities in the
x-direction, and verified that we can derive the standard kinematic results of
special relativity from them, thus confirming that these formulae do indeed
encapsulate in a compact form the kinematics of special relativity. In the next
section we turn to looking for quantities invariant under Lorentz transforma-
tions; the present section concludes by giving a worked example of the use of the
Lorentz transformation, and showing how these transformations may be viewed
in an active sense rather than the passive sense used so far. This is useful later on in
constructing simple universe models.
An example
Suppose that a rocket of length 100 m travels horizontally above the ground at a
speed of $10^7$ m/sec. At a certain moment, a light signal is emitted from the front
end of the rocket. Let us compare the times the signal takes to reach the tail end of
the rocket according to (i) an observer travelling on the rocket, and (ii) a stationary
observer on the ground.
For the observer travelling in the rocket, the length of the rocket is of course
100 m. Thus (i) the time taken is this length divided by the speed of light, i.e.
100 m/($3 \times 10^8$ m/sec) = $0.33 \times 10^{-6}$ sec. (ii) Suppose that in this observer's
reference frame the light is emitted at event A with coordinates $t_A = X_A = 0$, and
received at event B with time $t_B$ as calculated above and distance $X_B = 100$ m.
Then, in the frame of the stationary observer, who moves with the relative speed
$v = 10^7$ m/sec in the x-direction, eqn (4.5a) gives
$$t'_A = 0,$$
$$t'_B = \gamma(v)(t_B - vX_B/c^2) = (1 - v/c)\,t_B\,(1 - v^2/c^2)^{-1/2} = t_B\{(1 - v/c)/(1 + v/c)\}^{1/2}$$
$$\quad = t_B\{(1 - \tfrac{1}{30})/(1 + \tfrac{1}{30})\}^{1/2} = (\tfrac{29}{31})^{1/2}\,t_B = 0.32 \times 10^{-6}\ \text{sec}.$$
Notice that the result cannot be obtained naively from either the length con-
traction factor or the time dilation factor, because neither the length of the rocket
nor the rate of a moving clock is directly at issue.
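The arithmetic of this example can be checked with a few lines of code. This is a sketch under the same assumptions as the example (c rounded to 3 × 10^8 m/sec, rocket length 100 m, speed 10^7 m/sec); the variable names are our own.

```python
import math

C = 3.0e8  # speed of light in m/sec, rounded as in the worked example


def gamma(v):
    """Lorentz factor for speed v in m/sec."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


length = 100.0  # rocket length in its own rest frame, in metres
v = 1.0e7       # relative speed of the two frames, in m/sec

# (i) Rocket frame: light covers the rocket's rest length.
t_B = length / C
x_B = length

# (ii) Ground frame: time coordinate of the reception event from eqn (4.5a).
t_B_ground = gamma(v) * (t_B - v * x_B / C ** 2)

# The same result written as t_B * sqrt((1 - v/c) / (1 + v/c)).
t_B_closed_form = t_B * math.sqrt((1 - v / C) / (1 + v / C))

print(f"rocket frame: {t_B:.3e} sec")
print(f"ground frame: {t_B_ground:.3e} sec")
```

The two ground-frame expressions agree to rounding error, and neither equals the rocket-frame time multiplied or divided by γ alone.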
Active transformations
So far, we have regarded the Lorentz transformation in a passive sense: it relates
the reference frames of different observers, and so determines how their different
coordinates for the same event are related. However, it can also be regarded in
an active sense. To see how this works, consider first an ordinary rotation of axes
in Euclidean 2-space (Fig. 4.5a). Changing from reference frame F (with coordinates
(x, y)) to reference frame F' (with coordinates (x', y')), we find the
Fig. 4.5 (a) A rotation of axes in the Euclidean two-plane changes the coordinates (X, Y)
of the point P to coordinates (X', Y'). This is a passive transformation: the points in the
space remain fixed, but the reference frame changes. (b) In an active rotation, the point P
moves with the axes and coordinates to a new point P'. (c) The image point P' has the same
coordinates relative to the new coordinate system, as the initial point P had relative to
the old coordinates. (d) The movement of points in the Euclidean plane generated by an
active rotation.
according to A (cf. eqn (4.7)). This event was initially a unit time along both
A's and B's axes. Thus, if we take the event Q at {t = 1, x = y = z = 0} in
A's frame and give it a boost through +v, it will end up at Q' with coordinates
(4.15). By this construction, it is clear that length and time measurements are
preserved under an active Lorentz transformation (e.g. a unit time measurement
in B's frame remains a unit time measurement, as the boost is performed). From
this viewpoint, this is in fact the defining property of Lorentz transformations,
which move points in space-time as shown in Fig. 4.6b.
If we keep on repeating the boost for a particular relative velocity v, we will
get an infinite series of frames each related to the previous one by (4.3),
Fig. 4.6 An active Lorentz transformation moving the points of two-dimensional flat
space-time into each other. (a) The effect of a boost on specific points P, Q, R, moving each
point (t,X) into a new point with new coordinates t' = t, X' = X. (b) The pattern of
motion generated by the boost (this is the exact analogue of Fig. 4.5d).
Fig. 4.7 The effect of a repeated series of boosts on the unit time-like and space-like
vectors along the axes of the reference frame of an observer A. The image vectors can be
thought of as the unit time and space vectors along the axes of the reference frames of a
series of relatively moving observers. They define the surfaces at unit time and unit spatial
distance from the origin 0.
This surface enables one to compare the units of time on different lines through
the origin representing the uniform motion of particles at different speeds, all
passing O at time t = 0. Similarly, repeatedly boosting the displacement {t = 0,
x = 1, y = 0, z = 0} representing a unit spatial displacement will give a series of
vectors representing instantaneous unit spatial measurements by this family of
observers, defining a surface at unit spatial distance from 0 (measured along the
straight line from 0). This surface enables one to compare the units of spatial
distance along different space-like lines all passing through the origin. These two
surfaces are the space-time equivalent of a unit circle in the Euclidean plane
(since that is the surface at constant unit distance from the origin 0, measured
along the straight line from O; there is in that case no distinction between time-like
and space-like curves or measurements). Figure 4.7 also displays how, as the
relative velocity v tends towards c, the frames of the other observers (viewed
from A) appear to collapse towards the light cone. This is a consequence of the
limiting nature of the speed of light in special relativity.
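The behaviour shown in Fig. 4.7 can be reproduced numerically. The sketch below is not from the text: the function name `active_boost`, the choice V = 3/5 (so that γ = 5/4 is rational) and the use of units with c = 1 are our own assumptions. It repeatedly boosts the unit time vector {t = 1, x = 0} and prints, for each image, its coordinates, the speed x/t of the corresponding observer, and the quantity -t² + x².

```python
from fractions import Fraction


def active_boost(event, V, gamma):
    """Move the event (t, x) by an active boost through +V (units with c = 1)."""
    t, x = event
    return (gamma * (t + V * x), gamma * (x + V * t))


V = Fraction(3, 5)      # boost speed as a fraction of c (illustrative choice)
GAMMA = Fraction(5, 4)  # the matching Lorentz factor (1 - V^2)^(-1/2)

event = (Fraction(1), Fraction(0))  # unit time vector along A's time axis
history = [event]
for _ in range(4):
    event = active_boost(event, V, GAMMA)
    history.append(event)

for t, x in history:
    # columns: t, x, speed x/t seen by A, and -t^2 + x^2
    print(t, x, x / t, -t ** 2 + x ** 2)
```

Every image keeps -t² + x² = -1, while the speed x/t climbs towards, but never reaches, 1: the image vectors crowd towards the light cone.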
Exercises
4.1 Deduce explicitly transformation (**) following (4.5) from the general Lorentz
transformation formula (4.5).
4.2 Consider two events A and B defined in some frame of reference by coordinates
$t_A = X_A = Y_A = Z_A = 0$ and $t_B = 1$, $X_B = 2c$, $Y_B = Z_B = 0$. What are their coordinates in
a frame moving with speed c in the x direction relative to the first frame? What has
happened to their time ordering on transformation between the two frames? What aspect
of the relationship between A and B makes this feature possible?
4.3 Suppose that two events are connected by a time-like line in one reference frame.
Show that their time order is the same in all reference frames.
4.4 A passenger on a train moving with speed v watches a girl stationary on the ground
throw a ball at speed 2v at an angle of 60° to the horizontal, in the direction parallel to the
train's motion. According to the girl the path of the ball is given by
$$x = vt, \quad y = \sqrt{3}\,vt - \tfrac{1}{2}gt^2,$$
where x and y measure horizontal and vertical distances. Find the path according to the
passenger on the train.
4.5 A spaceship with a top speed of c pursues one with a top speed of c. An observer
on a nearby planet observes them to be one s light-year apart. How much later,
s according to
the observer on the planet, will the slower one be caught? What will this time difference be
(i) according to an observer on the slower spaceship, (ii) according to an observer on the
faster ship?
4.6 Apply a boost with parameter v to the following events described by their (t, X)
coordinates: (a) (-1, 2), (b) (0, $\sqrt{3}$), (c) (1, 2), (d) (e) (1, 1), (f) (2, -1),
(g) ($\sqrt{3}$, 0), (h) (2, 1). Plot the old and new points on a space-time diagram for v = 5c/13,
and draw in what lines of constant distance from the origin are needed to show the effect of
the boost on these points.
4.7 The group property [this example presumes you know the mathematical definition
of a group]. Show that a combination of any number of Lorentz transformations of the
standard form (with parallel velocities) will lead to a final Lorentz transformation of the
same form, for some appropriate velocity. Consider, for example, a family of observers A1,
A2, A3, ..., each moving at the speed v relative to the previous member of the family (A2
moves at a speed v relative to A1; A3 moves at a speed v relative to A2; and so on). The
resulting series of coordinate axes are shown in Fig. 4.7. This shows the unit time-like
vectors (from the origin {t = 0, x = 0} to the point {t = 1, x = 0} on each observer's
world-line) and space-like vectors (from the origin {t = 0, x = 0} to the point
{t = 0, x/c = 1} in each rest frame) for this family of observers, as seen from A's reference
frame. Then every pair of reference frames in this family is related by a Lorentz trans-
formation of the form given in eqn (4.3), with v replaced by the relevant value for the
relative velocity (derived by repeating the relativity velocity addition law the appropriate
number of times). The identity transformation is a Lorentz transformation (put v = 0 in
(4.3)), and the inverse transformation to any Lorentz transformation is also a Lorentz
transformation (in fact (4.5) is the inverse of (4.3)). Prove that, together with the
composition property discussed above, the (t, x) Lorentz transformations form a group of
transformations.
Computer Exercise 12
Write a program that will accept as input (a) a speed V(= v/c), (b) coordinates (t, x, y, z)
of a point P measured by an observer A, and will print as output coordinates (t', x', y', z')
of P measured by an observer B, given by the Lorentz-transformation equations (4.5).
Make sure that your program allows repeated Lorentz transformations, i.e. having made
one transformation, unless new data is fed in the output of the previous transformation
is automatically the input for the next one.
Get your program (c) to print out additionally the result of the Newtonian transfor-
mations (4.6), and so experiment to see when these are a good approximation to the
Lorentz transformation; (d) to print out the quantity $Q = -t^2 + X^2 + Y^2 + Z^2$. What is
the change in this quantity each time you perform a Lorentz transformation?
4.2 Space-time separation invariants 139
Computer Graphics Exercise 2
Write a program that draws a set of axes (t, X), and then shows the effect on a space-time
diagram of moving a chosen point P with coordinates (t, X) to the point P' with coordi-
nates (t', X') given by the Lorentz-transformation formula (4.5) for a specified speed
V(= v/c) [arrange for an arrow to be drawn on the screen from P to the new point P'; this
exercise regards the Lorentz transformation as an active transformation, but you may use
the program from Computer exercise 12 to perform your calculations]. Try the effect of
repeated transformations on the points (1) t = 1, X = 0; (2) t = 1, X = 1; (3) t = 0, X = 1;
(4) t = -1, X = -1.
Modify the program (a) to show the effect of the transformation in moving several
chosen points simultaneously; (b) to show its effect on a line through the origin, as follows:
given a specification of a point Q, (i) draw the straight line through the origin 0 (t = 0,
X = 0) and Q; (ii) mark off on this line the series of points $Q_i$ where $Q_1$ is Q, the point $Q_2$ is
twice as far from O along the line as Q, the point $Q_3$ is three times as far from O along this
line, etc., until the edge of your diagram has been reached; (iii) show the effect of the
transformation on all of the points $Q_i$, and draw the new straight line through the origin
that they move into. Try this program on the set of points (1)-(4) listed above.
$$S^2 = -t^2 + X^2 + Y^2 + Z^2. \tag{4.16}$$
(It is important to note here that although this is written, for historical reasons, as
`S squared', it is not necessarily positive. This will become clear in the following
discussion.) When an observer B using coordinates (t', X', Y', Z') evaluates this
quantity, by its definition (4.16) he will evaluate
$$S'^2 = -t'^2 + X'^2 + Y'^2 + Z'^2 = -\gamma^2(t - VX)^2 + \gamma^2(X - Vt)^2 + Y^2 + Z^2, \tag{4.17}$$
where we have set V = v/c. On multiplying out and using the expression (4.4) for
$\gamma$, this becomes
$$S'^2 = -t^2 + X^2 + Y^2 + Z^2 = S^2; \tag{4.18}$$
that is, both observers obtain the same value for this expression (whatever their
speed of relative motion). Thus $S^2$ is an invariant under change of velocity in the
X-direction. It is also invariant under any spatial rotation of the X, Y, Z axes,
because t and $X^2 + Y^2 + Z^2$ are separately invariant under such rotations. It is
therefore invariant under any velocity change whatever (a spatial rotation can
bring any change of velocity to a change of velocity in the x-direction); so
$S^2$ is an invariant: it will be found to have the same value by all inertial observers.
As an example, suppose in A's coordinate system an event P is given by
$(x^a) = (\tfrac{29}{3}, \tfrac{25}{3}, 3, 0)$; then $S^2 = -(\tfrac{29}{3})^2 + (\tfrac{25}{3})^2 + 3^2 + 0^2 = -\tfrac{841}{9} + \tfrac{625}{9} + 9 = -15$.
Now if B moves relative to A at a speed $v = \tfrac{4}{5}c$, then B's coordinates for the event
P will be $(x^{a'}) = (5, 1, 3, 0)$ (see eqn (**) in the previous section). Thus the value of
$S^2$ calculated by B is $S'^2 = -(5)^2 + (1)^2 + (3)^2 + (0)^2 = -25 + 1 + 9 = -15$, the
same value as before, confirming that $S^2$ is an invariant in this particular case.
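A check of this kind is easily automated. The sketch below is ours rather than the text's: the function names `boost_x` and `interval` are hypothetical, units are chosen so that c = 1, and the input values (A's coordinates (29/3, 25/3, 3, 0) with V = 4/5, γ = 5/3) are an assumption chosen to be consistent with B's coordinates (5, 1, 3, 0) and the value -15 quoted above.

```python
from fractions import Fraction


def boost_x(event, V, gamma):
    """Coordinates assigned by an observer moving at speed V along x (eqn 4.5, c = 1)."""
    assert gamma ** 2 * (1 - V ** 2) == 1, "gamma must equal (1 - V^2)^(-1/2)"
    t, x, y, z = event
    return (gamma * (t - V * x), gamma * (x - V * t), y, z)


def interval(event):
    """The quantity S^2 = -t^2 + X^2 + Y^2 + Z^2 of eqn (4.16)."""
    t, x, y, z = event
    return -t ** 2 + x ** 2 + y ** 2 + z ** 2


# Assumed sample values (see the lead-in above).
event_A = (Fraction(29, 3), Fraction(25, 3), Fraction(3), Fraction(0))
event_B = boost_x(event_A, V=Fraction(4, 5), gamma=Fraction(5, 3))

print(interval(event_A), interval(event_B))  # -15 -15
```

Exact rational arithmetic is used so that equality of the two values is exact rather than approximate.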
When we remember that many other quantities we previously believed to be
invariant have turned out not to be so, it is clear that this quantity must have some
special meaning. What is the meaning? It is just `the square of the space-time
distance' from the origin 0 with coordinates (0, 0, 0, 0) to the point P with
coordinates (t, X, Y, Z). Thus it is a natural generalization to the space-time
situation of $r^2 = x^2 + y^2 + z^2$, the square of the spatial distance from the origin O
with coordinates (0, 0, 0) to the point P with coordinates (x, y, z) in Euclidean
three-dimensional space. However, there is an important difference: $r^2$ is non-negative,
$r^2 \geq 0$, but because of the minus sign in (4.16), $S^2$ may take negative,
positive, or zero values, with slightly different interpretations in each case. We
shall consider them in turn. In examining these meanings, it is convenient to
rewrite (4.16) in the form
$$S^2 = -t^2 + R^2, \tag{4.19}$$
Fig. 4.8 (a) The set of points seen by an observer A to satisfy $S^2 = -T^2$. An observer B
moving relative to A measures the time from the origin O to this surface to be T.
(b) The same situation drawn from the viewpoint of observer B.
$$S'^2 = S^2 = -T^2. \tag{4.20b}$$
Equations (4.20) show that t' = T (time goes the same way for A and B, so we
reject the alternative solution t' = -T). That is, in terms of his standard coor-
dinates (t', X', Y', Z'), B will assign coordinates (T, 0, 0, 0) to P. But this means
that he will measure a time T from O to P, using a standard ideal clock (since this is
just the meaning of the coordinate t'). This will of course also be true in Fig. 4.8a,
which represents the identical space-time situation. Hence, each point P on the
surface $S^2 = -T^2$ will be found to be a time T away from the origin O by an
observer moving inertially from 0 to P. This is thus the set of points in space-time
at proper time T from 0 (to the future of 0), as measured by inertial observers.
Similarly if t < 0 we find the surfaces at proper time T from 0 to the past.
With this understanding, it is easy to use this invariant to work out proper-time
measurements made by inertial observers. For example, consider the point P
measured by A to be at {t = 5 sec, X = 3 light-sec, Y = Z = 0}. Then he calculates
$S^2 = -5^2 + 3^2 = -25 + 9 = -16 = -4^2$, so P lies in the surface `proper time = 4
seconds' to the future of O. Thus, an observer B moving from O to P will move in
the +x-direction at speed $v/c = V = X/t = \tfrac{3}{5}$ relative to A and will measure a
proper time of 4 seconds from O to P.
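The same calculation in code form, as a small sketch: the function name `proper_time_from_origin` is our own, and times and distances are in seconds and light-seconds so that c = 1.

```python
import math


def proper_time_from_origin(t, x, y=0.0, z=0.0):
    """Proper time from the origin to a time-like separated event (units with c = 1)."""
    s2 = -t ** 2 + x ** 2 + y ** 2 + z ** 2
    if s2 >= 0:
        raise ValueError("event is not time-like separated from the origin")
    return math.sqrt(-s2)


t, x = 5.0, 3.0  # A's coordinates for P: 5 sec, 3 light-sec
print(proper_time_from_origin(t, x))  # 4.0
print(x / t)                          # 0.6, the speed of B relative to A in units of c
```

The function refuses events with S² ≥ 0, since no inertial observer can travel from the origin to such an event.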
A particular case of significance is when T = 1; we obtain then the surface
$S^2 = -1$, representing unit proper time from O. This surface provides the
information we have needed all along but lacked in order to compare proper
times measured along different world-lines: where each world-line intersects this
surface establishes a unit time interval along that world-line, from which all other
time measurements along that world-line are obtained by simple scaling (two time
units will be twice as far from 0 along the line, and so on; see Fig. 4.9). Thus, this
surface calibrates the way time is measured along different inertial world-lines
through 0 (which is not immediately obvious from the relative distances on the
diagram, as has been emphasized all along). We can thus categorize the surfaces
{$S^2$ = constant < 0} as `1 second', `2 seconds', etc., representing graphically the
relation between proper times along the different world-lines.
The invariant $S^2$ also helps interpret the meaning of Fig. 4.7: if a series of boosts
is applied to the unit time-vector in any observer's reference frame, it will remain a
unit vector and so will always join the origin O to a point P in the surface $S^2 = -1$;
indeed all the arrows in Fig. 4.7 do lie in such a surface (cf. Fig. 4.9). Similarly, it
allows us to see easily what event in A's history corresponds to the time t' measured
by a moving observer B, so providing the last bit of information missing in
our discussion of time dilation (Fig. 4.10). In particular this makes clear that a
given proper time along the stationary observer's axis will be represented in these
diagrams by a longer distance along the moving observer's world-line.
Fig. 4.9 Surfaces in space-time at one unit and two units of proper time from the origin.
Fig. 4.10 To find what event in A's history corresponds to the time t' in B's history, we
draw the surface $S^2 = -t'^2$. Observer B's world-line intersects this at the event P, which is
at the time t in A's frame.
Fig. 4.11 (a) The set of events seen by observer A to satisfy $S^2 = D^2$. An observer B
moving relative to A measures the event Q, simultaneous with the origin 0 in his reference
frame, to be at a distance D from 0. (b) The same situation redrawn in the reference frame
of observer B.
Positive values of $S^2$ Suppose $S^2 > 0$. Then there is some positive real number
D such that $S^2 = D^2$. Consider the set of points given by $S^2 = D^2$ viewed from the
frame of an observer A (Fig. 4.11a), and any event Q on this surface. Then there is
a straight line from the origin O to Q; we rotate the spatial axes so that y and z are
constant along this line, i.e. so that its spatial direction is the x-direction. By
eqn (4.19), A finds the time t and distance R of Q from O to be related by
$$S^2 > 0 \;\Rightarrow\; R^2 > t^2 \;\Rightarrow\; V^2 = (R/t)^2 > 1;$$
that is, the straight line OQ from 0 to Q represents motion at greater than the
speed of light relative to A. Therefore OQ cannot be the world-line of any
observer B moving inertially between 0 and Q. In the (t, X) plane, this line will
make some angle a with the horizontal axis (Fig. 4.11 a); the line at the angle a
from the vertical axis towards Q is then the world-line of an observer B for whom
the events 0 and Q are simultaneous, i.e. the line OQ lies in his surface of
instantaneity. Change to B's frame of reference by a suitable Lorentz transformation;
then the events O and Q will both lie in his surface of instantaneity $\{t' = 0\}$
(Fig. 4.11b). His coordinates (t', X', Y', Z') for Q will then be (0, X', 0, 0).
Evaluating $S'^2$ for this point shows $S'^2 = X'^2$. But, since this is an invariant,
Fig. 4.12 Surfaces at one unit and two units of spatial distance from the origin 0.
that is, the straight line OL from 0 to L represents motion at the speed of light
relative to A. Thus this surface is just the light cone measured by A for the event
O. Since $S^2$ is invariant, any other observer B will also find $S'^2 = 0$: this set of events
will also be the light cone he determines for the event O. That is, invariance of
$S^2 = 0$ for different observers is just Einstein's principle of the invariance of the
speed of light for all observers.
Fig. 4.13 A rigid rod, stationary relative to the observer A, has end-points u and w. It is
measured by the relatively moving observer B to have a length X'. To find the length the
observer A will measure for the rod, we draw the surface $S^2 = X'^2$. This intersects B's
surface of simultaneity through 0 at Q, which is a distance X from the origin in A's
reference frame. Therefore A measures the length of the rod to be X.
Fig. 4.14 The surfaces {S2 = constant} at constant space-time distance from the origin 0
drawn in a space-time diagram. The surfaces S2 = 0 are the light cone of the origin.
Summary All observers will agree on the value of the invariant $S^2$. The
surfaces $S^2$ = constant are drawn in Fig. 4.14; they represent proper times from
O (when $S^2$ is negative), instantaneous spatial distances from O (when $S^2$ is
positive), and the light cone C+(O) of O (when $S^2$ is zero). It is convenient to
refer to the latter as being at zero (space-time) distance from 0, for the
following reason. Taking the limit as a point Q approaches C+(O), Q is
simultaneous with 0 and the spatial distance OQ goes to zero (if approached
from the region where S2 > 0) or the measured proper time OQ goes to zero (if
approached from the region where S2 < 0). One can use this invariant to
compare easily the spatial distance and proper time measurements made by
different inertial observers who pass through the event O.
Exercises
4.8 Calculate explicitly the quantity S2 for the cases (a) t = 4, X = 2, Y = 3, Z = 0;
(b) t = 2, X = 4, Y = 0, Z = 5; (c) t = 5, X = 3, Y = 0, Z = 4. In each case interpret your
results in terms of the relation between the origin of coordinates 0 and the point P with the
stated coordinates. Use equations (4.5) to prove explicitly the invariance of S2 in these
cases if v/c = 2.
4.9 If the light cone is projected into the (t, X) plane by setting Y = Z = 0, $S^2 = 0$
becomes $t^2 - X^2 = 0$. Deduce that the solution is t = ±X. Show explicitly that these rays
are invariant under (4.5).
4.10 Suppose that a light signal is emitted at the space-time event 0 (t = 0, X = 0)
and absorbed at the space-time event B (t = 1, X = 1). Is S2 zero for B? Suppose now
the light is reflected by a mirror at B and absorbed when at the event C (t = 2, X = 0).
Is S2 zero for C?
4.11 Consider again the discussion of muon decay in Section 3.6. Calculate from
quantities given in the Earth's frame the proper time taken by the muons to move through
the Earth's atmosphere. Use this time to predict the fraction of muons surviving at sea
level.
Fig. 4.15 A supernova explosion occurred at event Q and dinosaurs became extinct on a
neighbouring planet at event P. The time coordinates t of these events differ by $\Delta t$, and the
spatial coordinates X by $\Delta X$.
right-hand side evaluated for the displacement components (1, -2, 1, 0) from Q
to P. Explicitly,
$$\Delta S^2 = -1^2 + (-2)^2 + 1^2 + 0^2 = -1 + 4 + 1 = +4.$$
Because this is positive, the displacement from Q to P is space-like (it represents a
spatial distance of 2 light-years); therefore no causal effect spreading from Q,
travelling at less than or at the speed of light, could influence what happened at P.
The extinction of the dinosaurs was not caused by the supernova explosion.
This example makes clear that it is useful to focus on the displacement from
Q to P (with components $(y^a)$ in the above example). To consider this more generally,
consider two points P and Q in space-time, to which an inertial observer A
assigns coordinates $(t_P, X_P, Y_P, Z_P)$ and $(t_Q, X_Q, Y_Q, Z_Q)$ (as in Fig. 4.15). When
we make a Lorentz transformation (4.2) to the frame of a second observer B, these
points will then be assigned coordinates $(t'_P, X'_P, Y'_P, Z'_P)$ and $(t'_Q, X'_Q, Y'_Q, Z'_Q)$
respectively. It is straightforward to work out how the displacement from Q to P
behaves; the result is (4.21c), leading to the invariant distance between these
points (4.22). The details are as follows.
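The old and new coordinates of P are related by
$$t_P = \gamma(v)(t'_P + VX'_P), \quad Y_P = Y'_P,$$
$$X_P = \gamma(v)(X'_P + Vt'_P), \quad Z_P = Z'_P,$$
and those of Q by
$$t_Q = \gamma(v)(t'_Q + VX'_Q), \quad Y_Q = Y'_Q,$$
$$X_Q = \gamma(v)(X'_Q + Vt'_Q), \quad Z_Q = Z'_Q.$$
Subtracting these equations shows that
$$t_P - t_Q = \gamma(v)\{(t'_P - t'_Q) + V(X'_P - X'_Q)\}, \quad Y_P - Y_Q = Y'_P - Y'_Q,$$
$$X_P - X_Q = \gamma(v)\{(X'_P - X'_Q) + V(t'_P - t'_Q)\}, \quad Z_P - Z_Q = Z'_P - Z'_Q.$$
This is somewhat clumsy to deal with, so we use the notation that $\Delta$ represents the
change in a quantity between Q and P. Then
$$\Delta t = t_P - t_Q, \quad \Delta X = X_P - X_Q, \quad \Delta Y = Y_P - Y_Q, \quad \Delta Z = Z_P - Z_Q, \tag{4.21a}$$
$$\Delta t' = t'_P - t'_Q, \quad \Delta X' = X'_P - X'_Q, \quad \Delta Y' = Y'_P - Y'_Q, \quad \Delta Z' = Z'_P - Z'_Q \tag{4.21b}$$
are the changes in the coordinates (t, X, Y, Z) and (t', X', Y', Z') between Q and
P; and we find finally
$$\Delta t = \gamma(v)(\Delta t' + V\Delta X'), \quad \Delta X = \gamma(v)(\Delta X' + V\Delta t'), \quad \Delta Y = \Delta Y', \quad \Delta Z = \Delta Z'. \tag{4.21c}$$
This again has exactly the Lorentz-transformation form (4.2), but with X
replaced by $\Delta X$, etc. Now given the definition (4.16), the invariance result (4.18)
was a direct result of (4.2). In exactly the same way, define
$$\Delta S^2 = -(\Delta t)^2 + (\Delta X)^2 + (\Delta Y)^2 + (\Delta Z)^2. \tag{4.22}$$
Then it follows from (4.21c) that this is an invariant: for any change of reference
frame,
$$\Delta S'^2 = \Delta S^2. \tag{4.23}$$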
What this result shows is that the space-time distance of the point P from the point
Q is invariant. Thus, just as before, we can draw surfaces of constant distance
about the point Q, which is an arbitrary point in the space-time, and interpret the
result exactly as before except with O replaced by Q (see Fig. 4.16a). Specifically,
if $\Delta S^2 < 0$, then the displacement QP represents motion at less than the speed of
light, and so is a possible history of a massive particle or observer (Fig. 4.16b); we
shall then call it time-like. If $\Delta S^2 = 0$, it represents motion at the speed of light,
and so is a possible path of a zero-rest-mass particle (e.g. a photon); we shall then
call it null or light-like. If $\Delta S^2 > 0$, it cannot represent motion of any particle,
since it would be motion at greater than the speed of light; rather, it represents an
instantaneous spatial displacement for some observer. We then call it space-like.
These are a more general form of the previous results; in fact, the previous cal-
culations will follow on choosing Q to be 0 (with coordinates (0, 0, 0, 0)) here,
cf. Example 4.12.
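The three cases can be summarized in a short routine. This is an illustrative sketch, not part of the text: the function name `classify_displacement` is our own, and units are chosen with c = 1 (for example years and light-years).

```python
import math


def classify_displacement(dt, dx, dy, dz):
    """Classify a displacement by the sign of Delta S^2 and return its invariant size."""
    ds2 = -dt ** 2 + dx ** 2 + dy ** 2 + dz ** 2
    if ds2 < 0:
        return "time-like", math.sqrt(-ds2)   # proper time along the displacement
    if ds2 == 0:
        return "null", 0.0                    # motion at the speed of light
    return "space-like", math.sqrt(ds2)       # instantaneous spatial distance


# The supernova/dinosaur displacement (1, -2, 1, 0) discussed earlier.
print(classify_displacement(1, -2, 1, 0))  # ('space-like', 2.0)
```

The returned number is the proper time for a time-like displacement and the instantaneous spatial distance for a space-like one.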
The new formulation has several advantages. One is that it is clear that
expression (4.22) is invariant not only under boosts and rotations of the axes, but
also under translations: that is if we change the origin of coordinates, setting
$$t' = t + t_0, \quad X' = X + X_0, \quad Y' = Y + Y_0, \quad Z' = Z + Z_0,$$
for some choice of constants $t_0, X_0, Y_0, Z_0$, the values (4.21a,b) will be unchanged
and so will the value (4.22). Thus the quantity $\Delta S^2$, the space-time separation
Fig. 4.16 (a) The surfaces {$\Delta S^2$ = constant} representing constant space-time distance
from the event Q. (b) Time-like displacements from Q (possible world-lines of observers or
massive particles) are those for which $\Delta S^2 < 0$; null displacements (representing motion at
the speed of light) are those for which $\Delta S^2 = 0$; and space-like displacements those for
which $\Delta S^2 > 0$.
Fig. 4.17 (a) In a flat space-time given in standard coordinates, the light cones at each
point are parallel to each other. (b) The future of a point Q which lies on the future null
cone C+(P) of a point P, lies in the future of P; the null cones of Q are tangent to the null
cone of P.
Exercises
4.12 We are free to choose any point in space-time as the origin O of our coordinates.
Choose the origin as the point Q in the calculation above. Then $(x^a_Q) = (0)$, i.e. $t_Q = X_Q =
Y_Q = Z_Q = 0$ by definition. Verify that $(x^{a'}_Q) = (0)$, i.e. $t'_Q = X'_Q = Y'_Q = Z'_Q = 0$, and
that therefore the calculation above leading to (4.23) reduces precisely to the previous
calculation leading to (4.18). Deduce that all the results following (4.18) for positive,
negative, and zero values of $S^2$, understood as a measure of separation from O, also hold
for $\Delta S^2$ understood as a measure of separation from Q.
4.13 The light cone C+(P) of an event P is generated by the light rays through P. Show
that the light cones of each point Q on these light rays are tangent to C+ (P) (Fig. 4.17b) by
deducing (a) that the interior of C+ (Q) lies in the interior of C+ (P); (b) that the interior of
C-(Q) lies outside C+ (P); and (c) that the light cones C+ (Q), C-(Q) intersect C+ (P)
precisely in the light ray from P through Q. [It will be important later that these features
remain true in curved space-times.]
$$T = \sum (-\Delta S^2)^{1/2} = \sum (\Delta t^2 - \Delta X^2 - \Delta Y^2 - \Delta Z^2)^{1/2}, \tag{4.24}$$
where the sign $\Sigma$ represents summation of the expression over all the inertial
segments (that is, the total proper time along the path is just obtained by adding
up the proper times measured along each of these segments); here and in the
sequel, '$\Delta t^2$' means $(\Delta t)^2$, etc. This is clearly an invariant (since each term in the
Fig. 4.18 (a) A time-like path made up of time-like straight (inertial) segments. (b) Paths
made up of smaller and smaller straight (and therefore inertial) segments. (c) The limit of
these paths is a smooth time-like path.
$$T = (10^2 - 8^2)^{1/2} + \{10^2 - (-8)^2\}^{1/2} = (100 - 64)^{1/2} + (100 - 64)^{1/2} = (36)^{1/2} + (36)^{1/2} = 6 + 6 = 12$$
years, confirming our previous results. On the direct path between the initial and
final points travelled by A, we have $\Delta t = 20 - 0 = 20$ and $\Delta X = 0 - 0 = 0$; so
$T = (20^2 - 0^2)^{1/2} = 20$, as expected.
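Expression (4.24) translates directly into a loop over the inertial segments. The following sketch is ours: the function name `piecewise_proper_time` is hypothetical, only the (t, X) coordinates are used, and units are years and light-years (c = 1). The out-and-back path turns round at the event (t = 10, X = 8), as in the calculation above.

```python
import math


def piecewise_proper_time(events):
    """Sum (-Delta S^2)^(1/2) over consecutive (t, X) events, as in eqn (4.24)."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        ds2 = -(t1 - t0) ** 2 + (x1 - x0) ** 2
        if ds2 >= 0:
            raise ValueError("segment is not time-like")
        total += math.sqrt(-ds2)
    return total


travelling_twin = [(0, 0), (10, 8), (20, 0)]  # out and back
stay_at_home = [(0, 0), (20, 0)]              # direct path

print(piecewise_proper_time(travelling_twin))  # 12.0
print(piecewise_proper_time(stay_at_home))     # 20.0
```

Adding further intermediate events to the list handles any path made up of a finite number of inertial segments.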
Expression (4.24) enables us to determine what clock measurements would be
along any time-like path in space-time made up of a finite number of inertial
segments. However, general paths may have a direction that is continuously
varying, and we wish to determine proper time along any feasible path of an
observer. To do this, we consider piecewise inertial paths from Q to P with smaller
and smaller inertial segments (Fig. 4.18b). In the limit as these segments shrink to
zero, we obtain a smooth time-like path C (Fig. 4.18c). As long as the limiting
value for $\Delta S^2$ remains negative for each segment as we take the limit, this
represents a possible motion of an observer from Q to P, and the proper time T
measured by an observer moving along the path is the limit of the expression
(4.24). It is conventional to write this limit as a line integral:
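$$T = \int_C (-ds^2)^{1/2}, \tag{4.25a}$$
where
$$ds^2 = -dt^2 + dX^2 + dY^2 + dZ^2, \tag{4.25b}$$
or equivalently,
$$-ds^2 = dt^2 - dX^2 - dY^2 - dZ^2, \tag{4.25c}$$
where '$dt^2$' means $(dt)^2$, etc. (It would be more in line with the notation we have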
used previously to write dS2 instead of ds2; however, it is an almost universal
convention to use the notation ds2, so we shall do so here.) This is nothing other
than a formalism for the limit of expression (4.24) as all the inertial segments are
shrunk to indefinitely small lengths and the piecewise inertial path tends to the
smooth world-line C. We may interpret this as representing the path C from Q to
P as made up of `infinitesimal' segments, each consisting of a displacement
(dt, dX, dY, dZ) from a point $P_i$ with coordinates (t, X, Y, Z) to a point $P_j$ with
coordinates (t + dt, X + dX, Y + dY, Z + dZ) (Fig. 4.19), each of which (by (4.24))
contributes a proper time $dT = (-ds^2)^{1/2}$ (given by (4.25b)) to the total time T.
Fig. 4.19 Two points Pi and Pj on a smooth time-like curve, with coordinates differing by
dt and dX.
Fig. 4.20 (a) A curve C in the Euclidean two-plane between points P and Q. Neigh-
bouring points have coordinates differing by dx and dy, and the distance between them can
be found by Pythagoras' theorem. (b) A curve such that dy = 0 (that is, y = constant) has
x as a curve parameter.
Then (4.25a) simply states that the total time measured along the path is the sum
of all these contributions (cf. Appendix A). Invariant expressions such as
(4.25b,c) are known as metric forms or intervals.
The Euclidean two plane This concept is illustrated now by considering how
one measures length along an arbitrary curve C in the ordinary Euclidean two-
plane. First consider using standard Cartesian coordinates (x, y) (Fig. 4.20a).
4.2 Space-time separation invariants 155
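The length L of the curve C is then given by the line integral
$L = \int (ds^2)^{1/2}$, (4.26a)
where
$ds^2 = dx^2 + dy^2$. (4.26b)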
(It is not appropriate to `take the square root' in (4.26a), as the entity in the
bracket is really the full expression ds2 given by (4.26b).) Again we are regarding
the total length as made up of contributions from segments representing displacements
from (x, y) to (x + dx, y + dy), of length $(ds^2)^{1/2}$ where $ds^2$ is given by
(4.26b). This expression is a line integral evaluating the length of any curve in the
plane (similarly, expression (4.25) is a line integral evaluating the proper time
along any time-like path in space-time). Again it is an invariant agreed on by all
observers (as each of the infinitesimal contributions ds2 is an invariant); in fact
this is nothing other than repeatedly using Pythagoras' theorem (4.26b) applied
to small line elements to estimate the length of the whole line.
Expression (4.26) tells us the length along any curve segment (dx, dy). In
understanding its meaning, it is useful to consider first the specialization of this
expression to curve segments on which only x or only y varies. Take the first case:
if only x varies along the curve, then y is constant and so dy = 0 all along the curve
(Fig. 4.20b). The expression (4.26) then reduces to
$ds^2 = dx^2$.
In plane polar coordinates $(r, \theta)$ the same interval takes the form
$ds^2 = dr^2 + r^2\,d\theta^2$. (4.27)
To see that this is correct, note that along the lines {r only varies} the coordinate $\theta$
is constant; so $d\theta = 0$ along this line. Then (4.27) shows
$ds^2 = dr^2 + 0 = dr^2$;
but this is the square of the distance travelled. Thus r directly measures distance
along these curves (as required by its definition). On the other hand, along the
curves {$\theta$ only varies} the coordinate r is constant so dr = 0 along these curves.
Fig. 4.21 The same curve as in Fig. 4.20(a) but now described by polar coordinates r
and $\theta$. The distance between neighbouring points is now given by Pythagoras' theorem
from orthogonal displacements dr and $d\theta$, through distances dr and $r\,d\theta$ respectively.
Fig. 4.22 (a) Spherical polar coordinates r, $\theta$, and $\phi$ in Euclidean three-space. Here r
describes radial distance, $\theta$ is the angle between the radial direction and the z axis, and $\phi$
describes rotation about this axis. (b) The distance between neighbouring points described
in spherical polars is given by Pythagoras' theorem from orthogonal displacements dr, $d\theta$,
and $d\phi$ through distances dr, $r\,d\theta$, and $r \sin\theta\,d\phi$ respectively.
increment $d\phi$ represents a distance $r \sin\theta\,d\phi$ along the curves {$\phi$ only varies}, that
is, the curves {r, $\theta$ constant}. This is indeed precisely the way distances relate to
standard polar coordinates (see Fig. 4.22b). Of course, the spatial geometries
represented by (4.28a) and (4.28b) are the same; it is the coordinates used that
differ.
The important point to notice here is that when general coordinates are used,
they will not directly represent distances even along these coordinate curves, but
the relation between a coordinate increment and the actual distance travelled can
be read off from the interval (in this case, from (4.28b)). Distances travelled along
any curves will be given by (4.26a). Actually working out these expressions in the
case of a general curved line may be complex (but if it is a coordinate line, the
expression can often be evaluated without trouble). More details on the concept
of a line integral needed to evaluate these distances are given in Appendix A.
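As a numerical illustration of reading distances off an interval, the following minimal sketch sums $(ds^2)^{1/2}$ over many short segments of a curve given in plane polar coordinates. It assumes the interval $ds^2 = dr^2 + r^2\,d\theta^2$ (the form implied by Fig. 4.21); the function name `polar_curve_length` and the parametrisation by a parameter `s` are illustrative choices, not part of the text.

```python
import math

def polar_curve_length(r_of, theta_of, s0, s1, n=10000):
    """Approximate L = integral of (dr^2 + r^2 dtheta^2)^(1/2) along a curve.

    The curve is given by r = r_of(s), theta = theta_of(s) for s0 <= s <= s1.
    It is cut into n short segments and Pythagoras' theorem is applied to each.
    """
    total = 0.0
    for i in range(n):
        sa = s0 + (s1 - s0) * i / n
        sb = s0 + (s1 - s0) * (i + 1) / n
        dr = r_of(sb) - r_of(sa)
        dtheta = theta_of(sb) - theta_of(sa)
        r_mid = 0.5 * (r_of(sa) + r_of(sb))  # radius used for the r*dtheta term
        total += math.sqrt(dr * dr + (r_mid * dtheta) ** 2)
    return total

R = 2.0
# Coordinate curve {r = R}: only theta varies, from 0 to pi (a half circle).
half_circle = polar_curve_length(lambda s: R, lambda s: s, 0.0, math.pi)
# Coordinate curve {theta = constant}: only r varies, from 0 to R.
radial_line = polar_curve_length(lambda s: s, lambda s: 0.3, 0.0, R)
print(half_circle, radial_line)
```

Along the radial coordinate line the sum reproduces the coordinate increment itself, while along {r = R} each increment of the angle contributes R times that increment, which is the distinction drawn in the paragraph above.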
Exercise 4.14
The circle C given by {r = R = constant} passes through the point P at {r = R, $\theta = 0$}
and the point Q at {r = R, $\theta = \pi$}. Show that (a) the straight line L from P to Q has length
2R, (b) the segment of C joining P to Q for $0 < \theta < \pi$ has length $\pi R$. [Apply (4.26a), (4.27)
first to the straight line joining P and Q, and then to the curve r = R.] Deduce that this circle
has radius R, diameter 2R, and circumference $2\pi R$.
Space-time
These examples have simply considered Euclidean spaces, where ds2 > 0,
described by different coordinate systems. In space-time, ds2 is not constrained
to be > 0 because of the minus sign in (4.25b). As intimated above, it will in this
Fig. 4.24 (a) A displacement (dt, dX) along the world-line of a particle moving at speed v
relative to the observer A. The corresponding proper time $d\tau$, measured by a clock moving
with the particle along this displacement, is related to dt by the time-dilation relation
$dt = \gamma(v)\,d\tau$, which shows $dt \ge d\tau$ with $dt = d\tau$ if and only if v = 0. (b) Several piecewise
inertial paths joining two time-like separated points P and Q. The longest time will be
measured along the path A, the straight line path between them. (c) The same situation as
seen by an observer B moving inertially between Q and P. Clearly (from (a)) proper time
along each inertial segment on y and A' will correspond to a longer time as measured by B;
thus proper time from Q to P along these paths will be less than along the single inertial
path A. (d) Displacements $n_1$ and $n_2$ in space-time. Their scalar product is defined by
eqn (4.31).
where $dr^2 = dx^2 + dy^2 + dz^2 = c^2(dX^2 + dY^2 + dZ^2)$ gives the spatial distance
measured by A along n. Now v = (change of distance)/(change of time) = dr/dt,
so
Conclusion
In this section, we have looked at the invariants related directly to measurements
of time and distance in space-time. There are other important invariants we have
not considered here, related to energy, momentum, and the electromagnetic field;
they are most easily constructed by using the tensor formalism discussed in
Appendix B. Some of those invariants are introduced there and in Appendix C.
Exercises 4.15
(i) In the Euclidean two-plane, consider a path as shown in Fig. 4.25a, joining {x = 0,
y = -a} and {x = 0, y = a} via the point {x = $\lambda a$, y = 0}. Find the length L (given by (4.26a))
of the path, and show that the shortest path (i.e. minimum value of L) corresponds to $\lambda = 0$.
(ii) Now in a two-dimensional space-time consider a path as shown in Fig. 4.25b,
joining {t = -a, x = 0} and {t = +a, x = 0} via the point {t = 0, x = $\lambda a$}. Find the proper
time $\tau$ (given by (4.25)) along the path, and show that the longest proper time (i.e. the
maximum value of $\tau$) corresponds to $\lambda = 0$.
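The contrast between the two parts of this exercise can be explored numerically. The sketch below is one possible check, not a worked solution from the text: it evaluates the Euclidean length and the proper time of the two-segment path for a given value of the dimensionless parameter of the exercise (called `lam` in the code). It assumes units in which the space-time interval is $ds^2 = -dt^2 + dx^2$; the function names are hypothetical.

```python
import math

def euclidean_length(lam, a=1.0):
    """Length of the path (0, -a) -> (lam*a, 0) -> (0, a) in the Euclidean plane."""
    segment = math.hypot(lam * a, a)  # Pythagoras: sqrt(dx^2 + dy^2)
    return 2.0 * segment

def proper_time(lam, a=1.0):
    """Proper time along (t=-a, x=0) -> (t=0, x=lam*a) -> (t=a, x=0).

    Assumes the interval ds^2 = -dt^2 + dx^2; each segment must be time-like.
    """
    dt, dx = a, lam * a
    ds2 = -dt * dt + dx * dx
    if ds2 >= 0:
        raise ValueError("segment is not time-like; need |lam| < 1")
    return 2.0 * math.sqrt(-ds2)

for lam in (0.0, 0.3, 0.6, 0.9):
    print(lam, euclidean_length(lam), proper_time(lam))
```

The same detour lengthens the path in the Euclidean plane but shortens the proper time in space-time, because of the minus sign in the interval.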
4.16 Illustrate how you would use the metric form to determine the K-factor for two
observers in relative motion by working through the following exercise. Suppose that the
metric form in a two-dimensional space-time is
where a and b are positive constants. Observer A is at rest at x = 0 and emits light signals
at t = t1 and t = t2. Observer B moves at speed v relative to A passing him at t = 0. Cal-
culate, (i) the equations of the light rays sent by A; (ii) the coordinates of the points where B
receives the signals; (iii) the interval $\Delta s_1$ between the emission events, and the interval $\Delta s_2$
between the reception events; (iv) the ratio $K = \Delta s_2/\Delta s_1$.
4.17 (a) Prove that the scalar product (4.31) is an invariant. (b) Suppose that an
observer O determines both the displacements $n_1$ and $n_2$ to be instantaneous. Show that the
scalar product (4.31) then reduces to the expression
$n_1 \cdot n_2 = dX_1\,dX_2 + dY_1\,dY_2 + dZ_1\,dZ_2$
which determines both distances and angles in Euclidean space (e.g. if $n_1 \cdot n_2 = 0$ then the
displacements are orthogonal to each other).
Fig. 4.25 [panels (a) and (b); axis labels y = a and y = -a]
4.18 Two-dimensional flat space-time has the metric form $ds^2 = -dt^2 + dX^2$
(obtained from (4.25b) by setting dY = dZ = 0). On choosing a new coordinate v defined
by v = t + X instead of t, then dv = dt + dX and in terms of the coordinates (v, X) the
interval becomes
$ds^2 = -dv^2 + 2\,dv\,dX$. (*)
Deduce from this that a curve {v = constant} is a light ray, but a curve {X = constant} is
time-like. Sketch these curves in a space-time diagram. On further choosing the coordinate
w = t - X instead of X, then dw = dt - dX and in terms of the coordinates (v, w) the
metric form becomes
$ds^2 = -dv\,dw$. (**)
Show from this that the curves {v = constant} and the curves {w = constant} are light rays
(for this reason, these coordinates are called null coordinates). Sketch these curves in a
space-time diagram. Check that if we define a new null coordinate u = -v, the metric
form becomes
$ds^2 = du\,dw$. (***)
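A quick numerical sanity check of these coordinate changes is sketched below. It evaluates the interval for one small displacement in three ways: in (t, X), in (v, X) with v = t + X, and in (v, w) with w = t - X. The (v, X) expression used here, $-dv^2 + 2\,dv\,dX$, is what substituting dt = dv - dX into $-dt^2 + dX^2$ gives; it and the function names are assumptions of this sketch.

```python
def ds2_tX(dt, dX):
    """Interval in the original coordinates (t, X)."""
    return -dt * dt + dX * dX

def ds2_vX(dv, dX):
    """Interval in (v, X), from substituting dt = dv - dX."""
    return -dv * dv + 2.0 * dv * dX

def ds2_vw(dv, dw):
    """Interval in the null coordinates (v, w)."""
    return -dv * dw

dt, dX = 0.3, 0.1
dv, dw = dt + dX, dt - dX
print(ds2_tX(dt, dX), ds2_vX(dv, dX), ds2_vw(dv, dw))
```

All three values agree for any displacement, as they must for an invariant. Setting dv = 0 gives zero in both new forms, which is the statement that {v = constant} is a light ray.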
Computer Exercise 13
Write a program that will accept as input (a) coordinates (TP, XP) and (TQ, XQ) for the
initial point P and final point Q of a time-like curve, (b) an integer N indicating the number
of intermediate points to be specified, (c) coordinates T(I) and X(I) for each of these
intermediate points R(I) (I = 1 to N). It should give as output the total proper time $\tau$
measured by an observer moving from P to Q along the piecewise inertial path
P → R(1) → R(2) → ... → R(N) → Q. [The program must check that the total path and
each of these segments is indeed time-like.]
A curve of uniform acceleration between the point P with coordinates (t = -3, X = 5)
and Q with coordinates (t = 3, X = 5) satisfies the equation $t^2 - X^2 = -16$. Choose a
series of N points R(I) on this curve (I = 1 to N) between P and Q and determine the proper
time $\tau$ from P to Q along the piecewise inertial path defined by these points. Show that as N
gets larger and larger, $\tau$ tends to a limit $\tau_L$, the proper time from P to Q along the
uniformly accelerated path. [One way to choose the points is to choose a set of values for
T (-3 < T < 3) and then solve the equation $X^2 = T^2 + 16$ for X.]
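One possible shape for such a program is sketched below in Python; the exercise does not prescribe a language, and the function names `proper_time` and `hyperbola_events` are hypothetical. The sketch assumes units in which the interval is $ds^2 = -dt^2 + dX^2$, takes the events as a list of (t, X) pairs rather than interactive input, and chooses the intermediate points equally spaced in t.

```python
import math

def proper_time(events):
    """Total proper time along the piecewise inertial path through `events`.

    `events` is a list of (t, X) pairs running from P to Q.
    Raises ValueError if the overall separation or any segment is not
    future-directed and time-like.
    """
    (tp, xp), (tq, xq) = events[0], events[-1]
    if tq <= tp or (tq - tp) ** 2 <= (xq - xp) ** 2:
        raise ValueError("Q is not in the time-like future of P")
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        minus_ds2 = dt * dt - dx * dx
        if dt <= 0 or minus_ds2 <= 0:
            raise ValueError("segment is not future-directed time-like")
        total += math.sqrt(minus_ds2)  # contribution (-ds^2)^(1/2)
    return total

def hyperbola_events(n):
    """P, then n points on t^2 - X^2 = -16 equally spaced in t, then Q."""
    ts = [-3.0 + 6.0 * i / (n + 1) for i in range(n + 2)]
    return [(t, math.sqrt(t * t + 16.0)) for t in ts]

for n in (0, 1, 10, 100, 1000):
    print(n, proper_time(hyperbola_events(n)))
```

As a cross-check that is not part of the exercise text: parametrising the curve as t = 4 sinh s, X = 4 cosh s gives a proper time of 8 ln 2 between P and Q, and the printed values should decrease towards this as N grows.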
*In a complete cosmological model we will also have to specify many other physical features of the
matter in the universe, but in this book we examine only the space-time geometry of these universe
models.
As has been mentioned above, the universe models we look at here do not
attempt to represent the nature of gravity (which will be discussed in the next
section). Instead they are based on the symmetries of flat space-time, which
define a structure for space-time that picks out particular classes of world-lines as
`naturally preferred', so we choose these for the world-lines of the fundamental
observers. We look at three such models: the Minkowski universe, the Rindler
universe, which has many properties similar to those of a black hole, and the
Milne universe, which is a simple expanding universe model. We will discuss
curved-space universe models of the black-hole type and the expanding type, in
Chapters 6 and 7 respectively.
Minkowski universes
We first consider a two-dimensional version of this universe model, and then a
four-dimensional version.
A two-dimensional Minkowski universe This is just the two-dimensional flat
space-time of special relativity with the metric form given in terms of coordinates
(t, X) by
$ds^2 = -dt^2 + dX^2$, (4.32a)
the world-lines of the fundamental observers being lines {X = constant}, and the
number density of galaxies being uniform in the surfaces {t = constant}, which
are surfaces of instantaneity for all the fundamental observers (Fig. 4.27a). This
universe model is based on the translation invariance of the space-time: the
world-lines are moved into themselves by the time-translation symmetry
$t' = t + t_0$, $X' = X$, (4.33a)
where $t_0$ is any constant. This, in particular, implies that the world-lines stay a
constant distance from each other. They are moved into each other by the spatial
translation symmetry
$t' = t$, $X' = X + X_0$, (4.33b)
where $X_0$ is any constant.
Fig. 4.27 The Minkowski universe. (a) The world-lines of the fundamental observers,
representing the average motion of matter in the universe, are {X = constant} and their
surfaces of instantaneity are {t = constant}. (b) Construction of the universe by
(i) repeatedly applying a spatial translation through a distance $X_0$ to the world-line L
through the origin to determine the initial points of these world-lines in the surface t = 0,
and (ii) applying time translations to these events for all values $t_0$ to determine the
world-lines in space-time. By this construction, the density of matter measured in the surfaces
{t = constant} is uniform.
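A four-dimensional Minkowski universe This is the four-dimensional flat space-time
of special relativity with the metric form given in terms of coordinates (t, X, Y, Z) by
$ds^2 = -dt^2 + dX^2 + dY^2 + dZ^2$, (4.32b)
the world-lines of the fundamental observers being lines {X, Y, Z constant}, and
the number density of galaxies being uniform in the surfaces {t = constant},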
which are surfaces of instantaneity for all the fundamental observers. The
properties of this space-time are clear immediately from the discussion above of
the two-dimensional version (which is just the section of the full four-dimensional
space-time obtained on setting Y = Z = constant in (4.32b)).
This is the simplest kind of universe model: a static, uniform distribution of
matter in a flat space-time, without beginning or end and without spatial limit.
It is rather uninteresting: there are no observed redshifts or blueshifts, and the
density of matter in the universe is uniform in time and space. The model does not
correspond to the real universe, where systematic galactic redshifts are observed;
we include it mainly for contrast with the other two to follow, and to illustrate in
a familiar context some of the methods we will use in the rest of this section. There
is a universe model with curved space-time, the Einstein static universe, which is
similar to the Minkowski universe discussed here; we will discuss it in Chapter 7.
We conclude examination of this universe model by considering briefly three
conceivable methods of estimating the distance of an object in such a space-time:
distance by apparent angle, by apparent luminosity, and by apparent brightness.
This detailed material is included because similar methods will be used later in
examining the properties of curved space-times; it may be omitted on a first
reading.
Apparent size To determine how apparent sizes will appear in these universes,
we change to spherical polar coordinates $(r, \theta, \phi)$ so that the metric form
becomes
$ds^2 = -dt^2 + dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)$. (4.32c)
Fig. 4.28 (a) A rod of length D lying perpendicular to the line of sight from the observer at
a distance r. The apparent angular size of the rod is $\alpha$. (b) We estimate the distances of
objects such as cars by observing their apparent angle $\alpha$, and deducing the distance to them
because we know approximately what their length is.
apparent angular size of the object as $\alpha = \theta_2 - \theta_1$, this is related to the length of
the object by
$\alpha = D/r$ (4.34)
showing that the apparent size of the object is proportional to its length D and
inversely proportional to its distance r. It is effectively through this equation that
we estimate distance of objects in everyday life: for example our eye estimates the
apparent angle of a car as it passes (Fig. 4.28b), we know the approximate size D
of the car, so our brain can estimate the distance r to it (in effect by using
eqn (4.34)). If the object is not at rest relative to the observer, or does not lie
transverse to the line of sight, the calculation becomes more complex but still
follows directly from (4.32).
Apparent luminosity We wish to calculate the rate at which energy is received by
an observer at a distance r from a star. For generality, we will not assume the
observer is at rest relative to the star. To be precise, we will assume that the star is
at rest at the origin r = 0 of coordinates for which the metric form is (4.32c), and
the observer is moving radially outwards so as to measure a redshift z for the
received radiation (Fig. 4.29a).
Suppose the star is measured in its rest frame to emit radiation uniformly in all
directions at a rate L ergs/sec. This radiation is carried by photons, the energy of
each photon being $E = h\nu$ where h is a constant and $\nu$ is the frequency of the
radiation, related to its wavelength $\lambda$ by $c = \nu\lambda$. The rate at which photons are
emitted by the star will then be $L/E = L/h\nu$ photons per second. Assuming that
photons are conserved, after travelling a distance r from the star (as measured in
the star's frame) they will all arrive at the observer, at which distance they will be
spread over a sphere of area $4\pi r^2$ (Fig. 4.29b). Because of the K-factor effect (see
the redshift relations (3.3, 4)) the rate at which these photons arrive will be a
factor 1 + z slower, in the observer's rest frame, than the rate at which they were
emitted in the rest frame of the star; thus the rate at which photons arrive per unit
area will be measured by the observer as
$R = (L/h\nu)(1/4\pi r^2)\{1/(1 + z)\}$.
Now the energy per photon measured by the observer is $h\nu'$ where $\nu'$ is the frequency
measured by the observer, related to $\nu$ by $\nu'/\nu = 1/(1 + z)$. Consequently
the flux of radiation (i.e. the energy received per unit area per unit time)
measured by the observer from the star is
$F = R\,h\nu' = (L/4\pi r^2)\{1/(1 + z)^2\}$. (4.35)
Fig. 4.29 (a) An observer moving relative to a star in flat space-time measures a redshift z
in radiation received from the star. (b) When radiation from the star arrives at the distance
r at which the observer is situated, it has spread out over an area $4\pi r^2$. (c) The solid angle $\Omega$
is the apparent size of the object as seen by the observer; it can be thought of as the amount
of sky covered by the star.
Apparent brightness The flux F is the total radiation received from an object per unit
area per unit time. When observing an extended object such as a galaxy, what our instruments
directly record is actually its apparent brightness, i.e. flux received per unit solid angle, in
the wavelength band lying in their range of sensitivity (for example, this is what is
recorded by our eye or by a photographic plate). The solid angle $\Omega$ is the amount
of the sky covered by the image of the object. It is defined by the equation
$S = r'^2\,\Omega$ where S is the cross-section area of the star, and r' is the distance
measured to the object by the observer (Fig. 4.29c). The observed intensity of
radiation I (the brightness at all wavelengths) is the flux received per unit solid
angle, i.e.
$I = F/\Omega = F r'^2/S$. (4.36a)
Now the relation between r (the distance measured between the object and
observer by someone stationary relative to the star) and r' (the same distance
measured by the observer) is r' = r/(1 + z), which is effectively eqn (3.25) applied
to the present situation (it is clear that these distances must be related by
K = 1 + z rather than $\gamma$ because the light we are concerned with travels one way,
from the source to the observer, rather than both ways; the solid angle is the solid
angle subtended by the source at the time of observation, not at the present time
as deduced by radar). Combining this result with (4.35) and (4.36a) shows that
$I = I_0/(1 + z)^4$, (4.36b)
where $I_0 = L/(4\pi S)$ is the surface brightness of the star.
This shows that in flat space-time, the observed intensity of radiation from a
given source is independent of the distance between the observer and the source; it
depends only on their relative motion. In the case of the Minkowski universe, a
fundamental observer will measure the same intensity of radiation from a source,
no matter how far he is from it (as z = 0 then). Thus, it is not possible to use
observed intensity (or surface brightness, i.e. the measured intensity in restricted
wavelength bands) alone to estimate distance of an observed object.
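The distance independence of the observed intensity can be illustrated numerically. The sketch below assumes that the flux takes the form $F = L/\{4\pi r^2 (1+z)^2\}$, which is what multiplying the photon arrival rate R by the observed photon energy gives, together with the relations r' = r/(1 + z) and $S = r'^2\,\Omega$ quoted above. The function names and the sample numbers are illustrative only.

```python
import math

def flux(L, r, z):
    """Energy received per unit area per unit time (assumed form of the flux)."""
    return L / (4.0 * math.pi * r ** 2 * (1.0 + z) ** 2)

def intensity(L, S, r, z):
    """Flux per unit solid angle for a source of cross-section area S."""
    r_obs = r / (1.0 + z)       # distance measured by the observer, r' = r/(1+z)
    omega = S / r_obs ** 2      # solid angle, from S = r'^2 * Omega
    return flux(L, r, z) / omega

L, S, z = 1.0, 2.0, 0.5
for r in (1.0, 10.0, 100.0):
    # Flux falls off as 1/r^2 while intensity stays the same.
    print(r, flux(L, r, z), intensity(L, S, r, z))
```

The flux drops with distance but the solid angle shrinks in the same proportion, so their ratio depends only on z.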
Exercise 4.19
In a Minkowski universe every past light ray from an event P would eventually intersect
a star. Prove that the redshift observed by a fundamental observer is zero for every star
(assuming each star moves at the fundamental velocity). Deduce from eqn (4.36b) that if
the stars shone continuously in such a universe, the entire night sky would be as bright as
the surface of a star, contrary to our experience that the sky is dark at night (this is Olbers'
paradox).
What conservation law shows stars cannot shine continuously (i.e. puts a limit on the
possible lifetime of a star)?
Fig. 4.30 The Rindler universe. The world-lines L of the fundamental observers are
obtained by boosts (see Fig. 4.6(b)) applied to their initial positions at equal distances
along the surface {t = 0}. The boosts move the surface {t = 0} into the surfaces {$\beta = \beta_0$},
{$\beta = 2\beta_0$}, ..., in terms of the parameter $\beta$ (eqn (4.44)),
constructed as in the previous example. Start with flat space-time given in terms
of coordinates (t, X) and with ds2 from (4.32a). Use the spatial translations
(4.33b) to determine the initial positions of a family of world-lines in the surface
{t = 0} through the origin O, resulting in an initially uniform distribution of
matter as in the previous case. We now use the boosts about O (eqn (4.37a) below)
to determine the world-lines elsewhere from their initial positions (Fig. 4.30). As
discussed above (cf. (4.23)) the interval is invariant, so this determines the world-lines
in such a way that the distance $X_0$ between them in their surfaces of
instantaneity remains constant at all later times.
The result is clearly different from the Minkowski universe. Explicitly, a
general point P on each line L is obtained from the initial event (X', t') by a boost
$t = \gamma(V)(t' + VX')$, $X = \gamma(V)(X' + Vt')$, (4.37a)
for some value V for the relevant change of velocity, where $\gamma(V) = (1 - V^2)^{-1/2}$;
thus V (|V| < 1) serves as a parameter along the world-line L. For every value of
V, the boost preserves the invariant $S^2$ giving the distance from O to P (see
eqns (4.16-18)) which on each world-line L takes the value at the initial point:
$-t^2 + X^2 = \rho^2$, $\rho^2$ = constant. (4.37b)
This is therefore the equation for the fundamental world-lines. These curves are
sketched in Fig. 4.31; they are all asymptotic to the light cone through O at large
values of |X|. As the world-line L passes through the point {t' = 0, X' = $\rho$}, a
general point on L can be expressed in terms of this initial point via eqn (4.37a) as
$t = \gamma(V)\,V\rho$, $X = \gamma(V)\,\rho$.
In this form, V is a parameter along the curve that is labelled by the value $\rho$. Note
that the point O is a fixed point of these boosts, so this procedure does not
generate a world-line through O itself; for later purposes it will be convenient to
define the world-line $L_0$ to be given by {X = 0}. These universe models have
many interesting properties, which we will investigate in turn.
4.3 Some flat-space universes 171
Fig. 4.31 The world-lines $S^2 = \rho^2$ in the Rindler universe, and their surfaces of
simultaneity which are also surfaces of homogeneity (i.e. of constant density).
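The construction of a fundamental world-line by boosting its initial point can be sketched in a few lines. The code below assumes the boost takes the form t = γ(V)(t' + VX'), X = γ(V)(X' + Vt') in units with c = 1; the sign convention is chosen so that larger V corresponds to later times, and like the function names it is an assumption of this sketch.

```python
import math

def gamma(V):
    """Lorentz factor (1 - V^2)^(-1/2) for |V| < 1, in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - V * V)

def boost(t, X, V):
    """Boost about the origin (assumed sign convention, see lead-in)."""
    g = gamma(V)
    return g * (t + V * X), g * (X + V * t)

def rindler_worldline(rho, velocities):
    """Events obtained by boosting the initial point (t=0, X=rho)."""
    return [boost(0.0, rho, V) for V in velocities]

points = rindler_worldline(2.0, [-0.9, -0.5, 0.0, 0.5, 0.9])
# Each boosted event should satisfy -t^2 + X^2 = rho^2.
invariants = [-t * t + X * X for t, X in points]
print(points)
print(invariants)
```

The origin itself is left unmoved by `boost`, which is why this procedure cannot generate a world-line through it.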
(A) Constant relative distances By construction, the world-lines are invariant
under the Lorentz transformations (boosts) about O; therefore, they maintain a
constant distance from each other at all times. This does not at first appear to be
the case in Fig. 4.31, but is clear because they lie in surfaces at a constant distance
from O (see (4.37b)). The point is that the surfaces of instantaneity for this whole
family of observers are the straight lines $I_V$ through O; at every point on each
surface $I_V$ the angle to the horizontal is the same, but at later and later times on
each world-line (corresponding to larger and larger values of V) the $I_V$ tilt up more
and more relative to the X-axis, asymptotically approaching the light cone. This
is because these observers are accelerating: at every time on each world-line, the
speed relative to the t and X axes is increasing, so the lines tilt over at an angle $\alpha$
from the vertical which steadily increases towards 45°. Correspondingly, the
surfaces of instantaneity tilt up from the X-axis by the same angle $\alpha$; hence larger
and larger length contraction effects make a constant distance (for an observer L)
look longer and longer (to the stationary observer $L_0$, who is not a fundamental
observer).
The event O is at a strangely privileged position for this family of observers. It is
regarded by each observer L to be simultaneous with every event in his history
(because all their surfaces of instantaneity intersect here) and to be always at the
same distance from him. Conversely, every observer at the event O (no matter
what his velocity) will measure the same distance to an observer L. By contrast, an
observer with world-line $L_0$ has surfaces of instantaneity {t = constant}, and by
(4.37b) will measure all the observers L to be approaching him (until the event O)
and then moving further and further away from him (after the event O). That
observer will measure the density of matter to be uniform at the time t = 0
(because it was constructed to be uniform then) but not at any other time,
because, as (4.37b) shows, the instantaneous (t = constant) spatial distance
$X_1 - X_2$ measured by $L_0$ between two fundamental world-lines depends on the
time t. Nevertheless the universe model is spatially uniform for the fundamental
observers. The space-time symmetries (4.33b) combined with (4.37a) act in the
surfaces of instantaneity $I_V$, showing the space-time itself is uniform on these
surfaces. Also, the distance between the world-lines is measured to be constant on
these surfaces, so the fundamental observers will measure the density of matter to
be constant on them. Thus they will be seen to be surfaces of homogeneity in this
universe model.
(B) Uniform acceleration Since the world-lines L are not straight lines, each
observer is moving non-inertially. Because of the construction of these world-
lines by the use of Lorentz transformations, which preserve space and time
intervals and will uniformly increment the velocity for the same time step on each
world-line for all times, this necessarily happens in such a way that each observer
will measure his rate of change of speed relative to his proper time to be a con-
stant, i.e. he is in a state of constant acceleration. From the force law (3.35b), this
would require a constant force (e.g. a steadily firing rocket engine) to keep each
observer on his orbit. However, as seen by $L_0$, these observers move at speeds closer and
closer to the speed of light but never exceed it (in accord with the limiting nature
of the speed of light).
While these statements are obvious once one appreciates the role of the Lorentz
transformation as a map of the space-time into itself that preserves space and
time measurements, it is interesting to verify these results explicitly. Consider the
event Q = (t, X) on the world-line L: {$\rho = \rho_0$}, mapped into another event
Q' = (t', X') on L by (4.37a) for some specific value $\Delta V$ of V. Then the proper
time between Q and Q' is $\Delta\tau$ given by
$\Delta\tau^2 = \Delta t^2 - \Delta X^2$ (4.38a)
where
$\Delta t = t' - t$, $\Delta X = X' - X$. (4.38b)
Since Q' is on L, $t'^2 - X'^2 = -\rho_0^2$. On simplifying the terms involving $\Delta V$ and
$\gamma(\Delta V)$, one finds
$\Delta\tau^2 = 2\rho_0^2\{\gamma(\Delta V) - 1\}$, (4.39a)
showing that $\Delta\tau$ is a constant on the world-line, for a given $\Delta V$. This is the time
measured moving on a straight line from Q to Q', which is nearly the same as the
time $\Delta_L\tau$ measured moving from Q to Q' along L if $\Delta V$ is small. Now $\Delta V$ is
the change in velocity undergone by the observer in that time (Fig. 4.32a). Thus,
the acceleration undergone in that time is $\Delta V/\Delta_L\tau$. In the limit of small $\Delta V$ it
follows that $\gamma - 1 \approx \tfrac{1}{2}\Delta V^2$ and (4.39a) then shows
$\Delta\tau = \rho_0\,\Delta V$. (4.39b)
Also $\Delta_L\tau \approx \Delta\tau$, so the proper acceleration $A = dV/d\tau$, which is the limit of
$\Delta V/\Delta_L\tau$ for small $\Delta V$ and so for small $\Delta_L\tau$, is given by
$A = \rho_0^{-1}$ (4.40)
Fig. 4.32 (a) Two neighbouring points Q and Q' on the world-line L ($\rho = \rho_0$) have
velocities differing by $\Delta V$. (b) Just as the acceleration required to move on the uniformly
accelerated path L decreases with distance $\rho$, so does the force required to keep an observer
on a path of uniform acceleration at constant distance from the centre of the Earth
(in everyday life, that force is exerted on us by the floor; without the floor we would fall
freely towards the Earth's centre).
confirming that the acceleration is constant on each world-line and is smaller the
further the world-line is from O. This is exactly similar, for example, to a static
observer maintaining a constant radial distance from the centre of the Earth: he is
held at this constant distance by a constant force, usually supplied by the floor,
and the size of force needed decreases with distance from the centre of the Earth
(Fig. 4.32b). This similarity between uniformly accelerated observers and a
uniform gravitational field will turn out later to be of fundamental importance.
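The explicit verification described in (B) can also be carried out numerically. The sketch below boosts an event Q on the world-line labelled by a chosen `rho0`, computes the squared proper time between Q and its image, and compares it with $2\rho_0^2\{\gamma(\Delta V) - 1\}$. It then estimates the acceleration as the ratio of the velocity increment to that proper time. The boost convention t = γ(t' + VX'), X = γ(X' + Vt') with c = 1 and all function names are assumptions of this sketch.

```python
import math

def gamma(V):
    """Lorentz factor for |V| < 1, in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - V * V)

def boost(t, X, V):
    """Boost about the origin (assumed sign convention)."""
    g = gamma(V)
    return g * (t + V * X), g * (X + V * t)

def interval_after_boost(t, X, dV):
    """Squared proper time between the event (t, X) and its boosted image."""
    t2, X2 = boost(t, X, dV)
    dt, dX = t2 - t, X2 - X
    return dt * dt - dX * dX

rho0 = 2.0
Q = boost(0.0, rho0, 0.3)            # some event on the world-line rho = rho0
dV = 0.01                            # small velocity increment
dtau2 = interval_after_boost(Q[0], Q[1], dV)
predicted = 2.0 * rho0 ** 2 * (gamma(dV) - 1.0)
acceleration = dV / math.sqrt(dtau2)  # approximates the proper acceleration
print(dtau2, predicted, acceleration, 1.0 / rho0)
```

Repeating the calculation with a different starting event on the same world-line gives the same interval, and a larger `rho0` gives a smaller acceleration.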
(C) Redshifts measured by fundamental observers Because the observers are
not moving inertially, the analysis of Sections 3.1 and 3.2 no longer holds.
However, we can easily calculate the observed K-factor for this family of
observers. Consider light emitted at an event $r_1$ by an observer $O_1$ on the world-line
$L_1$: {$\rho = \rho_1$} and received at an event $r_2$ by an observer $O_2$ on the world-line
$L_2$: {$\rho = \rho_2$} (Fig. 4.33). Under the boost (4.37a), for some chosen value of $\Delta V$,
light rays are mapped into light rays. Thus, if $r_1$ is mapped to $r_1'$ on $L_1$ and $r_2$ is
mapped to $r_2'$ on $L_2$, then the light ray from $r_1$ to $r_2$ is mapped to a light ray from
$r_1'$ to $r_2'$. By (4.39a), the proper time $\Delta\tau_1$ from $r_1$ to $r_1'$ is given by
$\Delta\tau_1^2 = 2\rho_1^2\{\gamma(\Delta V) - 1\}$, (4.41a)
and the proper time $\Delta\tau_2$ from $r_2$ to $r_2'$ is given by
$\Delta\tau_2^2 = 2\rho_2^2\{\gamma(\Delta V) - 1\}$. (4.41b)
Fig. 4.33 Light is emitted at event $r_1$ on the world-line $O_1$ ($\rho = \rho_1$) and received at event
$r_2$ on the world-line $O_2$ ($\rho = \rho_2$). When event $r_1$ is boosted to the event $r_1'$ at a proper time
$\Delta\tau_1$ later, the light ray is boosted to another light ray linking these world-lines (since both
light rays and the world-lines are invariant under these Lorentz transformations). The
second ray is emitted at $r_1'$ on $O_1$ and received at event $r_2'$ on $O_2$, a time $\Delta\tau_2$ after $r_2$.
Taking the ratio of these equations, we find $\Delta\tau_1/\rho_1 = \Delta\tau_2/\rho_2$; hence the time
intervals are related by
$K = \Delta\tau_2/\Delta\tau_1 = \rho_2/\rho_1$. (4.42)
This expression is independent of $\Delta V$, so on considering the limit for small $\Delta V$, it
gives the observed K-factor at each instant and by (3.3) determines the redshift
observed by $O_2$ for radiation emitted by $O_1$. This redshift is due to the accelerated
motion of the observers; since it depends only on the ratio of the two distances $\rho_1$
and $\rho_2$, it is independent of time. The redshift increases as $\rho_2$ increases and as $\rho_1$
decreases, and diverges if either $\rho_1 \to 0$ or $\rho_2 \to \infty$.
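The K-factor between two fundamental observers can be checked by tracing light rays directly. The sketch below is a numerical cross-check under stated assumptions, not the derivation used in the text. It parametrises each world-line by proper time as t = ρ sinh(τ/ρ), X = ρ cosh(τ/ρ), a standard parametrisation of the hyperbola $-t^2 + X^2 = \rho^2$ that is assumed here. It follows an outgoing ray (t - X = constant) from the inner world-line to an outer one, so it applies only when `rho2 > rho1`. All names are hypothetical.

```python
import math

def event_on_worldline(rho, tau):
    """Event at proper time tau on the world-line -t^2 + X^2 = rho^2 (c = 1)."""
    return rho * math.sinh(tau / rho), rho * math.cosh(tau / rho)

def reception_proper_time(rho1, rho2, tau1):
    """Proper time on world-line rho2 at which light emitted at tau1 on rho1 arrives.

    Uses the outgoing ray X - t = constant; intended for rho2 > rho1.
    """
    t1, X1 = event_on_worldline(rho1, tau1)
    u = X1 - t1                         # conserved along the outgoing light ray
    t2 = 0.5 * (rho2 ** 2 / u - u)      # from (X - t)(X + t) = rho2^2
    return rho2 * math.asinh(t2 / rho2)

def k_factor(rho1, rho2, tau1, dtau1=1e-4):
    """Ratio of reception interval to emission interval for two nearby signals."""
    first = reception_proper_time(rho1, rho2, tau1)
    second = reception_proper_time(rho1, rho2, tau1 + dtau1)
    return (second - first) / dtau1

print(k_factor(1.0, 3.0, 0.2), k_factor(1.0, 3.0, 5.0), k_factor(2.0, 5.0, -1.0))
```

The printed ratios do not depend on the emission time, and each matches the ratio of the two labels ρ, in line with (4.42).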
(D) Redshifts relative to a stationary observer A more complex calculation
determines the K-factor if the emission events $r_1$ and $r_1'$ are on the exceptional
world-line $L_0$ through the origin O (Fig. 4.34). One finds after a certain amount of
algebra that
$\Delta\tau_1 = \rho_2^2\{1 - (1 - \Delta V)\gamma\}/\{t + (\rho_2^2 + t^2)^{1/2}\}$
where t is the time of reception of the signal at the event $r_2$, while $\Delta\tau_2$ is given by
(4.41b). Taking the ratio determines K. In the limit of small $\Delta V$ and dropping the
subscript `2', one finds
$K = \{t + (\rho^2 + t^2)^{1/2}\}/\rho$. (4.43)
This gives both blueshifts (for negative t, as L approaches $L_0$) and redshifts (for
positive t, as L recedes from $L_0$) of indefinitely large magnitude for t large enough
in magnitude.
(E) The event horizon A little reflection on the last example or on Fig. 4.31 will
show that the observer on $L_0$ can only receive signals from the observer on L when
t > 0, but can only send signals to him when t < 0. Thus, any fundamental
observer L cannot send a signal to $L_0$ and receive an answer! In fact, it is clear
(Fig. 4.35) that all events for which t - X > 0 cannot send signals to L, while
all events for which t + X < 0 cannot receive signals from L. The surface {t = X}
is called an event horizon for these fundamental observers. All the events
`the other side' of the horizon, i.e. for which t > X, are forever hidden from the
fundamental observers: they can never know what happens there.
Fig. 4.34 Light signals emitted from the exceptional world-line $L_0$ at events $r_1$ and $r_1'$, and
received by the uniformly accelerating observer L.
To clarify this, suppose an observer L in a spaceship moving as a fundamental
observer at time t = 0 releases an astronaut in a capsule which then falls freely
(i.e. no forces act on it). Since it moves inertially, its world-line is a straight line C
(Fig. 4.36). At any time until the capsule crosses the event horizon at the event Q,
the astronaut could return to the spaceship by turning on a sufficiently powerful
rocket motor. However, after the event Q, the capsule can never return to the
spaceship: it would have to move faster than light to do so. It can be thought of as
`trapped' by the event horizon, a surface in space-time which it cannot cross in
one direction. Neither can it send any signals to the spaceship to tell what has
happened to it. As far as the outer world (t < X) is concerned, the astronaut has
then effectively ceased to exist.
Suppose C sends out signals at regular intervals that are received by L
(Fig. 4.36). For simplicity, suppose the event Q is measured by C to occur at 12:00
noon. Then the regular signals sent out before 12:00 noon will all eventually be
received by L, but the 12:00 signal will not, neither will any subsequent signal.
Watching C's clock through a telescope, L will never see it reach 12:00 o'clock. In
fact, the regular signals will be received by L at longer and longer time intervals,
the last minute to noon in C's history being seen by L in an infinite length of time;
that is, the Doppler-shift factor K diverges and the redshift becomes infinite.
This is clear from the diagram because this last minute is seen by L over his entire
remaining history. It also follows directly from (4.43), because $t \to \infty$ on
L's world-line in the distant future. As the redshift diverges, the image intensity
will decrease to zero (by eqn (4.36b)). Thus observing C continuously, L will see
all activity on C slowing down indefinitely; the observed redshift will increase
without limit, and the image will fade away. The event Q and all subsequent
events will be unobservable to L, but as far as C is concerned, nothing special at
all will happen there. This behaviour is exactly similar to that of a particle watched
by an outside observer as it crosses the event horizon of a black hole (see
Chapter 6).

Fig. 4.35 The event horizons t = ±X in a Rindler universe. A fundamental observer with
world-line L cannot send signals to events in the region t < -X behind the past event
horizon t = -X, and cannot receive signals from events in the region t > X lying behind
the future event horizon t = +X.
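The lengthening reception intervals for the capsule's signals can be sketched numerically. The Python fragment below assumes a specific set-up that the text does not spell out: L follows the hyperbola X² - t² = ρ², the capsule is released at t = 0 (when L is momentarily at rest in these coordinates) and so stays at X = ρ, and L's proper time is measured from the release. Under these assumptions the capsule crosses the horizon t = X when its own clock reads ρ. The function name is illustrative.

```python
import math

def reception_proper_time(t_emit, rho):
    """Proper time on L at which a signal sent by the capsule is received.

    t_emit is the capsule's clock reading at emission (its proper time,
    equal to coordinate time t because it stays at X = rho).
    The outgoing light ray X - t = rho - t_emit meets the hyperbola
    X^2 - t^2 = rho^2 where exp(beta) = rho / (rho - t_emit), and
    proper time on L is rho * beta.
    """
    if t_emit >= rho:
        return math.inf          # emitted at or after the event Q: never received
    return rho * math.log(rho / (rho - t_emit))

rho = 1.0
emit_times = [0.0, 0.25, 0.5, 0.75]      # regular signals sent before Q
recv_times = [reception_proper_time(t, rho) for t in emit_times]
gaps = [b - a for a, b in zip(recv_times, recv_times[1:])]
print(gaps)                               # each gap is longer than the one before
```

The reception time grows without bound as the emission time approaches ρ, which is the statement that L never sees the capsule's clock reach the reading at Q.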
(F) The metric form Finally, it is interesting to see how the metric form (4.32a)
is transformed if we change to coordinates adapted to the symmetry of the world-lines.
We do so by using as coordinates $\rho$ (given by (4.37b)) and a quantity $\beta$
determined from $\tau$ by the relation $d\tau = \rho\, d\beta$ along the world-lines (this relation
is just the infinitesimal limit of relation (4.39b)). These are comoving coordinates
for the fundamental observers: $\rho$ labels the world-lines, and $\beta$ is a time parameter
(but not proper time) along them. Explicitly, $\beta$ is the `hyperbolic velocity' related
to V in (4.37c) by $V = \tanh\beta$; then $\gamma(V) = \cosh\beta$ and $V\gamma(V) = \sinh\beta$. This
implies (4.37c) can be written*
$X = \rho\cosh\beta$, $t = \rho\sinh\beta$ (4.44a)
(to check this, use (4.32a) and (4.44a) to determine $d\tau$ along the world-lines
on which $d\rho = 0$). From the definition of $\beta$ and the fact that $\rho$ measures radial
distance, the metric form may be written
$ds^2 = -\rho^2\, d\beta^2 + d\rho^2$. (4.44b)

*Here, $\cosh\beta = \tfrac{1}{2}\{\exp\beta + \exp(-\beta)\}$, $\sinh\beta = \tfrac{1}{2}\{\exp\beta - \exp(-\beta)\}$, $\tanh\beta = (\sinh\beta)/\cosh\beta$,
where exp is the exponential function which can be given in terms of a power series by $\exp x =
1 + x + x^2/2! + x^3/3! + x^4/4! + \cdots$. From these relations, it follows that $\cosh^2\beta - \sinh^2\beta = 1$,
$\cosh 0 = 1$, $\sinh 0 = 0$, $\tanh 0 = 0$ (more details of these `hyperbolic functions' may be found in any
standard book on calculus).
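The change of coordinates can be checked by direct substitution. The following is a sketch of that calculation, assuming that (4.32a) is the two-dimensional flat interval $ds^2 = -dt^2 + dX^2$ (an assumption here, since that equation is not reproduced in this section) and using $X = \rho\cosh\beta$, $t = \rho\sinh\beta$ from (4.44a).

```latex
\begin{align*}
dX &= \cosh\beta\, d\rho + \rho\sinh\beta\, d\beta, \\
dt &= \sinh\beta\, d\rho + \rho\cosh\beta\, d\beta, \\
-dt^2 + dX^2
   &= (\cosh^2\beta - \sinh^2\beta)\, d\rho^2
      - \rho^2(\cosh^2\beta - \sinh^2\beta)\, d\beta^2 \\
   &= d\rho^2 - \rho^2\, d\beta^2 .
\end{align*}
```

The cross terms in $d\rho\, d\beta$ cancel between $dX^2$ and $dt^2$. On a world-line with $d\rho = 0$ the interval reduces to $ds^2 = -\rho^2 d\beta^2$, which is consistent with the relation $d\tau = \rho\, d\beta$ quoted in the text.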
Exercises
4.20 (a) Explain why it is necessary for a force to act to keep a fundamental observer in
a Rindler universe on his world-line. In what way might one produce the required force?
(b) Noting that this force (measured at each instant in the observer's rest-frame) must be
constant for an infinite proper time along his world-line, what physical considerations
suggest that this would be difficult to achieve in practice in some circumstances?
4.21 Find and sketch the paths of light rays in a Rindler universe in terms of the
coordinates in the interval (4.44b). What is the coordinate speed of light
at a point $(\rho, \beta)$?
4.22 (a) Derive (4.39) and (4.40) from the preceding equations; (b) derive the formula
(4.43) for the redshift relative to a stationary observer as follows.
(i) Write down the equations of the forward light-rays through the events $r_1$ $(t_1, 0)$ and
$r_1'$ $(t_1', 0)$.
(ii) Use these equations to relate $t_1$ and $t_1'$ to the coordinates of $r_2$ $(t_2, X_2)$ and
$r_2'$ $(t_2', X_2')$ where the light rays meet the path of the observer $O_2$: $\rho = \rho_2$.
(iii) Express $\Delta t_1 = t_1' - t_1$ in terms of $t_2$ and $\rho_2$, by using
Fig. 4.37 The Milne universe. The world-lines are generated by repeatedly applying
a boost through a speed $\pm\Delta V_0$ to the world-line $L_0$. The surfaces of uniformity
(or homogeneity) are given by $S^2 = -T^2$.
Because the world-lines are straight lines, T is just proper time measured along
these world-lines from O; so these surfaces are surfaces of constant proper time in
the history of the fundamental observers. The boost (4.37a) leaves these surfaces
invariant and so moves the intersection Q of any world-line L with a surface S to a
point Q' representing the intersection of another world-line L' with the same
surface S. Because the world-lines are generated by repeated use of the transformation
(4.37a) with the same value of $\Delta V$, they are equally spaced in the
surface S. By a calculation similar to that leading to (4.39a), the spatial distance
$\Delta\rho$ between Q and Q' is given by
$\Delta\rho^2 = 2T^2\{\gamma(\Delta V) - 1\}$; (4.46a)
in the limit of small $\Delta V$ ($\Delta V \ll 1$), this becomes
$\Delta\rho = T\,\Delta V$. (4.46b)
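The two expressions (4.46a) and (4.46b) can be compared numerically. The Python sketch below evaluates (4.46a), recomputes the same distance directly from the coordinates of Q and its boosted image Q', and compares both with the small-ΔV form. It assumes units with c = 1, takes Q on the world-line $L_0$ at (t, X) = (T, 0), and assumes the boost has the standard form t' = γ(t + VX), X' = γ(X + Vt). The function names are illustrative.

```python
import math

def gamma(v):
    """Lorentz factor for speed v (units with c = 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def separation_exact(T, dv):
    """Equation (4.46a): sqrt(2 T^2 {gamma(dv) - 1})."""
    return math.sqrt(2.0 * T * T * (gamma(dv) - 1.0))

def separation_direct(T, dv):
    """Invariant distance between Q = (T, 0) and its image under a boost dv."""
    g = gamma(dv)
    t_q, x_q = g * T, g * dv * T            # boosted event Q'
    return math.sqrt(x_q ** 2 - (t_q - T) ** 2)

def separation_small(T, dv):
    """Equation (4.46b): the limit T * dv for dv much less than 1."""
    return T * dv

for dv in (0.3, 0.03, 0.003):
    print(dv, separation_exact(2.0, dv), separation_direct(2.0, dv), separation_small(2.0, dv))
```

The first two columns agree for every ΔV, while the third approaches them only as ΔV becomes small.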
Just as we arrived at the invariant metric form (4.44b) in the Rindler universe, if
we here use $(T, \beta)$ as coordinates for this universe model, where $d\rho = T\, d\beta$, we
obtain the metric form
$ds^2 = -dT^2 + T^2\, d\beta^2$ (4.47a)
for these space-times where the fundamental world-lines are the curves
{$\beta$ = constant}. As before, $\beta$ is the hyperbolic velocity related to V in (4.37a) by
$V = \tanh\beta$; because the curve $L_0$ goes through the point {t = T, X = 0} we can
express the transformation (4.37a) in this case as
$t = T\cosh\beta$, $X = T\sinh\beta$, (4.47b)
where $\beta$ labels the fundamental world-lines and T is proper time along them. The
spatial homogeneity of the space-time is manifest here, because the form (4.47a)
is independent of the spatial variable $\beta$.
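The homogeneity can also be seen numerically: a boost leaves T unchanged and simply shifts the hyperbolic velocity by a constant. The Python sketch below inverts (4.47b) and applies a boost to a sample event. It assumes units with c = 1 and the standard boost t' = γ(t + VX), X' = γ(X + Vt); the function names and the sample values are illustrative.

```python
import math

def boost(t, x, v):
    """Boost the event (t, X) through speed v (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t + v * x), g * (x + v * t)

def to_milne(t, x):
    """Invert (4.47b): return (T, beta) for an event with t > |X|."""
    return math.sqrt(t * t - x * x), math.atanh(x / t)

T_before, beta_before = to_milne(2.0, 0.5)
T_after, beta_after = to_milne(*boost(2.0, 0.5, 0.6))

print(T_before, T_after)                     # proper time T is unchanged
print(beta_after - beta_before)              # shift equals artanh(0.6)
```

Because the shift in $\beta$ does not depend on the starting event, repeated boosts through the same speed generate world-lines that are equally spaced in $\beta$ on each surface of constant T.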
The spatial distance between two world-lines of the family of fundamental
observers, measured in a surface S (dT = 0), will be
Fig. 4.38 A boost through $\Delta V$ applied to the event Q on the world-line L moves it to the
event Q' where a second world-line L' intersects the same surface of homogeneity; clearly
L' is at a larger distance from $L_0$ than L, and is moving at a higher speed relative to $L_0$ than
L. The surface $t = t_0$ is not a surface of homogeneity because it crosses an infinite number
of world-lines near the boundary $\mathcal{S}$ (t = ±X).
Fig. 4.39 (a) An observer on the galaxy with fundamental world-line $L_0$ sees all the other
galaxies to be receding from him in all directions. (b) The same is true for an observer on
any other galaxy with fundamental world-line L' say.
Fig. 4.40 (a) A projector throws a picture of a cluster of galaxies on a screen. (b) If the
screen is moved steadily further and further away, the images of the galaxies on the screen
move further and further apart from each other; the appearance is just like that of an
expanding universe. (c) The relation between the Hubble constant $H_0$ in a Milne universe,
and its age $T_0$.
This feature of every galaxy receding equally from every other one is perhaps
difficult to grasp at first, but can be visualized in the following manner: consider a
projector throwing images of a cluster of galaxies on a distant screen (Fig. 4.40a).
If the screen is moved further away from the projector, the whole scene depicted
increases in scale and the image of each galaxy moves away from the image of
each other galaxy without there being a centre to this apparent expansion
(Fig. 4.40b). Thus, if the screen is steadily moved away, one will see visually
depicted on it the expansion of a small section of the universe. The space-time
diagram formed from a succession of these images on the screen will be just like
Fig. 4.37.
(D) The Hubble constant The Hubble constant $H_0$ measures the rate of
expansion of the universe at a specified time $T_0$. It is defined as the rate of change
of distance to a nearby galaxy per unit proper time divided by the distance to that
galaxy, this ratio being evaluated at the time $T_0$. In the case considered here, we see
from (4.46) for a given pair of galaxies at times $T_1$ and $T_2$, that $\Delta\rho_1 = T_1\Delta V$ and
$\Delta\rho_2 = T_2\Delta V$, so the change of distance in time $\Delta\tau = T_2 - T_1$ is
$\Delta\rho_2 - \Delta\rho_1 = \Delta\tau\,\Delta V$. Hence $H_0 = (\Delta\tau\,\Delta V/\Delta\tau)/(T\Delta V) = 1/T$ evaluated at the time $T_0$, i.e.
$H_0 = 1/T_0$, which clearly decreases with the age of the universe (Fig. 4.40c).
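To turn the relation $H_0 = 1/T_0$ into numbers, the Hubble constant must first be converted from km/sec per Mpc to inverse seconds. The Python sketch below does this conversion. The constants are rounded standard values (an assumption of this sketch, not figures from the text), and the function name is illustrative.

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec (rounded)
SECONDS_PER_YEAR = 3.156e7    # seconds in one year (rounded)

def milne_age_years(h0_km_s_mpc):
    """Age T0 = 1/H0 of a Milne universe, in years.

    h0_km_s_mpc is the Hubble constant in km/sec per Mpc.
    """
    h0_per_second = h0_km_s_mpc / KM_PER_MPC     # H0 in units of 1/sec
    return 1.0 / h0_per_second / SECONDS_PER_YEAR

print(milne_age_years(70.0))
```

With the purely illustrative value of 70 km/sec per Mpc this gives an age of roughly 14 × 10⁹ years; halving the Hubble constant doubles the age.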
(E) Initial singularity Since the expansion is linear, then if it is followed back
in time to O ($\tau = 0$), there is a `Big Bang' at O where all the matter world-lines
intersect (by (4.48), the distance between every pair of galaxies goes to zero there).
Clearly then the matter density is infinite at the origin O. However since the
surfaces S are surfaces of constant density, this means that the matter density
increases everywhere on these surfaces as $T \to 0$, and so goes infinite at all points
on the boundary $\mathcal{S}$ (Fig. 4.41). Accordingly, this boundary should really be
regarded as the edge of the universe model, because the spatially homogeneous
region where the matter is expanding and has a finite density is bounded by this
surface. Thus, having constructed the universe model, it is regular only within the
region bounded by the surfaces t = ±X, and the exterior region should be discarded because it is separated
from the expanding universe region by infinite-density surfaces.
While there is an edge to the galaxy distribution in each surface {t = constant},
when we exclude the exterior region we cannot really regard the model as
representing an expansion of the matter in the universe into a surrounding
vacuum. How can we then interpret what is happening? The key is to note that
there is no boundary or edge to the galaxies in the surfaces of homogeneity
T = constant. Thus when analysed in terms of these surfaces, the expansion does
not take place into a surrounding vacuum or anything else, but is simply a continuous
increase in distance between every pair of galaxies in these surfaces,
which are infinite in extent. This describes completely what is happening in the
universe model, because these surfaces completely cover the space-time region
representing the expanding universe (Fig. 4.41).

Fig. 4.41 The `Big Bang': at the point O where all the world-lines intersect, the density of
matter is infinite. As the surfaces shown are surfaces of constant density, the density is also
infinite on the surfaces $\mathcal{S}$, which are therefore the boundary of this universe model: the
spatially homogeneous expanding universe region comes to an end at these surfaces where
the matter density diverges. The event O is the beginning of the universe.
Fig. 4.42 The past light cone $C^-(p)$ of an event p at time $T_0$ on a world-line L intersects all
the other fundamental world-lines in the universe before reaching the boundary surface $\mathcal{S}$.
Thus the observer on L can see all the galaxies in the universe. However the furthest spatial
distance to which L can have measured by radar at that time is $\tfrac{1}{2}T_0$, the distance to the event
R where $C^-(p)$ intersects $\mathcal{S}$.
The past light-cone $C^-(p)$ of any point p on a world-line L intersects all the
other world-lines back to $\mathcal{S}$. Thus in principle each fundamental observer can at
all times see and communicate with every other galaxy in the universe, even
though there are an infinite number of them. By (3.10a) the Doppler shift factor
will diverge as one looks to earlier and earlier times (i.e. to galaxies for which
$T \to 0$ and the relative velocity $v \to c$), so by (3.3) the redshift will also diverge
there and by (4.36b) the intensity of received light will fade away to zero.
By contrast, although at each time $T_0$ each observer can receive signals from all
the other galaxies in the universe, the distance measured by radar to the limiting
observable event R in any direction would be just $\tfrac{1}{2}T_0$, so one might say that the
size of the observed universe is just $T_0$. Every fundamental observer would agree
on this measurement (Fig. 4.42).
Four-dimensional Milne universes One can construct four-dimensional flat-
space Milne universe models that have all the essential features discussed above;
these will be presented in Chapter 7. Since these are flat space-times, eqns (4.35)
(with $r = \rho$) and (4.36) will determine the observed flux and intensity of radiation
in such universes. These universe models display many features of the curved-
space-time expanding universe models which we will examine in Chapter 7.
Exercises
4.24 In a diagram of the Milne universe, draw in the world-lines of some inertially
moving particles. Why will each such particle eventually be at rest relative to the fundamental
observers and matter around it? Suppose a particle is emitted from the origin at
time $t = t_0$ and moves freely with speed $V_0$. Which is the furthest fundamental observer
(i.e. the one with the largest value of V) which this particle can reach, given an infinite
amount of time?
4.25 Derive eqns (4.46a) and (4.46b).
4.26 Suppose the Hubble constant is measured to be $H_0$ = 50 km/sec per Mpc, where
one `megaparsec' (Mpc) is $3.26 \times 10^6$ light years, and the age of the oldest stars in globular
clusters in the universe is established to be $16 \times 10^9$ years. Is this data consistent with a
Milne universe model? What if we find that the Hubble constant is really 100 km/sec
per Mpc?
4.27 Deduce from the interval (4.47a) that the redshift z of light observed by a fundamental
observer A at time $T_0$ for light emitted by a fundamental galaxy at time $T_G$ is given
by
$1 + z = T_0/T_G$.
Hence prove that the redshift observed by A at a given time $T_0$ will diverge as he examines
spectra emitted by galaxies at earlier and earlier times (i.e. as $T_G \to 0$). What does this
imply about the measurements A might make of the flux or intensity of radiation emitted
by galaxies at very early times in the history of this universe? [For simplicity, assume here
that the light emitted by each galaxy is constant throughout its history.]
5
Curved space-times
Fig. 5.1 The bending of light rays by the gravitational field of a massive object; the paths
in space and in space-time are no longer straight.
5.1 The general concept
it out flat on a plane after making appropriate cuts where necessary. If distortion,
gaps, or overlap arise at any point in this process then the surface is curved there.
If the surface has positive curvature (e.g. the summit of a hill) there will be gaps in
the projection onto the plane (Fig. 5.2a). If the curvature is negative (e.g. the
saddle-shaped surface between two neighbouring hills) there will be overlap in
the projection (Fig. 5.2b).
Fig. 5.2 (a) A surface with positive curvature. Because the circumference of a circle of
radius r is less than $2\pi r$, if we flatten a section of it onto the plane it will tear, and there
will then be gaps in this projection onto the plane. (b) A surface with negative curvature.
Because the circumference of a circle of radius r is greater than $2\pi r$, if we flatten a
section of it onto the plane it will fold and there will then be overlaps in this projection
onto the plane (see `The mathematics of three-dimensional manifolds', W. P. Thurston
and J. R. Weeks, Scientific American, July 1984, pp. 103 and 106).
Fig. 5.3 The equator and lines of constant longitude are great circles (`geodesics') on the
surface of the Earth.

Geometrical relationships in curved spaces differ from those in flat spaces. As
an example, consider the surface of a sphere; we can regard this as an idealized
model of the surface of the Earth. Great circles are the curves in this surface where
any plane through the centre of the sphere intersects it, e.g. lines of constant
longitude, and the equator (Fig. 5.3). The analogue, on this surface, of a straight
line is a great circle, because (i) when one moves on the surface of the sphere, these
are the curves of shortest distance between any two points (as can be seen by
stretching a piece of elastic between two points on a sphere), and (ii) these are the
curves obtained if one starts out from any point on the surface of the sphere in a
given direction and then moves on this surface without deviation from its
direction of motion (think of a ship or aircraft steering straight ahead, deviating
neither to the left nor the right). We shall call curves in any space that have these
two properties, geodesics of the space; thus great circles are geodesics on the
surface of a sphere. Now if you try drawing a triangle on the surface of a sphere,
with sides given by great circles, you will find that the angles do not add up to
180°; indeed one can find such a triangle for which every corner is 90° (Fig. 5.4a).
Further, if you follow two such curves that start off parallel to each other (e.g.
they are both initially at right angles to the equator, see Fig. 5.4b) the distance
between them does not remain the same; on the contrary they eventually intersect
each other. If two aircraft start off exactly parallel to each other, and fly straight
ahead at the same height above the surface of the Earth, they will eventually
collide. Thus the geometry of this curved space is different from that of a flat
space; Euclid's axiom, that parallel straight lines never meet, is untrue. Further, it
is intuitively clear that the smaller the radius of the sphere considered, the more
highly curved is its surface, and then the shorter is the distance until initially
parallel great circles intersect (Fig. 5.4c). Thus this distance provides a measure of
the amount of curvature of the surface.
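The triangle with three right angles can be checked with a short calculation. The Python sketch below places the vertices at the North Pole and at two points on the equator 90° of longitude apart (a choice made here for illustration), and measures each interior angle between the great-circle arcs meeting at that vertex. It also gives the distance from the equator to the pole along a great circle, which is the distance travelled before two initially parallel great circles intersect. The function names are illustrative.

```python
import math

def angle_at(a, b, c):
    """Interior angle, in degrees, at vertex a of the spherical triangle abc.

    a, b, c are unit vectors from the centre of the sphere. The angle is
    measured between the directions in which the great-circle arcs a->b
    and a->c leave a, found by projecting b and c onto the plane
    orthogonal to a.
    """
    def tangent(p):
        d = sum(pi * ai for pi, ai in zip(p, a))
        v = [pi - d * ai for pi, ai in zip(p, a)]
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]
    u, w = tangent(b), tangent(c)
    cos_angle = sum(x * y for x, y in zip(u, w))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def quarter_circle_distance(r):
    """Distance along a great circle from the equator to the pole."""
    return math.pi * r / 2.0

north_pole = (0.0, 0.0, 1.0)
equator_a = (1.0, 0.0, 0.0)
equator_b = (0.0, 1.0, 0.0)

angle_sum = (angle_at(north_pole, equator_a, equator_b)
             + angle_at(equator_a, equator_b, north_pole)
             + angle_at(equator_b, north_pole, equator_a))
print(angle_sum)
```

The sum comes out as 270° rather than 180°, and `quarter_circle_distance` shrinks in proportion to the radius, in line with the remark that a smaller sphere is more highly curved.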
A curved (four-dimensional) space-time is rather more difficult to imagine,
but geodesics can again be defined in essentially the same manner and similar
kinds of effects occur. This will be made clear in this and the following chapters.
In this chapter we consider the nature of curved space-times, and how they are
described mathematically. As a preliminary to this we first examine Einstein's
principle of equivalence, which underlies the curved space-time understanding of
the nature of gravitation.
Fig. 5.4 (a) A `spherical triangle' formed by three great circles (the equator and two lines
of longitude meeting at a right angle at the North Pole). Each of the three interior angles of
the triangle is 90°. (b) Two great circles (lines of longitude), initially parallel to each other at
the equator, intersect at the North Pole. (c) The distance d from the equator to the
intersection of these initially parallel great circles is shorter if the radius r of the sphere is
shorter; then the surface of the sphere is more highly curved.
Exercises
5.1 Pick a point P on a plane, and draw various circles of radius r with P as centre.
Repeat the procedure on the surface of a sphere of radius a. In both cases, find the ratio
R = C/r between the circumference C and radius r of each circle (for the circles drawn on
the sphere, measure the radius along a geodesic on the sphere). How does the ratio R for the
circles on the sphere depend on their radius? [You can do this exercise experimentally,
actually drawing the circles on a piece of paper and on a ball, or use simple trig to calculate
the answers you would obtain if you actually carried the experiment out.] How would R
vary with the radius a of the sphere?
5.2 The basic problem of mapping the surface of the world in an atlas arises because
the Earth's surface is not flat. Consider this problem in the light of the above discussion.
Can you characterize the kinds of distortion that are likely to arise in mapping the Earth's
surface on a flat map (as in an ordinary atlas)? How could you minimize this distortion
best? In attempting a least-distorted map of the Earth's surface by `cutting' into separate
areas and projecting these onto a plane, would you expect to find gaps or overlaps in this
projection?
5.3 Consider the surface of a cone. By projecting (i) a region including the vertex, and
(ii) a region not including the vertex, onto a plane so as to preserve distances and angles,
determine the nature of its curvature.

5.2 Acceleration and gravitation
independent of its mass m. Thus, different objects accelerate at the same rate in a
gravitational field, irrespective of their mass or composition. Indeed, this is the
essential content of Galileo's famous observation that bodies of all kinds fall at
the same rate when air resistance can be ignored. It also underlies the fact that we
do not have to know the composition or nature of a planet in order to calculate its
orbit (the outer planets such as Saturn and Jupiter, composed mainly of
hydrogen and helium, move on elliptic orbits, just as do the inner
planets such as Mars and the Earth, made mainly of rock and iron).
This fundamental feature has two major consequences which we consider in
turn. We consider the principle of equivalence in this section, and the meaning of
geodesics in the next.
the laws of physics are the same for all observers, no matter what their state of
motion.
As we shall now see, this leads to a new understanding of the nature of gravity.
It is clear that the gravitational force measured by an observer depends criti-
cally on his state of acceleration. It is convenient here to think of an observer
carrying out experiments in a lift (in the USA: an elevator). As long as the lift is
stationary or in uniform motion, the results are identical to those she finds in a
stationary laboratory on the Earth's surface. For simplicity, consider the lift
when stationary; the Earth's gravity acts on the lift and on the observer in it.
Tension exerted by the cable holding the lift (Fig. 5.5a) prevents it accelerating
downwards at the rate g observed for every freely falling object (where g has
approximately the value 9.8 m/sec$^2$ at the surface of the Earth, determined by (*)
with M as the Earth's mass and r its radius). The reaction exerted by the floor of
the lift on the observer prevents her from falling down the lift shaft; she experi-
ences this as her weight. If she releases a glass held in her hand, it accelerates
downward relative to her at the rate g and breaks on hitting the floor. Because of
the equivalence of gravitational and inertial mass, the same acceleration is
experienced by all bodies no matter how heavy they are (within limits) or what
they are made of, this being demonstrated by Galileo's celebrated experiments at
the leaning tower of Pisa, and other more modern versions of that experiment.
Fig. 5.5 (a) An observer in a stationary lift, held in place by the tension T in the cable.
The force of gravity holds her to the floor; an object dropped by her will accelerate to
the floor of the lift at the rate g. (b) An observer in a lift in free fall after the cable has
broken. She will not experience any force holding her to the floor; an object dropped by
her will float next to her as it accelerates downwards at the same rate g as she does.

However, if the cable attached to the lift breaks (Fig. 5.5b), and we ignore
friction and air resistance, then relative to the Earth's surface the lift will
accelerate downwards at the rate g (since it will be a freely falling object). The
observer also accelerates downwards relative to the Earth at this rate, because
the floor no longer prevents this happening: it accelerates away from her at just
the free-fall rate, and so exerts no force to slow down her fall. Since the reaction
from the floor now vanishes, she will no longer feel her weight holding her down
on the floor. Thus, as far as she is concerned, the force of gravity now appears to
have no effect. If she releases a glass held in her hand, it will accelerate downwards
relative to the Earth at the rate g, precisely as she is doing, and so will float next to
her at a constant distance above the floor (which is also accelerating down,
relative to the Earth, at the rate g). Thus, because all freely falling bodies
experience the same acceleration in a gravitational field, any freely falling object
will appear to be stationary in the observer's reference frame. Measured by local
experiments in this accelerating reference frame, the Earth's gravitational field no
longer causes objects to accelerate towards the floor of the lift at the rate g. Its
usual effects have been transformed away by changing to an accelerating refer-
ence frame.
One can make the point even more strongly by considering what the
observer would experience if one were to attach rockets to the roof of the lift
to accelerate it downwards at a rate 2g (Fig. 5.6). She can then stand as if in a
normal gravitational field with her feet on the ceiling of the lift! Gravity tends to
accelerate her down at a rate g relative to the Earth, but the roof of the lift
accelerates down at 2g; the reaction exerted by the roof on her feet will act to make
her accelerate down at the rate 2g instead of the free fall rate g. Consequently, the
observer would apparently experience a perfectly normal force of gravity acting
from the floor to the roof, holding her against the roof. If she releases a glass from
her hand, relative to her it will accelerate towards the roof at the rate g and break
on hitting the ceiling. From experiments within the lift, she will measure a
standard value for the acceleration due to gravity but would regard the roof as
`down' and the floor as `up'. Thus, by changing to an appropriately accelerating
reference frame, one can reverse the effective direction of gravity (for a short
while!).
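The three lift experiments (stationary, free fall, and driven downwards at 2g) differ only in the acceleration of the reference frame, and the apparent motion of a released object follows by subtraction. The Python sketch below makes this bookkeeping explicit under the Newtonian assumption that accelerations relative to the Earth simply subtract; the function name and sign convention are illustrative.

```python
G = 9.8  # acceleration due to gravity at the Earth's surface, in m/sec^2

def apparent_acceleration(lift_accel_down, g=G):
    """Acceleration of a released object relative to the lift.

    lift_accel_down is the lift's downward acceleration relative to the
    Earth. The released object accelerates downwards at g relative to the
    Earth, so relative to the lift it accelerates at g - lift_accel_down.
    A positive result points towards the floor, a negative one towards
    the roof.
    """
    return g - lift_accel_down

print(apparent_acceleration(0.0))      # stationary lift: falls to the floor at g
print(apparent_acceleration(G))        # free fall: floats next to the observer
print(apparent_acceleration(2 * G))    # rocket-driven at 2g: 'falls' to the roof at g
```

The sign change in the last case corresponds to the observer regarding the roof as `down' and the floor as `up'.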
Fig. 5.6 An observer in a lift being accelerated downwards at a rate 2g by a rocket. The
observer is upside down with her feet on the ceiling, and apparently experiences the
normal force of gravity holding her against the ceiling (in the same way as gravity holds
the observer in Fig. 5.5(a) against the floor). An object dropped by her will accelerate
(relative to her) at the rate g towards the ceiling.
Fig. 5.7 (a) An observer A in a lift at rest relative to the Earth (cf. Fig. 5.5(a)). (b) An
observer B in a rocket moving with constant acceleration g far from any massive body. An
object dropped by B will accelerate to the rocket floor at rate g. (c) An observer C in a
lift falling freely under gravity (cf. Fig. 5.5(b)). (d) An observer D in a rocket in free fall
far from any massive body. An object dropped by D will float next to him.
their physical situation are again quite different, can be summarized in the
principle of equivalence:
Fig. 5.8 The freely falling observer C will measure a light ray travelling across the lift to
move in a straight line (because this situation is equivalent to that of observer D in a freely
falling rocket). The same light ray will appear curved to observer A, the stationary
observer in the gravitational field, because C is accelerating relative to A.
Fig. 5.9 (a) The direction of the gravitational field at various points around the Earth.
The directions at P and P' are opposite. (b) An acceleration that transforms away the
gravitational field at P will double it at P', so no single reference frame can transform it
away everywhere. (c) In a flat space-time, a separate accelerated frame is needed at each
point to transform away the gravitational field.
Fig. 5.10 (a) A stone dropped from rest from the top of the Tower of Pisa at 12 noon
on 1 January 1604. (b) The world-line of the stone, starting at the event P in space-time
with an initial four-velocity V. (c) In general the world-line in space-time of a freely
falling object (i.e. an object moving under gravity and inertia only) is uniquely
determined by an initial space-time position Q and an initial four-velocity U defined at
that event.
event, which is just the space-time direction of the world-line at the event P (see
Appendix B). The stone being released from rest, the initial space-time direction
of its world-line is parallel to the t-axis, since this corresponds to no change in the
Z-direction; if it were thrown down instead of being released from rest, its initial
direction would be sloping in the Z-direction.
From this example, it is clear that a similar result will hold in general for any
object moving freely under gravity and inertia alone: the initial conditions needed
to specify the motion are its initial space-time position Q and velocity (a time-like
direction at that event, Fig. 5.10c). Given these, the motion is completely
determined, and is described by a unique time-like path in space-time. For
example, if we know the position of an artificial satellite moving around the Earth
at a particular time, and its motion at that instant, we can predict its future
motion around the Earth as long as no force other than gravity acts on it (e.g. as
long as it does not fire a rocket engine). A unique space-time curve describes this
motion, being completely determined by an initial point in space-time and
direction at that event.
5.3 Freely falling motion and the meaning of geodesics
Planets
Just as gravity curves the paths of light rays in a curved space-time, so it will also
curve the paths of massive objects. Note the inherent non-linearity of the
theory: massive bodies produce space-time curvature which then affects the
motion of these same massive bodies. This is the reason why some calculations in
curved space-times are very difficult. However, here we shall be concerned
mostly with the motion of what are known technically as `test particles', which
just means that we are neglecting their effect on the curvature of space-time, and
seeing how their motion is affected by curvature produced by other more massive
bodies.
The curving of the paths of massive objects by space-time is clearly necessary if
we are to describe the nearly circular motion of the planets as due to gravity
producing a curved space-time. One aspect of this motion may be illustrated by
considering two everyday examples of circular motion. Firstly, consider a ball
made to describe a circular path by someone swinging the piece of string to which
it is attached. The force or tension in the string maintains the circular path.
Secondly, consider a ball following a circular path at a fixed height inside a
hemispherical shell (Fig. 5.11); in this case the reaction of the shell maintains the
circular path. The first of these examples corresponds to the idea of gravity being
a force determining motion; the second embodies the idea of motion being
Fig. 5.11 A ball moving at a fixed height inside a spherical shell is maintained on its
circular path by the curvature of the shell. Bound planetary motion is just like this, the
planet being held in its circular orbit by the curvature of space-time caused by the
gravitational field of the Sun. Bodies with sufficient kinetic energy will escape to infinity,
Fig. 5.12. The orbit of a planet around the Sun is a path of `least distance' (longest
proper time) in space-time, which also has the property that its direction is undeviating
(in the curved space-time). Its spatial projection can be highly curved.
*In space, a geodesic is the path giving the shortest distance between its end-points but in space-
time it is the path giving the longest time between its initial and final points (cf. the discussion in
Section 4.2, and Section 5.4 below).
space-time. How then can a particle moving on a geodesic arrive back at the same
spatial position (as happens, for example, in the case of a planet moving in a
circular orbit around the Sun)? This is difficult to illustrate, but the example
mentioned above of the ball moving in a hemispherical shell gives some insight
into this; for it is clear that if the ball were moving at the equator, it would veer
neither to right nor left, and end up back at the same position. A practical
example that nearly demonstrates this is a motorcycle rider on a `wall of death' at
a fair. In a curved space-time representing the gravitational field of a massive
star, the effect of the space-time curvature is as if its planets were moving on a
smoothly curved surface of revolution that holds those planets with sufficiently
small kinetic energy near it, but lets those with large energy escape to infinity
(Fig. 5.11). One must remember here that the undeviating direction is in space-
time, rather than space; this is not easy to visualize, and in the end we have to rely
on our calculations to see that the paths predicted by the theory do indeed work
out as observed in the solar system, for example, the Earth moving around the
Sun in its nearly-circular orbit, held at this distance by the space-time curvature.
Fig. 5.13 (a) Two particles falling freely from rest towards a star G. The distance between
them decreases as they move towards G. (b) A spherical cloud of particles is distorted as it
falls freely towards a star. (c) The tides on the Earth are produced by the gravitational field
of the Moon. The sea on the side of the Earth nearer the Moon experiences stronger
acceleration than the sea on the far side (cf. the distortion in (b)).
Fig. 5.14 The space-time paths of the freely falling particles in Fig. 5.13(a). They are
parallel initially but meet after a finite time (cf. Fig. 5.4(b)).
Exercises
5.4 Devise a method for constructing the geodesic routes to be used by aircraft flying at
a constant height above the Earth's surface between various cities. In particular, look at
(i) London-Sydney, (ii) New York-Tokyo, (iii) Cape Town-Los Angeles.
5.5 Explain why an astronaut in a satellite orbiting the Earth experiences a state of
weightlessness.
5.6 Two particles are simultaneously released from rest a distance 9 metres apart at
the surface of the Earth, and fall down a tunnel which allows them to fall to the centre
of the Earth. What will happen there? Draw a space-time diagram of this situation.
where $\theta$ and $\phi$ are standard polar coordinates (we can think of $\theta$ as latitude
measured from the north pole, and $\phi$ as longitude; see Fig. 5.15). Just as in the
argument following eqn (4.28b), this shows that the distance measured along a
line of constant longitude ($\phi$ constant) from $\theta_1$ to $\theta_2$ is $a(\theta_2 - \theta_1)$, while the
distance measured along a line of constant latitude ($\theta$ constant) from $\phi_1$ to $\phi_2$ is
$a(\phi_2 - \phi_1) \sin\theta$ (see Fig. 4.22*). For a general small displacement
($d\theta$ in the $\theta$ direction, $d\phi$ in the $\phi$ direction), because the lines of constant
latitude and longitude are at right angles to each other, we very nearly have a
small flat right-angled triangle, and the smaller these displacements are, the more
accurate this approximation is. In such a flat triangle, Pythagoras' result will
hold: the square on the hypotenuse is the sum of the squares on the other two
sides. The form (5.1) shows that the geometry of the curved surface agrees in the
limit of very small displacements with this flat space result. Thus in the limit very
near any point, the geometry of the curved surface is the same as that of a flat
space. This is of course clear on the surface of the Earth: one does not need to use
spherical trigonometry to lay out a football field or design a building!
Fig. 5.15 The angles $\theta$ and $\phi$ used to describe position on the surface of a sphere. Small
increments in $\theta$ and $\phi$ result in displacements $a\,d\theta$ and $a \sin\theta\,d\phi$ on the surface of the
sphere.
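The claim that Pythagoras' result holds on a sphere only in the limit of small displacements can be checked numerically. The sketch below is illustrative only: it assumes the two-sphere metric form $ds^2 = a^2(d\theta^2 + \sin^2\theta\,d\phi^2)$, and the radius, starting point, and step sizes are arbitrary choices, not values from the text.

```python
import math


def metric_distance(a, theta, dtheta, dphi):
    """Pythagorean estimate from ds^2 = a^2 (dtheta^2 + sin^2(theta) dphi^2)."""
    return a * math.sqrt(dtheta ** 2 + (math.sin(theta) * dphi) ** 2)


def unit_vector(theta, phi):
    """Unit vector from the centre of the sphere to the point (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def great_circle_distance(a, theta1, phi1, theta2, phi2):
    """Exact distance along the sphere between two points."""
    u = unit_vector(theta1, phi1)
    v = unit_vector(theta2, phi2)
    dot = sum(ui * vi for ui, vi in zip(u, v))
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    angle = math.atan2(math.sqrt(sum(c * c for c in cross)), dot)
    return a * angle


def relative_error(a, theta, step):
    """Relative error of the flat-triangle estimate for dtheta = dphi = step."""
    exact = great_circle_distance(a, theta, 0.0, theta + step, step)
    approx = metric_distance(a, theta, step, step)
    return abs(approx - exact) / exact


# Radius in km (roughly the Earth), starting 60 degrees from the north pole.
errors = [relative_error(6371.0, math.pi / 3, s) for s in (0.1, 0.01, 0.001)]
```

The entries of `errors` shrink roughly in proportion to the step, which is the numerical counterpart of the statement that the curved surface looks flat very near any point.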
The distinction at this level between flat and curved spaces is that, for a flat
(two-dimensional) space, it is possible to find a coordinate system in which the
metric form is everywhere
$$ds^2 = dx^2 + dy^2 \qquad (5.2a)$$
i.e. with the coefficients of $dx^2$ and $dy^2$ being 1, whereas no such coordinate
system can be found for a curved space (e.g. on the surface of a sphere). Note that
this statement does not imply that the metric form is the same for all coordinate
systems in a flat space; indeed we have seen various other forms for the flat-space
metric in Section 4.2. In a curved two-dimensional space, one can always find
coordinates such that the metric form is (5.2a) at any point P, but it will not be this
at other points (for example, up to a common scaling factor $a^2$ the two-dimensional
metric form (5.1) reduces to this at each point on the line $\theta = \pi/2$ but not
elsewhere). If one could find coordinates such that this form applied everywhere,
this would imply that Pythagoras' theorem holds for arbitrarily large displace-
ments, in contrast to the situation in curved spaces where it only holds in the limit
near each point. Similar results hold for higher-dimensional spaces, e.g. a three-
dimensional space is flat if and only if coordinates x, y, z can be found such that
the metric form everywhere is
$$ds^2 = dx^2 + dy^2 + dz^2 \qquad (5.2b)$$
In general coordinates, the metric form will be different (see e.g. (4.28b)).
*Cf. eqn (4.28b); here we have the same metric form but with r = a = constant, which implies
dr = 0, giving a 2-sphere of radius a as required.
5.4 The metric form and the metric tensor 203
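$$[g_{ij}] = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix} = \begin{pmatrix} a^2 & 0 \\ 0 & a^2 \sin^2\theta \end{pmatrix}$$
Then the metric form (5.1) can be written
$$ds^2 = g_{11} (dx^1)^2 + g_{12}\,dx^1\,dx^2 + g_{21}\,dx^2\,dx^1 + g_{22} (dx^2)^2 \qquad (5.4a)$$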
then (5.4a) gives the metric form (5.2a). Thus the formalism $[g_{ij}]$ may be used to
specify the metric form of a flat two-dimensional space in Cartesian coordinates,
or a (curved) two-sphere in polar coordinates. Examination of other examples
suggests that for a general two-dimensional space in general coordinates, the
metric form can be written as in (5.4a), where the coefficients $g_{ij}$, called the
components of the metric tensor, are symmetric:
$$g_{12} = g_{21} \qquad (5.4b)$$
and otherwise are arbitrary functions of the coordinates $x^1$ and $x^2$. A more
concise way of writing (5.4) is
$$ds^2 = \sum_{i,j} g_{ij}\,dx^i\,dx^j, \qquad g_{ij} = g_{ji} \qquad (5.5a,b)$$
where $\Sigma$ stands for summation over all values of the indices i and j (in this case,
i = 1, 2 and j = 1, 2) and the last equation is understood to hold for all values
of i and j (in this case, i, j = 1, 2).
One great advantage of this notation is that it includes all the cases we have
come across so far, no matter what the dimension of the space (provided we take
the summation over appropriate values). Thus, for example, we recover (4.28b)
from (5.5) on setting $g_{11} = 1$, $g_{22} = r^2$, $g_{33} = r^2 \sin^2\theta$, $g_{ij} = 0$ otherwise, but
obtain (5.2b) if instead we set $g_{11} = g_{22} = g_{33} = 1$, $g_{ij} = 0$ otherwise. Thus the
general concept is that a curved space of n dimensions is described by a metric
form $ds^2$ given by (5.5) where i and j range over the values 1 to n.
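The double summation in the metric form is easy to mechanize. The sketch below is one possible illustration, not part of the original text: the function names are hypothetical, and the metric is simply stored as a list of rows so that the sum $\sum_{i,j} g_{ij}\,dx^i\,dx^j$ can be written out directly for any dimension.

```python
import math


def line_element(g, dx):
    """Return ds^2 = sum over i and j of g[i][j] * dx[i] * dx[j]."""
    n = len(dx)
    return sum(g[i][j] * dx[i] * dx[j] for i in range(n) for j in range(n))


def flat_cartesian_metric(n):
    """Metric components of flat n-dimensional space in Cartesian coordinates."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def spherical_polar_metric(r, theta):
    """Flat three-space in spherical polars: g11 = 1, g22 = r^2, g33 = r^2 sin^2(theta)."""
    return [[1.0, 0.0, 0.0],
            [0.0, r ** 2, 0.0],
            [0.0, 0.0, (r * math.sin(theta)) ** 2]]


# Cartesian coordinates: Pythagoras in three dimensions, 3^2 + 4^2 + 12^2 = 13^2.
ds2_cartesian = line_element(flat_cartesian_metric(3), (3.0, 4.0, 12.0))

# Spherical polars: a small step in theta alone at radius r = 2 has length r * dtheta.
ds2_polar = line_element(spherical_polar_metric(2.0, 1.0), (0.0, 0.01, 0.0))
```

Only the list of components changes from one space or coordinate system to another; the summation itself is always the same.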
Exercises
5.7 Flat two-dimensional space is given in terms of plane polar coordinates $(r, \theta)$.
What form will the metric components take in this case?
5.8 In the case of a general three-dimensional space, verify that when written out
in full detail, expression (5.5a) becomes
$$ds^2 = g_{11} (dx^1)^2 + g_{12}\,dx^1\,dx^2 + g_{13}\,dx^1\,dx^3 + g_{21}\,dx^2\,dx^1 + g_{22} (dx^2)^2 + g_{23}\,dx^2\,dx^3 + g_{31}\,dx^3\,dx^1 + g_{32}\,dx^3\,dx^2 + g_{33} (dx^3)^2.$$
$$[g_{ij}] = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (5.6b)$$
In a curved space-time one can find a coordinate system in which the metric form
is (5.6a) at any specified point P, but there is no coordinate system giving this
form everywhere. In flat space-time this form will apply only if special coordi-
nates are used; but the general form (5.5) will apply in all cases (see e.g. (4.29)).
The metric also gives a convenient way of writing the scalar product (see
(4.31)). In a general space, the scalar product of vectors q1, q2 is
It can easily be seen that this reduces to (4.31) when the metric takes the form
(5.6b) and the rd's are chosen as in (4.31).
*We are here using the same units for spatial distances (measured by light travel times) as for time
measurements, i.e. we are using units such that the speed of light c is 1.
Once the metric form is given, then just as in flat space-time, it determines all
time measurements by ideal clocks in the space-time (moving on time-like curves,
for which $ds^2 < 0$) through eqn (4.25a), and the motion of light at each point
(paths on which $ds^2 = 0$). Thus it determines the light rays at each point and the
past and future null cones of each event (which are generated by these light rays),
and so the nature of causality. As a simple example, consider the universe model
with metric form given in terms of suitable coordinates:
$$ds^2 = -dt^2 + t^{4/3} (dx^2 + dy^2 + dz^2) \qquad (5.7a)$$
(that is, $g_{00} = -1$, $g_{11} = g_{22} = g_{33} = t^{4/3}$, $g_{ij} = 0$ otherwise). One immediately sees
that along each world-line {x = const, y = const, z = const}, the identities
dx = 0 = dy = dz, and so $ds^2 = -dt^2$, hold; therefore by (4.25a) the coordinate t
measures proper time along those world-lines, which are the fundamental
world-lines in this universe. However, along a curve {t = const, y = const, z = const}
we have $ds^2 = t^{4/3} dx^2$, so proper distance along that curve is measured by $t^{2/3} x$
rather than x, which (as we will see in detail in Chapter 7) implies that this is an
expanding universe. The null cone is determined by the condition $ds^2 = 0$; from
(5.7a) this shows that a displacement $(dx^a) = (dt, dx, dy, dz)$ along the null cone
must obey
$$dt^2 = t^{4/3} (dx^2 + dy^2 + dz^2) \qquad (5.7b)$$
To see the implications, consider the null cones projected into a surface
{y = const, z = const}, i.e. set dy = dz = 0 in (5.7b), to obtain
$$dt = \pm t^{2/3}\,dx \qquad (5.7c)$$
This shows that, for small values of the coordinate t, a given displacement dx
results in a very small displacement dt; at larger values of t, the same displacement
dx results in a larger displacement dt (Fig. 5.16). Thus in terms of these coordinates,
the light cone `flattens out' as one approaches the surface t = 0 (for the
same increment in t, the required increment in x in order to fulfil (5.7c) gets larger
and larger as t decreases). It does so in a way independent of the value of x (since
the coordinate x does not appear explicitly in (5.7a,c)). We shall examine this and
related models in detail in Chapter 7.
Fig. 5.16 The light cones for the interval (5.7a), given by (5.7c). For small values of the
coordinate t, the cones are flattened out.
Fig. 5.17 [Figure labels: axis of symmetry, z.]
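The flattening of the light cones near t = 0 can be made concrete with a short calculation. This is a sketch under a stated assumption: the metric is taken to be $ds^2 = -dt^2 + t^{2p}(dx^2 + dy^2 + dz^2)$ with $p = 2/3$, which is our reading of the exponent in the example above. The function names and the sample times are hypothetical, chosen only for illustration.

```python
import math

P = 2.0 / 3.0  # assumed exponent: spatial metric components equal t**(2 * P)


def null_cone_slope(t, p=P):
    """dt/dx along a light ray with dy = dz = 0, from -dt^2 + t^(2p) dx^2 = 0."""
    return t ** p


def coordinate_distance(t_start, t_end, p=P):
    """Coordinate distance x covered by a light ray between two times (needs p != 1).

    Obtained by integrating dx = t^(-p) dt from t_start to t_end.
    """
    return (t_end ** (1.0 - p) - t_start ** (1.0 - p)) / (1.0 - p)


# A given step dx costs very little coordinate time dt near t = 0 (a flat cone)
# and more at later times (a steeper cone).
slopes = {t: null_cone_slope(t) for t in (0.001, 1.0, 8.0)}

# Even so, light leaving t = 0 covers only a finite coordinate distance by t = 1.
reach = coordinate_distance(0.0, 1.0)
```

With this choice of p the slope dt/dx grows from 0.01 at t = 0.001 to 4 at t = 8, and the same values are obtained at every x, since x does not appear in the metric form.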
Exercise
5.9 Consider flat space-time (which has spatial symmetry about any chosen axis).
Take cylindrical polar coordinates in which z measures distance parallel to the axis, r
measures distance from the axis, and $\phi$ is an angle describing rotation about this axis
(see Fig. 5.17). Write down the metric form $ds^2$ and metric tensor in these coordinates.
the equations is the symmetric Einstein tensor $G^{ij} = G^{ji}$ which is built from
second partial derivatives of the components of the metric tensor with respect to
the various coordinates. The tensor describes the geometry of space-time and is
the most general such object which satisfies certain important requirements, such
as transforming correctly and being zero when the space-time has no curvature.
On the other side of the equation is the symmetric stress-energy tensor $T^{ij} = T^{ji}$
(see Appendix C, Section C5). The components of this object describe the matter
and energy which cause the space-time curvature, combining in one the energy
density, momentum density, and isotropic and anisotropic pressures. Then
Einstein's equations take the simple form
$$G^{ij} = \kappa T^{ij} \qquad (5.8)$$
where $\kappa$ is the gravitational constant, equal to $8\pi G/c^2$.* This equation states that
matter (represented by the stress-energy tensor on the right) causes space-time
curvature (represented by the Einstein tensor on the left). The space-time cur-
vature in turn determines how the matter moves, and this is how we experience
gravitational effects. The equations of motion of the matter are embodied in the
conservation law satisfied by the stress-energy tensor (see the discussion of these
laws in the flat space-time case, in Appendix C). We can choose coordinates so
that at a particular point this law is
$$\sum_j \frac{\partial T^{ij}}{\partial x^j} = 0 \qquad (5.9)$$
(When written in a form valid in general coordinates, the partial derivatives have
to be replaced by a 'covariant derivative' which involves extra terms and gives the
correct tensor transformation properties.) This is just the statement of energy-
momentum conservation. By (5.8) this law means that a similar property must
hold for $G^{ij}$, and this is one of the requirements determining the form of these
equations.
* Several years after Einstein first formulated his equations, he inserted an extra term, adding $\Lambda g^{ij}$
to $G^{ij}$, where $\Lambda$ is the so-called cosmological constant. This was to allow the possibility of a static
unchanging universe as a particular solution. However, when the expansion of the universe was
discovered in 1929, he changed his mind and set $\Lambda = 0$. Many considerations, including the validity
of the Newtonian limit, constrain $\Lambda$ to be extremely small, and it is usually taken to be strictly zero,
except in cosmological applications (see Chapter 7), where it may indeed be important.
Using the symmetry of $G^{ij}$ and $T^{ij}$ in their indices, and recalling that each
index can take four different values in four-dimensional space-time, we might
be led to conclude that there are ten independent coupled equations for the ten
independent components of the metric tensor. However, four of the degrees
of freedom of the metric tensor correspond to the freedom to choose what
coordinate system to use in a particular problem; we require this freedom
because we know that the physical reality studied must be independent of the
coordinates used to describe it. Thus, given a suitable coordinate choice, only
six metric tensor components have to be determined by the field equations, the
remaining four components being fixed by our choice of gauge, the technical
term for coordinate choice. On the other hand, it turns out that only six of the
Einstein equations are independent because there are four relations between
them, the Bianchi identities. These are precisely the derivative conditions
$\sum_j \partial G^{ij}/\partial x^j = 0$ on $G^{ij}$ mentioned above. Hence we may solve four of Einstein's
equations (the initial value equations, or constraints) for the unknown metric tensor
components on a space-like initial surface $\Sigma$, and six of the equations (the propagation
equations) in a suitable open set U in space-time containing $\Sigma$; it then turns out
that, because of the Bianchi identities, the constraint equations will be true in all
of U (and not just on $\Sigma$), so we do not have to solve these four equations
throughout U. Additionally, if we choose coordinates cleverly in particular cases,
we may be able to do so in such a way as to guarantee that some of the field
equations are identically satisfied. Thus despite the great complexity of these
equations, many solutions are known.
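The counting in this argument can be spelled out explicitly. The following sketch simply enumerates the index pairs of a symmetric tensor in four dimensions and subtracts the four coordinate freedoms and the four Bianchi identities; the variable names are ours and carry no special meaning.

```python
def independent_pairs(n):
    """Index pairs (i, j) with i <= j: the independent components of a symmetric tensor."""
    return [(i, j) for i in range(n) for j in range(i, n)]


DIMENSION = 4  # space-time has four coordinates

# A symmetric 4 x 4 array has 16 entries but only 10 independent ones.
metric_components = len(independent_pairs(DIMENSION))

# One free function per coordinate: the choice of coordinate system (gauge).
coordinate_freedom = DIMENSION

# One differential identity per value of the free index i: the Bianchi identities.
bianchi_identities = DIMENSION

components_to_determine = metric_components - coordinate_freedom
independent_equations = metric_components - bianchi_identities
```

Both subtractions give six, so the number of independent equations matches the number of metric components that remain to be determined.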
Einstein's equations embody the physics of gravitation. It is of course
important to show that in the slow-motion, weak-field limit, we regain from them
the results of Newtonian gravitational theory to a high degree of accuracy,
because that theory gives a very good description of the behaviour of matter in the
solar system. It is far from obvious that this is true, because the Einstein and
Newtonian gravitational equations are so dissimilar from each other. However,
amazingly, this can be demonstrated, provided we employ suitable coordinates;
and this requirement fixes the constant of proportionality $\kappa$ between $G^{ij}$ and $T^{ij}$
in (5.8). However, the predictions of Newtonian theory are not completely accurate,
and where there is a disagreement, Einstein's theory gives the better prediction.
In fact it has stood the test of all experiments so far conducted to examine
its accuracy (see Sections 5.6 and 5.9). Einstein's theory disagrees dramatically
with Newtonian theory in the case of strong fields. As we shall see, according to
Einstein's theory extremely dense matter can cause space-time to `curl up' on
itself, resulting in a 'black hole' (Chapter 6); there is now evidence that solar-mass
black holes exist in the outer regions of our galaxy, and that much more massive
black holes may exist at the centres of galaxies. In general, the curvature of space-
time manifests itself in the bending of light rays and similar gravitational effects,
resulting for example in a redshift that has a gravitational rather than a Doppler
origin being detected in observations of massive stars.
Exercises
5.10 What symmetries would you expect in the metric form describing the space-time
around a static, spherically symmetric star? From general arguments, write down the most
general metric form that might represent this space-time, provided coordinates are chosen
adapted to these symmetries.
5.11 What other physical situation might the interval of Exercise 5.10 represent?
Geodesics again
We have already discussed the physical meaning of time-like geodesics and their
importance in describing the effects of gravity. How does that discussion relate to
the mathematical formalism we have now set up?
5.6 Light rays 209
As has been mentioned before, in a curved space one can look for the shortest
distance between two points. This can be found by choosing a path which
minimizes $L = \int (ds^2)^{1/2}$ (cf. (4.26a)), where $ds^2$ is the metric form (5.5). Similarly
in curved space-time, we may find the time-like path that maximizes the value of
$\tau = \int (-ds^2)^{1/2}$ (cf. (4.25a)), where $ds^2$ is the space-time metric form (again given by
(5.5)). This will correspond to the path with the longest proper time between its
end-points, as pointed out in the discussion in Section 4.2.* Any paths that are
either maxima or minima of the space-time distance between their end-points are
geodesics of the space-time (cf. Section 5.1). As we have seen, particles moving
freely (i.e. not subject to any non-gravitational forces) will follow such paths, in
curved space-times.
In introducing the idea of a curved space, we indicated that there is an alter-
native way of defining a geodesic: namely, as a curve whose direction is
unchanging as one moves along it. This idea can be made precise in any curved
space or curved space-time (cf. Section 5.7), and it turns out that the two defi-
nitions are the same: a curve of extreme length is also one that does not deviate
from its initial direction. In a flat space or space-time, the geodesics are simply
straight lines between their initial and final points.
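In flat space-time the statement that the geodesic is the path of longest proper time can be checked directly. The sketch below is illustrative: it assumes two-dimensional flat space-time with $ds^2 = -dt^2 + dx^2$ and units in which c = 1, and the two paths compared are arbitrary examples of ours.

```python
import math


def proper_time(events):
    """Proper time along a piecewise-straight path through events (t, x), with c = 1.

    Each segment contributes sqrt(-ds^2), where ds^2 = -dt^2 + dx^2.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        interval = -dt ** 2 + dx ** 2
        if interval > 0:
            raise ValueError("segment is space-like, so no clock can follow it")
        total += math.sqrt(-interval)
    return total


# Two paths between the same end-points (t, x) = (0, 0) and (10, 0).
straight = proper_time([(0.0, 0.0), (10.0, 0.0)])             # the geodesic
detour = proper_time([(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)])   # out and back at speed 3/5
```

The straight world-line records 10 units of proper time and the detour only 8, so the undeviating path is the one of longest proper time.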
Time-like geodesics (those for which $ds^2 < 0$ at each point) in space-time have a
very clear physical meaning which we have already discussed (Section 5.3). Null
geodesics (those for which $ds^2 = 0$ at each point) also have an important physical
meaning, which we will discuss next.
* In space-time, whether the path is `shortest' or `longest' depends on the sign convention used for
the space-time interval; this convention is arbitrary, and one can quite consistently use the opposite
sign for $ds^2$ to that used here. However, what is independent of this choice is the physical effect:
these are the paths of longest proper time. Here we regard $ds^2$, which is negative on a time-like
path, as minimized, resulting in a maximum value for the elapsed time, given by integrating
$(-ds^2)^{1/2}$.
Bending of light rays
We have seen already that light rays observed by a freely falling observer D far
from any gravitational field should be seen to move in straight lines (for this is just
the flat-space-time situation). Hence, by the principle of equivalence, this should
also be true for an observer C freely falling radially towards the centre of the
Earth (Fig. 5.7c). But the path of this light will appear curved relative to an
observer A at rest relative to the Earth, just as the path of light will appear curved
relative to an observer B in a uniformly accelerating rocket far from any grav-
itational field (Figs 5.7a,b; cf. Fig. 5.8). Hence the principle of equivalence leads
us to believe that (relative to an observer at rest on that body) light rays will be
bent by the gravitational field of a massive body. The classical way of testing this is
by observing the apparent positions of stars during a solar eclipse. The stars are
seen by light rays which just graze the surface of the Sun, and the bending of these
rays produces a distorted image of their positions (see Fig. 5.18). From the
Schwarzschild solution of Einstein's equations (see Section 6.1), which describes
the gravitational field outside a spherically symmetric object like the Sun, the
gravitational deflection of such a light ray can be calculated to be 1.75 seconds of
arc. This prediction was first tested during the total eclipse in 1919 by an expe-
dition led by Eddington, and it was confirmed to within an accuracy of about 10
per cent. This led to the widespread acceptance of the general theory of relativity.
Since then, many similar observations have been made during total eclipses of the
Sun, but the difficulties which seem inherent in such measurements mean that the
accuracy has not improved significantly. However, it has proved possible to test
the Einstein prediction more rigorously by radio interferometer measurements of
the bending of radio waves from quasars (very distant objects that appear very
like stars) being eclipsed by the Sun. In 1976, Fomalont and Sramek performed
such measurements to an accuracy of 1 per cent, giving excellent agreement with
the predictions of general relativity.
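The quoted figure of 1.75 seconds of arc can be reproduced from the standard weak-field expression for a ray grazing a spherical body, $\delta = 4GM/(c^2 R)$. This formula is assumed here rather than derived, since the Schwarzschild calculation is not given in this section. The value of GM for the Sun is likewise an assumed reference value, and the solar radius is the 696 000 km quoted in Exercise 5.12.

```python
import math

GM_SUN = 1.327e20          # m^3 s^-2 (assumed value for the Sun)
SPEED_OF_LIGHT = 2.998e8   # m s^-1
SOLAR_RADIUS = 6.96e8      # m (696 000 km)

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def deflection_angle(gm, impact_radius):
    """Weak-field deflection, in radians, of a light ray passing at impact_radius."""
    return 4.0 * gm / (SPEED_OF_LIGHT ** 2 * impact_radius)


grazing_deflection_arcsec = deflection_angle(GM_SUN, SOLAR_RADIUS) * ARCSEC_PER_RADIAN
```

Because the deflection falls off inversely with the distance of closest approach, a ray passing at two solar radii is bent by half as much.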
Fig. 5.18 Light rays from a distant star are bent by the gravitational field of the Sun,
producing a distorted image of the star's position.
Exercise
5.12 The focal length of the Sun
Consider parallel light rays projected towards the Sun from infinity. After passing
the Sun, they will intersect within a distance d because of the bending of light by the Sun.
Find d (in light years). [Hint: 1 parsec = 3.26 light years is the distance from which the
radius of the orbit of the Earth (150 million km) subtends an angle of 1 second
of arc. The radius of the Sun is 696 000 km.] How does this distance compare with the
distance to the nearest star?
Gravitational redshifts
We can note similarly that if light is emitted from the floor of a laboratory or
rocket in free fall and received by a detector at the roof, then observer D should
measure no change in frequency of this light. On the other hand, for observer B in
an accelerating rocket, the roof accelerates away from the position of the floor
when the light was emitted; thus, in every time interval as measured by B, the light
has to travel further before reaching the roof, than in the previous time interval
(Fig. 5.19a). Consequently, the accelerating observer B will detect a redshift in the
received light (indeed this was shown by the calculation of observed redshift in a
Rindler universe presented in Section 4.3). The principle of equivalence leads us
to believe that the same will be true for the observer A stationary on the surface of
the Earth (Fig. 5.19b). Thus we have the prediction of gravitational redshift: light
`climbing out' of a stationary gravitational field will be redshifted when received
by a stationary observer (Fig. 5.19c). This has been verified in a number of
different types of experiments. The celestial ones involve observations of distant
massive stars, and a measurement by Brault in 1962 of the redshift of the sodium
$D_1$ line emitted on the surface of the Sun confirmed the general relativistic
prediction to a precision of 5 per cent. The classic terrestrial experiments were
by Pound and Rebka in 1959 and Pound and Snider in 1965; they used the
Mössbauer effect to measure the redshift of photons emitted at the base of a
22.5 m tower at Harvard University and received at the top of that tower (see
Fig. 6.7). The measured redshift agreed to within 1 per cent with that calculated
from Einstein's theory.
Fig. 5.19 (a) In an accelerated rocket containing an observer B, light emitted at successive
intervals from the floor has further and further to travel to the roof. (b) Observation
of light rays by the equivalent observer A in a stationary lift in the Earth's gravitational
field must give the same results as B's observations. (c) Gravitational redshift: the time
interval $\Delta T'$ between reception of signals sent out at interval $\Delta T$ is larger than $\Delta T$
although the reception point w is not moving relative to the emission point u; this is
because of a gravitational field between w and u, causing space-time curvature.
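To see why the Mössbauer effect was needed, it helps to estimate the size of the shift over the height of the tower. The sketch below assumes the usual weak-field approximation $\Delta\nu/\nu \approx gh/c^2$ for a height h in a uniform field g. The approximation and the value of g are assumptions introduced here; only the 22.5 m height comes from the text.

```python
G_SURFACE = 9.81           # m s^-2 (assumed surface gravity)
SPEED_OF_LIGHT = 2.998e8   # m s^-1
TOWER_HEIGHT = 22.5        # m, the tower height quoted in the text


def fractional_redshift(height, g=G_SURFACE):
    """Weak-field estimate of the fractional frequency shift over a climb of `height`."""
    return g * height / SPEED_OF_LIGHT ** 2


tower_shift = fractional_redshift(TOWER_HEIGHT)
```

The result is a few parts in $10^{15}$, which gives a sense of the frequency resolution the terrestrial experiments required.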
Fig. 5.20 (a) In a flat space, the size d of an object viewed with angular width $\alpha$ at a
distance r must be $\alpha r$. (b) In a curved space this relation is not true. If the space has
negative curvature, the apparent size $\alpha r$ will be smaller than the real size d. (c) In a space
of positive curvature, the light rays will be closer together at the object than they would
be in flat space, and the apparent size $\alpha r$ will be larger than the real size d. This is the
`gravitational lensing' effect.
Fig. 5.21 Light rays nearer a massive body will be bent more than those further away,
because the gravitational field is stronger nearer the body. Consequently, images will be
distorted when light moves near a massive object.
than would be the case in flat space-time. A further effect is that in general the
light conveying images of distant objects will be differentially bent, since the light
nearer a massive object will be bent more than the light further from the object,
because the gravitational field is stronger near the object (Fig. 5.21). Thus, dis-
tortion will occur in the image; for example, a spherical object will appear
elliptical, so in general the gravitational lensing is imperfect and distorts the
appearance of the object observed.
From the space-time viewpoint, it is clear that what we are discussing is
nothing other than the `geodesic deviation' effect discussed above (Section 5.3),
but now considered in the case of light rays. Because of the tidal effects of the
gravitational fields of distant objects, initially parallel light rays will tend to
intersect each other, and light rays diverging from a point will tend to be focused.
As in the case of particle world-lines, the relative separation of neighbouring light
rays can be used to detect space-time curvature, and to measure its strength. In
the space-time context, Euclid's axiom that parallel straight lines never meet is
replaced by an equation (the equation of geodesic deviation) determining how the
distance between neighbouring geodesics varies as a result of space-time cur-
vature. In the case of light rays, these effects are directly observable by measuring
apparent angular diameters of distant objects.
Gravitational lensing
In extreme cases, the focusing effect resulting from the presence of massive
objects or diffuse matter can cause bending sufficient to produce refocusing of the
light rays. Then they no longer recede from each other as one goes to greater
distances, but rather approach each other. Consequently, beyond a certain
distance where the light rays start refocusing, the size of an object subtending a
constant angular size $\alpha$ at the observer now decreases with distance from the
observer (Fig. 5.22a), so if one were to move a rigid object further away
(Fig. 5.22b) its apparent size would increase with distance from the observer
(instead of decreasing, as one would normally expect). This can occur locally, or
over the whole past light cone.
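A simple two-dimensional analogue shows the same behaviour. On the surface of a sphere of radius a, geodesics leaving an observer at a small angle $\alpha$ to each other are separated by $\alpha a \sin(r/a)$ after a distance r, instead of the flat-space value $\alpha r$. The sketch below uses this analogue only; it is not a space-time calculation, and the function names and numbers are ours.

```python
import math


def transverse_size(alpha, r, a):
    """Size spanned at distance r by geodesics diverging at small angle alpha on a sphere of radius a."""
    return alpha * a * math.sin(r / a)


def apparent_angle(d, r, a):
    """Small angle subtended at the observer by an object of size d at distance r."""
    return d / (a * math.sin(r / a))


A = 1.0        # radius of the sphere
ALPHA = 0.01   # fixed angular size at the observer

# Sizes subtending the same angle: increasing up to r = pi * a / 2, then decreasing.
sizes = [transverse_size(ALPHA, r, A) for r in (0.5, math.pi / 2, 2.5)]

# A rigid object moved further away beyond that distance looks larger, not smaller.
angle_near = apparent_angle(0.001, 2.0, A)
angle_far = apparent_angle(0.001, 2.5, A)
```

The turning point at $r = \pi a/2$ plays the role of the distance at which refocusing begins.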
Fig. 5.22 (a) The refocusing of light rays in a gravitational field. The size of objects
subtending the same angle at an observer increases with distance first and then decreases
with distance. (b) An object of size d beyond the point of refocusing subtends a greater
angle at the observer as it moves further away ($\alpha' > \alpha$).
Fig. 5.23 A massive object refocuses light from a more distant source, producing
multiple images $I_1$ and $I_2$ of the source.
Local lensing An example of the occurrence of local refocusing is when, in a
cosmological model, a massive object refocuses light rays from more distant
objects so causing multiple images (Fig. 5.23). This has now been observed in
several cases where light from very distant quasi-stellar objects is focused by an
intervening galaxy.* Figure 5.24 shows such a case; the two quasi-stellar images
0957 + 561 have been identified by their spectra as coming from the same quasi-
stellar object; the galaxy causing the focusing is very faint, and can only be
detected by special processing of the image (Fig. 5.25). This is a dramatic
demonstration of the effect of intervening space-time curvature on light rays. In
this example, the effect is local: light passing near the focusing galaxy is refocused,
but light that does not go near it will be unaffected. Thus, this effect will only
occur in comparatively few directions in the sky, for light rays that pass suffi-
ciently near very massive galaxies or other objects.
* See `The discovery of gravitational lenses' by F. H. Chaffee, Scientific American, November 1980.
Figs 5.24 and 5.25 Gravitational lensing by an intervening galaxy creates two images of a
single quasi-stellar object (QSO 0957 + 561). Fig. 5.24 shows the two QSO images, identified as
coming from a single very distant object because of the similarity of their spectra. In Fig.
5.25 one of the QSO images has been digitally removed, revealing the fainter image of the
lensing galaxy (which is nearer but does not radiate as energetically as the QSO). These
photographs thus reveal directly the bending of light caused by the gravitational field of
the galaxy, and so demonstrate space-time curvature. (These images were made by Alan
Stockton at the Institute of Astronomy, University of Hawaii.)
Large-scale refocusing The second kind of refocusing implies that the light cone
as a whole is bent back in on itself. In flat space-time, the area of a wave front
necessarily increases with distance from the observer (after having gone a distance
r = ct in a time t, the light from a source is spread out over an area $4\pi r^2$,
cf. Fig. 4.29b). In a curved space-time, this will not be true; in general, the total
area of a wave front will decrease with distance instead of increasing (Fig. 5.26a),
because neighbouring light rays are focused towards each other (as in Fig. 5.22).
Correspondingly, going back down our past light cone, the light cone as a whole
will reach a maximum distance from our past world-line C and then start refo-
cusing towards that world-line (Fig. 5.26b). Examination of expanding universe
models confirms that this is indeed the kind of behaviour we expect for our own
past light cone in the real universe, because there is sufficient matter and radiation
216 Curved space-times
[Fig. 5.26, panel (a) labels: light cones tilt in; light area < $4\pi d^2$; $C^-(P)$: past light cone.]
Fig. 5.26 Refocusing of light where the light cone as a whole is refocused by the
curvature of space-time caused by the gravitational field of uniformly distributed
matter or radiation. (a) Light spreading out spherically from a source s at a distance d
has area less than $4\pi d^2$, and eventually focuses to zero. In this situation, as seen by the
observer, the light originates at a distant region, spreads to a maximum, and then
focuses to the observer. (b) In a space-time view, this implies that the light cone of the
observer reaches a surface S of maximum area and then bends back on itself as we
follow it back into the past (the local light cones tip over, remaining tangent to the light
cone of P; cf. Fig. 4.17(b) in the flat-space case). The surface of refocusing S where the
geodesics are a maximum distance apart is seen by an observer at P as a surface of
minimum angular diameters. Going further back into the past, the area of the light front
decreases as the light rays approach each other.
uniformly spread out through the universe to cause this overall refocusing. That
means that we expect the kind of refocusing behaviour shown in Fig. 5.22 to occur
down every light ray, as we follow it back sufficiently far into the past.
Locally, the light cone at each point still represents the speed of light, thus the
local light cones (in a coordinate system in which the coordinates directly
represent lengths and times) cannot be parallel to each other in such a space-time,
and are shown tilted over appropriately in Fig. 5.26b. We believe that the density
of matter in the universe is sufficient to cause this kind of refocusing, and so cause
`anomalous' angular diameters and luminosities in images of distant objects, at a
redshift of somewhere between 1 and 5 (Fig. 5.27). However, this has not yet been
verified observationally.
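As a rough numerical illustration of these `anomalous' angular diameters, the following sketch evaluates the Einstein-de Sitter relation quoted with Fig. 5.27, $\alpha \propto (1 + z)^2 \{(1 + z) - (1 + z)^{1/2}\}^{-1}$, and searches for the redshift at which the apparent size is smallest. The function names, the search range, and the grid spacing are our own choices for illustration; the real universe need not be Einstein-de Sitter.

```python
def angular_diameter(z, constant=1.0):
    """Apparent angular diameter (arbitrary units) of a rigid object at
    redshift z in an Einstein-de Sitter universe."""
    one_plus_z = 1.0 + z
    return constant * one_plus_z**2 / (one_plus_z - one_plus_z**0.5)


def redshift_of_minimum(z_lo=0.1, z_hi=5.0, steps=4900):
    """Coarse grid search for the redshift of minimum angular diameter."""
    grid = [z_lo + (z_hi - z_lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=angular_diameter)


z_min = redshift_of_minimum()
print(f"angular diameter is smallest near z = {z_min:.2f}")
```

Beyond this redshift the object appears larger again as it is moved further away, which is the refocusing effect described above.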
Fig. 5.27 The apparent angular diameter of a rigid object as it is moved to further and
further distances in an Einstein-de Sitter universe is given in terms of the observed
redshift of the object by the relation
$\alpha = (\text{constant})\,(1 + z)^2 \{(1 + z) - (1 + z)^{1/2}\}^{-1}$
[Plot of $\alpha$ against $z$, with $z = 1.25$ marked on the redshift axis.]
Exercises
5.13 Suppose a black box is dropped from an aircraft and falls freely towards the Earth
from a height of 10 km above the Earth's surface. Initially two marbles in the box are at rest
a distance of 10 cm apart horizontally. How far apart will they be when the box hits the
Earth? [The Earth's radius is approximately 6000 km. You may neglect the gravitational
attraction between the marbles.]
This device measures gravitational tidal forces by their geodesic deviation effect.
Indicate how one might in principle construct another measuring device basically using the
same idea but this time applied to light rays. Would this be useful in practice?
5.14 Consider a region of space-time far from any gravitating masses. How would you
test whether it is flat or curved?
5.15 If light rays are bent in gravitational fields, can we still use light as a basis for the
measurement of time and distance in curved space-times?
5.16 Draw diagrams to show how two images of the same object may be seen when the
light rays are bent by (a) layers of air at different temperatures in a desert, (b) the grav-
itational field of a very dense body.
5.17 When refocusing of light rays occurs, the apparent flux of radiation measured
from a distant object will differ from that measured in flat space-time. Consider how the
argument leading to eqn (4.35) should be adapted to this situation. [Denote the area of the
outgoing light front by A.]
5.7 Causality
The large-scale refocusing of light rays shows that the local behaviour of light
cones can be very different in a curved space than in flat space-time. This in turn
implies that causal properties can be quite different. One particular feature that
can occur is the existence of various types of horizon in a curved space-time, that
is, surfaces that limit predictability in various ways. The simplest such surface is
our past light cone, limiting the regions we can have had causal contact with
(cf. the discussion in Chapter 1). In the following chapters we will discuss
carefully the concepts of an event horizon around a black hole, the basic concept
having already been introduced during our discussion of the (flat-space) Rindler
universe model, and of a particle horizon in cosmology. A further possibility in
curved space-times is the violation of our normal ideas of causality, which we
discuss briefly in this section.
To see how this can occur, we note that the local light cones can tilt over relative
to each other; indeed we may expect that this will happen in a rotating system (the
light rays get dragged along by the rotation). However, as before, the speed of
light (locally determined by the light cone) is still a limiting speed, so the light
cones and associated paths of light rays still determine what parts of space-time
can be influenced by any particular event. If the rotation is large enough, the light
cones may tip over until they appear horizontal in a given coordinate system; an
example of a space-time where this occurs is Gödel's stationary, rotating universe,
where the light cones tip right over if one goes far enough out from any
observer (Fig. 5.28). Then causal violations will be theoretically possible in this
space-time, because closed time-like lines can exist. Thus in principle an old man
can stand next to, and converse with, a young man who is himself (i.e. the same
person) at an earlier stage in his life history! (Fig. 5.29). It is in principle possible
for an observer on any galaxy in this space-time to travel from any event in the
galaxy's history to any previous event in its history, by accelerating far enough
away from its world-line and then back. There is no evidence that this can occur in
the real universe, but on the other hand this possibility (which raises various
causal paradoxes) has not been disproved observationally or experimentally. We
do not claim it is likely that the real universe is like this, but merely point out that
curved space-time models exist where this is a theoretical possibility.
Fig. 5.28 Gödel's stationary universe. On the axis the light cones are vertical but away
from the axis the rotation causes them to tilt over. This tilting increases with distance
from the axis so that eventually they are horizontal and there are then closed time-like
lines (the curves drawn are everywhere pointing in the future-directed time-like direction
of the local light-cones).
5.8 Parallel propagation along a curve 219
Fig. 5.29 In a universe with closed time-like lines, world-lines can come back to
themselves so it would be possible for an old man to stand next to himself as a young man!
Exercise
5.18 In a certain region the space-time interval $ds^2$ is given by
where a is a constant. Find the equation of the null cone at radius r for $\theta$ and $\phi$ constant.
At what value of r would you expect there to be a horizon?
along $\gamma$. A geodesic is then a curve whose direction is parallel transported along
itself, i.e. whose direction is unchanging.
Parallel transport along a curve allows us to compare vectors at distant points
in a curved space; however, there is no well-defined concept of `parallel' at distant
points (e.g. at London and New York) in an absolute sense because the result
depends on the path taken between these two points. For example, consider a
sphere (Fig. 5.31) and motion along the curve $\gamma$ along the great circle from P (on
the equator) to Q (the north pole), e.g. by steering a ship straight ahead all the
way. Let $x_0$ at P point along the equator to the right. Then at each point of $\gamma$ the
parallel transported vector x will remain at right angles to the direction of $\gamma$, and
so will define the vector $x_\gamma$ at Q. Now consider motion from P to Q along the
segment $\lambda'$ of the equator from P to the point R a quarter around the equator,
and then up the great circle $\lambda''$ from R to Q, these two segments together defined
as the curve $\lambda$ from P to Q. Parallel transporting x along $\lambda'$, it always points along
the direction of $\lambda'$; when the new path turns 90° left at R, the vector x will initially
lie at right angles to the direction of motion and this will remain true until Q is
reached, defining the vector $x_\lambda$ at Q. This is at right angles to the vector $x_\gamma$ there.
Thus parallel transporting a vector from P to Q along two different paths $\gamma$ and $\lambda$
in general gives a different result at Q; mathematically, we say that parallel
transport is not integrable. It is then clear that parallel transporting $x_\gamma$ round $\gamma$
from Q to P and then round $\lambda$ back from P to Q, will result in a vector parallel to
$x_\lambda$; thus parallel transport round a closed loop results in rotation of the vector.
The amount of this rotation is a measure of the amount of curvature enclosed by
the loop; in a flat space with its normal topology, the rotation will be zero.
Fig. 5.30 The parallel transport of a direction along the path of an aircraft. Initially
the direction is along the axis of the aircraft, but after its path turns through 30° to the
right, the direction is 30° to the left of the aircraft's axis.
Fig. 5.31 The parallel transport of a direction x on the surface of a sphere: when
transported from P to Q along the path $\gamma$, it defines the vector $x_\gamma$ at Q; when
transported along the path $\lambda$ via R, it defines the vector $x_\lambda$ at Q. The vectors
$x_\gamma$ and $x_\lambda$ are not parallel to each other!
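The sphere example can be checked numerically. The sketch below is our own illustration, not a construction from the text: it places P, R, and Q on the unit sphere and uses the fact that parallel transport along a great-circle arc is the rigid rotation carrying the start of the arc to its end. The coordinates, the choice of which way `along the equator' points, and the function names are all assumptions made for the example.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transport_along_great_circle(v, p, q):
    """Parallel transport the tangent vector v from p to q along the shorter
    great-circle arc, using Rodrigues' rotation formula about the axis p x q."""
    n = cross(p, q)
    norm = math.sqrt(dot(n, n))
    k = tuple(c / norm for c in n)           # unit rotation axis
    theta = math.atan2(norm, dot(p, q))      # arc length on the unit sphere
    k_cross_v = cross(k, v)
    k_dot_v = dot(k, v)
    return tuple(v[i] * math.cos(theta)
                 + k_cross_v[i] * math.sin(theta)
                 + k[i] * k_dot_v * (1.0 - math.cos(theta))
                 for i in range(3))


# P and R on the equator a quarter-circle apart, Q at the north pole.
P, R, Q = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
x0 = (0.0, 1.0, 0.0)  # tangent vector at P pointing along the equator

x_gamma = transport_along_great_circle(x0, P, Q)               # direct path
x_lambda = transport_along_great_circle(
    transport_along_great_circle(x0, P, R), R, Q)              # path via R

cosine = max(-1.0, min(1.0, dot(x_gamma, x_lambda)))
angle_deg = math.degrees(math.acos(cosine))
print(f"angle between the two transported vectors: {angle_deg:.1f} degrees")
```

The two results differ by a right angle, which is also the area (solid angle $\pi/2$) of the octant of the unit sphere enclosed by the loop.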
The idea of parallel transport can be extended to space-time. Parallel transport
of a space-like vector x along a time-like geodesic is understood to represent the
physical situation of using a perfect gyroscope (or equivalent mechanical device,
such as a Foucault pendulum) that keeps pointing in the same direction and so
tells one what direction at a later time in one's history is parallel with a direction at
an earlier time (Fig. 5.32a,b). This is the basis of the non-rotating reference frame
that underlies the usual studies of mechanics and is realized, for example, in the
inertial guidance systems of ships, aircraft, and spacecraft. As particles in free fall
and light rays move on space-time geodesics, the time-like directions of their
world-lines are parallel propagated along them.
[Fig. 5.32: panels (a) and (b); labels show the vector x and the times $t = t_0$ and $t = t_1$.]
Exercise
5.19 Consider a circle drawn on the surface of a cone, at a constant distance from the
vertex. Take a direction in the surface at right angles to the circle, and perform parallel
transport on it round the circle. By what angle will its direction change in one circuit? What
do you conclude about the curvature of the surface? [You may find it useful to `flatten out'
the cone onto a plane, as discussed previously, to see clearly what is happening.]
Fig. 5.34 (a) A particle thrown from the Moon's surface at event A and landing again
at event B, after falling freely (and therefore travelling on a geodesic in space-time).
(b) A space-time diagram of this situation. The initial direction v of particle motion is
parallel transported along the world-line $\gamma$ of the observer from A to B, defining a
vector $v_\gamma$ at B. However, after parallel transport along the geodesic path $\lambda$ of the particle
from A to B, it defines the direction $v_\lambda$ at B. The vectors $v_\gamma$ and $v_\lambda$ are not parallel to
each other ($v_\gamma$ is in the +z-direction, but $v_\lambda$ is in the -z-direction). This corresponds to
the fact that in (a), when the particle leaves the observer its motion is upwards but when
it returns its motion is downwards.
other experimental tests, leaving until Section 5.11 the important topic of the
detection of gravitational waves. For a fuller discussion of experimental tests of
general relativity, the reader is referred to Clifford Will's book Was Einstein
Right? (second edition: Basic Books, New York, 1993).
Perihelion shifts
According to Newtonian theory, a planet moving in the gravitational field of the
Sun and sufficiently far removed from the gravitational effects of other bodies,
would describe a closed elliptical orbit. However, it has been known for a long
time that motion in the solar system does not fit this idealized picture. The planet
subject to the most intense scrutiny has been Mercury; being the nearest planet to
the Sun the gravitational effects on its motion are easiest to measure. It turns
out that its orbit is not closed, but like an ellipse with axes which rotate by a tiny
amount each time the planet goes round. The way to make this idea precise is
to consider the perihelion, which is the position of closest approach to the
attracting body, the Sun in this case. The line joining the planet to the Sun at this
point is observed and is found to precess; it rotates through a very small angle
each time.
A very large part of the rotation of Mercury's perihelion is a result of classical
Newtonian effects, in particular the perturbation of the orbit due to other
planets. This accounts for 5557 seconds of arc per century. Very accurate
observations and calculations left a tiny rotation of 43 seconds of arc per century
which could not be explained this way. This presented a major challenge for
Einstein's new theory. Using the Schwarzschild solution to describe the spheri-
cally symmetric gravitational field of the Sun, Einstein was able to determine the
orbit of Mercury according to general relativity, and amazingly the prediction
gave a rotation of 43 seconds, in excellent agreement with observation. This was
the first experimental test of the theory and provided very compelling evidence in
support of it. The test is particularly compelling because the theory was not
designed specifically to meet this challenge-it just turned out that it did so, after
the theory had been fashioned on the basis of fundamental considerations by
Einstein on the nature of space, time, and gravitation.
More recently, the general relativistic prediction of perihelion precession
has been confirmed by observations of the binary pulsar discovered by Hulse
and Taylor in 1974. This system, which will be discussed in more detail in
Section 5.11, consists of two very compact stars in a very tight orbit around each
other. The perihelion precession is orders of magnitude larger than that of
Mercury; the prediction of about 4 degrees per year agrees closely with the
measured value.
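The two figures quoted above can be reproduced, approximately, from the standard general-relativistic result that an orbit of semi-major axis a and eccentricity e about total mass M advances by $6\pi GM / \{c^2 a (1 - e^2)\}$ radians per revolution. This formula is not derived in the text, and the orbital elements below (Mercury's period and eccentricity; the pulsar's period, eccentricity, and an assumed total mass of 2.828 solar masses) are values supplied by us for illustration, so the sketch should be read as a plausibility check rather than a precise calculation.

```python
import math

C = 299_792_458.0            # speed of light, m/s
GM_SUN = 1.32712440018e20    # G times the solar mass, m^3/s^2
DAY = 86_400.0               # seconds
YEAR = 365.25 * DAY          # seconds


def advance_per_orbit(gm_total, period_s, eccentricity):
    """Relativistic advance of the point of closest approach, radians per orbit.
    The semi-major axis is obtained from Kepler's third law."""
    a = (gm_total * period_s**2 / (4.0 * math.pi**2)) ** (1.0 / 3.0)
    return 6.0 * math.pi * gm_total / (C**2 * a * (1.0 - eccentricity**2))


def advance_rate(gm_total, period_s, eccentricity):
    """The same advance expressed in radians per second."""
    return advance_per_orbit(gm_total, period_s, eccentricity) / period_s


# Mercury (assumed elements: period 87.969 days, eccentricity 0.2056).
mercury_arcsec_per_century = math.degrees(
    advance_rate(GM_SUN, 87.969 * DAY, 0.2056) * 100.0 * YEAR) * 3600.0

# Binary pulsar (assumed: period 27906.98 s, eccentricity 0.6171,
# total mass 2.828 solar masses).
pulsar_deg_per_year = math.degrees(
    advance_rate(2.828 * GM_SUN, 27906.98, 0.6171) * YEAR)

print(f"Mercury: {mercury_arcsec_per_century:.1f} arcsec per century")
print(f"Binary pulsar: {pulsar_deg_per_year:.2f} degrees per year")
```

In practice the argument for the pulsar runs the other way: the measured advance is one of the observations used to determine the masses, so agreement here is partly built in.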
Radar time-delay
A way of investigating the curvature of space produced by the Sun, say, is to
measure the delay in the travel-time of a radar beam passing near it, as compared
with the travel time if the space were flat. Early experiments were performed by
sending a radar beam from Earth and measuring the round trip time after it was
returned by a reflector on the surface of Venus or Mercury or onboard a Mariner
spacecraft. As the path of the radar beam moved nearer to the Sun as the relative
positions of the Earth and the reflector changed, the travel time varied (see
Fig. 5.35). In a more recent experiment by Shapiro in 1976, the radar travelled to
Mars and was sent back by reflectors both on the surface of the planet and in a
spacecraft in orbit around it. The round trip time for signals passing near the Sun
was measured and found to agree well with the values calculated from general
relativity. Because of the curvature of space, the distance was found to be larger
by about 37 km out of a total distance of 378 million km from Earth to Mars. The
radar travel time was about 42 minutes for the round trip.
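The orders of magnitude quoted here can be recovered from the leading (logarithmic) term of the standard expression for the excess round-trip delay, $\Delta t \approx (4GM/c^3) \ln(4 r_E r_T / b^2)$, where $r_E$ and $r_T$ are the distances of the Earth and the target from the Sun and b is the closest approach of the beam to the Sun's centre. This formula is not given in the text, and the sketch assumes a beam just grazing the Sun with Mars almost directly behind it; the names and the solar radius are our own inputs.

```python
import math

C = 299_792_458.0            # speed of light, m/s
GM_SUN = 1.32712440018e20    # G times the solar mass, m^3/s^2
R_SUN = 6.957e8              # solar radius, m (assumed closest approach)
AU = 1.495978707e11          # Earth-Sun distance, m


def round_trip_excess_delay(r_earth, r_target, closest_approach):
    """Leading-order excess round-trip time (seconds) for a radar beam
    passing the Sun at the given closest-approach distance."""
    return (4.0 * GM_SUN / C**3) * math.log(
        4.0 * r_earth * r_target / closest_approach**2)


earth_mars = 378e9                 # m, the Earth-Mars distance quoted in the text
r_mars = earth_mars - AU           # assumes Mars nearly in line behind the Sun

delay = round_trip_excess_delay(AU, r_mars, R_SUN)
extra_one_way_km = C * delay / 2.0 / 1000.0
round_trip_minutes = 2.0 * earth_mars / C / 60.0

print(f"excess round-trip delay: {delay * 1e6:.0f} microseconds")
print(f"equivalent extra one-way distance: {extra_one_way_km:.0f} km")
print(f"flat-space round-trip time: {round_trip_minutes:.0f} minutes")
```

The excess is a few hundred microseconds in a round trip of about 42 minutes, which indicates the timing precision the experiment required.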
Fig. 5.35 The gravitational field of the Sun produces curvature in space, which is
represented here by a `rubber sheet' picture. (a) When the target planet is far from the
Sun, the radar path is on the `flat' part of the sheet. (b) As the target approaches the
Sun, the radar has greater and greater distance to cover because of the `dip' in the sheet,
so the travel time is longer.
and velocity. The accuracy is extremely impressive, and can be as good as 5-10 cm
in position.
Although the velocities of the clocks are small and the gravitational fields
are weak, relativistic effects like time dilation and gravitational frequency shift
would cause errors much larger than possible errors in the accuracy of the
cesium clocks used, and so they need to be taken into account. A natural
consequence of this is that the GPS provides a way of testing the theory of
relativity. Data from the GPS satellites recorded by the TOPEX satellite (in
orbit primarily to measure the height of the sea) is providing the first explicit
measurements of the periodic part of the combined effect of time dilation and
gravitational frequency shifts on an orbiting receiver. Preliminary analysis of
the data gives an agreement between theory and experiment to within 2.5 per cent.
(For more details, see `The global positioning system' by T. A. Herring,
Scientific American, February 1996, 32-38.)
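To see why these effects cannot be ignored, the following sketch estimates the two fractional rate offsets of a satellite clock relative to a clock on the ground: the slowing due to orbital speed, $-v^2/2c^2$, and the speeding up due to the weaker gravitational potential, $(GM/c^2)(1/R - 1/r)$. The orbital radius of 26 560 km, the circular-orbit assumption, and the neglect of the Earth's rotation are simplifications of ours, not figures from the text.

```python
C = 299_792_458.0            # speed of light, m/s
GM_EARTH = 3.986004418e14    # G times the Earth's mass, m^3/s^2
R_EARTH = 6.371e6            # mean radius of the Earth, m
R_GPS = 2.656e7              # assumed GPS orbital radius, m
DAY = 86_400.0               # seconds


def gps_clock_offsets_us_per_day():
    """Return (velocity effect, gravitational effect) in microseconds per day
    for a satellite clock in a circular orbit, relative to a ground clock."""
    v_squared = GM_EARTH / R_GPS                       # circular orbit
    velocity_effect = -v_squared / (2.0 * C**2)        # time dilation
    gravity_effect = GM_EARTH / C**2 * (1.0 / R_EARTH - 1.0 / R_GPS)
    to_us_per_day = DAY * 1e6
    return velocity_effect * to_us_per_day, gravity_effect * to_us_per_day


velocity_us, gravity_us = gps_clock_offsets_us_per_day()
net_us = velocity_us + gravity_us
range_error_km = net_us * 1e-6 * C / 1000.0

print(f"time dilation: {velocity_us:+.1f} microseconds per day")
print(f"gravitational shift: {gravity_us:+.1f} microseconds per day")
print(f"net: {net_us:+.1f} microseconds per day, "
      f"or about {range_error_km:.0f} km of range error per day if uncorrected")
```

An uncorrected drift of this size would swamp a positional accuracy of centimetres within a fraction of a second.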
5.10 Gravitational waves
(or cosine of the same argument) where $A^{ij}$ and $k_i = (\omega, \mathbf{k})$ are constants. There is
already a restriction on $A^{ij}$ because of the choice of the Lorentz gauge, and it can
be restricted further to what is known as the transverse traceless gauge. Traceless
means that if $A^{ij}$ is written as a matrix, the sum of its diagonal components is zero.
To understand what we mean by transverse, let us choose our axes so that the
wave is travelling in the z-direction, so $k_i = (\omega, 0, 0, \omega)$. Then our gauge conditions
mean that components of $A^{ij}$ with either i or j being in the z-direction are
zero, so that the wave oscillations are transverse to the direction in which it is
travelling. In particular, we can write
$$A = \begin{pmatrix} A^{xx} & A^{xy} & 0 \\ A^{xy} & -A^{xx} & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
so that there are only two independent components, $A^{xx}$ and $A^{xy}$. Hence the
perturbation of the metric from its flat-space value also has only two independent
components.
One way of understanding such waves in a more concrete way is to consider
their effect on the motion of particles which they pass. It is no good to consider a
single particle as we could always choose coordinates moving with it, so we need
to consider the relative motion of two or more particles (see Section 5.3 and
pp. 212-13). It is possible to construct and solve an equation for the vector
separating two particles; this is called the equation of geodesic deviation and
relates derivatives of the separation to derivatives of the metric describing the
space-time curvature. It is beyond the scope of this book to give details of this
equation but we shall describe its predictions pictorially. Consider a circle of
particles around a central one, all lying in the (x,y)-plane, perpendicular to the
direction of travel of the wave. For a wave with $h_{xx}$ non-zero, $h_{xy} = 0$, the circle
would be distorted as shown in Fig. 5.36b, first squeezed in the x-direction and
elongated in the y-direction, followed by the opposite effect. For a wave with
$h_{xx} = 0$ but $h_{xy}$ non-zero, there would be a similar effect but the squeezing and
elongation would be at 45 degrees to the x- and y-axes (Fig. 5.36c). We say that
the plane wave has two polarization states corresponding to Fig. 5.36.
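The two distortion patterns can be sketched numerically. To first order, a transverse traceless wave displaces a free particle at small separation (x, y) from the central one by half the wave amplitude matrix acting on that separation, with $h_{yy} = -h_{xx}$. This first-order rule is the standard weak-field result rather than something derived in the text, and the function names and the grossly exaggerated amplitude are ours.

```python
import math


def displaced(x, y, h_xx=0.0, h_xy=0.0):
    """Position of a particle initially at (x, y), relative to the central
    particle, at an instant when the wave amplitudes are h_xx and h_xy."""
    new_x = x + 0.5 * (h_xx * x + h_xy * y)
    new_y = y + 0.5 * (h_xy * x - h_xx * y)   # uses h_yy = -h_xx (traceless)
    return new_x, new_y


def ring(n=8, radius=1.0):
    """n particles equally spaced on a circle in the (x, y)-plane."""
    return [(radius * math.cos(2.0 * math.pi * k / n),
             radius * math.sin(2.0 * math.pi * k / n)) for k in range(n)]


h = 0.2  # hugely exaggerated; realistic amplitudes are many orders smaller
plus_pattern = [displaced(x, y, h_xx=h) for x, y in ring()]
cross_pattern = [displaced(x, y, h_xy=h) for x, y in ring()]

for (x, y), (px, py), (cx, cy) in zip(ring(), plus_pattern, cross_pattern):
    print(f"({x:+.2f}, {y:+.2f}) -> h_xx: ({px:+.2f}, {py:+.2f})"
          f"   h_xy: ({cx:+.2f}, {cy:+.2f})")
```

With $h_{xx}$ alone the ring is stretched along one coordinate axis and squeezed along the other; with $h_{xy}$ alone the same happens along the diagonals. Reversing the sign of h half a period later gives the opposite distortion.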
Fig. 5.36 Distortion by gravitational waves with the two types of polarization. (a) A
circle of particles around a central one, all lying in the (x, y)-plane, before a gravitational
wave travelling in the z-direction reaches them. (b) Distortion for non-zero $h_{xx}$.
(c) Distortion for non-zero $h_{xy}$.
We have focused our discussion on plane wave solutions of the linearized form
of Einstein's equations. There are also exact wave solutions in the more general
case where no approximation is made about the weakness of the gravitational
field, but since any gravitational waves reaching the Earth are likely to be weak,
we shall not discuss the more general waves here.
Direct detection
The strength of a gravitational wave is usually characterized by a parameter h, the
strain produced on an idealized detector consisting of two free masses a distance
L apart. If their separation changes by $\Delta L$ as a result of the passing of the wave, h
is given by
$h = \frac{2\,\Delta L}{L}$ (5.11)
Although the largest signals currently anticipated, say from a supernova explosion
in our galaxy, have h in the range $10^{-17}$ to $10^{-18}$, such events are likely to be
rare and so it makes more sense to have detectors with sensitivities of $10^{-21}$ to
$10^{-22}$. We shall see to what extent present-day detectors match up to this aim.
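To get a feeling for what these numbers mean, the definition of the strain can be inverted to give the change in separation, $\Delta L = hL/2$. The sketch below does this for two purely illustrative separations chosen by us (a metre-scale bar and a kilometre-scale baseline); they are not the dimensions of any particular detector described in the text.

```python
def separation_change(strain, separation_m):
    """Change in separation (metres) of two free masses a distance
    separation_m apart, for a wave of the given strain h = 2 * dL / L."""
    return 0.5 * strain * separation_m


for strain in (1e-17, 1e-21):
    for length_m in (1.0, 4000.0):   # hypothetical separations
        change = separation_change(strain, length_m)
        print(f"h = {strain:.0e}, L = {length_m:6.0f} m: "
              f"change in separation = {change:.1e} m")
```

Even over several kilometres the displacement is far smaller than an atomic nucleus, which is the essential difficulty of direct detection.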
The basic idea for a means of detecting gravitational waves is, as already
suggested, to measure changes in the metric by studying the separation of two
heavy masses suspended in a way which isolates them as much as possible from all
other vibrations. As a model of what are known as resonant detectors, consider
two masses joined by a spring (Fig. 5.37). In the absence of gravitational waves,
the oscillations of the spring would be simple harmonic motion with damping
(like the motion of an imperfect pendulum which gradually slows down because
of air resistance). However, gravitational waves impinging on the masses could
provide a forcing term for this damped motion, and adjustment of the parameters
of the detector to match the frequency of the waves could result in a large or
resonant response, which would be more likely to be detected.
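The resonance argument can be illustrated with the textbook steady-state solution of a damped oscillator driven at angular frequency $\omega$, $\ddot{x} + \gamma \dot{x} + \omega_0^2 x = (F/m) \cos \omega t$, whose amplitude is $(F/m) / \{(\omega_0^2 - \omega^2)^2 + \gamma^2 \omega^2\}^{1/2}$. Treating the wave as a simple sinusoidal forcing term is our simplification, and all parameter values below are arbitrary.

```python
import math


def steady_state_amplitude(drive_omega, natural_omega, damping_rate,
                           force_per_mass=1.0):
    """Steady-state amplitude of a damped harmonic oscillator driven
    sinusoidally at angular frequency drive_omega."""
    detuning = natural_omega**2 - drive_omega**2
    return force_per_mass / math.sqrt(
        detuning**2 + (damping_rate * drive_omega)**2)


natural_omega = 1.0    # arbitrary units
damping_rate = 0.01    # light damping

for drive_omega in (0.5, 0.9, 1.0, 1.1, 2.0):
    amplitude = steady_state_amplitude(drive_omega, natural_omega, damping_rate)
    print(f"drive frequency {drive_omega:.1f}: amplitude {amplitude:8.2f}")
```

When the detector is tuned to the wave frequency, the response exceeds the slow-driving response by the factor $\omega_0/\gamma$, so a lightly damped detector gives a strongly enhanced, but narrow-band, response.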
The pioneer of gravitational wave detection is Joseph Weber from the
University of Maryland, who first built such resonant detectors in the 1960s.
Rather than the simplified model just described, a resonant detector usually
consists of a very large cylindrical bar, with the elasticity of the bar, when it is
stretched along its axis, playing the role of the spring. In the last 30 years, Weber
has reported a number of gravitational wave `events', and although they have not
Fig. 5.37 [Diagram: two masses m joined by a spring.]
[Diagram labels for Fig. 5.38: partially transmitting mirrors; detector.]
Fig. 5.38 A schematic representation of an interferometric detector of gravitational waves
(not to scale: the paths to the fully reflective mirrors are much longer than the other paths).
5.11 Detection of gravitational waves 231
Indirect detection
Astonishing as it may seem, we already have indirect evidence for the existence of
gravitational waves. Moreover, this was obtained not from gravitational wave
detectors but from conventional radio telescopes. Before looking at the details,
we need to consider the theoretical basis for this indirect detection.
Gravitational waves carry energy (which is why a bar detector oscillates
when such energy is transferred to it by a passing wave). Since we believe that
energy is conserved overall, this means that the source of the waves must be
losing energy. Suppose that the source is two compact objects in orbit around
each other. As energy is lost, the size of the orbit decreases and the period of
rotation becomes shorter. So the idea is that if one observes a binary system
with decreasing period, the most likely explanation is that gravitational
radiation is being emitted.
In 1974, Hulse and Taylor, astronomers then at the University of
Massachusetts at Amherst, discovered what was labelled as PSR 1913+16, a type
of neutron star known as a pulsar because it rotates rapidly and very regularly,
beaming out charged particles from each of its magnetic poles. This particular
pulsar also moves in close orbit about a very massive companion neutron star,
with a period of about 8 hours. If this system emits gravitational waves, then its
energy must decrease, the pulsar and its companion will move closer to each other
and their orbital period will decrease. This effect was calculated as early as 1941
by the Russian physicists, Landau and Lifshitz, and observations of the binary
pulsar agreed extremely closely with the theoretical prediction. The observed
value in 1982 for the rate of decrease of the period was $(2.30 \pm 0.22) \times 10^{-12}$
compared with the relativistic prediction of $2.4 \times 10^{-12}$. These values translate
into about $7 \times 10^{-5}$ seconds per year, so the experimental accuracy needed
was extremely high. Hulse and Taylor were awarded the Nobel Prize for this work
in 1993.
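The size of the predicted effect can be estimated from the quadrupole formula for the orbital decay of two point masses in an eccentric orbit (the form usually attributed to Peters and Mathews, used here in place of the original calculation mentioned above). The orbital period, eccentricity, and the two masses of about 1.44 and 1.39 solar masses are values supplied by us from memory of the published timing results, not from the text, so the sketch is only a consistency check.

```python
import math

T_SUN = 4.925490947e-6       # G * (solar mass) / c^3, in seconds
YEAR = 365.25 * 86_400.0     # seconds


def orbital_period_derivative(period_s, eccentricity, m1_solar, m2_solar):
    """Dimensionless rate of change of the orbital period (seconds per second)
    caused by gravitational radiation, from the quadrupole formula."""
    e2 = eccentricity**2
    enhancement = (1.0 + 73.0 / 24.0 * e2 + 37.0 / 96.0 * e2**2) \
        / (1.0 - e2) ** 3.5
    return (-192.0 * math.pi / 5.0
            * (2.0 * math.pi / period_s) ** (5.0 / 3.0)
            * enhancement
            * T_SUN ** (5.0 / 3.0)
            * m1_solar * m2_solar / (m1_solar + m2_solar) ** (1.0 / 3.0))


# Assumed parameters for the binary pulsar.
p_dot = orbital_period_derivative(27906.98, 0.6171, 1.44, 1.39)
seconds_lost_per_year = -p_dot * YEAR

print(f"predicted rate of change of the period: {p_dot:.2e}")
print(f"decrease in the period each year: {seconds_lost_per_year:.1e} s")
```

The large eccentricity matters: the enhancement factor is roughly 12 for this system, so a circular orbit of the same period would radiate far less.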
Perhaps at some stage in the future, gravitational wave detectors will be sufficiently
sensitive to register directly the radiation from PSR 1913+16. In the
meantime, it provides us with the best indirect evidence for the existence of such
radiation.
The reader who wishes to keep up-to-date on this exciting subject will find
articles on it from time-to-time in journals like Scientific American. Two very
informative articles from earlier this decade are `Catching the wave' by Russell
Ruthen, Scientific American, March 1992, 72-81, and `Binary neutron stars' by
Tsvi Piran, Scientific American May 1995, 53-61. The theoretical background is
covered in a very accessible way by Bernard Schutz in A First Course in General
Relativity (Cambridge University Press, 1985).
A different kind of indirect detection applies to the possible gravitational
cosmic radiation background mentioned above. While this might be observed
directly with extremely sensitive detectors, it should also be detectable by its
effects on the cosmic microwave background radiation anisotropies (discussed
in the section on cosmology). There are currently various groups undertaking
high-sensitivity measurements of these anisotropies; if they have the appropriate
angular pattern, they might give us an indirect detection of the cosmic back-
ground of gravitational radiation.
Varying $\kappa$
When Einstein put forward his theory of general relativity, and for quite some
time afterwards, it was assumed that the gravitational constant $\kappa$ was just that: a
fundamental constant of nature with fixed value like the mass of the electron.
However, with the discovery in 1929 that the universe is expanding this
assumption was called into question.
The origin of inertia has for long been a source of speculation. It is clear that the
gravitational force on a test particle depends on the matter in the rest of the
universe. But we have seen that gravity and inertia are intimately connected
with each other. Putting these together, Mach's principle (see the discussion in
D.W. Sciama's book The Physical Foundations of General Relativity, Doubleday,
1969) suggests that inertia is the result of interactions with very distant matter in
the universe. Indeed it is likely that the most distant matter we see is most
important, the essential point being that the very large amount of such matter
makes up for its very large distance. If this is so, then because the universe is
expanding, it might be that the consequent change in the force on a test particle
would be described by changes in the value of $\kappa$.
Another motivation for considering the idea of varying $\kappa$ came from the British
physicist Paul Dirac, who was awarded the Nobel Prize in 1933 for his leading
role in the development of quantum mechanics. Dirac noticed a rather extra-
ordinary coincidence between particular combinations of quantities appearing in
physics. The ratio of the electric force between a proton and an electron to the
gravitational force between them, and the ratio of the age of the universe to the
time for light to travel a tiny distance called the classical electron radius, are both
enormous numbers, and what is more, they are both approximately $10^{40}$. Unless
we live at a special time, this coincidence should be valid at other times. Now the
age of the universe is certainly not constant, so that suggests that some other
`ingredient' in the numerical coincidence is also changing (keeping the ratio
constant). The most likely candidate is $\kappa$!
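The coincidence can be checked with present-day values of the constants. In the sketch below the physical constants are standard reference values and the age of the universe is taken as 13.8 billion years; both are our inputs rather than figures from the text, and the age in particular was less well known when the argument was first made.

```python
import math

E_CHARGE = 1.602176634e-19       # elementary charge, C
EPSILON_0 = 8.8541878128e-12     # permittivity of free space, F/m
G_NEWTON = 6.67430e-11           # Newtonian gravitational constant, SI units
M_PROTON = 1.67262192369e-27     # kg
M_ELECTRON = 9.1093837015e-31    # kg
C = 299_792_458.0                # speed of light, m/s
R_ELECTRON = 2.8179403262e-15    # classical electron radius, m
AGE_OF_UNIVERSE = 13.8e9 * 365.25 * 86_400.0   # seconds (assumed value)

# Electric force over gravitational force between a proton and an electron;
# the separation cancels because both forces fall off as 1/r^2.
force_ratio = E_CHARGE**2 / (
    4.0 * math.pi * EPSILON_0 * G_NEWTON * M_PROTON * M_ELECTRON)

# Age of the universe over the light-crossing time of the electron radius.
time_ratio = AGE_OF_UNIVERSE / (R_ELECTRON / C)

print(f"force ratio: {force_ratio:.1e}")
print(f"time ratio:  {time_ratio:.1e}")
```

The two numbers agree only to within a factor of order ten, so the coincidence is one of orders of magnitude, not an exact equality.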
The simplest way to incorporate this possibility into physics is just to replace
the constant $\kappa$ in Einstein's equations by a function of time. Another way, which
forms the basis of the Brans-Dicke theory, which is one of the so-called scalar-
tensor theories of gravity, is to introduce a completely new term into Einstein's
equations. This term involves a scalar field $\phi$ (that is, a field without indices), the
value of which is determined by the matter throughout the universe. This field
5.12 Alternative theories and approaches 233
also plays the role of the inverse of $\kappa$, leading of course to a varying value of that
so-called constant again.
Experiments to distinguish between theories with varying $\kappa$ and conventional
general relativity are very difficult, partly because any variation in $\kappa$ would be
expected to be very small anyway. For many years, it was not possible to dif-
ferentiate between the theories, but recent experiments all seem to come out in
favour of general relativity. If $\kappa$ did vary, planetary orbits would slow down as a
result, but observations of Mercury, Venus, and Mars have found no such effect,
down to one part in 100 billion per year. It is possible that future observations of
gravitational waves would also provide conclusive evidence for or against theories
with varying $\kappa$.
Current best limits on the time variation $\dot{\kappa}$ of $\kappa$ from combined solar system
measurements are $|\dot{\kappa}/\kappa| < 4 \times 10^{-12}\ \mathrm{yr}^{-1}$ (with the same limits on the time
variation of the Newtonian gravitational constant).
Quantum gravity
A problem which has challenged theoretical physicists for many years is how to
combine two of the most successful physical theories of the twentieth century,
namely general relativity and quantum mechanics. As we have seen, general
relativity provides a very accurate description of the large scale behaviour of
gravitating bodies in the solar system and beyond. On the other hand, quantum
mechanics deals predominantly with the behaviour of matter on the very small
scale. Why then is there any need to try to relate these theories? In fact there are a
number of compelling reasons for attempting this.
Think first about the very early universe. Just after the Big Bang, the grav-
itational fields were extremely strong and the distances minute, so that both
relativistic and quantum effects would have been very important. In Einstein's
theory, the Big Bang itself was what is known as a singularity because the density
of matter was infinite. General relativity does not deal with singularities (they are
specifically excluded from its domain) so a new theory, which takes quantum
effects into account, is needed to throw further light on the Big Bang and on other
singularities. In Chapter 6, we shall consider black holes, which have singularities
in their centres. In the presence of the strong gravitational fields produced by
black holes, quantum effects are known to be significant. For example, Stephen
Hawking has shown that black holes are not really so black when quantum
mechanics is taken into account; particles can be radiated from these objects
which, classically, absorb everything and emit nothing.
Another argument for trying to find a synthesis of quantum mechanics and
general relativity is one of completeness. There are four fundamental forces in
nature, the strong and weak nuclear forces, the electromagnetic force, and
gravity. A break-through in particle theory occurred in the mid-70s when the
weak and electromagnetic forces were combined in a unified theory, described
by mathematical objects called Lie groups. Glashow, Salam, and Weinberg
received the Nobel Prize for this work. The obvious next step was to incor-
porate the strong force, with the description of strong interactions known as
quantum chromodynamics. This was partially combined with the electroweak
theory to give the Standard Model, which describes all these interactions in terms
of three types of particle, leptons, quarks and gauge bosons. This has proved
highly successful and has had many of its predictions confirmed by experiment.
Theorists now search for a Grand Unified Theory ('GUT'), in which the different
types of interaction are low energy manifestations of a single master theory and
many proposals have been made in this regard. Of course it remains to incor-
porate the final force, gravity, into the scheme and a great deal of work has gone
into trying to formulate general relativity along the lines of these so-called gauge
theories.
There are two types of approach to the quest for a theory of quantum gravity.
The first starts with general relativity and attempts to extend and modify it to
make a theory describing the quantum properties of the gravitational field. The
second approaches it from the other end, starting with some new quantum theory
which will have, it is hoped, general relativity as its limit in appropriate cir-
cumstances. While it is outside the scope of this book to give a detailed account of
progress in quantum gravity, we will just mention one approach which is
regarded by many physicists as the best candidate available so far for the ultimate
description of the forces of nature. This is string theory.
Traditionally, physicists have regarded particles, idealized as point-like
objects, as the fundamental constituents of matter. String theorists argue that one
could just as well consider `extended objects', strings, which trace out two-
dimensional surfaces, called world sheets, as they move through space-time. The
world sheets of these strings are either bounded by two lines (like a ribbon) or are
closed up on themselves to form a thin tube (like a drinking straw). Thus the
strings can be of finite length, infinitely long (without ends) or form closed loops.
But the extension does not stop with strings; they have been generalized to
p-branes which are objects with p spatial dimensions, sweeping out (p + 1)-
dimensional surfaces as they move in a higher dimensional space. (For example,
p = 0 corresponds to a particle, p = 1 to a string, and p = 2 to a membrane.) The
study of such objects is currently known as M-theory, although at the time of
writing (1999) no one seems quite sure about the definition of the theory or indeed
what the M stands for! (See `The theory formerly known as strings' by M.J. Duff,
Scientific American, February 1998, 54-59).
String theory (or M-theory) contains some very important ingredients. One of
these is supersymmetry, which is a mathematical formalism by which particles
with integer values of spin and particles with half-odd integer values can be
treated together. (In more conventional quantum theory, these different types of
particles had to be dealt with separately mathematically.) Secondly, as men-
tioned, the strings or more general objects live in higher dimensional spaces
(indeed, it appears that when one adopts this viewpoint, everything becomes
simpler in eleven dimensions) and to make contact with real four-dimensional
space-time, the extra dimensions have to be `compactified'. Imagine a two-
dimensional surface in the shape of a hollow cylinder. If the radius is extremely
small compared with the length, the object appears to all intents and purposes to
be a line, a one-dimensional entity. In an analogous way, the extra dimensions
in string theory are curled up on themselves, so that they are not seen at the
macroscopic level. Thirdly, and more generally, these theories involve some very
complicated and sophisticated mathematical ideas. These include the discovery
of unexpected symmetries (for example, dualities between high energy and low
energy results), and the use of gauge theories, in which force-fields are repre-
sented via a generalization of the idea of parallel transport. The elegance with
which these ideas fit together makes the theory very attractive.
The obvious question to ask now is what all this has to do with gravity. One very
significant connection is that one of the string states is a massless spin-2 particle
which can be identified with the graviton, which is the entity through which
particle theorists think the gravitational force is mediated (in the same way as the
photon or light particle mediates the electromagnetic force). Thus string theory
has general relativity as an approximation, in particular as its low energy limit. A
second rather amazing connection is the calculation of the entropy or informa-
tion content of an extreme charged black hole. This is done by counting the
number of string states that have the same mass and charge as an extremal black
hole, that is a black hole with as much charge as possible. No one quite under-
stands why this works, but it indicates a deep relationship between string theory
and general relativity.
There are still many fundamental questions in string theory which have yet to
be answered, and it is not clear whether some of the difficulties will ever be
overcome. However, it has certainly stimulated a lot of work, which has produced
some fascinating results. In common with other approaches to quantum gravity,
it is extremely difficult to relate it to any observational data. Currently there is no
experimental evidence for any theory of quantum gravity. Indeed there is a
fundamental problem here: it is unlikely that we will ever be able to test such
theories, which make predictions for the future state of black holes, which are
hidden behind event horizons, and the quantum gravity era of the early universe,
which is also inaccessible to observation. This is because the early universe is
highly opaque, and any remnants of the quantum gravity era have probably
been swept away by a period of inflation at very early times (discussed below).
Certainly the possibility of testing these quantum gravity theories in the
laboratory is highly improbable, so verifying them in some observational or
experimental way poses a serious problem. However, the verification of super-
symmetry in accelerator experiments would provide strong indirect evidence for
the correctness of the M-theory approach. Also the experimental observation of
light scalar fields or of space-time-dependent coupling constants could provide
evidence for the existence of higher dimensions, since the scalar could be inter-
preted as the size of the extra dimensions.
An excellent description of the aims and achievements of superstring theory is
given by Brian Greene in his book The Elegant Universe (Jonathan Cape, London
1999).
Broken symmetries
It has been mentioned that gauge theories are central to modern theoretical
physics. Their successful application to particle physics depends on the idea of a
broken symmetry (an idea imported from the theory of magnetism), which is thus
now fundamental to much of physics. It underlies for instance the mechanism
proposed for the inflationary universe idea (see Section 7.6). The point we want to
make here is that this idea is also of importance in other ways in relating physics,
and in particular relativity theory, to modern cosmology. Two particular
examples of broken symmetries in the universe are the preferred 4-velocity in
cosmology (a rest frame for the universe) and the preferred direction of time in
physics (the origin of the arrow of time).
The basic idea is that particular solutions to the laws of physics in general do
not have the same symmetries as the laws themselves. Thus in the case of
cosmology, as has been emphasized at the end of Section 3.1, there is a pre-
ferred rest frame in cosmology, defined by the CBR. This breaks the Lorentz
invariance of the laws of physics embodied in relativity theory, as set out
in detail in this book. But that invariance of the laws themselves does not mean
that solutions of the gravitational equations will also have that symmetry, so
there is no contradiction. There is indeed a preferred rest frame in the universe,
and we are close to such a rest frame (we are moving at about 300 km/sec
relative to it). This will happen in any solution where the existence of matter
defines a local rest-frame, so it is not very mysterious, but it is still important
to realize that this is indeed the situation.
Secondly, and more profoundly, the laws of fundamental physics are time
symmetric (except for a weak symmetry-breaking associated with the weak
force); but all macroscopic physics, chemistry, and biology are dominated by a
unique arrow of time and in particular by the second law of thermodynamics.
How is this consistent? Again, the situation is that the solutions to the laws break
the symmetry inherent in the laws. However, here the consequences are
profound: we are unable to send signals to the past, although Maxwell's equations by
themselves would allow it, or to reconstruct a broken glass by simply reversing the motion of
its particles (see the discussion by Roger Penrose in The Emperor's New Mind:
Oxford University Press, 1989). It is unclear how the solutions to the time-symmetric
fundamental equations that actually occur all come to have the one-way arrow of time
imposed on them. The best suggestion so far is that this is because of the
expansion of the universe, which supplies a `master' arrow of time, that then
results in all the others (the mechanical, thermodynamic, gravitational,
electrodynamic, biological, and psychological arrows). However, this is not yet fully
understood; it has something to do with the way boundary conditions are
imposed on physical quantities at the origin of the universe, and the way this
differs from the corresponding conditions at the end of the universe (see The
Emperor's New Mind for further discussion).
This arrow of time is profoundly important to physics in general, and to the
nature of life in particular. We still await a fully convincing explanation of this
broken symmetry, and how it comes into being physically (mathematically we
impose it by hand: we simply reject half of the solutions that are allowed by the
equations). It probably does have a cosmological origin, but how it works still
needs explanation. For further discussion of this fascinating and important topic,
see for example The Arrow of Time by Peter Coveney and Roger Highfield
(Fawcett Books, 1992).
Computer Exercise 14
(A) The geometry of a space-time is represented by a diagonal metric tensor:
where A, B, C, and D are functions of the coordinates {x^i} = {T, X, Y, Z}, defined in a
subroutine METRIC. A simple example is A = 1, B = T, C = T, D = T.
(1) Arrange for the coordinates X(T), Y(T), Z(T) of a curve in the space-time from an
initial point (T0, X0, Y0, Z0) to a final point (T1, X1, Y1, Z1) to be stored in a
subroutine CURVE, either in analytic form (i.e. giving suitable formulae for the curve in
terms of simple functions) or in a numerical table. As a particular example, you might take
X(T) = T^2, Y(T) = T, Z(T) = 0.
(2) Split the time period (T0, T1) into M equal parts labelled by J (J = 1, 2, ..., M)
with the Jth interval starting at the time T(J). Write a subroutine STEP that (a) determines
from CURVE the coordinates X(J), Y(J), Z(J) corresponding to T(J); (b) finds the
increments DT, DX, DY, and DZ in the Jth interval, and (from METRIC) the functions
A, B, C, D evaluated at T(J); (c) evaluates the approximation DS2 to the interval (*),
where
Particular examples
The concepts and results described in this chapter are difficult to understand in
general, so in the following chapters we examine the nature of particular curved
where m is the mass of the body measured in geometric units. Here r is a radial
coordinate, θ and φ are the usual angular coordinates, and t is a time coordinate.
However, as we shall show below, r is not proper distance and t is not proper time
along the coordinate curves. The form (6.1) will be valid for r > RS where RS is the
value of the coordinate r at the surface of the body; for 0 < r < RS a different
interval (the interior solution) will describe the interior structure of the body.
As we discuss shortly, one must have RS > 2m for a static star.
In these expressions, the mass m is naturally given in geometric units. These
units will be the same as the units used for spatial distances (since m/r must be
dimensionless in (6.1)). The mass m in these units is related to the mass M given in
ordinary units of mass by the formula m = GM/c^2, where G is the gravitational
constant and c the speed of light. In keeping with the previous sections we
will often measure distances in terms of light travel times, so masses will then also
be measured in units of time! (the mass m* in these units is given by m* = m/c).
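As a rough numerical check of the conversion m = GM/c^2, the following Python sketch converts the masses of the Earth and the Sun into geometric units of length and of light travel time. The constants G and c and the two masses in kilograms are rounded textbook values supplied here for illustration, and the function names are hypothetical.

```python
# Rounded SI values, supplied for illustration (assumptions of this sketch).
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s

M_EARTH_KG = 5.972e24
M_SUN_KG = 1.989e30


def geometric_mass_cm(mass_kg):
    """Return m = G*M/c^2, expressed in centimetres."""
    return G * mass_kg / C**2 * 100.0


def geometric_mass_seconds(mass_kg):
    """Return m* = m/c, the same mass expressed as a light travel time in seconds."""
    return G * mass_kg / C**3


for name, mass_kg in [("Earth", M_EARTH_KG), ("Sun", M_SUN_KG)]:
    print(f"{name}: m = {geometric_mass_cm(mass_kg):.3g} cm, "
          f"m* = {geometric_mass_seconds(mass_kg):.3g} s")
```

With these inputs the Earth comes out at a little under half a centimetre and the Sun at roughly one and a half kilometres, which conveys how small ordinary masses are in geometric units.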
An idea of the meaning of these units may be gained from the following:
As already stated, the object referred to could be a planet, the Sun, or a star, but in
the analysis that follows, we shall in general refer to it just as a star for simplicity.
Symmetries
Clearly the space-time is static (i.e. unchanging with time), because the form (6.1)
is independent of time. We will refer to observers for whom r, θ, and φ are constant
as `static observers', since they do not move relative to the star; and they would
measure all physical properties of the space-time to be constant in time.
The space-time is also spherically symmetric about the central body. This is
not so obvious, until one realizes that the r^2 term in the metric form is simply r^2 times the
metric form describing a two-dimensional unit sphere (see eqn (5.1)), which is of
course spherically symmetric about the centre of the sphere. This is the only part
of the metric where θ and φ occur; so in fact the space-time described has the same
symmetry as the two-sphere, that is, it is spherically symmetric about the centre of
the star generating the gravitational field.
[Figure 6.1 labels: radial distance D from r = r1 to r = r2 (dθ = 0, dφ = 0); surfaces {r = constant}; 2-spheres {r = r1} and {r = r2} with areas A1 = 4πr1^2 and A2 = 4πr2^2.]
Fig. 6.1 (a) A space-time diagram for the Schwarzschild solution, with the φ angle
suppressed. Surfaces {r = constant} are represented by cylinders, with D denoting the
radial distance from the surface r = r1 to the surface r = r2. (b) A spatial section
{t = constant} of the Schwarzschild solution. Surfaces {r = constant} are spheres.
242 Spherical stars and stellar collapse
the coordinate r used here is that it is an `area coordinate': that is, it is chosen so
that the area of the two-sphere defined by {t = constant, r = constant} is precisely
4πr^2 (this follows immediately from the form (6.1), which reduces to that of
the two-sphere with surface area 4πr^2 when we set dt = 0, dr = 0). However, this
coordinate does not directly measure the distances between these two-spheres
(which it does in the case of flat space-time). In fact, the distance one would
measure along the normal to these spheres at any time t, from the sphere r = r1 to
r = r2, is given by integrating (6.1) with dt = 0, dθ = 0, and dφ = 0:
$$D = \int_{r_1}^{r_2} (1 - 2m/r)^{-1/2}\,dr,$$
Fig. 6.2 The distance D between spheres r = r1 and r = r2, plotted as a function of
d = r2 - r1 for r1 equal to 2.01m, 3m, and 100m.
6.1 The Schwarzschild solution 243
[Figure 6.3. Panel (a): clock time DT compared with the coordinate time difference Dt at different values of r. Panel (b): DT/Dt on the vertical axis (0 to 1.0) against r on the horizontal axis (2m to 15m).]
Fig. 6.3 (a) The relation between clock time and coordinate time varies with the value of
the radial coordinate r. (b) The proper time interval DT divided by the corresponding
coordinate time interval Dt, plotted as a function of r.
$$DT = (1 - 2m/r)^{1/2}\,Dt \qquad (6.3)$$
Asymptotic behaviour
Very far from the body, when r becomes very large, the factors (2m/r) become
negligible and then ds2 coincides with the flat space metric in spherical polar
coordinates. Thus this solution represents an asymptotically flat space-time. This
corresponds to the physical situation that far enough away from the Earth or
Sun, their gravitational fields are negligible. To investigate this further, one can
use the approximation (1 - 2m/r)^{-1} ≈ 1 + 2m/r when |2m/r| << 1, to obtain the form
$$ds^2 = -(1 - 2m/r)\,dt^2 + (1 + 2m/r)\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2) \qquad (6.1a)$$
valid far from the star. Indeed for ordinary stars or planets this form will be a
good approximation everywhere outside its surface, because the condition r > RS
implies m/r < m/RS; and for the Earth and the Sun we find:
Earth: mass = 0.44 cm, RS = 6.4 × 10^8 cm, m/RS = 6.9 × 10^-10
Sun: mass = 1.5 × 10^5 cm, RS = 7 × 10^10 cm, m/RS = 2.1 × 10^-6.
Therefore, even close to the surface of the Earth, m/r < 6.9 × 10^-10 and, in the
case of the Sun, m/r < 2.1 × 10^-6; hence, in both cases (6.1a) will be a good
approximation to (6.1). Then (6.2) is closely approximated by D = ∫ (1 + m/r) dr
giving
$$D = r_2 - r_1 + m \log_e(r_2/r_1) \qquad (6.2a)$$
and (6.3) by
DT = (1 - m/r) Dt. (6.3a)
Clearly the larger r is, the more closely (6.1a) approximates the flat-space metric
(4.29), while (6.2a) and (6.3a) approximate the flat-space results D = r2 - r1 and
DT = Dt.
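The quality of the weak-field formulae can be checked numerically. The Python sketch below compares the approximations (6.2a) and (6.3a) with the exact expressions. The closed form used for the integral of (1 - 2m/r)^(-1/2) is a standard antiderivative that is not quoted in the text, and the function names and the sample radii (the Sun's surface and the Earth's orbit, in centimetres) are illustrative assumptions.

```python
import math


def radial_distance_exact(r1, r2, m):
    """Integral of (1 - 2m/r)^(-1/2) dr from r1 to r2, valid for r2 > r1 > 2m."""
    def antiderivative(r):
        # d/dr of this expression equals (1 - 2m/r)^(-1/2).
        return (math.sqrt(r * (r - 2 * m))
                + 2 * m * math.log(math.sqrt(r) + math.sqrt(r - 2 * m)))
    return antiderivative(r2) - antiderivative(r1)


def radial_distance_weak(r1, r2, m):
    """Weak-field approximation (6.2a)."""
    return r2 - r1 + m * math.log(r2 / r1)


def clock_rate_exact(r, m):
    """DT/Dt for a static observer, from (6.3)."""
    return math.sqrt(1 - 2 * m / r)


def clock_rate_weak(r, m):
    """DT/Dt in the weak-field approximation (6.3a)."""
    return 1 - m / r


# Sample values in centimetres (assumed for illustration).
m_sun, r_surface, r_orbit = 1.5e5, 7.0e10, 1.496e13

excess_exact = radial_distance_exact(r_surface, r_orbit, m_sun) - (r_orbit - r_surface)
excess_weak = radial_distance_weak(r_surface, r_orbit, m_sun) - (r_orbit - r_surface)
print(f"extra radial distance: exact {excess_exact:.4g} cm, weak-field {excess_weak:.4g} cm")
print(f"DT/Dt at the surface: exact {clock_rate_exact(r_surface, m_sun):.12f}, "
      f"weak-field {clock_rate_weak(r_surface, m_sun):.12f}")
```

For the solar values the two versions of each quantity agree to many significant figures, whereas for r1 only slightly larger than 2m the exact and approximate distances differ appreciably.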
The singularity
Clearly, problems would arise in the metric if r could approach the value 2m,
because then DT would go to zero, the coefficient of dt2 in (6.1) would go to zero,
and that of dr2 would diverge. We do not have to worry about this in the present
section, where we assume r > RS > 2m (and indeed, as we have seen, in ordinary
astrophysical situations in the solar system, RS >> 2m). However, we shall have to
investigate the `singularity' in the metric form as r approaches 2m in the next
section, when we consider gravitational collapse.
Redshifts
A consequence of (6.3) is that there are observed gravitational redshifts in these
space-times (as there were in the Rindler universe). Let us see why this is.
Consider two static observers situated radially relative to each other, that is, at
the same values of θ and φ but at different values r1 and r2 of r (Fig. 6.4). A light
ray travelling radially outwards from r1 to r2 will obey the conditions dθ = 0,
dφ = 0, ds^2 = 0 (the first two following because the path is radial, the last because
it represents motion at the speed of light). Then from (6.1) it follows that along the
Fig. 6.4 Two static observers O1 and O2 on the same radial line (θ, φ constant), but at
different values r1 and r2 of r.
light ray, the displacements dr and dt will be related by dr/dt = 1 - 2m/r. Hence,
if the light is emitted by O1 at time t1 and received by O2 at time t2 (see Fig. 6.5a),
we find
This is the formula for the gravitational redshift observed in these space-times
(Fig. 6.5b). Crudely, we can think of light travelling radially out as `climbing out'
of a potential well and so losing energy and hence being received as redder than it
was emitted. K would be measured in precisely the same way as in flat space-time
(see Section 3.1). As in that case, it is the ratio observed between times of all events
as measured at the object and at the observer; referring to it as a redshift effect is
labelling it by one of the most direct ways of measuring it (Fig. 6.5c).
As in the flat-space case, K12 is independent of t1 and DT1. This is essentially
because in both cases, the space-times are static (i.e. unchanging in time).
However, in the present case, quite unlike the case of Doppler shifts for inertial
[Figure 6.5. Panel (c): K12 on the vertical axis (0.5 to 2.5) against r2 on the horizontal axis (2m to 15m).]
Fig. 6.5 (a) Radial light signals are emitted by O1 at t1 and t'1 (at coordinate interval Dt1)
and received by O2 at t2 and t'2 (at coordinate interval Dt2). (b) The gravitational redshift is
defined to be the ratio of the proper time intervals DT1 and DT2. (c) The gravitational
redshift 1 + z = K12 plotted against r2 for various values of r1.
observers in flat space-time, the effect is not reciprocal. In fact, clearly now
K12 = 1/K21; correspondingly, light travelling inwards from r2 to r1 is gaining
energy from the gravitational field and so is blueshifted rather than redshifted.
This has the further consequence that, unlike the situation of inertial observers in
flat space-time, radar measurements of distance will reveal a K-factor of 1 (the
factor K12 on the outward trip will be compensated by the factor K21 on the
inward trip, resulting in no overall change in observed wavelength; Fig. 6.6).
These differences from the Minkowski-universe case occur because the redshifts
observed are due to the inhomogeneity of the space-time, rather than due to
Doppler shifts in a homogeneous space-time; the factor K now is caused by the
gravitational field of the star (represented by the factors 1 - 2m/r in the metric
form). The situation is very analogous to that of the static accelerating observers
Fig. 6.6 The reciprocal nature of the gravitational redshift. A proper time interval DT1
between radial light rays at r = r1 produces a proper time interval DT2 = K12 DT1 at
r = r2. Reflection of these signals produces a proper time interval DT1 = K21 DT2 =
K21 K12 DT1 = DT1 at r = r1.
in the Rindler universe (Section 4.3), which is not surprising: we expect this on the
basis of the principle of equivalence. In the weak-field case (m/r << 1), (6.4)
becomes
K12 = 1 + m/r1 - m/r2. (6.4a)
This will hold, for example, for the gravitational redshift caused by the Earth or
the Sun in the solar system.
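A short Python sketch makes the size of the weak-field redshift concrete. It assumes that, because the space-time is static, successive signals are separated by the same coordinate interval Dt at emission and at reception, so that (6.3) gives K12 = DT2/DT1 as a ratio of the factors (1 - 2m/r)^(1/2). The function names and the sample values (the Sun's surface and the Earth's orbit, in centimetres) are illustrative only.

```python
import math


def k_factor_exact(r1, r2, m):
    """K12 = DT2/DT1 for static observers at r1 (emitter) and r2 (receiver).

    Assumes equal coordinate intervals Dt at both observers and applies (6.3).
    """
    return math.sqrt((1 - 2 * m / r2) / (1 - 2 * m / r1))


def k_factor_weak(r1, r2, m):
    """Weak-field form (6.4a)."""
    return 1 + m / r1 - m / r2


# Light leaving the Sun's surface and received at the Earth's orbit.
m_sun, r_surface, r_orbit = 1.5e5, 7.0e10, 1.496e13

k_out = k_factor_exact(r_surface, r_orbit, m_sun)
k_in = k_factor_exact(r_orbit, r_surface, m_sun)
print(f"z (exact)      = {k_out - 1:.4e}")
print(f"z (weak field) = {k_factor_weak(r_surface, r_orbit, m_sun) - 1:.4e}")
print(f"K12 * K21      = {k_out * k_in:.15f}")
```

The product K12 K21 comes out as 1 to rounding error, which is the non-reciprocal behaviour described above: the outward redshift is undone by the inward blueshift.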
According to these results, the light emitted by dense stars (e.g. white dwarfs)
will show a gravitational redshift when it is received on the Earth. This has been
verified observationally. Again, if sensitive enough measurements of precisely
emitted wavelengths can be made, the effect can be observed for light moving
radially out from the Earth (i.e. climbing vertically away from the Earth's sur-
face), and this too has been verified observationally (as mentioned in Section 5.6)
in the case of light emitted at the base of the Harvard Tower and received near the
top of the tower (Fig. 6.7). Thus, gravitational redshift is a phenomenon that has
been well verified experimentally.
Further properties
Many other results follow from the metric form (6.1). In particular, one can
derive from it the particle orbits in the gravitational field represented, and the
bending of light that will result from that field. The methods used to derive these
results, however, demand more advanced mathematical techniques than we are
allowing ourselves in this book.
Fig. 6.7 A test of the gravitational redshift using light emitted at the bottom of the
Harvard Tower at r = r1 and absorbed near the top at r = r2.
We will not pursue this topic further here, except to note that these calculations
form the basis of the classical tests of general relativity (see Section 5.9) through
examination of the paths of light-rays (particularly the famous observations of
the bending of light by the Sun) and the motion of planets and spacecraft in the
solar system (particularly observation of the precession of the perihelion of the planet Mercury).
Extremely good data is now available through the tracking of spacecraft through
the solar system and the measurement by radar of the distance to reflectors placed
on the planet Mars. The best evidence at the present time, from examining the
motion of light and massive bodies in the solar system, is that the geometry of the
space-time of the solar system is indeed well represented by the Schwarzschild
metric form (6.1).
Conclusion
There is good reason to believe that the Schwarzschild solution describes accur-
ately the gravitational field of an isolated massive body, e.g. a spherical star, for
that geometry appears to describe the space-time of the solar system to a high
degree of accuracy. Thus the geometric properties described above will char-
acterize the local space-time features of many regions of the universe.
Exercises
6.1 Consider a light ray in the gravitational field of a spherically symmetric object,
described by (6.1). Find its coordinate velocity if it travels (a) radially, (b) transversely.
Does the dependence on distance violate Einstein's principle of the invariance of the speed
of light?
6.2 Light signals are emitted from a lift which moves at 20 metres/sec in a vertical lift
shaft on the outside of a sky-scraper. An observer at the base of the lift shaft records the
light signal when the lift is 100 metres above ground level. Calculate the redshift due to
(a) the Doppler effect, (b) the gravitational effect (you may take the radius of the Earth to
be 6000 km).
6.3 The gravitational effect of the Earth on the Moon is due to the curvature of
space-time caused by the Earth at the distance of the Moon. Calculate (a) the distance from the
surface of the Earth to the Moon, and the circumference C of the Moon's orbit; hence find
the ratio R = C/d of C to the distance d from the centre of the Earth to the Moon; (b) the
ratio DT/Dt of proper time to coordinate time at the Moon's orbit, (c) the gravitational
redshift from the Earth's surface to the Moon's surface. [The radius of the Earth is 6000 km
and the average distance from the centre of the Earth to the centre of the Moon is
386 000 km; after converting to suitable units, take these as the appropriate values of the
coordinate r in (6.1). Note that we do not know the form of ds2 inside the Earth.]
Similarly, calculate the curvature effects at the Earth's orbit caused by the Sun. It is this
tiny effect that is responsible for us remaining in our nearly circular orbit around the Sun!
[Take the distance from the centre of the Sun to the Earth as 1.496 x 108 km.]
6.4 Read about light bending and perihelion precession (see e.g. Space, Time and
Gravitation by A. Eddington, Harper Torchbooks, 1959), and other tests of general
relativity theory (see e.g. `Gravitation theory' by C. M. Will, Scientific American,
November 1974 and Will's book referred to on p. 223).
[Figure 6.8 labels: trapped region; event horizon (r = 2m); singularity (r = 0); collapse.]
Fig. 6.8 A space-time diagram of the collapse of a star to form a black hole. The vertical
axis represents time, and r and θ are polar coordinates in planes perpendicular to the t-axis
(the angle φ has been suppressed). Lines of constant v are drawn at 45° to the vertical. The
radius of the star decreases to zero, where a `singularity' (with infinite density) is formed on
the axis. The surface r = 2m forms the event horizon, which encloses the events which
cannot be seen from the outside world. The ingoing light rays move on lines of constant v,
while the directions of the outgoing ones depend on radial distance. The light cones tilt
over toward the spatial origin with decreasing r, and are vertical on the surface r = 2m (the
`event horizon').
Outside the surface of the star, we have the exterior solution represented by the
metric form (6.5), which is just the Schwarzschild solution in new coordinates.
An important feature is that the light cones, determined as usual by the equation
ds2 = 0, `tip over' as one moves from large to small values of r. The significance of
the surface {r = 2m} can now be seen: this is a null surface, i.e. it is generated by
light rays (the rays {r = 2m, θ = const, φ = const}). At all points, the `ingoing'
light rays are at 45° to the vertical. These trace the path of light that is emitted
radially inwards towards the centre (e.g. by pointing a flashlight towards the
centre of the star, and pressing the `on' button for a brief instant). Similarly, the
`outgoing' light rays trace the path of light that is emitted radially outwards from
the centre (e.g. by pointing a flashlight directly away from the centre of the star,
and pressing the `on' button for a brief instant). Outside the surface {r = 2m},
these rays are tilted outward; inside the surface, they are tilted inward. On the
surface, they are precisely vertical, i.e. they are lines of constant r (as follows from
the fact that the coefficient of dv^2 in the metric (6.5) vanishes there). The light rays
indicate the orientation of the light cone at each point; and the causal properties
of this space-time all follow from this behaviour of the local light cones
(cf. Fig. 4.17(b)).
Exercises
6.5 Consider radial light rays in the metric (6.5). Deduce that the coordinate dis-
placements dv and dr along the light rays are related by
{2 dr - (1 - 2m/r) dv} dv = 0.
Hence show that the ingoing light rays are given by dv = 0 and the outgoing light rays by
dr = ½ (1 - 2m/r) dv; and so confirm that the local light cones are correctly represented
in Fig. 6.8 (where lines {v = constant} are drawn at 45° to the vertical, while lines
{r = constant} are vertical). [Hint: look at the possible signs of dr/dt on the null lines].
6.6 Check that the transformation from the coordinate t to the coordinate v is not
well-behaved when r = 2m. [This feature is necessary to enable removal of the apparent
singularity in (6.1) to give the form (6.5), regular at r = 2m.]
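The tilting of the light cones can be tabulated directly from the null condition quoted in Exercise 6.5. The Python sketch below evaluates dr/dv along the outgoing radial rays at three sample radii. The function name, the choice of units with m = 1, and the sample radii are assumptions made for illustration.

```python
def outgoing_slope(r, m):
    """dr/dv along outgoing radial light rays.

    From {2 dr - (1 - 2m/r) dv} dv = 0 with dv != 0.
    The ingoing rays are the other root, dv = 0.
    """
    return 0.5 * (1 - 2 * m / r)


m = 1.0  # geometric units with m = 1 (illustrative choice)
for r in (4.0, 2.0, 1.0):
    slope = outgoing_slope(r, m)
    if slope > 0:
        trend = "r increases along the ray (tilted outward)"
    elif slope == 0:
        trend = "r stays constant (ray generates the surface r = 2m)"
    else:
        trend = "r decreases along the ray (tilted inward)"
    print(f"r = {r:.1f} m: dr/dv = {slope:+.2f}, {trend}")
```

The change of sign at r = 2m is the numerical counterpart of the statement that even the `outgoing' rays are dragged inward inside the horizon.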
* We are considering the situation here classically. When quantum effects are significant, the
situation is different, as will be mentioned briefly at the end of the chapter.
6.2 Spherical collapse to black holes 253
moves outwards (i.e. to larger values of r) in this region. This becomes even
clearer if one uses conformally flat coordinates where the light cone appears
at ±45°; however, to go into that representation is beyond the scope of this
book. For details, see e.g. R. d'Inverno, Introducing Einstein's Relativity,
pp. 230-238 (Oxford University Press, 1992), or C. W. Misner, K. S. Thorne,
and J. A. Wheeler Gravitation, pp. 833-840 (Freeman, 1973).
This gravitational trapping of light and matter will happen for very small radii.
For example, in the case of an object with the mass of the Sun, in appropriate
units m = 1.5 km so the Schwarzschild radius is 3 km. Thus we would have to
compress the Sun (whose radius is 696 000 km) until its radius is less than 3 km in
order to make the curvature of space-time high enough to cause this trapping
effect. Similarly, the Earth would have to be compressed to about 0.9 cm radius
before it fell within its event horizon.
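These figures are easy to reproduce. The Python sketch below evaluates the Schwarzschild radius 2m = 2GM/c^2 and the factor by which each body would have to be compressed. The constants, masses, and present-day radii are rounded reference values supplied here for illustration (the Earth's radius in particular is not taken from the text), and the names are hypothetical.

```python
# Rounded SI values, supplied for illustration (assumptions of this sketch).
G = 6.674e-11   # m^3 kg^-1 s^-2
C = 2.998e8     # m/s

BODIES = {
    # name: (mass in kg, present radius in m)
    "Sun": (1.989e30, 6.96e8),
    "Earth": (5.972e24, 6.371e6),
}


def schwarzschild_radius_m(mass_kg):
    """Return the Schwarzschild radius 2m = 2*G*M/c^2 in metres."""
    return 2 * G * mass_kg / C**2


for name, (mass_kg, radius_m) in BODIES.items():
    r_s = schwarzschild_radius_m(mass_kg)
    print(f"{name}: Schwarzschild radius = {r_s:.3g} m, "
          f"linear compression factor = {radius_m / r_s:.3g}")
```

The Sun's value comes out just under 3 km and the Earth's just under 0.9 cm, in line with the figures quoted above.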
Fig. 6.9 An infalling observer O2 emits radial light signals each minute; a stationary
observer O1 receives them at longer and longer intervals, and the final minute before the
infalling observer O2 crosses the event horizon appears to the external observer O1 to last
for ever. Thus, O1 sees ever-increasing redshifts in the images of O2; consequently, O2 fades
away from sight.
inevitably drawn into the singularity within a short time thereafter, and there is
no way he can send signals to the outside observer O1 to report his findings.
Similarly O1 cannot see what happens to O2 once he has crossed the horizon.
Suppose O2 crosses the horizon at the time 12:00 measured by his clock. Light
emitted by him at 12:00 will never reach O1, since it will stay at the distance
r = 2m. Light emitted by him at every previous time will reach O1. For
illustration, signals sent by him at 11:57, 11:58, and 11:59 are shown in Fig. 6.9.
Clearly, when O1 and O2 are at the same radial distance, there will be a K-factor
determined by the Doppler redshift effect alone. However, as O2 gets further
away from O1, a gravitational redshift will contribute to K as discussed in the
previous section. The crucial feature is that the light emitted in the one-minute
interval from 11:59 to 12:00 will take an infinite time to be received by O1 (the
second signal never arrives). The light emitted during the intervals 11:57 to 11:58
and 11:58 to 11:59 will be climbing out of deeper and deeper gravitational
potential wells, so the observed redshift (and thus the K-factor) will be getting
larger and larger. In the limit as O2 crosses the horizon, the K-factor becomes
infinitely large (the observed time dilation increasing without limit, see eqn (6.4)).
Thus the event horizon may also be characterized as an infinite-redshift surface.
This situation is precisely modelled by the Rindler universe discussed in
Section 4.3.
From this discussion, it becomes clear that the surface of the star too will be
observed with ever-increasing redshift as it approaches the horizon. As the
observed redshift increases, the observed intensity of light received from the star
will decrease, so the star (seen from outside) will be observed to fade away as its
surface approaches the horizon, and larger and larger redshifts are seen by the
outside observer. One should note here that the speed at which the observer O2
and the surface of the star cross the horizon is perfectly finite (and indeed less
than c); the infinite redshift observed is a gravitational redshift in a static
space-time, namely the exterior Schwarzschild solution (note that the redshift will
become infinite even for the family of static observers in the space-time). While
K-factors will be reciprocal for ingoing and outgoing signals outside r = 2m, one
cannot consider their reciprocity for r < 2m, for the outgoing signals cannot then
be received by O1. Ingoing signals will be received by O2 with increasing
blueshifts, while the outgoing signals will not be received at all. Thus the surface of
the star after it has crossed the event horizon, and its final destruction at the
central singularity, cannot be witnessed by an outside observer.
Exercises
6.7 A radial geodesic $x^i(v)$ in the Schwarzschild solution, where v is an affine
parameter, is characterized by three features: (a) θ = constant, φ = constant,
(b) ε, the squared magnitude of the tangent vector, is constant along the geodesic (the tangent vector
$X^i = dx^i/dv$ has constant magnitude along a geodesic because it is parallel propagated),
(c) $dt/dv = E/(1 - 2m/r)$ where E (≠ 0) is a constant (this is energy conservation for the
particle relative to the static frame).
(1) What is the value of ε if $x^i(v)$ is null? What is its sign if $x^i(v)$ is time-like?
(2) Show that (a) to (c) lead to the equation
relating the displacement dr to the affine parameter increment dv; using (c), deduce the
relation
Do black holes actually exist in the universe? If so, will they be like the
spherically-symmetric case we have just considered, or are there other possibi-
lities? How will they have formed? Is there any way we can detect them? In the
next three sections, we shall try to answer these and other questions.
where
$$\Delta = r^2 - 2mr + a^2 + q^2, \tag{6.7}$$
$$\rho^2 = r^2 + a^2 \cos^2\theta. \tag{6.8}$$
This is a generalization of (6.1), the Schwarzschild metric, to which it reduces
when a = q = 0. (When a = 0, the solution is known as the Reissner-Nordstrom
solution for a charged black hole.)
The Kerr-Newman solution has a horizon at
$$r = r_+ = m + (m^2 - q^2 - a^2)^{1/2}; \tag{6.9}$$
this takes a real value only if $m^2 > q^2 + a^2$ and so corresponds to a black hole only
if this inequality holds. Unlike the Schwarzschild metric, there is another non-zero
value of r with physical significance:
$$r = r_0(\theta) = m + (m^2 - q^2 - a^2 \cos^2\theta)^{1/2}, \tag{6.10}$$
which is known as the static limit. To understand the importance of this, we need
to consider first the case of a particle dropped straight in towards a black hole
from very far away. The cross-term involving $d\phi\,dt$ in the metric means that such a
particle acquires an angular velocity in the same direction as the rotating black
hole. This effect is known as the dragging of inertial frames. When the particle
reaches the ergosphere, which is the region between the static limit and the
horizon (Fig. 6.10), this dragging effect is so strong that the particle has to rotate
with the hole even if it has arbitrarily large angular momentum in the opposite
direction!
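To see how the horizon (6.9) and the static limit (6.10) are related, it helps to evaluate both for a sample hole. The following is a minimal sketch in geometrical units; the function names and the sample values m = 1, a = 0.6, q = 0.3 are illustrative assumptions, not taken from the text.

```python
import math


def horizon_radius(m, a, q):
    """Horizon radius r_+ = m + sqrt(m^2 - q^2 - a^2), eqn (6.9)."""
    disc = m * m - q * q - a * a
    if disc < 0.0:
        raise ValueError("no horizon: requires m^2 >= q^2 + a^2")
    return m + math.sqrt(disc)


def static_limit_radius(m, a, q, theta):
    """Static limit r_0(theta) = m + sqrt(m^2 - q^2 - a^2 cos^2 theta), eqn (6.10)."""
    disc = m * m - q * q - (a * math.cos(theta)) ** 2
    if disc < 0.0:
        raise ValueError("no static limit at this angle for these parameters")
    return m + math.sqrt(disc)


if __name__ == "__main__":
    m, a, q = 1.0, 0.6, 0.3  # hypothetical sample values
    r_plus = horizon_radius(m, a, q)
    for degrees in (0, 30, 60, 90):
        r0 = static_limit_radius(m, a, q, math.radians(degrees))
        # r0 - r_plus is the radial thickness of the ergosphere at this angle.
        print(f"theta = {degrees:2d} deg: r0 - r+ = {r0 - r_plus:.4f}")
```

The two surfaces touch on the rotation axis (θ = 0), and the ergosphere is thickest in the equatorial plane; for a = 0 it disappears altogether.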
6.4 Black hole evaporation and thermodynamics 257
Fig. 6.10 A Kerr-Newman black hole; the ergosphere is the region between the horizon
at $r = r_+$ and the static limit at $r = r_0(\theta)$.
Fig. 6.11 Two-dimensional diagram of the mechanism for the Hawking process:
particle-antiparticle pairs are produced in vacuum fluctuations. When this happens near
the horizon of a black hole, the negative energy particle may fall into the black hole and the
positive energy one escape to infinity.
the rate of radiation of a black body, it can be shown (see Exercise 6.10) that the
lifetime of the black hole is proportional to the cube of its mass. Thus big black
holes live longer, but not for ever; they radiate away all their mass in a finite time.
For black holes of stellar mass, which have temperature $3 \times 10^{-8}$ K, their
potential life-time is of the order of $10^{61}$ years, which is much longer than the
current age of the universe. On the other hand, much smaller black holes formed
in the early universe should have radiated away by now (see Section 6.5).
Exercise 6.10
Use the following facts to show that the lifetime of a black hole is proportional to the
cube of its mass: (i) the temperature of a black hole is inversely proportional to its mass;
(ii) the horizon area is proportional to the square of its mass; (iii) the rate of radiation is
proportional to the horizon area and to the fourth power of the temperature.
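The scaling argument of Exercise 6.10 can be checked numerically. Facts (i) to (iii) give a mass-loss rate proportional to (area) x (temperature)^4, that is to M^2 x M^-4 = 1/M^2, on the assumption that the radiated energy is drawn from the mass of the hole. The sketch below integrates dM/dt = -k/M^2 with an arbitrary constant k (a placeholder, not a physical value) and confirms that doubling the mass multiplies the lifetime by eight.

```python
def lifetime(m0, k=1.0, steps=10000):
    """Time for the mass to fall from m0 to zero when dM/dt = -k / M**2.

    Computed by midpoint-rule integration of dt = (M**2 / k) dM over
    0 <= M <= m0; k is an arbitrary proportionality constant (assumption).
    """
    dm = m0 / steps
    return sum(((i + 0.5) * dm) ** 2 / k * dm for i in range(steps))


if __name__ == "__main__":
    t1 = lifetime(1.0)
    t2 = lifetime(2.0)
    print(f"lifetime(1) = {t1:.6f}  (analytic m0**3 / 3k = {1.0 / 3.0:.6f})")
    print(f"lifetime(2) / lifetime(1) = {t2 / t1:.6f}")
```

The integral evaluates to m0^3/(3k), so the lifetime is proportional to the cube of the initial mass whatever the value of k.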
Stellar collapse
On theoretical grounds, we believe that many black holes should occur at the end-
point of the life of massive stars, which cannot be prevented from collapsing by
any known physical force (see the subsection on sources of gravitational waves in
6.5 Black hole candidates and ways of detecting them 261
Section 5.10). Such black holes would have masses between two and one hundred
solar masses. Their detection is much more feasible when they are in a binary
orbit with a visible star. In that case, not only does the motion of the visible
companion suggest the presence of an invisible object, but also matter slowly
spiralling in towards the rotating black hole tends to form an `accretion disc' in its
equatorial plane. Different parts of the disc rotate with different speeds, and the
resulting frictional heating leads to the emission of X-rays. Once these are
detected, study of the structure of the accretion disc and of the orbit of the visible
star can lead to limits on the mass of the invisible object. For example, the first
candidate widely believed to be a black hole is the X-ray source known as Cygnus
X-1. The Uhuru satellite recorded data showing that the X-rays varied over a very
short time-scale, which meant that the source was very compact. The data on the
spectrum of the visible star gave an indication of its mass, and the resulting model
predicted that the mass of the invisible object was at least 3 solar masses, probably
greater than 7 solar masses, and most likely about 16 solar masses! Since all of
these possibilities are well over the mass limit for a neutron star, it was deduced
that a black hole had been located (see e.g. `The search for black holes' by Kip
Thorne, Scientific American, December 1974).
Recent work by Narayan and collaborators on particular models of accretion
discs has pinpointed another way of distinguishing black holes from neutron
stars. The energy carried through an accretion disc to the central object will
disappear if it is a black hole or be re-radiated if it hits the `hard' surface of a
neutron star. Observations of this phenomenon promise an exciting new
approach to black hole detection. (See R. Narayan, `Astrophysical evidence for
black hole event horizons' in Gravitation and Relativity: At the Turn of the Millennium;
Proceedings of the GR-15 Conference, Pune, India, December 1997,
edited by N. Dadhich and J. Narlikar (IUCAA 1998).)
Exercise 6.11
Read up on the evidence for the existence of black holes (a) as remnants of the collapse of
massive stars, (b) at the centre of our own galaxy [see e.g. the Scientific American articles
cited above; The Cambridge Encyclopaedia of Astronomy, ed. S. Mitton, Cambridge
University Press, 1979; The New Astronomy by N. Henbest and M. Marten, Cambridge
University Press, 1983].
Computer Exercise 15
(A) (a) Using equation (**) of Exercise 6.7, write a subroutine GEODESIC to determine
a numerical approximation to the geodesic curve R(T) starting at a point (T1, R1),
with the constant ε/E² denoted by EPS [choose a time increment DT and then for
J = 1, 2, ... repeatedly determine the corresponding increment DR(J) and hence find the
next point T(J), R(J) on the curve, where the initial values are T(1) = T1, R(1) = R1]. (b)
Choose values for T1, R1 (where R1 > 2M) and EPS (where EPS ≤ 0 and is such that
DR(1) < 0; note that the numerical approximation suggested will break down if
DR(J) = 0). Use your subroutine GEODESIC to find the resulting geodesic γ, and show
that for arbitrarily large coordinate times this curve never crosses R = 2M. (c) Now use
program PROPER from Computer Exercise 14 to show that the proper time along the
geodesic until R = 2M is reached is finite.
(B) If you are feeling strong, calculate [using similar methods to Part (A)] the radial
outgoing null geodesics from γ to an observer O who remains stationary at radial distance
R = R1, starting from γ at the times T(J). Use the program PROPER from Computer
Exercise 14 to determine the corresponding proper-time intervals DTAU-EMIT, DTAU-OBS
between the light rays measured by γ and by O. Hence explicitly determine the
redshift measured by O for light emitted by γ. Show how this diverges as γ approaches
r = 2m.
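One possible shape for Part (A) is sketched below in Python rather than as a subroutine in the style the exercise suggests. Equation (**) is not reproduced in this section, so the sketch assumes the form that follows from features (a) to (c) of Exercise 6.7 with the Schwarzschild metric in the signature (-, +, +, +): dR/dT = -(1 - 2M/R)[1 + EPS(1 - 2M/R)]^(1/2) for an ingoing geodesic, with EPS = ε/E² ≤ 0. Under the same assumption the proper-time increment along a time-like curve is DTAU = (-EPS)^(1/2) (1 - 2M/R) DT, which stands in here for the program PROPER. The step size, the end time, and the initial values in the demonstration are arbitrary choices.

```python
import math


def geodesic(t1, r1, eps, m=1.0, dt=0.001, t_max=60.0):
    """Euler integration of an ingoing radial geodesic R(T).

    Assumed equation (not quoted from the text):
        dR/dT = -(1 - 2m/R) * sqrt(1 + eps * (1 - 2m/R)),
    with eps = epsilon / E**2 <= 0 (eps = 0 is the null case).
    Returns the lists T(J), R(J) and the accumulated proper time TAU(J).
    """
    if r1 <= 2.0 * m:
        raise ValueError("start outside the horizon: r1 > 2m")
    if eps > 0.0:
        raise ValueError("eps must be <= 0 for null or time-like curves")
    ts, rs, taus = [t1], [r1], [0.0]
    n_steps = int(round((t_max - t1) / dt))
    for _ in range(n_steps):
        t, r, tau = ts[-1], rs[-1], taus[-1]
        f = 1.0 - 2.0 * m / r
        root = 1.0 + eps * f
        if root < 0.0:
            raise ValueError("no real dR/dT for these initial values")
        dr = -f * math.sqrt(root) * dt
        # d(tau)^2 = f dT^2 - dR^2 / f = -eps * f^2 * dT^2 along the curve.
        dtau = math.sqrt(-eps) * f * dt
        ts.append(t + dt)
        rs.append(r + dr)
        taus.append(tau + dtau)
    return ts, rs, taus


if __name__ == "__main__":
    # EPS = -1 corresponds to a particle falling from rest at infinity (E = 1).
    ts, rs, taus = geodesic(0.0, 6.0, -1.0)
    print(f"T = {ts[-1]:.1f}: R - 2M = {rs[-1] - 2.0:.3e}, tau = {taus[-1]:.4f}")
```

In coordinate time the curve creeps towards R = 2M without reaching it, while the accumulated proper time settles at a finite value; for EPS = -1 this can be compared with the closed form (2/3)(R1^(3/2) - (2M)^(3/2))/(2M)^(1/2).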
7
Fig. 7.1 The world-lines {r, θ, φ constant} of fundamental galaxies and observers in the
FLRW universe (θ, φ suppressed). The proper times along these world-lines between surfaces
of constant time are just the coordinate time differences.
* The function sinh r is the hyperbolic sine, introduced in the discussion of the Rindler universe
(Section 4.3).
266 Simple cosmological models
uniform in the surfaces {t=const}. The form (7.1) also implies that for every
fundamental observer, these space-times are isotropic about each point, i.e. all
directions are equivalent (so for example a fundamental observer cannot point in
any particular direction and say `the centre of the universe lies in that direction',
since no direction is preferred over any other). As in the previous chapter, this
follows for the observer at the origin r = 0 because θ and φ occur in (7.1) only in
the form of the metric of a two-sphere, which is spherically symmetric. From the
spatial homogeneity of the universe model, it then is clear that this is true for every
fundamental observer.
Exercise 7.1
By considering an arbitrary displacement $dx^a = (0, dx^1, dx^2, dx^3)$ in a surface $t = t_0$,
show that this surface is orthogonal in the space-time sense (see eqn (5.6c)) to the
world-lines {r, θ, φ = constant}. Deduce that this displacement is instantaneous for a
fundamental observer.
The space-sections
The surfaces {t = constant} are locally surfaces of simultaneity for all funda-
mental observers, because they are orthogonal (in the space-time sense) to the
matter world-lines. They are surfaces of homogeneity in space-time, that is, all
physical quantities are constant on them (in particular ρ = ρ(t), p = p(t)). It is
instructive to examine in some detail the geometry of these surfaces. Consider the
surface $t = t_0$ which can be seen to have the metric form:
Fig. 7.2 The coordinate r is not the radial distance nor is it an area coordinate as it was for
the Schwarzschild solution (Chapter 6). The area A of a sphere with coordinate $r_s$ at $t = t_0$
is $4\pi f^2(r_s) R^2(t_0)$ and its radial distance from the origin is $R(t_0) r_s$.
7.1 Space-time geometry 267
the area of which is $A = 4\pi R^2(t_0) f^2(r_s)$. From (7.2), the distance from O to this
sphere is $D = R(t_0) r_s$. The implications depend on which form of f(r) applies.
Flat space If f(r) = r, then $A = 4\pi R^2(t_0) r_s^2$ and we have the usual relation
between A and D, i.e. $A = 4\pi D^2$. This is precisely the relation that holds in
Euclidean space, and in fact this is just the case when the space-sections are flat
(i.e. they are surfaces of zero curvature). These space-sections continue
indefinitely; thus this is a spatially infinite universe. It will, for example, therefore
contain an infinite number of galaxies (because of the spatial homogeneity of the
distribution of galaxies).
Hyperbolic space If f(r) = sinh r, then $A = 4\pi R^2(t_0) \sinh^2 r_s$. Now as we move out
from any point, the area of the sphere S is greater than it would be in Euclidean
space because $\sinh^2 r > r^2$ (Fig. 7.3). This is the case of a hyperbolic 3-space of
constant negative curvature $1/R^2(t_0)$, characterized by the relation
$$A = 4\pi R^2(t_0) \sinh^2(D/R(t_0)),$$
showing how the distance D relates to the surface area A of a two-sphere centred
on any point in the three-space. Again, the space-sections continue indefinitely;
this is also a spatially infinite universe containing an infinite number of galaxies.
Elliptic space If f(r) = sin r, then $A = 4\pi R^2(t_0) \sin^2 r_s$. Now as we move out
from any point, the area of the sphere S is less than it would be in Euclidean space
because $\sin^2 r < r^2$ (Fig. 7.3). This is the case of an elliptic three-space of constant
Fig. 7.3 The local geometry of the three-spaces of constant time in the FLRW universes is
characterized by the area of a sphere of radius D. Here this area is plotted against $D^2$ for
elliptic (k = +1), flat (k = 0), and hyperbolic (k = -1) spaces (we have taken $R(t_0) = 1$;
then $D = r_s$).
positive curvature $1/R^2(t_0)$, characterized by the relation
$$A = 4\pi R^2(t_0) \sin^2(D/R(t_0)),$$
showing how the distance D relates to the surface area A of a two-sphere centred
on any point in the three-space.
In this case, a new feature arises. As D increases, the area A increases to a
maximum, then decreases, and finally goes to zero at a point P, the point
`antipodal' to O. Thereafter A increases again, goes to a maximum, and decreases
to zero again.
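The three area-distance relations can be compared directly. The short sketch below evaluates A(D) for k = +1, 0, -1; the function name sphere_area and the default $R(t_0) = 1$ are illustrative choices.

```python
import math


def sphere_area(d, k, scale=1.0):
    """Area of the two-sphere at proper distance d from any point.

    k = 0: flat, k = -1: hyperbolic, k = +1: elliptic; scale is R(t0).
    """
    x = d / scale
    if k == 0:
        f = x
    elif k == -1:
        f = math.sinh(x)
    elif k == 1:
        f = math.sin(x)
    else:
        raise ValueError("k must be -1, 0 or +1")
    return 4.0 * math.pi * scale ** 2 * f ** 2


if __name__ == "__main__":
    for d in (0.5, math.pi / 2, 2.5, math.pi):
        areas = ", ".join(f"k={k:+d}: {sphere_area(d, k):10.4f}" for k in (1, 0, -1))
        print(f"D = {d:.4f}  {areas}")
```

For k = +1 the area peaks at D = πR(t_0)/2 and returns to zero at D = πR(t_0), the distance to the antipodal point, whereas for k = 0 and k = -1 it grows without limit.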
To understand this, consider moving out from O on geodesics in any direction
D1 and the directly opposite direction D2. At a distance d from O, these curves
intersect a two-sphere S centred at O in two points p1 and p2 antipodal to each
other on S (Fig. 7.4). As the distance d increases, the area of the sphere S reaches
a maximum and then starts decreasing again. As this area goes to zero, the
geodesics approach a point P antipodal to O in the three-space; the curves
approach P from precisely opposite directions D'1 and D'2 (because they intersect
each surface S in points antipodal to each other on S). Hence, the situation is as
follows: moving out from O in the direction D1, a geodesic passes through all the
points p1, reaches P in direction D'1, leaves P in direction D'2, passes through all the
points p2, and arrives back at O from the direction D2 (Fig. 7.4). Therefore, the
universe is necessarily spatially closed: moving radially out from O in any
direction, one passes through the antipodal point P and then arrives back at O.
The maximum distance of any point in the space from O cannot exceed the
distance to P, and the total volume of the three-space is finite.
Fig. 7.4 The global geometry of an elliptic (k = +1) three-space. Geodesics in opposite
directions D1 and D2 from O cut a series of two-spheres in antipodal points p1 and p2.
Because the area of the two-spheres eventually goes to zero, the geodesics eventually meet
again at P, the point antipodal to O. They approach P from opposite directions D'1 and D'2;
hence a geodesic starting from O in the direction D1 and continuing without deviation will
arrive at P from the direction D'1, pass through P, and continue in the direction D'2,
eventually arriving back at O from the direction D2.
Fig. 7.5 The two-dimensional analogue of Fig. 7.4 (2-sphere model). Geodesics in opposite
directions from a point O on the 2-sphere meet again at P, the point antipodal to O. En route
they cut each circle centred at O in opposite points p1 and p2.
Fig. 7.6 The distances between galaxies scale with R(t): at $t = t_1$, the distances are
proportional to $R(t_1)$; at $t = t_2$, they are proportional to $R(t_2)$.
7.2 The evolution of the universe 271
infinity: 2 × ∞ = ∞). In the cases k = 0 and k = -1, the spatial sections are
infinite, without edge, and the expansion is simply a continual increase of distance
between every pair of galaxies in the universe. In the case k = +1, the
spatial sections are finite but again without edge, and the expansion is again an
increase of distance between every pair of galaxies. The second point is that the
expansion takes place isotropically and without a centre; every fundamental
observer sees every other fundamental galaxy to be receding from him equally in
all directions. The way this can take place has been described in the discussion of
the Milne universe (Section 4.3, see particularly Fig. 4.40b); in the present con-
text, it is because we can equally well choose any galaxy to lie at the origin of
coordinates, and the metric will still take the form (7.1).
There are three possibilities for this expansion allowed by the field equations,
depending on the value of the constant k (which determines the spatial curva-
ture). In the next section we describe the kinds of behaviour predicted by
Einstein's equations, and in the subsequent section we will see how observed
redshifts provide direct evidence that R(t) does indeed change.
Exercise 7.2
In the case k = +1, consider the two-dimensional space-section obtained from (7.2) by
setting $\theta = \tfrac{1}{2}\pi$, giving the metric form (7.4), where the fundamental observers are at
constant values of r and φ. Suppose now R(t) increases steadily from zero. Explain why we
can accurately model this situation by considering a (very strong) balloon with galaxies
painted on it, where the balloon is steadily growing larger and larger as it is blown up. Note
that there is no centre to the expansion depicted by this model; each galaxy recedes equally
from every other galaxy.
Evolving universes
Provided the energy density and pressure in the universe are positive, Einstein's
field equations uniquely imply that the universe must expand from an infinitely
compressed state, the `hot big bang', with the rate of expansion decreasing as the
universe ages. We consider first the initial expansion of the universe, and then its
behaviour at later times.
The early universe In all cases, at very early times (when the universe is filled
with radiation, cf. the discussion below) the evolution proceeds according to
where we have chosen the time coordinate t so that t = 0 at the origin of the
universe (where R = 0). This implies in particular that the universe begins by
expanding from that time, when the matter in the universe is indefinitely com-
pressed (because R(t) = 0 there) and the density and temperature are infinitely
large (Fig. 7.7). This is the `Hot Big Bang', a singular origin to the universe. As
far as classical physics can predict, the curvature of space-time is infinite there
and space, time, and even the laws of physics do not exist before: thus this is the
origin of the universe. We can only use the laws of physics to understand the
evolution of the universe after this creation event.
At very early times the physics involved in understanding the evolution of the
universe is ill understood, and our theories about what happens are speculative.
However, at times later than about 1 second after the expansion began, the
physics involved is reasonably well understood. The universe was filled with a
very hot interacting mixture of particles and radiation in equilibrium with each
other, that cooled as the universe expanded (the temperature T is proportional to
1/R, and was $10^9$ K at t = 1 second, see Fig. 7.7). As the temperature dropped,
element formation (nucleosynthesis) took place at about $10^8$ K, and then the
matter and radiation in the universe decoupled when the temperature was about
3000 K (the universe was opaque to electromagnetic radiation at earlier times
when electrons, freely moving between nuclei, scattered light strongly, but was
Fig. 7.7 The density ρ, temperature T and scale factor R of the universe plotted against
time t. At t = 0, the `Hot Big Bang' singularity corresponds to infinite density and
temperature in zero volume!
Fig. 7.8 Black-body radiation arriving along the past light cone. Before the decoupling
time td the universe was opaque, the redshift greater than 1000 and the temperature greater
than 3000 K. The temperature of the radiation is now 3 K.
transparent afterwards when the electrons were bound together with nuclei to
form atoms). The remnant radiation from this time is observed by us today as
black-body radiation at a temperature T of approximately 3 K, observed with
very sensitive radio receivers and infra-red-radiation detectors (see `The primeval
fireball' by P. Peebles and D. T. Wilkinson, Scientific American, June 1967).
Although it is very difficult to detect because of this low temperature, the dis-
covery of this radiation in 1965 was of great importance, because it is direct
evidence that there was a hot early stage in the universe, when R(t) was much less
than it is now. Further, this radiation provides direct evidence of conditions very
early on (at the time of decoupling, long before the existence of any stars or
galaxies; see Fig. 7.8). It is not possible by analysing any electromagnetic
radiation to obtain information about times earlier than the time of decoupling,
because the universe was opaque before then. Further, the isotropy of this
temperature (it is the same in all directions to an accuracy of 1 part in $10^4$) is the best
evidence we have for the uniformity of the universe at very early times; the very
small remnant anisotropy detected can be understood as due to our motion
relative to the fundamental velocity at our space-time position (see Section 3.1
above, and `The cosmic background radiation and the new aether drift' by
R. Muller, Scientific American, May 1978).
The physics involved in the early stages of the universe is very complex. A brief
summary, and references for further reading, is given in an Appendix to this
section.
The late universe The later behaviour of the universe differs according to
whether the spatial curvature is positive, zero, or negative (see Fig. 7.9a).
Assuming the cosmological constant, Λ, is zero, then if k = -1, the universe is a
low-density universe that easily expands forever; if k = 0, it is a high-density
universe that just manages to expand forever; if k = +1, it is a high-density
universe that expands to a maximum value of R(t) and then recollapses in the
future, ending at a second singularity similar to the initial singularity where it
began. If Λ > 0, then the universe will in many cases expand forever, even if
k = +1.
It is clearly of considerable interest to find out whether k = 0, +1, or -1, since
this determines not only whether the universe space-sections are finite or infinite
(as discussed above) but also whether the universe will expand forever or not if
Λ = 0. One attempts to determine the value of k by astronomical observations
of distant galaxies, using these to determine the behaviour of R(t) and hence to
infer the value of k.
Fig. 7.9 (a) The scale factor R(t) plotted against time t. For k = -1 and 0, it increases
indefinitely; for k = +1 it increases to a maximum and then decreases again to zero. (b)
The Hubble constant $H_0$ is the slope of the curve R(t) at the time $t_0$, and the deceleration
parameter $q_0$ its curvature then.
The basic parameters The basic parameters characterizing different universe
models are the Hubble constant
$$H_0 = \left[\frac{1}{R}\frac{dR}{dt}\right]_0$$
and the deceleration parameter
$$q_0 = \left[-\frac{1}{R H_0^2}\frac{d^2R}{dt^2}\right]_0,$$
where the subscript 0 means `evaluated at the time $t_0$'. The first characterizes the
rate of expansion of the universe (see the discussion of the Milne universe in
Section 4.3), and the second the rate at which the expansion of the universe is
slowing down (Fig. 7.9(b)). According to the Einstein field equations with
vanishing cosmological constant Λ, $q_0$ is directly proportional to the amount of
matter in the universe; if $q_0 > \tfrac{1}{2}$ we are in a high-density (k = +1) universe which
will recollapse, whereas if $q_0 < \tfrac{1}{2}$ we are in a low-density (k = -1) universe which
will expand forever. The critical case $q_0 = \tfrac{1}{2}$ (the Einstein-de Sitter universe) has
flat spatial sections (k = 0) and a simple form for R(t): in this case,
$$R(t) \propto (t - t_1)^{2/3} \tag{7.6}$$
where $t_1$ is a constant (this would be the time at which the expansion began, if
this expansion law held all the way back to the initial singularity; however, as we
have seen, that is not the case). Our present observations of the density of matter
in the universe suggest it is too low to cause a recollapse; the highest densities
suggested by direct observations correspond to $q_0 \approx 0.1$. Although observed
densities are less than those predicted in the critical-density case, in order to
understand the broad nature of the evolution of the universe it is common
practice to use (7.5) at early times and (7.6) at late times, matching the two
expressions for R at the critical time at which a transition took place from a
radiation-dominated to a matter-dominated universe. Similarly the expressions
for dR/dt must be matched at that time.
The Hubble constant gives an estimate of the age of the universe. In the case
$q_0 = 0$, we are in an empty universe with age $t_0 = 1/H_0$; this is just the Milne
model discussed in Section 4.3 (an empty universe with linear expansion*). If
$q_0 = \tfrac{1}{2}$ (the critical case), then $t_0 = \tfrac{2}{3}(1/H_0)$. Present estimates of $H_0$ imply that
$1/H_0$ is about $15 \times 10^9$ years. Combining this with present estimates of the ages of
stars in globular clusters (between 14 and $18 \times 10^9$ years) suggests that in a
high-density universe, the deduced ages of stars may be uncomfortably large compared
with the age of the universe. However, there is still considerable uncertainty in
the value of the Hubble constant, so arguments based on ages need to be treated
with caution. Additionally it now seems possible that Λ > 0, which will imply
larger ages for a given Hubble constant, solving the age problem, as is
discussed later.
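The definitions of $H_0$ and $q_0$ can be tried out on the two simple models just mentioned. The sketch below estimates both by finite differences for the Einstein-de Sitter law R ∝ t^(2/3) and the Milne law R ∝ t, taking each law to hold all the way back to t = 0 (an idealization, as noted above), and then converts the product $H_0 t_0$ into an age using a Hubble time of 15 x 10^9 years. The function names and the step size are illustrative choices.

```python
def hubble_and_deceleration(scale, t0, h=1e-4):
    """Central finite-difference estimates of H0 = R'/R and
    q0 = -R'' / (R * H0**2), both evaluated at the time t0."""
    r0 = scale(t0)
    r_dot = (scale(t0 + h) - scale(t0 - h)) / (2.0 * h)
    r_ddot = (scale(t0 + h) - 2.0 * r0 + scale(t0 - h)) / h ** 2
    h0 = r_dot / r0
    q0 = -r_ddot / (r0 * h0 ** 2)
    return h0, q0


def einstein_de_sitter(t):
    """Matter-dominated flat model, R proportional to t**(2/3)."""
    return t ** (2.0 / 3.0)


def milne(t):
    """Empty model with linear expansion, R proportional to t."""
    return t


def age_in_years(scale, hubble_time_years, t0=1.0):
    """Age t0 expressed in years, given 1/H0 in years.

    Assumes the expansion law holds from t = 0, so age = (H0 * t0) / H0.
    """
    h0, _ = hubble_and_deceleration(scale, t0)
    return h0 * t0 * hubble_time_years


if __name__ == "__main__":
    for name, model in (("Einstein-de Sitter", einstein_de_sitter), ("Milne", milne)):
        h0, q0 = hubble_and_deceleration(model, 1.0)
        age = age_in_years(model, 15e9)
        print(f"{name:20s} H0*t0 = {h0:.4f}  q0 = {q0:.4f}  age = {age:.3e} yr")
```

With 1/H_0 = 15 x 10^9 years the critical-density model comes out at about 10 x 10^9 years, which is younger than the quoted globular-cluster ages; this is the age problem referred to in the text.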
* More precisely, the four-dimensional Milne universe has metric form (7.1) with k = -1, R(t) = t,
and $q_0 = 0$. It is a flat space-time but with negatively-curved space sections.
gas instead of being bound together as atoms. The radiation will then be black-
body radiation at a temperature appropriate to the stage of evolution of the
universe. The free electrons interact strongly with all electromagnetic radiation.
This means that the universe is then opaque to light, radio waves, X-rays, etc.; as
in the interior of the Sun, a photon (that is, a particle of light) can proceed only a
very small distance before colliding with an electron and being scattered from it.
However, at later times (when the temperature of the universe drops to about
3000 K) the electrons and nuclei recombine to form atoms. The free electrons are
now closely bound to the nuclei, and so they no longer scatter light as they did at
earlier times, and the universe becomes transparent, with radiation mostly
moving freely between the atoms without interacting with them; thus the time of
recombination is also the time of decoupling of matter and radiation. The
radiation that was in equilibrium with the matter at early times thereafter remains
black-body radiation, with its temperature falling steadily as the universe
expands. As mentioned above, the solar system is bathed in the very dilute
remnants of this black-body radiation at the present time. For more details, see
e.g. the books by Weinberg, Sciama, or Harrison mentioned above or Chapter 3
of A Short History.
Exercise 7.3
Determine the relation between the Hubble constant and the age of the universe (a) in
the case of a matter-dominated universe (i.e. (7.6) holds), and (b) for a radiation-domi-
nated universe (i.e. (7.5) holds).
Redshift
It is easy to work out from the fundamental form (7.1) the paths of radial light
rays. On them, $d\theta = 0 = d\phi$ (as they are radial geodesics on which θ and φ are
constant) and $ds^2 = 0$ (expressing the fact that they are light rays). Then we see
from (7.1) that on these curves, $dr = dt/R(t)$ (taking both dr and dt positive for a
future-outgoing geodesic). Thus, if light is emitted by a galaxy O1 at r = 0 and time
$t = t_e$ and received by an observer O2 at r = u and time $t = t_0$ (Fig. 7.10), we find
$$u = \int_{t_e}^{t_0} \frac{dt}{R(t)}, \tag{7.7}$$
where the integral is taken from the time $t_e$ of emission of the light to the time $t_0$ of
its observation. Similarly, a light ray emitted by O1 a short time later at $t_e + \Delta t_e$
Fig. 7.10 Radial light rays are emitted at $t_e$ and $t_e + \Delta t_e$ by a source at r = 0, and received
at $t_0$ and $t_0 + \Delta t_0$ by an observer at r = u.
and received by O2 at $t_0 + \Delta t_0$ will obey relation (7.7) but now with the integral
taken from the time $t_e + \Delta t_e$ to the time $t_0 + \Delta t_0$.
Now, the crucial feature is that u is constant (because the fundamental
observers are at constant values of the coordinate r) so the right-hand side of (7.7)
has the same value on both light rays. We can therefore equate the two integrals.
If we now approximate these expressions, allowing for the fact that $\Delta t_0$ and $\Delta t_e$
are small and so R(t) is very nearly constant for the relevant interval, we find that
$\Delta t_0/R(t_0) = \Delta t_e/R(t_e)$. Hence, the ratio of time intervals observed is given by
$$K = \Delta t_0/\Delta t_e = R(t_0)/R(t_e) = 1 + z \tag{7.8}$$
(the last relation following from (3.3)). We have worked the result out for one
galaxy at the origin of coordinates, but the result applies to any galaxy pair
because of the homogeneity of the universe (the emitter can always be chosen as
the origin of the coordinates).
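The step from (7.7) to (7.8) can be checked numerically for a specific expansion law. The sketch below assumes, purely for illustration, the matter-dominated law R(t) = t^(2/3), for which the integral in (7.7) has the closed form 3(t_0^(1/3) - t_e^(1/3)). It sends two light rays a short interval apart, requires both to cover the same coordinate distance u, and compares the ratio of reception to emission intervals with R(t_0)/R(t_e). All function names are hypothetical.

```python
def scale_factor(t):
    """Assumed expansion law R(t) = t**(2/3) (illustrative choice)."""
    return t ** (2.0 / 3.0)


def light_integral(t):
    """Antiderivative of 1/R(t) for the assumed law: 3 * t**(1/3)."""
    return 3.0 * t ** (1.0 / 3.0)


def reception_time(t_emit, u):
    """Time at which light emitted at t_emit has covered coordinate distance u,
    found by solving light_integral(t) - light_integral(t_emit) = u for t."""
    return ((light_integral(t_emit) + u) / 3.0) ** 3


def k_factor(t_e, t_0, dt_e=1e-6):
    """Ratio of reception interval to emission interval for two rays
    emitted at t_e and t_e + dt_e towards an observer at fixed u."""
    u = light_integral(t_0) - light_integral(t_e)  # eqn (7.7)
    dt_0 = reception_time(t_e + dt_e, u) - t_0
    return dt_0 / dt_e


if __name__ == "__main__":
    t_e, t_0 = 0.125, 1.0
    print(f"K from the light rays   : {k_factor(t_e, t_0):.6f}")
    print(f"R(t_0) / R(t_e)         : {scale_factor(t_0) / scale_factor(t_e):.6f}")
```

The two numbers agree to within the size of the emission interval, as (7.8) requires; the same check could be made for any other R(t) by replacing the closed-form integral with a numerical one.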
This expression shows how observed redshifts directly measure the expansion
that has taken place in the universe; so by (7.8), redshifts directly measure the
ratio of the scale factor at the time of observation and the time of emission. Note
that in this case, the effect is entirely reciprocal; O2 would observe exactly the
same redshift as O1 for light emitted at $t_e$ and received at $t_0$. However, the value of
K will not stay constant for any particular pair of galaxies: rather, its variation
with time will reflect directly the dynamic expansion or contraction of the
universe. Thus, K is a function of $t_0$ (or of $t_e$). The Milne universe described in
Section 4.3 is an exact model of this situation.
The factor K is directly observed through measuring the redshifts in spectra of
distant galaxies (see e.g. The Realm of the Nebulae, by E. Hubble, Yale University
Press, 1936, reprinted 1982; `The redshift' by A. Sandage, Scientific American,
September 1956; and Fig. 3.4). At the time of writing, redshifts up to z = 5.34 for
galaxies have been measured, detected by light emitted when the universe was
about one-seventh of its present age. In the case of quasi-stellar objects, redshifts
of up to 5.0 have been measured, again corresponding to seeing these objects a
7.3 Observable quantities 279
very long time ago (about 7 x 109 years) when they were 6.0 times closer than at
present. In the case of the cosmic microwave background radiation, because the
radiation temperature varies as 1 /R(t) and its present temperature is 3 K, the
temperature of this radiation at the time corresponding to a redshift of z will be
T = 3 (1 + z) K. Thus the radiation we have detected, emitted by hot dense
matter in the early universe at a temperature of about 3000 K (when the universe
became transparent), was emitted at a redshift of about 1000 (Fig. 7.8). Because
R(t) --> 0 at the beginning of the universe (when t --> 0), the redshift of radiation
received from earlier and earlier times, if it could penetrate the intervening
matter, would diverge to infinity. Because of the opaqueness of intervening
matter, we cannot in fact receive electromagnetic radiation from extremely early
times, but we may one day be able to detect neutrinos emitted at a redshift of
about 10^9. If we were able to detect extremely weak gravitational waves, we could
in principle observe to much earlier times.
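The numbers quoted in this paragraph all follow from the single factor K = 1 + z = R(t_0)/R(t_e). The short Python sketch below simply evaluates that factor and the relation T = 3 (1 + z) K for the redshifts mentioned in the text; the function names are illustrative choices made here, not notation from the book.

```python
def stretch_factor(z):
    """Return K = 1 + z = R(t_0) / R(t_e) for a source at redshift z."""
    return 1.0 + z


def radiation_temperature(z, present_temperature_k=3.0):
    """Background radiation temperature in kelvin at redshift z.

    Uses T proportional to 1/R(t), with the present value taken as 3 K
    as in the text.
    """
    return present_temperature_k * stretch_factor(z)


if __name__ == "__main__":
    examples = [
        ("quasi-stellar object", 5.0),
        ("last scattering", 1000.0),
        ("neutrino emission", 1.0e9),
    ]
    for label, z in examples:
        print(f"{label}: z = {z:g}, "
              f"R(t_0)/R(t_e) = {stretch_factor(z):g}, "
              f"T = {radiation_temperature(z):.4g} K")
```

For z = 5.0 the factor is 6.0 (distances were 6.0 times smaller), and for z = 1000 the temperature comes out at about 3000 K, as stated above.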
Exercises
7.4 Suppose the light rays emitted at an interval Δt_e by O_1 were reflected from O_2 and
received again by O_1. What would be the interval Δt' measured by O_1 between their
reception? Contrast your result with that for the Schwarzschild solution (Section 6.1).
7.5 Use eqn (7.8) to confirm the result that λ scales as R(t). [Consider the relation
between the period and the wavelength of the light.]
Fig. 7.11 The relation between `area distance' r_0, which determines apparent angles
through equation (7.9a), and redshift z in a matter-dominated universe with flat spatial
sections (k = 0). There is a maximum of the area distance at z = 1.25; correspondingly
there will be a minimum in apparent angular diameters at this redshift.
This relation is plotted in Fig. 7.11. The striking feature here is that this quantity
has a maximum at z = 5/4, and thereafter decreases. Consider observing a series of
sources with sharply defined features that can be used to define angular diameters
(e.g. barred galaxies), and suppose that they are all of the same intrinsic size D.
Then, by (7.9a), in such a universe the apparent angular diameter of this set of
uniform objects will reach a minimum at redshift z = 5/4 and thereafter increase
(Fig. 5.26). This is precisely the situation indicated in Fig. 5.21, but it is true for
observations made in all directions, and at all times (since this behaviour is
independent of the value of t_0 or H_0). Thus, in these universes we have the
situation shown in Fig. 5.25b, where the entire past light cone of each observer
refocuses at z = 5/4.
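The position of this maximum can be checked numerically. The sketch below assumes that the area distance of (7.9b) is r_0 = R(t_e) f(u) with f(u) = u for k = 0, and uses the flat matter-dominated forms R(t) = a t^(2/3) and u = 3(t_0^(1/3) - t_e^(1/3))/a, which together give r_0 = 3 t_0 (1 + z)^(-1) [1 - (1 + z)^(-1/2)] in units with c = 1. The grid search and the function names are illustrative only.

```python
import math


def area_distance_eds(z, t0=1.0):
    """Area distance r_0 for k = 0 and R(t) = a t^(2/3), in units with c = 1.

    Assumes r_0 = R(t_e) * u, with 1 + z = R(t_0) / R(t_e).
    """
    x = 1.0 + z
    return 3.0 * t0 * (1.0 - 1.0 / math.sqrt(x)) / x


def argmax_on_grid(func, lo, hi, steps):
    """Return (z, func(z)) at the largest value of func on a uniform grid."""
    best_z, best_val = lo, func(lo)
    for i in range(1, steps + 1):
        z = lo + (hi - lo) * i / steps
        val = func(z)
        if val > best_val:
            best_z, best_val = z, val
    return best_z, best_val


z_peak, r_peak = argmax_on_grid(area_distance_eds, 0.0, 10.0, 100000)

if __name__ == "__main__":
    print(f"area distance peaks near z = {z_peak:.3f}, r_0 = {r_peak:.4f} t_0")
    # Apparent angular size of an object of fixed size D varies as D / r_0,
    # so it is smallest where r_0 is largest.
    for z in (0.5, 1.25, 3.0, 6.0):
        print(f"z = {z}: relative angular size = {1.0 / area_distance_eds(z):.3f}")
```

Setting the derivative of r_0 with respect to 1 + z to zero gives (1 + z)^(1/2) = 3/2, that is z = 5/4, which is what the grid search recovers.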
Examination of the equations involved shows that similar refocusing is
expected to occur in all expanding-universe models with metric form (7.1) that
contain normal matter (more precisely, matter with a positive energy density).
Unfortunately, this predicted behaviour is difficult to verify observationally,
because there is a great variation in the intrinsic size of galaxies and radio sources,
and because most galaxies do not have sharply defined outer edges (they fade away
into the night sky).
Exercise 7.6
Derive eqns (7.9) from (7.1). Derive eqn (7.10), and verify that r_0 has a maximum at z = 5/4.
Observed luminosities
Again, we can essentially follow the calculation previously given for the case of
flat space-time in Section 4.3. Consider a source of luminosity L, that is, a source
emitting radiation at rate L in all directions. We choose coordinates centred on
the source, i.e. the source is at r = 0. When we place a detector to receive this
radiation, that detector (say of area A) intersects a particular bundle of light rays
out of all such rays emanating from the source (Fig. 7.12a). If this bundle of rays is
characterized by angular displacements (dθ, dφ), the fraction of light L emitted in
these directions in unit time by the source is
\[ P = \frac{L}{4\pi} \sin\theta \, d\theta \, d\phi. \]
If we assume there is no absorbing medium in the way, all these photons reach the
receiver. Now three effects occur, which determine radiation intensity detected at
the receiver.
Firstly, at the detector (where t = t_0 and r = u), this radiation is spread over an
area A (Fig. 7.12b). One can easily relate this area to the angles dθ and dφ at the
source because the light rays are radial, i.e. θ and φ are constant on them. Thus,
the bundle of light rays will still be characterized by θ, dθ, φ and dφ at the detector.
Hence, from the metric form (7.1), the area A is given by the expression
A = R^2(t_0) f^2(u) sin θ dθ dφ. Because the photons are conserved, all the photons
emitted into the bundle of light rays will be received at the detector. Thus, the rate
at which photons are received by the detector per unit area will be proportional to
P and inversely proportional to the area A; taking the ratio, the rate of reception
of photons per unit area is inversely proportional to R^2(t_0) f^2(u).
Secondly, the energy per photon is proportional to its frequency ν which,
because of the redshift, is inversely proportional to 1 + z.
Fig. 7.12 (a) A bundle of light rays emitted by a source and received by a detector of area
A. (b) The relation between the area A of the bundle of light rays at the detector and the
solid angle at the source. For radial light rays with solid angle sin θ dθ dφ at the source, the
width of the beam in the θ-direction at the detector will be dl_1 = R(t_0) f(u) dθ; similarly
the width in the φ-direction will be dl_2 = R(t_0) f(u) sin θ dφ. The area at the detector is
then A = dl_1 dl_2.
Finally, because photons are conserved, the rate at which they are received
would be the same as that at which they are emitted, were it not for the Doppler
shift factor K = 1 + z. Because this factor relates all time intervals measured by
the source and by the observer, it relates in particular the time interval measured
by the source and the observer for transmission and reception of any particular
set of photons. Thus, the ratio of the rate at which they are received to the rate at
which they are emitted is inversely proportional to 1 + z (cf. eqn (7.8)).
When all these factors are put together, the radiation flux (the radiation
received per unit area per unit time) measured from the source is given by
\[ F = \frac{L}{4\pi (1 + z)^4 r_0^2} \qquad (7.11) \]
where r_0 is defined by (7.9b). Equations (7.9) and (7.7) enable us to calculate the
flux of radiation (or `apparent luminosity') of any source of known intrinsic
luminosity L at a redshift z, once R(t) is known (from the Einstein field equa-
tions). That is, it enables us to construct a theoretical redshift-luminosity relation
for each universe model. The nature of these curves is shown in Fig. 7.13, where,
as customary, the observed source flux has been re-expressed in terms of its
Fig. 7.13 Magnitude-redshift curves: log(cz), where z is the redshift, is plotted against m,
the apparent magnitude of the source, which is effectively the logarithm of the received flux
of radiation F.
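As a rough illustration of how a redshift-luminosity relation follows from (7.11), the Python sketch below evaluates the flux for the flat matter-dominated model. It assumes r_0 = (2c/H_0)(1 + z)^(-1)[1 - (1 + z)^(-1/2)] for that model (using t_0 = 2/(3 H_0)), measures distances in units of c/H_0, and uses the conventional definition m = -2.5 log10(F/F_ref) for apparent magnitude; the function names and the reference flux are assumptions made here for illustration.

```python
import math


def area_distance_eds(z, hubble_distance=1.0):
    """Area distance r_0 in a flat matter-dominated model (units of c/H_0)."""
    x = 1.0 + z
    return 2.0 * hubble_distance * (1.0 - 1.0 / math.sqrt(x)) / x


def flux(luminosity, z, r0):
    """Flux from eqn (7.11): F = L / (4 pi (1 + z)^4 r_0^2)."""
    return luminosity / (4.0 * math.pi * (1.0 + z) ** 4 * r0 ** 2)


def apparent_magnitude(flux_value, reference_flux):
    """Apparent magnitude relative to a chosen reference flux."""
    return -2.5 * math.log10(flux_value / reference_flux)


if __name__ == "__main__":
    luminosity = 1.0
    reference = flux(luminosity, 0.01, area_distance_eds(0.01))
    for z in (0.01, 0.1, 0.5, 1.0, 2.0):
        f_z = flux(luminosity, z, area_distance_eds(z))
        m_rel = apparent_magnitude(f_z, reference)
        print(f"z = {z}: log10(z) = {math.log10(z):+.2f}, "
              f"magnitude relative to z = 0.01: {m_rel:.2f}")
```

At small redshift the flux reduces to the Euclidean inverse-square value for a source at distance cz/H_0; the factor (1 + z)^4 and the behaviour of r_0 produce the departures from a straight line at larger z.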
Apparent brightness
Just as (4.34) and (4.35) led to the brightness relation (4.36b) in the case of the
Minkowski universe, so now (7.9a) and (7.11) again lead to the same brightness
relation (4.36b):
Exercises
7.7 Check the derivation of eqn (7.11), and derive eqn (7.12).
7.8 Number counts: Suppose there is a density n of objects at a coordinate distance r
from the observer in a FLRW universe model. Determine how many such objects one
would expect to see between distances r and r + dr within the range of angles dθ and dφ
about a direction (θ, φ).
[Hint: (i) Find the proper distance dl corresponding to dr at the distance r from the
observer. (ii) Find the area dA defined by the quantities dθ and dφ at this distance.
(iii) Hence find the proper volume dV corresponding to dr, dθ, and dφ, and so find the
number dN of such objects in the volume dV from the formula dN = n dV.]
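The three steps of the hint in Exercise 7.8 can be sketched in Python as follows. This is only one possible reading of the exercise: it assumes the metric form (7.1) gives a radial proper distance dl = R dr and transverse widths R f(r) dθ and R f(r) sin θ dφ (as in Fig. 7.12), and it assumes the usual choices f(r) = sin r, r, sinh r for k = +1, 0, -1. The scale factor R is passed in as a number, to be evaluated at whatever time the counting refers to.

```python
import math


def f_of_r(r, k):
    """Assumed form of f(r) in the metric (7.1) for curvature index k."""
    if k == 1:
        return math.sin(r)
    if k == 0:
        return r
    if k == -1:
        return math.sinh(r)
    raise ValueError("k must be +1, 0 or -1")


def number_in_cell(n, scale_factor, r, theta, dr, dtheta, dphi, k=0):
    """Expected number dN = n dV in a small coordinate cell."""
    # (i) proper radial distance corresponding to dr
    dl = scale_factor * dr
    # (ii) proper area subtended by dtheta and dphi at coordinate distance r
    d_area = (scale_factor * f_of_r(r, k)) ** 2 * math.sin(theta) * dtheta * dphi
    # (iii) proper volume and number of objects
    d_volume = dl * d_area
    return n * d_volume


if __name__ == "__main__":
    for k in (1, 0, -1):
        dn = number_in_cell(n=2.0, scale_factor=1.0, r=1.0, theta=math.pi / 2,
                            dr=0.1, dtheta=0.01, dphi=0.01, k=k)
        print(f"k = {k:+d}: dN = {dn:.3e}")
```

For the same coordinate cell the count is smallest for k = +1 and largest for k = -1, because sin r < r < sinh r for r > 0.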
Large-scale structure
First, because we have better distance indicators we have been able to identify
large-scale `walls' made of galaxies, surrounding much emptier voids, the whole
having something like a bubble structure. These have not been seen in the past
because we see images of all these objects projected against one another; to
separate them out we need careful distance estimates apart from redshift (see, for
example, A Short History for details of such measures).
7.4 New observational data 285
Fig. 7.14 The Hubble Deep Field was composed of 276 separate exposures taken by the
Hubble Space Telescope, so providing the deepest detailed study of primeval galaxies
available at the time it was taken (January, 1996). The faintest galaxies in this picture
formed just a billion years after the Big Bang. (Image reproduced by permission of NASA
and AURA/STScI.)
Given these distance estimates and redshift measurements, we can also identify
large-scale streaming motions of clusters of galaxies. These are presumably due
to the gravitational pull of large-scale inhomogeneities, and tentative
identifications have been made of the `great attractor' which is the cause of motion of the
whole supercluster of galaxies in which our own galaxy is situated. This causes the
observed dipole anisotropy in the CMB (discussed above in Section 3.1).
These structures and associated motions exist on scales up to something like
200 to 300 Mpc. On larger scales we seem to be reaching the statistical uniformity
that must underlie successful use of the (spatially homogeneous and isotropic)
RW universe models. It seems likely that the universe is indeed spatially homo-
geneous and isotropic on these large scales, up to at least a few thousand Mpc.
However, we cannot be sure this remains true on even larger scales, because
horizons prevent us from determining what structures exist at such large scales.
Dark matter
One of the major issues is that some forms of matter are hard to detect. If matter is
cold, it is not luminous (e.g. a rock or cold lump of coal is not easily detectable
from a distance) and so will not show up as a bright object in a telescope image. If
it is gaseous it may be detectable by creating absorption lines in more distant
objects, and study of such lines has indeed told us there are many dust clouds
between us and more distant objects, while lack of absorption troughs due to
neutral hydrogen spread out along the line of sight to distant quasi-stellar objects
enables us to put strong limits on the amount of such hydrogen in intergalactic
space. However, solid objects will not be visible in this way. Therefore the uni-
verse could be filled with `Jupiters' (massive planet-like objects) or numerous
small black holes that we have not detected.
Thus the estimates of the amount of matter in the universe that we have
obtained from galactic images (essentially, estimates of the number of stars we see
in distant galaxies and the average mass of these stars) may be a gross under-
estimate of the amount of matter in the universe. Evidence is accumulating that
this is indeed so: not only that much more matter is present than we see, so that
visible galaxies are like the foam on top of a wave (a small fraction of all there is,
floating on an unseen sea of dark matter) but also that much of this matter may be
`exotic', i.e. non-baryonic: made of massive neutrinos, photinos, quark nuggets,
and so on that do not easily interact with ordinary matter (and hence are hard to
detect). The main evidence for this dark matter comes from studies of the rotation
of galaxies, and of the motions of galaxies in clusters: the gravitational fields we
detect seem to indicate the presence of more matter than we see (see e.g. L. Krauss:
The Fifth Essence: the Search for Dark Matter in the Universe, Basic Books, 1989).
The key question is whether the density of matter in the universe is as high as
the critical density needed to cause a recollapse in the future: a `big crunch' (in
many ways similar to a time-reversed Big Bang) rather than a continual ever-
colder expansion (what used to be called a 'heat death', although that expression
seems a far better name for a recollapse than a continued expansion). The visible
matter is much less than is needed to cause such a recollapse. Thus a central issue
in present-day cosmology is determining how much dark matter there is in the
universe, and what it is made of. Evidence comes from a variety of directions, as
well as those already mentioned: from ages, lensing, nucleosynthesis studies,
structure formation studies, and CBR anisotropy measurements. Additionally
laboratory searches are under way to try to directly detect exceedingly elusive
dark matter particles, whose nature is not known but can be theorized. One thing
should be made quite clear here: the critical density is about 10^-29 gm/cc (as
opposed to the observed matter density of about 10^-31 gm/cc). But this is the
averaged matter density through all space: because most space is empty, this is an
extremely low value, much lower than the best `vacuum' we can attain on Earth.
Indeed the average density of matter on the earth is something like 10^31 times
greater! We live in a highly condensed region of matter in a (presently) very nearly
empty universe.
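The orders of magnitude quoted here can be reproduced with a few lines of Python. The sketch assumes the standard expression for the critical density, ρ_c = 3 H_0^2 / (8 π G), which is not derived in this section, and an illustrative value H_0 = 70 km/s/Mpc; the Earth's mean density of about 5.5 gm/cc is likewise an input supplied here rather than a figure from the text.

```python
import math

G_SI = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
METRES_PER_MPC = 3.0857e22
EARTH_MEAN_DENSITY = 5.5  # gm/cc, approximate


def critical_density_g_per_cc(h0_km_s_mpc):
    """Critical density 3 H_0^2 / (8 pi G), returned in gm/cc."""
    h0_si = h0_km_s_mpc * 1000.0 / METRES_PER_MPC   # convert to s^-1
    rho_si = 3.0 * h0_si ** 2 / (8.0 * math.pi * G_SI)  # kg / m^3
    return rho_si * 1.0e-3                           # 1 kg/m^3 = 1e-3 gm/cc


if __name__ == "__main__":
    rho_c = critical_density_g_per_cc(70.0)  # assumed H_0, for illustration
    observed = 1.0e-31                       # gm/cc, the value quoted in the text
    print(f"critical density  ~ {rho_c:.1e} gm/cc")
    print(f"Earth / observed  ~ {EARTH_MEAN_DENSITY / observed:.1e}")
```

With these inputs the critical density comes out close to 10^-29 gm/cc, and the Earth's mean density exceeds the quoted observed cosmic density by a factor of a few times 10^31.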
At present the weight of evidence seems to be that despite the presence of much
dark matter, so that what we see is between 1/10th and 1/30th of all the matter
present, the matter density present is less than the critical density, implying that
the universe will expand forever (see P. Coles and G. F. R. Ellis: Is The Universe
Open or Closed? The Density of Matter in the Universe, Cambridge University
Press, 1997, for a detailed discussion). If there is no cosmological constant, this
also means that the space sections are of constant negative curvature, which will
be infinite if the spatial geometry continues unchanged beyond the horizon, and
the space sections have their natural topology. However, it is possible that a
cosmological constant is also present (see below), which means that the expan-
sion is even faster than in a standard low-density model, so there would be no
chance of a recollapse. This also allows the possibility that the space sections
could be flat or even positively curved, despite the low matter density; it may be
rather difficult to tell which is the case.
Ages and evolution
One of the ongoing issues in cosmology has been that of the ages of stars and
galaxies as opposed to the age of the universe; we cannot believe universe models
where the ages of stars exceed the age of the universe (see p. 275 above). Indeed
the stars we see were formed since decoupling of matter and radiation, so irre-
spective of what happened in the earliest epochs before then, the timescale since
decoupling must be adequate for galaxy and star formation. This is critical for
cosmology: it is the one area where we could be confronted with a flat
contradiction with the standard models of cosmology that might force us to abandon
them (and indeed this is the reason that Hubble never fully embraced the
expanding universe models even at the end of his life in 1953; age estimates then
were indeed difficult to reconcile).
The history of cosmology includes a series of revisions of the age estimates for
the universe, as employment of new distance indicators has led to new estimates
of the Hubble constant, leading to belief in a larger and older universe, as well as
new estimates of the ages of star clusters and galaxies. The situation is
complicated by the fact that these estimates are not independent: indeed, a major recent
re-evaluation of distances and hence of the Hubble constant due to new parallax
measurements by the Hipparcos satellite (giving smaller estimates of H_0 and so
higher ages for the universe) also leads to lower estimates for the ages of stars
(because these are based inter alia on estimates of the distances of star clusters
where we study stellar evolution). These estimates suggest there is no age problem
even in a critical density universe. Other estimates of distances and ages disagree
as regards the probable value of the Hubble constant, suggesting a younger
universe where the problem might still be significant. But if we accept the
implication of recent supernovae observations that the universe is accelerating at
present, then there is not an age problem after all. However, this conclusion must
be treated with caution and continually reviewed: this is one of the areas where
the standard model of cosmology is vulnerable to disproof.
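To see why a smaller H_0 or an accelerating expansion eases the age problem, one can compare the age of a critical-density universe with that of a flat model containing a cosmological constant. The sketch below assumes two standard results not derived in this section: t_0 = (2/3)/H_0 for the flat matter-dominated model, and t_0 = [2 / (3 Ω_Λ^(1/2) H_0)] arcsinh[(Ω_Λ/Ω_m)^(1/2)] for a flat model with Ω_m + Ω_Λ = 1. The values of H_0 used are purely illustrative.

```python
import math

# 1/H_0 expressed in units of 10^9 years when H_0 = 1 km/s/Mpc
HUBBLE_TIME_GYR_PER_UNIT_H0 = 977.8


def age_factor_flat(omega_m):
    """Return H_0 * t_0 for a flat model with omega_m + omega_lambda = 1."""
    if not 0.0 < omega_m <= 1.0:
        raise ValueError("omega_m must lie in (0, 1]")
    if omega_m == 1.0:
        return 2.0 / 3.0  # critical-density, matter-dominated case
    omega_lambda = 1.0 - omega_m
    return (2.0 / (3.0 * math.sqrt(omega_lambda))
            * math.asinh(math.sqrt(omega_lambda / omega_m)))


def age_gyr(h0_km_s_mpc, omega_m):
    """Age of the model in units of 10^9 years."""
    return age_factor_flat(omega_m) * HUBBLE_TIME_GYR_PER_UNIT_H0 / h0_km_s_mpc


if __name__ == "__main__":
    for h0 in (50.0, 65.0, 80.0):
        print(f"H_0 = {h0:4.0f}: critical density {age_gyr(h0, 1.0):5.1f}, "
              f"Omega_m = 0.3 with Lambda {age_gyr(h0, 0.3):5.1f} (10^9 years)")
```

For a given H_0 the model with a cosmological constant is older by roughly 45 per cent, and lowering H_0 lengthens both ages in proportion.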
Supernovae as standard candles
Continuing work on the standard cosmological tests (the magnitude-redshift
relationship, number counts, or angular sizes versus distance indicators) always
ends up confounded by the unknown evolutionary history of the sources
(galaxies, radio sources, quasars) that we observe. Their observable properties
depend on that history, which we do not fully understand; hence we learn more
from these observations about that evolutionary history, than we do about the
universe itself.
The method that now shows the possibility of revolutionizing this situation is
observation of supernovae in distant galaxies. The point is that if we identify the
type of supernova, and observe its light curve carefully, we are able to estimate its
intrinsic luminosity, which should be largely independent of the history of the
galaxy in which it is situated, for it is controlled by local physics, independently of
that history. Thus a huge effort is at present going into searching for such
supernovae and then observing their light curves. This has the potential to
determine accurately both the Hubble constant and the deceleration parameter.
First indications from this programme are somewhat ambiguous, some
observations suggesting a higher Hubble constant and some a lower one. We
need to understand the different types of supernovae somewhat better in order to
clear this up. However, despite this uncertainty, what has been done so far
vindicates this approach as a way of testing the geometry of the universe, and
furthermore suggests that the universe is at present accelerating: the rate of
expansion is speeding up rather than slowing down. Assuming Einstein's
equations are correct on a cosmological scale, this implies presence of a negative
energy field, probably a cosmological constant, as mentioned above.
This needs confirmation: it presently depends mainly on observation of a
small number of most distant supernovae. If vindicated, this is an exciting
discovery: it eases the age problem, even for critical density universes, and shows
that the universe will expand forever, rather than re-collapsing in the future
(independent of the sign of spatial curvature). It also raises major issues for
fundamental physics: why is this field present? It will also have to tie in to other
cosmological observations, particularly number counts of very distant objects,
gravitational lensing observations, and CBR observations. The problem will be
to find a single model that fits all this data.
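The size of the effect being sought in the supernova data can be estimated with a short calculation. The sketch below is an illustration under stated assumptions, not the analysis used by the observers: it assumes a spatially flat model, the standard relation d_L = (1 + z)(c/H_0) ∫ dz'/E(z') with E(z) = [Ω_m (1 + z)^3 + Ω_Λ]^(1/2), and evaluates the integral by Simpson's rule. It then compares the apparent magnitude of a standard candle in a critical-density model with that in a model with Ω_m = 0.3 and Ω_Λ = 0.7.

```python
import math


def expansion_rate(z, omega_m, omega_lambda):
    """E(z) = H(z) / H_0 for a flat model with matter and a cosmological constant."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)


def luminosity_distance(z, omega_m, omega_lambda, steps=1000):
    """Luminosity distance in units of c/H_0 (flat model, Simpson's rule)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = z / steps
    total = (1.0 / expansion_rate(0.0, omega_m, omega_lambda)
             + 1.0 / expansion_rate(z, omega_m, omega_lambda))
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / expansion_rate(i * h, omega_m, omega_lambda)
    return (1.0 + z) * total * h / 3.0


def magnitude_difference(z, model_a, model_b):
    """Magnitude of a standard candle in model_a minus that in model_b."""
    d_a = luminosity_distance(z, *model_a)
    d_b = luminosity_distance(z, *model_b)
    return 5.0 * math.log10(d_a / d_b)


if __name__ == "__main__":
    with_lambda = (0.3, 0.7)   # (omega_m, omega_lambda), illustrative values
    critical = (1.0, 0.0)
    for z in (0.1, 0.5, 1.0):
        dm = magnitude_difference(z, with_lambda, critical)
        print(f"z = {z}: supernova fainter by {dm:.2f} mag with Lambda")
```

The difference at z = 0.5 is a few tenths of a magnitude, which indicates how accurately the intrinsic luminosities must be known for the test to discriminate between models.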
Gravitational lensing
The possibility of gravitational lensing arises because of the bending of light
caused by local gravitational fields (see pp. 213-17 above). Observations and
interpretation of both weak and strong gravitational lensing have now become a
fascinating and flourishing part of astronomy.
Strong lensing due to galaxies can lead to multiple images of a single object (see
p. 215), so much effort has gone into identifying multiple images of a quasar
lensed by an intervening galaxy. Particularly striking has been identification of
an Einstein cross: four images of a more distant object, seen near the core of an
intervening galaxy. From such multiple images, one can work out the probable
intervening mass distribution and compare it with optical, radio, IR, and X-ray
pictures of the same region and hence try to see what dark matter may be present
there.
Strong lensing due to dense galaxy clusters can lead not only to multiple images
but also to formation of apparent arcs: highly distorted and magnified images of
much more distant galaxies (Fig. 7.15). Redshift measurements confirm that
these arcs are indeed images of much more distant objects than the cluster with
which they are associated, and analysis of these images again enables us to
determine the mass distribution in the lensing cluster. Additionally, the
brightness of these lensed images may be much greater than would occur had they not
been lensed, enabling us to see (albeit in distorted form) galaxies at a far
greater distance than would otherwise be possible (Fig. 7.16). These studies
confirm that clusters contain ten times as much matter as is visible, distributed
fairly evenly round the cluster.
Lensing also occurs due to stars or massive planets in our own galaxy and the
Magellanic clouds (the two small satellite galaxies of our own galaxy). Here the
angular separation of the images is far less than can be resolved, so this is called
microlensing. If a further object passes behind a nearer one, the intensity of the
images fluctuates rapidly and that is what one can hope to detect. The probability
that the further object will pass close enough to the lensing object for such
fluctuations to be detectable is very low, but we are able to search systematically
for them by observing millions of stars for fluctuations. Such searches are at
present under way, and through them we hope to put limits on the amount of dark
matter hidden in massive planets, brown dwarfs, or little black holes. The
probability of such lensing is quite small, but we are able to survey 100 000 stars at
a time, making the search feasible.
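An order-of-magnitude estimate shows why the images cannot be resolved in microlensing. The sketch below assumes the standard point-mass expression for the Einstein angle, θ_E = [ (4GM/c^2) D_ls / (D_l D_s) ]^(1/2), which is not derived in this section, and illustrative values for the lens (one solar mass at 10 kpc) and the source (a star in the Large Magellanic Cloud at about 50 kpc). For such nearby objects D_ls is taken simply as D_s - D_l.

```python
import math

G_SI = 6.674e-11        # m^3 kg^-1 s^-2
C_SI = 2.998e8          # m / s
SOLAR_MASS = 1.989e30   # kg
METRES_PER_KPC = 3.0857e19
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def einstein_angle_arcsec(lens_mass_kg, d_lens_m, d_source_m):
    """Einstein angle of a point-mass lens, in arcseconds.

    Distances are treated as ordinary Euclidean distances, which is adequate
    within our galaxy and its satellites.
    """
    d_lens_source = d_source_m - d_lens_m
    theta_sq = (4.0 * G_SI * lens_mass_kg / C_SI ** 2
                * d_lens_source / (d_lens_m * d_source_m))
    return math.sqrt(theta_sq) * ARCSEC_PER_RADIAN


if __name__ == "__main__":
    d_lens = 10.0 * METRES_PER_KPC     # assumed lens distance
    d_source = 50.0 * METRES_PER_KPC   # assumed source distance
    for mass in (1.0, 0.1, 0.001):     # solar masses: star, brown dwarf, planet
        theta = einstein_angle_arcsec(mass * SOLAR_MASS, d_lens, d_source)
        print(f"M = {mass} solar masses: Einstein angle ~ {theta:.1e} arcsec")
```

Even for a solar-mass lens the angle is below a thousandth of an arcsecond, so only the change in total brightness as the source passes behind the lens is observable.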
Fig. 7.15 Gravitational lensing by the rich cluster Abell 2218 shows arcs that are
distorted images of galaxies five times further than the lensing cluster, some 50 times fainter
than can be detected with ground based telescopes. Study of such multiple images of
the same galaxy provides details of the mass distribution in the lensing galaxy. (Image
reproduced by permission of W. J. Couch.)
Fig. 7.16 A NASA Hubble Space Telescope image of the galaxy cluster CL1358 + 62 has
uncovered a gravitationally-lensed image of a more distant galaxy located far beyond the
cluster. The gravitationally-lensed image appears as a red crescent to the lower right of
centre. The galaxy's image is brightened, magnified, and smeared into an arc-shape by the
gravitational influence of the intervening galaxy cluster, which acts like a gigantic lens. The
bright spots are star-forming regions in the very distant galaxy. (Image reproduced by
permission of M. Franx.)
Finally, massive objects such as galaxies or clusters can cause systematic dis-
tortions in images of more distant objects. By carefully examining the statistics of
such distortions in a large number of objects, we can deduce the amount of
clustered matter causing this lensing. This is a growing field of study at present.
In the case of strong lensing (multiple images and arcs), what is happening from
a space-time viewpoint is that self-intersections and cusps are occurring in our
past light cone due to the bending of light rays by matter concentrations
(Fig. 7.17). Now our past light cone intersects a great many such concentra-
tions-1011 stars in our own galaxy, 1011 galaxies, hundreds of clusters-so the
real structure of our past light cone is fractal-like, containing a very large number
of such self-intersections and cusps on a variety of angular scales (Fig. 7.18). It is
these self-intersections of our past light cone we are detecting when we observe
multiple images of the same more distant object. These observations provide
strong confirmation of Einstein's vision that matter curves space-time and hence
causes apparent bending of light rays (which in fact are geodesics moving
undeviated in direction in a curved space-time).
Fig. 7.17 Distortion of our past light cone caused by gravitational lensing, resulting in
self-intersection of the past light cone (at P_2) and two cusps (at P_1 and P_-1), as seen here
where it intersects a spacelike surface Σ. The generating geodesic γ_1 passes through the
self-intersection and reaches the cusp P_1; geodesic γ_-1 passes through the self-intersection and
reaches the cusp P_-1. The central geodesic γ_L passes through the lens centre (not shown)
before reaching the critical point where the cusp points and self-intersection points all
meet. (Drawing courtesy of M. Carfora.)
Fig. 7.18 Impression of increasing degree of self-intersection of past light cone through
creation of many cusps as the wave front passes many matter concentrations. (Drawing
courtesy of B. Bassett.)
CBR
When matter concentrations occur they also lead to a variation in gravitational
potential, and hence in the redshift of light emitted (see pp. 244-7 above). In
particular this applies to the matter on the surface of last scattering that emitted
the 2.75 K black body CBR, which thus (by eqn (7.12)) will show slight variations
in the observed temperature as we scan the sky. Thus such small temperature
anisotropies serve to map density inhomogeneities at the time of last scattering,
which since then will have grown into galaxies and clusters of galaxies like those
we see nearby (we cannot see galaxies originating from the matter that emitted the
CBR precisely because we see that matter at the time it emitted that radiation,
which was long before galaxy formation occurred). Additionally, anisotropies in
the CBR will be caused by primordial gravitational radiation if its amplitude is
large enough (as is implied by inflationary theories, for example).
Detailed study of the process of formation of inhomogeneities gives us pre-
dictions of how the CBR anisotropy pattern should be correlated with the pattern
of matter inhomogeneities we observe. Many detailed models of this interaction
have been constructed, and intensive observational searches are under way for
the CBR anisotropies that remain after the dipole anisotropy due to our motion
(see Section 3.1 above) has been subtracted out. Such anisotropies have been
observed, and study of their angular correlation functions will support or dis-
prove various models of development of these relics of the inhomogeneities at last
scattering (Fig. 7.19). However, considerable work remains to be done here: at
present the (smaller scale) matter inhomogeneity spectra predicted by the
simplest models do not fit well with the larger scale inhomogeneities indicated by the
CBR anisotropy; part of the problem is that we do not know the bias parameter
(that is, the degree to which final inhomogeneities reflect the initial
inhomogeneities that were their seeds), nor what mixture of cold and hot dark matter may
be present. However, it is exciting that very recent measurements of these peaks
also support the view that the cosmological constant is positive.
[Figure 7.19: plot of power against ℓ, with θ_FWHM in degrees on the top axis. Legend: model
curves (Ω_m, Ω_Λ) = (0.3, 0.7), (0.3, 0.0), (1.0, 0.0); data from COBE, FIRS, Tenerife, IAC, SP,
BAM, Python, ARGO, IAB, MAX, Sask, MSAM, CAT, OVRO, PythV, Vip, QMAP, MAT.]
Fig. 7.19 The predicted and measured CBR power spectra. The vertical axis shows the
power in the angular fluctuations on various angular scales, characterized by the
parameter ℓ (corresponding angles are shown at the top). The power peaks at between
0.2 and 0.6 degrees, depending on the matter model. Observational points and error bars
from various surveys are shown. Present observations marginally support a model with
flat space sections, a matter density parameter Ω_m = 0.3 and a cosmological constant
parameter Ω_Λ = 0.7. (Figure reproduced by permission of C. Lineweaver.)
7.5 The light cone, observational limits, and horizons
Fig. 7.20 The reconvergence of the past light cone of an event at t = t_0; going back into
the past from t = t_0, the cross-sectional area of the light cone reaches a maximum at the
surface of reconvergence and then decreases to zero at the initial singularity. (Galaxy
world lines are also shown.)
In the previous section, we proved this result in the particular case of a
matter-dominated universe with k = 0. There we saw that the reconvergence can take
place relatively recently in cosmological terms (in that case, it happens at a
redshift of z = 5/4). Objects at higher redshifts will have anomalously high angular
diameters because of the `gravitational lensing' due to the matter in the universe.
As noted above, we have already seen galaxies and QSOs that are apparently
much further away (at redshifts z > 3).
Exercise 7.9
Determine if Fig. 7.20 shows the true shape of the past null cone (in terms of proper
distance and proper time) for the Einstein-de Sitter case where R(t) is given by
eqn (7.6).
To do this change to (non-comoving) proper distance spatial coordinates and then
determine the null cone equation in those coordinates. (For the solution, see G. F. R. Ellis
and T. Rothman: `Lost horizons'. American Journal of Physics 61 (10), 93, 1993).
The existence of an initial singularity
Clearly, our own past, i.e. the region of space-time from which our present
situation could have been influenced by causal events (cf. Section 1.2), is trapped
inside this light cone, whose cross-sectional area goes to zero as the initial
singularity is approached. This suggests that something drastic goes wrong with
ordinary physics at very early times in the universe. However, an important
question arises: the completely smooth and isotropic FLRW universe models we are
using (represented by the metric form (7.1)) are highly idealized. In the real universe
we observe many inhomogeneities and irregularities at the present time, that could
conceivably be indications of major anisotropies or inhomogeneities in the very early
universe (e.g. due to an overall `vorticity' or rotation of the universe); could these
have led to the avoidance of an initial singularity in universe models more realistic
than those we have considered up to now?
The key feature is that, even in such more realistic universe models, we still
expect a surface of refocusing to occur (because it is predicted to exist in recent
regions where deviations from the smoothed-out universe models are relatively
minor). A major piece of innovative mathematics by Roger Penrose and Stephen
Hawking showed that once refocusing has taken place, the fact that gravity exerts
an overall attractive effect on light guarantees the trapping of our causal past
inside a light cone whose area goes to zero in the distant past; and that
consequently we can be confident that an initial singularity will exist in our past, no
matter how anisotropic or inhomogeneous the early universe may be.* Thus,
general relativity theory predicts the existence of a breakdown of space-time
structure and known physics in the early universe in all realistic universe models.
To investigate this further, one must move to a full quantum theory of gravity
which has general relativity as a classical limit. The nature of such a theory is a
problem which is still far from being fully resolved.
* For an outline of this rather technical subject see `Singularities and horizons:
a review article' by F. Tipler, C. Clarke and G. Ellis, in General Relativity and
Gravitation, Volume 2, ed. A. Held, Plenum Press, 1980.
Particle horizons
Equation (7.7) with integration limits t_e and t_0 shows how far (in terms of the
radial coordinate r) an observer O at r = 0 can see at time t_0 when he looks back to
a redshift corresponding to a time t_e (Fig. 7.21a): all matter at r coordinate values
up to the value u is, in principle, visible to him. Consider a fixed value of t_0; then
the maximum distance u_max (in terms of the coordinate r) that can be seen to,
looking back to earlier and earlier times, is given by letting t_e → 0, where t = 0
corresponds to the beginning of the universe (R = 0 there). Examination of the
form of R(t) prescribed by Einstein's equations shows that u_max is finite; thus the
observer O can at any time only see a finite number of galaxies (namely, those
lying at r-values less than the value u_max). For example, in a flat
matter-dominated universe, (7.6) holds. For simplicity, setting t_1 = 0 and calling the constant
of proportionality a, we have R(t) = a t^(2/3). Then, (7.7) shows u = 3(t_0^(1/3) - t_e^(1/3))/a; as
t_e → 0, this goes to the limiting value u_max = 3 t_0^(1/3)/a.
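The finiteness of u_max can be checked directly. The Python sketch below assumes that (7.7) has the form u = ∫ dt/R(t) taken from t_e to t_0 (with c = 1), and compares a simple midpoint-rule evaluation of that integral for R(t) = a t^(2/3) with the closed form u = 3(t_0^(1/3) - t_e^(1/3))/a. The function names and the choice a = 1, t_0 = 1 are illustrative.

```python
def scale_factor(t, a=1.0):
    """R(t) = a t^(2/3) for the flat matter-dominated model."""
    return a * t ** (2.0 / 3.0)


def coordinate_distance_closed_form(t_e, t_0, a=1.0):
    """u = 3 (t_0^(1/3) - t_e^(1/3)) / a."""
    return 3.0 * (t_0 ** (1.0 / 3.0) - t_e ** (1.0 / 3.0)) / a


def coordinate_distance_numeric(t_e, t_0, a=1.0, steps=20000):
    """Midpoint-rule estimate of the integral of dt / R(t) from t_e to t_0."""
    h = (t_0 - t_e) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t_e + (i + 0.5) * h
        total += h / scale_factor(t_mid, a)
    return total


def particle_horizon(t_0, a=1.0):
    """Limiting value u_max = 3 t_0^(1/3) / a as t_e tends to 0."""
    return 3.0 * t_0 ** (1.0 / 3.0) / a


if __name__ == "__main__":
    t_0 = 1.0
    for t_e in (0.5, 0.1, 0.001):
        exact = coordinate_distance_closed_form(t_e, t_0)
        approx = coordinate_distance_numeric(t_e, t_0)
        print(f"t_e = {t_e}: u = {exact:.4f} (closed form), {approx:.4f} (numeric)")
    print(f"u_max = {particle_horizon(t_0):.4f}")
```

Although the integrand 1/R(t) diverges as t → 0, it does so slowly enough (as t^(-2/3)) that the integral converges, which is why u approaches the finite limit u_max rather than growing without bound.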
Fig. 7.21 (a) An observer at time t_0 can see matter up to a radial coordinate value u by
light emitted at time t_e or later. (b) Figure 7.20 drawn in new coordinates where all the light
cones are at 45° to the vertical (as in the flat space diagrams in Chapters 1 to 4). Galaxy
world-lines are vertical and the initial singularity is represented by a horizontal line rather
than a single point.
296 Simple cosmological models
The crucial feature is that this number is finite; however, there are an infinite
number of galaxies in a k = 0 or k = -1 universe with the standard topology
(there are galaxies for indefinitely large values of the coordinate r). Thus in these
cases the fraction of matter in the universe which we can have seen (or indeed,
with which we can have had any causal communication at all) is very small
(strictly, it is zero in an infinite universe). If k = +1, the fraction is finite but less
than one. Thus, a fundamental feature of the universe is a limitation on the
regions and the matter with which any observer can have had causal communication;
there are many galaxies we cannot ever hope to observe, no matter
what kind of detectors we use (as long as the laws of physics as we presently know
them hold, in that signals cannot locally travel faster than light). If we could wait
patiently for thousands of millions of years we would indeed then see some of the
galaxies that were previously hidden, but we would never see them all. At any later time
in a k = 0 or k = -1 universe, no matter how far in the future, there will still be
further galaxies an observer cannot see or be causally influenced by; indeed there
will be an infinity of such galaxies.
This feature is difficult to understand from an ordinary space-time diagram of
the universe because it all gets too squashed together as $R \to 0$ (see Fig. 7.20).
However, we can choose new coordinates in which this does not happen, which
make the causal properties much clearer. Specifically, we can choose coordinates
so that the light cones in the universe are at 45°, as was true in our standard
representation of flat space-time. In the case of a universe with k = 0, the spatial
coordinate is r and the time coordinate is w, called the conformal time, given by
$$ w = \int \frac{dt}{R(t)} . $$
When we choose such coordinates, the result is as illustrated in the conformal
diagram, Fig. 7.21b. The penalty for this clear representation of causal relations
is that the representation of spatial distances in the diagram will be badly dis-
torted near the initial singularity (the coordinate distances directly represented in
the diagram have to be re-scaled by multiplication by R(t) in order to scale like
measured spatial distances).
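As an illustration of why these coordinates put the light cones at 45°, the conformal time can be evaluated for the flat matter-dominated example used earlier. This is a sketch only: it assumes $R(t) = a\,t^{2/3}$, units with c = 1, a lower integration limit at t = 0, and a metric sign convention that may differ from the one used elsewhere in the book.

```latex
\begin{align*}
w(t) &= \int_0^{t} \frac{dt'}{R(t')} = \frac{3\,t^{1/3}}{a}, \\
ds^2 &= -dt^2 + R^2(t)\,dr^2 = R^2(t)\left(-dw^2 + dr^2\right)
      \qquad \text{(radial part only, } k = 0\text{)}, \\
ds^2 &= 0 \;\Longrightarrow\; dr = \pm\,dw .
\end{align*}
```

Radial light rays are therefore straight lines of slope ±1 in the (r, w) plane, and the initial singularity t = 0 becomes the horizontal line w = 0. The overall factor $R^2(t)$ is the re-scaling of coordinate distances mentioned above.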
Causal limitations Figure 7.21b shows very clearly the nature of the causal
limitations in these universe models. At the time $t_0$, the causal past of a galaxy G is
the interior of its past light cone $C^-$, bounded below by the initial singularity
where R = 0 (Fig. 7.22). Fundamental world-lines are vertical lines in this diagram.
Thus, the galaxies Q and R will have been observed by G; however, the
galaxy T will not have been observed (because its past history does not intersect
the past light cone $C^-$). The galaxy Q is precisely the limiting case: it is a `particle'
separating those that have been seen by G at the time $t_0$ from those that have not.
According to Rindler, we can define the set of all such galaxies separating those
seen from those not seen as G's particle horizon* (at time $t_0$). At a later time $t_1$,
observer G's particle horizon will have moved out. Thus, at $t_1$, the galaxies Q and
T are visible; the particle horizon has moved out to Q', and T' is still not observed.
So in principle as time progresses, new galaxies could be seen after crossing the
particle horizon of G (they are not visible at time $t_0$ but become observable at
some time $t_1$). The physical size of the particle horizon at the time $t_0$ (i.e. the
distance then to the particles lying at coordinate value $u = u_{\max}$) is
$D = R(t_0)\,u_{\max}$. Since $R(t_0) = a\,t_0^{2/3}$, we have $D = 3t_0$. In terms of the Hubble
constant $H_0$, this is $D = 2/H_0$. (This is a little inaccurate because the early universe
is radiation-dominated, so (7.5) holds at early times rather than (7.6);
however, the error will be small.)
Fig. 7.22 In the coordinates of Fig. 7.21b, it becomes clear that an observer on galaxy G
at $t = t_0$ can see galaxies R and Q but not T and Q'. At $t = t_1$, the galaxies T and Q' are now
visible but T' is not. Thus at any time, G has a particle horizon separating the visible
galaxies from those not seen. The diameter D of the particle horizon at the time $t = t_1$ is
indicated.
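To get a feel for the size of the estimate $D = 2/H_0$ obtained above for the flat matter-dominated case, the factor of c can be restored and the result expressed in everyday astronomical units. The sketch below is illustrative only: the function name and the sample values of the Hubble constant are assumptions, not figures taken from the text.

```python
# Particle-horizon size D = 2c/H0 for a flat, matter-dominated model.
# The text works in units with c = 1; the factor c is restored here.

C_KM_PER_S = 299_792.458         # speed of light in km/s
LIGHT_YEARS_PER_MPC = 3.2616e6   # approximate conversion factor


def particle_horizon_mpc(h0_km_s_mpc: float) -> float:
    """Return D = 2c/H0 in megaparsecs, for H0 given in km/s/Mpc."""
    if h0_km_s_mpc <= 0:
        raise ValueError("H0 must be positive")
    return 2.0 * C_KM_PER_S / h0_km_s_mpc


if __name__ == "__main__":
    # Sample Hubble constants, chosen only for illustration.
    for h0 in (50.0, 70.0, 100.0):
        d_mpc = particle_horizon_mpc(h0)
        d_gly = d_mpc * LIGHT_YEARS_PER_MPC / 1e9
        print(f"H0 = {h0:5.1f} km/s/Mpc -> D = {d_mpc:7.0f} Mpc "
              f"= {d_gly:4.1f} thousand million light years")
```

Because D scales as $1/H_0$, halving the assumed Hubble constant doubles the horizon size; the uncertainty in $H_0$ carries over directly to D.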
In what way would new galaxies become visible: would a galaxy flash into
view out of apparent nothingness? No, and not just because galaxies would not
have formed at very early times. For the purposes of the present argument,
suppose for the moment that ready-formed luminous objects were available for
observation at arbitrarily early times. The point is that, at the limit of the particle
horizon, the limit $R \to 0$ holds, but by (7.8) this is just the condition that z is
infinite. Thus, the horizon can also be regarded as the limiting surface where the
redshift of observed galaxies diverges. Hence, the source would be seen to emerge
gradually into view, with the intensity of received radiation steadily increasing as
the redshift decreases (the received radiation is zero when z is infinite, see (7.11)).
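The divergence of the redshift at the horizon can be made concrete with a small numerical sketch. It assumes that (7.8) is the usual relation $1 + z = R(t_0)/R(t_e)$ and uses the flat matter-dominated law $R \propto t^{2/3}$ of the earlier example, so that $1 + z = (t_0/t_e)^{2/3}$; the function name is hypothetical.

```python
def redshift(te_over_t0: float) -> float:
    """Redshift of light emitted at te and received at t0, for R ~ t^(2/3).

    Uses 1 + z = R(t0)/R(te) = (t0/te)^(2/3).
    """
    if not 0.0 < te_over_t0 <= 1.0:
        raise ValueError("te/t0 must lie in the interval (0, 1]")
    return te_over_t0 ** (-2.0 / 3.0) - 1.0


if __name__ == "__main__":
    # Emission times approaching the initial singularity te -> 0.
    for ratio in (1.0, 1e-1, 1e-3, 1e-6, 1e-9):
        print(f"te/t0 = {ratio:8.1e}   z = {redshift(ratio):12.4g}")
```

The redshift grows without bound as $t_e/t_0 \to 0$, which is the sense in which a source at the particle horizon is seen at infinite redshift and then brightens gradually as time goes on.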
It will be clear from this description how the existence of particle horizons
fundamentally limits the particles we are able to observe in such universes. The
situation is entirely unlike that in the Milne universes discussed in Section 4.3. In
that example, the situation is as shown in Fig. 4.42. There are no particle horizons:
every observer on every galaxy can see all the other galaxies in the universe
at any time. It was pointed out by Roger Penrose that this difference is essentially
due to the fact that in the FLRW case, the boundary of the universe (the initial
singularity) is space-like, and this will always imply the existence of particle
horizons (cf. Fig. 7.21b), whereas in the Milne case the boundary of the universe is
a light-like (null) surface, which implies a lack of particle horizons.
Physical implications The existence of particle horizons is of great interest for
both physical and philosophical reasons. Physically, it restricts the spatial
dimensions of regions that can have had causal communication with each other
at any time. This is a fundamental limitation affecting processes such as galaxy
formation, since it limits the size of causally interacting collections of particles
that could have formed galaxies or clusters of galaxies. Further, it makes the
observed isotropy of the microwave background radiation, i.e. the fact that we
see its temperature to be the same no matter in what direction we observe, into a
considerable puzzle: this isotropy apparently indicates that conditions were very
similar in regions that can have had no causal communication at all with each
other (Fig. 7.23).
This is a fundamental problem that can only be solved by either (1) having a
full-blown theory of creation which predicts that initial conditions must be
uniform even for causally disconnected regions; or (2) dropping some of the usual
physical assumptions, e.g. the usual equations of state or the field equations (as
happens e.g. in the `inflationary universe', where equations of state with unusual
properties occur because of quantum effects in the very early universe, see the
next section); or (3) assuming a different topology (i.e. global connectivity) of the
universe than in the standard theory, resulting in a `small' universe that we have
already seen around many times because the topology of the space-sections is not
the `natural' topology one normally assumes. This third possibility will be dis-
cussed briefly in the final section of this chapter.
Philosophical implications Philosophically, particle horizons show that on the
usual understanding of the FLRW universe models, we have seen only a small
fraction of all the matter in the universe. This sets major limitations on our ability
to determine observationally the structure of the space-time in which we live, and
to predict its future. Thus, in the normal situation, we cannot strictly predict that
the Moon will rise tomorrow (even assuming that the laws of physics will hold
unchanged), because we do not have all the data needed to make that prediction
Fig. 7.23 Events Q and R at the decoupling time are on the past light cone of P, but there
can be no causal connection between them because their causal pasts (bounded by the
initial singularity) do not intersect. It is therefore puzzling that the microwave background
radiation received from them has the same observed temperature now.
Fig. 7.24 The world-lines of the Earth and the Moon in a FLRW universe model. A source
G of gravitational radiation outside the Earth's past null cone at the event P (and therefore
invisible to the Earth at P) could destroy the Moon at event S. Therefore the prediction that
the Moon will rise tomorrow is based on assumptions without observational foundation.
(Fig. 7.24); for example, a gravitational wave from a source we have not yet seen
or had any causal contact with could destroy the Moon during the day and
invalidate this prediction. If there were no horizons, we would have seen all the
matter in the universe instead of having access to a sample which is zero per cent
of all the matter in the universe, as would be the case in the standard low-density
models. In a universe without particle horizons, our observational relation to
the universe, and in particular our ability to know with certainty what is in it
or what is likely to happen (on a cosmological scale) in the future, would be really
quite different from that in universes with particle horizons. A detailed analysis
of problems arising in determining the structure of the universe because of the
existence of horizons is given in the article `Cosmology and verifiability'
in Physical Sciences and the History of Physics, ed. R. S. Cohen and
M. W. Wartofsky, Reidel, 1984.
Exercise 7.10
Assuming the radiation-dominated expansion law (7.5) for times before $t_d$, calculate the
physical size D of the particle horizon at the time $t = t_d$ corresponding to decoupling
($1 + z = 1 + z_d = 1000$) in terms of the radius function $R(t_d)$ at the time of decoupling.
Now assume the matter-dominated expansion law (7.6) from the present time $t_0$ back to the time
of decoupling, and re-express your result in terms of the present-day Hubble constant $H_0$.
[To do so, you must match the expressions for $R$ and $\dot{R}$ at $t_d$.]
What is the present-day physical size D' that corresponds to D? (This will be the present-day
size of the largest region that could physically interact at the time of decoupling.)
Assuming k = 0, what would be the angular size measured for D by an observer at time $t_0$?
Explain briefly why this is the largest angular scale on which normal physical processes can
explain the observed isotropy of the microwave background radiation.
Visual horizons
Consideration of the real situation in the universe shows that actually the particle
horizon is further out than the limit of particles we can observe by any known
type of radiation. The realistic limit is given by the visual horizon, corresponding
to those particles we see at the time of last scattering (i.e. the decoupling of matter
and radiation). The point is that the universe was opaque to all electromagnetic
radiation at earlier times-the mean free path for photons was less than a few
centimetres. This changed dramatically at decoupling, when the electrons that
caused the strong interaction with radiation (by Thomson scattering) became
bound into atoms and so unable to interact strongly with photons any more. The
mean free path of radiation then changed to thousands of millions of light years,
the universe at present being astonishingly transparent.
Thus the furthest particles we can see in the universe are precisely those that
emitted the CBR. They form the visual horizon, separating the particles we can
have seen (at the present time) from those that are inaccessible to our observation
(unless we live in a `small universe', see Section 7.7 below). More distant particles
cannot be detected by light, ultraviolet or infra-red radiation, radio waves,
X-rays, or γ-rays, and we see the matter comprising this horizon only at the time
of last scattering (corresponding to a redshift of about 1100), when they emitted
that radiation (we are thus unable to see them at the present time, when they have
become galaxies). Using a conformal diagram similar to Figs 7.21-24, we can see
their history as the vertical lines which pass through the intersection of our past
light cone and the surface of last scattering (Fig. 7.25). This is the space-time
position of the visual horizon for observations we may make at time $t_0$; remember
that each point in this diagram is a 2-sphere, so this is actually a space-time
cylinder (the product of a 2-sphere and a line). The spatial position of the horizon
at the present time $t_0$ is given by the intersection of this cylinder with the surface of
Fig. 7.25 A conformal diagram of an FLRW universe, showing the distinction between
the visual horizon (vh) and the particle horizon (ph), at $t = t_0$. The visual horizon
corresponds to the vertical lines through P and Q, the intersection of the past light cone of
R at $t = t_0$ with the surface of last scattering, $t = t_d$. The particle horizon goes through the
intersection of that light cone with the initial singularity at t = 0.
7.6 Steady-state and inflationary universes 301
constant time $t = t_0$. This is a 2-sphere surrounding us, lying inside the corresponding
sphere representing the particle horizon. The matter between these two
spheres can in principle have been in causal contact with us but cannot be seen by
us. As in the case of the particle horizon, once a particle has entered the visual
horizon it cannot leave it, no matter what the behaviour of the radius function
R(t); once a particle is visible to us, it is visible at all later times.
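How far inside the particle horizon the visual horizon lies can be estimated in the flat matter-dominated example used earlier. There the coordinate distance seen out to redshift z is $u(z) = u_{\max}\,[1 - (1+z)^{-1/2}]$, which follows from $u = 3(t_0^{1/3} - t_e^{1/3})/a$ together with $1 + z = (t_0/t_e)^{2/3}$. The sketch below evaluates this fraction. It is a rough illustration only: it ignores the early radiation-dominated phase, and the function name is hypothetical.

```python
import math


def fraction_of_particle_horizon(z: float) -> float:
    """Return u(z)/u_max = 1 - (1 + z)^(-1/2) for k = 0 and R ~ t^(2/3)."""
    if z < 0:
        raise ValueError("redshift must be non-negative")
    return 1.0 - 1.0 / math.sqrt(1.0 + z)


if __name__ == "__main__":
    # z = 1100 is the last-scattering redshift quoted in the text.
    for z in (1.0, 10.0, 1100.0):
        frac = fraction_of_particle_horizon(z)
        print(f"z = {z:7.1f}   u/u_max = {frac:.4f}")
```

In this simplified model the visual horizon at a redshift of about 1100 sits at roughly 97 per cent of the coordinate distance to the particle horizon, so the shell of matter that could have influenced us but cannot be seen is comparatively thin in these coordinates.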
It is this horizon that delimits the matter we can hope to see in any detail. There
are corresponding further horizons for observations by neutrinos and gravita-
tional waves; they lie between the visual horizon and the (more distant) particle
horizon. These will be relevant to our observations when directionally sensitive
neutrino and gravitational wave telescopes are operational, which is still a way
off. No known type of possible observation will enable us to detect any details
of the matter outside the gravitational wave horizon, or a fortiori the particle
horizon.
Fig. 7.26 The inflationary universe situation. In causal coordinates, the initial singularity
is now much further back in the past than in the previous case (Fig. 7.17). Consequently,
the causal pasts of R and Q overlap to a considerable extent; thus, a common physical
cause can explain the similar conditions at R and Q, leading to the observed isotropy of the
cosmic microwave background radiation. However, there are still causal influences on R
that cannot have any effect on Q, and vice versa.
Secondly, while inflation moves the particle horizon out a very large distance, it
does not affect the visual horizon at all (for that is determined by the dynamics of
the universe since decoupling). Thirdly, the fact that in principle some causal
influence could have an effect at both Q and R does not show that in reality
effective processes would cause conditions there to be the same.
However, physical processes in inflation that could cause such smoothing have
been extensively investigated, as have further processes that would then generate
inhomogeneous seeds for structure growth at much later times. This has been an
exciting development, linking particle physics processes at these very early times,
particularly some of its fundamental ideas concerning symmetry breaking, to the
forms of large-scale structure developing in the universe after decoupling. In
brief, provided initial inhomogeneity and anisotropy are not too large, the
expansion due to inflation will smooth out the universe and quantum fluctu-
ations will then provide the needed seeds for structure to grow by gravitational
attraction at later times. Additionally, cosmological gravitational radiation will
be generated by these processes. These ideas and their development are described
in an interesting way in the book by Alan Guth called The Inflationary Universe:
The Quest for a New Theory of Cosmic Origins (Addison Wesley, 1997) and in
Chapter 4 of Silk's book. There is now a large variety of inflationary universe
models with these basic features, but with many different detailed mechanisms
and geometries proposed. They largely succeed in what they aim to do, for the
first time ever providing a mechanism whereby we can hope to explain the
spectrum of inhomogeneities in the present universe from the largest scale cosmic
structures down to galaxies. Furthermore these structure formation theories also
provide predictions of the resulting CBR anisotropies that should remain at the
present time.
These theories are not entirely compatible with observations: the amount of
structure at smaller scales, as measured by galaxy correlation functions, leads to
predictions of more structure at the largest scales than we presently determine
from the CBR anisotropies at those scales together with present structure for-
mation theories. However, one can avoid this conclusion by including more
components in the theory, a number of different inflationary fields for example.
Current work aims to obtain much more refined measurements of CBR anisotropies
to compare with the various theories which are being developed to fit the
detailed observations.
Some varieties of inflation ('chaotic inflation', particularly developed by
Andrei Linde) predict major inhomogeneities on super-horizon scales, with
many expanding spatially homogeneous and isotropic regions with different
parameters and properties growing out of earlier expanding regions, like a multi-
headed hydra (see Chapter 15 of Guth's book). We are then expected to be
located in one of these regions, looking like a standard FLRW model to us even
though it is very different beyond the visual horizon. However, it is clear that we
will never be able to verify observationally if this is so or not, for the other regions
where conditions are different (separated from us and each other by regions of
major inhomogeneity) are simply not accessible to our observational inspection.
Thus this intriguing idea will forever remain in the `unverifiable' category.
Overall this has been a very exciting proposal, extending the ideas of physical
cosmology to the limits of particle physics, and showing a possible influence of the
very small (microphysics) on the very large (cosmology) in a way that exemplifies
the underlying physics project of unifying quite different areas by means of a
single explanatory schema. It has not yet fully succeeded, for a number of reasons.
First, although the theory is fully framed in accord with
present-day particle physics ideas, the link is incomplete because there is no
specific proposal for physical identification of the inflationary field (the `inflaton');
no specific scalar field has been found in the laboratory that has the properties
needed to give an inflationary universe with the desired early universe behaviour.
Of course, should such properties be identified from the cosmology side, and then
laboratory experiments verify that a field exists with precisely the characteristics
thus determined, this would be one of the great achievements of physics; however,
this has not yet happened. Hence, it is at present an `in principle' proposal,
developed in a great variety of speculative ways, rather than a development of the
consequences of existence of an identified physical field.
Secondly, inflation predicts that most models will result in a critical density
universe today, just balanced between eternal expansion and recollapse, but as
discussed above, that does not seem to accord with current observations. To save
inflation one either has to move to inflationary models with lower present-day
matter densities, which is possible but is regarded as unaesthetic by many (the
universe has to be `fine-tuned' to attain this result), or introduce a cosmological
constant. The latter introduces a new fine-tuning problem that is presently
unresolved (why is this constant so close to zero yet non-zero?), but as mentioned
above may be indicated independently by supernovae-based observations of the
distance-redshift relation for distant galaxies and by CBR anisotropy mea-
surements. Either way, this evidence is awkward for standard inflationary theory.
The issues of the precise values of the density of matter in the universe and the
value of the cosmological constant remain to be resolved. If they do not indicate
7.7 Small universes 305
flat spatial sections, many will drop inflationary models, but others will continue
to construct them with different fields and parameter values, because the ideas
developed are so appealing from a physical cosmology and particle physics
viewpoint that they will not easily be discarded.
Fig. 7.27 A `small universe' formed from a section of a k = 0 (spatially flat) FLRW
universe model by identifying opposite sides of a rectangular block in a space-section
{t = constant}. This means e.g. that when an observer travelling in the z-direction reaches
the top face, he continues his journey up from the corresponding position in the
bottom face.
Fig. 7.28 The representation of one spatial dimension (and the time dimension) in a
`small universe'. All the vertical lines represent the same galaxy, and so are identified with
each other. Thus, the two diagrams are equivalent. The past light cone of the event P cuts
the galaxy world-line many times, so an observer will see many images of each galaxy. In
particular he will see many images of his own galaxy in the past.
distance L in either the x or the y direction one ends up at the point where one
began. Again `unwrapping' the universe model, we see this is observationally
equivalent to a situation where the basic unit cell and its contents are exactly
repeated indefinitely in all directions. Then looking around us in all directions up
to some limiting redshift z*, we will have seen the same material many times over.
The effect is almost identical to that which would be obtained by having a few
hundred model galaxies suspended in a box whose sides, bottom, and top were all
mirrors; there would be an appearance of galaxies stretching to infinity, since
each galaxy would create many images by multiple reflections (the effect is illustrated
in `The mathematics of three-dimensional manifolds' by W. P. Thurston
and J. R. Weeks, Scientific American, July 1984).
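The repetition of the unit cell described above can be turned into a simple count of images. The sketch below assumes a cubic cell of side `cell` with opposite faces identified (the text allows a general rectangular block), places the galaxy at the observer's own position for simplicity, and counts the copies lying within a given distance `radius`. The function name and both parameters are hypothetical; the two lengths only need to be in the same units.

```python
import math


def count_images(radius: float, cell: float) -> int:
    """Count copies of one galaxy within `radius` in a cubic small universe.

    Copies sit at the lattice points (i, j, k) * cell of the unwrapped
    model; the count includes the original galaxy at (0, 0, 0).
    """
    if radius < 0 or cell <= 0:
        raise ValueError("radius must be >= 0 and cell must be > 0")
    limit_sq = (radius / cell) ** 2        # squared radius in cell units
    n = math.floor(radius / cell)          # largest index worth checking
    count = 0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                if i * i + j * j + k * k <= limit_sq:
                    count += 1
    return count


if __name__ == "__main__":
    # Illustrative lengths only, in units of the cell size.
    for r in (0.5, 1.0, 1.5, 3.0, 10.0):
        print(f"radius = {r:4.1f} cells -> {count_images(r, 1.0)} images")
```

When the radius is much larger than the cell size, the count approaches the volume ratio $(4\pi/3)(\text{radius}/\text{cell})^3$, which is the estimate suggested by asking how many times the unit cell fits into the observed region.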
It is very difficult to prove that the real universe is not like this, with a relatively
small number of objects creating a very large number of images, because it would
not be easy to discern that all the images of one object in such a universe came
from the same object. This is because we would see it at different redshifts in the
different images, in different directions, subtending different apparent angles, at
different stages in its history, and apparently from different directions (e.g. we
might see images of a single spiral galaxy edge-on in some directions and from
above its plane in others).
An interesting new development is that it has been shown that in all small
universes, no matter how complex the spatial topology, the CBR anisotropy
pattern mentioned above will be characterized by the existence of circles in the
sky where the pattern of temperature fluctuations is identical. The new more
sensitive investigations of CBR at present under way will include a systematic
search for such circles in the CBR anisotropy patterns. If found they would show
that we live in a 'small universe' and would also determine the spatial connectivity
of the universe. This would then inter alia rule out the chaotic inflationary
models, for we would be able then to see all the matter in the universe and show
that the whole was described by a single FLRW domain.
There are several reasons for favouring such universe models. One is that,
unlike the usual case, there are no particle horizons at recent times in these uni-
verses (we have seen all the matter in the universe) and so the predictability
problem mentioned earlier does not arise-we can predict that the Moon will rise
tomorrow because in this case we do have sufficient data to predict the future (we
have seen all the sources that could interfere with the Moon's motion). Secondly,
as emphasized by Einstein and Wheeler, the problem of boundary conditions that
plagues many physical theories disappears, since in these cases there are no
spatial boundaries to the universe, so we do not have to determine the values of
physical fields at infinity before determining the behaviour of those fields. And
thirdly, in such models the apparent homogeneity and isotropy of the universe
are very neatly explained-the universe looks homogeneous and isotropic
because we are seeing the same region over and over again. This is the simplest
reason for apparent homogeneity one can imagine!
Exercises
7.11 For the steady-state universe, an exponential expansion law $R(t) = \exp(Ht)$, with
H constant, is valid for all times. In this case there is no initial singularity (remember that
Einstein's equations are not satisfied) so in evaluating the particle horizon we must consider
values of t from $-\infty$ to $t_0$; i.e. $u = \int_{-\infty}^{t_0} dt/R(t)$. Show that there is no particle
horizon.
Argue from this that if the universe has a power law expansion at late times but there is a
period $t_{II} < t < t_I$ when an exponential expansion takes place in the very early universe
($t_I < t_d$) then the present-day horizon scale will be much larger than in the standard model,
and could indeed exceed the size of the visible universe.
If you are feeling strong, repeat Exercise 7.9 but now where the expansion of the universe
is exponential for the time $\Delta t$ when $R(t)/R_0$ varies from $10^{-56}$ to $10^{-27}$ (i.e. $R = \mu \exp(\nu t)$,
where $\mu$ and $\nu$ are constants). Compare your results with those of Exercise 7.10.
7.12 Estimate the number of images visible per galaxy out to a redshift z = 1 in a
{k = 0} small-universe model where R(t) takes the matter-dominated form (7.1c), with
t1 = 0, and the identification length-scale is 400 Mpc. [Estimate how many times the basic
unit cell fits into the region of the universe observable out to a redshift of 1.]
The steady state and small universe models are examples of alternative models to
the standard FLRW models (whereas the inflationary universe model is a variant
of that model, rather than an alternative).
Alternative geometries
There are a series of other models based on alternative geometries that one can
use as universe models. In particular, one can look at spatially homogeneous but
7.8 Alternative universes 309
anisotropic universes. The properties of these models are the same everywhere, at
each cosmic time, as is the case in the FLRW universes; but unlike them these
models appear different in different directions; for example, the expansion rate
will be different as we look in different directions in the sky. By comparing the
predicted anisotropies in such models with astronomical observations, one can
put limits on the amount of anisotropy in the real universe.
Alternatively, one can construct spherically symmetric models where all
observations are isotropic about one, if one is at the centre (no direction is picked
out as different from any other), but physical properties are dependent on radial
distance from the centre as well as on time. It could be that we live in such a
universe, but it places us at the centre of that universe. This is a very unpopular
position nowadays from a philosophical viewpoint (unlike the situation in pre-
vious centuries when this was taken for granted!), but that does not prove the real
universe is not of this kind. If we ask for observational proof that the universe is
not of this nature, that proof is very difficult to provide. The reason is that as we
look out further away, we are also looking back in time (we observe on the past
null cone!), so the requirement is to distinguish evolution in time, in a spatially
homogeneous model, from evolution in space, in a spatially inhomogeneous
model. This is difficult to do because we do not understand source evolution
adequately, and we cannot distinguish these two cases observationally.
How then can we prove spatial homogeneity? The best argument is to rely on
the very near isotropy of the CBR. If we assume that a similar high degree of
isotropy holds everywhere else in the visible universe-a Copernican assumption,
expressing the idea that we are not at a privileged position in the universe-then it
follows that the universe is very like a FLRW model in this region. Near-isotropy
of the radiation everywhere in an expanding universe region implies near-spatial
homogeneity in this region. We cannot prove observationally that this is correct,
but it is a very plausible argument, and we have no evidence that contradicts it. So
we are reasonably safe in assuming a perturbed FLRW model is a good model of
the observed region of the universe. But the moral of the story is that you can only
test such arguments by investigating models with more general geometries than
the highly restricted FLRW geometry. Thus examining such models is useful to
us in limiting the degree of inhomogeneity and anisotropy of the real universe.
Alternative physics
One can also examine the effect of alternative physical theories on cosmology.
For example, it is possible that the gravitational constant is actually not a con-
stant, but rather varies with time (as suggested in Section 5.12); indeed this might
seem plausible in view of the fact that most other properties of the universe vary
with time. Hence, we can construct model universes with time-varying gravity
and see what observational effects result. This has been done, and it turns out that
the standard theory, based on Einstein's equations, is as good as we need; there
is no observational reason to modify it.
Where we do need a change of physics is in the very early universe, where
conditions are so extreme that quantum effects cannot be avoided, and indeed
some quantum form of gravitational theory must come into play. Thus some
cosmologists study quantum cosmology, the theoretical predictions as to what the
behaviour of the universe must have been like in these very extreme conditions at
very early times, corresponding to less than $10^{-43}$ seconds after the Big Bang in
the standard model! Because we do not know the correct theory of gravity at these
times, which may be some form of string theory or M-theory, we have to indulge
in (controlled) speculation to try to understand what happened then. One of the
most interesting ideas is described in the famous book A Brief History of Time by
Stephen Hawking (Bantam Press, London, 1988), namely that at the earliest
times, space-time no longer had one time dimension and three spatial ones, but
rather four spatial dimensions. At these early epochs, time no longer existed
(hence all our ordinary language becomes inadequate), but the space that
replaced `space-time', called an instanton, still had four dimensions. It then
follows that there was no longer necessarily a singularity in the space-time
structure: under these altered conditions, space-time can be regular everywhere
(apart from the very singular condition of having changed from space-time to
space-space!). Rather there is a smooth initial spatial domain, like the surface of a
sphere (but two dimensions higher), where there is no unique region we could call
a `beginning' (see Fig. 7.30). However, there is a beginning of `real' time, when a
change takes place to a classical situation where space and time are distinct from
each other, as described in the rest of this book. Recently, Hawking and Turok
have proposed a modification of these ideas designed to incorporate inflation
with $k \neq 0$. This allows the instanton to have some `mild' singularities.
It is difficult to test such ideas, because we cannot duplicate the required
conditions in a laboratory on Earth! So in examining this and other ideas of how
one might have `creation out of nothing' (see, for example, chapter 16 of Guth's
book), or the collapse of 10 effective spatial dimensions to 3 as might be expected
from string theory, one has to extrapolate greatly from testable conditions to the
unknown. The scientific skill is to know which of the measurable properties we
can determine in the laboratory are the ones hinting at the way things work under
much more extreme conditions. According to one's opinion on this, one obtains
different theories. Our main criteria then, in the face of a lack of observational
evidence, are their logical coherence, aesthetic appeal, and unifying relation to the
rest of physics. We may ultimately end up with several competing theories we are
unable to prove or disprove, because of the observational and experimental limits
on what we can determine.
Fig. 7.30 A schematic representation of the so-called `no boundary condition', with an
instanton changing to classical space-time.
Conclusion
Overall, the universe models described in this chapter are certainly too simple to
describe many detailed features of the real universe; but they provide an
appealing idealized view that enables us to understand many of its important
features, such as the hot Big Bang, the expansion of the universe, and the exist-
ence of horizons. In each case compatible with present observations, there is an
origin to the universe at a singularity where the known laws of physics break
down and space-time itself begins (rather similar to the end of space-time at the
singularity occurring at the end of gravitational collapse of a massive object).
Despite some valiant attempts, explanation of the `creation of the universe' is still
outside the scope of experimental science. The reader interested in philosophical
ideas related to this issue will find stimulating discussions in Harrison's book
mentioned earlier, and in The Anthropic Cosmological Principle by J. D. Barrow
and F. J. Tipler, Oxford University Press, 1986.
There are causal problems with the standard model, because of the existence of
particle horizons. The inflationary universe concept may be a way of overcoming
these problems, but is still a somewhat speculative proposal at present. The `small
universe' concept is a possibility which explains some features of the observed
Finale
The content of relativity theory is very surprising. It is an example of the hidden
nature of reality - namely, the true way things are is not at all obvious; but we
can find it out if we investigate carefully enough. Indeed relativity theory is a
classic case of the counter-intuitive nature of physics. The major such features
in fundamental physics are Newtonian theory, special relativity, quantum
theory, and general relativity (together with a feature of somewhat different
character: its capacity to create self-organizing structures). In each case it has
taken the best and most creative scientists to break out of the `obvious' mould and
determine the true pattern that underlies the regularities of nature (see Lewis
Wolpert: The Unnatural Nature of Science (Faber, London, 1992) for further
discussion).
Relativity theory is a prime example of such an unexpected (and initially
unwanted) theory. However, it has its own internal logic that is quite clear and
understandable, once its spirit has been understood. That is what we have tried to
present in this book. It is at one with the rest of physics in an overall scheme that
has stood the test of hundreds of thousands of experimental checks. Surprising
as it is, it appears to be the way things really are.
It is important to clear up one common misunderstanding about relativity
theory. It is believed by some that it is an example of what has become known as
relativism: namely because our view of the universe depends on our viewpoint (in
societal terms, our understanding depends on our culture; in relativity theory, our
observations depend on our frame of reference), nothing is fixed and so anything
goes. Nothing could be further from the truth. It will be clear from the discussion
in this book that the view taken is that there is indeed an invariant underlying
reality, namely space-time. Certainly one's view of it (i.e. what one measures)
depends on one's reference frame (and particularly on one's state of motion), but
there is nothing arbitrary about how this happens. On the contrary, the rigid laws
of tensor algebra that we have explored in this volume relate the viewpoints of
different observers; if you know what one observer measures, then you can cal-
culate precisely what any other one will measure. Thus insofar as relativity theory
supports a view of relativism in other branches of knowledge, it does so only in
this manner: given a description of what exists, one can deduce from that what
various observers will see or measure. The variety of possible viewpoints is
strongly related to the nature of the underlying reality, but any single view (or
even perhaps the whole set of available views) may not be sufficient to determine
fully the nature of that reality. A classic example here is the nature of the universe
beyond the horizon. We cannot determine observationally anything about it; this
does not mean it does not exist! To suppose that what exists is determined by what
we can know (in more formal terms, that ontology proceeds from epistemology)
is simply human hubris.
If you are familiar with the debates on science and social construction,
in particular the famous Sokal affair (see the fascinating web pages at
http://www.physics.nyu.edu/faculty/sokal/index.html), it will be clear we are
taking a strong position in this matter: despite what any philosophers, literary
critics or sociologists may say, the features we describe in this book are, in our
view, tested accurate descriptions of the way things really are, rather than socially
constructed theories that are no better than other theories of the universe. We do
not agree that the patterns of understanding we discuss here have been imposed
on the universe by the human mind. Rather, with most other working scientists
we believe that in spite of social pressures and the idiosyncrasies of individuals,
they have been found by us as accurate and unexpected, indeed surprising,
descriptions of reality. They underlie the existence of galaxies and the functioning
of the Sun, the expansion of the universe and the way physical and chemical
systems function. No group of scientists wanted them to be that way; indeed
most of these discoveries (the expanding universe, for example, and even
relativity theory itself) were resisted by scientists, who then eventually had to give
way to them in the face of inexorable evidence. These theories have been adopted
because they accurately describe a knowable external and invariant reality,
independent of the functioning of human society or language.
The counter-intuitive nature of relativity theory is clear. But this description
of the way things are has stood the searching examination of numerous experi-
mental tests. As a description of the way the physical universe functions, it is
reliable knowledge.
Afterword
The reader who has reached this far will now have a thorough understanding of
the concept of a space-time and of its use to represent the collapse of a massive
object to form a black hole, as well as of the causal limitations implied in the
standard models of an expanding universe. He/she will be able to carry out cal-
culations based on Bondi's K-calculus to determine the effect of relative motion
on times, lengths, simultaneity, and relative velocities in flat space-time; and will
be able to work out simple consequences of the conservation of relativistic
energy and momentum. He/she will have an understanding of the meaning of the
metric form of a space-time, and how to determine the properties of simple
curved space-times from it. If he/she studies the Appendices, he/she will learn
the basic notions of a four-vector and four-tensor in flat space-time, and in
particular how energy and momentum are united in a four-vector and the
electric and magnetic fields in a four-tensor. The topics covered and their relation
to each other are carefully detailed in the index to this book, and its use is
recommended to the reader wishing for guidance as to what topic is covered
where in the book.
Further reading
If we have succeeded in our task, the reader will be keen to learn more about
relativity theory and its uses in physics and astrophysics. Various references to
other books and articles have been scattered through the text, and the reader will
find all of them interesting. We conclude by giving some suggestions for further
reading, both at about the same level as the present book and at a more advanced
level.
There are many books that present special relativity from various viewpoints.
For the reader wishing to go into more detail at about the same level as the present
book, we suggest Special Relativity by A. P. French (Nelson, 1968), or Space-
Time Physics (Second edition) by E. F. Taylor and J. A. Wheeler (Freeman,
1992); these examine in detail the physical implications of special relativity (the
basics of which have been presented here, but not discussed at great length). For
further reading on general relativity at about the present level, we suggest
Eddington's classic book Space, Time and Gravitation (Harper Torchbooks,
1959); this was first published in 1920.
A book on general relativity which is close to ours in its aims and methods, but
more focused specifically on the properties of black holes, is Exploring Black
Holes: Introduction to General Relativity, also by E. F. Taylor and J. A. Wheeler
(to be published by Addison Wesley Longman, 2000).
$$I = \int_C f\,du.$$
Here, f is a function of the coordinates $(x^a)$ used to describe the space, $f = f(x^a)$,
and the path C (with parameter u) is also specified in terms of the coordinates:
$x^a = x^a(u)$. The integral is to be evaluated from the parameter value $u_P$ (corresponding
to the initial point P) to the parameter value $u_Q$ (corresponding to the
final point Q).
The concept
The basic idea here is the usual one for integrals. Imagine dividing the path C into
n equal steps, each of parameter length $\delta$ (so $\delta = (u_Q - u_P)/n$). We evaluate the
function f at the mid-point of each of these intervals (Fig. A.1). Define
$$S_n = \sum_{i=1}^{n} f_i\,\delta, \qquad \text{(A.1)}$$
where $f_i$ is the value of the function at the mid-point of the ith interval, and $\sum_{i=1}^{n}$
denotes summing all the terms from i = 1 to n. The quantity $S_n$ is simply a sum of
the contributions of the function f at the various points along the path, multiplied
by the length $\delta$ over which it makes each contribution. This approximates the
value we want to work out. We then allow the length $\delta$ of each step to become very
small while the number of steps n becomes very large; then the approximation
becomes more and more accurate. In the limit of very large n, the limiting value of
$S_n$ is what we mean by the integral I.
$$I = \lim_{n \to \infty} S_n. \qquad \text{(A.2)}$$
This quantity can be viewed geometrically in the following way: consider a plot of
the value of f along the path as a function of the parameter u, i.e. $f(x^a(u))$ (see
Fig. A.2a); recall that $(x^a) = (x^0, x^1, x^2, x^3)$, with the indices 0, 1, 2, 3, indicating
the various coordinates, not `powers of x'. Then rectangles of height $f_i$ and width
$\delta$ (Fig. A.2b) approximate this curve in a stepwise fashion; and $S_n$ is just the sum
of the areas of these rectangles. Taking the limit as n becomes indefinitely large
and $\delta$ indefinitely small, the value obtained is clearly the area under this curve
from $u_P$ to $u_Q$. By definition, this limit is also the integral I, so I is the area under
this curve.
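The limiting procedure just described can be tried out numerically. The following is a minimal sketch in Python, not a definitive implementation: the function name midpoint_sum and the choice of integrand (cos u between 0 and pi/2, whose integral is 1) are illustrative assumptions and do not come from the text.

```python
import math

def midpoint_sum(f, u_p, u_q, n):
    """Return S_n: the sum of f at the interval mid-points times the width delta."""
    delta = (u_q - u_p) / n
    # The mid-point of the ith interval is u_p + (i - 1/2) * delta
    return sum(f(u_p + (i - 0.5) * delta) for i in range(1, n + 1)) * delta

# Illustrative integrand (an assumption): f(u) = cos(u) from 0 to pi/2
for n in (4, 16, 64, 256):
    s_n = midpoint_sum(math.cos, 0.0, math.pi / 2, n)
    print(n, s_n, abs(s_n - 1.0))
```

As n increases, the printed difference from the exact area should shrink, which is the behaviour the definition (A.2) relies on.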
Applications
The simplest application is the length L of the curve C from an initial point P to a
final point Q; in this case, f is chosen to be ds/du, the rate of change of distance
along the curve with respect to the curve parameter u (at each point on the curve,
Fig. A.2 (a) A graph of the value of the function $f(x^a)$ along the curve, as a function of
the curve parameter u. (b) A stepwise approximation to the function by means of the
values $f_i$ at the mid-points of intervals of width $\delta$. The area under these rectangles is $S_n$;
as the number of intervals is increased and their width decreased, the limit of $S_n$ is the area
under the curve, which by definition is the integral I.
Fig. A.3 A path from P to Q specified by the parameter u. The distance along the path
between points with parameter values u and u + du, is ds.
Evaluation
In practice, line integrals are evaluated in various ways: (1) graphically,
(2) numerically, (3) analytically, or (4) from tables. In each of the first three cases,
one carries out the limiting procedure above, in case (1) by drawing a graph of
f (u) against u and estimating the area under this curve, in (2) by using a calculator
to work out the sum $S_n$ for larger and larger values of n, and in case (3) by using
standard techniques of integration which will be familiar to readers who have
studied calculus (some of these methods are illustrated in the examples and
exercises that follow, see particularly Exercise A.4). In the fourth case, one relies
on previous work someone else has done by one or other of methods (1)-(3),
presented in tabulated form. In the rest of this appendix, we will illustrate analytic
integration from first principles, thus illustrating and clarifying the concept
of integration. The exercises following illustrate integration by graphical and
numerical methods, as well as by use of simple formulae.
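$$L = \int f\,du = \int_0^{2\pi} r \sin\theta\,d\phi = \lim_{n \to \infty} \sum_{i=1}^{n} R_e \sin 60^\circ\,\frac{2\pi}{n} = 2\pi R_e \sin 60^\circ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} 1.$$
Since the sum consists of repeating the same term, namely 1 (n times), it cancels
the factor 1/n, so taking the limit is trivial and we obtain
$$L = \sqrt{3}\,\pi R_e.$$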
Fig. A.4 A point on the Earth's surface at latitude 30°N corresponds in spherical
polar coordinates to $r = R_e$ and $\theta = 60^\circ$. The length of that circle of latitude is found by
allowing $\phi$ to vary from 0 to 360°.
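The cancellation of n in this example can be seen directly in a short computation. This is only a sketch under stated assumptions: the function name latitude_circle_length is hypothetical, and the Earth's radius is set to 1 so that lengths come out in units of $R_e$.

```python
import math

def latitude_circle_length(radius, colatitude_deg, n):
    """Approximate the length of a circle of latitude by a sum over n steps in phi."""
    theta = math.radians(colatitude_deg)
    delta = 2 * math.pi / n                    # width of each step in phi
    # ds/dphi = r sin(theta) takes the same value at every mid-point
    return sum(radius * math.sin(theta) for _ in range(n)) * delta

R_e = 1.0  # assumption: lengths measured in units of the Earth's radius
for n in (1, 10, 100):
    print(n, latitude_circle_length(R_e, 60.0, n), math.sqrt(3) * math.pi * R_e)
```

Because the integrand is constant, every choice of n should give the same value up to rounding, so no limit needs to be taken in practice.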
$\int_{u=0}^{1} \sqrt{2}\,du = \sqrt{2}$. Thus the total length of the perimeter is $2 + \sqrt{2}$ (which one
can of course also work out more easily from elementary geometry).
Third example In both the above examples, the calculation has been very easy
because the quantity inside the integral has been a constant; one can simply take
any such constant outside the summation. As our final example we consider an
integral where this is not so.
Suppose a car travels along a straight road for a time T with a speed v pro-
portional to the time t (v = kt, with k constant), and that the force F exerted
by the engine during this motion is av (a constant). Then the work done during
the journey is given by $W = \int F\,dx$, where x is the distance travelled at time t.
Now dx/dt = v, so dx = v dt and
$$W = \int_{t=0}^{t=T} (av)(v\,dt) = ak^2 \int_{t=0}^{t=T} t^2\,dt.$$
The approximating sum for the remaining integral is
$$S_n = \sum_{i=1}^{n} t_i^2\,\delta,$$
where $t_i = (i - \tfrac{1}{2})T/n$ is the value of t at the mid-point of the ith interval, and
$\delta = T/n$ is the width of each interval (Fig. A.6). Thus
$$S_n = \sum_{i=1}^{n} \{(i - \tfrac{1}{2})T/n\}^2\,T/n = (T/n)^3 \sum_{i=1}^{n} (i^2 - i + \tfrac{1}{4}) = \frac{T^3 (4n^2 - 1)}{12 n^2},$$
using the standard sums $\sum_{i=1}^{n} i = n(n+1)/2$ and $\sum_{i=1}^{n} i^2 = n(n+1)(2n+1)/6$.
In the limit as $n \to \infty$, the $4n^2$ term dominates in the term $4n^2 - 1$ and the limit of
$S_n$ is $\tfrac{1}{3}T^3$, giving $W = \tfrac{1}{3} ak^2T^3$. We see that in this case, the amount of fuel used
by the car will depend on the cube of the travel time!
Fig. A.5 The triangle OPQ with vertices (0, 0), (1, 0) and (0, 1) in Cartesian
coordinates (x, y).
Fig. A.6 To work out $\int_{t=0}^{T} t^2\,dt$, the interval [0, T] is divided into n intervals of width
$\delta = T/n$. Then $t_i$ is the value of t at the mid-point of the ith interval; $t_i^2$ is the height of
the curve $f = t^2$ at that point. The area under the curve is the limit as $\delta \to 0$ of the area
under the rectangles.
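The sum in this third example can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the closed form $T^3(4n^2 - 1)/(12n^2)$ is the value obtained by carrying out the sum over $i^2 - i + \tfrac{1}{4}$ with the standard formulae for $\sum i$ and $\sum i^2$.

```python
from fractions import Fraction

def midpoint_sum_t_squared(T, n):
    """S_n for the integral of t^2 from 0 to T, using mid-points t_i = (i - 1/2) T / n."""
    delta = Fraction(T) / n
    return sum(((i - Fraction(1, 2)) * delta) ** 2 for i in range(1, n + 1)) * delta

def closed_form(T, n):
    """The same sum evaluated in closed form: T^3 (4 n^2 - 1) / (12 n^2)."""
    return Fraction(T) ** 3 * (4 * n * n - 1) / (12 * n * n)

for n in (1, 2, 10, 1000):
    s_n = midpoint_sum_t_squared(1, n)
    # The last column is the shortfall from the limiting value 1/3
    print(n, s_n, s_n == closed_form(1, n), float(Fraction(1, 3) - s_n))
```

The shortfall from $\tfrac{1}{3}T^3$ is $T^3/(12n^2)$, so it falls by a factor of four each time n is doubled.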
Exercises
A.1 Evaluate $\int_{t=0}^{1} t^2\,dt$ graphically, by the following procedure: (1) plot the curve
$f = t^2$ between the stated limits. (2) Choose a value for n, plot the positions $t_i$ of the
midpoints of the intervals of width $\delta$ between t = 0 and t = 1, and draw the rectangles of
height $f = t_i^2$ centred on the points $t_i$. (3) Calculate the total area $S_n$ of all these rectangles.
(4) Repeat for larger values of n. (5) Estimate the area A under the curve by counting the
graph paper squares and fractions of squares under the curve. (6) Verify that as n increases,
the area Sn tends to the area A.
[Note: you may find it useful to think of a series of rectangles which definitely have an
area greater than the area under the curve, and another series which definitely have an area
less than the area under the curve; then the area of the curve of necessity lies between the
areas of these two sets of rectangles. This sometimes provides a quick way of limiting the
range of possibilities for the area under the curve.]
A.2 Repeat for the cases f = 1 and f = t, to determine $\int_{t=0}^{1} 1\,dt$ and $\int_{t=0}^{1} t\,dt$
respectively.
A.3 Calculate $\int_{t=0}^{T} t\,dt$ from first principles, using the method of the third example in
the text above.
A.4 Suppose you have proved $\int_{t=0}^{T} 1\,dt = T$, $\int_{t=0}^{T} t\,dt = \tfrac{1}{2}T^2$, and
$\int_{t=0}^{T} t^2\,dt = \tfrac{1}{3}T^3$. What result does this suggest for $\int_{t=0}^{T} t^n\,dt$?
A.5 Consider a circle in the Euclidean plane, given in polar coordinates $\{r, \theta\}$ by
$\{r = r_0, 0 \le \theta \le 2\pi\}$. Prove from the interval (4.27) that (a) the radial distance D from the
origin to the circle is $D = r_0$, and (b) the circumference C of the circle is $2\pi r_0$. [You may find
the results quoted in Exercise A.4 helpful.]
A.6 Evaluate $\int x\,ds$ around the closed path in the Euclidean plane given by (a) y = 0,
$0 \le x \le 1$; (b) $x^2 + y^2 = 1$ between (1, 0) and $(\tfrac{1}{2}\sqrt{2}, \tfrac{1}{2}\sqrt{2})$; (c) x = y between $(\tfrac{1}{2}\sqrt{2}, \tfrac{1}{2}\sqrt{2})$
and (0, 0).
Computer Exercise 16
Write a program that accepts as input (a) lower and upper limits T0 and T1, (b) a number
N of divisions, and then by a procedure equivalent to steps (2) and (3) in Exercise A.1,
calculates the sum $S_N$ approximating the integral of a function F(T) between the limits T0
and T1 (the function F(T) is specified in a subroutine at the end of the program).
Use your program to verify the results obtained in Exercises A.1 and A.2.
Appendix B
Four-vectors and relativistic
dynamics
Throughout this book, great emphasis has been placed on the unification of the
concepts of space and time into a single entity called space-time. In this and the
following Appendix we describe the technical means by which separate three-
dimensional and one-dimensional quantities are combined into single four-
dimensional quantities called space-time vectors and tensors. In this appendix we
first examine the concept of a space-time position vector, and then generalize
from this to the idea of a general four-vector. We also consider how to construct
invariants from four-vectors.
Fig. B.2 The position vector S of a point P in four-dimensional space-time described by
coordinates (t, x/c, y/c, z/c).
where the new Minkowski coordinates $(X^{a'}) = (t', x^{i'}/c)$ are related to the old
ones $(X^a) = (t, x^i/c)$ by the Lorentz transformation (4.5a-c)*. This shows that
*The brackets in $(S^a)$ emphasize that we regard the components $S^0$, $S^1$, $S^2$, $S^3$ as grouped together
to form a single object. Once this is understood, the brackets can usually be omitted and when
convenient we will do this.
*We use $x^i$ or $x^j$ (with indices i, j = 1, 2, 3) to refer to x, y, z, whereas $X^a$ or $X^b$ (with indices
a, b = 0, 1, 2, 3) refer to t, x/c, y/c, z/c.
(B.3b)
Exercise
B.1 Show from (B.3b) that if $V = \tfrac{3}{5}$ and $(S^{a'}) = (\tfrac{11}{4}, \tfrac{3}{4}, 0, 5)$ then $(S^a) = (4, 3, 0, 5)$.
Check that this confirms that (B.3b) is the inverse of (B.3a).
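$$A^{0'} = \gamma(v)(A^0 - V A^1), \quad A^{1'} = \gamma(v)(A^1 - V A^0), \quad A^{2'} = A^2, \quad A^{3'} = A^3, \qquad \text{(B.4)}$$
when frame F' moves at speed v in the +x-direction relative to frame F. One can
conveniently write this in the form
$$A^{a'} = \sum_a L^{a'}{}_a A^a, \qquad \text{(B.5a)}$$
$$[L^{a'}{}_a] = \begin{pmatrix} \gamma & -V\gamma & 0 & 0 \\ -V\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \text{(B.6a)}$$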
where a labels columns, and a' labels rows, and the summation in (B.5a) is over all
values (0, 1, 2, 3) of the index a. Explicitly, (B.5a) is
$$A^{a'} = L^{a'}{}_0 A^0 + L^{a'}{}_1 A^1 + L^{a'}{}_2 A^2 + L^{a'}{}_3 A^3.$$
Setting a' = 0' and substituting from the first row of (B.6a) then gives the first of eqns (B.4).
Similarly, on setting a' equal to 1', 2', and 3', we successively obtain the other eqns
(B.4), and so demonstrate the equivalence of (B.5a), (B.6a), and (B.4).*
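The equivalence of the matrix form and the component form can also be seen in a short calculation. The following Python sketch assumes the matrix of (B.6a) with V = v/c; the function names boost_matrix_x and apply, and the sample components, are illustrative choices rather than anything defined in the text.

```python
import math

def boost_matrix_x(V):
    """Matrix [L^{a'}_a] of (B.6a) for a boost with V = v/c along the +x-direction."""
    g = 1.0 / math.sqrt(1.0 - V * V)
    return [[g, -V * g, 0.0, 0.0],
            [-V * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(L, A):
    """A^{a'} = sum over a of L^{a'}_a A^a; a' labels rows and a labels columns."""
    return [sum(L[row][col] * A[col] for col in range(4)) for row in range(4)]

V = 0.6
A = [4.0, 3.0, 0.0, 5.0]          # illustrative components (an assumption)
g = 1.0 / math.sqrt(1.0 - V * V)
print(apply(boost_matrix_x(V), A))                        # matrix form (B.5a)
print([g * (A[0] - V * A[1]), g * (A[1] - V * A[0]), A[2], A[3]])  # component form (B.4)
```

The two printed lists should agree to rounding error, since each row of the matrix simply encodes one of the four equations (B.4).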
The inverse transformation Just as we derived (B.5a) from (B.3a), we find
similarly from (B.3b) that the inverse transformation, giving the components $A^a$
in the original frame from the components $A^{a'}$ in the final frame, will be
$$A^a = \sum_{a'} (L^{-1})^a{}_{a'} A^{a'}, \qquad \text{(B.5b)}$$
$$[(L^{-1})^a{}_{a'}] = \begin{pmatrix} \gamma & V\gamma & 0 & 0 \\ V\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \text{(B.6b)}$$
where a labels rows and a' labels columns, and the summation in (B.5b) is over
all values of the index a'.
Exercises
B.2 Apply the transformation (B.5a) with $V = \tfrac{3}{5}$ to the vectors (a) $(A^a) = (1, 0, 0, 0)$;
(b) $(B^a) = (0, 1, 0, 0)$; (c) $(C^a) = (1, 1, 0, 0)$.
B.3 Write out the inverse transformation (B.5b) in a form similar to (B.4). Check
explicitly from this that (B.5b) is the inverse of (B.5a), i.e. that for any vector A, transforming
$(A^a)$ to $(A^{a'})$ and back again by successively using (B.5a) and then (B.5b), gives the
identity.
change of reference frame, provided the transformation matrices L and $L^{-1}$ are
chosen appropriately.
Exercise
B.4 Write out the forms of the matrices L and $L^{-1}$ for a change of velocity by v in the
+z-direction (i.e. explicitly determine the components (B.6a) and (B.6b)). Repeat Exercise
B.3 with this choice of transformation.
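[Figure: (a) a curve $X^a(u)$ in space-time, with points at parameter values u and u + du; (b) the four-velocity $U^a = dX^a/d\tau$ along the curve. Axes: t, X = x/c, Y = y/c.]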
$$(U^a) = (dX^a/dt)(dt/d\tau) = \gamma(v)(1, v_x/c, v_y/c, v_z/c) = \gamma(v)(1, \mathbf{v}/c). \qquad \text{(B.7b)}$$
Thus the spatial part of the four-velocity is just $\gamma(v)\mathbf{v}/c$, while the time part is
$\gamma(v)$, the time dilation factor (showing how coordinate time varies with proper
time). Suppose for example the particle moves in the +z direction at a speed $\tfrac{3}{5}c$.
Then $\mathbf{v}/c = (0, 0, \tfrac{3}{5})$ and $\gamma(v) = \tfrac{5}{4}$, so $(U^a) = \tfrac{5}{4}(1, 0, 0, \tfrac{3}{5}) = (\tfrac{5}{4}, 0, 0, \tfrac{3}{4})$.
If we write down the four-velocity in the rest frame F' of the particle, then v' = 0
so $\gamma(v') = 1$ and $(U^{a'}) = (1, 0, 0, 0)$. Thus in this case, as might be expected, the
four-velocity is purely along the time axis with no spatial components (a particle
is at rest in its own rest-frame!). Frame F' moves at speed v relative to frame F; if
we choose the x-axis to lie in the direction of v and then apply the transformation
relations (B.5b), we obtain the components of $(U^a)$ in frame F, relative to which
the particle moves at speed v in the x direction, from these rest-frame compo-
nents. Now allowing a general spatial rotation of axes, we regain (B.7b) for an
arbitrary reference frame. Thus the form of (B.7b) is essentially a consequence of
the transformation formula (B.5).
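The form (B.7b) is easy to evaluate for any three-velocity. The sketch below is illustrative only: the function name four_velocity is hypothetical, velocities are given as v/c, and the squared magnitude is computed with the sign convention (-, +, +, +), which is an assumption made here for the check.

```python
import math

def four_velocity(v_over_c):
    """(U^a) = gamma(v) * (1, v/c), following (B.7b); v_over_c is a 3-tuple."""
    speed_squared = sum(component ** 2 for component in v_over_c)
    gamma = 1.0 / math.sqrt(1.0 - speed_squared)
    return [gamma] + [gamma * component for component in v_over_c]

U = four_velocity((0.0, 0.0, 0.6))        # particle moving along +z at 3/5 of c
print(U)                                  # should be close to (5/4, 0, 0, 3/4)
# Squared magnitude with the assumed signature (-, +, +, +)
print(-U[0] ** 2 + sum(u ** 2 for u in U[1:]))

print(four_velocity((0.0, 0.0, 0.0)))     # rest frame: purely along the time axis
```

The time component is the time dilation factor, and the squared magnitude should come out close to -1 whatever the three-velocity.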
Exercises
B.5 Verify that (B.7b) is obtained by transforming from the particle rest frame F' to the
observer's frame F, in the particular case when the particle moves in the +x-direction
relative to the observer.
B.6 The four-velocity of a particle P is measured by observer O to be
$(U^a) = (\tfrac{5}{4}, \tfrac{3}{4}, 0, 0)$. What is the time dilation factor for P relative to O? What is the
three-velocity of P relative to O?
B.7 Use the transformation properties of the four-velocity vector to derive the rela-
tivity velocity addition law for parallel velocities (eqn (3.15)).
from our convention of defining $(X^a)$ in terms of X = x/c, etc., since we wish to
measure distance in terms of light travel times. An alternative convention,
adopted in many books, is to multiply the position vector, four-velocity, and
four-momentum by a factor of c. In this case the time component of the position
vector would be ct rather than t.
The definition of four-momentum shows that $(P^{a'}) = (m_0, 0, 0, 0)$ in the rest
frame F' of the particle, defined by v' = 0. Indeed (B.8a) can be obtained by
applying the transformation relations (B.5) to obtain the components $P^a$ in a
general frame from these rest-frame components. This transformation makes
explicit the fact that the relativistic three-momentum (that is, the spatial com-
ponents of the four-momentum) is nothing other than rest mass in relative
motion. Thus in relativity theory, rest mass and three-momentum are united in
one four-dimensional quantity and are related by the Lorentz transformation
(unlike in Newtonian theory, where they are defined independently); indeed,
it is the transformation rule (B.4) that determines the nature of relativistic
three-momentum.
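The statement that three-momentum is rest mass in relative motion can be illustrated numerically. This is a sketch under stated assumptions: the function name from_rest_frame is hypothetical, the motion is taken along the +x-direction, and the matrix used is the inverse boost (B.6b) applied to the rest-frame components $(m_0, 0, 0, 0)$.

```python
import math

def from_rest_frame(m0, V):
    """Components (P^a) in frame F of a particle with rest mass m0 moving at
    V = v/c along +x, from the inverse boost (B.6b) applied to (m0, 0, 0, 0)."""
    g = 1.0 / math.sqrt(1.0 - V * V)
    L_inv = [[g, V * g, 0.0, 0.0],
             [V * g, g, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0, 1.0]]
    P_rest = [m0, 0.0, 0.0, 0.0]
    return [sum(L_inv[a][b] * P_rest[b] for b in range(4)) for a in range(4)]

P = from_rest_frame(3.0, 0.6)   # illustrative values (an assumption)
print(P)                        # should be close to (3.75, 2.25, 0, 0)
print(P[1] / P[0])              # ratio of space to time component recovers V
```

The spatial component arises entirely from the transformation: in the rest frame it is zero, and in frame F it is $\gamma m_0 V$.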
Exercises
B.8 Obtain (B.8a) by transforming from the particle rest frame F' to the observer's
frame F, in the particular case when the particle moves in the +x-direction relative to the
observer.
B.9 The four-momentum of a particle P is measured by observer O to be
$(P^a) = (15, 12, 0, 0)$. What is the rest mass of P? What is the three-velocity of P relative to O?
where the sum is over all particles involved in the collision; then from (B.8a)
the first being the law of conservation of relativistic energy E, and the second the
law of conservation of relativistic three-momentum $\boldsymbol{\pi}$.
As discussed in the main text, these predictions have been tested in many
thousands of collisions (see eqns (3.33) and (3.40), and the discussion there). This
shows how in relativity theory the single four-dimensional relation (B.9), uniting
the laws of energy and momentum conservation, replaces separate laws of energy
and momentum conservation in Newtonian theory. Further, if we apply the
transformation laws (B.5) to (B.9) it becomes clear that what is regarded as
energy conservation in one frame is momentum conservation in another; these
are really different aspects of the same fundamental physical phenomenon.
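The mixing of energy and momentum conservation under a change of frame can be illustrated with a small numerical experiment. The collision below is hypothetical (two particles of unit rest mass approaching at V = 0.6 and merging into one body at rest), and the function names boost_x and total are assumptions introduced for the sketch.

```python
import math

def boost_x(P, V):
    """Components of the four-vector P in a frame moving at V = v/c along +x, as in (B.4)."""
    g = 1.0 / math.sqrt(1.0 - V * V)
    return [g * (P[0] - V * P[1]), g * (P[1] - V * P[0]), P[2], P[3]]

def total(vectors):
    """Component-by-component sum of a collection of four-vectors."""
    return [sum(components) for components in zip(*vectors)]

# Hypothetical collision in frame F: total four-momentum is (2.5, 0, 0, 0)
before = [[1.25, 0.75, 0.0, 0.0], [1.25, -0.75, 0.0, 0.0]]
after = [[2.5, 0.0, 0.0, 0.0]]

for V in (0.0, 0.5, -0.8):
    print(V, total(boost_x(P, V) for P in before), total(boost_x(P, V) for P in after))
```

In the moving frames the total three-momentum is no longer zero; its value is set by the total energy measured in F, so the momentum balance there holds only because the energy balance holds in F.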
Exercises
B.10 An observer O sees a particle move in the x-direction with energy E and relativistic
three-momentum $\boldsymbol{\pi}$, where $E = |\boldsymbol{\pi}|c$. Show that an observer O' moving in the x-direction
relative to O at speed v will also find this relation to be true, i.e. he will find
$E' = |\boldsymbol{\pi}'|c$. [This relation will be true in the case of a zero-rest-mass particle, e.g. a photon.]
B.11 Prove from the transformation properties of a sum of four-momenta that it is
impossible to have conservation of momentum without conservation of energy.
B.12 An observer sees a particle P1 of rest mass 4 approach from the left at a speed
$v = \tfrac{1}{2}c$ and collide with an identical particle P2 approaching from the right at speed $\tfrac{1}{2}c$.
Find the four-momentum of each particle, and the total initial four-momentum of both
particles. Verify that the total initial relativistic three-momentum is zero in this frame.
After collision, the particles each have rest mass M1. They move apart, P1 moving to the
left at v = c. What is the speed at which P2 moves to the right? What is M1?
B.13 An observer using a frame F measures the total four-momentum of a system of
particles to be $(P^a) = (M, \boldsymbol{\Pi}/c)$. Show that a unique effective rest mass $M_0$ and four-velocity
$(U^a)$ are defined by the relation $P^a = M_0 U^a$ (a = 0, ..., 3). What equations
determine $M_0$ and $(U^a)$ from M and $\boldsymbol{\Pi}$? [Write out $(U^a)$ in terms of the speed v relative to
the frame F, so determining a second set of expressions for the components $P^0$ and $P^i$,
equate these components and solve for $M_0$ and v in terms of M and $\boldsymbol{\Pi}$.]
On changing to the rest frame of an observer moving with the four-velocity $(U^a)$, the
new components of the four-momentum are $(P^{a'}) = (M', \boldsymbol{\Pi}'/c)$. Show that $M' = M_0$ and
$\boldsymbol{\Pi}' = 0$. [A frame F' satisfying the last condition is called a centre-of-mass frame. Collision
calculations are usually easiest performed in this frame.]
B.14 A proton collides with a second proton at rest. The outgoing particles are a
proton p, a neutron n, and a charged pion $\pi$. Given that (very approximately)
$M_p = M_n = 6m_\pi$, find the minimum energy needed by the moving particle to make this
reaction possible. [Hint: In the centre-of-mass frame, where the total three-momentum is
zero, the configuration with minimum energy is the one where all three produced particles
are at rest.]
The four force In Minkowski coordinates in flat space-time, the natural four-
dimensional generalization of Newton's force law F = dp/dt is
The spatial components of this equation are the relativistic force-law (3.37b), and
the time component is the rate of change of relativistic energy equation. Again we
see how in relativity theory a single four-dimensional equation unites equations
that were separate relations in Newtonian theory, and indeed shows that they are
different aspects of the same fundamental phenomenon. Applying (B.5) to (B.12)
shows that what is the equation for the rate of change of energy in one frame
contributes to an equation for the rate of change of momentum in another frame,
and vice versa.
Exercises
B.15 Prove the last statement by transforming (B.12a) from one frame to another.
(Examine the specific example in which the energy is constant in one frame.)
B.16 Suppose no force acts on a rocket, i.e. $f^a = 0$ (a = 0, ..., 3). Show that both its
rest mass and its three-velocity v are constant. Deduce that its relativistic three-momentum
and its four-velocity $(U^a)$ are constant.
Can you invert this relation, i.e. deduce from a constant four-velocity that no force acts
on the rocket? If not, what additional information would you need to make this deduction?
B.4: Invariants
Each component of a four-vector $(A^a)$ depends on the reference frame used;
however, the quantity
(B.13)
(B.14)
Exercises
B.17 Prove the invariance (B.14) under (B.5) by explicitly calculating the value of the
left-hand side from (B.13) and (B.5a).
B.18 Find the value of $A \cdot A$ in the cases (a) $(A^a) = (1, 0, 0, 0)$; (b) $(A^a) = (0, 1, 0, 0)$;
(c) $(A^a) = (1, 1, 0, 0)$; (d) $(A^a) = (5, 3, 2, 0)$.
$$m_0 = 0 \Leftrightarrow E^2 = \pi^2 c^2, \qquad \text{(B.16a)}$$
the equivalence following from (B.15). This will be true in all frames if true
in one, because it is an invariant relation, and enables us to replace E in (B.8) by
$\pi c$. Then we find
$$m_0 = 0 \Rightarrow (P^a) = (\pi/c, \boldsymbol{\pi}/c) \Rightarrow P \cdot P = 0, \qquad \text{(B.16b)}$$
showing that in this case P is a null vector, representing motion at the speed of
light. This is a satisfactory conclusion, because a photon is a zero-rest-mass
particle.
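That a null vector stays null in every frame can be checked in a few lines. In this sketch the function names are hypothetical, the photon is taken to travel along the y-direction with components (1, 0, 1, 0) in arbitrary units, and the sign convention (-, +, +, +) is an assumption; for a null vector the result does not depend on the overall sign.

```python
import math

def boost_x(A, V):
    """Components of the four-vector A in a frame moving at V = v/c along +x, as in (B.4)."""
    g = 1.0 / math.sqrt(1.0 - V * V)
    return [g * (A[0] - V * A[1]), g * (A[1] - V * A[0]), A[2], A[3]]

def minkowski_square(A):
    """A.A with the assumed signature (-, +, +, +)."""
    return -A[0] ** 2 + A[1] ** 2 + A[2] ** 2 + A[3] ** 2

photon = [1.0, 0.0, 1.0, 0.0]    # illustrative zero-rest-mass four-momentum
for V in (0.0, 0.6, -0.9):
    P = boost_x(photon, V)
    print(V, P, minkowski_square(P))
```

The individual components change from frame to frame (the energy rises and an x-component of momentum appears), but the invariant should remain zero to rounding error.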
Exercise
B.19 Explicitly verify from (B.13) and (B.7b) that the four-velocity $(U^a)$ of an
arbitrarily moving particle is time-like of magnitude 1.
-Mo, (2MO)2,
-(aMo) (4Mo), -M2o,
we obtain mo = 4 ,/15M0. Similarly, from
P1=P3-Pz,
using P2 P3 = - 4 MoE3 /cz, we find that
E3 = 3M0cz (v) Q,\/ 15)Mocz
Hence 'y(v) = 15 ,\/15 and v/c = a as before.
Exercises
B.20 Verify (B.15) and (B.16).
B.21 A photon is a particle of zero rest mass. Show that it is impossible for an isolated
free electron to emit or absorb a photon. [Use the energy and momentum conservation
laws for emission of a single photon by an electron.]
This can again be shown to be an invariant, and (B.17) is a special case of this
formula. A and B are orthogonal if and only if $A \cdot B = 0$.
Exercise
B.22 (a) Show the equivalence of (B.13) and (B.17) when Minkowski coordinates are
used. (b) Write out the scalar product $A \cdot B$ of two four-vectors when Minkowski coordinates
are used, and confirm that it is invariant under (B.5).
where $\delta^{a'}{}_{b'} = 1$ if a' = b' and is zero otherwise; these are the components of the unit
matrix,* which has the property that it transforms any vector into itself, i.e.
$$\sum_b \delta^a{}_b X^b = X^a. \qquad \text{(B.19)}$$
(This can easily be checked, e.g. setting a = 0, $\sum_b \delta^0{}_b X^b = \delta^0{}_0 X^0 + \delta^0{}_1 X^1 + \delta^0{}_2 X^2 + \delta^0{}_3 X^3 = X^0 + 0 + 0 + 0 = X^0$, etc.)
The fact that the transformation (B.5b) is the inverse of that in (B.5a) follows
from the inverse property (B.18). On substituting from (B.5b) into (B.5a), we find
that
$A^{a'} = \sum_a L^{a'}{}_a A^a = \sum_a L^{a'}{}_a \Bigl( \sum_{c'} (L^{-1})^a{}_{c'} A^{c'} \Bigr) = \sum_{c'} \Bigl( \sum_a L^{a'}{}_a (L^{-1})^a{}_{c'} \Bigr) A^{c'} = \sum_{c'} \delta^{a'}{}_{c'} A^{c'} = A^{a'},$
the second to last step following from (B.18), and the last from (B.19). Relation
(B.18) is true in particular for the forms of $L$ and $L^{-1}$ given in (B.6a) and (B.6b),
as can be checked directly; for example, setting $a' = 0'$ and $b' = 0'$, we find
$\sum_a L^{0'}{}_a (L^{-1})^a{}_{0'} = \gamma^2 - \gamma^2 V^2 = \gamma^2 (1 - V^2) = 1 = \delta^{0'}{}_{0'},$
as required. Similarly, we can check all the other components of (B.18). Thus,
(B.18) summarizes in a compact way that (B.6b) is inverse to (B.6a).
Exercise
B.23 Changing from Cartesian coordinates $(x^i) = (x, y, z)$ to polar coordinates
$(x^{i'}) = (\rho, \theta, z)$ in Euclidean space, the matrix $L$ for the transformation of vector
components is
$L = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -(1/\rho)\sin\theta & (1/\rho)\cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix},$
that is, for any vector A with components $A^i$ and $A^{i'}$, respectively, for $i = 1, 2, 3$
and $i' = 1', 2', 3'$,
$A^{i'} = \sum_i L^{i'}{}_i A^i.$
Find the explicit transformation giving $A^{i'}$ in terms of $A^i$, for each $i' = 1', 2', 3'$.
Determine the inverse matrix $L^{-1}$, and find explicitly the inverse transformation
giving $A^i$ in terms of the $A^{i'}$. In particular find the Cartesian components of the
vector fields with polar components $(X^{i'}) = (1, 0, 0)$; $(Y^{i'}) = (0, 1, 0)$;
$(Z^{i'}) = (0, 0, 1)$.
The first term on the right has the correct form (B.5a) for a four-vector but the
second term, if non-zero, does not. This term will only vanish if the quantities $L^{a'}{}_a$
are constant, which will not be true for a general transformation (but is true for a
Lorentz transformation in flat space-time). Consequently $(dP^a/d\tau)$ does not
transform as a four-vector in general. The same will be true for any derivative of a
four-vector; extra terms will have to be added to its definition to make it transform
as a vector under a general change of basis. The way this has to be done is
described by the tensor calculus, which we do not deal with here; see e.g. Tensor
Calculus by J. L. Synge and A. Schild (Dover, 1959) for a clear introduction to the
subject.
The metric tensor In the case of curved space-times, one can again write the
space-time scalar product of a vector A with itself in any reference frame as in
eqn (B.17). By (B.18), this quantity will be an invariant, no matter what change of
frame is made, provided the metric tensor components $g_{ab}$ transform as
$g_{a'b'} = \sum_{a,b} (L^{-1})^a{}_{a'} (L^{-1})^b{}_{b'}\, g_{ab}$ (B.20a)
when vectors transform as (B.5). Since the scalar product of a vector with itself
must indeed be invariant (it does not depend on the reference frame or coordinate
system used to calculate it), (B.20a) must be the way that the metric tensor
components transform.* This will in particular guarantee that the metric form
(5.5) is an invariant under arbitrary change of frame (and so whatever coordinates
are used in a curved space-time). As well as being valid in curved space-times,
(B.20a) is also how the metric tensor will transform under arbitrary
changes of coordinates in flat space-time (for a flat space-time is just a special
case of a curved space-time).
Since (B.20a) describes the change of the metric tensor under all changes of
coordinates, one might ask what is special about Lorentz transformations (such
as (B.5)), which represent the change from one set of Minkowski coordinates to
another? The answer is that it is precisely these transformations that preserve the
specific metric form (5.6b). Thus, for example, suppose the metric initially has this
form, and a transformation $L$ given by (B.6a) is made, with inverse $L^{-1}$ given by
(B.6b). The summation (B.20a) with the metric tensor $[g_{ab}]$ in (5.6b) gives
$g_{a'b'} = -(L^{-1})^0{}_{a'}(L^{-1})^0{}_{b'} + (L^{-1})^1{}_{a'}(L^{-1})^1{}_{b'} + (L^{-1})^2{}_{a'}(L^{-1})^2{}_{b'} + (L^{-1})^3{}_{a'}(L^{-1})^3{}_{b'}.$
Then, for example, setting $a' = 0'$, $b' = 0'$, and using (B.6b), we find
$g_{0'0'} = -(L^{-1})^0{}_{0'}(L^{-1})^0{}_{0'} + (L^{-1})^1{}_{0'}(L^{-1})^1{}_{0'} + (L^{-1})^2{}_{0'}(L^{-1})^2{}_{0'} + (L^{-1})^3{}_{0'}(L^{-1})^3{}_{0'}$
$= -\gamma^2 + (V\gamma)^2 + 0 + 0 = \gamma^2(V^2 - 1) = -1,$
showing that this component has retained its value according to (5.6b). Similarly
each component retains its form. This then becomes a new way of defining a
Lorentz transformation: it is a transformation that preserves the form of the
metric tensor components. A general transformation in flat or curved space-time
will not do this.
*Often vector components are given relative to the natural bases defined by the coordinates used.
Then $L$ has a special form: it is the Jacobian matrix of partial derivatives, i.e. if new coordinates are
given by $x^{a'} = x^{a'}(x^a)$, then $L^{a'}{}_a = \partial x^{a'}/\partial x^a$.
*In matrix form, (B.20a) is $g' = L^{-1} g (L^{-1})^T$, where $T$ denotes the transpose.
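The defining property just described can be tried out numerically. The sketch below applies the transformation rule (B.20a) to the Minkowski metric twice: once with an inverse boost (assumed to have the standard form with $V = v/c$, since (B.6b) is not restated here) and once with a simple stretching of the x-axis, which is not a Lorentz transformation. The names `transform_metric`, `Linv_boost` and `Linv_stretch` are hypothetical, and each matrix is stored with the unprimed index as the row.

```python
import math

def transform_metric(g, Linv):
    """(B.20a): g'[a'][b'] = sum over a, b of Linv[a][a'] * Linv[b][b'] * g[a][b]."""
    n = len(g)
    return [[sum(Linv[a][ap] * Linv[b][bp] * g[a][b]
                 for a in range(n) for b in range(n))
             for bp in range(n)]
            for ap in range(n)]

# Minkowski metric in the canonical form diag(-1, 1, 1, 1).
eta = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

V = 0.6
gamma = 1.0 / math.sqrt(1.0 - V * V)

# Assumed inverse boost along x: rows are a, columns are a'.
Linv_boost = [[gamma, gamma * V, 0.0, 0.0],
              [gamma * V, gamma, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]]

# A non-Lorentz change of frame for comparison: x' = 2x, so Linv = diag(1, 1/2, 1, 1).
Linv_stretch = [[1.0, 0.0, 0.0, 0.0],
                [0.0, 0.5, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]]

g_boost = transform_metric(eta, Linv_boost)      # should reproduce eta
g_stretch = transform_metric(eta, Linv_stretch)  # should not
print(g_boost[0][0], g_boost[0][1], g_boost[1][1], g_stretch[1][1])
```

The boost returns the canonical components up to rounding, while the stretch changes the (1, 1) component to 1/4.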
Exercises
B.24 Verify that if the metric transforms according to (B.20a) and vectors according to
(B.5a), then the scalar product (B.17) is an invariant provided $L^{-1}$ is defined by (B.18).
B.25 Determine the new components $g_{i'j'}$ of the metric tensor when a change is made
from Cartesian coordinates $(x^i)$ to polar coordinates $(x^{i'})$ by using (B.20a) with the
transformation matrix as in Exercise B.23. Explicitly verify that $X \cdot X$, $X \cdot Y$, and $Y \cdot Y$ are
invariant, where these vectors are defined as in Exercise B.23.
B.26 The quantity $(W_a)$ is defined from a vector $(X^b)$ by the relation $W_a = \sum_b g_{ab} X^b$.
Show that the transformation properties of $(W_a)$ are given by
The summation convention Finally, we note that in each case where a summation
over indices occurs (see (B.5), (B.17), (B.18), (B.19), (B.20a)), each summed
index occurs precisely twice, once up and once down. This feature means that we
can tell which indices are to be summed with which other index by simply noting
that they occur in such repeated pairs (one up, one down). Therefore, we can save
much writing by using a simplified notation: we can omit the summation signs, it
being understood that summation is implied whenever indices occur as a repeated
pair. Thus for example we can write (B.20a) in the form
$g_{a'b'} = (L^{-1})^a{}_{a'} (L^{-1})^b{}_{b'}\, g_{ab},$ (B.20b)
the summation over a and b being implied because they are each repeated indices
(one up, one down). This is known as Einstein's summation convention. As a
further example, the scalar product $A \cdot B$ of two vectors is
$A \cdot B = g_{ab} A^a B^b.$ (B.21)
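In a program the summation convention simply becomes one loop per repeated index. The following Python sketch writes out (B.21) in this way for the Minkowski metric; the function name `scalar_product` and the sample vectors are hypothetical.

```python
def scalar_product(g, A, B):
    """A.B = g_ab A^a B^b, with the implied sums over a and b made explicit."""
    total = 0
    for a in range(4):       # sum over the repeated index a
        for b in range(4):   # sum over the repeated index b
            total += g[a][b] * A[a] * B[b]
    return total

eta = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

A = [1, 1, 0, 0]
B = [2, 0, 1, 0]
print(scalar_product(eta, A, B))   # prints -2
print(scalar_product(eta, A, A))   # prints 0, so A is a null vector
```

Sixteen terms are formed in each call, although for a diagonal metric only four of them are non-zero.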
Exercises
B.27 Write out explicitly the summations implied in the expressions (a) $X^a W_a$,
(b) $g_{ab} X^a Y^b$.
B.28 Assuming $(W_a)$ transforms as in Exercise B.26, prove the invariance of the
quantity $S = Y^a W_a$ where $Y^a$ is any vector transforming according to (B.5).
Computer Exercise 17
Write a program that will accept as input the components L(A, B) of a transformation
matrix $L$ from frame F to F', and the components LI(A, B) of the inverse matrix $L^{-1}$. It
should first check that L(A, B) and LI(A, B) are indeed inverse matrices, and then request
as input the components X(B) of a vector X and the components G(A, B) of a metric tensor.
It should then calculate (i) the new components X1(A) of the vector and G1(A, B) of the
metric, according to eqns (B.5) and (B.20), (ii) the scalar product $X \cdot X$ before and after the
transformation. This quantity should be invariant, so the difference $X' \cdot X' - X \cdot X$ serves
as a check on the calculation. Suppose you find this is almost but not quite zero; to what
can you attribute this difference? [You could choose the transformation matrix to represent
a spatial rotation; a `boost' (B.5); or some more general transformation.]
Consider the four-momentum P of a particle of rest mass $M_0$, at rest in frame F. Use
your program to find its four-momentum in frame F'. Alter the program to add the
four-momenta of several particles together to give a total four-momentum in the initial frame F,
and to find the self-product $P \cdot P$ of this total four-momentum. What transformation
property would you expect for this quantity? Use your program to verify your expectation.
Explain briefly the importance of this quantity. Adjust your program to handle also the
case of particles of zero rest mass. Determine the total four-momentum of a set of particles
of zero rest mass and find its magnitude. Try some other cases, and comment on your
answer.
Appendix C
Four-tensors, electromagnetism, and
energy-momentum conservation
We have considered in Appendix B the concept of four-vectors. These represent
simple geometric objects in space-time, but are not complex enough to represent
all the physical and geometric objects of interest (four-vectors are described by
four independent components, but many geometric or physical quantities need
more components for a complete description). To enable representation of more
complex objects, we need four-tensors. These are more general objects which
behave like vectors in a way which we shall make precise shortly, but which can
have more components (labelled by more indices). In the main body of the
Appendix we shall consider general tensors with one or two indices; and finally
we will summarize briefly the generalization to tensors with an arbitrary number
of indices.
$A^{a'} = L^{a'}{}_a A^a,$ (C.1a)
where the transformation matrices $L$ and $L^{-1}$ are inverse to each other:
$A^a = (L^{-1})^a{}_{a'} A^{a'},$ (C.3a)
$W_a = L^{a'}{}_a W_{a'}.$ (C.3b)
This describes how the new components are obtained from the old. Conversely,
to obtain the old components from the new, the inverse transformations are
$T^{ab} = (L^{-1})^a{}_{a'} (L^{-1})^b{}_{b'}\, T^{a'b'},$
$S^a{}_c = (L^{-1})^a{}_{a'}\, S^{a'}{}_{c'}\, L^{c'}{}_c.$ (C.5a,b)
In each case, the relations must be true for every value of the `free indices' (in
(C.4a), a' and b'; in (C.4b), a' and c'; in (C.5a), a and b; in (C.5b), a and c). The
detailed meaning of the relation then follows from the summation convention: as
in the cases (C.1a,c), discussed in detail in the previous section, one simply writes
out all the terms implied by the summation and then substitutes the values of the
tensor and transformation matrix components.
An example As a simple example, we consider (C.4b) in the case of a two-dimensional
space. In this case it becomes
$S^{a'}{}_{c'} = L^{a'}{}_1 S^1{}_1 (L^{-1})^1{}_{c'} + L^{a'}{}_1 S^1{}_2 (L^{-1})^2{}_{c'} + L^{a'}{}_2 S^2{}_1 (L^{-1})^1{}_{c'} + L^{a'}{}_2 S^2{}_2 (L^{-1})^2{}_{c'},$ (C.6a)
valid for each value 1' and 2' of the free indices a' and c', e.g. for $a' = 1'$ and
$c' = 1'$,
$S^{1'}{}_{1'} = L^{1'}{}_1 S^1{}_1 (L^{-1})^1{}_{1'} + L^{1'}{}_1 S^1{}_2 (L^{-1})^2{}_{1'} + L^{1'}{}_2 S^2{}_1 (L^{-1})^1{}_{1'} + L^{1'}{}_2 S^2{}_2 (L^{-1})^2{}_{1'}.$
Suppose $L$ represents a rotation: $L^{1'}{}_1 = \cos\theta = L^{2'}{}_2$ and $L^{1'}{}_2 = \sin\theta = -L^{2'}{}_1$. Then
this relation becomes
then we find $S^{1'}{}_{1'} = 1$; that is, the (1, 1) component of S is invariant under the
rotation. Similarly letting a' and c' in (C.6a) take all other values (1, 2), (2, 1), and
(2, 2), we find that all these components are invariant under the rotation. Is this
a special property of spatial rotation? To investigate this, we return to (C.6a) and
now substitute in (C.6b) with an arbitrary transformation matrix L, finding
the last step following from (C.2). Thus the quantity S, defined in an initial frame
by (C.6b) and transforming in the tensor way (C.6a), has the same components
(C.6b) in all coordinates, i.e. no matter what transformation is made.
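This conclusion can be confirmed with exact arithmetic. The sketch below assumes that (C.6b) is the unit tensor, $S^a{}_c = \delta^a{}_c$ (the case posed in Exercise C.1), and applies the two-dimensional rule (C.6a) with an arbitrarily chosen non-singular matrix. The names `inverse_2x2` and `transform_mixed`, and the sample matrix, are hypothetical.

```python
from fractions import Fraction as Fr

def inverse_2x2(L):
    """Exact inverse of a 2x2 matrix; fails if the matrix is singular."""
    det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
    if det == 0:
        raise ValueError("transformation matrix must be non-singular")
    return [[L[1][1] / det, -L[0][1] / det],
            [-L[1][0] / det, L[0][0] / det]]

def transform_mixed(S, L, Linv):
    """(C.6a): S'[a'][c'] = sum over a, c of L[a'][a] * S[a][c] * Linv[c][c']."""
    return [[sum(L[ap][a] * S[a][c] * Linv[c][cp]
                 for a in range(2) for c in range(2))
             for cp in range(2)]
            for ap in range(2)]

delta = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]   # assumed form of (C.6b)
L = [[Fr(2), Fr(7)], [Fr(-1), Fr(3)]]      # arbitrary matrix with determinant 13
S_new = transform_mixed(delta, L, inverse_2x2(L))
print(S_new == delta)   # prints True
```

Nothing about the matrix matters except that it has an inverse, which is the content of the step that follows from (C.2).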
Tensor equations The importance of the transformation rules exemplified by
(C.1-5) is that if a tensor equation is true in one frame, it is true in all frames; and
this is a property we obviously want for any real physical equation (the validity of
an equation must not depend on which observer makes the measurement or what
coordinate system he uses). As an example of this assertion, suppose that we
know that the equation
Ra = Sa (C.7)
which proves the result stated. The proof is similar in the case of other tensor
equations, in which the free indices on the left and the right are the same (i.e. if
there is a free index a upstairs on the left, there is also a free index a upstairs on the
right; if there is a free index d downstairs on the left, there is also a free index d
downstairs on the right; etc.) An important special case is that if a tensor vanishes
in one reference frame (so all its components are zero in that frame), then it
vanishes in all frames. We particularly wish this feature to be true for physically
significant quantities: it should not be possible to transform a non-zero physical
quantity to zero by changing the coordinate system or reference frame.
Tensor operations The tensor eqn (C.7) is a rather simple one. One can construct
more complex equations by using four basic tensor operations. These are:
(1) Linear combination. For example, given tensors $[R^a{}_b]$ and $[S^a{}_b]$, and
numbers $\lambda$ and $\mu$, we can define a new tensor $[T^a{}_b]$ by
$T^a{}_b = \lambda R^a{}_b + \mu S^a{}_b.$
Note that this is only possible for tensors of the same type, that is, with the same
number of indices upstairs and downstairs.
(2) Tensor-product formation. For example, given any two vectors, say $(R^a)$
and $(S_b)$, we can define a new tensor $[T^a{}_b]$ by
$T^a{}_b = R^a S_b.$
(3) Raising and lowering indices. Given any tensor with an upstairs index a, one
can produce a tensor with that index in the downstairs position by multiplication
with the metric tensor. For example, given $[T^a{}_b]$, we can lower the index a to get
$[T_{cb}]$ where $T_{cb} = g_{ca} T^a{}_b$. We can regard $[T^a{}_b]$ and $[T_{ab}]$ as different arrays of
components describing the same geometric object. Conversely we can raise any
downstairs index b by multiplication with the inverse metric tensor $[g^{bd}]$, i.e. the
tensor defined by
$g^{ab} g_{bc} = \delta^a{}_c,$ (C.8)
where $\delta^a{}_c$ are the components of the unit tensor (cf. (B.18)). Thus, for example,
$T^a{}_b = g^{ad} T_{db}$ raises the index d on $T_{db}$.
(4) Tensor contraction. We can contract a tensor by summing over any pair of
indices (one up, one down). For example, given a tensor $[S^b{}_d]$ (which may be built
up by repeated application of the previous operations), we can define a quantity T
by contracting the indices b and d; that is,
$T = S^b{}_b,$ (C.9a)
where the summation is over all values of the index b. This quantity is necessarily
an invariant, i.e. a quantity on whose value all agree:
$T' = T.$ (C.9b)
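Operations (2) and (4) can be combined in a short numerical check of (C.9b). The sketch below works in two dimensions for brevity (an assumption made only to keep the example small), forms the tensor product $T^a{}_b = R^a S_b$ in two frames, and compares the contractions. The matrix and the component values are hypothetical.

```python
from fractions import Fraction as Fr

L = [[Fr(2), Fr(1)], [Fr(5), Fr(3)]]        # arbitrary non-singular matrix (det = 1)
Linv = [[Fr(3), Fr(-1)], [Fr(-5), Fr(2)]]   # its inverse

R = [Fr(1), Fr(2)]   # upstairs components R^a
S = [Fr(3), Fr(4)]   # downstairs components S_b

# R^{a'} = L^{a'}_a R^a and S_{b'} = (L^{-1})^b_{b'} S_b
R_new = [sum(L[ap][a] * R[a] for a in range(2)) for ap in range(2)]
S_new = [sum(Linv[b][bp] * S[b] for b in range(2)) for bp in range(2)]

# Tensor product T^a_b = R^a S_b in each frame, then the contraction T = T^b_b.
T = [[R[a] * S[b] for b in range(2)] for a in range(2)]
T_new = [[R_new[a] * S_new[b] for b in range(2)] for a in range(2)]
contraction = sum(T[b][b] for b in range(2))
contraction_new = sum(T_new[b][b] for b in range(2))

print(contraction, contraction_new)   # prints 11 11
```

Every individual component of the product changes under the transformation, but the contracted quantity does not.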
$[g_{ab}] = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$ (C.10)
It follows from (C.8) that the inverse metric components $g^{ab}$ also have this
standard form: $g^{00} = -1$, $g^{11} = g^{22} = g^{33} = 1$, $g^{ab} = 0$ if $a \neq b$.
This form is preserved by Lorentz transformation, i.e. it is invariant under the
transformation (B.20) where $L$ represents a Lorentz transformation (B.5). One
can use any other change of frame $L$ as long as this matrix is non-singular; in
general this will bring the metric to a more complex form.
The transformation laws (C.1-5) will all remain valid in curved space-times, as
will all results one can deduce from them. However, in a curved space-time, one
cannot choose coordinates to bring the metric tensor to the canonical form (C.10)
everywhere; the most one can do is to bring it to this form at any point P. More
specifically, coordinates can be chosen so that $g_{ab}$ takes the form (C.10) at P; it
will then in general not have that form at some other point Q. A different set of
coordinates can be chosen to bring it to the form (C.10) at the point Q, but in these
coordinates it will in general not have this form at the point P.
Exercises
C.1 Prove that the four-dimensional tensor $S^a{}_b$ with components $S^a{}_b = \delta^a{}_b$ in one frame
has the same components in all frames.
C.2 Prove that the quantity $W = W_{ab}X^aX^b$ is an invariant provided these quantities
are tensors as indicated by their indices. If W vanishes in one frame, will it vanish in all
frames?
Suppose $W_{ab}$ is antisymmetric, that is, $W_{ab} = -W_{ba}$. Evaluate W in this case.
C.3 Show that if a tensor $T_{ab}$ is symmetric in one frame, i.e. $T_{ab} = T_{ba}$ for all a and b,
then it is symmetric in all frames. State and prove a similar result for antisymmetric
tensors. What is the value of the invariant $g^{ab}W_{ab}$ if $W_{ab}$ is antisymmetric?
C.4 Show that the metric tensor components $g_{ab}$ have the canonical form (C.10) if and
only if the coordinates used are Minkowski coordinates (t, x/c, y/c, z/c) for some
observer. [Hint: consider (i) proper time along a curve {x = const, y = const, z = const};
(ii) proper distance along a curve {t = const, y = const, z = const}; (iii) the meaning of
$g_{01} = 0$; (iv) the meaning of $g_{12} = 0$.]
From the transformation law (C.1e), show that the metric tensor components preserve
this canonical form under a Lorentz transformation (B.6b).
C.5 Flat space is given in Minkowski coordinates. Determine the components $g^{ab}$ of
the inverse metric tensor. Hence find the components $X_a$ of the vector $X^b = (1, 1, 0, 0)$, and
the components $T^{cd}$ of the tensor $T_{ab}$ where $T_{00} = \mu$, $T_{11} = T_{22} = T_{33} = p$. Prove that in
general, if $T_{ab}$ is symmetric, then so is $T^{cd}$. Confirm explicitly that this general result is
true in the particular case just considered.
To show that this representation is correct, we consider in turn the Lorentz force
law, the transformation properties of E and B, and Maxwell's equations.
Exercise C.6
(a) Suppose $E = c(1, 0, 0)$ and $B = (0, 2, 0)$. What will be the tensor $[F^{ab}]$ representing this?
(b) Suppose the tensor $[F^{ab}]$ is
$[F^{ab}] = c \begin{pmatrix} 0 & 4 & 2 & 3 \\ -4 & 0 & 0 & 2 \\ -2 & 0 & 0 & 5 \\ -3 & -2 & -5 & 0 \end{pmatrix}.$
Check that this is antisymmetric, and find the electric and magnetic fields it represents.
The Lorentz force law Particle motion under electromagnetic forces is determined
by the momentum equation (B.12b), where the electromagnetic three-force
F on a particle with electric charge e moving with three-velocity v is given by
the Lorentz force law. This expresses the fact that the force due to the electric field
E is independent of v while the force due to the magnetic field B depends on v,
both being proportional to the charge e. Explicitly,
$d\boldsymbol{\pi}/dt = \mathbf{F} = e(\mathbf{E} + \mathbf{v} \times \mathbf{B}),$ (C.12)
thus the four-force f (see (B.12)) is determined from the particle four-velocity U
(see (B.7)). To show the equivalence of this form to (C.12), note that the metric
takes its canonical form (C.10) and $U^a$ is given by (B.7b), so $(U_b) = (g_{ba}U^a) =
\gamma(-1, \mathbf{v}/c)$, while by (B.12) we have $(f^a) = \gamma(\mathbf{v} \cdot \mathbf{F}/c^2, \mathbf{F}/c)$. Recalling that
$dP^a/d\tau = \gamma\, dP^a/dt$ and $(P^a) = (E/c^2, \boldsymbol{\pi}/c)$, we may cancel factors of $\gamma/c$ in
(C.13) to obtain, for a = 1,
*On using curvilinear coordinates in flat space-time, or general coordinates in a curved space-time,
extra terms have to be added to these equations to make them into tensor equations (because they
involve derivatives).
three-fields E and B, which are represented as an antisymmetric four-tensor $F^{ab}$
as in (C.11). These equations are of importance in every situation where electric
forces are utilized (electric motors, relays, television tubes, and so on), and also
govern, for example, the spiral motion of cosmic rays travelling through inter-
stellar space.
Exercises
C.7 Set a = 2 and a = 3 in (C.13) and so derive the other two components of (C.12).
Show from these equations that if a cosmic ray moves in interstellar space where $E = 0$ but
$B \neq 0$, then (a) energy is conserved, and so the speed of motion is constant; (b) momentum
parallel to the magnetic field is constant. [The path of the particle will be a spiral.]
C.8 Write out the Lorentz force-law equations explicitly in the case when $E =
c(1, 0, 0)$ and $B = (0, 2, 0)$. Can you see from these equations that some component of
momentum is constant?
C.9 Show from the antisymmetry of $[F^{ab}]$ that (C.13) implies $U_a\, dP^a/d\tau = 0$. Deduce that
$dm_0/dt = 0$. [Hint: $U_a = g_{ab}U^b$ where $g_{ab}$ are constant if Minkowski coordinates are
used; and $P^a = m_0 U^a$ where $U^a U_a = -1$.]
$F^{a'b'} = L^{a'}{}_a L^{b'}{}_b F^{ab},$ (C.15)
where $F^{a'b'}$ will be related to the electric and magnetic fields E' and B' measured
in the new frame by the primed version of relation (C.11). To see the effect of
relative motion on electric and magnetic fields, we use Minkowski coordinates
(so $E_1 = E_x$, etc.) and consider the effect of motion in the x-direction; then $L$ and
$L^{-1}$ are given by (B.6). The calculation is now straightforward; for example,
$E_{x'} = F^{0'1'} = L^{0'}{}_a L^{1'}{}_b F^{ab}$
Exercises
C.10 Suppose v = 5, E = c(3, 2, 0), B = (1, 4, 0). Find E' and B'.
C.11 The relations inverse to (C.15) are $F^{ab} = (L^{-1})^a{}_{a'} (L^{-1})^b{}_{b'} F^{a'b'}$. Determine
directly from this relation the inverse transformations to (C.16). Can you see a simple way
to deduce directly from (C.16) the results you obtain?
C.12 Suppose E and B are both non-zero in a frame F, and are perpendicular to
each other in this frame. Show one can find a frame F' in which the electric field vanishes.
[Hint: first rotate the axes until E lies in the y-direction and B in the z-direction; then use
(C.16)]
The set of results (C.16), following directly from the tensor transformation law
(C.15), is remarkable in showing the profound connection between electric and
magnetic fields. For example, if we start off in a frame F with an electric field in
the y-direction, i.e. $\mathbf{E} = (0, E_y, 0)$, and no magnetic field, i.e. $\mathbf{B} = (0, 0, 0)$, then
on transforming to a frame F' moving with speed v in the x-direction we find from
(C.16) that
$\mathbf{E} = (E_x, E_y, E_z) = (e/4\pi\epsilon_0)(x^2 + y^2 + z^2)^{-3/2}(x, y, z), \quad \mathbf{B} = (0, 0, 0).$ (C.17)
In a frame F' which moves with speed v in the x-direction relative to F, the
electric and magnetic fields E' and B' are, by (C.16) and (C.17),
$\mathbf{E}' = (E_{x'}, E_{y'}, E_{z'}) = (e/4\pi\epsilon_0)(x^2 + y^2 + z^2)^{-3/2}(x, \gamma y, \gamma z),$
$\mathbf{B}' = (B_{x'}, B_{y'}, B_{z'}) = (e\gamma v/4\pi\epsilon_0 c^2)(x^2 + y^2 + z^2)^{-3/2}(0, z, -y).$
The coordinates (t, x, y, z) are those of the frame F; in terms of coordinates
(t', x', y', z') of F', we find (on using (B.1-3))
y)/
Thus in the frame F' there are time-dependent electric and magnetic fields. This
effect is the origin of the magnetic field due to a current in a wire, as we
demonstrate in detail in the next section.
Exercise C.13
A particle with mass m and charge e moves in a frame F in constant electric and magnetic
fields E and B in the y- and z-directions respectively. At t = 0 it is at the origin moving
with speed u in the x-direction.
(i) Write down the electric and magnetic fields in a frame F' moving with speed v in the
x-direction.
(ii) Show that if v/c = cB/E and u = v, a possible solution to the equation of motion
(in the frame F') is
$\mathbf{E}' = (1/4\pi\epsilon_0) \int \sigma' \{(x' - x'')^2 + y'^2 + z'^2\}^{-3/2}\, (x' - x'', y', z')\, dx'',$
where the integral is over the entire line. Evaluation of the integral gives
$\mathbf{E}' = (\sigma'/2\pi\epsilon_0)(y'^2 + z'^2)^{-1}(0, y', z');$ (C.19)
clearly the magnetic field is zero in this frame.
Fig. C.1 The charge element $\sigma' dx''$ of a current flowing along the x'-axis in frame F'
produces an electric field E' at the point P with coordinates (x', y', z').
In a frame F moving with speed v in the x'-direction relative to F', the electric
and magnetic fields are given from (C.16) and (C.19) by
between the charge density $\sigma$ in the frame F and charge density $\sigma'$ in the frame F'
(this follows because the total charge must be the same in the two frames, and
viewed from F a unit length in F' appears to be contracted by a factor $1/\gamma$; thus
$\sigma' \times 1 = \sigma \times 1/\gamma$).
Now let us consider a long electrically neutral wire along the x-axis in a frame
F, in which a current of density j flows. The current consists of electrons with
charge density $\sigma_- = -\sigma$ say ($\sigma > 0$) and average speed $-v$ say, relative to F.
There will be an equal but opposite density $\sigma_+ = \sigma$ of positive charge on the
copper ions which remain at rest in the wire while the current flows (Fig. C.2). The
current density is given by $j = \sigma_-(-v) = \sigma v$. We need to consider the electric and
magnetic fields produced by both the positive and the negative charges.
Since the positive charges are at rest in the frame F, it follows by (C.19) applied
in this frame that they produce electric and magnetic fields $\mathbf{E}_+$ and $\mathbf{B}_+$ given by
Fig. C.2 An electric current along the x-axis consists of stationary positive ions, and
electrons moving in the negative x-direction.
x'-direction relative to F', we see from (C.20) and the definitions of $\sigma$ and $j$ that
$\mathbf{E}_- = -(\sigma/2\pi\epsilon_0)(y^2 + z^2)^{-1}(0, y, z),$
$\mathbf{B}_- = (j/2\pi\epsilon_0 c^2)(y^2 + z^2)^{-1}(0, -z, y).$
The total electric and magnetic fields due to the current and the charges in the wire
are then $\mathbf{E} = \mathbf{E}_+ + \mathbf{E}_-$ and $\mathbf{B} = \mathbf{B}_+ + \mathbf{B}_-$. From the results above,
This is the well-known result that a current j in a straight wire produces zero
electric field, and a magnetic field of magnitude $j/2\pi\epsilon_0 c^2 r$ at a distance r from the
wire in a direction tangential to a circle around the wire. Thus, for example, a test
charge q moving at speed u parallel to the wire at distance r from it will experience
a force given by (C.12) with v replaced by u and e by q, i.e.
Applying it to the positive charges, $\sigma_+$ (the charge density in the frame F) is the
rest-frame density, so
(C.23b)
Thus, the total charge density in the frame F' is $\sigma' = \sigma'_- + \sigma'_+ = \sigma(\gamma^2 - 1)/\gamma$, i.e.
Substituting into (C.19), we get the same values for the components of the electric
field as before (see (C.21b)). Thus, the difference in the electric charge densities,
which is the source of the electric field in frame F', is just due to the reciprocal
nature of the length contraction effect (C.20b), which relates the charge densities
in the two frames.* The magnetic field in the frame F' results from the motion of
the positive charges at speed +v relative to this frame.
Although the length contraction involved in (C.23) will be extremely small
because v/c is very small, the effect is appreciable because the density $\rho$ is very
large (there are a very large number of electrons involved); thus the relativistic
length contraction effect is important here even though the speeds involved are
very low.
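How small the contraction is can be seen by putting in numbers. The Python sketch below uses the drift speed quoted in Exercise C.15 and a rounded value of c (an assumption), and evaluates the ratio $\sigma'/\sigma = (\gamma^2 - 1)/\gamma$ from the expression for the total charge density above. It also illustrates a practical point: evaluated directly in double precision the ratio is lost to rounding, so the algebraically equivalent form $\gamma V^2$ (using $\gamma^2 - 1 = \gamma^2 V^2$) is computed as well. The variable names are hypothetical.

```python
import math

c = 3.0e10    # speed of light in cm/sec (rounded value, assumed)
v = 6.0e-2    # drift speed in cm/sec, the figure quoted in Exercise C.15

V = v / c
gamma = 1.0 / math.sqrt(1.0 - V * V)

# Direct evaluation: 1 - V^2 cannot be distinguished from 1 in double
# precision, so gamma comes out as exactly 1 and the ratio as exactly 0.
ratio_direct = (gamma ** 2 - 1.0) / gamma

# Equivalent form: gamma^2 - 1 = gamma^2 V^2, so the ratio is gamma * V^2.
ratio_stable = gamma * V * V

print(V, ratio_direct, ratio_stable)
```

The second form gives a ratio of about $4 \times 10^{-24}$, which only becomes appreciable when multiplied by the very large number of charges involved.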
Exercises
C.14 In an inertial frame F, a line charge of density e per unit length lies along the
x-axis and moves in the x-direction with speed u. Show that, in an inertial frame F' in
which the magnetic field vanishes, the charge density is $e' = e/\gamma$. Calculate the electric field
in F' and hence find both the electric and magnetic fields in F.
C.15 The drift speed of electrons in the wire, causing the current, is only $v = 6 \times
10^{-2}$ cm/sec. Find $V = v/c$ and calculate the corresponding length contraction factor $\gamma$.
If there are $10^{23}$ free electrons per cm$^3$ in the wire, each carrying a charge e, find $\sigma'$ in terms
of e from (C.23c).
C.16 Read Section 8 of Special Relativity by A. P. French (Nelson 1968), where a
detailed discussion of this example is given, based on the transformation properties of
a force.
*One can ask why the charge density is zero in the frame F. The answer is that this is the case we
have chosen to consider; one could do a similar (but more complex) calculation for the case of a
wire which is charged in frame F.
Q1 =-FabFab. (C.24a)
that is,
$Q_1 = 2(c^2 B^2 - E^2).$ (C.24b)
Because this quantity is invariant ($Q_1' = Q_1$ for all observers), we see that if B and
E have the same relativistic magnitude in one frame ($c^2 B^2 = E^2$), they have the same
magnitude in all frames ($c^2 B'^2 = E'^2$).
The second invariant depends for its definition on the totally antisymmetric
tensor $[\eta^{abcd}]$ (if any two neighbouring indices are swapped, the sign of $\eta^{abcd}$
changes, e.g. $\eta^{abcd} = -\eta^{bacd}$), that in Minkowski coordinates has the component
$\eta^{0123} = 1$. Because it is totally antisymmetric, the indices a, b, c, d of any non-zero
$\eta^{abcd}$
(To determine the coefficient in front, observe that when a = 0 in the summation,
b ranges over the values 1, 2, 3, successively giving the first factor of each term in
the bracket. When b takes the value 1, the non-zero terms in c and d are $F_{23}$ and
$-F_{32}$ which combine because $F_{cd}$ is skew, giving a total contribution of two terms
$F_{01}F_{23}$; similarly, two contributions to this term arise when b = 0, c = 0, and
d = 0.) Thus
$Q_2 = -8c\, \mathbf{E} \cdot \mathbf{B}.$ (C.26b)
Because this quantity is invariant ($Q_2' = Q_2$ for all observers), we see that if E and
B are orthogonal in one frame, they are orthogonal in all frames.
A particularly interesting case arises if both Q1 and Q2 vanish, but the field is
not zero. This is the case of orthogonal electric and magnetic fields of equal
magnitude. If a field has this property in one frame it has it in all frames; and this is
precisely the case that occurs in plane electromagnetic waves. As these are
invariant requirements, if one observer finds this to be true then so will all other
observers.
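The invariance of the combinations $c^2B^2 - E^2$ and $\mathbf{E} \cdot \mathbf{B}$, on which $Q_1$ and $Q_2$ depend, can be checked numerically. The sketch below uses the standard transformation of the fields for a boost with speed v along the x-axis, which is assumed here to coincide with (C.16) (not restated in this passage); SI units are used, and the function names and sample field values are hypothetical.

```python
import math

C = 299792458.0   # speed of light in m/s

def boost_fields(E, B, v):
    """Standard transformation of E and B for a boost with speed v along x."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, g * (Ey - v * Bz), g * (Ez + v * By))
    B_new = (Bx, g * (By + v * Ez / C ** 2), g * (Bz - v * Ey / C ** 2))
    return E_new, B_new

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def invariants(E, B):
    """Return (c^2 B^2 - E^2, E.B)."""
    return C ** 2 * dot(B, B) - dot(E, E), dot(E, B)

E = (1000.0, 2000.0, -500.0)   # V/m, sample values
B = (2.0e-6, -1.0e-6, 3.0e-6)  # tesla, sample values
E_new, B_new = boost_fields(E, B, 0.6 * C)

print(invariants(E, B))
print(invariants(E_new, B_new))
```

The components of both fields change substantially at this speed, yet the two printed pairs agree to rounding error.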
Exercises
C.17 (a) Check that the quantity $F^a{}_a$ is zero. (b) Verify expressions (C.25), (C.24b), and
(C.26b).
C.18 Suppose $E = \alpha(0, 1, 0)$ and $B = \beta(0, 0, 1)$ in a frame F. (a) Find $E^2$, $B^2$ and $E \cdot B$.
(b) What are $Q_1$ and $Q_2$ in this frame? (c) What are their values in a frame F' moving at
speed v = c in the x-direction?
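$\nabla \cdot \mathbf{E} = \rho/\epsilon_0, \quad \nabla \times \mathbf{E} = -\partial\mathbf{B}/\partial t,$
$\nabla \cdot \mathbf{B} = 0, \quad \nabla \times \mathbf{B} = \mu_0 \mathbf{j} + (1/c^2)\, \partial\mathbf{E}/\partial t,$ (C.27a)
where $\nabla \cdot \mathbf{A}$ is the divergence of the vector field A and $\nabla \times \mathbf{A}$ is the curl of A
(see e.g. Vector Analysis by M. Spiegel, Schaum, for details), and $\epsilon_0$ and $\mu_0$ are
constants related by
$\epsilon_0 \mu_0 = 1/c^2.$ (C.27b)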
The meaning of Maxwell's equations is discussed in detail in Volume II of The
Feynman Lectures on Physics by R. P. Feynman, R. B. Leighton, and M. Sands
(Addison-Wesley, 1964), and less technically in An Introduction to the Meaning
and Structure of Physics by L. N. Cooper (Harper and Row, 1968).
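The four-dimensional form The four-dimensional form of Maxwell's equations
in flat space-time is given, in Minkowski coordinates,* by
$\frac{1}{c} \frac{\partial F^{ab}}{\partial x^b} = \frac{1}{\epsilon_0} J^a, \qquad \frac{\partial F_{ab}}{\partial x^c} + \frac{\partial F_{ca}}{\partial x^b} + \frac{\partial F_{bc}}{\partial x^a} = 0,$ (C.28a,b)
$\partial/\partial x^a$ is the partial derivative operator, with the index a treated as a downstairs
index.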
Three-dimensional equivalence To see the equivalence of this four-dimensional
form to the three-dimensional (C.27a), we examine first (C.28a) and then (C.28b).
Setting a = 0 in (C.28a) gives
i.e. by (C.25),
$-\partial E_y/\partial(z/c) + \partial E_z/\partial(y/c) + c\, \partial B_x/\partial t = 0,$
where $(U^a)$ is the four-velocity of the charge; by (C.28c) this shows that
these equations making explicit the fact that a current is just charge in motion
relative to the observer.
This feature is vital for the nature of magnetic forces, for this identification of a
current as charges in motion makes explicit the fact that an observer moving past
an electric charge will determine that there is a source term j for the magnetic field,
although an observer stationary relative to the charge will not. This explains
(from the viewpoint of Maxwell's equations) why, as discussed above, we can
regard motion relative to a charge as the source of magnetic fields. Thus the
transformation properties of [Fab] and (Ja) together with the Lorentz force law
(C.12) and Maxwell's equations (C.28) lead to a consistent analysis of the force
on the particle, no matter which reference frame is used.
To explore this further, we use (C.29) to give an alternative derivation of the
result (C.23) which was crucial in our analysis of the force due to moving charges
in a wire; now we base our analysis on the fact that the current is a four-vector.
At a first reading, the reader may wish to omit this detailed calculation and skip to
the section on charge conservation.
Returning again to the situation of a current in a wire that we examined above,
we now consider the current four-vectors (J+) and (J°) due to the positive and
negative charges in the wire. Since the rest frame of the positive charges is F, we
have (J+) = p+(1, 0, 0, 0) where p+ is their rest-charge density; transforming to
the frame F' (cf. (C.29)), we obtain
the latter defining the density p' of positive charges in the frame P. The time
component of this equation is
The rest frame of the negative charges is F', so (J°') = p' (1, 0, 0, 0) where p' is
their rest-charge density; transforming to the frame F, we obtain (Jo) =
'yp' (1, -v/c) = p_ (1, -v/c), the latter defining the density of negative charges in
the frame F. The time component of this equation is
P- _ 'YP (C.30b)
Since the wire is uniform (and so has a constant area, independent of the reference
frame), the line densities a are proportional to the volume densities p; because the
line densities are equal in the frame F, so are the volume densities; that is,
Equations (C.30) correspond to (C.23), and enable us to obtain the same results
as before. This approach emphasizes that the transformation properties of the
charge density $\rho$ are not those of a scalar but those of the time-like component of a
four-vector. If $\rho$ were a scalar, there would be no electric field in the F' frame and
hence no force on the particle in that frame; therefore an observer at rest relative
to the wire (i.e. using reference frame F) would find there is no force on the
particle (because $(f^a)$ is a four-vector) and hence no magnetic field.
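The role of the time component can be illustrated numerically. The following is only a minimal sketch: eqn (C.29) is not reproduced in this passage, so the function `boost_current` assumes the standard Lorentz boost of the four-vector $(c\rho, \mathbf{j})$ along the x-axis, and it assumes that the negative charges drift at velocity +v in F, with F' moving along with them. The function names are hypothetical, and the densities are signed, so the $\rho_-$ of the text corresponds to the magnitude of the negative density here.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    """Lorentz factor for a speed v with |v| < C."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def boost_current(rho, jx, v):
    """Transform charge density rho and x-current jx to a frame moving at +v along x.

    Assumption: this standard boost of (c*rho, jx) is taken to match (C.29).
    """
    g = gamma(v)
    rho_new = g * (rho - v * jx / C**2)
    jx_new = g * (jx - v * rho)
    return rho_new, jx_new


def wire_densities(rho0, v):
    """Return the signed densities (rho'_+, rho'_-) in F' for a wire neutral in F."""
    rho_plus, _ = boost_current(rho0, 0.0, v)            # positive charges at rest in F
    rho_minus, _ = boost_current(-rho0, -rho0 * v, v)    # negative charges drifting at +v
    return rho_plus, rho_minus


if __name__ == "__main__":
    v = 0.6 * C
    rho_plus, rho_minus = wire_densities(1.0, v)
    print("gamma          =", gamma(v))
    print("rho'_+         =", rho_plus)    # gamma * rho0, as in (C.30a)
    print("rho'_-         =", rho_minus)   # -rho0 / gamma, as in (C.30b)
    print("net rho in F'  =", rho_plus + rho_minus)
```

The wire is neutral in F but carries a net charge density in F', which is the source of the electric field seen by the moving particle; had $\rho$ been treated as a scalar, the two densities would have cancelled in both frames.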
Exercises
C.19 Set a = 2 in (C.28a) and a = 0, b = 1, c = 2 in (C.28b), thereby explicitly
obtaining two more components of Maxwell's equations (C.27a) from (C.28).
C.20 (a) A set of charges is at rest in frame F, with charge density $\rho_0$; there is no current
measured in this frame. A cosmic ray moves past at speed v = i3 c in the x-direction.
Determine the charge density and current measured in the rest frame F' of the cosmic ray.
(b) Determine the value of the invariant $J^a J_a$, and obtain from (C.29) a relation between it
and the quantities j and $\rho$.
C.21 Read about the physical meaning of Maxwell's equations in one of the books
mentioned at the beginning of this section.
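Conservation of charge
From the Maxwell equations (C.28a), one finds that
$$(1/c)\,\partial^2 F^{ab}/\partial x^a \partial x^b = (1/\epsilon_0)\,\partial J^a/\partial x^a.$$
The left-hand side vanishes because $[F^{ab}]$ is antisymmetric ($F^{ab} = -F^{ba}$) and
$\partial^2 f/\partial x^a \partial x^b = \partial^2 f/\partial x^b \partial x^a$ for every function f. Thus the right-hand side vanishes, so
$$\partial J^a/\partial x^a = 0 \qquad \text{(C.31)}$$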
which is just the equation of conservation of charge. If we write this equation out
in a Minkowski frame and use (C.28c) it takes the form
$$\partial\rho/\partial t + \nabla \cdot \mathbf{j} = 0, \qquad \text{(C.31a)}$$
that is, the rate of change of charge density with respect to time is minus the divergence of
the current; this is the usual form of the conservation equation (see e.g. Volume II
of The Feynman Lectures on Physics). The interest of this calculation is how simple
it is to prove, using the four-dimensional notation, that the conservation of
charge is a consequence of Maxwell's equations; it follows directly from those
equations plus the skew-symmetry of $[F^{ab}]$.
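The algebraic step behind this result is that the full contraction of an antisymmetric array with a symmetric one vanishes identically. The sketch below checks this numerically; the random arrays merely stand in for $F^{ab}$ and for the symmetric operator $\partial^2/\partial x^a \partial x^b$, and the helper names are hypothetical.

```python
import random


def contract(A, S):
    """Full contraction: sum over a and b of A[a][b] * S[a][b]."""
    return sum(A[a][b] * S[a][b] for a in range(4) for b in range(4))


def random_antisymmetric(rng):
    """Random 4x4 array with A[a][b] = -A[b][a] (stands in for F^{ab})."""
    A = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        for b in range(a + 1, 4):
            A[a][b] = rng.uniform(-1.0, 1.0)
            A[b][a] = -A[a][b]
    return A


def random_symmetric(rng):
    """Random 4x4 array with S[a][b] = S[b][a] (stands in for the second derivatives)."""
    S = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        for b in range(a, 4):
            S[a][b] = rng.uniform(-1.0, 1.0)
            S[b][a] = S[a][b]
    return S


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        # Each printed value is zero up to rounding error.
        print(contract(random_antisymmetric(rng), random_symmetric(rng)))
```

Each term with a < b is cancelled by the matching term with a > b, and the diagonal terms of the antisymmetric array are zero, which is the cancellation that Exercise C.22(a) asks for component by component.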
Exercise C.22
(a) Verify the derivation of (C.31) by explicitly writing out the expression
$\partial^2 F^{ab}/\partial x^a \partial x^b$ from (C.11) and showing that the terms in it cancel. (b) Derive (C.31a) from (C.31).
(c) Will (C.31a) be the same in all frames, or not?
defined for any matter or physical field in space-time. It represents the energy,
momentum, and stress associated with that matter (a solid, fluid, gas, plasma,
collection of elementary particles, or whatever) or field (an electromagnetic field,
scalar field, spinor field, etc.).
Its components, in a Minkowski frame in flat space-time, are as follows: $T^{00}$
is the relativistic energy density $\mu$ of the matter or fields; $T^{0i} = T^{i0} = q^i/c$
(i = 1, 2, 3) where $q^i$ may be regarded either as the flux of energy across a surface
perpendicular to the i-direction, or as the i component of the momentum density.
In appropriate units these quantities are equal, since the relation $E = mc^2$ implies
Exercise C.23
Derive (C.34) from (C.33) by the method just outlined, in the case where
v = (v, 0, 0) (i.e. using the tensor transformation with L given by (B.6b)).
which just expresses the conservation of momentum for the matter flow. This
confirms that, in this case, conditions (C.35) are just the statement of energy and
momentum conservation. They can be written out in four-dimensional form, as
follows. From the relation $U^a U_a = -1$ it follows that $U_a\, dU^a/d\tau = 0$; also, on
using the compact notation $f_{,b} \equiv \partial f/\partial x^b$, we may write $df/d\tau = f_{,a} U^a$ because
$U^a = dx^a/d\tau$. Therefore, on putting (C.33) in (C.35) and contracting with $[U_a]$,
we find
$$d\mu_0/d\tau + \mu_0 U^b{}_{,b} = 0, \qquad \text{(C.36a)}$$
$$dU^a/d\tau = 0. \qquad \text{(C.36b)}$$
We previously defined a geodesic for a single particle as a curve on which motion takes place under
no forces, i.e. a curve for which $dP^a/d\tau = 0$ (a). In that case, $P^a = m_0 U^a$ (b) where $m_0$ is the
particle rest mass. Substituting (b) into (a), and contracting with $U_a$ shows that $m_0$ is constant; then
it follows as above that (a) and (b) imply $dU^a/d\tau = 0$.
Exercises
C.24 Derive eqns (C.36) and check the last statement.
C.25 The four-velocity field of the fundamental observers in a (flat space-time) Milne
universe is given in Cartesian coordinates by $(U^a) = (t/\tau, X/\tau, Y/\tau, Z/\tau)$ where
$\tau = \{t^2 - (X^2 + Y^2 + Z^2)\}^{1/2}$. Show that $U^a{}_{,a} = 3/\tau$; hence deduce from (C.36a) that
$\mu_0 = M/\tau^3$ is the evolution of energy density along the fluid flow lines, where M is constant
along these lines.
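Before attempting the analytic calculation in Exercise C.25, the reader may find it helpful to check the stated divergence numerically. The sketch below is not part of the original exercise: it estimates the divergence of the Milne four-velocity by central differences at an arbitrarily chosen event inside the future light cone, using units in which c = 1. The function names and the step size h are hypothetical choices.

```python
import math


def tau(p):
    """Proper time tau = sqrt(t^2 - (X^2 + Y^2 + Z^2)) at the event p = (t, X, Y, Z)."""
    t, x, y, z = p
    return math.sqrt(t * t - (x * x + y * y + z * z))


def four_velocity(p):
    """Milne four-velocity (U^a) = (t, X, Y, Z) / tau."""
    s = tau(p)
    return [component / s for component in p]


def divergence(p, h=1e-5):
    """Central-difference estimate of the sum over a of dU^a/dx^a at the event p."""
    total = 0.0
    for a in range(4):
        forward = list(p)
        backward = list(p)
        forward[a] += h
        backward[a] -= h
        total += (four_velocity(forward)[a] - four_velocity(backward)[a]) / (2.0 * h)
    return total


if __name__ == "__main__":
    event = (2.0, 0.5, -0.3, 0.7)
    U = four_velocity(event)
    print("U.U        =", -U[0] ** 2 + U[1] ** 2 + U[2] ** 2 + U[3] ** 2)  # close to -1
    print("divergence =", divergence(event))
    print("3 / tau    =", 3.0 / tau(event))
```

The two printed values of the divergence agree to within the finite-difference error, and inserting $3/\tau$ into (C.36a) gives $d\mu_0/d\tau = -3\mu_0/\tau$, whose solution is the $\tau^{-3}$ law quoted in the exercise.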
The condition (C.35) again produces the hydrodynamic equations for this fluid,
conveniently expressed in a form analogous to (C.36). The form of the energy-
momentum tensor in (C.37) is the one usually used in universe models, in con-
junction with a suitable equation of state relating p and µ.
Exercises
C.26 Derive the flat space-time energy and momentum conservation equations ana-
logous to (C.36), for a perfect fluid (C.37).
C.27 Show from (C.36a) that, for a perfect fluid in a Milne universe with four-velocity
as given in Exercise C.25, and equation of state $p/c^2 = \tfrac{1}{3}\mu$, the evolution of energy density
along the fluid flow lines is $\mu = M/\tau^4$, where M is constant along these lines.
When discussing a perfect fluid, we will not be comparing expressions for [Tab] in different frames.
We can therefore omit the subscript 0, as used above in (C.33) and (C.34), without causing
confusion.
behaviour of charged fluids (magnetohydrodynamics). The divergence of Tab in
(C.38) is not zero in general. Rather, the general result is
It is only when one adds together the $T^{ab}$ from the electromagnetic fields and $T^{ab}$
from the particle motion that $\partial T^{ab}_{\mathrm{total}}/\partial x^b = 0$. However, in the case where there is
no current (Ja = 0), the condition (C.35) for the electromagnetic stress tensor
(C.38), which then gives the equations of conservation of electromagnetic energy
and momentum, is automatically implied by Maxwell's equations (C.28a).
Exercises
C.28 Derive eqns (C.39) from (C.38) in the text above.
C.29 Read up about electromagnetic energy density, the Poynting vector, and the
Maxwell stress tensor in a book on electromagnetism. See what you can find out about the
implications of these concepts for such phenomena as sunspots and galaxies that are radio
sources.
Curved space-time
We have emphasized in many places that expressions obtained are valid only in
Minkowski coordinates. As mentioned before, it is only in a flat space-time that
we can obtain such coordinates everywhere. However, we can find such coor-
dinates at any particular chosen point P in a curved space-time, so the expres-
sions above, and interpretations of these tensors, will remain valid in a curved
space-time. Further, we have pointed out that when derivatives of tensors occur,
extra terms are required if general coordinates are used, in order that the
relations shall be proper tensor relations. However, again in a curved space-time,
if suitable coordinates are chosen these relations will be true at any particular
point P in the space-time; thus they too maintain their meaning in curved
space-times.
According to Einstein, the stress-energy tensor Tab of all the matter and
physical fields present has a very important role in a curved space-time: it is the
source of the curvature, and so of the geometry, of the space-time. In physical
terms, the stress-energy tensor together with suitable equations of state and
boundary conditions determines the gravitational fields that occur in nature. We
have seen examples of this in Chapters 6 and 7.
Computer Exercise 18
Write a program that will accept as input (1) a transformation speed V from a
Minkowski frame F to a Minkowski frame F', (2) components E(I) (I = 1, 2, 3) of an
electric field and B(J) (J = 1, 2, 3) of a magnetic field in frame F, and will then print out
the components E1(I) and B1(J) of these fields in the frame F'. It should also work out the
values of the quantities Q1 and Q2 (given by eqns (C.24) and (C.26)) before and after
the transformation; the degree to which they are invariant serves as a check on the accuracy
of the calculation.
Use your program to experiment with cases where one or other of the fields are
(i) parallel and (ii) perpendicular to the relative velocity of the frames. Determine in which
cases you can transform (a) an electric field, and (b) a magnetic field, to zero. What features
of electromagnetic fields are you unable to alter by any change of reference frame?
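One possible starting point for this exercise is sketched below. It is not the authors' program, and it rests on assumptions that should be checked against the text: the relative velocity V is taken along the x-axis, the fields are in SI units, the transformation used is the standard one for a boost along x, and the two quantities returned by `invariants` (the scalar product of E and B, and the difference of the squares of E and cB) are assumed to correspond to Q1 and Q2 of eqns (C.24) and (C.26), which are not reproduced here. All function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s (SI units assumed)


def transform_fields(E, B, V, c=C):
    """Return (E1, B1) in frame F', which moves at speed V along the x-axis of F."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    # Components parallel to the motion are unchanged; the others mix E and B.
    E1 = (Ex, g * (Ey - V * Bz), g * (Ez + V * By))
    B1 = (Bx, g * (By + V * Ez / c**2), g * (Bz - V * Ey / c**2))
    return E1, B1


def invariants(E, B, c=C):
    """Return (E.B, E^2 - c^2 B^2), the two standard field invariants."""
    dot = sum(e * b for e, b in zip(E, B))
    diff = sum(e * e for e in E) - c**2 * sum(b * b for b in B)
    return dot, diff


if __name__ == "__main__":
    E = (0.0, 1.0, 0.0)   # purely electric field, perpendicular to the motion
    B = (0.0, 0.0, 0.0)
    V = 0.6 * C
    E1, B1 = transform_fields(E, B, V)
    print("E1 =", E1)
    print("B1 =", B1)
    print("invariants before:", invariants(E, B))
    print("invariants after: ", invariants(E1, B1))
```

In this example a purely electric field in F acquires a magnetic component in F', while both invariants are unchanged up to rounding error; comparing the invariants before and after is the accuracy check the exercise asks for.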
where, as before, we use the summation convention; summation over the values 0,
1, 2, 3 is understood for each pair of repeated indices (i.e. for the indices a, b, c, d).
Thus, the rule is that each upstairs index transforms under the same matrix L as a
vector with an index upstairs (the matrix effectively cancels out the old index a on
the tensor and replaces it by a', etc); and each downstairs index transforms under
the inverse matrix $L^{-1}$ (see (B.18); again each old index, e.g. b, is cancelled out and
replaced by a new index, e.g. b'). This describes how the new components are
obtained from the old. Conversely, to obtain the old components from the new,
one replaces the matrices L by $L^{-1}$ and $L^{-1}$ by L in the obvious way, so that
upstairs indices are still cancelled by downstairs indices and replaced by a new
upstairs index, and vice versa; that is,
where now the summation is over all values of the indices a', b', c', d'.
As mentioned previously, the great importance of the transformation rule
(C.42), which is the `natural' one for any quantity with indices, is that if a tensor
equation is true in one frame, it is true in all frames. For example, if
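$$T^{ab}{}_{cd} = S^{ab}{}_{cd} \qquad \text{(C.43a)}$$
for all values of the indices a, b, c, d, in frame F, then also
$$T^{a'b'}{}_{c'd'} = S^{a'b'}{}_{c'd'} \qquad \text{(C.43b)}$$
in every other frame F'. Whatever frame is used, the free indices on the left and the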
right must be the same (i.e. if there is a free index a upstairs on the left, there is also
a free index a upstairs on the right; if there is a free index d downstairs on the left,
there is also a free index d downstairs on the right; etc). An important particular
example is that if a tensor vanishes in one reference frame (so all its components are
zero in that frame), then it vanishes in all frames.
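The transformation rule described above can be tried out on a tensor with one upstairs and one downstairs index. The following is a minimal sketch under stated assumptions: the matrix L is taken to be the standard boost along the x-axis with speed $\beta = v/c$ (assumed to agree with (B.6b)), its inverse is the boost with $-\beta$, and the function names are hypothetical.

```python
import math

N = range(4)


def boost(beta):
    """Standard Lorentz boost matrix along x for speed beta = v/c (assumed form of L)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_mixed(T, L, Linv):
    """New components: T'[a'][b'] = sum over a, b of L[a'][a] * Linv[b][b'] * T[a][b].

    The upstairs index (first slot) uses L; the downstairs index (second slot) uses L^{-1}.
    """
    return [[sum(L[ap][a] * Linv[b][bp] * T[a][b] for a in N for b in N)
             for bp in N] for ap in N]


if __name__ == "__main__":
    L, Linv = boost(0.6), boost(-0.6)
    T = [[float(4 * a + b + 1) for b in N] for a in N]   # arbitrary test components
    T_new = transform_mixed(T, L, Linv)
    T_back = transform_mixed(T_new, Linv, L)              # swap L and L^{-1} to go back
    print("max round-trip error:",
          max(abs(T_back[a][b] - T[a][b]) for a in N for b in N))
    zero = [[0.0] * 4 for _ in N]
    print("zero tensor stays zero:", transform_mixed(zero, L, Linv) == zero)
```

Because the new components are linear combinations of the old ones, a tensor whose components all vanish in F has vanishing components in F' as well; applying this to the difference of two tensors shows why an equation between them holds in every frame once it holds in one.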
Exercises
C.32 Prove that (C.42) is the inverse of (C.41), i.e. applying first (C.41) and then
(C.42), we end up with the components we started with.
C.33 (a) Prove that (C.43b) follows from (C.43a) and (C.41). (b) Does the converse
follow, i.e. does (C.43a) follow from (C.43b)? If so, why? (c) Prove that if a tensor vanishes
in one frame, then it vanishes in all frames.
$$T^{ab}{}_{cd} = \lambda R^{ab}{}_{cd} + \mu S^{ab}{}_{cd}$$
Note that this is only possible for tensors of the same type, that is, with the same
number of indices upstairs and downstairs.
(2) Tensor product formation: given any two tensors, say $[R^{ab}{}_c]$ and $[S^d{}_e]$, we can
define a new tensor $[T^{abd}{}_{ce}]$ by
$$T^{abd}{}_{ce} = R^{ab}{}_c S^d{}_e.$$
(3) Tensor contraction: given a tensor $S^{ab}{}_{cd}$ (which may be built up by repeated
application of the previous two operations), we can define a new tensor $P^a{}_c$ by
contracting the indices b and d; that is,
$$P^a{}_c = S^{ab}{}_{cb},$$
where the summation is over all values of the index b. One can contract over any
pair of upstairs and downstairs indices, reducing the number of upstairs and
downstairs indices by one each. The repeated indices are called `dummy indices'
because they are not free indices, but rather denote summation. They can be
relabelled at will, provided a free index label is not used. Thus, for example,
$S^{ab}{}_{cb} = S^{ae}{}_{ce}$ (relabelling b to e); but we must not relabel b to a or c here (a and c being
the free indices).
(4) Raising and lowering indices: given any upstairs index a, one can produce a
tensor with that index in the downstairs position by multiplication with the metric
tensor. For example, given $T^a{}_{be}$, we can `lower the index' a to get $T_{cbe}$ where
$T_{cbe} = g_{ca} T^a{}_{be}$. We can regard $T^a{}_{be}$ and $T_{cbe}$ as different components describing the
same geometric object. Conversely we can raise any downstairs index b by
multiplication with the inverse metric tensor $[g^{bd}]$ defined by (C.8). Thus, for
example, $T^a{}_{be} = g^{ad} T_{dbe}$ `raises the index' d on $T_{dbe}$.
In a more formal derivation of tensor properties, we would show that each of
these processes does indeed result in the components of a new tensor; however, in
this somewhat informal introduction, the reader is asked either to believe this to
be true, or to prove it for himself (in fact, the properties follow easily from the
definitions given).
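The operations of tensor product, lowering and raising an index, and contraction can be seen at work in a small numerical sketch. The example below is an illustration only, under stated assumptions: it uses the Minkowski metric with components diag(-1, 1, 1, 1) in units where c = 1 (a sign convention consistent with the relation $U^a U_a = -1$ used earlier), it handles only two-index tensors built from vectors, and the function names are hypothetical.

```python
# Minkowski metric in units with c = 1; with this choice it is equal to its own inverse.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
N = range(4)


def product(U, V):
    """Tensor product: T^{ab} = U^a V^b."""
    return [[U[a] * V[b] for b in N] for a in N]


def lower_second(T):
    """Lower the second index with the metric: T^a_b = g_{bc} T^{ac}."""
    return [[sum(ETA[b][c] * T[a][c] for c in N) for b in N] for a in N]


def raise_second(T_mixed):
    """Raise the second index with the inverse metric: T^{ab} = g^{bc} T^a_c."""
    return [[sum(ETA[b][c] * T_mixed[a][c] for c in N) for b in N] for a in N]


def contract(T_mixed):
    """Contract the upstairs index with the downstairs index: T^a_a."""
    return sum(T_mixed[a][a] for a in N)


if __name__ == "__main__":
    U = [1.25, 0.75, 0.0, 0.0]           # four-velocity for speed 0.6 along x
    T = product(U, U)                     # T^{ab} = U^a U^b
    T_mixed = lower_second(T)             # T^a_b
    print("U^a U_a =", contract(T_mixed))
    print("raising undoes lowering:", raise_second(T_mixed) == T)
```

Here the contraction of the lowered product reproduces $U^a U_a$, so three of the four operations combine to give a familiar invariant, and raising the index again recovers the original components.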
Exercises
C.34 Write down the transformation law for a tensor with components $T^{ab}{}_{cd}$. Prove
that the contraction $[P^a{}_c] = [T^{ab}{}_{cb}]$ of $[T^{ab}{}_{cd}]$ is a tensor.
C.35 Find what tensor [Sabcd] is obtained by first raising, and then lowering, the index
a on a tensor [Tabd].
Exercise C.36
Prove that $T = S^{ab}{}_{ab}$ is an invariant by writing down the transformation law for $S^{ab}{}_{cd}$ and
then contracting.