
Introduction to General Relativity and Cosmology

C.U. Physics, PG 4th semester


Anirban Kundu

Contents

1 Why General Relativity?
1.1 Flat space
1.2 Curved space
1.3 Curved space-time

2 The Equivalence Principle
2.1 The Eötvös experiment
2.2 The equivalence principle
2.3 Free fall
2.4 Metric and affine connection
2.5 Weak gravity: the Newtonian limit
2.6 The Einstein-Hilbert action
2.7 Gravitational redshift: “Apparent weight of photons”

3 The Schwarzschild Metric
3.1 Black holes
3.2 Black hole thermodynamics
3.3 Gravitational waves

4 The Einstein Equation
4.1 Curvature
4.2 Einstein equations in vacuum
4.3 The expanding universe
4.4 Einstein equation for matter: the FRW geometry

5 Newtonian Cosmology

6 Modern Cosmology
6.1 Microscopic content of the universe
6.1.1 Matter
6.1.2 Radiation
6.1.3 Neutrinos
6.1.4 Dark matter
6.1.5 Dark energy
6.2 Macroscopic content of the universe
6.3 The fluid equation
6.4 The Hubble parameter revisited
6.4.1 The density parameter
6.4.2 The deceleration parameter
6.5 The redshift

7 The evolution of the universe
7.1 Matter dominance
7.2 Radiation dominance
7.3 Mixtures
7.4 More exotic situations
7.5 The cosmological fine-tuning problem

8 The universe
8.1 The age of the universe
8.2 The cosmic microwave background
8.3 The horizon problem
8.4 The CMB anisotropy

9 The early universe

10 Nucleosynthesis
10.1 Helium abundance
10.2 Deuterium abundance
10.3 Baryogenesis

11 Inflation
11.1 Slow-roll inflation
11.2 Slow roll parameters and cosmological perturbations

This note is based upon the following textbooks:


Weinberg: Gravitation and Cosmology
Hartle: Gravity
Liddle: An Introduction to Modern Cosmology
Kolb and Turner: The Early Universe

Dodelson: Modern Cosmology
Narlikar: Introduction to Cosmology
You must read the original textbooks, particularly the first four. Remember that the supplementary
problems form an integral part of the course.
Notation: In contrast to my electrodynamics course, and quantum field theory and particle physics courses, I will use the flat space-time (Minkowski) metric $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$. This makes the invariant interval $ds^2 = -dt^2 + dx^2 + dy^2 + dz^2$. Thus, $A_\mu = \eta_{\mu\nu} A^\nu$, i.e., $A_0 = -A^0$ and $A_i = A^i$. I'll use $c = 1$, but not $\hbar = 1$. Also, I will not use the relativists' convention of $G = 1$.
Why such a change? The main reason is that this is the metric that the relativists use, and if we want their equations to match with ours (except, maybe, a factor of $c$ raised to some power), we must use this metric. Why do they use this metric when it gives $p^2 = -m^2$? Because for them, the curvature of space-time is the more important thing. They do not want to have a negative curvature for a sphere, so there is no minus sign in the spatial coordinates.
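As a quick check of these sign conventions, the following short computation shows where $p^2 = -m^2$ comes from. It is only a sketch: the symbols $E$ and $\mathbf{p}$ for the energy and the three-momentum are labels introduced here for illustration, and the usual relation $E^2 = |\mathbf{p}|^2 + m^2$ (with $c = 1$) is assumed.

```latex
% Four-momentum with c = 1; E and \mathbf{p} are illustrative labels.
\[
  p^\mu = (E, \mathbf{p}), \qquad
  p_\mu = \eta_{\mu\nu}\, p^\nu = (-E, \mathbf{p}),
\]
% Contract with the metric diag(-1, 1, 1, 1) and use E^2 = |p|^2 + m^2.
\[
  p^2 \equiv \eta_{\mu\nu}\, p^\mu p^\nu = -E^2 + |\mathbf{p}|^2 = -m^2 .
\]
```

With the opposite signature, diag(1, −1, −1, −1), the same contraction would give $+m^2$.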
A four-dimensional vector will be labeled by a Greek index. All Greek indices run from 0 to 3,
and all Latin indices run from 1 to 3. All repeated indices are implicitly summed over.
Almost all the figures and plots have been lifted from the internet. I acknowledge all the creators collectively; no individual acknowledgements are shown.

Quantity        Value
$G$             $6.672 \times 10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$
$\hbar$         $1.055 \times 10^{-34}$ J s
$c$             $2.998 \times 10^{8}$ m/s $= 3.066 \times 10^{-7}$ Mpc/yr
$k_B$           $1.381 \times 10^{-23}$ J/K $= 8.619 \times 10^{-5}$ eV/K
$m_e$           0.511 MeV
$m_p$           938.3 MeV

1 pc            $3.086 \times 10^{16}$ m
1 yr            $3.156 \times 10^{7}$ s
1 eV            $1.602 \times 10^{-19}$ J, or 11602.27 K
1 $M_\odot$     $1.989 \times 10^{30}$ kg

Parameter                                     Value
Age of universe                               $(13.817 \pm 0.048) \times 10^{9}$ yr (Planck)
Hubble constant $H_0$                         $67.3 \pm 1.2$ km/s/Mpc (Planck)
Reduced Hubble constant $h$                   $H_0 / (100$ km/s/Mpc$)$
Physical baryon density $\Omega_b h^2$        $0.02205 \pm 0.00028$ (Planck)
Baryon density $\Omega_b$                     4.9%
Physical CDM density $\Omega_{DM} h^2$        $0.1199 \pm 0.0027$ (Planck)
Cold dark matter density $\Omega_{DM}$        26.4%
Cosmological constant density $\Omega_\Lambda$   $0.685 \pm 0.017$ (Planck), $0.712 \pm 0.010$ (WMAP-9)
Redshift at matter-radiation equality         $3391 \pm 60$ (Planck)
Redshift at photon decoupling                 $1090.43 \pm 0.54$ (Planck)
Age at decoupling                             $377730^{+3205}_{-3200}$ yr

Table 1: Some cosmological parameters, obtained from the combined set of 9-year data taken by
the Wilkinson Microwave Anisotropy Probe (WMAP) [arXiv:1212.5226], Sloan Digital Sky Survey
(SDSS), the Hubble Space Telescope, and the Planck experiment [arXiv:1303.5076]. This is the
most recent average released in 2013.

1 Why General Relativity?

Why should we study General Relativity (GR)? Simply because this is an astrophysics and cosmology course and GR is one of the cornerstones of modern-day cosmology. From the big bang (the creation of the universe) to exciting objects like quasars, active galactic nuclei, black holes, neutron stars, and binary pulsars, GR is applied everywhere. It is not that GR is merely being tested in these systems; it has already been tested in several laboratory experiments as well as in experiments in the solar system. A better analogy is the Schrödinger equation and its numerous applications in atomic, molecular, nuclear, and solid-state physics.
So, what is GR? It is a theory of gravity; a classical theory, since gravity has not yet been quantised¹. Its input is the fact that space-time is curved, the curvature being caused by the matter (or energy) present. The field equations of GR, also called the Einstein equations, relate a measure of the curvature to the energy-momentum (also called the stress-energy) tensor, and give a very profound result: gravity is the geometry of space-time! In other words, gravity is not like the electromagnetic field: if you have a mass, the nearby space-time gets curved, and the distortion appears as gravity. One may compare it with Newtonian gravity, a theory with a flat space-time but a field and its source, which is more analogous to electromagnetism. It is expected that in experiments where the curvature of space-time plays a crucial role, these two theories can be distinguished. Unfortunately, gravity is so weak that in most cases the distortion is very small, so the difference between the predictions of the two theories is tiny, and one needs very precise and careful experiments to verify GR. After all, experiment is the final arbiter in physics. We will discuss some of these tests; GR has been vindicated in all of them.
Since its formulation, GR has been thought of, perhaps not unjustifiably, as a theory whose mathematics is too complicated for physicists. The reason is its heavy dependence on a branch of mathematics called differential geometry. One may contrast it with quantum mechanics, whose mathematics was relatively easy but whose physics, in its early years, was confusing and difficult to grasp (one may like to go through the correspondence between Einstein and Bohr). Nowadays, we do not attach too much importance to geometry; we know how to extract the basic physics of gravity without being an expert in differential geometry. This is the approach taken by Weinberg, who, in his classic text, motivated GR from the principle of equivalence, i.e., from the fact that the gravitational mass and the inertial mass of a body are the same (more on this later). I would like to emphasize that you can be a successful astrophysicist or cosmologist if you have the basic knowledge of the physics of GR, without getting lost in the quagmire of differential geometry. The first part of this course just introduces you to the subject; it is too brief to give you even a cursory coverage.

Figure 1: The indentation of a flat rubber sheet with a mass.

A good analogy of how matter affects geometry and geometry in turn affects matter is the case of an indented rubber sheet. Consider a rubber sheet with rectangular grid lines; this represents a flat space. A straight line is one that intersects all the parallel grid lines at an equal angle. Now once you put some mass on the rubber sheet, there is an indentation; the parallel grid lines are no longer parallel, and they are more crowded in the vicinity of the mass. The more massive the object is, the more severe the indentation; in other words, the more distorted the geometry will be. However, if you go far away from the body, the space is again flat; the effect of the mass dies down. The line that intersects all the grid lines at an equal angle is no longer a straight line, but any body without an external force will move along that line, so the trajectory will appear curved, and that happens because of the indenting mass. We might say that the mass distorts the flat geometry and the motion of a test particle follows a curved path which appears to be the result of gravity.
Now let us first try to have an idea of what curved space-time means.

¹ There can be other competing theories of gravity. They share some common characteristics with GR, but differ in other respects. And none of them offers a quantum version.

1.1 Flat space

We start with the fifth postulate² of Euclid as stated in his famous treatise Elements:
If a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which the angles are less than two right angles.
There are other equivalent ways of stating this postulate. The postulate cannot be proved; you must assume it to be true. It then gives you the definition of parallel lines: lines that never meet even if extended to infinity. This postulate defines flat space.

Figure 2: Parallel lines in a flat space.

In a 2-dimensional flat space, a line element can be written as

$$ ds^2 = dx_1^2 + dx_2^2 . \tag{1} $$

This is nothing but Pythagoras' theorem. Here $x_1$ and $x_2$ are orthogonal cartesian axes, so we can write

$$ ds^2 = g_{IJ}\, dx_I\, dx_J \tag{2} $$

($I, J = 1, 2$) with $g_{11} = g_{22} = 1$, $g_{12} = g_{21} = 0$. The metric is just the 2 × 2 unit matrix. This may appear to be a good definition of a flat space, but beware: it is not! For example, I can use the plane polar coordinate system, where

$$ ds^2 = dr^2 + r^2\, d\theta^2 \tag{3} $$

and so $g_{11} = 1$ but $g_{22} = r^2$: the metric is no longer the unit matrix. So we must look for some other definition of a flat space.

² The other postulates define points, lines, angles, etc., and without these four postulates you cannot construct any geometry.
1.2 Curved space

While Euclid's fifth postulate cannot be proved, it is possible to have geometries without the fifth postulate. In the 19th century, the mathematicians Gauss, Bolyai, and Lobachevski independently formulated such geometries, which have no internal inconsistencies, but the space is not flat: it is curved.
Consider the surface of a sphere. On a flat space, the angles of a triangle sum up to 180°. What about a spherical triangle, i.e., a triangle whose sides all lie on the surface of a sphere? The sum will be more than 180°. For example, consider the triangle formed by the Greenwich longitude, the 90° east longitude and the equator; the longitudes cut all latitude lines at right angles, so all angles of this triangle are 90° and they sum up to 270°. Also note that if you make the triangle smaller and smaller, the sum of the angles tends to 180°, i.e., the space is locally flat³.

Figure 3: A spherical triangle.

What about the fifth postulate? That is obviously not valid, since all longitudes are parallel lines (they cut the latitudes at right angles), but they meet at the poles. This is a finite curved space (the area of the sphere is finite), so we cannot extend any line to infinity. We can, however, think of saddle-like spaces. The saddle is convex along the body of the horse but concave along its spine, and so can be extended up to infinity. If you have a 3-dimensional plotting programme, like splot in Gnuplot or Plot3D in Mathematica, try to plot $f(x, y) = x^2 - y^2$ with ranges covering $(0, 0)$.

Figure 4: Saddle space.

The separation between two close points on the surface of a sphere of radius $a$, located at $(\theta, \phi)$ and $(\theta + d\theta, \phi + d\phi)$, is given by

$$ ds^2 = a^2 \left( d\theta^2 + \sin^2\theta\, d\phi^2 \right) . \tag{4} $$

The metric, obviously, is

$$ g_{11} = a^2, \qquad g_{22} = a^2 \sin^2\theta, \qquad g_{12} = g_{21} = 0. \tag{5} $$

This metric looks almost identical to the flat space plane polar metric, so how do we know that this space is indeed curved? One must have some quantity that is an intrinsic property of the space and does not depend on the choice of the coordinate system. Gauss derived such a quantity, which is called the Gaussian curvature. The expression can be found in Eq. (1.1.12) of Weinberg. There is only one such curvature for a 2-dimensional space. For an $N$-dimensional space, there are $N^2 (N^2 - 1)/12$ independent curvatures.

³ There can be no 'straight' line on the surface of the sphere. The analogue of the straight lines are the great circles, circles which lie entirely on the surface of the sphere and whose centres coincide with the centre of the sphere. All longitudes are great circles, so is the equator; but other latitude lines are not.

Figure 5: Infinitesimal space on the surface of a sphere.

Let us spend some more time on the 2-dimensional curved space. We have already talked about two such spaces. The sphere is a finite space with a fixed positive curvature, given by

$$ K = 1/a^2 \tag{6} $$

where $a$ is the radius of the sphere. The saddle, on the other hand, is an infinite space, but its curvature is not fixed: the surface bends upward in one direction and downward in the other, which makes its Gaussian curvature negative, and the magnitude of that curvature varies from point to point. What Gauss et al. considered was an infinite space of constant negative curvature; such a space is impossible to visualise, but one can mathematically construct it. Why do we need a fixed curvature? Because the curvature defines the inner property of a space, its intrinsic property, and unless the curvature is constant the space won't be isotropic, so other postulates of Euclid will be violated. We will, however, concentrate on the sphere.
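The value in Eq. (6) can be checked directly from the metric. The sketch below uses the standard expression for the Gaussian curvature of a 2-dimensional metric with no cross term, $ds^2 = E\, du^2 + G\, dv^2$. This form and the symbols $E$, $G$, $u$, $v$ are not taken from these notes (Weinberg's Eq. (1.1.12) covers the general case); they are introduced here only for illustration.

```latex
% Gaussian curvature for an orthogonal 2-d metric ds^2 = E du^2 + G dv^2
% (standard result, assumed here; subscripts u, v denote partial derivatives).
\[
  K = -\frac{1}{2\sqrt{EG}}
      \left[ \frac{\partial}{\partial u}\!\left( \frac{G_u}{\sqrt{EG}} \right)
           + \frac{\partial}{\partial v}\!\left( \frac{E_v}{\sqrt{EG}} \right) \right]
\]
% Sphere, Eq. (5): u = \theta, v = \phi, E = a^2, G = a^2 \sin^2\theta.
\[
  \sqrt{EG} = a^2 \sin\theta, \qquad
  \frac{G_\theta}{\sqrt{EG}} = 2\cos\theta, \qquad
  E_\phi = 0
  \quad\Longrightarrow\quad
  K = -\frac{1}{2 a^2 \sin\theta}\,(-2\sin\theta) = \frac{1}{a^2} .
\]
% Plane polar coordinates, Eq. (3): u = r, v = \theta, E = 1, G = r^2.
\[
  \sqrt{EG} = r, \qquad
  \frac{G_r}{\sqrt{EG}} = 2, \qquad
  E_\theta = 0
  \quad\Longrightarrow\quad
  K = 0 .
\]
```

So the two metrics, which look so similar, are told apart by $K$: it is $1/a^2$ on the sphere and zero for the plane written in polar coordinates.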
Given a curved space, the first task is to construct the geometry of that space. This means that one has to obtain the metric. The metric gives you the separation between two nearby points: $ds^2 = g_{IJ}\, dx_I\, dx_J$ (we will use uppercase indices for 2-dimensional space). Now, if it were a flat space, I could have written, in cartesian coordinates, $ds^2 = \eta_{IJ}\, dx_I\, dx_J$ where $\eta$ is the 2 × 2 unit matrix. I can never do so for a curved space, but:
If we consider a sufficiently small region in a curved space, it is possible to find a locally Euclidean coordinate system so that the distance between two nearby points $(\xi_1, \xi_2)$ and $(\xi_1 + d\xi_1, \xi_2 + d\xi_2)$ is given by Pythagoras' theorem: $ds^2 = d\xi_1^2 + d\xi_2^2$.
This seems to be a very trivial statement (all of us, in our school days, have drawn triangles whose angles sum up to 180°), but it is not, and we will come back to this statement later. For now, we see that while it is possible to have such locally Euclidean frames, one cannot cover any finite amount of surface with a single Euclidean frame; to cover the entire sphere, one must have an infinite number of such Euclidean frames⁴.

⁴ If you want to flatten a curved surface, there will be lots of deformations; a good example is the flat projection of the curved surface of the earth known as the Mercator projection. The polar regions are so deformed that Greenland and Africa appear to be of the same size, Alaska appears to be larger than India, and Antarctica larger than Asia.

Figure 6: The Mercator projection, a poor attempt to flatten the surface of the earth. Note the huge deformations near the poles.

Now suppose we choose some other coordinate system $(x_1, x_2)$ that covers a finite part of the sphere, maybe the whole surface. What should the separation look like? Well, both the Euclidean variables $\xi_1$ and $\xi_2$ are functions of $x_1$ and $x_2$, so we can write, using the chain rule of differentiation,
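$$ d\xi_1 = \frac{\partial \xi_1}{\partial x_1}\, dx_1 + \frac{\partial \xi_1}{\partial x_2}\, dx_2 , \tag{7} $$

and hence

$$ ds^2 = g_{11}(x_1, x_2)\, dx_1^2 + 2 g_{12}(x_1, x_2)\, dx_1 dx_2 + g_{22}(x_1, x_2)\, dx_2^2 \tag{8} $$

where

$$ \begin{aligned}
g_{11} &= \left( \frac{\partial \xi_1}{\partial x_1} \right)^2 + \left( \frac{\partial \xi_2}{\partial x_1} \right)^2 , \\
g_{22} &= \left( \frac{\partial \xi_1}{\partial x_2} \right)^2 + \left( \frac{\partial \xi_2}{\partial x_2} \right)^2 , \\
g_{12} &= \frac{\partial \xi_1}{\partial x_1} \frac{\partial \xi_1}{\partial x_2} + \frac{\partial \xi_2}{\partial x_1} \frac{\partial \xi_2}{\partial x_2} .
\end{aligned} \tag{9} $$

Such a space is called a metric space. The derivation can be turned the other way: given a metric space, we can at any point choose a locally Euclidean frame.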
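As an illustration of Eq. (9), one can take the flat plane with cartesian coordinates $(\xi_1, \xi_2)$ and choose plane polar coordinates as the new system. The identification $x_1 = r$, $x_2 = \theta$ below is an assumption made for this example only; the result reproduces the metric of Eq. (3).

```latex
% Cartesian (\xi_1, \xi_2) in terms of polar (x_1, x_2) = (r, \theta).
\[
  \xi_1 = r\cos\theta, \qquad \xi_2 = r\sin\theta
\]
% Eq. (9), term by term.
\[
  g_{11} = \cos^2\theta + \sin^2\theta = 1 ,
\]
\[
  g_{22} = (-r\sin\theta)^2 + (r\cos\theta)^2 = r^2 ,
\]
\[
  g_{12} = (\cos\theta)(-r\sin\theta) + (\sin\theta)(r\cos\theta) = 0 ,
\]
% Hence Eq. (8) becomes the polar line element of Eq. (3).
\[
  ds^2 = dr^2 + r^2\, d\theta^2 .
\]
```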

1.3 Curved space-time

The next step is conceptually easy, but actually it needed the genius of a great mathematician,
Georg Riemann, for a successful construction of a higher dimensional curved space. The reason
is that there is no unique curvature for a higher dimensional space and the geometry becomes
much more complicated. The mathematical framework was constructed by Riemann and other
mathematicians, but it was Einstein who saw the physics in it.
For a flat or Minkowski space-time, the invariant separation is

$$ ds^2 = -dt^2 + d\xi_i^2 = \eta_{\mu\nu}\, d\xi^\mu d\xi^\nu \tag{10} $$

where the Minkowski metric $\eta = \mathrm{diag}(-1, 1, 1, 1)$ and we have used $c = 1$.⁵ Therefore, for a curved space-time, we should have

$$ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu \tag{11} $$

where the elements of the symmetric tensor $g_{\mu\nu}$ are functions of the coordinates. Again, we expect that it is possible to find infinitesimal regions where $g_{\mu\nu} = \eta_{\mu\nu}$: the space-time is locally flat. Suppose I have such an infinitesimally flat space-time; then I can equate (10) and (11) and write

$$ g_{\mu\nu} = \eta_{\alpha\beta}\, \frac{\partial \xi^\alpha}{\partial x^\mu} \frac{\partial \xi^\beta}{\partial x^\nu} . \tag{12} $$

Remember that the $\xi$ coordinates change if we go to another space-time point.
The theoretical introduction is almost complete; now we will look into an important experiment and its consequences.

⁵ This metric has an overall minus sign compared to the Bjorken-Drell metric diag(1, −1, −1, −1), which is more popular in quantum field theory and particle physics courses. The reason for this choice is that it makes life simpler for you when you compare standard textbooks.
Q. The sum of the interior angles of a spherical triangle is $\pi + A/a^2$, where $A$ is the area of the triangle and $a$ is the radius of the sphere. Find the area of the triangle bounded by the equator and the longitudes 0° and 60°. What is the maximum possible area of a spherical triangle?
Q. A ball is thrown from the ground level at an angle of 45° with the vertical, and it falls back to the ground after 2 seconds. Defining the plane in which the ball moved to be the x-y plane, draw the space-time diagram of the motion of the ball. Convince yourself that the ball moves in an almost straight line in the space-time diagram.
Q. What is the value of $K$ for a plane?
Q. Derive Eq. (9).

2 The Equivalence Principle

2.1 The Eötvös experiment

If you drop two different objects from the same height, they take equal time to fall to the ground (assuming, of course, that the air resistance is negligible). In a spaceship orbiting the earth, everything is weightless; everything, from the astronauts to the paperweights, falls towards the earth with the same acceleration. It has been tested by a very precise experiment (see Box 2.1 of Hartle) that the earth and the moon fall with the same acceleration in the gravitational field of the sun.
What does this mean? In the immediate vicinity of a freely falling observer, there is no gravitational field, as far as the observer is concerned. If you are in a freely falling lift, all your experiments will give results identical to those you would have obtained in a flat space-time. If you put a ball at some height above your head, it will remain there, as dictated by Newton's first law. Of course, an observer on the ground will explain this differently: he will say that both you and the ball are falling with the same acceleration $g$, and so the relative distance remains fixed (I am assuming that the value of $g$ remains constant over the laboratory, which I can safely do, since it is only an infinitesimal portion of a curved space-time that has a flat metric).
This is a central idea in Einstein's theory of gravity, so let us first see how it came about. Newton's second law says that the applied force is proportional to the acceleration, and let us call the proportionality constant the inertial mass $m_i$ of the body. In a gravitational field, the force on a body is proportional to the acceleration due to gravity. This proportionality constant will be called the gravitational mass $m_g$ of the body. Now $m_i$, a measure of the inertia, is an intrinsic property of the body; it may depend on a lot of things, like chemical composition, electric charge, interatomic distance, etc. On the other hand, $m_g$ is dependent only on the gravitational force. One may expect that the ratio $m_i/m_g$ need not be equal for all bodies. If that be the case, bodies with different composition, dropped from the same height, should not reach the ground simultaneously⁶. But they do, if we neglect the air resistance.

⁶ A good counterexample to gravity is the production of magnetic field by an external current. The same current produces the same auxiliary field $\mathbf{H}$ but the magnetic field $\mathbf{B}$ depends on the magnetisation $\mathbf{M}$ of the material, and so the ratio of the magnitudes of $\mathbf{B}$ and $\mathbf{H}$ is material-dependent.

Figure 7: The Eötvös experiment. (Labels in the figure: $m_{iA} f_v$, $m_{iB} f_v$, $m_{iA} f_h$, $m_{iB} f_h$, $m_{gA}\, g$, $m_{gB}\, g$.)

Are $m_i$ and $m_g$ the same for all bodies⁷? A precise experiment to settle this question was done by the Hungarian baron Roland von Eötvös, which we briefly describe.

⁷ What one expects is a constant ratio. If it is something other than unity, we can scale the force with it.

Consider Fig. 7, which shows a bar loaded with two different bodies A and B. It is supported somewhere in the middle by a wire, with a plane mirror attached to it to detect any possible rotation of the wire (think of a ballistic galvanometer). The bar need not be horizontal. The gravitational force acts downwards, towards the centre of the earth. There is also the centrifugal force generated by the rotation of the earth. Budapest, where the experiment was performed, is not on the equator (it is at about 47.5° north latitude), so the centrifugal acceleration, which is in the plane of the latitude, has both vertical and horizontal components. The downward force is proportional to $m_g$ but the upward force is proportional to $m_i$, so at equilibrium,

$$ l_A \left( m_{gA}\, g - m_{iA} f_v \right) = l_B \left( m_{gB}\, g - m_{iB} f_v \right) . \tag{13} $$

(Can you show, in Fig. 7, $l_A$ and $l_B$?) There is, of course, a horizontal component $f_h$ of the centrifugal acceleration. The horizontal forces on the two bodies may be unequal, so they will produce a torque given by the difference of the two moments:

$$ T = l_A m_{iA} f_h - l_B m_{iB} f_h . \tag{14} $$

Eliminating $l_B$, one gets

$$ T = l_A m_{iA} f_h \left[ 1 - \left( \frac{m_{gA}}{m_{iA}}\, g - f_v \right) \left( \frac{m_{gB}}{m_{iB}}\, g - f_v \right)^{-1} \right] , \tag{15} $$

or, since $f_v \ll g$,

$$ T = l_A f_h m_{gA} \left[ \frac{m_{iA}}{m_{gA}} - \frac{m_{iB}}{m_{gB}} \right] . \tag{16} $$

If the ratio $m_i/m_g$ is different for A and B, there should be a resultant torque. How do we know whether such a torque is actually there, since we cannot stop the earth's rotation by the flip of a switch? The way out is to rotate the whole experiment by 180°. This keeps everything in the vertical direction invariant, but reverses the torque. It is easy to measure whether there is a net rotation of the wire in these two setups. The answer, to a very good precision, is no. This tells you that $m_i = m_g$ for all bodies.
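The elimination of $l_B$ between Eqs. (13) and (14) can be spelled out as follows. This is only a sketch of the intermediate algebra, written with the same symbols as in the text; the last step keeps the leading term for $f_v \ll g$.

```latex
% From Eq. (13): solve for l_B.
\[
  l_B = l_A\, \frac{m_{gA}\, g - m_{iA} f_v}{m_{gB}\, g - m_{iB} f_v}
\]
% Substitute in Eq. (14) and pull out l_A m_{iA} f_h; this is Eq. (15).
\[
  T = l_A m_{iA} f_h \left[ 1 - \frac{m_{iB}}{m_{iA}}\,
        \frac{m_{gA}\, g - m_{iA} f_v}{m_{gB}\, g - m_{iB} f_v} \right]
    = l_A m_{iA} f_h \left[ 1 - \left( \frac{m_{gA}}{m_{iA}}\, g - f_v \right)
        \left( \frac{m_{gB}}{m_{iB}}\, g - f_v \right)^{-1} \right]
\]
% For f_v << g, drop f_v in both brackets; this is Eq. (16).
\[
  T \simeq l_A m_{iA} f_h \left[ 1 - \frac{m_{gA}}{m_{iA}}\, \frac{m_{iB}}{m_{gB}} \right]
    = l_A f_h m_{gA} \left[ \frac{m_{iA}}{m_{gA}} - \frac{m_{iB}}{m_{gB}} \right]
\]
```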
Defining

$$ \eta = \frac{ \dfrac{m_{gA}}{m_{iA}} - \dfrac{m_{gB}}{m_{iB}} }{ \dfrac{1}{2} \left( \dfrac{m_{gA}}{m_{iA}} + \dfrac{m_{gB}}{m_{iB}} \right) } , \tag{17} $$

the latest experiment obtains $\eta = (-0.2 \pm 2.8) \times 10^{-12}$; thus, the equality of the gravitational mass and the inertial mass is one of the most accurate relationships in physics. This is from a laboratory experiment; the lunar laser ranging test, which compares the falling rate of the earth and the moon, improves the number to $\sim 1.5 \times 10^{-13}$.

2.2 The equivalence principle

Suppose there is a system of particles (labelled by $n$, $m$), moving with nonrelativistic velocities under the influence of forces $\mathbf{F}(\mathbf{x}_n - \mathbf{x}_m)$ and an external uniform gravitational field $\mathbf{g}$. The equations of motion are

$$ m_n \frac{d^2 \mathbf{x}_n}{dt^2} = m_n \mathbf{g} + \sum_m \mathbf{F}(\mathbf{x}_n - \mathbf{x}_m) \tag{18} $$

which, seen from another coordinate frame falling freely with the same acceleration $\mathbf{g}$,

$$ \mathbf{x}' = \mathbf{x} - \frac{1}{2} \mathbf{g} t^2 , \qquad t' = t, \tag{19} $$

would look like

$$ m_n \frac{d^2 \mathbf{x}'_n}{dt^2} = \sum_m \mathbf{F}(\mathbf{x}'_n - \mathbf{x}'_m) . \tag{20} $$

So the original observer, using the unprimed coordinates, and his freely falling friend, using the primed coordinates, will have the same laws of motion, except that the former sees a gravitational field and the latter does not. This is what we have qualitatively stressed in the earlier subsection.
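The step from Eq. (18) to Eq. (20) is short enough to write out. The sketch below uses only the transformation (19) and the symbols already in the text.

```latex
% Differentiate Eq. (19) twice with respect to t (= t').
\[
  \frac{d^2 \mathbf{x}'_n}{dt^2} = \frac{d^2 \mathbf{x}_n}{dt^2} - \mathbf{g}
\]
% The shift is the same for every particle, so separations are unchanged.
\[
  \mathbf{x}'_n - \mathbf{x}'_m = \mathbf{x}_n - \mathbf{x}_m
\]
% Insert both in Eq. (18): the m_n g terms cancel, leaving Eq. (20).
\[
  m_n \left( \frac{d^2 \mathbf{x}'_n}{dt^2} + \mathbf{g} \right)
    = m_n \mathbf{g} + \sum_m \mathbf{F}(\mathbf{x}'_n - \mathbf{x}'_m)
  \quad\Longrightarrow\quad
  m_n \frac{d^2 \mathbf{x}'_n}{dt^2} = \sum_m \mathbf{F}(\mathbf{x}'_n - \mathbf{x}'_m)
\]
```

The cancellation works only because the same $m_n$ multiplies both the acceleration and $\mathbf{g}$, that is, because the inertial and the gravitational masses are equal.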
But there is a catch: the deduction assumes $\mathbf{g}$ to be uniform. The gravitational field need not be homogeneous (look at the tides); if the freely falling laboratory is big enough, the distance between two particles falling side by side will gradually decrease as they fall towards the centre of the earth. However, we can always have a flat space-time in an infinitesimal region, so we can say, without any problem,
Experiments in a sufficiently small freely falling laboratory, over a sufficiently short time, give results that are identical to those obtained from the same experiments in an inertial frame in empty space.
This is the equivalence principle. What experiments, or in other words, what physical laws are we talking about? All the laws of nature: Newton's laws of motion, or special relativity if the velocity is relativistic, Maxwell's electrodynamics, or any other laws you can think about. The last phrase, “an inertial frame in empty space”, evidently refers to flat space-time.
We are not discriminating here between the weak equivalence principle, which includes only the laws for freely falling particles (Eqs. (18) and (20)), and the strong equivalence principle, which includes all laws. Since the mass of a body includes the electromagnetic binding energy, and also the strong binding energy for the nucleons, it is expected that these two energies also obey the equivalence principle⁸. In a gravitational field, everything falls with the same acceleration, so a light ray passing near the sun will bend towards the sun. For a light ray grazing the sun, the bending would be 1.75″. Einstein claimed such a bending to be a crucial test for GR, and it was vindicated in the 1919 eclipse experiments, making Einstein synonymous with science.

⁸ What about the gravitational energy itself? The energy is so tiny that such an equivalence is extremely difficult to detect, but hopefully it is also in the ambit of the principle, since the applicability of the equivalence principle to gravity is a crucial step to Einstein's equations of GR.
2.3 Free fall

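Consider a particle moving freely under the influence of gravitational forces. According to the equivalence principle, there is a locally flat freely falling coordinate system $\xi^\alpha$ in which the equation of motion is that of a straight line in space-time, i.e.,

$$ \frac{d^2 \xi^\alpha}{d\tau^2} = 0 \tag{21} $$

where $d\tau$, the proper time, is given by

$$ d\tau^2 = -\eta_{\alpha\beta}\, d\xi^\alpha d\xi^\beta = -ds^2 . \tag{22} $$

Now suppose that we use any other coordinate system $x^\mu$, which can be cartesian or curvilinear, fixed or moving in the laboratory frame, even rotating or accelerating. The $\xi^\alpha$ are functions of the $x^\mu$ (and vice versa), so we can write Eq. (21) as

$$ 0 = \frac{d}{d\tau} \left( \frac{\partial \xi^\alpha}{\partial x^\mu} \frac{dx^\mu}{d\tau} \right) = \frac{\partial \xi^\alpha}{\partial x^\mu} \frac{d^2 x^\mu}{d\tau^2} + \frac{\partial^2 \xi^\alpha}{\partial x^\mu \partial x^\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} . \tag{23} $$

Multiplying this by $\partial x^\lambda / \partial \xi^\alpha$ and using

$$ \frac{\partial \xi^\alpha}{\partial x^\mu} \frac{\partial x^\lambda}{\partial \xi^\alpha} = \delta^\lambda_\mu , \tag{24} $$

we get the equation of motion

$$ \frac{d^2 x^\lambda}{d\tau^2} + \Gamma^\lambda_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = 0, \tag{25} $$

where the affine connection or the Christoffel symbol is defined as⁹

$$ \Gamma^\lambda_{\mu\nu} = \frac{\partial x^\lambda}{\partial \xi^\alpha} \frac{\partial^2 \xi^\alpha}{\partial x^\mu \partial x^\nu} . \tag{26} $$

Eq. (25) is the analogue of force-free motion in a gravitational field; this is called the geodesic equation and the trajectory of the particle is called the geodesic. For a flat space-time, the geodesic is a straight line. One can formulate a variational principle analogous to that in nonrelativistic classical mechanics: The motion of a test particle¹⁰ between two timelike separated points extremises the proper time between them.
Note that the Christoffel symbol is symmetric in its lower indices. However, it can be shown that it is not a mixed tensor of rank 3; the transformation property is different. One way to see this: in flat space it vanishes identically in cartesian coordinates but not in plane polar coordinates (Example 2 of Sec. 2.4), whereas a tensor that vanishes in one coordinate system must vanish in all.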
The proper time may be expressed, using Eq. (12), as

$$ d\tau^2 = -\eta_{\alpha\beta}\, \frac{\partial \xi^\alpha}{\partial x^\mu} dx^\mu\, \frac{\partial \xi^\beta}{\partial x^\nu} dx^\nu = -g_{\mu\nu}\, dx^\mu dx^\nu . \tag{27} $$

For a photon, one cannot use $\tau$, since $d\tau^2$ is zero (the separation of two points on the surface of a light cone is always zero). We have to use some other parameter, say $\sigma \equiv \xi^0$, which is equally good (such parameters are called affine parameters). We replace the standard equations by

$$ \frac{d^2 \xi^\alpha}{d\sigma^2} = 0, \qquad -\eta_{\alpha\beta} \frac{d\xi^\alpha}{d\sigma} \frac{d\xi^\beta}{d\sigma} = -g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = 0, \tag{28} $$

and Eq. (25) as

$$ \frac{d^2 x^\lambda}{d\sigma^2} + \Gamma^\lambda_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = 0 . \tag{29} $$

⁹ The Christoffel symbols are only the symmetric part (in the lower indices) of the affine connections, which are relevant for the parallel transport of vectors. The affine connection can be written as the sum of two parts, one symmetric and one antisymmetric in the lower indices. The antisymmetric affine connection gives a torsion to the space-time. This takes us beyond the Riemannian geometry, so we will not discuss it further. Interested readers may look at Einstein-Cartan gravity. For us, affine connection means the symmetric part.

¹⁰ So that it does not affect the gravitational field.
Q. Show that for a force-free motion in flat 2-dimensional space, the geodesics are straight lines.
Q. Show that
∂2ξα λ ∂ξ
α
= Γµν . (30)
∂xµ ∂xν ∂xλ
(This is nothing but an exercise to check that you know how to handle the Lorentz indices!)
Q. Show that the time dt for a photon to travel a distance dx is determined by

\[ g_{00}\,dt^2 + 2g_{0i}\,dx^i\,dt + g_{ij}\,dx^i dx^j = 0. \tag{31} \]

This is a quadratic equation for dt with two roots. Which one is consistent with the flat space-time result?

2.4 Metric and affine connection

Start from Eq. (12). Differentiating with respect to xλ , we get


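\[ \frac{\partial g_{\mu\nu}}{\partial x^\lambda} = \eta_{\alpha\beta}\,\frac{\partial^2\xi^\alpha}{\partial x^\lambda\,\partial x^\mu}\frac{\partial\xi^\beta}{\partial x^\nu} + \eta_{\alpha\beta}\,\frac{\partial^2\xi^\beta}{\partial x^\lambda\,\partial x^\nu}\frac{\partial\xi^\alpha}{\partial x^\mu}, \tag{32} \]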
which can be written using Eq. (30) as
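\[ \frac{\partial g_{\mu\nu}}{\partial x^\lambda} = \Gamma^\rho_{\lambda\mu}\,\frac{\partial\xi^\alpha}{\partial x^\rho}\frac{\partial\xi^\beta}{\partial x^\nu}\,\eta_{\alpha\beta} + \Gamma^\rho_{\lambda\nu}\,\frac{\partial\xi^\alpha}{\partial x^\mu}\frac{\partial\xi^\beta}{\partial x^\rho}\,\eta_{\alpha\beta} = \Gamma^\rho_{\lambda\mu}\,g_{\rho\nu} + \Gamma^\rho_{\lambda\nu}\,g_{\rho\mu}. \tag{33} \]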
Add to Eq. (33) the same equation with µ and λ interchanged, and subtract the same equation
with ν and λ interchanged. We then have
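\[ \frac{\partial g_{\mu\nu}}{\partial x^\lambda} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\lambda}}{\partial x^\nu} = g_{\rho\nu}\Gamma^\rho_{\lambda\mu} + g_{\rho\mu}\Gamma^\rho_{\lambda\nu} + g_{\rho\nu}\Gamma^\rho_{\mu\lambda} + g_{\rho\lambda}\Gamma^\rho_{\mu\nu} - g_{\rho\lambda}\Gamma^\rho_{\nu\mu} - g_{\rho\mu}\Gamma^\rho_{\nu\lambda}. \tag{34} \]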
Since gµν and Γρµν are both symmetric under the exchange of µ and ν, we can write
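\[ \frac{\partial g_{\mu\nu}}{\partial x^\lambda} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\lambda}}{\partial x^\nu} = 2g_{\rho\nu}\Gamma^\rho_{\lambda\mu}. \tag{35} \]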
Define a matrix g νσ as the inverse of gνσ so that gνσ g σρ = δνρ , and multiply Eq. (35) with g νσ . This
gives a very important relation:
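\[ \Gamma^\sigma_{\lambda\mu} = \frac{1}{2}\,g^{\nu\sigma}\left[\frac{\partial g_{\mu\nu}}{\partial x^\lambda} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\lambda}}{\partial x^\nu}\right]. \tag{36} \]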
It can be shown that all the effects of gravity are contained in the metric gµν and the affine
connections Γλµν , and given the metric, we can use Eq. (36) to obtain the affine connections. We
will now see some examples.
Example 1: Flat space-time, cartesian coordinates
In the cartesian system, we can write ds2 = −dt2 + dx2i , so gµν = g µν = diag(−1, 1, 1, 1). All
derivatives vanish, and so all Christoffel symbols are zero.
Example 2: Flat plane polar coordinates
The Christoffel symbols need not always be zero for flat space (or space-time). Consider, for
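example, $ds^2 = dr^2 + r^2 d\theta^2$. Denote r by 1 and θ by 2, and let us calculate $\Gamma^1_{22} = \Gamma^r_{\theta\theta}$. The metric
is diagonal, so we must set σ = ν = 1, λ = µ = 2, and
\[ \Gamma^1_{22} = \frac{1}{2}\,g^{11}\left[-\frac{\partial g_{22}}{\partial x^1}\right] = -r. \tag{37} \]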

Example 3: Surface of a sphere
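Following the metric in Eq. (5), we see that $g^{11} = 1/a^2$, $g^{22} = 1/(a^2\sin^2\theta)$, and $g^{12} = g^{21} = 0$.
Putting σ = ν = 1 and µ = λ = 2, we get
\[ \Gamma^\theta_{\phi\phi} = \Gamma^1_{22} = \frac{1}{2}\,g^{11}\left[-\frac{\partial g_{22}}{\partial x^1}\right] = -\frac{1}{2}\,a^{-2}\cdot 2a^2\sin\theta\cos\theta = -\sin\theta\cos\theta. \tag{38} \]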

Example 4: The Schwarzschild metric


This is one of the more interesting cases that we will discuss. The geometry has a spherical
symmetry, and is the relevant one for the space-time outside a static spherical mass distribution
(e.g., a star) 11 . The metric is time independent; the geometry does not change with time. The
spherical symmetry argument ensures that in spherical polar coordinate for the spatial part, the
θ-φ sector can be written as r2 dθ2 + r2 sin2 θdφ2 . The time-independence means that g00 and g11
can be functions of the radial coordinate r only. We write
\[ ds^2 = -\left(1-\frac{2GM}{r}\right)dt^2 + \left(1-\frac{2GM}{r}\right)^{-1}dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2, \tag{39} \]

where M is the mass of the gravitating body, and G is Newton’s gravitational constant 12 . Since
$g_{\mu\nu}$ is diagonal, so is $g^{\mu\nu}$, and its elements are the reciprocals of the corresponding
elements of $g_{\mu\nu}$. For diagonal metrics, σ must be equal to ν. Let us try to find $\Gamma^r_{rr} = \Gamma^1_{11}$, so that
ν = σ = µ = λ = 1:
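\[
\begin{aligned}
\Gamma^r_{rr} &= \frac{1}{2}\,g^{11}\left[\frac{\partial g_{11}}{\partial x^1} + \frac{\partial g_{11}}{\partial x^1} - \frac{\partial g_{11}}{\partial x^1}\right] = \frac{1}{2}\,g^{11}\,\frac{\partial g_{11}}{\partial x^1} \\
&= \frac{1}{2}\left(1-\frac{2GM}{r}\right)\left(-\frac{2GM}{r^2}\right)\frac{1}{\left(1-\frac{2GM}{r}\right)^2} = -\frac{GM}{r^2}\,\frac{1}{1-\frac{2GM}{r}}.
\end{aligned}
\tag{40}
\]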

Before we move on to the next example, note that something interesting seems to happen for r = 0
and r = 2GM (if c ≠ 1, for r = 2GM/c²). We will come to this later.
Example 5: The Friedmann-Robertson-Walker (FRW) metric

Figure 8: The expanding grid.

Suppose the metric is not constant in time; the coordinate grid expands with time (see Fig. 8).
Such coordinates are called comoving coordinates. Consider a 2-dimensional rectangular grid that
11
Stars, in general, are not static. They have a significant angular momentum; the Sun takes about a month to
rotate on its axis, and there are highly dense objects that take about a millisecond.
12
Some texts, like Hartle, relate mass with length and time by putting G = 1. While this is a perfectly valid
procedure, this makes me a little uncomfortable, since I am accustomed to use h̄ = 1. Gravity is classical, so general
relativists do not need h̄ — except people like Stephen Hawking — but we will keep both G and h̄, and treat mass
differently from length and time, which are related by c = 1.

expands with time. Two points at (x1 , y1 ) and (x2 , y2 ) will always maintain a constant coordinate
distance, but the physical distance between them increases. Such a metric is relevant for us since
this is what the metric of the universe should look like — we all know, from the days of Edwin
Hubble, that the universe expands.
Let us use a metric
\[ ds^2 = -dt^2 + a^2(t)\left[\frac{1}{1-kr^2}\,dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2\right], \tag{41} \]
where a(t) is called the scale factor. Note that r is dimensionless and the length dimension is
carried by a(t), which determines the physical distance between two grid points, and acts as the
distance measure in cosmology. By rescaling the radial coordinate r, we can make the constant k
take discrete values of 0, +1, or −1 only, for flat, closed and open universes respectively 13 . We
have every reason to believe that we are in the k = 0 universe.
Again, the metric is diagonal, so let us quickly work out a few affine connections.
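\[
\begin{aligned}
\Gamma^t_{\phi\phi} = \Gamma^0_{33} &: \ (\nu=\sigma=0,\ \mu=\lambda=3) \ \Rightarrow\ \Gamma^0_{33} = \frac{1}{2}\,g^{00}\left[-\frac{\partial g_{33}}{\partial x^0}\right] = a\dot{a}\,r^2\sin^2\theta, \\
\Gamma^t_{rr} = \Gamma^0_{11} &: \ (\nu=\sigma=0,\ \mu=\lambda=1) \ \Rightarrow\ \Gamma^0_{11} = \frac{1}{2}\,g^{00}\left[-\frac{\partial g_{11}}{\partial x^0}\right] = \frac{a\dot{a}}{1-kr^2}, \\
\Gamma^r_{tr} = \Gamma^1_{01} &: \ (\nu=\sigma=1,\ \mu=1,\ \lambda=0) \ \Rightarrow\ \Gamma^1_{01} = \frac{1}{2}\,g^{11}\,\frac{\partial g_{11}}{\partial x^0} = \frac{\dot{a}}{a}.
\end{aligned}
\tag{42}
\]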
(Here, ȧ = da(t)/dt.)
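The worked examples above can be cross-checked numerically. The following is only a minimal sketch, not part of the original notes: it evaluates Eq. (36) for a diagonal metric using central finite differences. The function name `christoffel`, the step size `h`, the sample points, the sphere radius `A`, the choice GM = 1, and the scale factor a(t) = t^(2/3) with k = 0 are all assumptions made for illustration. Note that Python indices start from 0, so in the two-dimensional examples r (or θ) is index 0 and θ (or φ) is index 1.

```python
import math

def christoffel(metric_diag, x, sigma, lam, mu, h=1e-5):
    """Gamma^sigma_{lam mu} from Eq. (36), valid for DIAGONAL metrics only.

    metric_diag(x) returns the list of diagonal elements of g at the point x.
    Partial derivatives are approximated by central differences of step h.
    """
    def dg(a, b, c):
        # d g_{ab} / d x^c; off-diagonal elements vanish identically
        if a != b:
            return 0.0
        xp, xm = list(x), list(x)
        xp[c] += h
        xm[c] -= h
        return (metric_diag(xp)[a] - metric_diag(xm)[a]) / (2.0 * h)

    # For a diagonal metric only nu = sigma survives in the sum over nu.
    g_inv = 1.0 / metric_diag(x)[sigma]
    return 0.5 * g_inv * (dg(mu, sigma, lam) + dg(lam, sigma, mu) - dg(mu, lam, sigma))

def polar(x):                      # Example 2, x = (r, theta)
    return [1.0, x[0] ** 2]

A = 2.0                            # assumed sphere radius
def sphere(x):                     # Example 3, x = (theta, phi)
    return [A ** 2, (A * math.sin(x[0])) ** 2]

GM = 1.0                           # assumed units with GM = 1
def schwarzschild(x):              # Example 4, x = (t, r, theta, phi)
    f = 1.0 - 2.0 * GM / x[1]
    return [-f, 1.0 / f, x[1] ** 2, (x[1] * math.sin(x[2])) ** 2]

def frw(x):                        # Example 5 with assumed a(t) = t^(2/3), k = 0
    a2 = x[0] ** (4.0 / 3.0)
    return [-1.0, a2, a2 * x[1] ** 2, a2 * (x[1] * math.sin(x[2])) ** 2]

r, theta, t = 5.0, 0.7, 3.0        # arbitrary sample point
gamma_polar = christoffel(polar, [r, theta], 0, 1, 1)                  # Eq. (37)
gamma_sphere = christoffel(sphere, [theta, 0.3], 0, 1, 1)              # Eq. (38)
gamma_schw = christoffel(schwarzschild, [0.0, r, theta, 0.3], 1, 1, 1) # Eq. (40)
gamma_frw = christoffel(frw, [t, 0.5, theta, 0.3], 1, 0, 1)            # Eq. (42), third line

print(gamma_polar, -r)
print(gamma_sphere, -math.sin(theta) * math.cos(theta))
print(gamma_schw, -(GM / r ** 2) / (1.0 - 2.0 * GM / r))
print(gamma_frw, 2.0 / (3.0 * t))
```

Each printed pair should agree to within the finite-difference error. A non-diagonal metric would need a full matrix inverse in place of `g_inv`, which this sketch deliberately avoids.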
Q. Find, if any, the nonzero affine connections of a 2-dimensional spherical geometry, other than that in Eq.
(38).
Q. Show that for the Schwarzschild geometry, one gets the following nonzero Christoffel symbols:
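\[
\begin{aligned}
&\Gamma^t_{tr} = -\Gamma^r_{rr} = \frac{GM}{r^2}\left(1-\frac{2GM}{r}\right)^{-1}, \qquad \Gamma^r_{tt} = \frac{GM}{r^2}\left(1-\frac{2GM}{r}\right), \qquad \Gamma^r_{\theta\theta} = -(r-2GM), \\
&\Gamma^r_{\phi\phi} = -(r-2GM)\sin^2\theta, \qquad \Gamma^\theta_{r\theta} = \Gamma^\phi_{r\phi} = \frac{1}{r}, \qquad \Gamma^\theta_{\phi\phi} = -\cos\theta\sin\theta, \qquad \Gamma^\phi_{\theta\phi} = \cot\theta.
\end{aligned}
\tag{43}
\]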
Q. Repeat the same problem for the FRW metric. Note that a Christoffel symbol is trivially zero, for a
diagonal metric, if all its indices are different. Also, it must be symmetric in its lower indices. Write down
all possible independent nonzero combinations. It turns out that 13 of them are nonzero. Three of these 13
have been worked out in Eq. (42). Find the remaining 10. Look at Appendix B, p. 547, of Hartle, if you get stuck.
Q. Show that, in the FRW geometry, one can write

\[ ds^2 = -dt^2 + a^2(t)\left[d\chi^2 + S^2(\chi)\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right], \tag{44} \]

where S(χ) = χ, sin χ, sinh χ for k = 0, +1, −1 respectively.


Q. Suppose one defines some time coordinate $\eta = \int dt/a(t)$. Show that the FRW metric can be written as
\[ ds^2 = a^2(\eta)\left[-d\eta^2 + \frac{1}{1-kr^2}\,dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2\right]. \tag{45} \]

The coordinate η is known as conformal or scale-invariant time as apart from the external scale factor a(η),
the relationship between spatial and conformal time coordinates always remains the same. Suppose f is any
function, $\dot{f} = df/dt$ and $f' = df/d\eta$. Show that
\[ \dot{f} = \frac{f'}{a}, \qquad \ddot{f} = \frac{f''}{a^2} - \frac{f'a'}{a^3}. \tag{46} \]

13
What is important to note is that k is positive for a closed universe and negative for an open universe.

2.5 Weak gravity: the Newtonian limit

Suppose the motion of a particle is non-relativistic, so that $|d\mathbf{x}/d\tau| \ll dt/d\tau$ (remember c = 1).
This simplifies the geodesic Eq. (25) to
\[ \frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{00}\left(\frac{dt}{d\tau}\right)^2 = 0. \tag{47} \]
We assume the gravitational field to be stationary, so that all time derivatives of gµν vanish:

\[ \Gamma^\mu_{00} = -\frac{1}{2}\,g^{\mu\nu}\,\frac{\partial g_{00}}{\partial x^\nu}. \tag{48} \]
Since the field is weak, we can assume the metric to be “almost” cartesian:

\[ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad |h_{\mu\nu}| \ll 1, \tag{49} \]

so that
\[ \Gamma^\mu_{00} = -\frac{1}{2}\,\eta^{\mu\nu}\,\frac{\partial h_{00}}{\partial x^\nu}. \tag{50} \]
Suppose µ = i, a spatial index. The Minkowski metric forces ν = i. So we can write
\[ \frac{d^2x^i}{d\tau^2} - \frac{1}{2}\left(\frac{dt}{d\tau}\right)^2\nabla_i h_{00} = 0, \qquad \frac{d^2t}{d\tau^2} = 0. \tag{51} \]

The second equation, which follows from Γ000 = 0 as the metric is time-independent, implies dt/dτ =
constant. Dividing by (dt/dτ )2 , we get

\[ \frac{d^2x^i}{dt^2} = \frac{1}{2}\,\nabla_i h_{00}. \tag{52} \]
Compare this with the corresponding Newton’s equation,

\[ \frac{d^2x^i}{dt^2} = -\nabla_i\phi \tag{53} \]
where φ is the gravitational potential. At a distance r from a spherical mass distribution with total
mass M , it is φ = −GM/r. We can write

h00 = −2φ + constant. (54)

If we expect the effect of gravity to die down at infinity so that at infinity, gµν = ηµν , h00 must be
zero at r = ∞. This gives
h00 = −2φ, g00 = −(1 + 2φ). (55)
The value of φ is indeed small. On the surface of a proton, it is of the order of $10^{-39}$, so particle
physics can safely neglect gravity; it is of the order of $10^{-9}$, $10^{-6}$ and $10^{-4}$ respectively on the
surfaces of the earth, the sun, and a typical white dwarf star.
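These orders of magnitude are easy to reproduce. The sketch below is an illustration, not part of the original notes: it evaluates the dimensionless magnitude GM/(Rc²) of the surface potential. The function name `phi_magnitude` is hypothetical, the proton, earth and white dwarf masses and radii are rough assumed values, and the solar values are the rounded ones used in these notes.

```python
G = 6.674e-11   # Newton's constant in SI units, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light in m/s

def phi_magnitude(mass_kg, radius_m):
    """Magnitude of the dimensionless surface potential, G M / (R c^2)."""
    return G * mass_kg / (radius_m * C ** 2)

# (mass in kg, radius in m); all entries are rough, assumed values
bodies = {
    "proton": (1.673e-27, 0.85e-15),
    "earth": (5.972e24, 6.371e6),
    "sun": (2.0e30, 0.7e9),
    "white dwarf": (2.0e30, 7.0e6),  # about one solar mass within ~7000 km
}

for name, (mass, radius) in bodies.items():
    print(f"{name:12s} |phi| ~ {phi_magnitude(mass, radius):.1e}")
```

With these inputs the four values come out at the orders of magnitude quoted above; the white dwarf entry in particular depends on the radius one assumes.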
Q. From dimensional analysis, show that φSI /c2 gives our φ.
Q. What is the value of G in the c = 1 system? If the mass and the radius of the sun are $2\times10^{30}$ kg and
$0.7\times10^{9}$ m respectively, calculate φ on the surface of the sun.
Q. What should be the radius and density of the sun to have a strong gravity (i.e. |h00 | = 1)?

2.6 The Einstein-Hilbert action
In a flat space-time, the action is $S = \int d^4x\, L(x, \partial_\mu x)$. In the presence of gravity, one writes
\[ S = \int d^4x\,\sqrt{-g}\; L(x, \partial_\mu x), \tag{56} \]

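which describes how matter couples to gravity. (Strictly speaking, the name Einstein-Hilbert action is reserved for the purely gravitational term, in which $L$ is replaced by the curvature scalar $R$ divided by $16\pi G$; curvature is taken up in Sec. 4.1.) Here $g = \det(g_{\mu\nu})$; in the metric that we use, this is
negative, so we take the square root of $-g$. For flat space-time, $g = \det(\eta_{\mu\nu}) = -1$. Note that the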
distortion in the geometry from a flat space-time is contained in g, so that is how gravity couples
with the dynamics given by L. For example, if we consider the coupling of a real scalar field φ with
gravity, we should write
\[ S = \int d^4x\,\sqrt{-g}\left[\frac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi) - \frac{1}{2}m^2\phi^2\right]. \tag{57} \]

2.7 Gravitational redshift: “Apparent weight of photons”

Einstein suggested three tests for GR. Among them, the bending of light by the sun, or other gravitating
bodies, has been well studied. In fact, we now know that light from very distant sources can bend
while passing through an intermediate galaxy; this effect is called gravitational lensing. The second
test, the perihelion shift of Mercury, yielded results that are consistent with GR. There have been
other tests, not even discussed (or thought of) by Einstein, which vindicated his theory.
However, the third test that Einstein thought of, turned out not to be a test of GR but of a far more
general principle — the equivalence principle that we have just studied. Remember that there can
be many theories of gravity embedded in a curved space-time that satisfy the equivalence principle;
the Princeton astrophysicist Robert Dicke and his student Carl Brans gave such a theory that was
popular in the 1960s and 70s. All these theories will pass this test, known as the gravitational
redshift.
Consider a clock in a gravitational field (not necessarily in free fall). According to the equiva-
lence principle, the separation ∆t between two ticks as seen by an observer in that field is identical
to what is observed in a locally inertial coordinate frame, so that

\[ \Delta t = \left(-\eta_{\alpha\beta}\,d\xi^\alpha d\xi^\beta\right)^{1/2}, \tag{58} \]

which can be written as


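\[ \Delta t = \left(-\eta_{\alpha\beta}\,\frac{\partial\xi^\alpha}{\partial x^\mu}\,dx^\mu\,\frac{\partial\xi^\beta}{\partial x^\nu}\,dx^\nu\right)^{1/2} = \left(-g_{\mu\nu}\,dx^\mu dx^\nu\right)^{1/2}. \tag{59} \]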

Suppose the clock is moving with a four-velocity dxµ /dτ . The time interval dt between two ticks
is given by
\[ \frac{\Delta t}{dt} = \left(-g_{\mu\nu}\,\frac{dx^\mu}{dt}\frac{dx^\nu}{dt}\right)^{1/2}. \tag{60} \]
If the clock is at rest, the time dilation just for the gravitational field is

\[ dt = \Delta t\,(-g_{00})^{-1/2}. \tag{61} \]

Unfortunately, you cannot observe this; the gravitational field affects the clock (like an atom emit-
ting a particular frequency) and the standard measuring device equally, so there is no visible effect.
However, if the atom is at position 1 and the measuring device is at position 2, with different
gravitational fields, the situation is interesting. The device compares the tick interval dt for the

atom at position 1 with that coming from a similar atom sitting next to the device at position 2.
The gravitational time dilations at these two positions are

\[ dt_1 = \Delta t\,\bigl(-g_{00}(x_1)\bigr)^{-1/2}, \qquad dt_2 = \Delta t\,\bigl(-g_{00}(x_2)\bigr)^{-1/2}. \tag{62} \]

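We do not know ∆t, as we have no idea what the separation would have been in free space, but ∆t
can be eliminated from the above equation. In a stationary field the coordinate time between successive
crests does not change as the light travels from 1 to 2, so the frequency ν1 of the light arriving from
position 1 and the frequency ν2 of the reference atom at position 2, both measured at position 2, are related by
\[ \frac{\nu_2}{\nu_1} = \left(\frac{g_{00}(x_2)}{g_{00}(x_1)}\right)^{1/2} \ \Rightarrow\ \frac{\Delta\nu}{\nu} = \phi(x_2) - \phi(x_1), \tag{63} \]
where $\nu_2/\nu_1 = 1 + \Delta\nu/\nu$, $g_{00} = -1 - 2\phi$, and we have made a binomial expansion in the weak field
limit. If position 1 lies deeper in the potential, so that φ(x1) < φ(x2), the light arriving from there has the
lower frequency. Thus, you can think of a redshifted photon as one that has spent some of its energy to climb out
of a gravitational well.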
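The binomial expansion mentioned here is short enough to write out. Assuming only $g_{00} = -1 - 2\phi$ and $|\phi| \ll 1$, and keeping terms of first order in φ, a sketch of the step is:

```latex
\[
\frac{\nu_2}{\nu_1}
= \left(\frac{1 + 2\phi(x_2)}{1 + 2\phi(x_1)}\right)^{1/2}
\approx \bigl(1 + \phi(x_2)\bigr)\bigl(1 - \phi(x_1)\bigr)
\approx 1 + \phi(x_2) - \phi(x_1).
\]
```

Comparing with $\nu_2/\nu_1 = 1 + \Delta\nu/\nu$ gives the second form of Eq. (63); the neglected terms are of second order in φ.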
To visualise this, think of a tall tower. Suppose there is a clock at the top and another at
the bottom of the tower; will they keep identical time? The clocks are not moving, so there is
no question of time dilation. To put it in another way, suppose an atom at the top of the tower
is emitting electromagnetic radiation with a frequency ν. Will the frequency still be ν when a
detector detects the radiation at the bottom of the tower? From the preceding discussion, we know
that there will be a shift, which follows from the equivalence principle.
Consider an observer falling freely from the top of the tower; the radiation is emitted just as he
passes the source. The equivalence principle tells us that in the frame of the observer, there is
no gravity; the space-time is flat, and all postulates of special relativity are satisfied. So, to the
observer, the radiation still has the frequency ν. But to him the detector on the ground is coming
up, and so it will catch more crests of the radiation per second than if it were sitting
still. This is analogous to the well-known Doppler effect; the detected frequency νd will be more
than ν, hence the radiation will be blue-shifted. If the emitter is placed on the ground and the
detector is at the top of the tower, a freely falling observer will see the detector receding, and hence
the radiation will be red-shifted.
Whichever way the shift is, the phenomenon is known as the gravitational redshift. In the SI
system, Eq. (63) becomes
\[ \nu_d = \left(1 + \frac{\phi_e - \phi_d}{c^2}\right)\nu, \tag{64} \]
where φe (φd ) is the gravitational potential at the position of the emitter (detector). Note that we
have introduced the factor of c; this is because in a laboratory experiment, φe − φd is measured in
conventional units. Both φe and φd are negative, but if the emitter is above the detector, |φe | < |φd |;
hence in Eq. (64) the quantity in bracket is positive and the frequency is blue-shifted. If the height
difference is h, φe − φd = gh.
In 1960, Pound and Rebka published a paper in Physical Review Letters with the enigmatic title
“Apparent weight of photons”; what they actually did was to measure the gravitational redshift.
They used the 22.5 m high tower of the Jefferson Physical Laboratory at Harvard University. The
signal was the 14.4 keV γ-ray emitted in the decay of the unstable $^{57}$Fe nucleus. The detector, at the
bottom, was again some $^{57}$Fe, which is expected to absorb these γ-rays (at least, the small fraction
that reaches them) by the opposite reaction. If the frequency is shifted, the absorption will not take
place with the same efficiency. The source was moved upwards slowly, with a constant velocity,
so that there is a Doppler redshift superposed on the gravitational blueshift. One can change
the Doppler shift by varying the velocity of the source, and monitor the absorption efficiency;
the efficiency peaks where the emitted frequency is the same as the absorbed one, and hence the
Doppler shift (which is calculable) gives the gravitational blueshift.

Figure 9: The Pound-Rebka experiment.

The experiment was not an easy one. The potential difference, gh, divided by c², gives a number
of the order of $10^{-15}$. The width of the γ-ray line is orders of magnitude larger than this. There are
several reasons for this. First, the nucleus has an inherent motion. Second, the nucleus recoils
uncontrollably after the emission of the γ-ray. Both these effects can be taken care of by ‘locking’
the emitter nucleus in a crystal lattice, so that the motion is minimised and the recoil is absorbed by
the whole lattice (this is known as the Mössbauer effect). Pound and Rebka also interchanged the
positions of the emitter and the absorber to eliminate some systematic errors, and finally confirmed
the prediction for the gravitational redshift within about 10%; the follow-up experiment of Pound and Snider improved this to about 1%.
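To get a feeling for the numbers, the fractional shift gh/c² and the source velocity whose first-order Doppler shift v/c would cancel it can be estimated as follows. This is only an illustrative sketch: the function name `fractional_shift` is hypothetical, and g = 9.81 m/s² is an assumed standard value.

```python
G_ACC = 9.81    # assumed acceleration due to gravity, m/s^2
C = 2.998e8     # speed of light, m/s

def fractional_shift(height_m):
    """Weak-field gravitational frequency shift, delta(nu)/nu = g h / c^2."""
    return G_ACC * height_m / C ** 2

tower_height = 22.5                       # height of the Jefferson tower, m
shift = fractional_shift(tower_height)    # of the order of 1e-15
compensating_velocity = C * shift         # first-order Doppler: v/c = delta(nu)/nu

print(f"fractional shift      ~ {shift:.2e}")
print(f"compensating velocity ~ {compensating_velocity:.2e} m/s")
```

The compensating velocity comes out below a micrometre per second, which gives some idea of how delicate the velocity control had to be.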
Q. Heartbeat is just like a clock, and if the heart beats faster, the person ages more quickly. If your room
is approximately 40 m higher than mine, who will age more quickly, and how much?

3 The Schwarzschild Metric

The Schwarzschild metric is given in Eq. (39):


\[ ds^2 = -\left(1-\frac{2GM}{r}\right)dt^2 + \left(1-\frac{2GM}{r}\right)^{-1}dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2. \tag{65} \]
It is independent of t and has a spatial spherical symmetry, as is evident from its polar coordinates.
How does one get this metric? It is a very nontrivial job to deduce the metric from the Einstein
equations, and you may consult any standard textbook on GR if you are interested. To derive the
metric, one imposes the symmetry conditions and gets a general form. For example, one may deduce
that $g_{tt}$ and $g_{rr}$ will be functions of r only. What one generally does is to check that the metric is
indeed a solution of the Einstein equation. We will later see that the Schwarzschild geometry is a
solution for the Einstein equation in the absence of any matter or energy (i.e., vacuum). Such a
vacuum solution is acceptable for the geometry outside a spherical mass, but not for the geometry
inside it.
Remember that the Schwarzschild coordinate r is not the distance from any “centre”. Rather,
it is related to the area A of a two-dimensional sphere with fixed r and t by $A = 4\pi r^2$. If GM/r is
small, one can write
\[ ds^2 \approx -\left(1-\frac{2GM}{r}\right)dt^2 + \left(1+\frac{2GM}{r}\right)dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2. \tag{66} \]
This is just the Newtonian static weak field metric (compare gtt = g00 with that of Eq. (55)) with
the gravitational potential φ = −GM/r, so we can identify M with the total mass of the source

of curvature, or approximately just the total mass of the gravitating body (approximately, because
the gravitational energy of the body also contributes to the source of curvature, but we may safely
neglect such subtleties).
The metric apparently has singularities at r = 0 and r = 2GM (in SI, 2GM/c²). The latter is
called the Schwarzschild radius Rs and is the characteristic length scale for curvature of the metric,
just as the radius of a sphere is the characteristic length scale for its curvature. However, there is
no astronomical body (with the possible exception of a black hole, which we will soon talk about)
for which Rs lies outside the body. For example, the sun has $R_s = 2GM_\odot/c^2 \approx 3$ km, well inside
the interior of the sun, where the vacuum solution is simply not valid!
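The value quoted for the sun follows directly from Rs = 2GM/c². A minimal sketch of the arithmetic, with a hypothetical function name `schwarzschild_radius` and assumed standard values for the masses of the sun and the earth, is:

```python
G = 6.674e-11   # Newton's constant in SI units, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius R_s = 2 G M / c^2, in metres."""
    return 2.0 * G * mass_kg / C ** 2

M_SUN = 1.989e30     # assumed solar mass, kg
M_EARTH = 5.972e24   # assumed mass of the earth, kg

rs_sun = schwarzschild_radius(M_SUN)       # about 3 km
rs_earth = schwarzschild_radius(M_EARTH)   # about 9 mm

print(f"sun:   R_s ~ {rs_sun / 1e3:.2f} km")
print(f"earth: R_s ~ {rs_earth * 1e3:.2f} mm")
```

Both radii are tiny compared with the actual sizes of the bodies, which is why the apparent singularity at r = Rs never shows up for ordinary stars and planets.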
Even if we have a body whose radius is less than its Rs , we do not expect to see any singularity.
This is because the singularity is only apparent, an artifact of the choice of coordinates, as Eddington
pointed out, and we can go to some other coordinate system where the singularity is absent 14 . We
can also calculate the intrinsic properties of the space-time — the curvature quantities, and none of
them shows any singularity! On the other hand, r = 0 is a genuine singularity; curvatures become
infinite. In GR we meet such singularities in a number of places. Most notable among them is
the singularity at the origin of the universe, the so-called Big Bang at time t = 0, whose possible
resolution may lie only in a quantum theory of gravity. For now, we must live with the singularity
and the question of what happened at t = 0 is simply not well-defined. However, remember that
our ignorance can never be an argument for invoking a supernatural creator.

3.1 Black holes

A star shines by hydrogen burning. It is in a state of dynamic equilibrium, between an outward
radiation pressure and an inward gravitational pressure. When the hydrogen runs out, the star
starts to collapse under gravitation. This ignites the helium that has been formed in the core, and
the star again reaches another equilibrium. This process can continue till the core contains only
$^{56}$Fe, which cannot be further ignited (can you say why?). Thus, no further nuclear fusion is
available to resist the gravitational collapse.
The collapsing star, however, may be kept in equilibrium by a nonthermal source of pressure.
One common example is the Fermi pressure that builds up because the electrons, loosely speaking,
repel each other as a consequence of Fermi statistics. This happens when the electron wavefunctions
start to overlap, the same mechanism that causes the band structure in solids. A star which is
held in equilibrium by the electron Fermi pressure is known as a white dwarf star. Its typical
radius is a few thousand kilometres, and its maximum mass can be about 1.4 times the solar mass.
This is known as the Chandrasekhar limit, worked out in 1930 by the young Indian astrophysicist
Subrahmanyan Chandrasekhar. Thus, if the electron Fermi pressure is to keep a star in balance,
its maximum mass can only be about $1.4\,M_\odot$.
A more massive star undergoes further compression — the electron Fermi pressure is not enough
to resist the gravitational collapse. In such a star, protons and electrons combine to form neutrons
(and neutrinos, which escape). Such stars are called neutron stars, and they are highly interesting
objects. The typical radius of a neutron star is about 10 km, and it generally rotates very rapidly
14
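One such system is the Kruskal-Szekeres coordinates, where t and r are replaced by
\[ V = \sqrt{\frac{r}{2GM}-1}\;\exp\!\left(\frac{r}{4GM}\right)\sinh\!\left(\frac{t}{4GM}\right), \qquad U = \sqrt{\frac{r}{2GM}-1}\;\exp\!\left(\frac{r}{4GM}\right)\cosh\!\left(\frac{t}{4GM}\right), \tag{67} \]
for r > 2GM , and
\[ V = \sqrt{1-\frac{r}{2GM}}\;\exp\!\left(\frac{r}{4GM}\right)\cosh\!\left(\frac{t}{4GM}\right), \qquad U = \sqrt{1-\frac{r}{2GM}}\;\exp\!\left(\frac{r}{4GM}\right)\sinh\!\left(\frac{t}{4GM}\right), \tag{68} \]
for 0 < r < 2GM , and $V^2 - U^2 = (1 - r/2GM)\exp(r/2GM)$.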

about its axis (which is just conservation of angular momentum; if radius goes down by a factor of
$10^5$, the angular velocity increases by $10^{10}$ if no mass is lost).
The maximum mass of a neutron star was first estimated to be about 0.7 solar mass, so some mass is shed when the
star collapses beyond the white dwarf stage. Modern estimates put this limit a bit higher; a star
can reach a nonthermal equilibrium if its mass is not greater than something between 1.5 and 3
$M_\odot$, taking into account all the uncertainties in such calculations (this is known as the Tolman-
Oppenheimer-Volkoff limit). As most of the mass is not retained, one estimates the starting mass
to be around 15-20 $M_\odot$. But there are a lot of stars in the sky which are more massive than this.
Some of them shed a large fraction of their mass by violent explosion; still there must be some
stars which have run out of nuclear fuel, while the mass is still greater than $3M_\odot$. What happens
to them?
The answer is that they are in a state of ongoing gravitational collapse. The radius of such a
star shrinks quickly, goes below the Schwarzschild radius Rs = 2GM , and ultimately goes on to hit
the r = 0 singularity. What happens at the singularity? Nobody knows. Let us see why.
We know that light rays bend in a curved space-time. In particular, it is not very difficult
to calculate the bending for a Schwarzschild geometry. For a light ray grazing the sun, it is 1.75
arcseconds. The geometry is much more violently curved near a body whose radius is close to Rs
(remember that Rs = 3 km for the sun and about 9 mm for the earth). Light rays emitted from the surface
of such a collapsing star bend, but can escape the star and come to us as long as the radius is
greater than Rs . When the radius is below Rs , the light rays get permanently trapped — the
bending is so large that emitted rays fall back on the star, nothing comes out, and there is no way
to obtain any information about the star except to observe its nearby geometry. Such a body is
called a black hole, and the radius Rs is known as the event horizon 15 .
The light ray emitted at r = Rs does not fall back on the star, neither does it come to us —
it goes around the star like a satellite. What should one see from outside? In a very short period
of time (typically $10^{-5}$ s) the collapse of a star slows down, the emitted light gets more and
more red-shifted until the red shift reaches infinity, the star grows dark, and the geometry outside
becomes indistinguishable from a Schwarzschild geometry. All history of the star will be erased; a
Schwarzschild black hole is characterised only by its mass M .

Figure 10: The curvature of space-time around a black hole. Note the bottomless well nature after
the event horizon.

This is true only for a static charge-neutral black hole. Apart from mass, a black hole can have
a nonzero angular momentum; such a rotating object is known as a Kerr black hole. It can also
be static and charged (a Reissner-Nordström black hole), or rotating and charged (a Kerr-Newman
15
The event horizon is defined as the boundary in space-time for which events inside the horizon can never affect
the outside observer. In other words, once one crosses the event horizon, there is no way to come back, even for light.
Only for Schwarzschild black holes, the event horizon can be identified with the Schwarzschild radius Rs . For other
types of black holes, the relationship is different.

black hole), but that is all for a classical black hole. To any outside observer, it is completely
specified by its mass, angular momentum, and electric charge. No other information that passes
through the unidirectional event horizon can ever be restored; this is cheekily referred to as the no hair
theorem.
Are there such black holes in our galaxy? We have preliminary evidence for some. If an ordinary
star is near a black hole, the matter from the star gets sucked up by the black hole, and the falling
matter emits X-rays, which is an indirect signal of such a black hole. We have evidence for some
such X-ray signals. The black hole can act as a gravitational lens; the light coming from a galaxy
behind the black hole bends around it so that we see a double image, which is called microlensing.
There may be a very massive black hole (about 4 million solar masses) at the centre of our galaxy,
in the direction of the constellation Sagittarius. Some stars die in a violent explosion called a
supernova; the remnant of such a supernova may be a neutron star, or even a black hole. Another
very important piece of evidence, the generation of gravitational waves from the merger of two black holes,
has been discovered recently; more on that later.

Figure 11: X-ray from an accreting black hole, and the microlensing effect.

3.2 Black hole thermodynamics

Black holes pose a serious problem to ordinary thermodynamics. They suck in everything that comes
within the event horizon. So one expects their mass to increase forever. According to the no hair
theorem, it should be completely specified by its mass, charge, and angular momentum. What
about the entropy? If we assume the entropy to be zero, like a classical particle, we will be in
trouble; suppose some system with a high entropy passes the event horizon and becomes a part
of the black hole, so the entropy of the universe goes down, in contradiction to the second law
of thermodynamics. Thus, it is necessary to think about a nonzero entropy for a black hole.
Bekenstein and Hawking showed that the entropy is proportional to the area of the event horizon,
and the area always increases for allowed physical processes, so there is no problem with the second
law. This gives rise to the three laws for black hole thermodynamics (plus a zero-th law), in a
perfectly analogous way to ordinary thermodynamics.

• Zero-th law: The surface gravity κ on the event horizon is a constant.

• First law: Suppose a black hole in equilibrium is perturbed. The change in energy is given
by
\[ dE = \frac{\kappa}{8\pi G}\,dA + \Omega\,dJ + \Phi\,dQ \tag{69} \]
where A, Ω, J, Φ and Q are the surface area of the event horizon, the angular velocity, the
angular momentum, the electrostatic potential and the electric charge respectively. The
change in mass and area are related: dM = κ dA/8πG. Depending on what type of black

hole we consider, some of these terms may be zero. As the energy perturbation changes the
outside geometry which can be felt at a distance, only the information about mass, angular
momentum, and electric charge can be retrieved. This is, thus, a mathematical form of the
no hair theorem.
• Second law: The area of the event horizon is a non-decreasing function of time: dA/dt ≥ 0.
The analogy with entropy is obvious.
• Third law: It is not possible to have a black hole with κ = 0.

The second law is true for a classical black hole but is in general violated by quantum phenomena;
a black hole may lose mass through a quantum field theoretic process called Hawking radiation. The radiation
follows a perfect black-body pattern, so it is meaningful to talk about the temperature of the black
hole, which comes as the fourth parameter to specify such an object. Ultimately the black hole will
evaporate, but the lifetime is extremely large:
\[ \tau_{\rm bh} = 8.3\times10^{-26}\left(\frac{M_{\rm bh}}{1~{\rm gm}}\right)^3\ {\rm s}, \tag{70} \]
so that there is not much chance for us to see an evaporating black hole. In fact, given that the
universe is about 13.7 billion years old, only small black holes of mass $\sim 10^{14}$ g formed at the time
of the big bang (they are called primordial black holes) are evaporating now.
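The figure of about 10^14 g can be checked by inverting Eq. (70). The sketch below is only an illustration: the function names and the conversion of 3.156 × 10^7 seconds per year are assumptions, while the coefficient is the one quoted in Eq. (70).

```python
COEFF_S = 8.3e-26        # coefficient of Eq. (70), in seconds
SECONDS_PER_YEAR = 3.156e7

def lifetime_s(mass_g):
    """Evaporation time from Eq. (70) for a black hole of the given mass in grams."""
    return COEFF_S * mass_g ** 3

def mass_for_lifetime_g(time_s):
    """Mass in grams whose evaporation time from Eq. (70) equals time_s."""
    return (time_s / COEFF_S) ** (1.0 / 3.0)

age_of_universe_s = 13.7e9 * SECONDS_PER_YEAR
mass_evaporating_now = mass_for_lifetime_g(age_of_universe_s)
solar_mass_lifetime_yr = lifetime_s(1.989e33) / SECONDS_PER_YEAR  # assumed solar mass in g

print(f"mass evaporating now       ~ {mass_evaporating_now:.1e} g")
print(f"lifetime of a solar-mass bh ~ {solar_mass_lifetime_yr:.1e} yr")
```

Because the lifetime grows as the cube of the mass, a black hole of stellar mass outlives the present age of the universe by an enormous factor.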
We can have a handwaving explanation for Hawking radiation; the detailed mathematics is
completely beyond the scope of this course. Just outside the event horizon, the gravitational
energy of the black hole may create a particle-antiparticle pair from the vacuum. Suppose one of
them climbs out of the gravitational well while the other falls within the event horizon. As some
gravitational energy is lost, the black hole shrinks in size (thus, to an outside observer, what fell
into the black hole must have a negative energy, whether it is a particle or antiparticle). The
smaller the black hole is, the faster is the rate of evaporation.
It is very difficult to have direct evidence of a black hole. However, one may produce very small
black holes in the laboratory. These black holes are not exactly like the ones we have discussed,
but since they are very tiny, they decay through Hawking radiation almost as soon as they are
produced. People have studied the possibility of production and decay of such tiny black holes at
the Large Hadron Collider.

Figure 12: Hawking radiation, and simulated signal of a possible black hole at the LHC.

3.3 Gravitational waves

There is another way a black hole may perish; that is by merging with another black hole and
forming a bigger black hole. To be more precise, one should say that the event horizons merge.
This is, however, a very violent process and greatly distorts the geometry of space-time.

Think of a black hole binary; two such objects revolving around each other and coming closer
because of gravity. Then, within a very small time, maybe a few seconds, they come so close that
their event horizons merge, and for all practical purpose they become a single black hole. Not all
the mass goes into the new black hole; a fraction of it gets radiated.
Einstein showed in 1916 that his equations, which we will discuss later for some specific cases,
admit a wave solution in the weak gravity limit $|h_{\mu\nu}| \ll 1$. In this case, the small perturbations can
propagate as waves. This is analogous to how scalar and vector potentials admit wave solutions for
Maxwell’s equations; remember that h00 in the weak gravity limit is proportional to the Newtonian
gravitational potential φ. This wave is called a gravitational wave. It also turns out from the
retarded potential solution of the wave equation that these waves travel with the velocity of light.
What about the strong gravity case? This is much more complicated to deal with because we
cannot neglect the higher powers of the perturbation (if we can still call it a perturbation) $h_{\mu\nu}$. This
makes the equations nonlinear. Physically, this means that the change in mass-energy distribution
due to the gravitational wave creates in turn its own gravity field. Only numerical solutions exist
in the strong gravity case but even then, one gets a wave solution. So gravitational waves are
a necessary consequence of Einstein’s theory [16].
Detection of gravitational waves has always been a big challenge because gravity is a very weak
force compared to electromagnetism [17], and therefore its effect on matter is extremely
tiny. Indirect evidence came from the observation of the orbital period of a binary pulsar. The two
stars are dense objects and create a strong gravity field. It was found that the orbital period was decreasing
very slowly, so some energy must have been radiated off in the form of gravitational waves. However,
this is only indirect evidence; people wanted to find direct evidence, and all the experiments
yielded null results (i.e. results consistent with no such waves), simply because the instruments
were not sensitive enough.
Then, on September 14, 2015, a strange signal was detected in the two advanced LIGO (or
aLIGO) detectors. LIGO stands for Laser Interferometer Gravitational-Wave Observatory.
Situated in the US, the two detectors caught a passing gravitational wave (the time difference between
the two detections is consistent with a wave travelling with the velocity of light). The analysis, published
in February 2016, showed that the signal came from the merger of two black holes, whose masses were
approximately $36M_\odot$ and $29M_\odot$. After the merger, the new black hole had a mass of $62M_\odot$, and
energy equivalent to $3M_\odot$ was radiated in the form of gravitational waves. The energy output is
of the order of $10^{47}$ J, more than the energy output per second from all the stars in the visible
universe. By the end of 2017, six such merger events had been observed altogether, including a neutron star
merger in August 2017. For the latter, there was also an optical signal, and the two signals reached earth
almost simultaneously, showing that gravitational waves travel at the speed of light. This whole business
opened a new era in astronomy, as this is the first time people saw something without the mediation
of electromagnetic waves. The observation was duly recognised with the Nobel prize in 2017 to
Rainer Weiss, Barry Barish and Kip Thorne [18].
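The order of magnitude quoted for the radiated energy can be checked from $E = \Delta M c^2$. The following is only a rough sketch: the function name and the rounded values of the solar mass and the speed of light are choices made here, not part of the LIGO analysis.

```python
# Rough check: rest-mass energy of the mass lost in the black hole merger.
M_SUN = 1.989e30   # kg, solar mass (rounded)
C = 2.998e8        # m/s, speed of light (rounded)

def radiated_energy(initial_masses, final_mass):
    """Energy in joules carried away, with all masses in units of the solar mass."""
    delta_m = sum(initial_masses) - final_mass
    return delta_m * M_SUN * C**2

# Masses quoted in the text: 36 and 29 solar masses merging into 62.
energy_joule = radiated_energy([36.0, 29.0], 62.0)
print(f"Radiated energy: {energy_joule:.2e} J")
```

The result comes out at a few times $10^{47}$ J, consistent with the order of magnitude quoted above.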
16
Gravitational waves can occur in other competing theories of gravity too, like the Brans-Dicke theory, but GR is
the most economical one.
17
To convince yourself, calculate the ratio of gravitational and electrostatic forces between two protons.
18
Another LIGO detector will soon be installed in India.
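The exercise suggested in footnote 17 takes only a few lines. This is a minimal sketch with rounded SI constants; the variable names are chosen here for illustration. The separation between the two protons cancels because both forces fall off as $1/r^2$.

```python
# Ratio of the gravitational to the electrostatic force between two protons.
G = 6.674e-11          # N m^2 kg^-2, Newton's constant
K_COULOMB = 8.988e9    # N m^2 C^-2, Coulomb constant 1/(4 pi epsilon_0)
M_PROTON = 1.6726e-27  # kg
E_CHARGE = 1.602e-19   # C

def force_ratio():
    """F_gravity / F_electrostatic; independent of the separation."""
    return (G * M_PROTON**2) / (K_COULOMB * E_CHARGE**2)

ratio = force_ratio()
print(f"F_grav / F_elec = {ratio:.1e}")
```

The ratio is of the order of $10^{-36}$, which is why the effect of a gravitational wave on laboratory matter is so tiny.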

4 The Einstein Equation

4.1 Curvature

We will state a few results of differential geometry without derivation. You may look at
Weinberg for a more complete discussion, but be careful: his convention is not the same as ours
(go to the end of the subsection to see where the difference lies).
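In 4-dimensional space-time, the measure of the curvature, the Riemann-Christoffel curvature
tensor, is defined as
\[ R^\lambda{}_{\mu\nu\kappa} = \frac{\partial\Gamma^\lambda_{\mu\kappa}}{\partial x^\nu} - \frac{\partial\Gamma^\lambda_{\mu\nu}}{\partial x^\kappa} + \Gamma^\eta_{\mu\kappa}\Gamma^\lambda_{\nu\eta} - \Gamma^\eta_{\mu\nu}\Gamma^\lambda_{\kappa\eta} . \tag{71} \]
One can lower the first index and have the so-called Riemann curvature:

\[ R_{\lambda\mu\nu\kappa} = g_{\lambda\alpha} R^\alpha{}_{\mu\nu\kappa} . \tag{72} \]

It can be shown that

\[ R_{\lambda\mu\nu\kappa} = \frac{1}{2}\left( \frac{\partial^2 g_{\lambda\kappa}}{\partial x^\nu \partial x^\mu} + \frac{\partial^2 g_{\mu\nu}}{\partial x^\lambda \partial x^\kappa} - \frac{\partial^2 g_{\lambda\nu}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\mu\kappa}}{\partial x^\nu \partial x^\lambda} \right) + g_{\eta\sigma}\left( \Gamma^\eta_{\kappa\lambda}\Gamma^\sigma_{\mu\nu} - \Gamma^\eta_{\nu\lambda}\Gamma^\sigma_{\mu\kappa} \right) . \tag{73} \]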

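In a locally inertial frame at space-time coordinate $x = X$, we can choose $g_{\alpha\beta}(x = X) = \eta_{\alpha\beta}$.
We can also choose the first derivatives of the metric $\partial g_{\alpha\beta}/\partial x^\gamma$ to vanish at $x = X$, so that the
difference between $g_{\alpha\beta}$ and $\eta_{\alpha\beta}$ starts from $(x - X)^2$. This means that the Christoffel symbols
vanish at $x = X$ (but not their derivatives!). An infinitesimal region around $x = X$ can be
thought of as flat, and one can simplify Eq. (73) as
\[ R_{\lambda\mu\nu\kappa} = \frac{1}{2}\left( \frac{\partial^2 g_{\lambda\kappa}}{\partial x^\nu \partial x^\mu} + \frac{\partial^2 g_{\mu\nu}}{\partial x^\lambda \partial x^\kappa} - \frac{\partial^2 g_{\lambda\nu}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\mu\kappa}}{\partial x^\nu \partial x^\lambda} \right) . \tag{74} \]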

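The curvature quantities are easily calculated from Eq. (73). Note that the curvature tensor
satisfies:

\[ \begin{aligned}
\text{(a) Symmetry:} \quad & R_{\lambda\mu\nu\kappa} = R_{\nu\kappa\lambda\mu} , \\
\text{(b) Antisymmetry:} \quad & R_{\lambda\mu\nu\kappa} = -R_{\mu\lambda\nu\kappa} = -R_{\lambda\mu\kappa\nu} = R_{\mu\lambda\kappa\nu} , \\
\text{(c) Cyclicity:} \quad & R_{\lambda\mu\nu\kappa} + R_{\lambda\kappa\mu\nu} + R_{\lambda\nu\kappa\mu} = 0 .
\end{aligned} \tag{75} \]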

The Riemann curvature has $4^4 = 256$ components, but the constraints in (75) reduce the number of independent ones to only 20.
This is the number of independent curvature quantities in four dimensions [19]; for $N$ dimensions it
is $N^2(N^2 - 1)/12$. (Can you show that this is always an integer if $N$ is a positive integer?)
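Before attempting a proof, one can check the counting formula numerically. The snippet below is a small sketch (the function name is made up for illustration) that evaluates $N^2(N^2-1)/12$ for the first few dimensions and confirms that the division is exact.

```python
def independent_riemann_components(n):
    """Number of independent components of the Riemann curvature in n dimensions."""
    numerator = n * n * (n * n - 1)
    # The claim in the text is that this is always divisible by 12.
    assert numerator % 12 == 0, f"not an integer for n = {n}"
    return numerator // 12

for n in range(1, 6):
    print(n, independent_riemann_components(n))
```

This prints 0, 1, 6, 20 and 50 for N = 1, ..., 5. A numerical scan is of course not a proof; for that, note that $N^2(N^2-1) = N \cdot (N-1)N(N+1)$ and look at divisibility by 3 and by 4 separately.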
One can contract two of the indices of the Riemann curvature to form the rank-2 Ricci tensor:

\[ R_{\mu\kappa} = g^{\lambda\nu} R_{\lambda\mu\nu\kappa} . \tag{76} \]

(What happens if we contract $R_{\lambda\mu\nu\kappa}$ with $g^{\lambda\mu}$? It gives zero, from the antisymmetry property of
Eq. (75).) Note that $R_{\mu\kappa}$ is a symmetric tensor (this follows from the symmetry property (a) of
Eq. (75), which exchanges the first index with the third and the second with the fourth), so it has 10
independent components. However, there are 4 constraint
conditions that must be obeyed by $R_{\lambda\mu\nu\kappa}$, and hence $R_{\mu\kappa}$. These are known as Bianchi identities
and reduce the number of independent components of the Ricci tensor to six.
19
So for N = 2, there is only one curvature, the Gaussian curvature in 2-dimensional curved space. For N = 1,
there is no curvature, so a curved line has zero intrinsic curvature! This is not surprising, as a curved line can be continuously
deformed to get a straight line; they are topologically equivalent. A curved surface, however, cannot be flattened without
causing major deformation; look at the Mercator projection of the globe and the distance between the latitude lines.

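The Ricci tensor can also be written as
\[ R_{\mu\kappa} = \frac{\partial\Gamma^\alpha_{\mu\kappa}}{\partial x^\alpha} - \frac{\partial\Gamma^\alpha_{\mu\alpha}}{\partial x^\kappa} + \Gamma^\alpha_{\beta\alpha}\Gamma^\beta_{\mu\kappa} - \Gamma^\alpha_{\beta\kappa}\Gamma^\beta_{\mu\alpha} , \tag{77} \]
where the summation convention (over $\alpha$ and $\beta$) is implied.
One can go further down, and construct the Ricci scalar, also known as the scalar curvature:

\[ R = g^{\mu\kappa} R_{\mu\kappa} . \tag{78} \]

A combination of $R$ and $R_{\mu\nu}$ is known as the Einstein curvature:

\[ G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R . \tag{79} \]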

Now a note of caution: The definitions of the Riemann-Christoffel curvature tensor and the
Riemann curvature differ from text to text. We have followed the notation of Hartle and Dodelson.
Weinberg uses the opposite definition, so in his case Eqs. (71), (73), and (74) all contain a relative
minus sign compared to our convention. This does not affect the definitions (76) and (79), and
Eq. (75) is also unaffected. However, be careful when you deduce the curvature quantities from the
metric. Also, the Einstein equation in the presence of matter, which contains $G_{\mu\nu}$ on the left-hand side,
looks different; the one in Weinberg contains a minus sign on the right-hand side.

4.2 Einstein equations in vacuum

Einstein equations in vacuum are given by

Rµν = 0. (80)

This is analogous to Newton’s gravitational equation in vacuum: $\nabla^2\phi = 0$. Einstein equations
(they are actually a set of 6 independent equations, written in a compact notation, just like F = ma,
which is a set of 3 equations) are a first principle. They cannot be deduced, since there is no more
fundamental principle from which one can deduce them. Of course, one must be sure that in the
proper limit (nonrelativistic, static, weak field) one gets back Newtonian gravity.
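Let us check that the Schwarzschild metric indeed satisfies (80). First calculate the Riemann-Christoffel
curvatures using Eqs. (40) and (71). For $R^t{}_{rtr} = R^0{}_{101}$, we put $\lambda = \nu = 0$, $\mu = \kappa = 1$,
and the nonzero terms are
\[ \begin{aligned}
R^0{}_{101} &= -\frac{\partial}{\partial r}\left[ \frac{GM}{r^2}\,\frac{1}{1 - 2GM/r} \right] + \Gamma^0_{01}\Gamma^1_{11} - \Gamma^0_{10}\Gamma^0_{10} \\
&= \frac{2GM}{r^3}\,\frac{1}{1 - 2GM/r} ,
\end{aligned} \tag{81} \]
so
\[ R_{0101} = g_{00} R^0{}_{101} = -\frac{2GM}{r^3} . \tag{82} \]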
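Similarly, one gets (do this, at least once in your lifetime)

\[ \begin{aligned}
R_{\theta\phi\theta\phi} &= 2GM r \sin^2\theta , \\
R_{t\theta t\theta} &= \frac{GM}{r}\left( 1 - \frac{2GM}{r} \right) , \\
R_{t\phi t\phi} &= \frac{GM}{r}\left( 1 - \frac{2GM}{r} \right) \sin^2\theta , \\
R_{r\theta r\theta} &= -\frac{GM}{r}\,\frac{1}{1 - 2GM/r} , \\
R_{r\phi r\phi} &= -\frac{GM}{r}\,\frac{1}{1 - 2GM/r}\,\sin^2\theta .
\end{aligned} \tag{83} \]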

The other nonzero components of the curvature follow from symmetry arguments, Eq. (75).
Now calculate the Ricci tensor $R_{\mu\kappa}$. All components with $\mu \neq \kappa$ are identically zero. The
diagonal components also vanish, as can be seen from, say,

\[ R_{00} = g^{\alpha\beta} R_{0\alpha 0\beta} = g^{11} R_{0101} + g^{22} R_{0202} + g^{33} R_{0303} = 0 . \tag{84} \]

Q. Deduce the properties of the Riemann curvature as given in Eq. (75).
Q. Contract Eq. (79) with $g^{\mu\nu}$. What is the value of $g^{\mu\nu} g_{\mu\nu}$? Hence get $R$ in terms of $G_{\mu\nu}$ and $g^{\mu\nu}$.
Q. Check that $R_{11}$ is also zero. Note that $R_{1010}$ is related to $R_{0101}$.
Q. Show that $R_{trtt} = 0$ without direct computation.
Q. Show that $R_{\mu\nu} = 0$ is equivalent to $G_{\mu\nu} = 0$.
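The vanishing of the diagonal Ricci components can also be checked numerically from the curvature components of Eqs. (82) and (83). The sketch below assumes the Schwarzschild metric in the form $g_{\mu\nu} = \mathrm{diag}(-(1-2GM/r),\ (1-2GM/r)^{-1},\ r^2,\ r^2\sin^2\theta)$ with $c = 1$; the function name and the sample point are arbitrary choices made here.

```python
import math

def schwarzschild_ricci_diagonal(gm, r, theta):
    """Diagonal Ricci components R_mm = sum over l of g^{ll} R_{lmlm}, with gm = G*M."""
    f = 1.0 - 2.0 * gm / r
    s2 = math.sin(theta) ** 2
    # Inverse metric g^{mm} for the coordinates (t, r, theta, phi).
    g_inv = [-1.0 / f, f, 1.0 / r**2, 1.0 / (r**2 * s2)]
    # Components R_{abab} with a < b, taken from Eqs. (82) and (83).
    riemann = {
        (0, 1): -2.0 * gm / r**3,
        (0, 2): (gm / r) * f,
        (0, 3): (gm / r) * f * s2,
        (1, 2): -(gm / r) / f,
        (1, 3): -(gm / r) * s2 / f,
        (2, 3): 2.0 * gm * r * s2,
    }
    ricci = []
    for mu in range(4):
        total = 0.0
        for lam in range(4):
            if lam != mu:
                # R_{lmlm} = R_{mlml} by the antisymmetry properties of Eq. (75).
                total += g_inv[lam] * riemann[(min(lam, mu), max(lam, mu))]
        ricci.append(total)
    return ricci

ricci_diag = schwarzschild_ricci_diagonal(gm=1.0, r=10.0, theta=1.0)
print(ricci_diag)
```

All four entries vanish up to floating-point rounding, which is the content of Eq. (84) and of the exercise on $R_{11}$.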

4.3 The expanding universe

Figure 13: Method of parallax.

The basic tenet of cosmology is that the universe is homogeneous and isotropic. Homogeneity
means that the universe looks the same from every point in space; isotropy means that there is no
preferred direction. Of course, this is true only at a length scale of a few hundred megaparsec or
more, not at the scale of a few kilometres.
Megaparsec, or one million parsec, is the standard large distance unit used in astronomy and cosmology.
A star is said to be at a distance of one parsec if it gives a parallax of one arc-second; the parallax is
half the shift in the angular position of the star when viewed from two opposite ends of
the orbit of the earth, see Fig. 13. One parsec is about 3.26 light-years. Thus, 1 Mpc = $3.26 \times 10^6$
light-years = $3.09 \times 10^{19}$ km.
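The numbers above follow directly from the definition: one parsec is the distance at which the radius of the earth's orbit (one astronomical unit) subtends one arc-second. A short sketch, with the astronomical unit and the light-year taken as standard rounded values:

```python
import math

AU_IN_M = 1.495978707e11       # astronomical unit in metres
LIGHT_YEAR_IN_M = 9.4607e15    # light-year in metres (rounded)
ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)

# Distance at which 1 AU subtends an angle of one arc-second.
parsec_in_m = AU_IN_M / math.tan(ARCSEC_IN_RAD)
parsec_in_ly = parsec_in_m / LIGHT_YEAR_IN_M
megaparsec_in_km = parsec_in_m * 1.0e6 / 1.0e3

print(f"1 pc  = {parsec_in_ly:.2f} light-years")
print(f"1 Mpc = {megaparsec_in_km:.2e} km")
```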

The method of parallax is applicable for nearby stars only. For more distant stars and nearby galaxies,
one uses the Cepheid variable stars to measure the distance. Cepheids are stars whose brightness
undergoes a periodic variation. A typical example of a visible Cepheid is the pole star. The period
is a known function of intrinsic brightness or absolute magnitude, so the measurement of the period
and the apparent magnitude gives the distance of the star, or the galaxy where the star is.
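Once the absolute magnitude M is known from the period, the distance follows from the standard distance-modulus relation $m - M = 5\log_{10}(d/10\,\mathrm{pc})$. The sketch below only illustrates this last step; the magnitudes used in the example are hypothetical, and the period-luminosity calibration itself is not modelled.

```python
def distance_parsec(apparent_mag, absolute_mag):
    """Distance in parsec from the distance modulus m - M = 5 log10(d / 10 pc)."""
    return 10.0 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)

# Hypothetical Cepheid: absolute magnitude -4.0 (inferred from its period),
# observed apparent magnitude 19.0.
d_pc = distance_parsec(19.0, -4.0)
print(f"distance = {d_pc:.3e} pc")
```

A star whose apparent and absolute magnitudes coincide is, by definition, at 10 pc, which gives a quick sanity check of the function.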

Figure 14: The change in luminosity of a Cepheid variable star.

The accepted explanation for the pulsation of Cepheids is called the Eddington valve, or κ-
mechanism, where κ denotes the opacity of the gas. Helium is the gas thought to be most active
in the process. He++ is more opaque than He+ . The more helium is heated, the more ionized it
becomes.
At the dimmest part of a Cepheid’s cycle, the ionized gas in the outer layers of the star is opaque,
and so is heated by the star’s radiation, and due to the increased temperature, begins to expand.
As it expands, it cools, and so becomes less ionized and therefore more transparent, allowing
the radiation to escape. Then the expansion stops, and reverses due to the star’s gravitational
attraction. The process goes on in a cycle.
For more distant galaxies where individual stars cannot be seen, one has to rely on a particular type
of supernova, called Type Ia. Very massive stars end their lives in a violent burst, shedding
most of their outer material into space, and the remnant core becomes a neutron star or maybe
a black hole. Such a bursting star is called a supernova. A Type Ia supernova is different: it is the
thermonuclear explosion of a white dwarf that has accreted matter up to nearly the Chandrasekhar limit.
This is why the energy output of Type Ia supernovae is
roughly constant; so the photon flux that is captured on earth gives a pretty accurate measurement
of their distance.

Figure 15: Formation of a supernova.

The universe contains matter and radiation; the matter is in the form of the visible matter like
electrons, protons, atoms and molecules and objects built out of them, and some matter which we
can detect only by its gravitational presence. In fact, most of the matter in the universe is in the
form of such dark matter; nobody knows what constitutes the dark matter, because the Standard
Model of particle physics does not have any possible candidate for dark matter. However, it is
hoped that an answer will soon be available from the high energy physics experiment at the LHC.
The radiation covers the entire electromagnetic spectrum, but most of it is in the form of an almost
isotropic blackbody radiation with a temperature of 2.725 ± 0.001 K. This falls in the microwave
region, so it is called the cosmic microwave background, and is supposed to be a remnant of the big
bang.

Figure 16: The content of the universe.

The universe also expands. It expands uniformly, all galaxies recede from all other galaxies,
and there is no centre of expansion. Whether it will expand forever or stop expanding and then
start contracting depends on the matter and energy density of the universe 20 . We have reasons
to believe that it will go on expanding. The velocity with which a galaxy moves away from us is
directly proportional to its distance:
v = H0 d (85)
where the proportionality constant H0 , known as Hubble’s constant, is about (67.3 ± 1.2)
(km/s)/Mpc (from a combined analysis of the Wilkinson Microwave Anisotropy Probe (WMAP)
and other data, averaged in 2013). So, if a galaxy is 1 Mpc away from us, it recedes roughly with a
velocity of 67.3 km/s. The spectral lines will be correspondingly red-shifted (this is Doppler shift,
not gravitational).
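Eq. (85) is simple enough to tabulate. The sketch below uses the value of $H_0$ quoted in the text; the function names are illustrative, and the estimate $z \approx v/c$ for the red-shift is only the low-velocity approximation.

```python
H0 = 67.3                # km/s per Mpc, value quoted in the text
C_KM_S = 299792.458      # speed of light in km/s

def recession_velocity(distance_mpc, h0=H0):
    """Hubble's law v = H0 * d, with d in Mpc and v in km/s."""
    return h0 * distance_mpc

def approx_redshift(distance_mpc, h0=H0):
    """Low-velocity estimate z ~ v / c; valid only for v much smaller than c."""
    return recession_velocity(distance_mpc, h0) / C_KM_S

for d in (1.0, 10.0, 100.0):
    print(f"d = {d:6.1f} Mpc  v = {recession_velocity(d):8.1f} km/s  z ~ {approx_redshift(d):.4f}")
```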
The expansion forces us to use a comoving coordinate to describe the geometry of the universe.
We are familiar with the FRW geometry; the scale factor a(t) in the metric describes the expansion
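of the grid. The velocity is radial, so we can write
\[ \mathbf{v} = \frac{|\dot{\mathbf{r}}|}{|\mathbf{r}|}\,\mathbf{r} , \tag{86} \]
but $\mathbf{r} = a\mathbf{x}$, where $\mathbf{r}$ and $\mathbf{x}$ denote the physical and the comoving coordinates respectively. The
velocity is entirely [21] due to the expansion of the grid, so $\mathbf{x}$ is constant, and we can write
\[ \mathbf{v} = \frac{\dot a}{a}\,\mathbf{r} = H(t)\,\mathbf{r} , \tag{87} \]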
20
It has recently been observed that the expansion rate actually is increasing, albeit very slowly, with time. This
is against all our known concepts of standard astrophysics. The energy responsible for such accelerated expansion
is called dark energy, and is distinct from dark matter. Almost 95% of the matter-energy density of the universe is
believed to be dark — 27% in dark matter, 68% in dark energy, and only about 5% as visible matter and radiation.
What constitutes dark energy is an open and challenging question. The discovery of dark energy and accelerating
universe was acknowledged with the Nobel Prize in 2011.
21
Almost. There can be some individual motions, but they are negligible on the average.

Figure 17: The Doppler shift from the distant stars and galaxies.

where the Hubble parameter is
\[ H(t) = \frac{\dot a}{a} , \tag{88} \]
whose present value is denoted by $H_0$.

4.4 Einstein equation for matter: the FRW geometry

We just state the form of Einstein equation in the presence of matter (this includes radiation):

Gµν = 8πGTµν . (89)

We have encountered the Einstein curvature $G_{\mu\nu}$ in Eq. (79). $G$ is Newton’s gravitational
constant, and $T_{\mu\nu}$ is the stress-energy tensor. $T_{00}$ gives the energy density, $T_{i0}$ the momentum
density, $T_{0i}$ the energy flux, and $T_{ij}$ the usual stress tensor. The exact composition depends on
what matter we take in our calculation. For a universe filled with a perfect isotropic fluid, one writes

\[ T_{\mu\nu} = p\,g_{\mu\nu} + (p + \rho)\,U_\mu U_\nu , \tag{90} \]

where $U_0 = 1$ and $U_i = 0$, $p$ and $\rho$ being the pressure and density of the fluid.
Tµν is conserved in flat space-time. In the presence of gravitation, its conservation is a little bit
more complicated, because the gravitational self-energy also contributes to it. However, there are
four constraint equations, which reduce to $\partial^\mu T_{\mu\nu} = 0$ in flat space-time. Since $T_{\mu\nu}$ is symmetric,
this again tells us that the number of independent equations in (89) is six.
How do we get the factor of $8\pi G$? Poisson’s equation for the Newtonian potential $\phi$ can be written
as
\[ \nabla^2\phi = 4\pi G\rho \tag{91} \]
and in the weak-gravity limit, $g_{00} \approx -(1 + 2\phi)$, so one can write

\[ \nabla^2 g_{00} = -8\pi G\,T_{00} . \tag{92} \]

Though this is true only for a non-relativistic distribution of matter in the weak gravity limit, and
is not even Lorentz invariant, the factor acted as the root of Einstein’s guess about his equation.
It is basically a consistency check: in the Newtonian limit, one must get back Eq. (91).
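The step from Eq. (91) to Eq. (92) can be written out explicitly. This is only the consistency check described above, assuming non-relativistic matter so that $T_{00} = \rho$:

```latex
\begin{aligned}
g_{00} &\approx -(1 + 2\phi)
  \quad\Longrightarrow\quad
  \nabla^2 g_{00} = -2\,\nabla^2\phi , \\
\nabla^2\phi &= 4\pi G\rho
  \quad\Longrightarrow\quad
  \nabla^2 g_{00} = -8\pi G\rho = -8\pi G\,T_{00} .
\end{aligned}
```

The factor of 2 multiplying $\phi$ in $g_{00}$ is what turns the $4\pi G$ of Poisson's equation into $8\pi G$.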
A brief digression here. When Einstein first formulated his field equations, the expansion of the
universe was not known, so Einstein tried a static solution (something like the FRW metric with the
scale factor $a$ constant in time). This clearly led to an inconsistency; we will later see why. To circumvent that,
Einstein introduced another term in his field equation,

Gµν + Λgµν = 8πGTµν , (93)

and this led to a consistent solution. The constant Λ was called the cosmological constant. When
the expansion was discovered, it was found that there is a consistent solution without Λ, and
Einstein subsequently discarded it, calling it his biggest blunder. However, during the last couple
of decades, the cosmological constant has made a comeback, almost with a vengeance, and we have
enough reasons to believe that there is indeed a nonzero Λ. This is probably the source of the dark
energy. More on this later.
As a final exercise to this part of the course, we would like to calculate the Einstein curvature
components for the FRW geometry.
There are two ways to do this. The first one is to calculate the Riemann curvatures first,
following Eq. (71). This we have done for the Schwarzschild metric. The second way is to use
directly Eq. (77). However, if you are not comfortable with the double summation, go through the
first route. It takes more time, but it is probably safer for a beginner.
I’ll use the second route. Use (77) and keep track of the nonzero Christoffel symbols, Eq. (42).
If you are careful with the double summation, and remember that the Christoffel symbols are
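symmetric in their lower indices, the only nonzero terms are

\[ \begin{aligned}
R_{00} &= -\frac{\partial\Gamma^1_{01}}{\partial x^0} - \frac{\partial\Gamma^2_{02}}{\partial x^0} - \frac{\partial\Gamma^3_{03}}{\partial x^0} - \Gamma^1_{01}\Gamma^1_{01} - \Gamma^2_{02}\Gamma^2_{02} - \Gamma^3_{03}\Gamma^3_{03} \\
&= -\frac{\partial}{\partial t}\left( \frac{3\dot a}{a} \right) - 3\left( \frac{\dot a}{a} \right)^2 = -3\,\frac{\ddot a}{a} , \\
R_{11} &= \frac{\partial\Gamma^0_{11}}{\partial x^0} + \frac{\partial\Gamma^1_{11}}{\partial x^1} - \frac{\partial\Gamma^0_{10}}{\partial x^1} - \frac{\partial\Gamma^1_{11}}{\partial x^1} - \frac{\partial\Gamma^2_{12}}{\partial x^1} - \frac{\partial\Gamma^3_{13}}{\partial x^1} \\
&\quad + \Gamma^1_{01}\Gamma^0_{11} + \Gamma^2_{02}\Gamma^0_{11} + \Gamma^3_{03}\Gamma^0_{11} + \Gamma^1_{11}\Gamma^1_{11} + \Gamma^2_{12}\Gamma^1_{11} + \Gamma^3_{13}\Gamma^1_{11} \\
&\quad - \Gamma^0_{11}\Gamma^1_{10} - \Gamma^1_{01}\Gamma^0_{11} - \Gamma^1_{11}\Gamma^1_{11} - \Gamma^2_{21}\Gamma^2_{12} - \Gamma^3_{31}\Gamma^3_{13} \\
&= \frac{a\ddot a + 2\dot a^2 + 2k}{1 - kr^2} , \\
R_{22} &= \frac{\partial\Gamma^0_{22}}{\partial x^0} + \frac{\partial\Gamma^1_{22}}{\partial x^1} - \frac{\partial\Gamma^3_{23}}{\partial x^2} + \Gamma^0_{22}\Gamma^1_{01} + \Gamma^0_{22}\Gamma^2_{02} + \Gamma^0_{22}\Gamma^3_{03} + \Gamma^1_{11}\Gamma^1_{22} \\
&\quad + \Gamma^2_{12}\Gamma^1_{22} + \Gamma^3_{13}\Gamma^1_{22} - \Gamma^0_{22}\Gamma^2_{02} - \Gamma^1_{22}\Gamma^2_{12} - \Gamma^2_{12}\Gamma^1_{22} - \Gamma^2_{02}\Gamma^0_{22} - \Gamma^3_{23}\Gamma^3_{23} \\
&= r^2\left( a\ddot a + 2\dot a^2 + 2k \right) .
\end{aligned} \tag{94} \]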

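The calculation of $R_{33}$ is left as an exercise; the result will be

\[ R_{33} = r^2 \sin^2\theta \left( a\ddot a + 2\dot a^2 + 2k \right) . \tag{95} \]

The off-diagonal components of $R_{\mu\nu}$ are all zero.
The next task is to calculate the scalar curvature $R$. For that, we need $g^{\mu\nu}$, which is
\[ g^{\mu\nu} = \mathrm{diag}\left( -1,\ (1 - kr^2)a^{-2},\ a^{-2}r^{-2},\ a^{-2}r^{-2}(\sin\theta)^{-2} \right) . \tag{96} \]

So
\[ R = R_{\mu\nu} g^{\mu\nu} = \frac{3\ddot a}{a} + \frac{3}{a^2}\left( a\ddot a + 2\dot a^2 + 2k \right) = \frac{6}{a^2}\left( a\ddot a + \dot a^2 + k \right) . \tag{97} \]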

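This gives us the nonzero components of the Einstein tensor,

\[ \begin{aligned}
G_{00} &= R_{00} - \frac{1}{2} g_{00} R = 3\,\frac{\dot a^2 + k}{a^2} , \\
G_{11} &= -\frac{1}{1 - kr^2}\left( \dot a^2 + 2a\ddot a + k \right) ,
\end{aligned} \tag{98} \]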
and similarly for G22 and G33 .
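To solve the Einstein equation, one must know the right-hand side of Eq. (89). Assuming the
matter to be a perfect isotropic fluid, one may write down the Friedmann equations:
\[ \begin{aligned}
\left( \frac{\dot a}{a} \right)^2 + \frac{k}{a^2} &= \frac{8\pi G\rho}{3} , \\
\frac{1}{a^2}\left( 2a\ddot a + \dot a^2 + k \right) &= -8\pi G p .
\end{aligned} \tag{99} \]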

One must now solve Eqs. (99) to extract any further information about the evolution of the
universe. That is done in the Cosmology section of this note.
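As a quick numerical illustration of the first of Eqs. (99): for $k = 0$ it fixes the density in terms of the expansion rate, $\rho = 3H^2/(8\pi G)$. The sketch below evaluates this for the present value of $H_0$ quoted earlier, working in SI units with $\rho$ taken as a mass density; the function name and the unit conversions are choices made here.

```python
import math

G = 6.674e-11          # m^3 kg^-1 s^-2, Newton's constant
MPC_IN_M = 3.0857e22   # one megaparsec in metres
M_PROTON = 1.6726e-27  # kg

def density_for_flat_universe(h0_km_s_mpc):
    """Mass density (kg/m^3) solving the first Friedmann equation with k = 0."""
    h0_si = h0_km_s_mpc * 1.0e3 / MPC_IN_M   # convert to 1/s
    return 3.0 * h0_si**2 / (8.0 * math.pi * G)

rho_flat = density_for_flat_universe(67.3)
print(f"rho(k=0) = {rho_flat:.2e} kg/m^3")
print(f"         ~ {rho_flat / M_PROTON:.1f} proton masses per cubic metre")
```

The answer is only a few proton masses per cubic metre, which gives a feeling for how dilute the universe is on average.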
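Before we close this section, consider the case for a static universe filled with matter. By this
term we mean non-relativistic matter, with $p = 0$, but a finite $\rho$. If $a$ is constant, we have
\[ \begin{aligned}
G_{00} = 8\pi G\,T_{00} \quad &\Rightarrow \quad \frac{k}{a^2} = \frac{8\pi G\rho}{3} , \\
G_{11} = 8\pi G\,T_{11} \quad &\Rightarrow \quad \frac{k}{a^2} = 0 ,
\end{aligned} \tag{100} \]
which is clearly inconsistent; a static universe can at best be empty. But with a nonzero $\Lambda$, the
Friedmann equations read
\[ \begin{aligned}
\left( \frac{\dot a}{a} \right)^2 + \frac{k}{a^2} - \frac{\Lambda}{3} &= \frac{8\pi G\rho}{3} , \\
\frac{1}{a^2}\left( 2a\ddot a + \dot a^2 + k \right) - \Lambda &= -8\pi G p ,
\end{aligned} \tag{101} \]
so for a static universe
\[ \frac{3k}{a^2} - \Lambda = 8\pi G\rho , \qquad -\frac{k}{a^2} + \Lambda = 8\pi G p , \tag{102} \]
which gives a consistent solution ($\Lambda = k/a^2$) even in the limit $p = 0$.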
Q. Deduce Eq. (101) starting from Eq. (93).
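For the static, pressureless case the two conditions of Eq. (102) can be solved in one more line, which makes the consistency explicit:

```latex
\begin{aligned}
p = 0 : \quad & \Lambda = \frac{k}{a^2} , \\
& \frac{3k}{a^2} - \Lambda = \frac{2k}{a^2} = 8\pi G\rho
  \quad\Longrightarrow\quad
  \Lambda = \frac{k}{a^2} = 4\pi G\rho .
\end{aligned}
```

So a static universe with a finite matter density needs a positive $\Lambda$ and a positive $k$, both fixed by $\rho$.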

5 Newtonian Cosmology

While it is true that a proper study of cosmology requires a knowledge of GR, at least at the
level discussed here, one can formulate cosmology even without GR. A good example is Newtonian
cosmology.
Consider the space to be isotropic and homogeneous. Consider a test particle of mass $m$, just
outside a uniform spherical mass distribution, of radius $r$ and density $\rho$, so that $M = \frac{4}{3}\pi\rho r^3$.
Newton’s law of gravity gives
\[ \ddot r = -\frac{GM}{r^2} . \tag{103} \]

Multiplying both sides by $2\dot r$, we get
\[ \frac{d}{dt}\left( \dot r^2 \right) = 2GM\,\frac{d}{dt}\left( \frac{1}{r} \right) . \tag{104} \]
Thus,
\[ \dot r^2 - \frac{2GM}{r} = A , \tag{105} \]
where $A$ is a constant whose value can be fixed by the initial condition: at $t = 0$, $r = r(0)$, $\dot r = \dot r(0)$.
Note that $2GM/r$ is just the escape velocity squared and, depending on the value of $\dot r$, $A$ can be
positive, negative, or zero, giving different conic orbits. Also note that $\frac{1}{2}Am = \frac{1}{2}m\dot r^2 - GMm/r$ is
the total energy $E_m$ of the particle. Putting $M = \frac{4}{3}\pi\rho r^3$, we get
\[ \left( \frac{\dot r}{r} \right)^2 = \frac{8}{3}\pi G\rho + \frac{A}{r^2} = \frac{8}{3}\pi G\rho + \frac{2E_m}{mr^2} . \tag{106} \]
This is where Newton stopped because he did not have Hubble’s data on the expanding universe. Now
that we know that the universe expands, let us conjecture that there was some explosion at $t = 0$
(remember, in Newtonian cosmology, space and time are different entities), and the test particle is
flying out as a result of that explosion. Thus we may put $r(t) = a(t)x$, where $x$ is some constant
that does not depend on time, and $a(t)$ is the scale factor. This gives
\[ \left( \frac{\dot a}{a} \right)^2 = \frac{8}{3}\pi G\rho + \frac{2E_m}{ma^2x^2} . \tag{107} \]
Now put $2E_m/mx^2 = -k$ (or $-kc^2$ in the SI system) where $k$ can be positive, negative, or zero.
This gives you the Friedmann equation:
\[ H^2 = \left( \frac{\dot a}{a} \right)^2 = \frac{8}{3}\pi G\rho - \frac{k}{a^2} . \tag{108} \]
Everything else follows. However, note that

• In GR, the space-time fabric expands, whereas in Newtonian case, the particle flies away.
This is analogous to the active and passive viewpoints of transformation.

• t = 0 is a singularity in GR; it is not so in Newtonian cosmology. It is the time when the
explosion took place.

• What happened for t < 0? This question is not even defined in big-bang cosmology because
of the t = 0 singularity. It is well defined here, but there is no satisfactory answer to it.

• The major problem is with the idea of a homogeneous and isotropic universe. In such a
universe, how can there be a specific favoured point? It is here that the Newtonian cosmology
faces its greatest stumbling block, albeit philosophical.

• If the universe is really infinite in spatial extent, every test mass must be in equilibrium; all
points attract it uniformly. However, the equilibrium is an unstable one; a slight perturbation
collapses the configuration and the universe breaks down into a number of massive seeds. This
can be taken as the predecessor of the modern-day ideas of galaxy formation.

6 Modern Cosmology

Cosmology is the study of the universe; from its birth to its present-day structure and composition.

6.1 Microscopic content of the universe

6.1.1 Matter

Matter, by which we mean non-relativistic matter, also called dust, is characterised by zero pressure:
$p = 0$. This includes all baryons, as they have cooled down and hardly interact with each other [22].
On a larger scale, this includes stars and galaxies, since the only interaction between them is
gravitational. As a rule of thumb, particles moving with nonrelativistic velocity $v \ll 1$ are included
in matter. For the present universe, this means protons, neutrons, electrons, atoms formed out of
them, and some weird object called dark matter.
An important quantity is the number density: the average number of particles per unit volume.
If the average energy per particle is $\epsilon$, the total energy per unit volume is given by $E = n\epsilon$, where
$n$ is the number density [23].
Most of the baryons are protons. Free neutrons are unstable, so whatever neutrons are there
remain bound inside the atomic nuclei, mostly in $^4$He. The neutron-proton ratio in the universe
is approximately 1:8 (this we will study in more detail later in the Nucleosynthesis section), and
the universe is charge neutral, so the number of electrons is approximately equal to the number of
protons.
Not all baryons are visible (i.e., inside stars). Most of the hydrogen is in the form of interstellar
gas. Their presence and density are estimated by the spectral absorption lines when the light from
a distant star passes through them. It is estimated that only one out of five baryons is visible; 80%
of the baryonic matter is in the form of such interstellar gas.

6.1.2 Radiation

This includes electromagnetic radiation, i.e., photons, and also neutrinos, as they are almost mass-
less and therefore are moving with relativistic velocities. The photons can interact with protons
and electrons. They can ionise an atom, or just scatter off a free electron (Thomson or Compton
scattering). Each photon has two degrees of freedom, the two polarisation states, and the occupation
number per polarisation mode is given by
\[ N = \frac{1}{\exp(h\nu/kT) - 1} , \tag{109} \]

and the energy density in a frequency interval $\nu$ to $\nu + d\nu$ is

\[ E_\nu\,d\nu = \frac{8\pi h}{c^3}\,\frac{\nu^3\,d\nu}{\exp(h\nu/kT) - 1} . \tag{110} \]

The total energy density of blackbody radiation is given by

\[ \int_0^\infty E_\nu\,d\nu = \frac{\pi^2 k^4}{15(\hbar c)^3}\,T^4 , \tag{111} \]

where the constant, sometimes written as $\alpha$, has a numerical value of $7.6 \times 10^{-16}$ J m$^{-3}$ K$^{-4}$.
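The prefactor in Eq. (111) is easy to evaluate numerically. The following is a minimal sketch using the SI values of the Boltzmann constant, the reduced Planck constant and the speed of light; the function names are chosen here, and the temperature used in the example is the CMB temperature quoted earlier in these notes.

```python
import math

K_B = 1.380649e-23        # J/K, Boltzmann constant
HBAR = 1.054571817e-34    # J s, reduced Planck constant
C = 2.99792458e8          # m/s, speed of light

def radiation_constant():
    """Prefactor of T^4 in Eq. (111), in J m^-3 K^-4."""
    return math.pi**2 * K_B**4 / (15.0 * (HBAR * C) ** 3)

def blackbody_energy_density(temperature_kelvin):
    """Total energy density (J/m^3) of blackbody radiation at temperature T."""
    return radiation_constant() * temperature_kelvin**4

alpha = radiation_constant()
u_cmb = blackbody_energy_density(2.725)
print(f"alpha    = {alpha:.3e} J m^-3 K^-4")
print(f"u(2.725) = {u_cmb:.3e} J m^-3")
```

The prefactor evaluates to about $7.57 \times 10^{-16}$ J m$^{-3}$ K$^{-4}$, and a blackbody at 2.725 K carries roughly $4 \times 10^{-14}$ J per cubic metre.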
22
Cosmologists include electrons in the list of baryons! After all, if only the gravitational interaction matters,
contribution of electrons is negligible compared to the contribution of protons and neutrons.
23
We include the rest energy, $m$, in $\epsilon$. Thus, $\epsilon$ is a sum of the rest energy and kinetic energy. If the rest energy is
much larger than the kinetic energy, the particle is nonrelativistic.

6.1.3 Neutrinos

There are three types of neutrinos, and all of them have very tiny masses, so they are extremely
relativistic particles. There might be some heavy yet-to-be-discovered neutrinos, but they are
nonrelativistic and hence should be treated as matter, or more precisely as dark matter as such
neutrinos are unlikely to have any interaction except gravitational. The neutrinos are very weakly
interacting, so they are relevant only when the density is very high. The sum of all light neutrino
masses is less than 0.44 eV (this comes from a combination of 9-year WMAP and SDSS data,
arXiv:1212.5226). Although neutrinos are included as another component of radiation, you must
remember that unlike photons, cosmic neutrinos are really hard to detect; only very recently have a few
cosmic neutrinos been detected by the IceCube experiment, which is situated right at the
South Pole. The properties of these high-energy cosmic neutrinos are more or less as expected; all
three flavours have a comparable admixture.
The energy density of the cosmic neutrinos is apparently $\frac{21}{8}$ times that of the photons. This
is easy to derive. Both photons and neutrinos have two degrees of freedom; neutrinos are just
two-component Weyl spinors. There are three neutrino flavours, and as they are fermions, the
occupation number has a factor of $\exp(h\nu/kT) + 1$ in the denominator instead of $\exp(h\nu/kT) - 1$
for bosons. Using
\[ \int_0^\infty \frac{y^3\,dy}{e^y + 1} = \frac{7\pi^4}{120} = \frac{7}{8}\int_0^\infty \frac{y^3\,dy}{e^y - 1} , \tag{112} \]
we get an enhancement by $3 \times \frac{7}{8}$ over photons. As we will see, the actual number is a bit different.
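The factor of 7/8 in Eq. (112) can be verified by direct numerical integration. The sketch below uses a plain composite Simpson rule written out by hand; the cutoff at $y = 60$ and the number of intervals are arbitrary choices, justified by the fact that the integrands are exponentially small at large $y$.

```python
import math

def bose_integrand(y):
    # y^3 / (e^y - 1); the limit at y -> 0 is 0.
    return 0.0 if y == 0.0 else y**3 / math.expm1(y)

def fermi_integrand(y):
    # y^3 / (e^y + 1)
    return y**3 / (math.exp(y) + 1.0)

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of intervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

Y_MAX = 60.0   # the integrands are negligible beyond this point
bose = simpson(bose_integrand, 0.0, Y_MAX, 6000)
fermi = simpson(fermi_integrand, 0.0, Y_MAX, 6000)

print(f"boson integral   = {bose:.6f}   (pi^4/15    = {math.pi**4 / 15:.6f})")
print(f"fermion integral = {fermi:.6f}   (7 pi^4/120 = {7 * math.pi**4 / 120:.6f})")
print(f"ratio = {fermi / bose:.6f}, three flavours: {3 * fermi / bose:.4f}")
```

The ratio comes out as 0.875, and multiplying by the three flavours reproduces the factor 21/8 = 2.625.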

6.1.4 Dark matter

There is much more matter in the universe than we can see. This invisible matter interacts only
gravitationally, and its presence can be inferred from indirect evidence like the rotational motion
of the spiral galaxies.
A galaxy rotation curve shows the velocity of matter (e.g., a star) rotating in the arms of a
spiral galaxy. If a galaxy has mass $M(R)$ within a radius $R$, then the velocity of the star is given
by the balance between the centrifugal force and the inward gravitational pull:
\[ \frac{v^2}{R} = \frac{GM(R)}{R^2} . \tag{113} \]
For spiral galaxies, most of the visible mass is concentrated in the central hub. So if we go further out
in the arms, the enclosed mass is more or less a constant, and the velocity falls off as $v \propto 1/\sqrt{R}$.
Experimentally, the velocity does not fall off anywhere close to this; rather, it becomes almost
constant as we increase $R$. This tells us that there is more matter than the visible one, and its
distribution is more like an overall halo over the visible galaxy.
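Eq. (113) can be turned around to see what a flat rotation curve implies. In the sketch below, the central mass of $10^{11}$ solar masses and the flat velocity of 220 km/s are hypothetical round numbers chosen for illustration; Newton's constant is expressed in units of kpc (km/s)$^2$ per solar mass.

```python
import math

G_KPC = 4.30091e-6   # Newton's constant in kpc (km/s)^2 per solar mass

def keplerian_velocity(enclosed_mass_msun, radius_kpc):
    """Circular velocity (km/s) from Eq. (113) for a fixed enclosed mass."""
    return math.sqrt(G_KPC * enclosed_mass_msun / radius_kpc)

def enclosed_mass(velocity_km_s, radius_kpc):
    """Mass (solar masses) inside radius R needed to sustain velocity v, from Eq. (113)."""
    return velocity_km_s**2 * radius_kpc / G_KPC

VISIBLE_MASS = 1.0e11   # hypothetical visible mass concentrated in the hub
FLAT_VELOCITY = 220.0   # hypothetical observed flat rotation velocity, km/s

for radius in (10.0, 20.0, 40.0):
    v_kepler = keplerian_velocity(VISIBLE_MASS, radius)
    m_needed = enclosed_mass(FLAT_VELOCITY, radius)
    print(f"R = {radius:4.0f} kpc  Keplerian v = {v_kepler:6.1f} km/s  "
          f"mass needed for flat curve = {m_needed:.2e} M_sun")
```

With a fixed enclosed mass the velocity halves when $R$ is quadrupled, while a flat curve requires $M(R)$ to grow linearly with $R$, which is the behaviour expected of an extended halo.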
In 2006, the Chandra X-ray observatory provided compelling evidence for the existence of dark matter
in the Bullet cluster of galaxies. The image (Fig. 18, right) shows two galaxy clusters colliding with each other;
the visible matter, shown in red, has a diffused distribution because of the collision, but the two clumps of dark
matter, inferred from the mass distribution and shown as blue blobs, have passed through
each other. This shows that dark matter is indeed very weakly interacting, as compared with the visible
matter.
The general opinion is that the dark matter is non-relativistic in nature, at least the major
component is; this is known as the cold dark matter (CDM) [24]. Dark matter constitutes about 27%
of the matter-energy density of the universe, whereas ordinary visible matter gives only 5%. There are
several ongoing experiments to detect the dark matter directly through its scattering off nuclei,
and some of them, like XENON or LUX, have put pretty stringent bounds on the DM-nucleon
scattering cross section. There are also several theoretical models of elementary particles that go
beyond the existing Standard Model of particle physics and predict such a dark matter candidate.
Again, we hope that the LHC will shed some light towards the direct production of such exotic
particles.
24
If they were hot, i.e., with relativistic velocity, they would have blown the initial structures apart, and there
would not have been any galaxies. There might be some hot dark matter, but only as a tiny fraction of the CDM.

Figure 18: (L) Galactic rotation curve. (R) The Bullet cluster.

6.1.5 Dark energy

This is something really exotic and was discovered only in the last two decades or so (Riess et al.,
1998; Perlmutter et al., 1999). It was found by a study of a number of standard candles for distance
measurements, the Type Ia supernovae, that very distant parts of the universe are actually receding
from us at a greater velocity than expected, i.e., there appears to be some mechanism causing this
acceleration. This is known as the dark energy, and it can be shown that a nonzero cosmological
constant can cause such an acceleration. There are of course a few problems, which we will talk
about later, associated with such an explanation, and there are alternative theories. The standard
cosmological model with a nonzero Λ and a cold dark matter is often called the ΛCDM model.
Q. Derive Eq. (111). You may use

\[ \int_0^\infty \frac{y^3\,dy}{e^y - 1} = \frac{\pi^4}{15} . \tag{114} \]
Also, check that the numerical value of the constant is indeed $7.6 \times 10^{-16}$ J m$^{-3}$ K$^{-4}$.
Q. The temperature of the cosmic microwave background is 2.725 K. What is the energy density due to this
background?

6.2 Macroscopic content of the universe

Stars and galaxies: A galaxy is, in some sense, the smallest unit in cosmological study and is often
treated as a point light source. In reality, however, a galaxy typically contains about $10^{11}$ stars,
where a star can have a typical mass of that of the sun: $M_\odot \approx 2 \times 10^{30}$ kg. A galaxy has a typical
radius of tens of kpc. There are spherical or elliptical galaxies, spiral galaxies (ours is a spiral one),
and even irregular-shaped galaxies. For spiral galaxies, the width is much smaller; maybe less than
1 kpc. We live in one of the spiral arms of the Milky Way galaxy. There are reasons to believe that
the actual width of the galaxy is much larger, consisting of a dark matter halo.

The local group: Several of the nearby galaxies form the local group. Our nearest galaxy is
the Large Magellanic Cloud at about 50 kpc from the sun. Andromeda is another large galaxy of
about the same size as the Milky Way and is about 770 kpc from us. A typical galaxy group takes
a volume of a few Mpc$^3$.
Clusters, superclusters, and voids: A number of galactic groups form a cluster of galaxies. This
is the largest gravitationally-collapsed object in the universe and is typically of the dimension of 100
Mpc. A good example is the Coma cluster with about 50000 galaxies. These clusters are joined by
filament-like structures of galaxies into superclusters, and between the clusters there can be voids,
even as large as of dimension about 50 Mpc. How the structure was formed is one of the open and
most interesting questions of cosmology.

Figure 19: The local group (left), and the large-scale view of the universe, showing galaxy clusters,
filaments, and voids (right).

Large-scale smoothness: Only when one goes above the scale of hundreds of Mpc does the universe
appear smooth. (This is not such a big distance; observations reach out to several thousand
Mpc.) Even the largest scale observations have not revealed anything bigger than clusters and
superclusters, scattered all through the universe almost uniformly. This leads to the cosmological
principle: the universe is smooth at the largest scale. This is also the justification of the two crucial
assumptions: homogeneity (the universe looks the same from each point) and isotropy (the universe
looks the same in all directions) [25].
The cosmic microwave background (CMB) radiation: This omnipresent microwave background
was discovered accidentally by Penzias and Wilson in 1965, and is one of the most important pieces
of evidence for the Big Bang cosmology. The CMB is extremely uniform (which again supports
the cosmological principle), and has a blackbody spectrum, peaking at a wavelength that corresponds
to a temperature of 2.725 ± 0.001 K. This is commonly known as the temperature of the
universe. It was much higher before, and the universe has cooled down since its beginning. The CMB
has a very small anisotropy, at about 1 part in $10^5$, which is believed to be the remnant of some
initial perturbation that acted as the seed for galaxy and structure formation. Of course, there
is radiation in all bands of the electromagnetic spectrum, mostly coming from stars and other
stellar objects.
25
They are not the same. If the universe has a uniform magnetic field everywhere, it is homogeneous but not
isotropic. If we are at the centre of the universe and the distribution is spherically symmetric, then it is isotropic but
not homogeneous. However, if the distribution is isotropic everywhere, it implies homogeneity.

Figure 20: The left plot shows the perfect blackbody spectrum for CMB. The right plot shows the
anisotropy over the isotropic background (and subtracting the dipole effect, mentioned later). The
excesses are bluish, the deficits are reddish.

6.3 The fluid equation


Consider a fluid in a spherical volume of radius $a$. Its energy is $E = \frac{4}{3}\pi a^3 \rho$. To study the
thermodynamics, let us write²⁶
$$ \delta Q = T\,dS = dE + p\,dV . \tag{115} $$
But the expansion is reversible — no heat was supplied from outside because by definition, there
is no outside — so dS = 0. Taking the time derivative of the right-hand side,
$$ \frac{dE}{dt} + p\frac{dV}{dt} = 4\pi a^2(\rho + p)\dot a + \frac{4}{3}\pi a^3\dot\rho = 0 , \tag{116} $$
which leads to the fluid equation
$$ \dot\rho + 3\frac{\dot a}{a}(\rho + p) = 0 . \tag{117} $$
This is not an independent equation. To see this, take the first equation of (99). Its time derivative
gives
$$ \frac{\dot a\ddot a}{a^2} - \frac{\dot a^3}{a^3} - \frac{k\dot a}{a^3} = \frac{4}{3}\pi G\dot\rho = -4\pi G\frac{\dot a}{a}(\rho + p) . \tag{118} $$
Cancelling ȧ/a and substituting ρ by the first equation of (99), we get the second equation of (99).
Thus, only two of the three equations, the two Friedmann equations and the fluid equation, are
independent, and the third one can be derived from them.
Q. Work out the intermediate steps.
Q. Show that the first equation of (99) and Eq. (118) lead to the acceleration equation

$$ \frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3p) . \tag{119} $$
Hence convince yourself that the universe will decelerate if ρ + 3p > 0. What is the condition for the universe
to expand with a constant velocity?
Q. Show, from Eq. (101), that for nonzero Λ one has

$$ H^2 = \frac{8\pi G\rho}{3} - \frac{k}{a^2} + \frac{\Lambda}{3} , \qquad \frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3p) + \frac{\Lambda}{3} . \tag{120} $$
As ρ and p fall with time while Λ is a constant, the expansion of the universe decelerates at first and then
accelerates. If the universe contains only matter, so that p = 0, what is the matter density when the
acceleration starts?

26
The µdN term does not contribute. For photons, the chemical potential µ = 0, as the number of photons in
the presence of charged particles is not constant. For electrons (and other fermions) µ ≠ 0 but the number is constant
by some conservation laws.

6.4 The Hubble parameter revisited

The first equation of (99) can be written as


$$ H^2 = \frac{8\pi G\rho}{3} - \frac{k}{a^2} . \tag{121} $$
First take k = 0, the flat universe. The density ρ goes down with time as the volume expands, so
H decreases with time. This means that the rate of expansion slows down, but never becomes zero.
How it goes depends on how ρ varies with time. As we will see later, if the universe is dominated
by either matter or radiation or both, H ∝ 1/t.
If the universe is open (k < 0), the right-hand side of Eq. (121) is positive definite, so the
universe goes on expanding. After a long time, ρ becomes sufficiently small, so the $k/a^2$ term
dominates. That means $\dot a^2 = |k|$ = constant, so the universe expands linearly, and the velocity
becomes constant. This is also known as free expansion.
If the universe is closed (k > 0), H will be zero at a certain time, when the density falls to
$3k/(8\pi G a^2)$. The expansion stops. But the gravitational attraction is still there, and the universe
starts collapsing. The collapsing phase is easy to describe, as Eq. (121) is symmetric under t → −t;
the collapse is exactly a mirror image of the expansion. The stars and galaxies will start getting
blue-shifted, the temperature will increase, and ultimately the universe will end up in a big crunch,
a mirror image of the big bang.

6.4.1 The density parameter

The present-day value of H is known as the Hubble constant H0 (H by itself is obviously not a
constant). Sometimes, we introduce a reduced Hubble constant h — not to be confused with the
Planck constant — defined as H0 = 100h km/s/Mpc, so that h = 0.673 ± 0.012 is a dimensionless
constant. For the rest of the text, we will use h = 0.7, remembering that there are a lot of quantities
which are functions of h, and hence have an uncertainty depending on h.
Now, there is a critical density ρc that makes the universe flat, i.e., k = 0, which is given by
$$ \rho_c(t) = \frac{3H^2}{8\pi G} , \tag{122} $$
where ρc depends on t as H does. Putting in all the values, this comes out to be (using 1 Mpc =
$3.086 \times 10^{19}$ km)
$$ \rho_c(t_0) = 1.88 \times 10^{-26}\,h^2\ {\rm kg/m^3} = 9.32 \times 10^{-27}\ {\rm kg/m^3} = 1.36 \times 10^{11}\ M_\odot/{\rm Mpc}^3 . \tag{123} $$
The density seems to be tiny but it is not, as there are about $10^{11}$ stars in a typical galaxy which is
about 1 Mpc across. Thus, the actual density of the universe must be close to the critical density.
Of course, we need to measure ρ experimentally to know whether the universe is flat, open, or
closed.
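
The numbers in Eq. (123) can be reproduced with a few lines of Python. This is only a sketch: the values of G and of the solar mass, and the function name `critical_density`, are assumptions not given in the text, and h = 0.7 is used as suggested above, so the result differs slightly in the last digit from the quoted 9.32 × 10⁻²⁷ kg/m³.

```python
import math

# Assumed constants (not quoted in the text): G and the solar mass.
G = 6.674e-11          # gravitational constant in m^3 kg^-1 s^-2
M_SUN = 1.989e30       # solar mass in kg
MPC_IN_M = 3.086e22    # 1 Mpc in metres (3.086e19 km, as in the text)


def critical_density(h):
    """Return rho_c = 3 H0^2 / (8 pi G) in kg/m^3 for H0 = 100 h km/s/Mpc."""
    hubble_si = 100.0 * h * 1.0e3 / MPC_IN_M   # convert km/s/Mpc to 1/s
    return 3.0 * hubble_si**2 / (8.0 * math.pi * G)


rho_h1 = critical_density(1.0)                 # coefficient of h^2
rho_h07 = critical_density(0.7)                # value for h = 0.7
rho_solar = rho_h07 * MPC_IN_M**3 / M_SUN      # in solar masses per Mpc^3

print(f"rho_c / h^2      = {rho_h1:.3e} kg/m^3")
print(f"rho_c (h = 0.7)  = {rho_h07:.3e} kg/m^3")
print(f"rho_c (h = 0.7)  = {rho_solar:.3e} M_sun/Mpc^3")
```

The first and third printed values can be compared directly with the first and last entries of Eq. (123).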
The dimensionless density parameter Ω(t) is defined as
$$ \Omega(t) = \frac{\rho(t)}{\rho_c(t)} . \tag{124} $$
The present value of the density parameter Ω(t0) is denoted by Ω0.
We can now write
$$ H^2 = \frac{8\pi G}{3}\rho_c\Omega - \frac{k}{a^2} = H^2\Omega - \frac{k}{a^2} \implies \Omega - 1 = \frac{k}{a^2H^2} . \tag{125} $$
If Ω = 1, k = 0; but k is a constant and so must always remain zero, and hence Ω cannot evolve
with time but must remain unity always. This is called a critical-density universe. Of course, Ω
receives contribution from everything like matter, radiation, and dark matter; even dark energy.
If dark energy comes from a nonzero cosmological constant, we can define a density parameter for
that too:
$$ \Omega_\Lambda(t) = \frac{\Lambda}{3H^2} , \tag{126} $$
and following the same steps as in Eq. (125), we get
$$ \Omega + \Omega_\Lambda - 1 = \frac{k}{a^2H^2} , \tag{127} $$
so that the conditions for open, flat, and closed universes are, respectively, 0 < Ω + ΩΛ < 1,
Ω + ΩΛ = 1, Ω + ΩΛ > 1. Here, Ω stands for combined matter (including dark matter and
neutrinos) and radiation densities.
Thus, the density for the cosmological constant is
$$ \rho_\Lambda = \frac{\Lambda}{8\pi G} , \tag{128} $$
so that ΩΛ ≡ ρΛ/ρc, where ρc is the critical density. The Friedmann equation can be written as
$$ H^2 = \frac{8\pi G}{3}(\rho + \rho_\Lambda) - \frac{k}{a^2} . \tag{129} $$
ρΛ is constant, so the fluid equation gives ρΛ + pΛ = 0. The cosmological constant has a negative
effective pressure; work is done on the cosmological constant fluid as the universe expands.
Often cosmologists call $-k/(a^2H^2)$ the curvature density $\Omega_k$, and hence the equation becomes
$$ \sum_i \Omega_i + \Omega_k = 1 , \tag{130} $$
where we take the sum over all matter, radiation, dark matter, and dark energy. The present
estimate from Planck is Ωbaryon = 0.0487 ± 0.0016, ΩCDM = 0.265 ± 0.014, and ΩΛ = 0.685 ± 0.017,
and $\sum_i \Omega_i$ = 1.0005 ± 0.0065, which is perfectly consistent with a flat universe model.

Q. Using the mass-energy relationship, show that


$$ \rho_c(t_0) = 1.69\,h^2 \times 10^{-9}\ {\rm J/m^3} = 8.37 \times 10^{-10}\ {\rm J/m^3} . \tag{131} $$

6.4.2 The deceleration parameter

The deceleration parameter quantifies how the rate of expansion changes. Its value at the present
time, q0 , is defined as
$$ q_0 = -\frac{\ddot a(t_0)}{a(t_0)}\,\frac{1}{H_0^2} = -\frac{a(t_0)\ddot a(t_0)}{\dot a^2(t_0)} , \tag{132} $$
so that if we make a Taylor expansion of a(t) about its present value a(t0), we can write
$$ a(t) = a(t_0) + \dot a(t_0)(t - t_0) + \frac{1}{2}\ddot a(t_0)(t - t_0)^2 + \cdots $$
$$ \frac{a(t)}{a(t_0)} = 1 + H_0(t - t_0) - \frac{1}{2}q_0H_0^2(t - t_0)^2 + \cdots \tag{133} $$

Figure 21: The evidence of dark energy from supernova Ia studies (left). The allowed region in the
Ωm-ΩΛ plane, from supernova studies, baryon acoustic oscillations (a clustering of baryons in the
universe due to primordial acoustic waves, which acts as a standard cosmological ruler), and
cosmic microwave background data (right).

Suppose we have a matter-dominated universe with nonzero Λ. Thus, from the Friedmann
equations,
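$$ \frac{a\ddot a}{a^2} = \frac{\Lambda}{2} - \frac{\dot a^2 + k}{2a^2} = \frac{1}{3}(\Lambda - 4\pi G\rho) , \tag{134} $$
which gives, using Eq. (122),
$$ q_0 = \frac{1}{2}\Omega_0 - \Omega_\Lambda . \tag{135} $$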
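
As a quick numerical illustration of Eq. (135), the sketch below evaluates q0 for a pressureless universe with a cosmological constant. The function name is a hypothetical helper, and the input values Ω0 = 0.3, ΩΛ = 0.7 are the rough present estimates quoted elsewhere in these notes.

```python
def deceleration_parameter(omega_matter, omega_lambda):
    """q0 = Omega_0 / 2 - Omega_Lambda for a pressureless universe, Eq. (135)."""
    return 0.5 * omega_matter - omega_lambda


# Matter only, critical density: the expansion decelerates (q0 > 0).
q_matter_only = deceleration_parameter(1.0, 0.0)

# Rough present estimates: the expansion accelerates (q0 < 0).
q_today = deceleration_parameter(0.3, 0.7)

print(f"q0 (Omega_0 = 1, no Lambda)        = {q_matter_only:+.2f}")
print(f"q0 (Omega_0 = 0.3, Omega_L = 0.7)  = {q_today:+.2f}")
```

A negative value of q0 signals acceleration, so the sign of the second result is the point of interest.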
Q. Consider a matter-filled universe (with Λ = 0) so that p = 0. Show, from Eq. (119), that
$$ q_0 = \frac{4\pi G\rho}{3}\,\frac{3}{8\pi G\rho_c} = \frac{1}{2}\Omega_0 . \tag{136} $$
Thus, a measurement of q0 immediately gives Ω0.
Q. For a radiation-filled universe, $p = \frac{1}{3}\rho$. Again perform the same exercise and show that q0 = Ω0.
Q. Perform the same exercise for a radiation-filled universe but with nonzero Λ; see Eq. (120).
Q. Suppose $a(t) = a(t_0)(t/t_0)^{2/3}$. Show that $H = 2/(3t)$ and $q_0 > 0$.²⁷

27
q0 is an experimentally measurable quantity, and for an accelerating universe, q0 < 0. It is this observation that
led to the concept of dark energy.

6.5 The redshift

Consider two nearby point sources (for cosmology, this is equivalent to two close galaxies) at a
separation dr. If their relative velocity is dv, we can write dv = (ȧ/a)dr = Hdr. The galaxies are
connected by a light signal, so dr = dt (in units where c = 1). The change in wavelength between the emitter and the
receiver galaxies is given by dλ = λr − λe, so
$$ \frac{d\lambda}{\lambda_e} = dv = \frac{\dot a}{a}\,dr = \frac{\dot a}{a}\,dt = \frac{da}{a} , \tag{137} $$
which gives λ ∝ a. So the wavelength increases and the energy decreases as the space expands.
The redshift can be expressed as
$$ 1 + z = \frac{\lambda_r}{\lambda_e} = \frac{a(t_r)}{a(t_e)} . \tag{138} $$
So, if z = 0.5, the size of the universe was only 2/3 of its present size when the light was emitted.
Cosmologists often quote the age (which is related to the size) as the redshift. For large z ($z \gg 1$)
one can neglect 1 on the left-hand side of (138).
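
Eq. (138) is easy to turn into a small helper. The sketch below (the function name and the default a(tr) = 1 are illustrative choices) returns the scale factor at emission for a given redshift.

```python
def scale_factor_at_emission(z, a_received=1.0):
    """Invert 1 + z = a(t_r) / a(t_e), Eq. (138), for a(t_e)."""
    return a_received / (1.0 + z)


a_half = scale_factor_at_emission(0.5)      # the z = 0.5 example in the text
a_large = scale_factor_at_emission(1000.0)  # large z, where 1 + z is close to z

print(f"z = 0.5  : a(t_e) = {a_half:.4f}")
print(f"z = 1000 : a(t_e) = {a_large:.3e}  (compare 1/z = {1.0 / 1000.0:.3e})")
```

The second line shows how little is lost by neglecting the 1 when z is large.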

7 The evolution of the universe

7.1 Matter dominance

Next, we need to solve the Friedmann equations to know how the universe evolves. If there is only
non-relativistic matter, p = 0, so the fluid equation becomes
$$ \dot\rho + 3\frac{\dot a}{a}\rho = 0 \implies \frac{1}{a^3}\frac{d}{dt}(\rho a^3) = 0 \implies \rho a^3 = {\rm constant} \implies \rho \propto \frac{1}{a^3} , \tag{139} $$
so that the density falls as the volume increases, which is quite intuitive.
If the universe is flat (which it probably is to a very good approximation; we will see why it
must have been at least this flat, or even flatter, in the past), k = 0, and we can scale a by any constant
since it is only the ratio ȧ/a that appears. Let us set a(t0) = 1 so that physical and comoving
coordinates are the same at present. This gives the density at any time t:
$$ \rho(t) = \frac{\rho_0}{a^3} . \tag{140} $$
Thus, for a flat universe,
$$ \frac{\dot a^2}{a^2} = \frac{8\pi G}{3}\frac{\rho_0}{a^3} \implies \dot a^2 = \frac{8\pi G\rho_0}{3}\frac{1}{a} . \tag{141} $$
Assume that the solution has a power-law form: $a(t) \propto t^q$. The left-hand side goes as $t^{2q-2}$ and
the right-hand side as $t^{-q}$, which are equal only if q = 2/3. So we can write
$$ a(t) = a(t_0)\left(\frac{t}{t_0}\right)^{2/3} = \left(\frac{t}{t_0}\right)^{2/3} , \tag{142} $$
as a(t0) = 1. This also gives
$$ \rho(t) = \frac{\rho_0}{a^3} = \frac{\rho_0 t_0^2}{t^2} , \tag{143} $$
and
$$ H = \frac{\dot a}{a} = \frac{2}{3t} . \tag{144} $$
This shows that the expansion slows down with time but never quite reaches zero. Thus, even the
pull of gravity is not enough to make the universe recollapse.
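
The power-law solution (142) can be checked numerically. The sketch below integrates Eq. (141) with a standard fourth-order Runge-Kutta step, working in units where H0 = 1 so that 8πGρ0/3 = H0² = 1 and t0 = 2/3. The function names, the step count, and the choice of starting point are illustrative assumptions.

```python
def a_dot(a):
    """Right-hand side of Eq. (141) in units with H0 = 1: da/dt = a**(-1/2)."""
    return 1.0 / a**0.5


def integrate_scale_factor(a_start, t_start, t_end, steps=1000):
    """Integrate da/dt = a_dot(a) from t_start to t_end with classical RK4."""
    dt = (t_end - t_start) / steps
    a = a_start
    for _ in range(steps):
        k1 = a_dot(a)
        k2 = a_dot(a + 0.5 * dt * k1)
        k3 = a_dot(a + 0.5 * dt * k2)
        k4 = a_dot(a + dt * k3)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


t0 = 2.0 / 3.0        # present age in units of 1/H0, from H = 2/(3t)
t_start = t0 / 8.0    # Eq. (142) gives a = (1/8)**(2/3) = 1/4 at this time
a_numeric = integrate_scale_factor(0.25, t_start, t0)

print(f"numerical a(t0) = {a_numeric:.8f}  (analytic value: 1)")
```

Starting from the analytic value a = 1/4 at t = t0/8, the integration should return a(t0) very close to 1, which is what Eq. (142) predicts.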

7.2 Radiation dominance

The discussion is effectively the same except for the fact that radiation has a pressure $p = \frac{1}{3}\rho$. This
gives
$$ \dot\rho + 4\frac{\dot a}{a}\rho = 0 \implies \frac{1}{a^4}\frac{d}{dt}(\rho a^4) = 0 \implies \rho a^4 = {\rm constant} \implies \rho \propto \frac{1}{a^4} . \tag{145} $$
Assuming a power-law dependence, this gives
$$ a(t) = \left(\frac{t}{t_0}\right)^{1/2} , \qquad \rho(t) = \frac{\rho_0}{a^4} = \frac{\rho_0 t_0^2}{t^2} , \qquad H = \frac{\dot a}{a} = \frac{1}{2t} . \tag{146} $$
Again, the universe expands forever, but the rate of expansion is slower than in a matter-dominated
universe. Why does the density go down as $a^{-4}$? Three powers of a are due to volume expansion.
However, as the space stretches, the wavelength increases, and the energy goes down as 1/a due to
the stretching of space alone, so the combined dependence is $1/a^4$. Remember that neutrinos are
relativistic and hence are a component of radiation.

7.3 Mixtures

Now consider a more realistic universe with both matter and radiation, with densities ρmat and
ρrad respectively, so the total density is ρ = ρmat + ρrad. Both the densities fall with time; ρmat as
$a^{-3}$ and ρrad as $a^{-4}$. The solution is messy, but it helps if we try to solve it in the extreme limits,
when one density dominates the other.

Assume that the radiation dominates the matter. Then $a(t) \propto t^{1/2}$, so
$$ \rho_{\rm rad} \propto \frac{1}{t^2} , \qquad \rho_{\rm mat} \propto \frac{1}{a^3} \propto \frac{1}{t^{3/2}} . \tag{147} $$
Thus, the matter density falls off more slowly, so radiation dominance is an unstable situation; even
if we have a tiny bit of matter, ultimately it will come to dominate the evolution. When the matter
becomes dominant, the expansion rate speeds up: $a(t) \propto t^{2/3}$, so
$$ \rho_{\rm mat} \propto \frac{1}{t^2} , \qquad \rho_{\rm rad} \propto \frac{1}{a^4} \propto \frac{1}{t^{8/3}} . \tag{148} $$
The radiation density falls faster (naturally, as the space stretches), so matter becomes more and
more dominant with time, and this is a stable situation. This is, in all probability, the state
our universe is in now.
Note that the number densities for both matter and radiation fall off as $1/a^3$ due to the expansion
of the volume. For radiation, each photon also loses energy due to the stretching of space as 1/a,
causing the redshift.

7.4 More exotic situations

Suppose the equation of state is p = (γ − 1)ρ where 0 < γ < 2. This leads to $\rho a^{3\gamma}$ = constant, and
hence $a(t) = (t/t_0)^{2/(3\gamma)}$. The Hubble parameter goes down as $H = 2/(3\gamma t)$.
This solution breaks down when γ = 0. Here p = −ρ, and hence ρ̇ = 0, so the density is
constant in time: ρ = ρ0. But this does not mean that the universe is static; in fact, there is an
exponential increase:
$$ \frac{\dot a^2}{a^2} = \frac{8\pi G\rho_0}{3} \implies \dot a = \sqrt{\frac{8\pi G\rho_0}{3}}\;a \implies a(t) \propto \exp\left(\sqrt{\frac{8\pi G\rho_0}{3}}\;t\right) . \tag{149} $$

It is believed that this actually happened almost immediately after the big bang, and is known as inflation.
Suppose the universe is curved so that there is a curvature term $-k/a^2$ in the Friedmann
equations. The densities (both matter and radiation) fall faster than $1/a^2$, so ultimately the
universe will be curvature-dominated, however small k might be. That it is still so flat suggests
that to start with, the curvature term must have been extremely and uncomfortably close to zero.
This is uncomfortable because there is no apparent symmetry to keep it to zero; so it must have
been a very fortuitous accident to get the present universe. This is known as the flatness problem
and is one of the motivations behind the idea of inflation, which we discuss later. However, if the
curvature term ever dominates, ȧ becomes constant, and the universe undergoes a free expansion.
From the experimental standpoint, it is better to consider a flat universe with a nonzero Λ. As
we saw, if Λ is large, it can overcome the negative contributions from pressure and density and lead
to an accelerating universe: ä > 0 or q0 < 0.
What exactly is the origin of the cosmological constant? Perhaps the origin is the zero-point
energy of the quantum fields. This can be treated as an energy of the vacuum; particle physics
never bothers about it since only the difference between zero-level (vacuum) and the excited levels
matters. However, such a nonzero energy density (analogous to the zero-point energy of the quan-
tum harmonic oscillator) contributes to Λ. Unfortunately, it turns out that the contribution from
the fields is about 60 orders of magnitude larger than the experimentally favoured value of Λ. This
is one of the most challenging problems in cosmology.

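Figure 22: The allowed region in the Ω-ΩΛ plane (horizontal axis ΩM, vertical axis ΩΛ). The plot
marks the flat-universe line separating closed and open universes, the Λ = 0 line, the boundary
between universes that expand forever and those that recollapse eventually, and a "No Big Bang"
region. Supernova allowed region (68%, 90%, 95%, and 99% contours) has been superposed.
Also see figure 21 (b).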

To have an idea about the fate of the universe, see figure 22. The line Ω + ΩΛ = 1 (in the figure
Ω is called ΩM ) separates the open and closed universes. If Λ is large, there is no big bang solution.
A pressureless universe accelerates if ΩΛ > Ω0/2, see Eq. (135). If the universe is flat, ΩΛ = 1 − Ω0, so
$q_0 = \frac{3}{2}\Omega_0 - 1$, and ΩΛ > 1/3 is the required condition for acceleration. Note that the matter density goes down with time,
but the density associated with Λ is a constant. So even if there is a tiny cosmological constant, ultimately it will dominate
the evolution of the universe. The acceleration starts when ΩΛ exceeds 1/3.
Whether the universe expands forever or recollapses depends on both Ω and ΩΛ . If the matter
(plus radiation, dark matter, etc.) density is small, the universe expands if ΩΛ > 0. If the matter
density is large, the gravitational pull can overcome a small positive cosmological constant and
cause recollapse.

The present estimate is Ω ≈ 0.3, ΩΛ ≈ 0.7 (see fig. 21(b)), so the universe is expected to be
flat, expanding forever, and accelerating. What happens when the universe expands to, say, 5
times its present size? Right now Ω/ΩΛ ≈ 3/7. Only the matter density goes down, by a factor of $1/5^3 = 1/125$
(assuming matter dominance), so at that time, Ω/ΩΛ = 0.43/125 ≈ 0.0034. If the universe is still
flat, Ω + ΩΛ = 1, so Ω ≈ 0.003 and ΩΛ ≈ 0.997. This shows that the evolution will entirely be
governed by the cosmological constant, and it will be an exponential expansion.
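
The estimate above can be repeated for any scale factor. The sketch below assumes a flat universe containing only pressureless matter and a cosmological constant, with the matter density falling as 1/a³ and the Λ density staying constant; the function name and default values are illustrative.

```python
def density_parameters(a, omega_m0=0.3, omega_l0=0.7):
    """Return (Omega, Omega_Lambda) at scale factor a for a flat universe.

    Assumes matter density ~ a**-3 and a constant Lambda density, with
    present values omega_m0 and omega_l0 at a = 1.
    """
    matter = omega_m0 / a**3           # matter density in units of rho_c(t0)
    total = matter + omega_l0          # flat: total density is the critical one
    return matter / total, omega_l0 / total


omega_now, lambda_now = density_parameters(1.0)
omega_later, lambda_later = density_parameters(5.0)

print(f"a = 1 : Omega = {omega_now:.4f}, Omega_Lambda = {lambda_now:.4f}")
print(f"a = 5 : Omega = {omega_later:.4f}, Omega_Lambda = {lambda_later:.4f}")
```

Trying a few larger values of a shows how quickly the matter contribution becomes irrelevant.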

7.5 The cosmological fine-tuning problem

One can find the dark energy density at present, from ρc and ΩΛ . This energy is present even
when there is no matter or radiation, so we can call it the energy density of vacuum. What might
contribute to this energy density? The Standard Model of particle physics may provide an answer.
In the Standard Model, there is a scalar field Φ, called the Higgs field, with a potential which has a
certain symmetry but the vacuum is not symmetric. This is called spontaneous symmetry breaking
and this mechanism is supposed to be responsible for the masses of all elementary particles.
This symmetry breaking is a phase transition, although we do not yet know whether it is
first order or second order. The order parameter is the vacuum expectation value (VEV) of the
Higgs field Φ, defined as $\langle 0|\Phi|0\rangle = v/\sqrt{2}$, where $|0\rangle$ is the vacuum state of the theory. At high
temperature, the symmetry was unbroken; it breaks when the temperature of the universe goes
down, and the Higgs field acquires a nonzero VEV. But at the same time, the potential gets a
constant contribution; the lowest point of the potential is lowered by a finite amount, which can
be written in terms of the parameters of the potential. This naturally should contribute to the
vacuum energy density. Unfortunately, this is about 60 orders of magnitude too large compared to
ρc .
Thus, there must be some other contribution, coming from the geometry of space-time, that
cancels this huge contribution and leaves a remnant that is ∼ 60 orders of magnitude smaller.
However, these two contributions are apparently uncorrelated; one comes from the geometry and
the other comes from particle physics! If a nonzero Λ is indeed the source of dark energy, we have
to live with this, which is known as the cosmological fine-tuning problem.

8 The universe

8.1 The age of the universe

The natural time scale associated with cosmology is the inverse of the Hubble parameter. For
example, if the present time is denoted t0, we can set its scale by equating it with $H_0^{-1}$. Putting 1
Mpc equal to $3.086 \times 10^{19}$ km and 1 year equal to $3.156 \times 10^7$ s, we get
$$ H_0^{-1} = 13.9 \times 10^9\ {\rm yr} , \tag{150} $$
which is a very close estimate of the actual age of the universe, 13.75 ± 0.11 billion years. If
the universe has only matter in it, and has a critical density ρc, then we know how the Hubble
parameter varies: $H = 2/(3t)$, and so $t_0 = \frac{2}{3}H_0^{-1}$, which is quite a bit smaller than the expected age
of the universe. Geology tells us that the earth is about 5 billion years old, so the universe must be
older than that. How do we measure the exact age of the universe?
This is obtained by measuring the spectrum of the uranium isotopes in the galactic disc. Uranium
is formed inside dying and exploding stars by something known as the r-process. If we
know from theory the ratio in which the two dominant isotopes of uranium, $^{235}$U and $^{238}$U, are
expected to be produced, and if we can measure their relative abundances now, then with the
knowledge of their half-lives (which are very accurately known), we can find the time when the first
such explosions occurred. We might add a billion years for the formation and evolution of those
first-generation stars.
If the universe contains only matter, but is open (Ω0 < 1), then it can be shown that
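$$ H_0t_0 = \frac{1}{1 - \Omega_0} - \frac{\Omega_0}{2(1 - \Omega_0)^{3/2}}\cosh^{-1}\left(\frac{2 - \Omega_0}{\Omega_0}\right) . \tag{151} $$
The maximum value of the right-hand side is 1 for an empty universe, and if the universe is critical
(Ω0 = 1), we get $t_0 = 2/(3H_0)$ back. For the estimated density Ω0 = 0.3, H0t0 ≈ 0.8.
On the other hand, if the universe is flat, but contains a nonzero ΩΛ, the equation becomes
$$ H_0t_0 = \frac{2}{3}\frac{1}{\sqrt{1 - \Omega_0}}\ln\left(\frac{1 + \sqrt{1 - \Omega_0}}{\sqrt{\Omega_0}}\right) = \frac{2}{3}\frac{1}{\sqrt{1 - \Omega_0}}\sinh^{-1}\left(\sqrt{\frac{1 - \Omega_0}{\Omega_0}}\right) . \tag{152} $$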
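
Eqs. (150) to (152) are straightforward to evaluate with the Python standard library. The sketch below defines one function per formula (the function names are illustrative, and h = 0.7 is assumed for the Hubble time) and prints the open-universe value quoted above; the flat-universe function is left for the reader to call when attempting the questions that follow.

```python
import math


def age_open_matter(omega0):
    """H0 * t0 for an open, matter-only universe (0 < omega0 < 1), Eq. (151)."""
    term = omega0 / (2.0 * (1.0 - omega0) ** 1.5)
    return 1.0 / (1.0 - omega0) - term * math.acosh((2.0 - omega0) / omega0)


def age_flat_lambda(omega0):
    """H0 * t0 for a flat universe with matter and Lambda (0 < omega0 < 1), Eq. (152)."""
    root = math.sqrt(1.0 - omega0)
    return 2.0 / (3.0 * root) * math.asinh(math.sqrt((1.0 - omega0) / omega0))


def hubble_time_gyr(h):
    """1 / H0 in units of 10^9 yr for H0 = 100 h km/s/Mpc, as in Eq. (150)."""
    km_per_mpc = 3.086e19
    seconds_per_year = 3.156e7
    return km_per_mpc / (100.0 * h) / seconds_per_year / 1.0e9


print(f"1/H0 (h = 0.7)           = {hubble_time_gyr(0.7):.2f} x 10^9 yr")
print(f"H0 t0, open, Omega0=0.3  = {age_open_matter(0.3):.3f}")
```

Multiplying either dimensionless result by the Hubble time converts it into an age in years.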

Q. What is the age of the universe for Ω0 = 0.3, using Eq. (152)?
Q. Show that (152) gives $H_0t_0 = \frac{2}{3}$ as Ω0 → 1. Numerically find the value of Ω0 when H0t0 = 1.
Q. The initial abundances of the two uranium isotopes are expected to be in the ratio $^{235}$U : $^{238}$U = 1.65.
The present ratio is only 0.0065. The half-lives are $\tau_{235} = 0.714 \times 10^9$ yr, $\tau_{238} = 4.62 \times 10^9$ yr. Find the
time when the explosion occurred. Add another billion years and you’ll get approximately the time when
the galaxy was formed.

8.2 The cosmic microwave background

While this all-pervading background was accidentally discovered by Penzias and Wilson in 1965,
it was only the first nine minutes of data taken by the Far Infrared Absolute Spectrophotometer
(FIRAS), on board the Cosmic Background Explorer (COBE) satellite launched by NASA in 1989,
that established the black-body spectrum to an accuracy better than 1%. Now we know that it is
smooth to a level of $10^{-5}$ with a temperature of 2.725 ± 0.001 K.
You found earlier that the present-day energy density in the CMB is $4.17 \times 10^{-14}$ J/m$^3$. Using Eq.
(131), we get
$$ \Omega_{\rm rad} = 2.47 \times 10^{-5}\,h^{-2} = 4.98 \times 10^{-5} . \tag{153} $$
Clearly, it is insignificant compared to the other components of Ω. But it was not always so. We
know $\rho_{\rm rad} \propto a^{-4}$. Comparing with Eq. (111), we get a very important relation:

$$ T \propto a^{-1} . \tag{154} $$

Thus, the universe cools as it expands, and it was arbitrarily hot at arbitrarily early times. The
interesting point is that once the radiation was thermalised and attained a black-body character, it
would always retain a black-body spectrum. This is evident from the black-body energy distribution,
Eq. (110). Both ν and T vary as 1/a, so the exponential in the denominator retains its form.
The $\nu^3$ in the numerator goes down as $1/a^3$, but the volume containing the radiation increases by
the same factor, so the spectrum remains unchanged. Obviously, both ν and T go down, so the
peak of the distribution shifts to the low-energy end, and the energy density goes down as $1/a^4$.
Given $\epsilon_{\rm rad}(t_0) = 4.17 \times 10^{-14}$ J/m$^3$, and the mean energy of a microwave photon

$$ E_{\rm mean} \approx 3k_BT = 7.05 \times 10^{-4}\ {\rm eV} = 1.13 \times 10^{-22}\ {\rm J} , \tag{155} $$

we get the present number density of photons:

$$ n_\gamma = 3.69 \times 10^8\ {\rm m^{-3}} . \tag{156} $$

Assuming Ωbaryon = 0.0456, and using Eq. (131), we get the energy density of baryons:

$$ \epsilon_{\rm baryon} = 3.7 \times 10^{-11}\ {\rm J/m^3} . \tag{157} $$

Baryons are non-relativistic, so the energy of each baryon is essentially its rest-mass energy, which is 938 MeV = $1.5 \times 10^{-10}$
J. This gives the number density of baryons

$$ n_b = 0.25\ {\rm m^{-3}} . \tag{158} $$

Even though there are many more photons than baryons,

$$ \frac{n_b}{n_\gamma} = 6.78 \times 10^{-10} , \tag{159} $$

their contribution to Ω is negligible as they have extremely low energy. However, even this gives
rise to a serious question: why is the number of the order of $10^{-10}$ and not zero? After all, we
expect equal numbers of baryons and antibaryons to be produced out of the primordial radiation,
so the net baryon number of the universe should be zero. This is still an open problem, and will
be discussed later in the subsection on baryogenesis.
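
The chain of estimates in Eqs. (155) to (159) can be reproduced as follows. This is a sketch under stated assumptions: the values of kB and of the eV-to-joule conversion are standard constants not quoted at this point in the text, h = 0.7 is used in Eq. (131), and the function names are illustrative. Small differences in the last digit relative to the quoted numbers come from rounding.

```python
K_B_EV = 8.617e-5              # Boltzmann constant in eV/K (assumed value)
EV_IN_J = 1.602e-19            # 1 eV in joules (assumed value)
T_CMB = 2.725                  # present CMB temperature in K
EPS_RAD = 4.17e-14             # present CMB energy density in J/m^3 (from the text)
RHO_C_ENERGY_PER_H2 = 1.69e-9  # critical energy density / h^2 in J/m^3, Eq. (131)


def photon_number_density():
    """n_gamma = eps_rad / (3 k_B T), following Eqs. (155) and (156)."""
    mean_energy = 3.0 * K_B_EV * T_CMB * EV_IN_J   # mean photon energy in J
    return EPS_RAD / mean_energy


def baryon_number_density(omega_baryon=0.0456, h=0.7):
    """n_b = eps_baryon / (938 MeV), following Eqs. (157) and (158)."""
    eps_baryon = omega_baryon * RHO_C_ENERGY_PER_H2 * h**2
    rest_energy = 938.0e6 * EV_IN_J                # rest-mass energy in J
    return eps_baryon / rest_energy


n_gamma = photon_number_density()
n_b = baryon_number_density()

print(f"n_gamma       = {n_gamma:.3e} per m^3")
print(f"n_b           = {n_b:.3f} per m^3")
print(f"n_b / n_gamma = {n_b / n_gamma:.2e}")
```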
Next comes the question: how did the CMB originate, and what does it tell us?
The origin of CMB is easy to understand. In the primordial soup, there were a huge number of
energetic photons, plus some protons and electrons, and some charge neutral particles like neutrinos
and neutrons. The number of electrons was the same as the number of protons, but they could
not form a hydrogen atom, because as soon as the atom was formed, the electron was blasted away
by one of those energetic photons. This could occur because the energy of the photon was more
than the ionisation energy of hydrogen, 13.6 eV. Once the electron became free, it could
interact with any other photon; this is known as Thomson scattering. Thus, the mean free path
of the photon was quite small; the interaction probability was large, and the universe is said to have been
in an opaque phase.
One can estimate the mean free path. Suppose the size of the universe was one-millionth of
the present size: $a = 10^{-6}$. The temperature was about 3 million K, and so the mean energy of the
photon was $3k_BT \approx 700$ eV, far more than enough to ionise a hydrogen atom. The electron density at present
is 0.2/m$^3$, so at that time it was $2 \times 10^{17}$/m$^3$. The mean free path is roughly given by $1/(n_e\sigma_e)$,
where $\sigma_e = 6.65 \times 10^{-29}$ m$^2$ is the Thomson scattering cross-section. This gives a mean free path
of $7.5 \times 10^{10}$ m. Light takes about 250 seconds to travel this distance, which is much smaller than
the age of the universe at that time.
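
The mean free path estimate can be written as a short calculation. In this sketch the electron density is simply scaled as 1/a³ from its present value, and the speed of light is rounded to 3 × 10⁸ m/s; the function name is illustrative.

```python
SIGMA_THOMSON = 6.65e-29    # Thomson cross-section in m^2 (from the text)
N_ELECTRON_TODAY = 0.2      # present electron number density per m^3 (from the text)
SPEED_OF_LIGHT = 3.0e8      # m/s, rounded


def thomson_mean_free_path(a):
    """Photon mean free path 1 / (n_e sigma_e) in metres at scale factor a."""
    n_electron = N_ELECTRON_TODAY / a**3   # number density dilutes as 1/a^3
    return 1.0 / (n_electron * SIGMA_THOMSON)


path = thomson_mean_free_path(1.0e-6)
travel_time = path / SPEED_OF_LIGHT

print(f"mean free path at a = 1e-6 : {path:.2e} m")
print(f"light travel time          : {travel_time:.0f} s")
```

Calling the function with a = 1 gives the mean free path the photons would have today if all electrons were still free, which is a useful comparison with the size of the universe.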
As the universe expands and cools, the photons lose their energy, and there comes a time when
their energy falls below 13.6 eV. Once that happens, the hydrogen atom remains stable, and the
free electrons are quickly removed from the soup. The mean free path of the photon increases by
a huge amount; in fact it gets larger than the size of the universe. This is known as decoupling;
the universe becomes transparent to the photons. Since then, the photons travel unimpeded, and
lose their energy as the universe expands. At the time of decoupling, the distribution was that of
a black-body; so even now it retains the black-body character.
We can make a crude estimate of the temperature when the photons decoupled, by equating
3kB T with 13.6 eV. This gives a temperature of roughly 50000 K. This is too large an estimate, and
we can see why. It is not strictly necessary for the mean energy to go below 13.6 eV. Remember
that there are approximately $1.5 \times 10^9$ photons per baryon. Even if one of them has an energy
more than 13.6 eV, it will ionise the atom, and then the rest of the photons can interact with the
free electron. Thus, the temperature should be such that there is not even one photon in
1.5 billion whose energy is greater than 13.6 eV. The photon energy distribution has a Boltzmann
tail, and the decoupling temperature Tdec is given by
$$ \exp(-13.6\ {\rm eV}/k_BT_{\rm dec}) = \frac{1}{1.5 \times 10^9} \implies T_{\rm dec} = \frac{13.6\ {\rm eV}}{k_B\ln(1.5 \times 10^9)} \approx 7500\ {\rm K} . \tag{160} $$

Even this contains an overestimation by a factor of 2.5, and a more refined calculation brings it
down to 3000 K. At that time, the universe was roughly one-thousandth of its present size, or z = 1000
(the experimental number is z = 1090.89 ± 0.69, note the accuracy). There were no stars and
galaxies at that time; the structures came much later, the first stars being ignited at $t \sim 1.5 \times 10^9$
years. The photons have been travelling freely since decoupling, and are now reaching us uniformly from
all parts of the sky, originating as if on the surface of a sphere whose radius is the length those
photons travelled. This is known as the surface of last scattering. (A good analogue of this surface
is the view of the sky on a cloudy day. We can only see the clouds, not the sky above.) Observers
at other parts of the universe will see photons coming from different spheres, so the surface of last
scattering is not something uniform over the entire universe. Needless to say, the photons are also
getting cooled due to the expansion of the universe.
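
The two temperature estimates above, and the scale factor corresponding to 3000 K, can be checked numerically. The value of kB is an assumed standard constant, and the function names are illustrative.

```python
import math

K_B_EV = 8.617e-5       # Boltzmann constant in eV/K (assumed value)
E_IONISATION = 13.6     # ionisation energy of hydrogen in eV
T_CMB = 2.725           # present CMB temperature in K


def crude_decoupling_temperature():
    """Temperature at which the mean photon energy 3 k_B T equals 13.6 eV."""
    return E_IONISATION / (3.0 * K_B_EV)


def boltzmann_tail_temperature(photons_per_baryon=1.5e9):
    """Temperature from exp(-13.6 eV / k_B T) = 1 / photons_per_baryon, Eq. (160)."""
    return E_IONISATION / (K_B_EV * math.log(photons_per_baryon))


t_crude = crude_decoupling_temperature()
t_tail = boltzmann_tail_temperature()
a_dec = T_CMB / 3000.0   # from T ~ 1/a, using the refined value of 3000 K

print(f"crude estimate (3 k_B T = 13.6 eV) : {t_crude:.0f} K")
print(f"Boltzmann-tail estimate            : {t_tail:.0f} K")
print(f"scale factor at 3000 K             : {a_dec:.2e}  (1 + z = {1.0 / a_dec:.0f})")
```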

Figure 23: The surface of last scattering.

At present, Ωrad is $5 \times 10^{-5}$ and ΩM is 0.3. But the former falls off as $a^{-4}$ and the latter only
as $a^{-3}$, so at what value of a were they equal? The answer is a = 1/6000,²⁸ so at a = 1/1000, the
universe was definitely matter-dominated. In the matter-dominated phase, $a \propto t^{2/3}$, and taking
$t_0 = 13.7 \times 10^9$ yr = $4.3 \times 10^{17}$ s, and a0 = 1, we get

$$ t_{\rm dec} \approx 1.4 \times 10^{13}\ {\rm s} = 434000\ {\rm yr} . \tag{161} $$

The actual estimate is a bit less, like 380000 years, because we have neglected Λ in our estimate
which speeds up the expansion rate (again, the experimental error is less than 0.5%).
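
Both numbers in this estimate follow from simple scaling and can be checked in a few lines. The sketch assumes, as in the text, that radiation and matter densities scale as a⁻⁴ and a⁻³, and that a ∝ t^(2/3) all the way from decoupling to the present; the function names are illustrative.

```python
def equality_scale_factor(omega_rad=5.0e-5, omega_matter=0.3):
    """Scale factor at which rho_rad ~ a**-4 equals rho_mat ~ a**-3."""
    return omega_rad / omega_matter


def matter_era_time(a, t0_years=13.7e9):
    """Invert a = (t / t0)**(2/3) for t, assuming matter dominance throughout."""
    return t0_years * a**1.5


a_eq = equality_scale_factor()
t_dec = matter_era_time(1.0e-3)

print(f"matter-radiation equality : a = 1/{1.0 / a_eq:.0f}")
print(f"decoupling (a = 1/1000)   : t = {t_dec:.3e} yr")
```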

28
The corresponding redshift goes down a bit if we include neutrinos in Ωrad, but the equality still occurs well before a = 1/1000.

8.3 The horizon problem

The uniformity of the CMB throws up another puzzle for Big Bang cosmology. Consider two points A and
B in the sky opposite to each other. The spectrum is identical for both these points, so there must
have been some way of communication between them. But light from A has only just reached
us, so it could not have reached B. How, then, was the communication established? What
makes the spectra from A and B identical? This is known as the horizon problem.
In fact, the problem is much worse than this. The photons have been travelling unimpeded since
decoupling, so the uniformity must have been established before decoupling took place. But the
universe was very young then, and only very nearby points could have been in causal contact with
each other. Since then, the universe has expanded thousandfold, and even points which are very
close in the sky now (say a degree or two apart) could not have been in contact at the decoupling
epoch. Again, as we will see, inflation helps us to solve this problem. However, note that the CMB is
not exactly uniform: there are anisotropies at the scale of 1 in $10^5$ over the smooth spectrum,
which point to some degree of non-smoothness in the distant past, perhaps relevant for galaxy and
structure formation.
Q. Show that the velocity of light is roughly given by $c = 3.076 \times 10^{-7}$ Mpc/yr. If the age of the universe
is 13.7 billion years and the decoupling occurred 380000 years after the big bang, what is the radius of the surface of last
scattering? [Note: The actual size is more than this roughly by a factor of three. This is because as the light
comes towards us, the space expands.]
Q. Right now, Ωrad/Ωbaryon ≈ $1.1 \times 10^{-3}$, so the universe is matter dominated. At what value of a were
they equal? What was the temperature then? What was the number density ratio nγ/nb? If you take all
matter into account so that ΩM(t0) = 0.3, at what value of a did the transition take place?
Q. Given kB = $8.619 \times 10^{-5}$ eV/K, and that the mean energy of a photon is 3kBT, find the size of the universe
when the photons had an average energy of 1 MeV. If the present number density of the electrons is 0.2/m$^3$,
what was the density then?
Q. The photon decoupling occurred roughly when the universe was $10^{13}$ s old, corresponding to a temperature
of 3000 K. If it were radiation dominated till that point (which it was not), $a(t) \propto t^{1/2}$. Calculate the
temperature when the universe was 1 s old. What was the average photon energy then?
Q. The LHC is going to have collisions with an energy of 3000 GeV (the energy is distributed among the
constituents of the protons). At what time was the universe this hot?
Q. Given that the universe had a temperature of 3000 K at $10^{13}$ s and 10000 K at $10^{12}$ s, and that the transition from
radiation to matter dominance occurred between these two times, set up the equation to find the transition temperature
and find the solution iteratively.

8.4 The CMB anisotropy

The CMB appears to be extremely uniform, but it is not precisely so; there are tiny anisotropies, of
the order of 1 part in $10^5$, in the spectrum. The largest contribution to the anisotropy comes from the motion
of the earth and is nothing but a Doppler shift; CMB photons are slightly blue-shifted in the
direction of our motion and slightly red-shifted in the opposite direction. This is a dipole anisotropy and is
purely a local effect, so it is of little importance as far as the physics implications are concerned. To
describe the intrinsic anisotropies, note that the surface of last scattering appears as a spherical
surface to us, so any asymmetry may be expressed in terms of the spherical harmonics:
\[
\frac{\Delta T}{T} = \sum_{l=1}^{\infty} \sum_{m=-l}^{l} a_{lm} Y_{lm}(\theta, \phi) \, . \tag{162}
\]

An isotropic universe has no preferred direction and hence no dependence on m; so we can define,
as a measure of anisotropy,
\[
C_l \equiv \frac{1}{2l+1} \sum_{m} |a_{lm}|^2 \, . \tag{163}
\]
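As a concrete illustration of Eq. (163), here is a minimal Python sketch that averages $|a_{lm}|^2$ over $m$ for each $l$. The function name `angular_power` and the toy coefficients are assumptions made purely for illustration; they are not measured values.

```python
def angular_power(alm_by_l):
    """Return {l: C_l} with C_l = sum_m |a_lm|^2 / (2l + 1), as in Eq. (163).

    alm_by_l maps each multipole l to the list of its 2l + 1 complex
    coefficients a_lm, ordered m = -l, ..., +l.
    """
    spectrum = {}
    for l, coeffs in alm_by_l.items():
        if len(coeffs) != 2 * l + 1:
            raise ValueError(f"l = {l} needs {2 * l + 1} coefficients")
        spectrum[l] = sum(abs(a) ** 2 for a in coeffs) / (2 * l + 1)
    return spectrum


# Made-up coefficients (illustration only): a dipole of order 1e-3
# and a quadrupole of order 1e-5.
toy_alm = {
    1: [1e-3, 0.0, -1e-3],
    2: [1e-5, 2e-5j, 0.0, -2e-5j, 1e-5],
}

for l, c_l in angular_power(toy_alm).items():
    print(f"l = {l}: C_l = {c_l:.3e}")
```

Because $C_l$ sums over all $m$, it carries no directional information, which is what we expect for a statistically isotropic sky.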

The dipole anisotropy is given by $l = 1$ and is of the order of $10^{-3}$. The higher anisotropies,
measured only after the COBE satellite was launched, are of the order of $10^{-5}$. These intrinsic
fluctuations are caused by the tiny density fluctuations in the primordial universe, which are in
turn responsible for the structure formation. The relationship between the anisotropy and the
density fluctuation is not difficult to understand, at least qualitatively. Consider two points on
the surface of last scattering, one with slightly more concentration of matter and the other with
slightly less than the average. The gravitational potential of the first region is deeper than that
of the second, so the photons coming from the first region will be more red-shifted (and hence
have less temperature) compared to those coming from the underdense region. This is known as
the Sachs-Wolfe effect. There are other effects that contribute to the CMB anisotropy, but the
Sachs-Wolfe effect dominates on the large distance scale.

9 The early universe

The photons decoupled at $z \approx 1000$, when the universe was matter-dominated. The only exciting
thing that happened after that epoch is the formation of structures: the galaxies and the stars.
In fact, the first galaxies formed quite early; the galaxy UDFy-38135539, reported in October 2010,
was formed before the universe was 800 million years old. A lot of such young
galaxies and protogalaxies have been discovered; the latest one in this series is z8_GND_5296,
discovered in 2013, which was created when the universe was only 700 million years old, and has
a redshift $z = 7.51$. The farthest structure to be observed is UDFj-39546284, a protogalaxy with
$z = 11.9$ and an estimated age of 13.42 billion years, formed only 380 million years after the big bang.
In 2018, Judd Bowman and collaborators found the signals of the birth of the first stars when
the universe was only 180 million years old. The signal was in the form of a tiny change on the CMB
spectrum. The CMB radiation, when passing through the interstellar gas which is slightly heated
up because of the igniting stars (and therefore emitting its own characteristic line, the 21-cm line of
hydrogen), shows a minute distortion. Bowman and collaborators studied the signal in detail and
found that the distortion is indeed coming from the first stars igniting up. One more intriguing
observation is that some of the surrounding atmosphere is getting colder; the temperature is almost
half of the expected value. While it is easy to heat up the gas, there is no known mechanism at that
point of time which may cool it down. A bold hypothesis is that this may be due to the interaction
of the gas with dark matter particles, transferring their energy to the dark matter. If confirmed,
this will be the first direct evidence of dark matter interacting with ordinary matter by anything
other than gravitation.
While the formation of structures is an extremely interesting subject in its own right, we will
try to focus on earlier times, when the universe was very young. Of course, we cannot go back to
t = 0; that is a singularity of the Big Bang model, corresponding to an infinite temperature.
Before we start to track the universe backward, there is one point that we should carefully
consider. The radiation in the universe includes all relativistic particles, which means neutrinos
and antineutrinos (we assume them to be massless, though, strictly speaking, they are not) as well
as photons; even electrons are relativistic when $k_B T \gg m_e$. While electrons became nonrelativistic
much earlier (roughly at a temperature of 0.5 MeV or $6 \times 10^9$ K), the cosmic neutrino background
is still a part of the radiation density (even though they are extremely hard, almost impossible, to
detect). How should we estimate Ων ?
Both a photon and a ν-ν̄ pair have two degrees of freedom; this is true only if neutrinos are
massless, or are their own antiparticles if they are massive 29 . However, there are 3 such light
neutrinos, and the integral over the Fermi distribution gives a factor of $7\pi^4/120$ instead of $\pi^4/15$
for the Bose distribution. Together, they give an overall enhancement of $3 \times \frac{7}{8} = 21/8$ for the
neutrino energy density over the photon energy density.
29
Such fermions are called Majorana fermions. Only the neutrino can be a Majorana fermion, as it is charge neutral.
Unfortunately, this is not exactly true, and not cosmology but particle physics is responsible for
that. When the photon temperature was above $m_e = 0.51$ MeV, the $e^+$-$e^-$ pairs were in thermal
equilibrium with the photons and $\gamma + \gamma \leftrightarrow e^+ + e^-$ could proceed with equal strength in both
directions. At that time, $T_\nu = T_\gamma$. Then the photons cooled and electrons became too heavy to
be pair produced (this is also known as a freeze-out of electrons), but the remaining positrons
annihilated with electrons and produced photons. Thus, pair creation was no longer possible, but
pair annihilation still was; the energetic photons were quickly thermalised in the low-energy thermal
bath of the photons. In short, even though the photons cooled, the cooling was slowed down because
there was a feedback from the $e^+$-$e^-$ annihilation, and the effective temperature was higher than
what it would have been in the absence of pair annihilation. This is something analogous to pouring
a bucket of hot water into a tub whose water is cooling, which heats up the water again
and slows down the cooling process. This is known as the reheating of the photon bath. Neutrinos
could not take part in such a process, because $e^+ + e^- \to \nu + \bar{\nu}$ can be mediated only by the weak gauge
bosons W and Z (why not by photons?). Those bosons are much heavier, and the weak interaction
had already become much weaker in strength than the electromagnetic interaction. So, only the
photons and not the neutrinos were reheated.
The entropy density $s$ of a sea of relativistic particles at temperature $T$ (so that $T \gg m_i$) is
\[
s = \frac{4}{3} \frac{\rho}{T} \, , \qquad \rho = g_*(T) \, \frac{\pi^2}{30} \, (k_B T)^4 \, , \tag{164}
\]
where the effective number of degrees of freedom $g_*$ is written in terms of $g_{\rm boson}$ and $g_{\rm fermion}$ 30

\[
g_*(T) = \sum_{\rm boson} g_{\rm boson}(T) + \frac{7}{8} \sum_{\rm fermion} g_{\rm fermion}(T) \, , \tag{165}
\]

and this is a function of temperature because we have to take only those particles whose masses are
below $k_B T$. (We use $\hbar = c = 1$, or otherwise $\rho$ should have a factor of $(\hbar c)^3$ in the denominator.)
Before annihilation, the total number of degrees of freedom of the electron-positron-photon soup was $\frac{11}{2}$
(2 for the photon, and 4 for the electron-positron pair, which are Dirac fermions, weighted by $\frac{7}{8}$ for the
Fermi distribution, so $2 + 4 \times \frac{7}{8} = \frac{11}{2}$). After annihilation, only photons remain 31 , so there are 2
degrees of freedom. Using
\[
(g_* T^3)_{\rm before\ annihilation} = (g_* T^3)_{\rm after\ annihilation} \, , \tag{166}
\]
we get
\[
T_{\rm after\ annihilation} = \left( \frac{11}{4} \right)^{1/3} T_{\rm before\ annihilation} \, . \tag{167}
\]
This raises the photon temperature above the neutrino temperature. $\Omega_{\rm rad}$ is proportional to $T^4$, so
\[
\Omega_\nu = 3 \times \frac{7}{8} \times \left( \frac{4}{11} \right)^{4/3} \Omega_{\rm rad} = 0.68 \, \Omega_{\rm rad} = 3.4 \times 10^{-5} \, , \tag{168}
\]
and hence the total density for the relativistic particles is
\[
\Omega_{\rm rel} = \Omega_{\rm rad} + \Omega_\nu = 8.4 \times 10^{-5} \, . \tag{169}
\]
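The arithmetic behind Eqs. (167)-(169) can be checked with a few lines of Python. This is only a sketch: the photon density `OMEGA_PHOTON` is an assumed input, chosen as the value implied by $3.4 \times 10^{-5} / 0.68$, and the variable names are ours.

```python
N_NU = 3                        # number of light neutrino species (from the text)
FERMI_FACTOR = 7 / 8            # Fermi versus Bose weight in the energy density
T_RATIO = (4 / 11) ** (1 / 3)   # T_nu / T_gamma after e+ e- annihilation, Eq. (167)

# Assumed present-day photon density parameter (implied by 3.4e-5 / 0.68).
OMEGA_PHOTON = 5.0e-5

# The energy density scales as T^4, hence the fourth power of the temperature ratio.
nu_to_photon = N_NU * FERMI_FACTOR * T_RATIO ** 4
omega_nu = nu_to_photon * OMEGA_PHOTON
omega_rel = OMEGA_PHOTON + omega_nu

print(f"Omega_nu / Omega_rad = {nu_to_photon:.3f}")
print(f"Omega_nu  = {omega_nu:.2e}")
print(f"Omega_rel = {omega_rel:.2e}")
```

Without the reheating of the photon bath the ratio would have been $21/8 \approx 2.6$; the factor $(4/11)^{4/3}$ brings it down to about 0.68.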
29 (continued)
There are, of course, a lot of proposed but still hypothetical Majorana fermions, including one which is an excellent
cold dark matter candidate.
30
For fermions, we have to multiply by $\frac{7}{8}$.
31
The remaining electrons can take part only in Thomson scattering, which does not change $g_*$.

$\Omega_{\rm rel}$ falls as $1/a^4$ whereas $\Omega_M$ falls only as $1/a^3$, so the transition from a radiation-dominated phase
to a matter-dominated phase took place when
\[
a_t = \frac{\Omega_{\rm rel}}{\Omega_M} \approx \frac{1}{3238} \, . \tag{170}
\]
(We have taken $\Omega_M = \Omega_{\rm DM} + \Omega_b$. A precise number is $a_t^{-1} = 1 + z_t = 3233 \pm 87$.) The temperature
was about 8800 K; this is before photons decoupled. We can also estimate the time, considering
that since then $a \propto t^{2/3}$. This is close to $2 \times 10^{12}$ s, or about 64000 years. Before that, the universe
was radiation dominated and $a \propto \sqrt{t}$, $T \propto 1/\sqrt{t}$.
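The estimates around Eq. (170) can be reproduced with the following Python sketch. The inputs `OMEGA_M` (chosen so that the ratio matches 1/3238), `T0_KELVIN` and the seconds-per-year conversion are assumptions on our part; with them the time comes out at a little above $2 \times 10^{12}$ s, the same order as the figure quoted above.

```python
OMEGA_REL = 8.4e-5          # Eq. (169)
OMEGA_M = 0.272             # assumed: the value that reproduces 1/3238 in Eq. (170)
T0_KELVIN = 2.725           # assumed present CMB temperature
SECONDS_PER_YEAR = 3.156e7
AGE_SECONDS = 13.7e9 * SECONDS_PER_YEAR

# Scale factor at matter-radiation equality, Eq. (170).
a_eq = OMEGA_REL / OMEGA_M

# The radiation temperature scales as 1/a.
temp_eq = T0_KELVIN / a_eq

# Crude estimate assuming a ∝ t^(2/3) all the way from equality to today.
t_eq = AGE_SECONDS * a_eq ** 1.5

print(f"1/a_eq = {1 / a_eq:.0f}")
print(f"T_eq   = {temp_eq:.0f} K")
print(f"t_eq   = {t_eq:.2e} s = {t_eq / SECONDS_PER_YEAR:.0f} yr")
```

The time estimate ignores the recent cosmological-constant-dominated phase, so it should be read as an order-of-magnitude figure only.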
Speaking roughly, let us take $T = 10^4$ K at $t = 10^{12}$ s. We can divide the time before that like
this (multiply the temperature roughly by $10^{-4}$ to get the energy in eV):

Time                           Temperature                   Characteristic

$10^{13}$ s - $t_0$            3000 K - 3 K                  Atoms, galaxies, stars.
                                                             Photons decoupled (CMBR).

$10^{12}$ s - $10^{13}$ s      10000 K - 3000 K              Matter-dominated but still opaque. No atoms.

1 s - $10^{12}$ s              $10^{10}$ K - 10000 K         Nucleosynthesis starts and ends (3-20 minutes).
                                                             Protons and neutrons join to form atomic
                                                             nuclei. Only neutrinos are decoupled.

$10^{-6}$ s - 1 s              $10^{13}$ K - $10^{10}$ K     Hadron epoch: free quarks hadronise.
                                                             Free electrons, protons, neutrons, neutrinos.
                                                             Everything is interacting with everything.
                                                             Muons and other unstable particles decayed.
                                                             Energy is still too high for nuclei.

$10^{-12}$ s - $10^{-6}$ s     $10^{16}$ K - $10^{13}$ K     Quark epoch: Free quarks, leptons, photons, but
                                                             electroweak phase transition has occurred.

$10^{-36}$ s - $10^{-12}$ s    $10^{29}$ K - $10^{16}$ K     Electroweak epoch.

Before $10^{-36}$ s            > $10^{29}$ K                 Inflation and other exotic phenomena.
                                                             Quantum gravity (?)
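The temperature column of the table follows from the radiation-era scaling $T \propto 1/\sqrt{t}$, anchored at $T = 10^4$ K for $t = 10^{12}$ s. A small Python sketch of this bookkeeping is given below; the function names are ours, and the value of $k_B$ in eV/K is the standard one.

```python
import math

K_B_EV_PER_K = 8.617e-5   # Boltzmann constant in eV/K


def temperature_at(t_seconds, t_ref=1e12, temp_ref=1e4):
    """Temperature in K at time t, assuming T ∝ 1/sqrt(t) (radiation era).

    The reference point (10^4 K at 10^12 s) is the rough anchor used in the text.
    """
    return temp_ref * math.sqrt(t_ref / t_seconds)


def thermal_energy_ev(temp_kelvin):
    """Characteristic thermal energy k_B T in eV."""
    return K_B_EV_PER_K * temp_kelvin


for t in (1.0, 1e-6, 1e-12):
    temp = temperature_at(t)
    print(f"t = {t:.0e} s: T = {temp:.1e} K, k_B T = {thermal_energy_ev(temp):.1e} eV")
```

At $t = 1$ s this gives $10^{10}$ K, i.e. $k_B T$ a little below 1 MeV, in line with the rule of thumb of multiplying the temperature by $10^{-4}$ to get the energy in eV.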

While these are the main characteristics, it is possible to focus on definite reactions more
accurately. For example, the synthesis of atomic nuclei from protons and neutrons started when the
universe was about 3 minutes old; before that, the photons were so energetic that they would have
blasted the nucleus apart 32 . Nucleosynthesis stopped at 20 minutes, after which the temperature
and density dropped to such a level that nuclear fusion could not continue. Between 1 and 10 seconds
was the lepton epoch, where hadrons and anti-hadrons annihilated to produce leptons and anti-
leptons; they were also produced from photons. After 10 seconds, the temperature dropped to
a level where lepton production stopped. The leptons and antileptons still annihilated, leaving a
small lepton asymmetry to keep the universe charge neutral.
Neutrinos decoupled at about 1 second and formed the cosmic neutrino background. The
quarks started to get trapped within hadrons at $10^{-6}$ seconds. Before that, there existed a plasma
of free quarks, antiquarks, and gluons, along with leptons and photons. At the beginning of the
electroweak epoch (which corresponds to an energy of $10^{16}$ GeV), the strong force was of the same
strength as the electroweak force; perhaps there was a unified force law at that time, which we call
the Grand Unification. The Grand Unification broke down at $10^{16}$ GeV, but electromagnetism and
the weak force were still unified, and the W and Z bosons were still massless. The electroweak symmetry broke
at the end of the electroweak epoch (∼ 100 GeV), and electromagnetism and the weak force became
distinct. The weak gauge bosons became massive at this time.
32
Thus, their energy must be in the nuclear binding energy range, i.e., MeV.
How the electroweak symmetry broke is still not completely understood. The Standard Model
of particle physics, of course, has a definite answer. There is a complex scalar field $\Phi$ with four
degrees of freedom, whose potential can be written as

\[
V(\Phi) = -\mu^2 \Phi^\dagger \Phi + \lambda (\Phi^\dagger \Phi)^2 \tag{171}
\]

with $\mu^2, \lambda > 0$. The potential looks like a hat, with a local maximum at $\Phi = 0$ and minima at some
nonzero value $v$ of $\Phi$, which is called the vacuum expectation value (VEV) of the field: $\langle 0|\Phi|0 \rangle = v$.
Once the configuration rolls down to the minimum, the symmetry of the potential is spontaneously
broken, and three weak gauge bosons and all the massive fermions get their masses. This is known
as the Higgs mechanism. One degree of freedom of the field $\Phi$ remains as a physical object, which
is known as the Higgs boson, and was discovered in July 2012.
However, the way we have written the potential V (Φ) is true only if we do the field theory
at T = 0. At finite T , there are temperature-dependent corrections to the potential that should
be taken into account. For ordinary temperatures, these corrections are negligible, but they become
significant at the very high temperatures of the electroweak epoch. Once you add the corrections, the
shape of the potential changes; the minimum comes back at Φ = 0, and the symmetry is restored.
Thus, when the universe was born, the electroweak symmetry was exact, and the gauge bosons and
fermions were all massless. Once it cooled down, the temperature-dependent corrections became
less significant, and the symmetry broke.
In short, this is a phase transition, from a symmetric phase to a symmetry-broken phase. The
order parameter is the VEV $v$; it is zero in the symmetric phase, and nonzero when the symmetry is
broken. Its precise value is 246 GeV, as we can measure from the masses of the weak gauge bosons
W and Z. What is the order of the phase transition? If $v$ changed smoothly, this is a second-order
transition; if it changed abruptly, this is a first-order transition. If the Standard Model is the only thing in
particle physics, we can calculate the order, which turns out to be second. We will later see that a
first-order phase transition is preferred to generate the baryon asymmetry of the universe. Thus,
this is still an open question.
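To see how a temperature-dependent correction can restore the symmetry, consider a toy version of Eq. (171) for the field magnitude $\phi = |\Phi|$, with a hypothetical thermal term $c \, T^2 \phi^2$ added: $V(\phi, T) = (-\mu^2 + c T^2) \phi^2 + \lambda \phi^4$. The form of this correction, the coefficient `c` and the parameter values below are assumptions chosen for illustration in arbitrary units; the actual finite-temperature potential has more structure.

```python
import math


def potential(phi, temp, mu2=1.0, lam=0.5, c=0.25):
    """Toy finite-temperature potential V = (-mu^2 + c T^2) phi^2 + lam phi^4."""
    return (-mu2 + c * temp ** 2) * phi ** 2 + lam * phi ** 4


def order_parameter(temp, mu2=1.0, lam=0.5, c=0.25):
    """Field value at the minimum of the toy potential.

    Setting dV/dphi = 0 gives phi^2 = (mu^2 - c T^2) / (2 lam) when this is
    positive; otherwise the only minimum is at phi = 0.
    """
    m2_eff = mu2 - c * temp ** 2
    return math.sqrt(m2_eff / (2 * lam)) if m2_eff > 0 else 0.0


def critical_temperature(mu2=1.0, c=0.25):
    """Temperature above which the minimum sits at phi = 0."""
    return math.sqrt(mu2 / c)


for temp in (0.0, 1.0, 1.9, 2.0, 3.0):
    print(f"T = {temp:.1f}: minimum at phi = {order_parameter(temp):.3f}")
```

In this toy model the order parameter goes to zero continuously at the critical temperature, which is the behaviour of a second-order transition; a first-order transition would need additional terms that produce a barrier between two minima.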
What happened before the electroweak epoch? This is a speculative region, but what in all
probability happened is a very transient but very rapid exponential expansion, known as inflation.
The rest is open to speculation and wild guesses.
Q. What is the temperature of the cosmic neutrino background now? What is the number density?
Q. What is the value of g∗ when (i) me < T < mµ , (ii) mµ < T < mπ , (iii) T > mπ but all other degrees of
freedom are frozen or decoupled?

10 Nucleosynthesis

An important test of the Big Bang model is the synthesis of light elements, which is known as
Big Bang Nucleosynthesis (BBN). While medium and heavy elements are synthesized inside the
stars, there are some light elements, apart from hydrogen, with which even the first generation
stars started their lives 33 . These elements include deuterium (D), He3, trace amounts of Li7 and
even smaller amounts of Li6; however, the most important of them is He4. While He4 is synthesised
in stars in the p-p chain, the amount of He4 that might have formed inside all the stars since the
birth of the universe is far smaller than the amount that is found. So there must be a significant
amount of He4 that came out of the BBN. Some unstable radioactive isotopes were also formed,
like H3, Be7, and Be8, which either decayed or fused with other nuclei to form stable isotopes 34 .
BBN produced no elements heavier than beryllium, due to a bottleneck: the absence of a stable
nucleus with 5 or 8 nucleons. Be8 is extremely unstable; it breaks almost immediately into two
α-particles. In stars, the bottleneck is passed by collisions of three He4 nuclei at the same time,
producing an excited state of C12 (this is known as the triple-α process). However, this process is
very slow, taking tens of thousands of years to convert a significant amount of helium to carbon in
stars, and therefore it made a negligible contribution in the minutes following the Big Bang.
The condition for nucleosynthesis could not have been reached before the first second. After
one second, the universe had a temperature of $10^{10}$ K, which corresponds to approximately 1 MeV.
The nuclear binding energy is of that order, so before that limit is reached, no stable nucleus could
have been formed; the argument is just the same as for photon decoupling. The energetic photons
would have ripped the nucleus apart.
All baryons except the proton and the neutron decayed after 1 second. The free neutron is not stable
but has a very long half-life: $t^n_{1/2} = 614$ s. Thus, when the photon energy fell below 1 MeV,
all the primordial neutrons were still there. They would have decayed if they could not become
constituents of the nuclei. Once they are inside the nuclei, they become stable.
Before 1 second, but when $m_N \gg k_B T$ ($N$ stands for either proton or neutron), the nucleons
are non-relativistic, and are in thermal equilibrium. They follow a Maxwell-Boltzmann distribution
in which the number density is
\[
n_{n,p} \propto m_N^{3/2} \exp\left( -\frac{m_N}{k_B T} \right) . \tag{172}
\]
Here, we have used $c = 1$, so $m_n = 939.6$ MeV, $m_p = 938.3$ MeV (do not confuse $n$ for number
density with $n$ for neutron). The constant of proportionality is the same for both nucleons, so
\[
\frac{n_n}{n_p} = \left( \frac{m_n}{m_p} \right)^{3/2} \exp\left( -\frac{\Delta m}{k_B T} \right) , \tag{173}
\]

where $\Delta m = m_n - m_p = 1.3$ MeV. If $k_B T \gg \Delta m$, $n_n$ and $n_p$ would be almost identical.


There can also be conversions:

n + νe ↔ p + e− , n + e+ ↔ p + ν̄e . (174)

These interactions proceed rapidly as long as the temperature is above about 0.8 MeV. Below that, these
being weak processes, the rate decreases so much that the time scale of conversion becomes longer
than the age of the universe at that epoch. Once that happens, the ratio becomes constant, and could have
been further changed only by free neutron decay. At that point,
\[
\frac{n_n}{n_p} = \exp(-1.3/0.8) \approx 0.2 \, . \tag{175}
\]
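The equilibrium ratio of Eq. (173) is easy to tabulate. The Python sketch below evaluates it at a few temperatures, including the freeze-out value of about 0.8 MeV used in Eq. (175); the function name is ours, and the masses are those quoted above.

```python
import math

M_N = 939.6   # neutron mass in MeV (c = 1), as quoted in the text
M_P = 938.3   # proton mass in MeV


def neutron_to_proton(kt_mev):
    """Equilibrium n_n / n_p from Eq. (173) at thermal energy k_B T (in MeV)."""
    return (M_N / M_P) ** 1.5 * math.exp(-(M_N - M_P) / kt_mev)


for kt in (100.0, 10.0, 1.0, 0.8):
    print(f"k_B T = {kt:6.1f} MeV: n_n / n_p = {neutron_to_proton(kt):.3f}")
```

The mass prefactor $(m_n/m_p)^{3/2}$ differs from unity by only about 0.2 per cent, which is why Eq. (175) keeps the exponential alone.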
33
The first generation stars are the first stars that were born after the Big Bang. The second generation stars
started with some of their remnants, so they might easily contain some amount of the heavier elements.
34
The first paper on BBN is also called the αβγ paper after its authors, Ralph Alpher, who was a graduate student
of Gamow, Hans Bethe, and George Gamow himself. While some of the conclusions came out to be incorrect, the
basic idea survived.

Figure 24: The mass fraction of the elements synthesized in the BBN. The widths are due to
experimental as well as theoretical uncertainties. The vertical band shows the BBN predictions,
which are very well satisfied for all the elements except Li7 .

10.1 Helium abundance

Even this is not enough for the production of the light elements to start. They go through the
chain
\[
p + n \to {\rm D} \, , \qquad {\rm D} + p \to {\rm He}^3 \, , \qquad {\rm D} + {\rm D} \to {\rm He}^4 \, . \tag{176}
\]
Thus, deuterium is an important intermediate step. But it is a very weakly bound system (the
binding energy is 2.2 MeV) and cannot have a sufficient concentration for helium production until
the temperature falls to the order of 0.1 MeV 35 ; the photons will break it apart before that. Thus,
BBN starts only when the universe is about 3 minutes old. Ultimately, it takes about 400 seconds
before all neutrons get absorbed in nuclei. However, before that, the neutron concentration further
depletes because of free neutron decay, and
\[
\frac{n_n}{n_p} \approx 0.2 \times \exp\left( -\frac{400 \ln 2}{614} \right) \approx \frac{1}{8} \, . \tag{177}
\]

As a first approximation, we can neglect the formation of other nuclei apart from He4. Thus, all
neutrons end up in He4; every such nucleus contains two neutrons, so $n_{{\rm He}4} = \frac{1}{2} n_n$. Each helium
nucleus has four nucleons, so the baryon mass fraction in He4 is given by
\[
Y_4 = \frac{4 \times n_n/2}{n_n + n_p} = \frac{2}{1 + n_p/n_n} \approx 0.22 \, . \tag{178}
\]
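Equations (177) and (178) can be chained together in a short Python sketch. The function name and default arguments are ours; the inputs (a freeze-out ratio of 0.2, a delay of 400 s and a half-life of 614 s) are the values used above.

```python
import math

HALF_LIFE_S = 614.0   # free neutron half-life used in the text


def helium_mass_fraction(ratio_at_freezeout=0.2, delay_s=400.0):
    """Return (n_n / n_p when the neutrons are absorbed, Y_4).

    The ratio is depleted by free neutron decay during the delay, Eq. (177),
    and all surviving neutrons are assumed to end up in He4, Eq. (178).
    """
    ratio = ratio_at_freezeout * math.exp(-delay_s * math.log(2) / HALF_LIFE_S)
    y4 = 2.0 / (1.0 + 1.0 / ratio)
    return ratio, y4


ratio, y4 = helium_mass_fraction()
print(f"n_n / n_p = {ratio:.3f} (about 1/{1 / ratio:.1f})")
print(f"Y_4       = {y4:.3f}")
```

Without rounding the ratio to exactly 1/8, the estimate comes out slightly above 0.22; the difference is well within the crudeness of the approximation.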
A more sophisticated calculation yields something close to 0.24, which is almost precisely what
the experiments found. The number is a function of baryon abundance, or $n_b/n_\gamma$, and it is a
groundbreaking success of Big Bang cosmology to predict the light element abundances to such an
accuracy. It is also a function of the number of light neutrinos; after all, they are a component of
the radiation density, and the rate of expansion depends on this, and the rate of expansion controls
when BBN is going to start and when it is going to stop. The observed abundance is consistent with
the number of light neutrinos equal to 3, which has been corroborated by the collider experiments.
35
The calculation is a bit complicated but analogous to the Boltzmann tail argument we used to calculate the
CMBR decoupling. This is known as the deuterium bottleneck.
Note that Y4 depends on a number of factors. For example, if mn − mp were higher, the half-life
of neutrons would have been much smaller and hence there would have been very few neutrons left
when BBN started, pulling Y4 down. Similarly, if the deuteron binding energy were higher, BBN
would have started earlier, which in turn means more remaining neutrons and hence higher Y4 . A
larger value of nb /nγ would have started BBN earlier too, because the deuterium bottleneck could
have been breached earlier. Finally, if Be8 were stable, He4 would continue fusing further, making
Y4 smaller than it is now. In fact, if Be8 were stable, there could have been heavier elements like
C12 out of BBN.

10.2 Deuterium abundance

Deuterium is in some ways the opposite of He4 in that while He4 is very stable, deuterium is very
weakly bound. Because He4 is very stable, there is a strong tendency on the part of two deuterium
nuclei to combine to form He4 . The only reason BBN does not convert all of the deuterium in
the universe to He4 is that the expansion of the universe cooled the universe and the conversion
stopped short before it could be completed. One consequence of this is that unlike He4 , the amount
of deuterium is very sensitive to initial conditions. The denser the universe is, the more deuterium
gets converted before time runs out, and the less deuterium remains.
There are no known post-Big Bang processes which would produce significant amounts of deu-
terium. Hence observations about deuterium abundance suggest that the universe is not infinitely
old, which is in accordance with the Big Bang theory.
During the 1970s, there were major efforts to find processes that could produce deuterium other
than BBN. The problem was that while the concentration of deuterium in the universe is consistent
with the Big Bang model as a whole, it is too high if ΩM = Ωbaryon . If we had that many baryons,
much of the currently observed deuterium would have been burned into He4. This inconsistency
is another piece of evidence that not all matter in the universe is in the form of baryons; there must be a
significant concentration of non-baryonic matter, which in all probability is the dark matter.
The predictions for He3 and deuterium are even closer than that for He4 . However, there is
some discrepancy between the Li7 prediction and actual abundance, by about a factor of 3-4; this
is a subject of much active interest.
BBN stopped when the universe was about 20 minutes old. This is because both temperature
and density fell to such a level that nuclear fusion was no longer possible. The relative abundances of
the light elements became fixed, apart from He4 , which is produced even by stellar nucleosynthesis.

10.3 Baryogenesis

While the generation of lighter elements is pretty well understood, one should try to answer how
the protons and neutrons, or for that matter, the quarks and antiquarks, were created. They could have
been created from a hot photon bath: $\gamma + \gamma \to q + \bar{q}$. But then the net baryon number should have
been zero; all known interactions are expected to conserve baryon number. One implication of this
is that the proton is stable, because it is the lightest baryon, and decays such as $p \to \pi^0 + e^+$ are
forbidden. The half-life of the proton is more than $10^{34}$ years.
If the baryon number is strictly conserved, all baryons should have annihilated with all
antibaryons, and we expect $n_b/n_\gamma = 0$ and not something $\sim O(10^{-9})$. How it came to be nonzero is
known as baryogenesis 36 .
One solution could be that the universe started with a nonzero baryon number, but that solution
is not very appealing: why such an asymmetry at the singularity? We would rather like to generate
this asymmetry when the universe was very young, say at the beginning of the electroweak epoch.
You will later see that even if there were such an asymmetry at t = 0, the exponential expansion
of the universe would have diluted it to a completely negligible level.
How baryogenesis occurred is still an open problem. There are many hypothetical interactions
which might produce a baryon asymmetry, but no known interactions do that. Even though we do
not know the exact form of interaction, we know what should be the characteristics for a successful
baryogenesis. These are known as Sakharov conditions:

• Baryon number violation;

• C (charge conjugation) and CP (charge-parity combined) violation;

• Deviation from thermal equilibrium.

The origin of these conditions is easy to understand. First, baryon number has to be violated
if we are to generate some baryons out of the initial state where the net baryon number was zero.
Second, both C and CP have to be violated if baryons and antibaryons are to be created in unequal
amounts. Suppose you want to create two quarks out of a non-baryon $X$. If C is violated,

\[
\Gamma(X \to q_L q'_L) \neq \Gamma(\bar{X} \to \bar{q}_L \bar{q}'_L) \, . \tag{179}
\]

However, if CP is conserved,

\[
\Gamma(X \to q_L q'_L) = \Gamma(\bar{X} \to \bar{q}_R \bar{q}'_R) \, , \tag{180}
\]

from which, together with the same relation with L and R interchanged, it follows that

\[
\Gamma(X \to q_L q'_L) + \Gamma(X \to q_R q'_R) = \Gamma(\bar{X} \to \bar{q}_L \bar{q}'_L) + \Gamma(\bar{X} \to \bar{q}_R \bar{q}'_R) \, , \tag{181}
\]

and equal numbers of baryons and antibaryons are created. Thus, violation of both C and CP is
a must to create the baryon asymmetry. Of course, this asymmetry is really small: only about one
excess baryon in a billion. Even with these two conditions, if the interactions proceed in a state
of thermal equilibrium, the opposite reaction would proceed at the same rate, and whatever
baryon asymmetry was produced would be washed out. Remember that the Sakharov conditions
are necessary, but they are not sufficient.
We know that the weak interaction violates C, and also CP by a small amount. Is the Standard Model
enough to explain the baryon asymmetry? The answer is no, for several reasons. First, given the
magnitude of CP violation, we can calculate the ratio $n_b/n_\gamma$ within the Standard Model, with three
generations of fermions. This comes out to be of the order of $10^{-18}$ and not $10^{-9}$, a billion times
too small. Second, the Standard Model conserves the baryon number. There is a catch here: there
can be interactions in the Standard Model, as shown by Gerard ’t Hooft, that violate the baryon
number, but the probabilities are extremely small, completely negligible within the life span of
the universe. As Kuzmin, Rubakov, and Shaposhnikov showed, they may become significant in a hot
universe though; the tunnelling probability of the universe going from one vacuum to another with
a different baryon number increases with temperature. The third problem still remains. If the
electroweak phase transition is of second order, whatever asymmetry was generated should have
been washed out by now. For the asymmetries to survive, we need a strong first-order transition,
which does not happen within the framework of the Standard Model. There are extensions of the
Standard Model where this may take place.
36
We do not see any antimatter anywhere in the universe except the tiny amount produced at the particle physics
laboratories. If there were antimatter stars and galaxies, we would have expected a huge amount of γ-ray flux when
those antimatter galaxies collided and annihilated with matter galaxies. We get nothing even close to that.
A possible way to implement baryogenesis is as follows. Suppose some very heavy particle,
whose mass is of the order of $10^{15}$-$10^{16}$ GeV, was created at the very first instants. These particles
go out of equilibrium very soon; they may annihilate to create photons but the photons, being
too cold, would not be able to produce them. Now suppose these particles are unstable and
decay into baryons. Of course the antiparticles would decay into antibaryons. This involves baryon
number violation. And then there has to be C and CP violation for the decay rates to be asymmetrical;
$X \to q$ and $\bar{X} \to \bar{q}$ should not proceed at equal rates, where $X$ is that enigmatic heavy particle.
While it is difficult to implement this in a concrete model (there is no completely satisfactory
model of baryogenesis as yet), this gives rise to another problem of Big Bang cosmology. This is
called the problem of relic abundance. If all those $X$ particles decayed promptly, well and good.
But Grand Unified theories predict some extremely massive stable particles too; one of them is
the magnetic monopole. If you remember, Dirac showed that the existence of magnetic monopoles
would explain why the electric charge is quantised.
The magnetic monopoles being very heavy, they become non-relativistic almost at the moment
of their birth, and their density goes down only as $a^{-3}$, while the density of radiation goes down
as $a^{-4}$. This makes the monopole density at the present instant just too high. To be quantitative,
suppose the monopoles were created at the Grand Unified epoch, with a temperature of $2 \times 10^{29}$
K (this corresponds to an energy of $2 \times 10^{16}$ GeV). Since then, the temperature has gone down
roughly by a factor of $10^{29}$. If the universe is assumed to be radiation-dominated, its size has
increased by the same factor. If $\Omega_{\rm mon}/\Omega_{\rm rad} = 10^{-10}$ to start with (something compatible with
the baryon asymmetry), now it should have been $10^{19}$. This is clearly against all experimental
evidence! Where have all those stable relic particles gone?
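The estimate above is just the statement that the ratio of a matter-like density to the radiation density grows in proportion to $a$. A minimal Python sketch, using the crude assumption from the text that $a \propto 1/T$ throughout (the function name is ours):

```python
def monopole_to_radiation(initial_ratio=1e-10, temp_drop=1e29):
    """Present Omega_mon / Omega_rad in the crude estimate of the text.

    The monopole density falls as a^-3 and the radiation density as a^-4,
    so their ratio grows as a. With a ∝ 1/T, the growth in a equals the
    factor by which the temperature has dropped.
    """
    growth_in_a = temp_drop
    matter_dilution = growth_in_a ** -3
    radiation_dilution = growth_in_a ** -4
    return initial_ratio * matter_dilution / radiation_dilution


print(f"Omega_mon / Omega_rad today = {monopole_to_radiation():.1e}")
```

A more careful treatment would switch to matter-dominated scaling once the monopoles start to dominate, as in the exercise that follows.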
Q. Perform a more sophisticated calculation for the monopole density. If $\Omega_{\rm mon}/\Omega_{\rm rad} = 10^{-10}$ at $2 \times 10^{29}$
K, when did the matter-radiation transition take place? (Assume all matter to be in the form of such monopoles;
the temperature was still high enough for all other particles to be relativistic.) Since then, the universe has been
matter-dominated; what is the present value of $\Omega_{\rm mon}/\Omega_{\rm rad}$?

11 Inflation

Inflation was a transient stage at almost the very beginning of the universe when it underwent
a very rapid exponential expansion. There are various models of inflation, but the exponential
expansion is a common feature. To motivate inflation, let us again go through the three major
problems of the original Big Bang cosmology.

• The flatness problem: The Friedmann equation can be written as
\[
|\Omega_{\rm tot} - 1| = \frac{|k|}{a^2 H^2} \, , \tag{182}
\]
where $\Omega_{\rm tot}$ includes all matter, radiation, and cosmological constant densities. If the universe
is matter-dominated, $a \propto t^{2/3}$, $H \propto t^{-1}$, and hence $(a^2 H^2)^{-1} \propto t^{2/3}$. If it is radiation-dominated,
$a \propto \sqrt{t}$, $H \propto t^{-1}$, and $(a^2 H^2)^{-1} \propto t$. In both cases, $|\Omega_{\rm tot} - 1|$ is a monotonically
increasing function of time. Thus, flat geometry is an unstable situation; even if there is
a slight curvature to start with, the universe will quickly become curvature-dominated. On
the other hand, for such a curved universe, $H^2 \propto a^{-2}$, so $a \propto t$, $H \propto t^{-1}$, and $(a^2 H^2)^{-1}$ is
constant; so curvature-dominance is a stable situation.
To have a quantitative estimate, take the present estimate $|\Omega_{\rm tot} - 1| < 4 \times 10^{-4}$. The age of
the universe is $4 \times 10^{17}$ seconds, so if the universe were radiation-dominated,
\[
\begin{array}{lll}
\mbox{Decoupling:} & t \sim 10^{13} \mbox{ s} & |\Omega_{\rm tot} - 1| < 10^{-8} \, , \\
\mbox{Matter-radiation equality:} & t \sim 10^{12} \mbox{ s} & |\Omega_{\rm tot} - 1| < 10^{-9} \, , \\
\mbox{Nucleosynthesis:} & t \sim 1 \mbox{ s} & |\Omega_{\rm tot} - 1| < 10^{-21} \, , \\
\mbox{Electroweak symm. breaking:} & t \sim 10^{-12} \mbox{ s} & |\Omega_{\rm tot} - 1| < 10^{-33} \, .
\end{array} \tag{183}
\]
So, (i) if the universe is still flat, there must have been an immense fine-tuning to start with;
(ii) even at the nucleosynthesis and decoupling epochs, the universe was definitely flat, so we
can safely assume $k = 0$ then.

• The horizon problem: This is simply the apparently strange uniformity of CMBR coming
from different parts of the sky. Again, let us make a simple but quantitative estimate. The
present age of the universe is $t_0 = 13.7 \times 10^9$ yr, so if the universe were not expanding, light
could have travelled about 4110 Mpc, using $c \approx 3 \times 10^{-7}$ Mpc/yr. The decoupling occurred
at $10^{13}$ s, so light could have travelled only about 0.1 Mpc in that time. But that length of 0.1
Mpc has been stretched by a factor of $(t_0/10^{13} \mbox{ s})^{2/3}$ (it’s a matter-dominated universe), so it
is now about 117 Mpc. So, this stretch subtends an angle of $\sim (117/4110) \times (180/\pi)^\circ \approx 1.6^\circ$.
Even this is an overestimation, as the radius of the visible universe has been stretched since
light left the surface of last scattering. Thus, radiation coming from an angular
separation of more than $1.6^\circ$ in the sky should in no case have the same characteristics.

• Relic abundance: This we have just discussed; if there are any stable extremely heavy particles
left as remnants of the beginning of the electroweak era (so the mass is about 10^{15} GeV), they will
be non-relativistic too soon, and their density will be much more than the radiation density.
This is incompatible with observation.
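
The flatness and horizon estimates above are simple enough to reproduce numerically. The following is a minimal sketch using only the round numbers quoted in the text; the function names and the seconds-per-year conversion factor are assumptions introduced here for illustration.

```python
import math

# Numbers quoted in the text; treat them as rough inputs, not precise measurements.
T0_S = 4e17             # present age of the universe, in seconds
T0_YR = 13.7e9          # present age of the universe, in years
OMEGA_DEV_NOW = 4e-4    # present bound on |Omega_tot - 1|
C_MPC_PER_YR = 3e-7     # speed of light in Mpc per year
SEC_PER_YR = 3.156e7    # seconds in a year (assumed conversion factor)
T_DEC_S = 1e13          # time of decoupling, in seconds


def omega_dev_bound(t_s):
    """Bound on |Omega_tot - 1| at time t_s, assuming it grows linearly with t
    (radiation domination) all the way up to the present."""
    return OMEGA_DEV_NOW * t_s / T0_S


def horizon_angle_deg():
    """Angle subtended today by the distance light travelled before decoupling."""
    d_now = C_MPC_PER_YR * T0_YR              # light-travel distance in t0, no expansion
    t_dec_yr = T_DEC_S / SEC_PER_YR
    d_dec = C_MPC_PER_YR * t_dec_yr           # light-travel distance up to decoupling
    stretch = (T0_YR / t_dec_yr) ** (2 / 3)   # matter-dominated growth, a ∝ t^(2/3)
    return math.degrees(d_dec * stretch / d_now)


if __name__ == "__main__":
    epochs = {
        "Decoupling": 1e13,
        "Matter-radiation equality": 1e12,
        "Nucleosynthesis": 1.0,
        "Electroweak symm. breaking": 1e-12,
    }
    for name, t in epochs.items():
        print(f"{name:28s} t ~ {t:7.0e} s   |Omega_tot - 1| < {omega_dev_bound(t):.0e}")
    print(f"Horizon angle at decoupling: about {horizon_angle_deg():.1f} degrees")
```

Changing `OMEGA_DEV_NOW` or `T_DEC_S` shows how sensitive the fine-tuning and the causal patch size are to the inputs; the qualitative conclusion does not depend on the precise values.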

Inflation solves all these problems with a single stroke. The central idea is that during the
inflation, ä > 0. From the acceleration equation (119), we see that ρ + 3p < 0. Since the density
is positive by definition, this implies a negative pressure: p < −ρ/3 for ä > 0. We have already
encountered such a scenario where the universe is dominated by a cosmological constant, and
\[ H^2 = \frac{\Lambda}{3} \;\Rightarrow\; a(t) = \exp\left( \sqrt{\Lambda/3}\; t \right) . \tag{184} \]
However, after some time, the inflation must stop; so whatever is responsible for that exponential
expansion must decay to ordinary particles. Then the Big Bang can proceed as usual. So the
inflation must have occurred even before the electroweak symmetry was broken; currently favoured
models suggest that it occurred sometime between 10^{−36} and 10^{−32} s.
How does inflation solve all these problems? Note that, since ȧ = aH,

\[ \ddot{a} > 0 \;\Rightarrow\; \frac{d\dot{a}}{dt} > 0 \;\Rightarrow\; \frac{d}{dt}(aH) > 0 , \tag{185} \]

so during inflation, aH increases with time and, by (182), |Ωtot − 1| decreases. If the inflation
is fast and large enough, even an initially curved universe becomes flat so quickly that all the
evolution after the inflation has failed to move |Ωtot − 1| appreciably away from zero. Maybe,
after a very long time, it will move away from zero again.

This is something that you have to visualize. Suppose you have a football, whose surface is
curved. You have a small pinhole aperture, so you cannot see the entire football, but whatever you
can see tells you that it is curved. Now you inflate the ball, but at the same time move away from
it with your pinhole, so you catch more and more surface of the ball. If the rate of inflation is less
than the rate you are moving away, the full curvature will come into view in a short time.

Figure 25: Inflation. Note how the size of the universe changes during the inflationary epoch,
solving all the problems of big-bang cosmology.

Now suppose in a very small time, so small that you don’t have enough time to move the
pinhole, someone inflates the ball to the size of the sun! Now, that will make the observed surface
completely flat, and it will remain flat as you move away, unless you move to a very great distance,
in a very distant future.
In short, the universe might have started with a nonzero curvature. But then inflation occurred,
and the curvature quickly came down to zero; and it was put to zero so strongly that it is still very
close to zero.
That also tells you that the universe that we can see is actually a very small fraction of the
actual universe; most of the universe is not observable as light is yet to reach us from those parts.
But if it were only a very small part to begin with, that could have been in thermal equilibrium
before inflation started. After inflation, that became the total visible universe; so that tells you
why radiation coming from different parts of the sky is isotropic. Once, before inflation, all those
parts were in causal contact; after inflation, the isotropy still remains. So that solves the horizon
problem too.
In fact, if the inflation is sufficiently strong — much stronger than what is required to solve the
flatness or horizon problems — it can blow away all the relic particles too, so that the relic abun-
dance is consistent with observation. One important point is that after inflation, the temperature
must drop to such a level that those heavy particles are not generated anymore.
Let us try to calculate in a simplistic scenario how much inflation is needed. Suppose the
inflation occurred at 10^{−34} s, it is perfectly exponential, and the universe is otherwise always
radiation-dominated. Also, let us take |Ωtot − 1| < 10^{−3} at present. The present age is 4 × 10^{17} s,
and |Ωtot − 1| ∝ t for a radiation-dominated universe, so

\[ |\Omega_{\rm tot}(t_0) - 1| < 10^{-3} \;\Rightarrow\; |\Omega_{\rm tot}(10^{-34}\ {\rm s}) - 1| < 2.5 \times 10^{-55} . \tag{186} \]

During inflation H is constant, so |Ωtot − 1| ∝ 1/a^2. Suppose just before the inflation started,
|Ωtot − 1| was of the order of 1, say 0.25. Thus, during inflation, the size of the universe increased
by a factor of √(10^{54}) = 10^{27}. This is a huge expansion and immediately puts most of the universe
beyond our reach. In fact, we need a stronger expansion to solve the relic density problem.
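
As a quick check of this arithmetic, the sketch below recomputes the required growth of the scale factor and converts it to a number of e-folds. It assumes, as in the text, that |Ωtot − 1| ∝ 1/a^2 during inflation and ∝ t afterwards; the function name is illustrative only.

```python
import math


def required_expansion(dev_before, dev_after):
    """Growth factor a_final / a_initial needed to bring |Omega_tot - 1| from
    dev_before down to dev_after, assuming |Omega_tot - 1| ∝ 1/a^2 (constant H)."""
    return math.sqrt(dev_before / dev_after)


# Bound at the end of inflation, scaled back linearly in t from today (eq. 186).
dev_after = 1e-3 * 1e-34 / 4e17
factor = required_expansion(0.25, dev_after)
n_efolds = math.log(factor)

print(f"|Omega_tot - 1| after inflation < {dev_after:.1e}")
print(f"required growth factor ~ {factor:.1e}, i.e. about {n_efolds:.0f} e-folds")
```

The starting value 0.25 only enters through a square root, so choosing any other number of order 1 changes the required number of e-folds by very little.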

The exact amount of inflation depends on a lot of things: the dynamics of the object responsible
for generating the negative pressure, and also the duration of inflation. Suppose the inflation started
at 10^{−36} s and ended at 10^{−34} s. At the beginning of inflation, H^{−1}, the characteristic time of the
universe, is 10^{−36} s, but H does not change during inflation. So, during inflation,

\[ \frac{a_{\rm final}}{a_{\rm initial}} = \exp\left[ H (t_{\rm final} - t_{\rm initial}) \right] = e^{99} \approx 10^{43} , \tag{187} \]

so this is enough inflation to solve the flatness problem quite firmly.
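
The exponent in (187) can be checked directly. This is a small sketch under the same assumption of a strictly constant H during inflation; `efolds` is a helper name introduced here, not something from the text.

```python
import math


def efolds(hubble_time_s, t_initial_s, t_final_s):
    """Number of e-folds N = H (t_final - t_initial) for constant H = 1 / hubble_time_s."""
    return (t_final_s - t_initial_s) / hubble_time_s


n = efolds(hubble_time_s=1e-36, t_initial_s=1e-36, t_final_s=1e-34)
log10_growth = n / math.log(10)   # a_final / a_initial = e^N = 10^(N / ln 10)

print(f"N = {n:.0f} e-folds, growth factor ~ 10^{log10_growth:.0f}")
```

A growth factor of about 10^43 comfortably exceeds the factor 10^27 estimated earlier from the flatness bound.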
That is all very nice, but how do we know that inflation really took place and is not just a
mathematical artifact to cure the sickness of the Big Bang?

11.1 Slow-roll inflation

Inflation needs ä/a > 0, or ρ + 3p < 0. Ordinary matter and radiation cannot provide a negative
pressure, so we need something else. It cannot be the nonzero cosmological constant; the exponential
inflation must stop almost immediately after it started. Fortunately, there is something that can
provide a transient source of negative pressure. This is a scalar field. Just like any quantum field,
there will be a particle associated with its excitation; this is called the inflaton (see footnote 37).
The action of a real scalar field φ, minimally coupled to gravity, can be written as

\[ S = \int d^4x \, \sqrt{-g} \left[ \frac{1}{2} \partial^\mu \phi \, \partial_\mu \phi - V(\phi) \right] , \tag{188} \]
where V (φ) includes the mass term and the possible interactions. For example, a typical form of
V (φ) can be
\[ V(\phi) = \frac{1}{2} m^2 \phi^2 + \frac{1}{4} \lambda \phi^4 . \tag{189} \]
Note that we do not know the exact form of V (φ), we only know certain characteristics that it must
have.
The energy-momentum tensor is given by
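\[ T^\mu_\nu = \partial^\mu \phi \, \partial_\nu \phi - \delta^\mu_\nu \left[ \frac{1}{2} \partial^\alpha \phi \, \partial_\alpha \phi - V(\phi) \right] . \tag{190} \]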
The conditions of isotropy and homogeneity imply that φ is a function of time only, and hence the
energy-momentum tensor is diagonal:
\[ T^0_0 = \rho = \frac{1}{2} \dot\phi^2 + V(\phi) , \qquad T^i_j = -p \, \delta^i_j = \left[ -\frac{1}{2} \dot\phi^2 + V(\phi) \right] \delta^i_j . \tag{191} \]

Thus, ρ + 3p < 0 leads to φ̇^2 < V(φ); the kinetic energy is smaller than the potential energy. A
potential that allows this is called a slow-roll potential; it has a small downward slope and the field is slowly
rolling down. As long as this persists, inflation takes place. The large potential energy (the
vacuum energy of the field) acts as an effective cosmological constant, driving the exponential
inflation. Once that slope ends, there is a steep downward jump ending in a deep valley; when the
field falls down this valley, the potential energy decreases very quickly, the kinetic energy increases,
and the field reaches its true vacuum state, after which it is no longer effective. Once in the valley,
we get the normal expanding universe (see footnote 38).
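
The acceleration condition can be illustrated with a few lines of code. The sketch below evaluates ρ and p from (191) for a homogeneous field and tests ρ + 3p < 0; the function names and the sample numbers (in arbitrary units) are assumptions made purely for illustration.

```python
def density_and_pressure(phi_dot, potential):
    """Energy density and pressure of a homogeneous scalar field, as in (191)."""
    rho = 0.5 * phi_dot**2 + potential
    p = 0.5 * phi_dot**2 - potential
    return rho, p


def accelerates(phi_dot, potential):
    """True if rho + 3p < 0, which is the same as phi_dot^2 < V."""
    rho, p = density_and_pressure(phi_dot, potential)
    return rho + 3 * p < 0


# Slowly rolling field: potential energy dominates, so the expansion accelerates.
print(accelerates(phi_dot=0.1, potential=1.0))
# Fast field: kinetic energy dominates, so there is no acceleration.
print(accelerates(phi_dot=2.0, potential=1.0))

# In the limit phi_dot -> 0 the field behaves like a cosmological constant, p = -rho.
rho, p = density_and_pressure(0.0, 1.0)
print(p / rho)
```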
Footnote 37: A large part of this subsection is based on this excellent review on inflation: arXiv:0904.4584.
Footnote 38: Whatever we see now has come from the post-inflationary period, including the CMB. When the scalar field falls
down the valley, the universe is reheated, and all the heat at present comes from the reheating.

With the expressions of ρ and p in (191), we can write the Friedmann equations as

\[ H^2 = \frac{8\pi G}{3} \left[ \frac{1}{2} \dot\phi^2 + V(\phi) \right] , \qquad \dot{H} = -4\pi G \, \dot\phi^2 . \tag{192} \]

Sometimes, people set 8πG = 1/M_Pl^2, where M_Pl ∼ 10^{19} GeV is the Planck scale.

Q. Integrate (192) to obtain

\[ \phi(t) = \sqrt{2} \, M_{\rm Pl} \int dt \, \sqrt{-\dot{H}} , \qquad V(t) = M_{\rm Pl}^2 \left( 3H^2 + \dot{H} \right) . \tag{193} \]

You cannot proceed any further unless you know how the scale factor a behaves with time during
inflation. Suppose it is a power law; a(t) = t^q. Find H and show that this leads to

\[ \phi(t) = \sqrt{2q} \, M_{\rm Pl} \ln\left[ \sqrt{\frac{V_0}{q(3q-1)}} \, \frac{t}{M_{\rm Pl}} \right] , \tag{194} \]

where V0 is an integration constant. Hence find the form of V(φ):

\[ V(\phi) = V_0 \exp\left[ -\sqrt{\frac{2}{q}} \, \frac{\phi}{M_{\rm Pl}} \right] . \tag{195} \]
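
Before attempting the algebra, it may help to confirm numerically that (193), (194) and (195) are mutually consistent. The sketch below does this for a(t) = t^q with H = q/t and Ḣ = −q/t^2. It works in units where M_Pl = 1 and uses arbitrary sample values of q, V0 and t; these choices, and all function names, are assumptions for illustration. The formulas require q > 1/3.

```python
import math

M_PL = 1.0  # units with the Planck scale set to 1 (assumption)


def phi_of_t(t, q, v0):
    """Field value from (194) for power-law expansion a(t) = t^q, with q > 1/3."""
    return math.sqrt(2 * q) * M_PL * math.log(math.sqrt(v0 / (q * (3 * q - 1))) * t / M_PL)


def potential_of_t(t, q):
    """V(t) = M_Pl^2 (3 H^2 + H_dot) from (193), with H = q/t and H_dot = -q/t^2."""
    hubble = q / t
    hubble_dot = -q / t**2
    return M_PL**2 * (3 * hubble**2 + hubble_dot)


def potential_of_phi(phi, q, v0):
    """Exponential potential from (195)."""
    return v0 * math.exp(-math.sqrt(2 / q) * phi / M_PL)


q, v0, t = 2.0, 3.0, 5.0
lhs = potential_of_phi(phi_of_t(t, q, v0), q, v0)
rhs = potential_of_t(t, q)
print(lhs, rhs)  # the two values should agree up to rounding

# Check phi_dot^2 = -2 M_Pl^2 H_dot with a central finite difference.
h = 1e-5
phi_dot = (phi_of_t(t + h, q, v0) - phi_of_t(t - h, q, v0)) / (2 * h)
print(phi_dot**2, 2 * M_PL**2 * q / t**2)
```

This is only a consistency check at sample points, not a substitute for the derivation asked for in the exercise.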

11.2 Slow roll parameters and cosmological perturbations

The inflation is guaranteed if φ̇^2 ≪ V(φ); the motion is really slow. Also, the acceleration of
the field must be very small: φ̈ ≪ 3Hφ̇, so that there are enough e-folds of inflation. These two
conditions lead to the slow-roll approximation. Let us define two parameters

\[ \epsilon = \frac{1}{2} M_{\rm Pl}^2 \left( \frac{V'}{V} \right)^2 , \qquad \eta = M_{\rm Pl}^2 \left( \frac{V''}{V} \right) , \tag{196} \]

where V′ = dV/dφ, V″ = d^2V/dφ^2. The slow-roll approximation is equivalent to

\[ \eta, \epsilon \ll 1 . \tag{197} \]

With a potential of the form V(φ) = V0 φ^n (remember that the potential can have any form; we, after
all, do not know how the scale factor behaves), it is easy to check that the slow-roll conditions
require φ ≫ M_Pl. Such potentials are therefore called large-field models. On the other hand, potentials like V(φ) = Λ[1 + cos(φ/f)],
where Λ and f are constants that characterize the depth and the width of the potential, lead to
the small-field case.
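
For a monomial potential V = V0 φ^n the definitions in (196) give ε = n^2 M_Pl^2 / (2φ^2) and η = n(n − 1) M_Pl^2 / φ^2, which makes the large-field statement explicit. The sketch below evaluates these closed forms and cross-checks them against finite-difference derivatives; it works in units of M_Pl = 1, and the function names and sample field values are assumptions for illustration.

```python
def slow_roll_monomial(phi, n, m_pl=1.0):
    """Closed-form slow-roll parameters for V = V0 * phi**n (V0 drops out)."""
    epsilon = 0.5 * n**2 * (m_pl / phi) ** 2
    eta = n * (n - 1) * (m_pl / phi) ** 2
    return epsilon, eta


def slow_roll_numeric(potential, phi, m_pl=1.0, h=1e-3):
    """Slow-roll parameters from (196) using central finite differences."""
    v = potential(phi)
    dv = (potential(phi + h) - potential(phi - h)) / (2 * h)
    d2v = (potential(phi + h) - 2 * v + potential(phi - h)) / h**2
    return 0.5 * m_pl**2 * (dv / v) ** 2, m_pl**2 * d2v / v


# Quadratic potential: slow roll holds only for field values well above M_Pl.
for phi in (0.5, 1.0, 5.0, 15.0):
    eps, eta = slow_roll_monomial(phi, n=2)
    print(f"phi = {phi:5.1f} M_Pl   epsilon = {eps:.4f}   eta = {eta:.4f}")

# Cross-check the closed form against numerical derivatives at one point.
print(slow_roll_numeric(lambda x: x**2, 15.0))
```

The same `slow_roll_numeric` helper can be applied to any other trial potential, such as the cosine form quoted above, by passing a different callable.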
One can have more than this. The inflaton field can cause a perturbation to the original FRW
metric. These perturbations change the metric, hence the Einstein tensor and the Friedmann
equations. We generally consider two types of perturbations: scalar and tensor. These perturbations
act as the seeds of anisotropies, and ultimately lead to structure formation. These anisotropies are
reflected in CMB anisotropies, so if one can find out the anisotropies as a function of the multipole
l, one gets a good idea about the nature of these perturbations.
There are two quantities that experiments measure. The first one is called the scalar spectral
index ns . It essentially tells you how much power is there in the scalar perturbations. The second
one is the tensor-to-scalar ratio r, which gives you the relative importance of tensor perturbations.
Different inflaton potentials give different values of r, but generate ns almost in the same range,
slightly smaller but very close to unity.

Figure 26: The ns versus r allowed regions (1σ and 2σ) by the BICEP2 experiment. The red
regions are from the Planck experiment data (2013) which gives r < 0.26 at 95% confidence level.
BICEP2 gets r = 0.20^{+0.07}_{−0.05}. There are now serious doubts about the analysis and one generally
believes that the conclusion of BICEP2 is incorrect; r is still consistent with zero.

Now look at Fig. 26, and assume for the time being that it is correct. The red area shows the
previously allowed region from the 2013 data, courtesy of the Planck experiment. The blue regions
show the results from the BICEP2 experiment, which came in early 2014 and created quite a
sensation. The data rules out r = 0 at about 7σ, but let us try to understand its full significance
if it were correct.
1. The data tells you that there is a large tensor perturbation. This is definitely nonzero if
BICEP2 was right. This can be interpreted as the first signal of the primordial gravitational waves,
those that came out at the inflationary epoch, and subsequently interacted with the CMB photons.
The primordial waves are different from the later-epoch waves that we have observed at LIGO.
2. The primordial gravitational waves can only be detected if they were amplified by a huge
amount, or they would have diluted down to a level completely undetectable. Such amplification
can be caused only by inflation. Thus, BICEP2 results could have been the first positive proof of
inflation, which would in turn have made the Nobel Prize almost a certainty for its proponents,
Alan Guth and Andrei Linde.
3. If the tensor perturbations were that large, they must have been produced by the quantum
fluctuations of the inflaton field when the energy was very high. If the energy were low, the excited
states would have been of very low mass, and could not have generated such strong perturbations. So, in
some sense, we could have had a window to the energy of the universe when inflation took place,
or the time window. This turns out to be about 10^{16} GeV. This is huge, very much out of the range of
any particle physics accelerator.
If confirmed, BICEP2 results would have opened a new era in the research of cosmology as well as
in particle physics. However, as of late 2014, there was serious doubt whether BICEP2 had taken
all the systematic uncertainties and possible error sources into account, like the electromagnetic
emission from the foreground of the sky that they were observing. Planck has a much better field
of view, so it could act as the confirmatory test. As of 2015, with the latest Planck
results, we can almost definitely say that the BICEP2 analysis was incorrect; r is still consistent
with zero (although nonzero values are not ruled out), and there is no proof as yet for the existence
of primordial gravitational waves. Physics is full of such cruel and nasty shocks. Still, the future is
extremely exciting.
