
Computational Astrophysics

Justin Read
justin.inglis.read@gmail.com
http://justinread.net
ETH Zürich / University of Leicester
Abstract
We study computational methods that form the key tools for modern theoretical astrophysics. We
study how to solve gravity for many body systems from small stellar clusters up to the Universe as
a whole. We then show that the fluid equations can give a good description of gas in the Universe
and study numerical methods for solving these. We conclude with a look to the state of the art in
computational astrophysics across a range of interesting problems from how stars and galaxies form
to calculating the distribution of dark matter in the Universe.
Notation
Vectors are denoted in bold v; time derivatives are denoted by a dot, ẋ = dx/dt; and spatial
derivatives are denoted by a dash, y′ = dy/dx. We will typically use units of kiloparsecs, Solar
masses, kilometres per second and gigayears: L = kpc, M = M⊙, V = km/s and T = Gyrs, unless
otherwise stated. For reference, unit conversions to S.I. values are given in appendix A. We use the
usual notation for different coordinate systems, Cartesian: (x, y, z), cylindrical polars: (R, φ, z), and
spherical polars: (r, θ, φ).
Reading list
Suggested further reading for the course:
Dynamics background (for interpreting simulations): Binney and Tremaine 2008
N-body review article: http://adsabs.harvard.edu/abs/2011EPJP..126...55D
SPH review article: http://adsabs.harvard.edu/abs/2012JCoPh.231..759P
Problems with SPH (and solutions): http://adsabs.harvard.edu/abs/2010MNRAS.405.1513R;
http://adsabs.harvard.edu/abs/2011arXiv1111.6985R
Classical N-body problem: Heggie and Hut 2003
Numerical methods for particle-based simulation: Hockney and Eastwood 1988
For more astronomical background: Shu 1982, esp. Part III (Chap. 11-16) Galaxies and Cosmology.
Course overview [rough]
1. Astrophysics 101: what problems would we like to solve? why might these require simulation?
A brief history of computer simulation in astrophysics
2. Solving gravity
3. From one body to N-bodies
4. Collisional N-body systems
5. Collisionless N-body systems
6. Cosmological N-body simulations
7. Initial conditions
8. Solving the Euler equations: Smoothed Particle Hydrodynamics (SPH)
9. Selected topical problems in computational astrophysics
Contents
1 Observables (Astrophysics 101) 6
1.1 What's out there? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Measuring starlight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.1 Absolute magnitude . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.2 Flux and apparent magnitude . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.3 Other observed properties of stars . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.4 The Hertzsprung-Russell (HR) diagram . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.5 Integrated starlight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3 Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 The parsec and the distance ladder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.5 Measuring velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.6 Timescales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.6.1 The orbit time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.6.2 The crossing time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.6.3 The dynamical time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.6.4 The [direct] collision time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.6.5 The relaxation time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2 Why simulate? 17
2.1 What problems can we study? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 The need for speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 A brief history of N-body simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.1 The first N-body calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.2 Simulation validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3.3 Moore's law and the dramatic increase in N . . . . . . . . . . . . . . . . . . . . 21
3 Solving Gravity 22
3.1 Some useful general results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.1 The gravitational potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.2 Poisson's equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.3 Some other general results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2 Newton's Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2.1 Spherical symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2.2 Oblate spheroidal systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3 The multipole expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3.1 Understanding the multipole expansion . . . . . . . . . . . . . . . . . . . . . . 27
3.3.1.1 An expansion in basis functions . . . . . . . . . . . . . . . . . . . . . 27
3.3.1.2 Moments of the density distribution . . . . . . . . . . . . . . . . . . . 28
3.3.1.3 The physical meaning of dipoles and quadrupoles . . . . . . . . . . . 28
3.3.2 Gravity versus electrostatics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4 Analytic potential-density pairs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4.1 The Miyamoto-Nagai disc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4.2 The Logarithmic potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.5 Numerical Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.5.1 Direct summation; collisional simulations . . . . . . . . . . . . . . . . . . . . . 31
4 From one body to N-bodies 33
4.1 The one-body problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1.1 Spherical symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1.1.1 Orbits in spherical potentials . . . . . . . . . . . . . . . . . . . . . . . 34
4.1.1.2 Frequencies and resonance . . . . . . . . . . . . . . . . . . . . . . . . 35
4.1.1.3 The special case of Kepler . . . . . . . . . . . . . . . . . . . . . . . . 36
4.1.2 Axisymmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.1.3 2D potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.1.3.1 Surfaces of section (Poincaré sections) . . . . . . . . . . . . . . . . . . 39
4.1.3.2 Orbit families . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.1.3.3 Stäckel potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.1.4 Triaxiality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.1.4.1 Chaos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.1.5 Orbits in discs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.1.6 Rotating potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.2 The N-body problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5 Collisional N-body systems 47
5.1 Direct force evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
5.2 Time integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.2.1 The Simple Euler integrator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.2.2 The Leapfrog integrator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.2.3 Variable timesteps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.2.4 Hermite integrators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.2.5 The choice of time-step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.3 Close encounters and regularisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.4 The use of special hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.5 Alternatives to N-body simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.5.1 Orbit averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.5.2 Monte-Carlo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.5.3 Fluid Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.5.4 Critique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.6.1 The stability of the Solar system . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.6.2 Runaway growth of black holes in star clusters . . . . . . . . . . . . . . . . . . 60
6 Collisionless N-body systems 64
6.1 The continuum limit: the collisionless Boltzmann equation . . . . . . . . . . . . . . . . 64
6.2 The collisionless N-body method and force softening . . . . . . . . . . . . . . . . . . . 65
6.3 Force calculation: Fourier techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
6.4 Force calculation: Tree techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
7 Cosmological N-body simulations 71
7.1 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
7.1.1 The homogeneous Universe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
7.1.1.1 The FLRW metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
7.1.2 The inhomogeneous Universe . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
7.1.3 The non-linear growth of structure: evolution equations . . . . . . . . . . . . . 74
7.1.4 Boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
8 Initial conditions 76
8.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.2 Setting up stellar systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.2.1 Spherical symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.2.1.1 Isotropic systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.2.1.1.1 The Plummer solution . . . . . . . . . . . . . . . . . . . . . . 77
8.2.1.1.2 Negative distribution functions . . . . . . . . . . . . . . . . . 77
8.2.1.2 Anisotropic systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
8.2.1.3 Axisymmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8.2.1.4 Triaxial systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8.2.2 Jeans methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8.3 Drawing N points from the distribution function . . . . . . . . . . . . . . . . . . . . . 79
8.3.1 Accept/reject . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.3.2 A faster method for spherically symmetric distributions . . . . . . . . . . . . . 80
8.4 Comparing different IC methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
8.5 Noise and 'quiet' starts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
8.6 Cosmological initial conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
9 Smoothed Particle Hydrodynamics 83
9.1 The Euler equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
9.1.1 The entropy form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
9.2 Smoothed Particle Hydrodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
9.2.1 The density interpolant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
9.2.2 The SPH equations of motion: the 'classic' derivation . . . . . . . . . . . . . . 84
9.2.3 The SPH equations of motion: a more general derivation . . . . . . . . . . . . 85
9.2.4 Errors & stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
9.2.5 Dealing with trajectory crossing: artificial dissipation . . . . . . . . . . . . . . 85
9.2.6 Hydrodynamic tests of SPH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
A Common constants in astrophysics 86
B Key results from Vector Calculus 87
B.1 Curvilinear coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
B.2 Divergence operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
B.3 Divergence & Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
C Some useful mathematical functions 89
C.1 The Dirac Delta function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
C.2 Functions for use in tensor calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
D The Taylor expansion 90
E Solving Poisson's and Laplace's equations 91
F Some useful potential-density pairs 93
F.1 Spherical systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
F.1.1 Point Mass: Keplerian potential . . . . . . . . . . . . . . . . . . . . . . . . . . 93
F.1.2 Constant density sphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
F.1.3 Power-law Profile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
F.1.4 Split power law models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
F.2 Axisymmetric systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
F.3 Triaxial systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
F.3.1 Generalised Ferrers potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
G Spherical harmonics 96
H Lagrangian & Hamiltonian mechanics 97
H.1 Lagrangian mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
H.1.1 Holonomic constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
H.1.2 Noether's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
H.1.3 Rotating reference frames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
H.2 Hamiltonian mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
H.2.1 Canonical transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
H.2.2 The Hamilton-Jacobi equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
H.2.3 Actions & integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
H.2.4 A worked example: the simple harmonic oscillator . . . . . . . . . . . . . . . . 105
H.3 Phase space and Liouville's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 108
I Dynamical friction 110
I.1 The Chandrasekhar approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
I.2 Resonance: what Chandrasekhar misses . . . . . . . . . . . . . . . . . . . . . . . . . . 115
I.3 The dynamical friction timescale and the connection to relaxation . . . . . . . . . . . 115
I.4 Wakes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
I.5 Mass segregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
I.6 Collisionless relaxation and friction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
J A brief primer on general relativity 118
J.1 What is wrong with good old Newton? . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
J.2 Special relativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
J.2.1 Introducing tensor notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
J.2.2 4-momentum, 4-force and all that ... . . . . . . . . . . . . . . . . . . . . . . . 121
J.2.3 The clock hypothesis and general relativity . . . . . . . . . . . . . . . . . . . . 122
J.2.4 The equivalence principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
J.2.5 The field equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
J.2.6 Energy conservation in general relativity . . . . . . . . . . . . . . . . . . . . . . 125
J.3 Solving the field equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
J.3.1 The Newtonian weak field limit . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
J.3.2 The weak field limit & gravitational waves . . . . . . . . . . . . . . . . . . . . . 127
J.3.3 The Schwarzschild solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
J.3.4 The FLRW metric and the cosmological model . . . . . . . . . . . . . . . . . . 127
Lecture 1
Observables (Astrophysics 101)
In this lecture, we briefly review the necessary astrophysics for this course. We discuss what
astronomers can see in the Universe and how. We discuss time and distance scales to get a feel for
how huge the Universe is and how long, typically, we have to wait for things to happen. And we
discuss what we know observationally about the sorts of systems we will be interested in simulating
and understanding, from our own Solar system up to the Universe as a whole.
1.1 What's out there?
Before embarking on the course proper, it's worth a brief summary of what the Universe is made up
of; this is summarised in Figure 1.1. The scales are very difficult to grasp. The typical human can
comprehend the difference in size between a grain of salt and a giant cathedral. This is a dynamic
range of about 10⁵, which is very impressive. It allows us to imagine just how far away from the
moon we are! The Universe is a very big place!!
1.2 Measuring starlight
Most of what astronomers see in the Universe is star light. Individual stars emit a spectrum remarkably
close to that of a perfect black body radiator, and this is shown in Figure 1.2. The total power output
from our own star the Sun is called its luminosity, and is given by: L⊙ = 3.83 × 10²⁶ W [1]. The
symbol, ⊙, will be used a lot throughout this course and just means 'of our sun'.
The solar luminosity, L⊙, is really the bolometric luminosity: the total rate of energy output
integrated over all wavelengths. More usually, in astronomy, we use the luminosity output in a
particular waveband (range of wavelengths). This is of more practical value since astronomical
instruments are usually sensitive only over some limited range of wavelengths (an optical telescope,
for example). Many such wavebands are used by astronomers. Most common are the Visual band
centred on λ = 550 nm; the Blue band centred on λ = 440 nm; and the Ultraviolet band centred on
λ = 365 nm. These are marked on Figure 1.2. The U, B, V labels stand for something sensible, like
'visual', but this is not the case for all bands (the infrared bands are labelled I, J, K). Even more
confusing is the fact that the exact definition of these bands has evolved along with the instruments
and telescopes which astronomers use: not every instrument has the same sensitivity, and their
wavelength filters can differ, sometimes by quite a lot! Fortunately, if you are ever confused, there is
an excellent review by Fukugita et al. 1995, which pretty much covers all of the wavebands you are
ever likely to need, and how to convert between them.
1.2.1 Absolute magnitude
Luminosities span an enormous dynamic range in astronomy and it makes sense to use a logarithmic
scale. This is called the absolute magnitude, and is given by:
[1] Many astronomers still use the erg as the unit of energy. For completeness this is defined in appendix A; I will not
use this unit in this course.
Figure 1.1: The Universe: a very very big place. Alpha Centauri (the nearest star to us) is some
10¹³ km away; that's very close compared to the extent of our Galaxy (called the Milky Way): 10¹⁸ km,
or the distance to the Hercules cluster of galaxies some 10²² km away. Also marked is the Andromeda
galaxy (M31), the nearest large spiral to our own Galaxy, and the star cluster, M13, which orbits
within our Galaxy.
Figure 1.2: Stars emit a near perfect black body spectrum of radiation. Marked on the plot are lines
of different black body temperature, T (also known as the effective temperature of a star, T_e), and
the U, B, V wavebands.
M ≡ −2.5 log₁₀ (L/L⊙) + const. (1.1)

where the constants are chosen separately for each waveband. In the B and V bands, for example,
the constants are chosen such that the solar absolute magnitudes are:

M⊙,B = 5.48,  M⊙,V = 4.83 (1.2)
This choice of normalisation is just historical. The system of logarithmic magnitudes comes originally
from the fact that luminosities were measured just by eye: the human eye responds on a logarithmic
scale.
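Equation 1.1 is straightforward to evaluate numerically. Here is a minimal sketch in Python (the function name is mine; the constant is fixed using the solar V-band value M⊙,V = 4.83 from equation 1.2):

```python
import math

M_SUN_V = 4.83  # solar absolute V-band magnitude (equation 1.2)

def absolute_magnitude_v(L_over_Lsun):
    """V-band absolute magnitude of a star with luminosity L/L_sun,
    using M = M_sun,V - 2.5 log10(L/L_sun) (equation 1.1)."""
    return M_SUN_V - 2.5 * math.log10(L_over_Lsun)

# A star 100 times more luminous than the Sun is 5 magnitudes brighter,
# i.e. its magnitude is 5 *smaller* (the scale runs backwards):
print(absolute_magnitude_v(100.0))  # ≈ -0.17
```

Note the sign convention: brighter objects have smaller (more negative) magnitudes.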
1.2.2 Flux and apparent magnitude
Flux in astronomy (the actual number of photons arriving per unit area) is measured using apparent
magnitudes. The flux is given by: f = L/(4πd²), where d is the distance to the source; the apparent
magnitude is given by:

m ≡ −2.5 log₁₀ [(L/L⊙)(10 pc)²/d²] + const. = M + 5 log₁₀(d/10 pc) (1.3)

again, the choice of normalisation, 10 pc, is historical. The constant is the same as in equation 1.1.
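Equation 1.3 separates into the absolute magnitude plus a distance term, so converting between m, M and d is a one-liner each way. A hedged sketch (helper names are mine):

```python
import math

def apparent_magnitude(M, d_pc):
    """Apparent magnitude from absolute magnitude M and distance in pc:
    m = M + 5 log10(d / 10 pc) (equation 1.3)."""
    return M + 5.0 * math.log10(d_pc / 10.0)

def distance_pc(m, M):
    """Invert the distance modulus m - M = 5 log10(d / 10 pc) for d."""
    return 10.0 * 10.0 ** ((m - M) / 5.0)

# At 10 pc, m = M by construction; each factor of 10 in distance adds 5 mag:
print(apparent_magnitude(4.83, 10.0))    # ≈ 4.83
print(apparent_magnitude(4.83, 1000.0))  # ≈ 14.83
print(distance_pc(14.83, 4.83))          # ≈ 1000 pc
```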
1.2.3 Other observed properties of stars
Some other useful definitions are:

- The distance modulus: m − M = 5 log₁₀(d/10 pc).

- The colour of a star: L_V/L_B, or M_B − M_V = m_B − m_V = B − V. This is useful because it is
  independent of distance, and because stars are approximately black body radiators. Thus, the
  colour gives a measure of the star surface temperature.

- The effective temperature of a star: the temperature it would have were it a black body radiator
  (which stars nearly are). Thus the Stefan-Boltzmann law gives: L = 4πR_*²σT_e⁴, defining T_e.
  R_* is the radius of the star, and σ = 5.670 × 10⁻⁸ J K⁻⁴ m⁻² s⁻¹ is the Stefan-Boltzmann
  constant.

- The spectral class of a star. This is a measure of the star's surface temperature (T_e). The
  historical labels, in order of decreasing temperature, are: O, B, A, F, G, K and M (see Figure
  1.3); each class is divided into a subclass numbered 0-9 (e.g. B0 is slightly cooler than O9).
  Our Sun is a G2 star with T_e = 5770 K.
An example of some real stellar spectra is given in Figure 1.4. Notice the absorption features due to
elements in the stellar atmospheres. Measuring these lines allows for much better spectral classification
of stars; while measuring their Doppler shift [2] allows a determination of the radial velocity of a star:
its speed towards or away from us. The two absorption lines at just below 500 nm and around 650 nm
are Hβ and Hα respectively and are caused by excited hydrogen.
1.2.4 The Hertzsprung-Russell (HR) diagram
As we have shown, most stars are very nearly perfect black body radiators. They are well defined,
observationally, by just two numbers: a colour (which is equivalent to a surface temperature), and a
luminosity. A plot of colour vs. luminosity is called a Hertzsprung-Russell, or HR, diagram and is
shown in Figure 1.5. Notice that, as mentioned previously, the colour may be determined from the
difference in just two wavebands, in this case B − V. This is because stars are so close to black
body radiators that only two points along the black body curve are required to define a temperature
[2] The Doppler shift is the shift in the wavelength of the emitted star light due to the motion of the source. Sources
moving away from us are redshifted; those moving towards us are blueshifted.
Figure 1.3: Stellar spectral classification, running from hot to cool through classes O, B, A, F, G, K,
M and L, with effective temperatures ranging from T_eff > 3 × 10⁴ K down to 2.5 × 10³-3.9 × 10³ K.
Credit: Prof. Richard Pogge.
Figure 1.4: Examples of real stellar spectra (normalized flux F_λ plus a constant, against wavelength
in nm) for O through to M type dwarf stars (luminosity class V). The wavelength is rest frame and
has already been corrected for the motion of the stars. The fluxes are offset from one another for
clarity.
Figure 1.5: The Hertzsprung-Russell (HR) diagram.
(see Figure 1.2). The luminosity may also be represented instead on the logarithmic scale of absolute
magnitudes (see equation 1.1). In Figure 1.5, the luminosity is shown along the left axis, and the
absolute magnitude is shown along the right. Similarly, colour is shown along the bottom axis and the
equivalent surface temperature and associated spectral type is shown along the top axis. Notice that
most stars lie along the main sequence. These stars are called (once again for historical reasons) dwarf
stars and are denoted by a 'V'. Depending on their initial mass and chemical composition (known
as their metallicity [3]), stars are born somewhere along the main sequence. They do not evolve along
the main sequence. Instead, when a proto-star ignites, burning hydrogen [4], it moves onto the main
sequence. Once stars use up all of their hydrogen fuel, they evolve off of the main sequence and enter
a giant phase. At the end of their lives, stars eject most of their remaining mass. For the more massive
stars, this will be in the form of a supernova explosion. After this mass loss phase, only the very
core of the star remains. Low mass stars become white dwarfs, while the most massive stars will end
up as neutron stars or black holes. For a much more detailed account of the lives of stars see Phillips
1999.
[3] It is worth making an important point here. Stars can be almost completely characterised by just three numbers:
their total mass, their age, and finally, what astronomers call their metallicity. Metallicity is a measure of the chemical
composition of stars. Zero metallicity means the stars are composed entirely of hydrogen and helium. Anything heavier
than helium is (confusingly) what an astronomer calls a 'metal'. Note that this means we can expect stars to be
degenerate on the H-R diagram, which only contains two pieces of information per star.
[4] The use of the verb 'burning' here is standard in astronomy. It refers to, of course, nuclear fusion of the hydrogen
into helium and other by-products.
1.2.5 Integrated starlight
In practice, stars are often so far away that they are unresolved. This means that in a distant star
cluster or galaxy, we really measure the integrated light from many stars. In this case the light from
the galaxy is really just the sum of many different stellar spectra such as those shown in Figure 1.4.
1.3 Gas
Astronomers see more than just starlight. Useful information about the Universe also comes from gas
seen either in absorption or emission. We briefly review some relevant observations for dynamicists:

- HI (pronounced 'H one') refers to observations of atomic hydrogen: one proton and one electron.
  There is a very sharp line in the atomic hydrogen emission/absorption spectrum in the radio at
  21 cm. It is caused by the transition between the spin states of the proton and electron being
  aligned and anti-aligned. The rarity of such a highly forbidden transition (once every ~10⁷ yrs)
  means we do not ever observe it on Earth. In space, however, where the number of hydrogen
  atoms is astronomical (hehe), such transitions are common. The line is useful because it is
  naturally so narrow in energy [5]. This means that any broadening of the line observed must be
  due to Doppler shifts: the line is useful for measuring gas kinematics [6].

- Hα refers to observations of excited hydrogen. The observed photons come from Balmer
  emission from the n = 3 to the n = 2 transition (recall that n = 1 is the ground state). Hα has
  a wavelength of 656.3 nm (optical light). It is a good tracer of ionised gas because it requires
  little more energy to ionise hydrogen than to excite an electron from n = 1 to n = 3. Since
  ionised gas is hot, Hα is often a good indicator of star formation. Where ionised gas exists, Hα
  can be used to trace kinematics.

- CO refers to observations of the roto-vibrational lines from carbon monoxide molecules. These
  emit photons at wavelengths of a few microns. It is a good tracer of cold gas and can also be
  used for kinematics.

Note that each of the above methods probes different gas which may have quite different kinematics,
even in the same galaxy; see Figure 1.6 (data taken from Simon et al. 2003).
From here on, we will not talk very much about gas dynamics in this course and refer the interested
student to the excellent Shu 1991.
1.4 The parsec and the distance ladder
Since astronomers only really see star light, it is notoriously difficult to measure distances: is an object
faint and close, or distant and bright? In this section we present the standard measures of distance
in astronomy which comprise the 'distance ladder'.
The standard unit of length in astronomy is the parsec, which stands for 'parallax of one
arcsecond' [7]. Unlike some astronomical units and conventions, the parsec is of practical, rather than
just historic, value. It is derived from the parallax method of distance measurement, which is first
attributed to Hipparchus of Rhodes. As the Earth moves around the Sun, the angular separation of
a given nearby star with respect to the very distant background stars (effectively infinitely far away)
changes. Knowing the Earth-Sun distance (1 a.u. = 1.49597892(1) × 10¹¹ m; determined from radar
ranging) then defines the parsec as 1 pc = 3.08567802(2) × 10¹⁶ m. This is shown in Figure 1.7.
[5] Recall that isolated forbidden transitions occur through quantum fluctuations and that the uncertainty principle
gives us: ΔE Δt ≥ ℏ/2. This means that the 21 cm line, which has a very long lifetime, must have a very narrow energy.
We emphasise 'isolated' here because on Earth forbidden transitions proceed through collisional de-excitation. In space,
however, the extremely low gas densities make this unlikely.
[6] It is worth making an important point here. Kinematics refers to velocity measurements. By contrast, dynamics
involves accelerations. It is very rare in astronomy that we can ever really measure accelerations, since changes in
velocity occur on such long timescales. This is why we need dynamics. Our dynamical model allows us to calculate
accelerations given the positions and velocities of the particles (the measurements). We can then calculate what
happened in the past, and what will happen in the future.
[7] 1 arcsec = 2π/360/60/60 radians. See also appendix A.
Figure 1.6: (a) Hα velocity field of the dwarf spiral galaxy NGC 2976. The contours show Hα intensity.
(b) CO velocity field of the same galaxy; the contours show intensity. The angular resolution of each
data set is shown by the filled circles in the top left. Notice how Hα and CO observations trace very
different regions of the galaxy. However, the kinematics are similar.
Figure 1.7: Geometric definition of the parsec: d p = 1 a.u. (p in radians); d = 1 pc if p = 1 arcsec.
Object                Typical distance   Method
Sun, Solar system     10⁻⁶ pc            Radar
Hyades star cluster   40 pc              Hipparcos
Galaxy                10⁴ pc             Cepheid variable stars
Andromeda             10⁵ pc             Cepheid variable stars
Virgo cluster         10⁷ pc             Cepheid variable stars
Beyond                > 10⁸ pc           Hubble expansion: redshift

Table 1.1: The distance ladder. Note that beyond the Virgo cluster, even very bright stars like
Cepheids become unresolved and we see only the integrated light from galaxies. Further away than
this, we must determine distances using the redshift of galaxies.
With the launch of the Hipparcos satellite, we can now use the parallax distance measure out to
about 1000 pc [8]. This is no mean feat, but as we have seen it is barely an eighth of the distance from
our Sun to the centre of the Galaxy; not very far in astronomical terms. To measure greater distances,
a number of other methods are adopted by astronomers. These are calibrated, first by reference to
the parallax distance, and then later to each other, building what is known as the 'distance ladder'.
For example, Cepheid variable stars are a type of star which pulsate in a periodic fashion related to
their luminosity. By calibrating their period-luminosity relation using parallax distance (Hipparcos
can just about do this), they can be used to measure distances reliably out to the Virgo cluster of
galaxies, some 10⁷ pc away! The distance ladder is summarised in Table 1.1.
1.5 Measuring velocity
We have already seen that the radial velocity of stars and gas can be measured from the Doppler shift of absorption lines in their spectra. The angular velocity of an object on the sky is called its proper motion and can be measured for nearby stars, star clusters and galaxies if we are very patient. The idea is very simple: we measure the position of an object relative to a bright, distant background source like a quasar[9]. Then we return about five years later and measure it again. Even very small movements can be detected if we have high signal to noise. The object may only move 1/200th of a pixel on a CCD, and yet its motion can be detected because the relative flux in each pixel will change.
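The conversion from proper motion to transverse velocity is worth having to hand. A minimal sketch in Python, using the standard factor of 4.74 km/s per (arcsec/yr × pc); the star's numbers below are made up for illustration:

```python
# Transverse velocity from proper motion:
#   v_t [km/s] = 4.74 * mu [arcsec/yr] * d [pc]
# The 4.74 converts 1 a.u./yr into km/s.
def transverse_velocity(mu_arcsec_per_yr, d_pc):
    return 4.74 * mu_arcsec_per_yr * d_pc

# A hypothetical star at 100 pc moving 0.1 arcsec/yr across the sky:
v_t = transverse_velocity(0.1, 100.0)
print(f"v_t = {v_t:.1f} km/s")  # v_t = 47.4 km/s
```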
1.6 Timescales
In this section we discuss some important timescales in astrophysics. In our Solar system, the relevant
timescale is the time it takes planets to orbit around the Sun. This is the orbit time, and it gives us a
measure of how rapidly things progress on the scale of our Solar system. The orbit time is a relevant
quantity on much larger scales in the Universe too. The orbits of stars within a star cluster, of stars
within a galaxy and of galaxies within clusters of galaxies all have meaningful orbit times.
Other relevant timescales are the interaction times between the stars within these self-gravitating systems. These govern whether or not a system is collisional or collisionless. These two regimes lead to very different dynamics. We will focus in this course mainly on collisionless systems.
A summary of timescales as a function of scale is given in Table 1.2.
1.6.1 The orbit time
The orbit time is the orbital timescale for a particle at radius r. Using the gravitational constant G and the enclosed mass M, this gives:
[8] At a stretch! Accurate determinations are at more like 150 pc.
[9] A quasar is an extremely bright unresolved galaxy which is typically very far away. They are believed to be so bright because they contain a super-massive black hole at the centre which is consuming a large amount of gas very rapidly and emitting a large amount of energy in the process.
Figure 1.8: Calculating t_coll: the typical timescale between direct collisions in a self-gravitating system. The panel sketches a system of size R containing N bodies, each of size r, with collision cross section σ = 4πr², number density n = 3N/(4πR³), mean free path λ = 1/(nσ) and typical velocity v²_typ ≈ GM/R. For stars, r ≪ R, so direct collisions (almost) never occur.
t_orb = 2πr / v_circ ;   v_circ = √(GM/r) ;   t_orb = 2π √(r³/GM)   (1.4)
1.6.2 The crossing time
The crossing time is the time taken for a particle to cross the system (galaxy, star cluster, Solar system etc.). The typical velocity of the particle is given by v_typ ≈ v_circ = √(GM/r), where r is the radius of the system and M the mass. The crossing time is then:

t_cross = r / v_typ = √(r³/GM)   (1.5)
1.6.3 The dynamical time
The dynamical time is the time taken for a particle to fall from a radius r to the centre of a constant density sphere. This is given by:

t_dyn = √(3π/(16Gρ))   (1.6)

Given that, for a constant density sphere, M = (4π/3)ρr³, the above three timescales are identical to within some small pre-factors. For this reason, they are often used interchangeably.
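These three timescales can be checked against each other numerically. A minimal sketch in the course units (kpc, M_sun, km/s, Gyr), using illustrative Milky-Way-like values for r and M (assumed numbers, not measurements), and G ≈ 4.3 × 10⁻⁶ kpc (km/s)²/M_sun:

```python
import math

# Compare t_orb, t_cross and t_dyn (eqs 1.4-1.6) for a constant density
# sphere. For such a sphere t_dyn/t_cross = pi/2 and t_orb/t_cross = 2 pi.
G = 4.301e-6            # kpc (km/s)^2 / Msun
KPC_KMS_TO_GYR = 0.978  # 1 kpc/(km/s) expressed in Gyr

r = 8.0      # kpc   (illustrative: roughly the Solar radius in the Galaxy)
M = 1.0e11   # Msun  (illustrative enclosed mass)
rho = 3.0 * M / (4.0 * math.pi * r**3)

t_cross = math.sqrt(r**3 / (G * M)) * KPC_KMS_TO_GYR
t_orb   = 2.0 * math.pi * t_cross
t_dyn   = math.sqrt(3.0 * math.pi / (16.0 * G * rho)) * KPC_KMS_TO_GYR

print(f"t_cross = {t_cross:.3f} Gyr")
print(f"t_orb   = {t_orb:.3f} Gyr")
print(f"t_dyn   = {t_dyn:.3f} Gyr")
```

For these numbers t_orb comes out near 0.2 Gyr, consistent with the Milky Way entry in Table 1.2.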
1.6.4 The [direct] collision time
The direct collision time is the timescale over which direct collisions within an equilibrium self-gravitating system occur. Consider a system of size R, containing N bodies, each of size r. This is shown in Figure 1.8. Each body has a cross sectional area for collision of σ = 4πr². Note that this is not the surface area of a sphere; it is a cross sectional area of radius 2r: collisions occur when two stars, each of radius r, collide. Thus we have σ = π(2r)².
The number density of bodies is n(R) = 3N/(4πR³). The mean free path of each body is λ = 1/(nσ), and the typical velocity of a body within the system is given by v²_typ ≈ GM/R. Putting this all together gives us:

t_coll = λ / v_typ ≈ (R/r)² (1/(3N)) (t_orb/2π)   (1.7)
Notice that, for stars, typically r ≪ R and direct collisions (almost) never occur. Can you think of somewhere in the Universe where it might occur?
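Equation 1.7 makes it easy to see just how rare direct collisions are. A rough sketch for a hypothetical globular cluster of 10⁶ Sun-like stars within 10 pc (all numbers illustrative):

```python
import math

# Order-of-magnitude direct collision time (eq 1.7) in course units.
G = 4.301e-6            # kpc (km/s)^2 / Msun
KPC_KMS_TO_GYR = 0.978  # 1 kpc/(km/s) in Gyr

N = 1.0e6       # number of stars (assumed)
M = 1.0e6       # Msun, i.e. one solar mass per star (assumed)
R = 0.01        # kpc = 10 pc (assumed cluster size)
r_star = 2.25e-11  # kpc, roughly one solar radius

t_orb_over_2pi = math.sqrt(R**3 / (G * M)) * KPC_KMS_TO_GYR
t_coll = (R / r_star)**2 / (3.0 * N) * t_orb_over_2pi
print(f"t_coll ~ {t_coll:.1e} Gyr")
```

Even in this dense system, t_coll is many orders of magnitude longer than the age of the Universe.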
1.6.5 The relaxation time
Direct collisions almost never occur, but gravity is long range! Stars accumulate changes in velocity
over time due to both long and short range gravitational interactions. Since such interactions are
random in direction, in the mean, they produce no net eect. However, one can think of an individual
star receiving velocity kicks from the surrounding stars and undergoing a 3D random walk in velocity
Figure 1.9: A random walk in velocity space. A star (marked by the red circle) starts initially with zero velocity. It receives successive velocity kicks of random direction Δv_1 ... Δv_n. The final velocity is then given by |v_t|² = |Σ_{i=1}^n Δv_i|² = Σ_{i=1}^n |Δv_i|². The last equality follows because of the random direction of each kick. Notice that it is the root mean squared (r.m.s.) sum of kicks which determines the final velocity magnitude, not the mean of the velocity kicks.
Figure 1.10: Calculating t_relax: the timescale over which accumulated gravitational interactions turn a star through 45 degrees. The panel sketches two stars of mass m passing with impact parameter b and relative velocity v along x = vt; by symmetry, only the perpendicular force component F_p need be considered, giving Δv_p = 2Gm/(bv).
space. As with the standard random walk, each kick is of random direction, yet over many kicks a star can end up some way away from its initial velocity; this is shown in Figure 1.9. The relaxation time is the time over which these accumulated gravitational interactions on average turn a star through 45 degrees[10]. Imagine the interaction between two stars, each of mass m, as shown in Figure 1.10.
By symmetry, we need only consider the perpendicular force on one of the stars, F_p. This is given by:

F_p = (Gm²/r²) cos θ ≈ (Gm²/b²) [1 + (vt/b)²]^(−3/2)   (1.8)

where b is the impact parameter: the perpendicular distance of closest approach between the two stars; and the approximation sign is there to remind us that we have assumed straight line trajectories: x = vt.
Using Newton's laws, F_p = m dv_p/dt, this gives a change in the perpendicular velocity of the star given by:

Δv_p ≈ ∫_{−∞}^{∞} (Gm/b²) [1 + (vt/b)²]^(−3/2) dt = 2Gm/(bv)   (1.9)
The above is a reasonable approximation provided that Δv_p ≪ v, i.e. b ≫ b_min ≈ Gm/v²_typ.
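The integral in equation 1.9 can be checked numerically. A minimal sketch in units where G = m = b = v = 1, so the expected answer is exactly 2:

```python
# Numerical check of the impulse integral in eq 1.9:
#   dv_p = Int Gm/b^2 [1 + (vt/b)^2]^(-3/2) dt = 2Gm/(bv).
# In units G = m = b = v = 1 the integrand is (1 + t^2)^(-3/2).
def integrand(t):
    return (1.0 + t * t) ** -1.5

# Simple trapezoidal rule over a wide, finely sampled window.
# Truncating at |t| = T misses only ~1/T^2 of the answer.
T, n = 1000.0, 1_000_000
h = 2.0 * T / n
total = 0.5 * (integrand(-T) + integrand(T))
total += sum(integrand(-T + i * h) for i in range(1, n))
total *= h
print(total)   # close to 2.0
```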
That was one encounter. Let us assume that the star travels across the system once. If the system is of size b_max, then the number of other stars it will encounter is given by:

δn = (N/(π b²_max)) 2πb db   (1.10)

where b is the impact parameter as before.
[10] This is one of many definitions of the relaxation time. It will suffice for our order-of-magnitude calculation here.
Object                      t_orb           t_relax
Solar system                1 year[11]      ∞
Hyades open star cluster    4 × 10^6 yrs    140 × 10^6 yrs
M13 globular cluster        2 × 10^8 yrs    5 × 10^9 yrs
Milky Way Galaxy            2 × 10^8 yrs    2 × 10^16 yrs
Virgo galaxy cluster        3 × 10^9 yrs    ~10^10 yrs
Hercules galaxy cluster     6 × 10^9 yrs    ~10^10 yrs

Table 1.2: Timescales in astronomy.
Now, over many encounters, ⟨Δv_p⟩ = 0, but ⟨Δv²_p⟩ ≠ 0; this is illustrated in Figure 1.9. Thus the change in the perpendicular velocity of the star when it crosses the system once is given by:

Δv²_p = ∫_{b_min}^{b_max} Δv_p² δn ≈ 8N (Gm/(b_max v_typ))² ln(b_max/b_min)   (1.11)
where v_typ = √(GM/b_max) is the typical stellar velocity (recall that M is the mass of the whole system interior to b_max, while m is the mass of one star).
The relaxation time is the time over which accumulated gravitational interactions on average turn a star through 45 degrees. This occurs when the star has crossed the system n_cross = v²_typ/Δv²_p times. Since each crossing takes t_cross ≈ b_max/v_typ = t_orb/(2π), we find:

t_relax = n_cross t_cross ≈ (N/(16π ln Λ)) t_orb   (1.12)
where ln Λ = ln(b_max/b_min) is known as the Coulomb logarithm. Notice that it is set by the dynamic range in the system. Since b_max/b_min = [10, 10000] gives ln Λ = [2.3, 9], it is reasonable to assume ln Λ ∼ 10 for most back of the envelope calculations.
The relaxation time is particularly important. It determines whether a self-gravitating system can be thought of as collisionless (t_relax > t_universe) or collisional (t_relax < t_universe). Collisionless systems are much easier to model and we will deal almost exclusively with these.
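Equation 1.12 can be turned into a quick collisional/collisionless classifier. A sketch using rough, illustrative numbers for the Milky Way (N ∼ 10¹¹, t_orb ∼ 0.2 Gyr, ln Λ ∼ 10); the answer should land near the ~2 × 10¹⁶ yr entry in Table 1.2 at the order-of-magnitude level:

```python
import math

# Back of the envelope relaxation time, eq 1.12:
#   t_relax ~ N t_orb / (16 pi ln Lambda)
T_UNIVERSE = 14.0   # Gyr

def t_relax(N, t_orb_gyr, ln_lambda=10.0):
    return N * t_orb_gyr / (16.0 * math.pi * ln_lambda)

tr = t_relax(1.0e11, 0.2)   # rough Milky Way numbers
print(f"t_relax ~ {tr:.1e} Gyr")
print("collisionless" if tr > T_UNIVERSE else "collisional")
```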
A summary of timescales in astronomy is given in Table 1.2. Notice how slowly things typically orbit; our Sun, for example, cannot have made more than ~50 revolutions about the centre of our Galaxy over the entire lifetime of the Universe (~14 Gyrs). This means we are very unlikely to actually see anything happen in the Universe. Even when dynamical times are very short, like at the very centre of our Galaxy, we can usually only hope to see stars move enough to measure their transverse velocity across the sky. In general, the Universe must be viewed as a snapshot. By measuring the positions and velocities of stars (using their relative Doppler shifts or proper motions), we can then use dynamics to work out where those stars were in the recent past, and where they will be in the future.
As a final point, have a think about the relaxation times for the clusters of galaxies. We derived the relaxation time in a relatively crude way, assuming that all of the objects undergoing relaxation have the same mass. Is this likely to be true for galaxies in a cluster of galaxies?
[11] Using this and r = 1 a.u. means we can now weigh the Sun!
Lecture 2
Why simulate?
In this lecture, we discuss why we might want to simulate astrophysical systems, and we give a brief history of the field.
2.1 What problems can we study?
Although significantly weaker than all of the other forces, the dominant force in the Universe is gravity. Apart from very early on in the Universe, or near to massive black holes, gravity can be well described by the Newtonian theory. For this reason, most of this course will focus on solving Newtonian gravity for massive self-gravitating systems. Such self-gravitating systems are called N-body systems, referring to the number N of self-gravitating entities. Our Solar system, for example, has N ≈ 9 if we count only the planets and the Sun[1]. Our Galaxy has over 100 billion stars and so is a massive N-body system. Calculating the gravitational forces between all of these entities and evolving their motion forwards in time is then a significant computational challenge. If we consider the Universe as a whole as an N-body system, this becomes more challenging still. Yet building models of such systems is worth the difficulty of overcoming such challenges. It allows us to address many interesting questions in modern theoretical astrophysics and cosmology. These can be divided into two main types of question. Those that ask, given some observed N-body system, can we calculate what will happen next?
Is our Solar system stable?
Will our Galaxy collide with the nearby spiral galaxy Andromeda, and what will happen if it does?
What is the future of our Universe?
And those that try to work forwards from some assumed but sensible initial condition to the present day:
How did the Solar system form?
How did our Galaxy form and evolve?
How did our Universe form and evolve?
What is the Universe composed of, i.e. what is the nature of dark matter and dark energy?
The former questions use data in the Universe around us as an initial condition for a simulation that calculates what will happen next, rather like forecasting the weather. The latter questions essentially try to fit data in the Universe around us by evolving some assumed initial condition forwards in time. This then tests these initial assumptions: how did our Solar system start out? What is the Universe made up of? etc. We will study both types of question on this course, focussing on the numerical techniques and tools that allow us to set up, perform, and analyse these simulations.
[1] Sadly, Pluto is no longer officially a planet.
Figure 2.1: Set-up for the two body problem. The centre of mass of the system is marked at O. The two masses, m_1 and m_2, are separated by r = x_2 − x_1.
2.2 The need for speed
So far, we have simply stated that building models of large N-body systems requires computer simulations. Many of you may already be thinking: "Pah! A real physicist can solve everything using nothing but pure thought, and perhaps a pencil." Let's consider how far this can get us. We start with a simple astrophysical system involving just two point masses m_1 > m_2 in their centre of mass frame, as shown in Figure 2.1. This could be the Sun and the Earth, for example. The gravitational force on each mass is given trivially by:

m_1 ẍ_1 = F_12 = −m_2 ẍ_2 = (G m_1 m_2 / |r|³) r   (2.1)

where F_12 is the gravitational force between the two masses.
But since we are choosing coordinates about the centre of mass, we must also have that:

m_1 x_1 + m_2 x_2 = 0   (2.2)
Combining equations 2.1 and 2.2 gives:

r̈ = −(G(m_1 + m_2)/|r|³) r   (2.3)

which is identical to the equation of motion for a test particle moving about a point mass of mass M (the Kepler problem) with M ≡ (m_1 + m_2). Hopefully you already know that the Kepler problem can be solved analytically (we will come to this later on in the course if you've not seen it before).
This is all very well, but the above system cannot even describe our Solar system, which has eight planets and the Sun! Let us suppose we want to add just one more body to the system: a three body problem. This was considered one of the most pressing astronomical problems of the nineteenth century, and there was an enormous prize on offer (from Oscar II, King of Sweden) for anyone who could solve it. Henri Poincaré did not solve the problem, but he won the prize all the same by showing that it could never be solved exactly. A full proof is beyond the scope of this course, but it boils down to the fact that the three body problem is in general chaotic. That is, an infinitesimal change in the initial conditions can lead to extremely different outcomes. Since the initial conditions cannot be perfectly known, the outcome cannot be perfectly predicted.
Some special cases for the three body problem do exist, however. An example is given in Figure 2.2.
2.3 A brief history of N-body simulations
2.3.1 The first N-body calculation
Given the interest in N-body systems, astrophysicists were keen to find efficient ways to compute the gravitational forces between N bodies and to evolve these forwards in time. The very first attempt to do this was the pioneering work of Holmberg 1941. He realised that the luminosity of a lightbulb falls off as 1/r², exactly the same as gravity. Thus, by arranging lightbulbs in a compact disc array, he could simulate a disc galaxy. Putting down a light meter between the bulbs, he could then calculate the gravitational force (proportional to the light intensity) on each bulb. Armed with the force, he then needed to update the velocities and positions of the bulbs (assumed to be initially rotating).
Figure 2.2: Examples of solutions to the three (left) and four (right) body problems. Such solutions are very rare. In general, for more than two bodies the problem is chaotic and we must resort to using computers.
Figure 2.3: Left & Middle: The Holmberg lightbulb N-body experiments for modelling galaxy
interactions for a retrograde encounter (left) and a prograde encounter (middle). Right: A real
galaxy-galaxy interaction observed with the Hubble space telescope (credit ESA/NASA).
This required some time integration of the system. We will discuss this in much more detail in later lectures, but can introduce the most basic form of the idea here: Euler integration. Defining a timestep Δt_i for a lightbulb i, we can update its position and velocity as:

x_i(t + Δt) = x_i(t) + ẋ_i Δt_i   (2.4)
ẋ_i(t + Δt) = ẋ_i(t) + ẍ_i Δt_i   (2.5)

where ẍ_i is the acceleration evaluated at time t.
While the above is conceptually simple, it is not very accurate. The simple Euler method is nothing more than a Taylor expansion in t about t to first order. Thus, the errors per step will be proportional to Δt². We may obtain ever increasing accuracy by shrinking Δt, but this amounts to a large amount of manual labour in physically moving the bulbs! We will discuss more sophisticated algorithms later on in the course.
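The first-order convergence of Euler integration can be seen directly by integrating the Kepler problem of equation 2.3. A minimal sketch in units with G = M = r = 1, measuring the energy error of a circular orbit as the timestep shrinks:

```python
import math

# Forward Euler (eqs 2.4-2.5) for a circular Kepler orbit, G = M = 1.
# Euler is globally first order: halving dt roughly halves the error.
def euler_energy_error(dt, t_end=1.0):
    x, y = 1.0, 0.0
    vx, vy = 0.0, 1.0    # circular speed sqrt(GM/r) = 1
    e0 = 0.5 * (vx**2 + vy**2) - 1.0 / math.hypot(x, y)
    t = 0.0
    while t < t_end - 1e-12:
        r3 = math.hypot(x, y) ** 3
        ax, ay = -x / r3, -y / r3          # acceleration at time t
        x, y = x + vx * dt, y + vy * dt    # eq 2.4
        vx, vy = vx + ax * dt, vy + ay * dt  # eq 2.5
        t += dt
    e1 = 0.5 * (vx**2 + vy**2) - 1.0 / math.hypot(x, y)
    return abs(e1 - e0)

for dt in (1e-2, 5e-3, 2.5e-3):
    print(dt, euler_energy_error(dt))
```

Each halving of dt roughly halves the accumulated energy error, the signature of a first-order method.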
Holmberg's lightbulbs amount to a rather clever parallel computer for the force. Setting up two such lightbulb galaxies, he could then calculate the effect of a galactic encounter for the first time. The results for co-rotating and counter-rotating interacting galaxies are shown in Figure 2.3 (left and middle panels). With this rather crude technique, he was already able to show that prograde versus retrograde
[1] Our selection was taken from the review paper Dehnen and Read 2011.
Figure 2.4: The increase in particle number in N-body simulations over the past 50 years for selected collisional (red) and collisionless (blue) N-body simulations[1]. The solid line shows the scaling N = N_0 2^((year − y_0)/2) (with N_0 = 16 and y_0 = 1960, valid for von Hoerner's calculation) expected from Moore's law if the costs scale linearly with N.
encounters produce very different results, with the prograde encounter naturally leading to longer and more prominent spiral arms. His calculation captures the essential qualitative features of real galactic encounters observed in the Universe, like the beautiful on-going merger between the galaxies NGC2207 and IC2163 (see Figure 2.3, right panel).
2.3.2 Simulation validation
While Holmberg's calculation is both beautiful and pioneering, it provides another important lesson for us here. Such N-body simulations are not perfect. There are errors that come from the choice of timestep Δt, from the choice of initial conditions, and (in Holmberg's case) from the human error in physically moving the lightbulbs. How can we be certain that the results are correct? There are a number of key tests we can use:
The sanity check: Do the results look sensible? [Can you spot something odd with the 7 most central lightbulbs in Figure 2.3 (left and middle panels)?]
The conservation check: Are energy, momentum, angular momentum and any other conserved quantity actually conserved? Are they conserved well enough to trust the results?
The convergence test: Do the results change if we increase the numerical resolution (i.e. if we increase the number of lightbulbs)?
The analytic comparison: Do the results match analytic expectations?
It is hard for us, in this case, to perform the conservation check or the convergence test without access to Holmberg's original data or apparatus (though these tests are easy to apply to our own modern computer simulations). But we can compare with more modern numerical calculations, and with analytic calculations. Although the three body problem is not solvable analytically, the restricted three body problem (where the third mass is light compared to the other two) is. Using this, we can
show analytically that prograde encounters should lead to more material being torn off into spiral arms than retrograde encounters (Read et al. 2006c), suggesting that Holmberg's main qualitative results are correct. How to analyse and validate simulations will form a major part of this course.
2.3.3 Moore's law and the dramatic increase in N
During the second world war, the first computers were developed. The first electronic, digital and programmable computer was Colossus, which was used as a code-breaking machine by the British at Bletchley Park. By the early 1960s, programmable computers with basic memory were available on science campuses across the world, leading to the first N-body simulations post-Holmberg by von Hoerner 1960 and Aarseth 1963. Since then, we have seen a dramatic increase in computer power, with computer power nearly doubling every two years (Moore's law). Figure 2.4 shows how the N in N-body has dramatically increased in line with Moore's law since Holmberg's early experiments. The very latest calculations now routinely break the billion particle mark, with N > 10⁹! On this course, we will look at the algorithms that have made this possible.
Lecture 3
Solving Gravity
On astronomical scales, gravity dominates over all of the other forces. In this Lecture we review how to calculate the gravitational field due to arbitrarily complex mass distributions. We focus here on Newtonian gravity and present some useful analytic derivations. These will help us to understand and test our numerical results later on in the course. A numerical approach to solving gravity will be presented in Lecture ??.
3.1 Some useful general results
We start with some useful general results. All of these rely on three basic assumptions about the gravitational force:
The gravitational force due to an infinitesimal piece of mass is an inverse square law.
The gravitational force acts instantly over arbitrarily large distances.
The gravitational force due to each infinitesimal element may be linearly added to give the total force.
These assumptions form the bedrock of the Newtonian world view. They are ultimately empirically justified: they give a very accurate account of orbital motions in the Solar system, and of the dynamics of falling bodies on the Earth. In this sense, we can think of them as being very nearly correct. However, Einstein found fault with the second assumption, and the improvement, general relativity, gives a more accurate description of the Universe at high energy scales. Luckily, for most of the Universe, Newton is an excellent approximation.
3.1.1 The gravitational potential
The force per unit mass[1] due to any mass distribution can be built up by summing the infinitesimal contributions[2] δF(x):

δF(x) = G ((x′ − x)/|x′ − x|³) δm(x′) = G ((x′ − x)/|x′ − x|³) ρ(x′) δ³x′   (3.1)

to obtain:

F(x) = G ∫ ((x′ − x)/|x′ − x|³) ρ(x′) d³x′   (3.2)
[1] An important word of warning here! Often we theoretical physicists can get a bit sloppy with our terminology. Force per unit mass is, of course, acceleration. Sometimes we will say "force" meaning "force per unit mass", meaning acceleration. I will try hard not to do this, but beware: it happens!
[2] Of course, galaxies and star clusters, as we have seen in Lecture 1, are really made up of many discrete stars and possibly many discrete dark matter particles too. Is it reasonable to assume that this discrete matter distribution is smooth?
Defining the gravitational potential Φ(x) by:

Φ(x) = −G ∫ (ρ(x′)/|x′ − x|) d³x′   (3.3)

we find:

F(x) = −∇Φ(x)   (3.4)

The above tells us that the gravitational force per unit mass is fully specified by a scalar field, Φ(x).
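Equation 3.4 can be verified numerically in the simplest case, a point mass, by differencing the potential. A minimal sketch with G = M = 1, where Φ = −1/|x| and the force should recover the inverse square law:

```python
import math

# Check F = -grad(Phi) for a point mass at the origin (G = M = 1).
def phi(x, y, z):
    return -1.0 / math.sqrt(x * x + y * y + z * z)

def force_fd(x, y, z, h=1e-6):
    # Central finite differences of the potential.
    return (-(phi(x + h, y, z) - phi(x - h, y, z)) / (2 * h),
            -(phi(x, y + h, z) - phi(x, y - h, z)) / (2 * h),
            -(phi(x, y, z + h) - phi(x, y, z - h)) / (2 * h))

p = (1.0, 2.0, 2.0)                  # |p| = 3
fx, fy, fz = force_fd(*p)
exact = tuple(-c / 3.0**3 for c in p)  # F = -x/|x|^3
print((fx, fy, fz), exact)
```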
3.1.2 Poisson's equation
The above gives us the force per unit mass from the potential. To calculate the potential from the mass distribution, we need Poisson's equation, which we derive here. Combining equations 3.4 and 3.3, we obtain:

∇Φ = G ∫ (ρ(x′)(x − x′)/|x′ − x|³) d³x′   (3.5)

Thus:

∇·(∇Φ) = G ∫ ρ(x′) ∇·[(x − x′)/|x′ − x|³] d³x′   (3.6)
The term in square brackets is a straightforward differentiation. But we include it in order to point out a remarkable coincidence:

∇·[(x − x′)/|x′ − x|³] = 3/|x′ − x|³ − 3|x′ − x|²/|x′ − x|⁵   (3.7)

The above differential is zero everywhere, except at x = x′. This remarkable cancellation occurs only because gravity is exactly an inverse square law in three dimensions. It is what leads us to Poisson's equation:

∇²Φ(x) ≡ ∇·(∇Φ) = Gρ(x) ∮ d²Ω = 4πGρ(x)   (3.8)

where the above follows since the integral will be zero everywhere except at x′ = x. In this limit, ρ(x) comes out of the integral, and we are left with just an integral over solid angle, d²Ω.
3.1.3 Some other general results
Integration of Poisson's equation (equation 3.8), and application of the Divergence Theorem[3], leads directly to Gauss' Theorem:

4πG ∫ ρ d³x = ∮ ∇Φ · d²S   (3.9)

The total potential energy of a mass distribution is given by:

W = (1/2) ∫ ρ(x) Φ(x) d³x   (3.10)

Finally, notice that, from Stokes' Theorem[4], gravity is a conservative force:

∮_C F · dl = −∫_S (∇ × ∇Φ) · d²S = 0   (3.11)

The left integral is, by definition, just the work done in moving around a closed path, C. Since this is zero, this tells us that particles moving in a static gravitational field must conserve energy. Have a think about this in more detail. The Earth, for example, orbits around the Sun. But does it conserve energy?
[3] See Appendix B.
[4] See Appendix B.
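Equation 3.10 can be checked against the classic result W = −3GM²/(5R) for a uniform sphere. A minimal sketch with G = M = R = 1, using the interior potential of a uniform sphere, Φ(r) = −(3R² − r²)/(2R³), which we quote here without derivation:

```python
import math

# Numerically evaluate W = (1/2) Int rho Phi d^3x (eq 3.10) for a
# uniform sphere with G = M = R = 1; the answer should be -3/5.
rho = 3.0 / (4.0 * math.pi)   # uniform density for unit mass, unit radius

def phi(r):
    return -0.5 * (3.0 - r * r)   # interior potential of the sphere

n = 100_000
dr = 1.0 / n
# Midpoint rule over spherical shells (rho = 0 outside the sphere):
W = sum(0.5 * rho * phi((i + 0.5) * dr) * 4.0 * math.pi
        * ((i + 0.5) * dr) ** 2 * dr for i in range(n))
print(W)   # close to -0.6 = -3GM^2/(5R)
```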
Figure 3.1: Geometrical proof of Newton's First Theorem (left) and Second Theorem (right). The panels sketch a thin shell of mass M: opposing cones through an interior point subtend shell masses m_1 and m_2 whose forces m_1/r_1² and m_2/r_2² cancel, while for an exterior point p the shell potential sums to Φ_p = −GM/|pq|, as for a point mass.
3.2 Newton's Theorems
In general, we want to solve Poisson's equation (equation 3.8). This gives us the gravitational potential, Φ(x, y, z), for a given matter density, ρ(x, y, z). The force at any given point may then be determined from equation 3.4. However, in practice, this becomes analytically messy and intractable quite quickly. A number of theorems due to Newton make the task much easier for systems with a high degree of symmetry.
3.2.1 Spherical symmetry
Newton's First Theorem: A body that is inside a spherical shell of matter experiences no net gravitational force from that shell.
Proof: See Figure 3.1.
Newton's Second Theorem: The gravitational force on a body that lies outside a closed spherical shell of matter is the same as it would be if all the shell's matter were concentrated into a point at its centre.
Proof: See Figure 3.1.
These results are extremely useful. Together they mean that, for spherical mass distributions, only the enclosed mass matters for calculating the force:

F = −(GM(r)/r²) r̂   (3.12)

where r̂ = r/|r| is a unit vector pointing along r. Using the above is much easier than solving Poisson's equation! It also defines the cumulative mass:

M(r) = 4π ∫₀ʳ ρ(r′) r′² dr′   (3.13)

which is just the total mass interior to r.
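Equations 3.12 and 3.13 are easy to use numerically. A minimal sketch that integrates the density of a Plummer sphere (an assumed example profile, not one from the text) to get M(r), and compares with its known analytic enclosed mass, all in units G = M = a = 1:

```python
import math

# Plummer sphere: rho(r) = 3M/(4 pi a^3) (1 + r^2/a^2)^(-5/2),
# with analytic enclosed mass M(r) = M r^3 / (r^2 + a^2)^(3/2).
def rho(r):
    return 3.0 / (4.0 * math.pi) * (1.0 + r * r) ** -2.5

def m_enc_numeric(r, n=20_000):
    # Midpoint rule for eq 3.13: M(r) = 4 pi Int rho(r') r'^2 dr'
    dr = r / n
    return sum(4.0 * math.pi * rho((i + 0.5) * dr) * ((i + 0.5) * dr) ** 2 * dr
               for i in range(n))

r = 2.0
m_num = m_enc_numeric(r)
m_exact = r**3 / (r * r + 1.0) ** 1.5
force = -m_num / r**2    # radial force from eq 3.12
print(m_num, m_exact, force)
```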
3.2.2 Oblate spheroidal systems
Not much in the Universe has oblate spheroidal symmetry. But a third theorem due to Newton gives us insight into how the potential changes as we move away from spherical symmetry. First, we should define some terminology:
Homoeoid: A homoeoid is a shell of uniform density bounded by two concentric, similar ellipsoids, each obeying x²/a² + y²/a² + z²/c² = m², with a > c by convention (a, c and m are constants). "Similar" means with the same a/c, but different m². It has the symmetry of an oblate spheroid.
Confocal ellipsoids: Confocal ellipsoids share the same two focal points. Each obeys x²/a² + y²/a² + z²/c² = m² with the same m², but a = Δ cosh u; c = Δ sinh u. The smallest ellipsoid is then a straight line between the two focal points (u → 0); the largest tends towards being a sphere (u → ∞) [see Figure 3.2].
Now we are ready to state Newton's third theorem, and a related theorem we will have to prove along the way, the Homoeoid theorem:
Newton's Third Theorem: A mass that is inside a homoeoid experiences no force from that homoeoid.
The Homoeoid theorem: The exterior isopotential surfaces of a homoeoidal shell of negligible thickness are spheroids confocal with the shell itself. Inside the shell the potential is constant.
Proof: This is a bit of work (we can no longer just use one diagram!), but it will be worth it for the physical intuition we will gain. Our strategy is to solve Poisson's equation for a thin homoeoidal shell. This will allow us to prove the Homoeoid theorem. Then, summing over many such infinitesimal shells, we will prove Newton III.
We start by solving Poisson's equation for a thin homoeoidal shell. When working in spherical symmetry, it often makes sense to use spherical coordinates. Here our symmetry is that of an oblate spheroid:

R²/a² + z²/c² = m²   (3.14)

where R² = x² + y². Thus, it makes sense to work in oblate spheroidal coordinates (u, φ, v):

R = Δ cosh u sin v ;   z = Δ sinh u cos v ;   φ = φ   (3.15)
Notice that u = const. recovers equation 3.14 for an oblate spheroid, while v = const. gives hyperbolae which lie perpendicular to these confocal ellipsoids of constant u (see Figure 3.2).
We have chosen a coordinate system of confocal ellipses. This anticipates the solution. Recall that the Homoeoid theorem states that the exterior isopotential surfaces of an infinitesimal homoeoid are confocal ellipsoids. This means that we can expect that, in these coordinates, Φ = Φ(u), and, indeed, we shall prove that this is the case.
We now introduce the key trick, which we will employ a lot, for solving Poisson's equation. Notice that, for an infinitesimally thin shell, there is only mass contained within the shell itself; outside of the shell the matter density is zero. Thus, we reduce the problem to solving Laplace's equation, ∇²Φ = 0, subject to suitable boundary conditions at the shell, the origin and infinity (here we will only require boundary conditions at the shell and infinity).
In general, Laplace's equation may be solved using the method of separation of variables. In case you have not seen this before, a full worked example in cylindrical coordinates is presented in Appendix E. However, life is simpler in this case, since we have that Φ = Φ(u) only (recall that we assume this in anticipation of the answer). From equations B.2 and B.9, and assuming Φ = Φ(u), ∇²Φ = 0 reduces to:

d/du [cosh u (dΦ/du)] = 0   (3.16)

Equation 3.16 may now be simply integrated to give two solutions:

Φ = const.   (3.17)

and:

Φ = A[arctan(sinh u) + Φ₀] = A[π/2 − arcsin(sech u) + Φ₀]   (3.18)
Figure 3.2: Oblate spheroidal coordinates (see equation 3.14). The black lines mark trajectories of constant u; the red of constant v. Note that this is a side view. The ellipsoids of constant u are confocal and range from a straight line between the two foci of the ellipse (u → 0) to a sphere (u → ∞).
Here the last equality in equation 3.18 follows from the right-angled triangle with sides 1 and sinh u and hypotenuse cosh u.
One of these two solutions must correspond to the potential inside the shell; the other to the potential outside. We can determine which solution is which from the boundary conditions at the shell, u = u₀, and at infinity:
Φ → 0 in the limit r → ∞.
Φ must be everywhere continuous (i.e. continuous at the shell boundary u = u₀).
The above means that inside the homoeoidal shell, the potential must be constant. (It cannot be constant outside the shell as it must tend to zero at infinity.) And noticing that for large u, sech u ≈ Δ/r → 0, we obtain the final solution for the shell:

Φ = −(GM/Δ) × { arcsin(sech u₀)   (u < u₀) ;   arcsin(sech u)   (u ≥ u₀) }   (3.19)

It is straightforward to show that this solution satisfies the above boundary conditions.
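The claimed properties of equation 3.19 can be verified numerically: the potential is constant inside the shell, continuous at u = u₀, and tends to the point mass value −GM/r far away. A minimal sketch with G = M = Δ = 1 and an assumed shell at u₀ = 1:

```python
import math

def phi_shell(u, u0=1.0):
    # eq 3.19 with G = M = Delta = 1: constant inside (u < u0),
    # -arcsin(sech u) outside (u >= u0); continuous at u = u0.
    return -math.asin(1.0 / math.cosh(max(u, u0)))

# Constant inside the shell and continuous at the shell boundary:
print(phi_shell(0.2), phi_shell(0.9), phi_shell(1.0))

# Far field: in the equatorial plane r = Delta cosh(u), and Phi -> -GM/r:
for u in (2.0, 4.0, 8.0):
    print(u, phi_shell(u), -1.0 / math.cosh(u))
```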
We have now proven the Homoeoid theorem: the potential inside a thin homoeoid is constant; the potential outside has isopotential surfaces which are ellipsoids confocal with the homoeoid itself. Phew! It is now also an easy matter to prove Newton III: a thick homoeoid is just the sum of many thin homoeoids, and the potential inside each is constant. Thus the potential inside any homoeoid must be a constant. Ta da!
So what do we learn from the above exercise? Apart from Newton III itself, we learn three important things. Firstly, we can reduce the problem of solving Poisson's equation to that of solving Laplace's equation by breaking the mass up into infinitesimally thin shells. Secondly, we may now understand qualitatively the potential of some general inhomogeneous body. We may think of the body as being a collection of infinitesimal (homoeoidal) shells. Each shell contributes a potential which is constant on confocal ellipsoids that become ever rounder as you move away from the shell. This gives us the intuition that any gravitating body of any shape will rapidly look spherical as you move away from it. You can see how rapid this is in Figure 3.2: notice how round the outer contours are. Again, this is really useful. If we are a long way from a gravitating body (like a disc galaxy, for example), we can reasonably assume that the potential is that of a point mass with the same total mass as the body. Close to the body,
however, its shape matters. We will explore this in more detail in the following sections. Finally, because of this rapid movement towards sphericity: density is always more flattened than potential.
3.3 The multipole expansion
We have seen, above, how far Newton got in calculating the gravitational field due to bodies of different shapes. We now approach the problem of calculating the gravitational potential for general mass distributions.
The details of this calculation are given in Binney and Tremaine 2008, however the principle is the same as above. We reduce the hard problem of solving Poisson's equation to the easier problem of solving Laplace's equation, ∇²Φ = 0, inside and outside of infinitesimal spherical shells, subject to suitable boundary conditions at the origin, infinity and on the surface of the shell. Each shell may then be summed over to give the total gravitational potential. If you have never seen a solution of Laplace's equation before, the above is worked through in cylindrical coordinates in Appendix E.
Working through the above, we derive the multipole expansion in spherical polar coordinates (r, θ, φ):

\[
\Phi = -4\pi G \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{Y_l^m(\theta,\varphi)}{2l+1}
\Bigg[\underbrace{\frac{1}{r^{l+1}}\int_0^r \rho_{lm}(a)\,a^{l+2}\,da}_{\text{external potential}}
+ \underbrace{r^l\int_r^{\infty}\frac{\rho_{lm}(a)}{a^{l-1}}\,da}_{\text{internal potential}}\Bigg]
\qquad (3.20)
\]
where \(Y_l^m(\theta,\varphi)\) are the spherical harmonics: an orthogonal set of basis functions which may be used to represent any function of (θ, φ). In this case, we use them to represent the density distribution, which is given by:

\[
\rho(a,\theta,\varphi) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\rho_{lm}(a)\,Y_l^m(\theta,\varphi)
\qquad (3.21)
\]
where the ρ_lm(a) are (complex!) expansion coefficients at each radius a. These may be obtained in a similar fashion to the coefficients of Fourier series:

\[
\rho_{lm}(a) = \int_0^{\pi}\sin\theta\,d\theta\int_0^{2\pi}d\varphi\;Y_l^{m*}(\theta,\varphi)\,\rho(a,\theta,\varphi)
\qquad (3.22)
\]

where the * denotes the complex conjugate (recall that this just means that imaginary i → −i). A graphical representation of the spherical harmonics is given in Figure 3.3; formulae are given in Appendix G. Notice that, just as for the homoeoidal shell, we obtain two solutions: one for the interior potential, and one for the exterior potential.
3.3.1 Understanding the multipole expansion
It is easy to mistake being able to derive the multipole expansion for actually understanding what it means. Here we present three different ways of understanding the multipole expansion. Each way gives us a little bit more insight.
3.3.1.1 An expansion in basis functions
Firstly, it is important to understand the multipole series as a sum over orthogonal basis functions in spherical coordinates. The real and imaginary components of the first few spherical harmonics are plotted in Figure 3.3. You will have come across these basis functions many times before, though you may not have realised it at the time (remember those chemistry classes?). Each value of l corresponds to a different multipole. The lowest, l = 0, corresponds to the monopole, which is just a spherical potential. The l = 1 term is called the dipole, and the l = 2 term is the quadrupole. A real mass distribution is then built up from the sum of each of these basis functions, weighted by different amounts: this is what the multipole expansion actually is. In different coordinate systems, we find different basis functions give solutions to Poisson's equation: Cartesian coordinates give Fourier series for the basis functions; cylindrical coordinates give Bessel functions (see Appendix E).
3.3.1.2 Moments of the density distribution
You may be wondering: what are monopoles, dipoles and quadrupoles, physically? Consider first the l = 0 term: the monopole. We have only \(Y_0^0 = \frac{1}{\sqrt{4\pi}}\) in the spherical harmonic sum⁵, and equation 3.20 becomes:

\[
\begin{aligned}
\Phi &= -4\pi G \frac{1}{\sqrt{4\pi}} \left[ \frac{1}{r}\int_0^r \rho_{00}(a)\,a^2\,da + \int_r^\infty \rho_{00}(a)\,a\,da \right] \\
&= -4\pi G \left[ \frac{1}{r}\int_0^r \rho(a)\,a^2\,da + \int_r^\infty \rho(a)\,a\,da \right]
\end{aligned}
\qquad (3.23)
\]

where the second line follows from equation 3.22.
The above demonstrates that the monopole is just the spherical component of the potential. We leave it as an exercise to derive Newton's first and second theorems from the above.
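Equation 3.23 is also easy to check numerically. Below is a minimal sketch (Python, standard library only; it uses the Plummer sphere of section 3.4.1 with assumed units G = M = b = 1) that evaluates the two monopole integrals by trapezoidal quadrature and compares against the known analytic Plummer potential Φ = −GM/√(r² + b²):

```python
import math

G, M, b = 1.0, 1.0, 1.0   # assumed units for this sketch

def rho(a):
    # Plummer density: rho(a) = (3M / 4 pi b^3) (1 + a^2/b^2)^(-5/2)
    return 3.0 * M / (4.0 * math.pi * b**3) * (1.0 + (a / b)**2)**-2.5

def trapz(f, lo, hi, n=4000):
    # simple composite trapezoidal quadrature
    h = (hi - lo) / n
    s = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        s += f(lo + i * h)
    return s * h

def phi_monopole(r, rmax=200.0):
    # equation 3.23: external + internal monopole integrals
    inner = trapz(lambda a: rho(a) * a**2, 0.0, r)
    outer = trapz(lambda a: rho(a) * a, r, rmax)
    return -4.0 * math.pi * G * (inner / r + outer)

r = 2.0
print(phi_monopole(r), -G * M / math.sqrt(r**2 + b**2))  # closely agree
```

Because the Plummer sphere is spherically symmetric, the monopole term is the whole potential here; for a flattened body the same integrals would give only the l = 0 piece of equation 3.20.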
The higher multipoles (dipole, quadrupole etc.) are higher order moments of the density distribution. Consider for a moment just the external field: the l = 1 term (dipole) falls off as 1/r², while its associated integral is a first moment⁶ of the density; and the l = 2 term (quadrupole) drops off as 1/r³, while its associated integral is a second moment of the density. This is why we call them multipole moments.
Notice something interesting about the dipole term: if we are sitting at the centre of mass then the dipole moment (for the external field) will exactly vanish, because \(Y_1^m\) is an antisymmetric function and, by definition, the \(\rho_{lm} a^3\) contribute equally to the sum. This actually points towards a very deep result which we discuss in the next section.
Now consider the internal field. If I am inside a mass distribution, then I can have a dipole field which cannot be trivially transformed away; you should be able to see this from the form of the internal potential in equation 3.20. It is only external to mass distributions that I can transform the dipole away. Of course, for many mass distributions, the internal field will be unimportant. Remember from Newton I and III that the interior potential from spherical or oblate confocal shell mass distributions is constant (and therefore exerts no force). Can you also see this from the functional form of the internal and external potentials in equation 3.20? (It is easiest to consider the internal and external parts together to see this.)
3.3.1.3 The physical meaning of dipoles and quadrupoles
Finally, let's look at a third way of understanding the dipole and quadrupole terms. This is probably the easiest way, and it will give you a nice physical picture of dipolar and quadrupolar fields. Consider two point masses, each a distance h from the origin, as shown in Figure 3.4. The force per unit mass at some point, P, a distance, r, from the origin is given by:

\[
\mathbf{F}_P = -G\left[\frac{m_1\left(\mathbf{r}-\mathbf{h}\right)}{|\mathbf{r}-\mathbf{h}|^3} + \frac{m_2\left(\mathbf{r}+\mathbf{h}\right)}{|\mathbf{r}+\mathbf{h}|^3}\right]
\qquad (3.24)
\]

If we assume that |h| ≪ |r|, we obtain:

\[
\mathbf{F}_P = -\frac{G}{|\mathbf{r}|^3}\left(m_1\left(\mathbf{r}-\mathbf{h}\right) + m_2\left(\mathbf{r}+\mathbf{h}\right)\right)
\qquad (3.25)
\]
Now we see something interesting. Notice that:

\[
\mathbf{F}_P = -\frac{2Gm_1}{|\mathbf{r}|^3}\,\mathbf{r} \qquad (m_1 = m_2)
\qquad (3.26)
\]

\[
\mathbf{F}_P = \frac{2Gm_1}{|\mathbf{r}|^3}\,\mathbf{h} \qquad (m_1 = -m_2)
\qquad (3.27)
\]
⁵ See Appendix G for formulae for the first few terms.
⁶ Recall that the moments of the density are given by (in integral form; and Cartesian coordinates): first moment \(I_j = \int \rho\,x_j\,d^3x\); second moment \(I_{jk} = \int \rho\,x_j x_k\,d^3x\); etc. The first moment of the density you will have seen before as the centre of mass of the system.
If m_1 = -m_2, we recover the force from a dipole! But if m_1 = m_2, the force is just monopolar (it falls off as 1/r²). We have shown that we can build a dipole while sitting at the centre of mass only if we allow negative charges. Since we cannot have negative charges in gravity (there is, as far as we know, no such thing as negative mass), we cannot have dipoles in gravity. Of course, remember that when we loosely say we cannot have dipoles in gravity, we really mean that we cannot have dipoles in the external field which are non-vanishing under a coordinate transformation. See if you can construct a similar argument for a quadrupole field. Can you have this in gravity if you are sitting at the centre of mass?
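The two scalings above are easy to verify numerically. The sketch below (Python; the field point P is placed perpendicular to the separation vector, where the truncated expansion of equation 3.25 applies directly) evaluates the exact two-particle force of equation 3.24 and multiplies by r² or r³:

```python
import math

G = 1.0

def force_at_P(m1, m2, r, h):
    # exact force per unit mass (equation 3.24) at P = (0, r),
    # with m1 at (h, 0) and m2 at (-h, 0); here |h| << |r|
    d = math.sqrt(h**2 + r**2)
    fx = -G * (m1 * (0.0 - h) + m2 * (0.0 + h)) / d**3
    fy = -G * (m1 + m2) * r / d**3
    return fx, fy

h = 0.01
for r in (10.0, 20.0, 40.0):
    _, fy = force_at_P(1.0, 1.0, r, h)    # m1 = m2: monopole
    fx, _ = force_at_P(1.0, -1.0, r, h)   # m1 = -m2: dipole
    print(fy * r**2, fx * r**3)  # -> -2 G m1 (eq 3.26) and +2 G m1 h (eq 3.27)
```

As r grows at fixed h, the monopole combination F r² and the dipole combination F r³ both settle to constants, exactly as equations 3.26 and 3.27 predict.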
3.3.2 Gravity versus electrostatics
The above results for dipoles actually express something very deep about the differences between the gravitational and electric forces. Recall from your electrodynamics that accelerating charges radiate photons. The dipole radiation dominates, with total power output:

\[
P \propto |\ddot{\mathbf{d}}|^2 = \left[\sum_k \frac{d}{dt}\left(q_k \dot{\mathbf{x}}_k\right)\right]^2
\qquad (3.28)
\]

where d is the dipole moment in Cartesian coordinates and q_k is the charge. But consider the similar expression for the gravitational force. Now the charge, q_k, is the same thing as the inertial mass: q_k = m_k. This is a fundamental difference between gravity and the electric force. For gravity, the term in brackets above now becomes the momentum, and it must be zero from momentum conservation.
We have just proved that in gravity, accelerating charges do not radiate. This is a direct result of the fact that inertial mass is charge for the gravitational force. But this has very wide implications. It means that there is no way that I can tell if I am freely falling towards an object in a gravitational field, or if it is freely falling towards me! This leads us towards Einstein's stroke of genius: freely falling objects in gravitational fields feel no force; they are effectively inertial frames. This is the equivalence principle from which all of general relativity is derived.
We can see a similarly interesting result for the electric force too. For the electric force, the fact that accelerating charges radiate tells us that electrons cannot orbit around protons⁷. If they did, they would constantly radiate and lose energy until they fell into the proton: the system would not be stable! The solution to this conundrum is, of course, quantum mechanics.
There are really only two differences, classically, between the gravitational and electric forces. Both obey Poisson's equation and both are inverse square laws. However, in gravity there are only positive charges, and charge is the same thing as inertial mass. It is these, seemingly small, differences which drive gravity towards the modern theory of general relativity, and drive the electric force towards the modern quantum field theories. We can understand, just from the multipole expansion, why the two theories must be so different.
3.4 Analytic potential density pairs
As we have seen above, for all but the simplest mass distributions, the potential can get quite messy quite quickly. For the seemingly-simple exponential disc, we already have a potential in the plane of the disc which involves modified Bessel functions (see Appendix E). We need a computer to calculate these Bessel functions, so already we are not really in the analytic regime anymore!
As a result, it is really useful in day-to-day astrophysics to have a tool-kit of density distributions which approximate real galaxies and star clusters, but for which the potential is a known, simple, analytic function. It is not necessary to remember all of them, but you should be aware that such potential density pairs exist. Actually, there are not that many of them! A useful selection is detailed for completeness in Appendix F. We present just a couple of useful examples here.
3.4.1 The Miyamoto-Nagai disc
The Miyamoto-Nagai disc potential is given by (Nagai and Miyamoto 1976):
⁷ Strictly speaking, above we have only proved that accelerating charges can radiate, not that they do. But they do (we assume the reader has encountered an electrodynamics course).
Figure 3.3: The real and imaginary parts of the first few spherical harmonic basis functions. They may look familiar to you from chemistry class. The monopole term is what physical chemists would call an s orbital; the dipole term is what is referred to as a p orbital. The sum over many terms in the spherical harmonic series, each with different weight, can reproduce any smooth function of (θ, φ). Hence they are referred to as orthogonal basis functions. Fourier series are another example of a set of basis functions you may have come across before.
\[
\Phi_M(R,z) = -\frac{GM}{\sqrt{R^2 + \left(a + \sqrt{b^2 + z^2}\right)^2}}
\qquad (3.29)
\]

\[
\rho_M(R,z) = \left(\frac{b^2 M}{4\pi}\right)
\frac{aR^2 + \left(a + 3\sqrt{z^2 + b^2}\right)\left(a + \sqrt{z^2 + b^2}\right)^2}
{\left[R^2 + \left(a + \sqrt{z^2 + b^2}\right)^2\right]^{5/2}\left(z^2 + b^2\right)^{3/2}}
\qquad (3.30)
\]

where the above density profile is an analytic solution of Poisson's equation in cylindrical coordinates: ∇²Φ(R, z) = 4πGρ(R, z).
Real galaxies look more like exponential discs than equation 3.30. However, the above is very useful because it is fully analytic. Notice that for a → 0, we recover a spherical density distribution. This is known as a Plummer sphere. It has a scale length, b. For b → 0; a = 0, we shrink the Plummer sphere to zero size and recover the potential due to a point mass (a Kepler potential). The Plummer sphere provides a good fit to globular cluster and dwarf spheroidal galaxy light distributions. Finally, for b → 0; a ≠ 0 we form an infinitesimally thin disc. This is known as a Kuzmin disc. For this reason, the Miyamoto-Nagai disc is also often referred to as a Plummer-Kuzmin model. Notice that a sets a scale length (in R) and b a scale height (in z) for the disc.
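Equation 3.29 is a one-liner to code up, and the a → 0 Plummer limit can be checked directly (a sketch in Python, with assumed units G = M = 1):

```python
import math

G = 1.0

def phi_mn(R, z, M, a, b):
    # Miyamoto-Nagai potential, equation 3.29
    return -G * M / math.sqrt(R**2 + (a + math.sqrt(b**2 + z**2))**2)

def phi_plummer(r, M, b):
    # Plummer sphere: the a -> 0 limit of equation 3.29
    return -G * M / math.sqrt(r**2 + b**2)

R, z = 3.0, 1.5
r = math.sqrt(R**2 + z**2)
print(phi_mn(R, z, 1.0, 0.0, 0.2), phi_plummer(r, 1.0, 0.2))  # identical
```

Setting a = 0 makes the potential depend only on R² + z² = r², i.e. it becomes exactly spherical, as claimed above.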
3.4.2 The Logarithmic potential
The flatness of disk galaxy rotation curves suggests that the potential should be logarithmic, since v_c = const. implies that dΦ/dR ∝ R⁻¹. As a result, a very useful toy potential is the Logarithmic potential given by:

\[
\Phi_L(R,z) = \frac{v_0^2}{2}\ln\left(R_c^2 + R^2 + \frac{z^2}{q_\Phi^2}\right) + \mathrm{const}, \qquad 0.7 \lesssim q_\Phi \leq 1
\qquad (3.31)
\]
\[
\rho_L(R,z) = \left(\frac{v_0^2}{4\pi G q_\Phi^2}\right)
\frac{\left(2q_\Phi^2 + 1\right)R_c^2 + R^2 + 2\left(1 - \tfrac{1}{2}q_\Phi^{-2}\right)z^2}
{\left(R_c^2 + R^2 + z^2/q_\Phi^2\right)^2}
\qquad (3.32)
\]
In the equatorial plane, the circular speed goes as

\[
v_c = \frac{v_0 R}{\sqrt{R_c^2 + R^2}}
\qquad (3.33)
\]
Figure 3.4: A dipole potential may be generated from two oppositely charged point masses, m_1 and m_2, each a distance h from the origin O. In this diagram, |h| ≪ |r|, where r is the vector from O to the point P. In the case m_1 = -m_2, it is straightforward to show (see text) that the force at P, F_P ∝ 1/r³. It is therefore a dipole force (that due to a dipole potential). If instead m_1 = m_2, i.e. they are of like charge (as in gravity they must be!), then we find F_P ∝ 1/r², and we no longer have a dipole. It is not possible to achieve a dipole force from like charges. This highlights a fundamental difference between gravity and the electric force.
and we recover the flat rotation curve which makes this a useful model for real galaxies.
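Equation 3.33 follows from v_c² = R ∂Φ_L/∂R evaluated at z = 0; a quick finite-difference check makes this concrete (a sketch in Python, with illustrative parameter values v₀ = 220 km/s, R_c = 2 kpc, q_Φ = 0.9):

```python
import math

v0, Rc, qphi = 220.0, 2.0, 0.9   # assumed values: km/s, kpc

def phi_log(R, z):
    # Logarithmic potential, equation 3.31 (constant dropped)
    return 0.5 * v0**2 * math.log(Rc**2 + R**2 + z**2 / qphi**2)

def vc_numeric(R, dR=1e-5):
    # v_c^2 = R dPhi/dR in the plane z = 0 (central difference)
    dphi = (phi_log(R + dR, 0.0) - phi_log(R - dR, 0.0)) / (2 * dR)
    return math.sqrt(R * dphi)

def vc_analytic(R):
    # equation 3.33
    return v0 * R / math.sqrt(Rc**2 + R**2)

for R in (1.0, 5.0, 20.0):
    print(R, vc_numeric(R), vc_analytic(R))  # agree; flat at large R
```

Note how the curve rises roughly linearly for R ≪ R_c (solid-body-like) and flattens to v_c ≈ v₀ for R ≫ R_c, which is the behaviour observed in real disc galaxies.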
Another nice feature of the Logarithmic potential is that it is an axisymmetric potential density pair (like the Miyamoto-Nagai disc). The potential may be flattened using the parameter q_Φ. You can see from the density equation why q_Φ cannot take any value we like: for q_Φ < 0.7, the density is negative for some values of z. Note also that this is flattening in potential, not in density, which is different. The axial ratio q_ρ of the isodensity contours is related to q_Φ by
\[
q_\rho^2 = \frac{1 + 4q_\Phi^2}{2 + 3/q_\Phi^2} \qquad (r \ll R_c)
\qquad (3.34)
\]

\[
q_\rho^2 = q_\Phi^4\left(2 - \frac{1}{q_\Phi^2}\right) \qquad (r \gg R_c)
\qquad (3.35)
\]
A couple of other things are worth noting. Firstly, the potential is less flattened than the density; this is a generic feature of flattened potentials. Remember how rapidly those confocal ellipsoids become round? (see Figure 3.2). Secondly, notice that the potential is flattened for all z! How is this possible, given that we proved that all potentials should start to look spherical as we move infinitely far away?
3.5 Numerical Techniques
In this section, we briefly touch upon numerical techniques for determining the gravitational potential and forces; we will cover this in much more detail later on in Lecture ??.
3.5.1 Direct summation; collisional simulations
The simplest kind of model discretises the matter distribution into a collection of point masses, called particles; these may be stars within a star cluster, dark matter particles etc. We may, then, naively sum over all of the forces between the particles. Such discrete methods are in general called N-body models, and summing over all particles is called direct summation. Such an approach runs into two main problems:
1. the force between two particles goes as 1/r² and so formally diverges as the particles get arbitrarily close together; and
2. for each particle we have to make N force calculations. Calculating the potential of the whole system then takes N² calculations.
The first problem can be avoided by using a softened force. A typical example is to replace the ensemble of point masses by an ensemble of Plummer spheres⁸; this gives, for the force exerted by the j-th particle on the i-th particle:

\[
\mathbf{F}_{ij} = \frac{Gm^2\left(\mathbf{x}_j - \mathbf{x}_i\right)}{\left(\epsilon^2 + |\mathbf{x}_i - \mathbf{x}_j|^2\right)^{3/2}}
\qquad (3.36)
\]

where ε is called the force softening. Notice that the above force tends to a point mass force for |x_i − x_j|² ≫ ε², but no longer diverges for |x_i − x_j| → 0. For real stars, one can think of ε as being the typical radius of a star (i.e. very small!); such simulations are collisional simulations.
The second problem is one of speed. There are O(10¹⁰) stars in our Galaxy, and with N² force calculations for each simulation timestep, it remains too challenging even for the latest computers. For this reason, direct summation is usually used only with special hardware (either Grape boards or GPUs) for star clusters of ~10³ to 10⁶ stars. Even achieving 10⁶ stars requires running many Grape boards in parallel (see e.g. Harfst et al. 2007a). A detailed account of direct summation techniques is given in Aarseth 2003a.
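The softened direct sum of equation 3.36 takes only a few lines (a sketch in Python, for equal-mass particles; note the O(N²) double loop). Because the softened force still obeys Newton's third law, the total force over all particles should vanish, which makes a handy test:

```python
def direct_sum_forces(pos, m, G=1.0, eps=0.01):
    # Softened direct summation, equation 3.36.
    # pos: list of (x, y, z) tuples; m: particle mass (equal masses).
    n = len(pos)
    F = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx)
            fac = G * m * m / (eps**2 + r2)**1.5   # softened 1/r^3 kernel
            for k in range(3):
                F[i][k] += fac * dx[k]
    return F

pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
F = direct_sum_forces(pos, 1.0)
# total force vanishes by Newton III (momentum conservation):
print([sum(F[i][k] for i in range(len(pos))) for k in range(3)])
```

A production code would of course vectorise or parallelise this loop (and exploit the i-j symmetry to halve the work); the point here is only the structure of the N² sum.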
⁸ See 3.4.1 for a definition of the Plummer sphere.
Lecture 4
From one body to N-bodies
In this Lecture, we study how massless tracer particles move in a static gravitational potential. This is the one-body problem that can describe the motion of the Earth around the Sun, or of our Solar system about the centre of our Galaxy. Having gained some intuition about such orbital motions in complex gravitational potentials, we will then gradually progress towards the full N-body problem that requires computational astrophysics to solve. We will use a number of results from Lagrangian & Hamiltonian mechanics; see appendix H.1 if you need a refresher.
4.1 The one-body problem
The simplest orbit problem we can consider is that of a static gravitational potential in which massless, tracer, particles orbit. I will call these particles massless tracer particles throughout this Lecture, but really I mean a particle that has no gravitational charge. It does still have inertial mass, however, and this will be denoted by a little m.
Such an approximation is a good one for static (i.e. non-time varying) collisionless systems. In this case, we may, quite accurately, pretend that each tracer (that can be a star/dark matter particle/asteroid etc.) is moving in the static gravitational potential of all of the other stars/dark matter particles etc. This is what I refer to as the one body problem. It can be used to describe the motion of planets around the Sun (ignoring the effect of the other planets), of stars within star clusters (again ignoring the gravitational collision terms), and of stars within a galaxy (the best approximation of all, since galaxies are a near-perfect collisionless system; c.f. Lecture 1).
4.1.1 Spherical symmetry
The simplest case of the one-body problem is a massless particle orbiting in a spherically symmetric potential, Φ(r), so we start with that. Working in spherical symmetry suggests that we use spherical polar coordinates: (r, θ, φ). However, we can be a bit smarter still and work in generalised coordinates, i.e. picking coordinates which describe the space in which the particle is constrained to move. Since in spherically symmetric systems all of the force points towards the centre of the potential, we know by symmetry that all orbits must be planar. Thus, without loss of generality, we can take θ = const. = π/2. The remaining generalised coordinates are then the plane polars: (r, φ). Thus, the Lagrangian for the problem is given by (see Appendix H.1):

\[
\mathcal{L} = \frac{1}{2}m\left[\dot{r}^2 + \left(r\dot{\varphi}\right)^2\right] - m\Phi(r)
\qquad (4.1)
\]
The Euler-Lagrange equations then give us two equations, one for φ, and one for r. The φ equation:

\[
\frac{d}{dt}\left(mr^2\dot{\varphi}\right) = 0
\qquad (4.2)
\]

tells us that the angular momentum, \(p_\varphi = mr^2\dot{\varphi}\), must be conserved. This is actually a special case of Noether's theorem (see Appendix H.1). The r equation is then:

\[
\ddot{r} - \frac{J^2}{r^3} + \frac{\partial\Phi}{\partial r} = 0
\qquad (4.3)
\]

where we have used equation 4.2 to substitute for the specific angular momentum J = J_φ = p_φ/m.
Before we continue further, it is worth briefly defining some notation (see also appendix H.2.3):

- Integral: A quantity which is conserved along the path of the particle: I(x(t_1), v(t_1)) = I(x(t_2), v(t_2)).
- Isolating integral: An integral which reduces the degrees of freedom available to the particle. This is much more useful than an integral, because it tells us something about the particle trajectory.

In general, there can be anything from zero to five isolating integrals; these may or may not be analytic (more than five integrals would reduce the phase space available to a particle to a point). For a static gravitational potential, the specific energy E is always an isolating integral.
How many isolating integrals of the motion are there for a spherical potential? Before we even wrote down the Lagrangian, we reduced the problem from three spatial dimensions to two. This reduced phase space from 6D to 4D. Thus we implicitly found two isolating integrals in the first paragraph (these are the r and θ components of the specific angular momentum vector, J). Then we found another: the φ component of the specific angular momentum vector, J_φ = p_φ/m. Finally, an isolating integral that is always present for static gravitational fields is the specific energy, E. Thus, we reduce phase space to just two dimensions. For a general spherical potential, there are not any more isolating integrals than the four we have just found: (E, J), except in a couple of very special cases.
4.1.1.1 Orbits in spherical potentials
We start with the Hamiltonian for the problem, which is given by:

\[
\mathcal{H} = \frac{1}{2m}\left(p_r^2 + \frac{p_\varphi^2}{r^2}\right) + m\Phi(r) = \mathcal{E}
\qquad (4.4)
\]

where \(\mathcal{E}\) is the energy of the particle and p_r and p_φ are the canonical momenta.
Looking at equation 4.4, we see immediately the advantage of working in spherical symmetry. We know that p_φ is an integral of the motion, as is \(\mathcal{E}\). Thus the Hamiltonian is a function only of r and \(\dot{r}\). It may be solved!
The Hamiltonian may be simply rearranged to give:

\[
\dot{r}^2 = 2\left(E - \Phi(r)\right) - \frac{J^2}{r^2}
\qquad (4.5)
\]

where E = \(\mathcal{E}\)/m and J = J_φ = p_φ/m are the specific energy and angular momentum, respectively.
The roots of the above equation, \(\dot{r} = 0\), must be where the coordinate r is stationary, i.e. where it takes its minimum and maximum values. These are called the pericentre (r_min = r_p) and the apocentre (r_max = r_a). To find them, we must solve for the roots (i.e. \(\dot{r} = 0\)) of equation 4.5:

\[
\frac{r^2\,\Phi(r)}{E} - r^2 + \frac{J^2}{2E} = 0
\qquad (4.6)
\]

Assuming that Φ ∝ 1/r, the above equation is quadratic and hence in general we obtain two roots.
The above shows that general orbits in spherical potentials are planar, conserve angular momentum, and oscillate back and forth between pericentre and apocentre. Such an orbit is given in Figure 4.1; it is called a rosette orbit, after its shape. Given enough dynamical times (assuming the orbit is not resonant) the particle will completely fill the annulus r_p < r < r_a. We can think of this orbit as a precessing ellipse.
Figure 4.2 plots equation 4.6 assuming a Kepler potential (Φ = −GM/r) for changing E at fixed J² (left plot) and changing J² at fixed E (right plot). Notice that for a circular orbit, there is only one point where \(\dot{r} = 0\) (in other words, r_p = r_a). Such an orbit is the case of minimum energy and
Figure 4.1: A general particle orbit in a static spherical potential. The left plot shows the particle trajectory after just 5 dynamical times; the right plot after 50. The orbit is confined to a plane and sweeps out a rosette pattern within this plane. You can see the maximum and minimum radii swept out by the particle clearly in the right plot, and its precession within the orbit plane. Given enough dynamical times (and assuming the orbit is not resonant) the particle will completely fill the annulus r_p < r < r_a.
is shown by the red line in the left plot. The black line shows an orbit which is not energetic enough to exist at all in this potential. The case of maximal energy, by contrast, has only a pericentre: the apocentre tends towards infinity and the orbit becomes unbound. Notice that the energy basically determines how far out the particle orbits, while the angular momentum sets a centrifugal barrier which prevents the particle from reaching the centre.
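Orbits like the rosette of Figure 4.1 are easy to generate numerically. The sketch below (Python; a kick-drift-kick leapfrog integrator, with the Plummer potential of section 3.4.1 and assumed units GM = 1, b = 0.5 purely for illustration) integrates a 2D orbit and checks that the isolating integrals E and J are conserved:

```python
import math

GM, b = 1.0, 0.5   # assumed units for this sketch

def acc(x, y):
    # acceleration for a Plummer potential (section 3.4.1)
    r2 = x * x + y * y
    fac = -GM / (r2 + b * b)**1.5
    return fac * x, fac * y

def leapfrog(x, y, vx, vy, dt, nsteps):
    # kick-drift-kick leapfrog
    ax, ay = acc(x, y)
    for _ in range(nsteps):
        vx += 0.5 * dt * ax; vy += 0.5 * dt * ay   # half kick
        x += dt * vx; y += dt * vy                 # drift
        ax, ay = acc(x, y)
        vx += 0.5 * dt * ax; vy += 0.5 * dt * ay   # half kick
    return x, y, vx, vy

def energy(x, y, vx, vy):
    r = math.hypot(x, y)
    return 0.5 * (vx**2 + vy**2) - GM / math.sqrt(r**2 + b**2)

x0, y0, vx0, vy0 = 1.0, 0.0, 0.0, 0.6   # a mildly eccentric bound orbit
E0 = energy(x0, y0, vx0, vy0)
J0 = x0 * vy0 - y0 * vx0
x, y, vx, vy = leapfrog(x0, y0, vx0, vy0, 1e-3, 50000)
print(energy(x, y, vx, vy) - E0, (x * vy - y * vx) - J0)  # both ~ 0
```

The leapfrog scheme is symplectic, so the energy error stays bounded rather than drifting, and because the kicks are purely radial for a central force, J is conserved to round-off; we will rely on this class of integrator again when we discuss time integration for N-body simulations.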
4.1.1.2 Frequencies and resonance
There are two important frequencies for orbits in a spherical potential: the rate at which particles oscillate between peri- and apocentre (the radial frequency); and the rate at which particles move through φ = 2π (the azimuthal frequency). An orbit is resonant when the frequency of oscillation in the radial coordinate is some rational number times the azimuthal frequency. Such orbits are closed, but not all closed orbits are resonant. Circular orbits, for example, are closed but not resonant since they have no radial frequency.
We now derive the condition for resonance. The time taken for the particle to move from pericentre to apocentre and back again, T_r, is given by:

\[
T_r = 2\int_{r_p}^{r_a}\frac{dt}{dr}\,dr = 2\int_{r_p}^{r_a}\frac{dr}{\sqrt{2\left(E-\Phi(r)\right) - J^2/r^2}}
\qquad (4.7)
\]
If Δφ is the azimuthal angle traversed in one radial oscillation, then:

\[
\Delta\varphi = 2\int_{r_p}^{r_a}\frac{d\varphi}{dt}\frac{dt}{dr}\,dr
= 2J\int_{r_p}^{r_a}\frac{dr}{r^2\sqrt{2\left(E-\Phi(r)\right) - J^2/r^2}}
\qquad (4.8)
\]
Figure 4.2: Gaining a feel for orbits in spherical potentials. How a general spherical orbit changes in phase space, (r, \(\dot{r}\)), when E is increased at fixed J² (left plot) and when J² is increased at fixed E (right plot). Notice that the energy basically determines how far out the particle orbits, while the angular momentum sets a centrifugal barrier which prevents the particle from reaching the centre.
and we obtain the azimuthal period:

\[
T_\varphi = \frac{2\pi}{\Delta\varphi}\,T_r
\qquad (4.9)
\]

The resonance condition is then:

\[
T_\varphi = \frac{n}{m}\,T_r
\qquad (4.10)
\]

where n/m is any rational number.
In general, there is no reason why 2π/Δφ should be a rational number; indeed, there are many more irrational than rational numbers. This is why we expect to see a rosette orbit; only a few orbits will show such a resonance and be closed. However, there are two very special potentials for which all orbits are resonant: the Kepler potential, and the harmonic potential. Potentials in which resonant orbits constrain the particles to move in an even lower dimensional space must permit an extra isolating integral, in this case a fifth integral. This isolating property connects these frequencies to the action-angle variables (see appendix H and Binney and Tremaine 2008 for more details).
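The integrals in equations 4.7 and 4.8 have integrable square-root singularities at the turning points; the substitution r = ½(r_a + r_p) − ½(r_a − r_p) cos θ removes them. The sketch below (Python, Kepler potential with assumed GM = 1) recovers the special Kepler result that Δφ = 2π for every orbit (all orbits closed), together with T_r = 2πa^{3/2}/√(GM):

```python
import math

GM = 1.0

def vr2(r, E, J):
    # radial velocity squared, equation 4.5, Kepler potential
    return 2.0 * (E + GM / r) - J**2 / r**2

def orbit_integrals(rp, ra, E, J, n=20000):
    # equations 4.7 and 4.8 via the midpoint rule in theta,
    # where r = rmid - dr*cos(theta) maps [0, pi] onto [rp, ra]
    # and regularises the turning-point singularities
    rmid, dr = 0.5 * (ra + rp), 0.5 * (ra - rp)
    Tr, dphi = 0.0, 0.0
    h = math.pi / n
    for i in range(n):
        th = (i + 0.5) * h
        r = rmid - dr * math.cos(th)
        w = dr * math.sin(th) / math.sqrt(vr2(r, E, J))
        Tr += w
        dphi += J * w / r**2
    return 2.0 * Tr * h, 2.0 * dphi * h

rp, ra = 0.5, 1.5                       # => semi-major axis a = 1
E = -GM / (rp + ra)                     # E = -GM/2a for Kepler
J = math.sqrt(2.0 * abs(E) * rp * ra)   # from the roots of equation 4.6
Tr, dphi = orbit_integrals(rp, ra, E, J)
print(Tr, 2 * math.pi)     # radial period = 2 pi a^(3/2) / sqrt(GM)
print(dphi, 2 * math.pi)   # Kepler: every orbit is closed
```

For a non-Kepler potential (replace `vr2` and solve for the turning points numerically), Δφ generically comes out as an irrational multiple of π, which is exactly why the rosette of Figure 4.1 fails to close.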
4.1.1.3 The special case of Kepler
For a Kepler potential due to a point mass, Φ = −GM/r and the orbit is fully analytic. Substituting this potential into equation 4.3, we obtain:

\[
\ddot{r} - \frac{J^2}{r^3} + \frac{GM}{r^2} = 0
\qquad (4.11)
\]

The usual trick is to then switch from time to angle variables, \(\frac{d}{dt} = \frac{J}{r^2}\frac{d}{d\varphi}\), and substitute r = 1/u. This gives:

\[
\frac{d^2u}{d\varphi^2} + u = \frac{GM}{J^2}
\qquad (4.12)
\]
which may be solved to give:
Figure 4.3: The familiar Keplerian elliptical orbit. Marked are the semi-major axis, a; the radius, r; the phase angle, φ₀; the apocentre, r_a = a(1 + e); and the pericentre, r_p = a(1 − e).
\[
r(\varphi) = \frac{a\left(1 - e^2\right)}{1 + e\cos\left(\varphi - \varphi_0\right)}
\qquad (4.13)
\]

where e is the eccentricity and a the semi-major axis.
The above orbit is the familiar closed ellipse that gives a good approximation to the motions of the planets of the Solar System and of stars around the black hole at the centre of our galaxy (see Figure 4.3). We can connect the semi-major axis a and eccentricity e to the isolating integrals E and J using equation 4.6:
\[
2\left(E + GM/r\right) - J^2/r^2 = 0
\qquad (4.14)
\]

\[
\Rightarrow\quad r_{a,p} = -\frac{GM}{2E}\left[1 \pm \left(1 + \frac{2J^2E}{G^2M^2}\right)^{1/2}\right] = a(1 \pm e)
\qquad (4.15)
\]
The above shows us again that the specific energy controls the maximum radius that the particle orbits: a = −GM/2E, while the specific angular momentum sets a centrifugal barrier that prevents the particle from penetrating the centre of the potential: \(e = \left(1 + \frac{2J^2E}{G^2M^2}\right)^{1/2}\) (see also Figure 4.2).
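Equation 4.15 can be checked directly (a sketch in Python with GM = 1): pick a bound (E, J), solve the quadratic Er² + GMr − J²/2 = 0 (equation 4.14 multiplied by r²/2) for the turning points, and compare with a(1 ± e):

```python
import math

GM = 1.0

def turning_points(E, J):
    # roots of E r^2 + GM r - J^2/2 = 0  (equation 4.14 with rdot = 0)
    disc = math.sqrt(GM**2 + 2.0 * E * J**2)
    rp = (-GM + disc) / (2.0 * E)   # pericentre (E < 0 flips the signs)
    ra = (-GM - disc) / (2.0 * E)   # apocentre
    return rp, ra

E, J = -0.5, 0.8                       # a bound orbit (E < 0)
rp, ra = turning_points(E, J)
a = -GM / (2.0 * E)                    # semi-major axis, equation 4.15
e = math.sqrt(1.0 + 2.0 * J**2 * E / GM**2)
print(rp, a * (1 - e))                 # pericentre = a(1 - e)
print(ra, a * (1 + e))                 # apocentre  = a(1 + e)
```

Notice that a depends on E alone, while e depends on both E and J: at fixed energy, raising J shrinks the eccentricity until e = 0 and the orbit is circular, which is the centrifugal barrier of Figure 4.2 in action.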
4.1.2 Axisymmetry
So far, we've looked at the problem of tracer particle orbits in spherical potentials. Now let's start to look at more realistic potentials. Axisymmetric potentials are a natural case to consider, since many galaxies appear disc-like (see Lecture 1). Symmetry suggests that we work in cylindrical coordinates: (R, φ, z), and the Lagrangian is, in general, then given by:

\[
\mathcal{L} = \frac{1}{2}m\left[\dot{R}^2 + \left(R\dot{\varphi}\right)^2 + \dot{z}^2\right] - m\Phi(R, z)
\qquad (4.16)
\]

As previously, notice that \(\mathcal{L}\) has no φ dependence. Thus, in axisymmetric potentials, we must have at least two integrals: E and J_φ, and we know that these integrals are isolating. Also, notice that for orbits in the symmetry plane, z = 0; \(\dot{z}\) = 0, and the Lagrangian is indistinguishable from the one for the spherical case (equation 4.1). This means that we cannot use rotation curves alone to determine the shape of the mass distribution¹.
¹ As a little aside on rotation curves, before we leave the z = 0 plane behind, consider the rosette orbit in Figure 4.1. This is the general orbit for spherical potentials and so must be valid also in the plane of an axisymmetric galaxy. It doesn't look good for the assumption of circular orbits we used when analysing rotation curves in Lecture 1, does it? However, we are saved by one important fact. Remember that circular orbits are a minimum of the energy. If rotation curves are measured using the motion of gas particles rather than star particles, then we might genuinely expect the gas to be in the minimum energy state. If it were not, the gas particle orbits would precess and cross in real space (not in 6N dimensional phase space, of course). This causes the gas (unlike stars) to shock, because it is collisional. It will continue to lose energy until, eventually, it settles into the lowest energy state possible: a circular orbit. This justifies our previous assumption. But, things can get complicated again when we consider triaxial potentials in section 4.1.4, which might not permit circular orbits at all!
[Figure 4.4 annotations: the orbit phase is ψ = Ωt, so that \(v_y = \frac{d}{dt}\left(r\sin\Omega t\right) = r\Omega\cos\Omega t\); the surface of section (S.O.S.) panel has axes (z, v_z).]
Figure 4.4: An example of a surface of section for a simple circular orbit in the x − z plane. The surface of section is shown on the right (S.O.S. means, hopefully, surface of section and not save our souls...). The orbit punctures the z axis at x = 0 when the z velocity, v_z = 0. Only orbits coming through the back of the plane are plotted. This differentiates between clockwise and counter-clockwise orbits. Two orbits are plotted in the figure: the clockwise orbit (right square) and the counter-clockwise orbit (left square).
Notice from the Lagrangian (equation 4.16) that we now have a problem. In the spherical case, we had at least four isolating integrals and only three spatial dimensions: we could completely solve the general case. Now we can only definitely say that we have two isolating integrals. But we have three spatial dimensions! Let's define some nomenclature to express this frustrating situation:

- Integrable Hamiltonian: One which has at least as many isolating integrals as spatial dimensions.
- Non-integrable Hamiltonian: One which has fewer isolating integrals than spatial dimensions.

The former is soluble; the latter isn't (analytically). But, we still gain by using our Lagrangian/Hamiltonian approach. We know that, compared to the spherical case, there is at least one more degree of freedom for particle orbits. This means that, without doing any further work, we know that a general orbit in an axisymmetric potential will have an orbit plane which will precess. We know one other important thing too: general orbits have conserved p_φ. This means that orbits will have a centrifugal barrier just like in the spherical case (see Figure 4.2). They will not be able to pass arbitrarily close to the centre of the galaxy. We will see later that, once there is no symmetry left in the problem (and we have a triaxial galaxy), no component of the angular momentum is conserved and this restriction goes away. In practice, many axisymmetric systems do have a third isolating integral. However, it is referred to as non-classical, meaning that it is in general non-analytic.
4.1.3 2D potentials
Before we get to potentials with no symmetry (the triaxial potentials in section 4.1.4) it is instructive
to consider first the similar problem in two dimensions. We can make progress here analytically, and
the problem is also much easier to visualise using a tool known as the surface of section. You can
think of the 2D potential as a slice through the x-z plane of an axisymmetric potential. In this case
y = 0, while the potential is flattened along z, leaving no symmetry axis. This is why these 2D
potentials provide a good starting point for understanding the more general triaxial potentials.
4.1.3.1 Surfaces of section (Poincaré sections)
In this section, we develop a qualitative understanding of orbits in 2D potentials using numerical orbit
integrators². We will introduce a convenient graphical way to represent these results called a surface
of section. Working in 2D rather than 3D makes things much easier to visualise, but still provides
intuition for the more complex 3D situation which we will come to in section 4.1.4.
The surface of section is a useful way to visualise 6D phase space on a 2D sheet of paper. We
reduce the dimensionality of the space in the following way:
Restrict ourselves to the x-z plane (i.e. use a 2D potential) → 4D.
Use the fact that E is always an isolating integral (use E to eliminate ẋ) → 3D.
Fix x = 0 → 2D.
For example, consider a circular orbit in the x-z plane, as shown in Figure 4.4. The surface of
section is shown on the right (S.O.S. means, hopefully, surface of section and not save our souls...).
The orbit punctures the z axis at x = 0 when the z velocity, v_z = 0. This leads to just two points
on the surface of section, marked by the squares. But the same two points would exist whether our
orbit was going clockwise or counter-clockwise! To avoid this ambiguity, the convention is to plot only
orbits coming through the back of the plane. Thus, in Figure 4.4, we actually see two circular orbits:
the clockwise orbit (right square) and the counter-clockwise orbit (left square).
A circular orbit is straightforward. But what more complicated things are we permitted in a
general potential? A good way to see this is by studying orbits in a useful toy potential: the 2D
Logarithmic potential:

Φ = (v_0²/2) ln( r_c² + x² + z²/q_Φ² )   (4.17)
Recall that we encountered the 3D version of this in Lecture 3 (see equation 3.31). However, there
is an important difference. In 3D, such a potential is axisymmetric and conserves p_φ. In 2D, the
Lagrangian becomes:
Lagrangian becomes:
L =
1
2
m
_
x
2
+ z
2
_
m
v
2
0
2
ln
_
r
2
c
+x
2
+
z
2
q
2

_
(4.18)
Notice that there are no longer any symmetries in the Lagrangian and, in general, only energy is an
isolating integral. This is important: the 2D Logarithmic potential corresponds to the x-z plane
of the 3D Logarithmic potential, with y = 0. But studying these 2D potentials is more like studying
triaxial systems than studying axisymmetric systems. Indeed, we will see the strong connection to
triaxial systems shortly.
You can see real orbits in real time in this potential by firing up Chris Mihos' surface of section
Java applet. To fire up the applet, you need to be online. Then you can click here; instructions for
how to use the applet are given on his website. Why are all of the orbits in this potential bound, even
when the orbit-energy is positive? You will also produce similar orbits using your own orbit integrator
code during the problem classes.
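If you would like to build such a surface of section yourself, here is a minimal sketch in Python for the 2D Logarithmic potential (equation 4.17). The parameter values v0, rc, q and the initial conditions are illustrative choices, not values from the course:

```python
# Sketch: surface of section for the 2D Logarithmic potential (eq. 4.17).
import numpy as np
from scipy.integrate import solve_ivp

v0, rc, q = 1.0, 0.14, 0.9           # illustrative parameters

def derivs(t, w):
    x, z, vx, vz = w
    s = rc**2 + x**2 + (z / q)**2
    ax = -v0**2 * x / s              # acceleration = -grad Phi
    az = -v0**2 * z / (q**2 * s)
    return [vx, vz, ax, az]

def x_crossing(t, w):
    return w[0]                      # record punctures of the x = 0 plane
x_crossing.direction = +1            # only crossings with xdot > 0 (one sense)

w0 = [0.5, 0.0, 0.0, 0.4]            # start on the x-axis with tangential velocity
sol = solve_ivp(derivs, (0.0, 200.0), w0, events=x_crossing,
                rtol=1e-10, atol=1e-10)
z_sos, vz_sos = sol.y_events[0][:, 1], sol.y_events[0][:, 3]
print(len(z_sos), "surface-of-section points (z, zdot)")
```

Plotting vz_sos against z_sos gives the surface of section for this single orbit; repeating for several initial conditions at the same energy builds up plots like Figure 4.5.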
4.1.3.2 Orbit families
A key concept to take away from the surface of section is the idea of orbit families. This is shown in
Figure 4.5, a schematic surface of section corresponding to a fictitious Hamiltonian. Notice
the different types of orbit. Let's quickly define some notation which will help us to understand this
plot:
Periodic orbits: Also known as resonant or closed orbits, meaning that two or more of their
orbital frequencies are rational multiples of each other. These show up in the surface of section
diagram as the purple dots. An example is the closed circular orbit.
² Don't worry just yet about how these numerical integrators work; we will cover them in the coming lectures.
[Figure 4.5 appears here: a schematic surface of section in the (x, p_x) plane, with a legend marking periodic (resonance) orbits, the energy surface, regular loop orbits, regular box orbits and irregular (stochastic) orbits. The slide notes that each resonance orbit creates a family of regular orbits; that a loop orbit has a fixed sense of rotation about the centre and never has x → 0; that a box orbit has no fixed sense of rotation and comes arbitrarily close to the centre; and that the figure is only an illustration of the topology of various orbits in a SOS and does not correspond to an existing Hamiltonian.]
Figure 4.5: A schematic surface of section corresponding to a fictitious Hamiltonian (credit: Frank
van den Bosch). Note that this surface of section is plotted in (x, m ẋ), rather than (z, ż) as in Figure
4.4.
Regular orbits: Also known as quasi-periodic orbits. These have at least as many isolating
integrals as spatial dimensions. They form orbit families around the resonant orbits. These are
marked with the green and red closed curves.
Irregular orbits: Also known as chaotic or stochastic orbits. These have fewer isolating
integrals than spatial dimensions. In the most extreme case, they have only energy as an
isolating integral. In this case, over time, they will fill all of the phase space energetically available
to them. They are marked with the blue dots.
Energy surface: This is marked by the solid blue outer line. It delimits the region within
which the particles can orbit given their energy. Recall that the surface of section diagram is
for one energy E. Different energies give different surfaces of section.
There are two main types of regular orbit in the 2D potential: box orbits (red lines) and loop
orbits (green lines); some real examples of these taken from the 2D Logarithmic potential (using Chris
Mihos' Java applet) are given in Figure 4.6. You can think of these as the generalisations of the radial
and circular orbits; indeed, they form orbit families about those special resonant cases. You have
come across this concept already in spherical potentials, although you didn't know it at the time. The
rosette orbits are loop orbits which form a family around the closed circular orbit! Remember, just
like radial and circular orbits:
Box orbits pass arbitrarily close to the centre of the potential.
Loop orbits never pass through the centre of the potential.
Can we have box orbits in an axisymmetric potential³? Or in a spherical potential?
³ Have a careful think about this. Remember that the 2D Logarithmic potential describes orbits in the y = 0 plane
of the 3D Logarithmic potential. These orbits are unstable, but can we call some of them box orbits? In the end this
becomes a semantic issue, but we will discuss it in the examples class. As a clue, think about the dimensionality of a
radial vs. circular orbit, and of a box vs. loop orbit. Therein lies the difference.
The resonant parent orbit of the loop orbits is clearly marked by the purple dots on Figure 4.5.
But where is the resonant orbit which parents the box orbit family? The box orbits are parented
by the radial orbit, which is the orbit given by the energy surface. We can understand this in the
following way: the relevant radial orbit (confined to the z-axis) has x = ẋ = 0. Thus the Hamiltonian
of the system becomes:

½ m ż² + m Φ(0, z) = E   (4.19)

and we recover the equation of the energy surface:

m ż = ±√( 2m (E − m Φ(0, z)) )   (4.20)
What about the point right at the centre of the S.O.S. diagram? This would correspond to an orbit
confined to the x-axis: z = ż = 0. Such an orbit is unstable in the S.O.S. plane (though it is not
really unstable in a physical sense); it marks the transition between the loop and the box orbits.
[Figure 4.6 appears here as six slide panels. Panels 1 and 2 ('Orbits in Planar Potentials IV and V'): for orbits at larger radii R ≳ R_c one has to resort to numerical integration, which reveals two major orbit families. Box orbits have no net sense of circulation about the centre and, in the course of time, pass arbitrarily close to it. Loop orbits do have a net sense of circulation and always maintain a minimum distance from the centre; any star launched from R ≈ R_c in the tangential direction with a speed of order v_0 will follow a loop orbit. In both cases the orbit completes a filled curve in the SOS, indicating that it admits a second isolating integral of motion. This is not a classical integral (it is not associated with a symmetry of the system) and we cannot, in general, express it in the phase-space coordinates, so it is simply called I_2. Panels 3 to 5 ('Orbits in Singular Logarithmic Potentials'): a banana-orbit (member of the 2:1 resonance family), a fish-orbit (3:2) and a pretzel-orbit (4:3). Panel 6 ('Orbits in Logarithmic Potentials with BH'): a stochastic orbit in a cored logarithmic potential with a central black hole.]
Figure 4.6: Real orbits in a 2D Logarithmic potential (taken from Chris Mihos' Java applet). From
top we have: box & loop; banana & fish; and pretzel & stochastic (irregular) orbits. The stochastic
orbits are created by adding a central black hole (point mass) to the potential; this destroys the box
orbits as they can no longer pass through the centre and so they become chaotic.
There are other kinds of regular orbits in 2D potentials too, and some examples of these are given
in Figure 4.6. The final orbit shown (bottom right) is an irregular orbit. The irregular orbits are the
result of scattering events with a central black hole. They arise because the box orbits try to pass
arbitrarily close to the centre, but eventually encounter the black hole and are scattered into a new
region of phase space.
The concept of orbit families is powerful. If we can find the resonant (closed) orbits within a
potential, then we know that there will be whole families of orbits surrounding each one. We will have
a good idea of the full orbital structure of the potential. This is definitely a labour-saving device!
But beware! Irregular (chaotic) orbits can mess this all up. If they form a significant fraction of
the total orbit population, then the regular orbits become increasingly irrelevant. Of course, we know
that there cannot be any irregular orbits in spherical potentials, so there is no worry there (how do we
know this?).
4.1.3.3 Stäckel potentials
We can gain a more analytic intuition for orbits in 2D potentials by considering a very special class
of potentials called Stäckel potentials. The key trick is to switch to the oblate spheroidal
coordinates we introduced in Lecture 3 (equation 3.15). We are motivated to do this by the structure
of orbits in the 2D Logarithmic potential. Remember that the two main orbit families were the loop
and box orbits. Take a look at the oblate spheroidal coordinate system shown in Figure 3.2, and
compare this with the loop and box orbits shown in Figure 4.6. Spot the similarity? The oblate
spheroidal coordinates seem like a natural choice for describing the loop and box orbits since they
have an intrinsically similar shape. We might hope then that, in these coordinates, the Hamiltonian
will become separable again and we will be able to find some special cases which are integrable to
guide us. This is indeed the case for Stäckel potentials.
Recall that the oblate spheroidal coordinates are given by:

x = cosh u sin v;   z = sinh u cos v   (4.21)

where x, z define the same 2D plane we used for the 2D Logarithmic potential. A general Stäckel
potential in these coordinates is given by:

Φ(u, v) = [ U(u) − V(v) ] / ( sinh²u + cos²v )   (4.22)
This may seem rather strange, but consider now the Hamiltonian in these coordinates, with the
above potential:

H = [ 1 / Δ(u, v) ] [ (1/2m) ( p_u² + p_v² ) + m U(u) − m V(v) ] = E   (4.23)

where Δ(u, v) = sinh²u + cos²v, and E is the total energy of the system.
Now, by analogy with the harmonic oscillator problem (see appendix H), suppose that p_u = p_u(u) and
p_v = p_v(v). This means that we can now separate the Hamiltonian to give:

p_u = ±√( 2m [ E sinh²u − I_2 − U(u) ] )   (4.24)

p_v = ±√( 2m [ E cos²v + I_2 + V(v) ] )   (4.25)
where I_2 is a constant. It is straightforward to show, by time differentiation of the above and elimination
of u̇ and ṗ_u using Hamilton's equations, that I_2 is a second integral of the motion. This is the beauty
of the Stäckel potential form. We have picked a special case which is fully integrable! In fact, this is
a very, very special case. The oblate spheroidal coordinates are the only coordinate system in which
the Hamiltonian is separable in the above fashion (this was proved by Stäckel, hence the name of the
potentials).
Now, the key point about the above is that the expressions inside the square roots for p_u and p_v
must be positive, otherwise we will have complex momenta that cannot be physical. This gives:

E sinh²u − I_2 − U(u) > 0   (4.26)

E cos²v + I_2 + V(v) > 0   (4.27)
To progress further from here, we must specify a form for the potential. However, we can get a feel for
the two main classes of orbit simply by inspection of the above equations. If I_2 > 0, then (remembering
that E is negative) we must have U(u) < 0 to obtain solutions. If U(u) is a monotonic function of
u, we require u > u_min. There is no similar constraint on v, but energy conservation gives an upper
bound on u: u < u_max. This orbit shape corresponds to the loop orbits that we have already found
numerically (see Figure 4.6). Now consider I_2 < 0. In this case, we have only the energy constraint
on u: u < u_max, while v is bounded both by energy and by the above equations: v_min < v < v_max.
For v_min = 0 we obtain the box orbits (see Figure 4.6).
4.1.4 Triaxiality
We now look at the full triaxial case. Recall that triaxial means that each axis is different and we no
longer have any obvious symmetry. As a result, we may as well work in good old Cartesian coordinates.
The Lagrangian is given by:

L = ½ m ( ẋ² + ẏ² + ż² ) − m Φ(x, y, z)   (4.28)

Notice that now L is a function of all three spatial variables: all we can say is that the energy, E, is
an isolating integral. This isn't very spectacular: it tells us that the general orbit will be irregular
and chaotic!
Basically, the above means that it is now time to switch on your computer. However, we can gain
some intuition from the 2D potentials we studied in the previous section. Have a look at some regular
orbits in a triaxial potential, as shown in Figure 4.7. In practice, they really do look like the 3D
analogues of the box and loop orbits we are already familiar with!
How on Earth can the above be true? We might expect total chaos (literally)
when we move to fully triaxial potentials. The reason is that the potentials which describe real galaxies
(like that used for the orbits in Figure 4.7) are very close to being integrable. A useful theorem, which
we won't have time or space to prove here, is the KAM theorem (after its three inventors: Kolmogorov,
Arnold & Moser). This states that if the Hamiltonian is close to integrable, then the quasi-periodic
(regular) orbits occupy a finite volume of phase space (see e.g. Binney and Tremaine 2008). This
means that even triaxial potentials will show orbit structure similar to that found in axisymmetric
potentials (but with the addition, of course, of chaotic orbits).
But why can we ignore the irregular orbits, just because regular orbits exist? The answer lies in
a process called Arnold diffusion. The regular orbits take up some finite region of phase space (you
can think of this as occupying a finite region of the surface of section diagram). The irregular orbits
must then squeeze into the spaces left over in between the regular orbits. This is because each point
in phase space must correspond to a unique orbit (remember Liouville's theorem? right now it is
useful!). If there are enough regular orbits, then the phase volume left to the irregular orbits becomes
smaller and smaller. The irregular orbits can become trapped in between large families of regular
orbits. Over a long enough time they will escape to fill all of the available phase space but, in
practice, Arnold diffusion can occur very slowly; slow enough that the irregular
orbits remain well confined over any timescale of interest.
4.1.4.1 Chaos
In practice, it is not clear how useful the KAM theory really is. The problem is that it is very difficult
to measure how irregular an orbit is. When we can't solve for an orbit proper, all we can do
is make statements about how much of phase space it will fill in infinite time. And there lies the
problem. Over finite time (the lifetime of our Galaxy, for example), quasi-periodic orbits can appear
chaotic, while irregular orbits can appear quasi-periodic⁴. This has led to quite an industry in trying
to calculate which orbits are truly irregular and which are not. But all such schemes eventually run
into the same problem: that we cannot conclusively prove either without waiting infinitely long. Let's
illustrate this problem using a popular way of measuring chaos: Lyapunov exponents.
The idea behind Lyapunov exponents is quite simple. If an orbit is chaotic (irregular), then an
infinitesimal change in its initial phase space position will cause an exponential divergence away from
the original orbit (this is just the definition of chaotic). Thus the phase separation of the perturbed
and unperturbed orbits is given by:

Δp(t) = e^{λt} Δp(0)   (4.29)

where λ is the Lyapunov exponent.
The problem with the above method is that it can prove that orbits are chaotic, but not the
converse. Consider the situation where λt ≪ 1: either the Lyapunov exponent is small, or we track
the orbit only over a short period of time. In this case, we may expand the exponential as:
⁴ One way to see this is through Arnold diffusion again. A trapped irregular orbit will eventually slip through the
gap between two regular orbit families. But, until it does, it remains well confined, sometimes for a very long time.
Such an orbit will appear regular until it diffuses through the gap.
Figure 4.7: Examples of orbits in a triaxial potential (non-rotating). In each 3D plot, the minor axis
of the triaxial figure is vertical. Notice how similar the orbits are to the 2D case (Figure
4.6). The first orbit (top left) is a triaxial box orbit. The next two are the triaxial equivalents of loop
orbits, called tube orbits. The bottom two are triaxial equivalents of the fish and pretzel orbits.
e^{λt} ≈ 1 + λt + (λt)²/2! + (λt)³/3! + O((λt)⁴)   (4.30)
We see that, in practice, unless the Lyapunov exponents are very large, or we observe for a very
long time (infinity, remember!), orbits will diverge in phase space only as a power law, rather than
exponentially. Other schemes suffer from the same problem too.
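To make this concrete, here is a minimal twin-orbit sketch in Python: we perturb an orbit in the 2D Logarithmic potential by 10⁻⁸ and measure the finite-time divergence rate. The parameters, initial conditions and integration time are illustrative, and the leapfrog stepping anticipates Lecture 5. As argued above, a small measured exponent over finite time cannot distinguish a truly regular orbit from a weakly chaotic one:

```python
# Sketch: finite-time Lyapunov exponent estimate from a perturbed twin orbit.
import numpy as np

v0, rc, q = 1.0, 0.14, 0.9            # 2D Logarithmic potential parameters

def accel(x, z):
    s = rc**2 + x**2 + (z / q)**2
    return -v0**2 * x / s, -v0**2 * z / (q**2 * s)

def step(w, dt):                      # one kick-drift-kick leapfrog step
    x, z, vx, vz = w
    ax, az = accel(x, z)
    vx += 0.5 * dt * ax; vz += 0.5 * dt * az
    x += dt * vx;        z += dt * vz
    ax, az = accel(x, z)
    vx += 0.5 * dt * ax; vz += 0.5 * dt * az
    return np.array([x, z, vx, vz])

w  = np.array([0.5, 0.0, 0.0, 0.4])
wp = w + np.array([1e-8, 0.0, 0.0, 0.0])   # infinitesimally perturbed twin
dt, nstep = 2e-3, 50_000
for _ in range(nstep):
    w, wp = step(w, dt), step(wp, dt)
sep = np.linalg.norm(wp - w)
lam = np.log(sep / 1e-8) / (nstep * dt)    # finite-time estimate of lambda
print(f"finite-time Lyapunov exponent ~ {lam:.3f}")
```

A near-zero value here does not prove the orbit is regular; it may simply mean we have not waited long enough, which is exactly the difficulty described above.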
4.1.5 Orbits in discs
There is one final analytic case worth considering. As discussed in Lecture 1, many galaxies are disc-
like. The stars in these discs form from gas, which is a collisional fluid. As mentioned previously, this
means that we expect the gas to shock and lose energy until its orbits reach the lowest energy state:
circular motion. Since stars form from this gas, we can expect their orbits to be nearly circular also.
If the disc is very thin (and most are), then the stars will lie very nearly in the x-y plane with z = 0.
Recall the Lagrangian for axisymmetric systems (with m = 1):
L = ½ ( Ṙ² + (R φ̇)² + ż² ) − Φ(R, z)   (4.31)
The Euler-Lagrange equations (see appendix H) yield the equations of motion:

R̈ − R φ̇² + ∂Φ/∂R = 0   (4.32)

d/dt ( R² φ̇ ) = 0   (4.33)

z̈ + ∂Φ/∂z = 0   (4.34)
The second of these equations recovers conservation of the z-component of the (specific) angular
momentum: L_z = const. = R² φ̇. We may then eliminate φ̇ from the above equations using the
following effective potential:

Φ_eff = Φ + L_z² / (2R²)   (4.35)

to give:

R̈ = −∂Φ_eff/∂R;   z̈ = −∂Φ_eff/∂z   (4.36)
So far we have made no approximations. We have simply recast the equations of motion in the above
2D form. We see that, as a result of L_z conservation, motion in an axisymmetric potential can be
fully described by 2D motion in the (R, z) plane, called the meridional plane.
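The meridional-plane equations (4.36) are easy to integrate numerically. A minimal Python sketch for an axisymmetric logarithmic potential follows; the parameters, L_z and initial conditions are illustrative, and the kick-drift-kick stepping anticipates Lecture 5. Note how the centrifugal term L_z²/2R² keeps the orbit away from R = 0:

```python
# Sketch: orbit integration in the meridional plane (eq. 4.36).
import numpy as np

v0, Rc, q, Lz = 1.0, 0.14, 0.9, 0.4   # illustrative potential and L_z

def accel(R, z):
    s = Rc**2 + R**2 + (z / q)**2
    aR = -v0**2 * R / s + Lz**2 / R**3    # -dPhi_eff/dR
    az = -v0**2 * z / (q**2 * s)          # -dPhi_eff/dz
    return aR, az

def E_eff(R, z, vR, vz):                  # conserved energy in the (R, z) plane
    s = Rc**2 + R**2 + (z / q)**2
    return 0.5*(vR**2 + vz**2) + 0.5*v0**2*np.log(s) + Lz**2/(2*R**2)

R, z, vR, vz = 1.0, 0.0, 0.0, 0.2
E0, dt = E_eff(R, z, vR, vz), 1e-3
Rmin = R
for _ in range(100_000):                  # kick-drift-kick stepping
    aR, az = accel(R, z)
    vR += 0.5*dt*aR; vz += 0.5*dt*az
    R += dt*vR; z += dt*vz
    aR, az = accel(R, z)
    vR += 0.5*dt*aR; vz += 0.5*dt*az
    Rmin = min(Rmin, R)
print(f"closest approach R_min = {Rmin:.3f} (centrifugal barrier); "
      f"energy drift = {abs(E_eff(R, z, vR, vz) - E0):.2e}")
```

The star oscillates in both R and z while remaining outside a minimum radius, exactly the behaviour the effective potential predicts.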
Now we use the fact that disc stars move on nearly circular orbits in the z = 0 plane. We define
x ≡ R − R_g, where R_g is a constant called the guiding centre. For a perfectly circular orbit, x = 0.
Thus disc stars move on orbits with x ≈ 0, z ≈ 0 and we can Taylor expand⁵ the potential about this
point to give:
Φ_eff ≈ Φ_eff(R_g, 0) + x ∂Φ_eff/∂R|_(R_g,0) + z ∂Φ_eff/∂z|_(R_g,0) + (x²/2) ∂²Φ_eff/∂R²|_(R_g,0) + (z²/2) ∂²Φ_eff/∂z²|_(R_g,0) + x z ∂²Φ_eff/∂R∂z|_(R_g,0) + O(xz²)   (4.37)
The first order terms in the expansion vanish by application of equations 4.32 and 4.34 for a circular
orbit with R = R_g, z = 0. The xz cross term also vanishes if we assume (quite reasonably) that
the potential is symmetric about z = 0. Neglecting the higher order terms is called the epicycle
approximation, and it recovers the equations of an uncoupled simple harmonic oscillator for x and z:

ẍ = −κ² x;   z̈ = −ν² z   (4.38)
with:

κ² = ∂²Φ_eff/∂R²|_(R_g,0);   ν² = ∂²Φ_eff/∂z²|_(R_g,0)   (4.39)
The above gives us two orbit frequencies: the vertical frequency ν and the epicyclic frequency κ. A
third comes from the azimuthal motion:

Ω²(R) = (1/R) ∂Φ/∂R|_(R_g,0)   (4.40)

which defines the azimuthal frequency Ω.
As for the spherical potential, these frequencies define the orbit motion. The star will perform
simple harmonic motion in R and z about its guiding centre, which is a circular orbit. The motion
about the guiding centre is called an epicycle and is fully analytic, as in the case of the harmonic
oscillator that has the same equations of motion.
A particular potential model relates κ to Ω. Using Ω²(R) = L_z²/R⁴ and the definition of the
effective potential, we obtain:

κ²(R_g) = ( R dΩ²/dR + 4Ω² )|_(R_g)   (4.41)

For a constant density harmonic potential, Ω = const. and we have κ = 2Ω. For the Kepler potential,
Ω ∝ R^(−3/2), and we have κ = Ω. Thus we may expect: Ω < κ < 2Ω.
As can be seen from the above equations, these three frequencies give us information about the
local gradients in the rotation curve. The above approximation provides a useful first step towards
building a local mass model and determining the density of matter near the Sun.
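The two limits κ = 2Ω and κ = Ω are easy to verify numerically from equation (4.41). A quick sketch, where the two test potentials and the radius R_g = 2 are illustrative choices:

```python
# Sketch: checking eq. (4.41) for the harmonic and Kepler limits.
import numpy as np

def kappa(Omega2, Rg, h=1e-6):
    """Epicyclic frequency from eq. (4.41), with dOmega^2/dR by central difference."""
    dOmega2 = (Omega2(Rg + h) - Omega2(Rg - h)) / (2 * h)
    return np.sqrt(Rg * dOmega2 + 4 * Omega2(Rg))

GM = 1.0
kepler   = lambda R: GM / R**3        # Omega^2 for a point mass
harmonic = lambda R: 1.0 + 0 * R      # Omega^2 = const for uniform density

Rg = 2.0
print(kappa(kepler, Rg) / np.sqrt(kepler(Rg)))      # kappa/Omega -> 1 (Kepler)
print(kappa(harmonic, Rg) / np.sqrt(harmonic(Rg)))  # kappa/Omega -> 2 (harmonic)
```

Any realistic rotation curve sits between these two limits, which is why we expect Ω < κ < 2Ω.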
4.1.6 Rotating potentials
Unfortunately, we won't have time to touch on rotating potentials in this course. Rotation is
important if the potential is non-axisymmetric (otherwise there is no time-averaged change). We have
assumed so far that Galactic discs are axisymmetric, which is a reasonable approximation for many
galaxies, particularly away from the centre of the disc. However, many galaxies have bars, which form
as a result of gravitational instability in the disc. Bars are highly non-axisymmetric and lead to new
and interesting orbit structure. The interested reader is referred to the course book (Binney and
Tremaine 2008).
⁵ See appendix D.
4.2 The N-body problem
So far we have discussed only the one-body approximation of a tracer particle moving in a general
static gravitational potential. In Lecture 2, we discussed the two-body problem, showing that it
reduces to a one-body Kepler problem under a simple transformation. Once we move beyond the
two-body problem, however, all Hamiltonians (apart from a very few special cases) become
non-integrable and we must resort to computational techniques. We will discuss such computational
techniques, and how they differ when studying collisional or collisionless systems, over the next two
lectures. You will explore some of these techniques in the problem classes.
Lecture 5
Collisional N-body systems
In this lecture, we discuss how to simulate collisional N-body systems. These typically have N < 10⁶,
and the force evaluation is not the hard part; rather, it is the timestepping (evolving the system
forwards in time) that is challenging. After briefly describing the force evaluation, we focus our
efforts on building an accurate and conservative time integration scheme. This lecture largely follows
our review article (Dehnen and Read 2011).
5.1 Direct force evaluation
For N-body systems with small N, we can simply treat the particles as discrete point masses. Indeed,
this is almost exactly correct if the bodies are stars within a star cluster, or planets in a solar system,
since such bodies are so small compared to the system size and typically near-perfectly spherical. The
(Newtonian) force on a particle i then follows from a simple sum over particles j:

F_i = Σ_{j≠i}^N G m_i m_j (x_j − x_i) / |x_i − x_j|³   (5.1)

where x_i and x_j are the positions of particles i and j, and N is the total number of particles.
We immediately run into two computational problems, however. The first is that we must compute
O(N) sums for each particle, and thus the algorithm scales as O(N²), which is very slow (i.e. if I increase
the number of particles by a factor 10, the computational cost will increase 100-fold!). Secondly, recall
that these particles are merely sampling points in the density field. If two approach one another,
they should not really behave like giant point masses. Yet equation 5.1 has F_i diverging as
x_i → x_j. For the collisionless N-body systems that we will discuss in the next lecture, this latter problem
is typically solved by introducing a force softening ε such that the force equation becomes:

F_i = Σ_{j≠i}^N G m_i m_j (x_j − x_i) / ( ε² + |x_i − x_j|² )^{3/2}   (5.2)

which removes the diverging force for approaching particles.
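A direct implementation of equation (5.2) is only a few lines. This minimal Python sketch (with illustrative units G = 1 and random particle positions) makes the O(N²) double loop explicit:

```python
# Sketch: direct-summation softened forces (eq. 5.2); cost scales as O(N^2).
import numpy as np

G = 1.0

def forces(pos, mass, eps=0.0):
    """Softened pairwise forces; pos has shape (N, 3), mass has shape (N,)."""
    N = len(mass)
    F = np.zeros_like(pos)
    for i in range(N):
        for j in range(N):
            if i == j:
                continue                      # no self-force
            dx = pos[j] - pos[i]
            r2 = eps**2 + np.dot(dx, dx)      # softened separation squared
            F[i] += G * mass[i] * mass[j] * dx / r2**1.5
    return F

rng = np.random.default_rng(1)
pos = rng.standard_normal((100, 3))
mass = np.full(100, 1.0 / 100)
F = forces(pos, mass, eps=0.01)
print("net force (vanishes by Newton's third law):", np.linalg.norm(F.sum(axis=0)))
```

Doubling N quadruples the number of pair evaluations, which is exactly the O(N²) scaling problem described above; the softening ε prevents the divergence as two particles approach.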
For collisional systems, however, we actually want to model close encounters. We might still want
to use some ε, but it should represent the physical size of the gravitating bodies: the radius of a
star or planet, for example. How then do we cope when bodies get very close together and the forces
become very large? This we must address by using clever timestepping algorithms and, ultimately,
by introducing some regularisation. We discuss these next.
5.2 Time integration
5.2.1 The Simple Euler integrator
Having calculated the force on the particles, we must then evolve them forwards in time. It is tempting
to use the simple Euler method. Defining a timestep Δt_i for a particle i, we can update its position
and velocity as:

x_i(t + Δt_i) = x_i(t) + ẋ_i Δt_i   (5.3)

ẋ_i(t + Δt_i) = ẋ_i(t) + ẍ_i Δt_i   (5.4)

where ẍ_i is the acceleration evaluated at time t.
However, while this is conceptually straightforward, such
a scheme performs very poorly in practice. The Euler method is nothing more than a Taylor expansion
in Δt about t to first order. Thus, the errors will be proportional to Δt². We can significantly
improve on this at little additional computational cost by using either a symplectic integrator and/or
a higher order integrator, i.e. one that has an error that goes as Δt^n, with n > 2. Symplectic
integrators precisely solve an approximate Hamiltonian and have the advantage that, as a result,
energy is manifestly conserved in a time-averaged sense. This means that energy errors are bounded
and will not grow even over many thousands of dynamical times. Higher order integrators give smaller
errors for the same timestep, but do not necessarily conserve energy. We discuss each of these in turn,
next.
5.2.2 The Leapfrog integrator
Symplectic integrators seek to solve an approximate Hamiltonian perfectly and thus ensure that energy
errors are bounded. This is their great advantage. It becomes particularly important when modelling
systems over many, many dynamical times (like the Solar system), since otherwise even tiny energy
errors can catastrophically accumulate.
Recall from Appendix H that any Hamiltonian system (i.e. one that obeys Hamilton's equations) will
be energy conserving. We start by defining a Lagrangian that has no explicit time dependence:

L ≡ L(x, ẋ);   ∂L/∂t = 0   (5.5)
Thus:

dL/dt = (∂L/∂x_i) ẋ_i + (∂L/∂ẋ_i) ẍ_i = d/dt [ (∂L/∂ẋ_i) ẋ_i ]   (5.6)
where we have used the Euler-Lagrange equations and the summation convention.
Now, notice that:

L = T − V = ½ m_i ẋ_i² − V   (5.7)

Thus:

(∂L/∂ẋ_i) ẋ_i = m_i |ẋ_i|² = 2T   (5.8)
And therefore:

H = T + V = 2T − L = (∂L/∂ẋ_i) ẋ_i − L   (5.9)

where H is the Hamiltonian of the system: the total energy. Thus:

dH/dt = d/dt [ (∂L/∂ẋ_i) ẋ_i ] − dL/dt = 0   (5.10)

And we have proven that if the Lagrangian has no explicit time dependence, then the total energy is
conserved.
Now, once we evolve a system over a series of discrete timesteps, we no longer perfectly solve our
system of equations. In general, this breaks the time independence of L, and energy is therefore no
longer conserved. But suppose we can replace the Hamiltonian H with an approximate form:

H̃ = H + H_err   (5.11)

where H_err is the error Hamiltonian. Provided that H̃ and H are time-invariant, the energy error is
bounded at all times (e.g. Yoshida 1993). The goal is to find an H̃ that can be solved exactly by simple
numerical means and that minimises H_err. Defining the combined phase-space coordinates w = (x, p)
with p = m ẋ and m = 1, we can re-write Hamilton's equations (Appendix H) as:
Ĥ w = ẇ,   (5.12)

where Ĥ ≡ {·, H} (with the Poisson bracket {A, B} ≡ ∂_x A ∂_p B − ∂_x B ∂_p A) is an operator acting on w.
[Writing this out, we have:

Ĥ w = ∂_x w ∂_p H − ∂_x H ∂_p w = ∂_x w ẋ + ∂_p w ṗ = dw/dt = ẇ   (5.13)

where we have used Hamilton's equations ẋ = ∂_p H and ṗ = −∂_x H.]
Equation (5.12) has the formal solution:

w(t + Δt) = e^{Δt Ĥ} w(t)   (5.14)

where we can think of the operator e^{Δt Ĥ} as a symplectic map from t to t + Δt. This operator can be
split into a succession of discrete but symplectic steps, each of which can be exactly integrated. The
most common choice is to separate out the kinetic and potential energies, H = T(p) + V(x), such
that we can split:

e^{Δt Ĥ} = e^{Δt (T̂ + V̂)} ≈ e^{Δt V̂} e^{Δt T̂} = e^{Δt H̃}.   (5.15)
The expression in equation (5.15) is only approximate because the operators T̂ ≡ {·, T} and V̂ ≡ {·, V}
are non-commutative. However, this operator splitting is useful because, while in general equation (5.12)
has no simple solution, the equivalent equations for each of our new operators do:

e^{Δt T̂} (x, p) = (x + Δt p, p)   and   e^{Δt V̂} (x, p) = (x, p − Δt ∇V(x)).   (5.16)

These operations are also known as drift and kick operations, because they only change either the
positions (drift) or the velocities (kick). Notice that the drift step in (5.15) is identical to the simple Euler
method (equation 5.3), while the kick step is not. This is because the kick step is calculated using the
drifted rather than the initial positions. An integrator that combines one drift and one kick operation
is called a modified Euler scheme and is symplectic. By contrast, the Euler method is not symplectic
because the acceleration is calculated using the initial positions.
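One way to see the difference is to check phase-space volume conservation directly. A sketch for a 1D harmonic potential V(x) = x²/2 (an illustrative choice): the drift and kick maps of equation (5.16), composed into the modified Euler scheme, have a Jacobian determinant of exactly one, whereas the simple Euler map does not:

```python
# Sketch: phase-volume conservation of the modified Euler (drift + kick) map
# versus the simple Euler map, for V(x) = x^2/2 so that grad V = x.
import numpy as np

dt = 0.1

def mod_euler(w):                  # drift, then kick using the drifted position
    x, p = w
    x = x + dt * p                 # drift
    p = p - dt * x                 # kick (acceleration from the drifted x)
    return np.array([x, p])

def euler(w):                      # simple Euler: both updates use old values
    x, p = w
    return np.array([x + dt * p, p - dt * x])

def detJ(f, w, h=1e-6):            # Jacobian determinant by central differences
    J = np.column_stack([(f(w + h*e) - f(w - h*e)) / (2*h) for e in np.eye(2)])
    return np.linalg.det(J)

w = np.array([0.3, -0.2])
print("modified Euler det J =", detJ(mod_euler, w))  # 1: preserves phase volume
print("simple   Euler det J =", detJ(euler, w))      # 1 + dt^2 = 1.01
```

Preserving phase-space volume (det J = 1) is a necessary consequence of symplecticity; the simple Euler map inflates phase volume by a factor 1 + Δt² per step, which is why its energy errors grow secularly.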
Both the modified and un-modified Euler schemes are only first order accurate (c.f. the comparison
with a Taylor expansion we discussed previously). We can do better by combining many appropriately
weighted kick and drift steps:

e^{Δt H̃} = Π_{i=1}^N e^{a_i Δt V̂} e^{b_i Δt T̂} = e^{Δt Ĥ + O(Δt^n)}   (5.17)
where the coefficients a_i and b_i are chosen to obtain the required order of accuracy n. From equation (5.17),
we see two important things: (i) the approximate Hamiltonian H̃ is solved exactly by the
successive application of the kick and drift operations (Yoshida 1993); and (ii) H̃ approaches H in the
limit Δt → 0, and/or the limit n → ∞.
At second order (n = 2), and choosing coefficients that minimise the error, we derive the leapfrog integrator:

    e^{\delta t \hat{H} + \mathcal{O}(\delta t^3)} = e^{\frac{1}{2}\delta t \hat{V}}\, e^{\delta t \hat{T}}\, e^{\frac{1}{2}\delta t \hat{V}}.

Writing out each of these operations using equations (5.16), we have (subscripts 0 and 1 refer to times t and t + \delta t, respectively):
    \dot{x}' = \dot{x}_0 + \tfrac{1}{2}\delta t\, \ddot{x}_0    (5.18)
    x_1 = x_0 + \delta t\, \dot{x}'    (5.19)
    \dot{x}_1 = \dot{x}' + \tfrac{1}{2}\delta t\, \ddot{x}_1    (5.20)

where $\ddot{x}_0 = -\nabla V(x_0)$ and $\ddot{x}_1 = -\nabla V(x_1)$, while the intermediate velocity $\dot{x}'$ serves only as an auxiliary quantity. Combining equations (5.18-5.20), we find:
    x_1 = x_0 + \delta t\, \dot{x}_0 + \tfrac{1}{2}\delta t^2\, \ddot{x}_0    (5.21)
    \dot{x}_1 = \dot{x}_0 + \tfrac{1}{2}\delta t\, (\ddot{x}_0 + \ddot{x}_1)    (5.22)

which are the familiar Taylor expansions of the positions and velocities to second order in $\delta t$.
In principle, we can combine as many kick and drift operations as we choose to raise the order of the scheme. However, it is impossible to go beyond second order without having at least one $a_i$ and one $b_i$ coefficient in equation (5.17) be negative (Sheng 1989; Suzuki 1991). This involves some backwards integration, which is problematic when using variable timesteps, especially if time symmetry is required¹.
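As a concrete illustration, here is a minimal kick-drift-kick leapfrog (equations 5.18-5.20) for a test particle in a point-mass potential with GM = 1. This is our own sketch, not code from the references; we use a milder eccentricity (e = 0.5) than in Figure 5.1 so that a fixed step suffices:

```python
import math

def accel(x, y):
    # Point-mass (Kepler) acceleration with GM = 1: a = -x / r^3
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3

def leapfrog_kdk(x, y, vx, vy, dt):
    """One step: half kick (5.18), drift (5.19), half kick (5.20)."""
    ax, ay = accel(x, y)
    vx += 0.5 * dt * ax
    vy += 0.5 * dt * ay
    x += dt * vx
    y += dt * vy
    ax, ay = accel(x, y)
    vx += 0.5 * dt * ax
    vy += 0.5 * dt * ay
    return x, y, vx, vy

def energy(x, y, vx, vy):
    return 0.5 * (vx * vx + vy * vy) - 1.0 / math.hypot(x, y)

# Start at apocentre of an a = 1, e = 0.5 ellipse (vis-viva velocity)
e = 0.5
x, y = 1.0 + e, 0.0
vx, vy = 0.0, math.sqrt((1.0 - e) / (1.0 + e))
E0 = energy(x, y, vx, vy)             # = -1/(2a) = -0.5
dt = 2.0 * math.pi / 1000.0           # 1000 steps per orbital period
for _ in range(20000):                # ~20 orbits
    x, y, vx, vy = leapfrog_kdk(x, y, vx, vy, dt)
err = abs(energy(x, y, vx, vy) - E0)
print(err)  # oscillates on an orbital timescale, no secular drift
```

The orbit precesses slightly (a phase error), but the time-averaged energy is conserved; shrinking dt reduces the residual oscillation quadratically.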
5.2.3 Variable timesteps
In practice, most modern codes also employ variable timesteps, typically in a hierarchy of timestep rungs (see Figure 5.2). This breaks the symplectic nature of the leapfrog integrator and, in principle, time averaged energy conservation is no longer guaranteed. However, we can do almost as well as a symplectic scheme by ensuring that the integrator remains time symmetric (e.g. Quinlan and Tremaine 1990).
One route to time symmetry is to solve the implicit leapfrog equations (5.21) and (5.22) using a symmetrised timestep:

    \delta t = \tfrac{1}{2}\left[ \delta t(x_0, \dot{x}_0, \ddot{x}_0, \ldots) + \delta t(x_1, \dot{x}_1, \ddot{x}_1, \ldots) \right]    (5.23)
where in general the timestep may be some function of the position, velocity, acceleration and even higher order derivatives of x (we will discuss how to choose the timestep later on in this section). Since $\delta t$ is a function of the force evaluated at $t_1$, but the positions at $t_1$ are a function of $\delta t$, we must solve the above equation iteratively (e.g. Hut et al. 1995). This is why such an integrator is referred to as an implicit scheme. However, while this strategy will work, each iteration involves another expensive force evaluation, and so implicit schemes are not often useful in practice. Fortunately, there are explicit (non-iterative) time symmetric leapfrog algorithms (Holder et al. 1999). As a simple example, let us define the new time step at $t_1$ as:
    \delta t_{\mathrm{old}}\, \delta t_{\mathrm{new}} = T(x, \dot{x}, \ddot{x}, \ldots)^2    (5.24)

where the function T generates the step size from the coordinates, while $\delta t_{\mathrm{old}}$ and $\delta t_{\mathrm{new}}$ are the time steps used to evolve to and from the arguments of T. Clearly, equation (5.24) is time symmetric by construction and requires no iteration to solve.
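The recipe in equation (5.24) amounts to a couple of lines of code. In this sketch (our own; T is an arbitrary stand-in for a timestep function such as equation 5.33), stepping forwards through a boundary and then backwards through the same boundary recovers the original step exactly, which is the time symmetry we are after:

```python
def next_step(dt_old, T_boundary):
    # Equation (5.24): dt_old * dt_new = T(x, xdot, xddot, ...)^2,
    # with T evaluated at the state shared by the two steps, so the
    # new step follows explicitly, with no iteration.
    return T_boundary ** 2 / dt_old

# Forward through a boundary where T = 0.05, starting from dt_old = 0.04 ...
dt_new = next_step(0.04, 0.05)
# ... and back through the same boundary recovers the original step:
dt_back = next_step(dt_new, 0.05)
print(dt_new, dt_back)
```

Note that the update only needs quantities already available at the step boundary, which is why it costs no extra force evaluations.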
In the left panel of Figure 5.1, we compare the integration of a simple Kepler orbit with an eccentricity of e = 0.9 for the leapfrog integrator with fixed (black) and variable (blue) timesteps, and a non-symplectic fourth order Hermite integrator with fixed timesteps (red; see Dehnen and Read 2011 for details). The middle panel compares energy conservation for the leapfrog integrator with fixed (black), variable (red) and variable time-symmetric (blue) timesteps. For the leapfrog integrator with fixed timesteps (black), the energy fluctuates on an orbital time scale, but is perfectly conserved in the long term. This can be seen also in the orbit (left panel), which precesses but does not decay. By contrast, the Hermite integrator, which is not symplectic but is more accurate, shows a smaller phase error, but its orbit does decay with time. Best of all is the leapfrog integrator with variable symmetric timesteps (blue). This has very small orbital error (left panel), and excellent long-term error properties.

¹Recently, Chin and Chen 2005 have constructed fourth-order symplectic integrators which require only forwards integration. To achieve this, rather than eliminate all the errors by appropriate choice of the coefficients $a_i$ and $b_i$, they integrate one of the error terms, thus avoiding any backward step. Their method requires just two force and one force gradient evaluation per time step. It has not yet found wide application in N-body dynamics, but could be a very promising avenue for future research.

Figure 5.1: (Figures taken from Dehnen & Read 2011.) Left: Comparison of the leapfrog integrator (black); a 4th order non-symplectic Hermite scheme (red); and time symmetric leapfrog using variable timesteps (blue) for the integration of an elliptic (e = 0.9) Kepler orbit over 100 periods. In the first two cases a fixed timestep of $\delta t = 0.001\, t_{\mathrm{orb}}$ was used. Middle: The fractional change in energy for the Kepler problem for various flavours of the leapfrog integrator: fixed timesteps (black); variable timesteps (red); and symmetric variable timesteps (blue). Right: The fractional change in energy for the Kepler problem for various flavours of the 4th order Hermite integrator: fixed timesteps (black); variable timesteps using the Aarseth timestepping scheme (red; equation 5.38); and variable timesteps using equation 5.39 (red dotted). The blue line shows the energy error for an orbit computed using the K-S regularised equations of motion (see text for details). In all cases with variable timesteps, the calculations were performed at the same computational cost (~250 force and jerk evaluations per orbit, which is about a quarter of the cost of the fixed-timestep calculations).
Figure 5.2: (Figure taken from Dehnen & Read 2011.) Schematic illustration of a block timestepping scheme. Particles are organised on timesteps in a hierarchy of powers of two relative to a base time $\delta t_0$. The time step level, denoted $n = 0, 1, 2, \ldots$, is called the timestep rung. Particles can move up and down rungs at synchronisation points marked by the red arrows.
In the middle panel of Fig. 5.1, we compare energy conservation for the leapfrog using fixed timesteps, variable timesteps, and time symmetric variable timesteps (i.e. using equation 5.24) for a simple Kepler orbit with an eccentricity of e = 0.9. With fixed timesteps (black) the energy fluctuates on an orbital time scale, but is perfectly conserved in the long term; with variable timesteps, manifest energy conservation is lost (red); while with the time symmetric variable time step scheme, we recover excellent energy conservation (blue). The time symmetric variable time step leapfrog used a quarter of the force calculations required for the fixed-step integration, while giving over an order of magnitude better energy conservation. This is why variable timesteps are an essential ingredient in modern N-body calculations.
5.2.4 Hermite integrators
The leapfrog integrator is popular because of its simplicity, manifest energy conservation, and stability. However, it has only found wide application for collisionless N-body applications. This is because, although energy is conserved in a time averaged sense, the finite timestep causes the energy to oscillate, leading to spurious orbital precession (see Figure 5.1). In collisional systems, where we must correctly track chaotic close encounters, such oscillations and false precession rapidly ruin the quality of the integration. Shrinking the timestep helps, but as the leapfrog is only a second order scheme, the step sizes required become prohibitively small. This motivates considering higher order non-symplectic integrators for collisional applications.
The current state of the art are higher-order non-symplectic integrators, so-called Hermite schemes. The central idea is to Taylor expand the local acceleration and its derivatives about $t_0$ (here to fourth order; for higher-order schemes see Nitadori and Makino 2008):

    a = a_0 + \dot{a}_0\,(t - t_0) + \tfrac{1}{2}\ddot{a}_0\,(t - t_0)^2 + \tfrac{1}{6}\dddot{a}_0\,(t - t_0)^3 + \mathcal{O}(\delta t^4)    (5.25a)
    \dot{a} = \dot{a}_0 + \ddot{a}_0\,(t - t_0) + \tfrac{1}{2}\dddot{a}_0\,(t - t_0)^2 + \mathcal{O}(\delta t^3)    (5.25b)
where $a = \ddot{x}$ and $\dot{a}$ are, respectively, the acceleration and jerk:

    \dot{a}_i = -G \sum_{j \neq i} m_j \left[ \frac{\dot{x}_{ij}}{|x_{ij}|^3} - \frac{3\, x_{ij}\,(x_{ij} \cdot \dot{x}_{ij})}{|x_{ij}|^5} \right] \quad \mathrm{with} \quad x_{ij} \equiv x_i - x_j,    (5.26)

of some particle with timestep $\delta t$. Equations (5.25) amount to a polynomial fit to the acceleration between time steps². Assuming we know the polynomial coefficients $a_0$, $\dot{a}_0$, etc., we can integrate them to obtain the final position and velocity:
    x_1 = x_0 + \dot{x}_0\, \delta t + \tfrac{1}{2} a_0\, \delta t^2 + \tfrac{1}{6} \dot{a}_0\, \delta t^3 + \tfrac{1}{24} \ddot{a}_0\, \delta t^4,    (5.27a)
    \dot{x}_1 = \dot{x}_0 + a_0\, \delta t + \tfrac{1}{2} \dot{a}_0\, \delta t^2 + \tfrac{1}{6} \ddot{a}_0\, \delta t^3 + \tfrac{1}{24} \dddot{a}_0\, \delta t^4.    (5.27b)
Of the polynomial coefficients, the initial acceleration $a_0$ and jerk $\dot{a}_0$ are readily calculated from equations (5.1) and (5.26). The higher-order coefficients are evaluated by means of a predictor-corrector scheme. First, we predict the position, velocity, and acceleration at time $t_1$:
    x_p = x_0 + \dot{x}_0\, \delta t + \tfrac{1}{2} a_0\, \delta t^2 + \tfrac{1}{6} \dot{a}_0\, \delta t^3 + \mathcal{O}(\delta t^4),    (5.28a)
    \dot{x}_p = \dot{x}_0 + a_0\, \delta t + \tfrac{1}{2} \dot{a}_0\, \delta t^2 + \mathcal{O}(\delta t^3),    (5.28b)
    a_p = a_0 + \dot{a}_0\, \delta t + \mathcal{O}(\delta t^2).    (5.28c)
Next, we use the predicted particle positions to estimate the acceleration $a_1$ and jerk $\dot{a}_1$ at time $t_1$, using equations (5.1) and (5.26). We now have everything we require to calculate the higher order coefficients in equations (5.25). Substituting equation (5.25b), evaluated at $t_0$ and $t_1$, into equation (5.25a), we obtain:
    \ddot{a}_0 = \left[ 6(a_1 - a_0) - (4\dot{a}_0 + 2\dot{a}_1)\,\delta t \right] / \delta t^2,    (5.29a)
    \dddot{a}_0 = \left[ 12(a_0 - a_1) + 6(\dot{a}_0 + \dot{a}_1)\,\delta t \right] / \delta t^3.    (5.29b)

And finally, we can integrate equation (5.25a) to obtain the corrected position and velocity at time $t_1$:
    x_1 = x_p + \tfrac{1}{24} \ddot{a}_0\, \delta t^4 + \tfrac{1}{120} \dddot{a}_0\, \delta t^5,    (5.30a)
    \dot{x}_1 = \dot{x}_p + \tfrac{1}{6} \ddot{a}_0\, \delta t^3 + \tfrac{1}{24} \dddot{a}_0\, \delta t^4.    (5.30b)
At this point, we can also improve the estimate of $\ddot{a}$ at $t_1$, using $\ddot{a}_1 = \ddot{a}_0 + \dddot{a}_0\,\delta t$, which is useful for calculating the size of the next timestep (see below). Equations (5.30) comprise the corrector step. Applying this corrector step iteratively, replacing the predictor values with the corrector values in the above equations, gives us an implicit integration scheme. A single iteration (explicit) scheme is called a PEC scheme: Predict, Evaluate, Correct; further iterations are denoted P(EC)ⁿ, where n is the number of iterations (Kokubo et al. 1998).

²The scheme described here is called a Hermite scheme because the polynomial coefficients are determined from derivatives of the acceleration (Makino 1991b). Instead, one can also obtain the coefficients by matching accelerations over several previous time steps (the polynomial scheme; Aarseth 1963). This has the advantage that we need not compute derivatives of a, but the disadvantage that accelerations need to be stored, or calculated backwards in time. For this reason, we focus in this review only on the Hermite scheme. The interested reader is referred to the excellent book by Aarseth 2003b for further details of the polynomial or Aarseth scheme.

In the limit $n \rightarrow \infty$, we converge on the implicit Hermite solution, which is simply the combination of equations (5.28) and (5.30):
    x_1 = x_0 + \tfrac{1}{2}(\dot{x}_1 + \dot{x}_0)\,\delta t - \tfrac{1}{10}(a_1 - a_0)\,\delta t^2 + \tfrac{1}{120}(\dot{a}_1 + \dot{a}_0)\,\delta t^3,    (5.31a)
    \dot{x}_1 = \dot{x}_0 + \tfrac{1}{2}(a_1 + a_0)\,\delta t - \tfrac{1}{12}(\dot{a}_1 - \dot{a}_0)\,\delta t^2.    (5.31b)
Note the symmetry in the above equations between subscripts 0 and 1: the implicit Hermite scheme
is time symmetric (provided that the timesteps are time symmetric) and will therefore give excellent
energy conservation. The corollary of the above, however, is that the explicit Hermite scheme (which
is typically employed) is neither time symmetric nor symplectic. In practice, implicit integration
schemes are not employed because of the numerical cost of calculating the acceleration and jerk over
several iterations.
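To make the recipe concrete, here is a single predict-evaluate-correct (PEC) step of the fourth order Hermite scheme (equations 5.28-5.30) for a test particle around a fixed point mass with GM = 1. This is a sketch of our own; a real collisional code would sum the acceleration and jerk over all particles as in equations (5.1) and (5.26):

```python
import math

def acc_jerk(x, v):
    """Acceleration and jerk for a fixed point mass with GM = 1."""
    r = math.hypot(x[0], x[1])
    rv = x[0] * v[0] + x[1] * v[1]            # x . xdot
    a = [-xi / r**3 for xi in x]
    j = [-vi / r**3 + 3.0 * rv * xi / r**5 for xi, vi in zip(x, v)]
    return a, j

def hermite_pec(x, v, dt):
    """One PEC step of the 4th order Hermite integrator."""
    a0, j0 = acc_jerk(x, v)
    # Predictor (5.28)
    xp = [x[i] + v[i]*dt + a0[i]*dt**2/2 + j0[i]*dt**3/6 for i in (0, 1)]
    vp = [v[i] + a0[i]*dt + j0[i]*dt**2/2 for i in (0, 1)]
    # Evaluate at the predicted state
    a1, j1 = acc_jerk(xp, vp)
    # Higher derivatives at t0 (5.29): s ~ a-double-dot, c ~ a-triple-dot
    s = [(6*(a1[i] - a0[i]) - (4*j0[i] + 2*j1[i])*dt) / dt**2 for i in (0, 1)]
    c = [(12*(a0[i] - a1[i]) + 6*(j0[i] + j1[i])*dt) / dt**3 for i in (0, 1)]
    # Corrector (5.30)
    xc = [xp[i] + s[i]*dt**4/24 + c[i]*dt**5/120 for i in (0, 1)]
    vc = [vp[i] + s[i]*dt**3/6 + c[i]*dt**4/24 for i in (0, 1)]
    return xc, vc

# Circular orbit of radius 1 (period 2*pi): energy E = -0.5 should be held
x, v = [1.0, 0.0], [0.0, 1.0]
for _ in range(2000):                         # roughly three orbits
    x, v = hermite_pec(x, v, 0.01)
err = abs(0.5*(v[0]**2 + v[1]**2) - 1.0/math.hypot(x[0], x[1]) + 0.5)
print(err)  # remains tiny: the scheme is 4th order accurate
```

Note that each step costs two force-plus-jerk evaluations (one at $t_0$, one at the predicted state), which is the price paid for the much higher accuracy per step.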
A comparison of various flavours of the 4th order Hermite integrator is given in Figure 5.1, for the integration of an elliptical Kepler orbit with e = 0.9 over 100 orbits. Notice that the leapfrog integrator with fixed timestep conserves energy exactly (in the long term), but that the peak of the oscillations is initially two orders of magnitude worse than for the 4th order Hermite scheme with fixed steps (compare black lines in the middle and right panels). Over time, the energy losses accumulate for the Hermite integrator, causing the apocentre of the orbit to decay, but the orbital precession (both are numerical errors) is significantly less than for the leapfrog (compare black and red orbits in the left panel). It is the orbital stability and excellent energy accuracy that have made Hermite integrators popular for use in collisional N-body problems.
5.2.5 The choice of time-step
Given the enormous dynamic range in time involved in collisional N-body problems (ranging from days to giga-years), it has become essential to use variable timestep schemes (Aarseth 2003b). Early schemes used an individual time step for each particle. However, it is better to arrange the particles in a hierarchy of timesteps organised in powers of two, with reference to a base step $\delta t_0$ (Makino 1991a):

    \delta t_n = \delta t_0 / 2^n    (5.32)
for a given timestep rung n. Particles can then move between rungs at synchronisation points, as shown in Fig. 5.2. This block-step scheme leads to significant efficiency savings because particles on the same rung are evolved simultaneously. However, time symmetry with block stepping presents some challenges (e.g. Makino et al. 2006). A key problem is that, in principle, particles can move to lower timestep rungs whenever they like, but they may only move to higher rungs at synchronisation points where the end of the smaller step overlaps with the end step of a higher rung (see Fig. 5.2). This leads to an asymmetry in the timesteps, even if some discrete form of equation (5.24) is used. Makino et al. 2006 show that it is possible to construct a near-time symmetric block time step scheme, provided some iteration is allowed in determining the time step. Whether a non-iterative scheme is possible remains to be seen.
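The bookkeeping behind block timesteps is simple to sketch. Below, `rung` picks the smallest power-of-two subdivision of the base step that satisfies a particle's required timestep (equation 5.32), and `can_move_up` encodes the rule that a particle may only move to a larger step at a synchronisation point (the function names are ours, not from any particular code):

```python
import math

def rung(dt_required, dt0):
    """Smallest n with dt0 / 2**n <= dt_required (equation 5.32)."""
    if dt_required >= dt0:
        return 0
    return math.ceil(math.log2(dt0 / dt_required))

def can_move_up(t, n, dt0):
    """Moving from rung n to the larger step on rung n-1 is only allowed
    when the current time t coincides with a step boundary of rung n-1."""
    dt_larger = dt0 / 2 ** (n - 1)
    k = t / dt_larger
    return abs(k - round(k)) < 1e-12

print(rung(0.3, 1.0))             # 2, since dt0/4 = 0.25 <= 0.3 < dt0/2
print(can_move_up(0.5, 2, 1.0))   # True: 0.5 is a boundary of rung 1
print(can_move_up(0.25, 2, 1.0))  # False: mid-step for rung 1
```

Because all particles on a rung share the same boundaries, the force loop can batch them together, which is where the efficiency saving comes from.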
We now need some criterion to decide which rung a particle should be placed on. For low-order integrators like the leapfrog, we have only the acceleration to play with. In this case, a possible timestep criterion can be found by analogy with the Kepler problem:

    \delta t_i = \eta\, \sqrt{|\Phi_i|}\, /\, |a_i|,    (5.33)
where $\Phi_i$ is the gravitational potential of particle i, and $\eta$ is a dimensionless accuracy parameter. Substituting $\Phi_i = -GM/r_i$ and $|a_i| = GM/r_i^2$, valid for a particle at radius $r_i$ orbiting a point mass M, we see that this gives $\delta t_i = \eta \sqrt{r_i^3 / GM}$, i.e. exactly proportional to the dynamical time. However, a timestep criterion that depends on the potential is worrisome, since the transformation $\Phi \rightarrow \Phi + \mathrm{const.}$ has no dynamical effect, but would alter the timesteps. In applications like cosmological N-body simulations, where the local potential has significant external contributions, simulators have typically employed:

    \delta t_i = \eta\, \sqrt{\epsilon / |a_i|}    (5.34)
and similar, where $\epsilon$ is the force softening length. Equation (5.34) is really only defined on dimensional grounds: it creates a quantity with dimensions of time from a local length scale (the force softening) and the local acceleration. It is clear that this time step criterion would be of no use for, say, the Kepler problem, where it will lead to too small steps at large radii, and too large steps at small radii.

In a recent paper, Zemp et al. 2007 have attempted to solve the above conundrum by trying to determine what a particle is orbiting about. If this is known, then the dynamical time itself makes for a natural timestep criterion:

    \delta t_i = \eta\, \sqrt{r_i^3 / G M(r_i)},    (5.35)
where $M(r_i)$ is the mass enclosed within the particle's orbit from some attractor at distance $r_i$. Indeed, for an isolated system with a power law mass profile, it is straightforward to show that equations (5.33) and (5.35) are identical. Zemp et al. 2007 attempted to define M(r) based on information taken from a gravitational tree structure (see the next lecture). Such ideas lend themselves naturally to collisionless simulations, where a tree is often readily available as a by-product of the force calculation. But it remains to be seen if such a timestep criterion can be competitive for collisional N-body applications. Unlike many collisionless applications, in collisional N-body applications it is often well-defined from the outset what particles are orbiting about, at least until close interactions occur, in which case special treatment is in any case required. In addition, the higher order integrators typically employed provide a wealth of additional free information that can be used to determine the timestep. The fourth order Hermite integrator, for example, gives us $\dot{a}$, $\ddot{a}$ and $\dddot{a}$. Such considerations have motivated higher order timestepping criteria for collisional N-body applications.
Some immediately obvious choices for a higher order timestep criterion might be, for example, to set the time step based on the truncation error in the Hermite expansion:

    \delta t_i = \eta \left( |a_i| / |\ddot{a}_i| \right)^{1/2} \quad \mathrm{or} \quad \delta t_i = \eta \left( |a_i| / |\dddot{a}_i| \right)^{1/3}    (5.36)
or to use the error in the predictor step, as suggested by Nitadori and Makino 2008:

    \delta t_i = \eta\, \delta t_{\mathrm{old},i} \left( |a_i| / |a_i - a_{p,i}| \right)^{1/p}    (5.37)

where p is the order of the expansion and $\delta t_{\mathrm{old},i}$ is the previous timestep. However, while such criteria seem sensible, they are all out-performed by the seemingly mystic Aarseth 2003b criterion
    \delta t_i = \left( \eta\, \frac{|a_i|\,|\ddot{a}_i| + |\dot{a}_i|^2}{|\dot{a}_i|\,|\dddot{a}_i| + |\ddot{a}_i|^2} \right)^{1/2}    (5.38)

with $\eta \approx 0.02$. (For higher order schemes, this generalises to include higher derivatives of a; see Makino 1991b; Nitadori and Makino 2008.)
The success of equation (5.38) probably lies in the fact that it conservatively shrinks the time step if either $\ddot{a}$ or $\dddot{a}$ are large compared to the smaller derivatives, and it requires no knowledge of the previous timestep. Equations (5.36) give poorer performance because they do not use information about all known derivatives of a. Equation (5.37) gives poorer performance for large timesteps. It too uses information about all calculated derivatives of a (since it is based on the error between the predicted and true accelerations). But the problem is that it relies on the previous timestep for its normalisation. If $\delta t_{\mathrm{old},i}$ is too large, the criterion will not respond fast enough, leading to overly large timesteps and large energy losses (Nitadori and Makino 2008).
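As a sanity check on equation (5.38), the sketch below (our own) evaluates the criterion for a circular Kepler orbit, where each time derivative of the acceleration contributes one factor of the orbital frequency. The criterion then reduces to $\sqrt{\eta}$ times the dynamical time, exactly the scaling we argued for under equation (5.33):

```python
import math

def aarseth_dt(a, adot, addot, adddot, eta=0.02):
    """Aarseth timestep criterion (equation 5.38), given the magnitudes
    of the acceleration and its first three time derivatives."""
    return math.sqrt(eta * (a * addot + adot**2) / (adot * adddot + addot**2))

def dt_at(r):
    # Circular Kepler orbit with GM = 1: |a| = 1/r^2, and each derivative
    # brings one factor of the orbital frequency w = r**-1.5.
    w = r ** -1.5
    a = 1.0 / r**2
    return aarseth_dt(a, a * w, a * w**2, a * w**3)

print(dt_at(1.0))                # sqrt(eta) at r = 1
print(dt_at(4.0) / dt_at(1.0))   # 8 = 4**1.5: proportional to t_dyn
```

Because only ratios of derivative magnitudes enter, the overall normalisation of the acceleration cancels, which is part of why the criterion behaves so robustly.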
The above suggests that a conservative truncation error-like criterion that encompasses all derivatives of a might perform at least as well as the Aarseth criterion, while being (perhaps) more theoretically satisfying:

    \delta t_i = \min\left[ \eta \frac{|a|}{|\dot{a}|},\; \left( \eta^2 \frac{|a|}{|\ddot{a}|} \right)^{1/2},\; \left( \eta^3 \frac{|a|}{|\dddot{a}|} \right)^{1/3},\; \ldots,\; \left( \eta^p \frac{|a|}{|a^{(p)}|} \right)^{1/p} \right]    (5.39)

where p is the highest order of a calculated by the integrator; and $a^{(p)}$ is the $p$th derivative of a.
In the right panel of Fig. 5.1, we compare 4th order Hermite integrators with different variable timestep criteria for a Kepler orbit problem with e = 0.9. The black curve shows results for a fixed timestep with $\delta t = 0.001\, t_{\mathrm{orb}}$; the red curve shows results using a variable timestep and the Aarseth criterion (equation 5.38); and the red dotted curve shows results using equation (5.39). For the Aarseth criterion we use $\eta = 0.02$; for our new criterion in equation (5.39), we set $\eta$ in all cases such that exactly the same number of steps are taken over ten orbits as for the Aarseth criterion. The truncation error-like criterion (equation 5.39) appears to give very slightly improved performance for the same cost. However, whether this remains true for full N-body applications remains to be tested.
The above timestep criteria have been well tested for a wide range of problems and so appear to work well, at least for the types of problem for which they were proposed. However, there remains something unsatisfying about all of them. For some, changing the velocity or potential can alter the timestep and, with the possible exception of the criterion by Zemp et al. 2007, all are affected by adding a constant to the acceleration. This is unsatisfactory, since the internal dynamics of the system is not altered by any of these changes. Applying a constant uniform acceleration, generated for example by an external agent, to a star cluster is allowed by the Poisson equation and does not alter the internal dynamics, and thus should not drastically alter the timesteps. Only if the externally generated acceleration varies across the cluster does it affect its internal dynamics, an effect known as tides. This suggests using

    \delta t_i = \left( \eta / \|\nabla a\| \right)^{1/2}    (5.40)

where $\nabla a$ is the gradient of the acceleration and $\|\cdot\|$ denotes the matrix norm. Remarkably, for the Kepler problem this agrees with equation (5.33), while for isolated systems with power-law mass profiles it is very similar to equation (5.35). However, computing the gradient of a merely for the sake of the time step seems extravagant.
5.3 Close encounters and regularisation
A key problem when modelling collisional dynamics is dealing with the divergence in the force for $x_i \rightarrow x_j$ in equation (5.1), requiring prohibitively small timesteps (or large errors) with any of the above schemes. Consider our simple Kepler orbit problem. For a timestep criterion as in equation (5.33), this gives a timestep at pericentre $r_p$ of $\delta t^2 \sim \eta^2\, r_p^3 / GM$. Thus, for increasingly eccentric orbits, the timesteps will rapidly shrink, leading to a few highly eccentric particles dominating the whole calculation. To avoid this problem, collisional N-body codes introduce regularisation for particles that move on tightly bound orbits. The key idea is to use a coordinate transformation to remove the force singularity, solve the transformed equations, and then transform back to physical coordinates.
Consider the equations of motion for a perturbed two-body system with separation vector $\vec{R} = x_1 - x_2$ (using $R \equiv |\vec{R}|$):

    \ddot{\vec{R}} = -G(m_1 + m_2)\,\frac{\vec{R}}{R^3} + \vec{F}_{12},    (5.41)

where $\vec{F}_{12} = \vec{F}_1 - \vec{F}_2$ is the external perturbation. This, of course, still has the singularity at R = 0.
Now, consider the time transformation $dt = R\, d\tau$:

    \vec{R}'' = \frac{R'}{R}\,\vec{R}' - G(m_1 + m_2)\,\frac{\vec{R}}{R} + R^2\, \vec{F}_{12}    (5.42)

where $'$ denotes differentiation w.r.t. $\tau$. Note that we have removed the $R^{-2}$ singularity in the force, but gained another in the term involving $R'$. To eliminate that, we must also transform the coordinates. The current transformation of choice is the Kustaanheimo-Stiefel (K-S) transformation (Kustaanheimo et al. 1965; Yoshida 1982; Aarseth 2003b), which requires a move to four spatial dimensions. We introduce a dummy extra dimension in $\vec{R} = (R_1, R_2, R_3, R_4)$, with $R_4 = 0$, and transform this to a new four-vector $u = (u_1, u_2, u_3, u_4)$ such that $\vec{R} = \mathcal{L}(u)\,u$, with:
    \mathcal{L}(u) = \begin{pmatrix} u_1 & -u_2 & -u_3 & u_4 \\ u_2 & u_1 & -u_4 & -u_3 \\ u_3 & u_4 & u_1 & u_2 \\ u_4 & -u_3 & u_2 & -u_1 \end{pmatrix}    (5.43)
The inverse transformation is non-unique, since one of the components of u is arbitrary. In general, we may write:

    u_1^2 = \tfrac{1}{2}(R_1 + R)\cos^2\phi; \qquad u_2 = \frac{R_2 u_1 + R_3 u_4}{R_1 + R}    (5.44a)
    u_4^2 = \tfrac{1}{2}(R_1 + R)\sin^2\phi; \qquad u_3 = \frac{R_3 u_1 - R_2 u_4}{R_1 + R}    (5.44b)
where $\phi$ is a free parameter. It is a straightforward exercise to verify that equations (5.44) satisfy the transformation equation $\vec{R} = \mathcal{L}(u)\,u$. We also require a transformation between the velocities $\dot{\vec{R}}$ and $u'$. Writing $\vec{R}' = \mathcal{L}(u')\,u + \mathcal{L}(u)\,u' = 2\,\mathcal{L}(u)\,u'$, and using the relation $\mathcal{L}^T \mathcal{L} = R\, I$, gives:

    u' = \frac{1}{2R}\,\mathcal{L}^T \vec{R}' = \frac{1}{2}\,\mathcal{L}^T \dot{\vec{R}}    (5.45)
where the last relation follows from the time transformation $dt = R\, d\tau$. Substituting the K-S coordinate transform into equation (5.42) gives (Aarseth 2003b):

    u'' - \tfrac{1}{2} E\, u = \tfrac{1}{2} R\, \mathcal{L}^T \vec{F}_{12}    (5.46a)

where E is the specific binding energy of the binary, which is evolved as:

    E' = 2\, u' \cdot \mathcal{L}^T \vec{F}_{12}.    (5.46b)

(Note that the transformed time is given by $t' = |u|^2$, which follows from equation 5.42.) We can now see two important things. Firstly, there are no longer any coordinate singularities in equations (5.46). Secondly, in the absence of an external field ($\vec{F}_{12} = 0$), E = const. and our transformed equations correspond to a simple harmonic oscillator.
We can evolve the above regularised equations of motion using the Hermite scheme (described earlier in this section), so long as we can calculate $u'''$ and $E''$. These follow straightforwardly from the transformed time derivatives of equations (5.46):

    u''' = \tfrac{1}{2}\left( E' u + E u' + R'\, \vec{Q} + R\, \vec{Q}' \right)    (5.47a)
    E'' = 2\, u'' \cdot \vec{Q} + 2\, u' \cdot \vec{Q}'    (5.47b)

where $\vec{Q} = \mathcal{L}^T \vec{F}_{12}$ describes the external interaction term.
In the right panel of Figure 5.1, we show results for a Kepler orbit with eccentricity e = 0.9 integrated over 100 orbits using the K-S regularisation technique (blue). We use a Hermite integrator with variable timesteps, and the timestep criterion as in equation (5.39). For as many force calculations as the variable timestep Hermite integration scheme, the results are over 100 times more accurate. This is why K-S regularisation has become a key element in modern collisional N-body codes.
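The algebra of the K-S map is easy to verify numerically. The sketch below (our own) builds the matrix of equation (5.43), maps a u-vector to a physical separation via $\vec{R} = \mathcal{L}(u)u$, and checks the two defining properties used above: the dummy fourth component vanishes, and $|\vec{R}| = |u|^2$ (so the transformed time satisfies $t' = |u|^2$). It then inverts a separation using equations (5.44) with the free parameter $\phi = 0$:

```python
import math

def ks_matrix(u):
    """The K-S matrix L(u) of equation (5.43)."""
    u1, u2, u3, u4 = u
    return [[u1, -u2, -u3,  u4],
            [u2,  u1, -u4, -u3],
            [u3,  u4,  u1,  u2],
            [u4, -u3,  u2, -u1]]

def ks_to_R(u):
    """Physical separation R = L(u) u; the fourth component is zero."""
    L = ks_matrix(u)
    return [sum(L[i][j] * u[j] for j in range(4)) for i in range(4)]

def ks_from_R(R):
    """Inverse map, equations (5.44), taking the free parameter phi = 0."""
    Rmag = math.sqrt(R[0]**2 + R[1]**2 + R[2]**2)
    u1 = math.sqrt(0.5 * (R[0] + Rmag))   # cos(phi) = 1 branch
    u4 = 0.0                              # sin(phi) = 0 branch
    u2 = (R[1] * u1 + R[2] * u4) / (R[0] + Rmag)
    u3 = (R[2] * u1 - R[1] * u4) / (R[0] + Rmag)
    return [u1, u2, u3, u4]

u = [1.0, 2.0, 3.0, 4.0]
R = ks_to_R(u)
print(R)                                      # fourth component is 0
umag2 = sum(ui * ui for ui in u)              # |u|^2 = 30
Rmag = math.sqrt(sum(c * c for c in R[:3]))   # also 30: |R| = |u|^2
uu = ks_from_R(R[:3])
print(ks_to_R(uu)[:3])                        # round trip recovers R
```

(For separations approaching $R_1 = -R$, the $\phi = 0$ branch has a vanishing denominator; one instead chooses $\phi$ appropriately, which is exactly the role of the free parameter.)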
K-S regularisation as presented above works only for a perturbed binary interaction. However, it is readily generalised to higher order interactions, where for each additional star we must transform away another potential coordinate singularity (Aarseth and Zare 1974; Heggie 1974; Aarseth 2003b). In practice, this means introducing N coupled K-S transformations, which requires 4N(N - 1) + 1 equations, making extension to large N inefficient. For this reason, chain regularisation has become the state-of-the-art (Mikkola and Aarseth 1990; Mikkola and Aarseth 1993). The idea is to regularise only the close interactions between the N particles, rather than all inter-particle distances, which reduces the number of equations to just 8(N - 1) + 1, paving the route to high N. For interactions involving large mass ratios, other regularisation techniques can become competitive with the K-S chain regularisation (e.g. Mikkola and Aarseth 2002). This is particularly important for interactions between stars and supermassive black holes.
5.4 The use of special hardware
Many of the key algorithmic developments for collisional N-body simulations were advanced very early on, in the 1960s and 1970s (e.g. Aarseth 1963; Ahmad and Cohen 1973; Aarseth and Zare 1974; Heggie 1974). As a result, the field has been largely driven by the extraordinary improvement in hardware. From the early 1990s onwards, the slowest part of the calculation (the direct N-body summation that scales as $N^2$) was moved to special hardware chips called GRAPE processors (GRAvity PipE; Ito et al. 1990). The latest GRAPE-6 processor manages an impressive 1 Teraflop (Makino et al. 2003), allowing realistic simulations of star clusters with up to $10^5$ particles (e.g. Baumgardt and Makino 2003). However, to move toward the million star particle mark relevant for massive star clusters, several GRAPE processors must be combined in parallel. This became possible only very recently with the advent of the GRAPE-6A chip (Fukushige et al. 2005). The GRAPE-6A is lower performance (and cheaper) than the GRAPE-6, but specially designed to be used in a parallel cluster. Such a cluster was recently used by Harfst et al. 2007b to model a star cluster with $N = 4 \times 10^6$.

The GRAPE processors have been invaluable to the direct N-body community, and with the recently developed GRAPE-DR, they will continue to drive the field for some time to come (Makino 2008). However, concurrent with the further development of the GRAPE chips, significant interest is now shifting towards Graphical Processor Units (GPUs) for hardware acceleration. This is driven primarily by cost. Even the smaller and cheaper GRAPE-6A costs several thousand dollars at the time of writing and delivers ~150 GigaFlops of processing power. By contrast, GPUs deliver ~130 GigaFlops for just a couple of hundred dollars. The advent of a dedicated N-body library for GPUs makes the switch to GPUs even easier (Gaburov et al. 2009). Whether the future of direct N-body calculations lies in dedicated hardware or GPUs remains to be seen. A third way is entirely possible if new algorithms can make the force calculations more efficient. We discuss the prospects for this, next.
5.5 Alternatives to N-body simulations
We focus in this lecture mainly on N-body techniques for modelling self-gravitating systems. However, other techniques do exist in the literature and provide a valuable cross check of the results from N-body calculations. The primary alternative to N-body simulations for collisional systems are Fokker-Planck codes that attempt to solve the N-body problem using statistical methods. This means solving the collisional Boltzmann equation³ (e.g. Binney and Tremaine 2008):

    \frac{df}{dt} = \frac{\partial f}{\partial t} + \dot{\vec{x}} \cdot \frac{\partial f}{\partial \vec{x}} - \frac{\partial \Phi_{\mathrm{tot}}}{\partial \vec{x}} \cdot \frac{\partial f}{\partial \dot{\vec{x}}} = \Gamma[f]    (5.48)

where $f(\vec{w}, t)$ is the distribution function of stars (the continuum limit mass density of stars in six-dimensional phase-space $\vec{w} = (\vec{x}, \dot{\vec{x}})$ at time t); and $\Gamma$ is the encounter operator that describes the interaction between stars. In the limit $\Gamma[f] \rightarrow 0$, equation 5.48 recovers the collisionless Boltzmann equation.
The full expression for $\Gamma[f]$ depends on the two body, three body and all the way up to N-body interactions between stars within a system. Unfortunately, a full treatment then involves solving N-1 integrals known as the BBGKY hierarchy, after Bogoliubov, Born, Green, Kirkwood and Yvon, who all independently derived this approach (e.g. Binney and Tremaine 2008). We can make progress, however, by considering the effects of stellar scattering events as some local process that moves stars from a phase space point $\vec{w}$ to another one $\vec{w} + \Delta\vec{w}$ in a time $\Delta t$ with a given probability $\Psi(\vec{w}, \Delta\vec{w})\, d^6(\Delta\vec{w})\, \Delta t$. In this case, the encounter operator becomes:

    \Gamma[f] = \int d^6(\Delta\vec{w}) \left[ \Psi(\vec{w} - \Delta\vec{w}, \Delta\vec{w})\, f(\vec{w} - \Delta\vec{w}) - \Psi(\vec{w}, \Delta\vec{w})\, f(\vec{w}) \right]    (5.49)
Equation 5.48 with equation 5.49 is called the master equation.

The master equation avoids the need to calculate a hierarchy of N-1 integral equations to evaluate $\Gamma[f]$. But, as with all approximations in physics, we do not get something for nothing. In substituting the master equation for the collisional Boltzmann equation, our resulting equation is no longer time reversible. This means that a high density feature in f will spread out over time due to the action of $\Gamma[f]$, but that the reverse cannot occur. This asymmetry enters because of the assumption (implicit in equation 5.49) that $\Psi$ and f are independent: that is, that the scattering rate is a simple product of f and $\Psi$.

³We follow here the approach and derivations as in the excellent Binney and Tremaine 2008.
To solve the master equation, we must calculate the scattering probability $\Psi$. Earlier, we noted that relaxation is driven primarily by long range interactions (see §1). This is useful, since we may then make the approximation that $|\Delta\vec{w}|$ is small as compared to $|\vec{w}|$, and expand the encounter operator as a Taylor series (e.g. Binney and Tremaine 2008):

    \Psi(\vec{w} - \Delta\vec{w}, \Delta\vec{w})\, f(\vec{w} - \Delta\vec{w}) = \Psi(\vec{w}, \Delta\vec{w})\, f(\vec{w}) - \sum_{i=1}^{6} \Delta w_i\, \frac{\partial}{\partial w_i}\left[ \Psi(\vec{w}, \Delta\vec{w})\, f(\vec{w}) \right] + \frac{1}{2} \sum_{i,j=1}^{6} \Delta w_i\, \Delta w_j\, \frac{\partial^2}{\partial w_i\, \partial w_j}\left[ \Psi(\vec{w}, \Delta\vec{w})\, f(\vec{w}) \right] + \mathcal{O}(\Delta w^3)    (5.50)
The Fokker-Planck approximation truncates this series after the second order terms, such that the encounter operator becomes:

    \Gamma[f] \approx -\sum_{i=1}^{6} \frac{\partial}{\partial w_i}\left[ D_i\, f(\vec{w}) \right] + \frac{1}{2} \sum_{i,j=1}^{6} \frac{\partial^2}{\partial w_i\, \partial w_j}\left[ D_{ij}\, f(\vec{w}) \right]    (5.51)
where the diffusion coefficients are given by:

    D_i(\vec{w}) = \int d^6(\Delta\vec{w})\, \Psi(\vec{w}, \Delta\vec{w})\, \Delta w_i; \qquad D_{ij}(\vec{w}) = \int d^6(\Delta\vec{w})\, \Psi(\vec{w}, \Delta\vec{w})\, \Delta w_i\, \Delta w_j    (5.52)
The diffusion coefficients define the rate at which stars diffuse through phase space as a result of stellar encounters. In principle, we can include higher order terms in the Taylor expansion to increase accuracy. However, such corrections are small and this is not done in practice (Binney and Tremaine 2008).

The beauty of the Fokker-Planck approach is that the encounter operator now depends only on the diffusion coefficients, which are a function only of the local phase space coordinates $\vec{w}$. However, we must still calculate these coefficients. This follows from the assumptions that: (i) encounters are local; (ii) each encounter is independent of all others; and (iii) each encounter involves only a single pair of stars. In this case, the diffusion coefficients can be calculated by summing over the random velocity kicks from many binary scattering events to give (Binney and Tremaine 2008):
    D_i = 4\pi G^2 m_a (m + m_a) \ln\Lambda\; \frac{\partial}{\partial v_i} \int d^3 v_a\, \frac{f_a(\vec{x}, \vec{v}_a)}{|\vec{v} - \vec{v}_a|}    (5.53)
    D_{ij} = 4\pi G^2 m_a^2 \ln\Lambda\; \frac{\partial^2}{\partial v_i\, \partial v_j} \int d^3 v_a\, f_a(\vec{x}, \vec{v}_a)\, |\vec{v} - \vec{v}_a|    (5.54)
where ln Λ is the familiar Coulomb logarithm; m and m_a are the masses of the star being scattered
and the field stars, respectively; and the field star distribution function is normalised such that
∫ d³v_a f_a(v_a) = n, the number density of field stars. Note that this approach still leaves us with some
free parameters in the minimum and maximum stellar impact parameters (b_min; b_max) that define the
Coulomb logarithm: Λ = b_max/b_min. However, both may be reliably estimated to within an order of
magnitude, and since they appear inside a logarithm, the effect of even large uncertainties is greatly
diminished.
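This insensitivity is easy to quantify. The sketch below (our own illustration, with purely indicative numbers for a globular-cluster-like system) varies both impact parameters by a full order of magnitude each way and shows that ln Λ changes by well under a factor of two:

```python
import math

# Coulomb logarithm ln(Lambda) = ln(b_max / b_min). Illustrative numbers:
# b_max ~ system size R, b_min ~ the 90-degree deflection radius.
R = 10.0       # pc (assumed)
b_min = 1e-4   # pc (assumed)
ln_lambda = math.log(R / b_min)

# Vary both impact parameters by an order of magnitude each way:
worst_low = math.log((R / 10) / (b_min * 10))
worst_high = math.log((R * 10) / (b_min / 10))

print(round(ln_lambda, 2))                 # 11.51
print(round(worst_low / ln_lambda, 2),     # 0.6
      round(worst_high / ln_lambda, 2))    # 1.4
```

A combined factor-of-100 uncertainty in Λ thus shifts ln Λ by only ∼40%, which is why the Fokker-Planck results are so forgiving of the choice of b_min and b_max.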
Finally, we must connect the density of stars to the gravitational potential through Poisson's
equation:
\[
\nabla^2 \Phi = 4\pi G \int d^3\mathbf{v}\, f \qquad (5.55)
\]
We now have everything we need to solve the combined Fokker-Planck (FP) and Poisson equations.
However, equation 5.48 is six-dimensional, plus time. This is problematic since numerically
sampling the distribution function is very expensive: even a million sample points will provide only
ten sample points per dimension. For this reason, in practice, several methods have been employed
in the literature that reduce the dimensionality of the problem: the orbit averaging technique; the
Monte-Carlo approach; and fluid models. We briefly describe each of these, next.
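The cost of sampling a six-dimensional function scales brutally; a one-line check (our own illustration) of the points-per-dimension count N^(1/6):

```python
# Points per dimension for N total samples filling a 6D phase space uniformly.
for N in (1e6, 1e9, 1e12):
    print(f"N = {N:.0e}: {round(N ** (1.0 / 6.0))} points per dimension")

# Conversely, even a modest 100 points per dimension needs:
print(f"{100 ** 6:.0e} samples")
```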
5.5.1 Orbit averaging
Orbit averaging averages the net effect of the FP equation over each orbit for each star. Assuming that
the stellar system can be described completely by a sum over regular orbits, and that the potential
varies slowly with time, we may write the distribution function at an instant as a function of three
action coordinates (Binney and Tremaine 2008). Integrating over the three associated angle variables,
we may then halve the spatial dimensionality of the whole FP equation, making it more amenable
to numerical integration. The resultant equations may then be solved on a mesh. Following the
pioneering work of Cohn 1979, typically this has been done in two dimensions, in the specific
energy and z-component of the angular momentum (the actions for an axisymmetric system; e.g.
Takahashi 1995). We are unaware of any fully 3D orbit-averaged FP calculation to date.
5.5.2 Monte-Carlo
The Monte Carlo approach samples the stellar system using a subset of N_s tracer particles called
super-stars. The orbit of each of these super-stars is then integrated in a background potential
(calculated from the density of all super-stars), but including diffusion in velocity as calculated from
repeated scattering events acting on each star. This amounts to a Monte-Carlo evaluation of the
Fokker-Planck diffusion coefficients, hence the name of the method. Since each sample point of the
stellar system represents many stars, we must then assume that the potential is spherical. In this way,
each super-star represents many stars of like pericentre and apocentre, but different orbital plane (e.g.
Hénon 1971; Freitag and Benz 2001).
5.5.3 Fluid Models
Finally, the fluid approach takes moments of the FP equations, resulting in fluid-like equations
(Larson 1970). The advantage of the method is that it does not require any orbit averaging of the FP
equation. However, the hierarchy of moment equations must be closed by some assumption about the
velocity distribution function. In addition, there is an implicit assumption that all scattering events
are local and that scattered stars never move far from their initial phase space position.
5.5.4 Critique
Despite the large number of assumptions that go into workable FP codes (local binary-only encounters,
independent encounters, an assumed Coulomb logarithm, etc.), the agreement with full N-body models
is remarkable (e.g. Spurzem and Takahashi 1995; Heggie et al. 1998; Kim et al. 2008). Unfortunately,
given such a large number of assumptions, at the present time Fokker-Planck codes mainly serve to
increase our understanding of the N-body simulations, rather than as an independent cross-check of
the results. However, given the enormous and rapid advance in computational power, it is perhaps time
to revisit the Fokker-Planck method afresh. We may at last begin to think about a direct integration
of the six-dimensional Fokker-Planck equation, perhaps including higher order terms in the expansion
of the encounter operator, or estimating the encounter operator from the BBGKY hierarchy. (This
latter has the advantage that it avoids the need for an assumed Coulomb logarithm.) There remain
significant numerical challenges to such an approach, but it holds the promise of a robust method for
solving collisional N-body dynamics that relies on very different assumptions to the N-body method.
5.6 Applications
We consider next two interesting applications of collisional N-body codes to real astrophysics problems.
5.6.1 The stability of the Solar system
One interesting application of our above apparatus is predicting what will happen to our Solar system
over the next few billion years. Although almost all of the gravitational force comes from the Sun, the
Solar system is still a highly collisional system because of resonance. Over billions of orbits, resonances
between the planets can excite significant interactions and chaotic motions that could in principle
destroy the Solar system. Such a calculation is particularly challenging because of the enormous
number of dynamical times involved. The force calculations are easy, but the timestepping is very,
very hard! Calculations performed recently by Laskar and Gastineau 2009 show that such resonances
over the next 5 Gyrs can significantly pump the eccentricity of Mercury. Of 201 calculations, each
varying the initial conditions by just 3.8k cm (yes, centimetres!) with −100 ≤ k ≤ 100, 34 ended with
Mercury colliding with the Sun, and 86 with Mercury colliding with Venus (see Figure 5.3). We may
reasonably ask then why the Solar system (that is already billions of years old) hasn't already fallen
apart. The answer is that relativistic corrections to the orbits matter! Although the Solar system
might not seem relativistic, the post-Newtonian terms coming from general relativity break the perfect
resonance properties of the Kepler orbits (they precess). This breaks the resonant pumping effect and
significantly reduces the probability of Mercury's eccentricity rising. Including these corrections (and
that from lunar contributions), only ∼1% of solutions lead to Mercury colliding with another planet
or the Sun. It seems that our Solar system is as stable as it is because of Einstein's theory of gravity!
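Why is the timestepping, rather than the force, the hard part? Gyr-long integrations demand integrators whose energy error stays bounded over millions of orbits. A toy sketch (our own illustration, with G = M = 1; not the mixed-variable symplectic schemes actually used by Laskar and Gastineau) shows the key property of the symplectic leapfrog scheme on a Kepler orbit:

```python
import numpy as np

# Leapfrog (kick-drift-kick) integration of a Kepler orbit, G = M = 1.
# Being symplectic, leapfrog has a bounded (oscillatory, non-growing)
# energy error, which is what makes very long integrations feasible.
def accel(x):
    r = np.linalg.norm(x)
    return -x / r**3

def energy(x, v):
    return 0.5 * v @ v - 1.0 / np.linalg.norm(x)

x = np.array([1.0, 0.0])  # start at pericentre, distance 1
v = np.array([0.0, 1.1])  # slightly faster than circular: eccentric orbit
dt = 0.01

E0 = energy(x, v)
max_err = 0.0
for step in range(50000):  # ~ 50 or more orbital periods
    v = v + 0.5 * dt * accel(x)
    x = x + dt * v
    v = v + 0.5 * dt * accel(x)
    max_err = max(max_err, abs(energy(x, v) - E0))

print(max_err)  # stays small and bounded, with no secular drift
```

A non-symplectic scheme of the same order (e.g. forward Euler or classical RK4 at large steps) would instead show energy drifting steadily, destroying the orbit long before 5 Gyr of model time.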
5.6.2 Runaway growth of black holes in star clusters
A second interesting application is the physical collision of massive stars at the centre of dense star
clusters. We showed in §1 that physical stellar collisions are usually extremely unlikely. However,
these can occur at the very centres of the most massive star clusters with reasonable probability. This
is because of a process called the gravo-thermal catastrophe. A curious property of self-gravitating
systems is that they have negative specific heat. That means that they heat up when they lose energy!
This strange property may be understood from the virial theorem. The virial theorem, in its simplest
form, is just:
\[
2T + V = 0 \qquad (5.56)
\]
where T and V are the kinetic and potential energies respectively.
The total energy is just E = T + V and thus:
\[
E = -T = V/2 \qquad (5.57)
\]
By losing energy the system gets hotter! But, and this is the important part, the potential energy
decreases twice as fast as the kinetic energy increases. So we are not getting something for nothing.
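A circular Kepler orbit makes this concrete. The sketch below (our own illustration, in units G = M = m = 1) removes energy from a virialised orbit and confirms that the kinetic energy rises while the potential energy drops by exactly twice as much:

```python
# Circular orbit about a point mass, G = M = m = 1:
# T = 1/(2r), V = -1/r, so E = T + V = -1/(2r) = -T (the virial theorem).
def virial_state(r):
    T = 1.0 / (2.0 * r)
    V = -1.0 / r
    return T, V, T + V

T1, V1, E1 = virial_state(r=1.0)

# Remove energy: E becomes more negative, so the orbit shrinks to r = 1/(2|E|).
E2 = E1 - 0.25
r2 = -1.0 / (2.0 * E2)
T2, V2, _ = virial_state(r2)

print(T2 > T1)  # kinetic energy (temperature) went UP after LOSING energy
print(abs((V2 - V1) / (T2 - T1) + 2.0) < 1e-9)  # and dV = -2 dT
```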
The negative specific heat capacity of self-gravitating systems means that they are all intrinsically
unstable: there is no (finite) equilibrium state. To see why this is, we use a thought experiment due
to Donald Lynden-Bell. Imagine a hot core of stars surrounded by a cooler outer halo. The central hot
dense component loses heat to the outer halo by scattering stars. The result is that the hot core loses
energy. But then, because of the negative specific heat, it heats up and shrinks. But now it is hotter,
so more heat flows to the cold halo and the core shrinks further. It continues to shrink indefinitely
in a runaway process known as the gravo-thermal catastrophe. Fortunately, the end-state of such
shrinking rarely has time to occur in nature. But here is something to think about: where would such
a process end? What is the state of maximum entropy for a self-gravitating system? Is it even sensible
to use the standard concept of entropy?
The above process has been observed (inferred) to occur in globular clusters. Heat flow from the
core outwards occurs either by three body scattering events with binary stars in the core (ejection),
or by many gradual encounters (evaporation). As more and more stars are ejected or evaporated, the
central core of the cluster shrinks. This core collapse is simply a manifestation of the gravo-thermal
catastrophe discussed above. In Figure 5.4, we plot the central density as a function of time of a
dense, collisional, N-body star cluster modelled by Makino 1996. The different curves show increasing
simulation resolution up the page, offset vertically for clarity. Notice that the density rises with time
as the gravo-thermal catastrophe sets in, forming a sharp spike at around a time of 350 in simulation
units. This is the moment where the density is high enough for physical stellar collisions to occur.
But the density does not rise indefinitely. At some point the density becomes sufficiently high that
the binary stars that drive the heat flow start to gravitationally heat the core itself. Once the core
becomes kinematically hotter than the surrounding stars, there is a temperature inversion and the
core will cool and expand (remember: self-gravitating systems have negative specific heat!). But again,
this expansion phase cannot continue indefinitely. Once the core drops in temperature sufficiently, the
Figure 5.3: (Figure taken from Laskar & Gastineau 2009.) Evolution of the maximum eccentricity
of Mercury (computed over 1 Myr intervals) over 5 Gyr. a, Pure Newtonian model without the
contribution of the Moon, for 201 solutions with initial conditions that differ by only 3.8 cm in the
semi-major axis of Mercury. b, Full Solar System model with relativistic and lunar contributions, for
2,501 solutions with initial conditions that differ by only 0.38 mm in the semi-major axis of Mercury.
[Embedded page from Makino 1996, reproduced here as Figure 5.4. The key panel (their Fig. 1) carries the caption: "Logarithm of the central density plotted as a function of the scaled N-body time. Curves for different values of N are vertically shifted by 3 units."]
Figure 5.4: (Figure taken from Makino 1996.) Gravo-thermal oscillations caused by repeated core
collapse of a globular cluster.
temperature inversion ends and core collapse can recommence. This cycle of events is called gravo-
thermal oscillations. You can see precisely this behaviour in the late time evolution of the core density
in Figure 5.4. Apart from being intrinsically fascinating, gravo-thermal oscillations may have a key
role to play in the formation and evolution of supermassive black holes and therefore galaxies. There
appear to be two types of black hole in the Universe: stellar mass black holes that we believe form
from the death of massive stars; and so-called supermassive black holes that reside at the centres of
galaxies, with masses in the range 10^6 - 10^9 M_⊙. It remains an unsolved problem how supermassive
black holes form. The tricky part is forming black holes more massive than a few thousand Solar
masses. Above this mass, gas accretion at the centre of giant galaxies can drive the mass of the
hole rapidly up to even billions of Solar masses. But getting to a few thousand Solar masses is the
bottle-neck. One theory for how to get there is to form a supermassive star, or black hole, at the centre
of a dense star cluster during a gravo-thermal oscillation spike. The calculation is difficult because
it is hard to model what should happen when two stars physically collide. But Portegies Zwart and
McMillan 2002 started by assuming the very simplest thing: that colliding stars stick together to form
a more massive star. The result of their calculation is shown in Figure 5.5. The x-axis shows the
mass of a stellar system, ranging from star clusters in the mass range 10^2 - 10^6 M_⊙, to galaxies in
the range 10^8 - 10^12 M_⊙; the y-axis shows the mass of the black hole at the centre of such stellar
systems. The data points rightwards of 10^6 M_⊙ show detections or limits from real observations of
galaxies and massive star clusters. The points with error bars below 10^6 M_⊙ show the results of the
calculations of Portegies Zwart and McMillan 2002. These plot the final mass of the massive central
object formed through stellar collisions at the centre of a dense star cluster. Notice that the mass
of this object rises with the mass of the star cluster, reaching ∼1000 M_⊙ for a star cluster mass of
∼10^5 M_⊙. Assuming that their simple assumption that colliding stars stick is approximately OK,
this could then be a viable mechanism for forming the seeds of galactic supermassive black holes.
Such supermassive black holes appear to significantly affect how galaxies form, since they return a
significant fraction of their rest-mass energy to the surrounding inter-stellar gas. Thus understanding
how supermassive black holes form and evolve is key to understanding how the galaxies they live in
form and evolve, and ultimately key to understanding our own place in the Universe.
[Embedded page from Portegies Zwart & McMillan 2002, reproduced here as Figure 5.5. The key panel (their Fig. 3) plots black hole mass against stellar system/galaxy mass, both in log(M/M_⊙), with the stellar remnant and supermassive BH regimes marked.]
Figure 5.5: (Figure taken from Portegies Zwart & McMillan 2002.) Runaway growth of massive stars
at the centre of a large star cluster, driven by core collapse.
Lecture 6
Collisionless N-body systems
In this lecture, we discuss how to simulate collisionless N-body systems. These typically have N ≫ 10^6
and the force evaluation, rather than the timestepping, is the hard part. We therefore spend some time
discussing how to efficiently calculate the force for systems with very large N > 10^9. This lecture
largely follows our review article (Dehnen and Read 2011); applications will follow later at the end of
the course.
6.1 The continuum limit: the collisionless Boltzmann equation
For star clusters, with ∼10^5 stars, we may use direct summation techniques, as discussed in the
previous lecture. But our own Galaxy has some 10^10 stars and (possibly) an unimaginable number of
dark matter particles. The trick to getting around the N^2 problem in this case is to use the fact that
galaxies are collisionless to a very good approximation (see §1). Assuming N is very large, we may
move to the continuum limit, describing the collisionless system by a smooth differentiable distribution
function f(x, ẋ, t) that obeys the collisionless Boltzmann equation:
\[
\frac{df}{dt} = \frac{\partial f}{\partial t} + \dot{\mathbf{x}} \cdot \frac{\partial f}{\partial \mathbf{x}} - \nabla\Phi \cdot \frac{\partial f}{\partial \mathbf{v}} = 0 \qquad (6.1)
\]
where we have used the fact that ẍ = −∇Φ. Observables are then extracted from f by taking
moments, for example the density and the gravitational potential:
\[
\rho = \int f\, d^3\mathbf{v} \;;\qquad \Phi = -G \int d^3\mathbf{x}'\, \frac{\rho(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|} \qquad (6.2)
\]
Now, in principle we can try to solve equation 6.1 numerically, expanding f in some orthogonal basis
set, for example. However, this is hard for two reasons. The first problem is that f is six-dimensional.
Thus, with 10^6 grid points we would still only sample f with just 10 points per dimension. This
makes large problems extremely challenging. But the second problem is more fundamental and is a
consequence of Liouville's theorem: for collisionless systems, 6D phase space density is conserved (see
Appendix H). To illustrate why this is a problem, consider a 1D problem where N-body particles
Appendix H). To illustrate why this is a problem, consider a 1D problem where N-body particles
move under a 1D gravitaional force along a line. In this case, phase space is two dimensional (q, p),
with momentum p = m q. Suppose that we start with a square of particles in phase space with
q
min
< q < q
max
; p
min
< p < p
max
(see Figure 6.1). Imagine rst the system evolving without any
forces acting. The top of the square moves more quickly than the bottom causing the square to shear
out into a line; this is what we see for the very early time evolution (t = 1 in Figure 6.1). Once
we add the 1D gravitational force, all particles will have an orbital turning point at some q
max
at
which they turn around and fall back, moving to negative p on the phase diagram. The result as time
marches forward is that the particles wrap up in phase space to form an every more tightly wound
spiral pattern as shown in Figure 6.1. The area in the spiral must equal the area in the initial square
by Liouvilles theorem, but this becomes increasingly dicult to measure as the sprial becomes ever
more tightly wound. And now we see the diculty. A faithful numerical solution of equation 6.1
Figure 6.1: (Figure taken from Dehnen & Read 2011.) Phase-mixing of 10^4 points in the one-dimensional
Hamiltonian H = p^2/2 + |q|. This Hamiltonian corresponds to massless tracer particles orbiting a central
point mass in one-dimensional gravity (where phase-space is two-dimensional). The fine-grained distribution
function is either 1 or 0, but at late times a smooth coarse-grained distribution appears. Note that dynamical
mixing in 3D self-gravitating systems is much faster and stronger than with this toy model.
must follow the winding up of this spiral, which becomes increasingly difficult numerically. Indeed, by
t = 1000 the N-body numerical solution shown in Figure 6.1 has already lost the spiral pattern due
to a lack of numerical resolution.
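The phase mixing of Figure 6.1 is easy to reproduce. A minimal sketch (our own toy code, not the original) evolves points in H = p^2/2 + |q|, i.e. a constant restoring force F = −sign(q), and checks two signatures of mixing: each particle stays near its initial energy contour, while an initially one-sided patch spreads over the whole contour:

```python
import numpy as np

# Phase mixing toy: 1D Hamiltonian H = p^2/2 + |q|, force F = -sign(q).
# An initial square patch in (q, p), entirely at q > 0, winds up into an
# ever more tightly wrapped spiral because the period depends on H.
rng = np.random.default_rng(1)
n = 10000
q = rng.uniform(0.5, 1.0, n)    # initial square patch in phase space
p = rng.uniform(-0.25, 0.25, n)

dt = 0.005
for step in range(int(30 / dt)):  # leapfrog to t = 30 (several periods)
    p -= 0.5 * dt * np.sign(q)    # kick
    q += dt * p                   # drift
    p -= 0.5 * dt * np.sign(q)    # kick

# Each particle stays close to its initial energy contour (H in ~[0.5, 1.03])...
H = 0.5 * p**2 + np.abs(q)
print(0.35 < H.min() and H.max() < 1.2)
# ...but the patch phase-mixes: initially ALL q were positive, now about half.
print(0.3 < float(np.mean(q > 0)) < 0.7)
```

Because the orbital period grows with H, neighbouring energy contours drift apart in phase, which is exactly what shears the square into the spiral of Figure 6.1.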
6.2 The collisionless N-body method and force softening
In practice, the winding up of phase space should not be a fundamental problem, because we are
usually only interested in the observables that are the moments of the distribution function f. Once
we calculate a moment like the density, we are necessarily averaging f over some volume. For a tightly
wound spiral in phase space, it does not matter that the spiral is unresolved so long as the window
function for calculating the density is larger than the spacing between spiral wraps. We see that the
winding problem is only present if we try to directly solve equation 6.1. If we work instead with
moments or discrete samples of f from the get-go, then this problem goes away. This is the real
advantage of the N-body technique as applied to collisionless systems. The key idea is to sample
the distribution function with a discrete set of N points. In this sense, the N-body method is a
Monte Carlo method, since the sample points are usually randomly drawn from some assumed initial
distribution function. But then these points are evolved using the equation of motion ẍ = −∇Φ. From
this moment onwards, the N-body method is a Method of Characteristics, since the N-body particles
represent individual characteristic solutions of the evolution of f under the collisionless Boltzmann
equation.

It is important to emphasise that for collisionless N-body applications the particles no longer
represent physical self-gravitating entities (as was the case for the collisional simulations discussed
in §5); rather they are discrete sample points of the smooth distribution function f that move with
the flow. Observables follow as above from moments over the particles:
\[
\langle g \rangle(t) \equiv \int\!\!\int d^3\mathbf{x}\, d^3\dot{\mathbf{x}}\; g(\mathbf{x}, \dot{\mathbf{x}})\, f(\mathbf{x}, \dot{\mathbf{x}}, t) \approx \sum_i \mu_i\, g\big(\mathbf{x}_i(t), \dot{\mathbf{x}}_i(t)\big). \qquad (6.3)
\]
where the μ_i are appropriately chosen weights. These moments are accurate so long as the spatial
scale that one is interested in sampling is well-populated by particles. Thus, the particle number N
is a numerical parameter that controls the spatial resolution of the simulation.
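Monte-Carlo moment estimation in action: the sketch below (our own illustration) draws N samples from a known 1D distribution, estimates a moment with equal weights 1/N, and shows the sampling error shrinking as 1/sqrt(N):

```python
import numpy as np

# Monte Carlo moment estimation: approximate an integral over f by a sum
# over N sample points with equal weights 1/N. Here f is a unit Gaussian
# in 1D and the moment is the velocity dispersion <v^2> (true value 1).
rng = np.random.default_rng(42)

for N in (100, 10_000, 1_000_000):
    v = rng.normal(0.0, 1.0, N)   # N samples drawn from f
    moment = np.sum(v**2) / N     # equal weights 1/N
    # error scaled by sqrt(N) stays O(1), i.e. the raw error ~ 1/sqrt(N):
    print(N, round(abs(moment - 1.0) * np.sqrt(N), 2))
```

The same 1/sqrt(N) behaviour governs how well an N-body realisation resolves the density on any given spatial scale, which is why N directly sets the effective resolution.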
There is one key remaining problem, however: what happens if two particle trajectories approach
one another? Suppose we represent our Milky Way galaxy with N = 1000 sample points. It has a
total dynamical mass of M ≈ 10^12 M_⊙, and thus the mass of each particle will be 10^9 M_⊙! If we model
the force from each particle as that of a point mass, then we will obtain very large (and incorrect)
accelerations as the two particles approach. For low velocity encounters, the particles could even form
an unphysical tight binary pair. How should we cope with this numerical problem? One possibility is
to detect when this occurs and simply stop the simulation: after all, it simply means that our particle
masses are too large and we should re-simulate using larger N. However, this would make all but the
most trivial simulations impossible to run. A better strategy can be found by noting that such close
passages are rare and fast. Thus, if we can integrate the trajectories through such encounters with
minimal force between the particles, then we can continue the N-body simulation for much longer
times without the need to rapidly increase N. This is the goal of force softening. The idea is to replace
the point mass potential for each particle by a smoothed, softened, potential. A common choice is to
use the potential for a Plummer sphere (see Appendix F), for which the force between two particles i
and j is given by:
\[
\mathbf{F}_{ij} = \frac{G m^2 (\mathbf{x}_j - \mathbf{x}_i)}{\left(\epsilon^2 + |\mathbf{x}_i - \mathbf{x}_j|^2\right)^{3/2}} \qquad (6.4)
\]
where ε is called the force softening. Notice that the above force tends to a point mass force for
|x_i − x_j|^2 ≫ ε^2, but no longer diverges for |x_i − x_j| → 0.
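A sketch of equation 6.4 (our own code, in units G = m = 1) confirms the two limits: the softened force matches the Newtonian 1/r^2 law at large separation, while remaining finite (indeed vanishing) as the separation goes to zero:

```python
import numpy as np

# Plummer-softened pairwise force of equation 6.4, with G = m = 1.
def softened_force(xi, xj, eps):
    dx = xj - xi
    r2 = np.dot(dx, dx)
    return dx / (eps**2 + r2) ** 1.5  # -> point-mass force for r >> eps

eps = 0.1
xi = np.zeros(3)

# Far from the particle, the softened force matches Newton's 1/r^2 law...
far = np.array([10.0, 0.0, 0.0])
F_soft = softened_force(xi, far, eps)
F_newt = (far - xi) / np.dot(far - xi, far - xi) ** 1.5
print(abs(np.linalg.norm(F_soft) / np.linalg.norm(F_newt) - 1.0) < 1e-3)

# ...but it stays finite (tending to zero) as the separation vanishes.
print(np.linalg.norm(softened_force(xi, xi + 1e-12, eps)))
```

Because the force goes smoothly to zero at zero separation, close passages no longer produce the huge velocity kicks that would otherwise create unphysical tight binaries.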
Unfortunately, there is as of yet no solid theoretical basis for any particular choice of ε or softened
force function. However, a reasonable choice is to make ε as small as possible while avoiding two-body
energy exchange between particles over the lifetime of the simulation (thus ensuring that the simulation
is indeed collisionless). We require that the maximum acceleration on a particle (a_max ≈ Gm/ε^2, where
m is the particle mass) is less than the minimum mean-field acceleration (a_min ≈ GM_tot/R^2, where
M_tot and R are the total mass and scale length of the system). This gives:
\[
\epsilon > \frac{R}{\sqrt{N_{\rm tot}}} = \epsilon_{\rm min} \qquad (6.5)
\]
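For example (our own illustrative numbers), for a halo of scale length R = 200 kpc sampled with various particle numbers, equation 6.5 gives:

```python
import math

# Minimum force softening of equation 6.5: eps_min = R / sqrt(N_tot).
# Illustrative numbers: a halo of scale length R = 200 kpc.
R_kpc = 200.0
for N in (1e3, 1e6, 1e9):
    eps_min = R_kpc / math.sqrt(N)
    print(f"N = {N:.0e}: eps_min = {eps_min:.3g} kpc")
    # -> 6.32 kpc, 0.2 kpc and 0.00632 kpc respectively
```

Note how the permissible softening, and hence the smallest resolvable force scale, shrinks only as 1/sqrt(N): an important part of why collisionless simulations are so hungry for particles.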
In practice, a few times \epsilon_{\rm min} is a safe choice. This may all seem a bit voodoo at this stage and, un-
fortunately, it remains so at present. There have been many papers exploring different force softening
formulae (called kernels; e.g. Dehnen 2001) and varying softening (e.g. Power et al. 2003). Luckily,
most simulations appear not to be sensitive to these choices and produce converged results for quite
different (but reasonable) choices. However, we will discuss some special cases that may be sensitive
to force softening in later lectures.
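As a quick numerical illustration of equation 6.5 (the numbers below are illustrative and our own, not from the text):

```python
import math

def eps_min(R, N):
    """Minimum sensible softening from equation 6.5: eps_min = R / sqrt(N_tot)."""
    return R / math.sqrt(N)

# A system of scale length R = 20 kpc sampled with N = 10**6 particles gives
# eps_min = 0.02 kpc, so a softening of a few times ~0.02 kpc would be safe.
eps = eps_min(20.0, 10**6)
```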
One immediate question that follows from equation 6.5 is what to do if the system size varies in
time or space: R \rightarrow R(\mathbf{x}, t). In this case, we ought to vary the force softening too. There has been
some work in the literature on such adaptive force softening techniques, but such extensions remain
in their infancy. Price and Monaghan 2007 have recently shown that it is possible to have adaptive \epsilon
while retaining good energy and momentum conservation; Iannuzzi and Dolag 2011 show that this
appears to give improved performance for the same computational cost, at least for the problems
they consider. It appears likely that adaptive softening will become the norm over the next few years.
The force softening sets a minimum scale to the problem, below which we are not interested in
calculating fluctuations in the potential. This is really useful and allows the O(N^2) force calculation
for direct summation to be reduced to O(N ln N), or even O(N). We now describe two popular
methods for achieving this.
6.3 Force calculation: Fourier techniques
In the Fourier method, we divide space up into a grid or mesh (see Figure 6.2). It is assumed that
the matter in each cell is concentrated at the centre. Provided the mesh is fine enough that it is
smaller than the mean inter-particle separation, the fact that our system is collisionless means that
we do not need to resolve below this scale. (This is ultimately the reason why collisionless systems
Figure 6.2: Subdivision of space into a grid or mesh. The particles are shown as black filled circles.
are so much easier to deal with than, for example, gas dynamical systems where the smallest scale
which ought to be resolved is the molecular mean free path.)
The key idea is then to write the gravitational potential as a convolution:

\Phi(\mathbf{x}) = \int \mathcal{G}(\mathbf{x} - \mathbf{x}') \rho(\mathbf{x}') \, d^3x' = \mathcal{G} \ast \rho    (6.6)

which defines the Green's function \mathcal{G} for the Poisson equation:

\mathcal{G} = -\frac{G}{|\mathbf{x} - \mathbf{x}'|}    (6.7)

As above, it is more usual to use a softened Green's function that does not diverge for \mathbf{x} = \mathbf{x}'. For
Plummer force softening, we have:

\mathcal{G} = -\frac{G}{\sqrt{\epsilon^2 + |\mathbf{x} - \mathbf{x}'|^2}}    (6.8)
As you may remember, the Fourier transform of a convolution is given by:

{\rm F.T.}\left[\Phi(\mathbf{x})\right] = \hat{\mathcal{G}}(\mathbf{k}) \, \hat{\rho}(\mathbf{k})    (6.9)

where \hat{\mathcal{G}}(\mathbf{k}) = {\rm F.T.}\left[\mathcal{G}(\mathbf{x})\right], and similarly for \hat{\rho}(\mathbf{k}).
Now, we know \hat{\mathcal{G}}(\mathbf{k}) analytically and so the only hard work which remains is in finding \hat{\rho}(\mathbf{k}).
This may be done very rapidly by using the method of Fast Fourier Transforms (FFTs). For more
information on this algorithm, the reader is referred to the excellent Press et al. 1992. Thanks to the
Fast Fourier Transform, this method scales as O(N ln N), which is a dramatic improvement on N^2.
Forces may be similarly calculated by noting that:

-\nabla_{\mathbf{x}} \Phi(\mathbf{x}) = -\int \nabla_{\mathbf{x}} \mathcal{G}(\mathbf{x} - \mathbf{x}') \rho(\mathbf{x}') \, d^3x' = -\nabla\mathcal{G} \ast \rho    (6.10)

which simply gives us a different, but also analytic, Green's function for the force calculation.
There is some additional complication in how the particles are mapped onto the grid cells and
how the forces are then mapped back onto the particles. Also, in practice, adaptive meshes are often
employed rather than a fixed grid, to put resolution only where it is needed. A more detailed account
of this method that includes these complications is presented in Binney and Tremaine 2008.
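The whole pipeline above (assign mass to the mesh, FFT, multiply by a Green's function in k-space, inverse FFT) can be sketched as follows. This is a minimal, hedged illustration using nearest-grid-point assignment and the unsoftened point-mass Green's function -4\pi G/k^2, with no force interpolation back to the particles; function names are our own:

```python
import numpy as np

def pm_potential(pos, mass, L, ng, G=1.0):
    """Minimal particle-mesh potential solver on a periodic box of side L."""
    cell = L / ng
    # 1. Nearest-grid-point mass assignment (real codes use CIC or TSC weighting).
    rho = np.zeros((ng, ng, ng))
    idx = (pos // cell).astype(int) % ng
    np.add.at(rho, (idx[:, 0], idx[:, 1], idx[:, 2]), mass)
    rho /= cell**3
    # 2. FFT the density field.
    rho_k = np.fft.fftn(rho)
    # 3. Multiply by the point-mass Green's function G(k) = -4*pi*G/k^2.
    k = 2.0 * np.pi * np.fft.fftfreq(ng, d=cell)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0                 # placeholder to avoid 0/0 at k = 0
    phi_k = -4.0 * np.pi * G * rho_k / k2
    phi_k[0, 0, 0] = 0.0              # zero the k = 0 mode: the mean density is subtracted
    # 4. Inverse FFT back to real space gives the potential on the mesh.
    return np.real(np.fft.ifftn(phi_k))
```

A useful sanity check: a uniform particle distribution populates only the k = 0 mode, so the returned peculiar potential is identically zero, while a single particle produces a potential minimum at its own cell.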
Figure 6.3: Schematic illustration of the Barnes and Hut oct-tree in two dimensions. The particles are
first enclosed in a square (root node). This square is then iteratively subdivided into four squares of half
the size, until exactly one particle is left in each final square (the leaves of the tree). In the resulting
tree structure, each square can be a parent of up to four children. Note that empty squares need
not be stored. For a three-dimensional simulation, the tree nodes are cubes instead of squares.
6.4 Force calculation: Tree techniques
The other obvious approach is to use a multipole expansion. In practice, this is often combined
with tree techniques. The density is represented by particles, as in the direct summation technique,
but now we divide up space into a tree structure (see Figure 6.3). At the base of the tree is the root
node. This is then subdivided into branches of the tree, which are themselves subdivided until we
arrive at one particle per sub-division: the leaves of the tree. The tree can be built by dividing space
in a number of different ways. A popular choice is the oct-tree, where each parent cube is divided into
eight equal children (also called the Barnes and Hut oct-tree after Barnes and Hut 1986). This is
useful because all cells are cubic. But other, more complicated, choices can be better. Binary trees,
for example, divide the cubes into two halves, which leads to rectangular cells but a more adaptive
(and therefore more efficient) space division (see e.g. Stadel 2001).
Having built the tree, we calculate the potential of each tree node as:

\Phi_{\rm node}(\mathbf{r}) = -G \int_{\rm node} d^3x \, \frac{\rho(\mathbf{x})}{\sqrt{\epsilon^2 + |\mathbf{r} - \mathbf{x}|^2}}    (6.11)

where \mathbf{x} is the distance to the centre of mass of the node, and we have used the softened potential
corresponding to the softened force of equation 6.4 (other choices of softening kernel are also possible;
see e.g. Dehnen 2001). For particle simulations, the density within the node is a sum over delta
functions:

\rho_{\rm node}(\mathbf{x}) = \sum_i m_i \, \delta(|\mathbf{x} - \mathbf{x}'_i|)    (6.12)

where \mathbf{x}'_i is the distance from the centre of mass of the node to one of the particles.
Substituting equation 6.12 in 6.11 then gives:

\Phi_{\rm node}(\mathbf{r}) = -G \sum_i \frac{m_i}{\sqrt{\epsilon^2 + |\mathbf{r} - \mathbf{x}'_i|^2}}    (6.13)

Now, since |\mathbf{r}| \gg |\mathbf{x}'|, we may Taylor expand^1 the square root to give:
\Phi_{\rm node}(\mathbf{r}) = -G \sum m \left( \frac{1}{s} + \frac{r_i x'_i}{s^3} + \frac{3}{2} \frac{r_i x'_i \, r_j x'_j}{s^5} + \ldots \right)    (6.14)

where the sum runs over the particles in the node, s^2 = \epsilon^2 + |\mathbf{r}|^2, and we use the summation convention^2
for the coordinate indices i, j.
The above is just the multipole expansion for the node in Cartesian coordinates. The first term
is the monopole, the second the dipole (that must be zero because we use coordinates about the
centre of mass), and the third is the quadrupole. It is useful because the dependence on \mathbf{r} now falls out
linearly: we may calculate these multipole sums for each node and then sum over all nodes to obtain
^1 see Appendix D.
^2 see Appendix B.
the potential at a given point. This presents us with a trade-off between building more branches in
the tree and including more multipole moments: both give increased force accuracy. Branching the
tree is controlled by the opening angle \theta, which is defined by comparing the size of the quadrupole
term with the monopole term for a node:

\frac{1}{s^5} x'_i x'_j r_i r_j < \theta^2 \frac{1}{s}    (6.15)
If the size of one side of the cubic node is l and s \approx r, then the above reduces to:

\frac{l}{r} < \theta    (6.16)

which is the branching criterion proposed by Barnes and Hut 1986. Other branching criteria can
compare the size of higher order moments (e.g. hexadecapole), giving different trade-offs between
having more branches on the tree versus a more accurate force calculation for each node (see e.g.
Springel et al. 2001).
One final gotcha to watch out for is that one should not actually use the cell length l in the tree
opening criterion. If the multipole moments are correctly calculated and used then there is no problem.
But early codes simply used l/r as written in equation 6.16. This is dangerous because the longest
length in a cell is \sqrt{3} times larger than l, and the opening criterion can fail, leading to catastrophic force
errors (the infamous 'exploding galaxy' bug). Instead, one should use max[\mathbf{x}'], where we recall that \mathbf{x}' is
the distance from the centre of mass of a node to a particle inside the node. This is shown graphically
by the dotted circles in Figures 6.4 and 6.5, where it is clear that this distance is often larger than l.
The above algorithm scales as O(N ln N). This can be simply understood by considering a binary
tree for a constant density particle distribution. The space is continually subdivided until we have one
particle per cell; this is the scale of the force softening \epsilon. For constant density, each subdivision
halves the number of particles and after \sim ln N subdivisions, we reach the leaf nodes (actually, in
this case we have log_2 N subdivisions). Then, for each particle we walk the tree to calculate the
force, by summing over the nodes. This is another \sim ln N force computations per particle, giving overall
O(N ln N) time. The computational advantage of the tree code is shown graphically in Figure 6.4.
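A toy version of this algorithm in two dimensions (a quad-tree rather than an oct-tree, a monopole term only, and Plummer softening as in equation 6.4) might look as follows. This is a hedged sketch with our own names, not any production code:

```python
import numpy as np

class Node:
    """One square cell of a 2D Barnes-Hut quad-tree (one particle per leaf)."""
    def __init__(self, centre, size, points, masses):
        self.size = size
        self.mass = masses.sum()
        self.com = (points * masses[:, None]).sum(axis=0) / self.mass
        self.children = []
        if len(points) > 1:                     # subdivide until one particle remains
            for ox in (-1, 1):
                for oy in (-1, 1):
                    sel = (((points[:, 0] >= centre[0]) == (ox > 0)) &
                           ((points[:, 1] >= centre[1]) == (oy > 0)))
                    if sel.any():
                        child_centre = centre + np.array([ox, oy]) * size / 4.0
                        self.children.append(
                            Node(child_centre, size / 2.0, points[sel], masses[sel]))

def tree_force(node, r, theta=1.0, eps=0.01, G=1.0):
    """Monopole-only force at position r; open a cell when size/distance >= theta."""
    dx = node.com - r
    dist2 = dx @ dx
    if not node.children or node.size**2 < theta**2 * dist2:
        # Accept: treat the whole cell as a softened point mass at its centre of mass.
        return G * node.mass * dx / (eps**2 + dist2) ** 1.5
    return sum(tree_force(c, r, theta, eps, G) for c in node.children)
```

With theta tending to zero every cell is opened down to the leaves and the result reduces to softened direct summation; theta of order unity gives the approximate O(N ln N) force.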
An even faster improvement on the tree code can be obtained if particle-node interactions are
replaced with node-node interactions: the Fast Multipole Method, or FMM (see Figure 6.5, left panel).
This has the additional advantage that momentum conservation can be ensured by construction, since
the node-node interactions are all antisymmetric. A careful ordering of the sums can reduce the
order of the algorithm further to O(N) (see Dehnen 2000). This improvement is just becoming really
important. With state-of-the-art simulations now using O(10^9) particles, we can obtain speed-ups
of \sim 20 by eliminating the ln N dependence. This is why, for gravity-only simulations, FMM tree
techniques remain faster for a given accuracy than Fourier methods, and are the current state of the
art. A graphical depiction of the speed-up using FMM as compared to a standard tree code is given
in Figure 6.5, right panel.
Figure 6.4: (Figure taken from Dehnen & Read 2011.) Left: computation of the force for one of 100 particles
(asterisks) in two dimensions (for graphical simplicity) using direct summation: every line corresponds to a
single particle-particle force calculation. Middle: approximate calculation of the force for the same particle
using the tree code. Cells opened are shown as black squares with their centres z indicated by solid squares and
their sizes w by dotted circles. Every green line corresponds to a cell-particle interaction. Right: approximate
calculation of the force for all 100 particles using the tree code, requiring 902 cell-particle and 306 particle-
particle interactions (\theta = 1 and n_{\rm max} = 1), instead of 4950 particle-particle interactions with direct summation
(left).
Figure 6.5: (Figure taken from Dehnen & Read 2011.) Left: Illustration of the geometry for the Taylor
expansion used with the fast multipole method. Right: approximate calculation of the force for the same
100 particles as in Fig. 6.4 using the FMM, requiring 132 cell-cell (blue), 29 cell-particle (green), and 182
particle-particle (red) interactions (\theta = 1 and n_{\rm max} = 1).
Lecture 7
Cosmological N-body simulations
In this lecture, we discuss how to perform cosmological N-body simulations of large patches of the
Universe (or even the whole Universe). The Universe can also be modelled as a collisionless fluid
and so many results from the previous lecture can be carried over to this regime. We present the
key equations and discuss the additional complications that arise when conducting such simulations;
applications will follow at the end of the course.
7.1 Equations of motion
The Universe is sufficiently massive and dense that we are no longer in the Newtonian regime: we
must switch to general relativity. Nonetheless, as we shall see, the equations of motion for the growth
of small perturbations in the Universe look remarkably similar to Newtonian gravity, and most of the
techniques we have already derived in the previous lecture may be used. We start with the equations
that describe an unperturbed, smooth, homogeneous Universe, and then add some perturbations on
top of this.
7.1.1 The homogeneous Universe
Building a cosmological model means finding a solution to Einstein's field equations (see Appendix J)
that gives a good description of the distribution of matter and energy in the Universe, both now and
backwards in time. If we knew nothing about the observed Universe, we might start with the simplest
assumption: a homogeneous and isotropic Universe. In fact, this assumption agrees very well with
the data. The Universe is certainly quite close to homogeneous today on large scales, as can be seen
from galaxy surveys like the Sloan Digital Sky Survey (SDSS; Figure 7.1; Yadav et al. 2005).
A more theoretical argument for homogeneity comes from the observed local expansion. Lemaitre
(1927) and later Hubble (1929) found that nearby galaxies are all receding from us with a velocity
proportional to distance (Hubble 1929; Nussbaumer and Bieri 2011; and Figure 7.2). The Copernican
Principle states that there is nothing special about our place in the Universe. We may then reasonably
assume that all observers must see the Universe expanding away from them, which implies that the
Universe is isotropic. But, if the Universe is isotropic then it must be homogeneous. The proof is
geometrical and given in Figure 7.3 (argument taken from Peacock 1999). The converse is, however,
not true: a homogeneous Universe can be anisotropic (can you think of an example?).
Note that we could have realised much sooner than either Hubble or Friedmann that the Universe is
expanding, due to a paradox commonly attributed to the German amateur astronomer Olbers in 1823,
but in fact dating back much earlier than him to Thomas Digges in the late 1500s (Harrison 1989).
The paradox is as follows: if the Universe were static and infinite then we would see a star along every
single sight line in the night sky. Thus the photons arriving from each of these stars would light up
the night sky until it were as bright as the Sun. Edgar Allan Poe put it very well in his prose poem Eureka
(1848), in which he wrote:
Were the succession of stars endless, then the background of the sky would present us a
uniform luminosity, like that displayed by the Galaxy since there could be absolutely no
Figure 7.1: The Universe today on large scales is observed to be very homogeneous. This image shows
the distribution of galaxies in the SDSS galaxy survey towards the North Galactic Pole (left) and the
South Galactic Pole (right). The distribution becomes statistically homogeneous on scales larger than
70 Mpc.
[Figure 7.2 reproduces a page of Hubble (1929), including his Figure 1, captioned: "Velocity-Distance Relation among Extra-Galactic Nebulae. Radial velocities, corrected for solar motion, are plotted against distances estimated from involved stars and mean luminosities of nebulae in a cluster."]
Figure 7.2: Hubble's original data showing that the Universe is expanding, with velocity proportional
to distance.
Figure 7.3: A geometric proof that an isotropic Universe is homogeneous. The converse is not neces-
sarily true. Isotropy about point B tells us that the density at C, D and E is the same. By expanding
spheres of different radii around point A, we see that the overlapping purple shaded area must be
homogeneous. For large enough shells, we may extend this argument to the whole Universe.
point, in all that background, at which would not exist a star. The only mode, therefore,
in which, under such a state of affairs, we could comprehend the voids which our tele-
scopes find in innumerable directions, would be by supposing the distance of the invisible
background so immense that no ray from it has yet been able to reach us at all.
The fact that the sky at night is dark is indeed a compelling mystery! It can be understood,
however, if the Universe is expanding. In this case, the light from distant stars is redshifted (more so
with increasing distance). Infinitely distant stars will be redshifted infinitely, until they can no longer
be seen.
7.1.1.1 The FLRW metric
In Appendix J, we discuss the (unique) metric that describes an isotropic and homogeneous Universe,
the FLRW metric:

c^2 d\tau^2 = c^2 dt^2 - R^2(t) \left[ \frac{dr^2}{1 - kr^2} + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right]    (7.1)

where R(t) is called the scale factor; k = [-1, 0, +1] is a parameter that measures the fundamental
curvature of the spacetime; and r is a time independent co-moving coordinate. We can see that
k describes curvature by considering k = 0. In this case, the FLRW metric looks very similar to
Minkowski space (in which special relativity applies), just with an expansion factor R(t). Thus, k = 0
is often called 'flat' space, even though it still has some spacetime curvature. k = +1 and k = -1 are called 'closed'
and 'open' Universes respectively, which we will describe in more detail shortly.
Since distant galaxies are observed to all be moving away from us at ever greater speeds, we
will mostly be dealing with pure radial motion in the FLRW metric. For this reason, it is useful to
transform to a different coordinate system that eliminates the 1 - kr^2 term in the denominator of the
radial part. Consider the function:
S_k(r') = \begin{cases} \sin r' & k = +1 \\ \sinh r' & k = -1 \\ r' & k = 0 \end{cases}    (7.2)
Now, we have that (taking k = +1 as an example):

\frac{dr^2}{1 - kr^2} \rightarrow \frac{dS_k^2}{1 - kS_k^2} = \frac{\cos^2 r' \, dr'^2}{1 - \sin^2 r'} = dr'^2    (7.3)
and similarly for k = -1 and k = 0. Thus, the metric becomes:

c^2 d\tau^2 = c^2 dt^2 - R^2(t) \left[ dr'^2 + S_k(r')^2 \, d\Omega^2 \right]    (7.4)

where d\Omega^2 = d\theta^2 + \sin^2\theta \, d\phi^2. We will use the above metric with the notation r = r' from here on.
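For completeness, the k = -1 case of equation 7.3 works the same way, using hyperbolic functions and the identity \cosh^2 r' - \sinh^2 r' = 1:

```latex
\frac{dr^2}{1 - kr^2} \;\rightarrow\; \frac{dS_k^2}{1 + S_k^2}
  = \frac{\cosh^2 r' \, dr'^2}{1 + \sinh^2 r'} = dr'^2 , \qquad (k = -1)
```

while for k = 0 the substitution S_k(r') = r' is trivially the identity.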
Since the Universe is observed to be homogeneous, and the Copernican Principle suggests that
it must be isotropic, the FLRW metric, and perturbations around it, form the basis of our current
cosmological model. Straight away this makes an important prediction. The scale factor R(t) acts
to cause either an expansion or contraction of the length scales in the metric as a function of time:
the FLRW metric describes either expanding or collapsing Universes. Initially this worried Einstein,
who introduced the cosmological constant to try to counteract the expansion term. But this static
solution is unstable and Einstein later called it his 'greatest blunder' (c.f. Appendix J). Now we can
think of this instead as a beautiful prediction of general relativity and Einstein's field equations: an
isotropic, homogeneous Universe must either expand or contract.
7.1.2 The inhomogeneous Universe
The Universe on large scales is observed to be very close to homogeneous. On smaller scales in the
nearby Universe, however (\lesssim 70 Mpc), the Universe becomes very inhomogeneous. We start to see
local fluctuations in the density field due to the presence of structure: galaxies like our own Milky Way.
This suggests that after the Big Bang, the Universe was not perfectly homogeneous. Tiny fluctuations,
perhaps seeded by quantum effects, provided just enough inhomogeneity to grow into the galaxies
and local structure we see today. Studying the non-linear growth of these tiny fluctuations into the
structure we see in the Universe today requires cosmological N-body models. A full treatment of the
growth of structure in an expanding FLRW spacetime involves linear (and potentially higher order)
perturbations of the GR field equations, which is a bit involved for us here. Instead, we will use a
useful approximation. Since any inhomogeneity must be local, and local space must be Minkowski, we
can approximate the local dynamics as being pure Newtonian 'weak field' gravity embedded in an expanding
FLRW spacetime. Thus, locally, the Universe must approach a classical fluid.
7.1.3 The non-linear growth of structure: evolution equations
At very early times, the Universe can be a rather complex fluid (e.g. a mix of non-relativistic matter,
relativistic radiation, and interaction terms between the two). However, after the de-coupling of
photons (that become the cosmic microwave background radiation; CMB), the Universe becomes very
much simpler to model. Now, radiation is negligible and only matter, curvature and vacuum energy are
important. This is a great simplification because we need deal only with two types of non-relativistic
fluid: one for baryons, and one for dark matter. Furthermore, these couple only through gravity.
First, let us derive the full non-linear equations of motion. For the reasons outlined above, we will
assume that we can describe the motion as locally Newtonian within an expanding FLRW background.
The derivation is easiest done in co-moving rather than Eulerian coordinates. Writing the position
of some point moving with the expansion \mathbf{x} as \mathbf{x} = a(t)\mathbf{r}(t), where \mathbf{r} is the co-moving distance and
a(t) = R(t)/R_0 is the dimensionless scale factor, we may differentiate twice to give:

\dot{\mathbf{x}} = \dot{a}\mathbf{r} + a\dot{\mathbf{r}}    (7.5)

\ddot{\mathbf{x}} = \ddot{a}\mathbf{r} + 2\dot{a}\dot{\mathbf{r}} + a\ddot{\mathbf{r}}    (7.6)

where the dot refers to d/dt, the co-moving temporal derivative. Rearranging in terms of the
co-moving acceleration, we have:

\ddot{\mathbf{r}} = -2\frac{\dot{a}}{a}\dot{\mathbf{r}} - \frac{\nabla_r \Phi}{a^2} - \mathbf{g}_0    (7.7)
^0 Note that there has been some worry in the literature about the validity of this approximation on large scales. In
particular, various authors have suggested that inhomogeneities could drive the observed cosmic acceleration, mimicking
the mysterious dark energy (e.g. Buchert et al. 2000; Mukhanov et al. 1997). This appears to be unlikely (Ishibashi and
Wald 2006), though the necessary corrections due to inhomogeneities could affect attempts to use N-body simulations
to determine precision cosmological parameters (e.g. Enea Romano and Chen 2011).
where we have written the acceleration terms as the peculiar acceleration, \ddot{\mathbf{x}}_{\rm pec}/a = -\nabla_r \Phi / a^2, and an un-
perturbed acceleration, \mathbf{g}_0 = (\ddot{a}/a)\mathbf{r}, and ignored pressure forces. Assuming weak field general relativity,
the Poisson equation still holds locally and we have:

\frac{\nabla_r^2 (\Phi_0 + \Phi)}{a^2} = 4\pi G (\rho_0 + \delta\rho)    (7.8)
Thus, subtracting the unperturbed parts of the above equations (\mathbf{g}_0, \rho_0 and \Phi_0)^1, we arrive at the
full equations of motion:

\ddot{\mathbf{r}} = -2\frac{\dot{a}}{a}\dot{\mathbf{r}} - \frac{\nabla_r \Phi}{a^2}    (7.9)

\frac{\nabla_r^2 \Phi}{a^2} = 4\pi G \, \delta\rho    (7.10)
Notice that these equations are remarkably similar to the non-cosmological N-body equations of
motion for a collisionless fluid. In co-moving coordinates \mathbf{r}, we now have two acceleration terms: the
left term is due to the expansion; the right term is due to gravity and depends on the local potential
contrast \Phi, rather than the absolute gravitational potential. Equation 7.10 is just a Poisson equation
relating over-density \delta\rho to potential contrast \Phi. We already know how to solve the Poisson equation!
(c.f. §6).
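To make equation 7.9 concrete, here is a hedged sketch of a kick-drift-kick step for a test particle in co-moving coordinates, assuming a matter-dominated (Einstein-de Sitter) scale factor a \propto t^{2/3} and treating the Hubble-drag term explicitly; production codes use specially designed symplectic operators instead, and all names here are ours:

```python
import numpy as np

def a_matter(t):
    """Einstein-de Sitter scale factor, normalised so that a(t=1) = 1."""
    return t ** (2.0 / 3.0)

def adot_matter(t):
    """Time derivative of the Einstein-de Sitter scale factor."""
    return (2.0 / 3.0) * t ** (-1.0 / 3.0)

def comoving_step(r, v, grad_phi, t, dt):
    """One kick-drift-kick step of r'' = -2(adot/a) r' - grad_phi(r)/a^2 (equation 7.9)."""
    def acc(r, v, t):
        a = a_matter(t)
        return -2.0 * (adot_matter(t) / a) * v - grad_phi(r) / a**2
    v_half = v + 0.5 * dt * acc(r, v, t)                      # kick (half step)
    r_new = r + dt * v_half                                   # drift (full step)
    v_new = v_half + 0.5 * dt * acc(r_new, v_half, t + dt)    # kick (half step)
    return r_new, v_new
```

A useful check: with \nabla\Phi = 0, equation 7.9 predicts that peculiar velocities decay as \dot{r} \propto a^{-2} \propto t^{-4/3}, which the integrator should reproduce.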
7.1.4 Boundary conditions
To solve for the interaction potential, we must specify the boundary conditions. Here we may assume
that we are simulating some small cubic patch of the Universe of size L (this is required to satisfy the
local-Newton approximation we introduced above), and apply periodic boundary conditions (valid
since the Universe is isotropic and homogeneous on large scales). In this case, the interaction potential
is the solution of a modified Poisson equation:

\nabla^2 \Phi(\mathbf{x}) = 4\pi G \left[ -\frac{1}{L^3} + \sum_{\mathbf{n}} \tilde{\delta}(\mathbf{x} - \mathbf{n}L) \right]    (7.11)

where \tilde{\delta} is the Dirac delta function convolved with a gravitational softening kernel of co-moving scale
length \epsilon (we discuss force softening in the following section); and the sum over vectors \mathbf{n} = (n_1, n_2, n_3)
includes all integer triplets.
Note that while the above equations are mathematically similar to the non-cosmological case (and
therefore amenable to similar numerical treatment), they have a quite different meaning. Since we
are now working in co-moving space, the potential is really the peculiar potential with respect to the
smooth background \bar{\rho}:

\nabla^2 \Phi(\mathbf{x}) = 4\pi G \left[ \rho(\mathbf{x}) - \bar{\rho} \right] ; \qquad \rho(\mathbf{x}) = \sum_j m_j \tilde{\delta}(\mathbf{x} - \mathbf{x}_j)    (7.12)

while the forces are peculiar forces with respect to the expansion.
For Fourier methods, the above is completely natural (Fourier methods are periodic by construc-
tion); tree methods are harder to make periodic. A method that significantly speeds up the calculation
is the Ewald method (Hernquist et al. 1991). However, for tree methods there is some computational
cost with having a periodic domain (the opposite is true for Fourier methods, of course: having a
non-periodic domain is then more expensive, as padding cells are required).
^1 This relies on the infamous 'Jeans swindle' after Jeans (1902). It is, of course, dodgy to subtract away the
unperturbed gravitational field. In the unperturbed limit, we have \ddot{\mathbf{r}} = \dot{\mathbf{r}} = 0 and therefore \nabla\Phi_0 = 0. But we must
have from the Poisson equation that \nabla^2 \Phi_0 = 4\pi G \rho_0. These can only both be true if \rho_0 = 0! The reason the swindle
works is really just that it leads to the answer one gets if doing a proper linearised analysis in GR. Such an approach
is beyond the scope of this course, however, and we must accept the swindle for now.
Lecture 8
Initial conditions
In this lecture, we discuss how to set up the initial conditions for N-body simulations.
8.1 Preamble
All N-body methods evolve the distribution function f(\mathbf{x}, \mathbf{v}, t). For collisional systems this obeys the
Boltzmann equation (with collision terms); for collisionless systems it obeys the collisionless Boltzmann
equation. Setting up initial conditions for these simulations then amounts to sampling the initial
distribution function f(\mathbf{x}, \mathbf{v}, t = 0) with N points. We discuss how to do this, in spherical symmetry,
in discs, in triaxial systems and in the cosmological case, in this lecture. In all cases, we will set up
the systems in collisionless equilibrium, i.e. we will ensure that they initially respect the collisionless
Boltzmann equation.
8.2 Setting up stellar systems
8.2.1 Spherical symmetry
The simplest case to start with is a single component spherical stellar system like a star cluster. In
spherical symmetry, the distribution function simplifies to f \rightarrow f(r, \mathbf{v}), but is still four dimensional. An
even simpler starting point is to assume the equivalent of spherical symmetry in velocity space too.
This is called isotropy. In this case, the distribution function further simplifies to f \rightarrow f(r, v), where
v = |\mathbf{v}|. We can simplify things even further still if we realise that the distribution function can be written instead in
terms of isolating integrals (c.f. §4). For a spherical system, the specific energy E and specific angular
momentum L are isolating integrals and so in general we may write f \rightarrow f(E, L), which reduces the
general problem to just two dimensions in integral space. But if the system is also isotropic then only
the magnitude of the velocity matters and thus f cannot be a function of L. Thus for spherical and
isotropic systems f \rightarrow f(E) is just a one dimensional function of the specific energy. Note that
all such simplifications of the distribution function implicitly assume that the distribution function
solves the steady state collisionless Boltzmann equation.
8.2.1.1 Isotropic systems

For spherical isotropic systems, we may write f → f(E) with E = ½v² + Φ(r), where Φ is the gravitational potential. But how do we know what form f should take? Recall that the density follows from the integral of the distribution function over velocity:

$$\rho(r) = 4\pi \int_0^{v_{\max}} dv\, v^2\, f\!\left(\tfrac{1}{2}v^2 + \Phi\right) \qquad (8.1)$$

Changing variables from v → E, using v² = 2(E − Φ) and 2v dv = 2dE, gives:

$$\frac{1}{\sqrt{8}\,\pi}\,\rho(\Phi) = 2\int_\Phi^0 dE\,(E - \Phi)^{1/2} f(E) \qquad (8.2)$$

where we have used the fact that v = v_max for E = 0 to set the integration limits (recall that Φ is defined to be negative), and the fact that Φ(r) is a monotonic function of r to write ρ(Φ).
The above equation is an Abel integral for which the inverse is known (e.g. Binney and Tremaine 2008):

$$f(E) = \frac{1}{\sqrt{8}\,\pi^2}\frac{d}{dE}\int_E^0 \frac{d\Phi}{\sqrt{\Phi - E}}\,\frac{d\rho}{d\Phi} \qquad (8.3)$$
Thus, we can now determine f(E) for general spherical isotropic systems from any given ρ(r) and Φ(r). For single component (self-consistent) systems, ρ(r) and Φ(r) are related through the Poisson equation:

$$\nabla^2\Phi = 4\pi G\rho\,; \qquad \text{with:}\quad \rho_\star(r) = \rho(r) \qquad (8.4)$$

but this need not be the case in general. The density of stars could represent a massless tracer population, for example, moving in the background potential of a dark matter halo.
8.2.1.1.1 The Plummer solution Equation 8.3 is known as Eddington's formula and it has a few special cases that are analytically solvable. The most well-known of these is the Plummer solution that gives a reasonable model for the distribution of stars in star clusters:

$$\Phi(r) = -\frac{GM}{\sqrt{r^2 + b^2}}\,; \qquad \rho = \frac{3M}{4\pi b^3}\left(1 + \frac{r^2}{b^2}\right)^{-5/2} \qquad (8.5)$$

where M is the mass of the Plummer sphere, and b the scale length.
Writing r/b = x and −Φb/(GM) = K, we have that K = (1 + x²)^(−1/2), ρ = 3M/(4πb³) K⁵, and thus:

$$\frac{d\rho}{d\Phi} = -\frac{b}{GM}\,\frac{3M}{4\pi b^3}\,5K^4 \qquad (8.6)$$

$$\Rightarrow \quad f(\varepsilon) = \frac{1}{\sqrt{8}\,\pi^2}\frac{d}{d\varepsilon}\int_0^\varepsilon \frac{dK}{\sqrt{\varepsilon - K}}\,\frac{15}{4\pi}K^4 \qquad (8.7)$$

where ε = −E and we use, without loss of generality, G = M = b = 1.
The above integral can be solved by the trigonometric substitution K = ε cos²θ, so that dK = −2ε cosθ sinθ dθ. Thus:

$$f(\varepsilon) = \frac{15}{2\sqrt{8}\,\pi^3}\,\frac{d}{d\varepsilon}\,\varepsilon^{9/2}\int_0^{\pi/2}\cos^9\theta\, d\theta \qquad (8.8)$$

$$\Rightarrow \quad f(\varepsilon) \propto \varepsilon^{7/2} \qquad (8.9)$$
8.2.1.1.2 Negative distribution functions By inspection of equation 8.3, we can see that there will be some situations where f(E) becomes negative. This means that such a model is unphysical and cannot be realised. This can easily occur for tracer populations that are de-coupled from the gravitational potential, but also if cuspy density profiles are overly radially anisotropic (An and Evans 2006). A simple example is a tracer population inside the Plummer sphere with ρ(r) ∝ (1 + x²)^α with α positive. In fact, this is a special case of a more general theorem: density profiles that do not monotonically decrease with radius are manifestly unstable (Binney and Tremaine 2008).
8.2.1.2 Anisotropic systems

The above concepts can also be generalised to anisotropic systems, where the distribution function f → f(E, L). The anisotropy is often described by the anisotropy parameter:

$$\beta(r) = 1 - \frac{\sigma^2_{\theta\theta} + \sigma^2_{\phi\phi}}{2\sigma^2_{rr}} \qquad (8.10)$$

where σ_rr, σ_θθ, σ_φφ are the velocity dispersions in the r, θ, φ directions, respectively. The parameter β thus describes the relative amount of radial to tangential motion in a spherical system. If β = 0, this means tangential and radial motions are equally balanced: the system is isotropic. If β = 1, then the system is maximally radially anisotropic; β = −∞ means maximally tangentially anisotropic.

Now, assuming a constant β, we may write the distribution without loss of generality as (Binney and Tremaine 2008):

$$f(E, L) = L^{-2\beta} f_1(E) \qquad (8.11)$$

For β(r), each choice needs to be dealt with separately. The interested reader is referred to Binney and Tremaine 2008, where methods for anisotropic systems are discussed in detail.
8.2.1.3 Axisymmetry

Axisymmetric systems like discs can also be initialised from distribution functions. Here we have f → f(E, L_z). Some useful distribution functions and methods are described in detail in Kuijken and Dubinski 1995 and Widrow and Dubinski 2005, but these are beyond the scope of this course.

8.2.1.4 Triaxial systems

Triaxial systems have no symmetry axis and are correspondingly more complex to set up; they are certainly beyond the scope of this course. Methods to initialise them are usually numerical in nature; the most common are related to the N-body method itself. The interested reader is referred to Syer and Tremaine 1996; De Lorenzi et al. 2007 and Dehnen 2009.
8.2.2 Jeans methods

An alternative to using distribution functions is to use moments of the distribution function:

Zeroth moment (spatial density):

$$\rho(\mathbf{x}) = \int f(\mathbf{x}, \mathbf{v})\, d^3\mathbf{v} \qquad (8.12)$$

First moments (mean velocity):

$$\langle v_i \rangle(\mathbf{x}) = \frac{1}{\rho(\mathbf{x})}\int v_i\, f(\mathbf{x}, \mathbf{v})\, d^3\mathbf{v} \qquad (8.13)$$

Second moments (mean square velocities):

$$\langle v_i v_j \rangle(\mathbf{x}) = \frac{1}{\rho(\mathbf{x})}\int v_i v_j\, f(\mathbf{x}, \mathbf{v})\, d^3\mathbf{v} \qquad (8.14)$$

... and higher order moments. These allow us to define the velocity dispersion tensor:

$$\sigma^2_{ij} = \langle v_i v_j \rangle - \langle v_i \rangle \langle v_j \rangle \qquad (8.15)$$
This allows us to integrate out some of the dimensions in the problem. Since galaxies are often roughly spherical, spherical polar coordinates are a natural choice. In this case, the steady state collisionless Boltzmann equation becomes:

$$\dot{r}\frac{\partial f}{\partial r} + \dot{\theta}\frac{\partial f}{\partial \theta} + \dot{\phi}\frac{\partial f}{\partial \phi} + \dot{v}_r\frac{\partial f}{\partial v_r} + \dot{v}_\theta\frac{\partial f}{\partial v_\theta} + \dot{v}_\phi\frac{\partial f}{\partial v_\phi} = 0 \qquad (8.16)$$

where v_r = ṙ is the velocity along r, v_θ = θ̇ r, and v_φ = φ̇ r sinθ.
Now, we can multiply through by powers of each of the velocity components v_r, v_θ, v_φ and integrate over velocity to obtain moment equations called the Jeans equations (Binney and Tremaine 2008). Assuming spherical symmetry, the radial second order moment equation is given by:
Figure 8.1: Taken from Read et al. 2006.
$$\frac{1}{\rho}\frac{\partial}{\partial r}\left(\rho\,\sigma^2_{rr}\right) + \frac{2\left(\sigma^2_{rr} - \sigma^2_{tt}\right)}{r} = -\frac{\partial\Phi}{\partial r} = -\frac{GM(r)}{r^2} \qquad (8.17)$$

where by symmetry σ²_tt = σ²_θθ = σ²_φφ.
Now, these moment equations have two key advantages: (i) we do not need to specify the form of f; instead it is constrained entirely through its moments; and (ii) we have significantly reduced the dimensionality of the problem. The above assumption of spherical symmetry can be relaxed, of course. But there is a more fundamental problem: the hierarchy of Jeans equations is not closed. If the true distribution function f(v) is a Gaussian, then we are fine: f(v) is fully specified by its first and second moments. But in general, f(v) can require an infinite number of moments to be fully specified, with an associated infinity of Jeans equations! Luckily, f(v) is typically quite close to Gaussian and so assuming a Gaussian is usually reasonable. But it does introduce some dis-equilibrium in the initial conditions (see Figure 8.1, taken from Read et al. 2006b).
Finally, using the anisotropy parameter β defined above, the spherical, radial, Jeans equation (equation 8.17) simplifies further to:

$$\frac{1}{\rho}\frac{\partial}{\partial r}\left(\rho\,\sigma^2_{rr}\right) + \frac{2\beta(r)\,\sigma^2_{rr}}{r} = -\frac{GM(r)}{r^2} \qquad (8.18)$$
8.3 Drawing N points from the distribution function
Several methods exist for sampling N points randomly from the distribution function. Here we
consider two: the accept/reject algorithm, and a much faster algorithm that only works for spherically
symmetric data (Press et al. 1992).
8.3.1 Accept/reject
Figure 8.2: Taken from Sellwood & Debattista 2009.

Armed with the distribution function, or its moments, we must now draw N discrete sample points randomly in a manner that reproduces the distribution described by f. Usually the density and velocity fields are set up independently, as this is more numerically efficient. The density field ρ(x)
follows from the zeroth moment of the distribution function (equation 8.12). We use this to initialise the particle positions using a simple algorithm called accept/reject, as follows:

1. Normalise the density function to unity, i.e. define:

$$\tilde{\rho}(\mathbf{x}) = \frac{\rho(\mathbf{x})}{\int \rho(\mathbf{x})\, d^3\mathbf{x}} \qquad (8.19)$$

2. Draw a random position x uniformly over some range x_min,i < x_i < x_max,i, with i = 0, 1, 2 labelling the coordinates.

3. Calculate the probability density of this position, ρ̃(x).

4. Draw a random number R in the range [0, 1]. Compare this with ρ̃(x)/ρ̃_max, where ρ̃_max is the maximum of ρ̃ over the sampled volume. If R < ρ̃(x)/ρ̃_max, then accept this position; otherwise reject it.

5. Repeat steps 2–4 until N positions have been accepted.

The above ensures that the positions drawn will be consistent with the density profile ρ̃(x).
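The steps above can be sketched in a few lines of pure Python. This is a toy illustration, not a production IC generator: we draw positions for the Plummer profile of equation 8.5 (with b = 1 and the normalisation dropped, since accept/reject only needs ρ up to a constant), truncating the sampled volume at a finite box.

```python
import math
import random

def rho_plummer(r):
    """Unnormalised Plummer density (equation 8.5) with b = 1."""
    return (1.0 + r * r) ** -2.5

def accept_reject_positions(n, half_width=5.0, seed=42):
    """Draw n positions via accept/reject: trial points are uniform in a
    cube, and each is kept with probability rho(r)/rho_max (rho_max = rho(0))."""
    rng = random.Random(seed)
    rho_max = rho_plummer(0.0)
    points = []
    while len(points) < n:
        x, y, z = (rng.uniform(-half_width, half_width) for _ in range(3))
        if rng.random() < rho_plummer(math.sqrt(x*x + y*y + z*z)) / rho_max:
            points.append((x, y, z))
    return points

pts = accept_reject_positions(2000)
# For a (truncated) Plummer sphere, roughly 35-40 per cent of the sampled
# mass lies inside r = b = 1; the accepted fraction should come out close.
print(sum(1 for p in pts if math.hypot(*p) < 1.0) / len(pts))
```

Note how inefficient this is: the acceptance fraction here is below one per cent, because most of the cube has ρ ≪ ρ_max. This is exactly the failure mode that motivates the faster spherical method of section 8.3.2.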
We now need to set the velocities for each particle. These can be drawn in a similar fashion from the velocity distribution function. For a particle at position x_i, this is given by f(x_i, v) if we know the distribution function. If we only know the moments (one for each coordinate direction; c.f. section 8.2.2), then we must assume some form for f(x_i, v). If only the first and second moments are known, then we must assume a 3D Gaussian; if higher moments are also known, then modified 3D Gaussians can be assumed as required. The latter amounts to an approximate method and will therefore lead to stellar systems that are not quite in equilibrium initially. We will show this in 8.4.
8.3.2 A faster method for spherically symmetric distributions
Accept/reject will always work, but can be quite slow if a distribution function has significant wings (leading to many trial positions being rejected). Several more efficient methods exist (Press et al. 1992); we discuss one here that is much faster, but only works for spherically symmetric distributions. Consider a spherically symmetric density function ρ(r) and imagine that we define a cumulative mass:

$$M(r) = 4\pi \int_0^r r'^2 \rho(r')\, dr' \qquad (8.20)$$
If we are initialising the particle positions, then this really corresponds to a physical mass; for the velocities it is really just some cumulative integral.

Now, defining a maximum or total mass, $M_{\rm tot} = 4\pi \int_0^\infty r'^2 \rho(r')\, dr'$, we can describe a new algorithm for initialising the positions:

1. Solve, analytically or numerically (using interpolated tabulated look-up tables), for the inverse function r(M).

2. Draw a random mass in the range 0 < M < M_tot. Calculate the radius associated with this mass using the inverse function r(M).

3. Draw two random angles to initialise x from the radial coordinate r.

4. Repeat steps 2–3 until N particle positions are realised.

A similar strategy can also be used for the velocities, provided the system is isotropic (i.e. spherically symmetric in velocity space).

The speed advantage of the above algorithm is clear. We have some initial cost in calculating the function r(M) if this needs to be done numerically. But from this point on, every single random draw is a hit: no draws are rejected. Thus, the above algorithm for spherical systems is much faster than accept/reject.
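For the Plummer sphere the inversion can be done analytically, which makes for a compact sketch of the algorithm above (pure Python; G = M_tot = b = 1, and the function names are our own). Inverting M(r) = r³/(r² + 1)^{3/2} gives r(M) = (M^{−2/3} − 1)^{−1/2}.

```python
import math
import random
import statistics

def radius_from_mass(M):
    """Analytic inverse r(M) of the Plummer cumulative mass
    M(r) = r^3 / (r^2 + 1)^{3/2}, valid for 0 < M < 1."""
    return 1.0 / math.sqrt(M ** (-2.0 / 3.0) - 1.0)

def sample_positions(n, seed=1, m_max=0.999):
    """Inverse-transform sampling: every draw is a 'hit'. We truncate at
    m_max < 1 because the Plummer sphere formally extends to infinity."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        r = radius_from_mass(rng.uniform(1e-12, m_max))
        # Two random angles: cos(theta) uniform on [-1, 1], phi on [0, 2pi).
        cth = rng.uniform(-1.0, 1.0)
        sth = math.sqrt(1.0 - cth * cth)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        points.append((r * sth * math.cos(phi), r * sth * math.sin(phi), r * cth))
    return points

pts = sample_positions(5000)
radii = sorted(math.hypot(*p) for p in pts)
print(statistics.median(radii))  # half-mass radius: (2^{2/3} - 1)^{-1/2} ~ 1.30
```

Every draw is accepted, in contrast to the sub-per-cent acceptance rate of the accept/reject example above.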
8.4 Comparing different IC methods

In Figure 8.1, we compare the stability of initial conditions generated for a two-component spherical system (containing stars and dark matter with different distribution functions), using two different popular methods in the literature (Read et al. 2006b). The left panels show the number density of star (top) and dark matter (bottom) particles. The blue lines mark the analytic target profile and the black lines show the initial conditions set up with 10⁵ star and 10⁶ dark matter particles. Notice that even in the initial conditions, the black lines underestimate the analytic curves at small r ≲ 0.035. This occurs because of the finite number of particles used. The red and green curves show the profiles after 5 Gyrs of evolution, which corresponds to ~20 dynamical times. For the red curves, the velocities were initialised using the correct isotropic distribution function calculated using equation 8.3. For the green curves, they were calculated assuming a Gaussian (Maxwellian) distribution. Notice that the red curves remain in excellent agreement with the analytic curves even after so many dynamical times, while the green curves evolve away from the initial profiles. This is a result of the dis-equilibrium caused by the Gaussian approximation for the velocity distribution function. The resulting galaxy becomes denser in its dark matter profile, and less dense in the stellar profile. The velocity distribution function, particularly for the dark matter, becomes radially anisotropic (middle panels). In all cases the simulations conserve energy, however (right panels): dis-equilibrium just means that the energy gets redistributed between the star and dark matter particles until equilibrium is reached. Note that the middle panels plot a symmetrised anisotropy parameter:

$$\tilde{\beta}(r) = \frac{\sigma_{rr} - \sigma_{tt}}{\sigma_{tt} + \sigma_{rr}} \qquad (8.21)$$

which has the advantage that, unlike β(r), it does not tend to −∞ for tangentially biased systems. Like β, β̃ = 0 for isotropic systems and β̃ = 1 for radially anisotropic systems, but β̃ = −1 for tangentially anisotropic systems.
8.5 Noise and quiet starts

As mentioned previously, N-body methods are a mixture between a Monte Carlo method and a method of characteristics. The Monte Carlo part refers to the random sampling of the distribution function described above; the method of characteristics refers to the fact that these discrete particles then follow the collisionless fluid flow. The Monte Carlo initialisation can be problematic because it introduces noise into the initial conditions. This noise can trigger dynamical instabilities, like gravitational collapse, that would not be present in the real Universe, for which N_real ≫ N. This is seen, for example, in N-body simulations of disc galaxies, where low resolution simulations typically see spiral arms and/or a bar forming in the disc, while significantly higher resolution simulations of the same system show no such spirals/bar, or a delayed growth of these structures. As stated clearly in Dubinski et al. 2009, the onset of spiral/bar instabilities in simulated discs is seeded entirely by Poisson noise in the particle distribution. Future simulations should initialise a perturbation of amplitude larger than the noise to avoid such effects. Otherwise the simulations will never converge.

A corollary of the above is that we can expect very different results for N-body simulations of Galactic discs simply if we change the initial random number seed. (This is true at some level even if perturbations are initially introduced at an appropriate level by hand, because N-body systems are, after all, chaotic.) An illustration of this is given in Figure 8.2. This shows the amplitude of a bar (left) and its pattern speed (rotation frequency; right panel) as a function of time in a simulated N-body disc galaxy. All that is changed in the 16 different runs is the random number seed used to set up the initial conditions, yet the results differ greatly in terms of the evolution of the bar.

If noise-free (really noise-reduced) initial conditions are required, there are several options, none of them ideal. We can in principle initialise a stellar system thousands of times and select the realisation that minimises the initial particle noise. This is called a quiet start. Alternatively, we could select lower noise particle distributions initially, like grids or glasses, though the former has the disadvantage that it introduces preferred directions. Finally, we can always simply increase the particle number to reduce the noise. This is the most reliable method, but it is expensive as the noise drops only as 1/√N.

At the end of the day, noise is a problem for N-body simulations that won't go away. We must simply do our best to ensure that it does not overly affect our results. This can be tested through resolution studies, quiet starts, and ensuring that perturbations of interest are always larger than the simulation noise.
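The 1/√N scaling of the Poisson noise is easy to demonstrate directly. The following toy sketch (pure Python; our own construction, not from the lecture) throws N particles down uniformly at random, measures the rms fluctuation in the fraction landing in a fixed sub-volume over many realisations (i.e. many random seeds), and checks that a 100-fold increase in N buys only a 10-fold reduction in noise.

```python
import random
import statistics

def relative_rms(n_particles, n_trials=200, seed=0):
    """rms fractional fluctuation in the count inside a sub-volume that
    holds half the particles on average, over n_trials realisations."""
    rng = random.Random(seed)
    fracs = []
    for _ in range(n_trials):
        inside = sum(1 for _ in range(n_particles) if rng.random() < 0.5)
        fracs.append(inside / n_particles)
    return statistics.pstdev(fracs) / 0.5

noise_small = relative_rms(100)
noise_big = relative_rms(10000)
print(noise_small / noise_big)  # ~ sqrt(10000/100) = 10
```

This is why brute-force increases in particle number are such an expensive route to quieter initial conditions.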
8.6 Cosmological initial conditions
Finally, we briefly touch upon setting up initial conditions for cosmological N-body simulations. This is largely beyond the scope of this course. The interested reader is referred to excellent papers by Efstathiou et al. 1985, Jenkins 2010 and Hahn and Abel 2011.
Lecture 9
Smoothed Particle Hydrodynamics
In this lecture, we discuss how to solve the Euler equations for astrophysical fluid dynamics. We focus on the popular Smoothed Particle Hydrodynamics technique.

9.1 The Euler equations

In this lecture, we consider solving the Euler equations, in the absence of sinks or sources, in Lagrangian form (Springel and Hernquist 2002):

$$\frac{d\rho}{dt} = -\rho\, \nabla \cdot \mathbf{v} \qquad \text{Continuity Equation} \qquad (9.1)$$

$$\frac{d\mathbf{v}}{dt} = -\frac{\nabla P}{\rho} \qquad \text{Momentum Equation} \qquad (9.2)$$

$$\frac{du}{dt} = -\frac{P}{\rho}\, \nabla \cdot \mathbf{v} \qquad \text{Energy Equation} \qquad (9.3)$$

where ρ, v and u are the density, velocity and internal energy per unit mass of the flow, respectively. The equations are closed by the ideal gas equation of state:

$$P = \rho u (\gamma - 1) \qquad (9.4)$$

where γ is the adiabatic index.
If gravity is also being solved, then this can simply be added to the momentum equation:

$$\dot{\mathbf{v}} = -\frac{\nabla P}{\rho} - \nabla\Phi \qquad (9.5)$$

where the gravitational forces are calculated as discussed in previous lectures. From here on, we will consider only the hydrodynamic pressure forces.
9.1.1 The entropy form

An alternative form of the Euler equations is called the entropy form. Writing the internal energy as:

$$u = \frac{A\,\rho^{\gamma - 1}}{\gamma - 1} \qquad (9.6)$$

where A(s) is the specific entropy function (a monotonic function of the specific entropy, s), note that we can rewrite the equation of state as:

$$P = A\,\rho^\gamma \qquad (9.7)$$

For adiabatic flow in the absence of sinks or sources, A is conserved. Thus, writing A = const., we implicitly solve the energy equation (equation 9.3) and need solve only the continuity and momentum equations.

Note that A is often referred to as "the entropy" when really it is a monotonic function of the specific entropy. From here on, we will adopt this convention also.
9.2 Smoothed Particle Hydrodynamics

Smoothed Particle Hydrodynamics (SPH) was first introduced as a tool for studying stellar structure (Gingold and Monaghan 1977; Lucy 1977), but has since found wide application in all areas of theoretical astrophysics (Monaghan 1992), in engineering (Libersky et al. 1993), and beyond (e.g. Hieber and Koumoutsakos 2008).

Although there are many varieties of SPH, the central idea is to represent a fluid by discrete particles that move with the flow (Monaghan 1992; Price 2005). Typically these particles represent the fluid exactly, though in some variants the fluid is advected on top of the particles (Dilts 1999; Maron and Howes 2003). The key advantages over Eulerian schemes¹ are its Lagrangian nature, which makes it Galilean invariant, and its particle nature, which makes it easy to couple to the fast multipole method for gravity that scales as O(N) (Dehnen 2000; Greengard and Rokhlin 1987). For some recent reviews see Price 2010, Springel 2010b and Rosswog 2010; for a discussion of modern SPH algorithms that are more accurate and avoid many problems with previous generation versions of SPH, see Read et al. 2010 and Read and Hayfield 2012 and references therein (and citations thereto).
9.2.1 The density interpolant

SPH starts with the density interpolant for some fluid quantity A(x):

$$A(\mathbf{x}) = \int d^3\mathbf{x}'\, A(\mathbf{x}')\, \delta(|\mathbf{x} - \mathbf{x}'|) \simeq \int d^3\mathbf{x}'\, A(\mathbf{x}')\, W(|\mathbf{x} - \mathbf{x}'|, h) \qquad (9.8)$$

where W is a positive definite smoothing kernel, and h is called the smoothing length. For the above approximation to be valid, the kernel must satisfy:

$$\int W\, d^3\mathbf{x}' = 1 \qquad (9.9)$$

and:

$$\lim_{h \to 0} W = \delta \qquad (9.10)$$

A simple example of a smoothing function that obeys the above criteria is a Gaussian kernel, but we will encounter other examples that are of greater practical use shortly.
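As a quick sketch (pure Python; the function names are our own), we can check condition 9.9 numerically for the 3D Gaussian kernel W = exp(−r²/h²)/(π^{3/2}h³), doing the angular part analytically so that only a 1D radial integral remains.

```python
import math

def w_gauss(r, h):
    """Gaussian smoothing kernel in 3D: W = exp(-r^2/h^2) / (pi^{3/2} h^3)."""
    return math.exp(-(r / h) ** 2) / (math.pi ** 1.5 * h ** 3)

def kernel_norm(h, rmax_over_h=10.0, n=100000):
    """Midpoint estimate of int W d^3x' = int_0^inf 4 pi r^2 W(r, h) dr;
    the tail beyond 10h is utterly negligible for a Gaussian."""
    step = rmax_over_h * h / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step
        total += 4.0 * math.pi * r * r * w_gauss(r, h)
    return total * step

print(kernel_norm(0.7))  # -> 1.0 to high accuracy, independent of h
```

The same check applied to any candidate kernel is a useful first sanity test before using it in an SPH code.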
Now, we can further approximate the above smooth integral by a discrete sum:

$$A_i = \sum_j \frac{m_j}{\rho_j}\, A_j\, W_{ij} \qquad (9.11)$$

where m_j/ρ_j ≃ d³x′ approximates the volume element in the integral, and W_ij = W(|x_i − x_j|, h).

Using the above, we can now derive the SPH density estimator. Writing A = ρ, the density, we have:

$$\rho_i = \sum_j m_j\, W_{ij} \qquad (9.12)$$

the key equation for SPH.
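Equation 9.12 is easy to try out. The sketch below (pure Python; the cubic spline kernel and lattice test are standard, but the code itself is our own illustration) evaluates ρ_i for the central particle of a uniform lattice of unit-mass particles, using the common M4 cubic spline kernel with compact support 2h; the estimate should recover the true density of one particle mass per unit volume to better than a per cent.

```python
import math

def w_cubic_spline(r, h):
    """M4 cubic spline kernel in 3D (compact support 2h), a standard SPH choice."""
    q = r / h
    sigma = 1.0 / (math.pi * h ** 3)
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q ** 2 + 0.75 * q ** 3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0

def sph_density(i, positions, masses, h):
    """Equation 9.12: rho_i = sum_j m_j W(|x_i - x_j|, h)."""
    xi, yi, zi = positions[i]
    rho = 0.0
    for (x, y, z), m in zip(positions, masses):
        rho += m * w_cubic_spline(
            math.sqrt((x - xi) ** 2 + (y - yi) ** 2 + (z - zi) ** 2), h)
    return rho

# A 7x7x7 lattice of unit-mass particles with unit spacing: true density = 1.
positions = [(i, j, k) for i in range(7) for j in range(7) for k in range(7)]
masses = [1.0] * len(positions)
centre = positions.index((3, 3, 3))
print(sph_density(centre, positions, masses, h=1.2))  # close to 1
```

A real SPH code would of course use a neighbour search rather than the O(N) loop over all particles shown here.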
9.2.2 The SPH equations of motion: the classic derivation

The SPH equations of motion can be derived in a number of ways. A popular method is to derive them from the Lagrangian for hydrodynamics (e.g. Bennett 2006):

$$L = \int \rho\left(\frac{1}{2}v^2 - u\right) dV \qquad (9.13)$$
¹This does not apply to Lagrangian moving mesh schemes that are Galilean invariant (Springel 2010a).
Discretising the Lagrangian similarly to above (replacing the volume element dV with the volume per SPH particle, m/ρ), we obtain (Price 2005):

$$L = \sum_j m_j \left(\frac{1}{2}v_j^2 - u_j\right) \qquad (9.14)$$

and the standard or classic SPH equations of motion then follow from the density estimate (equation 9.12) and the Euler-Lagrange equations (see Appendix H for a derivation of the Euler-Lagrange equations, and Price 2005 for a full worked derivation of the following equations):
$$\frac{d\rho_i}{dt} = \sum_j^N m_j\, \mathbf{v}_{ij} \cdot \nabla_i W_{ij} \qquad (9.15)$$

$$\frac{d\mathbf{v}_i}{dt} = -\sum_j^N m_j \left(\frac{P_i}{\rho_i^2} + \frac{P_j}{\rho_j^2}\right) \nabla_i W_{ij} \qquad (9.16)$$

$$\frac{du_i}{dt} = \frac{P_i}{\rho_i^2} \sum_j^N m_j\, \mathbf{v}_{ij} \cdot \nabla_i W_{ij} \qquad (9.17)$$

where v_ij ≡ v_i − v_j. Note that equation 9.15 is automatically satisfied by the time derivative of the SPH density estimate (equation 9.12; see Price 2005). For this reason, equation 9.12 is often referred to as the integral form of the continuity equation.

The above system of equations is closed by the discretised equation of state:

$$P_i = u_i(\gamma - 1)\rho_i \qquad (9.18)$$
9.2.3 The SPH equations of motion: a more general derivation

The above derivation, while elegant, hides the significant freedom available in deriving the SPH equations of motion. A more general derivation is discussed in detail in Read et al. 2010. The additional freedoms allow for better control of the errors and stability of the numerical method, giving much greater accuracy at little additional computational cost.

9.2.4 Errors & stability

A full error and stability analysis of the generalised SPH equations of motion is given in Read et al. 2010. A detailed discussion of this is beyond the scope of this course, but stability of the particles is a key factor that must be kept under control in SPH in order to avoid spurious numerical effects. Unlike grid-based hydrodynamics codes, the particles that move with the flow can have instabilities and support numerical phenomena like transverse waves (that do not exist in real Eulerian fluids). Suppressing such numerical artefacts is possible simply by choosing a suitable kernel function; linearised stability analyses can be used to guide this choice, and certainly some kernels are much better than others! For details, the interested reader is referred to Read et al. 2010 and Read and Hayfield 2012.

9.2.5 Dealing with trajectory crossing: adding artificial dissipation terms

As with N-body gravity, the particles in SPH represent large unresolved patches of the fluid. Similarly to force softening for gravity, we must cope with the situation when particle trajectories cross. The usual solution is to add artificial dissipation terms that advect momentum, mass and energy as required between approaching particle pairs to ensure fluid continuity. A full discussion of modern algorithms that use careful switching to avoid unnecessary dissipation is given in Read and Hayfield 2012.

9.2.6 Hydrodynamic tests of SPH

For hydrodynamic tests of SPH and some applications, please refer to the lecture slides (also available online).
Appendix A

Common constants in astrophysics

Constant                     | Value in S.I. units
Gravitational constant       | G = 6.672(4) × 10⁻¹¹ m³ kg⁻¹ s⁻²
Speed of light               | c = 2.99792458 × 10⁸ m s⁻¹ (by definition)
Solar mass                   | M⊙ = 1.989(2) × 10³⁰ kg
Earth mass                   | M⊕ = 5.976(4) × 10²⁴ kg
Solar bolometric luminosity  | L⊙ = 3.826(8) × 10²⁶ J s⁻¹
Stefan-Boltzmann constant    | σ = 5.670 × 10⁻⁸ J K⁻⁴ m⁻² s⁻¹

Unit conversions             | Value in S.I. units
Astronomical unit            | 1 a.u. = 1.49597892(1) × 10¹¹ m
Parsec                       | 1 pc = 3.08567802(2) × 10¹⁶ m
Light year                   | 1 lyr = 9.4605284 × 10¹⁵ m
Erg                          | 1 erg = 10⁻⁷ J
Minute of arc                | 1 arcmin = 2π/360/60 rad
Second of arc                | 1 arcsec = 2π/360/60/60 rad
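As an example of using this table, we can derive the value of G in the course units of kpc, M⊙ and km/s (a short sketch; the numerical values are taken from the table above):

```python
# Convert G from S.I. into the course units: kpc (km/s)^2 / Msun.
G_SI = 6.672e-11         # m^3 kg^-1 s^-2
KPC  = 3.08567802e19     # m  (one kiloparsec)
MSUN = 1.989e30          # kg
KMS  = 1.0e3             # m/s

G_astro = G_SI * MSUN / KPC / KMS ** 2
print(G_astro)  # -> approx 4.30e-6 kpc (km/s)^2 / Msun
```

This is the value of G to use when working directly in the L = kpc, M = M⊙, V = km/s unit system adopted throughout these notes.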
Appendix B

Key results from Vector Calculus

B.1 Curvilinear coordinates

Life is easy working in Cartesian coordinates: (x, y, z). However, as we shall see time and again throughout this course, problems are often much simpler if we exploit inherent symmetries. It helps then to work in coordinate systems which share the same symmetry as the problem we are looking at. In practice, this means working typically in cylindrical polar coordinates, (R, φ, z), or spherical polar coordinates, (r, θ, φ). Here, we briefly summarise the mathematical machinery required to transform between general coordinate systems. For a much more detailed account see e.g. Arfken and Weber 2005.

Suppose we switch from Cartesian coordinates to some general coordinates: (q₁, q₂, q₃). From the chain rule we have:

$$dx = \frac{\partial x}{\partial q_1} dq_1 + \frac{\partial x}{\partial q_2} dq_2 + \frac{\partial x}{\partial q_3} dq_3 \qquad (B.1)$$

and similarly for y and z. Thus the distance between two points (q₁, q₂, q₃) and (q₁ + dq₁, q₂ + dq₂, q₃ + dq₃) is given by:
$$ds^2 = dx^2 + dy^2 + dz^2 = \sum_l \frac{\partial x_l}{\partial q_i}\frac{\partial x_l}{\partial q_j}\, dq_i\, dq_j = h_{ij}\, dq_i\, dq_j \qquad (B.2)$$

where we have employed the summation convention (repeated indices are summed over), and h_ij is known as the metric tensor (you may be familiar with this from General Relativity). In orthogonal coordinate systems, h_ij is diagonal and ds² = h_ii dq_i² = h_i² dq_i². This last equality is a notation usually used to avoid confusion, since for h_ii dq_i dq_i it may not be clear what is really summed over. In the above definition, we have that h₁ = √h₁₁, and similarly for the other components. Finally, note that we have used the notation x_l for the l-th component of the vector x = (x, y, z).
As an example, consider spherical polar coordinates. Here we have:

$$x = r \sin\theta \cos\phi\,; \quad y = r \sin\theta \sin\phi\,; \quad z = r \cos\theta \qquad (B.3)$$

Thus, we have:

$$h_1^2 = h_{11} = \sum_l \frac{\partial x_l}{\partial q_1}\frac{\partial x_l}{\partial q_1} = \sin^2\theta\cos^2\phi + \sin^2\theta\sin^2\phi + \cos^2\theta = 1 \qquad (B.4)$$

Similarly, we find h₂ = r, h₃ = r sinθ.
B.2 The gradient operator

The gradient operator ∇, also called grad, del, or nabla, in Cartesian coordinates is given by:

$$\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \qquad (B.5)$$

The above notation is commonly used, but is potentially dangerous. More formally, we should write:

$$\nabla = \hat{\mathbf{x}}\frac{\partial}{\partial x} + \hat{\mathbf{y}}\frac{\partial}{\partial y} + \hat{\mathbf{z}}\frac{\partial}{\partial z} \qquad (B.6)$$

where x̂, ŷ, ẑ are unit vectors pointing along each of the Cartesian coordinate axes. In Cartesian coordinates, where the unit vectors are constant, this distinction is not so important. However, in more general orthogonal coordinates, we must remember that nabla acts also on the unit vectors themselves.
In a general, orthogonal, coordinate system (q₁, q₂, q₃), ∇ is given by:

$$\nabla = \frac{\mathbf{e}_1}{h_1}\frac{\partial}{\partial q_1} + \frac{\mathbf{e}_2}{h_2}\frac{\partial}{\partial q_2} + \frac{\mathbf{e}_3}{h_3}\frac{\partial}{\partial q_3} \qquad (B.7)$$

where e₁, e₂, e₃ are unit vectors pointing along each of the general coordinate axes. Note that we do not concern ourselves here with covariant and contravariant forms, since these only come into play when we consider non-orthogonal coordinate systems.
B.3 Divergence & Curl

The divergence in Cartesian coordinates is defined:

$$\nabla \cdot \mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \qquad (B.8)$$

And in a general, orthogonal, coordinate system (q₁, q₂, q₃), it is:

$$\nabla \cdot \mathbf{F} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}(h_2 h_3 F_1) + \frac{\partial}{\partial q_2}(h_3 h_1 F_2) + \frac{\partial}{\partial q_3}(h_1 h_2 F_3)\right] \qquad (B.9)$$

Similar results may be derived for the curl, ∇ × F, in general coordinate systems (see e.g. Arfken and Weber 2005).

The divergence and curl may be better understood physically through the following theorems:

The Divergence Theorem:

$$\int_V \nabla \cdot \mathbf{F}\, dV = \oint_S \mathbf{F} \cdot d\mathbf{S} \qquad (B.10)$$
Stokes' Theorem:

$$\oint_C \mathbf{F} \cdot d\mathbf{l} = \int_S (\nabla \times \mathbf{F}) \cdot d\mathbf{S} \qquad (B.11)$$
The above two theorems give us physical insight. Suppose that the field F represents the force per unit mass of a gravitational field: F = −∇Φ. Then ∇·F = 0 tells us that there is no net force pointing in or out of a surface bounding some volume, V, around the gravitational field. No force means no mass to produce that force. Not surprisingly, then, we have from Poisson's equation: ∇·F = −∇²Φ = −4πGρ = 0. Similarly, ∇ × F = 0 tells us something important about the gravitational field. It means that the integral around a closed loop of F·dl vanishes. But this is just the work done (the energy expended) in moving around that closed loop. It means that the field is conservative, and that particles moving in that field conserve energy. Since ∇ × ∇Φ = 0 for any scalar field Φ [exercise], we have that gravity must be a conservative force.
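We can verify ∇ × F = 0 numerically for a point-mass field Φ = −1/r (units G = M = 1). The sketch below (pure Python; our own finite-difference helpers) nests central differences: first to get F = −∇Φ, then to get its curl, which should vanish to within the finite-difference error:

```python
import math

def phi(p):
    """Point-mass potential, G = M = 1."""
    return -1.0 / math.sqrt(sum(x * x for x in p))

def force(p, eps=1e-5):
    """F = -grad(phi) by central differences."""
    out = []
    for i in range(3):
        pp = list(p); pm = list(p)
        pp[i] += eps; pm[i] -= eps
        out.append(-(phi(pp) - phi(pm)) / (2.0 * eps))
    return out

def curl(F, p, eps=1e-4):
    """curl F by central differences; d(j, i) is dF_j/dx_i."""
    def d(j, i):
        pp = list(p); pm = list(p)
        pp[i] += eps; pm[i] -= eps
        return (F(pp)[j] - F(pm)[j]) / (2.0 * eps)
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]

print(curl(force, [1.0, 0.5, -0.3]))  # -> approx [0, 0, 0]
```

The same helper can be pointed at any vector field; a field with a genuinely non-zero curl (e.g. F = (−y, x, 0)) returns a non-vanishing result, confirming the check is not trivially zero.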
Appendix C

Some useful mathematical functions

C.1 The Dirac Delta function

The Dirac Delta function is defined as:

$$\delta(\mathbf{x}) = 0\,; \quad \mathbf{x} \neq 0 \qquad (C.1)$$

$$\int f(\mathbf{x})\, \delta(\mathbf{x})\, d^3\mathbf{x} = f(0) \qquad (C.2)$$

C.2 Functions for use in tensor calculus

The Dirac Delta function is not to be confused with the Kronecker delta used in tensor calculus:

$$\delta_{ij} = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i \neq j \end{cases} \qquad (C.3)$$

Another useful object in tensor calculus is the Levi-Civita pseudo-tensor:

$$\epsilon_{ijk} = \begin{cases} +1, & \text{if } (i, j, k) = (1, 2, 3),\ (2, 3, 1),\ \text{or } (3, 1, 2) \\ -1, & \text{if } (i, j, k) = (3, 2, 1),\ (2, 1, 3),\ \text{or } (1, 3, 2) \\ 0, & \text{otherwise: } i = j,\ j = k,\ \text{or } k = i \end{cases} \qquad (C.4)$$
It is used to define the cross product of two vectors:

$$\mathbf{c} = \mathbf{a} \times \mathbf{b} \quad \Leftrightarrow \quad c_i = \epsilon_{ijk}\, a_j b_k \qquad (C.5)$$

where we have employed the summation convention:

$$c_i = \epsilon_{ijk}\, a_j b_k \equiv \sum_{j,k} \epsilon_{ijk}\, a_j b_k \qquad (C.6)$$

You should be able to convince yourself, using the definition of ε_ijk, that the above is indeed the usual cross product (which many students like to remember as the determinant of a 3×3 matrix).
Note that \epsilon_{ijk} is a pseudo-tensor. The result of a cross product is not actually a vector (as is sometimes taught), but a pseudo-vector. Pseudo-vectors transform just like normal vectors under a rotation, but not under an inversion followed by a rotation (where they gain an extra sign-flip). This is easy to see for the cross product by considering a coordinate inversion where all vectors change sign: \mathbf{a} \rightarrow -\mathbf{a}, \mathbf{b} \rightarrow -\mathbf{b}. But the pseudo-vector \mathbf{c} = \mathbf{a} \times \mathbf{b} remains unchanged. Pseudo-tensors may be similarly defined. They are, in general, of limited use because they are not (unlike normal tensors) coordinate invariant.
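As a quick numerical sanity check (an illustrative sketch, not part of the original notes), the contraction c_i = \epsilon_{ijk} a_j b_k can be evaluated by brute force and compared against the usual determinant-rule cross product:

```python
# Build the Levi-Civita symbol and use it to form the cross product
# c_i = sum_{j,k} epsilon_ijk a_j b_k (equation C.6, summation convention).

def levi_civita(i, j, k):
    """+1 for even permutations of (0,1,2), -1 for odd, 0 if any index repeats."""
    return (j - i) * (k - i) * (k - j) // 2  # compact closed form for indices 0..2

def cross(a, b):
    """Cross product via the Levi-Civita contraction."""
    return [sum(levi_civita(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3))
            for i in range(3)]

a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
print(cross(a, b))  # matches the determinant rule: [-3.0, 6.0, -3.0]
```

The nested sum is deliberately naive; its only job is to show that the index gymnastics of (C.5)-(C.6) really do reproduce the familiar cross product.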
Appendix D
The Taylor expansion
We use the Taylor expansion a lot throughout this course; for completeness, we derive it here. We may write any function as an infinite power series:

f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n; \qquad |x - a| < 1 \qquad (D.1)

The |x - a| < 1 is required to ensure the series converges. If a function may be represented by a finite number of terms (for example if f(x) is really a polynomial), then this criterion may be dropped. The coefficients, a_n, may be obtained by differentiation. Notice that:
f'(x) = \sum_{n=1}^{\infty} n\, a_n (x - a)^{n-1} \qquad (D.2)

f''(x) = \sum_{n=2}^{\infty} n(n-1)\, a_n (x - a)^{n-2} \qquad (D.3)

\vdots

f^{(n)}(x) = n!\, a_n + \mathcal{O}(x - a) \qquad (D.4)
We can now find the a_n by setting x = a:

f^{(n)}(a) = n!\, a_n \qquad (D.5)

and we derive the Taylor series:

f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n; \qquad |x - a| < 1 \qquad (D.6)
The above refers to a Taylor series in x about a point a. This can be a source of confusion for students because, more commonly, people want to expand a Taylor series in some small quantity \delta x about a point x. This means that in the above formula, we must substitute: x \rightarrow x + \delta x and a \rightarrow x. This confusing use of notation is common, unfortunately, in most math methods books. Switching variables, as above, we obtain:

f(x + \delta x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x)}{n!}\,\delta x^n; \qquad |\delta x| < 1 \qquad (D.7)

which is the form of the Taylor expansion most commonly used in physics. It may be simply generalised to functions of more than one variable to give:

f(\mathbf{x} + \delta\mathbf{x}) = \sum_{n} \frac{1}{n!}\left(\delta\mathbf{x} \cdot \nabla\right)^n f(\mathbf{x}); \qquad |\delta\mathbf{x}| < 1 \qquad (D.8)

see e.g. Arfken and Weber 2005.
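To make (D.7) concrete, here is a small numerical sketch (not from the original notes): truncating the series for f = exp at increasing order and watching the error in f(x + \delta x) shrink. For exp, every derivative f^{(n)}(x) is simply exp(x), which keeps the example short.

```python
import math

def taylor_exp(x, dx, order):
    """Approximate f(x + dx) for f = exp using the truncated series (D.7).
    Every derivative of exp at x is exp(x)."""
    return sum(math.exp(x) / math.factorial(n) * dx**n for n in range(order + 1))

x, dx = 1.0, 0.3
exact = math.exp(x + dx)
for order in (1, 2, 4, 8):
    approx = taylor_exp(x, dx, order)
    print(order, abs(approx - exact))  # the error decreases rapidly with order
```

The values of x and dx here are arbitrary test inputs; any |dx| < 1 shows the same rapid convergence.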
Appendix E
Solving Poisson's and Laplace's equations
Two very important equations in physics are Poisson's equation:

\nabla^2\Phi = 4\pi G\rho \qquad (E.1)

and, the special case, Laplace's equation:

\nabla^2\Phi = 0 \qquad (E.2)

Here we outline the basic strategy for solving Poisson's equation: reduce \nabla^2\Phi = 4\pi G\rho to solving \nabla^2\Phi = 0 inside and outside of infinitesimal spherical shells, subject to suitable boundary conditions; then sum over these infinitesimal shells. In this appendix we work through a concrete example of this in cylindrical polar coordinates (R, \phi, z).
So, step one is to solve Laplace's equation in cylindrical coordinates. Before solving it, we must first recall what Laplace's equation is in cylindrical polar coordinates. In Cartesian coordinates it is straightforward from the definition of \nabla, also called 'grad'^1 (see Appendix B):

\nabla^2\Phi = \nabla \cdot \nabla\Phi = \frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} + \frac{\partial^2\Phi}{\partial z^2} \qquad (E.3)

In more general coordinate systems, we must remember to correctly transform each of the Cartesian coordinates (see e.g. Appendix B and Arfken and Weber 2005); \nabla^2 then looks quite different.
Substituting for \mathbf{F} = \nabla\Phi in equation B.9 and noting that in cylindrical coordinates we have h_R = 1, h_\phi = R and h_z = 1, we recover:

\nabla^2\Phi = \frac{1}{R}\frac{\partial}{\partial R}\left(R\frac{\partial\Phi}{\partial R}\right) + \frac{1}{R^2}\frac{\partial^2\Phi}{\partial\phi^2} + \frac{\partial^2\Phi}{\partial z^2} = 0 \qquad (E.4)
The key to solving Laplace's equation is the method of separation of variables. It is important to note that this is only possible in some special coordinate systems, notably Cartesian, cylindrical polars, spherical polars and oblate spheroidal coordinates. In more general coordinates things get more difficult.

Separation of variables works by writing: \Phi(R, \phi, z) = A(R)B(\phi)C(z). Notice that we may now rearrange equation E.4 to give:

\underbrace{\frac{1}{AR}\frac{\partial}{\partial R}\left(R\frac{\partial A}{\partial R}\right) + \frac{1}{R^2 B}\frac{\partial^2 B}{\partial\phi^2}}_{f(R,\phi)} + \underbrace{\frac{1}{C}\frac{\partial^2 C}{\partial z^2}}_{g(z)} = 0 \qquad (E.5)

1: \nabla and \nabla^2 are often referred to as 'operators' because they operate on the variable which comes after them. In this case the operation is differentiation.
Notice that the left two terms are a function of R and \phi only, while the right term is a function only of z. This is the key to the separation of variables method. The term on the right must therefore be a constant as far as the left two terms are concerned. We may write:

\frac{1}{C}\frac{\partial^2 C}{\partial z^2} = \text{const.} = m^2 \qquad (E.6)

which gives us C(z) = C_m e^{mz}, where m is a complex number.
Now we may play the same game with the left two terms. Rearranging, we obtain:

\frac{R}{A}\frac{\partial}{\partial R}\left(R\frac{\partial A}{\partial R}\right) + m^2 R^2 = -\frac{1}{B}\frac{\partial^2 B}{\partial\phi^2} \qquad (E.7)
Now the left term is a function only of R, while the right is a function only of \phi. As above, we may introduce another constant and find: B(\phi) = B_l e^{l\phi} (l is also a complex number). This leaves just the R equation, which may be rearranged to give:

\frac{1}{R}\frac{\partial}{\partial R}\left(R\frac{\partial A}{\partial R}\right) + \left(m^2 + \frac{l^2}{R^2}\right)A = 0 \qquad (E.8)

This is Bessel's equation. Its solutions may not be obtained analytically. This is usually swept under the carpet by simply labelling the functions which solve the above equation as 'Bessel functions'; these may then be calculated whenever they are required using numerical techniques (see e.g. Press et al. 1992). The full solution to Laplace's equation may now be written as:
\Phi(R, \phi, z) = \sum_{m,l} a_{ml}\, A_{ml}(R)\, e^{mz} e^{l\phi} \qquad (E.9)

where A_{ml}(R) are the Bessel functions. Poisson's equation may now be solved from the above by applying boundary conditions at infinity, zero and the surface of the thin shell, and summing over all such infinitesimal shells.

For a disc galaxy, we may assume that it is symmetric in \phi. This is, of course, just an approximation. Real galaxies show beautiful spiral arm features which are clearly not symmetric in \phi. However, using this assumption, we have: B(\phi) = \text{const.} and l = 0. Thus equation E.8 reduces to:
\frac{1}{R}\frac{\partial}{\partial R}\left(R\frac{\partial A}{\partial R}\right) + m^2 A = 0 \qquad (E.10)

where A \propto J_0(mR), the cylindrical Bessel function of order zero (more generally, for azimuthal index l the solutions are Bessel functions J_l of order l).

As a final postscript, note that there is a whole other independent set of solutions to Bessel's equation, usually denoted J_{-l}, which is a bit confusing since this does not mean that l is negative. In special cases, we also need to consider Bessel functions of the second kind, called Neumann functions. Finally, there is a set of related functions which solve a very similar equation, called modified Bessel functions. You can read about all of these and more in any good math methods textbook (e.g. Arfken and Weber 2005).
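The remark that Bessel functions are "calculated whenever they are required using numerical techniques" is easy to make concrete. Here is a minimal sketch (not part of the original notes) that evaluates J_0 from its integral representation, J_0(x) = (1/\pi)\int_0^\pi \cos(x\sin\theta)\,d\theta, using the trapezoidal rule:

```python
import math

def bessel_j0(x, n=2000):
    """J_0(x) via its integral representation, trapezoidal rule with n panels."""
    h = math.pi / n
    # endpoints of the integrand cos(x sin(theta)) on [0, pi]
    s = 0.5 * (math.cos(x * math.sin(0.0)) + math.cos(x * math.sin(math.pi)))
    s += sum(math.cos(x * math.sin(i * h)) for i in range(1, n))
    return s * h / math.pi

# J_0(0) = 1 exactly; the first zero of J_0 lies near x = 2.4048
print(bessel_j0(0.0), bessel_j0(2.404825557695773))
```

Because the integrand is smooth and has vanishing odd derivatives at both endpoints, the trapezoidal rule converges extremely fast here; production codes use more sophisticated recurrences (see Press et al. 1992).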
Appendix F
Some useful potential-density pairs
In Lecture 3, we developed the machinery to calculate the gravitational potential from arbitrary mass distributions. However, on a day-to-day basis it is really useful to have some handy analytic potential-density pairs which reasonably describe real galaxies. These can be really useful for back-of-the-envelope calculations and are often used for 'toy models' in astrophysics. In this appendix we present some of these.
F.1 Spherical systems
F.1.1 Point Mass Keplerian potential
You should be familiar with this one by now, but we present it for completeness. Defining the escape velocity, v_e, and the rotation curve or circular speed, v_c, we obtain:

\Phi(r) = -\frac{GM}{r}; \qquad v_c(r) = \sqrt{\frac{GM}{r}}; \qquad v_e(r) = \sqrt{\frac{2GM}{r}} \qquad (F.1)

The rotation curve is the circular speed of a tracer (i.e. massless) particle orbiting on a circular orbit at a radius R: v_c^2 = R\,\partial\Phi/\partial R. The escape velocity is derived from the usual balance of kinetic and potential energy: v_e = \sqrt{2|\Phi(r)|}. It is important to remember that such quantities only really have meaning in spherical potentials, or in the symmetry plane of axisymmetric potentials.

A common question from students is: what is the density profile of the point mass? We require a function which is zero everywhere except at the origin and which, when integrated over all space, gives the total mass, M. This is the delta function (see Appendix C):

\rho(\mathbf{r}) = M\,\delta(\mathbf{r}) \qquad (F.2)
F.1.2 Constant density sphere
Not much in the Universe may really be described by a constant density sphere. But this potential has a number of curious properties and so will crop up time and again during this course. It is also used to derive the definition of the dynamical time which we quoted without proof in Lecture 1.

The constant density sphere, \rho = \rho_0 = \text{const.}, generates the potential of a harmonic oscillator:

\Phi(r) = \frac{1}{2}\Omega^2 r^2 + \text{const.} \qquad (F.3)

where \Omega^2 = \frac{4\pi}{3}G\rho_0 is a constant. Such a potential is interesting because of its resonant properties.

Defining the dynamical time as the time taken to travel from r to the centre of the sphere, we obtain:

t_{\rm dyn} = \sqrt{\frac{3\pi}{16 G \rho_0}} \qquad (F.4)
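The formula (F.4) can be checked directly (an illustrative sketch, not part of the original notes; units with G = \rho_0 = 1 are assumed): a particle released from rest inside a uniform sphere feels an inward force -\Omega^2 r, and integrating its fall to the centre recovers t_{\rm dyn}:

```python
import math

G, rho0 = 1.0, 1.0                        # assumed units with G = rho0 = 1
omega2 = 4.0 * math.pi / 3.0 * G * rho0   # Omega^2 of the harmonic potential

# Release a particle from rest at r = 1 and integrate its fall to the centre
# (symplectic Euler: kick, then drift); the crossing time is t_dyn.
r, v, t, dt = 1.0, 0.0, 0.0, 1e-6
while r > 0.0:
    v -= omega2 * r * dt                  # acceleration is -Omega^2 r
    r += v * dt
    t += dt

t_dyn = math.sqrt(3.0 * math.pi / (16.0 * G * rho0))   # equation (F.4)
print(t, t_dyn)                           # the two agree closely
```

The fall is a quarter-period of simple harmonic motion, which is exactly where the factor \sqrt{3\pi/16G\rho_0} comes from.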
F.1.3 Power-law profile
The power-law profile is a useful approximation to the luminosity profile of many galaxies. It is attractive because of its simple analytic form. However, all power-law profiles have infinite mass, and so should be used with a bit of caution.

\rho(r) = \rho_0\left(\frac{r_0}{r}\right)^\alpha \qquad (F.5)

M(r) = \frac{4\pi\rho_0 r_0^\alpha}{3 - \alpha}\, r^{3-\alpha}; \qquad \alpha < 3 \qquad (F.6)

\Phi = \frac{4\pi G\rho_0 r_0^\alpha}{(2 - \alpha)(3 - \alpha)}\, r^{2-\alpha} + \text{const.} \qquad (F.7)

v_c^2(r) = \frac{4\pi G\rho_0 r_0^\alpha}{3 - \alpha}\, r^{2-\alpha} \qquad (F.8)

For 2 < \alpha < 3, although the mass diverges at large r, the escape speed is finite and is given by:

v_e^2(r) = \frac{2 v_c^2(r)}{\alpha - 2} \qquad (F.9)

Note that in the real universe we see galaxies projected onto the sky. As a result, we measure the surface density of a galaxy rather than its 3D density. The surface density profile of this model is given by \Sigma(R) \propto 1/R^{\alpha-1}. Elliptical galaxies suggest \alpha \simeq 3, while the flatness of spiral galaxy rotation curves suggests that \alpha \simeq 2, corresponding to the singular isothermal sphere model (see later). Thus we note that:

- v_e will be very uncertain (from equation F.9, it diverges as \alpha \rightarrow 2).

- The divergence of M(r) in these models shows that we need to get out to where the galaxy becomes Keplerian if we want to measure a total mass and luminosity.
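As a quick cross-check of (F.6) and (F.8) (an illustrative sketch, not from the original notes; units with G = \rho_0 = r_0 = 1 and a representative slope \alpha = 2.5 are assumed), we can integrate the density shells directly and compare with the analytic enclosed mass:

```python
import math

G, rho0, r0, alpha = 1.0, 1.0, 1.0, 2.5   # assumed units; 2 < alpha < 3

def mass_analytic(r):
    """Enclosed mass, equation (F.6)."""
    return 4.0 * math.pi * rho0 * r0**alpha / (3.0 - alpha) * r**(3.0 - alpha)

def mass_numeric(r, n=100000):
    """M(r) = int_0^r 4 pi r'^2 rho(r') dr' by the midpoint rule."""
    h = r / n
    total = 0.0
    for i in range(n):
        rp = (i + 0.5) * h
        total += 4.0 * math.pi * rp * rp * rho0 * (r0 / rp)**alpha * h
    return total

r = 2.0
vc2 = G * mass_analytic(r) / r    # this is v_c^2 = G M(r)/r, i.e. equation (F.8)
print(mass_analytic(r), mass_numeric(r), vc2)
```

The midpoint rule is used because the integrand is (mildly) singular at r = 0 for \alpha > 2, so the rule must avoid evaluating at the origin.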
F.1.4 Split power-law models
Power-law distributions have infinite mass. We can get around this problem by using, instead, split power-law distributions. The profile is due to several authors (Hernquist 1990, Saha 1992, Dehnen 1993 and Zhao 1996). The density-potential pair is given by:

\rho_{\rm SP} = \frac{M(3 - \gamma)}{4\pi a^3}\,\frac{1}{(r/a)^\gamma\,(1 + r/a)^{4-\gamma}} \qquad (F.10)

\Phi_{\rm SP} = \frac{GM}{a(2 - \gamma)}\left[\left(1 + a/r\right)^{\gamma-2} - 1\right] \qquad (F.11)

where M and a are the mass and scale length in both cases, and \gamma is the central log-slope for the SP profile (note that the SP profile always goes as \rho_{\rm SP} \propto r^{-4} for r \gg a).
Other useful quantities for these models:

M(r) = M\left(\frac{r}{r + a}\right)^{3-\gamma} \qquad (F.12)

v_c^2(r) = \frac{GM}{r}\left(\frac{r}{r + a}\right)^{3-\gamma} \qquad (F.13)

v_e^2(r) = 2\left|\Phi_{\rm SP}(r)\right| \qquad (F.14)

Even more general profiles are presented in Zhao 1996, but these involve Hypergeometric functions in general and are less usefully analytic. The split power-law model with \gamma = 1 is called a Hernquist profile. This is particularly useful because it provides a good fit to the dark matter density distribution found in cosmological simulations of the Universe that assume a weakly interacting relic particle (WIMP; cold dark matter). A useful profile that does not have such a simple analytic potential is the
Navarro, Frenk and White (NFW) profile that provides an even better match to these dark matter halos (Navarro et al. 1996; Lokas and Mamon 2001):

\rho_{\rm NFW} = \frac{\rho_0}{(r/a)\left(1 + r/a\right)^2} \qquad (F.15)
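The Hernquist case (\gamma = 1) is simple enough to check numerically. This sketch (not part of the original notes; G = M = a = 1 units are assumed) compares the analytic enclosed mass (F.12) against a direct integration of the density (F.10), and shows that the total mass is finite, unlike the pure power law:

```python
import math

G = M = a = 1.0     # assumed units
gamma = 1.0         # the Hernquist case of the split power law

def rho(r):
    """Split power-law density, equation (F.10)."""
    return (M * (3.0 - gamma) / (4.0 * math.pi * a**3)
            / ((r / a)**gamma * (1.0 + r / a)**(4.0 - gamma)))

def mass_analytic(r):
    """Enclosed mass, equation (F.12)."""
    return M * (r / (r + a))**(3.0 - gamma)

def mass_numeric(r, n=200000):
    """Direct midpoint-rule integration of 4 pi r'^2 rho(r')."""
    h = r / n
    return sum(4.0 * math.pi * ((i + 0.5) * h)**2 * rho((i + 0.5) * h) * h
               for i in range(n))

print(mass_analytic(5.0), mass_numeric(5.0))  # both ~(5/6)^2 M
print(mass_analytic(1e6))                     # -> M: the total mass is finite
```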
F.2 Axisymmetric systems
The two most useful axisymmetric potential-density pairs, the Miyamoto-Nagai disc and the Logarithmic potential, were both given in detail in Lecture 3. For further examples, see Binney and Tremaine 2008.
F.3 Triaxial systems
F.3.1 Generalised Ferrers potentials
We may use Newton III to sum over shells to form bodies constructed on similar ellipsoids. The
analysis is similar to the multipole expansion, but using the oblate spheroid potentials we derived
in section 3.2; we will not present the details here (the interested reader may consult Binney and
Tremaine 2008).
The results are a useful class of near-analytic triaxial potentials. These are good for getting a
handle on how particle orbits change as a potential lowers in symmetry.
Writing:

m^2 \equiv a_1^2 \sum_{i=1}^{3} \frac{x_i^2}{a_i^2 + \tau} \qquad (F.16)

where a_1, a_2, a_3 are the principal axes of the triaxial ellipsoid, we obtain the potential-density pair which is constant on the iso-density surface m^2:

\rho = \rho(m^2) \qquad (F.17)

\Phi(\mathbf{x}) = -\pi G\left(\frac{a_2 a_3}{a_1}\right)\int_0^\infty \frac{\left[\psi(\infty) - \psi(m)\right]d\tau}{\sqrt{(\tau + a_1^2)(\tau + a_2^2)(\tau + a_3^2)}} \qquad (F.18)

where:

\psi(m) \equiv \int_0^{m^2} \rho(m'^2)\, dm'^2 \qquad (F.19)

The above equation for \psi is analytic for the split power-law density, above, and so \Phi may be determined from one numerical integral which is not too hard.
Appendix G
Spherical harmonics
Spherical harmonics are an orthonormal basis function set, defined on the surface of a sphere (\theta, \phi). They are a natural choice for systems at or close to spherical symmetry. A function can in general be written as some sum over the basis set (c.f. Fourier series):

f(r, \theta, \phi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} f_{lm}(r)\, Y_l^m(\theta, \phi) \qquad (G.1)

where (r, \theta, \phi) are the familiar spherical polar coordinates, f_{lm}(r) are coefficients, and Y_l^m(\theta, \phi) are the spherical harmonic basis functions (see below for the first few of these). The coefficients may be derived similarly to Fourier series coefficients by multiplying through by the conjugate basis set Y_l^{m*}(\theta, \phi) and integrating:

f_{lm} = \int_0^\pi \sin\theta\, d\theta \int_0^{2\pi} d\phi\, Y_l^{m*}(\theta, \phi)\, f(r, \theta, \phi) \qquad (G.2)

The first few spherical harmonic terms are given below for reference. A graphical representation of these is given in Figure G.1.
Y_0^0(\theta, \phi) = \sqrt{\frac{1}{4\pi}} \qquad Y_1^0(\theta, \phi) = \sqrt{\frac{3}{4\pi}}\cos\theta \qquad Y_1^1(\theta, \phi) = -\sqrt{\frac{3}{8\pi}}\sin\theta\, e^{i\phi}

Y_2^0(\theta, \phi) = \sqrt{\frac{5}{16\pi}}\left(3\cos^2\theta - 1\right) \qquad Y_2^1(\theta, \phi) = -\sqrt{\frac{15}{8\pi}}\sin\theta\cos\theta\, e^{i\phi} \qquad Y_2^2(\theta, \phi) = \sqrt{\frac{15}{32\pi}}\sin^2\theta\, e^{2i\phi}
Figure G.1: The real and imaginary parts of the first few spherical harmonic basis functions. They may look familiar to you from chemistry class. The monopole term is what physical chemists would call an 's' orbital; the dipole term is what is referred to as a 'p' orbital. The sum over many terms in the spherical harmonic series, each with different weight, can reproduce any smooth function of (\theta, \phi). Hence they are referred to as orthogonal basis functions. Fourier series are another example of a set of basis functions you may have come across before.
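The orthonormality underlying equation (G.2) is easy to verify numerically. This is an illustrative sketch (not part of the original notes), building Y_0^0 and Y_1^0 from the table above and integrating over the sphere with a simple midpoint rule:

```python
import math

def Y00(theta, phi):
    """Monopole term from the table above."""
    return math.sqrt(1.0 / (4.0 * math.pi))

def Y10(theta, phi):
    """Dipole (m = 0) term from the table above."""
    return math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta)

def overlap(Ya, Yb, n=400):
    """Integrate Ya*Yb sin(theta) over the sphere with the midpoint rule.
    (These two harmonics are real, so no complex conjugation is needed.)"""
    dth, dph = math.pi / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(n):
            ph = (j + 0.5) * dph
            total += Ya(th, ph) * Yb(th, ph) * math.sin(th) * dth * dph
    return total

print(overlap(Y00, Y00))   # ~1: normalised
print(overlap(Y10, Y10))   # ~1: normalised
print(overlap(Y00, Y10))   # ~0: orthogonal
```

Exactly the same construction, with complex conjugation added, extracts the coefficients f_{lm} of an arbitrary function as in (G.2).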
Appendix H
Lagrangian & Hamiltonian
mechanics
In this appendix we briefly review Lagrangian and Hamiltonian mechanics. It is important to state up-front that neither of these methods will do anything that you can't already do with Newtonian mechanics. In fact, they do a little bit less, as you will see on the problem sheet. However, they often make hard problems in Newtonian mechanics very simple. As an example, we will use them in this appendix to prove two powerful theorems: Noether's theorem and Liouville's theorem. You will see many other examples on the problem sheet and throughout the course.
H.1 Lagrangian mechanics
In classical Newtonian mechanics, a system of gravitating particles evolves under Newton's laws. These are the familiar: a body continues its motion unchanged unless acted on by a force; force is the rate of change of momentum; and every action has an equal and opposite reaction^1. Lagrangian mechanics is really just a reworking of Newton's first law. The central idea is summed up in Figure H.1. A particle starts at a position \mathbf{x}_1(t_1) and moves to \mathbf{x}_2(t_2). If no forces acted it would move in a straight line. However, in the presence of forces (in this case gravity), the particle's motion will be more complex. The central idea in Lagrangian mechanics is that this deviation from a straight line path will be as small as possible. Particles will move on the shortest possible path between points 1 and 2 given the constraint that they are acted on by forces. In more mathematical language, we may write the path length between 1 and 2 as:

S = \int_{\mathbf{x}_1, t_1}^{\mathbf{x}_2, t_2} L(\mathbf{x}, \dot{\mathbf{x}}, t)\, dt \qquad (H.1)

This defines the Lagrangian, L(\mathbf{x}, \dot{\mathbf{x}}, t), which now contains all of the physics.
Now, suppose that we know that the path, S, is the shortest given the physical constraints. This means that if we pick a path infinitesimally close to S, S', then \delta S = S' - S = 0. This is shown in Figure H.1. Along the path S, the particle motion is given by the function \mathbf{x}(t); along S' it is given by \mathbf{x}'(t) = \mathbf{x} + \delta\mathbf{x}. Now, if we can solve for \mathbf{x}(t), then we have solved for the dynamics of the system; we know what path the particle will take given the boundary conditions: \mathbf{x}_1(t_1), \mathbf{x}_2(t_2). The solution is a key result from the Calculus of Variations and we will now derive it:
\delta S = S' - S = \delta\int_{\mathbf{x}_1, t_1}^{\mathbf{x}_2, t_2} L\, dt = \int_{\mathbf{x}_1, t_1}^{\mathbf{x}_2, t_2} \left[L([\mathbf{x} + \delta\mathbf{x}], [\dot{\mathbf{x}} + \delta\dot{\mathbf{x}}], t) - L(\mathbf{x}, \dot{\mathbf{x}}, t)\right] dt \qquad (H.2)

1: Of course, Newton himself never actually phrased the laws in this way.
Figure H.1: The principle of least action: a particle will move on the extremum path between two fixed points, \mathbf{x}_1(t_1) and \mathbf{x}_2(t_2); the true path S (described by \mathbf{x}(t)) is compared with a perturbed path S' (described by \mathbf{x} + \delta\mathbf{x}).
Taylor expanding the left term^2, and using the summation convention^3, gives:

\delta S = \int_{\mathbf{x}_1, t_1}^{\mathbf{x}_2, t_2} \left[L + \frac{\partial L}{\partial x_i}\delta x_i + \frac{\partial L}{\partial \dot{x}_i}\delta\dot{x}_i + \mathcal{O}(\delta^2) - L\right] dt = \int_{\mathbf{x}_1, t_1}^{\mathbf{x}_2, t_2} \left[\frac{\partial L}{\partial x_i}\delta x_i + \frac{\partial L}{\partial \dot{x}_i}\delta\dot{x}_i\right] dt \qquad (H.3)

where x_i is one component of the vector \mathbf{x}.
The second term may be dealt with by integrating by parts and noting that \delta\mathbf{x}(t_1) = \delta\mathbf{x}(t_2) = 0. Thus, we have:

\delta S = \int_{\mathbf{x}_1, t_1}^{\mathbf{x}_2, t_2} \left[\frac{\partial L}{\partial x_i} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_i}\right)\right]\delta x_i\, dt = 0 \qquad (H.4)

and we derive the Euler-Lagrange equations:

\frac{\partial L}{\partial x_i} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_i}\right) = 0 \qquad (H.5)
The above equations now allow us to solve for \mathbf{x}(t) given the Lagrangian, L. But what is L? In practice, L is just whatever mathematical function recovers the correct dynamics equations, in this case Newton's laws. For classical mechanics, we have L = T - V, where T is the kinetic energy and V is the potential energy. It is tempting to ascribe some physical meaning to the above, but this would be a mistake. The Lagrangians for special relativity and electromagnetism do not have such simple forms and so we should think of it only as a coincidence.

Putting L = T - V = \frac{1}{2}m\dot{x}_i^2 - m\Phi into equation H.5 gives:

m\left(\ddot{x}_i + \frac{\partial\Phi}{\partial x_i}\right) = 0 \qquad (H.6)

and we recover Newton's second law!

We appear to have put lots of work into developing some mathematical machinery which has just given us (by design) Newton's second law. So what is the point of the above exercise? In the following sections, we shall use the Euler-Lagrange equations to derive some very powerful theorems. We will see that the above is effectively a very useful mathematical trick which makes some problems much easier to solve. However, as you will see on the problem sheet, it can make some problems harder to solve; and some, impossible.
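The principle of least action itself can be demonstrated numerically. This sketch (not part of the original notes; the particle mass, gravity and time interval are assumed test values) computes the action S = \int(\frac{1}{2}m\dot{x}^2 - mgx)\,dt for a particle in free fall, comparing the true parabolic path with perturbed paths that share the same endpoints; the true path has the smallest action:

```python
import math

m, g, T, N = 1.0, 9.8, 1.0, 2000   # assumed test values; dt = T/N
dt = T / N

def action(eps):
    """S = int (1/2 m xdot^2 - m g x) dt along the perturbed path
    x(t) = -g t^2/2 + eps sin(pi t / T).  The perturbation vanishes at
    t = 0 and t = T, so every path shares the same endpoints."""
    S = 0.0
    for i in range(N):
        t = (i + 0.5) * dt
        x = -0.5 * g * t * t + eps * math.sin(math.pi * t / T)
        v = -g * t + eps * (math.pi / T) * math.cos(math.pi * t / T)
        S += (0.5 * m * v * v - m * g * x) * dt
    return S

for eps in (0.0, 0.1, -0.1, 0.5):
    print(eps, action(eps))   # eps = 0 (the true path) gives the smallest S
```

For this Lagrangian the second variation contains only the (positive) kinetic term, so the extremum really is a minimum here.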
H.1.1 Holonomic constraints
If you remember back to all of those mechanics classes you sat through in your youth, you may remember that one of the most confusing aspects of Newtonian mechanics is getting the reactionary

2: You should have seen this many times by now. If you need to refresh your memory see Appendix D.

3: This convention, due to Einstein, means that all repeated indices are summed over: \frac{\partial L}{\partial x_i}\delta x_i \equiv \sum_{i=1}^{3}\frac{\partial L}{\partial x_i}\delta x_i.
Figure H.2: An example of Newtonian vs. Lagrangian mechanics: a ball of radius a rolling (f \neq 0) or sliding (f = 0) down a plane inclined at angle \theta. The reaction force and friction force from the plane, R and f respectively, are marked; neither is necessary in the Lagrangian approach.
forces right; for example, the force of reaction of an inclined plane on a ball falling under gravity. Such problems are completely avoided in the Lagrangian approach, which derives the equations of motion just from the kinetic and potential energy. However, unfortunately, we do not get something for nothing, as we shall show in this section.

Let us consider the inclined plane problem to help to illustrate what is going on. The familiar set-up is shown in Figure H.2: a ball of mass m, radius a, slides down a plane of angle \theta. The reactionary force from the plane is marked, R. We assume, for now, that the ball slides without rolling (f = 0).

The reactionary force must be supposed to exist because otherwise the ball would fall directly through the plane; the force from gravity points downwards, after all! So the reactionary force is a form of constraint: the ball is constrained to move on the surface of the plane. Mathematically, we may write the constraint as: y = 0. Constraints which may be written in this form (g(x_i) = 0) are called holonomic constraints. Such constraint equations reduce the degrees of freedom for the ball: the number of independent directions the ball can move in.

The entire motion of the ball may be described using just the x coordinate along the plane. This is known as a generalised coordinate. It describes the motion of the ball only within the space it is constrained to move in.

Just for illustrative purposes, let us now derive the equations of motion for the ball, first using Newton, and then again with the new Lagrangian technique. Newton is straightforward: balancing forces along the plane and perpendicular to the plane, we have:

mg\cos\theta = R \qquad (H.7)

mg\sin\theta = m\ddot{x} \qquad (H.8)
Now using Lagrangian mechanics (taking x to increase down the slope, so that V = -mgx\sin\theta):

L = \frac{1}{2}m\dot{x}^2 + mgx\sin\theta \qquad (H.9)

which from equation H.5 gives:

m\ddot{x} - mg\sin\theta = 0 \qquad (H.10)

and notice that we no longer need to even introduce the concept of the reactionary force!
Lagrangian mechanics is great for systems where we can work in generalised coordinates which describe the motion of some particles subject to some holonomic constraints. However, we run into problems when we cannot describe the constraints in holonomic form.

Imagine now the same problem as above, but with friction between the plane and the ball (f \neq 0). If the ball rolls without slipping, the no-slippage constraint gives us: dx = a\,d\phi (the angle, \phi, is defined in Figure H.2). A constraint of this form is non-holonomic. Consider a representative point on the surface of the sphere, P. It now necessarily moves in two dimensions. There is no simple constraint which reduces the degrees of freedom of the problem.
We can still solve the problem using Newton and Lagrange, however. First Newton:

m\ddot{x} = mg\sin\theta - f \qquad (H.11)

I\ddot{\phi} = \frac{I}{a}\ddot{x} = fa \qquad (H.12)

where I = \frac{2}{5}ma^2 is the moment of inertia of a sphere^4; and we have used the no-slippage constraint equation to eliminate \phi. Eliminating the friction force, f, then gives us the equation of motion:

m\ddot{x} = mg\sin\theta - \frac{2}{5}m\ddot{x} \qquad (H.13)

Notice that we have once again required an additional force: the friction, f. In the Lagrangian approach this is not necessary; we can derive the equation of motion directly from the Lagrangian:

L = \frac{1}{2}m\dot{x}^2 + \frac{1}{2}I\dot{\phi}^2 + mgx\sin\theta \qquad (H.14)

But, wait a minute! We know from Newton that the equation of motion has only one variable, x (see equation H.13; remember that \theta is a constant). Yet the Lagrangian as written above is an equation in two variables: x and \phi. We may substitute for \dot{\phi} using the no-slippage constraint. But we cannot choose a new coordinate system which forces the Lagrangian to have only one generalised coordinate.

The above illustrates a key point. There is no holonomic constraint equation which can be applied. The non-slip condition is an example of a velocity constraint equation: \dot{x} = a\dot{\phi}; these are always non-holonomic. Of course, we can still solve the problem by substituting for \dot{\phi} and then using equation H.5. This recovers equation H.13 as expected [exercise].
You will try some more astrophysical examples of this on the problem sheet.
H.1.2 Noether's Theorem
We have seen some of the strengths and short-falls of the Lagrangian approach. Now it is time to see its full potential. Consider a Lagrangian which is invariant under a translation: L(x_i + \delta x_i) = L(x_i). Then we can Taylor expand the left term:

L(x_i + \delta x_i) = L(x_i) + \delta x_i\frac{\partial L}{\partial x_i} + \mathcal{O}(\delta x_i^2) \qquad (H.15)

Breaking the translation up into a series of infinitesimal steps (\lim \delta x_i \rightarrow 0), we recover the definition of differentiation:

\frac{L(x_i + \delta x_i) - L(x_i)}{\delta x_i} = \frac{\partial L}{\partial x_i} = 0 \qquad (H.16)

and from the Euler-Lagrange equation (equation H.5), we obtain:

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_i}\right) = 0 \qquad (H.17)

\Rightarrow \quad \frac{\partial L}{\partial \dot{x}_i} = m\dot{x}_i = \text{const.} \qquad (H.18)

A Lagrangian which does not depend on the position of the system as a whole requires linear momentum conservation! This is actually quite a deep result. We observe momentum conservation in the Universe as an empirical fact. However, the above tells us that we must have linear momentum conservation if the laws of physics (which are completely described by the Lagrangian) are to be the same everywhere in space.
4: Recall that the moment of inertia of a body is a second moment of the density. The first moment is the centre of mass times the mass: M\bar{\mathbf{x}} = \int \mathbf{x}\,\rho(\mathbf{x})\,d^3x. The moment of inertia is the second moment involving the perpendicular distance to a given axis. It must in general be a tensor. About the centre of mass we have: I_{ij} = \int (r^2\delta_{ij} - x_i x_j)\rho(\mathbf{x})\,d^3x. Hence I describes a torque. In this problem, we use the moment of inertia for the sphere. In this case, by symmetry, we have I_{xx} = I_{yy} = I_{zz} = I = \int (x^2 + y^2)\rho\, d^3x = \frac{2}{5}ma^2. All other terms are zero.
We may perform a similar exercise for angular momentum. We work in cylindrical polars (R, \phi, z) for simplicity. Consider a Lagrangian invariant under a rotation: L(\phi + \delta\phi) = L(\phi). As above, we can Taylor expand to show that \frac{\partial L}{\partial \phi} = 0. Then the Euler-Lagrange equations give us:

\frac{\partial L}{\partial \dot{\phi}} = \frac{\partial}{\partial \dot{\phi}}\left[\frac{1}{2}m\dot{R}^2 + \frac{1}{2}mR^2\dot{\phi}^2 + \frac{1}{2}m\dot{z}^2 - V(R, \phi, z)\right] = mR^2\dot{\phi} = \text{const.} \qquad (H.19)

So if physics is the same whichever way we are facing, then we must have angular momentum conservation.
Finally, we can look at what happens if the Lagrangian is invariant in time. Now we have \frac{\partial L}{\partial t} = 0. Thus:

\frac{dL}{dt} = \frac{\partial L}{\partial x_i}\dot{x}_i + \frac{\partial L}{\partial \dot{x}_i}\ddot{x}_i = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_i}\dot{x}_i\right) \qquad (H.20)

Thus:

H = \frac{\partial L}{\partial \dot{x}_i}\dot{x}_i - L = T + V = \text{const.} \qquad (H.21)

where H is the Hamiltonian of the system and is the total energy of the system.
We have shown that if physics is the same from one moment to the next, the same independent of direction, and the same from one place to the next, then we must have conservation (globally) of energy, angular momentum and momentum. This is why these three conservation laws are so central to modern physics. They are difficult to get away from, particularly in astronomy. If physics were different from one location to the next, or from one time to the next, then the whole exercise of astrophysics would be made extremely difficult. We could no longer reliably apply terrestrial physics to the cosmos!
H.1.3 Rotating reference frames
As another example of Lagrangian mechanics making life easier, let's derive the equations of motion for a general rotating reference frame. You may think that you have derived this before using Newton's laws. But little did you know that you really only treated a special case: that of frames rotating at constant angular speed. We will now derive the general result using Lagrangian mechanics, with the angular velocity of the frame free to be a function of time: \boldsymbol{\Omega} = \boldsymbol{\Omega}(t). We will see that there are, in general, more fictitious forces than just the centrifugal and coriolis terms you will have seen before. Sounds hard? Watch as the Euler-Lagrange equations make it easy...

First let's write down the Lagrangian:

L = T - V = \frac{1}{2}m|\mathbf{v}_{\rm in}|^2 - m\Phi(\mathbf{x}) \qquad (H.22)

where \mathbf{v}_{\rm in} is the total velocity as observed from an inertial frame. This velocity then includes that of the rotating frame. Thus we have:

\mathbf{v}_{\rm in} = \dot{\mathbf{x}} + \boldsymbol{\Omega} \times \mathbf{x} \qquad (H.23)

where \boldsymbol{\Omega}(t) is the angular velocity of the rotating frame.

Now, hopefully you will have seen tensor notation before. Vectors are all very well, but tensor notation makes life so much easier. We will, as above, use the summation convention throughout and
from now on x_i refers to an element in the vector \mathbf{x}, and similarly for other quantities. The main reason why this makes life easier is that all quantities then become scalars (e.g. x_i is a scalar element of the vector \mathbf{x}) and they commute, add, subtract, multiply and divide just like normal numbers.

In tensor notation the Lagrangian becomes:

L = \frac{1}{2}m\left(\dot{x}_i + \epsilon_{ijk}\Omega_j x_k\right)^2 - m\Phi(\mathbf{x}) \qquad (H.24)

Hopefully, you will have seen the Levi-Civita pseudo-tensor, \epsilon_{ijk}, before. Primarily it is used to define the cross product in tensor notation. If you are not familiar with it, have a look in Appendix C. Note also that the notation, \Phi(\mathbf{x}), just means that \Phi is a function of all three position coordinates.

Now all we have to do is put the above Lagrangian into the Euler-Lagrange equations. Simple! We have:
We have:
d
dt
_
L
x
l
_
=
d
dt
[m( x
i
+
ijk

j
x
k
)
il
] (H.25)
where
il
is the Kronecker delta (see Appendix C). An important point to note here. Remember when
dierentiating in tensor notation to always introduce a new subscript (in this case l). Dont use any
of the subscripts already in use. You will end up in a horrible mess that way. Remember that all of
the other subscripts in the Lagrangian are being summed over! (they are called dummy indices).
Now for the next part of the Euler-Lagrange equations:
L
x
l
= m( x
i
+
ijk

j
x
k
)
ipq

ql
m

x
l
(H.26)
Putting the above together, and returning to vector notation, we recover:

m\left[\ddot{\mathbf{x}} + \dot{\boldsymbol{\Omega}} \times \mathbf{x} + 2\boldsymbol{\Omega} \times \dot{\mathbf{x}} + \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{x})\right] = -m\nabla\Phi \qquad (H.27)

We now see that, in general, there are three fictitious forces. The term on the left is the new one: the inertial force of rotation, \dot{\boldsymbol{\Omega}} \times \mathbf{x}; the middle term is the coriolis force; and the right term is the centrifugal force. We have been using the centrifugal force throughout the course so far. We see that in the special case of circular orbits, \boldsymbol{\Omega} = \Omega\hat{\mathbf{z}}, and the centrifugal term reduces to the familiar m\Omega^2 R (in cylindrical coordinates: (R, \phi, z)).
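Equation (H.27) is easy to test numerically. This sketch (not part of the original notes; the value of \Omega is an arbitrary test choice) integrates a free particle (\Phi = 0) in a frame rotating at constant \Omega about z, then rotates the result back to the inertial frame, where the trajectory must be a straight line:

```python
import math

Omega = 0.7   # constant rotation rate about the z-axis (an assumed test value)

def accel(x, y, vx, vy):
    """Fictitious accelerations in the rotating frame for Phi = 0 and constant
    Omega: coriolis -2 Omega x v plus centrifugal -Omega x (Omega x r)."""
    return (2.0 * Omega * vy + Omega**2 * x,
            -2.0 * Omega * vx + Omega**2 * y)

def rk4_step(st, dt):
    """One fourth-order Runge-Kutta step for the state st = [x, y, vx, vy]."""
    def deriv(s):
        ax, ay = accel(*s)
        return [s[2], s[3], ax, ay]
    k1 = deriv(st)
    k2 = deriv([st[i] + 0.5 * dt * k1[i] for i in range(4)])
    k3 = deriv([st[i] + 0.5 * dt * k2[i] for i in range(4)])
    k4 = deriv([st[i] + dt * k3[i] for i in range(4)])
    return [st[i] + dt / 6.0 * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) for i in range(4)]

# At t = 0 the frames coincide; the particle starts at the origin with unit
# velocity along x (there v_rot = v_in, since Omega x r = 0 at the origin).
state, t, dt, T = [0.0, 0.0, 1.0, 0.0], 0.0, 1e-3, 2.0
while t < T - 0.5 * dt:
    state = rk4_step(state, dt)
    t += dt

# Rotate the rotating-frame position back into the inertial frame.
c, s = math.cos(Omega * t), math.sin(Omega * t)
x_in, y_in = c * state[0] - s * state[1], s * state[0] + c * state[1]
print(x_in, y_in)   # ~(t, 0): a straight line in the inertial frame
```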
H.2 Hamiltonian mechanics
At the end of section H.1.2, we introduced a new quantity: the Hamiltonian, H, of a system. This describes the total energy of the system and we may use it to form Hamiltonian mechanics. This is really just another way of formulating Lagrangian mechanics. But as we shall see, it is also useful for quickly solving some otherwise difficult problems.

We first define more rigorously what we mean by generalised coordinates. We mentioned these briefly in section H.1.1. If we have a generalised position coordinate, q_i, then we may define the generalised momentum:

p_i \equiv \frac{\partial L}{\partial \dot{q}_i} \qquad (H.28)

We now have generalised coordinates which completely describe the particle distribution at a given moment: (q_i, p_i). Recall that these are the coordinates which describe the space constrained by the holonomic constraint equations. However, we may think of them as position and momentum. The space they describe is called phase space. In general this space has 6 dimensions per particle; in practice, the constraint equations reduce this dimensionality.
Now, we wish to reformulate the Euler-Lagrange equations in terms of the generalised coordinates. This will give us Hamilton's equations of motion.

Recall from section H.1.2 that we defined the Hamiltonian as:

H = p_i\dot{q}_i - L \qquad (H.29)
and we proved that if L does not depend explicitly on time then H represents the total energy of the system and is conserved.

Using the Euler-Lagrange equations, it is straightforward to prove that:

\frac{\partial H}{\partial q_i} = -\dot{p}_i; \qquad \frac{\partial H}{\partial p_i} = \dot{q}_i \qquad (H.30)

These are Hamilton's equations. They are a reworking of the Euler-Lagrange equations, but now in the generalised coordinates: (q_i, p_i).

So the above is all just fancy mathematics at the moment. What do we gain from it? Well, generalised coordinates are very useful. The Euler-Lagrange equations are somewhat hampered by the fact that we must choose a coordinate, q_i, and its time derivative, \dot{q}_i, to represent the system. Now we have no such limitation. We can represent our system using any two independent coordinates q_i and p_i. Notice the symmetry in Hamilton's equations. This will help you understand what we have achieved by using generalised coordinates. We can now completely swap the q_i and p_i (up to a sign: (q_i, p_i) \rightarrow (p_i, -q_i) preserves Hamilton's equations). Although p_i is called a generalised momentum, we can quite happily use it to represent position, p_i = x_i, with the generalised coordinate representing momentum, q_i = m\dot{x}_i, if we like.
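Hamilton's equations are also the natural starting point for numerical orbit integration. This sketch (not part of the original notes; the oscillator parameters are assumed test values) integrates \dot{q} = \partial H/\partial p, \dot{p} = -\partial H/\partial q for a harmonic oscillator with a leapfrog (kick-drift-kick) scheme and checks that H is conserved:

```python
m, k = 1.0, 1.0   # assumed oscillator parameters

def hamiltonian(q, p):
    """H = p^2/2m + k q^2/2 for a harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def step(q, p, dt):
    """Leapfrog (kick-drift-kick), built directly from Hamilton's equations
    qdot = dH/dp = p/m and pdot = -dH/dq = -k q; the update is symplectic."""
    p -= 0.5 * dt * k * q    # half kick
    q += dt * p / m          # drift
    p -= 0.5 * dt * k * q    # half kick
    return q, p

q, p, dt = 1.0, 0.0, 0.01
H0 = hamiltonian(q, p)
for _ in range(100000):      # integrate for many oscillation periods
    q, p = step(q, p, dt)
print(H0, hamiltonian(q, p))  # the energy error stays small and bounded
```

The bounded energy error is a consequence of the integrator respecting the phase-space structure of Hamilton's equations, a point that recurs throughout the course.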
H.2.1 Canonical transformations
We will see how far these generalised coordinates can take us in the following sections. However, first we must define a canonical coordinate transformation. This will allow us to transform between different generalised coordinates, while ensuring that Hamilton's equations still hold true.

Substituting the Hamiltonian (equation H.29) into equation H.2, we can now define a variational principle for Hamiltonian mechanics just as we did for Lagrangian mechanics:

\delta S = \delta\int (p_i\dot{q}_i - H)\, dt = 0 \qquad (H.31)
Now, imagine we switch to some new phase space coordinates, (Q_i, P_i), with a new associated Hamiltonian, F. For the new coordinates, we may also write:

\delta S = \delta\int \left(P_i\dot{Q}_i - F\right) dt = 0 \qquad (H.32)
The key point here is that we are only interested in transformations which preserve the dynamics of the system. This means that if I travel in a loop from one fixed point in phase space to another, and back again, the physical path taken in each coordinate system must be the same. Mathematically speaking, this means that:

\oint \left[(p_i\dot{q}_i - H) - \left(P_i\dot{Q}_i - F\right)\right] dt = 0 \qquad (H.33)
The above will always be true if we state that the integrand is given by the absolute time derivative of some function, G:

(p_i\dot{q}_i - H) - \left(P_i\dot{Q}_i - F\right) = \frac{dG}{dt} \qquad (H.34)

such that:

\oint \frac{dG}{dt}\, dt = \oint dG = 0 \qquad (H.35)
Coordinate transformations of the above sort are called canonical transformations, and we now prove
an important property of them.
Splitting up $G$ in the following way:

dG = p_i \, dq_i - P_i \, dQ_i + dG'    (H.36)

with $dG' = (F - H)\,dt$, we find that:

\oint p_i \, dq_i - \oint P_i \, dQ_i = 0    (H.37)

since the contribution from $dG'$ cancels in integrating around a closed loop. This proves an important property of canonical transformations: $\oint p_i \, dq_i$ is conserved. We will return to why this is important shortly.
But how do we actually perform such a transformation? We can understand this by asserting some form for $G$. The function $G$ is arbitrary, but a useful choice (and we shall see why below) is:

G = S(P_i, q_i, t) - P_i Q_i    (H.38)
From equation H.34, and substituting equation H.38, we obtain:

p_i \, dq_i - H \, dt - P_i \, dQ_i + F \, dt - d(S - P_i Q_i) = 0    (H.39)
Gathering terms together (expanding $dS$ in terms of its arguments $q_i$, $P_i$ and $t$), we have:

\left( p_i - \frac{\partial S}{\partial q_i} \right) dq_i + \left( Q_i - \frac{\partial S}{\partial P_i} \right) dP_i + \left( F - H - \frac{\partial S}{\partial t} \right) dt = 0    (H.40)
and, since $dq_i$, $dP_i$ and $dt$ are independent, we obtain:

p_i = \frac{\partial S}{\partial q_i} \; ; \quad Q_i = \frac{\partial S}{\partial P_i} \; ; \quad F = H + \frac{\partial S}{\partial t}    (H.41)
We can now use the above equations to transform from $(q_i, p_i)$ to $(Q_i, P_i)$. The function, $S$, is called the generating function, and it defines the canonical coordinate transform.
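We can make the defining property of a canonical transformation concrete with a quick numerical check. In one dimension, a transformation $(q, p) \rightarrow (Q, P)$ is canonical exactly when the fundamental Poisson bracket $\{Q, P\} = \frac{\partial Q}{\partial q}\frac{\partial P}{\partial p} - \frac{\partial Q}{\partial p}\frac{\partial P}{\partial q}$ equals unity. The sketch below (plain Python with finite differences; the particular map is chosen purely for illustration) tests this for $P = (q^2 + p^2)/2$, $Q = \arctan(q/p)$, a transformation we will meet again in the worked example below:

```python
import math

def P(q, p):
    # Candidate new momentum: half the squared phase-space radius.
    return 0.5 * (q * q + p * p)

def Q(q, p):
    # Candidate new coordinate: an angle in the (q, p) plane.
    return math.atan2(q, p)

def poisson_bracket(f, g, q, p, h=1e-6):
    """Finite-difference estimate of {f, g} = df/dq dg/dp - df/dp dg/dq."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

# {Q, P} should equal 1 at every phase-space point if the map is canonical.
for q0, p0 in [(1.0, 0.5), (-0.3, 2.0), (0.7, -1.1)]:
    print(poisson_bracket(Q, P, q0, p0))  # ~1.0 in each case
```

The same finite-difference check works for any candidate transformation, which makes it a handy debugging tool when constructing generating functions by hand.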
H.2.2 The Hamilton-Jacobi equation
Now it's time to start showing you why we went to so much trouble to pose our dynamics equations in such abstract terms. The key is understanding a cunning trick due to Jacobi. He realised that if we can find some phase space coordinates $(Q_i, P_i)$ in which the Hamiltonian vanishes, $F = 0$, then Hamilton's equations become:

\dot{P}_i = 0 \; ; \quad \dot{Q}_i = 0    (H.42)

Both $P_i$ and $Q_i$ are now constants of the motion. In one step, we have completely solved the problem.
The transformation which achieves this may be written down directly from equations H.41. The result is the Hamilton-Jacobi (H-J) equation:

H\left( \frac{\partial S}{\partial q_i}, q_i, t \right) + \frac{\partial S}{\partial t} = 0    (H.43)

All we need to do is to solve the above equation for the generating function, $S$. We then obtain $Q_i$ from $Q_i = \partial S / \partial P_i$, and the $P_i$ are the integration constants of equation H.43. This is a more significant step than Hamilton's equations themselves. Recall that Hamilton's equations are ultimately just a reworking of Newton's laws. But if we can solve the H-J equation, then we have actually solved Newton's equations of motion completely.
In practice, however, solving the H-J equation in this general form is not that useful because $S = S(t)$: our transformation which solves the problem evolves with time! But, if we assume from the start that $S$ is a simple linear function of time, $S(t) = \alpha t + \beta$, then we can reduce the H-J equation to an even simpler form:

H(P_i) = \mathrm{const.}    (H.44)

The above equation is really useful. Now Hamilton's equations reduce to:

P_i = \mathrm{const.} \; ; \quad Q_i = A t + B    (H.45)

As above, if we can find the generating function, $S$, then our dynamics problem is fully solved. But now $S$ takes a much simpler, time independent, form.
In general, there will be many cases where there are no solutions to the H-J equation. However, as we shall see next, even these cases tell us something important about the dynamics of the system.
H.2.3 Actions & integrals
Solving equation H.44 gives us the generating function, $S$, which transforms $(q_i, p_i) \rightarrow (Q_i, P_i)$ such that $P_i = \mathrm{const}$. Such constants are arbitrary: they are just constants of integration which come out of the H-J equation. As such, we may further constrain $S$ by choosing our constants of motion in advance. A natural choice is the conserved loop integral we encountered earlier. We define the action of a system as being:

A_i = \frac{1}{2\pi} \oint_{\gamma_i} p_i \, dq_i    (H.46)

where $\gamma_i$ describes an independent path through phase space. In general, there will be three such independent paths, one for each $P_i = A_i$.
Actions are useful for two main reasons. Firstly, they are isolating integrals of the motion. An integral of the motion is conserved along the trajectory of the particle. An isolating integral is even more useful: it lowers the dimensionality of the phase space available to the particle. An example you have already come across is the energy, which is an isolating integral for any static potential. (This is a direct result of gravity being a conservative force.) If no other isolating integrals exist, then the particle is free to roam throughout all of phase space as far as its energy will allow. Such an orbit will be chaotic and it will not be a solution of the H-J equation. The other extreme is an orbit with five isolating integrals of motion. Such an orbit will be completely confined to a line in phase space. The Kepler orbit is the classic example of this.
Actions are particularly special isolating integrals. This is because, having found the actions, the motion of the particle is then fully parameterised by the other phase space coordinates, $Q_i$. Recall that, by construction, these evolve linearly with time: $Q_i = At + B$ (see equation H.45). By using the actions as our canonical momenta, the phase space coordinates, $Q_i$, have a simple physical meaning. They are angles which describe the motion of the particle in time around the phase space loop which defines the associated action. This is probably quite difficult to picture right now. But don't worry, I promise you will see this much more clearly when we consider a specific example, below.
The second advantage of using actions is that they are adiabatic invariants of the motion. They remain unchanged if the gravitational potential changes slowly enough (strictly speaking, this means infinitely slowly). This can be understood if you think of an infinitesimal change to the Hamiltonian, $H \rightarrow H + \delta H$, which is the result of the change in the gravitational potential. We may equate $\delta H = \frac{\partial S}{\partial t}$ with the generating function (see equation H.41). Now we see that the adiabatic change is just a canonical map (a canonical coordinate transformation). But we have already proved that canonical maps conserve the actions, and so we have now proven that actions must be adiabatically invariant.
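This adiabatic invariance is easy to verify numerically. The sketch below (illustrative Python, not part of the notes' derivation) integrates a harmonic oscillator $H = p^2/2 + \omega(t)^2 x^2/2$ whose frequency is slowly ramped from 1 to 2, and tracks the action $A = E/\omega$: the energy roughly doubles, while the action barely moves:

```python
T, dt = 2000.0, 0.01          # slow ramp time (long => adiabatic) and step

def omega(t):
    # Frequency ramped slowly from 1 to 2 over the time T.
    return 1.0 + t / T

# Kick-drift-kick leapfrog for H = p^2/2 + omega(t)^2 x^2/2.
x, p, t = 1.0, 0.0, 0.0
E0 = 0.5 * (p * p + omega(0.0) ** 2 * x * x)
A0 = E0 / omega(0.0)          # initial action, A = E/omega
while t < T:
    p -= 0.5 * dt * omega(t) ** 2 * x
    x += dt * p
    p -= 0.5 * dt * omega(t + dt) ** 2 * x
    t += dt
E1 = 0.5 * (p * p + omega(t) ** 2 * x * x)
A1 = E1 / omega(t)
print(E1 / E0, A1 / A0)       # energy roughly doubles; action is ~constant
```

Shortening the ramp time T breaks the adiabatic condition and the final action starts to drift, which is a nice numerical way to see what "slowly enough" means.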
H.2.4 A worked example: the simple harmonic oscillator
As a worked example, let's consider the favourite hobby-horse of physics: the simple harmonic oscillator. It is a toy model which students encounter time and time again, from first dynamics courses right through to quantum mechanics. But students are not often told why such a simple system should be important in so many areas of physics.
Consider the minimum of some 1D potential well⁵ which takes the form $\Phi(x)$, where the minimum is at $x = 0$. Here $\Phi$ can represent any scalar field, but we can think of it, in keeping with this course, as being a gravitational field. Now, suppose that we displace the particle from its equilibrium by some small amount, $x$. We may now Taylor expand about the minimum to obtain:

\Phi(x) = \Phi(0) + x \Phi'(0) + \frac{x^2}{2!} \Phi''(0) + \mathcal{O}(x^3)    (H.47)
Since we are at a minimum, the potential at $x = 0$ must be stationary, $\Phi'(0) = 0$; while $\Phi(0)$ and $\Phi''(0)$ are just constants. Thus, to order $x^3$, we recover the potential of a simple harmonic oscillator⁶:
⁵ We work in one dimension for simplicity, but all of these arguments are equally valid in three dimensions [exercise].
⁶ Just in case you've never seen this potential before, it is trivial to solve the dynamics using Newton's second law. We have $\ddot{x} = -\frac{d\Phi}{dx} = -2Ax$: the equation of motion for a simple harmonic oscillator!
\Phi(x) = A x^2 + B    (H.48)

Now we see why the simple harmonic oscillator pops up time and again in physics. It is because a small perturbation about any minimum will produce approximate simple harmonic motion.
For the following example, let's assume that $A = \frac{1}{2}$ and $B = 0$, since the constants are arbitrary. Thus we obtain a Hamiltonian:

H = \frac{p^2}{2} + \frac{x^2}{2} = E    (H.49)

where $p = m\dot{x}$ is the momentum, we assume $m = 1$, and $E$ is the energy as usual.
Let's solve the problem first using Hamilton's equations, and then again using our new H-J equation. This will help you to understand what all of the above mathematics really means in practice.
First, Hamilton. Hamilton's equations give:

\frac{\partial H}{\partial x} = -\dot{p} = x \; ; \quad \frac{\partial H}{\partial p} = \dot{x} = p    (H.50)

which gives, for the equation of motion:

\ddot{x} = -x    (H.51)

The familiar equation of motion for a simple harmonic oscillator. Integrating, we obtain:

x = A \sin(t + B)    (H.52)

where $A$ and $B$ are constants of the integration.
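Since this course is all about numerical methods, it is worth checking this solution against a direct integration of Hamilton's equations. The sketch below (a kick-drift-kick leapfrog of the kind used for N-body work; illustrative only) recovers $x = A\sin(t + B)$ with $A = 1$, $B = 0$:

```python
import math

# Kick-drift-kick leapfrog for xdot = dH/dp = p, pdot = -dH/dx = -x,
# starting from (x, p) = (0, 1) so that the analytic answer is x = sin(t).
x, p, t, dt = 0.0, 1.0, 0.0, 0.001
while t < 10.0:
    p -= 0.5 * dt * x          # half kick
    x += dt * p                # full drift
    p -= 0.5 * dt * x          # half kick
    t += dt
print(x, math.sin(t))          # the two agree closely
```

The leapfrog scheme is chosen deliberately here: being symplectic, it respects the Hamiltonian structure we have been developing in this appendix.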
All straightforward so far. Now let's try again using the H-J equation. In some sense, this is a bit like using a sledge hammer to play the piano, but it will give you a better understanding of how to really use the results we have derived above (the problem sheet will also help you to develop this further).
We know that the Hamiltonian doesn't depend explicitly on time, and so must be the (conserved) energy: $H = E$. Now the aim is to find new generalised coordinates, $(Q_i, P_i)$, which make $H(P_i) = \mathrm{const}$. We are working in one dimension, so we will drop the subscript, $i$, from now on: there can only possibly be one action, since there is only one momentum coordinate, $p$, in the first place.
The H-J equation is now:

\frac{1}{2} \left( \frac{\partial S}{\partial x} \right)^2 + \frac{1}{2} x^2 = E    (H.53)

where all we have done is substitute for $p = \frac{\partial S}{\partial x}$ using equations H.41. It is now a simple matter to solve equation H.53 for the generating function, $S$:

S = \int \sqrt{2E - x^2} \, dx    (H.54)
Now, you can see that there is an arbitrary integration constant which will come from equation H.54. We can fix this by deciding that our conserved momentum, $P$, will be an action. Recall that this decision is arbitrary, but that actions are more physically meaningful than other choices because they are isolating integrals. From the definition of an action (equation H.46) we have:

P = \frac{1}{2\pi} \oint p \, dx    (H.55)

Now, we require a bit of care here. We are integrating around a closed loop in phase space. This is best done in plane polar coordinates, since this will make it clear what the integration limits should be. We substitute: $x = q \cos\theta$, $p = q \sin\theta$ (see the diagram in the margin). Now the integration limits range from $0 \rightarrow 2\pi$ in these coordinates, and, by convention, we take the loop integral in a clockwise direction.
[Margin diagram: the loop in the $(x, p)$ phase plane, at radius $q$ and angle $\theta$.]
In general, $q = q(\theta)$ in these new coordinates. We may calculate the shape of the phase loop, $q(\theta)$, by substituting our new coordinates into the Hamiltonian. Rearranging equation H.49, we obtain:
p(x) = \sqrt{2E - x^2} \;\Rightarrow\; q^2 \sin^2\theta = 2E - q^2 \cos^2\theta \;\Rightarrow\; q = \sqrt{2E}    (H.56)

In this case, the loop in phase space is a circle of radius $\sqrt{2E}$. Using $dx = dq \cos\theta - q \sin\theta \, d\theta = -q \sin\theta \, d\theta$ (since $q$ is constant around the loop), we have in our new coordinates:
P = \frac{1}{2\pi} \oint p \, dx
  = -\frac{1}{2\pi} \int_{2\pi}^{0} q^2 \sin^2\theta \, d\theta
  = \frac{2E}{2\pi} \int_0^{2\pi} \frac{1 - \cos(2\theta)}{2} \, d\theta
  = E    (H.57)

Note the minus sign in the second line, which comes from the clockwise integration direction.
We find that, for the special case of the harmonic oscillator, the action is just the energy, $E$. But this is not true in general, so take note! It does highlight an important point, however. The actions provide all of the isolating integrals required to describe a system. We know that in any static potential the energy will be an isolating integral (remember, this is just because gravity is a conservative force). So it is not surprising that, when we can have only one action (as in this 1D case), it is simply related to the energy.
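We can verify this result numerically. Splitting the loop into its upper and lower branches, $p = \pm\sqrt{2E - x^2}$, the loop integral becomes twice an ordinary integral over $x$, which a simple midpoint rule can handle (illustrative sketch):

```python
import math

def action(E, n=100000):
    """Numerically evaluate A = (1/2pi) times the loop integral of p dx
    for H = p^2/2 + x^2/2, using the midpoint rule."""
    xmax = math.sqrt(2.0 * E)
    dx = 2.0 * xmax / n
    # Upper branch p = +sqrt(2E - x^2); the lower branch doubles the result.
    total = 0.0
    for i in range(n):
        x = -xmax + (i + 0.5) * dx
        total += math.sqrt(max(0.0, 2.0 * E - x * x)) * dx
    return 2.0 * total / (2.0 * math.pi)

for E in (0.5, 1.0, 3.0):
    print(E, action(E))        # the action comes out equal to E
```

Geometrically this is just the area of a circle of radius $\sqrt{2E}$, namely $2\pi E$, divided by $2\pi$, in agreement with equation H.57.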
Now, all that is left is to calculate the other canonical phase space coordinate:

Q = \frac{\partial S}{\partial P} = \frac{\partial S}{\partial E} = \int \frac{1}{\sqrt{2E - x^2}} \, dx    (H.58)
and, using a standard trigonometric substitution, we recover the equation of motion directly:

x = \sqrt{2E} \sin(Q - Q_0)    (H.59)

where $Q_0$ is just an integration constant.
But what does H.59 mean? We have solved the problem in the new coordinates, $P = E$ and $Q$, but we must now think about what these coordinates are physically. The simplest way to do this is to plug our solution back into the Hamiltonian. This allows us to solve also for $p$, and gives:

p = \sqrt{2E} \cos(Q - Q_0)    (H.60)

which satisfies $x^2 + p^2 = 2E$. Now we can see the meaning of $P$ and $Q$. The trajectory of the particle is a circle in phase space, $(x, p)$ (see Figure H.3). $Q$ is now an angle in that phase space, while $P$ is $\frac{1}{2\pi}$ times an integral of phase space area. That is, the radius of the circle in phase space is $\sqrt{2E}$, and its area is then $2\pi E = 2\pi P$.
Recall that, by construction, we know $Q = At + B$, where $A$ and $B$ are arbitrary constants (remember, we chose to transform to phase space coordinates where this is so!). This now proves that equations H.52 and H.59 are indeed the same solution.
The above exercise illustrates a final useful thing about choosing to use actions as our conserved canonical momenta. The $Q$ coordinate is now an angle in the phase space $(x, p)$. By using actions, this will always be the case. The actions, $P_i$, represent integrals over phase space area in the coordinates, $(q_i, p_i)$, while the angle variables, $Q_i$, represent angles around the loop integral. Can you think of what shape three action-angle coordinates would trace out in phase space? [Hint: each action coordinate must be independent, and you know the answer in 1D is a circle.]
Figure H.3: The trajectory of a 1D harmonic oscillator in phase space, $(x, p)$. Marked also are the new canonical coordinates, $(Q, P)$, for which $H = H(P) = \mathrm{const}$. By fixing $P$ to be an action, the physical meaning of these new coordinates is now clear. $Q$ represents an angle around the particle trajectory in phase space, while $P$ is the area of the particle trajectory divided by $2\pi$. $Q$ now fully parameterises the position of the particle in phase space.
H.3 Phase space and Liouville's Theorem
We conclude this chapter with one final proof which will be of great importance later on in the course. Recall that, in general, phase space is filled with $N$ particles, each with six phase space coordinates: a $6N$ dimensional space! We can write this in the following compact form:

\vec{w}_{i,j} = (q_{i,j}, p_{i,j}) \quad 1 \le i \le 3 \; ; \; 1 \le j \le N    (H.61)

where $\vec{w}_{1,j}$, $\vec{w}_{2,j}$ and $\vec{w}_{3,j}$ form three independent vector fields. These act like three separate axes for our $6N$ dimensional phase space. Each contains $2N$ elements, representing an independent component of the phase position for each particle.
With the above compact notation, we can now write down a "velocity" of our particles in phase space. I have carefully put the word velocity in quotes because really I mean the rate of change of the canonical coordinates. This should not be confused with velocity in the old Newtonian sense of the word. We have:

\dot{\vec{w}}_{i,j} = (\dot{q}_{i,j}, \dot{p}_{i,j}) \quad 1 \le i \le 3 \; ; \; 1 \le j \le N    (H.62)
Now, things are probably getting a little hairy right now. How on earth are we to visualise 4 dimensional space, let alone $6N$ dimensional space? One trick to help you with this is to exploit our old friend: symmetry. For example, we know that a sphere is a three dimensional object. However, we only need one number to completely describe a perfect sphere. As long as we know that it is a perfect sphere, all we need is the radius. The same is true of an N-sphere in higher dimensions. Although it becomes impossible to visualise the sphere itself, we know it is an object which is fully described by one radius, and that radius will be a magnitude in N-dimensional space: $r^2 = x_1^2 + x_2^2 + \dots + x_n^2$.
Bearing the above in mind, imagine that our $6N$ dimensional phase space fills some $6N$ dimensional "volume". Again, this is not a volume in the usual sense, and hence the quotes. The particles in phase space need not have any symmetry whatsoever, but picture it in your mind as a sphere for now. Now, since $\dot{\vec{w}}_{1,j}$ is just a $2N$-dimensional vector field (and similarly for $\dot{\vec{w}}_{2,j}$ and $\dot{\vec{w}}_{3,j}$), we may write down the $2N$-dimensional analogue of the divergence theorem:

\int_V \vec{\nabla}_i \cdot \dot{\vec{w}}_i \, dV = \oint_S \dot{\vec{w}}_i \cdot d\vec{S}_i    (H.63)

where the above vectors represent the $1 \le j \le N$ components of $\vec{w}_{i,j}$.
The integral on the right represents the flow of particles in phase space through the surface, $S$, bounding the volume, $V$. Now it becomes clear why imagining the sphere helps. Because we are picturing a flow through a surface one dimension lower than the volume, we really can imagine the problem. The flow of particles through the $6N-1$ hypersurface into the $6N$th dimension can be understood intuitively by imagining the flow through a 2D surface bounding a sphere. Note also that we have substituted our slightly dodgy "surface" for the correct terminology: hypersurface.
Now notice that:

\vec{\nabla}_i \cdot \dot{\vec{w}}_i = \frac{\partial \dot{q}_{i,j}}{\partial q_{i,j}} + \frac{\partial \dot{p}_{i,j}}{\partial p_{i,j}} = \frac{\partial^2 H}{\partial q_{i,j} \partial p_{i,j}} - \frac{\partial^2 H}{\partial p_{i,j} \partial q_{i,j}} = 0    (H.64)

where the second equality follows by substituting Hamilton's equations (equations H.30).
The above proves Liouville's theorem:
Hamiltonian flow preserves phase space volume (and therefore density) for any region of phase space.
Another way of expressing the above is that the particles evolve in phase space as an incompressible fluid. Without solving any equations, or doing any dynamics, the above tells us two very important things:
1. A boundary in phase space always encloses the same group of particles.
2. From 1., we see that phase trajectories don't cross.
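We can watch Liouville's theorem at work in a simple numerical experiment. The sketch below (illustrative Python) evolves the three corners of a small triangular patch of harmonic oscillator initial conditions with a leapfrog integrator which, being symplectic, shares the area-preserving property of the true Hamiltonian flow. Because the SHO dynamics is linear, the patch remains a triangle, so its three corners suffice to track its area via the shoelace formula:

```python
def leapfrog(x, p, dt, nsteps):
    # Kick-drift-kick leapfrog for H = p^2/2 + x^2/2: a symplectic map,
    # sharing the exact area preservation of the true Hamiltonian flow.
    for _ in range(nsteps):
        p -= 0.5 * dt * x
        x += dt * p
        p -= 0.5 * dt * x
    return x, p

def triangle_area(pts):
    # Shoelace formula for the area of a triangle in the (x, p) plane.
    (x1, p1), (x2, p2), (x3, p3) = pts
    return 0.5 * abs((x2 - x1) * (p3 - p1) - (x3 - x1) * (p2 - p1))

patch = [(1.0, 0.0), (1.1, 0.0), (1.0, 0.1)]   # small patch of initial data
A0 = triangle_area(patch)
evolved = [leapfrog(x, p, 0.01, 5000) for (x, p) in patch]
A1 = triangle_area(evolved)
print(A0, A1)    # the phase-space area of the patch is unchanged
```

The patch shears and rotates as it orbits in $(x, p)$, but its area never changes: the incompressible-fluid behaviour described above.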
But we must be careful. It can be very difficult to picture $6N$ dimensional space. In projections of this space onto lower dimensional subspaces, particles can appear to cross all the time. Again, we can gain some intuition for this by imagining lower-dimensional analogues. In three dimensions, for example, consider two sets of lines confined to two parallel planes. The lines in one plane will never cross the lines in the other plane, since the two planes are parallel. But they will appear to cross one another when viewed along some projections.
In discussing projections onto lower dimensional spaces, there is a common confusion which is worth highlighting. A system of $N$ particles is just one point in $6N$ dimensional phase space. When we say that trajectories don't cross in $6N$ dimensional phase space, we really refer to different copies of our original system, each copy being another point in $6N$ dimensional space. When viewed like this, Liouville's theorem seems less useful. Much more useful is the 6 dimensional special case which applies only to collisionless systems. If particles are collisionless and therefore do not interact with one another at all (they do not exchange energy, angular momentum, nor any other property), then Liouville's theorem also applies in 6 dimensional phase space⁷. This more restricted version of the theorem means that individual phase trajectories of particles in a collisionless system will not cross; each point in phase space represents a unique trajectory. The theorem now applies to individual particle orbits, rather than copies of the system as a whole, and is, correspondingly, more useful.
⁷ Hopefully it is clear why. If particles do not interact, then we immediately remove $N - 1$ degrees of freedom per particle.
Appendix I
Dynamical friction
A related physical effect to relaxation is that of dynamical friction. Dynamical friction is of much broader relevance in astrophysics than relaxation, since it proceeds much faster. The basic idea is outlined in Figure I.1. A larger body, or "satellite", moves through a medium of smaller bodies. Through successive scattering events, the larger body loses energy to the smaller ones and slows down. The physics of the interaction is the same as in relaxation: scattering. However, because of the disparity in mass between the two bodies, dynamical friction proceeds much faster, as we shall prove. Dynamical friction likely affects the distribution of globular clusters in our Galaxy, the distribution of galaxies in a cluster of galaxies, and the rate of accretion of satellite galaxies in the Universe. Figure I.2, for example, shows the stream of stars coming off the Sagittarius dwarf galaxy. This small companion to the Milky Way was accreted in the last few gigayears. It is a fascinating galactic interaction which is helping us to learn about the shape of our Galactic potential, about dark matter vs. alternative gravity, and about the role of galaxy interactions in galaxy formation in general. Its rate of infall is undoubtedly affected by dynamical friction between the galaxy and the background stars and dark matter.
I.1 The Chandrasekhar approach
The first description of dynamical friction was that due to Chandrasekhar 1943. It is still an extremely accurate rule of thumb some sixty years later, and a good place to start. The derivation is very similar to that of the relaxation time presented in Lecture 1. Things are made a little more complicated, however, by the different mass of the infalling body and the background particles. Our strategy is as follows:
- Derive the effect of one scattering event.
- Integrate over all impact parameters.
- Integrate over all particles.
We will assume an infinite, homogeneous background distribution of particles. Later we will call this assumption into question, and we shall see that it does miss out some very important physics in special cases.
First, consider the two body interaction. The two body problem may always be transformed into a Kepler problem for a fictitious reduced particle moving in a reduced Kepler potential. The equation of motion is:

\frac{mM}{m+M} \ddot{\vec{r}} = -\frac{GMm}{r^2} \hat{\vec{r}}    (I.1)

where $m$ and $M$ are the masses of the background particles and the infalling body, respectively, and $\vec{r} = \vec{x}_m - \vec{x}_M$; $m\dot{\vec{x}}_m + M\dot{\vec{x}}_M = 0$ (working in the centre of mass frame).
Let us define the change in velocity of the reduced particle due to the interaction as $\Delta\dot{\vec{r}}$. From the above definitions, we have:
Figure I.1: Schematic diagram of Chandrasekhar dynamical friction. The larger body, marked in red, moves in the direction marked by the arrow, like a marble falling through honey. It scatters stars in front of it, leading to an overdensity of stars behind it: a wake. The momentum transfer between the larger body and the background due to these gravitational scattering events causes the larger body to slow down. This is the dynamical friction.
Figure I.2: A view of the Sagittarius stream as it would be seen by an alien sitting outside our Galaxy. The stream is tidal debris torn off of the Sagittarius dwarf galaxy as it fell into the Milky Way little over a billion years ago. The rate of infall of the satellite was undoubtedly governed by dynamical friction.
Figure I.3: Schematic of the two-body scattering event; see text for details. Note that $b$ marks the impact parameter, while $\theta_0$ is the angle at which the two bodies are at closest approach.
\vec{r} = \vec{x}_m - \vec{x}_M    (I.2)

m\dot{\vec{x}}_m + M\dot{\vec{x}}_M = 0    (I.3)

Eliminating $\dot{\vec{x}}_m$, we obtain:

\Delta\dot{\vec{x}}_M = -\left( \frac{m}{m+M} \right) \Delta\dot{\vec{r}}    (I.4)
Now we wish to solve for the change in velocity, $\Delta\dot{\vec{r}}$, which we do using the solution for Kepler orbits. To help you picture the geometry, the scattering event is shown in Figure I.3.
For a Kepler orbit we have:

\frac{1}{r} = C \cos(\theta - \theta_0) + \frac{G(M+m)}{L^2}    (I.5)

where $L$ is the specific angular momentum. The above equation is just the standard Kepler solution relating the radius and angle (it is usually written in terms of eccentricity and semi-major axis, however).
From Figure I.3, we can see that the angular momentum is given by $L = V_0 b$, while we know that $L = r^2\dot{\theta}$. Differentiating equation I.5 with respect to time and using the above substitutions, we obtain:

\dot{r} = C b V_0 \sin(\theta - \theta_0)    (I.6)
At the start of the interaction we have $r = \infty$, $\theta = 0$, which gives:

-V_0 = -C b V_0 \sin(\theta_0)    (I.7)

0 = C \cos(\theta_0) + \frac{G(M+m)}{b^2 V_0^2}    (I.8)
and, eliminating $C$, we obtain:

\tan\theta_0 = -\frac{b V_0^2}{G(M+m)}    (I.9)
Now we need a little geometry, which is given also in Figure I.3 (a good figure often helps a lot!). Notice that the scattering event is symmetrical about the point of closest approach, $\theta = \theta_0$, so the total deflection of the velocity is $\theta_{\rm def} = 2\theta_0 - \pi$. Thus the final perpendicular and parallel components of the velocity are given by:

V_{f,\perp} = V_0 \sin\theta_{\rm def} = V_0 \sin(2\theta_0 - \pi) = -V_0 \sin(2\theta_0)    (I.10)

and similarly:

V_{f,\parallel} = V_0 \cos(2\theta_0 - \pi) = -V_0 \cos(2\theta_0)    (I.11)
Thus, using equations I.9 and I.4 and some trig substitutions, the changes in the velocity components of $\dot{\vec{x}}_M$ are given by:

|\Delta\dot{x}_{M,\perp}| = \frac{2 m b V_0^3}{G(M+m)^2} \left[ 1 + \frac{b^2 V_0^4}{G^2 (M+m)^2} \right]^{-1}    (I.12)

|\Delta\dot{x}_{M,\parallel}| = \frac{2 m V_0}{M+m} \left[ 1 + \frac{b^2 V_0^4}{G^2 (M+m)^2} \right]^{-1}    (I.13)
Now, that was one interaction. Over many interactions, the mean effect of the perpendicular encounters will average to zero (they can have either sign). However, the important point is that the parallel encounters will always reduce the velocity of the heavier object. This is the dynamical friction effect. Of course, the perpendicular encounters do not vanish when you consider the root mean square effect. This is just the relaxation we already derived in Lecture 1. Over many encounters, the heavy particle will random walk in the perpendicular component of its velocity. But this is a tiny effect compared to dynamical friction, which will always reduce the parallel velocity component.
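Before moving on, it is reassuring to check the single-encounter geometry numerically. The sketch below (illustrative Python, in units with $G(M+m) = 1$; the values of $b$ and $V_0$ are made up) integrates the reduced two-body problem with an RK4 integrator and compares the measured deflection against the Kepler prediction: a deflection of magnitude $|\pi - 2\theta_0|$, with $|\tan\theta_0| = b V_0^2 / G(M+m)$:

```python
import math

GM = 1.0                       # units with G(M + m) = 1

def deriv(s):
    # State s = (x, y, vx, vy); Kepler acceleration toward the origin.
    x, y, vx, vy = s
    r3 = (x * x + y * y) ** 1.5
    return (vx, vy, -GM * x / r3, -GM * y / r3)

def rk4_step(s, dt):
    k1 = deriv(s)
    k2 = deriv([s[i] + 0.5 * dt * k1[i] for i in range(4)])
    k3 = deriv([s[i] + 0.5 * dt * k2[i] for i in range(4)])
    k4 = deriv([s[i] + dt * k3[i] for i in range(4)])
    return [s[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
            for i in range(4)]

b, V0, r0 = 1.0, 1.5, 50.0     # impact parameter, speed, start distance
s = [-r0, b, V0, 0.0]          # start far out, moving along +x
while s[0] ** 2 + s[1] ** 2 < r0 ** 2 or s[0] < 0.0:
    s = rk4_step(s, 0.01)      # integrate in through the encounter and out
chi_num = abs(math.atan2(s[3], s[2]))   # measured deflection angle
# Kepler prediction, using the asymptotic speed and impact parameter
# implied by the finite-distance starting conditions:
v_inf = math.sqrt(V0 ** 2 - 2.0 * GM / math.hypot(r0, b))
b_inf = b * V0 / v_inf                  # from L = b V0, conserved
theta0 = math.atan(b_inf * v_inf ** 2 / GM)
chi_theory = math.pi - 2.0 * theta0
print(chi_num, chi_theory)     # these agree closely
```

Trying a larger $b$ or $V_0$ shows the deflection shrinking, exactly the weak, distant encounters that turn out to dominate the friction integral below.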
It is worth a brief paragraph at this point to discuss exactly why it is that the parallel component always points in one direction (i.e. decelerates the infalling body). What we have done, above, is to analyse the problem in the centre of mass frame of $M$ and $m$. Now, in the limit $m = M$, $\Delta\dot{\vec{x}}_m = -\Delta\dot{\vec{x}}_M$: the particles receive equal and opposite momentum kicks, as expected. Provided that we have a "typical" particle, i.e. one not moving at significantly greater velocity than the mean of the background, the net effect of such encounters will be zero in the mean (although relaxation will still occur through a random walk). Once the masses of the particles are significantly different, however, an asymmetry enters into the problem. We transform into the centre of mass frame, which is now close to the rest frame of the massive particle. In this frame, the massive particle is nearly stationary and sees a "head wind" from the background particles, which must be moving towards the massive body. Hence the friction force is always a deceleration.
To obtain the final friction force, as in Lecture 1, we must now integrate over all impact parameters and all particles. First, the impact parameters. The rate at which the infalling body encounters background particles with velocity density $f(\dot{\vec{x}}_m)$ within the range $b \rightarrow b + db$ is given by:

2\pi b \, db \, V_0 \, f(\dot{\vec{x}}_m) \, d^3\dot{x}_m    (I.14)
Thus we obtain:

\frac{d\dot{\vec{x}}_M}{dt}\bigg|_{\dot{\vec{x}}_m} = V_0 f(\dot{\vec{x}}_m) \, d^3\dot{x}_m \int_0^{b_{\rm max}} \frac{2 m V_0}{M+m} \left[ 1 + \frac{b^2 V_0^4}{G^2(M+m)^2} \right]^{-1} 2\pi b \, db    (I.15)

= 2\pi \ln(1 + \Lambda^2) \, G^2 m (M+m) f(\dot{\vec{x}}_m) \, d^3\dot{x}_m \, \frac{(\dot{\vec{x}}_m - \dot{\vec{x}}_M)}{|\dot{\vec{x}}_m - \dot{\vec{x}}_M|^3}    (I.16)
Remember that we assume an infinite homogeneous medium. To avoid divergences in the integral, just like for the relaxation time, we have to add some cut-off which is the maximum size of the system: $b_{\rm max}$. We also come across, once again, the Coulomb logarithm which we encountered in deriving the relaxation time:

\Lambda \equiv \frac{b_{\rm max} V_0^2}{G(M+m)}    (I.17)

and, since $\Lambda \gg 1$, typically:

\ln(1 + \Lambda^2) \simeq 2 \ln\Lambda    (I.18)
Now we integrate over all of the particles. Here we assume isotropy and use a cunning trick! Notice how the right hand part of equation I.16 looks like an integral over a velocity density times a function which looks like a gravitational potential, but in the velocity. If the velocity distribution is isotropic, then, just like for a spherical gravitational potential, we can use Newton's second theorem (Binney and Tremaine 2008):

\int f(\dot{\vec{x}}_m) \, \frac{(\dot{\vec{x}}_m - \dot{\vec{x}}_M)}{|\dot{\vec{x}}_m - \dot{\vec{x}}_M|^3} \, d^3\dot{x}_m = -\frac{4\pi \int_0^{\dot{x}_M} f(\dot{x}_m) \, \dot{x}_m^2 \, d\dot{x}_m}{\dot{x}_M^2} \, \hat{\dot{\vec{x}}}_M    (I.19)
If the above looks strange, just imagine $\dot{\vec{x}}_m \rightarrow \vec{x}$ and you'll see that it just amounts to:

\int \rho(\vec{x}') \, \frac{(\vec{x} - \vec{x}')}{|\vec{x} - \vec{x}'|^3} \, d^3x' = \frac{1}{G} \nabla\Phi = \frac{M(r)}{r^2} \, \hat{\vec{x}}

which is the familiar result of Newton's second theorem.
Putting it all together, we obtain the Chandrasekhar dynamical friction formula:

\frac{d\dot{\vec{x}}_M}{dt} = -16\pi^2 \ln\Lambda \, G^2 (M+m) \, \frac{\int_0^{\dot{x}_M} m f(\dot{x}_m) \, \dot{x}_m^2 \, d\dot{x}_m}{\dot{x}_M^2} \, \hat{\dot{\vec{x}}}_M    (I.20)
We may further simplify this if we assume a Maxwellian distribution of velocities:

f(\dot{x}_m) = \frac{n}{(2\pi\sigma^2)^{3/2}} \exp\left( -\frac{\dot{x}_m^2}{2\sigma^2} \right)    (I.21)

where $n$ is the local number density of particles and $\sigma$ is the velocity dispersion. Using the above, $M \gg m$, $\rho = nm$, and $\dot{\vec{x}}_M = \vec{v}_M$, the Chandrasekhar friction formula now becomes:

\frac{d\vec{v}_M}{dt} = -\frac{4\pi G^2 \rho M \ln\Lambda}{v_M^3} \left[ \mathrm{erf}(X) - \frac{2X}{\sqrt{\pi}} e^{-X^2} \right] \vec{v}_M    (I.22)

where $X = v_M / (\sqrt{2}\sigma)$.
Now let's spend a little time thinking about what equation I.22 actually means. The main points are the following:
- Particles moving faster than $v_M$ do not contribute to the friction. This statement is only true for isotropic background distributions, and is a result of the limit in the integral of equation I.20. Notice also that the friction falls off as $1/v_M^2$: fast moving bodies experience little friction.
- Particles close in velocity to the heavier object contribute most to the friction. This is most clearly seen from equation I.16. Notice the divergence in the friction force for $\dot{\vec{x}}_m \rightarrow \dot{\vec{x}}_M$.
- Particles at impact parameters larger than $b_{\rm min} = \frac{G(M+m)}{V_0^2}$ contribute almost all of the friction. We can prove that this is so by considering the integral in equation I.15:

I = \int_0^{b_{\rm max}} \left[ 1 + \frac{b^2 V_0^4}{G^2(M+m)^2} \right]^{-1} b \, db = \frac{G^2(M+m)^2}{2 V_0^4} \ln\left[ 1 + \frac{b_{\rm max}^2 V_0^4}{G^2(M+m)^2} \right] \simeq \frac{G^2(M+m)^2}{V_0^4} \ln\Lambda    (I.23)

where $\Lambda = b_{\rm max}/b_{\rm min}$, using the above definition of $b_{\rm min}$.
Now imagine that there is a minimum impact parameter, such that only encounters with $b \ge b_{\rm min}$ contribute to the dynamical friction effect. The integral now becomes:

I \simeq \int_{b_{\rm min}}^{b_{\rm max}} \left[ 1 + \frac{b^2 V_0^4}{G^2(M+m)^2} \right]^{-1} b \, db \simeq \int_{b_{\rm min}}^{b_{\rm max}} \frac{G^2(M+m)^2}{V_0^4} \frac{1}{b} \, db = \frac{G^2(M+m)^2}{V_0^4} \ln\Lambda    (I.24)
The above proves that, if $\Lambda \gg 1$ (as is the case for almost any astronomical system of interest), then the friction is dominated by encounters with $b \ge b_{\rm min} = \frac{G(M+m)}{V_0^2}$. Particles closer than this to the heavier infalling body do not significantly contribute to the dynamical friction. This is a very important point. Figure I.3 might give the misleading impression that particles are actually bouncing off the infalling satellite. This is absolutely not correct. It is long range, slow interactions which contribute most of the friction force.
- Finally, notice that the drag force is proportional to the mass density, $\rho$, rather than the individual particle masses (provided $M \gg m$). Thus the friction force from dark matter particles of proton mass is identical to the friction force from the same mass density of stars.
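The dominance of distant encounters can also be checked by direct numerical integration of the impact-parameter integral. In units where $G(M+m) = 1$ and $V_0 = 1$ (so $b_{\rm min} = 1$), the sketch below compares the full integral of equation I.15 with the same integral restricted to $b > b_{\rm min}$ (the value of $b_{\rm max}$ is illustrative):

```python
import math

b_max = 1.0e4                  # so Lambda = b_max / b_min = 1e4

def integrand(b):
    # b * [1 + b^2 V0^4 / G^2 (M+m)^2]^-1, with G(M+m) = V0 = 1.
    return b / (1.0 + b * b)

def integrate(a, c, n=100000):
    # Simple midpoint rule.
    h = (c - a) / n
    return sum(integrand(a + (i + 0.5) * h) for i in range(n)) * h

full = integrate(0.0, b_max)       # analytic: 0.5 * ln(1 + b_max^2)
distant = integrate(1.0, b_max)    # only encounters with b > b_min = 1
print(full, distant, distant / full)
# With Lambda >> 1, almost all of the friction comes from b > b_min.
```

For this choice of $\Lambda$, well over ninety per cent of the integral comes from $b > b_{\rm min}$, and the fraction grows with $\Lambda$.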
I.2 Resonance: what Chandrasekhar misses
There is something important which the Chandrasekhar approach misses. Equation I.22 suggests that outside of a galaxy, where $\rho = 0$, there should be no friction, while if $\rho = \mathrm{const}$ one would expect friction as per normal. Both of these statements are inconsistent with numerical N-body experiments. Indeed, we should perhaps expect the first to be wrong, since we have already shown that it is the long range interactions that provide most of the friction force.
The above failures occur because we have implicitly assumed that the infalling satellite sees each background particle only once. This is what we see in the schematic diagram of Figure I.3: the heavy particle moves in a straight line. For satellites orbiting in real galaxies, however, the satellite will scatter the same resonant particles on each orbit. Tremaine and Weinberg 1984 explore such ideas in more detail and derive a dynamical friction formula for spherical systems. This gives the new and important insight that it is particles that resonate with the infalling satellite that provide almost all of the friction force. Thus, friction does not cease simply because the local $\rho \rightarrow 0$.
Given the above, it is surprising that Chandrasekhar works at all! However, for most gravitational potentials the infalling satellite sinks faster than resonances can be excited. Thus, to a good approximation, it is always encountering new particles, or at least new resonances. This, combined with the fact that the formula depends only logarithmically on the arbitrary parameters wrapped up inside $\Lambda$, is why Chandrasekhar usually provides an excellent match to numerical experiments. An important exception is the constant density harmonic potential mentioned above. In this case, numerical experiments find that the friction force is momentarily much stronger than Chandrasekhar predicts, after which there is no observed friction at all! For a solution to this interesting problem, have a look in Read et al. 2006a. For now, if you want a clue as to what is going on, have a think about which orbital frequencies are permitted in the harmonic potential. If resonant particles drive most of the friction force, what is special about the harmonic potential?
I.3 The dynamical friction timescale and the connection to relaxation
We have stated so far, without proof, that dynamical friction is more important in the Universe than relaxation because it proceeds faster. We may obtain a rough estimate of the dynamical friction time if we imagine an infalling satellite on a circular orbit. Let us assume it remains on a perfectly circular orbit throughout its infall, and that the background distribution is a spherical isothermal distribution given by:

\rho(r) = \frac{v_c^2}{4\pi G r^2}    (I.25)

where $v_c$ is the circular speed and is a constant. This is a good model for galaxies which show flat rotation curves (see Lecture 1).
Equation I.22 now reduces to (using the fact that for an isothermal sphere \sigma = v_c/\sqrt{2}):

M \frac{dv_M}{dt} = F = -\frac{4\pi \ln\Lambda \, G^2 M^2 \rho(r)}{v_c^2}\left[\mathrm{erf}(1) - \frac{2}{\sqrt{\pi}}\,e^{-1}\right] = -0.428\,\frac{\ln\Lambda \, G M^2}{r^2}    (I.26)
The infalling satellite loses specific angular momentum L at a rate:

\frac{dL}{dt} = \frac{F r}{M} \simeq -0.428\,\frac{G M}{r}\,\ln\Lambda    (I.27)
and since its orbit remains circular and v_c = const., we have that L = r v_c at all times. Substituting this into the above gives:

r\,\frac{dr}{dt} = -0.428\,\frac{G M}{v_c}\,\ln\Lambda    (I.28)
and solving gives us the dynamical friction timescale:

t_{\rm fric} = \frac{2.64\times 10^{11}\,{\rm yr}}{\ln\Lambda}\left(\frac{r_i}{2\,{\rm kpc}}\right)^2\left(\frac{v_c}{250\,{\rm km/s}}\right)\left(\frac{10^6\,M_\odot}{M}\right)    (I.29)

which, recalling that ln Λ ∼ 10, is typically shorter than a Hubble time for infalling galaxies, star clusters and massive globular clusters. The relaxation time for all but the centre of globular clusters is by contrast many Hubble times (see Lecture 1).
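As a quick numerical check, we can evaluate equation I.29 directly. The following is a minimal sketch (the function name and default Coulomb logarithm are my own choices, not part of the notes):

```python
def t_fric_gyr(r_i_kpc, v_c_kms, m_msun, ln_lambda=10.0):
    """Dynamical friction timescale (equation I.29), returned in Gyr.

    r_i_kpc   : initial orbital radius in kpc
    v_c_kms   : circular speed in km/s
    m_msun    : satellite mass in solar masses
    ln_lambda : Coulomb logarithm (ln Lambda ~ 10 is typical)
    """
    t_yr = (2.64e11 / ln_lambda) \
        * (r_i_kpc / 2.0) ** 2 \
        * (v_c_kms / 250.0) \
        * (1e6 / m_msun)
    return t_yr / 1e9

# A 10^6 M_sun globular cluster at 2 kpc in a v_c = 250 km/s galaxy:
print(t_fric_gyr(2.0, 250.0, 1e6))   # ~26 Gyr: marginal over a Hubble time
# A 10^9 M_sun satellite galaxy at 30 kpc sinks well within a Hubble time:
print(t_fric_gyr(30.0, 250.0, 1e9))
```

Note how steeply the timescale falls with satellite mass: massive satellites sink quickly, light ones effectively never do.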
I.4 Wakes
The effect of the scattering is that it puts more particles behind, rather than in front of, the satellite (see Figure I.3). This creates a wake behind the satellite. It is possible to derive the Chandrasekhar friction formula by considering the gravitational pull of this wake on the satellite (see e.g. Mulder 1983). Hence, this is an alternative way of understanding the friction effect.
I.5 Mass segregation
The dynamical friction effect is of importance inside globular clusters because it causes mass segregation: the heavier stars sink towards the centre, while the lighter stars move out to the edge. This process could be very important in seeding the supermassive black holes observed to reside at the centres of galaxies. We do not yet fully understand how such black holes form. It is not too difficult to grow them from 10^3 to 10^6 M_⊙ through gas accretion. But black holes are born from the end-phase of the collapse of massive stars; they start out at a mere 40 M_⊙ at best. Bridging the gap from 40 to ∼10^3 M_⊙ is a significant challenge in theoretical astronomy, even today. Dynamical merging of massive stars and black holes at the centre of globular clusters could be one solution. Theoretically, we expect such massive stars and black holes to reside at the centres of globular clusters as a result of mass segregation and, indeed, mass segregation has been observed in some nearby clusters.
I.6 Collisionless relaxation and friction
So far we have discussed relaxation and dynamical friction as collisional processes. Yet we have loosely talked also about dynamical friction on satellite galaxies. If a satellite galaxy mostly comprises dark matter particles then both it and the galaxy it is falling into are undoubtedly collisionless. So how is it then that dynamical friction can proceed? We discuss this in detail in the next lecture where we consider galaxy-galaxy interactions.
Appendix J
A brief primer on general relativity
Since many of you will not have covered (or only lightly covered) general relativity, I provide a very quick refresher here. I present the central concepts that lead to special and general relativity, derive the geodesic equation for GR and present the Einstein field equations. Finally, I discuss two solutions to the field equations: the Schwarzschild solution and the FLRW metric. I use the former to derive gravitational lensing, an important dark matter probe; the latter forms the basis of our current cosmological model.
J.1 What is wrong with good old Newton?
Newton himself understood that something is a bit fishy about Newtonian gravity. In a famous letter to Bentley in 1693, Newton wrote [1]:
It is inconceivable that inanimate brute matter should, without the mediation of something
else which is not material, operate upon and aect other matter without mutual contact...
That gravity should be innate, inherent, and essential to matter, so that one body may act
upon another at a distance through a vacuum, without the mediation of anything else, by
and through which their action and force may be conveyed from one to another, is to me
so great an absurdity that I believe no man who has in philosophical matters a competent
faculty of thinking can ever fall into it.
However, the theory worked so spectacularly well that it was not until hundreds of years later, with the arrival of Albert Einstein [2], that the worries returned. The problems can be simply understood:
1. Velocities are purely additive. Einstein understood that electromagnetism involves the speed of light. What happens then, he wondered, if we travel very rapidly? Is the speed of light some constant, plus the speed we are moving at? Einstein realised that such a theory would be unworkable because all velocities are relative. Without an absolute reference frame (and what would that be [3]?) we would be unable to even assign an unambiguous velocity to light. Physics would be ill-defined!
2. There are really two masses in Newtonian mechanics: inertial mass, and gravitational mass. It is truly remarkable that the two are identical as best we can tell (better than one part in 10^11; Will 1993). Surely this means something ...
3. Newtonian gravity implies instantaneous action at a distance. How do objects at the edge of the Universe know instantaneously that I'm jumping up and down?
[1] See e.g. http://plato.stanford.edu/entries/newton-philosophy/.
[2] If you have not read some of his original work, I can really recommend it (e.g. Einstein 1916). It is remarkable how little our pedagogical treatment of general relativity (at least for many physicists) has changed from Einstein's original exposition.
[3] Actually, the idea of an absolute reference frame was very popular in the late 1800s and Maxwell supposed that light travelled through an absolute aether. This appealing idea was, however, famously refuted by the Michelson-Morley experiments in 1881 and 1887 (Michelson 1881; Michelson and Morley 1887).
The first point, as we shall see, leads us to Einstein's theory of special relativity. The second leads to general relativity. And the third is something we will return to once armed with Einstein's general theory of relativity.
J.2 Special relativity
[Figure J.1: A schematic diagram of the light-clock thought experiment. In panel a), the clock is stationary. The photon travels up and back in a time t = h/c, where c is the speed of light. In panel b), we watch the clock zoom past us at a speed v.]
The first of the three considerations, above, led Einstein to assert that the speed of light must be constant, independent of the choice of inertial frame [4]. This rather deep result leads to some remarkable conclusions. First, it implies that time must be relative. To arrive at this result, we can use a simple thought experiment: the light clock. Imagine I construct a clock so that in a time t a single photon of light travels upwards a distance h/2, bounces off a mirror, and travels back another distance h/2 to its original position. This is shown in Figure J.1. In panel a), the clock is stationary. The photon travels up and back in a time t = h/c, where c is the speed of light. In panel b), we watch the clock zoom past us at a speed v. Since the speed of light is constant, to us the photon travel time is now:
t' = \frac{2}{c}\left[\left(\frac{h}{2}\right)^2 + \left(\frac{v t'}{2}\right)^2\right]^{1/2}    (J.1)
where we now use t' to indicate that we are observing the clock from a different inertial frame. Rearranging, and after some simple algebra, the above gives:

\frac{t'}{t} = \left(1 - \frac{v^2}{c^2}\right)^{-1/2} = \gamma    (J.2)

which defines the Lorentz factor γ.
For v ≪ c, the above equation has almost no effect. But as we approach the speed of light, t' > t and time becomes heavily dilated: moving clocks run slow!
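To get a feel for equation J.2, we can tabulate the Lorentz factor numerically (a small sketch; the helper name is my own):

```python
def lorentz_gamma(v_over_c):
    """Lorentz factor gamma = (1 - v^2/c^2)^(-1/2) from equation J.2."""
    if not 0.0 <= v_over_c < 1.0:
        raise ValueError("require 0 <= v/c < 1")
    return (1.0 - v_over_c ** 2) ** -0.5

# Time dilation is negligible at everyday speeds, dramatic near c:
for beta in (1e-5, 0.1, 0.5, 0.9, 0.99, 0.999):
    print(f"v/c = {beta:7.5f}  ->  t'/t = {lorentz_gamma(beta):8.4f}")
```

Even at v/c = 0.1 the correction is well under a percent, which is why Newtonian mechanics works so well in everyday life.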
The above derives a pure time transformation. But in general, we can transform the position coordinates between inertial frames too, and the speed of light must remain invariant also in such situations. A general position transformation from a frame S to a frame S' can be written:

x' = a_1 x + a_2 y + a_3 z + a_4 ct + a_5    (J.3)

Now, suppose that in the frame S', S is moving at a speed -v along the x-axis.

[margin figure: the frames S and S', with the origin of S' displaced by x = vt]

We may then define without loss of generality x' such that x' = 0 at x = vt (see the margin figure). This gives:

x' = \gamma(x - vt) \quad ; \quad x = \gamma(x' + vt')    (J.4)

where the right equation follows by symmetry between the frames (in the frame S, S' moves at speed +v). In the Newtonian world view, we would then assert that t = t', which derives the Galilean transformation, γ = 1. However, in special relativity, we have instead that c is a constant. Imagine, then, that we move a distance x = ct in frame S. This must then correspond to x' = ct' in frame S'. Substituting these relations into equations J.4, we have:

ct' = \gamma(ct - vt) \quad ; \quad ct = \gamma(ct' + vt')    (J.5)
and we recover γ as in equation J.2. Thus, we derive the full Lorentz transforms:

x' = \gamma\left(x - \frac{v}{c}\,ct\right)    (J.6)

ct' = \gamma\left(ct - \frac{v}{c}\,x\right)    (J.7)

[4] An inertial frame is one that experiences no accelerations.
where we have deliberately used the speed of light to give the time and position coordinates the same dimensions. We now see an important key result: the quantity

ds^2 = c^2 dt^2 - dx^2 = c^2 dt'^2 - dx'^2    (J.8)

is invariant. This is the fundamental length, the Lorentz invariant, in special relativity. (Note that we have assumed up to now that dy = dz = 0. Putting these back in, the above generalises to: ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2.)
J.2.1 Introducing tensor notation
The key concept in special (and general) relativity is that physics must be independent of our choice of coordinate system, however crazy. To facilitate this, it is helpful to devise a mathematical framework that allows for arbitrary transformations, while maintaining physical properties like the Lorentz invariant. Consider the following 4-vector position:

x^\mu = (ct, x, y, z) \quad ; \quad \mu = 0, 1, 2, 3    (J.9)

where we have used the speed of light, c, to make the time coordinate have the same dimensions of length as the other coordinates [5].
Now let us define some mathematics that returns the correct length (the Lorentz invariant) when we take the product of this 4-vector with itself, independent of any coordinate transformation. To achieve this, let us first define the self product as:

x^2 = x_\mu x^\mu    (J.10)

where repeated indices are summed over, and we define the metric g_{\mu\nu} as the object that transforms the contravariant form x^\mu to the covariant [6] form x_\mu. In special relativity, the self product must produce the Lorentz invariant (equation J.8). Thus:

ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 = g_{\mu\nu}\,dx^\mu dx^\nu    (J.11)

which derives the metric for special relativity: g_{\mu\nu} = {\rm diag}(1, -1, -1, -1). This is called Minkowski spacetime. As we will see later, the metric takes different forms in the presence of gravitational fields.
The Lorentz invariant (equation J.8) also gives us the transformation laws that co- and contravariant 4-vectors must obey:

x'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}\,x^\nu    (J.12)

x'_\mu = \frac{\partial x^\nu}{\partial x'^\mu}\,x_\nu    (J.13)

where the above simply ensures that x'^2 = x^2 = const. by construction.
In special relativity, the coordinate transformation is defined by \partial x'^\mu/\partial x^\nu, often written as \Lambda^\mu{}_\nu, which is simply a matrix that defines the Lorentz transform (it is derived from the partial derivatives of the transformation equations J.6 and J.7):

\Lambda = \begin{pmatrix} \gamma & -\gamma v/c & 0 & 0 \\ -\gamma v/c & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}    (J.14)
[5] Note that c is only really required here because we use metres to measure distance, and seconds to measure time. We could instead adopt units where c = 1, and indeed this is often done in relativity books.
[6] A useful way to remember which is which is: co- goes below.

We may generalise such 4-vectors to higher dimensional beasts: tensors. Tensors like g_{\mu\nu} are matrices that must obey transformation relations that ensure their effect is coordinate invariant:

A'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\alpha}\,\frac{\partial x'^\nu}{\partial x^\beta}\,A^{\alpha\beta}    (J.15)

Tensors also have co- and contravariant (and now mixed) forms. It is straightforward to show that the above transformation rule ensures that the effect of A^{\mu\nu} acting on a 4-vector is coordinate independent.
Note that if we allow the metric to act on a tensor, we can lower the dimensionality of the tensor, a process called contraction:

g_{\mu\nu} A^{\mu\nu} = A^\mu{}_\mu = A    (J.16)

where A is a scalar contraction of the tensor A^{\mu\nu}.
The above is all very nice, but one thing remains a bit tricky. Suppose I want to take the derivative of a tensor. A first guess at a useful derivative operator would be the 4-derivative:

\partial_\mu = \frac{\partial}{\partial x^\mu} \equiv \left(\frac{\partial}{\partial (ct)}, \nabla\right)    (J.17)
However, if I were to use the 4-derivative, then it is straightforward to show that I produce an object that is no longer a tensor. That is, suppose I write:

Y^b{}_a = \partial_a X^b    (J.18)

and I then apply a coordinate transformation to X^b:

Y'^b{}_a = \frac{\partial}{\partial x'^a}\left(\frac{\partial x'^b}{\partial x^c}\,X^c\right) = \frac{\partial x^d}{\partial x'^a}\,\frac{\partial}{\partial x^d}\left(\frac{\partial x'^b}{\partial x^c}\,X^c\right) = \frac{\partial x^d}{\partial x'^a}\,\frac{\partial x'^b}{\partial x^c}\,\partial_d X^c + \frac{\partial x^d}{\partial x'^a}\,\frac{\partial^2 x'^b}{\partial x^d \partial x^c}\,X^c    (J.19)

The first term on the right is the tensor transformation rule for Y^b{}_a, but the second term is an extra piece. Thus, we have proven that Y^b{}_a is not a tensor. This is bad because we have developed this whole mathematical machinery in order to describe physics in a coordinate independent manner.
We must therefore hunt for a derivative operator that produces tensors from tensors. There is more than one operator that can achieve this. Here, however, we will need only the covariant derivative operator:

\nabla_c X^a = \partial_c X^a + \Gamma^a_{bc} X^b    (J.20)

where

\Gamma^a_{bc} = \frac{\partial x^a}{\partial x'^e}\,\frac{\partial^2 x'^e}{\partial x^b \partial x^c}

is called a Christoffel symbol.
It is straightforward to show that the addition of the Christoffel symbol here negates the effect of the extra piece we derived above for the non-tensorial 4-derivative, and thus that the covariant derivative does indeed produce tensors from tensors (note that when deriving this result you must remember to also transform the Christoffel symbol).
The above mathematical trickery is useful. If we can phrase our physics in terms of such tensors
and 4-vectors, then we will be coordinate independent by construction.
J.2.2 4-momentum, 4-force and all that ...
Although special relativity deals only with inertial frames, we can still happily watch other people accelerate. Thus, defining a four-momentum and four-force is still meaningful. Let us start by defining the proper time τ from the Lorentz invariant proper distance:

ds^2 = c^2 d\tau^2    (J.21)

Since the proper distance ds and the speed of light c are invariant, then the proper time dτ must be also. This is useful as it suggests that we can form invariant time derivatives of the 4-position by using τ. This suggests the following definition of 4-momentum:

P^\mu = m\,\frac{dx^\mu}{d\tau}    (J.22)
where m is an invariant mass: the rest mass of a particle. It is straightforward to show that dP^\mu/d\tau = 0 then gives momentum-energy conservation laws that reduce to classical energy and momentum conservation in the low velocity limit. This suggests that the above choice is the right one. Similarly, we may then define a four-force:

F^\mu = \frac{dP^\mu}{d\tau}    (J.23)

which tends to the more usual 3-force in the non-relativistic limit.
The above equations form the basis of special relativistic dynamics that hopefully you have encountered before.
J.2.3 The clock hypothesis and general relativity
So far, we have discussed only how to deal with inertial frames that are not accelerating. How to deal with accelerations can be understood from the famous clock, or twin, paradox. Imagine I have two twins on Earth. One flies away from the other for a time t/2 at a speed v, turns around, and then comes back at a speed v. The twin on Earth sees his brother's clock run slow such that the total time elapsed is:

t' = \gamma(v)\,t/2 + \gamma(-v)\,t/2 = \gamma t    (J.24)

But now consider things from the view of the rocket-twin. Surely he sees his brother's clock running slow too, such that t = \gamma t'! This is the paradox. The solution, of course, is that the rocket-twin must accelerate to come back to Earth. And accelerations are not described in special relativity. Thus, the apparent symmetry of the problem is broken. On the other hand, however, we may assert that the Earth-twin must have the answer right since he does not accelerate and therefore special relativity in his frame is just fine: this is the clock hypothesis and is the basis for general relativity.
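Putting numbers into equation J.24 makes the asymmetry vivid (a small sketch; the trip parameters and function name are my own):

```python
def earth_elapsed_years(rocket_years, beta):
    """Earth-frame time for a round trip taking `rocket_years` of proper
    time on the rocket, at speed beta = v/c (equation J.24: gamma is the
    same for the outbound and return legs, so t' = gamma * t)."""
    gamma = (1.0 - beta ** 2) ** -0.5
    return gamma * rocket_years

# A 10-year round trip (as measured on the rocket) at 80% of light speed:
print(earth_elapsed_years(10.0, 0.8))   # ~16.7 years pass on Earth
```

The rocket-twin returns genuinely younger; the asymmetry is physical because only he accelerated.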
Let us take the above idea a little further. Suppose, then, we define a frame in which there are no accelerations. In such a frame special relativity must apply and the double proper time derivative of our 4-vector position (the 4-acceleration) must be zero:

\frac{d^2 \xi^\mu}{d\tau^2} = 0 \quad ; \quad \xi^\mu = (ct, x, y, z)    (J.25)

Furthermore, spacetime must be Minkowski:

c^2 d\tau^2 = \eta_{\mu\nu}\,d\xi^\mu d\xi^\nu    (J.26)

with \eta_{\mu\nu} = {\rm diag}(1, -1, -1, -1).
Now, we can describe motion in any frame by simply transforming away from the above one. Inserting our general coordinate transform (equation J.12) into equation J.25, we obtain the general relativistic dynamics equations [exercise]:

\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\alpha\beta}\,\frac{dx^\alpha}{d\tau}\,\frac{dx^\beta}{d\tau} = 0    (J.27)

with the metric equation:

c^2 d\tau^2 = g_{\mu\nu}\,dx^\mu dx^\nu    (J.28)

where the Christoffel symbols \Gamma^\mu_{\alpha\beta} and new metric g_{\mu\nu} are defined by the transformation coefficients:

\Gamma^\mu_{\alpha\beta} = \frac{\partial x^\mu}{\partial \xi^\nu}\,\frac{\partial^2 \xi^\nu}{\partial x^\alpha \partial x^\beta}    (J.29)

g_{\mu\nu} = \eta_{\alpha\beta}\,\frac{\partial \xi^\alpha}{\partial x^\mu}\,\frac{\partial \xi^\beta}{\partial x^\nu}    (J.30)

And finally, it is possible to substitute for the metric inside the Christoffel symbols, thus demonstrating that everything can be described purely by the metric:

\Gamma^\mu_{\alpha\beta} = \frac{1}{2}\,g^{\mu\nu}\left(\frac{\partial g_{\nu\alpha}}{\partial x^\beta} + \frac{\partial g_{\nu\beta}}{\partial x^\alpha} - \frac{\partial g_{\alpha\beta}}{\partial x^\nu}\right)    (J.31)

The metric itself simply describes how to define a length in some arbitrarily hideous spacetime. We can think of it as describing curvature, and often people talk of curved spacetime in GR. But this is just one interpretation of the mathematics; it does not necessarily mean that spacetime is really curved.
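As a concrete example of Christoffel symbols arising purely from a coordinate choice, we can apply the metric form of Γ (equation J.31) to the flat plane written in polar coordinates. This is a sketch using sympy; the helper name and coordinate labels are my own:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
x = (r, th)
g = sp.Matrix([[1, 0], [0, r**2]])   # flat-plane metric in polar coordinates
ginv = g.inv()

def christoffel(a, b, c):
    """Gamma^a_bc = (1/2) g^ad (d_b g_dc + d_c g_db - d_d g_bc), eq. J.31."""
    return sp.simplify(sum(
        sp.Rational(1, 2) * ginv[a, d]
        * (sp.diff(g[d, c], x[b]) + sp.diff(g[d, b], x[c]) - sp.diff(g[b, c], x[d]))
        for d in range(2)))

print(christoffel(0, 1, 1))   # Gamma^r_theta_theta = -r
print(christoffel(1, 0, 1))   # Gamma^theta_r_theta = 1/r
```

These non-zero symbols produce the familiar centripetal and Coriolis-like terms in the polar equations of motion, even though the underlying space is perfectly flat: a nice illustration of the point that Christoffel symbols alone do not measure curvature.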
J.2.4 The equivalence principle
We are now in a position to return to the troubling aspects of Newtonian gravity we set out at the start in J.1. So long as I can keep transforming to a frame where there are no accelerations, to what extent can I be said to really be feeling any force? This insight led to Einstein's equivalence principle, which states that:

Freely falling (infinitesimal) observers experience no gravitational effects, i.e. they can just keep transforming to Minkowski space.

Or, to put it another way, it is the tidal gravitational field that you feel, or the force caused by hitting something that gets in the way of your being a freely falling observer, not gravity per se. The only gravity you feel is when bits of your body attempt to be freely falling observers that fall in different directions (ouch!). This is very different from the Newtonian world view. The even stronger version of the above is the strong equivalence principle that goes further:

All laws of physics for freely falling observers are identical to those in the absence of gravity.
The above has some profound implications. First, in order to be true, the gravitational and inertial masses must be identical. This is because we require that the acceleration of our inertial frame be exactly equal to the gravitational acceleration (otherwise we cannot simply transform our frame and remove the force):

\mathbf{a} = \frac{m_G}{m_I}\,\mathbf{g}    (J.32)

where g is the gravitational acceleration, m_I is the inertial mass, and m_G is the gravitational mass. Thus we must have m_I/m_G = 1. And this then has a further implication:

An observer in a windowless room cannot distinguish between being on the surface of the Earth, and being in a spaceship accelerating at 1g.
J.2.5 The field equations
So far, we have derived the general relativistic equivalent of F = ma. To solve real problems, we must have a method to determine the accelerations, and that means having a field theory. In classical Newtonian mechanics, gravity is a scalar field described by Poisson's equation, for example. What are the field equations in GR?
Up to now, our treatment has been completely general. We have simply demanded that physics remain invariant under transformations between inertial frames, and that special relativity apply in non-accelerating frames. Now, we enter a fuzzier and less satisfying realm where we have some choice in how to proceed. Given such choice, we will aim for the very simplest field equations possible. We will be guided by noting the following:
1. We may expect that the source of gravity should look something like (somewhat like in Newton's laws) mass.
2. We know that at least in classical mechanics mass, momentum and energy are conserved. In special relativity, this becomes the conservation of 4-momentum.
123
3. We know that we must nd some tensor in these quantities to ensure coordinate invariance.
Suppose we just write down a tensor whose covariant derivative is nothing more than the 4-conservation
laws for each component of the 4-momentum. This is the energy-momentum tensor T

which satises:

= 0 (J.33)
where T
00
is the energy density, T
12
is the x-component of the y-momentum current, etc., and the
tensor must be symmetric. Since T

is really dened only by a derivative (by conservation laws), its


precise form must depend on the type of matter being considered. For a perfect uid, for example, in
the rest frame, T

takes on a simple form:


T

= diag(c
2
, p, p, p) (J.34)
where and p are the proper density and pressure, respectively. (Recall that pressure is just the ux
of x-momentum in the x-direction
7
.) Transforming to a general frame, this gives:
T

=
_
+p/c
2
_
U

pg

(J.35)
where U

=
dx

d
is the 4-velocity. (To use the above we must also specify an equation of state that
links the pressure and density.)
We now have an equation for the source terms of our field, but not how to describe the response: how spacetime will curve (i.e. what the metric will be) in response to these sources. Again, we may be guided by some simple considerations:
1. We must construct something using the metric. However, we know from our dynamics equation J.27 that simple coordinate transforms are described by first derivatives of the metric (these appear in the Christoffel symbols). Thus, we must use at least second derivatives of the metric if we want to describe physical spacetime distortions that cannot be simply transformed away.
2. As above, we must look for a tensor to ensure coordinate invariance for our field theory. Since we will equate it to the energy-momentum tensor, this should be a second rank tensor.
3. If our tensor, let us call it G^{\mu\nu}, has a vanishing covariant derivative (\nabla_\mu G^{\mu\nu} = 0) then it must follow that G^{\mu\nu} = kT^{\mu\nu}, where k is some constant. Thus, we should hunt for something that has a vanishing covariant derivative.
It turns out that there is only one tensor that can be constructed that is linear in the second derivatives of the metric: the Riemann tensor:

R^a{}_{bcd} = \partial_c \Gamma^a_{bd} - \partial_d \Gamma^a_{bc} + \Gamma^e_{bd}\Gamma^a_{ec} - \Gamma^e_{bc}\Gamma^a_{ed}    (J.36)

which is a function only of the Christoffel symbols, and therefore a function of the metric and its derivatives (c.f. equation J.29).
The Riemann tensor is a fourth rank tensor, but can be contracted using the metric to form a second rank tensor (the Ricci tensor), or contracted further to form a scalar. There is only one second rank tensor that can be constructed from the Riemann tensor and its contractions that has a vanishing covariant derivative: the Einstein tensor:

G^{\mu\nu} = R^{\mu\nu} - \frac{1}{2}\,g^{\mu\nu} R    (J.37)

where R^{\mu\nu} is the Ricci tensor, and R = g_{\mu\nu} R^{\mu\nu} is the curvature scalar.
And thus, we arrive at the Einstein field equations:

G^{\mu\nu} = \frac{8\pi G}{c^4}\,T^{\mu\nu}    (J.38)

[7] In case this is not clear, recall that pressure is just the force per unit area perpendicular to a surface. In the x-direction, for example, we may write: P = \mathbf{F}\cdot\hat{x}/A = \frac{d}{dt}(m\mathbf{v})\cdot\hat{x}/A, which is then clearly the flow of x-momentum per unit area along x (i.e. a momentum-current).
where the constant of proportionality is determined by demanding that the field equations reproduce Newton's laws in the weak field limit (we will come to this shortly).
The above derivation is rather sketchy and relies a bit on you having encountered this all before. But the sketched derivation highlights an important point: general relativistic dynamics is on quite secure ground; the field equations are not. This will become important when we discuss alternative gravity theories in later lectures. To give you an idea now, though, of the remaining freedoms, consider the following. There is another tensor that should be familiar to you already whose covariant derivative is zero: the metric itself: \nabla_\alpha g^{\mu\nu} = 0. Thus, we can generalise the field equations further to:

G^{\mu\nu} + \Lambda g^{\mu\nu} = \frac{8\pi G}{c^4}\,T^{\mu\nu}    (J.39)

where Λ is known as the cosmological constant.
Einstein himself was the first to propose adding the cosmological constant, in order to create solutions where the Universe is static. He later called this his greatest blunder [8]. Lemaître, following on from theoretical work by de Sitter and Friedmann, went on to demonstrate that the Universe is in fact expanding (Nussbaumer and Bieri 2011). But the cosmological constant has returned with recent observations that suggest that the Universe is accelerating: a phenomenon that could be explained by said cosmological constant (often referred to as dark energy; of which more later).
To understand what the cosmological constant means physically, let's move it over to the right side of the field equations and add it to the energy-momentum tensor as if it were an additional source term:

T^{\mu\nu}_\Lambda = \frac{c^4 \Lambda}{8\pi G}\,g^{\mu\nu}    (J.40)
In the absence of any matter, T^{\mu\nu} = 0 and the above must represent the source terms coming from the vacuum itself. If that sounds strange, later on in the course we will explain why there may be no such thing as genuinely empty space. For now, consider what the above means dynamically. We are free to transform the metric into Minkowski space: g^{\mu\nu} = {\rm diag}(1, -1, -1, -1), which gives:

T^{\mu\nu}_\Lambda = \frac{c^4 \Lambda}{8\pi G}\,{\rm diag}(1, -1, -1, -1) = {\rm diag}(c^2 \rho_{\rm vac}, p_{\rm vac}, p_{\rm vac}, p_{\rm vac})    (J.41)

and if the above really represents the vacuum solution, then this Minkowski space solution must be just fine: all observers will see the same vacuum. The energy density of the vacuum must be encoded in T^{00}_\Lambda = c^2 \rho_{\rm vac}, and thus we have derived that the vacuum pressure is negative:

p_{\rm vac} = -c^2 \rho_{\rm vac}    (J.42)

Thus the vacuum, assuming such a thing exists, will behave like antigravity, pushing the Universe apart. This is why the cosmological constant has been called dark energy and evoked to explain the observed accelerating expansion of the Universe.
J.2.6 Energy conservation in general relativity
This is often a source of confusion, and you will often hear (particularly when adding the cosmological constant) that energy is not conserved in general relativity. In fact, the issue is somewhat subtle. For starters, what is clearly conserved by construction is the energy-momentum tensor, and that is the fundamental coordinate invariant tensor that should be conserved. Furthermore, classical energy conservation is recovered in the weak-field limit. What is less clear is whether a scalar quantity like energy can be said to be conserved in general. Being a scalar, the energy is, of course, coordinate dependent and definition dependent, and so you can arrive at the conclusion that energy is conserved or not, depending on how you define it! Such issues should already be familiar to you by now from special relativity, where it is the energy-momentum 4-vector that is conserved. In special relativity, energy is only conserved for inertial observers watching from a fixed frame. In general relativity, no observer can sit happily in a fixed frame anymore, and so a simple unambiguous scalar energy can no longer be defined.

[8] I believe the origin of this quote is George Gamow's autobiography, My World Line.
J.3 Solving the eld equations
We now have the equations of motion and the field equations: in principle we are all set. In practice, finding solutions to the field equations is hard. Not least because coordinate transformations can fool us into thinking that two solutions are different when really they are the same, but simply transformed!
Here, we will first consider weak field solutions to the equations that can be solved using perturbation theory. We then present two full solutions of interest for this course: the Schwarzschild solution that is the relativistic equivalent of a point mass (and also happens to describe black holes), and the Friedmann, Lemaître, Robertson, Walker (FLRW) metric that describes an infinite homogeneous and isotropic Universe, which forms the backbone of our current cosmological model.
J.3.1 The Newtonian weak field limit
First let us consider Newtonian weak field general relativity. This is defined by three things:
1. Objects move slowly, i.e. \frac{1}{c}\left|\frac{d\mathbf{x}}{d\tau}\right| \ll \frac{dt}{d\tau}.
2. Gravity is weak such that spacetime is very close to Minkowski. In this case, we may write the metric as Minkowski plus some small perturbation, h_{\mu\nu}:

g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}    (J.43)

3. The metric is static and not a function of time.
From the first condition, the GR dynamics equation (J.27) becomes:

\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{00}\,c^2\left(\frac{dt}{d\tau}\right)^2 = 0    (J.44)

which, substituting our perturbed metric (equation J.43) into the Christoffel symbols using equation J.31, gives:

\frac{d^2 \mathbf{x}}{dt^2} = -\frac{c^2}{2}\nabla h_{00} \quad ; \quad \frac{d^2 t}{d\tau^2} = 0    (J.45)

The term on the right tells us that in the Newtonian weak field limit, time dilation effects must be constant. But more interesting is the term on the left. This immediately tells us the meaning of the h_{00} term in the metric. It must correspond to the standard Newtonian gravitational potential as follows: h_{00} = 2\Phi/c^2, thus recovering the familiar Newtonian dynamics equations. (Note that we are not specifying what the metric is here, but rather interpreting what it must mean.)
specifying what the metric is here, but rather interpreting what it must mean.)
Furthermore, we may solve the Einstein eld equations (J.38) for our perturbed metric. Using
again that h
00
=
2
c
2
, the left hand side reduces to:
R
00

1
2
Rg
00
=
2
2

c
2
(J.46)
while the right hand side gives:
8GT
00
c
4
=
8G
c
2
(J.47)
and thus we recover the familiar Poisson equation:
2
= 4G. (This derives the constant of
proportionality in the Einstein eld equations that we simply stated earlier, above.) Note that some
signicant algebra is required to achieve the above results, and I would advise you to consult a good
textbook on GR if embarking on the above derivations for the rst time.
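As a quick symbolic sanity check on the weak-field logic: away from the source, the Newtonian potential of a point mass satisfies the vacuum Poisson equation ∇²Φ = 0. A sketch using sympy (symbol names are my own):

```python
import sympy as sp

x, y, z, G, M = sp.symbols('x y z G M', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)
Phi = -G * M / r    # Newtonian point-mass potential

# Laplacian of Phi: sum of the unmixed second derivatives
lap = sum(sp.diff(Phi, q, 2) for q in (x, y, z))
print(sp.simplify(lap))   # 0: the vacuum Poisson equation holds for r != 0
```

The density source only appears at r = 0 (as a delta function), which is why the point mass is a vacuum solution everywhere else, exactly mirroring the Schwarzschild case below.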
J.3.2 The weak field limit & gravitational waves
The Newtonian weak field limit helps us understand the connection between GR and Newton, but does not give us intuition as to how these two theories differ. Here, we consider instead a linear expansion of the GR field equations. In this case, we assume only that the field is weak such that the metric can be decomposed into Minkowski plus a small perturbation:

g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}    (J.48)

But we no longer assume slow moving particles or static fields. In this case, the field equations become (to linear order and in the harmonic gauge):

\Box h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\Box h = -\frac{16\pi G}{c^4}\,T_{\mu\nu}    (J.49)

where \Box = \frac{\partial^2}{\partial(ct)^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2} is called the d'Alembert operator.
If we consider then a vacuum (T_{\mu\nu} = 0), and define the trace-reversed perturbation \bar{h}_{\mu\nu} = h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu} h, then we have that:

\Box \bar{h}_{\mu\nu} = 0    (J.50)

which is a wave equation! Thus, in general relativity, unlike in Newtonian gravity, gravitational perturbations drive gravitational waves through the vacuum. This solves the last of our original problems with Newtonian gravity: instantaneous action at a distance [9]. In GR, information about the gravitational tidal field is transmitted by gravitational waves at the speed of light.
J.3.3 The Schwarzschild solution
The Schwarzschild solution to the eld equations was found within just a year of Einstein completing
general relativity (Schwarzschild 1916), and penned while Schwarzschild was serving in the army
during world war one
10
. It is a tragedy that he did not survive to contribute more to the eld. The
solution is, in fact, the only spherically symmetric vacuum solution to the Einstein eld equations (i.e.
with T

= 0) and has the following form:


c
2
d
2
=
_
1
2GM
c
2
r
_
c
2
dt
2

dr
2
_
1
2GM
c
2
r
_ r
2
(d
2
+ sin
2
d
2
) (J.51)
where $r, \theta, \phi$ are the familiar spherical polar coordinates (in this context, Schwarzschild coordinates). It
is straightforward to verify that the above metric is indeed a solution of the field equations [exercise].
Equating $\Phi = -GM/r$, it is clear that the above metric is identical (at least in the all-important $g_{00}$ term)
to the weak field metric we derived above. Thus the Schwarzschild metric approaches the Newtonian
solution for a point mass of mass $M$ in the weak field limit. This is why it is often thought of as the
GR equivalent of a point mass.
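As a quick numerical sanity check (a sketch in Python; the constants and variable names here are illustrative, not from any course code), we can evaluate how far the Schwarzschild $g_{00}$ deviates from the flat-space value at the solar surface, and confirm that it agrees with the weak field form $1 + 2\Phi/c^2$ for $\Phi = -GM/r$:

```python
# Sketch: Schwarzschild g_00 versus the Newtonian weak-field form for
# the Sun.  Constants are approximate SI values.
G = 6.674e-11        # gravitational constant [m^3 kg^-1 s^-2]
c = 2.998e8          # speed of light [m/s]
M_sun = 1.989e30     # solar mass [kg]
R_sun = 6.957e8      # solar radius [m]

def g00_schwarzschild(r):
    """Magnitude of the g_00 component of the Schwarzschild metric."""
    return 1.0 - 2.0 * G * M_sun / (c**2 * r)

def g00_weak_field(r):
    """Weak-field limit, 1 + 2*Phi/c^2 with Phi = -G*M/r."""
    return 1.0 + 2.0 * (-G * M_sun / r) / c**2

r_s = 2.0 * G * M_sun / c**2  # Schwarzschild radius of the Sun, ~2.95 km
print(f"Schwarzschild radius of the Sun: {r_s/1e3:.2f} km")
print(f"g_00 deviation from flat space at the solar surface: "
      f"{1.0 - g00_schwarzschild(R_sun):.2e}")
```

The deviation is only a few parts in $10^6$, which is why the weak field (Newtonian) description of the Solar System is so accurate, and why GR corrections there are tiny perturbations.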
J.3.4 The FLRW metric and the cosmological model
Above, we wrote down the Schwarzschild solution to the field equations that approaches a Newtonian
point mass in the weak field limit. Here, we are interested in finding a solution that can describe the
whole Universe. Our task is made significantly easier by the fact that observations of galaxies in the
distant Universe (Wu et al. 1999; Yadav et al. 2005), and the cosmic microwave background radiation
(the afterglow of the Big Bang; more on this later), suggest that the Universe is very close to being
perfectly isotropic and homogeneous. Furthermore, it would be quite the coincidence if it appeared
this way just from our perspective. Thus, it is reasonable to hunt for a Universe-metric that describes
isotropic and homogeneous matter. As you will see on the problem sheet, such a metric, due to its
9
See e.g. the excellent lecture notes on GR by Sean Carroll: http://preposterousuniverse.com/grnotes/.
10
A quick search on the NASA Astronomy Abstract Service finds that this article has just 17 citations to date! Let
this be a lesson, then, to us all that citations are not everything. It is also interesting to note that Schwarzschild was
over 40 when he produced his most famous work. That busts another popular myth about science and age.
symmetries (rather like the Schwarzschild metric), is unique. It is typically called the Friedmann-
Lemaître-Robertson-Walker (FLRW) metric after the various authors who discovered it:
\[
c^2 d\tau^2 = c^2 dt^2 - R^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta \, d\phi^2\right)\right] \qquad \text{(J.52)}
\]
where R(t) is called the scale factor, and k is a parameter that measures the fundamental curvature
of the spacetime.
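To build intuition for the role of $k$, the following Python sketch (with illustrative values for $R(t)$, $k$ and $r$, not taken from the course materials) computes the radial proper distance at fixed $t$, $D = R(t)\int_0^r dr'/\sqrt{1-kr'^2}$, both by direct quadrature and from the closed form:

```python
import math

def proper_distance_numeric(R0, k, r, n=100000):
    """Radial proper distance D = R0 * integral dr'/sqrt(1 - k r'^2),
    evaluated with a simple midpoint quadrature."""
    dr = r / n
    return R0 * sum(dr / math.sqrt(1.0 - k * ((i + 0.5) * dr)**2)
                    for i in range(n))

def proper_distance_closed(R0, k, r):
    """The same integral in closed form for each sign of the curvature k."""
    if k > 0:
        return R0 * math.asin(math.sqrt(k) * r) / math.sqrt(k)
    if k < 0:
        return R0 * math.asinh(math.sqrt(-k) * r) / math.sqrt(-k)
    return R0 * r  # flat space (k = 0): D is just R(t) * r

R0, k, r = 1.0, 0.5, 0.9  # illustrative values
print(proper_distance_numeric(R0, k, r), proper_distance_closed(R0, k, r))
```

For $k > 0$ the proper distance exceeds $R(t)\,r$, for $k < 0$ it falls below it, and for $k = 0$ the spatial sections are Euclidean; this is the sense in which $k$ measures the fundamental curvature of the spacetime.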
Bibliography
[Aarseth and Zare, 1974] S. J. Aarseth and K. Zare. A regularization of the three-body problem.
Celestial Mechanics, 10:185205, October 1974. 56
[Aarseth, 1963] S. J. Aarseth. Dynamical evolution of clusters of galaxies, I. MNRAS, 126:223+,
1963. 21, 52, 56
[Aarseth, 2003a] S. J. Aarseth. Gravitational N-Body Simulations. Gravitational N-Body Simulations,
by Sverre J. Aarseth, pp. 430. ISBN 0521432723. Cambridge, UK: Cambridge University Press,
November 2003., November 2003. 32
[Aarseth, 2003b] S. J. Aarseth. Gravitational N-Body Simulations. Cambridge University Press, Cam-
bridge, UK, November 2003. 52, 53, 54, 55, 56
[Ahmad and Cohen, 1973] A. Ahmad and L. Cohen. A numerical integration scheme for the N-body
gravitational problem. J. Comp. Phys., 12:389402, 1973. 56
[An and Evans, 2006] J. H. An and N. W. Evans. A Cusp Slope-Central Anisotropy Theorem. ApJ,
642:752758, May 2006. 77
[Arfken and Weber, 2005] G. B. Arfken and H. J. Weber. Mathematical methods for physicists 6th
ed. Materials and Manufacturing Processes, 2005. 87, 88, 90, 91, 92
[Barnes and Hut, 1986] J. Barnes and P. Hut. A Hierarchical O(NlogN) Force-Calculation Algorithm.
Nature, 324:446449, December 1986. 68, 69
[Baumgardt and Makino, 2003] H. Baumgardt and J. Makino. Dynamical evolution of star clusters
in tidal fields. MNRAS, 340:227–246, March 2003. 57
[Bennett, 2006] A. Bennett. Lagrangian uid dynamics. Cambridge Monographs on Mechanics, 2006.
84
[Binney and Tremaine, 2008] J. Binney and S. Tremaine. Galactic dynamics. Princeton, NJ, Prince-
ton University Press, 2008, 747 p., 2008. 1, 27, 36, 43, 45, 57, 58, 59, 67, 77, 78, 95, 114
[Buchert et al., 2000] T. Buchert, M. Kerscher, and C. Sicka. Back reaction of inhomogeneities on the
expansion: The evolution of cosmological parameters. Phys. Rev. D, 62(4):043525, August 2000.
74
[Chandrasekhar, 1943] S. Chandrasekhar. Dynamical Friction. I. General Considerations: the Coefficient
of Dynamical Friction. ApJ, 97:255+, March 1943. 110
[Chin and Chen, 2005] S. Chin and C. Chen. Forward symplectic integrators for solving gravita-
tional few-body problems. Celestial Mechanics and Dynamical Astronomy, 91:301322, 2005.
10.1007/s10569-004-4622-z. 50
[Cohn, 1979] H. Cohn. Numerical integration of the Fokker-Planck equation and the evolution of star
clusters. ApJ, 234:10361053, December 1979. 59
[De Lorenzi et al., 2007] F. De Lorenzi, V. P. Debattista, O. Gerhard, and N. Sambhus. NMAGIC: a
fast parallel implementation of a χ²-made-to-measure algorithm for modelling observational data.
MNRAS, 376:71–88, March 2007. 78
[Dehnen and Read, 2011] W. Dehnen and J. I. Read. N-body simulations of gravitational dynamics.
European Physical Journal Plus, 126:55+, May 2011. 19, 47, 51, 64
[Dehnen, 1993] W. Dehnen. A Family of Potential-Density Pairs for Spherical Galaxies and Bulges.
MNRAS, 265:250+, November 1993. 94
[Dehnen, 2000] W. Dehnen. A Very Fast and Momentum-conserving Tree Code. ApJ, 536:L39L42,
June 2000. 69, 84
[Dehnen, 2001] W. Dehnen. Towards optimal softening in three-dimensional N-body codes - I. Mini-
mizing the force error. MNRAS, 324:273291, June 2001. 66, 68
[Dehnen, 2009] W. Dehnen. Tailoring triaxial N-body models via a novel made-to-measure method.
MNRAS, 395:10791086, May 2009. 78
[Dilts, 1999] G. Dilts. Moving-least-squares-particle hydrodynamics I. consistency and stability.
International journal for numerical methods in engineering, 44:11151155, February 1999. 84
[Dubinski et al., 2009] J. Dubinski, I. Berentzen, and I. Shlosman. Anatomy of the Bar Instability in
Cuspy Dark Matter Halos. ApJ, 697:293310, May 2009. 82
[Efstathiou et al., 1985] G. Efstathiou, M. Davis, S. D. M. White, and C. S. Frenk. Numerical tech-
niques for large cosmological N-body simulations. ApJS, 57:241260, February 1985. 82
[Einstein, 1916] A. Einstein. Die Grundlage der allgemeinen Relativitätstheorie. Annalen der Physik,
354:769–822, 1916. 118
[Enea Romano and Chen, 2011] A. Enea Romano and P. Chen. Corrections to the apparent value of
the cosmological constant due to local inhomogeneities. JCAP, 10:16, October 2011. 74
[Freitag and Benz, 2001] M. Freitag and W. Benz. A new Monte Carlo code for star cluster simula-
tions. I. Relaxation. A&A, 375:711738, August 2001. 59
[Fukugita et al., 1995] M. Fukugita, K. Shimasaku, and T. Ichikawa. Galaxy Colors in Various Pho-
tometric Band Systems. PASP, 107:945+, October 1995. 6
[Fukushige et al., 2005] T. Fukushige, J. Makino, and A. Kawai. GRAPE-6A: A Single-Card GRAPE-
6 for Parallel PC-GRAPE Cluster Systems. PASJ, 57:10091021, December 2005. 57
[Gaburov et al., 2009] E. Gaburov, S. Harfst, and S. Portegies Zwart. SAPPORO: A way to turn
your graphics cards into a GRAPE-6. New Astronomy, 14:630637, October 2009. 57
[Gingold and Monaghan, 1977] R. A. Gingold and J. J. Monaghan. Smoothed particle hydrodynamics
- Theory and application to non-spherical stars. MNRAS, 181:375389, November 1977. 84
[Greengard and Rokhlin, 1987] L. Greengard and V. Rokhlin. A fast algorithm for particle simula-
tions. Journal of Computational Physics, 73:325348, December 1987. 84
[Hahn and Abel, 2011] O. Hahn and T. Abel. Multi-scale initial conditions for cosmological simula-
tions. MNRAS, 415:21012121, August 2011. 82
[Harfst et al., 2007a] S. Harfst, A. Gualandris, D. Merritt, R. Spurzem, S. Portegies Zwart, and
P. Berczik. Performance analysis of direct N-body algorithms on special-purpose supercomput-
ers. New Astronomy, 12:357377, July 2007. 32
[Harfst et al., 2007b] S. Harfst, A. Gualandris, D. Merritt, R. Spurzem, S. Portegies Zwart, and
P. Berczik. Performance analysis of direct N-body algorithms on special-purpose supercomput-
ers. New Astronomy, 12:357377, July 2007. 57
[Harrison, 1989] E. Harrison. Darkness at Night: a riddle of the Universe. Harvard University Press,
1989. 71
[Heggie and Hut, 2003] D. Heggie and P. Hut. The Gravitational Million-Body Problem: A Multidis-
ciplinary Approach to Star Cluster Dynamics. Cambridge University Press, Cambridge, February
2003. 1
[Heggie et al., 1998] D. C. Heggie, M. Giersz, R. Spurzem, and K. Takahashi. Dynamical Simulations:
Methods and Comparisons. Highlights of Astronomy, 11:591+, 1998. 59
[Heggie, 1974] D. C. Heggie. A global regularisation of the gravitational N-body problem. Celestial
Mechanics, 10:217241, October 1974. 56, 57
[Hénon, 1971] M. Hénon. Monte Carlo Models of Star Clusters (Part of the Proceedings of the IAU
Colloquium No. 10, held in Cambridge, England, August 12-15, 1970.). Ap&SS, 13:284–299, October
1971. 59
[Hernquist et al., 1991] L. Hernquist, F. R. Bouchet, and Y. Suto. Application of the Ewald method
to cosmological N-body simulations. ApJS, 75:231240, February 1991. 75
[Hernquist, 1990] L. Hernquist. An analytical model for spherical galaxies and bulges. ApJ, 356:359–364,
June 1990. 94
[Hieber and Koumoutsakos, 2008] S. E. Hieber and P. Koumoutsakos. A Lagrangian particle method
for the simulation of linear and nonlinear elastic models of soft tissue. Journal of Computational
Physics, 227:91959215, November 2008. 84
[Hockney and Eastwood, 1988] R. W. Hockney and J. W. Eastwood. Computer simulation using
particles. Bristol: Hilger, 1988. 1
[Holder et al., 1999] T. Holder, B. Leimkuhler, and S. Reich. Explicit variable step-size and time-
reversible integration. Appl. Numer. Math, 39:367377, 1999. 50
[Holmberg, 1941] E. Holmberg. On the Clustering Tendencies among the Nebulae. II. a Study of
Encounters Between Laboratory Models of Stellar Systems by a New Integration Procedure. ApJ,
94:385+, November 1941. 18
[Hubble, 1929] E. Hubble. A Relation between Distance and Radial Velocity among Extra-Galactic
Nebulae. Proceedings of the National Academy of Science, 15:168173, March 1929. 71
[Hut et al., 1995] P. Hut, J. Makino, and S. McMillan. Building a better leapfrog. ApJ, 443:L93L96,
April 1995. 50
[Iannuzzi and Dolag, 2011] F. Iannuzzi and K. Dolag. Adaptive gravitational softening in GADGET.
MNRAS, 417:28462859, November 2011. 66
[Ishibashi and Wald, 2006] A. Ishibashi and R. M. Wald. Can the acceleration of our universe be
explained by the eects of inhomogeneities? Classical and Quantum Gravity, 23:235250, January
2006. 74
[Ito et al., 1990] T. Ito, J. Makino, T. Ebisuzaki, and D. Sugimoto. A special-purpose N-body machine
GRAPE-1. Computer Physics Communications, 60:187194, September 1990. 57
[Jenkins, 2010] A. Jenkins. Second-order Lagrangian perturbation theory initial conditions for resim-
ulations. MNRAS, 403:18591872, April 2010. 82
[Kim et al., 2008] E. Kim, I. Yoon, H. M. Lee, and R. Spurzem. Comparative study between N-body
and Fokker-Planck simulations for rotating star clusters - I. Equal-mass system. MNRAS, 383:210,
January 2008. 59
[Kokubo et al., 1998] E. Kokubo, K. Yoshinaga, and J. Makino. On a time-symmetric Hermite inte-
grator for planetary N-body simulation. MNRAS, 297:10671072, July 1998. 53
[Kuijken and Dubinski, 1995] K. Kuijken and J. Dubinski. Nearly Self-Consistent Disc / Bulge / Halo
Models for Galaxies. MNRAS, 277:1341+, December 1995. 78
[Kustaanheimo et al., 1965] P. Kustaanheimo, A. Schinzel, H. Davenport, and E. Stiefel. Perturbation
theory of Kepler motion based on spinor regularization. Journal für die reine und angewandte
Mathematik, 218:204–219, 1965. 55
[Larson, 1970] R. B. Larson. A method for Computing the evolution of star clusters. MNRAS,
147:323+, 1970. 59
[Laskar and Gastineau, 2009] J. Laskar and M. Gastineau. Existence of collisional trajectories of
Mercury, Mars and Venus with the Earth. Nature, 459:817819, June 2009. 60
[Libersky et al., 1993] L. D. Libersky, A. G. Petschek, T. C. Carney, J. R. Hipp, and F. A. Allahdadi.
High Strain Lagrangian Hydrodynamics A Three-Dimensional SPH Code for Dynamic Material
Response. Journal of Computational Physics, 109:6775, November 1993. 84
[Łokas and Mamon, 2001] E. L. Łokas and G. A. Mamon. Properties of spherical galaxies and clusters
with an NFW density profile. MNRAS, 321:155–166, February 2001. 95
[Lucy, 1977] L. B. Lucy. A numerical approach to the testing of the fission hypothesis. AJ, 82:1013–1024,
December 1977. 84
[Makino et al., 2003] J. Makino, T. Fukushige, M. Koga, and K. Namura. GRAPE-6: Massively-
Parallel Special-Purpose Computer for Astrophysical Particle Simulations. PASJ, 55:11631187,
December 2003. 57
[Makino et al., 2006] J. Makino, P. Hut, M. Kaplan, and H. Saygın. A time-symmetric block time-step
algorithm for N-body simulations. New Astronomy, 12:124–133, November 2006. 53
[Makino, 1991a] J. Makino. A Modified Aarseth Code for GRAPE and Vector Processors. PASJ,
43:859–876, December 1991. 53
[Makino, 1991b] J. Makino. Optimal order and time-step criterion for Aarseth-type N-body integra-
tors. ApJ, 369:200212, March 1991. 52, 54
[Makino, 1996] J. Makino. Postcollapse Evolution of Globular Clusters. ApJ, 471:796, November
1996. 60
[Makino, 2008] J. Makino. Current Status of GRAPE Project. In E. Vesperini, M. Giersz, & A. Sills,
editor, IAU Symposium, volume 246 of IAU Symposium, pages 457466, May 2008. 57
[Maron and Howes, 2003] J. L. Maron and G. G. Howes. Gradient Particle Magnetohydrodynamics: A
Lagrangian Particle Code for Astrophysical Magnetohydrodynamics. ApJ, 595:564572, September
2003. 84
[Michelson and Morley, 1887] A. Michelson and E. Morley. On the Relative Motion of the Earth and
the Luminiferous Ether. American Journal of Science, 34:333345, 1887. 118
[Michelson, 1881] A. Michelson. The Relative Motion of the Earth and the Luminiferous Ether.
American Journal of Science, 22:120129, 1881. 118
[Mikkola and Aarseth, 1990] S. Mikkola and S. J. Aarseth. A chain regularization method for the
few-body problem. Celestial Mechanics and Dynamical Astronomy, 47:375390, 1990. 56
[Mikkola and Aarseth, 1993] S. Mikkola and S. J. Aarseth. An implementation of N-body chain reg-
ularization. Celestial Mechanics and Dynamical Astronomy, 57:439459, November 1993. 56
[Mikkola and Aarseth, 2002] S. Mikkola and S. Aarseth. A Time-Transformed Leapfrog Scheme. Ce-
lestial Mechanics and Dynamical Astronomy, 84:343354, December 2002. 56
[Monaghan, 1992] J. J. Monaghan. Smoothed particle hydrodynamics. ARA&A, 30:543574, 1992.
84
[Mukhanov et al., 1997] V. F. Mukhanov, L. R. W. Abramo, and R. H. Brandenberger. Backreaction
Problem for Cosmological Perturbations. Physical Review Letters, 78:16241627, March 1997. 74
[Mulder, 1983] W. A. Mulder. Dynamical friction on extended objects. A&A, 117:9–16, January 1983.
116
[Nagai and Miyamoto, 1976] R. Nagai and M. Miyamoto. A family of self-gravitating stellar systems
with axial symmetry. PASJ, 28:1–17, 1976. 29
[Navarro et al., 1996] J. F. Navarro, C. S. Frenk, and S. D. M. White. The Structure of Cold Dark
Matter Halos. ApJ, 462:563+, May 1996. 95
[Nitadori and Makino, 2008] K. Nitadori and J. Makino. Sixth- and eighth-order Hermite integrator
for N-body simulations. New Astronomy, 13:498507, October 2008. 52, 54
[Nussbaumer and Bieri, 2011] H. Nussbaumer and L. Bieri. Who discovered the expanding universe?
ArXiv e-prints, July 2011. 71, 125
[Peacock, 1999] J. A. Peacock. Cosmological physics. Cosmological physics. Publisher: Cambridge,
UK: Cambridge University Press, 1999. ISBN: 0521422701, 1999. 71
[Phillips, 1999] A. C. Phillips. The Physics of Stars, 2nd Edition. Physica Scripta Volume T, July
1999. 10
[Portegies Zwart and McMillan, 2002] S. F. Portegies Zwart and S. L. W. McMillan. The Runaway
Growth of Intermediate-Mass Black Holes in Dense Star Clusters. ApJ, 576:899907, September
2002. 62
[Power et al., 2003] C. Power, J. F. Navarro, A. Jenkins, C. S. Frenk, S. D. M. White, V. Springel,
J. Stadel, and T. Quinn. The inner structure of ΛCDM haloes - I. A numerical convergence study.
MNRAS, 338:14–34, January 2003. 66
[Press et al., 1992] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical
recipes in C. The art of scientific computing. Cambridge: University Press, c1992, 2nd ed., 1992.
67, 79, 81, 92
[Price and Monaghan, 2007] D. J. Price and J. J. Monaghan. An energy-conserving formalism for
adaptive gravitational force softening in smoothed particle hydrodynamics and N-body codes. MN-
RAS, 374:13471358, February 2007. 66
[Price, 2005] D. Price. Smoothed Particle Hydrodynamics. ArXiv Astrophysics e-prints, July 2005.
84, 85
[Price, 2010] D. J. Price. Smoothed Particle Hydrodynamics and Magnetohydrodynamics. ArXiv
e-prints, December 2010. 84
[Quinlan and Tremaine, 1990] G. D. Quinlan and S. Tremaine. Symmetric multistep methods for the
numerical integration of planetary orbits. AJ, 100:16941700, November 1990. 50
[Read and Hayfield, 2012] J. I. Read and T. Hayfield. SPHS: smoothed particle hydrodynamics with
a higher order dissipation switch. MNRAS, page 2941, April 2012. 84, 85
[Read et al., 2006a] J. I. Read, T. Goerdt, B. Moore, A. P. Pontzen, J. Stadel, and G. Lake. Dynamical
friction in constant density cores: a failure of the Chandrasekhar formula. MNRAS, 373:14511460,
December 2006. 115
[Read et al., 2006b] J. I. Read, M. I. Wilkinson, N. W. Evans, G. Gilmore, and J. T. Kleyna. The
importance of tides for the Local Group dwarf spheroidals. MNRAS, 367:387399, March 2006. 79,
81
[Read et al., 2006c] J. I. Read, M. I. Wilkinson, N. W. Evans, G. Gilmore, and J. T. Kleyna. The
tidal stripping of satellites. MNRAS, 366:429437, February 2006. 21
[Read et al., 2010] J. I. Read, T. Hayfield, and O. Agertz. Resolving mixing in smoothed particle
hydrodynamics. MNRAS, 405:1513–1530, July 2010. 84, 85
[Rosswog, 2010] S. Rosswog. Conservative, special-relativistic smoothed particle hydrodynamics.
Journal of Computational Physics, 229:85918612, November 2010. 84
[Saha, 1992] P. Saha. Constructing stable spherical galaxy models. MNRAS, 254:132138, January
1992. 94
[Schwarzschild, 1916] K. Schwarzschild. On the Gravitational Field of a Mass Point According to
Einstein's Theory. Abh. Königl. Preuss. Akad. Wissenschaften Jahre 1906,92, Berlin,1907, pages
189–196, 1916. 127
[Sheng, 1989] Q. Sheng. Solving linear partial differential equations by exponential splitting. IMA J.
Numerical Analysis, 9(2):199–212, 1989. 50
[Shu, 1982] F. H. Shu. The physical universe. an introduction to astronomy. A Series of Books in
Astronomy, Mill Valley, CA: University Science Books, 1982, 1982. 1
[Shu, 1991] F. Shu. Physics of Astrophysics, Vol. II: Gas Dynamics. Published by University Science
Books, 648 Broadway, Suite 902, New York, NY 10012, 1991., 1991. 11
[Simon et al., 2003] J. D. Simon, A. D. Bolatto, A. Leroy, and L. Blitz. High-Resolution Measurements
of the Dark Matter Halo of NGC 2976: Evidence for a Shallow Density Profile. ApJ,
596:957–981, October 2003. 11
[Springel and Hernquist, 2002] V. Springel and L. Hernquist. Cosmological smoothed particle hydro-
dynamics simulations: the entropy equation. MNRAS, 333:649664, July 2002. 83
[Springel et al., 2001] V. Springel, N. Yoshida, and S. D. M. White. GADGET: a code for collisionless
and gasdynamical cosmological simulations. New Astronomy, 6:79117, April 2001. 69
[Springel, 2010a] V. Springel. E pur si muove: Galilean-invariant cosmological hydrodynamical sim-
ulations on a moving mesh. MNRAS, 401:791851, January 2010. 84
[Springel, 2010b] V. Springel. Smoothed Particle Hydrodynamics in Astrophysics. ARA&A, 48:391–430,
September 2010. 84
[Spurzem and Takahashi, 1995] R. Spurzem and K. Takahashi. Comparison between Fokker-Planck
and gaseous models of star clusters in the multi-mass case revisited. MNRAS, 272:772784, February
1995. 59
[Stadel, 2001] J. G. Stadel. Cosmological N-body simulations and their analysis. Ph.D. Thesis, Univ.
Washington, 2001. 68
[Suzuki, 1991] M. Suzuki. General theory of fractal path integrals with applications to many-body
theories and statistical physics. J. Math. Phys., 32(2):400407, 1991. 50
[Syer and Tremaine, 1996] D. Syer and S. Tremaine. Made-to-measure N-body systems. MNRAS,
282:223233, September 1996. 78
[Takahashi, 1995] K. Takahashi. Fokker-Planck Models of Star Clusters with Anisotropic Velocity
Distributions I. Pre-Collapse Evolution. PASJ, 47:561573, October 1995. 59
[Tremaine and Weinberg, 1984] S. Tremaine and M. D. Weinberg. Dynamical friction in spherical
systems. MNRAS, 209:729757, August 1984. 115
[von Hoerner, 1960] S. von Hoerner. Die numerische Integration des n-Körper-Problemes für Sternhaufen.
I. ZAp, 50:184–214, 1960. 21
[Widrow and Dubinski, 2005] L. M. Widrow and J. Dubinski. Equilibrium Disk-Bulge-Halo Models
for the Milky Way and Andromeda Galaxies. ApJ, 631:838855, October 2005. 78
[Will, 1993] C. M. Will. Theory and Experiment in Gravitational Physics. Cambridge University
Press, March 1993. 118
[Wu et al., 1999] K. K. S. Wu, O. Lahav, and M. J. Rees. The large-scale smoothness of the Universe.
Nature, 397:225230, January 1999. 127
[Yadav et al., 2005] J. Yadav, S. Bharadwaj, B. Pandey, and T. R. Seshadri. Testing homogeneity on
large scales in the Sloan Digital Sky Survey Data Release One. MNRAS, 364:601606, December
2005. 71, 127
[Yoshida, 1982] H. Yoshida. A New Derivation of the Kustaanheimo-Stiefel Variables. Celestial Me-
chanics, 28:239242, September 1982. 55
[Yoshida, 1993] H. Yoshida. Recent Progress in the Theory and Application of Symplectic Integrators.
Celestial Mechanics and Dynamical Astronomy, 56:2743, March 1993. 49, 50
[Zemp et al., 2007] M. Zemp, J. Stadel, B. Moore, and C. M. Carollo. An optimum time-stepping
scheme for N-body simulations. MNRAS, 376:273286, March 2007. 54, 55
[Zhao, 1996] H. Zhao. Analytical models for galactic nuclei. MNRAS, 278:488496, January 1996. 94