An Introduction to Differential Geometry.
Michael D. Alder
November 29, 2008
Contents

1 Introduction

2 Smooth Manifolds and Vector Fields
  2.1 Introduction
  2.2 Smooth Manifolds
  2.3 Smooth maps and tangent vectors
  2.4 Notation: Vector Fields
  2.5 Cotangent Bundles
  2.6 The Tangent Functor
      2.6.1 The (nonexistent) Cotangent Functor
  2.7 Autonomous Systems of ODEs
      2.7.1 Systems of ODEs and Vector Fields
      2.7.2 Exponentiation of Things
      2.7.3 Solving Linear Autonomous Systems
      2.7.4 Existence and Uniqueness
  2.8 Flows
  2.9 Lie Brackets
  2.10 Conclusion

3 Tensors and Tensor Fields
  3.1 Tensors
      3.1.1 Natural and Unnatural Isomorphisms
      3.1.2 Multilinearity
      3.1.3 Dimension of Tensor spaces
      3.1.4 The Tensor Algebra
  3.2 Tensor Fields on a Manifold
  3.3 The Riemannian Metric Tensor
      3.3.1 What this means: Ancient History
  3.4 Geometry
  3.5 The Exterior Algebra
  3.6 The Exterior Calculus
  3.7 Hodge Duality: The Hodge Operator
      3.7.1 The Riemannian Case
      3.7.2 The Semi-Riemannian Case

4 Some Elementary Physics
  4.1 Three weird forces
  4.2 Fields
      4.2.1 Gradient Fields
      4.2.2 What are Flux?
  4.3 Maxwell and Faraday
  4.4 Invariance
      4.4.1 The Idea of Invariance
      4.4.2 The Lorentz Group
      4.4.3 The Maxwell Equations
  4.5 Saying it with Differential Forms
  4.6 Lorentz Invariance
      4.6.1 Special Relativity

5 DeRham Cohomology: Counting holes
  5.1 Cultural Anthropology
  5.2 Solutions
  5.3 Infinite Variety
  5.4 Gauge Freedom
  5.5 Exact and Closed forms
  5.6 Homotopies
  5.7 Counting Holes
  5.8 More Cultural Anthropology
  5.9 Summary

6 Lie Groups
  6.1 Introduction and Motivation
      6.1.1 The rest of the course
  6.2 Introduction to Lie Groups
  6.3 Group Representations
      6.3.1 Introduction
      6.3.2 Irreducible Representations
      6.3.3 Tensor Representations
      6.3.4 Schur's Lemma
      6.3.5 Representations of SU(2, C)
      6.3.6 Representations of SU(2)
  6.4 Lie Algebras

7 Fibre Bundles
  7.1 Introduction
  7.2 Principal Bundles
  7.3 The Endomorphism Bundle

8 Connections
  8.1 Fundamental Ideas
  8.2 Back in R^n
      8.2.1 Covariant differentiation
      8.2.2 Curves and transporting vectors
  8.3 Covariance
  8.4 Extensions to Tensor Fields on R^2
  8.5 The Koszul Connection
  8.6 Vector Potentials
      8.6.1 Tensor formulation
  8.7 Concluding Remarks
Preface
This is a first course in Differential Geometry. I follow a number of sources: first, the text for the course, Baez and Muniain's Gauge Fields, Knots and Gravity; second, the unique Michael Spivak's Comprehensive Introduction to Differential Geometry, which is almost encyclopaedic and also readable, if at times demanding; third, R. W. R. Darling's Differential Forms and Connections; and finally the rather old-fashioned Sternberg Lectures in Differential Geometry. I shall also make some allusions to Helgason's Differential Geometry, Lie Groups and Symmetric Spaces.
My aim is to cover some of the ideas from the text book with applications to theoretical physics, while also covering the basic ideas. I hope that the students will have done my 3P0 course, which introduces tensors and tensor fields, but it seems unsafe to count on it having been absorbed as thoroughly as desired. So some of the introductory material has been lifted from my 3P0 notes.
Mike Alder, February 2007
Chapter 1
Introduction
This course is about Differential Geometry and the text book is really important. You need your own copy unless you are sharing with a very good friend. It is particularly important if you want to see where the physical reasons for studying this topic tie in with the mathematics. We shan't get through the whole book but we shall have started on the journey.

I copied some of that from the introduction to 3P0. It's still true.

You can see what is in the course by reading the contents page. Not that it will help; this is Mathematics, not one of those subjects where you learn to say the right things without considering whether they are true or false or even whether or not they mean anything.
There are many different reasons for studying this subject but one important one is that you will come to grips with some of the ideas that modern Physics needs to make sense of the universe. You may be a physicist or a mathematician or even an engineer (embryonically at least). The three subjects tend to attract slightly different kinds of people (only slightly different: compared with poets, popstars, princes, politicians and philosophers we are barely different at all). Engineers tend to see the world in terms of facts and protocols which they have to learn and which may or may not make much sense; physicists see the world in terms of facts and theories, the theories being there to summarise and predict the facts. Mathematicians expect to see reasons and logic and relatively few basic facts from which the others can be deduced. For a mathematician it has to make sense or it is definitely wrong. For a theoretical physicist it has to be elegant or it is definitely wrong.

Because I am a mathematician, I have put in material which is left out of the text book, where it is treated as some bunch of facts which you need to know, whereas I want to show why these things are the case. Maybe I just have a bad memory for facts but a good one for arguments. Whatever the
reason, I am going to try to show you the essential beauty of the subject, to get you to agree that it is amazingly cool, because this is ultimately why mathematicians do it. The fact that it is also very useful is not why we do it, it is why we get paid to do it. Though not much.[1]

This is very tough stuff, so don't expect an easy time. On the other hand it will be very exciting.

[1] This is a really bad subject to do if you want to get rich, or boss people about, but a very good one if you want to be happy and have lots of interesting and important things to think about.
Chapter 2
Smooth Manifolds and Vector
Fields
2.1 Introduction
This chapter considers the machinery needed to say what we mean by a smooth manifold. We also look at vector fields on smooth manifolds and explain what this has to do with systems of ordinary differential equations.
The first idea is that a curve (in the plane or in three space) is a one dimensional object; a surface such as a sphere (the surface of a beach ball) or a torus (the surface of an American doughnut where they sell you a hole in the middle) is a two dimensional object. And there ought reasonably to be higher dimensional variants of these things, as for example the n-sphere S^n given by

    S^n = {x ∈ R^{n+1} : ‖x‖ = 1}
It is also reasonable to look at smooth maps between manifolds. If you draw a smooth curve on a beach ball without stopping, which joins up to stop where it starts and has the same final and initial velocity, then we could think of this as a smooth map from S^1 to S^2. But we have a problem in dealing with what we mean by a differentiable map in this case, since neither S^1 nor S^2 is a Banach space, and Banach spaces are the setting for talking about differentiation, since they contain the linear algebra and distance notions needed to talk of linear approximations, which is what derivatives are.
We might be able to make sense of this if the sphere is sitting in R^3, and the circle in R^2, in which case we have a way of defining smoothness extrinsically. But smoothness ought to make sense intrinsically, that is, without reference to some external space in which the manifold may or may not be sitting. An
important reason for this is that we live in what looks to be a 3-manifold called the physical universe, at least at some scales. Make it a 4-manifold if you want to throw in time. If the universe is sitting in some higher dimensional space, we can't know much about it, so it is slightly lunatic to believe it is there. Google branes for an alternative viewpoint. Also string theory for some disturbing ideas. But these do not contradict the idea that if we live in a three dimensional physical universe it makes sense to talk about smooth motion in it without having to postulate some inaccessible space external to the universe.
So we seek to specify enough extra structure on a manifold so that we can talk about smooth maps between them without reference to any space in which they may be sitting. This will certainly be necessary if we are to suppose that we live in a 3-manifold and want to talk about geodesics in it, curves of minimal length. We shall certainly want to do this if we are to talk of the path of a photon in our universe.
All this means generalising ideas about maps from R^n, or subsets of R^n, to R^m, which involve differentiability. Which we understand. Or do we?
Recall that if U, V are open subsets of R^n and f : U → V is a differentiable map, we have that at each point a ∈ U the derivative of f at a is the linear map Df(a) : R^n → R^n, which is represented in the standard basis by the n × n matrix of partial derivatives:
\[
[Df_a] =
\begin{bmatrix}
D_1 f^1(a) & D_2 f^1(a) & \cdots & D_n f^1(a) \\
D_1 f^2(a) & D_2 f^2(a) & \cdots & D_n f^2(a) \\
\vdots & \vdots & \ddots & \vdots \\
D_1 f^n(a) & D_2 f^n(a) & \cdots & D_n f^n(a)
\end{bmatrix}
=
\begin{bmatrix}
\frac{\partial f^1}{\partial x^1} & \frac{\partial f^1}{\partial x^2} & \cdots & \frac{\partial f^1}{\partial x^n} \\
\frac{\partial f^2}{\partial x^1} & \frac{\partial f^2}{\partial x^2} & \cdots & \frac{\partial f^2}{\partial x^n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f^n}{\partial x^1} & \frac{\partial f^n}{\partial x^2} & \cdots & \frac{\partial f^n}{\partial x^n}
\end{bmatrix}_{x=a}
\]
Usually I shan't bother to distinguish between the linear map and its matrix representation. You know how to compute this matrix if it should be absolutely necessary, and you should understand that the linear map is the linear part of the best affine approximation to f at a. Note that I have used f^i for the n component functions of f and x^i for the components of a vector in R^n. I shall explain this notation later.
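Since the derivative at a is just this matrix of partials, it can be approximated numerically. Here is a minimal sketch (in Python, which is not part of the course); the map f, the point a, and the step h are arbitrary choices for illustration, not from the text.

```python
import math

def jacobian(f, a, h=1e-6):
    """Approximate [Df_a] by central differences: entry (i, j) is D_j f^i(a)."""
    n = len(a)
    fa = f(a)
    J = []
    for i in range(len(fa)):          # rows: component functions f^i
        row = []
        for j in range(n):            # columns: partials with respect to x^j
            ap = list(a); ap[j] += h
            am = list(a); am[j] -= h
            row.append((f(ap)[i] - f(am)[i]) / (2 * h))
        J.append(row)
    return J

# Example: f(x, y) = (x^2 y, 5x + sin y), so Df at a = (1, 0) is
# [[2xy, x^2], [5, cos y]] evaluated there, i.e. [[0, 1], [5, 1]].
f = lambda p: [p[0]**2 * p[1], 5 * p[0] + math.sin(p[1])]
J = jacobian(f, [1.0, 0.0])
print(J)
```

The central difference agrees with the hand-computed matrix to within the truncation error of order h^2.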
An important point about smooth curves needs to be considered:
Figure 2.1.1: A smooth curve.
Exercise 2.1.1. Figure 2.1.1 shows two line segments joined together. The horizontal one is the set of points in R^2 with y = 0 and 0 ≤ x ≤ 1, and the vertical one is the set of points in R^2 with x = 1 and 0 ≤ y ≤ 1.

Show that there is a continuous but nondifferentiable function from [0, 2] to R^2 which traces the curve formed by the two segments from the origin to the point (1, 1)^T.

Show that there is a differentiable function from [0, 2] to R^2 which does the same job.
Exercise 2.1.2. Show that [−1, 1] is the image of [−1, 1] by a continuous bijection which is not differentiable.

The conclusion you should draw from this is that you cannot decide if a curve is smooth or not merely by looking at the image!
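The second part of Exercise 2.1.1 can be checked numerically. The idea is to slow the moving point to a stop at the corner; the cubic φ(t) = 3t^2 − 2t^3, with φ'(0) = φ'(1) = 0, is one arbitrary choice of such a ramp (mine, not the text's). The resulting parametrisation is differentiable even though its image has a kink.

```python
def phi(t):
    # Smooth ramp with phi(0) = 0, phi(1) = 1 and phi'(0) = phi'(1) = 0,
    # so the moving point stops at each end of a segment.
    return 3 * t**2 - 2 * t**3

def gamma(t):
    """Trace the horizontal segment for t in [0, 1],
    then the vertical segment for t in [1, 2]."""
    if t <= 1:
        return (phi(t), 0.0)
    return (1.0, phi(t - 1))

# Central-difference velocity at the corner t = 1: both one-sided
# derivatives are (0, 0), so gamma is differentiable there.
h = 1e-6
vx = (gamma(1 + h)[0] - gamma(1 - h)[0]) / (2 * h)
vy = (gamma(1 + h)[1] - gamma(1 - h)[1]) / (2 * h)
print(vx, vy)
```

The velocity at the corner comes out (numerically) as the zero vector, which is exactly why the corner in the image does not force a corner in the map.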
2.2 Smooth Manifolds
Definition 2.2.1. A chart on a topological space X is a homeomorphism from some open subset of X onto an open subset of R^n. I shall call the inverse of such a homeomorphism a local parametrisation.
We show a typical local parametrisation map from a rectangular neighbourhood of the origin in R^2 to a region on the surface in figure 2.2.1. A chart can be used to give coordinates for points of the space, at least some of them. Different charts will, of course, give different coordinates in general to those points on which the domains overlap.
Definition 2.2.2. Two charts on a space X, f : U → R^n and g : V → R^n, are smoothly compatible iff the maps f ∘ g^{-1} and g ∘ f^{-1} are infinitely differentiable wherever they are defined.
Figure 2.2.1: A local coordinate map.
Figure 2.2.2: Two charts.
In other words, the composite map f ∘ g^{-1} must have partial derivatives of all orders at every point of the domain, and the same is true of the inverse map.

If U and V have empty intersection then this holds vacuously. If they do have an intersection, then f ∘ g^{-1} has domain and codomain some open subsets of R^n and is certainly continuous. It makes sense to demand that this map be smooth, that is, infinitely differentiable. The picture of figure 2.2.2 may help.
Definition 2.2.3. A smooth atlas for a space X is a collection of smoothly compatible charts such that every point of X is in the domain of at least one chart. Such an atlas is maximal iff every possible (smoothly compatible) chart is in it.
Definition 2.2.4. A smooth n-manifold is a Hausdorff topological space together with a maximal atlas of smoothly compatible charts. The atlas is said to define a smooth differential structure on X.
The reason for wanting the atlas to be maximal is just so that anyone wandering in with a new local coordinate map can't cause us trouble. Either it is compatible with our atlas, in which case we already have it, or it is not, in which case it may be part of a different differential structure for the manifold.
Exercise 2.2.1.

1. Show that S^1 and S^2 as usually defined are smooth manifolds.

2. Show that the flat torus obtained by gluing opposite edges of a square is also a smooth manifold.

3. Show that S^n is a smooth manifold for any n ∈ Z^+. Hint: use the Implicit Function Theorem.

(Generic hint: you don't need to have many charts. Enough to cover the manifold will do, then just add the instruction to fill up with all other possible smoothly compatible charts.)
Exercise 2.2.2. Construct a definition of an orientable manifold.
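To make Definition 2.2.2 concrete, and as a warm-up for Exercise 2.2.1(1), here is a quick numerical sanity check (Python; the choice of stereographic charts on S^1 is mine, not the text's). Projecting S^1 from the north and the south poles gives two charts f and g, and a little algebra shows the transition map on the overlap is g ∘ f^{-1}(u) = 1/u, which is smooth wherever it is defined.

```python
def f(p):
    # Chart: stereographic projection of S^1 from the north pole (0, 1).
    x, y = p
    return x / (1 - y)

def g(p):
    # Chart: stereographic projection of S^1 from the south pole (0, -1).
    x, y = p
    return x / (1 + y)

def f_inv(u):
    # Inverse of f: the point of the circle (other than the pole) mapping to u.
    return (2 * u / (u**2 + 1), (u**2 - 1) / (u**2 + 1))

# On the overlap, the transition map g ∘ f^{-1} is u -> 1/u.
for u in [0.5, -2.0, 3.7]:
    p = f_inv(u)
    assert abs(p[0]**2 + p[1]**2 - 1) < 1e-12   # f_inv really lands on S^1
    assert abs(g(p) - 1 / u) < 1e-12            # transition map is 1/u
print("charts smoothly compatible on the overlap")
```

Since u -> 1/u is infinitely differentiable away from u = 0, and u = 0 is not in the overlap (it corresponds to the south pole, where g is undefined), the two charts are smoothly compatible.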
2.3 Smooth maps and tangent vectors
Now we have enough to say what it means for a map f : X → Y to be smooth when X and Y are smooth manifolds:

Definition 2.3.1. A map f : X → Y between smooth manifolds is differentiable when g ∘ f ∘ h^{-1} is differentiable for all charts h on X and all charts g on Y belonging to the differential structures.
The diagram of figure 2.3.1 gives the idea.
We can define higher order differentiability in the same way, and we can say that a map f : X → Y is smooth whenever all composites g ∘ f ∘ h^{-1} are smooth for all charts h on X and all charts g on Y.
Exercise 2.3.1. Show that if f : X → Y has composite g ∘ f ∘ h^{-1} differentiable at some point a in X, then it is differentiable in any other pair of charts containing a and f(a).
Figure 2.3.1: A smooth map.
Figure 2.3.2: Some tangent curves in a manifold.
Note that although we can say that f is differentiable, we cannot provide a derivative, since this will generally be different in different charts. If we move away from simple linear spaces we must pay the price: there is no longer a best affine approximation, because affine maps don't make sense between manifolds in general.
We can however say when two maps from R into a manifold X are tangent. Let f, g : (−1, 1) → X be smooth maps into a manifold, and without loss of generality let f(0) = g(0) = a ∈ X. Then we can say that f and g are tangent at a iff the derivatives of h ∘ f and h ∘ g at 0 are the same for any chart h : U → R^n with f(0) = g(0) = a ∈ U. If they are the same in any one chart they must be the same in any other.
Exercise 2.3.2. Prove the last remark.
Exercise 2.3.3. Show that tangency is an equivalence relation on the set of
maps from R to X, and that we can do the same thing with maps from X
to R.
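Exercise 2.3.2 can be illustrated numerically. A sketch (Python; the two curves and the two charts below are an invented example, not from the text): two curves through the point (0, 1) of S^1 that are tangent in an angle chart are also tangent in a chart that records only the x coordinate.

```python
import math

def c1(t):
    # A curve through (0, 1) on the unit circle at t = 0.
    a = math.pi / 2 + t
    return (math.cos(a), math.sin(a))

def c2(t):
    # A different curve through the same point, agreeing to first order.
    a = math.pi / 2 + t + t**2
    return (math.cos(a), math.sin(a))

def chart_angle(p):
    # Chart near (0, 1): the polar angle.
    return math.atan2(p[1], p[0])

def chart_x(p):
    # A second chart near (0, 1): the x coordinate (upper semicircle).
    return p[0]

def deriv(chart, curve, h=1e-6):
    # Central-difference derivative of chart ∘ curve at t = 0.
    return (chart(curve(h)) - chart(curve(-h))) / (2 * h)

d_angle = (deriv(chart_angle, c1), deriv(chart_angle, c2))
d_x = (deriv(chart_x, c1), deriv(chart_x, c2))
print(d_angle, d_x)
```

The derivatives of the two curves agree in each chart, even though the common value differs between charts (1 in the angle chart, −1 in the x chart): tangency is chart-independent, the numerical value of the velocity is not.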
We can take a tangency equivalence class of maps from R to the manifold
X, and regard it as an object in its own right. The picture 2.3.2 shows some
members of a tangency equivalence class.
The curves can be thought of as the trajectories of moving points, and they are all moving through the point a at the same speed, and in the same direction, although we cannot give the direction a particular vector to specify it, and the speed may also be different in different charts.
Definition 2.3.2. A tangency equivalence class at a point a in a manifold X is called a tangent vector at a in X.
Remark 2.3.1. Watching the faces of students in class when giving this definition is a real treat. The look of stark horror and incomprehension is very encouraging, as it proves that some at least are listening. A small amount of imagination, however, goes a long way to making this definition quite reasonable.
Suppose that the North pole has been cleared of snow and turned into a skating rink for penguins[1], and the North pole itself is marked by a flashing red light. There are two space craft hovering up there, call them A and B. In space craft A, an astronaut leans out and takes a photograph of the region around the north pole. Suppose for simplicity he is directly above the north pole, so his photograph, when enlarged, is a disc as in the picture figure 2.3.4. Astronaut B is somewhere over Russia and he also takes a photograph of what he can see.

[1] It has been pointed out to me that there are no penguins at the North pole, only at the South pole. On the other hand there isn't a skating rink at the North pole either. So if we are going to make a skating rink we might as well import the penguins.

Figure 2.3.3: Penguins (imported from the antarctic) skating.
Now each astronaut looks at his photograph and lays it out flat and enlarges it to a nice size, and each marks on a coordinate grid using a ruler and pen, and so each has a chart of a bit of the polar regions, with the north pole in the domain of each chart. If both put the origin in the centre, astronaut A will have the flashing red light at the origin, and astronaut B will have a negative x coordinate for the red light if he puts his coordinates on the chart in the way suggested by the diagram. I regard the chart as both the bit of the earth the astronaut can see, and also the process of turning it into a flat picture with a coordinate grid on it. Call them u and v for the maps and U and V for the domains of the maps back on earth.
I claim it makes sense to talk of a penguin skating over the north pole as having a velocity vector as it passes through the north pole. Each astronaut can plot the position of the green penguin in his chart, and each will agree if the curve is differentiable. Note that if g : (−1, 1) → S^2 describes the green penguin, then astronaut A will plot the green curve at the top of the picture and will be able to give it a perfectly respectable velocity on his chart relative to the cartesian coordinates marked on the chart. Similarly astronaut B can do the same. The problem is that they will have, usually, different estimates of what the velocity is. If B is much higher up in space, his scale will be such that the penguins will seem to be moving more slowly, for instance.
Figure 2.3.4: Two penguins skating under the watchful eyes of two astronauts.
Does this mean my claim that we can assign a meaningful velocity vector to the penguin is just nonsense? No, for if b is the blue penguin, also skating over the north pole at the same time as the green penguin (and mysteriously not knocking over the first penguin: maybe they are ghost penguins and can occupy the same space), it certainly makes sense to say if they are travelling in the same direction at the same speed. A penguin cutting across the path would obviously be travelling in a different direction, and a really slow penguin would be slower for both astronauts, and the fast penguins would pass through it. So I claim that the penguin velocity is a real thing which exists at the penguin level if not at the astronaut level. But if one astronaut said that the blue penguin and the green penguin had the same velocity at the instant they went through the pole, the other astronaut would agree, even though disagreeing as to the actual value of the vector in both direction and speed, these things being properties of the charts, not the penguins.
The reason this happens is that there are two things going on here. The actual velocity at the north pole is a real thing, penguins are actually moving, and either they pass through the north pole at the same time in the same direction at the same speed or they don't. But attempts by the two astronauts to describe the penguin motion to each other with numbers involve inventing coordinate systems, which are bits of language. So the numerical value of any vector is dependent on the language. But the fact that different languages agree on whether two penguins have the same velocity tells you that the velocity is real. It exists independent of the coordinate system, provided the two coordinate systems are related by a diffeomorphism. So there are moving penguins and there is language, and the penguins will have the same velocity at the pole or they won't, and this is true no matter what language you use to talk about it unless your language is really weird.
The problem then is to say what a velocity vector is, given that any pair of astronauts can disagree about the actual numbers. And the most elegant solution is to say that it is what all the penguin trajectories, real and potential, have in common. And what they have in common is that every observer will agree that they pass through the north pole in the same direction at the same speed. This is the tangency equivalence class.
Note that I have assumed that all observers use synchronised clocks so they
all agree that the time at which the penguins hit the north pole is time
zero. This doesn’t have to be the case either. They will all agree on the
simultaneity of the events, whatever time they claim they occur. This is
because two penguins either meet or they don’t, and this is not a matter of
language but of fact.
The ghost-penguins are negotiable. Having a nice vivid picture of some sort is essential: you should be prepared to invent your own, but this time you may borrow my penguins if they help. If I give you more definitions like this, it is your job to supply the penguins, or whatever it takes.
Exercise 2.3.4. Show that the claim that the two astronauts would agree if two penguins have the same velocity at the north pole is true provided that u ∘ v^{-1} and v ∘ u^{-1} are both differentiable.
Remark 2.3.2. There is, of course, a simpler way of defining tangent vectors on S^2. It is usually viewed as a subspace of R^3, so a curve on S^2 is also a curve in R^3, and we can define velocities on S^2 as tangent vectors in R^3, in the sense of the derivatives of maps from (−1, 1) to R^3, which, for tangent vectors at a particular point, happen to lie in a plane in R^3 which is tangent to S^2 at that point. This certainly removes some tricky conceptual problems, but at the expense of making tangent vectors extrinsic rather than intrinsic to the space. The whole thrust of the text book is towards using intrinsic ideas, for the very good reason that we live in a 3-manifold and cannot form any useful idea of an embedding of it in some higher dimensional space.
The next proposition tells us that the set of tangency equivalence classes at a fixed point a in a manifold form a vector space, the tangent space at a.

Proposition 2.3.1. The set of tangent vectors at a point a of a smooth n-manifold X comprises a real vector space of dimension n.
Figure 2.3.5: The sum of tangent vectors.
Proof:

We have to produce sensible rules for adding and scaling tangent vectors. Then we have to show that the result satisfies the axioms for a real vector space. Suppose we have a tangency equivalence class v and that v is an element of it, that is, a curve v : R → X with v(0) = a such that in any chart w : W → R^n with a ∈ W the derivative of w ∘ v exists. Then we can scale the function v by a scalar k ∈ R to get v(kt) instead of v(t) for t ∈ R, and the derivative of w ∘ v will also be scaled by the factor k. This will be the same scaling in any chart, so it makes sense to call this new function kv. This has its own tangency equivalence class, kv.
It would not make a difference if we had chosen another function v′ ∈ v: kv′ is a function tangent to kv, since they both have the same derivative no matter what chart we choose, although in different charts the derivatives will be different but still equal to each other.

So we can say that kv exists and we have scaled the equivalence class.
If v and u are distinct tangency equivalence classes through the point a, as in figure 2.3.5, we can take representative functions v, u : R → X with u(0) = v(0) = a, and composing with a chart w : W → R^n we have two maps, w ∘ u and w ∘ v, from R into R^n. Such maps may be added: we take the map w ∘ u + w ∘ v − w(a). At t = 0 this passes through w(a) ∈ R^n. The resulting curve in R^n can be mapped back into the manifold by w^{-1}, or at least a bit of it in a neighbourhood of w(a) can be. This gives a sort of 'sum' curve of u and v in the manifold, w^{-1} ∘ (w ∘ u + w ∘ v − w(a)). The tangency class of this sum curve is defined to be the sum of u and v. It is easy to see that the tangency equivalence class does not depend on the choice of chart. (Although the sum curve does.)
Nor does it depend on which representatives u of u and v of v we choose, because they all have the same derivative. We may therefore write u + v for the tangency equivalence class of w^{-1} ∘ ((w ∘ u) + (w ∘ v) − w(a)), and we may add tangency equivalence classes, otherwise known as tangent vectors at a.

(If in doubt about this argument, say it with penguins.)

It is clear that the sum is associative and commutative, and there is a zero which contains the constant function sending R to a. The rest of the axioms for a vector space are easily checked.

The claim that it has dimension n, the same as the dimension of X, is left as an exercise.
Exercise 2.3.5. Check all the axioms for a vector space. This kind of thing
is called axiom bashing and is good for you.
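The sum construction in the proof can also be sanity-checked numerically. A sketch (Python; the manifold R^2, the nonlinear chart w and the two curves are all invented for illustration): the 'sum' curve w^{-1} ∘ (w ∘ u + w ∘ v − w(a)), built in a deliberately nonlinear chart, still has derivative u'(0) + v'(0) when measured in the identity chart, so the tangency class of the sum does not depend on the chart used to build it.

```python
import math

# Manifold X = R^2, base point a = (0, 0).
def w(p):
    # An invented nonlinear chart (a diffeomorphism of R^2).
    x, y = p
    return (x, y + x**3)

def w_inv(q):
    x, z = q
    return (x, z - x**3)

def u(t):
    return (t, t**2)          # u(0) = a, u'(0) = (1, 0)

def v(t):
    return (math.sin(t), t)   # v(0) = a, v'(0) = (1, 1)

def sum_curve(t):
    # w^{-1} ∘ (w∘u + w∘v − w(a)), the 'sum' curve from the proof.
    wu, wv, wa = w(u(t)), w(v(t)), w((0.0, 0.0))
    return w_inv((wu[0] + wv[0] - wa[0], wu[1] + wv[1] - wa[1]))

# Derivative of the sum curve at t = 0, measured in the identity chart:
h = 1e-6
s = [(sum_curve(h)[i] - sum_curve(-h)[i]) / (2 * h) for i in range(2)]
print(s)   # u'(0) + v'(0) = (2, 1), whatever chart w we built the sum in
```

The reason it works is in the proof: the derivative of the chart (and of its inverse) acts linearly on velocities, so the sum of derivatives is carried to the sum of derivatives.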
The resulting vector space is called T_a(X) and is isomorphic to R^n when X is an n-manifold. I want to emphasise an important point: there is in general no particular or natural isomorphism between T_a(X) and R^n. If X = R^n, then I can get away with calling T_0(R^n) by the 'slang' name Ṙ^n, because in this case there is an obvious basis for the tangent space at the origin: I have the unit vectors along the axes. And by a simple translation I can carry R^n to R^n and take the origin to any point a, and this translation will also take curves through the origin (and hence vectors) to curves through a. So in this rather special case, I do have a natural basis for T_a(X). But there is no natural basis for T_a(S^2) for any a ∈ S^2; the best I could do is to fudge one by using the embedding in R^3, but this is a property of the embedding, not of S^2. This loss of a natural basis, or if you prefer a natural isomorphism with R^n, has important implications. It parallels the fact that there is no obvious choice of a coordinate frame in the space we inhabit[2].
Exercise 2.3.6. Prove the last statement. Hint: Do it for X = R^n first, then observe that locally any X is R^n as near as dammit, and that tangency is a very local kind of business.
There is one such tangent space T_a(X) for each point a ∈ X. There is, in general, no particular isomorphism between any T_a(X) and R^n. If X = R^n then there is (what?), but in general there is a huge choice and no way of picking any particular one.
[2] Although in earlier days, it was thought in some quarters that Jerusalem was a good place to put one. Where exactly in Jerusalem was not altogether clear.
Figure 2.3.6: The simplest tangent bundle.
Exercise 2.3.7. Show that the tangent plane at the north pole to S^2, as usually embedded in R^3, can be mapped isomorphically to the tangent space as defined here. Is there an obvious isomorphism?
Exercise 2.3.8. Show that there is an isomorphism between T_a(X) and T_b(X) for any two points a, b ∈ X.
Examples 2.3.1.

1. The simplest case is where X = R. A tangent vector at the point 1 can be thought of as the space of velocities of moving points as they go through 1; the chart consisting of the identity map does it all nicely. So we have a line of possible velocities attached to each point of R, and the tangent bundle is the collection of all the tangent spaces. We can draw it as R × Ṙ, where the first component is the space itself and the second is the space of velocities. I am making up the notation Ṙ for the space of velocities, and you won't find it in the books, but it makes sense and reduces confusion. Since Ṙ is isomorphic to R, R × Ṙ looks an awful lot like R^2. We think of the different tangent spaces Ṙ_a attached to each a ∈ R and draw some of them as in figure 2.3.6.

The reason it is called a bundle is because it looks like a bundle of (red) tangent spaces. The tangent spaces are called the fibres of the bundle. The manifold to which the fibres are attached is called the base space of the fibre bundle.
2. Let the manifold X be S^1. Again it makes sense to have curves in S^1 which all pass through a point and have the same velocity vector at that point. The different velocities again form a vector space Ṙ, and there is one attached to each point of S^1. If we draw the possible tangents in the plane, they intersect; this is a property of the space we are trying to squash the tangent bundle into, and if we turn them through a right angle as in the last example we get the fibre bundle of figure 2.3.7.

Figure 2.3.7: The next simplest tangent bundle.

Again the fibres are all copies of a line and the bundle is pretty much the same as S^1 × R. The red dot sitting over the black one represents a speed in the positive direction passing through the black point underneath it.
3. We have now run out of cases where we can draw the pictures, since R and S^1 are the only one dimensional manifolds, and if we go to S^2 we get a tangent bundle of dimension four. We can draw one tangent plane, but any more would usually intersect, and this is what happens when we try to embed a four dimensional space in R^3. We can see however that there is a collection of planes, one for each point of S^2, and they form a four dimensional space. It is useful to visualise at least a part of the tangent space of S^2 as a sphere in R^3 with some bits of tangent planes attached to it, as in figure 2.3.8, because it is better to have a partial idea than stick entirely to the algebra, but you should be aware of the limitations of the picture.

Figure 2.3.8: A bit of the tangent bundle for S^2.

The two earlier examples came out to be simple cartesian products of the tangent space at any one point with the manifold. Such bundles are called trivial bundles. An example of a nontrivial fibre bundle is the Möbius bundle shown in figure 2.3.9. This has fibre an interval, say (−1, 1), from R and base space S^1. But it is not the cartesian product of the two.

Figure 2.3.9: A nontrivial fibre bundle.
Every tangent bundle has, however, a projection onto the base space, the underlying manifold. We may write this as a vertical pair of spaces

TM
 │ π
 ↓
 M

where M is the manifold, TM is the tangent bundle, and π is the projection which sends a tangent vector to the point in the manifold to which it is attached.

Now we have described the tangent bundle as a union of all the tangent spaces T_a(M) for a ∈ M, but that does not specify a topology on it. To do that we say a subset U of TM is open iff the projection π(U) is open in M and the intersection of U with any fibre is open in the fibre. Since the fibres are all real vector spaces, we can give them the usual topology, obtained from an isomorphism with R^n.
Definition 2.3.3. The tangent bundle to a smooth manifold M is the set

⋃_{a∈M} T_a(M)

with the topology specified by saying U ⊆ TM is open whenever π(U) is open in M and for every a ∈ π(U), U ∩ T_a(M) is open in T_a(M), where T_a(M) has the topology induced by any isomorphism with R^n.

Note that this assumes that any two isomorphisms with R^n will induce the same topology.
Exercise 2.3.9.

1. Show that a linear map from R^n to R^m is continuous iff it is continuous at the origin.

2. Show that any linear map from R^n to R^m is continuous.

3. Show that any isomorphism from R^n to itself is a homeomorphism.
Now for some formal definitions:

Definition 2.3.4. A fibre bundle is a quartet (E, B, F, π) where E is called the total space, B is called the base space, π : E → B is a continuous map called the projection, and for every b ∈ B, π^{-1}(b) is homeomorphic to F. The spaces π^{-1}(b) are called the fibres of the bundle.

Definition 2.3.5. A fibre bundle is called locally trivial iff for every b ∈ B there is an open set U ⊆ B containing b such that π^{-1}(U) is homeomorphic to U × F.

The bundle B × F is called a trivial bundle.
Exercise 2.3.10. Describe clearly the trivial bundle with base space S^2 and fibre S^1, and give an example of a nontrivial bundle with the same base and fibre. Hint: you might find it easier if you specify some gluings.

Exercise 2.3.11. Show that the tangent bundle of a smooth manifold is locally trivial.

Exercise 2.3.12. Show there is a natural atlas on the tangent bundle which makes it a smooth manifold. Is the bundle projection smooth?

Note that for a locally trivial fibre bundle, a topology on the bundle must have as base the cartesian products of sets which are open in B (and over which the bundle is locally trivial) with open sets in the fibre.
Definition 2.3.6. A section of a fibre bundle E with projection π to base space B is a map s : B → E such that π ◦ s is the identity on B.

Definition 2.3.7. A vector field on a manifold M is a section of the tangent bundle TM.

You should be able to see that this makes sense, and we can talk about continuous, differentiable and smooth vector fields according as the section (which is after all a map) is continuous, differentiable or smooth.
Exercise 2.3.13.

1. Draw a vector field on R^2 which is nice and easy and write it as a section of the tangent bundle.

2. Show that the tangent bundle for S^2 is not trivial. Use the hairy ball theorem, which says that any continuous vector field on S^2 must have at least one place where the vector is of length zero.
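To make part 1 of the exercise concrete in the one case where the bundle is trivial, here is a sketch (in Python with numpy, my own choice, not part of the text) of a vector field on R^2 written as a section of TR^2 = R^2 × Ṙ^2; the rotational field is an arbitrary example:

```python
import numpy as np

# A vector field on R^2 as a section of the trivial tangent bundle
# TR^2 = R^2 x R^2: the section sends a point p to the pair (p, v(p)),
# and the bundle projection pi just forgets the vector part.
def v(p):
    """The rotational field (x, y) |-> (-y, x)."""
    x, y = p
    return np.array([-y, x])

def section(p):
    return (np.asarray(p), v(p))      # a point of TR^2 sitting over p

def pi(tangent):                      # bundle projection TR^2 -> R^2
    base_point, _ = tangent
    return base_point

p = np.array([1.0, 0.0])
assert np.allclose(pi(section(p)), p)  # pi o s = identity on the base
```

The final assertion is exactly the condition of Definition 2.3.6: composing the section with the projection gives back the point you started from.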
2.4 Notation: Vector Fields

On R^2, I can write the tangent space as R^2 × Ṙ^2, which is mildly useful for thinking about the meaning but not standard and not particularly useful for computations. I shall extend this to talking about the standard basis for Ṙ^2 and call it (ė_1, ė_2). A vector field on R^2 is an assignment to each point of R^2 of a vector, and if it is a smooth vector field this vector changes smoothly as we move around in R^2. So there is a tangent vector, with two components, which both depend smoothly on x and y and hence is given by a pair of functions P(x, y), Q(x, y). We might write the vector field as

P(x, y) ė_1 + Q(x, y) ė_2

but we don't. We write it as

P(x, y) ∂/∂x + Q(x, y) ∂/∂y

This notation takes a bit of explaining.
If we have a smooth function f : R^2 → R and a smooth vector field on R^2, we can take the directional derivative of f at any point in the direction of the vector field at the point, and multiply it by the length of the vector. This will give us a new smooth function on R^2. This means that such a vector field can be thought of as an operator on the space of smooth functions from R^2 to R, which is usually written as C^∞(R^2). The constant vector field which assigns the vector ė_1 to every point of R^2 can easily be seen to be the operator ∂/∂x, and similarly the orthogonal constant vector field which assigns ė_2 is the operator ∂/∂y. This explains the notation for vector fields on R^2, and by an obvious extension we can write a vector field v on R^n as

Σ_{i∈[1:n]} v^i(x) ∂/∂x^i

where each v^i is a function from R^n to R, and where I called v^1 the function P(x, y) and v^2 the function Q(x, y) when n = 2.
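The operator picture can be played with symbolically; here is a minimal sketch assuming sympy is available, with the components P = −y, Q = x an example of my own choosing:

```python
import sympy as sp

x, y = sp.symbols('x y')

# The vector field P d/dx + Q d/dy as an operator on smooth functions.
P = -y          # sample components; any smooth expressions would do
Q = x

def apply_field(f):
    """Return the new function vf = P df/dx + Q df/dy."""
    return sp.simplify(P * sp.diff(f, x) + Q * sp.diff(f, y))

f = x**2 + y**2
vf = apply_field(f)   # -y*(2x) + x*(2y) = 0
assert vf == 0        # f is constant along the circles this field traces
```

That the rotational field annihilates x^2 + y^2 is the operator way of saying the field is tangent to circles about the origin.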
Since the procedure for interpreting a vector field as an operator on C^∞(R^2) is local, a vector field on a manifold M is an operator on C^∞(M), although there is an issue involved in choosing a basis for each tangent space T_a(M) if we wish to do calculations.

This gives two quite different ways of looking at a vector field on a smooth manifold. We have the tangency equivalence classes, which we may think of as little arrows, each selected by a section of the tangent bundle. This is a quite straightforward transfer of ideas from R^n and should seem natural and reasonable once you have come to terms with the problem of having to say everything via charts. But the other way, of thinking of a vector field as an operator on the space C^∞(M), has some advantages. One of these is that it makes sense without immediate reference to charts. Of course, we need charts to say what it means for some map f : M → R to be differentiable, but given that, we have a pleasant freedom from particular coordinate systems. Physicists are particularly interested in this, because the physical universe does not come equipped with charts any more than it has an origin and axes sticking out of it. Recall penguins, and what they do, versus the language for talking about them given by charts. Now we want to focus on the behaviour of the physical universe (penguins) and not be too distracted by the language (charts). So an invariant description, that is, one which does not depend on choosing a particular language, is definitely more physical. Note Oliver Heaviside's remarks quoted at the top of chapter three of the text book.
On R^n we can therefore write a vector field v as a map

v : C^∞(R^n) → C^∞(R^n)

with vf the map

Σ_{i∈[1:n]} v^i(x) ∂f/∂x^i.

This can be compressed into

v = Σ_{i∈[1:n]} v^i ∂/∂x^i

An even more compact form is

v = Σ_{i∈[1:n]} v^i ∂_i

We can make this even terser by using the Einstein Summation Convention, which is that if an index is repeated as a superscript and a subscript then we automatically sum over the possible values. This gives us

v = v^i ∂_i

where you have to know what the space is in which we are working to know how many i's there are. For some reason physicists prefer to use Greek letters as indices, which means that you are likely to find expressions such as

v = v^μ ∂_μ

instead. I fear that you will have to get used to this, as the textbook is committed to it.
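Numerically, the convention of summing over a repeated index is exactly what numpy's einsum function implements (the name is not a coincidence). A small sketch, with the components and the function f(x) = x^1 x^2 x^3 being arbitrary choices of mine:

```python
import numpy as np

# v f = v^i d_i f at a single point: the repeated index i is summed.
# Components v^i and the partials d_i f of f(x) = x1*x2*x3 at x = (1, 2, 3).
x = np.array([1.0, 2.0, 3.0])
v_comp = np.array([1.0, 0.0, 2.0])                  # the v^i
grad = np.array([x[1]*x[2], x[0]*x[2], x[0]*x[1]])  # d_i f = (6, 3, 2)

vf = np.einsum('i,i->', v_comp, grad)  # contract the repeated index
assert vf == 1.0*6.0 + 0.0*3.0 + 2.0*2.0
```

The subscript string 'i,i->' says: one index i on each factor, repeated, with nothing left over, so sum it out; precisely v^i ∂_i f.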
This leads to a new definition of a vector field on a smooth manifold M. First we define C^∞(M) to be the set of smooth maps from M to R. This is clearly a real vector space. It is certainly possible to add and scale the functions, and the rest is simple axiom bashing, as done in second year. It is rather more than just a vector space; it is an algebra, which is to say it is possible to multiply any pair of elements, fg being the function

∀a ∈ M, fg(a) = f(a)g(a)

where the right hand side of the equality means we just multiply the two real numbers f(a) and g(a). The multiplication is associative, commutative, and left and right distributive over addition. In other words, it is a real vector space which is also a commutative ring, which is basically what we mean by a real algebra. You should write down the complete list of axioms for such a thing, not relying on the text book too much.

Now we define a linear operator on such an algebra A by saying it is a map

v : A → A

which is linear, that is,

∀f, g ∈ A  v(f + g) = vf + vg

and

∀f ∈ A, ∀t ∈ R  v(tf) = tv(f)

Such an operator is called a derivation if it also satisfies

∀f, g ∈ A  v(fg) = f v(g) + g v(f)

which you will recognise as Leibniz's Rule for differentiating a product function.
Exercise 2.4.1. Take M = R^2 and any smooth vector field on it. Show that it is a derivation.
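The exercise can be spot-checked symbolically before you prove it in general; a sympy sketch (a check on particular functions, not a proof), with the field components P = xy, Q = sin x chosen arbitrarily:

```python
import sympy as sp

x, y = sp.symbols('x y')
P, Q = x*y, sp.sin(x)          # an arbitrary smooth vector field on R^2

def v(f):                      # the field as an operator on C^oo(R^2)
    return sp.expand(P*sp.diff(f, x) + Q*sp.diff(f, y))

f = x**2 + y
g = sp.exp(y)

# linearity and the Leibniz rule, checked symbolically:
assert sp.simplify(v(f + g) - (v(f) + v(g))) == 0
assert sp.simplify(v(f*g) - (f*v(g) + g*v(f))) == 0
```

Both identities reduce to zero for any smooth P, Q, f, g, which is the content of the exercise; the general argument is just the usual product and sum rules for partial derivatives.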
Note that it makes sense to deﬁne a derivation over any real algebra and
algebraists indeed do exactly this. This is a long way from diﬀerentiating
functions, but it gives all the essential properties, and algebraists have a
habit of studying the properties without much caring where they came from.
They have their uses. Algebraists, that is.
We can finally define a vector field on a manifold M as a derivation on the real algebra C^∞(M). Such a definition has advantages and disadvantages. The obvious disadvantage is that it is so abstract it seems to have nothing to do with the things we care about, but the advantage is that the abstraction has removed all the irrelevancies which get in the way of thinking about things and left the bare essentials. Any lingering suspicion that the geometric baby has been thrown out with all that bathwater may be put to rest by checking through the last exercise carefully, and by doing it with S^1 instead of R^2:
Exercise 2.4.2. Take M = S^1 and any smooth vector field v on it, regarded as a section of the tangent bundle. Show that v is a derivation: take some simple functions from S^1 to R and operate on them by v. Confirm that all the rules for a derivation are satisfied.
We also need to be able to go in the opposite direction: if v : C^∞(M) → C^∞(M) is a derivation, then it must be expressible as a vector field in the earlier sense.
Exercise 2.4.3. Do this on R^2 at the origin. Suppose f : R^2 → R is a smooth map. Then we can write

f((x, y)^T) = f(0) + [∂f/∂x, ∂f/∂y]_0 (x, y)^T + ax^2 + 2bxy + cy^2

where a, b, c are second order partial derivatives of f evaluated at some point between the origin and (x, y)^T (and hence, we have to admit, depend upon x and y). This is just the Taylor expansion with the Lagrange form of the remainder in two dimensions.
Now apply v to f to get a new function g: then g(0) must be the limit of g((x, y)^T) as (x, y)^T → 0, as g is certainly continuous. Show that, since v is linear, g((x, y)^T) must be the sum of the action of v on the above three terms in a neighbourhood of the origin, that v takes the constant first term to zero, and that, since v satisfies the Leibniz condition, g(0) must be

[∂f/∂x, ∂f/∂y] (u, v)^T

for some vector (u, v)^T ∈ Ṙ^2. Finally show that if it works on R^2 it must work on R^n, and also on any smooth manifold.
We can now define Vect(M), or 𝒱(M), as the set of all vector fields on the smooth manifold M.

Exercise 2.4.4. Show that Vect(M) (𝒱(M)) is a real vector space. Show that it is a module over C^∞(M); that is, it is like a vector space over C^∞(M) except that C^∞(M) is not a field but a ring.

Exercise 2.4.5. Show that 𝒱(R^n), as a module over C^∞(R^n), is finite dimensional and has the obvious basis.
2.5 Cotangent Bundles

I mentioned earlier that we could do the business of equivalence classes of maps from the manifold to R in exactly the same way as we took maps from R to the manifold. If we do this we get an exact parallel, and a tangency equivalence class of such maps at a point is called a cotangent or covector at the point. Somewhat easier is to define the space of cotangents at a ∈ X for a smooth manifold X as the dual space of T_a(X). Recall that the dual (vector) space for a space V is the space V^* of linear maps from V to R. I shall say more about this in the next chapter. We can do exactly the same process of taking the union of all the T_a(X)^* as we did for the tangent bundle, and this gives us a slightly different object called the cotangent bundle. It has to be admitted that there is no difference between them as topological spaces. All the difference is in the algebra, and it manifests itself strongly when we look to see what happens under maps between manifolds.

Exercise 2.5.1. I have given two different definitions of the cotangent space. Show they are equivalent.
The same sort of considerations as worked for vector fields apply to covector fields, or differential 1-forms as they are more commonly known. At each point of R^2 we select an element of (Ṙ^2)^*, the cotangent space, which again has two components. I suppose we might call the standard basis for this space (ė^1, ė^2), where ė^1 is the linear map from Ṙ^2 to R which projects everything onto the first component and ė^2 projects everything onto the second component. But we actually call them dx, dy to be loosely consistent with the classical notation. So we interpret dx as the linear map which takes (x, y)^T to x, where (x, y)^T is a point in the tangent space T_a(R^2) at some point a. Similarly for dy. So a differential 1-form or covector field on R^2 is written

P(x, y) dx + Q(x, y) dy
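At a single point this is nothing more than a row vector acting on a column vector. A tiny numpy sketch, with the values of P and Q at the point picked arbitrarily:

```python
import numpy as np

# At a point a of R^2 the covector P dx + Q dy is a row vector acting on
# tangent (column) vectors; dx picks out the first component, dy the second.
P, Q = 3.0, -1.0                     # values of P(x, y), Q(x, y) at a
omega = np.array([P, Q])             # the covector, as a row
v = np.array([2.0, 5.0])             # a tangent vector at the same point

assert omega @ v == P*2.0 + Q*5.0    # (P dx + Q dy)(v) = 3*2 - 1*5 = 1
```

Representing covectors as rows and tangent vectors as columns is the same bookkeeping suggested for Exercise 2.6.6 later in the chapter.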
The generalisation to R^n is of course

Σ_{i∈[1:n]} ω_i(x) dx^i    (or ω_i dx^i using the Einstein summation convention)

and this, for smooth functions ω_i, i ∈ [1 : n], represents a covector field or differential 1-form on R^n. The preference for letters towards the end of the Greek alphabet to denote differential forms is widespread, so again you ought to get used to it. The subscripts instead of superscripts for indices tell you something about the covariance or contravariance of the entities. I shall explain this properly shortly.

If you wonder why on earth anybody bothers to distinguish between vector fields and differential 1-forms, one answer is that it is natural to differentiate k-forms to get (k+1)-forms for k ∈ N. This is what Stokes' theorem is really all about. As you ought to have learnt in second year but probably didn't.
2.6 The Tangent Functor

Suppose f : X → Y is a differentiable map between manifolds. Then for the case where X = R^n and Y = R^m there is a map between the tangent spaces at each point which takes the tangent space at a ∈ X to the tangent space at f(a) ∈ Y. To take a tangent vector v_a in the tangent space T_a(X) to one in the tangent space T_{f(a)}(Y), all we have to do is to operate on it by Df(a), which is by definition a linear map and has the right dimensions for domain and codomain. If we are prepared to choose a basis for T_a(X) and T_{f(a)}(Y) we could represent Df(a) by a matrix, and there is a perfectly sensible way of choosing the 'same' basis for tangent spaces over different points. All this makes sense even if X and Y are just finite dimensional real vector spaces without the extra structure of R^n. In fact it makes sense in arbitrary Banach spaces.

Of course, there is a slight problem of how to extend this to manifolds which are not Banach spaces. Spheres and tori spring to mind.
If we take v_a, and recall that it is a tangency equivalence class of curves v : (−1, 1) → X taking 0 to a, then f ◦ v is a curve through f(a) and it specifies a tangency class. Moreover, if v′ is tangent to v at a then f ◦ v′ is tangent to f ◦ v at f(a).

Exercise 2.6.1. Most of this should have been a second year exercise but probably wasn't. Do it now and all about tangent vectors and maps will be clear. Well, clearer.
1. Let f : R^2 → R^2 be defined by

(x, y)^T ↦ (u, v)^T = (x^2 + x + y + y^2, 1 + xy)^T

Compute f on the set of points (t, 0)^T for t ∈ [0, 1]. Do this by choosing ten points along the interval, evaluating f on them, and plotting them on a sheet of graph paper to obtain ten points which should lie on a smooth curve. Do the same for points on the interval (0, t)^T for t ∈ [0, 1].

2. Calculate f((1/10, 0)^T) and f((0, 1/10)^T) if you haven't already.

3. Calculate Df((1, 0)^T).

4. Evaluate the above matrix on the tangent vector ė_1.

5. Evaluate the above matrix on the tangent vector ė_2.

6. Map the two tangent vectors obtained by the last two jobs on the same graph.

7. Represent the tangent vector ė_1 by any curve c_1 in the tangency equivalence class and compose with f. Differentiate to find a linear representative of Tf(0, ė_1).

8. Repeat for a curve c_2 representing ė_2.

9. Sketch the curves f ◦ c_1 and f ◦ c_2.

10. Prove the claim that if v′ is tangent to v at a then f ◦ v′ is tangent to f ◦ v at f(a).
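Items 3 to 5 of the exercise can be checked numerically once you have done them by hand (the check is no substitute for the hand computation). A numpy sketch, with the Jacobian written out explicitly:

```python
import numpy as np

def f(p):
    x, y = p
    return np.array([x**2 + x + y + y**2, 1 + x*y])

def Df(p):
    """Jacobian of f, computed by hand from the formula for f."""
    x, y = p
    return np.array([[2*x + 1, 1 + 2*y],
                     [y,       x      ]])

a = np.array([1.0, 0.0])
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

# Df(a) applied to the basis tangent vectors (items 4 and 5):
print(Df(a) @ e1)   # first column of the Jacobian
print(Df(a) @ e2)   # second column
```

Evaluating the matrix on ė_1 and ė_2 just reads off the columns of the Jacobian, which is worth noticing before you do items 7 to 9 with curves.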
It follows that f induces a map Tf which takes tangent vectors at a to tangent vectors at f(a). This process doesn't, on the face of things, involve differentiation. Nor does it involve charts. Of course it does involve differentiation, as the last series of exercises shows convincingly. And it is easy to see that it goes through on charts for the usual reasons, which involve the chain rule.

In the case when we have a differentiable f : R^n → R^m, the last exercises should convince you that we have at each point a ∈ X the diagram

T_a(X) ──Df(a)──→ T_{f(a)}(Y)
   │                  │
   π_X                π_Y
   ↓                  ↓
   X ────────f──────→ Y

This diagram commutes, which means whichever way around you go you get the same result. We can do this for every point a ∈ X to get the commutative diagram:

TX ──Tf──→ TY
 │          │
 π_X        π_Y
 ↓          ↓
 X ───f───→ Y
The process of taking a manifold and producing its tangent bundle is said to be functorial, because if we have two manifolds and a smooth map between them the process gives a map between the bundles.

Instead of writing Tf we often write f_* for the same map. This is more general because it makes sense for some other vector bundles and not just the tangent bundle.

Such a map between tangent bundles is said to be fibre preserving, since it takes anything in the fibre over a to the fibre over f(a). And we can generalise this to maps between any fibre bundles, so they are also called bundle maps. If the fibre is a vector space we talk of vector bundles, and we require the bundle maps to be linear, so the map Tf is also a vector bundle map.
Note that the map Tf contains all the information about the derivative and
also tells you where things are, which the derivative (being only the linear
part of an aﬃne map) does not. So this is actually cleaner and conceptually
simpler than the usual description of the business of diﬀerentiation. Another
way to put this, in the light of the last exercise, is that when you calculate
lots of partial derivatives you are merely trying to calculate the linear part of
an aﬃne map which speciﬁes a tangency equivalence class, that is, a tangent
vector.
We can usefully think of Tf as coming in two parts, since locally the tangent
space is simply a cartesian product of possible tangent vectors over a space
with a part of the space. On the ﬁrst part Tf is simply f and on the second
part, the ﬁbres, it is Df, the derivative of f. We can now choose to deﬁne the
derivative of a smooth map this way. I have hankered after teaching calculus this way in first year. It is actually easier, probably because you need to isolate the core ideas in order to generalise things, and fronting up to the core ideas, although demanding at first, makes life a lot easier subsequently.
Note that the chain rule can now be formulated as
T(f ◦ g) = Tf ◦ Tg
Exercise 2.6.2. Conﬁrm that the chain rule holds. This is also a part of
saying that T is functorial.
Exercise 2.6.3. Guess what a functor is and what it is a map between.
Conﬁrm your guess by doing some googling. I warn against doing the googling
ﬁrst.
Exercise 2.6.4. Take f : R^+ → R^+ defined by x ↦ x^2. Show this is a diffeomorphism. Let V be the vector field on R^+ which has constant vectors of length 1 at every point. Show that Tf takes this into a new vector field on R^+, and say what the new vector field is. Regarding the two vector fields as differential equations, find both solutions.
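The pushforward half of the exercise can be checked symbolically; note that the following sympy sketch gives away part of the answer, so do the exercise first. The variable names are mine:

```python
import sympy as sp

x = sp.symbols('x', positive=True)
f = x**2                     # the diffeomorphism of R^+
V = 1                        # the unit field d/dx on R^+

# Pushing V forward: a tangent vector at x is multiplied by Df(x) = 2x,
# and it lands over the point u = f(x) = x^2, i.e. at x = sqrt(u).
u = sp.symbols('u', positive=True)
Df = sp.diff(f, x)                       # 2x
pushed = (Df * V).subs(x, sp.sqrt(u))    # the new field, as a function of u

assert sp.simplify(pushed - 2*sp.sqrt(u)) == 0
```

So the image field assigns the vector 2√u at the point u; solving both fields as ODEs and comparing the solutions under f is the remaining part of the exercise.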
2.6.1 The (nonexistent) Cotangent Functor

Suppose we have f : X → Y, a smooth map between smooth manifolds, and we look to see what happens in the cotangent bundle. Thinking of a cotangent at a ∈ X as a tangency equivalence class of maps from some neighbourhood of a to R, we see that the map between the fibres goes in the reverse direction. Given v : W ⊂ Y → R as a representative function in the tangency equivalence class at f(a) (with f(a) ∈ W), f induces v ◦ f : f^{-1}(W) → R on X, which defines a cotangent vector at a. So we obtain the diagram:

T_a(X)^* ←──f^*── T_{f(a)}(Y)^*
   │                  │
   π_X                π_Y
   ↓                  ↓
   X ────────f──────→ Y

This makes T^*, a hypothetical induced map on the whole cotangent bundle, a mess, because it goes one way (left to right) on the space part and the opposite way (right to left) on the cotangent part. If f has a smooth inverse we can get around this, but it is not so neat. Incidentally:

Definition 2.6.1. A smooth map with a smooth inverse is called a (smooth) diffeomorphism.
Exercise 2.6.5. How, if at all, can we relate the derivative of f to f^* when X = R^n, Y = R^m?
Remark 2.6.1. In older books, a covector ﬁeld is called a contravariant
vector ﬁeld and a vector ﬁeld is called a covariant vector ﬁeld. See for
example, Mackey’s Theoretical Foundations of Quantum Mechanics. As we
shall see later, a covariant vector ﬁeld is a contravariant tensor ﬁeld. Don’t
blame me for this.
This is all rather confusing on first encounter. Familiarity breeds acceptance, and the best way to become familiar with these ideas is to work them through in very simple cases. So make up a set of exercises yourself in which you work with particular simple maps between very simple manifolds (R^n and R^m for n, m small positive integers). As a start:

Exercise 2.6.6. Let f : R → R be given by f(x) = x^2. Put a = 2 and investigate what happens if we take (a) a tangent vector at 1 and (b) a cotangent vector at 4.

Now try it for f : R^2 → R^2 with

(x, y)^T ↦ (x^2 + y^2, xy)^T

and some suitable points for a and f(a). In this case you can conveniently represent tangent vectors as columns and cotangents as rows.
Exercise 2.6.7. Write out a lecture for first year students which describes tangent vectors on R in as simple a way as possible, as possible velocities along the line, and hence define the tangent bundle R × Ṙ. Define differentiation of maps from R to R in terms of bundle maps. Prove the chain rule as T(f ◦ g) = Tf ◦ Tg. Be prepared to answer any awful questions an intelligent student might ask.

Write out a lecture on ordinary differential equations in terms of sections of the tangent bundle. Set up and solve some easy ones in this notation.

Do you think this is easier or harder than the traditional way of doing it? Assume that since Mathematica can solve ODEs, the idea is not to train students to jump through hoops but to get them to understand what they are doing.
2.7 Autonomous Systems of ODEs

2.7.1 Systems of ODEs and Vector Fields

Consider the system of linear ordinary differential equations:

ẋ = −y,  x(0) = 1
ẏ = x,   y(0) = 0

We can write this as a two dimensional problem:

⎡ẋ⎤   ⎡ 0  −1 ⎤ ⎡x⎤
⎣ẏ⎦ = ⎣ 1   0 ⎦ ⎣y⎦

or more succinctly:

ẋ = Ax    (2.7.1)

where A is the above matrix.

The matrix A defines a vector field on R^2 by taking the location x to the vector A(x). We are now used to the idea of a vector field on R^2, both visually, in terms of lots of little arrows stuck on the space (which can incidentally be generated quickly and painlessly using Mathematica), and algebraically, as a map from R^2 to Ṙ^2 sending locations to arrows (with their tails attached to those locations).
Such a system of ordinary differential equations is called autonomous, meaning that the vector field specified by the system doesn't change in time. Consequently we can either refer to an Autonomous System of Ordinary Differential Equations defined on an open set U ⊆ R^n, or we can talk about a Smooth Vector Field on U. The second is much shorter and easier to think about.

If we draw the vector field in the above case, we get arrows which go around the space in a positive direction, as in figure 2.7.1.

Figure 2.7.1: A vector field or system of ODEs in R^2

A solution to the system of differential equations, or an integral curve for the vector field, is a map f : R → R^2, usually written

(x(t), y(t))^T

with the property that ẋ and ẏ satisfy the given system of equations. What this means is that we think of a point moving in R^2 so that its velocity at any point is just the vector attached to that point. So the solution curve has to have the vector field tangent to it always.

It is possible to learn to solve autonomous systems of differential equations without ever understanding that they are all about vector fields which give the velocity of a moving point, and that a solution is simply a function which says where the moving point is at any time, and which agrees with the given vector field in what the velocity vector is. This is a pity.
In the above case, you can see by looking at the system what the solution is: obviously the solution orbits are circles, and given the initial condition that at time t = 0 we start at the point (1, 0)^T, the solution can be written down as

x = cos(t), y = sin(t)

and it is easy to verify that this works.

Exercise 2.7.1. Do it.

Obviously, solving initial value ODE problems for more complicated vector fields isn't going to be so easy, and doing it in dimensions greater than three by the 'look at it and think' method also looks doomed. So it is desirable to have a general rule for getting out the solution. Fortunately this is easy enough for linear vector fields in principle, although the calculations can be messy in practice. But again, that's what computers are for.
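One way to let a computer do the verifying: compare a finite-difference velocity of the claimed solution with the vector field along the curve. A numpy sketch; the step size and tolerance are my own choices:

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

def solution(t):
    """The claimed solution with x(0) = (1, 0)^T."""
    return np.array([np.cos(t), np.sin(t)])

def velocity(t, h=1e-6):
    """Centred finite-difference approximation to the curve's velocity."""
    return (solution(t + h) - solution(t - h)) / (2*h)

# The trajectory's velocity should match the vector field A x everywhere:
for t in np.linspace(0.0, 6.0, 25):
    assert np.allclose(velocity(t), A @ solution(t), atol=1e-8)
```

This is just the tangency condition in numerical form: the velocity of the moving point agrees with the arrow attached to wherever the point happens to be.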
2.7.2 Exponentiation of Things

I did this in second year M213, but some of you may have missed out on it, in which case here it is. Those of you who did it can read this rather quickly.

If you write down the usual series for the exponential function you get:

exp(x) = 1 + x + x^2/2! + x^3/3! + ··· + x^n/n! + ···

Now think about this and ask yourself what x has to be for this to make sense. You are used to x being a real number, but it should be obvious that it could equally well be a complex number. After all, what do you do with x? Answer: you have to be able to multiply it by itself lots of times, you have to be able to scale it by a real number, and you have to be able to add the results of this. You also have to have an identity to represent x^0. Oh, and you need to be able to take limits of these things. So it will certainly work for x a real or a complex number. But it also makes sense if x is a square matrix. Or with any system where the objects can be added and scaled and multiplied by themselves. And have limits of sequences of these things.

The name for a system of objects which can be added and scaled by real numbers is a vector space, and a vector space where the vectors can also be multiplied is called an algebra. We can do exponentiation in any algebra which has a norm and a multiplicative identity. (And it would be a help if it was complete in that norm, i.e. limits of Cauchy sequences exist.) The square n × n matrices form such an algebra. We can also hope to take sequences of them and maybe have them converge to some matrix. So we can exponentiate square matrices.
Exercise 2.7.2. Exponentiate the matrix A in equation 2.7.1. Now exponentiate the matrix tA. Do you recognise the result?

It should be obvious that we could, in principle, calculate the exponential of a matrix to some number of terms, and if the infinite sum makes sense and the sequence of partial sums converges, then we could always get some sort of estimate of exp(A) for any matrix A by computing enough terms. We would hope that multiplying A by itself n times would give some reasonable sort of matrix, and when we divided all the entries by n! we would get something pretty close to the zero matrix. If this happened for all the n past some point, then we could optimistically suppose that exp(A) was some matrix which we could at least get better and better approximations to, which after all is exactly what we have with exp(x) for x a real number.
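The partial-sum idea can be sketched directly in numpy; be warned that the check at the end anticipates the answer to Exercise 2.7.2, so skip it if you want to discover that yourself:

```python
import numpy as np

def expm_series(A, terms=30):
    """exp(A) summed term by term: I + A + A^2/2! + A^3/3! + ..."""
    result = np.eye(A.shape[0])
    power = np.eye(A.shape[0])
    for n in range(1, terms):
        power = power @ A / n        # builds A^n / n! incrementally
        result = result + power
    return result

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
t = 0.5

# For this particular A, exp(tA) is rotation through the angle t:
expected = np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])
assert np.allclose(expm_series(t*A), expected)
```

Thirty terms is far more than enough here; the factorials kill the powers of A very quickly, which is the convergence argument of Exercise 2.7.3 in numerical miniature.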
Exercise 2.7.3. Define the norm of an n × n matrix A to be

‖A‖ = sup_{‖x‖=1} ‖A(x)‖

as in an earlier problem, and show that ‖A^2‖ ≤ ‖A‖^2. Hence prove that the function exp is always defined for any n × n matrix.
Exercise 2.7.4. If e^{tA} = exp(tA) denotes a map from R to the space of n × n matrices, show that its derivative is Ae^{tA}.
There are other algebras where a bit of exponentiation makes sense, so be
prepared for them.
2.7.3 Solving Linear Autonomous Systems

In principle this is now rather trivial:

Proposition 2.7.1. If ẋ = Ax is an autonomous linear system of ODEs with x(0) = a, then

x = e^{tA} a

is the solution.

Proof:
Differentiating e^{tA}a gives Ae^{tA}a by the last exercise, and since exp(0) = I, the identity matrix, the initial value x(0) = a is satisfied. So it is certainly a solution.

If this looks a bit like a miracle and in need of explanation, you are thinking sensibly and merely need to do more of it. It may help to note that the exponential function is the unique function with slope at a point the same as the value at the point, and that this leads to the general solution for the linear ODE in dimension one, and that this goes over to higher dimensions with no essential changes. In effect, the exponential function was invented to solve all these cases. It actually goes deeper than this; see Vladimir Arnold's book Ordinary Differential Equations.
2.7.4 Existence and Uniqueness
Could you have two diﬀerent solutions (or more)? No, not for linear systems,
but this requires thought. Certainly the one-dimensional ODE given by

ẋ(t) = 3x^{2/3},  x(0) = 0

has the solution x(t) = t³ but also the solution x(t) = 0. It also has infinitely many other solutions. (Can you find some?) Of course this is not a linear
ODE, but it is clear that some sort of conditions will need to be imposed
before we can look at vector ﬁelds which are not linear and expect them to
have solutions. Happily, there is a simple one which guarantees at least local
existence and uniqueness:
Theorem 2.7.1. If f : U ⊆ R^n −→ R^n is a continuously differentiable vector field, then for any point a in U there is a neighbourhood W ⊆ U of a containing a solution to the system of equations ẋ = f(x) with a as initial value, and the solution is unique. Moreover, there is a continuously differentiable map F : W × J −→ R^n for some interval J = (−ε, ε) about 0 ∈ R such that for all b in W, the map F_b : J −→ R^n is the solution with initial value b at t = 0.
There is a proof in Hirsch and Smale’s Diﬀerential Equations, Dynamical
Systems and Linear Algebra, pages 163 to 169.
There is a better proof in Arnold’s book on page 213. It is actually the
same proof but much better explained. It is given for the general (non
autonomous) case. Both arguments use the contraction mapping theorem.
You should read through it if you have not already done a proof in your
ODEs course. Assuming you did one.
The results follow easily from a more basic result sometimes called the Straightening Out Theorem (in Arnold, "the basic theorem of the theory of ordinary differential equations" or "the rectification theorem"; see chapter 2).
The theorem says that in a neighbourhood U of a point of R^n where the (continuously differentiable) vector field is nonzero, we can find a one-to-one differentiable map from U to W ⊆ R^n with a differentiable inverse, such that the transformed vector field on W is uniform and constant.
Given that we can do that, we could also make the vectors all have length one and lie along the x_1 axis in R^n with a rotation and scaling. The system of ODEs in the transformed region W would then be the rather boring system:

ẋ_1 = 1
ẋ_2 = 0
⋮
ẋ_n = 0
with the solution

x_1(t) = t + a_1;  x_2(t) = a_2;  … ;  x_n(t) = a_n
If you believe in the Straightening Out Theorem, then it is obvious that any
continuously diﬀerentiable vector ﬁeld has at any point where the vector ﬁeld
is nonzero a solution which is unique in some neighbourhood of the point
and which depends smoothly on the point. All we have to do is to map the
straight line boring solution(s) back by the diﬀerentiable inverse.
Exercise 2.7.5. Prove the last remark.
When the vector ﬁeld is zero at a point, the solution is the constant function
taking all of R to the point. So there is a unique solution here too.
Remark 2.7.1. You will ﬁnd a proof of the straightening out theorem in
Arnold. I shan’t prove it in this course on the grounds that this isn’t a course
on ODEs. At least, I don’t think it is.
Remark 2.7.2. It should be obvious that although we have looked at systems of ordinary differential equations on R^n, the fact that everything is defined locally means that they ship over to any smooth manifold. If the manifold is compact then completeness (defined in the next section) is guaranteed, and the solution can be found by doing everything in charts and piecing the bits together.
2.8 Flows
I rather slithered over one important point, which is the question of whether we always get a solution for all time, past and future. It is not hard to see that the vector field X(x) = x² on R, with initial condition x(0) = 1, has the solution

x(t) = 1/(1 − t)

which goes off to infinity in finite time. From which we deduce that it is not in general possible to ensure that there is a solution for all time, and this explains the cautious statement of the last theorem. The best we can hope to do, the theorem tells us, for a smooth vector field at a point is to find a neighbourhood of the point in which there is a parametrised curve, x(t) : t ∈ (−a, a), where if we are lucky a will be ∞ and if we aren't it will be some possibly rather small positive number.
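A quick sketch of the blow-up example: the claimed solution x(t) = 1/(1 − t) of ẋ = x² with x(0) = 1 satisfies the ODE (checked here by a difference quotient at a point of my choosing) and is already enormous just below t = 1.

```python
# The solution of xdot = x^2, x(0) = 1, escaping to infinity as t -> 1.
def x(t):
    return 1.0 / (1.0 - t)

h, t = 1e-6, 0.5
xdot = (x(t + h) - x(t - h)) / (2 * h)   # central-difference derivative
ode_residual = abs(xdot - x(t) ** 2)     # should be ~0, since xdot = x^2
near_blowup = x(1.0 - 1e-6)              # around 10^6 already
```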
Definition 2.8.1. A vector field on U ⊆ R^n is said to be complete if any solution can be extended to the whole real line.
Exercise 2.8.1. Show that if a vector ﬁeld has compact support then it is
complete.
Exercise 2.8.2. Show that if U is the unit open ball in R^n centred on the origin and X is a smooth vector field on U, then if X is complete, and if Proj(X(x), x) is the projection of X(x) on x, then

lim_{‖x‖→1} Proj(X(x), x) = 0
Remark 2.8.1. It should be obvious that there are not many physical situa
tions where things go belting oﬀ to inﬁnity in ﬁnite time, and for that reason
I shall restrict myself from now on to complete vector ﬁelds. If I forget to
put the word in, put it in yourself. Also put the word ‘smooth’ in front of
the term ‘vector ﬁeld’ whenever it occurs since I shall not consider any other
sort.
The business of getting a solution is going to work not just for the point we
selected as our starting point but also for neighbouring points provided we
don’t go too far away. In the happy case where the vector ﬁeld has solutions
for all time, the space U on which the vector field is defined is decomposable as a set of integral curves, since solutions can't intersect each other, or
themselves, although they can, of course, be closed loops. This statement
follows from the uniqueness of a solution. Hence we deduce that a vector
ﬁeld gives rise to what is called a foliation of the space into integral curves.
You can, perhaps, guess that partial diﬀerential operators more complicated
than vector ﬁelds will give rise to higher dimensional foliations, decomposing
the space into surfaces and other manifolds.
Exercise 2.8.3. Describe the foliation of R² by the vector field

−y ∂/∂x + x ∂/∂y
Recall that in second year (M213) we discussed the idea of groups acting on
sets and came to the conclusion that they were conveniently seen as homomorphisms from a group G into the group Aut(V) of maps from the set V into itself. Then a complete smooth vector field X on U ⊆ R^n gives rise to an action of the group R on U as follows:

x : R × U −→ U,  (t, x_0) ↦ x(t)

where x(t) is the integral curve of X with x(0) = x_0.
To prove this is indeed a group action, we need to show that x(0, x_0) = x_0 for every x_0, which follows immediately from my definition of x. (Since the additive identity of R is 0.) We also need to show that

∀ s, t ∈ R, ∀ x_0 ∈ U,  x(s, x(t, x_0)) = x(s + t, x_0)
which merely means that if you travel for time t from x_0 along the solution curve, and then go on for time s, this gives the same result as travelling for time s + t from the starting point x_0, which is, after all, what we expect a solution curve to do.
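The group-action law can be seen concretely. As a sketch, I take the rotation field X = −y ∂/∂x + x ∂/∂y of Exercise 2.8.3 (whose flow rotates the plane; the closed form is my own addition) and check x(s, x(t, p)) = x(s + t, p):

```python
# The flow of the rotation field satisfies the group-action law.
import numpy as np

def flow(t, p):
    """Flow of X(x, y) = (-y, x): rotate p about the origin by angle t."""
    c, s = np.cos(t), np.sin(t)
    return np.array([c * p[0] - s * p[1], s * p[0] + c * p[1]])

p0 = np.array([1.0, 2.0])
s, t = 0.4, 1.1
lhs = flow(s, flow(t, p0))    # travel for time t, then for time s
rhs = flow(s + t, p0)         # travel for time s + t in one go
```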
If we fix t and look to see what the group action does, it is a map from U to itself. Well, we knew that. It is a fact that this map is always a smooth
diﬀeomorphism. The old fashioned way of saying this is that the solutions
depend smoothly upon the initial conditions, but I much prefer the modern
way of saying it. You should be able to see that all we are doing is taking
each point as input, and outputting the point it will get to after time t.
Proposition 2.8.1. For a complete smooth vector field X on U open in R^n, for any t ∈ R, the map x_t : U −→ U which sends x_0 to x(t, x_0) is a diffeomorphism of U.

Proof:
The map x_t certainly has an inverse, x_{−t}. And the theorem on existence of solutions to an ODE establishes that the map is continuously differentiable when X is. So if X is smooth, so is x_t.
Remark 2.8.2. The set of diffeomorphisms {x_t : t ∈ R}, or in other words the map x : R × U −→ U, is called in old fashioned books a one-parameter group of diffeomorphisms. I shall simply say that the map x obtained from the vector field X is the flow of X.
Remark 2.8.3. Given a flow x on U ⊆ R^n we can always recover the vector field by simply taking any point a and differentiating the map x_a : R −→ U, which sends t to x(t, a), at t = 0. This must give us the required vector field from which the flow can be derived. So there is a correspondence between flows and vector fields.
You now have four ways of thinking about vector ﬁelds. They are bunches
of arrows tacked onto a space; they are autonomous systems of ordinary
differential equations. And they are also flows, obtained by solving the autonomous system. And last but not least they are operators on the algebra of
smooth functions from the space to R. This demonstrates that vector ﬁelds
are more interesting and complicated than you might have supposed.
I shall give one important feature of vector ﬁelds which arises from this
multiple perspective and which is much less obvious if you stick only to
systems of ordinary diﬀerential equations.
2.9 Lie Brackets
Writing, as is conventional in some areas, X and W for two vector fields in X(R^n), and bearing in mind that we can compose any such operators to get X ◦ W and W ◦ X (which we write XW and WX for short): in general the result is a perfectly good operator, but some calculations will rapidly convince you that XW is not, in general, a vector field operator but something much nastier.
Example 2.9.1. Let V = −y ∂/∂x + x ∂/∂y and W = x ∂/∂x + y ∂/∂y. Then VWh is

−xy ∂²h/∂x² − y ∂h/∂x − y² ∂²h/∂x∂y + x² ∂²h/∂y∂x + x ∂h/∂y + xy ∂²h/∂y²

and WVh is

−xy ∂²h/∂x² + x² ∂²h/∂y∂x + x ∂h/∂y − y² ∂²h/∂x∂y − y ∂h/∂x + xy ∂²h/∂y²

Neither of these looks like a vector field operating on h. If however we take the difference, VW − WV, we get some happy cancellation and wind up with

VW − WV = (−y ∂/∂x + x ∂/∂y) − (x ∂/∂y − y ∂/∂x) = 0

which is a vector field, although not a very interesting one.
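In coordinates the bracket of vector fields V, W works out to [V, W](p) = DW(p) V(p) − DV(p) W(p), where DV, DW are Jacobian matrices. A sketch using a numerical Jacobian (the helper and test point are my own) confirms that the bracket of the two fields in Example 2.9.1 vanishes:

```python
# Numerical Lie bracket [V, W](p) = DW(p) V(p) - DV(p) W(p).
import numpy as np

def jacobian(f, p, h=1e-6):
    """Numerical Jacobian of f : R^2 -> R^2 at p, by central differences."""
    J = np.zeros((2, 2))
    for j in range(2):
        e = np.zeros(2)
        e[j] = h
        J[:, j] = (f(p + e) - f(p - e)) / (2 * h)
    return J

def bracket(V, W, p):
    return jacobian(W, p) @ V(p) - jacobian(V, p) @ W(p)

V = lambda p: np.array([-p[1], p[0]])   # V = -y d/dx + x d/dy
W = lambda p: np.array([p[0], p[1]])    # W =  x d/dx + y d/dy
b = bracket(V, W, np.array([0.7, -1.3]))
```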
Exercise 2.9.1. Write down another pair of vector fields V, W on R² and compute VW − WV. Check to see if you always get the zero vector field. What is it telling you about the vector fields when VW − WV = 0? (Some intelligent conjectures would be of interest, but only if supported by evidence not used in framing the conjecture.)
Exercise 2.9.2. If X = P(x, y) ∂/∂x + Q(x, y) ∂/∂y and W = R(x, y) ∂/∂x + S(x, y) ∂/∂y, calculate XW − WX and verify that it is a vector field.
Exercise 2.9.3. Compute XW − WX for X, W ∈ X(R^n) and show it is a vector field in X(R^n). Show that this also holds for X(U) for any open set U ⊆ R^n.
All this gives the following definition:

Definition 2.9.1. The Lie Bracket or Poisson Bracket of two vector fields X, W in X(U) for U ⊆ R^n is written [X, W] and defined by

[X, W] = XW − WX

It is a multiplication on the vector space of vector fields on U.
Exercise 2.9.4. Do some simple calculations, preferably for U ⊆ R¹, and convince yourself that the Lie bracket multiplication is not in general associative but does satisfy the Jacobi Identity:

∀ X, Y, Z ∈ X(U),  [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0
Exercise 2.9.5. Prove that the Jacobi Identity is always satisﬁed for Vector
Fields.
The Lie bracket almost makes the vector space of vector fields on U, an open subset of R^n, into an algebra, which you will recall is merely a vector space where the vectors can be multiplied, to make a ring. Here the Lie Bracket operation fails to be associative in general, but a vector space with a non-associative multiplication which satisfies the Jacobi Identity is, notwithstanding, called a Lie Algebra. There are others besides these, and again algebraists have gone to town on investigating abstract Lie Algebras. Well, we wouldn't like them to be at a loose end and hang around street corners³.
Exercise 2.9.6. Prove [X, (Y + Z)] = [X, Y] + [X, Z] and [(X + Y), Z] = [X, Z] + [Y, Z]. Prove also that ∀ a ∈ R, [aX, Y] = a[X, Y] and [X, aY] = a[X, Y].
³ Although they'd probably have an interesting line in graffiti.
Remark 2.9.1. The above properties you will recognise as bilinearity.
Exercise 2.9.7. Investigate the relation between [hX, Y ], [X, hY ] and h[X, Y ].
It should be apparent that although the calculations tend to be messy and
provide great scope for making errors, they are not essentially diﬃcult. A
natural candidate for a good symbolic algebra package, you might say.
Exercise 2.9.8. Is there a multiplicative identity for the Lie Bracket operation on vector fields? That is, is there a vector field J such that for every other vector field X, [J, X] = X? (Hint: what is [J, J]?)
You might be interested in an area of applications of these ideas. If so read
on.
It is easy to find the solution h(x, y) = x² + y² to the PDE

−y ∂h/∂x + x ∂h/∂y = 0
Now this is one solution, and ﬁnding a single solution is very nice, but we
usually want the general solution. In this particular case you can probably
guess it. But in general, if we have some linear partial diﬀerential operator
L acting on F, a suitable space of smooth functions, and if we want the set
of all solutions of Lh = 0, then it will usually be a lot harder to ﬁnd them.
This process is aided by the following idea: the set of solutions of L is going to be a linear subspace of F, by definition of the term linear operator. Call it F₀. Now a symmetry of the solution space of the operator L, often called a symmetry of the operator L, is some vector field operator X such that X takes F₀ into itself, i.e. whenever h is a solution to Lh = 0, so is Xh. If we know the collection of all symmetry operators for L and we have a solution, then we can find all the other solutions. In trivial cases this will amount to no more than adding in arbitrary constant functions, but in non-trivial cases it will do a whole lot more than this. So it would be a good idea to be able to find, for a given L, the set of all symmetries X for L. It is clear that the Poisson-Lie bracket can be used for any pair of linear operators, not just vector fields. The following observation goes some way to explaining our interest in them:
Proposition 2.9.1. If [L, X] = gL for some function g ∈ F, then X is a symmetry of L.

Proof:
We need to show that ∀ h ∈ F₀, L(Xh) = 0. Now

LX − XL = gL ⇒ LX = gL + XL

and

∀ h ∈ F₀, (gL + XL)h = gLh + XLh = 0 + 0 = 0
Exercise 2.9.9. Prove the converse, that if X is a (vector space) symmetry
of L, then [L, X] = gL for some g ∈ F.
Exercise 2.9.10. What symmetry is involved in finding the general solution to the equation

−y ∂h/∂x + x ∂h/∂y = 0

and how does it give the general solution?
Now it is possible to prove that the set of all vector space symmetries of an operator L is itself a Lie Algebra, which is one reason for wanting to know more about them.
Some students of PDEs want to know why it is that the standard partial differential equations all have their variables separable: does this happen for all possible PDEs, and why does it work for these cases? The answer to this question is rather long and may be found in Volume 4 of the Encyclopedia of Mathematics and Its Applications, Symmetry and Separation of Variables by Willard Miller. It has a lot to do with Lie Algebras.
It is now possible to state properly a signiﬁcant problem.
Going back to the idea of flows, it makes sense to ask whether flows commute. For a suitable pair of flows x, y : R × U −→ R^n we can start off from a ∈ U and go by flow x for a time s and then by flow y for time t. This will get us to some point in U, written naturally enough as y_t ◦ x_s(a). Or we could go the other way around, first by y and then by x, to get x_s ◦ y_t(a). If we always wind up at the same point for any starting point and any pair of times s, t then we say that the flows commute.
Then when the ﬂows x, y correspond to the vector ﬁelds X, Y , we have the
following result: x and y commute iﬀ [X, Y ] = 0. You can see that this works
for the case of the two vector ﬁelds V, W in Example 2.9.1.
At present we lack the machinery to prove this result economically, so I shall
skip it until it is needed.
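The claim can at least be checked numerically for Example 2.9.1, where [V, W] = 0. The flow of V rotates the plane and the flow of W scales it (both closed forms are standard, but their use here is my own addition), and the two flows do commute:

```python
# The flows of V = -y d/dx + x d/dy and W = x d/dx + y d/dy commute.
import numpy as np

def rot(t, p):
    """Flow of V: rotation of p by angle t."""
    c, s = np.cos(t), np.sin(t)
    return np.array([c * p[0] - s * p[1], s * p[0] + c * p[1]])

def scale(t, p):
    """Flow of W: scaling of p by e^t."""
    return np.exp(t) * p

p0 = np.array([1.0, 2.0])
lhs = rot(0.8, scale(0.3, p0))   # scale first, then rotate
rhs = scale(0.3, rot(0.8, p0))   # rotate first, then scale
```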
Remark 2.9.2. Again all this makes perfectly good sense on manifolds for the usual reasons. The idea of thinking of a vector field on a manifold M as a special kind of operator on C^∞(M) ensures that we can compose them and add them and subtract them, so the Lie Bracket makes sense there too, and we can write down vector fields on manifolds via charts and find integral curves for them and so foliate the manifold.
Exercise 2.9.11. Demonstrate the truth of the last remark by doing some of these things on S¹ and, if you are feeling very brave, S².
2.10 Conclusion
This has been a quick introduction to the ideas of smooth manifolds and
vector ﬁelds on them. There are whole books dedicated to these ideas and
you will ﬁnd some in the library. You will ﬁnd some of these ideas covered
very quickly in chapters two and three of the text book, which you should read, satisfying yourself that it is intelligible. You should be able to see why the definitions are as they are.
Chapter 3
Tensors and Tensor Fields
This chapter deals with the machinery needed to talk about differential geometry, although it only starts on actually doing so. It contains the information in Chapter four of the text book and goes into the algebra in more detail. This is because we are doing it right, on account of being mathematicians and therefore feeling uneasy about relying on our intuitions without being able to check the logic. We also cover part of Chapter one of Part three, where the text book is decidedly scrappy, and part of Chapter five of Part one. I do things in a slightly different order. You should however read the text book in conjunction with the notes and do the exercises.

I also throw in a few remarks about the exterior calculus and Stokes' Theorem, not because this is part of the course but because it is a part of every educated person's background in the twenty-first century. It has to be admitted that there aren't many educated people around, but then there never have been.
3.1 Tensors
3.1.1 Natural and Unnatural Isomorphisms
Let V denote a real vector space of finite dimension, so V is isomorphic to R^n for some n ∈ Z⁺. Then I can define the space of shifts of V, which is a collection of maps from V to V, by taking any v ∈ V and writing

v̂ : V → V,  ∀ w ∈ V, w ↦ w + v

The set of all such maps I shall call V̂. The map v̂ is the map that adds v to everything. I can compose such maps, and it is immediate that û ∘ v̂ = (u + v)^.
Similarly, for every t ∈ R, (tu)^ = t û, where we scale maps in the usual way, (tf)(x) = t(f(x)). This makes V̂ a vector space and gives an isomorphism

^ : V → V̂,  ∀ v ∈ V, v ↦ v̂
Exercise 3.1.1. Conﬁrm that this is an isomorphism of vector spaces.
This isomorphism is natural, which means (in part) that given f : U → V, a linear map between real vector spaces, we get a map from Û to V̂:

f : U → V  gives  f̂ : Û → V̂,  f̂(û) = (f(u))^

Note that we can specify the isomorphism and the map f̂ without making any reference to a basis for U or V. This is the other part of what we mean by natural. I cannot define naturalness properly without an excursion into category theory, which I am hoping to avoid, but the idea is sufficiently clear for present purposes. I hope.
I shall write U ≅ V when vector spaces U and V are isomorphic, and U ≅_N V when they are naturally isomorphic.
Exercise 3.1.2. Take the space L(R, V ), of linear maps from R to V , and
show this is also a vector space, naturally isomorphic to V .
The space of shifts being naturally isomorphic to V leads to two pictures
of a vector space, one has got points in it and the other has got arrows in
it. We can certainly think of a shift map taking u to u + v as an arrow
from u to u + v in the original space, and the map itself as a whole lot of
arrows, all basically showing where each point starts and ﬁnishes under the
map. And since the spaces are isomorphic we can cheerfully think in either
one. Physicists do this all the time as do applied mathematicians, and so
they confuse the two distinct things, points and arrows, and usually this does
no harm; in fact the more ways you have of thinking about something the
easier it is to solve problems, so it actually does some good. It is, however,
probably better to confuse things when you know you are doing it, rather
than just being confused.
Now I define the space V*, the dual space to V.

Definition 3.1.1. V* is the set L(V, R) of linear maps from V to R, with the usual rules for addition and scaling of maps, viz. (tf)(v) = t(f(v)) and (f + g)(v) = f(v) + g(v), for every t ∈ R, every f, g ∈ V* and every v ∈ V.
Exercise 3.1.3. Confirm that V* is indeed a vector space of the same dimension as V.

Exercise 3.1.4. Show that a basis for V determines a basis for V* in an obvious way. It is called the dual basis. In R², the row [1, 0] is dual to the column (1, 0)ᵀ ∈ R², and so on. Note that my usage of representing vectors as columns (and elements of the dual space as rows) is consistent with standard matrix notation and makes it easier to distinguish R^n from its dual space.
Remark 3.1.1. The standard (ordered) basis in R^n is often written as the ordered set (e_1, e_2, …, e_n), which saves writing lots of columns. e_j is the column of n numbers which has a 1 in the j-th place and a zero everywhere else. I shall often write (e¹, e², …, eⁿ) for the dual basis. You can think of e^j either as a row matrix with n entries, with the j-th entry 1 and all the others zero, or you can think of it as the projection onto the j-th axis, according to taste. People who cannot tell subscripts from superscripts are going to have a hard time with tensors.
It follows from the last exercise that V and V* are isomorphic, but the isomorphism is not natural. Given real vector spaces U and V and a linear f : U → V, we get a map f* : V* → U* defined by

f* : V* → U*,  ∀ g ∈ V*, g ↦ g ∘ f

This map is the wrong way around. The term contravariant is used for things like this. Again I am being a trifle vague here in order to avoid a long digression.
If V = R^n then I find it helpful to write the elements of R^n as column arrays. Then it is natural to write the elements of (R^n)* as row arrays. This makes it clear that the latter act on the former (by matrix multiplication), so there is a map

(R^n)* × R^n → R,  ([a_1, a_2, …, a_n], (x^1, x^2, …, x^n)ᵀ) ↦ a_1 x^1 + a_2 x^2 + ⋯ + a_n x^n
Physicists write the thing on the right as a_i x^i by what is called the Einstein summation convention, which means that a repeated lower index and upper index is short for a sum over all possible values of the index. This explains why we use lower indices or subscripts for covectors, elements of the dual space, and superscripts or upper indices for the components of a vector. It makes writing squares and higher powers a real bugger, but fortunately we don't have to do that very often.
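As a sketch, the row-column pairing and the Einstein convention a_i x^i correspond directly to matrix multiplication; numpy's einsum even uses the same index bookkeeping (the particular numbers and index string are my illustration, not the text's):

```python
# A covector acting on a vector: row times column, or a_i x^i.
import numpy as np

a = np.array([2.0, -1.0, 3.0])   # covector components a_i (a row)
x = np.array([1.0, 4.0, 2.0])    # vector components x^i (a column)

pairing = a @ x                        # matrix-style row times column
einstein = np.einsum('i,i->', a, x)    # repeated index i summed: a_i x^i
```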
Note that this generalises: there is a map

V* × V → R,  (g, v) ↦ g(v)

All I have done with my rows and columns is specify the maps and the vectors by arrays of numbers. This is so we can do sums. Usually rather horrid sums, but that is what Mathematica and MATLAB are for.
Note that confusing a space and its dual is not a good idea: physicists did this
and got themselves in a bit of a mess in consequence. They are isomorphic,
at least when ﬁnite dimensional, but not naturally isomorphic, so it is a good
idea to keep them separate.
Exercise 3.1.5. Define the unit vector at 0 on R to be the tangency equivalence class of the map i : R → R given by i(t) = t. Then i ∈ Ṙ_0 is a basis element. Define dx : Ṙ_0 → R by dx(i) = 1. The identity map x : R → R goes over to the identity map Ṙ → Ṙ, and it takes the tangent vector i to itself. What does it do to dx?

Note that all the above spaces are isomorphic and all maps are pretty much the identity map if you are prepared to be sloppy. Examine which of the various maps are covariant and which contravariant.
Exercise 3.1.6. Let V be the space of all real valued functions defined on R. Is it the case that V and V* are isomorphic? If so, provide an isomorphism.
Exercise 3.1.7. Show that an isomorphism between a vector space V and its dual provides a quadratic form on V which, if positive definite, defines an inner product on V. Show that an inner product on V determines an isomorphism between V and its dual whenever V is finite dimensional. Is this true when V is not finite dimensional?
3.1.2 Multilinearity
Definition 3.1.2. A bilinear map f : U × V → W for real vector spaces U, V, W is one such that

∀ u, u′ ∈ U, ∀ v ∈ V, ∀ s, t ∈ R,  f(su + tu′, v) = s f(u, v) + t f(u′, v)  and
∀ u ∈ U, ∀ v, v′ ∈ V, ∀ s, t ∈ R,  f(u, sv + tv′) = s f(u, v) + t f(u, v′)
We can describe this by saying that f is linear in each variable separately. The field can in fact be any field you like, as long as it is the same field for U, V and W.
Exercise 3.1.8. Find a bilinear map from R × R to R.
Definition 3.1.3. For any u ∈ U and a bilinear map f : U × V → R I can write f_{(u,−)} : V → R as the map

f_{(u,−)} : V → R,  v ↦ f(u, v)

Similarly, for any v ∈ V, f_{(−,v)} : U → R sends u to f(u, v).
We can describe bilinearity of f by saying that f is linear in each variable separately, meaning that for any u ∈ U, f_{(u,−)} is linear, and for any v ∈ V, f_{(−,v)} is linear.
Two exercises which may help later in understanding some technicalities:
Exercise 3.1.9.

1. Show that when V is finite dimensional, V**, the dual of V*, is naturally isomorphic to V. That is, show there is an isomorphism which does not require a basis of either space to specify it, and that f : U → V induces a map f** : U** → V**.

2. Show that if Bil(U × V, W) is the vector space of bilinear maps from U × V to W, and L(A, B) is the vector space of linear maps from A to B for any real vector spaces A and B, then

Bil(U × V, W) ≅_N L(U, L(V, W))

where ≅_N denotes a natural isomorphism of vector spaces.
Now we generalise the idea of bilinearity, which deals with maps from U × V to a vector space, to multilinearity, which has more than just two terms in the product.
Definition 3.1.4. If we have a k-fold cartesian product of real vector spaces U_1 × U_2 × ⋯ × U_k and if j ∈ [1 : k], we can take (u_1, u_2, …, u_k) in this product, and for any

f : U_1 × U_2 × ⋯ × U_k → R

we define f_{(u_1, u_2, …, û_j, …, u_k)} : U_j → R, where the û_j means that the j-th term has been replaced by −, to be the map which sends u_j to f(u_1, u_2, …, u_j, …, u_k).
Note that û_j has absolutely nothing to do with shift maps!
Definition 3.1.5. A k-multilinear map f : U_1 × U_2 × ⋯ × U_k → R for real vector spaces U_j, j ∈ [1 : k], is a map such that if we keep all but one term of (u_1, u_2, …, u_k) fixed, representing this as (u_1, u_2, …, û_j, …, u_k), then

f_{(u_1, u_2, …, û_j, …, u_k)}

is a linear map from U_j to R, for any j ∈ [1 : k] and any (u_1, u_2, …, û_j, …, u_k).
This is actually quite simple but a swine to write down. If you find it confusing, write down a trilinear map from R² × R² × R² to R.
Definition 3.1.6. A covariant k-tensor on a vector space V is a multilinear map

f : V × V × ⋯ × V  (k copies) → R

We write T^k(V) for the vector space (under the usual addition and scaling of maps) of all covariant k-tensors on V. By convention, a 0-tensor on a (real) vector space V is a real number.
Definition 3.1.7. A contravariant ℓ-tensor on a vector space V is a multilinear map

g : V* × V* × ⋯ × V*  (ℓ copies) → R

We write T_ℓ(V) for the vector space of contravariant ℓ-tensors on V.
Since a covariant 1-tensor on V is actually an element of V*, and a contravariant 1-tensor on V is actually an element of V** ≅_N V, there is a case for saying that the names co- and contra- should be swapped around. But I haven't got the nerve.
We can have mixed tensors which are covariant in some arguments and contravariant in others. When a physicist writes down something like g_{μν}, the fact that they are subscripts tells you that this is a covariant tensor, the fact that there are two of them tells you it is a bilinear map from V × V to R, and almost certainly V is either R³ or possibly the tangent space or the cotangent space to the manifold we live in. When a physicist writes

g^τ_{μν}

he has a tensor g : V × V × V* → R for some V which he frequently forgets to specify, on the grounds that he knows, as do all right thinking people, what it is. He uses subscripts for the coefficients g_{μν} so that he can use superscripts for the things they operate on and use the Einstein convention.
Definition 3.1.8. We talk of a (k, ℓ)-type tensor ω on V when it is covariant of order k and contravariant of order ℓ, that is, when it is a multilinear map

ω : V × ⋯ × V  (k copies)  ×  V* × ⋯ × V*  (ℓ copies) → R

We write T^k_ℓ(V) for the space of type (k, ℓ) tensors on V. T^k_0(V) is written T^k(V) and T^0_ℓ(V) is written T_ℓ(V).
I shall expand on this when I explain tensor fields, which come up next.
Definition 3.1.9. A covariant k-tensor ω is symmetric iff

ω(u_1, u_2, …, u_k) = ω(u_2, u_1, u_3, …, u_k)

and whenever we swap any two arguments the result is the same.
Definition 3.1.10. A covariant k-tensor ω is alternating (or antisymmetric) iff

ω(u_1, u_2, …, u_k) = −ω(u_2, u_1, u_3, …, u_k)

and whenever we swap any two arguments only the sign is changed.
Note that we can say this more easily: a covariant k-tensor is symmetric iff it is invariant under the symmetric group S_k acting on the arguments; in algebra, ω ∈ T^k(V) is symmetric iff ω = ω ∘ σ for every σ in the permutation group S_k on the arguments of ω. And if it is antisymmetric then it is invariant under the alternating group A_k. If σ is a permutation of the set of arguments, we write sgn(σ) for +1 if σ is an even permutation and −1 if it is odd. Then we can say ω is alternating iff ω = sgn(σ) ω ∘ σ for every σ ∈ S_k.
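The definition can be checked concretely. As a sketch, the determinant viewed as a function of the columns is an alternating covariant 3-tensor on R³, so composing it with any permutation σ of the arguments multiplies it by sgn(σ); the vectors and the sign-counting helper below are my own choices.

```python
# The determinant of (u | v | w) as an alternating 3-tensor on R^3.
import itertools
import numpy as np

def omega(u, v, w):
    """det of the matrix with columns u, v, w."""
    return float(np.linalg.det(np.column_stack([u, v, w])))

def sgn(perm):
    """Sign of a permutation given as a tuple of indices: (-1)^inversions."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

args = [np.array([1.0, 0.0, 2.0]),
        np.array([0.0, 1.0, 1.0]),
        np.array([3.0, 1.0, 0.0])]
alternating = all(
    abs(omega(*(args[i] for i in p)) - sgn(p) * omega(*args)) < 1e-9
    for p in itertools.permutations(range(3)))
```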
Alternating k-tensors are also known as k-forms and are important for later work. They have everything to do with orientation.
Definition 3.1.11. We write Ω^k(V^n) for the space of alternating covariant k-tensors on the vector space V having dimension n.
3.1.3 Dimension of Tensor spaces
You can either read this carefully or simply do the exercises at the end of
the subsection. Or you can do both. As long as you ﬁnd out how to do the
exercises!
The space of k-tensors on V^n is obviously a vector space, because we can add and scale the maps; the sum of two tensors of type (k, ℓ) is obviously another tensor of the same type, and likewise scaling such a tensor by a real number gives another tensor of the same type. Since the set of all maps from any set X to R is a vector space, the type (k, ℓ) tensors form a linear subspace.
Example 3.1.1. Suppose ω is any (2, 0) tensor on R. Then we put ω(1, 1) = a. Then by multilinearity, keeping the second component fixed, we deduce that ω(x, 1) = xa for any x ∈ R, and now keeping the first component fixed we see that ω(x, y) = xya. Thus the tensor is specified by just one number, a, and so the space of covariant 2-tensors on R is a one dimensional vector space, having the ω with ω(1, 1) = 1 as a basis element.

We note that ω is always a symmetric tensor. There is precisely one alternating (2, 0) tensor on R and it is the zero map. So the space of alternating (2, 0) tensors on R is zero dimensional. The zero tensor is both symmetric and antisymmetric.
To get a basis for the covariant k-tensors on R^n, we need to specify the maps on every choice of basis elements. For example, for 2-tensors on R² we know the multilinear map completely if we know it on (e_1, e_1), (e_1, e_2), (e_2, e_1) and (e_2, e_2). Then multilinearity will guarantee us the value on any pair of vectors, each in R². The extension to higher order tensors and different n is obvious, and by taking any basis for V we get the same conclusion. This gives the obvious result: the dimension of the space of covariant k-tensors on V^n is n^k. For example, we can take as a basis for T²(R²) the four maps defined by the four columns:
(e
1
, e
1
) → 1 (e
1
, e
1
) → 0 (e
1
, e
1
) → 0 (e
1
, e
1
) → 0
(e
1
, e
2
) → 0 (e
1
, e
2
) → 1 (e
1
, e
2
) → 0 (e
1
, e
2
) → 0
(e
2
, e
1
) → 0 (e
2
, e
1
) → 0 (e
2
, e
1
) → 1 (e
2
, e
1
) → 0
(e
2
, e
2
) → 0 (e
2
, e
2
) → 0 (e
2
, e
2
) → 0 (e
2
, e
2
) → 1
Then it is obvious that these four maps are linearly independent and that
any bilinear map from R
2
R
2
to R is a linear combination of these.
Exercise 3.1.10. Prove the last remark.
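The claim is easy to sanity-check numerically (a check, not a proof). The following is a minimal sketch assuming Python with numpy is available; the names omega and combo are mine, not the book's:

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary bilinear map omega on R^2 is pinned down by its values on
# the four pairs of basis vectors; store them as M[i, j] = omega(e_i, e_j).
M = rng.standard_normal((2, 2))

def omega(u, v):
    # bilinear in each argument
    return u @ M @ v

def combo(u, v):
    # the linear combination of the four basis tensors, with the basis
    # tensor for (e_i, e_j) being the map (u, v) -> u[i] * v[j]
    return sum(M[i, j] * u[i] * v[j] for i in range(2) for j in range(2))

u, v = rng.standard_normal(2), rng.standard_normal(2)
assert np.isclose(omega(u, v), combo(u, v))
```

The same loop with k nested indices checks the n^k count for higher orders.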
Remark 3.1.2. We can write the map given by the first column as dx ⊗ dx, the second column map as dx ⊗ dy, the third as dy ⊗ dx and the last as dy ⊗ dy. I shall explain this neat notation later.
If the tensors are of mixed type (k, ℓ)_T, then by taking the dual basis for the contravariant tensors we get that the dimension is n^(k+ℓ). The Riemannian Curvature tensor, which you may meet later, is a (3, 1)_T tensor; in R^4, spacetime, it therefore has dimension 4^4 = 256. This means it takes 256 numbers to specify it. Fortunately it has a lot of symmetries which reduce the dimension to 20, otherwise nobody would have the patience to do any calculations with it.
The space of alternating covariant 2-tensors is obviously a subspace of the space of all covariant 2-tensors: on R^2, we do not need to look at what any such tensor ω does to (e_1, e_1) because it has to be zero. Similarly if we know it on (e_1, e_2) we know its value on (e_2, e_1): it is just the negative. So if we know its value on (e_1, e_2) we know it completely, and since a basis for the space Ω^2(R^2) is the single alternating tensor which sends (e_1, e_2) to 1, the dimension of Ω^2(R^2) is one. Since it is easily verified that the alternating map which sends (e_1, e_2) to 1 is the determinant of the matrix formed by putting the two vectors as adjacent columns, we see that the determinant is a basis for the space Ω^2(R^2) of alternating 2-tensors on R^2.
Exercise 3.1.11. Easily verify the above claim.
In R^3 we have three basis elements, e_1, e_2, e_3, and if we look to see what we have as a basis for the alternating 2-tensors we observe that we know any such alternating ω if we know it on (e_1, e_2), (e_2, e_3) and (e_1, e_3). For every other pair of basis elements, the result is forced by knowing ω on these three together with the fact that ω is alternating. Since we have only three choices of real numbers to make in order to nail down a particular alternating 2-tensor on R^3, the dimension of Ω^2(R^3) is 3. And for R^n, all we have to do is to take pairs (e_i, e_j) with i < j, and again knowing ω on these tells us everything about ω. There are n(n − 1) ways of choosing two different basis vectors from R^n, and we need half of them, so the dimension of Ω^2(R^n) is n(n − 1)/2.
And finally, we can choose k distinct basis elements from the set of n in n(n − 1)(n − 2) ⋯ (n − k + 1) ways, and each such way can be permuted in k! ways and we need only one of them. We can choose a suitable basis on which to define an alternating k-tensor to be the set {e_{i_1}, e_{i_2}, …, e_{i_k}} with i_1 < i_2 < ⋯ < i_k, and this can be done in

    ^nC_k = n!/(k!(n − k)!)

ways, the number of ways of choosing k things from n. So the dimension of Ω^k(V^n) is ^nC_k.
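The counting argument can be checked against Python's standard library, which provides the binomial coefficient directly; this is only a numerical sanity check of the formula above, with the helper name falling my own:

```python
from math import comb, factorial

def falling(n, k):
    # n(n-1)...(n-k+1): ordered choices of k distinct basis vectors
    out = 1
    for i in range(k):
        out *= n - i
    return out

for n in range(1, 9):
    for k in range(n + 1):
        # each unordered choice appears k! times among the ordered ones
        assert falling(n, k) // factorial(k) == comb(n, k)

print(comb(4, 2))  # dimension of the alternating 2-tensors on R^4: 6
```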
The space of symmetric tensors is similar, except that the value of ω on two copies of the same basis element of R^2 is no longer forced. In R^2 the 2-tensor ω is determined if we know ω(e_1, e_1), ω(e_1, e_2), ω(e_2, e_1), and ω(e_2, e_2). If we know it is symmetric we don't need both ω(e_1, e_2) and ω(e_2, e_1). So the dimension of the symmetric 2-tensors on R^2 is three. The symmetric k-tensors on R^n have as a basis the maps defined by their values on the sets {e_{i_1}, e_{i_2}, …, e_{i_k}} with i_1 ≤ i_2 ≤ ⋯ ≤ i_k. For the 2-tensors on R^n we can choose two elements in n^2 ways, and we notice that n of these have both elements the same. The remaining n(n − 1) ways have the subscripts different and we can select half of them. So the dimension is n(n − 1)/2 + n = n(n + 1)/2. I leave you to work out the dimension of the space of symmetric k-tensors on R^n.
Note that the space Ω^n(R^n) always has dimension 1. Taking the ω defined by taking the value one on (e_1, e_2, e_3, …, e_n) in that order, we observe that we have a particularly simple alternating n-form on R^n. It is called the volume element, and its value on any set of n vectors in some order can be calculated using multilinearity. If we write each vector out as a column, the result is the determinant of the resulting n × n matrix. This is a good way to define the determinant.
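As a numerical illustration rather than a proof, and assuming numpy is available, the following checks that the determinant behaves as the volume element on R^3 should: multilinear, alternating, and equal to 1 on the standard basis taken in order. The name vol is mine:

```python
import numpy as np

rng = np.random.default_rng(1)

def vol(*vecs):
    # the volume element on R^3: write the vectors as columns, take det
    return np.linalg.det(np.column_stack(vecs))

u, v, w = (rng.standard_normal(3) for _ in range(3))

# multilinear in the first slot (and similarly in the others):
a, b = 2.0, -3.0
assert np.isclose(vol(a * u + b * v, v, w), a * vol(u, v, w) + b * vol(v, v, w))

# alternating: a swap flips the sign, so a repeated argument forces zero
assert np.isclose(vol(u, v, w), -vol(v, u, w))
assert np.isclose(vol(u, u, w), 0.0)

# value 1 on the standard basis in its given order
e = np.eye(3)
assert np.isclose(vol(e[:, 0], e[:, 1], e[:, 2]), 1.0)
```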
Note that we could write out a basis for Ω^k(R^n) for k ≤ n in terms of the possible choices of k rows of the n × k matrix formed by writing the k vectors side by side, taking the determinant of the resulting k × k submatrix. Thus alternating tensors are all about determinants or, alternatively, determinants are all about alternating tensors.
Exercise 3.1.12.

1. By evaluating an ω in the space of (2, 0)_T tensors on R^2 on the elements (e_1, e_1), (e_2, e_1), (e_1, e_2), (e_2, e_2) to get a, b, c, d respectively, show that ω is defined by a 2 × 2 matrix using suitable matrix operations.

2. Show that this is equivalent to acting on a pair of vectors by a 2-tensor φ obtained by writing the matrix as A and calculating a^T A b for vectors a and b.

3. Complete the scruffy arguments used to obtain the dimension of Ω^k(R^n), which took suitable basis elements of

    R^n × R^n × ⋯ × R^n   (k terms)

to define a set of multilinear maps, with the implied belief that we can extract a basis for Ω^k(R^n) by fixing suitable values. In particular show that a set of such maps is linearly independent and spans Ω^k(R^n).

4. Show that the symmetric (2, 0)_T tensors form a subspace. What is the dimension? Give a basis for it.

5. Do the same for the alternating (2, 0)_T tensors.

6. Show that the determinant acting on [x; y], [u; v] by taking the two vectors to vx − uy is an alternating 2-tensor on R^2.

7. Show that any other alternating 2-tensor on R^2 is a multiple of this by a real number.

8. Repeat for (0, 2)_T tensors and again for (1, 1)_T tensors.

9. Find the dimension of the linear space of all covariant k-tensors on R^n.

10. Find the dimension of the space of all alternating covariant k-tensors on R^n. Hint: start off in a small way by looking for a nonzero alternating 2-tensor on R and showing there aren't any. You have done alternating 2-tensors on R^2. The determinant is a basis for the alternating 3-tensors on R^3, and this generalises. The alternating 2-tensors on R^3 have a basis obtained by choosing two of the three rows of the matrix made up of the two vectors side by side, and producing the 2 × 2 determinant of the entries. This leads to three basis elements. Show this by looking at the choices of a pair of rows. Now generalise to higher dimensions. Finally generalise to higher order alternating tensors. It's a fair bit of work but will burn the elements of the exterior algebra of alternating tensors into your brain for ever.

Once you have done the work of finding out how you manipulate them, finding out the actual use and hence the point of the things is painless.
Remark 3.1.3. If we take the space of (2, 0)_T tensors on R^2 or R^3 and represent them as spaces of 2 × 2 and 3 × 3 matrices, you might think that we have captured all the properties needed for (2, 0)_T tensors and that they are merely matrices dressed up. You might conclude that the same holds for (0, 2)_T tensors and for (1, 1)_T tensors. We have a bit of a problem however if we decide to change the basis. Obviously this will change the matrix representing a particular tensor, even if we agree to use the same basis for both occurrences of V or V*, or if we use some new basis for V and the dual basis for V*. There has to be a matrix representing the transition from one basis to another, and you might think that the usual rule for transition matrices applies as in Linear Algebra. But the matrix here represents a bilinear map, not a linear one, and you would be wrong in general. While I shan't be concerned with change of basis in what remains, a proper course in tensors would certainly go into this, and you might want to play around with finding out what happens.
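If you do want to play with change of basis, the standard facts the remark hints at are that a bilinear form's matrix changes by congruence, P^T B P, while a linear map's changes by similarity, P^{-1} A P. A small numpy sketch, with all names my own, verifies both on random data:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3))   # matrix of a bilinear form in the old basis
P = rng.standard_normal((3, 3))   # columns: new basis vectors in old coordinates

# new-basis coordinates c correspond to old coordinates P @ c, so the
# form's matrix in the new basis is the congruence P^T B P
c1, c2 = rng.standard_normal(3), rng.standard_normal(3)
assert np.isclose((P @ c1) @ B @ (P @ c2), c1 @ (P.T @ B @ P) @ c2)

# contrast: a linear map's matrix transforms by similarity, P^{-1} A P
A = rng.standard_normal((3, 3))
A_new = np.linalg.inv(P) @ A @ P
assert np.allclose(P @ (A_new @ c1), A @ (P @ c1))
```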
Representing higher order tensors with matrices doesn't work. We would need at least cubes and tesseracts of numbers instead of squares or rectangles of them. Fortunately, there are neater ways of writing them down which we shall meet later. In the next chapter we shall be concerned with Maxwell's Equations for the Electromagnetic field, and this will take us as far as alternating 3-tensors on R^4. Physicists and old style mathematicians have an obsession with matrix representations which causes them serious problems for higher order tensors. We shall breeze through them without effort merely by using more powerful notations.

The space of 0-tensors has dimension one by definition.

If you ask an old-fashioned applied mathematician what a tensor is, he might well tell you that it is a matrix, but it transforms differently. This tells you more about old-fashioned applied mathematicians than it tells you about tensors.
3.1.4 The Tensor Algebra
A definition of something we have met before:

Definition 3.1.12. An algebra is a vector space with a left and right distributive multiplication on it. The multiplication is usually associative, so the elements of the space define a ring. The exception is Lie Algebras, which are not associative but instead satisfy the Jacobi Identity (see Exercise 2.9.4).

Definition 3.1.13. A graded algebra is a set of vector spaces indexed by the group Z_n for some n ∈ Z^+ (or the group Z), with a distributive multiplication on the set.
Given two covariant tensors we can multiply them. More specifically, given a k-tensor and an ℓ-tensor we can construct a (k + ℓ)-tensor as follows:

If

    ω : V × V × ⋯ × V → R   (k copies)

is a covariant k-tensor and

    η : V × V × ⋯ × V → R   (ℓ copies)

is a covariant ℓ-tensor, we define

    ω ⊗ η : V × V × ⋯ × V → R   (k + ℓ copies)

by taking ω on the first k elements, η on the last ℓ, and multiplying the results.

Exercise 3.1.13. Show that this gives a covariant (k + ℓ)-tensor.

This is called the tensor product in the tensor algebra. This makes the set of all covariant tensors a graded algebra. Graded algebras are quite common and you will meet them later if you do algebraic topology or theoretical physics. Physicists stick the word 'super' in front of a theory when it goes to a graded version, hence superstring theory. Usually they have only two levels so they talk about Z_2 gradings.
Note that the tensor product of alternating tensors is not in general alternating unless one of the tensors is a 0-tensor (a constant).

Note also that the tensor product, although not commutative, is associative and distributes over addition.
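The definition of ω ⊗ η can be coded almost verbatim. This is a minimal sketch in plain Python; the function name tensor_product is mine:

```python
def tensor_product(omega, k, eta, l):
    # omega eats k vectors and eta eats l; the product eats k + l vectors,
    # feeds the first k to omega and the last l to eta, and multiplies.
    def product(*vecs):
        assert len(vecs) == k + l
        return omega(*vecs[:k]) * eta(*vecs[k:])
    return product

dx = lambda v: v[0]   # the covariant 1-tensors dx, dy on R^2
dy = lambda v: v[1]

dx_dy = tensor_product(dx, 1, dy, 1)   # the 2-tensor dx ⊗ dy
print(dx_dy((3.0, 5.0), (2.0, 7.0)))   # dx of the first times dy of the second: 21.0
```

Bilinearity of the product in ω and η, as in the next exercise, follows directly from this formula.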
Exercise 3.1.14.

1. Show that if ω_1, ω_2 are k-tensors and ϕ is an ℓ-tensor then

    (ω_1 + ω_2) ⊗ ϕ = ω_1 ⊗ ϕ + ω_2 ⊗ ϕ

2. Show that if s, s′ and t, t′ are real numbers,

    (sω_1 + tω_2) ⊗ (s′ϕ_1 + t′ϕ_2)

is what you'd expect it to be on the optimistic assumption that ⊗ is a nice well behaved multiplication.

3. Give a basis for the space of (2, 0)_T tensors on R^2 in terms of dx and dy. Hint: note that dx_i : R^n → R is a covariant 1-tensor on R^n for any n. If n = 2 we call them dx and dy. Certainly the tensor product of any two 1-tensors is a 2-tensor. Show that every 2-tensor is a linear combination of such tensor products. (A count of basis elements might save you some trouble here.) Look back to Remark 3.1.2 to find the answer written down, with an explanation promised later. This is the explanation.

4. Represent the tensor dx ⊗ dy as a matrix over R^2.

5. Represent the tensor dx ⊗ dy as a matrix over R^3.

6. Repeat the last two for the tensor dx ⊗ dy − dy ⊗ dx. (Later we shall call this tensor dx ∧ dy.)
Exercise 3.1.15. Show by an example that not every two-tensor on R^2 can be written as a tensor product of one-tensors. This is obvious once you see it, but some people are tempted to suppose all higher order tensors are tensor products of one-tensors. The moral: one 2-tensor is not the same thing as two 1-tensors!
Example 3.1.2. We can write down a bit of the tensor algebra (not all of it, it is infinite dimensional) on R^2 without too much trouble. Note that I use dx to specify the linear map from R^2 to R which projects on the first component, and dy for the projection on the second component.

    Order        basis                                          isomorphic to
    T^k(R^2)     {dx_{i_1} ⊗ ⋯ ⊗ dx_{i_k}}                      R^(2^k)
    ⋮            ⋮                                              ⋮
    T^3(R^2)     {dx ⊗ dx ⊗ dx, …, dy ⊗ dy ⊗ dy}                R^8
    T^2(R^2)     {dx ⊗ dx, dx ⊗ dy, dy ⊗ dx, dy ⊗ dy}           R^4
    T^1(R^2)     {dx, dy}                                       R^2
    T^0(R^2)     {1}                                            R
Using the isomorphisms we can also write out the tensor multiplication in an admittedly strange form:

    ⊗      R            R^2          R^4          ⋯
    R      (R, ·)       (R^2, ∗)     (R^4, ∗)
    R^2    (R^2, ∗)     (R^4, ?)     (R^8, ?)
    R^4    (R^4, ∗)     (R^8, ?)     (R^16, ?)
    ⋮      ⋮            ⋮            ⋮

    Table 3.1.2

Here I have started with the 0-tensors, then the 1-tensors, and so on, and used the isomorphisms to indicate where the tensor product takes us. The symbol · means ordinary multiplication, and ∗ means scalar multiplication. The question marks remain to be filled in, but I will do the multiplication from T^1(R^2) × T^1(R^2) to T^2(R^2). In the bases given this is

    R^2 × R^2 → R^4
    ([a; b], [c; d]) → [ac; ad; bc; bd]

If you represent elements of T^2(R^2) by 2 × 2 matrices, you can get this result by the matrix multiplication

    [c; d] [a, b]

of a column by a row. For what that's worth.
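The map ([a; b], [c; d]) → [ac; ad; bc; bd] is exactly the Kronecker product of the coefficient vectors, so numpy can fill in this entry of the table for us (a sketch, assuming numpy is available):

```python
import numpy as np

a, b, c, d = 2.0, 3.0, 5.0, 7.0

# coefficients of (a dx + b dy) ⊗ (c dx + d dy) in the basis
# dx⊗dx, dx⊗dy, dy⊗dx, dy⊗dy -- the map written out above
print(np.kron([a, b], [c, d]))  # [ac ad bc bd] -> [10. 14. 15. 21.]
```

The same np.kron call on longer coefficient vectors fills in the remaining question marks, e.g. R^2 ⊗ R^4 → R^8.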
Exercise 3.1.16. Fill in the other question marks.
It is easy to generalise the tensor algebra so that we can take the tensor product of contravariant tensors, or of a covariant tensor with a contravariant tensor, or of two mixed tensors. These things are best understood by constructing simple examples, when they are ridiculously easy, rather than by looking at the formal definitions, which on first encounter are terrifying. Algebra is learnt by making up lots of examples. When you have done this you can easily see what is being said, and after a small amount of practice you can use the language to terrorise people unfamiliar with it. This is childish and you should be ashamed of yourself for actually frightening engineers, applied mathematicians and physicists this way.
Exercise 3.1.17. A covariant 1-tensor on R^n is a linear map from R^n to R and consequently an element of R^n*. We write dx_i : R^n → R for the projection which picks out the i-th component of each vector.

1. Show that the set {dx_i : i ∈ [1 : n]} is a basis for T^1(R^n).

2. Show that the set {dx_i ⊗ dx_j : i, j ∈ [1 : n]} is a basis for the space T^2(R^n), so any ω ∈ T^2(R^n) can be specified by the entries in an n × n matrix relative to this basis.

3. Take some ω ∈ T^2(R^3) and specify it by such a matrix; take two elements of R^3 and show how to evaluate ω on them by matrix multiplications of representations of the vectors with respect to the standard basis for R^3.

4. Choose a different basis for R^3; discuss what has to be done to the matrix in order to get it to still represent the same ω ∈ T^2(R^3).

5. The space T_1(R^n) of contravariant 1-tensors is the space of linear maps from R^n* to R and is hence R^n**. It is naturally isomorphic to R^n, and we can use the natural isomorphism to take {e_i : i ∈ [1 : n]} to be a basis for T_1(R^n). What is a basis for T_2(R^n)? For T_k(R^n)?

6. Since the dimension of T_2(R^n) is clearly n^2, we can represent any element ϕ ∈ T_2(R^n) by an n × n matrix as before. How does this transform under change of basis?
Remark 3.1.4. Note that this conflicts with an earlier definition which had dx_i as a linear map from Ṙ^n to R, but the reason for the confusion will become clear later. If you are rather more fussy than I am, you might want to do it right: if {e_i : i ∈ [1 : n]} are the standard basis elements for R^n, we can write the corresponding dual basis for R^n* as {e^i : i ∈ [1 : n]}. Then go through replacing dx_i with e^i throughout and you will have it in impeccable form.
Remark 3.1.5. We define a 0-tensor on any real vector space to be another and more exotic name for a real number. This means we can take tensor products of 0-tensors with k-tensors to get the scaling operation. You should have already worked this out from doing the exercises.
Exercise 3.1.18. If you read Darling's book you will discover that he goes about defining tensor algebras quite differently. He defines U ⊗ V for any real vector spaces U and V by proving a universality theorem which is somewhat obscure.

You can recover Darling's treatment as follows.

First note that if we have L(U, R) and L(V, R) we can define L(U, R) ⊗ L(V, R) to have elements f ⊗ g, which means for each f ∈ L(U, R) and g ∈ L(V, R) we take f ⊗ g : U × V → R by

    f ⊗ g(u, v) = f(u) · g(v)

where · is just multiplication in R. From now on I shall just write f(u)g(v) for this. It is clear that this is a bilinear map from U × V to R. It is also clear that L(U, R) ⊗ L(V, R) is a vector space under the usual operations of scalar multiplication and addition.

But L(U, R) is just U* and L(V, R) = V*. So we have defined U* ⊗ V* as a new vector space. It is therefore perfectly straightforward to define U** ⊗ V** as a new vector space. And if we identify U** with U and V** with V, we have U ⊗ V.

Show that this gives us Darling's treatment. Find the dimension of U ⊗ V in terms of the dimension of U and the dimension of V. Find an explicit representation for R^2 ⊗ R^3 and calculate

    [1; 2] ⊗ [4; 5; 6]
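For the final calculation, one concrete (basis-dependent) representation of R^2 ⊗ R^3 is the space of 2 × 3 matrices, with u ⊗ v going to the outer product; assuming numpy, the computation looks like this:

```python
import numpy as np

u = np.array([1, 2])
v = np.array([4, 5, 6])

m = np.outer(u, v)   # u ⊗ v represented as a 2 x 3 matrix
print(m)
# [[ 4  5  6]
#  [ 8 10 12]]

# dim(U ⊗ V) = dim U * dim V
assert m.size == u.size * v.size
```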
3.2 Tensor Fields on a Manifold
It makes sense to attach various things to manifolds. For example, to each point of S^2 we can attach a number. We can think of it as glued on to the sphere. Or we can imagine it as giving a function from the sphere to the real numbers. It might measure the temperature at the surface of a solid ball, perhaps. It makes sense to attach the numbers smoothly, so that the function is smooth. As you wander about the surface of the sphere the numbers will not change too sharply.
For a different example of attaching things to manifolds, we can attach vectors. Again we can think of it in various ways, and it can be used for various purposes: for example it might make sense to have the wind blowing at the surface of the sphere and to want to say by how much and in what direction at each instant. Or we might want to measure the tangential component in the surface of a magnetic or electric field.
We might want to attach tensors. Such things are important and useful, particularly to physicists. You can see that we might want to assign, for some purpose, to each point of a sphere a two by two matrix, and it would make sense to require the matrices to change smoothly as we moved over the surface of the sphere. To give an example of something quite practical that we might want to attach to a manifold, suppose we had the job of describing distances on S^2. There is, of course, a standard metric on S^2, but it is extrinsic and arises from the embedding in R^3. If you think about the intrinsic definition of S^2 you can see that there is absolutely nothing in it which allows us to talk of distances. It is rather reasonable to want to have an intrinsic notion of distance on S^2 and on other manifolds. In fact we can't do General Relativity without it. In the same way, there is no way to talk about the area of a region, or the angle between vectors on a sphere, except with reference to an embedding of the sphere in R^n, and it makes sense to do these things intrinsically, as is shown by the fact that we habitually do these things in the space in which we live, which might be S^3 for all we know. Or something more complicated. All this and much more is done by means of attaching tensors to a manifold, giving what are called tensor fields. We now investigate these things.
The idea of a vector field on a manifold is not too hard to grasp: technically a section of the tangent bundle takes each point of the manifold and assigns to it a vector attached to that point from the space of possible choices. If the manifold is R^2, we assign to a point an arrow, an element of what I shall call Ṙ^2, the tangent space at each point. Confusing Ṙ^2 with R^2, we get such things as

    V : R^2 → R^2,   [x; y] → [−y; x]

This of course is the same as the system of Ordinary Differential Equations

    ẋ = −y
    ẏ = x

in the notation of M213. It also shows you why I want to write Ṙ^2, with elements [ẋ; ẏ], for the codomain of a vector field.
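For this particular system the flow (in the sense of Section 2.8) can be written in closed form: the coefficient matrix generates rotations, so exp(tA) is rotation by angle t. A numpy sketch, with the names flow and p0 my own, checks numerically that this really does satisfy the ODE:

```python
import numpy as np

A = np.array([[0.0, -1.0],   # the vector field V(x, y) = (-y, x)
              [1.0,  0.0]])  # written as the linear system  pdot = A p

def flow(t, p):
    # for this A, exp(tA) is rotation by angle t (a standard fact)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s], [s, c]]) @ p

# check d/dt flow(t, p0) = A @ flow(t, p0) by a central difference
p0 = np.array([1.0, 2.0])
t, h = 0.7, 1e-6
deriv = (flow(t + h, p0) - flow(t - h, p0)) / (2 * h)
assert np.allclose(deriv, A @ flow(t, p0), atol=1e-5)
```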
The difference between the modern notation and the older one is that we are being careful to make it clear that the two spaces R^2 (in the expression for a vector field V : R^2 → R^2) are actually different. One is a space of locations and the other is a space of arrows, actually tangent vectors. You probably found the old fashioned notation confusing when you first met it, and indeed it is. The new notation is not only clearer, it generalises to manifolds, which the old notation does not. Even V : R^2 → Ṙ^2 is an improvement.
As mentioned earlier, as well as attaching vectors to points in a manifold we can attach other things. If we attach real numbers we merely get a map from the manifold to R, and we have now got a new way to think of such a map. Could we attach a matrix? To see the useful way to do such a thing, observe that the tangent bundle is merely obtained by taking as fibre the tangent space at each point. But we could start with the tangent space and replace it with its dual space. The thing that we get when we take the fibre bundle with the dual to the tangent space as fibre, and glue all these fibres together using the same method as for the tangent bundle, is called the cotangent bundle. It looks rather similar, and in R^2 the only difference would be that instead of attaching the space of columns of two numbers (representing the possible arrows at each point), we would be attaching the space of rows of two numbers (representing linear maps from R^2 to R). The spaces are isomorphic, but clearly the elements are not the same. Of course I have added my own bit of confusion here by confusing linear maps with their matrix representation, another isomorphism. On R^n this is harmless. On a manifold it generally is not, and again the isomorphism needs to be thought about.
The cotangent bundle is important in classical mechanics, where it corresponds to the momentum space, whereas the tangent space corresponds to the velocity space. The reason is that we have an energy function. If we look at R^2, its tangent bundle is what I have called R^2 × Ṙ^2. Now the kinetic energy (1/2)mv^2 is a function from Ṙ^2 to R,

    [v^1; v^2] → (m/2)((v^1)^2 + (v^2)^2)

and the derivative of this function is the row matrix

    [mv^1, mv^2]

This is an element of the cotangent bundle because it is a covector, not a vector. Cheerfully confusing the two leads to ghastly muddle further down the track.
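A quick numerical check, assuming numpy and with all names my own, confirms that differentiating the kinetic energy produces the momentum row covector:

```python
import numpy as np

m = 3.0  # an (assumed) mass

def E(v):
    # kinetic energy (1/2) m |v|^2 as a function on the tangent space
    return 0.5 * m * (v @ v)

def dE(v, h=1e-6):
    # numerical derivative: a ROW of partials, i.e. a covector at v
    return np.array([(E(v + h * e) - E(v - h * e)) / (2 * h) for e in np.eye(2)])

v = np.array([1.5, -2.0])
assert np.allclose(dE(v), m * v, atol=1e-4)   # the momentum row [m v^1, m v^2]
```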
As well as the dual of the tangent space attached to each point of a smooth manifold, we can attach tensors. The vector space of k-tensors such as

    ω : V × V × ⋯ × V → R   (k copies)

(any space of maps into R is a vector space) has a basis consisting of the multilinear maps determined by their values on each of the n^k combinations of basis elements of V. If V = T_a(M) for some manifold M and some a ∈ M, then ω will be specified relative to this basis by n^k numbers, where n is the dimension of V and hence of M, and k is the order of the tensor. In principle this could be an awful lot of numbers (but in practice it usually isn't).

This vector space of k-tensors can also be thought of as stuck on the manifold at a. The tangent bundle is just the locally trivial vector bundle over the manifold as base space with fibre the tangent space at each point. In exactly the same way we can take a locally trivial vector bundle with fibre the vector space of k-tensors on the tangent space at each point. At each a ∈ M this is a vector space of dimension n^k. Given that a finite dimensional real vector space has a topology which is invariant under isomorphisms, and given that the tensor bundles must be locally trivial (since the tangent bundle of R^n is a trivial bundle), the topology of each tensor bundle has as a base the cartesian products of those open sets in the manifold which are subsets of the open sets over which the bundle is trivial, with open sets in the fibre.

If we do this for every a ∈ M we get the k-tensor bundle of M. If we do it for all the possible k we get the full tensor bundle of M.

Then:

Definition 3.2.1. A smooth k-tensor field on a manifold M is a smooth section of the k-tensor bundle.
Awful Warning

There is scope for some confusion here. If we take the manifold to be R^n, the tangent bundle is, in my idiosyncratic notation, R^n × Ṙ^n, and this makes a vector field a section of this bundle. If we look to see what sort of tensor field this is, we see that the tensor field must assign to each point of R^n a multilinear map from some space V to R. There is only one possibility and that is to make V the dual space to Ṙ^n and take the linear maps. This means that by identifying the double dual with the original space Ṙ^n we get the right answer. Thus a vector field is a T_1 tensor field on the tangent space T_a(R^n) for every a ∈ R^n. So a covariant vector field in the ordinary sense, mentioned in the last chapter, is a contravariant tensor field. There is, if you like, an element of dualising in defining a tensor in the first place, so we have to dualise again to get rid of it.

The terminology is unfortunate since the tangent functor takes tangent vectors to tangent vectors and is covariant, but not many people have the nerve to change the traditional terminology. I certainly don't.

Some of the books by physicists make a pig's breakfast of all of this duality. Confusion is the natural state of man. And woman. Try to be clear about which space you are working in and avoid the muddle.

End of Awful Warning
If we take the alternating covariant k-tensors on the tangent space at every point of the manifold, the resulting smooth tensor field is called a differential k-form on the manifold. Similarly we can limit the section to taking values in the symmetric k-tensors. Both of these are important.

This sounds horrible, but if you look at 2-tensors, alternating or not, you can see that on the tangent space T_a(M) when M = R^n, any one of them can be represented nicely by an n × n matrix of numbers. And if we select one such matrix for each point a of M, then we get an n × n matrix of functions from the manifold to R. So as long as k is one or two we do not have anything very complicated. If k = 1 then we are talking about vector fields or covector fields, and if k = 2 we are sticking matrices onto the manifold at each point. If we never go beyond dimension 3 then the worst thing we have to imagine is a space with a 3 × 3 matrix associated with each point of the space. This is not really very bad. Admittedly this only gets us the applied mathematician's view of the world, but at least we know how to generalise it to higher dimensions and higher orders if it turns out to be necessary.
A serious issue with this simplified view of things is that the specification of the matrix representing a tensor on any T_a(M) requires us to choose a basis for T_a(M).

Figure 3.2.1: Shifting a vector between tangent spaces.

And if we now do the same for some T_b(M) for a ≠ b, then we need to choose a basis for representing tensors on T_b(M). But the spaces T_b(M) and T_a(M) don't have much to do with each other in general. So in what sense can it be made the 'same' basis? And if it is different, how do we ensure that the matrix of functions is going to behave nicely in representing the tensor fields? The tensor fields are perfectly respectable things, but if we insist on representing them by matrices of functions we have some serious problems. Note that if M = R^n, the tangent spaces can be shifted into each other in a natural way and the idea that we are using the 'same' basis for each of them makes sense. It all goes wrong if M = S^n. We need in this case something like an explicit isomorphism between the tangent spaces at different points.
To see what can go wrong here, imagine a sphere and take a point on the
equator. Attach a vector to this point, say one pointing along the equator.
I have shown this in ﬁgure 3.2.1. Now look to see what happens if you move
it parallel to itself along a line of longitude so that it moves up towards the
north pole. It seems reasonable to say that we are shifting the vector so it
is still pointing in the same direction, and still has the same length, despite
the fact that the vectors are all in diﬀerent tangent spaces. In other words I
am claiming that I can tell when two vectors in two diﬀerent tangent spaces
are ‘the same’. That this is insanely foolhardy becomes apparent if I go to
the same place by a diﬀerent route. Suppose I ﬁrst go around the equator.
My pink vector also goes around the equator, becoming rather purpler as it
goes. When it is opposite the starting point, I now move it up the curve of
longitude until it gets to the north pole. All the way, by both paths, I have
moved the vector so it is pointing in the ‘same’ direction, but the result is
a pair of vectors pointing in opposite directions. So cheerfully doing on a sphere what makes perfect sense on R^2 is fraught with problems.
One of the hardest things to do is to unlearn things you soaked up through
the skin when young and gullible. If you were encouraged to think that
isomorphic vector spaces were never worth distinguishing and you went all
sloppy in your thinking as a consequence, you now have the formidable job
of working it all out again. Don’t blame me, blame the scruﬀy bunch who
taught you manifest nonsense and blame yourself for buying it. Think of this
now and in the future, and regard it as a second Awful Warning.
Exercise 3.2.1.

1. Take M = S^1 and k = 2. Define a k-tensor field on S^1.

2. Take M = S^2 and k = 2. Define an alternating 2-tensor field (2-form) on S^2. Explain what this might have to do with the area of regions on a sphere and indicate how you might calculate the area of a region in S^2 with respect to your choice of 2-form.

3. Take M = R^3 and k = 2. Define a differential 2-form on R^3.
Note that we can talk about k-covariant and ℓ-contravariant mixed tensors and mixed tensor fields.
3.3 The Riemannian Metric Tensor
Recall from M213 that an inner product on a vector space V is a positive definite symmetric bilinear form, which is to say a map

    ⟨ , ⟩ : V × V → R

such that

1. ⟨ , ⟩ is bilinear; that is, ∀ u ∈ V, ⟨u, −⟩ : V → R is linear and ∀ v ∈ V, ⟨−, v⟩ : V → R is linear;

2. ⟨ , ⟩ is symmetric; that is, ∀ u, v ∈ V, ⟨u, v⟩ = ⟨v, u⟩;

3. ⟨ , ⟩ is positive definite; that is, ∀ u ∈ V, ⟨u, u⟩ ≥ 0 and ⟨u, u⟩ = 0 ⇒ u = 0.

Figure 3.3.1: (Bits of) some perfectly respectable Hilbert Spaces stuck on a manifold.

We can now summarise the above conditions by saying that an inner product for V is a symmetric covariant 2-tensor on V, with the additional property of being positive definite. Recall also from M213 that positive definiteness can be specified by observing that nondegenerate quadratic forms can be classified as to their general shape by diagonalising them and then rescaling the axes so that they are all diagonal with entries +1 along the diagonal down to some point after which they are −1. This gives the signature of the quadratic form, (1, 1, 1, …, 1, −1, −1, …, −1), for some number of positive and some number of negative ones. A positive definite form has all n entries +1. There are also degenerate forms where some of the entries after diagonalisation may be zero.
I shall only consider the positive deﬁnite forms here, although physicists want
to look at the general case of nondegenerate forms because in relativity we
have to put in time as an extra dimension, which gives a signature (1, 1, 1, −1)
or (3, 1). (Or (−1, 1, 1, 1) if you are a physicist. Physicists put time ﬁrst.
Some of them even use (1, 3), multiplying our form by −1. I shall outline a
reason for this in the next chapter.)
We now deﬁne a Riemannian metric tensor ﬁeld on a manifold as a positive
definite symmetric 2-tensor field. That is, at every point of the manifold
we attach, smoothly, some positive definite symmetric 2-tensor. This means
we have some bilinear function of a pair of tangent vectors at each point.
It is a daft name, and it would have been much more sensible to call it a
Riemannian inner product tensor ﬁeld, because it gives an inner product on
each tangent space. But it is too late to be sensible now. Figure 3.3.1 shows
some vectors in some of the tangent spaces to a sphere, and each pair has a
sort of local dot product in a perfectly respectable tangent space which is now
a perfectly respectable inner product space, in fact a perfectly respectable
Hilbert Space.
Each such inner product may be specified, via charts, as a symmetric $2 \times 2$ matrix in the case of the figure, each matrix $A(a)$ at the point $a$ on the sphere acting on a pair of tangent vectors $u_a$ and $v_a$ to give
\[ u_a^T A(a)\, v_a \]
but it would be better to regard it as a bilinear symmetric map which takes pairs of tangent vectors with their tails at some point of the manifold, and returns a real number. Thinking of it as a matrix makes it clear that there are three distinct numbers which depend on where we are on the sphere. For an $n$-manifold it will be $n(n+1)/2$ distinct numbers for each point of the manifold.
Or if you insist, you can think of the metric tensor as $n(n+1)/2$ distinct functions from the manifold to $\mathbb{R}$. So it makes sense to physicists to write such a thing as $g_{\mu\nu}$, where $\mu$ (mu) and $\nu$ (nu) range through the two possible values on a sphere, or the three possible values on a three-manifold, or the four on spacetime, with $g_{\mu\nu} = g_{\nu\mu}$. Of course this involves a choice of some charts to cover the manifold. It might be better to write $g_{\mu\nu}(a)$, for $a$ a point in the manifold, to remind ourselves that we have what is in effect a matrix-valued function on the manifold, but we don't.
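To make the matrix-valued-function point of view concrete, here is a small Python sketch (the particular $g$ is made up for illustration; nothing in the text pins it down): the same pair of tangent vectors gives a different number depending on the point at which the metric is evaluated.

```python
# Sketch with a made-up metric: g(a) is a symmetric positive definite matrix
# that varies smoothly with the point a = (x, y) of R^2.

def g(a):
    x, y = a
    return [[1.0 + x * x, 0.0],
            [0.0, 1.0 + y * y]]

def metric(a, u, v):
    # g_{mu nu}(a) u^mu v^nu, summed over mu, nu in {0, 1}.
    G = g(a)
    return sum(G[m][n] * u[m] * v[n] for m in range(2) for n in range(2))

u = [1.0, 0.0]
print(metric((0.0, 0.0), u, u))  # 1.0: at the origin g is the identity
print(metric((2.0, 0.0), u, u))  # 5.0: same vectors, different point
```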
Note that there is absolutely no machinery for calculating the dot product of a tangent vector $u_a$ at a point $a$ with a tangent vector $v_b$ at a different point $b$. This can be done in $\mathbb{R}^n$, but the Inner Product tensor field doesn't allow it.
If the symmetric tensor is always positive definite we call it a Riemannian metric, and the manifold with this tensor field is called a Riemannian manifold. If the symmetric tensor field has signature $(-1, 1, 1, 1)$ it is called a Lorentzian metric and the manifold is called a Lorentzian manifold. Physicists treat the universe we live in, including time, as a Lorentzian manifold. More generally, for any signature of form we say we have a semi-Riemannian metric. Bear in mind at all times that when a physicist talks about a metric on a manifold he means, almost always, an inner product on all of its tangent spaces, not necessarily positive definite but always nondegenerate. Usually it is either Riemannian or Lorentzian.
If two quadratic forms are positive definite, so is their sum. It makes sense to add them because they are just functions, and if $\langle u, u\rangle \ge 0$ and $\prec u, u \succ \ge 0$ then the sum is also nonnegative; if the sum is equal to zero, both the terms $\langle u, u\rangle$ and $\prec u, u \succ$ must be equal to zero, so $u = 0$. Moreover if we scale by a positive constant the result is another positive definite form, while if we scale by a negative constant the result is a negative definite form. Hence the symmetric covariant 2-tensors which are positive definite are not a vector subspace of the space of covariant 2-tensors on $V$, but they are an open subset of the vector space of symmetric covariant 2-tensors and therefore a manifold with a dimension.
Exercise 3.3.1. What is the dimension of the space of positive definite symmetric 2-tensors on $\mathbb{R}^2$? Hint: it is the same as the dimension of the vector space of symmetric 2-tensors, and if you represent a tensor by a matrix you need to count the number of independent entries in the matrix.
All the above makes sense if we use contravariant 2-tensors. In fact, since I
haven’t said anything about V , it might just as well be the dual space to
some other space.
Now we say it again formally:
Deﬁnition 3.3.1. A (positive deﬁnite) Riemannian metric for a manifold
$M$ is a positive definite symmetric covariant 2-tensor field on $M$.
What does this mean in computational terms? It is easiest to begin by looking at a very simple case, a metric tensor field on $\mathbb{R}^2$. The idea of such a tensor field on $\mathbb{R}^2$ has to do with inner products on $\dot{\mathbb{R}}^2$, in fact one such inner product for each point of $\mathbb{R}^2$. This can be grasped by thinking of the matrix of numbers operating on pairs of vectors in $\dot{\mathbb{R}}^2$ as being fixed for each point in $\mathbb{R}^2$, and as we move about in $\mathbb{R}^2$, we change the numbers in the matrix. So the numbers depend on where you are, and are given by smooth functions of your location in $\mathbb{R}^2$.
More generally, we take a manifold, we take a point $a$ on it, and look at the tangent space at $a$. Now we take the symmetric bilinear maps from this space to $\mathbb{R}$ which are positive definite. On $\mathbb{R}^n$, this inner product could be specified by taking $n$ independent vectors as a basis, then taking the dual space and the basis elements for that, calling them $(dx^1, dx^2, \dots, dx^n)$, and then writing the tensor as
\[ \sum_{i, j \in [1:n]} g_{ij}\, dx^i \otimes dx^j \]
where $g_{ij}$ is an $n \times n$ symmetric positive definite matrix. This follows from an exercise which I hope you did. Alternatively you can use the Einstein convention and just write $g_{ij}\, dx^i \otimes dx^j$. If you were a classical mathematician, or happen to be scruffy, you might leave out the $\otimes$, as if it is obvious to the meanest intellect what $dx^i\, dx^j$ means. You might perhaps imagine in a dim sort of way that it means you are multiplying a very, very little bit of the $i$th component of a vector with another very, very little bit of the $j$th component of a possibly different vector. In which case you are so confused there is no hope for you.
The standard inner product on $\mathbb{R}^2$ can be written in this form as the identity matrix. To calculate the inner product of $\begin{bmatrix} x \\ y \end{bmatrix}$ and $\begin{bmatrix} u \\ v \end{bmatrix}$ we simply compute
\[ [x\ \ y] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} \]
to get $xu + yv$. Doing the same with any other symmetric positive definite matrix instead of the identity will give us a new inner product.
For $M = \mathbb{R}^2$ it makes sense to take the same basis $(dx, dy)$ for elements of the cotangent space over every point, so it is possible to represent a Riemannian metric on $\mathbb{R}^2$ in the form
\[ [\dot{a}\ \ \dot{b}] \begin{bmatrix} g_{11}(x, y) & g_{12}(x, y) \\ g_{12}(x, y) & g_{22}(x, y) \end{bmatrix} \begin{bmatrix} \dot{c} \\ \dot{d} \end{bmatrix} \]
Again, this was an exercise which I hope you did.
This, when multiplied out, gives the required bilinear map from $\dot{\mathbb{R}}^2 \times \dot{\mathbb{R}}^2$ to $\mathbb{R}$. For any choice of two tangent vectors we get a real number. The $g_{ij}$ are smooth functions, for $i, j \in [1:2]$ (and $g_{12} = g_{21}$).
On $\mathbb{R}^3$ they would be smooth functions for $i, j \in [1:3]$. The matrix would still be symmetric, and at each point it would be positive definite (or in general have the required signature).
3.3.1 What this means: Ancient History
If you reflect a little on what a covariant 2-tensor on the tangent space is, you will see that we have bilinear maps from pairs of vectors in the tangent space at $a$ to $\mathbb{R}$, for every $a$ in the manifold.
Now tangent vectors in the old days of classical geometry were not thought of as elements of a perfectly respectable vector space, but were imagined to be infinitesimal elements in the base space. You can see that if you take a velocity vector at a point $a \in \mathbb{R}^2$ and travel along it for a very, very short time, you trace out, more or less, a line segment in $\mathbb{R}^2$. If you put a little arrow on its head (its tail being at $a$) you get the beginnings of a picture of a vector field, which we learnt how to draw in second year. If you have a uniform velocity parallel to the $X$-axis, of unit length and in the direction of increasing $x$, we can represent this by a tiny little arrow attached to $a$ and pointing in the direction of increasing $x$. Such a tangent vector should be rather small because it really represents a velocity through $a$, and hence an element of what I have called $\dot{\mathbb{R}}^2$, not a set of points in $\mathbb{R}^2$. The practicalities
are that velocities change and can change continuously so a big long vector
would be misleading. In fact any ﬁnite length vector is misleading, but we
can be sloppy and imagine that velocities have been turned into distances by
travelling for very short times.
The idea of an inﬁnitesimal time, one so small it was not eﬀectively distin
guishable from zero, but where ratios of inﬁnitesimals made sense and need
not be zero is one which seems natural to many people. My Mathematics
teacher at school talked of dy/dx being a ratio of numbers each of which was
infinitesimal, that is, not individually distinguishable from zero. I thought he was off his head. I still do. This isn't mathematics, it's nonsense¹. It does
however suggest mathematics. So although my Maths master was talking
incoherent garbage, there is something there which makes sense. And the
idea of inﬁnitesimal distances and times leading to a deﬁnite velocity, a sort
of garbled version of the deﬁnition of a limit, has been used a great deal in
times past.
One way to think of this which you may ﬁnd useful is contained in the
following example.
Let $c$ be the curve given by $x(t) = t$, $y(t) = 2\sin(t)$. We look at the origin, through which the curve passes. First we take the line segment from $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ to $\begin{bmatrix} u \\ 2\sin(u) \end{bmatrix}$ for some $u \neq 0$.
This line segment has two important numbers associated with it, the projection along the $x$-axis and the projection along the $y$-axis. I shall call such a line segment $\ell_u$, and the two numbers $\Delta x(\ell_u)$ and $\Delta y(\ell_u)$. The slope of the line segment is
\[ \frac{\Delta y(\ell_u)}{\Delta x(\ell_u)} \]
So I think of $\Delta y$ as assigning one number to each such $\ell_u$ and $\Delta x$ as assigning another, with the ratio being the slope of the line segment.
As we take shorter and shorter line segments, that is if we let u → 0 in the
example, the numbers get smaller but the ratio in general does not. I can
easily stipulate that the line segment has one end ﬁxed (at 0 in our case)
and the other end lies along the curve given.
¹It is possible to go through model theory and make these ideas respectable, but this requires a lot of logic. It is also possible to junk the lot and replace it with the idea of a limit. And finally it is possible to choose terminology which looks a lot like the incoherent rubbish but actually makes sense. This last is what we do, and it explains some of the more baroque aspects of our language.
Figure 3.3.2: ∆x and ∆y and dx and dy.
Now look at the tangent vector at $0$ defined by the curve above. It is a perfectly respectable vector in the tangent space $\dot{\mathbb{R}}^2_0$ at $0$. In fact I can take a basis for $\dot{\mathbb{R}}^2_0$ consisting of the vector of unit positive speed along the $x$-axis, which I have called $\mathbf{i}$ or $\dot{e}_1$ or $\partial/\partial x$ earlier, and the second vector being defined by a curve of positive unit speed along the $y$-axis, which I have called $\dot{e}_2$ and $\partial/\partial y$ earlier but might have called $\mathbf{j}$. In this basis it is easy to see that the tangent vector at $0$ defined by the curve is just
\[ \begin{bmatrix} 1 \\ 2 \end{bmatrix} \]
I have already defined $dx$ in the cotangent space as the linear map which sends this tangent vector to $1 \in \mathbb{R}$ and $dy$ as the linear map which sends it to $2$.
So dx and dy do to tangent vectors what ∆x and ∆y do to line segments
in the original space. I have shown the idea in ﬁgure 3.3.2. Note that we
can say that dy/dx for this tangent vector is just 2 by straight division. And
of course this is precisely what we get when we diﬀerentiate 2 sin(x) at the
origin, which is not exactly a surprise.
Classically, the idea of ∆x was what you were probably taught at school:
it was a “little bit of x”, and ∆y was a little bit of y, but you were really
looking at line segments along curves, and ∆x and ∆y are probably better
thought of as maps from line segments to R. It is easy to see that with this
way of looking at things, the claim
\[ \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \]
Figure 3.3.3: A new rule for measuring distances of points from the origin.
makes sense provided we specify what we really mean by the terms. This would involve saying that we are calculating $\Delta x(\ell)$ and $\Delta y(\ell)$ for line segments $\ell$ joining some fixed point on a curve to other points, and the limit means that the other points are taken to be getting closer and closer to the fixed point. All this explanation was unfortunately regarded as not really part of the mathematics and consequently got left out of the notation. If we intend to study the subject on manifolds we have to put it back in.
The idea, then, that $\Delta x$ means "a little bit of $x$" and $dx$ means "a very, very little bit of $x$" (so little that it is infinitesimal) still survives in the literature. And the classical mathematicians wrote
\[ ds^2 = dx^2 + dy^2 \]
to be an infinitesimal version of Pythagoras' Theorem, and then used it to find the length of curves. These days we define everything through limits, which you spend a lot of time doing more or less rigorously in first year. At least, that was the idea.
So instead of writing
\[ [x\ \ y] \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]
as the square of a new norm on $\mathbb{R}^2$, people wrote
\[ [dx\ \ dy] \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix} \]
as the same thing with infinitesimals, to give infinitesimal sizes of infinitesimal vectors.
If you take a symmetric positive definite matrix and use it to define a new norm (squared) on $\mathbb{R}^2$ you can look at the set
\[ \left\{ \begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^2 : ax^2 + 2bxy + cy^2 = 1 \right\} \]
as the set of points at distance 1 from the origin. This is an ellipse, as in figure 3.3.3. To calculate the distance of the indicated point from the origin we need to measure the length of the orange line by taking the length of the blue part of it as one unit. Alternatively we scale the ellipse until it passes through the point and then look to see what the scaling factor was. This makes the distance of the point from the origin about 2 units.
If you do the same thing using a Riemannian metric, the ellipse changes as you move around the space. One can follow the idea of the old fashioned geometers by drawing little (infinitesimal?) ellipses around every point of the space. They thought of this in terms of $a\, dx^2 + 2b\, dx\, dy + c\, dy^2$, where $dx^2$ means take an infinitesimal amount of $x$ and square it. So to calculate the distance along a curve in $\mathbb{R}^2$ equipped with a Riemannian metric, you took some finite set of points along the curve, one at the start and one at the end, took the ellipse at each point, and measured the distance to the next point, and then added them all up. Then you repeated with more and more points on your curve. In the limit we get the right answer. I have shown a stage in this process in figure 3.3.4. Nobody, of course, actually did it by taking limits; they used Calculus, which is quicker and less effort.
Deﬁnition 3.3.2. A geodesic on a manifold with a riemannian metric tensor
is a curve joining two points such that its length is less than or equal to that
of any other curve joining the points.
Exercise 3.3.2. Show that in $\mathbb{R}^n$ with the euclidean metric, geodesics are straight line segments. Hint: This is a standard calculus of variations problem. Google it if stuck.
Exercise 3.3.3. Describe the geodesics on the (ﬂat) torus.
Of course the idea of inﬁnitesimal ellipses is daft: the question is how to
rescue the idea so that it gives us a way of computing the length of a curve
in a space where distances keep changing. If the ellipses, or more properly the
positive deﬁnite symmetric quadratic forms, are perfectly respectable things
deﬁned on the tangent space at each point, we get what we need.
Figure 3.3.4: Length of a curve via a Riemannian metric.
Example 3.3.1. On a suitable open set in $\mathbb{R}^2$ I define a new metric by saying that locally it is given by the matrix
\[ \begin{bmatrix} 1 + xy & 0 \\ 0 & x^2 + y^2 \end{bmatrix} \]
Find the length of the curve along the parabola $y = x^2$ from the origin to $x = y = 1$ in this metric.
Solution:
The ordinary formula for the length of a curve $c$ is $\int_c ds$, where
\[ ds^2 = dx^2 + dy^2 \]
is the 'infinitesimal path length'. We can write this as
\[ ds^2 = [dx\ \ dy] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix} \]
Our new and improved inner product changes from place to place, but it gives rise to a norm just as the old one does, and it is a norm on the tangent space. We therefore have
\[ ds_1^2 = [dx\ \ dy] \begin{bmatrix} 1 + xy & 0 \\ 0 & x^2 + y^2 \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix} \]
for the new way of measuring the differential path length, and so the length of the path along the parabola, with $x = t$, $y = t^2$, is
\[ \int_0^1 \sqrt{(1 + t^3)\cdot 1 + (t^2 + t^4)(4t^2)}\; dt \approx 1.49958 \]
where the approximation is done using Mathematica. This compares with about 1.47894 using the standard metric.
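The integral can be checked without Mathematica. Here is an illustrative Python sketch (names mine) using composite Simpson's rule on the speed of the curve $(t, t^2)$ in the metric above:

```python
import math

def speed(t):
    # ds_1/dt along x = t, y = t^2: the metric entries are
    # g_11 = 1 + xy = 1 + t^3 and g_22 = x^2 + y^2 = t^2 + t^4,
    # and (dx/dt, dy/dt) = (1, 2t).
    return math.sqrt((1 + t ** 3) * 1.0 + (t ** 2 + t ** 4) * (2 * t) ** 2)

def simpson(f, a, b, n=1000):
    # Composite Simpson's rule with n (even) subintervals.
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

print(simpson(speed, 0.0, 1.0))  # ≈ 1.4996, agreeing with the quoted 1.49958
```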
Example 3.3.2. Find the path length of the spiral $r = \theta$ for $0 \le \theta \le 2\pi$ in the metric on $\mathbb{R}^2$ given by
\[ ds_2^2 = [d\theta\ \ dr] \begin{bmatrix} r^2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} d\theta \\ dr \end{bmatrix} \]
Solution: This is just the usual metric on $\mathbb{R}^2$ disguised by using polar coordinates, since $ds_2^2 = (r\, d\theta)^2 + (dr)^2$ is the usual way of calculating the 'infinitesimal path length', and the answer is
\[ \int_0^{2\pi} \sqrt{t^2 + 1}\; dt \approx 21.2563 \]
This compares with $2\sqrt{2}\,\pi \approx 8.885766$ in the euclidean metric on the $(\theta, r)$ space. Well, in that space the curve is a straight line.
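Both numbers in the solution are easy to reproduce; an illustrative Python sketch (Simpson's rule is written out again so the snippet stands alone):

```python
import math

def simpson(f, a, b, n=1000):
    # Composite Simpson's rule with n (even) subintervals.
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

# Speed of (theta, r) = (t, t) in ds_2^2 = r^2 dtheta^2 + dr^2: sqrt(t^2 + 1).
polar = simpson(lambda t: math.sqrt(t * t + 1), 0.0, 2 * math.pi)
# In the euclidean metric on the (theta, r) chart the curve is the straight
# line from (0, 0) to (2*pi, 2*pi); its speed is the constant sqrt(2).
flat = simpson(lambda t: math.sqrt(2.0), 0.0, 2 * math.pi)
print(polar)  # ≈ 21.2562, the text's 21.2563
print(flat)   # ≈ 8.8858 = 2*sqrt(2)*pi
```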
Exercise 3.3.4. Draw the curve and obtain a crude estimate of the length if
possible with upper and lower bounds to see if you think this is the length in
the usual metric.
Exercise 3.3.5. Find the path length of the above spiral using the metric given by
\[ ds_2^2 = [d\theta\ \ dr] \begin{bmatrix} r^2 & 0 \\ 0 & r^4 \end{bmatrix} \begin{bmatrix} d\theta \\ dr \end{bmatrix} \]
Remark 3.3.1. I should feel ashamed of myself for writing out expressions such as the above for specifying a metric (or more accurately the square of a norm), and should undoubtedly have written
\[ \omega = r^2\, d\theta \otimes d\theta + r^4\, dr \otimes dr \]
or something similar. I have tried to give you something which will relate the correct formulation to the things that the classical mathematicians did (and which you may find at least as badly expressed in works on tensors and tensor fields written by the congenitally confused). The bad notation can be used to do sums quite quickly, so it is not wholly bad. Much depends on whether you want to do an awful lot of sums without thinking about what you are doing. And face it, who would want to think while doing monster sums if they didn't have to?
Exercise 3.3.6.
1. Find the length of the path $A$: $r = 1$ for $0 \le \theta \le \pi/2$, with respect to the metric given by $r^2\, d\theta \otimes d\theta + dr \otimes dr$. (Note that in the $(\theta, r)$ space this gives the same answer as the usual metric, that is, treating $(\theta, r)$ as if it were a piece of $\mathbb{R}^2$ with the euclidean metric.)
2. What is the length of the parallel line $B$: $r = 2$ for $0 \le \theta \le \pi/2$ in the new metric?
3. What is the length of the line $C$: $r = 0$ for $0 \le \theta \le \pi/2$? I show the three lines in figure 3.3.5.
Figure 3.3.5: Three lines of different lengths.
4. Explain what has gone wrong. The length of a line segment with distinct end points cannot be zero in a metric.
5. On figure 3.3.5, draw the curve $r = 1/(\sin(\theta) + \cos(\theta))$ for $\theta \in [0, \pi/2]$. Calculate its length with respect to the new metric. Hint: you might try using NIntegrate in Mathematica.
6. Show the curve is a geodesic in the space; in particular it is shorter than the 'straight' line $A$. Hint: transform back to $\mathbb{R}^2$ with the euclidean metric. Find out how to do this by reading on a bit.
Note that the Riemannian metric tensor enables us to make sense of the angle
at which two curves cross. Without this it makes no sense at all to say that
curves intersect at right angles on a manifold, because in diﬀerent charts we
could get totally diﬀerent answers. We feed in two tangent vectors, one along
each curve, at the point of intersection so they are both in the same tangent
space. The Riemannian metric tensor gives us a number out, and this leads us to the angle just as in $\mathbb{R}^n$.
Suppose now that I want to compute the path length of a curve on a manifold. Let us say I have a curve $c : [0, 1] \to S^2$ on $S^2$. I want to compute its length. I take a chart containing some of the curve, say $u : U \to \mathbb{R}^2$, and this takes the bit of the curve in $S^2$ to a bit of curve in $\mathbb{R}^2$. I have a Riemannian metric tensor on the manifold. The picture of figure 3.3.6 shows a local parametrisation by $u^{-1}$ of a patch containing some of the curve. The composite $u \circ c$ 'shifts' the curve to the codomain of $u$, the open set $u(U)$ in $\mathbb{R}^2$.
Figure 3.3.6: Length of a curve on a manifold via a Riemannian metric.
Now I want to know what happens to the covariant 2-tensor field on $S^2$ which tells me how to measure distances there. I claim that $u^{-1}$ induces a covariant 2-tensor field on $u(U)$. This requires a certain amount of thought.
We have the picture from the last chapter:
\[
\begin{array}{ccc}
T_a X & \xrightarrow{\ f_*\ } & T_{f(a)} Y \\
\downarrow \pi_X & & \downarrow \pi_Y \\
X & \xrightarrow{\ \ f\ \ } & Y
\end{array}
\]
Now a linear map from $T_a(X)$ to $\mathbb{R}$ is taken by differentiable $f$ to a linear map from $T_{f(a)}(Y)$ to $\mathbb{R}$. We can see this by using the natural equivalence of $V^{**}$ with $V$, or we can simply send $\alpha : T_a(X) \to \mathbb{R}$ to $\alpha \circ f_*^{-1} : T_{f(a)}(Y) \to \mathbb{R}$ (which needs $f_*$ to be invertible, as it is for the diffeomorphisms we care about). These are the same thing.
Now we know what $f_*$ is on tangent vectors: it is just the derivative of $f$ at each point. So on $\mathbb{R}^2$, if we have $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by
\[ f \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u \\ v \end{bmatrix} \]
we can write
\[ [du\ \ dv] = [dx\ \ dy] \begin{bmatrix} \partial f^1/\partial x & \partial f^1/\partial y \\ \partial f^2/\partial x & \partial f^2/\partial y \end{bmatrix} \]
Similarly, given $f : X \to Y$ for $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$, we can transform the covectors $dx$, $dy$ in a way strictly dual to the way we can carry a tangent vector on $X$ to one on $Y$. In fact it works better for covectors, because a covector field on $Y$ is pulled back to one on $X$ (and it is certainly not generally true that a vector field on $X$ is taken to one on $Y$).
Exercise 3.3.7. Why not?
Exercise 3.3.8. Deﬁne the pullback of a covector ﬁeld on Y to one on X.
Example 3.3.3. Let
\[ P : \mathbb{R}^2 \to [0, 2\pi) \times [0, \infty), \qquad \begin{bmatrix} x \\ y \end{bmatrix} \mapsto \begin{bmatrix} \theta \\ r \end{bmatrix} \]
be the polar coordinate map. Then we can write $P^{-1}$ as
\[ x = r\cos(\theta), \qquad y = r\sin(\theta) \]
Now we have
\[ dx = \frac{\partial x}{\partial \theta}\, d\theta + \frac{\partial x}{\partial r}\, dr, \qquad dy = \frac{\partial y}{\partial \theta}\, d\theta + \frac{\partial y}{\partial r}\, dr \]
hence
\[ dx = -r\sin(\theta)\, d\theta + \cos(\theta)\, dr, \qquad dy = r\cos(\theta)\, d\theta + \sin(\theta)\, dr \]
whereupon we can calculate the various tensor products in the inner product:
\[ dx \otimes dx = (-r\sin(\theta)\, d\theta + \cos(\theta)\, dr) \otimes (-r\sin(\theta)\, d\theta + \cos(\theta)\, dr) \]
\[ = r^2\sin^2(\theta)\, d\theta \otimes d\theta - r\sin(\theta)\cos(\theta)\, dr \otimes d\theta - r\sin(\theta)\cos(\theta)\, d\theta \otimes dr + \cos^2(\theta)\, dr \otimes dr \]
and similarly for $dx \otimes dy$, $dy \otimes dx$ and
\[ dy \otimes dy = r^2\cos^2(\theta)\, d\theta \otimes d\theta + r\sin(\theta)\cos(\theta)\, dr \otimes d\theta + r\sin(\theta)\cos(\theta)\, d\theta \otimes dr + \sin^2(\theta)\, dr \otimes dr \]
Hence we have
\[ ds \otimes ds = dx \otimes dx + dy \otimes dy = r^2\, d\theta \otimes d\theta + dr \otimes dr \]
This, when translated into matrix terms and old fashioned $dx^2 + dy^2$ language, gives us that the standard identity matrix on $\mathbb{R}^2$ for the euclidean metric tensor goes over to the matrix
\[ \begin{bmatrix} r^2 & 0 \\ 0 & 1 \end{bmatrix} \]
of example 3.3.2.
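The whole calculation collapses to a matrix identity: the pulled-back metric has matrix $J^T I J = J^T J$, where $J$ is the Jacobian of $P^{-1}$ with respect to $(\theta, r)$, and this should come out as $\mathrm{diag}(r^2, 1)$ at every point. A Python sketch (function name mine) checking this numerically:

```python
import math

def pulled_back_metric(theta, r):
    # Jacobian of (x, y) = (r cos(theta), r sin(theta)); columns are
    # d(x, y)/d(theta) and d(x, y)/d(r).
    J = [[-r * math.sin(theta), math.cos(theta)],
         [r * math.cos(theta), math.sin(theta)]]
    # (J^T J)_{ij} = sum_k J[k][i] J[k][j]: the euclidean metric pulled back.
    return [[sum(J[k][i] * J[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

G = pulled_back_metric(0.7, 2.5)
print(G[0][0])  # ≈ 6.25 = r^2
print(G[0][1])  # ≈ 0.0
print(G[1][1])  # ≈ 1.0
```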
Example 3.3.4. Suppose $f : [0, 1] \to \mathbb{R}^2$ is a curve in $\mathbb{R}^2$ and we wish to compute its length. Writing $f(t) = (x(t), y(t))^T$, we have that the length of $f$ is
\[ \int_{[0,1]} du \]
where $du$ is the pullback from $\mathbb{R}^2$ by $f$ of the length measure $ds$ on $\mathbb{R}^2$. The usual (Lebesgue) measure on $[0, 1]$ is written $dt$. This gives us:
\[ dx \otimes dx = (dx/dt\; dt) \otimes (dx/dt\; dt) = (dx/dt)^2\; dt \otimes dt \]
\[ dy \otimes dy = (dy/dt\; dt) \otimes (dy/dt\; dt) = (dy/dt)^2\; dt \otimes dt \]
\[ ds \otimes ds = dx \otimes dx + dy \otimes dy \]
\[ du = f^* ds \]
\[ \Rightarrow\quad du \otimes du = \left( (dx/dt)^2 + (dy/dt)^2 \right) dt \otimes dt \]
\[ \Rightarrow\quad du = \sqrt{(dx/dt)^2 + (dy/dt)^2}\; dt \]
\[ \Rightarrow\quad \int_{[0,1]} du = \int_{[0,1]} \sqrt{(dx/dt)^2 + (dy/dt)^2}\; dt \]
A familiar formula, usually derived somewhat less formally but using essentially the same ideas. It is worth going through this argument while thinking of $dx/dt$ as the amount of stretching $f$ does to the unit interval in the $x$ direction when it takes $[0, 1]$ into $\mathbb{R}^2$ (and likewise $dy/dt$). Working through the new jargon for a simple, friendly example makes you appreciate how the new jargon actually does a good job of articulating geometric ideas of what is going on.
Exercise 3.3.9. Write out the matrix and tensor product forms of the spherical and cylindrical polar coordinate transforms of $\mathbb{R}^3$ and confirm that the euclidean metric goes to what it ought to.
Returning to the tensor field exported by $u^{-1}$ to $\mathbb{R}^2$: the map $u$ distorts distances, but it also distorts the metric tensor in exactly the right way, so that if we use the $u^{-1}$-induced metric tensor to measure path length in $\mathbb{R}^2$ we get the right answer for the metric tensor on $S^2$.
You might be surprised at first that we transport a metric tensor field on a space $X$ to one on a space $Y$ by a homeomorphism $u^{-1}$ which is the inverse of the map $u : X \to Y$. Actually this makes good sense. Suppose we take the simplest case of the usual metric tensor field on $\mathbb{R}$, which assigns to the interval $[a, b]$ the length $b - a$. Map $\mathbb{R} \to \mathbb{R}$ by $u(x) = 2x$. Now we want a new, shiny metric tensor field on the codomain which gives the image of $[a, b]$ the same length, $b - a$, so we can feel we have shifted not just the interval $[a, b]$ but also the metric with which to measure its length.
Writing the length in the domain as $\int_a^b dx$, we see that we can get the (usual, boring, old fashioned) length of the image in the codomain by writing it as
\[ \int_{x=a}^{x=b} du = \int_a^b 2\, dx \]
where $du = 2\, dx$ follows from $u = 2x$.
You might have felt a bit happier had I written this as
\[ \int_a^b \frac{du}{dx}\, dx = \int_a^b 2\, dx \]
Much depends on your previous experience of Calculus.
If we want to have length $b - a$ for the new, shiny length in the codomain, which I shall call the $u$-space, we need to use $du^{-1}$. Now the classical mathematicians, Gauss and his mob, would cheerfully write things not very different from
\[ u^{-1}(x) = x/2, \qquad du^{-1} = \tfrac{1}{2}\, dx \]
and then, using the metric given by $du^{-1}$ on the $u$-space, the length of the interval $[2a, 2b]$ in the $u$-space with the 'right' metric is
\[ \int_{x=2a}^{x=2b} \tfrac{1}{2}\, dx = \tfrac{1}{2}(2b - 2a) = b - a \]
Obviously this works for all linear maps from $\mathbb{R}$ to $\mathbb{R}$, not just $2x$.
Exercise 3.3.10. Show it works for $u(x) = -2x$.
If $u$ is a diffeomorphism from $\mathbb{R}$ to $\mathbb{R}$ then we have something like
\[ \frac{dx}{du} = D u^{-1} = \frac{1}{du/dx} \]
by the inverse function theorem. The interval $[a, b]$ in the $x$-space is taken to $[u(a), u(b)]$ if $u$ is increasing, which I can assume it is without loss of generality, since if it isn't I just compose with the map that multiplies everything by $-1$ and rename the composite to be $u$. Now the length of this in the usual metric is $\int_a^b (du/dx)\, dx$. If I choose the metric given by $du^{-1}$ then I replace the old, boring metric $dx$ with the new, shiny, transported metric $\frac{1}{du/dx}\, dx$, and then the length is
\[ \int_a^b \frac{du}{dx} \cdot \frac{1}{du/dx}\, dx = b - a \]
This tells us that if we use the metric transported by $u^{-1}$ to measure the length of a curve in $\mathbb{R}$ transported by $u$, we get the same length.
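Here is a numerical sketch of that invariance for a made-up increasing diffeomorphism $u(x) = x^3 + x$ (everything here is illustrative; the bisection inverse stands in for $u^{-1}$): the image of $[a, b]$, measured with the transported metric $\frac{1}{du/dx}\, dx$, has length $b - a$.

```python
def u(x):
    return x ** 3 + x

def du(x):
    # Derivative of u; always positive, so u is an increasing diffeomorphism.
    return 3 * x ** 2 + 1

def u_inverse(y, lo=-10.0, hi=10.0):
    # Bisection; u is strictly increasing so the preimage is unique.
    for _ in range(80):
        mid = (lo + hi) / 2
        if u(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def transported_length(a, b, n=2000):
    # Midpoint-rule integral of the transported metric 1/u'(u^{-1}(x))
    # over the image interval [u(a), u(b)].
    lo, hi = u(a), u(b)
    h = (hi - lo) / n
    return sum(h / du(u_inverse(lo + (k + 0.5) * h)) for k in range(n))

print(transported_length(0.5, 1.5))  # ≈ 1.0 = b - a
```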
Exercise 3.3.11. Show this works just as well if the curve is in $\mathbb{R}^2$ and $u : \mathbb{R}^2 \to \mathbb{R}^2$ is a diffeomorphism. The length of the curve in the $u$-space, measured by the metric transported to the $u$-space by $u^{-1}$, is the same in both spaces. Hint: Try it for linear maps $u$ first.
Exercise 3.3.12. By taking two distinct charts both covering a curve on $S^2$, and hence related by a diffeomorphism, show that whichever chart you use, if you induce the right metric tensors on $\mathbb{R}^2$ from the charts and calculate the lengths by both of them, they agree on the length of the curve on $S^2$. Note that you don't really need $S^2$ at all for this exercise; it is about the way covariant tensor fields on $\mathbb{R}^2$ transform under diffeomorphisms.
Exercise 3.3.13.
1. Show that contravariant tensors of any order on a vector space $U$ are carried by linear maps $f : U \to V$ to contravariant tensors of the same order on the vector space $V$.
2. Show that covariant tensor fields of any order on a smooth manifold $V$ are carried to covariant tensor fields of the same order on a manifold $U$ by the inverse of a diffeomorphism $h : V \to U$.
3. Let $\langle\,,\,\rangle$ be an inner product on $V$. Show that it induces an isomorphism between $V$ and $V^*$.
4. Does an isomorphism from $V$ to $V^*$ always give an inner product on $V$?
5. Deduce that if we have an inner product on $V$ we can induce a dual inner product on $V^*$ and vice versa, and hence that it would be possible to define a Riemannian metric tensor as a contravariant tensor field.
6. Explain why this is not usually done.
7. Find a metric tensor on $S^2$ which gives the usual notions of distance and angles between intersecting curves.
8. Using this tensor, confirm that the angle at the north pole between the curves obtained by travelling around the great circle in the $x$-$z$ plane and the great circle in the $y$-$z$ plane is what common sense says it should be.
We now have enough machinery to say quite a lot about what we mean by the geometry of a space, and in particular we can say something about curvature. If I am given a curve on a manifold, and a riemannian metric structure on it, we can cover the curve with open sets homeomorphic to open sets in $\mathbb{R}^n$, shift the curve (in pieces if necessary) back to $\mathbb{R}^n$, shift the metric structure, and compute the length. But how do we specify a curve on the manifold in the first place? And how do we specify a riemannian structure on it? We can do that with charts too, if all else fails.
Note that this is all intrinsic: it does not require an embedding of the manifold in $\mathbb{R}^n$. If we do have an embedding, we can derive a riemannian structure for the manifold from the usual euclidean metric on the enclosing space.
Exercise 3.3.14. How?
But if we are to say anything about the geometry of the space of the universe
in which we live, it has to be done intrinsically. If there is an embedding of
the universe in some higher dimensional euclidean space, we cannot ever
know anything about it, and so it is idle to talk about it.
A useful reference for much of the material covered so far is Volume One of
A Comprehensive Introduction to Diﬀerential Geometry by Michael Spivak.
A quick glance should persuade you that there is a rather considerable depth
in the material. Also bear in mind there are several volumes.
The idea of a vector bundle with bundle maps which are linear on each ﬁbre
is of great importance in theoretical physics. Essentially, ﬁelds are sections of
suitable locally trivial vector bundles. We can specify a general locally trivial
vector bundle over a manifold by taking local trivialisations and specifying
a way of gluing the local products together. This often involves a group: in the case of the Möbius bundle, for example, the group $\mathbb{Z}_2$ has everything to do with the bundle structure. Quantum Chromodynamics and Gauge
to do with the bundle structure. Quantum Chromodynamics and Gauge
Invariance are describable in terms of the structure of locally trivial vector
bundles. The physicists Yang and Mills were obliged to reinvent some of the
ideas well known to diﬀerential geometers, a good argument for doing serious
mathematics before tackling theoretical physics.
3.4 Geometry
Given a Riemannian manifold $M^n$, which we assume is compact and path connected, for any $a, b \in M^n$ we can take a smooth path between $a$ and $b$ and compute its length. This gives a map from the space of smooth paths joining $a$ and $b$ to $\mathbb{R}$. Of all possible such paths, we may hope that there is one whose length is a minimum, a geodesic on the manifold. It is not entirely trivial to show that such a path exists.
Exercise 3.4.1. Show that if $M^n$ is not compact there may be points $a, b$ such that there is no path of minimum length joining them.
When this can be done we assign the distance between $a$ and $b$ to be this minimum length. Note that the infimum of the path lengths may exist even though there is no path having it as its length.
This makes the Riemannian manifold a metric space.
Exercise 3.4.2.
1. Prove that last claim.
2. Show that $T^2$, $S^2$, $\mathbb{RP}^2$ and $K^2$ all have the structure of Riemannian manifolds, and give the metric arising.
Definition 3.4.1. A map $f : (X, d) \to (Y, e)$ between metric spaces is an isometry iff it preserves distances, i.e. iff
\[ \forall\, a, b \in X, \quad e(f(a), f(b)) = d(a, b) \]
If we take a small ball centred on the north pole of S
2
as in the diagram,
ﬁgure 3.4.1, we can make it a ball in the metric of some radius r.
There is certainly a diﬀeomorphism between this ball and the ball of radius
r centred on the origin in R
2
. However it is easy to see that the length of the
perimeter is 2πr for the ball in R
2
but is less than this for the ball on S
2
. So
there cannot be an isometry between the two balls.
Exercise 3.4.3. Provide a convincing argument for these claims.
On the other hand, there is an obvious isometry between any two balls of the
same radius in R^2, and also one between any two balls of the same sufficiently
small radius in S^2. A shift does it in R^2 and a rotation in S^2.
Definition 3.4.2. If X^n is a Riemannian manifold and for any two points
a, b ∈ X there is a smooth isometry f : X → X with f(a) = b, then we say
the geometry is homogeneous.
Figure 3.4.1: A ball (disc) on S^2.
Definition 3.4.3. If X^n and Y^m are homogeneous Riemannian manifolds
and for any sufficiently small ball B in X there is a smooth map f taking B
isometrically to a ball on Y, then we say that X and Y have the same local
geometry.
Exercise 3.4.4.

1. Show that if two homogeneous manifolds have the same local geometry
they have the same dimension.

2. Show that having the same local geometry is an equivalence relation
on homogeneous Riemannian manifolds.

3. Show that the flat torus defined by gluing is a homogeneous Riemannian
manifold with the same local geometry as R^2.

4. Show that the cylinder S^1 × R, as the subspace (cos(t), sin(t), z)^T
of R^3, has the same local geometry as R^2.

5. Show that RP^2 is a homogeneous Riemannian manifold.

6. Construct a definition of what it means for two manifolds to have the
same local topology. Give an example of distinct manifolds having
the same local topology.

7. Construct a definition of what it means for two manifolds to have the
same global geometry.
3.5 The Exterior Algebra
I have explained that the tensor algebra T^k(R^n) has basis the set

{dx_{i_1} ⊗ dx_{i_2} ⊗ ⋯ ⊗ dx_{i_k} : dx_{i_j} ∈ T^1(R^n)}
Well to be more exact, I asked you to prove it. It is all a matter of getting
used to the jargon and is conceptually rather simple once you are happy
with dual bases and elements of the cotangent bundle as linear maps taking
tangent vectors to numbers.
The space of alternating covariant k-tensors, Ω^k(R^n), we know has dimension
nCk, and we can get a basis by noting that we have the maps specified when
we know what they do to some standard basis elements of

R^n × R^n × ⋯ × R^n   (k terms)

Since the maps are clearly linearly independent for different choices and since,
by an exercise, any alternating k-tensor on R^n can be expressed as a linear
combination of maps which take each of the needed basis elements of (R^n)^k to
1, we can say, at some length, what the basis elements of Ω^k(R^n) are. They
are the maps which take each choice of e_{i_1}, e_{i_2}, …, e_{i_k} having
i_1 < i_2 < ⋯ < i_k to one, and the value on every other basis element of (R^n)^k is specified by the
fact that they alternate. Then multilinearity forces the value of any linear
combination of these things everywhere.
It has to be said that this is messy. It would be nice if we could specify the
basis elements more neatly. Something similar to the description given for
the general tensor space T^k(R^n) would be neater. We can take it that this is
possible because we know that we can express a basis for the space in terms
of making choices of k distinct rows of k vectors from R^n placed side by side
and evaluating the determinant on our choice.
If we have two vectors from R^3,

(x, a) =
[ x  a ]
[ y  b ]
[ z  c ]

then we can take the top pair to get ω_12(x, a) = xb − ya, or the bottom pair
to get ω_23(x, a) = yc − zb, or the top and bottom to get ω_13(x, a) = xc − za.
These three are all alternating and every alternating 2-tensor on R^3 is a linear
combination of these three.
Exercise 3.5.1. Prove this last remark.
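The three alternating 2-tensors above are easy to check numerically. A quick sketch (the helper names `omega`, `omega_12` and so on are mine, not the text's):

```python
def omega(i, j):
    """The alternating 2-tensor on R^3 that selects rows i and j (0-based)
    of the two column vectors and takes the 2x2 determinant."""
    def w(x, a):
        return x[i] * a[j] - x[j] * a[i]
    return w

omega_12, omega_23, omega_13 = omega(0, 1), omega(1, 2), omega(0, 2)

x, a = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
assert omega_12(x, a) == 1.0 * 5.0 - 2.0 * 4.0   # xb - ya
assert omega_12(x, a) == -omega_12(a, x)          # alternating in its arguments
assert omega_23(x, a) == -omega_23(a, x)
assert omega_13(x, a) == -omega_13(a, x)
```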
We actually write these as dx ∧ dy, dy ∧ dz and dx ∧ dz respectively, and the
only thing left to do is to explain where the terminology comes from.
First I want to explain the determinant for n × n matrices. You will need to
recall the material on odd and even permutations from 3P0.
Suppose I have 3 columns, each column a vector in R^3. I shall write them

[ x_1  x_2  x_3 ]
[ y_1  y_2  y_3 ]
[ z_1  z_2  z_3 ]

Now I choose one element from each column, taking care never to choose
two things from the same row. So I can pick x_1, y_2, z_3 or x_2, y_1, z_3 but not
x_1, y_3, z_1.
It follows that if we just look at the indices in xyz order, we get a permutation
of (1, 2, 3) specifying a choice. If I pick x_1, y_2, z_3 I get the identity
permutation. If I pick x_2, y_1, z_3 I get the permutation which I wrote out as

( 1 2 3 )
( 2 1 3 )

Now I make every possible choice of x_i, y_j, z_k with no two indices the same,
and I get 6 (count them) possibilities, which is 3!, the size of the permutation
group S_3. Now I multiply every number in each choice together, obtaining
x_1 y_2 z_3 for the first permutation and x_2 y_1 z_3 for the second, and so on. Note
that the terms in the product are never the same (although of course the
values of the terms or the result of multiplying them together may be). This
gives me 3! products. Had I done this with n vectors in R^n I should have got
n! distinct products.

Now I take these products, multiply each by the sign of the permutation (+1
if an even permutation, −1 if odd), and add them up. This sum of n! terms,
with parity taken into account, is the determinant of the matrix.
Exercise 3.5.2. Confirm this for n = 2 and n = 3.

Exercise 3.5.3. Show that for any n × n matrix A, det(A) = det(A^T).
I could write out the choice x_1, y_2, z_3 as dx_1 ⊗ dx_2 ⊗ dx_3 applied to the three
vectors. This is taking dx_i to mean the projection map from R^3 to R which
selects the i-th component. I am confusing this projection map (which I might
more reasonably have called e^i, the dual basis element to e_i) with the
map dx_i : Ṙ^n → R, and the reason is that pretty soon we shall be doing all
this on the tangent space, and if I confuse the notation a bit now there is
less novelty later.
In the case of R^2 I get that the determinant can be written easily as dx_1 ⊗
dx_2 − dx_2 ⊗ dx_1. There are, after all, only two permutations of two things.
I shall write this as dx ∧ dy. In fact if I have a covariant 2-tensor on R^n, I
shall also write:

dx_1 ∧ dx_2 = dx_1 ⊗ dx_2 − dx_2 ⊗ dx_1

This is equivalent to choosing the first two rows of the n × 2 matrix made
up by choosing any two vectors in R^n, and computing the determinant.
It is easy to see that it is an alternating covariant 2-tensor on R^n. I have
immediately that dx_2 ∧ dx_1 = −dx_1 ∧ dx_2.
Similarly I can take dx_i ∧ dx_j defined by

dx_i ∧ dx_j = dx_i ⊗ dx_j − dx_j ⊗ dx_i

and this is −dx_j ∧ dx_i, and dx_i ∧ dx_i = 0, for i, j ∈ [1 : n]. What this means
is that I select the 2 × 2 matrix comprising the i-th and j-th rows of the two
column vectors, and calculate the determinant of them.
This can be generalised to covariant 3-tensors on R^n without too much trouble.
In this case I have to define dx_i ∧ dx_j ∧ dx_k and I do this by writing out
every permutation of i, j, k so that if σ is a permutation I take the 3! terms

dx_σ(i) ⊗ dx_σ(j) ⊗ dx_σ(k)

for the 3! permutations σ. I then multiply the resulting numbers together,
multiply by the sign of the permutation, and sum the 3! numbers. This gives
dx_i ∧ dx_j ∧ dx_k. It is easy to see that it is an alternating 3-tensor on R^n.
Exercise 3.5.4. Prove the last claim.
The generalisation to alternating k-tensors on R^n is obvious.
Exercise 3.5.5. Write it down.
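The generalisation asked for above can be sketched numerically: dx_{i_1} ∧ ⋯ ∧ dx_{i_k} is the signed sum over permutations of the selected components, i.e. the determinant of the k × k matrix of selected rows. A sketch, with function names of my own choosing:

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation, via its inversion count."""
    return (-1) ** sum(1 for a in range(len(perm))
                       for b in range(a + 1, len(perm)) if perm[a] > perm[b])

def wedge(*indices):
    """dx_{i_1} ^ ... ^ dx_{i_k} (0-based indices) as a k-tensor: the
    determinant of the k x k matrix of the selected components of the
    k argument vectors, written as the signed sum over permutations."""
    k = len(indices)
    def form(*vectors):
        total = 0
        for perm in permutations(range(k)):
            term = 1
            for col in range(k):
                term *= vectors[col][indices[perm[col]]]
            total += sign(perm) * term
        return total
    return form

v, w, u = (1, 2, 3), (4, 5, 6), (7, 8, 10)
dx_dy = wedge(0, 1)
assert dx_dy(v, w) == 1 * 5 - 2 * 4          # 2x2 determinant of rows 1, 2
assert dx_dy(w, v) == -dx_dy(v, w)           # alternating in the arguments
assert wedge(1, 0)(v, w) == -dx_dy(v, w)     # swapping indices flips the sign
assert wedge(0, 0)(v, w) == 0                # repeated index gives zero
assert wedge(0, 1, 2)(v, w, u) == -3         # full 3x3 determinant
```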
It follows that we can give a basis for the space Ω^k(R^n) rather easily: it
consists of the alternating tensors

{dx_{i_1} ∧ dx_{i_2} ∧ ⋯ ∧ dx_{i_k} : i_1 < i_2 < ⋯ < i_k ∈ [1 : n]}

Now putting x_1 = x, x_2 = y and x_3 = z in traditional fashion, we recover
the mysterious expression at the beginning of this section on the Exterior
Algebra.
Exercise 3.5.6.

1. Show how to construct a linear map Alt: T^k(R^n) → Ω^k(R^n) which
'alternates' any tensor and sends any alternating ω to itself. Hint: the
essential idea occurs in turning dx ⊗ dy into dx ∧ dy and dx_1 ⊗ dx_2 ⊗ dx_3
into dx_1 ∧ dx_2 ∧ dx_3 by adding up the signed permutations. It might be
better to average them in this case.

2. Show how to generalise the ∧ of dx_i, dx_j so that if ω is an alternating
k-tensor on R^n and ϕ is an alternating ℓ-tensor, then ω ∧ ϕ is an
alternating (k + ℓ)-tensor. Hint: Hit the tensor product with the Alt of
the last exercise.

3. Show that dx_1 ∧ dx_2 applied to a pair of points in R^2, represented in
the standard way, gives twice the oriented area of the triangle formed
by the pair of points together with the origin.

4. What do you need to get the area of the triangle formed by two points
in R^n and the origin? Does it make sense to talk of an oriented area
in this case?
The exercises should now make it clear that just as ⊗ between k-tensors
and ℓ-tensors gives us (k + ℓ)-tensors and hence a graded algebra, so ∧ between
alternating k-tensors and alternating ℓ-tensors gives us an alternating
(k + ℓ)-tensor and hence another graded algebra. This is called the Exterior
Algebra. Since the only alternating n-tensor on R^n, up to scalar multiples, is the determinant, and
since Ω^k(R^n) is just the zero tensor whenever k > n, we are really only
concerned with the graded algebra {Ω^k(R^n) : 1 ≤ k ≤ n}. This makes the
exterior algebra rather simpler (and a lot smaller) than the tensor algebra.
Exercise 3.5.7. Show that Ω^k(R^n) is just the zero tensor whenever k > n.
I can write out the full exterior algebra in the form:

Order       basis        isomorphic to
Ω^0(R^2)    1            R^1
Ω^1(R^2)    {dx, dy}     R^2
Ω^2(R^2)    {dx ∧ dy}    R^1

This is a nice finite table. Just as I wrote out table 3.1.2 I can write out the
exterior algebra for R^2:
          R            R^2           R
R         (R, ·)       (R^2, ·_q)    (R, ·)
R^2       (R^2, ·_q)   (R, det)      {0}
R         (R, ·)       {0}           {0}

Table 3.5

Again, · denotes ordinary multiplication in R and ·_q denotes scalar multiplication.
And det denotes the determinant. The table starts with 0-tensors at
the top and 2-tensors at the bottom.
Exercise 3.5.8. Write out the full exterior algebra on R^3. You should replicate
the above two tables with rather more columns and rows. In the second
table, work out what the multiplications are, as for table 3.1.2. Do you
recognise anything?
3.6 The Exterior Calculus

The step from the tensor algebra to tensor fields consisted of having a section
of the tensor bundle, which meant attaching a type (k, ℓ) tensor to each
point on a manifold. We do exactly the same thing again: we take a section
of the Ω^k(V) bundle where V is a tangent space. This means that we attach
to each point a of the n-manifold M^n an alternating k-tensor on the space
T_a(M). A 0-tensor is just a number, and attaching a number to each point of
a manifold is merely defining a map from M^n to R. Similarly, attaching an
n-tensor is attaching a number, the volume element, at each point of M^n. In
between we have k-forms attached at each point of the manifold. Naturally
we want the sections to be smooth.
Such sections are called differential forms on the manifold.

To make this concrete we look at R^2 and R^3.

A differential 0-form on R^2 is just a smooth map from R^2 to R. We know a
fair bit about these.
A differential 1-form on R^2 assigns to each point of R^2 a pair of numbers
a dx + b dy and consequently is a pair of functions

P(x, y) dx + Q(x, y) dy

It is a covector field and looks very like a vector field (but watch out for what
happens when you change bases!)

A differential 2-form on R^2 assigns to each point a of R^2 an operator α(a) dx ∧
dy. This is short for α(a)(dx ⊗ dy − dy ⊗ dx) for some number α(a) which
depends on a. This acts on any pair of vectors in the tangent space at a.
Let's choose some with respect to the standard basis for Ṙ^2 (since, for any
a ∈ R^2, Ṙ^2_a is isomorphic to Ṙ^2_0 in a natural way). Then

α(a) dx ∧ dy ( (x, y)^T, (u, v)^T ) = α(a)(xv − yu)

So α(a) dx ∧ dy assigns to any pair of tangent vectors the area of the parallelogram
in the tangent space which they determine, multiplied by a function
of a. Or if you prefer, twice α(a) times the area of the triangle consisting of
the two points and the origin of the tangent space Ṙ^2.
A quite useful way of looking at this is that α(a) dx ∧ dy is doing something
a bit like the Riemannian metric, but instead of returning the inner product
of two tangent vectors it is returning an infinitesimal area element. So we
imagine that we want the area definition to vary over the space, so that
calculating the area of a region is now more complicated. On the other hand
you have seen this before, more or less.
Example 3.6.1. First I am going to transform the usual area measure dx ∧ dy
on R^2 and use it to calculate the area of the unit disc in polar coordinates.
We have the polar coordinate transform

P : R^2 \ {0} → S^1 × R^+
(x, y)^T ↦ (θ, r)^T

We have the inverse given by

x = r cos(θ)
y = r sin(θ)

and exactly as before

dx = −r sin(θ) dθ + cos(θ) dr
dy = r cos(θ) dθ + sin(θ) dr

Last time we calculated dx ⊗ dx and the three others. This time there is only
one thing to calculate, dx ∧ dy. We get

dx ∧ dy = (−r sin(θ) dθ + cos(θ) dr) ∧ (r cos(θ) dθ + sin(θ) dr)

which we can easily see is just:

−r sin²(θ) dθ ∧ dr + r cos²(θ) dr ∧ dθ = r dr ∧ dθ
Exercise 3.6.1. Show this carefully.
Using this new area element we get that

∫_{B_1(0)} dx ∧ dy = ∫_{B_1(0)} r dr ∧ dθ

which we already knew although not in this language. Note that the domain
of integration, B_1(0), is a disk in the x–y space and a rectangle wrapped
around a cylinder in the θ–r space. This is what happens to the punctured
disc under the diffeomorphism P.

The new integral has θ ∈ S^1 and r ∈ [0, 1] which makes for an easy integral,
(1/2)(2π) = π.
I knew that.
Note that this works because we

• transformed the disc in R^2 into a rectangle in S^1 × R^+, except that the
centre of the disc really got thrown away (zero area so does not affect
the result) so the rectangle (a) doesn't have a base (zero area in any
sane density on R^2) and (b) gets wrapped once around the circle.

• back-transformed the measure density dx ∧ dy to get the right density
to use to compute the area. All this does is to make clear something
which you were trained to do using much sloppier arguments to justify
the right rule for the change to polars. It was all perfectly OK but the
rationale was scruffy. Note how the exterior algebra rules for computing
the new form automatically take care of signs and orientations. Doing
it for any other transformation than the polar one is now a doddle.
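The polar calculation can be confirmed with a crude Riemann sum over the rectangle, using the pulled-back density r dr ∧ dθ (variable names are mine):

```python
import math

# Riemann-sum check that the integral of r dr dtheta over
# [0, 1] x [0, 2*pi] is pi: the area of the unit disc computed with
# the transformed density from the example above.
n = 400
dr, dth = 1.0 / n, 2 * math.pi / n
total = 0.0
for i in range(n):
    r = (i + 0.5) * dr          # midpoint rule in r
    for j in range(n):
        total += r * dr * dth

assert abs(total - math.pi) < 1e-6
```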
Exercise 3.6.2. Calculate the area of the unit disc in R^2 with respect to the
density xy dx ∧ dy.

Exercise 3.6.3. Work through the argument for the spherical and cylindrical
polar coordinate transformations in R^3.
Exercise 3.6.4. Think of some bizarre diffeomorphism of R^2 to some two
dimensional space that does something frightful but has an explicit inverse
(make sure the inverse can be written down even if the original is a swine).
Use it to evaluate the area of some region in both spaces, before and after
being transformed. This should give two moderately foul double integrals
with weird limits. Use Mathematica to get numerical solutions and confirm
they are pretty much the same.
Remark 3.6.1. This should give you a conviction that differential forms
have their uses, and will suggest the most important thing about them:

Differential forms are things you integrate over manifolds. A differential
k-form can be integrated over a k-manifold or k-manifold with boundary.

I have made sure you would see this as it tells you what they are for.
Transforming differential forms by diffeomorphisms follows the same pattern
as for transforming the Riemannian metric tensor, except that we may have
to transform k-forms for k > 2. The rules are simple however.

Exercise 3.6.5. Write down explicit rules in terms of partial derivatives for
transforming a differential 3-form on R^3 under a diffeomorphism, rules which
you must have used in doing the preceding exercise but one.
It follows from the big announcement that 2-forms on R^2 are integrated over
things like discs, and a 1-form on R^2 has to be integrated over curves.

Example 3.6.2. Let the curve c be the graph of y = x² between x = 0 and
x = 1. Let the differential 1-form be dx + dy. What would we expect the
answer to ∫_c dx + dy to be on the basis of what this means, and what would
the calculation be?

Solution: Drawing the graph and taking a typical line segment ℓ on the curve,
∆x(ℓ) is the projection along the x-axis and ∆y(ℓ) is the projection along
the y-axis. If we add these up we get 1 + 1 = 2, and this is not going to
change as the segments get shorter. So the answer is 2. All done by a little
thought about what these things mean.

If we write y = x² we get dy = 2x dx so

∫_c dx + dy = ∫_{[0,1]} (1 + 2x) dx = [x + x²]₀¹ = 2

As an alternative we could write x = t, y = t², t ∈ [0, 1] to express the curve
parametrically and this would give the same answer.
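The parametric version is easy to check numerically: pulling dx + dy back along x = t, y = t² gives (1 + 2t) dt, which a midpoint Riemann sum integrates to 2 (variable names are mine):

```python
# Numerical check of the example: integrate dx + dy along the curve
# x = t, y = t^2, t in [0, 1]; the pullback is (1 + 2t) dt.
n = 100000
h = 1.0 / n
total = 0.0
for i in range(n):
    t = (i + 0.5) * h
    total += (1.0 + 2.0 * t) * h    # (dx/dt + dy/dt) dt

assert abs(total - 2.0) < 1e-9
```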
Exercise 3.6.6. Now try it for the curve c being the first quadrant of the
unit circle. Do you get the same answer? If not, why not?

Note that when we express the curve parametrically we do so by a function c,
and this allows us to pull back the differential 1-form on R^2 to a differential
1-form on R which is just some function multiplied by dt, and we can integrate
this in the usual way, numerically if necessary.
Exercise 3.6.7.

1. What would you expect the result of ∫_c dx + dy to be when c is any
smooth closed curve?

2. Suppose we take f(x, y) = x + y. Then we have

df = (∂f/∂x) dx + (∂f/∂y) dy

which in this case is the 1-form dx + dy. So differentiating a smooth
0-form gives a smooth 1-form. Show that this is always the case.

3. It follows that ∫_c dx + dy = ∫_c df. Use the fundamental theorem
of calculus to prove your solution to the first question is correct, and
verify that it gives the right answers to all the other integrations of this
1-form along curves.

4. Find a 1-form on R^2, P(x, y) dx + Q(x, y) dy, that is not df for any f.
Hint: What can we say about ∂P/∂y and ∂Q/∂x if the 1-form is the
derivative of a 0-form?
The usual way to represent the derivative of a 0-form f is as the row matrix
[∂f/∂x, ∂f/∂y], and since this represents, when evaluated at any point, a
linear map from R^2 to R, and P dx + Q dy represents a linear map from Ṙ^2 to
R when evaluated at any point, the difference is rather small, but significant.
When we treat the differentiation in the second sense, we call d the exterior
derivative. It goes much further than this. I shall define an exterior derivative
of 1-forms to give a 2-form:

For ω = P dx + Q dy, I define

dω = (∂P/∂y) dy ∧ dx + (∂Q/∂x) dx ∧ dy = (∂Q/∂x − ∂P/∂y) dx ∧ dy
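Both steps of this construction can be sketched numerically with central differences; d of df then gives the coefficient ∂Q/∂x − ∂P/∂y, which vanishes by equality of mixed partials. A sketch under that reading (the function names `d0`, `d1` and the example f are mine):

```python
import math

def d0(f, h=1e-5):
    """Exterior derivative of a 0-form f(x, y): returns (P, Q) with
    df = P dx + Q dy, approximated by central differences."""
    def P(x, y): return (f(x + h, y) - f(x - h, y)) / (2 * h)
    def Q(x, y): return (f(x, y + h) - f(x, y - h)) / (2 * h)
    return P, Q

def d1(P, Q, h=1e-5):
    """Exterior derivative of P dx + Q dy: the coefficient of dx ^ dy,
    namely dQ/dx - dP/dy."""
    def coeff(x, y):
        dQdx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
        dPdy = (P(x, y + h) - P(x, y - h)) / (2 * h)
        return dQdx - dPdy
    return coeff

def f(x, y):
    return x * x * y + math.sin(y)

P, Q = d0(f)
ddf = d1(P, Q)
# d(df) = 0 for any smooth 0-form, up to discretisation error.
assert abs(ddf(0.7, -0.3)) < 1e-4
```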
Exercise 3.6.8.

1. Show that d² = 0 for any 0-form f.

2. Calculate the exterior derivative of a 1-form on R^3 by making up a
suitable example.

3. Pretend, briefly, that there is no such thing as duality and that the last
1-form is a vector field. Identify the 2-form.

4. Make the rule: To obtain the exterior derivative of a k-form on R^n, take
each component function P(x_1, x_2, …, x_n) dx_{i_1} ∧ dx_{i_2} ∧ ⋯ ∧ dx_{i_k} of the
k-form, differentiate each such P with respect to each of the variables
separately to get, for example, some ∂P/∂x_j, and put dx_j ∧ in front of
the existing term, to get

(∂P/∂x_j) dx_j ∧ dx_{i_1} ∧ dx_{i_2} ∧ ⋯ ∧ dx_{i_k}

Sum the results for the n different variables x_j and also for the different
functions P. The result is a (k + 1)-form on R^n. Show that this rule
gives the same answer as in the particular cases you have worked with.

5. Use the above rule to calculate the exterior derivative of a 2-form on
R^3. Choose your own 2-form, preferably so as to have three nontrivial
but differentiable component functions.

6. Pretending, briefly, that the 2-form ω on R^3 is a vector field, identify
dω.
Remark 3.6.2. You should be able to see that the clunky way you did
Stokes' Theorem using vector fields arose from confusing vector fields with
both 1-forms and 2-forms, which you can do only on R^3. In fact it is really
about differential forms. Stokes' Theorem in general says

∫_{∂M} ω = ∫_M dω

where M is an n-manifold with boundary ∂M and ω is any differential (n − 1)-form.
For a proof, dig up my old 2C2 notes off the web.

This is the 'modern' form of Stokes' Theorem. It differs from the old obsolete
form in two ways: first it is about differential forms, not vector fields, so a
grasp of duality is important, and second it works for all positive integers n,
all n-manifolds with boundary (or without, but they are less interesting).
Exercise 3.6.9.

1. Show that Stokes' theorem in dimension 1, with ω a 0-form, is just a
restatement of the Fundamental Theorem of Calculus.

2. Show that the almost certainly scrofulous proof you met in second year
of Green's Theorem is also a scrofulous proof that Stokes' Theorem
holds when ω is a 1-form on R^2.

3. Show that the almost certainly scrofulous proof you met in second year
of Stokes' Theorem is also a scrofulous proof that

∫_{∂M} ω = ∫_M dω

holds when ω is a 1-form on R^3.

4. Show that the almost certainly scrofulous proof you met in second
year of the Divergence Theorem is also a scrofulous proof that Stokes'
Theorem holds when ω is a 2-form on R^3.

5. Construct a plausible explanation of why you had to do a bungled
version of Stokes' Theorem in second year, given that the correct version
has been known since about 1925.
Remark 3.6.3. If you want to see a proper proof of Stokes’ Theorem (all the
above, and more, in one hit) read Michael Spivak’s Calculus on Manifolds.
It consists of proper deﬁnitions of all the terms in awful generality and some
calculations. I should point out that all the physical intuitions which led
to the theorem are contained in the exercises and are not, as some shallow
people imagine, completely absent.
3.7 Hodge Duality: The Hodge Operator
3.7.1 The Riemannian Case
In R^3 we know from the table referred to in Exercise 3.5.8 that the Exterior
Algebra has a striking symmetry. If we look at the 0-forms we observe they
have dimension 1, just as do the 3-forms, while the 1-forms have dimension
3, just like the 2-forms.

It follows from the equality of the dimension that there is an isomorphism
between the space of 1-forms on R^3 and the space of 2-forms on R^3; also one
between the space of 0-forms (numbers) on R^3 and the space of 3-forms on
R^3.

Only a small amount of thought shows that there must be, in general, an
isomorphism between the k-forms on R^n and the (n − k)-forms on R^n. In fact
not just one isomorphism of course, but scads of them. The question is, can
we find a more or less natural isomorphism by some process which works in
all cases? The answer is yes, and the isomorphism is called ∗, or Hodge-∗ if
you want to give credit where it belongs.
Let us start by going from 2-forms on R^3 to 1-forms on R^3 and see if we can
work out the general pattern by doing concrete cases.

Recall from Exercise 3.5.1 that if ω is a 2-form on R^3 then it operates on any
pair of vectors

(a_1, a_2, a_3)^T, (b_1, b_2, b_3)^T

by expressing the resulting number as

c_1 |a_2 b_2; a_3 b_3| + c_2 |a_1 b_1; a_3 b_3| + c_3 |a_1 b_1; a_2 b_2|

(each |…; …| a 2 × 2 determinant, with the rows separated by semicolons)
for some particular numbers c_1, c_2, c_3.
But this is just the value of the three by three determinant:

[ a_1  b_1   c_1 ]
[ a_2  b_2  −c_2 ]
[ a_3  b_3   c_3 ]

It seems reasonable therefore to define ∗(ω) to be the 1-form [c_1, −c_2, c_3],
which may be written c_1 e^1 − c_2 e^2 + c_3 e^3 using the standard dual basis in R^3.

If you are troubled by the minus sign, note that it arises because I have
always described the submatrices obtained by omitting a row in numerical
order rather than cyclic order. So I get (1, 3) where it might have been more
natural to put (3, 1). But it is much easier to specify submatrices this way
so I shall carry on doing so.
There is also an isomorphism ∗ between 0-forms and 3-forms on R^3. It takes 1
to det. Det of course takes the three ordered vectors (a, b, c) to the determinant
of the matrix obtained by writing each vector out as a column (or row)
and listing them to get the three by three matrix.
In R^3 we used ∗ to send the 2-form

c_1 e^2 ∧ e^3 + c_2 e^1 ∧ e^3 + c_3 e^1 ∧ e^2

to the 1-form

c_1 e^1 − c_2 e^2 + c_3 e^3
Figure 3.7.1: A choice of k numbers from n.
We can simplify this by saying that ∗(e^2 ∧ e^3) = e^1, ∗(e^1 ∧ e^3) = −e^2 and
∗(e^1 ∧ e^2) = e^3. Then everything is multilinear and so we don't need any
more. In other words, it suffices to specify ∗ on all the choices of i, j in e^i ∧ e^j.
Now suppose we have a k-form on R^n. We do not have just three basis
elements, we have nCk of them, and we can take one of them and write it as

e^{i_1} ∧ e^{i_2} ∧ ⋯ ∧ e^{i_k}

where we have chosen from the numbers [1 : n] some increasing subsequence
consisting of the k numbers i_1, i_2, …, i_k. We specify ∗ if we give an (n − k)-form
that every such basis element gets sent to. There is an obvious choice:
we have taken a row of n numbers and picked out k of them. Figure 3.7.1
shows that I have selected some numbers in order and painted them red. This
leaves n − k black ones. I wedge the 'black' projections. Thus 'the indices
we take out go to the indices we leave behind'.

The only problem is that of sign. I have gone from the red and black mixed
up to the red ones followed by the black ones. This is a permutation of the
n numbers [1 : n] and it may be an odd permutation or an even one. (See the
3P0 notes if you don't understand these things.) We call the sign of an even
permutation 1 and the sign of an odd permutation −1. So I finish up with
the definition of ∗:

∗(e^{i_1} ∧ e^{i_2} ∧ ⋯ ∧ e^{i_k}) = sign(σ) e^{i_{k+1}} ∧ e^{i_{k+2}} ∧ ⋯ ∧ e^{i_n}

where σ is the permutation:

( 1    2    3   ⋯  n   )
( i_1  i_2  i_3 ⋯  i_n )

and the i_j for 1 ≤ j ≤ k on the left hand side are the red numbers and the
remainder are the black numbers in the same order as before.
Exercise 3.7.1. Confirm that the general case copies exactly what we did
with 2-forms on R^3.

Exercise 3.7.2. Confirm by explicit calculation for some particular k-forms
on R^n for various n that this definition makes sense and gives reasonable
answers. In particular send some 2-forms on R^4 to some 2-forms.

Exercise 3.7.3. Is it true that ∗² is the identity? Prove it or give a
counterexample.
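The recipe above — complementary indices plus the sign of the rearranging permutation — is mechanical enough to sketch directly (the function name `star` is mine):

```python
def star(indices, n):
    """Hodge-* of the basis form e^{i_1} ^ ... ^ e^{i_k} on R^n, with
    indices given 1-based and increasing.  Returns the complementary
    indices and the sign of the permutation (i_1, ..., i_k, rest)."""
    rest = tuple(i for i in range(1, n + 1) if i not in indices)
    perm = indices + rest
    # Sign via the inversion count of the assembled permutation.
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return rest, (-1) ** inversions

# The three R^3 cases worked out in the text:
assert star((2, 3), 3) == ((1,), 1)     # *(e2 ^ e3) =  e1
assert star((1, 3), 3) == ((2,), -1)    # *(e1 ^ e3) = -e2
assert star((1, 2), 3) == ((3,), 1)     # *(e1 ^ e2) =  e3
# A 2-form on R^4 goes to a 2-form, as in Exercise 3.7.2:
assert star((1, 2), 4) == ((3, 4), 1)
```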
Exercise 3.7.4. Verify that the result of taking ∗(e^i ∧ e^j) on R^3 looks a lot
like the cross product. (!) Explain this.
Exercise 3.7.5. Verify that if we choose the ordered basis (e_1, e_3, e_2) and
make ω = dx we get a different value for ∗(ω) than if we used the ordered
basis (e_1, e_2, e_3). Show that if we use the ordered basis

( e_1, (0, cos(θ), sin(θ))^T, (0, −sin(θ), cos(θ))^T )

we get the same result as in the standard basis.
If ω is a k-form on V and f : U → V is a linear map, then we can pull back
ω to a k-form on U defined by

f∗(ω)(a_1, a_2, …, a_k) = ω(f(a_1), f(a_2), …, f(a_k))
Exercise 3.7.6. Define f : R^3 → R^3 by f(e_1) = e_2, f(e_2) = e_3, f(e_3) = e_1.
Let ω be the 1-form dx + 2dy + 3dz. Show that f∗(∗(ω)) = ∗(f∗(ω)). The
last exercise makes it clear that at least in this case, f∗ and ∗ commute.

Exercise 3.7.7. Do they always? Is ∗ functorial? Its definition is locked
into the standard basis; does it need to be? If not, what does this do for
defining it on manifolds?
The answer to the last question is that if f preserves both the inner product
and the orientation, that is if it is an element of SO(n, R), then f∗(∗(ω)) =
∗(f∗(ω)). This tells us that the ∗ operator involves the parity (or orientation
or chirality) of an orthonormal basis in an essential way. Note that in
a Hilbert space we have a natural definition of an orthonormal basis, and
that since a Riemannian structure on an oriented manifold makes each tangent
space a Hilbert space, the ∗ operator makes sense on oriented smooth
manifolds with a Riemannian structure.
I do wish the physicists would learn not to call it a metric, but they are
beyond saving.
3.7.2 The Semi-Riemannian Case

We are going to generalise the idea of an inner product so as to be able to
deal with the Minkowski metric on spacetime. This is Lorentzian. And we
might as well deal with the general case because it isn't any more work.

Definition 3.7.1. A symmetric bilinear form φ : V × V → R on a real vector
space is said to be nondegenerate iff the only v ∈ V with φ(u, v) = 0 for all
u ∈ V is v = 0.

Definition 3.7.2. A nondegenerate symmetric bilinear form φ : V × V → R
on a real vector space is said to be an inner product. We write such forms as
⟨·, ·⟩ with φ(u, v) = ⟨u, v⟩.
Exercise 3.7.8. Define the inner product on R^2 by

⟨(x, y)^T, (a, b)^T⟩ = xb + ya

Show that this is an inner product in the new sense but that it is not positive
definite.
Exercise 3.7.9. On R^4 define

⟨(x_0, x_1, x_2, x_3)^T, (a_0, a_1, a_2, a_3)^T⟩ = −x_0 a_0 + x_1 a_1 + x_2 a_2 + x_3 a_3

Show that this is an inner product (the Lorentzian inner product on spacetime),
and find a nonzero vector which is orthogonal to itself. Find two
distinct points which are at 'distance' zero from each other. Explain what
this means physically.
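The light-like vector the exercise asks for is easy to exhibit numerically. A sketch (the function name and the choice of vector are mine):

```python
def lorentz(x, a):
    """The Lorentzian inner product on R^4 from the exercise:
    -x0*a0 + x1*a1 + x2*a2 + x3*a3."""
    return -x[0] * a[0] + x[1] * a[1] + x[2] * a[2] + x[3] * a[3]

# A nonzero vector orthogonal to itself (light-like):
v = (1.0, 1.0, 0.0, 0.0)
assert lorentz(v, v) == 0.0
# Not positive definite, but not degenerate either: v still pairs
# nonzero with some other vector.
assert lorentz(v, (1.0, 0.0, 0.0, 0.0)) != 0.0
```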
It should be clear that our generalisation of the idea of an inner product is
constructed so as to allow us to do for spacetime what we usually do for
space, and that the ‘metric’ derived from the inner product isn’t a metric in
the standard sense at all. This is (a) strictly necessary in order to describe
relativity in a sensible fashion and (b) an awful shock to the system. It means
that I have set the velocity of light equal to 1 and that any two points on
the path of a ray of light have zero separation in spacetime. I shall discuss
some aspects of Physics in the next chapter which may shed some light on
this extraordinary behaviour.
From now on I shall use the generalised deﬁnition of the inner product.
Proposition 3.7.1. An inner product on a finite dimensional real vector
space V determines an isomorphism from V to V∗.

Proof: Define Φ : V → V∗ by Φ(u) = ⟨u, −⟩. That is,

∀ v ∈ V, Φ(u)(v) = ⟨u, v⟩

Then that Φ is 1-1 follows from the nondegeneracy of ⟨·, ·⟩, and that Φ is
linear follows from the bilinearity of ⟨·, ·⟩. Since the spaces have the same
dimension, this is sufficient to make Φ an isomorphism.

Exercise 3.7.10. Let ⟨·, ·⟩ be the inner product of Exercise 3.7.8. Write down
the isomorphism explicitly, representing elements of (R^2)∗ by row matrices.
The converse is also true:

Proposition 3.7.2. Suppose Φ : V → V∗ is an isomorphism of finite
dimensional real vector spaces. Then the bilinear form

⟨u, v⟩ = (1/2)((Φ(u))(v) + (Φ(v))(u))

is an inner product on V.

Proof: That it is bilinear follows from the linearity of Φ, and it is constructed
to be symmetric. It remains only to show it is nondegenerate. If it were
degenerate then there would be some nonzero u such that ⟨u, −⟩ is the zero
map in V∗, which would put u ∈ ker(Φ), contradicting Φ being 1-1.
Exercise 3.7.11. Construct an example of an isomorphism from R^2 to (R^2)∗
which gives an inner product in the old sense, that is a bilinear positive
definite map, and also one which gives an inner product which is not one
of the old fashioned sort. Show that by composing any given isomorphism
Φ : R^n → (R^n)∗ with a suitably chosen isomorphism from R^n to itself, we can
always get an old fashioned style inner product. Hint: Use the material on
the classification of quadratic forms from second year.
Now we have to mess around with Hodge a bit to make it work properly
on a semi-Riemannian manifold. Note that we can take the determinant of
the matrix of inner products of an orthonormal basis, and if the inner
product is Riemannian (positive definite) we must get 1, since the inner
product of different basis elements will be zero, and the inner product of a
basis element with itself is 1. And the determinant of the identity matrix is
1.
This stops being true for a generalised inner product: the inner product
of the time axis with itself is −1; alternatively, the matrix representing the
Lorentz inner product is

    ⎡ −1  0  0  0 ⎤
    ⎢  0  1  0  0 ⎥
    ⎢  0  0  1  0 ⎥
    ⎣  0  0  0  1 ⎦

which has determinant −1. Since determinants figure largely in defining ⋆,
we need to take account of this. If the inner product has signature (s, n−s)
then we change the definition of ⋆ to:
    ⋆(e_{i₁} ∧ e_{i₂} ∧ ⋯ ∧ e_{i_k}) = sign(σ) ε(i₁)ε(i₂)⋯ε(i_k) e_{i_{k+1}} ∧ e_{i_{k+2}} ∧ ⋯ ∧ e_{i_n}

where ε(i_j) is defined to be the inner product of the i_j-th basis element with
itself.
This gives the convenient form of the ⋆ operator given in the text book and
ensures invariance under the generalisation of the special orthogonal group
which preserves the orientation and generalised inner product.
Exercise 3.7.12. Prove this.
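The sign rule can be checked mechanically. Here is a minimal sketch in Python (not from the text; the names hodge_star and perm_sign are mine, and indices are 0-based) computing the coefficient sign(σ)ε(i₁)⋯ε(i_k) of the dual of a basis k-vector, assuming a diagonal metric with signs ε:

```python
def perm_sign(perm):
    # sign of a permutation of distinct integers, by counting inversions
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def hodge_star(indices, eps):
    """Hodge dual of e_{i1} ^ ... ^ e_{ik} in an n-dimensional space whose
    orthonormal basis has metric signs eps = (eps_1, ..., eps_n).
    Returns (coefficient, complementary index tuple)."""
    n = len(eps)
    comp = tuple(i for i in range(n) if i not in indices)
    sigma = tuple(indices) + comp          # the permutation i1..ik, ik+1..in
    coeff = perm_sign(sigma)               # sign(sigma)
    for i in indices:
        coeff *= eps[i]                    # the factors eps(i_j)
    return coeff, comp

# Lorentz signature (-,+,+,+): the dual of e_0 picks up the sign -1
lorentz = (-1, 1, 1, 1)
print(hodge_star((0,), lorentz))   # (-1, (1, 2, 3))
```

With the Riemannian signature (1, 1, 1) the same call reduces to the old definition, where only sign(σ) survives.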
As usual, the best way to accommodate a new idea is to do lots of sums until
you are used to using the idea, whereupon it stops leading to panic attacks
and becomes just another part of your machinery for thinking.
Chapter 4
Some Elementary Physics
Most of you will know most, perhaps almost all, of this, but some may not
and it is convenient to summarise it brieﬂy.
4.1 Three weird forces
The ﬁrst time I met electrostatic forces up close and personal, in other than
a laboratory situation, was the ﬁrst time I undressed a girl. It was dark
and she was wearing nylon underwear, and when I took it oﬀ it crackled and
glowed as miniature lightning ﬂashed. My ﬁrst thought was that this was
the wages of sin, and the devil had come to claim his own, but then my brain
started working and I realised I was merely seeing and hearing electricity.
An education can be useful.
Later, in an American hotel, I once walked across a nylon carpet while wearing
Ug boots (among other things) and reached out to the door handle, and
got a large blue spark between my ﬁngers and the handle. This was the
second time I encountered electrostatic forces up close and personal, in other
than a laboratory situation.
The management accepted no responsibility for guests electrocuting themselves
with Ug boots.
It is possible to replicate these eﬀects in a small way by stroking some fur
against a nylon or plastic ball and one can use it to pick up little bits of
paper. They just jump up oﬀ the table top to stick, brieﬂy, to the plastic
ball. Try it if you don’t believe me.
The force that affects the paper is called an electrostatic force, and on a
human scale it is usually quite small, working only for rather little bits of
paper, although google Van de Graaff generator to find out how to scale
up rather a lot. Or stand on a hilltop holding up a spiky metal stick in a
thunderstorm if you really need convincing that the process of building up
charge (to use the jargon) can be rather spectacular.[1]
So there is a rather weird force which attracts all sorts of objects and which
needs some investigation. And knowledge of which can prevent heart attacks
and interference with your love life.
You can get at any time a ’fridge magnet from the government, or from lots
of even more useless people at New Year, and it magically glues itself to the
’fridge. It does not work on paper or wood. But it can demonstrably attract
small bits of iron. So it is diﬀerent from the electrostatic force. But similar
in some ways. A second weird force.
And finally if you step off the top of a tall building you will come down
towards the rest of us with an acceleration of about 9.81 meters per second
per second, less air resistance.[2] The planet appears to also attract things.
Like electrostatics but unlike magnetism, it appears to attract different
substances. Unlike electrostatics it does not stop working when you connect
the thing attracting and the thing attracted by a metal wire.
So there are at least three weird forces: electrostatics, magnetism and gravity.
It is natural to wonder whether there is any connection between them, and what
the similarities and differences are. Michael Faraday investigated these issues
in the early nineteenth century.[3]
[1] Children, do not do this unless you are very, very bored. As a cure for
boredom, this compares with the guillotine as a cure for dandruff.
[2] Children, do not attempt this unless you are superman. Note that believing
you are superman doesn't cut it. The Universe does not have the slightest
respect for your opinions unless they are right. And neither do I. Incidentally,
it doesn't matter how deeply or passionately you believe something. If you are
wrong in your beliefs the universe may, with supreme indifference, kill you
stone dead. If you have been brought up with the view that you are entitled
to your beliefs whatever they are, I encourage you to see if the universe
agrees with you.
[3] You can find his Notebooks in the library and they make interesting
reading. They are also very well written. I should rate him as a better writer
than Jane Austen, and the subject is a lot more interesting. Unless you think
the matter of manners and which male ends up mating with which female is more
interesting than understanding how the universe works. If so you have
definitely come to the wrong shop and should enrol in English, where you may
learn the vital skill of saying the right things about storybooks. It never
fails to amaze me that there really are people who feel this is quite a
reasonable way to spend their time.
4.2 Fields
If you do some delicate quantitative measurements on electrostatic forces
you find some interesting and important things. One is that the source of an
electric force comes in lumps, although rather small lumps, which is why this
was not known at the beginning of the study of the subject.
If you ﬁnd the smallest possible lumps you can ﬁnd that they are very small
and they all repel each other. They are called electrons, derived from the
Greek name for amber, and you might amuse yourself by figuring out the
connection. The amount of repulsion varies inversely as the square of the
distance between them, and we say the electron has a charge, and by a curious
convention it is called a negative charge. Some things appear to have a positive
charge, and indeed something must have or the whole universe would have
negative charge instead of being mainly neutral. Opposite charges attract
again by an amount that depends inversely on the square of the distance
between them. When we say they attract or repel, we use Newton's terminology
of forces: If you want to feel a force, get a friend to poke you with a
stick. Failing that, jump oﬀ a wall. You will not feel any force while falling,
but you will when you stop. It will be very like being poked all over by a
large stick.
In Newton’s day the word force was strongly associated with sticks if you
poked, or possibly ropes if you pulled. The idea that you could have a
force acting without some material object connecting the thing on which the
force acted and the thing doing the acting (usually a person or a horse) was
regarded as a contradiction in terms; it was called action at a distance and
was thought to be quite contrary to standard usage of the term ‘force’, and
hence a violation of common sense and the natural order. Hence Newton’s
deﬁnition of a force in terms of what it did, dissociating the eﬀect from the
mechanism, was a wild and novel idea. Now that we all think like this, it
is hard to appreciate the intellectual jump made in simply deﬁning force in
terms of observable changes in the motion of a mass. It is true that forces still
tend to be associated with physical objects, the sun for example in the case of
forces acting on planets, but it is possible to conceive of a force ﬁeld in space
with no such association. Well, it is now, it was a radical if not actually loony
idea in the middle of the seventeenth century. See the discussion by Feynman
in his Lectures on Physics on the question of whether Newton's Law
is ‘merely’ a deﬁnition or something more. I think he misses the point here,
which is to assert that forces are not to be thought of in terms of sticks or
ropes joining the source of the force to the thing acted upon, but in terms of
motion of the thing acted upon. The source of the force is a separate matter
and will depend on what type of force it is.
We therefore measure forces by using Newton's Law, which says Force is Mass
times acceleration. Newton used Latin but we prefer algebra: F = ma or
F = mẍ. Even better, since the mass may change in time (as when a rocket
moves by consuming fuel and throwing it out rather fast at the back), it
would be more useful to write F = ṗ or, in Leibniz' notation, F = dp/dt,
where p is the momentum.
For a ﬁxed mass, we measure the acceleration. We compare masses using a
spring balance or something equivalent.
Since electrons don't seem to affect each other if we pile lots of them onto
some small object, other than trying to repel each other, and opposite charges
cancel each other out to a good approximation when they are brought together,
the force between a charge Q₁ and one Q₂ is

    F = (1/(4πε₀)) Q₁Q₂ / r²                    (4.2.1)

where Q_j is the amount of charge, ultimately a count of electrons or their
compensating positive equivalents which also come in lumps, r is the distance
between them, and ε₀ is a constant. This is the size of the force; the
direction on each is towards the other if they have different sign and away
from each other if the charges are of the same type. It is worth remarking that
the attractions or repulsions do not depend on there being air or any other
medium in the space between the objects, although the medium changes the
constant ε₀, which is therefore a property of the medium. We usually take
it that the medium is a hard vacuum and ε₀ is the number associated with
empty space. You would find the reason for the 4π too incredible, so I shall
simply observe that it is a rather bizarre choice of unit.
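For definiteness, here is equation 4.2.1 as arithmetic in SI units (a sketch; the value of ε₀ below is the standard modern one, and the function name is mine):

```python
import math

EPSILON_0 = 8.8541878128e-12   # permittivity of empty space, SI units

def coulomb_force(q1, q2, r):
    """Size of the electrostatic force between point charges q1 and q2
    (coulombs) a distance r (metres) apart, by equation 4.2.1."""
    return (1.0 / (4.0 * math.pi * EPSILON_0)) * q1 * q2 / r**2

# two charges of one microcoulomb, 10 cm apart: roughly 0.9 newtons
print(coulomb_force(1e-6, 1e-6, 0.1))
```

A positive answer here means repulsion; charges of opposite sign give a negative product and hence attraction.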
We can set up some rather special circumstances which merit a little thought:
if we take two metal parallel plates we can put a negative charge on one of
them, using Ug boots or nylon underwear if all else fails, and investigate to
see what happens in between. See ﬁgure 4.2.1 for a picture of the situation.
We take the little green ball to have some standard unit charge, supposed
to be small and positive, and we put a (large) positive charge on one plate
and an equal but opposite negative charge on the other. It is found that
there is a force which tends to accelerate the test charge in the direction
shown. This happens throughout the intervening space and we know of its
existence by looking to see how much the test charge accelerates. And only
by looking at some such test charge, because neutral objects are unaﬀected
to a ﬁrst approximation, and negatively charged test objects have the force
vector reversed. It really is a force ﬁeld because we can check back from the
Figure 4.2.1: An Electric Field.
acceleration experienced by the same charge with diﬀerent masses. It exists
throughout the space. We assume that it is there when we don’t actually
have a charge there to measure it, and we assume that trees that fall in
a forest make a noise even when there is nobody there to hear it. This is
because life seems simpler that way, and we tend to think there really is a
world outside us.
Consequently it is natural for us to believe that there is a vector ﬁeld de
ﬁned on the space in which we live, possibly changing in time, and where
the physical meaning is that this is an electric ﬁeld detected by measuring
the acceleration on a known charge and mass. Acceleration measurements
require, in principle, rulers and clocks, and we can get those. In practice
we also cheat by using geometry and trigonometry, but checking up suggests
these work pretty well.
You have to understand that this idea comes at the end of a loooooong
sequence of delicate measurements and careful experiments and seems to
explain things in a satisfactory way. In particular we can often calculate
in simple cases what we expect the ﬁeld to be like and we get very good
agreement between the numbers we calculate and the ones we measure. It
is this that we mean when we say we understand something: we get good
agreement between measurements and calculations.[4]
So we are inclined to have a certain amount of faith in the existence of electric
[4] This is not what philosophers or social scientists mean when they say they understand
something; what they mean is that they get a pleasant sensation of insight. Pleasant
sensations of insight are nice to have, and physicists and mathematicians get them too,
but we like to convince ourselves they are not bogus. Read Karl Popper’s Logic of Scientiﬁc
Discovery to get an inkling about the diﬀerence.
Figure 4.2.2: Two bar magnets attracting each other.
ﬁelds, for essentially the same reason that we believe in the existence of the
Pope. Most of us haven’t actually seen the Pope, the closest is usually a
picture, perhaps on television. But the hypothesis that he exists accounts
for a lot of phenomena which would otherwise be rather hard to explain, such
as television pictures and photographs.
Physicists also believe in Magnetic fields for similar reasons: in fact
sprinkling some iron filings on a sheet of paper under which sits a bar magnet
(obtainable from most good toyshops) makes it hard to doubt that you
are however some signiﬁcant diﬀerences between magnetism and electricity.
Electric charge comes in lumps, negative lumps and, sort of, positive lumps.
Magnetism doesn’t. People have actually looked hard for so called magnetic
monopoles and totally failed to ﬁnd them. What turns up is invariably two
of them, one called North and the other called South. The name of course
is derived from the discovery that the planet has a magnetic ﬁeld which can
be used for ﬁnding out what direction on the Earth you are pointing in, by
means of the magnetic compass. This made sailing a boat a much safer bet,
and has been known for a long time by the Chinese.[5]
Magnetism also defines a field. The picture of figure 4.2.2 shows two bar
magnets close together. In the configuration shown, the north pole of the magnet
on the left attracts the south pole of the magnet on the right. The south pole
of the magnet on the left repels the south pole of the magnet on the right,
and vice versa, but less because they are further away, and we can conﬁrm
that we have an inverse square law by using longer magnets. The force is
easily detected and we can arrange various conﬁgurations of magnets just as
we could arrange the parallel plates for testing charge. So magnetic ﬁelds
exist too.
It is natural to regard the two fields as specified by vector fields, and we
can expect to be able to describe the fields for reasonably simple
configurations: we can measure the constants in an equation similar to
equation 4.2.1, calculate the field at other points, and this works very well.
Of course we
[5] We in the West stole the idea over five hundred years ago, along with
Printing and the recipe for Gunpowder. Unfortunately we also stole the idea of
Bureaucracy off them. But we got our own back by letting them copy Marxism off
us. That slowed them down a bit, but they've figured out it was a trick quite
recently. Which is more than our educational theorists have done.
Figure 4.2.3: An electrical circuit.
don’t get the same constant
0
occurring, we get a diﬀerent constant, µ
0
.
Although an electric ﬁeld will move a charge, the magnetic ﬁeld also has an
eﬀect on charges, but only when they move. If a charge q moves at some
velocity vector v in a magnetic ﬁeld B, there is a force which is orthogonal to
both v and B which is proportional to the product of the speed, the charge,
and the strength of the magnetic ﬁeld at the point. We can write this out in
vector form as the Lorentz force law:
    F = q(E + v × B)                            (4.2.2)
This can be veriﬁed by using a Cathode Ray Tube such as occurs in old
fashioned television sets and computer monitors: just bring a magnet close
to the side of the tube and watch the screen. Try not to electrocute yourself.
All this strongly suggests that electricity and magnetism are closely related,
as is indeed the case.
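Equation 4.2.2 written out componentwise, as a sketch (plain Python; the function names are mine):

```python
def cross(a, b):
    # cross product of two 3-vectors represented as tuples
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, E, B, v):
    """The Lorentz force law, equation 4.2.2: F = q(E + v x B)."""
    vxB = cross(v, B)
    return tuple(q * (E[i] + vxB[i]) for i in range(3))

# a charge moving along x through a magnetic field along z is pushed
# along -y: orthogonal to both v and B, as described above
print(lorentz_force(1.0, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)))
```

Note that a stationary charge (v = 0) feels only the electric term, which is the earlier observation that the magnetic field acts on charges only when they move.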
One of the ways of seeing this is to take a coil of wire and join the ends to
opposite sides of large metal plates as in ﬁgure 4.2.3.
The circle labelled A is an ammeter which measures current and you can
ignore it for a ﬁrst approximation and assume the wire goes right through
it. Disconnect the wire somewhere, and charge the metal plates just as in
ﬁgure 4.2.1. Now complete the circuit. Since the wire is metal, the charge
leaves the plates and tries to neutralise itself by ﬂowing along the wire. When
ﬂowing through the coil however it creates a magnetic ﬁeld, and if you look
ahead to ﬁgure 4.3.1 you can work out that you get something very like a
bar magnet created by the ﬂowing charge. This magnetic ﬁeld is caused
by the changing current flow and it contains energy. The field acts so as
to impede the charge and in fact to send it back the way it came. Once
it is back on the plate, the magnetic ﬁeld vanishes and the process starts
again. The current, that is the moving charge, thus oscillates and this can
Figure 4.2.4: An explanation of an inverse square law.
be measured (at least in principle, although it can be rather fast) by the
ammeter, which registers a sine wave. This oscillation will eventually die
down under normal circumstances because of resistance in the wire, so we
get a decaying sinusoidal wave. You probably saw the second order ODE
which describes this process in ﬁrst year. This circuit is the basis of radio
and television transmissions.
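The decaying sine wave the ammeter registers is the solution of that second order ODE. Here is a sketch of the underdamped solution (the function name and parameter values are mine; R, L and C stand for the resistance, the inductance of the coil and the capacitance of the plates):

```python
import math

def discharge_current(t, R, L, C, q0):
    """Current i(t) in a series circuit obeying L q'' + R q' + q/C = 0,
    starting with charge q0 on the plates and no current flowing.
    Assumes the underdamped case, so the answer is a decaying sinusoid."""
    alpha = R / (2.0 * L)                      # decay rate, from the resistance
    omega0_sq = 1.0 / (L * C)                  # natural frequency squared
    omega = math.sqrt(omega0_sq - alpha ** 2)  # actual ringing frequency
    # q(t) = q0 e^{-alpha t}(cos wt + (alpha/w) sin wt), so i(t) = dq/dt is:
    return -q0 * (omega0_sq / omega) * math.exp(-alpha * t) * math.sin(omega * t)

# at the instant the circuit is completed no current flows yet
print(discharge_current(0.0, R=1.0, L=0.1, C=1e-4, q0=1e-3))
```

As R → 0 the decay rate α goes to zero and the oscillation rings on forever, which is why resistance in the wire is what eventually kills the signal.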
4.2.1 Gradient Fields
One of the possibilities for vector ﬁelds is that the direction and length of
each vector corresponds to the acceleration with which a small mass placed
at that point would fall down a hill. The question is, does such a hill exist?
If it does we say that the vector ﬁeld is the gradient of a potential ﬁeld.
For an inverse square law of attraction, as with gravity, we can imagine a
field as indicated in figure 4.2.4, where the sun, say, is at the bottom of the
well, and a planet would be a little dimple in the surface (to denote its own
gravitational ﬁeld) and would move in an ellipse, much like taking a large
sheet of rubber, putting a heavy object in the middle to deform it, and then
knocking a light ball along the rubber sheet. You can, I hope, imagine the
trajectory it would follow.
As a way of thinking about force ﬁelds this is quite productive. We can write
V for the vector ﬁeld at any point and then there is a height function f also
Figure 4.2.5: Another explanation of an inverse square law.
defined at each point, and we have that

    V = ∇f = ( ∂f/∂x, ∂f/∂y, ∂f/∂z )ᵀ
Since we like to think of things running down hills rather than up them, it
is quite usual for physicists and engineers to put a minus sign in the above
equation. Do so if it makes you feel better.
All three of the forces we are considering are of this type, gradient ﬁelds,
except for singularities at the centre of attraction.
It is clear that ∇ is pretty much differentiating f to give the three
components of the derivative of f along orthogonal axes, and this raises the
possibility that it might be more natural to regard the electric or magnetic
or gravitational fields as 1-forms rather than vector fields on the space we
live in.
4.2.2 What are Flux?
The fact that the three ﬁelds all fall oﬀ according to an inverse square law
suggests that this is a property of the space we are living in. One possible
explanation of an inverse square law of repulsion between two objects is that
each is emitting some particles, small point like objects, which hit the other
object and force it away. This would mean that the number passing through
a given area would decrease as the area is moved further away from the
source, as in ﬁgure 4.2.5.
The solid angle subtended by a disc of fixed size would be proportional to the
inverse square of the distance from the centre, so counting hits would give
an inverse square law simply as a property of the dimension of the space we
live in, reduced by one.
Even if we don’t believe in anything as fanciful as microscopic particles spat
out by charges, we can certainly think of a ﬂow of imaginary ‘stuﬀ’ put out
at a constant rate proportional to the amount of charge, and so people would
talk and write of the electric ﬂux or the magnetic ﬂux where the word ‘ﬂux’
means something which flows, and was used in medicine to mean stuff which
dribbled out of sores and noses.[6] Like the potential function representing a
hill down which objects roll, this is simply a possible way of thinking about
things and we do not feel obliged to specify the imaginary ﬂowing stuﬀ in
any detail. After all, we are doing nothing more than observing that a vector
ﬁeld has an associated ﬂow which is equivalent to it in that we can get to the
ﬂow from the vector ﬁeld by solving a system of ODEs and given the ﬂow
we can get back to the vector ﬁeld by diﬀerentiating it at every point.
Regarding Electric ﬁeld in terms of a ﬂow invites us to consider how much
ﬂows out of a region compared with how much ﬂows into the region. This
has much to do with Stokes’ Theorem and the extent to which the stuﬀ is
created. Obviously the stuﬀ ﬂows out of any charge and ought to either
be conserved or get compressed elsewhere. Imagine, to picture this, water
ﬂowing along in a stream. Now place an imaginary football in the stream.
Water ﬂows through the imaginary football as if it isn’t there, which is fair
enough because it isn’t. The point is that water is hard to compress so
the density is pretty much uniform throughout the stream, and moreover
the water doesn’t come out of nowhere or suddenly vanish. This severely
constrains the kind of vector ﬁeld that we get in a stream by attaching to
each point a little arrow saying how fast, and in which direction, the water
is moving at any point. It satisﬁes the condition that the divergence of the
vector ﬁeld is zero at every point, where we measure the divergence at a point
by putting a small box at the point, and taking the amount of water coming
out of the box less the amount of water going into the box and dividing by
the volume of the box. Now take the limit as the box gets smaller to get
a number at each point of the stream. This is the divergence of the vector
ﬁeld, and for streams it has to be zero. There is precisely as much imaginary
water ﬂowing into the imaginary football as there is ﬂowing out. You might
reasonably conjecture that the divergence of an Electric ﬁeld is zero at a
point except when there is some charge at that point, when it depends on
the sign and amount of charge.
And you would be right.
In algebra we can write the divergence of a vector field V as a function g
with

    g = ∇·V = ∂V_x/∂x + ∂V_y/∂y + ∂V_z/∂z
Then if our charge comes in lumps, which we often assume to be the case,
any little box containing a positive charge will have some net amount of ﬂux
coming out proportional to the charge, and if the little box is empty of net
[6] Many things have improved since the early Nineteenth century.
Figure 4.2.6: The Electric ﬁeld around a positive charge.
Figure 4.2.7: The magnetic dipole ﬁeld.
charge there will be just as much going in as there is coming out. Electric
ﬁeld ﬂows are incompressible, and so are magnetic ﬁelds.
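The small-box definition of divergence is easy to approximate numerically (a sketch; the function name is mine):

```python
def divergence(V, p, h=1e-5):
    """Divergence of a 3-d vector field V at the point p: the limit of
    flux out of a small box over its volume, via central differences."""
    total = 0.0
    for i in range(3):
        up = list(p); down = list(p)
        up[i] += h; down[i] -= h
        total += (V(up)[i] - V(down)[i]) / (2.0 * h)
    return total

# water moving uniformly downstream is incompressible: divergence zero
stream = lambda p: (1.0, 0.0, 0.0)
print(divergence(stream, (0.0, 0.0, 0.0)))  # 0.0
```

A field like V(p) = p, where the stuff spreads out from the origin, gives a positive answer everywhere, which is what a distributed source looks like.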
If the ﬂow is not incompressible, it may still satisfy the Continuity Equation
which holds for a larger class of ﬂows of physical systems. It says:
    ∂ρ/∂t = −∇·j
where ρ is the density of “stuﬀ” at a point and time and j is the vector ﬁeld
of the ﬂow of the “stuﬀ”. You can translate this as: What goes in must
come out or wind up as a sticky mess in the middle. It applies to every
little box you put in the ﬂow, so it makes sense in the limit as the boxes
get smaller. The right hand side can be imagined to represent the build up
of concentration of the ﬂow, which accounts for the minus sign, and the left
hand side then represents the consequence of an increase in the density.
The ﬂow of an electric ﬁeld for a point charge looks like ﬁgure 4.2.6 and the
ﬂow for a magnetic ﬁeld looks like ﬁgure 4.2.7. If we take a ﬁeld like this at
every point along a line segment and add them up we get the ﬁeld of a long,
thin bar magnet.
4.3 Maxwell and Faraday
Faraday spent a lot of time investigating the relationship between the three
forces. He didn’t ﬁnd any link between the other two and gravity, although
he spent a long time looking as you will see from the Notebooks. But he did
ﬁnd some important relationships between electricity and magnetism. Some
of this had also been done by Ampère in France.[7]
The key things that turn up are that a moving charge produces a magnetic
ﬁeld, and that a changing magnetic ﬁeld moves charge.
Electrons move rather easily through metals. The electrons in metals that
are attached to the positively charged nuclei in the atoms may be bound
more or less tightly in the atoms, and the outer electrons are more or less
communal to a set of atoms in the crystal lattice which metals form. A
bar of iron is basically a mess of little crystals all jammed together; if you
heat it and let it cool very slowly, you get fewer and bigger crystals, in an
extreme case, practicable only for small bits of iron, you can get a single
crystal. Trying to make it one big crystal is important in some applications
because the strength of a crystal is much greater than the strength of the
metal mixture. Or to put it another way, when you pull at two ends of a
wire, it comes apart at the places between the crystals, not in the middle of
a crystal. And electrons are very small and light. So a metal looks to an
electron rather like a sequence of mostly empty rooms (the crystals) and the
electrons are rather like a swarm of ﬂies, buzzing about aimlessly. Except
that the ﬂies repel each other. When an electric ﬁeld is put across the wire,
the electrons drift in the direction forced by the ﬁeld. In eﬀect, if you pump a
bunch of electrons in at one end of a piece of wire, they repel nearby electrons
and so on so a compression wave passes rather quickly down the wire.
It makes sense therefore to talk of the current, which is basically a count of
the number of electrons passing a point in a second.[8] By measuring charge
in some more practical way we can write i = dQ/dt where charge Q moves
[7] During the Napoleonic Wars in Europe, Faraday and Sir Humphry Davy
travelled around to talk to the physicists in various European countries. They
regarded the war as rather a nuisance, and had to avoid the fighting, which
they saw as a form of insanity to which some people are addicted. They didn't
need passports, which hadn't been invented: it was any freeborn Englishman's
right to go wherever he wanted. Passports were introduced later under the
usual excuse that the government wants to help you. Few people of any
intelligence believed this in early nineteenth century Britain, it being too
obvious that politicians mainly want to help themselves. Not everything has
improved since the early Nineteenth century.
[8] But with the direction reversed, because current is positive charge and
electrons are negative.
Figure 4.3.1: The magnetic ﬁeld of a current (moving charge).
along a wire, or even in a stream through empty space. Since what goes in
must come out and since the electrons are not going to bunch up if they can
help it, the current at one point of a piece of wire must be the same at any
other point except for brief transients when we switch on the process. It is
clear that current is a vector since moving charge has a direction associated
with it.
Michael Faraday, one of the finest experimentalists the world has produced
and an all round smart cookie, established that charge moving along a wire
produces a magnetic field which circles the wire. Drawing curves for the flow
of the field we get something like figure 4.3.1. I have drawn only a section at
one point of the wire; there is such a set of circles centred on every point.
He also found that when a magnetic ﬁeld changes it induces a current. This is
how we get our mains electricity from power stations. We spin a magnet and
surround it by a coil of wire, in effect. Some serious googling or any
elementary text book on electricity should show you exactly how this is
done.[9]
James Clerk Maxwell took the ﬁndings of Faraday and wrote them out in
Algebra.
If you reﬂect that changing magnetic ﬁelds produce an electric ﬁeld and
changing electric ﬁelds produce a magnetic ﬁeld, it might occur to you that
this swapping of energy between the two forms might happen in a cyclic way,
and might indeed happen in empty space. It might even occur to you that
such a cycling arrangement might travel through space. Your opinions would
not however count for much and would be considered of very minor interest,
mostly by your friends and relatives, and some of them might consider your
views as evidence of insanity. If however you were to take your wamblings
and turn them into algebra, you might be able to prove that this could
indeed happen and show how to calculate the speed of propagation of such
[9] There are people who are convinced they understand electricity: when you
click the switch the light comes on, or maybe the television set, although this
generally requires a different switch. There is more to it than that, and it is
a good idea to understand it a little better, or you are not really a member of
our civilisation, just a freeloading parasite on it, hardly better than a
politician or an arts graduate.
an electromagnetic wave in terms of constants which were properties of the
vacuum, such as ε₀ and the corresponding magnetic one μ₀. And if this turned
out to be the same as the measured velocity of light, you would eventually
be taken very seriously. This is what Maxwell did. The velocity of light
just happens to be 1/√(ε₀μ₀), both of which were known from entirely static
experiments. And it led shortly afterwards to people trying deliberately to
produce such electromagnetic waves, and this led on to radio, radar, television
and most recently mobile phones.[10]
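The arithmetic is easy to check with modern values of the constants (a sketch; the values of ε₀ and μ₀ below are the standard SI ones):

```python
import math

EPSILON_0 = 8.8541878128e-12   # electric constant of empty space (SI)
MU_0 = 1.25663706212e-6        # magnetic constant of empty space (SI)

# Maxwell's punchline: two numbers obtained from purely static experiments
# combine to give the measured speed of light
c = 1.0 / math.sqrt(EPSILON_0 * MU_0)
print(c)   # about 2.998e8 metres per second
```

That a wave speed falls out of tabletop measurements on charges and magnets is the whole argument for light being an electromagnetic wave.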
Maxwell’s Equations are four in number and state things that are known
about the electric and magnetic ﬁelds. Two deal with the nature of the ﬁelds
separately and two deal with the interaction between them. I give them here
for deﬁniteness in more or less the same form as the text book. We suppose
that E is the electric ﬁeld, B is the magnetic ﬁeld and ρ is the density of
charge.
    ∇·E = ρ                                     (4.3.1)

    ∇·B = 0                                     (4.3.2)

    ∇×E + ∂B/∂t = 0                             (4.3.3)

    ∇×B − ∂E/∂t = j                             (4.3.4)
The vector j is in the direction of the moving charge and has norm equal to
the rate of flow of charge.
These are very diﬀerent from the form that Maxwell wrote them in, which
were much longer and not so compressed, and we shall get an even terser
form later. I note that there are some constants for the medium which have
been ﬁxed up to make the velocity of light one. This is just a choice of units
in which to measure things and so is quite harmless and makes equations
simpler.
[10] Whether this is altogether desirable may be doubted, but there are at
least some benefits. Certainly the reason we are much better off than the
inhabitants of Bangladesh or Congo is that we are more closely related to
Isaac Newton, Michael Faraday and James Clerk Maxwell than they are,
biologically or socially. And we live with traditions which are still in many
ways similar to the traditions in which these men lived and produced the
amazing changes that they did. There are also some differences. Nowadays,
instead of being funded by the Royal Society at the discretion of Sir Humphry
Davy (its president), Faraday would have had to submit a grant application to
a committee to study electricity. It is very doubtful if he'd get it. First he
had no appropriate qualifications, and second the practical applications would
certainly have been beyond the imagination of the kind of people who enjoy
being on committees. He'd probably have been told to give up all this foolery
with wires and magnets and work on more powerful steam engines.
4.3. MAXWELL AND FARADAY 123
Figure 4.3.2: The curl of a vector ﬁeld.
The ﬁrst two equations simply say that the electric and magnetic ﬁelds are
incompressible, that the ﬂux into a region is equal to the ﬂux out in the
case of an electric ﬁeld, except in a region containing charge, and that the
magnetic ﬁeld is always incompressible (there are no magnetic monopoles).
The second two contain the information about the interchange between
magnetic fields and electric fields. It is essential to understand what they
are saying; do not merely memorise them.
The curl of a vector ﬁeld is the extent to which it tends to twist around
some axis. If you visualise a stream of water ﬂowing and you put a very
small paddle in it, then in general the paddle gets turned by the ﬂow being
greater on one side than on the other. Figure 4.3.2 gives the idea.
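The paddle picture can be checked symbolically. A minimal sketch in Python using sympy, with a shear flow chosen for illustration (it is not a field from the text): the speed in the x direction grows with y, so a small paddle wheel gets twisted about the z axis.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# An illustrative shear flow: faster on one side than the other.
V = sp.Matrix([y, 0, 0])

def curl(F):
    # Curl in Cartesian coordinates:
    # (dFz/dy - dFy/dz, dFx/dz - dFz/dx, dFy/dx - dFx/dy)
    Fx, Fy, Fz = F
    return sp.Matrix([sp.diff(Fz, y) - sp.diff(Fy, z),
                      sp.diff(Fx, z) - sp.diff(Fz, x),
                      sp.diff(Fy, x) - sp.diff(Fx, y)])

print(curl(V).T)  # Matrix([[0, 0, -1]]): a constant twist about the z axis
```

The sign tells you which way the paddle turns; reversing the shear reverses it.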
The curl can be thought of as a vector by taking the amount of twist about
the positive x-axis, the positive y-axis and the positive z-axis to give three
components; alternatively we can take the direction of the vector to be that
in which the rotation is a maximum and the length equal to the maximum
torque. Only a little thought suggests that it would be much more natural
to think of it as a 2-form, when it is simply the exterior derivative of the
1-form which replaces the vector field. This is undoubtedly a better way to
think of it, as it generalises to higher dimensions quite naturally. And of
course the divergence can be thought of as applying the exterior derivative
to a 2-form to get a 3-form, which on R³ is, at each point, a number. So we
may anticipate that the next stage of writing these equations out will be to
turn them into differential forms instead of vector fields.
For the present however, equation 4.3.3 says that the curl of the electric field
is the rate of change of the magnetic field with the direction reversed. We
have to think of B as a vector field which depends upon the time: if we
take each of the three components it is a function of x, y, z and t, and if we
differentiate it (partially) with respect to t we get a new vector field, also
usually depending on t. So equation 4.3.3 says that for every time t, the
vector field ∇×E is the negative of the time derivative of B. The amount of
twist of the electric field depends upon the rate at which the magnetic field
is changing. This is, like equation 4.5.1, and of course equation 4.3.4, part of
the interconnection between magnetic and electric fields.
Finally, equation 4.3.4 is almost dual to equation 4.3.3 except for
a minus sign and the j term, and tells us something about the curl of the
magnetic ﬁeld at every time in terms of the current vector and the rate of
change of the electric ﬁeld. The latter term was introduced by Maxwell
not on the basis of experimental results, but because it led to the wave
solution to the four equations. One suspects strongly he had done the vague
English language argument about the exchange of energy between electric
and magnetic ﬁelds in free space and wanted to make it come good.
In order to collect your ideas on these equations, and to recall some earlier
work, some simple exercises will establish what is going on. If you did second
year physics you have probably already done these, although not perhaps in
this form.
Exercise 4.3.1. Find a vector field V on R³ with constant curl the vector
(0, 0, 1)ᵀ. Find some more vector fields with the same curl. Show that there
is an infinite dimensional space of vector valued functions on R³ which can
be added to your solution to give another solution.
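One way to check a candidate answer, sketched in sympy. The field below is one possibility among many (a rigid rotation about the z axis); the point of the last part of the exercise is that adding the gradient of any smooth function leaves the curl unchanged.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def curl(F):
    Fx, Fy, Fz = F
    return sp.Matrix([sp.diff(Fz, y) - sp.diff(Fy, z),
                      sp.diff(Fx, z) - sp.diff(Fz, x),
                      sp.diff(Fy, x) - sp.diff(Fx, y)])

# One candidate (there are many others): a rigid rotation about the z axis.
V = sp.Matrix([-y / 2, x / 2, 0])
print(curl(V).T)  # Matrix([[0, 0, 1]])

# Adding the gradient of an arbitrary smooth function f does not change the
# curl, since curl(grad f) = 0 by equality of mixed partials. This is where
# the infinite dimensional space of further solutions comes from.
f = sp.Function('f')(x, y, z)
W = V + sp.Matrix([sp.diff(f, x), sp.diff(f, y), sp.diff(f, z)])
print(sp.simplify(curl(W)).T)  # Matrix([[0, 0, 1]]) still
```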
Exercise 4.3.2. Find a vector field V on R³ which has curl everywhere
zero, but which has the property that the integral around the unit circle (in
the z = 0 plane) of V is nonzero.
Exercise 4.3.3. A current vector j is defined to be uniformly (0, 0, 1)ᵀ for
points of distance less than one from the z-axis. You may imagine a rod of
radius 1 along the z-axis carrying a current. The current is zero for points
at a distance from the z-axis greater than one (that is, outside the rod). Find
a continuous magnetic field the curl of which is the given j field.
Explain why continuity is worth having, and given that there are rather a lot
of other solutions, explain what grounds you have for preferring yours. You
might ﬁnd ﬁgure 4.3.1 inspirational.
Exercise 4.3.4. Show that a wave is a solution to the Maxwell
Equations in empty space. First write down the equation of an electric field
which has all the vectors in a plane the same length and direction: take, say,
planes x = s and arrange to have for ﬁxed s, every electric vector the same
length and direction at any time t, but change the vector in time and also with
s so that it has unit speed along the x axis. Now look to see if this satisﬁes the
Maxwell Equations, for ρ and j zero, by doing some partial diﬀerentiating.
When you have done so, throw your hat in the air and shout ‘huzzah!’ in
celebration. You have seen the light.
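If you would rather let a computer do the partial differentiating before throwing your hat, here is a sketch in sympy. The wave profile f is an arbitrary smooth function, and putting E along the y axis with B along the z axis is one possible arrangement (assumed here), not the only one.

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')

# A plane wave moving with unit speed along the x axis: on each plane x = s
# every electric vector is the same, and the profile slides along with time.
f = sp.Function('f')
E = sp.Matrix([0, f(x - t), 0])
B = sp.Matrix([0, 0, f(x - t)])

def curl(F):
    Fx, Fy, Fz = F
    return sp.Matrix([sp.diff(Fz, y) - sp.diff(Fy, z),
                      sp.diff(Fx, z) - sp.diff(Fz, x),
                      sp.diff(Fy, x) - sp.diff(Fx, y)])

def div(F):
    return sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z)

# All four vacuum Maxwell equations, with rho = 0, j = 0 and units where c = 1:
print(div(E), div(B))                          # 0 0
print(sp.simplify(curl(E) + sp.diff(B, t)).T)  # Matrix([[0, 0, 0]])
print(sp.simplify(curl(B) - sp.diff(E, t)).T)  # Matrix([[0, 0, 0]])
```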
Figure 4.3.3: What is the eﬀect on a television set of a magnet?
Exercise 4.3.5. A beam of electrons is emitted by a cathode at the back of
a television set and paints a spot on the centre of the screen. Traditionally,
deﬂector plates are charged so as to sweep the spot in a raster scan giving your
television picture. I show a horizontal section through the tube in ﬁgure 4.3.3.
Discuss what happens when a bar magnet is placed in the location shown.
What numbers would you need to know in order to calculate the deflection of
the beam and its direction?
Exercise 4.3.6. Read the first twenty chapters of volume Two of the
Feynman Lectures on Physics (in the library). Chapter nineteen is of no direct
relevance but is good fun and worth reading to see how a great physicist thinks.
If you have been doing physics you should ﬁnd this easy, but there are some
penetrating observations which you may want to think about. If you haven’t
done much physics, again this will show you something of what you have been
missing.
Exercise 4.3.7. Read Chapter four of the text book and do all the exercises.
Remark 4.3.1. The work which has been described so far has changed the
world, mainly for the better, and changed it enormously. It is the product
of the Western Intellectual Tradition, and it is worth reﬂecting on the kind
of society which can produce such things, and also on the kinds of society
which cannot, which is most of them.
Maxwell’s Equations represent one of the glories of Western civilisation,
something which is likely to remain as long as humanity endures and possibly
for much longer. Maxwell, Faraday and others stole lightning from the gods:
these men are heroes far beyond such as Alexander, Caesar or Napoleon.[11]

[11] Or miscellaneous footy players, or people who hit balls with sticks. Or people who
play guitars or sing. The list goes on.
Figure 4.4.1: A ball about to bounce oﬀ a wall.
Your life at University is being spent in part at least in coming to
understand the thinking of the great men who produced these marvels, and also
to understand something of how they did it. There are worse ways to spend
your time.[12]
4.4 Invariance
4.4.1 The Idea of Invariance
Imagine a ball rolling in the plane and bouncing off a fixed wall, as in figure
4.4.1.
If the ball has initial momentum p in the direction of the arrow, then it is
simple to compute the new momentum after the ball has bounced: the
component parallel to the wall is unchanged and the component perpendicular
to the wall is reversed in sign.
This makes a number of assumptions which are less than satisfactory; one
is that the ball is elastic since if it was made of putty it might stick to the
wall after deforming. So we also assume that energy is conserved by the
impact, in general not a realistic assumption, but approximately true for
bodies which are elastic enough. We also assume that the wall is rather
ﬂat and very smooth, since the ball will actually impact over a region, not
at a point, and if diﬀerent bits of the wall made diﬀerent angles with the
trajectory then the behaviour is potentially more complicated and harder to
compute. When you did this sort of thing at school I rather suspect they just
[12] Conquering Europe, or anything involving balls or guitars, for instance.
trained you to make the assumptions that the school teacher made without
asking many questions, so you probably never questioned the assumptions
and indeed didn’t even think about what they were. There can be rather a
lot.
Exercise 4.4.1. Think of some more assumptions that are necessary to get
a solution to the problem as posed.
In order to do the calculation, subject to all the usual assumptions, we need
to take a coordinate system, and some are better than others. I have shown
some axes in the left lower side of the picture. I haven’t however marked on
any units and you don’t know what units the momentum is given in either. It
should be fairly obvious that the units don’t much matter, in that whatever
we choose, as long as we take the same ones after as before we will get the
same answer. What is crucial is the angle the momentum vector makes with
the wall.
Now suppose we change the position and orientation of the axes. I do the
calculation in the original system and you do it in the new system. We can
translate everything from one to the other; your initial momentum vector p
will consist of an ordered pair of numbers, and an initial point for the ball
will also consist of another ordered pair of numbers. So will mine, and mine
will be diﬀerent. It is easy to write down a euclidean group element which
will reliably translate your numbers to mine and the inverse will translate
my numbers to yours. And the resulting momentum vector, speciﬁed by a
direction and a point through which it passes, will translate by the same rule.
This is just like the situation of chapter one where I talked about penguins:
we have language which consists of ﬁnite lists of numbers, and we have the
physical entities, and the behaviour had better be described in the same way
whatever the language, because what happens in the world does not depend
on the language we use to talk about it. This is a key assumption about the
physical world which we use to put conditions upon the kinds of language
we shall use to talk about it.[13]
In the above case we can also change the units, you can use metres per second
and I can use parasangs per lunar month; the translation system still works
in that the system that translates the initial momentum from yours to mine
will also translate the ﬁnal momentum from yours to mine. In fact it is
diﬃcult to think of diﬀerent systems of specifying initial and ﬁnal momenta
and positions where a consistent translation system will not work.
[13] This condition does not seem to apply to ones love life. There are tactful ways a bloke
can tell his girlfriend he doesn't like her outfit, and there are others. He may tell the truth
in both cases, but the language makes a difference. In particular, it is always a mistake
to laugh. I speak from bitter experience.
Exercise 4.4.2. But not impossible. Hint: consider the map from the polar
coordinate space r, θ to itself that doubles the angle. This is not a bijection:
we can translate events one way, but not unambiguously the other.
What this means is that if we have two languages for talking about events,
then as long as the translation scheme between the two languages is a
bijection, and as long as an event can be specified in one language, then it can
also be specified in the other, and the translation scheme will 'work' for all
such events if those events are correctly described.
But as well as specifying observable events, we also want to predict what will
happen in advance by means of some kind of theory. And it is going to be a
poor sort of theory where the prediction is diﬀerent in languages with such a
translation scheme to hand. This imposes a constraint on the theory: it must
translate the same predictions into each other. This is known as Einstein's
Principle of Covariance and you should be able to see how he (and Poincaré)
came to it: by seeing different coordinate frameworks as providing different
languages and there being a translation system between them.
We normally have not just two languages and one translation system between
them but a whole space of languages and a group of translations schemes,
since given any three languages, if I can translate from A to B and from B
to C by maps, then I can translate from A to C by the composite; moreover
the identity will translate from any language to itself, and we really want
every translation system to work in both directions, so there is an inverse
map. The associativity of composites of maps is immediate, so we have a
group of such translation systems. In the case of the shifting and rotation of
a coordinate frame used to specify only the positions of points, this is clearly
the Special Euclidean Group, SE(2, R). See the M2213 lecture notes if you
have forgotten this.
If f : R² → R² is any map, it makes sense to ask if it is invariant with
respect to a group action. For example, f(x, y) = x² + y² is invariant under
the rotation group SO(2, R): putting

X = x cos θ − y sin θ
Y = x sin θ + y cos θ

we easily confirm that f(X, Y) = f(x, y) for any θ. So if some sort of
prediction is speciﬁed by a function we can look to see if it is invariant under
the appropriate group of transformations of coordinates which we regard as
specifying the possible languages we have available, and if it is not then it
cannot possibly deﬁne a satisfactory theory, because diﬀerent observers will
expect to have different and incompatible outcomes. If a theory is specified
by requiring two functions to be equal, then they must be equal both before
and after we perform the appropriate group actions on them.
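The little rotation-invariance example above can be checked numerically in a few lines of Python (the particular point and angles are arbitrary choices):

```python
import math

# f(x, y) = x**2 + y**2 should give the same answer in every rotated frame.
def f(x, y):
    return x**2 + y**2

def rotate(x, y, theta):
    # The SO(2, R) action: X = x cos t - y sin t, Y = x sin t + y cos t.
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

x0, y0 = 3.0, -1.5
for theta in (0.1, 1.0, 2.5):
    X, Y = rotate(x0, y0, theta)
    assert abs(f(X, Y) - f(x0, y0)) < 1e-12  # invariant, up to rounding
```

A function which failed this test could not be part of a satisfactory theory, since observers using different axes would disagree.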
What is the group action in the case of the ball bouncing off the wall? We
have that the space in question is the space of positions and momenta of
balls. This we have seen is the cotangent space to R², which is isomorphic to
R⁴. It is not uncommon to write elements of this in the form (q₁, q₂, p₁, p₂),
where the qᵢ are the positions, x and y more conventionally, and the pᵢ are
the momenta. If we do a shift of the coordinate system, this will affect the qᵢ
but not the pᵢ. If we do a rotation, both will be affected in the same way.
We can also consider a coordinate frame which is moving at a constant
velocity with respect to another one, requiring us to specify also the time. So
we have a five dimensional space in which to specify the position and
momentum of the ball at each time, two coordinate frameworks for turning the
motion into a map from R denoting the time into R⁵, and a map between
them which has the property of taking one description to another description
of the same event.
Exercise 4.4.3. Write down a specification for a ball moving in a straight
line in R² with constant momentum. Use the five numbers (t, q₁, q₂, p₁, p₂).
Choose actual numbers for the motion!
1. Take a coordinate frame which is shifted by some constant amount and
translate the function giving the position and momentum of the ball into
the new framework.
2. Do the same with a rotated coordinate frame.
3. Do the same for a frame which is both rotated and shifted.
4. Do the same for a frame which is both rotated and is moving at constant
velocity.
5. Do the same for a frame which is rotating with constant angular
velocity.

6. Write down the groups for the first three transformations. What
physically intelligible function is invariant under this group?
7. Write down the group for all the ﬁrst four transformations. (This is
called the Galilean Group.) What is its dimension?
8. Write down the group for all ﬁve transformations.
9. Is your function invariant under either of the last two groups?
10. What happens when the ball bounces?
11. Explain the physics here.
Exercise 4.4.4. If V is a vector field on R³ and f : R³ → R³ is a euclidean
transformation, is it true that when V satisfies the equation ∇·V = 0 then
so does Tf(V)? If so prove it, if not give a counterexample.
Exercise 4.4.5. If V is a vector field on R³ and f : R³ → R³ is a
diffeomorphism, is it true that when V satisfies the equation ∇·V = 0 then so
does Tf(V)? If so prove it, if not give a counterexample.
4.4.2 The Lorentz Group
The definition of the Orthogonal group O(n, R) was either that it consisted of
the orthogonal n × n real matrices, or, better, that it consisted of the linear
maps from Rⁿ to Rⁿ which preserved the inner product. Formally,

∀ A ∈ L(Rⁿ, Rⁿ), A ∈ O(n, R) ⇔ ∀ u, v ∈ Rⁿ, ⟨u, v⟩ = ⟨Au, Av⟩
So as to simplify things I shall now take a generalised inner product on R²,
which I shall write as having elements

(t, x)ᵀ

and the (lorentzian) inner product is defined to be

⟨(t, x)ᵀ, (t′, x′)ᵀ⟩ = tt′ − xx′

Note that I have reversed the sign from what you probably expected and the
one that the text book favours. If you feel uneasy about this go through
multiplying everything by −1.

It follows that the norm of the vector (t, x)ᵀ is t² − x².
I shall argue by analogy with the usual inner product on R².
In order to find out what the orthogonal maps looked like on R², we took
the unit circle, and argued that any point on it had to remain on it under
an orthogonal map. Doing the same here, take the set

H₁ = { (t, x)ᵀ ∈ R² : t² − x² = 1 }
Figure 4.4.2: An analogue of the unit circle in a Lorentzian space.
Then if A preserves the new inner product, or is lorentzian, we have that

(t, x)ᵀ ∈ H₁  ⇒  (s, u)ᵀ = A (t, x)ᵀ ∈ H₁  ⇒  s² − u² = 1
It is easy to draw the set t² − x² = 1 and it consists of a hyperbola as in
figure 4.4.2.
My reason for drawing it this way around and taking t² − x² and not x² − t²
is that all the action has x < t which, given that the velocity of light is one
in these units and that things don't travel faster than light, is the way things
ought to be.
Now we can parametrise the unit circle by cos θ, sin θ and it is easy to
parametrise the curve H₁ by

t = cosh θ,  x = sinh θ

since

cosh²θ − sinh²θ = (e^{2θ} + e^{−2θ} + 2)/4 − (e^{2θ} + e^{−2θ} − 2)/4 = 1
Now we note that the standard basis elements in R² are t = 1, x = 0 and
t = 0, x = 1, and that the norm of the first is 1, so it is in H₁, and the norm
of the second is −1 and it is not in H₁. So there is a slight problem with
defining a Lorentzian matrix in terms of cosh θ and sinh θ. The solution is
to observe that we need to extend H₁ to contain the other hyperbola which
intersects the x axis, as in figure 4.4.3.
Figure 4.4.3: A better analogue of the unit circle in a Lorentzian space.
We can now see that we should have defined

H₁ = { (t, x)ᵀ ∈ R² : t² − x² = ±1 }

which would have made no difference in the case of S¹ and the usual inner
product had we done it, but does make a difference here. We now have that
the lorentzian matrices are those of the form, for any real θ:

⎡ cosh θ  sinh θ ⎤
⎣ sinh θ  cosh θ ⎦
Those vectors which lie on the original hyperbola are called timelike,
since they represent velocities which are less than 1 and correspond to things
we may see in our universe. Light, which moves at the velocity 1 in our units,
must lie along x = ±t, and consists of vectors not in H₁; the norm of
any such vector is zero. So the analogue of distance in our Lorentzian space,
which we call the interval, is zero for any light ray. Seen from the point
of view of a ray of light, there is no difference between starting from the
Andromeda Galaxy and arriving in your eye. This is definitely weird; well,
that's reality for you.
Supposing we start with a timelike vector in the two dimensional Lorentzian
space, for example t = 2, x = 1. It goes to

t = 2 cosh θ + sinh θ,  x = 2 sinh θ + cosh θ

It is easy to verify that the norm of the original vector is 3 and so is the
norm of the final vector.
Exercise 4.4.6. Confirm that all such matrices as those advertised do
indeed preserve the lorentzian form. What is their determinant? What other
matrices preserve the Lorentzian form? What is their determinant?
Now let's get back to higher dimensional spaces with a (1, n) signature. I
have the usual spacetime situation with (x⁰, x¹, x², . . . , xⁿ)ᵀ and the lorentzian
generalised inner product x⁰x⁰ − Σ_{j∈[1:n]} xʲxʲ. I am particularly concerned
with n = 3 because that is the number of spatial dimensions of the universe
we live in.
There are six "basic" lorentzian matrices in R⁴ with the lorentzian inner
product:

⎡ 1  0  0  0 ⎤   ⎡ 1  0  0  0 ⎤   ⎡ 1  0  0  0 ⎤
⎢ 0  1  0  0 ⎥   ⎢ 0  c  0  s ⎥   ⎢ 0  c −s  0 ⎥
⎢ 0  0  c −s ⎥   ⎢ 0  0  1  0 ⎥   ⎢ 0  s  c  0 ⎥
⎣ 0  0  s  c ⎦   ⎣ 0 −s  0  c ⎦   ⎣ 0  0  0  1 ⎦      (4.4.1)

gives three of them, where the c and s entries represent cosines and sines
of angles and give the three dimensional space of real orthogonal matrices.
The time axis is left fixed in this case, and each of these leaves one other
orthogonal axis fixed.
The other three are

⎡ ch sh  0  0 ⎤   ⎡ ch  0 sh  0 ⎤   ⎡ ch  0  0 sh ⎤
⎢ sh ch  0  0 ⎥   ⎢  0  1  0  0 ⎥   ⎢  0  1  0  0 ⎥
⎢  0  0  1  0 ⎥   ⎢ sh  0 ch  0 ⎥   ⎢  0  0  1  0 ⎥
⎣  0  0  0  1 ⎦   ⎣  0  0  0  1 ⎦   ⎣ sh  0  0 ch ⎦      (4.4.2)

where ch is short for cosh θ and sh is short for sinh θ for various θ. Each of
these swaps the time into one of the three space axes and vice versa. Again,
two axes are left fixed. They are known to physicists as Lorentz Boosts.
Exercise 4.4.7. Show that each of the above six matrices preserves the
lorentzian inner product, and hence that any composite of them (for any
consistent values of the argument of cos, sin, cosh or sinh) will also.
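A numerical spot check of Exercise 4.4.7 is easy with numpy. A matrix A preserves the lorentzian inner product exactly when AᵀηA = η, where η is the diagonal matrix of the form; the rapidity value below is an arbitrary choice:

```python
import numpy as np

theta = 0.7  # an arbitrary rapidity
ch, sh = np.cosh(theta), np.sinh(theta)

# The first boost matrix of 4.4.2, mixing t with x.
L = np.array([[ch, sh, 0, 0],
              [sh, ch, 0, 0],
              [0,  0,  1, 0],
              [0,  0,  0, 1]])

# The matrix of the lorentzian inner product: <u, v> = u^T eta v.
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# A preserves the form exactly when A^T eta A = eta.
assert np.allclose(L.T @ eta @ L, eta)

# Its determinant is 1, since cosh^2 - sinh^2 = 1.
assert np.isclose(np.linalg.det(L), 1.0)
```

The same check, with a rotation block in place of the boost block, works for the three matrices of 4.4.1; a proof for all θ still requires the algebra of the exercise.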
Exercise 4.4.8. Show that every matrix which preserves the lorentzian inner
product must be some ﬁnite product of such matrices.
Exercise 4.4.9. Show that the Galilean group can be represented by matrices
of the form

⎡ 1   0   0   0  ⎤
⎢ v₁ a₁₁ a₁₂ a₁₃ ⎥
⎢ v₂ a₂₁ a₂₂ a₂₃ ⎥
⎣ v₃ a₃₁ a₃₂ a₃₃ ⎦

where

⎡ a₁₁ a₁₂ a₁₃ ⎤
⎢ a₂₁ a₂₂ a₂₃ ⎥
⎣ a₃₁ a₃₂ a₃₃ ⎦

is in SO(3, R).
Remark 4.4.1. Note that both the Lorentz group and the Galilean group
can deal with a change of coordinates from a fixed system to one moving at
uniform velocity with respect to it. And they are different! The Lorentz group
is the right one for relativity. You should observe that for the Lorentz group,
movement with velocity v means setting tanh(θ) = v, and that we recover the
usual (relativistic) rules for the addition of velocities.
Exercise 4.4.10. Find a good source on Special Relativity: you could do
worse than the Feynman Lectures on Physics, Volume 1, chapter 15. Note
the equations

x′ = (x − ut)/√(1 − u²)
y′ = y
z′ = z
t′ = (t − ux)/√(1 − u²)

Show that these are essentially the inverse of the first matrix in 4.4.2.
Explain why physicists use the inverse.
Exercise 4.4.11. Show that if a space ship zooms past you at velocity half
that of light, and another spaceship zooms past that at half the speed of light
(relative to the first spaceship) in the same direction, then you will decide the
second spaceship has a velocity of 4/5 the speed of light. Show that if the first
had speed u and the second had speed v relative to the first, your opinion of
the speed of the second is given by

w = (u + v)/(1 + uv)

Show that if |u| < 1 and |v| < 1 then |w| < 1.
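The numerical half of this exercise, sketched in Python. The second computation uses the remark above: velocities are tanh of rapidities, and composing two boosts just adds the rapidities, which is why the formula comes out as it does:

```python
import math

def add_velocities(u, v):
    # Relativistic addition of parallel velocities, in units where c = 1.
    return (u + v) / (1 + u * v)

print(add_velocities(0.5, 0.5))  # 0.8, i.e. 4/5 of the speed of light

# The same answer via rapidities: tanh(atanh(u) + atanh(v)).
u, v = 0.5, 0.5
w = math.tanh(math.atanh(u) + math.atanh(v))
print(w)  # 0.8 again
```

Since tanh of any real number has absolute value less than 1, the rapidity picture also makes |w| < 1 obvious.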
Exercise 4.4.12. Read Feynman's Lectures on Physics, Volume 1,
chapter 15. If you are a physicist you will have already covered this
material, if not you will find it comforting to discover you have now done Special
Relativity. Easy, isn't it? Note that apart from a few technical difficulties (!)
you have discovered how atom bombs and nuclear power stations work.[14]
[14] The details of atom bombs are very simple, and the recipe is as follows: take about
4.4.3 The Maxwell Equations
The first of Maxwell's equations for the vacuum, with charge density zero, is

∇·E = 0

To see that this is invariant under SO(3, R) is in one sense trivial. The
equation says that the net outflow of the flux determined by the vector field E for
any little box is zero, in fact for any box whatever, where a box is a region
bounded by something diffeomorphic to a 2-sphere. Rotating a box gives
another box, so the net outflow from a rotated box is also zero. This obviously
extends to the non-vacuum case with a non-zero charge density. It obviously
holds for a much larger group than SO(3, R) too: it must hold for any
diffeomorphism, although the equation stating the fact that the divergence is zero
would look rather different.
Although this argument is persuasive, it lacks a certain rigour, so a slightly
more careful argument is required. We can observe that when we take the
divergence of the original vector ﬁeld at any point it has to be the same as
the divergence of the transformed ﬁeld at the transformed point. And since
the zero map is also preserved by the orthogonal map, the result follows.
The equation ∇·B = 0 is also invariant for the same reason.
Exercise 4.4.13. Show that the statement 'the divergence of the original
vector field at any point has to be the same as the divergence of the
transformed field at the transformed point' can be translated into algebra by doing
it, and that it is true for any vector field.
Now this argument uses only the linearity of the matrix and the fact that
t′ = t, and indeed is far more general than that.
It remains to prove invariance for the boost maps, the first of which is

⎡ ch sh  0  0 ⎤
⎢ sh ch  0  0 ⎥
⎢  0  0  1  0 ⎥
⎣  0  0  0  1 ⎦
The problem here is that the electric and magnetic fields are defined as vector
fields on R³ and the lorentz boost maps are represented by 4 × 4 matrices,
so we can expect some serious complications.
[14, continued] a kilogram of Uranium 235 or Plutonium and shape it into a hemisphere. Do the same
with a second kilogram. Now clap them together hard to make a solid sphere. Children,
do not do this at home unless you really dislike your parents.
The actual transformation of the E and B fields is rather a shock at first:
under the boost

⎡ cosh θ  sinh θ  0  0 ⎤
⎢ sinh θ  cosh θ  0  0 ⎥
⎢   0       0     1  0 ⎥
⎣   0       0     0  1 ⎦

the fields transform by

(Ex, Ey, Ez)ᵀ → (Ex, Ey cosh θ − Bz sinh θ, Ez cosh θ + By sinh θ)ᵀ    (4.4.3)

and

(Bx, By, Bz)ᵀ → (Bx, By cosh θ + Ez sinh θ, Bz cosh θ − Ey sinh θ)ᵀ    (4.4.4)
What is surprising about this is that the Electric and Magnetic components
get mixed up. This means that if I am travelling in the x direction with
velocity tanh θ (which you will note has absolute value always less than 1,
the speed of light) and we both measure an electric ﬁeld and a magnetic
ﬁeld, we shall diﬀer on which bits are which. This is a strong hint that the
two phenomena of electric ﬁelds and magnetic ﬁelds are all part of the same
underlying entity, called the electromagnetic ﬁeld.
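One way to see that a single underlying entity is hiding here: although the two observers disagree on E and B separately, they agree on the combinations E·B and |E|² − |B|². These two standard invariants are not stated in the text, so take the claim as an assumption and let sympy verify it against the transformation rules 4.4.3 and 4.4.4:

```python
import sympy as sp

th = sp.symbols('theta')
Ex, Ey, Ez, Bx, By, Bz = sp.symbols('E_x E_y E_z B_x B_y B_z')
ch, sh = sp.cosh(th), sp.sinh(th)

# The transformed fields, copied from 4.4.3 and 4.4.4.
Ep = sp.Matrix([Ex, Ey * ch - Bz * sh, Ez * ch + By * sh])
Bp = sp.Matrix([Bx, By * ch + Ez * sh, Bz * ch - Ey * sh])

E = sp.Matrix([Ex, Ey, Ez])
B = sp.Matrix([Bx, By, Bz])

# Both observers agree on E.B and on |E|^2 - |B|^2, even though they
# disagree on which bits of the field are electric and which magnetic.
print(sp.simplify(sp.expand(Ep.dot(Bp) - E.dot(B))))                          # 0
print(sp.simplify(sp.expand(Ep.dot(Ep) - Bp.dot(Bp) - E.dot(E) + B.dot(B))))  # 0
```

Everything cancels through cosh²θ − sinh²θ = 1, just as lengths survive a rotation through cos²θ + sin²θ = 1.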
The derivation of the above transforms will be easier once we go over to using
differential forms instead of vector fields to represent the two parts, E and
B, of the electromagnetic field, so I shall defer it.

The invariance of the Maxwell Equations under the transforms is also
easier to see in this setting. So we now turn to the right way to talk of the
electromagnetic field.
4.5 Saying it with Diﬀerential Forms
Given a physical phenomenon, in this case the force exerted on a charged
particle, and given that two bits of language can be used to describe it, in
this case as a vector field on R³ or as a 1-form or a 2-form, the question of
which piece of language to use comes up immediately. We may, of course,
which piece of language to use comes up immediately. We may, of course,
choose the ﬁrst one that occurs to us and having made a choice stick to
it in deﬁance of later developments. This is rather stupid and regrettably
common. An alternative is to ask if there are any physically obvious grounds
for making a choice, and a second is to keep them both in use until such time
as one demonstrates advantages.
Let me first argue that it is reasonable to use 2-forms for describing magnetic
fields. The reason is that 2-forms take account of orientations, and magnetic
fields certainly exhibit all the usual signs of being orientation aware. If you
look at the Lorentz Force law, which I give again for your greater comfort,

F = q(E + v × B)    (4.5.1)

you will see that the v × B part clearly has an orientation aspect in it by
virtue of the cross product, whereas the electric field has only the sense or
direction. It therefore makes sense to represent the magnetic field as a
2-form. Some sort of right hand rule is involved in computing a cross product:
this may be seen as containing the information that magnetic fields also use
some sort of orientation information. They care about which direction a
charge is moving. Of course, we can force a vector view on the field if we
insist, which means we have to keep in mind the right hand rules of physics,
whereas if we use 2-forms, this should be taken care of by the formalism. A
good language is one which does most of the work and doesn't require us to
keep a close watch on it.
If B is a 2-form then so is its time derivative, and the equation

∇×E = −∂B/∂t

is telling us that E is a 1-form. So we can write the above equation in the
form

dE = −∂B/∂t

where d is the exterior derivative, which we know takes 1-forms to 2-forms.
We may also write dB = 0 with some confidence to represent the classical
equation ∇·B = 0, since the divergence of a vector field is a scalar field and
the exterior derivative of a 2-form is a 3-form which on R³ is pretty much
also a scalar field, multiplied by det if you want to be careful.
Unfortunately, after that everything goes pear-shaped. The equation

∇×B − ∂E/∂t = j

makes no sense when we try to translate it: B can't have a curl, it is one.
∂E/∂t has to be another 1-form, and so is j. So we somehow have to arrange
that the translation of ∇×B is a 1-form.
On the other hand, we have also avoided facing the fact that we should be
doing all of this on R⁴ with the lorentz metric. Maybe we can save things by
a small amount of rearrangement.
Let us look at the simplest case first: the equations

∇·B = 0

∇×E + ∂B/∂t = 0

become

dB = 0

dE + ∂B/∂t = 0

This corresponds closely to the physics: dB is indeed a divergence, that is, a
3-form on R³, and the exterior derivative of the electric 1-form is indeed a
2-form.
We can shift all this to our lorentz space, R⁴ with the lorentz inner product,
by defining a 2-form on R⁴ as follows:

F = B + E ∧ dt    (4.5.2)

Writing this out in coordinate form with respect to the standard basis we get

B = Bx dy ∧ dz + By dz ∧ dx + Bz dx ∧ dy

and

E ∧ dt = Ex dx ∧ dt + Ey dy ∧ dt + Ez dz ∧ dt
I hope you recall representing the 2-form 3dx ⊗ dx + 4dx ⊗ dy − 2dy ⊗ dx +
5dy ⊗ dy on R² as a matrix

⎡  3  4 ⎤
⎣ −2  5 ⎦

You will have verified that this operates on the two input vectors (x, y)ᵀ,
(u, v)ᵀ by sending them to the number

[x, y] ⎡  3  4 ⎤ ⎡ u ⎤
       ⎣ −2  5 ⎦ ⎣ v ⎦

If you don't recall this or didn't do the exercise, do it NOW.
It is straightforward to verify that the 2-form F can be represented in the
same way on R⁴ by the matrix

⎡  0  −Ex −Ey −Ez ⎤
⎢  Ex   0  Bz −By ⎥
⎢  Ey −Bz   0  Bx ⎥
⎣  Ez  By −Bx   0 ⎦
Exercise 4.5.1. Verify this on pain of death.
It is now easy to compute dF. This will be a 3-form on R⁴.
We have:

    F = B + E ∧ dt
    dF = dB + d(E ∧ dt)

Doing the dB part first:

    dB = d(Bx dy ∧ dz + By dz ∧ dx + Bz dx ∧ dy)
       = ∂Bx/∂t dt ∧ dy ∧ dz + ∂Bx/∂x dx ∧ dy ∧ dz
       + ∂By/∂t dt ∧ dz ∧ dx + ∂By/∂y dy ∧ dz ∧ dx
       + ∂Bz/∂t dt ∧ dx ∧ dy + ∂Bz/∂z dz ∧ dx ∧ dy
       = (∂Bx/∂x + ∂By/∂y + ∂Bz/∂z) dx ∧ dy ∧ dz
       + ∂Bx/∂t dt ∧ dy ∧ dz + ∂By/∂t dt ∧ dz ∧ dx + ∂Bz/∂t dt ∧ dx ∧ dy
Now doing the d(E ∧ dt) part:

    d(E ∧ dt) = d(Ex dx ∧ dt + Ey dy ∧ dt + Ez dz ∧ dt)
       = ∂Ex/∂y dy ∧ dx ∧ dt + ∂Ex/∂z dz ∧ dx ∧ dt
       + ∂Ey/∂x dx ∧ dy ∧ dt + ∂Ey/∂z dz ∧ dy ∧ dt
       + ∂Ez/∂x dx ∧ dz ∧ dt + ∂Ez/∂y dy ∧ dz ∧ dt
       = (∂Ez/∂y − ∂Ey/∂z) dy ∧ dz ∧ dt
       + (∂Ex/∂z − ∂Ez/∂x) dz ∧ dx ∧ dt
       + (∂Ey/∂x − ∂Ex/∂y) dx ∧ dy ∧ dt
Collecting up both parts we get:

    dF = (∂Bx/∂x + ∂By/∂y + ∂Bz/∂z) dx ∧ dy ∧ dz             (4.5.3)
       + (∂Ez/∂y − ∂Ey/∂z + ∂Bx/∂t) dy ∧ dz ∧ dt
       + (∂Ex/∂z − ∂Ez/∂x + ∂By/∂t) dz ∧ dx ∧ dt
       + (∂Ey/∂x − ∂Ex/∂y + ∂Bz/∂t) dx ∧ dy ∧ dt
If dF = 0 then each of the above four lines must be zero. The first line says
that div B = 0 in old-fashioned language, and the last three say that curl
E + ∂B/∂t = 0 in the same old-fashioned language.
In other words, we get two of the Maxwell equations out. This is encouraging
and leads us to feel that gluing E and B together into a single entity, the
2-form F, is a good idea. This is the physically significant thing of which the
magnetic field and the electric field are merely different aspects.
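Those with a computer handy can let sympy do the bookkeeping for the computation above. The sketch below is one way to check it (sympy is assumed to be available; dF implements the standard antisymmetrised-derivative formula for the components of the exterior derivative of a 2-form, which is my encoding, not notation from the text):

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
coords = [t, x, y, z]
Ex, Ey, Ez, Bx, By, Bz = [sp.Function(n)(t, x, y, z)
                          for n in ['Ex', 'Ey', 'Ez', 'Bx', 'By', 'Bz']]

# Antisymmetric component matrix of F = B + E ∧ dt, read off the text.
F = sp.zeros(4, 4)
comps = {(0, 1): -Ex, (0, 2): -Ey, (0, 3): -Ez,
         (1, 2):  Bz, (1, 3): -By, (2, 3):  Bx}
for (i, j), v in comps.items():
    F[i, j] = v
    F[j, i] = -v

def dF(i, j, k):
    # (dF)_{ijk} = d_i F_{jk} + d_j F_{ki} + d_k F_{ij}
    return (sp.diff(F[j, k], coords[i]) + sp.diff(F[k, i], coords[j])
            + sp.diff(F[i, j], coords[k]))

# The coefficient of dx ∧ dy ∧ dz is div B:
assert sp.simplify(dF(1, 2, 3)
                   - (sp.diff(Bx, x) + sp.diff(By, y) + sp.diff(Bz, z))) == 0
# The coefficient of dt ∧ dx ∧ dy is (curl E)_z + dBz/dt:
assert sp.simplify(dF(0, 1, 2)
                   - (sp.diff(Ey, x) - sp.diff(Ex, y) + sp.diff(Bz, t))) == 0
```

The other two coefficients can be checked the same way, and match the remaining components of curl E + ∂B/∂t.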
The next step is to express the other pair of Maxwell equations in the same
language.
This is where the Hodge ⋆ operator comes in. It is clear that if we take ⋆F
we get another 2-form. When we calculate the matrix for it we get

    [  0    Bx   By   Bz ]
    [ −Bx   0    Ez  −Ey ]
    [ −By  −Ez   0    Ex ]
    [ −Bz   Ey  −Ex   0  ]

Exercise 4.5.2. Confirm this. The calculation is utterly trivial, all you need
to do is to organise your thoughts sensibly. Observe that the Hodge dual can
be memorised by mapping Ej to −Bj and Bj to Ej. This looks very like
what we want for the other pair of Maxwell Equations in the classical form.
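A quick symbolic check of the mnemonic, and incidentally of what applying it twice does, can be sketched as follows (sympy is assumed; build is a hypothetical helper of mine that assembles the matrix of a 2-form from its electric and magnetic parts):

```python
import sympy as sp

Ex, Ey, Ez, Bx, By, Bz = sp.symbols('Ex Ey Ez Bx By Bz')

def build(E, B):
    # Matrix of a 2-form with electric part E and magnetic part B,
    # laid out exactly as the matrix for F in the text.
    (e1, e2, e3), (b1, b2, b3) = E, B
    return sp.Matrix([[0, -e1, -e2, -e3],
                      [e1,  0,  b3, -b2],
                      [e2, -b3,  0,  b1],
                      [e3,  b2, -b1,  0]])

F = build((Ex, Ey, Ez), (Bx, By, Bz))
# The mnemonic for the Hodge dual: Ej -> -Bj and Bj -> Ej.
starF = build((-Bx, -By, -Bz), (Ex, Ey, Ez))
# Applying the mnemonic a second time:
starstarF = build((-Ex, -Ey, -Ez), (-Bx, -By, -Bz))
assert starstarF == -F
```

So applying the dual twice sends F to −F, which bears on a question coming up shortly about the sign of ⋆².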
If we take the exterior derivative we get a 3-form on R⁴, and if we ⋆ it we
get a 1-form. We can represent the current as a 1-form on R⁴ by putting the
charge density ρ in the zeroth place and using the other three places to give
the values for the current flow. This means we need to define J as the 1-form

    J = ρ dt + j1 dx + j2 dy + j3 dz
Whereupon we may write the other two Maxwell Equations out as

    ⋆(d(⋆F)) = J

Exercise 4.5.3. Show that this does indeed amount to precisely the other
pair of Maxwell's Equations.
It is common to leave out all the parentheses and summarise the Maxwell
Equations, all together, in the form

    dF = 0                                                    (4.5.4)
    ⋆d⋆F = J                                                  (4.5.5)
Remark 4.5.1. You have to allow that this is rather cool. Compacting the
equations in this way gives us a much more elegant way to express the basic
facts of electromagnetism, and should leave you feeling that it is more true to
the underlying reality than the classical form. If you had never met Maxwell's
Equations in the classical formalism and you had just met these for the first
time, you would, I think, find a good deal of charm in the conciseness, and
feel that the evidence for the electromagnetic field being a 2-form on a four-
dimensional spacetime is overwhelming. The fact that it requires a lorentz
inner product to work properly is at least highly suggestive.
Exercise 4.5.4. How far could you get in rewriting the Maxwell Equations
(in terms of forms) with the standard inner product on R⁴? What changes
would you need?
4.6 Lorentz Invariance
The first issue to be addressed is to determine how 2-forms transform under
a transformation of coordinates.
Suppose B is a 2-form on Rⁿ with a generalised inner product, and A : Rⁿ → Rⁿ
is a diffeomorphism. Then take one coordinate system at the origin and
let another be obtained by performing A on it. Call S the coordinate system
at the origin, and think ordered basis for a concrete example. Then AS is the
second coordinate system. Think of A as a linear map, possibly a lorentz map
to make this relatively concrete. Then a vector x in the coordinate system S
is read as A⁻¹x in AS. Call it x′ to save space. That is, we have two ways of
talking about the same point in the space, two languages. Similarly, u and
u′ = A⁻¹u for a second vector.
Then if B′ is the correct transform of B in AS, we shall have that B′ acts
on the ordered pair x′, u′ to give the same number as B does on the ordered
pair x, u. This must be the case since the number we get out must not
depend on the language we are using to describe the points, which exist and
are independent of the language.
If A is linear, and S and hence S′ = AS are given by ordered bases, then we
can represent B by a matrix [B] and write B(x, u) as xᵀ[B]u. Similarly we
have x′ᵀ[B′]u′ for the same number obtained by the transformed 2-form, and
these are equal for any choice of x and u. Saying this in algebra:

    ∀ x, u ∈ Rⁿ,   xᵀAᵀ[B′]Au = xᵀ[B]u

This can only happen when Aᵀ[B′]A = [B], which tells us that

    [B′] = (Aᵀ)⁻¹[B]A⁻¹

which gives us the transform of the matrix [B] representing the 2-form B.
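The transform law is easy to test numerically; a sketch (numpy assumed, with a random invertible matrix standing in for the coordinate change, an arbitrary choice made only for the check):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))     # matrix of a 2-form (any 4x4 will do)
A = rng.standard_normal((4, 4))     # an invertible coordinate change
Ainv = np.linalg.inv(A)

# The transform law from the text: [B'] = (A^T)^{-1} [B] A^{-1}
Bp = Ainv.T @ B @ Ainv

x = rng.standard_normal(4)
u = rng.standard_normal(4)
# The defining property: x^T A^T [B'] A u = x^T [B] u.
assert np.isclose(x @ A.T @ Bp @ A @ u, x @ B @ u)
```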
Exercise 4.6.1. It is well known that for an orthogonal matrix A, Aᵀ = A⁻¹.
What can you say about the transpose of a lorentzian matrix?
Exercise 4.6.2. The question arises, how much of this depends on linearity?
Obviously we have chosen to represent things by matrices, but the equation
makes sense in a much more general setting, except possibly for the business
of the transpose, which arose from our determination to represent B by a
matrix. Suppose we express the 2-form relative to a basis in the usual way
as a sum of dxᵢ ∧ dxⱼ. What can be said about the expression of B′ relative
to the dx′ᵢ ∧ dx′ⱼ? How much if anything can be saved if we permit A to be
a diffeomorphism? Hint: investigate this in R² \ {0} with reference to the
polar coordinate transform.
The obsession which physicists and old-style mathematicians have with matrix
representations of differential forms can obscure the basic simplicities. We
have already computed dF in standard terms as a 3-form in equation 4.5.3,
and it is simpler to investigate the lorentz transformations of both 2-forms
and 3-forms directly.
Let's do it for the 2-form F on R⁴ and the first lorentz boost.
We have that any 2-form on R⁴ is given by suitable linear combinations,
weighted sums, of the six terms dt ∧ dx, dt ∧ dy, dt ∧ dz, dx ∧ dy, dx ∧ dz,
dy ∧ dz. From the matrix representation for F we can read these off:

    F = −Ex dt ∧ dx − Ey dt ∧ dy − Ez dt ∧ dz
      + Bz dx ∧ dy − By dx ∧ dz + Bx dy ∧ dz
If we suppose that the first lorentz boost is used to transform the standard
basis in R⁴ to a new basis, what I shall call the dashed basis, then we need
the inverse map to transform the coordinates of a point (event) in R⁴ to the
new coordinates in the dashed frame. Thus we have

    [ t′ ]   [  c −s  0  0 ] [ t ]   [ ct − sx ]
    [ x′ ] = [ −s  c  0  0 ] [ x ] = [ −st + cx ]
    [ y′ ]   [  0  0  1  0 ] [ y ]   [ y ]
    [ z′ ]   [  0  0  0  1 ] [ z ]   [ z ]

where c is short for cosh(θ) and s for sinh(θ), for any θ ∈ R.
Taking the exterior derivative we get

    [ dt′ ]   [  c dt − s dx ]
    [ dx′ ] = [ −s dt + c dx ]
    [ dy′ ]   [  dy ]
    [ dz′ ]   [  dz ]
We have therefore that

    dt′ ∧ dx′ = dt ∧ dx;    dt′ ∧ dy′ = c dt ∧ dy − s dx ∧ dy;
    dt′ ∧ dz′ = c dt ∧ dz − s dx ∧ dz;    dx′ ∧ dy′ = −s dt ∧ dy + c dx ∧ dy

and

    dx′ ∧ dz′ = −s dt ∧ dz + c dx ∧ dz;    dy′ ∧ dz′ = dy ∧ dz
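These wedge identities all come down to cosh² − sinh² = 1, and can be checked mechanically; a sketch with sympy (1-forms represented as coefficient vectors in the basis (dt, dx, dy, dz), an encoding chosen only for the check):

```python
import sympy as sp

th = sp.symbols('theta')
c, s = sp.cosh(th), sp.sinh(th)

# 1-forms as coefficient vectors in the basis (dt, dx, dy, dz).
dt = sp.Matrix([1, 0, 0, 0]); dx = sp.Matrix([0, 1, 0, 0])
dy = sp.Matrix([0, 0, 1, 0]); dz = sp.Matrix([0, 0, 0, 1])
dtp = c*dt - s*dx     # dt' under the boost
dxp = -s*dt + c*dx    # dx'

def wedge(a, b):
    # Antisymmetric coefficient matrix of a ∧ b.
    return a*b.T - b*a.T

# dt' ∧ dx' = dt ∧ dx, because cosh² − sinh² = 1:
assert sp.simplify(wedge(dtp, dxp) - wedge(dt, dx)) == sp.zeros(4, 4)
# dt' ∧ dy' = c dt ∧ dy − s dx ∧ dy (recall dy' = dy):
assert sp.simplify(wedge(dtp, dy)
                   - (c*wedge(dt, dy) - s*wedge(dx, dy))) == sp.zeros(4, 4)
```

The remaining four identities check the same way.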
Exercise 4.6.3. Find the expressions for the four basis elements of a three-
form,

    dt′ ∧ dx′ ∧ dy′,  dt′ ∧ dx′ ∧ dz′,  dt′ ∧ dy′ ∧ dz′  and  dx′ ∧ dy′ ∧ dz′

in terms of the undashed forms, dt ∧ dx ∧ dy and so on.
Exercise 4.6.4. Calculate the exterior derivative dF in terms of the functions
Ex, Ey, Ez, Bx, By, Bz and the basis elements

    dt ∧ dx ∧ dy,  dt ∧ dx ∧ dz,  dt ∧ dy ∧ dz  and  dx ∧ dy ∧ dz

Your first term should be, using the shorter notation of the text book for
partial derivatives,

    (−∂y Ex + ∂x Ey + ∂t Bz) dt ∧ dx ∧ dy
In the dashed frame we have the same form as in the last exercise for dF′,
except that we have Ex′, Ey′, Ez′, Bx′, By′, Bz′, partial derivatives of these
with respect to the dashed variables, and the usual suspects:

    dt′ ∧ dx′ ∧ dy′,  dt′ ∧ dx′ ∧ dz′,  dt′ ∧ dy′ ∧ dz′  and  dx′ ∧ dy′ ∧ dz′

Now we can replace the last four terms by the undashed terms.
We also have that the chain rule allows us to replace the ∂x′ Ey′ and similar
terms by their undashed translations:

    [∂t′ f′, ∂x′ f′, ∂y′ f′, ∂z′ f′] = [∂t f, ∂x f, ∂y f, ∂z f] M,  where

    M = [ c  s  0  0 ]
        [ s  c  0  0 ]
        [ 0  0  1  0 ]
        [ 0  0  0  1 ]
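The chain rule claim can be checked on any concrete scalar field; a sketch with sympy (the particular f below is an arbitrary choice of mine, made only for the check):

```python
import sympy as sp

th = sp.symbols('theta')
t1, x1, y1, z1 = sp.symbols('t1 x1 y1 z1')   # the dashed coordinates
c, s = sp.cosh(th), sp.sinh(th)

# Undashed coordinates in terms of dashed ones (the inverse boost):
t, x, y, z = c*t1 + s*x1, s*t1 + c*x1, y1, z1

# An arbitrary concrete scalar field to test the chain rule on:
def f(t, x, y, z):
    return t**2 * x + sp.sin(y) * z

fp = f(t, x, y, z)    # the field expressed in the dashed frame

# Left side: the row of dashed partial derivatives of f'.
lhs = sp.Matrix([[sp.diff(fp, a) for a in (t1, x1, y1, z1)]])

# Right side: the row of undashed partials, evaluated at the point.
a0, a1, a2, a3 = sp.symbols('a0 a1 a2 a3')
grad = sp.Matrix([[sp.diff(f(a0, a1, a2, a3), a) for a in (a0, a1, a2, a3)]])
grad = grad.subs({a0: t, a1: x, a2: y, a3: z}, simultaneous=True)

M = sp.Matrix([[c, s, 0, 0],
               [s, c, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1]])
assert sp.simplify(lhs - grad * M) == sp.zeros(1, 4)
```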
Exercise 4.6.5. Do this to obtain an expression for dF′ in terms of the
undashed symbols.
We finally have to confirm that

    dF = 0 ⇒ dF′ = 0

At first sight, the expression for dF′ is a mess, but a little thought shows it
is just a new dF for a slightly different pair of electric and magnetic fields,
which also satisfy Maxwell's equations.
Exercise 4.6.6. Checking with the forms of 4.4.3 and 4.4.4, show that
dF′ = 0.
This sequence of exercises has established that the vacuum Maxwell equations
are invariant under the ﬁrst lorentz boost.
Exercise 4.6.7. Conﬁrm that this also works for the other two lorentz boosts.
This is best done using a small amount of thought rather than a large amount
of algebra.
Since we have already seen that the vacuum Maxwell equations are invariant
under the special orthogonal subgroup, it follows that the equation

    dF = 0

is invariant under the lorentz group.
Now if ⋆d⋆F = 0, which it does in the vacuum, it also follows that d⋆F = 0,
since ⋆ takes zero forms to zero forms and ⋆² is a number, in fact ±1.
Exercise 4.6.8. Which?
So the same calculation will establish the invariance of the second equation.
After doing this we conclude that both the vacuum equations are invariant
under the Lorentz group.
Exercise 4.6.9. Show that d⋆F is also invariant under the three lorentz
boosts and the special orthogonal group. Hint: this does not require doing it
all over again!
This result is absolutely astonishing. I shall now explain why.
4.6.1 Special Relativity
Newton’s Laws of motion are about forces, which is to say accelerations
and masses if we look at the things we can actually measure directly. And
these appear at ﬁrst sight to be invariant under the galilean group. Certainly
Newton thought they were, although group theory not having been formalised
in his day, he wouldn’t have put it in that way. Had the lorentz group been
in existence, the possibility that the laws of motion were invariant under the
lorentz group would have been regarded as a bizarre possibility too absurd
to waste time on, although one couldn’t rule it out on experimental grounds
since speeds with which we are familiar are small compared with the velocity
of light.
Invariance of the laws of nature under the galilean group explained why
we haven't found anywhere in the universe labelled ‘origin’, a special point,
possibly with three orthogonal axes sticking out of it. It must have looked
unlikely that we ever would, and the fact that we have used coordinate systems
and bases of orthogonal vectors to talk about the universe led immediately to
the observation that this was merely a convenient language, and no particular
coordinate frame was better than any other; indeed, one could be moving
at constant velocity with respect to another and they were equally good,
although accelerating frames changed things, as one finds when looking at
tops and roundabouts and planets.
The fact that Maxwell’s equations are not invariant under the galilean group
but are under the lorentz group produces some very strange results. One
is that the velocity of light is constant and does not depend on your own
velocity.
This is very unnerving indeed. If light is a wave motion like waves in water
then the wave velocity is a property of the water. If you are at rest with
respect to the water you get one answer, and if you are moving with respect
to the water you get another. This is not observed. If light goes like little
Figure 4.6.1: An experiment with two charges.
bullets from a source, then the velocity of the light is aﬀected by the velocity
of the source with respect to any other observer. This is also not observed.
What happens when you travel fast away from the light is that the frequency
is shifted down, the colour changes, and if you travel towards it the frequency
is shifted up, the doppler eﬀect. But the velocity is not aﬀected.
It gets worse. Suppose I am sitting in a frame of rulers with a clock and
describing what happens when I take two charged balls of opposite signs and
place them a small distance from each other, as in ﬁgure 4.6.1.
I have tied each charge to some ﬁxed object and measured the tendency
of the charges to attract each other by measuring the stretch of a spring.
Since everything is pretty much the same except for the sign of the charge,
including the mass of the balls and the elasticity of the springs, there is a
high degree of symmetry in the arrangement and I expect the stretches to be
the same. I can measure these by reading oﬀ the number on the ruler where
the edge of the ball is.
Now you come zooming past me going into the picture at half the speed of
light.
You see the two balls, you see the extension of the spring, and you can
measure the electric and magnetic ﬁelds with some little test charge you
carry with you, and some little bar magnet. Your little bar magnet can be
thought of as just two charges orbiting about each other, close enough so
they have no net electric ﬁeld to speak of, and fast enough so they produce a
magnetic ﬁeld. You can also observe, just as I can, the numbers on the ruler
where the edges of the balls are, and we had better agree on these numbers.
The collision of penguins is a fact in any language that is not totally bizarre,
and the coincidence of edges of balls and numbers on an external ruler must
also be a property of the universe, not the language we choose to talk about
it.
In my framework there is no magnetic ﬁeld at all, unless we count your
measuring apparatus. But in yours there are two charges coming towards
you at half the velocity of light so there has to be a magnetic ﬁeld, because
moving charges have to produce one; Faraday said they did and so they
do. So the numbers you get for the electric ﬁeld and the magnetic ﬁeld will
be diﬀerent from the numbers I get, yours will have a nonzero magnetic
component.
Exercise 4.6.10. Use the ﬁrst lorentz boost to work out what the rate of
exchange is.
We will both agree that the springs extend and the objects are attracting
each other. But our explanations of why will be diﬀerent. You will have an
explanation which involves magnetic ﬁelds and mine won’t.
In diﬀerent conﬁgurations we may disagree about the extension of the springs
or the masses of the balls or the time duration between events, although we
must of course agree about whether an event occurs or not. If two balls
(or penguins) collide, we must agree that the event happens, this being a
property of the balls (or penguins), but the translation between our languages
may make our measurements of various forces disagree when our translation
is done using the lorentz group.
The problem which was tackled in the early days of the twentieth century
was, can you have the mechanical part of nature invariant under the galilean
group and the electromagnetic part invariant under the lorentz group? As you
can see from our discussion on penguins and Einstein’s covariance principle,
this amounts to having a diﬀerent and incompatible language for diﬀerent
aspects of the universe, and we measure the eﬀects of ﬁelds by mechanical
means. So it does not make much sense to have two incompatible languages
for talking about the same thing. The Michelson-Morley experiment tried
to measure the velocity of the earth with respect to the luminiferous aether,
the whatever-it-was that waggled when light passed through it (‘luminiferous’
just means ‘light bearing’ and ‘aether’ meant some weird stuﬀ which spread
throughout the universe and had no other function than to bear light. In
particular it didn’t obstruct or slow down mechanical things like planets or
penguins). This makes some sort of sense if there is one kind of invariance
for matter and another for electromagnetic ﬁelds. The answer seemed to
be it was zero: the velocity of light does not depend on the velocity of the
observer measuring it. This is consistent with the lorentz invariance of the
Maxwell equations, but not with the idea that you can get away with two
incompatible languages for talking about the world.
Given this there are only two possibilities: either Maxwell’s laws are wrong
or Newton’s are. Mostly people assumed that Maxwell was going to have
to lose out in the ﬁght between the intellectual giants, mainly because they
were used to Newton, and Maxwell was the new kid on the block, although
it was hard to see how he was wrong.
Poincaré pointed out that the alternative was to suppose that everything was
invariant under the lorentz group. Einstein worked out the consequences and
we had the special theory of relativity, and E = mc² is a trivial corollary.
Hence atom bombs and nuclear power stations. So today we use galilean
invariance as a simple approximation when velocities are low, and lorentz
invariance is taken to be right. For everything.
It is fashionable for philosophers to pontiﬁcate on science, and since scientists
are usually much too busy doing interesting things to bother much about
them, the philosophers have much more inﬂuence on the great unwashed
than they should. One line of argument goes: ‘Einstein showed Newton
was wrong, no doubt someone will eventually show Einstein is wrong too,
so nothing is known for sure and all knowledge and belief is temporary. So
we might as well stay totally ignorant of science. In fact since all knowledge
is liable to error we can’t be said to really know anything. And there is no
sense in pursuing truth if there is no possibility of catching it.’ I shall call
this the postmodern fallacy.
This certainly saves philosophers and others the trouble of learning about
tensors, or anything else complicated. The argument is very popular with
people who like to be thought of as intellectual but don’t have much intellect.
Exercise 4.6.11. Explain, as to a philosopher, why the postmodern argument
sucks.
Exercise 4.6.12. Google the Michelson-Morley experiment.
This has been a quick introduction to special relativity. I have essentially
followed the historical development of the ideas, whereas it is more usual to
give you the facts which have become known as the result of experiments
since. Facts there are in plenty and they support completely the invariance
under the lorentz group of everything in the universe. Physicists tend to see
life as a huge collection of facts, mathematicians as a much smaller collection
of ideas. To mathematicians, reality is there to give us interesting things to
think about, and so we rely on physicists and engineers to ﬁnd out how things
behave so we can make languages which describe them concisely. It has been
a very successful partnership, and physics (and more recently engineering) has
forced us to produce some beautiful mathematics, some small amount of
which we have used thus far.
There is still lots left as the next chapter will show.
Chapter 5
DeRham Cohomology:
Counting holes
First some observations on a few cultural issues. There are differences between
mathematicians and physicists which cause problems. I don't want to
overstate these, nor to leave anyone thinking I disapprove of either culture;
my ﬁrst degree was in Physics and my Ph.D. in Pure Mathematics, and I
ﬁnd both subjects wonderfully worth studying, but failure to confront issues
tends to make them harder to deal with, not easier. So some thoughts on the
diﬀerences may be worth putting up for your consideration. It should also
be noted that, seen from outside, the two cultures are so similar it is hard to
see any diﬀerence at all.
5.1 Cultural Anthropology
In the beginning of the twentieth century, the Mato Grosso was the great
unexplored jungle of the Amazon basin in South America. It was full of,
well, jungle, which we now call a rain forest (possibly to avoid giving offence
to jungles), and contained exotic animals and exotic tribes of people with
strange customs, such as shooting curare tipped darts at strangers.
Cultural anthropologists, anxious to study humanity in all its bizarre aspects,
visited these tribes in order to learn their ways; those who avoided the curare
tipped darts were able to return to civilisation and tell it about the customs
and manners of these fascinating people. One of the chief diﬃculties they
faced was the strange human ability to follow complicated rules without being
able to say what the rules actually are. This is obvious in language: a ten
year old has a good grasp of his native language and can follow incredibly
complex rules of grammar with no apparent diﬃculty. The conclusion that
French must be an easy language because lots of French children speak it, is
not in fact the case. It is rather that they have internalised a huge number
of rules, but they don’t know what they are. In order to learn French as an
adult, we tend to want to know the rules. It is no use asking a French child
to tell you. They don’t know.
In the same way, it was no use asking a denizen of the Amazon basin what
their basic assumptions about the universe were. They had them, they fol
lowed these assumptions, but they had been brought up in the culture and
couldn’t articulate them. Part of the interest in exotic tribes is trying to
work out what those assumptions are, but there is no use asking the exotic
tribesmen. They learnt them at too early an age to realise they were making
them.
I once talked to an Australian engineer on a Japanese train, and (in front of
some of the Japanese staff) he expressed his amusement about the introduction
of flush toilets to Japan many years ago. They needed to have pictures
explaining to Japanese how to sit on the toilet seat, otherwise the more
hygienically conscious Japanese would stand on it and squat. He thought this
was very funny, because he presumably believed that the customary manner
of using a flush toilet is something people are born knowing. He thought
this because he had been potty-trained at an early age and had forgotten
the process. I doubt if his mother had. Modern Japanese toilets are so
complicated they have to be explained to foreigners, so the Japanese have had
their revenge on ignorant Aussies.
These days the Mato Grosso is in the process of being turned into farms and
housing estates and the exotic tribesmen drive cars and drink Coca-Cola, so
there is not much point in a cultural anthropologist visiting the place. It
is much like back home. By way of compensation, there are other exotic
tribes being created. One of these is the tribe of theoretical physicists¹. Just
like the Amazonian Indians, they have their special language, their cultural
like the Amazonian Indians, they have their special language, their cultural
assumptions about the world. And just like the Amazonian Indians, they
don’t actually know what they are.
This is where mathematicians come in. They are also a weird tribe, as you
may have noticed, but being professionally interested in rules they have a
much clearer idea of what those they follow actually are. And when they
study theoretical physics they ﬁnd it necessary to articulate the assumptions
¹One cultural anthropologist actually spent some months with a group of physicists
but his report on their weird ways aroused little interest, perhaps because he couldn't
make as much sense of them as they could of him. This is a true story and not a joke.
Well, maybe it is a joke but it is also true.
which the physicists make. It is no use asking the physicists, they do it
by training and reflex and don't even notice they are doing it. Learning
theoretical Physics as an adult is harder than learning French, and asking
French children is no help, as noted above.
So I am going to make some points which theoretical physicists would regard
as too obvious to talk about and don’t.
5.2 Solutions
The Maxwell Equations are basically about a set of six functions from R⁴ to
R, namely Ex, Ey, Ez, Bx, By, Bz, which correspond to things that can be
measured using particular instruments. In practice we can only sample these
functions discretely if there is something in nature producing them, or we can
more or less ignore reality and just write down the six functions. They are, in
our notation, the components of a 2-form on R⁴, and we take the lorentz inner
product should it be necessary. It is possible to see this as a map from R⁴ to
R⁶. Any such map will define a suitable 2-form, and it is not unreasonable to
demand that we restrict ourselves to smooth functions and maybe analytic
functions.
Exercise 5.2.1. Why is it not unreasonable?
Now the Maxwell Equations impose conditions on these six functions. Not
every choice of six functions from R⁴ to R will satisfy them. In fact there must
be some space of solutions. We have from dF = 0 four conditions on these
functions and from ⋆d⋆F = 0 another four. I am staying with the vacuum
equations for the time being. So we have a total of eight constraints on an
infinite dimensional space of functions, so we get some infinite dimensional
manifold of solutions. This is not much help.
The sad fact is we have only one solution to the vacuum Maxwell Equations,
which we got by guessing that a plane wave in space would do it. If you
write down

    Ex(t, x, y, z) = 0,  Ey(t, x, y, z) = 0,  Ez(t, x, y, z) = C sin(y − t)

for any real number C, then you are describing a (sine) wave with an electric
field which exists only in the z-direction and which travels at unit speed in
the y-direction. If we take the curl of this we get, so Maxwell tells us,

    ∂B/∂t = −( C cos(y − t), 0, 0 )ᵀ
and integrating gives

    Bx = C sin(y − t),  By = 0,  Bz = 0

(check the sign: ∂Bx/∂t = −C cos(y − t), as required), which says the magnetic
field is a plane wave also travelling along the y axis, being nonzero only in
the x direction.
Exercise 5.2.2. Draw a picture.
It is easy to verify that ∇ · E = 0 and ∇ · B = 0, and it remains only to show
that curl B = ∂E/∂t, which is rather easy.
Exercise 5.2.3. Do it.
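Once you have done it by hand, sympy can confirm all four vacuum equations at once. A sketch (sympy assumed; note the sign convention, with Bx = C sin(y − t) every equation holds):

```python
import sympy as sp

t, x, y, z, C = sp.symbols('t x y z C')

# The plane-wave guess from the text.
E = sp.Matrix([0, 0, C*sp.sin(y - t)])
B = sp.Matrix([C*sp.sin(y - t), 0, 0])

def curl(V):
    return sp.Matrix([sp.diff(V[2], y) - sp.diff(V[1], z),
                      sp.diff(V[0], z) - sp.diff(V[2], x),
                      sp.diff(V[1], x) - sp.diff(V[0], y)])

def div(V):
    return sp.diff(V[0], x) + sp.diff(V[1], y) + sp.diff(V[2], z)

# All four vacuum Maxwell equations (units with the speed of light = 1):
assert div(E) == 0
assert div(B) == 0
assert sp.simplify(curl(E) + sp.diff(B, t)) == sp.zeros(3, 1)
assert sp.simplify(curl(B) - sp.diff(E, t)) == sp.zeros(3, 1)
```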
Of course we can get some more solutions out of this, in fact an inﬁnite
number of them. For a start we can rotate things so that instead of travelling
along the y axis it goes along any other line. Just apply any element of
SO(3,R) to the above system and we get a new one which we already know
also satisﬁes the Maxwell equations. From a physical perspective it would
be astonishing if it didn’t.
Exercise 5.2.4. Do it.
Better, apply a lorentz transformation to the 2form and get a larger class of
equivalent solutions.
For a second thing, we can change the frequency of the wave so it has more
oscillations in time and space.
Exercise 5.2.5. Do it.
And ﬁnally if we have two or more solutions we can add them and get another
solution.
Exercise 5.2.6. Verify this.
This gives us a lot more solutions, inﬁnitely many more, but one has to feel
that they are not really all that diﬀerent. Of course Fourier Theory tells us
that any function can be approximated as a ﬁnite sum of such things. On
the other hand it is easy to construct functions which are not solutions.
Exercise 5.2.7. Do it.
So we ask the question, how many other solutions are there? It is conceivable
that this is in fact all, and it is conceivable that there are squillions of other
families of solutions, where another family is obtained from any one solution
by doing some rotations, or lorentz transformations, and scalings and sums.
One thing we note is that in going from the wave equation for the electric
ﬁeld to obtain a magnetic ﬁeld, we simpliﬁed the integral by making some
functions equal to zero.
We had

    curl (0, 0, sin(y − t))ᵀ = ( cos(y − t), 0, 0 )ᵀ = −∂B/∂t

Integrating this gives

    B = ( sin(y − t), A, B )ᵀ

where A and B can be any functions which do not depend on t. It is natural
to cheat and make them zero, as I just did. Can we have any other possibility?
We would need to ensure that ∇ · B = 0, which would force ∂y A + ∂z B = 0
for a start. We also have that

    ∂E/∂t = ( 0, 0, −cos(y − t) )ᵀ

would have to be curl B, that is

    ( ∂x, ∂y, ∂z )ᵀ × ( Bx, By, Bz )ᵀ = ( 0, 0, −cos(y − t) )ᵀ

or

    ( ∂y Bz − ∂z By, ∂z Bx − ∂x Bz, ∂x By − ∂y Bx )ᵀ = ( 0, 0, −cos(y − t) )ᵀ
This would seem to give rather a lot of possibilities for B other than the
simplest one we have considered.
Exercise 5.2.8. Does it? Find one or prove there aren’t any.
Note how we got this present family. Basically, we guessed, from knowing that
light travels through space as a wave, that the velocity is 1 in our units, and
the conjecture that light is an electromagnetic thing, that a wave would work,
and wow, it did. Checking to see if a function from R⁴ to R⁶ will satisfy
Maxwell's Equations is very simple; actually finding one by some process
other than guessing is a different story. And that makes the assumption
that we should look in a space of analytic or at least smooth functions; in
practice we are going to be using the elementary functions because these are
the ones we can easily write down and differentiate. Why should the universe
be so kind as to use the functions we find easy to write down? What if some
important physical process depended on functions we can’t write down as a
small sum of elementary functions? What, if anything, could be said about
it?
Exercise 5.2.9. Think about this. Have we just been dead lucky with light?
The question as to whether there are any other solutions to the vacuum
equation outside ﬁnite sums of lorentz transforms of the wave solution merits
a little thought.
The physicist will surely observe that there are bound to be solutions to any
non-vacuum problem. Take any configuration of moving charges. Specify
them by elementary functions Ex, Ey, Ez where possible. Then we can
hope to derive, in any coordinate frame in which the data is speciﬁed, the
corresponding magnetic ﬁelds. This requires merely some diﬀerentiation and
integration, leaving some unknown functions provided by the integration
stage. Now the physicist ‘knows’ that there is a solution: his reasoning
is that the universe will surely provide one, so it must be there to be found.
Indeed he will believe it is unique up to the transformations of coordinates,
since the universe doesn’t toss up between options. This, of course, assumes
that the Maxwell equations are true, which physicists do indeed believe. In
the main.
The question of why physicists feel happy to restrict themselves largely
to the analytic elementary functions, which I invited you to ponder a while
back, and the question of why physicists are so confident about being able to
prove uniqueness and existence of solutions, are explained by two essentially
philosophical positions which go back to Newton.
The ﬁrst can be summarised by the old adage ‘If something is ineﬀable,
there’s no point trying to eﬀ it’, and perhaps also ‘If something can’t be
detected it isn’t there.’ If a function that was zero around the planet earth
was nonzero somewhere else, ﬁrst it could not be represented by an analytic
function and second, we would have no way of knowing by local measurements
of any precision that it existed, so there is no point in wasting thinking
time about it, and if a function that can’t be written down is essential to
understanding something then we are never going to understand it, so again
forget about the possibility.
The second can be summarised by the principle that if you have a theory
which accounts for the phenomena, commit yourself to it until either someone
comes up with a simpler or more encompassing theory or you run into facts
which are in conflict with it, in which case bend the theory minimally to
accommodate the new facts. The more committed you are to the theory, the
more likely you are to discover such facts. Pondering what-ifs is a waste of
time.
The belief that the universe does not toss up but is consistent and hence
provides us with unique solutions is again a philosophical position. One can
argue that it is justiﬁed in various ways: it can be productive because if we
get lots of solutions we can look for extra conditions to force uniqueness and
usually ﬁnd them. Recall the exercise you did on the magnetic ﬁeld for a
wire carrying a current.
Most physicists regard these metaphysical convictions as so obvious that they
never bother to mention them. Much like the French children.
5.3 Inﬁnite Variety
Giving the curl of a field and asking for a solution is, as you will have
discovered, difficult because there are so many solutions. In differential form
notation we have dX = Y where X is a 1-form and Y is a 2-form. Now d is
linear, so it follows that if dω = 0 then whenever X is a solution, so is X + ω.
And since d^2 = 0, if f is any differentiable function whatever, X + df is a
solution.
What does this do to the physicists’ conviction that solutions have to be
unique on the physical grounds that the universe does not toss up? There
are two things one might do, and physicists do both of them. One is to
impose extra conditions which force uniqueness. Another is to declare the
difference between different solutions an artefact of the language and deny
that it is physically significant. In the former case they explain that the
universe has some rather unexpected preferences, often for continuous functions,
and in the second they glory in the freedom that they get to choose arbitrary
functions to suit their convenience, seeing no objection to making different
choices at different times. If a mathematician has noticed that they often
do the first and then unexpectedly do the second, and points out the
inconsistency, they express surprise and a certain contempt that mathematicians
lack the courage to follow them. You will find this attitude in the text book
section on Gauge Freedom. I have not been able to get physicists to agree
that consistency in how they resolve multiple solutions to physical problems
is particularly desirable, although they insist that the universe shows
consistency. This appears to be a religious conviction, possibly derived from
Newton, who believed (a) that God had created the universe and (b) that
God was not small-minded enough to be inconsistent or to try to fool us.
Quantum mechanics might have given him spiritual indigestion, as would
some of the practices of his intellectual heirs. But then, Newton considered
himself a philosopher first and a mathematician second, and Physics or
indeed Science hadn’t been invented as a separate subject in his time.
Again, physicists remain happily oblivious to the underlying assumptions
in their practice, or the great majority of them do. Extracting them for
inspection is time consuming, but I haven’t found a quicker way of making
sense of their work. And it is suﬃciently interesting and important work to
justify the eﬀort.
5.4 Gauge Freedom
We have seen that guessing a solution and then verifying it is the quick and
easy way, but it presumes that we are good at guessing, or equivalently that
the solution is simple. It doesn’t seem safe to rely on this. So it is reasonable
to introduce other assumptions, some on physical grounds, some in a spirit
of optimism.
We know that d^2 = 0, and so when given dB = 0 it is tempting to consider
the possibility that the reason dB is equal to zero is that B = dX for some
unknown 1-form X. Such a thing is known in the literature as a vector
potential. We also know that it is far from unique: adding df for any function
(0-form) f will give an equally good X. This is precisely analogous to having
a constant of integration crop up: again we might feel inclined to fix it in
a physical situation by imposing an extra condition, as when we solve an
ODE, or we may feel that it gives us a glorious freedom to choose one that is
convenient, or we may decline to make a choice at all. In the case of vector
potentials, the practice of physicists is to glory in the freedom and call it gauge
freedom. A similar situation exists when we obtain a potential function for
a physical situation, when adding in an arbitrary constant will not change
the force field which is its gradient. Physicists sometimes insist that physical
constraints such as ensuring the potential goes to zero at infinity suffice to
get rid of the ambiguity, but they do not usually feel any such compulsion
in the case of the vector potential. Just what exactly is physically real and
what is an artefact of language is never precisely specified.²

    ² This creates real problems. One of my lecturers at Imperial College told the class,
rather sadly, that when he was starting on a PhD, he had come up with what he saw as
a very interesting problem. His kindly supervisor had assured him that it wasn’t a real
problem, but an artefact of language. Someone else, perhaps with a less assured or less
kindly supervisor, had assumed it was real, done the research, and become famous as a
result. I suppose one moral to be derived is that you shouldn’t trust your supervisor. The
conclusion I derived was that physicists are not at all clear as to what is real and what
isn’t. This surprised me at the time, but I was very young.

This allows physicists to spout manifest drivel. I was once assured that there were 4π
lines of force coming out of a unit charge, and on pointing out that this was
roughly 12 4/7 lines, and asking what 4/7 of a line looked like, I was reproved for
being too literal. Clearly one wasn’t supposed to ask what things meant, one
was simply being instructed in the right things to say, whether it made sense
or was total bullshit. Thus do subcultures maintain a wall against outsiders:
there’s a lot of it about.
Exercise 5.4.1. Listen to some conversation between your friends and decide
how much of what is said is carrying information about the world which
could be translated into a foreign language and remain intelligible, as in ‘Your
dress is transparent in the sunlight’, and how much is comprehensible only
after a large number of extra propositions have also been translated, and possibly
not then, as in ‘All cultures are equally valid in their own terms’.
You will note that the mathematical subculture has a quite diﬀerent set of
underlying assumptions from those of most of the rest of the human race.
One is that assertions have to make sense and should, if possible, be true
or derivable from other assertions which are either true or clearly stated to
be assumptions. Many students come to university with a quite diﬀerent
assumption: that what is to be said is anything that has been approved by
authority. Whether it is true, false or totally meaningless is of no importance.
Answering an examination question is done by taking a few half-recalled
fragments from lectures and gluing them together with bullshit. No doubt
this works well in the schools, and perhaps in other university departments,
but mathematicians really don’t like it. As I am sure you have noticed by
now.
Exercise 5.4.2. What other fundamental but usually unstated assumptions
characterise mathematical ‘culture’?
For the time being I shall simply go along with the physicists, but point out
any oddities while doing so.
5.5 Exact and Closed forms
I suggested earlier that given that dF = 0 for a 2-form F, we could get this
result if F = dX, using the well-known result that d^2 = 0.
Definition 5.5.1. A form Y which satisfies the condition dY = 0 is said to
be closed.

Definition 5.5.2. A k-form Y such that there exists a (k−1)-form ω such
that Y = dω is said to be exact.

Then we have the result that:

Proposition 5.5.1. Every exact form is closed.

Proof: d^2 = 0.
What about the other way around? Is every closed form exact? The answer
is interesting: it depends completely on topological properties of the space
on which the form is deﬁned. You might think that this is interesting only if
you are a topologist; it has however some important implications for Physics.
The idea is explained clearly enough in Chapter Six of the text book, which
does it first for the case when X is a 1-form on R^2. To say that dX = 0 is
to say that the field, corresponding to X when we use the inner product to
change to an equivalent vector field, has zero curl. The question then is, is
it a potential field? Is it the gradient of a scalar field f : R^2 → R?
We can try to construct one by the simple process of taking some point
a ∈ R^2 and declaring f(a) = 0. To get, for any other point b, a credible value
of f(b), we take a path from a to b and integrate the vector field along the
path. This tells us the amount of work the vector field does along the path.
We can put a minus sign in if we feel like it, but hey, who cares? Now this will
certainly give a value of f(b), but the obvious problem is that if we took a
different path, we might get a different answer. In general, we would. Your
second year exercises in the course of doing Stokes’ Theorem should have
convinced you of this.
If however the curl of the field is zero, then the integral around any closed loop
is zero. This follows immediately from Stokes’ Theorem in the plane, otherwise
known as Green’s Theorem. And this means that the value of f(b)
cannot depend on the path, because two paths between ﬁxed points when
joined together give a closed loop. Hence the value of f(b) does not depend
on the path, and so we can take this as a sensible value for f(b) because it
depends only on the vector ﬁeld and the point a.
Exercise 5.5.1. Show it depends on the point a only up to an additive
constant: in other words if I choose a and you choose a′, your function f′ and
my function f will differ by a constant.
Exercise 5.5.2. Translate this into the language of 1-forms, 2-forms and
0-forms on R^2.
This seems to give us the following:

Proposition 5.5.2. Every closed 1-form on R^2 is exact.

Proof: Just construct the 0-form as indicated. Then it is trivial to verify
that d of the 0-form is the given 1-form.
This seems perfectly reasonable and hasn’t seemed to involve us in any
topology, so I shall now give what looks like a counterexample to the last
proposition:
Proposition 5.5.3. The 1-form

    X = (y/(x^2 + y^2)) dx − (x/(x^2 + y^2)) dy

is closed but not exact.
Proof:
First the closed part.

    dX = ∂_y (y/(x^2 + y^2)) dy ∧ dx − ∂_x (x/(x^2 + y^2)) dx ∧ dy
       = ((x^2 − y^2)/(x^2 + y^2)^2) dy ∧ dx − ((y^2 − x^2)/(x^2 + y^2)^2) dx ∧ dy
       = ((x^2 − y^2)/(x^2 + y^2)^2) (dy ∧ dx + dx ∧ dy)
       = 0
Now suppose X = df for some function (0-form) f. Then it would follow that
the integral of X around the unit circle is zero, since starting at a = (1, 0)^T
and proceeding in the positive direction would give us f(a) − f(a) = 0 for
the integral, by definition of the construction of f. But a glance at the vector
field shows this is wrong. We have unit length vectors against us each step of
the way, so the integral is −2π. Check it by doing the algebra if the geometric
argument fails to carry conviction.
And something has gone horribly wrong.
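The algebra check the proof invites can be done mechanically. Here is my own sympy sketch (none of these names come from the text): it confirms that X is closed and that its integral around the unit circle is −2π.

```python
# My own sympy sketch (not from the text): check that
# X = y/(x^2+y^2) dx - x/(x^2+y^2) dy is closed, and that its integral
# around the unit circle is -2*pi, so X cannot be df for any global f.
import sympy as sp

x, y, s = sp.symbols('x y s')
P = y/(x**2 + y**2)     # coefficient of dx
Q = -x/(x**2 + y**2)    # coefficient of dy

# closed: dQ/dx - dP/dy = 0 away from the origin
assert sp.simplify(sp.diff(Q, x) - sp.diff(P, y)) == 0

# pull X back along the unit circle (cos s, sin s) and integrate
integrand = (P*sp.diff(sp.cos(s), s)
             + Q*sp.diff(sp.sin(s), s)).subs({x: sp.cos(s), y: sp.sin(s)})
loop_integral = sp.integrate(sp.simplify(integrand), (s, 0, 2*sp.pi))
print(loop_integral)    # -2*pi
```

The integrand simplifies to the constant −1, which is the "unit length vectors against us each step of the way" of the geometric argument.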
Exercise 5.5.3. Can you see what? Stop now and try to work out why this
result is not, as at first appears, in conflict with the penultimate proposition
that said that every closed 1-form on R^2 is exact. Warning: I am about to
give the game away below, so stop now and work it out.
The answer is of course obvious once you have seen it. The 1-form

    X = (y/(x^2 + y^2)) dx − (x/(x^2 + y^2)) dy

is not defined on R^2. It is defined and smooth on R^2 \ {0}. This is R^2 with
a hole in it. The hole completely destroys the argument, because Green’s
Theorem, Stokes in the plane, doesn’t work if there is a hole in the region.
The boundary of a disc with a hole in it consists of both the bounding circle
and the point at the hole. Ignoring missing points screws up everything.
You should be warned that evil people, I suspect physicists, have the bad
habit of writing the negative of this form as dθ. You can see why they do it,
but you have to deplore their moral and mathematical muddle.
Exercise 5.5.4. Why do they do it? You might like to consider the function
which takes a point in the plane, writes it in polar coordinates and sends it
to θ. What happens if you take the exterior derivative of this 0-form?
The removal of a point of R^2 makes a mess of the result that all closed forms
are exact. The argument works however for subsets of R^2 which don’t have
any holes in them. One hole is enough to bugger things up.
Exercise 5.5.5. Show this.
Exercise 5.5.6. What about the corresponding case of closed 1-forms on
R^3 \ {0}? Are they always exact? After all, if we have a loop in R^3 \ {0} we
need to find a surface with the loop as boundary which does not contain 0,
in order to use the classical Stokes’ Theorem. This will allow the argument
to go through even in R^3 \ {0}. And such surfaces are always there: we have
lots of extra room and can deform smoothly any bad surface that contains the
origin until it doesn’t.
It might occur to you to wonder if it goes on in the same way: does closed
imply exact on R^n in general? Investigating in the simplest case, R^2, we
know that the only 3-form is zero, so every 2-form on R^2 is closed. This
would suggest that if it is true, every smooth function on R^2 has a smooth
vector field of which it is the divergence.
Exercise 5.5.7. Is this indeed the case? If so prove it, if not give a
counterexample.

Exercise 5.5.8. Show that every 3-form on R^3 is the exterior derivative of
a 2-form.

Exercise 5.5.9. What about closed 2-forms on R^3? The required theorem
we would need is obviously more complicated since we have to construct a
1-form, not just a function. Do it for the 2-form dx ∧ dy + dx ∧ dz + dy ∧ dz.
The last exercise will show there is a certain amount of slack and that we
can make some choices. It would be nice however to have a more systematic
approach.
To do this, let’s look at 1-forms on R^2 which are exact and see if we can
be systematic about getting the ‘potential function’ f. Suppose we have a
1-form

    ω = P dx + Q dy

We can take the origin as a starting point and look to see what we get if we
integrate ω along a path from 0 to the point x. Rather than talk about any
old path, let’s do it with a straight line. Then the line is the set of points tx
for t ∈ [0, 1] and we get that the path integral of ω along this path is
    ∫_0^1 (P (dx/dt) + Q (dy/dt)) dt    where x = (x, y)^T

or

    ∫_0^1 (P(tx, ty) x + Q(tx, ty) y) dt
Exercise 5.5.10. Evaluate this for x = [1, 2]^T and the 1-form x dx + y dy.

Exercise 5.5.11. Evaluate this for x = [1, 2]^T and the 1-form −y dx + x dy.
Note that this is not closed.
The result, for any point x, is a number which we can call f(x). I shall call
it I(ω)(x) and use I(ω) instead of f. The reason is that I goes in the
opposite direction to the exterior derivative, so I (for exterior Integral?) seems
a reasonable symbol to use.
So we have an operator I from 1-forms to 0-forms which makes sense on R^n
and always gives an answer whether the 1-form is closed or not. And we
observe that when ω is closed, then dI(ω) = ω, so ω must be exact.
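The observation that dI(ω) = ω for closed ω is easy to test mechanically. The following is my own sympy sketch (the 1-form 2xy dx + x^2 dy is an example I chose, not one from the text): it builds I(ω) by the straight-line integral above and checks that its exterior derivative recovers ω.

```python
# My own sympy sketch: construct I(omega) by integrating along the ray
# t -> t*x and confirm d I(omega) = omega for a closed 1-form.
import sympy as sp

x, y, t = sp.symbols('x y t')
P = 2*x*y     # coefficient of dx
Q = x**2      # coefficient of dy

# omega is closed: dQ/dx - dP/dy = 0
assert sp.simplify(sp.diff(Q, x) - sp.diff(P, y)) == 0

# I(omega)(x, y) = integral_0^1 (P(tx, ty) x + Q(tx, ty) y) dt
ray = {x: t*x, y: t*y}
f = sp.integrate(P.subs(ray, simultaneous=True)*x
                 + Q.subs(ray, simultaneous=True)*y, (t, 0, 1))
print(f)      # x**2*y

# d I(omega) should have dx-coefficient P and dy-coefficient Q
assert sp.simplify(sp.diff(f, x) - P) == 0
assert sp.simplify(sp.diff(f, y) - Q) == 0
```

Here I(ω) comes out as x^2·y, which is indeed a potential for 2xy dx + x^2 dy.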
Can we get from 2-forms to 1-forms by a similar process? We investigate
the simplest case of R^2 and a nice simple 2-form. Let us start by taking
the constant 2-form 2 dx ∧ dy. We want to do some integrating to obtain a
suitable 1-form I(2 dx ∧ dy) = P dx + Q dy. Since all 2-forms on R^2 are closed,
we would rather like to have d(P dx + Q dy) = 2 dx ∧ dy.
There is a fair bit of slack here. We would have

    ∂_x Q − ∂_y P = 2

and we would need to make up our minds about how to split up the 2 between
the two contributions. Let’s make them equal. Then we would have

    ∂_x Q = 1;   ∂_y P = −1
We can integrate both these equations to get

    Q(x, y) = x;   P(x, y) = −y

or

    I(2 dx ∧ dy) = −y dx + x dy

and checking confirms that this works: d(−y dx + x dy) = 2 dx ∧ dy.
Had we chosen some other way to split the number 2 up between the two
terms, we would have got another equally good 1-form: there is no shortage
of them.

Exercise 5.5.12. Try it. Make one term zero. Or −1. Now look at the
various 1-forms which have constant exterior derivative 2 dx ∧ dy. What can
you say about their difference?
Now we try to make the process look more like an operator I taking 2-forms
to 1-forms. First we split the elements up in equal amounts to be definite.
Then we integrate along a path as for the case of 1-forms. I write

    I(2 dx ∧ dy) = ( ∫_0^1 t·2x dt ) dy − ( ∫_0^1 t·2y dt ) dx        (5.5.1)

The term t is in there to make sure we divide by 2, which we can regard as
sharing the contributions out equally.
Exercise 5.5.13. Suppose we do the same with some more complicated
2-form which is not constant, such as ω = (x^2 + y^2) dx ∧ dy. Can you see how to
fix things up to obtain the 1-form I(ω) by modifying equation 5.5.1 appropriately?
Exercise 5.5.14. Can you make it work for 2-forms on R^3? Try it on closed
2-forms first. Then try it on a 2-form ω that is not closed, and also try to
make it work for the 3-form dω. Notice anything?
If you have been good and virtuous and done the sequence of exercises above
you will be prepared to believe that we can construct for any k-form ω on
R^n, k > 0, a (k−1)-form Iω, also on R^n, given by:
    Iω(x) = Σ_{i_1 < ··· < i_k} Σ_{α=1}^{k} (−1)^{α−1} ( ∫_0^1 t^{k−1} ω_{i_1···i_k}(tx) dt ) x_{i_α} dx_{i_1} ∧ ··· ∧ d̂x_{i_α} ∧ ··· ∧ dx_{i_k}        (5.5.2)

where the hat on dx_{i_α} means this term is omitted.
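Equation 5.5.2 is easier to believe once you have watched it run. Here is a sketch implementation of my own (the dict representation of forms, index tuples i_1 < ··· < i_k mapped to sympy coefficients, and all the function names are mine, not the text’s), tried on the closed 2-form dx ∧ dy on R^3.

```python
# My own sketch of the operator I of equation 5.5.2, for k-forms on R^n
# stored as dicts {index tuple: sympy coefficient}.
import sympy as sp

t = sp.Symbol('t')

def I_operator(omega, coords, k):
    """I of a k-form: returns a (k-1)-form in the same dict representation."""
    result = {}
    for idx, coeff in omega.items():
        # coefficient evaluated along the ray t*x
        scaled = coeff.subs({c: t*c for c in coords}, simultaneous=True)
        for a, ia in enumerate(idx):
            rest = tuple(j for j in idx if j != ia)   # omit dx_{i_a}
            term = ((-1)**a * sp.integrate(t**(k - 1)*scaled, (t, 0, 1))
                    * coords[ia])
            result[rest] = sp.simplify(result.get(rest, 0) + term)
    return result

def d_operator(omega, coords):
    """Exterior derivative in the same dict representation."""
    result = {}
    for idx, coeff in omega.items():
        for j, c in enumerate(coords):
            if j in idx:
                continue
            new = tuple(sorted(idx + (j,)))
            sign = (-1)**new.index(j)   # cost of sorting dx_j into place
            result[new] = sp.simplify(result.get(new, 0) + sign*sp.diff(coeff, c))
    return {i: v for i, v in result.items() if v != 0}

x, y, z = sp.symbols('x y z')
coords = [x, y, z]
omega = {(0, 1): sp.Integer(1)}        # dx ^ dy, a closed 2-form on R^3
eta = I_operator(omega, coords, 2)     # expect (x dy - y dx)/2
assert sp.simplify(eta[(1,)] - x/2) == 0 and sp.simplify(eta[(0,)] + y/2) == 0
assert d_operator(eta, coords) == {(0, 1): 1}
```

For dx ∧ dy the formula yields (x dy − y dx)/2, whose exterior derivative is dx ∧ dy again, exactly the pattern of equation 5.5.1.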
This is undoubtedly a bit messy, which is why I gave the sequence of exercises.
If you prefer memorising things to understanding them, the very best of luck.
Note that I takes the zero k-form to the zero (k−1)-form.
It is now possible to prove the Poincaré lemma:

Theorem 5.5.1 (Poincaré Lemma). If a region U ⊆ R^n is star-shaped
with respect to the origin and if ω is a smooth k-form defined on U, then
there is a smooth (k−1)-form Iω defined on U and

    ω = d Iω + I dω

It follows that if ω is closed then it is exact.

Proof: This is a thoroughly horrid calculation which is done on page 95
of Michael Spivak’s Calculus on Manifolds. You have probably worked out
what the term star-shaped with respect to the origin means: if a point is
in the set U, so is every point on the line segment joining that point to the
origin.
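Without attempting Spivak’s general proof, one can at least check the identity ω = d Iω + I dω on a small example. The following sympy sketch is mine (ω = y dx is my choice of a non-closed 1-form, and the two I formulas are hand-coded from the buildup to equation 5.5.2, not anything from the text).

```python
# My sympy sketch: verify omega = d I(omega) + I d(omega) for the
# non-closed 1-form omega = y dx on R^2.
import sympy as sp

x, y, t = sp.symbols('x y t')
ray = {x: t*x, y: t*y}

P, Q = y, sp.Integer(0)   # omega = y dx; not closed: dQ/dx - dP/dy = -1

# I on 1-forms: f = integral_0^1 (P(tx,ty) x + Q(tx,ty) y) dt
f = sp.integrate(P.subs(ray, simultaneous=True)*x
                 + Q.subs(ray, simultaneous=True)*y, (t, 0, 1))

# d(omega) = g dx^dy with g = dQ/dx - dP/dy; I on 2-forms g dx^dy is
# (integral_0^1 t g(tx,ty) dt) (x dy - y dx)
g = sp.diff(Q, x) - sp.diff(P, y)
h = sp.integrate(t*g.subs(ray, simultaneous=True), (t, 0, 1))

# d I(omega) + I d(omega) should equal omega, coefficient by coefficient
dx_coeff = sp.diff(f, x) - y*h
dy_coeff = sp.diff(f, y) + x*h
assert sp.simplify(dx_coeff - P) == 0
assert sp.simplify(dy_coeff - Q) == 0
print('Poincare identity holds for y dx')
```

Neither d Iω nor I dω alone gives back ω here; it is their sum that does, which is the whole point of the identity.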
Exercise 5.5.15. Show that we can get the same result for any star-shaped
subset U ⊆ R^n, where U is star-shaped with respect to any point.

Exercise 5.5.16. Show that if a region U ⊆ R^n is diffeomorphic to a
star-shaped subset, then the result still holds for all smooth k-forms on U.

Exercise 5.5.17. Give some examples of subsets U in R^n which are not
diffeomorphic to star-shaped regions.
5.6 Homotopies
Recall from 3P0 the idea of a homotopy:
Definition 5.6.1. We say that two continuous maps f, g : X → Y where
X, Y are topological spaces are homotopic iff there is a continuous map F :
X × I → Y such that ∀ x ∈ X, F(x, 0) = f(x) and ∀ x ∈ X, F(x, 1) = g(x).
In such a case we write f ≃ g.
Exercise 5.6.1. Show that homotopy is an equivalence relation on the
continuous maps from X to Y.
In other words, we can change t continuously from 0 to 1 and interpolate
between f and g. If X is the space consisting of a single point, ∗, then to say
that two maps, f, g from ∗ to Y are homotopic is to say that f(∗) and g(∗)
can be connected by a continuous path joining them. So we can say that:
Definition 5.6.2. A space Y is path connected (or 0-connected) iff every
two maps from ∗ to Y are homotopic.

Or equivalently, we say Y is 0-connected iff every constant map to Y is
homotopic to every other constant map.
This can be extended considerably:

Definition 5.6.3. A space Y is simply connected or 1-connected iff every
map f : S^1 → Y is homotopic to a constant map.

Definition 5.6.4. A space Y is k-connected iff every map S^k → Y is
homotopic to a constant map.
You should be warned that some writers use the term k-connected to mean
what I call n-connected for every n ∈ [0 : k]. In my sense,

Proposition 5.6.1. The circle S^1 is 0-connected but not 1-connected.
Proof:
To see this we make use of the exponential map exp : R → S^1, t ↦ e^{2πit}.
If we take a map f : [0, 1] → S^1 with f(0) = f(1) we can regard f as a map
from S^1 to S^1. Using the fact that exp is locally a diffeomorphism, we can
lift f to f̃ : [0, 1] → R with exp ∘ f̃ = f. If, without loss of generality, we
assume f(0) = f(1) = [1, 0]^T then we can fix f̃(0) = 0 and observe that f̃(1)
must be an integer. This integer is called the winding number of f.
Exercise 5.6.2. Draw a picture. Confirm that it is always possible to chop
[0, 1] into small enough bits so that exp has a smooth inverse on the image
by f of each bit. Explain precisely how f̃ is constructed.
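The lifting construction can be imitated numerically: chop [0, 1] into small bits, accumulate the small argument changes around the loop, and read off f̃(1). This is my own sketch, not anything from the text; the function name and the choice of step count are assumptions.

```python
# A numerical imitation (mine) of the lift: sum the small argument
# increments of the loop and read off the integer f-tilde(1).
import cmath
import math

def winding_number(loop, steps=1000):
    """loop: [0,1] -> unit circle in C, with loop(0) == loop(1)."""
    total = 0.0
    prev = loop(0.0)
    for i in range(1, steps + 1):
        cur = loop(i/steps)
        # for fine enough steps, each increment is the principal argument
        total += cmath.phase(cur/prev)
        prev = cur
    return round(total/(2*math.pi))

print(winding_number(lambda s: cmath.exp(2j*math.pi*3*s)))   # 3
print(winding_number(lambda s: cmath.exp(-2j*math.pi*s)))    # -1
```

The chopping into "small enough bits" of the exercise corresponds to taking enough steps that each increment stays inside the range of the principal argument.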
It is not hard to show that the winding number is a homotopy invariant,
which is to say that if two maps are homotopic then they have the same
winding number, and also that if they have the same winding number they
are homotopic.
Exercise 5.6.3. Do it.
It follows that there is no homotopy between the identity map and a constant
map from S^1 to itself, and hence that S^1 is not 1-connected.
Exercise 5.6.4. Finish the argument.
Exercise 5.6.5. Show that S^2 is path connected and 1-connected but not
2-connected.
Exercise 5.6.6. Show that S^n is k-connected for 0 ≤ k < n but not
n-connected.
Definition 5.6.5. If f : X → Y and g : Y → X are continuous maps and if
f ∘ g ≃ I_Y and g ∘ f ≃ I_X then we say that X and Y have the same homotopy
type, and f is a homotopy equivalence.
Exercise 5.6.7. Show that having the same homotopy type is an equivalence
relation on topological spaces. Show that R^n has the homotopy type of a one
point space, and that S^k has the homotopy type of S^n iff k = n.
Exercise 5.6.8. Show that R^n \ {0} has the homotopy type of S^{n−1} for n ≥ 1.
The last exercise has as an almost immediate corollary that if R^2 has any
holes in it, the resulting space is not simply connected.
Exercise 5.6.9. Show this.
Let A denote any compact subset of R^2. Now it is immediate that if two
loops in R^2 \ A are homotopic, and if ω is any closed 1-form on R^2 \ A, then
the integral of ω over the first loop is equal to the integral over the second.
It follows that we can say that:
Theorem 5.6.1. If a manifold is connected and simply connected then every
closed 1-form on it is exact.

Proof:
If ω is a closed 1-form on a connected and simply connected manifold M^n,
then the integral around any loop is zero since the loop is homotopic to a
constant map. Hence the integral along any path between any pair of end
points does not depend on the path. To compute the integral along any path
we take a chart containing one end point and take a point along the path
which is in the domain of the same chart, shift the 1-form and the path to
R^n by means of the chart, and compute the integral in R^n. Do this for a
set of points along the path until we have the whole path, and add up the
part integrals to get the value of the integral for the whole path. Putting the
function I(ω) equal to zero at the starting point and the value of the integral
at the finishing point defines I(ω) at the end point. We can do this for every
end point on M^n. The argument that dI(ω) = ω takes place in R^n and is
trivial.
5.7 Counting Holes
The text book does an excellent job of explaining how we have a vector space
Z^p(M) of closed p-forms on M and another B^p(M) of exact p-forms, and we
know B^p(M) ⊆ Z^p(M). So we can form the quotient vector space

    H^p(M) = Z^p(M)/B^p(M)
This measures the number of p-holes in M. If you have troubles with quotient
spaces, take R^n and R^m with m < n, take an embedding of R^m in R^n by
a linear map, and look at the quotient object, which should look a lot like
R^{n−m}.
There are a number of ways of computing the cohomology of spaces, not
necessarily manifolds, and hence a number of different cohomology theories.
In a sense, and up to a choice of a coefficient group, they all give the same
answers. This takes us further into algebraic topology than I am game to go
in this course, but you should know that much. If you want to know more
algebraic topology, do the unit in second semester. You can ﬁnd my notes
on the web, which may or may not help.
Exercise 5.7.1. Do exercise 98 on page 125 of the text book. Read the section
carefully.
5.8 More Cultural Anthropology
There is a section in the text book on the Bohm-Aharonov effect which should
be of keen interest to cultural anthropologists. The effect is a quantum
mechanical phenomenon of some interest.
The text book explains that physicists get some insight into the eﬀect by
visualising an inﬁnitely long core on which is wound a helical wire which the
authors wrongly call a spiral. Then it appears that the fact that the region
outside the wire is not simply connected accounts for the eﬀect happening.
They then go on to admit that in fact the wire is not inﬁnite so the space
outside is in fact simply connected. Since the wires are normally joined via
a generator or battery so as to produce a current in the wire, one might feel
that they were right the first time. But one can certainly visualise a very
long coil, say a light-year length of wire, and a humungous charge at one
end which attracts the electrons towards it. Then if we neutralise the charge
with an equal and opposite one (producing a humungous ﬂash, perhaps) the
electrons will be released to head down the coil. After about nine months
one could conduct the experiment to detect the eﬀect somewhere about the
middle of the coil. Presumably it would be observed to happen despite the
fact that the coil has ends and the complementary space is in fact simply
connected.
Think about this. It is claimed that physicists get insight into why something
happens based on an assumption which is in fact false. It is rather like
claiming that you get some insight into why human beings have two legs by
observing that horses have four legs so the rear half of a horse has two. If
you were told this, you might point out that the claim is made by a person
who might in fact be a horse’s rear end, but since you are not, it does not in
fact contribute noticeably to your understanding.
Us coarse, crude mathematicians have a technical term for this sort of thing:
we call it bullshit.
It might be that some kind of sense can be made of this, and it would be
interesting to see it done, even more interesting to try to actually do it.
One is left with the impression that to a physicist, mathematics is there in
two rôles: one is to supply a means of doing the computations and the other is
as a sort of mnemonic for remembering the rules for doing them. Mnemonics
do not have to make sense, and generally don’t.
To a mathematician, the rules are there because they reﬂect the way the
universe works, and they have to make sense. Either the universe does in
fact work this way in which case the rules are right and we may trust our
calculations, or it doesn’t and they are unreliable. One may, of course, have
only uncertain knowledge of which of these states of affairs actually obtains.
Taking a punt on it being right and then examining reality closely and
discovering if our sums agree with our measurements is usually felt to be the
way to go. Talking pure bullshit, even if it is the same bullshit as that uttered
by the rest of the tribe, doesn’t cut it. In the creative phase of development
of an idea, some haziness is allowable, indeed necessary. But bullshit is always
a bad idea. And removing the haziness is crucial to progress. Incorporating
it into your subject is popular among people like publicists and politicians,
where a career built on a foundation of bullshit is quite common, but it is
disappointing to ﬁnd it in Physics.
It wouldn’t be quite so bad if physicists understood that what they are doing
here is rather silly. Like the arts students who feel quite proud of their
inability to use logic and announce that they are not to be constrained by
mere rules and consistency, the price paid is that nobody else will trust their
arguments. Long, long ago, physicists understood that bullshit is baaaaaad.
Some of them these days do not. So do civilisations crumble.
5.9 Summary
There are serious problems for a mathematician trying to understand physics,
many of them put in place by physicists, who have a very different notion
of what constitutes an explanation. Nevertheless it is a fascinating and
rewarding subject.
I have worked through most of part I of the text book and would like to
have got much further. It is possible for the interested reader to tackle the
next two sections, and I would encourage you to do this. You will certainly
come to the conclusion that understanding Physics entails getting a grasp of
an awful lot of contemporary mathematics. You might find it easier to do it
the physicist’s way, which involves knowing a lot of facts and stringing them
together with algebra in a rather muddled manner, or you might find it better
to understand the mathematics first. This is probably to be determined by
how much brainpower you have versus the extent of your memory.
The remaining chapter headings represent a pious hope of how far I would
have liked to get but probably won’t. As time permits I shall continue
finishing the material, but I doubt if we will get any further this semester. Maybe
we want a postgraduate unit on it.
Chapter 6
Lie Groups
6.1 Introduction and Motivation
6.1.1 The rest of the course
The next few chapters will treat the machinery needed to deal with part Two
of the text book. There are a number of elements of this. The first is a study
of some Lie Groups, which will require a small amount of group theory and
a brief return to the tensor algebra; the second is a study of vector bundles,
in particular G-bundles, where the bundle structure is specified by a group.
This is known to physicists as the gauge group of the bundle. It tells us
how to glue things together in order to build a bundle from trivial bundles.
This will lead to the Yang-Mills equation as a generalisation of the Maxwell
Equations for force fields other than the electromagnetic.
6.2 Introduction to Lie Groups
I discussed Lie Groups briefly in the second year algebra unit. They were
all matrix groups, and hence mostly subgroups of the general linear group
GL(n, R), which we can think of either as the space of all invertible linear
maps from R^n to itself, or as the space of n × n invertible matrices with
real entries. The exceptions were barely mentioned subgroups of GL(n, C),
which is either the space of invertible linear maps from C^n to itself or the
space of n × n invertible matrices with complex entries. Which definition you
prefer is a matter of taste; I prefer to think of the linear maps as being more
fundamental and regard the matrices as handy devices for representing the
linear maps in a convenient form for computation¹. In this course however
I shall usually write GL(n, R) for the matrices and Aut(R^n) for the linear
automorphisms (isomorphisms with itself) of R^n.
Notable among these groups were the Orthogonal groups O(n, R) and the
Special Orthogonal groups SO(n, R), the Unitary groups U(n, C) and the
Special Unitary groups SU(n, C). The space of all n × n real matrices, which
contains GL(n, R), is of course a vector space, in fact an algebra, because we
have a multiplication, not usually commutative, obtained by composing the
maps or, equivalently, multiplying the matrices. It is obvious that the group
property of the Lie Groups is that of the multiplication, but that if we add
two orthogonal matrices the result is not an orthogonal matrix, so the Lie
Groups are not vector spaces. They are however smooth manifolds, and hence
have a dimension.
To see that they are manifolds, the easy way is to note that for all the above examples, an element of the matrix group is defined by putting a bunch of smooth conditions on the entries of the matrix. For example, to get O(2, R) we take the space of all 2 × 2 matrices with real entries,

    [ x u ]
    [ y v ]

and require the conditions:

    x^2 + y^2 = 1,  u^2 + v^2 = 1,  xu + yv = 0

This gives us three independent conditions on four numbers, so we expect, or at least hope, to have one degree of freedom left and a one dimensional manifold. This is a rather sloppy discussion of an application of the implicit function theorem, which you need to remind yourself of. And the implicit function theorem is a generalisation, done locally, of the rank-nullity theorem, which you know from second year. I hope.
Let’s do two cases in agonising detail. First the unit circle, because it is so
easy.
The Implicit Function Theorem deals with the zero set of a function f : R × R → R which is differentiable at a point (a, b) ∈ R × R. Think f(x, y) = x^2 + y^2 − 1, with Df(a, b) = [∂_x f(a, b), ∂_y f(a, b)] = [2a, 2b]. It tells us that when the derivative with respect to y is invertible, we can represent the zero set of f locally as the graph of a curve y = g(x) for a differentiable g. The derivative with respect to y is invertible when it is nonzero, which happens everywhere except at y = 0, x = ±1 in the case of f(x, y) = x^2 + y^2 − 1. And in this case, if we swop x and y we can express the curve as the graph of a map from y to x. Since in either case the rank of the derivative is one, and the curve is locally (that is, in a neighbourhood of the point) a graph of a differentiable function, we conclude that at every point of the zero set of f where the rank of the derivative is one, the zero set of f is locally diffeomorphic to an interval.

[1] This is probably related to the fact that I don't particularly enjoy doing sums, but I do like understanding the ideas which tell me how to do them. This often requires me to do sums, but I prefer to do the bare minimum. Of course, if the ideas didn't tell me how to do the sums, I should suspect them of being metaphysical tosh, so I do believe that sums are, or at least the fact that they can be done is, important.
The generalisation of this which we need is the Implicit Function Theorem
which I give in what may be a new (manifold) form:
Theorem 6.2.1. If f : R^n × R^m → R^m is differentiable and

    M = {(x, y) ∈ R^n × R^m : f(x, y) = 0}

then if (a, b) ∈ M is such that rank Df(a, b) = m, there is a neighbourhood U of (a, b) in M which is diffeomorphic to an open ball in R^n.
Exercise 6.2.1. Find the version of the implicit function theorem you are
used to and verify that it is equivalent to the form given.
An even more useful form is:
Theorem 6.2.2. If f : R^n → R^m with n ≥ m is a smooth map and

    M = {x ∈ R^n : f(x) = 0}

then if rank Df = m on M, M is a smooth manifold of dimension n − m.
This is somewhat stronger than the classical Implicit Function Theorem and the idea is intuitively appealing: locally f may be approximated by an affine map whose linear part is Df, and if Df : R^n → R^m is onto then the kernel has dimension n − m. And in a neighbourhood, the graph of the derivative is diffeomorphic to the graph of f. In other words it is the rank-nullity theorem together with the fact that the derivative is a good approximation to the function in a sufficiently small neighbourhood.
In the case of f(x, y) = x^2 + y^2 − 1 it is easy to verify that the rank of Df is never zero on the solution set, so it must always be at least one.
Now to do the same with the orthogonal group O(2, R): we have

    f : R^4 → R^3,  (x, y, u, v)^T → (x^2 + y^2 − 1, u^2 + v^2 − 1, xu + yv)^T

and

    Df = [ 2x 2y  0  0 ]
         [  0  0 2u 2v ]
         [  u  v  x  y ]

and the rank of Df is 3 on M, so M is a smooth manifold (since Df is smooth) and has dimension 1.
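As a quick numerical sanity check (my addition, not part of the notes' argument), we can evaluate Df at a few points of O(2, R) and confirm its rank really is 3 there, so Theorem 6.2.2 applies:

```python
import numpy as np

def f(x, y, u, v):
    """The constraint map R^4 -> R^3 defining O(2, R)."""
    return np.array([x**2 + y**2 - 1, u**2 + v**2 - 1, x*u + y*v])

def Df(x, y, u, v):
    """Jacobian of f, as displayed above."""
    return np.array([[2*x, 2*y, 0, 0],
                     [0, 0, 2*u, 2*v],
                     [u, v, x, y]])

for t in np.linspace(0.0, 2 * np.pi, 7):
    # a rotation matrix has columns (cos t, sin t) and (-sin t, cos t)
    x, y, u, v = np.cos(t), np.sin(t), -np.sin(t), np.cos(t)
    assert np.allclose(f(x, y, u, v), 0)                   # the point lies on M
    assert np.linalg.matrix_rank(Df(x, y, u, v)) == 3      # rank condition holds
```

This only samples the rotation component, of course; the reflections behave the same way.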
Exercise 6.2.2. Do it for O(n, R). Show that the condition that a matrix
be in O(n, R) forces the determinant to be ±1, and deduce the dimension of
SO(n, R).
To get SO(2, R) we need another condition, namely xv −uy = 1. This might
lead you to suspect that SO(2, R) is a zero dimensional manifold, but the
fact is that the constraints are not independent, and we may deduce from
the ﬁrst three that xv −uy = ±1. This means that O(2, R) is disconnected
and SO(2, R) is one component of it. And the argument from second year
shows that SO(2, R) is diﬀeomorphic to the unit circle as a manifold. So
O(2, R) is diﬀeomorphic to two circles. Aren’t you glad we did not stipulate
that our manifolds had to be connected.
Exercise 6.2.3. First show that for a manifold (not necessarily smooth or even differentiable) connected implies path-connected. Then show that if we have a Lie group, the connected component containing the identity is a Lie subgroup.
Other Lie Groups can be obtained in essentially the same way as O(2, R) by imposing conditions on linear maps or matrices: at one end we have GL(n, F), the space of all invertible linear maps from F^n to itself, where F is any field. Then we can restrict ourselves to the case where F = C or F = R, which is less than adventurous but still more than enough to require some thought. We can stipulate that the determinant be 1, which will ensure that the measure is unchanged, to get SL(n, R); we can insist that some generalised inner product with signature (k, n−k) on R^n be preserved to get what we call O((k, n−k), R). We can restrict to determinant 1 in addition, which requires us to put S for Special in front of the name. And we can more or less repeat using C instead of R, except we use the term unitary instead of orthogonal. And we can do much of it all over again using finite fields.
You will note that SO((3, 1), R) is what we have called the Lorentz group. This suggests an extension: we could take any of the groups regarded as operations on R^n which preserve the origin and "affinise" them by also allowing shifts. This will increase the dimension of the group by n since there are n independent directions in which we can do the shifting. If we do this to the Lorentz group we get the Poincaré group[2].

[2] This allows the writer of the Wikipedia article on the Lorentz group to start off by defining the Lorentz group as a subgroup of the Poincaré group, probably the least helpful
Exercise 6.2.4. Show that the cartesian product of two Lie Groups is a Lie
Group.
This gives us enough Lie Groups to be going on with.
Exercise 6.2.5. Do the exercises 1 to 10 in chapter one of part two of the
text book.
6.3 Group Representations
6.3.1 Introduction
Recall, from second year, that an abstract group is merely a collection of things which can be multiplied and divided to give other things in the collection. This statement is usually made more precise by giving three axioms:

Definition 6.3.1. A group is a set G and a binary operation ∗ : G × G → G (with the operation ∗(g, h) usually written in infix notation as g ∗ h) such that

1. ∀ a, b, c ∈ G, (a ∗ b) ∗ c = a ∗ (b ∗ c)

2. ∃ e ∈ G, ∀ a ∈ G, a ∗ e = e ∗ a = a

3. ∀ a ∈ G, ∃ a^{-1} ∈ G, a ∗ a^{-1} = a^{-1} ∗ a = e
We can now give a formal definition of a Lie group:

Definition 6.3.2. A Lie group is a group G which is also a smooth manifold such that the maps inv : G → G, g → g^{-1} and ∗ : G × G → G are smooth, where ∗ is the multiplication in the group.

You will have already worked this out from contemplating the examples. I hope.

Definition 6.3.3. A Lie group homomorphism is a homomorphism between Lie Groups which is a smooth map between the manifolds.
definition one could imagine. Perhaps he wrote the article on the Poincaré group and defined it as an extension of the Lorentz group.
Exercise 6.3.1. Verify that all the Lie groups discussed are indeed Lie
groups.
We are also interested in abelian groups, which also satisfy the condition:

4. ∀ a, b ∈ G, a ∗ b = b ∗ a
Abstract groups are sometimes difficult to work with and so we often use a representation of the group, which means that the elements of the group become represented by matrices and the group operation by matrix multiplication. Thus we may take the rather forlorn group Z_2 which has only two elements, usually written 0 and 1; we replace ∗ by + since the group is abelian, and we seek to represent 0 by the identity matrix, 1 by some other matrix, and + by matrix multiplication. There are a lot of possible choices. For example we can choose the 2 × 2 matrix pair

    [ 1 0 ]    [ −1  0 ]
    [ 0 1 ],   [  0 −1 ]

which clearly works. Such a thing is called a representation of dimension 2. There is a rather simpler representation of dimension 1 which you should be able to see almost instantly.
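The claim that this pair "clearly works" amounts to checking the homomorphism property, which a few lines of Python can do mechanically (a sketch, my addition; the dictionary `rho` is just my way of writing the map 0 → I, 1 → −I):

```python
import numpy as np

# The 2-dimensional representation of Z_2 from the text: 0 -> I, 1 -> -I.
rho = {0: np.eye(2), 1: -np.eye(2)}

# Homomorphism property: rho(a + b mod 2) = rho(a) rho(b) for all a, b in Z_2.
for a in (0, 1):
    for b in (0, 1):
        assert np.allclose(rho[(a + b) % 2], rho[a] @ rho[b])
```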
Exercise 6.3.2. Write it down!
Formally,
Definition 6.3.4. A real representation of a group G of dimension (or degree) n is a homomorphism from G into GL(n, R).
and
Deﬁnition 6.3.5. A Lie group representation of a Lie group G is a Lie group
homomorphism from G into GL(n, R).
The theory of complex representations, where we go into GL(n, C), is much simpler, and we shall find that there is a strong preference for complex representations in the books. There is also some interest to physicists in quaternionic representations, where the quaternions, H, are the 'step beyond C'. Just as C is a two dimensional field, H is a four dimensional 'field', actually not a field but a skew-field since the multiplication does not commute.
Exercise 6.3.3. Define H as the set of quartets a + bi + cj + dk where i, j, k are meaningless symbols satisfying the rules i^2 = j^2 = k^2 = −1 and ij = k, jk = i, ki = j. Assuming everything distributes in a sensible way, show the result is a skew field. Go back to the M213 notes to see this done for C if hopelessly lost.
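A minimal computational sketch of the multiplication in the exercise (the function name `qmul` and the tuple encoding (a, b, c, d) for a + bi + cj + dk are my own choices; this sketches the check, not a proof):

```python
def qmul(p, q):
    """Multiply quaternions given as tuples (a, b, c, d) for a + bi + cj + dk,
    expanding by the rules i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert qmul(i, j) == (0, 0, 0, 1)     # ij = k
assert qmul(j, i) == (0, 0, 0, -1)    # ji = -k: multiplication does not commute
assert qmul(i, i) == (-1, 0, 0, 0)    # i^2 = -1
```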
Exercise 6.3.4. Just as there are 'orthogonal' groups over C, called the unitary groups, there are analogues over H called the symplectic groups. Construct one as a group of quaternionic matrices. Construct the one dimensional complex group U(1, C) as a group of real 2 × 2 matrices and the corresponding symplectic group as a group of real 4 × 4 matrices.
Remark 6.3.1. The use of H comes from Hamilton who invented them. It
is said that the relations deﬁning H are carved in a bridge in Ireland. If
you invent something great, you may be allowed to deface bridges too, but
these days it would probably be an oﬀence and you would face a severe ﬁne if
caught.
Remark 6.3.2. You might have felt that it makes more sense to insist that the homomorphisms are 1-1, but this complicates the theory enough to make it a bad idea. If a representation is 1-1, we say it is faithful.
Remark 6.3.3. We could call any homomorphism from G into Aut(V) for any vector space V a representation over V, and this has its advantages. Most representations are matrix representations in practice.
The above example suggests that there could be rather a lot of representations of a group, and that we can build some of them up from other, simpler, representations. Such is indeed the case, and the theory of representations deals with precisely this issue. It is a quite satisfying kind of theory for algebraists and they often give courses on it, usually for finite groups, occasionally for compact Lie groups, rather rarely in complete generality.
The representation of Z_2 of dimension 2 given above sends the positive x-axis to the negative x-axis and vice-versa for the non-identity element, and leaves the x-axis fixed for the identity: we say the x-axis is invariant under the group action. It is clearly a subspace of the vector space R^2 and gives rise to the subrepresentation which you will surely have discovered when looking for a one dimensional representation of Z_2. The y-axis is also an invariant subspace, and R^2 is the direct sum of these two subspaces. This is revision of second year material and I hope you recall it.

Had one of the minus signs in the second matrix been removed, note that again there are two invariant subspaces of which R^2 is a direct sum, and that this gives a new representation of Z_2. Both parts give subrepresentations of Z_2, but only one is faithful. In fact one is distinctly trivial.
Exercise 6.3.5. Find some real representations of Z_2 × Z_2. Is there a faithful one dimensional real representation? Is there a faithful one dimensional complex representation? A faithful two dimensional real or complex representation?
The fact that the given two dimensional representation of Z_2 can be split into two subrepresentations of lower dimension means that it is really not worth a deal of thought, because we can obviously recover it from the lower dimensional representations by direct summing them. In fact all the real representations of Z_2 can be obtained by minor variations of this process.
Exercise 6.3.6. Prove the last claim.
Exercise 6.3.7. Find some complex representations of Z_3. Of Z_n. Can you find any two-dimensional representations which do not have (complex) one dimensional subrepresentations?
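One family of complex representations of Z_n that is worth having in hand is the one dimensional family built from n-th roots of unity; the following sketch (my addition; the name `rep` and the indexing by k are mine) checks the homomorphism property numerically:

```python
import cmath

def rep(n, k):
    """The one dimensional complex representation of Z_n sending
    m to w^m, where w = exp(2*pi*i*k/n) is an n-th root of unity."""
    w = cmath.exp(2j * cmath.pi * k / n)
    return lambda m: w ** m

n = 3
for k in range(n):                 # one representation for each k = 0, ..., n-1
    rho = rep(n, k)
    for a in range(n):
        for b in range(n):
            # rho(a + b mod n) = rho(a) rho(b), up to floating point error
            assert abs(rho((a + b) % n) - rho(a) * rho(b)) < 1e-9
```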
6.3.2 Irreducible Representations
Definition 6.3.6. A representation m : G → GL(n, R) is irreducible iff there are no proper nonzero subspaces of the space on which the matrices act which are invariant under the group action.
Remark 6.3.4. It might be more natural to deﬁne reducible representations
but they are too boring.
The things that make this interesting to physicists are:

1. The representations of compact groups are all direct sums of irreducible representations

2. Most gauge groups are compact

3. The irreducible representations of the gauge groups correspond to the fundamental particles, for example, electrons

This allows us to compute properties of the fundamental particles by looking at group representations. This is surely quite astonishing and wonderful. I have said some rude things about physicists, but if they can do this then they have more than redeemed themselves. They are good blokes. Or good sheilas, as the case may be. Or at least, some of them certainly are.
I have not given a formal deﬁnition of the direct sum of two representations.
Exercise 6.3.8. Construct a suitable deﬁnition.
Definition 6.3.7. Two representations f, g : G → GL(n, R) are equivalent iff there is an isomorphism α : R^n → R^n such that

    ∀ a ∈ G, f(a) ∘ α = α ∘ g(a)
Exercise 6.3.9. Draw the obvious commutative diagram.
Observe that we could have generalised this by defining Aut(V) as the set of invertible linear maps from V to itself where V is any real vector space, and then defining a representation of a group G over V as a homomorphism m : G → Aut(V). Then if α : U → V is an isomorphism of vector spaces, we can talk of representations of a group G over U and V as being equivalent provided the appropriate diagram commutes.
Exercise 6.3.10. Draw the new diagram.
6.3.3 Tensor Representations
I shall present this as a sequence of easy exercises.
Exercise 6.3.11. Write out a nontrivial representation, φ, of Z_2 as 2 × 2 real matrices.

Exercise 6.3.12. Write out a nontrivial representation, ψ, of Z_2 as 3 × 3 real matrices.

Exercise 6.3.13. Using the discussion on Darling's expressions for the tensor product, find an isomorphism between R^2 ⊗ R^3 and R^6.

Exercise 6.3.14. Find the obvious tensor representation φ ⊗ ψ in terms of 6 × 6 real matrices and the above isomorphism.

Exercise 6.3.15. Show it really is a representation.

Exercise 6.3.16. Repeat for the group SO(2, R).

Exercise 6.3.17. Define the tensor product of two representations, one over a vector space U and the other over a vector space V.

Exercise 6.3.18. Write down a really, really obvious map from R^2 × R^3 to R^2 ⊗ R^3.

Exercise 6.3.19. Show that any bilinear map f : R^2 × R^3 → R factors into your really, really obvious map and a linear map from R^2 ⊗ R^3 to R.

Exercise 6.3.20. Show this generalises to bilinear maps from U × V to R.

Exercise 6.3.21. List all the one dimensional complex representations of Z_4.

Exercise 6.3.22. Hence or otherwise, list all the one dimensional complex representations of SO(2, R). (Which it may be convenient to identify with U(1, C).)

Exercise 6.3.23. Explain why the above representations are irreducible when they are.
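The tensor construction in these exercises can be previewed numerically: under the usual isomorphism R^2 ⊗ R^3 ≅ R^6, the tensor product of matrix representations becomes the Kronecker product. A sketch (my addition; the particular diagonal choices of φ and ψ are one possible answer to the first two exercises, not the only one):

```python
import numpy as np

# One choice of nontrivial representations of Z_2 by 2x2 and 3x3 real matrices.
phi = {0: np.eye(2), 1: np.diag([1.0, -1.0])}
psi = {0: np.eye(3), 1: np.diag([1.0, -1.0, -1.0])}

# Under R^2 (x) R^3 = R^6, the tensor representation is the Kronecker product.
tensor = {g: np.kron(phi[g], psi[g]) for g in (0, 1)}

# Check it really is a representation: a homomorphism from Z_2 into GL(6, R).
for a in (0, 1):
    for b in (0, 1):
        assert np.allclose(tensor[(a + b) % 2], tensor[a] @ tensor[b])
```

The key fact doing the work is kron(A, B) kron(C, D) = kron(AC, BD).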
6.3.4 Schur’s Lemma
I got this from Frank Adams’ Lectures on Lie Groups which I recommend
only to the bravest. It is a beautiful book but very, very dense.
Definition 6.3.8. A CG-space V is a complex finite dimensional vector space together with a homomorphism φ from the Lie group G into Aut(V); that is, a representation of G over V.

Definition 6.3.9. A map between CG-spaces U and V is a C-linear map f : U → V which commutes with the homomorphisms; that is, if φ : G → Aut(U) and ψ : G → Aut(V) are the representations, then for all g ∈ G and all u ∈ U,

    f(φ(g)(u)) = ψ(g)(f(u))
Proposition 6.3.1. If φ and ψ are irreducible, any CG-map is either zero or an isomorphism.

Proof: Ker(f) and Im(f) are clearly invariant subspaces of the representations and are hence either zero or the whole space.
Remark 6.3.5. It is clear that this works over arbitrary fields, not just C. The actual Lemma needs the complex numbers:

Proposition 6.3.2 (Schur's Lemma). If f : V → V is a CG-map between irreducible representations φ and ψ of a Lie group G, then f = λI_V for some λ ∈ C.

Proof: V is isomorphic to C^n for some n ∈ Z^+ so we work there. Then there are n complex eigenvalues for f, by the Fundamental Theorem of Algebra, not necessarily different. So there is at least one λ ∈ C such that det(f − λI_V) is zero. Then by the previous proposition we must have f = λI_V, since f − λI_V cannot be an isomorphism for this value of λ and is hence zero.
Corollary 6.3.2.1. All the irreducible complex representations of an abelian group have dimension one.

Proof: If G is abelian and ρ : G → Aut(V) is a representation, then for every g ∈ G, ρ(g) is an automorphism of V which is a CG-map from ρ to ρ. It follows that ρ(g) is λI_V for some complex number λ, and hence that every subspace of V is invariant under ρ(g). If ρ is irreducible it follows that V can have no proper nonzero subspaces, and hence has dimension one.
Remark 6.3.6. If you have a trace of mathematical taste you will allow that
the last three results are very cool.
It follows that the complex irreducible representations of U(1, C) are all equivalent to one of the form

    ρ_n(1, θ) : (r, φ) → (r, φ + nθ),  n ∈ Z

Exercise 6.3.24. Show this carefully.

Exercise 6.3.25. Show that tensor multiplication in R is just multiplication, likewise in C, and hence that the tensor product of the above irreducible representations just makes ρ_n ⊗ ρ_m = ρ_{n+m}.
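Writing the element of U(1) as the complex number z = e^{iθ} of modulus 1, ρ_n acts on C by multiplication by z^n, and since tensor multiplication in C is just multiplication, the rule ρ_n ⊗ ρ_m = ρ_{n+m} becomes z^n z^m = z^{n+m}. A small numerical sketch of this (my framing, using the multiplicative picture rather than the polar notation above):

```python
import cmath

def rho(n):
    """The irreducible representation of U(1) sending z (|z| = 1) to
    multiplication by z^n on C."""
    return lambda z: z ** n

z = cmath.exp(0.7j)              # an arbitrary point of the unit circle
n, m = 2, -5
lhs = rho(n)(z) * rho(m)(z)      # the tensor product rho_n (x) rho_m at z
rhs = rho(n + m)(z)              # rho_{n+m} at z
assert abs(lhs - rhs) < 1e-9
```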
6.3.5 Representations of SU(2, C)
The text book indicates, without any very compelling arguments, that the
irreducible representations of U(1, C) have important physical signiﬁcance.
Since the deﬁnition of U(1, C) means that it has to preserve lengths, it must
be the subset of C which contains only complex numbers of modulus 1, that
is, it is the unit circle. And a very ﬁne group it is too, being isomorphic as
a Lie group to SO(2, R).
Exercise 6.3.26. Prove the last claim.
The representations of U(1, C) being so simple, it is natural to investigate
the representations of U(2, C) and SU(2, C). I shall refer to these as U(2)
and SU(2) from now on since the C may reasonably be taken for granted.
Again we need look only at the irreducible representations and again we are
motivated by the hope of some important physical applications of these ideas.
The ﬁrst observation worth noting is that U(2) and SU(2) are not abelian
groups, so we expect complications.
First it is essential to get some sort of feeling for the groups. SU(2) is the subgroup of U(2) having determinant one, and U(2) will consist of the 2 × 2 matrices with complex entries which preserve the complex inner product on C^2, that is the rule

    ⟨ (a, b)^T , (u, v)^T ⟩ = a ū + b v̄

where v̄ is the complex conjugate of v. The maps will have to take the standard basis for C^2 to vectors which have length 1 and which are orthogonal with respect to the complex inner product, and so the columns of the matrices representing these linear maps must also be orthogonal and have length 1, which implies that the inverse of such a matrix is its conjugate transpose.
We use A^* to denote the conjugate transpose of A, although many physicists use A^†.
We note that

    [ e^{iθ}    0    ]
    [   0    e^{iφ}  ]

is such a matrix for any θ, φ, and so is

    [ cos α  −sin α ]
    [ sin α   cos α ]

for any α, since real orthogonal matrices are necessarily unitary. Also the product of two matrices which have inverses equal to their conjugate transpose has its inverse equal to its conjugate transpose.
Exercise 6.3.27. Prove this.
This would lead one to conjecture that the manifold U(2) has (real) dimension at least three. That it is a (real) manifold follows from the usual arguments involving the Implicit Function theorem. Note that it makes sense to have complex manifolds with smooth maps between charts in C^n, but we shall not be dealing with such things.
Exercise 6.3.28. Find the dimension of U(n) from the Implicit Function theorem. Show that an element of U(n) must have determinant a complex number of modulus 1, and hence deduce the dimension of SU(n). (The answer to the last part is n^2 − 1; make sure you get it right!)
An insight into the geometry of SU(2) is obtained from the Pauli matrices. Recall that a matrix is hermitean if it is equal to its conjugate transpose. The Pauli matrices are

    σ_0 = [ 1 0 ]    σ_1 = [ 0 1 ]    σ_2 = [ 0 −i ]    σ_3 = [ 1  0 ]
          [ 0 1 ],         [ 1 0 ],         [ i  0 ],         [ 0 −1 ]

It is easy to see that these are linearly independent over C and hence form a basis for the four (complex) dimensional space of all 2 × 2 complex matrices. If we take only real coefficients then we get the hermitean 2 × 2 matrices.

Exercise 6.3.29. Confirm this claim. Confirm that all hermitean matrices are obtained in this way.
You will observe that the Pauli matrices are certainly hermitean themselves
but are also unitary.
Multiply each of σ_j for j ∈ [1 : 3] by −i and call these, following Baez and Muniain, I, J, K to get:

    σ_0 = [ 1 0 ]    I = [  0 −i ]    J = [ 0 −1 ]    K = [ −i 0 ]
          [ 0 1 ],       [ −i  0 ],       [ 1  0 ],       [  0 i ]

Note that

1. These matrices also span the space of 2 × 2 complex matrices when complex coefficients are allowed

2. Each has determinant one

3. Each is unitary

Now it is easy to verify that taking all possible real linear combinations of these matrices gives us a representation of the Quaternions, H.
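The verification rests on σ_0, I, J, K satisfying exactly the quaternion relations; a quick matrix check (my addition, sketching the computation the exercise asks for):

```python
import numpy as np

# sigma_0 and the matrices I, J, K = -i * sigma_1, -i * sigma_2, -i * sigma_3.
s0 = np.eye(2, dtype=complex)
I = np.array([[0, -1j], [-1j, 0]])
J = np.array([[0, -1], [1, 0]], dtype=complex)
K = np.array([[-1j, 0], [0, 1j]])

# The quaternion relations: I^2 = J^2 = K^2 = -1 and IJ = K, JK = I, KI = J.
assert np.allclose(I @ I, -s0)
assert np.allclose(J @ J, -s0)
assert np.allclose(K @ K, -s0)
assert np.allclose(I @ J, K)
assert np.allclose(J @ K, I)
assert np.allclose(K @ I, J)
```

So a + bi + cj + dk → a σ_0 + b I + c J + d K respects multiplication.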
Exercise 6.3.30. Do it.
It is also easy to verify that whenever a^2 + b^2 + c^2 + d^2 = 1, for reals a, b, c, d,

    a σ_0 + b I + c J + d K

is unitary,

Exercise 6.3.31. Do it

has determinant one

Exercise 6.3.32. Do it

and only slightly harder to confirm that every unitary 2 × 2 matrix with determinant one is of this form.

This has shown that SU(2) is the three sphere S^3 equipped with a multiplication which does not commute.

Exercise 6.3.33. Do it.
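A numerical sketch of the easy direction (my addition): a random point of the unit sphere S^3 in R^4 gives a unitary matrix of determinant one.

```python
import numpy as np

s0 = np.eye(2, dtype=complex)
I = np.array([[0, -1j], [-1j, 0]])
J = np.array([[0, -1], [1, 0]], dtype=complex)
K = np.array([[-1j, 0], [0, 1j]])

rng = np.random.default_rng(0)
v = rng.normal(size=4)
a, b, c, d = v / np.linalg.norm(v)          # a random point of S^3

U = a * s0 + b * I + c * J + d * K
assert np.allclose(U @ U.conj().T, np.eye(2))   # unitary
assert np.isclose(np.linalg.det(U), 1)          # determinant one, so U is in SU(2)
```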
Remark 6.3.7. There is quite a lot of useful structure lying about here which has been used by engineers and physicists for many a long year. Mathematicians tend to see themselves as discovering structure and pointing it out to physicists and engineers who eventually come to find it useful in talking about something in reality, and then imagine that they discovered the structure experimentally. Physicists and engineers have a different story.
6.3.6 Representations of SU(2)
This is reasonably well described in the text book: the representations are over vector spaces of homogeneous polynomials. The zero degree polynomials are simply the complex numbers; the space H_j, for j a half-integer, is the space of homogeneous polynomials of degree 2j in two variables. Thus we have for j = 0 the constant functions from C^2 to C and for j = 1/2 the functions

    f_{a,b} : C^2 → C,  (x, y)^T → ax + by

for a, b, x, y ∈ C. Then H_j is a vector space over C of (complex) dimension 2j + 1. U_j : SU(2) → Aut(H_j) is the representation which takes any g ∈ SU(2) to the automorphism carrying the polynomial p to the polynomial q defined by

    q((x, y)^T) = p(g^{-1}(x, y)^T)

Exercise 6.3.34. Confirm this gives a representation of SU(2).

These are in fact all the irreducible representations of SU(2), something which is not proved in the text book and I shan't prove it either. You may if you wish.
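The action p → p ∘ g^{-1} can be made concrete for H_1 (degree 2, complex dimension 3) by writing it as a 3 × 3 matrix in the basis x^2, xy, y^2. The sketch below (my addition; the parametrisation `su2` and the sample-point trick for reading off coefficients are my own choices) then checks numerically that U_1 is a homomorphism:

```python
import numpy as np

def su2(a, b):
    """The element [[a, b], [-conj(b), conj(a)]] of SU(2); needs |a|^2+|b|^2 = 1."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])

# Three sample points pin down the coefficients of a degree-2 homogeneous poly.
PTS = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
V = np.array([[x*x, x*y, y*y] for x, y in PTS], dtype=complex)

def U(g):
    """Matrix of p -> p o g^{-1} on H_1, in the basis x^2, xy, y^2."""
    ginv = np.array([[g[1, 1], -g[0, 1]], [-g[1, 0], g[0, 0]]])  # inverse, det g = 1
    cols = []
    for k in range(3):
        vals = []
        for x, y in PTS:
            xp, yp = ginv @ np.array([x, y])
            vals.append([xp*xp, xp*yp, yp*yp][k])  # k-th basis monomial at g^{-1}(x, y)
        cols.append(np.linalg.solve(V, np.array(vals)))  # its coefficient vector
    return np.array(cols).T

g = su2(0.6, 0.8j)
h = su2(np.cos(0.3), np.sin(0.3))
assert np.allclose(U(g @ h), U(g) @ U(h))   # U is a homomorphism
```

Note the inverse in p ∘ g^{-1} is what makes this a left action, hence a homomorphism rather than an anti-homomorphism.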
Remark 6.3.8. This concludes everything we have to say about representations, where 'we' means Baez, Muniain and me, but it is far from completing the business. There is a lot of important and relevant material still uncovered. Well, that's life.
Exercise 6.3.35. Read the discussion in the text book and ﬁll in any gaps.
Remark 6.3.9. I am skipping the material which claims that SU(2) is a
double covering of SO(3); quite a lot can be said about this and it explains
the interest physicists have in SU(2).
Exercise 6.3.36. A lot of deep issues arise which physicists tend to gloss
over. This is an invitation to think about them.
First we have the mystery that the irreducible representations of the group U(1) or SO(2) have something to do with the fact that charge is conserved. Then we have that the irreducible representations of SU(2) tell us something about spin, and about fundamental particles. This invites two separate questions, the first being: what exactly do the groups have to do with it? Groups such as the Lorentz and Poincaré groups arise naturally enough from our desire to have the physics independent of the detailed choice of language, Einstein's principle of general covariance. What is the explanation for U(1) having everything to do with the Maxwell Equations and Electromagnetism?

Second, why irreducible representations? Why representations at all? One can see that they might be convenient in doing calculations, but it looks as though the use made of them goes beyond simple convenience. What exactly is the relation between the physics and our description, and why are representations central to it?
There is a sketch of an answer to these questions in the part I have skipped:
it involves Quantum Mechanics and the standard Hilbert space representation
of quantum states.
You are invited to write a short essay addressing these questions.
You are also invited to consider the extent to which the Hilbert Space representation is essential to Quantum Mechanics, and to ponder whether a wholly abstract description of what is needed for a mathematical model of QM would necessitate Unitary representations.
6.4 Lie Algebras
Deﬁnition 6.4.1. The Lie Algebra of a Lie group G is the tangent space
at the identity. It is called g. This makes it a vector space of the same
dimension as G.
Remark 6.4.1. The multiplication comes later.
Remark 6.4.2. Elements of g used to be (and still are by some people) called the infinitesimal elements of G. You can see why. In particular the infinitesimal rotations are obtainable from the rotations in SO(3) by taking curves through the identity in SO(3) and differentiating them. The text book gives some natural examples: we take the matrix function

    [ cos t  −sin t  0 ]
    [ sin t   cos t  0 ]
    [   0       0    1 ]

which represents a curve of rotations about the z-axis and differentiate it at t = 0 to get

    J_z = [ 0 −1 0 ]
          [ 1  0 0 ]
          [ 0  0 0 ]

and J_x, J_y can be obtained in the same way.
Exercise 6.4.1. Do it.
These three matrices are linearly independent and span the algebra so(3). The multiplication is the Lie Bracket in this case,

    [X, Y] = XY − YX

Exercise 6.4.2. Verify that the Lie bracket is in the vector space so(3) when X and Y are.
We can recover the original matrix functions by exponentiation:
Exercise 6.4.3. Show that exp(tJ_z) is what it ought to be.
Exercise 6.4.4. Show that the Lie algebra of SO(2) is just R.
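The exponentiation in Exercise 6.4.3 can be previewed numerically: summing the power series for exp(t J_z) recovers the curve of rotations about the z-axis we differentiated. (A sketch, my addition; the truncated series is adequate here because t J_z is small, and the helper name `expm` is my own choice.)

```python
import numpy as np

def expm(A, terms=30):
    """Matrix exponential by truncated power series; fine for small A."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n          # A^n / n!
        result = result + term
    return result

Jz = np.array([[0.0, -1.0, 0.0],
               [1.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]])
t = 0.3
R = np.array([[np.cos(t), -np.sin(t), 0.0],
              [np.sin(t), np.cos(t), 0.0],
              [0.0, 0.0, 1.0]])
assert np.allclose(expm(t * Jz), R)   # exp(t J_z) is the rotation by angle t
```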
Exercise 6.4.5. Do exercises 33 to 54 in the text book.
Remark 6.4.3. Lie algebras are, as the book tells us, nicer in many ways to
work with than Lie groups because they are vector spaces. They give a lot of
information about the groups and their representations.
Chapter 7
Fibre Bundles
7.1 Introduction
A standard source on Fibre Bundles is Dale Husemoller's Fibre Bundles. There are probably more modern books, and there are certainly better written books, but I own a copy so will stick to following it. I shall do very little on this subject (there is quite a lot to be done) because I want to focus on differential geometry, the subject of these notes, but there are close connections, as is shown by the physics. Anyway, to get very far in Fibre Bundles you would need more homotopy theory than you have. So this will be a short chapter.
First some examples:
1. The product S^1 × R with projection to S^1 has base space S^1, fibre (space) R and total space S^1 × R. It is easy to see why we call the fibre a fibre (it is long and thin) and the fibres are glued together by the topology of the base space.
2. The Möbius bundle which has the same base space and fibre as the last example, but has a twist in it so as to make a Möbius strip (without a boundary). Again there is a projection from the total space to the base space, and the inverse image of any point is a copy of R.
3. Any product of two spaces. For example a 2-torus has base space S^1 and also fibre S^1.
4. Any tangent bundle. This attaches to every point of a smooth manifold a vector space, the tangent space at the point, and the resulting object is a vector bundle, which is defined as a fibre bundle which has a vector space for the fibre, an important subclass of fibre bundles.
5. Tensor bundles. Again, these are all vector bundles.
6. SO(n, R), n ≥ 2 is a fibre bundle with base space S^{n−1}. The map takes an element of SO(n, R) and sends it to wherever the north pole of the sphere S^{n−1} is taken by applying the element to the sphere. The inverse image of this point in SO(n, R) is a subset which is an embedded copy of SO(n−1, R), the fibre. When n = 2, SO(n−1, R) is a single point, the identity map from R to itself, so SO(2, R) is just a copy of S^1 topologically.
7. The sphere S^n is a fibre bundle over RP^n which sends antipodal points to the same point and hence has fibre Z_2. It might be better to describe the fibre as S^0, the pair ±1 under multiplication, or O(1, R). More interesting bundles can be obtained by replacing R with C.
8. Take the sphere S^2 and at each point take the space of ordered pairs of orthonormal tangent vectors. This gives an orthogonal 2-frame bundle over S^2. In general, if M is a smooth manifold, for k an integer less than or equal to the dimension of the manifold, take the space of (ordered) k orthonormal tangent vectors at each point. An orthogonal 1-frame bundle on S^2 would consist of attaching a unit circle to each point of the space, the circle being in the tangent space at the point. An orthogonal 2-frame bundle on S^2 would attach 2 circles at each point (explain why). Clearly this supposes a Riemannian Inner Product. More generally, it makes sense to attach at each point of a smooth n-manifold an ordered set of k linearly independent vectors of the tangent space at that point, for k ≤ n. These bundles are called frame bundles. A section of the two-frame bundle on S^2 would give a rather special pair of vector fields, being everywhere linearly independent, and we know there is not even one such vector field. So it is not at all obvious whether a given manifold admits a field of k-frames in general. Note that there is a rather natural group action on frame bundles, O(n, R) on the orthogonal frame bundles, and GL(n, R) on the bundles where we do not suppose a Riemannian structure. The group acts on the total space but sends fibres to fibres by what is a multiplication of the (Lie) group and hence a diffeomorphism. Bundles with a group action of this sort are called principal bundles. I shall elaborate on these later.
Exercise 7.1.1. Show that attaching an ordered set of n orthonormal vectors to each point of a space is equivalent to attaching an element of the orthogonal group, and that the n-frame bundle effectively attaches GL(n,R) to each point of the n-manifold, this being the fibre. Thus a useful way of thinking of a bundle with base space a manifold is to regard the manifold as having a copy of the fibre attached at each point of the manifold.
Exercise 7.1.2. Show that the 2-torus admits a field of 2-frames. Does S^3?
The above should convince you that (a) some spaces have a structure which makes them something like a generalised Cartesian product and (b) it is worth knowing more about them because there are interesting examples.
Definition 7.1.1. A fibre bundle is a triple of spaces E, B, F and a map π : E → B such that for every b ∈ B, π^{-1}(b) is homeomorphic to F.
Definition 7.1.2. A fibre bundle is locally trivial iff there is a cover of B by open sets U_j and for each of them π^{-1}(U_j) is homeomorphic to U_j × F.
Remark 7.1.1. All our fibre bundles will be locally trivial.
Exercise 7.1.3. Give an example of a ﬁbre bundle which is not locally
trivial.
Definition 7.1.3. A bundle map between fibre bundles (E, B, F, π) and (E′, B′, F′, π′) is a pair of maps f_B : B → B′ and f_E : E → E′ such that π′ ∘ f_E = f_B ∘ π.
Remark 7.1.2. It follows that fibres wind up inside fibres under f_E. It is helpful to draw a square: f_E along the top from E to E′, the projections π and π′ down the sides, and f_B along the bottom from B to B′. We say that the square commutes with the condition π′ ∘ f_E = f_B ∘ π. If π is onto then f_E determines f_B. From the definition, it has to be.
Exercise 7.1.4. Define the terms product of fibre bundles, subbundle, quotient bundle. Give examples of each.
Exercise 7.1.5. Find out what a ﬁbre product is and give an example.
7.2 Principal Bundles
In many of the above examples, the fibre had some extra structure besides being a topological space: often it was a vector space, giving a vector bundle, and sometimes it was a group. A group acts on itself by multiplication, and so we can more generally consider the case when the fibre has a group action on it. We care most about the case where the group acting on the fibre is a Lie group and the action is regular, which means it is both transitive (for any two points of the space there is a group element which acts to take one to the other) and free (only the identity leaves any point fixed); this is equivalent to saying that for any two x, y in the fibre F there exists precisely one g in G such that gx = y. In this case, F is known as a principal homogeneous space for G, or as a G-torsor. This definition holds whether F is actually the fibre of a bundle or not.
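For a finite group the regularity condition is concrete enough to check by brute force. The following sketch (an illustration of my own, not from the text) tests transitivity and freeness for Z_4 acting on itself by addition, and shows a quotient action that is transitive but not free:

```python
# Check whether a group action is regular (transitive and free),
# i.e. whether the set acted on is a G-torsor.

def is_transitive(group, points, act):
    # for every pair (x, y) some g takes x to y
    return all(any(act(g, x) == y for g in group)
               for x in points for y in points)

def is_free(group, points, act):
    # only the identity (assumed to be group[0]) fixes any point
    return all(act(g, x) != x for g in group[1:] for x in points)

Z4 = [0, 1, 2, 3]
add_mod4 = lambda g, x: (g + x) % 4

# Z4 acting on itself by addition: transitive and free, so a Z4-torsor.
print(is_transitive(Z4, Z4, add_mod4), is_free(Z4, Z4, add_mod4))  # True True

# Z4 acting on {0, 1} via g.x = (g + x) mod 2: transitive but g = 2
# fixes every point, so the action is not free.
mod2 = lambda g, x: (g + x) % 2
print(is_transitive(Z4, [0, 1], mod2), is_free(Z4, [0, 1], mod2))  # True False
```

The same two tests, applied fibrewise, are exactly what Definition 7.2.1 below asks of each fibre of a principal bundle.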
Exercise 7.2.1. Show that the action of S^1 on itself (regarded as U(1,C), i.e. the set of complex numbers of modulus 1 with the usual complex multiplication) makes it an S^1-torsor. Is there a regular action of S^1 on T^2? Is there a regular action of R on T^2? Take the quotient space I/∂I which joins the ends of the unit interval together. This is homeomorphic to S^1 but lacks the group structure and the smoothness structure. Show that it can be given the structure of a smooth manifold via any homeomorphism with S^1, and also that it is an S^1-torsor. Is any Lie group G a G-torsor? Are there any G-torsors that are not homeomorphic to G?
Definition 7.2.1. A fibre bundle where each fibre is a G-torsor (for the same G) is called a principal bundle.
Exercise 7.2.2. Show that the n-frame bundle for any smooth manifold (usually written F(M)) is a principal bundle.
Exercise 7.2.3. By taking the Möbius strip with fibre a closed interval and gluing the ends of each fibre, show that the resulting space is a principal bundle, and work out what the space is.
Remark 7.2.1. This has a lot to do with gauge theory.
Exercise 7.2.4. Do some googling to understand the last remark.
Remark 7.2.2. The condition that the fibre be a G-torsor means that we can use group actions to say something about the bundle structure. We have, in effect, a sort of construction kit for the bundle which tells us how to put it together: the group elements can be used to specify how to glue local trivialisations together.
Figure 7.2.1: A locally trivial cover of S^1.
In the simplest case, take a bundle over S^1 with fibre the interval R and the action of O(1,R) on it. We might stipulate that the action be always the identity, so if we have a pair of trivialisations of the bundle, on the intersection the relation between the fibres is that they are the same way up. This inevitably forces the bundle to be trivial, S^1 × R. Or we might insist that the group action be −1 on one intersection and +1 on the other, when we would get the Möbius strip. Since we would like to be consistent on intersections, it is reasonable to want the intersection to be connected, so for S^1 we shall do it with three open sets which cover S^1.
Exercise 7.2.5. Is the fibre a G-torsor for the orthogonal group?
Remark 7.2.3. In the above case, if we impose the condition that the group action has to be constant on the intersection of the trivialising cover of the base space, then we need at least three such open sets in the cover. Labelling them α, β and γ, we can characterise each intersection by specifying an ordered pair; see the diagram, figure 7.2.1: α is the red open set, β the blue and γ the green. Then the intersection αβ is the region between the black bars at the top right. If I now assign the element 1 ∈ O(1,R) to αβ, the element −1 to βγ at the left, and the element 1 to γα, then you can read this as an instruction to start with three strips, α × R, β × R and γ × R, and glue the first two strips together keeping both orientations of R with the positive numbers pointing up; the strip γ × R is glued to α × R also with the real line having the same orientation, but γ × R is glued to β × R with a reversal, so that the β part is upside down. It is clear that these instructions produce a Möbius strip. Moreover, in general, we can specify a locally trivialising cover
of the base space, with the condition that the intersection is path connected,
take any ﬁbre having a group action on it and, by assigning group elements
to intersections, give instructions to build a new object. We need to ensure
that the instructions are unambiguous and that the resulting object is a ﬁbre
bundle.
Definition 7.2.2. In general, if {U_α} is a trivialising cover of a manifold, with the condition that U_α ∩ U_β is connected or empty, then when there is a group action on the fibre by a group G, a map from the (nonempty) intersections (αβ) to G gives the transition functions for the bundle.
Exercise 7.2.6. Suppose αβ means that you hold α the 'right way up' in some sense and apply a group element g_αβ to β before doing the gluing of each x in the fibre F over α to g_αβ(x) over β. Verify that this means that we must have g_αβ the inverse of g_βα. What can you say about g_αα?
Exercise 7.2.7. Verify also that for an unambiguous instruction we need to have the cocycle condition:

g_αβ g_βγ g_γα = 1

on any nonempty region α ∩ β ∩ γ.
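With a finite cover and numeric transition data, the condition is mechanical to check. A small sketch (illustrative code of my own; the chart names and overlap lists are invented for the example) for O(1,R) = {+1, −1} transition functions:

```python
# Transition functions g[(a, b)] in O(1, R) = {+1, -1} for a cover;
# triple_overlaps lists the triples (a, b, c) whose common intersection
# U_a ∩ U_b ∩ U_c is nonempty.

def cocycle_ok(g, triple_overlaps):
    # g_ab * g_bc * g_ca must equal 1 on every nonempty triple overlap
    return all(g[(a, b)] * g[(b, c)] * g[(c, a)] == 1
               for (a, b, c) in triple_overlaps)

# Moebius-strip data from the text: three arcs a, b, c covering S^1,
# with pairwise overlaps but no triple overlap.
g = {('a', 'b'): 1, ('b', 'c'): -1, ('c', 'a'): 1,
     ('b', 'a'): 1, ('c', 'b'): -1, ('a', 'c'): 1}
print(cocycle_ok(g, []))                 # True: holds vacuously

# If all three arcs did meet somewhere, this data would fail:
print(cocycle_ok(g, [('a', 'b', 'c')]))  # 1 * -1 * 1 = -1, so False
```

Note that for the three-arc cover of S^1 the triple intersection is empty, which is exactly why the Möbius data is allowed despite the product of its three transition elements being −1.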
Exercise 7.2.8. Verify that the construction described always gives a ﬁbre
bundle.
Note that we do not need the fibre to be a G-torsor for this group; it suffices that the action be that of a subgroup. In fact we can get a trivial bundle by consistently choosing the identity. (We can get it other ways, too!)
Exercise 7.2.9. Explain the last, parenthetic, remark.
Exercise 7.2.10. Show that by choosing fibre S^0 instead of R, we can deal with the case where the fibre is a G-torsor. Instead of taking a subgroup, we can throw out most of the fibre. So for the trivial bundle and the Möbius bundle over S^1 with fibre S^0, we still have all the essential properties, and now we need only count connected components to see the difference.
Definition 7.2.3. In the case described above of a bundle the structure of which is determined by a group action on the fibres and a set of transition functions, the bundle is called a G-bundle, and the group is called the gauge group of the bundle.
Remark 7.2.4. In practice, the ﬁbre is a vector space, usually a tangent
space or tensor space.
Definition 7.2.4. For any linear transformation T of a G-vector bundle fibre F_p attached to a point p in the manifold, we can ask whether it arises from the action of G. In general some will and some won't. If it does, we say T lives in G.
Exercise 7.2.11. Show this is well deﬁned. That is, show that if p is in two
charts with domains α and β, then if T lives in G over α it also lives in G
over β, even though the particular element g ∈ G is in general diﬀerent.
Exercise 7.2.12. Give examples of G-vector bundles and linear transformations of F_p that live in G and others which do not.
Exercise 7.2.13. Extend this idea to the Lie algebra g when G is a Lie
group.
Definition 7.2.5. A gauge transformation is a smooth G-bundle map from a vector bundle into itself which is the identity on the base space and such that every linear map from a fibre F_p to itself lives in the (Lie) group G.
Remark 7.2.5. Physicists care about these a lot. See the section on page
215 of the text book to ﬁnd out why. Or at least get some vague idea.
7.3 The Endomorphism Bundle
There is a natural isomorphism between

V ⊗ V^* and End(V)

where End(V) is the vector space of endomorphisms of V, that is, the linear maps from V to itself. Observe that End(V) is a ring under composition; it has a unit, but is not in general commutative. Recall that we defined a vector space with an associative multiplication to be an algebra. From the isomorphism it is clear that for any smooth manifold we have an endomorphism bundle where V is the tangent space at each point of the manifold.
More generally, if E is any vector bundle over a smooth manifold M with fibre V, we can define the endomorphism bundle E ⊗ E^* by attaching End(V_p) to each point p in M. There is nothing new here; we have the tensor bundle construction in the very special case of (1,1)-tensors.
A section T of E ⊗ E^* acts on a section s of E: if at each point p in M the value s(p) is in the fibre, then T(p) is a linear map from the fibre into itself which takes s(p) to T(p)(s(p)), everything done pointwise. And if Γ(E) is the space of sections of E, any section T of E ⊗ E^* gives a map

T : Γ(E) → Γ(E)

which is linear regarding Γ(E) as a C^∞(M,R)-module.
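A pointwise numerical check of this module-linearity is easy to set up. In the sketch below (an illustration I am adding, not from the text) a section T of the endomorphism bundle is a matrix-valued function on R^2, a section s is a vector-valued function, and f is a smooth scalar function; T(fs) = f·T(s) holds at every sample point precisely because everything is done pointwise:

```python
import numpy as np

# Section T of the endomorphism bundle: a 2x2 matrix at each point (x, y).
T = lambda x, y: np.array([[1.0 + x * x, y], [0.0, 2.0 + y * y]])
# Section s of E: a vector at each point, and a smooth scalar function f.
s = lambda x, y: np.array([np.sin(x), np.cos(y)])
f = lambda x, y: np.exp(x - y)

# C-infinity(M)-linearity: T(f s) = f T(s), checked pointwise on a grid.
for x in np.linspace(-1, 1, 5):
    for y in np.linspace(-1, 1, 5):
        lhs = T(x, y) @ (f(x, y) * s(x, y))
        rhs = f(x, y) * (T(x, y) @ s(x, y))
        assert np.allclose(lhs, rhs)
print("T is C-infinity(M)-linear at all sample points")
```

Exercise 7.3.2 is the converse: every such C^∞(M,R)-linear map arises this way, which is genuinely harder and needs a partition of unity.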
Exercise 7.3.1. Verify the above claim.
Exercise 7.3.2. Show that any C^∞(M,R)-linear map defines a section of E ⊗ E^*. This will involve partitions of unity, so needs M paracompact. Do it first for the case where E = M × V.
Exercise 7.3.3. Show that the set of all gauge transformations for a G-vector bundle E is itself a group, the gauge group.
Exercise 7.3.4. Read page 222 of the text book.
Chapter 8
Connections
I have followed and amplified R. W. R. Darling's book Differential Forms and Connections, of which I can only say 'thank God for Wikipedia'. You should google the term Covariant Derivative on Wikipedia and anywhere else you can find it.
8.1 Fundamental Ideas
You will have noticed that I have used the notation Ṙ^n to denote the tangent space of R^n at any point. I can get away with this because if I take any a, b ∈ R^n, then Ṙ^n_a and Ṙ^n_b are isomorphic, and moreover the isomorphism comes from the shift map that takes a to b by adding b − a to everything in R^n. Clearly this takes curves and their tangency equivalence classes at a to corresponding curves and their equivalence classes at b in a thoroughly uninteresting way. An important consequence of this is that if I am standing at the origin in R^n and you are standing somewhere else, it makes sense to ask if we are looking in 'the same' direction. We can ask if the unit tangent vector representing my direction of look is carried by the shift from me to you into the unit vector representing your direction of look.
On a 2-sphere, even the standard one embedded in R^3, this is not the situation at all. It is true that we could use R^3 as our notion of what constitutes 'the same' direction, but if you and I are both on the equator of the Earth, supposed to be an embodiment of S^2, and if you are a quarter of a planet away from me, if I am looking at my horizon due West, watching the sun set, and you are also looking due West, you would not be looking towards the sun. If you are somewhere to the West of me, then the Sun is overhead from your point of view, and if I am to the West of you, it is dark for you
and the direction of the Sun is under your feet. Yet if we are both looking
due West, it clearly makes some sort of sense to say we are looking in ‘the
same’ direction. We wouldn’t feel quite so tempted to say this if I watched
the sunset and you were looking due North.
Exercise 8.1.1. Show that there is a curve joining us so that at each step
neighbours are looking in the same direction but I am looking due West and
you are looking due North.
So the question is, how do we deﬁne this notion of ‘the same direction’ for
diﬀerent places on a manifold?
One approach is to go through a group action, since the shift maps from R^n to itself constitute precisely such an action of the additive group R^n on the vector space R^n, and the group action takes tangent vectors to unique
tangent vectors. In the case of S^2 embedded in the standard way in R^3, one would be tempted to use the special orthogonal group. If there is a rotation taking me to you (and there surely is) then we could use the rotation to take
my tangent space to yours. Then if the image of my direction of look is your
direction of look, we are looking in ‘the same’ direction. The fact that we
can decide which directions are North, South, East and West, for all points
on the Earth except the poles, suggests that some sort of sense can be made
of this. On the other hand the fact that all directions at the North Pole are
due South also suggests that there is something a bit wrong. Nevertheless, if
I am at the North Pole and you are a kilometre away and we are looking at
each other, then it makes sense to say we are looking in opposite directions,
and if I turn around to see what you are looking at (a polar bear behind
me perhaps) then we would be looking in the same direction. This may be
because a distance of a kilometre is small enough to make the world pretty
much ﬂat. But suppose we have a chain of ﬁve thousand people, all looking
towards the next one in the chain, all able to see over each other's shoulders
at the next one beyond, we could each decide that we were looking in the
same direction. If we had forty thousand people, and I am number one (as
seems reasonable to me) and standing at the North Pole looking towards you,
and you are looking due South at somebody a kilometre away also looking
in the same direction as you, and so on, then we are all looking due South
until we get to the South Pole, when everybody later in the chain is looking
due North. So North and South have got screwed up, but we are all still
looking in ‘the same’ direction. Any three consecutive people will agree on
this. And if we all turn through a right angle anticlockwise, are we still
all looking in the same direction? Any consecutive triple of people, observing
their neighbours out of the corners of their eyes would surely agree that they
were. And if they all held their arms out sideways, their right arms would be
pointing towards the neighbour they had previously been looking at. Half of
them would say they were looking West and the other half would say they
were looking East. Were it not for the axial inclination of the Earth, we
might have them all looking at the sun on the horizon, although some would
ﬁnd it setting and others rising.
You will note that a suitable rotation of the Earth, although not the usual
one, would take each person to his successor and would carry the direction
of look along with it.
We certainly have that parallel translation of a tangent vector around this
closed loop would have to bring us back to the original vector, but this need
not happen for all closed loops, see ﬁgure 3.2.1 in chapter three.
Exercise 8.1.2. Without looking at the picture, draw a closed loop of people on S^2 with a distinguished starting point, so that everyone is looking in the same direction as the person in front, but such that the last person is at the same point as the first person but is looking in a different direction.
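The phenomenon in this exercise can be seen numerically. The sketch below is my own illustration, not part of the text: it carries a tangent vector around the closed geodesic triangle on S^2 with vertices (1,0,0), (0,1,0), (0,0,1), using the standard approximation to parallel transport by repeatedly projecting the vector onto the tangent plane at each new point and restoring its length. The loop encloses an octant of the sphere, and the vector comes back rotated by roughly a right angle:

```python
import numpy as np

def transport(path, v):
    # Approximate parallel transport on S^2: at each step, project the
    # vector onto the tangent plane at the new point and renormalise.
    length = np.linalg.norm(v)
    for p in path:
        v = v - np.dot(v, p) * p
        v = v * (length / np.linalg.norm(v))
    return v

def arc(a, b, n=2000):
    # Points along the great-circle arc from a to b.
    ts = np.linspace(0, 1, n)
    pts = [(1 - t) * a + t * b for t in ts]
    return [p / np.linalg.norm(p) for p in pts]

e1, e2, e3 = np.eye(3)
loop = arc(e1, e2) + arc(e2, e3) + arc(e3, e1)  # octant triangle

v0 = np.array([0.0, 1.0, 0.0])  # tangent vector at (1, 0, 0)
v1 = transport(loop, v0)
angle = np.degrees(np.arccos(np.clip(np.dot(v0, v1), -1, 1)))
print(round(angle))  # ~90: the holonomy of the octant
```

The discrepancy equals the area enclosed by the loop (an octant has area π/2), a fact that will make sense once curvature appears.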
Three observations: ﬁrst that we should feel that when the people are very
close together on any smooth manifold, it should be possible to say if they
are looking in diﬀerent directions. Since ‘very close’ is a scale dependent
kind of thing, it must be intelligible to take limits, so diﬀerentiation must
come into it somewhere. On a symmetric space with a Lie group action,
the group action ought to also give us some sense of what ‘looking the same
way’ means, but the notion ought to be intelligible on any smooth manifold,
although not with the structure we have on it at present. Another reason for
thinking diﬀerentiation comes into it somewhere is that directions are given
by tangent vectors.
Second, we seem to have that paths come into it too. Whether two people
at diﬀerent points a and b can be said to be looking in the same direction
would seem to require us to have a ‘path’ of people, everyone looking in the
same direction as the person next to him, the path joining the person at a
to the person at b. There isn’t a ‘next’ person on a continuous path, and
the last exercise makes it clear that whether the person at a is looking in the
same direction as the person at b would depend on the path.
Exercise 8.1.3. Find two paths on a sphere between points a and b such
that each person along each path is looking in the same direction, the person
at a is looking in a deﬁnite direction, but the two coincident people at b are
looking in diﬀerent directions.
And third, we might want to transfer along paths on a manifold other things
besides directions of looking: for example we might naturally want to transfer
a frame, that is a coordinate system, or a linear map. For this reason we
need to think in terms of doing parallel transportations for sections of vector
bundles in general, not just the tangent bundle.
So the problem is, how to articulate precisely this notion of shifting some
object parallel to itself along a curve on a smooth manifold. The above
discussion shows it should make some sort of sense, but consideration of S^2 shows that it may have some surprises.
There seem to be two possibilities: one is to restrict ourselves to symmetric spaces with a good group action under which the manifold is invariant, and the other is to find some more general differential structure. The first choice leads to Cartan connections, which generalise the idea of using rotations of S^2 to carry tangent vectors along with the points, or shifts to take tangent spaces to tangent spaces in R^n. The second choice is more general and leads to Koszul connections and in particular to the Levi-Civita connection. There are more connections than you can shake a stick at, but it is better to get one type sorted out properly before going on to others.
Exercise 8.1.4. Can you say what kinds of paths on S^2 look 'right' for the action of SO(3,R) to be used for determining how to transport a unit tangent vector (direction of look!)?
Exercise 8.1.5. Can you transport a frame on S^2 using the same idea? If you had an eye where your left ear is, so you could look in two orthogonal directions at once, could you have a chain of people all looking in the same direction with eyes and left ears? Or could you have a chain where this is impossible?
Exercise 8.1.6. You could certainly have a Riemannian inner product on R^2 that started off with the standard basis being orthonormal and changed gradually along a path until the basis ((1,0)^T, (1,1)^T) was an orthonormal basis. Find such a path. Transport an orthogonal frame along the path so it stays an orthogonal frame in the Riemannian inner product.
Exercise 8.1.7. Could you do the same thing with the standard basis, (e_1, e_2), and the basis (e_2, e_1)? Prove your claim.
Exercise 8.1.8. Can you do the same kind of thing on S^2? On RP^2?
We conclude that the idea is to shift 'things' along curves. At the very least the curves should be smooth. In fact any smooth curve, at least locally, is the solution to a system of ODEs (the Straightening Out Theorem from ODE theory: see Arnol'd). So an alternative is to move them 'infinitesimally' along a vector field, and the 'things' will be sections of some vector bundle. If the shifts are 'infinitesimal' then we can hope to get a shift along a curve by some sort of integration process. This leads to asking if we can have some sort of differential operation of a vector field on various other sections of a vector bundle.
8.2 Back in R^n
8.2.1 Covariant diﬀerentiation
I shall deal with the case n = 2 in order to save typing, but the extension is trivial. Take a vector field X on R^2 and a point a ∈ R^2, with X(a) denoting the vector at a. We write a as (a^1, a^2)^T and X(a) as u = (u^1, u^2)^T. Let Y be another vector field on R^2. Can we differentiate Y in the direction u at a? If we take Y(a) = v = (v^1, v^2)^T, then we can only talk about differentiating Y at a if we have Y making sense in a neighbourhood of a, so we want to see v as a pair of functions,

v(a) = (v^1(a), v^2(a))^T
Of course, we are going to be looking at the derivative of Y at a in the
direction X(a) for diﬀerent points a.
We already have a way of talking about the directional derivative of a function with respect to a vector field. Take a function f : R^2 → R and a vector field X on R^2. Then I can take the Lie derivative Xf, or L_X(f), and differentiate f along the vector field with

Xf(x, y) = (∂f/∂x) X^1(x, y) + (∂f/∂y) X^2(x, y)

This gives me another function, a 0-tensor field.
I can certainly do this to both components v^1(a) and v^2(a), and this will give me a new vector field which I shall write as ∇_X(Y). Baez and Muniain write D_X(Y), at least some of the time, which reminds you that this is something to do with differentiation, but then, ∇ is also something to do with differentiation. Notice ∇_X(Y) is very different from ∇_Y(X) in general. The former requires differentiation of Y in a direction at a point, but has nothing to do with differentiating X, and the latter is the other way around.
With the notation given I have

X = u^1(x, y) ∂/∂x + u^2(x, y) ∂/∂y

and

Y = v^1(x, y) ∂/∂x + v^2(x, y) ∂/∂y
Then going back to writing vectors as columns we have

∇_X(Y) = [ ∂v^1/∂x   ∂v^1/∂y ] [ u^1(x, y) ]
         [ ∂v^2/∂x   ∂v^2/∂y ] [ u^2(x, y) ]
Remark 8.2.1. We could write this using the Einstein summation convention as

∇_X(Y) = u^i ∂_i(v^j) ∂_j,   i, j ∈ [1 : 2]

which has the advantage that if I leave out the last part, by not specifying which n we are working in, it makes sense for R^n for any n.
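The formula can be checked symbolically. The sketch below is my own, using sympy; the particular fields (the rotation field of the next exercise and the radial field) are chosen just for illustration:

```python
import sympy as sp

x, y = sp.symbols('x y')

def nabla(X, Y):
    # Euclidean connection on R^2: (nabla_X Y)^j = X^1 dY^j/dx + X^2 dY^j/dy,
    # with components given as sympy expressions in x and y.
    return [sp.expand(X[0] * sp.diff(Y[j], x) + X[1] * sp.diff(Y[j], y))
            for j in range(2)]

X = [y, -x]   # the rotation field  y d/dx - x d/dy
Y = [x, y]    # the radial field    x d/dx + y d/dy

print(nabla(X, Y))   # [y, -x]
print(nabla(Y, X))   # also [y, -x] for this pair, though not so in general
```

Note that only Y gets differentiated; X enters purely through its pointwise values, which is exactly the asymmetry remarked on above.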
Exercise 8.2.1. Take a nice vector field X on R^2 such as y ∂/∂x − x ∂/∂y. Choose a nice simple Y and calculate ∇_X(Y), also ∇_Y(X), ∇_X(X) and ∇_Y(Y). Sketch all the vector fields and satisfy yourself everything makes sense, and that we can legitimately regard ∇_X(Y) as a derivative of Y in the direction X at each point.
Remark 8.2.2. From the matrix notation, certain things are obvious:

1. ∇_X(Y) is certainly additive in X:

∇_{X_1 + X_2}(Y) = ∇_{X_1}(Y) + ∇_{X_2}(Y)

2. ∇_X(Y) is R-linear in X:

∀ t ∈ R, ∇_{tX}(Y) = t∇_X(Y)

3. Since this is done pointwise as far as X is concerned, it is C^∞(R^2, R)-linear in X:

∀ f ∈ C^∞(R^2, R), ∇_{fX}(Y) = f∇_X(Y)

4. ∇_X(Y) is R-linear in Y:

∇_X(Y_1 + Y_2) = ∇_X(Y_1) + ∇_X(Y_2)

∀ t ∈ R, ∇_X(tY) = t∇_X(Y)

5. It satisfies the Leibniz rule so far as C^∞(R^2, R) scaling of Y is concerned:

∇_X(fY) = f∇_X(Y) + (Xf)Y
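Property 5 is the only one that is more than bare linearity, and even it is a one-line symbolic check. The sketch below (my own code, using sympy, with X, Y and f chosen arbitrarily) verifies ∇_X(fY) = f∇_X(Y) + (Xf)Y:

```python
import sympy as sp

x, y = sp.symbols('x y')

def nabla(X, Y):
    # Euclidean connection: (nabla_X Y)^j = X^1 dY^j/dx + X^2 dY^j/dy
    return [X[0] * sp.diff(Y[j], x) + X[1] * sp.diff(Y[j], y)
            for j in range(2)]

X = [y, -x]
Y = [x * y, x + y]
f = sp.exp(x) * sp.cos(y)

Xf = X[0] * sp.diff(f, x) + X[1] * sp.diff(f, y)  # the function Xf
lhs = nabla(X, [f * Y[0], f * Y[1]])              # nabla_X(fY)
rhs = [f * nabla(X, Y)[j] + Xf * Y[j] for j in range(2)]

print(all(sp.simplify(lhs[j] - rhs[j]) == 0 for j in range(2)))  # True
```

The rule holds exactly because each component of ∇_X(fY) is the directional derivative of a product of functions, so it is just the ordinary product rule in disguise.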
Exercise 8.2.2. Verify all these.
Exercise 8.2.3. Confirm that the Lie derivative L_X(Y) does not satisfy all these conditions.
Exercise 8.2.4. Instead of operating with ∇_X on a vector field, you could operate in a very similar way on a covector field or 1-form, ω = P dx + Q dy, to get another 1-form ∇_X(ω) = (XP) dx + (XQ) dy. Show this also satisfies the above conditions.
Remark 8.2.3. We have now defined ∇_X for two different sections of bundles over R^2, the tangent bundle and the cotangent bundle.
Definition 8.2.1. Any operator ∇_X, for a vector field X, on sections of any vector bundle over R^2 which satisfies the properties of Remark 8.2.2 is called a connection, and the particular connections described are called the Euclidean connections on the tangent and cotangent bundles.
Obviously extending these to R^n merely means more terms. Extending them to manifolds gives some complications.
8.2.2 Curves and transporting vectors
We can now talk about moving a vector along a curve in R^2 so it stays pointing in the same direction. If γ : I → R^2 is a smooth curve, and u(t) is a vector at γ(t) ∈ R^2, I want to use the fact that for each t ∈ I, if I differentiate u(t) in the direction γ′(t) there is no change, so I want

∀ t ∈ I, ∇_{γ′(t)}(u(t)) = 0

This does not make sense, because neither γ′(t) nor u are vector fields on R^2, although γ′ is a vector field on the curve. This doesn't matter as far as γ′(t) is concerned, because we only need a vector at each point of the curve to give a direction in which to differentiate. It does matter as far as u(t) is concerned: at the very least we want to know what u is doing in some neighbourhood of the curve, whereupon ∇_{γ′(t)}(u(t)) becomes intelligible and I can insist that it be zero. This will put some conditions on u. We already know, of course, exactly what we want to get out, because we know that shifting vectors parallel to themselves in R^2 is rather trivial. What we want to conclude is that u is constant along γ, and that this does not depend on the curve. But we have an eye on doing the same sort of thing on S^2, where life is more complicated.
You can see that the proposition that the directional derivative of a function
is zero in every direction certainly tells us that the function is constant. You
can also see that if the directional derivative of a function along a curve is
zero, it must be constant along the curve. And we don’t give a damn what
the function is doing elsewhere. This holds for each component function of
the vector ﬁeld that extends u in a neighbourhood of the curve.
Exercise 8.2.5. Prove that a smooth function which has directional derivative along a curve equal to zero is constant along the curve.
Exercise 8.2.6. Is it true for any continuous curve in R^2 that a function defined on it can always be extended to a neighbourhood of the curve?
We therefore deduce that for the Euclidean connection on R^2, the properties of the connection ensure that the condition

∇_{γ′(t)}(u(t)) = 0

is quite intelligible, since it makes sense for any extension of u(t) into a neighbourhood of the curve; it tells us how to parallel transport a vector along a curve, and for any two points on the curve the condition ensures the vector at each point is 'the same'. Big deal. Of course, the notion of vectors in two different tangent spaces being 'the same' is certainly trivial in R^2, indeed in R^n generally, so long as we have the standard structure; there is one obvious sense in which it makes sense. The trick is to say what it means along curves in manifolds that are not so simple.
From this, after some reflection, we conclude that the Euclidean connection on R^n solves all the problems of parallel transportation on R^n. This, face it, wasn't much of a problem. On the other hand, it does give us a hope of solving the same problem on S^2 and other manifolds, including the universe in which we live. And if we can parallel translate vectors we should be able to parallel translate other things using the same ideas.
So we study connections.
8.3 Covariance
The question is, can we make this work on manifolds in general? Certain
things are prerequisites: in particular, this all has to be independent of a
choice of basis. If you are a physicist and you use a vector ﬁeld or diﬀerential
1form to represent some thing like an electric ﬁeld, you would insist that
the vector at a point is a real thing that does not depend on the choice of a
8.4. EXTENSIONS TO TENSOR FIELDS ON R
2
201
coordinate system. We now know that the right way to express this belief is
in terms of invariance under certain group actions. You and I may diﬀer in
the actual numbers to be assigned, but we’d better agree on what happens
in the world after a suitable translation scheme is established. Or it ain’t
Science.
If you change the basis of R^n for some reason known only to yourself, then both X and Y will have different representations. Still, a vector field exists independent of your description, and something would be horribly wrong if ∇_X(Y) depended on the basis.
Exercise 8.3.1. Take the vector fields on R^2 you used for an earlier problem and express them in the basis ((1,1)^T, (1,−1)^T) in such a way that it really is the same vector field. Do the calculations all over again.
Exercise 8.3.2. Show that in general if we change the basis on R^n so that X and Y are written in the new basis as X′ and Y′, then ∇_{X′}(Y′) is what it jolly well ought to be.
Exercise 8.3.3. Using the same vector ﬁelds, do the calculations using polar
coordinates. What conclusions do you draw?
Exercise 8.3.4. Suppose φ is a diffeomorphism of R^n and X, Y are vector fields on R^n. Explain how to describe X, Y in terms of the 'coordinate system' given by φ. What happens to ∇_X(Y) under this diffeomorphism?
8.4 Extensions to Tensor Fields on R^2

Returning to R^2, we could express the Euclidean connection in the form:
∇_X(Y) = [ ∂v^1/∂x   ∂v^1/∂y ] [ u^1(x, y) ]
         [ ∂v^2/∂x   ∂v^2/∂y ] [ u^2(x, y) ]

or the somewhat more compact:

∇_X(Y) = u^i ∂_i(v^j) ∂_j
This rather obscures the fact that I am using u's to represent the vector field X and v's to represent the vector field Y, so it might be better to write

∇_X(Y) = X^i ∂_i(Y^j) ∂_j          (8.4.1)
The definition of ∇_X(ω), where

ω = P dx + Q dy = ω_1 dx + ω_2 dy

was just

∇_X(ω) = (Xω_1) dx + (Xω_2) dy = (Xω_j) dx^j

and unpacking the expression for Xω_j we get

∇_X(ω) = X^i ∂_i(ω_j) dx^j          (8.4.2)

which looks a lot like equation 8.4.1.
Exercise 8.4.1. Write out equation 8.4.2 as a matrix.
The fact that we have the same basic shape for vector fields as for 1-forms tells us that all we are doing is choosing a suitable basis for each of them: (e_1, e_2) is the standard basis for the vectors in R^2, and I have (∂_1, ∂_2) for the standard basis for the tangent vectors, and (dx^1, dx^2) for the cotangent vectors. Suppose I have a (k, ℓ) tensor bundle; then I can write out a basis for any section as a collection of terms of the form

dx^{i_1} ⊗ dx^{i_2} ⊗ ··· ⊗ dx^{i_k} ⊗ ∂_{i_{k+1}} ⊗ ··· ⊗ ∂_{i_{k+ℓ}}
We can extend the definition of ∇_X(Y) to ∇_X(s), where s is any section of the tensor bundle, by writing

∇_X(α ⊗ β) = ∇_X(α) ⊗ β + α ⊗ ∇_X(β)

and extending to as many tensor products as you feel a need for, and using linearity.
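For the Euclidean connection on R^2, where everything is componentwise, this Leibniz rule can be checked directly: represent α ⊗ β for two vector fields as the 2×2 outer-product array of components, apply X entry by entry on the product, and compare. The sketch below is my own illustration using sympy, with the fields chosen arbitrarily:

```python
import sympy as sp

x, y = sp.symbols('x y')
X = [y, -x]  # the vector field doing the differentiating

def Xf(f):
    # the directional derivative Xf of a component function f
    return X[0] * sp.diff(f, x) + X[1] * sp.diff(f, y)

def nabla_vec(V):
    # Euclidean connection on vector fields, componentwise
    return [Xf(V[j]) for j in range(2)]

a = [x * y, y**2]       # section alpha
b = [sp.sin(x), x + y]  # section beta

# alpha ⊗ beta as the outer product of components; the Euclidean
# connection acts on a tensor section entry by entry.
tensor = [[a[i] * b[j] for j in range(2)] for i in range(2)]
lhs = [[Xf(tensor[i][j]) for j in range(2)] for i in range(2)]

na, nb = nabla_vec(a), nabla_vec(b)
rhs = [[na[i] * b[j] + a[i] * nb[j] for j in range(2)] for i in range(2)]

print(all(sp.simplify(lhs[i][j] - rhs[i][j]) == 0
          for i in range(2) for j in range(2)))  # True
```

Again the rule reduces to the product rule on each component function, which is why defining ∇_X on tensor products this way is consistent with the componentwise definition.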
Exercise 8.4.2. Show that for all sections of a tensor bundle s, the properties
of Remark 8.2.2 hold.
Exercise 8.4.3. Letting X be the same old vector field on R^2 as in earlier exercises, let a Riemannian inner product be defined on the positive quadrant by the matrix

s = [ 1 + xy    0             ]
    [ 0         1 + x^2 + y^2 ]

Find ∇_X(s). Is it positive definite? If we take the covariant derivative of a symmetric 2-tensor s, is the resulting 2-tensor necessarily symmetric?
8.5 The Koszul Connection

The crucial properties of the covariant derivative on tensor bundles over R^2 were: for any section s of a tensor bundle,
1. ∇_X(s) is certainly additive in X:

∇_{X_1 + X_2}(s) = ∇_{X_1}(s) + ∇_{X_2}(s)

2. ∇_X(s) is R-linear in X:

∀ t ∈ R, ∇_{tX}(s) = t∇_X(s)

3. Since this is done pointwise as far as X is concerned, it is C^∞(R^2, R)-linear in X:

∀ f ∈ C^∞(R^2, R), ∇_{fX}(s) = f∇_X(s)

4. ∇_X(s) is R-linear in s:

∇_X(s_1 + s_2) = ∇_X(s_1) + ∇_X(s_2)

∀ t ∈ R, ∇_X(ts) = t∇_X(s)

5. It satisfies the Leibniz rule so far as C^∞(R^2, R) scaling of s is concerned:

∇_X(fs) = f∇_X(s) + (Xf)s
Extending these to R
n
is rather trivial; consider it done. The next step is
to deﬁne a Koszul connection on any vector bundle E over a manifold M as
a map operating on a vector ﬁeld, X, on M and any section s of E which
satisﬁes the above rules. This is rather abstract, but I have built up the
simple concrete cases ﬁrst in order to cheer you up.
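The rules above can be sanity-checked symbolically for the Euclidean connection on vector fields. A sketch of my own, verifying properties 3 and 5; the fields f, X, Y are arbitrary smooth test data, not anything from the text.

```python
import sympy as sp

x, y = sp.symbols('x y')

def nabla(X, Y):
    """Euclidean connection on R^2: Jacobian of Y's components applied to X."""
    return sp.Matrix(Y).jacobian([x, y]) * sp.Matrix(X)

def X_of(f, X):
    """The vector field X applied to the function f."""
    return X[0]*sp.diff(f, x) + X[1]*sp.diff(f, y)

# Arbitrary smooth test data.
X = [x*y, y**2]
Y = [sp.sin(x), x + y]
f = sp.exp(x*y)

# Property 3: C-infinity(R^2, R)-linearity in X.
d3 = nabla([f*X[0], f*X[1]], Y) - f*nabla(X, Y)
assert d3.applyfunc(sp.simplify) == sp.zeros(2, 1)

# Property 5 (Leibniz): nabla_X(fY) = f nabla_X(Y) + (Xf) Y.
d5 = nabla(X, [f*Y[0], f*Y[1]]) - (f*nabla(X, Y) + X_of(f, X)*sp.Matrix(Y))
assert d5.applyfunc(sp.simplify) == sp.zeros(2, 1)
print("properties 3 and 5 hold for this test data")
```

Of course a symbolic check on one choice of data is not a proof, but it is a cheap way to catch a wrong sign before you rely on a formula.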
8.6 Vector Potentials
We gave a very simple covariant derivative on R^2 which quite obviously satisfied the rules for a connection; indeed, that’s where we got the rules from. Now we take the physicist’s perspective. Weaning them off coordinates is an ongoing process, so let’s try doing it their way. Then we get to be able to do lots of sums, which is good, and to confuse things horribly, which is bad.
Suppose we have a section s of some bundle E over R^2 with fibre a vector space F, so that E = R^2 × F. s is therefore a map from R^2 to F, and if we let (e_1, e_2, …, e_n) be a basis for F, then for every point v ∈ R^2 we have s(v) = s^1(v) e_1 + s^2(v) e_2 + ⋯ + s^n(v) e_n, or s(v) = s^i e_i in physicist’s notation. If X is a vector field on R^2 we can write X = X^1 ∂_1 + X^2 ∂_2 = X^j ∂_j. And if s′ = ∇_X(s) we have also s′(v) = s′^1(v) e_1 + ⋯ + s′^n(v) e_n = s′^i e_i. And the n functions s′^i(v) depend on the n functions s^i(v) and on the functions X^1, X^2 only. Moreover, the rules for getting the s′^i are specified by the rules of section 8.5 and nothing else. Let’s see how it works out. I shall have to take ∇_Y for Y the unit vector field in the direction of the x-axis and also the y-axis, and rather than write these as ∇_{∂_1} and ∇_{∂_2} I shall shorten them to ∇_1 and ∇_2 respectively.
s′ = ∇_X(s) = ∇_{X^1 ∂_1 + X^2 ∂_2}(s)
   = ∇_{X^1 ∂_1}(s) + ∇_{X^2 ∂_2}(s)
   = X^1 ∇_1(s) + X^2 ∇_2(s)
   = X^1 ∇_1(s^1 e_1 + ⋯ + s^n e_n) + X^2 ∇_2(s^1 e_1 + ⋯ + s^n e_n)
   = X^j ∇_j(s^i e_i)        using the Einstein convention
   = X^j (s^i ∇_j(e_i) + (∂_j s^i) e_i)        (Leibniz)
It might be better to expand this last to

∇_X(s) = X^1 (s^1 ∇_1(e_1) + s^2 ∇_1(e_2) + ⋯ + s^n ∇_1(e_n))
       + X^2 (s^1 ∇_2(e_1) + s^2 ∇_2(e_2) + ⋯ + s^n ∇_2(e_n))
       + X^1 ((∂_1 s^1) e_1 + (∂_1 s^2) e_2 + ⋯ + (∂_1 s^n) e_n)
       + X^2 ((∂_2 s^1) e_1 + (∂_2 s^2) e_2 + ⋯ + (∂_2 s^n) e_n)
Now the terms ∇_1(e_i) and ∇_2(e_i) are, for each i ∈ [1 : n], going to be values of the section, and can therefore be expressed in terms of the basis (e_1, e_2, …, e_n). In other words, for each j ∈ [1 : 2] and for each i ∈ [1 : n], at each point v ∈ R^2 there is a collection of numbers Γ^k_{i,j}(v) expressing ∇_j(e_i) as

Σ_{k∈[1:n]} Γ^k_{i,j} e_k
Which tells us that we can express ∇_X(s) as

X^1 Σ_{i,k∈[1:n]} s^i Γ^k_{i,1} e_k + X^2 Σ_{i,k∈[1:n]} s^i Γ^k_{i,2} e_k + X^1 Σ_{i∈[1:n]} (∂_1 s^i) e_i + X^2 Σ_{i∈[1:n]} (∂_2 s^i) e_i
Collecting these up and changing the name of a summation index gives us:

∇_X(s) = Σ_{j∈[1:2], i,k∈[1:n]} X^j (∂_j s^i + Γ^i_{k,j} s^k) e_i        (8.6.1)

or

∇_X(s) = X^j (∂_j s^i + Γ^i_{k,j} s^k) e_i

in physics speak.
Exercise 8.6.1. Find the expression for ∇_j(∂_i) in terms of the standard basis for vectors in R^2. Now do it for polar coordinates.
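For the polar half of the exercise, the Γ’s can be cross-checked by a detour the text has not taken: the connection transferred from cartesian coordinates is the one determined by the Euclidean metric written in polar form, ds^2 = dr^2 + r^2 dθ^2, so the standard metric formula for Christoffel symbols applies. A sketch of my own, with the caveat that my index convention Gamma[k][i][j] = Γ^k_{i,j} may not match yours.

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]

# Euclidean metric in polar coordinates: ds^2 = dr^2 + r^2 dtheta^2.
g = sp.Matrix([[1, 0], [0, r**2]])
ginv = g.inv()

def christoffel(g, ginv, coords):
    """Gamma^k_{i,j} = (1/2) g^{kl} (d_i g_{jl} + d_j g_{il} - d_l g_{ij})."""
    n = len(coords)
    return [[[sp.simplify(sum(
        sp.Rational(1, 2) * ginv[k, l]
        * (sp.diff(g[j, l], coords[i]) + sp.diff(g[i, l], coords[j])
           - sp.diff(g[i, j], coords[l]))
        for l in range(n)))
        for j in range(n)] for i in range(n)] for k in range(n)]

Gamma = christoffel(g, ginv, coords)
print(Gamma[0][1][1])   # Gamma^r_{theta,theta} = -r
print(Gamma[1][0][1])   # Gamma^theta_{r,theta} = 1/r
```

All other entries vanish; the two printed values are exactly the correction terms the naive polar computation in the next exercises is missing.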
The 2n^2 functions Γ^i_{k,j} from R^2 to R (on R^2; on R^m there would be mn^2 of them) pretty much tell us everything about the connection, given that the X^j tell us about the vector field X and the ∂_j s^i tell us about differentiating the section. In general they are mn^2 functions from the manifold of dimension m to R, and they tell us how the connection works on the bundle with fibre F of dimension n. The collection of functions is called the Vector Potential for the connection, or sometimes the Christoffel symbols. When the manifold has the same dimension, n, as the fibre, there are n^3 such functions. The textbook prefers to use the term Christoffel symbols for the case when the connection respects a Riemannian inner product.
Exercise 8.6.2. When we discussed the Euclidean connection for sections of the tangent bundle, what were the Γ^i_{k,j}?
The significance of the vector potential term in equation 8.6.1 is not hard to see. If we left it out, or equivalently insisted that all its terms are zero, then in the case of a vector field we would simply have the situation of the Euclidean connection,

∇_X(Y) = [ ∂v^1/∂x   ∂v^1/∂y ] [ u^1(x, y) ]
         [ ∂v^2/∂x   ∂v^2/∂y ] [ u^2(x, y) ]

In order to work on S^2, this would have to survive a diffeomorphism, and by an earlier exercise, it doesn’t.
Exercise 8.6.3. Take the usual vector fields on R^2 \ {0}, X = −y ∂_x + x ∂_y and Y = x ∂_x + y ∂_y. Compute ∇_X(Y) in cartesian form. Now find expressions for the same vector fields in polar form. (You’d better get X_P = ∂_θ and Y_P = r ∂_r, and make sure you can prove these are correct, not just look at the pictures!) Now use the rule for the Euclidean connection to calculate ∇_{X_P}(Y_P). This had better not be the polar form of ∇_X(Y) or you have got the wrong answer.
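A sketch of the computation the exercise wants you to find suspicious (my own code; “naive” means applying the flat-coordinate rule verbatim to the polar components):

```python
import sympy as sp

x, y, r, th = sp.symbols('x y r theta', positive=True)

def naive_nabla(X, Y, coords):
    """Apply the Euclidean rule (Jacobian of Y times X) blindly
    in whatever coordinates we are handed."""
    return (sp.Matrix(Y).jacobian(coords) * sp.Matrix(X)).applyfunc(sp.simplify)

# Cartesian computation: X = -y d_x + x d_y, Y = x d_x + y d_y.
cart = naive_nabla([-y, x], [x, y], [x, y])
print(cart)    # (-y, x): nabla_X(Y) = X, whose polar form is d_theta

# The same rule applied blindly to the polar components X_P = d_theta,
# Y_P = r d_r, i.e. components (0, 1) and (r, 0) in (r, theta):
polar = naive_nabla([0, 1], [r, 0], [r, th])
print(polar)   # (0, 0): NOT the polar form (0, 1) of the answer above
```

So the naive rule gives zero in polar coordinates where the honest answer is ∂_θ, which is exactly the failure the exercise is driving at.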
Exercise 8.6.4. Find a vector field Z on R^2 \ {0} which corresponds to the polar field ∂_r. (That is, it consists of a unit vector pointing radially outwards at each point.) Calculate ∇_{∂_r}(∂_θ) by translating it into cartesian coordinates to do the sums and then translating back. What if you had chosen a different basis for the tangent vectors and picked r ∂_r instead? What is ∇_{r∂_r}(∂_θ)? Calculate ∇_{∂_θ}(∂_θ), ∇_{∂_θ}(r ∂_r) and ∇_{r∂_r}(r ∂_r) in the same way. Translate them all back into polar coordinates.
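The translate-compute-translate-back recipe is easy to mechanise for the cartesian stage; translating the answers back into polar form is still left to you, as the exercise intends. A sketch of mine, using the cartesian expressions of the polar frame fields:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
rho = sp.sqrt(x**2 + y**2)   # r as a function on the punctured plane

def nabla(X, Y):
    """Euclidean covariant derivative in cartesian coordinates."""
    return (sp.Matrix(Y).jacobian([x, y]) * sp.Matrix(X)).applyfunc(sp.simplify)

# Cartesian expressions of the polar frame fields:
d_theta = [-y, x]           # d_theta = -y d_x + x d_y
d_r     = [x/rho, y/rho]    # unit radial field, the Z of the exercise
r_d_r   = [x, y]            # r d_r

# nabla_{d_theta}(d_theta) = (-x, -y) = -(r d_r), matching Gamma^r_{theta,theta}.
print(nabla(d_theta, d_theta))
# nabla_{r d_r}(r d_r) = (x, y), i.e. r d_r again.
print(nabla(r_d_r, r_d_r))
```

Note how much uglier the components of ∂_r are than those of r ∂_r; that observation is most of the next exercise.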
Exercise 8.6.5. Explain why r ∂_r is a better choice than ∂_r. (Hint: look at the polar diffeomorphism.)
Exercise 8.6.6. Hence compute the vector potential for ∇_{X_P}(Y_P).
Exercise 8.6.7. Confirm that if you take the vector potential into account, you get the right answer for ∇_{X_P}(Y_P).
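This confirmation can be scripted: plug the polar components and the polar vector potential into formula 8.6.1 and watch the missing correction reappear. A sketch under my own conventions (Gamma[k][i][j] = Γ^k_{i,j}, with the standard nonzero polar values filled in by hand; check them against your answer to Exercise 8.6.6):

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]

# Polar components of the fields from Exercise 8.6.3:
X_P = [0, 1]    # d_theta
Y_P = [r, 0]    # r d_r

# Vector potential of the transferred connection in polar coordinates;
# only these entries are nonzero:
n = 2
Gamma = [[[sp.Integer(0)]*n for _ in range(n)] for _ in range(n)]
Gamma[0][1][1] = -r     # Gamma^r_{theta,theta}
Gamma[1][0][1] = 1/r    # Gamma^theta_{r,theta}
Gamma[1][1][0] = 1/r    # Gamma^theta_{theta,r}

# Formula 8.6.1: (nabla_X s)^k = X^j (d_j s^k + Gamma^k_{i,j} s^i)
result = [sp.simplify(sum(
    X_P[j] * (sp.diff(Y_P[k], coords[j])
              + sum(Gamma[k][i][j] * Y_P[i] for i in range(n)))
    for j in range(n))) for k in range(n)]
print(result)   # [0, 1]: the polar form of nabla_X(Y), i.e. d_theta
```

The naive term ∂_j Y_P^k contributes nothing here; the entire answer comes from the Γ^θ_{r,θ} correction, which is the point of the exercise.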
Exercise 8.6.8. ∇_{∂_x}(∂_y) and the similar terms in the cartesian framework are what you’d expect them to be, but the ∇_{∂_i}(∂_j) in polar coordinates contain significant information. What are the numbers telling you?
Exercise 8.6.9. If you look at what you have been doing with the above calculations, you can see that we have defined ∇_X(Y) in cartesian coordinates on R^2 (and hence by trivial modification on R^n) and then proceeded to take it on the subspace R^2 \ {0} by ignoring the deleted point. Then we transferred it all to S^1 × R^+ by the diffeomorphism P for polar coordinates. In order to compute ∇_{X_P}(Y_P), I rather took it for granted that it is to be done by translating X_P and Y_P into cartesian form, doing it there, and then translating the answer back into polar coordinates, which surely is the only sane thing to do. If φ were any diffeomorphism from a subset U ⊆ R^2 to some other space V, then what we are doing is taking a vector field X on U to the vector field φ_* ∘ X ∘ φ^{−1} on V, a vector field Y on U to the vector field φ_* ∘ Y ∘ φ^{−1} on V, and defining

∇_{φ_* ∘ X ∘ φ^{−1}}(φ_* ∘ Y ∘ φ^{−1}) = φ_* ∘ ∇_X(Y) ∘ φ^{−1}

Show that this is a consistent way to export ∇ from one manifold to another which is diffeomorphic to it, and hence explain why we can define a connection on a manifold, and why ∇ is called covariant differentiation.
8.6.1 Tensor formulation
The term Γ^i_{k,j} looks very like a (2,1) tensor in coordinate terms. For the Riemannian inner product, what goes in at each point in the same tangent space is a pair of vectors and what comes out is a number, and this is bilinear and varies smoothly as we move around in the manifold. For the vector potential, the things going in are two vector fields, or at least a vector in the tangent space at a point and a field (or possibly more general section) defined in a neighbourhood of the point (so we can differentiate), and what comes out is another vector field (or possibly more general section).
8.7 Concluding Remarks
This just starts on the subject of connections, which are crucial to much of differential geometry. For example, it is connections that have curvature. We could show how a Riemannian inner product (metric) leads to a connection, the Levi-Civita connection, which is compatible with the metric. But there is too much to fit into an introductory course unless I follow the tradition of training you to say the right things with only a minimal grasp of what they mean, something I much prefer not to do.